When developing enterprise-level applications, we often need to call external services and resources. These services could be a network location, database server, or web service. Whenever we call a service, there is a chance that a problem with the network or the end-service itself could cause a service failure. One method of attempting to overcome a service failure is to queue requests and retry periodically. This allows us to continue processing requests until the service becomes available again. However, if a network or service is experiencing problems, hammering it with retry attempts will not help the service to recover, especially if it is under increased load. Such a pounding can cause even more damage and interruption to services. If we know there could potentially be a problem with a service, we can help take some of the strain by implementing a Circuit Breaker pattern on the client application.
Circuit breakers in our home prevent a surge of current from damaging appliances or overheating the wiring. They work by allowing a certain level of current to enter the system. If the current exceeds the threshold, the circuit opens, stopping the current from flowing and preventing further damage. Once the problem has been fixed, the circuit breaker can be reset which closes the circuit and allows electricity to flow again. The Circuit Breaker patten uses the same concept by stopping requests to a resource if the number of failures exceed a certain threshold.
The Circuit Breaker pattern is described in Michael T. Nygard’s book, Release It! Design and Deploy Production-Ready Software. The pattern has three operational states: closed, open and half-open.
In the “closed” state, operations are executed as usual. If an operation throws an exception, the failure count is incremented and an OperationFailedException is thrown. If the failure count exceeds the threshold, the circuit breaker trips into the “open” state. If a call succeeds before the threshold is reached, the failure count is reset.
In the “open” state, all calls to the operation will fail immediately and throw an OpenCircuitException. A timeout is started when the circuit breaker trips. Once the timeout is reached, the circuit breaker enters a “half-open” state.
In the “half-open” state, the circuit breaker allows one operation to execute. If this operation fails, the circuit breaker re-enters the “open” state and the timeout is reset. If the operation succeeds, the circuit breaker enters the “closed” state and the process starts over.
You can download the circuit breaker code and tests here. If you have any comments or suggestions, I would love to hear them!
Update: I have posted a new article that contains a number of additions and improvements to the circuit breaker code.
For more information on this pattern and many other ways to improve software stability, capacity and operational ability, I highly recommend the book Release It! Design and Deploy Production-Ready Software by Michael T. Nygard.