Archive for February, 2008

Implementing The Circuit Breaker Pattern In C# – Part 2

In my previous post, I discussed an implementation of The Circuit Breaker Pattern as described in Michael T. Nygard’s book, Release It! Design and Deploy Production-Ready Software. In this post, I will talk about several additions and improvements I have made to the initial implementation.

Service Level

The Circuit Breaker Pattern contains a failure count that keeps track of the number of consecutive exceptions thrown by an operation. When the failure count reaches a threshold, the circuit breaker trips. If an operation succeeds before the failure threshold is reached, the failure count is reset to zero. This works well when a service outage causes many consecutive failures. However, if the threshold is set to 100 and the service fails 99 times before a single operation succeeds, the failure count is reset to zero, even though the service clearly has a problem that should be handled.

To deal with intermittent service failures, I have implemented a “service level” calculation. The service level is the proportion of the failure threshold that has not yet been used up, expressed as a percentage. For example, if the circuit breaker has a threshold of 100 and an operation fails 50 times, then the current service level is 50%. If the service recovers and 25 operations succeed, then the service level will be 75%. The circuit breaker no longer resets completely after a single successful operation. Instead, each successful operation increases the service level, while each failed operation decreases it. Once the service level reaches 0%, i.e. the net number of failures has reached the threshold, the circuit trips.
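As a rough sketch of the calculation described above, the service level can be derived from a failure count that moves up on failure and down on success. The class and member names here (ServiceLevelTracker, RecordFailure, RecordSuccess, IsTripped) are hypothetical and are not taken from the downloadable code. Clamping the count between zero and the threshold is also an assumption, and the sketch is not thread-safe.

```csharp
using System;

// Hypothetical sketch: illustrative names, not the types from the download.
public class ServiceLevelTracker
{
    private readonly int threshold;
    private int failureCount;

    public ServiceLevelTracker(int threshold)
    {
        if (threshold <= 0)
            throw new ArgumentOutOfRangeException("threshold", "Threshold must be positive.");
        this.threshold = threshold;
    }

    // Percentage of the threshold that has not been consumed by failures.
    public double ServiceLevel
    {
        get { return (threshold - failureCount) * 100.0 / threshold; }
    }

    // True once the service level has dropped to 0%.
    public bool IsTripped
    {
        get { return failureCount >= threshold; }
    }

    // Assumption: the count never rises above the threshold.
    public void RecordFailure()
    {
        if (failureCount < threshold) failureCount++;
    }

    // Assumption: the count never drops below zero.
    public void RecordSuccess()
    {
        if (failureCount > 0) failureCount--;
    }
}
```

With a threshold of 100, calling RecordFailure 50 times gives a ServiceLevel of 50, and 25 subsequent calls to RecordSuccess raise it to 75, which matches the example in the text.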

A ServiceLevelChanged event allows the client application to be notified of service level changes. This could be used for monitoring performance, or for tracking service levels against a Service Level Agreement (SLA).
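One way such a notification could be wired up is sketched below. Only the event name ServiceLevelChanged comes from the text. The plain EventHandler signature, the ServiceLevelSource stand-in class, and the SlaMonitor helper are assumptions made for illustration, and the real circuit breaker may expose the event differently.

```csharp
using System;

// Hypothetical stand-in for the circuit breaker, reduced to the event plumbing.
public class ServiceLevelSource
{
    private readonly int threshold;
    private int failureCount;

    // Assumed signature: the real event may carry richer event arguments.
    public event EventHandler ServiceLevelChanged;

    public ServiceLevelSource(int threshold)
    {
        if (threshold <= 0)
            throw new ArgumentOutOfRangeException("threshold", "Threshold must be positive.");
        this.threshold = threshold;
    }

    public double ServiceLevel
    {
        get { return (threshold - failureCount) * 100.0 / threshold; }
    }

    public void RecordFailure()
    {
        if (failureCount >= threshold) return; // level already at 0%, nothing changes
        failureCount++;
        OnServiceLevelChanged();
    }

    public void RecordSuccess()
    {
        if (failureCount == 0) return; // level already at 100%, nothing changes
        failureCount--;
        OnServiceLevelChanged();
    }

    protected virtual void OnServiceLevelChanged()
    {
        // Copy the delegate so a concurrent unsubscribe cannot null it mid-call.
        EventHandler handler = ServiceLevelChanged;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

// Hypothetical monitor that reports when the level falls below an SLA target.
public static class SlaMonitor
{
    public static void Attach(ServiceLevelSource source, double minimumLevel)
    {
        source.ServiceLevelChanged += delegate(object sender, EventArgs e)
        {
            double level = ((ServiceLevelSource)sender).ServiceLevel;
            if (level < minimumLevel)
                Console.WriteLine("Service level {0}% is below the SLA target of {1}%.", level, minimumLevel);
        };
    }
}
```

A client would call SlaMonitor.Attach(source, 90) once at start-up. The handler then runs on whichever thread recorded the success or failure, so it should stay short.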

The threshold value determines the circuit breaker’s resistance to failures. If a client makes a lot of calls to a service, a higher threshold will allow the service more time to recover from failures before the circuit breaker trips. If the client makes fewer calls, but each call is expensive for the service, a lower threshold will allow the circuit breaker to trip sooner.

Ignored Exception Types

Sometimes a service will throw an exception as part of the service logic. We might not want these exceptions to affect the circuit breaker’s service level. I have added an IgnoredExceptionTypes property, which holds a list of exception types for the circuit breaker to ignore. If an operation throws one of these exceptions, the exception is rethrown to the caller and is not counted as a failure.

 CircuitBreaker cb = new CircuitBreaker();
 cb.IgnoredExceptionTypes.Add(typeof(AuthorizationException));
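For illustration, the check behind such a property might look like the sketch below. IgnoredExceptionFilter, Execute and FailureCount are hypothetical names. Matching subclasses of an ignored type through Type.IsInstanceOfType is an assumption, since the text does not say whether derived exception types are also ignored. To stay short, the sketch rethrows every exception unchanged rather than wrapping failures.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the ignore check; not the code from the download.
public class IgnoredExceptionFilter
{
    private readonly List<Type> ignoredExceptionTypes = new List<Type>();

    public IList<Type> IgnoredExceptionTypes
    {
        get { return ignoredExceptionTypes; }
    }

    public int FailureCount { get; private set; }

    public void Execute(Action operation)
    {
        try
        {
            operation();
        }
        catch (Exception ex)
        {
            // Only exceptions that are not on the ignore list count as failures.
            if (!IsIgnored(ex)) FailureCount++;
            throw; // rethrow to the caller, preserving the stack trace
        }
    }

    private bool IsIgnored(Exception ex)
    {
        foreach (Type ignored in ignoredExceptionTypes)
        {
            // Assumption: an ignored type also covers its subclasses.
            if (ignored.IsInstanceOfType(ex)) return true;
        }
        return false;
    }
}
```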

Invoker Exceptions

If the operation invoker throws an exception that was not caused by the operation itself, then the exception is thrown back to the caller and is not logged as a failure.

Threading

As mentioned in a comment by Søren on my last post, it is likely that a circuit breaker would be used in a multi-threaded environment, so it should be able to function properly when multiple threads are executing operations.

The failure count is now updated atomically using the System.Threading.Interlocked.Increment and System.Threading.Interlocked.Decrement methods. These methods do not take a lock. Instead, each one performs its read-modify-write as a single atomic operation, so two threads updating the failure count at the same time cannot overwrite each other’s changes.

While this does not guarantee the circuit breaker is completely thread-safe, it does prevent problems with multiple threads executing operations and tracking failures. I have to confess I’m not an expert at multi-threaded application designs, so if anyone has any further suggestions on how to make the circuit breaker more thread-safe, I would love to hear them!
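The atomic updates described above can be sketched roughly as follows. AtomicFailureCounter and its members are hypothetical names. Stopping the count at zero is an assumption, and it shows why a single Interlocked call is not always enough: the check and the decrement have to be combined in a compare-and-swap loop to stay atomic.

```csharp
using System;
using System.Threading;

// Hypothetical sketch of a lock-free failure counter.
public class AtomicFailureCounter
{
    private int failureCount;

    // CompareExchange with identical values is a common way to do an atomic read.
    public int Count
    {
        get { return Interlocked.CompareExchange(ref failureCount, 0, 0); }
    }

    // Atomically adds one and returns the new value.
    public int RecordFailure()
    {
        return Interlocked.Increment(ref failureCount);
    }

    // Atomically subtracts one, but never goes below zero (an assumption).
    // "Check, then decrement" is two steps, so it is retried until no other
    // thread has changed the value in between.
    public int RecordSuccess()
    {
        while (true)
        {
            int current = Interlocked.CompareExchange(ref failureCount, 0, 0);
            if (current == 0) return 0;
            int updated = current - 1;
            if (Interlocked.CompareExchange(ref failureCount, updated, current) == current)
                return updated;
        }
    }
}
```

Making the counter atomic protects only the counter. Decisions that combine it with other state, such as tripping the breaker and starting the timeout, would still need their own synchronization.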

For more information on implementing threading, see the .NET Framework Threading Design Guidelines.

Download

Download the circuit breaker code and unit tests (VS 2008).

I hope you find these enhancements helpful. Does providing a service level make sense? How can I improve multi-threading support? If you have any comments or suggestions, please let me know your thoughts.

Implementing The Circuit Breaker Pattern In C#

When developing enterprise-level applications, we often need to call external services and resources. These services could be a network location, database server, or web service. Whenever we call a service, there is a chance that a problem with the network or the end-service itself could cause a service failure. One method of attempting to overcome a service failure is to queue requests and retry periodically. This allows us to continue processing requests until the service becomes available again. However, if a network or service is experiencing problems, hammering it with retry attempts will not help the service to recover, especially if it is under increased load. Such a pounding can cause even more damage and interruption to services. If we know there could potentially be a problem with a service, we can help take some of the strain by implementing the Circuit Breaker pattern in the client application.

Circuit breakers in our homes prevent a surge of current from damaging appliances or overheating the wiring. They work by allowing a certain level of current to enter the system. If the current exceeds the threshold, the circuit opens, stopping the current from flowing and preventing further damage. Once the problem has been fixed, the circuit breaker can be reset, which closes the circuit and allows electricity to flow again. The Circuit Breaker pattern uses the same concept by stopping requests to a resource if the number of failures exceeds a certain threshold.

The Circuit Breaker pattern is described in Michael T. Nygard’s book, Release It! Design and Deploy Production-Ready Software. The pattern has three operational states: closed, open and half-open.

In the “closed” state, operations are executed as usual. If an operation throws an exception, the failure count is incremented and an OperationFailedException is thrown. If the failure count exceeds the threshold, the circuit breaker trips into the “open” state. If a call succeeds before the threshold is reached, the failure count is reset.

In the “open” state, all calls to the operation will fail immediately and throw an OpenCircuitException. A timeout is started when the circuit breaker trips. Once the timeout is reached, the circuit breaker enters a “half-open” state.

In the “half-open” state, the circuit breaker allows one operation to execute. If this operation fails, the circuit breaker re-enters the “open” state and the timeout is reset. If the operation succeeds, the circuit breaker enters the “closed” state and the process starts over.
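The three states above can be sketched as a small state machine. This is a simplified illustration rather than the downloadable implementation. The exception names come from the text, but their constructors, the SimpleCircuitBreaker class, and the injected clock are assumptions. The sketch compares timestamps on each call instead of running a timer, and it is not thread-safe.

```csharp
using System;

public enum CircuitState { Closed, Open, HalfOpen }

// Exception names are from the text; these definitions are assumptions.
public class OpenCircuitException : Exception
{
    public OpenCircuitException() : base("The circuit is open.") { }
}

public class OperationFailedException : Exception
{
    public OperationFailedException(Exception inner) : base("The operation failed.", inner) { }
}

// Hypothetical, single-threaded sketch of the closed/open/half-open cycle.
public class SimpleCircuitBreaker
{
    private readonly int threshold;
    private readonly TimeSpan timeout;
    private readonly Func<DateTime> clock; // injected so the timeout can be tested
    private CircuitState state = CircuitState.Closed;
    private int failureCount;
    private DateTime openedAt;

    public SimpleCircuitBreaker(int threshold, TimeSpan timeout, Func<DateTime> clock)
    {
        this.threshold = threshold;
        this.timeout = timeout;
        this.clock = clock;
    }

    // Reports the state, moving from open to half-open once the timeout has elapsed.
    public CircuitState GetState()
    {
        if (state == CircuitState.Open && clock() - openedAt >= timeout)
            state = CircuitState.HalfOpen;
        return state;
    }

    public void Execute(Action operation)
    {
        if (GetState() == CircuitState.Open)
            throw new OpenCircuitException(); // fail fast without calling the service

        try
        {
            operation();
        }
        catch (Exception ex)
        {
            failureCount++;
            // A failed trial call in half-open, or exceeding the threshold, trips the breaker.
            if (state == CircuitState.HalfOpen || failureCount > threshold)
            {
                state = CircuitState.Open;
                openedAt = clock(); // restart the timeout
            }
            throw new OperationFailedException(ex);
        }

        // Success: reset the count and close the circuit.
        failureCount = 0;
        state = CircuitState.Closed;
    }
}
```

A caller might construct it as new SimpleCircuitBreaker(5, TimeSpan.FromSeconds(30), () => DateTime.UtcNow) and wrap each service call in Execute. Because nothing here is synchronized, concurrent callers could run more than one trial operation in the half-open state.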

You can download the circuit breaker code and tests here. If you have any comments or suggestions, I would love to hear them!

Update: I have posted a new article that contains a number of additions and improvements to the circuit breaker code.

For more information on this pattern and many other ways to improve software stability, capacity and operational ability, I highly recommend the book Release It! Design and Deploy Production-Ready Software by Michael T. Nygard.

Circuit breaker class diagram