Circuit Breaker Pattern with Polly

A circuit breaker is a state machine that stops sending requests to a failing service, protecting both the service and your system from cascading failures. It transitions through three states—Closed (pass requests), Open (reject immediately), and Half-Open (probe for recovery)—based on failure thresholds. When deployed correctly, circuit breakers can reduce failed request volume by 60-80% and give downstream services time to recover.

Circuit Breaker States Explained

The circuit breaker monitors failures and transitions between states:

Closed: Normal operation. Requests pass through to the target service. If failures stay below a threshold, the circuit remains closed.

Open: The failure threshold is exceeded. New requests fail immediately without calling the target service. This prevents hammering a degraded service and gives it time to recover.

Half-Open: After an open window expires, the circuit tests the service with a single request (or a small number). If it succeeds, the circuit closes; if it fails, it opens again.

Closed ──(failures exceed threshold)──> Open
Open ──(break duration expires)──> Half-Open
Half-Open ──(test request succeeds)──> Closed
Half-Open ──(test request fails)──> Open
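
To make the transitions concrete, here is a minimal hand-rolled sketch of the state machine. It is not how Polly is implemented: the names SimpleBreaker and BreakerState, the injected clock, and the simplification that any number of probes may pass while half-open are all assumptions made for illustration, and the class is not thread-safe.

```csharp
using System;

// Usage: trip the breaker, wait out the break, then probe.
var clock = new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);
var breaker = new SimpleBreaker(failureThreshold: 2,
                                breakDuration: TimeSpan.FromSeconds(30),
                                now: () => clock);

breaker.RecordFailure();
breaker.RecordFailure();                    // second consecutive failure trips the breaker
Console.WriteLine(breaker.State);           // Open
Console.WriteLine(breaker.AllowRequest());  // False

clock = clock.AddSeconds(31);               // the break duration elapses
Console.WriteLine(breaker.AllowRequest());  // True (probe request allowed)
Console.WriteLine(breaker.State);           // HalfOpen
breaker.RecordSuccess();
Console.WriteLine(breaker.State);           // Closed

public enum BreakerState { Closed, Open, HalfOpen }

// Hypothetical illustration class; not part of Polly.
public sealed class SimpleBreaker
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _breakDuration;
    private readonly Func<DateTime> _now;   // injected clock keeps the sketch testable
    private int _consecutiveFailures;
    private DateTime _openedAt;

    public BreakerState State { get; private set; } = BreakerState.Closed;

    public SimpleBreaker(int failureThreshold, TimeSpan breakDuration, Func<DateTime> now)
    {
        _failureThreshold = failureThreshold;
        _breakDuration = breakDuration;
        _now = now;
    }

    // Call before each request: false means "fail fast".
    public bool AllowRequest()
    {
        // Open -> Half-Open once the break duration has elapsed.
        if (State == BreakerState.Open && _now() - _openedAt >= _breakDuration)
            State = BreakerState.HalfOpen;
        return State != BreakerState.Open;
    }

    // Any success closes the circuit and resets the failure count.
    public void RecordSuccess()
    {
        _consecutiveFailures = 0;
        State = BreakerState.Closed;
    }

    // A failed probe reopens immediately; otherwise open at the threshold.
    public void RecordFailure()
    {
        _consecutiveFailures++;
        if (State == BreakerState.HalfOpen || _consecutiveFailures >= _failureThreshold)
        {
            State = BreakerState.Open;
            _openedAt = _now();
        }
    }
}
```

Injecting the clock as a Func<DateTime> is a design choice for the sketch: it lets the break duration be "waited out" without sleeping.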

Implementing a Basic Circuit Breaker

using Polly;
using Polly.CircuitBreaker; // BrokenCircuitException
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Treat both thrown HttpRequestExceptions and non-success responses as failures.
var circuitBreaker = Policy
    .Handle<HttpRequestException>()
    .OrResult<HttpResponseMessage>(r => !r.IsSuccessStatusCode)
    .CircuitBreakerAsync<HttpResponseMessage>(
        handledEventsAllowedBeforeBreaking: 5,
        durationOfBreak: TimeSpan.FromSeconds(30),
        // The callbacks are synchronous Action delegates, so they are not async lambdas.
        onBreak: (outcome, duration) =>
        {
            // Exception is null when the failure was a non-success response.
            var reason = outcome.Exception?.Message ?? outcome.Result?.StatusCode.ToString();
            Console.WriteLine($"Circuit opened for {duration.TotalSeconds}s. Error: {reason}");
        },
        onReset: () => Console.WriteLine("Circuit closed. Service recovered."),
        onHalfOpen: () => Console.WriteLine("Circuit half-open. Testing service...")
    );

using var client = new HttpClient();
for (int i = 0; i < 10; i++)
{
    try
    {
        var response = await circuitBreaker.ExecuteAsync(async () =>
            await client.GetAsync("https://api.example.com/data")
        );
        // Handled results are returned, not thrown, so check the status before reporting success.
        Console.WriteLine(response.IsSuccessStatusCode
            ? $"Request {i + 1}: Success"
            : $"Request {i + 1}: Failed with status {(int)response.StatusCode}");
    }
    catch (BrokenCircuitException)
    {
        Console.WriteLine($"Request {i + 1}: Circuit is open. Failing fast.");
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Request {i + 1}: Error - {ex.Message}");
    }

    await Task.Delay(TimeSpan.FromSeconds(2));
}

Configuration breakdown:

  • handledEventsAllowedBeforeBreaking: 5. Open the circuit after 5 consecutive handled failures; any success resets the count.
  • durationOfBreak: TimeSpan.FromSeconds(30). Keep the circuit open for 30 seconds.
  • onBreak: callback when the circuit opens (log, alert, notify).
  • onReset: callback when the circuit closes after recovery.
  • onHalfOpen: callback when the circuit starts testing whether the service has recovered.

Real-world scenario: a database connection pool is exhausted and 5 consecutive calls fail. The circuit opens and rejects new requests immediately. After 30 seconds, the circuit enters the half-open state and lets a single probe request through. If the probe succeeds, the circuit closes and normal traffic resumes; if it fails, the circuit opens for another 30 seconds.

Advanced Configuration: Failure Threshold as Percentage

Instead of a fixed failure count, you can use a percentage threshold with AdvancedCircuitBreakerAsync. Open the circuit if 50% or more of requests fail within the sampling window:

using Polly;
using System;
using System.Net.Http;
using System.Threading.Tasks;

var circuitBreaker = Policy
    .Handle<HttpRequestException>()
    .OrResult<HttpResponseMessage>(r => !r.IsSuccessStatusCode)
    .AdvancedCircuitBreakerAsync(
        failureThreshold: 0.5, // Break at a 50% or higher failure rate
        samplingDuration: TimeSpan.FromSeconds(10), // Measured over a 10-second window
        minimumThroughput: 5, // Require at least 5 requests in the window to measure
        durationOfBreak: TimeSpan.FromSeconds(60),
        // Synchronous callbacks; the overload that takes onBreak also requires onReset.
        onBreak: (outcome, duration) =>
            Console.WriteLine($"Circuit opened: failure rate reached 50%. Waiting {duration.TotalSeconds}s."),
        onReset: () => Console.WriteLine("Circuit closed. Service recovered.")
    );

// Usage
using var client = new HttpClient();
var response = await circuitBreaker.ExecuteAsync(async () =>
    await client.GetAsync("https://api.example.com/data")
);

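Why use percentage? A consecutive-failure count resets on every success, so it can miss a service that fails intermittently, for example every other request under heavy traffic. A percentage-based threshold measures the failure rate over the sampling window instead, and minimumThroughput keeps it from opening on a handful of requests during quiet periods, so it opens only during genuine degradation.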
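
The arithmetic behind a rate-based threshold can be sketched in a few lines. ShouldBreak is a hypothetical helper written for this illustration, not a Polly API, and it judges a single window from raw counts, whereas a real breaker tracks a rolling window over time.

```csharp
using System;

// Hypothetical helper: decide whether one sampling window should trip the breaker.
static bool ShouldBreak(int failures, int total, double failureThreshold, int minimumThroughput)
{
    // Too few requests in the window: not enough evidence to judge.
    if (total < minimumThroughput) return false;

    // Trip when the observed failure rate reaches the threshold.
    return (double)failures / total >= failureThreshold;
}

// 2 of 3 failed (67%), but only 3 requests were seen: stays closed.
Console.WriteLine(ShouldBreak(failures: 2, total: 3, failureThreshold: 0.5, minimumThroughput: 5));   // False

// 6 of 10 failed (60%), above the 50% threshold: opens.
Console.WriteLine(ShouldBreak(failures: 6, total: 10, failureThreshold: 0.5, minimumThroughput: 5));  // True

// 4 of 10 failed (40%), below the threshold: stays closed.
Console.WriteLine(ShouldBreak(failures: 4, total: 10, failureThreshold: 0.5, minimumThroughput: 5));  // False
```

The first case shows why minimumThroughput matters: without it, two failures out of three requests at 3 a.m. would open the circuit.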

Testing Circuit Breaker State Transitions

Unit tests verify the circuit breaker transitions correctly:

using Polly;
using Polly.CircuitBreaker; // BrokenCircuitException, CircuitState
using Xunit;
using System;
using System.Threading.Tasks;

public class CircuitBreakerTests
{
    [Fact]
    public async Task CircuitOpensAfterThresholdExceeded()
    {
        // A long break keeps the circuit open for the whole test, even on a slow build machine.
        var policy = Policy
            .Handle<Exception>()
            .CircuitBreakerAsync(
                handledEventsAllowedBeforeBreaking: 3,
                durationOfBreak: TimeSpan.FromSeconds(30)
            );

        // Trigger 3 consecutive failures; each one still surfaces the original exception.
        for (int i = 0; i < 3; i++)
        {
            await Assert.ThrowsAsync<Exception>(
                () => policy.ExecuteAsync(() =>
                    throw new Exception("Simulated failure")
                )
            );
        }

        Assert.Equal(CircuitState.Open, policy.CircuitState);

        // Circuit is open; the delegate never runs and the call fails fast.
        await Assert.ThrowsAsync<BrokenCircuitException>(
            () => policy.ExecuteAsync(() =>
                throw new Exception("This should not execute")
            )
        );
    }

    [Fact]
    public async Task CircuitRecoversAfterBreakDuration()
    {
        var policy = Policy
            .Handle<Exception>()
            .CircuitBreakerAsync(
                handledEventsAllowedBeforeBreaking: 1,
                durationOfBreak: TimeSpan.FromMilliseconds(100)
            );

        // Trigger one failure to open the circuit
        await Assert.ThrowsAsync<Exception>(
            () => policy.ExecuteAsync(() => throw new Exception("Fail"))
        );

        // Wait for the break duration to expire
        await Task.Delay(TimeSpan.FromMilliseconds(150));

        // Circuit should enter half-open and allow the request
        var result = await policy.ExecuteAsync(() =>
            Task.FromResult("Success")
        );

        Assert.Equal("Success", result);
        // A successful probe closes the circuit again.
        Assert.Equal(CircuitState.Closed, policy.CircuitState);
    }
}

Combining Circuit Breaker with Retry

Production services typically combine a circuit breaker with retry. Retry absorbs transient glitches, while the circuit breaker protects against sustained outages:

using Polly;
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Exponential backoff: 200 ms, 400 ms, 800 ms
var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .WaitAndRetryAsync(
        retryCount: 3,
        sleepDurationProvider: attempt =>
            TimeSpan.FromMilliseconds(Math.Pow(2, attempt) * 100)
    );

// Both policies handle only thrown HttpRequestExceptions here;
// non-success status codes are returned to the caller untouched.
var circuitBreakerPolicy = Policy
    .Handle<HttpRequestException>()
    .CircuitBreakerAsync(
        handledEventsAllowedBeforeBreaking: 5,
        durationOfBreak: TimeSpan.FromSeconds(30)
    );

// Wrap: retry is the outer policy and the circuit breaker the inner one,
// so every retry attempt passes through the breaker.
var combinedPolicy = Policy.WrapAsync(retryPolicy, circuitBreakerPolicy);

using var client = new HttpClient();
var response = await combinedPolicy.ExecuteAsync(async () =>
    await client.GetAsync("https://api.example.com/data")
);

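Execution order: Request → Retry (the initial attempt plus up to 3 retries) → Circuit Breaker (checks state on every attempt and rejects if open) → call target. Because the retry policy handles only HttpRequestException, a BrokenCircuitException from an open circuit is not retried; it surfaces to the caller immediately.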
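
To see how the two policies interact, the following sketch counts how often the target is actually invoked. It assumes the same Polly v7-style API used in the code above; FailingCall is a hypothetical stand-in for an HTTP call that always fails, and the 10 ms retry delay is chosen only to keep the demo fast.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;
using Polly.CircuitBreaker;

int calls = 0;

var retry = Policy
    .Handle<HttpRequestException>()
    .WaitAndRetryAsync(3, attempt => TimeSpan.FromMilliseconds(10));

var breaker = Policy
    .Handle<HttpRequestException>()
    .CircuitBreakerAsync(5, TimeSpan.FromSeconds(30));

// Retry is outermost, so each attempt goes through the breaker.
var combined = Policy.WrapAsync(retry, breaker);

// Hypothetical stand-in for a call to a service that is down.
Task FailingCall()
{
    calls++;
    throw new HttpRequestException("simulated outage");
}

try
{
    // Initial attempt + 3 retries = 4 failures; the breaker (threshold 5) stays closed.
    await combined.ExecuteAsync(() => FailingCall());
}
catch (HttpRequestException)
{
    Console.WriteLine($"Execution 1: retries exhausted after {calls} calls");
}

try
{
    // The first attempt is the 5th consecutive failure and opens the circuit.
    // The following retry is rejected by the breaker without calling the target.
    await combined.ExecuteAsync(() => FailingCall());
}
catch (BrokenCircuitException)
{
    Console.WriteLine($"Execution 2: circuit open after {calls} total calls");
}
```

Under these assumptions the first execution reaches the target 4 times and the second only once more, for 5 calls in total; every later execution fails fast until the 30-second break expires.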

Key Takeaways

  • Circuit breaker prevents cascading failures by stopping requests to a failing service.
  • Three states (Closed, Open, Half-Open) manage the transition from normal operation to recovery.
  • Threshold options include fixed failure count or percentage-based rate; choose based on your traffic volume.
  • Break duration should allow time for the target service to recover (30 seconds to 5 minutes typical).
  • Always combine circuit breaker with retry: retry handles transients, circuit breaker handles sustained outages.

Frequently Asked Questions

What is the difference between circuit breaker open and closed?

Closed means the circuit is operational; requests pass through and failures are tracked. Open means the failure threshold is exceeded; new requests fail immediately with BrokenCircuitException without calling the target. This protects both the service and your threads.

How do I set the break duration?

Start with 30-60 seconds for most services. If the service takes longer to recover (e.g., database full restart), use 2-5 minutes. Monitor recovery time in production and adjust. Too short, and the circuit reopens before recovery completes. Too long, and users see unnecessary downtime.

Can I use circuit breaker for all operations?

Circuit breaker is most valuable for calls to external services (APIs, databases, message queues) where failures can cascade. Don't use it for in-process operations or where every millisecond counts; the state tracking adds small overhead.

What happens to requests in flight when the circuit opens?

In-flight requests complete or fail normally. Only new requests are rejected immediately with BrokenCircuitException. This is why the break duration must be long enough for the service to actually recover.

Further Reading