Polly Resilience: Fundamentals Guide (2026)
Polly is a .NET resilience and transient-fault-handling library that wraps synchronous and asynchronous code to guard against temporary failures in distributed systems. It provides policies for retries, circuit breakers, timeouts, bulkhead isolation, and fallbacks, each addressing a specific failure scenario. Polly v8, released in 2023 around the same time as .NET 8, introduced resilience pipelines, which register through dependency injection, emit built-in telemetry, and simplify composition. (Polly v8 calls policies "strategies"; this guide keeps the older, more familiar term.)
What Is Polly Resilience?
Polly resilience in .NET means using declarative policies to handle transient failures without rewriting your core business logic. A transient failure is a temporary problem—a network timeout, a briefly unavailable service, a rate-limit rejection—that might succeed if retried. Polly's policies automate retry logic, prevent cascading failures, and degrade gracefully rather than crashing.
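The Polly library (NuGet: Polly, with the v8 API in Polly.Core and dependency-injection support in Polly.Extensions) was originally created by Michael Wolfenden and has since been maintained by the App-vNext team and community, with Dylan Reisenberger as a long-time lead maintainer (GitHub: github.com/App-vNext/Polly). Over 300 million NuGet downloads indicate wide adoption in production microservices. Microsoft's own resilience packages, Microsoft.Extensions.Resilience and Microsoft.Extensions.Http.Resilience, are built on Polly v8 resilience pipelines, which has made Polly the de facto foundation for .NET resilience patterns.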
Core Polly Policy Types
This guide covers Polly's five core policy types, each solving a specific problem:
Retry Policies
A retry policy automatically retries a failed operation, waiting a configurable delay between attempts (for example, a fixed wait or exponential backoff). If a call throws an exception or returns a result you have marked as a failure, such as a specific HTTP status code, the policy waits and tries again. This is ideal for intermittent network glitches or brief service unavailability.
Example use case: An HTTP call to a downstream API times out once every 100 requests. A simple retry with a 500 ms delay catches it and succeeds on the second attempt.
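The retry behavior described above can be sketched in a few lines. This is a language-neutral illustration in Python, not Polly's API: the names `retry` and `flaky_call` are hypothetical, and the `sleep` parameter is an assumption added so the example can run without real waiting.

```python
import time


def retry(operation, max_retries=3, delay_seconds=0.5, sleep=time.sleep):
    """Call operation(); on an exception, wait delay_seconds and try again.

    max_retries counts retries after the first attempt, so the operation
    runs at most max_retries + 1 times.
    """
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last failure
            sleep(delay_seconds)


# Hypothetical flaky call: fails on the first attempt, then succeeds.
calls = {"count": 0}


def flaky_call():
    calls["count"] += 1
    if calls["count"] == 1:
        raise TimeoutError("simulated transient timeout")
    return "ok"


waits = []  # records each delay instead of actually sleeping
result = retry(flaky_call, max_retries=3, delay_seconds=0.5, sleep=waits.append)
```

Here the second attempt succeeds after a single 500 ms wait. A production retry would normally catch only the exception types known to be transient rather than every `Exception`.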
Circuit Breaker Policies
A circuit breaker stops sending requests to a failing service, protecting the service from overload and your system from cascading failures. The circuit has three states: Closed (requests pass through), Open (requests fail immediately), and Half-Open (testing if the service recovered).
Example use case: A database becomes overwhelmed. The circuit breaker opens after 5 consecutive failures, rejecting new requests for 30 seconds. After that window, it probes with a single request; if it succeeds, the circuit closes.
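The three states can be made concrete with a small state machine. The sketch below is a simplified, single-threaded illustration in Python rather than Polly's implementation; `CircuitBreaker`, `CircuitOpenError`, and the injectable `clock` are hypothetical names introduced for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, break_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.break_seconds = break_seconds
        self.clock = clock
        self.state = "closed"
        self.consecutive_failures = 0
        self.opened_at = None

    def call(self, operation):
        if self.state == "open":
            if self.clock() - self.opened_at < self.break_seconds:
                raise CircuitOpenError("circuit is open; failing fast")
            self.state = "half-open"  # break elapsed: let one probe through
        try:
            result = operation()
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        self.consecutive_failures = 0
        self.state = "closed"

    def _on_failure(self):
        self.consecutive_failures += 1
        # A failed probe, or too many consecutive failures, (re)opens the circuit.
        if self.state == "half-open" or self.consecutive_failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()


# Demo with a fake clock so no real 30-second wait is needed.
now = [0.0]
breaker = CircuitBreaker(failure_threshold=5, break_seconds=30.0, clock=lambda: now[0])


def failing():
    raise ConnectionError("database overloaded")


for _ in range(5):
    try:
        breaker.call(failing)
    except ConnectionError:
        pass
state_after_failures = breaker.state   # the fifth failure opens the circuit

now[0] = 31.0                          # the 30-second break has elapsed
probe_result = breaker.call(lambda: "row")
state_after_probe = breaker.state      # a successful probe closes it again
```

The sketch ignores concurrency: a real breaker has to make sure only one probe runs while the circuit is half-open.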
Timeout Policies
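A timeout policy enforces a maximum execution time. If an operation takes longer than the limit, Polly signals cancellation and throws a TimeoutRejectedException. The operation has to honor the cancellation token for the work itself to stop. This prevents callers from waiting indefinitely on slow or stuck operations.
Example use case: A third-party API call should complete in 2 seconds. If it hangs, the timeout policy cancels it at the 2-second mark instead of waiting 5 seconds or longer. The timeout does not retry on its own; an outer retry policy, if configured, decides whether to try again.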
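A minimal Python sketch of the timeout idea, using the standard library's `asyncio.wait_for`. It illustrates the pattern and is not Polly's API: `TimeoutRejected`, `with_timeout`, `slow_api`, and `fast_api` are hypothetical names, and the durations are scaled down from the 2-second example so the demo finishes quickly.

```python
import asyncio


class TimeoutRejected(Exception):
    """Hypothetical stand-in for a timeout rejection in this sketch."""


async def with_timeout(make_call, seconds):
    """Run the coroutine returned by make_call(), cancelling it after `seconds`."""
    try:
        return await asyncio.wait_for(make_call(), timeout=seconds)
    except asyncio.TimeoutError:
        raise TimeoutRejected(f"operation exceeded {seconds} s") from None


async def slow_api():
    await asyncio.sleep(1)  # simulates a hung third-party call
    return "late"


async def fast_api():
    await asyncio.sleep(0)
    return "on time"


async def demo():
    ok = await with_timeout(fast_api, seconds=0.05)
    try:
        await with_timeout(slow_api, seconds=0.05)
        timed_out = False
    except TimeoutRejected:
        timed_out = True
    return ok, timed_out


ok, timed_out = asyncio.run(demo())
```

Cancellation here is cooperative: `asyncio.wait_for` cancels the awaited task, which only works because `slow_api` is suspended at an `await`. The sketch performs no retry; it just converts the overrun into an exception for the caller to handle.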
Bulkhead Isolation Policies
A bulkhead limits the number of concurrent executions. Excess requests are queued or rejected, preventing one slow operation from exhausting all available thread pool resources. This isolates failures to a single subsystem.
Example use case: A microservice has 16 thread pool threads. One bulkhead caps calls to Service-A at 4 concurrent executions and another caps calls to Service-B at 4. If Service-A slows down, its calls can occupy at most 4 threads, so it cannot starve Service-B.
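The concurrency cap can be sketched as a counter around each call. This Python illustration uses asyncio tasks rather than thread pool threads and rejects excess calls instead of queuing them; `Bulkhead`, `BulkheadRejected`, and `slow_service_a` are hypothetical names, not Polly types.

```python
import asyncio


class BulkheadRejected(Exception):
    """Raised when the bulkhead is already at its concurrency limit."""


class Bulkhead:
    def __init__(self, max_concurrent):
        self.max_concurrent = max_concurrent
        self.in_flight = 0

    async def run(self, make_call):
        # No await between the check and the increment, so this is safe
        # on a single asyncio event loop.
        if self.in_flight >= self.max_concurrent:
            raise BulkheadRejected("bulkhead full")
        self.in_flight += 1
        try:
            return await make_call()
        finally:
            self.in_flight -= 1


async def slow_service_a():
    await asyncio.sleep(0.05)  # simulates a slow downstream call
    return "A"


async def demo():
    bulkhead_a = Bulkhead(max_concurrent=4)
    # Six calls arrive at once; only four may run concurrently.
    results = await asyncio.gather(
        *(bulkhead_a.run(slow_service_a) for _ in range(6)),
        return_exceptions=True,
    )
    accepted = [r for r in results if r == "A"]
    rejected = [r for r in results if isinstance(r, BulkheadRejected)]
    return len(accepted), len(rejected)


accepted_count, rejected_count = asyncio.run(demo())
```

Four calls are accepted and two are rejected immediately. A queueing variant would wait on a semaphore instead of raising, at the cost of added latency for the queued calls.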
Fallback Policies
A fallback specifies an alternative action if the primary operation fails—return a cached value, a default response, or a gracefully degraded result. This keeps the application alive even if a dependency is temporarily down.
Example use case: A recommendation service is unreachable. The fallback returns a list of trending items from cache instead of failing the user's request.
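A fallback amounts to a guarded call with a substitute result. The Python sketch below illustrates the idea only; `with_fallback`, `fetch_recommendations`, and `TRENDING_CACHE` are hypothetical names, and the cached list is made-up sample data.

```python
def with_fallback(primary, fallback, handled=(Exception,)):
    """Return primary(); if it raises one of `handled`, return fallback() instead."""
    try:
        return primary()
    except handled:
        return fallback()


TRENDING_CACHE = ["item-1", "item-2", "item-3"]  # hypothetical cached data


def fetch_recommendations():
    raise ConnectionError("recommendation service unreachable")


items = with_fallback(
    fetch_recommendations,
    lambda: TRENDING_CACHE,
    handled=(ConnectionError,),
)
```

Restricting `handled` to connection errors keeps the fallback from hiding programming bugs, which should still surface as failures.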
When to Use Each Policy
The table below guides policy selection by failure scenario:
| Failure Scenario | Best Policy | Why |
|---|---|---|
| Temporary network glitch, brief service hiccup | Retry | Waits a moment and tries again; simple and effective for transient blips. |
| Service degraded or overloaded; risk of cascade | Circuit Breaker | Stops hammering the service; lets it recover. |
| Operation hangs indefinitely | Timeout | Frees resources and unblocks waiting threads. |
| One slow dependency could exhaust shared threads or connections | Bulkhead | Caps concurrent calls per dependency; isolates blast radius so other services stay responsive. |
| Dependency unavailable; need partial service | Fallback | Serve from cache or degraded mode instead of returning error. |
Key Takeaways
- Polly is a .NET resilience library for handling transient failures in distributed systems without rewriting business logic.
- Five core policies address different failure modes: retries, circuit breakers, timeouts, bulkheads, and fallbacks.
- Combining policies is common. A typical ordering, from outermost to innermost, is fallback → retry → circuit breaker → timeout, all wrapping one call.
- Resilience pipelines (Polly v8+) offer dependency-injection integration, better observability, and simplified policy composition.
- Know your failure modes before choosing policies; wrong policy selection adds latency or masks real problems.
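To make policy composition concrete, here is a sketch of one commonly recommended nesting, outermost to innermost: fallback, retry, circuit breaker, timeout. It is a deliberately simplified Python illustration, not Polly's pipeline API. All names are hypothetical, the breaker never resets, and the retry has no delay.

```python
import asyncio


def with_timeout(inner, seconds):
    async def run():
        return await asyncio.wait_for(inner(), timeout=seconds)
    return run


def with_circuit_breaker(inner, failure_threshold):
    failures = 0  # simplified: no half-open state, no reset timer

    async def run():
        nonlocal failures
        if failures >= failure_threshold:
            raise RuntimeError("circuit open: failing fast")
        try:
            result = await inner()
        except Exception:
            failures += 1
            raise
        failures = 0
        return result
    return run


def with_retry(inner, max_retries):
    async def run():
        for attempt in range(max_retries + 1):
            try:
                return await inner()
            except Exception:
                if attempt == max_retries:
                    raise
    return run


def with_fallback(inner, fallback_value):
    async def run():
        try:
            return await inner()
        except Exception:
            return fallback_value
    return run


attempts = {"count": 0}


async def hung_dependency():
    attempts["count"] += 1
    await asyncio.sleep(1)  # never finishes within the timeout
    return "fresh"


# Outermost wrapper first: fallback -> retry -> circuit breaker -> timeout.
pipeline = with_fallback(
    with_retry(
        with_circuit_breaker(
            with_timeout(hung_dependency, seconds=0.01),
            failure_threshold=2,
        ),
        max_retries=3,
    ),
    fallback_value="cached",
)

result = asyncio.run(pipeline())
```

The timeout bounds each attempt, the breaker trips after two timed-out attempts, the remaining retries fail fast against the open circuit without touching the dependency, and the fallback finally returns the cached value.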
Frequently Asked Questions
What is the difference between a retry policy and a circuit breaker?
A retry policy repeats a failed operation immediately or after a delay, assuming the failure is temporary and the operation might succeed. A circuit breaker stops attempting after repeated failures, protecting the target service from overload. Retries handle transient glitches; circuit breakers protect against sustained outages.
Should I always use exponential backoff for retries?
Not always. Exponential backoff (the delay doubles with each retry) suits rate-limited or resource-contention scenarios because it spreads retries over time. Adding random jitter to each delay is what keeps many clients from retrying in lockstep, the thundering herd problem. For isolated, random glitches, a fixed delay or linear backoff is usually enough. Measure your failure rate and adjust: too much backoff adds latency the caller may not tolerate.
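The delay schedule is easy to compute by hand. The Python sketch below assumes a 0.5-second base delay, a 30-second cap, and a "full jitter" scheme (a uniform random delay between zero and the exponential value); those numbers and the name `backoff_delay` are illustrative choices, not Polly defaults.

```python
import random


def backoff_delay(attempt, base_seconds=0.5, cap_seconds=30.0, jitter=True, rng=random):
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped."""
    delay = min(cap_seconds, base_seconds * (2 ** attempt))
    if jitter:
        # Full jitter: pick uniformly in [0, delay] so clients that failed
        # together do not all retry at the same instant.
        delay = rng.uniform(0.0, delay)
    return delay


# Without jitter the schedule simply doubles: 0.5, 1.0, 2.0, 4.0, 8.0 seconds.
schedule = [backoff_delay(n, jitter=False) for n in range(5)]
```

The cap matters: without it, the tenth retry would wait 512 seconds, which is the "excessive backoff" case where a caller has usually given up long before.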
Can I use Polly with HttpClient?
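Yes. You can wrap HttpClient calls with Polly policies directly, or attach them to the HTTP pipeline through IHttpClientFactory: the Microsoft.Extensions.Http.Polly package provides AddPolicyHandler, which accepts an IAsyncPolicy<HttpResponseMessage>. The newer Microsoft.Extensions.Http.Resilience package builds on Polly v8 resilience pipelines and offers AddStandardResilienceHandler for a preconfigured setup, which reduces boilerplate.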
What is bulkhead isolation and when do I need it?
Bulkhead isolation caps how much of a shared resource (threads, connections) a single operation may consume at once, preventing one slow service from starving others. Use it when multiple independent services share a thread pool or connection pool and you want to limit cross-service blast radius.
Further Reading
- Polly GitHub Repository
- Microsoft: Handle transient faults with Polly
- Resilience Pipeline .NET
- Release It! Second Edition—Michael T. Nygard (seminal text on resilience patterns)