Polly Resilience: Fundamentals Guide (2026)

Polly is a .NET resilience and transient-fault-handling library that wraps synchronous and asynchronous code to guard against temporary failures in distributed systems. It provides policies for retries, circuit breakers, timeouts, bulkhead isolation, and fallbacks, each addressing a specific failure scenario. Polly v8, released in 2023 shortly before .NET 8, introduced resilience pipelines, which plug into dependency injection and offer built-in telemetry and streamlined composition.

What Is Polly Resilience?

Polly resilience in .NET means using declarative policies to handle transient failures without rewriting your core business logic. A transient failure is a temporary problem—a network timeout, a briefly unavailable service, a rate-limit rejection—that might succeed if retried. Polly's policies automate retry logic, prevent cascading failures, and degrade gracefully rather than crashing.

The Polly library (NuGet: Polly, with the v8 core API in Polly.Core) was originally created by Michael Wolfenden and is now maintained by the App vNext community, with Dylan Reisenberger as a long-time lead maintainer (GitHub: github.com/App-vNext/Polly). Over 300 million downloads demonstrate wide adoption in production microservices. In 2024–2026, Polly became foundational to .NET resilience patterns: Microsoft's own resilience packages (Microsoft.Extensions.Resilience and Microsoft.Extensions.Http.Resilience) are built on Polly's resilience pipelines rather than competing with them.

Core Polly Policy Types

Polly provides five main policy types, each solving a specific problem:

Retry Policies

A retry policy automatically re-executes a failed operation, optionally waiting between attempts with a fixed delay or exponential backoff. If a call throws an exception or returns a result you have marked as a failure (for example, an HTTP 503 status code), the policy waits and tries again. This is ideal for intermittent network glitches or brief service unavailability.

Example use case: An HTTP call to a downstream API times out once every 100 requests. A simple retry with a 500 ms delay catches it and succeeds on the second attempt.
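The use case above can be sketched with the Polly v8 pipeline API. This is a minimal sketch, assuming the Polly NuGet package (v8 or later) is referenced; the flaky operation is simulated and the attempt counter is a hypothetical stand-in for a real HTTP call.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;
using Polly.Retry;

// Retry up to 3 times, waiting a constant 500 ms between attempts.
ResiliencePipeline pipeline = new ResiliencePipelineBuilder()
    .AddRetry(new RetryStrategyOptions
    {
        ShouldHandle = new PredicateBuilder().Handle<HttpRequestException>(),
        MaxRetryAttempts = 3,
        Delay = TimeSpan.FromMilliseconds(500),
        BackoffType = DelayBackoffType.Constant,
    })
    .Build();

int attempts = 0;

// Simulated operation (assumption): fails on the first attempt only.
string result = await pipeline.ExecuteAsync(async token =>
{
    attempts++;
    await Task.Delay(10, token);
    if (attempts == 1)
    {
        throw new HttpRequestException("simulated transient failure");
    }
    return "ok";
});

Console.WriteLine($"{result} after {attempts} attempts");
```

Only exceptions matched by ShouldHandle are retried; anything else propagates to the caller on the first failure.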

Circuit Breaker Policies

A circuit breaker stops sending requests to a failing service, protecting the service from overload and your system from cascading failures. The circuit has three states: Closed (requests pass through), Open (requests fail immediately), and Half-Open (testing if the service recovered).

Example use case: A database becomes overwhelmed. The circuit breaker opens after 5 consecutive failures, rejecting new requests for 30 seconds. After that window, it probes with a single request; if it succeeds, the circuit closes.
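A rough sketch of this scenario with the Polly v8 API follows. One assumption to flag: the v8 circuit breaker trips on a failure ratio within a sampling window rather than on a count of consecutive failures, so the settings below (100% failures across at least 5 calls) only approximate "5 consecutive failures". The failing database call is simulated.

```csharp
using System;
using System.Threading.Tasks;
using Polly;
using Polly.CircuitBreaker;

ResiliencePipeline pipeline = new ResiliencePipelineBuilder()
    .AddCircuitBreaker(new CircuitBreakerStrategyOptions
    {
        ShouldHandle = new PredicateBuilder().Handle<InvalidOperationException>(),
        FailureRatio = 1.0,                          // every sampled call must have failed
        MinimumThroughput = 5,                       // and at least 5 calls must be sampled
        SamplingDuration = TimeSpan.FromSeconds(10),
        BreakDuration = TimeSpan.FromSeconds(30),    // stay open for 30 seconds
    })
    .Build();

int executed = 0;
int rejected = 0;

for (int i = 0; i < 8; i++)
{
    try
    {
        await pipeline.ExecuteAsync(async token =>
        {
            executed++;
            await Task.Delay(1, token);
            // Simulated failure (assumption): the database is overloaded.
            throw new InvalidOperationException("simulated database overload");
        });
    }
    catch (InvalidOperationException)
    {
        // The circuit was closed, so the call ran and its failure passed through.
    }
    catch (BrokenCircuitException)
    {
        // The circuit was open, so Polly failed fast without running the call.
        rejected++;
    }
}

Console.WriteLine($"executed={executed}, rejected={rejected}");
```

Once the circuit opens, the remaining calls never reach the simulated database, which is the protective effect described above.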

Timeout Policies

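A timeout policy enforces a maximum execution time. If an operation takes longer than the limit, Polly signals cancellation and throws a TimeoutRejectedException. Cancellation is cooperative by default, so the operation must observe the CancellationToken that Polly passes in. This prevents callers from waiting indefinitely on slow or stuck operations.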

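Example use case: A third-party API call should complete in 2 seconds. If it hangs, the timeout policy cancels it at the 2-second mark instead of letting it run for 5 seconds or longer. The timeout itself does not retry; a retry policy wrapped around it can decide whether to try again.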
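A minimal sketch of the 2-second limit, assuming the Polly v8 API. The slow call is simulated with Task.Delay, and it only stops early because it passes along the token Polly supplies.

```csharp
using System;
using System.Threading.Tasks;
using Polly;
using Polly.Timeout;

ResiliencePipeline pipeline = new ResiliencePipelineBuilder()
    .AddTimeout(TimeSpan.FromSeconds(2))
    .Build();

bool timedOut = false;

try
{
    await pipeline.ExecuteAsync(async token =>
    {
        // Simulated slow call (assumption). Honoring the token is what
        // allows the timeout to cancel the work.
        await Task.Delay(TimeSpan.FromSeconds(5), token);
    });
}
catch (TimeoutRejectedException)
{
    timedOut = true;
}

Console.WriteLine($"Timed out: {timedOut}");
```

If the delegate ignored the token, the timeout could not interrupt it, so passing the token through to downstream calls is worth checking in real code.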

Bulkhead Isolation Policies

A bulkhead limits the number of concurrent executions. Excess requests are queued or rejected, preventing one slow operation from exhausting all available thread pool resources. This isolates failures to a single subsystem.

Example use case: A microservice has 16 thread pool threads. One bulkhead caps calls to Service-A at 4 concurrent executions and another caps calls to Service-B at 4. If Service-A slows down, its calls cannot occupy more than their share, so Service-B is not starved.
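In Polly v8 the bulkhead idea is expressed as a concurrency limiter. The sketch below assumes the Polly.RateLimiting NuGet package is referenced alongside Polly; the pipeline name and the simulated Service-A call are hypothetical.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Polly;
using Polly.RateLimiting;

// At most 4 concurrent calls to Service-A, with no queue for excess calls.
ResiliencePipeline serviceALimiter = new ResiliencePipelineBuilder()
    .AddConcurrencyLimiter(4, 0)
    .Build();

int admitted = 0;
int rejected = 0;

async Task CallServiceAAsync()
{
    try
    {
        await serviceALimiter.ExecuteAsync(async token =>
        {
            Interlocked.Increment(ref admitted);
            // Simulated slow dependency (assumption).
            await Task.Delay(200, token);
        });
    }
    catch (RateLimiterRejectedException)
    {
        // The limiter was full, so this call was turned away immediately.
        Interlocked.Increment(ref rejected);
    }
}

// Start 6 calls at once; only 4 fit inside the limiter.
await Task.WhenAll(Enumerable.Range(0, 6).Select(_ => CallServiceAAsync()));

Console.WriteLine($"admitted={admitted}, rejected={rejected}");
```

A second limiter built the same way for Service-B would give each dependency its own cap, which is the isolation the use case describes.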

Fallback Policies

A fallback specifies an alternative action if the primary operation fails—return a cached value, a default response, or a gracefully degraded result. This keeps the application alive even if a dependency is temporarily down.

Example use case: A recommendation service is unreachable. The fallback returns a list of trending items from cache instead of failing the user's request.
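The fallback above might look like the following with the Polly v8 API. This is a sketch under assumptions: the cached list and the GetRecommendationsAsync helper are hypothetical, and the outage is simulated.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Polly;
using Polly.Fallback;

// Hypothetical cached data standing in for a real cache lookup.
string[] cachedTrending = { "trending-1", "trending-2", "trending-3" };

ResiliencePipeline<string[]> pipeline = new ResiliencePipelineBuilder<string[]>()
    .AddFallback(new FallbackStrategyOptions<string[]>
    {
        ShouldHandle = new PredicateBuilder<string[]>().Handle<HttpRequestException>(),
        // Substitute the cached list when the primary call fails.
        FallbackAction = args => Outcome.FromResultAsValueTask(cachedTrending),
    })
    .Build();

// Hypothetical primary call; here it always fails to simulate an outage.
async ValueTask<string[]> GetRecommendationsAsync(CancellationToken token)
{
    await Task.Delay(10, token);
    throw new HttpRequestException("recommendation service unreachable");
}

string[] items = await pipeline.ExecuteAsync(token => GetRecommendationsAsync(token));

Console.WriteLine(string.Join(", ", items));
```

The caller receives the cached list as an ordinary return value and never sees the exception.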

When to Use Each Policy

The table below guides policy selection by failure scenario:

Failure Scenario | Best Policy | Why
Temporary network glitch, brief service hiccup | Retry | Waits a moment and tries again; simple and effective for transient blips.
Service degraded or overloaded; risk of cascade | Circuit Breaker | Stops hammering the service; lets it recover.
Operation hangs indefinitely | Timeout | Frees resources and unblocks waiting callers.
One slow dependency could exhaust shared threads or connections | Bulkhead | Isolates blast radius; other services stay responsive.
Dependency unavailable; need partial service | Fallback | Serves from cache or degraded mode instead of returning an error.

Key Takeaways

  • Polly is a .NET resilience library for handling transient failures in distributed systems without rewriting business logic.
  • Five core policies address different failure modes: retries, circuit breakers, timeouts, bulkheads, and fallbacks.
  • Combining policies is common. A typical ordering, from outermost to innermost, is fallback → retry → circuit breaker → timeout, all wrapping one call.
  • Resilience pipelines (Polly v8+) offer dependency injection integration, better observability, and simplified policy composition.
  • Know your failure modes before choosing policies; wrong policy selection adds latency or masks real problems.
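As an illustration of combining policies, here is a sketch of one pipeline that layers fallback, retry, circuit breaker, and timeout around a single call. It assumes the Polly v8 API, where strategies added first sit outermost. All thresholds are illustrative, and CallDependencyAsync is a hypothetical call that always fails.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Polly;
using Polly.CircuitBreaker;
using Polly.Fallback;
using Polly.Retry;
using Polly.Timeout;

ResiliencePipeline<string> pipeline = new ResiliencePipelineBuilder<string>()
    // Outermost: last line of defense once everything inside has given up.
    .AddFallback(new FallbackStrategyOptions<string>
    {
        ShouldHandle = new PredicateBuilder<string>()
            .Handle<HttpRequestException>()
            .Handle<TimeoutRejectedException>()
            .Handle<BrokenCircuitException>(),
        FallbackAction = args => Outcome.FromResultAsValueTask("cached response"),
    })
    .AddRetry(new RetryStrategyOptions<string>
    {
        ShouldHandle = new PredicateBuilder<string>()
            .Handle<HttpRequestException>()
            .Handle<TimeoutRejectedException>(),
        MaxRetryAttempts = 2,
        Delay = TimeSpan.FromMilliseconds(100),
        BackoffType = DelayBackoffType.Exponential,
        UseJitter = true,
    })
    .AddCircuitBreaker(new CircuitBreakerStrategyOptions<string>
    {
        ShouldHandle = new PredicateBuilder<string>().Handle<HttpRequestException>(),
        FailureRatio = 0.5,
        MinimumThroughput = 10,
        SamplingDuration = TimeSpan.FromSeconds(30),
        BreakDuration = TimeSpan.FromSeconds(30),
    })
    // Innermost: applies to each individual attempt.
    .AddTimeout(TimeSpan.FromSeconds(2))
    .Build();

int attempts = 0;

// Hypothetical dependency call that always fails.
async ValueTask<string> CallDependencyAsync(CancellationToken token)
{
    attempts++;
    await Task.Delay(10, token);
    throw new HttpRequestException("simulated outage");
}

string result = await pipeline.ExecuteAsync(token => CallDependencyAsync(token));

Console.WriteLine($"{result} after {attempts} attempts");
```

With these settings the call runs once plus two retries, and the fallback then supplies the cached value. The circuit breaker stays closed here because three calls fall below its minimum throughput.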

Frequently Asked Questions

What is the difference between a retry policy and a circuit breaker?

A retry policy repeats a failed operation immediately or after a delay, assuming the failure is temporary and the operation might succeed. A circuit breaker stops attempting after repeated failures, protecting the target service from overload. Retries handle transient glitches; circuit breakers protect against sustained outages.

Should I always use exponential backoff for retries?

Exponential backoff (delay doubles each retry) is best for rate-limited or resource-contention scenarios because it spreads retries over time. Adding random jitter keeps many clients from retrying in lockstep, which reduces the thundering herd effect. For truly random transient glitches, fixed delay or linear backoff suffices. Measure your failure rate and adjust; excessive backoff adds unacceptable latency.
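To see the delays a backoff setting produces, one option is to record them in the retry callback. The sketch below assumes the Polly v8 API; the 100 ms base delay and the simulated failures are illustrative. With jitter enabled the recorded values vary from run to run.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Polly;
using Polly.Retry;

var delays = new List<TimeSpan>();

ResiliencePipeline pipeline = new ResiliencePipelineBuilder()
    .AddRetry(new RetryStrategyOptions
    {
        ShouldHandle = new PredicateBuilder().Handle<TimeoutException>(),
        MaxRetryAttempts = 3,
        Delay = TimeSpan.FromMilliseconds(100),      // base delay
        BackoffType = DelayBackoffType.Exponential,  // grows with each retry
        UseJitter = true,                            // randomizes each delay
        OnRetry = args =>
        {
            // Record the wait Polly is about to apply before the next attempt.
            delays.Add(args.RetryDelay);
            return default;
        },
    })
    .Build();

int attempts = 0;

await pipeline.ExecuteAsync(async token =>
{
    attempts++;
    await Task.Delay(1, token);
    // Simulated failures (assumption): the first three attempts fail.
    if (attempts <= 3)
    {
        throw new TimeoutException("simulated rate-limit rejection");
    }
});

foreach (TimeSpan delay in delays)
{
    Console.WriteLine($"waited about {delay.TotalMilliseconds:F0} ms");
}
```

Switching BackoffType to DelayBackoffType.Constant or DelayBackoffType.Linear gives the fixed and linear variants mentioned above.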

Can I use Polly with HttpClient?

Yes. You can wrap HttpClient calls with Polly policies directly, or attach an IAsyncPolicy<HttpResponseMessage> to clients created by IHttpClientFactory through the AddPolicyHandler extension (Microsoft.Extensions.Http.Polly package). The newer resilience pipelines integrate with dependency injection through the Microsoft.Extensions.Http.Resilience package, reducing boilerplate.
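A sketch of the dependency injection route, assuming the Microsoft.Extensions.Http.Resilience NuGet package is referenced. The client name "catalog", the pipeline name, and the base address are hypothetical.

```csharp
using System;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Http.Resilience;
using Polly;

var services = new ServiceCollection();

services.AddHttpClient("catalog", http =>
    {
        http.BaseAddress = new Uri("https://example.com/");
    })
    .AddResilienceHandler("catalog-pipeline", builder =>
    {
        // Retry transient HTTP failures with exponential backoff and jitter.
        builder.AddRetry(new HttpRetryStrategyOptions
        {
            MaxRetryAttempts = 3,
            BackoffType = DelayBackoffType.Exponential,
            UseJitter = true,
        });

        // Limit each individual attempt to 2 seconds.
        builder.AddTimeout(TimeSpan.FromSeconds(2));
    });

using ServiceProvider provider = services.BuildServiceProvider();

// Requests sent through this client pass through the pipeline above.
HttpClient client = provider
    .GetRequiredService<IHttpClientFactory>()
    .CreateClient("catalog");

Console.WriteLine(client.BaseAddress);
```

For a preconfigured combination of strategies, the same package also offers AddStandardResilienceHandler() on the client builder.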

What is bulkhead isolation and when do I need it?

Bulkhead isolation caps how many concurrent executions, and therefore how many threads or connections, a specific operation may use, preventing one slow service from starving others. Use it when multiple independent services share thread pool or connection pool resources and you want to limit cross-service blast radius.

Further Reading