
OpenTelemetry Metrics: .NET Monitoring

Metrics quantify application behavior—request counts, error rates, latency percentiles, and resource usage. While traces show you what happened to a single request, metrics reveal patterns across thousands: "Are errors increasing? Is p95 latency degrading?" OpenTelemetry metrics standardize how .NET applications emit these observations.

Metrics drove a 40-minute incident resolution at a fintech firm. When transactions began failing, traditional logs showed the error message but not the rate—was it 1% or 50%? Metrics revealed the error rate jumped from 0.01% to 15% within 60 seconds, pinpointing the exact time a dependency broke. This section teaches you to instrument similar critical metrics in your .NET services.

Understanding Metric Types

OpenTelemetry defines four metric types, each suited to different measurements:

| Metric Type | Example | When to Use |
|---|---|---|
| Counter | "HTTP requests processed: 10,450" | Monotonically increasing totals (requests, errors, processed items) |
| Histogram | "Request latency distribution: p50=120ms, p99=850ms" | Recording observations (latency, size) to compute percentiles |
| Gauge | "Active connections: 342" | Current values (active threads, memory, open files) |
| UpDownCounter | "Queue depth: +3 jobs, -2 completed" | Values that increase and decrease (queue size, worker pool) |

Each has different aggregation and storage semantics. Counters and UpDownCounters are summed across instances; gauges report the most recent value; histograms record bucketed distributions from which backends estimate percentiles.

Setting Up a MeterProvider

A MeterProvider manages metrics, similar to TracerProvider for traces. Install the Prometheus exporter first:

dotnet add package OpenTelemetry.Exporter.Prometheus.AspNetCore

Then configure:

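using OpenTelemetry;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;

// Describe the service once; the ResourceBuilder is handed to the provider unbuilt.
var resourceBuilder = ResourceBuilder
    .CreateDefault()
    .AddService("order-service", serviceVersion: "1.2.0");

// Standalone use (needs the OpenTelemetry.Exporter.Console package).
// MeterProviderBuilder is abstract, so create it through Sdk.
var meterProvider = Sdk.CreateMeterProviderBuilder()
    .SetResourceBuilder(resourceBuilder)
    .AddMeter("OrderService") // only meters registered here are collected
    .AddConsoleExporter()
    .Build();

// In ASP.NET Core (Program.cs, needs OpenTelemetry.Extensions.Hosting).
// The host builds the provider, so do not call Build() here.
builder.Services.AddOpenTelemetry()
    .WithMetrics(metrics => metrics
        .SetResourceBuilder(resourceBuilder)
        .AddMeter("OrderService")
        .AddPrometheusExporter());

var app = builder.Build();
app.UseOpenTelemetryPrometheusScrapingEndpoint(); // serves /metrics

The Prometheus exporter exposes metrics at the /metrics endpoint, the standard Prometheus scrape target.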

Creating and Recording Counters

A counter is a monotonically increasing value—perfect for tracking totals:

using System.Diagnostics;
using System.Diagnostics.Metrics;

public class OrderService
{
    private readonly Counter<long> _ordersProcessed;
    private readonly Counter<long> _ordersErrored;
    private readonly Histogram<double> _orderProcessingTime;
    private readonly UpDownCounter<int> _activeOrders;

    public OrderService(Meter meter)
    {
        _ordersProcessed = meter.CreateCounter<long>(
            "orders.processed",
            unit: "1",
            description: "Number of orders successfully processed"
        );

        _ordersErrored = meter.CreateCounter<long>(
            "orders.errors",
            unit: "1",
            description: "Number of orders that failed processing"
        );

        _orderProcessingTime = meter.CreateHistogram<double>(
            "order.processing.duration",
            unit: "ms",
            description: "Time to process an order"
        );

        _activeOrders = meter.CreateUpDownCounter<int>(
            "orders.active",
            unit: "1",
            description: "Current number of orders being processed"
        );
    }

    public async Task ProcessOrderAsync(Order order)
    {
        _activeOrders.Add(1);
        // Stopwatch is monotonic; DateTime.UtcNow can jump if the clock is adjusted.
        var stopwatch = Stopwatch.StartNew();

        try
        {
            // Process order
            await Task.Delay(100); // Simulate work

            _ordersProcessed.Add(1, new KeyValuePair<string, object?>("result", "success"));
            _orderProcessingTime.Record(stopwatch.Elapsed.TotalMilliseconds);
        }
        catch (Exception ex)
        {
            _ordersErrored.Add(1, new KeyValuePair<string, object?>("error_type", ex.GetType().Name));
            throw;
        }
        finally
        {
            // Always undo the increment, whether the order succeeded or failed.
            _activeOrders.Add(-1);
        }
    }
}

// Register at startup; the meter name must match the name passed to AddMeter
var meter = new Meter("OrderService", "1.0.0");
var orderService = new OrderService(meter);

Each metric is recorded with optional attributes (key-value tags) for filtering: "result": "success" lets you track success and failure rates separately.
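To illustrate how attributes split one counter into separate series, here is a minimal sketch that uses the built-in MeterListener from System.Diagnostics.Metrics to total measurements per "result" value. The meter name "Demo.Orders" and the "failure" value are assumptions made for the example; a real exporter does this grouping for you.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical meter name, used only for this sketch.
using var meter = new Meter("Demo.Orders");
var processed = meter.CreateCounter<long>("orders.processed");

// Running totals keyed by the value of the "result" attribute.
var totals = new Dictionary<string, long>();

using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    // Subscribe only to instruments from the demo meter.
    if (instrument.Meter.Name == "Demo.Orders")
        l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
{
    string resultValue = "(none)";
    foreach (var tag in tags)
    {
        if (tag.Key == "result")
            resultValue = tag.Value?.ToString() ?? "(null)";
    }
    totals[resultValue] = totals.GetValueOrDefault(resultValue) + value;
});
listener.Start();

processed.Add(1, new KeyValuePair<string, object?>("result", "success"));
processed.Add(1, new KeyValuePair<string, object?>("result", "success"));
processed.Add(1, new KeyValuePair<string, object?>("result", "failure"));

long successTotal = totals["success"];
long failureTotal = totals["failure"];
Console.WriteLine($"success={successTotal}, failure={failureTotal}"); // success=2, failure=1
```

Each distinct attribute value becomes its own time series in the backend, which is why attribute values should come from a small, known set.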

Recording Histograms for Percentiles

Histograms capture a distribution of values, allowing backends to compute percentiles:

using System.Data;
using System.Diagnostics;
using System.Diagnostics.Metrics;
using System.Linq;
using Dapper; // assumption: QueryAsync below is Dapper's IDbConnection extension

public class InventoryService
{
    // Assumed connection type; the original snippet does not show how _database is declared.
    private readonly IDbConnection _database;
    private readonly Histogram<double> _lookupLatency;
    private readonly Counter<long> _lookups;

    public InventoryService(Meter meter, IDbConnection database)
    {
        _database = database;

        _lookupLatency = meter.CreateHistogram<double>(
            "inventory.lookup.duration",
            unit: "ms",
            description: "Latency of inventory database lookups"
        );

        _lookups = meter.CreateCounter<long>(
            "inventory.lookups",
            unit: "1",
            description: "Total inventory lookups"
        );
    }

    public async Task<int> GetStockAsync(string sku)
    {
        var stopwatch = Stopwatch.StartNew();

        try
        {
            var stock = await _database.QueryAsync<int>(
                "SELECT stock FROM inventory WHERE sku = @sku",
                new { sku }
            );

            // "sku" is only a safe attribute while the catalog is small and bounded.
            var attributes = new TagList { { "sku", sku }, { "status", "ok" } };
            _lookupLatency.Record(stopwatch.Elapsed.TotalMilliseconds, attributes);
            _lookups.Add(1, attributes);

            return stock.First();
        }
        catch (TimeoutException)
        {
            // Same status value on both instruments so they can be correlated.
            var attributes = new TagList { { "sku", sku }, { "status", "timeout" } };
            _lookupLatency.Record(stopwatch.Elapsed.TotalMilliseconds, attributes);
            _lookups.Add(1, attributes);
            throw;
        }
    }
}

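The SDK aggregates these observations into bucket counts, and the Prometheus backend estimates percentiles (p50, p90, p99) from those buckets at query time, for example with histogram_quantile. The estimates are only as precise as the bucket boundaries, which matters when you use them for SLO monitoring.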
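To make the percentile step concrete, here is a minimal sketch of how a backend might estimate a quantile from histogram bucket counts using linear interpolation within a bucket. It is similar in spirit to what Prometheus does, not a copy of its implementation; the function name EstimateQuantile and the sample boundaries and counts are invented for illustration.

```csharp
using System;

// Estimate quantile q (0 < q <= 1) from explicit-bucket histogram data.
// bounds[i] is the upper bound of bucket i; counts has one extra entry at
// the end for the overflow bucket, which has no upper bound.
static double EstimateQuantile(double q, double[] bounds, long[] counts)
{
    if (q <= 0 || q > 1) throw new ArgumentOutOfRangeException(nameof(q));
    if (counts.Length != bounds.Length + 1)
        throw new ArgumentException("counts must have bounds.Length + 1 entries");

    long total = 0;
    foreach (long c in counts) total += c;
    if (total == 0) return double.NaN;

    double rank = q * total; // how many observations lie at or below the quantile
    long cumulative = 0;
    for (int i = 0; i < counts.Length; i++)
    {
        long previous = cumulative;
        cumulative += counts[i];
        if (cumulative < rank) continue;

        // Overflow bucket: no upper bound, so report the highest finite bound.
        if (i == bounds.Length) return bounds[^1];

        double lower = i == 0 ? 0 : bounds[i - 1];
        double upper = bounds[i];
        // Assume observations are spread evenly inside the bucket.
        return lower + (upper - lower) * (rank - previous) / counts[i];
    }
    return bounds[^1];
}

// Hypothetical data: 100 observations in buckets (0,10], (10,100], (100,1000], (1000,+inf).
double[] sampleBounds = { 10, 100, 1000 };
long[] sampleCounts = { 20, 60, 15, 5 };

Console.WriteLine(EstimateQuantile(0.50, sampleBounds, sampleCounts)); // 55
Console.WriteLine(EstimateQuantile(0.99, sampleBounds, sampleCounts)); // 1000
```

The p50 lands mid-way through the (10, 100] bucket because half of that bucket's 60 observations are needed to reach rank 50. The p99 falls in the overflow bucket, so the estimate is capped at the highest finite boundary, which is one reason to choose boundaries that bracket your latency targets.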

Recording Gauges for Current State

A gauge records a single value at a point in time—useful for measurements that go up and down:

using System.Collections.Concurrent;
using System.Diagnostics.Metrics;

public class CacheService
{
    private readonly ObservableGauge<int> _cacheSize;
    private readonly Meter _meter;
    // The gauge callback runs on the collection thread, so use a thread-safe dictionary.
    private readonly ConcurrentDictionary<string, object> _cache;

    public CacheService(Meter meter)
    {
        _meter = meter;
        _cache = new();

        // Observable gauge: the callback is invoked on each collection cycle
        _cacheSize = _meter.CreateObservableGauge<int>(
            "cache.size",
            () => new Measurement<int>(_cache.Count),
            unit: "entries",
            description: "Number of cached entries"
        );
    }

    public void Set(string key, object value)
    {
        _cache[key] = value;
        // No explicit recording: the gauge reports _cache.Count at the next collection
    }
}

Observable gauges are called periodically by the meter, capturing the current state. This is more efficient than recording a gauge on every change.
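The pull-style behavior can be seen without any exporter. The following sketch, which assumes a hypothetical meter named "Demo.Cache", uses MeterListener.RecordObservableInstruments to trigger a collection by hand and shows that the callback does not run when the cache changes, only when a collection happens.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics.Metrics;

// Hypothetical meter name, used only for this sketch.
using var meter = new Meter("Demo.Cache");
var cache = new ConcurrentDictionary<string, object>();

// The callback is stored by the instrument; nothing is measured yet.
meter.CreateObservableGauge<int>("cache.size", () => cache.Count, unit: "entries");

int lastObserved = -1; // sentinel: no collection has happened
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Demo.Cache")
        l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<int>(
    (instrument, value, tags, state) => lastObserved = value);
listener.Start();

cache["a"] = 1;
cache["b"] = 2;
int beforeCollection = lastObserved; // still -1: writes do not invoke the callback

// Simulate one collection cycle, as a metric reader would on export or scrape.
listener.RecordObservableInstruments();

Console.WriteLine($"before={beforeCollection}, after={lastObserved}"); // before=-1, after=2
```

Because the callback runs on whichever thread performs the collection, the state it reads should be safe to access concurrently, which is why this sketch uses ConcurrentDictionary.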

Viewing Metrics with Console Exporter

For development, print metrics to console:

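var meterProvider = Sdk.CreateMeterProviderBuilder()
    .ConfigureResource(r => r.AddService("order-service"))
    .AddMeter("OrderService")
    // Temporality is a metric reader setting, so use the two-argument overload.
    .AddConsoleExporter((exporterOptions, readerOptions) =>
    {
        readerOptions.TemporalityPreference = MetricReaderTemporalityPreference.Delta;
    })
    .Build();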

Running your application prints output along these lines (simplified from the exporter's actual format):

orders.processed: 42 (unit: 1)
Attributes: result=success (42)

orders.errors: 2 (unit: 1)
Attributes: error_type=TimeoutException (1), error_type=ValidationError (1)

order.processing.duration:
Boundaries: [0, 10, 100, 1000, 10000] ms
Counts: [2, 28, 10, 2, 0]

orders.active: 0 (unit: 1)

This view shows each metric's value, its attributes, and (for histograms) the bucket boundaries and per-bucket counts.

Key Takeaways

  • Use counters for cumulative totals (requests, errors, processed items).
  • Use histograms for distributions (latency, request size, batch duration).
  • Use gauges for instantaneous values (active connections, memory, queue depth).
  • Attach attributes (error type, SKU, customer tier) to enable filtering and slicing in dashboards, and keep their value sets bounded to avoid high cardinality.
  • The Prometheus exporter makes metrics queryable via tools like Grafana (covered in article 8).

Frequently Asked Questions

What is the overhead of recording metrics?

Very low. A single counter increment is roughly 1–2 microseconds. Observable gauges incur overhead only when they are collected, typically once per export or scrape interval (usually every 10–60 seconds).

Should I record both counters and histograms for the same operation?

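For latency, it is optional: a histogram already exports a count and a sum alongside its buckets, so a separate counter is needed only if you want to count requests with different attributes. For throughput, a counter alone suffices.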

What is the difference between a counter and an UpDownCounter?

Counters only increase; UpDownCounters increase and decrease. Use counters for totals that never go down, and UpDownCounters for values that rise and fall and should be summed across instances, such as queue depth.

How often are metrics exported?

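It depends on the exporter. Push-based exporters use a periodic reader, which defaults to 60 seconds for OTLP; adjust it via the PeriodicExportingMetricReaderOptions.ExportIntervalMilliseconds property. The Prometheus exporter is pull-based: metrics are collected whenever Prometheus scrapes /metrics, so the interval is set by scrape_interval in the Prometheus configuration.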

Can I create metrics dynamically based on user input?

Avoid it. Define metrics at startup. Metric names or attribute values derived from user input cause a cardinality explosion (potentially millions of unique label combinations) that overwhelms backends. Pre-define all label combinations you need.
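One way to keep cardinality bounded when a value does originate from user input is to map it onto a fixed allow-list before using it as an attribute. The sketch below is illustrative only: the tier names and the NormalizeTier helper are hypothetical, not part of any OpenTelemetry API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical allow-list of attribute values defined at startup.
var knownTiers = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
    "free", "pro", "enterprise"
};

// Map arbitrary input onto the fixed set; anything unknown collapses to "other",
// so the attribute can never have more than four distinct values.
string NormalizeTier(string? raw) =>
    raw is not null && knownTiers.Contains(raw) ? raw.ToLowerInvariant() : "other";

Console.WriteLine(NormalizeTier("Pro"));        // pro
Console.WriteLine(NormalizeTier("user-81723")); // other
```

The normalized value can then be passed as an attribute when recording a measurement, keeping the number of time series independent of how many distinct inputs users send.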

Further Reading