obleth always exposes Prometheus metrics, and adds OpenTelemetry distributed tracing when you point it at an OTLP collector.
The data plane serves metrics on OBLETH_METRICS_LISTEN (default 0.0.0.0:9091,
path /metrics). Notable series:

| Metric | Type | Notes |
|---|---|---|
| obleth_requests_total{admission,status} | counter | requests by admission outcome and status class |
| obleth_tokens_in_total / obleth_tokens_out_total | counter | token throughput |
| obleth_ttft_ms / obleth_total_ms | histogram | time-to-first-token and total latency (ms) |
| obleth_in_flight / obleth_queue_depth | gauge | live concurrency and queue depth |
| obleth_cache_lookups_total{result} | counter | response cache hits/misses |
| obleth_cache_tokens_saved_total | counter | tokens served from cache |
| obleth_mcp_requests_total{server,status} | counter | MCP gateway calls |
| obleth_telemetry_dropped | gauge | usage records dropped under buffer pressure |

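As a quick way to sanity-check these series without a full Prometheus setup, the sketch below parses the Prometheus text exposition format and derives a cache hit ratio from obleth_cache_lookups_total. It is a minimal illustration, not part of obleth: the label values `hit` and `miss` and the sample scrape text are assumptions, so check your own /metrics output for the real label values.

```python
import re
from collections import defaultdict

# Matches one sample line of the Prometheus text exposition format:
#   name{label="value",...} 123.0
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return {metric_name: [(labels_dict, value), ...]} from exposition text."""
    samples = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples[name].append((labels, float(value)))
    return samples


def cache_hit_ratio(samples):
    """Hits divided by all lookups, or None when no lookups were recorded."""
    lookups = samples.get("obleth_cache_lookups_total", [])
    total = sum(value for _, value in lookups)
    # Assumption: the `result` label uses the value "hit" for cache hits.
    hits = sum(value for labels, value in lookups if labels.get("result") == "hit")
    return hits / total if total else None


# Hypothetical scrape output; real label values and numbers will differ.
EXAMPLE = """\
# TYPE obleth_cache_lookups_total counter
obleth_cache_lookups_total{result="hit"} 30
obleth_cache_lookups_total{result="miss"} 90
obleth_in_flight 4
"""

if __name__ == "__main__":
    print(cache_hit_ratio(parse_samples(EXAMPLE)))
```

To run it against a live instance, replace `EXAMPLE` with the body fetched from the metrics listener, for example via `urllib.request.urlopen("http://localhost:9091/metrics")` when obleth runs locally on the default address.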
In Compose, enable the observability profile for a bundled Prometheus
(:9095) and Grafana (:3001). In Kubernetes, set serviceMonitor.enabled=true
if the Prometheus Operator CRDs are installed.
Tracing is fully gated on OBLETH_OTEL_ENDPOINT. When it is unset, there is zero
tracing overhead and obleth logs to stdout as usual. Set it to an OTLP/HTTP
collector base URL and obleth exports spans to {endpoint}/v1/traces.

```
# Docker Compose (.env)
OBLETH_OTEL_ENDPOINT=http://jaeger:4318
```

```yaml
# Helm values
obleth:
  otelEndpoint: http://my-collector:4318
```

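Because the configured value is a base URL and the traces path is appended to it, a small pre-flight check can catch common mistakes before deployment. The sketch below is illustrative only: `traces_url` and `looks_like_grpc_port` are hypothetical helper names, and trimming a trailing slash is an assumption about how the base URL is joined, not documented obleth behavior. The port check relies on the OTLP convention that 4317 is the default gRPC port and 4318 the default HTTP port.

```python
from urllib.parse import urlsplit


def traces_url(endpoint):
    """Build the OTLP/HTTP traces URL from a collector base URL."""
    parts = urlsplit(endpoint)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an http(s) base URL, got {endpoint!r}")
    # Assumption: a trailing slash on the base URL is trimmed before joining.
    return endpoint.rstrip("/") + "/v1/traces"


def looks_like_grpc_port(endpoint):
    """True when the URL uses 4317, the conventional OTLP/gRPC port."""
    return urlsplit(endpoint).port == 4317


if __name__ == "__main__":
    for candidate in ("http://jaeger:4318", "http://my-collector:4317"):
        note = " (4317 is usually gRPC, not HTTP)" if looks_like_grpc_port(candidate) else ""
        print(traces_url(candidate) + note)
```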
The Compose observability profile ships Jaeger all-in-one (OTLP enabled) with
its UI on http://localhost:16686.
Each request produces a proxy_request span with child spans for the pipeline
phases:

- auth_resolve: API key resolution (moka → Redis)
- cache_lookup: response cache check
- reserve_budget: atomic token-budget reservation
- upstream_request: the call to the inference backend

MCP gateway calls produce an mcp_request span tagged with the server name.
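To make the parent/child relationship concrete, here is a toy in-memory tracer that produces the same tree shape as the spans listed above. It is a sketch for reading traces, not obleth's implementation: `ToyTracer` and `handle_request` are hypothetical names, the phases are assumed to run sequentially in the listed order, and whether a phase is skipped (for example after a cache hit) is not modeled.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field

# Phase names as listed for the proxy_request span.
PHASES = ("auth_resolve", "cache_lookup", "reserve_budget", "upstream_request")


@dataclass
class Span:
    name: str
    children: list = field(default_factory=list)
    duration_ms: float = 0.0


class ToyTracer:
    """Keeps a stack of open spans so new spans nest under the current one."""

    def __init__(self):
        self.roots = []
        self._stack = []

    @contextmanager
    def span(self, name):
        node = Span(name)
        # Attach to the innermost open span, or record as a root span.
        siblings = self._stack[-1].children if self._stack else self.roots
        siblings.append(node)
        self._stack.append(node)
        start = time.perf_counter()
        try:
            yield node
        finally:
            node.duration_ms = (time.perf_counter() - start) * 1000.0
            self._stack.pop()


def handle_request(tracer):
    """Open one root span and one child span per pipeline phase."""
    with tracer.span("proxy_request"):
        for phase in PHASES:
            with tracer.span(phase):
                pass  # the real work of each phase would happen here


if __name__ == "__main__":
    tracer = ToyTracer()
    handle_request(tracer)
    root = tracer.roots[0]
    print(root.name, [child.name for child in root.children])
```

In a trace viewer such as Jaeger, the equivalent view is a proxy_request row with the phase spans indented beneath it, which makes it easy to see which phase dominates total latency.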
Point OBLETH_OTEL_ENDPOINT at any OTLP-compatible backend (Jaeger, Tempo, an
OpenTelemetry Collector, or a vendor endpoint).