Observability

obleth always exposes Prometheus metrics, and it exports OpenTelemetry distributed traces when you point it at an OTLP collector.

Prometheus metrics

The data plane serves metrics on OBLETH_METRICS_LISTEN (default 0.0.0.0:9091, path /metrics). Notable series:

| Metric | Type | Notes |
| --- | --- | --- |
| obleth_requests_total{admission,status} | counter | requests by admission outcome + status class |
| obleth_tokens_in_total / obleth_tokens_out_total | counter | token throughput |
| obleth_ttft_ms / obleth_total_ms | histogram | time-to-first-token + total latency |
| obleth_in_flight / obleth_queue_depth | gauge | live concurrency + queue |
| obleth_cache_lookups_total{result} | counter | response cache hits/misses |
| obleth_cache_tokens_saved_total | counter | tokens served from cache |
| obleth_mcp_requests_total{server,status} | counter | MCP gateway calls |
| obleth_telemetry_dropped | gauge | usage records dropped under buffer pressure |
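
To sanity-check the endpoint without a full Prometheus setup, the series above can be read directly from the text exposition format. The following is a minimal sketch, not part of obleth: the helper names (parse_metrics, cache_hit_ratio, scrape) are hypothetical, the URL assumes the default listen address reached from the same host, and the label value result="hit" is an assumption about how obleth labels cache hits.

```python
import re
from urllib.request import urlopen

# One sample line: metric name, optional {labels}, then the value.
# A trailing timestamp, if present, is ignored.
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition into {(name, frozenset(labels)): value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE comments
            continue
        match = _SAMPLE.match(line)
        if not match:
            continue
        name, raw_labels, value = match.groups()
        labels = frozenset(_LABEL.findall(raw_labels or ""))
        samples[(name, labels)] = float(value)
    return samples


def cache_hit_ratio(samples, hit_label="hit"):
    """Fraction of cache lookups that were hits, or None if there were no lookups.

    hit_label is an assumed value of the `result` label.
    """
    hits = total = 0.0
    for (name, labels), value in samples.items():
        if name != "obleth_cache_lookups_total":
            continue
        total += value
        if ("result", hit_label) in labels:
            hits += value
    return hits / total if total else None


def scrape(url="http://localhost:9091/metrics"):
    """Fetch and parse the metrics endpoint (URL assumes the default listen address)."""
    with urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Calling cache_hit_ratio(scrape()) gives a point-in-time ratio since process start. For rates over a time window, a Prometheus query over the same counters is the better tool.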

In Compose, enable the observability profile for a bundled Prometheus (:9095) and Grafana (:3001). In Kubernetes, set serviceMonitor.enabled=true if the Prometheus Operator CRDs are installed.

OTLP tracing

Tracing is fully gated on OBLETH_OTEL_ENDPOINT. When it is unset, there is zero tracing overhead and obleth logs to stdout as usual. Set it to the base URL of an OTLP/HTTP collector and obleth exports spans to {endpoint}/v1/traces.

# Docker Compose (.env)
OBLETH_OTEL_ENDPOINT=http://jaeger:4318

# Helm values
obleth:
  otelEndpoint: http://my-collector:4318

The Compose observability profile ships Jaeger all-in-one (OTLP enabled) with its UI on http://localhost:16686.
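
The gating rule and the {endpoint}/v1/traces convention described above can be sketched as a small helper, which is handy when checking which URL a given configuration will hit. This is an illustration only: the function name traces_url is hypothetical, and treating an empty value as unset and trimming a trailing slash are assumptions, not documented obleth behavior.

```python
import os


def traces_url(environ=os.environ):
    """Return the OTLP/HTTP traces URL, or None when tracing is disabled.

    Assumptions (not confirmed by obleth): an empty value counts as unset,
    and a trailing slash on the base URL is trimmed before appending the path.
    """
    endpoint = environ.get("OBLETH_OTEL_ENDPOINT", "").strip()
    if not endpoint:
        return None  # tracing is gated off
    return endpoint.rstrip("/") + "/v1/traces"
```

With the Compose value shown earlier, traces_url({"OBLETH_OTEL_ENDPOINT": "http://jaeger:4318"}) yields http://jaeger:4318/v1/traces.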

Spans

Each request produces a proxy_request span with child spans for the pipeline phases:

  • auth_resolve — API key resolution (moka → Redis)
  • cache_lookup — response cache check
  • reserve_budget — atomic token-budget reservation
  • upstream_request — the call to the inference backend

MCP gateway calls produce an mcp_request span tagged with the server name.
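
Once traces are in a backend, the span layout above makes it straightforward to see where a request spent its time. The sketch below assumes you have already pulled one trace into simple (name, parent_name, duration_ms) tuples. That tuple shape and the phase_breakdown helper are hypothetical and are not an obleth or OpenTelemetry API. Only the span names come from the list above.

```python
PHASES = ("auth_resolve", "cache_lookup", "reserve_budget", "upstream_request")


def phase_breakdown(spans):
    """Summarize one trace as {phase: duration_ms}, plus time not covered by a phase.

    spans: iterable of (name, parent_name, duration_ms) tuples, a hypothetical
    flattened form of a single proxy_request trace.
    """
    root_ms = 0.0
    totals = {phase: 0.0 for phase in PHASES}
    for name, parent, duration_ms in spans:
        if name == "proxy_request" and parent is None:
            root_ms = duration_ms
        elif parent == "proxy_request" and name in totals:
            totals[name] += duration_ms
    # Whatever the root span covers beyond its known children.
    totals["other"] = root_ms - sum(totals[p] for p in PHASES)
    return totals
```

A trace served from the response cache may have no upstream_request child at all, in which case that phase simply reports 0.0. Whether obleth skips the span in that case is not stated in this section.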

Notes

  • obleth uses the OTLP/HTTP (protobuf) exporter with a background batch processor, so trace export never blocks the request hot path.
  • The service name is reported as obleth.
  • Point OBLETH_OTEL_ENDPOINT at any OTLP-compatible backend (Jaeger, Tempo, an OpenTelemetry Collector, or a vendor endpoint).
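
To illustrate why a background batch processor keeps export off the request path, here is a toy version of the general pattern in Python. It is not obleth's implementation: the BatchExporter class, its parameters, and their default values are all illustrative. The request path only does a non-blocking enqueue, and a worker thread drains the queue and calls the exporter.

```python
import queue
import threading


class BatchExporter:
    """Toy batch processor: callers enqueue without blocking, a worker exports."""

    def __init__(self, export, max_queue=2048, max_batch=512, interval_s=5.0):
        self._export = export              # callable taking a list of spans
        self._queue = queue.Queue(maxsize=max_queue)
        self._max_batch = max_batch
        self._interval_s = interval_s
        self._stop = threading.Event()
        self.dropped = 0                   # spans discarded because the buffer was full
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, span):
        """Request-path call: never blocks, drops the span if the buffer is full."""
        try:
            self._queue.put_nowait(span)
        except queue.Full:
            self.dropped += 1

    def _drain(self):
        """Take up to max_batch spans off the queue without waiting."""
        batch = []
        while len(batch) < self._max_batch:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return batch

    def _run(self):
        # Wake every interval_s seconds until shutdown is requested.
        while not self._stop.wait(self._interval_s):
            batch = self._drain()
            if batch:
                self._export(batch)

    def shutdown(self):
        """Stop the worker, then flush whatever is still queued."""
        self._stop.set()
        self._worker.join()
        while True:
            batch = self._drain()
            if not batch:
                break
            self._export(batch)
```

The trade-off in this pattern is that a slow or unreachable collector costs dropped spans rather than request latency.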