ClickHouse Usage Schema

The usage table schema, useful queries for billing and analytics, and data retention configuration.

ClickHouse stores the append-only usage ledger. It is write-only from obleth's perspective — the gateway inserts rows and never updates or deletes them.

usage table

CREATE TABLE obleth.usage (
    request_id       UUID,
    tenant_id        UUID,
    key_id           UUID,
    model            String,
    admission        LowCardinality(String), -- 'fast' | 'queued' | 'brownout' | 'rejected'
    weight           Int64,        -- tenant weight at time of request
    input_tokens     UInt32,
    output_tokens    UInt32,
    estimated_tokens UInt32,       -- pre-request estimate used for budgeting
    queue_wait_ms    UInt32,
    ttft_ms          UInt32,       -- time to first token
    total_ms         UInt32,       -- total request duration
    status_code      UInt16,
    cache_status     LowCardinality(String) DEFAULT 'off', -- 'hit' | 'miss' | 'off'
    ts_ms            Int64,        -- Unix timestamp milliseconds
    ts               DateTime64(3) MATERIALIZED fromUnixTimestamp64Milli(ts_ms)
) ENGINE = MergeTree()
  PARTITION BY toYYYYMMDD(ts)
  ORDER BY (tenant_id, ts_ms);

Useful queries

Token usage by tenant (last 24h)

SELECT
    tenant_id,
    sum(input_tokens)  AS input_tokens,
    sum(output_tokens) AS output_tokens,
    count()            AS requests
FROM obleth.usage
WHERE ts_ms >= (now() - INTERVAL 24 HOUR) * 1000
GROUP BY tenant_id
ORDER BY output_tokens DESC;

Cache hit rate by model

SELECT
    model,
    countIf(cache_status = 'hit')  AS hits,
    countIf(cache_status = 'miss') AS misses,
    round(countIf(cache_status = 'hit') / count() * 100, 1) AS hit_pct
FROM obleth.usage
WHERE ts_ms >= (now() - INTERVAL 24 HOUR) * 1000
GROUP BY model
ORDER BY hits DESC;

Admission class distribution

SELECT
    admission,
    count() AS requests,
    round(count() / sum(count()) OVER () * 100, 1) AS pct
FROM obleth.usage
WHERE ts_ms >= (now() - INTERVAL 1 HOUR) * 1000
GROUP BY admission;

P50/P95/P99 latency

SELECT
    quantile(0.50)(ttft_ms) AS p50_ttft_ms,
    quantile(0.95)(ttft_ms) AS p95_ttft_ms,
    quantile(0.99)(ttft_ms) AS p99_ttft_ms,
    quantile(0.50)(total_ms) AS p50_total_ms,
    quantile(0.95)(total_ms) AS p95_total_ms
FROM obleth.usage
WHERE ts_ms >= (now() - INTERVAL 1 HOUR) * 1000;

Cost by tenant (with per-model pricing)

The Management API /api/v1/costs endpoint does this join for you. For a raw query:

SELECT
    u.tenant_id,
    sum(u.input_tokens * m.input_cost_per_token
      + u.output_tokens * m.output_cost_per_token) AS total_cost_usd
FROM obleth.usage u
-- join against your Postgres models table via ClickHouse dictionary or external query
WHERE u.ts_ms >= (now() - INTERVAL 30 DAY) * 1000
GROUP BY u.tenant_id;

Data retention

Add a TTL to automatically remove old rows:

ALTER TABLE obleth.usage
MODIFY TTL toDateTime(ts_ms / 1000) + INTERVAL 90 DAY;

WAL and replay

When ClickHouse is unavailable after startup, obleth spills rows to a WAL file (OBLETH_WAL_PATH). On reconnect, the WAL is replayed in order. Rows in the WAL have the same structure as usage.