Run one end-to-end benchmark that seeds tenants, applies load, samples fairshare state, and verifies the usage ledger.
The benchmark harness is a reproducible proof of obleth's fairshare claim: a low-weight workload can flood the gateway first, a boosted workload can join later, and both tenants still make progress under contention.
Prerequisites: Node.js 18+ and the Docker Compose stack running.

    cd obleth-gateway
    node bench/run-benchmark.mjs

The command:
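1. Registers mock-model for local mock runs, or verifies that your real MODEL is already registered.
2. Seeds two tenants: chatbot with weight 500 and api-batch with weight 50.
3. Issues sk_* tenant keys.
4. Sets the live capacity through PUT /api/v1/capacity.
5. Applies staggered load: api-batch starts first, then chatbot joins.
6. Samples fairshare state from GET /api/v1/fairshare/live.

Generated artifacts are written outside the repo by default:
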
| File | Default path |
|---|---|
| Tenant keys | /tmp/obleth-bench/keys.json |
| Run summary | /tmp/obleth-bench/run-meta.json |
| Fairshare samples | /tmp/obleth-bench/fairshare-samples.jsonl |
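
The samples file uses the .jsonl extension, which conventionally means one JSON object per line. A minimal sketch of reading such content follows. The field names in the example text (t, inflight) are hypothetical placeholders, not the harness's actual schema, so inspect a real file before relying on any key.

```javascript
// Parse JSON Lines text: one JSON value per non-empty line.
function parseJsonl(text) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Hypothetical sample content; the real field names may differ.
const exampleText = '{"t":0,"inflight":3}\n{"t":1,"inflight":8}\n';
const samples = parseJsonl(exampleText);
console.log(samples.length); // 2
```

To use it on a real run, pass in the file contents, for example from fs.readFileSync("/tmp/obleth-bench/fairshare-samples.jsonl", "utf8").
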
Register the model in the control plane first, then set MODEL to the registered model name:

    CAPACITY=16 \
    DURATION_S=120 \
    OUTPUT_TOKENS=150 \
    MODEL=gemma4-31b-it \
    CONC=64 \
    PROXY_BASE=http://localhost \
    node bench/run-benchmark.mjs

For a short smoke run:

    CAPACITY=8 DURATION_S=30 OUTPUT_TOKENS=32 MODEL=gemma4-31b-it node bench/run-benchmark.mjs

| Variable | Default | Effect |
|---|---|---|
| ADMIN_BASE | http://localhost:9090 | Management API base URL |
| ADMIN_TOKEN | dev-admin-token | Management API bearer token |
| PROXY_BASE | http://localhost | Data-plane base URL |
| MODEL | mock-model | Registered model name |
| CAPACITY | 8 | Live global in-flight limit |
| DURATION_S | 60 | Overlap duration after all tenants have joined |
| STAGGER_CHATBOT_S | 10 | Seconds api-batch floods before chatbot joins |
| CONC | 32 | Worker count per active tenant |
| OUTPUT_TOKENS | 150 | max_tokens per request |
| MIN_COMPLETION_RATIO | 2 | Minimum chatbot/api-batch overlap completion ratio |
| MAX_ERROR_RATE | 0.05 | Maximum client error rate per tenant |
| LEDGER_TOLERANCE | 0.2 | Allowed ClickHouse/client completion delta |
| REQUIRE_SATURATION | 1 | Set 0 for light smoke runs |
| CHAOS | unset | Set 1 to pause ClickHouse and Redis mid-run |
| CONTAINER_CLI | docker | Use podman for Podman Compose |
| BENCH_OUT_DIR | /tmp/obleth-bench | Output directory |
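
As an illustration of how these variables and defaults fit together, here is a small sketch that resolves them from an environment object. It is not the harness's own code: readBenchConfig, numberFrom, and the camelCase result keys are hypothetical names, and only the variable names and default values come from the table.

```javascript
// Defaults copied from the variable table. CHAOS has no default (unset).
const DEFAULTS = {
  ADMIN_BASE: "http://localhost:9090",
  ADMIN_TOKEN: "dev-admin-token",
  PROXY_BASE: "http://localhost",
  MODEL: "mock-model",
  CAPACITY: "8",
  DURATION_S: "60",
  STAGGER_CHATBOT_S: "10",
  CONC: "32",
  OUTPUT_TOKENS: "150",
  MIN_COMPLETION_RATIO: "2",
  MAX_ERROR_RATE: "0.05",
  LEDGER_TOLERANCE: "0.2",
  REQUIRE_SATURATION: "1",
  CONTAINER_CLI: "docker",
  BENCH_OUT_DIR: "/tmp/obleth-bench",
};

// Convert a string setting to a finite number or fail loudly.
function numberFrom(name, value) {
  const n = Number(value);
  if (!Number.isFinite(n)) {
    throw new Error(`${name} must be numeric, got "${value}"`);
  }
  return n;
}

// Hypothetical helper: merge an environment object over the defaults.
function readBenchConfig(env = process.env) {
  const get = (name) =>
    env[name] !== undefined && env[name] !== "" ? env[name] : DEFAULTS[name];
  const num = (name) => numberFrom(name, get(name));
  return {
    adminBase: get("ADMIN_BASE"),
    adminToken: get("ADMIN_TOKEN"),
    proxyBase: get("PROXY_BASE"),
    model: get("MODEL"),
    capacity: num("CAPACITY"),
    durationS: num("DURATION_S"),
    staggerChatbotS: num("STAGGER_CHATBOT_S"),
    conc: num("CONC"),
    outputTokens: num("OUTPUT_TOKENS"),
    minCompletionRatio: num("MIN_COMPLETION_RATIO"),
    maxErrorRate: num("MAX_ERROR_RATE"),
    ledgerTolerance: num("LEDGER_TOLERANCE"),
    requireSaturation: get("REQUIRE_SATURATION") !== "0",
    chaos: env.CHAOS === "1",
    containerCli: get("CONTAINER_CLI"),
    outDir: get("BENCH_OUT_DIR"),
  };
}

const cfg = readBenchConfig({ CAPACITY: "16", DURATION_S: "120" });
console.log(cfg.capacity, cfg.durationS, cfg.conc); // 16 120 32
```
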
PASS means:
- api-batch was not starved;
- chatbot completed materially more overlap work than api-batch (at least MIN_COMPLETION_RATIO times as much, 2 by default).

The old split scripts were easy to misuse because setup, load, ledger checks, and chaos were separate. The single run-benchmark.mjs script keeps those pieces in one path, so the benchmark is harder to accidentally bend into a misleading result.
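
The PASS criteria, together with the MIN_COMPLETION_RATIO, MAX_ERROR_RATE, and LEDGER_TOLERANCE thresholds, can be pictured as a single verdict function. The sketch below is an assumption about how such checks might combine, not the harness's actual logic: evaluateRun, the stats shape (completed, errors, ledgerCompleted), and the relative-delta formula for the ledger check are all hypothetical.

```javascript
// Hypothetical verdict over per-tenant overlap counters.
// stats: { chatbot: {completed, errors, ledgerCompleted}, "api-batch": {...} }
function evaluateRun(stats, thresholds = {}) {
  const {
    minCompletionRatio = 2, // MIN_COMPLETION_RATIO default
    maxErrorRate = 0.05,    // MAX_ERROR_RATE default
    ledgerTolerance = 0.2,  // LEDGER_TOLERANCE default
  } = thresholds;
  const failures = [];
  const chatbot = stats.chatbot;
  const batch = stats["api-batch"];

  // The low-weight tenant must still make progress.
  if (batch.completed <= 0) {
    failures.push("api-batch was starved");
  } else if (chatbot.completed / batch.completed < minCompletionRatio) {
    failures.push("chatbot/api-batch completion ratio below minimum");
  }

  for (const [name, t] of Object.entries(stats)) {
    // Client-side error rate per tenant.
    const total = t.completed + t.errors;
    const errorRate = total === 0 ? 0 : t.errors / total;
    if (errorRate > maxErrorRate) {
      failures.push(`${name} error rate too high`);
    }
    // Assumed ledger check: relative gap between ledger and client counts.
    let delta = 0;
    if (t.completed > 0) {
      delta = Math.abs(t.ledgerCompleted - t.completed) / t.completed;
    } else if (t.ledgerCompleted > 0) {
      delta = Infinity;
    }
    if (delta > ledgerTolerance) {
      failures.push(`${name} ledger delta too large`);
    }
  }
  return { pass: failures.length === 0, failures };
}

// Made-up counters for illustration only.
const verdict = evaluateRun({
  chatbot: { completed: 900, errors: 10, ledgerCompleted: 880 },
  "api-batch": { completed: 300, errors: 5, ledgerCompleted: 310 },
});
console.log(verdict.pass); // true
```

Returning the list of failed checks rather than a bare boolean keeps a FAIL result explainable, which matters when the point of the benchmark is to avoid misleading conclusions.
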