Source-available · Rust data plane · OpenAI-compatible

The fairshare layer for multi-tenant AI.

obleth sits between your edge load balancer and your inference backends. Under contention, it admits each tenant's requests fairly before they reach vLLM, AIBrix, or another OpenAI-compatible upstream.

Section 01 // Admission

Fairness before routing

obleth puts a tenant-aware admission decision in front of model routing. The shared pool stays fair when demand spikes, and the backend only sees work that has already earned a slot.

01

Caller

Resolve the tenant

The incoming key becomes tenant context before the request can reach a model backend.

02

Admission

Choose the next slot

Under contention, obleth selects work by fairshare so one workload cannot run away with the pool.

03

Backend

Stream with context

The admitted request streams upstream while the same tenant context follows the result.

The practical effect: production traffic keeps headroom, batch work still advances, and a noisy tenant no longer dominates the pool just because its requests arrived first.


Section 02 // Core architecture

Hot-path responsibilities

obleth keeps policy decisions in front of the inference layer: identify the caller, decide admission, route the request, and record the outcome.

admission_control

Fairshare

A single scheduler bounds global in-flight work and grants queued requests by fairshare policy when the cluster is saturated.
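To make the idea concrete, here is a minimal sketch of a scheduler that caps global in-flight work and, once the cap is reached, grants each freed slot to the waiting tenant with the lowest in-flight-to-weight ratio. It is an illustration only, not obleth's implementation: the class names, the `max_in_flight` parameter, and the ratio-based selection rule are all assumptions.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TenantQueue:
    weight: float                      # hypothetical per-tenant share weight
    in_flight: int = 0                 # requests currently running upstream
    waiting: deque = field(default_factory=deque)


class FairshareScheduler:
    """Illustrative only: one global cap, weighted-share grants when saturated."""

    def __init__(self, max_in_flight: int):
        self.max_in_flight = max_in_flight
        self.in_flight = 0
        self.tenants = {}

    def add_tenant(self, name: str, weight: float) -> None:
        self.tenants[name] = TenantQueue(weight=weight)

    def submit(self, tenant: str, request_id: str) -> bool:
        """Return True if admitted immediately, False if the request was queued."""
        q = self.tenants[tenant]
        nobody_waiting = not any(t.waiting for t in self.tenants.values())
        if self.in_flight < self.max_in_flight and nobody_waiting:
            self._admit(q)
            return True
        q.waiting.append(request_id)
        return False

    def release(self, tenant: str):
        """Finish one request, then hand the freed slot to the next fair pick."""
        self.tenants[tenant].in_flight -= 1
        self.in_flight -= 1
        return self._grant_next()

    def _admit(self, q: TenantQueue) -> None:
        q.in_flight += 1
        self.in_flight += 1

    def _grant_next(self):
        if self.in_flight >= self.max_in_flight:
            return None
        # Assumed policy: the waiting tenant using the least of its share goes next.
        candidates = [
            (q.in_flight / q.weight, name)
            for name, q in self.tenants.items()
            if q.waiting
        ]
        if not candidates:
            return None
        _, name = min(candidates)
        q = self.tenants[name]
        request_id = q.waiting.popleft()
        self._admit(q)
        return name, request_id
```

With a cap of two and equal weights, a tenant that fills both slots and queues a third request still yields the next free slot to another tenant's later arrival, because that tenant holds none of the pool at the moment the slot opens.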

tenant_identity

Policy context

Every API key resolves to tenant weight, quota, fairshare group, and disabled state before upstream work begins.
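As a rough sketch of that lookup, the snippet below maps an API key to a record carrying the four fields named above and rejects unknown or disabled tenants before any upstream work would start. The type names, the exceptions, the in-memory dict, and the unit chosen for `quota` are hypothetical stand-ins; obleth's actual storage and schema may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantContext:
    tenant: str
    weight: float          # share weight used by admission
    quota: int             # hypothetical unit: requests allowed per window
    fairshare_group: str
    disabled: bool


class UnknownKey(Exception):
    """Raised when the API key does not map to any tenant."""


class TenantDisabled(Exception):
    """Raised when the key is valid but the tenant is switched off."""


def resolve_tenant(api_key: str, key_table: dict) -> TenantContext:
    """Resolve a key to tenant context, failing before upstream work begins."""
    ctx = key_table.get(api_key)
    if ctx is None:
        raise UnknownKey("no tenant for this key")
    if ctx.disabled:
        raise TenantDisabled(ctx.tenant)
    return ctx


# Example table; the keys and values here are made up for illustration.
KEY_TABLE = {
    "key-prod": TenantContext("prod", 4.0, 1000, "interactive", False),
    "key-old": TenantContext("legacy", 1.0, 100, "batch", True),
}
```

Resolving everything in one step means the admission and routing stages can take a `TenantContext` as input and never need to see the raw key.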

streaming_proxy

Routing

OpenAI-compatible requests stream through to model backends, with optional exact-match response caching and MCP routes.
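One way to picture exact-match caching: hash a canonical form of the request body and return a stored response only when the hash matches exactly. The sketch below is an assumption about how such a cache could work, not a description of obleth's; in particular, scoping the key by tenant and canonicalizing with sorted JSON keys are choices made here for illustration.

```python
import hashlib
import json


def cache_key(tenant: str, body: dict) -> str:
    """Hash the tenant plus a canonical JSON encoding of the request body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{tenant}\n{canonical}".encode("utf-8")).hexdigest()


class ExactMatchCache:
    """Illustrative in-memory cache; any difference in the body is a miss."""

    def __init__(self):
        self._store = {}

    def get(self, tenant: str, body: dict):
        return self._store.get(cache_key(tenant, body))

    def put(self, tenant: str, body: dict, response: str) -> None:
        self._store[cache_key(tenant, body)] = response
```

Because the key covers the whole body, changing any field, such as `temperature` or a single message, produces a different hash and the request goes upstream as usual.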

Pricing

Run it yourself. Bring us in when it matters.

obleth is source-available with no license fee for your own workloads. Voxel provides optional paid help for production rollout, capacity planning, and platform integration.

Software

$0

Run the gateway for your own workloads. No metered product fee from obleth.

Services

Optional

Architecture review, hardening, load testing, and custom integration support from Voxel.