Software: $0.

obleth is source-available under Elastic License 2.0. Run it for your own workloads without license fees or metered usage gates; your practical limits are your hardware, configured quotas, and deployment choices.

Get started View source

Included at $0

The full gateway

✓Tenants & API keys

✓Fairshare admission

✓Response caching

✓Token budget rate limiting

✓ClickHouse usage ledger

✓Prometheus & OTLP tracing

✓MCP gateway routing

✓Brownout degradation

✓Helm chart & Docker Compose

Optional

Professional services from Voxel

Voxel is a specialized consulting firm focused on high-performance computing and AI infrastructure. We built obleth from years of operating multi-tenant GPU clusters. If you want expert guidance deploying it — we are here.

This is optional. The software is fully documented and designed to be deployed independently. But if you are preparing for a major rollout and want architectural review, capacity planning, or custom integration support — reach out.

Get in touch

Architecture design

Topology review and HA design across Kubernetes or bare-metal.

Capacity modeling

Tenant weights, hardware sizing, and traffic shaping strategies.

Integration planning

Connecting inference engines, databases, and caches to your stack.

Platform telemetry

Usage ledgers, metric scraping, and chargeback accounting.

Load testing

Custom harnesses validating multi-tenant constraints under stress.

Who we work with

Teams running their own hardware

RESEARCH_COMPUTING

Universities & labs

Research computing centers sharing GPU clusters across departments and grant-funded projects. Fairshare ensures equitable distribution so no single runaway job consumes the whole cluster.

AI_STARTUPS

AI teams self-hosting

Teams outgrowing managed API costs that need to run their own infrastructure. obleth adds the multi-tenant layer that raw backends lack—identity, fairshare admission, and usage accounting.

ENTERPRISE_PLATFORM

Internal ML platforms

Platform teams serving multiple business units from a centralized GPU pool. Fairshare weights let production traffic keep priority without fully blocking internal experimentation.

HPC_CLOUD

HPC & cloud providers

Organizations offering inference-as-a-service on colocated hardware. Tenant isolation and asynchronous usage ledgers let you reliably bill per token.

MIGRATION

Moving off managed APIs

Teams migrating from managed APIs to self-hosted models. Existing OpenAI-compatible clients usually need only a base URL and key change.

SELF_SERVE

Deploy it yourself

obleth is source-available and fully documented. If you prefer to move fast independently, our documentation has everything you need—we are just here when you want an expert review.

Quick start

No sales call required.

Clone the repo. Run the Compose stack. You have a production-grade multi-tenant AI gateway in minutes. Paid help is there when a rollout needs architecture review, hardening, or integration work.

Quick start Contact Voxel