Priority Boost

How to change a tenant's fairshare weight live via the Management API or dashboard, and how quickly the change propagates.

Priority boosts are one of obleth's headline features: you can change a tenant's weight at runtime and it takes effect on the next admission decision, without restarting obleth or reloading any config.

What weight controls

Under the weighted fairshare algorithm, the scheduler divides capacity proportionally to tenant weights. A tenant with weight=500 gets roughly 10× the throughput of one with weight=50 under saturation. Weight does not affect:

  • Token budget (TPM) — that's tokens_per_minute
  • Per-tenant concurrency cap — that's max_in_flight
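The proportional split can be checked with a little arithmetic. The sketch below assumes capacity under saturation is divided strictly in proportion to weight and that every listed tenant has work queued; `share` is a hypothetical helper, not part of obleth.

```shell
# share TENANT_WEIGHT [OTHER_WEIGHTS...]
# Prints the percentage of saturated capacity the first tenant would receive
# if capacity is split strictly in proportion to weight (an assumption).
share() {
  awk -v w="$1" 'BEGIN {
    total = 0
    for (i = 1; i < ARGC; i++) total += ARGV[i]   # sum every weight, including the first
    if (total == 0) exit 1                        # avoid dividing by zero
    printf "%.1f\n", 100 * w / total
  }' "$@"
}

share 500 50   # prints 90.9
share 50 500   # prints 9.1
```

The two shares differ by a factor of ten, matching the weight ratio of 500 to 50.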

Boosting via the API

# Double the weight from 100 to 200
curl -X PATCH http://localhost:9090/api/v1/tenants/$TID/weight \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"weight": 200}'

The response is the updated tenant object. The change:

  1. Is written to Postgres (tenants.weight) and logged to audit_log.
  2. Is synced to Redis (the ResolvedKey JSON is updated).
  3. Triggers a pub/sub invalidation on obleth:invalidate to evict the key from every pod's moka cache.

Effective latency: the change is live within milliseconds of the API call returning. Each pod applies the new weight on its first admission decision after the invalidation arrives.
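Because a change takes effect almost immediately, a mistyped weight does too. A minimal sketch of a wrapper that checks the value before sending it is shown here. The function names and the `OBLETH_API` variable are hypothetical, and the rule that a weight must be a positive integer is an assumption based on the examples on this page, not a documented constraint.

```shell
# Base URL of the Management API (hypothetical variable; defaults to the address used on this page).
OBLETH_API=${OBLETH_API:-http://localhost:9090}

# weight_payload WEIGHT
# Prints the JSON body for a weight change, or fails if WEIGHT is not a
# positive integer without leading zeros (assumed constraint).
weight_payload() {
  case $1 in
    ''|*[!0-9]*|0*)
      echo "weight must be a positive integer: '$1'" >&2
      return 1
      ;;
  esac
  printf '{"weight": %s}' "$1"
}

# set_weight TENANT_ID WEIGHT
# Validates the weight, then issues the same PATCH request shown above.
# -f makes curl return non-zero on an HTTP error status.
set_weight() {
  body=$(weight_payload "$2") || return 1
  curl -sf -X PATCH "$OBLETH_API/api/v1/tenants/$1/weight" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "$body"
}
```

With this in place, `set_weight "$TID" 200` sends the request, while `set_weight "$TID" 2oo` is rejected locally and never reaches the API.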

Boosting via the dashboard

Navigate to Tenants → select a tenant → click the weight field → enter the new value → save.

The dashboard calls PATCH /api/v1/tenants/{id}/weight under the hood.

Priority boost scenarios

Incident response

Your production chatbot is being starved of capacity by a batch analytics job that someone started at peak time:

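# Find the batch tenant's ID (-s hides curl's progress meter, -r strips the JSON quotes)
BATCH_ID=$(curl -s http://localhost:9090/api/v1/tenants -H "Authorization: Bearer $TOKEN" \
  | jq -r '.[] | select(.name == "batch-analytics") | .id')

# Find the chatbot tenant's ID the same way, selecting on its name
CHATBOT_ID=...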

# Boost the chatbot to 100x the batch job's throttled weight (1000 vs. 10)
curl -X PATCH http://localhost:9090/api/v1/tenants/$CHATBOT_ID/weight \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"weight": 1000}'

# Throttle the batch job
curl -X PATCH http://localhost:9090/api/v1/tenants/$BATCH_ID/weight \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"weight": 10}'

The scheduler picks this up on the next admission decision. No restart, no config file, no deployment.

Restoring normal weights

curl -X PATCH http://localhost:9090/api/v1/tenants/$CHATBOT_ID/weight \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"weight": 200}'

curl -X PATCH http://localhost:9090/api/v1/tenants/$BATCH_ID/weight \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"weight": 50}'
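The restore commands above only work if you know what the weights were before the incident. One way to avoid guessing is to note each weight in a snapshot file before boosting and replay the file afterwards. This is a sketch under assumptions: `patch_weight`, `record_weight`, and `restore_weights` are hypothetical helpers, and the snapshot format (one "id weight" pair per line) is invented for illustration.

```shell
# patch_weight TENANT_ID WEIGHT
# Thin wrapper around the PATCH request used throughout this page.
patch_weight() {
  curl -sf -X PATCH "http://localhost:9090/api/v1/tenants/$1/weight" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"weight\": $2}"
}

# record_weight SNAPSHOT_FILE TENANT_ID CURRENT_WEIGHT
# Appends one "id weight" line to the snapshot file.
record_weight() {
  printf '%s %s\n' "$2" "$3" >> "$1"
}

# restore_weights SNAPSHOT_FILE
# Replays every recorded weight; stops at the first failed request.
restore_weights() {
  while read -r id weight; do
    patch_weight "$id" "$weight" || return 1
  done < "$1"
}
```

During an incident you would call `record_weight weights.snapshot "$CHATBOT_ID" 200` and `record_weight weights.snapshot "$BATCH_ID" 50` before boosting, then `restore_weights weights.snapshot` once things calm down. The audit log also keeps the old weight of every change, so it can serve as a fallback if no snapshot was taken.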

Audit trail

Every weight change is recorded in audit_log:

curl http://localhost:9090/api/v1/audit \
  -H "Authorization: Bearer $TOKEN" | jq '.[] | select(.action == "update_weight")'

Each entry includes the actor (admin token identifier), the tenant ID, the old weight, the new weight, and a timestamp.

Group weights (hierarchical mode)

Under OBLETH_FAIRSHARE_ALGORITHM=hierarchical, you can also adjust group weights:

# Boost the entire prod group
curl -X PATCH http://localhost:9090/api/v1/fairshare/groups/prod/weight \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"weight": 1000}'

This changes how much of the global capacity pool the prod group receives, affecting all tenants in that group simultaneously.
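To get a feel for the effect of a group boost, the following sketch assumes a two-level proportional split: capacity is first divided among groups by group weight, then within each group by tenant weight. That model, the `hier_share` helper, and the example weights are assumptions for illustration; the exact hierarchical algorithm may differ.

```shell
# hier_share GROUP_WEIGHT TOTAL_GROUP_WEIGHTS TENANT_WEIGHT TOTAL_TENANT_WEIGHTS_IN_GROUP
# Prints the tenant's percentage of global capacity under saturation,
# assuming a two-level proportional split (group first, then tenant).
hier_share() {
  awk -v gw="$1" -v gt="$2" -v tw="$3" -v tt="$4" \
    'BEGIN { printf "%.1f\n", 100 * (gw / gt) * (tw / tt) }'
}

# Two groups of weight 100 each; the tenant holds 200 of its group's 250 weight.
hier_share 100 200 200 250     # prints 40.0

# After boosting prod to 1000 (group total is now 1100), same tenant weights.
hier_share 1000 1100 200 250   # prints 72.7
```

Under this model, the tenant's share rises from 40.0% to 72.7% without any change to its own weight, which is the point of boosting at the group level.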