Backup & Restore

How to back up obleth's three datastores — Postgres, Redis, and ClickHouse — and restore from backup.

obleth's durability approach depends on the role each datastore plays:

| Datastore | Contains | Durability approach |
| --- | --- | --- |
| Postgres | All config, keys, tenants | Full backup + WAL archiving |
| Redis | Hot cache, live token budgets | Persistence optional (rebuildable from Postgres) |
| ClickHouse | Usage ledger | Replication + optional external backup |

Postgres

Postgres is the source of truth for all configuration. Back it up like any production Postgres database.

pg_dump (simple)

pg_dump -h localhost -U obleth -d obleth -F c -f obleth-$(date +%Y%m%d).dump
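Dated dumps accumulate over time, so a scheduled job usually pairs the dump with a retention step. The following is a minimal sketch rather than part of obleth: the helper names `dump_name` and `prune_dumps`, the directory `/var/backups/obleth`, and the 14-day window are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch only: dump_name and prune_dumps are hypothetical helpers,
# not commands shipped with obleth.

# Print the dump filename for a date stamp in YYYYMMDD form.
dump_name() {
  printf 'obleth-%s.dump\n' "$1"
}

# Delete obleth-*.dump files directly inside directory $1 whose
# modification time is more than $2 days old. Other files are untouched.
prune_dumps() {
  find "$1" -maxdepth 1 -type f -name 'obleth-*.dump' -mtime +"$2" -exec rm -f {} +
}

# Example usage (left commented out so sourcing this file has no side effects;
# the directory and retention window are assumed values):
# DUMP_DIR=/var/backups/obleth
# pg_dump -h localhost -U obleth -d obleth -F c -f "$DUMP_DIR/$(dump_name "$(date +%Y%m%d)")"
# prune_dumps "$DUMP_DIR" 14
```

Pruning by file modification time keeps the helper independent of the filename format, at the cost of treating a re-touched old dump as new.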

Continuous WAL archiving

For production, configure WAL archiving with pgBackRest or Barman to get point-in-time recovery (PITR). With CloudNativePG:

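spec:
  backup:
    barmanObjectStore:
      destinationPath: "s3://my-bucket/obleth-pg/"
      s3Credentials:
        # Both entries reference keys inside the pg-backup-creds Secret.
        accessKeyId:
          name: pg-backup-creds
          key: ACCESS_KEY_ID
        # Assumed key name; use whatever key your Secret stores the secret under.
        secretAccessKey:
          name: pg-backup-creds
          key: ACCESS_SECRET_KEY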

Restore

pg_restore -h localhost -U obleth -d obleth obleth-20240101.dump

After restoring Postgres, restart obleth pods to reload the cache from the restored database.
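Before restoring, it can be worth confirming that the dump file is readable. `pg_restore --list` prints an archive's table of contents without connecting to a database, so it works as a cheap integrity check. The sketch below is an illustration, not an obleth tool: the function name `verify_dump` and the `PG_RESTORE` override variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: verify_dump and the PG_RESTORE override are hypothetical names.

# Succeed only if the dump file $1 exists, is non-empty, and its table of
# contents can be listed. No database connection is made.
verify_dump() {
  dump_file=$1
  if [ ! -s "$dump_file" ]; then
    echo "dump missing or empty: $dump_file" >&2
    return 1
  fi
  # --list reads the archive and prints its contents; output is discarded here
  # because only the exit status matters.
  "${PG_RESTORE:-pg_restore}" --list "$dump_file" >/dev/null
}

# Example usage (commented out so sourcing this file has no side effects):
# verify_dump obleth-20240101.dump &&
#   pg_restore -h localhost -U obleth -d obleth obleth-20240101.dump
```

A readable table of contents does not prove every row is intact, but it catches truncated or wrong-format files before they reach the target database.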

Redis

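Redis holds the hot cache and the live token budgets. Cached data can be reconstructed from Postgres: obleth repopulates the cache lazily, on the first request for each key. Live token budgets reset if Redis is lost, so Redis persistence is recommended, although it is not strictly required for data safety.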

Enable Redis persistence

In the Redis configuration file (redis.conf):

appendonly yes
appendfsync everysec

This writes an AOF (append-only file) that can be replayed on restart. For Docker Compose:

redis:
  command: redis-server --appendonly yes --appendfsync everysec
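To confirm that the setting took effect, the `aof_enabled` field reported by `redis-cli INFO persistence` can be checked. The helper below is a small sketch; the function name `aof_enabled` is an assumption made for this example, and it only parses text passed on standard input.

```shell
#!/bin/sh
# Sketch only: aof_enabled is a hypothetical helper name.

# Read `redis-cli INFO persistence` output on stdin and succeed only if
# the server reports aof_enabled:1. Redis ends lines with CRLF, so the
# carriage returns are stripped before matching whole lines.
aof_enabled() {
  tr -d '\r' | grep -qx 'aof_enabled:1'
}

# Example usage (commented out so sourcing this file has no side effects):
# redis-cli INFO persistence | aof_enabled && echo "AOF is on"
```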

Restore

Stop Redis, copy the AOF or RDB files back into the Redis data directory, and start Redis again. obleth then resumes with a warm cache.

If Redis data is lost entirely, obleth falls back to Postgres on the first request for each key and then caches the result. Live token budgets reset, but no operational action is needed: the cache rebuilds itself.

ClickHouse

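ClickHouse holds the usage ledger. The ledger is append-only and does not need to stay in step with real-time traffic: during a ClickHouse outage, each obleth pod keeps in-flight usage records in its own write-ahead log (WAL), which is separate from the Postgres WAL.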

Data retention

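Configure a TTL on the obleth.usage table to drop old data automatically. The ts_ms column holds epoch milliseconds, so it is converted to seconds before the interval is added: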

ALTER TABLE obleth.usage
MODIFY TTL toDateTime(ts_ms / 1000) + INTERVAL 90 DAY;
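The TTL expression means a row becomes eligible for deletion once `ts_ms` falls below the current time minus 90 days, expressed in milliseconds. ClickHouse applies TTLs during background merges, so expired rows may linger for a while. The sketch below works through that arithmetic; the function name `ttl_cutoff_ms` is hypothetical.

```shell
#!/bin/sh
# Sketch only: ttl_cutoff_ms is a hypothetical helper name.

# Print the epoch-millisecond cutoff for a retention window.
#   $1 = current time in epoch seconds
#   $2 = retention in days
# Rows with ts_ms below the printed value are past their TTL.
ttl_cutoff_ms() {
  echo $(( ($1 - $2 * 86400) * 1000 ))
}

# Example usage (commented out so sourcing this file has no side effects):
# ttl_cutoff_ms "$(date +%s)" 90
```

For example, with a current time of 1704067200 (2024-01-01 00:00:00 UTC) and 90 days of retention, the cutoff is 1696291200000.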

Backup

For managed ClickHouse (ClickHouse Cloud, Altinity), use the provider's backup feature. For self-hosted:

BACKUP_NAME="obleth-backup-$(date +%Y%m%d)"
clickhouse-backup create "$BACKUP_NAME"
clickhouse-backup upload "$BACKUP_NAME" --remote-storage=s3

These commands use the clickhouse-backup tool, with S3 as its remote storage.

Restore

clickhouse-backup download obleth-backup-20240101
clickhouse-backup restore obleth-backup-20240101
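Before downloading, it can help to confirm that the named backup actually exists in remote storage. `clickhouse-backup list remote` prints the available remote backups; the sketch below assumes the backup name is the first whitespace-separated column of each line, which should be checked against the output of your installed version. The function name `backup_listed` is hypothetical.

```shell
#!/bin/sh
# Sketch only: backup_listed is a hypothetical helper, and the column layout
# of `clickhouse-backup list remote` is an assumption.

# Read backup listing on stdin and succeed only if a line's first column
# equals the backup name given as $1.
backup_listed() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit found ? 0 : 1 }'
}

# Example usage (commented out so sourcing this file has no side effects):
# clickhouse-backup list remote | backup_listed obleth-backup-20240101 &&
#   clickhouse-backup download obleth-backup-20240101
```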

After a ClickHouse outage without backup

If ClickHouse data is lost and no backup exists, the usage history is gone. Current tenants, keys, and config remain safe in Postgres. Billing and audit data can be reconstructed only partially, by replaying the WAL files that each obleth pod wrote during the outage window.

Disaster recovery summary

| Failure | Impact | Recovery |
| --- | --- | --- |
| Redis lost | Token budgets reset; cache cold | Cache rebuilds from Postgres on the next request; no manual steps |
| Postgres lost without backup | All config, keys, tenants lost | None; restoring from a backup is the only recovery path |
| ClickHouse lost without backup | Usage history lost; billing data lost | Partial reconstruction from pod WAL files |
| Pod lost | In-flight requests fail; that pod's WAL is lost | None; the WAL data in that pod's current batch is unrecoverable |