Scale-to-zero & lifecycle
On an object-storage backend the engine's compute is stateless, so it can idle all the way to nothing — only the bytes in object storage bill at rest. The twill-controller turns that into an elastic service: cold-start on first connection, scale to zero when idle, and survive a burst of simultaneous wakeups.
The idea
Scale-to-zero rests on one invariant: compute is stateless — all durable state lives in object storage behind the Storage seam. A warm instance is just a cache plus a CPU; destroying it loses nothing durable, so the controller is free to stop the engine whenever a database is idle and reconstruct an equivalent instance on the next connection.
Scale-to-zero needs an s3:// backend
Idling compute to nothing requires storage disaggregation: durability must bottom out on object storage so that stopping the engine loses nothing. A pure-embedded file:// app has no disaggregation: its durable state is a local file the process owns, so there is no separate compute tier to scale down. Use s3://, r2://, or gs:// for scale-to-zero.
The lifecycle controller
The twill-controller crate is a thin, stateless supervisor. It composes the engine's existing primitives rather than reimplementing them: opening a Database acquires the writer fence and replays the WAL — that is the cache warm — and dropping it releases the fence. On top of that the controller adds the state machine, an idle-timeout reaper, a lease heartbeat, and thundering-herd handling.
The controller owns no durable state
Every durable byte lives in object storage behind the storage seam. The controller holds only in-memory lifecycle bookkeeping; it is restartable and replaceable, and reconstructs from storage. It is never on the data path — it supervises instances, it does not proxy SQL.
Cold ──start──▶ Warming ──opened──▶ Active ──no work──▶ Idle ──timeout──▶ Stopping ──▶ Cold
Warming ──open fails──▶ Cold
Idle ──new connection──▶ Active
Cold → Warming → Active → Idle → Stopping → Cold. A new connection pulls an Idle instance back to Active; a failed warm returns cleanly to Cold; Stopping releases the fence and lands back in Cold.
| State | Meaning | Leaves to |
|---|---|---|
| Cold | No process. Only object-storage bytes bill at rest. | Warming (on first connection) |
| Warming | Cold-starting: handle open + fence acquire + WAL replay (cache warm). | Active (opened) / Cold (open fails) |
| Active | Serving connections; cache warm; writer lease held. | Idle (no active leases) |
| Idle | Warm but with zero active connections; writer lease still renewed by the heartbeat. | Active (new connection) / Stopping (idle timeout) |
| Stopping | Tearing down: drop the handle so the engine releases the fence. | Cold |
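The table above can be read as a small transition function. The following is a minimal sketch, not the twill-controller's actual types: the `State` variants come from the table, while the `Event` enum and the `next` function are hypothetical names introduced here for illustration.

```rust
/// Lifecycle states, taken from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Cold,
    Warming,
    Active,
    Idle,
    Stopping,
}

/// Events that drive transitions. These names are illustrative assumptions,
/// not the controller's real API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    Connect,     // a connection arrives
    Opened,      // the warm (open + fence + WAL replay) succeeded
    OpenFailed,  // the warm failed
    NoLeases,    // the last active connection went away
    IdleTimeout, // the instance sat Idle past idle_timeout
    Stopped,     // the handle was dropped and the fence released
}

/// Returns the next state, or `None` when the event is not valid in `state`.
pub fn next(state: State, event: Event) -> Option<State> {
    use Event::*;
    use State::*;
    match (state, event) {
        (Cold, Connect) => Some(Warming),
        (Warming, Opened) => Some(Active),
        (Warming, OpenFailed) => Some(Cold),
        (Active, NoLeases) => Some(Idle),
        (Idle, Connect) => Some(Active),
        (Idle, IdleTimeout) => Some(Stopping),
        (Stopping, Stopped) => Some(Cold),
        _ => None,
    }
}
```

For example, `next(State::Idle, Event::Connect)` yields `Some(State::Active)`, while `next(State::Active, Event::IdleTimeout)` yields `None` because an instance has to pass through Idle before it can be stopped.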
Cold start
A cold start is exactly: process start + cache warm + fence acquire + WAL replay. In the controller these collapse into one step — Database::open(url) acquires the single-writer fence and replays the durable WAL, which is what warms the instance. The dominant tail term is cache warm: the first reads after a cold start miss the local cache and fall through to object storage, so a large random working set is the worst case.
Idle reaper
A background reaper runs every reap_interval. For each warm instance with no active leases it moves Active → Idle, and once an instance has sat Idle past idle_timeout (and keep_warm is off) it tears the handle down — Stopping → Cold — releasing the fence. The lifecycle and heartbeat threads live in the controller, deliberately not in the embedded engine core, which stays thread-free so embedders own their own scheduling.
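One reaper pass might look roughly like the sketch below. It assumes a simplified in-memory view of each warm instance; `Instance`, `ReaperConfig`, and `reap_pass` are hypothetical names, and only `idle_timeout` and `keep_warm` come from the text. The caller passes `now` explicitly so the logic stays deterministic, and a real controller would invoke this once per reap_interval.

```rust
use std::time::{Duration, Instant};

/// The two warm states the reaper cares about.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Warm {
    Active,
    Idle,
}

/// Hypothetical in-memory bookkeeping for one warm instance.
pub struct Instance {
    pub state: Warm,
    pub active_leases: usize,
    pub idle_since: Option<Instant>,
}

pub struct ReaperConfig {
    pub idle_timeout: Duration,
    pub keep_warm: bool,
}

/// Runs one pass and returns the indices of instances that should be torn
/// down (Stopping, then Cold). The caller drops their handles.
pub fn reap_pass(instances: &mut [Instance], cfg: &ReaperConfig, now: Instant) -> Vec<usize> {
    let mut to_stop = Vec::new();
    for (i, inst) in instances.iter_mut().enumerate() {
        if inst.active_leases > 0 {
            // Still serving connections: stay (or become) Active.
            inst.state = Warm::Active;
            inst.idle_since = None;
            continue;
        }
        if inst.state == Warm::Active {
            // No active leases: Active -> Idle, and start the idle clock.
            inst.state = Warm::Idle;
            inst.idle_since = Some(now);
            continue;
        }
        // Already Idle: stop it once it has sat past idle_timeout,
        // unless keep_warm pins it in memory.
        let idle_for = inst
            .idle_since
            .map_or(Duration::ZERO, |t| now.saturating_duration_since(t));
        if !cfg.keep_warm && idle_for > cfg.idle_timeout {
            to_stop.push(i);
        }
    }
    to_stop
}
```

Note that in this sketch an instance needs at least two passes to go from Active to Stopping, so the effective idle time before teardown is idle_timeout plus up to one reap_interval.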
Single-writer lease heartbeat
The writer lease is durable and fenced by a monotonic CAS epoch. The reaper heartbeats it for every warm instance (Active or Idle) by calling Database::renew_lease(); if a renewal fails — the instance has been fenced by a newer writer — the controller treats it as fatal, drops the handle, and returns the instance to Cold. Split-brain is impossible by construction: only one CAS epoch wins each append, so a stalled writer's appends are simply rejected.
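To illustrate the fencing idea, here is a minimal in-memory sketch. The real lease is durable and lives in object storage; the `LeaseRecord` type below is a hypothetical stand-in that uses an atomic integer in place of the store's compare-and-swap, and its `acquire` and `renew` methods are assumptions, not the signature of Database::renew_lease().

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Returned when a writer discovers that a newer epoch has taken over.
#[derive(Debug, PartialEq, Eq)]
pub struct Fenced {
    pub held: u64,
    pub current: u64,
}

/// Hypothetical in-memory stand-in for the durable lease record.
pub struct LeaseRecord {
    epoch: AtomicU64,
}

impl LeaseRecord {
    pub fn new() -> Self {
        Self { epoch: AtomicU64::new(0) }
    }

    /// Take over the lease by bumping the epoch with a compare-and-swap.
    /// Exactly one contender wins each epoch; losers retry with a fresh read.
    pub fn acquire(&self) -> u64 {
        loop {
            let cur = self.epoch.load(Ordering::SeqCst);
            if self
                .epoch
                .compare_exchange(cur, cur + 1, Ordering::SeqCst, Ordering::SeqCst)
                .is_ok()
            {
                return cur + 1;
            }
        }
    }

    /// Heartbeat: succeeds only while `held` is still the current epoch.
    pub fn renew(&self, held: u64) -> Result<(), Fenced> {
        let current = self.epoch.load(Ordering::SeqCst);
        if current == held {
            Ok(())
        } else {
            Err(Fenced { held, current })
        }
    }
}
```

In this model a writer holding epoch 1 renews successfully until another writer acquires epoch 2; from then on every renewal by the first writer returns `Err(Fenced { .. })`, which is the point at which the controller described above would drop the handle and return the instance to Cold.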
Keep-warm & thundering-herd admission
Two mechanisms bound the cold-start tail under load:
- MUST dedupe: N concurrent start calls for one cold database trigger exactly one Warming transition; the rest wait on that single warm rather than each spawning a process.
- SHOULD admit under a cap: a bounded warm-admission semaphore (max_concurrent_warms) limits how many distinct databases warm at once, so a herd of many cold databases cannot saturate CPU or the object store's request budget.
- MAY keep warm: with keep_warm on, idle instances stay resident past idle_timeout to cut post-idle latency for latency-critical, low-traffic databases.
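The dedupe and admission-cap rules can be sketched with standard-library primitives alone. This is an illustration under assumptions, not the controller's implementation: `Admission`, `Semaphore`, and the shape of `start` are hypothetical, and the warm itself is passed in as a closure that returns whether the open succeeded.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex, OnceLock};

/// Minimal counting semaphore (the standard library does not ship one).
struct Semaphore {
    permits: Mutex<usize>,
    cv: Condvar,
}

impl Semaphore {
    fn new(n: usize) -> Self {
        Self { permits: Mutex::new(n), cv: Condvar::new() }
    }
    fn acquire(&self) {
        let mut p = self.permits.lock().unwrap();
        while *p == 0 {
            p = self.cv.wait(p).unwrap();
        }
        *p -= 1;
    }
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.cv.notify_one();
    }
}

/// Hypothetical admission layer: one warm per database, capped overall.
pub struct Admission {
    // One cell per database. The first caller runs the warm; the rest block
    // on the same cell and observe its result.
    inflight: Mutex<HashMap<String, Arc<OnceLock<bool>>>>,
    warm_slots: Semaphore,
}

impl Admission {
    pub fn new(max_concurrent_warms: usize) -> Self {
        Self {
            inflight: Mutex::new(HashMap::new()),
            warm_slots: Semaphore::new(max_concurrent_warms),
        }
    }

    /// Returns true if the database is warm after the call.
    pub fn start(&self, db: &str, warm: impl FnOnce() -> bool) -> bool {
        let cell = {
            let mut map = self.inflight.lock().unwrap();
            Arc::clone(map.entry(db.to_string()).or_default())
        };
        let ok = *cell.get_or_init(|| {
            // Only the single winner per database competes for a warm slot.
            // A panic inside `warm` would leak the permit; a real
            // implementation would release it from a drop guard.
            self.warm_slots.acquire();
            let ok = warm();
            self.warm_slots.release();
            ok
        });
        if !ok {
            // Failed warm: forget the cell so the next start retries from Cold.
            let mut map = self.inflight.lock().unwrap();
            if map.get(db).is_some_and(|c| Arc::ptr_eq(c, &cell)) {
                map.remove(db);
            }
        }
        ok
    }
}
```

In this sketch a successful cell doubles as the "already warm" marker, so a real controller would also need to remove the entry when the instance is torn down to Cold. Keeping the semaphore inside the deduplicated closure means waiters for the same database never consume warm slots; only distinct databases do.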
The cost trade-off
Scale-to-zero trades a cold-start tail for zero idle compute. The single knob with the most leverage is idle_timeout:
| Workload shape | idle_timeout | keep_warm |
|---|---|---|
| Bursty, recurring (every few minutes) | longer | off |
| Latency-critical, low traffic | moderate | on |
| Truly rare / archival | short (default 30 s) | off |
| Predictable spike (deploy / cron) | default | pre-warm ahead |
Too short an idle_timeout pays the cold-start tax repeatedly on bursty-but-recurring traffic; too long wastes compute and defeats scale-to-zero. Tune it against each database's inter-arrival distribution, and reach for keep_warm only where the cold-start tail actually hurts.
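A back-of-the-envelope model makes the trade-off concrete. The sketch below is an assumption-laden simplification: it ignores reap_interval granularity and request service time, treats the first request as a cold start, and the `Cost` and `simulate` names are hypothetical. Given the gaps between consecutive requests, it counts cold starts and the seconds of idle compute paid for.

```rust
/// Outcome of replaying one arrival pattern against one idle_timeout.
#[derive(Debug, PartialEq)]
pub struct Cost {
    pub cold_starts: u32,
    pub idle_warm_secs: f64,
}

/// `gaps_secs` are the idle gaps between consecutive requests, in seconds.
pub fn simulate(gaps_secs: &[f64], idle_timeout_secs: f64) -> Cost {
    // The very first request always finds the database Cold.
    let mut cost = Cost { cold_starts: 1, idle_warm_secs: 0.0 };
    for &gap in gaps_secs {
        if gap > idle_timeout_secs {
            // The instance idled out before the next request arrived:
            // we paid for idle_timeout of warm compute, then a cold start.
            cost.cold_starts += 1;
            cost.idle_warm_secs += idle_timeout_secs;
        } else {
            // The next request arrived while the instance was still warm.
            cost.idle_warm_secs += gap;
        }
    }
    cost
}
```

With gaps of 120 s, 120 s, 120 s, and 3600 s, a 30 s timeout gives 5 cold starts and 120 s of idle compute, while a 300 s timeout gives 2 cold starts and 660 s of idle compute. That is the "bursty, recurring" row of the table in numbers: the longer timeout buys three fewer cold starts for about nine extra minutes of warm time.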
Stopping is never a stopped commit
Scale-to-zero never compromises durability. The controller stops an idle engine, never a commit: an instance only tears down once it has no active leases, and the engine's append_wal is durable before any commit is acked. Dropping the warm handle releases the fence cleanly for the next writer.
Related
- s3:// backend that makes scale-to-zero possible.
- Branching: Many branches can exist with zero running instances; branches are orthogonal to lifecycle.
- Quickstart: Go disaggregated with a one-line connection-string change.
- Lifecycle & Controller (spec): The full state machine, fencing, and thundering-herd model in the design spec.