Stateless compute refers to a distributed database architecture where the compute layer operates without relying on local persistent storage. Instead of maintaining state locally, the compute nodes stream write-ahead logs (WAL) to a distributed group of safekeepers that manage durability and consistency guarantees. This architectural pattern fundamentally decouples the compute tier from storage, enabling independent scaling and improving fault tolerance characteristics.
In traditional database architectures, compute nodes maintain local storage and are responsible for persisting data to disk. Stateless compute inverts this relationship: the compute layer becomes ephemeral and stateless, delegating all durability concerns to dedicated storage nodes (safekeepers). When a transaction writes data, the compute node streams the WAL records to a quorum of safekeepers rather than writing directly to local storage 1).
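The write path described above can be sketched as follows. This is a minimal in-memory illustration, not the protocol of any particular system: the `Safekeeper` class, the `stream_wal_record` function, and the log sequence number (`lsn`) parameter are hypothetical names introduced here, and a real implementation would send records over the network in parallel and persist them before acknowledging.

```python
from dataclasses import dataclass, field


@dataclass
class Safekeeper:
    """Hypothetical in-memory stand-in for a remote safekeeper."""
    name: str
    available: bool = True
    log: list = field(default_factory=list)

    def append(self, lsn: int, payload: bytes) -> bool:
        # A real safekeeper would make the record durable before acknowledging.
        if not self.available:
            return False
        self.log.append((lsn, payload))
        return True


def stream_wal_record(safekeepers, lsn, payload):
    """Send one WAL record to every safekeeper; True if a majority acknowledged."""
    quorum = len(safekeepers) // 2 + 1
    # Sequential for simplicity; a real compute node would send in parallel.
    acks = sum(1 for sk in safekeepers if sk.append(lsn, payload))
    return acks >= quorum


# Three safekeepers, one of them unreachable: two acks still form a majority.
safekeepers = [
    Safekeeper("sk1"),
    Safekeeper("sk2"),
    Safekeeper("sk3", available=False),
]
committed = stream_wal_record(safekeepers, lsn=1, payload=b"example WAL record")
```

Note that the compute node never touches a local disk in this sketch; durability depends only on the safekeeper acknowledgements.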
This design eliminates the torn page failure mode, a critical reliability concern in traditional systems where unexpected crashes during page writes can leave data in an inconsistent state. By delegating persistence to remote safekeepers operating with quorum-based replication, stateless compute systems achieve stronger durability guarantees without requiring local disk operations during the transaction path.
The separation of compute from storage enables independent horizontal scaling of each tier. Compute nodes can be provisioned, removed, or scaled without affecting the storage layer's state, and vice versa. This decoupling provides significant operational advantages:
  * Compute elasticity: Nodes can be added or removed based on query workload without data migration
  * Storage independence: The storage tier can scale according to data volume requirements separately from compute capacity
  * Cost optimization: Resources can be right-sized for their specific workload characteristics
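As a rough illustration of right-sizing each tier independently, the arithmetic below sizes compute from query load and storage from data volume. The function name `nodes_needed` and all capacity figures are hypothetical values chosen for the example, not measurements from any system.

```python
import math


def nodes_needed(demand: float, capacity_per_node: float) -> int:
    """Smallest node count that covers the demand (at least one node)."""
    return max(1, math.ceil(demand / capacity_per_node))


# Hypothetical workload: 4500 queries/s, each compute node handles 1000 queries/s.
compute_nodes = nodes_needed(demand=4500, capacity_per_node=1000)

# Hypothetical data volume: 12 TB, each storage node holds 4 TB.
storage_nodes = nodes_needed(demand=12, capacity_per_node=4)

# Doubling the query load changes only the compute tier.
compute_nodes_peak = nodes_needed(demand=9000, capacity_per_node=1000)
```

Because the two inputs are unrelated, a spike in query load changes `compute_nodes` while `storage_nodes` stays the same.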
Stateless compute architectures support rapid failover and recovery since compute nodes do not need to restore local state after failures—they can immediately resume operations by reconnecting to the safekeepers 2).
The stateless compute model can improve performance in the write path. Rather than waiting on local disk I/O, compute nodes stream WAL records to safekeepers as they are generated, and a commit is acknowledged once a quorum of safekeepers has confirmed the record. Systems implementing this pattern report write performance improvements over traditional architectures constrained by local storage I/O, although the size of the gain depends on network latency between compute and safekeepers.
The quorum-based safekeeper design ensures that writes achieve durability guarantees (typically matching fsync semantics) without requiring the compute node to wait for multiple disk operations. This pipelining effect, combined with the elimination of torn page concerns, contributes to the architectural efficiency gains.
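One consequence of quorum-based acknowledgement can be shown with simple arithmetic: if records are sent to all safekeepers in parallel, the commit waits only for the majority-th fastest acknowledgement, not the slowest one. The sketch below assumes parallel sends and uses hypothetical latency figures; `quorum_commit_latency` is a name introduced for illustration.

```python
def quorum_commit_latency(ack_latencies_ms):
    """Time until a majority of safekeepers have acknowledged a record."""
    quorum = len(ack_latencies_ms) // 2 + 1
    # The commit completes when the quorum-th fastest ack arrives.
    return sorted(ack_latencies_ms)[quorum - 1]


# Hypothetical per-safekeeper ack latencies in milliseconds; one is slow.
latencies = [1.2, 1.5, 9.0]
commit_latency = quorum_commit_latency(latencies)
print(commit_latency)  # 1.5
```

With three safekeepers, a single slow or stalled node does not delay the commit, which is part of why the write path is less sensitive to individual disk stalls.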
Stateless compute systems require careful management of the WAL stream and safekeeper coordination. The quorum mechanism must balance consistency guarantees with availability. Typical implementations deploy an odd number of safekeepers (3, 5, or 7) so that a majority quorum (2, 3, or 4 respectively) can always be determined unambiguously. Network latency between compute and safekeepers becomes a critical performance factor.
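The preference for odd group sizes follows from majority arithmetic, sketched below. The helper names are introduced for illustration only.

```python
def quorum_size(n: int) -> int:
    """Smallest number of safekeepers that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Safekeepers that can be lost while a majority remains reachable."""
    return n - quorum_size(n)


for n in (3, 4, 5, 7):
    print(n, quorum_size(n), tolerated_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
# 7 4 3
```

Going from 3 to 4 safekeepers raises the quorum size without tolerating any additional failure, so even-sized groups add cost and latency without improving availability.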
Recovery and restart operations in stateless architectures differ from traditional systems. Rather than performing crash recovery on local storage, compute nodes query safekeepers to reconstruct necessary state. This requirement shapes the design of recovery protocols and safekeeper data structures.
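A simplified view of this recovery step is sketched below. It assumes each safekeeper can report the last LSN it has made durable, and that the restarting compute node resumes from the most advanced log among a responding majority. Since every committed record is on a majority and any two majorities overlap, at least one responder holds it. The function name and the input shape are hypothetical, and the sketch deliberately ignores the terms or epochs that real consensus protocols use to discard divergent log tails.

```python
from typing import Dict, Optional, Tuple


def choose_recovery_point(flush_lsns: Dict[str, Optional[int]]) -> Tuple[str, int]:
    """Pick the safekeeper whose log a restarting compute node resumes from.

    flush_lsns maps a safekeeper name to its last durable LSN,
    or None if that safekeeper did not respond.
    """
    quorum = len(flush_lsns) // 2 + 1
    reachable = {name: lsn for name, lsn in flush_lsns.items() if lsn is not None}
    if len(reachable) < quorum:
        raise RuntimeError("cannot recover: no majority of safekeepers responded")
    # The most advanced log among a majority contains every committed record.
    donor = max(reachable, key=reachable.get)
    return donor, reachable[donor]


# sk2 is unreachable; sk1 and sk3 still form a majority of three.
donor, lsn = choose_recovery_point({"sk1": 120, "sk2": None, "sk3": 100})
```

In this example the compute node would resume from `sk1` at LSN 120, and a complete protocol would also bring `sk3` up to date before accepting new writes.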
Stateless compute architectures have emerged in modern cloud-native database systems designed for elastic, serverless deployment models. These systems benefit particularly from the decoupling properties when serving highly variable workloads where compute capacity changes frequently.