This comparison examines two fundamentally different architectural approaches to managing Write-Ahead Log (WAL) page images in database systems, particularly relevant to PostgreSQL-compatible distributed databases. The distinction between generating page images at the compute layer versus the storage layer represents a critical design decision affecting performance, scalability, and resource utilization in modern cloud-native database architectures.
Compute-layer WAL image generation refers to the traditional PostgreSQL approach, in which page images are embedded directly into the WAL stream at the database compute node. In this model, the first time a page is modified after a checkpoint (with full_page_writes enabled, which is the default), the full 8 KB page image is written into the WAL record, creating substantial log overhead 1). Subsequent changes to the same page log only the change itself until the next checkpoint. This approach centralizes image generation responsibility at the compute layer, where database processes are already handling parsing, query execution, and transaction processing.
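The compute-side behavior can be illustrated with a toy model. The sketch below is not PostgreSQL code: the class name ComputeWalWriter, its fields, and the byte accounting are assumptions made for illustration. It only captures the rule that stock PostgreSQL, with full_page_writes enabled, attaches a full page image to the first change of a page after each checkpoint.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 8192  # PostgreSQL's default block size in bytes


@dataclass
class ComputeWalWriter:
    """Toy model (hypothetical) of compute-side WAL generation."""
    checkpoint_lsn: int = 1   # LSN at which the latest checkpoint started
    next_lsn: int = 1         # LSN that the next WAL record will receive
    page_lsn: dict = field(default_factory=dict)  # page id -> LSN of last change
    wal_bytes: int = 0        # total bytes appended to the log so far

    def checkpoint(self) -> None:
        # Every page last changed before this point needs a fresh full image
        # the next time it is modified.
        self.checkpoint_lsn = self.next_lsn

    def modify(self, page_id: int, delta_bytes: int) -> bool:
        """Log one change; return True if a full-page image was attached."""
        last_change = self.page_lsn.get(page_id, 0)
        needs_image = last_change < self.checkpoint_lsn
        # Simplification: the record carries the delta, plus the whole page
        # when an image is required.
        self.wal_bytes += delta_bytes + (PAGE_SIZE if needs_image else 0)
        self.page_lsn[page_id] = self.next_lsn
        self.next_lsn += 1
        return needs_image


writer = ComputeWalWriter()
first = writer.modify(page_id=1, delta_bytes=50)    # first touch: image attached
second = writer.modify(page_id=1, delta_bytes=50)   # same checkpoint cycle: delta only
writer.checkpoint()
third = writer.modify(page_id=1, delta_bytes=50)    # first touch after checkpoint: image again
```

In this model the log grows by a full page only on the first and third calls, which is why workloads that touch many distinct pages between checkpoints see the largest overhead.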
Storage-layer image generation, by contrast, delegates page image creation to distributed storage infrastructure, typically a pageserver or storage service layer. Rather than embedding complete page images in WAL, this approach records only the delta (the change information), and the storage layer reconstructs full page images on demand by applying the accumulated deltas to an earlier version of the page. This architectural shift redistributes computational load from the constrained compute tier to the elastically scalable storage infrastructure 2).
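On-demand reconstruction can be sketched as replaying deltas, in log order, on top of a base page. This is a minimal illustration under stated assumptions: the Delta record layout (LSN, byte offset, new bytes) and the function reconstruct_page are hypothetical and do not reflect the format used by any particular pageserver.

```python
from typing import NamedTuple

PAGE_SIZE = 8192  # page size in bytes, matching the 8 KB pages discussed above


class Delta(NamedTuple):
    """Hypothetical change record: overwrite `data` at `offset` as of `lsn`."""
    lsn: int
    offset: int
    data: bytes


def reconstruct_page(base: bytes, deltas: list[Delta], upto_lsn: int) -> bytes:
    """Apply every delta with lsn <= upto_lsn to the base image, in LSN order."""
    page = bytearray(base)
    for delta in sorted(deltas, key=lambda d: d.lsn):
        if delta.lsn > upto_lsn:
            break  # later changes are not visible at the requested LSN
        page[delta.offset:delta.offset + len(delta.data)] = delta.data
    return bytes(page)


base_image = bytes(PAGE_SIZE)  # an all-zero page as the starting point
log = [Delta(lsn=2, offset=4, data=b"BB"), Delta(lsn=1, offset=0, data=b"AAAA")]

page_at_1 = reconstruct_page(base_image, log, upto_lsn=1)
page_at_2 = reconstruct_page(base_image, log, upto_lsn=2)
```

The compute node in this sketch never ships a full page; it only appends small Delta records, and the cost of building the 8 KB image is paid by whichever storage node serves the read.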
Stock PostgreSQL's compute-layer approach generates significant WAL amplification. Embedding full 8 KB page images in WAL records can inflate log volume by up to 15x compared to delta-only approaches, because the first change to a page after each checkpoint carries the complete page state rather than only the bytes that changed 3). This amplification creates network bandwidth bottlenecks, particularly in distributed deployments where WAL must traverse wide-area networks between compute and storage nodes.
Storage-layer image generation, as implemented in architectures such as Lakebase's, substantially reduces WAL traffic through delta accumulation. By recording only the changes rather than full page images, systems can reduce WAL traffic by approximately 94% compared to traditional approaches 4). This reduction translates directly into lower network bandwidth requirements, lower storage costs for WAL retention, and better write performance, since entire pages no longer have to be serialized into the log.
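The two figures quoted above, an amplification of up to 15x and a reduction of roughly 94%, can be related with simple arithmetic. The sketch below assumes both numbers compare the same two quantities (WAL volume with full page images versus delta-only WAL volume); the function names are illustrative.

```python
def reduction_from_amplification(factor: float) -> float:
    """Fraction of WAL volume saved if delta-only WAL is 1/factor of the original."""
    return 1.0 - 1.0 / factor


def amplification_from_reduction(reduction: float) -> float:
    """Amplification factor implied by a given fractional reduction."""
    return 1.0 / (1.0 - reduction)


saved = reduction_from_amplification(15)         # 1 - 1/15
implied = amplification_from_reduction(0.94)     # 1 / 0.06

print(f"15x amplification -> {saved:.1%} reduction")     # 93.3%
print(f"94% reduction -> {implied:.1f}x amplification")  # 16.7x
```

Under that assumption the two figures are consistent with each other to within rounding: a 15x amplification corresponds to a reduction of about 93%, and a 94% reduction corresponds to a factor of about 17x.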
The compute-layer approach concentrates image generation work at database compute nodes, which are typically limited in number and pre-sized for query processing workloads. Adding WAL image generation increases CPU and I/O burden on these constrained resources, potentially creating contention with transaction processing and query execution. The compute layer must handle variable write rates without the ability to scale horizontally for this specific function.
Storage-layer approaches distribute image generation across elastic storage infrastructure. Pageservers or storage services can scale independently from compute, handling growing WAL accumulation and image reconstruction without impacting database query performance. This separation of concerns enables horizontal scaling of storage infrastructure capacity without provisioning additional compute resources 5), aligning with cloud-native principles of independent tier scaling.
Compute-layer WAL generation offers conceptual simplicity: page images are embedded at the point of modification, ensuring immediate consistency between compute and storage. However, this simplicity comes at the cost of WAL amplification and wasted bandwidth. The approach works acceptably for traditional single-node or small-cluster PostgreSQL deployments but becomes problematic at scale.
Storage-layer image generation requires more sophisticated architecture. The storage layer must maintain delta history, implement accumulation logic, and manage image reconstruction pathways. This added complexity is offset by substantial operational benefits: reduced WAL volume, distributed computational load, and decoupled scaling characteristics. Systems must carefully track delta lineage to ensure correct image reconstruction and maintain sufficient historical context for point-in-time recovery 6).
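The bookkeeping described above (delta history, accumulation, and reads at a historical point) can be sketched for a single page as follows. PageHistory and its methods are hypothetical names, and the sketch assumes deltas are simple byte overwrites keyed by LSN; a real storage layer would also handle persistence, garbage collection, and concurrency.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class PageHistory:
    """Hypothetical per-page history: materialized images plus deltas, keyed by LSN."""
    image_lsns: list = field(default_factory=list)  # sorted LSNs that have an image
    images: dict = field(default_factory=dict)      # lsn -> page bytes
    deltas: list = field(default_factory=list)      # sorted (lsn, offset, data)

    def put_image(self, lsn: int, image: bytes) -> None:
        if lsn not in self.images:
            bisect.insort(self.image_lsns, lsn)
        self.images[lsn] = image

    def put_delta(self, lsn: int, offset: int, data: bytes) -> None:
        bisect.insort(self.deltas, (lsn, offset, data))

    def get_at(self, lsn: int) -> bytes:
        """Return the page as of `lsn`: nearest earlier image plus later deltas."""
        i = bisect.bisect_right(self.image_lsns, lsn)
        if i == 0:
            raise KeyError("no base image at or before the requested LSN")
        base_lsn = self.image_lsns[i - 1]
        page = bytearray(self.images[base_lsn])
        for delta_lsn, offset, data in self.deltas:
            if delta_lsn <= base_lsn:
                continue  # already folded into the base image
            if delta_lsn > lsn:
                break     # not yet visible at the requested LSN
            page[offset:offset + len(data)] = data
        return bytes(page)

    def materialize(self, lsn: int) -> None:
        """Accumulate deltas up to `lsn` into a new image to shorten later replays."""
        self.put_image(lsn, self.get_at(lsn))


history = PageHistory()
history.put_image(10, bytes(16))       # small 16-byte "page" to keep the demo short
history.put_delta(12, 0, b"ab")
history.put_delta(15, 2, b"cd")
history.materialize(15)                # later reads at LSN >= 15 start from this image

latest = history.get_at(20)
historical = history.get_at(13)        # point-in-time read still uses the LSN 10 image
```

Keeping the older image and the deltas after materialization is what allows the historical read at LSN 13 to succeed; how long such history is retained determines the available point-in-time recovery window.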
Traditional PostgreSQL continues using compute-layer WAL image generation as the default behavior, appropriate for deployments where simplicity and established patterns outweigh performance considerations. Modern distributed PostgreSQL-compatible systems, including Lakebase, adopt storage-layer approaches to address scalability requirements in cloud environments. These systems are specifically designed for multi-tenant, high-throughput scenarios where WAL amplification would create unacceptable bottlenecks.
The shift toward storage-layer image generation reflects broader architectural trends in cloud databases: decoupling compute from storage, implementing tiered scaling strategies, and optimizing for network-efficient operations in distributed systems.