====== Pageserver ======

**Pageserver** is a storage layer component within the Lakebase architecture that serves as a persistent data management system. The pageserver maintains long-lived data across object storage backends and local cache layers, providing a critical infrastructure element for data [[durability|durability]] and retrieval within Lakebase deployments (([[https://www.databricks.com/blog/take-control-customer-managed-keys-lakebase-postgres|Databricks - Take Control: Customer-Managed Keys in Lakebase Postgres (2026)]])).

===== Architecture and Data Storage =====

The pageserver functions as an intermediary storage abstraction within the [[lakebase|Lakebase]] system, managing the persistence of data across multiple storage tiers. Data is stored in object storage backends, which provide scalability and cost efficiency for large datasets. Beyond object storage, the pageserver maintains local cache layers that improve access performance for frequently referenced data segments. This multi-tier approach balances the durability guarantees of persistent object storage with the performance characteristics of local caching.

The pageserver architecture follows a separation-of-concerns design, in which storage management is decoupled from compute operations. This separation enables independent scaling of storage and compute resources, a fundamental characteristic of disaggregated database architectures that has become increasingly prevalent in cloud-native database systems.

===== Security and Encryption =====

Data protection within the pageserver implements cryptographic controls through **CMK (Customer-Managed Key) integration** (([[https://www.databricks.com/blog/take-control-customer-managed-keys-lakebase-postgres|Databricks - Take Control: Customer-Managed Keys in Lakebase Postgres (2026)]])).
The Lakebase CMK implementation ensures that pageserver data is encrypted using customer-controlled encryption keys rather than service-managed keys. This approach gives organizations greater control over encryption key lifecycles, rotation policies, and access permissions. CMK-protected encryption satisfies requirements in regulated industries where data sovereignty and encryption key management fall under customer responsibility rather than service provider control.

The encryption model covers both data at rest in object storage and cached data maintained in local pageserver instances. This ensures a consistent security posture across the entire storage layer, regardless of whether data currently resides in the persistent object storage tier or in local cache.

===== Integration with Lakebase Architecture =====

The pageserver operates as a core infrastructure component within the broader Lakebase ecosystem, which represents [[databricks|Databricks]]' approach to providing PostgreSQL-compatible database functionality with cloud-native architectural principles. The pageserver's role in managing persistent data storage enables Lakebase to support transactional semantics while leveraging the scalability and cost characteristics of object storage systems.

The separation of the pageserver as a distinct component allows Lakebase deployments to implement features such as branching, time-travel queries, and rapid cloning operations that depend on efficient access to historical data states. The local caching layer reduces data access latency and thereby improves query performance, while maintaining [[consistency|consistency]] guarantees through cache coherency mechanisms.

===== Operational Considerations =====

Deployments using pageserver components must consider cache sizing, object storage backend selection, and encryption key management policies.
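
To make the cache sizing consideration concrete, the following is a minimal sketch of a read-through cache placed in front of a slower persistent tier. It is not Lakebase code: the names ''ReadThroughCache'' and ''replay'', the dictionary standing in for object storage, and the choice of LRU eviction are all assumptions made for illustration, since the source does not describe the pageserver's actual cache policy.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Hypothetical local cache tier in front of a slower persistent store."""

    def __init__(self, backing_store, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.backing_store = backing_store  # stands in for object storage
        self.capacity = capacity            # maximum number of cached pages
        self.pages = OrderedDict()          # least recently used entry first
        self.hits = 0
        self.misses = 0

    def get(self, page_id):
        if page_id in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_id)  # mark as most recently used
            return self.pages[page_id]
        self.misses += 1
        value = self.backing_store[page_id]  # fetch from the persistent tier
        self.pages[page_id] = value
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)   # evict the least recently used page
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def replay(capacity, accesses, store):
    """Replay an access trace against a fresh cache and return its hit rate."""
    cache = ReadThroughCache(store, capacity)
    for page_id in accesses:
        cache.get(page_id)
    return cache.hit_rate()


# Assumed toy workload: eight pages exist, four of them are accessed repeatedly.
store = {i: f"page-{i}".encode() for i in range(8)}
accesses = [0, 1, 2, 0, 1, 2, 3, 0, 1, 2] * 10

for capacity in (2, 4):
    print(capacity, replay(capacity, accesses, store))
```

Replaying the same trace with different capacities shows how sharply the hit rate can change once the cache is large enough to hold the working set, which is why sizing is best derived from observed access patterns rather than fixed in advance.
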
Organizations implementing Lakebase with pageserver infrastructure should evaluate key rotation frequencies, audit logging for encryption operations, and disaster recovery procedures for both cached and persistent data tiers.

The pageserver's multi-tier storage model introduces considerations around eventual consistency between cache and persistent layers, though these effects are typically managed transparently by the Lakebase query engine. Performance optimization may require tuning cache sizing parameters based on specific workload characteristics and access patterns.

===== See Also =====

  * [[lakehouse_architecture|Lakehouse Architecture]]
  * [[lakebase|Lakebase]]
  * [[safekeeper|Safekeeper]]

===== References =====