====== System Tables ======

**System Tables** are specialized audit and usage tracking tables in Databricks that provide visibility into query execution, performance metrics, data lineage, and cost information across the platform. They enable governance teams and data engineers to monitor workload spending, track performance characteristics, and maintain compliance across analytics and data engineering workflows.

===== Overview and Purpose =====

System Tables function as a centralized observability layer within Databricks, automatically capturing detailed metadata about platform operations without manual instrumentation. Rather than aggregating metrics from external monitoring tools, they store execution records directly within the Databricks ecosystem, so organizations can query this data with standard SQL and integrate it with existing data pipelines (([[https://www.databricks.com/blog/open-platform-unified-pipelines-why-dbt-databricks-accelerating|Databricks - Open Platform Unified Pipelines (2026)]])).

System Tables offer three main capabilities: **governance visibility** into which users and workloads consume platform resources, **performance monitoring** of query execution patterns and optimization opportunities, and **cost attribution** that enables showback or chargeback models for business units or teams. This granular tracking supports both operational decision-making and financial accountability.
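
The cost attribution capability described above can be illustrated with a small showback rollup. The sketch below is only a stand-in: it uses an in-memory SQLite database in place of Databricks, and the table name ''usage_records'' and its columns (''team'', ''workload'', ''compute_units'') are hypothetical, not the actual System Tables schema. It shows the general shape of a standard SQL aggregation that attributes consumption to teams.

```python
import sqlite3

# In-memory stand-in for a usage table. The table and column names are
# hypothetical and do not reflect the actual System Tables schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usage_records ("
    "usage_date TEXT, team TEXT, workload TEXT, compute_units REAL)"
)
conn.executemany(
    "INSERT INTO usage_records VALUES (?, ?, ?, ?)",
    [
        ("2024-05-01", "finance", "dbt_nightly", 12.5),
        ("2024-05-01", "marketing", "adhoc_sql", 3.0),
        ("2024-05-02", "finance", "dbt_nightly", 11.0),
        ("2024-05-02", "marketing", "dashboard", 4.5),
    ],
)

# Showback rollup: total compute units per team, largest consumer first.
SHOWBACK_SQL = """
    SELECT team, SUM(compute_units) AS total_units
    FROM usage_records
    GROUP BY team
    ORDER BY total_units DESC
"""
showback = conn.execute(SHOWBACK_SQL).fetchall()

for team, total_units in showback:
    print(f"{team}: {total_units:.1f} compute units")
```

Against a real workspace, the same ''GROUP BY'' pattern would presumably run over the platform's own usage table, with team membership derived from whatever tagging convention the organization uses.
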
===== Key Metrics and Data Captured =====

System Tables capture several categories of information about platform operations:

  * **Query Execution Metrics**: Wall-clock time, CPU time, memory consumption, I/O patterns, and query stages
  * **Lineage Information**: Data source and destination tracking, transformation dependencies, and upstream/downstream relationships across dbt workloads and SQL workflows
  * **Cost Attribution**: Compute unit consumption, storage costs, and resource allocation mapped to specific users, teams, or projects
  * **Performance Characteristics**: Query optimization opportunities, scan efficiency, and bottleneck identification
  * **User and Workload Activity**: Job execution history, scheduling patterns, and resource utilization by team or application

This information supports analysis of platform health, identification of performance regressions, and allocation of resources based on actual consumption rather than theoretical capacity planning.

===== Governance and Compliance Applications =====

System Tables support the multi-tenant governance models common in enterprise data platforms. Governance teams can use them to implement several controls:

**Workload Monitoring**: Track dbt DAG execution performance, identify long-running transformations, and improve pipeline efficiency. Teams can monitor scheduled jobs, ad-hoc queries, and streaming workloads with consistent observability.

**Cost Governance**: Attribute compute spending to business units, projects, or applications so that organizations can implement chargeback models and allocate costs fairly. Fine-grained cost tracking supports budget management and resource optimization decisions.

**Compliance and Audit Trails**: Support compliance with regulatory frameworks by documenting who accessed which data, when queries executed, and what transformations occurred. These audit records serve both internal governance and external compliance reporting.
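
As a rough illustration of the audit trail use case, the following sketch answers "who touched this table, and when?" for a given time window. It is a minimal example under stated assumptions: an in-memory SQLite database stands in for Databricks, and the ''audit_events'' table, its columns, and the ''access_report'' helper are all hypothetical names introduced for illustration.

```python
import sqlite3

# In-memory stand-in for an audit table. The table and column names are
# hypothetical and do not reflect the actual System Tables schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audit_events ("
    "event_time TEXT, user_name TEXT, action TEXT, table_name TEXT)"
)
conn.executemany(
    "INSERT INTO audit_events VALUES (?, ?, ?, ?)",
    [
        ("2024-05-01T09:15:00", "alice", "SELECT", "sales.orders"),
        ("2024-05-01T10:02:00", "bob", "SELECT", "sales.customers"),
        ("2024-05-02T14:30:00", "carol", "UPDATE", "sales.orders"),
        ("2024-05-03T08:00:00", "alice", "SELECT", "sales.orders"),
    ],
)


def access_report(connection, table_name, start, end):
    """Return (event_time, user_name, action) rows for one table.

    The window is half-open: start <= event_time < end. Timestamps are
    ISO 8601 strings, so plain string comparison orders them correctly.
    """
    query = """
        SELECT event_time, user_name, action
        FROM audit_events
        WHERE table_name = ? AND event_time >= ? AND event_time < ?
        ORDER BY event_time
    """
    return connection.execute(query, (table_name, start, end)).fetchall()


report = access_report(conn, "sales.orders", "2024-05-01", "2024-05-03")
for event_time, user_name, action in report:
    print(f"{event_time} {user_name} {action}")
```

The query is parameterized rather than built by string concatenation, which keeps a reusable compliance report safe to run with user-supplied table names and date ranges.
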
**Performance SLA Tracking**: Monitor whether workloads meet performance targets, identify degradation patterns, and optimize before users are affected.

===== Integration with Data Platforms =====

System Tables operate within Databricks' unified analytics platform, capturing metadata from multiple compute layers and workload types. They work with the dbt-Databricks integration, so data transformation workflows are tracked alongside other SQL and Python workloads. Data engineers can query System Tables using standard SQL, join them with business data to correlate performance with business metrics, and export the metadata to downstream business intelligence or cost management systems.

Because System Tables are accessible through standard SQL rather than proprietary APIs, organizations can build custom monitoring dashboards, automated alerting systems, and compliance reports with their existing BI tools and analytics infrastructure.

===== Limitations and Considerations =====

While System Tables provide valuable platform visibility, several implementation considerations warrant attention:

  * Events may become available in System Tables only after a delay, which affects real-time alerting use cases.
  * Organizations must manage data retention policies to balance historical analysis needs against storage costs.
  * Complex multi-tenant environments may require cost allocation logic more sophisticated than direct attribution.
  * System Tables capture platform-level metrics and may not fully represent end-to-end application performance when workloads span multiple systems.

===== See Also =====

  * [[dbsql_granular_cost_monitoring|DBSQL Granular Cost Monitoring]]
  * [[databricks|Databricks]]
  * [[streaming_tables|Streaming Tables]]
  * [[databricks_unity_catalog|Databricks Unity Catalog]]
  * [[query_tags|Query Tags]]

===== References =====