The Nvidia Vera Rubin platform is a comprehensive AI and HPC system integrating Rubin GPUs, Vera CPUs, advanced networking, and storage components. Announced at GTC 2025 as the successor to the Blackwell architecture, Vera Rubin delivers up to 5x the inference performance of Blackwell and a 10x lower cost per token at rack scale. 1)
| Rubin GPU Specification | Detail |
|---|---|
| Process Node | TSMC 3nm |
| Transistors | 336 billion (dual reticle-size compute dies + 2 I/O tiles) |
| Memory | 288 GB HBM4 at 22 TB/s bandwidth |
| Inference Performance | 50 PFLOPS NVFP4 (5x Blackwell) |
| Training Performance | 35 PFLOPS NVFP4 (3.5x Blackwell) |
| Transformer Engine | 3rd generation with hardware-accelerated adaptive compression |
| TDP | Max Q ~1.8 kW, Max P ~2.3 kW |
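One way to read the table above is through the ratio of peak compute to memory bandwidth, which indicates how arithmetic-heavy a kernel must be before it stops being memory-bound. This roofline-style reading is an interpretation, not a figure from the source:

```python
# Rough arithmetic-intensity estimate for Rubin, using the
# spec-table figures above; the interpretation is illustrative.
nvfp4_inference_flops = 50e15   # 50 PFLOPS NVFP4 (peak inference)
hbm4_bandwidth = 22e12          # 22 TB/s HBM4

# FLOPs required per byte of HBM4 traffic to stay compute-bound
intensity = nvfp4_inference_flops / hbm4_bandwidth
print(f"{intensity:.0f} FLOP/byte")  # ~2273 FLOP/byte
```

Kernels well below that intensity (most decode-phase inference) are limited by the 22 TB/s of HBM4 bandwidth rather than by the NVFP4 compute.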
The Rubin CPX variant features 128 GB GDDR7 for massive-context inference (million-token workloads). The NVL144 CPX platform delivers 8 exaflops AI performance with 100 TB fast memory. 2)
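A back-of-envelope KV-cache calculation shows why memory capacity dominates million-token inference. All model dimensions below are hypothetical assumptions for illustration; none come from the source:

```python
# KV-cache size estimate for a million-token context. Every model
# dimension here is a hypothetical assumption, not from the source.
layers   = 80          # assumed transformer layer count
kv_heads = 8           # assumed grouped-query KV heads
head_dim = 128         # assumed per-head dimension
bytes_el = 2           # FP16/BF16 cache entries
seq_len  = 1_000_000   # million-token context

# K and V each store layers * kv_heads * head_dim values per token
kv_bytes = 2 * layers * kv_heads * head_dim * bytes_el * seq_len
print(f"{kv_bytes / 1e9:.0f} GB")  # ~328 GB for this hypothetical model
```

Even with grouped-query attention, a single million-token context for a model of this (assumed) size runs into hundreds of gigabytes, which is why massive-context inference is treated as a distinct, capacity-oriented workload.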
| Vera CPU Specification | Detail |
|---|---|
| Cores | 88 custom Armv9.2 Olympus cores (176 threads via Spatial Multithreading) |
| Memory | Up to 1.5 TB SOCAMM2 LPDDR5X at 1.2 TB/s bandwidth |
| GPU Connectivity | 1.8 TB/s NVLink-C2C (7x PCIe Gen 6) |
| Performance | 2x Grace CPU in Blackwell systems |
| Features | Scalable Coherency Fabric (SCF); confidential computing; agentic reasoning support |
The Vera CPU handles data movement, analytics, and orchestration, pairing with Rubin GPUs or operating standalone for HPC and cloud workloads. 3)
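The "7x PCIe Gen 6" comparison in the table can be sanity-checked against the stated 1.8 TB/s NVLink-C2C figure, assuming a PCIe Gen 6 x16 link at roughly 256 GB/s bidirectional (an assumption, not stated in the source):

```python
# Sanity check of the "7x PCIe Gen 6" claim for NVLink-C2C.
nvlink_c2c = 1.8e12     # 1.8 TB/s NVLink-C2C, from the table
pcie_gen6_x16 = 256e9   # ~256 GB/s bidirectional x16 (assumed)

ratio = nvlink_c2c / pcie_gen6_x16
print(f"{ratio:.1f}x")  # ~7.0x
```

The numbers line up with the table's claim under that assumption.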
The primary deployment unit for Vera Rubin is the NVL72 rack:
| Component | Specification |
|---|---|
| Compute | 72 Rubin GPUs + 36 Vera CPUs |
| Inference | 3.6 EFLOPS NVFP4 |
| Training | 2.5 EFLOPS |
| GPU Memory | 20.7 TB HBM4 (1.6 PB/s bandwidth) |
| CPU Memory | 54 TB LPDDR5X |
| Scale-Up | NVLink 6 at 260 TB/s rack-scale bandwidth |
| Cooling | 100% liquid (45°C inlet) |
| Serviceability | Modular cable-free trays (18x faster service vs Blackwell) |
The NVL72 scales to 40 racks in a POD configuration, delivering 60 exaflops aggregate performance. 4)
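The rack-level figures are consistent with the per-GPU specifications given earlier; a quick cross-check:

```python
# Cross-check the NVL72 rack totals against the per-GPU Rubin specs.
gpus_per_rack = 72
pflops_per_gpu = 50      # NVFP4 inference, PFLOPS per Rubin GPU
hbm_per_gpu_gb = 288     # HBM4 per Rubin GPU, GB

rack_eflops = gpus_per_rack * pflops_per_gpu / 1000  # PFLOPS -> EFLOPS
rack_hbm_tb = gpus_per_rack * hbm_per_gpu_gb / 1000  # GB -> TB

print(rack_eflops)  # 3.6 EFLOPS, matching the inference row
print(rack_hbm_tb)  # 20.736 TB, i.e. the 20.7 TB HBM4 entry
```

Multiplying the per-GPU numbers by the 72-GPU rack count reproduces both the 3.6 EFLOPS inference figure and the 20.7 TB HBM4 capacity in the table.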
First deployments are expected in H2 2026 from AWS, Google Cloud, Microsoft Azure, Oracle Cloud, and CoreWeave. Volume production of supporting technologies (HBM4, SOCAMM2) is already underway. NVL144 CPX racks targeting 600 kW are projected for 2027. 6)