Nvidia Vera Rubin Platform

The Nvidia Vera Rubin platform is a rack-scale AI and HPC system that integrates Rubin GPUs, Vera CPUs, and advanced networking and storage components. Announced at GTC 2025 as the successor to the Blackwell architecture, Vera Rubin delivers up to 5x Blackwell's inference performance and up to 10x lower cost per token at rack scale. 1)

Rubin GPU

| Specification | Detail |
|---|---|
| Process node | TSMC 3nm |
| Transistors | 336 billion (dual reticle-size compute dies + 2 I/O tiles) |
| Memory | 288 GB HBM4 at 22 TB/s bandwidth |
| Inference performance | 50 PFLOPS NVFP4 (5x Blackwell) |
| Training performance | 35 PFLOPS NVFP4 (3.5x Blackwell) |
| Transformer Engine | 3rd generation, with hardware-accelerated adaptive compression |
| TDP | ~1.8 kW (Max-Q), ~2.3 kW (Max-P) |
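The "5x" and "3.5x" multipliers imply per-GPU Blackwell baselines that are not stated in the table. A quick back-of-envelope sketch, using only the figures above (the Blackwell numbers are derived, not quoted):

```python
# Per-GPU NVFP4 figures from the Rubin spec table
rubin_inference_pflops = 50.0   # 5x Blackwell (claimed)
rubin_training_pflops = 35.0    # 3.5x Blackwell (claimed)

# Implied Blackwell baselines, derived from the stated multipliers
blackwell_inference = rubin_inference_pflops / 5.0   # 10 PFLOPS
blackwell_training = rubin_training_pflops / 3.5     # 10 PFLOPS
```

Both multipliers point back to the same ~10 PFLOPS NVFP4 per-GPU baseline, so the two claims are at least internally consistent.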

The Rubin CPX variant carries 128 GB of GDDR7 memory and targets massive-context inference (million-token workloads). The NVL144 CPX platform delivers 8 exaflops of AI performance with 100 TB of fast memory. 2)

Vera CPU

| Specification | Detail |
|---|---|
| Cores | 88 custom Armv9.2 Olympus cores (176 threads via Spatial Multithreading) |
| Memory | Up to 1.5 TB SOCAMM2 LPDDR5X at 1.2 TB/s bandwidth |
| GPU connectivity | 1.8 TB/s NVLink-C2C (7x PCIe Gen 6) |
| Performance | 2x the Grace CPU used in Blackwell systems |
| Features | Scalable Coherency Fabric (SCF); confidential computing; agentic reasoning support |

The Vera CPU handles data movement, analytics, and orchestration, pairing with Rubin GPUs or operating standalone for HPC and cloud workloads. 3)
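The "7x PCIe Gen 6" comparison in the table can be sanity-checked against the stated 1.8 TB/s NVLink-C2C figure. A minimal sketch (the PCIe baseline is derived here, not quoted from the source):

```python
# NVLink-C2C bandwidth from the Vera CPU table, in GB/s
nvlink_c2c_gbps = 1800.0

# Implied PCIe Gen 6 baseline from the "7x" claim
implied_pcie6_gbps = nvlink_c2c_gbps / 7.0  # ~257 GB/s
```

The derived ~257 GB/s is close to a PCIe 6.0 x16 link's 256 GB/s bidirectional bandwidth, which suggests that is the baseline the comparison assumes.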

NVL72 Rack Configuration

The primary deployment unit for Vera Rubin is the NVL72 rack:

| Component | Specification |
|---|---|
| Compute | 72 Rubin GPUs + 36 Vera CPUs |
| Inference | 3.6 EFLOPS NVFP4 |
| Training | 2.5 EFLOPS |
| GPU memory | 20.7 TB HBM4 (1.6 PB/s bandwidth) |
| CPU memory | 54 TB LPDDR5X |
| Scale-up | NVLink 6 at 260 TB/s rack-scale bandwidth |
| Cooling | 100% liquid-cooled (45°C inlet) |
| Serviceability | Modular cable-free trays (18x faster servicing vs Blackwell) |
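The rack-level figures follow directly from the per-chip numbers given earlier. A short sketch that recomputes the NVL72 aggregates from the Rubin and Vera tables:

```python
# Per-rack component counts from the NVL72 table
gpus, cpus = 72, 36

# Per-chip figures from the Rubin GPU and Vera CPU tables
gpu_hbm4_gb = 288.0            # HBM4 per Rubin GPU
cpu_lpddr_tb = 1.5             # SOCAMM2 LPDDR5X per Vera CPU
gpu_inference_pflops = 50.0    # NVFP4 inference per GPU
gpu_training_pflops = 35.0     # NVFP4 training per GPU

# Rack-level aggregates
hbm4_tb = gpus * gpu_hbm4_gb / 1000.0              # 20.736 -> "20.7 TB"
cpu_mem_tb = cpus * cpu_lpddr_tb                   # 54 TB
inference_eflops = gpus * gpu_inference_pflops / 1000.0  # 3.6 EFLOPS
training_eflops = gpus * gpu_training_pflops / 1000.0    # 2.52 -> "2.5 EFLOPS"
```

All four aggregates match the NVL72 table once rounded, confirming the rack numbers are simple per-chip multiples.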

The NVL72 scales to 40 racks in a POD configuration, delivering 60 exaflops of aggregate performance. 4)

Networking

Timeline

First deployments are expected in H2 2026 from AWS, Google Cloud, Microsoft Azure, Oracle Cloud, and CoreWeave. Volume production of supporting technologies (HBM4, SOCAMM2) is already underway. NVL144 CPX racks targeting 600 kW are projected for 2027. 6)

See Also

References