Nvidia Vera Rubin Platform

The Nvidia Vera Rubin platform is a rack-scale AI and HPC system that integrates Rubin GPUs, Vera CPUs, and advanced networking and storage components. Announced at GTC 2025 as the successor to the Blackwell architecture, Vera Rubin delivers up to 5x Blackwell's inference performance and up to 10x lower cost per token at rack scale. 1)

Rubin GPU

| Specification | Detail |
|---|---|
| Process node | TSMC 3nm |
| Transistors | 336 billion (dual reticle-size compute dies + 2 I/O tiles) |
| Memory | 288 GB HBM4 at 22 TB/s bandwidth |
| Inference performance | 50 PFLOPS NVFP4 (5x Blackwell) |
| Training performance | 35 PFLOPS NVFP4 (3.5x Blackwell) |
| Transformer Engine | 3rd generation, with hardware-accelerated adaptive compression |
| TDP | ~1.8 kW (Max-Q), ~2.3 kW (Max-P) |
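The "5x" and "3.5x" multipliers imply per-GPU Blackwell baselines that are not stated in the table. A quick back-of-envelope sketch, using only the figures above (the Blackwell numbers are derived, not quoted):

```python
# Per-GPU NVFP4 figures from the Rubin spec table
rubin_inference_pflops = 50.0   # 5x Blackwell (claimed)
rubin_training_pflops = 35.0    # 3.5x Blackwell (claimed)

# Implied Blackwell baselines, derived from the stated multipliers
blackwell_inference = rubin_inference_pflops / 5.0   # 10 PFLOPS
blackwell_training = rubin_training_pflops / 3.5     # 10 PFLOPS
```

Both multipliers point back to the same ~10 PFLOPS NVFP4 per-GPU baseline, so the two claims are at least internally consistent.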

The Rubin CPX variant carries 128 GB of GDDR7 memory and targets massive-context inference (million-token workloads). The NVL144 CPX platform delivers 8 exaflops of AI performance with 100 TB of fast memory. 2)

Vera CPU

| Specification | Detail |
|---|---|
| Cores | 88 custom Armv9.2 Olympus cores (176 threads via Spatial Multithreading) |
| Memory | Up to 1.5 TB SOCAMM2 LPDDR5X at 1.2 TB/s bandwidth |
| GPU connectivity | 1.8 TB/s NVLink-C2C (7x PCIe Gen 6) |
| Performance | 2x the Grace CPU used in Blackwell systems |
| Features | Scalable Coherency Fabric (SCF); confidential computing; agentic reasoning support |

The Vera CPU handles data movement, analytics, and orchestration, pairing with Rubin GPUs or operating standalone for HPC and cloud workloads. 3)
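The "7x PCIe Gen 6" comparison in the table can be sanity-checked against the stated 1.8 TB/s NVLink-C2C figure. A minimal sketch (the PCIe baseline is derived here, not quoted from the source):

```python
# NVLink-C2C bandwidth from the Vera CPU table, in GB/s
nvlink_c2c_gbps = 1800.0

# Implied PCIe Gen 6 baseline from the "7x" claim
implied_pcie6_gbps = nvlink_c2c_gbps / 7.0  # ~257 GB/s
```

The derived ~257 GB/s is close to a PCIe 6.0 x16 link's 256 GB/s bidirectional bandwidth, which suggests that is the baseline the comparison assumes.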

NVL72 Rack Configuration

The primary deployment unit for Vera Rubin is the NVL72 rack:

| Component | Specification |
|---|---|
| Compute | 72 Rubin GPUs + 36 Vera CPUs |
| Inference | 3.6 EFLOPS NVFP4 |
| Training | 2.5 EFLOPS |
| GPU memory | 20.7 TB HBM4 (1.6 PB/s bandwidth) |
| CPU memory | 54 TB LPDDR5X |
| Scale-up | NVLink 6 at 260 TB/s rack-scale bandwidth |
| Cooling | 100% liquid-cooled (45°C inlet) |
| Serviceability | Modular cable-free trays (18x faster servicing vs Blackwell) |
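The rack-level figures follow directly from the per-chip numbers given earlier. A short sketch that recomputes the NVL72 aggregates from the Rubin and Vera tables:

```python
# Per-rack component counts from the NVL72 table
gpus, cpus = 72, 36

# Per-chip figures from the Rubin GPU and Vera CPU tables
gpu_hbm4_gb = 288.0            # HBM4 per Rubin GPU
cpu_lpddr_tb = 1.5             # SOCAMM2 LPDDR5X per Vera CPU
gpu_inference_pflops = 50.0    # NVFP4 inference per GPU
gpu_training_pflops = 35.0     # NVFP4 training per GPU

# Rack-level aggregates
hbm4_tb = gpus * gpu_hbm4_gb / 1000.0              # 20.736 -> "20.7 TB"
cpu_mem_tb = cpus * cpu_lpddr_tb                   # 54 TB
inference_eflops = gpus * gpu_inference_pflops / 1000.0  # 3.6 EFLOPS
training_eflops = gpus * gpu_training_pflops / 1000.0    # 2.52 -> "2.5 EFLOPS"
```

All four aggregates match the NVL72 table once rounded, confirming the rack numbers are simple per-chip multiples.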

The NVL72 scales to 40 racks in a POD configuration, delivering 60 exaflops of aggregate performance. 4)

Networking

Timeline

First deployments are expected in H2 2026 from AWS, Google Cloud, Microsoft Azure, Oracle Cloud, and CoreWeave. Volume production of supporting technologies (HBM4, SOCAMM2) is already underway. NVL144 CPX racks targeting 600 kW are projected for 2027. 6)

See Also

References