NVIDIA Vera Rubin
Up to 3.5X more compute and 2.8X more memory bandwidth per GPU than Blackwell. Six new chips, co-designed as one rack-scale system. CUDO is preparing managed Vera Rubin clusters for H2 2026.
Inside the Rubin architecture
Vera Rubin is NVIDIA’s next-generation AI platform, announced at CES 2026 and scheduled for partner availability in the second half of 2026. It is built from six co-designed chips: the Rubin GPU, the Vera CPU, the NVLink 6 switch, the ConnectX-9 SuperNIC, the BlueField-4 DPU, and the Spectrum-6 Ethernet switch. NVIDIA describes the approach as treating the data centre, not the chip, as the unit of compute. Vera Rubin comes in two form factors: the NVL72 rack (72 GPU packages, 36 Vera CPUs, liquid-cooled) and the HGX Rubin NVL8 node (8 GPU packages, liquid-cooled, x86 host).
Rubin GPU
336 billion transistors across two reticle-limited dies plus two I/O chiplets, manufactured on TSMC N3P
224 streaming multiprocessors, 896 Tensor Cores, 3rd-generation Transformer Engine
35 PFLOPS dense FP4 per GPU, up to 50 PFLOPS effective FP4 for inference via Transformer Engine sparsity
17.5 PFLOPS dense FP8
288 GB HBM4 memory (8 stacks, 12-Hi) with up to 22 TB/s bandwidth
3.6 TB/s NVLink 6 bidirectional bandwidth per GPU
2,300W TDP. Liquid cooling required
Vera CPU
88 custom Arm-based Olympus cores with spatial multi-threading (176 threads)
Up to 1.5 TB LPDDR5X SOCAMM memory with 1.2 TB/s bandwidth
3X memory capacity versus Grace
1.8 TB/s coherent NVLink-C2C interface to GPU
2X data processing performance versus Grace (NVIDIA claim)
NVLink 6
3.6 TB/s bidirectional bandwidth per GPU (2X Blackwell NVLink 5)
9 NVSwitch6 trays per NVL72 rack
260 TB/s aggregate NVLink fabric across 72 GPUs
Double the rack-level interconnect bandwidth of Blackwell NVL72
Performance vs Blackwell
Per GPU (versus Blackwell B200)
3.5X dense FP4 compute (35 vs 10 PFLOPS)
3.5X dense FP8 compute (17.5 vs 5 PFLOPS)
2.8X memory bandwidth (up to 22 vs 8 TB/s)
1.5X memory capacity (288 vs 192 GB)
Per rack (Vera Rubin NVL72)
2,520 PFLOPS dense FP4 (up to 3,600 PFLOPS effective)
1,260 PFLOPS dense FP8
20.7 TB HBM4
54 TB LPDDR5X
Up to 1.6 PB/s HBM bandwidth
All specifications and performance claims are based on NVIDIA’s announcements at CES 2026 and third-party analysis. Final specifications may change before availability. NVIDIA’s product page notes that projected performance is subject to change. Initial HBM4 memory bandwidth may be lower than the 22 TB/s target.
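To make the arithmetic behind those rack-level figures easy to check, here is a minimal sketch that multiplies the per-GPU and per-CPU numbers quoted above out to a 72-GPU, 36-CPU rack. The constants come straight from this page; the variable names are illustrative only, and the results round to the figures listed.

```python
# Sanity-check: derive Vera Rubin NVL72 rack aggregates from the per-device
# figures quoted on this page (names and structure are illustrative only).
GPUS_PER_RACK = 72
CPUS_PER_RACK = 36

FP4_DENSE_PFLOPS = 35        # dense FP4 per GPU
FP4_EFFECTIVE_PFLOPS = 50    # FP4 per GPU with Transformer Engine sparsity
FP8_DENSE_PFLOPS = 17.5      # dense FP8 per GPU
HBM4_GB = 288                # HBM4 capacity per GPU
HBM4_BW_TBPS = 22            # HBM4 bandwidth per GPU (upper bound)
NVLINK_BW_TBPS = 3.6         # NVLink 6 bandwidth per GPU
LPDDR5X_TB = 1.5             # LPDDR5X per Vera CPU

print(f"Dense FP4:       {GPUS_PER_RACK * FP4_DENSE_PFLOPS:,} PFLOPS")        # 2,520
print(f"Effective FP4:   {GPUS_PER_RACK * FP4_EFFECTIVE_PFLOPS:,} PFLOPS")    # 3,600
print(f"Dense FP8:       {GPUS_PER_RACK * FP8_DENSE_PFLOPS:,.0f} PFLOPS")     # 1,260
print(f"HBM4 capacity:   {GPUS_PER_RACK * HBM4_GB / 1000:.1f} TB")            # 20.7
print(f"HBM4 bandwidth:  {GPUS_PER_RACK * HBM4_BW_TBPS / 1000:.2f} PB/s")     # ~1.58 (quoted as up to 1.6)
print(f"NVLink fabric:   {GPUS_PER_RACK * NVLINK_BW_TBPS:.0f} TB/s")          # ~259 (quoted as ~260)
print(f"LPDDR5X:         {CPUS_PER_RACK * LPDDR5X_TB:.0f} TB")                # 54
```

If initial HBM4 parts ship below the 22 TB/s target, as noted above, only the bandwidth line changes; the compute and capacity aggregates are unaffected.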
What Vera Rubin means for CUDO customers
Seamless upgrade from Blackwell
Vera Rubin NVL72 is built on the third-generation MGX rack design, which NVIDIA describes as a seamless transition from prior Blackwell generations. CUDO will manage the transition from Blackwell to Rubin when hardware is available.
Two form factors
NVIDIA offers Vera Rubin in both NVL72 rack-scale and HGX NVL8 node configurations, both liquid-cooled. NVL72 puts 72 GPUs in a single NVLink domain for rack-scale training and inference. HGX NVL8 provides 8-GPU nodes for workloads that don't require rack-scale connectivity. CUDO is evaluating both form factors for its managed infrastructure platform.
Power and cooling readiness
Each Rubin GPU draws 2,300W, significantly more than Blackwell's 1,000-1,400W range. CUDO's liquid-cooled data centres are being prepared for Vera Rubin's higher power density. Register your interest now to begin capacity planning ahead of availability.
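To put that jump in context, here is a rough, GPU-only tally of rack power, assuming the 2,300W TDP above and the ~190 kW / ~230 kW rack figures quoted in the spec table below. It ignores CPUs, NICs, switches, and cooling overhead, so treat it as a lower bound rather than a facility plan.

```python
# Rough rack power budget for Vera Rubin NVL72, counting GPU TDP only.
GPUS_PER_RACK = 72
GPU_TDP_KW = 2.3                      # 2,300 W per Rubin GPU

gpu_only_kw = GPUS_PER_RACK * GPU_TDP_KW
print(f"GPU TDP alone: {gpu_only_kw:.1f} kW per rack")         # 165.6 kW

# Compare against the quoted rack-level figures (Max Q / Max P):
for label, rack_kw in [("Max Q", 190), ("Max P", 230)]:
    share = gpu_only_kw / rack_kw
    print(f"{label} (~{rack_kw} kW): GPUs alone are ~{share:.0%} of the budget")
```

For comparison, 72 GPUs at Blackwell's quoted 1,000-1,400W would land in roughly the 72-101 kW range GPU-only, which is why per-rack cooling capacity, not just compute, drives the upgrade planning.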
Vera Rubin vs Blackwell at a glance
Headline performance multipliers are versus Blackwell B200. Blackwell Ultra (B300) included in the table for reference. Rubin figures from NVIDIA CES 2026 and independent industry analysis. Final specifications may change before availability.
Starting from
Contact us for pricing
Architecture
NVIDIA Rubin
GPU
72x NVIDIA Rubin GPUs (configured as 36x Vera Rubin Superchips)
GPU memory
20.7 TB total HBM4 (288 GB per GPU), up to 1,584 TB/s aggregate bandwidth (~22 TB/s per GPU)
FP4 (NVFP4) tensor core performance
3,600 PFLOPS (3.6 exaFLOPS) aggregate inference (50 PFLOPS effective per GPU, with sparsity) | 2,520 PFLOPS aggregate dense training (35 PFLOPS per GPU)
FP8 tensor core performance
1,260 PFLOPS (1.26 exaFLOPS) dense aggregate (17.5 PFLOPS per GPU); up to ~1,800 PFLOPS with sparsity
NVIDIA NVSwitch
Sixth-generation NVIDIA NVLink 6 switches
NVIDIA NVLink bandwidth
~260 TB/s aggregate intra-rack interconnect throughput (3.6 TB/s per GPU, letting all 72 GPUs operate as a single NVLink domain)
System power usage
~190 kW rack TDP (Max Q configuration), up to ~230 kW peak (Max P configuration)
CPU
36x NVIDIA Vera CPUs (each with 88 custom Armv9.2 "Olympus" cores; 3,168 cores total per rack)
System memory
Up to 54 TB total LPDDR5X (up to 1.5 TB per Vera CPU)
Networking
NVIDIA ConnectX-9 SuperNICs (up to 1.6 Tb/s with integrated silicon photonics); NVIDIA BlueField-4 DPUs (with integrated SSDs for key-value cache storage); scale-out via NVIDIA Spectrum-6 Ethernet or Quantum-CX9 InfiniBand
Management network
Host baseboard management controller (BMC) with RJ45 per tray, Top-of-Rack (TOR) out-of-band management switches
Storage
NVIDIA BlueField-4 integrated NVMe for cache, plus tray-level E1.S PCIe Gen6 SSDs and M.2 boot drives
Software
NVIDIA AI Enterprise (including NIM microservices), Red Hat AI / Red Hat Enterprise Linux / OpenShift, CoreWeave Mission Control, NVIDIA DGX OS
Rack units (RU)
48
Cooling
Requires advanced direct-to-chip (D2C) liquid cooling (the VR200 NVL72 compute and NVSwitch trays use a fanless design, with liquid-cooling flow requirements nearly double those of the Blackwell generation).
Need GPU clusters today?
CUDO’s Blackwell clusters are available now. HGX B200, HGX B300, GB200 NVL72, and GB300 NVL72 configurations are deployed across ISO-certified data centres in North America, Europe, the UK, and MENA. CUDO is planning upgrade paths from Blackwell to Vera Rubin for when hardware becomes available.
Register your interest in Vera Rubin
CUDO is preparing managed Vera Rubin infrastructure for H2 2026. Register now to discuss early access, capacity planning, and deployment timelines with our engineering team.