NVIDIA Vera Rubin
Up to 3.5X more compute and 2.8X more memory bandwidth per GPU than Blackwell. Six new chips, co-designed as one rack-scale system. CUDO is preparing managed Vera Rubin clusters for H2 2026.
Inside the Rubin architecture
Vera Rubin is NVIDIA’s next-generation AI platform, announced at CES 2026 and scheduled for partner availability in the second half of 2026. It is built from six co-designed chips: the Rubin GPU, the Vera CPU, the NVLink 6 switch, the ConnectX-9 SuperNIC, the BlueField-4 DPU, and the Spectrum-6 Ethernet switch. NVIDIA describes the approach as treating the data centre, not the chip, as the unit of compute. Vera Rubin comes in two form factors: the NVL72 rack (72 GPU packages, 36 Vera CPUs, liquid-cooled) and the HGX Rubin NVL8 node (8 GPU packages, liquid-cooled, x86 host).
Rubin GPU
336 billion transistors across two reticle-limited dies plus two I/O chiplets, manufactured on TSMC N3P
224 streaming multiprocessors, 896 Tensor Cores, 3rd-generation Transformer Engine
35 PFLOPS dense FP4 per GPU, up to 50 PFLOPS effective FP4 for inference via Transformer Engine sparsity
17.5 PFLOPS dense FP8
288 GB HBM4 memory (8 stacks, 12-Hi) with up to 22 TB/s bandwidth
3.6 TB/s NVLink 6 bidirectional bandwidth per GPU
2,300W TDP. Liquid cooling required
Vera CPU
88 custom Arm-based Olympus cores with spatial multi-threading (176 threads)
Up to 1.5 TB LPDDR5X SOCAMM memory with 1.2 TB/s bandwidth
3X memory capacity versus Grace
1.8 TB/s coherent NVLink-C2C interface to GPU
2X data processing performance versus Grace (NVIDIA claim)
NVLink 6
3.6 TB/s bidirectional bandwidth per GPU (2X Blackwell NVLink 5)
9 NVSwitch6 trays per NVL72 rack
260 TB/s aggregate NVLink fabric across 72 GPUs
Double the rack-level interconnect bandwidth of Blackwell NVL72
Performance vs Blackwell
Per GPU (versus Blackwell B200)
3.5X dense FP4 compute (35 vs 10 PFLOPS)
3.5X dense FP8 compute (17.5 vs 5 PFLOPS)
2.8X memory bandwidth (up to 22 vs 8 TB/s)
1.5X memory capacity (288 vs 192 GB)
Per rack (Vera Rubin NVL72)
2,520 PFLOPS dense FP4 (up to 3,600 PFLOPS effective)
1,260 PFLOPS dense FP8
20.7 TB HBM4
54 TB LPDDR5X
Up to 1.6 PB/s HBM bandwidth
All specifications and performance claims are based on NVIDIA’s announcements at CES 2026 and third-party analysis. Final specifications may change before availability. NVIDIA’s product page notes that projected performance is subject to change. Initial HBM4 memory bandwidth may be lower than the 22 TB/s target.
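To make the arithmetic behind those rack-level figures easy to check, here is a minimal sketch that multiplies the per-GPU and per-CPU numbers quoted above out to a 72-GPU, 36-CPU rack. The constants come straight from this page; the variable names are illustrative only, and the results round to the figures listed.

```python
# Sanity-check: derive Vera Rubin NVL72 rack aggregates from the per-device
# figures quoted on this page (names and structure are illustrative only).
GPUS_PER_RACK = 72
CPUS_PER_RACK = 36

FP4_DENSE_PFLOPS = 35        # dense FP4 per GPU
FP4_EFFECTIVE_PFLOPS = 50    # FP4 per GPU with Transformer Engine sparsity
FP8_DENSE_PFLOPS = 17.5      # dense FP8 per GPU
HBM4_GB = 288                # HBM4 capacity per GPU
HBM4_BW_TBPS = 22            # HBM4 bandwidth per GPU (upper bound)
NVLINK_BW_TBPS = 3.6         # NVLink 6 bandwidth per GPU
LPDDR5X_TB = 1.5             # LPDDR5X per Vera CPU

print(f"Dense FP4:       {GPUS_PER_RACK * FP4_DENSE_PFLOPS:,} PFLOPS")        # 2,520
print(f"Effective FP4:   {GPUS_PER_RACK * FP4_EFFECTIVE_PFLOPS:,} PFLOPS")    # 3,600
print(f"Dense FP8:       {GPUS_PER_RACK * FP8_DENSE_PFLOPS:,.0f} PFLOPS")     # 1,260
print(f"HBM4 capacity:   {GPUS_PER_RACK * HBM4_GB / 1000:.1f} TB")            # 20.7
print(f"HBM4 bandwidth:  {GPUS_PER_RACK * HBM4_BW_TBPS / 1000:.2f} PB/s")     # ~1.58 (quoted as up to 1.6)
print(f"NVLink fabric:   {GPUS_PER_RACK * NVLINK_BW_TBPS:.0f} TB/s")          # ~259 (quoted as ~260)
print(f"LPDDR5X:         {CPUS_PER_RACK * LPDDR5X_TB:.0f} TB")                # 54
```

If initial HBM4 parts ship below the 22 TB/s target, as noted above, only the bandwidth line changes; the compute and capacity aggregates are unaffected.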
What Vera Rubin means for CUDO customers
Seamless upgrade from Blackwell
Vera Rubin NVL72 is built on the third-generation MGX rack design, which NVIDIA describes as a seamless transition from prior Blackwell generations. CUDO will manage the transition from Blackwell to Rubin when hardware is available.
Two form factors
NVIDIA offers Vera Rubin in both NVL72 rack-scale and HGX NVL8 node configurations, both liquid-cooled. NVL72 puts 72 GPUs in a single NVLink domain for rack-scale training and inference. HGX NVL8 provides 8-GPU nodes for workloads that don't require rack-scale connectivity. CUDO is evaluating both form factors for its managed infrastructure platform.
Power and cooling readiness
Each Rubin GPU draws 2,300W, significantly more than Blackwell's 1,000-1,400W range. CUDO's liquid-cooled data centres are being prepared for Vera Rubin's higher power density. Register your interest now to begin capacity planning ahead of availability.
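To put that jump in context, here is a rough, GPU-only tally of rack power, assuming the 2,300W TDP above and the ~190 kW / ~230 kW rack figures quoted in the spec table below. It ignores CPUs, NICs, switches, and cooling overhead, so treat it as a lower bound rather than a facility plan.

```python
# Rough rack power budget for Vera Rubin NVL72, counting GPU TDP only.
GPUS_PER_RACK = 72
GPU_TDP_KW = 2.3                      # 2,300 W per Rubin GPU

gpu_only_kw = GPUS_PER_RACK * GPU_TDP_KW
print(f"GPU TDP alone: {gpu_only_kw:.1f} kW per rack")         # 165.6 kW

# Compare against the quoted rack-level figures (Max Q / Max P):
for label, rack_kw in [("Max Q", 190), ("Max P", 230)]:
    share = gpu_only_kw / rack_kw
    print(f"{label} (~{rack_kw} kW): GPUs alone are ~{share:.0%} of the budget")
```

For comparison, 72 GPUs at Blackwell's quoted 1,000-1,400W would land in roughly the 72-101 kW range GPU-only, which is why per-rack cooling capacity, not just compute, drives the upgrade planning.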
Vera Rubin vs Blackwell at a glance
Headline performance multipliers are versus Blackwell B200. Blackwell Ultra (B300) included in the table for reference. Rubin figures from NVIDIA CES 2026 and independent industry analysis. Final specifications may change before availability.
Starting from
Contact us for pricing
Architecture
NVIDIA Rubin
GPU
72x NVIDIA Rubin GPUs (configured as 36x Vera Rubin Superchips)
GPU memory
20.7 TB total HBM4 (288 GB per GPU), up to 1,584 TB/s aggregate bandwidth (~22 TB/s per GPU)
FP4 (NVFP4) tensor core performance
3,600 PFLOPS (3.6 exaFLOPS) aggregate inference (50 PFLOPS effective per GPU, with sparsity) | 2,520 PFLOPS aggregate dense training (35 PFLOPS per GPU)
FP8 tensor core performance
1,260 PFLOPS (1.26 exaFLOPS) dense aggregate (17.5 PFLOPS per GPU); up to ~1,800 PFLOPS with sparsity
NVIDIA NVSwitch
Sixth-generation NVIDIA NVLink 6 switches
NVIDIA NVLink bandwidth
~260 TB/s aggregate intra-rack interconnect throughput (3.6 TB/s per GPU, letting all 72 GPUs operate as a single NVLink domain)
System power usage
~190 kW rack TDP (Max Q configuration), up to ~230 kW peak (Max P configuration)
CPU
36x NVIDIA Vera CPUs (each with 88 custom Armv9.2 "Olympus" cores; 3,168 cores total per rack)
System memory
Up to 54 TB total LPDDR5X (up to 1.5 TB per Vera CPU)
Networking
NVIDIA ConnectX-9 SuperNICs (up to 1.6 Tb/s with integrated silicon photonics); NVIDIA BlueField-4 DPUs (with integrated SSDs for key-value cache storage); scale-out via NVIDIA Spectrum-6 Ethernet or Quantum-CX9 InfiniBand
Management network
Host baseboard management controller (BMC) with RJ45 per tray, Top-of-Rack (TOR) out-of-band management switches
Storage
NVIDIA BlueField-4 integrated NVMe for cache, plus tray-level E1.S PCIe Gen6 SSDs and M.2 boot drives
Software
NVIDIA AI Enterprise (including NIM microservices), Red Hat AI / Red Hat Enterprise Linux / OpenShift, CoreWeave Mission Control, NVIDIA DGX OS
Rack units (RU)
48
Cooling
Requires advanced direct-to-chip (D2C) liquid cooling (the VR200 NVL72 compute and NVSwitch trays use a fanless design, with liquid-cooling flow requirements nearly double those of the Blackwell generation).
Need GPU clusters today?
CUDO’s Blackwell clusters are available now. HGX B200, HGX B300, GB200 NVL72, and GB300 NVL72 configurations are deployed across ISO-certified data centres in North America, Europe, the UK, and MENA. CUDO is planning upgrade paths from Blackwell to Vera Rubin for when hardware becomes available.
Register your interest in Vera Rubin
CUDO is preparing managed Vera Rubin infrastructure for H2 2026. Register now to discuss early access, capacity planning, and deployment timelines with our engineering team.