NVIDIA HGX B200

Eight Blackwell GPUs deliver 15x faster inference and 3x faster training than H100, with 192 GB HBM3e per GPU and 1.8 TB/s NVLink interconnect. Purpose-built clusters designed, deployed, and managed by CUDO, from a single node to hundreds.

Perfect for a range of workloads

3x faster training than H100 across multi-node B200 clusters. Pre-train and fine-tune foundation models from 70B to over 1 trillion parameters on validated infrastructure with InfiniBand networking and optimised communication libraries.

Serve large language models and MoE architectures on dedicated B200 nodes with predictable latency, high throughput, and 15x the inference performance of H100.

Accelerate scientific simulations, CFD, climate modelling, and financial risk analysis with Blackwell’s FP64 Tensor Core performance across multi-node clusters. Double the FP64 throughput of H100.

Purpose-built B200 clusters, designed and managed by CUDO

Deployed across 16 ISO-certified data centres

From 8 to 1,000+ GPUs in a single deployment

NVIDIA Quantum-X800 InfiniBand or Spectrum-X Ethernet networking

Expert rack-level design, installation, and benchmarking before handoff

24/7 monitoring, management, and engineering support

Compatible with Slurm, Kubernetes, and NVIDIA Base Command

Available at cost-effective pricing
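As a hedged illustration of the Slurm compatibility noted above, the sketch below generates a minimal multi-node batch script for an 8-GPU-per-node HGX B200 cluster. The job name, entry point, and flags are placeholders for illustration, not CUDO-specific values.

```python
# Sketch: generate a minimal Slurm batch script for a multi-node
# B200 job (8 GPUs per node). Names and the training command are
# placeholders, not a CUDO-provided configuration.

def make_sbatch(nodes: int, job_name: str = "b200-train") -> str:
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        "#SBATCH --ntasks-per-node=8",   # one task per GPU
        "#SBATCH --gres=gpu:8",          # 8x B200 per HGX node
        "#SBATCH --exclusive",           # keep the node dedicated
        "",
        "srun python train.py",          # placeholder entry point
    ])

script = make_sbatch(nodes=4)
print(script)
```

The same cluster can equally be driven through Kubernetes or NVIDIA Base Command; the sbatch form is shown only because it is the most compact to sketch.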

Launch your AI products faster with on-demand GPUs and a global network of data center partners

Bare metal

Complete control over a dedicated physical machine.

Powered by renewable energy

No noisy neighbors

NVIDIA Spectrum-X local networking

300 Gbps external connectivity

NVMe SSD storage

Enterprise

We offer a range of solutions for enterprise customers.

Powerful GPU clusters

Scalable data centre colocation

Large quantities of GPUs and hardware

Optimised to your requirements

Expert installation

Scale as your demand grows

Specifications

Browse specifications for the NVIDIA B200 GPU

Starting from

Contact us for pricing

Architecture

NVIDIA Blackwell

GPU

8x NVIDIA Blackwell GPUs

GPU memory

1,440 GB total, 64 TB/s HBM3e bandwidth

FP4 tensor core performance

144 PFLOPS (sparse) / 72 PFLOPS (dense)

FP8 tensor core performance

72 PFLOPS (sparse) / 36 PFLOPS (dense)

NVIDIA NVSwitch

2x

NVIDIA NVLink bandwidth

14.4 TB/s aggregate bandwidth

System power usage

~14.3 kW max

CPU

2x Intel Xeon Platinum 8570 processors, 112 cores total, 2.1 GHz (base), 4 GHz (max boost)

System memory

2 TB, configurable to 4 TB

Networking

4x OSFP ports serving 8x single-port NVIDIA ConnectX-7 VPI (up to 400 Gb/s NVIDIA InfiniBand/Ethernet), 2x dual-port QSFP112 NVIDIA BlueField-3 DPU (up to 400 Gb/s NVIDIA InfiniBand/Ethernet)

Management network

10 Gb/s onboard NIC with RJ45, 100 Gb/s dual-port Ethernet NIC, host baseboard management controller (BMC) with RJ45

Storage

OS: 2x 1.9 TB NVMe M.2, internal storage: 8x 3.84 TB NVMe U.2

Software

NVIDIA AI Enterprise (optimized AI software), NVIDIA Mission Control (AI data center operations and orchestration with NVIDIA Run:ai technology), NVIDIA DGX OS (operating system), supports Red Hat Enterprise Linux / Rocky / Ubuntu

Rack units (RU)

10

Operating temperature

10–35°C / 50–95°F
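The system totals in the table above can be cross-checked from the per-GPU figures quoted on this page. The short sketch below does the arithmetic; the per-GPU NVLink bandwidth (1.8 TB/s) and the 8-GPU FP4 sparse figure (144 PFLOPS) are both taken from this page, not external sources.

```python
# Cross-check HGX B200 system totals from the per-GPU figures
# quoted elsewhere on this page.

GPUS = 8
NVLINK_PER_GPU_TBS = 1.8          # TB/s per GPU, fifth-gen NVLink
FP4_SPARSE_SYSTEM_PFLOPS = 144.0  # 8-GPU total, sparse

aggregate_nvlink = GPUS * NVLINK_PER_GPU_TBS          # matches the 14.4 TB/s row
fp4_sparse_per_gpu = FP4_SPARSE_SYSTEM_PFLOPS / GPUS  # 18 PFLOPS per GPU

print(aggregate_nvlink, fp4_sparse_per_gpu)
```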

Ideal use cases for the NVIDIA B200 GPU

Explore use cases for the NVIDIA B200, including frontier model training, production inference at scale, scientific computing and HPC, and sovereign and regulated AI.

Frontier model training

Multi-node B200 clusters with fifth-generation NVLink and InfiniBand networking deliver 3x the training performance of H100. Reduce time-to-train on 70B+ parameter models with infrastructure that's been tested and benchmarked before handoff, not self-serve hardware you have to validate yourself.
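For a rough sense of time-to-train on such a cluster, the sketch below uses the common C ≈ 6·N·D approximation for dense-transformer training FLOPs. The per-node peak throughput (the page's 36 PFLOPS dense FP8 figure) is taken from the spec table; the utilisation factor, token count, and node count are illustrative assumptions, not benchmarked CUDO numbers.

```python
# Back-of-the-envelope time-to-train for a dense transformer using
# C ~= 6 * N * D (training FLOPs ~ 6 x parameters x tokens).
# Utilisation and workload sizes below are illustrative assumptions.

def train_days(params: float, tokens: float, nodes: int,
               node_pflops: float = 36.0,  # dense FP8 peak per HGX B200 node (from the spec table)
               mfu: float = 0.4) -> float: # assumed achieved utilisation
    flops = 6 * params * tokens
    cluster_flops_per_s = nodes * node_pflops * 1e15 * mfu
    return flops / cluster_flops_per_s / 86400  # seconds -> days

# Example: a 70B-parameter model on 1.4T tokens across 16 nodes (128 GPUs)
days = train_days(params=70e9, tokens=1.4e12, nodes=16)
print(f"~{days:.1f} days")
```

At these assumed numbers the run lands around a month; halving the utilisation or doubling the token budget doubles it, which is why validated interconnect and communication libraries matter at this scale.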

Production inference at scale

15x faster real-time inference and 12x lower cost per token compared to H100, powered by Blackwell's second-generation Transformer Engine with native FP4 precision. Serve trillion-parameter LLMs and MoE models on dedicated nodes with predictable latency and guaranteed throughput.
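The two headline inference figures above are mutually consistent, since cost per token scales with hourly price divided by throughput. The sketch below makes that arithmetic explicit; the implied ~1.25x node price ratio is an inference from the page's own 15x and 12x figures, not a quoted price.

```python
# Cost per token scales as (hourly price) / (tokens per hour).
# The page's "15x throughput" and "12x lower cost per token" are
# consistent iff the B200 node's hourly price is ~1.25x the H100's
# (an implied ratio, not a quoted price).

def cost_per_token_ratio(price_ratio: float, throughput_ratio: float) -> float:
    """Relative cost per token of the new system vs the baseline."""
    return price_ratio / throughput_ratio

implied_price_ratio = 15 / 12  # solve 1/12 = p / 15 for p
relative_cost = cost_per_token_ratio(implied_price_ratio, 15)
print(relative_cost)  # 1/12 of the H100 baseline
```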

Scientific computing and HPC

Blackwell's FP64 tensor core performance accelerates computational fluid dynamics, climate modelling, drug discovery, and financial risk analysis. 192 GB HBM3e per GPU eliminates memory bottlenecks for the largest simulation workloads.

Sovereign and regulated AI

Deploy B200 clusters in ISO-certified data centres globally. Meet data residency and regulatory requirements with full infrastructure control: your hardware, your jurisdiction, managed by CUDO.

Browse alternative GPU solutions for your workloads

Access a wide range of performant NVIDIA and AMD GPUs to accelerate your AI, ML & HPC workloads

Discuss your infrastructure requirements

Scroll to Top