NVIDIA HGX B200
Eight Blackwell GPUs deliver 15x faster inference and 3x faster training than H100, with 192 GB HBM3e per GPU and 1.8 TB/s NVLink interconnect. Purpose-built clusters designed, deployed, and managed by CUDO, from a single node to hundreds.
Infrastructure and technology partners
Perfect for a range of workloads
3X faster training than H100 across multi-node B200 clusters. Pre-train and fine-tune foundation models from 70B to over 1 trillion parameters on validated infrastructure with InfiniBand networking and optimised communication libraries.
Serve large language models and MoE architectures on dedicated B200 nodes with predictable latency, high throughput, and 15X the inference performance of H100.
Accelerate scientific simulations, CFD, climate modelling, and financial risk analysis with Blackwell’s FP64 Tensor Core performance across multi-node clusters. Double the FP64 throughput of H100.
Purpose-built B200 clusters, designed and managed by CUDO
Deployed across 16 ISO-certified data centres
From 8 to 1,000+ GPUs in a single deployment
NVIDIA Quantum-X800 InfiniBand or Spectrum-X Ethernet networking
Expert rack-level design, installation, and benchmarking before handoff
24/7 monitoring, management, and engineering support
Compatible with Slurm, Kubernetes, and NVIDIA Base Command
Cost-effective pricing
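For clusters handed over with Slurm, whole B200 nodes can be scheduled directly. Below is a minimal sketch of a multi-node training job script; the partition name, node count, and `train.py` entry point are illustrative assumptions, not CUDO defaults:

```shell
#!/bin/bash
#SBATCH --job-name=b200-train
#SBATCH --partition=b200        # hypothetical partition name
#SBATCH --nodes=4               # 4 nodes x 8 GPUs = 32 B200s
#SBATCH --gres=gpu:8            # all eight GPUs on each node
#SBATCH --ntasks-per-node=1     # one launcher process per node
#SBATCH --cpus-per-task=112     # matches the dual Xeon 8570 core count

# Slurm supplies the node list; use the first node as the rendezvous host.
MASTER_ADDR=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)

# One torchrun launcher per node, eight workers (one per GPU) each.
srun torchrun \
  --nnodes="$SLURM_NNODES" \
  --nproc-per-node=8 \
  --rdzv-backend=c10d \
  --rdzv-endpoint="$MASTER_ADDR:29500" \
  train.py
```

The same topology maps onto Kubernetes or NVIDIA Base Command with equivalent node- and GPU-count settings.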
Launch your AI products faster with on-demand GPUs and a global network of data centre partners
Bare metal
Powered by renewable energy
No noisy neighbors
NVIDIA Spectrum-X local networking
300Gbps external connectivity
NVMe SSD storage
Enterprise
Powerful GPU clusters
Scalable data center colocation
Large quantities of GPUs and hardware
Optimize to your requirements
Expert installation
Scale as your demand grows
Specifications
Browse specifications for the NVIDIA B200 GPU
Starting from
Contact us for pricing
Architecture
NVIDIA Blackwell
GPU
8x NVIDIA Blackwell GPUs
GPU memory
1,440 GB total, 64 TB/s HBM3e bandwidth
FP4 tensor core performance
144 PFLOPS (sparse) / 72 PFLOPS (dense)
FP8 tensor core performance
72 PFLOPS (sparse) / 36 PFLOPS (dense)
NVIDIA NVSwitch
2x
NVIDIA NVLink bandwidth
14.4 TB/s aggregate bandwidth
System power usage
~14.3 kW max
CPU
2x Intel Xeon Platinum 8570 processors, 112 cores total, 2.1 GHz (base), 4 GHz (max boost)
System memory
2 TB, configurable to 4 TB
Networking
4x OSFP ports serving 8x single-port NVIDIA ConnectX-7 VPI (up to 400 Gb/s NVIDIA InfiniBand/Ethernet), 2x dual-port QSFP112 NVIDIA BlueField-3 DPU (up to 400 Gb/s NVIDIA InfiniBand/Ethernet)
Management network
10 Gb/s onboard NIC with RJ45, 100 Gb/s dual-port Ethernet NIC, host baseboard management controller (BMC) with RJ45
Storage
OS: 2x 1.9 TB NVMe M.2, internal storage: 8x 3.84 TB NVMe U.2
Software
NVIDIA AI Enterprise (optimized AI software), NVIDIA Mission Control (AI data center operations and orchestration with NVIDIA Run:ai technology), NVIDIA DGX OS (operating system), supports Red Hat Enterprise Linux / Rocky / Ubuntu
Rack units (RU)
10
Operating temperature
10-35°C / 50-95°F
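The table lists per-node aggregates for the eight-GPU baseboard; dividing by the GPU count recovers the per-GPU figures quoted elsewhere on this page (a quick sanity check, assuming eight identical GPUs):

```shell
#!/bin/sh
GPUS=8
# NVLink: 14.4 TB/s aggregate -> 1.8 TB/s per GPU
awk -v g="$GPUS" 'BEGIN { printf "NVLink per GPU: %.1f TB/s\n", 14.4 / g }'
# HBM3e bandwidth: 64 TB/s aggregate -> 8 TB/s per GPU
echo "HBM3e bandwidth per GPU: $((64 / GPUS)) TB/s"
# FP4 sparse: 144 PFLOPS aggregate -> 18 PFLOPS per GPU
echo "FP4 (sparse) per GPU: $((144 / GPUS)) PFLOPS"
# GPU memory: 1,440 GB total -> 180 GB per-GPU share of the baseboard total
echo "HBM3e per GPU: $((1440 / GPUS)) GB"
```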
Ideal use cases for the NVIDIA B200 GPU
Explore use cases for the NVIDIA B200, including frontier model training, production inference at scale, scientific computing and HPC, and sovereign and regulated AI.
Frontier model training
Multi-node B200 clusters with fifth-generation NVLink and InfiniBand networking deliver 3x the training performance of H100. Reduce time-to-train on 70B+ parameter models with infrastructure that's been tested and benchmarked before handoff, not self-serve hardware you have to validate yourself.
Production inference at scale
15x faster real-time inference and 12x lower cost per token compared to H100, powered by Blackwell's second-generation Transformer Engine with native FP4 precision. Serve trillion-parameter LLMs and MoE models on dedicated nodes with predictable latency and guaranteed throughput.
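As one illustration of how a dedicated node is typically used, a large model can be served with tensor parallelism spread across all eight GPUs; the sketch below uses vLLM as an example serving stack (not a CUDO-specific tool), and the model name is illustrative:

```shell
# Sketch: tensor-parallel serving across the eight B200s on one node.
# vLLM shards the model's weights and KV cache over the NVLink domain.
vllm serve meta-llama/Llama-3.1-70B-Instruct \
  --tensor-parallel-size 8
```

Setting the tensor-parallel degree to the per-node GPU count keeps all shard-to-shard traffic on the 14.4 TB/s NVLink fabric rather than the network.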
Scientific computing and HPC
Blackwell's FP64 tensor core performance accelerates computational fluid dynamics, climate modelling, drug discovery, and financial risk analysis. 192 GB HBM3e per GPU eliminates memory bottlenecks for the largest simulation workloads.
Sovereign and regulated AI
Deploy B200 clusters in ISO-certified data centres globally. Meet data residency and regulatory requirements with full infrastructure control: your hardware, your jurisdiction, managed by CUDO.
Browse alternative GPU solutions for your workloads
Access a wide range of performant NVIDIA and AMD GPUs to accelerate your AI, ML & HPC workloads
NVIDIA H100 PCIe
Price on request
Scale with high-performance H100 GPUs on our reserved cloud.