Reserved cloud

NVIDIA H200: The world’s most powerful GPU cloud for large-scale AI workloads

The NVIDIA H200 is NVIDIA's latest AI GPU and an ideal choice for large-scale AI applications. Built on the NVIDIA Hopper architecture, the H200 comes with 141 gigabytes (GB) of HBM3e memory at 4.8 terabytes per second (TB/s) - over 50% more memory capacity than the NVIDIA H100 Tensor Core GPU and up to 2x the performance of the H100.

With network speeds of up to 3.2 Tbps, you can accelerate generative AI training, inference, and HPC workloads with unparalleled speed. GPU availability is limited, so reserve the NVIDIA H200 now.

Submit the form to reserve your NVIDIA H200 cloud, and our experts will be in touch with you.

Get access

Be the first to get access to the NVIDIA H200 GPU

Reserved cloud (trial)

Get access to the latest NVIDIA H200 GPUs today on the fastest GPU cloud for AI and HPC

Use cases

With generative AI and LLMs requiring ever greater memory and speed, the H200 is the fastest and most powerful GPU available today.

With up to 1.9x the performance of the H100 for Llama 70B inference, it is an ideal solution for both training and inference.

Request access today to test the H200 GPU cloud, or reserve your H200 cloud on CUDO Compute for as long as you need it, with contracts tailored to your requirements.

Almost twice as powerful as the H100 for specific tasks, the H200 on CUDO Compute allows you to build and scale your LLMs more efficiently and affordably than ever before!


Architecture: NVIDIA Hopper
Form Factor: SXM
FP64 Tensor Core: 67 TFLOPS
TF32 Tensor Core: 989 TFLOPS²
BFLOAT16 Tensor Core: 1,979 TFLOPS²
FP16 Tensor Core: 1,979 TFLOPS²
FP8 Tensor Core: 3,958 TFLOPS²
INT8 Tensor Core: 3,958 TOPS²
GPU Memory: 141 GB
GPU Memory Bandwidth: 4.8 TB/s
Decoders: 7 NVDEC, 7 JPEG
Max Thermal Design Power: Up to 700 W
Multi-Instance GPU: Up to 7 MIGs @ 16.5 GB each
Interconnect: NVLink 900 GB/s, PCIe Gen5 128 GB/s

² With sparsity

Starting from $2.49/hr
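As a rough illustration of what the memory figures above mean for LLM workloads, the sketch below estimates how many model parameters fit in the H200's 141 GB at common precisions, and a bandwidth-bound lower limit on per-token decode latency. The reserve fraction and the "weights read once per token" model are simplifying assumptions for a back-of-envelope estimate, not benchmarks:

```python
# Back-of-envelope sizing from the H200 spec above (141 GB HBM3e, 4.8 TB/s).
# All results are illustrative estimates, not measured performance.

HBM_BYTES = 141e9   # GPU memory capacity in bytes
BANDWIDTH = 4.8e12  # memory bandwidth in bytes/s

BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}

def max_params(precision: str, reserve_fraction: float = 0.2) -> float:
    """Parameters that fit after reserving memory for KV cache/activations
    (the 20% reserve is an assumed, workload-dependent figure)."""
    usable = HBM_BYTES * (1 - reserve_fraction)
    return usable / BYTES_PER_PARAM[precision]

def min_token_latency_s(n_params: float, precision: str) -> float:
    """Bandwidth-bound floor for one decode step, assuming every weight
    is read from HBM once per generated token."""
    return n_params * BYTES_PER_PARAM[precision] / BANDWIDTH

for prec in ("fp16", "fp8"):
    print(f"{prec}: ~{max_params(prec) / 1e9:.0f}B params fit; "
          f"70B-model latency floor ≈ "
          f"{min_token_latency_s(70e9, prec) * 1e3:.1f} ms/token")
```

Under these assumptions, a 70B-parameter model in FP8 fits comfortably on a single H200 with room left for KV cache, which is why the extra memory over the H100 matters for large-model inference.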
  • AI Inference

    AI developers can utilize the NVIDIA H200 to accelerate AI inference workloads, such as image and speech recognition, at lightning speed. The H200 GPU’s powerful Tensor Cores enable it to quickly process large amounts of data, making it perfect for real-time inference applications.

  • Deep Learning

    The NVIDIA H200 empowers data scientists and researchers to achieve groundbreaking milestones in deep learning. Its massive memory and processing power guarantee significantly reduced training and deployment times for complex, large-scale models and enables model training on significantly larger datasets.

  • High-Performance Computing

    From complex scientific simulations to weather forecasting and intricate financial modelling, the H200 empowers diverse organizations to accelerate high-performance computing tasks. Its unmatched memory bandwidth and processing capabilities ensure smooth operation for workloads of any scale, allowing you to achieve unmatched results faster than ever.

CUDO Compute: Accessible GPU rental

Industry demand for HPC resources has grown exponentially, driven by the explosion in ML training, deep learning, and AI inference applications. This growth has made it challenging for organizations to rent GPU resources or even buy some powerful data center and workstation GPUs.

Whether your field is data science, machine learning, or any high-performance computing on GPU, getting started is simple. Start using many of our HPC resources today, or reserve powerful data center GPUs to ensure you have the capacity to empower your developers and delight your customers.

Request access today to our NVIDIA H200 GPU cloud, or contact us to discuss your requirements and our expert team will advise you.

Deploy high-performance cloud GPUs