NVIDIA L40S

Power your inference workloads with our NVIDIA L40S infrastructure. Deploy scalable clusters instantly, now available in our Sweden data center. Built on the NVIDIA Ada Lovelace architecture, the L40S delivers the processing power needed for intensive tasks such as AI inference and training.

Infrastructure and technology partners

Perfect for a range of workloads

Deploying AI-based workloads on CUDO Compute is easy and cost-effective. Follow our AI-related tutorials.

Deploying rendering-based workloads on CUDO Compute is easy and cost-effective.

From video editing to image generation, virtualization is ideal for your content creation needs.

Purpose-built L40S clusters, designed and managed by CUDO

Deployed across 16 ISO-certified data centers

From 8 to 1,000+ GPUs in a single deployment

NVIDIA Quantum-X800 InfiniBand or Spectrum-X Ethernet networking

Expert rack-level design, installation, and benchmarking before handoff

24/7 monitoring, management, and engineering support

Compatible with Slurm, Kubernetes, and NVIDIA Base Command

Available at the most cost-effective pricing

Launch your AI products faster with on-demand GPUs and a global network of data center partners

Bare metal

Complete control over a dedicated physical machine.

No noisy neighbors

Spectrum-X local networking

300 Gbps external connectivity

NVMe SSD storage

Clusters

Deploy and manage powerful, scalable GPU clusters.

Scale up & down as needed

Production ready in minutes

InfiniBand networking support

Manage with our CLI, dashboard & API

On-platform storage tooling

Specifications

Browse specifications for the NVIDIA L40S GPU

Starting from

Architecture: NVIDIA Ada Lovelace

GPU: 8x NVIDIA L40S GPUs

GPU memory: 48 GB GDDR6 with ECC (per GPU)

NVIDIA Ada Lovelace architecture-based CUDA® cores: 18,176 (per GPU)

NVIDIA fourth-generation Tensor Cores: 568 (per GPU)

PCIe: PCIe Gen4

PCIe Gen4 x16 interconnect: 64 GB/s bidirectional bandwidth

System power usage: ~4 to 5 kW max

CPU: Dual 4th or 5th Gen Intel Xeon Scalable Processors (e.g., Platinum 8480+) or AMD EPYC 9004 Series equivalents

System memory: 1 TB to 2 TB DDR5 ECC memory

Networking: Up to 4x single-port NVIDIA ConnectX-7 NICs (up to 400 Gb/s) or NVIDIA BlueField-3 DPUs for scale-out Ethernet clustering (Spectrum-X)

Management network: 1GbE onboard network interface card (NIC) with RJ45. Host baseboard management controller (BMC) with RJ45

Storage: OS: 2x 1.92 TB NVMe M.2 (boot drives); internal storage: 2x to 8x 3.84 TB NVMe U.2/U.3 (partner configurable)

Software: NVIDIA AI Enterprise (optimized AI software suite included), NVIDIA Omniverse Enterprise, Ubuntu / Red Hat Enterprise Linux / Rocky Linux

Rack units (RU)

Operating temperature: 5-30°C (41-86°F)
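To make the figures in the specification list easier to reason about at the node level, the short Python sketch below derives per-node totals and checks the PCIe bandwidth figure. It assumes the GPU memory, CUDA core, and Tensor Core values listed are per-GPU figures and that a node holds 8 GPUs. The function names are illustrative only and are not part of any CUDO Compute or NVIDIA tooling.

```python
# Per-GPU figures copied from the specification list; treating them as
# per-GPU values (rather than per-node) is an assumption of this sketch.
GPUS_PER_NODE = 8
GPU_MEMORY_GB = 48
CUDA_CORES_PER_GPU = 18_176
TENSOR_CORES_PER_GPU = 568


def node_totals(gpus: int = GPUS_PER_NODE) -> dict:
    """Aggregate the per-GPU figures across one node (hypothetical helper)."""
    return {
        "gpu_memory_gb": gpus * GPU_MEMORY_GB,
        "cuda_cores": gpus * CUDA_CORES_PER_GPU,
        "tensor_cores": gpus * TENSOR_CORES_PER_GPU,
    }


def pcie_gen4_x16_gb_per_s() -> tuple:
    """Return (per-direction, bidirectional) usable bandwidth in GB/s.

    PCIe Gen4 signals at 16 GT/s per lane with 128b/130b encoding,
    so usable bandwidth is slightly below the raw rate.
    """
    transfers_per_second = 16e9   # 16 GT/s per lane
    lanes = 16                    # x16 link
    encoding_efficiency = 128 / 130
    per_direction = transfers_per_second * lanes * encoding_efficiency / 8 / 1e9
    return per_direction, 2 * per_direction


if __name__ == "__main__":
    print(node_totals())
    one_way, both_ways = pcie_gen4_x16_gb_per_s()
    print(f"PCIe Gen4 x16: {one_way:.1f} GB/s per direction, "
          f"{both_ways:.1f} GB/s bidirectional")
```

Under these assumptions, a node totals 384 GB of GPU memory. The 64 GB/s figure corresponds to the raw x16 link rate (32 GB/s in each direction); after encoding overhead, usable bandwidth is roughly 31.5 GB/s per direction.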

Ideal use cases for the NVIDIA L40S GPU

Explore use cases for the NVIDIA L40S, including data science and AI, rendering and 3D graphics, and high-performance virtual workstations.

Data science and AI

The L40S GPU offers powerful training and inference performance, allowing professionals to reduce the time to completion for model training and development as well as data preparation workflows.

Rendering and 3D graphics

Powered by fourth-generation Tensor Cores and featuring enhanced AI capabilities, the L40S is a top choice for artists and content creators handling complex rendering and graphics tasks.

High-performance virtual workstations

When combined with NVIDIA RTX Virtual Workstation (vWS) software, the L40S allows professionals to access the most demanding applications from anywhere with awe-inspiring performance that rivals physical workstations.

Blog

Resources

AI hardware installation & maintenance: from GPU racks to memory and storage

Hardware bottlenecks strand expensive compute. We detail the precise site readiness, cooling, and storage configurations needed to scale AI racks

Emmanuel Ohiri

March 18, 2026

Resources

Key considerations for optimizing power efficiency with sustainable energy sources

Power is no longer a background variable in AI infrastructure. It is a first-order constraint that sets the ceiling on

Emmanuel Ohiri

February 10, 2026

Resources

Building for 70% AI-driven demand: Planning for the coming capacity surge

Global data center capacity will nearly triple by 2030, with AI driving most demand. Traditional infrastructure planning no longer works

Emmanuel Ohiri

January 20, 2026

Resources

NVIDIA H100 versus H200: how do they compare?

Read the comprehensive comparison between NVIDIA's H100 and H200 GPUs. Discover the expected improvements and performance gains for AI and

Emmanuel Ohiri

January 16, 2026

Resources

NVIDIA’s Blackwell architecture: breaking down the B100, B200, and GB200

NVIDIA introduced a pivotal breakthrough in AI technology by unveiling its next-gen Blackwell-based GPUs at the NVIDIA GTC 2024.

Emmanuel Ohiri

January 15, 2026

Resources

What is ensemble learning?

Ensemble learning combines the strengths of different algorithms to achieve greater accuracy and solve complex problems.

Emmanuel Ohiri

January 15, 2026

Browse alternative GPU solutions for your workloads

Access a wide range of performant NVIDIA and AMD GPUs to accelerate your AI, ML & HPC workloads

NVIDIA H100 SXM

Price on request

Deploy performant H100s on-demand with CUDO Compute.

NVIDIA H100 PCIe

Price on request

Scale with high-performance H100 GPUs on our reserved cloud.

NVIDIA L40S

Pricing on request

Deploy AI-based workloads on CUDO Compute.

NVIDIA H200

Pricing on request

Deploy AI-based workloads on CUDO Compute.

NVIDIA H100

Pricing on request

Deploy AI-based workloads on CUDO Compute.
