Computer Hardware

What Makes the NVIDIA L40S Special? First Impressions on L40S

September 7, 2023 • 4 min read

DGX and HGX are Costly and Hard to Come By… Alternative?

NVIDIA hardware is the default choice for anything AI-related. The compute leadership of the NVIDIA A100 and NVIDIA H100 drives high demand for NVIDIA’s high-performance GPUs for developing the next wave of AI models. However, the A100 and H100 carry a very high startup cost for smaller-scale operations, especially in their DGX and HGX variants.

Training and running inference on complex AI models for text-to-image generation and LLMs is highly compute-intensive. The NVIDIA L40S GPU was announced and released to fill a gap, combining powerful AI compute with best-in-class graphics and media acceleration to power the next generation of data center workloads. The L40S can handle workloads ranging from generative AI and large language model (LLM) inference and training to 3D graphics, rendering, and video.

But how does NVIDIA make a GPU that can tackle all these workloads? What makes the NVIDIA L40S special?

NVIDIA L40S Advantages

The name suggests the L40S is an upgraded L40, a GPU designed for data center graphics and large-scale NVIDIA Omniverse simulation workloads. But it is more than that. NVIDIA positions the L40S as its most universal high-performance accelerator, capable of complex AI training and inference, and compares it directly with its flagships, the A100 and H100 SXM.

Specification	A100 80GB SXM	NVIDIA L40S	H100 80GB SXM
GPU Architecture	Ampere	Ada Lovelace	Hopper
FP64	9.7 TFLOPS	N/A	33.5 TFLOPS
FP32	19.5 TFLOPS	91.6 TFLOPS	66.9 TFLOPS
RT Cores	N/A	212 TFLOPS	N/A
TF32 Tensor Core	312 TFLOPS	366 TFLOPS	989 TFLOPS
FP16/BF16 Tensor Core	624 TFLOPS	733 TFLOPS	1979 TFLOPS
FP8 Tensor Core	N/A	1466 TFLOPS	3958 TFLOPS
INT8 Tensor Core	1248 TOPS	1466 TOPS	3958 TOPS
GPU Memory	80 GB HBM2e	48 GB GDDR6	80 GB HBM3
GPU Memory Bandwidth	2039 GB/s	864 GB/s	3352 GB/s
L2 Cache	40 MB	96 MB	50 MB
Media Engine	0 NVENC, 5 NVDEC, 5 NVJPEG	3 NVENC, 3 NVDEC, 4 NVJPEG	0 NVENC, 7 NVDEC, 7 NVJPEG
Power	Up to 400 W	Up to 350 W	Up to 700 W
Form Factor	SXM4 - 8 GPU HGX	Dual Slot Width	SXM5 - 8 GPU HGX
Interconnect	PCIe 4.0 x16	PCIe 4.0 x16	PCIe 5.0 x16

Tensor Core figures are peak rates with structured sparsity enabled.

Better General Purpose Computing: On paper, the L40S far outpaces the NVIDIA A100 SXM in FP32, the standard metric for general compute performance, at 91.6 TFLOPS versus 19.5 TFLOPS. It even exceeds the NVIDIA H100 SXM's 66.9 TFLOPS. That makes the L40S a strong fit for FP32-bound HPC workloads such as simulation, rendering, and graphics.
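To put the FP32 gap in perspective, here is a minimal sketch that turns the table's peak figures into ratios. The dictionary and the `fp32_ratio` helper are illustrative names, not part of any NVIDIA tool, and the numbers are datasheet peaks, so real workloads will land somewhere below them.

```python
# Peak FP32 throughput in TFLOPS, copied from the comparison table.
FP32_TFLOPS = {
    "A100 80GB SXM": 19.5,
    "NVIDIA L40S": 91.6,
    "H100 80GB SXM": 66.9,
}


def fp32_ratio(gpu: str, baseline: str) -> float:
    """Return the on-paper FP32 advantage of `gpu` over `baseline`."""
    return FP32_TFLOPS[gpu] / FP32_TFLOPS[baseline]


if __name__ == "__main__":
    for baseline in ("A100 80GB SXM", "H100 80GB SXM"):
        ratio = fp32_ratio("NVIDIA L40S", baseline)
        print(f"L40S vs {baseline}: {ratio:.2f}x peak FP32")
```

A ratio above 1 means the L40S leads on paper for that comparison.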

Better AI Performance: General computing isn’t the A100’s strong suit, but the L40S also outperforms it in its specialty, AI. Tensor Core throughput in TF32 is higher (366 TFLOPS versus 312 TFLOPS), and the same holds for FP16/BF16 and INT8. With the Transformer Engine and support for FP8, a format the A100 lacks, alongside mixed-precision computation, the L40S is ahead of the A100 for AI training and inference.
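The Tensor Core comparison can be sketched the same way. The snippet below is an illustrative calculation over the table's values, with `None` standing in for "N/A"; the `TENSOR_PEAK` table and `l40s_over_a100` helper are hypothetical names chosen for this example.

```python
from typing import Dict, Optional

# Peak Tensor Core rates from the table (TFLOPS, or TOPS for INT8).
# None marks a format the GPU does not support ("N/A" in the table).
TENSOR_PEAK: Dict[str, Dict[str, Optional[float]]] = {
    "TF32": {"A100": 312.0, "L40S": 366.0},
    "FP16/BF16": {"A100": 624.0, "L40S": 733.0},
    "FP8": {"A100": None, "L40S": 1466.0},
    "INT8": {"A100": 1248.0, "L40S": 1466.0},
}


def l40s_over_a100(fmt: str) -> Optional[float]:
    """Return the L40S/A100 peak ratio, or None if the A100 lacks the format."""
    row = TENSOR_PEAK[fmt]
    if row["A100"] is None or row["L40S"] is None:
        return None
    return row["L40S"] / row["A100"]


if __name__ == "__main__":
    for fmt in TENSOR_PEAK:
        ratio = l40s_over_a100(fmt)
        if ratio is None:
            print(f"{fmt}: not available on the A100")
        else:
            print(f"{fmt}: L40S is {ratio:.2f}x the A100 peak")
```

The supported formats show a modest lead of under 20 percent each, while FP8 has no A100 counterpart to compare against.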

Better Accessibility: The NVIDIA L40S is a mainstream accelerator that slots into servers via PCIe 4.0. Its straightforward installation, low barrier to entry, and strong performance make it a standout upgrade choice over other AI accelerators. NVIDIA’s long track record in GPUs for productivity workloads adds to the appeal of the L40S.

Better General Use: NVIDIA is pushing this GPU as an alternative to the NVIDIA A100, but it is more than that, capable of handling a broad range of HPC workloads. It is highly versatile for users whose workloads span complex simulation, dense AI training, or both.

Final Thoughts

Built on the NVIDIA Ada Lovelace architecture, the L40S delivers multi-workload acceleration for large language model (LLM) inference and training, generative AI, and graphics and video applications. Its versatility, performance, and availability make the NVIDIA L40S an attractive GPU for accelerating demanding workloads. Talk to our team at SabrePC and configure your next deep learning and AI server with NVIDIA L40S!
