NVIDIA H200, the highest-performing GPU for AI
The past three years have highlighted the rapid integration of artificial intelligence into everyday workloads, and generative AI has brought these capabilities to the mainstream. At SabrePC, we strive to offer the best high-performance GPU-accelerated solutions and keep everyone up to date on the newest advancements in the hardware industry. At Supercomputing 2023 (SC23), NVIDIA unveiled the new NVIDIA H200 Tensor Core GPU, the latest addition to its Hopper architecture family of AI accelerators.
About the NVIDIA H200 Tensor Core GPU
The H200 extends NVIDIA’s portfolio of leading AI and high-performance data center GPUs, bringing massive compute to data centers. The H200 is the world’s first GPU to feature HBM3e memory, delivering 4.8TB/s of memory bandwidth, roughly 1.4X that of the H100. It also carries 141GB of memory, compared to the H100’s 80GB. The larger, faster memory accelerates dense generative AI models, powers full-scale AI factories, and handles HPC applications with ease, while meeting the demands of ever-growing models such as GPT-4, Llama 2, and more.
Raw GPU performance is identical between the H100 and H200 because they use the same GPU chip, with the same 80 billion transistors and structure. Every improvement the H200 delivers comes from its faster, higher-capacity 141GB of HBM3e memory.
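To make the memory argument concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes batch-1 LLM decoding, where every weight is streamed from HBM once per generated token, so memory bandwidth divided by model size gives a rough ceiling on tokens per second. The 70B-parameter FP16 example and this simplified roofline model (which ignores KV cache, activations, and kernel overheads) are illustrative assumptions, not NVIDIA figures:

```python
# Back-of-the-envelope: why memory capacity and bandwidth dominate LLM
# inference. At batch size 1, decoding streams every weight from HBM
# once per token, so tokens/sec <= memory_bandwidth / model_size.

GPUS = {
    "H100 SXM": {"mem_gb": 80,  "bw_tbs": 3.35},
    "H200 SXM": {"mem_gb": 141, "bw_tbs": 4.80},
}

def decode_ceiling(params_billion: float, bytes_per_param: int = 2) -> None:
    """Check model fit and estimate a bandwidth-bound decode ceiling (FP16)."""
    model_gb = params_billion * bytes_per_param   # 1e9 params * bytes ~= GB
    for name, spec in GPUS.items():
        fits = model_gb <= spec["mem_gb"]         # ignores KV cache/activations
        tps = spec["bw_tbs"] * 1000 / model_gb    # TB/s -> GB/s, per GB/token
        print(f"{name}: {params_billion}B model ({model_gb:.0f}GB) "
              f"{'fits' if fits else 'does NOT fit'}; "
              f"~{tps:.0f} tokens/s ceiling at batch 1")

decode_ceiling(70)  # e.g., a 70B-parameter model in FP16: ~140GB of weights
```

Running this shows a 70B FP16 model (about 140GB of weights) overflowing a single 80GB H100 but just fitting within one H200’s 141GB, with a higher bandwidth-bound throughput ceiling as well.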
| Specification | H100 SXM | H200 SXM |
|---|---|---|
| FP64 teraFLOPS | 34 | 34 |
| FP64 Tensor Core teraFLOPS | 67 | 67 |
| FP32 teraFLOPS | 67 | 67 |
| TF32 Tensor Core teraFLOPS | 989 | 989 |
| BFLOAT16 Tensor Core teraFLOPS | 1,979 | 1,979 |
| FP16 Tensor Core teraFLOPS | 1,979 | 1,979 |
| FP8 Tensor Core teraFLOPS | 3,958 | 3,958 |
| INT8 Tensor Core TOPS | 3,958 | 3,958 |
| GPU Memory | 80GB HBM3 | 141GB HBM3e |
| GPU Memory Bandwidth | 3.35TB/s | 4.8TB/s |
| Decoders | 7 NVDEC | 7 NVDEC |
| Confidential Computing | Supported | Supported |
| Max Thermal Design Power (TDP) | Up to 700W (configurable) | Up to 700W (configurable) |
| Multi-Instance GPU | Up to 7 MIGs @ 10GB each | Up to 7 MIGs @ 16.5GB each |
| Form Factor | SXM | SXM |
| Interconnect | NVLink: 900GB/s, PCIe Gen5: 128GB/s | NVLink: 900GB/s, PCIe Gen5: 128GB/s |
| Server Options | NVIDIA HGX H100 | NVIDIA HGX H200 |
| NVIDIA AI Enterprise | Add-on | Add-on |
NVIDIA H200 Form Factor
With its larger, faster HBM3e memory, the NVIDIA H200 is the perfect GPU for powering the largest large language models and addressing a diverse range of inference needs. The H200 is available through the NVIDIA HGX H200, a server building block in the form of an integrated baseboard carrying four or eight H200 SXM5 GPUs, for up to 1.1TB of total GPU memory. 1.1 terabytes?! That is more GPU memory than some consumer desktops have as SSD storage.
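That 1.1TB headline figure is simple arithmetic over the two baseboard configurations; a quick sketch (the 141GB per-GPU figure comes from the spec table above, the tally itself is ours):

```python
# Aggregate HBM3e on an HGX H200 baseboard: four or eight H200 SXM5
# GPUs at 141GB each.
H200_MEM_GB = 141

for gpus in (4, 8):
    total_gb = gpus * H200_MEM_GB
    print(f"{gpus}x H200: {total_gb}GB (~{total_gb / 1000:.1f}TB) of GPU memory")
# 8x H200: 1128GB (~1.1TB)
```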
NVIDIA Hopper Architecture Advancements
The NVIDIA H200 Tensor Core GPU, built on the Hopper architecture, uses a Transformer Engine and fourth-generation Tensor Cores to deliver up to 5.5X the performance of the previous-generation NVIDIA A100 Tensor Core GPU. With the most performance ever seen in a single server node, enterprise AI developers can power the next generation of generative AI.
The NVIDIA H200’s cutting-edge HBM3e memory delivers this performance within the same power spec as the H100. With that increased power and scalability, seamless communication between GPUs is essential: NVIDIA’s fourth-generation NVLink accelerates multi-GPU input and output (IO) across eight-GPU servers at 900GB/s bidirectional per GPU, over 7X the bandwidth of PCIe Gen5.
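To illustrate what that 7X gap means for multi-GPU work, here is a hedged estimate of gradient-synchronization time over each interconnect. The ring all-reduce approximation (each GPU moves roughly twice the payload) and the 7B-parameter FP16 gradient payload are illustrative assumptions:

```python
# Rough comparison of gradient-sync transfer time over NVLink vs PCIe Gen5.
# Assumption: a ring all-reduce moves about 2 * payload bytes per GPU,
# so transfer time ~= 2 * payload / per-GPU bandwidth.

NVLINK_GBS = 900  # fourth-gen NVLink, bidirectional per GPU
PCIE5_GBS = 128   # PCIe Gen5 x16, bidirectional

def sync_time_ms(payload_gb: float, bw_gbs: float) -> float:
    """Idealized all-reduce transfer time in milliseconds."""
    return 2 * payload_gb / bw_gbs * 1000

grads_gb = 14.0  # e.g., FP16 gradients for a 7B-parameter model (assumption)
print(f"NVLink:    {sync_time_ms(grads_gb, NVLINK_GBS):.1f} ms per all-reduce")
print(f"PCIe Gen5: {sync_time_ms(grads_gb, PCIE5_GBS):.1f} ms per all-reduce")
# ~31 ms vs ~219 ms: the ~7X gap tracks the 900/128 bandwidth ratio.
```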
When to expect the NVIDIA H200
The NVIDIA H200 is expected to be available in 2024. Talk to a SabrePC representative today for updates on how you can leverage an NVIDIA HGX H200 machine for your workload. Accelerate your computing infrastructure, increase your business’s productivity, and develop AI models with the highest-performing AI accelerator.
Have any questions? Looking for an HGX H200 or an alternative to power the most demanding workloads? Contact us today or explore our various customizable Deep Learning Training server platforms.