At the SC23 conference, NVIDIA announced the NVIDIA HGX H200, a platform based on the NVIDIA Hopper architecture. It features the NVIDIA H200 Tensor Core GPU, whose HBM3e memory is designed to handle the massive amounts of data used in generative AI and high-performance computing (HPC) workloads.
The NVIDIA H200 is the first GPU to offer HBM3e memory, which is both faster and larger than earlier memory generations. With 141GB of memory at 4.8 terabytes per second, it offers nearly double the capacity and 2.4x the bandwidth of the previous-generation NVIDIA A100.
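The comparison above can be checked with simple arithmetic. The sketch below is illustrative only: the H200 figures come from the article, while the A100 figures (the 80GB variant at roughly 2.0 terabytes per second) are assumptions that the article does not state.

```python
# Back-of-the-envelope check of the capacity and bandwidth ratios quoted above.
# H200 figures come from the article; the A100 figures are assumptions
# (80 GB variant, roughly 2.0 TB/s), not values stated in the article.

H200_MEMORY_GB = 141
H200_BANDWIDTH_TBPS = 4.8

A100_MEMORY_GB = 80          # assumed: A100 80GB variant
A100_BANDWIDTH_TBPS = 2.0    # assumed: approximate peak memory bandwidth


def ratio(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


capacity_ratio = ratio(H200_MEMORY_GB, A100_MEMORY_GB)
bandwidth_ratio = ratio(H200_BANDWIDTH_TBPS, A100_BANDWIDTH_TBPS)

print(f"Capacity:  {capacity_ratio:.2f}x")
print(f"Bandwidth: {bandwidth_ratio:.2f}x")
```

Under these assumptions the capacity ratio comes out to about 1.76x, which is the sense in which the capacity is "nearly double."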
H200-powered systems from leading server manufacturers and cloud service providers are expected to begin shipping in the second quarter of 2024.
Ian Buck, NVIDIA's vice president of hyperscale and HPC, emphasized the pivotal role of large, fast GPU memory in fueling generative AI and HPC applications. He stated, "With NVIDIA H200, the industry's leading end-to-end AI supercomputing platform just got faster to solve some of the world's most important challenges."
The NVIDIA Hopper architecture delivers a substantial performance leap over its predecessor, and ongoing software enhancements for the H100, such as the recently released NVIDIA TensorRT-LLM, continue to raise the bar. The H200 is expected to extend these gains, nearly doubling inference speed on Llama 2, a 70 billion-parameter LLM, compared to the H100. Future software updates are expected to bring further performance improvements with the H200.
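A rough calculation suggests why memory capacity and bandwidth matter for a model of this size. The sketch below is not NVIDIA's benchmark methodology. It assumes 16-bit weights, a batch size of 1, and a decode step limited only by the time needed to read every weight once per generated token. None of these assumptions comes from the article, and the estimate ignores the KV cache, activations, compute time, and multi-GPU setups.

```python
# Rough memory arithmetic for a 70-billion-parameter model on one H200.
# Assumptions (not from the article): 16-bit weights, batch size 1, and
# decoding limited purely by reading every weight once per generated token.

PARAMS = 70e9                 # 70 billion parameters (from the article)
BYTES_PER_PARAM = 2           # assumed: FP16/BF16 weights
H200_MEMORY_BYTES = 141e9     # 141 GB (from the article)
H200_BANDWIDTH_BPS = 4.8e12   # 4.8 TB/s (from the article)

# Total size of the model weights in bytes.
weight_bytes = PARAMS * BYTES_PER_PARAM

# Whether the weights alone fit in one GPU's memory (no room counted for
# the KV cache or activations).
fits = weight_bytes <= H200_MEMORY_BYTES

# Time to stream all weights once, and the resulting bandwidth-only
# upper bound on tokens generated per second.
seconds_per_token = weight_bytes / H200_BANDWIDTH_BPS
tokens_per_second_bound = 1 / seconds_per_token

print(f"Weights: {weight_bytes / 1e9:.0f} GB, fits in 141 GB: {fits}")
print(f"Bandwidth-only upper bound: {tokens_per_second_bound:.1f} tokens/s")
```

Under these assumptions the weights occupy about 140 GB, so they only just fit in 141GB and leave almost nothing for anything else. Real deployments typically use lower-precision weights or several GPUs, and measured throughput depends on many factors this estimate leaves out.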
The H200 will be available in NVIDIA HGX H200 server boards in four- and eight-way configurations, which are compatible with both the hardware and software of HGX H100 systems. The GPU is also part of the NVIDIA GH200 Grace Hopper Superchip, announced in August.
NVIDIA's global ecosystem of partner server manufacturers, including ASRock Rack, ASUS, Dell Technologies, GIGABYTE, Hewlett Packard Enterprise, Lenovo, and more, can update their existing systems with the H200.
Cloud service providers including Amazon Web Services, Google Cloud, Microsoft Azure, Oracle Cloud Infrastructure, CoreWeave, Lambda, and Vultr plan to deploy H200-based instances starting in 2024, offering higher performance across a range of application workloads.
The upcoming NVIDIA HGX H200 marks a major step in AI computing, promising higher performance and scalability for AI and HPC applications across many industries.