MLPerf Benchmarks
The NVIDIA AI platform achieves world-class performance and versatility in MLPerf Training, Inference, and HPC benchmarks for the most demanding, real-world AI workloads.
MLPerf™ benchmarks, developed by MLCommons, a consortium of AI leaders from academia, research labs, and industry, are designed to provide unbiased evaluations of training and inference performance for hardware, software, and services, all conducted under prescribed conditions. To stay on the cutting edge of industry trends, MLPerf continues to evolve, holding new rounds at regular intervals and adding workloads that represent the state of the art in AI.
MLPerf Inference v4.0 measures inference performance on nine different benchmarks, including large language models (LLMs), text-to-image, natural language processing, speech, recommenders, computer vision, and medical image segmentation.
MLPerf Training v4.0 measures training performance on nine different benchmarks, including LLM pre-training, LLM fine-tuning, text-to-image, graph neural network (GNN), computer vision, medical image segmentation, and recommendation.
MLPerf HPC v3.0 measures training performance across four different scientific computing use cases, including climate atmospheric river identification, cosmology parameter prediction, quantum molecular modeling, and protein structure prediction.
Large Language Models (LLMs): Deep learning algorithms trained on large-scale datasets that can recognize, summarize, translate, predict, and generate content for a breadth of use cases.
Text-to-Image: Generates images from text prompts.
Recommendation: Delivers personalized results in user-facing services such as social media or ecommerce websites by understanding interactions between users and service items, like products or ads.
Object Detection: Finds instances of real-world objects such as faces, bicycles, and buildings in images or videos and draws a bounding box around each.
Graph Neural Network (GNN): Uses neural networks designed to work with data structured as graphs.
Image Classification: Assigns a label from a fixed set of categories to an input image; a core computer vision task.
Natural Language Processing (NLP): Understands text by using the relationships between different words in a block of text, enabling question answering, sentence paraphrasing, and many other language-related use cases.
Medical Image Segmentation: Performs volumetric segmentation of dense 3D images for medical use cases.
Climate Atmospheric River Identification: Identifies hurricanes and atmospheric rivers in climate simulation data.
Cosmology Parameter Prediction: Solves a 3D image regression problem on cosmological data.
Quantum Molecular Modeling: Predicts energies or molecular configurations.
Protein Structure Prediction: Predicts three-dimensional protein structure based on one-dimensional amino acid connectivity.
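Segmentation workloads like the medical image segmentation benchmark are commonly scored with the Dice coefficient, which measures volumetric overlap between a predicted mask and the ground truth. A minimal sketch in Python (illustrative only; the MLPerf reference implementations define the exact metric used for scoring):

```python
def dice_score(pred, truth):
    """Dice coefficient: 2*|A∩B| / (|A| + |B|) over binary voxel masks.

    pred, truth: equal-length flat sequences of 0/1 voxel labels.
    Returns 1.0 for a perfect match, 0.0 for no overlap.
    """
    assert len(pred) == len(truth)
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    if total == 0:  # both masks empty: treat as perfect agreement
        return 1.0
    return 2.0 * intersection / total

# Toy flattened "volume": prediction overlaps truth on 2 of 3 labeled voxels.
print(dice_score([0, 1, 1, 1, 0], [0, 1, 1, 0, 1]))  # prints 0.6666666666666666
```

In practice the metric is computed per class over full 3D volumes and averaged, but the overlap formula is the same.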
The NVIDIA accelerated computing platform, powered by NVIDIA Hopper™ GPUs and NVIDIA Quantum-2 InfiniBand networking, delivered the highest performance on every benchmark in MLPerf Training v4.0. On the LLM benchmark, NVIDIA more than tripled performance in just one year through a record submission scale of 11,616 H100 GPUs combined with software optimizations. NVIDIA also delivered 1.8X more performance on the text-to-image benchmark in just seven months. And on the newly added LLM fine-tuning and graph neural network benchmarks, NVIDIA set the bar. NVIDIA achieved these results through relentless full-stack engineering at data center scale.
The NVIDIA platform continues to demonstrate unmatched performance and versatility in MLPerf Training v4.0. NVIDIA delivered the highest performance on all nine benchmarks and set new records on five of them: LLM, LLM fine-tuning, text-to-image, graph neural network, and object detection (lightweight).
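Headline training results like these come from scaling to more GPUs while keeping per-GPU efficiency high. As a back-of-the-envelope illustration (all numbers hypothetical, not MLPerf results), scaling efficiency can be computed from time-to-train at two submission scales:

```python
def scaling_efficiency(gpus_a, minutes_a, gpus_b, minutes_b):
    """Fraction of ideal linear speedup retained when scaling from A to B.

    Ideal scaling: time-to-train shrinks in proportion to the GPU-count
    increase. Values near 1.0 mean the added GPUs are fully utilized.
    """
    actual_speedup = minutes_a / minutes_b
    ideal_speedup = gpus_b / gpus_a
    return actual_speedup / ideal_speedup

# Hypothetical numbers: 4x the GPUs cuts time-to-train from 40 to 11 minutes.
eff = scaling_efficiency(1024, 40.0, 4096, 11.0)
print(f"{eff:.0%}")  # prints 91%
```

Sub-linear results at very large scales are expected; minimizing that gap is exactly where interconnect and software optimizations pay off.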
The NVIDIA accelerated computing platform, fueled by the NVIDIA Hopper architecture, delivered exceptional performance across every workload in the MLPerf Inference v4.0 data center category. NVIDIA TensorRT™-LLM software nearly tripled GPT-J LLM performance on Hopper GPUs in just six months. The NVIDIA HGX™ H200, powered by NVIDIA H200 Tensor Core GPUs with 141GB HBM3e memory, also made its debut, setting new records on the new Llama 2 70B and Stable Diffusion XL generative AI tests. The NVIDIA GH200 Grace Hopper™ Superchip also demonstrated outstanding performance, while NVIDIA Jetson Orin remained at the forefront in the edge category, running the most diverse set of models including generative AI models like GPT-J and Stable Diffusion XL.
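MLPerf Inference's server scenario doesn't just count raw throughput: submissions must also keep query latency under a per-benchmark tail-latency bound (typically measured at the 99th percentile). A toy sketch of that kind of check, with hypothetical latency numbers:

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ordered = sorted(samples)
    rank = max(1, -(-len(ordered) * pct // 100))  # ceil(n * pct / 100)
    return ordered[int(rank) - 1]

# Hypothetical per-query latencies (ms) checked against a 100 ms bound.
latencies = [12, 15, 14, 13, 90, 16, 12, 11, 140, 13]
p99 = percentile(latencies, 99)
print("p99 =", p99, "ms; meets 100 ms bound:", p99 <= 100)
```

With so few samples the 99th percentile is just the worst query, so the single 140 ms outlier fails the bound; this is why tail-latency constraints reward consistently fast serving, not just high average throughput.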
The NVIDIA H100 Tensor Core GPU supercharged the NVIDIA platform for HPC and AI in its MLPerf HPC v3.0 debut, enabling up to 16X faster time to train in just three years and delivering the highest performance on all workloads across both time-to-train and throughput metrics. The NVIDIA platform was also the only one to submit results for every MLPerf HPC workload, which span climate segmentation, cosmology parameter prediction, quantum molecular modeling, and the latest addition, protein structure prediction. The unmatched performance and versatility of the NVIDIA platform make it the instrument of choice to power the next wave of AI-powered scientific discovery.
NVIDIA Full-Stack Innovation Fuels Performance Gains
The complexity of AI demands a tight integration between all aspects of the platform. As demonstrated in MLPerf’s benchmarks, the NVIDIA AI platform delivers leadership performance with the world’s most advanced GPU, powerful and scalable interconnect technologies, and cutting-edge software—an end-to-end solution that can be deployed in the data center, in the cloud, or at the edge with amazing results.
An essential component of NVIDIA’s platform and MLPerf training and inference results, the NGC™ catalog is a hub for GPU-optimized AI, HPC, and data analytics software that simplifies and accelerates end-to-end workflows. With over 150 enterprise-grade containers—including workloads for generative AI, conversational AI, and recommender systems; hundreds of AI models; and industry-specific SDKs that can be deployed on premises, in the cloud, or at the edge—NGC enables data scientists, researchers, and developers to build best-in-class solutions, gather insights, and deliver business value faster than ever.
Achieving world-leading results across training and inference requires infrastructure that’s purpose-built for the world’s most complex AI challenges. The NVIDIA AI platform delivered leading performance powered by the NVIDIA HGX™ platform, including the NVIDIA HGX H100, NVIDIA HGX H200, as well as the NVIDIA GH200 Grace Hopper Superchip, and the scalability and flexibility of NVIDIA interconnect technologies—NVIDIA NVLink, NVSwitch™, and Quantum-2 InfiniBand. These are at the heart of the NVIDIA data center platform, the engine behind our benchmark performance.
In addition, NVIDIA DGX™ systems offer the scalability, rapid deployment, and incredible compute power that enable every enterprise to build leadership-class AI infrastructure.
Learn More About Our Data Center Training and Inference Performance.