MLPerf v5.1 AI Inference Benchmark Comparison: Spotlight on NVIDIA Blackwell Ultra GB300 and AMD Instinct MI355X

The latest MLPerf v5.1 AI inference benchmarks have witnessed the debut of groundbreaking chips from NVIDIA and AMD: the Blackwell Ultra GB300 and the Instinct MI355X. These powerful processors are generating considerable buzz in the tech community for their outstanding performance metrics.

NVIDIA Blackwell Ultra GB300 & AMD Instinct MI355X: A New Benchmark in AI Performance

MLCommons recently released its latest evaluation of AI performance through the MLPerf v5.1 benchmarks, revealing remarkable submissions, notably from NVIDIA and AMD. The Blackwell Ultra GB300 and Instinct MI355X stand out as the premier offerings in AI technology from their respective manufacturers. In this analysis, we will closely examine their capabilities as demonstrated through the benchmarks.

Blackwell Ultra GB300 Performance Highlights

In the DeepSeek R1 (Offline) category, NVIDIA’s GB300 dramatically outpaces its predecessor, the GB200, achieving a 45% performance increase in 72-GPU setups and a 44% boost in 8-GPU configurations. These improvements align closely with NVIDIA’s projected performance gains.

In the DeepSeek R1 (Server) category, the GB300 has also made notable strides, with a 25% increase in performance for 72-GPU setups and a 21% boost in 8-GPU configurations.

AMD’s Instinct MI355X Enters the Arena

The AMD Instinct MI355X has also made substantial contributions, particularly in the Llama 3.1 405B (Offline) benchmarks. A comparative evaluation against the GB200 revealed a remarkable 27% performance increase, demonstrating AMD’s advancements in the AI sector.

Moreover, in a benchmark involving Llama 2 70B (Offline), the MI355X showcased impressive throughput, generating up to 648,248 tokens per second in a 64-chip configuration and a striking 2.09x performance increase over the NVIDIA GB200 in an 8-chip setup.
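As a quick illustration of how the aggregate figure above breaks down, the snippet below derives an approximate per-chip throughput from the reported 64-chip total. The per-chip number is our own back-of-the-envelope arithmetic, not a figure published in the MLPerf results.

```python
# Derive an approximate per-chip rate from the reported aggregate throughput.
aggregate_tps = 648_248   # tokens/second across the 64-chip MI355X configuration
num_chips = 64

per_chip_tps = aggregate_tps / num_chips
print(f"~{per_chip_tps:,.0f} tokens/second per chip")  # ~10,129 tokens/second per chip
```

Note that MLPerf submissions at different chip counts rarely scale perfectly linearly, so this is only a rough normalization for comparison purposes.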

NVIDIA has shared a detailed analysis of their benchmarks, including the various records achieved through the Blackwell Ultra GB300 platform. These results showcase a significant advancement in AI inference capabilities.

[Image: “Blackwell Sets Llama 3.1 405B Interactive Record” performance graph. Image Source: NVIDIA]

Comprehensive Record Table

MLPerf Inference Per-Accelerator Records

Benchmark | Offline | Server | Interactive
DeepSeek-R1 | 5,842 tokens/second/GPU | 2,907 tokens/second/GPU | **
Llama 3.1 405B | 224 tokens/second/GPU | 170 tokens/second/GPU | 138 tokens/second/GPU
Llama 2 70B 99.9% | 12,934 tokens/second/GPU | 12,701 tokens/second/GPU | 7,856 tokens/second/GPU
Llama 2 70B 99% | 13,015 tokens/second/GPU | 12,701 tokens/second/GPU | 7,856 tokens/second/GPU
Llama 3.1 8B | 18,370 tokens/second/GPU | 16,099 tokens/second/GPU | 15,284 tokens/second/GPU
Stable Diffusion XL | 4.07 samples/second/GPU | 3.59 queries/second/GPU | **
Mixtral 8x7B | 16,099 tokens/second/GPU | 16,131 tokens/second/GPU | **
DLRMv2 99% | 87,228 samples/second/GPU | 80,515 samples/second/GPU | **
DLRMv2 99.9% | 48,666 samples/second/GPU | 46,259 queries/second/GPU | **
Whisper | 5,667 tokens/second/GPU | ** | **
R-GAT | 81,404 samples/second/GPU | ** | **
Retinanet | 1,875 samples/second/GPU | 1,801 queries/second/GPU | **

Furthermore, NVIDIA’s Blackwell Ultra has established new reasoning benchmarks at MLPerf, outperforming the previous Hopper architecture by 4.7x in the Offline scenario and 5.2x in the Server scenario, a substantial leap in performance.

DeepSeek-R1 Performance Comparison

Architecture | Offline | Server
Hopper | 1,253 tokens/second/GPU | 556 tokens/second/GPU
Blackwell Ultra | 5,842 tokens/second/GPU | 2,907 tokens/second/GPU
Blackwell Ultra Advantage | 4.7x | 5.2x
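The multipliers in the comparison above follow directly from the per-GPU figures; the short check below reproduces them from the table's own numbers.

```python
# Reproduce the Blackwell Ultra vs. Hopper multipliers from the per-GPU figures above.
hopper = {"Offline": 1253, "Server": 556}              # tokens/second/GPU
blackwell_ultra = {"Offline": 5842, "Server": 2907}    # tokens/second/GPU

for scenario in ("Offline", "Server"):
    speedup = blackwell_ultra[scenario] / hopper[scenario]
    print(f"{scenario}: {speedup:.1f}x")  # Offline: 4.7x, Server: 5.2x
```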

As we look forward to future MLPerf submissions, it’s anticipated that NVIDIA, AMD, and Intel will continue to enhance their platforms, striving for even greater performance levels in this competitive landscape.
