
The latest MLPerf v5.1 AI inference benchmarks mark the debut of new flagship chips from NVIDIA and AMD: the Blackwell Ultra GB300 and the Instinct MI355X. Both accelerators posted strong results that are drawing considerable attention in the tech community.
NVIDIA Blackwell Ultra GB300 & AMD Instinct MI355X: A New Benchmark in AI Performance
MLCommons recently released its latest evaluation of AI performance through the MLPerf v5.1 benchmarks, revealing remarkable submissions, notably from NVIDIA and AMD. The Blackwell Ultra GB300 and Instinct MI355X stand out as the premier offerings in AI technology from their respective manufacturers. In this analysis, we will closely examine their capabilities as demonstrated through the benchmarks.
Blackwell Ultra GB300 Performance Highlights
In the DeepSeek R1 (Offline) category, NVIDIA’s GB300 dramatically outpaces its predecessor, the GB200, achieving a 45% performance increase in 72-GPU setups and a 44% boost in an 8-GPU configuration. These improvements align closely with NVIDIA’s projected performance gains.
In the DeepSeek R1 (Server) category, the Blackwell Ultra GB300 has also made notable strides, with a 25% performance increase in 72-GPU configurations and a 21% boost with 8 GPUs.
AMD’s Instinct MI355X Enters the Arena
The AMD Instinct MI355X has also made substantial contributions, particularly in the Llama 3.1 405B (Offline) benchmarks. A comparative evaluation against the GB200 revealed a remarkable 27% performance increase, demonstrating AMD’s advancements in the AI sector.
Moreover, in a benchmark involving Llama 2 70B (Offline), the MI355X showcased impressive throughput, generating up to 648,248 tokens per second in a 64-chip configuration and a striking 2.09x performance increase over the NVIDIA GB200 in an 8-chip setup.
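As a rough cross-check of the aggregate figure, the sketch below simply divides the reported 648,248 tokens per second by the 64 chips in the configuration to get an approximate per-accelerator rate. The numbers come straight from the result quoted above; the script itself is purely illustrative.

```python
# Back-of-the-envelope arithmetic using the Llama 2 70B (Offline) result above.
total_tokens_per_second = 648_248   # reported aggregate throughput, 64x MI355X
num_chips = 64

per_chip = total_tokens_per_second / num_chips
print(f"~{per_chip:,.0f} tokens/second per MI355X")  # ~10,129 tokens/second
```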
NVIDIA has shared a detailed analysis of their benchmarks, including the various records achieved through the Blackwell Ultra GB300 platform. These results showcase a significant advancement in AI inference capabilities.

Comprehensive Record Table
MLPerf Inference Per-Accelerator Records

| Benchmark | Offline | Server | Interactive |
| --- | --- | --- | --- |
| DeepSeek-R1 | 5,842 tokens/second/GPU | 2,907 tokens/second/GPU | ** |
| Llama 3.1 405B | 224 tokens/second/GPU | 170 tokens/second/GPU | 138 tokens/second/GPU |
| Llama 2 70B 99.9% | 12,934 tokens/second/GPU | 12,701 tokens/second/GPU | 7,856 tokens/second/GPU |
| Llama 2 70B 99% | 13,015 tokens/second/GPU | 12,701 tokens/second/GPU | 7,856 tokens/second/GPU |
| Llama 3.1 8B | 18,370 tokens/second/GPU | 16,099 tokens/second/GPU | 15,284 tokens/second/GPU |
| Stable Diffusion XL | 4.07 samples/second/GPU | 3.59 queries/second/GPU | ** |
| Mixtral 8x7B | 16,099 tokens/second/GPU | 16,131 tokens/second/GPU | ** |
| DLRMv2 99% | 87,228 samples/second/GPU | 80,515 samples/second/GPU | ** |
| DLRMv2 99.9% | 48,666 samples/second/GPU | 46,259 queries/second/GPU | ** |
| Whisper | 5,667 tokens/second/GPU | ** | ** |
| R-GAT | 81,404 samples/second/GPU | ** | ** |
| Retinanet | 1,875 samples/second/GPU | 1,801 queries/second/GPU | ** |
Furthermore, NVIDIA’s Blackwell Ultra has set new reasoning benchmarks at MLPerf, outperforming the previous Hopper architecture by 4.7x in offline mode and 5.2x in server configurations, a substantial leap in efficiency. The table below breaks down the comparison, and a short worked calculation follows it.
DeepSeek-R1 Performance Comparison

| Architecture | Offline | Server |
| --- | --- | --- |
| Hopper | 1,253 tokens/second/GPU | 556 tokens/second/GPU |
| Blackwell Ultra | 5,842 tokens/second/GPU | 2,907 tokens/second/GPU |
| Blackwell Ultra Advantage | 4.7x | 5.2x |
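The advantage row is just the ratio of the two per-GPU throughputs. The short snippet below recomputes the multipliers from the table values, purely as a worked example.

```python
# Recompute the Blackwell Ultra vs. Hopper multipliers from the DeepSeek-R1
# per-GPU throughputs listed in the table above (tokens/second/GPU).
hopper = {"Offline": 1_253, "Server": 556}
blackwell_ultra = {"Offline": 5_842, "Server": 2_907}

for scenario, baseline in hopper.items():
    speedup = blackwell_ultra[scenario] / baseline
    print(f"{scenario}: {speedup:.1f}x")  # Offline: 4.7x, Server: 5.2x
```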
As we look forward to future MLPerf submissions, it’s anticipated that NVIDIA, AMD, and Intel will continue to enhance their platforms, striving for even greater performance levels in this competitive landscape.