AMD Instinct MI350 GPU: Unleashing AI Power with 3nm 3D Chiplet, CDNA 4 Architecture, 185 Billion Transistors, 1400W TBP, and 288GB Memory for Over 4000B LLM Support

At Hot Chips 2025, AMD unveiled comprehensive details about its latest Instinct MI350 AI accelerator, powered by the innovative CDNA 4 architecture. This announcement comes a mere two months after the initial launch of the MI350 series, designed specifically for demanding AI workloads.

AMD Unveils Architectural Insights of Instinct MI350 at Hot Chips 2025, Positioned for Expansive LLMs

AMD Instinct MI350 GPUs showcased at Hot Chips 2025.

The MI350 series responds to the rapid growth of large language models (LLMs), which demands advances in both data formats and on-package memory capacity. AMD targets both areas to improve the performance and efficiency of AI processing.

Trends in Large AI Models: Growth in Parameter Count, Context Length, Agentic AI Processing

The enhancements in the CDNA 4 architecture increase both the capacity and the bandwidth of High Bandwidth Memory (HBM), enabling faster AI training and inference on larger models. Link speeds are also higher, with gains in power efficiency and overall performance.

Generative AI needs: GPU memory, bandwidth, ALUs, power efficiency, large-scale model training.

This new architecture achieves faster processing by optimizing power delivery and enhancing connectivity through the Infinity Fabric for better bandwidth efficiency during operations. It also supports various lower precision data formats, such as FP8 and industry-standard micro-scaled MXFP6 and MXFP4 types.
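The micro-scaled formats named above store a small element type plus one shared scale per block. The sketch below illustrates that idea for MXFP4, assuming the layout described in the OCP Microscaling specification (blocks of 32 elements, FP4 E2M1 elements, an 8-bit power-of-two scale). It is a simplified illustration, not AMD's hardware implementation; the function names are hypothetical, and ties are rounded to the lower grid point rather than to nearest-even.

```python
import math

# Magnitudes representable by an FP4 E2M1 element (sign is handled separately).
FP4_E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_EMAX = 2        # largest E2M1 exponent: 6.0 = 1.5 * 2**2
BLOCK_SIZE = 32     # assumed number of elements sharing one scale


def quantize_block_mxfp4(block):
    """Return (shared_exponent, dequantized_values) for one block of floats."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 0, [0.0] * len(block)
    # Power-of-two scale that places the largest element near the top of the grid.
    shared_exp = math.floor(math.log2(amax)) - FP4_EMAX
    scale = 2.0 ** shared_exp
    out = []
    for v in block:
        mag = min(abs(v) / scale, FP4_E2M1_GRID[-1])          # saturate at 6.0
        q = min(FP4_E2M1_GRID, key=lambda g: abs(g - mag))    # nearest grid point
        out.append(math.copysign(q * scale, v))
    return shared_exp, out


def bits_per_element(element_bits, scale_bits=8, block_size=BLOCK_SIZE):
    """Average storage cost once the shared scale is amortized over the block."""
    return element_bits + scale_bits / block_size


if __name__ == "__main__":
    exp, values = quantize_block_mxfp4([1.0, -3.0, 0.5, 6.0])
    print(exp, values)
    print(bits_per_element(4), bits_per_element(6))
```

Under these assumptions, MXFP4 costs 4.25 bits per value and MXFP6 costs 6.25 bits, which is why these formats reduce both memory footprint and bandwidth demand compared with FP8 or FP16.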

MI350 Series Variants and Specifications

The AMD MI350 series includes the MI350X, an air-cooled design with a total board power (TBP) of 1000W and a peak clock speed of 2.2 GHz. At the higher end, the MI355X targets liquid-cooled data centers, with a TBP of 1400W and a maximum clock speed of 2.4 GHz.

AMD Instinct MI350 GPU specs: 185B transistors and advanced 3D chiplet design.

These specifications rest on a 185-billion-transistor design in a 3D multi-chiplet configuration. The package includes HBM3e memory and uses both 3nm and 6nm process technologies to balance cost and performance.

AMD Instinct MI350 chiplet architecture diagram.

Architectural Breakdown and Capabilities

Each MI350 package uses a total of eight Accelerator Complex Dies (XCDs), built on TSMC's 3nm process. The dies are linked by an interconnect designed for high throughput.

The I/O base dies use a more mature 6nm process, which improves yield and cost-efficiency. The package has eight HBM3e sites, providing 288 GB of memory across the accelerator.
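The capacity figures above lend themselves to a quick back-of-the-envelope check. The sketch below divides 288 GB across the eight HBM3e sites and estimates an upper bound on how many parameters fit if weights alone filled the memory. The function names are hypothetical, decimal gigabytes are assumed, and activations, KV cache, and runtime overhead are ignored, so real limits are lower.

```python
HBM_SITES = 8        # from the article
TOTAL_HBM_GB = 288   # from the article


def gb_per_site(total_gb=TOTAL_HBM_GB, sites=HBM_SITES):
    """Capacity of each HBM3e site, assuming an even split."""
    return total_gb / sites


def max_params_billions(capacity_gb, bits_per_param):
    """Upper bound on parameter count (in billions) if weights alone filled memory."""
    bytes_per_param = bits_per_param / 8
    # GB divided by bytes-per-parameter gives billions of parameters (decimal GB).
    return capacity_gb / bytes_per_param


if __name__ == "__main__":
    print(f"per site: {gb_per_site():.0f} GB")
    for label, bits in [("FP16", 16), ("FP8", 8), ("4-bit", 4)]:
        print(f"{label}: up to {max_params_billions(TOTAL_HBM_GB, bits):.0f}B parameters")
    # Assumed eight-accelerator node with 4.25 bits per weight (4-bit elements plus
    # a shared 8-bit scale per 32 values); the article does not state the
    # configuration behind its headline model-size figure.
    print(f"8 GPUs, 4.25 bits: {max_params_billions(8 * TOTAL_HBM_GB, 4.25):.0f}B parameters")
```

Each site works out to 36 GB. A single accelerator holds at most 144B parameters at FP16 and 576B at 4 bits per weight, while the assumed eight-accelerator node comes to roughly 4,300B at 4.25 bits per weight.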

The memory subsystem also supports several configurations, with an internal memory hierarchy and cache tiering designed to sustain performance during data-intensive operations.

Performance Metrics and Competitive Edge

In raw compute, the MI350 series improves considerably on its predecessors, reaching up to 20 PFLOPs of FP4/FP6 throughput, a fourfold uplift that arrives alongside the move to HBM3e and the associated cache improvements.
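The uplift can be checked against the peak figures in the comparison table later in this article. The sketch below computes a few ratios from those table values; the dictionary layout and the `uplift` helper are hypothetical, and the article does not say which baseline its fourfold figure uses.

```python
# Peak matrix throughput in PFLOPs, copied from this article's comparison table.
PEAK_PFLOPS = {
    "MI350X": {"FP4/FP6": 20.0, "FP8": 5.0, "FP16": 2.5},
    "MI300X": {"FP8": 2.6, "FP16": 1.3},
}


def uplift(new, old, fmt_new, fmt_old=None):
    """Ratio of peak throughput between two (accelerator, format) pairs."""
    fmt_old = fmt_old or fmt_new
    return PEAK_PFLOPS[new][fmt_new] / PEAK_PFLOPS[old][fmt_old]


if __name__ == "__main__":
    print(f"FP8, MI350X vs MI300X:         {uplift('MI350X', 'MI300X', 'FP8'):.2f}x")
    print(f"FP16, MI350X vs MI300X:        {uplift('MI350X', 'MI300X', 'FP16'):.2f}x")
    print(f"FP4/FP6 vs FP8 on MI350X:      {uplift('MI350X', 'MI350X', 'FP4/FP6', 'FP8'):.2f}x")
    print(f"MI350X FP4/FP6 vs MI300X FP8:  {uplift('MI350X', 'MI300X', 'FP4/FP6', 'FP8'):.2f}x")
```

With the table's numbers, like-for-like FP8 and FP16 throughput roughly doubles generation over generation, while FP4/FP6 is exactly four times the FP8 rate on the same part. The largest gains therefore depend on running models in the new low-precision formats.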

AMD Instinct MI350 GPU performance uplift versus competitors.

AMD has indicated that the Instinct MI350 series will be available through multiple distribution partners beginning in Q3 2025. Future developments are also on the horizon, with the MI400 series anticipated to roll out in 2026.

AMD Instinct AI Accelerators Comparison:

Accelerator Name	AMD Instinct MI500	AMD Instinct MI400	AMD Instinct MI350X	AMD Instinct MI325X	AMD Instinct MI300X	AMD Instinct MI250X
GPU Architecture	CDNA Next / UDNA	CDNA Next / UDNA	CDNA 4	Aqua Vanjaram (CDNA 3)	Aqua Vanjaram (CDNA 3)	Aldebaran (CDNA 2)
GPU Process Node	TBD	TBD	3nm + 6nm	5nm + 6nm	5nm + 6nm	6nm
XCDs (Chiplets)	TBD	8 (MCM)	8 (MCM)	8 (MCM)	8 (MCM)	2 (MCM), 1 (Per Die)
GPU Cores	TBD	TBD	16,384	19,456	19,456	14,080
Max Clock Speed	TBD	TBD	2400 MHz	2100 MHz	2100 MHz	1700 MHz
INT8 Compute	TBD	TBD	5200 TOPS	2614 TOPS	2614 TOPS	383 TOPS
FP6/FP4 Matrix	TBD	40 PFLOPs	20 PFLOPs	N/A	N/A	N/A
FP8 Matrix	TBD	20 PFLOPs	5 PFLOPs	2.6 PFLOPs	2.6 PFLOPs	N/A
FP16 Matrix	TBD	10 PFLOPs	2.5 PFLOPs	1.3 PFLOPs	1.3 PFLOPs	383 TFLOPs
FP32 Vector	TBD	TBD	157.3 TFLOPs	163.4 TFLOPs	163.4 TFLOPs	95.7 TFLOPs
FP64 Vector	TBD	TBD	78.6 TFLOPs	81.7 TFLOPs	81.7 TFLOPs	47.9 TFLOPs
VRAM	TBD	432 GB HBM4	288 GB HBM3e	256 GB HBM3e	192 GB HBM3	128 GB HBM2e
Infinity Cache	TBD	TBD	256 MB	256 MB	256 MB	N/A
Memory Clock	TBD	TBD	8.0 Gbps	5.9 Gbps	5.2 Gbps	3.2 Gbps
Memory Bus	TBD	TBD	8192-bit	8192-bit	8192-bit	8192-bit
Memory Bandwidth	TBD	19.6 TB/s	8.0 TB/s	6.0 TB/s	5.3 TB/s	3.2 TB/s
Form Factor	TBD	TBD	OAM	OAM	OAM	OAM
Cooling	TBD	TBD	Passive / Liquid	Passive Cooling	Passive Cooling	Passive Cooling
TDP (Max)	TBD	TBD	1400W (355X)	1000W	750W	560W
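The memory rows of the table are related by simple arithmetic: peak bandwidth is the bus width multiplied by the per-pin data rate, divided by eight to convert bits to bytes. The sketch below applies that formula to the table's figures as a consistency check. The helper name is hypothetical, and decimal units (1 TB/s = 1000 GB/s) are assumed.

```python
def hbm_bandwidth_tbps(bus_width_bits, pin_rate_gbps):
    """Peak bandwidth in TB/s: bus width * Gbit/s per pin / 8 bits per byte / 1000."""
    return bus_width_bits * pin_rate_gbps / 8 / 1000


# (bus width in bits, memory clock in Gbps, listed bandwidth in TB/s), from the table
TABLE_ROWS = {
    "MI350X": (8192, 8.0, 8.0),
    "MI325X": (8192, 5.9, 6.0),
    "MI300X": (8192, 5.2, 5.3),
    "MI250X": (8192, 3.2, 3.2),
}

if __name__ == "__main__":
    for name, (bus, rate, listed) in TABLE_ROWS.items():
        computed = hbm_bandwidth_tbps(bus, rate)
        print(f"{name}: computed {computed:.2f} TB/s, listed {listed} TB/s")
```

The computed values (8.19, 6.04, 5.32, and 3.28 TB/s) match the listed bandwidths to within rounding, so the generational bandwidth gains in the table come entirely from faster memory pins rather than a wider bus.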

