AMD Partners with GlobalFoundries for MI500 Co-Packaged Optics Amidst Intensifying Silicon Photonics Competition with NVIDIA

Advanced Micro Devices (AMD) is set to work with GlobalFoundries on its Co-Packaged Optics (CPO) solution, a key component of the upcoming Instinct MI500 AI accelerators.

Collaboration Between GlobalFoundries and AMD on Next-Gen Co-Packaged Optics

Co-Packaged Optics (CPO), which builds on silicon photonics, reduces dependence on copper wiring by using light for signal transmission. The optical engines are integrated directly into the same package as hardware accelerators such as GPUs, which lowers interconnect latency and enables the high-bandwidth communication between CPUs and GPUs that future AI data centers will require.

Both AMD and NVIDIA are preparing to use CPO in their next-generation AI graphics processing units (GPUs). AMD is developing its own micro-ring modulator (MRM) based CPO solution for the Instinct MI500 accelerators. GlobalFoundries will manufacture the Photonic Integrated Circuits (PICs) for the project, and ASE will handle packaging. Last year, AMD also acquired photonics specialist Enosemi to speed up its CPO development.

NVIDIA, meanwhile, is reportedly developing its own CPO PICs for the upcoming Vera Rubin accelerators. TSMC will fabricate the circuits, SPIL will handle packaging, and assembly will take place at Foxconn Industrial Internet, a Foxconn subsidiary. For the Rubin Ultra model, NVIDIA is prioritizing CPO over Near-Package Optics (NPO).

Further out, NVIDIA plans to fully adopt CPO in its Feynman generation of AI accelerators, ending its reliance on NPO.

The MI500 series will be built on a 2nm manufacturing process. The upcoming MI400 series also uses 2nm technology, but it will not be as advanced as the MI500. The MI500 accelerators will be based on the CDNA 6 architecture, while the MI400 uses CDNA 5. The MI500 will also move to HBM4E memory, which is expected to push memory bandwidth beyond the 19.6 TB/s of the HBM4-equipped MI400.

Despite earlier speculation, AMD has confirmed that it will keep the CDNA naming convention for its Instinct GPU architectures and will not switch to UDNA branding.

Image: AMD roadmap slide titled 'Extending the Leadership Roadmap', showing the Instinct MI300A/X for 2023, MI325X for 2024, MI350 Series for 2025, MI400 Series for 2026, and MI500 Series for 2027.

AMD is making significant promises regarding AI performance advancements with the launch of the Instinct MI500 series, aiming for more than a 1000x increase in AI capabilities within a four-year timeline. This ambitious goal is crucial to meet surging AI demands and maintain competitiveness, especially as rivals intensify their own technological pursuits. The MI500 is anticipated to hit the market in 2027.
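To put the 1000x figure in perspective, the short calculation below works out the average yearly improvement it would imply. It assumes the gain compounds evenly across the four years, which is an illustrative simplification and not something AMD has stated; the function name is hypothetical.

```python
def annual_growth_factor(total_gain: float, years: float) -> float:
    """Average per-year multiplier needed to reach total_gain after `years`.

    Assumes the gain compounds evenly each year (illustrative assumption).
    """
    return total_gain ** (1.0 / years)


# 1000x over four years, the target quoted in the article
factor = annual_growth_factor(1000.0, 4.0)
print(f"Implied average gain per year: {factor:.2f}x")
```

Under that assumption, AI performance would need to improve by roughly 5.6x every year, on average, to reach 1000x in four years.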

Overview of AMD Instinct AI Accelerators

| Accelerator Name | AMD Instinct MI500 | AMD Instinct MI400 | AMD Instinct MI350X | AMD Instinct MI325X | AMD Instinct MI300X | AMD Instinct MI250X |
| --- | --- | --- | --- | --- | --- | --- |
| GPU Architecture | CDNA 6 | CDNA 5 | CDNA 4 | Aqua Vanjaram (CDNA 3) | Aqua Vanjaram (CDNA 3) | Aldebaran (CDNA 2) |
| GPU Process Node | 2nm | 2nm + 3nm | 3nm | 5nm + 6nm | 5nm + 6nm | 6nm |
| XCDs (Chiplets) | TBD | 8 (MCM) | 8 (MCM) | 8 (MCM) | 8 (MCM) | 2 (MCM), 1 (Per Die) |
| GPU Cores | TBD | TBD | 16,384 | 19,456 | 19,456 | 14,080 |
| GPU Clock Speed (Max) | TBD | TBD | 2400 MHz | 2100 MHz | 2100 MHz | 1700 MHz |
| INT8 Compute | TBD | TBD | 5200 TOPS | 2614 TOPS | 2614 TOPS | 383 TOPS |
| FP6/FP4 Matrix | TBD | 40 PFLOPs | 20 PFLOPs | N/A | N/A | N/A |
| FP8 Matrix | TBD | 20 PFLOPs | 5 PFLOPs | 2.6 PFLOPs | 2.6 PFLOPs | N/A |
| FP16 Matrix | TBD | 10 PFLOPs | 2.5 PFLOPs | 1.3 PFLOPs | 1.3 PFLOPs | 383 TFLOPs |
| FP32 Vector | TBD | TBD | 157.3 TFLOPs | 163.4 TFLOPs | 163.4 TFLOPs | 95.7 TFLOPs |
| FP64 Vector | TBD | TBD | 78.6 TFLOPs | 81.7 TFLOPs | 81.7 TFLOPs | 47.9 TFLOPs |
| VRAM | HBM4E | 432 GB HBM4 | 288 GB HBM3e | 256 GB HBM3e | 192 GB HBM3 | 128 GB HBM2e |
| Infinity Cache | TBD | TBD | 256 MB | 256 MB | 256 MB | N/A |
| Memory Clock | TBD | TBD | 8.0 Gbps | 5.9 Gbps | 5.2 Gbps | 3.2 Gbps |
| Memory Bus | TBD | TBD | 8192-bit | 8192-bit | 8192-bit | 8192-bit |
| Memory Bandwidth | TBD | 19.6 TB/s | 8 TB/s | 6.0 TB/s | 5.3 TB/s | 3.2 TB/s |
| Form Factor | TBD | TBD | OAM | OAM | OAM | OAM |
| Cooling | TBD | Passive / Liquid | Passive / Liquid | Passive Cooling | Passive Cooling | Passive Cooling |
| TDP (Max) | TBD | TBD | 1400W (MI355X) | 1000W | 750W | 560W |
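The Memory Clock, Memory Bus, and Memory Bandwidth rows in the table are related: peak bandwidth is roughly the bus width multiplied by the per-pin data rate, divided by eight to convert bits to bytes. The sketch below applies that relation to the four accelerators with complete figures. The helper name is hypothetical, and treating 1 TB/s as 1000 GB/s is an assumption about how the table figures were rounded.

```python
def peak_bandwidth_tbps(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak memory bandwidth in TB/s.

    bus width (bits) x per-pin rate (Gb/s) / 8 bits per byte gives GB/s;
    dividing by 1000 converts to TB/s (decimal units, an assumption).
    """
    return bus_width_bits * pin_rate_gbps / 8 / 1000


# (bus width in bits, per-pin data rate in Gbps) as listed in the table
parts = {
    "MI350X": (8192, 8.0),
    "MI325X": (8192, 5.9),
    "MI300X": (8192, 5.2),
    "MI250X": (8192, 3.2),
}

for name, (bus_bits, rate_gbps) in parts.items():
    print(f"{name}: {peak_bandwidth_tbps(bus_bits, rate_gbps):.2f} TB/s")
```

The computed values land close to the Memory Bandwidth figures listed for these parts, with small differences that come down to rounding in the published numbers.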

For additional insights, you can refer to the latest updates from @jukan05.

For images and more details, visit Wccftech.
