Intel Partners with NVIDIA’s Blackwell Ecosystem for a Hybrid Rack-Scale AI Platform to Revitalize its AI Chips

Intel has made headlines by pairing its Gaudi 3 rack-scale solution with NVIDIA’s technology stack. The hybrid design combines Intel’s own AI accelerators with NVIDIA’s Blackwell GPUs, with the goal of delivering a substantial boost in inference performance.

Intel Unveils a Hybrid AI Server with NVIDIA’s Blackwell Technology

The Gaudi line of AI chips from Intel has gained traction in the industry; however, the company has faced stiff competition from giants like NVIDIA and AMD in capturing revenue from the burgeoning AI sector. To address this challenge, Intel is reimagining its strategy for the Gaudi platform. As reported by SemiAnalysis, the company is set to introduce the Gaudi 3 rack-scale system, which incorporates NVIDIA’s Blackwell B200 GPU as part of a hybrid architecture, complemented by NVIDIA’s ConnectX networking technology.

This announcement stands out as one of the key highlights of the recent OCP Global Summit, where Intel aims to carve out a distinctive niche within the rack-scale AI market. The proposed system employs Intel’s Gaudi 3 chips to handle the ‘decode’ phase of inference workloads, while the B200 GPUs take on the more compute-intensive ‘prefill’ phase. Blackwell GPUs excel at large matrix multiplications, making them a natural choice for prefill operations.
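
To make the division of labor concrete, here is a minimal Python sketch of a disaggregated prefill/decode pipeline. It is purely illustrative: the `PrefillWorker` and `DecodeWorker` classes are hypothetical and do not correspond to any Intel or NVIDIA API; in the real rack, the prefill tier would map to the B200 GPUs and the decode tier to the Gaudi 3 chips.

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt_tokens: list[int]
    kv_cache: dict | None = None                 # produced by the prefill stage
    output_tokens: list[int] = field(default_factory=list)

class PrefillWorker:
    """Hypothetical stand-in for the compute-bound stage (the B200s in this rack)."""
    def run(self, req: Request) -> Request:
        # Prefill consumes the whole prompt in large, parallel matrix multiplications
        # and produces the KV cache that decoding will reuse.
        req.kv_cache = {"context_len": len(req.prompt_tokens)}   # placeholder cache
        return req

class DecodeWorker:
    """Hypothetical stand-in for the bandwidth-bound stage (the Gaudi 3 chips)."""
    def step(self, req: Request) -> int:
        # Each decode step rereads the KV cache to emit a single token, so memory
        # bandwidth, not raw matmul throughput, is the bottleneck.
        token = req.kv_cache["context_len"] + len(req.output_tokens)  # dummy token id
        req.output_tokens.append(token)
        return token

def serve(req: Request, max_new_tokens: int = 4) -> list[int]:
    req = PrefillWorker().run(req)    # would run on the prefill tier (B200)
    decoder = DecodeWorker()          # would run on the decode tier (Gaudi 3)
    for _ in range(max_new_tokens):
        decoder.step(req)
    return req.output_tokens

print(serve(Request(prompt_tokens=[101, 7592, 2088])))
```

The split matters because the two phases stress different resources: prefill is dominated by large batched matrix math, while decode is dominated by repeatedly streaming the KV cache from memory, which is where Gaudi 3’s bandwidth-oriented design is meant to pay off.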

Intel’s Gaudi 3 rack-scale system, showing the compute and switch trays (Image credit: SemiAnalysis)

In this configuration, the Gaudi 3 side of the architecture prioritizes memory bandwidth and Ethernet-centric scalability. On the connectivity front, the setup pairs NVIDIA’s ConnectX-7 400 GbE NICs, mounted on the compute trays, with Broadcom’s Tomahawk 5 switches, which deliver 51.2 Tb/s of throughput for full-rack connectivity. According to SemiAnalysis, each compute tray houses two Xeon CPUs, four Gaudi 3 AI chips, and four NICs, in addition to an NVIDIA BlueField-3 DPU, with sixteen trays per rack.
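
Taking those per-tray figures at face value, a quick back-of-the-envelope calculation gives the per-rack totals (the numbers come from SemiAnalysis’ reporting, not an official Intel spec sheet):

```python
# Per-compute-tray parts list as reported by SemiAnalysis
xeons_per_tray  = 2
gaudi3_per_tray = 4
nics_per_tray   = 4      # ConnectX-7, 400 GbE each
dpus_per_tray   = 1      # BlueField-3
trays_per_rack  = 16

rack_totals = {
    "Xeon CPUs":        xeons_per_tray  * trays_per_rack,   # 32
    "Gaudi 3 chips":    gaudi3_per_tray * trays_per_rack,   # 64
    "ConnectX-7 NICs":  nics_per_tray   * trays_per_rack,   # 64
    "BlueField-3 DPUs": dpus_per_tray   * trays_per_rack,   # 16
}
# Aggregate NIC bandwidth across the rack (64 NICs x 400 Gb/s = 25,600 Gb/s),
# comfortably within the 51.2 Tb/s a Tomahawk 5 switch can handle.
rack_totals["Aggregate NIC bandwidth (Gb/s)"] = nics_per_tray * trays_per_rack * 400

for part, count in rack_totals.items():
    print(f"{part}: {count}")
```

Those 64 Gaudi 3 chips per rack line up with the “Rack Scale64” branding shown on the system at the summit.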

Intel’s Gaudi 2 rack

The Gaudi platform is positioned as a cost-effective decoding engine in a landscape dominated by NVIDIA. The strategy is pragmatic: rather than competing head-on, Intel seeks to improve its market standing through a collaborative arrangement. Intel claims the rack-scale architecture can deliver 1.7 times the performance of a B200-only baseline in prefill tasks; these results await independent validation.

While this hybrid setup paints an optimistic picture, challenges remain. The Gaudi platform is still held back by an immature software ecosystem, which may impede broader adoption. Furthermore, with the Gaudi architecture slated to be phased out in the coming months, it is uncertain whether this configuration will achieve the same mainstream acceptance as competing solutions.
