AMD Launches First “UEC-Ready” Pensando Pollara 400 AI NIC, Achieving 400GbE Speeds


AMD unveiled its Pensando Pollara 400 AI NIC at Hot Chips 2025, introducing the industry's first Ultra Ethernet Consortium (UEC)-ready AI network interface card (NIC).

AMD Enhances Performance by 25% with the 400GbE Pensando Pollara 400 AI NIC

AMD first showcased the Pensando Pollara 400 last year. This NIC, designed specifically for AI systems, delivers 400 Gbps of bandwidth, putting it in direct competition with NVIDIA’s ConnectX-7. NVIDIA has since launched the more advanced ConnectX-8, which reaches 800GbE in its latest Blackwell Ultra systems.

Image: AMD Pensando Pollara 400 AI NIC, the industry’s first Ultra Ethernet AI NIC, running at 400 Gbps.

The Pensando Pollara 400 comes packed with several cutting-edge features:

  • Programmable Hardware Pipeline
  • Up To 1.25x Performance Enhancement
  • 400 Gbps Throughput
  • Open Ecosystem Compatibility
  • UEC Ready RDMA Capabilities
  • Reduced Job Completion Time
  • Exceptional Availability
Image: AMD Instinct system architecture with Infinity Fabric and PCIe switch connections.

The architecture of the Pensando Networking solutions is closely aligned with AMD’s existing Data Center architectures, particularly the EPYC and Instinct families, which utilize PCIe switches to efficiently connect NICs and CPUs.

Image: AMD’s data center portfolio of CPUs, GPUs, and networking hardware.

Importantly, the Pensando NIC operates without a PCIe switch, connecting directly over a PCIe Gen5 x16 link. The underlying architecture is outlined in the following diagram:

Image: Block diagram of the AMD Pensando NIC architecture with NOC interconnect and P4DMA components.

Through the utilization of a P4 architecture, the Pensando Pollara 400 AI NIC achieves remarkable efficiency.

Image: AMD Pensando P4 architecture diagram showing packet processing and memory flow paths.

The architecture’s key components include the Table Engine (TE), which generates table lookup keys from the packet header vector (PHV) and issues the corresponding memory reads based on data type.
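To make the TE’s role concrete, here is a minimal Python sketch of what a table-engine stage does conceptually: it assembles a lookup key from selected PHV fields and uses the key to drive a table read. All field names and the table layout are invented for illustration; the real pipeline does this in hardware.

```python
# Hypothetical sketch of a Table Engine (TE) stage: build a lookup key
# from selected packet-header-vector (PHV) fields, then use it to index
# a match table. Field names and table contents are invented here.

def build_table_key(phv: dict, key_fields: list) -> tuple:
    """Concatenate the selected PHV fields into a match key."""
    return tuple(phv[f] for f in key_fields)

# A toy exact-match table: key -> action data.
match_table = {
    ("10.0.0.1", "10.0.0.2", 4791): {"action": "forward", "port": 3},
}

phv = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "dst_port": 4791}
key = build_table_key(phv, ["src_ip", "dst_ip", "dst_port"])
entry = match_table.get(key)  # the memory read driven by the key
```

In the hardware pipeline the key also selects which memory region to read and how to interpret the returned data; the dictionary lookup above only captures the match step.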

Image: P4 pipeline components, highlighting Table Engine key generation and memory access.

The design also features the Match Processing Unit (MPU), a specialized processor with opcodes optimized for field manipulation and distinct memory, table, and PHV interfaces.

Image: P4 pipeline components, showing the Match Processing Unit and its interfaces.

Additionally, innovations such as Virtual Address to Physical Address (va2pa) translation capabilities enhance system performance further.
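As a rough illustration of what va2pa translation involves, the following sketch performs the address arithmetic for a single-level page table with 4 KiB pages. The NIC’s actual translation hardware is far more involved; this only shows the split of a virtual address into page number and offset.

```python
# Minimal sketch of virtual-to-physical address (va2pa) translation,
# assuming a single-level page table and 4 KiB pages. Purely
# illustrative; not the Pensando hardware mechanism.

PAGE_SIZE = 4096  # 4 KiB pages

def va2pa(vaddr: int, page_table: dict) -> int:
    vpn = vaddr // PAGE_SIZE    # virtual page number
    offset = vaddr % PAGE_SIZE  # offset within the page
    ppn = page_table[vpn]       # physical page number (KeyError = fault)
    return ppn * PAGE_SIZE + offset

page_table = {0x10: 0x2A}       # VPN 0x10 maps to PPN 0x2A
pa = va2pa(0x10123, page_table)  # -> 0x2A123
```

Doing this translation on the NIC lets it DMA directly against application virtual addresses without a round trip to the host for every access.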

Image: Flowchart of the virtual-to-physical address translation process.

In terms of atomic memory operations, AMD has implemented them adjacent to SRAM systems for greater efficiency.
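The idea of placing atomics next to the memory can be modeled as shipping the operation to the data rather than pulling the data across the interconnect, modifying it, and writing it back. The sketch below is purely conceptual, with a lock standing in for the hardware’s serialization logic; it is not AMD’s design.

```python
# Conceptual model of "near-memory" atomics: the operation executes in
# logic adjacent to the SRAM, serialized locally, instead of as a
# read-modify-write across the interconnect. Illustration only.
import threading

class NearMemoryAtomics:
    def __init__(self, size: int):
        self._sram = [0] * size
        self._lock = threading.Lock()  # stands in for hardware serialization

    def fetch_add(self, addr: int, value: int) -> int:
        """Atomically add `value` at `addr`, returning the old value."""
        with self._lock:  # executed adjacent to the SRAM
            old = self._sram[addr]
            self._sram[addr] = old + value
            return old

mem = NearMemoryAtomics(16)
mem.fetch_add(3, 5)
old = mem.fetch_add(3, 2)  # old value was 5; address 3 now holds 7
```

The efficiency win comes from the value never leaving the SRAM’s neighborhood: only the operation and its result cross the interconnect.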

Image: Overview of AMD’s atomic-operation enhancements and their benefits for SRAM memory.

The Pipeline Cache Coherency employs invalidate/update logic, ensuring P4 coherency operates effectively on an address range basis.
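A toy model of range-based invalidation: when a range of addresses is updated, any cached lines overlapping that range are dropped so subsequent pipeline reads fetch fresh data. The line size and addresses below are invented for the example.

```python
# Toy model of range-based invalidate logic for pipeline cache
# coherency. Cache-line size and addresses are invented.

LINE = 64  # assumed cache-line size in bytes

class RangeCoherentCache:
    def __init__(self):
        self._lines = {}  # line-aligned address -> cached bytes

    def fill(self, addr: int, data: bytes):
        self._lines[addr - addr % LINE] = data

    def invalidate_range(self, start: int, length: int):
        """Drop every cached line overlapping [start, start + length)."""
        first = start - start % LINE
        end = start + length - 1
        last = end - end % LINE
        for line in list(self._lines):
            if first <= line <= last:
                del self._lines[line]

cache = RangeCoherentCache()
cache.fill(0x100, b"stale")
cache.fill(0x1000, b"unrelated")
cache.invalidate_range(0x100, 8)  # overlaps only the first line
```

Operating on address ranges rather than individual lines matches how a P4 pipeline stage updates whole table regions at once.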

Image: Pipeline cache coherency enhancements with invalidate/update logic.

AMD identifies several challenges impacting AI system performance across Scale-out networks. Issues such as inefficient link utilization tied to ECMP load balancing, network congestion, and packet loss hinder overall effectiveness.
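The ECMP problem can be shown in a few lines: hash-based ECMP assigns each flow to a link by hashing its 5-tuple, and with only a handful of large AI flows, several can land on the same link while others sit idle. The flow tuples and link count below are made up for illustration.

```python
# Illustration of hash-based ECMP load balancing: each flow's 5-tuple
# is hashed to pick an egress link. With few large "elephant" flows,
# the hash can crowd several flows onto one link. Values are invented.
import zlib

NUM_LINKS = 4

def ecmp_link(flow: tuple) -> int:
    """Deterministically map a flow 5-tuple to a link index."""
    return zlib.crc32(repr(flow).encode()) % NUM_LINKS

flows = [
    ("10.0.0.1", "10.0.1.1", 6, 49152, 4791),
    ("10.0.0.2", "10.0.1.2", 6, 49153, 4791),
    ("10.0.0.3", "10.0.1.3", 6, 49154, 4791),
    ("10.0.0.4", "10.0.1.4", 6, 49155, 4791),
]

links_used = {ecmp_link(f) for f in flows}
# With so few flows, links_used frequently covers fewer than NUM_LINKS
# links: some links carry multiple heavy flows while others carry none.
```

Packet-spraying and flow-aware schemes like those in the UEC transport aim to spread this load more evenly than a static per-flow hash can.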

Image: AI scale-out network challenges, including congestion and packet loss.

The company also highlights that AI networks experience significantly higher utilization rates compared to general-purpose networks, often pushing the limits of network bandwidth availability.

Image: AI backend networks drive data transfers at up to 95% network utilization.

AMD presents the Ultra Ethernet Consortium (UEC) as a vital solution to overcoming these obstacles. The UEC fosters an open, interoperable, high-performance framework designed to address the networking requirements essential for AI and high-performance computing (HPC) applications at scale.

Image: Ultra Ethernet Consortium: open, scalable, cost-effective Ethernet for AI and HPC demands.

Designed for efficiency and affordability, the UEC aims to fulfill the significant demands increasingly placed on modern data networks.

Image: AMD Pensando Pollara 400 AI NIC with RDMA, UEC AI transport, congestion control, and fast recovery.

Additional advantages of the UEC include enhanced routing techniques and network management solutions designed to address issues related to congestion and packet loss.

Image: Pollara RDMA vs. RoCEv2 RPC performance chart, highlighting network efficiency gains.

In summary, AMD’s UEC-ready Pensando Pollara 400 RDMA NIC demonstrates a 25% performance improvement over RoCEv2 with 4 queue pairs, and a notable 40% improvement over RoCEv2 with 1 queue pair, solidifying AMD’s position in AI networking.

