In-Depth Analysis of Intel Clearwater Forest “Xeon 6+” Processors: Featuring Up to 288 Darkmont E-Cores, 576 MB Cache, and 18A with Foveros D3D + EMIB 2.5D Technology

Intel has provided additional insights into its upcoming Xeon 6+ E-core CPU family, known as Clearwater Forest, which boasts an impressive core count of up to 288 next-generation cores.

Introducing Intel Clearwater Forest: 288 Next-Gen Darkmont E-Cores for High-Density Compute Servers

Building on the strides made by its predecessor, Sierra Forest—the first E-Core dedicated Xeon CPU that offered enhanced compute density alongside performance efficiency—Intel is making significant advancements with Clearwater Forest. This marks a notable evolution in Intel’s Xeon lineup, which is now organized into separate families for Performance Cores (P-Core) and Efficient Cores (E-Core).

Intel logo next to 'Data Center Strategic Overview' on a microchip background.

Clearwater Forest marks the beginning of the second generation of E-Core-only CPUs, sold under the Xeon 6+ branding.

Introducing Intel Xeon 6+ and Intel Xeon 6 CPUs, previously codenamed Granite Rapids and Sierra Forest.

Advanced Technology: Intel 18A, RibbonFET & Power Via with Foveros Direct3D

With Clearwater Forest, Intel is elevating its disaggregated architecture and advanced packaging solutions. This new chip structure employs a multilayer design featuring various chiplets and components, showcasing Intel’s engineering prowess.

Intel Clearwater Forest server CPU details screen with features like 288 E-cores and Intel 18A.

The Clearwater Forest architecture integrates twelve EMIB tiles using 2.5D packaging technology. This configuration connects three active base tiles, two I/O tiles, and a total of twelve compute tiles. The I/O tiles are built on the Intel 7 node, active base tiles utilize the Intel 3 process node, and compute chiplets are produced with the cutting-edge Intel 18A technology.

Clearwater Forest Architecture diagram showing 12x Compute tile with Intel 18A and other components.

Each compute chiplet, featuring the Darkmont E-Core design, is built on the 18A process node, which uses RibbonFET transistors to improve power efficiency through reduced gate capacitance. The 18A process also achieves a cell density of over 90% and improves signal routing by moving the power rails to the backside, cutting energy losses by 4-5%.

Intel 18A Process infographic highlighting benefits like higher cell density.

RibbonFET technology enhances electrical current management and lowers power leakage, yielding notable performance benefits. This innovation enables a tighter grip on electrical currents while maintaining lower operational voltages, with the resulting shorter gate lengths contributing to a 20% decrease in power consumption per transistor.

RibbonFET Technology diagram with highlighted features such as electrical current control.

Key features of the RibbonFET technology include:

  • Enhanced miniaturization of chip components for high-density CPUs
  • Precise control over electrical currents in the transistor channel
  • Improved performance per watt and operational efficiency
  • Tunable parameters enabled through ribbon widths and various threshold voltage types

PowerVia technology complements RibbonFET by raising standard cell utilization by up to 10% and iso-power performance by 4%. It delivers power from underneath the silicon, enhancing overall chip performance.

Intel PowerVia diagram outlining key features.

Highlights of PowerVia technology include:

  • Reduced power distribution congestion, boosting overall chip performance
  • Redistribution of coarse-pitch metals to optimize layout
  • Backside die integration for efficient power management
  • Nano-scale TSVs for enhanced power distribution
  • Superior signal routing capabilities
  • Over 90% cell density for optimized space utilization

Additionally, Clearwater Forest is set to be the first product in high-volume production to use Foveros Direct3D, a packaging technology that links compute and I/O tiles to the active base tiles. Its 9 µm bump pitch keeps power consumption low while allowing efficient data transfer between tiles.

The following 3D construction overview illustrates the Clearwater Forest Xeon 6+ CPU architecture:

Intel Clearwater Forest 3D construction diagram with labeled chiplets.

Exploring the Three Primary Tiles of Clearwater Forest

The Clearwater Forest architecture comprises three main tiles: the Compute Tile, I/O Tile, and Base Tile.

Clearwater Forest I/O Tile

Each I/O tile is built on the Intel 7 process and integrates eight accelerators, including Intel QuickAssist Technology, Intel Dynamic Load Balancer, Intel Data Streaming Accelerator, and Intel In-Memory Analytics Accelerator. With two I/O tiles per package, that totals 16 accelerators.

Intel I/O Tile Architecture diagram featuring technological details.

Each I/O tile comes equipped with 48 PCIe Gen 5.0 lanes (totaling 96), 32 CXL 2.0 lanes (totaling 64), and 96 UPI 2.0 lanes (totaling 192). While unchanged from Granite Rapids, this design is a significant upgrade over Sierra Forest.

Clearwater Forest Base Tile

The package contains three active base tiles, linked via EMIB and built on the Intel 3 process, with the compute tiles stacked above them. Each base tile contains four DDR5 memory controllers, for a total of 12 memory channels. The base tiles also provide a shared LLC of 48 MB per compute tile, adding up to 576 MB of on-package LLC.

Clearwater Forest Compute Tile

The compute tiles represent the most advanced aspect of Clearwater Forest, featuring the new 18A process technology. Each tile is structured with six modules, each hosting four Darkmont E-Cores, culminating in 24 E-Cores per compute tile and 288 E-Cores across all twelve tiles.

Intel Tech Tour slide displaying Compute Tile Architecture with details on modules and E-cores.
Intel Compute Tile Architecture infographic showcasing specifications.
Intel Compute Tile Architecture highlighting core specifications.
Intel Compute Tile Architecture diagram detailing Darkmont E-cores.

Moreover, each module includes 4 MB of L2 cache, translating to 24 MB per compute tile and a total of 288 MB of L2 cache throughout the twelve tiles. When combined with the LLC, the entire chip reaches 864 MB of cache.

  • 12x Compute Tiles (Intel 18A)
  • 3x Active Base Tiles (Intel 3)
  • 2x Intel I/O Tiles (Intel 7)
  • 12x EMIB Tiles (EMIB 2.5D)
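
The core and cache totals quoted above follow from multiplying down the tile hierarchy. The short Python sketch below reproduces them using only the per-module and per-tile figures given in this article; the class and field names are illustrative and are not Intel terminology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClearwaterForestPackage:
    """Per-unit figures as quoted in the article; names are illustrative."""
    compute_tiles: int = 12
    modules_per_tile: int = 6
    cores_per_module: int = 4
    l2_mb_per_module: int = 4
    llc_mb_per_compute_tile: int = 48

    @property
    def total_cores(self) -> int:
        return self.compute_tiles * self.modules_per_tile * self.cores_per_module

    @property
    def total_l2_mb(self) -> int:
        return self.compute_tiles * self.modules_per_tile * self.l2_mb_per_module

    @property
    def total_llc_mb(self) -> int:
        return self.compute_tiles * self.llc_mb_per_compute_tile

    @property
    def total_cache_mb(self) -> int:
        # L2 plus last-level cache, as summed in the article
        return self.total_l2_mb + self.total_llc_mb


pkg = ClearwaterForestPackage()
print(pkg.total_cores, pkg.total_l2_mb, pkg.total_llc_mb, pkg.total_cache_mb)
# prints: 288 288 576 864
```

Changing `compute_tiles` to a smaller value gives a rough idea of how a cut-down configuration would scale, although the article does not describe any such SKU.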

In-Depth Look at Darkmont E-Core

Now let’s delve deeper into the Darkmont E-Core, which is also employed in the Panther Lake client CPUs.

Intel slide titled 'Darkmont E-core Deep Dive' by Intel Fellow Stephen Robinson.

While the Darkmont architecture bears similarities to the Skymont design featured in the Lunar Lake and Arrow Lake CPUs, it represents a substantial upgrade over Crestmont.

Intel diagram titled 'Darkmont E-core' with labeled sections.

Notable improvements in the Darkmont core include an updated 128-byte prediction block, enhanced instruction fetch, and a 9-wide microarchitecture with a wider decode unit that has 50% more decode clusters than Crestmont. Other enhancements include a larger uop queue and a more refined instruction cache.

Intel Darkmont E-core out-of-order engine diagram with allocation and retire features highlighted.

Intel has also enhanced the Out-of-Order Engine (OOE). It features 8-wide allocation and 16-wide retirement for quicker resource management, alongside a larger out-of-order window of 416 entries.

The number of dispatch ports has grown to 26. The Scalar Engine features 8 integer ALUs, while the Vector Engine includes 4 floating-point ALUs, improving throughput across a mix of execution tasks.

Intel execution engine display illustrating Darkmont E-core features.

Memory subsystem enhancements reflect a comprehensive upgrade: doubling of L2 cache bandwidth and expedited L1 to L1 transfers are now possible, enhancing data communication efficiency.

Because these transfers no longer cross the external fabric, the L2 cache can now pull data directly from an L1 cache. The eviction rate has likewise doubled, from 16 bytes to 32 bytes per clock cycle.

Comparison chart of Crestmont and Darkmont E-Core architectures.

In conclusion, the Darkmont E-Cores found in Clearwater Forest offer a performance increase of up to 90% compared to the 144-core Xeon 6780E ‘Sierra Forest’, while also delivering a 23% rise in efficiency across varied loads and supporting up to 8:1 server consolidation with lower total cost of ownership (TCO).

Initial Performance Metrics

Intel has released preliminary performance statistics for the Clearwater Forest ‘Xeon 6+’ CPUs, presenting comparisons with both the 144-core Xeon 6700E ‘Sierra Forest’ and the unreleased 288-core Xeon 6900E chips.

Graph comparing performance metrics of Clearwater Forest versus Sierra Forest.

Compared with the 144-core Sierra Forest (Xeon 6780E) at 330W, the 288-core Clearwater Forest part at 450W carries a roughly 36% higher TDP but doubles the core count, delivering 112.7% higher performance and 54.7% better performance per watt.

Compared to the 288-core Sierra Forest chip, which has a TDP of 500W, Clearwater Forest’s 450W TDP is 10% lower while offering 17% better performance and 30% higher performance per watt.
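
The relationship between these TDP and performance figures can be sanity-checked with a few lines of Python. This is a rough sketch that treats TDP as a stand-in for actual power draw, which is an assumption; Intel's measured efficiency figures need not match a TDP-based estimate exactly. The function names are illustrative. A positive result means the Clearwater Forest figure is higher than the baseline.

```python
def pct_change(new: float, old: float) -> float:
    """Percentage change of `new` relative to `old` (positive = higher)."""
    return (new / old - 1) * 100


def perf_per_watt_gain(perf_gain_pct: float, new_tdp_w: float, old_tdp_w: float) -> float:
    """Estimated perf/W gain in percent, using TDP as a proxy for power."""
    perf_ratio = 1 + perf_gain_pct / 100
    power_ratio = new_tdp_w / old_tdp_w
    return (perf_ratio / power_ratio - 1) * 100


# 288-core Clearwater Forest (450W) vs 144-core Xeon 6780E (330W)
print(f"TDP change: {pct_change(450, 330):+.1f}%")                     # +36.4%
print(f"perf/W estimate: {perf_per_watt_gain(112.7, 450, 330):+.1f}%")  # +56.0%

# 288-core Clearwater Forest (450W) vs 288-core Sierra Forest (500W)
print(f"TDP change: {pct_change(450, 500):+.1f}%")                    # -10.0%
print(f"perf/W estimate: {perf_per_watt_gain(17, 450, 500):+.1f}%")    # +30.0%
```

The second comparison lands on the quoted 30% figure, while the first comes out slightly above the quoted 54.7%, which suggests Intel's number was derived from measured power rather than nominal TDP.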

Performance and efficiency metrics of Darkmont vs Crestmont CPUs.

This elevated performance stems from the advanced Darkmont E-Cores, which provide a 17% lift in IPC. The Clearwater Forest platform thereby delivers 1.9x higher performance, a 23% improvement in efficiency, and substantial server consolidation ratios compared to older Xeon systems.

Specifications for Intel Xeon 6+ CPUs and Platform

The Clearwater Forest “Xeon 6+” CPUs will use the LGA 7529 socket in both 1S and 2S configurations. This is the same socket used by the Xeon 6900P ‘Granite Rapids-AP’ CPUs. The chips will operate within a TDP range of 300-500W, comparable to the range covered by the 144-core Xeon 6700E and the Xeon 6900P.

Screen highlighting Intel Clearwater Forest tech specs and features.

These CPUs will facilitate up to 12-channel DDR5 memory with support for speeds up to 8000 MT/s, alongside accommodating up to 6 UPI 2.0 links (up to 24 GT/s), up to 96 PCIe Gen 5.0 lanes, and up to 64 CXL 2.0 lanes.
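
To put the memory configuration in perspective, the theoretical peak bandwidth can be estimated from the channel count and transfer rate. The sketch below assumes a 64-bit (8-byte) data path per DDR5 channel, excluding ECC bits; that width is a standard DDR5 figure rather than something stated in this article, and the function name is illustrative. Sustained bandwidth in practice will be lower than this peak.

```python
def peak_memory_bandwidth_gbs(channels: int, mega_transfers_per_s: int,
                              bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (decimal), assuming 8 bytes per transfer per channel."""
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


# 12 channels of DDR5-8000, as quoted for Clearwater Forest
print(peak_memory_bandwidth_gbs(12, 8000))  # 768.0
```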

In terms of security features, the architecture includes Intel Software Guard Extensions (SGX) and Intel Trust Domain Extensions (TDX). Furthermore, power management is enhanced by Intel’s Application Energy Telemetry (AET) and Turbo Rate Limiter technologies. The Clearwater Forest CPUs will support Advanced Vector Extensions 2 (AVX2) with VNNI and INT8 capabilities.

Intel Xeon 6+ features showing 288 E-cores and DDR5 memory capabilities.

In summary, here is how Clearwater Forest “Xeon 6+” stacks up against Sierra Forest “Xeon 6”:

  • Up to 2x the core count
  • 17% IPC improvement per core
  • More than 5x last level cache
  • 4 additional memory channels
  • 2 more UPI links
  • 20% increased memory speed

Comparison between Intel® Xeon 6700E and Clearwater Forest specifications.
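
Two of the ratios in the list above can be checked against figures quoted elsewhere in this article. The sketch assumes the baseline is the 144-core Sierra Forest part with the 108 MB L3 listed for Sierra Forest in the family overview table; that pairing is an assumption, and the dictionary keys are illustrative.

```python
# Figures quoted in this article; the baseline pairing is an assumption.
sierra_forest = {"cores": 144, "llc_mb": 108}
clearwater_forest = {"cores": 288, "llc_mb": 576}

# Ratio of each Clearwater Forest figure to its Sierra Forest counterpart
ratios = {key: clearwater_forest[key] / sierra_forest[key] for key in sierra_forest}

for key, ratio in ratios.items():
    print(f"{key}: {ratio:.2f}x")
# cores: 2.00x
# llc_mb: 5.33x
```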

Intel’s Clearwater Forest “Xeon 6+” CPUs are scheduled to launch in the second half of 2026, with additional performance data and insights expected ahead of the release.

Overview of Intel Xeon CPU Families (Preliminary):

| Family Branding | Diamond Rapids | Clearwater Forest | Granite Rapids | Sierra Forest | Emerald Rapids | Sapphire Rapids | Ice Lake-SP | Cooper Lake-SP | Cascade Lake-SP/AP | Skylake-SP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Process Node | TBD | Intel 18A | Intel 3 | Intel 3 | Intel 7 | Intel 7 | 10nm+ | 14nm++ | 14nm++ | 14nm+ |
| Platform Name | Intel Oak Stream | Intel Birch Stream | Intel Birch Stream | Intel Mountain Stream/Intel Birch Stream | Intel Eagle Stream | Intel Eagle Stream | Intel Whitley | Intel Cedar Island | Intel Purley | Intel Purley |
| Core Architecture | Panther Cove-X | Darkmont | Redwood Cove | Sierra Glen | Raptor Cove | Golden Cove | Sunny Cove | Cascade Lake | Cascade Lake | Skylake |
| MCP (Multi-Chip Package) SKUs | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes | No |
| Socket | LGA XXXX / 9324 | LGA 4710 / 7529 | LGA 4710 / 7529 | LGA 4710 / 7529 | LGA 4677 | LGA 4677 | LGA 4189 | LGA 4189 | LGA 3647 | LGA 3647 |
| Max Core Count | TBD | Up to 288 | Up to 128 | Up to 288 | Up to 64? | Up to 56 | Up to 40 | Up to 28 | Up to 28 | Up to 28 |
| Max Thread Count | TBD | Up to 288 | Up to 256 | Up to 288 | Up to 128 | Up to 112 | Up to 80 | Up to 56 | Up to 56 | Up to 56 |
| Max L3 Cache | TBD | TBD | 480 MB L3 | 108 MB L3 | 320 MB L3 | 105 MB L3 | 60 MB L3 | 38.5 MB L3 | 38.5 MB L3 | |
| Memory Support | Up to 16-Channel DDR5? | Up to 12-Channel DDR5-8000 | Up to 12-Channel DDR5-6400/MCR-8800 | Up to 12-Channel DDR5-6400 | Up to 8-Channel DDR5-5600 | Up to 8-Channel DDR5-4800 | Up to 8-Channel DDR4-3200 | Up to 6-Channel DDR4-3200 | DDR4-2933 6-Channel | DDR4-2666 6-Channel |
| PCIe Gen Support | PCIe 6.0? | PCIe 5.0 (96 lanes) | PCIe 5.0 (136 lanes) | PCIe 5.0 (88 lanes) | PCIe 5.0 (80 lanes) | PCIe 5.0 (80 lanes) | PCIe 4.0 (64 lanes) | PCIe 3.0 (48 lanes) | PCIe 3.0 (48 lanes) | PCIe 3.0 (48 lanes) |
| TDP Range (PL1) | TBD | Up to 500W | Up to 500W | Up to 350W | Up to 350W | Up to 350W | 105-270W | 150W-250W | 165W-205W | 140W-205W |
| 3D XPoint Optane DIMM | TBD | N/A | Donahue Pass | N/A | Crow Pass | Crow Pass | Barlow Pass | Barlow Pass | Apache Pass | N/A |
| Competition | AMD EPYC Venice | AMD EPYC Turin | AMD EPYC Turin | AMD EPYC Bergamo | AMD EPYC Genoa ~5nm | AMD EPYC Genoa ~5nm | AMD EPYC Milan 7nm+ | AMD EPYC Rome 7nm | AMD EPYC Rome 7nm | AMD EPYC Naples 14nm |
| Launch | 2025-2026 | 2026 | 2024 | 2024 | 2023 | 2022 | 2021 | 2020 | 2018 | 2017 |
