
Intel has officially announced its Xe3 graphics architecture, set to debut in the integrated GPU of the upcoming Panther Lake processors, with plans for a Xe3P variant in the near future.
Intel Unveils Xe3 Architecture for Panther Lake’s iGPU: Promising Up to 50% Performance Uplift
Intel’s Xe3 follows last year’s Xe2 architecture, which shipped in two key releases: the Lunar Lake “Core Ultra 200V” CPUs and the Arc B-Series “Battlemage” discrete graphics cards. Xe2 built on lessons learned from its predecessor, Xe1, and the initial Arc Alchemist A-series family, resulting in a successful launch across both platforms.



Recent enhancements in software have also bolstered Intel’s offerings in driver support, benefiting not only gaming but also content creation, rendering, and AI processes. The newly released Arc Pro series has integrated seamlessly with the existing driver ecosystem alongside the Battlemage GPUs.

In recent months Intel has shown substantial advances in graphics technology, culminating in the upcoming Panther Lake “Core Ultra 300” series, which introduces the Xe3 architecture.
Xe3 iGPUs: The Next Generation of Arc B-Series and Insights on Xe3P
The Xe3 architecture builds upon Xe2 by expanding graphics capabilities for larger configurations and optimizing throughput. Notably, the iGPUs powered by Xe3 will be branded under the Arc B-Series.
Interestingly, while the Battlemage discrete GPUs are based on Xe2, the Panther Lake iGPUs move to the Xe3 architecture. Sharing the Arc B-Series name across both reflects Intel’s decision to unify its product stack across integrated and discrete options.

Future developments indicate that an Arc family using an upgraded Xe3 architecture named Xe3P is in the pipeline, poised to deliver further optimizations rather than jumping directly to Xe4. This strategic move suggests Xe3P could be employed in both discrete GPU solutions and enhanced iGPU setups for upcoming Nova Lake CPUs.
Xe3P will not be part of the current Arc B-Series, which covers the Battlemage dGPUs and the Panther Lake iGPUs. It will likely arrive as the next entry in the Arc family, perhaps the Arc C-Series. With these elements clarified, let’s dive into the specifics of the Xe3 architecture.
Xe3 – Elevating iGPU Performance and Power Efficiency
The Xe3 architecture marks a significant step up in rendering capabilities. The previous Xe2 featured 4 Xe cores and 4 ray tracing units per render slice.

In contrast, Xe3 allows up to 6 Xe cores and 6 ray tracing units per render slice, a 50% increase. This lets Intel deploy different GPU tile configurations within its Panther Lake SoCs.

The available configurations include a 4 Xe core die for 8C and 16C SKUs and a larger 12 Xe core setup earmarked for the top 16C die, promising higher performance than predecessors such as Arrow Lake and Lunar Lake.

The specifications for the two configurations are as follows:
- 4 Xe Core Configuration:
  - 4 Xe Cores (Xe3 Architecture)
  - 1 Render Slice
  - 32 XMX Engines
  - 4 MB L2 Cache
  - 1 Geo Pipeline
  - 4 Samplers
  - 4 Ray Tracing Units
  - 2 Pixel Backends
- 12 Xe Core Configuration:
  - 12 Xe Cores (Xe3 Architecture)
  - 2 Render Slices
  - 96 XMX Engines
  - 16 MB L2 Cache
  - 2 Geo Pipelines
  - 12 Samplers
  - 12 Ray Tracing Units
  - 4 Pixel Backends
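
Most figures in the two lists above follow from just the Xe core count and the render slice count. The Python sketch below makes those relationships explicit. The per-unit ratios are inferred from the listed numbers (for example 32 / 4 = 8 XMX engines per Xe core) and are assumptions for illustration, not Intel-published formulas; the `Xe3Config` class and its field names are hypothetical.

```python
from dataclasses import dataclass

# Ratios inferred from the two spec lists (assumptions, not Intel formulas).
XMX_PER_XE_CORE = 8           # 32 / 4 and 96 / 12
PIXEL_BACKENDS_PER_SLICE = 2  # 2 / 1 and 4 / 2


@dataclass(frozen=True)
class Xe3Config:
    """Hypothetical container for one Panther Lake iGPU configuration."""
    xe_cores: int
    render_slices: int
    l2_cache_mb: int

    @property
    def xmx_engines(self) -> int:
        return self.xe_cores * XMX_PER_XE_CORE

    @property
    def ray_tracing_units(self) -> int:
        return self.xe_cores  # one ray tracing unit per Xe core

    @property
    def samplers(self) -> int:
        return self.xe_cores  # one sampler per Xe core

    @property
    def geo_pipelines(self) -> int:
        return self.render_slices  # one geometry pipeline per render slice

    @property
    def pixel_backends(self) -> int:
        return self.render_slices * PIXEL_BACKENDS_PER_SLICE


def percent_increase(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


SMALL = Xe3Config(xe_cores=4, render_slices=1, l2_cache_mb=4)
LARGE = Xe3Config(xe_cores=12, render_slices=2, l2_cache_mb=16)

if __name__ == "__main__":
    for cfg in (SMALL, LARGE):
        print(f"{cfg.xe_cores} Xe cores: {cfg.xmx_engines} XMX engines, "
              f"{cfg.ray_tracing_units} RT units, {cfg.pixel_backends} pixel backends")
    # Xe2 slice (4 Xe cores) versus Xe3 slice (6 Xe cores)
    print(f"Per-slice uplift: {percent_increase(4, 6):.0f}%")
```

The L2 cache size is kept as an explicit field because it does not scale linearly with core count (4 MB for 4 Xe cores, 16 MB for 12).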

Although the 4 Xe configuration carries a smaller 4 MB L2 cache, the 12 Xe model’s 16 MB L2 cache reduces traffic on the SoC fabric by up to 36% in gaming scenarios.

Architectural upgrades within the Xe3 framework include enhanced core features such as eight 512-bit Vector Engines and eight 2048-bit XMX Engines, alongside a +33% increase in shared L1/SLM cache.
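
To put the vector engine figures in perspective, the short Python sketch below converts the engine count and width into SIMD lanes. It assumes the full 512-bit width is usable for a given element size, which is a simplification; the function names are hypothetical, and real throughput depends on the instruction mix and clock speed.

```python
VECTOR_ENGINES_PER_XE_CORE = 8   # from the article
VECTOR_ENGINE_WIDTH_BITS = 512   # from the article


def lanes_per_vector_engine(element_bits: int = 32) -> int:
    """Number of elements of the given width that fit in one vector engine."""
    if element_bits <= 0 or VECTOR_ENGINE_WIDTH_BITS % element_bits != 0:
        raise ValueError("element width must evenly divide the engine width")
    return VECTOR_ENGINE_WIDTH_BITS // element_bits


def lanes_per_gpu(xe_cores: int, element_bits: int = 32) -> int:
    """Total SIMD lanes across all Xe cores for the given element width."""
    return xe_cores * VECTOR_ENGINES_PER_XE_CORE * lanes_per_vector_engine(element_bits)


if __name__ == "__main__":
    for cores in (4, 12):
        print(f"{cores} Xe cores: {lanes_per_gpu(cores)} FP32 lanes, "
              f"{lanes_per_gpu(cores, 16)} FP16 lanes")
```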

Each Xe Vector Engine can now run up to 25% more threads and supports variable register allocation, which improves performance, especially in AI-focused tasks.

Moreover, the XMX engines are engineered for AI acceleration, with a 12Xe iGPU capable of delivering up to 120 TOPS, while a 4Xe iGPU can achieve about 40 TOPS. For context, the previous Xe2 architecture produced a maximum of 67 TOPS, making the transition to Xe3 a notable leap in performance.

The Xe3 architecture’s per Xe-core operations per clock are detailed as:
- XMX TF32: 1024 ops/clk
- XMX FP16: 2048 ops/clk
- XMX BF16: 2048 ops/clk
- XMX INT8: 4096 ops/clk
- XMX INT4: 8192 ops/clk
- XMX INT2: 8192 ops/clk

Additionally, Intel has introduced a cutting-edge ray tracing unit featuring dynamic ray management, designed for asynchronous ray tracing. This unit is equipped with multiple traversal pipelines, triangle intersection units, and a BVH cache, enhancing overall performance.

The new URB manager supports partial updates, improving the efficiency of data management on the GPU. Xe3 also delivers up to 2x the anisotropic filtering and stencil test rates.
On the media front, the architecture includes advanced features such as AV1 Encode/Decode, VVC Decode, and eDP 1.5 support. Additional functions include AVC 10-bit support and compatibility with various Sony XAVC formats, enriching the multimedia handling capabilities of Xe3 in Panther Lake.
Intel Continues to Scale and Enhance GPU Performance with Xe3
Intel has revealed preliminary performance evaluations for its Xe3 GPUs, focusing on microbenchmarks that assess individual segments of the GPU microarchitecture compared to prior iterations.

Blend and backend results barely change, since the resource allocations there remain constant in Xe3. FP16 GEMM throughput, however, rises by 50%, reflecting the scaling advantage of the larger GPU: Xe3 is bigger than Xe2, and these benchmarks use it fully. Architectural enhancements show up in anisotropic rate, mesh render rate, scattered reads, and ray tracing intersection, where gains range from 2x to 2.7x.

Depth testing and register-heavy applications show the largest gains, exceeding 7x over the previous generation.

Intel also showed a frame rendered on Xe3 versus Xe2 to illustrate these gains.

Moreover, Intel is enhancing its Windows Graphics Software Stack, introducing useful updates including compiler improvements via the Intel Graphics Compiler (IGC) and variable register allocation to optimize performance further.

Intel is also introducing faster scheduling through direct preemption, which allows swift context switching without flushing. The latest updates add support for DirectX Cooperative Vectors, showcased in Intel’s “Neural Radiance Field” demo.

In summary, the Intel Xe3 architecture is a noteworthy improvement over Xe2, which currently competes with RDNA 3.5 iGPUs such as the Radeon 880M and 890M in mainstream laptops. Xe2 does not fully match larger RDNA 3.5 implementations such as Strix Halo, but custom SoCs arising from the partnership between Intel and NVIDIA may bridge this gap.