NVIDIA Rubin CPX GPU: Optimized for Super AI Applications with Million-Token Coding, GenAI, 128 GB GDDR7 Memory, and 30 PFLOPs FP4 Performance

NVIDIA is making headlines with the anticipated release of its latest innovation, the Rubin AI platform. This advanced system is set to feature Vera CPUs in conjunction with the cutting-edge Rubin CPX chip, boasting a remarkable 128 GB of GDDR7 memory.

Unveiling the NVIDIA Rubin AI Platform: A New Era of Speed and Efficiency

NVIDIA continues to build anticipation around its next-generation Rubin AI platform, while also hinting at the future potential of its Feynman platform. Recent updates highlight the innovative capabilities of the Rubin GPUs, emphasizing the integration of advanced technologies like Vera CPUs and the new ConnectX-9 SuperNICs.

NVIDIA Rubin CPX GPU for massive-context processing, shown with 128 GB of GDDR7 memory and availability at the end of 2026.

NVIDIA today announced NVIDIA Rubin CPX, a new class of GPU purpose-built for massive-context processing. This enables AI systems to handle million-token software coding and generative video with groundbreaking speed and efficiency.

Rubin CPX works hand in hand with NVIDIA Vera CPUs and Rubin GPUs inside the new NVIDIA Vera Rubin NVL144 CPX platform. This integrated NVIDIA MGX system packs 8 exaflops of AI compute to provide 7.5x more AI performance than NVIDIA GB300 NVL72 systems, as well as 100TB of fast memory and 1.7 petabytes per second of memory bandwidth in a single rack. A dedicated Rubin CPX compute tray will also be offered for customers looking to reuse existing Vera Rubin NVL144 systems.

NVIDIA Rubin CPX enables the highest performance and token revenue for long-context processing — far beyond what today’s systems were designed to handle. This transforms AI coding assistants from simple code-generation tools into sophisticated systems that can comprehend and optimize large-scale software projects.

To process video, AI models can take up to 1 million tokens for an hour of content, pushing the limits of traditional GPU compute. Rubin CPX integrates video decoder and encoders, as well as long-context inference processing, in a single chip for unprecedented capabilities in long-format applications such as video search and high-quality generative video.

Built on the NVIDIA Rubin architecture, the Rubin CPX GPU uses a cost-efficient, monolithic die design packed with powerful NVFP4 computing resources and is optimized to deliver extremely high performance and energy efficiency for AI inference tasks.

via NVIDIA
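
To put the quoted video figure in perspective, the short sketch below converts "up to 1 million tokens for an hour of content" into an average token rate. The function name is illustrative, and the calculation assumes tokens are spread evenly over the hour, which the source does not state.

```python
def tokens_per_second(total_tokens: float, duration_seconds: float) -> float:
    """Average token rate for content of the given duration."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return total_tokens / duration_seconds


# Figure quoted above: up to 1 million tokens for one hour of video.
rate = tokens_per_second(1_000_000, 60 * 60)
print(f"{rate:.1f} tokens per second of video")  # 277.8 tokens per second of video
```

At that rate, a single hour-long clip already sits at the million-token context scale that Rubin CPX is aimed at.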

The Rubin family brings forth a new class of GPUs engineered for demanding AI applications, including million-token software coding and Generative AI (GenAI). NVIDIA promises major gains in speed and efficiency for these workloads.

NVIDIA's announcement of the Vera Rubin CPX dual-rack solution, with highlighted features including 1.7 PB/s of memory bandwidth and availability in 2026.

Within the Vera Rubin NVL144 CPX platform, NVIDIA’s Rubin CPX chips will work in tandem with the next-generation Vera CPUs, which succeed the Grace CPU. This MGX system is designed to deliver 8 exaflops of AI compute, a 7.5x improvement over the existing Grace Blackwell GB300 NVL72 platform. It will also feature 100 TB of fast memory and 1.7 petabytes per second of memory bandwidth, with attention performance tripled compared to previous systems.
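
As a rough sense of scale for these rack-level numbers, the sketch below estimates how long one full pass over 100 TB would take at 1.7 PB/s. It assumes decimal units and, as a simplification, that the quoted aggregate bandwidth applies to all of the fast memory; the article does not say which memory pools the bandwidth figure covers, and the helper name is illustrative.

```python
TB = 10**12  # decimal terabyte, in bytes (assumption)
PB = 10**15  # decimal petabyte, in bytes (assumption)


def full_sweep_seconds(capacity_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time to read the whole capacity once at the given aggregate bandwidth."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return capacity_bytes / bandwidth_bytes_per_s


sweep = full_sweep_seconds(100 * TB, 1.7 * PB)
print(f"{sweep * 1000:.1f} ms per full pass")  # 58.8 ms per full pass
```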

Key Advantages of the NVIDIA Vera Rubin CPX Platform

7.5x increase in AI compute (8 exaflops NVFP4)
3.0x higher memory bandwidth (1.7 PB/s)
4.0x greater memory capacity (150 TB of fast memory; note that the single-rack figure quoted above is 100 TB)
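
The multipliers in this list can be turned back into the baseline they imply by dividing each headline value by its ratio. The sketch below does that with the numbers exactly as listed; the results are back-calculated from this article's figures and should not be read as official specifications of the comparison system.

```python
def implied_baseline(new_value: float, multiplier: float) -> float:
    """Baseline value implied by a 'multiplier-x better' claim."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return new_value / multiplier


# (headline value, claimed multiplier), taken from the list above.
claims = {
    "AI compute (exaflops, NVFP4)": (8.0, 7.5),
    "memory bandwidth (PB/s)": (1.7, 3.0),
    "memory capacity (TB)": (150.0, 4.0),
}

baselines = {name: implied_baseline(v, m) for name, (v, m) in claims.items()}
for name, value in baselines.items():
    print(f"implied baseline {name}: {value:.2f}")
```

This gives roughly 1.07 exaflops, 0.57 PB/s, and 37.5 TB for the implied baseline system.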

Each NVIDIA Rubin CPX GPU is set to deliver 30 PFLOPs of NVFP4 AI compute and can accommodate up to 128 GB of GDDR7 memory. The choice of GDDR7 over HBM for Rubin CPX is noteworthy, reflecting NVIDIA’s focus on a more cost-efficient memory option for inference. These GPUs are also expected to feature expanded NVENC and NVDEC capabilities, enhancing video encoding and decoding for GenAI tasks.

NVIDIA roadmap highlights Blackwell, Rubin, and Feynman architectures from 2025 to 2028 with Grace CPU and NVLink switch details.

NVIDIA anticipates that the first Rubin CPX systems will be available by the end of 2026. Vera Rubin production is expected to begin shortly, with a launch event targeted for GTC 2026.
