
NVIDIA continues to make waves in the tech industry with its latest advancements in neural rendering and gaming, particularly through its Blackwell RTX GPU series, which includes the RTX 5090 and RTX PRO 6000.
NVIDIA Showcases Blackwell RTX Innovations at Hot Chips 2025
Launched in January 2025, the Blackwell RTX architecture introduces features that focus primarily on artificial intelligence (AI). This is not a new direction for NVIDIA, whose work in accelerated computing and AI began with the launch of CUDA in 2006.

NVIDIA asserts that its gains in compute density have outpaced Moore’s Law scaling, thanks to techniques such as sparsity, a new Instruction Set Architecture (ISA), and optimized lower-precision formats. The 2018 introduction of real-time ray tracing and the launch of DLSS the following year are key milestones in this evolution.

These features rely on dedicated hardware, namely RT cores for ray tracing and Tensor cores for AI, and Blackwell extends the capabilities of both.

In the data center space, NVIDIA introduced FP4 precision, providing a 4x boost for workloads that require dense scaling. Jensen Huang, NVIDIA’s CEO, aims to reaffirm AI’s pivotal role in graphics, facilitating the emergence of the neural rendering era with Blackwell RTX. The RTX brand signifies innovation in simulation, content creation, and gaming, paving the way to extend data center technologies to consumer RTX GPUs.

So, what does the Blackwell architecture contribute? It brings advancements like DLSS 4, Multi Frame Generation (MFG), ACE, and enhanced Path Tracing, all designed to accelerate performance and improve visual fidelity. NVIDIA claims Blackwell RTX can achieve a “10x amplification in performance, footprint, and design cycle.” DLSS 4 uses AI to render 100% of the pixels following the initial frame, leading to faster rendering and longer battery life on mobile devices.

The key design principles of the RTX Blackwell GPU include:
- Optimization for new neural workloads
- Minimizing memory footprint
- Ensuring quality service for neural and graphics tasks
- Scalable energy efficiency

On the technical side, RTX Blackwell offers 4000 AI Tera Operations Per Second (TOPS) and high-speed FP4 support based on 5th Generation Tensor Cores. It provides up to 360 RT TFLOPS targeted at Mega Geometry with the 4th Gen RT Cores, while the AI Management Processor (AMP) schedules AI models alongside graphics processing.
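
To make the FP4 format more concrete, here is a minimal sketch of how values might be rounded to 4-bit floating point. It assumes the E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit) and a single shared scale per block of values. The function names and the scaling scheme are illustrative assumptions, not details from NVIDIA's presentation.

```python
# Magnitudes representable by a 4-bit E2M1 float (assumed layout:
# 1 sign bit, 2 exponent bits, 1 mantissa bit).
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantize_fp4(values):
    """Round a block of values to FP4 codes that share one scale factor.

    Returns (scale, codes) such that value is approximately scale * code.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Map the largest magnitude in the block onto the largest FP4 value (6.0).
    scale = max_abs / FP4_E2M1_MAGNITUDES[-1] if max_abs > 0 else 1.0
    codes = []
    for v in values:
        target = abs(v) / scale
        nearest = min(FP4_E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        codes.append(nearest if v >= 0 else -nearest)
    return scale, codes


def dequantize_fp4(scale, codes):
    """Recover approximate values from the shared scale and the FP4 codes."""
    return [scale * c for c in codes]


# Illustrative weights, not real model data.
scale, codes = quantize_fp4([0.12, -0.48, 0.30, 0.06])
print(scale, codes, dequantize_fp4(scale, codes))
```

With only eight magnitudes available, most of the precision comes from choosing a good scale per block, which is why low-precision formats are usually paired with some form of scaling.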

The architecture of the RTX Blackwell Streaming Multiprocessor (SM) diverges significantly from its Data Center counterpart. One notable improvement is the integration of FP32 and INT32 units that were previously separate, enhancing processing efficiency.

Furthermore, RTX Blackwell enhances Shader Execution Reordering (SER), doubling the efficiency of shader execution.

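The 5th Generation Tensor Core introduces FP4 support and powers the Multi Frame Generation (MFG) mode in DLSS 4, which lets the GPU use AI to generate up to three additional frames for each conventionally rendered frame, so that four frames are displayed for every one rendered.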
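
The arithmetic behind frame generation can be sketched in a few lines. The example below assumes a 4x mode in which one conventionally rendered frame is followed by three AI-generated frames, and optionally that upscaling shades only a quarter of each rendered frame's pixels. Both figures, and the function name, are assumptions chosen for illustration.

```python
from fractions import Fraction


def ai_generated_fraction(generated_per_rendered, rendered_pixel_share=Fraction(1)):
    """Share of displayed pixels produced by AI instead of conventional rendering.

    generated_per_rendered: AI-generated frames shown after each rendered frame.
    rendered_pixel_share: fraction of a rendered frame's pixels that are shaded
        conventionally before upscaling (1 means no upscaling).
    """
    frames_per_group = 1 + generated_per_rendered
    conventional = Fraction(rendered_pixel_share) / frames_per_group
    return 1 - conventional


# Frame generation only: one rendered frame plus three generated frames.
print(ai_generated_fraction(3))                   # 3/4
# Frame generation plus upscaling from a quarter of the pixels.
print(ai_generated_fraction(3, Fraction(1, 4)))   # 15/16
```

In this model every pixel of a generated frame comes from AI, and upscaling raises the share further because even the rendered frame is only partly shaded conventionally.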

As a result, DLSS 4 with Frame Generation significantly reduces the time spent rendering each frame. NVIDIA pairs this with power-management improvements, citing core rail gating that is ten times faster and DRAM self-refresh that is 100 times faster. Mobile platforms can see GPU power consumption reduced by up to 2x, greatly extending battery life.

The introduction of GDDR7 enables the RTX Blackwell to achieve speeds of up to 30 Gbps, effectively doubling the data rate of its predecessor GDDR6. This new memory standard further enhances efficiency on mobile platforms.
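
As a rough illustration of what the per-pin data rate means for total bandwidth, the sketch below multiplies the rate by the memory bus width. The 512-bit bus width is a hypothetical value chosen for the example and is not taken from the presentation; the 15 Gbps figure is simply half of the 30 Gbps cited above.

```python
def memory_bandwidth_gbs(data_rate_gbps, bus_width_bits):
    """Peak bandwidth in GB/s: per-pin rate (Gbit/s) times pin count, over 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


BUS_WIDTH_BITS = 512  # hypothetical bus width, for illustration only

print(memory_bandwidth_gbs(30, BUS_WIDTH_BITS))  # 1920.0 GB/s at 30 Gbps
print(memory_bandwidth_gbs(15, BUS_WIDTH_BITS))  # 960.0 GB/s at half that rate
```

For a fixed bus width, doubling the per-pin data rate doubles peak bandwidth, which is the comparison the 30 Gbps figure implies.
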
Thanks to NVIDIA’s AMP unit, simultaneous execution of AI and graphics workloads is a reality, leading to smoother frame delivery and quicker model responses.

Moving from gaming to professional use, NVIDIA is introducing new capabilities in the RTX PRO 6000, such as Universal MIG (Multi-Instance GPU). This allows a single RTX PRO 6000 to be partitioned into up to four instances, each with 24 GB of VRAM, that run simultaneously with consistent latency and throughput.
A demonstration showed the RTX PRO 6000 running four instances of Cyberpunk 2077 at 1080p on maximum settings.
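
The partitioning arithmetic is simple enough to sketch. Four instances of 24 GB each imply 96 GB in total, and the example below assumes memory is divided equally among instances. The function is a hypothetical helper for illustration and does not correspond to any real MIG configuration interface.

```python
def partition_vram(total_gb, instances):
    """Split VRAM equally across instances (assumes equal-sized partitions)."""
    if instances < 1:
        raise ValueError("at least one instance is required")
    if total_gb % instances != 0:
        raise ValueError("total VRAM must divide evenly across instances")
    return [total_gb // instances] * instances


TOTAL_VRAM_GB = 4 * 24  # implied by four instances of 24 GB each

print(partition_vram(TOTAL_VRAM_GB, 4))  # [24, 24, 24, 24]
print(partition_vram(TOTAL_VRAM_GB, 2))  # [48, 48]
```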

Using a standard time-sliced instance as the baseline, the MIG 2x and 4x modes showed a 60% improvement in scaling, indicating that the RTX PRO 6000 Blackwell GPU is well suited to running multiple instances of demanding applications like Cyberpunk 2077.

Overall, NVIDIA’s Blackwell GPU architecture has been making strides since its release, continuously evolving for both consumer and professional applications. As more games and content creation tools begin to incorporate the extensive AI and neural enhancements offered by Blackwell, the anticipation surrounding future developments in this space is palpable.