
NVIDIA has officially launched TensorRT for its GeForce RTX GPUs, promising AI inference performance up to 2x faster than DirectML.
NVIDIA’s TensorRT Transforms AI Performance on RTX GPUs
NVIDIA has made TensorRT available across its RTX platform. The AI inference engine lets applications running on RTX GPUs execute AI workloads significantly faster.

With TensorRT integrated, users can expect speedups of up to 2x in various AI applications compared to DirectML. TensorRT is also natively supported by Windows ML, improving compatibility and efficiency, and TensorRT-LLM is already available on the Windows platform.

Today’s AI PC software stack forces developers to choose between frameworks with broad hardware support but lower performance, and optimized paths that cover only certain hardware or model types and require maintaining multiple code paths. The new Windows ML inference framework was built to solve these challenges.
Windows ML is built on top of ONNX Runtime and seamlessly connects to an optimized AI execution layer provided and maintained by each hardware manufacturer. For GeForce RTX GPUs, Windows ML automatically uses TensorRT for RTX — an inference library optimized for high performance and rapid deployment. Compared to DirectML, TensorRT delivers over 50% faster performance for AI workloads on PCs.
Windows ML also delivers quality-of-life benefits for developers. It can automatically select the right hardware to run each AI feature and download the execution provider for that hardware, removing the need for developers to package those files into their apps. This lets NVIDIA deliver the latest TensorRT performance optimizations to users as soon as they are ready. And because it is built on ONNX Runtime, Windows ML works with any ONNX model.
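For developers curious about what that automation replaces, the sketch below selects execution providers by hand through the ONNX Runtime Python API. The model file name and input shape are assumptions for illustration, and the snippet requires an onnxruntime build that ships the TensorRT and CUDA execution providers.

```python
# Minimal sketch: explicit execution-provider selection with ONNX Runtime.
# Windows ML automates this step; it is shown by hand here for illustration.
# Assumes "model.onnx" (hypothetical file) and an onnxruntime-gpu build
# that includes the TensorRT and CUDA execution providers.
import numpy as np
import onnxruntime as ort

# Ordered preference list: ONNX Runtime falls back from left to right.
providers = [
    "TensorrtExecutionProvider",  # TensorRT-backed path on RTX GPUs
    "CUDAExecutionProvider",      # generic CUDA fallback
    "CPUExecutionProvider",       # always available
]

session = ort.InferenceSession("model.onnx", providers=providers)

# Run one inference with a dummy input matching the model's first input.
inp = session.get_inputs()[0]
x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed shape
outputs = session.run(None, {inp.name: x})
print("active providers:", session.get_providers())
```

Because the provider list is an ordered preference, the same code still runs on machines without an RTX GPU; it simply falls back to CUDA or the CPU.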

Beyond raw performance, TensorRT for RTX reduces library file size by 8x and introduces just-in-time (JIT) optimizations tailored to each individual GPU. The technology rolls out in June for all NVIDIA GeForce RTX GPUs, with more details available at developer.nvidia.com.
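The just-in-time approach means TensorRT compiles a GPU-specific engine the first time a model runs. When driving the TensorRT execution provider through ONNX Runtime directly, that one-time build cost can be amortized with an engine cache; a minimal sketch, reusing the hypothetical model.onnx from above:

```python
# Sketch: persisting TensorRT's JIT-built, GPU-specific engines so the
# one-time build cost is not paid on every launch. The provider options
# shown are those of ONNX Runtime's TensorRT execution provider;
# "model.onnx" and the cache path are assumptions for illustration.
import onnxruntime as ort

trt_options = {
    "trt_engine_cache_enable": True,         # reuse engines across runs
    "trt_engine_cache_path": "./trt_cache",  # per-GPU engine cache dir
}

session = ort.InferenceSession(
    "model.onnx",
    providers=[
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)
```

On first run the cache directory is populated with an engine built for the installed GPU; subsequent launches load it directly and skip the JIT step.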
Performance evaluations show that with TensorRT, applications like ComfyUI achieve a 2x speed increase, while video editing tools such as DaVinci Resolve and Vegas Pro see up to a 60% improvement. This accelerates AI-driven workflows and lets RTX GPUs run at their full potential.

NVIDIA’s software stack now powers over 150 AI SDKs, and five new ISV integrations arrive this month:
- LM Studio (+30% performance with the latest CUDA)
- Topaz Video AI (GenAI video models accelerated with CUDA)
- Bilibili (NVIDIA Broadcast Effects)
- Autodesk VRED (DLSS 4)
- Chaos Enscape (DLSS 4)
Additionally, NVIDIA is announcing new NIM microservices and AI Blueprints, along with plugins for Project G-Assist that integrate platforms such as Discord, Gemini, IFTTT, Twitch, Spotify, and SignalRGB. Users are also encouraged to develop custom plugins for Project G-Assist by visiting github.com/NVIDIA/G-Assist.
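The actual plugin schema is documented in that repository and is not reproduced here. Purely as a hypothetical illustration of the general shape, a plugin is typically a small helper process exchanging structured commands with a host; every name and field in the sketch below is invented, not taken from the G-Assist API.

```python
# Hypothetical sketch only: the real Project G-Assist plugin schema is
# defined at github.com/NVIDIA/G-Assist and is NOT reproduced here.
# This shows the generic pattern of a helper process that reads JSON
# commands from stdin and writes JSON results to stdout.
import json
import sys

def handle(command: dict) -> dict:
    # "name" and "params" are invented field names for this sketch.
    if command.get("name") == "echo":
        return {"ok": True, "result": command.get("params")}
    return {"ok": False, "error": "unknown command"}

for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    response = handle(json.loads(line))
    sys.stdout.write(json.dumps(response) + "\n")
    sys.stdout.flush()
```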