
NVIDIA has unveiled its GeForce RTX 5090 GPU, which significantly outperforms AMD's RX 7900 XTX in inference on DeepSeek R1 distilled models. NVIDIA attributes much of this performance leap to the fifth-generation Tensor Cores in its new architecture.
Streamlined Access to DeepSeek’s Reasoning Models with RTX GPUs
Consumer GPUs have become powerful tools for running advanced large language models (LLMs) on local systems, and both NVIDIA and AMD are tuning their hardware and software to make these models easier to use. AMD recently highlighted the capabilities of its RDNA 3 flagship GPU running the DeepSeek R1 LLM. In response, NVIDIA published benchmark results for its latest RTX Blackwell series, showing the GeForce RTX 5090 with a decisive edge over its competitor.

According to NVIDIA's figures, the GeForce RTX 5090 can process up to 200 tokens per second with models like Distill Qwen 7B and Distill Llama 8B, nearly double the throughput of AMD's RX 7900 XTX, underscoring NVIDIA's lead in consumer AI performance. With comprehensive "RTX on AI" support, edge AI capabilities may soon become commonplace in consumer-grade PCs.
Accessing DeepSeek R1 on NVIDIA GPUs
NVIDIA has made it straightforward for enthusiasts to run DeepSeek R1 on their RTX GPUs. The company published a detailed blog post that walks users through the setup, making the experience as simple as using any online chatbot. Here's a key takeaway from the announcement:
To help developers securely experiment with these capabilities and build their own specialized agents, the 671-billion-parameter DeepSeek-R1 model is now available as an NVIDIA NIM microservice preview on build.nvidia.com. The DeepSeek-R1 NIM microservice can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system.
Developers can test and experiment with the application programming interface (API), which is expected to be available soon as a downloadable NIM microservice, part of the NVIDIA AI Enterprise software platform.
The DeepSeek-R1 NIM microservice simplifies deployments with support for industry-standard APIs. Enterprises can maximize security and data privacy by running the NIM microservice on their preferred accelerated computing infrastructure.
– NVIDIA
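Since NIM microservices expose industry-standard (OpenAI-compatible) chat APIs, querying the DeepSeek-R1 preview looks much like calling any hosted chatbot endpoint. The sketch below assembles such a request; note that the endpoint URL, model identifier, and sampling parameters are assumptions based on typical NIM conventions, not values stated in this article — check build.nvidia.com for the exact details.

```python
# Minimal sketch of an OpenAI-style chat-completion request to the
# DeepSeek-R1 NIM preview. Endpoint and model ID are assumptions.
import json

NIM_ENDPOINT = "https://integrate.api.nvidia.com/v1/chat/completions"  # assumed URL
MODEL_ID = "deepseek-ai/deepseek-r1"  # assumed model identifier

def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble an OpenAI-compatible chat-completion payload."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.6,
    }

payload = build_request("Explain chain-of-thought reasoning in one paragraph.")
print(json.dumps(payload, indent=2))

# Actually sending the request requires an API key from build.nvidia.com,
# e.g. with the `requests` library:
#   requests.post(NIM_ENDPOINT, json=payload,
#                 headers={"Authorization": f"Bearer {api_key}"})
```

Because the API follows the OpenAI schema, the same payload should work against a locally deployed NIM container once the downloadable microservice ships, by swapping the endpoint URL for the local one.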
This innovative approach enables developers and enthusiasts to experiment with AI models using local builds. Running these models locally not only enhances performance — contingent on the system’s hardware capabilities — but also ensures greater data security, safeguarding sensitive information throughout the process.