NVIDIA RTX GPUs Provide Unmatched AI Performance for OpenAI’s New “gpt-oss” Models

NVIDIA, in collaboration with OpenAI, has unveiled the latest addition to its suite of AI models, the gpt-oss family. This release provides consumers with access to groundbreaking AI technology, leveraging the exceptional processing capabilities of RTX GPUs.

NVIDIA detailed its partnership with OpenAI today, marking a significant advancement that enables cutting-edge artificial intelligence to operate swiftly on RTX-powered PCs and workstations—resources that were previously limited to cloud data centers.

NVIDIA’s founder and CEO, Jensen Huang, highlighted the significance of this move for the tech industry:

“OpenAI showed the world what could be built on NVIDIA AI — and now they’re advancing innovation in open-source software,” said Jensen Huang. “The gpt-oss models let developers everywhere build on that state-of-the-art open-source foundation, strengthening U.S. technology leadership in AI — all on the world’s largest AI compute infrastructure.”

This launch represents a pivotal moment, ushering in an era of faster and more intelligent on-device AI, powered by the formidable capabilities of GeForce RTX and PRO GPUs. Two versions of the model are being introduced, catering to a broad spectrum of users:

  • gpt-oss-20b model: Tailored for NVIDIA RTX AI PCs with at least 16GB of VRAM, this model can generate up to 250 tokens per second on an RTX 5090 GPU.
  • gpt-oss-120b model: Designed for professional environments, this model is supported by NVIDIA RTX PRO GPUs, maximizing processing capabilities.
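To put the 250-tokens-per-second figure in context, a quick back-of-envelope calculation shows what that throughput means for response latency (the 1,000-token response length below is an arbitrary example, not a figure from NVIDIA):

```python
# Back-of-envelope latency: at ~250 tokens/s (gpt-oss-20b on an RTX 5090,
# per NVIDIA's reported figure), how long does a 1,000-token reply take?
response_tokens = 1000        # hypothetical response length
tokens_per_second = 250       # reported generation throughput
latency_seconds = response_tokens / tokens_per_second
print(latency_seconds)        # -> 4.0 seconds
```

In other words, a fairly long answer streams out in about four seconds, which is what makes local, interactive use of a 20B-parameter model practical.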

The gpt-oss models are the first to leverage MXFP4 precision on NVIDIA RTX, a 4-bit microscaling data format that preserves model quality while reducing memory and compute demands compared with older formats. Both models support a context length of up to 131,072 tokens, one of the longest available for local inference. They use a mixture-of-experts (MoE) architecture with chain-of-thought capabilities and support for instruction following and tool use.
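Since MXFP4 may be unfamiliar, here is a minimal, illustrative sketch of the microscaling idea: each small block of values shares one power-of-two scale, and each element is snapped to the nearest representable 4-bit float (E2M1) magnitude. The block size of 32 and the value set follow the OCP Microscaling format family; the rounding logic is simplified and this is not NVIDIA's or OpenAI's implementation.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float, as used in MXFP4
FP4_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_dequantize(values, block_size=32):
    """Round-trip a list of floats through MXFP4-style block quantization."""
    out = []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        amax = max(abs(x) for x in block)
        if amax == 0.0:
            out.extend(0.0 for _ in block)
            continue
        # Shared power-of-two scale: smallest 2**k with amax / 2**k <= 6.0
        scale = 2.0 ** math.ceil(math.log2(amax / FP4_VALUES[-1]))
        for x in block:
            # Snap |x| / scale to the nearest representable FP4 magnitude
            q = min(FP4_VALUES, key=lambda v: abs(abs(x) / scale - v))
            out.append(math.copysign(q * scale, x))
    return out

# Values that land exactly on scaled FP4 points survive the round trip
print(quantize_dequantize([0.0, 0.5, 1.0, 3.0, 6.0]))  # -> [0.0, 0.5, 1.0, 3.0, 6.0]
```

Because only one scale is stored per 32 elements, the per-weight storage cost stays close to 4 bits, which is what lets a 20B-parameter model fit comfortably in 16GB of VRAM.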

This week’s RTX AI Garage focuses on how AI developers and enthusiasts can make the most of the new OpenAI models on NVIDIA RTX GPUs.
