NVIDIA has unveiled the Nemotron 3 family of open models, a new portfolio designed to support transparent, efficient and specialised development of agentic artificial intelligence across a wide range of industries. Available in Nano, Super and Ultra sizes, the Nemotron 3 family introduces a hybrid latent mixture-of-experts architecture aimed at enabling reliable multi-agent AI systems at scale while controlling costs and improving accuracy.
The launch reflects a broader shift in enterprise AI adoption, as organisations move beyond single-model chatbots towards collaborative systems made up of multiple AI agents working together. These systems promise more advanced reasoning and automation but also introduce new challenges, including communication overhead, context drift, rising inference costs and the need for greater transparency. NVIDIA positions Nemotron 3 as a direct response to these pressures, combining open access with performance and efficiency.
“Open innovation is the foundation of AI progress,” said Jensen Huang, founder and CEO of NVIDIA. “With Nemotron, we’re transforming advanced AI into an open platform that gives developers the transparency and efficiency they need to build agentic systems at scale.”
NVIDIA said Nemotron 3 also aligns with its sovereign AI strategy, supporting governments and organisations in regions such as Europe and South Korea that want to deploy AI systems tailored to their own data, regulatory frameworks and societal values using open and auditable models.
Architecture designed for multi-agent efficiency
At the core of Nemotron 3 is a hybrid mixture-of-experts design that selectively activates subsets of model parameters depending on the task. This approach allows the models to scale to large parameter counts while keeping inference efficient, making them suitable for workflows that involve many agents operating in parallel.
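To illustrate the general idea behind mixture-of-experts routing — not NVIDIA's actual implementation — the sketch below shows a minimal top-k router in Python. The hidden size, expert count and top_k value are illustrative assumptions; the point is that only the experts the router selects run for each token, so active parameters stay a small fraction of the total.

```python
# Illustrative sketch of top-k mixture-of-experts routing (not NVIDIA's
# implementation). Only the experts selected by the router run for each
# token, so active parameters stay a small fraction of the total.
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 64        # hypothetical hidden size
NUM_EXPERTS = 16   # hypothetical expert count
TOP_K = 2          # experts activated per token

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.standard_normal((HIDDEN, HIDDEN)) * 0.02 for _ in range(NUM_EXPERTS)]
router_w = rng.standard_normal((HIDDEN, NUM_EXPERTS)) * 0.02

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ router_w                                  # (tokens, experts)
    top_idx = np.argsort(logits, axis=-1)[:, -TOP_K:]      # chosen experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        gate_logits = logits[t, top_idx[t]]
        gates = np.exp(gate_logits - gate_logits.max())
        gates /= gates.sum()                               # softmax over chosen experts
        for g, e in zip(gates, top_idx[t]):
            out[t] += g * (x[t] @ experts[e])
    return out

tokens = rng.standard_normal((4, HIDDEN))
print(moe_forward(tokens).shape)  # (4, 64): only TOP_K of 16 experts ran per token
```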
Nemotron 3 Nano is a 30-billion-parameter model that activates up to 3 billion parameters at a time, making it suited to efficiency-sensitive tasks such as software debugging, content summarisation, AI assistant workflows and information retrieval. NVIDIA said Nano delivers up to four times higher token throughput than Nemotron 2 Nano and reduces reasoning-token generation by as much as 60 percent, significantly lowering inference costs. The model also supports a context window of up to one million tokens, enabling it to retain and connect information across long, multi-step tasks.
Independent benchmarking organisation Artificial Analysis ranked Nemotron 3 Nano as the most open and efficient model in its size category, citing leading accuracy among comparable open models.
Nemotron 3 Super and Nemotron 3 Ultra are aimed at more demanding multi-agent scenarios. Super is a high-accuracy reasoning model with approximately 100 billion parameters and up to 10 billion active per token, optimised for low-latency collaboration between agents. Ultra scales to around 500 billion parameters with up to 50 billion active per token, positioning it as a deep reasoning engine for complex workflows such as research, planning and advanced decision-making.
Both Super and Ultra are trained using NVIDIA’s 4-bit NVFP4 format on the Blackwell architecture, reducing memory requirements and accelerating training without sacrificing accuracy compared with higher-precision methods. NVIDIA said this enables larger models to be trained on existing infrastructure more cost-effectively.
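As a rough back-of-the-envelope illustration of why a 4-bit weight format matters at this scale, the snippet below compares weight memory at different precisions. The parameter counts follow the article; the bytes-per-parameter figures are generic assumptions, not NVIDIA's numbers, and ignore optimiser state, activations and other overheads.

```python
# Back-of-the-envelope weight-memory estimate at different precisions.
# Parameter counts follow the article; the precision figures are generic
# assumptions and ignore optimizer state, activations and overheads.
def weight_memory_gb(params: float, bits_per_param: float) -> float:
    return params * bits_per_param / 8 / 1e9

for name, params in [("Super (~100B)", 100e9), ("Ultra (~500B)", 500e9)]:
    for fmt, bits in [("FP16/BF16", 16), ("FP8", 8), ("4-bit (e.g. NVFP4)", 4)]:
        print(f"{name:14s} {fmt:18s} ~{weight_memory_gb(params, bits):6.0f} GB")
```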
Adoption across enterprises and startups
A growing list of early adopters is already integrating Nemotron models into production workflows. These include Accenture, Cadence, CrowdStrike, Cursor, Deloitte, EY, Oracle Cloud Infrastructure, Palantir, Perplexity, ServiceNow, Siemens, Synopsys and Zoom, spanning sectors such as manufacturing, cybersecurity, software development, media and communications.
“NVIDIA and ServiceNow have been shaping the future of AI for years, and the best is yet to come,” said Bill McDermott, chairman and CEO of ServiceNow. “Today, we’re taking a major step forward in empowering leaders across all industries to fast-track their agentic AI strategy. ServiceNow’s intelligent workflow automation combined with NVIDIA Nemotron 3 will continue to define the standard with unmatched efficiency, speed and accuracy.”
Nemotron 3 is also positioned to support hybrid AI strategies, where tasks are routed between frontier proprietary models and efficient open models within a single workflow. According to NVIDIA, this approach allows developers to balance reasoning quality and cost by matching each task to the most appropriate model.
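A minimal sketch of what such task routing could look like is below. The model names, the complexity flag and the call_model stub are hypothetical placeholders; production routers typically rely on a learned classifier or the calling agent's own judgment rather than a hand-written rule.

```python
# Minimal sketch of routing tasks between an efficient open model and a
# frontier proprietary model. Model names, the complexity heuristic and the
# call_model stub are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class Task:
    prompt: str
    needs_deep_reasoning: bool  # e.g. multi-step planning or research

OPEN_MODEL = "nemotron-3-nano"           # placeholder identifier
FRONTIER_MODEL = "proprietary-frontier"  # placeholder identifier

def pick_model(task: Task) -> str:
    """Send routine work to the efficient open model, escalate the rest."""
    return FRONTIER_MODEL if task.needs_deep_reasoning else OPEN_MODEL

def call_model(model: str, prompt: str) -> str:
    # Stub standing in for an actual inference call.
    return f"[{model}] response to: {prompt}"

tasks = [
    Task("Summarise this incident report", needs_deep_reasoning=False),
    Task("Plan a multi-quarter migration strategy", needs_deep_reasoning=True),
]
for t in tasks:
    print(call_model(pick_model(t), t.prompt))
```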
“Perplexity is built on the idea that human curiosity will be amplified by accurate AI built into exceptional tools, like AI assistants,” said Aravind Srinivas, CEO of Perplexity. “With our agent router, we can direct workloads to the best fine-tuned open models, like Nemotron 3 Ultra, or leverage leading proprietary models when tasks benefit from their unique capabilities — ensuring our AI assistants operate with exceptional speed, efficiency and scale.”
Startups are also exploring Nemotron 3 as a foundation for building AI teammates and collaborative systems. Portfolio companies backed by General Catalyst, Mayfield and Sierra Ventures are evaluating the models to accelerate development from prototype to enterprise deployment.
“NVIDIA’s open model stack and the NVIDIA Inception program give early-stage companies the models, tools and a cost-effective infrastructure to experiment, differentiate and scale fast,” said Navin Chaddha, managing partner at Mayfield. “Nemotron 3 gives founders a running start on building agentic AI applications and AI teammates, and helps them tap into NVIDIA’s massive installed base.”
Open datasets, tools and availability
Alongside the models, NVIDIA released three trillion tokens of new Nemotron pretraining, post-training and reinforcement learning datasets. These datasets include reasoning, coding and multi-step workflow examples, as well as the Nemotron Agentic Safety Dataset, which provides real-world telemetry to help teams evaluate and improve the safety of complex agent systems.
The company also introduced NeMo Gym and NeMo RL, open-source libraries that provide training environments and post-training capabilities for Nemotron models, along with NeMo Evaluator for validating safety and performance. The models are supported by tools such as LM Studio, llama.cpp, SGLang and vLLM, with additional integrations planned through partners including Prime Intellect and Unsloth.
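Because the models plug into standard open-source serving stacks, local experimentation can look like an ordinary vLLM workflow. The sketch below assumes vLLM is installed and uses a placeholder repository name, since the exact Nemotron 3 model identifier is not specified here.

```python
# Sketch of serving an open checkpoint with vLLM. The model identifier is a
# placeholder; substitute the actual Nemotron 3 repository name once published.
from vllm import LLM, SamplingParams

llm = LLM(model="nvidia/nemotron-3-nano")  # placeholder model id
params = SamplingParams(temperature=0.2, max_tokens=256)

outputs = llm.generate(["Summarise the key risks in this deployment plan:"], params)
print(outputs[0].outputs[0].text)
```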
Nemotron 3 Nano is available today through model repositories and inference service providers, as well as via NVIDIA’s NIM microservice for secure deployment on NVIDIA-accelerated infrastructure. Nemotron 3 Super and Ultra are expected to be available in the first half of 2026.
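NIM microservices expose an OpenAI-compatible endpoint, so calling a deployed instance can follow the familiar chat-completions pattern. In the sketch below the base URL, credential handling and model name are placeholders for whatever a given NIM deployment advertises.

```python
# Sketch of calling a NIM deployment through its OpenAI-compatible API.
# The base URL, API key and model name are placeholders; use the values
# advertised by your actual NIM instance.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # placeholder: your NIM endpoint
    api_key="not-needed-for-local-nim",   # placeholder credential
)

resp = client.chat.completions.create(
    model="nvidia/nemotron-3-nano",       # placeholder model id
    messages=[{"role": "user", "content": "Draft a triage plan for this bug report."}],
)
print(resp.choices[0].message.content)
```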