Tuesday, 16 December 2025

NVIDIA debuts Nemotron 3 family of open models for agentic AI

NVIDIA launches the open Nemotron 3 AI model family, targeting efficient, transparent multi-agent systems across enterprise and startup use cases.

NVIDIA has unveiled the Nemotron 3 family of open models, a new portfolio designed to support transparent, efficient and specialised development of agentic artificial intelligence across a wide range of industries. Available in Nano, Super and Ultra sizes, the Nemotron 3 family introduces a hybrid latent mixture-of-experts architecture aimed at enabling reliable multi-agent AI systems at scale while controlling costs and improving accuracy.

The launch reflects a broader shift in enterprise AI adoption, as organisations move beyond single-model chatbots towards collaborative systems made up of multiple AI agents working together. These systems promise more advanced reasoning and automation but also introduce new challenges, including communication overhead, context drift, rising inference costs and the need for greater transparency. NVIDIA positions Nemotron 3 as a direct response to these pressures, combining open access with performance and efficiency.

“Open innovation is the foundation of AI progress,” said Jensen Huang, founder and CEO of NVIDIA. “With Nemotron, we’re transforming advanced AI into an open platform that gives developers the transparency and efficiency they need to build agentic systems at scale.”

NVIDIA said Nemotron 3 also aligns with its sovereign AI strategy, supporting governments and organisations in regions such as Europe and South Korea that want to deploy AI systems tailored to their own data, regulatory frameworks and societal values using open and auditable models.

Architecture designed for multi-agent efficiency

At the core of Nemotron 3 is a hybrid mixture-of-experts design that selectively activates subsets of model parameters depending on the task. This approach allows the models to scale to large parameter counts while keeping inference efficient, making them suitable for workflows that involve many agents operating in parallel.
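To make the idea of selective activation concrete, here is a minimal sketch of generic top-k expert routing. It is not NVIDIA's implementation, and the article does not describe Nemotron 3's gating in detail. The function names, the eight toy experts and the gate scores are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalised over the chosen experts so they sum to 1.
    """
    probs = softmax(gate_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in chosen)
    return [(i, probs[i] / mass) for i in chosen]


def moe_forward(x, experts, gate_scores, k=2):
    """Combine outputs of the selected experts only; the others never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


# Toy experts: scalar functions standing in for expert feed-forward blocks
# (hypothetical; real experts are neural network layers).
experts = [lambda x, m=m: m * x for m in range(1, 9)]  # 8 experts
# Hypothetical gate scores; in a real model a learned router produces these.
gate_scores = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

selected = route_top_k(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)
print(selected, output)
```

Only two of the eight toy experts are evaluated for this input, which is the property that lets total parameter count grow without a matching rise in per-token compute.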

Nemotron 3 Nano is a 30-billion-parameter model that activates up to 3 billion parameters at a time, targeting tasks where efficiency matters most, such as software debugging, content summarisation, AI assistant workflows and information retrieval. NVIDIA said Nano delivers up to four times higher token throughput than Nemotron 2 Nano and reduces reasoning-token generation by as much as 60 percent, significantly lowering inference costs. The model also supports a context window of up to one million tokens, enabling it to retain and connect information across long, multi-step tasks.

Independent benchmarking organisation Artificial Analysis ranked Nemotron 3 Nano as the most open and efficient model in its size category, citing leading accuracy among comparable open models.

Nemotron 3 Super and Nemotron 3 Ultra are aimed at more demanding multi-agent scenarios. Super is a high-accuracy reasoning model with approximately 100 billion parameters and up to 10 billion active per token, optimised for low-latency collaboration between agents. Ultra scales to around 500 billion parameters with up to 50 billion active per token, positioning it as a deep reasoning engine for complex workflows such as research, planning and advanced decision-making.
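As a quick check on the figures quoted for the three sizes, the short calculation below works out what share of each model's parameters is active at most per token. The Super and Ultra totals are the approximate numbers given in the article, so the results are indicative only.

```python
# (total parameters, maximum active parameters per token), in billions,
# as quoted in the article. Super and Ultra figures are approximate.
MODELS = {"Nano": (30, 3), "Super": (100, 10), "Ultra": (500, 50)}


def active_fraction(total_b, active_b):
    """Share of parameters that is active at most for a single token."""
    return active_b / total_b


fractions = {name: active_fraction(t, a) for name, (t, a) in MODELS.items()}

for name, frac in fractions.items():
    print(f"{name}: up to {frac:.0%} of parameters active per token")
```

On these figures, each model activates at most about a tenth of its parameters for any one token.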

Both Super and Ultra are trained using NVIDIA’s 4-bit NVFP4 format on the Blackwell architecture, reducing memory requirements and accelerating training without sacrificing accuracy compared with higher-precision methods. NVIDIA said this enables larger models to be trained on existing infrastructure more cost-effectively.
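A back-of-envelope calculation shows why a 4-bit format matters at this scale. The sketch below counts raw weight storage only and makes several assumptions: it uses the article's approximate 500-billion-parameter figure for Ultra, picks 16 bits as the higher-precision comparison point, and ignores the scaling factors a block-scaled 4-bit format also stores, as well as optimiser state, gradients and activations. Real training memory is considerably larger than either number.

```python
def weight_bytes(num_params, bits_per_param):
    """Raw storage for the weights alone, in bytes."""
    return num_params * bits_per_param / 8


ULTRA_PARAMS = 500e9  # approximate figure from the article

# Assumption: compare 4-bit storage against a 16-bit baseline.
fp4_gb = weight_bytes(ULTRA_PARAMS, 4) / 1e9
sixteen_bit_gb = weight_bytes(ULTRA_PARAMS, 16) / 1e9

print(f"4-bit weights:  {fp4_gb:.0f} GB")
print(f"16-bit weights: {sixteen_bit_gb:.0f} GB")
print(f"Reduction:      {sixteen_bit_gb / fp4_gb:.0f}x")
```

Under these simplifications the weights shrink from roughly 1,000 GB to roughly 250 GB, a fourfold reduction before any format overhead is counted.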

Adoption across enterprises and startups

A growing list of early adopters is already integrating Nemotron models into production workflows. These include Accenture, Cadence, CrowdStrike, Cursor, Deloitte, EY, Oracle Cloud Infrastructure, Palantir, Perplexity, ServiceNow, Siemens, Synopsys and Zoom, spanning sectors such as manufacturing, cybersecurity, software development, media and communications.

“NVIDIA and ServiceNow have been shaping the future of AI for years, and the best is yet to come,” said Bill McDermott, chairman and CEO of ServiceNow. “Today, we’re taking a major step forward in empowering leaders across all industries to fast-track their agentic AI strategy. ServiceNow’s intelligent workflow automation combined with NVIDIA Nemotron 3 will continue to define the standard with unmatched efficiency, speed and accuracy.”

Nemotron 3 is also positioned to support hybrid AI strategies, where tasks are routed between frontier proprietary models and efficient open models within a single workflow. According to NVIDIA, this approach allows developers to balance reasoning quality and cost by matching each task to the most appropriate model.

“Perplexity is built on the idea that human curiosity will be amplified by accurate AI built into exceptional tools, like AI assistants,” said Aravind Srinivas, CEO of Perplexity. “With our agent router, we can direct workloads to the best fine-tuned open models, like Nemotron 3 Ultra, or leverage leading proprietary models when tasks benefit from their unique capabilities — ensuring our AI assistants operate with exceptional speed, efficiency and scale.”
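The routing idea can be sketched in a few lines. This is an illustrative toy, not Perplexity's agent router or any NVIDIA API: the model labels, the `difficulty` score and the threshold are all hypothetical, and a production router would estimate task demands with a classifier or learned policy rather than a hand-set number.

```python
from dataclasses import dataclass

# Placeholder labels, not real model identifiers.
OPEN_MODEL = "efficient-open-model"
PROPRIETARY_MODEL = "frontier-proprietary-model"


@dataclass(frozen=True)
class Task:
    prompt: str
    difficulty: float  # hypothetical 0..1 estimate of reasoning demand


def choose_model(task, threshold=0.7):
    """Send demanding tasks to the proprietary model, the rest to the open one."""
    return PROPRIETARY_MODEL if task.difficulty >= threshold else OPEN_MODEL


tasks = [
    Task("Summarise this support ticket", difficulty=0.2),
    Task("Plan a multi-stage data migration", difficulty=0.9),
]

assignments = {t.prompt: choose_model(t) for t in tasks}
for prompt, model in assignments.items():
    print(f"{prompt!r} -> {model}")
```

The design choice is the one the article describes: cheaper open models handle routine work, and the more expensive model is reserved for tasks judged to need it.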

Startups are also exploring Nemotron 3 as a foundation for building AI teammates and collaborative systems. Portfolio companies backed by General Catalyst, Mayfield and Sierra Ventures are evaluating the models to accelerate development from prototype to enterprise deployment.

“NVIDIA’s open model stack and the NVIDIA Inception program give early-stage companies the models, tools and a cost-effective infrastructure to experiment, differentiate and scale fast,” said Navin Chaddha, managing partner at Mayfield. “Nemotron 3 gives founders a running start on building agentic AI applications and AI teammates, and helps them tap into NVIDIA’s massive installed base.”

Open datasets, tools and availability

Alongside the models, NVIDIA released three trillion tokens of new Nemotron pretraining, post-training and reinforcement learning datasets. These datasets include reasoning, coding and multi-step workflow examples, as well as the Nemotron Agentic Safety Dataset, which provides real-world telemetry to help teams evaluate and improve the safety of complex agent systems.

The company also introduced NeMo Gym and NeMo RL, open-source libraries that provide training environments and post-training capabilities for Nemotron models, along with NeMo Evaluator for validating safety and performance. The models are supported by tools such as LM Studio, llama.cpp, SGLang and vLLM, with additional integrations planned through partners including Prime Intellect and Unsloth.

Nemotron 3 Nano is available today through model repositories and inference service providers, as well as via NVIDIA’s NIM microservice for secure deployment on NVIDIA-accelerated infrastructure. Nemotron 3 Super and Ultra are expected to be available in the first half of 2026.
