Mistral AI has introduced the Mistral 3 family of open-source multilingual and multimodal models, developed to run efficiently across Nvidia’s supercomputing and edge platforms. The launch marks a closer partnership between the two companies as they work to advance large-scale AI for enterprise use.
Expanding model capabilities across cloud and edge
The new Mistral 3 range includes models designed for both frontier-level performance and compact edge deployment. The flagship model, Mistral Large 3, uses a mixture-of-experts architecture that activates only the most relevant parts of the network for each token. This approach is intended to improve efficiency while maintaining accuracy, allowing enterprises to scale AI systems without excessive compute demands.
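For readers unfamiliar with the technique, the sketch below illustrates the general top-k routing pattern behind mixture-of-experts layers. The dimensions, expert count and PyTorch layer choices are illustrative placeholders, not details of Mistral Large 3 itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Minimal top-k mixture-of-experts layer (illustrative only)."""

    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts)  # learned router
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):  # x: (n_tokens, d_model)
        weights, idx = self.gate(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run for each token, so per-token compute
        # scales with *active* parameters rather than total parameters.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out

print(ToyMoELayer()(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

Because the gate selects only two of the eight toy experts per token, most of the layer's weights sit idle on any single forward pass, which is the property that lets a very large total parameter count coexist with a modest per-token compute bill.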
Mistral Large 3 features 675 billion total parameters, of which 41 billion (roughly 6%) are active for any given token, along with a 256,000-token context window. According to the company, these specifications enable high scalability and adaptability across demanding enterprise workloads. The models are available across cloud environments, data centres and edge devices from 2 December.
Mistral AI describes this release as part of an emerging phase of distributed intelligence, where models can operate flexibly across a wide range of hardware while bridging the gap between research innovation and practical deployment.
Performance gains through Nvidia-optimised architecture
The partnership leverages Nvidia's GB200 NVL72 systems alongside the Mistral 3 model architecture to improve performance across large AI workloads. By tapping into the coherent memory domain provided by Nvidia NVLink and into expert-parallelism optimisations, the mixture-of-experts design can use hardware resources more efficiently at scale.
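Expert parallelism generally means sharding the experts themselves across GPUs and sending each token's activations to whichever device holds its chosen expert. The toy below simulates only the dispatch step in plain Python, with placeholder sizes; in production this grouping feeds an all-to-all exchange over NVLink, which a single-process sketch cannot show.

```python
import numpy as np

N_EXPERTS, N_DEVICES, N_TOKENS = 8, 4, 16

# Expert parallelism shards the expert weights across devices:
# experts 0-1 live on GPU 0, experts 2-3 on GPU 1, and so on.
device_of_expert = {e: e * N_DEVICES // N_EXPERTS for e in range(N_EXPERTS)}

# Pretend the router has already picked one expert per token.
rng = np.random.default_rng(0)
token_expert = rng.integers(0, N_EXPERTS, size=N_TOKENS)

# Dispatch step: group token indices by the device owning their expert.
# A real system feeds these buckets into an all-to-all collective so each
# GPU receives exactly the tokens its resident experts must process.
buckets = {d: [] for d in range(N_DEVICES)}
for tok, exp in enumerate(token_expert):
    buckets[device_of_expert[int(exp)]].append(tok)

for dev, toks in buckets.items():
    print(f"GPU {dev} receives tokens {toks}")
```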
These gains are further supported by the low-precision NVFP4 number format and Nvidia Dynamo's disaggregated inference optimisations. Together, these improvements aim to raise training and inference throughput without affecting model accuracy.
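NVFP4 itself is a hardware number format (4-bit floats with a shared scale per small block of values), so software can only approximate the idea. The sketch below quantises a tensor onto the 4-bit E2M1 value grid with one scale per 16-element block, purely to show why blockwise scaling preserves accuracy better than a single global scale; it is not the actual NVFP4 encode path.

```python
import numpy as np

# Magnitudes representable by a 4-bit E2M1 float (sign handled separately).
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
BLOCK = 16  # NVFP4 attaches one scale factor to each small block of values

def quantize_fp4_blockwise(x):
    """Toy blockwise FP4 quantiser: scale each block so its largest value
    maps to 6.0, then snap everything to the nearest FP4 magnitude."""
    x = x.reshape(-1, BLOCK)
    scale = np.abs(x).max(axis=1, keepdims=True) / FP4_GRID[-1]
    scale[scale == 0] = 1.0          # avoid dividing an all-zero block
    scaled = x / scale
    # Nearest-neighbour rounding onto the FP4 grid, preserving sign.
    idx = np.abs(np.abs(scaled)[..., None] - FP4_GRID).argmin(axis=-1)
    return (np.sign(scaled) * FP4_GRID[idx] * scale).ravel()

x = np.random.randn(64).astype(np.float32)
err = np.abs(x - quantize_fp4_blockwise(x)).mean()
print(f"mean abs quantisation error: {err:.4f}")
```

Because each block gets its own scale, an outlier in one block cannot crush the resolution available to the rest of the tensor, which is the intuition behind block-scaled formats like NVFP4.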
On the GB200 NVL72 platform, Mistral Large 3 delivered a tenfold performance increase compared with the earlier Nvidia H200 generation. The improvement is expected to help businesses reduce per-token costs, enhance energy efficiency and improve overall user experience.
Bringing AI to the edge with compact models
Alongside the flagship model, Mistral AI has released nine small language models under the Ministral 3 suite. These compact models are designed to run on Nvidia’s edge platforms, including RTX PCs and laptops, the Nvidia Spark platform and Jetson devices.
Nvidia is also working with popular open-source frameworks such as Llama.cpp and Ollama to optimise performance across its GPUs. Developers can already test the Ministral 3 models through these tools, enabling fast and efficient execution on local hardware.
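As a hedged illustration of that local workflow, the snippet below uses the Ollama Python client. The model tag is a placeholder, since the exact name under which Ministral 3 appears in the Ollama model library may differ.

```python
# Requires a local Ollama install plus the Python client: pip install ollama
import ollama

# Placeholder tag -- check the Ollama model library for the published name.
MODEL = "ministral-3"

response = ollama.chat(
    model=MODEL,
    messages=[{"role": "user",
               "content": "Summarise mixture-of-experts in one sentence."}],
)
print(response["message"]["content"])
```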
The entire Mistral 3 family is openly available, giving researchers and developers freedom to customise and build on the models. Nvidia’s NeMo tools, including Data Designer, Customizer, Guardrails and the NeMo Agent Toolkit, offer additional pathways for enterprises to refine models for specific applications and accelerate deployment from early prototypes to production systems.
To support consistent performance from cloud to edge, Nvidia has also optimised several inference frameworks — TensorRT-LLM, SGLang and vLLM — for the new model family. The models are accessible on major open-source platforms and cloud providers, with deployment as Nvidia NIM microservices expected soon.
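For vLLM, a minimal offline-inference sketch might look like the following; the Hugging Face repository id is hypothetical and should be replaced with the model's published name.

```python
# Requires vLLM (pip install vllm) and a suitable GPU.
from vllm import LLM, SamplingParams

# Hypothetical repository id -- substitute the real checkpoint name.
MODEL_ID = "mistralai/Ministral-3"

llm = LLM(model=MODEL_ID)
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(
    ["What does a 256,000-token context window enable?"], params)
print(outputs[0].outputs[0].text)
```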