Wednesday, 10 September 2025

NVIDIA Blackwell Ultra sets new benchmark in MLPerf inference tests

NVIDIA’s Blackwell Ultra architecture sets new records in MLPerf Inference v5.1, boosting AI performance and reducing costs for enterprises.

NVIDIA’s Blackwell Ultra architecture has posted record results in the latest MLPerf Inference v5.1 benchmarks. The NVIDIA GB300 NVL72 rack-scale system, powered by Blackwell Ultra, delivered the highest throughput on the new reasoning inference benchmark, with up to 45% more DeepSeek-R1 inference throughput than the previous-generation GB200 NVL72 platform.

Inference performance is central to the economics of AI infrastructure. Higher throughput means more tokens served on the same hardware in the same time, which raises revenue potential, lowers total cost of ownership (TCO) and improves overall productivity. NVIDIA’s latest results reflect its ongoing efforts to push the limits of AI factory performance.
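The link between throughput and cost can be made concrete with a little arithmetic. The sketch below is illustrative only: the hourly cost and baseline throughput are hypothetical figures, not numbers from the benchmark, and it assumes the faster system costs the same to run per hour. The 1.45 factor corresponds to the "up to 45%" uplift reported for DeepSeek-R1.

```python
def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    """Cost of serving one million tokens at a steady throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000


# Hypothetical inputs, chosen only to show the arithmetic.
HOURLY_COST_USD = 100.0
BASELINE_TOKENS_PER_SECOND = 10_000.0
UPLIFT = 1.45  # "up to 45%" more throughput, taken at its best case

baseline = cost_per_million_tokens(HOURLY_COST_USD, BASELINE_TOKENS_PER_SECOND)
faster = cost_per_million_tokens(HOURLY_COST_USD, BASELINE_TOKENS_PER_SECOND * UPLIFT)

# Fractional reduction in cost per token; equals 1 - 1/UPLIFT.
saving = 1 - faster / baseline
print(f"cost per million tokens: {baseline:.2f} -> {faster:.2f} USD")
print(f"saving per token: {saving:.1%}")
```

Under these assumptions, 45% more throughput at the same running cost works out to roughly 31% lower cost per token, since the saving is 1 - 1/1.45 rather than 45%.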

Enhanced architecture and full-stack optimisation

The Blackwell Ultra architecture builds on its predecessor with significant improvements. Compared with Blackwell, each GPU offers 1.5 times the NVFP4 AI compute and twice the attention-layer acceleration. It also carries up to 288GB of HBM3e memory per GPU, providing greater capacity for large-scale AI workloads.
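To give a rough sense of what 288GB per GPU means for a large model, the following back-of-the-envelope sketch estimates the memory taken by model weights alone at different precisions. It is a simplification under stated assumptions: it uses a 405-billion-parameter model as the example, counts 1GB as 10^9 bytes, and ignores quantisation scale factors, the key-value cache and activations, all of which add to real memory use.

```python
import math


def weight_footprint_gb(num_parameters: float, bits_per_parameter: float) -> float:
    """Memory needed for the weights only, in GB (10^9 bytes)."""
    return num_parameters * bits_per_parameter / 8 / 1e9


HBM_PER_GPU_GB = 288   # per-GPU capacity quoted in the article
NUM_PARAMETERS = 405e9  # example model size, used here for illustration

for bits in (16, 8, 4):
    footprint = weight_footprint_gb(NUM_PARAMETERS, bits)
    # Minimum GPU count by capacity alone; real deployments use more for throughput.
    min_gpus = math.ceil(footprint / HBM_PER_GPU_GB)
    print(f"{bits:>2}-bit weights: {footprint:6.1f} GB, at least {min_gpus} GPU(s) by capacity")
```

The point of the estimate is only that 4-bit weights for a model of this size fit within a single GPU's memory, whereas 16-bit weights do not. It says nothing about how the benchmark systems were actually configured.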

A key factor in these results is NVIDIA’s full-stack co-design approach, which integrates hardware and software innovations. Blackwell and Blackwell Ultra incorporate hardware acceleration for NVFP4, NVIDIA’s custom 4-bit floating point format. According to NVIDIA, the format delivers better accuracy than other FP4 formats and results comparable to higher-precision options. NVIDIA TensorRT Model Optimizer was used to quantise models such as DeepSeek-R1, Llama 3.1 405B, and Llama 2 70B to NVFP4. Combined with the open-source TensorRT-LLM library, this enabled higher performance without sacrificing accuracy.
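The general idea behind a block-scaled 4-bit floating point format can be sketched in a few lines. This is not NVIDIA's implementation or the NVFP4 specification: the block size of 16 is an assumption for illustration, the per-block scale is kept as an ordinary Python float rather than a compact encoding, and the function names are hypothetical. The grid of eight magnitudes is what a 4-bit float with one sign bit, two exponent bits and one mantissa bit (E2M1) can represent.

```python
# Magnitudes representable by a 4-bit E2M1 float (sign bit handled separately).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantize_block(values):
    """Map one block onto the E2M1 grid using a single shared scale.

    Returns (scale, levels), where each level is a signed grid value.
    A real format would store a 4-bit code per value instead of a float.
    """
    peak = max(abs(v) for v in values)
    # Scale so the largest magnitude in the block lands on the top grid value.
    scale = peak / E2M1_GRID[-1] if peak > 0 else 1.0
    levels = []
    for v in values:
        magnitude = min(E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
        levels.append(-magnitude if v < 0 else magnitude)
    return scale, levels


def dequantize_block(scale, levels):
    """Recover approximate values from the shared scale and grid levels."""
    return [level * scale for level in levels]


def fake_quantize(values, block_size=16):
    """Quantise then dequantise a list block by block, to inspect the error."""
    restored = []
    for start in range(0, len(values), block_size):
        scale, levels = quantize_block(values[start:start + block_size])
        restored.extend(dequantize_block(scale, levels))
    return restored


weights = [0.1 * i for i in range(-8, 8)]  # 16 sample values in [-0.8, 0.7]
restored = fake_quantize(weights)
worst_error = max(abs(r - w) for r, w in zip(restored, weights))
print(f"worst absolute error: {worst_error:.4f}")
```

Because each small block gets its own scale, an outlier in one block does not degrade the resolution available to values elsewhere, which is one reason block scaling helps low-bit formats hold accuracy.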

The company also highlighted disaggregated serving, a technique that separates context processing and token generation for large language models. This method was crucial to record-breaking results on the Llama 3.1 405B Interactive benchmark, nearly doubling performance per GPU compared with traditional serving approaches.
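A toy sketch can show the shape of disaggregated serving. It is a conceptual illustration only, with no real model and no relation to how NVIDIA's software implements the technique: the class and function names are hypothetical, the "cache" is a plain list standing in for attention state, and token generation is a placeholder. What it does show is the split into two stages with a hand-off between them.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Request:
    request_id: int
    prompt_tokens: list
    max_new_tokens: int
    kv_cache: list = field(default_factory=list)       # stand-in for attention state
    output_tokens: list = field(default_factory=list)


def prefill_worker(request: Request) -> Request:
    """Context phase: process the whole prompt once and build the cache."""
    request.kv_cache = list(request.prompt_tokens)
    return request


def decode_worker(request: Request) -> Request:
    """Generation phase: emit one token per step, extending the cache."""
    while len(request.output_tokens) < request.max_new_tokens:
        next_token = len(request.kv_cache)  # placeholder for a model forward pass
        request.output_tokens.append(next_token)
        request.kv_cache.append(next_token)
    return request


def serve(requests):
    """Run every request through prefill, hand it off, then decode it."""
    prefill_queue = deque(requests)
    decode_queue = deque()
    while prefill_queue:
        # In a real deployment this hand-off moves the cache between GPU pools.
        decode_queue.append(prefill_worker(prefill_queue.popleft()))
    finished = []
    while decode_queue:
        finished.append(decode_worker(decode_queue.popleft()))
    return finished


results = serve([Request(request_id=1, prompt_tokens=[10, 11, 12], max_new_tokens=2)])
print(results[0].output_tokens)
```

The design choice being illustrated is that the two phases have different profiles: context processing handles many tokens at once, while generation produces one token per step. Keeping them in separate pools lets each be sized and tuned independently instead of competing for the same GPUs.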

Broad industry adoption and availability

NVIDIA’s ecosystem of partners also contributed to the strong benchmark results. Submissions came from cloud providers, server makers and other partners, including Azure, Broadcom, Cisco, CoreWeave, Dell Technologies, HPE, Lenovo, Oracle, Supermicro, and the University of Florida. These results indicate that the performance of NVIDIA’s AI platform is available across a wide range of systems and services.

The company made its first benchmark submissions using the NVIDIA Dynamo inference framework, further strengthening its position in AI optimisation. Organisations deploying AI applications can now leverage these advances through major cloud platforms and server vendors, benefiting from reduced TCO and higher returns on investment.

NVIDIA’s record-breaking results in MLPerf Inference v5.1 reaffirm its leadership in AI computing. By delivering stronger performance and efficiency, Blackwell Ultra provides a compelling platform for enterprises building and scaling next-generation AI applications.
