Wednesday, 23 July 2025

Alibaba unveils upgraded Qwen3 model, surpasses OpenAI and DeepSeek in maths and coding

Alibaba’s upgraded Qwen3 model beats OpenAI and DeepSeek in maths and coding, cementing China’s role in global AI development.

Alibaba Group Holding has released a significantly upgraded version of its Qwen3 large language model (LLM), positioning the Chinese tech giant ahead of competitors OpenAI and DeepSeek in several key benchmarks. The new model, named Qwen3-235B-A22B-Instruct-2507-FP8, was announced on 16 July through the AI community platforms Hugging Face and ModelScope, the latter being Alibaba’s own open-source initiative.

According to Alibaba, the updated model has shown “significant improvements in general capabilities”, particularly in areas such as instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage.

One of the most notable achievements of the new Qwen3 model was its score of 70.3 points on the 2025 American Invitational Mathematics Examination. This result outpaced DeepSeek-V3-0324, released in March, which scored 46.6, and OpenAI’s GPT-4o-0327, which managed just 26.7 points.

In programming, the Qwen3 model scored 87.9 on the MultiPL-E benchmark. This places it ahead of DeepSeek’s 82.2 and OpenAI’s 82.7, though it narrowly trails Anthropic’s Claude Opus 4 Non-thinking model, which scored 88.5.
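For readers who want to see the gaps side by side, the reported scores can be compared with a few lines of Python. This is only an illustrative sketch: the figures are the ones quoted above, while the dictionary layout, the short labels "DeepSeek" and "OpenAI" for the MultiPL-E entries, and the helper name `margins` are assumptions made for this example.

```python
# Benchmark scores as quoted in the article. The data layout and the
# helper below are illustrative assumptions, not an official tool.
QWEN = "Qwen3-235B-A22B-Instruct-2507-FP8"

AIME_2025 = {
    QWEN: 70.3,
    "DeepSeek-V3-0324": 46.6,
    "GPT-4o-0327": 26.7,
}

MULTIPL_E = {
    QWEN: 87.9,
    "DeepSeek": 82.2,  # the article does not name the exact model here
    "OpenAI": 82.7,    # the article does not name the exact model here
    "Claude Opus 4 Non-thinking": 88.5,
}


def margins(scores: dict[str, float], reference: str) -> dict[str, float]:
    """Return the reference model's lead over every other model, in points.

    A positive value means the reference model scored higher; a negative
    value means it scored lower.
    """
    ref_score = scores[reference]
    return {
        name: round(ref_score - score, 1)
        for name, score in scores.items()
        if name != reference
    }


aime_margins = margins(AIME_2025, QWEN)
multipl_e_margins = margins(MULTIPL_E, QWEN)
```

On these figures, the Qwen3 model leads by 23.7 and 43.6 points on the mathematics exam, leads by 5.7 and 5.2 points on MultiPL-E, and sits 0.6 points behind Claude Opus 4 Non-thinking.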

Model upgrades and expanded capabilities

The upgraded Qwen3-235B-A22B-Instruct-2507-FP8 builds on a previous version known as Qwen3-235B-A22B-FP8, enhancing its capabilities across a range of applications. However, it functions only in what is referred to as a “non-thinking” mode. In this setting, the AI generates outputs directly without showing its reasoning process, unlike models designed for step-by-step logical thought.

Despite this, the model’s capacity to process information has increased substantially. Its token limit has expanded eightfold to 256,000, allowing it to handle significantly longer texts in a single interaction, which could be helpful in scenarios requiring extended analysis or multi-turn conversations.

Additionally, Alibaba announced that a separate Qwen model with three billion parameters will be embedded into HP’s AI assistant, “Xiaowei Hui”, across its personal computers sold in China. This integration is expected to boost the assistant’s performance in tasks such as document drafting and meeting summarisation.

Global recognition and ongoing competition

The Qwen3 family, introduced in late April, comprises models with parameter sizes ranging from 600 million to 235 billion. The largest of these, Qwen3-235B-A22B-No-Thinking, is currently recognised as the third-best open-source AI model globally. It ranks behind Kimi K2, developed by Chinese start-up Moonshot AI, and DeepSeek’s R1-0528, a fine-tuned, reasoning-focused model.

Recent rankings from Hugging Face also reflect the growing prominence of Qwen models in China’s AI landscape. According to its June assessment, three out of the top ten Chinese LLMs were part of the Qwen series, underlining Alibaba’s competitive edge in the country’s burgeoning AI sector.

Jensen Huang, CEO of Nvidia, highlighted China’s strong performance in open-source AI during a visit last week. Speaking amid renewed business activity between the US and China following a June breakthrough in trade discussions, Huang said that Alibaba’s Qwen, along with models from DeepSeek and Moonshot, represented “the best open reasoning models in the world today”, describing them as “very advanced”.
