Sunday, 14 December 2025

Alibaba unveils upgraded Qwen3 model, surpasses OpenAI and DeepSeek in maths and coding

Alibaba’s upgraded Qwen3 model beats OpenAI and DeepSeek in maths and coding, cementing China’s role in global AI development.

Alibaba Group Holding has released a significantly upgraded version of its third-generation Qwen3 large language model (LLM), positioning the Chinese tech giant ahead of competitors OpenAI and DeepSeek in several key benchmarks. The new model, named Qwen3-235B-A22B-Instruct-2507-FP8, was announced on 16 July through the AI community platforms Hugging Face and ModelScope, the latter being Alibaba’s open-source initiative.

According to Alibaba, the updated model has shown “significant improvements in general capabilities”, particularly in areas such as instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage.

One of the most notable achievements of the new Qwen3 model was its score of 70.3 points on the 2025 American Invitational Mathematics Examination. This result outpaced DeepSeek-V3-0324, released in March, which scored 46.6, and OpenAI’s GPT-4o-0327, which managed just 26.7 points.

In the field of programming, the Qwen3 model earned a score of 87.9 on the MultiPL-E benchmark. This places it slightly ahead of DeepSeek’s 82.2 and OpenAI’s 82.7, though it still trails Anthropic’s Claude Opus 4 Non-thinking model, which achieved a score of 88.5.

Model upgrades and expanded capabilities

The upgraded Qwen3-235B-A22B-Instruct-2507-FP8 builds on a previous version known as Qwen3-235B-A22B-FP8, enhancing its capabilities across a range of applications. However, it functions only in what is referred to as a “non-thinking” mode. In this setting, the AI generates outputs directly without showing its reasoning process, unlike models designed for step-by-step logical thought.

Despite this, the model’s capacity to process information has increased substantially. Its context length, the maximum number of tokens it can handle in a single interaction, has expanded eightfold to 256,000 tokens, allowing it to work with significantly longer texts, which could be helpful in scenarios requiring extended analysis or multi-turn conversations.

Additionally, Alibaba announced that a separate Qwen model with three billion parameters will be embedded into HP’s AI assistant, “Xiaowei Hui”, across its personal computers sold in China. This integration is expected to boost the assistant’s performance in tasks such as document drafting and meeting summarisation.

Global recognition and ongoing competition

The Qwen3 family, introduced in late April, comprises models with parameter sizes ranging from 600 million to 235 billion. Its largest model, the Qwen3-235B-A22B-No-Thinking, is currently recognised as the third-best open-source AI model globally. It is ranked behind Kimi K2, developed by Chinese start-up Moonshot AI, and DeepSeek’s DeepSeek R1-0528, which is a fine-tuned reasoning-focused model.

Recent rankings from Hugging Face also reflect the growing prominence of Qwen models in China’s AI landscape. According to its June assessment, three out of the top ten Chinese LLMs were part of the Qwen series, underlining Alibaba’s competitive edge in the country’s burgeoning AI sector.

Jensen Huang, CEO of Nvidia, highlighted China’s strong performance in open-source AI during a visit last week. Speaking amid renewed business activity between the US and China following a June breakthrough in trade discussions, Huang said that Alibaba’s Qwen, along with models from DeepSeek and Moonshot, represented “the best open reasoning models in the world today”, describing them as “very advanced”.
