Alibaba unveils Wan2.2 open-source video generation models for cinematic content creation

Alibaba launches Wan2.2 MoE-based open-source models to help creators generate cinematic video content with better control and efficiency.

Alibaba has launched Wan2.2, a suite of large video generation models designed to help creators and developers produce cinematic-quality video content with greater ease and control. The new models are the first open-source video generation tools built on the Mixture-of-Experts (MoE) architecture, offering improved efficiency and creative flexibility.

Enhancing video generation with MoE architecture

The Wan2.2 release includes three distinct models: Wan2.2-T2V-A14B (text-to-video), Wan2.2-I2V-A14B (image-to-video), and Wan2.2-TI2V-5B (a hybrid text/image-to-video model). All three models are built on the MoE architecture and trained using carefully curated aesthetic datasets, allowing them to deliver cinematic-style video outputs with fine-tuned artistic control.

The models enable users to customise visual elements such as lighting, colour tone, camera angles, composition, focal length, and time of day. They also deliver realistic representations of complex motion, including facial expressions, hand gestures, and dynamic movement, while better adhering to physical laws and user instructions.

To address the typically high computational costs of video generation, Wan2.2-T2V-A14B and Wan2.2-I2V-A14B use a two-expert denoising design in their diffusion process: one expert handles the high-noise stages to establish the overall scene structure, while the other refines detail and texture. Although each model contains 27 billion parameters in total, only 14 billion are activated per generation step, reducing the overall computational load by up to 50%.
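
The sketch below illustrates the routing idea in PyTorch: each denoising step is dispatched to exactly one of two experts depending on the current noise level, so only one expert's parameters are active per step. The class names, boundary timestep, and tiny stand-in experts are illustrative assumptions, not Wan2.2's actual implementation.

```python
import torch
import torch.nn as nn

class TinyExpert(nn.Module):
    """Stand-in expert; a real expert would be a large video diffusion network."""
    def __init__(self, channels: int = 16):
        super().__init__()
        self.net = nn.Conv3d(channels, channels, kernel_size=3, padding=1)

    def forward(self, latents, t, cond):
        return self.net(latents)

class TwoExpertDenoiser(nn.Module):
    """Routes each denoising step to exactly one expert, so only part of the
    total parameter count is active at any given step."""
    def __init__(self, boundary_t: int = 875):
        super().__init__()
        self.high_noise_expert = TinyExpert()  # early, noisy steps: overall scene layout
        self.low_noise_expert = TinyExpert()   # later steps: detail and texture
        self.boundary_t = boundary_t           # illustrative switch point, not Wan2.2's value

    def forward(self, latents, t, cond):
        if int(t) >= self.boundary_t:
            return self.high_noise_expert(latents, t, cond)
        return self.low_noise_expert(latents, t, cond)

# One denoising step on a (batch, channels, frames, height, width) latent.
denoiser = TwoExpertDenoiser()
latents = torch.randn(1, 16, 4, 32, 32)
out = denoiser(latents, t=torch.tensor(900), cond=None)
print(out.shape)  # torch.Size([1, 16, 4, 32, 32])
```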

Improved performance and creative capabilities

Wan2.2 builds on the foundation of its predecessor, Wan2.1, with a significantly expanded dataset and improved generation capabilities. The training data includes 65.6% more image data and 83.2% more video data compared to the previous version. This allows Wan2.2 to deliver better performance in generating complex scenes, capturing nuanced motion, and producing varied creative styles.

The models support a cinematic prompt system, which categorises and refines user input across key aesthetic dimensions such as lighting, colour, and composition. This system ensures a higher level of interpretability and alignment with users’ visual intentions during video creation.
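
As a rough illustration of how such aesthetic dimensions might be expressed, the snippet below assembles a prompt from labelled controls. The keys and phrasing are invented for illustration and do not reflect Wan2.2's actual prompt schema.

```python
# Illustrative only: the category names and values are assumptions, not
# Wan2.2's actual cinematic prompt vocabulary.
aesthetic_controls = {
    "lighting": "soft golden-hour backlight",
    "colour_tone": "warm, slightly desaturated",
    "composition": "rule of thirds, subject left of frame",
    "camera_angle": "low angle with a slow dolly-in",
    "focal_length": "85mm with shallow depth of field",
    "time_of_day": "late afternoon",
}

subject = "a street musician playing violin in a rain-soaked alley"
prompt = subject + ", " + ", ".join(aesthetic_controls.values())
print(prompt)
```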

Compact hybrid model enables scalability

In addition to the primary MoE models, Alibaba has introduced Wan2.2-TI2V-5B, a compact hybrid model that supports both text and image input. It uses a dense model architecture with a high-compression 3D VAE system that achieves a temporal and spatial compression ratio of 4x16x16. This boosts the overall compression rate to 64, enabling it to generate five-second 720p videos in just a few minutes using a standard consumer-grade GPU.
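
A back-of-the-envelope calculation shows how a 4x16x16 ratio and an overall compression rate of 64 can fit together. The 24 fps frame rate, 1280x720 resolution, and 48-channel latent below are assumptions made for illustration; only the 4x16x16 ratio and the overall rate of 64 come from the article.

```python
# Latent sizing implied by a 4x16x16 (time x height x width) compression ratio.
fps, seconds = 24, 5
frames, height, width, rgb_channels = fps * seconds, 720, 1280, 3
t_ratio, h_ratio, w_ratio, latent_channels = 4, 16, 16, 48  # 48 channels is an assumption

latent_frames = frames // t_ratio   # 120 -> 30
latent_height = height // h_ratio   # 720 -> 45
latent_width = width // w_ratio     # 1280 -> 80

video_elements = rgb_channels * frames * height * width
latent_elements = latent_channels * latent_frames * latent_height * latent_width
print(latent_frames, latent_height, latent_width)  # 30 45 80
print(video_elements // latent_elements)           # 64, matching the quoted overall rate
```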

The Wan2.2 models are available for download through Hugging Face, GitHub, and Alibaba Cloud’s open-source platform ModelScope. Alibaba has been a regular contributor to the global open-source AI community, previously releasing four Wan2.1 models in February 2025 and the Wan2.1-VACE (Video All-in-one Creation and Editing) model in May 2025. Collectively, these models have been downloaded over 5.4 million times from Hugging Face and ModelScope.
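
For readers who want to experiment, here is a minimal sketch of fetching one of the checkpoints with the huggingface_hub library. The repository ID is an assumption based on the model name in the article, so verify it on the Wan-AI organisation page before running.

```python
# The repo_id below is an assumption inferred from the model name; check the
# Wan-AI organisation on Hugging Face for the exact repository ID.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="Wan-AI/Wan2.2-TI2V-5B")
print("Model files downloaded to:", local_dir)
```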
