Monday, 13 October 2025

Alibaba unveils Wan2.2 open-source video generation models for cinematic content creation

Alibaba launches Wan2.2 MoE-based open-source models to help creators generate cinematic video content with better control and efficiency.

Alibaba has launched Wan2.2, a suite of large video generation models designed to help creators and developers produce cinematic-quality video content with greater ease and control. The new models are the first open-source video generation tools built on the Mixture-of-Experts (MoE) architecture, offering improved efficiency and creative flexibility.

Enhancing video generation with MoE architecture

The Wan2.2 release includes three distinct models: Wan2.2-T2V-A14B (text-to-video), Wan2.2-I2V-A14B (image-to-video), and Wan2.2-TI2V-5B (a hybrid text/image-to-video model). All three models are built on the MoE architecture and trained using carefully curated aesthetic datasets, allowing them to deliver cinematic-style video outputs with fine-tuned artistic control.

The models enable users to customise visual elements such as lighting, colour tone, camera angles, composition, focal length, and time of day. They also deliver realistic representations of complex motion, including facial expressions, hand gestures, and dynamic movement, while better adhering to physical laws and user instructions.

To address the typically high computational costs of video generation, Wan2.2-T2V-A14B and Wan2.2-I2V-A14B use a two-expert denoising design in their diffusion process: one expert handles the high-noise stages to establish overall scene structure, while the other refines detail and texture. Although the models contain 27 billion parameters in total, only 14 billion are activated per generation step, reducing the overall computational load by up to 50%.
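The routing idea behind this two-expert design can be sketched in a few lines. Everything below is an illustrative toy, not Alibaba's implementation: the expert functions stand in for the two 14B-parameter networks, and the noise threshold and schedule are invented for the example.

```python
# Toy sketch of two-expert denoising: a high-noise expert lays out coarse
# scene structure early in the diffusion process, and a low-noise expert
# refines detail and texture later. Only one expert runs per step, which
# is why only 14B of the 27B total parameters are ever active at once.
# The threshold, schedule, and expert maths are hypothetical placeholders.

def high_noise_expert(latent, noise_level):
    # Placeholder for the 14B-parameter expert handling overall structure.
    return [x * (1.0 - 0.5 * noise_level) for x in latent]

def low_noise_expert(latent, noise_level):
    # Placeholder for the 14B-parameter expert refining detail and texture.
    return [x * (1.0 - 0.1 * noise_level) for x in latent]

def denoise_step(latent, noise_level, threshold=0.5):
    """Route the step to exactly one expert based on the noise level."""
    if noise_level >= threshold:
        return high_noise_expert(latent, noise_level)
    return low_noise_expert(latent, noise_level)

def run_diffusion(latent, steps=10):
    # Noise decreases from 1.0 towards 0.0, so early steps hit the
    # high-noise expert and later steps hit the low-noise expert.
    for i in range(steps):
        noise_level = 1.0 - i / steps
        latent = denoise_step(latent, noise_level)
    return latent
```

The point of the sketch is the conditional: activating one expert per step keeps per-step compute at the size of a single 14B model even though both experts together hold 27B parameters.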

Improved performance and creative capabilities

Wan2.2 builds on the foundation of its predecessor, Wan2.1, with a significantly expanded dataset and improved generation capabilities. The training data includes 65.6% more image data and 83.2% more video data compared to the previous version. This allows Wan2.2 to deliver better performance in generating complex scenes, capturing nuanced motion, and producing varied creative styles.

The models support a cinematic prompt system, which categorises and refines user input across key aesthetic dimensions such as lighting, colour, and composition. This system ensures a higher level of interpretability and alignment with users’ visual intentions during video creation.
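A system like this can be pictured as structured input assembly: user choices across the named aesthetic dimensions are validated and folded into one prompt. This is a hypothetical sketch only; the dimension names follow the article, but the allowed values and the prompt format are invented for illustration.

```python
# Hypothetical sketch of a cinematic prompt system: choices across
# aesthetic dimensions (lighting, colour, composition) are validated
# against a vocabulary and appended to the base prompt. The vocabulary
# and output format are invented, not Wan2.2's actual interface.

AESTHETIC_DIMENSIONS = {
    "lighting": {"soft daylight", "golden hour", "neon", "low key"},
    "colour": {"warm", "cool", "desaturated", "high contrast"},
    "composition": {"rule of thirds", "symmetrical", "close-up"},
}

def build_cinematic_prompt(subject, **choices):
    """Validate each aesthetic choice and fold it into one prompt string."""
    parts = [subject]
    for dimension, value in choices.items():
        allowed = AESTHETIC_DIMENSIONS.get(dimension)
        if allowed is None or value not in allowed:
            raise ValueError(f"unsupported {dimension!r} setting: {value!r}")
        parts.append(f"{dimension}: {value}")
    return ", ".join(parts)

prompt = build_cinematic_prompt(
    "a fishing boat at sea",
    lighting="golden hour",
    colour="warm",
    composition="rule of thirds",
)
print(prompt)
# a fishing boat at sea, lighting: golden hour, colour: warm, composition: rule of thirds
```

Constraining each dimension to a known vocabulary is one plausible way such a system achieves the "interpretability and alignment" described above: every part of the final prompt maps back to an explicit user choice.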

Compact hybrid model enables scalability

In addition to the primary MoE models, Alibaba has introduced Wan2.2-TI2V-5B, a compact hybrid model that supports both text and image input. It uses a dense model architecture with a high-compression 3D VAE system that achieves a temporal and spatial compression ratio of 4×16×16. This boosts the overall compression rate to 64, enabling it to generate five-second 720p videos in just a few minutes using a standard consumer-grade GPU.
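The 4×16×16 ratio means the VAE downsamples time by 4 and each spatial dimension by 16. A quick calculation shows what that does to a five-second 720p clip; the 24 fps frame rate and the ceiling-based rounding are illustrative assumptions, and channel count and any extra patchification are ignored.

```python
import math

def latent_shape(frames, height, width, t_ratio=4, s_ratio=16):
    """Latent grid size under a 4x16x16 temporal/spatial compression.
    Rounding behaviour is an assumption; channels are ignored."""
    return (math.ceil(frames / t_ratio),
            math.ceil(height / s_ratio),
            math.ceil(width / s_ratio))

# A five-second 720p (1280x720) clip at an assumed 24 fps:
t, h, w = latent_shape(frames=5 * 24, height=720, width=1280)
print(t, h, w)  # 30 45 80

# Per-voxel reduction: 120*720*1280 input positions vs 30*45*80 latents.
voxels_in = 5 * 24 * 720 * 1280
voxels_out = t * h * w
print(voxels_in // voxels_out)  # 1024 = 4 * 16 * 16
```

Note that the raw voxel-count ratio works out to 4 × 16 × 16 = 1024; the lower overall figure of 64 quoted above presumably also accounts for factors such as the latent channel dimension, which this sketch leaves out.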

The Wan2.2 models are available for download through Hugging Face, GitHub, and Alibaba Cloud’s open-source platform ModelScope. Alibaba has been a regular contributor to the global open-source AI community, previously releasing four Wan2.1 models in February 2025 and the Wan2.1-VACE (Video All-in-one Creation and Editing) model in May 2025. Collectively, these models have been downloaded over 5.4 million times from Hugging Face and ModelScope.
