Monday, 13 October 2025

Alibaba introduces open-source model for digital human video generation

Alibaba launches open-source Wan2.2-S2V model, enabling lifelike digital human video generation from portraits and audio.

Alibaba has unveiled Wan2.2-S2V, an open-source speech-to-video model designed to generate digital human videos. The technology enables users to convert portrait photos into film-quality avatars capable of speaking, singing, and performing, broadening the possibilities for professional content creation.

Expanding video creation capabilities

Part of the Wan2.2 video generation series, Wan2.2-S2V lets creators generate animated video from a single image and an audio clip. It supports multiple framing options, including portrait, bust, and full-body perspectives, and can dynamically generate character actions and environmental details based on text prompts.

The model is powered by advanced audio-driven animation technology that delivers natural and expressive performances, from dialogue to musical pieces. It also supports scenes featuring multiple characters and a wide range of avatars, including cartoon, animal, and stylised designs.

To meet varied production needs, the tool offers output resolutions of 480P and 720P, making it suitable for both professional presentations and social media content.

Combining innovation and efficiency

Wan2.2-S2V improves upon traditional talking-head animation by merging text-guided global motion control with audio-driven fine-grained local movements. This combination allows for expressive and lifelike performances across complex scenarios.

A notable advancement lies in its frame processing approach. By compressing historical frames of any length into a single latent representation, the model reduces computational demands and ensures stability in long-video generation, addressing a common challenge for extended animated productions.
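The idea of folding a history of any length into one fixed-size representation can be illustrated with a toy Python sketch. This is not Alibaba's method: the article does not describe the actual compression mechanism, so a simple running mean stands in for it here, and the names `HistoryCompressor`, `compress_history`, and `LATENT_DIM` are hypothetical. The sketch only shows why memory stays constant as the number of past frames grows.

```python
from typing import List

LATENT_DIM = 4  # hypothetical latent size, chosen only for illustration


class HistoryCompressor:
    """Toy stand-in that folds any number of frame vectors into one latent.

    Assumption: a running mean replaces whatever learned compression
    the real model uses. Only the fixed-size property is illustrated.
    """

    def __init__(self, dim: int = LATENT_DIM) -> None:
        self.dim = dim
        self.count = 0
        self.latent = [0.0] * dim  # size never changes, however long the history

    def update(self, frame: List[float]) -> None:
        """Fold one more historical frame into the single latent vector."""
        if len(frame) != self.dim:
            raise ValueError("frame has the wrong dimension")
        self.count += 1
        # Incremental mean: latent += (frame - latent) / count
        self.latent = [
            m + (x - m) / self.count for m, x in zip(self.latent, frame)
        ]


def compress_history(frames: List[List[float]]) -> List[float]:
    """Compress a list of frame vectors of any length into one latent."""
    dim = len(frames[0]) if frames else LATENT_DIM
    compressor = HistoryCompressor(dim)
    for frame in frames:
        compressor.update(frame)
    return compressor.latent
```

Whether two frames or a thousand are passed to `compress_history`, the returned vector has the same length, so the cost of conditioning on the past does not grow with video duration in this simplified picture.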

Alibaba’s research team also built a large-scale audio-visual dataset tailored to film and television scenarios to train the model. Using multi-resolution training, it supports video creation in diverse formats, from short-form vertical content to conventional horizontal film and television outputs.

Commitment to open-source community

The Wan2.2-S2V model is available for download on Hugging Face, GitHub, and Alibaba Cloud’s ModelScope. Alibaba has been steadily contributing to the open-source ecosystem, having released the Wan2.1 models in February 2025 and the Wan2.2 models in July 2025. Together, the Wan series has recorded over 6.9 million downloads across Hugging Face and ModelScope.

Alibaba said the release reflects its ongoing efforts to support professional creators with advanced AI tools while contributing to the wider developer community.
