Wednesday, 27 August 2025

Alibaba introduces open-source model for digital human video generation

Alibaba launches open-source Wan2.2-S2V model, enabling lifelike digital human video generation from portraits and audio.

Alibaba has unveiled Wan2.2-S2V, an open-source speech-to-video model designed to generate digital human videos. The technology enables users to convert portrait photos into film-quality avatars capable of speaking, singing, and performing, broadening the possibilities for professional content creation.

Expanding video creation capabilities

Part of the Wan2.2 video generation series, Wan2.2-S2V lets creators generate animated video from a single image and an audio clip. It supports multiple framing options, including portrait, bust, and full-body perspectives, and can dynamically generate character actions and environmental details based on text prompts.

The model is powered by advanced audio-driven animation technology that delivers natural and expressive performances, from dialogue to musical pieces. It also supports scenes featuring multiple characters and a wide range of avatars, including cartoon, animal, and stylised designs.

To meet varied production needs, the tool offers output resolutions of 480P and 720P, making it suitable for both professional presentations and social media content.

Combining innovation and efficiency

Wan2.2-S2V improves upon traditional talking-head animation by merging text-guided global motion control with audio-driven fine-grained local movements. This combination allows for expressive and lifelike performances across complex scenarios.

A notable advancement lies in its frame processing approach. By compressing historical frames of any length into a single latent representation, the model reduces computational demands and ensures stability in long-video generation, addressing a common challenge for extended animated productions.
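The idea of folding a history of any length into one fixed-size summary can be illustrated with a toy sketch. The article does not describe how Wan2.2-S2V actually performs this compression, so the example below swaps in a plain running mean over per-frame feature vectors; the function name `compress_history` and the two-dimensional "frames" are hypothetical and chosen only for illustration. What it shows is the property the article highlights: the summary stays the same size whether the history holds two frames or a thousand.

```python
from typing import List, Sequence


def compress_history(frames: Sequence[Sequence[float]]) -> List[float]:
    """Fold any number of per-frame feature vectors into one fixed-size vector.

    A running mean is used purely as a stand-in. The real model's
    compression is learned and is not specified in the article.
    """
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    summary = [0.0] * dim
    for count, frame in enumerate(frames, start=1):
        if len(frame) != dim:
            raise ValueError("all frames must have the same dimension")
        # Incremental mean: memory stays O(dim) however long the history is.
        summary = [s + (x - s) / count for s, x in zip(summary, frame)]
    return summary


# Hypothetical histories of very different lengths.
short_history = [[1.0, 2.0], [3.0, 4.0]]
long_history = [[float(i), float(2 * i)] for i in range(1000)]

short_summary = compress_history(short_history)
long_summary = compress_history(long_history)

# Both summaries have the same size regardless of history length.
print(len(short_summary), len(long_summary))
```

Because the summary never grows, the cost of conditioning on past frames does not increase as a video gets longer, which is the kind of saving the article attributes to the model's approach.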

Alibaba’s research team also built a large-scale audio-visual dataset tailored to film and television scenarios to train the model. Using multi-resolution training, it supports video creation in diverse formats, from short-form vertical content to conventional horizontal film and television outputs.

Commitment to open-source community

The Wan2.2-S2V model is available for download on Hugging Face, GitHub, and Alibaba Cloud’s ModelScope. Alibaba has been steadily contributing to the open-source ecosystem, having released the Wan2.1 models in February 2025 and the Wan2.2 models in July 2025. In total, the Wan series has recorded over 6.9 million downloads across Hugging Face and ModelScope.

Alibaba said the release reflects its ongoing efforts to support professional creators with advanced AI tools while contributing to the wider developer community.
