Wednesday, 30 April 2025

Microsoft’s AI could soon make your photos talk and sing

Explore how Microsoft's new AI tool VASA-1 can bring your photos to life by creating realistic videos of them talking and singing.

Microsoft Research Asia has unveiled VASA-1, an experimental AI tool that can turn a still image or drawing of a person into a realistic video in which they appear to talk or sing. Given a single image and an existing audio file, the tool generates facial expressions, head movements, and lip movements closely synced to the audio’s speech or song.

On the project’s webpage, you can find numerous examples that showcase how lifelike these animations can be. Although some lip and head movements might still look a bit mechanical and not perfectly in sync, the overall effect is convincing enough that it could easily be mistaken for real footage.

There is significant potential for misuse, particularly in the creation of deepfake videos, and Microsoft’s researchers are aware of it. They have decided against releasing any public demos, APIs, or additional implementation details until they can be sure the tool will be used responsibly and in accordance with stringent regulations. They have not, however, described specific safeguards against malicious uses such as deepfake pornography or misinformation campaigns.

Despite these concerns, the technology promises several beneficial applications. It could enhance educational equity and improve accessibility for individuals with communication challenges by giving them access to an avatar that can communicate on their behalf. Additionally, this tool could provide companionship and therapeutic support, especially in programmes that offer interactions with AI-powered characters.

VASA-1 was trained on the VoxCeleb2 dataset, which contains over 1 million utterances from 6,112 celebrities, extracted from YouTube videos. It works not just on real faces but also on artistic ones. One amusing example animates the Mona Lisa to an audio clip of Anne Hathaway’s viral “Paparazzi” rap, which she performed in the style of Lil Wayne.
