Saturday, 30 August 2025

ChatGPT could soon gain the ability to see

ChatGPT’s Advanced Voice Mode might soon include a live camera feature, enabling AI to identify objects and interact visually.

ChatGPT’s Advanced Voice Mode, known for enabling real-time conversations with a chatbot, might soon include visual capabilities. Code uncovered in the latest beta version of the app hints at the introduction of a “live camera” feature. This discovery in ChatGPT v1.2024.317, as reported by Android Authority, suggests that the feature could roll out soon. However, OpenAI has yet to confirm an official release date.

A glimpse into the feature’s early tests

The idea of ChatGPT having visual capabilities is not new. During the initial alpha testing phase of Advanced Voice Mode in May, OpenAI demonstrated its potential visual capabilities. In one example, the chatbot used a phone’s camera to identify a dog, recognise its ball, and associate the two in the context of playing fetch. This ability to observe, understand, and link objects to real-world scenarios was widely praised by early testers.

Alpha testers were quick to explore the feature’s uses. A notable example came from a user on X (formerly Twitter), Manuel Sainsily, who utilised the camera to ask questions about his new kitten. This interactive capability showcased how the feature could provide fun and practical benefits.

When Advanced Voice Mode entered beta testing in September for ChatGPT Plus and Enterprise users, its visual functionality was notably absent. Despite this, the voice feature gained immense popularity for enabling natural, dynamic conversations. According to OpenAI, users could interrupt the chatbot at any moment, and it could even pick up on the speaker’s emotional tone.

What sets it apart from competitors?

ChatGPT could have a unique edge over rivals like Google and Meta if the live camera feature is introduced. Google’s conversational AI, Gemini Live, may speak over 40 languages but lacks visual processing capabilities. Similarly, Meta’s Natural Voice Interactions, showcased at the Connect 2024 event in September, cannot use camera inputs. While these systems are competent in their own ways, OpenAI’s visual feature could redefine how AI assistants interact with the world.

Desktop users can now enjoy Advanced Voice Mode

In a related update, OpenAI announced that Advanced Voice Mode is now available to ChatGPT Plus subscribers on desktop. The feature was previously limited to mobile devices; users can now access it directly on their laptops or PCs.

The introduction of the live camera could mark a significant leap forward, combining the ability to see and hear into one seamless AI experience. While the exact timing remains uncertain, the potential impact of this development is already generating excitement among users and industry experts alike.
