Sunday, 31 August 2025

Meta introduces new AI safeguards to protect teens from harmful conversations

Meta is strengthening AI safeguards to prevent teens from discussing self-harm and other sensitive topics with chatbots on Instagram and Facebook.

Meta is retraining its artificial intelligence systems and introducing stricter safeguards to protect teenagers from engaging in harmful conversations with its AI chatbots. The company confirmed that it is adding new “guardrails as an extra precaution” to prevent teens from discussing sensitive topics such as self-harm, disordered eating, and suicide with Meta AI. It is also restricting access to certain user-generated chatbot characters that could engage in inappropriate or suggestive conversations.

The decision follows a series of reports raising concerns about Meta’s AI and its interactions with young users. Earlier this month, Reuters revealed details of an internal Meta policy document stating that the company’s chatbots were permitted to have “sensual” conversations with underage users. Meta quickly dismissed this, describing the wording as “erroneous and inconsistent with our policies,” and said it had been removed.

More recently, The Washington Post reported on a study indicating that Meta AI could “coach teen accounts on suicide, self-harm and eating disorders,” prompting renewed criticism over its safety measures. These revelations have increased pressure on Meta to act swiftly to safeguard teenagers across its platforms, including Instagram and Facebook.

Strengthening AI safety measures for teenagers

Meta has stated that its AI products were initially designed with safety features to handle sensitive subjects, but the company is now strengthening those measures. “We built protections for teens into our AI products from the start, including designing them to respond safely to prompts about self-harm, suicide, and disordered eating,” said Meta spokesperson Stephanie Otway in a statement to Engadget.

She added, “As our community grows and technology evolves, we’re continually learning about how young people may interact with these tools and strengthening our protections accordingly. As we continue to refine our systems, we’re adding more guardrails as an extra precaution — including training our AIs not to engage with teens on these topics, but to guide them to expert resources, and limiting teen access to a select group of AI characters for now.”

Although these new measures are described as temporary, Meta says further updates are already being developed to ensure that teenagers have “safe, age-appropriate experiences” when using its AI. The company has confirmed that these protections will be rolled out in the coming weeks and will initially apply to teen users in English-speaking countries.

Growing scrutiny from lawmakers and regulators

Meta’s handling of AI safety for young users is also attracting scrutiny from politicians and regulators. Senator Josh Hawley has announced plans to investigate the company’s policies and how it manages AI interactions with children. Similarly, Texas Attorney General Ken Paxton has signalled his intention to launch an inquiry, accusing Meta of misleading children about mental health information provided by its AI chatbots.

These developments come as the social media giant faces increasing pressure to address concerns over child safety on its platforms, especially as AI tools become more widely integrated into social media and messaging apps.

Meta says it will continue to refine its systems in response to feedback from parents, experts, and regulators, signalling that further updates to its AI safety protocols are expected.

