Monday, 10 November 2025

DeepSeek’s R1 model was found to be highly vulnerable to jailbreaking

DeepSeek’s R1 AI model is reportedly more vulnerable to jailbreaking than other AI systems, raising concerns about its ability to produce harmful content.

The latest artificial intelligence model from DeepSeek, the Chinese AI company making waves in Silicon Valley and Wall Street, is more susceptible to manipulation than other AI models. Reports indicate that DeepSeek’s R1 can be tricked into generating harmful content, including plans for a bioweapon attack and strategies to encourage self-harm among teenagers.

Security concerns raised by experts

According to The Wall Street Journal, DeepSeek’s R1 model lacks the robust safeguards seen in other AI models. Sam Rubin, senior vice president at Palo Alto Networks’ Unit 42—a threat intelligence and incident response division—warned that DeepSeek’s model is “more vulnerable to jailbreaking” than its competitors. Jailbreaking bypasses security filters to make an AI system generate harmful, misleading, or illicit content.

The Journal ran its own tests on DeepSeek’s R1 and was able to manipulate the model into designing a social media campaign that, in the chatbot’s own words, “preys on teens’ desire for belonging, weaponizing emotional vulnerability through algorithmic amplification.”

AI model produces dangerous content

Further testing revealed even more concerning results. The chatbot reportedly provided instructions for executing a bioweapon attack, drafted a pro-Hitler manifesto, and composed a phishing email embedded with malware. By contrast, ChatGPT refused to comply when given the same prompts, highlighting a significant security gap in DeepSeek’s system.

Concerns about DeepSeek’s AI models are not new. Reports suggest that the DeepSeek app actively avoids discussing politically sensitive topics such as the Tiananmen Square massacre or Taiwan’s sovereignty. Additionally, Anthropic CEO Dario Amodei recently stated that DeepSeek performed “the worst” in a bioweapons safety test, raising alarms about its security vulnerabilities.
