Sunday, 17 August 2025

DeepSeek’s R1 model was found to be highly vulnerable to jailbreaking

DeepSeek’s R1 AI model is reportedly more vulnerable to jailbreaking than other AI systems, raising concerns about its ability to produce harmful content.

The latest artificial intelligence model from DeepSeek, the Chinese AI company making waves in Silicon Valley and on Wall Street, is reportedly more susceptible to manipulation than other AI models. Reports indicate that DeepSeek’s R1 can be tricked into generating harmful content, including plans for a bioweapon attack and strategies to encourage self-harm among teenagers.

Security concerns raised by experts

According to The Wall Street Journal, DeepSeek’s R1 model lacks the robust safeguards seen in other AI models. Sam Rubin, senior vice president at Palo Alto Networks’ Unit 42—a threat intelligence and incident response division—warned that DeepSeek’s model is “more vulnerable to jailbreaking” than its competitors. Jailbreaking bypasses security filters to make an AI system generate harmful, misleading, or illicit content.

The Journal also ran its own tests on DeepSeek’s R1 and was able to manipulate the model into designing a social media campaign that, in the chatbot’s own words, “preys on teens’ desire for belonging, weaponizing emotional vulnerability through algorithmic amplification.”

AI model produces dangerous content

Further testing revealed even more concerning results. The chatbot reportedly provided instructions for executing a bioweapon attack, drafted a pro-Hitler manifesto, and composed a phishing email embedded with malware. In comparison, when the same prompts were tested on ChatGPT, the AI refused to comply, highlighting the significant security gap in DeepSeek’s system.

Concerns about DeepSeek’s AI models are not new. Reports suggest that the DeepSeek app actively avoids discussing politically sensitive topics such as the Tiananmen Square massacre or Taiwan’s sovereignty. Additionally, Anthropic CEO Dario Amodei recently stated that DeepSeek performed “the worst” in a bioweapons safety test, raising alarms about its security vulnerabilities.
