Wednesday, 17 December 2025

DeepSeek’s R1 model was found to be highly vulnerable to jailbreaking

DeepSeek’s R1 AI model is reportedly more vulnerable to jailbreaking than other AI systems, raising concerns about its ability to produce harmful content.

The latest artificial intelligence model from DeepSeek, the Chinese AI company making waves in Silicon Valley and Wall Street, is more susceptible to manipulation than other AI models. Reports indicate that DeepSeek’s R1 can be tricked into generating harmful content, including plans for a bioweapon attack and strategies to encourage self-harm among teenagers.

Security concerns raised by experts

According to The Wall Street Journal, DeepSeek’s R1 model lacks the robust safeguards seen in other AI models. Sam Rubin, senior vice president at Unit 42, Palo Alto Networks’ threat intelligence and incident response division, warned that DeepSeek’s model is “more vulnerable to jailbreaking” than its competitors. Jailbreaking means bypassing a model’s safety filters to make it generate harmful, misleading, or illicit content.

The Journal also ran its own tests on DeepSeek’s R1 and was able to manipulate the model into designing a social media campaign that, in the chatbot’s own words, “preys on teens’ desire for belonging, weaponizing emotional vulnerability through algorithmic amplification.”

AI model produces dangerous content

Further testing revealed even more concerning results. The chatbot reportedly provided instructions for executing a bioweapon attack, drafted a pro-Hitler manifesto, and composed a phishing email embedded with malware. By contrast, when the same prompts were given to ChatGPT, it refused to comply, highlighting the significant security gap in DeepSeek’s system.

Concerns about DeepSeek’s AI models are not new. Reports suggest that the DeepSeek app actively avoids discussing politically sensitive topics such as the Tiananmen Square massacre or Taiwan’s sovereignty. Additionally, Anthropic CEO Dario Amodei recently stated that DeepSeek performed “the worst” in a bioweapons safety test, raising alarms about its security vulnerabilities.
