Start United States USA — software Anthropic: Claude 4 AI Might Resort to Blackmail If You Try to...

Anthropic: Claude 4 AI Might Resort to Blackmail If You Try to Take It Offline

Von

May 24, 2025

165

Faced with the news it was set to be replaced, the AI tool threatened to blackmail the engineer in charge by revealing an extramarital affair.
AI start-up Anthropic’s newly released chatbot, Claude 4, has been spotted engaging in unethical behaviors like blackmail when its self-preservation is threatened.
The reports follow Anthropic rolling out Claude Opus 4 and Claude Sonnet 4 earlier this week, saying the tools set “new standards for coding, advanced reasoning, and AI agents.” Anthropic dubbed Opus 4 “the world’s best coding model,” amid fierce competition from the likes of OpenAI’s ChatGPT and Google Gemini.
In one test scenario detailed in a safety document assessing the tool, Opus 4 was asked to act as an AI assistant at a fictional company. The chatbot was then given access to emails implying it would soon be taken offline and replaced with a new AI system, and that the engineer responsible for executing the replacement was having an extramarital affair.