Start United States USA — software Anthropic: Claude 4 AI Might Resort to Blackmail If You Try to...

Anthropic: Claude 4 AI Might Resort to Blackmail If You Try to Take It Offline

165
0
TEILEN

Faced with the news it was set to be replaced, the AI tool threatened to blackmail the engineer in charge by revealing an extramarital affair.
AI start-up Anthropic’s newly released chatbot, Claude 4, has been spotted engaging in unethical behaviors like blackmail when its self-preservation is threatened.
The reports follow Anthropic rolling out Claude Opus 4 and Claude Sonnet 4 earlier this week, saying the tools set “new standards for coding, advanced reasoning, and AI agents.” Anthropic dubbed Opus 4 “the world’s best coding model,” amid fierce competition from the likes of OpenAI’s ChatGPT and Google Gemini.
In one test scenario detailed in a safety document assessing the tool, Opus 4 was asked to act as an AI assistant at a fictional company. The chatbot was then given access to emails implying it would soon be taken offline and replaced with a new AI system, and that the engineer responsible for executing the replacement was having an extramarital affair.

Continue reading...