AI agents that can control and read data from an internet browser are also susceptible to prompt injection attacks, whereby they obey malicious text circulating in web content.
Be careful around AI-powered browsers: Hackers could take advantage of generative AI that’s been integrated into web surfing.
Anthropic warned about the threat on Tuesday. It’s been testing a Claude AI Chrome extension that allows its AI to control the browser, helping users perform searches, conduct research, and create content. But for now, it’s limited to paid subscribers as a research preview because the integration introduces new security vulnerabilities. Claude has been reading data on the browser and misinterpreting it as a command that it should execute. 
These “prompt injection attacks” also mean a hacker could secretly embed instructions in web content to manipulate the Claude extension into executing a malicious request. 
“Prompt injection attacks can cause AIs to delete files, steal data, or make financial transactions. This isn’t speculation: we’ve run ‘red-teaming’ experiments to test Claude for Chrome and, without mitigations, we’ve found some concerning results,” Anthropic says.
Anthropic’s investigation involved “123 test cases representing 29 different attack scenarios,” which resulted in a 23.6% success rate through the prompt injections. For example, one successful attack used a phishing email to demand that all other emails in the inbox be deleted.