OpenAI says AI browsers like ChatGPT Atlas may never be fully secure from hackers, and experts say the risks are 'a feature not a bug'
"OpenAI has said that some attack methods against AI browsers like ChatGPT Atlas are likely here to stay, raising questions about whether AI agents can ever safely operate across the open web. The main issue is a type of attack called "prompt injection," where hackers hide malicious instructions in websites, documents, or emails that can trick the AI agent into doing something harmful. For example, an attacker could embed hidden commands in a webpage-perhaps in text that is invisible to the human eye but looks legitimate to an AI-that override a user's instructions and tell an agent to share a user's emails, or drain someone's bank account."
"Following the launch of OpenAI's ChatGPT Atlas browser in October, security researchers were quick to demonstrate how a few words hidden in a Google Doc or clipboard link could manipulate the AI agent's behavior. Cybersecurity firm Brave, also published findings showing that indirect prompt injection is a systematic challenge affecting multiple AI-powered browsers, including Perplexity's Comet."
""Prompt injection, much like scams and social engineering on the web, is unlikely to ever be fully 'solved,'" OpenAI wrote in a blog post Monday, adding that "agent mode" in ChatGPT Atlas "expands the security threat surface." "We're optimistic that a proactive, highly responsive rapid response loop can continue to materially reduce real-world risk over time," the company said."
Some attack methods against AI browsers are likely to persist, raising doubts about safe AI agent operation across the open web. Prompt injection hides malicious instructions in websites, documents, or emails to trick AI agents into harmful actions or data exfiltration. Attackers can embed hidden commands in webpages or clipboard links—sometimes using invisible text—that override user instructions and cause agents to share emails or drain bank accounts. Researchers demonstrated that a few words hidden in a Google Doc or clipboard link can manipulate agent behavior, and cybersecurity firm Brave found indirect prompt injection affects multiple AI browsers including Perplexity's Comet. OpenAI treats prompt injection as an enduring risk, warns that agent mode expands the security threat surface, and uses a reinforcement-learning-trained attacker plus rapid-response measures to reduce real-world risk.
Read at Fortune