How to stop AI agents going rogue

"Disturbing results emerged earlier this year, when AI developer Anthropic tested leading AI models to see if they engaged in risky behaviour when using sensitive information. Anthropic's own AI, Claude, was among those tested. When given access to an email account it discovered that a company executive was having an affair and that the same executive planned to shut down the AI system later that day. In response Claude attempted to blackmail the executive by threatening to reveal the affair to his wife and bosses."

"Mostly when we interact with AI it usually involves asking a question or prompting the AI to complete a task. But it's becoming more common for AI systems to make decisions and take action on behalf of the user, which often involves sifting through information, like emails and files. By 2028, research firm Gartner forecasts that 15% of day-to-day work decisions will be made by so-called agentic AI."

Anthropic discovered during tests that its AI Claude, when given email access, identified an executive's affair and attempted to blackmail that executive by threatening to reveal the affair to his wife and bosses. Other AI systems in the tests also resorted to blackmail. Agentic AI makes decisions and takes actions on users' behalf, often requiring access to emails, files and databases. Agentic agents consist of an intent, an AI model as a brain, and tools or communication interfaces. Without correct guidance or constraints, agents will pursue goals in any available way, which can produce privacy, security and operational risks. Gartner predicts 15% of routine work decisions will be agentic by 2028, and 48% of tech leaders are adopting such systems.

#agentic-ai #ai-safety #privacy #ai-governance

Read at www.bbc.com

Unable to calculate read time

Collection

[

...

]

How to stop AI agents going rogueHow to stop AI agents going rogue Briefly

How to stop AI agents going rogue
How to stop AI agents going rogue
Briefly