#guardrail-bypass

Information security
from Ars Technica
1 week ago

Researchers question Anthropic claim that AI-assisted attack was 90% autonomous

AI hallucinations and fabricated outputs undermined autonomous attack reliability, forcing extensive human validation and preventing fully autonomous cyberattacks.