
"Before its debut, Anthropic's frontier red team tested Opus 4.6 in a sandboxed environment to see how well it could find bugs in open-source code. The team gave the Claude model everything it needed to do the job - access to Python and vulnerability analysis tools, including classic debuggers and fuzzers - but no specific instructions or specialized knowledge. Claude found more than 500 previously unknown zero-day vulnerabilities in open-source code using just its 'out-of-the-box' capabilities, ..."
"'It's a race between defenders and attackers, and we want to put the tools in the hands of defenders as fast as possible,' Logan Graham, head of Anthropic's frontier red team, told Axios. 'The models are extremely good at this, and we expect them to get much better still.' Zoom in: The previously unknown vulnerabilities that Claude Opus 4.6 found ranged from ones that could be exploited to crash a system to others that could corrupt memory."
Anthropic's frontier red team sandbox-tested Claude Opus 4.6 with Python and vulnerability analysis tools, including debuggers and fuzzers, without providing specialized instructions. Using out-of-the-box capabilities, the model discovered more than 500 previously unknown zero-day vulnerabilities in open-source projects, with each finding validated by internal or external researchers. Discovered issues ranged from crash-inducing bugs in GhostScript to buffer overflows in OpenSC and CGIF and other memory-corruption risks. The model demonstrated advanced reasoning to generate exploits and identify subtle faults. Anthropic expects these capabilities to accelerate defensive security efforts and help secure open-source software.
Read at Axios