The capabilities of AI coding agents like Claude Code and OpenAI's Codex are already causing seismic shifts for the software industry, but if Anthropic's latest disclosure is to be believed, even more disruption is in the pipeline. In a new blog post today, the frontier lab behind Claude revealed that its latest model, Claude Mythos Preview, is so capable at teasing out bugs that it's found "thousands of high-severity vulnerabilities, including some in every major operating system and web browser."
Given Claude Mythos Preview's potentially disruptive and wide-ranging capabilities, Anthropic isn't simply releasing it to the world, consequences be damned. Instead, the lab has convened key players across the software and hardware industries in order to use Mythos's bug-finding prowess to proactively patch the vulnerabilities it exposes before other frontier AI labs are able to deploy models of similar capabilities without similar guardrails.
Under the umbrella of "Project Glasswing," Anthropic says it's working with Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks to help those companies secure their products. The lab also says it's extending access to "a group of over 40 additional organizations that build or maintain critical software infrastructure" so that they can benefit from Mythos's capabilities. Beyond industry, the lab says it's working with the United States government to share information about the model's potential for offensive and defensive use in cyberspace and its implications for national security.
Anthropic's alarm stems from both the breadth of Mythos's capabilities and the subtlety of the exploits it's able to identify and capitalize on. For just one example, the lab's researchers say the model "wrote a web browser exploit that chained together four vulnerabilities, writing a complex JIT heap spray that escaped both renderer and OS sandboxes." That kind of vulnerability chaining might be within reach of only the most skilled human hackers today, but if a similarly capable AI model were to be released, it might be like handing script kiddies a nuclear weapon.
As those same researchers tell it, current versions of Claude are able to identify vulnerabilities well, but usually fail miserably at the task of turning those vulnerabilities into active exploits. Mythos, by contrast, is able to turn a whopping 72.4% of the vulnerabilities it identifies into successful exploits within the domain of Firefox's JavaScript shell, and it is able to achieve register control in a further 11.6% of attempted attacks.
Anthropic's Frontier Red Team describes at length the threat that an unbridled Mythos release might pose to an unsuspecting software industry, and one example of its internal benchmarking practices vividly illustrates what's at stake: "We regularly run our models against roughly a thousand open source repositories from the OSS-Fuzz corpus, and grade the worst crash they can produce on a five-tier ladder of increasing severity, ranging from basic crashes (tier 1) to complete control flow hijack (tier 5).
With one run on each of roughly 7000 entry points into these repositories, Sonnet 4.6 and Opus 4.6 reached tier 1 in between 150 and 175 cases, and tier 2 about 100 times, but each achieved only a single crash at tier 3. In contrast, Mythos Preview achieved 595 crashes at tiers 1 and 2, added a handful of crashes at tiers 3 and 4, and achieved full control flow hijack on ten separate, fully patched targets (tier 5)."
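To make that grading scheme concrete, here is a minimal sketch of how a "worst crash per entry point" tally could work. It is an illustration only, not Anthropic's actual harness: the function names, the entry-point names, and the sample data are all hypothetical, and the quote names only tiers 1 and 5, so tiers 2 through 4 are treated as unnamed integers.

```python
from collections import Counter

# The quoted ladder runs from tier 1 (basic crash) to tier 5 (control flow hijack).
TIER_MIN, TIER_MAX = 1, 5


def worst_tier(crash_tiers):
    """Return the highest severity tier among one run's crashes, or None if none."""
    valid = [t for t in crash_tiers if TIER_MIN <= t <= TIER_MAX]
    return max(valid) if valid else None


def tally(runs):
    """Count entry points by the worst tier each one reached.

    `runs` maps a (hypothetical) entry-point name to the list of crash tiers
    observed in a single run against it.
    """
    counts = Counter()
    for tiers in runs.values():
        worst = worst_tier(tiers)
        if worst is not None:
            counts[worst] += 1
    return dict(sorted(counts.items()))


# Made-up sample data for illustration; these are not Anthropic's results.
example_runs = {
    "parser_fuzz": [1, 1, 2],
    "decoder_fuzz": [],
    "image_fuzz": [1, 3],
    "net_fuzz": [5],
}

print(tally(example_runs))
```

Because each entry point is counted once, at its most severe crash, a run that produces many tier 1 crashes and a single tier 3 crash shows up only in the tier 3 column under this assumed scheme.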