Anthropic, the developer of Claude AI, has suffered a breach in which unauthorized individuals gained access to its cybersecurity-focused AI model, Mythos, and which may also have exposed a number of the company's other proprietary AI models, Bloomberg reports. For a company that markets itself as the responsible, safety- and security-first AI developer, the lapse raises questions about how well it can protect its customers' data, and just how good Mythos really is at preventing breaches.
Unfortunately, however capable an AI model may be at finding security-relevant code bugs, it can do little to prevent flaws in third-party tools that haven't been vetted by Mythos, nor can it account for social engineering, which has arguably always been the weakest link in digital security.
They got in through the side door
Anthropic caused a stir among major institutions when it internally unveiled Mythos, claiming the model had found thousands of critical exploits across every major browser and operating system. The 200-plus-page mission statement Anthropic released was heavy on marketing hype venerating its own model, but some organizations have had genuine success using it to sniff out new bugs. Mozilla, for instance, announced that it had used Mythos to find and patch over 270 vulnerabilities in its Firefox browser.
Some older models have been shown to find many of the same bugs, but not as quickly, and possibly not as well. The new model is genuinely faster at coding and at finding vulnerabilities than Claude Opus 4.6, and possibly faster than rival developers' models, too. But it is also good at exploiting those vulnerabilities, which is allegedly why Anthropic limited access to a select number of companies and non-profits.
Because of that, banks and software developers aren't the only parties keen to get an early look at Mythos. A worker at a third-party contractor for Anthropic used their privileged access to the company's services to breach Mythos' protected environment and reach the model, allegedly relying on standard internet sleuthing tools used by cybersecurity researchers.
This worker was then able to open the model up to their colleagues, and a small group of unauthorized users is now said to have accessed Mythos. The group has reportedly not yet run any cybersecurity-related prompts through the model, instead asking it only to perform simple tasks like creating websites, apparently in an effort to keep Anthropic from catching on to who is using Mythos and shutting down the group's access.
This all feels familiar
The group that now has access to Mythos gained such privileged permissions by guessing the model's online location based on knowledge of Anthropic's file systems and the naming formats it used for previous models. They gleaned this information from a recent hack of Mercor, an AI feedback recruitment company that is now facing several class action lawsuits for exposing users' personal information. Mercor has also been losing major business since the breach; most notably, Meta has paused its contracts with the company.