Anthropic’s Mythos breach was humiliating

Anthropic’s tightly controlled rollout of Claude Mythos has taken an awkward turn. After spending weeks insisting the AI model is so capable at cybersecurity that it is too dangerous to release publicly, it appears the model fell into the wrong hands anyway.

According to Bloomberg, a “small group of unauthorized users” has had access to Mythos — whose existence was first revealed in a leak — since the day Anthropic announced plans to offer it to a select group of companies for testing. Anthropic says it is investigating. That’s a rough look for a company that has built its brand on taking AI safety seriously while touting the cybersecurity prowess of its latest model.

From a technological standpoint, the Mythos breach is embarrassingly unsophisticated. Bloomberg reports the group accessed Mythos by making “an educated guess about the model’s online location,” using information about Anthropic’s other models exposed in the breach of Mercor — a company that makes AI training data — along with access one member had through contract work evaluating Anthropic models. The group got unauthorized access to Mythos through a combination of insider knowledge and a lucky guess, not some sophisticated technological exploit or wholesale theft of the model.

Security vulnerabilities are inevitable, and it was Mercor, not Anthropic, that revealed the information the hackers used to guess Mythos’ location. Pia Hüsch, a research fellow at the British think tank Royal United Services Institute (RUSI), told me that no company is ever completely secure and humans are often the weakest link, though it “does initially seem a bit lucky” that there were no serious consequences.

Anthropic failed to anticipate an ‘entirely imaginable’ kind of failure

But it’s not entirely bad luck. These kinds of educated guesses are a very standard hacking technique, and the Mercor breach was already known before Mythos’ release. Security researcher Lukasz Olejnik described it to me as an “entirely imaginable” kind of failure that the cybersecurity industry has been routinely dealing with for the last 20 years. So Anthropic should have anticipated it and should have prepared accordingly, particularly knowing that its information had been compromised.

Anthropic also appears to have had the means to spot the breach. The company is able to “log and track model use,” Olejnik said, which should make it possible to stop unauthorized or malicious access, especially since the Mythos rollout was supposed to be highly limited. Evidently, Anthropic wasn’t monitoring closely enough — and given how dangerous it says the model is, it’s reasonable to ask why.

By Bloomberg’s account, the group was not using Mythos for cybersecurity tasks, partly because they just wanted to mess around with the new model and partly because doing so could have tipped Anthropic off. If Anthropic’s messaging surrounding Mythos is to be taken seriously, that is a lucky break. The company has framed Mythos as a “watershed moment for security,” claiming it found vulnerabilities in “every major operating system and web browser,” and said its release must be coordinated to allow time to “reinforce the world’s cyber defenses.”

Anthropic has a habit of using dramatic, alarming-sounding language that can be tough to interrogate cleanly, including flirting with the idea that its Claude model might be conscious. Even so, early reports from parties with access suggest Mythos is particularly adept in cybersecurity. Mozilla CTO Bobby Holley said it found hundreds of bugs in Firefox 150 and may finally give defenders a chance at complete victory over attackers. Unsurprisingly, governments and financial institutions around the world have been eager to get their hands on it. The NSA and other US agencies reportedly have access despite Anthropic’s designation as a supply chain risk, though the rollout appears to have bypassed the US cybersecurity agency, CISA, so far.

“Anthropic claims to be at the absolute forefront of all these technologies, but also positions itself as the responsible actor in all of this.”

... continue reading

Anthropic&#8217;s Mythos breach was humiliating

Anthropic’s Mythos breach was humiliating