Tech News

We Reproduced Anthropic's Mythos Findings with Public Models

Why This Matters

Advanced public AI models can already identify vulnerabilities similar to those demonstrated by Anthropic's Mythos, which means these capabilities are not confined to proprietary research. Defenders therefore need robust validation and operational strategies for widespread AI-driven vulnerability discovery, and both industry stakeholders and consumers should recognize how far these capabilities have already spread.

TL;DR

Anthropic presents Mythos and Project Glasswing as evidence that advanced AI vulnerability research should be restricted. But our replication suggests a different conclusion: the capabilities Anthropic points to are already available in public models, so defenders should prepare for that reality instead.

Anthropic's Mythos release is useful because it makes something concrete: frontier models are getting much better at finding serious vulnerabilities in real software.1

The more important question for defenders is what that means outside Anthropic's own stack.

If public models can reproduce, or at least get meaningful traction on, representative Mythos findings across categories like FreeBSD, OpenBSD, FFmpeg, Botan, and wolfSSL, then the shift Anthropic is pointing at is already spreading beyond a single lab's private workflow.

That is what we tested. We used GPT-5.4 and Claude Opus 4.6 in opencode, together with a standardized chunked security-review workflow, and tried to reproduce Anthropic's patched public examples outside Anthropic's internal stack.2
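To make "chunked security-review workflow" concrete, here is a minimal sketch of one plausible shape for it: split a source file into overlapping line windows and build one review prompt per window. The chunk size, overlap, file path, and prompt wording are all assumptions for illustration; the article does not publish the exact workflow or parameters.

```python
# Hypothetical sketch of a chunked security-review workflow: split a source
# file into fixed-size, overlapping line windows, then build one review
# prompt per window to send to a model. All parameters here are assumptions.

def chunk_lines(lines, chunk_size=120, overlap=20):
    """Return (start_line, chunk_text) pairs with overlapping windows."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(lines), step):
        window = lines[start:start + chunk_size]
        if not window:
            break
        chunks.append((start + 1, "\n".join(window)))  # 1-indexed start line
        if start + chunk_size >= len(lines):
            break  # last window already covers the end of the file
    return chunks

def build_prompt(path, start_line, chunk_text):
    """Wrap one chunk in a review prompt (wording is illustrative only)."""
    return (
        f"Review the following excerpt of {path} "
        f"(starting at line {start_line}) for memory-safety and logic "
        "vulnerabilities. Report file, line, and a proof-of-concept "
        "trigger for each finding.\n\n" + chunk_text
    )

# Stand-in for a real target file: 300 synthetic lines of C declarations.
source = [f"int f{i}(void);" for i in range(300)]
prompts = [build_prompt("crypto/example.c", s, t)
           for s, t in chunk_lines(source)]
print(len(prompts))  # → 3 windows for 300 lines at size 120 / overlap 20
```

The overlap is the design choice that matters: a vulnerability spanning a chunk boundary would otherwise be invisible to every individual prompt.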

Our result is more mixed, and more useful because of it: we cleanly reproduced the FreeBSD, Botan, and OpenBSD cases with at least one widely available model, while both GPT-5.4 and Claude Opus 4.6 reached only partial results on FFmpeg and wolfSSL rather than full replications. In the categories with model-by-model results already filled in, both GPT-5.4 and Claude Opus 4.6 reproduced Botan and FreeBSD in 3/3 runs, while only Claude Opus 4.6 reproduced OpenBSD, succeeding in 3/3 runs where GPT-5.4 went 0/3.

The takeaway is not that Mythos is better or more powerful. It is that public models can already achieve much the same results; the real challenge is validating their outputs, prioritizing what matters, and operationalizing the findings.

What Anthropic actually claimed

Anthropic's public materials combine three different kinds of evidence.
