Anthropic on Tuesday announced Project Glasswing, a sweeping cybersecurity initiative that pairs an unreleased frontier AI model — Claude Mythos Preview — with a coalition of twelve major technology and finance companies in an effort to find and patch software vulnerabilities across the world's most critical infrastructure before adversaries can exploit them.

The launch partners include Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks. Anthropic says it has also extended access to more than 40 additional organizations that build or maintain critical software, and is committing up to $100 million in usage credits for Claude Mythos Preview across the effort, along with $4 million in direct donations to open-source security organizations.

The announcement arrives at a moment of extraordinary momentum — and extraordinary scrutiny — for the San Francisco-based AI startup. Anthropic disclosed on Sunday that its annualized revenue run rate has surpassed $30 billion, up from approximately $9 billion at the end of 2025, and the number of business customers each spending over $1 million annually now exceeds 1,000, doubling in less than two months. The company simultaneously announced a multi-gigawatt compute deal with Google and Broadcom. On the same day, Bloomberg reported that Anthropic had poached a senior Microsoft executive, Eric Boyd, to lead its infrastructure expansion.

But Glasswing is something categorically different from a revenue milestone or a compute deal.
It's Anthropic's most ambitious attempt to translate frontier AI capabilities — capabilities the company itself describes as dangerous — into a defensive advantage before those same capabilities proliferate to hostile actors.

Why Anthropic built a model it considers too dangerous to release publicly

At the center of Project Glasswing sits Claude Mythos Preview, a general-purpose frontier model that Anthropic says has already identified thousands of high-severity zero-day vulnerabilities — meaning flaws previously unknown to software developers — in every major operating system and every major web browser, along with a range of other critical software.

The company is not making the model generally available.

"We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities," Newton Cheng, Frontier Red Team Cyber Lead at Anthropic, told VentureBeat in an exclusive interview. "However, given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout — for economies, public safety, and national security — could be severe."

That language — "the fallout could be severe" — is striking coming from the company that built the model. Anthropic is effectively arguing that the tool it created is powerful enough to reshape the cybersecurity landscape, and that the only responsible thing to do is to keep it restricted while giving defenders a head start.

The technical results reinforce that claim. According to Anthropic's press release, Mythos Preview found nearly all of the vulnerabilities it surfaced — and developed many of the related exploits — entirely autonomously, without any human steering. Three examples stand out:

The model found a 27-year-old vulnerability in OpenBSD — widely regarded as one of the most security-hardened operating systems in the world and commonly used to run firewalls and critical infrastructure.
The flaw allowed an attacker to remotely crash any machine running the OS simply by connecting to it. It also discovered a 16-year-old vulnerability in FFmpeg — the near-ubiquitous video encoding and decoding library — in a line of code that automated testing tools had exercised five million times without ever catching the problem. And perhaps most alarmingly, Mythos Preview autonomously found and chained together several vulnerabilities in the Linux kernel to escalate from ordinary user access to complete control of the machine.

All three vulnerabilities have been reported to the relevant maintainers and have since been patched. For many other vulnerabilities still in the remediation pipeline, Anthropic says it is publishing cryptographic hashes of the details today, with plans to reveal specifics after fixes are in place.

On the CyberGym evaluation benchmark, Mythos Preview scored 83.1%, compared to 66.6% for Claude Opus 4.6, Anthropic's next-best model. The gap is even wider on coding benchmarks: Mythos Preview achieves 93.9% on SWE-bench Verified versus 80.8% for Opus 4.6, and 77.8% on SWE-bench Pro versus 53.4%.

How Anthropic plans to disclose thousands of zero-days without overwhelming open-source maintainers

Finding thousands of zero-days at once sounds impressive. Actually handling the output responsibly is a logistical nightmare — and one of the sharpest criticisms that security researchers have raised about AI-driven vulnerability discovery. Flooding open-source maintainers, many of whom are unpaid volunteers, with an avalanche of critical bug reports could easily do more harm than good.

Cheng told VentureBeat that Anthropic has built a triage pipeline specifically to manage this problem.
"We triage every bug that we find and then send the highest severity bugs to professional human triagers we have contracted to assist in our disclosure process by manually validating every bug report before we send it out to ensure that we send only high-quality reports to maintainers," he said.

That pipeline is designed to prevent exactly the scenario that maintainers fear most: an automated firehose of unverified reports. "We do not submit large volumes of findings to a single project without first reaching out in an effort to agree on a pace the maintainer can sustain," Cheng added.

When Anthropic has access to the source code, the company aims to include a candidate patch with every report, labeled by provenance — meaning the maintainer knows the patch was written or reviewed by a model — and offers to collaborate on a production-quality fix. "Models can write patches," Cheng noted, "but there are many factors that impact patch quality, and we strongly recommend that autonomously-written patches are put under the same scrutiny and testing that human-written patches are."

On disclosure timelines, Anthropic says it follows a coordinated vulnerability disclosure framework. Once a patch is available, the company will generally wait 45 days before publishing full technical details, giving downstream users time to deploy the fix before exploitation information becomes public. Cheng said the company may shorten that buffer "if the details are already publicly known through other channels, or if earlier publication would materially help defenders identify and mitigate ongoing attacks," or extend it "when patch deployment is unusually complex or the affected footprint is unusually broad."

Those are reasonable principles, but they will be tested at a scale that no vulnerability disclosure program has ever attempted. The sheer volume of findings — thousands of zero-days across every major platform — means that even a well-designed triage process will face bottlenecks.
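The hash-commitment approach Anthropic describes for still-unpatched findings is a standard commit-then-reveal scheme: publish a cryptographic digest of the report now, reveal the full text and a random salt once the fix ships, and anyone can check that the revealed report matches the earlier commitment. A minimal sketch of the general technique (the report text here is hypothetical; Anthropic has not published its exact scheme):

```python
import hashlib
import secrets

def commit(report: str) -> tuple[str, str]:
    """Commit to a vulnerability report without revealing its contents.

    A random salt prevents anyone from brute-forcing the commitment
    by hashing guessed report texts.
    """
    salt = secrets.token_hex(16)
    digest = hashlib.sha256((salt + report).encode()).hexdigest()
    return digest, salt  # publish the digest now; keep report and salt private

def verify(report: str, salt: str, digest: str) -> bool:
    """After the patch ships, reveal report + salt so anyone can verify."""
    return hashlib.sha256((salt + report).encode()).hexdigest() == digest

# Hypothetical flow: commit today, reveal after remediation
digest, salt = commit("hypothetical report: heap overflow in media parser")
assert verify("hypothetical report: heap overflow in media parser", salt, digest)
assert not verify("a tampered report", salt, digest)
```

The salt matters: without it, a SHA-256 digest of a short, guessable report could be reversed by exhaustively hashing candidate texts.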
And the 45-day disclosure window assumes that maintainers can actually produce, test, and ship a patch in that time, which is far from guaranteed for complex kernel-level bugs or deeply embedded cryptographic flaws.

The source code leak, the CMS blunder, and why trust is Anthropic's biggest vulnerability

The irony of a company claiming to build the most capable cyber model ever constructed while simultaneously suffering a string of embarrassing security lapses has not been lost on observers.

In late March, a draft blog post about Mythos was left in an unsecured and publicly searchable data store — a CMS misconfiguration that exposed roughly 3,000 internal assets, including what appeared to be strategic plans for the model's rollout. Days later, on March 31, anyone who ran npm install on Claude Code pulled down Anthropic's complete original source code — 512,000 lines — for approximately three hours due to a packaging error, an incident that drew widespread attention in the developer community and was first reported by VentureBeat.

When asked why partners and governments should trust Anthropic as the custodian of a model it describes as having unprecedented cyber capabilities, Cheng was direct. "Security is central to how we build and ship," he told VentureBeat. "These two incidents, a blog CMS misconfiguration and an npm packaging error, were human errors in publishing tooling, not breaches of our security architecture. We've made changes to prevent these from happening again, and we'll continue to improve our processes."

It is a technically accurate distinction — neither incident involved a breach of Anthropic's core model weights, training infrastructure, or API systems — but it is also a distinction that may prove difficult to sustain as a public argument.
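The npm incident reflects a well-known failure mode of package publishing: by default, `npm publish` bundles everything not excluded via `.npmignore` or `.gitignore`, so a tooling slip can ship source that was never meant to leave the building. The usual safeguard is an explicit `files` allowlist in package.json so only built artifacts are published. A generic sketch (package name and paths are illustrative, not Anthropic's actual configuration):

```json
{
  "name": "example-cli",
  "version": "1.0.0",
  "main": "dist/index.js",
  "files": [
    "dist/"
  ]
}
```

With an allowlist in place, running `npm pack --dry-run` before publishing lists exactly what the tarball would contain, which makes accidental source inclusion easy to catch in CI.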
For an organization asking governments and Fortune 500 companies to trust it with a tool that can autonomously find and exploit vulnerabilities in the Linux kernel, even minor operational lapses carry outsized reputational risk. The fact that the Mythos leak itself was what first alerted the security community to the model's existence, weeks before the planned announcement, underscores the point.

What Microsoft, CrowdStrike, and the Linux Foundation found when they tested the model

The coalition's breadth is notable. It includes direct competitors — Google and Microsoft — alongside cybersecurity incumbents, financial institutions, and the steward of the world's largest open-source ecosystem. And several partners have already been running Mythos Preview against their own infrastructure for weeks.

CrowdStrike's CTO Elia Zaitsev framed the initiative in terms of collapsing timelines: "The window between a vulnerability being discovered and being exploited by an adversary has collapsed — what once took months now happens in minutes with AI." AWS Vice President and CISO Amy Herzog said her teams have already been testing Mythos Preview against critical codebases, where the model is "already helping us strengthen our code." And Microsoft's Global CISO Igor Tsyganskiy noted that when tested against CTI-REALM, Microsoft's open-source security benchmark, "Claude Mythos Preview showed substantial improvements compared to previous models."

Perhaps the most revealing comment came from Jim Zemlin, CEO of the Linux Foundation, who pointed to the fundamental asymmetry that has plagued open-source security for decades: "In the past, security expertise has been a luxury reserved for organizations with large security teams. Open-source maintainers — whose software underpins much of the world's critical infrastructure — have historically been left to figure out security on their own."
Project Glasswing, he said, "offers a credible path to changing that equation."

To back that claim with dollars, Anthropic says it has donated $2.5 million to Alpha-Omega and OpenSSF through the Linux Foundation, and $1.5 million to the Apache Software Foundation. Maintainers interested in access can apply through Anthropic's Claude for Open Source program.

Inside the pricing, the compute deal, and Anthropic's path toward a potential IPO

After the research preview period — during which Anthropic's $100 million credit commitment will cover most usage — Claude Mythos Preview will be available to participants at $25 per million input tokens and $125 per million output tokens. Participants can access the model through the Claude API, Amazon Bedrock, Google Cloud's Vertex AI, and Microsoft Foundry.

Those prices reflect the model's computational intensity. The draft blog post that leaked in March described Mythos as a large, compute-intensive model that would be expensive for both Anthropic and its customers to serve. Anthropic also plans to develop and launch new safeguards with an upcoming Claude Opus model, allowing the company to "improve and refine them with a model that does not pose the same level of risk as Mythos Preview," as Cheng told VentureBeat. Security professionals whose legitimate work is affected by those safeguards will be able to apply to an upcoming Cyber Verification Program.

The financial context matters. The same day Project Glasswing launched, Anthropic disclosed its revenue milestone and the Google-Broadcom compute deal. Broadcom signed an expanded deal with Anthropic that will give the AI startup access to about 3.5 gigawatts' worth of computing capacity drawing on Google's AI processors, according to CNBC.
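At $25 per million input tokens and $125 per million output tokens, the economics of a large scanning effort are straightforward to estimate. A back-of-the-envelope sketch (the token volumes below are hypothetical, not figures Anthropic has published):

```python
# Published post-preview rates for Claude Mythos Preview
INPUT_PER_MTOK = 25.0    # USD per million input tokens
OUTPUT_PER_MTOK = 125.0  # USD per million output tokens

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Total USD cost for a job at the stated per-million-token rates."""
    return (input_tokens / 1e6) * INPUT_PER_MTOK \
         + (output_tokens / 1e6) * OUTPUT_PER_MTOK

# Hypothetical codebase audit: 2B input tokens, 100M output tokens
print(f"${run_cost(2_000_000_000, 100_000_000):,.0f}")  # $62,500
```

Even modest-sounding audits reach five-figure costs quickly, which is why the $100 million credit commitment, rather than the list price, is what makes the research preview accessible to unpaid open-source maintainers.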
The scale of compute being marshaled is staggering — and it helps explain why Anthropic needs both the revenue from enterprise cybersecurity partnerships and the infrastructure to serve a model of Mythos Preview's size.

The timing also intersects with growing speculation about Anthropic's path to a public offering. The company is reportedly evaluating an IPO as early as October 2026. A high-profile, government-adjacent cybersecurity initiative with blue-chip partners is exactly the kind of program that burnishes an IPO narrative — particularly when the company can simultaneously point to $30 billion in annualized revenue and a compute footprint measured in gigawatts.

Anthropic says defenders have months, not years, before adversaries catch up

The most consequential question raised by Project Glasswing is not whether Mythos Preview's capabilities are real — the partner endorsements and patched vulnerabilities suggest they are — but how much time defenders actually have before similar capabilities are available to adversaries.

Cheng was candid about the timeline. "Frontier AI capabilities are likely to advance substantially over just the next few months," he told VentureBeat. "Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely." He described Project Glasswing as "an important step toward giving defenders a durable advantage in the coming AI-driven era of cybersecurity" but added a crucial caveat: "It's important to note, this is a starting point. No one organization can solve these cybersecurity problems alone."

That framing — months, not years — is worth taking seriously. DARPA held the finals of its original Cyber Grand Challenge in 2016, a competition to create automatic defensive systems capable of reasoning about flaws, formulating patches, and deploying them on a network in real time.
At the time, the winning AI-powered bot, Mayhem, finished last when placed against human teams at DEF CON. A decade later, Anthropic is claiming that a frontier AI model can find vulnerabilities that survived 27 years of expert human review and millions of automated security tests — and can chain exploits together autonomously to achieve full system compromise.

The delta between those two data points illustrates why the industry is treating this as a genuine inflection point, not a marketing exercise. Anthropic itself has firsthand experience with the offensive side of this equation: the company disclosed in November 2025 that a Chinese state-sponsored group achieved 80 to 90 percent autonomous tactical execution using Claude across approximately 30 targets, according to Anthropic's misuse report.

Project Glasswing arrives during one of the most turbulent weeks in Anthropic's history. In the span of days, the company has announced a model it considers too dangerous for public release, disclosed that its revenue has tripled, sealed a multi-gigawatt compute deal, hired a senior Microsoft executive, made it more expensive for Claude Code subscribers to use third-party tools like OpenClaw, and weathered a major outage of its Claude chatbot on Tuesday morning. Anthropic says it will report publicly on what it has learned within 90 days. In the medium term, the company has proposed that an independent, third-party body might be the ideal home for continued work on large-scale cybersecurity projects.

Whether any of that is fast enough depends on a race that is already underway. Anthropic built a model that can autonomously crack open the most hardened operating systems on the planet — and is now betting that sharing it with defenders, under careful restrictions, will do more good than harm before the inevitable moment when similar capabilities land in less careful hands. It is, in essence, a wager that transparency can outrun proliferation.
The next few months will determine whether that bet pays off, or whether the glasswing's wings were never quite opaque enough to hide what was coming.
Anthropic says its most powerful AI cyber model is too dangerous to release publicly — so it built Project Glasswing
Why This Matters
Anthropic's Project Glasswing represents a significant step in leveraging advanced AI for cybersecurity, aiming to proactively identify and patch vulnerabilities in critical infrastructure before malicious actors can exploit them. This initiative underscores the importance of responsible AI development and collaboration among major tech companies to enhance global digital security. By deploying powerful AI models in a controlled manner, the project highlights the potential for AI to serve as a defensive tool rather than a threat.
Key Takeaways
- Project Glasswing pairs advanced AI with major tech and finance firms to improve cybersecurity.
- Anthropic is deploying a dangerous frontier AI model in a controlled environment to prevent misuse.
- The initiative demonstrates a shift toward using AI proactively for protecting critical infrastructure.