Anthropic: AI agents find $4.6M in blockchain smart contract exploits

December 1, 2025

ANTHROPIC

AI models are increasingly good at cyber tasks, as we’ve written about before. But what is the economic impact of these capabilities? In a recent MATS and Anthropic Fellows project, our scholars investigated this question by evaluating AI agents' ability to exploit smart contracts on Smart CONtracts Exploitation benchmark (SCONE-bench)—a new benchmark they built comprising 405 contracts that were actually exploited between 2020 and 2025. On contracts exploited after the latest knowledge cutoff (March 2025), Claude Opus 4.5, Claude Sonnet 4.5, and GPT-5 developed exploits collectively worth $4.6 million, establishing a concrete lower bound for the economic harm these capabilities could enable. Going beyond retrospective analysis, we evaluated both Sonnet 4.5 and GPT-5 in simulation against 2,849 recently deployed contracts without any known vulnerabilities. Both agents uncovered two novel zero-day vulnerabilities and produced exploits worth $3,694, with GPT-5 doing so at an API cost of $3,476. This demonstrates as a proof-of-concept that profitable, real-world autonomous exploitation is technically feasible, a finding that underscores the need for proactive adoption of AI for defense.

Important: To avoid potential real-world harm, our work only ever tested exploits in blockchain simulators. We never tested exploits on live blockchains and our work had no impact on real-world assets.

Figure 1: Total revenue from successfully exploiting smart contract vulnerabilities that were exploited after March 1, 2025 (Opus 4.5's reliable knowledge cutoff date) across frontier AI models over the last year in log scale, as tested in simulation. Over the last year, exploit revenue from stolen simulated funds roughly doubled every 1.3 months. The shaded region represents 90% CI calculated by bootstrap over the set of model-revenue pairs. For each contract in the benchmark that was successfully exploited by the agent, we estimated the exploit’s dollar value by converting the agent’s revenue in the native token (ETH or BNB) using the historical exchange rate from the day the real exploit occurred, as reported by the CoinGecko API.

Introduction

AI cyber capabilities are accelerating rapidly: they are now capable of tasks from orchestrating complex network intrusions to augmenting state-level espionage. Benchmarks, like CyberGym and Cybench, are valuable for tracking and preparing for future improvements in such capabilities.

However, existing cyber benchmarks miss a critical dimension: they do not quantify the exact financial consequences of AI cyber capabilities. Compared to arbitrary success rates, quantifying capabilities in monetary terms is more useful for assessing and communicating risks to policymakers, engineers, and the public. Yet estimating the real value of software vulnerabilities requires speculative modelling of downstream impacts, user base, and remediation costs.[1]

Here, we take an alternate approach and turn to a domain where software vulnerabilities can be priced directly: smart contracts. Smart contracts are programs deployed on blockchains like Ethereum. They power financial blockchain applications which offer services similar to those of PayPal, but all of their source code and transaction logic—such as for transfers, trades, and loans—are public on the blockchain and handled entirely by software without a human in the loop. As a result, vulnerabilities can allow for direct theft from contracts, and we can measure the dollar value of exploits by running them in simulated environments. These properties make smart contracts an ideal testing ground for AI agents’ exploitation capabilities.

To give a concrete example of what such an exploit could look like: Balancer is a blockchain application that allows users to trade cryptocurrencies. In November 2025, an attacker exploited an authorization bug to withdraw other users’ funds, stealing over $120 million. Since smart contract and traditional software exploits draw on a similar set of core skills (e.g. control-flow reasoning, boundary analysis, and programming fluency), assessing AI agents on smart contract exploitations gives a concrete lower bound on the economic impact of their broader cyber capabilities.

... continue reading