
Anthropic has shown that powerful AI systems can find weaknesses in blockchain apps and turn them into profitable attacks worth millions of dollars, raising fresh concerns about how exposed DeFi really is.
In a recent study with MATS and Anthropic Fellows, the company tested AI agents on a benchmark called SCONE-bench (Smart CONtracts Exploitation), built from 405 smart contracts that were actually hacked between 2020 and 2025.
When they ran 10 leading models in a simulated environment, the agents managed to exploit just over half of the contracts, with the simulated value of stolen funds reaching about $550.1m.
To reduce the chance that models were simply recalling past incidents, the team then looked only at 34 contracts that were exploited after March 1, 2025, the latest knowledge cutoff for these systems.
New on our Frontier Red Team blog: We tested whether AIs can exploit blockchain smart contracts.
In simulated testing, AI agents found $4.6M in exploits.
The research (with @MATSprogram and the Anthropic Fellows program) also developed a new benchmark: https://t.co/QpGPMqlDRG
— Anthropic (@AnthropicAI) December 1, 2025
Opus 4.5 And GPT-5 Located $4.6M In Value From New Exploit Targets
On that cleaner set, Claude Opus 4.5, Claude Sonnet 4.5 and GPT-5 still produced working exploits on 19 contracts, worth a combined $4.6m in simulated value. Opus 4.5 alone accounted for about $4.5m.
Anthropic then tested whether these agents could uncover brand new problems rather than replay old ones. On Oct. 3, 2025, Sonnet 4.5 and GPT-5 were run, again in simulation, against 2,849 recently deployed Binance Smart Chain contracts that had no known vulnerabilities.

Both agents found two zero-day bugs and generated attacks worth $3,694, with GPT-5 doing so at an API cost of about $3,476.
Tests Ran Only On Simulated Blockchains With No Real Funds At Risk
All of the testing took place on forked blockchains and local simulators, not live networks, and no real funds were touched. Anthropic says the aim was to measure what is technically possible today, not to interfere with production systems.
Smart contracts are a natural test case because they hold real value and run fully on chain.
When the code goes wrong, attackers can often pull assets out directly, and researchers can replay the same steps and convert the stolen tokens into dollar terms using historical prices. That makes it easier to put a concrete number on the damage an AI agent could cause.
SCONE-bench measures success in dollars rather than just “yes or no” outcomes. Agents are given code, context and tools in a sandbox and asked to find a bug, write an exploit and run it. A run only counts if the agent ends up with at least 0.1 extra ETH or BNB in its balance, so minor glitches do not show up as meaningful wins.
Study Shows Attack Economics Improve As Token Costs Decline
Over the past year, the study found that potential exploit revenue on the 2025 problems roughly doubled every 1.3 months, while the token cost of generating a working exploit fell sharply across model generations.
In practice, that means attackers get more working attacks for the same compute budget as models improve.
Although the work focuses on DeFi, Anthropic argues that the same skills carry over to traditional software, from public APIs to obscure internal services.
The company’s core message to crypto builders is that these tools cut both ways, and that AI systems capable of exploiting smart contracts can also be used to audit and fix them before they go live.