Tech News
← Back to articles

We hid backdoors in ~40MB binaries and asked AI + Ghidra to find them

read original related products more articles

Claude can code, but can it check binary executables?

Now on the front page of Hacker News — see the discussion.

We already did our experiments with using NSA software to hack a classic Atari game. This time we want to focus on a much more practical task — using AI agents for malware detection. We partnered with Michał “Redford” Kowalczyk, reverse engineering expert from Dragon Sector, known for finding malicious code in Polish trains, to create a benchmark of finding backdoors in binary executables, without access to source code.

See BinaryAudit for the full benchmark results — including false positive rates, tool proficiency, and the Pareto frontier of cost-effectiveness. All tasks are open source and available at quesmaOrg/BinaryAudit.

We were surprised that today’s AI agents can detect some hidden backdoors in binaries. We hadn’t expected them to possess such specialized reverse engineering capabilities.

However, this approach is not ready for production. Even the best model, Claude Opus 4.6, found relatively obvious backdoors in small/mid-size binaries only 49% of the time. Worse yet, most models had a high false positive rate — flagging clean binaries.

In this blog post we discuss a few recent security stories, explain what binary analysis is, and how we construct a benchmark for AI agents. We will see when they accomplish tasks and when they fail — by missing malicious code or by reporting false findings.

Background

Just a few months ago Shai Hulud 2.0 compromised thousands of organizations, including Fortune 500 companies, banks, governments, and cool startups — see postmortem by PostHog. It was a supply chain attack for the Node Package Manager ecosystem, injecting malicious code stealing credentials.

Just a few days ago, Notepad++ shared updates on a hijack by state-sponsored actors, who replaced legitimate binaries with infected ones.

... continue reading