Skip to content
Tech News
← Back to articles

Lean proved this program correct; then I found a bug

read original get Debugging Software Toolkit → more articles
Why This Matters

This article highlights the growing importance of formal verification in software development, especially as AI tools become more adept at uncovering vulnerabilities. The successful verification of a zlib implementation in Lean demonstrates a promising approach to building more secure and reliable software systems amidst increasing security threats.

Key Takeaways

I fuzzed a verified implementation of zlib and found a buffer overflow in the Lean runtime.

AI agents are getting very good at finding vulnerabilities in large-scale software systems.

Anthropic, was apparently so spooked by the vulnerability-discovery capabilities of Mythos, they decided not to release it as it was "too dangerous" (lol). Whether you believe the hype about these latest models or not, it seems undeniable that the writing is on the wall:

The cost of discovering security bugs is collapsing, and the vast majority of software running today was never built to withstand that kind of scrutiny. We are facing a looming software crisis.

In the face of this oncoming tsunami, recently there has been increasing interest in formal verification as a solution. If we state and prove properties about our code using a mechanical tool, can we build robust, secure and stable software that can overcome this oncoming barrage of attacks?

One recent development in the Lean ecosystem has taken steps towards this question. 10 agents autonomously built and proved an implementation of zlib, lean-zip, an impressive landmark result. Quoting from Leo De Moura, the chief architect of the Lean FRO (here):

With apologies for the AI-slop (Leo has a penchant for it, it seems), the key result is that lean-zip is not just another implementation of zlib. It is an implementation that has been verified as correct end to end, guaranteed by Lean to be entirely free of implementation bugs.

What does "verified as correct" actually look like? Here is one of the main theorems (github):

theorem zlib_decompressSingle_compress ( data : ByteArray ) ( level : UInt8 ) ( hsize : data.size < 1024 * 1024 * 1024 ) : ZlibDecode.decompressSingle ( ZlibEncode.compress data level ) = .ok data

For any byte array less than 1 gigabyte, calling ZlibDecode.decompressSingle on the output of ZlibEncode.compress produces the original data. The decompress function is exactly the inverse of compression. This pair of functions is entirely correct.

... continue reading