I fuzzed a verified implementation of zlib and found a buffer overflow in the Lean runtime.
AI agents are getting very good at finding vulnerabilities in large-scale software systems.
Anthropic, was apparently so spooked by the vulnerability-discovery capabilities of Mythos, they decided not to release it as it was "too dangerous" (lol). Whether you believe the hype about these latest models or not, it seems undeniable that the writing is on the wall:
The cost of discovering security bugs is collapsing, and the vast majority of software running today was never built to withstand that kind of scrutiny. We are facing a looming software crisis.
In the face of this oncoming tsunami, recently there has been increasing interest in formal verification as a solution. If we state and prove properties about our code using a mechanical tool, can we build robust, secure and stable software that can overcome this oncoming barrage of attacks?
One recent development in the Lean ecosystem has taken steps towards this question. 10 agents autonomously built and proved an implementation of zlib, lean-zip, an impressive landmark result. Quoting from Leo De Moura, the chief architect of the Lean FRO (here):
With apologies for the AI-slop (Leo has a penchant for it, it seems), the key result is that lean-zip is not just another implementation of zlib. It is an implementation that has been verified as correct end to end, guaranteed by Lean to be entirely free of implementation bugs.
What does "verified as correct" actually look like? Here is one of the main theorems (github):
theorem zlib_decompressSingle_compress ( data : ByteArray ) ( level : UInt8 ) ( hsize : data.size < 1024 * 1024 * 1024 ) : ZlibDecode.decompressSingle ( ZlibEncode.compress data level ) = .ok data
For any byte array less than 1 gigabyte, calling ZlibDecode.decompressSingle on the output of ZlibEncode.compress produces the original data. The decompress function is exactly the inverse of compression. This pair of functions is entirely correct.
... continue reading