Broken Proofs and Broken Provers

Broken proofs and broken provers

People expect perfection. Consider the reaction when someone who has been vaccinated against a particular disease nevertheless dies of it. Mathematical proof carries the aura of perfection, but again people’s expectations will sometimes be dashed. As outlined in an earlier post, the verification of a real world system is never finished. We can seldom capture 100% of reality, so failure remains possible. Even in a purely mathematical proof, there are plenty of ways to screw up. Plus, proof assistants have bugs.

Some badly broken proofs

Many years ago, I refereed a paper about the verification of some obscure application. I recall that the theoretical framework depended on several parameters, among them $z$, a nonzero complex number. This context was repeated for all the theorems in the paper: each included the assumption $\forall z.\, z

ot=0$. Needless to say, that assumption is flatly false. The author of the paper had presumably intended to write something like $\forall z.\, z

ot=0 \to \cdots$. The error is easy to overlook, and I only became suspicious because the proofs looked too simple. The entire development was invalid.

Isabelle would have helped here. For one thing, Sledgehammer warns you if it detects a contradiction in your background theory. And also, when you have a series of theorems that all depend on the same context, you can prove them within a locale, with assumptions such as $z

ot=0$ laid out clearly. Even without locales, the Isabelle practice of listing the assumptions separately in the theorem statement would have avoided the problem. Isabelle newbies frequently ask why we prefer inference rules with explicit premises and conclusions over formulas like $\forall x.\,P(x) \to Q(x)$. It is to avoid the confusing clutter of quantifiers and implications and the further clutter of the proof steps needed to get rid of them ( intros in Rocq).

But in some cases, Isabelle itself can be the problem. Once, a student had proved some false statement and then used it to prove all the other claims (easily!). But how did he prove the false statement in the first place? Actually he didn’t: his proof was stuck in a loop. By a quirk of multithreading: when processing a theory file, If one Isabelle thread gets bogged down, other threads will still race ahead under the assumption that the bogged-down proof will eventually succeed. Such issues are easily identified if you run your theory in batch mode: it would simply time out. I have given an example of this error in another post. Experienced users know to be wary when proofs go through too easily.

Another way to prove nonsense is getting your definitions wrong. With luck, the lemmas that you have to prove about your definitions will reveal any errors. Apart from that, you will be okay provided the definitions do not appear in the main theorem. Recently I completed a large proof where I was deeply unsure that I understood the subject matter. Fortunately, the headline result included none of the intricate definitions involved in the work. The key point is that making a definition in a proof assistant does not compromise its consistency. If the definition is wrong, theorems mentioning it will not mean what you think they mean, but you will not be able to prove $1=0$.

... continue reading