Abstractions from proofs

Thomas A. Henzinger, Ranjit Jhala, Rupak Majumdar, Kenneth L. McMillan (POPL 2004)

[link]

This is a seminal paper in formal verification. It describes how to use Craig Interpolants to speed-up the verification of imperative program properties. The high-level idea is nice. I went up to somewhere in page 5 in detail with a friend. You have your program, of which you’re trying to prove some property. Traditionally, static analysis over-approximate the space state of the programs to be practical, but that means they have false positives: the analysis might say “you program has a bug, because if $P$ happens, then you will access this invalid pointer”, for example. But it might be the case that $P$ is actually impossible. On the other hand, it is difficult to get precise analyses (i.e. those that do not give false positives) to scale to large programs. One specific challenge is that, in order to prove that a trace of execution is impossible, your verifier might need to remember an exponential number of facts about the program when it’s following possible traces of execution. In order to prove that a trace is infeasible, your SMT solver might use facts points of the program that are very early on. In order to only need smaller, more local predicates, they use Craig Interpolants.

The idea goes more or less like this. Suppose that, during your analysis, you found a program trace that is impossible (as verified by a SAT/SMT solver). The hypothesis is that a proof of unsatisfiability also encodes the “reason” why the trace is impossible, and you should be able to extract that reason from it. This “reason” is usually captured by the interpolant, which is a predicate that is more general (weaker) than what you can derive from a prefix of the trace, but still enough to prove the unsatisfiability of the trace together with the corresponding suffix of the trace. The Craig Interpolant, which always exists, also only uses variables that are common between the prefix and suffix of the trace, i.e. it does not use dead variables. The paper then describes a sound and complete system of rules that, given the proof of unsatisfiability, transforms that into the interpolant sequent that the verifier will keep track of.

I might revisit this paper at some point, and try to digest all of the remaining details. I think I’d still need to look at Craig’s Interpolation Theorem in First-Order Logic from the source to appreciate the idea more deeply, as that’s a key piece that makes this paper possible.

Gabriel Poesia
Computer Science PhD student