Quantified Invariant Generation using an Interpolating Saturation Prover

1
Quantified Invariant Generation using
an Interpolating Saturation Prover
  • Ken McMillan
  • Cadence Research Labs

2
Quantified invariants
  • Many systems that we would like to verify
    formally are effectively infinite state
  • Parameterized protocols
  • Programs manipulating unbounded data structures
    (arrays, heaps, stacks)
  • Programs with unbounded thread creation
  • To verify such systems, we must construct a
    quantified invariant
  • For all processes, array elements, threads, etc.
  • Existing fully automated techniques for
    generating invariants are not strongly
    relevance-driven
  • Invisible invariants
  • Indexed predicate abstraction
  • Shape analysis

3
Interpolants and abstraction
  • Interpolants derived from proofs can provide an
    effective relevance heuristic for constructing
    inductive invariants
  • Provides a way of generalizing proofs about
    bounded behaviors to the unbounded case
  • Exploits a prover's ability to focus on relevant
    facts
  • Used in various applications, including
  • Hardware verification (propositional case)
  • Predicate abstraction (quantifier-free)
  • Program verification (quantifier-free)
  • This talk
  • Moving to the first-order case, including FO(TC)
  • Modifying SPASS to create an interpolating FO
    prover
  • Apply to program verification with arrays, linked
    lists

4
Invariants from unwindings
  • Consider this very simple approach
  • Partially unwind a program into a loop-free,
    in-line program
  • Construct a Floyd/Hoare proof for the in-line
    program
  • See if this proof contains an inductive invariant
    proving the property
  • Example program

x = y = 0;
while (*) { x++; y++; }
while (x != 0) { x--; y--; }
assert(y == 0);
5
Unwind the loops
  • Assertions may diverge as we unwind
  • A practical method must somehow prevent this kind
    of divergence!

6
Interpolation Lemma
[Craig, 57]
  • If A ∧ B = false, there exists an interpolant A'
    for (A,B) such that
  • A implies A'
  • A' is inconsistent with B
  • A' is expressed over the common vocabulary of A
    and B

A variety of techniques exist for deriving an
interpolant from a refutation of A ∧ B generated
by a theorem prover.
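As a sketch of what the lemma demands, the three conditions can be checked by brute force in the propositional case. The formulas `A`, `B` and the function name `is_interpolant` are made up for illustration. A real interpolating prover derives the interpolant from a refutation rather than testing candidates.

```python
from itertools import product


def is_interpolant(a, b, itp, vars_a, vars_b, vars_itp):
    """Check the three interpolant conditions by truth-table enumeration."""
    # Condition 3: the candidate uses only the common vocabulary.
    if not set(vars_itp) <= (set(vars_a) & set(vars_b)):
        return False
    all_vars = sorted(set(vars_a) | set(vars_b))
    envs = [dict(zip(all_vars, values))
            for values in product([False, True], repeat=len(all_vars))]
    a_implies_itp = all(itp(e) for e in envs if a(e))        # condition 1
    itp_refutes_b = not any(itp(e) and b(e) for e in envs)   # condition 2
    return a_implies_itp and itp_refutes_b


# Hypothetical example: A = p and q, B = (not q) and r, so A and B is false.
A = lambda e: e["p"] and e["q"]
B = lambda e: (not e["q"]) and e["r"]
ok = is_interpolant(A, B, lambda e: e["q"], ["p", "q"], ["q", "r"], ["q"])
```

Here `q` works as an interpolant, while `p` is rejected because it is not in the common vocabulary.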
7
Interpolants for sequences
  • Let A1...An be a sequence of formulas
  • A sequence A'0...A'n is an interpolant for
    A1...An when
  • A'0 = True
  • A'i-1 ∧ Ai ⇒ A'i, for i = 1..n
  • A'n = False
  • and finally, A'i ∈ L(A1...Ai) ∩ L(Ai+1...An)

In other words, the interpolant is a
structured refutation of A1...An
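The four conditions can be exercised on a tiny instance. The sketch below assumes a one-step unwinding of a counter program, written over made-up variables `x0, y0, x1, y1`, and checks a candidate sequence by enumerating a small integer range. It is a bounded test, not a proof, and every name in it is illustrative.

```python
from itertools import product

VARS = ["x0", "y0", "x1", "y1"]

# Each entry is (formula, set of variables it mentions).
A = [
    (lambda e: e["x0"] == 0 and e["y0"] == 0, {"x0", "y0"}),
    (lambda e: e["x1"] == e["x0"] + 1 and e["y1"] == e["y0"] + 1, set(VARS)),
    (lambda e: e["x1"] == 0 and e["y1"] != 0, {"x1", "y1"}),
]
ITP = [
    (lambda e: True, set()),
    (lambda e: e["x0"] == e["y0"], {"x0", "y0"}),
    (lambda e: e["x1"] == e["y1"], {"x1", "y1"}),
    (lambda e: False, set()),
]


def check_sequence(formulas, itps, bound=2):
    """Bounded check of the sequence-interpolant conditions."""
    n = len(formulas)
    if len(itps) != n + 1:
        return False
    envs = [dict(zip(VARS, v))
            for v in product(range(-bound, bound + 1), repeat=len(VARS))]
    # First element is True and last is False on the sampled range.
    if not all(itps[0][0](e) for e in envs) or any(itps[n][0](e) for e in envs):
        return False
    # A'_{i-1} and A_i imply A'_i.
    for i in range(1, n + 1):
        prev, step, cur = itps[i - 1][0], formulas[i - 1][0], itps[i][0]
        if any(prev(e) and step(e) and not cur(e) for e in envs):
            return False
    # A'_i uses only symbols common to the prefix and the suffix.
    for i in range(n + 1):
        prefix = set().union(*[v for _, v in formulas[:i]])
        suffix = set().union(*[v for _, v in formulas[i:]])
        if not itps[i][1] <= (prefix & suffix):
            return False
    return True


valid = check_sequence(A, ITP)
```

Read as a Floyd/Hoare proof, the middle elements are the assertions at the two program points between the three steps.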
8
Interpolants as Floyd-Hoare proofs
2. Each is over common symbols of prefix and
suffix
3. Begins with true, ends with false
9
FOCI: An Interpolating Prover
  • Proof-generating decision procedure for
    quantifier-free FOL
  • Equality with uninterpreted function symbols
  • Theory of arrays
  • Linear rational arithmetic, integer difference
    bounds
  • SAT Modulo Theories approach
  • Boolean reasoning performed by SAT solver
  • Exploits SAT relevance heuristics
  • Quantifier-free interpolants from proofs
  • Linear-time construction [TACAS 04]
  • From Q-F interpolants, we can derive atomic
    predicates for Predicate Abstraction [Henzinger
    et al., POPL 04]
  • Allows counterexample-based refinement
  • Integrated with software verification tools
  • Berkeley BLAST, Cadence IMPACT

10
Avoiding divergence
  • Programs are infinite state, so convergence to a
    fixed point is not guaranteed.
  • What would prevent us from computing an infinite
    sequence of interpolants, say, x=0, x=1, x=2,...
    as we unwind the loops further?
  • Limited completeness result [TACAS 06]
  • Stratify the logical language L into a hierarchy
    of finite languages
  • Compute minimal interpolants in this hierarchy
  • If an inductive invariant proving the property
    exists in L, you must eventually converge to one

Interpolation provides a means of static analysis
in abstract domains of infinite height. Though we
cannot compute a least fixed point, we can
compute a fixed point implying a given property
if one exists.
11
Expressiveness hierarchy
Interpolant Language    Parameterized Abstract Domain
∀FO(TC)                 Canonical Heap Abstractions
∀FO                     Indexed Predicate Abstraction
QF                      Predicate Abstraction
(Expressiveness increases from QF up to ∀FO(TC))
12
Need for quantified interpolants
for (i = 0; i < N; i++) a[i] = i;
for (j = 0; j < N; j++) assert(a[j] == j);
  • Existing interpolating provers cannot produce
    quantified interpolants
  • Problem: how to prevent the number of quantifiers
    from diverging in the same way that constants
    diverge when we unwind the loops?

13
Need for Reachability
...
node *a = create_list();
while (a) {
  assert(alloc(a));
  a = a->next;
}
...
invariant:
∀ x (rea(next,a,x) ∧ x ≠ nil → alloc(x))
  • This condition needed to prove memory safety (no
    use after free).
  • Cannot be expressed in FO
  • We need some predicate identifying a closed set
    of nodes that is allocated
  • We require a theory of reachability (in effect,
    transitive closure)

Can we build an interpolating prover for full
FOL that handles reachability and avoids
divergence?
14
Clausal provers
  • A clausal refutation prover takes a set of
    clauses and returns a proof of unsatisfiability
    (i.e., a refutation) if possible.
  • A prover is based on inference rules of this form

P1 ... Pn
---------
    C
  • where P1 ... Pn are the premises and C the
    conclusion.
  • A typical inference rule is resolution, of which
    this is an instance

p(a)    p(U) → q(U)
-------------------
       q(a)
  • This was accomplished by unifying p(a) and p(U),
    then dropping the complementary literals.
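The resolution instance above can be reproduced with a minimal unifier. This is a sketch under simple assumptions: terms are nested tuples, names starting with an upper-case letter are variables, clauses are lists of (sign, atom) pairs that share no variables, and p(U) → q(U) is read as the clause ¬p(U) ∨ q(U). None of the function names come from a real prover.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def walk(t, subst):
    """Follow variable bindings until an unbound variable or non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def occurs(v, t, subst):
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t[1:])


def unify(s, t, subst):
    """Return a most general unifier extending subst, or None."""
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if is_var(s):
        return None if occurs(s, t, subst) else {**subst, s: t}
    if is_var(t):
        return None if occurs(t, s, subst) else {**subst, t: s}
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        for a, b in zip(s[1:], t[1:]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None


def apply(t, subst):
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, subst) for a in t[1:])
    return t


def resolve(c1, c2):
    """All binary resolvents of two variable-disjoint clauses."""
    out = []
    for i, (s1, a1) in enumerate(c1):
        for j, (s2, a2) in enumerate(c2):
            if s1 != s2:
                mgu = unify(a1, a2, {})
                if mgu is not None:
                    rest = c1[:i] + c1[i + 1:] + c2[:j] + c2[j + 1:]
                    out.append([(s, apply(a, mgu)) for s, a in rest])
    return out


fact = [(True, ("p", "a"))]
rule = [(False, ("p", "U")), (True, ("q", "U"))]
resolvents = resolve(fact, rule)
```

Unifying p(a) with p(U) binds U to a, and dropping the complementary pair leaves q(a).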

15
Superposition calculus
  • Modern FOL provers based on the superposition
    calculus
  • example superposition inference:

Q(a)    P → (a = c)
-------------------
     P → Q(c)
  • this is just substitution of equals for equals
  • in practice this approach generates a lot of
    substitutions!
  • use reduction order to reduce number of
    inferences

16
Reduction orders
  • A reduction order ≻ is
  • a total, well-founded order on ground terms
  • subterm property: f(a) ≻ a
  • monotonicity: a ≻ b implies f(a) ≻ f(b)
  • Example: Recursive Path Ordering (with Status)
    (RPOS)
  • start with a precedence on symbols: a ≻ b ≻ c ≻
    f
  • induces a reduction ordering on ground terms:
  • f(f(a)) ≻ f(a) ≻ a ≻ f(b) ≻ b ≻ c ≻ f
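To see how a precedence on symbols induces an order on ground terms, here is a sketch of the lexicographic path ordering, which is the recursive path ordering with lexicographic status on every symbol. Using it as a stand-in for RPOS is an assumption made for illustration. Terms are tuples `(symbol, arg, ...)`, and the precedence a ≻ b ≻ c ≻ f is encoded as numbers.

```python
PREC = {"a": 3, "b": 2, "c": 1, "f": 0}


def T(sym, *args):
    """Build a ground term as a tuple (symbol, arg1, ...)."""
    return (sym,) + args


def lpo_gt(s, t):
    """True if s is strictly greater than t in the lexicographic path order."""
    if s == t:
        return False
    f, s_args = s[0], s[1:]
    g, t_args = t[0], t[1:]
    # Case 1: some argument of s is already at least t (subterm property).
    if any(si == t or lpo_gt(si, t) for si in s_args):
        return True
    # Case 2: the head of s has higher precedence and s dominates t's arguments.
    if PREC[f] > PREC[g]:
        return all(lpo_gt(s, tj) for tj in t_args)
    # Case 3: equal heads, compare the arguments lexicographically.
    if f == g:
        if not all(lpo_gt(s, tj) for tj in t_args):
            return False
        for si, ti in zip(s_args, t_args):
            if si != ti:
                return lpo_gt(si, ti)
    return False


chain = [T("f", T("f", T("a"))), T("f", T("a")), T("a"),
         T("f", T("b")), T("b"), T("c")]
descending = all(lpo_gt(s, t) for s, t in zip(chain, chain[1:]))
```

The chain covers the ground terms of the example. Note that `a` dominates `f(b)` because `a` has higher precedence than both `f` and `b`.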

17
Ordering Constraint
  • Constrains rewrites to be downward in the
    reduction order

Q(a)    P → (a = c)
-------------------
     P → Q(c)
example: this inference is only possible if a ≻ c
18
Local Proofs
  • A proof is local for a pair of clause sets (A,B)
    when every inference step uses only symbols from
    A or only symbols from B.
  • From a local refutation of (A,B), we can derive
    an interpolant for (A,B) in linear time.
  • This interpolant is a Boolean combination of
    formulas in the proof

19
Reduction orders and locality
  • A reduction order is oriented for (A,B) when
  • s ≻ t for every s ∉ L(B) and t ∈ L(B)
  • Intuition: rewriting eliminates first A
    variables, then B variables.

oriented: x ≻ y ≻ c ≻ d ≻ f

x = y    f(x) = c
-----------------
    f(y) = c

f(y) = c    f(y) = d
--------------------
       c = d

c = d    c ≠ d
--------------
      ⊥

Local!!
20
Orientation is not enough
A: Q(a), a = c
B: b = c, Q(b)
Order: Q ≻ a ≻ b ≻ c
  • Local superposition gives only c = c.
  • Solution: replace non-local superposition with
    two inferences

Second inference can be postponed until after
resolving with Q(b)
This procrastination step is an example of a
reduction rule, and preserves completeness.
21
Completeness of local inference
  • Theorem: Local superposition with procrastination
    is complete for refutation of pairs (A,B) such that
  • (A,B) has a universally quantified interpolant
  • The reduction order is oriented for (A,B)
  • This gives us a complete method for generation of
    universally quantified interpolants for arbitrary
    first-order formulas!
  • This is easily extensible to interpolants for
    sequences of formulas, hence we can use the
    method to generate Floyd/Hoare proofs for inline
    programs.

22
Avoiding Divergence
  • As argued earlier, we still need to prevent
    interpolants from diverging as we unwind the
    program further.
  • Idea: stratify the clause language

Example: Let Lk be the set of clauses with at
most k variables and nesting depth at most k.
Note that each Lk is a finite language.
  • Stratified saturation prover
  • Initially let k = 1
  • Restrict prover to generate only clauses in Lk
  • When prover saturates, increase k by one and
    continue

The stratified prover is complete, since every
proof is contained in some Lk.
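A membership test for the strata might look like the sketch below. It assumes clauses are lists of atoms built from nested tuples, that upper-case names are variables, and that nesting depth means the depth of function applications inside an atom's arguments. That reading of "nesting depth" and all helper names are assumptions.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def variables(t):
    """Set of variables in a term or atom."""
    if is_var(t):
        return {t}
    if isinstance(t, tuple):
        return set().union(*(variables(a) for a in t[1:]))
    return set()


def depth(t):
    """Function nesting depth; constants and variables have depth 0."""
    if isinstance(t, tuple) and len(t) > 1:
        return 1 + max(depth(a) for a in t[1:])
    return 0


def in_stratum(clause, k):
    """True if the clause has at most k variables and nesting depth at most k."""
    nvars = len(set().union(*(variables(atom) for atom in clause)))
    nest = max((depth(arg) for atom in clause for arg in atom[1:]), default=0)
    return nvars <= k and nest <= k


def stratum(clause):
    """Smallest k such that the clause belongs to L_k."""
    k = 1
    while not in_stratum(clause, k):
        k += 1
    return k


deep = [("p", ("succ", ("succ", "zero")))]
wide = [("rea", "L", ("select", "L", "E"), "X")]
levels = (stratum(deep), stratum(wide))
```

A stratified prover would discard any generated clause for which `in_stratum(clause, k)` is false, and raise `k` only after saturating at the current level.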
23
Completeness for universal invariants
  • Lemma: For every safety program M with a ∀
    safety invariant, and every stratified saturation
    prover P, there exists an integer k such that P
    refutes every unwinding of M in Lk, provided
  • The reduction ordering is oriented properly
  • This means that as we unwind further, eventually
    all the interpolants are contained in Lk, for
    some k.
  • Theorem: Under the above conditions, there is
    some unwinding of M for which the interpolants
    generated by P contain a safety invariant for M.

This means we have a complete procedure for
finding universally quantified safety invariants
whenever these exist!
24
In practice
  • We have proved theoretical convergence. But does
    the procedure converge in practice in a
    reasonable time?
  • Modify SPASS, an efficient superposition-based
    saturation prover
  • Generate oriented precedence orders
  • Add procrastination rule to SPASS's reduction
    rules
  • Drop all non-local inferences
  • Add stratification (SPASS already has something
    similar)
  • Add axiomatizations of the necessary theories
  • An advantage of a full FOL prover is we can add
    axioms!
  • As argued earlier, we need a theory of arrays and
    reachability (TC)
  • Since this theory is not finitely axiomatizable,
    we use an incomplete axiomatization that is
    intended to handle typical operations in
    list-manipulating programs

25
Partially Axiomatizing FO(TC)
  • Axioms of the theory of arrays (with select and
    update)

∀ (A,I,V) (select(update(A,I,V), I) = V)
∀ (A,I,J,V) (I ≠ J → select(update(A,I,V), J) =
select(A,J))
  • Axioms for reachability (rea)

∀ (L,E) rea(L,E,E)
∀ (L,E,X) (rea(L,select(L,E),X) → rea(L,E,X))
  (if e->link reaches x then e reaches x)
∀ (L,E,X) (rea(L,E,X) → E = X ∨
rea(L,select(L,E),X))
  (if e reaches x then e = x or e->link reaches x)
etc...
Since FO(TC) is incomplete, these axioms must be
incomplete
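The axioms can be sanity-checked against the intended model on small finite heaps. In this sketch an array is a Python dict, `rea` follows a link field zero or more times, and the check enumerates every link map over three nodes. The concrete representation and the three-node domain are assumptions chosen for illustration. Passing this check says the axioms are sound on these models, not that they are complete.

```python
from itertools import product

NODES = [0, 1, 2]


def select(arr, i):
    return arr[i]


def update(arr, i, v):
    new = dict(arr)
    new[i] = v
    return new


def rea(link, e, x):
    """True if x is reachable from e by following link zero or more times."""
    seen = set()
    while e not in seen:
        if e == x:
            return True
        seen.add(e)
        e = select(link, e)
    return False


def all_links():
    """Every total map from NODES to NODES."""
    for targets in product(NODES, repeat=len(NODES)):
        yield dict(zip(NODES, targets))


def axioms_hold():
    for link in all_links():
        for e, x, v in product(NODES, repeat=3):
            # Array axioms, with e as index I and x as index J.
            if select(update(link, e, v), e) != v:
                return False
            if e != x and select(update(link, e, v), x) != select(link, x):
                return False
            # Reachability axioms.
            if not rea(link, e, e):
                return False
            if rea(link, select(link, e), x) and not rea(link, e, x):
                return False
            if rea(link, e, x) and not (e == x or rea(link, select(link, e), x)):
                return False
    return True


sound_on_small_heaps = axioms_hold()
```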
26
Simple example
for (i = 0; i < N; i++) a[i] = i;
for (j = 0; j < N; j++) assert(a[j] == j);
27
Unwinding simple example
  • Unwind the loops twice

Note: stratification prevents constants from
diverging as 0, succ(0), succ(succ(0)), ...
28
List deletion example
a = create_list();
while (a) {
  tmp = a->next;
  free(a);
  a = tmp;
}
  • Invariant synthesized with 3 unwindings (after
    some simplification)

rea(next,a,nil) ∧
∀ x (rea(next,a,x) → x = nil ∨ alloc(x))
  • That is, a is acyclic, and every cell is
    allocated
  • Note that interpolation can synthesize Boolean
    structure.
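The synthesized invariant can be tested on concrete runs of the deletion loop. The sketch below models the heap with a `next` dict and an `alloc` set, uses 0 for nil, and evaluates the invariant at every loop head. The list builder and all names are illustrative assumptions. This exercises the invariant on sample executions only.

```python
NIL = 0


def create_list(n):
    """Cells n -> n-1 -> ... -> 1 -> NIL; returns (head, next map, alloc set)."""
    nxt = {NIL: NIL}
    for cell in range(1, n + 1):
        nxt[cell] = cell - 1
    return n, nxt, set(range(1, n + 1))


def reachable(nxt, a):
    """Cells reachable from a by following next zero or more times."""
    seen = []
    while a not in seen:
        seen.append(a)
        a = nxt[a]
    return seen


def invariant(nxt, alloc, a):
    """rea(next,a,nil) and every reachable non-nil cell is allocated."""
    cells = reachable(nxt, a)
    return NIL in cells and all(x == NIL or x in alloc for x in cells)


def delete_list(n):
    """Run the deletion loop, recording the invariant at each loop head."""
    a, nxt, alloc = create_list(n)
    checks = [invariant(nxt, alloc, a)]
    while a != NIL:
        tmp = nxt[a]
        alloc.discard(a)    # free(a)
        a = tmp
        checks.append(invariant(nxt, alloc, a))
    return checks


results = delete_list(3)
```

Freeing a cell while `a` still points at it breaks the second conjunct, which is why the loop advances `a` before the next check.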

29
More small examples
This shows that divergence can be controlled.
But can we scale to large programs?...
30
Canonical abstraction
  • Abstraction replaces concrete heaps with abstract
    symbolic heaps
  • Abstraction is parameterized by instrumentation
    predicates
  • Abstract heap represents infinite class of
    concrete heaps
  • Summary node represents equivalence class of
    concrete nodes
  • Dotted arcs mean "may point to"

31
Example program
node *create_list() {
  node *l = NULL;
  while (*) {
    node *n = malloc(...);
    n->next = l;
    l = n;
  }
  return l;
}
main() {
  node *a = create_list();
  while (a) {
    assert(alloced(a));
    a = a->next;
  }
}
  • Want to prove this program does not access a
    freed cell.

32
Canonical Abstraction
  • Predicates: Pt_a, Rea_a, is_null, alloc
  • Relations: next

[Figure: three abstract heaps; node labels include Pt_a, Rea_n, is_null, alloc]
All three abstract heaps verify property!
33
A slightly larger program
main() {
  node *a = create_list();
  node *b = create_list();
  node *c = create_list();
  node *p = * ? a : * ? b : c;
  while (p) {
    assert(alloced(p));
    p = p->next;
  }
}
  • We have to track a, b and c to prove this
    property
  • Let's look at what happens with canonical heap
    abstractions...

34
After creating a
  • Predicates: Pt_a, Rea_a, is_null, alloced
  • Relations: next

35
After creating b
36
After creating c
(Picture 27 abstract heaps here.)
Problem: abstraction scales exponentially with the
number of independent data structures.
37
Independent analyses
  • Suppose we do a Cartesian product of 3
    independent analyses for a,b,c.


  • How do we know we can decompose the analysis in
    this way and prove the property?
  • What if some correlations are needed between the
    analyses?
  • For non-heap properties, one good answer is to
    compute interpolants.

38
Abstraction from interpolants
main() {
  node *a = create_list();
  node *b = create_list();
  node *c = create_list();
  node *p = * ? a : * ? b : c;
  while (p) {
    assert(alloced(p));
    p = p->next;
  }
}
  • Interpolants contain inductive invariants after
    unrolling loops 3 times.
  • Interpolant after creating c

39
Shape of the interpolant
(a ≠ 0 ⇒ alloced(a)) ∧ (b ≠ 0 ⇒ alloced(b)) ∧
(c ≠ 0 ⇒ alloced(c)) ∧
∀ x. (x ≠ 0 ∧ alloced(x) ⇒ alloced(next(x)))

[Figure: a, b, c pointing into alloced cells linked by next, ending in null]
  • Invariant says that allocated cells are closed
    under the next relation
  • Notice also the size of this formula is linear in
    the number of lists, not exponential as is the
    set of shape graphs.
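The size comparison can be spelled out with a few lines of arithmetic. The sketch assumes three abstract heaps per independent list, as in the single-list example, and one invariant conjunct per list head plus one closure conjunct. The function names and the ASCII rendering of the formula are illustrative.

```python
def abstract_heap_count(num_lists, heaps_per_list=3):
    """Combinations of per-list abstract heaps: exponential in num_lists."""
    return heaps_per_list ** num_lists


def invariant_conjuncts(num_lists):
    """One conjunct per list head plus the closure-under-next conjunct."""
    return num_lists + 1


def invariant_text(heads):
    """ASCII rendering of the invariant for the given list heads."""
    parts = [f"({h} != 0 => alloced({h}))" for h in heads]
    parts.append("forall x. (x != 0 and alloced(x) => alloced(next(x)))")
    return " and ".join(parts)


table = [(n, abstract_heap_count(n), invariant_conjuncts(n)) for n in (1, 2, 3, 10)]
```

For three lists this gives 27 abstract heaps against 4 conjuncts, and the gap widens quickly as lists are added.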

40
Suggests decomposition
(a ≠ 0 ⇒ alloced(a)) ∧ (b ≠ 0 ⇒ alloced(b)) ∧
(c ≠ 0 ⇒ alloced(c)) ∧
∀ x. (x ≠ 0 ∧ alloced(x) ⇒ alloced(next(x)))

Canonical abstract domains
Predicates             Relations
a = 0, alloced(n)
b = 0, alloced(n)
c = 0, alloced(n)
n = 0, alloced(n)      next
  • Each of these analyses proves one conjunct of the
    invariant.

41
Conclusion
  • Interpolants and invariant generation
  • Computing interpolants from proofs allows us to
    generalize from special cases such as loop-free
    unwindings
  • Interpolation can extract relevant facts from
    proofs of these special cases
  • Must avoid divergence
  • Quantified invariants
  • Needed for programs that manipulate arrays or
    heaps
  • FO equality prover modified to produce local
    proofs (hence interpolants)
  • Complete for universal invariants
  • Can be used to construct invariants of simple
    array- and list-manipulating programs, using
    partial axiomatization of FO(TC)
  • Language stratification prevents divergence
  • Might be used as a relevance heuristic for shape
    analysis, IPA

For this approach to work in practice, we need FO
provers with strong relevance heuristics as in
DPLL...