1Efficient SAT/BDD-based Techniques for Predicate
Abstraction
Shuvendu K. Lahiri
Microsoft Research, Redmond
Joint work with Thomas Ball, Randy Bryant, Byron
Cook, Robert Nieuwenhuis, Albert Oliveras
2Program analysis and abstraction
- Unbounded state space
- Unbounded integers, arrays, heap
- State exploration may not terminate
- Abstraction
- Construct an overapproximation of program behavior
- Abstract domain/operators ensure that the analysis terminates
3Automatic predicate abstraction
- Graf & Saïdi, CAV 97
- Underlying framework
- Abstract Interpretation, Cousot & Cousot 77
- Idea
- Given set of predicates P = {P1, ..., Pk}
- Formulas describing properties of system state
- Finite Abstraction
- Abstraction(s) = subset of {P1, ..., Pk} that holds on s
- At most 2^k abstract states
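As a rough illustration of the finite abstraction, the sketch below maps a concrete state to the set of predicates that hold on it. The state encoding (a dict of integer variables), the two sample predicates, and the names `abstract` and `PREDICATES` are assumptions made for this example, not part of the talk.

```python
from typing import Callable, Dict, FrozenSet, List

State = Dict[str, int]
Predicate = Callable[[State], bool]

def abstract(state: State, predicates: List[Predicate]) -> FrozenSet[int]:
    """Return the indices of the predicates that hold on `state`."""
    return frozenset(i for i, p in enumerate(predicates) if p(state))

# Hypothetical predicates over integer variables x and y.
PREDICATES: List[Predicate] = [
    lambda s: s["x"] < s["y"],   # P1: x < y
    lambda s: s["x"] == 2,       # P2: x = 2
]

# Many concrete states collapse onto at most 2^k abstract states (k = 2 here).
concrete_states = [{"x": x, "y": y} for x in range(-3, 4) for y in range(-3, 4)]
abstract_states = {abstract(s, PREDICATES) for s in concrete_states}
```

With k = 2 predicates, the 49 concrete states above fall into at most 4 abstract states.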
4Predicate abstraction in practice
- Boolean Program from C programs
- SLAM
- Software Model Checking
- BLAST, MAGIC, ...
- Loop invariant synthesis for arrays and lists
- ESC-JAVA, ...
- Distributed Protocol Verification
- UCLID, Murphi, ...
5Definitions
- Predicates
- Literals in some theory T
- P = {x = 1, x = y, x < y + 2, f(x) = f(y) + 2, ...}
- Formula
- Boolean combination of predicates
- ¬(x = 1 ∧ x < y + 2)
6Fundamental Operation: Predicate Cover
- F_P(φ)
- Predicate cover of φ
- Weakest expression over P that implies φ
P: set of predicates; φ: formula
7Example
P = {x < y, x = 2}
- Minterms over P
- x ≥ y ∧ x ≠ 2
- x < y ∧ x ≠ 2
- x ≥ y ∧ x = 2
- x < y ∧ x = 2
φ: y > 1
8Traditional approaches
- F_P(φ)
- Predicate cover of φ
- Weakest expression over P that implies φ
- Check which minterms imply φ
- Use a decision procedure to check the implication
- Exponential number of decision procedure calls
[Figure: state space partitioned by the predicates, with the regions φ and F_P(φ)]
P: set of predicates; φ: formula
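The minterm-by-minterm approach can be sketched on the example P = {x < y, x = 2}, φ: y > 1. This is only an illustration: a bounded search over a small integer range (`DOMAIN`) stands in for a real decision procedure, so it is not sound in general, and the names `implies` and `predicate_cover` are hypothetical.

```python
from itertools import product

DOMAIN = range(-6, 7)  # assumption: bounded search replaces a real decision procedure

def implies(minterm, preds, phi):
    """Bounded check that every point of the minterm also satisfies phi."""
    for x, y in product(DOMAIN, repeat=2):
        in_minterm = all(p(x, y) == sign for p, sign in zip(preds, minterm))
        if in_minterm and not phi(x, y):
            return False
    return True

def predicate_cover(preds, phi):
    """F_P(phi) as the list of minterms (one truth value per predicate) that imply phi."""
    # One implication check per minterm: 2^|P| checks in total.
    return [m for m in product([False, True], repeat=len(preds))
            if implies(m, preds, phi)]

preds = [lambda x, y: x < y, lambda x, y: x == 2]
phi = lambda x, y: y > 1
cover = predicate_cover(preds, phi)
```

Here `cover` contains the single minterm `(True, True)`, that is x < y ∧ x = 2. The loop over `product([False, True], ...)` is the source of the exponential number of calls.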
9Traditional approaches
- Large number of decision procedure calls
- Worst case exponential in |P|
- Exponential behavior often seen in practice
- Each decision procedure call can be expensive
- Limits scalability
- F_P(φ) invoked a few thousand times during a single software verification run
- Tools have to sacrifice precision for efficiency
10Overview of the talk
- Two approaches to predicate abstraction
- Symbolic Decision Procedures
- Satisfiability Modulo Theory (SMT) based
- Symbolic decision procedures (SDP)
- Lahiri, Ball, Cook, CAV05
- SMT-based predicate abstraction
- Eager: Lahiri, Bryant, Cook, CAV03
- DPLL(T) based: Lahiri, Oliveras, Nieuwenhuis, CAV06
- Challenges ahead
11Predicate Abstraction using Symbolic Decision
Procedures
12Overview of SDP
- Symbolic Decision Procedures
- Predicate abstraction
- SDP for Equality Logic
- Combining SDP for two theories
13Computing F_P(φ)
- F_P distributes over conjunction
- F_P(φ1 ∧ φ2) = F_P(φ1) ∧ F_P(φ2)
- Suffices to compute F_P(e1 ∨ e2 ∨ ... ∨ en)
- Each ei is a literal
- First convert φ to an equivalent conjunctive normal form (CNF)
- Rest of the talk, assume n = 1 (simplicity)
- Concentrate on computing F_P(e)
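The reduction to clauses can be written out as follows. The clause names C_j and the index bounds k and n_j are notation introduced here for illustration only.

```latex
\begin{align*}
\varphi &\equiv \bigwedge_{j=1}^{k} C_j,
  \qquad C_j = e_{j,1} \lor \dots \lor e_{j,n_j} \\
F_P(\varphi) &= F_P\Big(\bigwedge_{j=1}^{k} C_j\Big)
  = \bigwedge_{j=1}^{k} F_P(C_j)
\end{align*}
```

Each F_P(C_j) is a predicate cover of a single clause, which is the case the remaining slides address (with n_j = 1).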
14Decision Procedure (DP)
- Input
- A set G = {g1, ..., gm} of literals
- A literal e
- Output
- Is G ⇒ e valid?
- Equivalently
- Is g1 ∧ ... ∧ gm ∧ ¬e UNSAT?
- Is G ∧ ¬e UNSAT?
15Symbolic Decision Procedure (SDP)
- Input
- A set G = {g1, ..., gm} of atomic expressions
- An atomic expression e
- Output
- Representation for
- {G' | G' ⊆ G, and G' ∧ ¬e is UNSAT}
- Symbolic Decision Procedure
- One run of SDP(G, e) represents an exponential (2^|G|) number of runs of DP(G', e)
16Predicate Abstraction and SDP
- PBar = {¬p | p ∈ P}
- SDP(P ∪ PBar, e) represents F_P(e)
- F_P(e)
- all minterms over P ∪ PBar that imply e
- SDP(P ∪ PBar, e)
- {G' | G' ⊆ P ∪ PBar, and G' ∧ ¬e is UNSAT}
17Overview of SDP
- Symbolic Decision Procedures
- Predicate abstraction
- SDP for Equality Logic
- Combining SDP for two theories
18A Decision Procedure for Equality Logic
- Atomic expressions
- x = y, x ≠ y
- Inference Rules (R)
- Reflexivity, Symmetry, Transitivity
- Contradiction
- x = y, x ≠ y ⟹ ⊥
- Inference rule
- generates a new expression from existing expressions
19A Decision Procedure for Equality Logic
G = {a = b, b = c}, e = (a = c)
- Atomic expressions
- x = y, x ≠ y
- Inference Rules (R)
- Reflexivity, Symmetry, Transitivity
- Contradiction
- x = y, x ≠ y ⟹ ⊥
- Inference rule
- generates a new expression from existing expressions
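A minimal sketch of a saturation-based decision procedure for equality logic, under the assumption that literals are encoded as triples `("eq", x, y)` or `("neq", x, y)`. The function name `dp_equality` and this encoding are illustrative choices, not taken from the talk.

```python
def dp_equality(G, e):
    """Return True iff G ∧ ¬e is UNSAT, i.e. G implies e.

    G: iterable of literals ("eq", x, y) or ("neq", x, y); e: one such literal.
    """
    def negate(lit):
        kind, x, y = lit
        return ("neq" if kind == "eq" else "eq", x, y)

    literals = list(G) + [negate(e)]
    terms = {t for _, x, y in literals for t in (x, y)}

    eqs = {(t, t) for t in terms}                 # reflexivity
    for kind, x, y in literals:
        if kind == "eq":
            eqs |= {(x, y), (y, x)}               # symmetry

    changed = True
    while changed:                                # transitivity, to a fixpoint
        derived = {(x, z) for (x, y) in eqs for (y2, z) in eqs if y == y2}
        changed = not derived <= eqs
        eqs |= derived

    # Contradiction rule: some x = y was derived while x ≠ y is asserted.
    return any(kind == "neq" and (x, y) in eqs for kind, x, y in literals)

G = [("eq", "a", "b"), ("eq", "b", "c")]
proved = dp_equality(G, ("eq", "a", "c"))
```

For G = {a = b, b = c} and e = (a = c), the negated goal a ≠ c clashes with the derived a = c, so `proved` is `True`.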
20A Decision Procedure for Equality Logic
G ∧ ¬e
- Atomic expressions
- x = y, x ≠ y
- Inference Rules (R)
- Reflexivity, Symmetry, Transitivity
- Contradiction
- x = y, x ≠ y ⟹ ⊥
- Inference rule
- generates a new expression from existing expressions
[Figure: G ∧ ¬e passes through lg(|G|) applications of the rules R; if the result contains ⊥ (Yes), the answer is UNSAT, otherwise SAT]
21Symbolic DP for Equality Logic
G = {a = b, b = c, a = d, d = c}, e = (a = c)
- Modifications
- Introduce a Boolean variable g for each expression g in G
- Add true for ¬e
- Construct a shared expression for the derivations
[Figure: derivation DAG with leaves a = b, b = c, a = d, d = c (Boolean variables) and a ≠ c (true)]
22Symbolic DP for Equality Logic
G = {a = b, b = c, a = d, d = c}, e = (a = c)
- Modifications
- Introduce a Boolean variable g for each expression g in G
- Add true for ¬e
- Construct a shared expression for the derivations
[Figure: derivation DAG with leaves a = b, b = c, a = d, d = c (Boolean variables) and a ≠ c (true), plus the derived node a = c]
23Symbolic DP for Equality Logic
G = {a = b, b = c, a = d, d = c}, e = (a = c)
- Modifications
- Introduce a Boolean variable g for each expression g in G
- Add true for ¬e
- Construct a shared expression for the derivations
- SDP(G, e)
- The expression representing ⊥ after lg(|G|) steps
[Figure: derivation DAG with leaves a = b, b = c, a = d, d = c (Boolean variables) and a ≠ c (true), plus the derived nodes a = c and ⊥]
24Symbolic DP for Equality Logic
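G = {a = b, b = c, a = d, d = c}, e = (a = c)
- Output
- A shared Boolean expression with Boolean variables in the leaves
[Figure: derivation DAG with leaves a = b, b = c, a = d, d = c (Boolean variables) and a ≠ c (true), plus the derived nodes a = c and ⊥]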
25SDP for Equality Logic
- Expression representing ⊥ after lg(|G|) steps
- Shared expression for {G' | G' ⊆ G, and DP(G', e) is UNSAT}
- Shared expression can be computed in polynomial time
- Derivations repeated for lg(|G|) steps
- Each step has at most V^2 atomic expressions
- V = number of vars in G
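The symbolic procedure for equality logic can be sketched as below, restricted (as an assumption of this example) to a set G of equalities and an equality goal e. Expressions are nested tuples that share sub-expressions by reference, and each round squares the reachable path length, so ceil(lg |G|) rounds suffice. The names `sdp_equality`, `evaluate`, `AND`, and `OR` are hypothetical helpers, not the authors' implementation.

```python
from math import ceil, log2

TRUE, FALSE = ("const", True), ("const", False)

def OR(a, b):
    if a == TRUE or b == TRUE:
        return TRUE
    if a == FALSE:
        return b
    if b == FALSE:
        return a
    return ("or", a, b)

def AND(a, b):
    if a == FALSE or b == FALSE:
        return FALSE
    if a == TRUE:
        return b
    if b == TRUE:
        return a
    return ("and", a, b)

def sdp_equality(G, e):
    """G: list of equalities (x, y); e: an equality (x, y).

    Returns an expression over ("var", i), i indexing G, that is true exactly
    for the subsets G' of G such that G' ∧ ¬e is UNSAT.
    """
    terms = sorted({t for pair in list(G) + [e] for t in pair})
    E = {(x, y): (TRUE if x == y else FALSE) for x in terms for y in terms}
    for i, (x, y) in enumerate(G):
        E[(x, y)] = OR(E[(x, y)], ("var", i))
        E[(y, x)] = OR(E[(y, x)], ("var", i))
    steps = max(1, ceil(log2(max(2, len(G)))))
    for _ in range(steps):                       # one symbolic transitivity round
        nxt = {}
        for x in terms:
            for y in terms:
                acc = E[(x, y)]
                for z in terms:
                    acc = OR(acc, AND(E[(x, z)], E[(z, y)]))
                nxt[(x, y)] = acc
        E = nxt
    return E[e]   # ⊥ follows from e together with ¬e, and ¬e is labelled true

def evaluate(expr, chosen):
    """Evaluate the expression when exactly the literals indexed by `chosen` are in G'."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "var":
        return expr[1] in chosen
    left, right = evaluate(expr[1], chosen), evaluate(expr[2], chosen)
    return (left or right) if kind == "or" else (left and right)

G = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c")]
expr = sdp_equality(G, ("a", "c"))
```

One call to `sdp_equality` answers all 2^|G| subset queries: for instance `evaluate(expr, {0, 1})` is `True` (a = b, b = c), while `evaluate(expr, {0, 3})` is `False` (a = b, d = c).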
26SDP for other theories
G ∧ ¬e
- Bounded-depth Saturating Theory T
- Decision procedure for T can be implemented by saturation
- Provide a function Depth: G → Nat, to denote the max. depth to iterate
[Figure: G ∧ ¬e passes through Depth(G) applications of the rules R; if the result contains ⊥ (Yes), the answer is UNSAT, otherwise (No) SAT]
27SDP for other theories
- Equality with Uninterpreted Functions (EUF)
- Expressions f(x) = f(g(y)), x = f(z)
- Depth(G) < 3m
- m is the number of terms in G
- Polynomial Complexity of SDP
- Difference Logic (DIFF)
- Expressions x ≤ y + c
- Depth(G) < lg(|G|)
- Pseudo Polynomial Complexity of SDP
- Depends on the size of constants in G
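For difference logic, a plain (non-symbolic) decision procedure can be sketched by saturating the transitivity rule x ≤ y + c1, y ≤ z + c2 ⟹ x ≤ z + (c1 + c2) and looking for a derived x ≤ x + c with c < 0. The sketch below does this with a Floyd-Warshall style loop, which is a substitute chosen here for brevity and not the bounded-depth procedure from the talk. Literals are assumed to be triples `(x, y, c)` meaning x ≤ y + c over the integers.

```python
from itertools import product

def dp_difference(G, e):
    """Return True iff G ∧ ¬e is UNSAT over the integers.

    Each literal is a triple (x, y, c) meaning x <= y + c, with integer c.
    """
    x, y, c = e
    # Over the integers, ¬(x <= y + c) is equivalent to y <= x - c - 1.
    literals = list(G) + [(y, x, -c - 1)]
    names = sorted({v for a, b, _ in literals for v in (a, b)})
    inf = float("inf")
    bound = {(a, b): (0 if a == b else inf) for a in names for b in names}
    for a, b, k in literals:
        bound[(a, b)] = min(bound[(a, b)], k)
    # Saturate transitivity; `mid` varies slowest, as in Floyd-Warshall.
    for mid, a, b in product(names, repeat=3):
        via = bound[(a, mid)] + bound[(mid, b)]
        if via < bound[(a, b)]:
            bound[(a, b)] = via
    # Contradiction: some v <= v + k with k < 0 has been derived.
    return any(bound[(v, v)] < 0 for v in names)

G = [("x", "y", 1), ("y", "z", -2)]      # x <= y + 1, y <= z - 2
implied = dp_difference(G, ("x", "z", -1))
```

Here G entails x ≤ z - 1, so `implied` is `True`, while the stronger goal x ≤ z - 2 is not entailed.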
28Overview of SDP
- Predicate Abstraction
- Symbolic Decision Procedures
- Predicate abstraction
- SDP for Equality Logic
- Combining SDP for two theories
29Combining SDPs for two theories
- Extend Nelson-Oppen method for combining decision procedures for two theories T1, T2
- Nelson, Oppen, TOPLAS 79
- The decision procedures communicate via equalities over shared variables
- Given SDP1 and SDP2 for theories T1, T2
- Disjoint signatures, convex theories
- Each theory generates derivations of all equalities between variables
- Complexity of the resultant SDP (for T1 ∪ T2) only increases linearly in the number of variables
30Combining SDP for two theories
[Figure: SDP1 (over G1) and SDP2 (over G2) chained as SDP1, SDP2, SDP1, exchanging equalities x = y over shared variables; N = number of shared variables]
31Combining SDP for theories
- Combined SDP for EUF + DIFF
- Pseudo Polynomial complexity
- Important fragment of most program verification
queries (especially in SLAM)
32SDP to Predicate Abstraction
- Output of SDP is an Expression DAG
- Represents F_P(e)
- Can be used directly to construct Boolean programs (with intermediate variables)
- To compute explicit expression for F_P(e)
- Construct a Binary Decision Diagram (BDD) from SDP, and enumerate prime-implicants
- BDDs crucial for exploiting the shared representation
33Evaluation
- SLAM benchmarks
- Generated 665 predicate abstraction queries from device driver verification
- Decision Procedure (Zapato) based approach: 27904 sec
- SDP based approach: 273 sec
- 100X speedup
34Challenges
- SDP for other interesting theories and combinations
- Linear arithmetic, non-convex theories
- Incremental SDPs
- Useful for combining SDPs
- Output sensitive predicate abstraction?
- Complexity is polynomial in the number of minterms in the output
35Conclusion
- Predicate abstraction via symbolic decision procedures
- Polynomial algorithms for useful theories
- Modular combination of Symbolic Decision Procedures for theories
- Can design SDP for each theory in isolation
- Simple prototype implementation
- Promising results on SLAM queries
36Overview of the talk
- Two approaches to predicate abstraction
- Symbolic Decision Procedures
- Satisfiability Modulo Theory (SMT) based
- Symbolic decision procedures (SDP)
- Lahiri, Ball, Cook, CAV05
- SMT-based predicate abstraction
- Eager: Lahiri, Bryant, Cook, CAV03
- DPLL(T) based: Lahiri, Oliveras, Nieuwenhuis, CAV06
- Challenges ahead
37SMT-based predicate abstraction
38Satisfiability Modulo Theories (SMT)
- SMT
- Decide satisfiability of a (ground) first-order formula with respect to a background theory T
- Example (EUF)
- g(a) = c ∧ (f(g(a)) ≠ f(c) ∨ g(a) = d) ∧ c ≠ d
- SMT-solvers
- Leverage the efficient Boolean search of Boolean satisfiability (SAT) solvers
39SMT for predicate abstraction
- Input
- A formula φ, a set of predicates P over a theory T
- Output
- G_P(φ): External predicate cover of φ
- Same as ¬F_P(¬φ)
- Main Idea: Lahiri et al. CAV03, Clarke et al. FMSD 04
- Introduce fresh Boolean variables B = {b1, ..., bn}
- Construct the formula φ ∧ (⋀i (bi ⇔ Pi))
- Enumerate all the models over B
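Stated as formulas, the construction amounts to projecting the augmented formula onto B. The name ψ and the explicit variable tuple X (the non-Boolean variables of φ) are notation assumed here for illustration.

```latex
\begin{align*}
\psi(X, B) &\equiv \varphi(X) \land \bigwedge_{i=1}^{n} \big(b_i \Leftrightarrow P_i(X)\big) \\
G_P(\varphi)(B) &\equiv \exists X.\; \psi(X, B) \\
F_P(\varphi) &\equiv \neg\, G_P(\neg \varphi)
\end{align*}
```

Enumerating the models of ψ over B therefore yields G_P(φ) with each b_i read as P_i, and the last line recovers the predicate cover F_P from it.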
40Eager SMT techniques
φ(X, B)
- Methodology
- Translates a (ground) formula into equisatisfiable Boolean formula
- Use off-the-shelf SAT solvers to check the satisfiability
- Tools: UCLID
[Figure: φ(X, B) → Equisatisfiable Translation → φbool(A, B); A = variables introduced during translation]
41Predicate abstraction using eager SMT techniques
φ ∧ (⋀i (bi ⇔ Pi))
- Methodology
- Lahiri, Bryant, Cook, CAV03
- Translates a (ground) formula into Boolean formula
- Use off-the-shelf BDD or SAT solvers to perform AllSAT over B
- Implemented in UCLID
- Uses SATQE (Kroening)
[Figure: φ ∧ (⋀i (bi ⇔ Pi)) → Equisatisfiable Translation → φbool(A, B); the translation preserves solutions over the Boolean variables; A = variables introduced during translation]
42Advantage over explicit approach
- Single Call to SAT-based Quantification Engine
- Removes exponential number of calls to theorem prover
- Learning in Incremental SAT
- Retains conflict clauses across different solutions
- Leverage future advances in SAT
- Without any change to the framework
43Evaluation
- Compared with a black-box decision procedure based approach
- Das, Dill and Park, CAV99
- SLAM benchmarks
- Device driver verification
- Eager SMT technique improves 50-100X on many benchmarks
- Distributed protocol verification (UCLID)
- Lahiri, Bryant, VMCAI04
- Decision procedure (SVC/CVC) based approach unable to finish on most examples
- > 10,000 theorem prover calls
44Lazy SMT techniques
- Integrate a theory T-solver with SAT solver
- Lazily rule out T-inconsistent Boolean models using theory solver
- CVC-Lite, Verifun, MathSAT, Barcelogic, ...
- Barcelogic Tool
- R. Nieuwenhuis and A. Oliveras, CAV05
- Optimizations (based on DPLL(T))
- Check partial Boolean models for T-inconsistency
- Upon T-inconsistency, use the explanation as a conflicting clause and perform backjump
- Theory (unit) propagation to generate implied facts
45Predicate abstraction using lazy methods
- Lahiri, Nieuwenhuis, Oliveras, CAV06, using Barcelogic
- Enumerate all the models over B for ψ
- ψ := φ ∧ (⋀i (bi ⇔ Pi))
- while ψ is T-satisfiable do
- M := T-model for ψ using SMT-solver
- M_B := project M onto B
- Consider ¬M_B as a conflicting clause
- Perform conflict analysis to generate backjump clause
- Optionally add backjump clause
- Backjump and continue
- return all models over B
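The enumeration loop can be sketched with a toy stand-in for the SMT solver. Everything here is an assumption for illustration: `toy_solve` searches a small bounded integer domain instead of calling a real solver, blocking is done with a plain set of excluded B-assignments rather than conflict analysis and backjumping, and the example formula and predicates are invented for this sketch.

```python
from itertools import product

def toy_solve(phi, preds, blocked, domain):
    """Stand-in for an SMT solver: find (x, y) satisfying phi whose
    projection onto B (the truth values of the predicates) is not blocked."""
    for x, y in product(domain, repeat=2):
        if not phi(x, y):
            continue
        b = tuple(p(x, y) for p in preds)   # enforces b_i <=> P_i
        if b not in blocked:
            return (x, y), b
    return None

def all_models_over_B(phi, preds, domain=range(-4, 5)):
    """Enumerate every assignment to B that extends to a model of phi."""
    blocked = set()
    while True:
        result = toy_solve(phi, preds, blocked, domain)
        if result is None:                  # no further model: done
            return blocked
        _model, b = result
        blocked.add(b)                      # plays the role of the clause ¬M_B

preds = [lambda x, y: x < y, lambda x, y: x == 2]
models = all_models_over_B(lambda x, y: x > y, preds)
```

For φ: x > y, the predicate x < y can never hold, so `models` is `{(False, True), (False, False)}`. A real implementation keeps the solver state and learned clauses between iterations instead of restarting the search.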
46Experimental results
- SLAM benchmarks
- 5 seconds on 665 benchmarks
- > 100X improvement on SDP based approach
- Hardware and protocol benchmarks (UCLID)
- 7 sets of benchmarks
- 22X-143X improvement over Eager-SMT based approach
- Linked list verification, Lahiri, Qadeer, POPL06
- 4 sets of benchmarks
- 31X-40X improvement over Eager-SMT based approach
- SDP-based technique not applied on the latter two classes
- Need support for (sound) quantifier-reasoning
47Hardware and protocol benchmarks
Benchmark   Preds  Eager (secs)  Lazy (secs)  Minterms  Cubes
Aodv        21     657           4.6          2916      458
Bakery      32     245           11           426       294
BRP         22     3.5           0.1          30        24
Cache_ibm   16     34            1.3          326       123
Cache_ibm2  26     1119          23           2238      1022
Dlx         23     335           13           30808     2704
OOO         25     921           36           10728     242
Cubes = number of prime-implicants in the BDD for the minterms
- Theory propagation crucial for benchmarks with arithmetic
- E.g. 17X slowdown in OOO without it
- Reusing lemmas and clauses improves 1.5X-3X on most examples
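The eager-to-lazy speedups implied by the table can be recomputed directly from its two timing columns. The snippet below is only an arithmetic check; the variable names are arbitrary.

```python
# (benchmark, eager seconds, lazy seconds), copied from the table above
TIMES = [
    ("Aodv", 657, 4.6),
    ("Bakery", 245, 11),
    ("BRP", 3.5, 0.1),
    ("Cache_ibm", 34, 1.3),
    ("Cache_ibm2", 1119, 23),
    ("Dlx", 335, 13),
    ("OOO", 921, 36),
]

speedups = {name: eager / lazy for name, eager, lazy in TIMES}
lowest, highest = min(speedups.values()), max(speedups.values())
print(f"speedup range: {lowest:.0f}X to {highest:.0f}X")
```

The smallest ratio comes from Bakery and the largest from Aodv, which matches the 22X to 143X range quoted for these benchmarks.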
48Conclusions
- Relatively easy to adapt an SMT solver to perform predicate abstraction
- Clear benefit from leveraging learned clauses and not restarting the search after each model
- Improvements in SMT translate to the predicate abstraction case
49Overview of the talk
- Two approaches to predicate abstraction
- Symbolic Decision Procedures
- Satisfiability Modulo Theory (SMT) based
- Symbolic decision procedures (SDP)
- Lahiri, Ball, Cook, CAV05
- SMT-based predicate abstraction
- Eager: Lahiri, Bryant, Cook, CAV03
- DPLL(T) based: Lahiri, Oliveras, Nieuwenhuis, CAV06
- Challenges ahead
50Summary
- Symbolic decision procedures
- Can construct DAG representation of output in polynomial time for useful theories
- Modular combination of SDPs
- Require more optimizations to make it practical
- SMT-based procedures
- Can leverage SMT solvers without much effort
- ALLSAT using SAT-solvers (Eager) or SMT solvers (Lazy)
- Lazy approaches benefit from tighter SAT-theory reasoning
51Challenges for predicate abstraction tools
- Predicate abstraction with non-ground formulas
- Quantifiers were removed with simple instantiation techniques for UCLID/List verification benchmarks
- Generate partial models during ALLSAT
- Should improve the performance when the ratio of minterms to cubes is large
- Incremental refinement of approximations
- Construct refined approximation of F_P(φ) from coarser approximations, without repeating work
- Some initial directions in CAV06 paper
- Refining the abstraction (incrementally) with monotonically increasing set of predicates
52 Questions?
54Overview
- Predicate Abstraction
- Symbolic Decision Procedures (SDP)
- Predicate abstraction
- SDP for Equality Logic
- Combining SDP for two theories
- Implementation and Results
- Related Work
55Zap Overview
- Ball, Lahiri, Musuvathi
- Many automated program analysis tools require symbolic reasoning
- e.g. Unit-testing, model checking, static analysis, ...
- Support symbolic operations for such tools
- Support richer operations, apart from validity checking
- Support useful theories for program analysis
- Leverage advances in SAT solving and theorem proving
[Figure: MUTT (unit-testing), Zing (model checking), Boogie (static analysis), SLAM/SDV; Zap theorem prover]
56Symbolic Reasoning for Automated Software Analysis
- Validity / Satisfiability
- Model generation
- Useful in test case generation
- Quantifier elimination
- Image operation in model checking
- Abstract interpretation operations
- abstract transformers, join, widen
- Interpolants
- For abstraction-refinement
57Interesting Theories
- Theories
- Equality with uninterpreted functions (EUF)
- Linear Arithmetic
- Arrays
- Bounded Integers
- Lists
- Sets
- Combine the symbolic operations for different
theories
58Symbolic Reasoning for Automated Software Analysis
- Validity / Satisfiability
- Model generation
- Useful in test case generation
- Quantifier elimination
- Image operation in model checking
- Abstract interpretation operations
- abstract transformers, join, widen
- Interpolants
- For abstraction-refinement
59¬F_P(¬φ)
60Evaluation
- SLAM benchmarks
- Generated 665 predicate abstraction queries from device driver verification
- Decision Procedure based approach: 27904 sec
- SDP based approach: 273 sec
- 100X speedup
- Synthetic benchmark
- Comparison with UCLID
- More than 100X speedup
61Related Work
- Decision Procedure Based
- Calls a decision procedure to check implication with each minterm
- Das & Dill, Saïdi & Shankar, ...
- Boolean Quantifier Elimination Based
- Lahiri, Bryant, Cook, CAV 03; Clarke et al., FMSD 04
- Performs predicate abstraction by quantifier elimination
- Reduces restricted first-order quantifier elimination to Boolean quantifier elimination
62Experimental Setup
- Symbolic Method
- Incremental SAT-based method
- SATQE: Simple extension to Zchaff
- Built by Daniel Kroening at CMU
- Explicit Method
- Algorithm of Das, Dill & Park, CAV99
- Avoids exponential worst case in many cases in practice
- Uses SVC as a decision procedure
- Device Driver Benchmarks from SLAM Toolkit
- Ball and Rajamani, MSR
- Queries during C → Boolean Program construction
63Evaluation on SLAM-benchmarks
Example  Preds  Explicit calls  Explicit time (sec)  Symbolic prop-vars  Symbolic SAT-based time (sec)
Dr.10    19     >7576           >1000                115                 9.9
Dr.13    20     >7351           >1000                234                 44.7
Dr.15    23     >7237           >1000                336                 68.2
Dr.17    15     3041            507                  105                 6.1
Dr.3     13     2023            355                  125                 7.0
- BDD based approach worse than SAT on larger benchmarks
65Challenges ahead