Title: Introduction to Satisfiability Modulo Theories (SMT)
1 Introduction to Satisfiability Modulo Theories (SMT)
- Clark Barrett, NYU
- Sanjit A. Seshia, UC Berkeley
ICCAD Tutorial, November 2, 2009
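2 Boolean Satisfiability (SAT)
[Figure: a Boolean formula $\varphi$ drawn as a circuit of AND ($\wedge$) and OR ($\vee$) gates over the inputs p1, p2, ..., pn]
Is there an assignment to the variables p1, p2, ..., pn such that $\varphi$ evaluates to 1?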
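The SAT question can be sketched by exhaustive enumeration. This is only an illustration: the formula `phi` below is a hypothetical three-variable example, not the circuit on the slide, and real SAT solvers search far more cleverly than trying all $2^n$ assignments.

```python
from itertools import product

def brute_force_sat(formula, num_vars):
    """Return the first satisfying assignment (a tuple of bools), or None."""
    for assignment in product([False, True], repeat=num_vars):
        if formula(*assignment):
            return assignment
    return None

# Hypothetical formula: (p1 or p2) and (not p1 or p3) and (not p2 or not p3)
def phi(p1, p2, p3):
    return (p1 or p2) and ((not p1) or p3) and ((not p2) or (not p3))

model = brute_force_sat(phi, 3)
print(model)
```

Passing a contradictory formula such as `lambda p: p and not p` returns `None`, which corresponds to the unsatisfiable case.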
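3 Satisfiability Modulo Theories
[Figure: a circuit $\varphi$ of AND ($\wedge$) and OR ($\vee$) gates whose inputs are Boolean variables p1, p2, ..., pn and theory atoms over the variables x, y, z, w, and v, including a bit-vector atom with the constant 0xFFFF]
Is there an assignment to the variables x, y, z, w such that $\varphi$ evaluates to 1?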
4 Satisfiability Modulo Theories
- Given a formula in first-order logic, with associated background theories, is the formula satisfiable?
  - Yes: return a satisfying solution
  - No: generate a proof of unsatisfiability
5 Applications of SMT
- Hardware verification at higher levels of abstraction (RTL and above)
- Verification of analog/mixed-signal circuits
- Verification of hybrid systems
- Software model checking
- Software testing
- Security: finding vulnerabilities, verifying electronic voting machines, ...
- Program synthesis
6 References
- Satisfiability Modulo Theories. Clark Barrett, Roberto Sebastiani, Sanjit A. Seshia, and Cesare Tinelli. Chapter 8 in the Handbook of Satisfiability, Armin Biere, Hans van Maaren, and Toby Walsh, editors, IOS Press, 2009. (Available from our webpages.)
- SMT-LIB: a repository for SMT formulas (common format) and tools
- SMT-COMP: an annual competition of SMT solvers
7 Roadmap for this Tutorial
- Background and Notation
- Survey of Theories
- Theory Solvers
- Approaches to SMT Solving
- Lazy Encoding to SAT
- Eager Encoding to SAT
- Conclusion
9 First-Order Logic
- A formal notation for mathematics, with expressions involving:
  - Propositional symbols
  - Predicates
  - Functions and constant symbols
  - Quantifiers
- In contrast, propositional (Boolean) logic only involves propositional symbols and operators
10 First-Order Logic Syntax
- As with propositional logic, expressions in first-order logic are made up of sequences of symbols.
- Symbols are divided into logical symbols and non-logical symbols (parameters).
- Example: (x y) $\wedge$ (y z) $\wedge$ (f(z) f(x)1)
11 First-Order Logic Syntax
- Logical symbols
  - Propositional connectives: $\vee$, $\wedge$, $\neg$, $\rightarrow$, $\leftrightarrow$
  - Variables: v1, v2, ...
  - Quantifiers: $\forall$, $\exists$
- Non-logical symbols (parameters)
  - Equality: $=$
  - Functions: $+$, $-$, bit-wise operators, f(), concat, ...
  - Predicates: comparison symbols, is_substring, ...
  - Constant symbols: 0, 1.0, null, ...
12 Quantifier-free Subset
- We will largely restrict ourselves to formulas without quantifiers ($\forall$, $\exists$)
- This is called the quantifier-free subset/fragment of first-order logic with the relevant theory
13 Logical Theory
- Defines a set of parameters (non-logical symbols) and their meanings
- This definition is called a signature.
- Example of a signature
  - Theory of linear arithmetic over integers
  - Signature is $(0, 1, +, -, \le)$ interpreted over $\mathbb{Z}$
14 Roadmap for this Tutorial
- Background and Notation
- Survey of Theories
- Theory Solvers
- Two Approaches to SMT Solving
- Lazy Encoding to SAT
- Eager Encoding to SAT
- Conclusion
15 Some Useful Theories
- Equality (with uninterpreted functions)
- Linear arithmetic (over $\mathbb{Q}$ or $\mathbb{Z}$)
- Difference logic (over $\mathbb{Q}$ or $\mathbb{Z}$)
- Finite-precision bit-vectors
  - integer or floating-point
- Arrays / memories
- Misc.: non-linear arithmetic, strings, inductive datatypes (e.g. lists), sets, ...
16 Theory of Equality and Uninterpreted Functions (EUF)
- Also called the free theory
  - Because function symbols can take any meaning
  - The only property required is congruence: these symbols map identical arguments to identical values, i.e., $x = y \Rightarrow f(x) = f(y)$
- SMT-LIB name: QF_UF
17 Data and Function Abstraction with EUF
Common operations:
- If-then-else: ITE(p, x, y), a multiplexer that selects x when p is 1 and y when p is 0
- Test for equality: x = y
18 Hardware Abstraction with EUF
[Figure: datapath blocks replaced by uninterpreted functions F1, F2, F3]
- For any block that transforms or evaluates data:
  - Replace with generic, unspecified function
  - Also view instruction memory as function
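19 Example QF_UF (EUF) Formula
- $(x = y) \wedge (y = z) \wedge (f(x) \neq f(z))$
- Transitivity: $(x = y) \wedge (y = z) \Rightarrow (x = z)$
- Congruence: $(x = z) \Rightarrow (f(x) = f(z))$
- Together these contradict $f(x) \neq f(z)$, so the formula is unsatisfiable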
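The transitivity and congruence steps can be mechanized with a union-find structure. The following is a minimal sketch, not the algorithm of any particular solver: it handles only unary function applications, names terms by strings, and uses a simple quadratic propagation loop. The class and function names are hypothetical.

```python
class CongruenceClosure:
    """Tiny congruence closure over string-named terms and unary functions."""

    def __init__(self):
        self.parent = {}
        self.apps = []  # (function name, argument term, application term)

    def find(self, term):
        self.parent.setdefault(term, term)
        while self.parent[term] != term:
            self.parent[term] = self.parent[self.parent[term]]  # path halving
            term = self.parent[term]
        return term

    def add_app(self, func, arg):
        term = f"{func}({arg})"
        self.find(arg)
        self.find(term)
        self.apps.append((func, arg, term))
        return term

    def merge(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_a] = root_b  # transitivity comes from union-find
            self._propagate()

    def _propagate(self):
        # Congruence: equal arguments force equal applications of the same function.
        changed = True
        while changed:
            changed = False
            for f1, a1, t1 in self.apps:
                for f2, a2, t2 in self.apps:
                    if (f1 == f2 and self.find(a1) == self.find(a2)
                            and self.find(t1) != self.find(t2)):
                        self.parent[self.find(t1)] = self.find(t2)
                        changed = True

def euf_satisfiable(equalities, disequalities, applications):
    cc = CongruenceClosure()
    for func, arg in applications:
        cc.add_app(func, arg)
    for a, b in equalities:
        cc.merge(a, b)
    return all(cc.find(a) != cc.find(b) for a, b in disequalities)

# x = y, y = z, f(x) != f(z)
result = euf_satisfiable([("x", "y"), ("y", "z")],
                         [("f(x)", "f(z)")],
                         [("f", "x"), ("f", "z")])
print(result)
```

Dropping the equality between y and z makes the same query satisfiable, because nothing then forces f(x) and f(z) into one class.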
20 Equivalence Checking of Program Fragments
int fun1(int y) {
  int x, z;
  z = y;
  y = x;
  x = z;
  return x*x;
}
int fun2(int y) {
  return y*y;
}
SMT formula $\varphi$ (satisfiable iff the programs are non-equivalent):
$(z = y \wedge y_1 = x \wedge x_1 = z \wedge \mathit{ret}_1 = x_1 * x_1) \wedge (\mathit{ret}_2 = y * y) \wedge (\mathit{ret}_1 \neq \mathit{ret}_2)$
What if we use SAT to check equivalence?
21 Equivalence Checking of Program Fragments
int fun1(int y) {
  int x, z;
  z = y;
  y = x;
  x = z;
  return x*x;
}
int fun2(int y) {
  return y*y;
}
SMT formula $\varphi$ (satisfiable iff the programs are non-equivalent):
$(z = y \wedge y_1 = x \wedge x_1 = z \wedge \mathit{ret}_1 = x_1 * x_1) \wedge (\mathit{ret}_2 = y * y) \wedge (\mathit{ret}_1 \neq \mathit{ret}_2)$
Using SAT to check equivalence (with MiniSat):
- 32 bits for y: did not finish in over 5 hours
- 16 bits for y: 37 sec.
- 8 bits for y: 0.5 sec.
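22 Equivalence Checking of Program Fragments
int fun1(int y) {
  int x, z;
  z = y;
  y = x;
  x = z;
  return x*x;
}
int fun2(int y) {
  return y*y;
}
SMT formula $\varphi$, with squaring abstracted by the uninterpreted function sq:
$(z = y \wedge y_1 = x \wedge x_1 = z \wedge \mathit{ret}_1 = sq(x_1)) \wedge (\mathit{ret}_2 = sq(y)) \wedge (\mathit{ret}_1 \neq \mathit{ret}_2)$
Using an EUF solver: 0.01 sec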
23 Equivalence Checking of Program Fragments
int fun1(int y) {
  int x;
  x = x ^ y;
  y = x ^ y;
  x = x ^ y;
  return x*x;
}
int fun2(int y) {
  return y*y;
}
Does EUF still work?
No! We must reason about bit-wise XOR, so we need a solver for bit-vector arithmetic. Solvable in less than a second with a current bit-vector solver.
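For intuition about why the XOR version is still equivalent, here is a rough sketch that checks the two fragments exhaustively at a reduced width of 8 bits. This is plain enumeration in Python, not a bit-vector solver; the width `WIDTH` and the choice to treat the uninitialized `x` as an extra input are assumptions made for the illustration.

```python
WIDTH = 8                      # assumed reduced word width for the experiment
MASK = (1 << WIDTH) - 1

def fun1(x, y):
    # x models the uninitialized local; the three XORs swap x and y.
    x = (x ^ y) & MASK
    y = (x ^ y) & MASK
    x = (x ^ y) & MASK
    return (x * x) & MASK

def fun2(y):
    return (y * y) & MASK

# Collect every input pair on which the two fragments disagree.
counterexamples = [(x, y)
                   for x in range(MASK + 1)
                   for y in range(MASK + 1)
                   if fun1(x, y) != fun2(y)]
print(len(counterexamples))
```

The enumeration covers $2^{16}$ input pairs here; at 32 bits the same loop would need $2^{64}$ iterations, which is why word-level reasoning matters.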
24 Finite-Precision Bit-Vector Arithmetic (QF_BV)
- Fixed-width data words
  - Can model int, short, long, etc.
- Arithmetic operations
  - E.g., add/subtract/multiply/divide, comparisons
  - Two's complement and unsigned operations
- Bit-wise logical operations
  - E.g., and/or/xor, shift/extract, and equality
- Boolean connectives
25 Linear Arithmetic (QF_LRA, QF_LIA)
- Boolean combination of linear constraints of the form $(a_1 x_1 + a_2 x_2 + \dots + a_n x_n \bowtie b)$
- The $x_i$ could be in $\mathbb{Q}$ or $\mathbb{Z}$; $\bowtie \in \{\ge, >, \le, <, =\}$
- Many applications, including
  - Verification of analog circuits
  - Software verification, e.g., of array bounds
26 Difference Logic (QF_IDL, QF_RDL)
- Boolean combination of linear constraints of the form $x_i - x_j \bowtie c_{ij}$ or $x_i \bowtie c_i$
- $\bowtie \in \{\ge, >, \le, <, =\}$; the $x_i$ are in $\mathbb{Q}$ or $\mathbb{Z}$
- Applications
  - Software verification (most linear constraints are of this form)
  - Processor datapath verification
  - Job shop scheduling / real-time systems
  - Timing verification for circuits
27 Arrays/Memories
- SMT solvers can also be very effective in modeling data structures in software and hardware
  - Arrays in programs
  - Memories in hardware designs, e.g. instruction and data memories, CAMs, etc.
28 Theory of Arrays (QF_AX): Select and Store
- Two interpreted functions: select and store
  - select(A, i): read from A at index i
  - store(A, i, d): write d to A at index i
- Two main axioms
  - $\mathit{select}(\mathit{store}(A, i, d), i) = d$
  - $\mathit{select}(\mathit{store}(A, i, d), j) = \mathit{select}(A, j)$ for $i \neq j$
- One other axiom (extensionality)
  - $(\forall i.\ \mathit{select}(A, i) = \mathit{select}(B, i)) \Rightarrow A = B$
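The two read-over-write axioms can be illustrated with a small functional model of arrays. This is a sketch under assumptions: arrays are Python dictionaries with an assumed default value of 0 for unwritten indices, and `equal_on` is a hypothetical helper that compares two arrays on a finite index set rather than proving extensional equality.

```python
def store(array, index, value):
    """Return a new array equal to `array` except that `index` holds `value`."""
    updated = dict(array)   # copy, so the original array is left unchanged
    updated[index] = value
    return updated

def select(array, index):
    """Read `array` at `index`; unwritten cells read as 0 (an assumption)."""
    return array.get(index, 0)

def equal_on(a, b, indices):
    """Check select(a, i) == select(b, i) for every i in `indices`."""
    return all(select(a, i) == select(b, i) for i in indices)

A = {0: 10, 1: 20}
B = store(A, 1, 99)

print(select(B, 1))   # reads back the value just written
print(select(B, 0))   # other indices are unaffected
```

Because `store` returns a fresh array, terms such as `store(store(A, i, d), j, e)` can be built and read exactly as in the formulas.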
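29 Equivalence Checking of Program Fragments
int fun1(int y) {
  int x[2];
  x[0] = y;
  y = x[1];
  x[1] = x[0];
  return x[1]*x[1];
}
int fun2(int y) {
  return y*y;
}
SMT formula $\varphi$:
$x_1 = \mathit{store}(x, 0, y) \wedge y_1 = \mathit{select}(x_1, 1) \wedge x_2 = \mathit{store}(x_1, 1, \mathit{select}(x_1, 0)) \wedge \mathit{ret}_1 = sq(\mathit{select}(x_2, 1)) \wedge (\mathit{ret}_2 = sq(y)) \wedge (\mathit{ret}_1 \neq \mathit{ret}_2)$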
30 Roadmap for this Tutorial
- Background and Notation
- Survey of Theories
- Theory Solvers
- Two Approaches to SMT Solving
- Lazy Encoding to SAT
- Eager Encoding to SAT
- Conclusion
32 Roadmap for this Tutorial
- Background and Notation
- Survey of Theories
- Theory Solvers
- Approaches to SMT Solving
- Lazy Encoding to SAT
- Eager Encoding to SAT
- Conclusion
33 Eager Approach to SMT
SAT solver involved in theory reasoning
- Key ideas
  - Small-domain encoding
  - Constrain model search
  - Rewrite rules
  - Abstraction-based methods (eager + lazy)
- Example solvers
  - UCLID, STP, Spear, Boolector, Beaver, ...
34 Theories
- Eager encoding methods have been demonstrated for the following theories:
  - Equality and uninterpreted functions
  - Integer linear arithmetic
  - Restricted lambda expressions
  - Arrays, memories, etc.
  - Finite-precision bit-vector arithmetic
  - Strings
35 UCLID Operation
- Operation
  - Series of transformations leading to Boolean formula
  - Each step is validity (satisfiability) preserving
  - Each step performs optimizations
[Figure: pipeline. Input formula -> lambda expansion for arrays -> $\lambda$-free formula -> function and predicate elimination -> linear/bit-vector arithmetic formula -> encoding arithmetic -> Boolean formula -> Boolean satisfiability]
http://uclid.eecs.berkeley.edu
36 Rewrites: Eliminating Function Applications
- Two applications of an uninterpreted function f in a formula: f(x1) and f(x2)
37 Small-Domain Encoding
- Consider an SMT formula $\varphi(x_1, x_2, \dots, x_n)$ where $x_i \in D_i$
- Small-domain encoding / finite instantiation: derive a finite set $S_i \subset D_i$ s.t. $|S_i| \ll |D_i|$
  - In some cases, $S_i$ is finite where $D_i$ is infinite
- Encode each $x_i$ to take values only in $S_i$
  - Could be done by encoding to SAT
- Example: integer linear arithmetic (QF_LIA)
38 Solving QF_LIA is NP-complete
- In NP
  - If a satisfying solution exists, then one exists within a bound $d$
  - $\log d$ is polynomial in input size
- Expression for $d$ [Papadimitriou, 82]
  - $d = (n + m) \cdot (b_{\max} + 1) \cdot (m \cdot a_{\max})^{2m+3}$
- Input size
  - $m$: number of constraints
  - $n$: number of variables
  - $b_{\max}$: largest constant (absolute value)
  - $a_{\max}$: largest coefficient (absolute value)
39 Small-Domain Encoding / Finite Instantiation: Naïve Approach
- Steps
  - Calculate the solution bound $d$
  - Encode each integer variable with $\lceil \log d \rceil$ bits and translate to a Boolean formula
  - Run SAT solver
- Problem: for QF_LIA, $d$ is $\Omega(m^m)$
  - $\Omega(m \log m)$ bits per variable
- Solution: exploit special cases and domain-specific structure
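To see how quickly the naive encoding grows, the following sketch evaluates the bound and the resulting bits per variable. It assumes the Papadimitriou bound reads $d = (n + m)(b_{\max} + 1)(m \cdot a_{\max})^{2m+3}$; the function names and the sample parameter values are illustrative only.

```python
def papadimitriou_bound(n, m, a_max, b_max):
    """Solution bound d, assuming d = (n + m)(b_max + 1)(m * a_max)^(2m + 3)."""
    return (n + m) * (b_max + 1) * (m * a_max) ** (2 * m + 3)

def bits_needed(d):
    """ceil(log2(d)) for d >= 1, computed exactly on integers (at least 1 bit)."""
    return max(1, (d - 1).bit_length())

# Illustrative instance: 4 variables, 3 constraints, small coefficients.
d = papadimitriou_bound(n=4, m=3, a_max=2, b_max=3)
print(d, bits_needed(d))
```

Even for this tiny instance each variable needs 29 bits, and the exponent $2m + 3$ makes the width grow with every added constraint.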
40 Special Case 1: Equality Logic
- Linear constraints are equalities $x_i = x_j$
- Result: $d = n$
- Example: $x_1 \neq x_2 \wedge x_2 \neq x_3 \wedge x_1 \neq x_3$ needs a 3-valued domain, e.g. $\{1, 2, 3\}$
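The claim that $n$ values suffice, and that fewer may not, can be checked on the three-variable example by enumerating a finite domain. This is a minimal sketch; `satisfiable_over_domain` is a hypothetical helper that tries every assignment from $\{1, \dots, \text{domain\_size}\}$.

```python
from itertools import product

def satisfiable_over_domain(constraint, num_vars, domain_size):
    """Return the first assignment over {1..domain_size} satisfying constraint, or None."""
    domain = range(1, domain_size + 1)
    for values in product(domain, repeat=num_vars):
        if constraint(*values):
            return values
    return None

def all_distinct(x1, x2, x3):
    return x1 != x2 and x2 != x3 and x1 != x3

print(satisfiable_over_domain(all_distinct, 3, 2))  # two values are not enough
print(satisfiable_over_domain(all_distinct, 3, 3))  # three values suffice
```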
41 Special Case 2: Difference Logic
- Boolean combination of difference-bound constraints
  - $x_i \bowtie x_j + b$ and $x_i \bowtie b$, where $\bowtie$ is a comparison operator
- Result: $d = n \cdot (b_{\max} + 1)$ [Bryant, Lahiri, Seshia, CAV02]
- Proof sketch: a satisfying solution corresponds to shortest paths in the constraint graph
  - The longest such path has length at most $n \cdot (b_{\max} + 1)$
- Tighter formula-specific bounds possible
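The shortest-path view in the proof sketch can be made concrete for a conjunction of difference constraints. The sketch below is a standard Bellman-Ford check, not the small-domain encoding itself: each constraint $x_i - x_j \le c$ becomes an edge from $j$ to $i$ of weight $c$, a virtual source reaches every variable at distance 0, and a negative cycle means the conjunction is unsatisfiable. The tuple encoding `(i, j, c)` is an assumption of this example.

```python
def solve_difference_constraints(num_vars, constraints):
    """constraints: list of (i, j, c) meaning x_i - x_j <= c.

    Returns a list of integer values satisfying all constraints, or None.
    """
    dist = [0] * num_vars            # distance from a virtual source node
    for _ in range(num_vars + 1):
        changed = False
        for i, j, c in constraints:
            if dist[j] + c < dist[i]:
                dist[i] = dist[j] + c
                changed = True
        if not changed:
            return dist              # shortest-path distances form a solution
    return None                      # still relaxing: negative cycle

# x0 - x1 <= 2 and x1 - x0 <= -1
feasible = [(0, 1, 2), (1, 0, -1)]
solution = solve_difference_constraints(2, feasible)

# x0 < x1 < x2 < x0 written as three constraints with bound -1
cyclic = [(0, 1, -1), (1, 2, -1), (2, 0, -1)]
print(solution, solve_difference_constraints(3, cyclic))
```

The returned values are bounded by the total weight along a simple path, which is the intuition behind a bound of the form $n \cdot (b_{\max} + 1)$.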
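42 Special Case 3: Generalized 2SAT
- Generalized 2SAT constraints
  - $x_i + x_j \bowtie b$, $-x_i - x_j \bowtie b$, $x_i - x_j \bowtie b$, $x_i \bowtie b$, where $\bowtie$ is a comparison operator
- Result: $d = 2 \cdot n \cdot (b_{\max} + 1)$ [Seshia, Subramani, Bryant, 04]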
43 Full Integer Linear Arithmetic
- Can we avoid the $m^m$ blow-up?
- In fact, yes. The idea is to derive a new parameterized solution bound $d$
  - Formalize parameters that the bound really depends on
  - Parameters characterize sparse structure
  - Occurs especially in software verification; also in many high-level hardware models
  - [Seshia & Bryant, LICS04, LMCS05]
44 Structure of Linear Constraints in Software Verification
- Characteristics of studied benchmarks
  - Mostly difference constraints
    - Only 3% of constraints were NOT difference constraints
  - Non-difference constraints are sparse
    - At most 6 variables per constraint (total number of variables in 1000s)
- Some similar observations: [Pratt77], [ESC/Java-Simplify-TR03]
45 Parameterized Solution Bound
- New parameters
  - $k$: number of non-difference constraints
  - $w$: number of variables per constraint (width)
- Other parameters
  - $m$: number of constraints
  - $n$: number of variables
  - $b_{\max}$: max constant
  - $a_{\max}$: max coefficient
46 Example
- $m$ (constraints) = 3
- $k$ (non-difference constraints) = 1
- $n$ (variables) = 4
- $w$ (width) = 3
- $b_{\max}$ (max constant) = 3
- $a_{\max}$ (max coefficient) = 2
47 Summary of $d$ Values
Logic | Solution bound $d$
Equality logic | $n$
Difference logic | $n \cdot (b_{\max} + 1)$
Generalized 2SAT logic | $2 \cdot n \cdot (b_{\max} + 1)$
Full integer linear arithmetic | $n \cdot (b_{\max} + 1) \cdot (a_{\max}^k \cdot w^k)$
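The table rows can be compared numerically. The sketch below transcribes each row as a function and evaluates it on one illustrative instance ($n = 4$, $b_{\max} = 3$, $a_{\max} = 2$, $k = 1$, $w = 3$). The reading of the last row as $n (b_{\max} + 1)(a_{\max}^k w^k)$ is an assumption about the garbled slide text, and all function names are hypothetical.

```python
def bound_equality(n):
    return n

def bound_difference(n, b_max):
    return n * (b_max + 1)

def bound_generalized_2sat(n, b_max):
    return 2 * n * (b_max + 1)

def bound_parameterized_lia(n, b_max, a_max, k, w):
    # Assumed reading of the table row for full integer linear arithmetic.
    return n * (b_max + 1) * (a_max ** k) * (w ** k)

def bits_needed(d):
    """ceil(log2(d)) for d >= 1, computed exactly on integers (at least 1 bit)."""
    return max(1, (d - 1).bit_length())

n, b_max, a_max, k, w = 4, 3, 2, 1, 3
bounds = {
    "equality": bound_equality(n),
    "difference": bound_difference(n, b_max),
    "generalized 2SAT": bound_generalized_2sat(n, b_max),
    "full LIA (parameterized)": bound_parameterized_lia(n, b_max, a_max, k, w),
}
for logic, d in bounds.items():
    print(f"{logic}: d = {d}, bits per variable = {bits_needed(d)}")
```

With few non-difference constraints ($k$ small), the parameterized bound stays within a handful of bits per variable.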
48 Abstraction-Based Methods
- For some logics, one cannot easily compute a closed-form expression for the small domain
  - Example: bit-vector arithmetic
- In such cases, an abstraction-refinement approach can be used to compute formula-specific small domains
49 Bit-Vector Arithmetic: Some History
- B.C. (Before Chaff)
  - String operations (concatenate, field extraction)
  - Linear arithmetic with bounds checking
  - Modular arithmetic
- SAT-based bit blasting
  - Generate Boolean circuit based on bit-level behavior of operations
  - Handles arbitrary operations
  - Check with best available SAT solver
  - Effective in many applications
    - CBMC [Clarke, Kroening, Lerda, TACAS 04]
    - Microsoft Cogent + SLAM [Cook, Kroening, Sharygina, CAV 05]
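Bit blasting replaces each word-level operation by a circuit over individual bits. As a rough illustration, the sketch below builds a ripple-carry adder from Boolean operations only and evaluates it on concrete bits; a real bit blaster would instead emit the same gates as clauses for a SAT solver. The helper names are hypothetical.

```python
def to_bits(value, width):
    """Little-endian list of booleans for the low `width` bits of value."""
    return [bool((value >> i) & 1) for i in range(width)]

def from_bits(bits):
    return sum(1 << i for i, bit in enumerate(bits) if bit)

def blast_add(a_bits, b_bits):
    """Ripple-carry addition using only Boolean gates; wraps modulo 2**width."""
    carry = False
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)                    # sum bit
        carry = (a and b) or (carry and (a ^ b))     # carry-out
    return out

WIDTH = 4
total = from_bits(blast_add(to_bits(11, WIDTH), to_bits(7, WIDTH)))
print(total)  # (11 + 7) mod 16
```

Multiplication and division blast into much larger circuits, which is one reason word-level simplification before blasting pays off.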
50 Research Challenge
- Is there a better way than bit blasting?
- Requirements
  - Provide same functionality as with bit blasting
  - Must support all bit-vector operators
  - Exploit word-level structure
  - Improve on performance of bit blasting
- Current approaches are based on two core ideas
  - Simplification: simplify input formula using word-level rewrite rules and solvers
  - Abstraction: can use automatic abstraction-refinement to solve simplified formula
51 Bit-Vector SMT Solvers, circa Spring 2009
- Current techniques with sample tools
  - Proof-based abstraction-refinement: UCLID [Bryant et al., TACAS 07]
  - Solver for linear modular arithmetic to simplify the formula: STP [Ganesh & Dill, CAV 07]
  - Automatic parameter tuning for SAT: Spear [Hutter et al., FMCAD 07]
  - Rewrites, underapproximation, efficient SAT engine: Boolector [Brummayer & Biere, TACAS 09]
  - Equality/constant propagation, logic optimization, special rules for non-linear ops: Beaver [Jha et al., CAV 09]
  - DPLL(T) framework, layered approach, rewriting: CVC3 [Barrett et al.], MathSAT [Bruttomesso et al.], Yices [Dutertre et al.], Z3 [de Moura et al.]
52 Abstraction-Refinement
- Deciding Bit-Vector Arithmetic with Abstraction [Bryant et al., TACAS 07, STTT 09]
  - Use bit blasting as core technique
  - Apply to simplified versions of formula: under- and over-approximations
  - Generate successive approximations until a solution is found or formula shown unsatisfiable
  - Inspired by McMillan & Amla's proof-based abstraction for finite-state model checking
- Small motivating example
  - (x y $\neq$ y x) $\wedge$ (x y $\neq$ y x)
  - Sufficient to prove the left-hand conjunct unsat
53 Approximations to Formula
[Figure: original formula $\varphi$]
- Example approximation techniques
  - Underapproximating
    - Restrict word-level variables to smaller ranges of values
  - Overapproximating
    - Replace subformula with Boolean variable
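54 Starting Iterations
[Figure: underapproximation $\underline{\varphi}_1$ drawn below the original formula $\varphi$]
- Initial underapproximation $\underline{\varphi}_1$
  - (Greatly) restrict ranges of word-level variables
  - Intuition: a satisfiable formula often has a small-domain solution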
55 First Half of Iteration
- SAT result for the underapproximation $\underline{\varphi}_1$
  - Satisfiable: then we have found a solution for $\varphi$
  - Unsatisfiable: use the UNSAT proof to generate an overapproximation $\overline{\varphi}_1$
56 Second Half of Iteration
- SAT result for the overapproximation $\overline{\varphi}_1$
  - Unsatisfiable: then we have shown $\varphi$ unsatisfiable
  - Satisfiable: the solution indicates variable ranges that must be expanded
    - Generate refined underapproximation
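The underapproximation side of the loop can be sketched as follows. This is a deliberate simplification and not the published algorithm: variables are restricted to their low `width` bits, the restricted formula is checked by enumeration rather than by bit blasting, and the range is simply doubled on failure instead of being guided by an overapproximation built from an UNSAT proof. The formula `phi` is a hypothetical 8-bit example.

```python
from itertools import product

def solve_by_widening(formula, num_vars, full_width):
    """Search ever wider value ranges; return (solution, width) or (None, full_width)."""
    width = 1
    while True:
        # Underapproximation: every variable ranges over [0, 2**width).
        for values in product(range(1 << width), repeat=num_vars):
            if formula(*values):
                return values, width      # a solution here also solves the original
        if width >= full_width:
            return None, width            # full range exhausted: unsatisfiable
        width = min(2 * width, full_width)

MASK = 0xFF  # assumed 8-bit words

def phi(x, y):
    # Hypothetical formula: x = y + 2 and x*x > y*y, in 8-bit modular arithmetic.
    return x == ((y + 2) & MASK) and ((x * x) & MASK) > ((y * y) & MASK)

solution, width_used = solve_by_widening(phi, 2, 8)
print(solution, width_used)
```

The solution is found with 2-bit ranges, matching the intuition that satisfiable formulas often have small-domain solutions; termination follows because the range can only widen up to the full word width.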
57 Example
- $\overline{\varphi}_1$: (x y2)
- $\varphi$: (x y2) $\wedge$ (x2 > y2)
- $\underline{\varphi}_2$: (x2 y22) $\wedge$ (x22 > y22)
- $\underline{\varphi}_1$: (x1 y12) $\wedge$ (x12 > y12)
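58 Iterative Behavior
- Underapproximations
  - Successively more precise abstractions of $\varphi$
  - Allow wider variable ranges
- Overapproximations
  - No predictable relation
  - UNSAT proof not unique
[Figure: overapproximations $\overline{\varphi}_1, \overline{\varphi}_2, \dots, \overline{\varphi}_k$ drawn above $\varphi$; underapproximations $\underline{\varphi}_1, \underline{\varphi}_2, \dots, \underline{\varphi}_k$ drawn below]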
59 Overall Effect
- Soundness
  - Only terminate with solution on underapproximation
  - Only terminate as UNSAT on overapproximation
- Completeness
  - Successive underapproximations approach $\varphi$
  - Finite variable ranges guarantee termination
    - In worst case, get $\underline{\varphi}_k \equiv \varphi$
60 Roadmap for this Tutorial
- Background and Notation
- Survey of Theories
- Theory Solvers
- Approaches to SMT Solving
- Lazy Encoding to SAT
- Eager Encoding to SAT
- Conclusion
61 Summary of Ideas: Modeling
- Philosophy: model systems in first-order logic with suitable theories
- Widely-used theories
  - Equality and uninterpreted functions
  - Linear arithmetic
  - Bit-vector arithmetic
  - Arrays
62 Summary of Ideas: Lazy Methods
- Philosophy: extend DPLL framework from SAT to SMT
  - Literals assigned by SAT are sent to theory solver
  - Theory solver determines if literals are satisfiable in the theory
- Key optimizations: small explanations, early conflict detection, theory propagation
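The division of labor between the SAT side and the theory solver can be sketched in a few lines. This is a toy under stated assumptions, not DPLL(T): plain enumeration of truth assignments stands in for the SAT solver, the theory is integer difference logic checked with Bellman-Ford, and none of the key optimizations (explanations, early conflict detection, theory propagation) is modeled. The atom encoding `(i, j, c)` for $x_i - x_j \le c$ is hypothetical.

```python
from itertools import product

def theory_consistent(num_vars, literals):
    """Difference-logic theory solver: literals are (i, j, c) for x_i - x_j <= c."""
    dist = [0] * num_vars
    for _ in range(num_vars + 1):
        changed = False
        for i, j, c in literals:
            if dist[j] + c < dist[i]:
                dist[i] = dist[j] + c
                changed = True
        if not changed:
            return dist        # a model of the conjunction
    return None                # negative cycle: theory conflict

def lazy_smt(num_vars, atoms, skeleton):
    """Enumerate Boolean assignments to the atoms, then ask the theory solver."""
    for truth in product([True, False], repeat=len(atoms)):
        if not skeleton(*truth):
            continue           # Boolean structure already violated
        literals = []
        for (i, j, c), value in zip(atoms, truth):
            # Over the integers, not (x_i - x_j <= c) is x_j - x_i <= -c - 1.
            literals.append((i, j, c) if value else (j, i, -c - 1))
        model = theory_consistent(num_vars, literals)
        if model is not None:
            return model
    return None

# Atoms: a is x0 - x1 <= -1 (x0 < x1), b is x1 - x0 <= -1 (x1 < x0).
atoms = [(0, 1, -1), (1, 0, -1)]
print(lazy_smt(2, atoms, lambda a, b: a or b))    # satisfiable
print(lazy_smt(2, atoms, lambda a, b: a and b))   # unsatisfiable
```

The first Boolean assignment tried for `a or b` sets both atoms true and is rejected by the theory solver, which is the kind of conflict a real lazy solver would turn into a learned clause.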
63 Summary of Ideas: Eager Methods
- Philosophy: constrain solution space with logic-specific methods
- Small-domain encoding
  - Compute bounds that work for any formula in the logic
- Abstraction-refinement of domains
  - Compute formula-specific small domains
- Rewrite rules: high level and bit level
  - Simplify formula before and after bit-blasting
64 Challenges and Opportunities
- Solvers for new theories
  - Strings
  - Non-linear arithmetic
  - Can we exploit domain-specific structure?
- Parallel SMT
- Better support for quantifiers
- Better proof/interpolant generation
65 Join the SMT Community
- We need your new, exciting applications!
- Contribute to SMT-LIB
- Create new solvers, compete in SMT-COMP
Slides and book chapter available on our websites:
- Clark: http://cs.nyu.edu/barrett
- Sanjit: http://www.eecs.berkeley.edu/sseshia