Title: Automatic Verification of Software Systems
1 Automatic Verification of Software Systems
- Rupak Majumdar
- UC Berkeley
2 Our Goal
Specifying and Checking Properties of Programs
- Goals
  - defect detection
  - partial validation
- Properties
  - memory safety
  - temporal safety
  - security
- Many techniques
  - Symbolic model checking
  - Program analysis / verification
  - Theorem proving
- Many projects
  - Bandera, Blast, ESC-Java, FeaVer, JPF, LClint, PolyScope, PREfix, SLAM, TVLA, Verisoft, xgcc, ...
3
"This has been the Holy Grail of computer science for many decades, but now in some very key areas, for example, driver verification, we're building tools that can do actual proof about the software and how it works in order to guarantee the reliability."
Bill Gates, April 18, 2002, keynote address at WinHEC
4 Symbolic Model Checking
5 Our Methodology
- Model Building
  - capture the relevant aspects of the system and the property formally (in mathematics / logic)
- Model Checking
  - algorithms (i.e., software tools) for model analysis, rather than for model execution (simulation)
6 Our Approach
- Trajectory: dynamic evolution of state
- Model: generates a set of trajectories
- Property: assigns (boolean) values to trajectories
- Algorithm: computes values of the trajectories generated by a model
7 Example: Finite State
- Trajectory: dynamic evolution of state (here: sequence of states)
- Model: generates a set of trajectories (here: state-transition graph)
- Property: assigns values to trajectories (here: logical formula)
- Algorithm: computes values of the trajectories generated by a model (here: graph algorithms)
yellow and green alternate
8 Model Checking 101: Safety
- Graph search: keep searching successors until either
  - an error state is hit: report a bug!
  - no new successors are added: report safe
[Figure: the system's state space, with Init and the ERROR STATES marked]
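The search described on this slide can be sketched as a breadth-first traversal. This is a minimal illustration, not code from the talk: the function name `check_safety` and the four-state graph are hypothetical.

```python
from collections import deque

def check_safety(init, successors, is_error):
    """Breadth-first search of the reachable states.

    Returns ("bug", s) for the first error state s found, or ("safe", None)
    once no new successors can be added.
    """
    seen = set(init)
    queue = deque(init)
    while queue:
        state = queue.popleft()
        if is_error(state):
            return ("bug", state)
        for nxt in successors(state):
            if nxt not in seen:       # only new successors extend the search
                seen.add(nxt)
                queue.append(nxt)
    return ("safe", None)

# Hypothetical four-state system; state 3 is not reachable from state 0.
GRAPH = {0: [1], 1: [2], 2: [0], 3: [0]}

print(check_safety([0], GRAPH.__getitem__, lambda s: s == 3))  # ('safe', None)
print(check_safety([0], GRAPH.__getitem__, lambda s: s == 2))  # ('bug', 2)
```

The primitive operation here is access to a single state or transition, which is why this style only applies to finite-state systems.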
9 Model Checking: From Finite to Infinite
Graph-theoretic Algorithms
- primitive operation: access to a state or transition
- for finite-state systems
- complexity analysis
Symbolic Algorithms
- primitive operation: pre or post on a state set ("region")
- also for infinite-state systems
- termination analysis
10 Two (Parallel) Stories
- Symbolic Model Checking (Theory)
  - in the broad sense
  - abstract data type: regions
  - symbolic algorithms
  - termination analysis
- Software verification for C programs (Applications)
  - predicate abstraction
  - lazy refinement
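11 Symbolic Transition Systems
- $(S, Pre, Post, \Sigma)$
- Set of regions $R = \{R_1, R_2, \ldots\}$, $R_i \subseteq S$
- $\Sigma \subseteq R$
- $Pre, Post : R \to R$
- $\cup, \cap, \setminus : R \times R \to R$
- $\subseteq\ : R \times R \to \{T, F\}$
All operations are computable.
Symbolic semi-algorithm: start with the regions in $\Sigma$ and compute new regions using the operations above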
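A minimal sketch of such a semi-algorithm, assuming forward reachability and using Python `frozenset` values as a stand-in region representation. The loop touches regions only through post, union, intersection and the inclusion check; the names `symbolic_reach` and `post` and the modulo-8 counter are hypothetical. On an infinite-state system the same loop is not guaranteed to terminate.

```python
def symbolic_reach(init, bad, post):
    """Iterate R := R | post(R) from init, handling regions only as wholes."""
    reached = init
    while True:
        if reached & bad:              # intersection with the error region
            return "unsafe"
        grown = reached | post(reached)
        if grown <= reached:           # inclusion check: fixpoint reached
            return "safe"
        reached = grown

# Hypothetical system: a counter modulo 8 that always steps by 2.
def post(region):
    return frozenset((x + 2) % 8 for x in region)

print(symbolic_reach(frozenset({0}), frozenset({3}), post))  # safe
print(symbolic_reach(frozenset({0}), frozenset({4}), post))  # unsafe
```

Swapping the region representation (BDDs, formulas, polyhedra) leaves `symbolic_reach` unchanged, which is the point of treating regions as an abstract data type.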
12 Example: Software Implementations
- S: set of pairs (n, v), where n is a control-flow node and v is a valuation of all the program variables
- Regions: sets of states, written as first-order formulas over the program variables and control-flow nodes
  - The predicate language usually depends on the properties of interest
- Pre: weakest precondition operation
- Post: strongest postcondition operation
- Other examples: timed and hybrid automata (Th(Reals,)), FIFO automata (regular expressions), ...
13 Why Symbolic Algorithms?
- Efficient implementations in practice
  - Symbolic algorithms on BDDs outperform enumerative implementations
- Works for infinite-state systems as well as finite systems
  - timed and hybrid automata (HyTech)
  - software programs
- Provides very general termination results for model checking algorithms
- Generalizes to richer models of systems
  - Games, Markov decision processes
14 General Termination Results
Theorem: On the infinite-state system M:
the symbolic closure algorithm over Ops terminates
$\iff$ there is a finite E-equivalence
$\iff$ symbolic model checking for logic L terminates
- Examples
  - Ops = {pre, $\cap$, $\setminus$}, E = bisimilarity, L = CTL; M = timed automata
  - Ops = {pre, $\cap$}, E = similarity, L = ACTL, ECTL; M = 2D rectangular hybrid automata
  - Ops = {pre, $\cap\, a$}, E = trace equivalence, L = LTL; M = rectangular hybrid automata
15 Richer Models
- Multi-player Games
  - Compositional analysis
  - Symbolic games
    - cpre1, cpre2, boolean operations
  - Same symbolic algorithms
  - General termination results hold
- Stochastic Systems
  - Model uncertainty in system behavior by probabilistic transitions
  - Quantitative properties
  - Symbolic probabilistic systems
    - Regions: $S \to [0,1]$
    - Ppre, lattice operations
  - Same symbolic algorithms
16 Software Verification
17 The Needle Problem
18 Model Checking Software
- What is the model?
  - Given the system, must construct the model automatically
- General termination results do not hold
  - Model is not well-behaved
  - Symbolic iterations of wpre and spost do not, in general, terminate
  - If they do terminate, it takes a long time
- Insight: approximate analysis of systems (abstract interpretation) is usually good enough for many properties of interest
19 Model Checking: Abstraction
- Partition the state space
- Existentially lift the transition relation
20 Abstractions
- If there is an error path in the concrete system, there is an error path in the abstract system
- But there may be error paths in the abstraction that are not in the concrete system
- Problem: Who gives you a good abstraction?
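The two points above (existential lifting, and abstract error paths that have no concrete counterpart) can be seen on a toy system. This is a sketch under assumptions: the four concrete states, the partition into blocks A, B, C and the helper names `lift` and `reachable` are all hypothetical.

```python
def lift(transitions, block_of):
    """Existential lifting: block X steps to block Y if some state in X
    steps to some state in Y."""
    return {(block_of[s], block_of[t]) for s, t in transitions}

def reachable(start, edges):
    """All nodes reachable from start along the given edges."""
    seen, stack = set(start), list(start)
    while stack:
        u = stack.pop()
        for a, b in edges:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

TRANS = {(0, 1), (2, 3)}                    # concrete transitions; 3 is the error state
BLOCK = {0: "A", 1: "B", 2: "B", 3: "C"}    # partition: A={0}, B={1,2}, C={3}

concrete = reachable({0}, TRANS)
abstract = reachable({BLOCK[0]}, lift(TRANS, BLOCK))
print(3 in concrete)     # False: the concrete system is safe
print("C" in abstract)   # True: the abstraction has a spurious error path A, B, C
```

Merging states 1 and 2 into block B is what creates the spurious path, so splitting B would remove it.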
21 Model Checking: Abstraction
- Problem: abstraction too coarse
- Solution: refine the abstraction
[Figure: abstract state space with Init and the ERROR STATES marked]
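23 Abstract-Check-Refine Loop
[Figure: loop of three phases. Abstract, then Check ("Is model unsafe?"); an infeasible counterexample goes to Refine ("Why infeasible?"), which feeds back into Abstract]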
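One way to make the loop concrete is the following sketch on an explicit finite system. It is an illustration under assumptions, not the algorithm of any particular tool: abstractions are partitions of the state set, and refinement splits off the states that the failed counterexample actually reached. The names `find_path` and `cegar` are hypothetical.

```python
from collections import deque

def find_path(starts, edges, targets):
    """BFS over abstract blocks; returns a path (list of blocks) or None."""
    parent = {b: None for b in starts}
    queue = deque(starts)
    while queue:
        b = queue.popleft()
        if b in targets:
            path = []
            while b is not None:
                path.append(b)
                b = parent[b]
            return path[::-1]
        for u, v in edges:
            if u == b and v not in parent:
                parent[v] = b
                queue.append(v)
    return None

def cegar(states, trans, init, error):
    # Abstract: start with two blocks, error states (1) and the rest (0).
    block = {s: (1 if s in error else 0) for s in states}
    fresh = 2
    while True:
        edges = {(block[s], block[t]) for s, t in trans}
        # Check: search the abstract graph for an error path.
        path = find_path({block[s] for s in init}, edges,
                         {block[s] for s in error})
        if path is None:
            return "safe"
        # Replay the abstract path on the concrete system.
        cur = {s for s in init if block[s] == path[0]}
        for b in path[1:]:
            nxt = {t for s, t in trans if s in cur and block[t] == b}
            if not nxt:
                # Refine: the path is infeasible here; split off the
                # concretely reached states into a block of their own.
                for s in cur:
                    block[s] = fresh
                fresh += 1
                break
            cur = nxt
        else:
            return "unsafe"   # every step was replayed concretely

print(cegar(range(4), {(0, 1), (2, 3)}, {0}, {3}))           # safe
print(cegar(range(4), {(0, 1), (1, 2), (2, 3)}, {0}, {3}))   # unsafe
```

Each refinement strictly splits a block, so on a finite system the loop terminates; the slides that follow are about avoiding the cost of redoing the Abstract and Check phases from scratch on every iteration.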
24 Abstract Only Where Required
- Abstraction is very expensive
- Why abstract regions that are never visited?
- On-the-fly abstraction driven by the search
[Figure: state space with Init and the ERROR STATES marked]
25 Refine Only Where Required
- Why be precise everywhere?
- Don't refine error-free regions
[Figure: state space with Init, the ERROR STATES and an ERROR FREE region marked]
26 Refine Only Where Required
- Why be precise everywhere?
- Don't refine error-free regions
- Different precision for different regions
- Local refinement driven by the search
[Figure: state space with Init, the ERROR STATES and an ERROR FREE region marked]
27 Lazy Abstraction
[Figure: loop of three phases. Abstract locally, then Check the refined part ("Is model unsafe?"); an infeasible counterexample goes to Refine ("Why infeasible?")]
28 Benefits of Lazy Abstraction
- Abstract only where required
  - Reachable state space may be very sparse
  - Construct the abstraction on-the-fly
- Use greater precision only where required
  - Different precisions/abstractions for different regions
  - Refine locally
- Reuse work from earlier phases
  - Batch-oriented $\Rightarrow$ lose work from previous runs
  - Integrate the three phases
29 BLAST
- Berkeley Lazy Abstraction Software Verification Tool
- Input
  - API usage rules
  - client C source code, as is
- Analysis
  - create, explore and refine program abstractions
- Output
  - Error traces
  - Proofs of correctness
- Scales to tens of thousands of LOC
30 BLAST Internals
- Abstractions
  - Sets of predicates on program variables
- Model Checking
  - Explore the predicate abstraction
- Counterexample Analysis
  - Symbolic execution of program traces
- Refinement
  - Generate explanations of infeasibility
- Set operations, emptiness
  - Decision procedures
31 Example
Example() {
1:  do {
      lock();
      old = new;
2:    if (*) {            /* nondeterministic choice */
3:      unlock();
        new++;
      }
4:  } while (new != old);
5:  unlock();
    return;
}
Q: Is Error reachable?
32 Example: CFA
[Figure: control-flow automaton (CFA) of Example(), shown next to its source; the edge out of node 1 is labeled "lock(); old = new"]
33 Example: CFA
Q: Is Error reachable?
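34 Step 1: Search
[Figure: reachability tree starting at node 1 in region $LOCK = 0$, with edges labeled "lock(); old = new" and "unlock()"]
Set of predicates: $LOCK = 0$, $LOCK = 1$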
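35 Step 2: Analyze Counterexample
[Figure: the abstract error trace, with edges labeled "lock(); old = new", "[new == old]" and "unlock()", annotated with the regions $LOCK = 0 \wedge new = old$ and $LOCK = 0$]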
36 Step 2: Analyze Counterexample
$LOCK = 0 \wedge new + 1 = new$
$LOCK = 1 \wedge new + 1 = old$
$LOCK = 1 \wedge new + 1 = old$
$LOCK = 0 \wedge new = old$
$LOCK = 0$
Track the predicate $new = old$
$LOCK = 0$
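The regions on this slide can be reproduced by pushing the error condition backward along the trace. The sketch below makes several assumptions: states are triples (lock, new, old), regions are Python predicates, the trace takes the if-branch through node 3, and `lock()` / `unlock()` are modeled as guarded updates of the lock bit. Unsatisfiability is only sampled over a small range of integers here; a real tool would ask a decision procedure.

```python
from itertools import product

# A state is (lock, new, old); a region is a predicate on states.
def pre(guard, update, post):
    """States that pass `guard` and land in `post` after `update`."""
    return lambda s: guard(s) and post(update(s))

# Backward along the trace 1 -> 2 -> 3 -> 4 -> 5 -> error, starting from the
# error condition at node 5 (unlock() called while the lock is not held).
r5 = lambda s: s[0] == 0
r4 = pre(lambda s: s[1] == s[2], lambda s: s, r5)                  # [new == old]
r3 = pre(lambda s: s[0] == 1, lambda s: (0, s[1] + 1, s[2]), r4)   # unlock(); new++
r2 = pre(lambda s: True, lambda s: s, r3)                          # if (*) taken
r1 = pre(lambda s: s[0] == 0, lambda s: (1, s[1], s[1]), r2)       # lock(); old = new

SAMPLE = list(product((0, 1), range(-3, 4), range(-3, 4)))
print(any(r2(s) for s in SAMPLE))  # True:  LOCK = 1 and new + 1 = old is satisfiable
print(any(r1(s) for s in SAMPLE))  # False: LOCK = 0 and new + 1 = new is not
```

The region at node 1 is empty because `old = new` forces the two variables to agree while the rest of the trace needs them to differ by one, which is why the relation between `new` and `old` is the fact worth tracking.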
37 Step 3: Resume Search
Set of predicates: $LOCK = 0$, $LOCK = 1$, $new = old$
$LOCK = 0 \wedge new \neq old$
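To see the effect of the extra predicate, here is a small predicate-abstraction sketch of the example. It is an approximation under stated assumptions: the abstract successor relation is computed by enumerating concrete states over a small integer range rather than by calling a theorem prover, the program is the five-node loop with `if (*)` and `new++`, and the names `step`, `DOMAIN` and `error_reachable` are hypothetical.

```python
from itertools import product

ERR, END = "ERR", "END"

def step(loc, s):
    """Concrete successors of location loc in state s = (lock, new, old)."""
    lock, new, old = s
    if loc == 1:    # lock(); old = new   (error if the lock is already held)
        return [(ERR, s)] if lock == 1 else [(2, (1, new, new))]
    if loc == 2:    # if (*): both branches are possible
        return [(3, s), (4, s)]
    if loc == 3:    # unlock(); new++     (error if the lock is not held)
        return [(ERR, s)] if lock == 0 else [(4, (0, new + 1, old))]
    if loc == 4:    # loop test: while (new != old)
        return [(1, s)] if new != old else [(5, s)]
    if loc == 5:    # unlock(); return
        return [(ERR, s)] if lock == 0 else [(END, (0, new, old))]
    return []       # ERR and END have no successors

# Bounded sample of concrete states used to concretize abstract states.
DOMAIN = list(product((0, 1), range(-2, 3), range(-2, 3)))

def error_reachable(preds):
    """Reachability over abstract states (location, predicate truth values)."""
    alpha = lambda s: tuple(p(s) for p in preds)
    frontier = {(1, alpha(s)) for s in DOMAIN if s[0] == 0}   # start unlocked
    seen = set(frontier)
    while frontier:
        loc, a = frontier.pop()
        if loc == ERR:
            return True
        for s in DOMAIN:
            if alpha(s) != a:
                continue
            for loc2, s2 in step(loc, s):
                nxt = (loc2, alpha(s2))
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.add(nxt)
    return False

lock0 = lambda s: s[0] == 0
lock1 = lambda s: s[0] == 1
new_eq_old = lambda s: s[1] == s[2]

print(error_reachable([lock0, lock1]))              # True: spurious error path
print(error_reachable([lock0, lock1, new_eq_old]))  # False: error unreachable
```

With only the two lock predicates the abstraction cannot tell which way the loop test goes; once `new = old` is tracked, each abstract state at node 4 has a single feasible exit.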
38 Local Refinement
Example() {
0:  if (*) {
6:    do {
        got_lock = 0;
7:      if (*) {
8:        lock();
          got_lock++;
        }
9:      if (got_lock) {
10:       unlock();
        }
11:   } while (*);
    }
1:  do {
      lock();
      old = new;
2:    if (*) {
3:      unlock();
        new++;
      }
4:  } while (new != old);
5:  unlock();
    return;
}
[Figure: the two loops (nodes 6 to 11 and nodes 1 to 5) shown as separate subtrees below node 0]
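39 Local Refinement
- Search on left subtree not repeated
- Different abstractions for subtrees
- Refine right subtree only
40 Leaves Covered (Reuse Work)
Leaves covered: avoid repeating the search when paths merge
[Figure: reachability tree rooted at node 0; a leaf whose region ($LOCK = 0 \wedge$ ...) has already been explored is marked COVERED]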
41 Invariants Grow on Trees
Invariants:
1: $LOCK = 0 \vee (LOCK = 0 \wedge new \neq old)$
4: $(LOCK = 0 \wedge new \neq old) \vee (LOCK = 1 \wedge new = old)$
5: $LOCK = 1 \wedge new = old$
42 Proving the VC
- Each condition dischargeable automatically (Vampyre, CVC)
- Tree yields a small decomposition
- Entire proof can be extracted from the model checker's data structures
43 Implementation
44 BLAST
LAZY ABSTRACTION
45 Experiments
Program      Lines   Total Preds   Active Preds   Total Time (sec)   Pred. Disc. Time (sec)   Prf Size (bytes)
Linux Lock, 3 state
ide.c        18131   5             5              4.5                0.01                     253
qpmouse.c    23539   2             2              0.5                0.01                     179
aha152x.c    17736   2             2              20.93              0.03                     -
tlan.c       16506   5             4              428.63             403.33                   -
Windows DDK IRP, 22 state
cdaudio.c    17798   85            45             1398               540                      156787
floppy.c     17386   62            37             2086               1565                     -
  fixed      -       93            44             395                17                       60129
kbflter.c    12131   54            40             64                 5                        -
  -          -       48            35             256                165                      -
  fixed      -       37            34             10                 0.38                     7619
mouclass.c   17372   57            46             54                 3.34                     -
parport.c    61781   193           50             1980               519                      102967
46
[Figure: IRP state machine. States include start NP, start P, SKIP1, SKIP2, IPC, MPR1, MPR2, MPR3, NP, PPC and DC; edge labels include CallDriver, Skip, synch, Mark Pending, Complete request, prop completion, no prop completion, MPR completion, return Pending, return not Pend, return child status, not pending returned and IRP accessible]
47 Thread-Modular Assume-Guarantee for Concurrent Software
48 Multithreaded Programs
[Figure: Thread 1 $\parallel$ Thread 2 $\models \varphi$]
- Adapt the same algorithm but with additional heuristics.
- Need to consider all interleavings: state space explosion
  - Partial order reduction, ...
- Decompose the problem.
  - Divide-and-conquer!
49 Example
- Locking
1: lock()
2: x
3: unlock()
lock():   assume(m == 0); m = tid
unlock(): m = 0
Property: there is no race on x: $pc_1 \neq 2 \vee pc_2 \neq 2$
A program consists of an arbitrary number of threads executing the above code
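For a fixed number of threads the property can be checked by exploring all interleavings directly, which is the product construction that thread-modular reasoning tries to avoid. This sketch assumes that each thread loops back to location 1 after `unlock()` and that a thread blocks at `lock()` until `m == 0`; `successors` and `has_race` are hypothetical names, and the `use_lock` switch only exists to show what the lock buys.

```python
def successors(state, use_lock):
    """Interleaving successors of (pcs, m); pcs[i] is the location of thread i+1."""
    pcs, m = state
    for i, pc in enumerate(pcs):
        tid = i + 1
        if pc == 1:                       # lock(): assume(m == 0); m = tid
            if use_lock and m != 0:
                continue                  # blocked until the lock is free
            step = (2, tid if use_lock else m)
        elif pc == 2:                     # access to x
            step = (3, m)
        else:                             # unlock(): m = 0
            step = (1, 0 if use_lock else m)
        yield (pcs[:i] + (step[0],) + pcs[i + 1:], step[1])

def has_race(n_threads=2, use_lock=True):
    """True if some reachable state has two threads at location 2."""
    init = ((1,) * n_threads, 0)
    seen, stack = {init}, [init]
    while stack:
        state = stack.pop()
        if sum(pc == 2 for pc in state[0]) >= 2:
            return True
        for nxt in successors(state, use_lock):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

print(has_race(use_lock=True))   # False
print(has_race(use_lock=False))  # True
```

The number of product states grows exponentially with `n_threads`, and this check says nothing about an arbitrary number of threads.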
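50 Unsuccessful Proof Decomposition
[Figure: proof rule. Thread 1 $\parallel$ unconstrained $\models \varphi_1$, and unconstrained $\parallel$ Thread 2 $\models \varphi_2$, with $\varphi_1 \wedge \varphi_2 \Rightarrow \varphi$, conclude Thread 1 $\parallel$ Thread 2 $\models \varphi$]
51 Unsuccessful Proof Decomposition
[Figure: the same rule instantiated on the locking example, with each thread "lock(); x; unlock()" composed with an unconstrained environment and checked against $pc_1 \neq 2 \vee pc_2 \neq 2$]
No race: $pc_1 \neq 2 \vee pc_2 \neq 2$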
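52 Assume Guarantee Proof Decomposition
- Given $T_1$, $T_2$, find $G_1$, $G_2$ such that
  - $T_1 \parallel G_2 \models \varphi$
  - $G_1 \parallel T_2 \models \varphi$
  - $T_1 \parallel G_2 \subseteq G_1$
  - $G_1 \parallel T_2 \subseteq G_2$
- Then $T_1 \parallel T_2 \models \varphi$
[Figure: the proof rule drawn as a diagram, with premises for $\varphi_1$ and $\varphi_2$ and $\varphi_1 \wedge \varphi_2 \Rightarrow \varphi$]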
53 Propositional Validity?
$P_1' \wedge P_2 \Rightarrow P_2'$
$P_1 \wedge P_2' \Rightarrow P_1'$
$P_1 \wedge P_2 \Rightarrow P_1' \wedge P_2'$
Take $P_1, P_2$ true, and $P_1', P_2'$ false
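The question on this slide can be settled by a truth table. The sketch assumes the slide's formulas are the circular rule with two premises and one conclusion, where the second pair of propositions (written `q1`, `q2` below, standing for primed versions of P1 and P2) was lost in extraction; that reading is an assumption.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

# q1, q2 stand for the primed propositions P1', P2' (an assumed reading).
# Collect assignments that satisfy both premises but falsify the conclusion.
counterexamples = [
    (p1, p2, q1, q2)
    for p1, p2, q1, q2 in product((True, False), repeat=4)
    if implies(q1 and p2, q2)                 # P1' and P2  =>  P2'
    and implies(p1 and q2, q1)                # P1  and P2' =>  P1'
    and not implies(p1 and p2, q1 and q2)     # conclusion fails
]
print(counterexamples)  # [(True, True, False, False)]
```

The single counterexample is the assignment the slide names, so the rule is not valid as a purely propositional inference; its soundness for transition systems has to come from elsewhere (an induction over the length of executions is the usual argument).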
54 Assume Guarantee Decomposition
No race
$m' = m \vee (m = 0 \wedge m' = 1) \vee (m = 1 \wedge m' = 0)$
x is modified only when $m = 2$
No race
55 Thread Modular Abstraction Refinement
- Thread modular reasoning
  - Explore the state space of threads separately, without exploring the product
- Refine transition relations and environment actions
  - Refine transitions from above
  - Refine environments from below
- Most locking protocols can be verified using thread modular reasoning!
- Extended Blast to check for concurrent safety conditions (and race conditions)
56 Conclusions
57 Techniques
- Model checking
  - symbolic model checking
  - predicate abstraction
  - counterexample-driven refinement
  - assume guarantee reasoning
- Program analysis
  - abstract interpretation
  - points-to analysis
  - data flow analysis
  - slicing optimizations
- Automated deduction
  - weakest preconditions
  - theorem proving
- Proof generation
  - PCC
- Software
  - CIL Infrastructure
  - Pointer Analysis
  - CU BDD
  - Simplify, CVC, Vampyre
  - OCaml
58 Conclusion
- Separate local (symbolic operators) from global algorithms
- Adjust precision of analysis locally, and on the fly
- Use compositional reasoning to alleviate state space exploration
59 BLAST
www.eecs.berkeley.edu/tah/blast/
www.eecs.berkeley.edu/rupak
This is joint work with Tom Henzinger, Ranjit Jhala, Greg Sutre, and Shaz Qadeer