Title: Software Model Checking
1. Software Model Checking
Rajeev Alur University of Pennsylvania
University of Edinburgh, July 2008
2. Systems Software
  do {
    KeAcquireSpinLock();
    nPacketsOld = nPackets;
    if (request) {
      request = request->Next;
      KeReleaseSpinLock();
      nPackets++;
    }
  } while (nPackets != nPacketsOld);
  KeReleaseSpinLock();
- Can Microsoft Windows version X be bug-free?
- Millions of lines of code
- Types of bugs that cause crashes are well known
- Enormous effort spent on debugging/testing code
- Certifying third-party code (e.g. device drivers)
Do lock operations, acquire and release,
strictly alternate on every program execution?
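The alternation question above can be read as a two-state monitor over the sequence of lock operations along one execution. A minimal sketch in Python, assuming a trace is given as a list of the strings "acquire" and "release" (this trace encoding and the name `locks_alternate` are illustrative assumptions, not part of the slides):

```python
def locks_alternate(trace):
    """Return True if acquire/release strictly alternate, starting with acquire."""
    held = False  # monitor state: is the lock currently held?
    for op in trace:
        if op == "acquire":
            if held:           # acquire while already holding the lock
                return False
            held = True
        elif op == "release":
            if not held:       # release without a matching acquire
                return False
            held = False
    return True


# One loop iteration that takes the `if (request)` branch, then a second
# iteration that skips it and leaves the loop.
good_path = ["acquire", "release", "acquire", "release"]
# A path that takes the branch and then leaves the loop immediately would
# release twice; a verifier has to show that such a path is infeasible.
bad_path = ["acquire", "release", "release"]
```

A model checker does not test individual traces like this; it explores all paths of the program (or of an abstraction of it) together with the monitor.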
3. Concurrency Libraries: Exploiting concurrency efficiently and correctly
  dequeue(queue_t *queue, value_t *pvalue)
  {
    node_t *head;
    node_t *tail;
    node_t *next;
    while (true) {
      head = queue->head;
      tail = queue->tail;
      next = head->next;
      if (head == queue->head) {
        if (head == tail) {
          if (next == 0)
            return false;
          cas(&queue->tail, tail, next);
        } else {
          *pvalue = next->value;
          if (cas(&queue->head, head, next))
            break;
        }
      }
    }
    return true;  /* dequeue succeeded; node reclamation not shown */
  }
Shared Memory
Can the code deadlock? Is sequential semantics
of a queue preserved? (Sequential consistency)
4. Security Checks for Java Applets
https://java.sun.com/javame/
  public Vector<String> phoneBook;
  public String number;
  public int selected;

  public void sendEvent() {
    phoneBook = getPhoneBook();
    selected = chooseReceiver();
    number = phoneBook.elementAt(selected);
    if ((number == null) || (number.equals(""))) {
      // output error
    } else {
      String message = inputMessage();
      sendMessage(number, message);
    }
  }
EventSharingMidlet from J2ME
How to certify applications for data integrity /
confidentiality? By listening to messages, can
one infer whether a particular entry is in the
address book?
5. In Search of the Holy Grail
[Diagram: a Verifier takes software/model and a correctness specification as input, and answers yes/proof or no/bug]
- Correctness is formalized as a mathematical claim to be proved or falsified rigorously
  - always with respect to the given specification
- Challenge: impossibility results for automated verifiers
  - Verification problem is undecidable (Turing 1936)
  - Even approximate versions are computationally intractable (model checking is PSPACE-hard)
6. 1970s: Proof calculi for program correctness
Key to proof: finding suitable loop invariants
Program annotated with loop invariants:
  BubbleSort (A : array[1..n] of int) {
    B = A : array[1..n] of int;
    for (i = 0; i < n; i++) {
      { Permute(A,B), Sorted(B[n-i,n]),
        for 0<k<n-i-1 and n-i<k'<n: B[k]<B[k'] }
      for (j = 0; j < n-i; j++) {
        { Permute(A,B), Sorted(B[n-i,n]),
          for 0<k<n-i-1 and n-i<k'<n: B[k]<B[k'],
          for 0<k<j: B[k] < B[j] }
        if (B[j] > B[j+1]) swap(B,j,j+1);
      }
    }
    return B;
  }
Program without annotations:
  BubbleSort (A : array[1..n] of int) {
    B = A : array[1..n] of int;
    for (i = 0; i < n; i++) {
      for (j = 0; j < n-i; j++) {
        if (B[j] > B[j+1]) swap(B,j,j+1);
      }
    }
    return B;
  }
7. Deductive Program Verification
- Powerful mathematical logic (e.g. first-order logic, higher-order logics) needed for formalization
- Great progress in decision procedures
- Finding proof decomposition requires expertise, but modern tools support many built-in proof tactics
- Contemporary theorem provers: HOL, PVS, ACL2, ESC-Java, Boogie
- In practice
  - User partially annotates the program with invariants, and the tool infers remaining invariants needed to complete the proof
  - Checks are modular (per function)
  - Success story: Windows developers must add enough annotations to be able to prove absence of buffer overflow errors
8. 1980s: Finite-state Protocol Analysis
- Automated analysis of finite-state protocols with respect to temporal logic specifications
  - Network protocols, distributed algorithms
Specs: Is there a deadlock? Does every req get ack? Does a buffer overflow?
Tools: SPIN, Murphi, CADP
9. Battling State-space Explosion
- Analysis is basically a reachability problem in a HUGE graph
  - Size of graph grows exponentially with the number of bits required for state encoding
  - Graph is constructed only incrementally, on-the-fly
- Many techniques for exploiting structure: symmetry, data independence, hashing, partial order reduction
- Great flexibility in modeling: scale down parameters (buffer size, number of network nodes)
[Figure labels: State, Transition, Bad states]
10. 1990s: Symbolic Model Checking
- Constraint-based analysis of Boolean systems
  - Symbolic Boolean representations (propositional formulas, OBDDs) used to encode system dynamics
- Success in finding high-quality bugs in hardware applications (VHDL/Verilog code)
Deadlock found in cache coherency protocol Gigamax by model checker SMV
[Figure labels: Global bus, UIC (x3), Cluster bus, P, M]
Read-shared/read-owned/write-invalid/write-shared/
11. Symbolic Reachability Problem
- Model variables $X = \{x_1, \ldots, x_n\}$
  - Each var is of finite type, say, boolean
- Initialization $I(X)$: a formula over $X$, e.g. $(x_1 \land \lnot x_2)$
- Update $T(X, X')$
  - How new vars $X'$ are related to old vars $X$ as a result of executing one step of the program
  - Disjunction of clauses obtained by compiling individual instructions, e.g. $(x_1 \land x_1' = x_1 \land x_2' = \lnot x_2 \land x_3' = x_3)$
- Target set $F(X)$, e.g. $(x_2 \land x_3)$
- Computational problem
  - Can $F$ be satisfied starting with $I$ by repeatedly applying $T$?
- K-step reachability reduces to propositional satisfiability (SAT): Bounded Model Checking
  - $I(X_0) \land T(X_0,X_1) \land T(X_1,X_2) \land \cdots \land T(X_{k-1},X_k) \land F(X_k)$
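The k-step unrolling can be illustrated on a toy system. The sketch below is in Python and uses a hypothetical 2-bit counter (the predicates `I`, `T`, `F` here are illustrative assumptions, not the example formulas from the slide). It checks the unrolled formula by brute-force enumeration of all assignments to X0..Xk; an actual bounded model checker would hand the same formula to a SAT solver instead.

```python
from itertools import product


def I(s):
    """Initial states: both bits are 0."""
    x1, x2 = s
    return (not x1) and (not x2)


def T(s, t):
    """One step: increment the counter (x2 is the low bit, x1 the high bit)."""
    x1, x2 = s
    return t == (x1 != x2, not x2)


def F(s):
    """Target states: both bits are 1."""
    x1, x2 = s
    return x1 and x2


def bmc(k):
    """Return a path X0..Xk satisfying I(X0), T(Xi, Xi+1) for all i, and F(Xk), or None."""
    states = list(product([False, True], repeat=2))
    for path in product(states, repeat=k + 1):
        steps_ok = all(T(path[i], path[i + 1]) for i in range(k))
        if I(path[0]) and steps_ok and F(path[k]):
            return path
    return None


# Increase the bound until a witness is found.
first_k = next(k for k in range(10) if bmc(k) is not None)
```

Here the target is first reached with bound 3, and the returned path is the counterexample trace a user would inspect.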
12. The Story of SAT
- Propositional Satisfiability: Given a formula over Boolean variables, is there an assignment of 0/1s to vars which makes the formula true?
- Canonical NP-hard problem (Cook 1971)
- Enormous progress in tools that can solve instances with 1000s of variables and millions of clauses
Timeline (year, solver, approximate instance size):
  1952  Quine      ≈ 10 var
  1960  DP         ≈ 10 var
  1962  DLL        ≈ 10 var
  1986  BDDs       ≈ 100 var
  1988  SOCRATES   ≈ 3k var
  1992  GSAT       ≈ 300 var
  1994  Hannibal   ≈ 3k var
  1996  GRASP      ≈ 1k var
  1996  SATO       ≈ 1k var
  1996  Stålmarck  ≈ 1000 var
  2001  Chaff      ≈ 10k var
  2002  Berkmin    ≈ 10k var
Source: Malik 2004
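To make the satisfiability problem concrete, here is a minimal sketch of a backtracking search with unit propagation, in the spirit of the DLL entry on the timeline. It is written in Python and assumes clauses are lists of nonzero integers, where `-v` stands for the negation of variable `v` (this encoding and the name `dpll` are illustrative choices). It omits the clause learning and heuristics that make later solvers such as GRASP and Chaff scale.

```python
def dpll(clauses, assignment=None):
    """Return a satisfying (possibly partial) assignment {var: bool}, or None if unsatisfiable."""
    if assignment is None:
        assignment = {}

    # Simplify every clause under the current assignment.
    simplified = []
    for clause in clauses:
        remaining = []
        satisfied = False
        for lit in clause:
            var, wanted = abs(lit), lit > 0
            if var in assignment:
                if assignment[var] == wanted:
                    satisfied = True
                    break
            else:
                remaining.append(lit)
        if satisfied:
            continue
        if not remaining:
            return None          # every literal is false: conflict
        simplified.append(remaining)

    if not simplified:
        return assignment        # all clauses are satisfied

    # Unit propagation: a one-literal clause forces that literal.
    for clause in simplified:
        if len(clause) == 1:
            lit = clause[0]
            return dpll(simplified, {**assignment, abs(lit): lit > 0})

    # Otherwise branch on the first unassigned literal, trying both values.
    lit = simplified[0][0]
    for value in (lit > 0, not (lit > 0)):
        result = dpll(simplified, {**assignment, abs(lit): value})
        if result is not None:
            return result
    return None


# (x1 or x2) and (not x1 or x2) and (not x2 or x3)
formula = [[1, 2], [-1, 2], [-2, 3]]
model = dpll(formula)
```

Variables missing from the returned assignment can take either value.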
13. 2000s: Model Checking of C code
- Phase 1: Given a program P, build an abstract finite-state (Boolean) model A such that the set of behaviors of P is a subset of those of A (conservative abstraction)
- Phase 2: Model check A wrt the specification; this can prove P to be correct, or reveal a bug in P, or suggest inadequacy of A
- Shown to be effective on Windows device drivers in Microsoft Research project SLAM
  do {
    KeAcquireSpinLock();
    nPacketsOld = nPackets;
    if (request) {
      request = request->Next;
      KeReleaseSpinLock();
      nPackets++;
    }
  } while (nPackets != nPacketsOld);
  KeReleaseSpinLock();
Do lock operations, acquire and release,
strictly alternate on every program execution?
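14. Program Abstraction
Concrete program:
  int x, y;
  if (x > 0) { ... y = x + 1; ... }
  else       { ... y = x + 1; ... }
Abstract program:
  bool bx, by;
  if (bx) { ... by = true; ... }
  else    { ... by = {true, false}; ... }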
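One way to see where the abstract assignments come from is to compute, for each abstract state, which abstract values the concrete statement can produce. The Python sketch below assumes the Boolean variables track the predicates bx = (x > 0) and by = (y > 0) and that the concrete statement is y = x + 1; both are assumptions read off the garbled slide rather than confirmed by it. It samples a small range of integers purely for illustration, whereas tools such as SLAM discharge this question with a theorem prover.

```python
def abstract_post(bx, xs=range(-5, 6)):
    """Possible values of by = (y > 0) after y = x + 1, given bx = (x > 0)."""
    results = set()
    for x in xs:
        if (x > 0) == bx:       # concrete states consistent with the abstract state
            y = x + 1
            results.add(y > 0)
    return results


then_branch = abstract_post(True)    # x > 0 forces y > 0
else_branch = abstract_post(False)   # x <= 0 allows y = 1 (x = 0) or y <= 0
```

The else branch yields both values, which is why the abstract program needs a nondeterministic assignment there: the abstraction stays conservative by allowing every behavior the concrete program might have.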
15. Software Model Checking
- Tools for verifying source code combine many techniques
  - Program analysis techniques such as slicing, range analysis
  - Abstraction
  - Model checking
  - Refinement from counter-examples
- New challenges for model checking (beyond finite-state reachability analysis)
  - Recursion gives pushdown control
  - Pointers, dynamic creation of objects, inheritance
- A very active and emerging research area
  - Abstraction-based tools: SLAM, BLAST
  - Direct state encoding: F-SOFT, CBMC, CheckFence
16. Coming Up
- CheckFence Project at Penn
  - Concurrent Executions on Relaxed Memory Models
  - Analysis tool for Concurrent Data Types
  - Joint work with Sebastian Burckhardt and Milo Martin
- Not covered: How to check that a Java midlet does not leak user-specified secrets (ongoing work with Pavol Cerny)
17. Challenge: Exploiting Concurrency, Correctly
18. Concurrency on Multiprocessors
Initially x = y = 0
thread 1: x = 1; y = 1
thread 2: r1 = y; r2 = x
Standard interleavings:
  x = 1; y = 1; r1 = y; r2 = x    gives r1 = r2 = 1
  x = 1; r1 = y; y = 1; r2 = x    gives r1 = 0, r2 = 1
  x = 1; r1 = y; r2 = x; y = 1    gives r1 = 0, r2 = 1
  r1 = y; x = 1; y = 1; r2 = x    gives r1 = 0, r2 = 1
  r1 = y; x = 1; r2 = x; y = 1    gives r1 = 0, r2 = 1
  r1 = y; r2 = x; x = 1; y = 1    gives r1 = r2 = 0
Can we conclude that if r1 = 1 then r2 must be 1?
No! On real multiprocessors, it is possible to have
r1 = 1 and r2 = 0
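The six standard interleavings can be enumerated mechanically. The following Python sketch does so for the two threads above and collects the resulting (r1, r2) pairs; the tuple encoding of instructions and the helper names are illustrative assumptions.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute one interleaving against a single shared memory."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for kind, target, source in schedule:
        if kind == "store":
            mem[target] = source          # source is the constant being stored
        else:
            regs[target] = mem[source]    # load from address `source` into a register
    return regs["r1"], regs["r2"]


thread1 = [("store", "x", 1), ("store", "y", 1)]
thread2 = [("load", "r1", "y"), ("load", "r2", "x")]

schedules = list(interleavings(thread1, thread2))
outcomes = {run(s) for s in schedules}
```

Under this interleaving semantics the outcome r1 = 1, r2 = 0 never appears, which is exactly the guarantee that real multiprocessors with weaker memory models do not give.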
19. Architectures with Weak Memory Models
- A modern multiprocessor does not enforce global ordering of all instructions for performance reasons
- Lamport (1979): Sequential consistency semantics for correctness of multiprocessor shared memory (like interleaving)
- Considered too limiting, and many relaxations proposed
  - In theory: TSO, RMO, Relaxed
  - In practice: Alpha, Intel IA-32, IBM 370, Sun SPARC, PowerPC
[Figure labels: cache, Main Memory]
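20. Concurrency in Theory: CCS (1978); Concurrency in Practice: Intel (2007)
CCS syntax:
  $P ::= \epsilon \mid a.P \mid P + P \mid P \parallel P \mid P \backslash a$
CCS operational semantics (sample rules):
  $a.P \xrightarrow{a} P$
  $\dfrac{P \xrightarrow{a} P'}{P \parallel Q \xrightarrow{a} P' \parallel Q}$
  $\dfrac{P \xrightarrow{a} P'}{Q \parallel P \xrightarrow{a} Q \parallel P'}$
  $\dfrac{P \xrightarrow{a} P' \quad Q \xrightarrow{\bar{a}} Q'}{P \parallel Q \xrightarrow{\tau} P' \parallel Q'}$
Intel 64 memory ordering obeys the following principles:
  1. Loads are not reordered with other loads
  2. Stores are not reordered with other stores
  3. Stores are not reordered with older loads
  4. Loads may be reordered with older stores to different locations but not with older stores to the same location
  (plus 4 more rules and illustrative examples)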
21. Programming with Weak Memory Models
- Concurrent programming is already hard; shouldn't the effects of weaker models be hidden from the programmer?
- Mostly yes
  - Safe programming using extensive use of synchronization primitives
  - Use locks for every access to shared data
  - Compilers use memory fences to enforce ordering
- Not always
  - Non-blocking data structures
  - Highly optimized library code for concurrency
  - Code for lock/unlock instructions
  - OS code managing process queues etc.
22. Non-blocking Queue (MS96)
  boolean_t dequeue(queue_t *queue, value_t *pvalue)
  {
    node_t *head;
    node_t *tail;
    node_t *next;
    while (true) {
      head = queue->head;
      tail = queue->tail;
      next = head->next;
      if (head == queue->head) {
        if (head == tail) {
          if (next == 0)
            return false;
          cas(&queue->tail, (uint32) tail, (uint32) next);
        } else {
          *pvalue = next->value;
          if (cas(&queue->head, (uint32) head, (uint32) next))
            break;
        }
      }
    }
    return true;  /* dequeue succeeded; node reclamation not shown */
  }
Queue may be updated concurrently
[Figure labels: head, tail]
Atomic compare-and-swap for synchronization
23. Simple: usable by programmers
- Programs (multi-threaded)
Application level concurrency model
System-level code, concurrency libraries
Architecture-aware Concurrency Analysis
Architecture level concurrency model
Highly parallel hardware -- multicores, SoCs
Complex: efficient use of parallelism
24. Software Model Checking for Concurrent Code on Multiprocessors
- Why? Real bugs in real code
- Opportunities
  - 10s to 100s of lines of low-level library C code
  - Hard to design and verify -> buggy
  - Effects of weak memory models, fences
- Challenges
  - Lots of behaviors possible: high level of concurrency
  - How to formalize and reason about weak memory models?
25. Shared Memory Consistency Models
- Specifies restrictions on what values a read from shared memory can return
- Program order: $x <_p y$ if x and y are instructions belonging to the same thread and x appears before y
- Sequential Consistency (Lamport 79): a concurrent execution is correct if there exists a global order $<$ of all accesses such that
  - If $x <_p y$ then $x < y$
  - Each load returns the value of the most recent store, according to $<$, to the same location (or the initial value, if no such store exists)
- Clean abstraction for programmers, but high implementation cost
26. Effect of Memory Model
Initially flag1 = flag2 = 0
thread 1:
  1. flag1 = 1
  2. if (flag2 == 0) crit. sect.
thread 2:
  1. flag2 = 1
  2. if (flag1 == 0) crit. sect.
- Ensures mutual exclusion if architecture supports SC memory
- Most architectures do not enforce ordering of accesses to different memory locations
  - Does not ensure mutual exclusion under weaker models
- Ordering can be enforced using fence instructions
  - Insert MEMBAR between lines 1 and 2 to ensure mutual exclusion
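The effect of the memory model on this flag protocol can be checked by brute force. The Python sketch below enumerates all interleavings under sequential consistency, and then again under a deliberately crude relaxation in which each thread's load may be performed before its own store to the other location. Modelling the relaxation as a swap of the two instructions is a simplifying assumption for illustration, not how any specific architecture is specified.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def both_enter(schedule):
    """True if both threads read 0 from the other's flag and enter the critical section."""
    mem = {"flag1": 0, "flag2": 0}
    entered = set()
    for thread, kind, var in schedule:
        if kind == "store":
            mem[var] = 1
        elif mem[var] == 0:
            entered.add(thread)
    return entered == {1, 2}


t1 = [(1, "store", "flag1"), (1, "load", "flag2")]
t2 = [(2, "store", "flag2"), (2, "load", "flag1")]

# Sequential consistency: program order is kept inside each thread.
sc_violation = any(both_enter(s) for s in interleavings(t1, t2))

# Crude relaxation: each thread may also perform its load before its store.
relaxed_violation = any(
    both_enter(s)
    for p1 in (t1, t1[::-1])
    for p2 in (t2, t2[::-1])
    for s in interleavings(p1, p2)
)
```

A fence between lines 1 and 2 corresponds to removing the reversed orders from the enumeration, which brings back the sequentially consistent result.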
27. Weak Memory Models
- A large variety of models exist; a good starting point:
  - "Shared Memory Consistency Models: A Tutorial", IEEE Computer 96, Adve and Gharachorloo
- How to relax memory order requirement?
  - Operations of same thread to different locations need not be globally ordered
- How to relax write atomicity requirement?
  - Read may return value of a write not yet globally visible
- Uniprocessor semantics preserved
- Typically defined in architecture manuals (e.g. SPARC manual)
28. Which Memory Model should a Verifier use?
[Diagram relating the memory models SC, 390, TSO, PSO, RMO, IA-32, Alpha, and Relaxed]
29. Formalization of Relaxed
- Program order: $x <_p y$ if x and y are instructions belonging to the same thread and x appears before y
- A concurrent execution over a set $X$ of accesses is correct wrt Relaxed if there exists a total order $<$ over $X$ such that
  - If $x <_p y$, and both x and y are accesses to the same address, and y is a store, then $x < y$ must hold
  - For a load $l$ and a store $s$ visible to $l$, either $s$ and $l$ have the same value, or there exists another store $s'$ visible to $l$ with $s < s'$
- A store $s$ is visible to a load $l$ if they are to the same address and either $s < l$ or $s <_p l$
- Constraint-based specification that can be easily encoded in logical formulas
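The axioms above are simple enough to check by enumerating candidate total orders. The Python sketch below does this for the two-thread example with stores x = 1; y = 1 and loads r1 = y; r2 = x. It rests on one assumption that the slide does not spell out: initial values are modeled as stores by a hypothetical thread 0 that precede all other accesses in the memory order. The tuple layout and function names are illustrative as well.

```python
from itertools import permutations


def accesses(r1, r2):
    """Accesses as (name, thread, index in thread, kind, address, value)."""
    return [
        ("ix", 0, 0, "store", "x", 0),   # assumed initializing stores
        ("iy", 0, 1, "store", "y", 0),
        ("sx", 1, 0, "store", "x", 1),
        ("sy", 1, 1, "store", "y", 1),
        ("ly", 2, 0, "load", "y", r1),
        ("lx", 2, 1, "load", "x", r2),
    ]


def in_program_order(a, b):
    return a[1] == b[1] and a[2] < b[2]


def order_ok(a, b, pos, sc):
    """Ordering axiom: Relaxed only orders same-address pairs ending in a store."""
    if not in_program_order(a, b):
        return True
    must_order = sc or (a[4] == b[4] and b[3] == "store")
    return (not must_order) or pos[a[0]] < pos[b[0]]


def load_ok(load, acc, pos):
    """Value axiom: every visible store matches the load or is overwritten by a later visible store."""
    visible = [s for s in acc
               if s[3] == "store" and s[4] == load[4]
               and (pos[s[0]] < pos[load[0]] or in_program_order(s, load))]
    return all(s[5] == load[5] or any(pos[s[0]] < pos[t[0]] for t in visible)
               for s in visible)


def allowed(acc, sc=False):
    """True if some total order satisfies the axioms (sc=True adds full program order)."""
    inits = [a for a in acc if a[1] == 0]
    rest = [a for a in acc if a[1] != 0]
    for perm in permutations(rest):
        pos = {a[0]: i for i, a in enumerate(inits + list(perm))}
        if (all(order_ok(a, b, pos, sc) for a in acc for b in acc)
                and all(load_ok(l, acc, pos) for l in acc if l[3] == "load")):
            return True
    return False
```

With these definitions, `allowed(accesses(1, 0))` holds while `allowed(accesses(1, 0), sc=True)` does not, matching the observation that r1 = 1 with r2 = 0 is possible only under the weaker model. A SAT-based tool encodes the same axioms as constraints rather than enumerating orders.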
30. Pass: all executions of the test are observationally equivalent to a serial execution
Fail
Inconclusive: runs out of time or memory
31. How To Bound Executions
- Verify individual symbolic tests
- finite number of concurrent threads
- finite number of operations/thread
- nondeterministic input values
- Example
- User creates suite of tests of increasing size
32. Why Symbolic Test Programs?
- 1) Make everything finite
  - State is unbounded (dynamic memory allocation) ... is bounded for individual test
  - Checking sequential consistency is undecidable (AMP 96) ... is decidable for individual test
- 2) Gives us finite instruction sequence to work with
  - State space too large for interleaved system model ... can directly encode value flow between instructions
  - Memory model specified by axioms ... can directly encode ordering axioms on instructions
33. Tool Architecture
C code
Memory model
Symbolic Test
Symbolic test gives exponentially many
executions (symbolic inputs, dynamic memory
allocation, ordering of instructions). CheckFence
solves for incorrect executions.
34. Construct CNF formula whose solutions correspond precisely to the concurrent executions
[Diagram inputs: C code, Memory model, Symbolic Test]
- automatic, lazy loop unrolling
- automatic specification mining (enumerate correct observations)
35. Specification Mining
Possible operation-level interleavings:
  enqueue(X) enqueue(Y) dequeue() -> Z
  enqueue(X) dequeue() -> Z enqueue(Y)
  dequeue() -> Z enqueue(X) enqueue(Y)
For each interleaving, obtain a symbolic constraint by encoding the corresponding executions in the SAT solver.
Spec is the disjunction of all possibilities: $\mathit{Spec} \equiv (Z = X) \lor (Z = \mathit{null})$
To find bugs, check satisfiability of $\Phi \land \lnot \mathit{Spec}$, where $\Phi$ encodes all possible concurrent executions.
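The mining step can be mimicked with concrete values: run every operation-level interleaving of the test against a sequential reference queue and collect what the dequeue observes. The Python sketch below assumes a test with one thread performing enqueue(X); enqueue(Y) and another performing a single dequeue, uses the strings "X" and "Y" in place of symbolic inputs, and uses `None` for null. These encodings and the helper names are illustrative, not taken from CheckFence.

```python
from collections import deque


def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def serial_run(schedule):
    """Run operations one at a time on a sequential queue; return the dequeued values."""
    queue = deque()
    observed = []
    for op, arg in schedule:
        if op == "enqueue":
            queue.append(arg)
        else:
            observed.append(queue.popleft() if queue else None)
    return tuple(observed)


thread1 = [("enqueue", "X"), ("enqueue", "Y")]
thread2 = [("dequeue", None)]

# The mined specification: every observation some serial execution can produce.
spec = {serial_run(s) for s in interleavings(thread1, thread2)}


def is_bug(observation):
    """An observation of a concurrent execution is a bug if no serial run explains it."""
    return observation not in spec
```

The set `spec` contains exactly the observations Z = X and Z = null, so a concurrent execution that returns Y from the dequeue is reported as a violation.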
36. Encoding Memory Order
thread 1: s1: store; s2: store
thread 2: l1: load; l2: load
- Variables for encoding
  - Use boolean vars for relative order $(x < y)$ of memory accesses
  - Use bitvector variables $A_x$ and $D_x$ for address and data values associated with memory access $x$
- Encode constraints
  - encode transitivity of memory order
  - encode ordering axioms of the memory model
    Example (for SC): $(s_1 < s_2) \land (l_1 < l_2)$
  - encode value flow: loaded value must match last value stored to same address
    Example: value must flow from $s_1$ to $l_1$ under the following conditions:
    $\big((s_1 < l_1) \land (A_{s_1} = A_{l_1}) \land ((s_2 < s_1) \lor (l_1 < s_2) \lor (A_{s_2} \ne A_{l_1}))\big) \rightarrow (D_{s_1} = D_{l_1})$
37. Example: Memory Model Bug
[Figure label: head]
- Processor 1
  - links new node into list
Processor 1 reorders the stores! Memory accesses
happen in order 1 2 3
Adding a fence between lines on left side
prevents reordering
38. Algorithms Analyzed
Type      Description         LOC  Source
Queue     Two-lock queue       80  M. Michael and L. Scott (PODC 1996)
Queue     Non-blocking queue   98  M. Michael and L. Scott (PODC 1996)
Set       Lazy list-based set 141  Heller et al. (OPODIS 2005)
Set       Nonblocking list    174  T. Harris (DISC 2001)
Deque     snark algorithm     159  D. Detlefs et al. (DISC 2000)
LL/VL/SC  CAS-based            74  M. Moir (PODC 1997)
LL/VL/SC  Bounded Tags        198  M. Moir (PODC 1997)
39. Results
- snark algorithm has 2 known bugs
- lazy list-based set had an unknown bug (a missing initialization, missed by the formal correctness proof, CAV 2006, because of hand-translation of pseudocode)
40. Results
41. Typical Tool Performance
- Very efficient on small testcases (< 100 memory accesses)
  Example (nonblocking queue): T0 = i (e d), T1 = i (e e d d)
  - find counterexamples within a few seconds
  - verify within a few minutes
  - enough to cover all 9 fences in nonblocking queue
- Slows down with increasing number of memory accesses in test
  Example (snark deque): Dq = pop_l pop_l pop_r pop_r push_l push_l push_r push_r
  - has 134 memory accesses (77 loads, 57 stores)
  - Dq finds second snark bug within 1 hour
- Does not scale past 300 memory accesses
42. CheckFence Summary
- Software model checking of low-level concurrent software requires encoding of memory models
  - Challenge for model checking due to high level of concurrency and axiomatic specifications
  - Opportunity to find bugs in library code that's hard to design and verify
- CheckFence project at Penn
  - SAT-based bounded model checking for concurrent data types
  - Bugs in real code with fences
43. Ongoing Research
- What's the best way to verify C code (on relaxed memory models)?
  - SAT-based encoding seems suitable to capture specifications of memory models, but many opportunities for improvement
  - Can one develop abstract operational models for multiprocessor architectures?
- Proof methods for relaxed memory models
- Hardware support for transactional memory
  - Current interest in industry and architecture research
  - Can formal verification influence designs/standards?
44. Software Model Checker
[Diagram: a Software Model Checker takes software/model and a correctness specification as input, and answers yes/proof or no/bug]
- Impressive progress on an intractable problem
  - Device drivers
  - Concurrency libraries
  - Buffer overflows in OS
  - Network protocols
- Academic research with industrial impact
- Ingredients for success
  - SAT almost feasible
  - Logic, algorithms, tools
  - Focus on specific problems
  - Scalability not necessary
  - Flexibility in setting up the problem
- Unmet challenge: lack of robustness of tools -> lot of user expertise needed