Title: Applying Model Checking To Large Programs
1Applying Model Checking To Large Programs
- Madan Musuvathi
- Microsoft Research
2The Model Checking Problem
- A system model S
- A property P
- Check if S satisfies P
3The Model Checking Problem
- A system model S
- An environment E
- A property P
- Check if S in E satisfies P
4In Previous Lectures
- A system model S
- An environment E
- A property P
- Check if S in E satisfies P
5When Applied to Large Systems
- A system model S
- An environment E
- A property P
- Check if S in E satisfies P
6Model Checking: An Engineer's View
- Given a system and its environment
- Expose nondeterminism
- Environment nondeterminism: inputs, timers, events
- Internal nondeterminism: arising from abstractions
- Systematically explore all states of the system
- Do this exploration intelligently
- If lucky, you find a bug
- If luckier, you verify the system
7Explicit State Model Checking
- Explicitly generate the individual states
- Systematically explore the state space
- State space: graph that captures all behaviors
- Model checking: graph search
- Generate the state space graph "on-the-fly"
- State space is typically much larger than the reachable set of states
8Guarded Transition System
- System = State + Transitions
- Readily models event-driven systems
9The Algorithm
Hashtable states_seen
Queue pending

insert init_state into states_seen and pending
while (pending is not empty)
    current = pending.remove()
    for each enabled transition T
        restore_state(current)
        execute transition T
        successor = save_state()
        if (successor in states_seen) continue
        insert successor into states_seen
        check successor for correctness
        insert successor into pending queue
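The search loop above can be sketched in Python. This is a minimal illustration, not the code of any particular tool: it assumes states are immutable, hashable values, so `restore_state` and `save_state` become implicit, and the names `model_check`, `transitions`, and `is_ok` are hypothetical.

```python
from collections import deque

def model_check(init_state, transitions, is_ok):
    """Breadth-first search over the reachable states.

    transitions: list of (guard, apply) pairs; guard(state) -> bool,
                 apply(state) -> successor state (states must be hashable).
    is_ok:       correctness check run on every newly discovered state.
    Returns the path to the first bad state found, or None.
    """
    if not is_ok(init_state):
        return (init_state,)
    states_seen = {init_state}
    pending = deque([(init_state, (init_state,))])
    while pending:
        current, path = pending.popleft()
        for guard, apply in transitions:
            if not guard(current):          # transition not enabled here
                continue
            successor = apply(current)
            if successor in states_seen:    # already explored
                continue
            states_seen.add(successor)
            new_path = path + (successor,)
            if not is_ok(successor):
                return new_path             # counterexample trace
            pending.append((successor, new_path))
    return None

# Toy system: a counter modulo 8 that can step by 1, or by 3 when it is even.
transitions = [
    (lambda s: True, lambda s: (s + 1) % 8),
    (lambda s: s % 2 == 0, lambda s: (s + 3) % 8),
]
counterexample = model_check(0, transitions, lambda s: s != 5)
```

Because the search is breadth-first, the returned trace is a shortest path to a bad state, which tends to make bug reports easier to read.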
10How to write a model checker in an hour
- Specify the system and the environment as a class
- State: member fields
- Transitions: member functions
- Each member function has a Boolean guard function
- Capturing state: provide serialization functions
- GetState() returns the state in a buffer
- SetState() copies the state from a buffer
- Implement the search algorithm
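The recipe above might look as follows in Python. This is a sketch under stated assumptions: the class `TwoCounters`, its fields, and the `search` helper are invented for illustration, and `get_state` / `set_state` stand in for the slide's `GetState()` / `SetState()`.

```python
class TwoCounters:
    """Hypothetical toy system: state = member fields, transitions = methods."""
    LIMIT = 2

    def __init__(self):
        self.x = 0
        self.y = 0

    # Each transition is a member function paired with a Boolean guard.
    def can_inc_x(self): return self.x < self.LIMIT
    def inc_x(self): self.x += 1
    def can_inc_y(self): return self.y < self.LIMIT
    def inc_y(self): self.y += 1

    def transitions(self):
        return [(self.can_inc_x, self.inc_x), (self.can_inc_y, self.inc_y)]

    def get_state(self):
        """Serialize the member fields into a buffer."""
        return bytes([self.x, self.y])

    def set_state(self, buf):
        """Copy the member fields back from a buffer."""
        self.x, self.y = buf[0], buf[1]

    def check(self):
        """Property to check; deliberately violated at x == y == 2."""
        return not (self.x == 2 and self.y == 2)


def search(model):
    """Explore all reachable states; return (bad state or None, states seen)."""
    init = model.get_state()
    seen, pending = {init}, [init]
    while pending:
        current = pending.pop(0)
        for guard, action in model.transitions():
            model.set_state(current)        # backtrack to the state being expanded
            if not guard():
                continue
            action()
            successor = model.get_state()
            if successor in seen:
                continue
            seen.add(successor)
            if not model.check():
                return successor, len(seen)
            pending.append(successor)
    return None, len(seen)

bad_state, explored = search(TwoCounters())
```

Keeping the serialized buffer as the hash-table key means the search code never needs to know what the fields are.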
11State Explosion Problem
- Simple descriptions result in (very) large state spaces
- State space reduction techniques
- Identify behaviorally equivalent states
- Process symmetry reduction
- Heap symmetry reduction
- Identify behaviorally equivalent transition orderings
- Partial-order reduction
12How to write a model checker in a week
- Specify the system and the environment as a class
- State: member fields
- Transitions: member functions
- Each member function has a Boolean guard function
- Capturing state: provide serialization functions
- GetState() returns the state in a buffer
- SetState() copies the state from a buffer
- Implement the search algorithm
- Implement some state space reduction techniques
13Practical Challenges
- Reduce manual intervention
- How to specify the system?
- What is the environment?
- Guarantees
- Soundness
- If the tool terminates without finding a bug (of a certain type), then the program has no bugs of that type
- Preciseness
- If the tool reports an error, then it is indeed a real error
- Orthogonal to the difficulty of model checking algorithms
14Specifying the Model
- Conventional model checkers require an intermediate description (or "model")
- Describes the system at a high level
- Throws away implementation details
- Good for checking designs, rather than implementations
- Success stories: hardware, cache-coherence protocols
- Problems
- Specifying a model is HARD for large systems
- As the system evolves, the model has to be updated
- What you check is not what you run!
- Manual modeling mistakes can hide real errors or introduce false ones
15Automatically Extract the Model
- Statically analyze the code to generate a model
- Models usually mimic the implementation
Murphi model
Rule "PI Local Get (Put)"
1:  Cache.State = Invalid & !Cache.Wait
2:  & !DH.Pending
3:  & !DH.Dirty
==>
Begin
4:  Assert !DH.Local;
5:  DH.Local := true;
6:  CC_Put(Home, Memory);
EndRule;

FLASH code
void PILocalGet(void) {
    // ... Boilerplate setup
2:  if (!hl.Pending) {
3:    if (!hl.Dirty) {
4!:     // ASSERT(hl.Local);
        ...
6:      PI_SEND(F_DATA, F_FREE, F_SWAP, F_NOWAIT, F_DEC, 1);
5:      hl.Local = 1;
16Automatic Extraction
- FeaVer: C program -> Promela (SPIN) model
- User provided patterns to extract features
- Bandera: Java -> Bandera model
- Sophisticated property-driven slicing techniques
- Can throw away unrelated parts, if applicable
- Problems
- Not all primitives are available in the modeling language
- Pointers, dynamic object creation, dynamic threads, exceptions
- A precise-enough slice could be as large as the program itself
17Code as the model
- Directly execute the code
- Pioneered by Verisoft
- State-less model checking
- Explicit model checkers
- Java Path Finder (Java)
- CMC (C/C++)
- State space can be infinite (or very large)
- Try exploring as many behaviors as possible
- Focus on precision
18Model Checking = Testing?
- Almost!
- Systematic exploration of nondeterminism
- Testing: random walks in the state space
- Model checking: systematic graph search
- Forces the user to expose more nondeterminism
- A call to malloc() can fail, a packet can get lost
- State space reduction techniques identify redundant tests
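One way to picture "exposing nondeterminism" is to route every choice (does malloc() fail? is the packet lost?) through a single function that the checker controls. The sketch below is an assumption-laden illustration: `Chooser`, `explore`, and `send_message` are hypothetical names, and the exploration simply re-runs the code from the start once per sequence of choices, which only works if the code is deterministic apart from those choices.

```python
class Chooser:
    """Replays a prefix of recorded choices, then takes option 0 afterwards."""
    def __init__(self, prefix):
        self.prefix = list(prefix)
        self.trace = []  # (choice taken, number of options) per choice point

    def choose(self, n):
        i = len(self.trace)
        value = self.prefix[i] if i < len(self.prefix) else 0
        self.trace.append((value, n))
        return value


def explore(program):
    """Run `program` once for every distinct sequence of choices (depth-first)."""
    results, prefix = [], []
    while True:
        chooser = Chooser(prefix)
        results.append(program(chooser))
        trace = chooser.trace
        # Backtrack: drop exhausted trailing choice points, then bump the last one.
        while trace and trace[-1][0] + 1 == trace[-1][1]:
            trace.pop()
        if not trace:
            return results
        prefix = [value for value, _ in trace[:-1]] + [trace[-1][0] + 1]


def send_message(env):
    """Code under test: allocate a buffer, then send it over a lossy network."""
    if env.choose(2) == 1:      # environment choice: malloc() fails
        return "alloc-failed"
    if env.choose(2) == 1:      # environment choice: the packet is lost
        return "packet-lost"
    return "delivered"

outcomes = explore(send_message)
```

A random tester would sample some of these outcomes; the systematic search visits each choice sequence exactly once.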
19Specifying the System
- Similar to building a unit-test framework
- Extract the code to be checked
- Provide an environment model
- Includes entities that the implementation interacts with
- Calls to libraries, network, timers, manual input
- Code + environment is a closed system
- An executable that you can run
- Provide correctness properties
20Identify the Transitions
- A transition is a code execution between two nondeterministic choices
- Atomic execution of a thread between two schedule points
- Execution of an event handler
- Model checker should get control at these choice points
21Capturing the State
- State of the program is captured by global variables, stack, heap, and registers
- Need a way to capture the state of the environment model
22Backtracking
- Physically reset the state to an older version
- Java Pathfinder, CMC
- Go to the initial state and reexecute
- Fork a separate process at initial state (Verisoft)
- Some systems have a natural 'reset'
- Unload and reload a driver
- Reformat the disk
23Experience with CMC
- Three AODV implementations
- 35 implementation bugs, 1 specification bug
- Linux TCP
- 4 bugs, 90% protocol coverage
- Three Linux filesystems
- 32 bugs in total
- 10 serious ones (such as deleting "/")
24Environment Problem
- Where to separate the system and the environment
- Need a faithful abstraction of the environment
- Enough nondeterminism to trigger interesting behaviors in the system
- Not so much nondeterminism that it triggers false behaviors
- An Example
- System: Linux TCP implementation
- Environment: kernel, network (driver + hardware), ...
25Extracting Linux TCP from the Kernel
- Conventional wisdom
- Extract TCP along a minimal, narrow interface
- Minimizes the model state
- Provide a kernel library
- Implements stubs for all kernel functions TCP requires
- Never worked!
- The narrowest interfaces still had 150 interface functions
- These interfaces are not documented
- Errors in stubs can cause subtle but false errors
- Model checkers are good at finding subtle errors!
- Errors in stubs can also hide real errors
26Extracting Linux TCP from the Kernel
- Solution (hard learned)
- Extract along well-defined interfaces
- Minimize errors in stub implementations
- These interfaces change infrequently
- Do so even if it stresses model checking
- Well defined interfaces around TCP
- The system call interface (between kernel and user processes)
- The hardware abstraction layer (between kernel and hardware)
- Extracting at these two interfaces
- Forces CMC to run the entire Linux kernel
27Running the Entire Kernel in CMC
- Linux kernel has to run in user space
- Has been done before (UML: User Mode Linux)
- CMC needs to handle much larger states
- Approximately 300 kilobytes
- Incremental states in effect extract TCP-relevant state
- A larger state space
- Restrict the environment to trigger TCP events only
- Compensated by the ease of environment model generation
- Approach not possible when model checking with an intermediate description
28Specifying Properties
- Assertions in the code
- Trigger automatically as we are running the code
- Heap related errors
- Build your own memory allocator
- Check for leaks, double-free
- Purify-style dynamic techniques
- Reading uninitialized variables, access after free
- Checking for resource leaks
- Check if you reached the initial state if you should have
- Identify idempotent sequences
- CreateFile(A) followed by DeleteFile(A)
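The "did we get back to the initial state?" check can be sketched as below. Everything here is hypothetical: `TinyFS` is an invented in-memory file system with a deliberately leaky delete path, and `resources()` is an assumed choice of which part of the state must be restored.

```python
class TinyFS:
    """Hypothetical in-memory file system with an optional inode leak."""
    def __init__(self):
        self.files = {}            # file name -> inode number
        self.used_inodes = set()
        self.next_inode = 0

    def create_file(self, name):
        inode = self.next_inode
        self.next_inode += 1
        self.used_inodes.add(inode)
        self.files[name] = inode

    def delete_file(self, name, leak=False):
        inode = self.files.pop(name)
        if not leak:               # the buggy path forgets to free the inode
            self.used_inodes.discard(inode)

    def resources(self):
        """The part of the state that an idempotent sequence must restore."""
        return (frozenset(self.files), frozenset(self.used_inodes))


def returns_to_start(fs, sequence):
    """Run the sequence and report whether the resource state is unchanged."""
    before = fs.resources()
    for step in sequence:
        step(fs)
    return fs.resources() == before


ok = returns_to_start(
    TinyFS(), [lambda fs: fs.create_file("A"), lambda fs: fs.delete_file("A")])
leaky = returns_to_start(
    TinyFS(), [lambda fs: fs.create_file("A"),
               lambda fs: fs.delete_file("A", leak=True)])
```

The check needs no specification of what a leak is; it only compares two snapshots that should be equal.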
29Some properties are hard to specify
- Real systems have ambiguous / incomplete specifications
- TCP congestion control should not use up "too much" network bandwidth
- A file system should not lose files
- Difficult to check in the presence of crashes
- Identify properties that are easy to check
- A file system is in a bad state if its own fsck() cannot recover from it
30State Space Reduction Techniques
- Downscaling
- Hash Compaction
- Identifying State Symmetries
31Downscaling
- Check smaller versions of the model
- Example
- Run with only 3-4 nodes in the network
- Send just 3 data packets
- Find bugs involving complex interactions in smaller instances
- Potentially miss bugs present only in larger instances
32Hash Compaction
- Compact states in the hash table [Stern, 1995]
- Compute a signature for each state
- Only store the signature in the hashtable
- Signature is computed incrementally
- Partial signature cached at each page
- Might miss errors due to collisions
- Orders of magnitude memory savings
- Compact 100 kilobyte state to 4-8 bytes
- Possible to search 10 million states
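A rough sketch of hash compaction with per-page cached signatures follows. The page size, the `PagedState` class, and the use of BLAKE2b truncated to 8 bytes are assumptions made for this illustration, not details taken from CMC.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for this sketch

def digest8(data):
    """8-byte signature of a byte string."""
    return hashlib.blake2b(data, digest_size=8).digest()

class PagedState:
    """State memory split into pages, with a cached partial signature per page."""
    def __init__(self, num_pages):
        self.pages = [bytearray(PAGE_SIZE) for _ in range(num_pages)]
        self.page_sigs = [None] * num_pages   # None marks a dirty page
        self.pages_hashed = 0                 # how many page hashes were computed

    def write(self, addr, value):
        page, offset = divmod(addr, PAGE_SIZE)
        self.pages[page][offset] = value
        self.page_sigs[page] = None           # invalidate only the touched page

    def signature(self):
        for i, sig in enumerate(self.page_sigs):
            if sig is None:                   # re-hash dirty pages only
                self.page_sigs[i] = digest8(bytes(self.pages[i]))
                self.pages_hashed += 1
        return digest8(b"".join(self.page_sigs))

state = PagedState(num_pages=25)      # 25 * 4096 bytes, about 100 kilobytes
states_seen = {state.signature()}     # the table stores 8 bytes per state
state.write(5000, 7)                  # a transition that dirties one page
sig = state.signature()
```

Two different states can map to the same 8-byte signature, in which case the second is wrongly treated as already visited; that is the source of the missed errors mentioned above.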
33State Symmetries
- Explore one out of a (large) set of equivalent states
- Canonicalize states before hashing
[Figure: Current State -> Canonical State -> Hash Signature -> Hash table; Successor States are generated from the Current State]
- State transformations can be approximate
- But, use the original state for further state exploration
- Thus, approximations do not generate false errors!
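The split between "canonical state for the hash table" and "original state for exploration" can be sketched with process symmetry as the example. The toy system, `canonical`, and `count_states` are hypothetical; sorting the per-process local states is one simple canonicalization for identical processes.

```python
def canonical(state):
    """Process symmetry: the order of identical processes does not matter."""
    return tuple(sorted(state))

def successors(state):
    """Hypothetical system: any one process may advance its counter up to 2."""
    for i, local in enumerate(state):
        if local < 2:
            yield state[:i] + (local + 1,) + state[i + 1:]

def count_states(init, key):
    """Count the states stored when `key` decides what counts as 'seen'."""
    seen, pending = {key(init)}, [init]
    while pending:
        current = pending.pop()
        for succ in successors(current):   # successors of the original state
            k = key(succ)                  # only the hash-table key is canonical
            if k not in seen:
                seen.add(k)
                pending.append(succ)
    return len(seen)

full = count_states((0, 0, 0), key=lambda s: s)
reduced = count_states((0, 0, 0), key=canonical)
```

Because only real successor states are ever executed and checked, a sloppy `canonical` function can cause states to be skipped but cannot make the checker report an error in a state the system never reaches.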
34Heap Canonicalization
- Heap objects can be allocated in different order
- Depends on the order events happen
- Relocate heap objects to a unique representation
[Figure: state1 and state2 map to the same Canonical Representation]
- Essentially
- Find a canonical representation for each heap graph
- By abstracting the concrete values of pointers
35Heap Canonicalization Algorithm
- Basic algorithm [Iosif 01]
- Do a deterministic graph traversal of the heap (bfs / dfs)
- Relocate objects in the order visited
- CMC extensions
- How to do it incrementally?
- Should not traverse the entire heap in every transition
- How to do it for C objects?
- Type information is not available at run time
36Iosif's Canonicalization Algorithm
- Do a deterministic graph traversal of the heap (bfs / dfs)
- Relocate objects to a canonical location
- Determined by the dfs (or bfs) number of the object
- Hash the resulting heap
[Figure: Heap and Canonical Heap, with roots r and s and objects at addresses 0, 2, 4, 6]
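The traversal-based canonicalization can be sketched as follows. The heap encoding (a dictionary from address to a list of tagged fields) and the function names are assumptions made for this illustration; the sketch uses a recursive dfs and numbers objects 0, 1, 2, ... in visit order.

```python
import hashlib

def canonicalize(heap, roots):
    """Relocate objects to their dfs visit number and rewrite pointers.

    heap:  dict address -> list of fields; a field is ("ptr", address),
           ("ptr", None) for a null pointer, or ("val", value).
    roots: root pointers, always listed in the same fixed order.
    """
    new_addr, order = {}, []

    def visit(addr):
        if addr is None or addr in new_addr:
            return
        new_addr[addr] = len(order)        # canonical location = visit number
        order.append(addr)
        for kind, x in heap[addr]:
            if kind == "ptr":
                visit(x)

    for root in roots:
        visit(root)

    def fix(field):
        kind, x = field
        return (kind, new_addr.get(x)) if kind == "ptr" else field

    canon_heap = [tuple(fix(f) for f in heap[a]) for a in order]
    canon_roots = [new_addr.get(r) for r in roots]
    return canon_roots, canon_heap

def heap_hash(heap, roots):
    return hashlib.sha256(repr(canonicalize(heap, roots)).encode()).hexdigest()

# Two linked lists (a -> b and c -> d) allocated at different addresses.
heap1 = {0: [("val", "a"), ("ptr", 2)], 2: [("val", "b"), ("ptr", None)],
         4: [("val", "c"), ("ptr", 6)], 6: [("val", "d"), ("ptr", None)]}
heap2 = {6: [("val", "a"), ("ptr", 0)], 0: [("val", "b"), ("ptr", None)],
         2: [("val", "c"), ("ptr", 4)], 4: [("val", "d"), ("ptr", None)]}
same = heap_hash(heap1, [0, 4]) == heap_hash(heap2, [6, 2])
```

Objects that are unreachable from the roots are never visited, so garbage does not influence the hash in this sketch.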
37Two Linked List Example
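[Figure: Heap and Canonical Heap for two linked lists rooted at r and s (locations 0, 2, 4, 6), annotated with partial hash values; after the transition "Insert b" the canonical heap uses locations 0, 2, 4, 6, 8]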
38A Much Larger Example: Linux Kernel
[Figure: Heap and Canonical Heap of the Linux kernel, divided into Core OS, Network, and Filesystem regions, with an object p]
- An object insertion in one part of the heap affects the canonical location of objects in another part
39Incremental Heap Canonicalization
- Access Chain
- A path from the root to an object in the heap
- Bfs Access Chain
- Shortest of all access paths
- Break ties lexicographically
- Note: Bfs access chain is a shortest path from a global variable
- Canonical location of an object is a function of its bfs access chain
[Figure: heap graph rooted at r with objects a, b, c and edges labeled f, g, h]
- Access chains of c
- <r,f,g>
- <r,g,h>
- <r,f,f,h>
- Bfs access chain of c
- <r,f,g>
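Computing bfs access chains can be sketched with an ordinary breadth-first search that follows fields in sorted order. The graph below is an assumption: its edges are inferred from the three access chains listed for c, and the dictionary encoding and function name are hypothetical.

```python
from collections import deque

def bfs_access_chains(heap, root):
    """Map each reachable object to its shortest, lexicographically least chain.

    heap: dict object -> dict of field name -> target object.
    """
    chains = {root: (root,)}
    queue = deque([root])
    while queue:
        obj = queue.popleft()
        for field in sorted(heap[obj]):        # sorted order breaks ties
            target = heap[obj][field]
            if target not in chains:           # first discovery is the bfs chain
                chains[target] = chains[obj] + (field,)
                queue.append(target)
    return chains

# Edges inferred from the chains <r,f,g>, <r,g,h>, and <r,f,f,h>.
heap = {
    "r": {"f": "a", "g": "b"},
    "a": {"f": "b", "g": "c"},
    "b": {"h": "c"},
    "c": {},
}
chains = bfs_access_chains(heap, "r")
```

Both `<r,f,g>` and `<r,g,h>` have length three, so the lexicographic rule is what selects `<r,f,g>` for c.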
40Revisiting Two Linked Lists Example
Relocation Function Table
<r> -> 0        <s> -> 4
<r,n> -> 2      <s,n> -> 6
<r,n,n> -> 8
r, s are root vars; n is the next field
[Figure: Heap and Canonical Heap before the insertion (locations 0, 2, 4, 6) and after it (locations 0, 2, 4, 6, 8)]
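The relocation table can be sketched as a dictionary from access chain to canonical location that hands out the next free slot the first time a chain is seen. The class and helper names are hypothetical, the slot size of 2 is chosen only to match the addresses in the example, and the visit-order baseline is included for comparison.

```python
SLOT = 2  # assumed object size, matching the locations 0, 2, 4, ... above

class RelocationTable:
    """Maps a bfs access chain to a canonical location, assigned on first use."""
    def __init__(self):
        self.table = {}

    def location(self, chain):
        if chain not in self.table:
            self.table[chain] = SLOT * len(self.table)
        return self.table[chain]

def list_chains(root, length):
    """Access chains of a singly linked list: <root>, <root,n>, <root,n,n>, ..."""
    return [(root,) + ("n",) * i for i in range(length)]

def canonical_layout(table, lists):
    """Canonical location of every list node, looked up by access chain."""
    return {c: table.location(c)
            for root, n in lists for c in list_chains(root, n)}

def visit_order_layout(lists):
    """Baseline: number the objects in traversal order on every state."""
    chains = [c for root, n in lists for c in list_chains(root, n)]
    return {c: SLOT * i for i, c in enumerate(chains)}

table = RelocationTable()
before = canonical_layout(table, [("r", 2), ("s", 2)])
after = canonical_layout(table, [("r", 3), ("s", 2)])   # one node added to r's list
baseline = visit_order_layout([("r", 3), ("s", 2)])
```

With the table, the new node lands at location 8 and the objects of the s list stay at 4 and 6; with visit-order numbering they would move to 6 and 8, which would invalidate their cached partial hashes.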
41And on the much larger example
[Figure: Heap and Canonical Heap of the Linux kernel, divided into Core OS, Network, and Filesystem regions, with an object p]
- Changes in another part of the heap do not affect the canonical location of p
- Canonical location of p does not change
- Unless its Bfs Access Chain changes
- For small changes to the graph
- Shortest path of most objects remains the same