Title: Distributed Verification of Multithreaded C Programs
1 Distributed Verification of Multi-threaded C Programs
Stefan Edelkamp, joint work with Damian Sulewski and Shahid Jabbar
2 Motivation - IO-HSF-SPIN
Same states in both parts
Arrives at the final state
Large jumps due to 2nd heuristic
Current state
Already seen final state
Arrives again at same final state
2.9 TB disk: 20 days on 1 node vs. 8 days on 3 nodes
3 Overview
- Software Checking in StEAM
5 Software Checking
Advantages:
- Building a model is unnecessary
- Learning a specification language is unnecessary
- Checking can be done more often
Drawbacks:
- Code has to be executed
- Huge number of states
- Huge states
6 StEAM
- Can check concurrent C programs
- Uses a virtual machine for execution
- Supports BFS, DFS, Best-First, A*, IDA*
- Finds:
  - Deadlocks
  - Assertion violations
  - Segmentation faults
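7 StEAM - Checking a C Program
Model checker
igcc Compiler
Virtual Machine
Source code:
    #include <stdlib.h>

    char globalChar;
    int globalBlocksize = 7;

    void allocateBlock(int size);

    int main() {
        /* the slide text passes "blocksize"; the global is assumed */
        allocateBlock(globalBlocksize);
        return 0;
    }

    void allocateBlock(int size) {
        void *memBlock;
        memBlock = (void *) malloc(size);
    }
Object code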
8 StEAM - Interpreting the Object Code
ICVM Virtual Machine
Register
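Source code:
    #include <stdlib.h>

    char globalChar;
    int globalBlocksize = 7;

    void allocateBlock(int size);

    int main() {
        /* the slide text passes "blocksize"; the global is assumed */
        allocateBlock(globalBlocksize);
        return 0;
    }

    void allocateBlock(int size) {
        void *memBlock;
        memBlock = (void *) malloc(size);
    }
Text Section
Object code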
BSS Section
Data Section
Stack
Memory Pool
9 StEAM - Generating States
ICVM Virtual Machine
Register
Text Section
BSS Section
Data Section
Stack
Memory Pool
11 Externalization - Motivation
[Plot: time vs. problem size; curves: Internal, External]
12 Externalization - Mini States
[EJMRS 06]
- Pointer to a state in RAM or on disk
- Pointer to the predecessor mini state
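The two fields of a mini state can be sketched as a small C struct. This is only an illustration under assumptions: the names MiniState, on_disk, where, disk_offset and trail_length are hypothetical and not taken from StEAM. The helper follows the predecessor pointers, which is the kind of walk needed to rebuild an error trail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout of a mini state: a constant-size record that
 * refers to the full state and to the predecessor mini state. */
typedef struct MiniState {
    int on_disk;                   /* 0: full state is in RAM, 1: on disk */
    union {
        void *ram;                 /* address of the full state in RAM */
        long  disk_offset;         /* assumed: byte offset in a state file */
    } where;
    const struct MiniState *pred;  /* NULL for the initial state */
} MiniState;

/* Number of predecessor links between m and the initial state. */
size_t trail_length(const MiniState *m) {
    assert(m != NULL);
    size_t steps = 0;
    while (m->pred != NULL) {
        steps++;
        m = m->pred;
    }
    return steps;
}
```

Because the record has a fixed size regardless of how large the full state is, many mini states fit in RAM even when the full states have to be flushed to disk.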
13 Externalization - Expanding a State
Cache
14 Externalization - Flushing the Cache
Cache
Mini States
15 Externalization - Collapse Compression
State
Caches
Files on Disk
Register
Text Section
BSS Section
Data Section
Stack
Memory Pool
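The idea behind collapse compression can be sketched as follows: each state component (for example a stack or the memory pool) is stored once in a table, and a state refers to its components by index. This is a minimal in-memory sketch under assumptions, not StEAM's implementation; ComponentTable, intern_component and the size limits are hypothetical, and the caches and disk files named on the slide are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_COMPONENTS 64        /* assumed table capacity */
#define MAX_COMPONENT_BYTES 32   /* assumed maximum component size */

/* Table that stores every distinct component exactly once. */
typedef struct {
    unsigned char bytes[MAX_COMPONENTS][MAX_COMPONENT_BYTES];
    size_t len[MAX_COMPONENTS];
    size_t count;
} ComponentTable;

/* Returns the index of the component, storing it first if it is new.
 * Returns -1 if the component is too large or the table is full. */
int intern_component(ComponentTable *t, const void *data, size_t len) {
    assert(t != NULL && data != NULL);
    if (len > MAX_COMPONENT_BYTES)
        return -1;
    for (size_t i = 0; i < t->count; i++) {
        if (t->len[i] == len && memcmp(t->bytes[i], data, len) == 0)
            return (int)i;       /* already stored: reuse the index */
    }
    if (t->count == MAX_COMPONENTS)
        return -1;
    memcpy(t->bytes[t->count], data, len);
    t->len[t->count] = len;
    return (int)t->count++;
}
```

Two states that differ only in one component then share the indices of all other components, so only the changed component costs additional space. The linear search keeps the sketch short; a hash table would be the usual choice for large tables.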
17 Virtual Addresses
- Memory assignment is done by the system
- Moving a program between nodes is impossible with real addresses
- Solution: convert the addresses before executing
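One way to picture the address conversion is to treat a virtual address as an offset into the memory pool and to add the node-local base address before execution. The sketch below assumes exactly that; VAddr, MemoryPool, to_real and to_virtual are hypothetical names, and the actual scheme used in StEAM may differ.

```c
#include <assert.h>
#include <stddef.h>

typedef size_t VAddr;            /* assumed: byte offset into the pool */

typedef struct {
    unsigned char *base;         /* real start address on this node */
    size_t size;                 /* pool size in bytes */
} MemoryPool;

/* Convert a virtual address into a real pointer on this node. */
void *to_real(const MemoryPool *pool, VAddr va) {
    assert(pool != NULL && va < pool->size);
    return pool->base + va;
}

/* Convert a real pointer inside the pool into a virtual address. */
VAddr to_virtual(const MemoryPool *pool, const void *real) {
    const unsigned char *p = real;
    assert(pool != NULL && p >= pool->base && p < pool->base + pool->size);
    return (VAddr)(p - pool->base);
}
```

A pool copied to another node gets a different base address, but every stored virtual address stays valid, since only the base used in to_real changes.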
18 Virtual Addresses - Memory Management
Memory pool
19 Virtual Addresses - Overhead
[Plot: time vs. number of nodes; curves: virtual, real]
21 Parallelization - Motivation
- Distributed (shared) memory
  → MPI channels / shared-RAM communication
- Sending full states is too expensive (if they are not used for expansion)
  → Exploit externalization
  → Dual channel (speedup vs. load balance)
  → Appropriate state space partitioning
22 Parallelization - Dual Channel Communication
23 Parallelization - Hash Partitioning
- Partitioning by hashing the full state
  - Problem: successors are often not in the same partition → high communication overhead
- Partitioning by hashing a partial state, e.g. the memory pool
  - Problem: too many states map to one hash value → load balancing
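Partitioning by a partial state can be sketched as hashing only the memory pool bytes and reducing the result modulo the number of nodes. The hash used here is 32-bit FNV-1a, chosen only for illustration; the slides do not say which hash function is used, and fnv1a and partition_of are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a hash (illustrative choice, not taken from the slides). */
uint32_t fnv1a(const unsigned char *data, size_t len) {
    uint32_t h = 2166136261u;            /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;                  /* FNV prime */
    }
    return h;
}

/* Node responsible for a state, decided by its memory pool only. */
unsigned partition_of(const unsigned char *pool, size_t len, unsigned nodes) {
    assert(pool != NULL && nodes > 0);
    return fnv1a(pool, len) % nodes;
}
```

A successor that leaves the memory pool unchanged lands on the same node as its parent, which saves communication. The price is the load-balancing problem named above: all states with the same pool contents end up on one node.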
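24 Parallelization - Incremental Tree Hashing
[EM05]
$h(s) = \left(\sum_i s_i \cdot 3^i\right) \bmod 17$
$h(1,2,3,1,2,2,1,2) = (4 + 1 \cdot 3^2 + 9 \cdot 3^{2+2}) \bmod 17 = 11$
$h(3,1) = (3 \cdot 3 + 1 \cdot 9) \bmod 17 = 1$
$h(2,2,1,2) = 9 = (6 + h(2,1,2) \cdot 3^1) \bmod 17 = (6 + 1 \cdot 3) \bmod 17$
$h(1,2) = (1 \cdot 3 + 2 \cdot 9) \bmod 17 = 4$
$h(2) = (2 \cdot 3^1) \bmod 17 = 6$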
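The worked example can be checked with a short C sketch. It assumes the index i starts at 1, which is what the numbers on the slide imply, and it uses the identity h(a followed by b) = (h(a) + 3^|a| * h(b)) mod 17 to combine the hashes of two subvectors. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { BASE = 3, MOD = 17 };

/* BASE^e mod MOD */
static unsigned pow_mod(size_t e) {
    unsigned r = 1;
    while (e-- > 0)
        r = (r * BASE) % MOD;
    return r;
}

/* h(s) = (sum over i = 1..n of s_i * 3^i) mod 17 */
unsigned hash_vector(const unsigned *s, size_t n) {
    assert(s != NULL);
    unsigned h = 0, p = 1;
    for (size_t i = 0; i < n; i++) {
        p = (p * BASE) % MOD;            /* p = 3^(i+1) mod 17 */
        h = (h + (s[i] % MOD) * p) % MOD;
    }
    return h;
}

/* Hash of the concatenation of a left part (length len_left, hash h_left)
 * and a right part (hash h_right), without rescanning either part. */
unsigned hash_concat(unsigned h_left, size_t len_left, unsigned h_right) {
    return (h_left + h_right * pow_mod(len_left)) % MOD;
}
```

For the slide's vector, hash_concat(hash_concat(4, 2, 1), 4, 9) combines h(1,2) = 4, h(3,1) = 1 and h(2,2,1,2) = 9 into 11, the same value that hash_vector computes over all eight components. When only one subvector changes, only the hashes on its path to the root of the tree need to be recombined.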
25 Parallelization - Search Partitioning
horizontal slices
vertical slices
DFS [Holzmann and Bosnacki 2006]
Best-First, A*
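Slicing a depth-first search can be sketched as assigning consecutive ranges of search depth to the nodes in round-robin order. This is an assumption about how the slices are formed, in the spirit of the depth slicing attributed to Holzmann and Bosnacki; the slice height and the names owner_of_depth and must_hand_off are hypothetical.

```c
#include <assert.h>

/* Node that owns all states at the given search depth, assuming
 * slices of slice_height consecutive depths assigned round-robin. */
unsigned owner_of_depth(unsigned depth, unsigned slice_height, unsigned nodes) {
    assert(slice_height > 0 && nodes > 0);
    return (depth / slice_height) % nodes;
}

/* Nonzero if a successor at depth + 1 belongs to a different node,
 * so that the state has to be sent instead of expanded locally. */
int must_hand_off(unsigned depth, unsigned slice_height, unsigned nodes) {
    return owner_of_depth(depth, slice_height, nodes)
        != owner_of_depth(depth + 1, slice_height, nodes);
}
```

With a slice height of 10 on 3 nodes, depths 0 to 9 belong to node 0, depths 10 to 19 to node 1, and depth 30 wraps around to node 0 again. A larger slice height means fewer hand-offs but a longer wait before other nodes receive work.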
26 Parallelization - Hardware
- Cluster Vision System (PBS)
- SUSE Linux 10.0
- MPI via InfiniBand
- Files via GBit Ethernet
- 224 nodes (464 procs), < 15 used
- AMD Opteron DP 50 (2.4 GHz)
27 Experiments - 15-Puzzle, Partial Hash
[Plot: speedup and time vs. number of nodes]
28 Experiments - Depth-First Slicing, 200 Philosophers
[Plot: time vs. number of processors]
Top result: 600 philosophers on 6 nodes, 97 KB per state
Ex. Collapse Compression, Distribution: 16 GB → 1.5 GB per node
29 Experiments - Bath-Tub Effect (50 philosophers, average)
[Plot: time vs. size of depth layer]
Validates Holzmann and Bosnacki
30 Experiment - Shared Memory, Bakery (pthread)
- 4 Opteron MP 852 (2.6 GHz)
[Plot: speedup and time vs. number of nodes]
31 Conclusion
- Preceding work: full externalization of states in IO-HSF-SPIN → constant-size RAM, e.g. 1.8 GB RAM, 20 days on 1 proc, 8 days on 4 procs, 2.9 TB disk [EJ06]; distribution via (g+h)-value
- Problem: huge, highly dynamic states
- Solution: mini states as constant-size fingerprints of states in RAM, used for dual-channel communication, to combine external and parallel search with memory-pool, best-first slicing partitioning