Title: Static and Runtime Verification: A Monte Carlo Approach
1Static and Runtime Verification: A Monte Carlo Approach
Radu Grosu
State University of New York at Stony Brook
grosu_at_cs.sunysb.edu
2Talk Outline
- Embedded Software Systems
- Automata-Theoretic Verification
- Monte Carlo Verification
- Monte Carlo Model Checking
- Static Verification of Software-Systems
- Dynamic Verification of Software-Systems
3Embedded Software Systems
- Systems with ongoing interaction with their environment.
- - Termination is an error rather than expected behavior.
- Becoming an integral part of nearly every engineered product.
- - They control
4Embedded Systems
5Boeing 777: Super Computers with Wings
- Has
- - > 4M lines of code
- - > 1K embedded processors
- In order to
- - control subsystems
- - aid pilots in flight management.
A great challenge of software engineering:
- hard real-time deadlines,
- mission and safety-critical,
- complex and embedded within another complex system,
- interacts with humans in a sophisticated way.
6Embedded Software Systems
- Difficult to develop and maintain
- - Concurrent and distributed (OS, ES, middleware),
- - Complicated by DS improving performance (locks, RC, ...),
- - Mostly written in the C programming language.
- Have to be high-confidence
- - Provide the critical infrastructure for all applications,
- - Failures are very costly (business, reputation),
- - Have to protect against cyber-attacks.
7Temporal Properties
- Safety (something bad never happens)
- Airborne planes are at least 1 mile apart
- Nuclear reactor core never overheats
- Gamma knife never exceeds prescribed dose
- Liveness (something good eventually happens)
- Core eventually reaches nominal temperature
- Dishwasher tank is eventually full
- Airbag inflates within 5ms of collision
8Linear Temporal Logic
- An LTL formula is made up of atomic propositions p, boolean connectives $\neg$, $\wedge$, $\vee$ and temporal modalities X (neXt) and U (Until).
- Safety: "nothing bad ever happens"
- - E.g. $G(\neg(pc_1 = cs \wedge pc_2 = cs))$ where G is a derived modality (Globally).
- Liveness: "something good eventually happens"
- - E.g. $G(req \rightarrow F\, serviced)$ where F is a derived modality (Finally).
9LTL Semantics
- Semantics given in terms of the inductively defined entailment relation $\xi \models \varphi$.
- $\xi$ is an infinite word (execution) over the power set of the set of atomic propositions.
- $\varphi$ is an LTL formula.
10LTL Semantics
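[Figure: timelines illustrating X p (p holds in the next state), p U q (p holds until q holds), F p (p holds in some future state), and G p (p holds in every state)]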
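The four modalities can be made concrete on lasso-shaped words (a finite prefix followed by a cycle repeated forever), which is also the shape of execution sampled later in the talk. The following is a minimal sketch, not part of the original slides: the `LassoWord` class, the tuple encoding of formulas, and the function names `sat` and `holds` are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LassoWord:
    """Infinite word prefix . cycle^omega (hypothetical helper type)."""
    prefix: tuple  # finite stem; each letter is a frozenset of atomic propositions
    cycle: tuple   # non-empty loop, repeated forever

    def letters(self):
        return self.prefix + self.cycle

    def succ(self, i):
        # The last position loops back to the first position of the cycle.
        n = len(self.prefix) + len(self.cycle)
        return i + 1 if i + 1 < n else len(self.prefix)


def sat(f, w):
    """Return the set of positions of w at which formula f holds."""
    letters = w.letters()
    positions = set(range(len(letters)))
    op = f[0]
    if op == "true":
        return positions
    if op == "ap":
        return {i for i in positions if f[1] in letters[i]}
    if op == "not":
        return positions - sat(f[1], w)
    if op == "and":
        return sat(f[1], w) & sat(f[2], w)
    if op == "X":
        inner = sat(f[1], w)
        return {i for i in positions if w.succ(i) in inner}
    if op == "U":
        left, result = sat(f[1], w), set(sat(f[2], w))
        while True:  # least fixpoint of: q or (p and X(p U q))
            grow = {i for i in left - result if w.succ(i) in result}
            if not grow:
                return result
            result |= grow
    if op == "F":  # F f = true U f
        return sat(("U", ("true",), f[1]), w)
    if op == "G":  # G f = not F not f
        return positions - sat(("F", ("not", f[1])), w)
    raise ValueError(f"unknown operator: {op}")


def holds(f, w):
    """A word satisfies f if f holds at its first position."""
    return 0 in sat(f, w)
```

For example, on the word `{p} {p} ({q})^omega` the formulas `X p`, `p U q`, and `G F q` hold, while `G p` does not.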
11What is High-Confidence?
Ability to guarantee that
$S \models \varphi$
system-software S satisfies LTL property $\varphi$
12Talk Outline
- Embedded Software Systems
- Automata-Theoretic Verification
- Monte Carlo Verification
- Monte Carlo Model Checking
- Static Verification of Software-Systems
- Dynamic Verification of Software-Systems
13Checking if $S \models \varphi$
- Statically (at compile time)
- - Abstract interpretation (sequential IS programs),
- - Model checking (concurrent FS programs),
- Dynamically (at run time)
- - Runtime analysis (sequential program optimization).
- Basic Idea
- - Intelligently explore S's state space in an attempt to establish that $S \models \varphi$
14Automata-Theoretic Approach
- Büchi automaton: NFA over $\omega$-words with acceptance condition: a final state must be visited $\omega$-often.
- Every LTL formula $\varphi$ can be translated to a Büchi automaton $B_\varphi$ such that $L(\varphi) = L(B_\varphi)$.
- State transition graph of S can also be viewed as a Büchi automaton.
15Automata-theoretic approach
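- Satisfaction reduced to language emptiness:
- - $S \models \varphi \iff L(B_S) \subseteq L(B_\varphi) \iff L(B_S) \cap \overline{L(B_\varphi)} = \emptyset$
- - $\iff L(B_S) \cap L(B_{\neg\varphi}) = \emptyset \iff L(B_S \times B_{\neg\varphi}) = \emptyset$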
16Büchi Automata
- Finite automata over infinite words.
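[Figure: two example Büchi automata A and B, each with states 1 and 2 and transitions labeled a and b]
$L(A) = ab^{\omega}$
$L(B) = \emptyset$
- Checking non-emptiness is equivalent to finding a reachable accepting cycle (lasso).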
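To illustrate the lasso criterion, here is a minimal sketch of the classic nested depth-first search for an accepting cycle. It is not taken from the slides: the function name `has_accepting_lasso`, the dictionary encoding of the transition graph, and the two toy automata are assumptions chosen for illustration, and transition labels are ignored because only reachability matters for emptiness.

```python
def has_accepting_lasso(init, succ, accepting):
    """Return True if some accepting state is reachable and lies on a cycle.

    init      -- iterable of initial states
    succ      -- dict mapping each state to a list of successor states
    accepting -- set of accepting (final) states
    """
    outer, inner = set(), set()

    def inner_dfs(state, seed):
        # Second search: look for a path from the seed back to itself.
        for nxt in succ.get(state, ()):
            if nxt == seed:
                return True
            if nxt not in inner:
                inner.add(nxt)
                if inner_dfs(nxt, seed):
                    return True
        return False

    def outer_dfs(state):
        # First search: visit reachable states and launch the second
        # search from accepting states in postorder.
        outer.add(state)
        for nxt in succ.get(state, ()):
            if nxt not in outer and outer_dfs(nxt):
                return True
        return state in accepting and inner_dfs(state, state)

    return any(s not in outer and outer_dfs(s) for s in init)


# Two toy graphs (hypothetical): same transitions, different accepting sets.
graph = {1: [2], 2: [2]}
non_empty = has_accepting_lasso([1], graph, accepting={2})  # 2 is on a self-loop
empty = has_accepting_lasso([1], graph, accepting={1})      # 1 is never revisited
```

The recursive formulation keeps the sketch short; a production implementation would use explicit stacks to avoid recursion limits on large state spaces.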
17Checking Non-Emptiness
[Figure: lassos in the computation tree (CT) of B, with depth bounded by the recurrence diameter]
Explore all lassos in the CT:
- DDFS, SCC: time efficient
- DFS: memory efficient
18Talk Outline
- Embedded Software Systems
- Automata-Theoretic Verification
- Monte Carlo Verification
- Monte Carlo Model Checking
- Static Verification of Software-Systems
- Dynamic Verification of Software-Systems
19Randomized Algorithms
- Huge impact on CS: (distributed) algorithms, complexity theory, cryptography, etc.
- The choice of the algorithm's next step may depend on a random choice (coin flip).
- Benefits of randomization include simplicity, efficiency, and symmetry breaking.
20Randomized Algorithms
- Monte Carlo: may produce an incorrect result but with bounded error probability.
- - Example: election result prediction
- Las Vegas: always gives the correct result but running time is a random variable.
- - Example: Randomized Quick Sort
21Monte Carlo Approach
[Figure: lassos in the computation tree (CT) of B, with depth bounded by the recurrence diameter; at each state, flip a k-sided coin]
Explore $N(\epsilon, \delta)$ independent lassos in the CT.
Error margin $\epsilon$ and confidence ratio $\delta$.
22Lassos Probability Space
- Sample space: lassos in $B_S \times B_{\neg\varphi}$
- Bernoulli random variable Z (coin flip):
- - Outcome 1 if the randomly chosen lasso is accepting
- - Outcome 0 otherwise
- $p_Z = \sum_i p_i Z_i$ (expectation of an accepting lasso), where $p_i$ is the lasso probability (uniform random walk)
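As a small worked illustration of this probability space, the sketch below enumerates every lasso of a tiny graph, assigns each the probability of the uniform random walk that produces it, and sums the probabilities of the accepting ones. The graph, the function names, and the assumption that every state has at least one successor are illustrative and not taken from the slides.

```python
from fractions import Fraction


def lassos(succ, init):
    """Yield (path, loop_start, probability) for every lasso from init.

    A lasso is a walk that stops the first time it revisits a state.
    Assumes every state has at least one successor (no deadlocks).
    """
    def walk(path, prob):
        choices = succ[path[-1]]
        for nxt in choices:
            p = prob * Fraction(1, len(choices))  # uniform choice among successors
            if nxt in path:
                yield path + [nxt], path.index(nxt), p
            else:
                yield from walk(path + [nxt], p)

    yield from walk([init], Fraction(1))


def accepting_probability(succ, init, accepting):
    """p_Z: total probability of lassos whose cycle contains an accepting state."""
    return sum(
        (p for path, k, p in lassos(succ, init)
         if any(s in accepting for s in path[k:])),
        Fraction(0),
    )


# Hypothetical example: three lassos 1-2-2, 1-3-1, 1-3-3
# with probabilities 1/2, 1/4, 1/4.
graph = {1: [2, 3], 2: [2], 3: [1, 3]}
total = sum(p for _, _, p in lassos(graph, 1))
p_z = accepting_probability(graph, 1, accepting={1})
```

Here only the lasso 1-3-1 has state 1 on its cycle, so `p_z` is 1/4 while the lasso probabilities sum to 1.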
23Example Lassos Probability Space
24Geometric Random Variable
- Value of geometric RV $X$ with parameter $p_Z$:
- - Number of independent trials (lassos) until success
- Probability mass function:
- - $p(N) = P[X = N] = q_Z^{N-1} p_Z$, where $q_Z = 1 - p_Z$
- Cumulative distribution function:
- - $F(N) = P[X \le N] = \sum_{i \le N} p(i) = 1 - q_Z^{N}$
25How Many Lassos?
- Requiring $1 - q_Z^{N} = 1 - \delta$ yields:
- - $N = \ln(\delta) / \ln(1 - p_Z)$
- Lower bound on the number of trials N needed to achieve success with confidence ratio $\delta$.
26What If pz Unknown?
- Requiring $p_Z \ge \epsilon$ yields:
- - $M = \ln(\delta) / \ln(1 - \epsilon) \ge N = \ln(\delta) / \ln(1 - p_Z)$
- - and therefore $P[X \le M] \ge 1 - \delta$
- Lower bound on the number of trials M needed to achieve success with confidence ratio $\delta$ and error margin $\epsilon$.
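The bound can be checked numerically. The sketch below is illustrative only: the function names are assumptions, and rounding the bound up to an integer number of trials is a choice made here, not something stated on the slides.

```python
import math


def sample_bound(epsilon, delta):
    """Real-valued bound M = ln(delta) / ln(1 - epsilon)."""
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.log(delta) / math.log1p(-epsilon)


def success_within(n, p):
    """Geometric CDF: probability of at least one success in n trials."""
    return 1.0 - (1.0 - p) ** n


epsilon, delta = 1.8e-3, 1e-1
bound = sample_bound(epsilon, delta)   # roughly 1278.06
trials = math.ceil(bound)              # integer number of lassos to sample
confidence = success_within(trials, epsilon)
```

With these inputs the bound evaluates to roughly 1278, and if the true $p_Z$ is at least $\epsilon$, sampling `trials` lassos finds an accepting one with probability at least $1 - \delta$.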
27Statistical Hypothesis Testing
- Null hypothesis $H_0$: $p_Z \ge \epsilon$
- Inequality becomes: $P[X \le M \mid H_0] \ge 1 - \delta$
- If no success after M trials, i.e., $X > M$, then reject $H_0$
- Type I error: $\alpha = P[X > M \mid H_0] < \delta$
28Monte Carlo Verification (MV)
input: $B = (\Sigma, Q, Q_0, \delta, F)$, $\epsilon$, $\delta$
$N = \ln(\delta) / \ln(1 - \epsilon)$
for (i = 1; i ≤ N; i++)
    if (RL(B) == 1) return (1, error-trace);
return (0, reject $H_0$ with $\alpha = \Pr[X > N \mid H_0] < \delta$);

RL(B): performs a uniform random walk through B, storing the states encountered in a hash table, to obtain a random sample (lasso).
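The MV loop can be sketched in a few lines of Python. This is an illustrative reading of the pseudocode, not the authors' implementation: the graph encoding, the function names `random_lasso` and `monte_carlo_verify`, the explicit `rng` parameter, and rounding N up to an integer are all assumptions.

```python
import math
import random


def random_lasso(succ, init, accepting, rng):
    """RL(B): uniform random walk until a state repeats.

    Returns (is_accepting, path). Assumes every state has a successor.
    """
    seen = {}   # state -> position on the path (the hash table)
    path = []
    state = rng.choice(init)
    while state not in seen:
        seen[state] = len(path)
        path.append(state)
        state = rng.choice(succ[state])
    loop_start = seen[state]
    path.append(state)  # close the cycle
    return any(s in accepting for s in path[loop_start:]), path


def monte_carlo_verify(succ, init, accepting, epsilon, delta, rng=None):
    """Return (1, error_trace) if an accepting lasso is sampled,
    else (0, None), meaning H0 (p_Z >= epsilon) is rejected."""
    rng = rng or random.Random()
    n = math.ceil(math.log(delta) / math.log1p(-epsilon))
    for _ in range(n):
        is_accepting, path = random_lasso(succ, init, accepting, rng)
        if is_accepting:
            return 1, path
    return 0, None


# Hypothetical two-state automaton: 0 -> 1 -> 1.
graph = {0: [1], 1: [1]}
found = monte_carlo_verify(graph, [0], {1}, 0.1, 0.1, random.Random(0))
clean = monte_carlo_verify(graph, [0], {0}, 0.1, 0.1, random.Random(0))
```

Only the current walk is stored, so memory use is bounded by the length of one lasso rather than by the size of the state space.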
29Correctness of MV
- Theorem: Given a Büchi automaton B, error margin $\epsilon$, and confidence ratio $\delta$, if MV rejects $H_0$, then its type I error has probability
- - $\alpha = P[X > M \mid H_0] < \delta$
30Complexity of MV
- Theorem: Given a Büchi automaton B having diameter D, error margin $\epsilon$, and confidence ratio $\delta$, MV runs in time $O(N \cdot D)$ and uses space $O(D)$, where $N = \ln(\delta) / \ln(1 - \epsilon)$
Cf. DDFS, which runs in $O(2^{|S| + |\varphi|})$ time for $B = B_S \times B_{\neg\varphi}$
31Talk Outline
- Embedded Software Systems
- Automata-Theoretic Verification
- Monte Carlo Verification
- Monte Carlo Model Checking
- Static Verification of Software-Systems
- Dynamic Verification of Software-Systems
32Model Checking [ISOLA04, TACAS05]
- Implemented DDFS and MV in the jMocha model checker for synchronous systems specified using Reactive Modules.
- Performance and scalability of MV compare very favorably to DDFS.
33Dining Philosophers
34DPh Symmetric Unfair Version
(Deadlock freedom)
35DPh Symmetric Unfair Version
(Starvation freedom)
36DPh Asymmetric Fair Version
(Deadlock freedom)
$\delta = 10^{-1}$, $\epsilon = 1.8 \cdot 10^{-3}$, $N = 1278$
37DPh Asymmetric Fair Version
(Starvation freedom)
$\delta = 10^{-1}$, $\epsilon = 1.8 \cdot 10^{-3}$, $N = 1278$
38Related Work
- Random walk testing
- - Heimdahl et al: Lurch debugger
- Random walks to sample system state space
- - Mihail & Papadimitriou (and others)
- Monte Carlo Model Checking of Markov Chains
- - Herault et al: LTL-RP, bounded MC, zero/one ET
- - Younes et al: Time-Bounded CSL, sequential analysis
- - Sen et al: Time-Bounded CSL, zero/one ET
- Probabilistic Model Checking of Markov Chains
- - ETMCC, PRISM, PIOAtool, and others.
39Talk Outline
- Embedded Software Systems
- Automata-Theoretic Verification
- Monte Carlo Verification
- Monte Carlo Model Checking
- Static Verification of Software-Systems
- Dynamic Verification of Software-Systems
40Checking for High-Confidence (in-principle)
[Diagram: BA $B_S$ and LTL-P $\varphi$ enter the Instrumenter (Product), which yields BA $B_S \times B_{\neg\varphi}$; the Execution Engine outputs either "All Lassos Non-accepting" or an Accepting Lasso L]
41Checking for High-Confidence (in-practice)
- Combine static and runtime verification techniques
- - Abstract interpretation (sequential IS programs),
- - Model checking (concurrent FS programs),
- - Runtime analysis (sequential program optimization).
- Make scalability a priority
- - Open source compiler technology started to mature,
- - Apply techniques to source code rather than models,
- - Models can be obtained by abstraction-refinement techniques,
- - Probabilistic techniques trade off precision against effort.
42GCC Compiler
- Early stages: a modest C compiler.
- - Translation: source code translated directly to RTL.
- - Optimization: at low RTL level.
- - High level information lost: calls, structures, fields, etc.
- Nowadays: full-blown, multi-language compiler generating code for more than 30 architectures.
- - Input: C, C++, Objective-C, Fortran, Java and Ada.
- - Tree-SSA: added GENERIC, GIMPLE and SSA ILs.
- - Optimization: at GENERIC, GIMPLE, SSA and RTL levels.
- - Verification: Tree-SSA API suitable for verification, too.
43GCC Compilation Process
44GCC Compilation Process
API Plug-In
45C Program and its GIMPLE IL
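C program:
    int main() {
      int a, b, c;
      a = 5;
      b = a + 10;
      c = a + foo(a, b);
      if (a > c)
        c = b++ / a + b * a;
      bar(a, b, c);
    }
Gimplify
GIMPLE IL:
    int main() {
      int a, b, c;
      int T1, T2, T3, T4;
      a = 5;
      b = a + 10;
      T1 = foo(a, b);
      T2 = a + T1;
      if (a > T2) goto fi;
      T3 = b / a;
      T4 = b * a;
      c = T2 + T3;
      b = b + 1;
    fi:
      bar(a, b, c);
    }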
46Associated GIMPLE CFG
47MC Static Verification of ESS [SOFTMC05, NGS06]
48Monte Carlo Algorithm
- Input: a set of CFGs.
- - Main function: a specifically designated CFG.
- Random walks in the Büchi automaton generated on-the-fly.
- - Initial state: of the main routine + bookkeeping information.
- - Next state: choose a process and call GAM on its CFG.
- - Processes: created by using the fork primitive.
- - Optimization: GAM returns only upon context switch.
- Lassos: detected by using a hierarchic hash table.
- - Local variables: removed upon return from a procedure.
49Program State
- Shared Variables Valuation (channels & semaphores)
- List of Process States: p1, p2, p3
- - Control State: CFG Name, Statement
- - Data State
50Program State
- Shared Variables Valuation (channels & semaphores)
- List of Process States: p1, p2, p3
- - Control State
- - Data State: Heap, Global Variables Valuation, Frame Stack
- - Frame Stack frames f1, f2: Return Control State, Local Variables Valuation
51GIMPLE Abstract Machine (GAM)
- Interprets GIMPLE statements according to their semantics. Interesting:
- - Inter-procedural: call(), return(). Manipulate the frame stack.
- Catches and interprets function calls to various modeling and concurrency primitives:
- - Modeling: toss(), assert(). Nondeterminism and checks.
- - Processes: fork(). Manipulate the process list.
- - Communication: send(), recv(). Manipulate shared vars. May involve a context switch.
52Results TCAS
53DPh Symmetric Fair Version
(Deadlock freedom)
54Needham-Schroeder Protocol
- Quite sophisticated C implementation.
- However, of a sequential nature:
- - Essentially executes only one round of a reactive system
55Related Work
- Software model checkers for concurrent C/C++
- - VeriSoft, Spin, Blast (Slam), Magic, C-Wolf. Bogor?
- Cooperative Bug Isolation: Liblit, Naik & Zheng
- - Compile-time instrumentation. Distribute binaries/collect bugs.
- - Statistical analysis to isolate erroneous code segments.
- Random interpretation: Gulwani & Necula
- - Execute random paths and merge with random linear operators.
- Monte Carlo and abstract interpretation: Monniaux
- - Analyze programs with probabilistic and nondeterministic input.
56Talk Outline
- Embedded Software Systems
- Automata-Theoretic Verification
- Monte Carlo Verification
- Monte Carlo Model Checking
- Static Verification of Software-Systems
- Dynamic Verification of Software-Systems
57MC Runtime Verification of ESS [MBT06, NGS06]
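[Diagram: GCC gimplifies the software system S into CFG $B_S$; Instrument combines it with LTL-P $\varphi$ into CFG $B_S \times B_{\neg\varphi}$, which is passed to the Verifier]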
58Runtime Verification Challenges
- Inserting instrumentation code
- Verifying states and transitions
- Reducing overheads
59Inserting Instrumentation Code
    struct inode *my_inode;
    atomic_t *my_atomic;
    my_atomic = &my_inode->i_count;
    /* instrumentation inserted before the original call */
    if (instrument)
        log_event(ATOMIC_INC, INODE, my_atomic);
    atomic_inc(my_atomic);
60Instrumentation Plug-Ins
- Ref-Counts: detects misuse of reference counts
- - Instruments: inc(rc), dec(rc),
- - Checks: st-inv ($rc \ge 0$), tr-inv ($|rc' - rc| = 1$), leak-inv ($rc > 0 \Rightarrow rc = 0$),
- - Maintains: a list of reference counts and their container type.
- Malloc: detects allocation bugs at runtime
- - Instruments: malloc() and free() function calls,
- - Checks: sequences free()free(), free() and malloc(),
- - Maintains: a list of existing allocations.
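The three reference-count invariants can be illustrated on a recorded trace of counter values. The sketch below is an assumption-laden illustration, not the plug-in itself: the function name, the trace representation, and the reading of the leak invariant as "a positive count must return to zero within `timeout` observations" are all hypothetical.

```python
def check_refcount_trace(values, timeout):
    """Check one reference count's observed values against three invariants.

    st-inv   : the count is never negative
    tr-inv   : consecutive values differ by exactly 1
    leak-inv : a positive count returns to 0 within `timeout` observations
               (assumed interpretation of the leak invariant)
    Returns a list of (index, invariant_name) violations.
    """
    violations = []
    positive_since = None
    for i, rc in enumerate(values):
        if rc < 0:
            violations.append((i, "st-inv"))
        if i > 0 and abs(rc - values[i - 1]) != 1:
            violations.append((i, "tr-inv"))
        if rc > 0:
            if positive_since is None:
                positive_since = i
            elif i - positive_since == timeout:
                violations.append((i, "leak-inv"))  # reported once per episode
        else:
            positive_since = None
    return violations


ok_trace = check_refcount_trace([0, 1, 2, 1, 0], timeout=10)
jump = check_refcount_trace([0, 1, 3], timeout=10)
negative = check_refcount_trace([0, -1], timeout=10)
leak = check_refcount_trace([1, 2, 1, 2, 1], timeout=3)
```

A real checker would run online, one event at a time, but the per-event logic would be the same as the loop body here.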
61Instrumentation Plug-Ins
- Bounds: checks for invalid memory access
- - Instruments: malloc(), free() and f(a),
- - Checks: accesses to non-allocated areas,
- - Maintains: heap, stack and text allocations,
- - Higher accuracy than ElectricFence-like libraries.
62RC Runtime Verification
- Lasso concept weakened (abstracted)
- - Execution where RC varies $0 \to \dots \to 0$
- - State may include FS caches, HW regs, etc.
- Lasso sampling used to reduce overhead
- - Check for acceptance (error)
- - Dynamically adjust sampling rate
63Sampling Granularity
Sample
64State and Transition Invariants
Change >1
Change <1
Value <0
65The Leak Invariant
Timeout
Timeout
66Proof of Concept
- Check Linux file system cache objects
- - inodes: on-disk files
- - dentries: name-space nodes
- - Optionally, log all events
- Simple per-category sampling policy
- - Initially sample all objects
- - Hypothesize err. rate $\epsilon > 10^{-5}$ and conf. ratio $\delta = 10^{-5}$
- - Stop sampling if hypothesis is false.
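One way to read this policy is as a per-category counter that stops sampling once enough violation-free samples have been seen to reject the hypothesized error rate. The sketch below is a hypothetical illustration: the class name, the `record` method, and the decision to restart the count after a violation are assumptions, not details given on the slides.

```python
import math


class CategorySampler:
    """Per-category sampling policy (illustrative sketch)."""

    def __init__(self, epsilon, delta):
        # Number of clean samples needed to reject "error rate >= epsilon"
        # with type I error below delta.
        self.budget = math.ceil(math.log(delta) / math.log1p(-epsilon))
        self.clean = 0
        self.violations = 0
        self.sampling = True   # initially sample all objects

    def record(self, violated):
        """Record the outcome of checking one sampled object."""
        if not self.sampling:
            return
        if violated:
            self.violations += 1
            self.clean = 0     # assumption: restart the test after a violation
        else:
            self.clean += 1
            if self.clean >= self.budget:
                self.sampling = False  # hypothesis rejected: stop sampling


slide_policy = CategorySampler(epsilon=1e-5, delta=1e-5)
small = CategorySampler(epsilon=0.1, delta=0.1)
for _ in range(small.budget):
    small.record(violated=False)
```

For $\epsilon = \delta = 10^{-5}$ the budget is on the order of 1.15 million clean samples, which suggests why sampling, rather than checking every object forever, matters for overhead.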
67Benchmarks
- Directory traversal benchmark
- Create a directory tree (depth 5, degree 6)
- Traverse the tree
- Recursively delete the tree
- Also tested GNU tar compilation
- Testbed
- 1.7 GHz Pentium 4 (256 KB cache)
- 1 GB RAM
- Linux 2.6.10
68Results
[Chart: Logging; labeled values 10x, 3x, 1.33x]
69Results
[Chart: Checking; labeled values 2x, 1.1x, 1.33x]
70Sampling-Policy Automata
- Specify how to respond to events
- - Violating trajectories
- - Invalidations of violation rate estimates
- Control trajectory sampling rate
- A simple SPA
$\epsilon > p_Z$
cs n1
cs n
71Related Work: SWAT
- Chilimbi & Hauswirth
- - Low-Overhead Memory Leak Detection Using Adaptive Statistical Profiling
- Instrument heap accesses
- - Block-level dynamic instrumentation
- - Reduce instrumentation based on number of times a block has been hit
- No formal measure of confidence provided
72Conclusions
- GSRV is a novel tool suite for randomized static and runtime verification of ESS (growing)
- General purpose tools (plug-ins)
- - Code instrumenter: constructs the product BA
- - Intra/inter-procedural slicer: in progress
- Static verification tools (plug-ins)
- - GAM: CFG-GIMPLE abstract machine
- - Monte Carlo MC: statistical algorithm for LTL-MC
- Runtime verification tools (static libraries)
- - Dispatcher: catches and dispatches events to RV
- - Monte Carlo RV: statistical algorithm for LTL-RV