Title: Going from Concrete to Symbolic Model Checking via Predicate Abstraction
1. Going from Concrete to Symbolic Model Checking via Predicate Abstraction
- Willem Visser
- Corina Pasareanu and Radek Pelanek
- Automated Software Engineering Group, NASA Ames Research Center
2. Overview
- Abstraction
- Classic over-approximation based
- Counter-example based refinement
- Under-approximation based
- Refinement based on the abstraction's exactness
- Lightweight framework for testing
- Test generation environment built around JPF with symbolic execution
- Measure predicate coverage
- Evaluate against other test-case generation methods
- Java container classes
3. Predicate Abstraction
4. Abstraction Mapping
For a, a' in 2^preds: if wp(a', T) /\ a is satisfiable, add transition a -> a'
- may transition: wp(a', T) /\ a is satisfiable
- must transition: a => wp(a', T)
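5. Example Abstraction
With p = (x > 0) and T: x := x - 1
x - 1 > 0 --[x := x - 1]--> p
x - 1 <= 0 --[x := x - 1]--> !p
wp(!p, x := x - 1) /\ p is satisfiable: add p -> !p
wp(p, x := x - 1) /\ p is satisfiable: add p -> p
wp(!p, x := x - 1) /\ !p is satisfiable: add !p -> !p
wp(p, x := x - 1) /\ !p is unsatisfiable: no transition !p -> p
!p => wp(!p, x := x - 1), so !p -> !p is a must transition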
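The may/must distinction in this example can be checked mechanically. The following is a minimal sketch, assuming p = (x > 0) and the transition x := x - 1; the bounded `DOMAIN` and the helper names `may` and `must` are illustrative assumptions standing in for a real decision procedure, not the implementation described in the talk.

```python
from itertools import product

DOMAIN = range(-5, 6)   # assumed finite stand-in for the integers


def p(x):
    """The predicate p = (x > 0)."""
    return x > 0


def step(x):
    """The transition T: x := x - 1."""
    return x - 1


def may(a, b):
    """Some concrete state with p == a steps to a state with p == b."""
    return any(p(x) == a and p(step(x)) == b for x in DOMAIN)


def must(a, b):
    """Every concrete state with p == a steps to a state with p == b."""
    sources = [x for x in DOMAIN if p(x) == a]
    return bool(sources) and all(p(step(x)) == b for x in sources)


pairs = list(product([True, False], repeat=2))
may_edges = {(a, b) for a, b in pairs if may(a, b)}
must_edges = {(a, b) for a, b in pairs if must(a, b)}
```

Under these assumptions the may edges are p -> p, p -> !p and !p -> !p, and only !p -> !p is a must edge, which agrees with the slide.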
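6. Refinement
Abstract program:
1: p = T;
2: while (p)
3:   p = !p ? F : (T or F);
4: assert false;
Infeasible counter-example: 1, 2, 3(F), 2, 4
Concrete replay: 1: x = 2, x > 0; 2: x = 2, x > 0; 3: x = 1, x <= 0
New predicate: x > 1 --[x := x - 1]--> x > 0
[Figure labels: must, may]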
7. Let's Go Outside the Box
- Rather than over-approximate and refine, we under-approximate and refine
- Clearly complements existing techniques
- If we restrict ourselves only to feasible behaviors when under-approximating then all safety property violations will be preserved
- Build on top of classic explicit-state model checking infrastructure
8. Classic Explicit-State Search
PROCEDURE dfs()
  s = top(Stack)
  FOR all transitions t enabled in s DO
    s' = successor(s) after executing t
    IF s' NOT IN VisitedStates THEN
      Enter s' into VisitedStates
      Push s' onto Stack
      dfs()
    END
  END
  Pop s from Stack

INIT
  Enter s0 into VisitedStates
  Push s0 onto Stack
  dfs()
9. Explicit-State (1-step) αSearch
PROCEDURE dfs()
  s = top(Stack)
  FOR all transitions t enabled in s DO
    s' = successor(s) after executing t
    IF α(s') NOT IN VisitedStates THEN
      Enter α(s') into VisitedStates
      Push s' onto Stack
      dfs()
    END
  END
  Pop s from Stack

INIT
  Enter α(s0) into VisitedStates
  Push s0 onto Stack
  dfs()
10. αSearch
Map concrete states to abstract states for state storing
1: x = 2;
2: while (x > 0)
3:   x = x - 1;
4: assert false;
Abstraction mapping: p = (x > 0)
Under-approximation of the behaviors
Always traverse only feasible paths
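To make the under-approximation concrete, here is a minimal sketch of the search on the four-line example, assuming the loop guard is x > 0 and the abstraction is p = (x > 0). The state encoding `(pc, x)`, the initial value x = 0 before line 1, and the iterative stack are assumptions made for illustration.

```python
def successors(state):
    """Concrete semantics of the four-line example; state = (pc, x)."""
    pc, x = state
    if pc == 1:                       # 1: x = 2
        return [(2, 2)]
    if pc == 2:                       # 2: while (x > 0)
        return [(3, x)] if x > 0 else [(4, x)]
    if pc == 3:                       # 3: x = x - 1
        return [(2, x - 1)]
    return []                         # 4: assert false has no successors


def alpha_search(initial, alpha):
    """DFS that executes concrete states but stores alpha(s') for matching.

    Returns the set of program locations that were reached.
    """
    visited = {alpha(initial)}
    stack = [initial]
    reached = {initial[0]}
    while stack:
        state = stack.pop()
        for nxt in successors(state):
            if alpha(nxt) not in visited:
                visited.add(alpha(nxt))
                stack.append(nxt)
                reached.add(nxt[0])
    return reached


# Identity mapping gives the classic explicit-state search.
concrete = alpha_search((1, 0), lambda s: s)
# Matching on (location, p) with p = (x > 0) prunes the second loop iteration.
abstract = alpha_search((1, 0), lambda s: (s[0], s[1] > 0))
```

With the identity mapping the search reaches line 4. With p = (x > 0) the state (2, x = 1) matches the stored abstract state for (2, x = 2), so the search stops before the assertion. Every path it does explore is feasible, but the abstraction is too coarse to expose the violation without refinement.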
11. Concrete, May and Must
Concrete
12. Concrete αSearch
Abstraction search with p = (x < 2)
[Figure: states A,0  C,0  B,1  D,1  D,0  E,2]
13. Refinement αSearch
[Figure: states A,0  B,1  C,0  D,1  D,0  E,1  E,2]
14. Example
1: x = 2;
2: while (x > 0)
3:   x = x - 1;
4: assert false;
Abstraction mapping: p = (x > 0)
15. Refinement
Check if the induced abstract transition is a must transition. If not, add new predicates
- Add x > 1 to abstraction predicates and repeat search
- Globally for all transitions
- Locally only for the transition (location) it refines
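The refinement loop can be sketched for the running example (x = 2; while (x > 0) x = x - 1; assert false). This is a hedged illustration of the global variant only: predicates are restricted to the form x > c, the must check is brute-forced over an assumed bounded `DOMAIN`, and the names `is_must`, `search` and `refine` are hypothetical. The one fact it relies on is that wp(x > c, x := x - 1) is x > c + 1.

```python
DOMAIN = range(-3, 6)   # assumed bounded stand-in for a decision procedure


def successors(state):
    """Concrete semantics of the running example; state = (pc, x)."""
    pc, x = state
    if pc == 1:
        return [(2, 2)]
    if pc == 2:
        return [(3, x)] if x > 0 else [(4, x)]
    if pc == 3:
        return [(2, x - 1)]
    return []


def alpha(state, thresholds):
    """Abstract state: location plus the truth value of each predicate x > c."""
    pc, x = state
    return (pc, tuple(x > c for c in thresholds))


def is_must(state, nxt, thresholds):
    """Do all concrete states matching alpha(state) step into alpha(nxt)?"""
    source, target = alpha(state, thresholds), alpha(nxt, thresholds)
    same = [(state[0], x) for x in DOMAIN
            if alpha((state[0], x), thresholds) == source]
    return all(alpha(t, thresholds) == target
               for s in same for t in successors(s))


def search(thresholds):
    """One search pass; returns (locations reached, whether all steps were must)."""
    init = (1, 0)
    visited, stack, reached, exact = {alpha(init, thresholds)}, [init], {1}, True
    while stack:
        state = stack.pop()
        for nxt in successors(state):
            exact = exact and is_must(state, nxt, thresholds)
            if alpha(nxt, thresholds) not in visited:
                visited.add(alpha(nxt, thresholds))
                stack.append(nxt)
                reached.add(nxt[0])
    return reached, exact


def refine(thresholds, max_rounds=10):
    """Repeat the search, adding wp predicates until exact or line 4 is hit."""
    for _ in range(max_rounds):
        reached, exact = search(thresholds)
        if exact or 4 in reached:
            break
        # wp(x > c, x := x - 1) is x > c + 1: add it for every predicate.
        thresholds = sorted(set(thresholds) | {c + 1 for c in thresholds})
    return reached, thresholds


reached, final_thresholds = refine([0])
```

Starting from the single predicate x > 0, one refinement round adds x > 1, and the repeated search then reaches the assertion at line 4. A local variant would attach the new predicate only to the location whose transition was inexact.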
16. Predicate Abstraction
Predicate abstraction
- Showing property holds
- Over-approximation based
- Counter-example driven refinement
- Expensive computation to calculate abstraction
αSearch
- Finding defects
- Under-approximation based
- Abstraction driven refinement
- Trivial computation to calculate abstraction mapping
17. Issue
[Figure: reachable and unreachable regions of the state space, with transition T from wp(p, T) into p]
If new predicates are infinitely required to refine the unreachable area, the algorithm will not terminate
18. Example
x = 0; y = 0; while (y >= 0) y = x + y;
The refinement only refines the unreachable state space!
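The non-termination can be seen by iterating the weakest precondition. The sketch below assumes the loop reads `while (y >= 0) y = x + y` with x = 0 and y = 0 initially, which is a reading of the garbled slide text. A predicate a*x + b*y >= 0 is encoded as the pair `(a, b)`; that encoding and the function names are illustrative assumptions.

```python
def wp_assign_y(pred):
    """wp of (a*x + b*y >= 0) under y := x + y, with pred encoded as (a, b).

    Substituting x + y for y gives a*x + b*(x + y) = (a + b)*x + b*y.
    """
    a, b = pred
    return (a + b, b)


def refinement_chain(start, rounds):
    """Predicates that successive refinement rounds would add."""
    chain = [start]
    for _ in range(rounds):
        chain.append(wp_assign_y(chain[-1]))
    return chain


# Start from the loop guard y >= 0, encoded as (0, 1).
chain = refinement_chain((0, 1), 4)

# At the only reachable data state (x = 0, y = 0) every predicate holds,
# so none of the new predicates distinguishes any reachable behavior.
all_true_when_reachable = all(a * 0 + b * 0 >= 0 for a, b in chain)
```

Each round yields a new predicate (y >= 0, x + y >= 0, 2x + y >= 0, and so on), yet all of them agree on the reachable states, so under these assumptions the refinement never stabilizes.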
19. Modified Bakery
while true: x = y; x = x + 1; wait (x < y); x = 0
while true: y = x; y = y + 1; wait (y < x); y = 0
Search Order Matters!!
20. Symbolic Execution and αSearch
- Current implementation is for a simple input language
- OCaml using Simplify as a decision procedure
- We would like to integrate the technique in Java Pathfinder (JPF) that supports symbolic execution (using the Omega Library)
- To allow application to programs with complex data structures (objects)
21. From Concrete to Symbolic
Concrete Behavior: X = 1, Y = 0
Symbolic Behavior: X > Y
22. Possible Approach
- Execute the concrete program on valid inputs
- Collect all predicates in path condition
- Solve constraints over all combinations of these predicates
- Use results as inputs for step 1
- When no new predicates are found, or if an error is found, terminate
23. Example
method(1,1): true
public static void method(int x, int y) {
  if ((x > 0) && (y < 10)) { if (y < 5) ... else ... }
  else { if (x > 0) ... else ... }
}
method(1,1): p1, p2
24. Example (2)
method(1,6): p1, !p2
public static void method(int x, int y) {
  if ((x > 0) && (y < 10)) { if (y < 5) ... else ... }
  else { if (x > 0) ... else ... }
}
Path: (x > 0 && y < 10), (y < 5), end
method(1,6): p1, !p2
25. Example (4)
method(-1,1): !p1, p2
public static void method(int x, int y) {
  if ((x > 0) && (y < 10)) { if (y < 5) ... else ... }
  else { if (x > 0) ... else ... }
}
Path: (x > 0 && y < 10), (x > 0), end
Solve constraints: !p1, p3 -> method(1,11)
method(-1,1): !p1, p2, !p3
26. Example (3)
method(1,11): !p1, p3
public static void method(int x, int y) {
  if ((x > 0) && (y < 10)) { if (y < 5) ... else ... }
  else { if (x > 0) ... else ... }
}
Path: (x > 0 && y < 10), (x > 0), end
method(1,11): !p1, p3
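The loop behind these examples (run concretely, collect predicates, solve combinations, feed the results back in) can be sketched as follows. It assumes p1 = (x > 0 && y < 10), p2 = (y < 5) and p3 = (x > 0). The branch labels "A" to "D" are hypothetical placeholders for the elided method bodies, and `solve` is a brute-force stand-in for a constraint solver over a small assumed integer box.

```python
from itertools import product

# Branch predicates of the example method, named as on the slides.
PREDICATES = {
    "p1": lambda x, y: x > 0 and y < 10,
    "p2": lambda x, y: y < 5,
    "p3": lambda x, y: x > 0,
}


def method(x, y, trace):
    """Instrumented copy of method(x, y); appends each predicate it evaluates."""
    def test(name):
        trace.append(name)
        return PREDICATES[name](x, y)

    if test("p1"):
        return "A" if test("p2") else "B"
    return "C" if test("p3") else "D"


def solve(assignment, domain=range(-2, 13)):
    """Find an input satisfying the given truth values, or None if there is none."""
    for x, y in product(domain, repeat=2):
        if all(PREDICATES[n](x, y) == v for n, v in assignment.items()):
            return (x, y)
    return None


def explore(seed):
    """Iterate until no new predicates or inputs turn up; returns input -> branch."""
    known, pending, outcomes, tried = [], [seed], {}, set()
    while pending:
        for inp in pending:                       # execute on concrete inputs
            trace = []
            outcomes[inp] = method(*inp, trace)
            known += [p for p in trace if p not in known]   # collect predicates
        pending = []
        for values in product([True, False], repeat=len(known)):
            combo = tuple(zip(known, values))
            if combo in tried:
                continue
            tried.add(combo)
            model = solve(dict(combo))            # solve this combination
            if model is not None and model not in outcomes:
                pending.append(model)             # feed back as a new input
    return outcomes


outcomes = explore((1, 1))
```

Starting from method(1,1), the first run only reveals p1 and p2. An input that falsifies p1 then reveals p3, and solving !p1 with p3 produces an input with x > 0 and y >= 10, the role method(1,11) plays on the slides. The brute-force solver may pick different concrete values than the slides do.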
27. End of Part One
- Showed under-approximation based search with refinement
- Backward weakest precondition based
- Forward symbolic execution based
- Part Two
- Rather than automated refinement we use user-provided abstractions
- Motivation is to generate test-cases to achieve high behavioral coverage for Java container classes
28. Explicit-State (1-step) αSearch
PROCEDURE dfs()
  s = top(Stack)
  FOR all transitions t enabled in s DO
    s' = successor(s) after executing t
    IF α(s') NOT IN VisitedStates THEN
      Enter α(s') into VisitedStates
      Push s' onto Stack
      dfs()
    END
  END
  Pop s from Stack

INIT
  Enter α(s0) into VisitedStates
  Push s0 onto Stack
  dfs()
29. General Idea
SUT
ENV(m,n): m is the sequence length of API calls, n is the number of values used in the parameters of the calls
API: put(v), del(v)
Evaluate different techniques for selecting test-cases from ENV(m,n) to obtain maximum coverage
30. Predicate Coverage
Cover all combinations of a given set of predicates at each branch in the code
Red-Black Tree Predicates: root == null, e.left == null, e.right == null, e.parent == null, e.color == BLACK
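A coverage manager for this criterion only needs to remember which predicate valuations have been seen at which branch. The sketch below is illustrative: the `Entry` class, the `CoverageManager` name and the branch number 15 are assumptions, and only the five predicates themselves come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Entry:
    """Hypothetical red-black tree node with the fields the predicates mention."""
    color: str = "BLACK"
    left: Optional["Entry"] = None
    right: Optional["Entry"] = None
    parent: Optional["Entry"] = None


def valuation(root, e):
    """Truth values of the five red-black tree predicates for entry e."""
    return (root is None, e.left is None, e.right is None,
            e.parent is None, e.color == "BLACK")


class CoverageManager:
    """Records (branch, predicate valuation) pairs; a new pair is new coverage."""

    def __init__(self):
        self.covered = set()

    def record(self, branch, values):
        item = (branch, values)
        is_new = item not in self.covered
        self.covered.add(item)
        return is_new


manager = CoverageManager()
root = Entry()
child = Entry(color="RED", parent=root)
root.left = child

first = manager.record(15, valuation(root, child))   # new combination at branch 15
again = manager.record(15, valuation(root, child))   # same combination again
other = manager.record(15, valuation(root, root))    # different valuation, same branch
```

A test case is worth keeping when `record` returns True for at least one branch it executes.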
31. Techniques Considered
- Random selection
- Classic model checking
- State matching on complete state
- Abstraction search
- State matching on abstract (partial) state
- Symbolic Execution
- Complete matching using subsumption checks
- Abstract matching
32. Framework
SUT with minor instrumentation
ENV
TestListener
Abstraction Mapping State Storage
Coverage Manager
JPF
33. Sample Output
Test case number 77 for '15,LRP-REDroot'
put(0)put(4)put(5)put(1)put(2)put(3)remove(4)
- Unique ID for the test: 77
- Branch number: 15
- Predicate values: LRP-REDroot
- Test-case to achieve above coverage: the put/remove sequence
34. Environment Skeleton
M: sequence length
N: parameter values
A: abstraction used
for (int i = 0; i < M; i++) {
  int x = Verify.random(N - 1);
  switch (Verify.random(1)) {
    case 0: put(x); break;
    case 1: remove(x); break;
  }
  Verify.ignoreIf(checkAbstractState(A));
}
35. Symbolic Environment Skeleton
M: sequence length
A: abstraction used
for (int i = 0; i < M; i++) {
  SymbolicInteger x = new SymbolicInteger("X" + i);
  switch (Verify.random(1)) {
    case 0: put(x); break;
    case 1: remove(x); break;
  }
  Verify.ignoreIf(checkAbstractState(A));
}
36. Abstraction Search
- Map state to an abstract version and backtrack if the abstract state was seen before, i.e. discard test-case
- Mapping can be lossy or not
- Abstraction mappings can be created by the user/tester
- Default abstraction mappings are provided
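The effect of matching on abstract states can be sketched outside JPF. The example below enumerates put/remove sequences over a small immutable binary search tree and discards a sequence as soon as its abstract state has been seen. The tree representation, the `shape` abstraction and `generate_tests` are assumptions chosen for illustration. It is not the red-black tree used in the evaluation, nor JPF's `Verify` API.

```python
from collections import deque


def put(tree, key):
    """Insert key into an immutable BST encoded as (key, left, right) or None."""
    if tree is None:
        return (key, None, None)
    k, left, right = tree
    if key < k:
        return (k, put(left, key), right)
    if key > k:
        return (k, left, put(right, key))
    return tree


def remove(tree, key):
    """Delete key from the BST, if present."""
    if tree is None:
        return None
    k, left, right = tree
    if key < k:
        return (k, remove(left, key), right)
    if key > k:
        return (k, left, remove(right, key))
    if left is None:
        return right
    if right is None:
        return left
    succ = right
    while succ[1] is not None:        # leftmost node of the right subtree
        succ = succ[1]
    return (succ[0], left, remove(right, succ[0]))


def shape(tree):
    """Lossy abstraction: keep the structure, drop the keys."""
    return None if tree is None else (shape(tree[1]), shape(tree[2]))


def generate_tests(m, n, alpha):
    """BFS over call sequences of length <= m with n values, matching on alpha."""
    seen = {alpha(None)}
    queue = deque([((), None)])
    tests = []
    while queue:
        calls, tree = queue.popleft()
        if len(calls) == m:
            continue
        for op, name in ((put, "put"), (remove, "remove")):
            for v in range(n):
                nxt = op(tree, v)
                if alpha(nxt) in seen:
                    continue          # abstract state seen before: discard
                seen.add(alpha(nxt))
                tests.append(calls + (f"{name}({v})",))
                queue.append((tests[-1], nxt))
    return tests


complete = generate_tests(3, 3, lambda t: t)   # state matching on complete state
by_shape = generate_tests(3, 3, shape)         # state matching on shape only
```

Matching on shape keeps far fewer sequences than matching on the complete state, which is the trade-off a lossy mapping makes: fewer test cases, at the risk of discarding one that would have added coverage.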
37. Default Mappings
- Structure of the heap of the program
- e.g. structure of the containers
- Structure augmented with non-data fields
- Structure augmented with symbolic constraints on the data in the structure
- This requires checking constraint subsumption
38. Linearization: Comparing Structures
[Figure: two trees with the same shape]
1 2 3 -1 -1 4 -1 -1 5 -1 -1
1 2 3 -1 -1 4 -1 -1 5 -1 -1
[Figure: two trees with different shapes]
1 2 3 -1 -1 4 -1 -1 5 -1 -1
1 2 3 -1 -1 4 -1 5 -1 -1 -1
39. Linearization Mapping
[Figure: two trees with the same shape but different node colors]
1b 2b 3r -1 -1 4r -1 -1 5b -1 -1
1b 2r 3r -1 -1 4r -1 -1 5r -1 -1
Linearization takes a mapping object as a parameter
to indicate how each node in the heap should be
linearized. In the example above each node gets,
besides the unique identifier, a mapping of "r"
if the original structure had a red node and "b"
if the original structure had a black node in
that position. If we also added the key values
for each node, the linearization might have
looked something like:
1b6 2b4 3r3 -1 -1 4r5 -1 -1 5b7 -1 -1
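A linearization of this kind can be sketched as a preorder walk that numbers nodes in visit order and writes -1 for a null child. The preorder convention is inferred from the strings on the slides. The `Node` class and the `linearize` signature are assumptions, not JPF's implementation.

```python
class Node:
    """Hypothetical tree node carrying a key and a color ('r' or 'b')."""

    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color
        self.left = left
        self.right = right


def linearize(root, mapping=lambda node: ""):
    """Preorder walk: emit a fresh id plus mapping(node) per node, -1 for null."""
    out = []
    counter = 0

    def visit(node):
        nonlocal counter
        if node is None:
            out.append("-1")
            return
        counter += 1
        out.append(f"{counter}{mapping(node)}")
        visit(node.left)
        visit(node.right)

    visit(root)
    return " ".join(out)


tree = Node(6, "b", Node(4, "b", Node(3, "r"), Node(5, "r")), Node(7, "b"))

shape_only = linearize(tree)
with_color = linearize(tree, lambda n: n.color)
with_color_and_key = linearize(tree, lambda n: f"{n.color}{n.key}")
```

Two heap structures are then considered equal under a given mapping exactly when their linearizations are equal strings, so state matching reduces to a set lookup.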
40. Symbolic Execution
Symbolic State
- Shape: [Figure: tree over nodes x1, x2, x5, x3, x4]
- Symbolic Constraints: x1 > x2 /\ x2 > x3 /\ x2 < x4 /\ x5 > x1
41. Subsumption Checking
[Figure: two symbolic states with the same shape over nodes x1..x5]
x1 > x2 /\ x2 > x3 /\ x2 < x4 /\ x5 > x1
x1 > x2 /\ x2 > x3 /\ x2 < x4 /\ x5 > x1
If only it was this simple!
42. Getting Ready for Checking: Existential Elimination
[Figure: shape with nodes x1..x5 holding the symbolic values s1, s4, s3, s5, s2]
PC: s1 < s2 /\ s4 > s3 /\ s4 < s1 /\ s4 < s5 /\ s7 < s2 /\ s7 > s1
Exists s1, s2, s3, s4, s5 such that x1 = s1 /\ x2 = s4 /\ x3 = s3 /\ x4 = s5 /\ x5 = s2 /\ PC
x1 > x2 /\ x2 > x3 /\ x2 < x4 /\ x5 > x1
43. Checking Subsumption
new state: C1 /\ C2 /\ ..
old state: D1 /\ D2 /\ ..
Check new => old
Oops: negation and "or" don't work in our version of the Omega Lib.
!(new => old) is unsatisfiable
!(!new \/ old)
new /\ !old
C1 /\ C2 /\ .. /\ (!D1 \/ !D2 \/ ..)
(C1 /\ C2 /\ .. /\ !D1) \/ (C1 /\ C2 /\ .. /\ !D2) \/ ..
44. Bidirectional Subsumption Checking
- If new => old
- backtrack
- If old => new
- new is more general than old
- replace old with new
- to increase chances of getting a match in the future
- Continue on path from new, i.e. don't backtrack
- Ultimately for each shape we want to use disjunction of constraints
- Small technicality prevents us: bug in Omega lib
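The subsumption check and the bidirectional update can be sketched together. To avoid a top-level "or", the implication new => old is decided with one satisfiability query per conjunct of old, as in the expansion on the previous slide. The brute-force `satisfiable` over an assumed small `DOMAIN` stands in for the Omega library, and the store keyed by shape is an illustrative assumption.

```python
from itertools import product

VARS = ("x1", "x2", "x3")
DOMAIN = range(0, 4)     # assumed small box; a real solver reasons over all integers


def satisfiable(constraints):
    """Brute-force check that some assignment satisfies every constraint."""
    return any(all(c(dict(zip(VARS, vals))) for c in constraints)
               for vals in product(DOMAIN, repeat=len(VARS)))


def implies(new, old):
    """new => old holds iff (new /\\ !D) is unsatisfiable for every conjunct D of old."""
    return not any(satisfiable(new + [lambda env, d=d: not d(env)]) for d in old)


def update_store(store, shape, new):
    """Bidirectional check for one shape; returns True if the search should backtrack."""
    old = store.get(shape)
    if old is not None and implies(new, old):
        return True              # new is subsumed by old: backtrack
    if old is None or implies(old, new):
        store[shape] = new       # first visit, or new is more general: keep new
    return False                 # continue on the path from new


chain = [lambda e: e["x1"] > e["x2"], lambda e: e["x2"] > e["x3"]]
weak = [lambda e: e["x1"] > e["x3"]]

store = {}
first_visit = update_store(store, "shape-1", chain)   # nothing stored yet
revisit = update_store(store, "shape-1", chain)       # same constraints again
generalize = update_store(store, "shape-1", weak)     # weaker constraints arrive
```

When neither direction holds, this sketch keeps the old constraints and continues; storing a disjunction per shape, as the slide suggests, would avoid losing that information.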
45. Evaluation
- Red-Black Trees
- Out of Memory runs are not reported
- Breadth-first Search unless stated
- Sequence Length = Values for the non-symbolic searches
- First compare under Branch Coverage
46. Exhaustive Techniques: Branch Coverage
Technique   Seq   Cov   Len   Time (s)   Mem (Mb)
Full MC     7     39    4.3   536        584
SCV         7     39    4.3   10.635     17.47
Sym SSub    7     39    4.3   14.201     16.95
Optimal Branch Coverage is 39
47. Under-Approximation Techniques: Branch Coverage
Technique   Seq   Cov   Len   Time (s)   Mem (Mb)
S           21    39    6.1   57.353     72.07
SC          18    39    5.8   32.577     21.16
Sym - S     7     39    4.3   10.054     15.43
Sym SC      7     39    4.3   11.998     10.76
Random      9     39    7     40.429     3.06
Optimal Branch Coverage is 39
48. Exhaustive Techniques: Predicate Coverage
Technique   Seq   Cov   Len   Time (s)   Mem (Mb)
Full MC     7     79    5.2   543        309
SCV         10    95    5.7   350        228
Sym SSub    11    102   6.1   222        117
Optimal Predicate Coverage is 106
49. Under-Approximation Techniques: Predicate Coverage
Technique   Seq   Cov   Len    Time (s)   Mem (Mb)
S           25d   106   21.7   90         13.31
SC          30    106   8.3    354        100
Sym - S     12    100   6.1    230        123.27
Sym SC      12    104   6.2    356        138
Random      60    106   30.1   61.459     7.74
Optimal Predicate Coverage is 106
50. Observations
- For a simple coverage such as branch coverage, all the techniques work well, including the exhaustive ones
- But making the coverage more behavioral, even by a small increment, kills off the exhaustive techniques
51. Observations
- Full Blown Model Checking doesn't work here
- Its close cousin, that only looks at the relevant state at the relevant time, scales much better
- Branch: full coverage after
- MC 536s 584Mb
- Complete 10s 17Mb
- Predicate: best coverage after
- MC 79 covered with 543s 309Mb
- Complete 95 covered with 350s 228Mb
52. Observations
- Symbolic techniques have a slight edge over concrete ones for exhaustive analysis
- Comparing for Predicate Coverage (sequence length 10)
- Full Concrete(95) 350s 228Mb
- Full Symbolic(95) 123s 62Mb
- Current results indicate symbolic under-approximation based search is less efficient than concrete
- Further experimentation required
53. Observations
- Random Search?
- Seems to work rather well here
- It will always have an edge on memory, since it uses almost none
- It will most likely have an edge on speed, since it needs to do little additional work; it will, however, redo work often
- It will in general do worse on test-case length, since it requires longer sequences to achieve more complex coverage
54. Observations
- Search Order Matters for the lossy techniques
- BFS is inherently better than DFS
- On occasion though it is the other way round
55. Conclusions and Future Work
- Showed how predicate abstraction can be used for an under-approximation based search with refinement
- Showed how a lightweight variant, where the abstraction mapping is given and no refinement is done, can be used for bug-finding and test-case generation
- Goal: derive predicates for analyzing containers automatically through the use of symbolic execution during refinement
- Can we derive shape predicates automatically?