Title: Putting Static Analysis to Work for Verification: A Case Study
1Putting Static Analysis to Work for Verification: A Case Study
- Tal Lev-Ami
- Thomas Reps
- Mooly Sagiv
- Reinhard Wilhelm
2Program Verification
- Mathematically prove that the program is partially correct on all inputs
- Example: Hoare-style verification
{x ≥ n} x = x + 1 {x ≥ n + 1}
3Why Use Program Verification?
- Debugging programs is hard
- Testing can only show the presence of errors, not their absence
- Can provide counterexamples
- ...
4Obstacles to Program Verification
- Hard to specify software
- Does not scale
- Limited program size
- Programmer needs to provide loop invariants
- Pointers and dynamically allocated objects are
not handled
5Our Goals
- Handle pointers and dynamically allocated objects (unbounded memory and/or multi-threading)
- No loop invariants
- Input: {pre} Procedure {post}
- Output
- A safe approximation to the strongest postcondition p
- Issue a warning if p does not imply post (p ⇏ post)
- Conservative
- Never misses an error
- May yield false warnings
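6
#include <stddef.h>  /* NULL */

typedef struct node {
    int data;
    struct node *n;
} *L;

/* precondition: list(x) */
L insert_sort(L x) {
    L r, pr, rn, l, pl;
    r = x;
    pr = NULL;
    while (r != NULL) {          /* r: next cell to insert */
        l = x;
        rn = r->n;
        pl = NULL;
        while (l != r) {         /* scan the sorted prefix */
            if (l->data > r->data) {
                pr->n = rn;      /* unlink r */
                r->n = l;        /* relink r before l */
                if (pl == NULL)
                    x = r;
                else
                    pl->n = r;
                r = pr;
                break;
            }
            pl = l;
            l = l->n;
        }
        pr = r;
        r = rn;
    }
    return x;
}
/* postcondition: olist(x) */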
7
int main() {
    L x, y, z, w;
    L create(), insert_sort(L);
    L merge(L, L), reverse(L);
    x = create();          /* list(x) */
    x = insert_sort(x);    /* olist(x) */
    y = create();          /* list(y) */
    y = insert_sort(y);    /* olist(y) */
    z = merge(x, y);       /* olist(z) */
    w = reverse(z);        /* rolist(w) */
    return 0;
}
8Conventional Verification vs. Our Approach
Conventional Verification
- Formulae over program variables express pre- and post-conditions
- The assignment rule is used to generate the strongest postcondition for non-destructive updates
- Programmer provides loop invariants
Our Approach
- A finite set of descriptors expresses pre- and post-conditions
- Predicate-update formulae specify a safe set of descriptors (abstract semantics)
- Iteratively explore all the descriptors at every program point (abstract interpretation)
- The ADT designer can provide domain-specific information via instrumentation
9Outline of the Rest of this Talk
- Concentrate on sorting
- Descriptors: compact representation of stores
- State-space exploration via abstract interpretation
- Prototype implementation in TVLA (Three-Valued Logic Analyzer)
- Conclusions
10Logical representation of stores
[Figure: a store drawn as a graph; labels: x, data (three cells), n (three fields)]
11Three-Valued Logic
- 1 - True
- 0 - False
- ½ = {1, 0} - Unknown
- A join semi-lattice
- 0 ⊔ 1 = ½
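The join on this slide can be sketched in a few lines of C. This is an illustrative encoding only: the names `tv`, `TV_0`, `TV_HALF`, `TV_1` and `tv_join` are assumptions made for the example and are not taken from TVLA. Values are stored doubled (0, 1, 2 for 0, ½, 1) so they fit in an enum.

```c
#include <assert.h>
#include <stdbool.h>

/* Three truth values, stored doubled: 0 -> 0, 1/2 -> 1, 1 -> 2.
   Names are illustrative, not TVLA identifiers. */
typedef enum { TV_0 = 0, TV_HALF = 1, TV_1 = 2 } tv;

/* Join in the information order: agreeing values are kept,
   disagreeing values collapse to 1/2 (unknown). */
tv tv_join(tv a, tv b) {
    assert(a <= TV_1 && b <= TV_1);
    return (a == b) ? a : TV_HALF;
}

/* A value is definite when it is 0 or 1. */
bool tv_is_definite(tv a) {
    return a != TV_HALF;
}
```

With this encoding, `tv_join(TV_0, TV_1)` yields `TV_HALF`, and joining anything with `TV_HALF` stays `TV_HALF`.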
12Blurred Representation of Stores
[Figure: the store drawn as a logical structure; node labels: px=1, px=0, px=0; edge labels: n, dle]
13Parametric Abstraction (Blur)
- Merge all the nodes with the same unary abstraction predicate values into a single summary node
- Join predicate values
- Convert a structure of arbitrary size into a 3-valued structure of bounded size
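The three steps of blur can be sketched in C as follows. This is a minimal sketch under stated assumptions, not TVLA's implementation: the types `Concrete` and `Abstract`, the bounds `MAXN` and `NPRED`, and the function `blur` are hypothetical names, unary predicate values are packed into a bitmask, and only one binary predicate `n` is handled.

```c
#include <assert.h>
#include <string.h>

#define MAXN 8              /* max concrete nodes (illustrative bound) */
#define NPRED 2             /* number of unary abstraction predicates */
#define MAXA (1 << NPRED)   /* at most 2^NPRED abstract nodes */

typedef enum { TV_0 = 0, TV_HALF = 1, TV_1 = 2 } tv;  /* 0, 1/2, 1 */

typedef struct {
    int size;
    unsigned unary[MAXN];   /* bit i set: unary predicate i holds */
    int n[MAXN][MAXN];      /* binary predicate n(u, v): 0 or 1 */
} Concrete;

typedef struct {
    int size;
    unsigned unary[MAXA];   /* unary values shared by the merged nodes */
    int summary[MAXA];      /* 1 if the node stands for several cells */
    tv n[MAXA][MAXA];
} Abstract;

static tv tv_join(tv a, tv b) { return a == b ? a : TV_HALF; }

void blur(const Concrete *c, Abstract *a) {
    int rep[MAXN];          /* concrete node -> abstract node */
    int seen[MAXA][MAXA];   /* has a->n[i][j] received a value yet? */
    assert(c->size <= MAXN);
    memset(a, 0, sizeof *a);
    memset(seen, 0, sizeof seen);
    /* Step 1: merge nodes that agree on all unary predicates. */
    for (int u = 0; u < c->size; u++) {
        int k = 0;
        assert(c->unary[u] < MAXA);
        while (k < a->size && a->unary[k] != c->unary[u]) k++;
        if (k == a->size) { a->unary[k] = c->unary[u]; a->size++; }
        else a->summary[k] = 1;
        rep[u] = k;
    }
    /* Step 2: join the binary predicate values of merged pairs. */
    for (int u = 0; u < c->size; u++)
        for (int v = 0; v < c->size; v++) {
            int i = rep[u], j = rep[v];
            tv val = c->n[u][v] ? TV_1 : TV_0;
            a->n[i][j] = seen[i][j] ? tv_join(a->n[i][j], val) : val;
            seen[i][j] = 1;
        }
}
```

For a four-cell list whose first cell is pointed to by x, the result has two nodes: the head, and one summary node for the remaining three cells, with `n` between them equal to ½. The output size is bounded by `MAXA` no matter how long the input list is.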
14Instrumentation
- Explicitly maintains information about distinctions among cells
- Leads to less blurring when used as abstraction predicates
- Unary predicates defined via a first-order formula with transitive closure
- Example: local order
- inOrder[n](v) = ∀v1: n(v, v1) → dle(v, v1)
- inROrder[n](v) = ∀v1: n(v, v1) → dle(v1, v)
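On a concrete list the two local-order predicates reduce to a comparison with the single n-successor. The sketch below uses the `struct node` layout from the insertion sort slide; the function names `dle`, `in_order`, `in_rorder` and the helper `all_in_order` are illustrative and do not come from the talk.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct node { int data; struct node *n; } *L;

/* dle(v, v1): the data of v is less than or equal to the data of v1. */
static bool dle(L v, L v1) { return v->data <= v1->data; }

/* inOrder[n](v): every n-successor v1 of v satisfies dle(v, v1).
   A cell has at most one n-successor, so the universal quantifier
   is a single check; it holds vacuously when v->n is NULL. */
bool in_order(L v) { return v->n == NULL || dle(v, v->n); }

/* inROrder[n](v): the same with the comparison reversed. */
bool in_rorder(L v) { return v->n == NULL || dle(v->n, v); }

/* Hypothetical helper: inOrder holds at every cell reachable from x. */
bool all_in_order(L x) {
    for (L v = x; v != NULL; v = v->n)
        if (!in_order(v)) return false;
    return true;
}
```

The helper terminates only on acyclic lists, so it is a check for concrete test inputs rather than a replacement for the analysis.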
15Blurred Representation of Stores
[Figure: blurred structure of an ordered list; legend: inOrder[n](v1), n(v1, v2), dle(v1, v2), px(v); node labels: px=1 inOrder[n]=1, px=0 inOrder[n]=1, px=0 inOrder[n]=1; edge labels: dle]
16Arbitrary Lists
[Figure: blurred structure of an arbitrary list; node labels: px=0 inOrder[n]=½, px=1 inOrder[n]=½; edge labels: n, dle]
17Abstract Interpretation
- Iteratively compute a set of structures at every program location
- Conservatively interpret statements (conditions) on blurred structures
- Must terminate since the number of blurred structures is finite for a given program
- Fully automatic
- Guaranteed to be sound
- But may be overly conservative
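The iteration and its termination argument can be sketched with a toy fixpoint loop in C. Everything here is an assumption made for illustration: abstract structures are reduced to bit positions in a `Set`, the control-flow graph is a list of `Edge`s, and `identity` and `step` are made-up transfer functions. A real analyzer would store 3-valued structures instead of bits.

```c
#include <assert.h>

#define NLOC 3                      /* program locations (illustrative) */
typedef unsigned Set;               /* bitset over at most 32 structures */
typedef Set (*Transfer)(Set);       /* abstract effect of one statement */

typedef struct { int from, to; Transfer f; } Edge;

/* Chaotic iteration: apply every edge until no location gains a
   structure. Sets only grow and the universe is finite, so the loop
   terminates; transfer functions are assumed to be monotone. */
void explore(const Edge *edges, int nedges, Set state[NLOC]) {
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < nedges; i++) {
            assert(edges[i].from < NLOC && edges[i].to < NLOC);
            Set out = state[edges[i].to] | edges[i].f(state[edges[i].from]);
            if (out != state[edges[i].to]) {
                state[edges[i].to] = out;
                changed = 1;
            }
        }
    }
}

/* Toy transfer functions over four structures numbered 0..3. */
Set identity(Set s) { return s; }

Set step(Set s) {                   /* structure i becomes min(i + 1, 3) */
    Set out = 0;
    for (int i = 0; i < 4; i++)
        if (s & (1u << i)) out |= 1u << (i < 3 ? i + 1 : 3);
    return out;
}
```

With edges 0→1 (`identity`), 1→1 (`step`, a loop) and 1→2 (`identity`), starting from structure 0 at location 0, the loop location accumulates all four structures and then stabilizes.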
18Abstract Interpretation of Insertion Sort
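[Figure: two blurred structures. First: node labels px=0 inOrder[n]=½ and px=1 inOrder[n]=½; edge labels n, dle. Second: node labels px=0 inOrder[n]=1 and px=1 inOrder[n]=1; edge labels n, dle]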
19The Key Problem
- How to interpret statements (conditions) on blurred structures?
- Difficult to provide a conservative (and reasonably precise) interpretation
- It is difficult to show that specific abstractions are conservative (Sagiv, Reps, Wilhelm, TOPLAS 98)
- Long and intimidating proofs
- Or no proofs (and bugs)
20The best conservative interpretation (Cousot & Cousot 1979)
[Figure: diagram with the label "abstract representation"]
21The 3-Valued Logic Approach
- Automatically derives a conservative
interpretation of statements and conditions from
- structural operational semantics
- written using logical formulae
- global properties
- abstraction predicates
- An experimental system (TVLA)
- Correct by construction
22x->d < y->d
∃v1, v2: px(v1) ∧ py(v2) ∧ dle(v1, v2)
true
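Evaluating such a condition formula on a 3-valued structure can be sketched in C: conjunction is a minimum, and the existential quantifier is a maximum over all node pairs. The `Structure` type, the bound `MAXN` and the function `eval_condition` are hypothetical names for this example; predicate names follow the slide.

```c
#include <assert.h>

#define MAXN 4              /* max nodes (illustrative bound) */
typedef enum { TV_0 = 0, TV_HALF = 1, TV_1 = 2 } tv;  /* 0, 1/2, 1 */

typedef struct {
    int size;
    tv px[MAXN];            /* px(v): x points to v */
    tv py[MAXN];            /* py(v): y points to v */
    tv dle[MAXN][MAXN];     /* dle(v1, v2): data of v1 <= data of v2 */
} Structure;

static tv tv_and(tv a, tv b) { return a < b ? a : b; }
static tv tv_or(tv a, tv b)  { return a > b ? a : b; }

/* Evaluates: exists v1, v2: px(v1) and py(v2) and dle(v1, v2).
   The existential quantifier is a Kleene "or" over all node pairs. */
tv eval_condition(const Structure *s) {
    tv result = TV_0;
    assert(s->size <= MAXN);
    for (int v1 = 0; v1 < s->size; v1++)
        for (int v2 = 0; v2 < s->size; v2++)
            result = tv_or(result,
                tv_and(s->px[v1], tv_and(s->py[v2], s->dle[v1][v2])));
    return result;
}
```

When `dle` between the two pointed-to nodes is ½, the condition evaluates to ½, and a conservative analysis then has to follow both branches.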
23From Local Outlook to Global Outlook
- (Safety) Every time control reaches a given point:
- there are no garbage memory cells
- the list is acyclic
- each cell is locally ordered
- (History) The list is a permutation of the
original list
24Bugs Found
- Pointer manipulations
- null dereferences
- memory leaks
- Forget to sort the first element
- Swap equal elements in bubble sort (non-termination)
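25
/* Buggy variant: the inner scan starts at x->n, so the first
   element is never compared against the others. */
L insert_sort_b2(L x) {
    L r, pr, rn, l, pl;
    if (x == NULL) return NULL;
    pr = x;
    r = x->n;
    while (r != NULL) {
        pl = x;
        rn = r->n;
        l = x->n;
        while (l != r) {
            if (l->d > r->d) {
                pr->n = rn;
                r->n = l;
                pl->n = r;
                r = pr;
                break;
            }
            pl = l;
            l = l->n;
        }
        pr = r;
        r = rn;
    }
    return x;
}
[Figure: resulting blurred structure; node labels: px=1 inOrder[n]=½, px=0 inOrder[n]=1; edge labels: n, dle]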
26Running Times
27Properties Not Proved
- (Liveness) Termination
- Stability
28Related Work
- Temporal-logic model checking
- Manually extracts finite-state machine
- Does not handle dynamically allocated data
- But proves stronger properties, e.g., liveness
- Bourdoncle 93
- Handles integer arithmetic
- Cannot handle pointers
29Further Work
- Recursive programs (Quicksort)
- Experiment with other ADTs (AVL trees)
- Automatically derive predicate-update formulae for instrumentation predicates?
- Scaling to larger programs
- User annotations
- Class-level analysis
- Modular analysis
- Space optimizations
- Smart front-end that precomputes cheap
information
30Conclusions
- It is possible to automatically verify non-trivial properties of complex C programs that manipulate dynamically allocated memory without providing loop invariants
- The implementation is automatically generated from TVLA
- But scaling is an issue
31Other Applications of TVLA
- Verifying cleanness properties of C programs (Dor, Rodeh, Sagiv 2000)
- null dereferences
- memory leaks
- Verifying safety properties of Mobile Ambients (Nielson, Nielson, Sagiv 2000)
- Verifying safety properties of multithreaded Java programs (Yahav 2000)
- Deadlocks
- Nested monitors
- Read/Write interference
32Boolean Connectives (Kleene)
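Kleene's strong three-valued connectives can be written compactly in C if the truth values are stored doubled (0, 1, 2 for 0, ½, 1). The encoding and the names `tv_and`, `tv_or`, `tv_not` are choices made for this sketch, not TVLA identifiers.

```c
/* Truth values stored doubled so that 0 < 1/2 < 1 becomes 0 < 1 < 2. */
typedef enum { TV_0 = 0, TV_HALF = 1, TV_1 = 2 } tv;

/* Conjunction is the minimum: 0 and 1/2 = 0, 1 and 1/2 = 1/2. */
tv tv_and(tv a, tv b) { return a < b ? a : b; }

/* Disjunction is the maximum: 1 or 1/2 = 1, 0 or 1/2 = 1/2. */
tv tv_or(tv a, tv b) { return a > b ? a : b; }

/* Negation is 1 - a, so 1/2 stays 1/2. */
tv tv_not(tv a) { return (tv)(TV_1 - a); }
```

On the definite values 0 and 1 these functions agree with ordinary Boolean logic; ½ only survives when the definite operands do not already decide the result.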
33The Operational Semantics of x = t->n
x'(v2) = ∃v1: t(v1) ∧ n(v1, v2)
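A direct evaluation of this predicate-update formula over a 3-valued structure might look as follows in C. The `Structure` layout and the function name `assign_x_t_next` are assumptions for the sketch, which evaluates the formula as written and does not model any further refinement steps an analyzer may perform.

```c
#include <assert.h>

#define MAXN 4              /* max nodes (illustrative bound) */
typedef enum { TV_0 = 0, TV_HALF = 1, TV_1 = 2 } tv;  /* 0, 1/2, 1 */

typedef struct {
    int size;
    tv x[MAXN];             /* x(v): x points to v */
    tv t[MAXN];             /* t(v): t points to v */
    tv n[MAXN][MAXN];       /* n(v1, v2): v1's n field points to v2 */
} Structure;

static tv tv_and(tv a, tv b) { return a < b ? a : b; }
static tv tv_or(tv a, tv b)  { return a > b ? a : b; }

/* x = t->n:  x'(v2) = exists v1: t(v1) and n(v1, v2) */
void assign_x_t_next(Structure *s) {
    tv new_x[MAXN];
    assert(s->size <= MAXN);
    for (int v2 = 0; v2 < s->size; v2++) {
        new_x[v2] = TV_0;
        for (int v1 = 0; v1 < s->size; v1++)
            new_x[v2] = tv_or(new_x[v2],
                              tv_and(s->t[v1], s->n[v1][v2]));
    }
    /* Install the new values only after all of them are computed. */
    for (int v = 0; v < s->size; v++)
        s->x[v] = new_x[v];
}
```

If the `n` edge leaving t's node has value ½, the new `x` is ½ on the target, which reflects that x may or may not point there.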
34The Operational Semantics of x->n = NULL
n'(v1, v2) = n(v1, v2) ∧ ¬x(v1)
inOrder'[dle, n](v) = inOrder[dle, n](v) ∨ x(v)
inROrder'[dle, n](v) = inROrder[dle, n](v) ∨ x(v)
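The three update formulae for `x->n = NULL` can be evaluated with Kleene connectives as in the C sketch below. The `Structure` layout and the name `set_x_next_null` are hypothetical. Updating in place is safe here because each new value depends only on the old value at the same position and on `x`, which the statement does not change.

```c
#include <assert.h>

#define MAXN 4              /* max nodes (illustrative bound) */
typedef enum { TV_0 = 0, TV_HALF = 1, TV_1 = 2 } tv;  /* 0, 1/2, 1 */

typedef struct {
    int size;
    tv x[MAXN];             /* x(v): x points to v */
    tv n[MAXN][MAXN];       /* n(v1, v2) */
    tv in_order[MAXN];      /* inOrder[dle, n](v) */
    tv in_rorder[MAXN];     /* inROrder[dle, n](v) */
} Structure;

static tv tv_and(tv a, tv b) { return a < b ? a : b; }
static tv tv_or(tv a, tv b)  { return a > b ? a : b; }
static tv tv_not(tv a)       { return (tv)(TV_1 - a); }

/* x->n = NULL */
void set_x_next_null(Structure *s) {
    assert(s->size <= MAXN);
    for (int v1 = 0; v1 < s->size; v1++) {
        tv not_x = tv_not(s->x[v1]);
        /* n'(v1, v2) = n(v1, v2) and not x(v1) */
        for (int v2 = 0; v2 < s->size; v2++)
            s->n[v1][v2] = tv_and(s->n[v1][v2], not_x);
        /* A cell without a successor is trivially in (reverse) order. */
        s->in_order[v1]  = tv_or(s->in_order[v1],  s->x[v1]);
        s->in_rorder[v1] = tv_or(s->in_rorder[v1], s->x[v1]);
    }
}
```

The instrumentation values of the cell pointed to by x become 1 without re-evaluating the defining formula, which is the point of supplying update formulae for instrumentation predicates.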
35The Operational Semantics of x->n = t
n'(v1, v2) = n(v1, v2) ∨ (x(v1) ∧ t(v2))
inOrder'[dle, n](v) = (x(v) ? ∃v1: t(v1) ∧ dle(v, v1) : inOrder[dle, n](v))
inROrder'[dle, n](v) = (x(v) ? ∃v1: t(v1) ∧ dle(v1, v) : inROrder[dle, n](v))
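The updates for `x->n = t` can be sketched in C in the same style. Two assumptions are made here and should be read as such: the 3-valued conditional `c ? a : b` is evaluated as (c ∧ a) ∨ (¬c ∧ b), and the `Structure` layout and function names are hypothetical. Only `inOrder` is shown; `inROrder` is symmetric with the arguments of `dle` swapped.

```c
#include <assert.h>

#define MAXN 4              /* max nodes (illustrative bound) */
typedef enum { TV_0 = 0, TV_HALF = 1, TV_1 = 2 } tv;  /* 0, 1/2, 1 */

typedef struct {
    int size;
    tv x[MAXN], t[MAXN];    /* x(v), t(v): pointer variables */
    tv n[MAXN][MAXN];       /* n(v1, v2) */
    tv dle[MAXN][MAXN];     /* dle(v1, v2): data of v1 <= data of v2 */
    tv in_order[MAXN];      /* inOrder[dle, n](v) */
} Structure;

static tv tv_and(tv a, tv b) { return a < b ? a : b; }
static tv tv_or(tv a, tv b)  { return a > b ? a : b; }
static tv tv_not(tv a)       { return (tv)(TV_1 - a); }

/* Assumed reading of the conditional: (c and a) or (not c and b). */
static tv tv_ite(tv c, tv a, tv b) {
    return tv_or(tv_and(c, a), tv_and(tv_not(c), b));
}

/* x->n = t */
void set_x_next_t(Structure *s) {
    assert(s->size <= MAXN);
    for (int v = 0; v < s->size; v++) {
        tv succ_ok = TV_0;  /* exists v1: t(v1) and dle(v, v1) */
        for (int v1 = 0; v1 < s->size; v1++) {
            succ_ok = tv_or(succ_ok, tv_and(s->t[v1], s->dle[v][v1]));
            /* n'(v, v1) = n(v, v1) or (x(v) and t(v1)) */
            s->n[v][v1] = tv_or(s->n[v][v1],
                                tv_and(s->x[v], s->t[v1]));
        }
        /* inOrder'(v) = x(v) ? succ_ok : inOrder(v) */
        s->in_order[v] = tv_ite(s->x[v], succ_ok, s->in_order[v]);
    }
}
```

Only the cell pointed to by x gets its `inOrder` value recomputed, from the data comparison with t's cell; every other cell keeps its previous value.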