Title: Applications of TVLA
1Applications of TVLA
- Mooly Sagiv
- Tel Aviv University
Shape Analysis with Applications
http://www.cs.tau.ac.il/~rumster/TVLA/
2Outline
- Issues
- Complexity of TVLA
- Weak vs. Strong Updates
- Cleanness
- Null dereferences
- Memory leaks
- Freed storage (homework)
- The concurrent modification problem
- Partial correctness
- Sorting
- GC
- Total Correctness
- Flow dependences
- Multithreading
- Other
3Complexity of Shape Analysis
x = malloc();
if (...) y1 = x;
if (...) y2 = x;
if (...) y3 = x;
...
if (...) yn = x;
4Complexity of TVLA analysis
- Maximal number of Nodes in Blurred Structures
- 3^|A|, where A is the set of abstraction predicates
- Size of 3-valued structure representation
- Action cost
- Focus
- Precondition
- Coerce
- New
- Update
- Coerce
- Blur
5Weak vs. Strong Updates
if (...) x = y;
else x = z;
x->n = NULL;
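The difference between the two kinds of update on the fragment above can be sketched as follows. This is only an illustration under simplifying assumptions, not TVLA's mechanism: the class `UpdateSketch`, the location names `cellY` and `cellZ`, and the map-of-sets heap are all hypothetical, and every location name is assumed to stand for a single concrete cell (no summary nodes).

```java
import java.util.*;

/** Hypothetical sketch: the abstract heap maps a location to the possible targets of its n field. */
final class UpdateSketch {
    static final String NULL = "NULL";

    /** Abstract effect of "x->n = NULL" when x may point to any location in targetsOfX. */
    static Map<String, Set<String>> setNextNull(Map<String, Set<String>> nField,
                                                Set<String> targetsOfX) {
        // Copy the input so the caller's heap is left unchanged.
        Map<String, Set<String>> out = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : nField.entrySet()) {
            out.put(e.getKey(), new TreeSet<>(e.getValue()));
        }
        // Assumption: one possible target means x denotes exactly one cell.
        boolean strong = targetsOfX.size() == 1;
        for (String loc : targetsOfX) {
            Set<String> vals = out.computeIfAbsent(loc, k -> new TreeSet<>());
            if (strong) {
                vals.clear();   // strong update: the old value is definitely overwritten
            }
            vals.add(NULL);     // weak update: the old values remain as alternatives
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> heap = new TreeMap<>();
        heap.put("cellY", new TreeSet<>(Arrays.asList("a")));
        heap.put("cellZ", new TreeSet<>(Arrays.asList("b")));
        // x = y on every path: strong update of cellY.
        System.out.println(setNextNull(heap, new TreeSet<>(Arrays.asList("cellY"))));
        // x = y or x = z, as after the if/else: weak update of both cells.
        System.out.println(setNextNull(heap, new TreeSet<>(Arrays.asList("cellY", "cellZ"))));
    }
}
```

After the if/else, x has two possible targets, so neither cell's old n value can be discarded; this loss of precision is what the Focus operation is meant to avoid.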
6Detecting Incorrect Library Usages (J. Field, D. Goyal, G. Ramalingam, A. Warshavski)
- Java provides libraries for manipulating data structures
- Collections
- Lists
- Hashset
- ...
- Iterators over collections allow sequential access
- Statically detect incorrect library usages
Set s = worklist.unprocessedItems();
for (Iterator i = s.iterator(); i.hasNext(); ) {
    Object item = i.next();
    if (...) processItem(item);
}
7The Concurrent Modification Problem
- Static analysis of Java programs manipulating Java 2 collections
- Inconsistent usages of iterators
- An Iterator object i defined on a collection object c
- No use of i may be preceded by an update to the contents of c, unless the update was also made via i
- Guarantees order independence
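The rule above can be observed at run time with the standard `java.util` collections. The sketch below is a minimal single-threaded illustration; the class name `CmeDemo` and its two methods are hypothetical. Note that the JDK documents this fail-fast check as best-effort, which is one reason a static check is attractive.

```java
import java.util.*;

/** Hypothetical demo: which uses of an iterator raise ConcurrentModificationException. */
final class CmeDemo {
    /** The collection is updated directly, then the iterator is used. */
    static boolean failsAfterDirectAdd() {
        Set<String> c = new HashSet<>(Arrays.asList("a", "b"));
        Iterator<String> i = c.iterator();
        i.next();
        c.add("c");                 // update to c that bypasses i
        try {
            i.next();               // use of i preceded by that update
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** The collection is updated via the iterator itself, then the iterator is used. */
    static boolean failsAfterIteratorRemove() {
        Set<String> c = new HashSet<>(Arrays.asList("a", "b"));
        Iterator<String> i = c.iterator();
        i.next();
        i.remove();                 // update made via i: allowed by the rule
        try {
            i.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("direct add fails:      " + failsAfterDirectAdd());
        System.out.println("iterator remove fails: " + failsAfterIteratorRemove());
    }
}
```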
8Artificial Example
Set v = new Set();
Iterator i1 = v.iterator();
Iterator i2 = v.iterator();
Iterator i3 = i1;
i1.next();
i1.remove();
if (...) { i2.next(); }
if (...) { i3.next(); }
v.add("...");
if (...) { i1.next(); }
9
class Make {
    private Worklist worklist;
    public static void main(String[] args) {
        Make m = new Make();
        m.initializeWorklist(args);
        m.processWorklist();
    }
    void initializeWorklist(String[] args) {
        ... worklist = new Worklist(); ...
        // add some items to worklist
    }
    void processWorklist() {
        Set s = worklist.unprocessedItems();
        for (Iterator i = s.iterator(); i.hasNext(); ) {
            Object item = i.next();
            if (...) processItem(item);
        }
    }
    void processItem(Object i) { ... doSubproblem(...); }
    void doSubproblem(...) {
        ... worklist.addItem(newitem); ...
    }
}
public class Worklist {
    Set s;
    public Worklist() { ... s = new HashSet(); ... }
    public void addItem(Object item) { s.add(item); }
    public Set unprocessedItems() { return s; }
}
return rev
10Static Detection of Concurrent Modifications
- Statically Check for CME exceptions
- Warn against potential CME
- Sound (conservative) solution
- Not too many false alarms
- Coding in TVLA
- Operational Semantics
- Vanilla solution is Imprecise (and inefficient)
- Derive instrumentation predicates
- Java to TVP front-end
- Extract potentially relevant client code
11CME specification in Java
class Version { /* represents distinct versions of a Set */ }
class Collection {
    Version version;
    Collection() { version = new Version(); }
    boolean add(Object o) { version = new Version(); }
    Iterator iterator() { return new Iterator(this); }
}
class Iterator {
    Collection set;
    Version definingVersion;
    Iterator(Collection s) { definingVersion = s.version; set = s; }
    void remove() {
        requires (definingVersion == set.version);
        set.version = new Version();
        definingVersion = set.version;
    }
    Object next() {
        requires (definingVersion == set.version);
    }
}
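The specification above can be made executable by turning each `requires` clause into a run-time check. This is a sketch under assumptions: the classes are renamed `Coll` and `Iter` to avoid clashing with `java.util`, the helper `valid()` and the choice of `IllegalStateException` are hypothetical, and element storage is omitted because only versions matter here.

```java
/** Hypothetical executable rendering of the version-based CME specification. */
final class VersionSpec {
    static final class Version { }

    static final class Coll {
        Version version = new Version();
        boolean add(Object o) { version = new Version(); return true; }
        Iter iterator() { return new Iter(this); }
    }

    static final class Iter {
        final Coll set;
        Version definingVersion;

        Iter(Coll s) { definingVersion = s.version; set = s; }

        /** True when the iterator still matches the current version of its set. */
        boolean valid() { return definingVersion == set.version; }

        private void requireValid() {
            if (!valid()) throw new IllegalStateException("CME: stale iterator");
        }

        void remove() {
            requireValid();
            set.version = new Version();
            definingVersion = set.version;
        }

        void next() { requireValid(); }
    }

    public static void main(String[] args) {
        // The artificial example, with the conditional uses replaced by validity queries.
        Coll v = new Coll();
        Iter i1 = v.iterator();
        Iter i2 = v.iterator();
        Iter i3 = i1;
        i1.next();
        i1.remove();
        System.out.println("i2 valid: " + i2.valid() + ", i3 valid: " + i3.valid());
        v.add("...");
        System.out.println("i1 valid: " + i1.valid());
    }
}
```

Under this model, `i2` becomes stale after `i1.remove()`, `i3` stays usable because it aliases `i1`, and `i1` becomes stale after `v.add("...")`.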
12Vanilla TVLA Encoding
- Local iterators are pointers
- Unary predicates
- Relevant fields are pointer selectors
- Binary predicates
13Artificial Example
Set v = new Set();
Iterator i1 = v.iterator();
Iterator i2 = v.iterator();
Iterator i3 = i1;
i1.next();
i1.remove();
if (...) { i2.next(); }
if (...) { i3.next(); }
v.add("...");
if (...) { i1.next(); }
14Improved TVLA Encodings
- Use reachability
- Explicitly maintain relevant information
- valid[i]: i.defVersion == i.set.version
- iterOf[i, v]: i.set == v
- mutex[i, j]: i.set == j.set and i != j
- same[v, w]: v == w
- Can be automatically derived from the specification
- Polynomial complexity in programs where iterators are not stored in the client heap
- Meet-over-all-paths solution
- Adaptive to programs with client heap
15Empirical Results
Benchmark      LOC   Err.  FA  Time (sec)  Space (MB)  Structs.
Kernel         683   15    0   60          19          4363
MapTest        335   1     0   61          20          4937
Iterator Test  126   0     0   0.23        4           208
JFE            2896  1     1   236         49          9878
16Partial Correctness
- {P} S {Q}
- How to derive loop invariants?
- Abstract interpretation provides a sound solution
- The abstract domain represents a class of program invariants
17Example: Sorting of linked lists
typedef struct node {
    struct node *n;
    int data;
} *Elements;
- dle(v1, v2): v1.data ≤ v2.data
- inOrder[dle,n](v): ∀v1: n(v, v1) → dle(v, v1)
- inROrder[dle,n](v): ∀v1: n(v, v1) → dle(v1, v)
- Captures intermediate invariants as well
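On a concrete list, the predicates above are easy to evaluate, which may help in reading them. The sketch below is an illustration only: the class `InOrderSketch` and the helpers `list` and `sorted` are hypothetical, and the list is assumed to be acyclic. Since n is a function, the universal quantifier ranges over at most one successor.

```java
/** Hypothetical concrete evaluation of dle, inOrder[dle,n] and inROrder[dle,n]. */
final class InOrderSketch {
    static final class Node {
        final int data;
        Node n;
        Node(int data) { this.data = data; }
    }

    /** dle(v1, v2): v1.data <= v2.data. */
    static boolean dle(Node v1, Node v2) { return v1.data <= v2.data; }

    /** inOrder[dle,n](v): the n-successor of v, if any, is not smaller than v. */
    static boolean inOrder(Node v) { return v.n == null || dle(v, v.n); }

    /** inROrder[dle,n](v): the n-successor of v, if any, is not larger than v. */
    static boolean inROrder(Node v) { return v.n == null || dle(v.n, v); }

    /** A list is sorted when every node reachable from x is locally in order. */
    static boolean sorted(Node x) {
        for (Node v = x; v != null; v = v.n) {
            if (!inOrder(v)) return false;
        }
        return true;
    }

    /** Builds an acyclic list holding the given data in the given order. */
    static Node list(int... data) {
        Node head = null;
        for (int k = data.length - 1; k >= 0; k--) {
            Node v = new Node(data[k]);
            v.n = head;
            head = v;
        }
        return head;
    }

    public static void main(String[] args) {
        System.out.println(sorted(list(1, 2, 2, 5)));
        System.out.println(sorted(list(3, 1, 2)));
    }
}
```

The local property on every reachable node implies global sortedness because dle is transitive; this is the step from a local to a global outlook.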
18
L insert_sort(L x) {
    L r, pr, rn, l, pl;
    r = x;
    pr = NULL;
    while (r != NULL) {
        l = x;
        rn = r->n;
        pl = NULL;
        while (l != r) {
            if (l->data > r->data) {
                pr->n = rn;
                r->n = l;
                if (pl == NULL) x = r;
                else pl->n = r;
                r = pr;
                break;
            }
            pl = l;
            l = l->n;
        }
        pr = r;
        r = rn;
    }
    return x;
}
typedef struct node {
    struct node *n;
    int data;
} *Elements;
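19
[Figure: two abstract structures for insert_sort. In both, the list nodes are pointed to by x, linked by n, related by dle, and labelled r[n,x]; inOrder[dle,n] is 1/2 in the first structure and definite in the second.]
L insert_sort(L x) { L r, pr, rn, l, pl; ... return x; }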
20/* pred.tvp */
foreach (z in PVar) {
    %p z(v_1) unique box
}
%p n(v_1, v_2) function
%i is[n](v) = E(v_1, v_2) (v_1 != v_2 & n(v_1, v) & n(v_2, v))
foreach (z in PVar) {
    %i r[n,z](v) = E(v_1) (z(v_1) & n*(v_1, v))
}
%i c[n](v) = n+(v, v)
%p dle(v1, v2) reflexive transitive
%i inOrder[dle,n](v) = A(v_1) n(v, v_1) -> dle(v, v_1) nonabs
%i inROrder[dle,n](v) = A(v_1) n(v, v_1) -> dle(v_1, v) nonabs
%r !dle(v_1, v_2) ==> dle(v_2, v_1)
21/* cond.tvp */
%action uninterpreted() {
    %t "uninterpreted"
}
%action Is_Not_Null_Var(x1) {
    %t x1 + " != NULL"
    %f { x1(v) }
    %p E(v) x1(v)
}
%action Is_Null_Var(x1) {
    %t x1 + " == NULL"
    %f { x1(v) }
    %p !(E(v) x1(v))
}
%action Is_Eq_Var(x1, x2) {
    %t x1 + " == " + x2
    %f { x1(v), x2(v) }
    %p A(v) x1(v) <-> x2(v)
}
%action Is_Not_Eq_Var(x1, x2) {
    %t x1 + " != " + x2
    %f { x1(v), x2(v) }
    %p !A(v) x1(v) <-> x2(v)
}
22
%action Greater_Data_L(x1, x2) {
    %t x1 + "->data > " + x2 + "->data"
    %f { x1(v_1) & x2(v_2) & dle(v_1, v_2) }
    %p !E(v_1, v_2) x1(v_1) & x2(v_2) & dle(v_1, v_2)
}
%action Less_Equal_Data_L(x1, x2) {
    %t x1 + "->data <= " + x2 + "->data"
    %f { x1(v_1) & x2(v_2) & dle(v_1, v_2) }
    %p E(v_1, v_2) x1(v_1) & x2(v_2) & dle(v_1, v_2)
}
23stat.tvp
%action Set_Next_Null_L(x1) {
    %t x1 + "->" + n + " = null"
    %f { x1(v) }
    %message !(E(v) x1(v)) -> ...
    {
        n(v_1, v_2) = ...
        is[n](v) = ...
        r[n,x1](v) = ...
        foreach (z in PVar - {x1}) { r[n,z](v) = ... }
        c[n](v) = ...
        inOrder[dle,n](v) = inOrder[dle,n](v) | x1(v)
        inROrder[dle,n](v) = inROrder[dle,n](v) | x1(v)
    }
}
24stat.tvp(more)
action Malloc_L(x1) t x1 " (L)
malloc(sizeof(struct node)) " new
x1(v) isNew(v) inOrderdle,
n(v1, v2) inROrderdle, n(v1,
v2)
25Abstract interpretation of if (x->data <= y->data)
26From Local Outlook to Global Outlook
- (Safety) Every time control reaches a given point
- there are no garbage memory cells
- the list is acyclic
- each cell is locally ordered
- (History) The list is a permutation of the original list
27Bugs Found
- Pointer manipulations
- null dereferences
- memory leaks
- Forget to sort the first element
- Swap equal elements in bubble sort (non-termination)
28
L insert_sort_b2(L x) {
    L r, pr, rn, l, pl;
    if (x == NULL) return NULL;
    pr = x;
    r = x->n;
    while (r != NULL) {
        pl = x;
        rn = r->n;
        l = x->n;
        while (l != r) {
            if (l->d > r->d) {
                pr->n = rn;
                r->n = l;
                pl->n = r;
                r = pr;
                break;
            }
            pl = l;
            l = l->n;
        }
        pr = r;
        r = rn;
    }
    return x;
}
[Figure: abstract structure for the list pointed to by x, with n and dle edges; inOrder[dle,n] is 1/2 on one node and 1 on another.]
29Running Times
30Properties Not Proved
- (Liveness) Termination
- Stability
31Example: Mark and Sweep
void Sweep() {
    unexplored = Universe;
    collected = ∅;
    while (unexplored ≠ ∅) {
        x = SelectAndRemove(unexplored);
        if (x ∉ marked)
            collected = collected ∪ {x};
    }
    assert(collected == Universe - Reachset(root));
}
void Mark(Node root) {
    if (root != NULL) {
        pending = ∅;
        pending = pending ∪ {root};
        marked = ∅;
        while (pending ≠ ∅) {
            x = SelectAndRemove(pending);
            marked = marked ∪ {x};
            t = x->left;
            if (t ≠ NULL)
                if (t ∉ marked)
                    pending = pending ∪ {t};
            t = x->right;
            if (t ≠ NULL)
                if (t ∉ marked)
                    pending = pending ∪ {t};
        }
    }
    assert(marked == Reachset(root));
}
Run Demo
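The two procedures can be run on a concrete heap to see what the assertions claim. This is a sketch, not the analysed program: the class `MarkSweepSketch`, the explicit `universe` parameter, and the recursive `reachSet` used as an independent definition of Reachset are assumptions made for illustration.

```java
import java.util.*;

/** Hypothetical concrete version of Mark and Sweep with the two assertions checked in main. */
final class MarkSweepSketch {
    static final class Node {
        Node left, right;
    }

    /** SelectAndRemove: take an arbitrary element out of the set. */
    static Node selectAndRemove(Set<Node> s) {
        Iterator<Node> it = s.iterator();
        Node x = it.next();
        it.remove();
        return x;
    }

    static Set<Node> mark(Node root) {
        Set<Node> marked = new HashSet<>();
        if (root != null) {
            Set<Node> pending = new HashSet<>();
            pending.add(root);
            while (!pending.isEmpty()) {
                Node x = selectAndRemove(pending);
                marked.add(x);
                for (Node t : new Node[] { x.left, x.right }) {
                    if (t != null && !marked.contains(t)) pending.add(t);
                }
            }
        }
        return marked;
    }

    static Set<Node> sweep(Set<Node> universe, Set<Node> marked) {
        Set<Node> unexplored = new HashSet<>(universe);
        Set<Node> collected = new HashSet<>();
        while (!unexplored.isEmpty()) {
            Node x = selectAndRemove(unexplored);
            if (!marked.contains(x)) collected.add(x);
        }
        return collected;
    }

    /** Independent recursive definition of Reachset(root), used only for checking. */
    static Set<Node> reachSet(Node root) {
        Set<Node> acc = new HashSet<>();
        reach(root, acc);
        return acc;
    }

    private static void reach(Node v, Set<Node> acc) {
        if (v == null || !acc.add(v)) return;
        reach(v.left, acc);
        reach(v.right, acc);
    }

    public static void main(String[] args) {
        Node a = new Node(), b = new Node(), garbage = new Node();
        a.left = b;
        b.right = a;   // a cycle, to show that marking still terminates
        Set<Node> universe = new HashSet<>(Arrays.asList(a, b, garbage));
        Set<Node> marked = mark(a);
        Set<Node> expectedGarbage = new HashSet<>(universe);
        expectedGarbage.removeAll(reachSet(a));
        System.out.println(marked.equals(reachSet(a)));
        System.out.println(sweep(universe, marked).equals(expectedGarbage));
    }
}
```

A run like this checks the assertions for one heap only; the point of the analysis is to establish them for every heap.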
32Total Correctness
- Usually more complicated
- Need to show that something good eventually happens
- Difficult for programs with unbounded concrete states
- Example: linked lists
- Show decreased set of reachable locations
33Program Dependences
- A statement s1 depends on s2 if
- s2 writes into a location l
- s1 reads from location l
- There is no intervening write in between
- Useful for
- Parallelization
- Scheduling
- Program Slicing
- How to compute
- Scalars
- Stack pointers
- Heap allocated pointers
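For scalars in straight-line code, the definition above reduces to remembering the last statement that wrote each variable. The following sketch shows only that simplest case; the class `FlowDepSketch`, the `Stmt` record of one written variable plus a set of read variables, and the `"writer->reader"` output format are hypothetical. Branches, loops and heap cells are exactly what makes the problem hard and are not handled here.

```java
import java.util.*;

/** Hypothetical flow-dependence computation for straight-line code over scalar variables. */
final class FlowDepSketch {
    static final class Stmt {
        final String label;
        final String writes;          // variable written
        final Set<String> reads;      // variables read, kept sorted for stable output

        Stmt(String label, String writes, String... reads) {
            this.label = label;
            this.writes = writes;
            this.reads = new TreeSet<>(Arrays.asList(reads));
        }
    }

    /** Returns "s2->s1" for every pair where s1 reads a location whose last write is s2. */
    static List<String> flowDeps(List<Stmt> program) {
        Map<String, String> lastWrite = new HashMap<>();
        List<String> deps = new ArrayList<>();
        for (Stmt s : program) {
            for (String loc : s.reads) {
                String writer = lastWrite.get(loc);
                if (writer != null) deps.add(writer + "->" + s.label);
            }
            // A later write hides earlier ones: this is the "no intervening write" condition.
            lastWrite.put(s.writes, s.label);
        }
        return deps;
    }

    public static void main(String[] args) {
        List<Stmt> program = Arrays.asList(
            new Stmt("l1", "x"),            // l1: x = 1
            new Stmt("l2", "y", "x"),       // l2: y = x
            new Stmt("l3", "x"),            // l3: x = 2
            new Stmt("l4", "z", "x", "y")); // l4: z = x + y
        System.out.println(flowDeps(program));
    }
}
```

Statement l4 depends on l3 but not on l1, because l3 overwrites x in between.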
34Flow Dependences vs. May-Aliases
int y; List p, q;
q = (List) malloc();
p = q;
t = p;
p->d = 5;
t->d = 7;
y = q->d;

int y; List p, q;
q = (List) malloc();
p = q;
p->d = 5;
y = q->d;

int y; List p, q;
q = (List) malloc();
p = q;
t = p;
p->d = 5;
p = (List) malloc();
y = q->d;
35
void append() {
    List head, tail, temp;
l1:  head = (List) malloc();
l2:  scanf("%c", &head->d);
l3:  head->n = NULL;
l4:  tail = head;
l5:  if (tail->d == 'x') goto l12;
l6:  temp = (List) malloc();
l7:  scanf("%c", &temp->d);
l8:  temp->n = NULL;
l9:  tail->n = temp;
l10: tail = tail->n;
l11: goto l5;
l12: printf("%c", head->d);
l13: printf("%c", tail->d);
exit:
}
36/* pred.tvp */
foreach (z in PVar) {
    %p z(v_1) unique box
}
%p n(v_1, v_2) function
%i is[n](v) = E(v_1, v_2) (v_1 != v_2 & n(v_1, v) & n(v_2, v))
foreach (z in PVar) {
    %i r[n,z](v) = E(v_1) (z(v_1) & n*(v_1, v))
}
foreach (l in Label) {
    %p lst_w_v[l,z]()           // l is the last write into the variable z
}
foreach (l in Label) {
    %p lst_w_f[l,n](v_1) box    // l is the last write into v_1.n
    %p lst_w_f[l,d](v_1) box    // l is the last write into v_1.data
}
%i c[n](v) = n+(v, v)
37Operational Semantics for Statements
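st              update
l: x = rhs      lst_w_v[l,x] = 1; lst_w_v[l',x] = 0 for every other label l'
l: x->d = rhs   lst_w_f[l,d](v) = (x(v) ? 1 : lst_w_f[l,d](v)); lst_w_f[l',d](v) = (x(v) ? 0 : lst_w_f[l',d](v)) for every other label l'
l: x->n = rhs   lst_w_f[l,n](v) = (x(v) ? 1 : lst_w_f[l,n](v)); lst_w_f[l',n](v) = (x(v) ? 0 : lst_w_f[l',n](v)) for every other label l'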
38Read Formulae for Statements
exp              formula
l: x = NULL      0
l: x = y         lst_w_v[l,y]
l: x->n = NULL   lst_w_v[l,x]
l: x->n = y      lst_w_v[l,x] | lst_w_v[l,y]
l: x = y->n      lst_w_v[l,y] | E(v) (y(v) & lst_w_f[l,n](v))
39
void append() {
    List head, tail, temp;
l1:  head = (List) malloc();
l2:  scanf("%c", &head->d);
l3:  head->n = NULL;
l4:  tail = head;
l5:  if (tail->d == 'x') goto l12;
l6:  temp = (List) malloc();
l7:  scanf("%c", &temp->d);
l8:  temp->n = NULL;
l9:  tail->n = temp;
l10: tail = tail->n;
l11: goto l5;
l12: printf("%c", head->d);
l13: printf("%c", tail->d);
exit:
}
[Figure: abstract structure for the list built by append, with pointer variables head, tail and temp. Annotations: lst_w_v[l1,head], lst_w_v[l10,tail], lst_w_v[l6,temp], lst_w_f[l9,n], lst_w_f[l8,n], lst_w_f[l2,d], lst_w_f[l7,d].]
40Java Concurrency
- Threads and locks are just dynamically allocated objects
- synchronized implements mutual exclusion
- wait, notify and notifyAll coordinate activities across threads
41Example - Mutual Exclusion
l_0: while (true) {
l_1:     synchronized(sharedLock) {
l_C:         // critical actions
l_2:     }
l_3: }
Two threads: (pc1, pc2, lockAcquired1, lockAcquired2)
- Allocate new lock?
- Allocate new thread?
42Program Model
- Interleaving model of concurrency
- Program is modeled as a transition system
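For the two-thread mutual exclusion example, the interleaving transition system is small enough to enumerate explicitly. The sketch below is an illustration under assumptions: the class `MutexSketch`, the encoding of a configuration as (pc of thread 0, pc of thread 1, lock holder), and the merging of l_2 and l_3 into one release step are all hypothetical simplifications. It fixes the number of threads and locks, which is precisely the restriction the logical-structure representation is meant to lift.

```java
import java.util.*;

/** Hypothetical explicit-state exploration of two threads contending for one lock. */
final class MutexSketch {
    static final int L0 = 0, L1 = 1, LC = 2, L2 = 3;   // program locations
    static final int FREE = -1;                        // lock holder when nobody holds it

    /** Successors of configuration {pc0, pc1, holder}: one step by either thread. */
    static List<int[]> successors(int[] c) {
        List<int[]> out = new ArrayList<>();
        for (int t = 0; t < 2; t++) {                  // nondeterministic scheduling
            int[] n = c.clone();
            switch (c[t]) {
                case L0:                               // enter the loop body
                    n[t] = L1;
                    break;
                case L1:                               // try to acquire the lock
                    if (c[2] != FREE) continue;        // blocked: thread t cannot move
                    n[t] = LC;
                    n[2] = t;
                    break;
                case LC:                               // finish the critical actions
                    n[t] = L2;
                    break;
                default:                               // L2: release and loop again
                    n[t] = L0;
                    n[2] = FREE;
                    break;
            }
            out.add(n);
        }
        return out;
    }

    /** All configurations reachable from the initial one, by worklist search. */
    static Set<List<Integer>> reachable() {
        Set<List<Integer>> seen = new LinkedHashSet<>();
        Deque<int[]> work = new ArrayDeque<>();
        work.add(new int[] { L0, L0, FREE });
        while (!work.isEmpty()) {
            int[] c = work.remove();
            if (!seen.add(Arrays.asList(c[0], c[1], c[2]))) continue;
            work.addAll(successors(c));
        }
        return seen;
    }

    /** Safety property: no reachable configuration has both threads at l_C. */
    static boolean mutualExclusionHolds() {
        for (List<Integer> c : reachable()) {
            if (c.get(0) == LC && c.get(1) == LC) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("reachable configurations: " + reachable().size());
        System.out.println("mutual exclusion holds:   " + mutualExclusionHolds());
    }
}
```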
43Configurations
- A program configuration encodes
- global store
- program-location of every thread
- status of locks and threads
- First-order logical structures used to represent program configurations
44Configurations
- Predicates model properties of interest
- is_thread(t)
- at[lab](t), lab ∈ Labels
- rval[fld](o1,o2), fld ∈ Fields
- held_by(l,t)
- blocked(t,l)
- waiting(t,l)
- Can use the framework with different predicates
45Configurations
[Figure: a configuration with five thread nodes (is_thread): two at[l_1], one at[l_C], two at[l_0]; edges labelled blocked, held_by and rval[this].]
46Configurations
- Program control-flow is not separately represented
- Program location for each thread is encoded inside the configuration
- at[lab](t), lab ∈ Labels
47Structural Operational Semantics - actions
- An action consists of
- precondition (when) formula
- update formulae
- Precondition formula may use a free variable ts for the currently scheduled thread
- Semantics is non-deterministic
48Structural Operational Semantics - actions
49Safety Properties
- Configuration-local property as logical formula
50Concrete Configuration
[Figure: a concrete configuration with five thread nodes (is_thread): two at[l_1], one at[l_C], two at[l_0]; edges labelled blocked, held_by and rval[this].]
51Abstract Configuration
[Figure: an abstract configuration with thread nodes (is_thread) at[l_C], at[l_1] and at[l_0]; edges labelled held_by, blocked and rval[this].]
52Safety Properties Revisited
- RW interference
- WW interference
- Total deadlock
- Nested monitors
- Illegal thread interactions
53Interprocedural Analysis (Rinetzky)
- Model the stack as a linked list (CC 2001)
- Observe alias patterns
- Handles recursion with pointers from the stack to the heap (but rather slow)
- Exploit referential transparency
- The part of the store modified by a procedure is limited
- Summarize irrelevant calling contexts
- Pre-analyze Abstract Data Types
- Analyzed parts of LEDA linked lists
54Other TVLA Applications
- Handling trees (G. Yorsh)
- Allowing data structure specification (M. Rinard)
- Refinement of implementations (A. Mulhren)
- Mobile Ambients (F. Nielson, H.R. Nielson)
55TVLA Generalizations
- ITVLA (F. Dimaio, N. Dor)
- Support arithmetic operations
- Import static domains for integer analysis
- Intervals
- Polyhedra
- Derive Update Formula for Instrumentation
Predicates (A. Loginov, T. Reps)
56Summary
- TVLA supports design and prototyping of sophisticated static analyzers
- The heap is your friend
- First-order logic specifications are natural
- Powerful static domain
- But scaling is an issue
- Interprocedural analysis
- User specification
- Predictability
- User interface
- Debugability of Operational Semantics