Title: Pointer and Shape Analysis Seminar http://www.cs.tau.ac.il/~msagiv/courses/shape.html
1 Pointer and Shape Analysis Seminar http://www.cs.tau.ac.il/~msagiv/courses/shape.html
- Mooly Sagiv
- Schriber 317
- msagiv_at_post
- Office Hours Thursday 15-16
2 General Information
- Prerequisites
- Compilers, Program Analysis
- Select 3 topics by Sunday
- Participate in 9 seminar talks
- Present a paper
3 Outline
- Schedule
- Points-to analysis
4 Tentative Schedule
- 7/2: Shachar Itzhaky, Practical virtual method call resolution for Java
- 14/2: Roy Ganor, Effective Static Race Detection for Java
- 17/2 (13-15): Hongseok Yang, Scalable Shape Analysis
- 28/2: Roza Pogalnikova, Context-Sensitive Points-to Analysis: Is It Worth It?
- 3/3: Ory Samorodnitzky, The undecidability of aliasing
- 14/3: Alex Shapiro, Error detection using client-driven pointer analysis
- 21/3: Roman Simkin, Free-Me: A Static Analysis for Automatic Individual Object Reclamation
- 28/3: Uri Inon
5 Points-To Analysis
- Determine whether a variable points to another variable on some (all) execution paths
1: p = &a;
2: q = &b;
3: if (getc())
4:     q = &c;
5:
q → b?
p → a?
q → c?
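The difference between "some" and "all" execution paths can be made concrete by enumerating the two paths of the small program above. The sketch below is only an illustration: it assumes the statements read `p = &a`, `q = &b`, and `q = &c` under the `getc()` branch, and the names `PATHS`, `final_state`, `may_point_to` and `must_point_to` are hypothetical.

```python
# Hypothetical encoding of the example: each path is the sequence of
# address-of assignments (pointer, target) executed along it.
PATHS = [
    [("p", "a"), ("q", "b"), ("q", "c")],  # getc() returned non-zero
    [("p", "a"), ("q", "b")],              # getc() returned zero
]


def final_state(path):
    """Concrete points-to pairs at line 5 after running one path."""
    state = {}
    for var, target in path:
        state[var] = target  # a later assignment overwrites an earlier one
    return set(state.items())


states = [final_state(path) for path in PATHS]
may_point_to = set.union(*states)          # holds on some path
must_point_to = set.intersection(*states)  # holds on all paths

print(sorted(may_point_to))
print(sorted(must_point_to))
```

Under these assumptions, q may point to b or c at line 5, while only p → a holds on every path.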
6 Iterative Program Analysis
- Start by optimistically assuming that nothing is wrong
- No points-to set
- At every iteration, apply the abstract meaning of programming language statements and add more points-to pairs
- Stop when no changes occur
7 Iterative Points-to Analysis
t = &a
    {t→a}
y = &b
    {t→a, y→b}
z = &c
    {t→a, y→b, z→c}
*p = t
    {t→a, y→b, z→c}  (edge to p = &y)
    {t→a, y→b, z→c}  (edge to p = &z)
p = &y
    {t→a, y→b, z→c, p→y}
p = &z
    {t→a, y→b, z→c, p→z}
join
    {t→a, y→b, z→c, p→y, p→z}
8 Iterative Points-to Analysis
t = &a
    {t→a}
y = &b
    {t→a, y→b}
z = &c
    {t→a, y→b, z→c, p→y, p→z}
*p = t
    {t→a, y→b, z→c, p→y, p→z, y→a, z→a}  (edge to p = &y)
    {t→a, y→b, z→c, p→y, p→z, y→a, z→a}  (edge to p = &z)
p = &y
    {t→a, y→b, z→c, p→y}
p = &z
    {t→a, y→b, z→c, p→z}
join
    {t→a, y→b, z→c, p→y, p→z}
9 Iterative Points-to Analysis
t = &a
    {t→a}
y = &b
    {t→a, y→b}
z = &c
    {t→a, y→b, z→c, p→y, p→z}
*p = t
    {t→a, y→b, z→c, p→y, p→z, y→a, z→a}  (edge to p = &y)
    {t→a, y→b, z→c, p→y, p→z, y→a, z→a}  (edge to p = &z)
p = &y
    {t→a, y→b, z→c, p→y, y→a, z→a}
p = &z
    {t→a, y→b, z→c, p→z, y→a, z→a}
join
    {t→a, y→b, z→c, p→y, p→z}
10 Iterative Points-to Analysis
t = &a
    {t→a}
y = &b
    {t→a, y→b}
z = &c
    {t→a, y→b, z→c, p→y, p→z, y→a, z→a}
*p = t
    {t→a, y→b, z→c, p→y, p→z, y→a, z→a}  (edge to p = &y)
    {t→a, y→b, z→c, p→y, p→z, y→a, z→a}  (edge to p = &z)
p = &y
    {t→a, y→b, z→c, p→y, y→a, z→a}
p = &z
    {t→a, y→b, z→c, p→z, y→a, z→a}
join
    {t→a, y→b, z→c, p→y, p→z, y→a, z→a}
11 A Simple Programming Language
- Arbitrary (uninterpreted) control flow statements
- Atomic statements
- x = &y
- x = y
- x = *y
- *x = y
12 Abstract Semantics
- For every atomic statement S
- ⟦S⟧ : P(Var × Var) → P(Var × Var)
- ⟦x = &y⟧(pt) = (pt − {(x, *)}) ∪ {(x, y)}
- ⟦x = y⟧(pt) = (pt − {(x, *)}) ∪ {(x, z) | (y, z) ∈ pt}
- ⟦x = *y⟧(pt) = (pt − {(x, *)}) ∪ {(x, z) | (y, w), (w, z) ∈ pt}
- ⟦*x = y⟧(pt) = pt ∪ {(w, t) | (x, w), (y, t) ∈ pt}
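The four abstract transformers can be written almost literally as operations on a set of (pointer, target) pairs. The following is a minimal sketch, not the seminar's implementation; the function names (`kill`, `assign_addr`, `assign_copy`, `assign_load`, `assign_store`) are made up for illustration.

```python
def kill(pt, x):
    """pt − {(x, *)}: drop every pair whose source is x."""
    return {(v, t) for (v, t) in pt if v != x}


def assign_addr(pt, x, y):
    """x = &y: destructive update, x now points to y only."""
    return kill(pt, x) | {(x, y)}


def assign_copy(pt, x, y):
    """x = y: x points to whatever y pointed to before the statement."""
    return kill(pt, x) | {(x, z) for (v, z) in pt if v == y}


def assign_load(pt, x, y):
    """x = *y: follow two points-to edges starting at y."""
    return kill(pt, x) | {(x, z) for (v, w) in pt if v == y
                          for (u, z) in pt if u == w}


def assign_store(pt, x, y):
    """*x = y: no kill, since x may point to several variables."""
    return pt | {(w, t) for (v, w) in pt if v == x
                 for (u, t) in pt if u == y}


pt = {("t", "a"), ("y", "b"), ("z", "c"), ("p", "y"), ("p", "z")}
after_store = assign_store(pt, "p", "t")
print(sorted(after_store - pt))
```

Applying `*p = t` to a state where p may point to y or z adds y → a and z → a and removes nothing, which is the weak update seen in the iterative example.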
13
pt1 = {}
1: pt2 = {(t, a)}
2: pt3 = {(t, a), (y, b)}
3: pt4 = {(t, a), (y, b), (z, c)}
4: pt5 = {(t, a), (y, b), (z, c)}
   pt6 = {(t, a), (y, b), (z, c)}
5: pt7 = {(t, a), (y, b), (z, c), (p, y)}
6: pt7 = {(t, a), (y, b), (z, c), (p, y), (p, z)}
7: pt4 = {(t, a), (y, b), (z, c), (p, y), (p, z)}
4: pt5 = {(t, a), (y, b), (z, c), (p, y), (p, z), (y, a), (z, a)}
   pt6 = {(t, a), (y, b), (z, c), (p, y), (p, z), (y, a), (z, a)}
5:
6:
Control flow graph (node: statement)
1: t = &a
2: y = &b
3: z = &c
4: *p = t
5: p = &y
6: p = &z
7: (join, back to 4)
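The trace above can be reproduced with a small worklist loop. This is a sketch under stated assumptions: nodes 1 to 7 are taken to be `t = &a`, `y = &b`, `z = &c`, `*p = t`, `p = &y`, `p = &z` and a join that loops back to node 4, and pt_i is read as the set at the entry of node i. The helper names (`addr`, `store`, `identity`, `CFG`, `analyze`) are hypothetical.

```python
def addr(x, y):
    """Transformer for x = &y."""
    return lambda pt: {(v, t) for (v, t) in pt if v != x} | {(x, y)}


def store(x, y):
    """Transformer for *x = y (weak update, nothing is killed)."""
    return lambda pt: pt | {(w, t) for (v, w) in pt if v == x
                            for (u, t) in pt if u == y}


def identity(pt):
    """Transformer for the join node."""
    return set(pt)


# node -> (transformer, successor nodes); layout assumed from the trace
CFG = {
    1: (addr("t", "a"), [2]),
    2: (addr("y", "b"), [3]),
    3: (addr("z", "c"), [4]),
    4: (store("p", "t"), [5, 6]),
    5: (addr("p", "y"), [7]),
    6: (addr("p", "z"), [7]),
    7: (identity, [4]),
}


def analyze(cfg, entry=1):
    pt_in = {entry: set()}  # start with no points-to pairs
    worklist = [entry]
    while worklist:  # stop when no changes occur
        node = worklist.pop()
        transfer, successors = cfg[node]
        out = transfer(pt_in[node])
        for succ in successors:
            if succ not in pt_in:
                pt_in[succ] = set(out)
                worklist.append(succ)
            elif not out <= pt_in[succ]:
                pt_in[succ] |= out  # join by set union
                worklist.append(succ)
    return pt_in


result = analyze(CFG)
print(sorted(result[4]))
```

The order in which nodes leave the worklist differs from the slide, but the sets only grow and the loop stops at the same fixed point.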
14 Supporting Memory Allocation
- Uniform treatment of the memory allocated at an allocation statement
- For every atomic statement S
- ⟦S⟧ : P(Var × Var) → P(Var × Var)
- ⟦x = &y⟧(pt) = (pt − {(x, *)}) ∪ {(x, y)}
- ⟦x = y⟧(pt) = (pt − {(x, *)}) ∪ {(x, z) | (y, z) ∈ pt}
- ⟦x = *y⟧(pt) = (pt − {(x, *)}) ∪ {(x, z) | (y, w), (w, z) ∈ pt}
- ⟦*x = y⟧(pt) = pt ∪ {(w, t) | (x, w), (y, t) ∈ pt}
- ⟦l: x = malloc()⟧(pt) = (pt − {(x, *)}) ∪ {(x, l)}
15 Summary: Flow-Sensitive Solution
- Limited destructive updates
- Can be improved with must information
- O(N · Var²) space
16 Context-Sensitivity
- How to handle procedures?
- Separate points-to sets for every call
- A uniform set for all calls
17 Context Sensitivity Example
x = &t1; a = &t2; foo(x, a);
z = &t3; b = &t4; foo(z, b);
void foo(source, target) { *source = *target; }
18 Flow-Insensitive Analysis
- Ignore control flow statements
- Arbitrary statement order
- Only accumulate points-to pairs
- Usually represented as a directed graph
- O(n²) space
19 Flow-Insensitive Solution
t = &a
y = &b
z = &c
*p = t
p = &y
p = &z
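A flow-insensitive solution for these six statements can be sketched as a loop that applies every statement, in any order, until the points-to graph stops growing. The sketch assumes the statements read `t = &a`, `y = &b`, `z = &c`, `*p = t`, `p = &y`, `p = &z`; the statement encoding and the name `flow_insensitive` are illustrative only.

```python
from collections import defaultdict

# (kind, x, y) triples; "addr" is x = &y and "store" is *x = y
STATEMENTS = [
    ("addr", "t", "a"),
    ("addr", "y", "b"),
    ("addr", "z", "c"),
    ("store", "p", "t"),
    ("addr", "p", "y"),
    ("addr", "p", "z"),
]


def flow_insensitive(statements):
    graph = defaultdict(set)  # edge x -> y means "x may point to y"
    changed = True
    while changed:
        changed = False
        for kind, x, y in statements:
            if kind == "addr":      # x = &y
                new = {(x, y)}
            elif kind == "copy":    # x = y
                new = {(x, z) for z in graph[y]}
            elif kind == "load":    # x = *y
                new = {(x, z) for w in graph[y] for z in graph[w]}
            elif kind == "store":   # *x = y
                new = {(w, t) for w in graph[x] for t in graph[y]}
            else:
                raise ValueError(f"unknown statement kind: {kind}")
            for src, dst in new:    # only accumulate, never remove
                if dst not in graph[src]:
                    graph[src].add(dst)
                    changed = True
    return {var: targets for var, targets in graph.items() if targets}


graph = flow_insensitive(STATEMENTS)
for var in sorted(graph):
    print(var, "->", sorted(graph[var]))
```

A single graph is kept for the whole program, so the result is independent of statement order and never kills an edge.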
20 Set Constraints
- A set of rules of the form
- lhs ⊆ rhs
- t ∈ rhs' ⇒ lhs ⊆ rhs (conditional constraint)
- lhs, rhs, rhs' are variables over sets of terms
- t is a term
- The least solution can be found iteratively
- start with empty sets
- add terms when needed
- Cubic-time graph-based solution
21
t = &a            a ∈ pt[t]
y = &b            b ∈ pt[y]
z = &c            c ∈ pt[z]
if (nondet())
    p = &y        y ∈ pt[p]
else
    p = &z        z ∈ pt[p]
*p = t            a ∈ pt[p] ⇒ pt[t] ⊆ pt[a]
                  b ∈ pt[p] ⇒ pt[t] ⊆ pt[b]
                  c ∈ pt[p] ⇒ pt[t] ⊆ pt[c]
                  y ∈ pt[p] ⇒ pt[t] ⊆ pt[y]
                  z ∈ pt[p] ⇒ pt[t] ⊆ pt[z]
                  t ∈ pt[p] ⇒ pt[t] ⊆ pt[t]
                  p ∈ pt[p] ⇒ pt[t] ⊆ pt[p]
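The constraints for this program can be solved by the naive iteration described on the previous slide: start with empty sets and add terms until nothing changes. The sketch below assumes the constraints read "term ∈ pt[var]" for the address-of statements and "v ∈ pt[p] ⇒ pt[t] ⊆ pt[v]" for `*p = t`. It is a plain fixed-point loop, not the cubic graph-based algorithm, and all names in it are hypothetical.

```python
def solve(variables, facts, conditionals):
    """Least solution of the constraints by naive iteration.

    facts:        (term, var) pairs, meaning term ∈ pt[var]
    conditionals: (term, cond, lhs, rhs) tuples, meaning
                  term ∈ pt[cond] ⇒ pt[lhs] ⊆ pt[rhs]
    """
    pt = {var: set() for var in variables}  # start with empty sets
    for term, var in facts:
        pt[var].add(term)
    changed = True
    while changed:
        changed = False
        for term, cond, lhs, rhs in conditionals:
            if term in pt[cond] and not pt[lhs] <= pt[rhs]:
                pt[rhs] |= pt[lhs]  # add terms when needed
                changed = True
    return pt


VARIABLES = ["a", "b", "c", "t", "y", "z", "p"]
FACTS = [("a", "t"), ("b", "y"), ("c", "z"), ("y", "p"), ("z", "p")]
# *p = t yields one conditional constraint per variable v
CONDITIONALS = [(v, "p", "t", v) for v in VARIABLES]

solution = solve(VARIABLES, FACTS, CONDITIONALS)
for var in VARIABLES:
    print(var, sorted(solution[var]))
```

Only the conditionals for y and z ever fire, because pt[p] contains exactly y and z in the least solution.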
22 Unification-Based Solution (Steensgaard 1996)
- Treat assignments as equalities
- Employ union-find algorithm
- Almost linear time complexity
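A minimal sketch of the unification idea follows, assuming the usual formulation in which every equivalence class of variables has at most one pointed-to class and an assignment merges classes instead of adding subset edges. The class name `Unifier`, its methods, and the placeholder "tmp" nodes are illustrative and not taken from the paper.

```python
class Unifier:
    def __init__(self):
        self.parent = {}  # union-find parent links
        self.target = {}  # class representative -> node it points to
        self.fresh = 0

    def find(self, node):
        self.parent.setdefault(node, node)
        while self.parent[node] != node:
            self.parent[node] = self.parent[self.parent[node]]  # path halving
            node = self.parent[node]
        return node

    def pointee(self, node):
        """Class pointed to by node's class, created on demand."""
        rep = self.find(node)
        if rep not in self.target:
            self.fresh += 1
            self.target[rep] = ("tmp", self.fresh)  # placeholder node
        return self.find(self.target[rep])

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[a] = b
        moved = self.target.pop(a, None)
        if moved is not None:
            if b in self.target:
                self.union(moved, self.target[b])  # merge pointees as well
            else:
                self.target[b] = moved

    def addr(self, x, y):   # x = &y
        self.union(self.pointee(x), y)

    def copy(self, x, y):   # x = y
        self.union(self.pointee(x), self.pointee(y))

    def load(self, x, y):   # x = *y
        self.union(self.pointee(x), self.pointee(self.pointee(y)))

    def store(self, x, y):  # *x = y
        self.union(self.pointee(self.pointee(x)), self.pointee(y))

    def points_to(self, x, names):
        rep = self.find(x)
        if rep not in self.target:
            return set()
        cls = self.find(self.target[rep])
        return {name for name in names if self.find(name) == cls}


NAMES = ["a", "b", "c", "t", "y", "z", "p"]
u = Unifier()
u.addr("t", "a")
u.addr("y", "b")
u.addr("z", "c")
u.store("p", "t")
u.addr("p", "y")
u.addr("p", "z")
print(sorted(u.points_to("p", NAMES)), sorted(u.points_to("t", NAMES)))
```

On the running example (statements assumed as `t = &a`, `y = &b`, `z = &c`, `*p = t`, `p = &y`, `p = &z`), y and z end up in one class and a, b, c in another, so t is reported as possibly pointing to a, b or c. The speed comes at the cost of precision compared with the subset-based solution.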
23 Conclusions
- Points-to analysis is a simple pointer analysis problem
- Effective solutions (8 MLoc)
- But rather imprecise
- Set constraints are useful beyond pointer analysis
- Class level analysis