Title: Compositional Pointer and Escape Analysis for Java Programs
1 Compositional Pointer and Escape Analysis for Java Programs
- Martin Rinard, Laboratory for Computer Science, MIT
- John Whaley, IBM Tokyo Research Laboratory
2
- Analysis
  - Points-to information
  - Escape information
- Optimizations
  - Stack allocation
  - Synchronization elimination
3 Outline
- Example
- Analysis Algorithm
- Optimizations
- Experimental Results
- Conclusion
4 Employee Database Example
- Read in database of employee records
- Extract statistics like max salary
[Figure, slides 4-5: an employee record with Name and Salary fields]
6 Computing Max Salary
- Traverse Records to Find Max Salary
[Figure, slides 6-10: a Vector of employee records, traversed one record per slide]
Name      Salary
John Doe  45,000
Ben Bit   30,000
Jane Roe  55,000
- Start: max = 0
- After John Doe: max = 45,000, who = John Doe
- After Ben Bit: max = 45,000, who = John Doe
- After Jane Roe: max = 55,000, who = Jane Roe
- Result: max salary = 55,000, highest paid = Jane Roe
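11 Coding Max Computation
class EmployeeDatabase {
  Vector database = new Vector();
  Employee highestPaid;
  void computeMax() {
    int max = 0;
    // "enum" was a legal identifier when these slides were written;
    // it has been a reserved word since Java 5.
    Enumeration enum = database.elements();
    while (enum.hasMoreElements()) {
      // nextElement() returns Object, so a cast is needed
      Employee e = (Employee) enum.nextElement();
      if (max < e.salary()) {
        max = e.salary(); highestPaid = e;
      }
    }
  }
}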
13 Issues In Implementation
- Enumeration object allocated on heap
  - Increases heap memory usage
  - Increases garbage collection frequency
- Heap allocation is unnecessary
  - Enumeration object allocated inside computeMax
  - Not accessible outside computeMax
  - Should be able to use stack allocation
14 Key Concept
- Enumeration object is captured in computeMax
  - Allocated inside computation of method
  - Inaccessible to callers of method
  - Within method, object does not escape to another thread
- So object is dead when method returns
- Can allocate captured objects on stack
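The difference between a captured and an escaped object can be seen in a small sketch. The class `CaptureDemo`, its static field `leaked`, and both method names are hypothetical and chosen only for illustration; the sketch also assumes that the library's `iterator()` does not itself publish the iterator.

```java
import java.util.Iterator;
import java.util.List;

class CaptureDemo {
    // Hypothetical static field: whatever is stored here stays reachable
    // after the method returns, and possibly from other threads.
    static Iterator<Integer> leaked;

    // The iterator is allocated during this call, never stored in the heap
    // and never returned, so it is captured and dead when the method returns.
    static int maxCaptured(List<Integer> salaries) {
        int max = 0;
        Iterator<Integer> it = salaries.iterator();
        while (it.hasNext()) {
            int s = it.next();
            if (max < s) max = s;
        }
        return max;
    }

    // Same traversal, but the iterator is stored in a static field,
    // so it escapes and has to stay on the heap.
    static int maxEscaping(List<Integer> salaries) {
        int max = 0;
        Iterator<Integer> it = salaries.iterator();
        leaked = it;
        while (it.hasNext()) {
            int s = it.next();
            if (max < s) max = s;
        }
        return max;
    }
}
```

Both methods compute the same result; only the first one leaves the iterator eligible for stack allocation.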
15 How is Database Used?
void printStatistics() {
  BufferedReader r = new BufferedReader(
      new InputStreamReader(System.in));
  EmployeeDatabase e = new EmployeeDatabase(r);
  e.computeMax();
  System.out.println("max salary " + e.highestPaid);
}
16 Interesting Fact
- In printStatistics(), only one thread accesses employee database
- But accesses are synchronized!
  - Employee e = (Employee) enum.nextElement();
17 Synchronization in nextElement
class VectorEnumerator implements Enumeration {
  Vector vector;
  int count;
  // hasMoreElements() is not shown on the slide
  public Object nextElement() {
    synchronized (vector) {
      if (count < vector.elementCount)
        return vector.elementData[count++];
    }
    throw new NoSuchElementException();
  }
}
18 Synchronized Libraries
- nextElement is in the Java class library
- Java is a multithreaded language
- By default, libraries are synchronized
- Result: lots of unnecessary synchronization for objects that are accessed by only one thread
19 Eliminating Unnecessary Synchronization
- Captured objects are accessible to only one thread
- Can eliminate synchronization on captured objects
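As an illustration of why the locks are redundant, the sketch below writes the same computation twice by hand: once locking a list that never leaves the method, once without the locks. The class `SyncDemo` and its methods are hypothetical; this shows the effect of the transformation, not how the compiler implements it.

```java
import java.util.ArrayList;
import java.util.List;

class SyncDemo {
    // Original shape: every access takes the lock, as a synchronized
    // library class would, although `local` never leaves this method.
    static int sumLocked(int n) {
        List<Integer> local = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            synchronized (local) { local.add(i); }
        }
        int sum = 0;
        for (int i = 0; i < local.size(); i++) {
            synchronized (local) { sum += local.get(i); }
        }
        return sum;
    }

    // Hand-transformed shape: `local` is captured, so no other thread can
    // ever contend for its lock and the synchronized blocks can be dropped.
    static int sumUnlocked(int n) {
        List<Integer> local = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            local.add(i);
        }
        int sum = 0;
        for (int i = 0; i < local.size(); i++) {
            sum += local.get(i);
        }
        return sum;
    }
}
```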
20 Basic Idea
- Use pointer and escape analysis to recognize captured objects
- Transform program to
  - Allocate captured objects on stack
  - Eliminate synchronization on captured objects
21 Analysis Overview
- Interprocedural analysis
- Compositional analysis
- Driving Principle
  - Explicitly represent potential interactions of method with its environment (callers and other threads)
22 Points-to Escape Graph in Example
[Figure: points-to escape graph for computeMax() (code as on slide 11). Labels: this, database, highestPaid, enum, vector, elementData, e. Legend: dotted = outside, solid = inside.]
23 Definitions: node types
- $N_I$ = inside nodes
  - represent objects created within the computation of the method
  - one inside node for each object creation site; represents all objects created at site
- $N_O$ = outside nodes
  - represent objects created outside of the computation of the method
24 Definitions: outside node types
- $N_P$ = parameter nodes
  - represent objects passed as incoming parameters
- $N_L$ = load nodes
  - one load node for each load statement in method
  - represents objects loaded from an escaped node
- $N_{CL}$ = class nodes
  - node from which static variables are accessed
26 Points-to Escape Graph in Example
[Figure: points-to escape graph for computeMax() (code as on slide 11). Labels: this, database, highestPaid, enum, vector, elementData, e. Legend: dotted = outside, solid = inside, red = escaped, white = captured.]
27 Escaped nodes
- Escaped nodes
  - parameter nodes
  - class nodes
  - thread nodes
  - nodes in return set
  - nodes reachable from other escaped nodes
- captured is the opposite of escaped
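The definition above amounts to a reachability computation. The following is a minimal sketch under simplifying assumptions: nodes are plain strings, field labels on edges are dropped, and the directly escaped nodes (parameter, class, thread and returned nodes) are supplied by the caller as `roots`. The class `EscapeGraph` and its members are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class EscapeGraph {
    // node -> nodes it points to (field labels omitted in this sketch)
    final Map<String, Set<String>> edges = new HashMap<>();
    // directly escaped nodes: parameter, class, thread and returned nodes
    final Set<String> roots = new HashSet<>();

    void addEdge(String from, String to) {
        edges.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    // A node is escaped if it is a root or reachable from an escaped node.
    Set<String> escaped() {
        Set<String> seen = new HashSet<>(roots);
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String n = work.pop();
            for (String m : edges.getOrDefault(n, Collections.<String>emptySet())) {
                if (seen.add(m)) work.push(m);
            }
        }
        return seen;
    }

    // Captured is the opposite of escaped.
    boolean captured(String node) {
        return !escaped().contains(node);
    }
}
```

In a graph shaped like the computeMax example, an enumeration node that points into the escaped vector is itself still captured, because nothing escaped points to it.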
28 Dataflow Analysis
- Computes a points-to escape graph for each program point
- Points-to escape graph is a triple $\langle I, O, e \rangle$
  - $I$ - set of inside edges
  - $O$ - set of outside edges
  - $e$ - escape function
29 Dataflow Analysis
- Initial state
  - $I$: formals point to parameter nodes, classes point to class nodes
  - $O = \emptyset$
- Transfer functions
  - $I' = (I - Kill_I) \cup Gen_I$
  - $O' = O \cup Gen_O$
- Confluence operator is $\cup$
30 Intraprocedural Analysis
- Must define transfer functions for
  - copy statement: l = v
  - load statement: l1 = l2.f
  - store statement: l1.f = l2
  - return statement: return l
  - object creation site: l = new cl
  - method invocation: l = l0.op(l1, ..., lk)
31 copy statement: l = v
- $Kill_I = edges(I, l)$
- $Gen_I = \{l\} \times succ(I, v)$
- $I' = (I - Kill_I) \cup Gen_I$
[Figure, slides 31-32: existing edges, then generated edges, for l and v]
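A minimal sketch of the copy transfer function, assuming the inside edges that leave variables are stored as a map from variable name to the set of node names it points to. The class `CopyTransfer` and this representation are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class CopyTransfer {
    // Returns I' = (I - Kill_I) U Gen_I for the statement l = v,
    // with Kill_I = edges(I, l) and Gen_I = {l} x succ(I, v).
    static Map<String, Set<String>> copy(Map<String, Set<String>> in, String l, String v) {
        Map<String, Set<String>> out = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : in.entrySet()) {
            out.put(e.getKey(), new HashSet<>(e.getValue()));
        }
        // succ(I, v) is read from the input graph, so l = l is handled too
        Set<String> gen = new HashSet<>(in.getOrDefault(v, Collections.<String>emptySet()));
        // replacing l's entry both kills its old edges and adds the new ones
        out.put(l, gen);
        return out;
    }
}
```

The input map is left untouched, which keeps the graphs at different program points independent.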
33 load statement: l1 = l2.f
- $S_E = \{ n_2 \in succ(I, l_2) \mid escaped(n_2) \}$
- $S_I = \bigcup \{ succ(I, n_2, f) \mid n_2 \in succ(I, l_2) \}$
- case 1: l2 does not point to an escaped node ($S_E = \emptyset$)
  - $Kill_I = edges(I, l_1)$
  - $Gen_I = \{l_1\} \times S_I$
[Figure, slides 33-34: existing edges, then generated edges, for l1, l2 and field f]
35 load statement: l1 = l2.f
- case 2: l2 does point to an escaped node ($S_E \neq \emptyset$)
  - $Kill_I = edges(I, l_1)$
  - $Gen_I = \{l_1\} \times (S_I \cup \{n\})$
  - $Gen_O = (S_E \times \{f\}) \times \{n\}$
  - here $n$ is the load node for this load statement
[Figure, slides 35-36: existing edges, then generated edges, for l1, l2 and field f]
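Both load cases can be sketched in one method. The representation is an assumption: variable edges are a map from variable to nodes, field edges are keyed by the string "node.f", and the set of escaped nodes is taken as precomputed input rather than derived from the escape function. `LoadTransfer` and `loadNode` are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class LoadTransfer {
    final Map<String, Set<String>> vars = new HashMap<>();          // inside edges from variables
    final Map<String, Set<String>> insideFields = new HashMap<>();  // "node.f" -> nodes
    final Map<String, Set<String>> outsideFields = new HashMap<>(); // "node.f" -> load nodes
    final Set<String> escaped = new HashSet<>();                    // assumed precomputed

    // l1 = l2.f, where loadNode is the load node n for this statement.
    void load(String l1, String l2, String f, String loadNode) {
        Set<String> sE = new HashSet<>();
        Set<String> sI = new HashSet<>();
        for (String n2 : vars.getOrDefault(l2, Collections.<String>emptySet())) {
            if (escaped.contains(n2)) sE.add(n2);
            sI.addAll(insideFields.getOrDefault(n2 + "." + f, Collections.<String>emptySet()));
        }
        Set<String> gen = new HashSet<>(sI);
        if (!sE.isEmpty()) {
            // case 2: the loaded object may come from outside the method
            gen.add(loadNode);
            for (String n2 : sE) {
                outsideFields.computeIfAbsent(n2 + "." + f, k -> new HashSet<>()).add(loadNode);
            }
        }
        // kill the old edges of l1 and add Gen_I
        vars.put(l1, gen);
    }
}
```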
37 store statement: l1.f = l2
- $Gen_I = (succ(I, l_1) \times \{f\}) \times succ(I, l_2)$
- $I' = I \cup Gen_I$
[Figure, slides 37-38: existing edges, then generated edges, for l1, l2 and field f]
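The store rule has no kill set, so existing field edges survive. A minimal sketch, again assuming variable edges as a map and field edges keyed by the string "node.f" (the class `StoreTransfer` is a hypothetical name):

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class StoreTransfer {
    // l1.f = l2: add an f-edge from every node l1 may point to
    // to every node l2 may point to. Nothing is removed.
    static void store(Map<String, Set<String>> vars,
                      Map<String, Set<String>> fields,
                      String l1, String f, String l2) {
        Set<String> targets = vars.getOrDefault(l2, Collections.<String>emptySet());
        for (String n1 : vars.getOrDefault(l1, Collections.<String>emptySet())) {
            fields.computeIfAbsent(n1 + "." + f, k -> new HashSet<>()).addAll(targets);
        }
    }
}
```

Keeping the old edges is the conservative choice when l1 may point to more than one node.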
39 object creation site: l = new cl
- $Kill_I = edges(I, l)$
- $Gen_I = \{ \langle l, n \rangle \}$
- here $n$ is the inside node for this object creation site
[Figure, slides 39-40: existing edges, then generated edges, for l]
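A short sketch of the creation rule, assuming each allocation site is identified by a string and its inside node is named "inside@" plus that identifier. Both the naming scheme and the class `CreationTransfer` are assumptions for illustration.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class CreationTransfer {
    // l = new cl at allocation site `site`: kill l's old edges and
    // point l at the single inside node that stands for the site.
    static void create(Map<String, Set<String>> vars, String l, String site) {
        Set<String> gen = new HashSet<>();
        gen.add("inside@" + site);
        vars.put(l, gen);
    }
}
```

Applying the rule twice for the same site, as happens for an allocation inside a loop, yields the same node both times, which is how one inside node summarizes all objects created at the site.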
41 Method call
- Transfer function for method call
  - Take points-to escape graph before the call site
  - Retrieve the points-to escape graph from analysis of callee
  - Map callee graph into caller graph
  - Result is the points-to escape graph after the call site
42 Interprocedural Mapping
- Set up an abstract mapping between caller and callee
  - outside nodes in the callee may refer to any number of inside nodes in the caller
- add all reachable inside edges from callee's graph into caller's graph
- outside edges from a node in the callee need to be added to the mapped caller node if it escapes
43 Interprocedural Mapping Example
void printStatistics() {
  BufferedReader r = new BufferedReader(
      new InputStreamReader(System.in));
  EmployeeDatabase e = new EmployeeDatabase(r);
  e.computeMax();
  System.out.println("max salary " + e.highestPaid);
}
44 Interprocedural Mapping Example
[Figure: graph before the call site e.computeMax() in printStatistics(). Labels: e, database, elementData.]
45 Interprocedural Mapping Example
[Figure: graph before call site (labels: e, database, elementData) next to callee graph (labels: this, database, elementData, highestPaid).]
Enum object is not present because it was captured in the callee.
46 Step 1: Map formals to actuals
[Figure, slides 46-47: graph before call site (labels: e, database, elementData) and callee graph (labels: this, database, elementData, highestPaid); after the step, the callee's this no longer appears and its database and highestPaid edges are drawn with the caller's graph.]
48 Step 2: Match callee outside edges against caller inside edges
[Figure, slides 48-53: graph before call site (labels: e, database, elementData) and the remaining callee graph; the callee's database and elementData outside edges are matched in turn against the caller's inside edges, leaving highestPaid.]
54 Step 3: Add nodes and edges and update escape function
[Figure, slides 54-55: final graph. Labels: e, database, elementData, highestPaid.]
Escaped nodes and edges in callee may be recaptured in caller!
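The three steps can be sketched in a heavily simplified form. The sketch below only builds the node mapping (steps 1 and 2) and translates callee inside edges through it (part of step 3); it does not update the escape function or carry over outside edges. `MappingSketch`, `Edge`, `mapNodes` and `translate` are hypothetical names, and nodes are plain strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class MappingSketch {
    // An edge from --field--> to
    static final class Edge {
        final String from, field, to;
        Edge(String from, String field, String to) {
            this.from = from; this.field = field; this.to = to;
        }
    }

    static Map<String, Set<String>> mapNodes(Map<String, Set<String>> formalsToActuals,
                                             List<Edge> calleeOutside,
                                             List<Edge> callerInside) {
        // Step 1: callee parameter nodes map to the nodes the actuals point to.
        Map<String, Set<String>> mu = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : formalsToActuals.entrySet()) {
            mu.put(e.getKey(), new HashSet<>(e.getValue()));
        }
        // Step 2: match callee outside edges against caller inside edges
        // until the mapping stops growing.
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Edge o : calleeOutside) {
                Set<String> sources =
                    new HashSet<>(mu.getOrDefault(o.from, Collections.<String>emptySet()));
                for (Edge i : callerInside) {
                    if (i.field.equals(o.field) && sources.contains(i.from)
                            && mu.computeIfAbsent(o.to, k -> new HashSet<>()).add(i.to)) {
                        changed = true;
                    }
                }
            }
        }
        return mu;
    }

    // Step 3 (partial): rewrite callee inside edges in terms of caller nodes.
    // Unmapped callee nodes are carried over under their own name.
    static List<Edge> translate(List<Edge> calleeInside, Map<String, Set<String>> mu) {
        List<Edge> out = new ArrayList<>();
        for (Edge e : calleeInside) {
            for (String a : mu.getOrDefault(e.from, Collections.singleton(e.from))) {
                for (String b : mu.getOrDefault(e.to, Collections.singleton(e.to))) {
                    out.add(new Edge(a, e.field, b));
                }
            }
        }
        return out;
    }
}
```

With a caller graph shaped like the printStatistics example, the callee's highestPaid edge ends up between caller nodes only, which is the situation in which the caller can recapture what escaped in the callee.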
56 Life is not so Simple
- Dependences between phases
- Mapping best framed as constraint satisfaction problem
- Solved using constraint satisfaction
57 Algorithm Features
- Partial program analysis
  - can analyze libraries independent of callers
  - can analyze a method without analyzing the methods it invokes
- Appropriate for a dynamic compiler
- Support for partial program analysis
  - Analyze libraries offline and use results
58 Stack Allocation of Captured Objects
- Objects that are captured can be allocated in stack frame of capturing method
[Figure, slides 58-59: stack before and stack after. Labels: fp, max = 0, e = null, enum, obj header, vect = null, index = 0.]
60 Stack Allocation of Captured Objects
- Stack allocation can extend lifetime of objects
- Allocations of arbitrary size are not transformed
  - Allocation sites inside loops
  - Arbitrary-sized arrays
61 Object captured within a caller
- Call path to the allocating method is specialized
- Add an extra parameter: the location in the caller's stack frame where the object is to be put
- Object allocation is changed to allocate the object at the given location, rather than on the heap
62 Specialization policy
- Distance in the call graph between capturing method and allocating method may be large
- Implemented solution: full specialization
  - Specialized versions of all methods on call chain from capturing method to allocating method
  - Does not seem to blow up in practice
63 Removing Synchronization on Captured Objects
- Identify lock-unlock pairs that can only operate on captured objects
- Augment analysis with sets of nodes for each lock-unlock pair
- If all nodes in a set are captured, then the lock-unlock pair can be removed
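The removal test itself is a set check. A minimal sketch, assuming each lock-unlock pair has a string identifier, the analysis has already produced the set of nodes each pair may operate on, and the escaped nodes are known. `SyncRemoval` and `removablePairs` are hypothetical names.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SyncRemoval {
    // nodesPerPair: lock-unlock pair id -> nodes the pair may operate on.
    // A pair is removable when none of its nodes is escaped,
    // that is, when all of them are captured.
    static Set<String> removablePairs(Map<String, Set<String>> nodesPerPair,
                                      Set<String> escaped) {
        Set<String> removable = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : nodesPerPair.entrySet()) {
            if (Collections.disjoint(e.getValue(), escaped)) {
                removable.add(e.getKey());
            }
        }
        return removable;
    }
}
```

A single escaped node in a pair's set is enough to keep the pair, since the lock might then be contended.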
64 Experimental Results
- Implemented in the Jalapeño JVM
- Analysis performed on the entire VM
  - Including the analysis code itself!
65 Synch Operations Eliminated
66 Stack Allocated Objects
67 Size of Stack Allocated Objects
68 Related Work
- Pointer Analysis for Sequential Programs
  - Chatterjee, Ryder, Landi (POPL 99)
  - Sathyanathan & Lam (LCPC 96)
  - Steensgaard (POPL 96)
  - Wilson & Lam (PLDI 95)
  - Emami, Ghiya, Hendren (PLDI 94)
  - Choi, Burke, Carini (POPL 93)
69 Related Work
- Pointer Analysis for Multithreaded Programs
  - Rugina and Rinard (PLDI 99) (fork-join parallelism, not compositional)
  - We have extended our points-to analysis for multithreaded programs (irregular, thread-based concurrency, compositional)
- Escape Analysis
  - Blanchet (POPL 98)
  - Deutsch (POPL 90, POPL 97)
  - Park & Goldberg (PLDI 92)
70 Related Work
- Synchronization Optimizations
  - Diniz & Rinard (LCPC 96, POPL 97)
  - Plevyak, Zhang, Chien (POPL 95)
  - Aldrich, Chambers, Sirer, Eggers (SAS 99)
  - Blanchet (OOPSLA 99)
  - Bogda, Hoelzle (OOPSLA 99)
  - Choi, Gupta, Serrano, Sreedhar, Midkiff (OOPSLA 99)
71 Conclusion
- Combined points-to and escape analysis
  - Interprocedural
  - Compositional
  - Inside and outside nodes/edges
- Implemented in Jalapeño
- Encouraging experimental results