1
Connectivity-Based Garbage Collection
  • Martin Hirzel, University of Colorado at Boulder
  • Collaborators: Amer Diwan, Michael Hind, Hal
    Gabow, Johannes Henkel, Matthew Hertz

2
Garbage Collection Benefits
  • Garbage collection leads to simpler
    • Design → no complex deallocation protocols
    • Implementation → automatic deallocation
    • Maintenance → fewer bugs
  • Benefits are widely accepted
    • Java, C#, Python, ...

3
Garbage Collection: Haven't we solved this problem yet?
  • For a state-of-the-art garbage collector
    • Time: 14% of execution time
    • Space: 3x high watermark
    • Pauses: 0.8 seconds
  • Can reduce any one cost
  • Challenge: reduce all three costs

4
Example Heap
[Figure: example heap. Boxes are heap objects (o1 to o15), arrows are pointers, and the long box holds the stack (s1, s2) and global variables (g).]
5
Thesis
  1. Objects form distinct data structures
  2. Connected objects die together
  3. Garbage collectors can exploit 1. and 2. to
    reclaim objects efficiently

[Figure: example heap with objects o1 to o15, stack, and globals.]
6
Experimental Infrastructure
  • Jikes RVM (Research Virtual Machine)
    • From IBM Research
    • Written in Java
    • Application and runtime system share heap
      → Good garbage collection even more important
  • Benchmarks
    • SPECjvm98 suite and SPECjbb2000
    • Java Olden suite
    • xalan, ipsixql, nfc, jigsaw

7
Outline
  • Garbage Collector Design Principles
  • Family of Garbage Collectors
  • Design Space Exploration
  • Pointer Analysis for Java

8
Garbage Collector Design Principles: Do partial collections.
  • Don't collect the full heap every time
    → Shorter pause times

[Figure: example heap with objects o1 to o15, stack, and globals.]
9
Garbage Collector Design Principles: Predict lifetime based on age.
  • Generational hypothesis: most objects die young
  • Generational garbage collection
    • Partition by age
    • Collect young objects most often
    → Low time overhead
  • That's the state of the art.

[Figure: example heap split into a young generation and an old generation.]
10
Garbage Collector Design Principles: Generational GC Problems
  • Regular full collections → long peak pause
  • Old-to-young pointers → need bookkeeping
  • 37.5% long-lived objects → pig in the python

[Figure: example heap split into a young generation and an old generation.]
11
Garbage Collector Design Principles: Collect connected objects together.
Likelihood that two objects die at the same time:
  Connectivity          Likelihood
  Any pair              33.1%
  Weakly connected      46.3%
  Strongly connected    72.4%
  Direct pointer        76.4%
[The Example column showed a small diagram of objects o1 and o2 for each kind of connectivity.]
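The four kinds of connectivity in the table can be made concrete on a pointer graph. The sketch below is only an illustration: the function names, the toy heap, and the rule of reporting the first matching row (checked from the bottom of the table upward) are assumptions, not definitions taken from the talk.

```python
from collections import deque

def reachable(graph, start):
    """Return the set of objects reachable from start by following pointers."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen

def undirected(graph):
    """Copy of graph in which every pointer can also be followed backwards."""
    both = {}
    for src, dsts in graph.items():
        both.setdefault(src, set())
        for dst in dsts:
            both[src].add(dst)
            both.setdefault(dst, set()).add(src)
    return both

def connectivity(graph, a, b):
    """Classify the pair (a, b), trying the rows of the table bottom-up."""
    if b in graph.get(a, ()) or a in graph.get(b, ()):
        return "direct pointer"
    if b in reachable(graph, a) and a in reachable(graph, b):
        return "strongly connected"
    if b in reachable(undirected(graph), a):
        return "weakly connected"
    return "any pair"

# Hypothetical heap: a -> b -> c -> d -> a is a cycle, x points into it, y is isolated.
heap = {"a": {"b"}, "b": {"c"}, "c": {"d"}, "d": {"a"}, "x": {"c"}, "y": set()}
print(connectivity(heap, "a", "c"))
```

Here a and c have no direct pointer between them but lie on a common cycle, so the pair is classified as strongly connected.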
12
Garbage Collector Design Principles: Focus on objects with few ancestors.
  Lifetime    Median number of ancestor objects
  Short       2 objects
  Long        83,324 objects
→ Short-lived objects are easy to collect
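Counting ancestors amounts to a backwards reachability search. The following is a minimal sketch that assumes an ancestor of an object is any heap object from which it can be reached through one or more pointers; whether the talk also counts root variables is not stated, so roots are left out. The heap and the function name are made up for illustration.

```python
def ancestors(graph, target):
    """Objects from which target is reachable via one or more pointers."""
    # Invert the pointer graph once so we can walk it backwards.
    preds = {}
    for src, dsts in graph.items():
        for dst in dsts:
            preds.setdefault(dst, set()).add(src)
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for pred in preds.get(node, ()):
            if pred not in seen:
                seen.add(pred)
                stack.append(pred)
    return seen

# Hypothetical heap: o1 -> o2 -> o3 and o4 -> o3.
heap = {"o1": {"o2"}, "o2": {"o3"}, "o4": {"o3"}, "o3": set()}
print(len(ancestors(heap, "o3")))
```

An object with few ancestors can be proven dead by examining only a small part of the heap, which is one way to read the slide's conclusion.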
13
Garbage Collector Design Principles: Predict lifetime based on roots.
                               Lifetime (% of objects)
  Objects reachable            Short    Long
  indirectly from stack        25.6     16.2
  only directly from stack     32.9      0.8
  from globals                  4.0     20.5
  Total                        62.5     37.5
[Figure: objects o1 to o4 with a stack variable s and a global g.]
For details, see our ISMM02 paper.
14
Outline
  • Garbage Collector Design Principles
  • Family of Garbage Collectors
  • Design Space Exploration
  • Pointer Analysis for Java

15
CBGC Family of Garbage Collectors: Connectivity-Based Garbage Collection
  • Do partial collections.
  • Collect connected objects together.
  • Predict lifetime based on age.
  • Focus on objects with few ancestors.
  • Predict lifetime based on roots.

[Figure: example heap divided into partitions p1 to p4.]
16
Family of Garbage Collectors: Components of CBGC
  • Before allocation
    • Partitioning: decide into which partition to put
      each object
  • Collection algorithm
    • Estimator: estimate dead and live objects for each
      partition
    • Chooser: choose a good set of partitions
    • Partial collection: collect chosen partitions

17
Family of Garbage Collectors: Partitioning Problem
  • Find fine-grained partitions, where
    • Partition edges respect pointers
    • Objects don't move between partitions

[Figure: example heap divided into partitions p1 to p4.]
18
Family of Garbage Collectors: Partitioning Solutions
  • Pointer analysis
    • Type-based [Harris]
      • o1 may point to o2 if o1 has a field of a type
        compatible with o2
    • Constraint-based [Andersen]
      • We will discuss this later in the talk

[Figure: example heap divided into partitions p1 to p4.]
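The type-based rule (o1 may point to o2 if o1 has a field of a type compatible with o2) can be sketched as a computation of partition edges. This sketch assumes one partition per class, single inheritance, and no arrays or interfaces; the class names and both helper functions are hypothetical and are not the talk's implementation.

```python
def is_subtype(hierarchy, sub, sup):
    """True if sub equals sup or inherits from it (single inheritance)."""
    while sub is not None:
        if sub == sup:
            return True
        sub = hierarchy.get(sub)
    return False

def may_point_to(hierarchy, field_types):
    """Edges (A, B): an object of class A may hold a pointer to an object of class B."""
    edges = set()
    for owner in hierarchy:
        cls = owner
        while cls is not None:  # own fields plus inherited ones
            for field_type in field_types.get(cls, ()):
                for target in hierarchy:
                    if is_subtype(hierarchy, target, field_type):
                        edges.add((owner, target))
            cls = hierarchy.get(cls)
    return edges

# Hypothetical classes: each maps to its superclass (None for the root).
hierarchy = {"Object": None, "Node": "Object", "Leaf": "Node", "Table": "Object"}
# Declared reference-field types per class.
field_types = {"Node": ["Node"], "Table": ["Node"]}
edges = may_point_to(hierarchy, field_types)
print(sorted(edges))
```

In this toy program no class has a field that could refer to a Table, so the Table partition has no predecessors among the heap partitions.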
19
Family of Garbage Collectors: Estimator Problem
  • For each partition, guess
    • dead
      • Objects that can be reclaimed
      • Pay-off
    • live
      • Objects that must be traversed
      • Cost

[Figure: partitions labeled with estimates. p1: 1 dead, 2 live; p2: 3 dead, 3 live; p3: 2 dead, 0 live; p4: 2 dead, 2 live.]
20
Family of Garbage Collectors: Estimator Solutions
  • Heuristics
    • Connected objects die together
    • Most objects die young
    • Objects reachable from globals live long
    • The past predicts the future

[Figure: partitions labeled with estimates. p1: 1 dead, 2 live; p2: 3 dead, 3 live; p3: 2 dead, 0 live; p4: 2 dead, 2 live.]
21
Family of Garbage Collectors: Chooser Problem
  • Pick subset of partitions
    • Maximize total dead
    • Minimize total live
    • Closed under predecessor relation
      → No bookkeeping for external pointers

[Figure: partitions labeled with estimates. p1: 1 dead, 2 live; p2: 3 dead, 3 live; p3: 2 dead, 0 live; p4: 2 dead, 2 live. A chosen subset is labeled 7 dead, 5 live.]
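The closure requirement can be checked mechanically. The sketch below uses the per-partition estimates from the figure, but the partition edges are invented for illustration (the transcript does not preserve the arrows), and the function names are hypothetical.

```python
def is_closed(chosen, edges):
    """True if every predecessor of a chosen partition is also chosen."""
    return all(src in chosen for src, dst in edges if dst in chosen)

def totals(chosen, estimates):
    """Sum the (dead, live) estimates over the chosen partitions."""
    dead = sum(estimates[p][0] for p in chosen)
    live = sum(estimates[p][1] for p in chosen)
    return dead, live

# (dead, live) estimates as shown on the slide.
estimates = {"p1": (1, 2), "p2": (3, 3), "p3": (2, 0), "p4": (2, 2)}
# Assumed partition edges: p2 -> p3, p3 -> p4, p2 -> p1.
edges = {("p2", "p3"), ("p3", "p4"), ("p2", "p1")}

chosen = {"p2", "p3", "p4"}
print(is_closed(chosen, edges), totals(chosen, estimates))
```

With these assumed edges the set {p2, p3, p4} is closed and sums to 7 dead and 5 live, the totals shown on the slide, whereas {p3, p4} is not closed because p2 points into p3. A closed set has no incoming pointers from unchosen partitions, which is why no bookkeeping for external pointers is needed.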
22
Family of Garbage Collectors: Chooser Solutions
  • Optimal algorithm based on network flow [TR]
  • Simpler, greedy algorithm

[Figure: partitions labeled with estimates. p1: 1 dead, 2 live; p2: 3 dead, 3 live; p3: 2 dead, 0 live; p4: 2 dead, 2 live. A chosen subset is labeled 7 dead, 5 live.]
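The talk does not spell out the greedy algorithm, so the following is only one possible greedy chooser, written under stated assumptions: the score "dead minus live" is invented, the partition edges are made up, and ties are broken by partition name. It repeatedly adds the predecessor closure with the largest positive gain.

```python
def predecessor_closure(partition, edges):
    """The partition plus every partition that can reach it."""
    closure = {partition}
    changed = True
    while changed:
        changed = False
        for src, dst in edges:
            if dst in closure and src not in closure:
                closure.add(src)
                changed = True
    return closure

def greedy_choose(estimates, edges):
    """Grow a predecessor-closed set while some closure still has positive gain."""
    chosen = set()
    while True:
        best_gain, best_add = 0, None
        for p in sorted(estimates):
            extra = predecessor_closure(p, edges) - chosen
            # Assumed score: estimated dead minus estimated live objects.
            gain = sum(estimates[q][0] - estimates[q][1] for q in extra)
            if gain > best_gain:
                best_gain, best_add = gain, extra
        if best_add is None:
            return chosen
        chosen |= best_add

estimates = {"p1": (1, 2), "p2": (3, 3), "p3": (2, 0), "p4": (2, 2)}
edges = {("p2", "p3"), ("p3", "p4"), ("p2", "p1")}  # assumed edges
print(sorted(greedy_choose(estimates, edges)))
```

Because a union of predecessor-closed sets is itself closed, the result always satisfies the chooser's closure constraint. Under this particular scoring the sketch stops at {p2, p3}; a different score would select a different set.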
23
Family of Garbage Collectors: Partial Collection Problem
  • Look only at chosen partitions
    • Traverse reachable objects
    • Reclaim unreachable objects

[Figure: chosen partitions p2, p3, and p4 shown apart from the rest of the heap.]
24
Family of Garbage Collectors: Partial Collection Solutions
  • Generalize canonical full-heap algorithms
    • Mark and sweep [McCarthy60]
    • Semi-space copying [Cheney70]
    • Treadmill [Baker92]

[Figure: chosen partitions p2, p3, and p4 shown apart from the rest of the heap.]
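As an illustration of generalizing a full-heap algorithm, here is a minimal sketch of mark and sweep restricted to the chosen partitions. It assumes the chosen set is closed under the predecessor relation, so the only pointers into it come from roots; the heap layout and all names are hypothetical, and this is not the author's implementation.

```python
def partial_mark_sweep(heap, partition_of, roots, chosen):
    """Collect only the chosen partitions; return (survivors, reclaimed)."""
    in_scope = {obj for obj in heap if partition_of[obj] in chosen}
    marked = set()
    worklist = [obj for obj in roots if obj in in_scope]
    while worklist:
        obj = worklist.pop()
        if obj in marked:
            continue
        marked.add(obj)
        # Pointers that leave the chosen partitions are not followed.
        worklist.extend(t for t in heap[obj] if t in in_scope)
    return marked, in_scope - marked

# Hypothetical heap: object -> list of objects it points to.
heap = {"a": ["b"], "b": [], "c": ["b"], "d": ["e"], "e": []}
partition_of = {"a": "p1", "b": "p2", "c": "p1", "d": "p3", "e": "p3"}
roots = ["a", "d"]

survivors, reclaimed = partial_mark_sweep(heap, partition_of, roots, {"p1", "p2"})
print(sorted(survivors), sorted(reclaimed))
```

Objects d and e are never visited because p3 was not chosen, which is where the saving over a full-heap collection comes from.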
25
Outline
  • Garbage Collector Design Principles
  • Family of Garbage Collectors
  • Design Space Exploration
  • Pointer Analysis for Java

26
Design Space Exploration: Questions
  • How good is a naïve CBGC?
  • How good could CBGC be in 20 years?
  • How well does CBGC do in a JVM?

27
Design Space Exploration: Simulator Methodology
  • Garbage collection simulator (under GPL)
    • Uses traces of allocations and pointer
      writes from our benchmark runs
  • Simulator advantages
    • Easier to implement variety of collector
      algorithms
    • Know entire trace beforehand; can use that for
      "in 20 years" experiments
  • Simulator disadvantages
    • No bottom-line performance numbers
    • Currently adding CBGC to JikesRVM
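A trace of allocations and pointer writes is enough to rebuild the heap graph that a simulated collector operates on. The sketch below assumes a made-up event format, ("alloc", obj) and ("write", src, field, dst); the actual trace format of the simulator is not described in the talk.

```python
def replay(trace):
    """Rebuild the pointer graph from allocation and pointer-write events."""
    heap = {}
    for event in trace:
        if event[0] == "alloc":
            heap[event[1]] = {}
        elif event[0] == "write":
            _, src, field, dst = event
            heap[src][field] = dst  # dst is None for a null store
    return heap

# Hypothetical trace: allocate two objects, link them, then clear the link.
trace = [
    ("alloc", "o1"),
    ("alloc", "o2"),
    ("write", "o1", "next", "o2"),
    ("write", "o1", "next", None),
]
print(replay(trace))
```

Because the whole trace is available up front, a simulator can also look ahead to see when each object becomes unreachable, which is what makes oracle experiments possible.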

28
Design Space Exploration: How good is a naïve CBGC?
[Figure: bar charts of cost in time, cost in space, and pause times (axis maxima 1.72, 0.87, and 0.22) for three collectors: Full-heap (semi-space copying), CBGC-naïve (type-based partitioning [Harris], heuristics estimator), and Appel (copying generational).]
29
Design Space Exploration: How good could CBGC be in 20 years?
[Figure: bar charts of cost in time, cost in space, and pause times (axis maxima 1.72, 0.87, and 0.22) for three collectors: Full-heap (semi-space copying), CBGC-oracles (partitioning and estimator based on trace), and Appel (copying generational).]
30
Design Space Exploration: How good could CBGC be in 20 years?
  • CBGC with oracles beats Appel
  • We did not find a performance wall
  • CBGC has potential
  • The performance gap between CBGC with oracles
    and naïve CBGC is large
  • Research challenges

31
How well does CBGC do in a Java virtual machine?
  • Implementation in progress
  • Need a pointer analysis for the partitioning

32
Outline
  • Garbage Collector Design Principles
  • Family of Garbage Collectors
  • Design Space Exploration
  • Pointer Analysis for Java

33
Pointer Analysis for Java: Which analysis do we need?
[Figure: bar chart of cost in time (axis 0 to 1.7) comparing Full-heap (semi-space copying); CBGC with type-based partitioning [Harris], type-based partitioning (oracles), and allocation site partitioning (oracles); and Appel (copying generational). The chart also carries the label "Andersen".]
34
Pointer Analysis for Java: Andersen's Analysis
  • Allocation-site granularity
  • Set-inclusion constraints
  • Flow and context insensitive

Can't analyze Java ahead of time!
                            What                         When
  Constraint generation     Model flow of pointers       Ahead-of-time compilation
  Constraint propagation    Find fixed-point solution    Ahead-of-time compilation
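The two phases in the table can be illustrated with a tiny solver for set-inclusion constraints. This is a textbook-style sketch, not the talk's implementation: it is field-insensitive (all fields of an allocation site are merged), it re-scans every constraint until nothing changes instead of using a worklist, and the constraint encoding is an assumption made for the example.

```python
def solve(constraints):
    """Propagate set-inclusion constraints until a fixed point is reached.

    Constraint forms (variables and allocation sites are strings):
      ("new",   p, site)   p = new ...   site is in pts(p)
      ("copy",  p, q)      p = q         pts(p) includes pts(q)
      ("load",  p, q)      p = q.f       pts(p) includes pts(s) for each s in pts(q)
      ("store", p, q)      p.f = q       pts(s) includes pts(q) for each s in pts(p)
    """
    pts = {}

    def get(name):
        return pts.setdefault(name, set())

    changed = True
    while changed:
        changed = False
        for kind, p, q in constraints:
            if kind == "new":
                sources, targets = [{q}], [p]
            elif kind == "copy":
                sources, targets = [get(q)], [p]
            elif kind == "load":
                sources, targets = [get(s) for s in list(get(q))], [p]
            else:  # "store"
                sources, targets = [get(q)], list(get(p))
            for target in targets:
                for source in sources:
                    new = source - get(target)
                    if new:
                        get(target).update(new)
                        changed = True
    return pts

constraints = [
    ("new", "a", "site1"),   # a = new A()
    ("new", "b", "site2"),   # b = new B()
    ("store", "a", "b"),     # a.f = b
    ("copy", "c", "a"),      # c = a
    ("load", "d", "c"),      # d = c.f
]
pts = solve(constraints)
print(pts["d"])
```

Generating the constraint list corresponds to "model flow of pointers", and the loop in `solve` corresponds to "find fixed-point solution". The result is flow-insensitive because the order of the constraints does not affect the final sets.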
35
Pointer Analysis for Java: Andersen for all of Java
  • Do
    • as little as possible
    • as late as possible

                            What                              When
  Constraint generation     Model flow of pointers            VM build and start-up
                                                              Class loading
                                                              Type resolution
                                                              Method compilation (JIT)
                                                              Execution of reflection
                                                              Execution of native code
  Constraint propagation    Find next fixed-point solution    Points-to information used
                                                              (before garbage collection)
36
Pointer Analysis for Java: Correctness Properties
[Figure: constraint generation and constraint propagation plotted over time.]
If there is a pointer, then the results predict it.
  • Cannot do any better for Java!

37
Pointer Analysis for Java: Analysis Cost
               Constraint    Constraint propagation
               generation    Eager              At GC              At End
  Benchmark    Seconds       Count    Seconds   Count    Seconds   Count    Seconds
  compress     21.4          130      3.2       5        40.4      1        67.4
  db           20.1          143      3.6       5        42.9      1        71.4
  mtrt         20.3          265      2.1       5        46.2      1        68.1
  mpegaudio    20.6          319      2.2       5        46.1      1        66.6
  jack         21.2          397      4.2       7        49.0      1        78.2
  jess         22.3          733      6.8       8        49.7      1        85.7
  javac        21.1          1,107    5.9       10       87.4      1        187.6
  xalan        20.1          1,728    4.9       8        85.7      1        215.7
→ Expensive, but once behavior stabilizes, costs diminish to zero
38
Pointer Analysis for Java: Validation
  • Lots of corner cases
    • Dynamic class loading
    • Reflection
    • Native code
  • Missing any one leads to nasty bugs
    • CBGC relies on conservative results
  • We performed validation runs
    • Check analysis results against pointers in heap
      during garbage collection
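A validation run of this kind boils down to checking that every pointer actually found in the heap is covered by the analysis results. The sketch below assumes allocation-site granularity and uses hypothetical data structures: `site_of` maps each object to its allocation site, and `points_to` maps a site to the sites its objects may point to.

```python
def unpredicted_pointers(heap_pointers, site_of, points_to):
    """Return the heap pointers that the analysis results fail to predict."""
    missing = []
    for src, dst in heap_pointers:
        if site_of[dst] not in points_to.get(site_of[src], set()):
            missing.append((src, dst))
    return missing

# Hypothetical snapshot taken during a garbage collection.
site_of = {"o1": "A", "o2": "B", "o3": "B"}
points_to = {"A": {"B"}}                      # analysis: A objects may point to B objects
heap_pointers = [("o1", "o2"), ("o3", "o1")]  # pointers actually found in the heap

print(unpredicted_pointers(heap_pointers, site_of, points_to))
```

An empty result means the analysis was conservative for that snapshot. Any reported pair, such as the pointer from o3 to o1 here, indicates a missed constraint, for example from reflection or native code.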

39
Wrapping Up
40
Related Work: Using Program Analysis for Garbage Collection
  • Stack allocation [ParkGoldberg92, ...]
  • Regions [TofteTalpin97, ...]
  • Liveness analysis [AgesenDetlefsMoss98, ...]
  • Early reclamation [Harris99]
  • Thread-local heaps [Steensgaard00, ...]
  • Object inlining [DolbyChien00]
  • Write-barrier removal [ZeeRinard02, Shuf02]

41
Related Work: Pointer analyses for Java
  • Andersen's analysis for static Java
    • [RountevMilanovaRyder01]
    • [LiangPenningsHarrold01]
    • [WhaleyLam02]
    • [LhotakHendren03]
  • Weaker analyses with dynamic class loading
    • DOIT [PechtchanskiSarkar01]
    • XTA [QianHendren04]
    • Ruf's escape analysis [BogdaSingh01, King03]
  • Demand-driven / incremental analysis

42
Other Research Interests
  • Accuracy of Garbage Collection
    • [M.S. Thesis, ISMM00, ECOOP01, TOPLAS02]
  • Profiling
    • [FDDO01, Patent01a]
  • Dynamic Optimizations, Prefetching
    • [PLDI02, Patent02b]
  • Future directions
    • More techniques for performance improvement
    • Reducing bugs, improving productivity

43
Contributions presented in this talk
  • Connectivity-based GC design principles
    • [ISMM02]
  • CBGC, a new family of garbage collectors
  • Design space exploration with simulator
    • [OOPSLA03]
  • First non-trivial pointer analysis for Java
    • [ECOOP04 (to appear)]
  • http://www.cs.colorado.edu/hirzel