Title: Connectivity-Based Garbage Collection
1 Connectivity-Based Garbage Collection
- Presenter: Feng Xian
- Authors: Martin Hirzel et al.
- Published in: OOPSLA 2003
2 Garbage Collection Benefits
- Garbage collection leads to simpler
  - Design: no complex deallocation protocols
  - Implementation: automatic deallocation
  - Maintenance: fewer bugs
- Benefits are widely accepted
  - Java, C#, Python, ...
3 Garbage Collection: Haven't we solved this problem yet?
- For a state-of-the-art garbage collector
  - Time: 14% of execution time
  - Space: 3x high watermark
  - Pauses: 0.8 seconds
- Can reduce any one cost
- Challenge: reduce all three costs
4 Example Heap
[Figure: example heap. Boxes are heap objects (o1 to o15); arrows are pointers; the long box holds the stack and global variables (s1, s2, g).]
5 Thesis
1. Objects form distinct data structures
2. Connected objects die together
3. Garbage collectors can exploit 1. and 2. to reclaim objects efficiently
[Figure: example heap (objects o1 to o15, stack and globals)]
6 Experimental Infrastructure
- JikesRVM (Research Virtual Machine)
  - From IBM Research
  - Written in Java
  - Application and runtime system share heap
  - → Good garbage collection is even more important
- Benchmarks
  - SPECjvm98 suite and SPECjbb2000
  - Java Olden suite
  - xalan, ipsixql, nfc, jigsaw
7 Outline
- Garbage Collector Design Principles
- Family of Garbage Collectors
- Design Space Exploration
- Conclusion
8 Garbage Collector Design Principles: Do partial collections.
- Don't collect the full heap every time
- → Shorter pause times
[Figure: example heap (objects o1 to o15, stack and globals)]
9 Garbage Collector Design Principles: Predict lifetime based on age.
- Generational hypothesis: most objects die young
- Generational garbage collection
  - Partition by age
  - Collect young objects most often
  - → Low time overhead
- That's the state of the art.
[Figure: example heap divided into a young generation and an old generation]
10 Garbage Collector Design Principles: Generational GC Problems
- Regular full collections → long peak pause
- Old-to-young pointers → need bookkeeping
[Figure: example heap divided into a young generation and an old generation]
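The bookkeeping for old-to-young pointers is commonly implemented as a write barrier that records such pointers in a remembered set. The following is a minimal sketch of that general technique, not code from the talk; `Obj`, `write_field`, and `remembered` are hypothetical names chosen for illustration.

```python
class Obj:
    """Toy heap object; the `young` flag is an assumption of this sketch."""
    def __init__(self, name, young=True):
        self.name = name
        self.young = young
        self.fields = {}

remembered = set()  # old objects that may point into the young generation

def write_field(src, field, dst):
    """Write barrier: every pointer store goes through this function."""
    src.fields[field] = dst
    if dst is not None and not src.young and dst.young:
        remembered.add(src)  # extra root for the next young-generation collection

old = Obj("o12", young=False)
new = Obj("o1")
write_field(old, "next", new)  # old-to-young pointer: recorded
write_field(new, "peer", old)  # young-to-old pointer: no bookkeeping needed
```

Every pointer store pays for the check, which is the cost that a predecessor-closed choice of partitions is meant to avoid.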
11 Garbage Collector Design Principles: Collect connected objects together.
[Figure: four panels, each showing objects o1 and o2]
12 Garbage Collector Design Principles: Focus on objects with few ancestors.
- → Short-lived objects are easy to collect
13 Garbage Collector Design Principles: Predict lifetime based on roots.
[Figure: objects o1 to o4 with a stack variable s and a global g]
- For details, see the ISMM 2002 paper.
14 Outline
- Garbage Collector Design Principles
- Family of Garbage Collectors
- Design Space Exploration
- Conclusion
15 CBGC Family of Garbage Collectors: Connectivity-Based Garbage Collection
- Do partial collections.
- Collect connected objects together.
- Predict lifetime based on age.
- Focus on objects with few ancestors.
- Predict lifetime based on roots.
[Figure: example heap divided into partitions p1 to p4]
16 Family of Garbage Collectors: Components of CBGC
- Before allocation
  - Partitioning: decide into which partition to put each object
- Collection algorithm
  - Estimator: estimate dead and live objects for each partition
  - Chooser: choose a good set of partitions
  - Partial collection: collect the chosen partitions
17 Family of Garbage Collectors: Partitioning Problem
- Find fine-grained partitions, where
  - Partition edges respect pointers
  - Objects don't move between partitions
[Figure: example heap divided into partitions p1 to p4]
18 Family of Garbage Collectors: Partitioning Solutions
- Pointer analysis
  - Type-based [Harris]
    - o1 may point to o2 if o1 has a field of a type compatible with o2
  - Conservative: the analysis rules out a pointer between two heap objects only if it can prove that such a pointer cannot exist
[Figure: example heap divided into partitions p1 to p4]
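One plausible way to turn a type-based may-point-to relation into partitions is to put mutually reachable types into the same partition, so that the edges between partitions form a DAG. The sketch below assumes this and uses Kosaraju's strongly-connected-components algorithm. The function name and the `List`/`Node`/`String` type graph are hypothetical, and the talk does not say that partitions are built exactly this way.

```python
from collections import defaultdict

def partitions_from_types(may_point_to):
    """Map each type to a partition: one partition per strongly connected
    component of the type-level may-point-to graph (Kosaraju's algorithm)."""
    nodes = set(may_point_to)
    for targets in may_point_to.values():
        nodes.update(targets)
    reverse = defaultdict(set)
    for src, targets in may_point_to.items():
        for dst in targets:
            reverse[dst].add(src)

    order, seen = [], set()
    def visit(n):                    # first pass: record finish order
        seen.add(n)
        for m in may_point_to.get(n, ()):
            if m not in seen:
                visit(m)
        order.append(n)
    for n in sorted(nodes):
        if n not in seen:
            visit(n)

    part_of = {}
    def assign(n, root):             # second pass: walk the reversed graph
        part_of[n] = root
        for m in reverse[n]:
            if m not in part_of:
                assign(m, root)
    for n in reversed(order):
        if n not in part_of:
            assign(n, n)
    return part_of

# Hypothetical type graph: List and Node reference each other; Node holds a String.
types = {"List": {"Node"}, "Node": {"List", "String"}, "String": set()}
part_of = partitions_from_types(types)
```

Because an object's type never changes, allocating each object into the partition of its type means objects never move between partitions. Pointers between partitions can only follow the edges the analysis could not rule out.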
19 Family of Garbage Collectors: Estimator Problem
- For each partition, guess
  - dead
    - Objects that can be reclaimed
    - Pay-off
  - live
    - Objects that must be traversed
    - Cost
[Figure: partitions p1 to p4 labeled with estimates, in order: 1 dead 2 live; 3 dead 3 live; 2 dead 0 live; 2 dead 2 live]
20 Family of Garbage Collectors: Estimator Solutions
- Heuristics
  - Connected objects die together
  - Most objects die young
  - Objects reachable from globals live long
  - The past predicts the future
[Figure: partitions p1 to p4 labeled with estimates, in order: 1 dead 2 live; 3 dead 3 live; 2 dead 0 live; 2 dead 2 live]
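The heuristics above could be combined into a per-partition estimator along the following lines. This is only an illustrative sketch: the function `estimate`, its parameters, and the default survival rate of 0.1 are assumptions, not values from the talk.

```python
def estimate(size, past_survival=None, reachable_from_globals=False):
    """Return (dead, live) guesses for one partition holding `size` objects."""
    if reachable_from_globals:
        survival = 1.0            # objects reachable from globals live long
    elif past_survival is not None:
        survival = past_survival  # the past predicts the future
    else:
        survival = 0.1            # most objects die young (assumed default)
    live = round(size * survival)
    return size - live, live

fresh = estimate(10)                          # no history yet
seen_before = estimate(6, past_survival=0.5)  # half survived the last collection
global_data = estimate(4, reachable_from_globals=True)
```

The first component of each result is the expected pay-off of collecting the partition and the second is the expected traversal cost.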
21 Family of Garbage Collectors: Chooser Problem
- Pick subset of partitions
  - Maximize total dead
  - Minimize total live
  - Closed under predecessor relation
  - → No bookkeeping for external pointers
[Figure: partitions p1 to p4 labeled with estimates, in order: 1 dead 2 live; 3 dead 3 live; 2 dead 0 live; 2 dead 2 live; combined label: 7 dead 5 live]
22 Family of Garbage Collectors: Chooser Solutions
- Optimal algorithm based on network flow [TR]
- Simpler, greedy algorithm
[Figure: partitions p1 to p4 labeled with estimates, in order: 1 dead 2 live; 3 dead 3 live; 2 dead 0 live; 2 dead 2 live; combined label: 7 dead 5 live]
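A greedy chooser might look like the sketch below. It is not the algorithm from the talk. It assumes a stopping rule (keep adding partitions while each step reclaims at least as many objects as it traverses) and an invented predecessor graph. The estimates are the ones shown on the slide, assuming they map to p1 to p4 in order.

```python
from fractions import Fraction

def closure(p, preds, chosen):
    """p plus all of its ancestors that are not chosen yet."""
    todo, out = [p], set()
    while todo:
        q = todo.pop()
        if q not in out and q not in chosen:
            out.add(q)
            todo.extend(preds.get(q, ()))
    return out

def choose(est, preds):
    """Greedily grow a predecessor-closed set of partitions.
    est: partition -> (dead, live); preds: partition -> direct predecessors."""
    chosen = set()
    while True:
        best, best_ratio = None, None
        for p in sorted(est):
            if p in chosen:
                continue
            group = closure(p, preds, chosen)
            dead = sum(est[q][0] for q in group)
            live = sum(est[q][1] for q in group)
            if dead + live == 0 or dead < live:
                continue             # assumed rule: not worth the traversal cost
            ratio = Fraction(dead, dead + live)
            if best is None or ratio > best_ratio:
                best, best_ratio = group, ratio
        if best is None:
            return chosen
        chosen |= best

est = {"p1": (1, 2), "p2": (3, 3), "p3": (2, 0), "p4": (2, 2)}
# Hypothetical edges: p2 and p3 point into p4; p2, p3, and p4 point into p1.
preds = {"p4": {"p2", "p3"}, "p1": {"p2", "p3", "p4"}}
chosen = choose(est, preds)
total_dead = sum(est[p][0] for p in chosen)
total_live = sum(est[p][1] for p in chosen)
```

With these assumed edges the sketch selects p2, p3, and p4, whose estimates sum to 7 dead and 5 live. Adding whole predecessor closures keeps the result closed under the predecessor relation at every step.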
23 Family of Garbage Collectors: Partial Collection Problem
- Look only at chosen partitions
  - Traverse reachable objects
  - Reclaim unreachable objects
[Figure: partitions p2, p3, and p4 with their objects; the remainder is labeled "rest of heap"]
24 Family of Garbage Collectors: Partial Collection Solutions
- Generalize canonical full-heap algorithms
  - Mark and sweep [McCarthy60]
  - Semi-space copying [Cheney70]
  - Treadmill [Baker92]
[Figure: partitions p2, p3, and p4 with their objects; the remainder is labeled "rest of heap"]
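As an illustration of generalizing a full-heap algorithm, the sketch below restricts mark and sweep to the chosen partitions. It assumes the chosen set is closed under predecessors, so pointers into it can only come from the roots or from other chosen partitions. The heap representation and all names are hypothetical.

```python
def partial_mark_sweep(heap, partition_of, chosen, roots):
    """heap: object -> list of objects it points to.
    Examines only objects in chosen partitions; returns the reclaimed set."""
    def in_scope(o):
        return partition_of[o] in chosen

    marked, todo = set(), [o for o in roots if in_scope(o)]
    while todo:                      # mark phase
        o = todo.pop()
        if o in marked:
            continue
        marked.add(o)
        # Pointers that leave the chosen partitions are not followed.
        todo.extend(t for t in heap[o] if in_scope(t))

    dead = {o for o in heap if in_scope(o) and o not in marked}
    for o in dead:                   # sweep phase
        del heap[o]
    return dead

# Hypothetical heap: a, b, c live in chosen partition p2; x is in the rest of the heap.
heap = {"a": ["b", "x"], "b": [], "c": ["b"], "x": []}
partition_of = {"a": "p2", "b": "p2", "c": "p2", "x": "p1"}
dead = partial_mark_sweep(heap, partition_of, chosen={"p2"}, roots=["a"])
```

Here `c` is reclaimed because nothing reachable from the roots points to it, while `x` is never visited because it lies outside the chosen partitions.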
25 Outline
- Garbage Collector Design Principles
- Family of Garbage Collectors
- Design Space Exploration
- Conclusion
26 Design Space Exploration: Questions
- How good is a naïve CBGC?
- How good could CBGC be in 20 years?
- How well does CBGC do in a JVM?
27 Design Space Exploration: Simulator Methodology
- Garbage collection simulator (under GPL)
  - Uses traces of allocations and pointer writes from our benchmark runs
- Simulator advantages
  - Easier to implement a variety of collector algorithms
  - Know entire trace beforehand: can use that for "in 20 years" experiments
- Currently adding CBGC to JikesRVM
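A trace-driven simulator of this kind replays recorded events to rebuild the heap graph that a collector algorithm then operates on. The sketch below assumes a made-up trace format of `("alloc", obj)` and `("write", src, field, dst)` tuples; the actual simulator's format is not described in the talk.

```python
def replay(trace):
    """Rebuild the heap graph from a trace of allocations and pointer writes."""
    heap = {}
    for event in trace:
        if event[0] == "alloc":
            _, obj = event
            heap[obj] = {}
        elif event[0] == "write":
            _, src, field, dst = event
            heap[src][field] = dst   # dst is None for a null store
    return heap

trace = [
    ("alloc", "o1"),
    ("alloc", "o2"),
    ("write", "o1", "next", "o2"),
    ("write", "o1", "next", None),   # o2 becomes unreachable from o1
]
heap = replay(trace)
```

Because the whole trace is available up front, a simulator can also look ahead in it, which is what makes oracle-style experiments possible.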
28 Design Space Exploration: How good is a naïve CBGC?
[Figure: result charts; only axis tick values survive in the extracted text]
29 Design Space Exploration: How good could CBGC be in 20 years?
[Figure: result charts; only axis tick values survive in the extracted text]
30 Design Space Exploration: How good could CBGC be in 20 years?
- CBGC with oracles beats Appel
  - We did not find a performance wall
  - CBGC has potential
- The performance gap between CBGC with oracles and naïve CBGC is large
  - Research challenges
31 How well does CBGC do in a Java virtual machine?
- Implementation in progress
- Need a pointer analysis for the partitioning
32 Contributions presented in this talk
- Connectivity-based GC design principles
- ISMM 2002
- CBGC, a new family of garbage collectors
- Design space exploration with simulator
- OOPSLA 2003