Title: Automatic Storage Management
1Automatic Storage Management
- Patrick Earl
- Simon Leonard
- Jack Newton
2Overview
- Terminology
- Why use Automatic Storage Management?
- Comparing garbage collection algorithms
- The Classic algorithms
- Copying garbage collection
- Incremental Tracing garbage collection
- Generational garbage collection
- Conclusions
3Terminology
- Stack: a memory area where activation records (frames) are pushed when a procedure is called and popped when it returns.
- Heap: a memory area where data structures can be allocated and deallocated in any order.
4Terminology(Continued)
- Roots: values that a program can manipulate directly (i.e. values held in registers, on the program stack, and in global variables).
- Node/Cell/Object: an individually allocated piece of data in the heap.
- Children Nodes: the list of pointers that a given node contains.
- Live Node: a node whose address is held in a root or that is the child of a live node.
5Terminology(Continued)
- Garbage: nodes that are not live, but are not free either.
- Garbage collection: the task of recovering (freeing) garbage nodes.
- Mutator: the program running alongside the garbage collection system.
6Why Garbage Collect?
- Language requirements
- In some situations it may be impossible to know
when a shared data structure is no longer in use.
7Why Garbage Collect?(Continued)
- Software Engineering
- Garbage collection raises the abstraction level of software development.
- It simplifies interfaces and decreases coupling between modules.
- Studies have shown that a significant amount of development time is spent on memory management bugs (Rovner, 1985).
8Comparing Garbage Collection Algorithms
- Directly comparing garbage collection algorithms is difficult: there are many factors to consider.
- Some factors to consider
- Cost of reclaiming cells
- Cost of allocating cells
- Storage overhead
- How does the algorithm scale with residency?
- Will the user program be suspended during garbage collection?
- Does an upper bound exist on the pause time?
- Is locality of data structures maintained (or maybe even improved)?
9Classes of Garbage Collection Algorithms
- Direct Garbage Collectors: a record is associated with each node in the heap. The record for node N indicates how many other nodes or roots point to N.
- Indirect/Tracing Garbage Collectors: usually invoked when a user's request for memory fails because the free list is exhausted. The garbage collector visits all live nodes and returns all other memory to the free list. If sufficient memory has been recovered by this process, the user's request for memory is satisfied.
10Quick Review Reference Counting
- Every cell has an additional field: the reference count. This field represents the number of pointers to that cell from roots or heap cells.
- Initially, all cells in the heap are placed in a pool of free cells, the free list.
11Reference Counting(Continued)
- When a cell is allocated from the free list, its reference count is set to one.
- When a pointer is set to reference a cell, the cell's reference count is incremented by 1; if a pointer to the cell is deleted, its reference count is decremented by 1.
- When a cell's reference count reaches 0, its pointers to its children are deleted and it is returned to the free list.
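The rules above can be sketched in a few lines of Python. This is a minimal illustration, not a production allocator: the names Cell, RefCountHeap, add_pointer and delete_pointer are made up for this example, and every cell is assumed to be the same size.

```python
class Cell:
    """A heap cell with a reference count and a list of child pointers."""
    def __init__(self, name):
        self.name = name
        self.rc = 0
        self.children = []


class RefCountHeap:
    def __init__(self, size):
        # Initially every cell sits on the free list.
        self.free_list = [Cell(f"cell{i}") for i in range(size)]

    def allocate(self):
        cell = self.free_list.pop()
        cell.rc = 1              # counts the pointer handed back to the caller
        return cell

    def add_pointer(self, src, target):
        """Store a pointer to `target` in `src` and count it."""
        target.rc += 1
        src.children.append(target)

    def delete_pointer(self, target):
        """Drop one pointer to `target`; free it when the count reaches 0."""
        target.rc -= 1
        if target.rc == 0:
            # Delete the cell's pointers to its children first.
            for child in target.children:
                self.delete_pointer(child)
            target.children = []
            self.free_list.append(target)


heap = RefCountHeap(4)
a = heap.allocate()          # a.rc == 1 (held by a root)
b = heap.allocate()          # b.rc == 1 (held by a root)
heap.add_pointer(a, b)       # b.rc == 2
heap.delete_pointer(b)       # root drops b; b is still live through a
heap.delete_pointer(a)       # root drops a; a and then b are freed
```

Note that freeing a cell can cascade: dropping the last pointer to a frees b as well, so the cost of a single pointer deletion is not bounded.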
12Reference Counting Example
13Reference Counting Example (Continued)
14Reference Counting Example (Continued)
15Reference Counting Example (Continued)
16Reference Counting Advantages and Disadvantages
- Advantages
- Garbage collection overhead is distributed.
- Locality of reference is no worse than that of the mutator.
- Free memory is returned to free list quickly.
17Reference Counting Advantages and
Disadvantages(Continued)
- Disadvantages
- High time cost (every time a pointer is changed, reference counts must be updated).
- Storage overhead for the reference counter can be high.
- Unable to reclaim cyclic data structures.
- If the reference counter overflows, the object becomes permanent.
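The cyclic-structure problem can be reproduced with a very small sketch. The helper names (add_pointer, delete_pointer, freed) are hypothetical, and a root pointer is modelled by passing None as the source.

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.rc = 0
        self.children = []

freed = []   # names of cells returned to the (imaginary) free list

def add_pointer(src, target):
    target.rc += 1
    if src is not None:          # src is None for a pointer held by a root
        src.children.append(target)

def delete_pointer(target):
    target.rc -= 1
    if target.rc == 0:
        for child in target.children:
            delete_pointer(child)
        target.children = []
        freed.append(target.name)

x, y = Cell("x"), Cell("y")
add_pointer(None, x)   # root -> x
add_pointer(x, y)      # x -> y
add_pointer(y, x)      # y -> x closes the cycle
delete_pointer(x)      # the root drops x, but y still points to it
```

After the last line neither cell is reachable from a root, yet both counts are still 1, so neither is ever returned to the free list.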
18Reference Counting Cyclic Data Structure -
Before
19Reference Counting Cyclic Data Structure After
20Deferred Reference Counting
- Optimisation
- Cost can be improved by special treatment of local variables.
- Reference counts for pointers held on the stack are only updated at fixed intervals.
- Reference counts are still affected by pointers from one heap object to another.
21Quick Review Mark-Sweep
- The first tracing garbage collection algorithm.
- Garbage cells are allowed to build up until heap space is exhausted (i.e. a user program requests a memory allocation, but there is insufficient free space on the heap to satisfy the request).
- At this point, the mark-sweep algorithm is invoked, and garbage cells are returned to the free list.
22Mark-Sweep(Continued)
- Performed in two phases
- Mark phase: identifies all live cells by setting a mark bit. Live cells are cells reachable from a root.
- Sweep phase: returns garbage cells to the free list.
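The two phases can be sketched as follows. This is an illustrative model only: the heap is assumed to be a plain Python list of allocated cells, and the names Cell, mark and sweep are chosen for this example.

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.marked = False
        self.children = []

def mark(roots):
    """Mark phase: set the mark bit on every cell reachable from a root."""
    stack = list(roots)
    while stack:
        cell = stack.pop()
        if not cell.marked:
            cell.marked = True
            stack.extend(cell.children)

def sweep(heap, free_list):
    """Sweep phase: visit every cell, free the unmarked ones and
    clear the mark bit on the survivors for the next cycle."""
    for cell in heap:
        if cell.marked:
            cell.marked = False
        else:
            cell.children = []
            free_list.append(cell)

a, b, c, d = (Cell(n) for n in "abcd")
a.children = [b]
c.children = [d]
d.children = [c]          # c and d form an unreachable cycle
heap, free_list = [a, b, c, d], []
mark(roots=[a])
sweep(heap, free_list)    # c and d are reclaimed despite the cycle
```

Unlike reference counting, the unreachable cycle formed by c and d is reclaimed, because liveness is decided by reachability from the roots rather than by counts.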
23Mark-Sweep Example
24Mark-Sweep Advantages and Disadvantages
- Advantages
- Cyclic data structures can be recovered.
- Tends to be faster than reference counting.
25Mark-Sweep Advantages and Disadvantages(Continued)
- Disadvantages
- Computation must be halted while garbage collection is being performed.
- Every live cell must be visited in the mark phase, and every cell in the heap must be visited in the sweep phase.
- Garbage collection becomes more frequent as the residency of a program increases.
- May fragment memory.
26Mark-Sweep Advantages and Disadvantages(Continued)
- Disadvantages
- Has negative implications for locality of reference. Old objects get surrounded by new ones (not suited for virtual memory applications).
- However, if objects tend to survive in clusters in memory, as they apparently often do, this can greatly reduce the cost of the sweep phase.
27Mark-Compact Collection
- Remedies the fragmentation and allocation problems of mark-sweep collectors.
- Two phases
- Mark phase: identical to mark-sweep.
- Compaction phase: marked objects are compacted, moving most of the live objects until all the live objects are contiguous.
28Mark-Compact Advantages and Disadvantages(Continued)
- Advantages
- The contiguous free area eliminates the fragmentation problem. Allocating objects of various sizes is simple.
- The garbage space is "squeezed out" without disturbing the original ordering of objects. This improves locality.
29Mark-Compact Advantages and Disadvantages(Continued)
- Disadvantages
- Several passes over the data are required. "Sliding compactors" take two, three or more passes over the live objects.
- One pass computes the new locations.
- Subsequent passes update the pointers to refer to the new locations and actually move the objects.
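A sliding compactor along these lines might look like the sketch below. It assumes the mark phase has already run, that every object occupies exactly one heap slot, and that pointers are heap indices; the dictionary layout and the function name compact are illustrative choices, not taken from the slides.

```python
def compact(heap, roots):
    """Slide marked objects toward index 0, preserving their order.

    heap  : list of objects (dicts) or None for free slots
    roots : list of heap indices
    Returns the updated list of root indices.
    """
    # Pass 1: compute the new location of every marked object.
    forward = {}
    next_free = 0
    for addr, obj in enumerate(heap):
        if obj is not None and obj["marked"]:
            forward[addr] = next_free
            next_free += 1

    # Pass 2: update the pointers in live objects and in the roots.
    for addr in forward:
        heap[addr]["refs"] = [forward[r] for r in heap[addr]["refs"]]
    roots = [forward[r] for r in roots]

    # Pass 3: move the objects (lowest address first), then clear the rest.
    for addr, new_addr in forward.items():
        obj = heap[addr]
        obj["marked"] = False
        heap[new_addr] = obj
    for addr in range(next_free, len(heap)):
        heap[addr] = None
    return roots


heap = [
    {"name": "A", "marked": True,  "refs": [2]},
    {"name": "B", "marked": False, "refs": []},   # garbage
    {"name": "C", "marked": True,  "refs": [4]},
    None,                                          # already free
    {"name": "D", "marked": True,  "refs": []},
]
roots = compact(heap, roots=[0])
```

After the call the live objects A, C and D occupy slots 0 to 2 in their original order, and the free space is one contiguous run at the end of the heap.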
30Copying Garbage Collection
- Like mark-compact, copying garbage collection does not really "collect" garbage.
- Rather, it moves all the live objects into one area, and the rest of the heap is known to be available.
- Copying collectors integrate the traversal and the copying process, so that objects need only be traversed once.
- The work needed is proportional to the amount of live data (all of which must be copied).
31Semispace Collector Using the Cheney Algorithm
- The heap is subdivided into two contiguous subspaces (FromSpace and ToSpace).
- During normal program execution, only one of these semispaces is in use.
- When the garbage collector is called, all the live data are copied from the current semispace (FromSpace) to the other semispace (ToSpace).
32Semispace Collector Using the Cheney Algorithm
[Figure: objects A, B, C and D; FromSpace and ToSpace]
33Semispace Collector Using the Cheney Algorithm
[Figure: objects A, B, C and D; FromSpace and ToSpace]
34Semispace Collector Using the Cheney
Algorithm(Continued)
- Once the copying is completed, the ToSpace is made the "current" semispace.
- A simple form of copying traversal is the Cheney algorithm.
- The immediately reachable objects form the initial queue of objects for a breadth-first traversal.
- A scan pointer is advanced through the first object location by location.
- Each time a pointer into FromSpace is encountered, the referred-to object is transported to the end of the queue and the pointer to the object is updated.
35Cheney Algorithm Example
[Figure: Cheney algorithm example with root nodes and objects A through F]
36Semispace Collector Using the Cheney Algorithm
(Continued)
- Objects reachable by multiple paths must not be copied to ToSpace multiple times.
- When an object is transported to ToSpace, a forwarding pointer is installed in the old version of the object.
- The forwarding pointer signifies that the old object is obsolete and indicates where to find the new copy.
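The Cheney traversal with forwarding pointers can be sketched as follows. ToSpace is modelled as a Python list that doubles as the breadth-first queue; the names Obj, transport and cheney_collect are illustrative assumptions.

```python
class Obj:
    def __init__(self, name, fields=()):
        self.name = name
        self.fields = list(fields)   # pointers to other objects
        self.forward = None          # forwarding pointer, set once copied

def cheney_collect(roots):
    """Copy everything reachable from `roots` into a fresh ToSpace list."""
    to_space = []

    def transport(obj):
        # Copy obj on first visit; afterwards follow its forwarding pointer.
        if obj.forward is None:
            copy = Obj(obj.name, obj.fields)   # fields still point into FromSpace
            obj.forward = copy
            to_space.append(copy)              # goes to the end of the queue
        return obj.forward

    new_roots = [transport(r) for r in roots]  # the initial queue
    scan = 0
    while scan < len(to_space):                # scan pointer chases the end of the queue
        obj = to_space[scan]
        obj.fields = [transport(f) for f in obj.fields]
        scan += 1
    return new_roots, to_space


d = Obj("D")
c = Obj("C", [d])
a = Obj("A", [c, d])
b = Obj("B", [d])
e = Obj("E", [a])            # unreachable, so never copied
new_roots, to_space = cheney_collect([a, b])
```

D is reachable through A, B and C, but it is copied only once: the second and third visits find the forwarding pointer and simply reuse the existing copy.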
37Copying Garbage Collection Advantages and
Disadvantages
- Advantages
- Allocation is extremely cheap.
- Excellent asymptotic complexity.
- Fragmentation is eliminated.
- Only one pass through the data is required.
38Copying Garbage Collection Advantages and
Disadvantages(Continued)
- Disadvantages
- The use of two semispaces doubles the memory requirement.
- Poor locality. Using virtual memory will cause excessive paging.
39Problems with Simple Tracing Collectors
- It is difficult to achieve high efficiency in a simple garbage collector, because large amounts of memory are expensive.
- If virtual memory is used, the poor locality of the allocation/reclamation cycle will cause excessive paging.
- Even as main memory becomes steadily cheaper, locality within cache memory becomes increasingly important.
40Problems with Simple Tracing Collectors(Continued)
- With a simple semispace copying collector, locality is likely to be worse than with mark-sweep.
- The memory issue is not unique to copying collectors.
- Any efficient garbage collection involves a trade-off between space and time.
- The problem of locality is an indirect result of the use of garbage collection.
41Incremental Tracing Collectors Overview
- Introduction to Incremental Collectors
- Coherence and Conservatism
- Tricolor Marking
- Write Barrier Algorithms
- Baker's Read Barrier Algorithm
42Incremental Tracing Collectors
- The program (mutator) and the garbage collector run concurrently.
- Can think of the system as similar to two threads. One performs collection, and the other represents the regular program in execution.
- Can be used in systems with real-time requirements, for example process control systems.
43Coherence and Conservatism
- Coherence: a proper state must be maintained between the mutator and the collector.
- Conservatism: how aggressive the garbage collector is at finding objects to be deallocated.
44Tricoloring
- White: not yet traversed. A candidate for collection.
- Black: already traversed and found to be live. Will not be reclaimed.
- Grey: in the traversal process. Its defining characteristic is that its children have not necessarily been explored.
45The Tricolor Abstraction
46Tricoloring Invariant
- There must not be a pointer from a black object
to a white object.
47Violation of Coloring Invariant
[Figure: objects A, B, C and D, before and after]
48Steps in Violation
- Read a pointer to a white object.
- Assign that pointer to a black object.
- Destroy the original pointer without the collection system noticing.
49Read Barrier
- Barriers are essentially memory access detection systems.
- We detect when any pointer to a white object is read.
- If such a pointer is read, we conceptually color that object grey.
50Write Barrier
- When a pointer is written to an object, we record the write somehow.
- The recorded write is dealt with at a later point.
- Read vs. write efficiency considerations.
51Write Barrier Algorithms
- Snapshot-at-beginning
- Incremental update
52Snapshot-at-beginning
- Conceptually makes a copy-on-write duplicate of the pointer graph.
- Can be implemented with a simple write barrier that records pointer writes and adds the old addresses to a stack to be traversed later.
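A snapshot-at-beginning write barrier of this kind can be sketched as below. The mutator is assumed to perform every pointer store through write_field; that function, collector_step and mark_stack are names invented for this example.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.color = "white"
        self.fields = {}

mark_stack = []   # grey objects waiting to be traversed

def write_field(obj, field, new_target):
    """Snapshot barrier: save the overwritten pointer before the store."""
    old_target = obj.fields.get(field)
    if old_target is not None and old_target.color == "white":
        old_target.color = "grey"
        mark_stack.append(old_target)
    obj.fields[field] = new_target

def collector_step():
    """Traverse one grey object and blacken it."""
    obj = mark_stack.pop()
    for child in obj.fields.values():
        if child is not None and child.color == "white":
            child.color = "grey"
            mark_stack.append(child)
    obj.color = "black"

a, b, d = Obj("A"), Obj("B"), Obj("D")
a.color = "black"              # A has already been traversed
b.color = "grey"               # B is queued but not yet scanned
mark_stack.append(b)
b.fields["x"] = d              # state at the start of collection

write_field(a, "x", d)         # black A now points at white D
write_field(b, "x", None)      # barrier saves D before B's pointer is destroyed
while mark_stack:
    collector_step()
```

Without the barrier, D would have been left white and reclaimed even though A still points to it. The price is conservatism: D would also be kept if the mutator had simply dropped the pointer.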
53Snapshot-at-beginning Example
[Figure: objects A, B, C and D, before and after; the pointer to D is now on the stack]
54Comments on Snapshot-at-beginning
- Very conservative.
- All overwritten pointer values are saved and traversed.
- No objects can be freed while the collection process is occurring.
55Incremental Update Write-Barrier Algorithm
- No copy of the tree is made.
- Catches overwrites of pointers that have been copied.
- If a pointer is not copied before being overwritten, the object it refers to will be freed.
- The object with the overwritten pointer is colored grey and the algorithm must search that node again at the end.
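One way to realise an incremental-update barrier is to re-grey a black object whenever a pointer to a white object is stored into it. The sketch below follows that assumption; write_field, collector_step and mark_stack are hypothetical names.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.color = "white"
        self.fields = {}

mark_stack = []   # grey objects waiting to be scanned (or rescanned)

def write_field(obj, field, new_target):
    """Incremental-update barrier: a black object that receives a pointer
    to a white object reverts to grey so that it is scanned again."""
    obj.fields[field] = new_target
    if (obj.color == "black" and new_target is not None
            and new_target.color == "white"):
        obj.color = "grey"
        mark_stack.append(obj)

def collector_step():
    """Traverse one grey object and blacken it."""
    obj = mark_stack.pop()
    for child in obj.fields.values():
        if child is not None and child.color == "white":
            child.color = "grey"
            mark_stack.append(child)
    obj.color = "black"

a, b, d, e = (Obj(n) for n in "ABDE")
a.color = "black"
b.color = "grey"
mark_stack.append(b)
b.fields = {"x": d, "y": e}   # state when the mutator resumes

write_field(a, "x", d)        # black-to-white store: A reverts to grey
write_field(b, "x", None)     # D is now reachable only through A
write_field(b, "y", None)     # E becomes garbage before it is ever reached
while mark_stack:
    collector_step()
```

D survives because A is rescanned, while E, whose only pointer was overwritten before the collector reached it, stays white and can be reclaimed in this cycle. A snapshot-at-beginning barrier would have kept E alive.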
56Incremental Update Example
[Figure: objects A, B, C and D, before and after]
57Comments on Incremental Update
- Objects that become garbage during collection are far more likely to be collected than with the snapshot algorithm (less conservative).
- Although the collector restarts the traversal in some places, it is guaranteed to do a full search and will eventually terminate.
58Baker's Read Barrier Algorithms
- Incremental Copying
- Non-copying Algorithm (The Treadmill)
59Incremental Copying
- Variation of Copying Collector
- The garbage collection cycle begins with an atomic flip.
- All objects directly pointed to by the roots are copied into ToSpace.
60Read Barrier in Incremental Copying
- Whenever an object that is not already in ToSpace is read, the read barrier catches the read and copies the object over to ToSpace at that point.
- Normal background scavenging occurs simultaneously to ensure that all objects are traversed and reclamation can occur.
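The interplay between the read barrier and background scavenging can be sketched as follows. Each object carries a space tag instead of living in a real semispace, and the names read_field, scavenge_step and copy_to_tospace are assumptions made for this illustration.

```python
class Obj:
    def __init__(self, name, fields=None, space="from"):
        self.name = name
        self.fields = fields if fields is not None else {}
        self.space = space
        self.forward = None      # forwarding pointer, set once copied

to_space = []
scan = 0                         # index of the next ToSpace object to scavenge

def copy_to_tospace(obj):
    if obj.forward is None:
        obj.forward = Obj(obj.name, dict(obj.fields), space="to")
        to_space.append(obj.forward)
    return obj.forward

def read_field(obj, field):
    """Read barrier: the mutator never gets to see a FromSpace pointer."""
    target = obj.fields[field]
    if target is not None and target.space == "from":
        target = copy_to_tospace(target)
        obj.fields[field] = target
    return target

def scavenge_step():
    """Background scavenging: scan one ToSpace object. False when done."""
    global scan
    if scan == len(to_space):
        return False
    obj = to_space[scan]
    for field, target in list(obj.fields.items()):
        if target is not None and target.space == "from":
            obj.fields[field] = copy_to_tospace(target)
    scan += 1
    return True

e = Obj("E")                     # unreachable
c = Obj("C")
d = Obj("D", {"c": c})
b = Obj("B")
a = Obj("A", {"b": b, "d": d})

root = copy_to_tospace(a)        # atomic flip: copy what the root points to
read_field(root, "d")            # mutator reads D; the barrier copies it now
while scavenge_step():           # scavenging copies the remaining live objects
    pass
```

D is copied ahead of B because the mutator touched it first; E is never copied and is reclaimed when FromSpace is discarded.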
61Incremental Copying Example
[Figure: objects A through E in FromSpace and ToSpace; atomic flip, then a read to D occurs]
62Comments on Read Barrier
- If implemented in software, it can be quite slow due to numerous reads to the heap.
- Specialized hardware is available on some machines that allows this type of tracing to be done quickly.
63Baker's Incremental Non-Copying Algorithm
- Doubly Linked Lists
- New area for objects allocated since collection started
- To/From spaces
- Free list
64Example - Allocation
- Take an object from the free list and move it to
the new list.
65Example - Scanning
- Scanning searches nodes in ToSpace for references to objects in FromSpace.
- When one is found, the object is unlinked from FromSpace and linked into ToSpace.
66Treadmill Workings
- When starting a collection cycle
- New list is empty.
- From list contains all New and To objects from the last cycle.
- Collection proceeds, and scanning and allocation are performed.
- When finished
- From list is merged with the Free list.
67Comments on Treadmill
- As in Incremental Copying, the garbage found in the FromSpace is reclaimed in constant time.
- Conservative with new objects.
- Also conservative in that reached objects will not be removed even if they become garbage before the scan ends.
68Incremental Collectors Summary
- Incremental Tracing Collectors
- Tricolor Marking and Invariant
- Read and Write Barriers
- Snapshot-at-beginning
- Incremental Update
- Baker's Incremental Copying
- Baker's Non-copying (Treadmill)
69Generational Garbage Collection
- Attempts to address weaknesses of simple tracing collectors such as mark-sweep and copying collectors
- All active data must be marked or copied.
- For copying collectors, each page of the heap is touched every two collection cycles, even though the user program is only using half the heap, leading to poor cache behavior and page faults.
- Long-lived objects are handled inefficiently.
70Generational Garbage Collection(Continued)
- Generational garbage collection is based on the generational hypothesis
- Most objects die young.
- As such, concentrate garbage collection efforts on the objects most likely to be garbage: young objects.
71Generational Garbage Collection Object Lifetimes
- When we discuss object lifetimes, the amount of heap allocation that occurs between the object's birth and death is used rather than wall-clock time.
- For example, an object created when 1 KB of heap had been allocated and no longer referenced when 4 KB of heap data had been allocated would have lived for 3 KB.
72Generational Garbage Collection Object
Lifetimes(Continued)
- Typically, between 80 and 98 percent of all
newly-allocated heap objects die before another
megabyte has been allocated.
73Generational Garbage Collection(Continued)
- Objects are segregated into different areas of memory based on their age.
- Areas containing newer objects are garbage collected more frequently.
- After an object has survived a given number of collections, it is promoted to a less frequently collected area.
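Age-based promotion can be sketched with two lists standing in for the generations. PROMOTION_AGE, the tenured flag and minor_collection are hypothetical names, and the young roots passed in are assumed to already include any pointers from the old generation.

```python
PROMOTION_AGE = 2   # hypothetical: collections survived before promotion

class Obj:
    def __init__(self, name):
        self.name = name
        self.age = 0
        self.tenured = False
        self.children = []

def minor_collection(young, old, young_roots):
    """Collect only the young generation and promote old-enough survivors."""
    live = set()
    stack = list(young_roots)
    while stack:
        obj = stack.pop()
        if obj.tenured or id(obj) in live:
            continue                  # never trace into the old generation
        live.add(id(obj))
        stack.extend(obj.children)

    survivors = [o for o in young if id(o) in live]
    young.clear()                     # everything else in `young` is garbage
    for obj in survivors:
        obj.age += 1
        if obj.age >= PROMOTION_AGE:
            obj.tenured = True
            old.append(obj)
        else:
            young.append(obj)

a, b, c = Obj("A"), Obj("B"), Obj("C")
a.children = [b]
young, old = [a, b, c], []
minor_collection(young, old, young_roots=[a])   # C dies; A and B survive once
d = Obj("D")
young.append(d)
minor_collection(young, old, young_roots=[a])   # D dies; A and B are promoted
```

Only the young list is examined on each collection, which is where the shorter pause times come from. Once A and B are tenured, later minor collections skip them entirely.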
74Generational Garbage Collection Example
[Figure: root set and objects A, B, C and S in the Old Generation and New Generation, with memory usage for each]
75Generational Garbage Collection
Example(Continued)
[Figure: root set and objects A, B, C, R and S in the Old Generation and New Generation, with memory usage for each]
76Generational Garbage Collection
Example(Continued)
[Figure: root set and objects A, B, C, D, R and S in the Old Generation and New Generation, with memory usage for each]
77Generational Garbage Collection
Example(Continued)
- This example demonstrates several interesting characteristics of generational garbage collection
- The young generation can be collected independently of the older generations (resulting in shorter pause times).
- An inter-generational pointer was created from R to D. These pointers must be treated as part of the root set of the New Generation.
- Garbage collection in the new generation results in S becoming unreachable, and thus garbage. Garbage in older generations (sometimes called tenured garbage) cannot be reclaimed via garbage collections in younger generations.
78Generational Garbage Collection Implementation
- Usually implemented as a copying collector, where each generation has its own semispaces.
[Figure: Old Generation and New Generation, each with a FromSpace and a ToSpace]
79Generational Garbage Collection Issues
- Choosing an appropriate number of generations
- If we benefit from dividing the heap into two generations, can we further benefit by using more than two generations?
- Choosing a promotion policy
- How many garbage collections should an object survive before being moved to an older generation?
80Generational Garbage Collection
Issues(Continued)
- Tracking inter-generational pointers
- Inter-generational pointers need to be tracked, since they form part of the root set for younger generations.
- Collection Scheduling
- Can we attempt to schedule garbage collection in such a way that we minimize disruptive pauses?
81Generational Garbage Collection Multiple
Generations
Generation 1
Generation 2
Generation 3
Generation 4
82Generational Garbage Collection Multiple
Generations(Continued)
- Advantages
- Keeps the youngest generation's size small.
- Helps address mistakes made by the promotion policy by creating more intermediate generations that still get garbage collected fairly frequently.
- Disadvantages
- Collections for intermediate generations may be disruptive.
- Tends to increase the number of inter-generational pointers, increasing the size of the root set for younger generations.
- Most generational collectors are limited to just two or three generations.
83Generational Garbage Collection Promotion
Policies
- A promotion policy determines how many garbage collection cycles (the cycle count) an object must survive before being advanced to the next generation.
- If the cycle count is too low, objects may be advanced too fast; if too high, the benefits of generational garbage collection are not realized.
84Generational Garbage Collection Promotion
Policies(Continued)
- With a cycle count of just one, objects created just before the garbage collection will be advanced, even though the generational hypothesis states they are likely to die soon.
- Increasing the cycle count to two denies advancement to recently created objects.
- Under most conditions, increasing the cycle count beyond two does not significantly reduce the amount of data advanced.
85Generational Garbage Collection
Inter-generational Pointers
- Inter-generational pointers can be created in two ways
- When an object containing pointers is promoted to an older generation.
- When a pointer to an object in a newer generation is stored in an object.
- The garbage collector can easily detect promotion-caused inter-generational pointers, but handling pointer stores is a more complicated task.
86Generational Garbage Collection
Inter-generational Pointers
- Pointer stores can be tracked via the use of a write barrier
- Pointer stores must be accompanied by extra bookkeeping instructions that let the garbage collector know of pointers that have been updated.
- Often implemented at the compiler level.
87Generational Garbage Collection
Inter-generational Pointers(Continued)
- However, write barriers only provide a conservative estimate of live inter-generational pointers.
[Figure: Root Set, Old Generation and New Generation]
88Generational Garbage Collection
Inter-generational Pointers(Continued)
- Tracking inter-generational pointers is often the largest cost of generational garbage collection.
- 1 percent of a typical Lisp program's total instruction count is pointer stores. If a write barrier adds 10 instructions to each pointer store, overall performance will drop by 10 percent.
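The arithmetic behind that estimate is short enough to spell out. The sketch below uses a hypothetical run of 1000 instructions and assumes every instruction costs the same.

```python
base_instructions = 1000                        # hypothetical program run
pointer_stores = base_instructions // 100       # 1 percent are pointer stores
barrier_cost = 10                               # extra instructions per store

extra_instructions = pointer_stores * barrier_cost
overhead_percent = 100 * extra_instructions // base_instructions
```

Ten pointer stores at ten extra instructions each add 100 instructions to a 1000-instruction run, which is the 10 percent quoted above.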
89Generational Garbage Collection
Inter-generational Pointers(Continued)
- Entry Tables
- Pointers from older generations point indirectly to younger generations via an entry table.
[Figure: Generations 1, 2 and 3 with their entry tables]
90Generational Garbage Collection
Inter-generational Pointers(Continued)
- Entry Table Advantages
- When a younger generation is collected, only the entry table for that generation needs to be scanned.
- Entry Table Disadvantages
- The entry table may contain several entries to the same object, making scans of the entry table proportional to the number of pointer stores rather than to the number of inter-generational pointers.
- High overhead because of the extra level of indirection.
91Generational Garbage Collection
Inter-generational Pointers(Continued)
- Remembered Sets
- The write barrier checks whether a pointer being stored in an old object points to an object in a newer generation. If so, the address of the old object is added to the remembered set (if that object is not already in the set).
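A remembered-set write barrier might be sketched as follows. Generations are numbered with 0 as the youngest (an assumption of this example), and write_field, remembered_set and young_roots are illustrative names.

```python
class Obj:
    def __init__(self, name, generation):
        self.name = name
        self.generation = generation   # 0 = youngest; larger = older
        self.fields = {}

remembered_set = []   # old objects that may point into a younger generation

def write_field(obj, field, target):
    """Write barrier: remember old objects that are stored into."""
    obj.fields[field] = target
    if (target is not None
            and obj.generation > target.generation
            and not any(obj is r for r in remembered_set)):
        remembered_set.append(obj)

def young_roots(program_roots):
    """Roots for a minor collection: young program roots plus the young
    objects referenced from remembered old objects."""
    roots = [r for r in program_roots if r.generation == 0]
    for old in remembered_set:
        roots.extend(t for t in old.fields.values()
                     if t is not None and t.generation == 0)
    return roots

r = Obj("R", generation=1)
d = Obj("D", generation=0)
e = Obj("E", generation=0)
write_field(r, "x", d)     # old -> young store: R is remembered
write_field(r, "x", d)     # repeated store: R is not added twice
write_field(r, "y", e)
roots = young_roots([r])
```

R appears in the remembered set once, no matter how many times it is stored into, so the scan at collection time depends on the number of stored-into objects rather than the number of stores.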
92Generational Garbage Collection
Inter-generational Pointers(Continued)
- Remembered Sets (Continued)
New Generation
Old Generation
Remembered Set
93Generational Garbage Collection
Inter-generational Pointers(Continued)
- Remembered Sets Advantages
- Scanning is proportional to the number of stored-into objects, not the number of store operations.
- Remembered Sets Disadvantages
- Pointer store checking can be expensive.
94Generational Garbage Collection Collection
Scheduling
- Generational garbage collection aims to reduce pause times. When should these (hopefully short) pauses occur?
- Two strategies exist
- Hide collections when the user is least likely to notice a pause, or
- Trigger efficient collections when there is likely to be lots of garbage to collect.
95Generational Garbage Collection Advantages
- In practice it has proven to be an effective garbage collection technique.
- Minor garbage collections are performed quickly.
- Good cache and virtual memory behavior.
96Generational Garbage Collection Disadvantages
- Performs poorly if any of the main assumptions are false
- That objects tend to die young.
- That there are relatively few pointers from old objects to young ones.
- Frequent pointer writes to older generations will increase the cost of the write barrier, and possibly increase the size of the root set for younger generations.
97Garbage Collection Summary
Method | Conservatism | Space | Time | Fragmentation | Locality
Mark Sweep | Major | Basic | 1 traversal + heap scan | Yes | Fair
Mark Compact | Major | Basic | Many passes of heap | No | Good
Copying | Major | Two semispaces | 1 traversal | No | Poor
Reference Counting | No | Reference count field | Constant per assignment | Yes | Very Good
Deferred Reference Counting | Only for stack variables | Reference count field | Constant per assignment | Yes | Very Good
Incremental | Varies depending on algorithm | Varies | Can be guaranteed real-time | Varies | Varies
Generational | Variable | Segregated areas | Varies with number of live objects in new generation | Yes (non-copying), No (copying) | Good
98Garbage Collection Conclusions
- Relieves the burden of explicit memory allocation and deallocation.
- Software module coupling related to memory management issues is eliminated.
- An extremely dangerous class of bugs is eliminated.
99Garbage Collection Conclusions(Continued)
- Zorn's study in 1989/93 compared garbage collection to explicit deallocation
- Non-generational
- Between 0 and 36 percent more CPU time.
- Between 40 and 280 percent more memory.
- Generational garbage collection
- Between 5 and 20 percent more CPU time.
- Between 30 and 150 percent more memory.
- Wilson feels these numbers can be improved, and they are also out of date.
- A well-implemented garbage collector will slow a program down by approximately 10 percent relative to explicit heap deallocation.
100Garbage Collection Conclusions(Continued)
- Despite this cost, garbage collection is a feature in many widely used languages
- Lisp (1959)
- Perl (1987)
- Java (1995)
- C# (2001)
- Microsoft's Common Language Runtime (2002)
101Garbage Collection Pointers
- Heap of fish applet (Mark and Sweep garbage collection example)
- http://www.artima.com/insidejvm/applets/HeapOfFish.html
- Java HotSpot Garbage Collection Strategies
- http://developer.java.sun.com/developer/technicalArticles/Networking/HotSpot/
- The Memory Management Reference
- http://www.memorymanagement.org/
- Uniprocessor Garbage Collection Techniques (Wilson)
- http://www.cs.ualberta.ca/duane/courses/425-525/WilsonACMDraft.pdf
- Garbage Collection Algorithms for Automatic Dynamic Memory Management
- (Richard Jones and Rafael Lins)
102Questions?
- If you have any questions, please feel free to e-mail one of us
- Patrick Earl patrick_at_cs.ualberta.ca
- Simon Leonard sleonard_at_cs.ualberta.ca
- Jack Newton newton_at_cs.ualberta.ca