CSCI 435 Compiler Design - PowerPoint PPT Presentation

Provided by: OwenAst9

1
CSCI 435 Compiler Design

Week 9 Class 1
Section 5.2.3 to 5.2.4.3
(415-425 )
Ray Schneider

2
Topics of the Day

Actual Garbage Collection Algorithms
REFERENCE COUNTING
MARK and SCAN

3
Reference Counting

Reference counting is an intuitive garbage collection
algorithm that records in each chunk the number of
pointers that point to it
When that number drops to zero, the chunk can be
declared garbage
Reference counting literally collects garbage,
while the other methods actually collect
reachable chunks

4
How Reference Counting Works

When a Chunk is allocated from the Heap its
Reference Count is initialized to one
Whenever a reference to the Chunk is duplicated
its Reference Count is incremented
Whenever a reference to the Chunk is deleted its
Reference Count is decremented
If the Reference Count drops to zero the Chunk
can be freed because it is no longer reachable
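
The four rules above can be sketched in a few lines. This is only an illustration: the class `Chunk` and the helpers `duplicate` and `delete` are hypothetical names, and setting a `freed` flag stands in for returning the chunk to the free list.

```python
class Chunk:
    """A heap chunk that carries its own reference count (hypothetical layout)."""

    def __init__(self, name):
        self.name = name
        self.reference_count = 1   # allocation hands out the first reference
        self.freed = False


def duplicate(chunk):
    """A reference to the chunk is copied, so one more pointer points to it."""
    chunk.reference_count += 1
    return chunk


def delete(chunk):
    """A reference to the chunk is dropped; the chunk is freed at zero."""
    chunk.reference_count -= 1
    if chunk.reference_count == 0:
        chunk.freed = True   # stand-in for returning the chunk to the free list


a = Chunk("a")          # reference count 1
alias = duplicate(a)    # reference count 2
delete(alias)           # reference count 1, chunk still reachable
delete(a)               # reference count 0, chunk freed
```

The sketch ignores any pointers stored inside the chunk; those are what make recursive freeing necessary.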

5
Chunks with Reference Count in a Heap
Returning the chunk with a zero reference count
to the free list is not enough to reclaim all
garbage.
6
Result of removing the reference to b
For example, deleting the reference to chunk b
means that chunk e also becomes garbage, but f
remains in use since there is still one reference
left so its reference count does not drop to zero.
7
Two Main Issues

Implementing Reference Counting requires
1) Keeping track of all reference manipulations,
and
2) Recursively freeing chunks with zero reference
count
The compiler plays an important part in keeping
track of references:
it inserts special code for all
reference manipulations
Recursive freeing is accomplished by a run-time
routine

8
Compiler Tracking of references

The compiler inserts special code for ALL reference
manipulations:
incrementing the reference count when a reference
to a chunk is duplicated
References are typically duplicated as an effect
of some assignment in the source language: to a
variable in the program data area, to a field in a
dynamically allocated data structure, or in a
parameter transfer, since reference passing is
effectively an assignment to a local variable of
the called routine
Not all references point into the heap; many point
to the program data area, and these must not be
followed
decrementing the reference count when such a
reference is deleted
References are deleted implicitly by assignment:
assignment to a reference variable overwrites the
current reference with a new value, so before
installing the new reference the generated code
must decrement the reference count of the chunk
the current reference points to

9
Code generated for pointer assignment

Given the pointer assignment p := q, we generate
code such as that shown below
Another source of reference deletion is returning
from a routine call
On return all Local Variables are deleted
Local variables holding references to chunks
should be processed to decrement the associated
reference count
This also applies to parameters that hold
references

Processing a Pointer Assignment
    IF Points into the heap (q):
        Increment q .reference count
    IF Points into the heap (p):
        Decrement p .reference count
        IF p .reference count = 0:
            Free recursively depending on reference counts (p)
    SET p TO q
10
Memory Deallocation

The proper way to reclaim the memory allocated to a
chunk is to first decrement recursively the
reference counts of ALL references contained in the
chunk and then return it to the free list
Recursion tends to require an unpredictable
amount of stack space and needs to be avoided
Three Techniques
BEST is using pointer reversal, which we'll see in
Section 5.2.4.3
An adequate solution is the "bounded stack"
discussed in problem 5.10(c)
A simple trick that alleviates the problem somewhat
but is not a solution is Tail Recursion Elimination

11
Recursively freeing chunks While Avoiding
Tail Recursion
Use repetition rather than recursion for the last
pointer in the chunk to be freed. This avoids
using the stack for freeing chunks that form a
linear list.
PROCEDURE Free recursively depending on reference counts (Pointer):
    WHILE Pointer /= No chunk:
        IF NOT Points into the heap (Pointer): RETURN
        IF NOT Pointer .reference count = 0: RETURN
        FOR EACH Index IN 1 .. Pointer .number of pointers - 1:
            Free recursively depending on reference counts (Pointer .pointer [Index])
        SET Aux pointer TO Pointer
        IF Pointer .number of pointers = 0:
            SET Pointer TO No chunk
        ELSE Pointer .number of pointers > 0:
            SET Pointer TO Pointer .pointer [Pointer .number of pointers]
        Free chunk (Aux pointer)    // the actual freeing operation
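
The routine can be tried out with a rough Python rendering. Several things here are assumptions rather than part of the slide: `None` plays the role of "No chunk", the names `Chunk`, `free_chunk` and `freed` are hypothetical, and the sketch decrements each child's reference count just before examining it, a step the pseudocode leaves implicit.

```python
class Chunk:
    def __init__(self, name, pointers=()):
        self.name = name
        self.reference_count = 0
        self.pointers = list(pointers)
        for child in self.pointers:
            if child is not None:
                child.reference_count += 1   # this chunk now refers to child


freed = []   # records the order in which chunks are released


def free_chunk(chunk):
    freed.append(chunk.name)   # stand-in for returning the chunk to the free list


def free_recursively(pointer):
    """Free `pointer` if its count is zero, then whatever that makes garbage."""
    while pointer is not None:
        if pointer.reference_count != 0:
            return
        children = [c for c in pointer.pointers if c is not None]
        for child in children[:-1]:          # recursion for all but the last
            child.reference_count -= 1
            free_recursively(child)
        free_chunk(pointer)
        if not children:
            return
        pointer = children[-1]               # repetition for the last pointer
        pointer.reference_count -= 1


head = None
for i in range(10_000):
    head = Chunk(i, [head])    # a linear list linked through the last pointer
free_recursively(head)         # nothing refers to head, so its count is 0
print(len(freed))              # 10000
```

A fully recursive version would need one stack frame per list element here and would exceed Python's default recursion limit; the loop handles the list in constant stack space. Chunks with several pointers still cause recursion, which is why the trick only alleviates the problem.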
12
On the other hand ... Serious Drawbacks

1) Reference Counting cannot reclaim cyclic data
structures
The reference count of a was two and drops to one
when the pointer from the program data area is
released; it never reaches zero, so the chunk
cannot be reclaimed
Result: memory leaks

[Figure: heap containing chunks a, c, d and f, each labeled with reference count 1]
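
The leak is easy to reproduce in a small sketch. The two chunks `x` and `y` below are hypothetical and are not the chunks of the figure; `add_reference` and `drop_reference` are illustrative helper names.

```python
class Chunk:
    def __init__(self, name):
        self.name = name
        self.reference_count = 0
        self.pointers = []
        self.freed = False


def add_reference(target):
    target.reference_count += 1
    return target


def drop_reference(target):
    target.reference_count -= 1
    if target.reference_count == 0:
        target.freed = True
        for child in target.pointers:
            drop_reference(child)


x, y = Chunk("x"), Chunk("y")
root = add_reference(x)                 # pointer from the program data area
x.pointers.append(add_reference(y))     # x -> y
y.pointers.append(add_reference(x))     # y -> x closes the cycle, count(x) = 2

drop_reference(root)                    # count(x) drops to 1, never to 0
print(x.freed, y.freed)                 # False False
```

After the root reference is gone nothing in the program can reach `x` or `y`, yet each keeps the other's count at one, so neither is ever freed.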
13
... more Serious Drawbacks

2) Reference Counting is inefficient due to the
amount of pointer monitoring it must do.
Other techniques don't require pointer monitoring
and reclaim garbage chunks only when needed.
3) The final problem is memory fragmentation.
The free memory is reclaimed but remains
fragmented. One could do compaction in principle,
but few implementations do.
Still, Reference Counting is a popular technique
for managing small numbers of data structures in
handwritten software (e.g. the UNIX kernel's
recovery of file descriptors).

14
Mark and Scan (also called Mark and Sweep)

Mark and Scan garbage collection is very
effective. It frees all the memory that can be
freed.
When combined with compaction it also provides
the largest possible chunk of free memory
Two Phase Method
1) Marking Phase: marks all chunks that are still
reachable, and
2) Scan Phase: scans the allocated memory,
considers free all chunks that are not marked
reachable, and makes them available again

15
Marking 1

Based on two principles
1) Chunks reachable through the root set are
reachable, and
2) Any chunk reachable from a pointer in a
reachable chunk is itself reachable
We are assuming that the root set resides in a
program data area or the topmost activation
record, and that a data type description for it
has been constructed and made available by the
compiler
Chunks are marked reachable in a simple recursive
depth-first search, which is guaranteed to
terminate because each chunk is marked only once
and the chunks are finite in number
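
The two principles translate almost directly into a recursive routine. The sketch below is an illustration only: `Chunk`, `mark` and `mark_from_roots` are hypothetical names, and the root set is simply a Python list.

```python
class Chunk:
    def __init__(self, name):
        self.name = name
        self.pointers = []
        self.marked = False


def mark(chunk):
    """Depth-first marking; a chunk that is already marked is not revisited."""
    if chunk is None or chunk.marked:
        return
    chunk.marked = True
    for child in chunk.pointers:
        mark(child)          # principle 2: children of reachable chunks


def mark_from_roots(root_set):
    for root in root_set:
        mark(root)           # principle 1: chunks reachable through the root set


a, b, c = Chunk("a"), Chunk("b"), Chunk("c")
a.pointers = [b]
b.pointers = [a]             # a cycle does not cause looping
c.pointers = [a]             # c points to a, but nothing points to c
mark_from_roots([a])
print([x.name for x in (a, b, c) if x.marked])   # ['a', 'b']
```

Checking the mark before descending is what guarantees termination on cyclic structures. The recursion depth still grows with the longest pointer chain, so the routine needs stack space proportional to it.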

16
Marking 2

Marking requires another bit to be added to the
chunk, the MARKED BIT in addition to the FREE BIT
The main problem with a recursive process like this
is that it needs a stack of unknown size just at
a time when memory is limited (after all, garbage
collection has been triggered)
The question is where to put the stack. A
clever and workable answer is to distribute the
required memory throughout the chunks
One pointer and a small counter are sufficient
The pointer points back to the parent chunk
The counter counts how many pointers have already
been processed in the present chunk

17
Marking the third child of a chunk

This technique costs room for one pointer, one
counter, and one bit per allocated chunk.
Whether or not this is a problem depends on the
average size of a chunk. We will see a way to
mark the directed graph of all reachable chunks
without using space for the extra pointer in a
brief discussion of the Schorr and Waite
algorithm.

18
Scanning and freeing

Freeing the Unreachable Chunks is now easy
Using the lengths noted in the chunks we step
through memory from chunk to chunk, checking
whether each has been marked reachable (if
reachable we clear the marked bit, and if not we
set the free bit)
We can also do relatively easy Compaction
We keep a pointer F to the first free chunk, find
and note its size, and as long as we keep meeting
free chunks we just add up their sizes until we
run into a chunk that is in use. Then we update
the administration of the chunk pointed to by F
to the total size of the chunks we've encountered
Repeat on encountering the next free chunk
Result of Mark and Scan is a heap in which ALL
chunks marked in use are reachable and each pair
of free chunks is separated by chunks in use.
This is as good as it gets unless you move chunks
in a Compaction Phase so that all free chunks are
merged into one large chunk.
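
A small sketch of the scan phase, including the merging of adjacent free chunks. It rests on assumptions: the heap is modeled as a Python list of chunks in address order, each with a `size`, a `free` bit and a `marked` bit, and the last entry of `result` plays the role of the pointer F whenever it is free.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    size: int
    free: bool = False
    marked: bool = False


def scan(heap):
    """Free all unmarked chunks and merge runs of adjacent free chunks."""
    result = []
    for chunk in heap:
        if chunk.marked:
            chunk.marked = False    # reachable: clear the bit for next time
        else:
            chunk.free = True       # unreachable, or free already
        if chunk.free and result and result[-1].free:
            result[-1].size += chunk.size   # grow the free chunk F points to
        else:
            result.append(chunk)
    return result


heap = [
    Chunk(16, marked=True),   # in use and reachable
    Chunk(8),                 # in use but unreachable: garbage
    Chunk(24, free=True),     # free before the collection
    Chunk(32, marked=True),   # in use and reachable
    Chunk(8),                 # garbage
]
result = scan(heap)
print([(c.size, c.free) for c in result])
# [(16, False), (32, True), (32, False), (8, True)]
```

The garbage chunk of size 8 and the free chunk of size 24 end up as one free chunk of size 32. No chunk is moved, so the two remaining free chunks stay separated by the chunk in use between them.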

19
Pointer Reversal marking without using stack
space

We can reduce the pointer overhead almost
entirely by using a method called Pointer
Reversal
Allows one to visit all the nodes of a directed
graph without using additional stack space for
the graph, also called Schorr and Waite algorithm
after its inventors (Schorr and Waite, 1967)
How it works? (the basic observation)
When the marking algorithm is working on chunk C,
it finds the pointers in C, which point to the
children of C, and visits them one by one. If the
marking algorithm has gone off to visit the nth
child of C, say D, then on returning from that
visit it can retain a pointer to D, the chunk it
just left. That is the same pointer that resides
in the nth pointer field of C. So while visiting
the nth child of C, the contents of the nth
pointer field in C are redundant
AHA!

20
Pointer Reversal 2

Schorr and Waite graph marking takes advantage of
this redundancy to store the pointer (not on the
stack) but ..
The algorithm maintains two auxiliary pointers:
Parent pointer and Chunk pointer
Chunk pointer points to the chunk being processed
Parent pointer points to its parent
Each chunk also contains a Counter Field which
records the number of the pointer in the chunk
that is being followed, starting with 0 and
finishing when it has reached the total number of
pointers in the chunk

21
Schorr and Waite Arriving at C
The Situation when the processing of chunk C
begins! We assume that the processing of
pointers in C proceeds until we reach the nth
pointer, pointing to D
22
Moving to D
To move to D we shift the contents of the Parent
pointer, the Chunk pointer, and the
nth pointer field in C circularly, using a
temporary variable Old parent pointer
[Figure labels: P, Parent pointer, nth pointer, Chunk pointer, D, 0]
23
the hoochey koochey ...
// C is pointed to by Chunk pointer.
SET Old parent pointer TO Parent pointer
SET Parent pointer TO Chunk pointer
// C is pointed to by Parent pointer.
SET Chunk pointer TO n-th pointer field in C
SET n-th pointer field in C TO Old parent pointer
This results in the return pointer to the parent P
of C being stored in the nth pointer field of C,
which normally points to D. The situation is now
equivalent to our first arrival at C, except that
C is now the parent and D is the chunk being
processed.
24
Return to D
Shows the situation when we are about to return
from visiting D. Only difference is that the
counter in D has reached the number of pointers
in D. Now we have to circularly shift back the
pointers and increment the counter in C.
25
Gett'in outta Dodge returning from D
// C is pointed to by Parent pointer.
SET Old parent pointer TO Parent pointer
// C is pointed to by Old parent pointer.
SET Parent pointer TO n-th pointer field in C
SET n-th pointer field in C TO Chunk pointer
SET Chunk pointer TO Old parent pointer
// C is pointed to by Chunk pointer.
[Figure labels: P, C, D, pp, opp, cp, Cn]
The whole fancy footwork is then repeated for the
(n+1)-th pointer in C until all children of C have
been visited. Then we return from C
26
About to return from C
What we've seen is a clever technique for
avoiding a stack while visiting all the nodes
in a graph. To avoid looping, nodes need to
be marked at the beginning of a visit and, once
marked, not visited again.
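
Putting the moves down and back up together gives a complete marking routine. The Python below is a sketch under assumptions, not the textbook's code: `Chunk`, `counter` and `mark_by_pointer_reversal` are hypothetical names, `None` stands for an empty pointer field, and the counter is reset to 0 when a chunk is finished so it can be reused by a later collection.

```python
class Chunk:
    def __init__(self, name, n_pointers):
        self.name = name
        self.pointers = [None] * n_pointers
        self.counter = 0        # number of the pointer being followed
        self.marked = False


def mark_by_pointer_reversal(root):
    """Mark every chunk reachable from `root` without using a stack."""
    if root is None or root.marked:
        return
    parent, chunk = None, root
    chunk.marked = True
    while True:
        if chunk.counter < len(chunk.pointers):
            n = chunk.counter
            child = chunk.pointers[n]
            if child is None or child.marked:
                chunk.counter += 1          # nothing to visit via this field
            else:
                # Move down: the n-th field of chunk now holds the old parent.
                chunk.pointers[n] = parent
                parent, chunk = chunk, child
                chunk.marked = True         # mark at the start of the visit
        else:
            chunk.counter = 0               # finished with this chunk
            if parent is None:
                return                      # back at the root: all done
            # Move up: restore the n-th field of the parent.
            n = parent.counter
            grandparent = parent.pointers[n]
            parent.pointers[n] = chunk
            parent, chunk = grandparent, parent
            chunk.counter += 1              # go on with the (n+1)-th pointer


a, b, c, d = Chunk("a", 2), Chunk("b", 1), Chunk("c", 2), Chunk("d", 0)
a.pointers = [b, c]
b.pointers = [c]
c.pointers = [a, d]          # pointer back to a forms a cycle
mark_by_pointer_reversal(a)
print([x.name for x in (a, b, c, d) if x.marked])   # ['a', 'b', 'c', 'd']
```

While the algorithm is below a chunk, exactly one of that chunk's pointer fields holds the way back instead of its normal contents; once marking ends, every field has been restored to its original value.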
27
Next time ...

We'll start looking at Chapter 6 which is the
last chapter we'll be looking at ...
Paradigm Specific Techniques
Specifically IMPERATIVE and OBJECT ORIENTED
PROGRAMS
We'll be covering the material from
Section 6.0 to 6.2.3.2 (28 pages),
Section 6.2.4 to 6.2.10 (17 pages), and
Section 6.4 to 6.4.2.3 (19 pages), for a total of
64 pages

28
Homework for Week 11