CSCI 435 Compiler Design - PowerPoint PPT Presentation

Provided by: OwenAst9

1
CSCI 435 Compiler Design
  • Week 9 Class 1
  • Section 5.2.3 to 5.2.4.3
  • (pp. 415-425)
  • Ray Schneider

2
Topics of the Day
  • Actual Garbage Collection Algorithms
  • REFERENCE COUNTING
  • MARK and SCAN

3
Reference Counting
  • an intuitive garbage collection algorithm that
    records in each chunk the number of pointers that
    point to it
  • when the number drops to zero the chunk can be
    declared garbage
  • Reference Counting literally collects Garbage
    while the other methods actually collect
    reachable chunks

4
How Reference Counting Works
  • When a Chunk is allocated from the Heap its
    Reference Count is initialized to one
  • Whenever a reference to the Chunk is duplicated
    its Reference Count is incremented
  • Whenever a reference to the Chunk is deleted its
    Reference Count is decremented
  • If the Reference Count drops to zero the Chunk
    can be freed because it is no longer reachable
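
A minimal sketch of these four rules in Python. The Chunk class, the two helper functions and the free_list are hypothetical names chosen for illustration; they are not taken from the text.

```python
class Chunk:
    """A heap chunk that carries its own reference count (hypothetical layout)."""

    def __init__(self, name):
        self.name = name
        self.reference_count = 1      # allocation hands out the first reference


def duplicate_reference(chunk):
    """Called whenever a reference to the chunk is copied."""
    chunk.reference_count += 1
    return chunk


def delete_reference(chunk, free_list):
    """Called whenever a reference is dropped; frees the chunk at zero."""
    chunk.reference_count -= 1
    if chunk.reference_count == 0:
        free_list.append(chunk)       # no longer reachable


free_list = []
a = Chunk("a")                        # count is 1
alias = duplicate_reference(a)        # count is 2
delete_reference(a, free_list)        # count is 1, chunk still in use
still_in_use = (free_list == [])
delete_reference(alias, free_list)    # count is 0, chunk is freed
```

The sketch frees only the chunk itself; pointers stored inside the chunk are ignored here.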

5
Chunks with Reference Count in a Heap
Returning the chunk with a zero reference count
to the free list is not enough to reclaim all
garbage.
6
Result of removing the reference to b
For example, deleting the reference to chunk b
means that chunk e also becomes garbage, but f
remains in use since there is still one reference
left so its reference count does not drop to zero.
7
Two Main Issues
  • Implementing Reference Counting requires
  • 1) Keeping track of all reference manipulations,
    and
  • 2) Recursively freeing chunks with zero reference
    count
  • Compiler plays an important part in keeping track
    of references
  • the compiler inserts special code for all
    reference manipulations
  • Recursive freeing is accomplished by a run-time
    routine

8
Compiler Tracking of references
  • Compiler inserts special code for ALL reference
    manipulations
  • incrementing the reference count when a reference
    to a chunk is duplicated
  • References are typically duplicated as an effect
    of some assignment in the source language: to a
    variable in the program data area, to a field in
    a dynamically allocated data structure, or by
    parameter transfer, since passing a reference is
    effectively an assignment to a local variable of
    the called routine
  • Not all references are to the heap, many point to
    the program data area and these must not be
    followed
  • decrementing the reference count when such a
    reference is deleted
  • References are deleted implicitly by assignment.
    Assignment to a reference variable overwrites the
    current reference with a new value, so before
    installing the new reference we must decrement
    the reference count of the chunk the variable
    currently refers to

9
Code generated for pointer assignment
  • Given the pointer assignment p := q we generate
    code such as that shown below
  • Another source of reference deletion is returning
    from a routine call
  • On return all Local Variables are deleted
  • Local variables holding references to chunks
    should be processed to decrement the associated
    reference count
  • This also applies to parameters that hold
    references

Processing a Pointer Assignment
IF Points into the heap (q):
    Increment q .reference count
IF Points into the heap (p):
    Decrement p .reference count
    IF p .reference count = 0:
        Free recursively depending on reference counts (p)
SET p TO q
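
A hedged Python sketch of the code a compiler might emit for a pointer assignment. The names Chunk, points_into_heap, free_recursively, assign and the variables dictionary are assumptions made for illustration. A chunk starts with count 1, and the pointers handed to the constructor are taken over by the new chunk. The chunk names b, e and f follow the earlier example of removing the reference to b.

```python
class Chunk:
    def __init__(self, name, pointers=()):
        self.name = name
        self.reference_count = 1          # one reference exists after allocation
        self.pointers = list(pointers)    # takes over the references passed in


def points_into_heap(ref):
    # Assumption: only Chunk objects live in the heap; anything else does not.
    return isinstance(ref, Chunk)


freed = []


def free_recursively(ref):
    """Free ref if its count is zero, releasing the references it contains."""
    if not points_into_heap(ref) or ref.reference_count != 0:
        return
    for child in ref.pointers:
        if points_into_heap(child):
            child.reference_count -= 1
            free_recursively(child)
    freed.append(ref.name)                # stands in for the real free list


def assign(variables, p, q):
    """Reference-count bookkeeping for the assignment p := q."""
    new, old = variables[q], variables[p]
    if points_into_heap(new):
        new.reference_count += 1
    if points_into_heap(old):
        old.reference_count -= 1
        if old.reference_count == 0:
            free_recursively(old)
    variables[p] = new


e, f, c = Chunk("e"), Chunk("f"), Chunk("c")
f.reference_count += 1                    # f is held by b and by variable r
b = Chunk("b", [e, f])
variables = {"p": b, "q": c, "r": f}
assign(variables, "p", "q")               # b and e become garbage, f survives
```

Incrementing q before decrementing p keeps the chunk alive when both variables already refer to the same chunk.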
10
Memory Deallocation
  • proper way to reclaim memory allocated to a chunk
    is to first decrement recursively the reference
    counts of ALL references contained in the chunk
    and then return it to the free list
  • RECURSION tends to require an unpredictable
    amount of stack space and needs to be avoided
  • Three Techniques
  • BEST is using pointer reversal which we'll see in
    Section 5.2.4.3
  • Adequate solution is "bounded stack" discussed in
    problem 5.10(c)
  • Simple trick that alleviates the problem some but
    is not a solution is Tail Recursion Elimination

11
Recursively freeing chunks While Avoiding
Tail Recursion
Use repetition rather than recursion for the last
pointer in the chunk to be freed. This avoids
using the stack for freeing chunks that form a
linear list.
PROCEDURE Free recursively depending on reference counts (Pointer)
    WHILE Pointer /= No chunk:
        IF NOT Points into the heap (Pointer): RETURN
        IF NOT Pointer .reference count = 0: RETURN
        FOR EACH Index IN 1 .. Pointer .number of pointers - 1:
            Free recursively depending on reference counts
                (Pointer .pointer [Index])
        SET Aux pointer TO Pointer
        IF Pointer .number of pointers = 0:
            SET Pointer TO No chunk
        ELSE Pointer .number of pointers > 0:
            SET Pointer TO
                Pointer .pointer [Pointer .number of pointers]
        Free chunk (Aux pointer)    // the actual freeing operation
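
A rough Python rendering of the same idea. The names Chunk, release, freed and max_depth are hypothetical. Unlike the pseudocode, this sketch decrements a chunk's count at the moment a reference to it is dropped, which is an assumption about where that bookkeeping happens.

```python
class Chunk:
    def __init__(self, name, pointers=()):
        self.name = name
        self.reference_count = 1
        self.pointers = list(pointers)


freed = []
max_depth = 0        # deepest recursion level reached, for illustration only


def release(ref, depth=1):
    """Drop one reference to ref and free whatever becomes garbage."""
    global max_depth
    while isinstance(ref, Chunk):
        ref.reference_count -= 1
        if ref.reference_count != 0:
            return
        max_depth = max(max_depth, depth)
        # Recurse for every pointer except the last one ...
        for child in ref.pointers[:-1]:
            release(child, depth + 1)
        # ... and handle the last pointer by repetition instead.
        last = ref.pointers[-1] if ref.pointers else None
        freed.append(ref.name)            # the actual freeing operation
        ref = last


# A linear list of 5000 chunks: each chunk holds one pointer to the next.
head = None
for i in range(5000):
    head = Chunk(f"n{i}", [head] if head is not None else [])
release(head)
```

The whole list is freed at recursion depth one. A version that recursed on the last pointer as well would need 5000 nested calls, which is more than CPython's usual default recursion limit allows.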
12
On the other hand ... Serious Drawbacks
  • 1) Reference Counting cannot reclaim cyclic data
    structures
  • The reference count of a was two and drops to one
    when the pointer from the program data area is
    released, so the chunk cannot be reclaimed even
    though it is unreachable. The result: Memory Leaks

[Figure: heap containing chunks a, c, d and f, each with reference count 1]
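
The leak can be reproduced in a few lines of Python. This is a sketch under assumptions: the helper names are hypothetical, and the order in which a, d, f and c point to each other is a guess, since only the chunk names and their counts survive from the figure.

```python
class Chunk:
    def __init__(self, name):
        self.name = name
        self.reference_count = 0
        self.pointers = []


freed = []


def add_reference(target):
    target.reference_count += 1
    return target


def drop_reference(target):
    target.reference_count -= 1
    if target.reference_count == 0:
        for child in target.pointers:
            drop_reference(child)
        freed.append(target.name)


a, c, d, f = (Chunk(name) for name in "acdf")
root = add_reference(a)                  # pointer from the program data area
a.pointers.append(add_reference(d))      # assumed ring: a -> d -> f -> c -> a
d.pointers.append(add_reference(f))
f.pointers.append(add_reference(c))
c.pointers.append(add_reference(a))      # closes the cycle, a now has count 2

drop_reference(root)                     # a drops from 2 to 1, never to 0
counts = [chunk.reference_count for chunk in (a, c, d, f)]
```

After the root pointer is released every chunk in the ring still has count 1, so nothing is freed although none of the chunks is reachable.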
13
... more Serious Drawbacks
  • 2) Reference Counting is inefficient because every
    pointer manipulation has to be monitored. Other
    techniques don't require pointer monitoring and
    reclaim garbage chunks only when needed.
  • 3) The final problem is memory fragmentation.
    The free memory is reclaimed but remains
    fragmented. One could do compaction in principle
    but few do.
  • Still, Reference Counting is a popular technique
    for managing small numbers of data structures in
    handwritten software (e.g. the UNIX kernel's
    recovery of file descriptors).

14
Mark and Scan (also called Mark and Sweep)
  • Mark and Scan garbage collection is very
    effective. It frees all the memory that can be
    freed.
  • When combined with compaction it also provides
    the largest possible chunk of free memory
  • Two Phase Method: 1) the Marking Phase marks all
    chunks that are still reachable, and 2) the Scan
    Phase scans the allocated memory, considers free
    all chunks that are not marked reachable, and
    makes them available again

15
Marking 1
  • Based on two principles
  • 1) Chunks reachable through the root set are
    reachable, and
  • 2) Any chunk reachable from a pointer in a
    reachable chunk is itself reachable
  • We are assuming that the root set resides in a
    program data area or the topmost activation
    record, and a data type description for it has
    been constructed and made available by the
    compiler
  • Chunks are marked reachable in a simple recursive
    depth-first search, which is guaranteed to
    terminate because each chunk is marked only once
    and the chunks are finite in number
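
A small sketch of the recursive marking phase in Python. The Chunk class and the function names are hypothetical; the root set is simply assumed to be a list of chunk references.

```python
class Chunk:
    def __init__(self, name):
        self.name = name
        self.marked = False
        self.pointers = []


def mark(chunk):
    """Depth-first marking; the marked bit stops the search on cycles."""
    if chunk is None or chunk.marked:
        return
    chunk.marked = True
    for child in chunk.pointers:
        mark(child)


def mark_from_roots(root_set):
    for root in root_set:
        mark(root)


a, b, c, d = (Chunk(name) for name in "abcd")
a.pointers = [b, None]
b.pointers = [a]          # cycle a -> b -> a
c.pointers = [a]          # c points into the live data but nothing points to c
d.pointers = []
mark_from_roots([a, d])
marked_names = [chunk.name for chunk in (a, b, c, d) if chunk.marked]
```

The cycle between a and b does not cause looping because a is already marked when the search returns to it.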

16
Marking 2
  • Marking requires another bit to be added to the
    chunk, the MARKED BIT in addition to the FREE BIT
  • Main Problem with a recursive process like this
    is that it needs a stack of unknown size just at
    a time when memory is limited (after all garbage
    collection has been triggered)
  • Where to put the stack? is the question. A
    clever and workable answer is to distribute the
    required memory throughout the chunks
  • One pointer and a small counter are sufficient
  • The pointer points back to the parent chunk
  • The counter counts how many pointers have already
    been processed in the present chunk

17
Marking the third child of a chunk
  • This technique costs room for one pointer, one
    counter, and one bit per allocated chunk.
    Whether or not this is a problem depends on the
    average size of a chunk. We will see a way to
    mark the directed graph of all reachable chunks
    without using space for the extra pointer in a
    brief discussion of the Schorr and Waite
    algorithm.
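
A sketch of marking with the stack distributed over the chunks, assuming each chunk has a parent field and a counter field as described. All names are hypothetical, and resetting the counter on the way back up is a design choice of this sketch.

```python
class Chunk:
    def __init__(self, name):
        self.name = name
        self.pointers = []
        self.marked = False
        self.parent = None    # back pointer to the chunk we came from
        self.counter = 0      # number of pointers already processed


def mark_without_stack(root):
    if root is None or root.marked:
        return
    root.marked = True
    root.parent = None
    chunk = root
    while chunk is not None:
        if chunk.counter < len(chunk.pointers):
            child = chunk.pointers[chunk.counter]
            chunk.counter += 1
            if child is not None and not child.marked:
                child.marked = True
                child.parent = chunk
                chunk = child             # descend into the child
        else:
            chunk.counter = 0             # ready for the next collection
            chunk = chunk.parent          # all pointers done: go back up


# A chain of 5000 chunks whose last element points back to the first.
chunks = [Chunk(f"n{i}") for i in range(5000)]
for i in range(4999):
    chunks[i].pointers.append(chunks[i + 1])
chunks[-1].pointers.append(chunks[0])
unreachable = Chunk("x")
mark_without_stack(chunks[0])
```

No recursion is used, so the depth of the data structure does not matter.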

18
Scanning and freeing
  • Freeing the Unreachable Chunks is now easy
  • Using the lengths noted in the chunks we step
    through memory from chunk to chunk checking if it
    has been marked reachable (if reachable we clear
    the marked bit and if not we set the free bit)
  • We can also do relatively easy Compaction
  • We keep a pointer F to the first free chunk, find
    and note its size, and as long as we keep meeting
    free chunks we just add up their sizes until we
    run into a chunk that is in use. Then we update
    the administration of the chunk pointed to by F
    to the total size of the chunks we've encountered
  • Repeat on encountering next free chunk
  • Result of Mark and Scan is a heap in which ALL
    chunks marked in use are reachable and each pair
    of free chunks is separated by chunks in use.
    This is as good as it gets unless you move chunks
    in a Compaction Phase so that all free chunks are
    merged into one large chunk.
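
The scan phase with merging of adjacent free chunks can be sketched as follows. The heap is modelled, as an assumption, as a list of dictionaries in address order with the keys size, marked and free; the last element of the result list plays the role of F whenever it is free.

```python
def scan_and_merge(heap):
    """Free unmarked chunks and merge runs of adjacent free chunks."""
    result = []
    for chunk in heap:
        if chunk["marked"]:
            chunk["marked"] = False       # reachable: clear the marked bit
        else:
            chunk["free"] = True          # unreachable: set the free bit
        if chunk["free"] and result and result[-1]["free"]:
            result[-1]["size"] += chunk["size"]   # grow the chunk F points to
        else:
            result.append(chunk)
    return result


heap = [
    {"size": 16, "marked": True,  "free": False},   # reachable
    {"size": 8,  "marked": False, "free": False},   # garbage
    {"size": 24, "marked": False, "free": True},    # was already free
    {"size": 32, "marked": True,  "free": False},   # reachable
    {"size": 8,  "marked": False, "free": False},   # garbage
]
swept = scan_and_merge(heap)
sizes = [chunk["size"] for chunk in swept]
free_bits = [chunk["free"] for chunk in swept]
```

In the result each pair of free chunks is separated by a chunk in use, which matches the state described above. Chunks are not moved.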

19
Pointer Reversal marking without using stack
space
  • We can reduce the pointer overhead almost
    entirely by using a method called Pointer
    Reversal
  • Allows one to visit all the nodes of a directed
    graph without using additional stack space for
    the graph, also called Schorr and Waite algorithm
    after its inventors (Schorr and Waite, 1967)
  • How does it work? (the basic observation)
  • When the marking algorithm is working on chunk C,
    it finds the pointers in C, which point to the
    children of C, and visits them one by one. If the
    marking algorithm has gone off to visit the nth
    child of C, say D, then on returning from that
    visit it can retain a pointer to D, the chunk it
    just left. That is the same pointer that resides
    in the nth pointer field of C. So while the nth
    child of C is being visited, the contents of the
    nth pointer field in C are redundant
  • AHA!

20
Pointer Reversal 2
  • Schorr and Waite graph marking takes advantage of
    this redundancy to store the pointer to the parent
    of C not on the stack but in the nth pointer
    field of C
  • algorithm maintains two auxiliary pointers Parent
    pointer and Chunk pointer
  • Chunk pointer points to the chunk being processed
  • Parent pointer points to its parent
  • Each chunk also contains a Counter Field which
    records the number of the pointer in the chunk
    that is being followed starting with 0 and
    finishing when it has reached the total number of
    pointers in the chunk

21
Schorr and Waite Arriving at C
The Situation when the processing of chunk C
begins! We assume that the processing of
pointers in C proceeds until we reach the nth
pointer, pointing to D
22
Moving to D
To move to D we shift the contents of the Parent
pointer, the Chunk pointer, and the nth pointer
field in C circularly, using a temporary variable
Old parent pointer.
[Figure: chunks P and D (counter 0) with labels Parent pointer, nth pointer and Chunk pointer]
23
the hoochey koochey ...
// C is pointed to by Chunk pointer.
SET Old parent pointer TO Parent pointer
SET Parent pointer TO Chunk pointer
// C is pointed to by Parent pointer.
SET Chunk pointer TO n-th pointer field in C
SET n-th pointer field in C TO Old parent pointer
This results in the return pointer to the parent P
of C being stored in the nth pointer field of C,
which normally points to D. The situation is now
equivalent to our first arrival at C, except that
C is now the parent and D is the chunk being
processed.
24
Return to D
Shows the situation when we are about to return
from visiting D. Only difference is that the
counter in D has reached the number of pointers
in D. Now we have to circularly shift back the
pointers and increment the counter in C.
25
Gett'in outta Dodge returning from D
// C is pointed to by Parent pointer.
SET Old parent pointer TO Parent pointer
// C is pointed to by Old parent pointer.
SET Parent pointer TO n-th pointer field in C
SET n-th pointer field in C TO Chunk pointer
SET Chunk pointer TO Old parent pointer
// C is pointed to by Chunk pointer.
[Figure: chunks P, C and D with pointer labels pp, opp and cp]
The whole fancy footwork is then repeated for the
(n+1)-th pointer in C until all children of C have
been visited. Then we return from C.
26
About to return from C
What we've seen is a clever technique for
avoiding a stack while visiting all the nodes
in a graph. To avoid looping, a node needs to
be marked at the beginning of its visit, and a
node that is already marked is not visited again.
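
Putting the two pointer shifts together gives a marking routine along these lines. This is a sketch, not the textbook's code: the Chunk class and the function name are hypothetical, and the marked bit and the counter field are assumed to be stored in each chunk.

```python
class Chunk:
    def __init__(self, name):
        self.name = name
        self.pointers = []
        self.marked = False
        self.counter = 0      # index of the pointer currently being followed


def mark_by_pointer_reversal(root):
    if root is None or root.marked:
        return
    parent, chunk = None, root
    chunk.marked = True
    while True:
        if chunk.counter < len(chunk.pointers):
            n = chunk.counter
            child = chunk.pointers[n]
            if child is None or child.marked:
                chunk.counter += 1            # nothing to visit through this field
            else:
                # Move down: the n-th field now holds the old parent pointer.
                chunk.pointers[n] = parent
                parent, chunk = chunk, child
                chunk.marked = True
        else:
            chunk.counter = 0                 # all pointers of this chunk are done
            if parent is None:
                return                        # back at the root: finished
            # Move up: restore the n-th field of the parent and continue there.
            n = parent.counter
            old_parent = parent
            parent = old_parent.pointers[n]
            old_parent.pointers[n] = chunk
            chunk = old_parent
            chunk.counter += 1


a, b, c, d, e = (Chunk(name) for name in "abcde")
a.pointers = [b, c]
b.pointers = [d, a]       # cycle back to a
c.pointers = [d]          # d is shared by b and c
e.pointers = [a]          # e is not reachable from a
before = {x.name: [p.name for p in x.pointers] for x in (a, b, c, d, e)}
mark_by_pointer_reversal(a)
after = {x.name: [p.name for p in x.pointers] for x in (a, b, c, d, e)}
```

Comparing before and after shows that every reversed pointer has been restored once marking is complete.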
27
Next time ...
  • We'll start looking at Chapter 6 which is the
    last chapter we'll be looking at ...
  • Paradigm Specific Techniques
  • Specifically IMPERATIVE and OBJECT ORIENTED
    PROGRAMS
  • We'll be covering the material from
  • Section 6.0 to 6.2.3.2 (28 pages),
  • Section 6.2.4 to 6.2.10 (17 pages), and
  • Section 6.4 to 6.4.2.3 (19 pages), for a total of
    64 pages

28
Homework for Week 11
  • Bison ...

29
References
  • Text: Modern Compiler Design
