Title: 6.001 SICP Memory Management
1 6.001 SICP Memory Management
- Register Machine data storage
  - registers, stack, and heap
- List structured memory
  - how do we make cons cells?
  - what is "garbage" and where does it come from?
  - how do we reclaim stranded memory?
- Garbage collection algorithms
  - Mark/Sweep
  - Stop & Copy
2 A View of Our Register Machine
- Places to store data
  - registers
  - stack
  - heap
- Values to store
[Figure: the machine's registers: exp, env, val, continue, proc, argl, unev]
3 Vector Abstraction
- Most hardware supports random access memory through a memory indexing or vector abstraction mechanism
[Figure: the vector "myvector" drawn as a row of cells indexed 0 through 11, ...; labels: base address, integer index (or offset), places to put values]
- Register Machine
  - (assign <reg> (op vector-ref) (reg <v>) <index>)
  - (perform (op vector-set!) (reg <v>) <index> <val>)
4 Vector Implementation of the Stack
[Figure: the vector "stack" with cells 0 through 7; the stack-pointer (sp) marks the boundary between the Occupied cells and the Empty cells]
- (save <reg>)
- becomes
  - (assign sp (op +) (reg sp) (const 1))
  - (perform (op vector-set!) (reg stack) (reg sp) (reg <reg>))
- (restore <reg>)
- becomes
  - (assign <reg> (op vector-ref) (reg stack) (reg sp))
  - (assign sp (op -) (reg sp) (const 1))
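The save and restore expansions above can be tried out with a small simulation. This is only a sketch: a Python list stands in for the stack vector, and the starting value of sp (-1, meaning "nothing saved yet") and the overflow/underflow checks are assumptions that the slide does not specify.

```python
# Sketch: a Python list plays the role of the "stack" vector.
STACK_SIZE = 8
stack = [None] * STACK_SIZE   # (reg stack)
sp = -1                       # assumption: sp indexes the top occupied cell; -1 = empty


def save(value):
    """(save <reg>): increment sp, then store the value at stack[sp]."""
    global sp
    if sp + 1 >= STACK_SIZE:  # bounds check is an addition, not on the slide
        raise OverflowError("stack overflow")
    sp = sp + 1               # (assign sp (op +) (reg sp) (const 1))
    stack[sp] = value         # (perform (op vector-set!) (reg stack) (reg sp) (reg <reg>))


def restore():
    """(restore <reg>): read stack[sp], then decrement sp."""
    global sp
    if sp < 0:                # bounds check is an addition, not on the slide
        raise IndexError("stack underflow")
    value = stack[sp]         # (assign <reg> (op vector-ref) (reg stack) (reg sp))
    sp = sp - 1               # (assign sp (op -) (reg sp) (const 1))
    return value


save("saved-env")
save(42)
first = restore()             # last value saved comes back first
second = restore()
```

Note that restore does not erase the old cell; the value simply becomes part of the Empty region once sp moves past it.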
5 Implementation of the Heap
- Conceptually, create cons cells as needed out of the heap
- Remember: such structures reside within the environment...
6 List Structured Memory
- Basic idea: build cons cells out of two vectors, the-cars and the-cdrs
[Figure: a cons cell whose car value is stored in the-cars and whose cdr value is stored in the-cdrs]
- Add two registers (the-cars and the-cdrs) to hold the base addresses of these two vectors
7 Implementing the Pair Abstraction
- CAR
  - (assign <reg> (op car) <pair>)
  - becomes
  - (assign <reg> (op vector-ref) (reg the-cars) <pair>)
- CDR
  - (assign <reg> (op cdr) <pair>)
  - becomes
  - (assign <reg> (op vector-ref) (reg the-cdrs) <pair>)
- where <pair> is a cell offset into the the-cars and the-cdrs vectors
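To make the car and cdr expansions concrete, here is a minimal sketch in which two Python lists stand in for the-cars and the-cdrs. The pre-filled contents, the helper to_python_list, and the use of None for the empty list are assumptions made for illustration.

```python
# Sketch: a pair is just an integer offset into two parallel vectors.
# Assumption: None stands in for the empty list.
the_cars = [6, 7, 4]        # car value of cells 0, 1, 2
the_cdrs = [None, 0, 1]     # cdr value: offset of the next cell, or None


def vector_ref(vector, index):
    return vector[index]


def car(pair):
    # (assign <reg> (op vector-ref) (reg the-cars) <pair>)
    return vector_ref(the_cars, pair)


def cdr(pair):
    # (assign <reg> (op vector-ref) (reg the-cdrs) <pair>)
    return vector_ref(the_cdrs, pair)


def to_python_list(pair):
    """Hypothetical helper: walk the cdr chain and collect the car values."""
    items = []
    while pair is not None:
        items.append(car(pair))
        pair = cdr(pair)
    return items


a = 2                       # the list (4 7 6) starts at cell 2
elements = to_python_list(a)
```

With bare offsets a stored 0 could mean "the number 0" or "cell 0", which is the ambiguity that a tagged representation removes.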
8 Implementing Pair Allocation
- When we create a new pair (cons), we need to allocate a cell from the vector
- The allocation approach interacts with the memory management strategy (as we shall see later)
- Consider a simple approach
[Figure: the-cars and the-cdrs with cells 0 through 11, ...; used cells at the front, free cells after them, and free pointing at the first free cell]
- The free register contains a pointer (offset) to the first free cell in the heap
9 Implementing Pair Allocation
- CONS
  - (assign <regname> (op cons) <val-1> <val-2>)
  - becomes
  - (perform (op vector-set!) (reg the-cars) (reg free) <val-1>)
  - (perform (op vector-set!) (reg the-cdrs) (reg free) <val-2>)
  - (assign <regname> (reg free))
  - (assign free (op +) (reg free) (const 1))
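The four-instruction expansion of cons can be sketched as a bump allocator. The heap size of 12, the use of None for the empty list, and the out-of-memory check are assumptions added for the example; the slide itself shows only the four instructions.

```python
# Sketch: cons as "write two vector slots, return free, bump free".
HEAP_SIZE = 12
the_cars = [None] * HEAP_SIZE
the_cdrs = [None] * HEAP_SIZE
free = 0                      # offset of the first free cell
NIL = None                    # assumption: None stands in for the empty list


def cons(val_1, val_2):
    global free
    if free >= HEAP_SIZE:     # check is an addition, not on the slide
        raise MemoryError("heap exhausted")
    the_cars[free] = val_1    # (perform (op vector-set!) (reg the-cars) (reg free) <val-1>)
    the_cdrs[free] = val_2    # (perform (op vector-set!) (reg the-cdrs) (reg free) <val-2>)
    pair = free               # (assign <regname> (reg free))
    free = free + 1           # (assign free (op +) (reg free) (const 1))
    return pair


# (list 4 7 6) is (cons 4 (cons 7 (cons 6 nil))); the innermost cons runs first.
a = cons(4, cons(7, cons(6, NIL)))
```

Because the innermost cons is evaluated first, the last element of the list lands in cell 0 and the head of the list in cell 2.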
10 Representing Primitive Data
- Represent data using an underlying bit representation
- Ex: a pointer to the cons cell at offset 514
  - 00010000000000000000001000000010
- The leading four bits are a type tag:

  Tag    Type                 Example              Shorthand
  0000   empty list (null)                         E0
  0001   cons cell pointer    pointer to cell 5    P5
  0010   integer              number 3             N3
  0011   boolean
  . . .
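The tagging scheme can be checked against the 32-bit example with a short sketch. The 4-bit tag / 28-bit payload split is inferred from that one example, the function names are hypothetical, and only non-negative payloads are handled.

```python
# Sketch: pack a 4-bit type tag and a 28-bit payload into one 32-bit word.
TAG_BITS = 4
VALUE_BITS = 28               # assumption: inferred from the 32-bit example
TAG_NULL, TAG_PAIR, TAG_INT, TAG_BOOL = 0b0000, 0b0001, 0b0010, 0b0011
SHORTHAND = {TAG_NULL: "E", TAG_PAIR: "P", TAG_INT: "N"}  # letters used on the slide


def encode(tag, value):
    """Build a tagged word; value must fit in the payload bits."""
    if not 0 <= value < (1 << VALUE_BITS):
        raise ValueError("payload does not fit in 28 bits")
    return (tag << VALUE_BITS) | value


def decode(word):
    """Split a tagged word back into (tag, value)."""
    return word >> VALUE_BITS, word & ((1 << VALUE_BITS) - 1)


def shorthand(word):
    """Render a word in the slide's E0 / P5 / N3 notation."""
    tag, value = decode(word)
    return SHORTHAND[tag] + str(value)


word = encode(TAG_PAIR, 514)
bits = format(word, "032b")   # 32-character binary string
```

The same word can then be read back either as raw bits or in the compact shorthand used on the following slides.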
11 List Structure and the Heap
- With our shorthand notation, we can now trace out how data in the heap corresponds to our pair abstraction
  (define a (list 4 7 6))
  (cons a a)
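12 Creation of Garbage
- Inaccessible memory cells (or "garbage") are created in typical Scheme programs
  (define b (cons 1 nil))
  (set! b (cons 2 3))
- After the set!, b no longer points to the pair made by (cons 1 nil); if nothing else refers to that cell, it is garbage
[Figure: the heap and the free pointer]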
13 Running Out of Memory
- Consider a small 5 cell memory
  (define c (list 8 4 7 6))
  (set! c (cons (cdddr c) (cddr c)))
[Figure: the-cars and the-cdrs for the five cells, with the free pointer]
- Out of memory... but now cells P2 and P3 are garbage!
- How do we detect that these are garbage?
- How do we reuse those cells?
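The five-cell example can be traced with a sketch that uses the shorthand notation as Python tuples, so ("P", 2) plays the role of P2. The tuple encoding, the helper names, and the MemoryError raised on exhaustion are assumptions for illustration.

```python
# Sketch: a 5-cell heap with a bump-pointer cons, values in E/P/N shorthand.
HEAP_SIZE = 5
E0 = ("E", 0)
the_cars = [None] * HEAP_SIZE
the_cdrs = [None] * HEAP_SIZE
free = 0


def num(n):
    return ("N", n)


def cons(a, d):
    global free
    if free >= HEAP_SIZE:
        raise MemoryError("out of memory")
    the_cars[free], the_cdrs[free] = a, d
    free += 1
    return ("P", free - 1)


def car(pair):
    return the_cars[pair[1]]


def cdr(pair):
    return the_cdrs[pair[1]]


# (define c (list 8 4 7 6)) fills cells 0..3; c is P3.
c = cons(num(8), cons(num(4), cons(num(7), cons(num(6), E0))))
# (set! c (cons (cdddr c) (cddr c))) takes the last cell; c becomes P4.
c = cons(cdr(cdr(cdr(c))), cdr(cdr(c)))
```

After the second line every cell is in use, yet nothing reachable from c refers to cells 2 and 3 any more; one more cons would fail even though two cells hold garbage.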
14 Storage Management
- Alternative 1
  - Force the programmer to worry about both memory allocation and memory de-allocation.
  - E.g. explicit language constructs or procedures to "free" memory (e.g. as in the C language)
  - Gives control to the programmer, but is also fraught with danger: memory "leaks", where memory is consumed but never recovered, are common bugs
- Alternative 2
  - Free the programmer from worrying about memory allocation and de-allocation or recovery
  - An automatic mechanism to discover and reuse memory: "Garbage collection" (e.g. as in Scheme, Java)
15 What is garbage in the heap?
- The only cells that matter are those that could affect a future computation.
- The state of the evaluator is completely specified by the contents of the registers, the stack, and the global environment.
- If we trace out list structure, starting from values in those places, only those cells we reach can affect future computation.
- We can thus treat all other cells as garbage.
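The reachability argument above can be sketched directly: start from the roots, follow every pair pointer, and call everything not visited garbage. The heap contents below are the five-cell example written out by hand, and the single root P4 is an assumption standing in for "registers, stack, and global environment".

```python
# Sketch: compute the set of cells reachable from the roots.
the_cars = [("N", 6), ("N", 7), ("N", 4), ("N", 8), ("P", 0)]
the_cdrs = [("E", 0), ("P", 0), ("P", 1), ("P", 2), ("P", 1)]
roots = [("P", 4)]            # assumption: c is the only root


def reachable_cells(roots):
    """Return the set of cell offsets reachable by tracing list structure."""
    seen = set()
    pending = list(roots)
    while pending:
        tag, cell = pending.pop()
        if tag != "P" or cell in seen:
            continue          # not a pair pointer, or already visited
        seen.add(cell)
        pending.append(the_cars[cell])
        pending.append(the_cdrs[cell])
    return seen


live = reachable_cells(roots)
garbage = sorted(set(range(len(the_cars))) - live)
```

The "already visited" check is what keeps the trace finite when two pointers lead to the same cell, as they do here for cell 0.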
16 Mark/Sweep Garbage Collection
- Basic idea
  - vector of "mark" bits (initially 0's)
  - start at "root" of good cells
  - walk the tree: mark good cells
  - "sweep together" the unmarked cells into a list of free cells
[Figure: the-cars, the-cdrs, and the-marks for cells 0 through 4, alongside a box-and-pointer diagram of c (values 8, 4, 7, 6; cells P0 through P3)]
17 Mark/Sweep Algorithm
- Mark Phase
    (define (mark object)
      (cond ((not (pair? object)) #f)
            ((not (= 1 (vector-ref the-marks object)))
             (vector-set! the-marks object 1)
             (mark (car object))
             (mark (cdr object)))))
- Note: for a pair, "object" is an integer offset denoting the cell location
18 Mark/Sweep Algorithm
- Sweep Phase
    (define (sweep i)
      (cond ((not (= i size))
             (cond ((= 1 (vector-ref the-marks i))
                    (vector-set! the-marks i 0))
                   (else (set-cdr! (int-to-pointer i) free)
                         (set! free (int-to-pointer i))))
             (sweep (+ i 1)))))
    (define (gc)
      (mark root)
      (sweep 0))
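The mark and sweep phases can be run end to end on the five-cell example. This is a sketch under stated assumptions: Python lists stand in for the three vectors, values use the E/P/N shorthand as tuples, the free list is taken to be empty when gc starts, and sweep is written as a loop rather than a recursion.

```python
# Sketch: mark/sweep over the five-cell heap from the earlier example.
E0 = ("E", 0)
the_cars = [("N", 6), ("N", 7), ("N", 4), ("N", 8), ("P", 0)]
the_cdrs = [E0, ("P", 0), ("P", 1), ("P", 2), ("P", 1)]
the_marks = [0] * len(the_cars)
root = ("P", 4)               # c
free = E0                     # assumption: free list is empty when gc starts


def is_pair(value):
    return value[0] == "P"


def mark(value):
    """Mark phase: flag every cell reachable from value."""
    if not is_pair(value):
        return
    cell = value[1]
    if the_marks[cell] != 1:
        the_marks[cell] = 1
        mark(the_cars[cell])
        mark(the_cdrs[cell])


def sweep():
    """Sweep phase: reset marks on live cells, chain dead cells onto free."""
    global free
    for i in range(len(the_marks)):
        if the_marks[i] == 1:
            the_marks[i] = 0
        else:
            the_cdrs[i] = free    # (set-cdr! (int-to-pointer i) free)
            free = ("P", i)       # (set! free (int-to-pointer i))


def gc():
    mark(root)
    sweep()


gc()
```

After gc the free list runs P3, then P2, then the empty list, threaded through the cdrs of the reclaimed cells, while the live cells 0, 1, and 4 are untouched.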
19 Mark/Sweep Algorithm Issues
- Must change cons to work correctly with mark/sweep
  - cons should get the cell pointed to by the free list
  - free-list should be updated to point to the next free cell
  - set the car/cdr part of the cell (as before)
- How do we implement tree recursion?
  - → Need a stack!
- How deep might the stack need to be?
  - → As deep as the number of cells in the heap!
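Both issues above can be illustrated in one sketch: a cons that takes its cell from the free list, and a mark phase that uses an explicit stack and reports how deep that stack grew. The six-cell heap, the initial free list threaded through the cdrs, and the car-nested test structure are all assumptions chosen to make the stack growth visible.

```python
# Sketch: free-list cons plus a mark phase driven by an explicit stack.
SIZE = 6
E0 = ("E", 0)
the_cars = [E0] * SIZE
# Assumption: initially every cell is free, linked cell i -> cell i+1 via the cdrs.
the_cdrs = [("P", i + 1) for i in range(SIZE - 1)] + [E0]
the_marks = [0] * SIZE
free = ("P", 0)


def cons(a, d):
    global free
    if free == E0:
        raise MemoryError("free list is empty; a real system would run gc here")
    cell = free[1]            # take the cell the free list points to
    free = the_cdrs[cell]     # free list now starts at the next free cell
    the_cars[cell] = a        # set car/cdr as before
    the_cdrs[cell] = d
    return ("P", cell)


def mark(root):
    """Mark reachable cells; return the largest size the pending stack reached."""
    pending = [root]
    deepest = 0
    while pending:
        deepest = max(deepest, len(pending))
        tag, cell = pending.pop()
        if tag != "P" or the_marks[cell] == 1:
            continue
        the_marks[cell] = 1
        pending.append(the_cdrs[cell])
        pending.append(the_cars[cell])
    return deepest


# Build a structure nested through the cars: each new cell's car is the previous cell.
x = E0
for _ in range(SIZE):
    x = cons(x, E0)
deepest = mark(x)
```

In this worst-case shape the pending stack grows by one entry per cell visited, so the space needed for marking is proportional to the size of the heap being collected.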
20 Stop & Copy Garbage Collection
[Figure: a five-cell memory (cells 0 through 4) showing the-cars and the-cdrs in shorthand notation, with the root pointer]
- Phase 1: move good cells into new memory (leaving forwarding pointers)
- Phase 2: update pointers in moved cells to reflect new locations
21 Stop & Copy Algorithm
- set free and scan to the beginning of new memory
- move the root pair to new memory
  - adjust root pointer to new location
  - increment free as necessary
  - mark old pairs by putting a forwarding pointer ("broken heart") to new memory
- basic cycle
  - trace pointers in car and cdr in the cell pointed to by scan back to old memory
  - relocate each one: if it is a forwarding pointer, use the forwarding address to update pointers; if it is not a forwarding pointer, copy into the free pair, then increment free, store a forwarding pointer, and update pointers to the new pair
  - increment scan
- stop when scan catches up to free
22 Stop & Copy Algorithm
    (define (gccopy scan)
      (if (not (= scan free))
          (begin
            (set-car! scan (forward (vector-ref the-cars scan)))
            (set-cdr! scan (forward (vector-ref the-cdrs scan)))
            (gccopy (+ scan 1)))))
    (define (forward pointer)
      (cond ((pair? pointer)
             (let ((oldcar (car pointer)))
               (if (forward? oldcar)
                   (int-to-pointer oldcar)
                   (let ((newptr (int-to-pointer free)))
                     (set-car! newptr oldcar)
                     (set-cdr! newptr (cdr pointer))
                     (set! free (+ 1 free))
                     (set-car! pointer (int-to-pointer newptr))
                     newptr))))
            (else pointer)))
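The scan/free cycle can be sketched with two separate memories. Several details here are assumptions rather than the slide's design: old and new memory are separate pairs of Python lists, values use the E/P/N shorthand as tuples, and a hypothetical "F" tag in the old car marks a forwarding pointer.

```python
# Sketch: stop & copy from old memory into new memory.
E0 = ("E", 0)
old_cars = [("N", 6), ("N", 7), ("N", 4), ("N", 8), ("P", 0)]
old_cdrs = [E0, ("P", 0), ("P", 1), ("P", 2), ("P", 1)]
new_cars = [None] * len(old_cars)
new_cdrs = [None] * len(old_cdrs)
free = 0                      # next unused cell in new memory


def forward(value):
    """Return value as a new-memory reference, copying the pair on first visit."""
    global free
    tag, cell = value
    if tag != "P":
        return value          # numbers, empty list: nothing to relocate
    old_car = old_cars[cell]
    if old_car[0] == "F":     # already moved: use the forwarding address
        return ("P", old_car[1])
    new_cell = free
    free += 1
    new_cars[new_cell] = old_car          # contents still point into old memory
    new_cdrs[new_cell] = old_cdrs[cell]
    old_cars[cell] = ("F", new_cell)      # leave a forwarding pointer behind
    return ("P", new_cell)


def stop_and_copy(root):
    """Copy everything reachable from root; return the relocated root."""
    global free
    free = 0
    root = forward(root)
    scan = 0
    while scan < free:        # cells before scan are fixed up, cells after are not
        new_cars[scan] = forward(new_cars[scan])
        new_cdrs[scan] = forward(new_cdrs[scan])
        scan += 1
    return root


root = stop_and_copy(("P", 4))
```

The three live cells end up packed into cells 0 through 2 of new memory, the shared cell is copied only once thanks to the forwarding pointer, and the two garbage cells are simply never copied.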
23 Stop & Copy Issues
- Disadvantage
  - Requires double the memory
- Advantages
  - Compacts: puts connected cells at the "front" of memory
  - More general idea than just managing cons cells
    - E.g. disk file compaction/defragmentation
24 Implications: Language Design & Memory
- List structured memory
  - construct with a "pair" of vector locations
  - language impact: flexible compound data mechanism
- Tagged memory architecture
  - some tags supported in the data bits
  - language impact: dynamic (runtime) types, flexibility
- Garbage collection
  - language impact: free the programmer from memory management, avoid memory leaks
25 May 5, 2000 Recitation Problem (6.001)
- Draw a box and pointer diagram corresponding to the following memory. Note that there may be garbage in the memory, which you can safely ignore. All good memory is accessible through the root pointer.
- Do the mark/sweep and stop & copy garbage collectors function properly when circular list structures are encountered?