Title: 15213 Recitation 7 Greg Reshko
1 15-213 Recitation 7 Greg Reshko
- Office Hours: Wed 2:00-3:00 PM
- March 31st, 2003
2Outline
- Virtual Memory
- Paging
- Page faults
- TLB
- Address translation
- Malloc Lab
- Lots of hints and ideas
3Virtual Memory
- Reasons
- Use RAM as a cache for disk
- Easier memory management
- Protection
- Enable partial swapping
- Share memory efficiently
4Physical memory
[Figure: memory as an array of locations at physical addresses 0, 1, ..., N-1]
5Virtual Memory
[Figure: virtual addresses 0, 1, ..., P-1 are mapped through the page table to physical addresses in memory or to locations on disk]
6Paging Purpose
- Solves two problems
- External memory fragmentation
- Long delay to swap a whole process
- Divide memory more finely
- Page: small logical memory region
- Frame: small physical memory region
- Any page can map to any frame
7Paging Address Mapping
[Figure: a logical address (page, offset) is translated through the page table (entries such as f29, f34) into a physical address (frame, offset); the offset is carried over unchanged]
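The mapping in the figure can be sketched in C. This is only a minimal sketch: the 64-byte page size, the four-entry table, and the names `PAGE_SIZE`, `NUM_PAGES`, `page_table`, and `translate` are assumptions made for illustration and do not come from the slides.

```c
#include <stdint.h>

#define PAGE_SIZE 64u   /* assumed page size in bytes (hypothetical) */
#define NUM_PAGES 4u    /* assumed number of logical pages (hypothetical) */

/* page_table[page] holds the frame number that the page maps to.
 * Any page can map to any frame, so the values are arbitrary. */
static const uint32_t page_table[NUM_PAGES] = { 29, 34, 7, 8 };

/* Translate a logical address into a physical address.
 * The caller must pass an address below NUM_PAGES * PAGE_SIZE. */
static uint32_t translate(uint32_t logical)
{
    uint32_t page   = logical / PAGE_SIZE;  /* which page */
    uint32_t offset = logical % PAGE_SIZE;  /* position inside the page */
    uint32_t frame  = page_table[page];     /* page table lookup */
    return frame * PAGE_SIZE + offset;      /* the offset is unchanged */
}
```

For example, logical address 70 lies in page 1 at offset 6, so with this table it lands in frame 34 at offset 6.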
8Paging Multi-Level
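[Figure: a logical address (P1, P2, offset) is translated in two steps: P1 indexes the page directory to select a page table, P2 indexes that page table to select a frame, and the physical address is (frame, offset)]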
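A two-level lookup like the one in the figure might look as follows in C. The field widths, the table contents, and the names `page_table_t`, `page_directory`, and `translate2` are hypothetical; the sketch only shows the order of the two lookups.

```c
#include <stddef.h>
#include <stdint.h>

#define OFFSET_BITS 6u  /* assumed: 64-byte pages */
#define P2_BITS     2u  /* assumed: 4 entries per page table */
#define P1_BITS     2u  /* assumed: 4 entries in the page directory */

typedef struct {
    uint32_t frame[1u << P2_BITS];  /* frame number for each P2 value */
} page_table_t;

/* Two illustrative page tables with arbitrary frame numbers. */
static const page_table_t table_a = { { 7, 8, 29, 34 } };
static const page_table_t table_b = { { 99, 87, 25, 3 } };

/* Page directory: P1 selects a page table; NULL means no table exists. */
static const page_table_t *const page_directory[1u << P1_BITS] = {
    &table_a, &table_b, NULL, NULL
};

/* Returns 0 and stores the physical address, or -1 if P1 has no table. */
static int translate2(uint32_t logical, uint32_t *physical)
{
    uint32_t offset = logical & ((1u << OFFSET_BITS) - 1u);
    uint32_t p2 = (logical >> OFFSET_BITS) & ((1u << P2_BITS) - 1u);
    uint32_t p1 = (logical >> (OFFSET_BITS + P2_BITS)) & ((1u << P1_BITS) - 1u);

    const page_table_t *pt = page_directory[p1];  /* first-level lookup */
    if (pt == NULL)
        return -1;
    *physical = (pt->frame[p2] << OFFSET_BITS) | offset;  /* second level */
    return 0;
}
```

One reason to split the table this way is that regions with no mappings need only a NULL directory entry rather than a full page table.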
9Page Faults
- Virtual address not in memory
- This means it is on a disk
- Go to disk, fetch the page, load it into memory, get back to the process
[Figure: before and after a page fault: the CPU issues a virtual address, the page table shows that the page is on disk, and the page is then loaded from disk into memory]
10Copy-on-Write
- Simulated Copy
- Copy page table entries to new process
- Mark PTEs read-only in old and new
- What really happens
- Process writes to page
- Page fault handler is called
- Copy page into empty frame
- Mark read-write in both PTEs
- Result
- Faster and less work
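The copy-on-write steps above can be simulated in user-level C. This is a toy model under stated assumptions, not how a real kernel is written: the tiny page size, the frame array, the never-reusing frame allocator, and the names `pte_t`, `process_t`, `cow_fork`, and `cow_write` are all hypothetical.

```c
#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE  8   /* tiny pages keep the toy readable (assumed) */
#define NUM_PAGES  2
#define NUM_FRAMES 8

typedef struct { int frame; bool writable; } pte_t;
typedef struct { pte_t pt[NUM_PAGES]; } process_t;

static char frames[NUM_FRAMES][PAGE_SIZE];
static int next_free_frame = 0;  /* trivial allocator: never reuses frames */

/* Give a fresh process its own writable frames (NUM_PAGES <= NUM_FRAMES). */
static void proc_init(process_t *p)
{
    for (int i = 0; i < NUM_PAGES; i++) {
        p->pt[i].frame = next_free_frame++;
        p->pt[i].writable = true;
    }
}

/* Simulated copy: copy the PTEs and mark them read-only in old and new. */
static void cow_fork(process_t *parent, process_t *child)
{
    for (int i = 0; i < NUM_PAGES; i++) {
        parent->pt[i].writable = false;
        child->pt[i] = parent->pt[i];
    }
}

/* Write one byte; "other" is the process that shares pages with "self".
 * Returns 0 on success, -1 if no empty frame is left for the copy. */
static int cow_write(process_t *self, process_t *other,
                     int page, int off, char value)
{
    pte_t *pte = &self->pt[page];
    if (!pte->writable) {                       /* page fault handler */
        if (other->pt[page].frame == pte->frame) {
            if (next_free_frame >= NUM_FRAMES)
                return -1;
            int fresh = next_free_frame++;      /* copy into empty frame */
            memcpy(frames[fresh], frames[pte->frame], PAGE_SIZE);
            pte->frame = fresh;
        }
        pte->writable = true;                   /* read-write in both PTEs */
        other->pt[page].writable = true;
    }
    frames[pte->frame][off] = value;
    return 0;
}

static char cow_read(const process_t *p, int page, int off)
{
    return frames[p->pt[page].frame][off];
}
```

Only the page that is actually written gets copied; the other page stays shared between the two processes, which is where the saved work comes from.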
11Relevance to Fork
- Why is paging good for fork and exec?
- Fork produces two very similar processes
- Same code, data, and stack
- Copying all pages is expensive
- Many will never be modified (especially in exec)
- Share pages instead
- i.e., just mark them read-only and duplicate a page only when necessary
12Address Translation General Idea
- Mapping between virtual and physical addresses
[Figure: the processor sends a virtual address to the hardware address translation mechanism, part of the on-chip memory management unit (MMU); on a hit it produces a physical address into main memory; on a miss a page fault invokes the fault handler, and the OS transfers the page from secondary memory to main memory]
13Address Translation In terms of address itself
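- Higher bits of the address get mapped from virtual address to physical.
- Lower bits (page offset) stay the same.
[Figure: an n-bit virtual address consists of a virtual page number (bits n-1 to p) and a page offset (bits p-1 to 0); address translation replaces the virtual page number with a physical page number (bits m-1 to p of the m-bit physical address) and leaves the page offset unchanged]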
14TLB
- Translation Lookaside Buffer
- Small hardware cache in MMU
- Maps virtual page numbers to physical page numbers
15Address Translation with TLB
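[Figure: the virtual page number of the virtual address is looked up in the TLB (each entry holds valid, tag, physical page number); on a TLB hit the physical page number is combined with the page offset to form the physical address; the physical address is split into tag, index, and byte offset to look up the cache (each line holds valid, tag, data), which returns the data on a cache hit]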
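A set-associative TLB lookup can be sketched in C as below. The geometry (4 sets of 4 ways) and the names `tlb_entry_t`, `tlb_insert`, and `tlb_lookup` are assumptions for illustration; a real TLB does this comparison in hardware, in parallel across the ways.

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_SETS 4u  /* assumed geometry: 4 sets x 4 ways = 16 entries */
#define TLB_WAYS 4u

typedef struct {
    bool     valid;
    uint32_t tag;   /* high bits of the virtual page number */
    uint32_t ppn;   /* physical page number */
} tlb_entry_t;

static tlb_entry_t tlb[TLB_SETS][TLB_WAYS];  /* starts with all entries invalid */

/* Install a translation in the given way of the set chosen by the VPN. */
static void tlb_insert(uint32_t vpn, uint32_t ppn, unsigned way)
{
    tlb_entry_t *e = &tlb[vpn % TLB_SETS][way % TLB_WAYS];
    e->valid = true;
    e->tag = vpn / TLB_SETS;
    e->ppn = ppn;
}

/* Returns true on a TLB hit and stores the physical page number. */
static bool tlb_lookup(uint32_t vpn, uint32_t *ppn)
{
    uint32_t index = vpn % TLB_SETS;  /* low bits select the set */
    uint32_t tag   = vpn / TLB_SETS;  /* remaining bits are the tag */
    for (unsigned w = 0; w < TLB_WAYS; w++) {
        const tlb_entry_t *e = &tlb[index][w];
        if (e->valid && e->tag == tag) {
            *ppn = e->ppn;
            return true;
        }
    }
    return false;  /* miss: the page table must be consulted */
}
```

On a miss the translation has to come from the page table in memory, which is much slower than the TLB.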
16Example
- Motivation
- A detailed example of end-to-end address translation
- Same as in the book and lecture
- I just want to make sure it makes perfect sense
- Do practice problems at home
- Ask questions if anything is unclear
17Example Description
- Memory is byte addressable
- Accesses are to 1-byte words
- Virtual addresses are 14 bits
- Physical addresses are 12 bits
- Page size is 64 bytes
- TLB is 4-way set associative with 16 total entries
- L1 d-cache is physically addressed and direct mapped, with 4-byte line size and 16 total sets
18Example Addresses
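- 14-bit virtual addresses
- 12-bit physical addresses
- Page size 64 bytes, so the page offset is 6 bits
- Virtual address: VPN (Virtual Page Number) is bits 13-6, VPO (Virtual Page Offset) is bits 5-0
- Physical address: PPN (Physical Page Number) is bits 11-6, PPO (Physical Page Offset) is bits 5-0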
19Example Page Table
20Example TLB
- 16 entries
- 4-way associative
21Example Cache
- 16 lines
- 4-byte line size
- Direct mapped
22Example Address Translation
- Virtual Address 0x03D4
- Split into offset and page number
- 0x03D4 = 00001111010100 (binary)
- VPO = 010100 = 0x14
- VPN = 00001111 = 0x0F
- Let's see if this is in the TLB
- Split the VPN 00001111 into TLB index and tag
- TLBI = 11 = 0x03
- TLBT = 000011 = 0x03
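The field extraction on this slide can be checked with a few shifts and masks. The sketch below hard-codes the example's parameters (14-bit virtual addresses, 64-byte pages, 4 TLB sets); the names `va_fields_t` and `split_va` are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t vpn;   /* virtual page number */
    uint32_t vpo;   /* virtual page offset */
    uint32_t tlbi;  /* TLB index */
    uint32_t tlbt;  /* TLB tag */
} va_fields_t;

/* Split a 14-bit virtual address using the example's parameters. */
static va_fields_t split_va(uint32_t va)
{
    va_fields_t f;
    f.vpo  = va & 0x3F;         /* low 6 bits: 64-byte pages */
    f.vpn  = (va >> 6) & 0xFF;  /* remaining 8 bits of the address */
    f.tlbi = f.vpn & 0x3;       /* low 2 bits of the VPN: 4 TLB sets */
    f.tlbt = f.vpn >> 2;        /* remaining 6 bits of the VPN */
    return f;
}
```

Calling `split_va(0x03D4)` yields VPO 0x14, VPN 0x0F, TLBI 0x3, and TLBT 0x03, matching the values worked out by hand.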
23Example TLB
- 16 entries
- 4-way associative
24Example Address Translation
- Virtual Address 0x03D4
- TLB lookup
- This address is in the TLB (second entry, set 0x3)
- PPN = 0x0D = 001101
- PPO = VPO = 0x14 = 010100
- PA = PPN followed by PPO = 001101010100
- Cache
- PA = 0x354 = 001101010100 (binary)
- CT = 001101 = 0x0D
- CI = 0101 = 0x05
- CO = 00 = 0x0
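Building the physical address and splitting it into cache fields follows the same pattern. This sketch assumes the example's parameters (6-bit page offset, 16 cache sets, 4-byte lines); `make_pa`, `split_pa`, and `pa_fields_t` are illustrative names, not from the slides.

```c
#include <stdint.h>

typedef struct {
    uint32_t ct;  /* cache tag */
    uint32_t ci;  /* cache index */
    uint32_t co;  /* cache (byte) offset */
} pa_fields_t;

/* Concatenate the physical page number with the unchanged page offset. */
static uint32_t make_pa(uint32_t ppn, uint32_t vpo)
{
    return (ppn << 6) | (vpo & 0x3F);
}

/* Split a 12-bit physical address into the example's cache fields. */
static pa_fields_t split_pa(uint32_t pa)
{
    pa_fields_t f;
    f.co = pa & 0x3;          /* low 2 bits: 4-byte lines */
    f.ci = (pa >> 2) & 0xF;   /* next 4 bits: 16 sets */
    f.ct = (pa >> 6) & 0x3F;  /* remaining 6 bits: tag */
    return f;
}
```

With PPN 0x0D and VPO 0x14, `make_pa` returns 0x354, and `split_pa(0x354)` gives CT 0x0D, CI 0x5, and CO 0x0.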
25Example Cache
- 16 lines
- 4-byte line size
- Direct mapped
26Example Address Translation
- Virtual Address 0x03D4
- Cache Hit
- Tag in set 0x5 matches CT
- Data at offset CO is 0x36
- Data returned to MMU
- Data returned to CPU
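The cache hit can be modelled with a small direct-mapped lookup. Only the one line the slides describe (set 0x5, tag 0x0D, byte 0 equal to 0x36) is filled in; the remaining bytes and all other lines are left zeroed as placeholders, and the names `cache_line_t`, `load_example_line`, and `cache_read` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_SETS  16u  /* direct mapped: one line per set */
#define LINE_SIZE 4u   /* bytes per line */

typedef struct {
    bool    valid;
    uint8_t tag;
    uint8_t data[LINE_SIZE];
} cache_line_t;

static cache_line_t cache[NUM_SETS];  /* starts with all lines invalid */

/* Install the single line known from the example. Bytes 1..3 of the
 * line are not given here, so they stay 0 as placeholders. */
static void load_example_line(void)
{
    cache[0x5].valid = true;
    cache[0x5].tag = 0x0D;
    cache[0x5].data[0] = 0x36;
}

/* Returns true on a cache hit and stores the byte at the address. */
static bool cache_read(uint32_t pa, uint8_t *byte)
{
    uint32_t co = pa & 0x3;          /* byte offset within the line */
    uint32_t ci = (pa >> 2) & 0xF;   /* set index */
    uint32_t ct = (pa >> 6) & 0x3F;  /* tag */
    const cache_line_t *line = &cache[ci];
    if (!line->valid || line->tag != ct)
        return false;                /* miss */
    *byte = line->data[co];
    return true;
}
```

After `load_example_line()`, reading physical address 0x354 hits in set 0x5 and returns 0x36.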
27Lab 6 Hints and Ideas
- Due April 16
- 40 points for performance
- 20 points for correctness
- 5 points for style
- Get the correctness points this week
- Get a feel for how hard the lab is
- You'll probably need the time
- Starting a couple days before is a BAD idea!
28How to get the correctness points
- We provide mm-helper.c which contains the code from the book
- malloc works
- free works (with coalescing)
- Heap checking doesn't work
- realloc doesn't work
- Implement a dumb version of realloc
- malloc new block, memcpy, free old block, return new block
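The dumb realloc strategy can be sketched as follows. To keep the example self-contained, `toy_malloc`, `toy_free`, and `toy_size` are hypothetical stand-ins built on the C library allocator; in the lab you would call your own malloc and free and read the old size from the block header instead.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define HDR_SIZE 16  /* hypothetical header; keeps the payload aligned */

/* Stand-in allocator: remembers the payload size in a small header. */
static void *toy_malloc(size_t size)
{
    char *base = malloc(HDR_SIZE + size);
    if (base == NULL)
        return NULL;
    memcpy(base, &size, sizeof size);
    return base + HDR_SIZE;
}

static void toy_free(void *ptr)
{
    if (ptr != NULL)
        free((char *)ptr - HDR_SIZE);
}

static size_t toy_size(const void *ptr)
{
    size_t size;
    memcpy(&size, (const char *)ptr - HDR_SIZE, sizeof size);
    return size;
}

/* Dumb realloc: malloc new block, memcpy, free old block, return new. */
static void *dumb_realloc(void *ptr, size_t size)
{
    if (ptr == NULL)
        return toy_malloc(size);
    if (size == 0) {
        toy_free(ptr);
        return NULL;
    }
    void *newp = toy_malloc(size);
    if (newp == NULL)
        return NULL;                 /* the old block is left untouched */
    size_t old = toy_size(ptr);
    memcpy(newp, ptr, old < size ? old : size);  /* copy the smaller size */
    toy_free(ptr);
    return newp;
}
```

Copying only the smaller of the old and new sizes matters when the block shrinks; copying the full old size would write past the end of the new block.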
29How to get the correctness points
- Implement heap checking
- Have to add a request id field to each allocated block (tricky)
- Hint: need padding to maintain 8-byte alignment of the user pointer
- In the book's code, bp is always the same as the user pointer
- The 4 bytes immediately before bp contain the size of the payload
- The 3 least significant bits of size are unused (because of alignment)
- The lowest bit indicates whether the block is allocated or not
[Figure: block layout: Size|a header, Payload, Footer; bp points to the start of the Payload]
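Packing the allocated flag into the low bits of the size word can be sketched like this. The helper names `pack`, `get_size`, and `get_alloc` are illustrative; the sketch assumes sizes are multiples of 8, so the 3 low bits of the size are always zero.

```c
#include <stdint.h>

/* Combine a size (a multiple of 8) with the allocated bit. */
static uint32_t pack(uint32_t size, uint32_t alloc)
{
    return size | (alloc & 0x1u);
}

/* Recover the size by clearing the 3 low bits. */
static uint32_t get_size(uint32_t word)
{
    return word & ~0x7u;
}

/* Recover the allocated flag from the lowest bit. */
static uint32_t get_alloc(uint32_t word)
{
    return word & 0x1u;
}
```

For example, `pack(24, 1)` produces the word 25, from which `get_size` recovers 24 and `get_alloc` recovers 1.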
30How to get the correctness points
- Need to change block layout to look like this
[Figure: block layout with an added ID field alongside Size|a, Payload, and Footer; bp marks the payload]
- This changes how the implicit list has to be traversed
- But size is at the same place relative to bp
31How to get the correctness points
- Or change block layout to look like this
[Figure: alternative block layout in which ID occupies the word immediately before bp, alongside Size|a, Payload, and Footer]
- All accesses to what was size now access id, but can be clever and make size 4 bytes larger
- Could even make bp point to id
- Most code would just work
32How to get the correctness points
- Once malloc, free, and realloc work with the id field, write heapcheck
- Iterate over the whole heap and print out allocated blocks
- Need to read the id field
- That's it for correctness
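A heap checker of this kind can be sketched on a toy heap. Everything here is an assumption made for illustration: the heap is a static word array, each block is laid out as id word, size/alloc word, payload, footer, a zero size marks the end of the heap, and the names `put_block` and `heapcheck` are hypothetical. The lab's real heap and block layout will differ.

```c
#include <stdint.h>
#include <stdio.h>

#define HEAP_WORDS 64

static uint32_t heap[HEAP_WORDS];  /* toy heap, zero-initialized */
static unsigned heap_top = 0;      /* index of the next unused word */

/* Append a block laid out as [id][size|alloc][payload...][footer].
 * block_bytes counts the whole block and must be a multiple of 8. */
static int put_block(uint32_t id, uint32_t block_bytes, uint32_t alloc)
{
    unsigned words = block_bytes / 4;
    if (block_bytes < 16 || block_bytes % 8 != 0 ||
        heap_top + words + 2 > HEAP_WORDS)
        return -1;
    heap[heap_top] = id;
    heap[heap_top + 1] = block_bytes | (alloc & 0x1u);          /* header */
    heap[heap_top + words - 1] = block_bytes | (alloc & 0x1u);  /* footer */
    heap_top += words;
    return 0;
}

/* Walk the implicit list, print each allocated block, return the count. */
static int heapcheck(void)
{
    int allocated = 0;
    unsigned i = 0;
    while (i + 1 < HEAP_WORDS) {
        uint32_t header = heap[i + 1];
        uint32_t size = header & ~0x7u;
        if (size == 0)
            break;                      /* end-of-heap marker */
        if (header & 0x1u) {
            printf("id %u: %u bytes\n", (unsigned)heap[i], (unsigned)size);
            allocated++;
        }
        i += size / 4;                  /* step to the next block */
    }
    return allocated;
}
```

Free blocks are skipped using the same size field, so the walk visits every block exactly once.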
33Hints
- Remember that pointer arithmetic behaves differently depending on type of pointer
- Consider using structs/unions to eliminate some messy pointer code
- Get things working with the short trace file first: ./mdriver -f short1-bal.rep
- To get the best performance
- Red-Black trees
- Ternary trees
- Other interesting data structures
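The first two hints can be illustrated with a short C sketch. The struct `block_header_t` and the helper names are hypothetical; the point is only that adding 1 to a pointer advances it by the size of the pointed-to type, and that a struct can replace hand-written offset arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes advanced when n is added to a char pointer: n * 1. */
static size_t bytes_advanced_as_char(void *base, size_t n)
{
    return (size_t)(((char *)base + n) - (char *)base);
}

/* Bytes advanced when n is added to a uint32_t pointer: n * 4. */
static size_t bytes_advanced_as_u32(void *base, size_t n)
{
    return (size_t)((char *)((uint32_t *)base + n) - (char *)base);
}

/* Hypothetical header struct: named fields instead of raw offsets. */
typedef struct {
    uint32_t id;
    uint32_t size_alloc;
} block_header_t;

/* Subtracting 1 from a struct pointer steps back one whole header. */
static block_header_t *header_of(void *bp)
{
    return (block_header_t *)bp - 1;
}
```

With a struct, `header_of(bp)->id` reads the id field directly, where macro-based code would need a cast and a manual byte offset.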
34That's it for hints