Title: Lecture 9-1 Virtual Memory
1. Lecture 9-1: Virtual Memory
- Original note by Prof. Mike Schulte
- Presented by Pradondet Nilagupta
- Spring 2001
2. Virtual Memory
- Virtual memory (VM) allows main memory (DRAM) to act like a cache for secondary storage (magnetic disk).
- VM address translation provides a mapping from the virtual address of the processor to the physical address in main memory or on disk.
- VM provides the following benefits:
  - Allows multiple programs to share the same physical memory
  - Allows programmers to write code as though they have a very large amount of main memory
  - Automatically handles bringing in data from disk
- Cache terms vs. VM terms:
  - Cache block -> page or segment
  - Cache miss -> page fault or address fault
3. Virtual Memory Basics
- Programs reference virtual addresses in a non-existent memory
  - These are then translated into real physical addresses
  - The virtual address space may be bigger than the physical address space
- Divide physical memory into blocks, called pages
  - Anywhere from 512 bytes to 16 MB (4 KB typical)
- Virtual-to-physical translation by indexed table lookup
  - Add another cache for recent translations (the TLB)
- Invisible to the programmer
  - Looks to your application like you have a lot of memory!
  - Anyone remember overlays?
4. VM Page Mapping
[Figure: pages of Process 1's and Process 2's virtual address spaces map to page frames in physical memory or to locations on disk.]
5. VM Address Translation
[Figure: a 32-bit virtual address is split into a 20-bit virtual page number and a 12-bit page offset (log2 of the page size). The virtual page number indexes a per-process page table, located by a page table base register; each entry holds a valid bit, protection bits, a dirty bit, a reference bit, and a physical page number. The physical page number, concatenated with the unchanged page offset, is sent to physical memory.]
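A minimal C sketch of the translation in the figure, assuming a 32-bit virtual address with 4 KB pages; the pte_t fields mirror the bits named in the figure, but the type and function names are illustrative, not from the slides.

```c
#include <stdint.h>

#define OFFSET_BITS 12                       /* log2(4 KB page size) */
#define PAGE_OFFSET_MASK ((1u << OFFSET_BITS) - 1)

/* Illustrative page-table entry with the fields named in the figure. */
typedef struct {
    uint32_t valid : 1;       /* page is mapped in physical memory */
    uint32_t dirty : 1;       /* page modified since it was brought in */
    uint32_t referenced : 1;  /* set on each access; used for replacement */
    uint32_t prot  : 3;       /* protection (e.g., read/write/execute) bits */
    uint32_t ppn   : 20;      /* physical page number */
} pte_t;

/* Translate a virtual address: returns 0 and sets *pa on a hit,
   -1 on a page fault (valid bit clear, so the OS must bring the
   page in from disk). */
int translate(const pte_t *page_table, uint32_t va, uint32_t *pa)
{
    uint32_t vpn    = va >> OFFSET_BITS;      /* virtual page number */
    uint32_t offset = va & PAGE_OFFSET_MASK;  /* unchanged by translation */
    pte_t pte = page_table[vpn];              /* indexed table lookup */

    if (!pte.valid)
        return -1;                            /* page fault */
    *pa = ((uint32_t)pte.ppn << OFFSET_BITS) | offset;
    return 0;
}
```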
6. Typical Page Parameters

Parameter                          | Value
-----------------------------------|----------------------
Page size                          | 4 KB - 64 KB
L1 cache hit time                  | 1-2 clock cycles
Virtual hit (mapped to DRAM)       | 50-400 clock cycles
Miss penalty (all the way to disk) | 700K-6M clock cycles
  Disk access time                 | 500K-4M clock cycles
  Page transfer time               | 200K-2M clock cycles
Page fault rate                    | 0.00001-0.001
Main memory size                   | 4 MB - 4 GB

- It's a lot like what happens in a cache
- But everything (except miss rate) is a LOT worse
7. Paging vs. Segmentation
- Pages are fixed-size blocks
- Segments vary from 1 byte to 2^32 bytes (for 32-bit addresses)

Aspect              | Page                                          | Segment
--------------------|-----------------------------------------------|---------------------------------------------------------------
Words per address   | One (contains page number and offset)         | Two (possibly large maximum size, so separate segment and offset words are needed)
Programmer visible? | No                                            | Sometimes
Replacement         | Trivial, because all blocks are the same size | Hard; must find contiguous space, may need garbage collection
Memory efficiency   | Internal fragmentation                        | External fragmentation
Disk efficiency     | Yes; adjust page size to balance access and transfer time | Not always; segment size varies
8. Cache and VM Parameters
- How is virtual memory different from caches?
  - Software controls replacement - why?
  - The size of virtual memory is determined by the size of the processor address
  - Disk is also used to store the file system - nonvolatile
9. Paged and Segmented VM (Figure 5.38, pg. 442)
- Virtual memories can be categorized into two main classes:
  - Paged memory: fixed-size blocks
  - Segmented memory: variable-size blocks
10. Paged vs. Segmented VM
- Paged memory
  - Fixed-size blocks (4 KB to 64 KB)
  - One word per address (page number + page offset)
  - Easy to replace pages (all the same size)
  - Internal fragmentation (not all of a page is used)
  - Efficient disk traffic (optimized for the page size)
- Segmented memory
  - Variable-size blocks (up to 64 KB or 4 GB)
  - Two words per address (segment + offset)
  - Difficult to replace segments (must find where a segment fits)
  - External fragmentation (unused portions of memory)
  - Inefficient disk traffic (may have small or large transfers)
- Hybrid approaches
  - Paged segments: segments are a multiple of a page size
  - Multiple page sizes (e.g., 8 KB, 64 KB, 512 KB, 4096 KB)
11. Pages are Cached in a Virtual Memory System
- Can ask the same four questions we did about caches
- Q1: Block placement
  - Choice: a lower miss rate with complex placement, or vice versa
  - The miss penalty is huge, so choose a low miss rate -> place a page anywhere in physical memory
  - Similar to a fully associative cache model
- Q2: Block addressing - use an additional data structure
  - Fixed-size pages - use a page table
  - Virtual page number -> physical page number, then concatenate the offset
  - A tag bit indicates presence in main memory
12. Normal Page Tables
- Size is the number of virtual pages
- Purpose is to hold the translation of VPN to PPN
  - Permits ease of page relocation
  - Make sure to keep tags to indicate that a page is mapped
- Potential problem:
  - Consider a 32-bit virtual address and 4 KB pages
  - 4 GB / 4 KB = 2^20 entries (1M words) required just for the page table!
  - Might have to page in the page table
- Consider how the problem gets worse on 64-bit machines with even larger virtual address spaces!
  - The Alpha has a 43-bit virtual address with 8 KB pages
  - Might have multi-level page tables (see the sketch below)
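A minimal C sketch of a two-level page table, assuming a 32-bit virtual address split into two 10-bit indices and a 12-bit offset; the types and names (page_dir_t, translate2) are illustrative, not from the slides or the Alpha.

```c
#include <stdint.h>
#include <stddef.h>

/* Two-level walk: the top 10 bits index a page directory, the next
   10 bits index a second-level table. Only second-level tables that
   are actually in use need to be allocated, which shrinks the space
   cost for sparse address spaces. */
#define OFFSET_BITS 12
#define LEVEL_BITS  10
#define LEVEL_MASK  ((1u << LEVEL_BITS) - 1)

typedef struct {
    uint32_t valid : 1;
    uint32_t ppn   : 20;
} pte_t;

typedef struct {
    pte_t *tables[1 << LEVEL_BITS];  /* NULL if that region is unmapped */
} page_dir_t;

int translate2(const page_dir_t *dir, uint32_t va, uint32_t *pa)
{
    uint32_t idx1 = (va >> (OFFSET_BITS + LEVEL_BITS)) & LEVEL_MASK;
    uint32_t idx2 = (va >> OFFSET_BITS) & LEVEL_MASK;

    const pte_t *table = dir->tables[idx1];
    if (table == NULL || !table[idx2].valid)
        return -1;                   /* page fault */
    *pa = ((uint32_t)table[idx2].ppn << OFFSET_BITS)
        | (va & ((1u << OFFSET_BITS) - 1));
    return 0;
}
```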
13. Inverted Page Tables
- Similar to a set-associative mechanism
- Make the page table reflect the number of physical pages (not virtual pages)
- Use a hash mechanism (see the sketch after the figure below):
  - virtual page number -> hashed page number, an index into the inverted page table
  - Compare the virtual page number with the tag to make sure it is the one you want
  - If yes: check that the page is in memory - OK if yes, page fault if not
  - If not: miss
    - Go to the full page table on disk to get the new entry
    - This implies 2 disk accesses in the worst case
  - Trades an increased worst-case penalty for a decrease in capacity-induced miss rate, since there is now more room for real pages with the smaller page table
14. Inverted Page Table
[Figure: the page number is hashed to index the inverted page table, which only stores entries for pages in physical memory. Each entry holds a valid bit (V), a page tag, and a frame number; if the tag matches and the entry is valid (OK), the frame number is combined with the page offset.]
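A minimal C sketch of the lookup in the figure, assuming one entry per physical frame and an illustrative hash function; real inverted page tables also need collision handling (e.g., chaining), which is omitted here for brevity.

```c
#include <stdint.h>

#define NFRAMES 4096          /* illustrative: 4096 physical frames */
#define OFFSET_BITS 13        /* e.g., 8 KB pages */

/* One entry per physical page frame, tagged with the full VPN. */
typedef struct {
    uint32_t valid : 1;
    uint64_t vpn;             /* tag: which virtual page this frame holds */
} ipte_t;

static uint32_t hash_vpn(uint64_t vpn)
{
    return (uint32_t)((vpn ^ (vpn >> 17)) % NFRAMES);  /* illustrative hash */
}

/* Returns the frame number on a hit, or -1 on a miss (go to the full
   page table on disk to get the entry). */
int ipt_lookup(const ipte_t ipt[NFRAMES], uint64_t va)
{
    uint64_t vpn = va >> OFFSET_BITS;
    uint32_t idx = hash_vpn(vpn);

    if (ipt[idx].valid && ipt[idx].vpn == vpn)
        return (int)idx;      /* frame number + offset -> physical address */
    return -1;
}
```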
15. Address Translation Reality
- The translation process using page tables takes too long!
- Use a cache to hold recent translations (see the sketch below):
  - The Translation Lookaside Buffer (TLB)
  - Typically 8-1024 entries
  - Block size same as a page table entry (1 or 2 words)
  - Only holds translations for pages in memory
  - 1-cycle hit time
  - Highly or fully associative
  - Miss rate < 1%
  - A miss goes to main memory (where the whole page table lives)
  - Must be purged on a process switch
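A minimal C sketch of a fully associative TLB with the behavior described above (a hit returns the translation, a miss falls back to the page table, and the whole buffer is purged on a process switch); the sizes and names are illustrative.

```c
#include <stdint.h>

#define TLB_ENTRIES 32
#define OFFSET_BITS 12

typedef struct {
    uint32_t valid : 1;
    uint32_t vpn;             /* tag */
    uint32_t ppn;
} tlb_entry_t;

static tlb_entry_t tlb[TLB_ENTRIES];

/* Returns 0 and sets *pa on a TLB hit; -1 means walk the page table
   in main memory and then refill the TLB. */
int tlb_lookup(uint32_t va, uint32_t *pa)
{
    uint32_t vpn = va >> OFFSET_BITS;
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {   /* every tag compared */
            *pa = (tlb[i].ppn << OFFSET_BITS)
                | (va & ((1u << OFFSET_BITS) - 1));
            return 0;
        }
    }
    return -1;                /* TLB miss */
}

/* Purge on a process switch: translations belong to one address space. */
void tlb_flush(void)
{
    for (int i = 0; i < TLB_ENTRIES; i++)
        tlb[i].valid = 0;
}
```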
16. Back to the 4 Questions
- Q3: Block replacement (pages in physical memory)
  - LRU is best, so use it to minimize the horrible miss penalty
  - However, real LRU is expensive, so it is approximated (see the sketch below):
    - The page table contains a use tag
    - On an access, the use tag is set
    - The OS checks the use tags every so often, records what it sees, and resets them all
    - On a miss, the OS decides which page has been used the least
  - Basic strategy: the miss penalty is so huge that you can spend a few OS cycles to help reduce the miss rate
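A minimal C sketch of the use-bit approximation above: the OS periodically shifts each page's use bit into an age counter and resets the bits, then evicts the page with the smallest age. The aging counter is one common way to "record what it sees"; the slides do not prescribe a specific bookkeeping scheme.

```c
#include <stdint.h>

#define NPAGES 1024

static uint8_t ref_bit[NPAGES];   /* set to 1 on each access to the page */
static uint8_t age[NPAGES];       /* history of recent reference bits */

void os_age_pages(void)           /* run every so often by the OS */
{
    for (int p = 0; p < NPAGES; p++) {
        /* shift the newest use bit into the high end of the history */
        age[p] = (uint8_t)((age[p] >> 1) | (ref_bit[p] << 7));
        ref_bit[p] = 0;           /* reset all use bits */
    }
}

int pick_victim(void)             /* on a miss with physical memory full */
{
    int victim = 0;
    for (int p = 1; p < NPAGES; p++)
        if (age[p] < age[victim])
            victim = p;           /* approximately least recently used */
    return victim;
}
```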
17. Last Question
- Q4: Write policy
  - Always write-back, due to the access time of the disk
  - So you need to keep tags to show when pages are dirty, and write them back to disk when they're swapped out (see the sketch below)
  - Anything else is pretty silly
  - Remember, the disk is SLOW!
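A minimal C sketch of the write-back policy above: a store only sets the page's dirty tag, and the disk is touched only when a dirty page is evicted. write_page_to_disk() is a hypothetical helper, shown as a comment.

```c
#include <stdbool.h>

#define NPAGES 1024

static bool dirty[NPAGES];

void on_store(int page)           /* every write just marks the page dirty */
{
    dirty[page] = true;
}

void on_evict(int page)
{
    if (dirty[page]) {
        /* write_page_to_disk(page);  only now do we pay the disk cost */
        dirty[page] = false;
    }
    /* a clean page is dropped for free: the disk copy is up to date */
}
```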
18. Page Sizes
- An architectural choice
- Large pages are good:
  - Reduce page table size
  - Amortize the long disk access
  - If spatial locality is good, then the hit rate will improve
- Large pages are bad:
  - More internal fragmentation
    - If everything is random, each structure's last page is only half full
    - Half of bigger is still bigger
    - If there are 3 structures per process (text, heap, and control stack), then 1.5 pages are wasted per process (see the arithmetic below)
  - Process start-up time takes longer
    - At least 1 page of each type is required prior to start
    - The transfer-time penalty aspect is higher
19. More on TLBs
- The TLB must be on chip
  - Otherwise it is worthless
- Small TLBs are worthless anyway
- Large TLBs are expensive
  - High associativity is likely
  - -> The price of CPUs is going up!
  - OK as long as performance goes up faster
20. Address Translation with Page Table (Figure 5.40, pg. 444)
- A page table translates a virtual page number into a physical page number
  - The page offset remains unchanged
- Page tables are large:
  - 32-bit virtual address, 4 KB page size
  - 2^20 4-byte table entries = 4 MB
- Page tables are stored in main memory -> slow
  - Cache table entries in a translation buffer
21. Fast Address Translation with Translation Buffer (TB) (Figure 5.41, pg. 446)
- Cache translated addresses in the TB
- Alpha 21064 data TB:
  - 32 entries, fully associative
  - 30-bit tag
  - 21-bit physical address
  - Valid and read/write bits
  - Separate TB for instructions
- Steps in translation:
  - Compare the page number to the tags
  - Check for a memory access violation
  - Send the physical page number of the matching tag
  - Combine the physical page number and the page offset
22. Selecting a Page Size
- Reasons for a larger page size:
  - Page table size is inversely proportional to the page size, so memory is saved
  - A fast cache hit time is easy when cache size <= page size (virtually addressed caches); a bigger page makes this feasible as cache size grows
  - Transferring larger pages to or from secondary storage, possibly over a network, is more efficient
  - The number of TLB entries is restricted by clock cycle time, so a larger page size maps more memory, thereby reducing TLB misses
- Reasons for a smaller page size:
  - Avoid internal fragmentation - don't waste storage (data must be contiguous within a page)
  - Quicker process start-up for small processes - don't need to bring in more memory than needed
23. Memory Protection
- With multiprogramming, a computer is shared by several programs or processes running concurrently
  - Need to provide protection
  - Need to allow sharing
- Mechanisms for providing protection:
  - Provide base and bound registers: Base <= Address <= Bound (see the sketch below)
  - Provide both user and supervisor (operating system) modes
  - Provide CPU state that the user can read, but cannot write
    - Base and bound registers, user/supervisor bit, exception bits
  - Provide a method to go from user to supervisor mode and vice versa
    - System call: user to supervisor
    - System return: supervisor to user
  - Provide permissions for each page or segment in memory
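A minimal C sketch of the base-and-bound check above; the register values and names are illustrative.

```c
#include <stdint.h>
#include <stdbool.h>

/* An access is legal only if base <= address <= bound. */
typedef struct {
    uint32_t base;    /* lowest legal address for this process  */
    uint32_t bound;   /* highest legal address for this process */
} prot_regs_t;

bool access_ok(const prot_regs_t *r, uint32_t address)
{
    return r->base <= address && address <= r->bound;
    /* on a violation the hardware would raise an exception and
       trap to supervisor mode */
}
```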
24. Alpha VM Mapping (Figure 5.43, pg. 451)
- 64-bit address divided into 3 segments (see the sketch below):
  - seg0 (bit 63 = 0): user code
  - seg1 (bit 63 = 1, bit 62 = 1): user stack
  - kseg (bit 63 = 1, bit 62 = 0): kernel segment for the OS
- Three-level page table, each level one page
  - Reduces page table size
  - Increases translation time
- PTE bits: valid, kernel/user, read/write enable
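A minimal C sketch of the segment selection above, keyed off address bits 63 and 62; the enum and function names are illustrative, not Alpha terminology.

```c
#include <stdint.h>

typedef enum { SEG0_USER_CODE, SEG1_USER_STACK, KSEG_KERNEL } alpha_seg_t;

alpha_seg_t classify(uint64_t va)
{
    unsigned bit63 = (unsigned)(va >> 63) & 1;
    unsigned bit62 = (unsigned)(va >> 62) & 1;

    if (bit63 == 0)
        return SEG0_USER_CODE;            /* seg0: bit 63 = 0 */
    return bit62 ? SEG1_USER_STACK        /* seg1: bits 63,62 = 1,1 */
                 : KSEG_KERNEL;           /* kseg: bits 63,62 = 1,0 */
}
```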
25. Alpha 21064 Memory Hierarchy
- The Alpha 21064 memory hierarchy includes:
  - A 32-entry, fully associative data TB
  - A 12-entry, fully associative instruction TB
  - An 8 KB direct-mapped, physically addressed data cache
  - An 8 KB direct-mapped, physically addressed instruction cache
  - A 4-entry by 64-bit instruction prefetch stream buffer
  - A 4-entry by 256-bit write buffer
  - A 2 MB direct-mapped second-level unified cache
- The virtual memory:
  - Maps a 43-bit virtual address to a 34-bit physical address
  - Has a page size of 8 KB
26. Alpha Memory Performance: Miss Rates
[Figure: miss rates of the 8 KB instruction cache, the 8 KB data cache, and the 2 MB unified second-level cache.]
27. Alpha CPI Components
- The largest increases in CPI are due to:
  - I-stall: instruction stalls from branch mispredictions
  - Other: data hazards and structural hazards
28. Pitfall: Address Space Too Small
- One of the biggest mistakes that can be made when designing an architecture is to devote too few bits to the address
  - The address size limits the size of virtual memory
  - It is difficult to change, since many components depend on it (e.g., PC, registers, effective-address calculations)
- As program size increases, larger and larger address sizes are needed:
  - 8 bits: Intel 8080 (1975)
  - 16 bits: Intel 8086 (1978)
  - 24 bits: Intel 80286 (1982)
  - 32 bits: Intel 80386 (1985)
  - 64 bits: Intel Merced (1998)
29. Pitfall: Predicting Cache Performance of One Program from Another Program
- 4 KB data cache: miss rate 8%, 12%, or 28%?
- 1 KB instruction cache: miss rate 0%, 3%, or 10%?
- Alpha vs. MIPS for an 8 KB data cache: 17% vs. 10%
30. Pitfall: Simulating Too Small an Address Trace
31. Virtual Memory Summary
- Virtual memory (VM) allows main memory (DRAM) to act like a cache for secondary storage (magnetic disk).
- The large miss penalty of virtual memory leads to different strategies from caches:
  - Fully associative placement, TB + page table, LRU, write-back
- Designed as:
  - Paged: fixed-size blocks
  - Segmented: variable-size blocks
  - Hybrid: segmented paging or multiple page sizes
- Avoid a small address size
32. Summary 2: Typical Choices

Option                | TLB                      | L1 Cache        | L2 Cache       | VM (page)
----------------------|--------------------------|-----------------|----------------|------------------
Block size            | 4-8 bytes (1 PTE)        | 4-32 bytes      | 32-256 bytes   | 4K-16K bytes
Hit time              | 1 cycle                  | 1-2 cycles      | 6-15 cycles    | 10-100 cycles
Miss penalty          | 10-30 cycles             | 8-66 cycles     | 30-200 cycles  | 700K-6M cycles
Local miss rate       | 0.1-2%                   | 0.5-20%         | 13-15%         | 0.00001-0.001%
Size                  | 32 B - 8 KB              | 1-128 KB        | 256 KB - 16 MB | --
Backing store         | L1 cache                 | L2 cache        | DRAM           | Disks
Q1: block placement   | Fully or set associative | DM              | DM or SA       | Fully associative
Q2: block ID          | Tag/block                | Tag/block       | Tag/block      | Table
Q3: block replacement | Random (not last)        | N.A. (for DM)   | Random (if SA) | LRU/LFU
Q4: writes            | Flush on PTE write       | Through or back | Write-back     | Write-back