Title: CPSC 318 Computer Structures, Lecture 17: Virtual Memory
1. CPSC 318 Computer Structures, Lecture 17: Virtual Memory
- Dr. Son Vuong
- (vuong_at_cs.ubc.ca)
- March 23, 2004
2. Why Caches?
[Figure: processor vs. DRAM performance, 1980-2000, log scale. CPU performance ("Moore's Law") grows ~60%/yr while DRAM grows only ~7-9%/yr, so the processor-memory performance gap grows ~50%/year.]
- 1989: first Intel CPU with cache on chip
- 1998: Pentium III has two levels of cache on chip
3. Review (1/2)
- Caches are NOT mandatory:
- Processor performs arithmetic
- Memory stores data
- Caches simply make things go faster
- Each level of the memory hierarchy is just a subset of the next higher level
- Caches speed things up due to temporal locality: store data used recently
- Block size > 1 word speeds things up due to spatial locality: store words adjacent to the ones used recently
4. Review (2/2)
- Cache design choices:
- size of cache: speed vs. capacity
- direct-mapped vs. associative
- for N-way set associative: choice of N
- block replacement policy
- 2nd-level cache?
- write-through vs. write-back?
- Use a performance model to pick between choices, depending on programs, technology, budget, ...
5Another View of the Memory Hierarchy
Regs
Upper Level
Instr. Operands
Faster
Cache
Blocks
L2 Cache
Blocks
Memory
Pages
Disk
Files
Larger
Tape
Lower Level
6. Virtual Memory
- If the Principle of Locality allows caches to offer (usually) the speed of cache memory with the size of DRAM, then why not apply the idea recursively at the next level to get the speed of DRAM with the size of disk?
- Called Virtual Memory
- Also allows the OS to share memory and protect programs from each other
- Today, more important for protection than as just another level of the memory hierarchy
- Historically, it predates caches
7. Virtual to Physical Address Translation
[Figure: the program operates in its virtual address space; a hardware mapping translates each virtual address (instruction fetch, load, store) to a physical address in physical memory (incl. caches).]
- Each program operates in its own virtual address space, as if it were the only program running
- Each process is protected from the others
- OS can decide where each goes in memory
- Hardware (HW) provides the virtual → physical mapping
8. Mapping Virtual Memory to Physical Memory
- Divide memory into equal-sized chunks (pages of about 4 KB)
- Any chunk of virtual memory can be assigned to any chunk of physical memory
[Figure: a virtual address space (stack at the top, address 0 at the bottom) mapped page-by-page onto a 64 MB physical memory.]
9. Virtual Memory Mapping Function
- Cannot have a simple function to predict an arbitrary mapping
- Use table lookup (the Page Table) for mappings; the virtual page number is the index
- Virtual Memory Mapping Function:
- Physical Offset = Virtual Offset
- Physical Page Number = PageTable[Virtual Page Number]
- (Physical Page is also called a Page Frame)
10. Paging Organization (assume 1 KB pages)
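The mapping function above can be sketched in a few lines of Python for the 1 KB pages assumed on this slide. The page-table contents here are made up purely for illustration:

```python
# Sketch of paging with 1 KB pages: the low 10 bits (offset) pass through
# unchanged, the high bits (virtual page number) are looked up in a table.
PAGE_SIZE = 1024          # 1 KB pages -> 10-bit page offset
OFFSET_BITS = 10

# Hypothetical page table: virtual page number -> physical page number
page_table = {0: 7, 1: 3, 2: 5}

def translate(virtual_addr):
    vpn = virtual_addr >> OFFSET_BITS          # virtual page number (index)
    offset = virtual_addr & (PAGE_SIZE - 1)    # unchanged by translation
    ppn = page_table[vpn]                      # PageTable[VPN]
    return (ppn << OFFSET_BITS) | offset       # concatenation, not addition

# Virtual address 0x0404 = page 1, offset 4 -> physical page 3, offset 4
print(hex(translate(0x0404)))  # 0xc04
```

Note that the physical page number and offset are concatenated, not added, exactly as the slide's formula "Physical Offset = Virtual Offset" implies.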
11. Page Table
- A page table is an operating-system structure that contains the mapping of virtual addresses to physical locations
- There are several different ways, all up to the operating system, to keep this data around
- Each process running in the operating system has its own page table
- State of a process = PC, all registers, plus page table
- OS changes page tables by changing the contents of the Page Table Base Register
12. Address Mapping: Page Table
[Figure: the virtual page number indexes the Page Table; the physical page address it returns is concatenated (not added) with the page offset to form the physical memory address.]
- Each Page Table entry holds a Valid bit (V), Access Rights (A.R.), and a Physical Page Address (P.P.A.)
- Entries whose page is not resident hold a Disk Address (Disk A.) instead, pointing to the page on disk
- Page Table is located in physical memory
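A minimal sketch of the entry check described above, with the slide's fields (Valid, Access Rights, Physical Page Address, Disk Address). The field names and the PageFault signal are hypothetical Python stand-ins, not a real OS API:

```python
# Illustrative page table entry with the slide's fields: Valid bit (V),
# Access Rights (A.R.), Physical Page Address (P.P.A.), and a disk
# address used when the page is not resident in memory.
from dataclasses import dataclass

@dataclass
class PTE:
    valid: bool
    rights: str        # e.g. "r" or "rw" (access rights)
    ppn: int           # physical page number, meaningful only if valid
    disk_addr: int     # where the page lives on disk if not valid

class PageFault(Exception):
    pass

def lookup(pte: PTE, want_write: bool) -> int:
    if not pte.valid:
        raise PageFault(f"page on disk at {pte.disk_addr}")
    if want_write and "w" not in pte.rights:
        raise PageFault("protection violation")
    return pte.ppn

print(lookup(PTE(True, "rw", ppn=42, disk_addr=0), want_write=True))  # 42
```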
13. Notes on Page Table
- Solves the fragmentation problem: all chunks are the same size, so all holes can be used
- OS must reserve Swap Space on disk for each process
- To grow a process, ask the Operating System
- If there are unused pages, the OS uses them first
- If not, the OS swaps some old pages to disk
- (Least Recently Used policy picks which pages to swap)
- Each process has its own Page Table
- Will add details, but the Page Table is the essence of Virtual Memory
14. Comparing the Two Levels of Hierarchy

  Cache version                         Virtual Memory version
  Block (or Line)                       Page
  Miss                                  Page Fault
  Block size: 32-64 B                   Page size: 4 KB-8 KB
  Placement: Direct Mapped,             Fully Associative
    N-way Set Associative
  Replacement: LRU or Random            Least Recently Used (LRU)
  Write-Through or Write-Back           Write-Back
15. Virtual Memory Problem #1
- Map every address? That is 1 indirection via the Page Table in memory per virtual address
- → 1 virtual memory access = 2 physical memory accesses → SLOW!
- Observation: since there is locality in pages of data, there must be locality in the virtual address translations of those pages
- Since small is fast, why not use a small cache of virtual-to-physical address translations to make translation fast?
- For historical reasons, this cache is called a Translation Lookaside Buffer, or TLB
16. Translation Lookaside Buffers
- TLBs are usually small, typically 128-256 entries
- Like any other cache, the TLB can be direct mapped, set associative, or fully associative
[Figure: the processor issues a virtual address (VA) to the TLB Lookup; on a hit, the physical address (PA) goes straight to the Cache; on a miss, Translation via the page table supplies the PA. The Cache returns data on a hit or goes to Main Memory on a miss.]
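The lookup path above can be sketched as a tiny direct-mapped TLB in front of a page table. The sizes and the backing `page_table` dict below are illustrative only (the slide notes real TLBs have 128-256 entries):

```python
# Minimal direct-mapped TLB sketch: each VPN maps to exactly one slot;
# on a miss we "walk" the page table (a dict here) and refill the slot.
TLB_ENTRIES = 8   # toy size; real TLBs are 128-256 entries

tlb = [None] * TLB_ENTRIES                          # slot: (vpn, ppn) or None
page_table = {vpn: vpn + 100 for vpn in range(64)}  # fake mappings

def tlb_translate(vpn):
    slot = vpn % TLB_ENTRIES
    entry = tlb[slot]
    if entry is not None and entry[0] == vpn:
        return entry[1], "hit"
    ppn = page_table[vpn]             # TLB miss: consult the page table
    tlb[slot] = (vpn, ppn)            # refill the TLB entry
    return ppn, "miss"

print(tlb_translate(3))   # (103, 'miss') — cold miss
print(tlb_translate(3))   # (103, 'hit')  — translation now cached
print(tlb_translate(11))  # (111, 'miss') — conflicts with vpn 3 (same slot)
```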
17. Typical TLB Format

  Virtual Address | Physical Address | Dirty | Ref | Valid | Access Rights

- TLB is just a cache on the page table mappings
- TLB access time is comparable to cache access time (much less than main memory access time)
- Dirty: since we use write-back, we need to know whether or not to write the page to disk when it is replaced
- Ref: used to help calculate LRU on replacement
- Cleared by the OS periodically, then checked to see if the page was referenced
18. What if We Don't Have Enough Memory?
- We choose some other page belonging to a program and transfer it onto the disk if it is dirty
- If it is clean (the disk copy is up-to-date), just overwrite that data in memory
- We choose the page to evict based on a replacement policy (e.g., LRU)
- And update that program's page table to reflect the fact that its memory moved somewhere else
19. Virtual Memory Problem #2
- Not enough physical memory!
- Only, say, 64 MB of physical memory
- N processes, each with 4 GB (2^32 B) of virtual memory!
- Could have 1K virtual pages per physical page!
- Spatial Locality to the rescue:
- Each page is 4 KB: lots of nearby references
- No matter how big the program is, at any time it is only accessing a few pages
- Working Set: the recently used pages
20. Virtual Memory Problem #3
- Page Table too big!
- 4 GB Virtual Memory ÷ 4 KB pages → ~1 million Page Table Entries → 4 MB just for the Page Table of 1 process; 25 processes → 100 MB for Page Tables!
- A variety of solutions trade off the memory size of the mapping function against slower handling of TLB misses
- Make the TLB large enough, and highly associative, so we rarely miss on address translation
- CS 315 will go over more options in greater depth
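The arithmetic above checks out directly, assuming 4-byte page table entries (a common textbook assumption, not stated on the slide):

```python
# Size of a flat page table for a 4 GB virtual address space, 4 KB pages.
VIRTUAL_SPACE = 4 * 2**30     # 4 GB
PAGE_SIZE = 4 * 2**10         # 4 KB
PTE_SIZE = 4                  # bytes per page table entry (assumed)

entries = VIRTUAL_SPACE // PAGE_SIZE   # one entry per virtual page
table_bytes = entries * PTE_SIZE

print(entries)                       # 1048576 entries (~1 million)
print(table_bytes // 2**20)          # 4   (MB per process)
print(25 * table_bytes // 2**20)     # 100 (MB for 25 processes)
```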
21. 2-Level Page Table
22. Page Table Shrink
- Only have a second-level page table for the valid entries of the super-level (first-level) page table
23. Space Savings for Multi-Level Page Table
- If only 10% of the entries of the Super Page Table are valid, then the total mapping size is roughly 1/10th of a single-level page table
- Exercise 7.35 explores the exact size
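A sketch of the two-level lookup: second-level tables exist only for valid first-level ("super") entries, so unmapped regions cost nothing. The 10-bits-per-level split below matches a 32-bit address with 4 KB pages, but the concrete structure is an illustration, not the slide's exact layout:

```python
# Two-level page table: a 20-bit VPN split into two 10-bit indices.
# Second-level tables are allocated lazily, only when a page is mapped.
LEVEL_BITS = 10
LEVEL_SIZE = 1 << LEVEL_BITS     # 1024 entries per table

super_table = {}                 # first-level index -> second-level dict

def map_page(vpn, ppn):
    hi, lo = vpn >> LEVEL_BITS, vpn & (LEVEL_SIZE - 1)
    super_table.setdefault(hi, {})[lo] = ppn   # allocate 2nd level on demand

def translate(vpn):
    hi, lo = vpn >> LEVEL_BITS, vpn & (LEVEL_SIZE - 1)
    second = super_table.get(hi)
    if second is None or lo not in second:
        return None              # no mapping -> page fault
    return second[lo]

map_page(0x00321, 7)
print(translate(0x00321))        # 7
print(translate(0xFFFFF))        # None — that 2nd-level table was never built
```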
24. Three Advantages of Virtual Memory
- 1) Translation:
- Program can be given a consistent view of memory, even though physical memory is scrambled
- Makes multiple processes reasonable
- Only the most important part of the program (the Working Set) must be in physical memory
- Contiguous structures (like stacks) use only as much physical memory as necessary, yet can still grow later
25. Three Advantages of Virtual Memory (cont.)
- 2) Protection:
- Different processes are protected from each other
- Different pages can be given special behavior (Read Only, invisible to user programs, etc.)
- Kernel data is protected from user programs
- Very important for protection from malicious programs → far more viruses under Microsoft Windows
- A special mode in the processor (kernel mode) allows the processor to change the page table/TLB
- 3) Sharing:
- Can map the same physical page to multiple users (shared memory)
26. Crossing the System Boundary
- System loads the user program into memory and gives it use of the processor
- Switching back to the system:
- SYSCALL: request a service, I/O
- TRAP (e.g., overflow)
- Interrupt
[Figure: the user side (process, memory) vs. the system side (I/O bus, device data registers) of the boundary.]
27. Instruction Set Support for VM/OS
- How do we prevent a user program from changing page tables and going anywhere?
- A bit in the Status Register determines whether we are in user mode or OS (kernel) mode: the Kernel/User bit (KU) (0 → kernel, 1 → user)
- On an exception/interrupt: disable interrupts (IE ← 0) and go into kernel mode (KU ← 0)
- Only change the page table when in kernel mode (Operating System)
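A toy model of that mode check. The bit positions are made up for the sketch; only the behavior (KU = 0 means kernel, exceptions clear IE and KU) follows the slide:

```python
# Toy status register with the slide's Kernel/User bit (KU: 0 = kernel,
# 1 = user) and an Interrupt Enable bit (IE). Bit positions are invented.
KU_BIT = 0b01   # Kernel/User bit
IE_BIT = 0b10   # Interrupt Enable bit

def take_exception(status):
    """On exception/interrupt: clear IE and KU, i.e. enter kernel mode."""
    return status & ~(KU_BIT | IE_BIT)

def write_page_table(status):
    if status & KU_BIT:                # KU = 1 means user mode
        raise PermissionError("page table writable only in kernel mode")
    return "page table updated"

user_status = KU_BIT | IE_BIT          # user mode, interrupts enabled
kernel_status = take_exception(user_status)
print(write_page_table(kernel_status))  # page table updated
```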
28. Four Questions for the Memory Hierarchy
- Q1: Where can a block be placed in the upper level? (Block placement)
- Q2: How is a block found if it is in the upper level? (Block identification)
- Q3: Which block should be replaced on a miss? (Block replacement)
- Q4: What happens on a write? (Write strategy)
29. Q1: Where Can a Block Be Placed in the Upper Level?
- Example: block 12 placed in an 8-block cache
- Fully associative, direct mapped, or 2-way set associative
- Set-associative mapping: set = block number mod number of sets
[Figure: three 8-block caches (block numbers 0-7); the 2-way set-associative cache is divided into sets 0-3.]
- Fully associative: block 12 can go anywhere
- Direct mapped: block 12 can go only into block 4 (12 mod 8)
- Set associative: block 12 can go anywhere in set 0 (12 mod 4)
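The three placements above fall out of one formula once associativity is a parameter. This helper (names are my own) computes the candidate slots for any associativity, with direct mapped and fully associative as the two extremes:

```python
# Candidate cache slots for a block under varying associativity,
# using the slide's rule: set = block number mod number of sets.
BLOCKS = 8

def candidate_slots(block_no, assoc):
    """assoc = 1 (direct mapped) ... BLOCKS (fully associative)."""
    num_sets = BLOCKS // assoc
    s = block_no % num_sets
    return list(range(s * assoc, s * assoc + assoc))   # the slots in set s

print(candidate_slots(12, 1))   # [4]                  direct mapped: 12 mod 8
print(candidate_slots(12, 2))   # [0, 1]               2-way: set 0 = 12 mod 4
print(candidate_slots(12, 8))   # [0, 1, ..., 7]       fully associative
```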
30. Q2: How Is a Block Found in the Upper Level?
[Figure: the address divided into Tag, Index (set select), and Block Offset (data select).]
- Direct indexing (using index and block offset), tag compares, or a combination
- Increasing associativity shrinks the index, expands the tag
31. Q3: Which Block Is Replaced on a Miss?
- Easy for Direct Mapped
- Set Associative or Fully Associative:
- Random
- LRU (Least Recently Used)
- Miss rates (%):

  Associativity:   2-way         4-way         8-way
  Size             LRU    Ran    LRU    Ran    LRU    Ran
  16 KB            5.2    5.7    4.7    5.3    4.4    5.0
  64 KB            1.9    2.0    1.5    1.7    1.4    1.5
  256 KB           1.15   1.17   1.13   1.13   1.12   1.12
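A minimal LRU policy for one fully associative set can be sketched with an ordered dictionary (this tracks recency the way software would; hardware uses approximations):

```python
# LRU replacement for one set: an OrderedDict keeps blocks oldest-first,
# so the least recently used victim is always at the front.
from collections import OrderedDict

class LRUSet:
    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()   # block number -> None, oldest first

    def access(self, block):
        if block in self.blocks:
            self.blocks.move_to_end(block)      # refresh recency on a hit
            return "hit"
        if len(self.blocks) >= self.ways:
            self.blocks.popitem(last=False)     # evict least recently used
        self.blocks[block] = None
        return "miss"

s = LRUSet(ways=2)
print(s.access(1), s.access(2), s.access(1), s.access(3))  # miss miss hit miss
print(list(s.blocks))   # [1, 3] — block 2 was the LRU victim
```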
32. Q4: What to Do on a Write Hit?
- Write-through:
- update the word in the cache block and the corresponding word in memory
- Write-back:
- update the word in the cache block
- allow the memory word to be stale
- → add a dirty bit to each line, indicating that memory must be updated when the block is replaced
- → OS flushes the cache before I/O!
- Performance trade-offs?
- WT: read misses cannot result in writes
- WB: repeated writes to the same block cause no extra memory writes
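The trade-off can be made concrete by counting memory writes under each policy. This is a bare sketch (the "cache" is just a dict, and we assume every write hits):

```python
# Write-through vs. write-back on repeated write hits to one word,
# counting how many writes actually reach memory under each policy.
class Cache:
    def __init__(self, write_back):
        self.write_back = write_back
        self.data, self.dirty = {}, set()
        self.mem_writes = 0

    def write(self, addr, value):         # assume the block is present (hit)
        self.data[addr] = value
        if self.write_back:
            self.dirty.add(addr)          # defer: just mark the line dirty
        else:
            self.mem_writes += 1          # write-through: update memory now

    def evict(self, addr):
        if addr in self.dirty:
            self.mem_writes += 1          # write the dirty line back once
            self.dirty.discard(addr)

wt, wb = Cache(write_back=False), Cache(write_back=True)
for c in (wt, wb):
    for v in range(5):
        c.write(0x40, v)                  # 5 repeated writes to one word
    c.evict(0x40)
print(wt.mem_writes, wb.mem_writes)       # 5 1 — WB coalesces repeated writes
```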
33. Virtual Memory Overview
- Let's say we're fetching some data:
- Check the TLB (input: VPN, output: PPN)
- hit: fetch translation
- miss: check the page table (in memory)
- Page table hit: fetch translation
- Page table miss: page fault; fetch the page from disk to memory, return the translation to the TLB
- Check the cache (input: PPN, output: data)
- hit: return value
- miss: fetch value from memory
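The steps above can be sketched end-to-end. Every structure here is a toy dict standing in for real hardware/OS state, and the physical-page allocator is deliberately naive:

```python
# End-to-end sketch of the slide's flow: TLB -> page table -> (page fault
# brings the page from disk) -> physically addressed cache.
tlb, page_table, cache, memory = {}, {}, {}, {}
disk = {vpn: f"page-{vpn}" for vpn in range(16)}   # backing store (toy data)
events, next_ppn = [], 20                          # next free physical page

def load(vpn):
    global next_ppn
    if vpn in tlb:                            # 1. TLB lookup
        ppn = tlb[vpn]; events.append("tlb hit")
    elif vpn in page_table:                   # 2. page table walk in memory
        ppn = tlb[vpn] = page_table[vpn]; events.append("tlb miss")
    else:                                     # 3. page fault: fetch from disk
        ppn, next_ppn = next_ppn, next_ppn + 1
        memory[ppn] = disk[vpn]
        page_table[vpn] = tlb[vpn] = ppn
        events.append("page fault")
    if ppn in cache:                          # 4. physically addressed cache
        events.append("cache hit")
    else:
        events.append("cache miss"); cache[ppn] = memory[ppn]
    return cache[ppn]

load(5); load(5)
print(events)   # ['page fault', 'cache miss', 'tlb hit', 'cache hit']
```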
34. Paging/Virtual Memory with TLB
[Figure: two user virtual address spaces (User A and User B, each with Code, Static, Heap, and Stack segments from address 0 upward) mapped through page tables/TLB into a single 64 MB physical memory.]
35. Virtual Memory Overview
- TLB is usually small, typically 128-256 entries
- Block size: 1-2 page table entries (4-8 B each)
- Hit time: 0.5-1 cycle
- Miss penalty: 10-30 cycles
- Miss rate: 0.01-1%
[Figure: the same TLB/cache datapath as before: VA → TLB Lookup → (hit: PA to Cache; miss: Translation supplies PA) → data from Cache or Main Memory.]
36. Address Translation: 3 Concept Tests
[Figure: a TLB holding Virtual Page Number (V.P.N.) → Physical Page Number (P.P.N.) entries, used for the concept-test questions.]
37. Cache/VM/TLB Summary (1/3)
- The Principle of Locality:
- Programs access a relatively small portion of the address space at any instant of time
- Temporal Locality: locality in time
- Spatial Locality: locality in space
- Caches, TLBs, and Virtual Memory are all understood by examining how they deal with 4 questions: 1) Where can a block be placed? 2) How is a block found? 3) Which block is replaced on a miss? 4) How are writes handled?
38. Cache/VM/TLB Summary (2/3)
- Virtual Memory allows protected sharing of memory between processes, with less swapping to disk and less fragmentation than always-swap or base/bound schemes
- 3 Problems:
- 1) Not enough memory: Spatial Locality means a small Working Set of pages is OK
- 2) A TLB reduces the performance cost of VM
- 3) Need a more compact representation to reduce the memory cost of a simple 1-level page table, especially for 64-bit addresses
39. Cache/VM/TLB Summary (3/3)
- Virtual memory was controversial at the time: can SW automatically manage 64 KB across many programs?
- 1000X DRAM growth removed the controversy
- Today VM allows many processes to share a single memory without having to swap all processes to disk; VM's role in protection is today more important than its role in the memory hierarchy
- Today CPU time is a function of (ops, cache misses) vs. just f(ops). What does this mean for compilers, data structures, and algorithms?
40. Reading Quiz
- 1. The page table is a memory data structure that maps virtual pages to physical pages. It would seem, then, that every virtual memory access would result in two physical memory accesses: one to translate into a physical address via the page table, and a second to get to the actual data. How do operating systems and processors avoid such a high overhead for virtual memory?
- 2. In addition to having a valid bit and a dirty bit, some page tables have a reference bit. If the bit is one, it means that the page has been accessed since the last time the operating system set the bit to zero. What is the purpose of such a bit? Can you think of a way to get a similar effect without a reference bit?
41. Reading Quiz
- 1. The standard four questions for memory hierarchies emphasize the similarities between caches and virtual memory. Some combinations of the options that make sense for caches would be silly in a virtual memory system. Which combinations would you never expect to see in a real system? Why?
- 2. Is a TLB a cache? If so, what is it a cache of? Is it OK to use either write-through or write-back? Why?
42. Bonus (remaining slides could appear after slide 43): Impact of Caches?
- 1960-1985: Speed = f(no. of operations)
- 1990s:
- Pipelined execution, fast clock rate
- Out-of-order execution
- Superscalar
- 2001: Speed = f(non-cached memory accesses)?
43. Quicksort vs. Radix Sort as We Vary the Number of Keys: Instructions
[Figure: instructions/key vs. set size in keys, for radix sort and quicksort.]
44. Quicksort vs. Radix Sort as We Vary the Number of Keys: Instructions and Time
[Figure: time/key and instructions/key vs. set size in keys, for radix sort and quicksort.]
45. Quicksort vs. Radix Sort as We Vary the Number of Keys: Cache Misses
- What is the proper approach to fast algorithms?
[Figure: cache misses/key vs. set size in keys, for radix sort and quicksort.]
46. Bonus Slide: Kernel/User Mode
- Generally restrict device access and page table access to the OS
- HOW?
- Add a mode bit to the machine: K/U
- Only allow SW in kernel mode to access device registers and the page table
- If user programs could access I/O devices and page tables directly?
- they could destroy each other's data, ...
- they might break the devices, ...