CPSC 318 Computer Structures, Lecture 17: Virtual Memory

1
CPSC 318 Computer Structures, Lecture 17
Virtual Memory & Cache
  • Dr. Son Vuong
  • (vuong@cs.ubc.ca)
  • March 23, 2004

2
Why Caches?
[Chart: processor vs. DRAM performance, 1980-2000, log scale.
µProc improves 60%/yr. (Moore's Law) while DRAM improves only
7-9%/yr.; the processor-memory performance gap grows 50%/year.]
  • 1989 first Intel CPU with cache on chip
  • 1998 Pentium III has two levels of cache on chip

3
Review (1/2)
  • Caches are NOT mandatory
  • Processor performs arithmetic
  • Memory stores data
  • Caches simply make things go faster
  • Each level of memory hierarchy is just a subset
    of next higher level
  • Caches speed things up due to temporal locality: store
    data used recently
  • Block size > 1 word speeds things up due to spatial
    locality: store words adjacent to the ones used
    recently

4
Review (2/2)
  • Cache design choices
  • size of cache: speed v. capacity
  • direct-mapped v. associative
  • for N-way set assoc: choice of N
  • block replacement policy
  • 2nd level cache?
  • Write through v. write back?
  • Use performance model to pick between choices,
    depending on programs, technology, budget, ...

5
Another View of the Memory Hierarchy
[Diagram: Regs (Instr. Operands) → Cache (Blocks) → L2 Cache
(Blocks) → Memory (Pages) → Disk (Files) → Tape. Upper levels
are faster; lower levels are larger.]
6
Virtual Memory
  • If the Principle of Locality lets caches offer (usually)
    the speed of cache memory with the size of DRAM
    memory, then why not apply the idea recursively at the
    next level, to get the speed of DRAM memory with the
    size of disk memory?
  • Called Virtual Memory
  • Also allows OS to share memory, protect programs
    from each other
  • Today, more important for protection vs. just
    another level of memory hierarchy
  • Historically, it predates caches

7
Virtual to Physical Addr. Translation
[Diagram: the program operates in its virtual address space;
a HW mapping translates each virtual address (inst. fetch,
load, store) into a physical address in physical memory
(incl. caches).]
  • Each program operates in its own virtual address
    space, as if it were the only program running
  • Each process is protected from the others
  • OS can decide where each goes in memory
  • Hardware (HW) provides the virtual → physical mapping

8
Mapping Virtual Memory to Physical Memory
[Diagram: virtual memory (stack at top, address 0 at bottom)
divided into equal-sized chunks (pages of about 4 KB); any
chunk of virtual memory can be assigned to any chunk of a
64 MB physical memory.]
9
Virtual Memory Mapping Function
  • Cannot have a simple function to predict an arbitrary
    mapping
  • Use a table lookup (the Page Table) for the mappings:
    the virtual page number is the index
  • Virtual Memory Mapping Function:
  • Physical Offset = Virtual Offset
  • Physical Page Number = PageTable[Virtual Page Number]
  • (Physical Page also called Page Frame)
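The mapping function above can be sketched in a few lines of Python. This is a minimal illustration, assuming 1 KB pages (a 10-bit offset) and a hypothetical `page_table` list indexed by virtual page number:

```python
PAGE_SIZE = 1024     # 1 KB pages => 10-bit offset (assumption for this sketch)
OFFSET_BITS = 10

def translate(virtual_address, page_table):
    """Split the virtual address, look up the page table, reassemble."""
    vpn = virtual_address >> OFFSET_BITS        # virtual page number = index
    offset = virtual_address & (PAGE_SIZE - 1)  # physical offset = virtual offset
    ppn = page_table[vpn]                       # physical page (page frame) number
    return (ppn << OFFSET_BITS) | offset

# Example: virtual page 2 maps to physical page 5
page_table = [7, 3, 5, 0]
print(translate(2 * 1024 + 100, page_table))   # -> 5220, i.e. 5*1024 + 100
```

Note the offset bits pass through untranslated; only the page number goes through the table.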

10
Paging Organization (assume 1 KB pages)
11
Page Table
  • A page table is an operating system structure
    which contains the mapping of virtual addresses
    to physical locations
  • There are several different ways, all up to the
    operating system, to keep this data around
  • Each process running in the operating system has
    its own page table
  • State of process is PC, all registers, plus
    page table
  • OS changes page tables by changing contents of
    Page Table Base Register

12
Address Mapping Page Table
[Diagram: the Page Table (located in physical memory) holds,
per entry, a Valid bit (V), Access Rights (A.R.), and a
Physical Page Address (P.P.A.). The physical page address is
concatenated (not added) with the page offset to form the
physical memory address; invalid entries hold a Disk Address
instead, pointing to the page on disk.]
13
Notes on Page Table
  • Solves the fragmentation problem: all chunks are the
    same size, so all holes can be used
  • OS must reserve "Swap Space" on disk for each
    process
  • To grow a process, ask the Operating System
  • If there are unused pages, the OS uses them first
  • If not, the OS swaps some old pages to disk
  • (Least Recently Used policy picks the pages to swap)
  • Each process has own Page Table
  • Will add details, but Page Table is essence of
    Virtual Memory

14
Comparing the 2 levels of hierarchy
  • Cache version                 vs.  Virtual Memory version
  • Block (or Line)                    Page
  • Miss                               Page Fault
  • Block Size: 32-64 B                Page Size: 4-8 KB
  • Placement: Direct Mapped,          Fully Associative
    N-way Set Associative
  • Replacement: LRU or Random         LRU
  • Write Thru or Back                 Write Back

15
Virtual Memory Problem 1
  • Map every address ⇒ 1 indirection via the Page Table
    in memory per virtual address
  • ⇒ 1 virtual memory access = 2 physical memory
    accesses ⇒ SLOW!
  • Observation: since there is locality in pages of data,
    there must be locality in the virtual address
    translations of those pages
  • Since small is fast, why not use a small cache of
    virtual-to-physical address translations to make
    translation fast?
  • For historical reasons, this cache is called a
    Translation Lookaside Buffer, or TLB

16
Translation Look-Aside Buffers
  • TLBs usually small, typically 128 - 256 entries
  • Like any other cache, the TLB can be direct
    mapped, set associative, or fully associative

[Diagram: Processor issues a VA → TLB Lookup. TLB hit: the PA
goes to the Cache; TLB miss: translation via the page table.
Cache hit: data returned; cache miss: access Main Memory.]
17
Typical TLB Format
Virtual Address | Physical Address | Dirty | Ref | Valid | Access Rights
  • TLB is just a cache on the page table mappings
  • TLB access time is comparable to cache access time (much
    less than main memory access time)
  • Dirty: since we use write back, we need to know
    whether or not to write the page to disk when it is
    replaced
  • Ref: used to help approximate LRU on replacement
  • Cleared by the OS periodically, then checked to see
    if the page was referenced
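One way the OS might use the Ref bit, sketched below. This is a hypothetical illustration (not from the slides): periodically clear every Ref bit, let accesses set them again, and at replacement time prefer a page whose Ref bit is still 0:

```python
# Approximate LRU with a single Ref bit per page (hypothetical structures:
# each page is a dict with an "id" and a "ref" bit).

def clear_ref_bits(pages):
    """The OS's periodic pass: zero every reference bit."""
    for p in pages:
        p["ref"] = 0

def pick_victim(pages):
    """Prefer a page not referenced since the last clearing pass."""
    for p in pages:
        if p["ref"] == 0:
            return p
    return pages[0]    # all pages referenced: fall back to any page

pages = [{"id": 0, "ref": 1}, {"id": 1, "ref": 0}, {"id": 2, "ref": 1}]
print(pick_victim(pages)["id"])   # -> 1 (the only page with ref == 0)
```

This gets a similar effect to true LRU at the cost of one bit per page, which is why real page tables carry a reference bit.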

18
What if we don't have enough memory?
  • We choose some other page belonging to a program
    and transfer it onto the disk if it is dirty
  • If it is clean (the disk copy is up-to-date), just
    overwrite that data in memory
  • We choose the page to evict based on a replacement
    policy (e.g., LRU)
  • And update that program's page table to reflect
    the fact that its memory moved somewhere else
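The eviction steps above can be sketched as follows. All the structures here (the `page` dict, the `disk` dict standing in for swap space) are hypothetical illustrations:

```python
# Evict one page: write it back only if dirty, then mark the page
# table entry invalid so future accesses fault and go to disk.

def evict(page, page_table, disk):
    if page["dirty"]:
        disk[page["vpn"]] = page["data"]   # dirty: transfer page to disk
    # clean: the disk copy is already up to date; just reuse the frame
    page_table[page["vpn"]] = {"valid": False, "on_disk": True}

disk = {}
page_table = {3: {"valid": True}}
evict({"vpn": 3, "dirty": True, "data": "payload"}, page_table, disk)
print(disk[3])                  # -> 'payload'
print(page_table[3]["valid"])   # -> False
```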

19
Virtual Memory Problem 2
  • Not enough physical memory!
  • Only, say, 64 MB of physical memory
  • N processes, each with 4 GB (2^32 B) of virtual memory!
  • Could have 1K virtual pages per physical page!
  • Spatial Locality to the rescue
  • Each page is 4 KB, so there are lots of nearby references
  • No matter how big the program is, at any time it is only
    accessing a few pages
  • Working Set: the recently used pages

20
Virtual Memory Problem 3
  • Page Table too big!
  • 4 GB Virtual Memory ÷ 4 KB pages ⇒ 1 million
    Page Table Entries ⇒ 4 MB just for the Page Table
    of 1 process; 25 processes ⇒ 100 MB for Page
    Tables!
  • Variety of solutions that trade off the memory size of
    the mapping function for slower handling of TLB misses
  • Make the TLB large enough and highly associative so we
    rarely miss on address translation
  • CS 315 will go over more options and in greater
    depth
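The arithmetic on this slide, spelled out (the 4 B per page table entry is an assumption consistent with the slide's totals):

```python
virtual_space = 4 * 2**30     # 4 GB of virtual memory per process
page_size = 4 * 2**10         # 4 KB pages
entry_size = 4                # 4 B per page table entry (assumed)

entries = virtual_space // page_size
table_bytes = entries * entry_size

print(entries)                    # -> 1048576 (about 1 million entries)
print(table_bytes // 2**20)       # -> 4   (MB of page table per process)
print(25 * table_bytes // 2**20)  # -> 100 (MB for 25 processes)
```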

21
2-level Page Table
22
Page Table Shrink
  • Single Page Table vs. two levels:

Only have a second-level page table for the valid
entries of the super (first-level) page table
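A two-level lookup can be sketched as below. The structures are hypothetical illustrations: the super table holds either `None` (invalid, so no second level is ever allocated) or a second-level table, and the 10-bit split is an assumed example:

```python
# Two-level page table lookup: split the VPN into a super-table index
# and a second-level index; invalid super entries mean a page fault.

SECOND_LEVEL_BITS = 10      # assumed split for this sketch

def lookup(vpn, super_table):
    hi = vpn >> SECOND_LEVEL_BITS                  # index into super table
    lo = vpn & ((1 << SECOND_LEVEL_BITS) - 1)      # index into 2nd-level table
    second = super_table[hi]
    if second is None:
        return None          # no 2nd-level table exists: page fault
    return second.get(lo)    # physical page number (or None)

super_table = [None] * 4
super_table[1] = {5: 42}     # only one region pays for a 2nd-level table
print(lookup((1 << 10) + 5, super_table))   # -> 42
print(lookup(7, super_table))               # -> None (fault)
```

The savings come from the `None` entries: unmapped regions cost one super-table slot instead of a full run of page table entries.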
23
Space Savings for Multi-Level Page Table
  • If only 10% of the entries of the Super Page Table are
    valid, then the total mapping size is roughly
    1/10-th of a single-level page table
  • Exercise 7.35 explores exact size

24
Three Advantages of Virtual Memory
  • 1) Translation
  • Program can be given consistent view of memory,
    even though physical memory is scrambled
  • Makes multiple processes reasonable
  • Only the most important part of program (Working
    Set) must be in physical memory
  • Contiguous structures (like stacks) use only as
    much physical memory as necessary yet still grow
    later

25
Three Advantages of Virtual Memory
  • 2) Protection
  • Different processes protected from each other
  • Different pages can be given special behavior
  • (Read Only, Invisible to user programs, etc).
  • Kernel data protected from User programs
  • Very important for protection from malicious
    programs ⇒ far more viruses under Microsoft
    Windows
  • A special mode in the processor (kernel mode) allows the
    processor to change the page table/TLB
  • 3) Sharing
  • Can map same physical page to multiple
    users(Shared memory)

26
Crossing the System Boundary
  • System loads user program into memory and gives
    it use of the processor
  • Switch back to the system via:
  • SYSCALL: request a service, I/O
  • TRAP (e.g., overflow)
  • Interrupt

[Diagram: user side (Proc, Mem) and system side, connected by
an I/O Bus with device data registers.]
27
Instruction Set Support for VM/OS
  • How do we prevent a user program from changing the page
    tables and going anywhere?
  • A bit in the Status Register determines whether we are in
    user mode or OS (kernel) mode: the Kernel/User bit (KU)
    (0 ⇒ kernel, 1 ⇒ user)
  • On an exception/interrupt, disable interrupts (IE=0)
    and go into kernel mode (KU=0)
  • Only change the page table when in kernel mode
    (Operating System)
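The KU-bit rule above, as a sketch. The machine-state dict and function names here are hypothetical stand-ins for what hardware does:

```python
# Kernel/User bit: 0 = kernel, 1 = user. Page table writes are only
# legal in kernel mode; an exception atomically clears IE and KU.

def write_page_table(state, vpn, ppn):
    if state["KU"] != 0:                  # 1 => user mode
        raise PermissionError("page table writable only in kernel mode")
    state["page_table"][vpn] = ppn

def take_exception(state):
    state["IE"] = 0    # disable interrupts
    state["KU"] = 0    # enter kernel mode

state = {"KU": 1, "IE": 1, "page_table": {}}
take_exception(state)                     # trap into the OS
write_page_table(state, 3, 9)             # now legal
print(state["page_table"])                # -> {3: 9}
```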

28
4 Questions for Memory Hierarchy
  • Q1: Where can a block be placed in the upper
    level? (Block placement)
  • Q2: How is a block found if it is in the upper
    level? (Block identification)
  • Q3: Which block should be replaced on a miss?
    (Block replacement)
  • Q4: What happens on a write? (Write strategy)

29
Q1 Where block placed in upper level?
  • Where is block 12 placed in an 8-block cache?
  • Fully associative, direct mapped, or 2-way set
    associative
  • S.A. mapping: set = Block Number mod Number of Sets

[Diagram: an 8-block cache (block no. 0-7) shown three ways.
Fully associative: block 12 can go anywhere. Direct mapped:
block 12 can go only into block 4 (12 mod 8). Set associative
(4 sets, Set 0-3): block 12 can go anywhere in set 0
(12 mod 4).]
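The three placement policies are one formula with different set counts: fully associative is 1 set, direct mapped is 1 block per set. A sketch (the function name is just for illustration):

```python
def candidate_blocks(block_no, num_blocks, num_sets):
    """Blocks a memory block may occupy: num_sets == 1 is fully
    associative, num_sets == num_blocks is direct mapped."""
    set_no = block_no % num_sets              # S.A. mapping: mod number of sets
    blocks_per_set = num_blocks // num_sets
    start = set_no * blocks_per_set
    return list(range(start, start + blocks_per_set))

print(candidate_blocks(12, 8, 1))   # fully associative -> [0, 1, ..., 7]
print(candidate_blocks(12, 8, 8))   # direct mapped     -> [4]   (12 mod 8)
print(candidate_blocks(12, 8, 4))   # 2-way set assoc   -> [0, 1] (set 0 = 12 mod 4)
```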
30
Q2 How is a block found in upper level?
  • Direct indexing (using index and block offset),
    tag compares, or combination
  • Increasing associativity shrinks index, expands
    tag

31
Q3 Which block replaced on a miss?
  • Easy for Direct Mapped
  • Set Associative or Fully Associative
  • Random
  • LRU (Least Recently Used)
  • Miss rates by associativity (LRU vs. Random):

        Assoc.:   2-way         4-way         8-way
        Size      LRU    Ran    LRU    Ran    LRU    Ran
        16 KB     5.2%   5.7%   4.7%   5.3%   4.4%   5.0%
        64 KB     1.9%   2.0%   1.5%   1.7%   1.4%   1.5%
        256 KB    1.15%  1.17%  1.13%  1.13%  1.12%  1.12%

32
Q4 What to do on a write hit?
  • Write-through:
  • update the word in the cache block and the corresponding
    word in memory
  • Write-back:
  • update the word in the cache block
  • allow the memory word to be stale
  • ⇒ add a dirty bit to each line, indicating that
    memory must be updated when the block is replaced
  • ⇒ OS flushes the cache before I/O !!!
  • Performance trade-offs?
  • WT: read misses cannot result in writes
  • WB: repeated writes to a line cost only one memory write
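The contrast between the two write-hit policies can be sketched with plain dicts standing in for the cache and memory (hypothetical structures, not a real cache model):

```python
# Write-through updates memory on every write hit; write-back only
# marks the line dirty and pays the memory write at replacement time.

def write_through(cache, memory, addr, value):
    cache[addr] = {"data": value}
    memory[addr] = value                    # memory updated immediately

def write_back(cache, memory, addr, value):
    cache[addr] = {"data": value, "dirty": True}   # memory is now stale

def replace(cache, memory, addr):
    line = cache.pop(addr)
    if line.get("dirty"):
        memory[addr] = line["data"]         # memory updated only now

cache, memory = {}, {10: 0}
write_back(cache, memory, 10, 99)
print(memory[10])    # -> 0  (stale until replacement)
replace(cache, memory, 10)
print(memory[10])    # -> 99
```

The stale window in write-back is exactly why the slide notes the OS must flush the cache before I/O.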

33
Virtual Memory Overview
  • Let's say we're fetching some data:
  • Check TLB (input: VPN, output: PPN)
  • hit: fetch translation
  • miss: check page table (in memory)
  • Page table hit: fetch translation
  • Page table miss: page fault; fetch the page from disk
    to memory, return the translation to the TLB
  • Check cache (input: PPN, output: data)
  • hit: return value
  • miss: fetch value from memory
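The whole fetch flow above can be sketched end to end. The `tlb`, `page_table`, `cache`, and `memory` dicts here are hypothetical stand-ins keyed by page number or physical address:

```python
def fetch(va, tlb, page_table, cache, memory, page_size=4096):
    vpn, offset = divmod(va, page_size)
    if vpn in tlb:                       # TLB hit: translation is cached
        ppn = tlb[vpn]
    elif vpn in page_table:              # TLB miss, page table hit
        ppn = page_table[vpn]
        tlb[vpn] = ppn                   # return translation to TLB
    else:                                # page table miss: page fault
        raise RuntimeError("page fault: OS fetches page from disk, retries")
    pa = ppn * page_size + offset
    if pa in cache:                      # cache hit: return value
        return cache[pa]
    cache[pa] = memory[pa]               # cache miss: fetch value from memory
    return cache[pa]

tlb, page_table = {}, {1: 7}
memory = {7 * 4096 + 8: "value"}
print(fetch(1 * 4096 + 8, tlb, page_table, {}, memory))   # -> 'value'
```

After this call the TLB holds the 1 → 7 translation, so the next access to the same page skips the page table entirely.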

34
Paging/Virtual Memory with TLB
[Diagram: User A's and User B's virtual memories (Code,
Static, Heap, Stack, each starting at address 0) both mapped,
page by page, into one 64 MB physical memory.]
35
Virtual Memory Overview
  • TLB usually small, typically 128 - 256 entries
  • Block size: 1-2 page table entries (4-8 B each)
  • Hit time: 0.5-1 cycle
  • Miss penalty: 10-30 cycles
  • Miss rate: 0.01-1%

36
Address Translation: 3 Concept Tests
[Diagram: TLB entries, each mapping a Virtual Page Number
(V.P.N.) to a Physical Page Number (P.P.N.).]
37
Cache/VM/TLB Summary 1/3
  • The Principle of Locality
  • Programs access a relatively small portion of the
    address space at any instant of time
  • Temporal Locality: Locality in Time
  • Spatial Locality: Locality in Space
  • Caches, TLBs, and Virtual Memory are all understood by
    examining how they deal with 4 questions: 1)
    Where can a block be placed? 2) How is a block
    found? 3) Which block is replaced on a miss? 4)
    How are writes handled?

38
Cache/VM/TLB Summary 2/3
  • Virtual Memory allows protected sharing of memory
    between processes, with less swapping to disk and
    less fragmentation than always-swap or base/bound
    schemes
  • 3 Problems:
  • 1) Not enough memory: Spatial Locality means a
    small Working Set of pages is OK
  • 2) TLB reduces the performance cost of VM
  • 3) Need a more compact representation to reduce the
    memory size cost of a simple 1-level page table,
    especially for 64-bit addresses

39
Cache/VM/TLB Summary 3/3
  • Virtual memory was controversial at the time: can
    SW automatically manage 64 KB across many
    programs?
  • 1000X DRAM growth removed the controversy
  • Today VM allows many processes to share a single
    memory without having to swap all processes to
    disk; VM protection today is more important than its
    role as another level of the memory hierarchy
  • Today CPU time is a function of (ops, cache
    misses) vs. just f(ops). What does this mean for
    Compilers, Data structures, Algorithms?

40
Reading quiz
  • 1. The page table is a memory data structure that
    maps virtual pages to physical pages. It would
    seem, then, that every virtual memory access
    would result in two physical memory accesses: one
    to translate into a physical address via the page
    table and a second to get to the actual data. How
    do operating systems and processors avoid such a
    high overhead for virtual memory?
  • 2. In addition to having a valid bit and a dirty
    bit, some page tables have a reference bit. If
    the bit is a one, it means that the page has been
    accessed since the last time the operating system
    set the bit to zero. What is the purpose of such
    a bit? Can you think of a way to get a similar
    effect without a reference bit?

41
Reading quiz
  • 1. The standard four questions for memory
    hierarchies emphasize the similarities between
    caches and virtual memory. Some combinations of
    the options that make sense for caches would be
    silly in a virtual memory. Which combinations
    would you never expect to see in a real system?
    Why?
  • 2. Is a TLB a cache? If so, what is it a cache
    of? Is it OK to use either write through or write
    back? Why?

42
Bonus: the rest of the slides could appear after slide 43
Impact of Caches?
  • 1960-1985: Speed = f(no. of operations)
  • 1990s:
  • Pipelined Execution & Fast Clock Rate
  • Out-of-Order execution
  • Superscalar
  • 2001: Speed = f(non-cached memory accesses)?

43
Quicksort vs. Radix sort as the number of keys varies:
Instructions
[Plot: instructions/key vs. set size in keys, for Radix sort
and Quicksort.]
44
Quicksort vs. Radix sort as the number of keys varies:
Instructions and Time
[Plot: instructions/key and time/key vs. set size in keys,
for Radix sort and Quicksort.]
45
Quicksort vs. Radix sort as the number of keys varies: Cache
misses
What is the proper approach to fast algorithms?
[Plot: cache misses/key vs. set size in keys, for Radix sort
and Quicksort.]
46
Bonus slide Kernel/User Mode
  • Generally restrict device access and the page table to
    the OS
  • HOW?
  • Add a mode bit to the machine: K/U
  • Only allow SW in kernel mode to access device
    registers and the page table
  • If user programs could access I/O devices and
    page tables directly?
  • they could destroy each other's data, ...
  • they might break the devices, ...