1
CS 162 Ch 7: Virtual Memory, LECTURE 13
  • Instructor: L.N. Bhuyan
  • www.cs.ucr.edu/bhuyan

2
Improving Cache Miss Latency: Reducing DRAM Latency
  • Same as improving DRAM latency
  • What is random access memory (RAM)? What are
    static RAM (SRAM) and dynamic RAM (DRAM)?
  • What is the DRAM cell organization? How are the
    cells arranged internally? Memory addressing?
    Refreshing of DRAMs? Differences between DRAM
    and SRAM?
  • Access time of DRAM = row access time + column
    access time + refresh time
  • What are page-mode and nibble-mode DRAMs?
  • Synchronous SRAM or DRAM: the ability to
    transfer a burst of data, given a starting
    address and a burst length, is suitable for
    transferring a block of data from main memory
    to cache.

3
Main Memory Organizations (Fig. 7.13)

[Figure 7.13: three main memory organizations.
(a) One-word-wide: CPU, cache, bus, and memory are
each one word wide. (b) Wide: a wide memory and bus
feed the cache through a multiplexor. (c)
Interleaved: four one-word-wide memory banks (bank
0 through bank 3) share a one-word bus. In all
cases, DRAM access time >> bus transfer time.]
4
Memory Access Time Example
  • Assume that it takes 1 cycle to send the
    address, 15 cycles for each DRAM access, and 1
    cycle to send a word of data.
  • Assuming a cache block of 4 words and a
    one-word-wide DRAM (Fig. 7.13a), miss penalty =
    1 + 4x15 + 4x1 = 65 cycles.
  • With a main memory and bus width of 2 words
    (Fig. 7.13b), miss penalty = 1 + 2x15 + 2x1 =
    33 cycles. For a 4-word-wide memory, the miss
    penalty is 17 cycles. Expensive due to the wide
    bus and control circuits.
  • With interleaved memory of 4 banks and the same
    one-word bus (Fig. 7.13c), miss penalty = 1 +
    1x15 + 4x1 = 20 cycles. The memory controller
    must supply consecutive addresses to the
    different banks. Interleaving is universally
    adopted in high-performance computers.
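
These penalties can be checked mechanically. Below is a minimal C sketch, assuming the cycle counts above (1 cycle for the address, 15 per DRAM access, 1 per word on the bus); the constant and variable names are ours, not from the slide:

    #include <stdio.h>

    /* Timing assumptions taken from the bullets above. */
    #define ADDR_CYCLES 1    /* send the address          */
    #define DRAM_CYCLES 15   /* one DRAM access           */
    #define XFER_CYCLES 1    /* send one word on the bus  */
    #define BLOCK_WORDS 4    /* cache block size in words */

    int main(void) {
        /* (a) one-word-wide: 4 serial accesses, 4 serial transfers */
        int one_word = ADDR_CYCLES + BLOCK_WORDS * DRAM_CYCLES
                     + BLOCK_WORDS * XFER_CYCLES;
        /* (b) two-word-wide memory and bus: words move in pairs */
        int two_wide = ADDR_CYCLES
                     + (BLOCK_WORDS / 2) * (DRAM_CYCLES + XFER_CYCLES);
        /* (c) 4 interleaved banks, one-word bus: DRAM accesses
           overlap, but transfers stay serialized on the bus */
        int banked   = ADDR_CYCLES + DRAM_CYCLES
                     + BLOCK_WORDS * XFER_CYCLES;
        printf("%d %d %d\n", one_word, two_wide, banked); /* 65 33 20 */
        return 0;
    }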

5
Virtual Memory
  • Idea 1: Many programs share DRAM memory, so
    that context switches can occur
  • Idea 2: Allow a program to be written without
    memory constraints; the program can exceed the
    size of main memory
  • Idea 3: Relocation. Parts of the program can be
    placed at different locations in memory instead
    of one big chunk.
  • Virtual Memory:
  • (1) DRAM memory holds many programs running at
    the same time (processes)
  • (2) DRAM memory is used as a kind of cache for
    disk

6
Virtual Memory has its own terminology
  • Each process has its own private virtual
    address space (e.g., 2^32 bytes); the CPU
    actually generates virtual addresses
  • Each computer has a physical address space
    (e.g., 128 megabytes of DRAM), also called real
    memory
  • Address translation: mapping virtual addresses
    to physical addresses
  • Allows multiple programs to use (different
    chunks of physical) memory at the same time
  • Also allows some chunks of virtual memory to be
    kept on disk, not in main memory (to exploit
    the memory hierarchy)

7
Mapping Virtual Memory to Physical Memory
  • Divide memory into equal-sized chunks (say,
    4 KB each)
  • Any chunk of Virtual Memory can be assigned to
    any chunk of Physical Memory (a page)

[Figure: the virtual address space of a single
process (stack, heap, static data, and code, from
address 0 up) maps page by page onto a 64 MB
physical memory, also starting at address 0.]
8
Handling Page Faults
  • A page fault is like a cache miss: we must find
    the page in a lower level of the hierarchy
  • If the valid bit is zero, the Physical Page
    Number points to a page on disk
  • When the OS starts a new process, it creates
    space on disk for all the pages of the process,
    sets all valid bits in the page table to zero,
    and makes all Physical Page Numbers point to
    disk
  • This is called Demand Paging: pages of the
    process are loaded from disk only as needed
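
A rough C sketch of the demand-paging path follows; the entry layout and the helper functions (allocate_frame, read_page_from_disk) are illustrative names, not a real kernel API:

    #include <stdint.h>

    /* Minimal page table entry for this sketch (illustrative). */
    typedef struct {
        uint32_t valid : 1;    /* 0 => the page lives on disk     */
        uint32_t ppn   : 20;   /* physical page number when valid */
        uint32_t disk_addr;    /* where the page sits on disk     */
    } PageTableEntry;

    /* Hypothetical helpers, declared only so the sketch compiles. */
    extern uint32_t allocate_frame(void);  /* may evict another page */
    extern void read_page_from_disk(uint32_t disk_addr, uint32_t frame);

    /* Demand paging: the page is brought in only when first touched. */
    void handle_page_fault(PageTableEntry *pte) {
        uint32_t frame = allocate_frame();
        read_page_from_disk(pte->disk_addr, frame);
        pte->ppn   = frame;   /* PPN now points into DRAM, not disk */
        pte->valid = 1;       /* the restarted access will succeed  */
    }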

9
Comparing the 2 levels of hierarchy
  Cache                            Virtual Memory
  -------------------------------  -------------------------------
  Block or Line                    Page
  Miss                             Page Fault
  Block size: 32-64 B              Page size: 4 KB-16 KB
  Placement: Direct Mapped,        Placement: Fully Associative
    N-way Set Associative
  Replacement: LRU or Random       Replacement: LRU approximation
  Write: Through or Back           Write: Back
  Managed by: Hardware             Managed by: Software/Hardware
                                     (Operating System)

10
How to Perform Address Translation?
  • VM divides memory into equal-sized pages
  • Address translation relocates entire pages;
    offsets within the pages do not change
  • If the page size is a power of two, the virtual
    address separates into two fields, like the
    cache index and offset fields:

virtual address = [ Virtual Page Number | Page Offset ]
11
Mapping Virtual to Physical Address
[Figure: with a 1 KB page size, virtual address
bits 31..10 form the Virtual Page Number and bits
9..0 the Page Offset. Translation replaces the
Virtual Page Number with a Physical Page Number
(bits 29..10 of the Physical Address); the Page
Offset passes through unchanged.]
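
Since a 1 KB page means a 10-bit offset, the split and the reassembly above are just shifts and masks. A minimal C sketch of both directions (the example addresses are made up):

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_BITS 10                      /* 1 KB page = 2^10 bytes */
    #define PAGE_MASK ((1u << PAGE_BITS) - 1)

    int main(void) {
        uint32_t va  = 0x00ABC123u;           /* made-up virtual address */
        uint32_t vpn = va >> PAGE_BITS;       /* bits 31..10             */
        uint32_t off = va & PAGE_MASK;        /* bits 9..0, unchanged    */

        uint32_t ppn = 0x5555u;               /* pretend translation     */
        uint32_t pa  = (ppn << PAGE_BITS) | off;

        printf("vpn=0x%x off=0x%x pa=0x%x\n",
               (unsigned)vpn, (unsigned)off, (unsigned)pa);
        return 0;
    }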
12
Address Translation
  • We want fully associative page placement
  • How do we locate the physical page?
  • Searching is impractical (too many pages)
  • A page table is a data structure that contains
    the mapping of virtual pages to physical pages
  • There are several different ways, all up to the
    operating system, to keep this data around
  • Each process running in the system has its own
    page table

13
Address Translation: Page Table

[Figure: the virtual page number of the Virtual
Address (VA) indexes the Page Table; each entry
holds a Valid bit (V), Access Rights (A.R.), and a
Physical Page Number (P.P.N.). The P.P.N. is
concatenated with the unchanged offset to form the
Physical Memory Address (PA); invalid entries point
to disk.]

  • The Page Table is located in physical memory
  • Access Rights: None, Read Only, Read/Write,
    Executable
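
Taken together, the lookup in the figure can be sketched in C: the virtual page number indexes the table, the valid bit chooses between translation and a page fault, and the offset passes through. Types and field names are illustrative; 4 KB pages are assumed:

    #include <stdint.h>

    #define PAGE_BITS 12                      /* 4 KB pages assumed */
    #define PAGE_MASK ((1u << PAGE_BITS) - 1)

    typedef struct {            /* one entry per virtual page */
        uint32_t valid  : 1;    /* page present in DRAM?      */
        uint32_t rights : 2;    /* none/RO/RW/executable      */
        uint32_t ppn    : 20;   /* physical page number       */
    } PTE;

    /* Returns 0 and fills *pa on success; -1 means page fault. */
    int translate(const PTE *page_table, uint32_t va, uint32_t *pa) {
        uint32_t vpn = va >> PAGE_BITS;       /* index into the table */
        const PTE *e = &page_table[vpn];
        if (!e->valid)
            return -1;       /* OS must bring the page in from disk */
        *pa = ((uint32_t)e->ppn << PAGE_BITS) | (va & PAGE_MASK);
        return 0;
    }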
14
Optimizing for Space
  • The Page Table is too big!
  • 4 GB virtual address space / 4 KB page => 2^20
    (about 1 million) Page Table Entries => 4 MB
    just for the Page Table of a single process!
    (checked in the sketch below)
  • Variety of solutions that trade Page Table size
    for slower performance when a miss occurs in
    the TLB
  • Use a limit register to restrict Page Table
    size and let it grow with more pages,
    multilevel page tables, paging the page tables,
    etc.
  • (Take an O/S class to learn more)
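
The 4 MB figure checks out with one line of arithmetic; a C sketch (4-byte page table entries assumed, as implied by the slide):

    #include <stdio.h>

    int main(void) {
        unsigned long long va_space  = 1ull << 32;  /* 4 GB virtual space */
        unsigned long long page_size = 1ull << 12;  /* 4 KB pages         */
        unsigned long long entries   = va_space / page_size;  /* 2^20     */
        unsigned long long bytes     = entries * 4;           /* 4 B/PTE  */
        printf("%llu entries, %llu MB\n", entries, bytes >> 20);
        return 0;   /* prints: 1048576 entries, 4 MB */
    }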

15
How to Translate Fast?
  • Problem: Virtual Memory requires two memory
    accesses!
  • one to translate the Virtual Address into a
    Physical Address (page table lookup)
  • one to transfer the actual data (cache hit)
  • But the Page Table is in physical memory!
  • Observation: since there is locality in the
    pages of data, there must be locality in the
    virtual addresses of those pages!
  • Why not create a cache of virtual-to-physical
    address translations to make translation fast?
    (smaller is faster)
  • For historical reasons, such a page table cache
    is called a Translation Lookaside Buffer, or
    TLB

16
Typical TLB Format
  Virtual Page Nbr | Physical Page Nbr | Valid | Ref | Dirty | Access Rights
  (the Virtual Page Nbr is the tag; the remaining fields are the data)

  • The TLB is just a cache of the page table
    mappings
  • Dirty: since we use write back, we need to know
    whether or not to write the page to disk when
    it is replaced
  • Ref: used to calculate LRU on replacement
  • TLB access time is comparable to cache access
    time (much less than main memory access time)
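
As a data structure, one TLB entry with exactly these fields might look like the sketch below (field widths are illustrative):

    #include <stdint.h>

    /* One TLB entry: the virtual page number is the tag, and the
       remaining fields are the cached "data" from the page table. */
    typedef struct {
        uint32_t vpn;           /* tag: virtual page number             */
        uint32_t ppn;           /* physical page number                 */
        unsigned valid  : 1;    /* entry holds a live translation       */
        unsigned ref    : 1;    /* set on use; feeds LRU approximation  */
        unsigned dirty  : 1;    /* page written; write back on eviction */
        unsigned rights : 2;    /* access rights copied from the PTE    */
    } TLBEntry;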

17
Translation Look-Aside Buffers
  • TLB is usually small, typically 32-4,096 entries
  • Like any other cache, the TLB can be fully
    associative, set associative, or direct mapped

[Figure: the Processor sends a virtual address to
the TLB; on a TLB hit the physical address goes to
the Cache, and a cache hit returns data (a cache
miss goes to Main Memory). On a TLB miss the Page
Table is consulted; a page fault or protection
violation traps to the OS Fault Handler, which
brings the page in from Disk Memory.]
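
The control flow of the figure can be written out as a C sketch; the helper functions are illustrative names, not a real hardware or OS interface:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical helpers, declared only so the sketch compiles. */
    extern bool tlb_lookup(uint32_t va, uint32_t *pa);
    extern bool page_table_lookup(uint32_t va, uint32_t *pa);
    extern void tlb_insert(uint32_t va, uint32_t pa);
    extern void os_fault_handler(uint32_t va);  /* pages in from disk */
    extern uint32_t cache_read(uint32_t pa);    /* miss goes to DRAM  */

    uint32_t load_word(uint32_t va) {
        uint32_t pa;
        if (!tlb_lookup(va, &pa)) {             /* TLB miss            */
            if (!page_table_lookup(va, &pa)) {  /* fault or violation  */
                os_fault_handler(va);           /* OS brings page in   */
                page_table_lookup(va, &pa);     /* retry now succeeds  */
            }
            tlb_insert(va, pa);                 /* cache the mapping   */
        }
        return cache_read(pa);                  /* hit, or miss to mem */
    }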
18
DECStation 3100/MIPS R2000
[Figure: address flow through the TLB and cache.
The 32-bit virtual address splits into a 20-bit
virtual page number (bits 31..12) and a 12-bit page
offset (bits 11..0). The TLB (64 entries, fully
associative) holds a valid bit, dirty bit, tag, and
physical page number per entry; on a TLB hit the
20-bit physical page number joins the page offset
to form the physical address. That physical address
then splits into a 16-bit tag, a 14-bit cache
index, and a 2-bit byte offset for the cache (16K
entries, direct mapped, 32-bit data per entry),
whose valid-and-tag comparison produces the cache
hit signal and the data.]
19
Real Stuff: Pentium Pro Memory Hierarchy
  • Address size: 32 bits (VA, PA)
  • VM page size: 4 KB, 4 MB
  • TLB organization: separate i, d TLBs (i-TLB: 32
    entries, d-TLB: 64 entries), 4-way set
    associative, LRU approximated, hardware handles
    miss
  • L1 cache: 8 KB, separate i, d, 4-way set
    associative, LRU approximated, 32-byte block,
    write back
  • L2 cache: 256 or 512 KB