1
Virtual Memory
  • Topics
  • Motivations for VM
  • Address translation
  • Accelerating translation with TLBs

2
Motivations for Virtual Memory
  • Use Physical DRAM as a Cache for the Disk
  • Address space of a process can exceed physical
    memory size
  • Sum of address spaces of multiple processes can
    exceed physical memory
  • Simplify Memory Management
  • Multiple processes resident in main memory.
  • Each process with its own address space
  • Only active code and data are actually in memory
  • Allocate more memory to a process as needed.
  • Provide Protection
  • One process can't interfere with another
  • because they operate in different address spaces.
  • User process cannot access privileged information
  • different sections of address spaces have
    different permissions.

3
DRAM as a Cache for Disk
  • Full address space is quite large
  • 32-bit addresses
    ~4,000,000,000 (4 billion) bytes
  • 64-bit addresses ~16,000,000,000,000,000,000 (16
    quintillion) bytes
  • Disk storage is ~300X cheaper than DRAM storage
  • 80 GB of DRAM ~$33,000
  • 80 GB of disk ~$110
  • To access large amounts of data in a
    cost-effective manner, the bulk of the data must
    be stored on disk

4
Levels in Memory Hierarchy
(Diagram: register file, cache, memory, and disk, with 8 B, 32 B, and 4 KB
transfer units between adjacent levels; caching is used between cache and
memory, and virtual memory uses memory as a cache for disk.)

            size          speed    $/Mbyte      line size
Register    32 B          1 ns                  8 B
Cache       32 KB-4 MB    2 ns     $125/MB      32 B
Memory      1024 MB       30 ns    $0.20/MB     4 KB
Disk        100 GB        8 ms     $0.001/MB
            larger, slower, cheaper
5
DRAM vs. SRAM as a Cache
  • DRAM vs. disk is more extreme than SRAM vs. DRAM
  • Access latencies
  • DRAM ~10X slower than SRAM
  • Disk ~100,000X slower than DRAM
  • Importance of exploiting spatial locality
  • First byte is 100,000X slower than successive
    bytes on disk
  • vs. 4X improvement for page-mode vs. regular
    accesses to DRAM
  • Bottom line
  • Design decisions made for DRAM caches driven by
    enormous cost of misses

6
Impact of Properties on Design
  • If DRAM were to be organized like an SRAM cache,
    how would we set the following design parameters?
  • Line size?
  • Large, since disk is better at transferring large
    blocks
  • Associativity?
  • High, to minimize miss rate
  • Write through or write back?
  • Write back, since we can't afford to perform small
    writes to disk
  • What would the impact of these choices be on
  • miss rate
  • Extremely low. << 1%
  • hit time
  • Must match cache/DRAM performance
  • miss latency
  • Very high. ~20ms
  • tag storage overhead
  • Low, relative to block size

7
Locating an Object in a Cache
  • SRAM Cache
  • Tag stored with cache line
  • Maps from cache block to memory blocks
  • From cached to uncached form
  • Save a few bits by only storing tag
  • No tag for block not in cache
  • Hardware retrieves information
  • can quickly match against multiple tags

8
Locating an Object in Cache
  • DRAM Cache
  • Each allocated page of virtual memory has entry
    in page table
  • Mapping from virtual pages to physical pages
  • From uncached form to cached form
  • Page table entry even if page not in memory
  • Specifies disk address
  • Only way to indicate where to find page
  • OS retrieves information

(Diagram: page-table entries marked valid point to cached pages in memory;
entries marked invalid record the page's location on disk.)
9
A System with Physical Memory Only
  • Examples
  • most Cray machines, early PCs, nearly all
    embedded systems, etc.
  • Addresses generated by the CPU correspond
    directly to bytes in physical memory

10
A System with Virtual Memory
  • Examples
  • workstations, servers, modern PCs, etc.

(Diagram: the CPU issues virtual addresses; a page table maps them to physical
addresses in memory, or to locations on disk.)
  • Address Translation Hardware converts virtual
    addresses to physical addresses via OS-managed
    lookup table (page table)

11
Page Faults (like Cache Misses)
  • What if an object is on disk rather than in
    memory?
  • Page table entry indicates virtual address not in
    memory
  • OS exception handler invoked to move data from
    disk into memory
  • current process suspends, others can resume
  • OS has full control over placement, etc.

(Diagram: before the fault, the page table maps the faulting virtual address
to a location on disk; after the fault, the OS has loaded the page and the
entry maps it to a physical address in memory.)
12
Servicing a Page Fault
(1) Initiate Block Read
  • Processor Signals Controller
  • Read block of length P starting at disk address X
    and store starting at memory address Y
  • Read Occurs
  • Direct Memory Access (DMA)
  • Under control of I/O controller
  • I/O Controller Signals Completion
  • Interrupt processor
  • OS resumes suspended process

(Diagram: (1) the processor initiates the block read, (2) the DMA transfer
moves the block from disk to memory over the memory-I/O bus under control of
the I/O controller, (3) the I/O controller signals "read done" by interrupting
the processor.)
13
Memory Management
  • Multiple processes can reside in physical memory.
  • How do we resolve address conflicts?
  • what if two processes access something at the
    same address?

(Linux/x86 process memory image, from high to low addresses: kernel virtual
memory (invisible to user code); user stack, growing down from %esp;
memory-mapped region for shared libraries; runtime heap, growing up via malloc
to the brk ptr; uninitialized data (.bss); initialized data (.data); program
text (.text); and a forbidden region at address 0.)
14
Separate Virt. Addr. Spaces
  • Virtual and physical address spaces divided into
    equal-sized blocks
  • blocks are called pages (both virtual and
    physical)
  • Each process has its own virtual address space
  • operating system controls how virtual pages are
    assigned to physical memory

(Diagram: Process 1's virtual pages 0..N-1 and Process 2's virtual pages
0..M-1 are translated independently to physical pages in DRAM; both processes
map a virtual page onto the same physical page, PP 7, which holds read-only
library code shared between them.)
15
Contrast Macintosh Memory Model
  • MAC OS 1-9
  • Does not use traditional virtual memory
  • All program objects accessed through handles
  • Indirect reference through pointer table
  • Objects stored in shared global address space

(Diagram: processes P1 and P2 hold handles into their own pointer tables; the
pointer-table entries in turn point to objects A-E stored in the shared
address space.)
16
Macintosh Memory Management
  • Allocation / Deallocation
  • Similar to free-list management of malloc/free
  • Compaction
  • Can move any object and just update the (unique)
    pointer in pointer table

(Diagram: after compaction, objects have been moved within the shared address
space; only the pointer-table entries were updated, so the handles held by P1
and P2 are unchanged.)
17
Mac vs. VM-Based Memory
  • Allocating, deallocating, and moving memory
  • can be accomplished by both techniques
  • Block sizes
  • Mac: variable-sized
  • may be very small or very large
  • VM: fixed-size
  • size is equal to one page (4 KB on x86 Linux
    systems)
  • Allocating contiguous chunks of memory
  • Mac: contiguous allocation is required
  • VM: can map contiguous range of virtual addresses
    to disjoint ranges of physical addresses
  • Protection
  • Mac: a wild write by one process can corrupt
    another's data

18
MAC OS X
  • Modern Operating System
  • Virtual memory with protection
  • Preemptive multitasking
  • Other versions of MAC OS require processes to
    voluntarily relinquish control
  • Based on MACH OS
  • Developed at CMU in late 1980s

19
Motivation 3: Protection
  • Page table entry contains access rights
    information
  • hardware enforces this protection (trap into OS
    if violation occurs)

(Diagram: the per-process page tables for processes i and j carry read and
write permission bits in each entry alongside the physical page number.)
20
VM Address Translation
  • Virtual Address Space
  • V = {0, 1, ..., N-1}
  • Physical Address Space
  • P = {0, 1, ..., M-1}
  • M < N
  • Address Translation
  • MAP: V -> P U {∅}
  • For virtual address a:
  • MAP(a) = a'  if data at virtual address a is at
    physical address a' in P
  • MAP(a) = ∅  if data at virtual address a is not in
    physical memory
  • Either invalid or stored on disk

21
VM Address Translation Hit
(Diagram: on a hit, the processor issues virtual address a; the hardware
address-translation mechanism, part of the on-chip memory management unit
(MMU), produces physical address a', which is sent to main memory.)
22
VM Address Translation Miss
(Diagram: on a miss, the translation hardware raises a page fault and the OS
fault handler transfers the missing page from secondary memory into main
memory; only then does the access complete with physical address a'. The OS
performs this transfer only on a miss.)
23
VM Address Translation
  • Parameters
  • P = 2^p : page size (bytes)
  • N = 2^n : virtual address limit
  • M = 2^m : physical address limit

(Diagram: the virtual address splits into a virtual page number (bits n-1..p)
and a page offset (bits p-1..0); address translation replaces the virtual page
number with a physical page number (bits m-1..p) and copies the page offset
through unchanged.)
Page offset bits don't change as a result of
translation
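
To make the decomposition concrete, here is a minimal C sketch; the page-size
exponent and the example addresses are placeholder values, and the PPN is
hard-coded where a real system would consult the page table:

#include <stdint.h>
#include <stdio.h>

#define P_BITS 12                      /* p: page size = 2^p bytes (4 KB assumed) */

int main(void) {
    uint32_t va  = 0x08048a3c;         /* example virtual address (made up)       */
    uint32_t vpn = va >> P_BITS;                 /* virtual page number: bits n-1..p */
    uint32_t off = va & ((1u << P_BITS) - 1);    /* page offset: bits p-1..0         */

    uint32_t ppn = 0x1234;             /* in reality supplied by the page table    */
    uint32_t pa  = (ppn << P_BITS) | off;        /* offset passes through unchanged  */

    printf("VPN = 0x%05x, offset = 0x%03x, PA = 0x%08x\n", vpn, off, pa);
    return 0;
}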
24
Page Tables
Memory resident page table (physical page or
disk address)
(Diagram: the page table is indexed by virtual page number; each entry has a
valid bit plus either a physical page number or a disk address. Valid entries
point into physical memory; invalid entries either point to the page's
location in disk storage (a swap file or a regular file-system file) or mark
the page as unallocated.)
25
Address Translation via Page Table
26
Page Table Operation
  • Translation
  • Separate (set of) page table(s) per process
  • VPN forms index into page table (points to a page
    table entry)

27
Page Table Operation
  • Computing Physical Address
  • Page Table Entry (PTE) provides information about
    page
  • if (valid bit = 1) then the page is in memory.
  • Use physical page number (PPN) to construct the
    physical address
  • if (valid bit = 0) then the page is on disk
  • Page fault
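
A minimal sketch of this lookup in C, assuming a single-level page table; the
PTE layout (valid bit, PPN, disk address) and the page_fault_handler() helper
are illustrative, not the real hardware/OS interface:

#include <stdbool.h>
#include <stdint.h>

#define P_BITS 12                       /* assumed page size: 4 KB */

/* Illustrative page-table entry; the layout is an assumption, not a real format */
typedef struct {
    bool     valid;                     /* 1: page in memory, 0: on disk (or unmapped) */
    uint32_t ppn;                       /* physical page number, meaningful if valid   */
    uint64_t disk_addr;                 /* where to find the page if not valid         */
} pte_t;

extern pte_t   page_table[];                         /* one table per process (hypothetical) */
extern uint32_t page_fault_handler(uint32_t vpn);    /* loads page, returns its PPN          */

uint32_t translate(uint32_t va) {
    uint32_t vpn    = va >> P_BITS;
    uint32_t offset = va & ((1u << P_BITS) - 1);
    pte_t   *pte    = &page_table[vpn];              /* VPN indexes the page table           */

    uint32_t ppn = pte->valid ? pte->ppn
                              : page_fault_handler(vpn);  /* miss: OS brings the page in     */
    return (ppn << P_BITS) | offset;                 /* offset passes through unchanged      */
}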

28
Page Table Operation
  • Checking Protection
  • Access rights field indicates allowable access
  • e.g., read-only, read-write, execute-only
  • typically support multiple protection modes
    (e.g., kernel vs. user)
  • Protection violation fault if user doesn't have
    necessary permission

29
Integrating VM and Cache
  • Most Caches Physically Addressed
  • Accessed by physical addresses
  • Allows multiple processes to have blocks in cache
    at same time
  • Allows multiple processes to share pages
  • Cache doesn't need to be concerned with
    protection issues
  • Access rights checked as part of address
    translation
  • Perform Address Translation Before Cache Lookup
  • But this could involve a memory access itself (of
    the PTE)
  • Of course, page table entries can also become
    cached

30
Speeding up Translation with a TLB
  • Translation Lookaside Buffer (TLB)
  • Small hardware cache in MMU
  • Maps virtual page numbers to physical page
    numbers
  • Contains complete page table entries for small
    number of pages

31
Address Translation with a TLB
(Diagram: the virtual page number (bits n-1..p) is split into a TLB tag and a
TLB index; on a TLB hit, the matching valid entry supplies the physical page
number, which is combined with the unchanged page offset to form the physical
address. The physical address is then split into a cache tag, cache index, and
byte offset for the cache lookup, which returns the data on a cache hit.)
32
Simple Memory System Example
  • Addressing
  • 14-bit virtual addresses
  • 12-bit physical addresses
  • Page size 64 bytes

(Diagram: the 14-bit virtual address splits into an 8-bit virtual page number
and a 6-bit virtual page offset; the 12-bit physical address splits into a
6-bit physical page number and a 6-bit physical page offset.)
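
As a side note, here is a small C sketch of how this example system's fields
are carved out of an address; the TLB and cache geometries come from the
slides that follow, and the physical address below is a made-up value used
only to show the field extraction:

#include <stdio.h>

int main(void) {
    unsigned va = 0x03D4;              /* one of the example virtual addresses     */

    /* 64-byte pages -> 6 offset bits; 14-bit VA -> 8-bit VPN */
    unsigned vpo = va & 0x3F;
    unsigned vpn = va >> 6;

    /* 16-entry, 4-way TLB -> 4 sets -> TLB index = low 2 bits of the VPN */
    unsigned tlbi = vpn & 0x3;
    unsigned tlbt = vpn >> 2;

    /* The PPN would come from the TLB or page table; this physical address is
     * a stand-in. 16 lines x 4 bytes/line, direct mapped -> 2 offset bits and
     * 4 index bits.                                                            */
    unsigned pa = 0x354;
    unsigned co = pa & 0x3;
    unsigned ci = (pa >> 2) & 0xF;
    unsigned ct = pa >> 6;

    printf("VPN=0x%02X VPO=0x%02X TLBI=%u TLBT=0x%02X\n", vpn, vpo, tlbi, tlbt);
    printf("CO=%u CI=0x%X CT=0x%02X\n", co, ci, ct);
    return 0;
}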
33
Simple Memory System Page Table
  • Only the first 16 entries are shown

VPN  PPN  Valid     VPN  PPN  Valid
00   28   1         08   13   1
01   --   0         09   17   1
02   33   1         0A   09   1
03   02   1         0B   --   0
04   --   0         0C   --   0
05   16   1         0D   2D   1
06   --   0         0E   11   1
07   --   0         0F   0D   1
34
Simple Memory System TLB
  • TLB
  • 16 entries
  • 4-way associative

Set  Tag  PPN  Valid   Tag  PPN  Valid   Tag  PPN  Valid   Tag  PPN  Valid
0    03   --   0       09   0D   1       00   --   0       07   02   1
1    03   2D   1       02   --   0       04   --   0       0A   --   0
2    02   --   0       08   --   0       06   --   0       03   --   0
3    07   --   0       03   0D   1       0A   34   1       02   --   0
35
Simple Memory System Cache
  • Cache
  • 16 lines
  • 4-byte line size
  • Direct mapped

Idx  Tag  Valid  B0  B1  B2  B3     Idx  Tag  Valid  B0  B1  B2  B3
0    19   1      99  11  23  11     8    24   1      3A  00  51  89
1    15   0      --  --  --  --     9    2D   0      --  --  --  --
2    1B   1      00  02  04  08     A    2D   1      93  15  DA  3B
3    36   0      --  --  --  --     B    0B   0      --  --  --  --
4    32   1      43  6D  8F  09     C    12   0      --  --  --  --
5    0D   1      36  72  F0  1D     D    16   1      04  96  34  15
6    31   0      --  --  --  --     E    13   1      83  77  1B  D3
7    16   1      11  C2  DF  03     F    14   0      --  --  --  --
36
Address Translation Example 1
  • Virtual Address 0x03D4
  • VPN ___ TLBI ___ TLBT ____ TLB Hit? __ Page
    Fault? __ PPN ____
  • Physical Address
  • Offset ___ CI___ CT ____ Hit? __ Byte ____

37
Address Translation Example 2
  • Virtual Address 0x0B8F
  • VPN ___ TLBI ___ TLBT ____ TLB Hit? __ Page
    Fault? __ PPN ____
  • Physical Address
  • Offset ___ CI___ CT ____ Hit? __ Byte ____

38
Address Translation Example 3
  • Virtual Address 0x0040
  • VPN ___ TLBI ___ TLBT ____ TLB Hit? __ Page
    Fault? __ PPN ____
  • Physical Address
  • Offset ___ CI___ CT ____ Hit? __ Byte ____

39
Multi-Level Page Tables
  • Given
  • 4 KB (2^12) page size
  • 32-bit address space
  • 4-byte PTE
  • Problem
  • Would need a 4 MB page table!
  • 2^20 PTEs x 4 bytes = 4 MB
  • Common solution
  • multi-level page tables
  • e.g., 2-level table (P6)
  • Level 1 table 1024 entries, each of which points
    to a Level 2 page table.
  • Level 2 table 1024 entries, each of which
    points to a page

(Diagram: a Level 1 table whose entries point to Level 2 page tables, whose
entries in turn point to pages.)
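
Below is a minimal C sketch of the two-level walk; the entry layout and the
directory-of-pointers representation are simplifications for illustration, not
the actual P6 PDE/PTE formats:

#include <stdint.h>

/* Illustrative 2-level lookup for a 32-bit address with 4 KB pages:
 * bits 31..22 index the level-1 (directory) table, bits 21..12 index the
 * level-2 table, and bits 11..0 are the page offset.                     */
typedef struct { uint32_t present : 1, ppn : 20; } pte_t;   /* simplified entry */

extern pte_t *level1_table[1024];        /* hypothetical: NULL if no L2 table    */

int translate2(uint32_t va, uint32_t *pa_out) {
    uint32_t l1  = (va >> 22) & 0x3FF;
    uint32_t l2  = (va >> 12) & 0x3FF;
    uint32_t off = va & 0xFFF;

    pte_t *l2_table = level1_table[l1];
    if (l2_table == 0) return -1;                /* whole 4 MB region unmapped    */
    if (!l2_table[l2].present) return -1;        /* page not in memory: fault     */

    *pa_out = (l2_table[l2].ppn << 12) | off;
    return 0;
}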
40
Main Themes
  • Programmer's View
  • Large flat address space
  • Can allocate large blocks of contiguous addresses
  • Processor owns machine
  • Has private address space
  • Unaffected by behavior of other processes
  • System View
  • User virtual address space created by mapping to
    set of pages
  • Need not be contiguous
  • Allocated dynamically
  • Enforce protection during address translation
  • OS manages many processes simultaneously
  • Continually switching among processes
  • Especially when one must wait for resource
  • E.g., disk I/O to handle page fault

41
Intel P6
  • Internal Designation for Successor to Pentium
  • Which had internal designation P5
  • Fundamentally Different from Pentium
  • Out-of-order, superscalar operation
  • Designed to handle server applications
  • Requires high performance memory system
  • Resulting Processors
  • PentiumPro (1996)
  • Pentium II (1997)
  • Incorporated MMX instructions
  • special instructions for parallel processing
  • L2 cache on same chip
  • Pentium III (1999)
  • Incorporated Streaming SIMD Extensions
  • More instructions for parallel processing

42
P6 Memory System
  • 32-bit address space
  • 4 KB page size
  • L1, L2, and TLBs
  • 4-way set associative
  • inst TLB
  • 32 entries
  • 8 sets
  • data TLB
  • 64 entries
  • 16 sets
  • L1 i-cache and d-cache
  • 16 KB
  • 32 B line size
  • 128 sets
  • L2 cache
  • unified
  • 128 KB -- 2 MB
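
(Quick consistency check on these parameters: inst TLB 32 entries / 4 ways = 8
sets; data TLB 64 entries / 4 ways = 16 sets; each L1 cache 16 KB / (32 B per
line x 4 ways) = 128 sets.)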

(Block diagram: the processor package contains the instruction fetch unit, the
inst and data TLBs, and the L1 i-cache and d-cache; a bus interface unit
connects over the cache bus to the L2 cache and over the external system bus
(e.g., PCI) to DRAM.)
43
Linux Organizes VM as Collection of Areas
(Diagram: the task_struct's mm field points to an mm_struct, whose pgd field
holds the page directory address and whose mmap field heads a linked list of
vm_area_struct nodes, one per area of the process virtual memory (shared
libraries at 0x40000000, data at 0x0804a020, text at 0x08048000); each node
records vm_start, vm_end, vm_prot, vm_flags, and vm_next.)
  • pgd
  • page directory address
  • vm_prot
  • read/write permissions for this area
  • vm_flags
  • shared with other processes or private to this
    process

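
For reference, a simplified C sketch of these structures using only the fields
named on this slide; the real kernel definitions contain many more fields and
differ across versions:

/* Simplified sketch of the Linux VM bookkeeping named on the slide;
 * field sets and types are trimmed for illustration.                  */
struct vm_area_struct {
    unsigned long          vm_start;   /* first address of the area              */
    unsigned long          vm_end;     /* first address past the end of the area */
    unsigned long          vm_prot;    /* read/write permissions for this area   */
    unsigned long          vm_flags;   /* shared with other processes or private */
    struct vm_area_struct *vm_next;    /* next area in the per-process list      */
};

struct mm_struct {
    struct vm_area_struct *mmap;       /* head of the list of areas              */
    void                  *pgd;        /* page directory address                 */
};

struct task_struct {
    struct mm_struct *mm;              /* this process's memory descriptor       */
};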
44
Linux Page Fault Handling
process virtual memory
  • Is the VA legal?
  • i.e. is it in an area defined by a
    vm_area_struct?
  • if not then signal segmentation violation (e.g.
    (1))
  • Is the operation legal?
  • i.e., can the process read/write this area?
  • if not then signal protection violation (e.g.,
    (2))
  • If OK, handle fault
  • e.g., (3)

(Diagram: example faulting accesses against the process's vm_area_structs:
(1) a read of an address not covered by any area, (2) a write to the read-only
text area, (3) a read in the data area that is handled as a normal page
fault.)
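
The same decision logic, sketched in C; the helper functions and the signal
handling are hypothetical stand-ins, not the actual kernel interfaces:

/* Sketch of the decision logic above; find_area, is_allowed, send_signal,
 * and swap_in_page are made-up helpers.                                   */
struct vm_area_struct;                 /* as sketched on the previous slide */

extern struct vm_area_struct *find_area(unsigned long addr);
extern int  is_allowed(struct vm_area_struct *vma, int write_access);
extern void send_signal(int sig);
extern void swap_in_page(unsigned long addr);

void handle_page_fault(unsigned long addr, int write_access) {
    struct vm_area_struct *vma = find_area(addr);
    if (vma == 0) {                            /* (1) address not in any area         */
        send_signal(11 /* SIGSEGV */);
        return;
    }
    if (!is_allowed(vma, write_access)) {      /* (2) operation not permitted         */
        send_signal(11 /* SIGSEGV, protection violation */);
        return;
    }
    swap_in_page(addr);                        /* (3) legitimate fault: bring page in */
}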
45
Memory Mapping
  • Creation of new VM area done via memory mapping
  • create new vm_area_struct and page tables for
    area
  • area can be backed by (i.e., get its initial
    values from)
  • regular file on disk (e.g., an executable object
    file)
  • initial page bytes come from a section of a file
  • nothing (e.g., bss)
  • initial page bytes are zeros
  • dirty pages are swapped back and forth between
    memory and a special swap file.
  • Key point: no virtual pages are copied into
    physical memory until they are referenced!
  • known as demand paging
  • crucial for time and space efficiency
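
A small example of the demand-zero ("backed by nothing") case, assuming a
POSIX-style system where mmap supports MAP_ANONYMOUS; the length is arbitrary:

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 1 << 20;                     /* 1 MB, arbitrary */

    /* Demand-zero area: backed by nothing, initial bytes read as zero.
     * No physical page is allocated until a page is first touched.     */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); exit(1); }

    printf("first byte before any write: %d\n", p[0]);   /* prints 0 */
    p[0] = 42;                                /* touching the page faults it in */
    printf("after write: %d\n", p[0]);

    munmap(p, len);
    return 0;
}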

46
User-Level Memory Mapping
  • void *mmap(void *start, int len,
  •            int prot, int flags, int fd, int offset)
  • map len bytes starting at offset offset of the
    file specified by file descriptor fd, preferably
    at address start (usually 0 for "don't care").
  • prot: PROT_READ, PROT_WRITE
  • flags: MAP_PRIVATE, MAP_SHARED
  • returns a pointer to the mapped area.
  • Example fast file copy
  • useful for applications like Web servers that
    need to quickly copy files.
  • mmap allows file transfers without copying into
    user space.

47
mmap() Example Fast File Copy
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/*
 * mmap.c - a program that uses mmap
 * to copy itself to stdout
 */
int main() {
    struct stat stat;
    int fd, size;
    char *bufp;

    /* open the file and get its size */
    fd = open("./mmap.c", O_RDONLY);
    fstat(fd, &stat);
    size = stat.st_size;

    /* map the file to a new VM area */
    bufp = mmap(0, size, PROT_READ, MAP_PRIVATE, fd, 0);

    /* write the VM area to stdout */
    write(1, bufp, size);
    return 0;
}
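
To try it, one way (assuming gcc is available and the source is saved as
mmap.c in the current directory, so the program can open itself) is
"gcc -o mmapcopy mmap.c && ./mmapcopy", which writes the program's own source
to stdout.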
48
Exec() Revisited
  • To run a new program p in the current process
    using exec()
  • free vm_area_structs and page tables for old
    areas.
  • create new vm_area_structs and page tables for
    new areas.
  • stack, bss, data, text, shared libs.
  • text and data backed by ELF executable object
    file.
  • bss and stack initialized to zero.
  • set PC to entry point in .text
  • Linux will swap in code and data pages as needed.

(Diagram: the resulting process image. At the top, the same for each process,
are the process-specific data structures (page tables, task and mm structs)
and the kernel code/data/stack in kernel VM. Below that, in process VM: the
user stack (demand-zero, growing down from %esp), the memory-mapped region for
shared libraries (the .data and .text of libc.so), the runtime heap (via
malloc, up to brk), uninitialized data (.bss, demand-zero), initialized data
(.data) and program text (.text) backed by the executable object file for p,
and a forbidden region at address 0.)
49
Fork() Revisited
  • To create a new process using fork()
  • make copies of the old process's mm_struct,
    vm_area_structs, and page tables.
  • at this point the two processes are sharing all
    of their pages.
  • How to get separate spaces without copying all
    the virtual pages from one space to another?
  • copy on write technique.
  • copy-on-write
  • make pages of writeable areas read-only
  • flag vm_area_structs for these areas as private
    copy-on-write.
  • writes by either process to these pages will
    cause page faults.
  • fault handler recognizes copy-on-write, makes a
    copy of the page, and restores write permissions.
  • Net result
  • copies are deferred until absolutely necessary
    (i.e., when one of the processes tries to modify
    a shared page).
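
The deferral is invisible to the program itself; a small demonstration (any
copy-on-write faults happen under the hood when the child writes):

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int buf[1] = { 1 };     /* lives in a data page shared (copy-on-write) after fork */

int main(void) {
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); exit(1); }

    if (pid == 0) {
        buf[0] = 2;                           /* child's write faults; it gets its own copy */
        printf("child  sees %d\n", buf[0]);   /* 2 */
        exit(0);
    }
    wait(NULL);                               /* parent reads its own, untouched copy */
    printf("parent sees %d\n", buf[0]);       /* still 1 */
    return 0;
}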

50
Memory System Summary
  • Virtual Memory
  • Supports many OS-related functions
  • Process creation
  • Initial program loading
  • Forking children
  • Task switching
  • Protection
  • Combination of hardware and software implementation
  • Software management of tables, allocations
  • Hardware access of tables
  • Hardware caching of table entries (TLB)