Virtual Memory - PowerPoint PPT Presentation

1
Virtual Memory
2
Announcements
  • Prelim coming up in one week
  • In 203 Thurston, Thursday October 16th, 10:10-11:25pm, 1½ hour
  • Topics: Everything up to (and including) Thursday, October 9th
  • Lectures 1-13, chapters 1-9, and 13 (8th ed)
  • Review Session will be this Thursday, October 9th
  • Time and Location TBD. Possibly 6:30pm-7:30pm
  • Nazrul's office hours changed for today:
  • 12:30pm - 2:30pm in Upson 328
  • Homework 3 due today, October 7th
  • CS 4410 Homework 2 graded (solutions available via CMS)
  • Mean 45 (stddev 5), high 50, out of 50
  • Common problems:
  • Q1: did not satisfy bounded waiting
  • (mutual exclusion was not violated)

3
Homework 2, Question 1
  • Dekker's Algorithm (1965)

CSEnter(int i) {                /* first column on the slide */
    inside[i] = true;
    while (inside[J]) {
        if (turn == J) {
            inside[i] = false;
            while (turn == J) continue;
            inside[i] = true;
        }
    }
}

CSEnter(int i) {                /* second column: same loop without the turn test */
    inside[i] = true;
    while (inside[J]) {
        inside[i] = false;
        while (turn == J) continue;
        inside[i] = true;
    }
}

CSExit(int i) {
    turn = J;
    inside[i] = false;
}

4
Review: Multi-level Translation
  • Illusion of a contiguous address space
  • Physical reality:
  • address space broken into segments or fixed-size
    pages
  • Segments or pages spread throughout physical
    memory
  • Could have any number of levels. Example (top
    segment)
  • What must be saved/restored on context switch?
  • Contents of top-level segment registers (for this
    example)
  • Pointer to top-level table (page table)

4
5
Review: Two-Level Page Table
  • Tree of Page Tables
  • Tables fixed size (1024 entries)
  • On context-switch save single PageTablePtr
    register
  • Sometimes, top-level page tables are called 'directories' (Intel)
  • Each entry called a (surprise!) Page Table Entry
    (PTE)

5
6
What is in a PTE?
  • What is in a Page Table Entry (or PTE)?
  • Pointer to next-level page table or to actual page
  • Permission bits: valid, read-only, read-write, execute-only
  • Example: Intel x86 architecture PTE
  • Address: same format as previous slide (10, 10, 12-bit offset)
  • Intermediate page tables called "Directories"
  • P: Present (same as "valid" bit in other architectures)
  • W: Writeable
  • U: User accessible
  • PWT: Page write transparent (external cache write-through)
  • PCD: Page cache disabled (page cannot be cached)
  • A: Accessed (page has been accessed recently)
  • D: Dirty (PTE only; page has been modified recently)
  • L: L=1 → 4MB page (directory only); bottom 22 bits of virtual address serve as offset
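The flag layout above can be illustrated with a short decoder. This is a sketch: the bit positions (P=0, W=1, U=2, PWT=3, PCD=4, A=5, D=6, L=7, frame base in bits 31-12) follow the 32-bit x86 format described on this slide, and the sample PTE value is made up.

```python
# Sketch: decode the x86 PTE bits listed on this slide.  Assumed bit
# positions (32-bit x86 page-table format): P=0, W=1, U=2, PWT=3,
# PCD=4, A=5, D=6, L=7; bits 31-12 hold the physical frame base.
PTE_FLAGS = {"P": 0, "W": 1, "U": 2, "PWT": 3, "PCD": 4, "A": 5, "D": 6, "L": 7}

def decode_pte(pte):
    """Return (frame_base, {flag_name: bool}) for a 32-bit PTE."""
    flags = {name: bool((pte >> bit) & 1) for name, bit in PTE_FLAGS.items()}
    return pte & 0xFFFFF000, flags

# Hypothetical PTE: frame base 0xABC000 with P, W, U, A, D set (0x67).
frame, flags = decode_pte(0x00ABC067)
```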

7
Examples of how to use a PTE
  • How do we use the PTE?
  • Invalid PTE can imply different things:
  • Region of address space is actually invalid, or
  • Page/directory is just somewhere else than memory
  • Validity checked first
  • OS can use other (say) 31 bits for location info
  • Usage Example: Demand Paging
  • Keep only active pages in memory
  • Place others on disk and mark their PTEs invalid
  • Usage Example: Copy on Write
  • UNIX fork gives copy of parent address space to child
  • Address spaces disconnected after child created
  • How to do this cheaply?
  • Make copy of parent's page tables (point at same memory)
  • Mark entries in both sets of page tables as read-only
  • Page fault on write creates two copies
  • Usage Example: Zero Fill On Demand
  • New data pages must carry no information (say, be zeroed)
  • Mark PTEs as invalid; page fault on use gets zeroed page

8
How is the translation accomplished?
  • What, exactly, happens inside the MMU?
  • One possibility: Hardware Tree Traversal
  • For each virtual address, takes page table base pointer and traverses the page table in hardware
  • Generates a Page Fault if it encounters an invalid PTE
  • Fault handler will decide what to do
  • More on this next lecture
  • Pros: Relatively fast (but still many memory accesses!)
  • Cons: Inflexible, complex hardware
  • Another possibility: Software
  • Each traversal done in software
  • Pros: Very flexible
  • Cons: Every translation must invoke a fault!
  • In fact, need a way to cache translations for either case!

9
Caching Concept
  • Cache: a repository for copies that can be accessed more quickly than the original
  • Make frequent case fast and infrequent case less dominant
  • Caching underlies many of the techniques that are used today to make computers fast
  • Can cache: memory locations, address translations, pages, file blocks, file names, network routes, etc.
  • Only good if:
  • Frequent case frequent enough, and
  • Infrequent case not too expensive
  • Important measure: Average Access Time = (Hit Rate x Hit Time) + (Miss Rate x Miss Time)
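The average-access-time formula above is a one-liner; a sketch with made-up hit/miss numbers:

```python
# Sketch of the slide's formula:
# Average Access Time = (Hit Rate x Hit Time) + (Miss Rate x Miss Time)
def average_access_time(hit_rate, hit_time, miss_time):
    return hit_rate * hit_time + (1 - hit_rate) * miss_time

# Hypothetical cache: 98% hits at 1 cycle, misses cost 100 cycles.
amat = average_access_time(0.98, 1, 100)   # about 0.98 + 2.0 = 2.98 cycles
```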

10
Why Bother with Caching?
Processor-DRAM Memory Gap (latency)
[Chart: log-scale performance (1 to 1000) vs. time (1980-2000); the processor curve, labeled "Moore's Law (really Joy's Law)", pulls away from the DRAM curve, labeled "Less' Law?".]
11
Another Major Reason to Deal with Caching
  • Too expensive to translate on every access
  • At least two DRAM accesses per actual DRAM
    access
  • Or perhaps I/O if page table partially on disk!
  • Even worse problem: What if we are using caching to make memory access faster than DRAM access?
  • Solution? Cache translations!
  • Translation Cache: TLB (Translation Lookaside Buffer)

12
Why Does Caching Help? Locality!
  • Temporal Locality (Locality in Time)
  • Keep recently accessed data items closer to
    processor
  • Spatial Locality (Locality in Space)
  • Move contiguous blocks to the upper levels

13
Review: Memory Hierarchy of a Modern Computer System
  • Take advantage of the principle of locality to
  • Present as much memory as in the cheapest
    technology
  • Provide access at speed offered by the fastest
    technology

14
A Summary on Sources of Cache Misses
  • Compulsory (cold start): first reference to a block
  • Cold fact of life: not a whole lot you can do about it
  • Note: when running billions of instructions, compulsory misses are insignificant
  • Capacity:
  • Cache cannot contain all blocks accessed by the program
  • Solution: increase cache size
  • Conflict (collision):
  • Multiple memory locations mapped to same cache location
  • Solutions: increase cache size, or increase associativity
  • Two others:
  • Coherence (Invalidation): other process (e.g., I/O) updates memory
  • Policy: due to non-optimal replacement policy

15
Review: Where does a Block Get Placed in a Cache?
  • Example: Block 12 placed in 8-block cache

16
Other Caching Questions
  • What line gets replaced on cache miss?
  • Easy for Direct Mapped: only one possibility
  • Set Associative or Fully Associative:
  • Random
  • LRU (Least Recently Used)
  • What happens on a write?
  • Write through: The information is written to both the cache and to the block in the lower-level memory
  • Write back: The information is written only to the block in the cache
  • Modified cache block is written to main memory only when it is replaced
  • Question: is block clean or dirty?
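The write-back policy can be sketched with a toy cache that defers memory updates until eviction. This is an illustrative sketch (eviction order here is simply oldest-inserted), not any particular hardware design:

```python
# Sketch: a toy write-back cache.  Writes only mark the cached block
# dirty; the block is written to "memory" when it is evicted.
class WriteBackCache:
    def __init__(self, capacity, memory):
        self.capacity, self.memory = capacity, memory
        self.blocks = {}                       # addr -> (value, dirty)

    def write(self, addr, value):
        self._make_room(addr)
        self.blocks[addr] = (value, True)      # dirty: memory is stale

    def read(self, addr):
        if addr not in self.blocks:            # miss: fill from memory
            self._make_room(addr)
            self.blocks[addr] = (self.memory[addr], False)
        return self.blocks[addr][0]

    def _make_room(self, addr):
        if addr not in self.blocks and len(self.blocks) >= self.capacity:
            victim, (value, dirty) = next(iter(self.blocks.items()))
            if dirty:                          # write back only if modified
                self.memory[victim] = value
            del self.blocks[victim]

mem = {0: 10, 1: 11, 2: 12}
cache = WriteBackCache(2, mem)
cache.write(0, 99)       # mem[0] is still 10: the write is deferred
cache.read(1)
cache.read(2)            # evicts dirty block 0, so now mem[0] == 99
```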

17
Caching Applied to Address Translation
[Diagram: CPU issues a virtual address; if cached in the TLB it goes straight to Physical Memory, otherwise through Translate (MMU).]
  • Question is one of page locality: does it exist?
  • Instruction accesses spend a lot of time on the same page (since accesses are sequential)
  • Stack accesses have definite locality of reference
  • Data accesses have less page locality, but still some
  • Can we have a TLB hierarchy?
  • Sure: multiple levels at different sizes/speeds

18
What Actually Happens on a TLB Miss?
  • Hardware traversed page tables:
  • On TLB miss, hardware in MMU looks at current page table to fill TLB (may walk multiple levels)
  • If PTE valid, hardware fills TLB and processor never knows
  • If PTE marked as invalid, causes Page Fault, after which kernel decides what to do
  • Software traversed page tables (like MIPS):
  • On TLB miss, processor receives TLB fault
  • Kernel traverses page table to find PTE
  • If PTE valid, fills TLB and returns from fault
  • If PTE marked as invalid, internally calls Page Fault handler
  • Most chip sets provide hardware traversal
  • Modern operating systems tend to have more TLB faults since they use translation for many things
  • Examples:
  • shared segments
  • user-level portions of an operating system

19
Goals for Today
  • Virtual memory
  • How does it work?
  • Page faults
  • Resuming after page faults
  • When to fetch?
  • What to replace?
  • Page replacement algorithms
  • FIFO, OPT, LRU (Clock)
  • Page Buffering
  • Allocating Pages to processes

20
What is virtual memory?
  • Each process has illusion of large address space
  • 2^32 bytes for 32-bit addressing
  • However, physical memory is much smaller
  • How do we give this illusion to multiple processes?
  • Virtual Memory: some addresses reside on disk

21
Virtual Memory
  • Separates user's logical memory from physical memory
  • Only part of the program needs to be in memory
    for execution
  • Logical address space can therefore be much
    larger than physical address space
  • Allows address spaces to be shared by several
    processes
  • Allows for more efficient process creation

22
Virtual Memory
  • Load entire process in memory (swapping), run it,
    exit
  • Is slow (for big processes)
  • Wasteful (might not require everything)
  • Solution: partial residency
  • Paging: bring in individual pages, not all pages of the process
  • Demand paging: bring in only pages that are required
  • Where to fetch a page from?
  • Have a contiguous space on disk: swap file (pagefile.sys)

23
How does VM work?
  • Modify Page Tables with another bit ("valid")
  • If page in memory, valid = 1; else valid = 0
  • If page is in memory, translation works as before
  • If page is not in memory, translation causes a page fault

Page Table (valid bit per entry):
  entry 0: 32    V=1 (in Mem)
  entry 1: 4183  V=0
  entry 2: 177   V=1 (in Mem)
  entry 3: 5721  V=0
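The valid-bit check can be sketched directly from the page table above (the 4KB page size is an assumption for illustration):

```python
# Sketch: address translation with a valid bit, using the page-table
# contents shown above.  Entries with V=0 raise a page fault instead
# of translating.
PAGE_SIZE = 4096                     # assumed 4KB pages

class PageFault(Exception):
    pass

page_table = {0: (32, 1), 1: (4183, 0), 2: (177, 1), 3: (5721, 0)}

def translate(vaddr):
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    frame, valid = page_table[vpn]
    if not valid:
        raise PageFault(vpn)         # OS must bring the page into memory
    return frame * PAGE_SIZE + offset
```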
24
Page Faults
  • On a page fault:
  • OS finds a free frame, or evicts one from memory (which one?)
  • Want knowledge of the future?
  • Issues disk request to fetch data for page (what to fetch?)
  • Just the requested page, or more?
  • Block current process, context switch to new process (how?)
  • Process might be executing an instruction
  • When disk completes, set valid bit to 1, and put current process in ready queue

25
Steps in Handling a Page Fault
26
Resuming after a page fault
  • Should be able to restart the instruction
  • For RISC processors this is simple:
  • Instructions are idempotent until references are done
  • More complicated for CISC:
  • E.g., move 256 bytes from one location to another
  • Possible solutions:
  • Ensure pages are in memory before the instruction executes

27
Page Fault (Cont.)
  • Restart instruction
  • block move
  • auto increment/decrement location

28
When to fetch?
  • Just before the page is used!
  • Need to know the future
  • Demand paging:
  • Fetch a page when it faults
  • Prepaging:
  • Get the page on fault plus some of its neighbors, or
  • Get all pages in use last time the process was swapped out

29
Performance of Demand Paging
  • Page Fault Rate: 0 ≤ p ≤ 1.0
  • if p = 0, no page faults
  • if p = 1, every reference is a fault
  • Effective Access Time (EAT):
  • EAT = (1 - p) x memory access time
        + p x (page fault overhead
               + swap page out
               + swap page in
               + restart overhead)

30
Demand Paging Example
  • Memory access time = 200 nanoseconds
  • Average page-fault service time = 8 milliseconds
  • EAT = (1 - p) x 200 + p x (8 milliseconds)
        = (1 - p) x 200 + p x 8,000,000
        = 200 + p x 7,999,800
  • If one access out of 1,000 causes a page fault:
  • EAT = 8.2 microseconds
  • This is a slowdown by a factor of 40!
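The arithmetic on this slide is easy to reproduce (all numbers come from the slide itself):

```python
# Sketch: the demand-paging EAT example, in nanoseconds.
def eat_ns(p, mem_access_ns=200, fault_service_ns=8_000_000):
    return (1 - p) * mem_access_ns + p * fault_service_ns

eat = eat_ns(1 / 1000)        # 199.8 + 8000 = 8199.8 ns, i.e. ~8.2 us
slowdown = eat / 200          # ~41x, the slide's "factor of 40"
```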

31
What to replace?
  • What happens if there is no free frame?
  • Find some page in memory that is not really in use, and swap it out
  • Page Replacement:
  • When process has used up all frames it is allowed to use
  • OS must select a page to eject from memory to allow new page
  • The page to eject is selected using the Page Replacement Algorithm
  • Goal: Select page that minimizes future page faults

32
Page Replacement
  • Prevent over-allocation of memory by modifying
    page-fault service routine to include page
    replacement
  • Use modify (dirty) bit to reduce overhead of page transfers: only modified pages are written to disk
  • Page replacement completes separation between logical memory and physical memory: large virtual memory can be provided on a smaller physical memory

33
Page Replacement
34
Page Replacement Algorithms
  • Random: Pick any page to eject at random
  • Used mainly for comparison
  • FIFO: The page brought in earliest is evicted
  • Ignores usage
  • Suffers from Belady's Anomaly
  • Fault rate could increase with increasing number of frames
  • E.g., 0 1 2 3 0 1 4 0 1 2 3 4 with frame sizes 3 and 4
  • OPT: Belady's algorithm
  • Select page not used for longest time
  • LRU: Evict page that hasn't been used the longest
  • Past could be a good predictor of the future

35
First-In-First-Out (FIFO) Algorithm
  • Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
  • 3 frames (3 pages can be in memory at a time per process)
  • 4 frames

  • Belady's Anomaly: more frames → more page faults
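The FIFO fault counts on this slide can be reproduced with a short simulation (a sketch, not code from the course):

```python
# Sketch: FIFO page replacement, counting faults.  Run on this
# slide's reference string it shows Belady's anomaly: a 4th frame
# *increases* the fault count from 9 to 10.
from collections import deque

def fifo_faults(refs, nframes):
    frames, queue, faults = set(), deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == nframes:          # evict the oldest page
                frames.discard(queue.popleft())
            frames.add(page)
            queue.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
# fifo_faults(refs, 3) -> 9; fifo_faults(refs, 4) -> 10
```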

[Table: FIFO frame contents per reference: 9 page faults with 3 frames, 10 page faults with 4 frames.]
36
FIFO Illustrating Beladys Anomaly
37
Optimal Algorithm
  • Replace page that will not be used for longest
    period of time
  • 4 frames example
  • 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
  • How do you know this?
  • Used for measuring how well your algorithm
    performs
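Belady's OPT can be simulated by looking ahead in the reference string; a sketch:

```python
# Sketch: OPT (Belady's algorithm) -- on a fault, evict the resident
# page whose next use is farthest in the future (or never comes).
def opt_faults(refs, nframes):
    frames, faults = set(), 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            future = refs[i + 1:]
            def next_use(p):
                return future.index(p) if p in future else float("inf")
            frames.discard(max(frames, key=next_use))
        frames.add(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
# opt_faults(refs, 4) -> 6, matching the slide
```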

[Table: OPT frame contents: 6 page faults with 4 frames.]
38
Least Recently Used (LRU) Algorithm
  • Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
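LRU on the same reference string; a sketch using an ordered dict to keep recency order:

```python
# Sketch: LRU replacement.  Pages sit in an OrderedDict from least to
# most recently used; a hit moves the page to the back, eviction pops
# from the front.
from collections import OrderedDict

def lru_faults(refs, nframes):
    frames, faults = OrderedDict(), 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)            # refresh recency
            continue
        faults += 1
        if len(frames) == nframes:
            frames.popitem(last=False)          # evict least recently used
        frames[page] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
# lru_faults(refs, 4) -> 8
```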

[Table: LRU frame contents per reference: 8 page faults with 4 frames.]
39
Implementing Perfect LRU
  • On reference: time-stamp each page
  • On eviction: scan for oldest frame
  • Problems:
  • Large page lists
  • Timestamps are costly
  • Approximate LRU
  • LRU is already an approximation!

[Figure: four frames time-stamped t=4, t=14, t=14, t=5; perfect LRU scans for the oldest stamp (t=4) on eviction.]
40
LRU Clock Algorithm
  • Each page has a reference bit
  • Set on use, reset periodically by the OS
  • Algorithm:
  • FIFO + reference bit (keep pages in circular list)
  • Scan: if ref bit is 1, set it to 0 and proceed; if ref bit is 0, stop and evict
  • Problem:
  • Low accuracy for large memory
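The clock scan described above can be sketched as follows (a minimal single-hand version):

```python
# Sketch: the clock algorithm.  Frames form a circular list of
# [page, ref_bit]; on a fault the hand clears set bits as it sweeps
# and evicts the first page whose bit is already 0.
def clock_faults(refs, nframes):
    frames, hand, faults = [], 0, 0
    for page in refs:
        for entry in frames:
            if entry[0] == page:
                entry[1] = 1                    # hit: set reference bit
                break
        else:
            faults += 1
            if len(frames) < nframes:
                frames.append([page, 1])
            else:
                while frames[hand][1] == 1:     # second chance: clear bit
                    frames[hand][1] = 0
                    hand = (hand + 1) % nframes
                frames[hand] = [page, 1]        # evict, install new page
                hand = (hand + 1) % nframes
    return faults
```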

[Figure: circular list of pages with reference bits (R=0 or R=1); the hand sweeps, clearing set bits and evicting the first page with R=0.]
41
LRU with large memory
  • Solution: Add another hand
  • Leading edge clears ref bits
  • Trailing edge evicts pages with ref bit 0
  • What if the angle between the hands is small?
  • What if the angle is big?

42
Clock Algorithm Discussion
  • Sensitive to sweeping interval
  • Fast: lose usage information
  • Slow: all pages look used
  • Clock: add reference bits
  • Could use (ref bit, modified bit) as ordered pair
  • Might have to scan all pages
  • LFU: Remove page with lowest count
  • Does not track when the page was referenced
  • Use multiple bits; shift right by 1 at regular intervals
  • MFU: remove the most frequently used page
  • LFU and MFU do not approximate OPT well
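The "multiple bits, shift right at regular intervals" idea is usually called aging; a sketch with an assumed 8-bit counter per page:

```python
# Sketch: aging counters.  Every interval, each page's counter is
# shifted right and the page's reference bit is ORed into the top
# bit, so recent use dominates the count.
AGING_BITS = 8

def age_tick(counters, referenced):
    """counters: {page: int}; referenced: pages used this interval."""
    for page in counters:
        counters[page] >>= 1
        if page in referenced:
            counters[page] |= 1 << (AGING_BITS - 1)

counters = {"A": 0, "B": 0}
age_tick(counters, {"A"})     # A: 0b10000000, B: 0b00000000
age_tick(counters, {"B"})     # A: 0b01000000, B: 0b10000000
# Evict the page with the smallest counter; here that would be "A".
```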

43
Page Buffering
  • Cute simple trick (XP, 2K, Mach, VMS)
  • Keep a list of free pages
  • Track which page the free page corresponds to
  • Periodically write modified pages, and reset
    modified bit

[Figure: pages move from "used" to "free"; unmodified pages go on the free list, modified pages go on a modified list (batching writes improves speed).]
44
Allocating Pages to Processes
  • Global replacement:
  • Single memory pool for entire system
  • On page fault, evict oldest page in the system
  • Problem: protection
  • Local (per-process) replacement:
  • Have a separate pool of pages for each process
  • Page fault in one process can only replace pages from its own process
  • Problem: might have idle resources

45
Allocation of Frames
  • Each process needs a minimum number of pages
  • Example: IBM 370 needs 6 pages to handle the SS MOVE instruction:
  • instruction is 6 bytes, might span 2 pages
  • 2 pages to handle "from"
  • 2 pages to handle "to"
  • Two major allocation schemes:
  • fixed allocation
  • priority allocation

46
Summary
  • Demand Paging:
  • Treat memory as a cache for disk
  • Cache miss → get page from disk
  • Transparent Level of Indirection:
  • User program is unaware of activities of OS behind the scenes
  • Data can be moved without affecting application correctness
  • Replacement policies:
  • FIFO: Place pages on a queue; replace the page at the head (brought in earliest)
  • OPT: replace page that will be used farthest in the future
  • LRU: Replace page that hasn't been used for the longest time
  • Clock Algorithm: Approximation to LRU
  • Arrange all pages in circular list
  • Sweep through them, marking as not in use
  • If page not in use for one pass, then can replace
