Title: Chapter 10 Virtual Memory
1. Chapter 10 Virtual Memory
2. Outline
- Background
- Demand Paging
- Process Creation
- Page Replacement
- Allocation of Frames
- Thrashing
- OS Examples
3. Background
- Memory management in Chapter 9
  - Place the entire logical address space into the physical address space
  - Exception: overlays and dynamic loading (managed by the programmer)
- Virtual memory allows the execution of processes that may not be completely in memory
  - Only part of the program needs to be in memory for execution
- Separates the user logical address space (memory) from the physical address space (memory)
  - The logical address space can be much larger than the physical address space
  - Need to allow pages to be swapped in and out
4. Does the entire process need to be in memory for execution?
- Code to handle unusual error conditions
- Arrays, lists, and tables are often allocated more memory than they actually need
- Certain options and features of a program may be used rarely
- Even when the entire program is needed, it may not all be needed at the same time
5. Benefits of executing a program partially in memory
- A program can be larger than the physical memory
- Programmers no longer need to worry about the amount of physical memory available
- Increases the level of multiprogramming
  - Increases CPU utilization and throughput, without increasing response time or turnaround time (really?)
- Less I/O is needed to load or swap each user program into memory
6. A large VM when only a small physical memory is available
[Figure: a page in physical memory is accessed directly; a page not in physical memory is brought in from disk; when physical memory is full, a frame is swapped out (page replacement)]
7. Virtual Memory
- Implementation
  - Demand paging
  - Demand segmentation
    - Segment-replacement algorithms are more complex because segments have variable sizes
8. 10.2 Demand Paging
9. Basic Concepts
- Similar to paging plus swapping
  - Processes reside on secondary memory; when executing, the entire process is swapped into/out of memory (swapper)
- Demand paging: never swap a page into memory unless it will be needed (lazy swapper, or pager)
- Demand paging means:
  - Less I/O needed
  - Less memory needed
  - Faster response
  - More processes
- Page is needed → reference to it
  - invalid reference → abort
  - not in memory → bring into memory
- The pager guesses which pages will be used before the process is swapped out again → it brings only these necessary pages into memory
10. Transfer of paged memory to contiguous disk space
11. Hardware Support for Demand Paging
- With each page-table entry a valid/invalid bit is associated
  - 1 → legal and in memory; 0 → not in memory or illegal
- Initially the valid/invalid bit is set to 0 on all entries
- Marking a page invalid has no effect if the process never attempts to access that page
- During address translation, if the valid/invalid bit in the page-table entry is 0 → page-fault trap
  - Illegal access?
  - Legal but not in memory?
12. Page table when some pages are not in main memory
[Figure notes: accessing an invalid page is an illegal access; the OS puts the process in the backing store when it starts executing.]
13. Page-Fault Trap
- The first reference to a page not yet brought into memory traps to the OS → page fault
- The OS looks at an internal table (kept with the PCB) to decide:
  - Invalid reference → abort
  - Just not in memory → page it in
- Get a free frame from the free-frame list
  - What if there are no free frames? → page replacement
- Swap the page into the frame (schedule a disk operation)
- Reset the tables (internal and page tables); set the valid bit to 1
- Restart the instruction (needs architecture support)
14. Steps in handling a page fault
15. Restart any instruction after a page fault
- ADD A, B, C
  - Fetch and decode the instruction (ADD)
  - Fetch A
  - Fetch B
  - Add A and B
  - Store the sum in C
- The key is how to keep the original state for restarting
- Needs architecture support
16. What happens if there is no free frame?
- Page replacement: find some page in memory that is not really in use and swap it out
  - Algorithm?
  - Performance: want an algorithm that results in the minimum number of page faults
- The same page may be brought into memory several times
  - Page in → page out → page in → … → page out
17. More about Demand Paging
- Pure demand paging
  - Never bring a page into memory until it is required
  - Start executing a process with no pages in memory
- Locality of reference
  - Results in reasonable performance from demand paging
- Hardware support: the same as for paging and swapping
  - Page table
  - Secondary memory holds the pages not in main memory (swap space or backing store); must be fast
18. Performance of Demand Paging
- Page-fault rate p, 0 ≤ p ≤ 1.0
  - if p = 0, no page faults
  - if p = 1, every reference is a fault
- Effective Access Time (EAT):
  - EAT = (1 - p) × memory access time + p × page-fault time
  - page-fault time = service the page-fault trap + swap the page out + read the page in + restart overhead
19. Performance of Demand Paging (Example)
- Memory access time = 100 nanoseconds
- Average page-fault service time = 25 milliseconds
- EAT = (1 - p) × 100 ns + p × 25 ms
      = (1 - p) × 100 + p × 25,000,000 ns
      = 100 + 24,999,900 × p ns
- p = 0.001 → EAT ≈ 25 microseconds (a 250× slowdown)
- For EAT ≤ 110 ns (a 10-percent slowdown), we need p < 0.0000004
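A small Python sketch of the EAT formula, using the slide's example values (100 ns memory access, 25 ms fault service):

```python
def eat_ns(p, mem_ns=100, fault_ns=25_000_000):
    """EAT = (1 - p) * memory access time + p * page-fault service time (ns)."""
    return (1 - p) * mem_ns + p * fault_ns

print(eat_ns(0.001))      # ~25,099.9 ns, i.e., about 25 microseconds
print(eat_ns(0.0000004))  # ~110 ns, i.e., roughly a 10% slowdown
```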
20. 10.3 Process Creation
21. Process Creation
- Virtual memory allows other benefits during process creation:
  - Copy-on-Write
  - Memory-Mapped Files
22. Copy-on-Write
- Copy-on-Write (COW) allows both parent and child processes to initially share the same pages in memory; their page tables point to the same frames
- If either process modifies a shared page, only then is the page copied (copy-on-write)
- COW allows more efficient process creation, as only modified pages are copied
- Free pages are allocated from a pool of zeroed-out pages
23. Memory-Mapped Files
- Memory-mapped file I/O allows file I/O to be treated as routine memory access by mapping a disk block to a page in memory
- A file is initially read using demand paging: a page-sized portion of the file is read from the file system into a physical page
- Subsequent reads/writes to/from the file are treated as ordinary memory accesses
- Simplifies file access by treating file I/O through memory rather than read()/write() system calls
- Also allows several processes to map the same file, allowing the pages in memory to be shared
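Python's standard mmap module gives a minimal, runnable illustration of the idea; the file name and contents below are made up for the demo:

```python
# Memory-mapped file I/O: reads and writes on the mapping are ordinary
# memory accesses; the OS demand-pages file blocks in behind the scenes.
import mmap, os, tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.bin")  # throwaway file for the demo
with open(path, "wb") as f:
    f.write(b"hello, virtual memory")

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:   # map the whole file
        print(m[:5])                      # read through memory: b'hello'
        m[0:5] = b"HELLO"                 # write through memory, no write() call

with open(path, "rb") as f:
    print(f.read())                       # the change landed in the file
```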
24. Memory-Mapped Files
25. 10.4 Page Replacement
26. Introduction
- Prevent over-allocation of memory by modifying the page-fault service routine to include page replacement
- Use a modify (dirty) bit to reduce the overhead of page transfers: only modified pages are written back to disk
- Page replacement completes the separation between logical memory and physical memory: a large virtual memory can be provided on a smaller physical memory
27. Need for Page Replacement
28. Basic Page Replacement
- Find the location of the desired page on the disk
- Find a free frame:
  - If there is a free frame, use it
  - If there is no free frame, use a page-replacement algorithm to select a victim frame
  - Write the victim to the disk; change the page and frame tables accordingly
- Read the desired page into the newly freed frame; update the page and frame tables
- Restart the user process
29. Page Replacement
30. Two Major Problems in Implementing Demand Paging
- Frame-allocation algorithm
  - How many frames to allocate to each process
- Page-replacement algorithm
  - Select the frames that are to be replaced
  - Want the lowest page-fault rate
  - Evaluate an algorithm by running it on a particular string of memory references (a reference string) and computing the number of page faults on that string
- Reference string: the string of memory references
  - 0100, 0432, 0101, 0612, 0102, 0103, 0104, 0101, 0611, 0102, 0103, 0104, 0101, 0610, 0102, 0103, 0104, 0101, 0609, 0102, 0105
  - reduces to 1, 4, 1, 6, 1, 6, 1, 6, 1, 6, 1 (with page size 100)
- Reference string used in the following examples: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
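The reduction of the address trace above can be checked with a few lines of Python (page size 100, consecutive repeats dropped):

```python
# Reduce the raw address trace to a page reference string, as on the slide.
addresses = [100, 432, 101, 612, 102, 103, 104, 101, 611, 102, 103, 104,
             101, 610, 102, 103, 104, 101, 609, 102, 105]

pages = [a // 100 for a in addresses]           # page number = address / page size
ref_string = [p for i, p in enumerate(pages) if i == 0 or p != pages[i - 1]]
print(ref_string)   # [1, 4, 1, 6, 1, 6, 1, 6, 1, 6, 1]
```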
31. Ideal graph of page faults vs. the number of frames
32. First-In-First-Out (FIFO) Algorithm
33. FIFO Page Replacement
34. Belady's Anomaly
The page-fault rate may increase as the number of
allocated frames increases
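A minimal FIFO simulator (a sketch, not the book's code) reproduces Belady's anomaly on the reference string used in this chapter: giving the process a fourth frame increases the fault count.

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults for FIFO replacement with nframes frames."""
    frames, queue, faults = set(), deque(), 0
    for p in refs:
        if p in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            frames.discard(queue.popleft())   # evict the oldest resident page
        frames.add(p)
        queue.append(p)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9 faults
print(fifo_faults(refs, 4))   # 10 faults -- more frames, yet more faults
```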
35. Optimal Algorithm
- Replace the page that will not be used for the longest period of time
- How do you know this? It requires future knowledge
- Used for comparison studies (like SJF in CPU scheduling)
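The optimal policy is easy to simulate offline, since the whole reference string is known in advance; this sketch evicts the resident page whose next use is farthest away (or that is never used again):

```python
def opt_faults(refs, nframes):
    """Count faults for the optimal (future-knowledge) replacement policy."""
    frames, faults = set(), 0
    for i, p in enumerate(refs):
        if p in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            def next_use(q):
                # Index of q's next reference, or infinity if never used again.
                for j in range(i + 1, len(refs)):
                    if refs[j] == q:
                        return j
                return float("inf")
            frames.discard(max(frames, key=next_use))
        frames.add(p)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(opt_faults(refs, 3))   # 7 faults -- the lower bound for this string
```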
36. Optimal Algorithm
37. Least Recently Used (LRU) Algorithm
- Replace the page that has not been used for the longest period of time
- Associate with each page the time of that page's last use
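One way to sketch LRU in Python is with an OrderedDict that keeps resident pages in recency order; this models the policy, not a real MMU:

```python
from collections import OrderedDict

def lru_faults(refs, nframes):
    """Count faults for LRU: evict the page unused for the longest time."""
    frames, faults = OrderedDict(), 0       # insertion order = recency order
    for p in refs:
        if p in frames:
            frames.move_to_end(p)           # mark p as most recently used
            continue
        faults += 1
        if len(frames) == nframes:
            frames.popitem(last=False)      # evict the least recently used
        frames[p] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(refs, 3))   # 10 faults
```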
38. LRU Algorithm
39. LRU Implementation
- Counter (or clock)
  - Every page-table entry has a time-of-use field; every time the page is referenced through this entry, copy the clock into the field
  - When a page needs to be replaced, look at the counters to select the victim
  - Requires a search of the page table to find the LRU page, plus a write to memory on every memory access
  - Overflow of the clock must be considered
40. LRU Implementation (Cont.)
- Stack implementation: keep a stack of page numbers in doubly linked form
- When a page is referenced:
  - move it to the top
  - requires 6 pointers to be changed
- No search for replacement
41. Stack Implementation of LRU
42. LRU Approximation Algorithms
- Reference bit
  - With each page associate a bit, initially 0
  - When a page is referenced, the bit is set to 1
  - Replace a page whose bit is 0 (if one exists); we do not know the order, however
- Additional reference bits
  - Record the reference bits at regular intervals, say in an 8-bit history per page
  - Replace the page with the lowest number in its history bits
  - Illustration: 11000100 vs. 01110111
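Interpreting each page's 8 history bits as an unsigned integer makes the comparison concrete; pages "A" and "B" are hypothetical names for this sketch:

```python
# The page with the lowest history value was referenced least recently,
# so it becomes the victim.
history = {"A": 0b11000100, "B": 0b01110111}   # 196 vs. 119

victim = min(history, key=history.get)
print(victim)   # B, since 0b01110111 (119) < 0b11000100 (196)
```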
43. LRU Approximation Algorithms (Cont.)
- Second chance (clock)
  - The basic algorithm is FIFO
  - Needs the reference bit
  - If the page to be replaced (in FIFO order) has reference bit = 1, then:
    - set the reference bit to 0
    - leave the page in memory (give it a second chance)
    - inspect the next page (in FIFO order), subject to the same rules
44. The Clock Policy (Second Chance)
- The set of frames that are candidates for replacement is treated as a circular buffer
- When a page is replaced, a pointer is set to the next frame in the buffer
- A use bit for each frame is set to 1 whenever:
  - a page is first loaded into the frame
  - the corresponding page is referenced
- When it is time to replace a page, the first frame encountered with the use bit set to 0 is replaced
- During the search for a replacement, each use bit set to 1 is changed to 0
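The clock policy described above can be sketched as follows: frames form a circular buffer scanned by a hand that clears use bits until it finds a 0.

```python
def clock_faults(refs, nframes):
    """Second-chance (clock): FIFO, but a set use bit earns one more pass."""
    frames = [None] * nframes     # circular buffer of resident pages
    use = [0] * nframes           # use (reference) bit per frame
    hand, faults = 0, 0
    for p in refs:
        if p in frames:
            use[frames.index(p)] = 1      # referencing a page sets its use bit
            continue
        faults += 1
        while use[hand] == 1:             # clear use bits while searching
            use[hand] = 0
            hand = (hand + 1) % nframes
        frames[hand] = p                  # replace the first frame with use bit 0
        use[hand] = 1                     # a newly loaded page starts with use bit 1
        hand = (hand + 1) % nframes
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(clock_faults(refs, 3))   # 9 faults with 3 frames on this string
```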
47. Second-Chance Page-Replacement Algorithm
48. Comparison of Clock with FIFO and LRU
- An asterisk indicates that the corresponding use bit is set to 1
- Clock protects frequently referenced pages by setting the use bit to 1 at each reference
49. Comparison of Clock with FIFO and LRU (Cont.)
- Numerical experiments tend to show that the performance of clock is close to that of LRU
- Experiments have been performed with the number of frames allocated to each process fixed, and with only pages local to the page-faulting process considered for replacement
- When few (6 to 8) frames are allocated per process, there is almost a factor of 2 in page faults between LRU and FIFO
- This factor drops to close to 1 when several (more than 12) frames are allocated (but then more main memory is needed to support the same level of multiprogramming)
51. LRU Approximation Algorithms (Cont.)
- Enhanced second chance: use the (reference, modify) bit pair
- Replace the first page encountered in the lowest nonempty class:
  1. (0,0) neither recently used nor modified
  2. (0,1) not recently used but modified
  3. (1,0) recently used but not modified
  4. (1,1) recently used and modified
52. Counting-Based Algorithms
- Keep a counter of the number of references that have been made to each page
- LFU algorithm: replaces the page with the smallest count
- MFU algorithm: based on the argument that the page with the smallest count was probably just brought in and has yet to be used
53. 10.5 Allocation of Frames
54. Minimum Number of Frames
- Each process needs a minimum number of frames
  - Defined by the computer (instruction-set) architecture
  - Example: the IBM 370 needs 6 frames to handle the SS MOVE instruction:
    - the instruction is 6 bytes and might span 2 pages
    - 2 pages to handle the "from" operand
    - 2 pages to handle the "to" operand
- Two major allocation schemes:
  - fixed allocation
  - priority allocation
55. Fixed Allocation
- Equal allocation: e.g., if there are 100 frames and 5 processes, give each process 20 frames
- Proportional allocation: allocate according to the size of the process
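A quick sketch of proportional allocation, a_i = (s_i / S) × m; the leftover frames from rounding down are handed out by a tie rule of my own choosing, and the sizes below are made up:

```python
def proportional(sizes, m):
    """Allocate m frames to processes in proportion to their sizes."""
    S = sum(sizes)
    alloc = [s * m // S for s in sizes]          # floor of each proportional share
    for i in range(m - sum(alloc)):              # hand out the rounding remainder
        alloc[i % len(sizes)] += 1
    return alloc

print(proportional([10, 127], 62))   # [5, 57]: the small process gets ~10/137 of 62
```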
56. Priority Allocation
- Use a proportional allocation scheme based on priorities rather than size
- If process Pi generates a page fault:
  - select for replacement one of its own frames, or
  - select for replacement a frame from a process with a lower priority number
57. Global vs. Local Allocation
- Global replacement: a process selects a replacement frame from the set of all frames; one process can take a frame from another
- Local replacement: each process selects from only its own set of allocated frames
58. 10.6 Thrashing
59. Thrashing
- If a process does not have enough frames, the page-fault rate is very high. This leads to:
  - low CPU utilization
  - the operating system thinking it needs to increase the degree of multiprogramming
  - another process being added to the system
- Think of an N-iteration loop where each iteration needs 4 pages but only 3 frames are allocated
  - reference string over pages 1-4: 1 2 3 4 2 3 4 1 3 4 1 2 3 1 2 …
- Thrashing → a process is busy swapping pages in and out
60. Thrashing Diagram
61. Locality Model
- To prevent thrashing, we must give a process as many frames as it needs
- Why does paging work? The locality model
  - A locality is a set of pages that are actively used together
  - In the previous loop example, the locality consists of 4 pages
- A process migrates from one locality to another
- If we can keep the current localities of all processes in memory → no thrashing
- Why does thrashing occur? → total size of the localities > total memory size
64. Working-Set Model
- Look at how many frames a process is actually using
- Δ ≡ working-set window ≡ a fixed number of page references (example: 10,000 instructions)
- WSSi (working-set size of process Pi) = total number of distinct pages referenced in the most recent Δ
  - if Δ is too small, it will not encompass the entire locality
  - if Δ is too large, it will encompass several localities
  - if Δ = ∞, it will encompass the entire program
- D = Σ WSSi = total demand for frames
- if D > m (the number of available frames) → thrashing
- Policy: if D > m, suspend one of the processes
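The WSS computation is just "distinct pages in the last Δ references"; here is a sketch on a made-up reference string:

```python
def working_set(refs, t, delta):
    """Pages referenced in the window of delta references ending at time t."""
    return set(refs[max(0, t - delta + 1): t + 1])

refs = [1, 2, 1, 3, 2, 2, 2, 1, 4, 4]
ws = working_set(refs, t=9, delta=4)     # window: 2, 1, 4, 4
print(ws, "WSS =", len(ws))              # {1, 2, 4} WSS = 3
```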
65. Working-Set Model Illustration
66. Keeping Track of the Working Set
- Approximate with an interval timer plus a reference bit
- Example: Δ = 10,000
  - The timer interrupts after every 5,000 time units
  - Keep 2 history bits in memory for each page
  - Whenever the timer interrupts, copy each reference bit into the history bits and set all reference bits to 0
  - If one of the bits in memory = 1 → the page is in the working set
- Improvement: 10 history bits and an interrupt every 1,000 time units
67. Page-Fault Frequency Scheme
- Use the page-fault frequency to establish an acceptable page-fault rate
  - If the actual rate is too low, the process loses a frame
  - If the actual rate is too high, the process gains a frame
68. Process Suspension When Thrashing
- Lowest-priority process
- Faulting process
  - this process does not have its working set in main memory, so it will be blocked anyway
- Last process activated
  - this process is least likely to have its working set resident
69. Process Suspension (Cont.)
- Process with the smallest resident set
  - this process requires the least future effort to reload
- Largest process
  - obtains the most free frames
- Process with the largest remaining execution window
70. 10.8 Other Considerations
71. Pre-paging and Page Size
- Pre-paging
  - Attempt to prevent the high level of initial paging
  - Remember the working set when a process is suspended (to bring it back in later)
  - Question: is the cost of pre-paging less than that of servicing the page faults?
- Page-size selection (larger pages are preferred today)
  - Internal fragmentation → favors small pages
  - Page-table size → favors large pages
  - I/O overhead → favors large pages
  - Total I/O → favors small pages
  - Number of page faults → favors large pages
72. TLB Reach
- TLB reach: the amount of memory accessible from the TLB
- TLB reach = (TLB size) × (page size)
- Ideally, the working set of each process is stored in the TLB; otherwise there is a high degree of page faults
- Increasing TLB reach:
  - Increase the TLB size
  - Increase the page size; this may lead to an increase in fragmentation, as not all applications require a large page size
  - Provide multiple page sizes: allow applications requiring larger page sizes to use them without an increase in fragmentation; needs an OS-managed TLB
73. Program Structure
- Program structure → how to improve locality
- Array A[1024, 1024] of integers
  - Each row is stored in one page
  - One frame is allocated
- Program 1: for j = 1 to 1024 do for i = 1 to 1024 do A[i, j] = 0 → 1024 × 1024 page faults
- Program 2: for i = 1 to 1024 do for j = 1 to 1024 do A[i, j] = 0 → 1024 page faults
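The two fault counts can be reproduced with a toy simulation: one row per page, one frame, and a fault whenever the referenced row changes.

```python
N = 1024

def count_faults(index_pairs):
    """One frame, one row per page: fault whenever the row (page) changes."""
    resident, faults = None, 0
    for i, j in index_pairs:
        if i != resident:         # A[i][j] lives on page i
            faults += 1
            resident = i
    return faults

col_major = ((i, j) for j in range(N) for i in range(N))   # Program 1's order
row_major = ((i, j) for i in range(N) for j in range(N))   # Program 2's order
print(count_faults(col_major))   # 1048576 = 1024 x 1024 faults
print(count_faults(row_major))   # 1024 faults
```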
74. Inverted Page Table
- An inverted page table no longer contains complete information about the logical address space of a process, which is required to process a page fault
- An external page table (one per process) must be kept
  - It looks like a traditional per-process page table, containing information on where each virtual page is located
  - It is referenced only when a page fault occurs → it does not need to be fast
  - It may itself be paged in and out of memory as necessary
75. I/O Interlocks
- What happens if a page waiting for I/O is paged out, and another page is put in the frame originally used for that page?
- Solution 1: never execute I/O to user memory
  - I/O takes place only between system memory and the I/O device
  - Extra copies are needed to transfer data between system memory and user memory
- Solution 2: allow pages to be locked in memory
  - A locked page cannot be selected for eviction by a page-replacement algorithm
76. Reason Why Frames Used for I/O Must Be in Memory
77. Locking a Page to Prevent Replacement
- Kernel memory is usually locked in memory (for performance)
- Prevent replacing a newly brought-in page until it has been used at least once
78. Real-Time Processing
- VM provides the best overall utilization of a computer
- VM increases overall system throughput, but individual processes may suffer from page faults and replacements
- VM is the antithesis of real-time computing
  - VM can introduce unexpectedly long delays in the execution of a process while pages are brought into memory
- Real-time systems almost never have virtual memory
- Solaris 2 supports both real-time and time-sharing processes
  - It allows a process to tell the OS which pages are important to it (a hint on page use, so that important pages are not paged out)
  - Privileged users can lock pages in memory