Title: Virtual Memory
1. Virtual Memory
- Matt Evett
- Dept. Computer Science
- Eastern Michigan University
2. Virtual Memory Topics
- Demand Paging
- Performance of Demand Paging
- Page Replacement
- Page-Replacement Algorithms
- Allocation of Frames
- Thrashing
- Other Considerations
- Demand Segmentation
3. Background
- Virtual memory: separation of user logical memory from physical memory.
  - Only part of the program needs to be in memory for execution.
  - Logical address space can therefore be much larger than physical address space.
  - Need to allow pages to be swapped in and out.
- Virtual memory can be implemented via
  - Demand paging
  - Demand segmentation
4. Demand Paging
- Bring a page into memory only when it is needed.
- Similar to simple paging, but uses a lazy swapper to bring in pages only as needed.
  - Less I/O and memory than simple paging.
  - Faster response (don't have to swap in the entire process).
  - Fewer pages per process → more users.
- Page needed → reference to it
  - invalid reference → abort
  - not-in-memory → bring to memory
5. Valid-Invalid Bit
- With each page-table entry a valid-invalid bit is associated (1 → in-memory, 0 → not-in-memory); recall the simple paging scheme.
- Initially the bit is set to 0 on all entries.
- Ex: snapshot of a page table.
- During address translation, if the valid-invalid bit in the page-table entry is 0 → page fault.
6. Swapper: Page Fault
- Invalid reference → OS trap, page fault.
- OS looks at another table to differentiate:
  - if truly invalid (illegal address) → abort;
  - else, the referenced page is in swap space.
- Get an empty frame (may need to swap one out).
- Swap the page into the frame.
  - Could handle another process while waiting.
- Reset tables; set the validation bit to 1.
- Restart the instruction.
7. Potential Problems During Swap
- Restarting a partially executed instruction:
  - Block operations span blocks of memory, perhaps across page boundaries. How to deal with a partial move at page-fault time?
  - MOV (A2), -(A3): auto increment/decrement can be partially done at the time of a page fault (A2 and A3 are address registers).
8. What Happens If There Is No Free Frame?
- Page replacement: find a page in memory (resident) but not really in use, and swap it out.
- But which page?
  - There are many algorithms.
  - Performance: want an algorithm which will minimize the number of page faults.
- The same page may be brought into memory several times.
9. Performance of Demand Paging
- Page fault rate p: 0 ≤ p ≤ 1.0
  - if p = 0, no page faults
  - if p = 1, every reference is a fault
- m is the modification rate: the probability that a frame has been modified since it was swapped in.
- Effective Access Time (EAT):
  - EAT = (1 - p) × t_memory_access + p × (t_page_fault_overhead + m × t_swap_page_out + t_swap_page_in + t_restart_overhead)
10. Demand Paging Example
- Memory access time = 1 microsecond.
- 50% of the time the page being replaced has been modified and therefore needs to be swapped out.
- Swap page time = 10 msec = 10,000 μsec.
- EAT = (1 - p) × 1 + p × 15,000 = 1 + 14,999p μsec.
- So if p = .001, EAT = 15.999 μsec, a 16-fold slowdown in access time! For the slowdown to be less than 10%, p must be less than .1/14,999 ≈ .0000067, i.e. 1 fault per 149,990 references.
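The arithmetic on this slide can be checked with a short script. The blended per-fault cost of 15,000 μsec is the slide's 10,000 μsec swap-in plus 50% × 10,000 μsec swap-out; parameter names are illustrative.

```python
# Effective access time for the slide's demand-paging example.
# Assumptions from the slide: memory access = 1 usec, swap = 10,000 usec,
# 50% of replaced pages are dirty and must be swapped out first.

def eat(p, t_mem=1.0, t_swap=10_000.0, dirty_ratio=0.5):
    """Effective access time in microseconds for page-fault rate p."""
    fault_cost = t_swap + dirty_ratio * t_swap   # 15,000 usec per fault
    return (1 - p) * t_mem + p * fault_cost

print(eat(0.001))     # 15.999 usec -- a 16-fold slowdown at p = .001
# For less than 10% degradation we need 14,999 * p < 0.1:
print(0.1 / 14_999)   # ~6.67e-6, about 1 fault per 150,000 references
```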
11. Page Replacement
- Use a modify (dirty) bit to reduce the overhead of page transfers: only modified pages are written back to disk. The bit is manipulated by hardware.
- Page replacement completes the separation between logical memory and physical memory: a large virtual memory can be provided on a smaller physical memory.
12. Over-allocation
- Because demand paging allows processes to have just a few of their pages resident, the system can support more simultaneous jobs.
- As the degree of over-allocation increases, so does CPU utilization, but so does the page-fault ratio.
13. Page-Replacement Algorithms
- Want the lowest page-fault rate.
- Evaluate an algorithm by running it on a particular string of memory references (a reference string) and computing the number of page faults on that string.
- In the following examples, the reference string is
  - 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5.
14. First-In-First-Out (FIFO) Algorithm
- Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
- 3 frames (3 pages can be in memory at a time per process) → 9 page faults; 4 frames → 10 page faults.
- FIFO replacement exhibits Belady's anomaly:
  - more frames does not necessarily → fewer page faults.
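Belady's anomaly can be reproduced on this exact reference string with a small FIFO simulator (a sketch; the fault counts match the 9 and 10 above):

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults under FIFO replacement."""
    frames = deque()          # oldest resident page at the left
    faults = 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == nframes:
                frames.popleft()      # evict the oldest resident page
            frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9 faults
print(fifo_faults(refs, 4))   # 10 faults -- more frames, more faults!
```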
15. Optimal Algorithm
- Replace the page that will not be used for the longest period of time.
- 4-frames example: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 → 6 page faults.
- How could you know this? Crystal ball?
- It's a benchmark used for measuring how well an algorithm performs.
- Analogous to shortest-job-first CPU scheduling.
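The "crystal ball" policy can still be simulated offline, because the entire reference string is known in advance. A minimal sketch that evicts the resident page whose next use lies farthest in the future:

```python
def opt_faults(refs, nframes):
    """Count page faults under optimal (farthest-future-use) replacement."""
    frames = set()
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            future = refs[i + 1:]
            # Evict the resident page used farthest in the future
            # (or never used again).
            victim = max(frames, key=lambda f: future.index(f)
                         if f in future else len(future))
            frames.remove(victim)
        frames.add(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(opt_faults(refs, 4))    # 6 faults, matching the slide
```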
16. Least Recently Used (LRU) Algorithm
- Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
- Counter implementation:
  - Every page entry has a counter; every time the page is referenced through this entry, copy the clock into the counter.
  - When a page needs to be replaced, look at the counters to find the least recently used page.
17. LRU Algorithm (Cont.)
- Stack implementation: keep a stack of page numbers as a doubly linked list.
- Page referenced:
  - move it to the top
  - requires 6 pointers to be changed
- No search for the replacement victim.
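The stack idea maps naturally onto an ordered dictionary (a sketch): move a page to the top on reference, evict from the bottom on a fault. On the reference string above with 4 frames, LRU incurs 8 faults.

```python
from collections import OrderedDict

def lru_faults(refs, nframes):
    """Count page faults under LRU using an ordered 'stack' of pages."""
    stack = OrderedDict()     # most recently used page is last
    faults = 0
    for page in refs:
        if page in stack:
            stack.move_to_end(page)          # referenced: move to the top
        else:
            faults += 1
            if len(stack) == nframes:
                stack.popitem(last=False)    # evict the least recently used
            stack[page] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(refs, 4))    # 8 faults
```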
18. LRU Approximation Algorithms
- Reference bit:
  - With each page associate a bit, initially 0.
  - When the page is referenced, the bit is set to 1.
  - Replace a page whose bit is 0 (if one exists). We do not know the order among the 0s, however.
  - Worst case: all pages have the same reference bit.
  - Need a way to occasionally set bits back to 0.
19. LRU Approximations (cont.)
- The Second-Chance Algorithm:
  - Needs the reference bit.
  - Clock replacement provides an ordering scheme among entries with the same reference-bit value.
  - If the page to be replaced (in clock order) has reference bit 0, replace it. Else (bit = 1):
    - set the reference bit to 0,
    - leave the page in memory,
    - continue to the next page (in clock order), subject to the same rules.
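The steps above can be sketched as a circular scan over the resident frames; `ref_bits` stands in for the hardware reference bits, and the data layout is illustrative:

```python
def second_chance_victim(pages, ref_bits, hand):
    """Return (victim_index, new_hand) under the second-chance algorithm.

    pages    -- list of resident page numbers (the 'clock')
    ref_bits -- parallel list of reference bits (0 or 1)
    hand     -- current clock-hand position
    """
    while True:
        if ref_bits[hand] == 0:
            return hand, (hand + 1) % len(pages)
        ref_bits[hand] = 0                 # bit was 1: second chance
        hand = (hand + 1) % len(pages)     # keep scanning in clock order

pages = [7, 3, 9, 5]
bits  = [1, 0, 1, 1]
victim, hand = second_chance_victim(pages, bits, 0)
print(pages[victim])   # 3: first page found with reference bit 0
print(bits)            # [0, 0, 1, 1]: page 7's bit was cleared in passing
```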
20. Counting Algorithms
- Keep a counter of the number of references that have been made to each page.
- LFU algorithm: replaces the page with the smallest count.
- MFU algorithm: based on the argument that the page with the smallest count was probably just brought in and has yet to be used.
- Suffers from the cost of sorting the page table; a priority queue is a reasonable solution.
- What's a priority queue?
21. Preemptive Swap-outs
- Provide a cached bit for each page.
- If the CPU and I/O are idle, copy resident pages to swap space, setting the cached bit.
- When a page replacement is needed, if a cached page is selected there is no need to swap it out; simply overwrite it with the swapped-in page.
22. Allocation of Frames
- Each process needs a minimum number of frames.
- Example: the IBM 370 needs a max. of 6 pages to handle the SS MOVE instruction:
  - the instruction is 6 bytes and might span 2 pages;
  - the from address could straddle two pages;
  - the to address could straddle two pages.
- Two major types of allocation schemes:
  - fixed allocation
  - priority allocation
23. Fixed Allocation
- Equal allocation: e.g., if 100 frames and 5 processes, give each process 20 frames.
- Proportional allocation: allocate according to the size of the process.
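Proportional allocation can be sketched as frames_i = (size_i / Σ sizes) × m. The slide doesn't specify a rounding rule, so the largest-remainder tie-break below is an assumption:

```python
def proportional_allocation(sizes, m):
    """Split m frames among processes in proportion to their sizes."""
    total = sum(sizes)
    alloc = [s * m // total for s in sizes]     # floor of each ideal share
    remainders = [s * m % total for s in sizes]
    # Hand the frames lost to rounding to the largest remainders first
    # (assumed rule; any deterministic tie-break would do).
    for i in sorted(range(len(sizes)), key=lambda i: -remainders[i]):
        if sum(alloc) == m:
            break
        alloc[i] += 1
    return alloc

# 62 frames shared by a 10-page process and a 127-page process:
print(proportional_allocation([10, 127], 62))   # [5, 57]
```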
24. Priority Allocation
- Use a proportional allocation scheme based on priorities rather than size.
- If process Pi generates a page fault,
  - select for replacement one of its own frames, or...
  - select for replacement a frame from a process with a lower priority number.
25. Global vs. Local Allocation
- Global replacement: a process selects a replacement frame from the set of all frames; one process can take a frame from another.
  - Process execution time depends on the global environment. Less multitasking → fewer page faults.
- Local replacement: each process selects from only its own set of allocated frames.
  - Run time is more stable across environments.
  - But idle pages in one process are kept from other, busier processes. Generally, global is used.
26. Thrashing
- If a process does not have enough frames, the page-fault rate is very high. This leads to:
  - 1. low CPU utilization;
  - 2. the OS thinking it needs to increase the degree of multiprogramming (to increase CPU utilization);
  - 3. another process being added to the system, decreasing the number of frames per process!
- Thrashing: a process spends most of its cycles swapping pages in and out.
27. Thrashing Diagram
- Why does paging work? The locality model (see figure 10.15):
  - locality: a set of pages used together;
  - a process migrates from one locality to another;
  - localities may overlap.
- Why thrashing? Σ of locality sizes > total memory size (one process thrashing vs. all).
28. Working-Set Model
- The working set is an approximation of locality.
- Δ = working-set window = a fixed number of page references (e.g. 10,000 instructions).
- WSSi (working set size of process Pi) = total number of pages referenced in the most recent Δ (varies in time).
  - if Δ is too small, it will not encompass the entire locality;
  - if Δ is too large, it will encompass several localities;
  - if Δ = ∞, it will encompass the entire program.
29. Working-Set Model (cont.)
- Thesis: allocate enough frames to a process to accommodate its current locality.
- m = total number of frames.
- D = Σ WSSi = total demand for frames.
- if D > m → thrashing.
- Policy: if there's thrashing, suspend one of the processes.
  - The process will be completed later.
30. Tracking the Working Set
- Approximate with an interval timer + a reference bit.
- Example: Δ = 10,000.
  - The timer interrupts after every 5,000 time units.
  - Keep 2 in-memory bits for each page.
  - Whenever the timer interrupts, copy the reference bits into the in-memory bits and set all reference bits to 0.
  - If one of the in-memory bits = 1 → the page is in the working set.
- How is this not accurate?
- Improvement: 10 bits and interrupt every 1,000 time units.
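The scheme above can be sketched directly: at each timer interrupt, shift the hardware reference bit into a short per-page history and clear it; a page is in the working set if any history bit is set. Names and the dict-based layout are illustrative.

```python
HISTORY_BITS = 2     # the slide's 2 in-memory bits per page

def timer_interrupt(history, ref_bits):
    """Shift each page's reference bit into its history, then clear it."""
    mask = (1 << HISTORY_BITS) - 1
    for page in ref_bits:
        history[page] = ((history[page] << 1) | ref_bits[page]) & mask
        ref_bits[page] = 0

def working_set(history):
    """Pages referenced in any of the last HISTORY_BITS intervals."""
    return {page for page, bits in history.items() if bits != 0}

ref_bits = {p: 0 for p in "ABC"}
history  = {p: 0 for p in "ABC"}

ref_bits["A"] = ref_bits["B"] = 1        # interval 1: A and B referenced
timer_interrupt(history, ref_bits)
ref_bits["A"] = 1                        # interval 2: only A referenced
timer_interrupt(history, ref_bits)
print(sorted(working_set(history)))      # ['A', 'B']; C has dropped out
```

This also shows why the slide asks "how is this not accurate?": the history only records in which 5,000-unit interval a reference fell, not when within it.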
31. Page-Fault Frequency Scheme
- Establish an acceptable page-fault rate.
- If the actual rate is too low, the process loses a frame.
- If the actual rate is too high, the process gains a frame.
- Again, may need to suspend a process.
32. Other Considerations
- Prepaging (vs. demand paging):
  - With pure demand paging, there are many page faults as processes begin or are reactivated.
  - Prepaging strives to bring all pages of a process into memory at the same time.
  - A win because disk latency >> throughput; combining faults combines latencies.
33. Other Considerations (2)
- Page-size selection: how to select a page size?
  - fragmentation: big pages → more internal fragmentation;
  - table size: small pages → better memory utilization, but larger page tables;
  - I/O overhead: latency is, say, 8 msec plus 20 msec seek time, while throughput is 1 KB per .4 msec. Moving from 512 B pages to 1 KB → small run-time increase;
  - locality: small pages → better resolution, better representation of localities.
34. Programmer's Knowledge of Paging
- Program structure:
  - int A[1024][1024];
  - each row of A is stored in one page;
  - one frame allocated to the process.
- Program 1: for j = 1 to 1024 do { for i = 1 to 1024 do A[i][j] = 0 } → 1024 × 1024 page faults.
- Program 2: for i = 1 to 1024 do { for j = 1 to 1024 do A[i][j] = 0 } → 1024 page faults.
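The two fault counts can be checked with a tiny simulator (a sketch): with row-major storage, one row per page, and a single frame, a fault occurs whenever the touched row differs from the resident one.

```python
def count_faults(order, n=1024):
    """Simulate one-frame paging over an n x n row-major array.

    order='ji' puts j in the outer loop (Program 1);
    order='ij' puts i in the outer loop (Program 2).
    """
    resident = None          # the single frame holds one row (= one page)
    faults = 0
    for a in range(n):
        for b in range(n):
            row = b if order == "ji" else a   # page touched = row index i
            if row != resident:
                faults += 1
                resident = row
    return faults

print(count_faults("ji"))    # 1048576 faults (1024 x 1024)
print(count_faults("ij"))    # 1024 faults
```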
35. I/O Interlock
- Sometimes need to lock a page into memory when using VM.
- Ex: a process requests input into a local buffer. If the process is then swapped out, the buffer will be inaccessible when the input starts.
- Ex: a low-priority process faults. After the page is swapped in, the process re-enters the ready queue. A higher-priority job may then want to replace this page...
- I/O interlock is a policy decision; there are many different methods.
36. Real-Time Systems and VM
- Real-time systems usually can't afford the run-time cost of VM.
- The Solaris OS allows some processes (i.e., the real-time ones) to run at the highest priority and to I/O-interlock their pages.
- Have to be careful that not too many pages are locked.
37. Demand Segmentation
- Used when hardware is insufficient to support demand paging (e.g. the 80286).
- OS/2 allocates memory in segments, which it keeps track of through segment descriptors.
- A segment descriptor contains a valid bit to indicate whether the segment is currently in memory.
  - If the segment is in main memory, access continues;
  - if not in memory, segment fault.
38. Demand Segmentation (cont.)
- Much more overhead than demand paging:
  - segment size is variable;
  - external fragmentation → compaction;
  - after compaction, may need segment replacement;
  - segments can be big, so replacement may be slow.