Title: Emery Berger
1 Operating Systems, CMPSCI 377, Lecture 14: VM Meets the Real World
- Emery Berger
- University of Massachusetts, Amherst
2 Last Time: Demand-Paged VM
- Reading pages
- Swap space
- Page eviction
- Cost of paging
- Page replacement algorithms
- Evaluation
3 Virtual Memory in the Real World
- Implementing exact LRU
- Approximating LRU
  - Hardware Support
  - Clock
  - Segmented queue
- Multiprogramming
  - Global LRU
  - Working Set
4 Implementing Exact LRU
- On each reference, time-stamp the page
- When we need to evict, select the oldest page: the least recently used
Reference string: A, B, C, B, C, C, D
Time stamps after each reference:
  time 1, A referenced: A 1
  time 2, B referenced: A 1, B 2
  time 3, C referenced: A 1, B 2, C 3
  time 4, B referenced: A 1, B 4, C 3
  time 5, C referenced: A 1, B 4, C 5
  time 6, C referenced: A 1, B 4, C 6
  time 7, D referenced: A 1, B 4, C 6, D 7
- How should we implement this?
12 Implementing Exact LRU: Data Structures
- Could keep pages in order: optimizes eviction
  - Priority queue: update O(log n), eviction O(log n)
- Optimize for the common case!
  - Common case: hits, not misses
  - Hash table: update O(1), eviction O(n)
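The hash-table option above can be sketched as follows. This is a minimal simulation, not an OS implementation: the class name `TimestampLRU`, its methods, and the use of a logical counter as the time stamp are assumptions made for illustration.

```python
class TimestampLRU:
    """Exact LRU using a hash table that maps page -> last-reference time."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.clock = 0    # logical time, incremented on every reference
        self.stamp = {}   # resident page -> time of its most recent reference

    def reference(self, page):
        """Touch a page; return the evicted page, or None if nothing was evicted."""
        self.clock += 1
        evicted = None
        if page not in self.stamp and len(self.stamp) >= self.num_frames:
            # Miss with memory full: O(n) scan for the oldest time stamp.
            evicted = min(self.stamp, key=self.stamp.get)
            del self.stamp[evicted]
        self.stamp[page] = self.clock   # O(1) update, paid on every reference
        return evicted
```

With four frames and the reference string A, B, C, B, C, C, D, no page is evicted and the table ends as A 1, B 4, C 6, D 7. One more new page would evict A.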
13 Cost of Maintaining Exact LRU
- Hash tables: too expensive
  - On every reference:
    - Compute hash of page address
    - Update time stamp
  - Unfortunately: 10x to 100x more expensive!
14 Cost of Maintaining Exact LRU
- Alternative: doubly-linked list
  - Move items to front when referenced
  - LRU items at end of list
- Still too expensive
  - 4-6 pointer updates per reference
- Can we do better?
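To make the pointer-update cost concrete, here is a small sketch of the doubly-linked-list scheme. The names `Node`, `ListLRU`, and `order` are hypothetical. The dictionary used to find a page's node is a convenience of the simulation, and the comments count only the list pointer writes.

```python
class Node:
    def __init__(self, page):
        self.page = page
        self.prev = None
        self.next = None


class ListLRU:
    """Exact LRU: most recently used page at the front, LRU page at the end."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.nodes = {}             # page -> Node (simulation-only lookup)
        self.head = Node(None)      # sentinel before the most recently used page
        self.tail = Node(None)      # sentinel after the least recently used page
        self.head.next = self.tail
        self.tail.prev = self.head

    def _unlink(self, node):        # 2 pointer updates
        node.prev.next = node.next
        node.next.prev = node.prev

    def _push_front(self, node):    # 4 pointer updates
        node.prev = self.head
        node.next = self.head.next
        self.head.next.prev = node
        self.head.next = node

    def reference(self, page):
        """Touch a page; return the evicted page, or None."""
        evicted = None
        node = self.nodes.get(page)
        if node is not None:
            self._unlink(node)              # hit: will be moved to the front
        else:
            if len(self.nodes) >= self.num_frames:
                lru = self.tail.prev        # O(1) eviction from the end
                self._unlink(lru)
                del self.nodes[lru.page]
                evicted = lru.page
            node = Node(page)
            self.nodes[page] = node
        self._push_front(node)
        return evicted

    def order(self):
        """Pages from most to least recently used."""
        out, node = [], self.head.next
        while node is not self.tail:
            out.append(node.page)
            node = node.next
        return out
```

Eviction is now O(1), but every hit still pays for an unlink plus a push to the front, which is the per-reference cost the slide objects to.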
16 Hardware Support
- Maintain reference bits for every page
- On each access, set reference bit to 1
- Page replacement algorithm periodically resets reference bits
- Evict page with reference bit 0
- Cost per miss: O(n)
Reference string: A, B, C, B, C, C, D
Reference bits after each step:
  A, B, C referenced: A 1, B 1, C 1
  reset reference bits: A 0, B 0, C 0
  B referenced: A 0, B 1, C 0
  C referenced: A 0, B 1, C 1
  C referenced: A 0, B 1, C 1
  D referenced: A 0, B 1, C 1, D 1 (A is the page with reference bit 0)
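The reference-bit scheme can be simulated in a few lines. In real hardware the MMU sets the bit as a side effect of the access. Here the hypothetical `access` method stands in for that, and `RefBitMemory`, `reset`, and `victim` are likewise names chosen for this sketch.

```python
class RefBitMemory:
    """Simulated reference bits: one bit per resident page."""

    def __init__(self):
        self.ref = {}   # resident page -> reference bit (0 or 1)

    def access(self, page):
        # Stands in for the hardware setting the bit on every access.
        self.ref[page] = 1

    def reset(self):
        # Done periodically by the page replacement algorithm.
        for page in self.ref:
            self.ref[page] = 0

    def victim(self):
        # O(n) scan on a miss for a page whose bit is still 0.
        for page, bit in self.ref.items():
            if bit == 0:
                return page
        return None     # every page was referenced since the last reset
```

Replaying the slide's example (A, B, C, then a reset, then B, C, C, D) leaves A as the only page with bit 0, so `victim()` returns A. The bits only say whether a page was touched since the last reset, not when, which is why this is an approximation of LRU.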
23 The Clock Algorithm
- Variant of FIFO and LRU
- Keep frames in circle
- On page fault, OS:
  - Checks reference bit of next frame
  - If reference bit is 0: replace page, set bit to 1
  - If reference bit is 1: set bit to 0, advance pointer to next frame
Reference string: A, B, C, D, B, C, E, F, C, G (four frames)
Frames and reference bits after each step:
  A, B, C, D loaded; B, C referenced: A 1, B 1, C 1, D 1
  E faults; pointer clears A, B, C, D in turn: A 0, B 0, C 0, D 0
  E replaces A: E 1, B 0, C 0, D 0
  F faults; pointer clears E: E 0, B 0, C 0, D 0
  F replaces B: E 0, F 1, C 0, D 0
  C referenced: E 0, F 1, C 1, D 0
  G faults; pointer clears F, then C: E 0, F 0, C 0, D 0
  G replaces D: E 0, F 0, C 0, G 1
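The clock rule above can be written as a short simulation. This is a sketch under two assumptions not stated on the slide: free frames are filled in order without moving the pointer, and the pointer stays on a frame after replacing its page. The class and method names are made up for the example.

```python
class Clock:
    """Clock (second-chance) page replacement over a fixed circle of frames."""

    def __init__(self, num_frames):
        self.pages = [None] * num_frames   # page held by each frame
        self.ref = [0] * num_frames        # reference bit of each frame
        self.hand = 0                      # index of the next frame to check

    def access(self, page):
        """Reference a page; return the evicted page, or None."""
        if page in self.pages:                   # hit: hardware sets the bit
            self.ref[self.pages.index(page)] = 1
            return None
        if None in self.pages:                   # free frame: no eviction needed
            i = self.pages.index(None)
            self.pages[i], self.ref[i] = page, 1
            return None
        while self.ref[self.hand] == 1:          # bit 1: clear it and advance
            self.ref[self.hand] = 0
            self.hand = (self.hand + 1) % len(self.pages)
        evicted = self.pages[self.hand]          # bit 0: replace this page
        self.pages[self.hand], self.ref[self.hand] = page, 1
        return evicted
```

Running it with four frames on A, B, C, D, B, C, E, F, C, G evicts A, then B, then D, and ends with frames E, F, C, G and reference bits 0, 0, 0, 1. The loop always terminates because each frame it passes has its bit cleared, so after one full turn it must find a 0.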
37 Enhancing Clock
- Recall: we don't write back unmodified pages
- Idea: favor eviction of unmodified pages
  - Extend hardware to keep another bit: the modified bit
- Total order of tuples (ref bit, mod bit):
  - (0,0), (0,1), (1,0), (1,1)
- Evict page from lowest nonempty class
38 Page Replacement in Enhanced Clock
- OS scans at most three times
  - Page (0,0): replace that page
  - Page (0,1): write out page, clear mod bit
  - Page (1,0), (1,1): clear reference bit
- Passes (worst case):
  - After pass 1: all pages are (0,0) or (0,1)
  - Pass 2: all pages are (0,1), so write out pages
  - Pass 3: all pages are (0,0), so replace any page
- Fast, but still a coarse approximation of LRU
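The scan rules for enhanced clock translate into the following sketch. The function name, the dictionary keys `page`, `ref`, and `mod`, and the treatment of a write-out as an immediate, synchronous step are all assumptions for illustration. A real kernel would schedule the write and move on.

```python
def enhanced_clock_evict(frames, hand=0):
    """Pick a frame to replace using (ref bit, mod bit) classes.

    frames: list of dicts with keys 'page', 'ref', 'mod' (bits are 0 or 1).
    Returns (index of the frame to replace, list of pages written out).
    """
    written = []
    n = len(frames)
    for _ in range(3 * n):                 # at most three passes over the circle
        f = frames[hand]
        if f["ref"] == 0 and f["mod"] == 0:
            return hand, written           # (0,0): replace this page
        if f["ref"] == 0:                  # (0,1): write out, clear mod bit
            written.append(f["page"])
            f["mod"] = 0
        else:                              # (1,0) or (1,1): clear reference bit
            f["ref"] = 0
        hand = (hand + 1) % n
    raise AssertionError("unreachable: after two passes every frame is (0,0)")
```

The three-pass bound follows from the rules themselves: one visit turns (1,x) into (0,x), a second turns (0,1) into (0,0), so a frame seen for the third time is always replaceable.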
39 Segmented Queue
- Real systems segment queue into two parts
  - Approximate for frequently-referenced pages
    - e.g., first 1/3 of page frames: fast
  - Exact LRU for infrequently-referenced pages
    - Last 2/3 of page frames: doubly-linked list, precise
- How do we move between the two segments?
(Figure labels: clock segment, exact LRU segment)
41 Multiprogramming and VM
- Multiple programs compete for main memory
  - Processes move memory from and to disk
  - Pages needed by one process may get squeezed out by another process
- Thrashing: effective cost of memory access = cost of disk access = really, really bad
- Must balance memory across processes to avoid thrashing
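A quick calculation shows why thrashing is so damaging. The formula is the usual weighted average of hit and miss costs. The latencies (100 ns for memory, 10 ms for disk) are round, hypothetical figures chosen only to show the scale, and the function name is made up for this sketch.

```python
def effective_access_time(miss_rate, mem_ns=100.0, disk_ns=10_000_000.0):
    """Average cost of one memory access in nanoseconds.

    miss_rate: fraction of accesses that page-fault (0.0 to 1.0).
    mem_ns, disk_ns: assumed latencies, not measured values.
    """
    return (1.0 - miss_rate) * mem_ns + miss_rate * disk_ns
```

Under these assumptions, a miss rate of only 0.001 already raises the average access from 100 ns to about 10,100 ns, roughly a hundredfold slowdown. As the miss rate approaches 1, the average approaches the disk latency itself.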
42 Global LRU
- Put all pages from all processes in one pool
  - Manage with LRU (Segmented Queue)
  - Used by Linux, BSD, etc.
- Advantages
  - Easy
- Disadvantages
  - Many
43 Global LRU: Disadvantages
- No isolation between processes
  - One process touching many pages can force another process's pages to be evicted
- Priority ignored, or inverted
  - All processes treated equally
- Greedy (or wasteful) processes rewarded
  - Programs with poor locality squeeze out those with good locality
  - Result: more page faults
44 Global LRU: Disadvantages, Cont.
- Sleepyhead problem
  - Intermittent, important process
  - Every time it wakes up: no pages! Back to sleep...
- Susceptible to denial of service
  - Non-paying guest, lowest priority, marches over lots of pages and gets all available memory
- Alternatives?
45 Working Set
- Denning: only run processes whose working set fits in RAM
  - Other processes: deactivate (suspend)
- Working set: pages touched in the last τ references
- Provides isolation
  - A process's reference behavior only affects itself
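The working-set definition above is easy to state in code. In this sketch the window size is a parameter named `tau`, and the function name and the convention that the window includes the reference at index `t` are assumptions.

```python
def working_set(refs, t, tau):
    """Pages touched in the last `tau` references, up to and including index t.

    refs: the process's reference string as a list of page names.
    t: index of the current reference (0-based).
    tau: window size, in number of references.
    """
    start = max(0, t - tau + 1)
    return set(refs[start:t + 1])
```

For the reference string A, B, C, B, C, C, D, the working set at the last reference is {C, D} with a window of 3 and {A, B, C, D} with a window of 7. The result depends heavily on the window chosen.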
46 Working Set: Problems
- Algorithm relies on a key parameter, τ
  - How do we set τ?
  - Is there one correct τ?
    - Different processes have different timescales over which they touch pages
- Not acceptable (or necessarily possible) to suspend processes altogether
- Not really used
  - Very rough variant used in Windows
47 Alternative Approach
- Just buy more RAM!
  - Simplifies memory management
  - Workload fits in RAM: no more swapping!
- Sounds great
48 Memory Prices Over Time
(Chart annotation: soon it will be free)
49 Memory Prices: Inflection Point!
50 Memory Is Actually Expensive
- Desktops
  - Most ship with 256MB
  - 1GB: 50 more
  - Laptops: 70, if possible
    - Limited capacity
- Servers
  - Buy 4GB, get 1 CPU free!
  - Sun Enterprise 10000: 8GB extra, 150,000!
  - Fast RAM: new technologies
  - Cosmic rays
8GB Sun RAM = 1 Ferrari Modena
51 New Approach: SAVMM
- Scheduler-Aware Virtual Memory Management
- Scheduler: proportion of desired CPU time
  - Like lottery scheduling
- VM: allocates memory to meet proportion
  - Inversely related to miss rate
  - Use recent reference behavior
    - Can calculate misses for any size allocation
  - Picks allocation with highest throughput, nearest proportion
- In development (Berger, Kaplan et al.)
  - Simulation, Linux kernel
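The slides do not say how SAVMM calculates misses for any allocation size. One well-known technique that does this for LRU is to record each reference's LRU stack distance. The sketch below shows that technique as an assumption, not as SAVMM's actual method, and the function names are hypothetical.

```python
def lru_stack_distances(refs):
    """For each reference, its depth in the LRU stack (1 = most recently used),
    or None if the page has not been referenced before."""
    stack = []        # most recently used page first
    distances = []
    for page in refs:
        if page in stack:
            distances.append(stack.index(page) + 1)
            stack.remove(page)
        else:
            distances.append(None)
        stack.insert(0, page)
    return distances


def lru_misses(refs, num_frames):
    """Misses that LRU would incur with `num_frames` frames.

    A reference hits exactly when its stack distance is at most num_frames,
    so one set of distances answers the question for every allocation size.
    """
    return sum(1 for d in lru_stack_distances(refs)
               if d is None or d > num_frames)
```

For the reference string A, B, C, B, C, C, D this gives 6 misses with one frame and 4 misses with two or more. The 4 first-touch misses cannot be avoided by any allocation.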
52 Summary: Virtual Memory in the Real World
- Implementing exact LRU
- Approximating LRU
  - Hardware Support
  - Clock
  - Segmented queue
- Multiprogramming
  - Global LRU
  - Working Set
53 Summary: If You Have to Spend ...
- more Ferraris: good
- more memory: bad