Title: Virtual Memory
1Virtual Memory
CS147 Lecture 18
- Prof. Sin-Min Lee
- Department of Computer Science
5Fixed (Static) Partitions
- Attempt at multiprogramming using fixed partitions: one partition for each job.
- Size of each partition is designated by reconfiguring the system; partitions can't be too small or too large.
- Critical to protect each job's memory space.
- Entire program stored contiguously in memory during entire execution.
- Internal fragmentation is a problem.
6 Simplified Fixed Partition Memory Table (Table 2.1)
7 Table 2.1: Main memory use during fixed partition allocation. Job 3 must wait.
Job List: J1 30K, J2 50K, J3 30K, J4 25K

Partition    Size    Original State    After Job Entry
1            100K    (empty)           Job 1 (30K)
2            25K     (empty)           Job 4 (25K)
3            25K     (empty)           (empty)
4            50K     (empty)           Job 2 (50K)
8Dynamic Partitions
- Available memory kept in contiguous blocks; jobs are given only as much memory as they request when loaded.
- Improves memory use over fixed partitions.
- Performance deteriorates as new jobs enter the system: fragments of free memory are created between blocks of allocated memory (external fragmentation).
9Dynamic Partitioning of Main Memory
Fragmentation (Figure 2.2)
10Dynamic Partition Allocation Schemes
- First-fit: Allocate the first partition that is big enough.
  - Keep free/busy lists organized by memory location (low-order to high-order).
  - Faster in making the allocation.
- Best-fit: Allocate the smallest partition that is big enough.
  - Keep free/busy lists ordered by size (smallest to largest).
  - Produces the smallest leftover partition.
  - Makes best use of memory.
11First-Fit Allocation Example (Table 2.2)
Job List: J1 10K, J2 20K, J3 30K, J4 10K

Memory      Memory       Job      Job               Internal
location    block size   number   size    Status    fragmentation
10240       30K          J1       10K     Busy      20K
40960       15K          J4       10K     Busy      5K
56320       50K          J2       20K     Busy      30K
107520      20K                           Free
Total Available: 115K    Total Used: 40K
12 Best-Fit Allocation Example (Table 2.3)
Job List: J1 10K, J2 20K, J3 30K, J4 10K

Memory      Memory       Job      Job               Internal
location    block size   number   size    Status    fragmentation
40960       15K          J1       10K     Busy      5K
107520      20K          J2       20K     Busy      None
10240       30K          J3       30K     Busy      None
56320       50K          J4       10K     Busy      40K
Total Available: 115K    Total Used: 70K
13First-Fit Memory Request
14Best-Fit Memory Request
15Best-Fit vs. First-Fit
- First-Fit
  - Increases memory use
  - Memory allocation takes less time
  - Increases internal fragmentation
  - Discriminates against large jobs
- Best-Fit
  - More complex algorithm
  - Searches entire table before allocating memory
  - Results in a smaller leftover free space (sliver)
16Release of Memory Space Deallocation
- Deallocation for fixed partitions is simple: the Memory Manager resets the status of the memory block to "free."
- Deallocation for dynamic partitions tries to combine free areas of memory whenever possible:
  - Is the block adjacent to another free block?
  - Is the block between 2 free blocks?
  - Is the block isolated from other free blocks?
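The three cases above reduce to merging a freed block with any adjacent free neighbors. A minimal sketch (the function name and the sorted free-list representation are my own assumptions):

```python
def deallocate(free_list, start, size):
    """Return a new free list with (start, size) inserted and any adjacent
    free blocks merged. free_list holds (start, size) tuples. Covers all
    three cases: merge with one neighbor, with two, or stand alone."""
    blocks = sorted(free_list + [(start, size)])
    merged = [blocks[0]]
    for s, sz in blocks[1:]:
        prev_start, prev_size = merged[-1]
        if prev_start + prev_size == s:       # adjacent: join into one block
            merged[-1] = (prev_start, prev_size + sz)
        else:
            merged.append((s, sz))            # isolated: keep as its own block
    return merged
```

For example, freeing the 20-unit block at address 10 between free blocks (0, 10) and (30, 10) yields a single free block (0, 40).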
17 Case 1: Joining 2 Free Blocks
18 Case 2: Joining 3 Free Blocks
19 Case 3: Deallocating an Isolated Block
20Relocatable Dynamic Partitions
- Memory Manager relocates programs to gather all empty blocks and compact them into one memory block.
- Memory compaction (also called garbage collection or defragmentation) is performed by the OS to reclaim fragmented sections of memory space.
- Memory Manager optimizes use of memory and improves throughput by compacting and relocating.
21Compaction Steps
- Relocate every program in memory so they're contiguous.
- Adjust every address, and every reference to an address, within each program to account for the program's new location in memory.
- Must leave alone all other values within the program (e.g., data values).
22 Memory Before and After Compaction (Figure 2.5)
23 Contents of relocation register and close-up of Job 4 memory area: (a) before relocation, (b) after relocation and compaction (Figure 2.6)
24Virtual Memory
- Virtual Memory (VM): the ability of the CPU and the operating system software to use the hard disk drive as additional RAM when needed (a safety net).
- Good: no longer get "insufficient memory" errors.
- Bad: performance is very slow when accessing VM.
- Solution: more RAM.
25Motivations for Virtual Memory
- Use Physical DRAM as a Cache for the Disk
  - Address space of a process can exceed physical memory size
  - Sum of address spaces of multiple processes can exceed physical memory
- Simplify Memory Management
  - Multiple processes resident in main memory, each with its own address space
  - Only active code and data is actually in memory
  - Allocate more memory to a process as needed
- Provide Protection
  - One process can't interfere with another, because they operate in different address spaces
  - User process cannot access privileged information; different sections of address spaces have different permissions
26Virtual Memory
27Levels in Memory Hierarchy
Level       Size         Speed    $/MB         Line size
Register    32 B         1 ns                  8 B
Cache       32 KB-4 MB   2 ns     $100/MB      32 B
Memory      128 MB       50 ns    $1.00/MB     4 KB
Disk        20 GB        8 ms     $0.006/MB

Moving down the hierarchy: larger, slower, cheaper. Transfer unit between levels: 8 B (register-cache), 32 B (cache-memory), 4 KB (memory-disk, i.e., virtual memory pages).
28DRAM vs. SRAM as a Cache
- DRAM vs. disk is more extreme than SRAM vs. DRAM
- Access latencies:
  - DRAM is ~10X slower than SRAM
  - Disk is ~100,000X slower than DRAM
- Importance of exploiting spatial locality:
  - The first byte is ~100,000X slower than successive bytes on disk
  - vs. a ~4X improvement for page-mode vs. regular accesses to DRAM
- Bottom line: design decisions made for DRAM caches are driven by the enormous cost of misses
29Locating an Object in a Cache (cont.)
- DRAM Cache
  - Each allocated page of virtual memory has an entry in the page table
  - Mapping from virtual pages to physical pages (from uncached form to cached form)
  - A page table entry exists even if the page is not in memory
    - It specifies the disk address
    - The OS retrieves the information
30A System with Physical Memory Only
- Examples: most Cray machines, early PCs, nearly all embedded systems, etc.
- Addresses generated by the CPU point directly to bytes in physical memory.
31A System with Virtual Memory
- Examples: workstations, servers, modern PCs, etc.
- Address translation: hardware converts virtual addresses to physical addresses via an OS-managed lookup table (the page table).
32Page Faults (Similar to Cache Misses)
- What if an object is on disk rather than in memory?
  - The page table entry indicates the virtual address is not in memory
  - The OS exception handler is invoked to move data from disk into memory
  - The current process suspends; others can resume
  - The OS has full control over placement, etc.
- (Figure: before the fault, the page table maps the virtual address to disk; after the fault, it maps to physical memory.)
33Terminology
- Cache: a small, fast buffer that lies between the CPU and main memory, which holds the most recently accessed data.
- Virtual Memory: program and data are assigned addresses independent of the amount of physical main memory actually available and the location from which the program will actually be executed.
- Hit ratio: probability that the next memory access is found in the cache.
- Miss rate = 1.0 - hit ratio.
34Importance of Hit Ratio
- Given:
  - h = hit ratio
  - Ta = average effective memory access time seen by the CPU
  - Tc = cache access time
  - Tm = main memory access time
- Effective memory access time:
  - Ta = h*Tc + (1 - h)*Tm
- Speedup due to the cache:
  - Sc = Tm / Ta
- Example:
  - Assume a main memory access time of 100 ns, a cache access time of 10 ns, and a hit ratio of 0.9.
  - Ta = 0.9(10 ns) + (1 - 0.9)(100 ns) = 19 ns
  - Sc = 100 ns / 19 ns = 5.26
  - Same as above, only the hit ratio is now 0.95:
  - Ta = 0.95(10 ns) + (1 - 0.95)(100 ns) = 14.5 ns
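The arithmetic above is easy to reproduce; a minimal sketch (the function name is my own):

```python
def effective_access(h, tc, tm):
    """Slide formula: Ta = h*Tc + (1 - h)*Tm (all times in ns)."""
    return h * tc + (1 - h) * tm

ta_90 = effective_access(0.90, 10, 100)   # ~19 ns
speedup_90 = 100 / ta_90                  # Sc = Tm / Ta, ~5.26
ta_95 = effective_access(0.95, 10, 100)   # ~14.5 ns
```

Note how raising the hit ratio from 0.90 to 0.95 cuts the effective access time from 19 ns to 14.5 ns: misses dominate the average.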
35Cache vs Virtual Memory
- Primary goal of Cache: increase speed.
- Primary goal of Virtual Memory: increase space.
36Cache Replacement Algorithms
- The replacement algorithm determines which block in the cache is removed to make room.
- 2 main policies used today:
  - Least Recently Used (LRU): the block replaced is the one unused for the longest time.
  - Random: the block replaced is completely random; a counter-intuitive approach.
37LRU vs Random
- Below is a sample table comparing miss rates for both LRU and Random.

Cache Size    Miss Rate LRU    Miss Rate Random
16 KB         4.4%             5.0%
64 KB         1.4%             1.5%
256 KB        1.1%             1.1%

- As the cache size increases there are more blocks to choose from, so the choice is less critical: the probability of replacing the block that's needed next is relatively low.
38Virtual Memory Replacement Algorithms
- 1) Optimal
- 2) First In First Out (FIFO)
- 3) Least Recently Used (LRU)
39Optimal
- Replace the page which will not be used for the longest (future) period of time.

Reference string: 1 2 3 4 1 2 5 1 2 5 3 4 5 (faults are shown in boxes; hits are not shown).
7 page faults occur.
40Optimal
- A theoretically best page replacement algorithm for a given fixed size of VM.
- Produces the lowest possible page fault rate.
- Impossible to implement, since it requires future knowledge of the reference string.
- Used only to gauge the performance of real algorithms against the theoretical best.
41FIFO
- When a page fault occurs, replace the page that was brought in first.

Reference string: 1 2 3 4 1 2 5 1 2 5 3 4 5 (faults are shown in boxes; hits are not shown).
9 page faults occur.
42FIFO
- Simplest page replacement algorithm.
- Problem: can exhibit inconsistent behavior, known as Belady's anomaly.
  - The number of faults can increase if the job is given more physical memory (i.e., not predictable).
43Example of FIFO Inconsistency
- Same reference string as before, only with 4 frames instead of 3.

Reference string: 1 2 3 4 1 2 5 1 2 5 3 4 5 (faults are shown in boxes; hits are not shown).
10 page faults occur.
44LRU
- Replace the page which has not been used for the longest period of time.

Reference string: 1 2 3 4 1 2 5 1 2 5 3 4 5 (faults are shown in boxes; hits only rearrange the LRU stack).
9 page faults occur.
45LRU
- More expensive to implement than FIFO, but it is more consistent.
- Does not exhibit Belady's anomaly.
- More overhead is needed, since the stack must be updated on each access.
46Example of LRU Consistency
- Same reference string as before, only with 4 frames instead of 3.

Reference string: 1 2 3 4 1 2 5 1 2 5 3 4 5 (faults are shown in boxes; hits only rearrange the LRU stack).
7 page faults occur.
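The FIFO, LRU, and optimal examples on the preceding slides can all be checked with a small simulator. This is an illustrative sketch (the function name and policy strings are my own); it reproduces the slide fault counts, including Belady's anomaly for FIFO.

```python
def count_faults(refs, frames, policy):
    """Count page faults for 'fifo', 'lru', or 'opt' on a reference string."""
    mem, faults = [], 0
    for i, page in enumerate(refs):
        if page in mem:
            if policy == "lru":               # a hit only rearranges the stack
                mem.remove(page)
                mem.append(page)
            continue
        faults += 1
        if len(mem) == frames:                # frames full: pick a victim
            if policy == "opt":               # farthest (or no) next use loses
                future = refs[i + 1:]
                victim = max(mem, key=lambda p: future.index(p)
                             if p in future else len(future))
            else:                             # fifo and lru both evict the head
                victim = mem[0]
            mem.remove(victim)
        mem.append(page)
    return faults

# Reference string used in the slide examples
refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 5, 3, 4, 5]
```

With 3 frames this yields 7 faults for optimal and 9 for both FIFO and LRU; with 4 frames, FIFO rises to 10 faults (Belady's anomaly) while LRU drops to 7.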
47Servicing a Page Fault
(1) Initiate Block Read
- Processor signals the I/O controller: read a block of length P starting at disk address X and store it starting at memory address Y.
(2) Read occurs via Direct Memory Access (DMA), under control of the I/O controller.
(3) I/O controller signals completion: it interrupts the processor, and the OS resumes the suspended process.
48Handling Page Faults
- A memory reference that cannot be satisfied causes a fault, called a page fault.
- A page fault can happen at any time and place:
  - On an instruction fetch
  - In the middle of an instruction's execution
- The system must save all state, move the page from disk to memory, restore state, and restart the faulting instruction.
- Backing up the PC is not easy; it is hard to find out by how much, so hardware help is needed.
49Page Fault
- If there is ever a reference to a page not in memory, the first reference will trap to the OS (a page fault):
  - Hardware traps to the kernel; general registers are saved
  - OS determines which virtual page is needed
  - OS checks validity of the address and seeks a page frame
  - If the selected frame is dirty, write it to disk
  - OS schedules the new page to be brought in from disk
  - Page tables are updated
  - The faulting instruction is backed up to where it began
  - The faulting process is scheduled; registers are restored; the program continues
50What to Page in
- Demand paging brings in only the faulting page.
- To bring in additional pages, we would need to know the future.
- Users don't really know the future, but some OSs have user-controlled pre-fetching.
- In real systems:
  - Load the initial page
  - Start running
  - Some systems (e.g., WinNT) will bring in additional neighboring pages (clustering)
51VM Page Replacement
- If there is an unused page, use it.
- If there are no pages available, select one (policy?) and:
  - If it is dirty (M = 1), write it to disk
  - Invalidate its PTE and TLB entry
  - Load in the new page from disk
  - Update the PTE and TLB entry!
  - Restart the faulting instruction
- What is the cost of replacing a page?
- How does the OS select the page to be evicted?
52Measuring Demand Paging Performance
- Page Fault Rate (p)
  - 0 <= p <= 1.0 (from no page faults to every reference being a fault)
- Page Fault Overhead
  - = fault service overhead + time to read the page + process restart overhead
  - Dominated by the time to read the page in
- Effective Access Time
  - = (1 - p) * (memory access time) + p * (page fault overhead)
53Performance Example
- Memory access time = 100 nanoseconds
- Page fault overhead = 25 milliseconds (msec)
- Page fault rate = 1/1000
- EAT = (1 - p) * 100 ns + p * (25 msec)
  - = (1 - p) * 100 + p * 25,000,000 ns
  - = 100 + 24,999,900 * p ns
  - = 100 + 24,999,900 * (1/1000), which is about 25 microseconds!
- Want less than 10% degradation?
  - 110 > 100 + 24,999,900 * p
  - 10 > 24,999,900 * p
  - p < 0.0000004, or 1 fault in roughly every 2,500,000 accesses!
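The calculation above can be replayed directly; a minimal sketch (the function name is my own):

```python
def eat_ns(p, mem_ns, fault_ns):
    """EAT = (1 - p) * memory access time + p * page fault overhead (in ns)."""
    return (1 - p) * mem_ns + p * fault_ns

# Slide numbers: 100 ns memory access, 25 ms fault overhead, p = 1/1000
eat = eat_ns(1 / 1000, 100, 25_000_000)   # about 25 microseconds

# Largest fault rate that keeps EAT under 110 ns (less than 10% degradation)
p_max = (110 - 100) / 24_999_900
```

Even one fault per thousand references inflates a 100 ns access to roughly 25,000 ns, which is why the tolerable fault rate is so tiny.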
54Page Replacement Algorithms
- Want the lowest page-fault rate.
- Evaluate an algorithm by running it on a particular string of memory references (a reference string) and computing the number of page faults on that string.
- Reference string: ordered list of pages accessed as the process executes.
- Ex.: the reference string is A B C A B D A D B C B.
55The Best Page to Replace
- The best page to replace is the one that will never be accessed again.
- Optimal Algorithm (Belady's Algorithm):
  - Lowest fault rate for any reference string
  - Basically, replace the page that will not be used for the longest time in the future.
  - If you know the future, please see me after class!!
- Belady's Algorithm is a yardstick; we want to find close approximations.
56Page Replacement - FIFO
- FIFO is simple to implement:
  - When a page comes in, place its page id at the end of the list
  - Evict the page at the head of the list
- Might be good? The page to be evicted has been in memory the longest time.
- But maybe it is being used; we just don't know.
- FIFO suffers from Belady's Anomaly: the fault rate may increase when there is more physical memory!
57FIFO vs. Optimal
- Reference string: A B C A B D A D B C B; the system has 3 page frames.
- OPTIMAL: 5 faults
  - On the final fault (C), toss A or D, since neither is used again.
- FIFO: 7 faults
  - Fault order: A B C D A B C
58Second Chance
- Maintain a FIFO page list.
- On a page fault, check the reference bit of the page at the head:
  - If R = 1, move the page to the end of the list and clear R
  - If R = 0, evict the page
59Clock Replacement
- Create a circular list of PTEs in FIFO order.
- One-handed clock: the pointer (hand) starts at the oldest page.
  - Algorithm: FIFO, but check the reference bit
  - If R = 1, set R = 0 and advance the hand
  - Evict the first page with R = 0
- Looks like a clock hand sweeping PTE entries.
- Fast, but the worst case may take a lot of time.
- Two-handed clock: add a 2nd hand that is n PTEs ahead; the 2nd hand clears the reference bit.
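The one-handed clock above can be sketched in a few lines. This is an illustrative model, not OS code (the class and method names are my own); reference bits that a real MMU would set in hardware are set explicitly here.

```python
class Clock:
    """One-handed clock sketch: a FIFO sweep that spares pages with R = 1."""
    def __init__(self, nframes):
        self.frames = [None] * nframes        # resident page numbers
        self.ref = [0] * nframes              # reference (R) bits
        self.hand = 0

    def access(self, page):
        """Touch a page; returns True if the access was a page fault."""
        if page in self.frames:
            self.ref[self.frames.index(page)] = 1   # hardware sets R on use
            return False
        if None in self.frames:                     # a frame is still free
            i = self.frames.index(None)
        else:
            while self.ref[self.hand] == 1:         # second chance: clear R
                self.ref[self.hand] = 0
                self.hand = (self.hand + 1) % len(self.frames)
            i = self.hand                           # first R = 0 page is evicted
            self.hand = (self.hand + 1) % len(self.frames)
        self.frames[i] = page
        self.ref[i] = 1
        return True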
60Not Recently Used Page Replacement Algorithm
- Each page has a Reference bit and a Modified bit.
  - The bits are set when the page is referenced or modified.
- Pages are classified:
  - Class 0: not referenced, not modified
  - Class 1: not referenced, modified
  - Class 2: referenced, not modified
  - Class 3: referenced, modified
- NRU removes a page at random from the lowest-numbered non-empty class.
61Least Recently Used (LRU)
- Replace the page that has not been used for the
longest time
3 Page Frames Reference String - A B C A B D A
D B C
LRU 5 faults
A B C A B D A D B C
62LRU
- Past experience may indicate future behavior.
- Perfect LRU requires some form of timestamp to be associated with a PTE on every memory reference!!!
- Counter implementation:
  - Every page entry has a counter; every time the page is referenced through this entry, copy the clock into the counter.
  - When a page needs to be replaced, look at the counters to determine which page to evict.
- Stack implementation: keep a stack of page numbers in doubly linked form.
  - When a page is referenced, move it to the top.
  - No search is needed for replacement.
63LRU Approximations
- Aging:
  - Keep a counter for each PTE
  - Periodically check the Reference bit:
    - If R = 0, increment the counter (page has not been used)
    - If R = 1, clear the counter (page has been used)
    - Set R = 0
  - The counter contains the number of intervals since the last access
  - Replace the page with the largest counter value
- Clock replacement
64Contrast Macintosh Memory Model
- Mac OS
  - Does not use traditional virtual memory
  - All program objects are accessed through handles: indirect references through a pointer table
  - Objects are stored in a shared global address space
65Macintosh Memory Management
- Allocation / Deallocation
  - Similar to free-list management of malloc/free
- Compaction
  - Can move any object and just update the (unique) pointer in the pointer table
66Mac vs. VM-Based Memory Mgmt
- Allocating, deallocating, and moving memory can be accomplished by both techniques.
- Block sizes:
  - Mac: variable-sized; may be very small or very large
  - VM: fixed-size; size is equal to one page (4 KB on x86 Linux systems)
- Allocating contiguous chunks of memory:
  - Mac: contiguous allocation is required
  - VM: can map a contiguous range of virtual addresses to disjoint ranges of physical addresses
- Protection:
  - Mac: a wild write by one process can corrupt another's data
67MAC OS X
- A modern operating system:
  - Virtual memory with protection
  - Preemptive multitasking (other versions of Mac OS require processes to voluntarily relinquish control)
- Based on the Mach OS, developed at CMU in the late 1980s
75Page Replacement Policy
- Working Set:
  - The set of pages used actively and heavily
  - Kept in memory to reduce page faults
  - The set is found/maintained dynamically by the OS
- Replacement: the OS tries to predict which page would have the least impact on the running program.

Common replacement schemes: Least Recently Used (LRU), First-In-First-Out (FIFO).
76Page Replacement Policies
- Least Recently Used (LRU):
  - Generally works well
  - TROUBLE: when the working set is larger than main memory

Working set = 9 pages; pages are executed in sequence (0 through 8, repeating): THRASHING.
77Page Replacement Policies
- First-In-First-Out (FIFO):
  - Removes the least recently loaded page
  - Does not depend on use
  - Determined by the number of page faults seen by a page
78Page Replacement Policies
- Upon replacement, we need to know whether to write the data back, so add a dirty bit:
  - Dirty bit = 0: page is clean; no writing needed
  - Dirty bit = 1: page is dirty; must write it back
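The dirty-bit check above amounts to one branch on eviction. A minimal sketch (the function name and page-table-entry layout are my own assumptions):

```python
def evict(page_table, victim, disk):
    """Evict a page; write it to disk only if its dirty bit is set.
    page_table maps page number -> {"dirty", "data", "present"} (a
    hypothetical layout). Returns True if a write-back was needed."""
    entry = page_table[victim]
    wrote_back = False
    if entry["dirty"]:                        # dirty bit = 1: must write back
        disk[victim] = entry["data"]
        wrote_back = True
    entry["present"] = False                  # clean pages are simply dropped
    return wrote_back
```

This is why eviction cost varies: a clean victim is free to drop, while a dirty victim adds a full disk write to the fault service time.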