Title: CS 2200
1CS 2200
2Memory Management
3Things to think about
- Olden Days
- Main Program
- Call Procedure A
- Call Procedure B
- Call Procedure C
- Call Procedure D
- Call Procedure B
- Call Procedure E
- Call Procedure F
[Figure: memory layout, top to bottom: Procedure F, Procedure E, Procedure D, Procedure C, (Physical Memory Size), Procedure B, Procedure A, Main Program.]
4Things to think about
[Figure: revised memory layout. Labels shown: Procedure C, Procedure F, Procedure B, Procedure E, Procedure D, Procedure A, Main Program.]
5Goals
- Programs and data must be in main memory
- To keep CPU busy, we need multiple processes available
- Must work with secondary storage systems (e.g. disk).
6Issue: Addressing
- Originally addresses were fixed
- Single program running on computer.
- Amount of memory known
- Requirements change
- May want multiple programs in memory at the same time.
- Amount of memory may change
- Might even want to move program!!!
- Must consider address binding
7Address Binding
- We decide to have a variable
- The HLL lets us refer to it by some symbolic name like x
- Eventually it must refer to a real, live address in the machine.
- When does that decision get made?
- As a start, what are the choices?
9Address Binding
- Compile Time
- What would we have to know?
- Address where program will reside
- Means that moving program will require recompile
- Is this for real?
- MS-DOS .com format
- Load Time
- How would this work?
- Compiler must generate relocatable code
- If starting address changes we just reload
- Relocatable code?
10Consider
- By being clever, we can make programs relocatable
- Position Independent Code
- beq R1, R2, PC Offset
11Address Binding
- Execution Time
- What would we need for this?
- Special hardware (used by most general purpose
O.S.'s)
12Some Techniques
- Dynamic Loading
- Only load a routine when needed
- Can be managed by user
- Dynamic Linking
- Typically used with system libraries
- Uses stubs to locate library routines
- Allows sharing
13Main Concepts
- Logical/Virtual Addresses vs. Physical
- Swapping
- Moving a program from one place in memory to another
- Paging
- Solution to external fragmentation
- Segmentation
- A logical approach?
- Virtual Memory
- A hybrid approach
14- Can we find a way to allow the programmer and the
CPU to think that they are in one place in memory
while in reality they are somewhere else???
15Addresses
- Logical addresses
- Generated by CPU
- Also called virtual addresses
- All program manipulation done with logical address value (e.g. pointers, etc.)
- Requires Memory Management Unit
- Address space 0 - max
16Addresses
- Physical addresses
- Generated by MMU
- All actual memory access done with physical address value.
- Address space R - (R + max)
- Simplified schematic
[Figure: the CPU issues Logical Address 346; the Relocation Register is added to produce Physical Address 42346, which is sent to MEMORY.]
17Where is virtual memory?
18Questions?
19Swapping
- Moving processes from main memory to secondary (e.g. disk) memory.
- If location in memory is changed, what is required?
- Fast?
- What could go wrong?
- Hint: I/O
- First, consider a simple case
- Just want to have one user job and OS
- Want to protect OS from user!
20Single Partition Allocation
- Goals
- Maintain OS in low memory.
- Protect OS from users
- Implementation
- Relocation Register
- Limit Register
21Schematically
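[Figure: the CPU's address (user view, 0 to U) is checked against the LIMIT register; an out-of-range address causes a trap, and an in-range address has RELOCATION (X) added so that it lands in the user area of physical memory, X to X + U.]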
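The relocation and limit check in this schematic can be sketched in a few lines. This is a minimal illustration, not the hardware's actual logic: the names `translate` and `AddressTrap` are made up here, the relocation value 42000 is assumed because it is consistent with the 346 to 42346 example in the earlier schematic, and the limit of 1000 is purely illustrative.

```python
class AddressTrap(Exception):
    """Signals the trap taken when an address is outside the limit."""


def translate(logical, relocation, limit):
    """Return the physical address, or raise AddressTrap if out of range."""
    if not 0 <= logical < limit:
        raise AddressTrap(f"address {logical} is outside 0..{limit - 1}")
    return relocation + logical


# Assumed register values: relocation 42000 matches the 346 -> 42346 schematic;
# the limit of 1000 is made up for illustration.
print(translate(346, relocation=42000, limit=1000))
```

The user program only ever sees logical addresses; the check and the addition happen on every reference.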
22Multiple Partition Allocation
- Want multiple users in memory at same time
- Simple technique: MFT (OS/360)
- Fixed size partitions
- One process per partition
- Didn't seem too flexible in 1966!
- More flexible: MVT
- Start with OS and large free space
- Free space (or spaces) known as hole (or holes)
23Implementation
- We have multiple processes in memory
- Only one is active
- We load the relocation and limit register for the process that is running
- We might store the relocation and limit register values for non-running jobs in their PCB
- The OS must manage this
24OS
P1 600
P2 1000
P3 300
25OS
P1 600
P3 300
26OS
P1 600
P4 700
P3 300
27OS
P4 700
P3 300
28OS
P5 500
P4 700
P3 300
29Obvious Problems?
31Algorithms
- First-fit (FASTER)
- Can search from either end
- Start at beginning or where left off
- Best-fit
- Allocate smallest hole that is big enough
- Must search entire list (or keep sorted list)
- Produces smallest leftover hole
- Worst-fit (WORST PERFORMANCE)
- Allocate largest hole
- produces largest leftover hole
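The three placement policies differ only in which fitting hole they choose. The sketch below assumes holes are kept as a simple list of sizes in address order; the function name `pick_hole` and the sample sizes are hypothetical.

```python
def pick_hole(holes, request, strategy):
    """Return the index of the chosen hole, or None if no hole is big enough.

    holes: free-hole sizes in address order (an assumed representation).
    """
    fits = [(size, i) for i, size in enumerate(holes) if size >= request]
    if not fits:
        return None
    if strategy == "first":
        return min(i for _, i in fits)                     # first hole that fits
    if strategy == "best":
        return min(fits)[1]                                # smallest hole that fits
    if strategy == "worst":
        return max(fits, key=lambda f: (f[0], -f[1]))[1]   # largest hole
    raise ValueError(f"unknown strategy: {strategy}")


holes = [100, 500, 200, 300, 600]   # made-up hole sizes
for strategy in ("first", "best", "worst"):
    print(strategy, pick_hole(holes, 212, strategy))
```

First-fit can stop at the first match, which is why it is the fast one; best-fit and worst-fit have to look at every hole unless the list is kept sorted by size.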
32Fragmentation
- External
- Space between processes
- Over time using N blocks may result in an additional 0.5N blocks being unused (50-percent rule)
- Solution: Compaction
- May be costly
- But need contiguous spaces!
- Internal Fragmentation
- For the sake of efficiency typically give process
more than is needed
33Paging
- Solution to external fragmentation.
- Divide logical address space into non-contiguous regions of physical memory.
- Commonly used technique in many operating systems.
34Basic Method
- Break Physical memory into fixed-sized blocks called Frames.
- Break Logical memory into the same-sized blocks called Pages.
- Disk also broken into blocks of the same size.
35Hardware
[Figure: the CPU issues a logical address (page, offset); the page table maps page to frame; the physical address (frame, offset) is sent to Physical Memory.]
36Decimal Example
- Block size: 1000 words
- Memory: 1000 frames
- Memory size: ?
37Decimal Example
- Block size: 1000 words
- Memory: 1000 frames
- Memory size: 1,000,000 words
38Decimal Example
- Block size: 1000 words
- Memory: ? frames
- Memory size: 10,000,000 words
- Assume addresses go up to 10,000,000. How big is the page table?
39Decimal Example
- Block size: 1000 words
- Memory: 10,000 frames
- Memory size: 10,000,000 words
- Assume addresses go up to 10,000,000. How big is the page table? 10,000 entries
40Decimal Example
- Block size: 1000 words
- Memory: 10,000 frames
- Memory size: 10,000,000 words
[Figure: the CPU issues page 42, offset 356; the page table maps page 42 to frame 256, giving frame 256, offset 356 in Physical Memory.]
41Questions?
42What's the right picture?
Physical Address Space
Logical Address Space
43What's the right picture?
Logical Address Space
Physical Address Space
44What's the right picture?
Physical Address Space
Logical Address Space
45Question
1
2
3
46What's the right picture?
Logical Address Space
Physical Address Space
47Questions?
48Valid Bit
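[Figure: the CPU issues page 42, offset 356. The page table entry for page 42 holds frame 256 and is marked valid (v), giving frame 256, offset 356 in Physical Memory; another entry holds 000 and is marked invalid (i).]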
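49Tiny Example: 32-byte memory with 4-byte pages
Page Table
Page:  0 1 2 3
Frame: 5 6 1 2
Logical memory (address contents)
Page 0: 0 a, 1 b, 2 c, 3 d
Page 1: 4 e, 5 f, 6 g, 7 h
Page 2: 8 i, 9 j, 10 k, 11 l
Page 3: 12 m, 13 n, 14 o, 15 p
Physical memory (address contents)
Frame 0: 0 to 3 (empty)
Frame 1: 4 i, 5 j, 6 k, 7 l
Frame 2: 8 m, 9 n, 10 o, 11 p
Frame 3: 12 to 15 (empty)
Frame 4: 16 to 19 (empty)
Frame 5: 20 a, 21 b, 22 c, 23 d
Frame 6: 24 e, 25 f, 26 g, 27 h
Frame 7: 28 to 31 (empty)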
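The tiny example can be checked mechanically. The sketch below uses only the numbers on the slide (4-byte pages, page table 5 6 1 2, letters a to p); the function name `translate` is just for illustration.

```python
PAGE_SIZE = 4
page_table = [5, 6, 1, 2]            # page number -> frame number, from the slide

# Physical memory contents from the slide: i..p at 4..11, a..h at 20..27.
physical = [None] * 32
for addr, ch in zip(range(4, 12), "ijklmnop"):
    physical[addr] = ch
for addr, ch in zip(range(20, 28), "abcdefgh"):
    physical[addr] = ch


def translate(logical):
    """Split a logical address into (page, offset) and map it to a physical address."""
    page, offset = divmod(logical, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset


for logical in (0, 3, 4, 13):
    pa = translate(logical)
    print(f"logical {logical:2d} -> physical {pa:2d} holds {physical[pa]}")
```

Reading logical addresses 0 through 15 in order returns a through p, even though the pages sit in frames 5, 6, 1, and 2.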
50Test Yourself
- A processor asks for the contents of virtual memory address 0x10020. The paging scheme in use breaks this into a VPN of 0x10 and an offset of 0x020.
- PTR (a CPU register that holds the address of the page table) has a value of 0x100, indicating that this process's page table starts at location 0x100.
- The machine uses word addressing and the page table entries are each one word long.
51Test Yourself
ADDR     CONTENTS
0x00000  0x00000
0x00100  0x00010
0x00110  0x00022
0x00120  0x00045
0x00130  0x00078
0x00145  0x00010
0x10000  0x03333
0x10020  0x04444
0x22000  0x01111
0x22020  0x02222
0x45000  0x05555
0x45020  0x06666
- What is the physical address calculated?
- 10020
- 22020
- 45000
- 45020
- none of the above
52Test Yourself
- What is the physical address calculated?
- What are the contents of this address returned to the processor?
- How many memory accesses in total were required to obtain the contents of the desired address?
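One way to check an answer to this exercise is to trace the lookup in code. The sketch below copies the memory table from the slide and makes two assumptions that the slide implies but does not state outright: a page table entry holds the frame number directly, and the offset is 12 bits wide (three hex digits). The function name `load` is hypothetical.

```python
# Memory contents copied from the exercise (word addressed).
memory = {
    0x00000: 0x00000, 0x00100: 0x00010, 0x00110: 0x00022,
    0x00120: 0x00045, 0x00130: 0x00078, 0x00145: 0x00010,
    0x10000: 0x03333, 0x10020: 0x04444, 0x22000: 0x01111,
    0x22020: 0x02222, 0x45000: 0x05555, 0x45020: 0x06666,
}
PTR = 0x100          # start of this process's page table
OFFSET_BITS = 12     # assumed: offset 0x020 is three hex digits


def load(virtual):
    """Return (physical address, contents, number of memory accesses)."""
    accesses = 0
    vpn = virtual >> OFFSET_BITS
    offset = virtual & ((1 << OFFSET_BITS) - 1)
    frame = memory[PTR + vpn]        # page table lookup, one word per entry
    accesses += 1
    physical = (frame << OFFSET_BITS) | offset
    value = memory[physical]         # the actual data reference
    accesses += 1
    return physical, value, accesses


physical, value, accesses = load(0x10020)
print(hex(physical), hex(value), accesses)
```

Under these assumptions the page table entry is read from 0x100 + 0x10 = 0x110, so the translation costs one extra memory access before the data itself is fetched.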
53Observations
- Paging is like having a relocation register for each page.
- No external fragmentation
- Some internal fragmentation. How much?
- About a frame
- Therefore, frame size should be minimized...
- Well, not exactly.
54Questions?
- How many page tables are there?
- What limit is there on a process's address space?
- What does the OS need to keep track of?
- Frame Table
- How many frames
- Allocated or not
- If allocated...to who?
- What happens during a system call involving an
address?
55How big is a page table?
- Suppose
- 32 bit architecture
- Page size 4 kilobytes
- Therefore
Offset: 2^12
Page Number: 2^20
56How big is a Page Table Entry
- Need physical frame number: 20 bits
- Protection Information? Pages can be
- Read only
- Read/Write
- Possibly other info...so maybe each entry is one 32-bit word.
- So, how big is a page table?
- 4 KB?
- 4 MB?
- 4 GB?
57How big is a Page Table Entry
- Need physical frame number: 20 bits
- Protection Information? Pages can be
- Read only
- Read/Write
- Possibly other info
- So, how big is a page table?
- 2^20 PTEs x 4 bytes/entry = 4 MB
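The page table size follows directly from the three numbers on the slide (32-bit addresses, 4-kilobyte pages, 4-byte entries). A quick arithmetic check, with variable names chosen only for this sketch:

```python
ADDRESS_BITS = 32
PAGE_SIZE = 4 * 1024        # 4 kilobytes
PTE_BYTES = 4               # one 32-bit word per entry

offset_bits = PAGE_SIZE.bit_length() - 1          # log2 of the page size
entries = 2 ** (ADDRESS_BITS - offset_bits)       # one entry per virtual page
table_bytes = entries * PTE_BYTES

print(offset_bits, entries, table_bytes // 2**20, "MB")
```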
58The Penalty Box
- What is the cost in performance of this scheme?
- We must be storing the Page Table in memory so...
- Every memory reference requires
- Page table look up
- Actual memory reference
- What to do?
59Learning about Locality
- What do you suppose is true about referencing memory?
- In terms of say
- Space
- Time
- These are and will be referred to as spatial and
temporal locality
60TCB with the TLB
- So based on the concepts of both temporal and spatial locality we construct a hardware device called a Translation-Lookaside Buffer
- Now doesn't that clear things up?
61[Figure: paging hardware with a TLB. The CPU issues (Page, Offset); the TLB holds a few Page/Frame pairs and is backed by the full Page Table; (Frame, Offset) is sent to Physical Memory.]
62TLB Miss
[Figure: same hardware. The page is not found in the TLB, so the frame comes from the Page Table.]
63TLB Hit
[Figure: same hardware. The page is found in the TLB, so the frame comes directly from the TLB.]
64TLB Fun Facts
- Size: 8 - 4,096 entries
- Hit time: 0.5 - 1 clock cycle
- Miss penalty: 10 - 30 clock cycles
- Miss rate: 0.01% - 1%
- Typical problem
- Assume Hit: 1 clock cycle
- Miss: 30 clock cycles
- Miss rate: 1%
- Effective clock cycles = 1 x 0.99 + 30 x 0.01 = 1.29
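The typical problem is a weighted average of the hit and miss costs. The sketch below uses the slide's own model, in which a miss costs 30 cycles in total; the function name is made up for illustration.

```python
def effective_cycles(hit_cycles, miss_cycles, miss_rate):
    """Average cycles per translation, weighting hits and misses by their rates."""
    return (1 - miss_rate) * hit_cycles + miss_rate * miss_cycles


print(round(effective_cycles(hit_cycles=1, miss_cycles=30, miss_rate=0.01), 2))
```

Trying other values from the ranges above shows how sensitive the average is to the miss rate rather than to the hit time.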
65Notes...
- What has to happen on a context switch?
- Are the TLB entries valid after a context switch?
66Notes...
- How do we know if a TLB (or even a page table entry) is valid or not?
- If the page table is stored in memory how do we get to the page table to look up how to get to the page table...PTBR
- Wait a minute...did you say that each process had a 4 MB Page Table?
67Multi-level Page Tables
- To deal with excessively large page tables we can
page the page table...
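[Figure: the virtual address is split into Page 1, Page 2, and Offset. PTBR locates the Outer Page Table (entries 0 to 1023); Page 1 selects a Page of Page Table (entries 0 to 1023); Page 2 selects the frame in Physical Memory.]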
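A two-level lookup can be sketched as two table indexings followed by the usual frame-plus-offset step. The 10/10/12 bit split below is an assumption: it is consistent with the 0 to 1023 ranges in the figure and with the 4-kilobyte pages used earlier, but the slide does not state it. The dictionaries stand in for tables in memory, and the names `split` and `walk` are hypothetical.

```python
OFFSET_BITS = 12     # assumed 4 KB pages
INDEX_BITS = 10      # assumed 1024 entries per table level


def split(virtual):
    """Break a 32-bit virtual address into (page 1, page 2, offset)."""
    offset = virtual & ((1 << OFFSET_BITS) - 1)
    p2 = (virtual >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    p1 = (virtual >> (OFFSET_BITS + INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
    return p1, p2, offset


def walk(outer_table, virtual):
    """Follow outer table -> page of page table -> frame; KeyError if unmapped."""
    p1, p2, offset = split(virtual)
    inner = outer_table[p1]          # only pages of the table in use need to exist
    frame = inner[p2]
    return (frame << OFFSET_BITS) | offset


outer = {1: {2: 0x2A}}               # made-up mapping for illustration
va = (1 << 22) | (2 << 12) | 0x345
print(split(va), hex(walk(outer, va)))
```

Because the outer table only points at pages of the page table that are actually in use, a sparse address space no longer needs the full 4 MB table.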
68Inverted Page Tables
- One entry for each physical frame
- <Process_ID, Page_Number>
- Each virtual address consists of
- <Process_ID, Page_Number, offset>
- When processor issues memory request entire table is searched for match.
- Use hashing to get performance
- Also use form of TLB
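The difference between searching the whole table and hashing into it can be shown with a small model. This is a sketch only: the class and method names are invented, and a Python dictionary stands in for whatever hash structure real hardware or an OS would use.

```python
class InvertedPageTable:
    """One entry per physical frame, holding (process id, page number)."""

    def __init__(self, num_frames):
        self.frames = [None] * num_frames   # frame -> (pid, page)
        self.index = {}                     # (pid, page) -> frame, the hash

    def map(self, frame, pid, page):
        """Record that `frame` now holds `page` of process `pid`."""
        old = self.frames[frame]
        if old is not None:
            del self.index[old]
        self.frames[frame] = (pid, page)
        self.index[(pid, page)] = frame

    def lookup_linear(self, pid, page):
        """Search the entire table for a match, as the slide describes."""
        for frame, entry in enumerate(self.frames):
            if entry == (pid, page):
                return frame
        return None

    def lookup_hashed(self, pid, page):
        """Same answer via the hash, without scanning every frame."""
        return self.index.get((pid, page))


table = InvertedPageTable(num_frames=8)
table.map(frame=5, pid=7, page=42)
print(table.lookup_linear(7, 42), table.lookup_hashed(7, 42))
```

The table size depends on the number of physical frames, not on the size of any process's address space.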
69Sharing Code
- Requires reentrant code
- Non self-modifying code
- Operating system enforces Read-Only
- Multiple processes can now map a page into the same frame
- Major memory savings
- Does not work well with inverted page tables.
70Questions?
71Segmentation
72Segmentation
- Programmers logically think of the address space as being broken into different segments for different purposes
- Example segments
- Code
- Locals
- Stacks
- etc.
73Segmentation
- Different segments have different sizes
- Each segment is mapped to a contiguous physical memory location
- Virtual addresses now look like this: <segment number, offset>
- For each segment we need to know
- Starting address in physical memory (base)
- Size of segment (limit)
74Segmentation
For each process we have a segment table (in
memory or special hardware)
Base
Limit
75Segmentation
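[Figure: the virtual address holds seg num 2 and an offset. The segment table entry supplies Base and Limit; the offset is compared (<) against the Limit and added to the Base. The value 5000 appears in the figure.]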
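Segment translation is a limit check followed by an addition, done per segment rather than per process. In the sketch below every table value is hypothetical, including the use of 5000 as the base of segment 2; the names `translate_segment` and `SegmentFault` are invented for illustration.

```python
class SegmentFault(Exception):
    """Raised when the offset is not below the segment's limit."""


# Hypothetical segment table: index is the segment number, value is (base, limit).
segment_table = [(1400, 1000), (6300, 400), (5000, 1100)]


def translate_segment(seg, offset):
    """Check the offset against the limit, then add the base."""
    base, limit = segment_table[seg]
    if not 0 <= offset < limit:
        raise SegmentFault(f"offset {offset} is outside segment {seg}")
    return base + offset


print(translate_segment(2, 53))
```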
76Segmentation
- Pure segmentation has memory fragmentation problems
- Scheduler must find space for all segments for a given process
- Today some processors use combinations of paging and segmentation.
77Questions?
78Virtual Memory
79Five Classic Components
Processor
Memory
Control
Datapath
81What happens...
- When a process is running and it does some I/O?
- Does system call (like an interrupt)
- OS saves state of process and puts in I/O queue.
- Process at head of ready queue gets context switched in
- Control passed to active process
82But, how many processes can we fit into memory at
the same time???
83Quick Review...
- We know that an entire process could be swapped out to disk and eventually swapped back in to a different place
- We know that a process can be broken up into pages and that each logical page can correspond to a physical frame using a page table
84Virtual Memory
- Scheme that allows execution of a process that is not 100% in memory.
85Advantages
- Allows processes to be larger than physical memory
- Abstracts memory so programmers don't have to worry about it.
- Allows (a portion of) many more processes to be in memory at the same time
Most of the time
87Background
- We've seen how memory can be paged
- But doesn't the whole program have to be in memory?
- Think about
- Error handling code
- Data structures may be bigger than needed
- Certain operations may not be used
88Background
- Programmer can use all of the address space
- But actual amount used might be smaller so more processes at same time
- Improved
- CPU Utilization
- Throughput
- Not improved
- Response time
- Turnaround time
89Demand Paging
- Paging systems
- Swapping systems
- Combine the two and make the swapping lazy!
- Remember the invalid bit?
Page Table
90Valid/Invalid Bit
- Before
- Used to indicate a page that the process was not allowed to use
- Encountering it absolutely meant an error had occurred.
- Now
- Indicates either the page is still on disk OR the page is truly invalid
- The PCB must contain information to allow the processor to determine which of the two has occurred
91Page Fault
[Figure sequence, slides 91 to 104, showing the CPU, page table, Physical Memory, Operating System, and Disk. Steps shown:]
- CPU references page 42, offset 356; the page table entry is marked invalid (i)
- TRAP! to the Operating System
- OpSys says page is on disk
- Small detail: OpSys must somehow maintain list of what is on disk
- OpSys locates a Free Frame
- Page is brought into the free frame; page table entry becomes 295, valid (v)
- Restart Instruction
- CPU's page 42, offset 356 now maps to frame 295, offset 356
- Now it works fine!
105New Hardware Requirements?
- No, not really.
- We needed the valid bit for paging.
- We needed the disk for swapping
- So demand paging is all software!!!
- Well almost!
- What happens when page fault occurs during
instruction fetch? Memory operation?
106Page Fault
- During Fetch
- Load appropriate page
- Restart instruction
- During Memory Reference
- Load appropriate page
- Restart instruction
- If instruction can modify things
- Check all references at beginning OR
- Save results to restore memory
109Performance of Demand Paging
- Assume probability of page fault is p
- So 0 ≤ p ≤ 1
- Effective access time
- = (1 - p) x ma + p x pageFaultTime
- = (1 - p) x 100 ns + p x 25,000,000 ns
- = 100 + 24,999,900 x p (ns)
- If p = 0.001
- Effective access time ≈ 25 µs (a 250x slowdown!)
110Performance of Demand Paging
- If we want only 10% degradation in performance
- 110 ns > 100 ns + 25,000,000 ns x p
- 10 > 25,000,000 x p
- p < 0.0000004.
- Thus, at most 1 memory access in 2,500,000 can page fault.
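The numbers in this performance analysis are easy to reproduce. The sketch below assumes the same figures as the slides (100 ns memory access, 25,000,000 ns page fault service time) and uses the exact weighted form, in which a faulting access pays only the fault time; the function name is made up.

```python
MA_NS = 100                 # memory access time from the slides
FAULT_NS = 25_000_000       # page fault service time from the slides


def effective_access_ns(p, ma_ns=MA_NS, fault_ns=FAULT_NS):
    """Weighted average of a normal access and a page-faulting access."""
    return (1 - p) * ma_ns + p * fault_ns


# One fault per thousand accesses.
print(effective_access_ns(0.001))

# Largest p that keeps the effective time under 110 ns (10% degradation).
max_p = (110 - MA_NS) / (FAULT_NS - MA_NS)
print(max_p, round(1 / max_p))
```

The exact bound differs from the slide's rounded 0.0000004 only in the seventh significant digit, so the conclusion of roughly one fault per 2,500,000 accesses is unchanged.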
111Questions?
112See any problems with Page Replacement so far?
113Page Replacement
- A process may be 10 pages in size but may use only half these pages.
- In a multiprogramming environment this means we will be able to bring in more processes
- But what if some processes suddenly need their 10 pages?
- Run out of free frames
Free Frame
114What to do???
- Page fault occurs
- We have no free frames
- Select victim frame
- Write victim page to disk
- Change page and frame tables
- Read the desired page into the (new) free frame
- Change page and frame tables
- Restart the user process.
115Page Replacement
- Notice page replacement requires double the disk access
- Remember that 25,000,000???
- We can add a dirty bit indicating that the page in memory has been modified.
- Still two key areas for performance
- Page replacement algorithms
- Frame allocation algorithms
116Page Replacement Algorithms
- Goal: Minimum page fault rate
- Can test algorithms on random numbers or actual sequences of instructions/data
- Only care about different pages
117FIFO
- Maintain queue. As page is read in, enqueue. Use head of queue as frame to replace
- Sample: 1,2,3,4,1,2,5,1,2,3,4,5
118FIFO
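[Figure: page faults versus number of frames for the sample string. Values shown: 12, 12, 9, 10. The rise from 9 to 10 is labeled Belady's Anomaly.]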
119FIFO
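Reference string: 1 2 3 4 1 2 5 1 2 3 4 5
Frame 1:          1 1 1 1 1 1 1 1 1 1 1 1
Frame 2:            2 2 2 2 2 2 2 2 2 2 2
Frame 3:              3 3 3 3 3 3 3 3 3 3
Frame 4:                4 4 4 4 4 4 4 4 4
Frame 5:                      5 5 5 5 5 5
Page faults: 5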
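The FIFO policy and the sample reference string can be replayed for different frame counts to see Belady's Anomaly directly. This is a small simulation sketch; the function name `fifo_faults` is made up.

```python
from collections import deque


def fifo_faults(refs, num_frames):
    """Count page faults when the oldest resident page is always the victim."""
    resident = set()
    queue = deque()                  # arrival order; head is the next victim
    faults = 0
    for page in refs:
        if page in resident:
            continue                 # hit: FIFO order is not updated
        faults += 1
        if len(resident) == num_frames:
            resident.remove(queue.popleft())
        resident.add(page)
        queue.append(page)
    return faults


refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
for n in range(1, 6):
    print(n, "frames:", fifo_faults(refs, n), "faults")
```

Going from 3 frames to 4 frames increases the fault count for this string, which is the anomaly.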
120Optimal
- Replace the page that will not be used for the longest period of time.
- Hah!
- Difficult to implement (this is a joke)
- Used as benchmark to compare other algorithms
121LRU
- Least Recently Used
- Assume recent past is an indication of near future
- Performance good but how to implement?
- Store counter representing time last used
- Stack of page numbers (doubly linked list)
- Remove page number when used
- Put on top of stack
- Bottom is then LRU
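The stack implementation described above maps naturally onto an ordered dictionary: a referenced page is pulled out and put back on top, and the bottom is the victim. This is a software sketch for counting faults, not a claim about how hardware does it; `lru_faults` is a made-up name, and the reference string is the one from the FIFO slides.

```python
from collections import OrderedDict


def lru_faults(refs, num_frames):
    """Count page faults when the least recently used page is the victim."""
    stack = OrderedDict()            # last item is the top (most recently used)
    faults = 0
    for page in refs:
        if page in stack:
            stack.move_to_end(page)  # remove from the middle, put on top
            continue
        faults += 1
        if len(stack) == num_frames:
            stack.popitem(last=False)    # bottom of the stack is the LRU page
        stack[page] = True
    return faults


refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
for n in (3, 4, 5):
    print(n, "frames:", lru_faults(refs, n), "faults")
```

Unlike FIFO on the same string, adding frames here never increases the fault count.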
122LRU Approximation
- Add reference bit
- Set when page is accessed
- Additional reference bits algorithm
- Every 100 ms put reference bit into high order bit, shifting all bits right and discarding low order bit
- 00000000 (Not used)
- 11001100
- 00110011
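The additional-reference-bits update is a single shift per page per interval. The sketch below assumes an 8-bit history, as in the bit patterns above; the function name `age` is hypothetical.

```python
def age(history, referenced):
    """Shift the 8-bit history right and put the reference bit in the high-order bit."""
    return ((history >> 1) | (0x80 if referenced else 0)) & 0xFF


history = 0b11001100
history = age(history, referenced=False)
print(format(history, "08b"))

# Interpreting the histories as unsigned numbers, the smallest is the LRU candidate.
pages = {"A": 0b00000000, "B": 0b11001100, "C": 0b00110011}
print(min(pages, key=pages.get))
```

After each interval the hardware reference bit would be cleared so that the next interval starts fresh.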
123LRU Approximation
- Second chance
- Maintain pointer to next victim
- Essentially uses FIFO but if reference bit is set keep looking.
- As pointer moves it sets reference bit to zero
- If all reference bits are set ends up back where it started
- Degenerates into FIFO if all reference bits are set
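The second chance scan can be written as a short loop over the reference bits. This sketch assumes the bits are held in a list indexed by frame and that the pointer is an index into that list; the function name is invented.

```python
def second_chance(ref_bits, hand):
    """Return (victim frame, new pointer position).

    Frames whose reference bit is set get a second chance: the bit is
    cleared and the pointer moves on. The list is modified in place.
    """
    n = len(ref_bits)
    while ref_bits[hand]:
        ref_bits[hand] = 0
        hand = (hand + 1) % n
    return hand, (hand + 1) % n


bits = [1, 0, 1, 1]
print(second_chance(bits, hand=0), bits)

all_set = [1, 1, 1]
print(second_chance(all_set, hand=0), all_set)
```

In the second call every bit is set, so the pointer clears them all and comes back to where it started, which is exactly the FIFO choice.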
124Counting Algorithms
- Least frequently used
- Replace page with smallest count
- Suffers with page used a lot in the beginning
- Can remedy by periodic shifting
- Most frequently used
- Argument is that the page with the smallest count has probably just been brought in and has high potential to be used!
Rarely used/expensive/poor performance
125Page Buffering Algorithms
- Maintain a pool of free pages
- Bring the needed page into the pool
- Move the page to be replaced out, freeing its space for pool
- Maintain list of modified pages
- When device idle write pages and flip dirty bit to clean
126Allocation of Frames
- Minimum Number of Frames
- Process must have enough frames to execute any of its instructions
- Allocation Algorithms
- Equal allocation
- Proportional allocation
- Global vs. Local Allocation
- Which frames can process select from?
127Thrashing
- Causes
- OS monitors CPU usage
- If CPU usage too low introduce an additional process
- Global page replacement is being used
- A process needs a lot of pages and takes them from other processes
- These other processes start faulting
- Soon everyone is doing nothing
- OS brings in more processes
128Thrashing
- Working-Set Model
- Define the working set to be the frames the process actually needs.
- Make sure the process has enough
- Define a Δ or time interval to look back
- If total demand (Σ ws_i) too high, suspend a process
- Page-Fault Frequency
- Frequency too high add more frames
- Frequency too low take away frames
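The working-set idea can be sketched by looking back over the last Δ references of each process and summing the set sizes. The sketch measures Δ in references rather than in clock time, which is an assumption, and the names `working_set` and `over_committed` are made up.

```python
def working_set(refs, t, delta):
    """Pages touched in the last `delta` references, ending at index t."""
    start = max(0, t - delta + 1)
    return set(refs[start:t + 1])


def over_committed(working_sets, total_frames):
    """True when total demand exceeds the frames available, so a process should be suspended."""
    return sum(len(ws) for ws in working_sets) > total_frames


refs = [1, 2, 1, 3, 4, 4, 4, 3]      # made-up reference string
early = working_set(refs, t=3, delta=4)
late = working_set(refs, t=7, delta=4)
print(early, late, over_committed([early, late], total_frames=4))
```

The same process needs three frames early in this string and only two later, which is why the working set has to be re-measured over time.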
129Other Considerations
- Prepaging
- e.g. Give process back its working set
- Page Size
- Trend is to bigger
- Tradeoff
- Smaller: Less I/O, less allocated memory
- Larger: Less effect of overhead
- Program Loading
- Can load entire program into swap space
130Other Considerations
- Program Structure
- Wait! Something that applies to YOU!
- I/O Interlock
- DMA?
- Two solutions
- Double buffering (disk to system, system to user)
- Locking (Add lock bit)
- Real-Time Processing
- Doesn't work well with RTS
- Have tried mix
131Question
- What works well with Virtual Memory
- Stacks
- Hashed symbol table
- Sequential search
- Binary search
- Pure code
- Indirection
132Questions?