Title: OS_Ch401'ppt
1. Memory Management
- Without swapping or paging
[Figure: three simple memory layouts for the OS and one user program - (a) OS in RAM at the bottom, user program above; (b) OS in ROM at the top of memory; (c) device drivers in ROM at the top, OS in RAM at the bottom; addresses run from 0 up to 0xFF..]
2. Multiprogramming with fixed partitions
- How to organize the memory?
- How to assign jobs to partitions?
- Separate queues vs. a single queue
3. Relocation and Linking
- Compile time - create absolute code
- Load time - the linker lists relocatable instructions and the loader changes them (at each reload..)
- Execution time - special hardware is needed to support moving processes during run time
- Dynamic Linking - used with system libraries; includes only a stub in each user routine, indicating how to locate the memory-resident library function (or how to load it, if needed)
4. Memory Protection
- Hardware
- The IBM 360 had a 4-bit protection code in the PSW and memory in 2 KB partitions - the process code in the PSW must match the memory partition's code
- Two registers - base and limit
- the base is added by hardware without changing instructions
- every request is checked against the limit
- The IBM PC has segment registers (but no limit)
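The base/limit check can be sketched in a few lines of Python (an illustration of the idea, not of any specific machine):

```python
def translate(virtual_addr, base, limit):
    # every request is checked against the limit before the base is added
    if virtual_addr >= limit:
        raise MemoryError("protection fault: address beyond segment limit")
    return base + virtual_addr
```

With base 4096 and limit 1024, virtual address 100 maps to physical 4196, while address 2000 faults.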
5. Multiprogramming - Swapping
- Processes move from disk to memory and from memory to disk
- Whenever there are too many jobs to fit in memory
- To use memory more efficiently, move to variable partitions
- Allocating memory
- Freeing memory and holes
- a possible solution: memory compaction
- another need: dynamic growth of program memory
6. Allocating memory - a series
7. Allocating memory - growing segments
8. Memory allocation - keeping track (bitmaps, linked lists)
9. Strategies for Allocation
- First fit
- Next fit - start the search from the last location
- Best fit - drawback: generates small holes
- Worst fit - tries to solve the above problem, but performs badly
- Quick fit - several queues of different sizes
- Buddy system (Knuth, 1973)
- Separate lists of free holes of sizes that are powers of two
- For any request, pick the first hole of the right size
- Not very good memory utilization
- Freed blocks can only be merged with their buddy of the same size
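The first/best/worst fit strategies above can be sketched over a free-hole list (a minimal Python illustration; representing `holes` as (start, length) pairs is an assumption):

```python
def first_fit(holes, size):
    # holes: list of (start, length); return the start of the first fitting hole
    for start, length in holes:
        if length >= size:
            return start
    return None

def best_fit(holes, size):
    # smallest hole that fits - tends to leave tiny unusable holes
    fitting = [h for h in holes if h[1] >= size]
    return min(fitting, key=lambda h: h[1])[0] if fitting else None

def worst_fit(holes, size):
    # largest hole that fits - leaves the biggest leftover, but performs badly
    fitting = [h for h in holes if h[1] >= size]
    return max(fitting, key=lambda h: h[1])[0] if fitting else None
```

For holes [(0, 5), (10, 12), (30, 8)] and a request of size 6, first fit picks 10, best fit picks 30, worst fit picks 10.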
10. The Buddy System
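The buddy discipline can be sketched in Python (a minimal model over an abstract memory of 2^max_order units; real allocators keep per-block metadata). The key facts are that blocks split into halves and a freed block merges only with its buddy, found by XOR-ing the block size into the address:

```python
class Buddy:
    # minimal buddy allocator over a memory of 2**max_order units
    def __init__(self, max_order):
        self.max_order = max_order
        self.free = {k: set() for k in range(max_order + 1)}
        self.free[max_order].add(0)

    def alloc(self, order):
        # find a free block of the right power-of-two size, splitting bigger ones
        k = order
        while k <= self.max_order and not self.free[k]:
            k += 1
        if k > self.max_order:
            return None
        addr = self.free[k].pop()
        while k > order:            # split down to the requested size
            k -= 1
            self.free[k].add(addr + (1 << k))
        return addr

    def free_block(self, addr, order):
        # a freed block can only merge with its buddy of the same size
        while order < self.max_order:
            buddy = addr ^ (1 << order)
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)
            addr = min(addr, buddy)
            order += 1
        self.free[order].add(addr)
```

Allocating two order-2 blocks from a 16-unit memory yields addresses 0 and 4; freeing both coalesces everything back into one 16-unit block.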
11. Some issues of memory allocation
- Fragmentation
- Internal - wasted parts of allocated space
- External - wasted unallocated space
- Allocating swap space
- Processes are swapped in/out from the same location
- Allocate space only for non-memory-resident processes
12. Paging and Virtual Memory
- Divide memory into fixed-size blocks (page frames)
- Small enough blocks - many for one process
- Allocate non-contiguous memory chunks to processes - avoiding holes..
- 2^32 addresses for a 32-bit (address bus) machine - virtual addresses
- A memory management unit (MMU) maps them to physical addresses - pages -> page frames
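A single-level version of the MMU mapping can be sketched as follows (4 KB pages and a dict-based page table are illustrative assumptions):

```python
PAGE_SIZE = 4096  # assumed 4 KB pages, for illustration

def mmu_translate(vaddr, page_table):
    # page_table maps virtual page number -> page frame number
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table.get(vpn)
    if frame is None:
        raise LookupError("page fault: virtual page %d absent" % vpn)
    return frame * PAGE_SIZE + offset
```

With page table {1: 7}, virtual address 4100 (page 1, offset 4) maps to physical address 7*4096 + 4 = 28676; any access to page 0 faults.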
13. Memory Management Unit
14. Paging
15. MMU Operation - page fault if the accessed page is absent
16. Page table considerations
- Can be very large (1M pages for 32-bit addresses)
- Must be fast (every instruction needs it)
- One extreme keeps it all in hardware - fast registers hold the page table and are loaded with each process; too expensive
- The other extreme keeps it all in memory, so each memory reference during instruction translation is doubled; can be too large...
- To avoid keeping page tables in memory completely - make them multilevel
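A multilevel lookup can be sketched like this (the 10+10+12 bit split of a 32-bit address is an assumption for illustration; only the second-level tables that are actually needed exist):

```python
def walk(vaddr, top_table):
    # split a 32-bit virtual address into two 10-bit indices and a 12-bit offset
    top = (vaddr >> 22) & 0x3FF
    second = (vaddr >> 12) & 0x3FF
    offset = vaddr & 0xFFF
    second_table = top_table.get(top)
    if second_table is None:
        raise LookupError("page fault: no second-level table")
    frame = second_table.get(second)
    if frame is None:
        raise LookupError("page fault")
    return (frame << 12) | offset
```

An address in top slot 1, second-level slot 2 with offset 0x34 translates via frame 5 to (5 << 12) | 0x34.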
17. Page Tables - Handling the size problem
18. PDP-11 1-Level Paging Hardware
- 8 KB pages
- 16-bit addresses - 13-bit offset + 3-bit page number
- Separate Text and Data spaces - page table size 16
- 4 MB physical memory - processes get 64 KB x 2 (text and data)
19. Two-level paging - VAX
- 512-byte pages (small...)
- Division of virtual space into 4 regions: user data and stack, system, ...
- 2M pages (per process)!!!
- Solution - page tables reside in virtual address space and may be paged
[Virtual address format: 2-bit region | 21-bit virtual page number | 9-bit offset]
20. Two-level paging - VAX
21. SPARC 3-level paging
- Context table (in MMU hardware) - 1 entry per process
22. 68030 4-level paging
- Programmable number of page table levels
- A global Translation Control Register (TCR) defines the paging level scheme
- the first few bits of the address can be ignored (limiting the space)
- page sizes from 8 bits to 15 bits of offset
- each of up to four levels can be allocated its own number of bits
- The MMU uses the TCR to translate addresses
23. Associative Memory
- content-addressable memory
- page insertion - a complete entry from the page table
- page deletion - just the modified bit is written back to the page table
24. Associative Memory
- With a large enough hit ratio, the average access time is close to 0
- linked lists, for example, are bad..
- Only a complete virtual address (all levels) can be counted as a hit
- with multiprocessing, the associative memory can be cleared on every context switch - wasteful..
- Add a field to the associative memory to hold the process ID, and a special register for the current PID
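The PID-tagged associative memory can be sketched as follows (a minimal Python model; the eviction choice here is arbitrary rather than hardware-accurate):

```python
class TLB:
    # associative memory keyed by (pid, virtual page); storing the process ID
    # with each entry avoids flushing the whole TLB on a context switch
    def __init__(self, size):
        self.size = size
        self.entries = {}                    # (pid, vpn) -> frame

    def lookup(self, pid, vpn):
        return self.entries.get((pid, vpn))  # None means a TLB miss

    def insert(self, pid, vpn, frame):
        if len(self.entries) >= self.size:
            self.entries.pop(next(iter(self.entries)))  # evict some entry
        self.entries[(pid, vpn)] = frame
```

Process 1's mapping for page 5 does not satisfy a lookup by process 2 for the same virtual page, which is the point of the PID field.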
25. No page tables - MIPS R2000
- 64-entry associative memory for virtual pages
- if not found, TRAP to the operating system
- software uses some hardware registers to find the needed virtual page
- a second trap may happen on a page fault...
26. Too large virtual memory
- 64-bit addresses create the need for gigantic page tables
- Physical memory is much smaller (even if 32-bit)
- invert page tables to be indexed by page frame (hash table)
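An inverted table indexed by (process, virtual page) through a hash can be sketched as follows (an illustrative Python model: one entry per page frame, so the table size tracks physical memory, not the huge virtual space):

```python
def build_inverted(frames):
    # frames: list where frames[f] = (pid, vpn) resident in frame f, or None
    index = {}
    for f, key in enumerate(frames):
        if key is not None:
            index[key] = f          # hash table: (pid, vpn) -> frame
    return index

def lookup(index, pid, vpn):
    f = index.get((pid, vpn))
    if f is None:
        raise LookupError("page fault")
    return f
```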
27. Pages are the data; page frames are the physical memory locations
- Page Table Entries (PTE) contain (per page)
- Page frame number (physical address)
- Present/absent bit (valid bit)
- Dirty (modified) bit
- Referenced (accessed) bit
- Protection
- Caching disable/enable
28. Page Replacement
- Demand paging - page missing?
- Retrieve the page into an empty page frame
- No empty page frame?
- Evict (replace) a page
- Many algorithms are possible for selecting the page to replace
- Optimal page replacement
- Discard the page that will not be used for the longest time ahead
- Not realizable...
- but can be used as a yardstick for real algorithms!!
29. Optimal page replacement
- Demand comes in for pages 7, 5, 1, 0, 5, 4, 7, 0, 2, 1, 0, 7
- with 3 page frames, an optimal algorithm faults on
  7  5  1  (0,1)  -  (4,5)  -  -  (2,4)  (1,2)  -  -
- altogether 7 page faults ((in,out) marks a replacement, - marks a hit)
- take FIFO for example:
  7  5  1  (0,7)  -  (4,5)  (7,1)  -  (2,0)  (1,4)  (0,7)  (7,2)
- 3 additional page faults
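Both traces can be reproduced by a small simulator (assuming 3 page frames, as in the example):

```python
def fifo_faults(refs, frames):
    mem, queue, faults = set(), [], 0
    for p in refs:
        if p not in mem:
            faults += 1
            if len(mem) == frames:
                mem.remove(queue.pop(0))    # evict the oldest page
            mem.add(p)
            queue.append(p)
    return faults

def opt_faults(refs, frames):
    mem, faults = set(), 0
    for i, p in enumerate(refs):
        if p not in mem:
            faults += 1
            if len(mem) == frames:
                # evict the page whose next use lies furthest in the future
                def next_use(q):
                    try:
                        return refs.index(q, i + 1)
                    except ValueError:
                        return float("inf")
                mem.remove(max(mem, key=next_use))
            mem.add(p)
    return faults

refs = [7, 5, 1, 0, 5, 4, 7, 0, 2, 1, 0, 7]
```

Running it on the slide's reference string gives 7 faults for the optimal algorithm and 10 for FIFO.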
30. NRU - Not Recently Used
- There are 4 classes of pages, according to the referenced and modified bits
- Select a page at random from the least-needed class
- Easy scheme to implement
- Prefers evicting an old modified page over a frequently referenced (but not modified) page
- The class R=0, M=1 is interesting; it can only occur when a clock tick clears the referenced bit of a modified page
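The class selection can be sketched as follows (R and M as 0/1 flags; numbering the classes as 2R + M is the usual convention):

```python
import random

def nru_victim(pages):
    # pages: dict page -> (referenced, modified); class = 2*R + M
    # pick at random from the lowest-numbered non-empty class
    best = min(2 * r + m for r, m in pages.values())
    candidates = [p for p, (r, m) in pages.items() if 2 * r + m == best]
    return random.choice(candidates)
```

A modified but unreferenced page (class 1) is chosen before a referenced, clean one (class 2).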
31. Good old FIFO
- implemented as a queue
- the usual drawback:
- the oldest page may be a referenced (needed) page
- second chance FIFO:
- if the reference bit is on - move the page to the end of the queue
- Better to implement as a circular queue
- no overhead of movements on the queue
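The circular-queue variant can be sketched with a "hand" that sweeps the frames (a minimal Python model):

```python
def clock_victim(frames, r_bits, hand):
    # frames: pages in a circular queue; r_bits: their reference bits
    # advance the hand, giving referenced pages a second chance
    while True:
        if r_bits[hand] == 0:
            return hand              # evict this frame
        r_bits[hand] = 0             # second chance: clear R and move on
        hand = (hand + 1) % len(frames)
```

Nothing is moved in memory; clearing a bit and advancing the hand replaces the queue reshuffling of plain second-chance FIFO.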
32. LRU - Least Recently Used
- Approximates the optimal algorithm -
- the most recently used page is the most probable next reference
- Replace the page used furthest in the past
- Not easy to implement - needs counting of references
- Use a large counter (number of operations) and save it in a field of the page table entry on every page reference
- Another option is to use a bit array of n x n bits
- In both cases the page entry with the smallest number attached to it is selected for replacement
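The n x n bit-array variant works by setting row k to all ones and then clearing column k on each reference to page k; the row with the smallest binary value then belongs to the least recently used page. A small sketch:

```python
def reference(matrix, k):
    # on a reference to page k: set row k to all ones, then clear column k
    n = len(matrix)
    matrix[k] = [1] * n
    for row in matrix:
        row[k] = 0

def lru_victim(matrix):
    # the page whose row, read as a binary number, is smallest is LRU
    return min(range(len(matrix)),
               key=lambda i: int("".join(map(str, matrix[i])), 2))
```

After referencing pages 0, 1, 2 in that order, page 0's row is smallest, so it is the replacement victim.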
33. LRU with bit tables
34. NFU - Not Frequently Used
- In order to record frequently used pages, add a counter to all table entries
- At each clock tick, add the R bit to the counters
- Select the page with the lowest counter for replacement
- problem: it remembers everything
- remedy (an aging algorithm):
- shift the counter right before adding the reference bit
- add the reference bit at the left
- Fewer operations than LRU; depends on the intervals used for updating
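One aging step can be sketched as follows (the 8-bit counter width is an assumption):

```python
COUNTER_BITS = 8  # assumed counter width

def tick(counters, r_bits):
    # shift each counter right and add the current R bit at the left
    for i in range(len(counters)):
        counters[i] = (counters[i] >> 1) | (r_bits[i] << (COUNTER_BITS - 1))
```

Starting from zeroed counters, a tick with R bits [1, 0] gives [128, 0]; a further tick with [0, 1] gives [64, 128], so recent references outweigh old ones.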
35. NFU - the aging simulation version
36. Characterizing paging systems
- a reference string (of requested pages)
- the number of virtual pages, n
- the number of physical page frames, m
- a page replacement algorithm
- can be represented by an array M of n rows
37. Stack Algorithms
- Definition: the set of pages in physical memory with m page frames is a subset of the set with m+1 page frames (for every reference string)
- Stack algorithms have no anomaly
- Examples: LRU, optimal replacement
- FIFO is not a stack algorithm (it exhibits Belady's anomaly)
- Useful definition:
- Distance string: distance from the top of the stack
38. (dynamic) Page Allocation Policies
- Demand paging
- a fixed number of pages per process (initially 0 pages loaded)
- Locality of reference - a valid statistical phenomenon
- Working set - the set of pages used by each process
- Thrashing - very frequent page faults
- instructions take microseconds; page faults take milliseconds
- what to do for processes being swapped?
- Working set model - dynamic number of pages per process (can be used for prepaging - load the working set before running the process)
- Keep track by aging with a lookback parameter - WSClock
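The working set for a lookback parameter tau can be sketched as follows (an illustrative model keeping a per-page last-reference time):

```python
def working_set(last_ref, now, tau):
    # pages whose last reference falls within the lookback window tau
    return {page for page, t in last_ref.items() if now - t <= tau}
```

At virtual time 100 with tau = 20, pages last referenced at times 95 and 88 are in the working set, while one last referenced at 60 is not.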
39. Allocation to multiprocessors
- Fair share is not the best policy (static!!)
- allocate according to process size
- there must be a minimum for running a process...
40. Dynamic set - Page Allocation
- reference string: 0 2 1 3 5 4 6 3 7 5 7 3 3 5 5 3
- with 5 page frames (LRU):
  p p p p p p p - p - - - - - - -
- with a dynamic set, window 5 (and LRU):
  p p p p p p p - p - - (4) - (3) - -
  (p marks a page fault; (n) marks page n dropped from the set)
- WSClock: evict a frame if Tp - last_ref(frame) > tau
- Tp0 = 50, Tp1 = 70, Tp2 = 90, tau = 20
  page frame:  0   1   2   3   4   5   6   7   8   9  10
  ref bit:     0   0   1   1   1   0   1   0   0   1   0
  process ID:  0   1   0   1   2   1   0   0   1   2   2
  last_ref:   10  30  52  71  81  37  61  27  31  47  55
41. Page Daemons
- It is useful to keep a number of free pages
- freeing page frames can be done by a page daemon - a process that sleeps most of the time
- awakened periodically to inspect the state of memory - if there are too few free page frames, it frees some
- yet another type of (global) dynamic page replacement policy
- this strategy performs better than evicting pages only when needed (and writing the modified ones to disk in a hurry)
42. Additional issues: Locking and Sharing
- an i/o channel/processor (DMA) transfers data independently
- a page must not be replaced during a transfer
- the OS can use a lock variable per page
- Pages of an editor's code - shared among processes
- swapping out, or terminating, process A (and its pages) may cause many page faults for process B, which shares them
- looking up evicted pages in all page tables is impossible
- solution: maintain special data structures for shared pages
43. Page fault Handling
- 1. trap to kernel, save PC on the stack and (sometimes) partial state in registers (and/or the stack)
- 2. an assembly routine saves volatile information and calls the operating system
- 3. find the requested virtual page
- 4. check protection; if legal, find a free page frame (or invoke the page replacement algorithm)
- 5. if replacing, check if the victim is modified and start writing it to disk; mark the frame busy; call the scheduler to block the process until the write to disk has completed
44. Page fault Handling (cont.)
- 6. transfer the requested page from disk (the scheduler runs alternative processes)
- 7. upon transfer completion, update the page table and mark the new page as valid
- 8. back up the faulted instruction
- 9. schedule the faulting process, return from the operating system
- 10. restore state and restart execution of the faulted process
45. Segmentation
- several logical address spaces per process
- a compiler needs segments for
- source text
- symbol table
- constants segment
- stack
- parse tree
- compiler executable code
- Most of these segments grow during execution
46. Segmentation vs. Paging
47. Segmentation - segment table
48. Segmentation with Paging
- MULTICS combined segmentation and paging
- 2^18 segments of up to 64k words (36 bits each)
- addresses are 34 bits -
- 18-bit segment number
- 16 bits - page number (6) + offset within page (10)
- Each process has a segment table (located via the STBR)
- the segment table is itself a segment and is paged (8-bit page + 10-bit offset); the STBR is added to the 18-bit segment number
- Each segment is a separate virtual memory with a page table (indexed by the 6 bits)
- segment tables contain segment descriptors - an 18-bit page table address and a 9-bit segment length
49. Segmentation and paging - locating addresses
50. Segmentation - Memory reference procedure
- 1. Use the segment number to find the segment descriptor
- the segment table is itself paged because it is large, so in actuality the STBR is used to locate the page holding the descriptor
- 2. Check whether the page table is in memory
- if not, a segment fault occurs
- if there is a protection violation, TRAP (fault)
- 3. the page table is examined; a page fault may occur
- if the page is in memory, the address of the start of the page is extracted from the page table
- 4. the offset is added to the page origin to construct the main memory address
- 5. perform the read/store etc.
51. Paged segmentation on the INTEL 80386
- 16k segments, each up to 1G 32-bit words
- 2 types of segment descriptors:
- Local Descriptor Table (LDT), one per process
- Global Descriptor Table (GDT), for the system etc.
- access by loading a 16-bit selector into one of the 6 segment registers CS, DS, SS, ... (which hold the selector during run time; 0 means not-in-use)
- The selector points to a segment descriptor (8 bytes), which carries the privilege level (0-3)
52. 80386 - segment descriptors
53. 80386 - Forming the linear address
- The segment descriptor is in an internal (microcode) register
- If the segment selector is zero (TRAP) or the segment is paged out (TRAP)
- The offset is checked against the limit field of the descriptor
- The base field of the descriptor is added to the offset (4 KB page size)
54. 80386 - paged segmentation (contd.)
- Combine descriptor and offset into a linear address
- If paging is disabled, pure segmentation (286 compatibility); the linear address is the physical address
- Paging is 2-level:
- a page directory (1k entries) and page tables (1k entries each)
- pages are 4 KB each (12-bit offset)
- The page directory is pointed to by a special register
- PTEs have a 20-bit page frame number and 12 bits of modified, accessed, protection, etc.
- Small segments need just a few page tables
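The 10+10+12 split of the 32-bit linear address into directory index, table index, and offset can be sketched as:

```python
def split_linear(linear):
    # 32-bit linear address -> (directory index, table index, offset)
    # 10 + 10 + 12 bits: 1k-entry directory, 1k-entry page tables, 4 KB pages
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF
```

The top address 0xFFFFFFFF splits into (1023, 1023, 4095), the last slot of the last page table.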
55. 80386 - 2-level paging