Title: Memory Management Motivation
1. Memory Management - Motivation
- n processes, each spending a fraction p of its time waiting for I/O, gives a probability of p^n that all processes are waiting for I/O simultaneously - CPU utilization is 1 - p^n
2. Utilizing Memory
- Assume each process takes 200K and so does the operating system
- Assume there is 1MB of memory available and that p = 0.8 - space for 4 processes - 60% CPU utilization (1 - 0.8^4 ≈ 0.59)
- Another 1MB enables 9 processes - 87% CPU utilization (1 - 0.8^9 ≈ 0.87)
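These utilization figures can be checked with a short sketch (the function name is mine; the formula 1 - p^n is the one from the slide):

```python
def cpu_utilization(p, n):
    """Probability that at least one of n processes is NOT waiting for I/O."""
    return 1 - p ** n

# p = 0.8: each process waits for I/O 80% of the time
print(round(cpu_utilization(0.8, 4), 2))   # 4 processes fit in 1MB -> 0.59
print(round(cpu_utilization(0.8, 9), 2))   # 9 processes fit in 2MB -> 0.87
```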
3. Issues - Relocation and Linking
- Compile time - create absolute code
- Load time - linker lists relocatable instructions and the loader changes instructions (at each reload...)
- Execution time - special hardware needed to support moving of processes during run time
- Dynamic Linking - used with system libraries; includes only a stub in each user routine, indicating how to locate the memory-resident library function (or how to load it, if needed)
4. Multiprogramming with fixed partitions
- How to organize the memory?
- How to assign jobs to partitions?
- Separate queues vs. a single queue
5. Allocating memory - growing segments
6. Memory allocation - keeping track (bitmaps, linked lists)
7. Strategies for Allocation
- First fit - don't search too much...
- Next fit - start each search from the last location
- Best fit - a drawback: generates small holes
- Worst fit - solves the above problem, badly
- Quick fit - several queues of different sizes
- An example of an elaborate scheme: the Buddy system (Knuth, 1973)
- Separate lists of free holes of sizes that are powers of two
- For any request, pick the 1st hole of the right size
- Not very good memory utilization
- Freed blocks can only be merged with buddies of their own size
- Main problem of memory allocation - Fragmentation
- Internal - wasted parts of allocated space
- External - wasted unallocated space
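A minimal sketch of first fit over a free-hole list (the representation and names are my own; the slide only names the strategy):

```python
def first_fit(holes, size):
    """Scan the free-hole list in address order; take the first hole big enough.

    holes: list of (start, length) tuples sorted by start address.
    Returns the allocated start address, shrinking or removing the hole.
    """
    for i, (start, length) in enumerate(holes):
        if length >= size:
            if length == size:
                del holes[i]                               # exact fit - hole disappears
            else:
                holes[i] = (start + size, length - size)   # leftover becomes a smaller hole
            return start
    return None                                            # external fragmentation: no hole fits

holes = [(0, 100), (300, 50), (500, 400)]
print(first_fit(holes, 120))   # skips the first two holes -> 500
print(holes)                   # [(0, 100), (300, 50), (620, 280)]
```

Next fit would differ only in remembering the index where the last search stopped; best fit would scan the whole list for the smallest adequate hole.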
8. Memory Protection
- Hardware
- History: the IBM 360 had a 4-bit protection code in the PSW and memory in 2K partitions - the process code in the PSW must match the memory partition code
- Two registers - base & limit
- base is added by hardware without changing instructions - dynamic relocation
- every request is checked against limit - runtime bound checking
- Reminder: the IBM PC has segment registers (but no limit)
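The base & limit mechanism can be sketched in a few lines (a hedged illustration; in reality the addition and comparison happen in hardware on every reference):

```python
def translate(virtual_addr, base, limit):
    """Dynamic relocation with runtime bound checking via base & limit registers."""
    if virtual_addr >= limit:
        raise MemoryError("limit violation - trap to OS")   # bound check
    return base + virtual_addr                              # base added transparently

print(translate(100, base=40000, limit=5000))   # 40100
```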
9. Managing memory by Swapping
- Processes move from disk to memory and from memory to disk
- Whenever there are too many jobs to fit in memory
- To use memory more efficiently - variable partitions
- Allocating memory
- Freeing memory and holes
- Possible solution: memory compaction
- Some form of swapping is required with any multiprogramming
- Since swapping is performed on whole processes, it results in a noticeable response time
- Longer queues of blocked processes can lead to many swaps...
- Allocating swap space
- Processes are swapped in/out from the same location
- Allocate space for non-memory-resident processes only
10. Paging and Virtual Memory
- Divide memory into fixed-size blocks (page frames)
- Small enough blocks - many for one process
- Allocate non-contiguous memory chunks to processes - avoiding holes...
- 2^32 addresses for a 32-bit (address bus) machine - virtual addresses
- A memory management unit (MMU) does the mapping to physical addresses - pages --> page frames
- Machine instructions reference addresses - more than one address per instruction, plus fetching the instructions themselves - absolute code becomes meaningless...
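The mapping starts by splitting a virtual address into a page number and an offset; a small sketch (assuming 4K pages, i.e. a 12-bit offset - the page size is my assumption):

```python
PAGE_SIZE = 4096          # assumed 4K pages -> 12-bit offset
OFFSET_BITS = 12

def split(vaddr):
    """Split a 32-bit virtual address into (virtual page number, offset)."""
    return vaddr >> OFFSET_BITS, vaddr & (PAGE_SIZE - 1)

vpn, offset = split(0x0040312A)
print(hex(vpn), hex(offset))   # 0x403 0x12a
```

The MMU looks up the page number in the page table and concatenates the resulting frame number with the unchanged offset.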
11. Memory Management Unit
12. Paging
13. MMU Operation - page fault if the accessed page is absent
14. Page table considerations
- Can be very large (1M pages for 32-bit addresses)
- Must be fast (every instruction needs it)
- One extreme has it all in hardware - fast registers that hold the page table and are loaded with each process; too expensive for the above size
- The other extreme has it all in memory (using a page table base register (PTBR) to point to it) - each memory reference during instruction translation is doubled...
- To avoid keeping complete page tables in memory - make them multilevel (and avoid the danger of accumulating memory references per instruction by caching)
- A fast cache (an additional 20 ns) and a 98% hit ratio, on a four-level page table, for a machine with 100-nanosecond memory access:
- effective access time = 0.98 x 120 + 0.02 x 520 = 128 nanosecs
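The 128 ns figure follows from the two cases (hit: cache lookup + one memory access; miss: cache lookup + four page-table levels + the data access). A small check:

```python
def effective_access(cache_ns, mem_ns, levels, hit_ratio):
    """Effective access time with a translation cache in front of a
    multilevel page table."""
    hit = cache_ns + mem_ns                    # 20 + 100 = 120 ns
    miss = cache_ns + (levels + 1) * mem_ns    # 20 + 5 * 100 = 520 ns
    return hit_ratio * hit + (1 - hit_ratio) * miss

print(effective_access(20, 100, levels=4, hit_ratio=0.98))   # 128.0
```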
15. Page Tables - Handling the size problem
16. SPARC 3-level paging - Context table (in MMU hardware) - 1 entry per process
17. Associative Memory - content-addressable memory
- page insertion - complete entry from the page table
- page deletion - just the modified bit goes back to the page table
18. Associative Memory - comments
- With a large enough hit ratio the average added translation time is close to 0
- linked lists, for example, are bad...
- Only a complete virtual address (all levels) can be counted as a hit
- With multiprocessing, the associative memory can be cleared on context switch - wasteful...
- Add a field to the associative memory to hold the process ID, and a special register for the current PID
19. No page tables - MIPS R2000
- 64-entry associative memory for virtual pages
- if not found, TRAP to the operating system
- software uses some hardware registers to find the needed virtual page
- a second trap may happen on a page fault...
20. Inverted page tables
- For very large memories (page tables), one can have an inverted page table, indexed by (physical) page frames
- IBM RT, HP Spectrum (thinking of 64-bit memories)
- To avoid a linear search for every virtual address of a process, use a hash table (one or a few memory references)
- Only one page table - the physical one - for all processes currently in memory
- In addition to the hash table, associative memory registers are used to store recently used page table entries
- The only way to deal with 64-bit memories: with 4K pages, two-level page tables can result in 2^42 entries
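A hedged sketch of the hash lookup idea: one entry per resident page, keyed by (process ID, virtual page number). All names here are mine; a real implementation hashes into the frame table itself with chaining.

```python
frame_table = {}      # (pid, virtual page number) -> physical frame

def map_page(pid, vpn, frame):
    """Record that (pid, vpn) is resident in the given frame."""
    frame_table[(pid, vpn)] = frame

def lookup(pid, vpn):
    """One (or a few) memory references instead of a linear search."""
    frame = frame_table.get((pid, vpn))
    if frame is None:
        raise LookupError("page fault - page not resident")
    return frame

map_page(pid=7, vpn=0x1A2, frame=42)
print(lookup(7, 0x1A2))   # 42
```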
21. Inverted Page Table Architecture
22. Pages: the data; Page frames: the physical memory locations
- Page Table Entries (PTE) contain (per page):
- Page frame number (physical address)
- Present/absent bit (valid bit)
- Dirty (modified) bit
- Referenced (accessed) bit
- Protection
- Caching disable/enable
23. Page fault Handling
- 1. Trap to kernel, save PC on the stack and (sometimes) partial state in registers (and/or on the stack)
- 2. An assembly routine saves volatile information and calls the operating system
- 3. Find the requested virtual page
- 4. Check protection. If legal, find a free page frame (or invoke the page replacement algorithm)
- 5. If replacing, check if the victim is modified and start writing it to disk. Mark the frame busy. Call the scheduler to block the process until the write-to-disk has completed.
24. Page fault Handling (cont'd)
- 6. Transfer the requested page from disk (the scheduler runs alternative processes)
- 7. Upon transfer completion, update the page table: mark the new page as valid and update all other parameters
- 8. Back up the faulted instruction, which was in principle in mid-execution - now the PC can be set back to its initial value
- 9. Schedule the faulting process, return from the operating system
- 10. Restore state (i.e. all volatile information stored by the assembly routine) and restart execution of the faulted process
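The ten steps can be condensed into a minimal sketch, assuming FIFO replacement and stubbed-out disk I/O (all names here are mine, not from the slides):

```python
from collections import deque

FRAMES = 3
page_table = {}            # vpn -> {"frame": f, "dirty": bool}
fifo = deque()             # resident pages, oldest first
free_frames = list(range(FRAMES))

def write_to_disk(vpn): pass     # stub: step 5 blocks on real I/O
def read_from_disk(vpn): pass    # stub: step 6

def handle_page_fault(vpn):
    if free_frames:                          # step 4: find a free frame...
        frame = free_frames.pop()
    else:                                    # ...or run page replacement
        victim = fifo.popleft()
        entry = page_table.pop(victim)
        if entry["dirty"]:                   # step 5: write back if modified
            write_to_disk(victim)
        frame = entry["frame"]
    read_from_disk(vpn)                      # step 6: bring the page in
    page_table[vpn] = {"frame": frame, "dirty": False}   # step 7: mark valid
    fifo.append(vpn)
    return frame                             # steps 8-10: restart the instruction

for vpn in [10, 11, 12, 13]:
    handle_page_fault(vpn)
print(sorted(page_table))   # [11, 12, 13] - page 10 was evicted (FIFO)
```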
25. Architecture - Instruction backup
- Page-faulting instructions trap to the OS
- The OS must restart the instruction
- The page fault may originate at the op-code or at any of the operands - the PC value is useless
- the location of the instruction itself is lost
- worse still, undoing of autoincrement or autodecrement - was it already performed??
- Hardware solutions
- A register to store the PC value of the instruction, and a register to store changes to other registers (increment/decrement)
- Micro-code dumps all information on the stack
- Restart the complete instruction and redo increments etc.
- Do nothing - RISC...
26. Demand Paging
- Processes reside on disk and their swapping-in is performed only partially - only part of their pages is loaded
- During run time a process may encounter a missing page and demand it
- A missing page has its invalid bit on (which the page-fault routine will need to differentiate from an illegal address)
- Page missing?? Retrieve the page into an empty page frame
- No empty page frame?? Evict (replace) a page
- Many algorithms are possible for selecting a page for replacement
- Optimal page replacement
- Discard the page to be used the furthest in the future
- Not realizable...
- but can be used as a benchmark for real algorithms!!
27. Optimal page replacement
- Demand comes in for pages
- 7, 5, 1, 0, 5, 4, 7, 0, 2, 1, 0, 7
- With 3 page frames, an optimal algorithm faults as follows ((new, evicted) marks a replacement, "-" a hit):
- 7 5 1 (0,1) - (4,5) - - (2,4) (1,2) - -
- altogether 7 page faults
- Take FIFO for example:
- 7 5 1 (0,7) - (4,5) (7,1) - (2,0) (1,4) (0,7) (7,2)
- 3 additional page faults
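Both traces can be reproduced by simulation; a sketch counting the faults (the eviction rules are standard FIFO and Belady's optimal choice, applied to the reference string above):

```python
def fifo_faults(refs, frames):
    mem, faults = [], 0
    for page in refs:
        if page not in mem:
            faults += 1
            if len(mem) == frames:
                mem.pop(0)                  # evict the oldest resident page
            mem.append(page)
    return faults

def optimal_faults(refs, frames):
    mem, faults = [], 0
    for i, page in enumerate(refs):
        if page not in mem:
            faults += 1
            if len(mem) == frames:
                # evict the page whose next use is furthest in the future
                future = refs[i + 1:]
                victim = max(mem, key=lambda p: future.index(p)
                             if p in future else len(future) + 1)
                mem.remove(victim)
            mem.append(page)
    return faults

refs = [7, 5, 1, 0, 5, 4, 7, 0, 2, 1, 0, 7]
print(optimal_faults(refs, 3))   # 7
print(fifo_faults(refs, 3))      # 10
```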
28. Good old FIFO
- Implemented as a queue
- The usual drawback:
- the oldest page may be a referenced (needed) page
- Second chance FIFO
- if the reference bit is on - move the page to the end of the queue
- Better implemented as a circular queue (the "clock")
- saves the overhead of movements on the queue
29. Page replacement: NRU - Not Recently Used
- There are 4 classes of pages, according to the reference and modification bits
- Select a page at random from the least-needed class
- An easy scheme to implement
- Prefers evicting an old modified page over a frequently referenced (unmodified) page
- The class "not referenced, modified" is interesting - it can only arise when a clock tick clears the reference bit of a modified page
30. LRU - Least Recently Used
- Approximates the optimal algorithm -
- the most recently used pages are the most probable next references
- Replace the page used furthest in the past
- Not easy to implement - needs counting of references
- Use a large counter (number of operations) and save it in a field of the page table entry on each page reference
- Another option is to use a bit array of n x n bits
- In both cases the page entry with the smallest number attached to it is selected for replacement
31. LRU with bit tables
32. NFU - Not Frequently Used
- In order to record frequently used pages, add a counter to all page table entries
- At each clock tick, add the R bit to the counters
- Select the page with the lowest counter for replacement
- Problem: it remembers everything
- Remedy (an aging algorithm):
- shift the counter right before adding the reference bit
- add the reference bit at the left
- Fewer operations than LRU; depends on the intervals used for updating
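One tick of the aging remedy in code (an 8-bit counter is my assumption; any width works the same way):

```python
COUNTER_BITS = 8

def age(counter, r_bit):
    """One clock tick of aging: shift right, insert the R bit at the left."""
    return (counter >> 1) | (r_bit << (COUNTER_BITS - 1))

c = 0
for r in [1, 0, 1, 1]:         # reference bits sampled at four clock ticks
    c = age(c, r)
print(format(c, '08b'))        # 11010000 - the most recent tick is the leftmost bit
```

Recent references dominate the comparison, so old history fades away instead of being remembered forever.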
33. NFU - the aging simulation version
34. Modelling paging algorithms
- Belady's anomaly
- Example: FIFO with reference string 1 2 3 4 1 2 5 1 2 3 4 5
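The anomaly is easy to demonstrate on that reference string: adding a frame increases the fault count. A short sketch:

```python
def fifo_faults(refs, frames):
    mem, faults = [], 0
    for page in refs:
        if page not in mem:
            faults += 1
            if len(mem) == frames:
                mem.pop(0)            # evict the oldest resident page
            mem.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9 faults
print(fifo_faults(refs, 4))   # 10 faults - more frames, more faults!
```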
35. Characterizing paging systems
- A reference string (of requested pages)
- the number of virtual pages n
- the number of physical page frames m
- a page replacement algorithm
- can be represented by an array M of n rows
36. Stack Algorithms
- Definition: the set of pages in physical memory with m page frames is a subset of the set of pages in physical memory with m+1 page frames (for every reference string)
- Stack algorithms have no anomaly
- Examples: LRU, optimal replacement
- FIFO is not a stack algorithm
- Useful definition:
- Distance string - the distance of each referenced page from the top of the stack
37. Predicting page fault numbers
- C_i is the number of times that distance i occurs in the distance string
- the number of page faults with m frames is
- F_m = C_{m+1} + C_{m+2} + ... + C_n + C_infinity
38. Page Frame Allocation
- For a page-fault rate p, memory access time of 100 nanosecs and page-fault service time of 25 millisecs, the effective access time is (1-p) x 100 + p x 25,000,000
- for p = 0.001 the effective access time is still larger than 100 nanosecs by a factor of about 250
- for a goal of only a 10% degradation in access time we need p < 0.0000004
- policies for page-frame allocation must allocate as much as possible to processes, to enhance performance - leave no page frame unassigned
- difficult to know how many frames to allocate - processes differ in size, structure, priority
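Plugging the numbers into the formula above confirms both claims:

```python
def effective_access_ns(p, mem_ns=100, fault_ns=25_000_000):
    """(1 - p) * memory access time + p * page-fault service time."""
    return (1 - p) * mem_ns + p * fault_ns

print(effective_access_ns(0.001))        # ~25100 ns - about 250x slower
print(effective_access_ns(0.0000004))    # ~110 ns - the 10% degradation goal
```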
39. Allocation to multiple processes
- Fair share is not the best policy (static!!)
- allocate according to process size
- there must be a minimum for running a process...
40. (Dynamic) Page Allocation Policies
- 1st option - fixed number of pages per process; 2nd option - proportional to process size
- Locality of reference - a valid statistical phenomenon
- Working set - the set of pages used by each process
- Working set model - dynamic number of pages per process, a necessary condition for running (can be used for prepaging - load the working set before running the process)
- Keep track by aging with a lookback parameter - WSClock
- Thrashing - very frequent page faults (more paging than computation)
- CPU utilization decreases -> increase the multiprogramming degree -> utilization decreases even more -> ...
- happens whenever the in-memory pages are not the working set
- what to do for processes being swapped?
41. Dynamic set - Page Allocation
- Reference string: 0 2 1 3 5 4 6 3 7 5 7 3 3 5 6 4
- with 5 page frames (LRU), p marks a fault:
- p p p p p p p - p - - - - - - -
- now with a working-set window of size 5 (and LRU):
- p p p p p p p - p - - (4) (3) - p -
- for a window of size 5 the allocated WS is decreasing after requests 12 and 14
- what is the maximum page allocation?
- an extra page fault occurs, because of the (smaller) size of the WS
- after the last request, page 4, the number of allocated page frames increases again (4)
42. Dynamic set - Clock Algorithm
- WSClock is a global clock algorithm - for pages held by all processes in memory
- Circling the clock, the algorithm uses the reference bit and adds to it a measure of window size τ
- Each time a reference bit is set, an additional data structure, ref(frame), is set to the current virtual time of the process
- WSClock: use an additional condition that measures elapsed (process) time and compares it to τ
- Replace a page when two conditions apply:
- the reference bit is unset
- Tp - ref(frame) > τ
43. Dynamic set - WSClock Example
- 3 processes: p0, p1 and p2
- current (virtual) times of the 3 processes are
- Tp0 = 50, Tp1 = 70, Tp2 = 90
- WSClock: replace when Tp - ref(frame) > τ
- the minimal distance (window size) is τ = 20
- The clock hand is currently pointing to page frame 4

  page frame:   0  1  2  3  4  5  6  7  8  9 10
  ref. bit:     0  0  1  1  1  0  1  0  0  1  0
  process ID:   0  1  0  1  2  1  0  0  1  2  2
  last_ref:    10 30 52 71 81 37 61 37 31 47 55
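Evaluating the WSClock condition for every frame in the table above (a sketch; the data is from the example, the code and names are mine):

```python
TAU = 20
T = {0: 50, 1: 70, 2: 90}       # current virtual time of each process

frames = [  # (reference bit, process ID, last_ref) for page frames 0..10
    (0, 0, 10), (0, 1, 30), (1, 0, 52), (1, 1, 71), (1, 2, 81), (0, 1, 37),
    (1, 0, 61), (0, 0, 37), (0, 1, 31), (1, 2, 47), (0, 2, 55),
]

# replaceable: reference bit unset AND Tp - ref(frame) > tau
replaceable = [i for i, (r, pid, last) in enumerate(frames)
               if r == 0 and T[pid] - last > TAU]
print(replaceable)   # [0, 1, 5, 8, 10] - frame 7 misses: 50 - 37 = 13 <= 20

# the hand is at frame 4, so circling forward the first candidate is frame 5
hand = 4
first = next(i % len(frames) for i in range(hand, hand + len(frames))
             if (i % len(frames)) in replaceable)
print(first)   # 5
```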
44. Page Daemons - Unix
- It is assumed useful to keep a number of free pages
- freeing of page frames can be done by a page daemon - a process that sleeps most of the time
- awakened periodically to inspect the state of memory - if there are too few free page frames, it frees page frames
- yet another type of (global) dynamic page replacement policy
- this strategy performs better than evicting pages only when needed (and writing the modified ones to disk in a hurry)
45. Comment - Page size analysis
- To minimize wasted memory, with
- process size s
- page size p
- page table entry size e
- Fragmentation overhead is p/2 (on average, half of the last page is wasted)
- Table space overhead is s*e/p
- Total overhead is s*e/p + p/2
- Minimizing the overhead gives p = sqrt(2*s*e)
- Example: s = 128K, e = 8 bytes
- optimal page size is about 1448 bytes... i.e. use 1K or 2K
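The overhead formula and its minimum can be checked directly (a sketch using the s and e values of the example):

```python
from math import sqrt

def overhead(s, p, e):
    """Total overhead: page-table space s*e/p plus internal fragmentation p/2."""
    return s * e / p + p / 2

s, e = 128 * 1024, 8                  # process size 128K, 8-byte entries
p_opt = sqrt(2 * s * e)               # minimum of the overhead function
print(round(p_opt))                   # 1448
print(overhead(s, 1024, e), overhead(s, 2048, e))   # 1536.0 1536.0
```

Notably, the two practical choices 1K and 2K sit symmetrically around the optimum and give exactly the same total overhead here.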
46. Additional issues - Locking and Sharing
- An I/O channel/processor (DMA) transfers data independently
- the page must not be replaced during the transfer
- the OS can use a lock variable per page
- Pages of an editor's code - shared among processes
- swapping out, or terminating, process A (and its pages) may cause many page faults for process B that shares them
- looking up evicted pages in all page tables is impossible
- solution: maintain special data structures for shared pages
47. Handling the backing store
- Need to store non-resident pages on disk
- the backing store (disk swap area) needs to be managed
- allocate swap area to (whole) processes and address pages by offset from the swap address
- processes grow during execution - assign separate swap areas to Text, Data and Stack
- or allocate disk blocks only when needed - requires keeping disk addresses in memory to track swapped pages
48. Segmentation
- Several logical address spaces per process
- A compiler needs segments for:
- source text
- symbol table
- constants segment
- stack
- parse tree
- compiler executable code
- Most of these segments grow during execution
49. Segmentation vs. Paging
50. Segmentation - segment table
51. Segmentation with Paging
- MULTICS combined segmentation and paging
- 2^18 segments of up to 64K words (36 bits each)
- addresses are 34 bits:
- 18-bit segment number
- 16-bit address within the segment: page number (6) + offset within page (10)
- Each process has a segment table (pointed to by the STBR)
- the segment table is itself a segment and is paged (8-bit page + 10-bit offset); the STBR is added to the 18-bit segment number
- Each segment is a separate virtual memory with a page table (6 bits)
- segment tables contain segment descriptors - 18-bit page table address + 9-bit segment length
52. MULTICS segment descriptors
53. Segmentation and paging - locating addresses
54. Segmentation - Memory reference procedure
- 1. Use the segment number to find the segment descriptor
- the segment table is itself paged because it is large, so in actuality the STBR is used to locate the page of the descriptor
- 2. Check if the page table is in memory
- if not, a segment fault occurs
- if there is a protection violation - TRAP (fault)
- 3. The page table is examined; a page fault may occur.
- if the page is in memory, the address of the start of the page is extracted from the page table
- 4. The offset is added to the page origin to construct the main memory address
- 5. Perform the read/store etc.
55. Paged segmentation on the INTEL 80386
- 16K segments, each up to 1G 32-bit words
- 2 types of segment descriptor tables:
- Local Descriptor Table (LDT), one per process
- Global Descriptor Table (GDT) - system etc.
- Access by loading a 16-bit selector into one of the 6 segment registers: CS, DS, SS, ... (holding the selector during run time; 0 means not-in-use)
- The selector points to a segment descriptor (8 bytes)
- Selector format: 13-bit index + 1 bit (0 = GDT / 1 = LDT) + 2-bit privilege level (0-3)
56. 80386 - segment descriptors
57. 80386 - Forming the linear address
- The segment descriptor is in an internal (microcode) register
- If the selector is zero (TRAP) or the segment is paged out (TRAP)
- The offset is checked against the limit field of the descriptor
- The base field of the descriptor is added to the offset (pages are 4K)
58. 80386 - paged segmentation (cont'd)
- Combine descriptor and offset into a linear address
- If paging is disabled - pure segmentation (286 compatibility); the linear address is the physical address
- Paging is 2-level:
- page directory (1K entries) + page tables (1K entries each)
- pages are 4K bytes each (12-bit offset)
- The page directory is pointed to by a special register
- PTEs have a 20-bit page frame number and 12 bits of modified, accessed, protection, etc.
- Small segments have just a few page tables
59. 80386 - 2-level paging
60. Intel 80386 address translation
61. The Buddy System