Memory Management

About This Presentation

Title:

Memory Management

Description:

Chapter 4 Memory Management: 4.1 Basic memory management, 4.2 Swapping, 4.3 Virtual memory, 4.4 Page replacement algorithms, 4.5 Modeling page replacement algorithms


Slides: 113

Provided by: SteveAr2


Transcript and Presenter's Notes


1
Memory Management

Chapter 4

4.1 Basic memory management
4.2 Swapping
4.3 Virtual memory
4.4 Page replacement algorithms
4.5 Modeling page replacement algorithms
4.6 Design issues for paging systems
4.7 Implementation issues
4.8 Segmentation
2
Memory Management

Ideally, programmers want memory that is
large
fast
nonvolatile
Memory hierarchy
a small amount of fast, expensive memory (cache)
some medium-speed, medium-priced main memory
gigabytes of slow, cheap disk storage
The memory manager handles the memory hierarchy

3
Basic Memory Management: Monoprogramming without Swapping or Paging

Three simple ways of organizing memory
- an operating system with one user process

4
Multiprogramming with Fixed Partitions

Fixed memory partitions
separate input queues for each partition
single input queue

5
Memory Management

The CPU utilization can be modeled by the formula
CPU utilization = $1 - p^n$
where there are n processes in memory and each process spends a fraction p of its time waiting for I/O. ($p^n$ is the probability that all n processes are waiting for I/O at the same time, assuming they wait independently.)
CPU utilization is a function of n, which is called the degree of multiprogramming.
A more accurate model can be constructed using queuing theory.
Example
A computer has 32 MB. The OS takes 16 MB and each process takes 4 MB, so 4 processes fit in memory. 80 percent of the time is spent waiting for I/O. The CPU utilization is $1 - 0.8^4 \approx 0.59$, about 60%. If 16 MB is added, 8 processes fit and the utilization is $1 - 0.8^8 \approx 0.83$, about 83%.
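
The arithmetic in the example above can be checked with a few lines of Python. This is only a sketch of the probabilistic model; the function names `cpu_utilization` and `processes_that_fit` are illustrative and not part of the original slides.

```python
def cpu_utilization(p: float, n: int) -> float:
    """Model from the slide: the CPU idles only when all n processes wait for I/O."""
    return 1.0 - p ** n


def processes_that_fit(total_mb: int, os_mb: int, per_process_mb: int) -> int:
    """Number of equal-sized processes that fit next to the OS."""
    return (total_mb - os_mb) // per_process_mb


# 32 MB machine, then the same machine with 16 MB added.
for total in (32, 48):
    n = processes_that_fit(total, os_mb=16, per_process_mb=4)
    print(f"{total} MB: n = {n}, utilization = {cpu_utilization(0.8, n):.3f}")
```

Doubling the user memory doubles n from 4 to 8, but the utilization gain is far less than double, which is the point of the model.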

6
Modeling Multiprogramming
Degree of multiprogramming

CPU utilization as a function of number of
processes in memory

7
Analysis of Multiprogramming System Performance

Arrival and work requirements of 4 jobs
CPU utilization for 1 to 4 jobs with 80% I/O wait
Sequence of events as jobs arrive and finish
Note: numbers show the amount of CPU time jobs get in each interval

8
Relocation and Protection

Multiprogramming introduces two problems: relocation and protection.
Relocation - Cannot be sure where a program will be loaded in memory
address locations of variables and code routines cannot be absolute
Protection - Must keep a program out of other processes' partitions
Solution - Use base and limit values
address locations are added to the base value to map to a physical address
address locations larger than the limit value are an error
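
A minimal sketch of base-and-limit translation follows. It assumes the limit holds the size of the partition, so any address at or beyond it is rejected; the names `translate` and `ProtectionFault` are made up for illustration.

```python
class ProtectionFault(Exception):
    """Raised when a process addresses memory outside its partition."""


def translate(address: int, base: int, limit: int) -> int:
    """Map a program-relative address to a physical address.

    Assumption: `limit` is the partition size, so valid addresses
    are 0 .. limit - 1.
    """
    if not 0 <= address < limit:
        raise ProtectionFault(f"address {address} outside limit {limit}")
    return base + address


# A process loaded at physical address 300000 with a 100000-byte partition.
print(translate(1234, base=300_000, limit=100_000))
```

Because the check and the addition happen on every reference, real hardware does both in the memory access path rather than in software.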

9
Swapping

Two approaches to overcome the limitation of memory:
Swapping brings each process into memory in its entirety, runs it for a while, then puts it back on the disk.
Virtual memory allows programs to run even when they are only partially in main memory.
When swapping creates multiple holes in memory, memory compaction can be used to combine them into one big hole by moving all processes together.

10
Swapping

Memory allocation changes as
processes come into memory
leave memory
Shaded regions are unused memory

11
Swapping

Allocating space for growing data segment
Allocating space for growing stack and data segment

12
Memory Management with Bit Maps and Linked Lists

There are two ways to keep track of memory usage: bitmaps and free lists.
The problem with bitmaps is that bringing a k-unit process into memory requires finding a run of k consecutive 0 bits in the map. This is a slow operation.
Four algorithms can be used in memory management with linked lists (double-linked list):
First fit searches from the beginning for a hole that fits.
Next fit searches from the place where it left off last time for a hole that fits.
Best fit searches the entire list and takes the smallest hole that fits.
Worst fit searches the entire list and takes the largest available hole.
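
The four placement strategies can be sketched over a plain Python list of holes. The representation (a list of `(start, length)` tuples instead of a doubly linked list) and the function names are simplifying assumptions; each function returns the index of the chosen hole, or `None` if nothing fits.

```python
from typing import List, Optional, Tuple

Hole = Tuple[int, int]  # (start address, length)


def first_fit(holes: List[Hole], size: int) -> Optional[int]:
    for i, (_, length) in enumerate(holes):
        if length >= size:
            return i
    return None


def next_fit(holes: List[Hole], size: int, start_index: int) -> Optional[int]:
    # Same as first fit, but the scan begins where the last search stopped.
    n = len(holes)
    for step in range(n):
        i = (start_index + step) % n
        if holes[i][1] >= size:
            return i
    return None


def best_fit(holes: List[Hole], size: int) -> Optional[int]:
    fitting = [i for i, (_, length) in enumerate(holes) if length >= size]
    return min(fitting, key=lambda i: holes[i][1], default=None)


def worst_fit(holes: List[Hole], size: int) -> Optional[int]:
    fitting = [i for i, (_, length) in enumerate(holes) if length >= size]
    return max(fitting, key=lambda i: holes[i][1], default=None)


holes = [(0, 5), (10, 12), (30, 8), (50, 20)]
print(first_fit(holes, 7), next_fit(holes, 7, 2), best_fit(holes, 7), worst_fit(holes, 7))
```

For a request of 7 units the four strategies pick different holes here, which makes the trade-off visible: best fit leaves the smallest leftover fragment, worst fit the largest.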

13
Memory Management with Bit Maps

Part of memory with 5 processes, 3 holes
tick marks show allocation units
shaded regions are free
Corresponding bit map
Same information as a list

14
Memory Management with Linked Lists

Four neighbor combinations for the terminating
process X

15
Virtual Memory

Problem: Program too large to fit in memory
Solutions:
Overlays - the programmer splits the program into pieces called overlays; too much work
Virtual memory (Fotheringham, 1961) - the OS keeps the part of the program currently in use in memory
Paging is a technique used to implement virtual memory.
A virtual address is a program-generated address.
The MMU (memory management unit) translates a virtual address into a physical address.

16
Virtual Memory: Paging

The position and function of the MMU

17
Virtual Memory

Suppose the computer can generate 16-bit addresses (0 to 64K). However, the computer only has 32K of memory → a 64K program can be written, but not loaded into memory in its entirety.
The virtual address space is divided into (virtual) pages; the corresponding units in physical memory are (page) frames.
A Present/Absent bit keeps track of whether or not the page is mapped.
A reference to an unmapped page causes the CPU to trap to the OS.
This trap is called a page fault. The OS selects a little-used page frame, writes its contents back to disk, fetches the page just referenced into that frame, and restarts the trapped instruction.

18
Paging

The relation between virtual addresses and physical memory addresses is given by the page table

19
Paging Model Example
20
Page Tables

Example: Virtual address 4097 = 0001 000000000001 (binary)
4-bit virtual page number, 12-bit offset
See Figure 4-11.
The purpose of the page table is to map virtual pages into page frames. The page table is a function that maps the virtual page number to the page frame number.
Page tables face two major issues:
Page tables may be extremely large (e.g., on most computers):
32-bit address with 4K page size
12-bit offset
→ 20 bits for virtual page number
→ 1 million entries!
The mapping must be fast because it is done on every memory access!
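
The split of address 4097 into page number and offset can be reproduced directly. The sketch below assumes 4 KB pages and models the page table as a dictionary from virtual page number to frame number; the mapping of page 1 to frame 6 is an arbitrary illustration, and `PageFault` is a made-up exception name.

```python
OFFSET_BITS = 12                 # 4 KB pages
PAGE_SIZE = 1 << OFFSET_BITS


class PageFault(Exception):
    """Raised when the referenced virtual page is not mapped."""


def split(virtual_address: int) -> tuple:
    """Return (virtual page number, offset within the page)."""
    return virtual_address >> OFFSET_BITS, virtual_address & (PAGE_SIZE - 1)


def translate(virtual_address: int, page_table: dict) -> int:
    page, offset = split(virtual_address)
    frame = page_table.get(page)   # absent key plays the role of Present/Absent = 0
    if frame is None:
        raise PageFault(page)
    return (frame << OFFSET_BITS) | offset


print(split(4097))                 # page 1, offset 1
print(translate(4097, {1: 6}))     # hypothetical mapping: page 1 -> frame 6
```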

21
Pure paging
22
Page Tables

Internal operation of MMU with 16 4 KB pages

23
Two-Level Paging Example

A logical address (on a 32-bit machine with 4K page size) is divided into
a page number consisting of 20 bits.
a page offset consisting of 12 bits.
Since the page table is paged, the page number is further divided into
a 10-bit page number.
a 10-bit page offset.
Thus, a logical address is as follows, where p1 is an index into the outer page table, and p2 is the displacement within the page of the outer page table.

page number   | page offset
p1    | p2    | d
10    | 10    | 12
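
The 10/10/12 split can be sketched with shifts and masks. The nested-dictionary page tables and the function names are assumptions made for illustration; a missing entry at either level stands for an unmapped page.

```python
def split_two_level(address: int) -> tuple:
    """Split a 32-bit logical address into (p1, p2, d) using a 10/10/12 layout."""
    p1 = (address >> 22) & 0x3FF   # index into the outer page table
    p2 = (address >> 12) & 0x3FF   # index into the second-level page table
    d = address & 0xFFF            # offset within the 4K page
    return p1, p2, d


def translate(address: int, outer_table: dict) -> int:
    """Walk both levels; KeyError stands in for a page fault."""
    p1, p2, d = split_two_level(address)
    frame = outer_table[p1][p2]
    return (frame << 12) | d


address = (1 << 22) | (3 << 12) | 4
print(split_two_level(address))
print(translate(address, {1: {3: 42}}))   # hypothetical mapping to frame 42
```

Only the second-level tables that are actually referenced need to exist, which is why the dictionary here holds a single inner table.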
24
Address-Translation Scheme

Address-translation scheme for a two-level 32-bit
paging architecture is shown as below.

25
Two-Level Page-Table Scheme
26
Page Tables

Multilevel page tables - reduce the table size. Also, don't keep page tables in memory that are not needed.
See the diagram in Figure 4-12
Top-level entries point to the second-level page tables for
entry 0: program text
entry 1: program data
entry 1023: stack
→ each top-level entry covers 4M: 4M stack, 4M data, 4M code

27
Page Tables
Second-level page tables
Top-level page table

32 bit address with 2 page table fields
Two-level page tables

28
Page Tables

Most operating systems allocate a page table for each process.
Design 1: a single page table consisting of an array of hardware registers. As a process is loaded, the registers are loaded with its page table.
Advantage - simple
Disadvantage - expensive if the table is large, and loading the full page table at every context switch hurts performance.
Design 2: leave the page table in memory; a single register points to the table
Advantage - context switch is cheap
Disadvantage - one or more memory references to read table entries

29
Hierarchical Paging

Examples of page table design
The PDP-11 uses one-level paging.
The Pentium II uses this two-level architecture.
The VAX architecture supports a variation of two-level paging (section, page, offset).
The SPARC architecture (with 32-bit addressing) supports a three-level paging scheme.
The 32-bit Motorola 68030 architecture supports a four-level paging scheme.
Further division could be made for a large logical-address space.
However, for 64-bit architectures, hierarchical page tables are generally considered infeasible.

30
Page Tables

Typical page table entry

31
Structure of a Page Table Entry

Page frame number: the frame the page is mapped to
Present/absent bit: 1/0 indicates a valid/invalid entry
Protection bits: what kinds of access are permitted
Modified (dirty) bit: set when the page is written to; a dirty page must be written back to disk when it is evicted
Referenced bit: set when the page is referenced (helps decide which page to evict)
Caching disabled bit: turns off caching for the page. This matters for pages that map onto device registers rather than memory (references to I/O may require no cache!)

32
TLB

Observation: Most programs make a large number of references to a small number of pages.
Solution: Equip computers with a small hardware device, called a translation lookaside buffer (TLB) or associative memory, to map virtual addresses to physical addresses without going through the page table.
Modern RISC machines do TLB management in software. If the TLB is large enough to keep the miss rate low, software management of the TLB becomes acceptably efficient.
Methods to reduce TLB misses and the cost of a TLB miss
Preload pages
Maintain a large TLB

33
TLBs: Translation Lookaside Buffers

A TLB to speed up paging

34
Paging Hardware With TLB
35
Effective Access Time

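Associative lookup = $\varepsilon$ time units
Assume memory cycle time is $t$ time units
Hit ratio: percentage of times that a page number is found in the associative registers; the ratio is related to the number of associative registers.
Hit ratio = $\alpha$
Effective Access Time (EAT)
$EAT = \alpha (t + \varepsilon) + (1 - \alpha)(2t + \varepsilon)$
$= \alpha t + \alpha \varepsilon + 2t + \varepsilon - 2 \alpha t - \alpha \varepsilon$
$= (2 - \alpha) t + \varepsilon$
Example: $\alpha = 0.8$, $\varepsilon = 20$ ns, $t = 100$ ns
$EAT = 0.8 \times 120 + 0.2 \times (200 + 20) = 140$ ns.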
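
As a quick check of the example, the sketch below evaluates both the expanded and the simplified form of the effective access time. The function names are illustrative; the parameters are the hit ratio, the associative lookup time, and the memory cycle time from the slide.

```python
def eat_expanded(hit_ratio: float, lookup: float, cycle: float) -> float:
    """Hit: one lookup plus one memory access. Miss: lookup plus two accesses."""
    hit_cost = cycle + lookup
    miss_cost = 2 * cycle + lookup
    return hit_ratio * hit_cost + (1 - hit_ratio) * miss_cost


def eat_simplified(hit_ratio: float, lookup: float, cycle: float) -> float:
    """Closed form: (2 - hit_ratio) * cycle + lookup."""
    return (2 - hit_ratio) * cycle + lookup


# Values from the slide: hit ratio 0.8, lookup 20 ns, memory cycle 100 ns.
print(eat_expanded(0.8, 20, 100), eat_simplified(0.8, 20, 100))
```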

36
Inverted Page Table

Usually, each process has a page table associated with it. One of the drawbacks of this method is that each page table may consist of millions of entries.
To solve this problem, an inverted page table can be used. There is one entry for each real page (frame) of memory.
Each entry consists of the virtual address of the page stored in that real memory location, with information about the process that owns that page.
Examples of systems using inverted page tables include the 64-bit UltraSPARC and PowerPC.

37
Inverted Page Table

To illustrate this method, consider a simplified implementation of the inverted page table in which each virtual address is a triple <process-id, page-number, offset>.
Each inverted page-table entry is a pair <process-id, page-number>. When a memory reference occurs, the inverted page table is searched for a match. If a match is found at entry i, then the physical address <i, offset> is generated. Otherwise, an illegal address access has been attempted.
Although this scheme decreases the memory needed to store page tables, it increases the time needed to search the table when a page reference occurs.
Use a hash table to limit the search to one or at most a few page-table entries.
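
A small sketch of the idea, under simplifying assumptions: the table is a Python list with one slot per frame, and a dictionary keyed on `(process_id, page_number)` plays the role of the hash table. The class and method names are hypothetical.

```python
class InvertedPageTable:
    def __init__(self, num_frames: int, page_size: int = 4096):
        self.page_size = page_size
        self.entries = [None] * num_frames   # entry i describes frame i
        self.index = {}                      # hash: (pid, page) -> frame number

    def map(self, frame: int, pid: int, page: int) -> None:
        """Record that frame `frame` now holds page `page` of process `pid`."""
        old = self.entries[frame]
        if old is not None:
            del self.index[old]              # the previous occupant is unmapped
        self.entries[frame] = (pid, page)
        self.index[(pid, page)] = frame

    def translate(self, pid: int, page: int, offset: int) -> int:
        """Return the physical address <i, offset>, or fail on an illegal access."""
        frame = self.index.get((pid, page))
        if frame is None:
            raise LookupError("illegal address access")
        return frame * self.page_size + offset


table = InvertedPageTable(num_frames=8)
table.map(frame=5, pid=17, page=3)
print(table.translate(17, 3, 100))
```

The table size depends on the number of frames, not on the size of any virtual address space, which is what makes the approach attractive for 64-bit machines.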

38
Inverted Page Table Architecture
39
Inverted Page Tables

Comparison of a traditional page table with an
inverted page table

40
Page Replacement Algorithms

A page fault forces a choice
which page must be removed
to make room for the incoming page
A modified page must first be saved
an unmodified page is just overwritten
Better not to choose an often-used page
it will probably need to be brought back in soon
Applications: memory, caches, Web pages

41
Optimal Page Replacement Algorithm

Replace the page that will be referenced at the farthest point in the future
Optimal but impossible to implement, because the OS cannot know future references; it is only used for comparison
Estimate by
logging page use on previous runs of the process
although this is impractical

42
Not Recently Used Page Replacement Algorithm

Each page has a Referenced bit (R) and a Modified bit (M).
the bits are set when the page is referenced (read or written) or modified (written to)
when a process starts, both bits R and M are set to 0 for all pages.
periodically (on each clock interrupt, e.g., every 20 msec), the R bit is cleared (i.e., R = 0).
Pages are classified
Class 0: not referenced, not modified
Class 1: not referenced, modified
Class 2: referenced, not modified
Class 3: referenced, modified
NRU removes a page at random from the lowest-numbered non-empty class
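
The class number is simply 2R + M, so NRU victim selection fits in a few lines. The dictionary layout (page name mapped to an `(R, M)` pair) and the function names are assumptions for illustration.

```python
import random
from typing import Dict, Tuple


def nru_class(r: int, m: int) -> int:
    """Class 0..3 as on the slide: the R bit weighs more than the M bit."""
    return 2 * r + m


def nru_victim(pages: Dict[str, Tuple[int, int]], rng=random) -> str:
    """Pick a random page from the lowest-numbered non-empty class."""
    lowest = min(nru_class(r, m) for r, m in pages.values())
    candidates = [p for p, (r, m) in pages.items() if nru_class(r, m) == lowest]
    return rng.choice(candidates)


pages = {"A": (1, 1), "B": (0, 1), "C": (1, 0)}
print(nru_victim(pages))
```

Here B is the only page in class 1, the lowest non-empty class, so it is chosen even though evicting it requires a write to disk.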

43
FIFO Page Replacement Algorithm

Maintain a linked list of all pages in the order they came into memory, with the oldest page at the front of the list.
The page at the beginning of the list is replaced
Advantage: easy to implement
Disadvantage: the page that has been in memory the longest (perhaps an often-used one) may be evicted

44
Second Chance Page Replacement Algorithm

Inspect the R bit of the oldest page
if R = 0 → evict the page
if R = 1 → set R = 0 and put the page at the end (back) of the list. The page is treated like a newly loaded page.
Clock replacement algorithm: a different implementation of second chance
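
The clock variant keeps the pages in a circular list and moves a hand instead of relinking list nodes. A minimal sketch, assuming frames are stored as `[page, R]` pairs in a Python list and the hand is an index into it; the function name is illustrative.

```python
from typing import List, Tuple


def clock_evict(frames: List[list], hand: int) -> Tuple[int, int]:
    """Return (index of the victim frame, new hand position).

    Pages with R = 1 get a second chance: R is cleared and the hand advances.
    The loop terminates because every pass clears R bits.
    """
    while True:
        if frames[hand][1] == 0:
            return hand, (hand + 1) % len(frames)
        frames[hand][1] = 0
        hand = (hand + 1) % len(frames)


frames = [["A", 1], ["B", 0], ["C", 1]]
victim, hand = clock_evict(frames, hand=0)
print(frames[victim][0], hand)
```

In the example, A has its R bit set and is spared (with R cleared), so B is evicted and the hand comes to rest on C.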

45
Second Chance Page Replacement Algorithm

Operation of second chance
pages sorted in FIFO order
Page list if a fault occurs at time 20 and A has its R bit set (numbers above pages are loading times)

46
The Clock Page Replacement Algorithm
47
Least Recently Used (LRU)

Assume pages used recently will be used again soon
throw out the page that has been unused for the longest time
Software solution: Must keep a linked list of pages
most recently used at front, least at rear
update this list on every memory reference → too expensive!
Hardware solution: Equip the hardware with a 64-bit counter
that is incremented after each instruction.
The counter value is stored in the page table entry of the page that was just referenced.
choose the page with the lowest counter value
periodically zero the counter

48
Least Recently Used (LRU)

Hardware solution (continued)
Problem with the 64-bit counter scheme: every page table entry must store a counter value, so the page table is larger.
Alternative: maintain a matrix of n x n bits for a machine with n page frames.
When page frame K is referenced
(i) Set row K to all 1s.
(ii) Set column K to all 0s.
The row whose binary value is smallest is the LRU page.

49
Simulating LRU in Software

LRU using a matrix pages referenced in order
0,1,2,3,2,1,0,3,2,3
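
The matrix method can be simulated for this reference order. In the sketch below each row is stored as an n-bit integer whose most significant bit is column 0; that encoding and the function name are implementation choices, not something the slides prescribe.

```python
from typing import List


def lru_matrix(references: List[int], n: int) -> List[int]:
    """Return the n rows of the LRU matrix after processing all references."""
    rows = [0] * n
    for k in references:
        rows[k] = (1 << n) - 1                   # (i) set row k to all 1s
        column_mask = ~(1 << (n - 1 - k))        # (ii) clear column k in every row
        rows = [row & column_mask for row in rows]
    return rows


rows = lru_matrix([0, 1, 2, 3, 2, 1, 0, 3, 2, 3], n=4)
for row in rows:
    print(format(row, "04b"))
print("LRU page:", rows.index(min(rows)))
```

After the last reference the row for page 1 is all zeros, which matches the reference order: page 1 was last touched earlier than pages 0, 2, and 3.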

50
Simulating LRU in Software

LRU hardware is not usually available. NFU (Not Frequently Used) is implemented in software.
At each clock interrupt, the R bit is added to the counter associated with each page. When a page fault occurs, the page with the lowest counter is replaced.
Problem: NFU never forgets, so a page referenced frequently long ago may still have the highest counter.
Modified NFU (NFU with aging) - at each clock interrupt
the counters are shifted right one bit, and
the R bit is added to the leftmost bit.
In this way, we give higher priority to recent R values.
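
One clock tick of the aging scheme can be sketched as a shift followed by inserting the R bit at the left. The 8-bit counter width and the function name are assumptions.

```python
from typing import List


def age_tick(counters: List[int], r_bits: List[int], bits: int = 8) -> List[int]:
    """Shift every counter right one bit and put its R bit in the leftmost position."""
    return [(c >> 1) | (r << (bits - 1)) for c, r in zip(counters, r_bits)]


counters = [0, 0]
counters = age_tick(counters, [1, 0])   # tick 1: only page 0 was referenced
counters = age_tick(counters, [0, 1])   # tick 2: only page 1 was referenced
print([format(c, "08b") for c in counters])
print("evict page", counters.index(min(counters)))
```

Plain NFU would leave both counters equal to 1 after these two ticks. With aging, the more recently referenced page 1 ends up with the larger counter, so page 0 is the eviction candidate.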

51
Simulating LRU in Software

The aging algorithm simulates LRU in software
Note: 6 pages for 5 clock ticks, (a) to (e)

52
Working-Set Model

Pages are loaded only on demand. This strategy is called demand paging.
During any phase of execution, the process references only a relatively small fraction of its pages. This is called locality of reference.
The set of pages that a process is currently using is called its working set.
A program causing page faults every few instructions is said to be thrashing.
Many paging systems make sure each process's working set is in memory before letting the process run. This approach is called the working set model.

53
Locality In A Memory-Reference Pattern
54
Working-Set Model

Loading the pages before letting processes run is called prepaging.
Δ ≡ w(k, t) ≡ working-set window containing k page references at time t
Example: 10,000 instructions
WSSi (working set of process Pi) = total number of pages referenced in the most recent Δ (varies in time)
if Δ is too small, it will not encompass the entire locality.
if Δ is too large, it will encompass several localities.
if Δ = ∞, it will encompass the entire program.
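
The effect of the window size can be seen on a toy reference string. The sketch assumes the window is counted in page references and that time t is an index into the reference string; the function name and the sample string are illustrative.

```python
from typing import List, Set


def working_set(references: List[int], t: int, window: int) -> Set[int]:
    """Pages referenced in the `window` most recent references up to index t."""
    start = max(0, t - window + 1)
    return set(references[start:t + 1])


refs = [1, 2, 1, 3, 4, 4, 4, 4]
for window in (4, 5, 100):
    print(window, sorted(working_set(refs, t=7, window=window)))
```

A window of 4 captures only the current locality (page 4), a slightly larger window starts to pull in the previous locality, and a very large window covers every page the program has touched.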

55
Working-set model
56
The Working Set Page Replacement Algorithm

The working set is the set of pages used by the k
most recent memory references
w(k,t) is the size of the working set at time t

57
Working-Set Model

The idea is to examine the k most recent page references. Evict a page that is not in the working set.
In practice, the working set of a process is taken to be the set of pages it has referenced during the past t seconds of virtual time (the amount of CPU time the process has actually used).
Scan the entire page table and evict a page chosen as follows:
A page whose referenced bit is clear and whose age is greater than t.
If no page has an age greater than t, the oldest page whose referenced bit is clear.
If no page has its referenced bit clear, the page whose age is largest.
The basic working set algorithm is expensive. Instead, WSClock is used in practice.

58
The Working Set Page Replacement Algorithm

The working set algorithm

59
The WSClock Page Replacement Algorithm

Operation of the WSClock algorithm

60
Review of Page Replacement Algorithms
61
Modeling Page Replacement Algorithms

Belady discovered that more page frames do not always mean fewer page faults. This is called Belady's anomaly.
Conceptually, a process's memory accesses can be characterized by an (ordered) list of page numbers. This list is called the reference string.
A paging system can be characterized by three items
The reference string of the executing process.
The page replacement algorithm.
The number of page frames available in memory, m.

62
Modeling Page Replacement Algorithms: Belady's Anomaly

FIFO with 3 page frames
FIFO with 4 page frames
P's show which page references cause page faults
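
The figure for this slide is not part of the transcript, so the sketch below uses the classic twelve-reference string commonly used to demonstrate the anomaly; treat the string as an assumption. The function name `fifo_faults` is illustrative.

```python
from collections import deque
from typing import List


def fifo_faults(references: List[int], frames: int) -> int:
    """Count page faults under FIFO replacement with the given number of frames."""
    queue = deque()       # resident pages, oldest at the left
    resident = set()
    faults = 0
    for page in references:
        if page in resident:
            continue
        faults += 1
        if len(queue) == frames:
            resident.discard(queue.popleft())   # evict the oldest page
        queue.append(page)
        resident.add(page)
    return faults


refs = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]   # assumed reference string
print(fifo_faults(refs, 3), fifo_faults(refs, 4))
```

With this string, FIFO takes 9 faults with 3 frames but 10 faults with 4 frames: adding memory made things worse.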

63
Modeling Page Replacement Algorithms

Modeling LRU Algorithms
When a page is referenced, it is always moved to the top entry of the pages in memory.
If the page referenced was already in memory, all pages above it move down one position.
Pages that are below the referenced page are not moved.
Algorithms that do not suffer from Belady's anomaly, such as LRU and the optimal page replacement algorithm, are called stack algorithms.

64
Stack Algorithms

State of memory array, M, after each item in
reference string is processed.
n virtual pages and m page frames.

65
FIFO Page Replacement

15 page faults

66
Optimal Page Replacement

9 page faults

67
LRU Page Replacement

12 page faults
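
The figures for the three preceding slides are missing from the transcript. The counts 15, 9, and 12 are what a widely used textbook reference string produces with 3 page frames, so the sketch below assumes that string; both the string and the frame count are assumptions, and `count_faults` is an illustrative name.

```python
from typing import List


def count_faults(references: List[int], frames: int, policy: str) -> int:
    """Count page faults for policy 'fifo', 'lru', or 'opt'."""
    memory: List[int] = []
    loaded_at, last_used = {}, {}
    faults = 0
    for t, page in enumerate(references):
        if page not in memory:
            faults += 1
            if len(memory) == frames:
                if policy == "fifo":      # evict the page loaded earliest
                    victim = min(memory, key=loaded_at.get)
                elif policy == "lru":     # evict the page unused for the longest time
                    victim = min(memory, key=last_used.get)
                else:                     # opt: evict the page needed farthest ahead
                    future = references[t + 1:]
                    victim = max(
                        memory,
                        key=lambda p: future.index(p) if p in future else len(future),
                    )
                memory.remove(victim)
            memory.append(page)
            loaded_at[page] = t
        last_used[page] = t
    return faults


refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]   # assumed string
for policy in ("fifo", "opt", "lru"):
    print(policy, count_faults(refs, 3, policy))
```

The optimal policy needs the rest of the reference string, which is why it serves only as a yardstick for the other two.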

68
Distance String

Distance String - A page reference can be denoted by the distance of the page from the top of the stack. Pages not in memory are at distance infinity. The string so generated for a given reference string is called the distance string.
(Distance 1 <=> page is at the top.)
(Distance infinity <=> page is not in memory and has not been accessed yet.)
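
For LRU, the distance string can be computed by keeping the stack as a Python list with the top at index 0. The number of faults with m frames is then the number of distances greater than m. The reference string below is an arbitrary example and the function names are illustrative.

```python
import math
from typing import List


def distance_string(references: List[int]) -> List[float]:
    """Distance of each referenced page from the top of the LRU stack."""
    stack: List[int] = []          # stack[0] is the top (most recently used)
    distances: List[float] = []
    for page in references:
        if page in stack:
            distances.append(stack.index(page) + 1)
            stack.remove(page)
        else:
            distances.append(math.inf)   # never referenced before
        stack.insert(0, page)            # referenced page moves to the top
    return distances


def faults_from_distances(distances: List[float], frames: int) -> int:
    """A reference faults exactly when its distance exceeds the frame count."""
    return sum(1 for d in distances if d > frames)


refs = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]   # example reference string
d = distance_string(refs)
print(d)
print([faults_from_distances(d, m) for m in (3, 4, 5)])
```

Because one distance string answers the question for every m at once, the fault count can never increase as m grows, which is why stack algorithms are free of Belady's anomaly.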

69
The Distance String

Probability density functions for two
hypothetical distance strings
Which needs more page frames?

70
The Distance String

Computation of page fault rate from distance
string
the C vector
the F vector

71
Local versus Global Allocation Policies

Global algorithms dynamically allocate page frames among all runnable processes. Local algorithms allocate each process a fixed share of page frames and replace only that process's own pages.
A global algorithm such as the page fault frequency (PFF) algorithm is used to prevent thrashing and keep the paging rate within acceptable bounds.
fault rate too high → assign more page frames to the process.
fault rate too low → assign the process fewer page frames.
72
Design Issues for Paging Systems: Local versus Global Allocation Policies

Original configuration
Local page replacement
Global page replacement

73
Local versus Global Allocation Policies

Page fault rate as a function of the number of
page frames assigned

74
Load Control

Despite good designs, the system may still thrash
when the PFF (page fault frequency) algorithm indicates
some processes need more memory
but no processes need less
Solution: Reduce the number of processes competing for memory
swap one or more to disk, divide up the pages they held
reconsider the degree of multiprogramming

75
Page Size

Small page size
Advantages
less internal fragmentation
better fit for various data structures, code
sections
less unused program in memory
Disadvantages
programs need many pages, larger page tables

76
Page Size

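Overhead due to page table and internal fragmentation
$overhead = \frac{s \cdot e}{p} + \frac{p}{2}$
(page table space plus internal fragmentation), minimized when $p = \sqrt{2se}$
Where
s = average process size in bytes
p = page size in bytes
e = size of a page table entry in bytes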

77
Page Size

Example
s = 128K
e = 8
p = $\sqrt{2 \cdot 128\text{K} \cdot 8} \approx 1448$ bytes
→ choose p = 1K or 2K
In general, 512 < page size < 8K.
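
The example can be reproduced numerically. The sketch assumes the usual overhead model of page table space (s·e/p) plus half a page of internal fragmentation (p/2); the function names are illustrative.

```python
import math


def overhead(s: int, e: int, p: float) -> float:
    """Page table space plus average internal fragmentation, in bytes."""
    return s * e / p + p / 2


def optimal_page_size(s: int, e: int) -> float:
    """Page size that minimizes the overhead: sqrt(2 * s * e)."""
    return math.sqrt(2 * s * e)


s, e = 128 * 1024, 8
print(round(optimal_page_size(s, e)))          # close to 1448
print(overhead(s, e, 1024), overhead(s, e, 2048))
```

The two nearest powers of two, 1K and 2K, give the same overhead in this example, which is why the slide offers either as a reasonable choice.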

78
Separate Instruction and Data Spaces

Most systems separate the address spaces for instructions (program text) and data.
A process can have two pointers in its process table entry: one to the instruction page table and one to the data page table. Shared code can then be pointed to by two processes.
For shared data, only the data pages that are written need to be copied. This approach is called copy on write.

79
Separate Instruction and Data Spaces

One address space
Separate I and D spaces

80
Shared Pages

Two processes sharing the same program, sharing its page table

81
Cleaning Policy

Need for a background process, the paging daemon
periodically inspects the state of memory
When too few frames are free
selects pages to evict using a replacement algorithm
It can use a two-handed clock.
The front hand is controlled by the paging system.
The back hand is used for the page replacement algorithm.

82
Implementation Issues: Operating System Involvement with Paging

Four times when the OS is involved with paging
Process creation
determine program size
create page table
Process execution
MMU reset for new process
TLB flushed
Page fault time
determine virtual address causing fault
swap target page out, needed page in
Process termination time
release page table, pages

83
Page Fault Handling

Hardware traps to kernel
General registers saved
OS determines which virtual page needed
OS checks validity of address, seeks page frame
If selected frame is dirty, write it to disk

84
Page Fault Handling

OS schedules the new page to be brought in from disk
Page tables updated
Faulting instruction backed up to when it began
Faulting process scheduled
Registers restored
Program continues

85
Instruction Backup

An instruction causing a page fault

86
Locking Pages in Memory

Virtual memory and I/O occasionally interact
A process issues a call to read from a device into a buffer
while it waits for I/O, another process starts up
and has a page fault
the buffer of the first process may be chosen to be paged out
If a page involved in an I/O transfer is paged out, part of the data will end up in the buffer and part in the newly loaded page. To prevent this, pages engaged in I/O need to be locked in memory (pinning).

87
Backing Store

Two approaches can be used to allocate page space on the disk
Paging to a static swap area - Reserve separate swap areas for the text, data, and stack when the process is started.
Backing up pages dynamically with a disk map - Allocate disk space for each page when it is swapped out and free it when the page is swapped back in.

88
Backing Store

(a) Paging to static swap area
(b) Backing up pages dynamically

89
Separation of Policy and Mechanism

To manage the complexity of any system, the policy is separated from the mechanism.
The memory management system can be divided into
three parts
A low-level MMU handler.
A page fault handler that is part of the kernel.
An external pager running in user space.

90
Separation of Policy and Mechanism

Page fault handling with an external pager

91
Segmentation

Consider a compiler which has many tables.
In a one-dimensional address space, a growing table may bump into another.
A segmented memory allows each table to grow or shrink independently.

92
Segmentation

One-dimensional address space with growing tables
One table may bump into another

93
Segmentation

Allows each table to grow or shrink, independently

94
Segmentation

Comparison of paging and segmentation

95
Segmentation

A segment is a logically independent address space.
segments may have different sizes
their sizes may change dynamically
the address space uses 2-dimensional memory addresses with 2 parts: (segment number, offset within segment)
segments may have different protections
segmentation allows for the sharing of procedures and data between processes. An example is the shared library.

96
Implementation of Pure Segmentation

The implementation of segmentation differs from paging in an essential way: pages are fixed size and segments are not.
External fragmentation, or checkerboarding, is memory wasted in the holes between segments. It can be dealt with by compaction.

97
Implementation of Pure Segmentation

(a)-(d) Development of checkerboarding
(e) Removal of the checkerboarding by compaction

98
Pure paging

The address generated by the CPU is divided into
Page number (p): used as an index into a page table, which contains the base address of each page in physical memory.
Page offset (d): combined with the base address to define the physical memory address that is sent to the memory unit.
For a logical address space of size 2^m and a page size of 2^n:

page number | page offset
p           | d
m - n bits  | n bits

Where p is an index into the page table and d is the displacement within the page.
99
Pure paging
100
Pure Segmentation
101
Segmentation with Paging

Paging the segments allows large segments to fit in main memory.
MULTICS ran on the Honeywell 6000 machines
The Motorola 68000 line is designed around a flat address space, whereas the Intel 80x86 and Pentium family are based on segmentation. Both are merging their memory models toward a mixture of paging and segmentation.

102
Segmentation with Paging MULTICS

A 34-bit MULTICS virtual address

103
Segmentation with Paging MULTICS

Descriptor segment points to page tables
Segment descriptor numbers are field lengths

104
Segmentation with Paging MULTICS

Conversion of a 2-part MULTICS address into a
main memory address

105
Segmentation with Paging MULTICS

Simplified version of the MULTICS TLB
Existence of 2 page sizes makes actual TLB more
complicated

106
Segmentation with Paging Intel 386

On the 386, the logical-address space of a process is divided into two partitions. The first partition is private to that process and the second is shared among all processes.
Information about the first partition is kept in the local descriptor table (LDT); information about the second partition is kept in the global descriptor table (GDT).
The physical address on the 386 is 32 bits. The segment register points to the appropriate entry in the LDT or GDT.

107
Segmentation with Paging Intel 386

The base and limit information about the segment is used to generate a linear address. The linear address is divided into a page number of 20 bits and a page offset of 12 bits.
As shown in the following diagram, the Intel 386
uses segmentation with paging for memory
management with a two-level paging scheme.

108
Segmentation with Paging Pentium