Title: Paging, Page Tables, and Such
1. Paging, Page Tables, and Such
2. Today's Topics
- Page Replacement Strategies
- Making Paging Fast
- Reducing the Overhead of Page Tables
3. Review: Working Sets
[Figure: requests/second of throughput vs. number of page frames allocated to a process; too few frames causes thrashing, too many frames is over-allocation]
4. Page Replacement
- What happens when we take a page fault and we've run out of memory?
- Goal: keep each process's working set in memory
- Giving more than the working set is not necessary
- Key issue: how do we identify working sets?
5. Belady's Algorithm
- Evict the page that won't be used for the longest time in the future (sketch below)
- This page is probably not in the working set
- If it is in the working set, we're thrashing
- This is optimal!
- Minimizes the number of page faults
- Major problem: this requires a crystal ball
- There is no good way to predict future memory accesses
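A minimal sketch of Belady's victim selection, assuming (unrealistically) that the full future reference string is known; the class and variable names are illustrative:

import java.util.*;

/** Belady's optimal replacement: evict the resident page whose next use
 *  lies farthest in the future. Purely illustrative -- it requires the
 *  future reference string, which no real OS has. */
class Belady {
    static int victim(Set<Integer> resident, List<Integer> future) {
        int victim = -1, farthest = -1;
        for (int page : resident) {
            int next = future.indexOf(page);   // position of this page's next use
            if (next == -1) return page;       // never used again: perfect victim
            if (next > farthest) { farthest = next; victim = page; }
        }
        return victim;
    }

    public static void main(String[] args) {
        Set<Integer> resident = new HashSet<>(List.of(1, 2, 3));
        List<Integer> future = List.of(2, 1, 2, 1, 2);  // page 3 is never used again
        System.out.println(victim(resident, future));   // prints 3
    }
}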
6. How Good are These Page Replacement Algorithms?
- LIFO
- Newest page is kicked out
- FIFO
- Oldest page is kicked out
- Random
- Random page is kicked out
- LRU
- Least recently used page is kicked out
7. Temporal Locality
- Assumption: recently accessed pages will be accessed again soon
- Use the past to predict the future
- LIFO is horrendous
- Random is also pretty bad
- LRU is pretty good
- FIFO is mediocre
- VAX/VMS used a form of FIFO because of hardware limitations
8. Implementing LRU: Approach 1
on each memory reference:
    long timeStamp = System.currentTimeMillis();
    sortedList.insert(pageFrameNumber, timeStamp);
- Problem: this is too inefficient
- Time stamp and data structure manipulation on each memory operation
- Too complex for hardware
9. Making LRU Efficient
- Use hardware support
- A reference bit is set when a page is accessed
- It can be cleared by the OS
- Trade off accuracy for speed
- It suffices to find a pretty old page
[PTE layout: 20-bit page frame number | 2-bit prot | M | R | V (one bit each)]
10. Approach 2: LRU Approximation with Reference Bits
- For each page, maintain a set of reference bits
- Let's call it a reference byte
- Periodically, shift the HW reference bit into the highest-order bit of the reference byte (sketch below)
- Suppose the reference byte was 10101010
- If the HW bit was set, the new reference byte becomes 11010101
- The frame with the lowest value is the LRU page
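A sketch of the periodic shift, assuming the OS can read and clear each frame's hardware reference bit; the field and method names are illustrative:

/** LRU approximation with reference bytes: on each timer tick, shift the
 *  HW reference bit into the top of a per-frame history byte, then clear
 *  the HW bit. Smaller history value = less recently used. */
class ReferenceBytes {
    int[] refByte;        // one 8-bit history per page frame
    boolean[] hwRefBit;   // stands in for the hardware-set reference bit

    ReferenceBytes(int frames) { refByte = new int[frames]; hwRefBit = new boolean[frames]; }

    /** Called periodically (e.g., on a timer interrupt). */
    void tick() {
        for (int f = 0; f < refByte.length; f++) {
            // 10101010 with HW bit set becomes 11010101, as in the slide's example
            refByte[f] = ((hwRefBit[f] ? 0x80 : 0) | (refByte[f] >> 1)) & 0xFF;
            hwRefBit[f] = false;    // the OS clears the HW bit after sampling it
        }
    }

    /** The frame with the smallest history value is (approximately) LRU. */
    int lruFrame() {
        int best = 0;
        for (int f = 1; f < refByte.length; f++)
            if (refByte[f] < refByte[best]) best = f;
        return best;
    }
}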
11. Analyzing Reference Bits
- Pro: does not impose overhead on every memory reference
- The interval rate can be configured
- Con: scanning all page frames can still be inefficient
- e.g., 4 GB of memory with 4 KB pages → 1 million page frames
12. Approach 3: LRU Clock
- Use only a single bit per page frame
- Basically, this is a degenerate form of reference bits
- On page eviction (sketch below):
- Scan through the list of reference bits
- If the value is zero, replace this page
- If the value is one, set the value to zero and keep scanning
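A sketch of the eviction scan, assuming one reference bit per frame; names are illustrative:

/** Clock replacement: a hand sweeps a circular array of reference bits.
 *  Frames with bit 1 get a second chance (bit cleared); the first frame
 *  with bit 0 is evicted. */
class ClockReplacer {
    boolean[] refBit;   // one reference bit per page frame
    int hand = 0;       // current clock-hand position

    ClockReplacer(int frames) { refBit = new boolean[frames]; }

    int evict() {
        while (true) {
            if (!refBit[hand]) {                      // not referenced since last sweep
                int victim = hand;
                hand = (hand + 1) % refBit.length;
                return victim;
            }
            refBit[hand] = false;                     // give it a second chance
            hand = (hand + 1) % refBit.length;
        }
    }
}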
13. Why Clock?
Typically implemented with a circular queue of reference bits, with a clock hand sweeping around it.
[Figure: a circular queue of 0/1 reference bits]
14. Analyzing Clock
- Pro: very low overhead
- Only runs when a page needs to be evicted
- Takes the first page that hasn't been referenced
- Con: isn't very accurate (one measly bit!)
- Degenerates into FIFO if all reference bits are set
- Pro: but the algorithm is self-regulating
- If there is a lot of memory pressure, the clock runs more often (and its bits are more up-to-date)
15. When Does LRU Do Badly?
- LRU performs poorly when there is little temporal locality
- Example: many database workloads
SELECT * FROM Employees WHERE Salary < 25000
- A full table scan touches each page exactly once, so recency predicts nothing
16. Today's Topics
- Page Replacement Strategies
- Making Paging Fast
- Reducing the Overhead of Page Tables
17. Review: Mechanics of Address Translation
[Figure: a virtual address splits into a virtual page number and an offset; the page table maps the virtual page to a page frame in physical memory, and the physical address is the page frame number combined with the offset]
Problem: page tables live in memory
18. Making Paging Fast
- We must avoid a page table lookup for every memory reference
- This would double memory access time
- Solution: the Translation Lookaside Buffer (TLB)
- Fancy name for a cache
- The TLB stores a subset of PTEs (page table entries)
- TLBs are small and fast (16-48 entries)
- Can be accessed for free
19. TLB Details
- In practice, most (> 99%) of memory translations are handled by the TLB
- Each processor has its own TLB
- The TLB is fully associative
- Any TLB slot can hold any PTE
- The full VPN is the cache key
- All entries are searched in parallel
- Who fills the TLB? Two options:
- Hardware (x86) walks the page table on a TLB miss
- Software (MIPS, Alpha): a routine fills the TLB on a miss
- The TLB itself needs a replacement policy
- Usually implemented in hardware (LRU); see the sketch below
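A toy model of the TLB as a small, LRU-evicting cache keyed by VPN. A real TLB searches all entries in parallel in hardware; here a map stands in for that, and the page-table walk is just another map lookup. All names are illustrative:

import java.util.*;

/** TLB sketch: hit = free translation; miss = walk the page table and
 *  cache the result, evicting the least recently used entry when full. */
class TinyTlb {
    static final int CAPACITY = 48;                        // TLBs are small
    Map<Long, Long> entries = new LinkedHashMap<>(16, 0.75f, true) {
        protected boolean removeEldestEntry(Map.Entry<Long, Long> e) {
            return size() > CAPACITY;                      // LRU eviction
        }
    };
    Map<Long, Long> pageTable;                             // stands in for the real walk

    TinyTlb(Map<Long, Long> pageTable) { this.pageTable = pageTable; }

    long translate(long vpn) {                             // assumes vpn is mapped;
        Long pfn = entries.get(vpn);                       // a real miss would fault
        if (pfn == null) {
            pfn = pageTable.get(vpn);                      // slow path: page-table walk
            entries.put(vpn, pfn);                         // fill the TLB
        }
        return pfn;
    }
}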
20. What Happens on a Context Switch?
- Each process has its own address space
- So, each process has its own page table
- So, page-table entries are only relevant for a particular process
- Thus, the TLB must be flushed on a context switch
- This is why context switches are so expensive
21. Ben's Idea
- We can avoid flushing the TLB if entries are associated with an address space
- When would this work well?
- When would this not work well?
[PTE layout: 20-bit page frame number | 2-bit prot | M | R | V (one bit each) | 4-bit ASID]
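A sketch of how the ASID changes the cache key, extending the toy TLB model above; the Key record and method names are illustrative, not a real hardware interface:

import java.util.*;

/** ASID-tagged TLB: entries are keyed by (ASID, VPN), so entries from
 *  other address spaces can stay cached across a context switch -- they
 *  simply never match the current process's lookups. */
class AsidTlb {
    record Key(int asid, long vpn) {}       // ASID + VPN form the cache key
    Map<Key, Long> entries = new HashMap<>();

    void fill(int asid, long vpn, long pfn) {
        entries.put(new Key(asid, vpn), pfn);
    }

    Long lookup(int asid, long vpn) {       // null = TLB miss
        return entries.get(new Key(asid, vpn));
    }
}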
22. TLB Management Pain
- The TLB is a cache of page table entries
- The OS must ensure that page tables and TLB entries stay in sync
- Massive pain: TLB consistency across multiple processors
- Q: how do we implement LRU if reference bits are stored in the TLB?
- One answer: we don't
- Windows uses FIFO for multiprocessor machines
23. Today's Topics
- Page Replacement Strategies
- Making Paging Fast
- Reducing the Overhead of Page Tables
24. Page Table Overhead
- For large address spaces, page table sizes can become enormous
- Example: the Alpha architecture
- 64-bit address space, 8 KB pages
Num PTEs = 2^64 / 2^13 = 2^51
Assuming 8 bytes per PTE: Num Bytes = 2^54 = 16 petabytes
And this is per-process!
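The slide's arithmetic, checked in code (using 1 petabyte = 2^50 bytes):

/** Size of a flat page table for a 64-bit address space with 8 KB pages
 *  and 8-byte PTEs. */
class PageTableSize {
    public static void main(String[] args) {
        long numPtes = 1L << (64 - 13);     // 2^64 bytes / 2^13 bytes per page = 2^51 PTEs
        long numBytes = numPtes * 8;        // 8 bytes per PTE => 2^54 bytes
        System.out.println(numBytes >> 50); // prints 16 (petabytes), per process!
    }
}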
25. Optimizing for Sparse Address Spaces
- Observation: very little of the address space is in use at a given time
- This is why virtual memory works
- Basic idea: only allocate page tables where we need to
- And fill in new page tables on demand
[Figure: a sparsely populated virtual address space]
26. Implementing Sparse Address Spaces
- We need a data structure to keep track of the page tables we have allocated
- And this structure must be small
- Otherwise, we've defeated our original goal
- Solution: multi-level page tables
- Page tables of page tables
- "Any problem in CS can be solved with a layer of indirection"
27. Two-Level Page Tables
[Figure: the virtual address splits into a master page number, a secondary page number, and an offset; the master page table points to secondary page tables, whose entries hold page frame numbers; the physical address is the page frame number combined with the offset; some master entries are empty]
Key point: not all secondary page tables must be allocated (see the sketch below)
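A minimal sketch of a two-level table with on-demand secondary allocation; the sizes (10-bit master index, 10-bit secondary index) and names are illustrative:

/** Two-level page table: the master table holds references to secondary
 *  tables, which are allocated only on first touch, so unused regions of
 *  the address space cost nothing. */
class TwoLevelPageTable {
    static final int BITS = 10, SIZE = 1 << BITS;
    long[][] master = new long[SIZE][];   // null = secondary table not allocated

    void map(long vpn, long pfn) {
        int hi = (int) (vpn >> BITS), lo = (int) (vpn & (SIZE - 1));
        if (master[hi] == null)
            master[hi] = new long[SIZE];  // fill in a secondary table on demand
        master[hi][lo] = pfn;
    }

    long lookup(long vpn) {
        int hi = (int) (vpn >> BITS), lo = (int) (vpn & (SIZE - 1));
        long[] secondary = master[hi];
        return (secondary == null) ? -1 : secondary[lo];  // -1 = unmapped (fault)
    }
}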
28. Generalizing
- Early architectures used 1-level page tables
- VAX and x86 used 2-level page tables
- SPARC uses 3-level page tables
- Alpha and 68030 use 4-level page tables
- Key thing: the outermost level must be wired down (pinned in physical memory) in order to break the recursion
29. Cool Paging Tricks
- Basic idea: exploit the layer of indirection between virtual and physical memory
30. Trick 1: Shared Memory
- Allow different processes to share physical memory
[Figure: two virtual address spaces mapping some of their pages to the same physical frames]
31. Trick 2: Copy-on-Write
- Recall that fork() copies the parent's address space to the child
- This is inefficient, especially if the child calls exec()
- Copy-on-write allows for a fast copy by using shared pages
- If the child tries to write to a page, the OS intervenes and makes a copy of the target page
- Implementation: pages are shared as read-only
- The OS intercepts write faults (see the sketch below)
[PTE layout: page frame number | prot | M | R | V]
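A sketch of the write-fault path under the implementation just described (shared read-only pages, OS intercepts write faults); the Pte fields, frame array, and freeFrame parameter are illustrative:

/** Copy-on-write fault handler: the first write to a shared page copies
 *  just that page into a free frame and makes the copy privately writable. */
class CopyOnWrite {
    static class Pte { long pfn; boolean writable; boolean cow; }

    void onWriteFault(Pte pte, byte[][] frames, int freeFrame) {
        if (pte.cow) {
            byte[] src = frames[(int) pte.pfn];
            frames[freeFrame] = src.clone();  // copy only the faulting page
            pte.pfn = freeFrame;              // point the PTE at the private copy
            pte.writable = true;              // future writes proceed normally
            pte.cow = false;
        }
    }
}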
32. Trick 3: Memory-Mapped Files
- Normally, files are accessed with system calls
- Open, read, write, close
- Memory mapping allows a program to access a file
with load/store operations
[Figure: a region of the virtual address space mapped onto the file Foo.txt]
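Java exposes memory mapping through FileChannel.map(): after mapping, the file's contents are read with ordinary loads through the buffer instead of read() system calls. The file name Foo.txt is just the example from the figure:

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MapFile {
    public static void main(String[] args) throws IOException {
        try (FileChannel ch = FileChannel.open(Path.of("Foo.txt"),
                                               StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            byte first = buf.get(0);          // a load, not a read() syscall
            System.out.println((char) first);
        }
    }
}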