Practical, transparent operating system support for superpages

1
Practical, transparent operating system support
for superpages
  • Juan Navarro, Sitaram Iyer
  • Peter Druschel, Alan Cox

Rice University
OSDI 2002
2
Overview
  • Increasing cost in TLB miss overhead
  • growing working sets
  • TLB size does not grow at same pace
  • Processors now provide superpages
  • one TLB entry can map a large region
  • OSs have been slow to harness them
  • no transparent superpage support for apps
  • This talk: a practical and transparent solution
    to support superpages

3
Translation look-aside buffer
  • TLB caches virtual-to-physical address
    translations
  • TLB coverage
  • amount of memory mapped by TLB
  • amount of memory that can be accessed without TLB
    misses
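
A minimal sketch of the coverage arithmetic, in Python. The 128-entry TLB, the 8 KB and 4 MB page sizes, and the 512 MB of RAM are taken from the Alpha 21264 configuration listed in the experimental setup; the function names are illustrative only, and the sketch assumes every TLB entry maps a page of the same size.

```python
KB = 1024
MB = 1024 * KB


def tlb_coverage(entries: int, page_size: int) -> int:
    """Bytes reachable without a TLB miss if every entry maps one page of page_size."""
    return entries * page_size


def coverage_fraction(entries: int, page_size: int, main_memory: int) -> float:
    """TLB coverage expressed as a fraction of main memory."""
    return tlb_coverage(entries, page_size) / main_memory


# 128 entries of 8 KB base pages cover 1 MB.
base_coverage = tlb_coverage(128, 8 * KB)
# The same 128 entries filled with 4 MB superpages cover 512 MB.
superpage_coverage = tlb_coverage(128, 4 * MB)
```

With base pages only, the 128 entries cover 1/512 of a 512 MB machine; with 4 MB superpages the same number of entries could cover all of it.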

4
TLB coverage trend
TLB coverage as percentage of main memory
Factor of 1000 decrease in 15 years
5
How to increase TLB coverage
  • Typical TLB coverage ≈ 1 MB
  • Use superpages!
  • large and small pages
  • Increase TLB coverage
  • no increase in TLB size

6
What are these superpages anyway?
  • Memory pages of larger sizes
  • supported by most modern CPUs
  • Otherwise, same as normal pages
  • power of 2 size
  • use only one TLB entry
  • contiguous
  • aligned (physically and virtually)
  • uniform protection attributes
  • one reference bit, one dirty bit
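
The constraints listed above can be sketched as a simple eligibility check. This is an illustration only: the `Mapping` type, the `can_form_superpage` name, the protection strings, and the 8 KB base page size are assumptions made for the example, not part of any real kernel interface.

```python
from typing import NamedTuple, Sequence

BASE_PAGE = 8 * 1024  # assumed base page size in bytes


class Mapping(NamedTuple):
    vaddr: int  # virtual address of the base page
    paddr: int  # physical address of the backing frame
    prot: str   # protection attributes, e.g. "r-" or "rw"


def can_form_superpage(pages: Sequence[Mapping], base_page: int = BASE_PAGE) -> bool:
    """Check size, alignment, contiguity, and uniform protection."""
    if not pages:
        return False
    size = len(pages) * base_page
    if size & (size - 1):  # size must be a power of 2
        return False
    first = pages[0]
    if first.vaddr % size or first.paddr % size:  # aligned virtually and physically
        return False
    for i, page in enumerate(pages):
        if page.vaddr != first.vaddr + i * base_page:  # virtually contiguous
            return False
        if page.paddr != first.paddr + i * base_page:  # physically contiguous
            return False
        if page.prot != first.prot:  # uniform protection attributes
            return False
    return True
```

Eight 8 KB pages form a 64 KB candidate only if both the first virtual address and the first physical address are multiples of 64 KB.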

7
A superpage TLB
Alpha: 8, 64, 512 KB; 4 MB
Itanium: 4, 8, 16, 64, 256 KB; 1, 4, 16, 64, 256 MB
[Figure: a TLB translating virtual addresses to physical addresses, holding a base page entry (size 1) and a superpage entry (size 4) that map virtual memory onto physical memory]
8
II. The superpage problem
9
Issue 1: superpage allocation
[Figure: region B in virtual memory and in physical memory, with superpage boundaries marked]
  • How / when / what size to allocate?

10
Issue 2: promotion
  • Promotion: create a superpage out of a set of
    smaller pages
  • mark page table entry of each base page
  • When to promote?

Forcibly populate pages? May incur I/O cost or
increase internal fragmentation.
11
Issue 3: demotion
Demotion: convert a superpage into smaller pages
  • when page attributes of base pages of a superpage
    become non-uniform
  • during partial pageouts

12
Issue 4: fragmentation
  • Memory becomes fragmented due to
  • use of multiple page sizes
  • scattered wired (non-pageable) pages
  • Contiguity: contended resource
  • OS must
  • use contiguity restoration techniques
  • trade off impact of contiguity restoration
    against superpage benefits

13
Previous approaches
  • Reservations
  • one superpage size only
  • Relocation
  • move pages at promotion time
  • must recover copying costs
  • Eager superpage creation (IRIX, HP-UX)
  • size specified by user: non-transparent
  • Hardware support
  • Contiguous virtual superpage mapped to
    discontiguous physical base pages
  • Demotion issues not addressed
  • large pages partially dirty/referenced

14
III. Design
15
Key observation
Once an application touches the first page of a
memory object then it is likely that it will
quickly touch every page of that object
  • Example: array initialization
  • Opportunistic policies
  • superpages as large and as soon as possible
  • as long as no penalty if wrong decision

16
Superpage allocation
Preemptible reservations
[Figure: region B in virtual memory and in physical memory, with superpage boundaries and the reserved frames marked]
How much do we reserve? Goal: good TLB
coverage without internal fragmentation.
17
Allocation: reservation size
  • Opportunistic policy
  • Go for biggest size that is no larger than the
    memory object (e.g., file)
  • If required size not available, try preemption
    before resigning to a smaller size
  • preempted reservation had its chance
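
The size policy above might be sketched as follows. This is a simplified illustration: it ignores alignment and overlap with existing reservations, the page sizes are the Alpha sizes from the experimental setup, and `free_extent_available` and `try_preempt` are hypothetical callbacks standing in for the physical memory allocator.

```python
KB = 1024
MB = 1024 * KB
PAGE_SIZES = [8 * KB, 64 * KB, 512 * KB, 4 * MB]


def candidate_sizes(object_size, sizes=PAGE_SIZES):
    """Sizes no larger than the object, biggest first; the base page always qualifies."""
    fitting = sorted((s for s in sizes if s <= object_size), reverse=True)
    return fitting or [min(sizes)]


def choose_reservation(object_size, free_extent_available, try_preempt, sizes=PAGE_SIZES):
    """Return (size, preempted) for the reservation to make, or None if nothing fits.

    free_extent_available(size) -> bool: is a free, aligned extent of this size on hand?
    try_preempt(size) -> bool: could an existing reservation be preempted to make one?
    """
    for size in candidate_sizes(object_size, sizes):
        if free_extent_available(size):
            return size, False
        # Try preemption before resigning to a smaller size.
        if try_preempt(size):
            return size, True
    return None
```

For a 600 KB object the sketch first asks for 512 KB, attempts preemption if no such extent is free, and only then falls back to 64 KB and 8 KB.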

18
Allocation: managing reservations
Reservation lists, one per size of largest unused (and aligned) chunk: 4, 2, 1
  • best candidate for preemption at front
  • reservation whose most recently populated frame
    was populated the least recently
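
One way to picture these lists is the sketch below. The `ReservationLists` class and its method names are hypothetical. Falling back to a list with a larger free chunk when the exact list is empty is an assumption of this sketch, not something the slide states.

```python
from collections import OrderedDict


class ReservationLists:
    """One list per 'largest unused aligned chunk' size, ordered by last population time."""

    def __init__(self, chunk_sizes):
        self.lists = {size: OrderedDict() for size in chunk_sizes}
        self.where = {}  # reservation id -> chunk size of the list that holds it

    def touch(self, res_id, largest_unused_chunk):
        """Record that a frame of res_id was just populated."""
        old = self.where.pop(res_id, None)
        if old is not None:
            del self.lists[old][res_id]
        # Reservations with no unused chunk left (e.g. 0) drop out of the lists.
        if largest_unused_chunk in self.lists:
            self.lists[largest_unused_chunk][res_id] = True  # most recent goes to the back
            self.where[res_id] = largest_unused_chunk

    def preempt(self, needed):
        """Remove and return the best preemption candidate offering a chunk >= needed."""
        for size in sorted(s for s in self.lists if s >= needed):
            if self.lists[size]:
                res_id, _ = self.lists[size].popitem(last=False)  # take from the front
                del self.where[res_id]
                return res_id
        return None
```

Because every population moves a reservation to the back of its list, the front always holds the reservation whose most recent population is the oldest.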

19
Incremental promotions
  • Promotion policy: opportunistic

[Figure: a reservation promoted incrementally as it fills; superpage sizes 2, 4, 4+2, 8]
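
A minimal sketch of incremental promotion, assuming for illustration that superpage sizes are 2, 4, and 8 base pages (real hardware sizes differ, e.g. factors of 8 on the Alpha). The `promoted_regions` name and the set-of-indices representation are assumptions made for this example.

```python
def promoted_regions(populated, start, size):
    """Return (start, size) pairs for the largest aligned, fully populated regions.

    populated: set of populated base-page indices within the reservation.
    size: number of base pages in the region examined; must be a power of 2.
    Regions of a single base page are not superpages and are not reported.
    """
    if size < 2:
        return []
    if all(i in populated for i in range(start, start + size)):
        return [(start, size)]
    half = size // 2
    return (promoted_regions(populated, start, half)
            + promoted_regions(populated, start + half, half))
```

As the reservation fills from the front, the result grows from one size-2 superpage to a size-4 superpage, then to a size-4 plus a size-2 superpage, and finally to a single size-8 superpage.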
20
Speculative demotions
  • One reference bit per superpage
  • How do we detect portions of a superpage not
    referenced anymore?
  • On memory pressure, demote superpages when
    resetting ref bit
  • Re-promote (incrementally) as pages are
    referenced
  • Demote also when the page daemon selects a base
    page as a victim page.

21
Demotions: dirty superpages
  • One dirty bit per superpage
  • what's dirty and what's not?
  • page out entire superpage
  • Demote on first write to clean superpage
  • Re-promote (incrementally) as other pages are
    dirtied
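
The effect of demoting on the first write can be illustrated with a toy model. The `TrackedSuperpage` class is hypothetical, and it simplifies re-promotion to a single step once every base page is dirty, whereas the slide describes re-promotion as incremental.

```python
class TrackedSuperpage:
    """Toy model: a superpage is either entirely clean or entirely dirty while promoted."""

    def __init__(self, n_pages):
        self.promoted = True
        self.dirty = [False] * n_pages  # per-base-page dirty bits, usable once demoted

    def write(self, index):
        if self.promoted and not self.dirty[index]:
            # First write to a clean superpage: demote so that dirtiness
            # can be tracked per base page.
            self.promoted = False
        self.dirty[index] = True
        if all(self.dirty):
            # Every base page is dirty, so one dirty bit is accurate again.
            self.promoted = True

    def pages_to_write_back(self):
        """Indices of base pages that must be written on pageout."""
        return [i for i, is_dirty in enumerate(self.dirty) if is_dirty]
```

After a single write, only one base page needs to be written back on pageout rather than the entire superpage.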

22
Fragmentation control
  • Low contiguity: modified page daemon for victim
    selection
  • restore contiguity
  • move clean, inactive pages to the free list
  • minimize impact
  • prefer pages that contribute the most to
    contiguity
  • Cluster wired pages
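
One possible reading of "prefer pages that contribute the most to contiguity" is sketched below. The scoring rule (the size of the largest aligned free block that freeing a frame would complete) and both function names are assumptions made for illustration; the slides do not specify how the daemon ranks pages.

```python
def aligned_block_if_freed(free, frame, max_order=3):
    """Size in frames of the largest aligned, fully free block containing frame once freed.

    free: list of booleans, True where the frame is already free.
    max_order: largest block considered is 2**max_order frames.
    """
    best = 1
    for order in range(1, max_order + 1):
        size = 1 << order
        start = frame - frame % size
        if start + size > len(free):
            break
        if all(free[i] or i == frame for i in range(start, start + size)):
            best = size
        else:
            break  # a larger aligned block would contain this one, so stop here
    return best


def pick_victim(free, candidates, max_order=3):
    """Among clean, inactive candidate frames, pick the one that restores the most contiguity."""
    return max(candidates, key=lambda f: aligned_block_if_freed(free, f, max_order))
```

With frames 0 to 2 free and frame 3 a candidate, freeing frame 3 completes an aligned block of four frames, so it is preferred over candidates that would only complete smaller blocks.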

23
IV. Experimental evaluation
24
Experimental setup
  • FreeBSD 4.3
  • Alpha 21264, 500 MHz, 512 MB RAM
  • 8 KB, 64 KB, 512 KB, 4 MB pages
  • 128-entry DTLB, 128-entry ITLB
  • Unmodified applications

25
Best-case benefits
  • TLB miss reduction usually above 95%
  • SPEC CPU2000 integer
  • 11.2% improvement (0% to 38%)
  • SPEC CPU2000 floating point
  • 11.0% improvement (-1.5% to 83%)
  • Other benchmarks
  • FFT (200^3 matrix): 55%
  • 1000x1000 matrix transpose: 655%
  • 30%+ in 8 out of 35 benchmarks

26
Why multiple superpage sizes
  • Improvements with only one superpage size vs. all
    sizes

27
Conclusions
  • Superpages: 30%+ improvement
  • transparently realized; low overhead
  • Contiguity restoration is necessary
  • sustains benefits; low impact
  • Multiple page sizes are important
  • scales to very large superpages

28
Thanks!
  • Source code and more info at
  • www.cs.rice.edu/jnavarro/superpages

29
Backup slides
30
Superpage allocation
  • Relocation approach

[Figure: region B in virtual memory and in physical memory, with superpage boundaries marked; relocation incurs copying costs]
31
Population maps
[Figure: population map drawn as a tree over a reservation, with levels for size 8, size 4, and size 2; each node's population is none, partial, or full]
  • Keep track of population status of reservations
  • Also for memory objects, to find reserved frames
  • Nodes lazily created; ops are O(log n)
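
A population map along these lines could be sketched as a lazily built tree. The binary fan-out (matching the size 8, 4, 2 levels of the figure), the dictionary representation, and the `PopulationMap` name are assumptions made for this illustration.

```python
class PopulationMap:
    """Tracks how many base pages are populated in each aligned region of a reservation."""

    def __init__(self, size):
        assert size > 0 and size & (size - 1) == 0, "size must be a power of 2"
        self.size = size
        self.count = {}  # (start, size) -> populated base pages; nodes created lazily

    def populate(self, index):
        """Mark one base page populated, touching one node per tree level."""
        if not 0 <= index < self.size:
            raise IndexError(index)
        if (index, 1) in self.count:
            return  # already populated
        size = self.size
        while size >= 1:
            start = index - index % size
            self.count[(start, size)] = self.count.get((start, size), 0) + 1
            size //= 2

    def status(self, start, size):
        """Population of an aligned region: 'none', 'partial', or 'full'."""
        n = self.count.get((start, size), 0)
        if n == 0:
            return "none"
        return "full" if n == size else "partial"
```

Each call to `populate` visits log2(n) + 1 nodes for a reservation of n base pages, and regions that were never touched have no node at all.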

32
Fragmentation control: impact
  • Run web server concurrently with an app that
    continually demands 512 KB chunks
  • Impact for web server
  • <1% overhead of daemon
  • 3% degradation due to deviation from LRU
  • But for the other app
  • 30% of requests for 512 KB are granted (9 times
    more than with original daemon)