1
MET TC670 B1: Computer Science Concepts in
Telecommunication Systems
  • Fall 2003

2
Lecture 4, September 30, 2003
  • Memory management
  • Programming Concepts, and Project 1

3
Memory Management
  • Goals of memory management
  • convenient abstraction for programming
  • isolation between processes
  • allocate scarce memory resources between
    competing processes, maximize performance
    (minimize overhead)
  • Mechanisms
  • physical vs. virtual address spaces
  • page table management, segmentation policies
  • page replacement policies

4
Memory Management Topics
  • Virtual memory techniques
  • Paging system techniques
  • Segmentation techniques
  • Replacement algorithms

5
Earlier Technique: Virtual Memory
  • The basic abstraction that the OS provides for
    memory management is virtual memory (VM)
  • VM enables programs to execute without requiring
    their entire address space to be resident in
    physical memory
  • program can also execute on machines with less
    RAM than it needs
  • many programs don't need all of their code or
    data at once (or ever)
  • e.g., branches they never take, or data they
    never read/write
  • no need to allocate memory for it; the OS should
    adjust the amount allocated based on run-time
    behavior
  • virtual memory isolates processes from each other
  • one process cannot name addresses visible to
    others; each process has its own isolated address
    space

6
In the beginning
  • First, there was batch programming
  • programs used physical addresses directly
  • OS loads job, runs it, unloads it
  • Then came multiprogramming
  • need multiple processes in memory at once
  • to overlap I/O and computation
  • memory requirements
  • protection: restrict which addresses processes
    can use, so they can't stomp on each other
  • fast translation: memory lookups must be fast, in
    spite of the protection scheme
  • fast context switching: when swapping between jobs,
    updating memory hardware (protection and
    translation) must be quick

7
Virtual Addresses
  • To make it easier to manage memory of multiple
    processes, make processes use virtual addresses
  • virtual addresses are independent of the location
    in physical memory (RAM) where the referenced data
    lives
  • the OS determines the location in physical memory
  • instructions issued by the CPU reference virtual
    addresses
  • e.g., pointers, arguments to load/store
    instructions, the PC, ...
  • virtual addresses are translated by hardware into
    physical addresses (with some help from OS)
  • The set of virtual addresses a process can
    reference is its address space
  • many different possible mechanisms for
    translating virtual addresses to physical
    addresses
  • we'll take a historical walk through them, ending
    up with our current techniques

8
Old technique 1: Fixed Partitions
  • Physical memory is broken up into fixed
    partitions
  • all partitions are equally sized, partitioning
    never changes
  • hardware requirement: base register
  • physical address = virtual address + base
    register
  • base register loaded by the OS when it switches to
    a process
  • how can we ensure protection?
  • Advantages
  • simple, ultra-fast context switch
  • Problems
  • internal fragmentation: memory in a partition not
    used by its owning process isn't available to
    other processes
  • partition size problem: no one size is
    appropriate for all processes
  • fragmentation vs. fitting large programs in a
    partition

9
Fixed Partitions (K bytes)
[Figure: physical memory divided into equal K-byte partitions (partition 0 at 0, partition 1 at K, partition 2 at 2K, ...). The base register holds the owning partition's start address; the virtual address supplies the offset, and physical address = base register + offset.]
10
Old technique 2: Variable Partitions
  • Obvious next step: physical memory is broken up
    into variable-sized partitions
  • hardware requirements: base register, limit
    register
  • physical address = virtual address + base
    register
  • how do we provide protection?
  • if (physical address > base + limit) then ...?
    (raise a protection fault)
  • Advantages
  • no internal fragmentation
  • simply allocate partition size to be just big
    enough for the process
  • (assuming we know what that is!)
  • Problems
  • external fragmentation
  • as we load and unload jobs, holes are left
    scattered throughout physical memory

11
Variable Partitions
[Figure: variable-sized partitions in physical memory. The base and limit registers hold the running process's partition start (P3's base) and size (P3's size); the virtual-address offset is checked against the limit ("offset < limit?"): if yes, physical address = base + offset; if no, raise a protection fault.]
12
Modern technique: Paging
  • Solve the external fragmentation problem by using
    fixed-size units in both physical and virtual
    memory

[Figure: virtual memory divided into pages 0..X, physical memory into frames 0..Y; each virtual page maps to some physical frame.]
13
User's Perspective
  • Processes view memory as a contiguous address
    space from bytes 0 through N
  • virtual address space (VAS)
  • In reality, virtual pages are scattered across
    physical memory frames
  • virtual-to-physical mapping
  • this mapping is invisible to the program
  • Protection is provided because a program cannot
    reference memory outside of its VAS
  • the virtual address 0xDEADBEEF maps to different
    physical addresses for different processes

14
Paging
  • Translating virtual addresses
  • a virtual address has two parts: virtual page
    number + offset
  • the virtual page number (VPN) is an index into a
    page table
  • the page table entry contains a page frame number
    (PFN)
  • physical address is PFN :: offset (the PFN
    concatenated with the offset); see the sketch below
  • Page tables
  • managed by the OS
  • map virtual page number (VPN) to page frame
    number (PFN)
  • VPN is simply an index into the page table
  • one page table entry (PTE) per page in virtual
    address space
  • i.e., one PTE per VPN
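A minimal C sketch of this lookup, assuming 32-bit addresses, 4KB pages, and a hypothetical flat page_table array (the non-PFN PTE bits are ignored here):

```c
#include <stdint.h>

#define PAGE_SHIFT 12                       /* 4KB pages */
#define OFFSET_MASK ((1u << PAGE_SHIFT) - 1)

extern uint32_t page_table[];               /* hypothetical: indexed by VPN */

/* Translate a 32-bit virtual address to a physical address. */
uint32_t translate(uint32_t va)
{
    uint32_t vpn    = va >> PAGE_SHIFT;     /* upper 20 bits */
    uint32_t offset = va & OFFSET_MASK;     /* lower 12 bits */
    uint32_t pfn    = page_table[vpn];      /* PTE gives the frame number */
    return (pfn << PAGE_SHIFT) | offset;    /* PFN concatenated with offset */
}
```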

15
Paging
[Figure: the virtual page number indexes the page table to obtain a page frame number; the offset passes through unchanged, so physical address = page frame :: offset.]
16
Paging example
  • assume 32-bit addresses
  • assume page size is 4KB (4096 bytes, or 2^12
    bytes)
  • VPN is 20 bits long (2^20 VPNs), offset is 12 bits
    long
  • let's translate virtual address 0x13325328
  • VPN is 0x13325, and offset is 0x328
  • assume page table entry 0x13325 contains value
    0x03004
  • page frame number is 0x03004
  • VPN 0x13325 maps to PFN 0x03004
  • physical address = PFN :: offset = 0x03004328

17
Page Table Entries (PTEs)
PTE layout (32 bits): | page frame number (20) | prot (2) | M (1) | R (1) | V (1) |
  • PTEs control mapping
  • the valid bit says whether or not the PTE can be
    used
  • says whether or not a virtual address is valid
  • it is checked each time a virtual address is used
  • the reference bit says whether the page has been
    accessed
  • it is set when a page has been read or written to
  • the modify bit says whether or not the page is
    dirty
  • it is set when a write to the page has occurred
  • the protection bits control which operations are
    allowed
  • read, write, execute
  • the page frame number determines the physical
    page
  • physical page start address = PFN × page size
    (a possible C layout for this PTE follows)
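A hedged C sketch of one possible in-memory layout for this 32-bit PTE (the bitfield names and ordering are illustrative; real MMUs define their own formats):

```c
#include <stdint.h>

/* Illustrative 32-bit PTE matching the field widths above;
 * 20 + 2 + 1 + 1 + 1 = 25 bits used, 7 left unused. */
typedef struct {
    uint32_t valid      : 1;   /* V: translation may be used           */
    uint32_t referenced : 1;   /* R: set when the page is read/written */
    uint32_t modified   : 1;   /* M: set when the page is dirty        */
    uint32_t prot       : 2;   /* protection: read/write/execute       */
    uint32_t pfn        : 20;  /* page frame number                    */
    uint32_t unused     : 7;
} pte_t;
```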

18
Paging Advantages
  • Easy to allocate physical memory
  • physical memory is allocated from a free list of
    frames
  • to allocate a frame, just remove it from the free
    list
  • external fragmentation is not a problem!
  • complication: the kernel sometimes needs contiguous
    physical memory
  • many lists, each keeping track of free regions of a
    particular size
  • region sizes are multiples of the page size
  • buddy algorithm
  • Easy to page out chunks of programs
  • all chunks are the same size (page size)
  • use valid bit to detect references to paged-out
    pages
  • also, page sizes are usually chosen to be
    convenient multiples of disk block sizes

19
Paging Disadvantages
  • Can still have internal fragmentation
  • process may not use memory in exact multiples of
    pages
  • Memory reference overhead
  • 2 references per address lookup (page table, then
    memory)
  • solution: use a hardware cache to absorb page
    table lookups
  • translation lookaside buffer (TLB); many details
    in the textbook
  • Memory required to hold page tables can be large
  • need one PTE per page in virtual address space
  • 32-bit AS with 4KB pages = 2^20 PTEs = 1,048,576
    PTEs
  • at 4 bytes/PTE, 4MB per page table
  • OSes typically have separate page tables per
    process
  • 25 processes = 100MB of page tables
  • solution: page the page tables (!!!)
  • (ow, my brain hurts... so complicated)

20
Two-level page tables
  • With two-level PTs, virtual addresses have 3
    parts
  • master page number, secondary page number, offset
  • master PT maps master PN to a secondary PT
  • secondary PT maps secondary PN to a page frame
    number
  • PFN + offset = physical address
  • Example (see the sketch below)
  • 4KB pages, 4 bytes/PTE
  • how many bits in offset? need 12 bits for 4KB
  • want master PT in one page: 4KB / 4 bytes = 1024
    PTEs
  • hence, 1024 secondary page tables
  • so master page number = 10 bits, offset = 12
    bits
  • with a 32-bit address, that leaves 10 bits for the
    secondary PN
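A minimal C sketch of this two-level walk, under the same assumptions (10-bit master and secondary page numbers, 12-bit offset; the table names are hypothetical):

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PT_BITS    10
#define PT_MASK    ((1u << PT_BITS) - 1)
#define OFF_MASK   ((1u << PAGE_SHIFT) - 1)

/* Hypothetical master table: 1024 pointers to secondary tables,
 * each secondary table holding 1024 page frame numbers. */
extern uint32_t *master_pt[1 << PT_BITS];

uint32_t translate2(uint32_t va)
{
    uint32_t master_pn    = (va >> (PAGE_SHIFT + PT_BITS)) & PT_MASK;
    uint32_t secondary_pn = (va >> PAGE_SHIFT) & PT_MASK;
    uint32_t offset       = va & OFF_MASK;

    uint32_t *secondary_pt = master_pt[master_pn];        /* lookup 1 */
    uint32_t  pfn          = secondary_pt[secondary_pn];  /* lookup 2 */
    return (pfn << PAGE_SHIFT) | offset;                  /* then fetch data */
}
```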

21
Two-level page tables
[Figure: the master page number indexes the master page table to locate a secondary page table; the secondary page number indexes that table to obtain the page frame number; the offset completes the physical address.]
22
Addressing Page Tables
  • Where are page tables stored?
  • and in which address space?
  • Possibility 1: physical memory
  • easy to address, no translation required
  • but page tables consume memory for the lifetime of
    the VAS
  • Possibility 2: virtual memory (the OS's VAS)
  • cold (unused) page table pages can be paged out
    to disk
  • but addressing page tables requires translation
  • how do we break the recursion?
  • don't page the outer page table (this is called
    wiring)
  • So, now that we've paged the page tables, might
    as well page the entire OS address space!
  • tricky, need to wire some special code and data
    (e.g., interrupt and exception handlers)

23
Making it all efficient
  • The original page table scheme doubled the cost of
    memory lookups
  • one lookup into the page table, a second to fetch
    the data
  • Two-level page tables triple the cost!!
  • two lookups into the page table, a third to fetch
    the data
  • How can we make this more efficient?
  • goal: make fetching from a virtual address about
    as efficient as fetching from a physical address
  • solution: use a hardware cache inside the CPU
  • cache the virtual-to-physical translations in the
    hardware
  • called a translation lookaside buffer (TLB)
  • TLB is managed by the memory management unit (MMU)

24
TLBs
  • Translation lookaside buffers
  • translate virtual page #s into PTEs (not
    physical addrs)
  • can be done in a single machine cycle
  • TLB is implemented in hardware
  • it is a fully associative cache (all entries
    searched in parallel); see the sketch below
  • cache tags are virtual page numbers
  • cache values are PTEs
  • with PTE + offset, the MMU can directly calculate
    the PA
  • TLBs exploit locality
  • processes only use a handful of pages at a time
  • 16-48 entries in a TLB is typical (covering
    64-192KB)
  • can hold the "hot set" or "working set" of a
    process
  • hit rates in the TLB are therefore really
    important
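A software model of the fully associative lookup (illustrative only; a real TLB probes all entries in parallel in hardware, in one cycle):

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 64          /* assumed size for this sketch */

struct tlb_entry {
    bool     valid;
    uint32_t vpn;               /* cache tag            */
    uint32_t pte;               /* cache value: the PTE */
};

static struct tlb_entry tlb[TLB_ENTRIES];

/* Returns true on a hit and fills *pte_out; on a miss the
 * hardware or the OS must walk the page table instead. */
bool tlb_lookup(uint32_t vpn, uint32_t *pte_out)
{
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *pte_out = tlb[i].pte;
            return true;
        }
    }
    return false;
}
```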

25
Managing TLBs
  • Address translations are mostly handled by the
    TLB
  • >99% of translations, but there are TLB misses
    occasionally
  • in case of a miss, who places translations into
    the TLB?
  • Hardware (memory management unit, MMU)
  • knows where page tables are in memory
  • OS maintains them, HW accesses them directly
  • tables have to be in a HW-defined format
  • this is how x86 works
  • Software-loaded TLB (OS)
  • TLB miss faults to OS, OS finds the right PTE and
    loads the TLB
  • must be fast (but 20-200 cycles, typically)
  • CPU ISA has instructions for TLB manipulation
  • OS gets to pick the page table format

26
Managing TLBs (2)
  • OS must ensure TLB and page tables are consistent
  • when OS changes protection bits in a PTE, it
    needs to invalidate the PTE if it is in the TLB
  • What happens on a process context switch?
  • remember, each process typically has its own page
    tables
  • need to invalidate all the entries in the TLB!
    ("flush" the TLB)
  • this is a big part of why process context
    switches are costly
  • can you think of a hardware fix to this?
  • When the TLB misses, and a new PTE is loaded, a
    cached PTE must be evicted
  • choosing a victim PTE is called the TLB
    replacement policy
  • implemented in hardware, usually simple (e.g. LRU)

27
More Techniques: Segmentation
  • A similar technique to paging is segmentation
  • segmentation partitions memory into logical units
  • stack, code, heap, ...
  • on a segmented machine, a VA is <segment #,
    offset>
  • segments are units of memory, from the user's
    perspective
  • A natural extension of variable-sized partitions
  • variable-sized partitions = 1 segment/process
  • segmentation = many segments/process
  • Hardware support (see the sketch below)
  • multiple base/limit pairs, one per segment
  • stored in a segment table
  • segments named by segment #, used as an index into
    the table
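A minimal C sketch of this lookup, assuming a hypothetical seg_table and fault hook:

```c
#include <stdint.h>

/* One base/limit pair per segment. */
struct segment {
    uint32_t base;    /* start of segment in physical memory */
    uint32_t limit;   /* segment size in bytes               */
};

extern struct segment seg_table[];            /* indexed by segment # */
extern void raise_protection_fault(void);     /* hypothetical hook    */

/* VA = <segment #, offset>  ->  physical address. */
uint32_t seg_translate(uint32_t segno, uint32_t offset)
{
    struct segment *s = &seg_table[segno];
    if (offset >= s->limit)                   /* offset < limit?      */
        raise_protection_fault();
    return s->base + offset;
}
```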

28
Segment lookups
[Figure: the segment # selects an entry in the segment table; the offset is checked against that entry's limit ("offset < limit?"): if yes, physical address = base + offset; if no, raise a protection fault.]
29
Combining Segmentation and Paging
  • Can combine these techniques
  • x86 architecture supports both segments and
    paging
  • Use segments to manage logically related units
  • stack, file, module, heap, ...
  • segments vary in size, but are usually large
    (multiple pages)
  • Use pages to partition segments into fixed chunks
  • makes segments easier to manage within physical
    memory
  • no external fragmentation
  • segments are pageable: don't need the entire
    segment in memory at the same time
  • Linux
  • 1 kernel code segment, 1 kernel data segment
  • 1 user code segment, 1 user data segment
  • N task state segments (stores registers on
    context switch)
  • 1 local descriptor table segment (not really
    used)
  • all of these segments are paged
  • three-level page tables

30
Cool Paging Tricks
  • Exploit the level of indirection between VA and PA
  • shared memory
  • regions of two separate processes' address spaces
    map to the same physical frames
  • read/write: access to shared data
  • execute: shared libraries!
  • each process will have its own PTEs for the region,
    so different processes can have different access
    privileges
  • must the shared region map to the same VA in each
    process?
  • copy-on-write (COW), e.g., on fork( )
  • instead of copying all pages, create shared
    mappings of parent pages in the child's address
    space
  • make shared mappings read-only in the child's space
  • when the child does a write, a protection fault
    occurs; the OS takes over and can then copy the
    page and resume the child

31
Another great trick
  • Memory-mapped files (see the POSIX example below)
  • instead of using open, read, write, close...
  • map a file into a region of the virtual address
    space
  • e.g., into a region with base X
  • accessing virtual address X+N refers to offset
    N in the file
  • initially, all pages in the mapped region are
    marked invalid
  • the OS reads a page from the file whenever an
    invalid page is accessed
  • the OS writes a page to the file when it is evicted
    from physical memory
  • only necessary if the page is dirty
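For example, with the POSIX mmap API (a minimal sketch; "data.bin" and the offset are made up):

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.bin", O_RDONLY);    /* hypothetical file */
    if (fd < 0)
        return 1;

    struct stat st;
    fstat(fd, &st);

    /* Map the whole file read-only into our address space. */
    char *base = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (base == MAP_FAILED)
        return 1;

    size_t n = 100;                         /* offset N in the file       */
    printf("byte %zu = %d\n", n, base[n]);  /* may fault; OS pages it in  */

    munmap(base, st.st_size);
    close(fd);
    return 0;
}
```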

32
Demand Paging
  • Pages can be moved between memory and disk
  • this process is called demand paging
  • it is different from swapping (where the entire
    process is moved, not a page)
  • OS uses main memory as a (page) cache of all of
    the data allocated by processes in the system
  • initially, pages are allocated from physical
    memory frames
  • when physical memory fills up, allocating a new
    page requires some other page to be evicted from
    its physical memory frame
  • evicted pages go to disk (only need to write them
    if they are dirty)
  • to a swap file
  • movement of pages between memory / disk is done
    by the OS
  • is transparent to the application
  • except for performance

33
Key Algorithms: Replacement
  • What happens to a process that references a VA in
    a page that has been evicted?
  • when the page was evicted, the OS sets the PTE as
    invalid and stores (in PTE) the location of the
    page in the swap file
  • when a process accesses the page, the invalid PTE
    will cause an exception (page fault) to be thrown
  • the OS will run the page fault handler in
    response
  • handler uses invalid PTE to locate page in swap
    file
  • handler reads page into a physical frame, updates
    PTE to point to it and to be valid
  • handler restarts the faulted process
  • But where does the page that's read in go?
  • have to evict something else (the page replacement
    algorithm); see the handler sketch below
  • the OS typically tries to keep a pool of free pages
    around so that allocations don't inevitably cause
    evictions
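A hedged C sketch of the handler steps above (all helper routines and the PTE layout here are hypothetical OS internals, not a real kernel's API):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical OS-internal helpers. */
extern int  find_free_frame(void);                 /* < 0 if none free */
extern int  evict_victim(void);                    /* replacement algo */
extern void swap_read(uint32_t slot, int frame);   /* disk -> frame    */
extern void restart_faulting_instruction(void);

struct pte { bool valid; uint32_t pfn; uint32_t swap_slot; };
extern struct pte *current_page_table;

/* Page-fault handler following the steps listed above. */
void page_fault_handler(uint32_t vpn)
{
    struct pte *pte = &current_page_table[vpn];

    int frame = find_free_frame();
    if (frame < 0)
        frame = evict_victim();          /* evict some other page       */

    swap_read(pte->swap_slot, frame);    /* invalid PTE located the page */
    pte->pfn   = (uint32_t)frame;
    pte->valid = true;

    restart_faulting_instruction();      /* resume the faulted process  */
}
```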

34
Why does this work?
  • Locality!
  • temporal locality
  • locations referenced recently tend to be
    referenced again soon
  • spatial locality
  • locations near recently referenced locations are
    likely to be referenced soon (think about why)
  • Locality means paging can be infrequent
  • once you've paged something in, it will be used
    many times
  • on average, you use things that are paged in
  • but, this depends on many things
  • degree of locality in application
  • page replacement policy and application reference
    pattern
  • amount of physical memory and application
    footprint

35
Why is this demand paging?
  • Think about when a process first starts up
  • it has a brand new page table, with all PTE valid
    bits false
  • no pages are yet mapped to physical memory
  • when process starts executing
  • instructions immediately fault on both code and
    data pages
  • faults stop when all necessary code/data pages
    are in memory
  • only the code/data that is needed (demanded!) by
    process needs to be loaded
  • what is needed changes over time, of course

36
Evicting the best page
  • The goal of the page replacement algorithm
  • reduce fault rate by selecting best victim page
    to remove
  • the best page to evict is one that will never be
    touched again
  • as the process will never again fault on it
  • "never" is a long time
  • Belady's proof: evicting the page that won't be
    used for the longest period of time minimizes the
    page fault rate
  • Rest of this lecture
  • survey a bunch of replacement algorithms

37
#1: Belady's Algorithm
  • Provably optimal: lowest fault rate (remember
    SJF?)
  • pick the page that won't be used for the longest
    time in the future
  • problem: impossible to predict the future
  • Why is Belady's algorithm useful?
  • as a yardstick to compare other algorithms to the
    optimal
  • if Belady's isn't much better than yours, yours
    is pretty good
  • Is there a lower bound?
  • unfortunately, the lower bound depends on the
    workload
  • but random replacement is pretty bad

38
#2: FIFO
  • FIFO is obvious, and simple to implement (see the
    sketch below)
  • when you page something in, put it on the tail of a
    list
  • on eviction, throw away the page at the head of the
    list
  • Why might this be good?
  • maybe the page brought in longest ago is not being
    used
  • Why might this be bad?
  • then again, maybe it is being used
  • we have absolutely no information either way
  • FIFO suffers from Belady's Anomaly
  • the fault rate might increase when the algorithm is
    given more physical memory
  • a very bad property
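A minimal C sketch of FIFO replacement as a circular queue of resident pages (sizes and names are illustrative; the sketch assumes pages are only loaded into free frames, so the queue never overflows):

```c
#include <stdint.h>

#define NFRAMES 64                      /* assumed number of frames */

static uint32_t fifo_queue[NFRAMES];    /* VPNs in load order       */
static int head = 0, tail = 0;

/* Called when a page is loaded: put it on the tail of the list. */
void fifo_page_in(uint32_t vpn)
{
    fifo_queue[tail] = vpn;
    tail = (tail + 1) % NFRAMES;
}

/* Called on eviction: throw away the page at the head of the list. */
uint32_t fifo_evict(void)
{
    uint32_t victim = fifo_queue[head];
    head = (head + 1) % NFRAMES;
    return victim;
}
```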

39
#3: Least Recently Used (LRU)
  • LRU uses reference information to make a more
    informed replacement decision
  • idea: past experience gives us a guess at future
    behavior
  • on replacement, evict the page that hasn't been
    used for the longest amount of time
  • LRU looks at the past; Belady's wants to look at
    the future
  • when does LRU do well?
  • when does it suck?
  • Implementation
  • to be perfect, must grab a timestamp on every
    memory reference and put it in the PTE (way too
    expensive)
  • so, we need an approximation

40
Approximating LRU
  • Many approximations, all using the PTE reference
    bit (see the sketch below)
  • keep a counter for each page
  • at some regular interval, for each page, do:
  • if ref bit == 0, increment the counter (page hasn't
    been used)
  • if ref bit == 1, zero the counter (page has
    been used)
  • regardless, zero the ref bit
  • the counter will contain the # of intervals since
    the last reference to the page
  • the page with the largest counter is the least
    recently used
  • Some architectures don't have PTE reference bits
  • can simulate the reference bit by using the valid
    bit to induce faults
  • hack, hack, hack
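A C sketch of this counter scheme, run at a regular interval (the ref-bit accessors are hypothetical):

```c
#include <stdint.h>

#define NPAGES 1024                    /* assumed page count         */

extern int  get_ref_bit(int page);     /* hypothetical PTE accessors */
extern void clear_ref_bit(int page);

static uint32_t age[NPAGES];           /* intervals since last use   */

/* Run at each interval: update every page's counter. */
void lru_tick(void)
{
    for (int p = 0; p < NPAGES; p++) {
        if (get_ref_bit(p))
            age[p] = 0;                /* used: reset its counter    */
        else
            age[p]++;                  /* unused: one interval older */
        clear_ref_bit(p);              /* regardless, zero ref bit   */
    }
}

/* Victim = page with the largest counter (least recently used). */
int lru_victim(void)
{
    int victim = 0;
    for (int p = 1; p < NPAGES; p++)
        if (age[p] > age[victim])
            victim = p;
    return victim;
}
```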

41
#4: LRU Clock
  • AKA Not Recently Used (NRU) or Second Chance
  • replace a page that is "old enough"
  • arrange all physical page frames in a big circle
    (a clock)
  • just a circular linked list
  • a clock hand is used to select a good LRU
    candidate (see the sketch below)
  • sweep through the pages in circular order, like a
    clock
  • if the ref bit is off, it hasn't been used recently;
    we have a victim
  • so, what is the minimum age if the ref bit is off?
  • if the ref bit is on, turn it off and go to the next
    page
  • the arm moves quickly when pages are needed
  • low overhead if we have plenty of memory
  • if memory is large, the accuracy of the information
    degrades
  • add more hands to fix this
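A minimal C sketch of the clock sweep (again with hypothetical ref-bit accessors):

```c
#include <stdbool.h>

#define NFRAMES 256                    /* assumed number of frames   */

extern bool get_ref_bit(int frame);    /* hypothetical PTE accessors */
extern void clear_ref_bit(int frame);

static int hand = 0;                   /* the clock hand             */

/* Sweep until a frame with a clear reference bit is found. */
int clock_victim(void)
{
    for (;;) {
        if (!get_ref_bit(hand)) {      /* not used recently: victim  */
            int victim = hand;
            hand = (hand + 1) % NFRAMES;
            return victim;
        }
        clear_ref_bit(hand);           /* used: give a second chance */
        hand = (hand + 1) % NFRAMES;
    }
}
```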

42
Another Problem: allocation of frames
  • In a multiprogramming system, we need a way to
    allocate physical memory to competing processes
  • what if a victim page belongs to another process?
  • a family of replacement algorithms takes this
    into account
  • Fixed-space algorithms
  • each process is given a limit of pages it can use
  • when it reaches its limit, it replaces from its
    own pages
  • local replacement: some processes may do well,
    others suffer
  • Variable-space algorithms
  • a process's set of pages grows and shrinks
    dynamically
  • global replacement: one process can ruin it for
    the rest
  • Linux uses global replacement

43
Important concept: the working set model
  • The working set of a process is used to model the
    dynamic locality of its memory usage
  • i.e., working set = the set of pages the process
    currently needs
  • formally defined by Peter Denning in the 1960s
  • Definition (see the sketch below)
  • WS(t,w) = { pages P such that P was referenced in
    the time interval (t-w, t) }
  • t = time, w = working set window (measured in
    page refs)
  • a page is in the working set (WS) only if it was
    referenced in the last w references
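A small C sketch of this definition over a reference trace (the names are illustrative):

```c
#include <stdbool.h>
#include <stdint.h>

#define NPAGES 1024                    /* assumed page count            */

static uint64_t last_ref[NPAGES];      /* reference # of last access;
                                          0 = never referenced (number
                                          references from 1)            */

/* Record that page was touched at reference number t. */
void record_reference(uint64_t t, int page)
{
    last_ref[page] = t;
}

/* Page is in WS(t, w) iff it was referenced in the last w references. */
bool in_working_set(uint64_t t, uint64_t w, int page)
{
    return last_ref[page] != 0 && (t - last_ref[page]) < w;
}
```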

44
#5: Working Set Size
  • The working set size changes with program
    locality
  • during periods of poor locality, more pages are
    referenced
  • within that period of time, the working set size
    is larger
  • Intuitively, the working set must be in memory;
    otherwise you'll experience heavy faulting
    (thrashing)
  • when people ask "how much memory does Netscape
    need?", they are really asking "what is
    Netscape's average (or worst case) working set
    size?"
  • Hypothetical algorithm
  • associate a parameter w with each process
  • only allow a process to start if its working set,
    added to those of all other processes, still fits
    in memory
  • use a local replacement algorithm within each
    process

45
#6: Page Fault Frequency (PFF)
  • PFF is a variable-space algorithm that uses a
    more ad hoc approach
  • monitor the fault rate of each process
  • if the fault rate is above a given threshold, give
    the process more memory
  • so that it faults less
  • doesn't always work (FIFO, Belady's anomaly)
  • if the fault rate is below the threshold, take away
    memory
  • it should fault more
  • again, not always

46
#7: LFU
  • Evict the least frequently used page.
  • Bookkeeping: count the number of past accesses to
    each page
  • But...
  • How long is the history?
  • a page that was just brought in may become popular,
    but that is not yet known
  • The pollution problem: useless pages occupy the
    space forever.

47
Thrashing
  • What the OS does if the page replacement algorithms
    fail
  • happens if most of the time is spent by the OS
    paging data back and forth from disk
  • no time is spent doing useful work
  • the system is over-committed
  • no idea which pages should be in memory to
    reduce faults
  • could be that there just isn't enough physical
    memory for all the processes
  • solutions?
  • Yields some insight into systems researchers
  • if a system has too much memory
  • the page replacement algorithm doesn't matter
    (over-provisioned)
  • if a system has too little memory
  • the page replacement algorithm doesn't matter
    (overcommitted)
  • the problem is only interesting on the border
    between over-provisioned and over-committed
  • many research papers live here, but not many real
    systems do

48
Just to mention: Internet caches
  • Similar idea, different applications
  • Web caches, to keep Web pages
  • Which pages to keep in the cache?
  • Policies: LRU, LFU, cost-aware
  • New issues: different page sizes, different costs
    (latency) to download a page
  • etc.

49
Summary
  • demand paging
  • start with no physical pages mapped, load them in
    on demand
  • page replacement algorithms
  • #1 Belady's: optimal, but unrealizable
  • #2 FIFO: replace the page loaded furthest in the
    past
  • #3 LRU: replace the page referenced furthest in
    the past
  • approximate using the PTE reference bit
  • #4 LRU Clock: replace a page that is "old enough"
  • #5 working set: keep the set of pages in memory
    that induces the minimal fault rate
  • #6 page fault frequency: grow/shrink the page set
    as a function of fault rate
  • local vs. global replacement
  • should processes be allowed to evict each other's
    pages?

50
Lecture 4, September 30, 2003
  • Memory management
  • Programming Concepts and Project 1

51
Project Assignment
  • Handout
  • Assignments
  • Code
  • Programs will be available online
  • Hints on how to compile/run programs online

52
Reading
  • Chapter 4, sections 4.1, 4.3, and 4.4

53
Next Lecture
  • Cover Input/Output (Reading: Chapter 5)
  • Start File Systems (Reading: Chapter 6)
  • Homework 2