Memory Management, Background and Hardware Support

Transcript and Presenter's Notes
1
Memory Management, Background and Hardware Support
Fred Kuhns (fredk@arl.wustl.edu, http://www.arl.wustl.edu/fredk)
Applied Research Laboratory
Department of Computer Science and Engineering
Washington University in St. Louis
2
Recall the Von Neumann Architecture
3
Primary Memory
  • Primary Memory Design Requirements
  • Minimize access time: a hardware and software requirement
  • Maximize available memory: use physical and virtual memory techniques
  • Cost-effective: limited to a small percentage of total cost
  • Memory Manager Functions
  • Allocate memory to processes
  • Map process address space to allocated memory
  • Minimize access times while limiting memory requirements

4
Process Address Space
  • Compiler produces relocatable object modules
  • Linker combines modules into an absolute (loadable) module
  • addresses are relative, typically starting at 0
  • Loader loads the program into memory and adjusts addresses to produce an executable module

5
UNIX Process Address Space
[Figure: the process address space runs from the low address (0x00000000) to the high address (0x7fffffff), with the stack (dynamic) toward the high end.]
6
Big Picture
[Figure: big-picture memory layout, including kernel memory.]
7
Memory Management
  • Central component of any operating system
  • Memory partitioning schemes: fixed, dynamic, paging, segmentation, or a combination
  • Relocation
  • Hierarchical layering to optimize performance and cost
  • registers
  • cache
  • primary (main) memory
  • secondary memory (backing store, local disk)
  • file servers (networked storage)
  • Policies target the expected memory requirements of processes
  • consider short-, medium- and long-term resource requirements
  • long term: admission of new processes (overall system requirements)
  • medium term: memory allocation (per-process requirements)
  • short term: processor scheduling (immediate needs)
  • Common goal: optimize the number of runnable processes resident in memory

8
Fixed Partitioning
  • Partition memory into regions with fixed boundaries
  • Equal-size partitions
  • program size < partition size
  • if program size > partition size, overlays must be used
  • use swapping when no partitions are available
  • Unequal-size partitions
  • Main memory use is inefficient; it suffers from internal fragmentation

9
Placement Algorithm with Partitions
  • Equal-size partitions
  • any partition may be used, since all are equal in size
  • balance partition size against expected allocation needs
  • Unequal-size partitions
  • assign each process to the smallest partition within which it will fit, using per-partition process queues
  • processes are assigned so as to minimize wasted memory within a partition (internal fragmentation)

[Figure: two queueing schemes for unequal partitions. One queue per partition: each new process waits for the partition that best fits it. One common queue: each new process is placed in the smallest available partition that fits.] A sketch of the smallest-fit selection follows.
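A minimal C sketch of the smallest-available-partition rule; the partition sizes and the function name best_fit_partition are hypothetical:

    #include <stddef.h>

    /* Hypothetical unequal partition sizes in KB, sorted ascending. */
    static const size_t part_kb[] = { 128, 256, 512, 1024 };
    #define NPART (sizeof(part_kb) / sizeof(part_kb[0]))

    /* Return the index of the smallest partition that fits the process,
     * or -1 if it fits in none (the program would then need overlays). */
    int best_fit_partition(size_t proc_kb)
    {
        for (size_t i = 0; i < NPART; i++)
            if (proc_kb <= part_kb[i])
                return (int)i;
        return -1;
    }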
10
Variable Partitioning
  • Partitions are of variable length and number
  • A process is allocated exactly as much memory as it requires
  • External fragmentation: small holes in memory between allocated partitions
  • Compaction must be used to shift processes so they are contiguous, leaving all free memory in one block

11
Example Dynamic Partitioning
Add processes 1 - 320K 2 - 224K 3 288K
Add processes 4 - 128K Swap Processes 2
224K swap process 2 to make room for process 4.
Remove processes 1 - 320K Swapped Processes 2
224K Relocate Processes move process 2 into
memory freed by process 1
12
Variable Partition Placement Algorithm
  • Best-fit: generally the worst performer overall
  • place in the smallest unused block, to minimize unused fragment sizes
  • Worst-fit
  • place in the largest unused block, to maximize unused fragment sizes
  • First-fit: simple and fast (see the sketch after this list)
  • scan from the beginning and choose the first block that is large enough
  • may leave many processes loaded in the front end of memory that must be scanned
  • Next-fit: tends to perform worse than first-fit
  • scan memory from the location of the last allocation and select the next available block large enough to hold the process
  • tends to allocate at the end of memory, where the largest block is found
  • Use compaction to combine unused blocks into larger contiguous blocks
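A minimal sketch of first fit over an address-ordered free list, assuming a simple singly linked representation (the struct and names are illustrative):

    #include <stddef.h>

    struct free_block {
        size_t start;            /* start address of the hole   */
        size_t size;             /* size of the hole in bytes   */
        struct free_block *next; /* next hole, in address order */
    };

    /* First fit: scan from the beginning of memory and carve the request
     * out of the first hole that is large enough. Returns the allocated
     * start address, or (size_t)-1 if no hole fits. A real allocator
     * would also unlink holes whose size drops to zero. */
    size_t first_fit(struct free_block *head, size_t request)
    {
        for (struct free_block *b = head; b != NULL; b = b->next) {
            if (b->size >= request) {
                size_t addr = b->start;
                b->start += request;  /* shrink the hole from the front */
                b->size  -= request;
                return addr;
            }
        }
        return (size_t)-1;  /* fail: compaction could merge the holes */
    }

Best fit and worst fit differ only in scanning the whole list and remembering the smallest (or largest) hole that still fits; next fit starts the scan from where the previous allocation ended.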

13
Variable Partition Placement Algorithm
[Figure: before/after memory maps for allocating a 16K block. First fit splits the 22K free block, leaving a 6K fragment; best fit splits the 18K free block, leaving a 2K fragment; next fit scans from the last allocated block (14K) and splits the 36K free block, leaving a 20K fragment.]
14
Addresses
  • Logical address
  • a reference to a memory location independent of the current assignment of data to memory
  • Relative address (a type of logical address)
  • an address expressed as a location relative to some known point
  • Physical address
  • the absolute address, or actual location

15
Relocation
  • Fixed partitions: absolute memory locations are assigned when the program is loaded
  • A process may occupy different partitions over time
  • swapping and compaction cause a program to occupy different partitions, and hence different absolute memory locations
  • Dynamic address relocation: relative addresses used with hardware support
  • Special-purpose registers are set when the process is loaded; relocation happens at run time
  • Base register: starting address of the process
  • the relative address is added to the base register to produce an absolute address
  • Bounds register: ending location of the process
  • the absolute address is compared to the bounds register; if not within bounds, an interrupt is generated (see the sketch below)
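The base/bounds mechanism fits in a few lines of C; the struct layout is illustrative and abort() stands in for the hardware interrupt:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Base and bounds values, loaded from the process control block
     * when the process is dispatched. */
    struct reloc_regs {
        uint32_t base;    /* starting physical address of the process   */
        uint32_t bounds;  /* last valid physical address of the process */
    };

    /* Dynamic address relocation: relative address + base register,
     * checked against the bounds register. */
    uint32_t relocate(const struct reloc_regs *r, uint32_t relative)
    {
        uint32_t absolute = r->base + relative;
        if (absolute > r->bounds) {
            fprintf(stderr, "bounds violation at 0x%08x\n", (unsigned)absolute);
            abort();  /* hardware would interrupt to the OS instead */
        }
        return absolute;
    }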

16
Hardware Support for Relocation
[Figure: hardware relocation. The base and bounds registers are loaded from the process control block. A relative address from the process image in main memory (program, data, stack) is fed to an adder along with the base register to form the absolute address; a comparator checks the result against the bounds register and raises an interrupt to the operating system if it is out of bounds.]
17
Techniques
  • Paging
  • Partition memory into small, equal-size chunks
  • chunks of memory are called frames
  • Divide each process into chunks of the same size
  • chunks of a process are called pages
  • The operating system maintains a page table for each process
  • it contains the frame location for each page of the process
  • memory address = page number + offset (see the sketch after this list)
  • Segmentation
  • Segments of a program do not all have to be the same length
  • There is a maximum segment length
  • An address consists of two parts: a segment number and an offset
  • Since segments are unequal, segmentation is similar to dynamic partitioning
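Assuming 4 KB pages for illustration, the page number/offset decomposition is plain bit manipulation:

    #include <stdint.h>

    /* With 4 KB pages, the low 12 bits of an address are the offset
     * within the page; the remaining high bits are the page number. */
    #define OFFSET_BITS 12
    #define PAGE_SIZE   (1u << OFFSET_BITS)

    static inline uint32_t page_number(uint32_t addr) { return addr >> OFFSET_BITS; }
    static inline uint32_t page_offset(uint32_t addr) { return addr & (PAGE_SIZE - 1); }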

18
Memory Management Requirements
  • Relocation
  • program memory location is determined at load time
  • program may be moved to a different location at run time (relocation)
  • consequence: memory references must be translated to actual physical memory addresses
  • Protection
  • protect against inter-process interference (transparent isolation)
  • consequence: addresses must be checked at run time when relocation is supported
  • Sharing
  • controlled sharing between processes
  • access restrictions may depend on the type of access
  • permit sharing of read-only program text for efficiency reasons
  • require an explicit concurrency protocol for processes to share program data segments

19
Memory Hierarchy
Executable memory
  • CPU registers: ~500 bytes, 1 clock cycle
  • Cache memory: < 10 MB, 1-2 clock cycles
  • Primary memory: < 1 GB, 1-4 clock cycles
Secondary storage
  • Rotating magnetic memory: < 100 GB (per device), 5-50 usec
  • Optical memory: < 15 GB (per device), 25 usec to 1 sec
  • Sequentially accessed memory: < 5 GB (per tape), seconds
20
Principle of Locality
  • Programs tend to cluster memory references, for both data and instructions; this clustering changes slowly with time.
  • Hardware and software exploit the principle of locality.
  • Temporal locality: if a location is referenced once, it is likely to be referenced again in the near future.
  • Spatial locality: if a memory location is referenced, nearby locations are likely to be referenced as well.
  • Stride-k (data) reference patterns
  • visit every kth element of a contiguous vector
  • stride-1 reference patterns are very common, e.g.:
  • for (i = 1, Array[0] = 0; i < N; i++)
  •     Array[i] = calc_element(Array[i-1]);

21
Caching: A Possible Scenario
[Figure: a client host (CPU, cache, DRAM primary memory, disk) retrieves page.html and image.jpg from a web server; the numbered steps 1-4 trace the copies described below.]
  • A copy of the web page is moved to a file on the client (cached).
  • Part of the file is copied into primary memory so the program can process the data (cached).
  • A cache line is copied into the CPU cache for the program to use.
  • Individual words are copied into CPU registers as they are manipulated by the program.

22
Hardware Requirements
  • Protection: prevent a process from changing its own memory maps
  • Residency: CPU distinguishes between resident and non-resident pages
  • Loading: load pages and restart interrupted program instructions
  • Dirty: determine whether pages have been modified

23
Memory Management Unit
  • Translates virtual addresses using
  • page tables
  • the Translation Lookaside Buffer (TLB)
  • Page tables
  • one for kernel addresses
  • one or more for user-space processes
  • Page Table Entry (PTE): one per virtual page
  • 32 bits: page frame, protection, valid, modified, referenced

24
Caching terminology
  • Cache hit: requested data is found in the cache
  • Cache miss: data not found in cache memory
  • cold miss: the cache is empty
  • conflict miss: the cache line is occupied by a different memory location
  • capacity miss: the working set is larger than the cache
  • Placement policy: where a new block (i.e., cache line) is placed
  • Replacement policy: controls which block is selected for eviction
  • direct mapped: one-to-one mapping between cache lines and memory locations
  • fully associative: any line in memory can be cached in any cache line
  • N-way set associative: a line in memory can be stored in any of the N lines of the set it maps to

25
Cache/Primary Memory Structure
[Figure: a memory address divides into tag (t bits), set number (s bits), and byte offset (b bits); the cache is an array of sets, Set 0 through Set S-1, each holding E lines.]
Cache parameters:
  • m address bits, so the maximum memory address is M = 2^m
  • S = 2^s sets in the cache; the s bits select the set number
  • B = 2^b data bytes per line
  • E lines per set
  • t = m - (s + b) tag bits per line; the tag uniquely identifies the memory location
  • V: one valid bit per line (a dirty bit may also be required)
  • cache size C = B * E * S = 2^(s+b) * E
The sketch below extracts these fields from an address.
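A C sketch of the field extraction, with an assumed example geometry (32-bit addresses, 64-byte lines, 128 sets):

    #include <stdint.h>

    /* Assumed geometry: m = 32, B = 2^6 = 64 bytes per line,
     * S = 2^7 = 128 sets, so t = 32 - (7 + 6) = 19 tag bits. */
    enum { B_BITS = 6, S_BITS = 7 };

    static inline uint32_t block_offset(uint32_t addr)
    {
        return addr & ((1u << B_BITS) - 1);             /* low b bits  */
    }

    static inline uint32_t set_index(uint32_t addr)
    {
        return (addr >> B_BITS) & ((1u << S_BITS) - 1); /* next s bits */
    }

    static inline uint32_t tag(uint32_t addr)
    {
        return addr >> (B_BITS + S_BITS);               /* top t bits  */
    }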
26
Cache Design
  • Write policy
  • on a hit: write-through versus write-back
  • on a miss: write-allocate versus no-write-allocate
  • Replacement algorithm
  • determines which block to replace (e.g., LRU)
  • Block size
  • the unit of data exchanged between cache and main memory

27
Translation
  • Virtual address
  • virtual page number + offset
  • The MMU finds the PTE for the virtual page
  • extracts the physical page and adds the offset
  • On failure the MMU raises an exception (page fault); see the sketch below
  • bounds error: outside the address range
  • validation error: non-resident page
  • protection error: access not permitted
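A hedged C sketch of this lookup, mapping each failure to the fault class above; the PTE layout and the pt_len parameter (standing in for a page-table length register) are illustrative assumptions:

    #include <stdint.h>

    #define OFFSET_BITS 12

    struct pte {
        uint32_t frame : 20;  /* physical page (frame) number */
        uint32_t valid : 1;   /* page resident in memory?     */
        uint32_t write : 1;   /* write access permitted?      */
    };

    enum fault { OK, BOUNDS_ERROR, VALIDATION_ERROR, PROTECTION_ERROR };

    /* Translate vaddr through a single-level page table of pt_len
     * entries; on failure, return the fault the MMU would raise. */
    enum fault translate(const struct pte *pt, uint32_t pt_len,
                         uint32_t vaddr, int is_write, uint32_t *paddr)
    {
        uint32_t vpn = vaddr >> OFFSET_BITS;
        if (vpn >= pt_len)              return BOUNDS_ERROR;
        if (!pt[vpn].valid)             return VALIDATION_ERROR;
        if (is_write && !pt[vpn].write) return PROTECTION_ERROR;
        *paddr = ((uint32_t)pt[vpn].frame << OFFSET_BITS)
               | (vaddr & ((1u << OFFSET_BITS) - 1));
        return OK;
    }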

28
Some details
  • Limiting page table size
  • segments
  • page the page table itself (multi-level page table)
  • The MMU has registers that point to the current page table(s)
  • the kernel and MMU can modify page tables and registers
  • Problem
  • page tables may require multiple memory accesses per instruction
  • Solution
  • rely on hardware caching (a virtually addressed cache)
  • cache the translations themselves, in the TLB

29
Translation Lookaside Buffer
  • An associative cache of address translations
  • Entries may contain a tag identifying the process as well as the virtual address.
  • Why is this important?
  • The MMU typically manages the TLB
  • The kernel may need to invalidate entries
  • Would the kernel ever need to invalidate entries?
  • Contains the most recently used page table entries
  • Functions the same way as a memory cache; see the sketch below
  • Given a virtual address, the processor examines the TLB
  • If present (a hit), the frame number is retrieved and the real address is formed
  • If not found (a miss), the page number is used to index the process page table
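A minimal software model of a fully associative TLB lookup; the entry layout, including an address-space id as the process tag, is an assumption for illustration:

    #include <stdint.h>

    #define TLB_ENTRIES 64

    struct tlb_entry {
        uint32_t asid;   /* process tag (address-space id) */
        uint32_t vpn;    /* virtual page number            */
        uint32_t frame;  /* cached frame number            */
        int      valid;
    };

    static struct tlb_entry tlb[TLB_ENTRIES];

    /* Returns nonzero on a hit and fills in the frame number; on a
     * miss the caller walks the process page table and installs the
     * translation. */
    int tlb_lookup(uint32_t asid, uint32_t vpn, uint32_t *frame)
    {
        for (int i = 0; i < TLB_ENTRIES; i++) {
            if (tlb[i].valid && tlb[i].asid == asid && tlb[i].vpn == vpn) {
                *frame = tlb[i].frame;
                return 1;
            }
        }
        return 0;
    }

The process tag addresses the slide's question: without one, the kernel must invalidate (flush) entries on every context switch, and in any case it must invalidate entries whenever it changes the underlying page tables.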

30
Address Translation - General
[Figure: the CPU issues a virtual address to the MMU; the resulting physical address is used to access the cache and global memory, and data is returned to the CPU.]
31
Address Translation Overview
[Figure: the CPU sends a virtual address to the MMU, which consults the TLB and, on a miss, the page tables; the resulting physical address accesses the cache.]
32
Page Table Entry
[Figure: a virtual address is a virtual page number (X bits) plus an offset in the page (Y bits); the Page Table Entry (PTE, Z bits) holds the frame number, the R and M bits, and control bits.]
  • Resident bit (R): indicates whether the page is in memory
  • Modify bit (M): indicates whether the page has been altered since it was loaded into main memory
  • Other control bits (e.g., protection)
  • Frame number: the physical frame address; see the struct sketch below
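One plausible C bitfield layout for such a 32-bit PTE; only the fields named on the slide are shown, and the widths beyond the 20-bit frame number are assumptions:

    #include <stdint.h>

    struct pte {
        uint32_t frame    : 20;  /* physical frame address            */
        uint32_t resident : 1;   /* R: page is in memory              */
        uint32_t modified : 1;   /* M: altered since loaded           */
        uint32_t control  : 10;  /* protection and other control bits */
    };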

33
Example: 1-Level Address Translation
[Figure: the current page table register locates the (process) page table; the 20-bit virtual page number indexes it to select a PTE (frame number, R, M, control bits); the frame number is combined with the 12-bit offset to address frame X in the DRAM frames.]
34
SuperSPARC Reference MMU
[Figure: the 12-bit context register selects one of 4096 entries in the context table (located by the context table pointer register), yielding the root PTD; the 8-bit, 6-bit, and 6-bit indices from the virtual address then walk the level 1, level 2, and level 3 tables (PTDs leading to a PTE); the PTE's 24-bit physical page number is combined with the 12-bit page offset to form the physical address.]
  • 12-bit context index: 4096 context table entries
  • 8-bit level-1 index: 256 entries
  • 6-bit level-2 and level-3 indices: 64 entries each
  • The virtual page number has 20 bits, for 1M pages
  • The physical frame number has 24 bits with a 12-bit offset, permitting 16M frames (see the sketch below)
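The index extraction implied by these field widths, as a C sketch (helper names are illustrative):

    #include <stdint.h>

    /* 32-bit virtual address: 8-bit level-1 index, 6-bit level-2 index,
     * 6-bit level-3 index, 12-bit page offset. */
    static inline uint32_t l1_index(uint32_t va)  { return (va >> 24) & 0xff; }
    static inline uint32_t l2_index(uint32_t va)  { return (va >> 18) & 0x3f; }
    static inline uint32_t l3_index(uint32_t va)  { return (va >> 12) & 0x3f; }
    static inline uint32_t pg_offset(uint32_t va) { return va & 0xfff; }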

35
Page Table Descriptor/Entry
[Figure: a Page Table Descriptor is a page table pointer plus a type field in bits 1-0; a Page Table Entry is a physical page number plus ACC, C, M, and R fields in bits 7-2 and the type field in bits 1-0.]
Type: PTD, PTE, or Invalid. C: cacheable. M: modified. R: referenced. ACC: access permissions.
36
Page Size
  • Smaller page size
  • reduces internal fragmentation
  • if a program uses relatively small segments of memory, small pages reflect this behavior
  • but a smaller page size means more pages per process, hence larger page tables; a large portion of the page tables may then sit in virtual memory, and each TLB entry covers less memory (the sketch below works an example)
  • Larger page size
  • secondary memory is designed to transfer large blocks of data efficiently
  • Multiple page sizes provide the flexibility needed to use a TLB effectively
  • large pages can be used for program instructions, or better, for kernel memory, decreasing its footprint in the page tables
  • Most operating systems support only one page size
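To make the page-table-size effect concrete, this illustrative program compares flat page tables for a 32-bit address space with assumed 4-byte PTEs:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t space = 1ull << 32;  /* 4 GB virtual address space */
        /* Compare 4 KB pages with 4 MB pages. */
        for (uint64_t page = 4096; page <= (4u << 20); page <<= 10) {
            uint64_t pages = space / page;
            printf("page %7llu KB: %8llu pages, flat table %7llu KB\n",
                   (unsigned long long)(page >> 10),
                   (unsigned long long)pages,
                   (unsigned long long)(pages * 4 >> 10));
        }
        return 0;  /* 4 KB pages need a 4 MB table; 4 MB pages need 4 KB */
    }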

37
Segmentation
  • Segments may be of unequal, dynamic size
  • Simplifies handling of growing data structures
  • Allows programs to be altered and recompiled independently
  • Lends itself to sharing data among processes
  • Lends itself to protection
  • Segment tables
  • each entry contains the base address of the corresponding segment in main memory
  • each entry contains the length of the segment
  • a bit is needed to indicate whether the segment is already in main memory
  • another bit is needed to indicate whether the segment has been modified since it was loaded into main memory

38
Segment Table Entries
[Figure: a virtual address is a segment number plus an offset; the segment table entry holds the segment base address, the segment length, and control bits.] The sketch below walks this lookup.
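A minimal C sketch of the segment lookup and length check; the names are illustrative:

    #include <stdint.h>

    struct seg_entry {
        uint32_t base;    /* segment base address in main memory */
        uint32_t length;  /* segment length in bytes             */
        uint32_t ctrl;    /* present/modified/protection bits    */
    };

    /* Returns 0 and fills *paddr on success, -1 on a length violation
     * (which the hardware would raise as an addressing fault). */
    int seg_translate(const struct seg_entry *tbl, uint32_t seg,
                      uint32_t offset, uint32_t *paddr)
    {
        if (offset >= tbl[seg].length)
            return -1;
        *paddr = tbl[seg].base + offset;
        return 0;
    }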
39
Combined Paging and Segmentation
  • Paging is transparent to the programmer
  • Paging eliminates external fragmentation
  • Segmentation is visible to the programmer
  • Segmentation allows for growing data structures,
    modularity, and support for sharing and
    protection
  • Each segment is broken into fixed-size pages

40
(No Transcript)