CSCE 212 Chapter 7 Memory Hierarchy - PowerPoint PPT Presentation
Slides: 27
Provided by: jasond9
Learn more at: https://www.cse.sc.edu

Transcript and Presenter's Notes
1
CSCE 212Chapter 7Memory Hierarchy
  • Instructor: Jason D. Bakos

2
Memory Hierarchy
  • Programmers want more memory and faster memory
  • Problems:
  • Denser memories require longer access times
  • Example: papers on your desk vs. papers in your
    filing cabinet
  • Fast memories are extremely expensive per unit
    capacity
  • Examples:
  • SRAM: 0.5–5 ns access time, $1K/GB
  • DRAM: 50–70 ns access time, $100/GB
  • Magnetic disk: 5–20 ms access time, $0.10/GB

3
Locality
  • Goal:
  • Achieve the access time of smaller memories but
    have the effective capacity of larger memories
  • Solution: exploit locality
  • Temporal locality:
  • memory locations that are accessed once tend to
    be accessed again in the near future
  • Spatial locality:
  • when a memory location is accessed, there's a
    good chance a nearby location will be accessed in
    the near future
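Both kinds of locality show up in ordinary code. A minimal sketch (not from the slides): the inner loop walks consecutive elements, which is spatial locality, while the accumulator is touched on every iteration, which is temporal locality.

```python
def sum_matrix(matrix):
    total = 0
    for row in matrix:        # consecutive elements: spatial locality
        for x in row:
            total += x        # `total` reused every step: temporal locality
    return total
```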

4
Memory Hierarchy
5
Memory Hierarchy
  • Each level of the hierarchy stores a subset of
    the level below it
  • Each level can only communicate with the level
    below it
  • For now, assume a 2-level hierarchy:
  • CPU-cache-RAM
  • cache is usually on-chip
  • Sometimes the data we need is not in the cache
  • hit rate: the fraction of accesses found in the
    cache
  • Block or line:
  • the unit of transfer between levels, sized to
    exploit spatial locality
  • miss penalty:
  • time required to move a line to the top of the
    hierarchy (may vary)

[Figure: CPU connected to cache, cache connected to main memory]
6
Caches
  • Questions
  • How do we know if the requested location is in
    the cache?
  • How do we find it?

7
Cache Organization
  • Fully associative: a line can be stored anywhere
    in the cache
  • Too many tags to compare!

[Figure: fully associative cache of n words; tags = address(31 downto log2(n) + 2)]
8
Direct Mapped Cache
9
Direct Mapped Cache
  • Direct mapped: each memory location maps to only
    one location in the cache

[Figure: direct-mapped cache with 8 lines of 8 words each (indices 000–111); line index = addr(7 downto 5), tag = addr(31 downto 8)]
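A hedged sketch of the lookup in the figure: a direct-mapped cache with 8 lines of 8 words (32 bytes each), so addr(7 downto 5) picks the line and addr(31 downto 8) is the stored tag. Data payloads are omitted; the cache is just a list of (valid, tag) pairs.

```python
def dm_hit(cache, addr):
    index = (addr >> 5) & 0x7      # addr(7 downto 5): which line
    tag = addr >> 8                # addr(31 downto 8): identity check
    valid, stored_tag = cache[index]
    return valid and stored_tag == tag

cache = [(False, 0)] * 8
cache[3] = (True, 0x12)            # as if the line holding 0x1260 was filled
```

Two addresses with the same index bits but different tags collide in the same line, which is the source of conflict misses discussed below.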
10
Addresses
  • The memory address can be partitioned into fields
  • Example: 128 lines, 16-word lines

tag: remaining bits, addr(31 downto 13)
index: log2(lines) = 7 bits, addr(12 downto 6) (which line in each set?)
word offset: log2(line_size) = 4 bits, addr(5 downto 2) (which word in the line?)
byte offset: 2 bits, addr(1 downto 0) (which byte in the word?)
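The partition above can be sketched directly for the slide's example (128 lines, 16-word lines): 2 byte-offset bits, log2(16) = 4 word-offset bits, log2(128) = 7 index bits, and the remaining 19 bits of tag.

```python
def split_address(addr, lines=128, words_per_line=16):
    byte_bits = 2
    word_bits = (words_per_line - 1).bit_length()   # log2(16) = 4
    index_bits = (lines - 1).bit_length()           # log2(128) = 7
    byte_off = addr & ((1 << byte_bits) - 1)                             # addr(1..0)
    word_off = (addr >> byte_bits) & ((1 << word_bits) - 1)              # addr(5..2)
    index = (addr >> (byte_bits + word_bits)) & ((1 << index_bits) - 1)  # addr(12..6)
    tag = addr >> (byte_bits + word_bits + index_bits)                   # addr(31..13)
    return tag, index, word_off, byte_off
```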
11
Cache Organization
12
The Three Cs
  • Three different kinds of misses
  • Compulsory (cold-start) misses
  • First access to a block
  • Capacity misses
  • Replaced block is needed again
  • Because cache capacity isn't sufficient for the
    program
  • Conflict (collision) misses
  • Multiple blocks compete for the same set

13
Associativity
  • 2-way set associative:
  • Two choices of where to store a given line
  • Replacement policy (e.g., LRU)

[Figure: 2-way set-associative cache, two banks of 8-word lines (sets 000–111); set index = addr(7 downto 5), tags 0 and tags 1 = addr(31 downto 8)]
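One set of such a cache can be sketched as follows (illustrative only): each set holds at most two tags, ordered by recency so the least recently used way sits at the front and is the victim on a miss.

```python
class TwoWaySet:
    def __init__(self):
        self.tags = []                 # at most two tags, MRU last

    def access(self, tag):
        if tag in self.tags:           # hit: refresh recency order
            self.tags.remove(tag)
            self.tags.append(tag)
            return True
        if len(self.tags) == 2:        # both ways full: evict LRU way
            self.tags.pop(0)
        self.tags.append(tag)          # fill the freed way
        return False
```

With direct mapping, two alternating tags would miss forever; here the second access to either tag hits.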
14
Associative Cache Organization
15
Cache Behavior
  • Hits at the top-level cache can usually be
    performed in one (or a few) clock cycles
  • Misses stall the processor
  • Writes can be handled using:
  • Write-through (with write allocate or write
    no-allocate)
  • When cache data is changed, the lower-level
    memory is updated immediately
  • Use a write buffer to hide the write latency
  • Write-back:
  • When cache data is changed, the lower-level
    memory isn't updated until the cache line
    containing the changes is replaced
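The two policies can be contrasted on a toy one-level "cache" (a dict), with `memory` standing in for the lower level; all names are illustrative.

```python
def write_through(cache, memory, addr, value):
    cache[addr] = value
    memory[addr] = value       # lower level updated immediately
                               # (a write buffer would hide this latency)

def write_back(cache, dirty, addr, value):
    cache[addr] = value
    dirty.add(addr)            # lower level NOT updated yet

def evict(cache, dirty, memory, addr):
    if addr in dirty:          # dirty line: written back on replacement
        memory[addr] = cache[addr]
        dirty.discard(addr)
    del cache[addr]
```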

16
Memory Systems
  • Main memory is DRAM, designed for density (not
    access time)
  • How to reduce miss penalty?

17
Average Memory Access Time
  • AMAT = hit_time + miss_rate × miss_penalty
  • Reduce miss rate
  • Larger cache (capacity misses)
  • Increase associativity (conflict misses)
  • Replacement policy
  • Each of these may increase hit time and miss
    penalty
  • Reduce miss penalty
  • Wider or banked memory bus
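The formula is easy to evaluate: the hit time is always paid, and the penalty is paid on the fraction of accesses that miss. With illustrative numbers (a 1-cycle hit, 5% miss rate, 100-cycle penalty) the average access costs 6 cycles.

```python
def amat(hit_time, miss_rate, miss_penalty):
    # average memory access time, in the same units as the inputs
    return hit_time + miss_rate * miss_penalty
```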

18
Virtual Memory
  • Main memory acts as a cache to secondary storage
  • Allows memory to be shared
  • Make memory appear to be larger than it
    physically is
  • Each program has its own address space
  • Enforces protection
  • Virtual memory block is called a page, a miss is
    called a page fault
  • Virtual addresses are translated into physical
    addresses
  • Address mapping / address translation
  • Combination of hardware and software

19
Virtual Memory
20
Virtual Memory
21
Page Faults
  • Main memory is 100,000 times faster than disk
  • Page faults are expensive
  • Reduce page fault rate
  • Fully associative placement of pages in memory
  • Each process has a page table that maps virtual
    addresses to physical addresses
  • OS creates space on disk for all the process's
    pages
  • Swap space
  • OS maintains another table that keeps track of
    each page in main memory
  • During a page fault, the OS must decide which
    page to replace
  • Least recently used (LRU)
  • Write-back used for writes
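The OS bookkeeping above can be sketched as follows (names illustrative): a per-process page table maps virtual page numbers to physical frames; a missing page is a page fault, resolved by evicting the least recently used resident page.

```python
from collections import OrderedDict

def translate(page_table, vpn, num_frames, faults):
    """page_table: OrderedDict of vpn -> frame, least recently used first."""
    if vpn in page_table:
        page_table.move_to_end(vpn)        # touched: now most recently used
        return page_table[vpn]
    faults.append(vpn)                     # page fault: OS takes over
    if len(page_table) >= num_frames:
        _, frame = page_table.popitem(last=False)   # evict the LRU page
        # (with write-back, a dirty victim is written to swap space here)
    else:
        frame = len(page_table)            # a free frame is still available
    page_table[vpn] = frame
    return frame
```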

22
Page Table
23
Page Table
24
TLB
  • Page lookups must be performed in hardware
  • Page table is cached on-chip
  • Translation-lookaside buffer
  • Small and fully associative, or larger with
    limited associativity
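A small fully associative TLB with LRU replacement can be sketched like this (the entry count is illustrative): on each lookup every entry's tag is effectively compared, and a miss means the page table must be consulted.

```python
from collections import OrderedDict

class TLB:
    def __init__(self, entries=4):
        self.entries = entries
        self.map = OrderedDict()       # vpn -> physical frame, LRU first

    def lookup(self, vpn):
        if vpn in self.map:            # TLB hit: translate without a walk
            self.map.move_to_end(vpn)
            return self.map[vpn]
        return None                    # TLB miss: must read the page table

    def refill(self, vpn, pfn):
        if len(self.map) >= self.entries:
            self.map.popitem(last=False)   # evict the LRU entry
        self.map[vpn] = pfn
```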

25
Integrating Cache and VM
  • Data cannot be in the cache unless it is present
    in main memory
  • Cache can be
  • physically addressed (TLB in critical path)
  • virtually addressed (TLB out of critical path)
  • Cache miss requires TLB access
  • TLB miss means
  • page is in memory but we need the TLB entry, or
  • page is not in memory (page fault)
  • (both handled by OS software)

26
TLB Misses and Page Faults
  • When a virtual address causes a page fault
  • Look up page table entry and find location on
    disk
  • Choose a physical page to replace, write-back if
    dirty
  • Read page from disk into chosen physical page
    (allow another process to run)
  • TLB miss in MIPS
  • BadVAddr is set, a special exception is triggered
    (at 0x8000 0000), go to the TLB miss handler
  • Context register:
  • bits 31–20: base of the page table
  • bits 19–2: virtual address of the missing page
  • Use the Context register directly to load the
    missing entry
  • If the page table entry is invalid, a page fault
    exception occurs at the normal handler (0x8000
    0180)
  • Move missing entry to EntryLo register
  • Execute tlbwr to move EntryLo to TLB at address
    stored in Random register (free running counter)
  • Execute eret to return
  • TLB miss exception doesn't save process state
    (fast) while a page fault does (slow)
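The two exception paths above can be summarized as a control-flow sketch (plain Python standing in for the MIPS handler; all names illustrative): a TLB refill just loads the entry and retries, while an invalid entry escalates to the full page fault handler.

```python
def handle_tlb_miss(tlb, page_table, vpn):
    entry = page_table.get(vpn)        # in MIPS the Context register
                                       # points at this entry directly
    if entry is None or not entry["valid"]:
        return "page_fault"            # re-raised at the normal handler
    tlb[vpn] = entry["pfn"]            # tlbwr: write the TLB entry
    return "retry"                     # eret: re-run the faulting access
```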