Cache Memory - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Cache Memory


1
Cache Memory
2
Outline
  • General concepts
  • 3 ways to organize cache memory
  • Issues with writes
  • Suggested Reading 6.4

3
Cache Memory
  • History
  • At the very beginning: 3 levels
  • Registers, main memory, disk storage
  • 10 years later: 4 levels
  • Registers, SRAM cache, main DRAM memory, disk
    storage
  • Modern processors: 4 or 5 levels
  • Registers, SRAM L1, L2 (and sometimes L3) caches,
    main DRAM memory, disk storage
  • Cache memories
  • are small, fast SRAM-based memories
  • are managed by hardware automatically
  • can be on-chip, on-die, off-chip

4
Cache Memory
CPU chip
pp. 488
register file
ALU
L1 cache
cache bus
system bus
memory bus
main memory
I/O bridge
bus interface
L2 cache
5
Cache Memory
  • L1 cache is on-chip
  • L2 cache was off-chip until several years ago
  • L3 cache can be off-chip or on-chip
  • CPU looks first for data in L1, then in L2, then
    in main memory
  • Hold frequently accessed blocks of main memory in
    caches

6
Inserting an L1 cache between the CPU and main
memory
7
Generic Cache Memory Organization
8
Cache Memory
9
Cache Memory
10
Addressing caches
11
Direct-mapped cache
  • Simplest kind of cache
  • Characterized by exactly one line per set.

12
Accessing direct-mapped caches
  • Set selection
  • Use the set index bits to determine the set of
    interest

13
Accessing direct-mapped caches
  • Line matching and word extraction
  • find a valid line in the selected set with a
    matching tag (line matching)
  • then extract the word (word selection)

14
Accessing direct-mapped caches
15
Line Replacement on Misses in Direct-Mapped Caches
  • If cache misses
  • Retrieve the requested block from the next level
    in the memory hierarchy
  • Store the new block in one of the cache lines of
    the set indicated by the set index bits

16
Line Replacement on Misses in Direct-Mapped Caches
  • If the set is full of valid cache lines
  • One of the existing lines must be evicted
  • For a direct-mapped cache
  • Each set contains only one line
  • Current line is replaced by the newly fetched line

17
Direct-mapped cache simulation
  • M=16 byte addresses
  • B=2 bytes/block, S=4 sets, E=1 entry/set

18
Direct-mapped cache simulation
M=16 byte addresses, B=2 bytes/block, S=4 sets,
E=1 entry/set. Address trace (reads):
0 [0000], 1 [0001], 13 [1101], 8 [1000], 0 [0000]
19
Direct-mapped cache simulation
M=16 byte addresses, B=2 bytes/block, S=4 sets,
E=1 entry/set. Address trace (reads):
0 [0000], 1 [0001], 13 [1101], 8 [1000], 0 [0000]
20
Direct-mapped cache simulation
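The simulation on these slides can be re-created in a few lines. This is an illustrative sketch (not the slides' own code) of the same configuration: M=16 addresses, B=2 bytes/block, S=4 sets, E=1 line/set, so each set holds at most one tag.

```python
# Direct-mapped cache simulation: 1 block-offset bit, 2 set-index bits,
# 1 tag bit; each set stores a single (tag) line.

def simulate(trace, block_bits=1, set_bits=2):
    sets = {}                      # set index -> tag currently stored
    results = []
    for addr in trace:
        set_index = (addr >> block_bits) & ((1 << set_bits) - 1)
        tag = addr >> (block_bits + set_bits)
        if sets.get(set_index) == tag:
            results.append("hit")
        else:
            results.append("miss")
            sets[set_index] = tag  # replace the single line in the set
    return results

print(simulate([0, 1, 13, 8, 0]))
# -> ['miss', 'hit', 'miss', 'miss', 'miss']
```

Note the last reference to address 0 misses even though it was cached earlier: address 8 maps to the same set (a conflict miss) and evicted it.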
21
Why use middle bits as index?
  • High-Order Bit Indexing
  • Adjacent memory lines would map to same cache
    entry
  • Poor use of spatial locality
  • Middle-Order Bit Indexing
  • Consecutive memory lines map to different cache
    lines
  • Can hold a C-byte region of the address space in
    the cache at one time (C = cache size)
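The contrast above can be made concrete with a small sketch (assumed parameters, not from the slides): 4-bit addresses, 1 block-offset bit, 2 set-index bits taken either from the middle or from the high-order bits.

```python
# Middle-order vs. high-order bit indexing for consecutive memory blocks.

def middle_index(addr):            # the scheme the slides recommend
    return (addr >> 1) & 0b11      # set index from the middle bits

def high_index(addr):              # high-order-bit indexing
    return (addr >> 2) & 0b11      # set index from the top bits

addrs = [0, 2, 4, 6]               # four consecutive 2-byte blocks
print([middle_index(a) for a in addrs])  # -> [0, 1, 2, 3]: distinct sets
print([high_index(a) for a in addrs])    # -> [0, 0, 1, 1]: collisions
```

With middle-bit indexing the four consecutive blocks land in four different sets, so a sequential scan exploits spatial locality; with high-bit indexing adjacent blocks collide in the same set.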

22
Set associative caches
  • Characterized by more than one line per set

23
Accessing set associative caches
  • Set selection
  • identical to direct-mapped cache

24
Accessing set associative caches
  • Line matching and word selection
  • must compare the tag in each valid line in the
    selected set.

25
Fully associative caches
  • Characterized by a single set that contains all of
    the cache lines
  • No set index bits in the address

26
Accessing fully associative caches
  • Line matching and word selection
  • must compare the tag in each valid line of the
    single set
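Line matching in a fully associative cache can be sketched as a search over all lines (an illustrative helper, not from the slides): with no set index bits, every valid line's tag is compared against the address tag.

```python
# Fully associative lookup: scan every (valid, tag) line for a match.

def lookup(lines, addr, block_bits=1):
    """lines: list of (valid, tag) pairs. Returns True on a hit."""
    tag = addr >> block_bits
    return any(valid and line_tag == tag for valid, line_tag in lines)

lines = [(True, 3), (False, 5), (True, 7)]
print(lookup(lines, 3 << 1))   # tag 3 is present and valid -> True
print(lookup(lines, 5 << 1))   # tag 5 present but invalid -> False
```

In hardware this comparison happens in parallel across all lines, which is why large fully associative caches are expensive; they are typically used only for small structures such as TLBs.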

27
Issues with Writes
  • Write hits
  • Write through
  • Cache updates its copy
  • Immediately writes the corresponding cache block
    to memory
  • Write back
  • Defers the memory update as long as possible
  • Writing the updated block to memory only when it
    is evicted from the cache
  • Maintains a dirty bit for each cache line

28
Issues with Writes
  • Write misses
  • Write-allocate
  • Loads the corresponding memory block into the
    cache
  • Then updates the cache block
  • No-write-allocate
  • Bypasses the cache
  • Writes the word directly to memory
  • Combination
  • Write through, no-write-allocate
  • Write back, write-allocate
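The write-back/write-allocate combination above can be sketched as a toy model (illustrative only, not the slides' code): writes update the cache copy and set a dirty bit, and memory sees the block only when a dirty line is evicted.

```python
# Toy write-back + write-allocate cache, direct-mapped for brevity.
# Each set holds one line with a tag and a dirty bit.

class WriteBackCache:
    def __init__(self):
        self.sets = {}                 # set index -> {"tag", "dirty"}

    def write(self, set_index, tag, memory_writes):
        line = self.sets.get(set_index)
        if line is not None and line["tag"] == tag:
            line["dirty"] = True       # write hit: update cache copy only
            return
        if line is not None and line["dirty"]:
            memory_writes.append(line["tag"])  # evict: flush dirty block
        # write-allocate: load the block into the cache, then update it
        self.sets[set_index] = {"tag": tag, "dirty": True}

mem = []
c = WriteBackCache()
c.write(0, 1, mem)   # write miss: allocate, mark dirty
c.write(0, 1, mem)   # write hit: no memory traffic
c.write(0, 2, mem)   # conflict miss: dirty block with tag 1 flushed
print(mem)           # -> [1]: memory was updated only on eviction
```

Repeated writes to the same block cost one memory transfer instead of one per write, which is the point of deferring the update.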

29
Multi-level caches
30
Cache performance metrics
  • Miss Rate
  • fraction of memory references not found in cache
    (misses/references)
  • Typical numbers
  • 3-10% for L1
  • Hit Rate
  • fraction of memory references found in cache (1 -
    miss rate)

31
Cache performance metrics
  • Hit Time
  • time to deliver a line in the cache to the
    processor (includes time to determine whether the
    line is in the cache)
  • Typical numbers
  • 1-2 clock cycles for L1
  • 5-10 clock cycles for L2
  • Miss Penalty
  • additional time required because of a miss
  • Typically 25-100 cycles for main memory
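The three metrics above combine into the standard average-memory-access-time formula (the formula itself is not stated on the slide, but follows directly from the definitions): AMAT = hit time + miss rate × miss penalty.

```python
# Average memory access time from the slide's metrics.

def amat(hit_time, miss_rate, miss_penalty):
    """All times in clock cycles; miss_rate is a fraction."""
    return hit_time + miss_rate * miss_penalty

# Plugging in typical numbers from the slides: 1-cycle L1 hit time,
# 5% miss rate, 100-cycle penalty to main memory.
print(amat(1, 0.05, 100))   # -> 6.0 cycles on average
```

Even a 5% miss rate multiplies the effective access time several-fold, which is why reducing the miss rate matters so much more than a small change in hit time.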

32
Cache performance metrics
  • Cache size
  • Hit rate vs. hit time
  • Block size
  • Spatial locality vs. temporal locality
  • Associativity
  • Thrashing
  • Cost
  • Speed
  • Miss penalty
  • Write strategy
  • Trade-offs: simplicity (write-through) vs. fewer
    memory transfers (write-back)