Lecture 19: Cache Replacement Policy, Line Size, Write Method, and Multilevel Caches

1
Lecture 19: Cache Replacement Policy, Line Size,
Write Method, and Multi-level Caches
  • Soon Tee Teoh
  • CS 147

2
Cache Replacement Policy
  • For a direct-mapped cache, a word loaded into
    the cache goes into one fixed position, replacing
    whatever was there before.
  • For a set-associative or fully associative cache,
    a word can be loaded into any of several possible
    locations, so we need a cache replacement policy
    to decide which position the new word goes to.

3
Cache Replacement Policy
  • 3 common options (see the sketch after this list)
  • Random replacement
    • Simple
    • But may give a high miss rate
  • First in, first out (FIFO)
    • Rationale: the oldest line is most likely not
      needed anymore
    • Implementation: maintain a queue
  • Least recently used (LRU)
    • Rationale: the line that has been unused for the
      longest time is most likely not needed anymore
    • Implementation: usually costly to implement
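
A minimal software sketch of LRU in Python (the class and its
structure are illustrative, not from the slides): an OrderedDict
keeps tags in recency order, which is easy in software but is
exactly the bookkeeping that makes exact LRU costly in hardware.

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU replacement policy: tracks which cached tags were
    used least recently. capacity is the number of cache lines."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()          # least recently used first

    def access(self, tag):
        """Return True on a hit, False on a miss (loading the line)."""
        if tag in self.lines:
            self.lines.move_to_end(tag)     # hit: now most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict least recently used
        self.lines[tag] = "line data"       # miss: load the new line
        return False
```

Real hardware usually approximates LRU with a few status bits per
set rather than maintaining a full recency ordering like this.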

4
Assumptions for following examples
  • Assume 1 KB (that is, 1024 bytes) of main memory
  • Assume 4-byte words
  • Assume a 32-word cache
  • Assume memory is byte-addressed
  • Therefore the 1 KB memory needs a 10-bit address

5
Line Size
  • Rather than fetching a single word from memory
    into the cache, fetch a whole block of l words,
    called a line.
  • This takes advantage of spatial locality.
  • The number of words in a line is a power of two.

Example: 4 words in a line

[Address layout, bit 9 down to bit 0: Tag | Index | Line | Word.
With 4-byte words, the Word field (byte within word) is bits 1-0;
with 4-word lines, the Line field (word within line) is bits 3-2;
the remaining bits are split between the Index and the Tag (bits
5-4 and 9-6 respectively in the example on the next slide).]
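
To make the field split concrete, here is a small Python helper
(ours, not part of the lecture) that decomposes the example
address 1010110100 using these widths:

```python
def split_address(addr):
    """Split a 10-bit byte address into the fields above, using the
    widths of the running example: 2 byte bits (4-byte words), 2 line
    bits (4-word lines), 2 index bits, and 4 tag bits."""
    byte_in_word = addr & 0b11          # bits 1-0
    word_in_line = (addr >> 2) & 0b11   # bits 3-2
    index        = (addr >> 4) & 0b11   # bits 5-4
    tag          = addr >> 6            # bits 9-6
    return tag, index, word_in_line, byte_in_word

tag, index, word, byte = split_address(0b1010110100)
print(f"tag={tag:04b} index={index:02b} word={word:02b} byte={byte:02b}")
# prints: tag=1010 index=11 word=01 byte=00
```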
6
Line Size Example
A 2-way set-associative cache, with 4-word lines
and a capacity of 32 words
[Address layout as before: Tag (bits 9-6) | Index (bits 5-4) |
Line (bits 3-2) | Word (bits 1-0). Cache diagram: 4 sets indexed
00 through 11, each holding two tagged lines (Tag 1 / Line 1 and
Tag 2 / Line 2). At index 11, Tag 1 = 1111 and Tag 2 = 0011.]
Scenario: Suppose the CPU requests the word at
memory address 1010110100. Step 1: the index is
11, so look at the two tags at index 11. Suppose
Tag 1 at index 11 is 1111 and Tag 2 is 0011. Since
neither tag matches the tag 1010 requested by the
CPU, we load words 1010110000 through 1010111100
into the cache.
7
Line Size Example (continued)
Suppose the cache replacement policy determines
that the first entry (Tag 1 / Line 1) should be
replaced. The line is then loaded into Line 1 at
index 11, and Tag 1 is changed to 1010.
[Cache diagram after the load: at index 11, Tag 1 = 1010 and
Tag 2 = 0011. Line 1 now holds the four words at addresses
1010110000, 1010110100, 1010111000, and 1010111100.]
8
Line Size Example (continued)
The CPU's memory request was for word 1010110100.
Therefore the second word (highlighted) of the
line is delivered to the CPU.
[Cache diagram: at index 11, Tag 1 = 1010 and Tag 2 = 0011; the
second word of Line 1 (address 1010110100) is highlighted.]
If after this the CPU accesses any other word in
the line, it will be found in the cache.
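
The whole lookup can be summarized in a few lines of Python (a
sketch under the slide's assumptions; the replacement decision is
hard-coded to the first entry, and the line data is omitted):

```python
def lookup(cache, addr):
    """2-way set-associative lookup for the example above. cache
    maps a set index to a list of two tags; only hits and misses
    are tracked, not the cached data itself."""
    tag   = addr >> 6            # bits 9-6
    index = (addr >> 4) & 0b11   # bits 5-4
    tags = cache.setdefault(index, [None, None])
    if tag in tags:
        return "hit"
    tags[0] = tag                # replacement policy chose the first entry
    return f"miss: loaded line, Tag 1 at index {index:02b} is now {tag:04b}"

cache = {0b11: [0b1111, 0b0011]}     # state from the slide
print(lookup(cache, 0b1010110100))   # miss: 1111 and 0011 != 1010
print(lookup(cache, 0b1010111000))   # hit: another word of the same line
```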
9
Cache Write Method
  • Suppose the CPU wants to write a word to memory.
  • If the memory unit uses a cache, it has several
    options.
  • 2 commonly used options:
    • Write-through
    • Write-back

10
Write-Through
  • On a write request by the CPU, check whether the
    old data is in the cache.
  • If the old data is in the cache (Write Hit),
    write the new data into both the cache and
    memory, replacing the old data in both places.
  • If the old data is not in the cache (Write Miss),
    either
    • Load the line into the cache, and write the new
      data to both cache and memory (this method is
      called write-allocate), or
    • Just write the new data to memory, and don't
      load the line into the cache (this method is
      called no-write-allocate)
  • Advantage: keeps cache and memory consistent
  • Disadvantage: needs to stall for a memory access
    on every memory write
  • To reduce this problem, use a write buffer. When
    the CPU wants to write a word to memory, it puts
    the word into the write buffer, and then
    continues executing the instructions following
    the memory write. Meanwhile, the write buffer
    drains the queued words to memory. (See the
    sketch below.)
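
A rough Python sketch of write-through with a write buffer (the
structure and names are ours; the 16-byte line size comes from the
running example, and the buffer drains explicitly here where real
hardware would do it concurrently):

```python
from collections import deque

LINE_MASK = ~0b1111        # 4-word (16-byte) line boundary
memory = {}                # word address -> data
cache = {}                 # line address -> {byte offset -> data}
write_buffer = deque()     # stores waiting to drain to memory

def write_through(addr, data, write_allocate=True):
    """Every store is queued for memory via the write buffer, so the
    CPU need not stall; the cache is updated on a hit, and on a miss
    only under the write-allocate option."""
    line_addr = addr & LINE_MASK
    if line_addr in cache:                         # write hit
        cache[line_addr][addr & 0b1111] = data
    elif write_allocate:                           # write miss
        cache[line_addr] = {off: memory.get(line_addr + off)
                            for off in range(0, 16, 4)}  # load the line
        cache[line_addr][addr & 0b1111] = data
    write_buffer.append((addr, data))              # memory updated later

def drain_write_buffer():
    """Runs 'in the background' while the CPU keeps executing."""
    while write_buffer:
        addr, data = write_buffer.popleft()
        memory[addr] = data
```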

11
Write-Back
  • When an instruction requires a write to memory:
  • If there is a cache hit, write only to the cache.
    Later, when this line needs to be replaced, write
    the data back to memory.
  • If there is a cache miss, either
    • Load the line into the cache and write the new
      data into the cache only, marking the line
      dirty (this method is called write-allocate),
      or
    • Just write the new data to memory, and don't
      load the line into the cache (this method is
      called no-write-allocate)
  • In the write-back method, we keep a dirty bit
    with each cache entry. If the dirty bit is 1, we
    must write the line back to memory when the entry
    is replaced. If the dirty bit is 0, no write-back
    is needed, saving CPU stalls. (See the sketch
    below.)
  • Disadvantage of the write-back approach:
    inconsistency, since memory may contain stale
    data
  • Note: write-allocate is usually used with
    write-back; no-write-allocate is usually used
    with write-through.
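
A rough Python sketch of a write-back store with a dirty bit
(write-allocate variant; the structure is ours, not from the
slides):

```python
memory = {}   # (tag, index) -> line data
cache = {}    # index -> {"tag", "data", "valid", "dirty"}

def write_back_store(index, tag, data):
    """Stores go to the cache only, setting the dirty bit; memory is
    updated only when a dirty line is evicted."""
    entry = cache.get(index)
    if entry and entry["valid"] and entry["tag"] == tag:   # write hit
        entry["data"] = data
        entry["dirty"] = True
        return
    # Write miss: evict the old line, writing it back only if dirty.
    if entry and entry["valid"] and entry["dirty"]:
        memory[(entry["tag"], index)] = entry["data"]      # write-back
    cache[index] = {"tag": tag, "data": data,
                    "valid": True, "dirty": True}
```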

12
Cache Loading
  • In the beginning, the cache contains junk.
  • When the CPU makes a memory access, it compares
    the tag field of the memory address to the tag in
    the cache. Even if the tags match, we don't know
    whether the data is valid.
  • Therefore, we add a valid bit to each cache
    entry.
  • In the beginning, all the valid bits are set to
    0.
  • Later, as data is loaded from memory into the
    cache, the valid bit for the cache entry is set
    to 1.
  • To check whether a word is in the cache, we must
    check both that the cache tag matches the address
    tag and that the valid bit is 1, as the sketch
    below shows.
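
In code, the hit condition from the last bullet is simply a
conjunction (illustrative Python):

```python
def is_hit(entry, addr_tag):
    """A lookup hits only if the entry is valid AND the tags match;
    a matching tag alone could just be power-up junk."""
    return entry["valid"] == 1 and entry["tag"] == addr_tag

# At power-up, every entry is marked invalid:
cache = [{"valid": 0, "tag": 0, "data": None} for _ in range(8)]
```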

13
Instruction and Data Caches
  • Can either have a separate Instruction Cache and
    Data Cache, or one unified cache.
  • Advantage of separate caches: the Instruction
    Cache and Data Cache can be accessed
    simultaneously in the same cycle, as required by
    a pipelined datapath.
  • Advantage of a unified cache: more flexible, so
    it may have a higher hit rate.

14
Multiple-Level Caches
  • More levels in the memory hierarchy
  • Can have two levels of cache
  • The Level-1 cache (or L1 cache, or internal
    cache) is smaller and faster, and sits on the
    processor chip next to the CPU.
  • The Level-2 cache (or L2 cache, or external
    cache) is larger but slower, and traditionally
    sits outside the processor.
  • A memory access first goes to the L1 cache. If
    the L1 access is a miss, go to the L2 cache. If
    the L2 access is a miss, go to main memory. If
    main memory misses (a page fault), go to virtual
    memory on the hard disk.
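
The cost of this hierarchy is often summarized as the average
memory access time (AMAT), a standard formula not given on the
slide; the numbers below are illustrative only:

```python
def amat(l1_hit_time, l1_miss_rate, l2_hit_time, l2_miss_rate, mem_time):
    """Average memory access time: every access pays the L1 hit time;
    L1 misses add the L2 hit time; misses in both levels add the
    main-memory access time."""
    return l1_hit_time + l1_miss_rate * (l2_hit_time
                                         + l2_miss_rate * mem_time)

# Illustrative numbers: 1-cycle L1, 10-cycle L2, 100-cycle memory,
# 5% L1 miss rate, 20% local L2 miss rate.
print(amat(1, 0.05, 10, 0.20, 100))   # 1 + 0.05*(10 + 0.2*100) = 2.5
```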