1
Cache Memories
  • Fast processors need fast memories
  • Fast RAM (SRAM) is expensive and small (i.e. low
    memory density)
  • DRAM is cheaper and bigger, but slower
  • Use a CACHE to get the best of both worlds

2
Computer Memory Hierarchy
3
Cache Memories
  • Cache = hidden memory
  • Cache: a small, very fast memory that holds
    copies of recently used memory values
  • The cache operates transparently to the
    programmer; it automatically decides what to
    keep and what to overwrite
  • Coherency: ensuring that the contents of the
    cache and the main memory are the SAME (whenever
    they have to be)

4
Cache Memory Consistency
  • Fundamental requirement: every read access to a
    memory address always provides the most
    up-to-date data at that address
  • This requirement has to be satisfied even in a
    multi-bus-master or multi-processor system,
    where copies of memory areas may reside in
    multiple cache memories

5
Most Code and Data Is Local
  • Program execution: most code addresses are very
    close together
  • Think of a loop
  • Most data variables are used very frequently
  • A block of code will use a small number of
    variables at a time. Typically, variables are
    arranged in structures/objects/records, which
    are stored in a block of memory (see the sketch
    below)
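
A minimal C sketch of the locality described above (array size and
types are illustrative, not from the slides): the loop body is a few
instructions executed repeatedly (temporal locality in code), sum is
reused on every iteration (temporal locality in data), and a[i] walks
through consecutive addresses (spatial locality), so successive
accesses fall into the same cache lines.

    #include <stdio.h>

    int main(void) {
        int a[1024];
        long sum = 0;
        for (int i = 0; i < 1024; i++)
            a[i] = i;            /* sequential writes share cache lines */
        for (int i = 0; i < 1024; i++)
            sum += a[i];         /* sequential reads revisit those lines */
        printf("sum = %ld\n", sum);
        return 0;
    }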

6
Cache Memory and Main Memory
7
Cache Organisation
  • Unified cache for code and data (e.g. i486):
    more efficient use of resources
  • Separate (Harvard) code and data caches (e.g.
    Pentium): faster, because you can access code
    and data in the same clock cycle

8
Code and Data Cache
9
Cache Hits and Misses
  • Cache hit: if data required by the CPU is in the
    cache, we have a cache hit; otherwise, a cache
    miss
  • Cache hit rate: the proportion of memory
    accesses satisfied by the cache; the miss rate
    is more commonly referred to
  • To prevent memory bottlenecks, the cache miss
    rate needs to be no more than a few percent (see
    the calculation below)
  • Cache line: a block of data held in the cache
  • Cache line fill: occurs when a block of data is
    read from main memory into a cache line
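
A back-of-the-envelope calculation of why "a few percent" matters.
The latencies below (1 cycle for a hit, 50 extra cycles for a miss)
are assumptions for illustration, not figures from the slides; the
average cost per access is hit time + miss rate x miss penalty.

    #include <stdio.h>

    int main(void) {
        const double hit_time = 1.0;      /* cycles, assumed */
        const double miss_penalty = 50.0; /* extra cycles, assumed */

        for (int pct = 1; pct <= 10; pct++) {
            double miss_rate = pct / 100.0;
            double avg = hit_time + miss_rate * miss_penalty;
            printf("miss rate %2d%% -> %4.1f cycles per access\n",
                   pct, avg);
        }
        return 0;
    }

Even a 4% miss rate triples the average access cost under these
assumptions, which is why the miss rate must stay very low.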

10
Cache Definitions
  • Cache line: the smallest unit of storage that
    can be allocated in a cache. The processor
    always reads or writes entire cache lines.
    Popular cache line sizes: 16-32 bytes
  • Cache set: a group of cache lines into which a
    given line in memory can be mapped. Every memory
    address is mapped to a specific set. The number
    of cache lines per cache set depends on the
    associativity of the cache. A cache with n cache
    lines per cache set is called an n-way
    set-associative cache

11
Cache Organisation
  • Direct-mapped cache: 1 cache line per cache set
  • 2-way set-associative cache: 2 cache lines per
    cache set
  • 4-way set-associative cache: 4 cache lines per
    cache set
  • 8-way set-associative cache: 8 cache lines per
    cache set (see the address sketch below)
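
A minimal sketch of how a memory address selects a set, assuming a
32-bit address, 32-byte lines, and a 2-way set-associative cache of
8 KB (so 128 sets); these parameters are chosen for illustration and
are not fixed by the slides.

    #include <stdint.h>
    #include <stdio.h>

    #define LINE_SIZE 32u   /* bytes per cache line (assumed) */
    #define NUM_SETS  128u  /* 8 KB / 32 B / 2 ways (assumed) */

    int main(void) {
        uint32_t addr = 0x12345678u;

        uint32_t offset = addr % LINE_SIZE;              /* byte within line */
        uint32_t set    = (addr / LINE_SIZE) % NUM_SETS; /* which cache set */
        uint32_t tag    = addr / (LINE_SIZE * NUM_SETS); /* identifies line */

        printf("addr 0x%08x -> tag 0x%x, set %u, offset %u\n",
               (unsigned)addr, (unsigned)tag,
               (unsigned)set, (unsigned)offset);
        return 0;
    }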

12
Cache Definitions
  • A cache entry consists of:
  • Cache directory entry: contains information
    about what data is stored in the cache
  • Cache memory entry: contains the actual cache
    data (cache lines)

13
Cache Write Strategies
14
Cache Invalidation and Cache Flush
  • If another processor or DMA unit writes to a
    main memory location that is also kept in the
    cache, the cache controller must perform a cache
    invalidation
  • When write-back is used, updated cache data must
    sometimes be transferred to the main memory on
    demand (e.g. after a DMA request): a cache flush
    (both operations are sketched below)
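
A sketch of the two operations named above, for a single cache line;
the structure layout and names are illustrative, and the mapping from
tag to memory address is simplified (not from the slides).

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    struct line {
        bool     valid;
        bool     dirty;     /* set when write-back has deferred an update */
        uint32_t tag;
        uint8_t  data[32];
    };

    /* Cache invalidation: another master wrote to memory we are
       caching, so our copy can no longer be trusted. */
    static void invalidate(struct line *l) {
        l->valid = false;
    }

    /* Cache flush: with write-back, our copy may be newer than main
       memory, so it is written out on demand (e.g. before a DMA read). */
    static void flush(struct line *l, uint8_t *main_memory) {
        if (l->valid && l->dirty) {
            memcpy(main_memory + (size_t)l->tag * 32, l->data, 32);
            l->dirty = false;
        }
    }

    int main(void) {
        static uint8_t memory[1 << 16];
        struct line l = { .valid = true, .dirty = true, .tag = 3 };
        flush(&l, memory);   /* write the modified line back first */
        invalidate(&l);      /* then drop the now-redundant copy */
        return 0;
    }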

15
Direct Mapped Cache (1)
16
Direct Mapped Cache (2)
  • Tag comparison and data access can be performed
    at the same time, so the direct-mapped cache is
    the fastest
  • The tag RAM is small, so tag access completes
    before data access
  • Two items with the same cache set address will
    contend for the use of a single cache entry
  • Cache contention can lead to cache thrashing
    (see the sketch below)
  • Only bits not used to select within the line or
    to address the cache RAM need to be stored in
    the tag field
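
A direct-mapped lookup sketch (one line per set) that also shows the
contention and thrashing described above: two addresses whose set
index is the same but whose tags differ keep evicting each other.
Parameters are assumed (8 KB cache, 32-byte lines), not from the
slides.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define LINE_SIZE 32u
    #define NUM_SETS  256u   /* direct-mapped: one line per set */

    struct line { bool valid; uint32_t tag; };
    static struct line cache[NUM_SETS];

    /* Returns true on a hit; on a miss, performs the line fill. */
    static bool access_cache(uint32_t addr) {
        uint32_t set = (addr / LINE_SIZE) % NUM_SETS;
        uint32_t tag = addr / (LINE_SIZE * NUM_SETS);
        if (cache[set].valid && cache[set].tag == tag)
            return true;                /* hit */
        cache[set].valid = true;        /* miss: fill, evicting old line */
        cache[set].tag   = tag;
        return false;
    }

    int main(void) {
        uint32_t a = 0x1000u;
        uint32_t b = a + LINE_SIZE * NUM_SETS; /* same set, other tag */
        for (int i = 0; i < 2; i++) {
            printf("a: %s  ", access_cache(a) ? "hit" : "miss");
            printf("b: %s\n", access_cache(b) ? "hit" : "miss");
        }
        return 0;   /* all four accesses miss: thrashing */
    }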

17
2-way Set-Associative Cache (1)
18
2-way Set-associative Cache (2)
  • Effectively two direct-mapped caches in parallel
  • Each of two items that were in contention may
    occupy a separate place in the cache
  • Moving from a direct-mapped cache to a 2-way
    set-associative cache of the same size, the set
    address is one bit smaller and the tag is one
    bit bigger than in the direct-mapped case. Can
    you see why? (A worked example follows)
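
A worked answer under assumed parameters (the slides do not fix
them): with 32-bit addresses, an 8 KB cache with 32-byte lines holds
256 lines and uses 5 offset bits. Direct-mapped, the 256 lines form
256 sets: 8 set bits and 32 - 8 - 5 = 19 tag bits. Organised 2-way,
the same 256 lines form only 128 sets: 7 set bits and 20 tag bits.
Halving the number of sets removes one set-address bit, and that bit
must be stored in the tag instead.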

19
Cache Line Replacement
  • When a cache miss causes a line fill and there
    is a vacancy in the set, the new line will
    replace a vacant line
  • However, when a cache miss causes a line fill
    and all lines in the set are occupied, you have
    to decide which line in the set will be replaced
  • This is done through a replacement algorithm (a
    sketch follows this list). Popular algorithms
    are:
  • Least Recently Used (LRU): controlled by LRU
    bits in the cache directory
  • Random allocation
  • Cyclic
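
A minimal sketch of LRU replacement for one 2-way set; the slides do
not specify an implementation, so the layout below is an assumption.
With two ways, a single LRU bit per set (kept in the cache directory)
is enough: it names the way that was used least recently.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    struct set2 {
        bool     valid[2];
        uint32_t tag[2];
        int      lru;    /* index of the least recently used way */
    };

    /* Returns the way that holds the tag after the access. */
    static int access_set(struct set2 *s, uint32_t tag) {
        for (int w = 0; w < 2; w++) {
            if (s->valid[w] && s->tag[w] == tag) {
                s->lru = 1 - w;          /* hit: other way is now LRU */
                return w;
            }
        }
        /* Miss: prefer a vacant way, otherwise evict the LRU way. */
        int victim = !s->valid[0] ? 0 : (!s->valid[1] ? 1 : s->lru);
        s->valid[victim] = true;
        s->tag[victim]   = tag;
        s->lru = 1 - victim;
        return victim;
    }

    int main(void) {
        struct set2 s = {0};
        access_set(&s, 0xA);             /* fills way 0 */
        access_set(&s, 0xB);             /* fills way 1 */
        access_set(&s, 0xA);             /* hit; 0xB is now LRU */
        int w = access_set(&s, 0xC);     /* evicts 0xB */
        printf("0xC placed in way %d\n", w);   /* way 1 */
        return 0;
    }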

20
Cache Consistency (1)
  • A challenge in handling caches:
  • Data in a cache may not always be the same as
    the data in main memory
  • Remember that the data in the cache is
    controlled by the CPU that owns the cache

21
Cache Consistency (2)
  • Suppose you have a second CPU or another
    possible bus master (e.g. a DMA controller)
  • Suppose this device wants to access some data
    that is in the CPU's cache
  • The second device can only access main memory
  • How does it know whether the data it wants to
    read is up-to-date?
  • Solution: bus snooping, or inquiry cycles

22
Cache Consistency (3)
  • Inquiry cycles (snoop cycles): initiated by the
    system to determine whether a line is present in
    the cache, and what state the line is in

23
MESI Protocol: What's That?
  • A formal mechanism for controlling cache
    consistency using snooping
  • Every cache line is in 1 of 4 MESI states
    (encoded in 2 bits)
  • A cache line can change state through:
  • memory read and write cycles
  • inquiry cycles

24
MESI States
  • Modified: an M-state line is available in only
    one cache and it is also MODIFIED (different
    from main memory). An M-state line can be
    accessed (read/written) without sending a cycle
    out on the bus
  • Exclusive: an E-state line is also available in
    only one cache in the system, but the line is
    not MODIFIED (i.e., it is the same as main
    memory). An E-state line can be accessed
    (read/written) without generating a bus cycle. A
    write to an E-state line causes the line to
    become MODIFIED

25
MESI States
  • Shared: this state indicates that the line is
    potentially shared with other caches (i.e., the
    same line may exist in more than one cache). A
    read to an S-state line does not generate bus
    activity, but a write to a SHARED line generates
    a write-through cycle on the bus. The
    write-through cycle may invalidate this line in
    other caches. A write to an S-state line updates
    the cache
  • Invalid: this state indicates that the line is
    not available in the cache. A read to this line
    will be a MISS and may cause the processor to
    execute a LINE FILL (fetch the whole line into
    the cache from main memory). A write to an
    INVALID line causes the processor to execute a
    write-through cycle on the bus (these rules are
    sketched as a state machine below)
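
The rules on the last two slides can be written down as a small state
machine. This is a sketch of textbook MESI behaviour consistent with
the descriptions above, not a full implementation: bus signalling is
reduced to comments, and the state reached by a write to a SHARED
line varies between implementations (shown here as MODIFIED).

    #include <stdio.h>

    typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } mesi_t;

    /* Local processor read. */
    static mesi_t on_read(mesi_t s, int others_have_copy) {
        if (s == INVALID)                 /* miss: line fill from memory */
            return others_have_copy ? SHARED : EXCLUSIVE;
        return s;                         /* M, E, S: read hits silently */
    }

    /* Local processor write. */
    static mesi_t on_write(mesi_t s) {
        switch (s) {
        case MODIFIED:  return MODIFIED;  /* no bus cycle */
        case EXCLUSIVE: return MODIFIED;  /* silent upgrade, no bus cycle */
        case SHARED:    return MODIFIED;  /* write-through cycle on the
                                             bus; may invalidate copies */
        default:        return INVALID;   /* write-through cycle on the
                                             bus; no fill on write miss */
        }
    }

    /* Inquiry (snoop) cycle from another bus master; 'invalidating'
       is set when that master intends to write the line. */
    static mesi_t on_snoop(mesi_t s, int invalidating) {
        /* If the line is MODIFIED, its data is flushed to memory
           before the state changes (cache flush on demand). */
        if (invalidating)
            return INVALID;
        return (s == MODIFIED || s == EXCLUSIVE) ? SHARED : s;
    }

    int main(void) {
        mesi_t s = INVALID;
        s = on_read(s, 0);    /* fill, no other copies -> EXCLUSIVE */
        s = on_write(s);      /* silent E -> M */
        s = on_snoop(s, 0);   /* another master reads: flush, M -> S */
        printf("final state: %d (SHARED = %d)\n", s, SHARED);
        return 0;
    }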

26
MESI States
 
27
MESI State Transitions
28
Cache Consistency and Bus Snooping (Inquiry Cycles) -1
29
Cache Consistency and Bus Snooping (Inquiry Cycles) -2
30
Cache Consistency and Bus Snooping (Inquiry Cycles) -3
31
Cache Consistency and Bus Snooping (Inquiry Cycles) -4
32
Cache Consistency and Bus Snooping (Inquiry Cycles) -5
33
Cache Consistency and Bus Snooping (Inquiry Cycles) -6
34
Cache Consistency and Bus Snooping (Inquiry Cycles) -7
35
Cache Consistency and Bus Snooping (Inquiry Cycles) -8
36
Cache Consistency and Bus Snooping (Inquiry Cycles) -9
37
L2-Caches and the MESI Protocol
  • L2 caches are larger than L1, but a bit slower
  • The MESI protocol applies to both caches
  • Inclusion: all addresses in L1 are also in L2
38
Intel Architecture Caches
39
Pentium L1-caches
40
Pentium L1 Cache elements
41
Pentium Page Cacheability
42
Pentium L2-cache