Title: Cache Memories
1 Cache Memories
- Fast processors need fast memories
- Fast RAM (SRAM) is expensive and small (i.e. low memory density)
- DRAM is cheaper and bigger, but slower
- Use a CACHE to get the best of both worlds
2 Computer Memory Hierarchy
3 Cache Memories
- Cache: "hidden" memory
- Cache: a small, very fast memory that holds copies of recently used memory values
- The cache operates transparently to the programmer: it automatically decides what to keep and what to overwrite
- Coherency: ensure that the contents of the cache and the main memory are the SAME (whenever they have to be)
4 Cache Memory Consistency
- Fundamental requirement
- Every read access to a memory address always provides the most up-to-date data at that address
- This requirement has to be satisfied even in a multi-busmaster or multi-processor system, where copies of memory areas may reside in multiple cache memories.
5 Most Code and Data is Local
- Program execution: most code addresses are very close together - think of a loop
- Most data variables are used very frequently
- A block of code will use a small number of variables at a time. Typically, variables are arranged in structures/objects/records, which are stored in a block of memory
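The loop argument above can be sketched numerically. A minimal Python model (the 32-byte line size and 4-byte integers are illustrative assumptions, not from the slides) counts how often one sequential pass over an array touches a new cache line:

```python
LINE_SIZE = 32      # assumed cache line size in bytes
ELEM_SIZE = 4       # assumed 4-byte integers

def sequential_miss_rate(n_elems):
    """Count compulsory misses for one sequential pass over an array."""
    lines_touched = set()
    misses = 0
    for i in range(n_elems):
        line = (i * ELEM_SIZE) // LINE_SIZE   # which line holds element i
        if line not in lines_touched:
            misses += 1                       # first touch of this line: miss
            lines_touched.add(line)
    return misses / n_elems

print(sequential_miss_rate(1024))  # -> 0.125 (1 miss per 8 elements)
```

With 8 elements per line, 7 of every 8 sequential accesses hit in the cache - exactly the locality the cache exploits.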
6 Cache Memory and Main Memory
7 Cache Organisation
- Unified cache for code and data (e.g. i486): more efficient use of resources
- Separate (Harvard) code and data caches (e.g. Pentium): faster because you can access code and data in the same clock cycle
8 Code and Data Cache
9 Cache Hits and Misses
- Cache Hit: if data required by the CPU is in the cache we have a cache hit, otherwise a cache miss
- Cache Hit Rate: the proportion of memory accesses satisfied by the cache; the Miss Rate is the figure more commonly quoted
- To prevent memory bottlenecks the cache miss rate needs to be no more than a few percent
- Cache Line: a block of data held in the cache
- Cache Line Fill: occurs when a block of data is read from main memory into a cache line
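The "few percent" target above follows from the standard average-memory-access-time formula. A sketch with assumed latencies (1-cycle hit, 40-cycle miss penalty; both numbers are illustrative):

```python
def amat(hit_time, miss_penalty, miss_rate):
    """Average memory access time: hit time plus the miss penalty
    weighted by how often we miss."""
    return hit_time + miss_rate * miss_penalty

# Assumed latencies: 1-cycle cache hit, 40-cycle main-memory penalty.
print(amat(1, 40, 0.03125))  # -> 2.25 cycles: ~3% misses keep us near hit speed
print(amat(1, 40, 0.25))     # -> 11.0 cycles: 25% misses erase most of the benefit
```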
10 Cache Definitions
- Cache Line: the smallest unit of storage that can be allocated in a cache. The processor always reads or writes entire cache lines. Popular cache line sizes are 16B-32B
- Cache Set: a group of cache lines into which a given line in memory can be mapped. Every memory address is mapped to a specific set. The number of cache lines per cache set depends on the associativity of the cache. A cache with n cache lines per cache set is called an n-way set-associative cache.
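The mapping of an address to a set can be sketched as follows (the 32-byte line size and 128 sets are assumed example values, not from the slides):

```python
def split_address(addr, line_size, num_sets):
    """Split a byte address into (tag, set index, byte offset).
    line_size and num_sets are assumed to be powers of two."""
    offset = addr % line_size                  # position within the line
    set_index = (addr // line_size) % num_sets # which set the line maps to
    tag = addr // (line_size * num_sets)       # remaining high-order bits
    return tag, set_index, offset

# Assumed geometry: 32-byte lines, 128 sets.
print(split_address(0x1234, 32, 128))  # -> (1, 17, 20)
```

Every address with the same set index competes for the lines of that one set, regardless of its tag.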
11 Cache Organisation
- Direct-mapped cache: 1 cache line per cache set
- 2-way set-associative cache: 2 cache lines per cache set
- 4-way set-associative cache: 4 cache lines per cache set
- 8-way set-associative cache: 8 cache lines per cache set
12 Cache Definitions
- A Cache Entry consists of:
- Cache Directory Entry: contains information about what data is stored in the cache
- Cache Memory Entry: contains the actual cache data (cache lines)
13 Cache Write Strategies
14 Cache Invalidation and Cache Flush
- If another processor or DMA unit writes to a main memory location that is also kept in the cache, the cache controller must perform a Cache Invalidation
- When write-back is used, updated cache data must sometimes be transferred to main memory on demand (e.g. after a DMA request): a Cache Flush
15 Direct Mapped Cache (1)
16 Direct Mapped Cache (2)
- Tag comparison and data access can be performed at the same time: the direct-mapped cache is the fastest organisation
- The Tag RAM is small, so tag access is completed before data access
- Two items with the same cache set address will contend for the use of a single cache entry
- Cache contention can lead to cache thrashing
- Only bits not used to select within the line or to address the cache RAM need to be stored in the tag field
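Contention and thrashing can be demonstrated with a minimal direct-mapped model (assumed toy geometry: 32-byte lines, 8 sets, so addresses 256 bytes apart collide):

```python
# Minimal direct-mapped cache model: one tag per set.
LINE, SETS = 32, 8
cache = [None] * SETS

def access(addr):
    """Return True on a hit; on a miss, fill the line (evicting the occupant)."""
    s = (addr // LINE) % SETS          # set index
    tag = addr // (LINE * SETS)        # tag bits
    if cache[s] == tag:
        return True
    cache[s] = tag                     # line fill evicts the previous occupant
    return False

# Two addresses 256 bytes apart map to set 0 and thrash each other:
hits = sum(access(a) for a in [0, 256] * 8)
print(hits)   # -> 0: every access evicts the line the other one needs
```

In a 2-way set-associative cache the same two addresses would each keep a line and hit after the first access.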
17 2-way Set-Associative Cache (1)
18 2-way Set-Associative Cache (2)
- Effectively 2 x direct-mapped caches in parallel
- Each of two items that were in contention may occupy a separate place in the cache
- Moving from a direct-mapped cache to a 2-way set-associative cache of a given cache size: the set address is one bit smaller and the tag is one bit bigger than in the direct-mapped case. Can you see why?
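One way to see why: for a fixed cache size, doubling the associativity halves the number of sets, so the set index loses one bit and the tag gains one. A sketch assuming 32-bit addresses and an 8 KB cache with 32-byte lines (illustrative numbers):

```python
import math

def field_widths(cache_bytes, line_size, ways, addr_bits=32):
    """Bit widths of (tag, set index, byte offset) for a given cache geometry."""
    num_sets = cache_bytes // (line_size * ways)   # more ways -> fewer sets
    offset_bits = int(math.log2(line_size))
    set_bits = int(math.log2(num_sets))
    tag_bits = addr_bits - set_bits - offset_bits
    return tag_bits, set_bits, offset_bits

# Same 8 KB cache, 32-byte lines, 32-bit addresses:
print(field_widths(8192, 32, 1))   # direct-mapped: (19, 8, 5)
print(field_widths(8192, 32, 2))   # 2-way: (20, 7, 5) - one set bit moved to the tag
```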
19 Cache Line Replacement
- When a cache miss causes a line fill and there is a vacancy in the set, the new line will replace a vacant line
- However, when a cache miss causes a line fill and all lines in the set are occupied, you have to decide which line in the set will be replaced
- This is done through a replacement algorithm. Popular algorithms are:
- Least Recently Used (LRU): controlled by LRU bits in the cache directory
- Random Allocation
- Cyclic
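An LRU policy for a single set can be sketched with an ordered list, newest at the end - a software model of what the LRU bits in the cache directory encode (a 4-way set is assumed):

```python
WAYS = 4   # assumed 4-way set-associative cache

def access(lru, tag):
    """Return True on a hit. On a miss with a full set, evict the LRU line."""
    if tag in lru:
        lru.remove(tag)        # hit: move line to most-recently-used position
        lru.append(tag)
        return True
    if len(lru) == WAYS:
        lru.pop(0)             # evict the least recently used line
    lru.append(tag)            # line fill for the new tag
    return False

lru = []
for t in [1, 2, 3, 4, 1, 5]:   # re-using 1 makes 2 the LRU line; 5 evicts 2
    access(lru, t)
print(lru)                     # -> [3, 4, 1, 5]
```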
20 Cache Consistency (1)
- The challenge in handling caches:
- Data in a cache may not always be the same as the data in main memory
- Remember that the data in the cache is controlled by the CPU that owns the cache
21 Cache Consistency (2)
- Suppose you have a second CPU or another possible Bus Master (e.g. a DMA Controller)
- Suppose this device wants to access some data that's in the CPU's cache
- The second device can only access main memory
- How does it know whether the data it wants to read is up-to-date?
- Solution: Bus Snooping or Inquiry Cycles
22 Cache Consistency (3)
- Inquiry cycles (snoop cycles): initiated by the system to determine whether a line is present in the cache, and what state the line is in.
23 MESI Protocol: What's That?
- A formal mechanism for controlling cache consistency using snooping
- Every cache line is in 1 of 4 MESI states (encoded in 2 bits)
- A cache line can change state through:
- memory read and write cycles
- inquiry cycles
24 MESI States
- Modified: an M-state line is available in only one cache and it is also MODIFIED (different from main memory). An M-state line can be accessed (read/written) without sending a cycle out on the bus
- Exclusive: an E-state line is also available in only one cache in the system, but the line is not MODIFIED (i.e., it is the same as main memory). An E-state line can be accessed (read/written) without generating a bus cycle. A write to an E-state line causes the line to become MODIFIED
25 MESI States
- Shared: this state indicates that the line is potentially shared with other caches (i.e., the same line may exist in more than one cache). A read of an S-state line does not generate bus activity, but a write to a SHARED line generates a write-through cycle on the bus. The write-through cycle may invalidate this line in other caches. A write to an S-state line updates the cache
- Invalid: this state indicates that the line is not available in the cache. A read of this line will be a MISS and may cause the processor to execute a LINE FILL (fetch the whole line into the cache from main memory). A write to an INVALID line causes the processor to execute a write-through cycle on the bus
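The local read/write behaviour described above can be tabulated as a sketch. Inquiry-cycle transitions are omitted, and the next state after an S-state write or a line fill depends on what other caches hold; the entries below are plausible simplifications of the descriptions above, not the full protocol:

```python
# Next state and bus activity for LOCAL processor accesses only,
# simplified from the state descriptions above (inquiry cycles omitted).
MESI = {
    # (state, op): (next_state, bus_cycle)
    ('M', 'read'):  ('M', None),            # no bus cycle needed
    ('M', 'write'): ('M', None),
    ('E', 'read'):  ('E', None),
    ('E', 'write'): ('M', None),            # becomes MODIFIED without a bus cycle
    ('S', 'read'):  ('S', None),
    ('S', 'write'): ('E', 'write-through'), # assumed: other copies get invalidated
    ('I', 'read'):  ('S', 'line-fill'),     # miss; 'S' assumed, could also be E
    ('I', 'write'): ('I', 'write-through'),
}

print(MESI[('E', 'write')])   # -> ('M', None)
```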
26 MESI States
27 MESI State Transitions
28 Cache Consistency and Bus Snooping (Inquiry Cycles) - 1
29 Cache Consistency and Bus Snooping (Inquiry Cycles) - 2
30 Cache Consistency and Bus Snooping (Inquiry Cycles) - 3
31 Cache Consistency and Bus Snooping (Inquiry Cycles) - 4
32 Cache Consistency and Bus Snooping (Inquiry Cycles) - 5
33 Cache Consistency and Bus Snooping (Inquiry Cycles) - 6
34 Cache Consistency and Bus Snooping (Inquiry Cycles) - 7
35 Cache Consistency and Bus Snooping (Inquiry Cycles) - 8
36 Cache Consistency and Bus Snooping (Inquiry Cycles) - 9
37 L2-Caches and the MESI Protocol
- L2-caches are larger than L1 but a bit slower
- The MESI protocol applies to both caches
- Inclusion: all addresses in L1 are also in L2
38 Intel Architecture Caches
39 Pentium L1-Caches
40 Pentium L1 Cache Elements
41 Pentium Page Cacheability
42 Pentium L2-Cache