Title: CS 161 Ch 7: Memory Hierarchy LECTURE 20
Slide 1
- Instructor L.N. Bhuyan
- www.cs.ucr.edu/bhuyan
Slide 2: Cache Organization
- (1) How do you know if something is in the cache?
- (2) If it is in the cache, how do you find it?
- The answers to (1) and (2) depend on the type, or organization, of the cache.
- In a direct-mapped cache, each memory address is associated with exactly one possible block within the cache.
- Therefore, we only need to look in a single location in the cache for the data, if it exists in the cache.
Slide 3: Simplest Cache: Direct Mapped
[Figure: a 4-block direct-mapped cache in front of a 16-block memory. Memory block addresses 0 (0000two), 4 (0100two), 8 (1000two), and 12 (1100two) all map to cache index 0; blocks 1, 5, 9, and 13 map to index 1; and so on.]
- Cache block 0 can be occupied by data from memory blocks 0, 4, 8, 12.
- Cache block 1 can be occupied by data from memory blocks 1, 5, 9, 13.
Slide 4: Simplest Cache: Direct Mapped
[Figure: 4-block direct-mapped cache and 16-block main memory. The memory block address splits into a tag (upper bits) and an index (lower bits); memory blocks 2 (0010two), 6 (0110two), 10 (1010two), and 14 (1110two) all map to cache index 2.]
- The index determines the block in the cache:
  - index = (block address) mod (# cache blocks)
- If the number of cache blocks is a power of 2, then the cache index is just the lower n bits of the memory block address, where n = log2(# blocks).
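The index calculation above can be sketched in a few lines; this is a minimal illustration using the 4-block cache from the slide (the parameter values are assumptions for the example, not fixed by the hardware):

```python
# Sketch: mapping memory block addresses onto a direct-mapped cache
# with a power-of-two number of blocks (4-block cache assumed).
NUM_BLOCKS = 4
N = NUM_BLOCKS.bit_length() - 1     # n = log2(# blocks) = 2

def cache_index(block_address):
    # index = (block address) mod (# cache blocks)
    return block_address % NUM_BLOCKS

def cache_index_bits(block_address):
    # Equivalent for power-of-two sizes: keep only the lower n bits.
    return block_address & (NUM_BLOCKS - 1)

for addr in (0, 4, 8, 12):          # these all collide on cache index 0
    assert cache_index(addr) == cache_index_bits(addr) == 0
print([cache_index(a) for a in range(8)])   # [0, 1, 2, 3, 0, 1, 2, 3]
```

The mod and bit-masking forms give identical results, which is why real hardware simply wires the low address bits to the index: no arithmetic is needed.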
Slide 5: Simplest Cache: Direct Mapped with Tag
[Figure: the same 4-block direct-mapped cache, now showing each cache entry's tag and data fields next to the 16-block main memory.]
- The tag determines which memory block occupies the cache block.
- tag bits = left-hand (upper) bits of the memory block address.
- hit: cache tag field = tag bits of the address.
- miss: cache tag field ≠ tag bits of the address.
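The tag/index split and the hit test above can be modeled directly; this is a sketch with the slide's 4-block cache sizes assumed:

```python
# Sketch: split a memory block address into tag and index, and apply
# the hit/miss test from the slide (4-block cache -> 2 index bits).
INDEX_BITS = 2
cache_tags = {}                      # index -> stored tag (valid entries only)

def split(block_address):
    index = block_address & ((1 << INDEX_BITS) - 1)   # lower bits
    tag = block_address >> INDEX_BITS                 # left-hand bits
    return tag, index

def access(block_address):
    tag, index = split(block_address)
    if cache_tags.get(index) == tag:                  # hit: stored tag matches
        return "hit"
    cache_tags[index] = tag                           # miss: fetch and fill
    return "miss"

print(access(2))   # miss: index 2 starts empty
print(access(2))   # hit: same block, tag now stored
print(access(6))   # miss: block 6 also maps to index 2, different tag
```

Note how blocks 2 and 6 share index 2 but carry different tags; the tag comparison is exactly what distinguishes them.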
Slide 6: Accessing Data in a Direct-Mapped Cache
- Three types of events:
  - cache miss: nothing is in the cache in the appropriate block, so fetch from memory
  - cache hit: cache block is valid and contains the proper address, so read the desired word
  - cache miss with block replacement: wrong data is in the cache at the appropriate block, so discard it and fetch the desired data from memory
- Cache access procedure:
  - (1) Use the index bits to select the cache block.
  - (2) If the valid bit is 1, compare the tag bits of the address with the cache block's tag bits.
  - (3) If they match, use the offset to read out the word/byte.
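The three-step procedure can be sketched end to end; the toy sizes below (2 index bits, 4 words per block) are assumptions chosen to match the slide-7 example:

```python
# Sketch of the three-step access procedure (assumed toy sizes:
# 2 index bits, 2 offset bits -> 4 cache lines of 4 words each).
INDEX_BITS, OFFSET_BITS = 2, 2
lines = [{"valid": False, "tag": 0, "data": [0] * 4}
         for _ in range(1 << INDEX_BITS)]

def read(word_address):
    offset = word_address & 0b11                    # word within the block
    index = (word_address >> OFFSET_BITS) & 0b11    # (1) select cache block
    tag = word_address >> (OFFSET_BITS + INDEX_BITS)
    line = lines[index]
    if line["valid"] and line["tag"] == tag:        # (2) valid bit and tag match?
        return line["data"][offset]                 # (3) offset selects the word
    return None                                     # miss: would fetch from memory

# Pre-fill index 1 with a valid block (tag 0), then read word 2 of it.
lines[1] = {"valid": True, "tag": 0, "data": ["a", "b", "c", "d"]}
print(read(0b000110))   # tag=0, index=1, offset=2 -> "c"
```

A real cache does steps (1)-(3) in hardware within a single access, but the control logic follows exactly this order.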
Slide 7: Data valid, tag OK, so read offset, return word d
[Figure: worked example. The address 000000000000000000 0000000001 1100 splits into tag, index, and offset fields; index 1 selects a cache block whose valid bit is set and whose stored tag matches the address tag, and the offset selects word d from the block's words a, b, c, d.]
Slide 8: An Example Cache: DecStation 3100
- Commercial workstation, 1985
- MIPS R2000 processor (similar to the pipelined machine of Chapter 6)
- Separate instruction and data caches:
  - direct mapped
  - 64K bytes (16K words) each
  - block size: 1 word (low spatial locality)
- Solution: increase the block size (see the 2nd example)
Slide 9: DecStation 3100 Cache
[Figure: address (showing bit positions): bits 31..16 form the 16-bit tag, bits 15..2 the 14-bit index into 16K entries, and bits 1..0 the byte offset. Each entry holds a valid bit, a 16-bit tag, and 32 bits of data; Hit is asserted when the entry is valid and the stored tag matches the address tag.]
On a miss, the cache controller stalls the processor and loads the data from main memory.
Slide 10: 64KB Cache with 4-word (16-byte) Blocks
[Figure: address (showing bit positions): bits 31..16 form the 16-bit tag, bits 15..4 the 12-bit index into 4K entries, bits 3..2 the block offset, and bits 1..0 the byte offset. Each entry holds a valid bit, a 16-bit tag, and 128 bits of data; on a hit, a multiplexor uses the block offset to select one 32-bit word out of the four in the block.]
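The field widths on slides 9 and 10 follow mechanically from the cache size and block size; this sketch derives both configurations (32-bit byte addresses and 4-byte words assumed):

```python
# Sketch: derive address-field widths for a direct-mapped cache
# from its total size and block size (32-bit byte addresses assumed).
from math import log2

def field_widths(cache_bytes, block_bytes, addr_bits=32):
    entries = cache_bytes // block_bytes          # one block per entry
    index = int(log2(entries))                    # bits to select an entry
    block_offset = int(log2(block_bytes // 4))    # bits to select a word
    byte_offset = 2                               # byte within a 4-byte word
    tag = addr_bits - index - block_offset - byte_offset
    return {"entries": entries, "tag": tag, "index": index,
            "block_offset": block_offset, "byte_offset": byte_offset}

print(field_widths(64 * 1024, 4))    # 1-word blocks: 16K entries, 14-bit index
print(field_widths(64 * 1024, 16))   # 4-word blocks: 4K entries, 12-bit index
```

Both configurations end up with a 16-bit tag: widening the block moves bits from the index into the block offset, leaving the tag unchanged.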
Slide 11: Miss Rates: 1-word vs. 4-word Blocks (cache similar to DecStation 3100)

Program  Block size  I-cache miss rate  D-cache miss rate  Combined miss rate
gcc      1-word      6.1%               2.1%               5.4%
spice    1-word      1.2%               1.3%               1.2%
gcc      4-word      2.0%               1.7%               1.9%
spice    4-word      0.3%               0.6%               0.4%
Slide 12: Miss Rate Versus Block Size
[Figure 7.12: miss rate (%) versus block size (4, 16, 64, 256 bytes) for direct-mapped caches of total size 1 KB, 8 KB, 16 KB, 64 KB, and 256 KB.]
Slide 13: Extreme Example: 1-block Cache
- Suppose we choose block size = cache size. Then there is only one block in the cache.
- Temporal locality says that if an item is accessed, it is likely to be accessed again soon.
- But it is unlikely to be accessed again immediately!
- So the next access is likely to be a miss:
  - we continually load data into the cache but are forced to discard it before it is used again
- The worst nightmare of a cache designer: the Ping-Pong Effect.
Slide 14: Block Size and Miss Penalty
- As block size increases, the cost of a miss also increases.
- Miss penalty = the time to fetch the block from the next lower level of the hierarchy and load it into the cache.
- With very large blocks, the increase in miss penalty overwhelms the decrease in miss rate.
- Average access time can be minimized if the memory system is designed right.
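The tradeoff can be made concrete with the standard average memory access time formula, AMAT = hit time + miss rate × miss penalty. The miss penalties below are illustrative assumptions (not from the slides), paired with the gcc combined miss rates of slide 11:

```python
# Sketch: AMAT = hit time + miss rate * miss penalty (times in cycles).
# Miss penalties here are assumed values for illustration only.
def amat(hit_time, miss_rate, miss_penalty):
    return hit_time + miss_rate * miss_penalty

# 1-word blocks: higher miss rate, cheaper miss (assume 17-cycle penalty).
print(amat(1, 0.054, 17))   # ~1.92 cycles
# 4-word blocks: lower miss rate, costlier miss (assume 65-cycle penalty).
print(amat(1, 0.019, 65))   # ~2.24 cycles
```

With these assumed penalties the larger block is actually slower on average, even though its miss rate is much lower: the penalty term dominates, which is exactly the effect the slide warns about.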
Slide 15: Block Size Tradeoff
[Figure: two qualitative sketches. Miss rate vs. block size: first falls (exploits spatial locality), then rises when having fewer blocks compromises temporal locality. Average access time vs. block size: eventually rises as the increased miss penalty outweighs the reduced miss rate.]
Slide 16: Direct-Mapped Cache, Contd.
- The direct-mapped cache is simple to design, and its access time is fast (Why?)
- Good for an L1 (on-chip) cache
- Problem: conflict misses, hence a low hit ratio
- Conflict misses are misses caused by accessing different memory locations that are mapped to the same cache index.
- In a direct-mapped cache there is no flexibility in where a memory block can be placed in the cache, which contributes to conflict misses.
Slide 17: Another Extreme: Fully Associative
- Fully associative cache (8-word blocks):
  - Omit the cache index; place an item in any block!
  - Compare all cache tags in parallel.
[Figure: a 32-bit address splits into a 27-bit cache tag (bits 31..5) and a byte offset (bits 4..0); every entry B0, B1, ..., B31 holds a valid bit, a cache tag, and cache data, and all stored tags are compared against the address tag in parallel.]
- By definition: conflict misses = 0 for a fully associative cache.
Slide 18: Fully Associative Cache
- Must search all tags in the cache, since an item can be in any cache block.
- The search for the tag must be done by hardware in parallel (other searches are too slow).
- But the necessary parallel comparator hardware is very expensive.
- Therefore, fully associative placement is practical only for a very small cache.
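A fully associative lookup can be sketched as follows; the hardware compares every tag simultaneously, which software can only model sequentially (the sizes match slide 17, and the tag values are made up for illustration):

```python
# Sketch: fully associative lookup. No index field exists, so the
# address tag is checked against every stored tag (the `any` loop
# stands in for the parallel comparators of real hardware).
OFFSET_BITS = 5                          # 8-word (32-byte) blocks, as on slide 17
entries = []                             # (valid, tag) pairs; any block may hold any address

def lookup(address):
    tag = address >> OFFSET_BITS         # 27-bit tag on a 32-bit address
    return any(valid and stored == tag for valid, stored in entries)

entries.append((True, 0x123))            # assumed example tag
print(lookup(0x123 << OFFSET_BITS))      # True: some entry's tag matches
print(lookup(0x999 << OFFSET_BITS))      # False: miss, and no index restricted placement
```

The cost is visible even in this sketch: every lookup touches every entry, which is why the comparator count, and hence the hardware cost, grows with the number of blocks.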