Memory, Hierarchical Memory Systems, Cache Memory - PowerPoint PPT Presentation

Slides: 64
Provided by: Lee144
Transcript and Presenter's Notes

Title: Memory, Hierarchical Memory Systems, Cache Memory


1
Memory, Hierarchical Memory Systems, Cache Memory
  • Prof. Sin-Min Lee
  • Department of Computer Science

CS147 Lecture 14
2
The Five Classic Components of a Computer
3
The Processor Picture
4
Processor/Memory Bus
PCI Bus
I/O Busses
5
Technology Trends

Capacity and speed (latency) improvements:
Logic:  capacity 2x in 3 years,  speed 2x in 3 years
DRAM:   capacity 4x in 3 years,  speed 2x in 10 years
Disk:   capacity 4x in 3 years,  speed 2x in 10 years

DRAM generations:
Year   Size     Cycle Time
1980   64 Kb    250 ns
1983   256 Kb   220 ns
1986   1 Mb     190 ns
1989   4 Mb     165 ns
1992   16 Mb    145 ns
1995   64 Mb    120 ns
Capacity grew 1000:1 while cycle time improved only about 2:1.
6
Predicting Performance Change: Moore's Law
Original version: The density of transistors in an
integrated circuit will double every year.
(Gordon Moore, Intel, 1965)
Current version: Cost/performance of silicon chips
doubles every 18 months.
9
Processor-DRAM Memory Gap (latency)
18
The connection between the CPU and cache is very
fast; the connection between the CPU and main memory
is slower.
28
There are three methods of block placement:
Direct mapped: if each block has only one place
it can appear in the cache, the cache is said to
be direct mapped. The mapping is usually (Block
address) MOD (Number of blocks in cache).
Fully associative: if a block can be placed anywhere
in the cache, the cache is said to be fully
associative.
Set associative: if a block can be placed in a
restricted set of places in the cache, the cache is
said to be set associative. A set is a group of
blocks in the cache. A block is first mapped onto a
set, and then the block can be placed anywhere
within that set. The set is usually chosen by bit
selection, that is,
(Block address) MOD (Number of sets in cache)
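The three placement rules can be sketched in a few lines; the 8-block cache and 4-set geometry here are illustrative choices, not sizes taken from the slides:

```python
# Hypothetical geometry: 8-block cache, 2-way set associative => 4 sets.
NUM_BLOCKS = 8
NUM_SETS = 4

block_address = 13

# Direct mapped: exactly one legal line.
line = block_address % NUM_BLOCKS        # 13 mod 8 = 5

# Fully associative: any of the NUM_BLOCKS lines may hold the block,
# so no index is computed; the whole cache is searched by tag.

# Set associative: one legal set, then any way within that set.
set_index = block_address % NUM_SETS     # 13 mod 4 = 1

print(line, set_index)  # 5 1
```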
31
Cache (cont.)
Bits 2-4 of the main memory address form the cache
address (index). The upper 5 bits of the main memory
address (the tag) are stored in the cache along with
the data. If the tag stored at the requested index
matches the tag of the address from the CPU, it's a
cache hit.
  • Direct Mapping Cache
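A minimal sketch of that tag/index split, assuming for illustration a 10-bit address with a 2-bit byte offset below the 3-bit index (the offset width is an assumption; the slide only specifies the index and tag fields):

```python
# Assumed 10-bit address layout: bits 0-1 = byte offset,
# bits 2-4 = index (cache line number), bits 5-9 = 5-bit tag.
def split_address(addr):
    offset = addr & 0b11            # bits 0-1
    index = (addr >> 2) & 0b111     # bits 2-4, selects the cache line
    tag = addr >> 5                 # upper 5 bits, stored with the data
    return tag, index, offset

tag, index, offset = split_address(0b1011001110)
print(tag, index, offset)  # 22 3 2
```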

46
  • A pictorial example for a cache with only 4
    blocks and a memory with only 16 blocks.

49
Replacement Policies
  • Whenever there is a miss, the information must
    be read from main memory. In addition, the cache
    is updated with this new information. One line
    will be replaced with the new block of
    information.
  • Policies for doing this vary. The three most
    commonly used are FIFO, LRU, and Random.

50
FIFO Replacement Policy
  • First in, first out: Replaces the oldest line in
    the cache, regardless of the last time that this
    line was accessed.
  • The main benefit is that this is easy to
    implement.
  • The principal drawback is that you won't keep any
    item in the cache for long; you may find that you
    are constantly removing and adding the same block
    of memory.
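A minimal FIFO sketch; the 3-line capacity and the access string are illustrative, not from the slides:

```python
from collections import deque

# FIFO replacement: the oldest *insertion* is evicted, even if that
# line was just accessed. Hits do not refresh a line's position.
def fifo_hits(accesses, capacity=3):
    cache, order, hits = set(), deque(), 0
    for block in accesses:
        if block in cache:
            hits += 1                           # hit: order unchanged
        else:
            if len(cache) == capacity:
                cache.discard(order.popleft())  # evict oldest insertion
            cache.add(block)
            order.append(block)
    return hits

print(fifo_hits(['A', 'B', 'C', 'A', 'B', 'D', 'A']))  # 2
```

Note the final access to A misses: A was the oldest insertion and was evicted to make room for D, illustrating the "constantly removing and adding the same block" drawback.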

51
Hit Ratio
  • The hit ratio (hits divided by the sum of hits
    and misses) is a measure of cache performance.
  • A well-designed cache can have a hit ratio close
    to 1.
  • The cache hits far outnumber the misses, and this
    speeds up system performance dramatically.

52
Total accesses: 14; Hits: 4; Hit ratio: 4/14 = 2/7
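The slide's arithmetic can be checked directly:

```python
from fractions import Fraction

# 4 hits out of 14 total accesses reduces to the 2/7 on the slide.
hit_ratio = Fraction(4, 14)   # Fraction reduces automatically
print(hit_ratio)  # 2/7
```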
53
LRU Replacement Policy
  • Least Recently Used: The line that was accessed
    least recently is replaced with the new block of
    data.
  • The benefit is that this keeps the most recently
    accessed lines in the cache.
  • The drawback is that this can be difficult and
    costly to implement, especially if there are lots
    of lines to consider.
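A minimal LRU sketch (same illustrative capacity and access string as above); in hardware this bookkeeping is what makes LRU costly, but in software an ordered map makes it short:

```python
from collections import OrderedDict

# LRU replacement: every access moves the line to the "most recent"
# end; the line at the "least recent" end is the eviction victim.
def lru_hits(accesses, capacity=3):
    cache, hits = OrderedDict(), 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)          # refresh recency on a hit
        else:
            if len(cache) == capacity:
                cache.popitem(last=False)     # evict least recently used
            cache[block] = True
    return hits

print(lru_hits(['A', 'B', 'C', 'A', 'B', 'D', 'A']))  # 3
```

On this trace LRU scores 3 hits where FIFO scores 2: the hits on A and B refresh their recency, so D evicts C instead of A.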

55
Random Replacement Policy
  • With this policy, the line that is replaced is
    chosen randomly.
  • Performance is close to that of LRU, and the
    implementation is much simpler.

56
Mapping Technique
  • The cache mapping technique is another factor
    that determines how effective the cache is, that
    is, what its hit ratio and speed will be. The
    three types are:
  • 1. Direct Mapped Cache: Each memory location is
    mapped to a single cache line that it shares with
    many others; only one of the many addresses that
    share this line can use it at a given time. This
    is the simplest technique both in concept and in
    implementation. Using this cache means the
    circuitry to check for hits is fast and easy to
    design, but the hit ratio is relatively poor
    compared to the other designs because of its
    inflexibility. Motherboard-based system caches
    are typically direct mapped.
  • 2. Fully Associative Cache: Any memory location
    can be cached in any cache line. This is the most
    complex technique and requires sophisticated
    search algorithms when checking for a hit. It can
    lead to the whole cache being slowed down because
    of this, but it offers the best theoretical hit
    ratio, since there are so many options for caching
    any memory address.
  • 3. N-Way Set Associative Cache: "N" is typically
    2, 4, 8, etc. A compromise between the two
    previous designs: the cache is broken into sets of
    "N" lines each, and any memory address can be
    cached in any of those "N" lines. This improves
    hit ratios over the direct mapped cache without
    incurring a severe search penalty (since "N" is
    kept small). The 2-way or 4-way set associative
    cache is common in processor level 1 caches.

57
Comparison of cache mapping techniques
1. Direct Mapped Cache
  • The direct mapped cache is the simplest form of
    cache and the easiest to check for a hit.
  • Since there is only one possible place that any
    memory location can be cached, there is nothing
    to search: the line either contains the memory
    information we are looking for, or it doesn't.
  • Unfortunately, the direct mapped cache also has
    the worst performance, because again there is
    only one place that any address can be stored.

58
Direct mapped cache example
Access: A B C A D B E F A C D B G C H I A B
Line 0: A B B A A B B B A A A B B B B B A B
Line 1: - - - - D D D D D D D D D D D D D D
Line 2: - - C C C C C C C C C C C C C C C C
Line 3: - - - - - - - - - - - - G G G G G G
Line 4: - - - - - - E E E E E E E E E E E E
Line 5: - - - - - - - F F F F F F F F F F F
Line 6: - - - - - - - - - - - - - - - I I I
Line 7: - - - - - - - - - - - - - - H H H H
Hit?:   - - - - - - - - - Y Y - - Y - - - -
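The trace above can be reproduced in a few lines. The letter-to-line mapping is inferred from the table (A and B collide on line 0); the underlying block addresses are hypothetical:

```python
# Line assignments read off the table; A and B map to the same line,
# which is why they keep evicting each other.
line_of = {'A': 0, 'B': 0, 'C': 2, 'D': 1, 'E': 4,
           'F': 5, 'G': 3, 'H': 7, 'I': 6}
accesses = list("ABCADBEFACDBGCHIAB")

cache, hits = {}, 0
for block in accesses:
    line = line_of[block]
    if cache.get(line) == block:
        hits += 1                 # block already resident in its one line
    else:
        cache[line] = block       # miss: overwrite whatever is there
print(hits)  # 3, matching the three Y's in the Hit? row
```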
59
2. Fully Associative Cache
  • The fully associative cache has the best hit
    ratio because any line in the cache can hold any
    address that needs to be cached.
  • This means the problem seen in the direct mapped
    cache disappears, because there is no dedicated
    single line that an address must use.
  • However, this cache suffers from problems
    involving searching the cache. If a given address
    can be stored in any of 16,384 lines, how do you
    know where it is? Even with specialized hardware
    to do the searching, a performance penalty is
    incurred. And this penalty occurs for all
    accesses to memory, whether a cache hit occurs or
    not, because it is part of searching the cache to
    determine a hit.

60
Associative cache example
Access: A B C A D B E F A C D B G C H I A B
Line 0: A B B A A B B B A A A B B B B B A B
Line 1: - - - - D D D D D D D D D D D D D D
Line 2: - - C C C C C C C C C C C C C C C C
Line 3: - - - - - - - - - - - - G G G G G G
Line 4: - - - - - - E E E E E E E E E E E E
Line 5: - - - - - - - F F F F F F F F F F F
Line 6: - - - - - - - - - - - - - - - I I I
Line 7: - - - - - - - - - - - - - - H H H H
Hit?: Y Y Y
61
3. N-Way Set Associative Cache
  • The set associative cache is a good compromise
    between the direct mapped and fully associative
    caches.
  • Each address is mapped to a certain set of cache
    locations.
  • The address space is divided into blocks of 2^m
    bytes (the cache line size), discarding the
    bottom m address bits.
  • An "n-way set associative" cache with S sets has
    n cache locations in each set. Block b is mapped
    to set "b mod S" and may be stored in any of the
    n locations in that set, with its upper address
    bits as a tag. To determine whether block b is in
    the cache, set "b mod S" is searched
    associatively for the tag.
  • In the real world, the direct mapped and set
    associative caches are by far the most common.
    Direct mapping is used more for level 2 caches on
    motherboards, while the higher-performance
    set-associative cache is found more commonly on
    the smaller primary caches contained within
    processors.

62
2-Way Set-Associative example
Access: A  B  C  A  D  B  E  F  A  C  D  B  G  C  H  I  A  B
Set 0:  A  AB AB AB AB AB EB EB EA EA EA AB AB AB AB AB AB AB
Set 1:  -  -  -  -  D  D  D  DF DF DF DF DF DF DF DF DF DF DF
Set 2:  -  -  C  C  C  C  C  C  C  C  C  C  C  C  C  CI CI CI
Set 3:  -  -  -  -  -  -  -  -  -  -  -  -  G  G  GH GH GH GH
Hit?:   -  -  -  Y  -  Y  -  -  -  Y  Y  -  -  Y  -  -  Y  Y
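The seven hits can be reproduced by simulating a 2-way set-associative cache with LRU replacement within each set; the block-to-set mapping is inferred from the table:

```python
from collections import OrderedDict

# Set assignments read off the table: A, B, E compete for set 0, etc.
set_of = {'A': 0, 'B': 0, 'E': 0, 'D': 1, 'F': 1,
          'C': 2, 'I': 2, 'G': 3, 'H': 3}
accesses = list("ABCADBEFACDBGCHIAB")

sets = [OrderedDict() for _ in range(4)]  # 4 sets, 2 ways each
hits = 0
for block in accesses:
    s = sets[set_of[block]]
    if block in s:
        hits += 1
        s.move_to_end(block)          # refresh LRU order within the set
    else:
        if len(s) == 2:
            s.popitem(last=False)     # evict the set's LRU way
        s[block] = True
print(hits)  # 7, matching the seven Y's in the Hit? row
```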
63
Summary of mapping techniques

Cache Type              Hit Ratio                         Search Speed
Direct Mapped           Good                              Best
Fully Associative       Best                              Moderate
N-Way Set Associative   Very good; better as N increases  Good; worse as N increases