CSL718 : Memory Hierarchy - PowerPoint PPT Presentation

Slides: 27
Provided by: Ansh6

1
CSL718 Memory Hierarchy
  • Cache Memories
  • 6th Feb, 2006

2
Memory technologies
  • Semiconductor (random access)
      • Registers
      • SRAM
      • DRAM
      • FLASH
  • Magnetic
      • FDD
      • HDD
  • Optical (random + sequential access)
      • CD
      • DVD

3
Hierarchical structure
[Figure: the CPU sits above a stack of memory levels, compared by size, cost/bit and speed.]
Level                      Size       Cost/bit   Speed
Memory (top of stack)      Smallest   Highest    Fastest
Memory (bottom of stack)   Biggest    Lowest     Slowest

4
System Configuration (e-bay price: Rs. 37,500)
Component      Specification
Processor      Intel P4 3.2GHz (800FSB) 1024k CPU with Hyper Threading
CPU Fan        P4 Heavy Duty Cooling Fan With Heat Sink
Motherboard    D915G express chipset 800FSB (up to 3.6GHz support)
Memory         1GB DDR400 PC3200 DUAL CHANNEL RAM
Video Card     GeForce FX 6200 256MB 16x PCI-e video with TV out
Hard drive     160GB 7200RPM UDMA-150 SATA
CD drive       52x32x52x16x CDRW DVD ROM drive
Floppy drive   Sony 1.44MB 3.5" drive
Sound          AC 97 6 ch 5.1 Full duplex digital sound, stereo speakers
Network        10/100 RJ45 onboard network (Ethernet, cable or DSL)
Modem          56k v92 modem
Ports          Six USB 2.0 ports, 1 serial, 1 parallel, 1 microphone jack
Case           Black i BOX 522 Mid Tower 400w power supply (front USB)
Keyboard       Black PS2 Windows Keyboard
Mouse          Black PS2 Scroll Mouse
Monitor        17" SAMSUNG 793S MONITOR

5
Main Memory for Pentium IV: DDR (double data rate) DRAM
Size     Interface   Price
128 MB   PC-333      Rs. 599
256 MB   PC-333      Rs. 1,299
1 GB     PC-333      Rs. 4,999
1 GB     PC-400      Rs. 5,299

6
Disk drives: Seagate Barracuda 7200 RPM
Capacity   Price
40 GB      Rs. 2,999
80 GB      Rs. 3,499
120 GB     Rs. 4,499
160 GB     Rs. 4,799
200 GB     Rs. 5,500
250 GB     Rs. 6,999
300 GB     Rs. 9,900
400 GB     Rs. 14,950

7
Data transfer between levels
[Figure: each processor access to a level is either a hit or a miss; on a miss, data is transferred between levels.]
  • Unit of transfer: block

8
Principle of locality
  • Temporal Locality
      • references repeated in time
  • Spatial Locality
      • references repeated in space
      • Special case: Sequential Locality

9
Memory Hierarchy Analysis
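  • Memory $M_i$: $M_1, M_2, \ldots, M_n$
  • Capacity $s_i$: $s_1 < s_2 < \ldots < s_n$
  • Unit cost $c_i$: $c_1 > c_2 > \ldots > c_n$
  • Total cost: $C_{total} = \sum_i c_i \cdot s_i$
  • Access time: $t_i = \tau_1 + \tau_2 + \ldots + \tau_i$ ($\tau_i$ at level $i$)
  • $\tau_1 < \tau_2 < \ldots < \tau_n$
  • Hit ratios $h_i(s_i)$: $h_1 < h_2 < \ldots < h_n = 1$
  • Effective time: $T_{eff} = \sum_i m_i \cdot h_i \cdot t_i = \sum_i m_i \cdot \tau_i$
  • Miss before level $i$: $m_i = (1-h_1)(1-h_2) \ldots (1-h_{i-1})$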
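
The two forms of the effective-time expression can be checked numerically. The sketch below is only an illustration: the function names and the three-level numbers are assumptions, not values from the slides.

```python
def effective_access_time(tau, h):
    """Return T_eff computed two ways for an n-level hierarchy.

    tau[i] is the time spent at level i and h[i] is the hit ratio of
    level i; the last level must always hit (h == 1).
    """
    assert len(tau) == len(h) and h[-1] == 1.0
    by_hits = 0.0    # sum of m_i * h_i * t_i
    by_levels = 0.0  # sum of m_i * tau_i
    m = 1.0          # probability of missing in every level before i
    t = 0.0          # cumulative access time t_i = tau_1 + ... + tau_i
    for tau_i, h_i in zip(tau, h):
        t += tau_i
        by_hits += m * h_i * t
        by_levels += m * tau_i
        m *= 1.0 - h_i
    return by_hits, by_levels


def total_cost(c, s):
    """C_total: sum of unit cost times capacity over all levels."""
    return sum(c_i * s_i for c_i, s_i in zip(c, s))


# Hypothetical three-level hierarchy (numbers chosen only for illustration).
a, b = effective_access_time(tau=[1, 10, 100], h=[0.9, 0.8, 1.0])
print(a, b)  # the two forms agree up to rounding (about 4 time units)
```

Here $m = (1, 0.1, 0.02)$, so the second form gives $1 \cdot 1 + 0.1 \cdot 10 + 0.02 \cdot 100 = 4$.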

10
Cache Types
  • Instruction | Data | Unified | Split
  • Split vs. Unified
      • Split allows specializing each part
      • Unified allows best use of the capacity
  • On-chip | Off-chip
      • on-chip: fast but small
      • off-chip: large but slow
  • Single level | Multi level

11
Cache Policies
  • Placement: what gets placed where?
  • Read: when? from where?
  • Load: order of bytes/words?
  • Fetch: when to fetch new block?
  • Replacement: which one?
  • Write: when? to where?

12
Block placement strategies
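  • Direct mapped
  • Set associative
  • Fully associative
[Figure: an 8-block cache shown three ways: direct mapped (block numbers 0 to 7), set associative (set numbers 0 to 3), and fully associative. Each panel shows the data array, the tag array, and the blocks that have to be searched.]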
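
As a rough illustration of the three strategies, the sketch below lists the cache frames in which a given memory block may be placed. The 8-frame cache with 4 sets matches the figure; the block address 12, the function name, and the assumption that the frames of a set are numbered contiguously are illustrative choices, not taken from the slide.

```python
def candidate_frames(block_addr, num_frames, ways):
    """Return the cache frames where memory block `block_addr` may go.

    ways == 1           -> direct mapped
    ways == num_frames  -> fully associative
    anything in between -> set associative
    Assumption: the frames of one set are numbered contiguously.
    """
    num_sets = num_frames // ways
    set_index = block_addr % num_sets  # the set is chosen by address modulo
    first = set_index * ways
    return list(range(first, first + ways))


NUM_FRAMES = 8
BLOCK = 12  # hypothetical memory block address

print(candidate_frames(BLOCK, NUM_FRAMES, 1))  # direct mapped: [4]
print(candidate_frames(BLOCK, NUM_FRAMES, 2))  # 2-way, 4 sets: [0, 1]
print(candidate_frames(BLOCK, NUM_FRAMES, 8))  # fully associative: all 8
```

More ways give the block more candidate frames, at the price of more tags to search on every access.
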
13
Organization/placement policy
  • Cache: Set 1 ... Set S
  • Set: Sector 1, Sector 2, ... Sector SE, plus LRU information
  • Sector: Tag, Block 1, Block 2, ... Block B
  • Block: AU 1, AU 2, ... AU A, plus V, D, S bits

14
Addressing Cache
  • Address fields: Sector Name | Set Index | Block | Displacement
      • Sector Name: compared to tags
      • Set Index: selects set
      • Block: selects block
      • Displacement: selects AU
  • Early select: access data after tag matching
  • Late select: access data while tag matching

15
Cache organization example
[Figure: 8 sets (rows 1 to 8). Each set holds 2 sectors; each sector has a Tag and 2 blocks; each block has V and D bits and 2 AUs.]

16
Cache access mechanism
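  • Address (bits 31 to 0): tag (18 bits) | index (12 bits) | byte offset (2 bits)
  • Cache array: entries 0 ... 4095, each holding v, tag (18 bits), data (32 bits)
  • The index selects one entry; the stored tag is compared with the address tag
  • Outputs: Hit, Data (32 bits)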
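
A minimal software model of this access path, assuming the field widths shown on the slide (18-bit tag, 12-bit index, 2-bit byte offset). The class and method names are hypothetical, and the fill step stands in for whatever the miss handling actually does.

```python
INDEX_BITS = 12   # 4096 entries
OFFSET_BITS = 2   # byte offset within a 32-bit word


def split_address(addr):
    """Split a 32-bit address into (tag, index, byte_offset)."""
    byte_offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, byte_offset


class DirectMappedCache:
    def __init__(self):
        entries = 1 << INDEX_BITS
        self.valid = [False] * entries  # v bit per entry
        self.tag = [0] * entries        # 18-bit tag per entry
        self.data = [0] * entries       # one 32-bit word per entry

    def read(self, addr):
        """Return (hit, data); data is None on a miss."""
        tag, index, _ = split_address(addr)
        hit = self.valid[index] and self.tag[index] == tag
        return hit, (self.data[index] if hit else None)

    def fill(self, addr, word):
        """Install a word after a miss (stand-in for the real refill)."""
        tag, index, _ = split_address(addr)
        self.valid[index], self.tag[index], self.data[index] = True, tag, word


cache = DirectMappedCache()
print(cache.read(0x12345678))              # miss: nothing valid yet
cache.fill(0x12345678, 0xCAFE)
print(cache.read(0x12345678))              # hit
print(cache.read(0x12345678 + (1 << 14)))  # same index, different tag: miss
```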

17
Cache with 4 word blocks
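  • Address (bits 31 to 0): tag (18 bits) | index (10 bits) | block offset (2 bits) | byte offset (2 bits)
  • Cache array: entries 0 ... 1023, each holding v, tag (18 bits), data (4 words of 32 bits)
  • The block offset drives a Mux that selects one of the four 32-bit words
  • Outputs: Hit, Data
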
18
4-way set associative cache
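  • Address (bits 31 to 0): tag (20 bits) | index (8 bits) | block offset (2 bits) | byte offset (2 bits)
  • Sets 0 ... 255; each set has four ways, each holding v, tag (20 bits), data (128 bits)
  • All four tags of the selected set are compared with the address tag
  • In each way, a Mux selects one 32-bit word out of the 128-bit block
  • A final Mux selects the word from the matching way
  • Outputs: Hit, Data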
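
The lookup can be sketched in software as follows, assuming the field widths from the slide (20-bit tag, 8-bit index, 2-bit block offset, 2-bit byte offset). The dictionary layout and function names are illustrative; real hardware compares the four tags in parallel rather than in a loop.

```python
INDEX_BITS = 8          # 256 sets
BLOCK_OFFSET_BITS = 2   # 4 words per block
BYTE_OFFSET_BITS = 2    # 4 bytes per word
WAYS = 4


def split_address(addr):
    """Split a 32-bit address into (tag, index, block_offset, byte_offset)."""
    byte_offset = addr & ((1 << BYTE_OFFSET_BITS) - 1)
    addr >>= BYTE_OFFSET_BITS
    block_offset = addr & ((1 << BLOCK_OFFSET_BITS) - 1)
    addr >>= BLOCK_OFFSET_BITS
    index = addr & ((1 << INDEX_BITS) - 1)
    tag = addr >> INDEX_BITS
    return tag, index, block_offset, byte_offset


def make_cache():
    """256 sets, each with 4 ways of (v, tag, 4-word data block)."""
    return [[{"v": False, "tag": 0, "data": [0] * 4} for _ in range(WAYS)]
            for _ in range(1 << INDEX_BITS)]


def read(cache, addr):
    """Return (hit, word); the loop models the four tag comparators."""
    tag, index, block_offset, _ = split_address(addr)
    for way in cache[index]:
        if way["v"] and way["tag"] == tag:
            return True, way["data"][block_offset]  # word select within block
    return False, None


cache = make_cache()
tag, index, _, _ = split_address(0x12345678)
cache[index][2] = {"v": True, "tag": tag, "data": [10, 11, 12, 13]}
print(read(cache, 0x12345678))  # hit in way 2, word 2 of the block
print(read(cache, 0x22345678))  # same set, different tag: miss
```
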
19
Read policies
  • Sequential or concurrent
      • Sequential: initiate memory access only after detecting a miss
      • Concurrent: initiate memory access along with cache access in anticipation of a miss
  • With or without forwarding
      • Without forwarding: give data to CPU after filling the missing block in cache
      • With forwarding: forward data to CPU as it gets filled in cache

20
Read Policies
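[Timing diagrams: a cache access takes 1 time unit and a memory access takes T time units; $p_m$ is the miss probability.]
  • Sequential Simple: $T_{eff} = (1-p_m) \cdot 1 + p_m \cdot (T+2)$
  • Concurrent Simple: $T_{eff} = (1-p_m) \cdot 1 + p_m \cdot (T+1)$
  • Sequential Forward: $T_{eff} = (1-p_m) \cdot 1 + p_m \cdot (T+1)$
  • Concurrent Forward: $T_{eff} = (1-p_m) \cdot 1 + p_m \cdot T$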
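
To compare the four read policies numerically, the expressions can be evaluated directly. This sketch assumes the miss penalties T+2, T+1, T+1 and T read off the timing diagrams; the function name and the values of the miss probability and T are made up for illustration.

```python
# Miss-path time for each policy, as a function of the memory access time T.
MISS_TIME = {
    "sequential_simple": lambda T: T + 2,   # detect miss, access memory, deliver
    "concurrent_simple": lambda T: T + 1,   # memory access overlaps the lookup
    "sequential_forward": lambda T: T + 1,  # data forwarded while block fills
    "concurrent_forward": lambda T: T,      # both overlaps combined
}


def effective_time(policy, p_miss, T):
    """T_eff = (1 - p_m) * 1 + p_m * (miss-path time)."""
    return (1 - p_miss) * 1 + p_miss * MISS_TIME[policy](T)


# Hypothetical numbers: 10% miss probability, memory access of 10 time units.
for name in MISS_TIME:
    print(name, effective_time(name, p_miss=0.1, T=10))
```

With these numbers the difference between the slowest and fastest policy is $2 p_m = 0.2$ time units per access.
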
21
Load policies
  • Example: 4 AU block (AU 0 to AU 3), cache miss on AU 1
  • Block Load
  • Load Forward
  • Fetch Bypass (wrap around load)
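
The slide does not spell out the order in which AUs are loaded under each policy. The sketch below uses one common textbook reading, which is an assumption here: block load brings in the whole block from AU 0, load forward brings in only the missed AU and those after it, and wrap around load starts at the missed AU and wraps back to the start of the block. The function and policy names are hypothetical.

```python
def load_order(policy, block_size, missed_au):
    """Return the order in which AUs are loaded after a miss.

    Assumed interpretations (not confirmed by the slide):
      block_load   - whole block, starting from AU 0
      load_forward - missed AU and the AUs after it; earlier AUs stay invalid
      fetch_bypass - start at the missed AU and wrap around the block
    """
    if policy == "block_load":
        return list(range(block_size))
    if policy == "load_forward":
        return list(range(missed_au, block_size))
    if policy == "fetch_bypass":
        return [(missed_au + k) % block_size for k in range(block_size)]
    raise ValueError(f"unknown policy: {policy}")


# The slide's example: a 4 AU block with a miss on AU 1.
for policy in ("block_load", "load_forward", "fetch_bypass"):
    print(policy, load_order(policy, block_size=4, missed_au=1))
```
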
22
Fetch Policies
  • Fetch on miss (demand fetching)
  • Software prefetching
  • Hardware Prefetching

23
Fetch Policies
  • Demand fetching
      • fetch only when required (miss)
  • Hardware prefetching
      • automatically prefetch next block
  • Software prefetching
      • programmer decides to prefetch
      • questions
          • how much ahead (prefetch distance)
          • how often
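
A toy comparison of demand fetching with next-block hardware prefetching. The simplifications are assumptions made for the sketch: the cache is unbounded, a prefetch is issued on every access and completes instantly, and the trace is a list of block numbers.

```python
def count_misses(block_trace, prefetch_next=False):
    """Count misses for a trace of block numbers.

    With prefetch_next=True, every access also brings in block b + 1
    (a "prefetch always" scheme, assumed here for simplicity).
    """
    resident = set()
    misses = 0
    for b in block_trace:
        if b not in resident:
            misses += 1          # demand fetch of the missing block
            resident.add(b)
        if prefetch_next:
            resident.add(b + 1)  # prefetch the sequentially next block
    return misses


sequential = list(range(8))                     # blocks 0, 1, ..., 7 in order
print(count_misses(sequential))                 # demand fetching: 8 misses
print(count_misses(sequential, prefetch_next=True))  # only the first access misses
```

A sequential trace is the best case for this scheme; for a trace with no sequential locality the prefetched blocks would only add memory traffic.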

24
Software Control of Cache
  • Software visible cache
      • mode selection (WT, WB, etc.)
      • block flush
      • block invalidate
      • block prefetch

25
Replacement Policies
  • Least Recently Used (LRU)
  • Least Frequently Used (LFU)
  • First In First Out (FIFO)
  • Random
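
LRU and FIFO differ only in whether a hit refreshes a block's position in the eviction order. The sketch below models a single fully associative set; the function name, the capacity and the trace are illustrative assumptions.

```python
from collections import OrderedDict


def count_hits(trace, capacity, policy):
    """Count hits for a block trace under 'LRU' or 'FIFO' replacement."""
    if policy not in ("LRU", "FIFO"):
        raise ValueError(f"unknown policy: {policy}")
    blocks = OrderedDict()  # first key is always the next victim
    hits = 0
    for b in trace:
        if b in blocks:
            hits += 1
            if policy == "LRU":
                blocks.move_to_end(b)       # a hit makes the block most recent
        else:
            if len(blocks) == capacity:
                blocks.popitem(last=False)  # evict the block at the front
            blocks[b] = True
    return hits


trace = [1, 2, 3, 1, 4, 1, 2]  # hypothetical block reference string
print(count_hits(trace, capacity=3, policy="LRU"))   # 2 hits
print(count_hits(trace, capacity=3, policy="FIFO"))  # 1 hit
```

On this trace FIFO evicts block 1 when block 4 arrives even though it was just used, so the next reference to block 1 misses; LRU keeps it.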

26
Write Policies
  • Write Hit
      • Write Back
      • Write Through
  • Write Miss
      • Write Back
      • Write Through (with or without Write Allocate)
  • Buffers are used in all cases to hide latencies
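
The difference in memory traffic between the two write policies can be seen in a small model. The assumptions here are a direct-mapped cache with write allocate, a trace that contains only writes to block addresses, and a final flush of dirty blocks counted for write back. The function and policy names are hypothetical, and write buffers are not modelled.

```python
def memory_writes(write_trace, num_frames, policy):
    """Count writes that reach memory under 'write_through' or 'write_back'."""
    if policy not in ("write_through", "write_back"):
        raise ValueError(f"unknown policy: {policy}")
    frames = [None] * num_frames   # block currently held by each frame
    dirty = [False] * num_frames   # D bit per frame (write back only)
    writes = 0
    for b in write_trace:
        i = b % num_frames
        if frames[i] != b:         # write miss: allocate the block
            if dirty[i]:
                writes += 1        # write the evicted dirty block back
            frames[i], dirty[i] = b, False
        if policy == "write_through":
            writes += 1            # every write also goes to memory
        else:
            dirty[i] = True        # memory is updated only on eviction
    return writes + sum(dirty)     # assumed final flush of dirty blocks


trace = [0, 0, 0, 4, 4, 0]  # hypothetical block addresses, all map to frame 0
print(memory_writes(trace, num_frames=4, policy="write_through"))  # 6
print(memory_writes(trace, num_frames=4, policy="write_back"))     # 3
```
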