CS 2200 - PowerPoint PPT Presentation

About This Presentation
Title:

CS 2200

Description:

Clock cycle long enough for most complex instruction to execute ... Assume processor spends 2 cycles to do IF, EX, MEM and 1 cycle to do ID and WB ...

Slides: 53
Provided by: BillL161

Transcript and Presenter's Notes

Title: CS 2200


1
CS 2200
  • Presentation 6d
  • Introduction to Pipelining

2
Questions?
3
So far...
  • We examined a simple design concept having a
    CPI of 1.
  • Requirements
  • Sufficient functional units to perform all
    necessary steps for every instruction
    simultaneously (e.g. 3 ALUs)
  • Clock cycle long enough for most complex
    instruction to execute
  • Requires separate memories for instructions and
    data

4
And...
  • We've looked at a typical multicycle design, the
    LC-2200
  • Single memory for instructions and data
  • One ALU
  • One or more registers after each functional unit
  • Multiplexors
  • Bus design
  • Do we need a bus based design?

5
Question
  • Do we need a bus?
  • Yes
  • No

6
Functional Units Busy?
  • During the course of execution of an instruction
    are the functional units all kept busy?
  • How can we use most of the functional units in
    every clock cycle?

7
Pipelining
8
Need for Speed
  • CPI of 1 was too slow
  • Multicycle made things better
  • But we want to go faster.
  • Can we break our instruction into pieces?

9
Passage of an Instruction
  • IF Instruction Fetch
  • ID Instruction Decode
  • EX Execute (or calculate address)
  • MEM Memory access (load or store)
  • WB Write Back (from ALU or memory)

10
An Example
  • Assume processor spends
  • 2 cycles to do IF, EX, MEM
  • 1 cycle to do ID and WB
  • Consider 3 successive load instructions
  • how much time to execute?
  • picture the execution
  • how can you make this faster?

11
24 Cycles
How can you make this faster?
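The 24-cycle figure can be checked with a few lines of Python. This is a minimal sketch: the per-stage cycle counts come from the example slide, the function name is made up for illustration, and it assumes a load passes through all five stages with no overlap between instructions.

```python
# Cycles per stage, taken from the example slide.
STAGE_CYCLES = {"IF": 2, "ID": 1, "EX": 2, "MEM": 2, "WB": 1}

def sequential_cycles(num_instructions, stage_cycles=STAGE_CYCLES):
    """Total cycles when each instruction finishes before the next one starts."""
    per_instruction = sum(stage_cycles.values())  # 2 + 1 + 2 + 2 + 1 = 8
    return num_instructions * per_instruction

print(sequential_cycles(1))  # 8
print(sequential_cycles(3))  # 24
```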
12
Laundry Break?
13
Laundry
  • We have the following
  • A washer
  • A dryer
  • A person who irons/folds clothing
  • A person who puts the clothing away
  • Each of the above has a 30 minute cycle time
  • Given a load of dirty laundry how long to do the
    complete job?
  • Transfer time negligible.

14
Base Case
2 hours
[Figure: one load passing through stages W, D, S, F]
How long for 4 loads?
2 hours/load
15
4 Load Case
8 hours
Can we do better?
2 hours/load
16
4 Load Case
3.5 hours
[Figure: four loads overlapped in time, each passing through
stages W, D, S, F]
Note: Actual time to completely do one load is
unchanged. Some time is required to fill the
pipeline. Steady-state improvement depends on the
number of units.
17
Question: 4 Load Case
  • How many hours/load?
  • 0.5 hours/load
  • 0.875 hours/load
  • 2.0 hours/load

18
12 Load Case
7.5 hours
0.625 hours/load
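The laundry numbers follow a simple pattern: the first load needs all four steps, and each later load finishes one step after the one before it. A small sketch of that arithmetic, with hypothetical function names and the 30 minute step time from the slides:

```python
def sequential_hours(loads, stages=4, stage_hours=0.5):
    """One load at a time: every load waits for the previous one to finish."""
    return loads * stages * stage_hours

def pipelined_hours(loads, stages=4, stage_hours=0.5):
    """Overlapped loads: fill the pipeline once, then one load completes per step."""
    return (stages + loads - 1) * stage_hours

for loads in (1, 4, 12):
    total = pipelined_hours(loads)
    print(loads, "loads:", sequential_hours(loads), "h sequential,",
          total, "h pipelined,", total / loads, "h/load")
```

As the number of loads grows, hours/load approaches the 0.5 hour step time, which is the steady-state behavior noted on the 4 load slide.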
19
An Example
  • Assume processor spends
  • 2 cycles to do IF, EX, MEM
  • 1 cycle to do ID and WB
  • Consider 3 successive load instructions
  • how much time to execute?
  • picture the execution
  • how can you make this faster?

20
24 Cycles
21
New Concept
  • Can we try and separate the functional units into
    groupings or stages?
  • Equivalent to saying the IF functional units
    will be like the washer.
  • The ID units will be like the dryer and so
    on...
  • May need to add back some functional units AND
    nix the bus!

22
Step One
[Figure: datapath diagram. Labels: MUX, 1, PC, Instr Mem, DPRF, A, Data Mem, MUX, MUX, D, SE]
23
R-Type
[Figure: datapath diagram. Labels: MUX, 1, PC, Instr Mem, DPRF, A, Data Mem, MUX, MUX, D, SE]
24
LW
[Figure: datapath diagram. Labels: MUX, 1, PC, Instr Mem, DPRF, A, Data Mem, MUX, MUX, D, SE]
25
SW
[Figure: datapath diagram. Labels: MUX, 1, PC, Instr Mem, DPRF, A, Data Mem, MUX, MUX, D, SE]
26
BEQ
[Figure: datapath diagram. Labels: MUX, 1, PC, Instr Mem, DPRF, BEQ, A, Data Mem, MUX, MUX, D, SE]
27
Pipelining
  • Clearly we have broken the design into stages but
    there is still a connection.
  • Stage n is holding data/register numbers/etc.
    that stage n+1 needs...
  • Solution: More registers to hold critical
    information
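One way to picture the extra registers is as slots between stages that all shift on each clock edge. The sketch below is a simplification, not the LC-2200 design: it assumes every stage takes exactly one clock (unlike the 2-cycle stages in the earlier example, so its cycle counts are not comparable with the slide figures), and the function name and instruction labels are made up for illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def simulate(instructions):
    """Return, for each clock cycle, which instruction occupies each stage."""
    # slots[i] holds the instruction currently in STAGES[i], or None if empty.
    slots = [None] * len(STAGES)
    pending = list(instructions)
    history = []
    while pending or any(s is not None for s in slots):
        # On each clock edge every instruction moves one stage to the right
        # and the next instruction (if any) enters IF.
        incoming = pending.pop(0) if pending else None
        slots = [incoming] + slots[:-1]
        if any(s is not None for s in slots):
            history.append(dict(zip(STAGES, slots)))
    return history

for cycle, row in enumerate(simulate(["lw 1", "lw 2", "lw 3"]), start=1):
    print(cycle, row)
```

Each dictionary printed is one snapshot of the pipeline; what sits in a slot is what the next stage will work on in the following cycle.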

28
PipeLined
[Figure: pipelined datapath divided into IF, ID, EX, MEM, WB stages. Labels: MUX, 1, PC, Instr Mem, DPRF, BEQ, A, Data Mem, MUX, MUX, D, SE]
29
R-Type
[Figure: pipelined datapath divided into IF, ID, EX, MEM, WB stages. Labels: MUX, 1, PC, Instr Mem, DPRF, BEQ, A, Data Mem, MUX, MUX, D, SE]
30
R-Type
[Figure: pipelined datapath divided into IF, ID, EX, MEM, WB stages. Labels: MUX, 1, PC, Instr Mem, DPRF, BEQ, A, Data Mem, MUX, MUX, D, SE]
31
R-Type
[Figure: pipelined datapath divided into IF, ID, EX, MEM, WB stages. Labels: MUX, 1, PC, Instr Mem, DPRF, BEQ, A, Data Mem, MUX, MUX, D, SE]
32
R-Type
[Figure: pipelined datapath divided into IF, ID, EX, MEM, WB stages. Labels: MUX, 1, PC, Instr Mem, DPRF, BEQ, A, Data Mem, MUX, MUX, D, SE]
33
R-Type
[Figure: pipelined datapath divided into IF, ID, EX, MEM, WB stages. Labels: MUX, 1, PC, Instr Mem, DPRF, BEQ, A, Data Mem, MUX, MUX, D, SE]
34
Question
  • I understand this perfectly
  • I sort of understand this
  • I have mixed emotions and sometimes I feel like
    slapping myself silly
  • I'm sort of lost
  • So I put the oil in which end of the pipeline?

35
R-Type
[Figure: pipelined datapath divided into IF, ID, EX, MEM, WB stages. Labels: MUX, 1, PC, Instr Mem, DPRF, BEQ, A, Data Mem, MUX, MUX, D, SE]
36
13 cycles
[Figure: pipeline timing diagram, three instructions overlapped across the IF, ID, EX, MEM, WB stages]
Well, this is like a no-brainer!
37
See any problems?
38
The Dukes of Hazard
  • Hazards
  • Structural
  • Control
  • Data

39
The Dukes of Hazard
  • Hazards
  • Structural
  • Hardware cannot support desired instructions at
    the moment
  • Dealing with Structural Hazards
  • Replace bus with pipeline design
  • Multiple memories

40
The Dukes of Hazard
  • Hazards
  • Structural
  • Hardware cannot support desired instructions at
    the moment
  • Dealing with Structural Hazards
  • Replace bus with pipeline design
  • Multiple memories
  • Control
  • Need to make a decision but don't have results
    yet
  • Stall

41
[Figure: pipeline timing diagram for add, beq, lw across the IF, ID, EX, MEM, WB stages]
Can be costly
42
One solution...
  • beq $s0, $s2, label
  • add $s1, $s2, $s3
  • This instruction gets executed no matter what! So
    sometimes you will see
  • beq $s0, $s2, label
  • nop

43
Another...Branch Prediction
  • How accurately can branches be predicted?
  • 30% of the time correctly
  • 50%
  • 70%
  • 90%

44
The Dukes of Hazard
  • Hazards
  • Structural
  • Hardware cannot support desired instructions at
    the moment
  • Dealing with Structural Hazards
  • Replace bus with pipeline design
  • Multiple memories
  • Control
  • Need to make a decision but don't have results
    yet
  • Stall (Can put instruction in slot!)
  • Predict
  • If correct, life is good
  • If wrong, will need to squash some instructions
  • Branch Target Buffer
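The slides do not name a particular prediction scheme. As one illustration only, here is a sketch of a 2-bit saturating counter, a common textbook predictor. The function names and the branch trace are hypothetical, and the accuracy it prints applies to this made-up trace, not to real programs.

```python
def predict_and_update(counter, taken):
    """counter is 0..3: values 0-1 predict not taken, values 2-3 predict taken."""
    prediction = counter >= 2
    # Move the counter toward the actual outcome, saturating at 0 and 3.
    if taken:
        counter = min(counter + 1, 3)
    else:
        counter = max(counter - 1, 0)
    return prediction, counter

def accuracy(outcomes, counter=0):
    """Fraction of branch outcomes that the counter predicted correctly."""
    correct = 0
    for taken in outcomes:
        prediction, counter = predict_and_update(counter, taken)
        correct += (prediction == taken)
    return correct / len(outcomes)

# Hypothetical trace: a loop branch taken 9 times, then not taken once,
# with the whole loop run 10 times.
trace = ([True] * 9 + [False]) * 10
print(accuracy(trace))  # 0.88
```

Every wrong prediction in a trace like this corresponds to a case where the pipeline would have to squash the instructions fetched down the wrong path.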

45
Precedence/Dependence
  • a = b + c
  • d = e + f
  • g = h + i
  • Could these three statements be executed in
    parallel?

46
Precedence
  • a = d + c
  • d = s + g

47
Dependence
  • a = b + c
  • d = a + g
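The question of whether statements can run in parallel can be checked mechanically. The sketch below assumes each statement is read as "destination = source op source" (the first letter is written, the other two are read) and uses the standard RAW/WAR/WAW labels, which are not from the slides. The function name is made up for illustration.

```python
def conflict(earlier, later):
    """Classify the ordering constraint between two statements, if any.

    Each statement is (destination, (source, source)).
    """
    e_dest, e_srcs = earlier
    l_dest, l_srcs = later
    if e_dest in l_srcs:
        return "RAW"  # later reads what earlier writes (true dependence)
    if l_dest in e_srcs:
        return "WAR"  # later overwrites something earlier still has to read
    if e_dest == l_dest:
        return "WAW"  # both write the same variable
    return None       # no constraint: the two could run in parallel

# Statements from the three slides above.
print(conflict(("a", ("b", "c")), ("d", ("e", "f"))))  # None
print(conflict(("a", ("d", "c")), ("d", ("s", "g"))))  # WAR
print(conflict(("a", ("b", "c")), ("d", ("a", "g"))))  # RAW
```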

48
The Dukes of Hazard
  • Hazards
  • Data
  • Need a result that isn't finished yet to make
    another calculation
  • Forwarding or bypassing
  • add $t0, $s1, $s2
  • add $t3, $t0, $s4
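A forwarding decision for the two adds above can be sketched as a small selection function. This is an assumption-laden illustration, not the LC-2200 forwarding unit: the pipeline register names EX/MEM and MEM/WB, the function name, and the use of plain strings for register names are all chosen here for the example, and it ignores cases such as a load followed immediately by a use.

```python
def forward_source(src_reg, ex_mem_dest, mem_wb_dest):
    """Pick where the EX stage should take the value of src_reg from.

    ex_mem_dest / mem_wb_dest are the destination registers of the
    instructions one and two places ahead in the pipeline (None if they
    write no register).
    """
    if ex_mem_dest is not None and ex_mem_dest == src_reg:
        return "EX/MEM"   # newest value: result of the previous instruction
    if mem_wb_dest is not None and mem_wb_dest == src_reg:
        return "MEM/WB"   # result from two instructions back
    return "REGFILE"      # no hazard: the register file value is current

# add t0, s1, s2  followed by  add t3, t0, s4
print(forward_source("t0", ex_mem_dest="t0", mem_wb_dest=None))  # EX/MEM
print(forward_source("s4", ex_mem_dest="t0", mem_wb_dest=None))  # REGFILE
```

Checking EX/MEM before MEM/WB matters when both earlier instructions write the same register, since the more recent result is the one the program expects.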

49
[Figure: pipeline timing diagram for add, add, sub across the IF, ID, EX, MEM, WB stages]
50
[Figure: pipelined datapath with a Forwarding Unit, divided into IF, ID, EX, MEM, WB stages. Labels: MUX, 1, PC, Instr Mem, DPRF, BEQ, A, Data Mem, MUX, MUX, D, SE]
51
Questions