Enhancing Performance with Pipelining - PowerPoint PPT Presentation

1
Enhancing Performance with Pipelining
Slides developed by Rami Abielmona and modified by Miodrag Bolic
High-Level Computer Systems Design
2
Presentation Outline (1)
  • What is pipelining ?
  • Pipeline Taxonomies
  • Instruction Pipelines
  • MIPS Instruction Pipeline
  • Pipeline Hazards
  • MIPS Pipelined Datapath
  • Load Word Instruction Example
  • Pipeline Datapath Example
  • Pipeline Control
  • Pipeline Instruction Example

3
Presentation Outline (2)
  • Pipeline Hazards
  • Control Hazards
  • Data Hazards
  • Detecting Data Hazards
  • Resolving Data Hazards
  • Forwarding Example
  • Stalling Example
  • Branch Hazards
  • Branching Example
  • Key terms

4
What is Pipelining ? (1)
  • There are two main ways to increase the
    performance of a processor through high-level
    system architecture
  • Increasing the memory access speed
  • Increasing the number of supported concurrent
    operations
  • Pipelining !
  • Parallelism ?
  • Pipelining is the process by which instructions
    are parallelized over several overlapping stages
    of execution, in order to maximize datapath
    efficiency

5
What is Pipelining ? (2)
  • Pipelining is analogous to many everyday
    scenarios
  • Car manufacturing process
  • Batch laundry jobs
  • Basically, any assembly-line operation applies
  • Two important concepts
  • New inputs are accepted at one end before
    previously accepted inputs appear as outputs at
    the other end
  • The number of operations performed per second is
    increased, even though the elapsed time needed to
    perform any one operation remains the same

6
What is Pipelining ? (3)
  • Looking at the textbook's example, we have a
    4-stage pipeline of laundry tasks
  • Place one dirty load of clothes into washer
  • Place the washed clothes into a dryer
  • Place a dry load on a table and fold
  • Put the clothes away
  • Graphically speaking
  • Sequential (top) vs.
  • Pipelined (bottom) execution
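
The difference between the two executions can be sketched numerically. This is a minimal illustration that assumes every laundry stage takes the same 30 minutes (the slide gives no durations); the function names are made up for the example.

```python
def sequential_time(loads, stages, stage_minutes):
    """Each load finishes every stage before the next load starts."""
    return loads * stages * stage_minutes


def pipelined_time(loads, stages, stage_minutes):
    """The first load needs `stages` steps; each later load finishes one step later."""
    return (stages + loads - 1) * stage_minutes


STAGES = 4          # wash, dry, fold, put away
STAGE_MINUTES = 30  # assumed duration, not from the slides

for loads in (1, 4, 100):
    seq = sequential_time(loads, STAGES, STAGE_MINUTES)
    pipe = pipelined_time(loads, STAGES, STAGE_MINUTES)
    print(f"{loads:3d} loads: sequential {seq} min, "
          f"pipelined {pipe} min, speedup {seq / pipe:.2f}")
```

A single load takes just as long either way; the gain appears only as more loads are queued, and it approaches the number of stages.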

7
Pipeline Taxonomies
  • There are two types of pipelines used in computer
    systems
  • Arithmetic pipelines
  • Used to pipeline data intensive functionalities
  • Instruction pipelines
  • Used to pipeline the basic instruction fetch and
    execute sequence
  • Other classifications include
  • Linear vs. nonlinear pipelines
  • Presence (or lack) of feedforward and feedback
    paths between stages
  • Static vs. dynamic pipelines
  • Dynamic pipelines are multifunctional, taking on
    a different form depending on the function being
    executed
  • Scalar vs. vector pipelines
  • Vector pipelines specifically target computations
    using vector data

8
MIPS Instruction Pipeline (1)
  • Let us now introduce the pipeline we're working
    with
  • It's a 5-stage, linear, static, scalar
    instruction pipeline, consisting of the
    following steps
  • Fetch instruction from memory (IF)
  • Read registers while decoding the instruction
    (ID)
  • Execute the operation or calculate an address
    (EX)
  • Access an operand in data memory (MEM)
  • Write the result into a register (WB)
  • Again, theoretically, pipeline speedup = number
    of stages in pipeline

9
MIPS Instruction Pipeline (2)
  • Inst. Fetch (2ns), Reg. read/write (1ns), ALU op.
    (2ns), Data access (2ns)
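
A rough sketch of what these unit delays imply for the clock period. The mapping of instruction classes to the units they use is an assumption based on the usual textbook MIPS datapath, and the names below are illustrative.

```python
# Unit delays from the slide, in nanoseconds.
UNIT_NS = {"fetch": 2, "reg_read": 1, "alu": 2, "data": 2, "reg_write": 1}

# Assumed: which units each instruction class exercises.
USES = {
    "lw": ["fetch", "reg_read", "alu", "data", "reg_write"],
    "sw": ["fetch", "reg_read", "alu", "data"],
    "r_type": ["fetch", "reg_read", "alu", "reg_write"],
    "beq": ["fetch", "reg_read", "alu"],
}


def latency_ns(kind):
    """Total delay of one instruction when run start to finish on its own."""
    return sum(UNIT_NS[unit] for unit in USES[kind])


# A single-cycle clock must fit the slowest instruction;
# a pipelined clock must fit the slowest stage.
single_cycle_clock = max(latency_ns(kind) for kind in USES)
pipeline_clock = max(UNIT_NS.values())

loads, stages = 3, 5
print("single-cycle, 3 loads:", loads * single_cycle_clock, "ns")
print("pipelined, 3 loads:", (stages + loads - 1) * pipeline_clock, "ns")
```

Under these assumptions the single-cycle clock is set by the load word path, while the pipelined clock is set by the 2 ns stages even though the register stages need only 1 ns.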

10
Single Cycle, Multiple Cycle, vs. Pipeline [1]
[Figure: clock timing diagrams. Single Cycle Implementation (Cycles 1-2): Load, Store, Waste. Multiple Cycle Implementation (Cycles 1-10): Load, Store, R-type. Pipeline Implementation: Load, Store, R-type.]
11
Why Pipeline?
  • Suppose
  • 100 instructions are executed
  • The single cycle machine has a cycle time of 45
    ns
  • The multicycle and pipeline machines have cycle
    times of 10 ns
  • The multicycle machine has a CPI of 4.6
  • Single Cycle Machine
  • 45 ns/cycle x 1 CPI x 100 inst = 4500 ns
  • Multicycle Machine
  • 10 ns/cycle x 4.6 CPI x 100 inst = 4600 ns
  • Ideal pipelined machine
  • 10 ns/cycle x (1 CPI x 100 inst + 4 cycle drain)
    = 1040 ns
  • Ideal pipelined vs. single cycle speedup
  • 4500 ns / 1040 ns = 4.33
  • What has not yet been considered?
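
The arithmetic on this slide can be reproduced in a few lines. This is only a check of the numbers given above; the helper name is illustrative.

```python
INSTRUCTIONS = 100


def exec_time_ns(cycle_ns, cycles):
    """Execution time = cycle time x number of cycles."""
    return cycle_ns * cycles


single = exec_time_ns(45, 1 * INSTRUCTIONS)         # CPI of 1
multi = exec_time_ns(10, 4.6 * INSTRUCTIONS)        # CPI of 4.6
pipelined = exec_time_ns(10, 1 * INSTRUCTIONS + 4)  # 4 extra cycles to drain

print(f"single cycle: {single:.0f} ns")
print(f"multicycle:   {multi:.0f} ns")
print(f"pipelined:    {pipelined:.0f} ns")
print(f"pipelined vs. single cycle speedup: {single / pipelined:.2f}")
```

The 4 drain cycles matter less as the instruction count grows, which is why the ideal figure assumes a long instruction stream with no stalls.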

12
MIPS Instruction Pipeline (3) [2]
  • What makes it easy?
  • all instructions are the same length
  • just a few instruction formats
  • memory operands appear only in loads and stores
  • What makes it hard?
  • structural hazards: suppose we had only one
    memory
  • control hazards: need to worry about branch
    instructions
  • data hazards: an instruction depends on a
    previous instruction
  • We'll build a simple pipeline and look at these
    issues

13
Pipeline Hazards [1]
  • structural hazards: attempt to use the same
    resource two different ways at the same time
  • E.g., two instructions try to read the same
    memory at the same time
  • data hazards: attempt to use an item before it
    is ready
  • instruction depends on result of prior
    instruction still in the pipeline
  • add r1, r2, r3
  • sub r4, r2, r1
  • control hazards: attempt to make a decision
    before the condition is evaluated
  • branch instructions
  • beq r1, loop
  • add r1, r2, r3
  • Can always resolve hazards by waiting
  • pipeline control must detect the hazard
  • take action (or delay action) to resolve hazards

14
MIPS Pipelined Datapath (1)
  • What do we need to split the datapath into stages
    ?

15
MIPS Pipelined Datapath (2)
  • Pipeline registers (buffers) are similar to
    multicycle processor design

16
Load Word Instruction (1)
  • Instruction fetch stage

17
Load Word Instruction (2)
  • Instruction decode and register file read stage

18
Load Word Instruction (3)
  • Execute or address calculation stage

19
Load Word Instruction (4)
  • Memory access stage

20
Load Word Instruction (5)
  • Write back stage

21
Load Word Corrected Datapath
  • Write register number comes from the MEM/WB
    pipeline register along with the data

22
Graphical Representations
Multiple-clock cycle (vs. single-clock cycle)
pipelined diagrams
23
Pipeline Datapath Example (1)
  • Single-cycle pipeline diagram with one
    instruction on the pipeline

24
Pipeline Datapath Example (2)
  • Single-cycle pipeline diagram with two
    instructions on the pipeline

25
Pipeline Control (1)
  • What control signals are required ?
  • First, notice that the pipeline registers are
    written every clock cycle, hence they do not
    require explicit write control signals. The
    remaining control needs are, stage by stage
  • Instruction fetch and PC increment
  • Again, asserted at every clock cycle
  • Instruction decode and register file read
  • Again, asserted at every clock cycle
  • Execution and address calculation
  • Need to select the result register, the ALU
    operation, and either Read data 2 or the
    sign-extended immediate for the ALU
  • Memory access
  • Need to read from memory, write to memory or
    complete branch
  • Write back
  • Need to send back either ALU result or memory
    value to the register file
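
One way to picture the stage-by-stage control is as three groups of signals that travel with the instruction and are dropped once used. The signal names and values below are assumptions taken from the usual textbook MIPS design (the slide's own figures are not in this transcript); `None` marks a don't-care.

```python
CONTROL = {
    "R-type": {
        "EX": {"RegDst": 1, "ALUOp": 0b10, "ALUSrc": 0},
        "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 0},
    },
    "lw": {
        "EX": {"RegDst": 0, "ALUOp": 0b00, "ALUSrc": 1},
        "MEM": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 1},
    },
    "sw": {
        "EX": {"RegDst": None, "ALUOp": 0b00, "ALUSrc": 1},
        "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 1},
        "WB": {"RegWrite": 0, "MemtoReg": None},
    },
    "beq": {
        "EX": {"RegDst": None, "ALUOp": 0b01, "ALUSrc": 0},
        "MEM": {"Branch": 1, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 0, "MemtoReg": None},
    },
}

STAGE_ORDER = ["EX", "MEM", "WB"]


def control_carried_into(kind, stage):
    """Signal groups still held in the pipeline register feeding `stage`."""
    start = STAGE_ORDER.index(stage)
    return {s: CONTROL[kind][s] for s in STAGE_ORDER[start:]}


print(control_carried_into("lw", "EX"))   # all three groups
print(control_carried_into("lw", "MEM"))  # EX group already consumed
print(control_carried_into("lw", "WB"))   # only the write-back group left
```

IF and ID need no entries here because, as noted above, their actions happen on every clock cycle.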

26
Pipeline Control (2)
27
Pipeline Control (3)
28
Pipeline Datapath with Control
29
Pipeline Instruction Example (1)
30
Pipeline Instruction Example (2)
31
Pipeline Instruction Example (3)
32
Pipeline Instruction Example (4)
33
Pipeline Instruction Example (5)
34
Pipeline Instruction Example (6)
35
Pipeline Instruction Example (7)
36
Pipeline Instruction Example (8)
37
Pipeline Instruction Example (9)
38
Pipeline Hazards
  • Structural hazard
  • Occurs when a combination of instructions is not
    supported by the datapath
  • For example, a unified memory unit would need to
    be accessed in stages 1 (IF) and 4 (MEM), which
    would cause a contention
  • Pipeline outright fails in the presence of
    structural hazards
  • Control hazard
  • Occurs when a decision is made based on the
    results of one instruction, while others are
    executing
  • For example, a branch instruction is either taken
    or not
  • Solutions that exist are stalling and predicting
  • Data hazard
  • Occurs when an instruction depends on the results
    of an instruction resident on the pipeline
  • For example, adding two register contents and
    storing their result into a third register, then
    using that register's contents for another
    operation
  • Solutions that exist are based on forwarding

39
Control Hazards - Stalling
  • Three major solutions
  • Stall
  • Predict
  • Delayed branch slot
  • Stalling involves always waiting for the PC to be
    updated with the correct address before moving on
  • A pipeline stall (or bubble) allows us to perform
    this wait
  • Quite costly, as we have to stall even if the
    branch is not taken

40
Control Hazards - Predicting
  • Predicting involves guessing whether the branch
    is taken or not, and acting on that guess
  • If correct, then proceed with normal pipeline
    execution
  • If incorrect, then stall pipeline execution

41
Control Hazards - Delayed Branch
  • Delayed branch involves executing the next
    sequential instruction with the branch taking
    place after that delayed branch slot
  • The assembler automatically adjusts the
    instructions to make it transparent to the
    programmer
  • The instruction has to be safe, as in it
    shouldn't affect the branch
  • Longer pipelines require the use of more branch
    delay slots
  • Actual MIPS architecture solution

42
Data Hazards - Forwarding (1)
  • Forwarding involves providing the inputs to a
    stage of one instruction before the completion of
    another instruction
  • Valid if destination stage is later in time than
    the source stage
  • Left diagram shows typical forwarding scenario
    (add then sub)
  • Right diagram shows that we still need a stall in
    the case of a load-use data hazard (load then
    R-type)

43
Data Hazards - Forwarding (2)
  • sub $2, $1, $3
  • and $12, $2, $5
  • or $13, $6, $2
  • add $14, $2, $2
  • sw $15, 100($2)

44
Data Hazards - Crude Solution
  • We could insert no operation (nop) instructions
    to delay the pipeline execution until the correct
    result is in the register file
  • sub $2, $1, $3
  • nop
  • nop
  • and $12, $2, $5
  • or $13, $6, $2
  • add $14, $2, $2
  • sw $15, 100($2)
  • Too slow as it adds extra useless clock cycles
  • In reality, we try to find useful instructions to
    execute between data-dependent instructions, but
    this happens too often to be efficient
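
The padding step can be sketched as a small pass over the instruction list. It assumes no forwarding and a register file that can be written and then read within one clock cycle, so a reader must sit at least three slots after the writer; the tuple layout and function name are illustrative, and the store's data register is taken to be $15 so that it does not depend on the preceding add.

```python
NOP = ("nop", None, ())

# (opcode, destination register or None, source registers)
PROGRAM = [
    ("sub", 2, (1, 3)),
    ("and", 12, (2, 5)),
    ("or", 13, (6, 2)),
    ("add", 14, (2, 2)),
    ("sw", None, (15, 2)),
]


def insert_nops(program, gap=3):
    """Pad so every reader is at least `gap` slots after the writer of its sources."""
    out = []
    for op, dest, srcs in program:
        needed = 0
        # Only the previous gap-1 slots can still be too close.
        for back in range(1, gap):
            if back > len(out):
                break
            prev_dest = out[-back][1]
            if prev_dest is not None and prev_dest in srcs:
                needed = max(needed, gap - back)
        out.extend([NOP] * needed)
        out.append((op, dest, srcs))
    return out


for op, dest, srcs in insert_nops(PROGRAM):
    print(op)
```

For this sequence the pass adds two nops after the sub and none elsewhere, matching the listing above.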

45
Data Hazards - Detection (1)
  • Let us try to formalize detecting a data hazard
  • Type 1: EX/MEM.RegisterRd = ID/EX.RegisterRs
  • Type 2: EX/MEM.RegisterRd = ID/EX.RegisterRt
  • Type 3: MEM/WB.RegisterRd = ID/EX.RegisterRs
  • Type 4: MEM/WB.RegisterRd = ID/EX.RegisterRt
  • sub $2, $1, $3
  • and $12, $2, $5: data hazard of type 1
  • or $13, $6, $2: data hazard of type 4
  • add $14, $2, $2: no data hazard (resolved by the
    register file)
  • sw $15, 100($2): no data hazard (correct
    operation)
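
A small sketch that applies the four conditions to this sequence. When an instruction is in EX, the one before it is taken to be in the EX/MEM register and the one two before it in MEM/WB. The register fields are transcribed by hand, with the store's data register taken as $15, and the tuple layout is illustrative.

```python
# (text, rs, rt, rd); rd is None when the instruction writes no register.
PROGRAM = [
    ("sub $2, $1, $3", 1, 3, 2),
    ("and $12, $2, $5", 2, 5, 12),
    ("or $13, $6, $2", 6, 2, 13),
    ("add $14, $2, $2", 2, 2, 14),
    ("sw $15, 100($2)", 2, 15, None),
]


def hazard_types(program, i):
    """Return which of the four conditions hold while instruction i is in EX."""
    _, rs, rt, _ = program[i]
    ex_mem_rd = program[i - 1][3] if i >= 1 else None
    mem_wb_rd = program[i - 2][3] if i >= 2 else None
    checks = [
        (1, ex_mem_rd, rs),
        (2, ex_mem_rd, rt),
        (3, mem_wb_rd, rs),
        (4, mem_wb_rd, rt),
    ]
    return [n for n, rd, src in checks if rd is not None and rd == src]


for i, (text, *_) in enumerate(PROGRAM):
    print(f"{text:18s} -> {hazard_types(PROGRAM, i) or 'none'}")
```

By the time the add reaches EX, the sub has already left MEM/WB, which is why neither it nor the sw matches any of the four conditions.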

46
Data Hazards - Detection (2)
  • Two modifications are in order
  • Firstly, we don't have to forward all the time!
  • Some instructions don't write registers (e.g.
    beq)
  • Use RegWrite signal in WB control block to
    determine condition
  • Secondly, the $0 register must always return 0
  • Can't prevent the programmer from using it as a
    destination register
  • Use RegisterRd to determine if $0 is being used
  • If (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
    and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
    ForwardA = 10
  • If (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
    and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
    ForwardB = 10
  • If (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
    ForwardA = 01
  • If (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
    ForwardB = 01
  • Let us examine the hardware changes to our
    datapath
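
The four conditions can be sketched as a single function. Two details are assumptions not stated on the slide: `00` is used as the default meaning "take the operand from the register file", and when both pipeline registers match, the EX/MEM value wins because it is the more recent result. The class and function names are illustrative.

```python
from typing import NamedTuple


class Latch(NamedTuple):
    """The fields of EX/MEM or MEM/WB that the forwarding unit looks at."""
    reg_write: bool
    rd: int


class IdEx(NamedTuple):
    rs: int
    rt: int


def forward(ex_mem, mem_wb, id_ex):
    """Return (ForwardA, ForwardB) as two-bit strings."""
    forward_a = forward_b = "00"  # assumed default: no forwarding
    # MEM/WB is checked first so that a matching EX/MEM result overrides it.
    if mem_wb.reg_write and mem_wb.rd != 0:
        if mem_wb.rd == id_ex.rs:
            forward_a = "01"
        if mem_wb.rd == id_ex.rt:
            forward_b = "01"
    if ex_mem.reg_write and ex_mem.rd != 0:
        if ex_mem.rd == id_ex.rs:
            forward_a = "10"
        if ex_mem.rd == id_ex.rt:
            forward_b = "10"
    return forward_a, forward_b


# "and $12, $2, $5" in EX while "sub $2, $1, $3" sits in EX/MEM.
print(forward(Latch(True, 2), Latch(False, 0), IdEx(rs=2, rt=5)))
# "or $13, $6, $2" in EX: "and" is in EX/MEM, "sub" is in MEM/WB.
print(forward(Latch(True, 12), Latch(True, 2), IdEx(rs=6, rt=2)))
```

A write to register 0 never triggers forwarding, which keeps $0 reading as zero even if an instruction names it as a destination.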

47
Data Hazards - Forwarding Unit (1)
48
Data Hazards - Forwarding Unit (2)
  • Remember that there is no hazard in the WB stage,
    because the register file is able to be written
    and read in the same stage

49
Data Hazards - Forwarding Unit (3)
50
Data Hazards - Forwarding Unit (4)
51
Forwarding Example (1)
52
Forwarding Example (2)
53
Forwarding Example (3)
54
Forwarding Example (4)
55
Data Hazards - Stalling (1)
  • lw $2, 20($1)
  • and $4, $2, $5
  • or $8, $2, $6
  • add $9, $4, $2
  • slt $1, $6, $7

56
Data Hazards - Stalling (2)
  • Let us try to formalize detecting a stalling data
    hazard
  • If (ID/EX.MemRead and ((ID/EX.RegisterRt =
    IF/ID.RegisterRs) or (ID/EX.RegisterRt =
    IF/ID.RegisterRt)))
  • On the condition being true, we stall the
    pipeline!
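
The load-use condition translates directly into a predicate. This is a minimal sketch; the function and parameter names are illustrative, and the register numbers in the calls are read off a load word followed by an instruction that uses the loaded register.

```python
def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID needs a value a load in EX has not fetched yet."""
    return id_ex_mem_read and (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt)


# "lw $2, 20($1)" in EX (Rt = 2) and "and $4, $2, $5" in ID (Rs = 2, Rt = 5).
print(must_stall(True, 2, 2, 5))

# One cycle later "and" is in EX (not a load) and "or $8, $2, $6" is in ID.
print(must_stall(False, 5, 2, 6))
```

Only a load in EX can trigger the stall, because every other result is available for forwarding by the time the dependent instruction reaches EX.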

57
Data Hazards - Stalling (3)
58
Stalling Example (1)
59
Stalling Example (2)
60
Stalling Example (3)
61
Stalling Example (4)
62
Stalling Example (5)
63
Stalling Example (6)
64
Branch Hazards
  • Other instructions are on the pipeline when we
    find out whether we take the branch or not!

65
Branch Hazards - Stalling (1)
  • Two solutions
  • Assume branch is not taken
  • Dynamic branch prediction
  • We've already discussed the first solution
  • Note that three instruction stages have to be
    flushed when the branch is taken
  • Done similarly to a data hazard stall (control
    values set to 0s)
  • We can increase branch performance by moving the
    branch decision to the ID stage (rather than the
    MEM stage)
  • Branch target address calculated by moving adder
    into ID stage
  • Branch decision done by comparing Rs and Rt
  • Flushing the IF stage instruction involves nop
    instructions
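
A back-of-the-envelope sketch of why moving the branch decision earlier helps when branches are assumed not taken. The flush counts (three when deciding in MEM, one when deciding in ID) come from the bullets above; the branch frequency and taken fraction are made-up values chosen only for illustration.

```python
def cpi_assume_not_taken(branch_fraction, taken_fraction, flushed_per_taken):
    """Base CPI of 1 plus the cycles lost to flushing after each taken branch."""
    return 1 + branch_fraction * taken_fraction * flushed_per_taken


BRANCH_FRACTION = 0.2  # assumed: one instruction in five is a branch
TAKEN_FRACTION = 0.5   # assumed: half of all branches are taken

for stage, flushed in (("MEM", 3), ("ID", 1)):
    cpi = cpi_assume_not_taken(BRANCH_FRACTION, TAKEN_FRACTION, flushed)
    print(f"decision in {stage}: {flushed} flushed per taken branch, CPI {cpi:.2f}")
```

The model ignores data hazard stalls, so the absolute CPI values are optimistic; only the comparison between the two rows is meaningful.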

66
Branch Hazards - Stalling (2)
67
Branching Example (1)
68
Branching Example (2)
69
Branch Hazards - Predicting (1)
  • Store, in a branch prediction buffer, the history
    of each branch instruction
  • 1-bit: a single wrong prediction flips the
    stored prediction
  • 2-bit: the prediction must be wrong twice in a
    row before it changes
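
The difference between the two schemes shows up on a loop branch. The sketch below models the 2-bit scheme as a saturating counter, which is one common realization and an assumption here (the slide's own state diagram is not in this transcript); the branch pattern and initial states are also made up for the example.

```python
def one_bit_misses(outcomes, predict_taken=True):
    """Predict whatever the branch did last time."""
    misses = 0
    for taken in outcomes:
        if predict_taken != taken:
            misses += 1
        predict_taken = taken
    return misses


def two_bit_misses(outcomes, counter=3):
    """Saturating counter in 0..3; predict taken when the counter is 2 or 3."""
    misses = 0
    for taken in outcomes:
        if (counter >= 2) != taken:
            misses += 1
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return misses


# A loop branch taken nine times and then not taken, with the loop run three times.
pattern = ([True] * 9 + [False]) * 3

print("1-bit mispredictions:", one_bit_misses(pattern))
print("2-bit mispredictions:", two_bit_misses(pattern))
```

The 1-bit scheme mispredicts both the loop exit and the next loop entry, while the 2-bit scheme absorbs the single not-taken outcome and mispredicts only the exit.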

70
Key Terms and Review Points (1)
  • Pipelining vs. Parallelism
  • Pipeline Stages
  • Pipeline Taxonomies
  • MIPS Instruction Pipeline
  • Structural Hazards
  • Control Hazards
  • Data Hazards
  • Pipeline Registers and Operation
  • Pipeline Control
  • Pipeline Throughput
  • Pipeline Efficiency

71
Key Terms and Review Points (2)
  • Control Hazard Stalling
  • Control Hazard Predicting
  • Control Hazard Delayed Branch
  • Data Hazard Forwarding
  • Data Hazard Detection
  • Forwarding Unit
  • Data Hazard Stalling
  • Branch Prediction Buffer

72
References
  1. Mike Schulte, Computer Architecture ECE 201,
    Lecture 11.
  2. Morgan Kaufmann, Companion Web Site for
    Computer Organization and Design.