COMP 206: Computer Architecture and Implementation - PowerPoint PPT Presentation

Title: Lecture 8. Author: Montek Singh. Last modified by: Montek Singh. Created Date: 3/13/2000 2:52:39 AM. Document presentation format: Letter Paper (8.5x11 in).

Slides: 18
Provided by: Montek3
Learn more at: http://www.cs.unc.edu

1
COMP 206: Computer Architecture and Implementation
  • Montek Singh
  • Wed, Sep 21, 2005
  • Topic: Pipelining -- Intermediate Concepts
  • (Control Hazards)

2
Control Hazard
  • A peculiar kind of RAW hazard involving the
    program counter
  • PC written by branch instruction
  • PC read by instruction fetch unit (not another
    instruction)
  • Possible misbehavior is that instructions fetched
    and executed after the branch instruction are not
    the ones specified by the branch instruction

3
Control Hazard Example
[Figure: pipeline timing diagram for instructions Br-1, Br, Br+1, Br+2, Br+3, and branch target T]
4
More on Control Hazards
  • Branch delay: the length of the control hazard
  • What determines branch delay?
  • We need to know that we have a branch instruction
  • We need to have the BTA
  • We need to know the branch outcome
  • So, we have to wait until we know all of these
    quantities
  • An older pipeline (DLX, HP2)
  • computes BTA in EX
  • computes branch outcome in EX
  • changes PC in MEM
  • To reduce branch delay, these steps are moved to
    earlier pipeline stages in MIPS (HP3)
  • Can't move up beyond ID (need to know it's a
    branch instruction)

5
Reducing Branch Delays
Example:
    sub $10, $4, $8
    beq $10, $3, go
    add $12, $2, $5
    ...
go: lw  $4, 16($12)
6
Dealing with Branch Delays
  • Four strategies
  • Stall
  • Predict Taken, variation A (PTA)
  • Predict Taken, variation B (PTB)
  • Predict Not Taken (PNT)
  • Consider a hypothetical 12-stage pipeline
  • Instruction is fetched in stage 1 (IF)
  • Opcode becomes known in stage 2 (ID)
  • BTA becomes known in stage 4
  • Branch outcome becomes known in stage 6
  • Parameters
  • PU, PT, PNT: penalties of unconditional branch,
    taken branch, untaken branch
  • T: probability of branch being taken

7
Stall Strategy: 12-Stage Pipeline
  • Pipeline stalls on all branches
  • Instructions 1 and 8 are branches
  • 1 is not taken, 8 is taken
  • Opcode determination in stage 2 stalls pipeline
  • Branch outcome determination in stage 6 restarts
    pipeline from IF or ID
  • BTA determination in stage 4 would restart
    pipeline from IF for jumps
  • PU = 3, PT = 5, PNT = 4

8
PNT Strategy: 12-Stage Pipeline
  • Pipeline continues execution assuming that the
    branch will fall through
  • Instructions 1 and 12 are branches
  • 1 is not taken, 12 is taken
  • Branch outcome determination in stage 6 restarts
    pipeline from IF for taken branches (cancelling
    instructions already in pipeline)
  • BTA determination in stage 4 would restart
    pipeline from IF for jumps
  • PU = 3, PT = 5, PNT = 0

9
PTA Strategy: 12-Stage Pipeline
  • Pipeline predicts all branches to be taken and
    restarts pipeline from IF at BTA as soon as BTA
    is known (cancelling instructions already in
    pipe)
  • Instructions 1 and 7 are branches
  • 1 is not taken, 7 is taken
  • Branch outcome determination in stage 6 restarts
    pipeline from IF for untaken branches (cancelling
    instructions already in pipeline)
  • PU = 3, PT = 3, PNT = 5

10
PTB Strategy: 12-Stage Pipeline
  • Pipeline predicts all branches to be taken
    and starts fetching from BTA as soon as it is
    known in stage 4 (but without cancelling
    instructions already in pipeline)
  • Instructions 1 and 10 are branch instructions
  • 1 is not taken, 10 is taken
  • Branch outcome determination in stage 6 restarts
    pipeline from IF on fall-through path (for
    untaken branches), and causes cancellation
  • PU = 3, PT = 3, PNT = 2
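
The penalties quoted for the four strategies can be related back to the stage numbers of the hypothetical 12-stage pipeline. The Python sketch below is one interpretation and is not stated in the slides: it assumes each penalty equals the number of fetch slots lost between the stage where useful fetching resumes and the stage where the needed fact (BTA or branch outcome) becomes known. The function name `penalties` and the constant names are illustrative.

```python
# Stage numbers taken from the hypothetical 12-stage pipeline.
IF_STAGE = 1        # instruction is fetched
ID_STAGE = 2        # opcode becomes known
BTA_STAGE = 4       # branch target address becomes known
OUTCOME_STAGE = 6   # branch outcome becomes known


def penalties(strategy):
    """Return (PU, PT, PNT) in cycles for one strategy.

    The formulas are an assumed reading of the slides, chosen so
    that they reproduce the quoted numbers.
    """
    # Jumps: in every scheme the target is fetched once BTA is known.
    pu = BTA_STAGE - IF_STAGE
    if strategy == "stall":
        pt = OUTCOME_STAGE - IF_STAGE    # restart from IF at the target
        pnt = OUTCOME_STAGE - ID_STAGE   # restart from ID (assumed: next
                                         # instruction was already fetched)
    elif strategy == "pnt":
        pt = OUTCOME_STAGE - IF_STAGE    # fall-through work is cancelled
        pnt = 0                          # prediction was right
    elif strategy == "pta":
        pt = BTA_STAGE - IF_STAGE        # redirected as soon as BTA known
        pnt = OUTCOME_STAGE - IF_STAGE   # everything after branch was lost
    elif strategy == "ptb":
        pt = BTA_STAGE - IF_STAGE
        pnt = OUTCOME_STAGE - BTA_STAGE  # only target-path fetches are lost
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return pu, pt, pnt


if __name__ == "__main__":
    for name in ("stall", "pnt", "pta", "ptb"):
        print(name, penalties(name))
```

Changing `BTA_STAGE` or `OUTCOME_STAGE` shows how moving these computations earlier in the pipeline would shrink the penalties under the same assumptions.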

11
Effect of Control Hazards on Pipelines
Assume that 20% of all instructions are transfers
of control, split 5% for unconditional jumps and
15% for conditional branches. For each of the
four branching schemes for the 12-stage pipeline,
determine the branch penalty as a function of T,
the probability of a branch being taken.
12
Solution for 12-Stage Pipeline
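  • Stall: 0.25 × 3 + 0.75 × (T × 5 + (1 - T) × 4) = 3.75 + 0.75T
  • PTA: 0.25 × 3 + 0.75 × (T × 3 + (1 - T) × 5) = 4.5 - 1.5T
  • PTB: 0.25 × 3 + 0.75 × (T × 3 + (1 - T) × 2) = 2.25 + 0.75T
  • PNT: 0.25 × 3 + 0.75 × (T × 5 + (1 - T) × 0) = 0.75 + 3.75T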
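
The four expressions can be checked numerically. The sketch below assumes the weights 0.25 and 0.75 are the shares of unconditional jumps (5% of 20%) and conditional branches (15% of 20%) among transfers of control, and uses the (PU, PT, PNT) values quoted for each strategy. The names `branch_penalty` and `best_strategy` are illustrative.

```python
# Share of unconditional jumps and conditional branches among all
# transfers of control: 5/20 and 15/20.
P_UNCOND = 0.25
P_COND = 0.75

# (PU, PT, PNT) in cycles for each strategy on the 12-stage pipeline.
PENALTIES = {
    "Stall": (3, 5, 4),
    "PTA": (3, 3, 5),
    "PTB": (3, 3, 2),
    "PNT": (3, 5, 0),
}


def branch_penalty(strategy, t):
    """Average penalty per transfer of control, for taken probability t."""
    pu, pt, pnt = PENALTIES[strategy]
    return P_UNCOND * pu + P_COND * (t * pt + (1 - t) * pnt)


def best_strategy(t):
    """Strategy with the lowest average penalty at taken probability t."""
    return min(PENALTIES, key=lambda s: branch_penalty(s, t))


if __name__ == "__main__":
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        row = ", ".join(f"{s}={branch_penalty(s, t):.2f}" for s in PENALTIES)
        print(f"T={t:.2f}: {row}")
```

With these numbers PNT has the lowest penalty for T < 0.5 and PTB for T > 0.5; the two lines cross at T = 0.5, where 0.75 + 3.75T = 2.25 + 0.75T.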

13
Delayed Branches on MIPS
  • One branch delay slot on MIPS
  • Always execute instruction in branch delay slot
    (irrespective of branch outcome)
  • Question: What instruction do we put in the
    branch delay slot?
  • Fill with NOP (always possible, penalty 1)
  • Fill from before (not always possible, penalty
    0)
  • Fill from target (not always possible, penalty
    1-T)
  • Not possible if BTA is dynamic
  • Not possible if the instruction at BTA is
    another branch
  • Fill from fall-through (not always possible,
    penalty T)

14
Details of Various Branch Flavors
[Figure: flow diagram of a conditional branch X (cond) with true and false paths, showing instruction blocks A B C D, M N P Q, and E F G H]
15
Instruction Sequence Alteration Strategies
  • To allow for more aggressive filling of branch
    delay slot from target or fall-through, we can
    selectively cancel instructions
  • Classification of branches
  • Delayed branch
  • Instruction in branch delay slot is always
    executed
  • Plain branch
  • Instruction in branch delay slot is cancelled if
    branch is taken
  • Useful if compiler filled branch delay slot from
    fall-through
  • Canceling (annulling, nullifying) branch
  • Instruction in branch delay slot is cancelled if
    branch is not taken
  • Useful if compiler filled branch delay slot from
    target
  • Should not cancel instruction if it may cause
    exception
  • A bit in the instruction set by compiler makes
    the choice
  • MIPS, SPARC, PA-RISC: delayed (0), canceling (1)
  • M 88000, i860: delayed (0), plain (1)
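
The three branch flavors differ only in when the delay-slot instruction is allowed to complete. A minimal sketch of that rule follows; the function name `delay_slot_executes` and the string labels are illustrative, not taken from any ISA manual.

```python
def delay_slot_executes(kind, taken):
    """Return True if the delay-slot instruction is executed.

    kind:  "delayed", "plain", or "canceling"
    taken: True if the branch is taken
    """
    if kind == "delayed":
        return True          # always executed
    if kind == "plain":
        return not taken     # cancelled if the branch is taken
    if kind == "canceling":
        return taken         # cancelled if the branch is not taken
    raise ValueError(f"unknown branch kind: {kind}")


if __name__ == "__main__":
    for kind in ("delayed", "plain", "canceling"):
        for taken in (False, True):
            print(kind, "taken" if taken else "not taken",
                  "->", delay_slot_executes(kind, taken))
```

A slot filled from the fall-through path pairs naturally with a plain branch, and a slot filled from the target pairs with a canceling branch, since in both cases the slot instruction runs only on the path it came from.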

16
Example: Branch Penalties
Consider a DLX pipeline with a single branch
delay slot in which 25% of branches are
unconditional. 50% of the unconditional branches
have their delay slots filled from before, 40%
from the target, and 10% with NOPs. The branch
delay slots of the conditional branches are
filled from various sources as shown in the table
below, depending on the kind of branch used. For
each of the cases, determine the branch penalty
as a function of T, the probability that a
conditional branch is taken. How do these
penalties compare to those obtained by using a
Stall, PT, or PNT strategy?
For all of Stall, PT, and PNT on DLX: PU = 1, PT =
1, PNT = 0
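
The parts of this problem that do not depend on the table can be worked out directly. The sketch below makes two assumptions that the transcript does not state: an unconditional branch is treated as always taken, so a slot filled from the target costs 1 - T = 0; and the average delay-slot penalty of the conditional branches is passed in by the caller, because the table of fill sources is not part of this transcript. All function names are illustrative.

```python
P_UNCOND = 0.25            # share of branches that are unconditional
P_COND = 1.0 - P_UNCOND    # share that are conditional

# Fill sources for unconditional branches, from the problem statement.
UNCOND_FILL = {"before": 0.50, "target": 0.40, "nop": 0.10}


def uncond_slot_penalty():
    """Average delay-slot penalty of an unconditional branch."""
    # Assumption: always taken, so fill-from-target costs 1 - 1 = 0.
    per_fill = {"before": 0.0, "target": 0.0, "nop": 1.0}
    return sum(frac * per_fill[src] for src, frac in UNCOND_FILL.items())


def delayed_branch_penalty(cond_slot_penalty):
    """Overall penalty, given the conditional-branch slot penalty.

    cond_slot_penalty must be derived from the table of fill sources,
    which is not available here.
    """
    return P_UNCOND * uncond_slot_penalty() + P_COND * cond_slot_penalty


def baseline_penalty(t, pu=1, pt=1, pnt=0):
    """Penalty for Stall, PT, or PNT on DLX with the quoted PU, PT, PNT."""
    return P_UNCOND * pu + P_COND * (t * pt + (1 - t) * pnt)


if __name__ == "__main__":
    print("unconditional contribution:", P_UNCOND * uncond_slot_penalty())
    for t in (0.0, 0.5, 1.0):
        print(f"T={t}: baseline = {baseline_penalty(t):.3f}")
```

Under these assumptions the unconditional branches contribute 0.25 × 0.10 = 0.025 cycles per branch, and the baseline for comparison is 0.25 + 0.75T.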
17
Solution: Branch Penalties