PIPELINED PROCESSORS - PowerPoint PPT Presentation

Slides: 44
Provided by: najma7
1
PIPELINED PROCESSORS
  • Chapter No. 5
  • By
  • Najma Ismat

2
Pipeline Evolution in Processors
  • Pipelining first appeared at the end of the 1960s
    in the first supercomputers of that time, such as
    the IBM 360/91 (1967) and the CDC 7600 (1970).
  • In 1970, pipelining at the instruction level was
    used in the B7700 mainframe.

3
Principle of Pipelining
  • A number of functional units are employed in
    sequence to perform a single computation.
  • Each functional unit represents a certain stage
    of the computation.
  • Pipelining allows overlapped execution of
    instructions, that is, temporal overlapping of
    processing.
  • It increases the overall throughput of the
    processor.
  • In pipelined operation, each task is divided into
    a number of subtasks.

4
Principle of Pipelining
  • Each pipeline stage is associated with one
    subtask and performs the operation that subtask
    requires.
  • In a basic pipeline, the same amount of time is
    available in each stage for performing its
    subtask.
  • The pipeline stages operate like an assembly
    line: each stage typically receives its input
    from the previous stage and delivers its output
    to the next stage.
  • The basic pipeline is clocked (operates
    synchronously), that is, each stage accepts a new
    input at the start of the clock cycle.
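
The overlap described on these slides can be illustrated with a short timing sketch. It assumes an idealised pipeline with equal-length stages, one new instruction entering per clock cycle, and no stalls; the function names are illustrative and not taken from the slides.

```python
def unpipelined_cycles(n_instructions, n_stages):
    """Without pipelining, each instruction passes through every stage
    before the next instruction starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages):
    """The first instruction needs n_stages cycles to pass through the
    pipeline; every later one completes one cycle after its predecessor."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def occupancy(n_instructions, n_stages):
    """Cycle-by-cycle table: table[c][s] is the index of the instruction
    held in stage s during cycle c, or None if that stage is empty."""
    table = []
    for cycle in range(pipelined_cycles(n_instructions, n_stages)):
        row = []
        for stage in range(n_stages):
            instr = cycle - stage  # instruction i is in stage s in cycle i + s
            row.append(instr if 0 <= instr < n_instructions else None)
        table.append(row)
    return table
```

For example, five instructions in a four-stage pipeline take 8 cycles under these assumptions, compared with 20 cycles when the stages are not overlapped.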

5
Principle of Pipelining
6
Pipelined Operation
7
Pipelined Operation
8
Pipelined and Unpipelined Processing
9
Processor Pipelines in Reality
  • A real pipeline may include a few extensions to
    the basic pipeline.
  • Pipelined execution is often performed using
    half-cycles, and in certain cases one or more
    pipeline stages may have to be recycled to
    accomplish a given task.
  • These additional cycles may be required to
    perform certain arithmetic operations.

10
Logical Layout of Pentium Pipeline
11
Logical Layout of PowerPC 604 Pipeline
12
General Structure of Pipelines
  • A pipeline consists of a number of stages, one for
    each subtask. The stages are decoupled from each
    other by registers, called latches.
  • As each clock cycle ends, the latches gate in
    their inputs and forward them to the associated
    stage, where the required operation is performed.
  • In reality, each stage is often implemented by a
    number of different functional units (FUs) or
    execution units (EUs) that perform the required
    operations.
  • The latches are extended with multiplexers that
    select and transfer data from the outputs of the
    preceding EUs to the inputs of the subsequent
    EUs.
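
The role of the latches can be sketched in a few lines. This is a simplified software model, not a hardware description: each stage is assumed to be a plain function, `None` is used to mark an empty latch, and multiplexers are left out.

```python
def clock_pipeline(stages, inputs):
    """Run `inputs` through `stages` (one function per pipeline stage).

    latches[i] holds the value waiting at the input of stage i. In each
    clock cycle every stage processes the content of its latch; at the end
    of the cycle the result is gated into the latch of the next stage.
    None marks an empty latch, so None cannot be used as a data value.
    """
    n = len(stages)
    latches = [None] * n
    pending = list(inputs)
    outputs = []
    # Enough cycles to feed in all inputs and then drain the pipeline.
    for _ in range(len(pending) + n):
        # All stages work in parallel on the current latch contents.
        results = [stages[i](latches[i]) if latches[i] is not None else None
                   for i in range(n)]
        if results[-1] is not None:
            outputs.append(results[-1])  # a result leaves the last stage
        # End of cycle: the latches gate in their new inputs.
        latches = [pending.pop(0) if pending else None] + results[:-1]
    return outputs
```

With two stages that add one and then double, `clock_pipeline([lambda x: x + 1, lambda x: x * 2], [1, 2, 3])` returns `[4, 6, 8]`, with the second input entering the first stage while the first input is still in the second stage.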

13
General Structure of Pipelines
14
(No Transcript)
15
Pipeline Performance Measures
  • Non-pipelined processor
  • characterized by instruction cycle time and
    execution time
  • Pipelined processor
  • execution time is of no importance
  • three different measures are used instead: cycle
    time, latency and repetition rate
  • Cycle time
  • specifies the time available for each stage to
    accomplish the required operations

16
Pipeline Performance Measures
  • cycle time is determined by the worst-case
    processing time of the longest stage
  • Latency
  • specifies the amount of time that the result of a
    particular instruction takes to become available
    in the pipeline for a subsequent dependent
    instruction
  • used in the context of processing a subsequent
    RAW-dependent (read-after-write) instruction
  • Two kinds of latency: define-use latency and
    load-use latency (corresponding to the two types
    of RAW dependency, define-use and load-use)

17
Pipeline Performance Measures
  • Define-use latency
  • mul r1, r2, r3
  • add r5, r1, r4
  • Define-use delay
  • the time a subsequent RAW-dependent instruction
    has to be stalled in the pipeline
  • Load-use latency
  • load r1, x
  • add r5, r1, r2
  • Load-use delay
  • interpreted in the same way as define-use delay
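
The relation between latency and delay can be sketched as a small issue-cycle calculation. The latency values used here (3 cycles for mul, 2 for load, 1 for add) are hypothetical, and the model assumes in-order issue of at most one instruction per cycle, with a result usable `latency` cycles after its producer issues. Under these assumptions the delay works out to the latency minus one cycle.

```python
def issue_cycles(program, latency):
    """program: list of (op, dest, sources); latency: op -> cycles.

    Returns the cycle in which each instruction can issue. A consumer may
    not issue before every source register it reads has become available.
    """
    ready = {}        # register -> first cycle in which its value is usable
    cycles = []
    next_cycle = 0    # earliest cycle allowed by in-order, one-per-cycle issue
    for op, dest, sources in program:
        cycle = max([next_cycle] + [ready.get(r, 0) for r in sources])
        cycles.append(cycle)
        ready[dest] = cycle + latency[op]
        next_cycle = cycle + 1
    return cycles


# Hypothetical latencies, for illustration only.
LATENCY = {"mul": 3, "load": 2, "add": 1}

define_use = [("mul", "r1", ("r2", "r3")), ("add", "r5", ("r1", "r4"))]
load_use = [("load", "r1", ()), ("add", "r5", ("r1", "r2"))]
```

Here `issue_cycles(define_use, LATENCY)` gives `[0, 3]`: the add would normally issue in cycle 1, so it is stalled for 2 cycles (define-use delay). `issue_cycles(load_use, LATENCY)` gives `[0, 2]`, a load-use delay of 1 cycle.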

18
Pipeline Performance Measures
  • Repetition rate
  • also known as throughput
  • specifies the shortest possible time interval
    between subsequent instructions in the pipeline
  • the repetition rate of a basic pipeline is one
    cycle
  • the repetition rate determines the performance
    potential of a pipeline
  • The performance potential of a pipeline, when no
    define-use or load-use delays occur between
    instructions, can be calculated as
  • P = 1 / (R × tc)

19
Pipeline Performance Measures
  • where
  • R is the repetition rate of the pipeline in
    cycles
  • tc is the cycle time of the pipeline
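
As a worked example, the performance potential P = 1 / (R × tc) can be computed directly. The stage delays below are made-up values, and the helper names are illustrative only.

```python
def cycle_time(stage_delays_ns):
    """The cycle time is set by the slowest stage (worst-case delay)."""
    return max(stage_delays_ns)


def performance_potential(repetition_rate_cycles, cycle_time_ns):
    """P = 1 / (R * tc), returned in instructions per second."""
    if repetition_rate_cycles <= 0 or cycle_time_ns <= 0:
        raise ValueError("R and tc must be positive")
    return 1.0 / (repetition_rate_cycles * cycle_time_ns * 1e-9)


# Hypothetical stage delays in nanoseconds (illustrative values only).
tc = cycle_time([8.0, 10.0, 9.0, 7.5])   # slowest stage: 10 ns
p = performance_potential(1, tc)         # basic pipeline: R = 1 cycle
```

With R = 1 cycle and tc = 10 ns this gives a potential of about 100 million instructions per second; doubling R to 2 cycles halves it.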

20
Application Scenarios of Pipelines
21
Design space of pipelines
Key aspects of the design space of pipelines
22
Basic Pipeline Layout
23
Basic Pipeline Layout
  • The number of pipeline stages
  • when more pipeline stages are used, more parallel
    execution and thus higher performance can be
    expected
  • disadvantage: a larger number of stages leads to
    more frequent data and control dependencies
    between instructions in the pipeline, which
    decreases performance
  • Specification of the subtasks to be performed in
    each stage
  • the subtasks can be specified at a number of
    levels of increasing detail
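
The trade-off in the number of stages can be illustrated with a toy model. The model is an assumption made for this sketch and does not come from the slides: the total logic delay is split evenly over the stages, each stage adds a fixed latch overhead, and a fixed fraction of instructions stalls for (stages - 1) cycles.

```python
def time_per_instruction(n_stages, logic_delay, latch_overhead, stall_fraction):
    """Average time per instruction under the toy model described above."""
    cycle = logic_delay / n_stages + latch_overhead      # shorter with more stages
    cpi = 1.0 + stall_fraction * (n_stages - 1)          # worse with more stages
    return cycle * cpi


def best_stage_count(logic_delay, latch_overhead, stall_fraction, max_stages=30):
    """Stage count (1..max_stages) that minimises the time per instruction."""
    return min(
        range(1, max_stages + 1),
        key=lambda k: time_per_instruction(
            k, logic_delay, latch_overhead, stall_fraction),
    )
```

With the illustrative values `logic_delay=12.0`, `latch_overhead=1.0` and `stall_fraction=0.1`, the time per instruction falls from 13.0 for a single stage to its minimum at 10 stages and rises again beyond that, which is the behaviour the slide describes qualitatively.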

24
Number of Pipeline Stages
25
Number of Pipeline Stages
26
Basic Pipeline Layout
  • Layout of the stage sequence
  • concerns how the pipeline stages are used
  • Use of bypassing
  • intended to reduce or eliminate pipeline stalls
    due to RAW dependencies
  • Problem: unless special arrangements are made,
    the result of an operation instruction is written
    into the register file, or into memory, and
    is then fetched from there as a source operand
  • Solution: the result of the EU is immediately
    forwarded to its input for use in the next
    pipeline cycle
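
The effect of bypassing on stalls can be sketched for a pair of RAW-dependent instructions. The five-stage layout (IF, ID, EX, MEM, WB) is an assumption chosen for illustration, as is the rule that a value written in one cycle can only be read in a later cycle.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]  # assumed five-stage layout


def stall_cycles(distance, produced_in, consumed_in):
    """Stall cycles for a consumer issued `distance` instructions after
    its producer.

    produced_in: stage at whose end the result becomes available.
    consumed_in: stage at whose start the consumer needs the operand.
    """
    min_distance = STAGES.index(produced_in) - STAGES.index(consumed_in) + 1
    return max(min_distance - distance, 0)


# Without bypassing: result written in WB, operand read in ID.
without_bypass = stall_cycles(1, produced_in="WB", consumed_in="ID")
# With bypassing: EU output forwarded straight to the EU input.
with_bypass = stall_cycles(1, produced_in="EX", consumed_in="EX")
```

Under these assumptions a directly following dependent instruction stalls for 3 cycles without bypassing and for 0 cycles with it.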

27
Layout of the Stage Sequence
28
Bypassing
29
Basic Pipeline Layout
  • Implementing bypassing requires an additional
    data bus for forwarding the results of the
    execution stage to its input, and an appropriate
    extension of the associated multiplexers and
    latches
  • Timing of the pipeline operations
  • self-timed (asynchronous)
  • clocked (synchronous)

30
Timing of Pipeline Operations
31
Dependency Resolution
Method of dependency resolution
  • Static resolution: performed by the compiler
  • Dynamic resolution: performed by extra hardware
  • Combined resolution: performed partly by the
    compiler, partly by the hardware
Trend
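
One simple form of static resolution is for the compiler to separate RAW-dependent instructions with NOPs. The sketch below illustrates that idea only; the instruction format and the `delay` parameter (the number of instruction slots that must separate a producer from its consumer) are assumptions, and real compilers would usually try to fill the slots with useful instructions instead.

```python
def insert_nops(program, delay):
    """program: list of (op, dest, sources).

    Returns a new instruction list in which every consumer is at least
    `delay` slots after the latest producer of each register it reads.
    """
    scheduled = []
    last_write = {}   # register -> index in `scheduled` of its latest producer
    for op, dest, sources in program:
        # Earliest slot at which all source registers are safe to read.
        earliest = max(
            [last_write[r] + delay + 1 for r in sources if r in last_write],
            default=0,
        )
        while len(scheduled) < earliest:
            scheduled.append(("nop", None, ()))
        scheduled.append((op, dest, sources))
        if dest is not None:
            last_write[dest] = len(scheduled) - 1
    return scheduled
```

For the pair `mul r1, r2, r3` / `add r5, r1, r4` with `delay=2`, the result is mul, nop, nop, add. With dynamic resolution the program would stay unchanged and the hardware would stall the add for the same number of cycles.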
32
Overview of Pipelined Instructions
33
Logical Layout
  • It specifies the tasks to be accomplished; this
    includes
  • the declaration of the pipelines to be
    implemented
  • usually there are separate pipelines: the FX
    pipeline for processing FX (fixed-point) and
    logical data, the FP pipeline for FP
    (floating-point) data, the L/S pipeline for
    loads and stores, and the B pipeline for branches
  • the DEC Alpha 21164 provides two types of FX
    (integer) pipelines
  • a detailed specification of the subtasks to be
    performed and their execution sequence for each
    pipeline
  • a detailed description of the subtasks to be
    performed in each stage

34
PowerPC 601 Example
35
Detailed Description of FX Pipeline
36
Implementation of Instruction Pipeline
37
Layout of the Physical Pipelines
38
Layout of the Physical Pipelines
  • Multifunction pipelines
  • Only one published design of a multifunction
    pipeline is available, the MIPS R4200, which
    implements all the FX, FP, L/S and B
    instructions
  • The classical (master pipeline) approach is
    implemented in the IBM 801, MIPS, MIPS-X, MIPS
    R-series (up to the R6000), i486 and Pentium
  • Dedicated pipelines
  • dedicated pipelines are implemented in the
    PowerPC 603, PowerPC 604, DEC Alpha, etc.

39
Multiplicity of Pipelines
  • Multiplicity refers to the choice between using
    a single instance of a physical pipeline and
    using multiple instances of physical pipelines.
  • Two aspects should be taken into account when
    deciding on pipeline multiplicity
  • frequency of instructions
  • out-of-order execution of instructions due to
    multiple pipelines
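
The second aspect can be illustrated with a short sketch showing how pipelines of different lengths let instructions finish out of program order. The pipeline names and lengths are hypothetical, and the model assumes that one instruction is issued per cycle in program order and that each pipeline can accept a new instruction every cycle.

```python
def completion_order(program, pipeline_length):
    """program: list of (name, pipeline), issued one per cycle in order.

    Returns the instruction names sorted by the cycle in which they
    finish; ties are broken by program order.
    """
    finish = []
    for issue_cycle, (name, pipe) in enumerate(program):
        finish.append((issue_cycle + pipeline_length[pipe], issue_cycle, name))
    return [name for _, _, name in sorted(finish)]


# Hypothetical pipeline lengths in cycles (illustrative values only).
LENGTHS = {"FP": 6, "FX": 3}
program = [("fmul", "FP"), ("add", "FX"), ("sub", "FX")]
order = completion_order(program, LENGTHS)
```

Here `order` is `["add", "sub", "fmul"]`: the two FX instructions finish before the earlier FP instruction, which is why sequential consistency has to be preserved explicitly when multiple pipelines are used.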

40
Multiplicity of Pipelines
41
Preserving Sequential Consistency
42
Implementation of Pipelined Instruction Processing
43
Implementation of Pipelined Instruction Processing