Lecture: Pipelining Basics

Author: Rajeev Balasubramonian
Created: 9/20/2002

Slides: 23
Provided by: RajeevBala136
Learn more at: https://my.eng.utah.edu

1
Lecture: Pipelining Basics
  • Topics: Basic pipelining implementation
  • Video 1: What is pipelining?
  • Video 2: Clocks and latches
  • Video 3: An example 5-stage pipeline
  • Video 4: Loads/Stores and RISC/CISC

2
Building a Car
Unpipelined: start and finish a job before moving to the next
[Figure: Jobs vs. Time]
3
The Assembly Line
Pipelined: break the job into smaller stages
[Figure: Jobs vs. Time; each job is split into stages A, B, C]
4
Clocks and Latches
[Figure: Stage 1 and Stage 2]
5
Clocks and Latches
[Figure: Stage 1 and Stage 2 with latches (L) and a clock signal (Clk)]
6
Some Equations
  • Unpipelined: time to execute one instruction = T + Tovh
  • For an N-stage pipeline, time per stage = T/N + Tovh
  • Total time per instruction = N (T/N + Tovh) = T + N Tovh
  • Clock cycle time = T/N + Tovh
  • Clock speed = 1 / (T/N + Tovh)
  • Ideal speedup = (T + Tovh) / (T/N + Tovh)
  • Cycles to complete one instruction = N
  • Average CPI (cycles per instr) = 1
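
The equations above can be checked numerically with a small sketch. The function name `pipeline_metrics` and the sample values (T = 10 ns, Tovh = 0.5 ns, N = 4) are made up for illustration and do not come from the lecture; the sketch assumes equal-length stages and no stalls.

```python
def pipeline_metrics(t_ns, t_ovh_ns, n_stages):
    """Evaluate the slide's formulas for an N-stage pipeline with equal stages."""
    cycle_ns = t_ns / n_stages + t_ovh_ns            # T/N + Tovh
    return {
        "cycle_ns": cycle_ns,
        "clock_ghz": 1.0 / cycle_ns,                 # 1 / (T/N + Tovh); 1/ns is GHz
        "latency_ns": n_stages * cycle_ns,           # N (T/N + Tovh) = T + N Tovh
        "ideal_speedup": (t_ns + t_ovh_ns) / cycle_ns,
    }

# Hypothetical inputs, chosen only to exercise the formulas.
m = pipeline_metrics(t_ns=10.0, t_ovh_ns=0.5, n_stages=4)
for key, value in m.items():
    print(f"{key}: {value:.3f}")
```

Note that the latency of a single instruction goes up (12 ns instead of 10.5 ns in this made-up case) even though throughput improves.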

7
Problem 1
  • An unpipelined processor takes 5 ns to work on one instruction. It then takes 0.2 ns to latch its results into latches. I was able to convert the circuits into 5 equal sequential pipeline stages. Answer the following, assuming that there are no stalls in the pipeline.
  • What are the cycle times in the two processors?
  • What are the clock speeds?
  • What are the IPCs?
  • How long does it take to finish one instr?
  • What is the speedup from pipelining?

8
Problem 1
  • An unpipelined processor takes 5 ns to work on one instruction. It then takes 0.2 ns to latch its results into latches. I was able to convert the circuits into 5 equal sequential pipeline stages. Answer the following, assuming that there are no stalls in the pipeline.
  • What are the cycle times in the two processors? 5.2 ns and 1.2 ns
  • What are the clock speeds? 192 MHz and 833 MHz
  • What are the IPCs? 1 and 1
  • How long does it take to finish one instr? 5.2 ns and 6 ns
  • What is the speedup from pipelining? 833/192 = 4.34
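
The Problem 1 answers can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation; the variable names are illustrative, and the inputs are the 5 ns, 0.2 ns, and 5 stages given in the problem.

```python
T_NS, T_OVH_NS, N = 5.0, 0.2, 5              # values from the problem statement

unpipelined_cycle = T_NS + T_OVH_NS          # one instruction per long cycle
pipelined_cycle = T_NS / N + T_OVH_NS        # one equal stage plus latch overhead

unpipelined_mhz = 1000.0 / unpipelined_cycle # 1/ns = GHz, so 1000/ns = MHz
pipelined_mhz = 1000.0 / pipelined_cycle

latency = N * pipelined_cycle                # time for one instr to pass all stages
speedup = unpipelined_cycle / pipelined_cycle

print(f"cycle times: {unpipelined_cycle:.1f} ns and {pipelined_cycle:.1f} ns")
print(f"clock speeds: {unpipelined_mhz:.0f} MHz and {pipelined_mhz:.0f} MHz")
print(f"latency: {latency:.1f} ns, speedup: {speedup:.2f}")
```

Computed from the unrounded cycle times, the speedup is 5.2/1.2, about 4.33; the 4.34 on the slide comes from dividing the rounded clock speeds.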

9
Problem 2
  • An unpipelined processor takes 5 ns to work on one instruction. It then takes 0.2 ns to latch its results into latches. I was able to convert the circuits into 5 sequential pipeline stages. The stages have the following lengths: 1 ns, 0.6 ns, 1.2 ns, 1.4 ns, 0.8 ns. Answer the following, assuming that there are no stalls in the pipeline.
  • What is the cycle time in the new processor?
  • What is the clock speed?
  • What is the IPC?
  • How long does it take to finish one instr?
  • What is the speedup from pipelining?
  • What is the max speedup from pipelining?

10
Problem 2
  • An unpipelined processor takes 5 ns to work on one instruction. It then takes 0.2 ns to latch its results into latches. I was able to convert the circuits into 5 sequential pipeline stages. The stages have the following lengths: 1 ns, 0.6 ns, 1.2 ns, 1.4 ns, 0.8 ns. Answer the following, assuming that there are no stalls in the pipeline.
  • What is the cycle time in the new processor? 1.6 ns
  • What is the clock speed? 625 MHz
  • What is the IPC? 1
  • How long does it take to finish one instr? 8 ns
  • What is the speedup from pipelining? 625/192 = 3.26
  • What is the max speedup from pipelining? 5.2/0.2 = 26
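
With unequal stages, the slowest stage sets the clock. The sketch below reworks Problem 2 on that basis; the variable names are illustrative, and the max speedup is computed as the slide does, by letting the per-stage work shrink until only the latch overhead remains.

```python
stages_ns = [1.0, 0.6, 1.2, 1.4, 0.8]        # stage lengths from the problem
T_OVH_NS = 0.2                               # latch overhead from the problem

cycle = max(stages_ns) + T_OVH_NS            # slowest stage plus latch overhead
clock_mhz = 1000.0 / cycle
latency = len(stages_ns) * cycle             # every stage takes a full cycle

unpipelined = sum(stages_ns) + T_OVH_NS      # 5 ns of work plus one latch
speedup = unpipelined / cycle
max_speedup = unpipelined / T_OVH_NS         # cycle time can never drop below Tovh

print(f"cycle time: {cycle:.1f} ns, clock: {clock_mhz:.0f} MHz")
print(f"latency: {latency:.1f} ns, speedup: {speedup:.2f}, max: {max_speedup:.0f}")
```

The unrounded speedup is 5.2/1.6 = 3.25; the slide's 3.26 comes from dividing the rounded clock speeds 625 and 192.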

11
A 5-Stage Pipeline
Source: H&P textbook
12
A 5-Stage Pipeline
Use the PC to access the I-cache and increment PC by 4
13
A 5-Stage Pipeline
Read registers, compare registers, compute branch target. For now, assume branches take 2 cycles (there is enough work that branches can easily take more).
14
A 5-Stage Pipeline
ALU computation, effective address computation for load/store
15
A 5-Stage Pipeline
Memory access to/from data cache; stores finish in 4 cycles
16
A 5-Stage Pipeline
Write result of ALU computation or load into register file
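
The way instructions overlap in the five stages can be sketched as a small table generator. The stage labels `IF`, `ID`, `EX`, `MEM`, `WB` and the instruction names `i1` to `i4` are conventional placeholders assumed here, not taken from the slides, and the sketch assumes an ideal pipeline with one new instruction per cycle and no stalls or early finishes.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]     # assumed labels for the five stages

def pipeline_table(instrs):
    """Return (name, cells) rows; cells[c] is the stage occupied in cycle c."""
    n_cycles = len(instrs) + len(STAGES) - 1
    rows = []
    for i, name in enumerate(instrs):
        cells = []
        for cycle in range(n_cycles):
            k = cycle - i                    # instruction i enters IF in cycle i
            cells.append(STAGES[k] if 0 <= k < len(STAGES) else "")
        rows.append((name, cells))
    return rows

def render(rows):
    """Format the rows as fixed-width text, one instruction per line."""
    return "\n".join(
        f"{name:<4}" + "".join(f"{cell:<5}" for cell in cells)
        for name, cells in rows
    )

rows = pipeline_table(["i1", "i2", "i3", "i4"])
print(render(rows))
```

Each row is shifted one cycle to the right of the previous one, so four instructions finish in 4 + 5 - 1 = 8 cycles rather than 20.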
17
Problem 3
  • For the following code sequence, show how the instructions flow through the pipeline:
  • ADD R1, R2, → R3
  • BEZ R4, R5
  • LD [R6] → R7
  • ST [R8] ← R9

18
RISC/CISC Loads/Stores
19
Problem 4
  • Convert this C code into equivalent RISC assembly instructions:
  • a[i] = b[i] + c[i]

20
Problem 4
  • Convert this C code into equivalent RISC assembly instructions:
  • a[i] = b[i] + c[i]
  • LD [R1], R2          # R1 has the address for variable i
  • MUL R2, 8, R3        # the offset from the start of the array
  • ADD R4, R3, R7       # R4 has the address of a[0]
  • ADD R5, R3, R8       # R5 has the address of b[0]
  • ADD R6, R3, R9       # R6 has the address of c[0]
  • LD [R8], R10         # Bringing b[i]
  • LD [R9], R11         # Bringing c[i]
  • ADD R10, R11, R12    # Sum is in R12
  • ST [R7], R12         # Putting result in a[i]
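
One way to sanity-check the instruction sequence above is to step through it on a toy register file and memory. This is a minimal sketch: the function `run_problem4`, the dictionary-based memory, and the addresses 0x100, 0x1000, 0x2000, 0x3000 are all hypothetical, and it assumes LD and ST use their first operand as a memory address and that array elements are 8 bytes wide, as the MUL by 8 suggests.

```python
def run_problem4(mem, regs):
    """Execute the nine instructions on dict-based registers and memory."""
    r = dict(regs)
    r["R2"] = mem[r["R1"]]           # LD:  R2 <- mem[R1], the value of i
    r["R3"] = r["R2"] * 8            # MUL: byte offset of element i
    r["R7"] = r["R4"] + r["R3"]      # ADD: address of a[i]
    r["R8"] = r["R5"] + r["R3"]      # ADD: address of b[i]
    r["R9"] = r["R6"] + r["R3"]      # ADD: address of c[i]
    r["R10"] = mem[r["R8"]]          # LD:  R10 <- b[i]
    r["R11"] = mem[r["R9"]]          # LD:  R11 <- c[i]
    r["R12"] = r["R10"] + r["R11"]   # ADD: sum
    mem[r["R7"]] = r["R12"]          # ST:  mem[address of a[i]] <- sum
    return r

# Hypothetical layout: i at 0x100; a, b, c are 4-element arrays of 8-byte values.
A, B, C, I_ADDR = 0x1000, 0x2000, 0x3000, 0x100
mem = {I_ADDR: 2}
for k in range(4):
    mem[A + 8 * k] = 0
    mem[B + 8 * k] = 10 + k
    mem[C + 8 * k] = 100 + k

regs = run_problem4(mem, {"R1": I_ADDR, "R4": A, "R5": B, "R6": C})
print(mem[A + 8 * 2])                # a[2] after the store
```

With i = 2, the store lands in a[2] and the other elements of a are left untouched.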

21
Problem 5
  • Design your own hypothetical 8-stage pipeline.
