COMP541 Sequencing - PowerPoint PPT Presentation

Slides: 40
Provided by: Montek5
Learn more at: http://www.cs.unc.edu

1
COMP541 Sequencing III (Sequencing a Computer)
  • Montek Singh
  • April 9, 2007

2
Test 2
  • On April 17
  • Covers
  • Memories
  • Arithmetic
  • Datapaths
  • Sequencing

3
Design Reviews: Last Week of Classes
  • Individual meetings during class time next week
    (and maybe one more day)
  • 20 minutes
  • Please prepare a presentation
  • Not necessarily a PPT, but don't make up your
    description on the fly

4
Chapter 10-7
  • Simple computer architecture
  • Not unlike MIPS, except 16 bits
  • Single-cycle hardwired control
  • Multicycle microprogrammed control

5
Instruction Formats
  • Register-type instructions
  • Only 8 registers (3 bits)

6
Immediate
  • Only 3 bits for the immediate value (Op)
  • Mostly useful for typical increments/decrements
  • Or just as an example
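
The register and immediate formats above can be sketched as a field decoder. The slides only state that instructions are 16 bits wide and that register fields are 3 bits; the 7-bit opcode width, the field order (opcode, DR, SA, then SB or the immediate), and every name below are assumptions made for illustration.

```python
from typing import NamedTuple


class Fields(NamedTuple):
    opcode: int    # assumed 7 bits: what remains of 16 after three 3-bit fields
    dr: int        # destination register, 3 bits
    sa: int        # source register A, 3 bits
    sb_or_op: int  # source register B, or the 3-bit immediate operand


def decode(instr: int) -> Fields:
    """Split a 16-bit instruction word into its fields (assumed layout)."""
    if not 0 <= instr <= 0xFFFF:
        raise ValueError("instruction must fit in 16 bits")
    return Fields(
        opcode=(instr >> 9) & 0x7F,
        dr=(instr >> 6) & 0x7,
        sa=(instr >> 3) & 0x7,
        sb_or_op=instr & 0x7,
    )


# Hypothetical word: opcode 2, DR = 1, SA = 2, SB (or immediate) = 3
word = (2 << 9) | (1 << 6) | (2 << 3) | 3
print(decode(word))
```

Because each field sits at a fixed bit position, decoding is only shifting and masking, which is why the register fields need no logic in hardware.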

7
Branching
  • PC relative branching
  • The 6 bits are sign extended to 16
  • Opcode might specify branch on zero, if register
    SA is zero
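
A minimal sketch of the sign extension and the branch-on-zero decision described above. It assumes a word-addressed PC, that the offset is added to the current PC, and that a branch not taken falls through to PC + 1; the function names are hypothetical.

```python
def sign_extend(value: int, bits: int = 6, width: int = 16) -> int:
    """Sign-extend a `bits`-wide two's-complement field to `width` bits."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):      # sign bit set: the offset is negative
        value -= 1 << bits
    return value & ((1 << width) - 1)  # back to an unsigned 16-bit pattern


def next_pc(pc: int, offset6: int, reg_sa: int) -> int:
    """Branch on zero: PC + offset if R[SA] == 0, otherwise PC + 1."""
    if reg_sa == 0:
        return (pc + sign_extend(offset6)) & 0xFFFF
    return (pc + 1) & 0xFFFF


print(hex(sign_extend(0b111111)))  # all ones stays all ones (-1)
print(next_pc(100, 0b111110, 0))   # offset -2, branch taken
print(next_pc(100, 0b111110, 5))   # R[SA] != 0, fall through
```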

8
Example Instructions

9
Contrast to Microoperations
  • Although they appear similar, they're not
  • Computer instructions fetched using PC
  • Branching much more general
  • Decoding of computer instructions usually more
    complex

10
Resources
  • Book implies Harvard architecture
  • Separate I and D
  • They treat I memory as ROM
  • Asynchronous

11
Single-Cycle Control
  • Datapath is the same as the example we used in the
    datapath topic
  • Next slide shows for review
  • First look at overall control
  • Then look at instruction decoder

12
Datapath Control Word

13
Arch

14
Instruction Decoder
  • Many lines (the three regs) need no logic
  • RISC Style
  • Architecture tailored so parts of inst.
    correspond to control lines

15
Control
  • Not much more to say
  • Simple, partly because decoding is so
    straightforward
  • Drawbacks
  • Some instructions, like multi-cycle shifts, can't
    be implemented without a complex datapath
  • Two memories (essentially a ROM and an async data
    memory)
  • Two cycles needed to use one memory
  • Biggest problem is delay

16
Delay in Single-Cycle Control
  • Worst case delay with reasonable components
  • Say, total 17ns
  • Could only clock at under 60 MHz (1 / 17 ns is about 58.8 MHz)
  • Pipelining is a solution
  • First let's look at multi-cycle control

17
Multi-Cycle Hardwired Control
  • Goals
  • Support more complex instructions
  • Use single memory
  • Not necessarily coupled with multi-cycle

18
Arch

19
Instruction Register
  • IL load signal for IR
  • PS, for PC control
  • Hold value multiple cycles, increment, load, etc.
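
The IL and PS signals can be pictured as the next-value rules for two registers. This is only a sketch: the slides name the signals and the hold, increment, and load behaviours, but the `PS` encoding and the function names here are hypothetical.

```python
from enum import Enum


class PS(Enum):
    """Hypothetical encoding of the PS (PC control) field."""
    HOLD = 0
    INCREMENT = 1
    LOAD = 2


def pc_next(pc: int, ps: PS, target: int = 0) -> int:
    """Value the PC takes at the next clock edge."""
    if ps is PS.HOLD:
        return pc                  # keep the value across several cycles
    if ps is PS.INCREMENT:
        return (pc + 1) & 0xFFFF
    return target & 0xFFFF         # LOAD: branch or jump target


def ir_next(ir: int, il: bool, mem_out: int) -> int:
    """The IR captures the memory output only when IL is asserted."""
    return mem_out & 0xFFFF if il else ir


print(pc_next(7, PS.HOLD), pc_next(7, PS.INCREMENT), pc_next(7, PS.LOAD, 42))
```

Holding the PC is what lets one instruction occupy several clock cycles without the fetch address moving.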

20
Single Memory
  • PC addresses memory
  • Mux M gates address
  • MM signal to select program/data address
  • Inst. stored in IR

21
Added Temporary Regs
  • Now 16x16
  • 8 not visible to user
  • New signals to address the registers

22
Sequence Control

23
Control Word

24
Control Design
  • Not hard to specify state diagram
  • Derived from definition of ISA
  • Hard to design logic manually
  • If we didn't have logic synthesis, we would probably
    use microprogramming

25
Two-Cycle Instructions
  • Simplest instructions have 2 cycles
  • Fetch (instruction)
  • Execute
  • This is minimum necessary
  • They assume async memory
  • Don't need extra clock cycles
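
A toy simulation of the two-state fetch/execute sequence, assuming a single asynchronous memory that returns its word in the same cycle. The opcodes (1 for add, 2 for increment), the field layout, and the `run` helper are hypothetical and chosen only to make the sketch runnable.

```python
FETCH, EXECUTE = "FETCH", "EXECUTE"


def run(memory, regs, cycles):
    """Step a toy two-state controller and return the sequence of states."""
    pc, ir, state = 0, 0, FETCH
    trace = []
    for _ in range(cycles):
        trace.append(state)
        if state == FETCH:
            ir = memory[pc]      # memory addressed by the PC, IR loaded
            state = EXECUTE      # PC holds its value during this cycle
        else:
            opcode = ir >> 9
            dr, sa, sb = (ir >> 6) & 7, (ir >> 3) & 7, ir & 7
            if opcode == 1:      # hypothetical ADD: R[DR] <- R[SA] + R[SB]
                regs[dr] = (regs[sa] + regs[sb]) & 0xFFFF
            elif opcode == 2:    # hypothetical INC: R[DR] <- R[SA] + 1
                regs[dr] = (regs[sa] + 1) & 0xFFFF
            pc = (pc + 1) & 0xFFFF
            state = FETCH
    return trace


memory = [(1 << 9) | (0 << 6) | (1 << 3) | 2,   # ADD R0, R1, R2
          (2 << 9) | (0 << 6) | (0 << 3) | 0]   # INC R0, R0
regs = [0, 5, 7, 0, 0, 0, 0, 0]
print(run(memory, regs, 4), regs[0])
```

Two instructions take four cycles here: each one spends a cycle in fetch and a cycle in execute, the minimum the slide describes.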

26
Basic Inst.

27
Branch
  • Test and modify PC

28
Next Step
  • Make a table or write Verilog from ASM diagram
    and instruction descriptions
  • Tedious, but not hard
  • Same as you've done, with more details
  • State machine easy for these instructions

29
Table from ISA and ASM

30
Load Register Indirect
  • Three cycles
  • Temporary register used

31
Shift
  • Shift right/left multiple
  • R[SA] is the register to be shifted
  • First tested for 0
  • R9 loaded with shift length
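
The multi-cycle shift can be sketched as a loop that moves one bit position per pass. The slides give only the outline (test the operand for zero, load R9 with the shift length); the temporary variable, the cycle accounting, and the function name below are assumptions, not the book's exact state table.

```python
def shift_right_multiple(value: int, length: int):
    """Return (result, execute_cycles) for a one-bit-per-pass shifter."""
    cycles = 1                  # copy the operand and test it for zero
    if value == 0:
        return 0, cycles        # nothing to shift, finish early
    tmp = value & 0xFFFF        # hypothetical temporary holding the operand
    count = length              # counter register (R9 in the slides)
    cycles += 1                 # load the counter and test it for zero
    while count != 0:
        tmp >>= 1               # one cycle: shift one position
        count -= 1              # one cycle: decrement and test the counter
        cycles += 2
    cycles += 1                 # write the result to the destination register
    return tmp, cycles


print(shift_right_multiple(0b1000, 3))
```

Under this accounting the execute time grows with the shift length, which is the kind of instruction a single-cycle design cannot accommodate.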

32
Multi-Cycle Table

33
Summary: Multi-Cycle
  • Multi-cycle computer enables more complex
    instructions
  • May also be faster
  • We'll also briefly look at pipelined computers:
    more parallelism, but more complex control

34
Limits to Clock Period
  • Conventional datapath
  • 12 ns delay, so maximum is 83 MHz clock
  • Maybe have even tighter constraints due to
    control logic

35
Pipelining
  • Break datapath into stages
  • Add registers between stages
  • Like a production line; the book uses a car wash example
  • Wash, rinse, dry

36
Latency vs Throughput
  • Latency, the amount of time it takes to execute
    an instruction, does not improve
  • In fact, typically increases
  • Throughput, the number of instructions executed
    per second, increases
  • By almost the number of pipeline stages

37
Expected Performance
  • Longest stage is 5ns
  • So clock can be 200 MHz
  • Not 3 x 83 MHz. Why?
  • Latency is 15ns
  • Also extra hardware
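
The numbers on this slide and the earlier 12 ns figure can be checked with a few lines of arithmetic. The variable names are illustrative. The comparison suggests one answer to the "Why?": a 3x speedup would need every stage to fit in 12 / 3 = 4 ns, but the clock is set by the longest stage, which is 5 ns.

```python
def clock_mhz(period_ns: float) -> float:
    """Maximum clock frequency in MHz for a given period in ns."""
    return 1000.0 / period_ns


conventional_ns = 12.0  # delay of the conventional datapath
stage_ns = 5.0          # longest pipeline stage
stages = 3

conventional_mhz = clock_mhz(conventional_ns)
pipelined_mhz = clock_mhz(stage_ns)
latency_ns = stages * stage_ns
speedup = pipelined_mhz / conventional_mhz

print(f"conventional: {conventional_mhz:.1f} MHz")
print(f"pipelined:    {pipelined_mhz:.1f} MHz")
print(f"latency:      {latency_ns:.0f} ns (was {conventional_ns:.0f} ns)")
print(f"speedup:      {speedup:.2f}x, not {stages}x")
```

Throughput rises by a factor of about 2.4 while the latency of a single instruction grows from 12 ns to 15 ns.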

38
Datapath
  • 3 Stages
  • Operand Fetch
  • Execute
  • Write Back
  • Note that WB register is the register file (same
    as at top)

39
Pipelined Execution
  • Note pipeline fill and empty
  • Efficiency is not 100%
  • Important to not stall pipeline
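
Pipeline fill and empty can be made concrete with a small schedule generator. It assumes the three stages named earlier (abbreviated OF, EX, WB), one instruction issued per cycle, and no stalls; the helper names are hypothetical.

```python
STAGES = ["OF", "EX", "WB"]  # operand fetch, execute, write back


def schedule(n: int) -> list:
    """One row per instruction, one column per clock cycle."""
    total = n + len(STAGES) - 1 if n else 0
    rows = []
    for i in range(n):
        row = ["--"] * total
        for s, name in enumerate(STAGES):
            row[i + s] = name   # instruction i enters stage s in cycle i + s
        rows.append(" ".join(row))
    return rows


def efficiency(n: int) -> float:
    """Fraction of cycles that complete an instruction, without stalls."""
    return n / (n + len(STAGES) - 1) if n else 0.0


for line in schedule(4):
    print(line)
print(f"efficiency for 4 instructions: {efficiency(4):.2f}")
```

The first and last two cycles are the fill and empty periods; efficiency approaches 100% only for long runs of instructions, and any stall lowers it further.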