EEL-4713 Computer Architecture: Designing a Single Cycle Datapath

Slides: 41
Provided by: annEceUfl

Transcript and Presenter's Notes

1
EEL-4713 Computer Architecture: Designing a
Single Cycle Datapath
2
Outline
  • Introduction
  • The steps of designing a processor
  • Datapath and timing for register-register
    operations
  • Datapath for logical operations with immediates
  • Datapath for load and store operations
  • Datapath for branch and jump operations

3
Big Picture
  • The five classic components of a computer
  • Today's topic: design of a single cycle processor

4
The Big Picture: The Performance Perspective
  • Performance of a machine is determined by
      • Instruction count
      • Clock cycle time
      • Clock cycles per instruction (CPI), discussed
        later
  • Processor design determines
      • Clock cycle time
      • Clock cycles per instruction
  • Single cycle processor
      • Advantage: one clock cycle per instruction
      • Disadvantage: long cycle time
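
The three factors listed on this slide multiply to give execution time. A minimal sketch of that product follows; the function name and every number in it are hypothetical, chosen only to illustrate the single cycle trade-off (CPI of 1, but a long cycle).

```python
def execution_time_ns(instruction_count, cpi, cycle_time_ns):
    """Execution time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_ns


# Hypothetical numbers, not taken from the slides.
# Single cycle design: CPI is 1, but the cycle must fit the slowest instruction.
single_cycle = execution_time_ns(1_000_000, cpi=1, cycle_time_ns=8.0)
# A design with a shorter cycle but several cycles per instruction.
multi_cycle = execution_time_ns(1_000_000, cpi=4, cycle_time_ns=2.5)

print(single_cycle, multi_cycle)
```

Which design wins depends entirely on the numbers; the sketch only shows that a low CPI does not guarantee a short execution time.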

5
How to Design a Processor: step-by-step
  • 1. Analyze instruction set => datapath
    requirements
      • The meaning of each instruction is given by the
        register transfers
      • The datapath must include storage elements for
        ISA registers (and possibly more)
      • The datapath must support each register transfer
  • 2. Select set of datapath components and
    establish clocking methodology
  • 3. Assemble datapath meeting the requirements
  • 4. Analyze implementation of each instruction to
    determine setting of control points that effects
    the register transfer
  • 5. Assemble the control logic

6
MIPS ISA instruction formats
  • All MIPS instructions are 32 bits long. There
    are 3 instruction formats:
      • R-type
      • I-type
      • J-type
  • The different fields are:
      • op: operation of the instruction
      • rs, rt, rd: the source(s) and destination
        register specifiers
      • shamt: shift amount
      • funct: selects the variant of the operation in
        the op field
      • address / immediate: address offset or immediate
        value
      • target address: target address of the jump
        instruction
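
As an illustration of the three formats, the sketch below pulls every field out of a 32-bit instruction word with shifts and masks. The bit positions are the standard MIPS ones (op in bits 31:26, rs 25:21, rt 20:16, rd 15:11, shamt 10:6, funct 5:0, immediate 15:0, target 25:0); the function name `decode` and the dictionary layout are assumptions made for this example.

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into all possible fields.

    Which fields are meaningful depends on the format (R, I, or J).
    """
    return {
        "op":     (instr >> 26) & 0x3F,   # bits 31:26, all formats
        "rs":     (instr >> 21) & 0x1F,   # bits 25:21, R- and I-type
        "rt":     (instr >> 16) & 0x1F,   # bits 20:16, R- and I-type
        "rd":     (instr >> 11) & 0x1F,   # bits 15:11, R-type
        "shamt":  (instr >> 6) & 0x1F,    # bits 10:6,  R-type
        "funct":  instr & 0x3F,           # bits 5:0,   R-type
        "imm16":  instr & 0xFFFF,         # bits 15:0,  I-type
        "target": instr & 0x03FFFFFF,     # bits 25:0,  J-type
    }


# addu $3, $1, $2 as an R-type word: op = 0, funct = 0x21
word = (0 << 26) | (1 << 21) | (2 << 16) | (3 << 11) | (0 << 6) | 0x21
fields = decode(word)
print(fields["rs"], fields["rt"], fields["rd"], hex(fields["funct"]))
```

Decoding every field unconditionally mirrors the hardware, where the wires for all fields exist and control decides which ones are used.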

7
Step 1a: The MIPS-lite subset for today
  • ADD and SUB
      • addU rd, rs, rt
      • subU rd, rs, rt
  • OR Immediate
      • ori rt, rs, imm16
  • LOAD and STORE Word
      • lw rt, rs, imm16
      • sw rt, rs, imm16
  • BRANCH
      • beq rs, rt, imm16

9
Logical Register Transfers
  • RTL gives the meaning of the instructions
  • All start by fetching the instruction

R-type:  op | rs | rt | rd | shamt | funct  =  MEM[ PC ]
I-type:  op | rs | rt | Imm16               =  MEM[ PC ]

inst     Register Transfers
ADDU     R[rd] <- R[rs] + R[rt];                   PC <- PC + 4
SUBU     R[rd] <- R[rs] - R[rt];                   PC <- PC + 4
ORi      R[rt] <- R[rs] | zero_ext(Imm16);         PC <- PC + 4
LOAD     R[rt] <- MEM[ R[rs] + sign_ext(Imm16) ];  PC <- PC + 4
STORE    MEM[ R[rs] + sign_ext(Imm16) ] <- R[rt];  PC <- PC + 4
BEQ      if ( R[rs] == R[rt] )
             then PC <- PC + 4 + ( sign_ext(Imm16) || 00 )
             else PC <- PC + 4
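
The register transfers for this subset can be executed directly. The following is a minimal sketch, not a definitive model: the `state` dictionary, the tuple form of an instruction, and the helper names are all assumptions made for illustration, memory is a word-addressed dictionary keyed by byte address, and the MIPS convention that register 0 always reads as zero is ignored.

```python
MASK32 = 0xFFFFFFFF


def zero_ext(imm16):
    return imm16 & 0xFFFF


def sign_ext(imm16):
    """Interpret the low 16 bits as a signed two's complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def step(state, inst):
    """Apply the register transfers of one instruction to state, in place."""
    R, MEM = state["R"], state["MEM"]
    name = inst[0]
    next_pc = state["PC"] + 4
    if name == "addu":
        _, rd, rs, rt = inst
        R[rd] = (R[rs] + R[rt]) & MASK32
    elif name == "subu":
        _, rd, rs, rt = inst
        R[rd] = (R[rs] - R[rt]) & MASK32
    elif name == "ori":
        _, rt, rs, imm = inst
        R[rt] = R[rs] | zero_ext(imm)
    elif name == "lw":
        _, rt, rs, imm = inst
        R[rt] = MEM.get((R[rs] + sign_ext(imm)) & MASK32, 0)
    elif name == "sw":
        _, rt, rs, imm = inst
        MEM[(R[rs] + sign_ext(imm)) & MASK32] = R[rt]
    elif name == "beq":
        _, rs, rt, imm = inst
        if R[rs] == R[rt]:
            next_pc += sign_ext(imm) << 2   # sign_ext(Imm16) || 00
    else:
        raise ValueError(f"not in the subset: {name}")
    state["PC"] = next_pc & MASK32


state = {"R": [0] * 32, "MEM": {}, "PC": 0}
program = [
    ("ori", 1, 0, 5),     # R[1] <- 5
    ("ori", 2, 0, 7),     # R[2] <- 7
    ("addu", 3, 1, 2),    # R[3] <- 12
    ("sw", 3, 0, 16),     # MEM[16] <- 12
    ("lw", 4, 0, 16),     # R[4] <- 12
    ("subu", 5, 1, 2),    # R[5] <- 5 - 7, wrapped to 32 bits
]
for inst in program:
    step(state, inst)
print(state["R"][3], state["R"][4], hex(state["R"][5]), state["PC"])
```

Every branch of `step` ends by updating the PC, which matches the observation that all instructions share the fetch and next-address transfers.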
10
Step 1: Requirements of the Instruction Set
  • Memory
      • instruction and data
  • Registers (32 x 32)
      • read RS
      • read RT
      • write RT or RD
  • PC
  • Extender
  • Add and Sub register or extended immediate
  • Add 4 or extended immediate to PC

11
Step 2: Components of the Datapath
  • Combinational Elements
  • Storage Elements
  • Clocking methodology

12
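Combinational Logic Elements (Basic Building
Blocks)
  • Adder
  • MUX
  • ALU

[Figure: Adder with 32-bit inputs A and B plus CarryIn,
producing a 32-bit Sum and Carry; MUX with 32-bit inputs
A and B plus Select, producing 32-bit output Y; ALU with
32-bit inputs A and B plus OP, producing a 32-bit Result]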
13
Storage Element: Register (Basic Building Block)
  • Register: similar to the D flip-flop except
      • N-bit input and output
      • Write Enable input
  • Write Enable
      • negated (0, not asserted): Data Out will not
        change
      • asserted (1): Data Out will become Data In on the
        next triggering clock edge

[Figure: N-bit register with Data In, Data Out, Write
Enable, and Clk]
14
Storage Element: Register File
  • Register File consists of 32 registers
      • Two 32-bit output busses: busA and busB
      • One 32-bit input bus: busW
  • Register is selected by
      • Ra (number) selects the register to put on busA
        (data)
      • Rb (number) selects the register to put on busB
        (data)
      • Rw (number) selects the register to be written
        via busW (data) when Write Enable is 1
  • Clock input (CLK)
      • The CLK input is a factor ONLY during write
        operation
      • Read operations behave as a combinational logic
        block (i.e., reads are not clocked)
      • Ra or Rb valid => busA or busB valid after
        access time

[Figure: register file of 32 32-bit registers with 5-bit
selects Rw, Ra, Rb; 32-bit busses busA, busB, busW; Write
Enable and Clk inputs]
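
The behavior described on this slide (combinational reads, writes only on a clock edge with Write Enable set) can be modeled in a few lines. This is a behavioral sketch under assumptions: the class and method names are invented for this example, and access time is not modeled.

```python
class RegisterFile:
    """32 x 32-bit register file: two read ports, one clocked write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        # Reads are combinational: busA and busB follow Ra and Rb
        # without waiting for a clock edge.
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw, bus_w, write_enable):
        # State changes only on the clock edge, and only when
        # Write Enable is 1.
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(rw=5, bus_w=0x1234, write_enable=1)
rf.clock_edge(rw=6, bus_w=0xFFFF, write_enable=0)   # ignored: not enabled
bus_a, bus_b = rf.read(5, 6)
print(hex(bus_a), hex(bus_b))
```

Separating `read` from `clock_edge` keeps the one place where state changes easy to find, which is the point of the clocking methodology.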
15
Storage Element: Idealized Memory
  • Memory (idealized)
      • One input bus: Data In
      • One output bus: Data Out
  • Memory word is selected by
      • Address selects the word to put on Data Out
      • Write Enable = 1 -> Address selects the memory
        word to be written via the Data In bus
  • Clock input (CLK)
      • The CLK input is a factor ONLY during write
        operation
      • Read operations behave as a combinational logic
        block (i.e., reads are not clocked)
      • Address valid => Data Out valid after access
        time

[Figure: memory block with Address, 32-bit Data In,
32-bit Data Out, Write Enable, and Clk]
16
Clocking Methodology
  • All storage elements are clocked by the same
    clock edge
  • Cycle Time = Hold + Longest Delay Path + Setup +
    Clock Skew

[Figure: clock waveform (Clk) with Setup and Hold windows
around each clock edge and Don't Care regions between them]

17
Step 3
  • Register Transfer Requirements => Datapath
    Assembly
  • Instruction Fetch
  • Read Operands and Execute Operation

18
3a: Overview of the Instruction Fetch Unit
  • The common RTL operations
      • Fetch the instruction: mem[PC]
      • Update the program counter
          • Sequential code: PC <- PC + 4
          • Branch and Jump: PC <- "something else"

[Figure: instruction fetch unit producing the 32-bit
Instruction Word]
19
Next Address Logic: No Branching
[Figure: next address logic built from an adder (ADD)
with the constant 4 as one input]
20
RTL: The ADD Instruction
  • add rd, rs, rt
      • Fetch the instruction from memory:
        op | rs | rt | rd | shamt | funct <- mem[PC]
      • The actual operation:
        R[rd] <- R[rs] + R[rt]
      • Calculate the next instruction's address:
        PC <- PC + 4

21
RTL: The Subtract Instruction
  • sub rd, rs, rt
      • Fetch the instruction from memory:
        op | rs | rt | rd | shamt | funct <- mem[PC]
      • The actual operation:
        R[rd] <- R[rs] - R[rt]
      • Calculate the next instruction's address:
        PC <- PC + 4

22
3b: Add and Subtract
  • R[rd] <- R[rs] op R[rt]   Example: addU rd,
    rs, rt
      • Ra, Rb, and Rw come from the instruction's rs,
        rt, and rd fields
      • ALUctr and RegWr: control logic after decoding
        the instruction

[Figure: register file (32 32-bit registers) with Rs, Rt,
Rd driving the 5-bit selects Ra, Rb, Rw; 32-bit busA and
busB feed the ALU; the 32-bit ALU Result returns on busW;
control signals ALUctr and RegWr; Clk]
23
Register-Register Timing
[Figure: timing diagram over one clock cycle. Each signal
moves from Old Value to New Value in turn: PC (after
Hold); Rs, Rt, Rd, Op, Func (after Instruction Memory
Access Time); ALUctr and RegWr (after Delay through
Control Logic); busA and busB (after Register File Access
Time); busW (after ALU Delay). The end of the cycle is
marked "Register Write Occurs Here". The register-register
datapath figure is repeated below the diagram.]
24
RTL: The OR Immediate Instruction
  • ori rt, rs, imm16
      • Fetch the instruction from memory:
        op | rs | rt | Imm16 <- mem[PC]
      • The OR operation:
        R[rt] <- R[rs] OR ZeroExt(imm16)
      • Calculate the next instruction's address:
        PC <- PC + 4

25
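3c: Logical Operations with Immediate
  • R[rt] <- R[rs] op ZeroExt[imm16]

[Figure: register-register datapath extended with a mux
controlled by RegDst that selects Rt or Rd as Rw, and a
mux controlled by ALUSrc that selects busB or
ZeroExt(imm16), widened from 16 to 32 bits, as the second
ALU input; control signals RegDst, RegWr, ALUctr, ALUSrc]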
26
RTL: The Load Instruction
  • lw rt, rs, imm16
      • Fetch the instruction from memory:
        op | rs | rt | Imm16 <- mem[PC]
      • Calculate the memory address:
        Addr <- R[rs] + SignExt(imm16)
      • Load the data into the register:
        R[rt] <- Mem[Addr]
      • Calculate the next instruction's address:
        PC <- PC + 4

27
3d: Load Operations
  • R[rt] <- Mem[R[rs] + SignExt[imm16]]   Example: lw
    rt, rs, imm16

[Figure: datapath with Data Memory added. The ALU output
drives the memory address (Adr); a mux controlled by W_Src
selects what is placed on busW; the Extender, controlled
by ExtOp, widens imm16 from 16 to 32 bits. Control
signals: RegDst, RegWr, ALUctr, ALUSrc, ExtOp, MemWr,
W_Src]
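
The Extender is what lets one datapath serve both ori (zero extension) and lw/sw (sign extension). A minimal sketch follows. The encoding of ExtOp (0 for zero extension, 1 for sign extension) is an assumption, since the slides do not state it, and both function names are hypothetical.

```python
def extender(imm16, ext_op):
    """Widen a 16-bit immediate to 32 bits.

    Assumed encoding: ext_op = 0 selects zero extension,
    ext_op = 1 selects sign extension.
    """
    imm16 &= 0xFFFF
    if ext_op and (imm16 & 0x8000):
        return imm16 | 0xFFFF0000   # replicate the sign bit into bits 31:16
    return imm16


def load_address(r_rs, imm16):
    """Addr <- R[rs] + SignExt(imm16), wrapped to 32 bits."""
    return (r_rs + extender(imm16, ext_op=1)) & 0xFFFFFFFF


print(hex(extender(0x8000, 0)), hex(extender(0x8000, 1)))
print(hex(load_address(0x1000, 0xFFFC)))   # offset of -4
```

Because the result is kept as an unsigned 32-bit pattern, adding a sign-extended negative offset and masking to 32 bits gives the same address as signed arithmetic would.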
28
3e: Store Operations
  • Mem[R[rs] + SignExt[imm16]] <- R[rt]   Example:
    sw rt, rs, imm16

[Figure: the load datapath with busB also connected to
the Data In port of Data Memory; MemWr drives the memory
write enable (WrEn). Control signals: RegDst, RegWr,
ALUctr, ALUSrc, ExtOp, MemWr, W_Src]
29
3f: The Branch Instruction
  • beq rs, rt, imm16
      • Fetch the instruction from memory:
        op | rs | rt | Imm16 <- mem[PC]
      • Calculate the branch condition:
        Equal <- (R[rs] == R[rt])
      • Calculate the next instruction's address:
        if (Equal)
            PC <- PC + 4 + ( SignExt(imm16) x 4 )
        else
            PC <- PC + 4

30
Datapath for Branch Operations
  • beq rs, rt, imm16: the datapath generates the
    condition (Equal)

[Figure: instruction fetch unit with the Branch control
input, the Inst Address, and the 32-bit Instruction Word]
31
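Putting it All Together: A Single Cycle Datapath
[Figure: complete single cycle datapath. Instruction<31:0>
is split into Rs <21:25>, Rt <16:20>, Rd <11:15>, and
Imm16 <0:15>. A 0/1 mux controlled by RegDst picks Rt or
Rd as Rw for the register file (32 32-bit registers, with
Ra, Rb, busA, busB, busW). A mux controlled by ALUSrc
feeds the ALU from busB or from the Extender (imm16, 16 to
32 bits, controlled by ExtOp). The ALU reports Equal. Data
Memory has WrEn, Adr, and Data In ports, and a mux
controlled by MemtoReg drives busW. Control signals:
Branch, RegDst, RegWr, ALUctr, ALUSrc, ExtOp, MemWr,
MemtoReg. PC, register file, and Data Memory are clocked
by Clk.]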
32
Next Address Logic: With Branching
  • How would you add jump to this?

[Figure: two adders (ADD), one with the constant 4 and
one taking the Sign Extended Immediate, feeding a 0/1 mux
whose select is labeled "Branch? Equal?"]
33
An Abstract View of the Critical Path
  • Register file and ideal memory
      • The CLK input is a factor ONLY during write
        operation
      • During read operation, they behave as
        combinational logic
      • Address valid => Output valid after access time

Critical Path (Load Operation) =
    PC's Hold +
    Instruction Memory's Access Time +
    Register File's Access Time +
    ALU to Perform a 32-bit Add +
    Data Memory Access Time +
    Setup Time for Register File Write +
    Clock Skew

[Figure: abstract datapath. Ideal Instruction Memory
takes the Instruction Address and outputs the Instruction
(Rd, Rs, Rt at 5 bits each, Imm at 16 bits); the register
file (32 32-bit registers, with Rw, Ra, Rb) feeds A and B;
Ideal Data Memory takes the Data Address and Data In; Next
Address logic updates the PC; all state is clocked by Clk]
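
The critical path sum above sets the clock cycle for every instruction, not just for loads. The sketch below adds up the terms for a load and for a register-register instruction. Every delay value is hypothetical (picoseconds chosen only for illustration), and the dictionary keys are names invented for this example.

```python
def path_delay(delays):
    """Sum the component delays along one path, in picoseconds."""
    return sum(delays.values())


# Hypothetical delays in picoseconds; none of these come from the slides.
load_path = {
    "pc_hold": 50,
    "instruction_memory_access": 2000,
    "register_file_access": 1000,
    "alu_32bit_add": 2000,
    "data_memory_access": 2000,
    "register_file_setup": 50,
    "clock_skew": 100,
}

# A register-register instruction skips the data memory access.
r_type_path = {k: v for k, v in load_path.items() if k != "data_memory_access"}

# In a single cycle design the clock must fit the slowest instruction.
cycle_time = max(path_delay(load_path), path_delay(r_type_path))
print(path_delay(load_path), path_delay(r_type_path), cycle_time)
```

With these made-up numbers the register-register instruction would finish well before the clock edge, which is the long cycle time disadvantage of the single cycle design.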
34
Binary arithmetic for the next address
  • In theory, the PC is a 32-bit byte address into
    the instruction memory
      • Sequential operation: PC<31:0> = PC<31:0> + 4
      • Branch operation: PC<31:0> = PC<31:0> + 4 +
        SignExt[Imm16] * 4
  • The magic number 4 always comes up because
      • The 32-bit PC is a byte address
      • And all our instructions are 4 bytes (32 bits)
        long
  • In other words
      • The 2 LSBs of the 32-bit PC are always zeros
      • There is no reason to have hardware to keep the 2
        LSBs
  • In practice, we can simplify the hardware by
    using a 30-bit PC<31:2>
      • Sequential operation: PC<31:2> = PC<31:2> + 1
      • Branch operation: PC<31:2> = PC<31:2> + 1 +
        SignExt[Imm16]
      • In either case: Instruction Memory Address =
        PC<31:2> concat "00"
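
The claim that a 30-bit PC loses nothing can be checked numerically. This sketch computes the next address both ways and compares the resulting instruction memory addresses; the function names are hypothetical, and `taken` stands for the branch being taken.

```python
def sign_ext16(imm16):
    """Interpret the low 16 bits as a signed two's complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc32(pc, imm16, taken):
    """32-bit byte-address form: PC + 4, plus SignExt(Imm16) * 4 if taken."""
    target = pc + 4
    if taken:
        target += sign_ext16(imm16) * 4
    return target & 0xFFFFFFFF


def next_pc30(pc30, imm16, taken):
    """30-bit word-address form: PC<31:2> + 1, plus SignExt(Imm16) if taken."""
    target = pc30 + 1
    if taken:
        target += sign_ext16(imm16)
    return target & 0x3FFFFFFF


def fetch_address(pc30):
    """Instruction Memory Address = PC<31:2> concat 00."""
    return pc30 << 2


pc = 0x00400010
for imm in (0x0000, 0x0005, 0xFFFF, 0x8000):
    for taken in (False, True):
        assert fetch_address(next_pc30(pc >> 2, imm, taken)) == next_pc32(pc, imm, taken)
print("30-bit and 32-bit next-address forms agree")
```

The two forms agree because every term in the 32-bit version is a multiple of 4, so dividing the whole computation by 4 changes nothing except the width of the adder.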

36
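Next Address Logic: Expensive and Fast Solution
  • Using a 30-bit PC
      • Sequential operation: PC<31:2> = PC<31:2> + 1
      • Branch operation: PC<31:2> = PC<31:2> + 1 +
        SignExt[Imm16]
      • In either case: Instruction Memory Address =
        PC<31:2> concat "00"

[Figure: 30-bit PC driving Addr<31:2> of the Instruction
Memory, with Addr<1:0> tied to "00". One 30-bit adder adds
1 to the PC; a second adds SignExt(imm16), taken from
Instruction<15:0> and widened from 16 to 30 bits; a mux
whose select is labeled "Branch? Equal?" picks the next
PC. The memory outputs the 32-bit Instruction<31:0>.]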
37
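Next Address Logic: Cheap and Slow Solution
  • Why is this slow?
      • Cannot start the address add until Zero (output
        of ALU) is valid
  • Does it matter that this is slow in the overall
    scheme of things?
      • Probably not here. Critical path is the load
        operation.

[Figure: 30-bit PC driving Addr<31:2> of the Instruction
Memory, with Addr<1:0> tied to "00". A single 30-bit adder
with Carry In set to 1 adds either 0 or SignExt(imm16),
taken from Instruction<15:0> and widened from 16 to 30
bits; the choice is made by a mux controlled by Branch and
Zero. The memory outputs the 32-bit Instruction<31:0>.]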
38
RTL: The Jump Instruction
  • j target
      • Fetch the instruction from memory:
        mem[PC]
      • Calculate the next instruction's address:
        PC<31:2> <- PC<31:28> concat target<25:0>
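
The jump transfer is a pure bit concatenation, with no adder involved. The sketch below follows the RTL on this slide using the 30-bit PC; the function name is hypothetical, and the masks assume that PC<31:28> sits in bits 29 to 26 of the 30-bit value.

```python
def jump_pc30(pc30, target26):
    """PC<31:2> <- PC<31:28> concat target<25:0>, on a 30-bit PC."""
    upper4 = pc30 & 0x3C000000            # PC<31:28> is bits 29..26 of the 30-bit PC
    return upper4 | (target26 & 0x03FFFFFF)


# Jump from byte address 0x10000000 to word 0x100 of the same 256 MB region.
new_pc30 = jump_pc30(0x10000000 >> 2, 0x100)
print(hex(new_pc30 << 2))   # byte address sent to instruction memory
```

Because only the low 26 bits of the word address are replaced, a jump can reach any instruction in the current 256 MB region but cannot leave it.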

40
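Instruction Fetch Unit
  • j target
      • PC<31:2> <- PC<31:28> concat target<25:0>

[Figure: the 30-bit next address logic with a jump path
added. PC<31:28> (4 bits) is concatenated with the 26-bit
Target from Instruction<25:0>, and a mux controlled by
Jump selects it over the branch path (adder, SignExt of
imm16 from Instruction<15:0>, Branch and Zero). The PC
drives Addr<31:2> of the Instruction Memory with Addr<1:0>
tied to "00"; the memory outputs Instruction<31:0>.]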
41
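Putting it All Together: A Single Cycle Datapath
  • We have everything except control signals
    (underlined)

[Figure: complete single cycle datapath with the
Instruction Fetch Unit (inputs Branch, Jump, Zero, Clk;
output Instruction<31:0>). The instruction is split into
Rs <21:25>, Rt <16:20>, Rd <11:15>, and Imm16 <0:15>. A
0/1 mux controlled by RegDst picks Rt or Rd as Rw for the
register file (32 32-bit registers, with Ra, Rb, busA,
busB, busW). A mux controlled by ALUSrc feeds the ALU from
busB or from the Extender (imm16, 16 to 32 bits,
controlled by ExtOp). Data Memory has WrEn, Adr, and Data
In ports, and a mux controlled by MemtoReg drives busW.
Control signals: Branch, Jump, RegDst, RegWr, ALUctr,
ALUSrc, ExtOp, MemWr, MemtoReg.]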
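
One way to see how the control signals steer this datapath is to evaluate it for one cycle with the signals passed in explicitly. The sketch below is an illustration under assumptions, not the control design, which the slides defer to the next lecture: the mux polarities (RegDst = 1 selects Rd, ALUSrc = 1 selects the immediate, MemtoReg = 1 selects memory, ExtOp = 1 sign-extends), the string values for ALUctr, and the three control settings are all assumed, and the instruction fetch unit is left out.

```python
MASK32 = 0xFFFFFFFF


def mux(select, in0, in1):
    """2-to-1 multiplexer: select = 0 passes in0, select = 1 passes in1."""
    return in1 if select else in0


def extender(imm16, ext_op):
    """Assumed encoding: ext_op = 1 sign-extends, ext_op = 0 zero-extends."""
    imm16 &= 0xFFFF
    if ext_op and (imm16 & 0x8000):
        return imm16 | 0xFFFF0000
    return imm16


def alu(alu_ctr, a, b):
    if alu_ctr == "add":
        return (a + b) & MASK32
    if alu_ctr == "sub":
        return (a - b) & MASK32
    if alu_ctr == "or":
        return a | b
    raise ValueError(f"unknown ALUctr: {alu_ctr}")


def cycle(regs, mem, rs, rt, rd, imm16, ctl):
    """Evaluate the datapath for one clock cycle; returns the Zero flag."""
    bus_a, bus_b = regs[rs], regs[rt]                  # combinational reads
    rw = mux(ctl["RegDst"], rt, rd)                    # destination register
    alu_in_b = mux(ctl["ALUSrc"], bus_b, extender(imm16, ctl["ExtOp"]))
    result = alu(ctl["ALUctr"], bus_a, alu_in_b)
    zero = int(result == 0)
    data_out = mem.get(result, 0)                      # combinational read
    bus_w = mux(ctl["MemtoReg"], result, data_out)
    # Clock edge: the only place where state changes.
    if ctl["MemWr"]:
        mem[result] = bus_b
    if ctl["RegWr"]:
        regs[rw] = bus_w
    return zero


# Assumed control settings for three instructions of the subset.
LW = dict(RegDst=0, ALUSrc=1, ExtOp=1, ALUctr="add", MemtoReg=1, MemWr=0, RegWr=1)
ADDU = dict(RegDst=1, ALUSrc=0, ExtOp=0, ALUctr="add", MemtoReg=0, MemWr=0, RegWr=1)
BEQ = dict(RegDst=0, ALUSrc=0, ExtOp=0, ALUctr="sub", MemtoReg=0, MemWr=0, RegWr=0)

regs = [0] * 32
regs[1] = 0x100
mem = {0x104: 42}
cycle(regs, mem, rs=1, rt=2, rd=0, imm16=4, ctl=LW)      # lw   $2, 4($1)
cycle(regs, mem, rs=1, rt=2, rd=3, imm16=0, ctl=ADDU)    # addu $3, $1, $2
zero = cycle(regs, mem, rs=2, rt=2, rd=0, imm16=0, ctl=BEQ)
print(regs[2], regs[3], zero)
```

The same `cycle` function runs every instruction; only the `ctl` dictionary differs, which is the sense in which the datapath is complete once the control signals are supplied.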
42
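An Abstract View of the Implementation
  • Logical vs. Physical Structure

[Figure: a Control block receives the Instruction and
Conditions and sends Control Signals to the Datapath. The
Datapath contains the Ideal Instruction Memory
(Instruction Address in, Instruction out, with Rd, Rs, Rt
at 5 bits each), the register file (32 32-bit registers,
with Rw, Ra, Rb, outputs A and B), the Ideal Data Memory
(Data Address, Data In, Data Out), and the Next Address
logic; all state is clocked by Clk]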

43
Summary
  • 5 steps to design a processor
      • 1. Analyze instruction set => datapath
        requirements
      • 2. Select set of datapath components and
        establish clock methodology
      • 3. Assemble datapath meeting the requirements
      • 4. Analyze implementation of each instruction to
        determine setting of control points that effects
        the register transfer
      • 5. Assemble the control logic
  • MIPS makes it easier
      • Instructions same size
      • Source registers always in same place
      • Immediates same size, location
      • Operations always on registers/immediates
  • Single cycle datapath => CPI = 1, CCT => long
  • Next time: implementing control (Steps 4 and 5)