Title: CS 161 Computer Architecture, Chapter 5, Lecture 11
1 CS 161 Computer Architecture, Chapter 5, Lecture 11
- Instructor L.N. Bhuyan
- www.cs.ucr.edu/bhuyan
- Adapted from notes by Dave Patterson (http.cs.berkeley.edu/patterson)
2 Implementing Main Control
Main Control has one 6-bit input (op) and 9 output bits (7 signals are 1-bit, ALUOp is 2 bits). To build Main Control as a sum of products:
- (1) Construct a minterm for each different instruction (or for R-type). Each minterm corresponds to a single instruction (or to all of the R-type instructions), e.g., M(R-format), M(lw).
- (2) Determine each main control output by forming the logical OR of the relevant minterms (instructions), e.g., RegWrite = M(R-format) OR M(lw).
[Figure: Main Control block. Input: op. Outputs: RegDst, Branch, MemRead, MemtoReg, ALUOp (2 bits), MemWrite, ALUSrc, RegWrite.]
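The two-step construction above can be sketched in a few lines of Python. This is only an illustration, not the slide author's circuit: the names `OPCODES`, `minterm`, and `main_control` are made up for the example, the opcode values are the standard MIPS encodings for R-format, lw, sw, and beq, and every don't-care output is simply driven to 0.

```python
# Standard MIPS opcodes (bits 31-26); the dictionary name is hypothetical.
OPCODES = {"R-format": 0b000000, "lw": 0b100011, "sw": 0b101011, "beq": 0b000100}


def minterm(op, code):
    """AND of six literals: op bit i is used as-is where code bit i is 1,
    and complemented where code bit i is 0."""
    result = True
    for i in range(6):
        op_bit = (op >> i) & 1
        code_bit = (code >> i) & 1
        literal = op_bit if code_bit else 1 - op_bit
        result = result and bool(literal)
    return result


def main_control(op):
    """Each output is the OR of the minterms of the instructions that assert it.
    Don't-care outputs are driven to 0 in this sketch."""
    m = {name: minterm(op, code) for name, code in OPCODES.items()}
    return {
        "RegDst": int(m["R-format"]),
        "ALUSrc": int(m["lw"] or m["sw"]),
        "MemtoReg": int(m["lw"]),
        "RegWrite": int(m["R-format"] or m["lw"]),
        "MemRead": int(m["lw"]),
        "MemWrite": int(m["sw"]),
        "Branch": int(m["beq"]),
        # ALUOp is 2 bits: bit 1 set for R-format, bit 0 set for beq.
        "ALUOp": (int(m["R-format"]) << 1) | int(m["beq"]),
    }
```

For example, `main_control(0b100011)` (lw) sets RegWrite, MemRead, MemtoReg, and ALUSrc, and leaves ALUOp at 0.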
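3 Single-Cycle MIPS-lite CPU
[Figure: single-cycle datapath. Components: PC, Imem (Read Addr, Instruction), Regs (ReadReg1 = bits 25-21, ReadReg2 = bits 20-16, WriteReg, WriteData, Readdata1, Readdata2), ALU (Zero), Dmem (Address, WriteData, Readdata), two adders, constant 4, << 2 shifter, four muxes, Main Control (op = bits 31-26), ALU Control (bits 5-0, ALUOp (2)), sign-extended bits 15-0, write-register candidates bits 20-16 and 15-11. Control signals: PCSrc, Branch, MemRead, MemWrite, MemtoReg, RegDst, ALUSrc, RegWrite, ALUcon.]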
4 Fig. 5.17: Datapath with Control Signals
5 Fig. 5.18: Setting of Control Lines Depends on Opcode
6 Control Design
- Simple combinational logic (truth tables)
7 Fig. 5.19: R-type operation, add $t1, $t2, $t3
Active parts are highlighted
8 Fig. 5.20: Active parts for a load instruction
9 Fig. 5.21: Active parts for a beq instruction
10 Fig. 5.24: Extension for the jump instruction
11 Single-Cycle Machine Appraisal
- All instructions complete in one clock cycle (CPI = 1)
- Some instructions take more steps than others
  - lw is most expensive (5 steps, vs. 4 for R-type and sw, 3 for beq)
- Clock cycle must cover the longest instruction => inefficient
- Suppose mul is added?
  - 32 shift/add steps => would delay every other instruction
12 Example
- Assume 2 ns for instruction/data memory, 1 ns for decode/register read, 2 ns for ALU, and 1 ns for register write.
- Single-cycle datapath clock period = 8 ns.
- Assume an instruction mix of 24% loads, 12% stores, 44% R-format, 18% branches, and 2% jumps.
- Assuming a variable-cycle datapath, average clock period = 6.3 ns.
- Possible speed-up = 1.27
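The arithmetic behind these numbers can be checked with a short script. It is a sketch under stated assumptions: the per-class step lists follow the step counts given on the appraisal slide (5 for lw, 4 for R-type and sw, 3 for beq), and the jump is assumed to need only the instruction fetch, which the slide does not state. All names are invented for the example.

```python
# Unit latencies in ns, taken from the example.
STEP_NS = {"mem": 2, "reg_read": 1, "alu": 2, "reg_write": 1}

# Functional units used by each instruction class.
PATHS = {
    "load":     ["mem", "reg_read", "alu", "mem", "reg_write"],
    "store":    ["mem", "reg_read", "alu", "mem"],
    "R-format": ["mem", "reg_read", "alu", "reg_write"],
    "branch":   ["mem", "reg_read", "alu"],
    "jump":     ["mem"],  # ASSUMPTION: jump needs the instruction fetch only
}

# Instruction mix in percent, taken from the example.
MIX_PERCENT = {"load": 24, "store": 12, "R-format": 44, "branch": 18, "jump": 2}


def latency_ns(kind):
    """Total latency of one instruction class."""
    return sum(STEP_NS[step] for step in PATHS[kind])


# Single-cycle clock must cover the slowest instruction.
single_cycle_ns = max(latency_ns(kind) for kind in PATHS)

# Variable-cycle clock: mix-weighted average latency.
average_ns = sum(MIX_PERCENT[kind] * latency_ns(kind) for kind in PATHS) / 100

speedup = single_cycle_ns / average_ns
print(single_cycle_ns, average_ns, round(speedup, 2))
```

Under these assumptions the unrounded average is 6.34 ns and the ratio is about 1.26. The slide's 1.27 results from rounding the average to 6.3 ns before dividing.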
13 Multicycle Implementation (MIPS-lite v.2)
- Want more efficient implementation
- Each step will take one clock cycle (not each instruction): CPI > 1
  - => shorter clock cycle: cycle time constrained by longest step, not longest instruction
  - simpler instructions take fewer cycles
  - => higher overall performance
- Complex control: finite state machine
- Versatile (can extend for new instructions: add3, swap, etc.)
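14 Recap: Clocking, single-cycle vs. multicycle
[Figure: timing diagram. Single-cycle implementation: add $t0,$t1,$t2 and beq $t0,$t1,L each occupy one long clock period, with part of each period marked "waste". Multicycle implementation: the same add and beq each span several shorter clock cycles.]
- Multicycle implementation: less waste = higher performance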
15 Recap: How fast can we run the clock?
- Depends on how much we want done per clock cycle
- Can do several "inexpensive" datapath operations per clock
  - simple gates (AND, OR, ...)
  - single datapath registers (PC)
  - sign extender, left shifter, multiplexor
- PLUS exactly one "expensive" datapath operation per clock
  - ALU operation
  - Register File access (2 reads, or 1 write)
  - Memory access (read or write)
16 Multicycle Datapath (overview)
[Figure: MIPS-lite multicycle version. PC feeds the Address input of a single Memory (instruction or data); memory output goes to the Instruction Register and the Memory Data Register; Registers (ReadReg1, ReadReg2, WriteReg, Data, Readdata 1, Readdata 2) feed temporary registers A and B; the ALU result is captured in ALUOut.]
- One ALU (no extra adders)
- One Memory (no separate Imem, Dmem)
- New temporary registers (clocked; require clock input)
17 Multicycle Implementation
- Datapath changes
  - one memory for both instructions and data (because they can be accessed on separate steps)
  - one ALU (eliminate extra adders)
  - extra "invisible" registers to capture intermediate (per-step) datapath results
- Controller changes
  - controller must fire control lines in the correct sequence and at the correct time
  - => controller must remember current execution step, advance to next step
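18 Multicycle Datapath: Add Multiplexors
[Figure: multicycle datapath with multiplexors added. PC and ALUOut are muxed onto the memory Address; IR bits 25-21 and 20-16 select ReadReg1 and ReadReg2; bits 20-16 and 15-11 are muxed onto WriteReg; ALUOut and MDR are muxed onto the register WriteData; PC and A are muxed onto the first ALU input; a 4-input mux (inputs 0-3) selects B, the constant 4, sign-extended bits 15-0, or sign-extended bits 15-0 << 2 for the second ALU input.]
Note the inputs to the multiplexors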
19 Datapath Control Points
[Figure: multicycle datapath annotated with its control points.]
- Control signals: MemRead, MemWrite, IRWrite, RegWrite, PCWrite, PCWriteCond, PCSrc, IorD, RegDst, MemtoReg, ALUSrcA, ALUSrcB, ALUOp
- ALU control also receives the funct field, IR[5-0]
20 Multicycle Instruction Execution
- All instructions execute in 3-5 cycles
  - 3 cycles: beq
  - 4 cycles: R-type, sw
  - 5 cycles: lw
- Cycle 1: fetch instruction, PC = PC + 4
- Cycle 2: decode, fetch registers, compute branch target
- Cycle 3: execute / compute address / branch
- Cycle 4: access memory / complete R-type
- Cycle 5 (lw): write the loaded memory data to the register file
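The per-instruction step sequences listed above can be written down as a small table, which also makes the cycle counts easy to total for a sequence of instructions. This is an illustrative sketch: the step labels and the names `STEPS`, `cycles`, and `total_cycles` are invented here and are not control-state names from the slides.

```python
FETCH = "fetch instruction, PC = PC + 4"
DECODE = "decode, fetch registers, compute branch target"

# One list entry per clock cycle, following the cycle list above.
STEPS = {
    "beq":    [FETCH, DECODE, "branch"],
    "R-type": [FETCH, DECODE, "execute", "complete R-type"],
    "sw":     [FETCH, DECODE, "compute address", "access memory (write)"],
    "lw":     [FETCH, DECODE, "compute address", "access memory (read)",
               "write loaded data to register"],
}


def cycles(kind):
    """Number of clock cycles one instruction of this class takes."""
    return len(STEPS[kind])


def total_cycles(program):
    """Total cycles for a sequence of instruction classes, executed one after another."""
    return sum(cycles(kind) for kind in program)


if __name__ == "__main__":
    for kind in STEPS:
        print(kind, cycles(kind))
    print(total_cycles(["lw", "R-type", "beq"]))
```

Every class shares the first two steps, which is why the controller only needs to branch on the opcode after the decode cycle.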
21 Cycle 1 Datapath: IR = Mem[PC]; PC = PC + 4
[Figure: multicycle datapath with the paths used in cycle 1 highlighted.]
22 Cycle 2: A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; ALUOut = PC + (sgn-ext(IR[15-0]) << 2)
[Figure: multicycle datapath with the paths used in cycle 2 highlighted.]
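The cycle 2 transfers (read registers A and B, compute the branch target into ALUOut) can be illustrated with plain integer arithmetic. This is a sketch only: the function names are invented, the register file is modeled as a Python list, and `pc` is assumed to already hold PC + 4 from cycle 1.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def cycle2(pc, ir, regs):
    """Return (A, B, ALUOut) for the decode cycle.

    pc   : value of PC after cycle 1 (already incremented by 4)
    ir   : 32-bit instruction word
    regs : list of 32 register values (modeling assumption)
    """
    a = regs[(ir >> 21) & 0x1F]          # A = Reg[IR[25-21]]
    b = regs[(ir >> 16) & 0x1F]          # B = Reg[IR[20-16]]
    offset = sign_extend16(ir & 0xFFFF)  # sgn-ext(IR[15-0])
    alu_out = (pc + (offset << 2)) & 0xFFFFFFFF
    return a, b, alu_out
```

The branch target is computed for every instruction, whether or not it turns out to be a branch. The ALU is otherwise idle in this cycle, so the speculative computation costs nothing.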
23 Cycle 3 (R-format): ALUOut = A op B
[Figure: multicycle datapath with the paths used in cycle 3 of an R-format instruction highlighted.]
24 Cycle 4 (R-format): Reg[IR[15-11]] = ALUOut
[Figure: multicycle datapath with the paths used in cycle 4 of an R-format instruction highlighted.]
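Cycles 3 and 4 of an R-format instruction (ALUOut = A op B, then write ALUOut to the register named by IR[15-11]) can be sketched as follows. The funct codes are the standard MIPS encodings for add, sub, and, and or. The function and table names are hypothetical, and the register file is modeled as a list.

```python
# Standard MIPS funct codes (IR[5-0]) for a few R-format operations.
ALU_OPS = {
    0x20: lambda a, b: (a + b) & 0xFFFFFFFF,  # add
    0x22: lambda a, b: (a - b) & 0xFFFFFFFF,  # sub
    0x24: lambda a, b: a & b,                 # and
    0x25: lambda a, b: a | b,                 # or
}


def r_format_cycles_3_and_4(ir, a, b, regs):
    """Run the last two cycles of an R-format instruction.

    a, b : contents of the temporary registers A and B loaded in cycle 2
    regs : list of 32 register values, updated in place
    """
    funct = ir & 0x3F
    alu_out = ALU_OPS[funct](a, b)  # cycle 3: ALUOut = A op B
    rd = (ir >> 11) & 0x1F          # destination field IR[15-11]
    regs[rd] = alu_out              # cycle 4: Reg[IR[15-11]] = ALUOut
    return alu_out
```

For add $t1, $t2, $t3 (encoded as 0x014B4820) with A = 5 and B = 7, the call leaves 12 in register 9 ($t1).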
25 Cycle 3 (beq): if (A == B) PC = ALUOut
[Figure: multicycle datapath with the paths used in cycle 3 of beq highlighted.]
26 Cycle 3 (lw): ALUOut = A + sgn-ext(IR[15-0])
- Control settings shown: ALUSrcA = 1, ALUSrcB = 2, ALUOp = 0
- Don't care (x): PCSrc, IorD, RegDst, MemtoReg
[Figure: multicycle datapath with control points (MemRead, IRWrite, RegWrite, PCWrite, PCWriteCond, MemWrite), paths used in cycle 3 of lw highlighted.]
27 Cycle 4 (lw): MDR = Mem[ALUOut]
- Control setting shown: IorD = 1
- Don't care (x): PCSrc, ALUSrcA, ALUSrcB, ALUOp, RegDst, MemtoReg
[Figure: multicycle datapath with control points (MemRead, IRWrite, RegWrite, PCWrite, PCWriteCond, MemWrite), paths used in cycle 4 of lw highlighted.]
28 Cycle 5 (lw): Reg[IR[20-16]] = MDR
- Control settings shown: RegDst = 0 (selects IR[20-16], the rt field, as the destination), MemtoReg = 1
- Don't care (x): PCSrc, ALUSrcA, ALUSrcB, ALUOp, IorD
[Figure: multicycle datapath with control points (MemRead, IRWrite, RegWrite, PCWrite, PCWriteCond, MemWrite), paths used in cycle 5 of lw highlighted.]
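The three lw-specific cycles (address computation, memory read into MDR, register write) can be traced with a small sketch. The destination register is taken from bits 20-16 of the instruction (the rt field), which is what RegDst = 0 selects in this datapath. The function names are invented, memory is modeled as a dictionary keyed by byte address, and the register file as a list. Both are modeling assumptions, not part of the slides.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def lw_cycles_3_to_5(ir, a, regs, mem):
    """Run cycles 3-5 of lw and return (ALUOut, MDR).

    a    : contents of temporary register A (base address) from cycle 2
    regs : list of 32 register values, updated in place
    mem  : dict mapping byte address -> word (modeling assumption)
    """
    alu_out = (a + sign_extend16(ir & 0xFFFF)) & 0xFFFFFFFF  # cycle 3
    mdr = mem[alu_out]                                       # cycle 4: IorD = 1
    rt = (ir >> 16) & 0x1F                                   # RegDst = 0 picks IR[20-16]
    regs[rt] = mdr                                           # cycle 5: MemtoReg = 1
    return alu_out, mdr
```

For lw $t0, 8($t1) (encoded as 0x8D280008) with A = 0x1000, the address is 0x1008 and the word stored there ends up in register 8 ($t0).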
29 Cycle 4 (sw): Mem[ALUOut] = B
- Control setting shown: IorD = 1
[Figure: multicycle datapath with control points (MemRead, IRWrite, RegWrite, PCWrite, PCWriteCond, PCSrc, MemWrite, ALUSrcA, ALUSrcB, RegDst, MemtoReg, ALUOp), paths used in cycle 4 of sw highlighted.]