LECTURE 7: Multicycle CPU - PowerPoint PPT Presentation

About This Presentation
Title:

LECTURE 7: Multicycle CPU

Description:

This presentation uses powerpoint animation: please viewshow. CWRU EECS 318 ... ( hardwired control ) Multi-cycle using Finite State Machine. CWRU EECS 318 ... – PowerPoint PPT presentation

Slides: 46
Provided by: francis55

Transcript and Presenter's Notes

1
LECTURE 7: Multicycle CPU
EECS 318 CAD Computer Aided Design
Instructor: Francis G. Wolff, wolff_at_eecs.cwru.edu
Case Western Reserve University
This presentation uses PowerPoint animation: please viewshow
2
MIPS instructions
ALU            alu  rd,rs,rt        rd = rs <alu> rt
ALUi           alui rt,rs,value     rt = rs <alu> value
Data Transfer  lw   rt,offset(rs)   rt = Mem[rs + offset]
               sw   rt,offset(rs)   Mem[rs + offset] = rt
Branch         beq  rs,rt,offset    pc = (rs == rt) ? (pc+4+offset) : (pc+4)
Jump           j    address         pc = address
3
MIPS fixed-size instruction formats
I-Format
ALUi           alui rt,rs,value
Data Transfer  lw   rt,offset(rs)
               sw   rt,offset(rs)
Branch         beq  rs,rt,offset
4
Assembling Instructions
Suppose there are 32 registers, addu opcode = 001001, addi op = 001000
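A minimal sketch of assembling one I-format instruction, assuming the standard MIPS32 field layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate; 32 registers need 5 bits each) and the addi opcode 001000 quoted above. The helper name `encode_i_format` is hypothetical and not part of the lecture.

```python
def encode_i_format(opcode: int, rs: int, rt: int, imm: int) -> int:
    """Pack an I-format word: opcode(6) | rs(5) | rt(5) | immediate(16)."""
    assert 0 <= opcode < 64 and 0 <= rs < 32 and 0 <= rt < 32
    # Keep only the low 16 bits so a negative immediate is stored in two's complement.
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

# addi $8, $0, 1: opcode 001000, rs = 0, rt = 8, immediate = 1
word = encode_i_format(0b001000, rs=0, rt=8, imm=1)
print(f"{word:032b}")   # the 32-bit machine word, field by field
print(f"0x{word:08x}")
```

Printing the word in binary makes it easy to check each field against the format diagram by eye.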
5
MIPS instruction formats
Arithmetic          addi rt,rs,value
                    add  rd,rs,rt
Data Transfer       lw   rt,offset(rs)
                    sw   rt,offset(rs)
Conditional branch  beq  rs,rt,offset
Unconditional jump  j    address
6
MIPS registers and conventions
Name     Number      Conventional usage
$0       0           Constant 0
$v0-$v1  2-3         Expression evaluation & function return
$a0-$a3  4-7         Arguments 1 to 4
$t0-$t9  8-15,24,25  Temporary (not preserved across call)
$s0-$s7  16-23       Saved temporary (preserved across call)
$k0-$k1  26-27       Reserved for OS kernel
$gp      28          Pointer to global area
$sp      29          Stack pointer
$fp      30          Frame pointer
$ra      31          Return address (used by function call)
7
C function to MIPS Assembly Language
int power_2(int y) /* compute x = 2^y */
{
    register int x, i;
    x = 1; i = 0;
    while (i < y) { x = x * 2; i = i + 1; }
    return x;
}

Assembler .s                  Comments
      addi $t0, $0, 1         # x = 1
      addu $t1, $0, $0        # i = 0
w1:   bge  $t1, $a0, w2       # while (i < y)   /* bge = greater or equal */
      addu $t0, $t0, $t0      # x = x * 2       /* same as x = x + x */
      addi $t1, $t1, 1        # i = i + 1
      beq  $0, $0, w1
w2:   addu $v0, $0, $t0       # return x
      jr   $ra                # jump on register (pc = ra)
8
Power_2.s MIPS storage assignment
.text
0x00400020  addi $8, $0, 1      # addi $t0, $0, 1
0x00400024  addu $9, $0, $0     # addu $t1, $0, $0
0x00400028  bge  $9, $4, 2      # bge  $t1, $a0, w2
0x0040002c  addu $8, $8, $8     # addu $t0, $t0, $t0
0x00400030  addi $9, $9, 1      # addi $t1, $t1, 1
0x00400034  beq  $0, $0, -3     # beq  $0, $0, w1
0x00400038  addu $2, $0, $8     # addu $v0, $0, $t0
0x0040003c  jr   $31            # jr   $ra
9
Machine Language Single Stepping
Assume power_2(0) is called; then $a0 = 0 and $ra = 700018
PC        $v0  $a0  $t0  $t1  $ra     Next instruction
00400024  ?    0    1    ?    700018  addu $t1, $0, $0
00400028  ?    0    1    0    700018  bge  $t1, $a0, w2
00400038  ?    0    1    0    700018  addu $v0, $0, $t0
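The single-stepping above can be reproduced with a small interpreter. This is only a sketch under stated assumptions: the return address 700018 is read as hexadecimal, branch targets are taken from the labels w1 and w2 rather than from the numeric offsets in the listing, and the function name `trace_power_2` is hypothetical.

```python
def trace_power_2(a0: int, ra: int = 0x700018):
    """Single-step the power_2 listing; returns (PCs executed, final $v0)."""
    reg = {"v0": None, "a0": a0, "t0": None, "t1": None, "ra": ra}
    pc, trace = 0x00400020, []
    while pc != ra:
        trace.append(pc)
        next_pc = pc + 4
        if pc == 0x00400020:
            reg["t0"] = 1                      # addi $t0, $0, 1
        elif pc == 0x00400024:
            reg["t1"] = 0                      # addu $t1, $0, $0
        elif pc == 0x00400028:
            if reg["t1"] >= reg["a0"]:         # bge  $t1, $a0, w2
                next_pc = 0x00400038
        elif pc == 0x0040002c:
            reg["t0"] += reg["t0"]             # addu $t0, $t0, $t0
        elif pc == 0x00400030:
            reg["t1"] += 1                     # addi $t1, $t1, 1
        elif pc == 0x00400034:
            next_pc = 0x00400028               # beq  $0, $0, w1
        elif pc == 0x00400038:
            reg["v0"] = reg["t0"]              # addu $v0, $0, $t0
        elif pc == 0x0040003c:
            next_pc = reg["ra"]                # jr   $ra
        pc = next_pc
    return trace, reg["v0"]

trace, v0 = trace_power_2(0)
print([f"{p:08x}" for p in trace], v0)
```

For power_2(0) the loop body is skipped, so the trace jumps from 00400028 straight to 00400038, as in the table.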
10
Von Neumann & Harvard CPU Architectures
[Figure: two block diagrams, each with an ALU and I/O connected by an address bus and a data bus. One has a single memory for instructions and data; the other has separate memories for instructions and for data.]
Harvard architecture was coined to describe machines with separate memories. Speed efficient: increased parallelism.
Von Neumann architecture: area efficient, but requires higher bus bandwidth because instructions and data must compete for memory.
11
Multi-cycle Processor Datapath
12
Multi-cycle Datapath with controller
13
Multi-cycle using Finite State Machine
Finite State Machine (hardwired control)
[Figure labels: Combinational control logic; Datapath control outputs; Outputs; Inputs; Next state; State register; Inputs from instruction register opcode field]
14
Finite State Machine program overview
T1  Fetch
T2  Decode
T3  Mem1, Rformat1, BEQ1, JUMP1
T4  Rformat1+1, LW2, SW2
T5  LW2+1
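The overview above can be read as a next-state function. The sketch below is an assumption-laden reading, not the lecture's controller: the transitions are inferred from the later per-state slides, the fourth R-format state and the fifth load state are written `Rformat1+1` and `LW2+1`, and the opcode classes 'R', 'LW', 'SW', 'BEQ', 'J' are hypothetical names.

```python
def next_state(state: str, opcode_class: str) -> str:
    """Next-state function; opcode_class is one of 'R', 'LW', 'SW', 'BEQ', 'J'."""
    if state == "Fetch":
        return "Decode"
    if state == "Decode":
        # Dispatch on the opcode field of the instruction register.
        return {"R": "Rformat1", "LW": "Mem1", "SW": "Mem1",
                "BEQ": "BEQ1", "J": "JUMP1"}[opcode_class]
    if state == "Mem1":
        return "LW2" if opcode_class == "LW" else "SW2"
    if state == "Rformat1":
        return "Rformat1+1"
    if state == "LW2":
        return "LW2+1"
    return "Fetch"  # BEQ1, JUMP1, Rformat1+1, SW2 and LW2+1 return to Fetch

def states_for(opcode_class: str) -> list:
    """States visited by one instruction, starting at Fetch."""
    seq, state = ["Fetch"], next_state("Fetch", opcode_class)
    while state != "Fetch":
        seq.append(state)
        state = next_state(state, opcode_class)
    return seq

for cls in ("R", "LW", "SW", "BEQ", "J"):
    print(cls, states_for(cls))
```

The length of each list is the number of clock cycles that instruction class takes.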
15
The Four Stages of R-Format
  • Fetch: Fetch the instruction from the Instruction Memory
  • Decode: Registers Fetch and Instruction Decode
  • Exec ALU: ALU operates on the two register operands; Update PC
  • Write Reg: Write the ALU output back to the register file

16
R-Format State Machine
Clock=1
17
The Five Stages of Load Instruction
Load:
  • Cycle 1, Fetch: Fetch the instruction from the Instruction Memory
  • Cycle 2, Decode: Registers Fetch and Instruction Decode
  • Cycle 3, Exec Offset: Calculate the memory offset
  • Cycle 4, Mem: Read the data from the Data Memory
  • Cycle 5, Wr: Write the data back to the register file

18
R-Format & I-Format State Machine
Clock=1 AND R-Format=1
Clock=1
Clock=1
19
Multi-Instruction sequence
20
State machine stepping: T1 Fetch
(Done in parallel)
IR ← MEMORY[PC]
PC ← PC + 4
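"Done in parallel" means both register transfers read the old PC and both results are latched on the same clock edge. A minimal sketch of that behaviour, assuming memory is modelled as a dictionary from byte address to instruction word (the name `fetch_step` and the memory model are hypothetical):

```python
def fetch_step(pc: int, memory: dict):
    """One T1 cycle: both new values are computed from the OLD pc, then latched together."""
    new_ir = memory[pc]               # IR <- MEMORY[PC]
    new_pc = (pc + 4) & 0xFFFFFFFF    # PC <- PC + 4, using the same old PC value
    return new_ir, new_pc             # both registers update on the same clock edge

memory = {0x00400020: 0x20080001}     # one illustrative instruction word
ir, pc = fetch_step(0x00400020, memory)
print(f"IR={ir:08x} PC={pc:08x}")
```

Writing the step as a pure function of the old state avoids the sequential-code trap of incrementing PC first and then fetching from the wrong address.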
21
T1 Fetch State machine
Start
MemRead1, MemWrite0IorD1 (MemAddr?PC)IRWrit
e1 (IR?MemPC)ALUSrcA0 (PC)ALUSrcB1 (4)AL
UOPADD (PC?4PC)PCWrite1, PCSource1
(ALU)RegWrite0, MemtoRegX, RegDstX
Instruction Fetch
22
T2 Decode (read rs and rt, and offset+pc)
A ← Reg[IR[25-21]]
B ← Reg[IR[20-16]]
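The bit ranges used in T2 can be sketched as shifts and masks on the 32-bit instruction register. The field names follow the standard MIPS formats; the function names and the list-based register file are assumptions made for illustration.

```python
def decode_fields(ir: int) -> dict:
    """Slice a 32-bit instruction word into the fields used by the controller and datapath."""
    return {
        "opcode": (ir >> 26) & 0x3F,   # IR[31-26]
        "rs":     (ir >> 21) & 0x1F,   # IR[25-21], selects A
        "rt":     (ir >> 16) & 0x1F,   # IR[20-16], selects B
        "rd":     (ir >> 11) & 0x1F,   # IR[15-11]
        "imm":    ir & 0xFFFF,         # IR[15-0]
    }

def decode_step(ir: int, reg: list):
    """T2 register fetch: A and B are both read in the same cycle."""
    f = decode_fields(ir)
    return reg[f["rs"]], reg[f["rt"]]

regs = list(range(100, 132))           # hypothetical register file contents
print(decode_fields(0x20080001), decode_step(0x20080001, regs))
```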
23
T2 Decode State machine
Start
Instr. Decode & Register Fetch
MemRead=0, MemWrite=0
IorD=X
IRWrite=0
ALUSrcA=0 (PC)
ALUSrcB=3 (signext(IR<<2))
ALUOP=0 (add)
PCWrite=0, PCSource=X
RegWrite=0, MemtoReg=X, RegDst=X
24
T3 ExecALU (ALU instruction)
ALUOut ← A op(IR[31-26]) B
25
T3 ExecALU State machine
Start
R-Format Execution
MemRead=0, MemWrite=0
IorD=X
IRWrite=0
ALUSrcA=1 (A = Reg[rs])
ALUSrcB=0 (B = Reg[rt])
ALUOP=2 (IR[28-26])
PCWrite=0, PCSource=X
RegWrite=0, MemtoReg=X, RegDst=X
26
T4 WrReg (ALU instruction)
Reg[IR[15-11]] ← ALUOut
27
T4 WrReg State machine
Start
Exec
R-Format Write Register
MemRead=0, MemWrite=0
IorD=X
IRWrite=0
ALUSrcA=X
ALUSrcB=X
ALUOP=X
PCWrite=0, PCSource=X
RegWrite=1 (Reg[rd] ← ALUout)
MemtoReg=0 (ALUout)
RegDst=1 (rd)
28
Review Moore Machine
Next State
29
Moore Output State Tables O(State)
State  MemRead  MemWrite  IorD     IRWrite  ALUOP    ALUSrcA   ALUSrcB      PCWrite  PCSource  RegWrite  MemtoReg    RegDst
T1     1        0         0 (PC)   1        0        0 (PC)    1 (4)        1        0 (ALU)   0         X           X
T2     0        0         X        0        0        0 (PC)    3 (offset)   0        X         0         X           X
T3-R   0        0         X        0        2 (op)   1 (A rs)  0 (B rt)     0        X         0         X           X
T4-R   0        0         X        0        X        X         X            0        X         1         0 (ALUOut)  1 (rd)
(IorD, ALUSrcA, ALUSrcB, PCSource, MemtoReg and RegDst are MUX selects.)
30
Review: The Five Stages of Load Instruction
Load:
  • Cycle 1, Fetch: Fetch the instruction from the Instruction Memory
  • Cycle 2, Decode: Registers Fetch and Instruction Decode
  • Cycle 3, Exec Offset: Calculate the memory offset
  • Cycle 4, Mem: Read the data from the Data Memory
  • Cycle 5, Wr: Write the data back to the register file

31
Review: R-Format & I-Format State Machine
Clock=1 AND R-Format=1
Clock=1
Clock=1
32
T3-I Mem1 (common to both load & store)
ALUOut ← A + sign_extend(IR[15-0])
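The address calculation depends on sign extension of the 16-bit offset, so negative offsets work. A minimal sketch, assuming 32-bit wraparound arithmetic; the helper names are hypothetical and the instruction words in the usage lines are only illustrative (only their low 16 bits matter here).

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def mem_address(a: int, ir: int) -> int:
    """ALUOut <- A + sign_extend(IR[15-0]), wrapped to 32 bits."""
    return (a + sign_extend16(ir & 0xFFFF)) & 0xFFFFFFFF

print(hex(mem_address(0x1000, 0x8C880008)))   # offset field 0x0008
print(hex(mem_address(0x1000, 0x8C88FFFC)))   # offset field 0xFFFC, i.e. -4
```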
33
T3 Mem1 I-Format State Machine: rs + offset
Clock=1 AND R-Format=1
Clock=1 AND opcode=LW
I-Format Execution: ALUout = rs + offset
MemRead=0, MemWrite=0
IorD=X
IRWrite=0
ALUOP=0
ALUSrcA=1 (A = Reg[rs])
ALUSrcB=2 (signext(IR[15-0]))
PCWrite=0, PCSource=X
RegWrite=0, MemtoReg=X, RegDst=X
34
T4 LW1 load instruction, read memory
MDR ← Memory[ALUOut]
35
T4 LW2 I-Format State Machine: Mem[ALU]
Clock=1 AND I-Format=1
Clock=1 AND R-Format=1
Clock=1 AND opcode=LW
I-Format Memory Read
MemRead=1, MemWrite=0
IorD=1
IRWrite=0
ALUOP=X
ALUSrcA=X
ALUSrcB=X
PCWrite=0, PCSource=X
RegWrite=0, MemtoReg=X, RegDst=X
Clock=1 AND opcode=LW
36
T5 LW2 Load instruction, write to register
Reg[IR[20-16]] ← MDR
37
T5 LW2 I-Format State Machine: rt = MDR
Clock=1 AND I-Format=1
Clock=1 AND R-Format=1
Clock=1 AND opcode=LW
I-Format Register Write
MemRead=0, MemWrite=0
IorD=X
IRWrite=0
ALUOP=X
ALUSrcA=X
ALUSrcB=X
PCWrite=0, PCSource=X
RegWrite=1, MemtoReg=1, RegDst=1
Clock=1 AND opcode=LW
38
T4 SW2 Store instruction, write to memory
Memory[ALUOut] ← B
39
T4 SW2 I-Format State Machine: Mem[ALU]
Clock=1 AND I-Format=1
Clock=1 AND R-Format=1
Clock=1 AND opcode=SW
I-Format Memory Write
MemRead=0, MemWrite=1
IorD=1
IRWrite=0
ALUOP=X
ALUSrcA=X
ALUSrcB=X
PCWrite=0, PCSource=X
RegWrite=0, MemtoReg=X, RegDst=X
Store, not Load!
40
T3 BEQ1 (Conditional branch instruction)
If (A - B == 0) PC ← ALUOut
(Zero is the ALU output flag that is tested.)
ALUOut: address computed in T2!
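The branch step can be sketched as a conditional PC write: the ALU subtracts A and B in T3 while ALUOut still holds the target computed in T2. The function name `beq_step` is hypothetical, and `pc` here stands for the already incremented PC from T1.

```python
def beq_step(pc: int, a: int, b: int, alu_out: int) -> int:
    """T3 of beq: return the PC value after this cycle."""
    zero = ((a - b) & 0xFFFFFFFF) == 0     # ALU subtracts; Zero is set when A == B
    return alu_out if zero else pc         # PC is written only when Zero is set

print(hex(beq_step(0x00400038, 5, 5, 0x00400028)))   # taken: PC <- ALUOut
print(hex(beq_step(0x00400038, 5, 6, 0x00400028)))   # not taken: PC unchanged
```

Because the target was precomputed in T2, the branch completes in three cycles.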
41
T3 BEQ1 I-Format State Machine: rs - rt
Clock=1 AND opcode=branch
Clock=1 AND R-Format=1
B-Format Execution
MemRead=0, MemWrite=0
IorD=X
IRWrite=0
ALUOP=1 (subtract)
ALUSrcA=1 (A = Reg[rs])
ALUSrcB=0 (B = Reg[rt])
PCWrite=0, PCWriteCond=1
PCSource=1 (ALUout)
RegWrite=0, MemtoReg=X, RegDst=X
42
T3 Jump1 (Jump Address)
PC ← PC[31-28] || (IR[25-0] << 2)
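The jump target keeps the top four bits of the PC and fills the rest from the 26-bit address field shifted left by two. A minimal sketch of that concatenation; the function name `jump_target` and the example words are illustrative assumptions.

```python
def jump_target(pc: int, ir: int) -> int:
    """PC <- PC[31-28] concatenated with IR[25-0] shifted left by 2."""
    return (pc & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)

# An address field of 0x0100008 names word address 0x100008, i.e. byte address 0x00400020.
print(hex(jump_target(0x00400024, 0x08100008)))
```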
43
Moore Output State Tables O(State)
State  MemRead  MemWrite  IorD     IRWrite  ALUOP    ALUSrcA   ALUSrcB      PCWrite  PCSource  RegWrite  MemtoReg    RegDst
T1     1        0         0 (PC)   1        0        0 (PC)    1 (4)        1        0 (ALU)   0         X           X
T2     0        0         X        0        0        0         3            0        X         0         X           X
T3-R   0        0         X        0        2 (op)   1 (A rs)  0 (B rt)     0        X         0         X           X
T4-R   0        0         X        0        X        X         X            0        X         1         0 (ALU)     1 (rd)
T3-I   0        0         X        0        0 (add)  1 (A rs)  2 (sign)     0        X         0         X           X
T4-LW  1        0         1 (ALU)  0        X        X         X            0        X         0         X           X
T5-LW  0        0         X        0        X        X         X            0        X         1         1 (MDR)     1 (rt)
T4-SW  0        1         1 (ALU)  0        X        X         X            0        X         0         X           X
(IorD, ALUSrcA, ALUSrcB, PCSource, MemtoReg and RegDst are MUX selects.)
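In a Moore machine the outputs depend only on the current state, so the table above maps directly onto a lookup table. The sketch below transcribes only the five enable signals (the MUX select columns are left out to keep it short); the names `OUTPUTS`, `outputs` and `states_asserting` are hypothetical.

```python
# Enable signals per state, transcribed from the output table (MUX selects omitted).
SIGNALS = ("MemRead", "MemWrite", "IRWrite", "PCWrite", "RegWrite")
OUTPUTS = {
    "T1":    (1, 0, 1, 1, 0),
    "T2":    (0, 0, 0, 0, 0),
    "T3-R":  (0, 0, 0, 0, 0),
    "T4-R":  (0, 0, 0, 0, 1),
    "T3-I":  (0, 0, 0, 0, 0),
    "T4-LW": (1, 0, 0, 0, 0),
    "T5-LW": (0, 0, 0, 0, 1),
    "T4-SW": (0, 1, 0, 0, 0),
}

def outputs(state: str) -> dict:
    """Moore machine: outputs are a function of the current state only."""
    return dict(zip(SIGNALS, OUTPUTS[state]))

def states_asserting(signal: str) -> list:
    """All states in which the given enable signal is 1."""
    return sorted(s for s in OUTPUTS if outputs(s)[signal] == 1)

print(states_asserting("RegWrite"))
print(states_asserting("MemRead"))
```

Listing the states that assert each enable is a quick sanity check: the register file is written only in the last state of R-format and load instructions.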
44
Multi-cycle 5 execution steps
  • T1 (a,lw,sw,beq,j) Instruction Fetch
  • T2 (a,lw,sw,beq,j) Instruction Decode and Register Fetch
  • T3 (a,lw,sw,beq,j) Execution, Memory Address Calculation, or Branch Completion
  • T4 (a,lw,sw) Memory Access or R-type instruction completion
  • T5 (lw) Write-back step
INSTRUCTIONS TAKE FROM 3 TO 5 CYCLES!
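Since each instruction class takes a different number of cycles, the average cycles per instruction depends on the instruction mix. The sketch below counts ALU instructions as 4 cycles (per the four-stage R-format state machine earlier in the lecture), lw as 5, sw as 4, and beq and j as 3. The mix used in the example is purely hypothetical and is not measured data.

```python
# Cycles per instruction class, read off the five execution steps.
CYCLES = {"alu": 4, "lw": 5, "sw": 4, "beq": 3, "j": 3}

def average_cpi(mix: dict) -> float:
    """Weighted average cycles per instruction for a given instruction count mix."""
    total = sum(mix.values())
    return sum(CYCLES[op] * count for op, count in mix.items()) / total

# Hypothetical mix of 100 instructions, chosen only to illustrate the arithmetic.
example_mix = {"alu": 50, "lw": 20, "sw": 10, "beq": 15, "j": 5}
print(average_cpi(example_mix))
```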

45
Multi-cycle Approach
All operations in each clock cycle Ti are done in parallel, not sequentially! For example, in T1, IR ← Memory[PC] and PC ← PC + 4 are done simultaneously!
T1 T2 T3 T4 T5
Between clock T2 and T3 the microcode sequencer will do a dispatch 1