CS 2200 Lecture 7: Datapaths, Control Logic, Single/Multi-cycle

Description:

Write signals along with clock tell when to write ... Register write only happens when RegWr is set to high and at the falling edge of the clock ... – PowerPoint PPT presentation


Slides: 140

Provided by: michaelt8


1
CS 2200 Lecture 7: Datapaths, Control Logic,
Single/Multi-cycle

(Lectures based on the work of Jay Brockman,
Sharon Hu, Randy Katz, Peter Kogge, Bill Leahy,
Ken MacKenzie, Richard Murphy, and Michael
Niemier)

2
MIPS dataflow
3
The organization of a computer

Von Neumann Model
Stored-program machine: instructions are
represented as numbers
Programs can be stored in memory to be
read/written just like numbers.

(Figure: Compiler; Processor containing Control and Datapath; Memory; Input; Output)
4
Functions of Each Component

Datapath: performs data manipulation operations
arithmetic logic unit (ALU)
floating point unit (FPU)
Control: directs operation of other components
finite state machines
micro-programming
Memory: stores instructions and data
random access vs. sequential access
volatile vs. non-volatile
RAMs (SRAM, DRAM), ROMs (PROM, EEPROM), disk
tradeoff between speed and cost/bit
Input/Output: I/O devices interface to the
environment
mouse, keyboard, display, device drivers

5
The Performance Perspective

Performance of a machine is determined by
instruction count, clock cycles per instruction,
clock cycle time
(Last time 210 ns vs. 1100 ns)
Processor design (datapath and control)
determines
Clock cycles per instruction
Clock cycle time
We will discuss two implementations.
Single-Cycle Implementation (a bx cx2
example)
Advantage: One clock cycle per instruction
Disadvantage: Less flexible
Multiple-Cycle Implementation (bus based)
Advantages: Shorter clock cycle times, different
number of cycles for different instructions,
functional unit sharing
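
The three factors listed on this slide combine into the usual execution-time product. The sketch below is only an illustration: the instruction count, CPI values, and cycle times are made-up numbers (assumptions), not figures from the lecture.

```python
def execution_time_ns(instruction_count, cpi, cycle_time_ns):
    """CPU time = instruction count x cycles per instruction x clock cycle time."""
    return instruction_count * cpi * cycle_time_ns

# Hypothetical workload and hypothetical machine parameters (assumptions).
INSTRUCTIONS = 1_000_000
single_cycle = execution_time_ns(INSTRUCTIONS, cpi=1.0, cycle_time_ns=8.0)
multi_cycle = execution_time_ns(INSTRUCTIONS, cpi=3.5, cycle_time_ns=2.0)

print("single-cycle:", single_cycle, "ns")
print("multi-cycle: ", multi_cycle, "ns")
```

With these invented numbers the multi-cycle design wins because its shorter clock more than offsets its higher CPI; different numbers could reverse the result.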

6
Review of MIPS Instruction Formats

All MIPS instructions are 32 bits (4 bytes) long.
R-type
I-Type
J-type
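
Since the format diagrams did not survive in this transcript, here is a small sketch of how the fields of the three formats can be pulled out of a 32-bit word. The bit positions are the standard MIPS ones; the function name `decode` is just an illustrative choice.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "op":     (word >> 26) & 0x3F,   # bits 31-26, all formats
        "rs":     (word >> 21) & 0x1F,   # bits 25-21, R- and I-type
        "rt":     (word >> 16) & 0x1F,   # bits 20-16, R- and I-type
        "rd":     (word >> 11) & 0x1F,   # bits 15-11, R-type
        "shamt":  (word >> 6) & 0x1F,    # bits 10-6,  R-type
        "funct":  word & 0x3F,           # bits 5-0,   R-type
        "imm16":  word & 0xFFFF,         # bits 15-0,  I-type
        "target": word & 0x3FFFFFF,      # bits 25-0,  J-type
    }

# add $t0, $t1, $t2 encodes as 0x012A4020
fields = decode(0x012A4020)
print(fields)
```

Every field is extracted regardless of format; the opcode then tells the control logic which of them are meaningful.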

7
The MIPS Subset

Consider a subset of instructions
memory-reference: lw, sw
arithmetic-logical: add, sub, and, or, slt
branching: beq, j
Organizational overview
fetch an instruction based on the content of PC
decode the instruction
fetch operands
(read one or two registers)
execute
(effective address calculation/arithmetic-logical
operations/comparison)
store result
(write to memory / write to register / update PC)

At simplest level, this is how Von Neumann, RISC
model works
8
Implementation Overview
simplest view of Von Neumann, RISC µP

Abstract / Simplified View
2 types of signals: data and control
Clocking strategy: all storage elements are clocked
by the same clock edge.

(Figure: abstract datapath. Blocks: PC, Instruction Memory (Address in, Instruction out), Register File (Ra, Rb, Rw), ALU, Data Memory (Address, Data))
9
Single Cycle Implementation

Each instruction takes one cycle to complete.
We wait for everything to settle down, and the
right thing to be done
ALU might not produce right answer right away
Write signals along with clock tell when to write
Cycle time determined by length of longest path

referring to 2 slides ago, what instruction
takes the longest?
10
Instruction Fetch Unit

Fetch the instruction: mem[PC]
Update the program counter
sequential code: PC <- PC + 4
branch and jump: PC <- something else

PC
Next Addr Logic
Address
Instruction Word 32
Instruction Memory
11
R-Type Instructions

Instruction format
RTL
Instruction fetch: mem[PC]
ALU operation: reg[rd] <- reg[rs] op reg[rt]
Go to next instruction: PC <- PC + 4
Ra, Rb and Rw come from the instruction's rs, rt, rd
fields.
Actual ALU operation and register write should
occur after decoding the instruction.

12
Datapath for R-Type Instructions
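(Figure: register file of 32 32-bit registers with 5-bit address inputs Ra = rs, Rb = rt, Rw = rd and write enable RegWr; 32-bit outputs BusA and BusB feed the ALU, controlled by ALUctr; the 32-bit ALU result returns on BusW)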

(note, unlike LC2200, multiple read ports here)
13
I-Type Arithmetic/Logic Instructions

Instruction format
RTL for arithmetic operations, e.g., ADDI
Instruction fetch: mem[PC]
Add operation: reg[rt] <- reg[rs] +
SignExt(imm16)
Go to next instruction: PC <- PC + 4
Also, immediate instructions

14
Datapath for I-Type A/L Instructions
note that we reuse ALU
ALUctr
RegWr
5
Ra
32 32-bit Registers
rs
BusA 32
5
Rb
rt
ALU
Rw
BusB 32
5
32
BusW 32
RegDst
Extender
ALUSrc
16
must zero out 1st 16 bits
rd
rt
imm16
In MIPS, the destination register is in a
different place in the instruction (rd or rt), so we need a
mux
BusW 32
15
I-Type Load/Store Instructions

Instruction format
RTL for load/store operations, e.g., LW
Instruction fetch: mem[PC]
Compute memory address: Addr <- reg[rs] +
SignExt(imm16)
Load data into register: reg[rt] <- mem[Addr]
Go to next instruction: PC <- PC + 4
How about store?

same thing, but the 3rd step becomes a memory write
(mem[Addr] <- reg[rt])
16
Datapath for Load/Store Instructions
need a control signal
address input
32 bits of data
17
I-Type Branch Instructions

Instruction format
RTL for branch operations, e.g., BEQ
Instruction fetch: mem[PC]
Compute condition: Cond <- reg[rs] - reg[rt]
Calculate the next instruction's address:
if (Cond eq 0) then
PC <- PC + 4 + (SignExt(imm16) x 4)
else ?
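
The branch RTL on this slide can be sketched as a small function. It assumes the `else` case is the ordinary sequential update `PC <- PC + 4`, which is the usual MIPS behavior; the helper names are illustrative only.

```python
def sign_extend16(imm16):
    """Interpret a 16-bit field as a signed two's-complement integer."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def next_pc_beq(pc, rs_val, rt_val, imm16):
    """Next PC for beq: compare by subtracting, then pick the branch or sequential address."""
    cond = (rs_val - rt_val) & 0xFFFFFFFF
    if cond == 0:
        # Taken: offset is counted in words, relative to PC + 4.
        return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF
    # Assumed not-taken behavior: fall through to the next instruction.
    return (pc + 4) & 0xFFFFFFFF

print(hex(next_pc_beq(0x1000, 5, 5, 3)))
print(hex(next_pc_beq(0x1000, 5, 6, 3)))
```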

18
Datapath for Branch Instructions
PC
Next Addr Logic
To Instruction Mem
RegWr
ALUctr
5
Ra
32 32-bit Registers
rs
BusA 32
5
Rb
rt
ALU
Rw
BusB 32
5
MUX
we'll define this next (will need PC, zero
test condition from ALU)
32
Zero
MUX
ALUSrc
RegDst
Extender
16
rt
rd
imm16
19
Next Address Logic
contains PC + 4
(why 30? Subtlety: see Chapter 5 in your text)
1
PC
CarryIn
30
ADD
Instruction Memory
30
May not want to change PC if BEQ condition not
met (implicitly says this stuff happens anyway,
so we have to be sure we don't change things
we don't want to change)
0
MUX
30
SignExt
if (branch instruction) AND (Zero), can
automatically generate control signal
16
Zero
Branch
imm16
When does the correct new PC become available?
Can we do better?
20
J-Type Jump Instructions

Instruction format
RTL operations, e.g., J
Instruction fetch: mem[PC]
Set up PC: PC <- (PC + 4)<31:28>
CONCAT (target<25:0> x 4)
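
A minimal sketch of the jump address calculation, assuming the standard MIPS rule that the top four bits come from PC + 4 and the 26-bit target is shifted left by two. The function name is an illustrative choice.

```python
def jump_target(pc, target26):
    """Concatenate the upper 4 bits of PC + 4 with the 26-bit target times 4."""
    upper = (pc + 4) & 0xF0000000          # (PC + 4)<31:28>
    lower = (target26 & 0x3FFFFFF) << 2    # target<25:0> x 4, a 28-bit value
    return upper | lower

print(hex(jump_target(0x00400000, 0x0100000)))
print(hex(jump_target(0x90000000, 1)))
```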

21
Instruction Fetch Unit
(why PC<31:28>? Subtlety: see page 383 in your
text)
PC<31:28>
Instruction<25:0>
1
PC
CarryIn
Jump
30
ADD
30
0
30
Instruction Memory
SignExt
16
Branch
Zero
imm16
22
A Single Cycle Datapath
(Figure: single-cycle datapath. Recoverable labels: PCSrc, Add, 4, ALUctr, MemWrite, ALUSrc, MemtoReg, Zero, ALU, ALU result, Read data, Write data, Address, Data memory, Mux, RegWrite, Sign extend, MemRead)
Add Jump.
23
Control logic for a single cycle machine
24
Recall Implementation Overview
simplest view of Von Neumann, RISC µP

Abstract / Simplified View
Two types of signals: data and control
Clocking strategy:
all storage elements are clocked by the same
clock edge.

(Figure: abstract datapath. Blocks: PC, Instruction Memory (Address in, Instruction out), Register File (Ra, Rb, Rw), ALU, Data Memory (Address, Data))
25
The HW needed, plus control
Single cycle MIPS machine
When we talk about control, we talk about these
blocks
26
Implementing Control

Implementation Steps Review
Identify control inputs and control outputs
Make a control signal table for each cycle
Derive control logic from the control table
As you've seen (and as we'll review), this logic
can take on many forms: combinational logic,
ROMs, microcode, or combinations

I promise. This is not a hard thing to do. Don't
be intimidated by a complex datapath.
27
Single Cycle Control Input/Output

Control Inputs
Opcode (6 bits)
How about R-type instructions?
Control Outputs
RegDst
ALUSrc
MemtoReg
RegWrite
MemRead
MemWrite
Branch
Jump
ALUctr

Step 2: Make a control signal table for each cycle
28
Control Signal Table
(inputs)
R-type
(outputs)
29
The HW needed, plus control
Single cycle MIPS machine
30
Main control, ALU control
Func
ALUctr
OP
ALU Control
Main Control
6
ALUOp
3
6
2
(opcode)
ALU
Other cnt. signals

Use OP field to generate ALUOp (encoding)
Control signal fed to ALU control block
Use Func field and ALUOp to generate ALUctr
(decoding)
Specifically sets 3 ALU control signals
B-Invert, Carry-in, operation

31
Main control, ALU control
Or in other words:
00: ALU performs add
01: ALU performs sub
10: ALU does what function code says
(see p. 284 for more)
32
Generating ALUctr

We want these outputs

Operation mux select (ALUctr<1:0>):
and - 00
or - 01
adder - 10
less - 11
ALUctr<2>: B-negate (C-in and B-invert)
ALUctr<1>: select ALU output
ALUctr<0>: select ALU output
Invert B and C-in must both be 1 for subtract
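
The two-level decoding described on these slides (main control produces ALUOp, ALU control combines it with the function code) can be sketched as below. The 3-bit ALUctr values follow the encoding on this slide (B-negate in bit 2, mux select in bits 1:0); the funct codes are the standard MIPS ones. Names such as `alu_control` are illustrative, and the slides' own table did not survive extraction, so treat this as a reconstruction.

```python
# ALUctr = <B-negate><mux select>
ALU_AND, ALU_OR, ALU_ADD, ALU_SUB, ALU_SLT = 0b000, 0b001, 0b010, 0b110, 0b111

# Standard MIPS function codes for the R-type subset used in this lecture.
FUNCT_TO_ALUCTR = {
    0x20: ALU_ADD,  # add
    0x22: ALU_SUB,  # sub
    0x24: ALU_AND,  # and
    0x25: ALU_OR,   # or
    0x2A: ALU_SLT,  # slt
}

def alu_control(alu_op, funct=0):
    """Map (ALUOp, funct) to the 3-bit ALUctr signal."""
    if alu_op == 0b00:      # lw/sw: address calculation needs add
        return ALU_ADD
    if alu_op == 0b01:      # beq: comparison needs subtract
        return ALU_SUB
    if alu_op == 0b10:      # R-type: do what the function code says
        return FUNCT_TO_ALUCTR[funct]
    raise ValueError("unused ALUOp encoding")

print(format(alu_control(0b10, 0x2A), "03b"))
```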
33
The Logic
This table is used to generate the actual Boolean
logic gates that produce ALUctr.
Could generate gates by hand, often done w/SW.
(Figure: gate-level logic deriving ALUctr<2>, ALUctr<1>, and ALUctr<0> from ALUOp1, ALUOp0, and the func<5:0> bits F3 through F0; the worked example is ALUctr<2> for SUB/BEQ)
34
Recall
Single cycle MIPS machine
Recall, for MIPS, we have to build a Main Control
Block and an ALU Control Block
35
Well, here's what we did
Single cycle MIPS machine
We came up with the information to generate this
logic which would fit here in the datapath.
36
Single cycle versus multi-cycle
37
Single Cycle Implementation

Calculate cycle time assuming negligible delays
except:
memory (2 ns), ALU and adders (2 ns), register file
access (1 ns)
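
One way to work this exercise is to add up the delays of the units each instruction class passes through. The sketch below assumes the usual textbook paths (for example, lw uses instruction memory, register read, ALU, data memory, register write); the path lists are an assumption, since the slide's table is not in this transcript.

```python
MEM_NS, ALU_NS, REG_NS = 2, 2, 1   # delays given on the slide

# Assumed critical path of each instruction class through the datapath.
PATHS = {
    "R-type": [MEM_NS, REG_NS, ALU_NS, REG_NS],          # fetch, read, ALU, write
    "lw":     [MEM_NS, REG_NS, ALU_NS, MEM_NS, REG_NS],  # fetch, read, ALU, mem, write
    "sw":     [MEM_NS, REG_NS, ALU_NS, MEM_NS],          # fetch, read, ALU, mem
    "beq":    [MEM_NS, REG_NS, ALU_NS],                  # fetch, read, ALU
    "j":      [MEM_NS],                                  # fetch only
}

delays = {name: sum(path) for name, path in PATHS.items()}
cycle_time_ns = max(delays.values())   # single cycle must fit the slowest instruction

print(delays)
print("cycle time:", cycle_time_ns, "ns")
```

Under these assumptions the load sets the clock period, which is also the answer to the earlier question about which instruction takes the longest.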

38
Single-Cycle Implementation (Contd)

Single-cycle, fixed-length clock
CPI = 1
Clock cycle = propagation delay of the longest
datapath operations among all instruction types
Easy to implement
Single-cycle, variable-length clock
CPI = 1
Clock cycle = sum over i of ((% of type-i instructions) x
propagation delay of the type-i instruction
datapath operations)
Better than the previous, but impractical to
implement
Disadvantages
What if we have floating-point operations?
How about component usage?

39
Multiple Cycle Alternative

Break an instruction into smaller steps
Execute each step in one cycle.
Execution sequence
Balance amount of work to be done
Restrict each cycle to use only one major
functional unit
At the end of a cycle
Store values for use in later cycles, why?
Introduce additional internal registers
The advantages
Cycle time much shorter
Different instructions take a different # of cycles to
complete
Functional unit used more than once per
instruction

40
Multiple-Cycle Implementation

Datapath
Component sharing: ALU, instruction/data memory
ALU used to compute address, increment PC
Memory used for instruction AND data
Additional elements: MUXes, Instr Register,
Target Register
If a value needs to be alive during multiple
cycles, it should stay unchanged during the whole
time.
Control
Needed for each datapath element during each
clock cycle.

41
Five Step Execution

1. Instruction Fetch (Ifetch)
Fetch instruction at address (PC)
Store instruction in register IR
Increment PC
2. Instruction Decode and Register Fetch
(Decode)
Decode instruction format, read register
Store register contents in registers A and B
Compute new PC address, store it in ALUOut
3. Execution, Memory Address Computation, or
Branch Completion (Execute)
Compute memory address (for LW and SW), or
Perform R-type operation (for R-type
instruction), or
Update PC (for Branch and Jump)
Store memory address or register operation result
in ALUOut

42
Five Step Execution (contd)

4. Memory Access or R-type instruction completion
(MemRead/RegWrite)
Read memory at address ALUOut, store it in MDR, or
Write ALUOut content into register file, or
Write B to memory at address ALUOut
5. Write-back step (WrBack)
Write the memory content read into register file
Number of cycles for an instruction
R-type
lw
sw
Branch or Jump

An exercise for the user
43
Some Simple Questions

How many cycles will it take to execute this
code?
  lw $t2, 0($t3)
  lw $t3, 4($t3)
  beq $t2, $t3, Label    (assume branch not taken)
  add $t5, $t2, $t3
  sw $t5, 8($t3)
Label: ...
What is going on during the 8th cycle of
execution?
In what cycle does the actual addition of $t2 and
$t3 take place?

1 5 10
15 20
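
The questions above can be checked mechanically. The cycle counts below are what the five-step description implies (R-type and sw finish in step 4, lw in step 5, branch and jump in step 3); since the slides leave these as an exercise, treat them as an assumption. The function names are illustrative.

```python
# Assumed cycles per instruction class in the multi-cycle design.
CYCLES = {"R-type": 4, "lw": 5, "sw": 4, "beq": 3, "j": 3}

# lw, lw, beq (not taken), add, sw
program = ["lw", "lw", "beq", "R-type", "sw"]

def schedule(instrs):
    """Return (instr, first_cycle, last_cycle) tuples, cycles numbered from 1."""
    out, cycle = [], 1
    for instr in instrs:
        n = CYCLES[instr]
        out.append((instr, cycle, cycle + n - 1))
        cycle += n
    return out

def step_in_cycle(timeline, cycle):
    """Return (instr, step number) that is active in the given cycle."""
    for instr, first, last in timeline:
        if first <= cycle <= last:
            return instr, cycle - first + 1
    raise ValueError("cycle is past the end of the program")

timeline = schedule(program)
total_cycles = timeline[-1][2]
print(timeline)
print("total cycles:", total_cycles)
print("cycle 8:", step_in_cycle(timeline, 8))
```

Under these assumptions, cycle 8 is step 3 (address computation) of the second lw, and the add's ALU operation happens in its own step 3.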
44
Transition slide: 5 steps in detail
45
Step 1 Instruction Fetch

Use PC to get instruction, put it in IR.
Increment PC by 4, put the result back in PC.
Can you write this using the RTL notation?
IR <- Memory[PC], PC <- PC + 4
What is the advantage of updating the PC now?

46
Step 2 I-Decode and Register Fetch

Read registers rs and rt in case we need them
Compute branch address in case instruction is
branch
RTL: A <- Reg[IR[25-21]]
B <- Reg[IR[20-16]]
ALUOut <- PC + (sign-extend(IR[15-0]) << 2)
Did we set any control lines based on the
instruction type? (we are busy "decoding" it in
our control logic)

Means in parallel
47
Step 3 (Instruction dependent)

ALU is performing 1 of 3 functions, based on
instruction type
Memory Reference: ALUOut <- A +
sign-extend(IR[15-0])
R-type: ALUOut <- A op B
Branch: if (A == B) then (PC <- ALUOut)

48
Step 4 (R-type or memory-access)

Loads and stores access memory: MDR <-
Memory[ALUOut] or Memory[ALUOut] <- B
R-type instructions finish: Reg[IR[15-11]] <-
ALUOut
When does the write actually take
place?
- at the end of the cycle, on the clock edge.

49
Step 5 Write-Back

Reg[IR[20-16]] <- MDR
What about all the other instructions?

50
Single cycle
51
Multi-cycle
(Now, critical path dependent on longest
delay for string of components used in 1 of 5
steps)

Where do we need to insert muxs?
Other functional units?

52
Execution Sequence Summary
IR <- Memory[PC]
PC <- PC + 4
A <- Reg[IR(25:21)]
B <- Reg[IR(20:16)]
ALUOut <- PC + (SignEx(IR(15:0)) << 2)
53
Multiple Cycle Design

Break up instructions into steps, each step takes
1 cycle
balance work to be done
restrict each cycle to use only 1 major
functional unit
At the end of a cycle
store values for use in later cycles (easiest
thing to do)
introduce additional internal registers

54
Control Signals
New:

PC: PCWrite, PCWriteCond, PCSource
Memory: IorD, MemRead, MemWrite
IR: IRWrite
Reg. File: RegWrite, MemtoReg, RegDst
ALU: ALUSrcA, ALUSrcB, ALUOp, ALUCnt.

Old:

RegDst, MemToReg, RegWrite, MemRead, MemWrite,
Branch, ALUSrc, ALUOp, ALUCnt.
55
Implementing the Control

Value of control signals is dependent upon
what instruction is being executed
which step is being performed
Use accumulated information to specify a finite
state machine
use a state diagram, or
use microprogramming
Implementation can be derived from specification

56
Graphical Specification of FSM
(Figure: control FSM, one state per execution step, with the signals asserted in each state)
start -> State 0
State 0, Instruction fetch: MemRead, ALUSrcA = 0, IorD = 0, IRWrite, ALUSrcB = 01, ALUOp = 00, PCWrite, PCSource = 00
State 1, Instruction decode/Register fetch: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
State 2, Memory address computation: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
State 3, Memory access (load): MemRead, IorD = 1
State 4, Memory read completion: RegDst = 0, RegWrite, MemToReg = 1
State 5, Memory access (store): MemWrite, IorD = 1
State 6, Execution: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
State 7, R-type completion: RegDst = 1, RegWrite, MemToReg = 0
State 8, Branch completion: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond, PCSource = 01
State 9, Jump completion: PCWrite, PCSource = 10
Tells us what values are needed and during what
step
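
The state numbers in this diagram suggest a next-state function like the sketch below. The transition conditions are assumed from the standard textbook version of this FSM (the arrows did not survive in the transcript), and `next_state` and `trace` are illustrative names.

```python
def next_state(state, op=None):
    """Next control state given the current state and the instruction class."""
    if state == 0:                      # instruction fetch
        return 1
    if state == 1:                      # decode: dispatch on opcode
        return {"lw": 2, "sw": 2, "R-type": 6, "beq": 8, "j": 9}[op]
    if state == 2:                      # address computed: load or store?
        return {"lw": 3, "sw": 5}[op]
    if state == 3:                      # load: memory read, then write back
        return 4
    if state == 6:                      # R-type: execute, then complete
        return 7
    return 0                            # states 4, 5, 7, 8, 9 return to fetch

def trace(op):
    """List the states visited while executing one instruction."""
    states, s = [0], next_state(0, op)
    while s != 0:
        states.append(s)
        s = next_state(s, op)
    return states

for op in ["lw", "sw", "R-type", "beq", "j"]:
    print(op, trace(op))
```

The length of each trace is the instruction's cycle count in the multi-cycle design.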
57
Finite State Machine for Control
Control logic is inside this box (could be
implemented in many different ways)
The outputs that we want are now also dependent
on the current state.
could be ROM, logic, etc.
Inputs (which now also include the previous state)
(Still might need ALU control logic and hence
function code developed earlier)
58
Microprogramming

For our example, state diagrams and combinational
logic are more than adequate
But we're dealing with a small subset of the MIPS
processor
Full MIPS instruction set has over 100
instructions
In 1 implementation, instructions take from 1 to
20 clock cycles
Control would be much more complex for this case
Another alternative: microcoding
Think of control signals that must be asserted in
a state as an instruction to be executed by
datapath
Call these microinstructions

59
Micro-instructions

microinstruction
Set of datapath control signals that must be
asserted in a given state
Executing it has the effect of asserting the control signals
specified by the instruction
How do we sequence?
In some cases, fetch next instruction
Next instruction just depends on state
In others, consider inputs
i.e. next instruction depends on state and input
Like assembly language, must branch explicitly
microprogramming
Designing control as a program that implements
machine instructions in simpler terms

60
Microprogramming guidelines

Make each field of microinstruction responsible
for specifying a non-overlapping set of control
signals
Signals never asserted simultaneously may share
same field
Have a.) signals that control datapath elements
b.) a field that handles sequencing
(i.e. selecting the next instruction)
Microinstructions usually in a ROM or PLA
Therefore can assign addresses
Like choosing #s for FSM elements

61
Example fields
62
Choosing the next instruction

How do we choose what's next?
Increment the address of current microinstruction
to obtain the next
Put Seq in the sequencing field
(Most common case, usually default)
Branch to next microinstruction
Place Fetch in the sequencing field
Choose next microinstruction based on control
unit inputs
This is called a dispatch
Usually implemented by creating a table
containing addresses of target microinstructions
(May be implemented in a ROM)

63
Dispatch tables

Often (and realistically), there is more than 1
Example: state diagram constructed earlier
We would need 2 dispatch tables here
1 to dispatch from state 1
1 to dispatch from state 2
Indicate next microinstruction should be chosen
by a dispatch operation by placing Dispatch i
in the sequencing field
(i is table #)
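
The three sequencing choices (Seq, Fetch, Dispatch i) can be sketched as a next-address function. The microprogram layout and labels below are assumed for illustration, with addresses chosen to match the FSM state numbers; only the sequencing field is modeled, and the datapath control fields are omitted.

```python
# Hypothetical microprogram: index = microinstruction address,
# entry = (label, sequencing field). Labels are illustrative only.
MICROPROGRAM = [
    ("Fetch",    "Seq"),         # 0
    (None,       "Dispatch 1"),  # 1
    ("Mem1",     "Dispatch 2"),  # 2
    ("LW2",      "Seq"),         # 3
    (None,       "Fetch"),       # 4
    ("SW2",      "Fetch"),       # 5
    ("Rformat1", "Seq"),         # 6
    (None,       "Fetch"),       # 7
    ("BEQ1",     "Fetch"),       # 8
    ("JUMP1",    "Fetch"),       # 9
]

# Dispatch tables (could live in a ROM): table number -> opcode class -> address.
DISPATCH = {
    1: {"R-type": 6, "beq": 8, "j": 9, "lw": 2, "sw": 2},
    2: {"lw": 3, "sw": 5},
}

def next_address(addr, op):
    """Pick the next microinstruction address from the sequencing field."""
    seq = MICROPROGRAM[addr][1]
    if seq == "Seq":                    # default: increment
        return addr + 1
    if seq == "Fetch":                  # branch back to the first microinstruction
        return 0
    table = int(seq.split()[1])         # "Dispatch i": look up table i
    return DISPATCH[table][op]

addr, visited = 0, []
while True:
    visited.append(addr)
    addr = next_address(addr, "sw")
    if addr == 0:
        break
print(visited)
```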

64
Recall
(Figure: the control FSM shown earlier, states 0 through 9 from Instruction Fetch to Jump Completion, repeated for reference)
Tells us what values are needed and during what
step
65
Possible Values
66
Creating the microprogram

In microprogram, 2 situations where we could
leave a field of microinstruction blank
When field that controls a functional unit or
that causes state to be written (i.e. Memory
field, ALU dest field) is blank, no control
signals should be asserted
When a field only specifies control of a
multiplexor that determines input to a functional
unit, (i.e. SRC1), leaving it blank means that we
do not care about input to functional unit (or
output of multiplexor)

67
Example

1st component of every instruction execution is
to fetch instructions, decode them, and compute
the sequential and branch target PC
Correspond directly to 1st 2 steps of execution
described (see p.385-388)
2 microinstructions needed for 1st two steps are
below

68
Example

To understand each microinstruction, look at the
effect of a group of fields
In 1st microinstructions, fields asserted and
their effects are

Label field containing label Fetch, will be used
in Sequencing field when microprogram wants to
start execution of next instruction.
69
The entire microprogram
70
Control Example

Can you generate the control signal table?
How about micro-programmed implementation?

i
l
71
Sample Microinstruction

Ifetch: IR <- Mem[PC], PC <- PC + 4

Microinstruction 1d011ddd000100d11
72
A few words on MIPS exceptions
73
What is an exception?

Exception
An event other than a branch or a jump that
changes the normal flow of an instruction
execution
Often called an interrupt as well
Examples

74
Processing exceptions

For OS to process an exception, it must know why it
was caused and which instruction caused it
(e.g. arithmetic exception, invalid instruction)
One method
(used in MIPS)
Have a status register called Cause Register
Holds a field that indicates reason for exception
Another method
Vectored interrupts
Address to which control is transferred
determined by cause of exception
OS knows reason for the exception by address at
which it's initiated

75
Need more HW

To process exceptions we need more HW
EPC
A 32-bit register that holds address of affected
instruction
(Needed even with vectored interrupts)
Cause
Register used to record cause of exception
In MIPS, 32 bits
We'll also need 2 more control signals
EPCWrite and CauseWrite

76
Finally, augmenting our FSM
(Figure: the control FSM from before, states 0 through 9 unchanged, augmented with two exception states)
State 10 (Op = other): IntCause = 0, CauseWrite, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 01, EPCWrite, PCWrite, PCSource = 11
State 11 (Overflow): IntCause = 1, CauseWrite, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 01, EPCWrite, PCWrite, PCSource = 11
77
CS 2200 Lecture 7: Interrupts, Memory-Mapped I/O


78
Interrupts

What's an interrupt?
#1 idea: an unsolicited procedure call.
Actual procedure called an exception/trap/interrupt
handler
Why do we need them?
Or put another way, what would we have to do if
we didn't have them?

(Example: constantly or periodically check
I/O, peripheral devices, etc.)
79
Interrupts

How can interrupts be generated?

?
80
Interrupts

Different Types (2200 Definitions)
Exception - Associated with certain instruction
Overflow
Illegal Instruction
Traps System calls
Interrupt - Asynchronous event not associated
with a certain instruction (e.g. I/O device).

81
Interrupts/Exceptions/Traps
82
Interrupts

Hardware
System bus contains 1 or more interrupt lines.
Need to know who
might put device type code on data lines
might put address of table entry
might put address of handling routine
May have priority scheme
What would priority be based on?
How would it work?
What has to happen?

i.e. what do we do, consider if interrupt is
caused by HW?
83
Interrupts

Hardware (Continued)
Save current PC on stack
Why the stack?
Other possibilities?
Go somewhere to handle interrupt
Check each device
Must be quick
Interrupt vector table
Located in low memory
Table of pointers

(interrupt might tell CPU to go to this table;
a specific location is a pointer to the routine that
handles it; analogous to assembly code)
84
Interrupts

Hardware (Continued)
What if we get interrupted while handling an
interrupt?
What do we do when handling the interrupt is
complete?
Special Instruction: RETI
Can a user disable interrupts?
followed by
while(1)

85
Interrupts

Software
System call (Monitor call)
Why do we need such a construct?
Concept of Mode
Mode bit
User mode
Can execute limited instruction set
Supervisor or Kernel or Monitor Mode
Used by OS
Can execute all instructions
Switch to user mode before returning to user.

86
Interrupts

Interrupt handler code
Like a function
Pointed to by vector table or address supplied by
device
Must save state of interrupted process

(very much like a procedure call)
87
Today Interrupts

A. Running example an I/O device
e.g., network interface
B. Interrupt mechanics Hardware
C. Interrupt mechanics Software (handlers)
D. Aside CPU load of interrupts
E. Generalizing interrupts/exceptions/traps
and connect back to protection

88
A. Running Example

I/O Device a network interface

89
Network Interface?(NI)
?
90
Crude Network Interface: input-only

1. Network sends us messages: need some state to
store those messages
2. Need to know that messages have arrived
3. Need some scheme to be sure we read a message
before the network overwrites it.

91
Crude Network Interface
1. data area
DAV bit (Data AVailable bit)
2. set by network
3. reset by software
92
How to connect it? Make it look like another
memory unit
could use combinational logic in control to
help check/process
93
Memory-Mapped I/O

NI is a 17-word block mapped to 0xF0000000
Existing 1024-word memory at 0x00000000
How do you wire up two memory units?
hardware question
How do you read messages from the NI?
software question

LC-2200 address space:
0xFFFFFFFF
0xF0000000
0x000003FF
0x00000000
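
The hardware question (how to wire up two memory units) comes down to address decoding. The sketch below assumes word addressing, which is what the 1024-word memory ending at 0x000003FF implies; `select_unit` is a hypothetical name and the real control logic would be combinational hardware rather than code.

```python
NI_BASE, NI_WORDS = 0xF0000000, 17      # network interface block from the slide
MEM_BASE, MEM_WORDS = 0x00000000, 1024  # existing memory from the slide

def select_unit(addr):
    """Decide which unit responds to an address, and the offset within it."""
    if MEM_BASE <= addr < MEM_BASE + MEM_WORDS:
        return ("memory", addr - MEM_BASE)
    if NI_BASE <= addr < NI_BASE + NI_WORDS:
        return ("ni", addr - NI_BASE)
    return (None, None)                 # nothing is mapped at this address

print(select_unit(0x000003FF))
print(select_unit(0xF0000010))
print(select_unit(0x00000400))
```

From the software side, reading a message is then just an ordinary load from an address in the NI range.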
94
Memory-Mapped Devices

Network, disk, display, sound, keyboard, mouse
Add data/control registers of each to addr. space
And continuously check for input??

95
B. Interrupt Mechanics: Hardware
96
Interrupts
97
Interrupts
Address Bus
Processor
Data Bus
Int
Device 1
Device 2
Add an interrupt request line. A device wishing
to interrupt asserts this line
98
Interrupts
Address Bus
Processor
Data Bus
Int
Device 1
Device 2
The interrupt line is connected to the processor
control (state machine)
99
Interrupts
Address Bus
Processor
Data Bus
Int
Device 1
Device 2
At the beginning of every instruction execution
sequence a check is made on the status of the
"int" line
100
Interrupts
Address Bus
Processor
Data Bus
Int
Device 1
Device 2
If "int" is asserted special states can be used
to handle the interrupt
101
Interrupts
Address Bus
Processor
Data Bus
Int
Inta
Device 1
Device 2
If the processor decides to handle the interrupt
it asserts the inta (interrupt acknowledge) line
102
Interrupts
Address Bus
Processor
Data Bus
Int
Inta
Device 1
Device 2
If Device 1 was one of the devices asserting
"int" it receives the acknowledgement and doesn't
pass it on
103
Interrupts
Address Bus
Processor
Data Bus
Int
Inta
Device 1
Device 2
If Device 1 wasn't one of the devices asserting
"int" it receives the acknowledgement and passes
it on
104
Interrupts
Address Bus
Processor
Data Bus
Int
Inta
Device 1
Device 2
Assume it's Device 2 that wants to interrupt.
105
Interrupts
Address Bus
Processor
Data Bus
Int
Inta
Device 1
Device 2
Now knowing that the processor is listening,
Device 2 can put the address of its entry in the
interrupt vector table onto the data bus
106
Interrupts
Memory
0x12345678 0x3579BDFA 0x12345678 0x3579BDFE
Address Bus
Processor
Data Bus
Int
Inta
Device 1
Device 2
The interrupt vector table is located in very low
memory and consists of a table of pointers to
interrupt handling routines
107
Interrupts
Memory
0x12345678 0x3579BDFA 0x12345678 0x3579BDFE
Address Bus
Processor
Data Bus
Int
Inta
Device 1
Device 2
This allows the processor to jump to the code to
handle the interrupt
108
Interrupts
Memory
0x12345678 0x3579BDFA 0x12345678 0x3579BDFE
Address Bus
Processor
Data Bus
Int
Inta
Device 1
Device 2
Once complete the handler executes a "return from
interrupt" instruction
109
Hardware Mechanics Summary

1. Interrupt signal (INT)
devices-to-CPU?
2. Interrupt Acknowledge (IACK)
CPU-to-devices
3. Forced procedure call to interrupt handler

110
Hardware Mechanics Summary: Subtleties

1. Interrupt signal (INT)
devices-to-CPU?
2. Interrupt Acknowledge (IACK)
CPU-to-devices
With multiple interrupts, which device goes
first??
3. Forced procedure call to interrupt handler
How do you get the address of the interrupt
handler??
Where do you keep the return address?
n. potential recursion
What if you get an interrupt while servicing an
interrupt??

111
IACK Problem: one solution is to daisy-chain the IACK line
Address Bus
Processor
Data Bus
Int
Inta
Device 1
Device 2
Limitations? Alternatives?
If Device 1 was one of the devices asserting
"int" it receives the acknowledgement and doesn't
pass it on
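
The daisy-chain rule (a requesting device keeps the acknowledgement, a non-requesting device passes it on) can be sketched as a walk down the chain. This is a behavioral model with an illustrative function name, not a description of any particular bus.

```python
def daisy_chain_ack(requesting):
    """Return the index of the device that captures inta, or None.

    requesting[i] is True if device i is asserting int; devices are
    listed in chain order, closest to the processor first.
    """
    if not any(requesting):
        return None                 # int line idle: processor never asserts inta
    for i, wants_service in enumerate(requesting):
        if wants_service:
            return i                # this device keeps the acknowledgement
        # otherwise the device passes inta on to the next one in the chain
    return None

print(daisy_chain_ack([False, True]))   # only Device 2 requests
print(daisy_chain_ack([True, True]))    # both request: the nearer device wins
```

The second call hints at the limitation the slide asks about: position in the chain fixes priority, so a device far down the chain can be starved.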
112
Which-Handler Problem
(i.e. how do we handle the interruption in the
CPU?)

Options?
1. One handler leave dispatch to software!
2. Interrupt vector table
device provides a number at IACK time
CPU (microcode) uses number to index into a table
CPU jumps to address in that table
Illustrated in preceding slides
3. Raw vector
device provides an address at IACK time and CPU
jumps
used in Project 2
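
Options 2 and 3 differ only in whether the CPU performs one extra memory lookup. A minimal sketch, with a made-up table location and made-up handler addresses (all values below are assumptions for illustration):

```python
# Hypothetical memory contents: a vector table in low memory whose
# entries point to handler routines. Addresses are invented.
VECTOR_TABLE_BASE = 0x00000010
memory = {
    VECTOR_TABLE_BASE + 0: 0x00000200,   # handler for device number 0
    VECTOR_TABLE_BASE + 1: 0x00000280,   # handler for device number 1
}

def handler_via_vector_table(device_number):
    """Option 2: device supplies a number; CPU indexes the table and jumps."""
    return memory[VECTOR_TABLE_BASE + device_number]

def handler_via_raw_vector(device_supplied_address):
    """Option 3: device supplies the handler address itself; CPU just jumps."""
    return device_supplied_address

print(hex(handler_via_vector_table(1)))
print(hex(handler_via_raw_vector(0x00000280)))
```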

113
Crude Network Interface à la Project 2
Add an 18th word, NIVEC: pointer to interrupt
handler
114
Return-Address Problem

Standard procedure call uses JALR and saves the
return address in register RA
Interrupt procedure call can't use RA
it's unpredictable and would smash whatever is
there!
Options?
many...
Last time: PRJ2 dedicates a processor register,
K0

115
Recursive Interrupt Problem
Memory
0x12345678 0x3579BDFA 0x12345678 0x3579BDFE
Address Bus
Processor
Data Bus
Int
Inta
Device 1
Device 2
What if Device 2 interrupts while the handler for
Device 1 is running? Or vice versa? Or double
interrupt from the same device?
116
Recursive Interrupt Problem
Memory
0x12345678 0x3579BDFA 0x12345678 0x3579BDFE
Address Bus
Processor
Data Bus
Int
0
intr enable
Inta
Device 1
Device 2
Add an interrupt enable bit to the
processor
1. cleared at interrupt time
2. set at RETI time
3. EI/DI instrs.
117
C. Interrupt Mechanics: Software

Interrupt Handlers

118
Example Device Interrupt (say, arrival of
network message)
Interrupted program:
  ...
  add r1,r2,r3
  subi r4,r1,4
  slli r4,r4,2
  Hiccup(!)            <- External Interrupt
  lw r2,0(r4)
  lw r3,4(r4)
  add r2,r2,r3
  sw 8(r4),r2
  ...
Interrupt Handler:
  Save registers       (callee save)
  ...
  lw r1,20(r0)         (code to handle int.)
  lw r2,0(r1)
  addi r3,r0,5
  sw 0(r1),r3
  ...
  Restore registers    (callee restore)
  Clear current Int    (reset bit)
  RETI                 (return from interrupt)
119
Interrupt Mechanisms

Basic mechanism: forced subroutine call (transfer
of control w/saved return address)
Must have a means to disable interrupts to
prevent nested, recursive interrupts.
one bit
Additions for performance
selective disable of multiple interrupt sources
(priority level or a bit-per-source)
hardware to encode the source of the interrupt.

(if another interrupt comes along, we wait or
keep trying to send)
120
Nested Interrupts
(if higher priority interrupt comes along, we
could process it first)
Interrupted program:
  ...
  add r1,r2,r3
  subi r4,r1,4
  slli r4,r4,2
  Hiccup(!)            <- Network Interrupt
  lw r2,0(r4)
  lw r3,4(r4)
  add r2,r2,r3
  sw 8(r4),r2
  ...
Interrupt handler:
  Raise priority
  Reenable All Ints
  Save registers
  ...
  lw r1,20(r0)         (could be interrupted by disk)
  lw r2,0(r1)
  addi r3,r0,5
  sw 0(r1),r3
  ...
  Restore registers
  Clear current Int
  Disable All Ints
  Restore priority
  RTE
Note that priority must be raised to avoid
recursive interrupts!
121
Example Handler

Init code
Write to NIVEC register
Handler code
save all registers used by handler to stack
do handler action
restore all registers used by handler from stack
JALR K0, ZERO

122
D. CPU Load of Interrupts

Interrupts cost some CPU time

123
Suppose we have lots of devices
Address Bus
Processor
Data Bus
Device 37
Device 1
Device 1
Device 1
Device 1
Device 1
Device 2
All generating interrupts...
124
How do you know there's enough CPU time?
Device    Rate     Handler time
Network   100/s    1 ms
Display   50/s     10 ms
What fraction of the CPU is consumed by
interrupts? Could we add a sound card if it took
5 ms, 100/s?
125
How do you know there's enough CPU time?
Device    Rate     Handler time   CPU fraction
Network   100/s    1 ms           --> 10%
Display   50/s     10 ms          --> 50%
100 int/s x 1 ms/int x 1 s/1000 ms = 0.1
50 int/s x 10 ms/int x 1 s/1000 ms = 0.5
100 int/s x 5 ms/int x 1 s/1000 ms = 0.5
What fraction of the CPU is consumed by
interrupts? 60%
Could we add a sound card if it took 5 ms, 100/s?
That would be another 50% ... no! 60% + 50% > 100%
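
The same arithmetic as a short script, using the rates and handler times from the slide. The function and variable names are illustrative.

```python
def cpu_fraction(rate_per_s, handler_ms):
    """Fraction of each second spent in a handler: interrupts/s x seconds/interrupt."""
    return rate_per_s * handler_ms / 1000.0

# (rate in interrupts per second, handler time in ms), from the slide
devices = {"network": (100, 1), "display": (50, 10)}

load = sum(cpu_fraction(rate, ms) for rate, ms in devices.values())
load_with_sound = load + cpu_fraction(100, 5)   # proposed sound card

print(f"current load:    {load:.0%}")
print(f"with sound card: {load_with_sound:.0%}")
print("fits:", load_with_sound <= 1.0)
```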
126
E. Generalization

Interrupts for internal events
Interrupts as part of protection

127
Interrupt/Exception/Trap Classifications

Interrupts: caused by asynchronous, outside
events
I/O devices requiring service (disk, network)
Clock interrupts (real time scheduling)
Exceptions: relevant to the current instruction
Faults, arithmetic traps, other synchronous traps
Traps: deliberately caused by the current
instruction
Invoke software on behalf of the currently
executing process
Other, e.g. hardware failure
Non-recoverable ECC, power outage, FPU is on
fire...
asynchronous
not necessarily recoverable

128
Interrupt/Exception/Trap Classifications

Interrupts: caused by asynchronous, outside
events
Exceptions: synchronous but unintentional
Traps: synchronous, intentional
H&P: Exceptions, of which some are interrupts
SGG: Interrupts, of which some are
exceptions/traps
occasionally seen:
fault (as in page fault ... an exception in
our terminology)
machine check (unrecoverably fatal condition)

WARNING: Inconsistent Terminology Zone
first of several, unfortunately
129
Interrupts and Protection

Interrupts and protection are orthogonal
However, conventionally, interrupts switch into
supervisor (kernel) state.
some interrupt handlers must be protected
deliberately-invoked-traps (software traps) make
a nice interface for system calls
therefore, it has been convenient to have all
interrupts go to the kernel

130
Summary (note: wrap-up visualization follows)

A. I/O devices: memory-map their state
B. Interrupt mechanics: Hardware
C. Interrupt mechanics: Software (handlers)
D. CPU load of interrupts: compute % of time
E. General Mechanism: Interrupts/Exceptions/Traps

131
Visualization of Program Execution
PC (mem. addr.)
time
132
Visualization of Program Execution
a procedure call
a loop
PC (mem. addr.)
an interrupt
time
133
Program Execution w/Protection
1. interrupts go to kernel mode 2. system calls
switch to kernel mode to interact w/IO
a loop
user space
PC (mem. addr.)
a system call
kernel space
an interrupt
time
134
Program Execution w/Protection ( w/IO)
I/O (kernel) space
a loop
user space
PC (mem. addr.)
a system call
kernel space
an interrupt
time
135
Bonus Slides

Speed of Interrupts

136
Example Device Interrupt (say, arrival of
network message)
Interrupted program:
  ...
  add r1,r2,r3
  subi r4,r1,4
  slli r4,r4,2
  Hiccup(!)            <- External Interrupt
  lw r2,0(r4)
  lw r3,4(r4)
  add r2,r2,r3
  sw 8(r4),r2
  ...
Interrupt Handler:
  Raise priority
  Reenable All Ints
  Save registers
  ...
  lw r1,20(r0)
  lw r2,0(r1)
  addi r3,r0,5
  sw 0(r1),r3
  ...
  Restore registers
  Clear current Int
  Disable All Ints
  Restore priority
  RTE
137
Alternative: Polling (again, for arrival of
network message)
  Disable Network Intr
  ...
  subi r4,r1,4
  slli r4,r4,2
  lw r2,0(r4)
  lw r3,4(r4)
  add r2,r2,r3
  sw 8(r4),r2
  lw r1,12(r0)         <- Polling Point (check device register)
  beq r1,no_mess
  lw r1,20(r0)         <- Handler
  lw r2,0(r1)
  addi r3,r0,5
  sw 0(r1),r3
  Clear Network Intr
no_mess:
  ...
138
Delays of Interrupts/Polling

Interrupts
disrupts pipeline (usually must wait for a
pipeline flush)
save/restore registers
other housekeeping (priority adjustments, kernel
stuff)
Polling
must perform check whether there's an event
waiting to be processed or not.
if check is periodic, event delivery is delayed
by half a period if events arrive at random.

139
Is Polling faster or slower than Interrupts?

Polling is faster!
Compiler knows which registers in use at polling
point. Hence, do not need to save and restore
registers (or not as many).
Other interrupt overhead avoided (pipeline flush,
trap priorities, etc).
Interrupts are faster!
Overhead of polling instructions is incurred
regardless of whether or not handler is run.
This could add to inner-loop delay.
Device may have to wait for service for a long
time.
When to use one or the other?
Multi-axis tradeoff
Frequent, regular events are good for polling, as
long as the device can be controlled at user
level.
Interrupts are good for infrequent/irregular
events
Interrupts are good for ensuring predictable
service of events.