CS 2200 Lecture 7 Datapaths Control Logic, SingleMulticycle - PowerPoint PPT Presentation

Provided by: michaelt8
1
CS 2200 Lecture 7: Datapaths, Control Logic,
Single/Multi-cycle
  • (Lectures based on the work of Jay Brockman,
    Sharon Hu, Randy Katz, Peter Kogge, Bill Leahy,
    Ken MacKenzie, Richard Murphy, and Michael
    Niemier)

2
MIPS dataflow
3
The organization of a computer
  • Von Neumann Model
  • Stored-program machine: instructions are
    represented as numbers
  • Programs can be stored in memory to be
    read/written just like numbers.

[Figure: Compiler; Processor (Control, Datapath); Memory; Input; Output]
4
Functions of Each Component
  • Datapath performs data manipulation operations
  • arithmetic logic unit (ALU)
  • floating point unit (FPU)
  • Control directs operation of other components
  • finite state machines
  • micro-programming
  • Memory stores instructions and data
  • random access vs. sequential access
  • volatile vs. non-volatile
  • RAMs (SRAM, DRAM), ROMs (PROM, EEPROM), disk
  • tradeoff between speed and cost/bit
  • Input/Output and I/O devices interface to the
    environment
  • mouse, keyboard, display, device drivers

5
The Performance Perspective
  • Performance of a machine determined by
  • Instruction count, clock cycles per instruction,
    clock cycle time
  • (Last time 210 ns vs. 1100 ns)
  • Processor design (datapath and control)
    determines
  • Clock cycles per instruction
  • Clock cycle time
  • We will discuss two implementations.
  • Single-Cycle Implementation (a bx cx2
    example)
  • Advantage: One clock cycle per instruction
  • Disadvantage: Less flexible
  • Multiple-Cycle Implementation (bus based)
  • Advantage: Shorter clock cycle times, different
    number of cycles for different instructions,
    functional unit sharing

6
Review of MIPS Instruction Formats
  • All MIPS instructions are 32 bits (4 bytes) long.
  • R-type
  • I-Type
  • J-type
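The three formats share one 32-bit word and differ only in how the bits are sliced. As a minimal sketch (the function name `decode_fields` is made up for illustration; the bit positions are the standard MIPS ones), the fields can be pulled out with shifts and masks:

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into its standard fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # bits 31-26, all formats
        "rs":     (word >> 21) & 0x1F,   # bits 25-21, R- and I-type
        "rt":     (word >> 16) & 0x1F,   # bits 20-16, R- and I-type
        "rd":     (word >> 11) & 0x1F,   # bits 15-11, R-type only
        "shamt":  (word >> 6) & 0x1F,    # bits 10-6,  R-type only
        "funct":  word & 0x3F,           # bits 5-0,   R-type only
        "imm16":  word & 0xFFFF,         # bits 15-0,  I-type only
        "target": word & 0x3FFFFFF,      # bits 25-0,  J-type only
    }

# add $t0, $t1, $t2 encodes as 0x012A4020
fields = decode_fields(0x012A4020)
```

Every field is always extracted; which ones are meaningful depends on the opcode, which is why decoding has to happen before the ALU operation and register write are committed.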

7
The MIPS Subset
  • Consider a subset of instructions
  • memory-reference: lw, sw
  • arithmetic-logical: add, sub, and, or, slt
  • branching: beq, j
  • Organizational overview
  • fetch an instruction based on the content of PC
  • decode the instruction
  • fetch operands
  • (read one or two registers)
  • execute
  • (effective address calculation/arithmetic-logical
    operations/comparison)
  • store result
  • (write to memory / write to register / update PC)

At simplest level, this is how Von Neumann, RISC
model works
8
Implementation Overview
Simplest view of a Von Neumann, RISC µP
  • Abstract / Simplified View
  • 2 types of signals: data and control
  • Clocking strategy: all storage elements clocked
    by the same clock edge.

[Figure: PC, Instruction Memory (Address, Instruction), Register File (Ra, Rb, Rw), ALU, Data Memory (Address, Data)]
9
Single Cycle Implementation
  • Each instruction takes one cycle to complete.
  • We wait for everything to settle down, and the
    right thing to be done
  • ALU might not produce right answer right away
  • Write signals along with clock tell when to write
  • Cycle time determined by length of longest path

referring to 2 slides ago, what instruction
takes the longest?
10
Instruction Fetch Unit
  • Fetch the instruction: mem[PC]
  • Update the program counter
  • sequential code: PC <- PC + 4
  • branch and jump: PC <- something else

[Figure: PC drives the Address input of Instruction Memory, which outputs the 32-bit Instruction Word; Next Addr Logic updates PC]
11
R-Type Instructions
  • Instruction format
  • RTL
  • Instruction fetch: mem[PC]
  • ALU operation: reg[rd] <- reg[rs] op reg[rt]
  • Go to next instruction: PC <- PC + 4
  • Ra, Rb and Rw are from instructions rs, rt, rd
    fields.
  • Actual ALU operation and register write should
    occur after decoding the instruction.

12
Datapath for R-Type Instructions
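[Figure: register file of 32 32-bit registers with 5-bit ports Ra (rs), Rb (rt), Rw (rd) and write enable RegWr; BusA and BusB (32 bits each) feed the ALU, controlled by ALUctr; the result returns on BusW (32 bits)]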
  • Register timing
  • Register can always be read.
  • Register write only happens when RegWr is set to
    high and at the falling edge of the clock

(note, unlike LC2200, multiple read ports here)
13
I-Type Arithmetic/Logic Instructions
  • Instruction format
  • RTL for arithmetic operations, e.g., ADDI
  • Instruction fetch: mem[PC]
  • Add operation: reg[rt] <- reg[rs] +
    SignExt(imm16)
  • Go to next instruction: PC <- PC + 4
  • Also, immediate instructions

14
Datapath for I-Type A/L Instructions
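Note that we reuse the ALU.
[Figure: the R-type datapath (register file with Ra, Rb, Rw, RegWr; BusA, BusB, BusW; ALU with ALUctr) extended with a RegDst mux choosing rd or rt as Rw, an Extender widening imm16 from 16 to 32 bits, and an ALUSrc mux choosing BusB or the extended immediate]
Extender: must zero out 1st 16 bits
In MIPS, destination registers are in
different places in the instruction (rd or rt), therefore we need a
mux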
15
I-Type Load/Store Instructions
  • Instruction format
  • RTL for load/store operations, e.g., LW
  • Instruction fetch: mem[PC]
  • Compute memory address: Addr <- reg[rs] +
    SignExt(imm16)
  • Load data into register: reg[rt] <- mem[Addr]
  • Go to next instruction: PC <- PC + 4
  • How about store?

Same thing, but the 3rd step becomes a memory
write (mem[Addr] <- reg[rt])
16
Datapath for Load/Store Instructions
need a control signal
address input
32 bits of data
17
I-Type Branch Instructions
  • Instruction format
  • RTL for branch operations, e.g., BEQ
  • Instruction fetch: mem[PC]
  • Compute condition: Cond <- reg[rs] - reg[rt]
  • Calculate the next instruction's address
  • if (Cond eq 0) then
  • PC <- PC + 4 + (SignExt(imm16) x 4)
  • else ?
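The branch RTL can be sketched in a few lines. This is an illustrative model, not the slide's hardware: the helper names `sign_extend16` and `next_pc_beq` are invented here, and the "else" case is filled in with the usual sequential update PC + 4.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def next_pc_beq(pc, rs_val, rt_val, imm16):
    """Next PC for BEQ: branch target if rs == rt, else PC + 4."""
    cond = (rs_val - rt_val) & 0xFFFFFFFF   # the ALU subtracts; Zero means equal
    if cond == 0:
        return (pc + 4 + sign_extend16(imm16) * 4) & 0xFFFFFFFF
    return (pc + 4) & 0xFFFFFFFF

taken = next_pc_beq(0x1000, 5, 5, 3)        # 0x1000 + 4 + 12
not_taken = next_pc_beq(0x1000, 5, 6, 3)    # 0x1000 + 4
backward = next_pc_beq(0x1000, 1, 1, 0xFFFF)  # offset of -1 word
```

Note that the offset is counted in words relative to PC + 4, so an immediate of -1 branches back to the BEQ itself.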

18
Datapath for Branch Instructions
[Figure: the I-type datapath (register file, RegDst and ALUSrc muxes, Extender, ALU with ALUctr) plus PC and a Next Addr Logic block feeding the instruction memory; the ALU's Zero output goes to Next Addr Logic]
We'll define Next Addr Logic next (it will need PC and the zero
test condition from the ALU)
19
Next Address Logic
[Figure: next address logic. A 30-bit PC feeds an adder (CarryIn = 1) whose output contains PC + 4; a second adder adds SignExt(imm16); a mux controlled by Branch AND Zero picks the next 30-bit PC, which addresses Instruction Memory]
(Why 30? Subtlety: see Chapter 5 in your text.)
May not want to change PC if the BEQ condition is not
met (implicitly says this stuff happens anyway,
so we have to be sure we don't change things
we don't want to change)
If branch instruction AND Zero, can
automatically generate the control signal
When does the correct new PC become available?
Can we do better?
20
J-Type Jump Instructions
  • Instruction format
  • RTL operations, e.g., J
  • Instruction fetch: mem[PC]
  • Set up PC: PC <- ((PC + 4)<31:28>
    CONCAT target<25:0>) x 4
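As a small sketch of the jump address computation (the function name `jump_target` is an illustrative assumption), the upper four bits of PC + 4 are kept and the 26-bit target is shifted left by two, which is the "x 4":

```python
def jump_target(pc, target26):
    """Next PC for J: top 4 bits of PC + 4, then the 26-bit target, then 00."""
    upper = (pc + 4) & 0xF0000000
    return (upper | ((target26 & 0x3FFFFFF) << 2)) & 0xFFFFFFFF

dest = jump_target(0x40000000, 0x100)
```

Because only 26 bits come from the instruction, a jump can reach any word inside the current 256 MB region but cannot leave it.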

21
Instruction Fetch Unit
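(Why PC<31:28>? Subtlety: see Page 383 in your
text.)
[Figure: instruction fetch unit. The next address logic (30-bit PC, adder with CarryIn = 1, SignExt(imm16) adder, Branch AND Zero mux) plus a Jump mux that selects PC<31:28> concatenated with Instruction<25:0>; the result addresses Instruction Memory]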
22
A Single Cycle Datapath
[Figure: single-cycle datapath. Components: PC, Add (+4), instruction memory, registers, sign extend, ALU (Zero, ALU result), data memory (Read data, Write data, Address), muxes. Control signals: PCSrc, ALUctr (3 bits), MemWrite, ALUSrc, MemtoReg, RegWrite, MemRead]
Add Jump.
23
Control logic for a single cycle machine
24
Recall Implementation Overview
Simplest view of a Von Neumann, RISC µP
  • Abstract / Simplified View
  • Two types of signals: data and control
  • Clocking strategy
  • All storage elements are clocked by the same
    clock edge.

[Figure: PC, Instruction Memory (Address, Instruction), Register File (Ra, Rb, Rw), ALU, Data Memory (Address, Data)]
25
The HW needed, plus control
Single cycle MIPS machine
When we talk about control, we talk about these
blocks
26
Implementing Control
  • Implementation Steps Review
  • Identify control inputs and control outputs
  • Make a control signal table for each cycle
  • Derive control logic from the control table
  • As you've seen (and as we'll review), this logic
    can take on many forms: combinational logic,
    ROMs, microcode, or combinations

I promise. This is not a hard thing to do. Don't
be intimidated by a complex datapath.
27
Single Cycle Control Input/Output
  • Control Inputs
  • Opcode (6 bits)
  • How about R-type instructions?
  • Control Outputs
  • RegDst
  • ALUSrc
  • MemtoReg
  • RegWrite
  • MemRead
  • MemWrite
  • Branch
  • Jump
  • ALUctr

Step 2: Make a control signal table for each cycle
28
Control Signal Table
(inputs)
R-type
(outputs)
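A control signal table of this kind is just a lookup from opcode (input) to signal values (outputs). The sketch below uses the usual textbook values for the single-cycle MIPS subset; treat the exact entries, the 2-bit ALUOp encoding, and the names `CONTROL_TABLE` and `main_control` as assumptions rather than a copy of the slide's table. `None` stands for "don't care".

```python
X = None  # don't care

SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite",
           "MemRead", "MemWrite", "Branch", "Jump", "ALUOp")

# opcode -> output values, in the order given by SIGNALS
CONTROL_TABLE = {
    0x00: (1, 0, 0, 1, 0, 0, 0, 0, 0b10),  # R-type
    0x23: (0, 1, 1, 1, 1, 0, 0, 0, 0b00),  # lw
    0x2B: (X, 1, X, 0, 0, 1, 0, 0, 0b00),  # sw
    0x04: (X, 0, X, 0, 0, 0, 1, 0, 0b01),  # beq
    0x02: (X, X, X, 0, 0, 0, 0, 1, X),     # j
}

def main_control(opcode):
    """Return the control outputs for one opcode as a name -> value dict."""
    return dict(zip(SIGNALS, CONTROL_TABLE[opcode]))

lw_signals = main_control(0x23)
```

The signals that change machine state (RegWrite, MemWrite) are never left as don't-cares, since asserting them by accident would corrupt a register or memory.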
29
The HW needed, plus control
Single cycle MIPS machine
30
Main control, ALU control
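[Figure: Main Control takes OP (6-bit opcode) and produces ALUOp (2 bits) plus the other control signals; ALU Control takes ALUOp and Func (6 bits) and produces ALUctr (3 bits), which drives the ALU]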
  • Use OP field to generate ALUOp (encoding)
  • Control signal fed to ALU control block
  • Use Func field and ALUOp to generate ALUctr
    (decoding)
  • Specifically sets 3 ALU control signals
  • B-Invert, Carry-in, operation

31
Main control, ALU control
Or in other words:
  • 00: ALU performs add
  • 01: ALU performs sub
  • 10: ALU does what the function code says
(see p. 284 for more)
32
Generating ALUctr
  • We want these outputs

ALUctr<1> and ALUctr<0> select the ALU output (mux):
  • and - 00
  • or - 01
  • adder - 10
  • less - 11
ALUctr<2> is B-negate (C-in and B-invert).
Invert B and C-in must be a 1 for subtract.
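Putting the ALUOp encoding and the output encoding together, the ALU control block can be sketched as a small decoder. The funct codes are the standard MIPS ones; the function name `alu_control` and the dictionary form are illustrative assumptions.

```python
# funct field -> ALUctr, used only when ALUOp says "look at the function code"
FUNCT_TO_ALUCTR = {
    0x20: 0b010,  # add: adder, B not negated
    0x22: 0b110,  # sub: adder, B negated
    0x24: 0b000,  # and
    0x25: 0b001,  # or
    0x2A: 0b111,  # slt: "less", B negated
}

def alu_control(alu_op, funct):
    """Combine the 2-bit ALUOp with the 6-bit funct field into 3-bit ALUctr."""
    if alu_op == 0b00:      # lw/sw: address computation
        return 0b010
    if alu_op == 0b01:      # beq: subtract and test Zero
        return 0b110
    return FUNCT_TO_ALUCTR[funct]   # R-type: decode the function code

beq_ctr = alu_control(0b01, 0)
sub_ctr = alu_control(0b10, 0x22)
```

SUB and BEQ end up with the same ALUctr value, 110, because both need the adder with B inverted and carry-in set.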
33
The Logic
This table is used to generate the actual Boolean
logic gates that produce ALUctr.
Could generate gates by hand, often done w/SW.
[Figure: truth table and gates mapping ALUOp (ALUOp1, ALUOp0) and func<5:0> (F3, F2, F1, F0) to ALUctr<2>, ALUctr<1>, ALUctr<0>. Example: ALUctr<2> (SUB/BEQ), ALUctr = 110 in both cases]
34
Recall
Single cycle MIPS machine
Recall, for MIPS, we have to build a Main Control
Block and an ALU Control Block
35
Well, here's what we did
Single cycle MIPS machine
We came up with the information to generate this
logic which would fit here in the datapath.
36
Single cycle versus multi-cycle
37
Single Cycle Implementation
  • Calculate cycle time assuming negligible delays
    except
  • memory (2ns), ALU and adders (2ns), register file
    access (1ns)
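One way to work this out is to list the units each instruction passes through and sum the delays. The per-instruction paths below are an assumption based on the usual textbook breakdown (instruction fetch, register read, ALU, data memory, register write), not something stated on the slide.

```python
MEM, ALU, REG = 2, 2, 1   # delays in ns, from the slide

# Units on each instruction's path, in order (assumed textbook breakdown)
PATHS = {
    "R-type": [MEM, REG, ALU, REG],        # fetch, read regs, ALU, write reg
    "lw":     [MEM, REG, ALU, MEM, REG],   # fetch, read reg, address, load, write reg
    "sw":     [MEM, REG, ALU, MEM],        # fetch, read regs, address, store
    "beq":    [MEM, REG, ALU],             # fetch, read regs, compare
    "j":      [MEM],                       # fetch only
}

latency = {name: sum(path) for name, path in PATHS.items()}
cycle_time = max(latency.values())   # the clock must fit the slowest instruction
```

Under these assumptions lw is the longest path at 8 ns, so every instruction, including a 2 ns jump, pays for an 8 ns cycle.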

38
Single-Cycle Implementation (Cont'd)
  • Single-cycle, fixed-length clock
  • CPI = 1
  • Clock cycle = propagation delay of the longest
    datapath operations among all instruction types
  • Easy to implement
  • Single-cycle, variable-length clock
  • CPI = 1
  • Clock cycle = Σ ((% of type-i instructions) x
    propagation delay of the type-i instruction
    datapath operations)
  • Better than the previous, but impractical to
    implement
  • Disadvantages
  • What if we have floating-point operations?
  • How about component usage?

39
Multiple Cycle Alternative
  • Break an instruction into smaller steps
  • Execute each step in one cycle.
  • Execution sequence
  • Balance amount of work to be done
  • Restrict each cycle to use only one major
    functional unit
  • At the end of a cycle
  • Store values for use in later cycles, why?
  • Introduce additional internal registers
  • The advantages
  • Cycle time much shorter
  • Diff. inst. take different # of cycles to
    complete
  • Functional unit used more than once per
    instruction

40
Multiple-Cycle Implementation
  • Datapath
  • Component sharing: ALU, Instruction/Data memory
  • ALU used to compute address, increment PC
  • Memory used for instruction AND data
  • Additional elements: MUXs, Instr Register,
    Target Register
  • If a value needs to be alive during multiple
    cycles, it should stay unchanged during the whole
    time.
  • Control
  • Needed for each datapath element during each
    clock cycle.

41
Five Step Execution
  • 1. Instruction Fetch (Ifetch)
  • Fetch instruction at address (PC)
  • Store instruction in register IR
  • Increment PC
  • 2. Instruction Decode and Register Fetch
    (Decode)
  • Decode instruction format, read register
  • Store register contents in registers A and B
  • Compute new PC address, store it in ALUOut
  • 3. Execution, Memory Address Computation, or
    Branch Completion (Execute)
  • Compute memory address (for LW and SW), or
  • Perform R-type operation (for R-type
    instruction), or
  • Update PC (for Branch and Jump)
  • Store memory address or register operation result
    in ALUOut

42
Five Step Execution (cont'd)
  • 4. Memory Access or R-type instruction completion
    (MemRead/RegWrite)
  • Read memory at address ALUOut, store it in MDR, or
  • Write ALUOut content into register file, or
  • Write B into memory at address ALUOut
  • 5. Write-back step (WrBack)
  • Write the memory content read into register file
  • Number of cycles for an instruction
  • R-type
  • lw
  • sw
  • Branch or Jump

An exercise for the user
43
Some Simple Questions
  • How many cycles will it take to execute this
    code?
        lw  $t2, 0($t3)
        lw  $t3, 4($t3)
        beq $t2, $t3, Label    # assume branch not taken
        add $t5, $t2, $t3
        sw  $t5, 8($t3)
    Label: ...
  • What is going on during the 8th cycle of
    execution?
  • In what cycle does the actual addition of $t2 and
    $t3 take place?

[Figure: cycle timeline marked 1, 5, 10, 15, 20]
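These questions can be checked mechanically by laying the instructions end to end. The sketch assumes the usual step counts for this multicycle design (lw 5, sw 4, R-type 4, beq 3); the names `CYCLES` and `schedule` are invented for illustration.

```python
# Assumed cycles per instruction in the five-step multicycle design
CYCLES = {"lw": 5, "sw": 4, "add": 4, "beq": 3}

program = ["lw", "lw", "beq", "add", "sw"]

def schedule(program):
    """Return one (instruction index, opcode, step) entry per clock cycle."""
    timeline = []
    for index, op in enumerate(program):
        for step in range(1, CYCLES[op] + 1):
            timeline.append((index, op, step))
    return timeline

timeline = schedule(program)
total_cycles = len(timeline)
eighth_cycle = timeline[7]                        # cycles are numbered from 1
add_cycle = timeline.index((3, "add", 3)) + 1     # step 3 is the ALU operation
```

With these counts the code takes 21 cycles, cycle 8 is step 3 of the second lw (memory address computation), and the addition of $t2 and $t3 happens in cycle 16.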
44
Transition slide: 5 steps in detail
45
Step 1: Instruction Fetch
  • Use PC to get instruction, put it in IR.
  • Increment PC by 4, put the result back in PC.
  • Can you write this using the RTL notation?
  • IR <- Memory[PC]; PC <- PC + 4
  • What is the advantage of updating the PC now?

46
Step 2: I-Decode and Register Fetch
  • Read registers rs and rt in case we need them
  • Compute branch address in case instruction is
    branch
  • RTL: A <- Reg[IR[25-21]];
  • B <- Reg[IR[20-16]];
  • ALUOut <- PC + (sign-extend(IR[15-0]) << 2)
  • Did we set any control lines based on the
    instruction type? (we are busy "decoding" it in
    our control logic)

(";" means in parallel)
47
Step 3 (Instruction dependent)
  • ALU is performing 1 of 3 functions, based on
    instruction type
  • Memory Reference: ALUOut <- A +
    sign-extend(IR[15-0])
  • R-type: ALUOut <- A op B
  • Branch: if (A == B) then (PC <- ALUOut)

48
Step 4 (R-type or memory-access)
  • Loads and stores access memory: MDR <-
    Memory[ALUOut] or Memory[ALUOut] <- B
  • R-type instructions finish: Reg[IR[15-11]] <-
    ALUOut
  • When does the write actually take place?
  • At the end of the cycle, on the clock edge.

49
Step 5: Write-Back
  • Reg[IR[20-16]] <- MDR
  • What about all the other instructions?

50
Single cycle
51
Multi-cycle
(Now, critical path dependent on longest
delay for string of components used in 1 of 5
steps)
  • Where do we need to insert muxs?
  • Other functional units?

52
Execution Sequence Summary
IR <- Memory[PC]
PC <- PC + 4
A <- Reg[IR(25:21)]
B <- Reg[IR(20:16)]
ALUOut <- PC + (SignEx(IR(15:0)) << 2)
53
Multiple Cycle Design
  • Break up instructions into steps, each step takes
    1 cycle
  • balance work to be done
  • restrict each cycle to use only 1 major
    functional unit
  • At the end of a cycle
  • store values for use in later cycles (easiest
    thing to do)
  • introduce additional internal registers

54
Control Signals
New:
  • PC: PCWrite, PCWriteCond, PCSource
  • Memory: IorD, MemRead, MemWrite
  • IR: IRWrite
  • Reg. File: RegWrite, MemtoReg, RegDst
  • ALU: ALUSrcA, ALUSrcB, ALUOp, ALUCnt.

Old: RegDst, MemToReg, RegWrite, MemRead, MemWrite,
Branch, ALUSrc, ALUOp, ALUCnt.
55
Implementing the Control
  • Value of control signals is dependent upon
  • what instruction is being executed
  • which step is being performed
  • Use accumulated information to specify a finite
    state machine
  • use a state diagram, or
  • use microprogramming
  • Implementation can be derived from specification

56
Graphical Specification of FSM
[Figure: FSM state diagram, listed here state by state]
  • State 0, Instruction fetch (start): MemRead, ALUSrcA = 0, IorD = 0,
    IRWrite, ALUSrcB = 01, ALUOp = 00, PCWrite, PCSource = 00
  • State 1, Instruction decode / register fetch: ALUSrcA = 0,
    ALUSrcB = 11, ALUOp = 00
  • State 2, Memory address computation: ALUSrcA = 1, ALUSrcB = 10,
    ALUOp = 00
  • State 3, Memory access (load): MemRead, IorD = 1
  • State 4, Memory read completion: RegDst = 0, RegWrite, MemToReg = 1
  • State 5, Memory access (store): MemWrite, IorD = 1
  • State 6, Execution: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
  • State 7, R-type completion: RegDst = 1, RegWrite, MemToReg = 0
  • State 8, Branch completion: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01,
    PCWriteCond, PCSource = 01
  • State 9, Jump completion: PCWrite, PCSource = 10
Tells us what values are needed and during what
step
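The transitions between these states depend only on the current state and the opcode, so they fit in a short next-state function. The sketch below assumes the usual state numbering for this diagram (0 fetch, 1 decode, 2 address computation, 3 and 5 memory access, 4 memory read completion, 6 execution, 7 R-type completion, 8 branch, 9 jump); `next_state` and `trace` are illustrative names.

```python
LW, SW, RTYPE, BEQ, J = 0x23, 0x2B, 0x00, 0x04, 0x02

def next_state(state, opcode):
    """Next FSM state given the current state and the instruction's opcode."""
    if state == 0:                       # fetch always goes to decode
        return 1
    if state == 1:                       # decode dispatches on the opcode
        return {LW: 2, SW: 2, RTYPE: 6, BEQ: 8, J: 9}[opcode]
    if state == 2:                       # address computed: load or store?
        return 3 if opcode == LW else 5
    if state == 3:                       # load needs a write-back state
        return 4
    if state == 6:                       # R-type needs a completion state
        return 7
    return 0                             # states 4, 5, 7, 8, 9 finish the instruction

def trace(opcode):
    """List the states one instruction visits, starting at fetch."""
    states = [0]
    while True:
        following = next_state(states[-1], opcode)
        if following == 0:
            return states
        states.append(following)

lw_states = trace(LW)
```

The length of each trace is the instruction's cycle count: five for lw, four for sw and R-type, three for beq and j.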
57
Finite State Machine for Control
Control logic is inside this box (could be
implemented in many different ways)
The outputs that we want now also dependent
on the current state.
could be ROM, logic, etc.
Inputs (which now also include the previous state)
(Still might need ALU control logic and hence
function code developed earlier)
58
Microprogramming
  • For our example, state diagrams, combinational
    logic more than adequate
  • But we're dealing with a small subset of the MIPS
    processor
  • Full MIPS instruction set has over 100
    instructions
  • In 1 implementation, instructions take from 1 to
    20 clock cycles
  • Control would be much more complex for this case
  • Another alternative: microcoding
  • Think of control signals that must be asserted in
    a state as an instruction to be executed by
    datapath
  • Call these microinstructions

59
Micro-instructions
  • microinstruction
  • Set of datapath control signals that must be
    asserted in given state
  • Executing it has the effect of asserting the control signals
    specified by the instruction
  • How do we sequence?
  • In some cases, fetch next instruction
  • Next instruction just depends on state
  • In others, consider inputs
  • i.e. next instruction depends on state and input
  • Like assembly language, must branch explicitly
  • microprogramming
  • Designing control as a program that implements
    machine instructions in simpler terms

60
Microprogramming guidelines
  • Make each field of microinstruction responsible
    for specifying a non-overlapping set of control
    signals
  • Signals never asserted simultaneously may share
    same field
  • Have fields that a.) control datapath elements and
    b.) a field that handles sequencing
  • (i.e. selecting the next instruction)
  • Microinstructions usually in a ROM or PLA
  • Therefore can assign addresses
  • Like choosing #s for FSM elements

61
Example fields
62
Choosing the next instruction
  • How do we choose what's next?
  • Increment the address of the current microinstruction
    to obtain the next
  • Put Seq in the sequencing field
  • (Most common case, usually default)
  • Branch to the microinstruction that begins the
    next instruction
  • Place Fetch in the sequencing field
  • Choose next microinstruction based on control
    unit inputs
  • This is called a dispatch
  • Usually implemented by creating a table
    containing addresses of target microinstructions
  • (May be implemented in a ROM)

63
Dispatch tables
  • Often (and realistically), there is more than 1
  • Example: state diagram constructed earlier
  • We would need 2 dispatch tables here
  • 1 to dispatch from state 1
  • 1 to dispatch from state 2
  • Indicate that the next microinstruction should be chosen
    by a dispatch operation by placing Dispatch i
    in the sequencing field
  • (i is the table #)
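The three sequencing choices (Seq, Fetch, Dispatch i) can be sketched as a next-address function over two dispatch tables. The label names (Mem1, LW2, SW2, Rformat1, BEQ1, JUMP1) and their addresses are assumptions in the style of the textbook microprogram, chosen to line up with states 0 to 9 of the state diagram.

```python
LW, SW, RTYPE, BEQ, J = 0x23, 0x2B, 0x00, 0x04, 0x02

# Hypothetical microinstruction addresses, one per labeled microinstruction
LABELS = {"Fetch": 0, "Mem1": 2, "LW2": 3, "SW2": 5,
          "Rformat1": 6, "BEQ1": 8, "JUMP1": 9}

# Table 1 dispatches from state 1 (decode); table 2 from state 2 (address computed)
DISPATCH = {
    1: {RTYPE: "Rformat1", BEQ: "BEQ1", J: "JUMP1", LW: "Mem1", SW: "Mem1"},
    2: {LW: "LW2", SW: "SW2"},
}

def next_micro_address(current, sequencing, opcode):
    """Pick the next microinstruction address from the sequencing field."""
    if sequencing == "Seq":              # default: fall through to the next one
        return current + 1
    if sequencing == "Fetch":            # start the next machine instruction
        return LABELS["Fetch"]
    if sequencing.startswith("Dispatch "):
        table = DISPATCH[int(sequencing.split()[1])]
        return LABELS[table[opcode]]
    raise ValueError("unknown sequencing field: " + sequencing)

after_decode_lw = next_micro_address(1, "Dispatch 1", LW)
```

Each dispatch table is indexed by the opcode only, which is why it maps naturally onto a small ROM.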

64
Recall
[Figure: the FSM state diagram from the earlier slide "Graphical Specification of FSM" (states 0-9), repeated]
65
Possible Values
66
Creating the microprogram
  • In microprogram, 2 situations where we could
    leave a field of microinstruction blank
  • When field that controls a functional unit or
    that causes state to be written (i.e. Memory
    field, ALU dest field) is blank, no control
    signals should be asserted
  • When a field only specifies control of a
    multiplexor that determines input to a functional
    unit, (i.e. SRC1), leaving it blank means that we
    do not care about input to functional unit (or
    output of multiplexor)

67
Example
  • 1st component of every instruction execution is
    to fetch instructions, decode them, and compute
    the sequential and branch target PC
  • Correspond directly to 1st 2 steps of execution
    described (see p.385-388)
  • 2 microinstructions needed for 1st two steps are
    below

68
Example
  • To understand each microinstruction, look at the
    effect of a group of fields
  • In 1st microinstructions, fields asserted and
    their effects are

Label field containing label Fetch, will be used
in Sequencing field when microprogram wants to
start execution of next instruction.
69
The entire microprogram
70
Control Example
  • Can you generate the control signal table?
  • How about micro-programmed implementation?

71
Sample Microinstruction
  • Ifetch: IR <- Mem[PC]; PC <- PC + 4

Microinstruction: 1d011ddd000100d11
72
A few words on MIPS exceptions
73
What is an exception?
  • Exception
  • An event other than a branch or a jump that
    changes the normal flow of an instruction
    execution
  • Often called an interrupt as well
  • Examples

74
Processing exceptions
  • For the OS to process an exception, it must know why it
    was caused and which instruction caused it
  • (i.e. arithmetic exception, invalid instruction)
  • One method
  • (used in MIPS)
  • Have a status register called Cause Register
  • Holds a field that indicates reason for exception
  • Another method
  • Vectored interrupts
  • Address to which control is transferred
    determined by cause of exception
  • OS knows reason for the exception by address at
    which it's initiated

75
Need more HW
  • To process exceptions we need more HW
  • EPC
  • A 32-bit register that holds address of affected
    instruction
  • (Needed even with vectored interrupts)
  • Cause
  • Register used to record cause of exception
  • In MIPS, 32 bits
  • We'll also need 2 more control signals
  • EPCWrite and CauseWrite

76
Finally, augmenting our FSM
[Figure: the FSM from "Graphical Specification of FSM" (states 0-9, unchanged) with two exception states added]
  • State 10 (from state 1 when Op = other): IntCause = 0, CauseWrite,
    ALUSrcA = 0, ALUSrcB = 01, ALUOp = 01, EPCWrite, PCWrite,
    PCSource = 11
  • State 11 (from state 7 on Overflow): IntCause = 1, CauseWrite,
    ALUSrcA = 0, ALUSrcB = 01, ALUOp = 01, EPCWrite, PCWrite,
    PCSource = 11
77
CS 2200 Lecture 7: Interrupts, Memory-Mapped I/O
  • (Lectures based on the work of Jay Brockman,
    Sharon Hu, Randy Katz, Peter Kogge, Bill Leahy,
    Ken MacKenzie, Richard Murphy, and Michael
    Niemier)

78
Interrupts
  • What's an interrupt?
  • #1 Idea: an unsolicited procedure call.
  • Actual procedure called an exception/trap/interrupt
    handler
  • Why do we need them?
  • Or put another way, what would we have to do if
    we didn't have them?

(Example: constantly or periodically check
I/O, peripheral devices, etc.)
79
Interrupts
  • How can interrupts be generated?

?
80
Interrupts
  • Different Types (2200 Definitions)
  • Exception - Associated with certain instruction
  • Overflow
  • Illegal Instruction
  • Traps - System calls
  • Interrupt - Asynchronous event not associated
    with a certain instruction (e.g. I/O device).

81
Interrupts/Exceptions/Traps
82
Interrupts
  • Hardware
  • System bus contains 1 or more interrupt lines.
  • Need to know who
  • might put device type code on data lines
  • might put address of table entry
  • might put address of handling routine
  • May have priority scheme
  • What would priority be based on?
  • How would it work?
  • What has to happen?

i.e. what do we do, consider if interrupt is
caused by HW?
83
Interrupts
  • Hardware (Continued)
  • Save current PC on stack
  • Why the stack?
  • Other possibilities?
  • Go somewhere to handle interrupt
  • Check each device
  • Must be quick
  • Interrupt vector table
  • Located in low memory
  • Table of pointers

(interrupt might tell CPU to go to this table
specific location is pointer to routine to
handle analogous to assembly code)
84
Interrupts
  • Hardware (Continued)
  • What if we get interrupted while handling an
    interrupt?
  • What do we do when handling the interrupt is
    complete?
  • Special Instruction: RETI
  • Can a user disable interrupts?
  • followed by
  • while(1)

85
Interrupts
  • Software
  • System call (Monitor call)
  • Why do we need such a construct?
  • Concept of Mode
  • Mode bit
  • User mode
  • Can execute limited instruction set
  • Supervisor or Kernel or Monitor Mode
  • Used by OS
  • Can execute all instructions
  • Switch to user mode before returning to user.

86
Interrupts
  • Interrupt handler code
  • Like a function
  • Pointed to by vector table or address supplied by
    device
  • Must save state of interrupted process

(very much like a procedure call)
87
Today: Interrupts
  • A. Running example: an I/O device
  • e.g., network interface
  • B. Interrupt mechanics: Hardware
  • C. Interrupt mechanics: Software (handlers)
  • D. Aside: CPU load of interrupts
  • E. Generalizing: interrupts/exceptions/traps
  • and connect back to protection

88
A. Running Example
  • I/O Device: a network interface

89
Network Interface? (NI)
?
90
Crude Network Interface: input-only
  • 1. Network sends us messages: need some state to
    store those messages
  • 2. Need to know that messages have arrived
  • 3. Need some scheme to be sure we read a message
    before the network overwrites it.

91
Crude Network Interface
1. data area
DAV bit (Data AVailable bit):
2. set by network
3. reset by software
92
How to connect it? Make it look like another
memory unit
(could use combinational logic in control to
help check/process)
93
Memory-Mapped I/O
  • NI is a 17-word block mapped to 0xF0000000
  • Existing 1024-word memory at 0x00000000
  • How do you wire up two memory units?
  • hardware question
  • How do you read messages from the NI?
  • software question

LC-2200 address space (top to bottom):
0xFFFFFFFF, 0xF0000000, 0x000003FF, 0x00000000
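The hardware question comes down to address decoding: look at the address and enable exactly one unit. A minimal sketch, assuming word addressing (consistent with a 1024-word memory ending at 0x000003FF) and using the made-up name `select_unit`:

```python
MEM_BASE, MEM_WORDS = 0x00000000, 1024   # existing memory
NI_BASE, NI_WORDS = 0xF0000000, 17       # network interface block

def select_unit(addr):
    """Return (unit name, word offset inside that unit) for an address."""
    if MEM_BASE <= addr < MEM_BASE + MEM_WORDS:
        return ("memory", addr - MEM_BASE)
    if NI_BASE <= addr < NI_BASE + NI_WORDS:
        return ("ni", addr - NI_BASE)
    return ("unmapped", None)

last_ni_word = select_unit(0xF0000010)
```

From the software side nothing new is needed: reading a message from the NI is an ordinary load from an address in the 0xF0000000 block.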
94
Memory-Mapped Devices
  • Network, disk, display, sound, keyboard, mouse
  • Add data/control registers of each to addr. space
  • And continuously check for input??

95
B. Interrupt Mechanics: Hardware
96
Interrupts
97
Interrupts
[Figure: Processor, Device 1, and Device 2 on a shared Address Bus and Data Bus, with an Int line]
Add an interrupt request line. A device wishing
to interrupt asserts this line
98
Interrupts
The interrupt line is connected to the processor
control (state machine)
99
Interrupts
At the beginning of every instruction execution
sequence a check is made on the status of the
"int" line
100
Interrupts
If "int" is asserted special states can be used
to handle the interrupt
101
Interrupts
[Figure: the same bus diagram with an Inta line added]
If the processor decides to handle the interrupt,
it asserts the inta (interrupt acknowledge) line
102
Interrupts
If Device 1 was one of the devices asserting
"int" it receives the acknowledgement and doesn't
pass it on
103
Interrupts
If Device 1 wasn't one of the devices asserting
"int" it receives the acknowledgement and passes
it on
104
Interrupts
Assume it's Device 2 that wants to interrupt.
105
Interrupts
Now knowing that the processor is listening,
Device 2 can put the address of its entry in the
interrupt vector table onto the data bus
106
Interrupts
[Figure: the same bus diagram with Memory added; example contents shown: 0x12345678 0x3579BDFA 0x12345678 0x3579BDFE]
The interrupt vector table is located in very low
memory and consists of a table of pointers to
interrupt handling routines
107
Interrupts
This allows the processor to jump to the code to
handle the interrupt
108
Interrupts
Once complete the handler executes a "return from
interrupt" instruction
109
Hardware Mechanics Summary
  • 1. Interrupt signal (INT)
  • devices-to-CPU?
  • 2. Interrupt Acknowledge (IACK)
  • CPU-to-devices
  • 3. Forced procedure call to interrupt handler

110
Hardware Mechanics SummarySubtleties
  • 1. Interrupt signal (INT)
  • devices-to-CPU?
  • 2. Interrupt Acknowledge (IACK)
  • CPU-to-devices
  • With multiple interrupts, which device goes
    first??
  • 3. Forced procedure call to interrupt handler
  • How do you get the address of the interrupt
    handler??
  • Where do you keep the return address?
  • n. potential recursion
  • What if you get an interrupt while servicing an
    interrupt??

111
IACK Problem: one soln: daisy-chain the IACK line
Limitations? Alternatives?
If Device 1 was one of the devices asserting
"int" it receives the acknowledgement and doesn't
pass it on
112
Which-Handler Problem
(i.e. how do we handle the interruption in the
CPU?)
  • Options?
  • 1. One handler leave dispatch to software!
  • 2. Interrupt vector table
  • device provides a number at IACK time
  • CPU (microcode) uses number to index into a table
  • CPU jumps to address in that table
  • Illustrated in preceding slides
  • 3. Raw vector
  • device provides an address at IACK time and CPU
    jumps
  • used in Project 2

113
Crude Network Interface: a la Project 2
Add 18th word: NIVEC, pointer to interrupt
handler
114
Return-Address Problem
  • Standard procedure call uses JALR and saves the
    return address in register RA
  • Interrupt procedure call can't use RA
  • it's unpredictable and would smash whatever is
    there!
  • Options?
  • many...
  • Last time: PRJ2 dedicates a processor register,
    K0

115
Recursive Interrupt Problem
What if Device 2 interrupts while the handler for
Device 1 is running? Or vice versa? Or double
interrupt from the same device?
116
Recursive Interrupt Problem
[Figure: the same bus diagram with an "intr enable" bit (shown as 0) added to the processor]
Add an interrupt enable bit to the processor:
  • 1. cleared at interrupt time
  • 2. set at RETI time
  • 3. EI/DI instrs.
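The enable bit and the dedicated return register fit together as follows. This is a toy model, not the project's hardware: the class `Cpu`, its method names, and the choice to store the return address in a field called `k0` are assumptions made to mirror the slides.

```python
class Cpu:
    """Toy model of interrupt entry and return with an enable bit."""

    def __init__(self):
        self.pc = 0
        self.k0 = 0                      # dedicated return-address register
        self.interrupts_enabled = True

    def take_interrupt(self, handler_address):
        """Forced procedure call; refused while interrupts are disabled."""
        if not self.interrupts_enabled:
            return False
        self.k0 = self.pc                # save the return address
        self.interrupts_enabled = False  # 1. cleared at interrupt time
        self.pc = handler_address
        return True

    def reti(self):
        """Return from interrupt: restore PC and re-enable interrupts."""
        self.pc = self.k0
        self.interrupts_enabled = True   # 2. set at RETI time

cpu = Cpu()
cpu.pc = 0x40
accepted_first = cpu.take_interrupt(0x200)
accepted_second = cpu.take_interrupt(0x300)   # arrives while the handler runs
cpu.reti()
```

The second interrupt is refused, so k0 is never overwritten while the first handler still needs it; the device simply keeps its request asserted until interrupts are enabled again.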
117
C. Interrupt Mechanics: Software
  • Interrupt Handlers

118
Example Device Interrupt (say, arrival of
network message)
Running program:
    ...
    add  r1,r2,r3
    subi r4,r1,4
    slli r4,r4,2
        <-- Hiccup(!): External Interrupt
    lw   r2,0(r4)
    lw   r3,4(r4)
    add  r2,r2,r3
    sw   8(r4),r2
    ...
Interrupt Handler:
    Save registers       (callee save)
    ...
    lw   r1,20(r0)       code to handle int.
    lw   r2,0(r1)
    addi r3,r0,5
    sw   0(r1),r3
    ...
    Restore registers    (callee restore)
    Clear current Int    (reset bit)
    RETI                 (return from interrupt)
119
Interrupt Mechanisms
  • Basic mechanism forced subroutine call (transfer
    of control w/saved return address)
  • Must have a means to disable interrupts to
    prevent nested, recursive interrupts.
  • one bit
  • Additions for performance
  • selective disable of multiple interrupt sources
    (priority level or a bit-per-source)
  • hardware to encode the source of the interrupt.

(if another interrupt comes along, we wait or
keep trying to send)
120
Nested Interrupts
(if higher priority interrupt comes along, we
could process it first)
Interrupted program:
    ...
    add r1,r2,r3
    subi r4,r1,4
    slli r4,r4,2
        <-- Network Interrupt: Hiccup(!)
    lw r2,0(r4)
    lw r3,4(r4)
    add r2,r2,r3
    sw 8(r4),r2
    ...
Network interrupt handler:
    Raise priority
    Reenable All Ints
    Save registers
    ...
    lw r1,20(r0)          Could be interrupted by disk
    lw r2,0(r1)
    addi r3,r0,5
    sw 0(r1),r3
    ...
    Restore registers
    Clear current Int
    Disable All Ints
    Restore priority
    RTE
Note that priority must be raised to avoid
recursive interrupts!
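The priority rule can be sketched as a small model. It assumes a request is accepted only when its priority is strictly higher than the current level; the class name `PriorityCpu`, the numeric levels, and the choice of disk above network are hypothetical values picked to match the slide's "could be interrupted by disk" note.

```python
class PriorityCpu:
    """Toy model of nested interrupts gated by a priority level."""

    def __init__(self):
        self.level = 0     # current priority; 0 is the normal program
        self.saved = []    # priority levels of the interrupted contexts
        self.log = []      # what happened, in order

    def request(self, name, priority):
        """Accept the interrupt only if it outranks the current level."""
        if priority > self.level:
            self.saved.append(self.level)
            self.level = priority            # "Raise priority"
            self.log.append("enter " + name)
            return True
        self.log.append("defer " + name)     # same or lower priority waits
        return False

    def rte(self, name):
        """Leave the handler and drop back to the interrupted level."""
        self.level = self.saved.pop()        # "Restore priority"
        self.log.append("exit " + name)
```

Because the network handler runs at the network's own level, a second network interrupt is deferred rather than nested, while a disk interrupt at a higher level is still taken.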
121
Example Handler
  • Init code
  • Write to NIVEC register
  • Handler code
  • save all registers used by handler to stack
  • do handler action
  • restore all registers used by handler from stack
  • JALR K0, ZERO
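The save/act/restore shape of the handler can be sketched as follows. This is a minimal model, assuming registers are a dictionary and the stack is a list; `run_handler` and `network_action` are hypothetical names, and the return through K0 is not modeled.

```python
def run_handler(regs, stack, used, action):
    """Run a handler body while preserving the registers it clobbers."""
    for name in used:                # save all registers used by handler to stack
        stack.append(regs[name])
    action(regs)                     # do handler action
    for name in reversed(used):      # restore in reverse (last in, first out)
        regs[name] = stack.pop()


def network_action(regs):
    """Hypothetical handler body that overwrites r1, r2, and r3."""
    regs["r1"] = 20
    regs["r2"] = 0
    regs["r3"] = 5
```

After `run_handler` returns, the interrupted program sees the same register values it had before the interrupt, which is what lets it resume as if nothing happened.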

122
D. CPU Load of Interrupts
  • Interrupts cost some CPU time

123
Suppose we have lots of devices
[Figure: Processor on the Address Bus and Data Bus with many devices attached, labeled Device 1, Device 2, ... Device 37]
All generating interrupts...
124
How do you know there's enough CPU time?
Device     Rate     Handler time
------     ----     ------------
Network    100/s    1 ms
Display    50/s     10 ms
What fraction of the CPU is consumed by
interrupts? Could we add a sound card if it took
5 ms, 100/s?
125
How do you know there's enough CPU time?
Device     Rate     Handler time     CPU fraction
------     ----     ------------     ------------
Network    100/s    1 ms             --> 10%
Display    50/s     10 ms            --> 50%
Network:    100 int/s × 1 ms/int × 1 s/1000 ms = 0.1
Display:    50 int/s × 10 ms/int × 1 s/1000 ms = 0.5
Sound card: 100 int/s × 5 ms/int × 1 s/1000 ms = 0.5
What fraction of the CPU is consumed by
interrupts? --> 60%
Could we add a sound card if it took 5 ms, 100/s?
--> that would be 50% ... no! 60% + 50% > 100%
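The same arithmetic as a short sketch, using the rates and handler times from the table. The function names and the dictionary layout are illustrative choices; times are kept as whole milliseconds of handler work per second to avoid floating-point rounding.

```python
def busy_ms_per_second(rate_per_s, handler_ms):
    """Milliseconds of CPU time a device's interrupts consume each second."""
    return rate_per_s * handler_ms


def total_busy_ms(devices):
    """Sum the load over a mapping of name -> (rate per second, handler ms)."""
    return sum(busy_ms_per_second(rate, ms) for rate, ms in devices.values())


def fits(devices):
    """True if the handlers need no more than the 1000 ms available per second."""
    return total_busy_ms(devices) <= 1000


# Values from the slide: network 100/s at 1 ms, display 50/s at 10 ms.
current = {"network": (100, 1), "display": (50, 10)}
# Proposed addition from the slide: sound card 100/s at 5 ms.
with_sound = dict(current, sound=(100, 5))
```

Here `total_busy_ms(current)` is 600 ms per second (60%), and adding the sound card brings the total to 1100 ms per second, so `fits(with_sound)` is false.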
126
E. Generalization
  • Interrupts for internal events
  • Interrupts as part of protection

127
Interrupt/Exception/Trap Classifications
  • Interrupts: caused by asynchronous, outside
    events
  • I/O devices requiring service (disk, network)
  • Clock interrupts (real time scheduling)
  • Exceptions: relevant to the current instruction
  • Faults, arithmetic traps, other synchronous traps
  • Traps: deliberately caused by the current
    instruction
  • Invoke software on behalf of the currently
    executing process
  • Other, e.g. hardware failure
  • Non-recoverable ECC, power outage, FPU is on
    fire...
  • asynchronous
  • not necessarily recoverable

128
Interrupt/Exception/Trap Classifications
  • Interrupts: caused by asynchronous, outside
    events
  • Exceptions: synchronous but unintentional
  • Traps: synchronous, intentional
  • H&P: "exceptions", of which some are interrupts
  • SGG: "interrupts", of which some are
    exceptions/traps
  • occasionally seen:
  • fault (as in page fault ... an exception in
    our terminology)
  • machine check (unrecoverably fatal condition)

WARNING: Inconsistent Terminology Zone
(first of several, unfortunately)
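The terminology used in this lecture reduces to two yes/no questions, which a short sketch makes explicit. The names `Kind` and `classify` are illustrative, and the mapping follows this lecture's definitions only, not the H&P or SGG usage.

```python
from enum import Enum


class Kind(Enum):
    INTERRUPT = "interrupt"    # asynchronous, outside event
    EXCEPTION = "exception"    # synchronous but unintentional
    TRAP = "trap"              # synchronous and intentional


def classify(synchronous: bool, intentional: bool) -> Kind:
    """Classify an event using this lecture's terminology."""
    if not synchronous:
        return Kind.INTERRUPT
    return Kind.TRAP if intentional else Kind.EXCEPTION
```

A page fault is synchronous and unintentional, so `classify(True, False)` gives `Kind.EXCEPTION`, matching the note that a fault is an exception in this terminology.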
129
Interrupts and Protection
  • Interrupts and protection are orthogonal
  • However, conventionally, interrupts switch into
    supervisor (kernel) state.
  • some interrupt handlers must be protected
  • deliberately-invoked-traps (software traps) make
    a nice interface for system calls
  • therefore, it has been convenient to have all
    interrupts go to the kernel

130
Summary (note: wrap-up visualization follows)
  • A. I/O devices memory-map their state
  • B. Interrupt mechanics: Hardware
  • C. Interrupt mechanics: Software (handlers)
  • D. CPU load of interrupts: compute % of time
  • E. General Mechanism: Interrupts/Exceptions/Traps

131
Visualization of Program Execution
[Plot: PC (mem. addr.) versus time]
132
Visualization of Program Execution
[Plot: PC (mem. addr.) versus time, annotated with a procedure call, a loop, and an interrupt]
133
Program Execution w/Protection
1. interrupts go to kernel mode
2. system calls switch to kernel mode to interact w/IO
[Plot: PC (mem. addr.) versus time, split into user space and kernel space, annotated with a loop, a system call, and an interrupt]
134
Program Execution w/Protection (& w/IO)
[Plot: PC (mem. addr.) versus time, split into I/O (kernel) space, user space, and kernel space, annotated with a loop, a system call, and an interrupt]
135
Bonus Slides
  • Speed of Interrupts

136
Example Device Interrupt (say, arrival of
network message)
Interrupted program:
    ...
    add r1,r2,r3
    subi r4,r1,4
    slli r4,r4,2
        <-- External Interrupt: Hiccup(!)
    lw r2,0(r4)
    lw r3,4(r4)
    add r2,r2,r3
    sw 8(r4),r2
    ...
Interrupt Handler:
    Raise priority
    Reenable All Ints
    Save registers
    ...
    lw r1,20(r0)
    lw r2,0(r1)
    addi r3,r0,5
    sw 0(r1),r3
    ...
    Restore registers
    Clear current Int
    Disable All Ints
    Restore priority
    RTE
137
Alternative: Polling (again, for arrival of
network message)
    Disable Network Intr
    ...
    subi r4,r1,4
    slli r4,r4,2
    lw r2,0(r4)
    lw r3,4(r4)
    add r2,r2,r3
    sw 8(r4),r2
    lw r1,12(r0)          Polling Point (check device register)
    beq r1,no_mess
    lw r1,20(r0)          Handler
    lw r2,0(r1)
    addi r3,r0,5
    sw 0(r1),r3
    Clear Network Intr
no_mess:
    ...
138
Delays of Interrupts/Polling
  • Interrupts
  • disrupts pipeline (usually must wait for a
    pipeline flush)
  • save/restore registers
  • other housekeeping (priority adjustments, kernel
    stuff)
  • Polling
  • must perform check whether there's an event
    waiting to be processed or not.
  • if check is periodic, event delivery is delayed
    by half a period if events arrive at random.
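The half-a-period figure can be checked numerically. This sketch assumes polls happen at exact multiples of the period and stands in for "arrive at random" with arrival times spread evenly across one period; the function names and the sample count are illustrative.

```python
import math


def polling_delay(arrival, period):
    """Time from an event's arrival until the next poll.

    Assumes polls occur at times 0, period, 2*period, and so on.
    """
    next_poll = math.ceil(arrival / period) * period
    return next_poll - arrival


def mean_delay(period, samples=10_000):
    """Average delay over arrival times spread evenly across one period."""
    arrivals = [(i + 0.5) * period / samples for i in range(samples)]
    return sum(polling_delay(a, period) for a in arrivals) / samples
```

For a 10 ms polling period, `mean_delay(10.0)` comes out at about 5 ms, which is the half-period delay stated above.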

139
Is Polling faster or slower than Interrupts?
  • Polling is faster!
  • Compiler knows which registers are in use at the
    polling point. Hence, do not need to save and
    restore registers (or not as many).
  • Other interrupt overhead avoided (pipeline flush,
    trap priorities, etc).
  • Interrupts are faster!
  • Overhead of polling instructions is incurred
    regardless of whether or not handler is run.
    This could add to inner-loop delay.
  • Device may have to wait for service for a long
    time.
  • When to use one or the other?
  • Multi-axis tradeoff
  • Frequent, regular events are good for polling, as
    long as the device can be controlled at user
    level.
  • Interrupts are good for infrequent/irregular
    events
  • Interrupts are good for ensuring predictable
    service of events.
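One axis of the tradeoff can be sketched with a rough cost model. Everything here is an assumption for illustration: the function names, the idea of counting cycles per second, and the sample numbers are not from the lecture. Polling pays a check on every poll whether or not an event is waiting; interrupts pay a fixed overhead (pipeline flush, register save/restore, priority changes) only when an event occurs.

```python
def polling_cost(poll_rate, check_cycles, event_rate, handler_cycles):
    """Cycles per second spent when polling.

    The check runs on every poll, regardless of whether the handler runs.
    """
    return poll_rate * check_cycles + event_rate * handler_cycles


def interrupt_cost(event_rate, overhead_cycles, handler_cycles):
    """Cycles per second spent when using interrupts.

    The overhead is paid once per event, and never when nothing happens.
    """
    return event_rate * (overhead_cycles + handler_cycles)


def cheaper(poll_rate, check_cycles, event_rate, overhead_cycles, handler_cycles):
    """Name the cheaper mechanism under this cost model."""
    p = polling_cost(poll_rate, check_cycles, event_rate, handler_cycles)
    i = interrupt_cost(event_rate, overhead_cycles, handler_cycles)
    return "polling" if p < i else "interrupts"
```

With hypothetical values of 1000 polls per second, a 2-cycle check, a 50-cycle interrupt overhead, and a 20-cycle handler, 1000 events per second favors polling while 1 event per second favors interrupts. The model ignores service latency, which is the other axis the slide mentions.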