Title: Lectures 18
1. Lectures 18
- Designing a Central Processor Unit
- The Controller State Sequencing and Output Logic
2. Memory Reference Instructions
3. Indirect Memory Reference Instructions
4. Two Register Instructions
5. One Register Instructions
6. No Register Instructions
- The NOP instruction doesn't even need one cycle
7. The Fetch Cycle
- In the final version of the processor the MDR is
the only register connected to the memory data
out bus.
- Thus three cycles are required to fetch an
instruction:
- F1: MAR ← PC, PC ← PC + 1
- F2: MDR ← Memory
- F3: IR ← MDR
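The three fetch transfers can be sketched in Python. This is only an illustration: the `Datapath` class, the dictionary used as memory and the word-addressed PC (so that PC + 1 is the next instruction) are assumptions, not part of the design in these slides.

```python
from dataclasses import dataclass, field

@dataclass
class Datapath:
    """Hypothetical register set; only the registers used by the fetch."""
    pc: int = 0
    mar: int = 0
    mdr: int = 0
    ir: int = 0
    memory: dict = field(default_factory=dict)

def fetch(dp: Datapath) -> None:
    # F1: MAR <- PC and PC <- PC + 1 happen on the same clock edge,
    # so MAR receives the old value of PC.
    dp.mar, dp.pc = dp.pc, dp.pc + 1
    # F2: MDR <- Memory[MAR] (unwritten locations read as 0 here)
    dp.mdr = dp.memory.get(dp.mar, 0)
    # F3: IR <- MDR
    dp.ir = dp.mdr

dp = Datapath(pc=4, memory={4: 0x12345678})
fetch(dp)
print(hex(dp.ir), dp.pc)  # prints 0x12345678 5
```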
8. The Controller Finite State Machine
9. Determining the state sequences
- The state sequencing depends on the instruction
we are executing.
- For example if we are executing a STORE
instruction we will branch from E2 to F1.
- If we are executing a LOADINDIRECT instruction we
will go all the way to E4 before returning to F1.
- This suggests some complex sequencing logic is
required.
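The sequencing behaviour described here can be sketched as a small table-driven model. The last states for STORE and LOADINDIRECT are taken from the bullets above; treating NOP as finishing in F3 is an assumption based on the remark that it needs no execute cycle. The names `STATE_ORDER`, `LAST_STATE`, `next_state` and `trace` are illustrative only.

```python
# Order in which the controller states are visited.
STATE_ORDER = ["F1", "F2", "F3", "E1", "E2", "E3", "E4"]

# Last state of each instruction before the controller returns to F1.
# NOP -> F3 is an assumption; the other two come from the notes.
LAST_STATE = {"NOP": "F3", "STORE": "E2", "LOADINDIRECT": "E4"}

def next_state(state, instruction):
    """Return to F1 after the instruction's last state, else step on."""
    if state == LAST_STATE[instruction]:
        return "F1"
    return STATE_ORDER[STATE_ORDER.index(state) + 1]

def trace(instruction):
    """List the states visited by one instruction, starting at F1."""
    state = "F1"
    visited = [state]
    while True:
        state = next_state(state, instruction)
        if state == "F1":
            return visited
        visited.append(state)

print(trace("STORE"))         # ['F1', 'F2', 'F3', 'E1', 'E2']
print(trace("LOADINDIRECT"))  # ['F1', 'F2', 'F3', 'E1', 'E2', 'E3', 'E4']
```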
10. Choosing the Opcodes
- So far we have not defined the opcodes, which
are the top 8 bits of the program instructions.
- We can simplify things for ourselves if we choose
them with care.
- For example, we can use the top bits to
distinguish the number of cycles required by the
instruction.
11. Choosing the top bits
- Unfortunately we have to cover five cases, not
four.
12. Generating the sequencing signal
- We now have a combinational logic design problem,
but with six inputs: Q2, Q1, Q0, IR31, IR30, IR29
- We can try to solve this using Boolean
simplification rules, or look for some trick to
reduce the problem.
13. State Assignments
- We have not allocated the states, so perhaps we
can choose them in a way that solves the problem.
Consider:
- F3 = 0 0 1
- E1 = 0 1 0
- E2 = 1 0 0
- E3 = 1 1 0
- We define a variable C in such a way that it is
0 on the last execute cycle, 1 otherwise.
- C = (IR31 ⊕ Q2) + (IR30 ⊕ Q1) + (IR29 ⊕ Q0)
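A small sketch of how C behaves, assuming the lost operators in the definition are exclusive-ORs combined by OR, so that C is 0 exactly when the top three opcode bits equal the current state code. Under that reading an instruction's top bits are simply the code of its last state. The function name and the example instruction are hypothetical.

```python
STATE_CODES = {
    "F3": (0, 0, 1),
    "E1": (0, 1, 0),
    "E2": (1, 0, 0),
    "E3": (1, 1, 0),
}

def continue_signal(ir_top, state):
    """C = (IR31 ^ Q2) | (IR30 ^ Q1) | (IR29 ^ Q0).

    ir_top is (IR31, IR30, IR29); state is (Q2, Q1, Q0).
    """
    return int(any(i ^ q for i, q in zip(ir_top, state)))

# Hypothetical instruction whose last cycle is E2: top bits 1 0 0.
ir_top = STATE_CODES["E2"]
for name, code in STATE_CODES.items():
    print(name, continue_signal(ir_top, code))
# prints: F3 1, E1 1, E2 0, E3 1 (one pair per line)
```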
14. Completing the State assignment table
15. We continue using the standard methodology
16. Giving us the following Karnaugh maps
- D2 C Q2 Q1 C Q1 Q0
- D1 C Q1 C Q2 Q0 Q2 Q1 Q0
- D0 Q2 Q1 Q0 Q2 Q1 Q0 C Q2 Q1 Q0
17. Further simplification
- Again we can use the EOR simplification rule
- D0 Q2 Q1 Q0 Q2 Q1 Q0 C Q2 Q1 Q0
- D0 Q2 (Q1 ⊕ Q0) C Q2 Q1 Q0
- But, since we will need to decode the states for
the output logic, we will not bother with this
18. The final circuit
19. Start Up
- We did not check whether the circuit will be safe
at start up, but it is.
- We will need to add extra hardware to make the
processor do something particular at start up
(and maybe also on a signal from a reset button),
so the design will be safe in any case.
20. The Output Logic
- We have now successfully designed the state
sequencing logic, and all that remains is to
design the output logic.
- Recall that the Moore machine had no connection
between the inputs and the output logic. This is
a safer design methodology.
- However, for the processor we use the Mealy
machine.
21. The output logic of the controller
- Our problem looks like this
22. Decoding the instructions
- We can use a standard de-multiplexer (binary to
unary decoder) to decode the instructions, i.e.
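As a rough software analogue, a binary to unary (one-hot) decoder raises exactly one output line for each input value. The sketch below assumes the opcode is the top 8 bits of a 32-bit instruction word, as stated earlier; the function names are illustrative.

```python
def decode_one_hot(value, width):
    """Return 2**width output lines with only line `value` set to 1."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the given width")
    return [int(i == value) for i in range(1 << width)]

def opcode_of(ir):
    """Extract the top 8 bits of a 32-bit instruction word."""
    return (ir >> 24) & 0xFF

lines = decode_one_hot(opcode_of(0x05000000), 8)
print(lines.index(1), sum(lines))  # prints 5 1
```

In hardware each output line would correspond to one instruction signal (LOAD, STORE and so on) that the output logic can AND with the state signals.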
23. Clock Gates
- The clock gate signals c0 to c8 determine which
register is loaded at each cycle.
- The MAR will use this typical gating circuit
24. Gating The MAR
- To determine when the MAR should be loaded we
need to look through all the register transfers:
- CMAR = F1 + E1·(LOAD + STORE)
+ E2·(LOADINDIRECT + STOREINDIRECT)
25. Using don't care states
- However, the only time we need the MAR to be
correct is before we load the MDR. At other
times we can load it without disturbing the
execution.
- Hence
- CMAR = F1 + E1 + E2
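The effect of the don't cares can be checked by comparing the exact gating condition with the simplified one. This sketch assumes the MAR conditions read CMAR = F1 + E1·(LOAD + STORE) + E2·(LOADINDIRECT + STOREINDIRECT) before simplification and CMAR = F1 + E1 + E2 after; the function names and the short instruction list are illustrative.

```python
STATES = ["F1", "F2", "F3", "E1", "E2", "E3", "E4"]
INSTRUCTIONS = ["LOAD", "STORE", "LOADINDIRECT", "STOREINDIRECT"]

def c_mar_exact(state, instr):
    """Load the MAR only when a register transfer needs it."""
    return (state == "F1"
            or (state == "E1" and instr in {"LOAD", "STORE"})
            or (state == "E2" and instr in {"LOADINDIRECT", "STOREINDIRECT"}))

def c_mar_simplified(state):
    """Load the MAR in every F1, E1 and E2, needed or not."""
    return state in {"F1", "E1", "E2"}

pairs = [(s, i) for s in STATES for i in INSTRUCTIONS]
# Every required load must still happen after simplification.
never_missed = all(c_mar_simplified(s) for s, i in pairs if c_mar_exact(s, i))
# Loads that are now performed although nothing needs them.
extra = [(s, i) for s, i in pairs
         if c_mar_simplified(s) and not c_mar_exact(s, i)]
print(never_missed, len(extra))  # prints True 4
```

The four extra loads are harmless only because the MAR is always reloaded before the MDR is next loaded from memory.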
26. The MDR Clock
- From the register transfers:
- CMDR = F2 + E2·LOAD + E3·LOADINDIRECT
- This can be further simplified using don't cares
- Note the MDR must be preserved from cycle 1 to
cycle 3 during the CALL instruction
27. The Register Clocks
- The register to be clocked is recorded in IR
bits 20-22. The condition for Rdest to receive a
pulse is:
- CRdest = E4 + E3·(LOAD + ADD + INC + DEC + COMP)
+ E2·(ASL + MOVE + CALL + CALLINDIRECT)
+ E1·CLEAR
- It cannot be simplified further
28. The Register Clocks
- A decoder is required to determine the register
clocks.
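A sketch of the decoder, assuming eight registers selected by IR bits 22 to 20 (bit 20 taken as least significant) and a single gating signal `c_rdest` that is already true whenever the destination register should be pulsed. Both the bit ordering and the function name are assumptions.

```python
def register_clocks(ir, c_rdest):
    """Return eight clock-enable lines; at most one is 1."""
    rdest = (ir >> 20) & 0b111  # IR bits 22..20 select the register
    return [int(bool(c_rdest) and i == rdest) for i in range(8)]

ir = 0b101 << 20  # hypothetical instruction with Rdest = register 5
print(register_clocks(ir, True))   # [0, 0, 0, 0, 0, 1, 0, 0]
print(register_clocks(ir, False))  # [0, 0, 0, 0, 0, 0, 0, 0]
```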
29. The Shifter Function
- The shifter function is defined as follows.
- We will use 00 (no change) as the default
30. The Shifter Function
- Determining the two bits is straightforward as we
only use the shifter during shift instructions.
- f4 = ASR + LSR
- f3 = ASL + LSR
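The two shifter bits can be sketched as simple instruction tests. This assumes the conditions read f4 = ASR + LSR and f3 = ASL + LSR (the separators between instruction names were lost in the slide text), with 00 meaning no change. The function name is illustrative.

```python
def shifter_function(instr):
    """Return (f4, f3) for the given instruction name."""
    f4 = int(instr in {"ASR", "LSR"})
    f3 = int(instr in {"ASL", "LSR"})
    return f4, f3

for instr in ["ADD", "ASL", "ASR", "LSR"]:
    print(instr, shifter_function(instr))
# ADD (0, 0), ASL (0, 1), ASR (1, 0), LSR (1, 1)
```

Any instruction that is not a shift falls through to the default (0, 0), so no state signal is needed in these two equations.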
31. The ALU Function
- From the table:
- f2 = E3·(COMP + OR + AND) + E2·(COMP + DEC)
- f1 = E3·(SUBTRACT + COMPARE + DEC + INC + ADD + AND)
+ E2·(COMP + DEC)
- f0 = E3·(DEC + INC + ADD + OR) + E2·(COMP + DEC)
32. The carry in bit
- The default will be 0
- The one place where a carry in of 1 is required is INC in E3
- Thus
- f5 = INC·E3
33. The multiplexer selection bits
- The multiplexer functions are defined as follows
34. The internal bus selector s6 s5 s4
- We will choose the shifter to be the default
(0,0,0). The conditions for selecting the other
inputs come from the register transfers:
- SPC = E2·(CALL + CALLINDIRECT)
- SALU = E1·CLEAR + (E2 + E3)·(INC + DEC + COMP)
+ TWO·E3
- SMask = E1·(LOAD + JUMP + STORE) + E3·CALL
- SMDR = LOAD·E3 + E4
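The selection conditions can be sketched as a function that names the bus source for a given state and instruction. This is an interpretation under assumptions: the lost separators are read as ORs, TWO is taken to mean "this is a two register instruction" and is passed in as a flag, and the source names are illustrative. The actual encoding onto s6 s5 s4 is not modelled.

```python
def bus_source(state, instr, two_register=False):
    """Name the internal bus source; the shifter is the default."""
    if state == "E2" and instr in {"CALL", "CALLINDIRECT"}:
        return "PC"
    if ((state == "E1" and instr == "CLEAR")
            or (state in {"E2", "E3"} and instr in {"INC", "DEC", "COMP"})
            or (state == "E3" and two_register)):
        return "ALU"
    if ((state == "E1" and instr in {"LOAD", "JUMP", "STORE"})
            or (state == "E3" and instr == "CALL")):
        return "MASK"
    if (state == "E3" and instr == "LOAD") or state == "E4":
        return "MDR"
    return "SHIFTER"

print(bus_source("E2", "CALL"))   # PC
print(bus_source("E1", "CLEAR"))  # ALU
print(bus_source("E3", "LOAD"))   # MDR
print(bus_source("E2", "ASL"))    # SHIFTER
```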
35. The internal bus selector s6 s5 s4
- Using the don't cares:
- S4 = SALU + SMAR
- S5 = SPC
- S6 = SMask + SMDR
36. How did we do?
- We can now make a wiring list, buy the components
and test it.
- The components will cost 200-300 (over twice
the price of a Pentium IV)
- The clock could be set at about 10 kHz
- So it looks as if we had better consider the Mark
2 version straight away.
37. Improvements
- All instructions are 32 bit, but mostly the
bottom 16 bits are empty.
- We could have normal instructions as 16 bit, with
a multiplexer arrangement to put the top 16 or
bottom 16 bits of the IR to the controller.
- 32 bit (memory reference) instructions could be
made to start on a 4 byte boundary by inserting a
NOP instruction if necessary
38. Instruction Packing Schemes
- More efficient packing schemes could be devised,
since some instructions need only one byte, e.g.
SKIP, NOP
- By packing we could probably reduce the time
taken on fetching by more than half.
39. More Arithmetic hardware
- We have three unused inputs on the multiplexer
that selects the internal bus.
- Additional arithmetic hardware could include:
- A sixteen bit multiplier (multiply the bottom 16
bits of A and B to obtain a 32 bit result)
- An incrementer
- A decrementer
40. Other functionality
- A circuit to test if the result (or internal bus)
was zero would enable us to provide a SKIP_EQUAL
instruction.
- This would require a 32 bit OR gate and a single
bit register.
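The zero test can be sketched as a 32-input OR followed by a one-bit register. The class and function names are illustrative, and storing the inverted OR output (so the flag is 1 when the bus is zero) is an assumption about the polarity.

```python
def or_reduce_32(bus):
    """Model a 32-input OR gate: 1 if any of the low 32 bits is set."""
    result = 0
    for bit in range(32):
        result |= (bus >> bit) & 1
    return result

class ZeroFlag:
    """Single-bit register holding 1 when the last clocked bus value was 0."""
    def __init__(self):
        self.value = 0

    def clock(self, bus):
        self.value = 1 - or_reduce_32(bus)

flag = ZeroFlag()
flag.clock(0x00000000)
print(flag.value)  # prints 1
flag.clock(0x00010000)
print(flag.value)  # prints 0
```

A SKIP_EQUAL instruction could then test this flag, for example after a subtraction, to decide whether to increment the PC an extra time.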
41. More Multiplexers
- Additional multiplexers could help us to reduce
the instruction cycles of many instructions.
- For instance a multiplexer to select the input to
B independently of A would reduce many three
cycle instructions to two cycles.
42. More Data Paths
- A data path from the registers to the internal
bus would reduce some instructions by one cycle.
- This would require an additional input on the bus
selector multiplexer, and so might be considered
an alternative to the additional arithmetic
functions already discussed.
43. Optimised Combinational logic
- This is the hard part.
- We want to have the minimum time delays in all
our combinational logic.
- This is partly a question of path length, but
does require looking at low level transistor
models to calculate the time accurately.
44. And that is the end of the course
- I hope you enjoyed it,
- Have a good Christmas
- Our Christmas present to you
45. Christmas present
- No hardware questions in the Christmas test!