Title: Chapter 5. Control Design
Two approaches for control unit design
- A hard-wired control unit
  - A sequential logic circuit that generates specific fixed sequences of control signals.
  - → Its behavior can be changed only by redesign.
- A microprogrammed control unit
  - Built by organizing control signals into microinstructions. The signals are implemented by a kind of software (or firmware) rather than hardware.
  - → Design change: change the contents of the control memory.
  - → Emulation: a microprogrammed CPU can execute programs written in the machine language of other computers.
  - Disadvantages
    - → Slower, due to the microinstruction fetch.
    - → More costly, due to the presence of the control memory and its access circuits.
5.1.2. Hardwired Control
- Design method 1: the classical method of sequential circuit design. For a P-state circuit, $\lceil \log_2 P \rceil$ flip-flops are required.
- Design method 2: the one-hot method, with one flip-flop per state. Expensive in terms of flip-flops, but it simplifies CU design and debugging.
- Example: GCD processor
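The flip-flop counts of the two design methods can be compared with a small sketch. The function names are illustrative only and are not taken from the slides.

```python
def flip_flops_classical(num_states: int) -> int:
    """Flip-flops for a binary state assignment: ceil(log2 P)."""
    if num_states < 1:
        raise ValueError("a circuit needs at least one state")
    # (P - 1).bit_length() equals ceil(log2 P) for every integer P >= 1
    return (num_states - 1).bit_length()


def flip_flops_one_hot(num_states: int) -> int:
    """Flip-flops for a one-hot state assignment: one per state."""
    if num_states < 1:
        raise ValueError("a circuit needs at least one state")
    return num_states


for p in (4, 6, 16):
    print(p, flip_flops_classical(p), flip_flops_one_hot(p))
```

For a 4-state controller this gives 2 flip-flops with the classical method and 4 with the one-hot method; the gap widens quickly as the number of states grows.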
Classical method
State assignment: S0 = 00, S1 = 01, S2 = 10 and S3 = 11
[Equations (5.9), (5.10) and (5.11): not transcribed]
One-hot method
State assignment: S0 = 0001, S1 = 0010, S2 = 0100 and S3 = 1000
- The one-hot method is limited to a small number of states.
- The next-state and output equations have a simple and systematic form.
The one-hot design method
1. Construct a P-row state table that defines the desired input-output behavior.
2. Associate a separate D-type flip-flop $D_i$ with each state $S_i$, and assign the P-bit one-hot binary code $D_1, D_2, \ldots, D_{i-1}, D_i, D_{i+1}, \ldots, D_P = 0, 0, \ldots, 0, 1, 0, \ldots, 0$ to $S_i$.
3. Design a combinational circuit C that generates the primary output signals $z_k$ and the secondary (next-state) signals $D_i^+$. $D_i^+$ is defined by the logic equation
   $$D_i^+ = \sum_{j=1}^{P} D_j \left( I_{j,1} + I_{j,2} + \cdots + I_{j,n_j} \right)$$
   where $I_{j,1}, I_{j,2}, \ldots, I_{j,n_j}$ denote all input combinations that cause a transition from $S_j$ to $S_i$. If $z_k = 1$ (active) only in rows $k,h$ for $h = 1, 2, \ldots, m_k$, then $z_k$ is defined by
   $$z_k = D_{k,1} + D_{k,2} + \cdots + D_{k,m_k}$$
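A minimal sketch of step 3, assuming the state table is given as a list of (source state, input condition, destination state) rows. The 3-state table and the signal names `start` and `done` are hypothetical and are not the GCD or multiplier controllers from the slides. Each next-state equation ORs together, for every source state, that state's flip-flop ANDed with the inputs that cause the transition.

```python
from collections import defaultdict


def one_hot_next_state_equations(transitions):
    """Build one-hot next-state equations from a state table.

    transitions: iterable of (src_state, input_condition, dst_state).
    Returns {"D<i>+": "D<j>.(cond + cond) + ..."} as readable strings.
    """
    # grouped[dst][src] lists every input condition causing src -> dst
    grouped = defaultdict(lambda: defaultdict(list))
    for src, cond, dst in transitions:
        grouped[dst][src].append(cond)

    equations = {}
    for dst in sorted(grouped):
        terms = []
        for src in sorted(grouped[dst]):
            conds = " + ".join(grouped[dst][src])
            terms.append(f"D{src}.({conds})")
        equations[f"D{dst}+"] = " + ".join(terms)
    return equations


# Hypothetical 3-state controller: idle (0), busy (1), finish (2).
state_table = [
    (0, "start'", 0),
    (0, "start", 1),
    (1, "done'", 1),
    (1, "done", 2),
    (2, "1", 0),  # unconditional return to idle
]

for name, expr in one_hot_next_state_equations(state_table).items():
    print(name, "=", expr)
```

Because every equation has the same sum-of-products shape, adding or removing a state changes only the terms that mention it, which is the debugging advantage the one-hot method trades for extra flip-flops.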
Design of hardwired control for a twos-complement (2C) multiplier
5.2 Microprogrammed Control
- Instruction: implemented by a sequence of one or more sets of concurrent micro-operations.
- Microprogramming: control-signal selection and sequencing information is stored in a ROM or RAM called a control memory (CM), and each microinstruction is fetched from the CM.
- Emulation: a microprogrammed computer C1 can be used to execute programs written in the machine language L2 of some other computer C2 by placing an emulator for L2 in the CM of C1.
Wilkes design: microinstruction (μI)
How to decide the μI word length
1. The degree of parallelism required at the micro-operation level.
2. How the control information is represented or encoded.
3. How the next μI address is specified.
- Parallelism in μI
  - If every useful combination of parallel micro-operations were specified by a single opcode, the number of opcodes would be enormous and the decoder complicated.
  - → Divide the micro-operation specification part into k disjoint control fields, each of which can specify an operation performed simultaneously with those of the other fields.
  - → In the IBM 360/50, the μI is 90 bits wide (21 partitioned control fields).
  - → Wilkes design: a 1-bit control field for each control signal.
Un-encoded form (4-bit)
c0  c1  c2  c3  Micro-operation
1   0   0   0   R ← X0
0   1   0   0   R ← X1
0   0   1   0   R ← X2
0   0   0   1   R ← X3
0   0   0   0   No op
Encoded form (3-bit)
- n independent control signals → $\lceil \log_2(n+1) \rceil$ bits (the extra code represents the no-op).
- A decoder is needed.
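The field-width rule and the decoder can be sketched as follows. The code assignment used here (code 0 for no-op, code k for signal c(k-1)) is an assumption made for illustration, since the encoded table itself is not in the transcript.

```python
def encoded_field_width(n_signals: int) -> int:
    """Bits needed to select one of n signals or no-op: ceil(log2(n + 1))."""
    if n_signals < 0:
        raise ValueError("number of signals cannot be negative")
    # n.bit_length() equals ceil(log2(n + 1)) for every integer n >= 0
    return n_signals.bit_length()


def decode(code: int, n_signals: int) -> tuple:
    """Expand an encoded field into one-hot control lines (c0, c1, ...).

    Assumed assignment: code 0 is no-op, code k activates c(k-1).
    """
    if not 0 <= code <= n_signals:
        raise ValueError("unused code")
    return tuple(1 if code == k + 1 else 0 for k in range(n_signals))


print(encoded_field_width(4))  # 4 signals plus no-op fit in 3 bits
print(decode(2, 4))            # activates c1 only
```

With four signals the un-encoded field needs 4 bits and the encoded field 3 bits, at the price of the decoder and of losing the ability to activate two of these signals in the same μI.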
μI: horizontal vs. vertical
- Horizontal form
  - → long format
  - → able to express a high degree of parallelism
  - → little encoding of the control information
- Vertical form
  - → short format
  - → limited ability to express parallelism
  - → considerable encoding of the control information
- μI addressing
  - Use the μPC (as the primary source of the next address).
  - Conditional branching
    - Condition-select subfield.
    - Branch address: store a complete address field, or only the lower-order bits of the address, which restricts the range of a branch instruction to a small region of the CM.
- Timing
  - Monophase: a single clock pulse synchronizes all the control signals; the control signals are active for the duration of the microinstruction's execution cycle.
  - Polyphase: divide a clock cycle into phases, with each control signal active during one of the phases. This increases the complexity of the μI format (it must specify the phase in which each control signal is active).
Ex) Timing of a 4-phase μI (R ← R1 op R2)
A microprogram sequencer generates the μI addresses for the CM. It comprises the μPC and all the logic needed for next-address generation.
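A minimal sketch of next-address generation, assuming a sequencer that either increments the μPC or, on a taken branch, replaces the low-order bits of the μPC with the branch field. The parameter names and the wrap-around on increment are assumptions for illustration, not details from the slides.

```python
def next_address(upc: int, take_branch: bool, branch_field: int,
                 field_bits: int, address_bits: int) -> int:
    """Return the next control-memory address.

    take_branch  : True when the μI is a branch and its selected
                   condition holds
    branch_field : branch address bits stored in the μI
    field_bits   : width of branch_field; when it equals address_bits
                   the field is a complete address
    """
    if not take_branch:
        # sequential flow: μPC + 1, wrapping at the end of the CM (assumed)
        return (upc + 1) % (1 << address_bits)
    mask = (1 << field_bits) - 1
    # keep the high-order μPC bits, replace the low-order ones
    return (upc & ~mask) | (branch_field & mask)


print(next_address(45, False, 0, 4, 6))  # sequential
print(next_address(45, True, 3, 4, 6))   # short branch within the current region
print(next_address(45, True, 3, 6, 6))   # branch with a complete address
```

With a 4-bit branch field in a 64-word CM, a branch can only reach the 16 words that share the current high-order address bits, which is the range restriction noted for short branch addresses.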
Minimizing the width of CM
- μIs $I_1, I_2, \ldots, I_n$: each activates a subset of the control signals $C_1, C_2, \ldots, C_m$.
- → We want an encoding method that places in one control field only signals that can't be activated at the same time.
- An encoded control field can activate only one control signal at a time.
- Two control signals can be included in the same control field if and only if they are never simultaneously activated by a μI.
Algorithm
1. Find the set of maximal compatibility classes (MCCs), defined as the compatibility classes to which no control signal can be added without introducing a pair of incompatible control signals.
   - Two control signals $C_{i1}$ and $C_{i2}$ are compatible, and so may share an encoded control field, if $C_{i1} \in I_j$ implies $C_{i2} \notin I_j$, and vice versa.
   - A compatibility class is a set of control signals that are pairwise compatible.
2. Determine all minimal MCC covers. A minimal MCC cover is a set of MCCs that includes every control signal and from which no MCC can be removed. (Note that a minimal MCC cover does not always yield a minimum value of the cost function W.)
3. For each minimal MCC cover, form the solutions that include each control signal in exactly one subset of some $C_i$, compute the cost W of the resulting solutions, and select one with the minimal cost.
Deriving MCC
- Denote $S_i$ as the set of compatibility classes $C_i$ such that $C_i$ contains i control signals $C_{ij}$.
- $S_1$ is simply the n original control signals.
- $S_i$ consists of all possible i-member compatibility classes.
- Using $S_i$, construct $S_{i+1}$ as follows:
  - For each $C_i \in S_i$, add a control signal $C_{ik}$ to $C_i$ to form C.
  - If C is a compatibility class, then add C to $S_{i+1}$ and delete $C_i$ and all subsets of C from $S_i$.
- Stop when $S_k = \emptyset$ for some $k \le n+1$.
- The MCCs are the classes that remain in $S_1 \cup S_2 \cup \cdots \cup S_{k-1}$.
- Example: find the minimum number of bits in the control fields.
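The level-by-level construction can be sketched as below. The four microinstructions are an assumed example, chosen because their MCCs come out as a, cd, bde, bdh, deg, dgh, efg and fgh; the table actually used on the slide is not in the transcript. Instead of deleting subsets from each level, this variant records a class as maximal when no further signal can be added to it, which yields the same set of MCCs.

```python
def maximal_compatibility_classes(microinstructions):
    """Return the MCCs of the control signals used by the given μIs.

    microinstructions: list of sets, each holding the control signals
    that one μI activates.
    """
    signals = sorted(set().union(*microinstructions))

    def compatible(x, y):
        # compatible: no microinstruction activates both signals
        return not any(x in mi and y in mi for mi in microinstructions)

    level = {frozenset([s]) for s in signals}  # S_1: one class per signal
    mccs = set()
    while level:
        next_level = set()
        for cls in level:
            extended = False
            for s in signals:
                if s not in cls and all(compatible(s, t) for t in cls):
                    next_level.add(cls | {s})  # a member of S_(i+1)
                    extended = True
            if not extended:
                mccs.add(cls)  # nothing can be added: cls is maximal
        level = next_level
    return sorted("".join(sorted(c)) for c in mccs)


# Assumed microinstruction table (not from the transcript).
example = [set("abcg"), set("aceh"), set("adf"), set("bcf")]
print(maximal_compatibility_classes(example))
```

The search is exponential in the worst case, which is acceptable for the handful of control signals in a classroom example.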
Minimal MCC covers (similar to the prime implicant covering problem)
- Cover table: one row for each MCC $C_i$, one column for each control signal $C_{ij}$.
- C1 = a, C2 = cd, C3 = bde, C4 = bdh, C5 = deg, C6 = dgh, C7 = efg, C8 = fgh
- Find the minimal MCC covers
  → Row and column deletion from a cover table:
  1. Delete all essential MCCs and all columns marked in the essential rows.
  2. Delete all but one of any identical columns.
  3. Delete all dominating columns.
  4. Delete all dominated rows.
- After finding the two essential MCCs, C1 and C2, we get the reduced cover table.
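For a table this small, the minimal covers can also be found by exhaustive search rather than by row and column deletion. The sketch below is that brute-force alternative, not the reduction procedure itself; it uses the eight MCCs listed for the example.

```python
from itertools import combinations


def minimal_covers(mccs, signals):
    """Enumerate covers from which no MCC can be removed.

    mccs    : dict mapping an MCC name to the signals it contains
    signals : every control signal that must be covered
    """
    target = set(signals)
    names = sorted(mccs)
    covers = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            if set().union(*(mccs[n] for n in combo)) != target:
                continue  # some signal is left uncovered
            # minimal: dropping any single member must break the cover
            if all(set().union(*(mccs[m] for m in combo if m != n)) != target
                   for n in combo):
                covers.append(combo)
    return covers


mcc_table = {"C1": "a", "C2": "cd", "C3": "bde", "C4": "bdh",
             "C5": "deg", "C6": "dgh", "C7": "efg", "C8": "fgh"}
for cover in minimal_covers(mcc_table, "abcdefgh"):
    print(cover)
```

Every cover found contains C1 and C2, the only classes holding a and c, which matches their role as essential MCCs.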
- If {C1, C2, C4, C7} is partitioned as {a, cd, bh, efg} → width W = 1 + 2 + 2 + 2 = 7 bits
- If {C1, C2, C4, C7} is partitioned as {a, c, bdh, efg} → width W = 1 + 1 + 2 + 2 = 6 bits
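The two widths can be checked with a short sketch. It assumes the cost function W is the sum of $\lceil \log_2(n_i + 1) \rceil$ over the control fields, where $n_i$ is the number of signals in field i; that assumption reproduces the 7-bit and 6-bit figures. The helper `best_partition` is illustrative: it tries every way of assigning a signal that appears in several classes of the cover to exactly one of them.

```python
from itertools import product


def control_width(fields):
    """W = sum of ceil(log2(n + 1)) over fields holding n signals each."""
    # len(f).bit_length() equals ceil(log2(len(f) + 1))
    return sum(len(f).bit_length() for f in fields)


def best_partition(cover):
    """Assign each signal to one class of the cover so that W is minimal."""
    signals = sorted(set().union(*cover))
    # for each signal, the indices of the classes that may hold it
    choices = [[i for i, cls in enumerate(cover) if s in cls] for s in signals]
    best = None
    for pick in product(*choices):
        fields = [[] for _ in cover]
        for s, i in zip(signals, pick):
            fields[i].append(s)
        width = control_width(fields)
        if best is None or width < best[0]:
            best = (width, ["".join(f) for f in fields])
    return best


print(control_width(["a", "cd", "bh", "efg"]))   # d kept with c
print(control_width(["a", "c", "bdh", "efg"]))   # d moved next to b and h
print(best_partition(["a", "cd", "bdh", "efg"]))
```

Moving d out of the two-signal field shrinks that field to one bit while the three-signal field still fits in two bits, which is where the saved bit comes from.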
Encoding by function
- A drawback of the minimum-width control field: functionally unrelated control signals are combined.
Multiple μ-instruction formats
- Branch instructions, which specify no control signals.
- Action instructions, which have no branching capability.
- This approach is also used at the instruction level.
μ-program sequencer
- Places all the circuitry required to generate μI addresses in a single IC, made possible by advances in VLSI.
- A general-purpose building block for a μ-programmed CU.
- Simplifies CPU design.
Nanoprogrammed Computer
[Figure: a μ-programmed computer (instruction, μPC, μCM, μIR, control signals) compared with a nanoprogrammed computer (instruction, nIR, control signals)]
Criteria
- → Size of the CM.
- → Speed reduction (μprogramming needs one fetch, nanoprogramming two) due to the extra memory access and the more complex controller.
- → The advantage of nanoprogramming is greater design flexibility.
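(Compare the size of CM)
- Size of control memory in nanoprogramming: a μCM of $H_m$ words, each $W_m$ bits wide, plus a nano control memory of $H_n$ words, each $W_n$ bits wide.
  - Total size: $S_2 = H_m \times W_m + H_n \times W_n$
- Size of a comparable single-level μCM of $H_m$ words: $S_1 = H_m \times W_m$ (with $W_m$ here the width of the single-level microinstruction)
- Usually $H_m$ is large and $W_m$ small, while $H_n$ is small and $W_n$ large. (Many microinstructions can use the same nanoprogrammed control.)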
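Big advantage of nanoprogramming: design flexibility
- 1-level μCM: $S_1 = H_m \times (\lceil \log_2 H_m \rceil + N)$
- Nanoprogramming: $S_2 = H_m \times (\lceil \log_2 H_m \rceil + \lceil \log_2 H_n \rceil) + H_n \times N$
- Let $r = H_n / H_m$, the ratio of unique nano-control states to the total number of μ-control states for all instructions, so that $H_n = r H_m$.
- $S_2 = H_m \times (\lceil \log_2 H_m \rceil + \lceil \log_2 r H_m \rceil) + r H_m \times N \approx H_m \left( 2 \lceil \log_2 H_m \rceil + \lceil \log_2 r \rceil + rN \right)$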
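Example) For the 68000 microprocessor (N = 70, $H_m$ = 650, r = 0.4), which approach is better?
- 1-level CM design: $S_1 = 650 \times (\lceil \log_2 650 \rceil + 70) = 650 \times 80 = 52{,}000$ bits
- Nanoprogramming: $H_n = 0.4 \times 650 = 260$, so $S_2 = 650 \times (\lceil \log_2 650 \rceil + \lceil \log_2 260 \rceil) + 260 \times 70 = 650 \times 19 + 18{,}200 = 30{,}550$ bits
- In this case, nanoprogramming needs less control memory than single-level microprogramming.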
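The comparison can be reproduced with a few lines. The function names are illustrative; the formulas are the S1 and S2 expressions used in the example, with sizes in bits.

```python
def ceil_log2(x: int) -> int:
    """ceil(log2 x) for an integer x >= 1, without floating point."""
    return (x - 1).bit_length()


def single_level_size(hm: int, n: int) -> int:
    """S1 = Hm * (ceil(log2 Hm) + N)."""
    return hm * (ceil_log2(hm) + n)


def two_level_size(hm: int, hn: int, n: int) -> int:
    """S2 = Hm * (ceil(log2 Hm) + ceil(log2 Hn)) + Hn * N."""
    return hm * (ceil_log2(hm) + ceil_log2(hn)) + hn * n


hm, n = 650, 70
hn = 260  # r * Hm with r = 0.4
print(single_level_size(hm, n))   # 52000
print(two_level_size(hm, hn, n))  # 30550
```

Varying `hn` shows the trade-off: the smaller the share of unique nano-control states, the larger the saving, while for r close to 1 the two-level scheme costs more than the single-level one.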
5.3 Pipeline Control
- Performance is measured by throughput in MIPS.
- [Throughput equation not transcribed], where f is the pipeline's clock frequency.
- Efficiency (utilization): $E(m)$
- Speedup: $S(m) = T(1) / T(m)$
  - $T(m)$: the execution time on an m-stage pipeline
  - $T(1)$: the execution time on a non-pipelined processor
  - $S(m) = m \times E(m)$
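A small numerical sketch of these measures, under idealized assumptions that are not stated on the slide: no stalls, every stage takes the same time, the non-pipelined processor needs m stage times per instruction, and the pipeline needs m - 1 cycles to fill before completing one instruction per cycle.

```python
import math


def pipeline_metrics(n_instructions: int, m: int, stage_time: float = 1.0):
    """Return (T(1), T(m), S(m), E(m)) for an ideal m-stage pipeline."""
    t1 = n_instructions * m * stage_time        # non-pipelined execution time
    tm = (m + n_instructions - 1) * stage_time  # fill time, then one result per cycle
    speedup = t1 / tm                           # S(m) = T(1) / T(m)
    efficiency = speedup / m                    # from S(m) = m * E(m)
    return t1, tm, speedup, efficiency


t1, tm, s, e = pipeline_metrics(n_instructions=7, m=4)
print(t1, tm, s, e)
assert math.isclose(s, 4 * e)
```

As the number of instructions grows, the speedup approaches m and the efficiency approaches 1 in this ideal model; real pipelines fall short because of dependencies and branches.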
- Performance/cost ratio: $PCR = f / K$
  - f: the pipeline's clock frequency
  - K: hardware cost
- Suppose the pipeline P has m stages for SI.
  - a: the delay of a non-pipelined processor for SI
  - Each stage of P has delay a/m plus an extra delay b due to the buffer register, so $f = \dfrac{1}{a/m + b}$
  - Hardware cost: $K = cm + d$
  - c: buffer-register cost per stage
  - d: cost of the pipeline's data processing logic
  - $PCR = \dfrac{1}{(a/m + b)(cm + d)}$
- To maximize PCR with respect to m, set $\dfrac{d\,PCR}{dm} = 0$, which gives $m_{opt} = \sqrt{\dfrac{ad}{bc}}$
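The optimum can be checked numerically. The sketch assumes $f = 1/(a/m + b)$ and $K = cm + d$, as implied by the stage delay and hardware cost defined above; minimizing $(a/m + b)(cm + d)$ gives $m = \sqrt{ad/(bc)}$. The parameter values are arbitrary illustrative numbers.

```python
import math


def pcr(m: float, a: float, b: float, c: float, d: float) -> float:
    """Performance/cost ratio f / K of an m-stage pipeline."""
    f = 1.0 / (a / m + b)  # clock frequency: one stage delay per cycle
    k = c * m + d          # buffer registers plus data processing logic
    return f / k


def optimal_stages(a: float, b: float, c: float, d: float) -> float:
    """Stage count that maximizes PCR: sqrt(a * d / (b * c))."""
    return math.sqrt(a * d / (b * c))


a, b, c, d = 16.0, 1.0, 1.0, 4.0  # illustrative values only
m_opt = optimal_stages(a, b, c, d)
print(m_opt)
print(pcr(m_opt - 1, a, b, c, d), pcr(m_opt, a, b, c, d), pcr(m_opt + 1, a, b, c, d))
```

The printed values on either side of `m_opt` are lower than the one at `m_opt`, which is what a maximum requires; in practice m must be rounded to an integer number of stages.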
5.3.3 Superscalar Processing
- Superscalar operation performs more than one instruction per cycle by fetching, decoding, and executing several instructions concurrently.
- A superscalar computer has a single CPU with multiple execution units that attempts to exploit the parallelism implicit in computer programs.
- In Fig. 5.66, the superscalar design has a potential speedup of 10.
- With K independent m-stage pipeline E-units, speedup factors of up to K × m are possible for a superscalar CPU.
  - This places a heavy demand on the instruction-fetch logic.
  - It requires a large, fast instruction and data cache.
- Important factors for the PCU of a superscalar computer
  - Instruction types: a floating-point add instruction has to be issued to a floating-point E-unit, not to an integer E-unit.
  - E-unit availability.
  - Data dependencies: to avoid conflicting use of registers, data-dependency constraints among the operands must be satisfied.
  - Control dependencies: reduce the impact of branch instructions on pipeline efficiency.
  - Program order: instructions must eventually produce results in the order specified by the program, even if the results may be computed out of order internally.
- Read: dynamic instruction scheduling and branch prediction.