Title: Combinational Circuits
1Combinational Circuits
2Circuit Families
- Static CMOS with nMOS pull-down, pMOS pull-up
- Most widely used and available in most standard libraries
- Pseudo n-MOS
- Pass transistor
3Bubble Pushing
- Must think of NAND or NOR in static CMOS
- $Y = AB + CD$
- Start with network of AND / OR gates
- Convert to NAND / NOR + inverters
- Push bubbles around to simplify logic
- Remember DeMorgan's Law
DeMorgan's Law
$\overline{A \cdot B} = \overline{A} + \overline{B}$
$\overline{A + B} = \overline{A} \cdot \overline{B}$
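A quick way to convince yourself that bubble pushing preserves the function is to check the truth table exhaustively. The sketch below assumes the example function is Y = AB + CD and compares the AND/OR form against a NAND-NAND form; the helper names (`nand`, `and_or`, `nand_nand`) are illustrative, not from the slides.

```python
from itertools import product

def nand(a: bool, b: bool) -> bool:
    """2-input NAND."""
    return not (a and b)

def and_or(a: bool, b: bool, c: bool, d: bool) -> bool:
    """Starting network: two ANDs feeding an OR (Y = AB + CD)."""
    return (a and b) or (c and d)

def nand_nand(a: bool, b: bool, c: bool, d: bool) -> bool:
    """After bubble pushing: the OR with inverted inputs becomes a NAND."""
    return nand(nand(a, b), nand(c, d))

def equivalent() -> bool:
    """Compare both forms over all 16 input combinations."""
    return all(
        and_or(*v) == nand_nand(*v)
        for v in product([False, True], repeat=4)
    )

def demorgan_holds() -> bool:
    """Check both DeMorgan identities for every pair of inputs."""
    return all(
        (not (a and b)) == ((not a) or (not b))
        and (not (a or b)) == ((not a) and (not b))
        for a, b in product([False, True], repeat=2)
    )
```

Both checks should return `True`, which is just DeMorgan's Law applied at the output gate.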
4Example
- Sketch a design using one compound gate and one NOT gate.
- $Y = AB + CD$
5Compound Gates
- Logical Effort of compound gates
6Example
- The multiplexer has a maximum input capacitance of 16 units on each input. It must drive a load of 160 units. Estimate the delay of the NAND and compound gate designs.
- $H = 160 / 16 = 10$ (path electrical effort)
- $B = 1$ (branching effort)
- $N = 2$ (number of stages)
7NAND Solution
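Path parasitic delay: $P = 2 + 2 = 4$
Path logical effort: $G = (4/3)(4/3) = 16/9$
Path effort: $F = GBH = 160/9 \approx 17.8$
Best stage effort: $\hat{f} = \sqrt{F} \approx 4.2$
Path delay: $D = N\hat{f} + P = 2(4.2) + 4 = 12.4\tau$
Work backwards to find sizes: $f = gh \Rightarrow 4.2 = (4/3)(160/y) \Rightarrow y \approx 50$ (input capacitance of the 2nd NAND2). The same can be done for the first set of NAND2s, or we already know $y = 16$ for them.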
8Compound Solution
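Path parasitic delay: $P = 4 + 1 = 5$
Path logical effort: $G = (6/3)(1) = 2$
Path effort: $F = GBH = 20$
Best stage effort: $\hat{f} = \sqrt{F} \approx 4.5$
Path delay: $D = N\hat{f} + P = 2(4.5) + 5 = 14\tau$
Work backwards to find sizes: $f = gh \Rightarrow 4.5 = (1)(160/y) \Rightarrow y \approx 36$ (input capacitance of the INV). The same can be done for the first (compound) stage, or we already know $y = 16$.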
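The two solutions follow the same logical-effort recipe, so they can be sketched as one small calculation. The per-gate values used here are assumptions based on the usual logical-effort tables (NAND2: g = 4/3, p = 2; AOI22 compound gate: g = 2, p = 4; inverter: g = 1, p = 1), and the function names are illustrative.

```python
import math

def path_delay(g_stages, p_stages, H, B=1.0):
    """Return (best stage effort, minimum path delay in units of tau).

    g_stages / p_stages: logical effort and parasitic delay of each stage.
    H: path electrical effort, B: path branching effort.
    """
    G = math.prod(g_stages)      # path logical effort
    F = G * B * H                # path effort
    N = len(g_stages)            # number of stages
    f_hat = F ** (1.0 / N)       # best stage effort
    D = N * f_hat + sum(p_stages)
    return f_hat, D

def input_cap(g, c_out, f_hat):
    """Work backwards from f = g * h = g * (c_out / c_in) to the input capacitance."""
    return g * c_out / f_hat

H = 160 / 16  # load of 160 units, 16 units allowed per input

# NAND design: two NAND2 stages in series
f_nand, d_nand = path_delay([4 / 3, 4 / 3], [2, 2], H)

# Compound design: AOI22 followed by an inverter
f_comp, d_comp = path_delay([2, 1], [4, 1], H)

# Input capacitance of the final stage in each design
y_nand = input_cap(4 / 3, 160, f_nand)
y_comp = input_cap(1, 160, f_comp)
```

Under these assumptions the NAND design comes out roughly 1.5 tau faster than the compound-gate design, because the compound gate carries more parasitic delay.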
9Example
- Annotate your designs with transistor sizes that
achieve this delay (rounded to integer values)
10Input Order
- Logical effort and parasitic delay are different for different inputs
- Some gates like AOI21 are inherently asymmetric
- Other gates have slightly different logical efforts and parasitic delays for different inputs
- Our parasitic delay model was too simple
- Calculate parasitic delay for Y falling
- One input is 1, the other rises 0 -> 1
- If B arrives latest -> x = Vdd
- $D = (R/2)(2C) + R(6C) = 7RC = 2.33\tau$
- If A arrives latest -> x = 0
- $D = R(6C) = 6RC = 2\tau$
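The two delays above are Elmore delays of the NAND2 pull-down stack. A minimal sketch, assuming each series nMOS has resistance R/2, the internal node x has capacitance 2C, the output has 6C, and tau = 3RC (R and C are normalized to 1; the function name is illustrative):

```python
from fractions import Fraction

def elmore(stages):
    """Elmore delay of an RC ladder.

    stages: list of (segment resistance, node capacitance), ordered from
    the supply rail outward. Each node capacitance is multiplied by the
    total resistance between that node and the rail.
    """
    total = Fraction(0)
    r_to_rail = Fraction(0)
    for seg_r, node_c in stages:
        r_to_rail += seg_r
        total += r_to_rail * node_c
    return total

HALF_R = Fraction(1, 2)

# B (outer input) arrives last: node x starts at Vdd and must be discharged too.
d_b_latest = elmore([(HALF_R, 2), (HALF_R, 6)])   # in units of RC

# A (inner input) arrives last: x is already at 0, so it adds no charge.
d_a_latest = elmore([(HALF_R, 0), (HALF_R, 6)])   # in units of RC

TAU = 3  # tau = 3RC
d_b_latest_tau = d_b_latest / TAU
d_a_latest_tau = d_a_latest / TAU
```

The difference between the two cases is exactly the (R/2)(2C) term contributed by the internal node.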
11Inner Outer Inputs
- Outer input is closest to supply rail (B)
- Inner input is closest to output (A)
- If input arrival time is known
- Connect latest (i.e., timing critical) input to
inner terminal (A) for smallest delay
12Asymmetric Gates
- Asymmetric gates favor one input over another
- Ex: suppose input A of a NAND gate is most critical
- Use smaller transistor on A (less capacitance)
- Boost size of noncritical input
- So total resistance is same
- $g_A = 10/9 < 4/3$ (the value for a symmetric NAND)
- $g_B = 5/3 > 4/3$ (the value for a symmetric NAND)
- Improvement in logical effort on input A comes at the cost of higher effort on the reset input
13Symmetric Gates
- Inputs can be made perfectly symmetric (NAND2)
- If A comes earlier, then x is charged to 1; if B comes earlier, then x is charged to 1. In either case, the falling delay is the same.
14Skewed Gates
- Skewed gates favor one edge over another
- Ex: suppose rising output of inverter is most critical
- Downsize noncritical nMOS transistor
- Calculate logical effort by comparing to unskewed inverter with same effective resistance on that edge.
- $g_u$ (rising transition) $= 2.5 / 3 = 5/6 < 1$
- $g_d$ (falling transition) $= 2.5 / 1.5 = 5/3 > 1$
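The 5/6 and 5/3 figures can be reproduced with a short calculation. This sketch assumes a HI-skew inverter with pMOS width 2 and nMOS width 1/2 (so its input capacitance is 2.5) and a mobility ratio of 2; those sizes are inferred from the numbers above rather than stated on the slide, and the function name is illustrative.

```python
from fractions import Fraction

MU = 2  # assumed mobility ratio: a pMOS must be MU times wider than an nMOS for equal resistance

def skewed_inverter_effort(wp, wn, mu=MU):
    """Return (g_u, g_d) for an inverter with pMOS width wp and nMOS width wn.

    Each value compares the gate's input capacitance with that of an
    unskewed inverter that has the same drive on the edge in question.
    """
    cin = wp + wn
    # Unskewed reference with the same pull-up: same pMOS, nMOS = wp / mu
    cin_rise_ref = wp + wp / mu
    # Unskewed reference with the same pull-down: same nMOS, pMOS = mu * wn
    cin_fall_ref = wn + mu * wn
    return cin / cin_rise_ref, cin / cin_fall_ref

# HI-skew inverter: nMOS downsized from 1 to 1/2
g_u, g_d = skewed_inverter_effort(Fraction(2), Fraction(1, 2))

# Sanity check: an unskewed inverter has logical effort 1 on both edges
g_u_ref, g_d_ref = skewed_inverter_effort(Fraction(2), Fraction(1))
```

The favored (rising) edge gets a logical effort below 1, and the other edge pays for it.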
15HI- and LO-Skew
- Def: Logical effort of a skewed gate for a particular transition is the ratio of the input capacitance of that gate to the input capacitance of an unskewed inverter delivering the same output current for the same transition.
- Skewed gates reduce size of noncritical transistors
- HI-skew gates favor rising output (small nMOS)
- LO-skew gates favor falling output (small pMOS)
- Logical effort is smaller for favored direction
- But larger for the other direction
16Catalog of Skewed Gates
17Asymmetric Skew
- Combine asymmetric and skewed gates
- Downsize noncritical transistor on unimportant input
- Lo-skewed on resetn input
- Reduces parasitic delay for critical input
18Best P/N Ratio
- We have selected P/N ratio for unit rise and fall resistance (P/N ratio = 2, assuming $\mu_r = \mu_n/\mu_p = 2$).
- Alternative: choose ratio for least average delay (we already found the optimized P/N using SPICE simulations)
- Ex: inverter
- Delay driving identical inverter
- $t_{pdf} = 2(P+1)\,RC$
- $t_{pdr} = 2(P+1)(\mu_r/P)\,RC$
- $t_{pd} = (P + 1 + \mu_r + \mu_r/P)\,RC$
- Differentiate $t_{pd}$ w.r.t. $P$
- Least delay for $P = \sqrt{\mu_r}$
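The result of the differentiation can be checked numerically. The sketch below assumes a mobility ratio of 2, measures delay in units of RC, and takes the average delay as the mean of the falling and rising delays of an inverter driving an identical inverter; the function names are illustrative.

```python
import math

def avg_delay(p, mu=2.0):
    """Average of fall and rise delay (units of RC) for P/N ratio p."""
    t_fall = 2 * (p + 1)            # nMOS of resistance R drives 2(P+1)C
    t_rise = 2 * (p + 1) * mu / p   # pMOS resistance is mu*R/P
    return (t_fall + t_rise) / 2

def analytic_best_ratio(mu=2.0):
    """Setting d(t_pd)/dP = 1 - mu/P^2 to zero gives P = sqrt(mu)."""
    return math.sqrt(mu)

def numeric_best_ratio(mu=2.0, steps=4000):
    """Brute-force search over P in [0.5, 4.5) in steps of 0.001."""
    candidates = [0.5 + i * 0.001 for i in range(steps)]
    return min(candidates, key=lambda p: avg_delay(p, mu))

d_equal = avg_delay(2.0)                    # equal rise/fall sizing
d_best = avg_delay(analytic_best_ratio())   # least-average-delay sizing
improvement = (d_equal - d_best) / d_equal
```

With these assumptions the improvement in average delay is only about 3 percent, while the pMOS width shrinks from 2 to about 1.4.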
19P/N Ratios
- In general, the best P/N ratio giving the lowest average delay is the square root of that giving equal rise and fall delays.
- Only improves average delay slightly for inverters
- But significantly decreases area and power
20Observations
- Best P/N ratio should be chosen on the basis of area, power, and reliability (not only average delay)
- Smaller P/N ratio reduces area and power consumption; however, unequal rise/fall times cause duty-cycle distortion and longer path delays (if the worst edge is triggered), and reduce noise margin by lowering the switching point
21Other Circuit Families
- What makes a circuit fast?
- $I = C\,dV/dt \rightarrow t_{pd} \propto (C/I)\,\Delta V$
- low capacitance
- high current
- small swing
- Logical effort is proportional to C/I
- pMOS are the enemy!
- High capacitance for a given current
- Can we take the pMOS capacitance off the input?
- Various circuit families try to do this
22Pseudo-nMOS
- Uses a pMOS that is always ON
- Benefits
- Input Cap is smaller than a 2/1 ratioed inverter
- Drawbacks
- Has slow rising transitions
- Dissipates power when output is low
- Lower noise margin since output low is non-zero
- Rarely used
23Pass Transistor Circuits
- Pass transistors are essential to the efficient design of specific circuits such as the 6-transistor static RAM (will discuss later)
- Inputs drive diffusion terminals as well as gates
- Other gates such as XORs can also be implemented efficiently using pass transistors (6 transistors using pass transistors vs. 8 using static CMOS)
- However, because of diffusion inputs the delay depends on input driver
- Therefore, in most general purpose logic, static CMOS is superior in speed, power, and area.
24Different 2-input Mux Implementations
- 2-input multiplexer ($Y = SA + \overline{S}B$)
- CMOS Transmission Gates
- Compound gates
- Using tri-states
25Sequential Circuits
26Sequencing
- Combinational logic
- output depends on current inputs
- Sequential logic
- output depends on current and previous inputs
- Requires separating previous, current, future
- Called state or tokens
- Ex FSM, pipeline
27Sequencing Overhead
- Use flip-flops to delay fast tokens so they move through exactly one stage each cycle.
- Inevitably adds some delay to the slow tokens
- Makes circuit slower than just the logic delay
- Called sequencing overhead
- Some people call this clocking overhead
- But it applies to asynchronous circuits too
- Inevitable side effect of maintaining sequence
28Sequencing Elements
- Latch: level sensitive
- a.k.a. transparent latch, D latch
- Flip-flop: edge triggered
- a.k.a. master-slave flip-flop, D flip-flop, D register
- Timing Diagrams
- Transparent
- Opaque
- Edge-trigger
29Latch Design
- Pass Transistor Latch
- Pros
- Tiny
- Low clock load
- Cons
- Vt drop (output swing is not rail-to-rail)
- nonrestoring
- output noise sensitivity
- Dynamic (output floats and can be disturbed by leakage)
- Diffusion input (noise on input D can turn on the gate; noise can also impact output Q)
- Used in 1970s
30Latch Design
- Transmission gate
- No Vt drop
- Requires inverted clock
- Nonrestoring
31Latch Design
- Inverting buffer
- Restoring
- Fixes either
- Output noise sensitivity
- Or diffusion input noise sensitivity
- Inverted output
32Latch Design
- Tristate feedback
- Static
- Avoids dynamic node discharge due to leakage
- Input is still sensitive to noise
- Backward noise sensitivity (noise at Q can affect X)
- When $\phi = 1$, the latch is transparent, so Q is driven from D
- When $\phi = 0$, the transmission gate is off, so X is held by the feedback from Q
33Latch Design
- Buffered input
- Fixes diffusion input
- Noninverting
- Backward noise sensitivity (noise at Q can affect X)
34Latch Design
- Buffered output
- No backdriving
- Addresses all deficiencies
- Widely used in standard cells (e.g., Artisan)
- Very robust (most important)
- Recommended for all but the most performance- or area-critical designs
- Rather large
- Rather slow (1.5-2 FO4 delays)
- High clock loading
35Latch Design
- Datapath latch
- Smaller, faster
- Unbuffered input, because input noise can be better controlled in a datapath
- Used by Intel as a standard datapath latch
36Flip-Flop Design
- Flip-flop is built as pair of back-to-back latches
- Be careful about the phase of the clock
- Dynamic FF
- Static FF
- Use nonoverlapping clocks in the class project
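The "pair of back-to-back latches" idea can be illustrated with a zero-delay behavioral model. This is only a sketch: the class names are hypothetical, real latches have setup, hold, and propagation times, and the clock phases are assumed to be perfectly complementary (the master is transparent while the clock is low, the slave while it is high).

```python
class Latch:
    """Level-sensitive latch: Q follows D while enabled, otherwise holds."""

    def __init__(self):
        self.q = 0

    def update(self, d, enable):
        if enable:
            self.q = d
        return self.q


class FlipFlop:
    """Positive-edge-triggered flip-flop built from two latches."""

    def __init__(self):
        self.master = Latch()
        self.slave = Latch()

    def update(self, d, clk):
        # Master samples D while clk is low; slave passes it on while clk is high.
        m = self.master.update(d, not clk)
        return self.slave.update(m, clk)


def run(sequence):
    """Apply a list of (d, clk) pairs and return the list of Q values."""
    ff = FlipFlop()
    return [ff.update(d, clk) for d, clk in sequence]
```

For example, `run([(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)])` shows Q changing only when the clock goes high, and ignoring changes on D while the clock stays high. Swapping the two enable phases would give a negative-edge-triggered flop, which is why the phase of the clock matters.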
37Enable
- Enable: ignore clock when en = 0
- Mux: increases latch D-Q delay (preferable)
- Clock gating: increases en setup time, skew
38Reset
- Force output low when reset asserted
- Synchronous vs. asynchronous
39Set / Reset
- Set forces output high when enabled
- Flip-flop with asynchronous set and reset
40Sequencing Methods
- Flip-flops
- 2-Phase Latches
- Pulsed Latches
41Timing Diagrams
Contamination and Propagation Delays
42Max-Delay Flip-Flops
43Max Delay 2-Phase Latches
44Max Delay Pulsed Latches
45Min-Delay Flip-Flops
46Min-Delay 2-Phase Latches
Hold time reduced by nonoverlap
47Min-Delay Pulsed Latches
Hold time increased by pulse width
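The equations for these slides did not survive extraction. As a hedged reference, the sketch below encodes the usual textbook forms of the max-delay and min-delay constraints for the three sequencing methods, with zero clock skew. The symbol names (`t_pcq`, `t_pdq`, `t_ccq`, `t_setup`, `t_hold`, `t_pw`, `t_nonoverlap`) are assumptions, not recovered from the slides.

```python
# Max-delay: upper bound on logic propagation delay per cycle.

def max_delay_flop(tc, t_setup, t_pcq):
    """Flip-flops: logic gets the cycle minus clk-to-Q and setup."""
    return tc - (t_setup + t_pcq)

def max_delay_two_phase(tc, t_pdq):
    """2-phase latches: total logic in both half-cycles; two D-to-Q delays are lost."""
    return tc - 2 * t_pdq

def max_delay_pulsed(tc, t_pdq, t_pcq, t_setup, t_pw):
    """Pulsed latches: behaves like a latch for wide pulses, like a flop for narrow ones."""
    return tc - max(t_pdq, t_pcq + t_setup - t_pw)

# Min-delay: lower bound on logic contamination delay.

def min_delay_flop(t_hold, t_ccq):
    return t_hold - t_ccq

def min_delay_two_phase(t_hold, t_ccq, t_nonoverlap):
    """Hold time is effectively reduced by the nonoverlap."""
    return t_hold - t_ccq - t_nonoverlap

def min_delay_pulsed(t_hold, t_ccq, t_pw):
    """Hold time is effectively increased by the pulse width."""
    return t_hold - t_ccq + t_pw
```

A negative minimum, as can happen with enough nonoverlap, means the hold constraint is met even with no logic between the elements.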
48Time Borrowing
- In a flop-based system
- Data launches on one rising edge
- Must setup before next rising edge
- If it arrives late, system fails
- If it arrives early, time is wasted
- Flops have hard edges
- In a latch-based system
- Data can pass through latch while transparent
- Long cycle of logic can borrow time into next
- As long as each loop completes in one cycle
49Time Borrowing Example
50How Much Borrowing?
2-Phase Latches
- Use intentional time borrowing wisely
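The borrowing limit itself is missing from the extracted slide. A common textbook form for 2-phase latches with zero skew is: a signal may borrow at most half a cycle, less the setup time and the nonoverlap. The sketch below assumes that form; the parameter names are illustrative.

```python
def max_time_borrow(tc, t_setup, t_nonoverlap):
    """Upper bound on time borrowed into the next half-cycle (2-phase latches, zero skew)."""
    return tc / 2 - (t_setup + t_nonoverlap)

# Hypothetical numbers in picoseconds
borrow = max_time_borrow(tc=1000, t_setup=50, t_nonoverlap=20)
```

Increasing the nonoverlap makes hold times safer but directly reduces the time available for borrowing.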
51Clock Skew
- We have assumed zero clock skew
- Clocks really have uncertainty in arrival time
- Decreases maximum propagation delay available for logic to meet set-up time requirements
- Increases minimum contamination delay required by logic to meet hold time requirements
- Decreases time borrowing
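These three effects can be made concrete by adding a skew term to the zero-skew constraints. The sketch below uses the usual textbook forms for flip-flops and 2-phase latches; the symbol names and the numbers are assumptions for illustration only.

```python
def flop_max_logic_delay(tc, t_pcq, t_setup, t_skew=0):
    """Skew is subtracted from the time available for logic."""
    return tc - (t_pcq + t_setup + t_skew)

def flop_min_logic_delay(t_hold, t_ccq, t_skew=0):
    """Skew is added to the contamination delay the logic must provide."""
    return t_hold - t_ccq + t_skew

def latch_max_borrow(tc, t_setup, t_nonoverlap, t_skew=0):
    """Skew reduces the time a 2-phase latch can borrow."""
    return tc / 2 - (t_setup + t_nonoverlap + t_skew)

# Hypothetical numbers in picoseconds, with and without 30 ps of skew
ideal = flop_max_logic_delay(1000, 80, 50)
skewed = flop_max_logic_delay(1000, 80, 50, t_skew=30)
```

In each case the skew acts against the designer: less time for the slow paths and more padding needed on the fast ones.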
52Skew Flip-Flops
53Skew Latches
2-Phase Latches
Pulsed Latches
54Solutions for set-up and Hold-time Violations
- If setup times are violated, reduce clock speed
- Often clock speed is fixed, so redesign logic in the critical path. Also, use time borrowing (but not indiscriminately)
- If hold times are violated, chip fails at any speed
- Hold time is independent of frequency
- Can add buffers in the data path to slow data
- An easy way to guarantee hold times is to use 2-phase latches with big nonoverlap times, but 2-phase clocking:
- Increases area
- May not be available in some cases
55Safe Flip-Flop
- In class, use flip-flop with nonoverlapping clocks
- Very slow: nonoverlap adds to setup time
- But no hold times
- In industry, use a better timing analyzer
- Add buffers to slow signals if hold time is at risk
56Summary
- Flip-Flops
- Very easy to use, supported by all tools
- 2-Phase Transparent Latches
- Lots of skew tolerance and time borrowing
- Pulsed Latches
- Fast, some skew tolerance and time borrowing, hold time risk
57Metastability
- When the set-up or hold time is violated, the latch or flip-flop can become metastable
- Delay increases and output becomes indeterminate for some time before settling to a known value
- Synchronizer
- Use back-to-back flip flops to transfer single bit signals between asynchronous clock domains
- For buses
- Use hand-shaking
- Gray coding
- FIFOs
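Gray coding helps with buses because consecutive values differ in exactly one bit, so a counter sampled in another clock domain is off by at most one count even if the changing bit is caught mid-transition. A minimal sketch of the standard binary-reflected Gray code conversion (function names are illustrative):

```python
def binary_to_gray(n: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def gray_to_binary(g: int) -> int:
    """Inverse conversion: XOR together all right-shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

# 4-bit counter sequence in Gray code
codes = [binary_to_gray(i) for i in range(16)]
```

Every adjacent pair in `codes`, including the wrap from 15 back to 0, differs in a single bit, which is what makes Gray-coded pointers common in asynchronous FIFOs.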