Title: SPICE Simulation (Cont.)
1SPICE Simulation (Cont.)
2Optimization
- HSPICE can automatically adjust parameters
- Seek value that optimizes some measurement
- Example: best P/N ratio
- We've assumed 2:1 gives equal rise/fall delays
- But we see rise is actually slower than fall
- What P/N ratio gives equal delays?
- Strategies
- (1) run a bunch of sims with different P size
- (2) let HSPICE optimizer do it for us
3P/N Optimization
* fo4opt.sp
* Parameters and models
*----------------------------------------------------------------------
.param SUPPLY=1.8
.option scale=90n
.include '../models/tsmc180/models.sp'
.temp 70
.option post
* Subcircuits
*----------------------------------------------------------------------
.global vdd gnd
.include '../lib/inv.sp'
* Simulation netlist
*----------------------------------------------------------------------
Vdd vdd gnd 'SUPPLY'
Vin a gnd PULSE 0 'SUPPLY' 0ps 100ps 100ps 500ps 1000ps
X1 a b inv P='P1'         $ override PMOS width with P = P1
X2 b c inv P='P1' M=4
X3 c d inv P='P1' M=16    $ device under test
4P/N Optimization
X4 d e inv P='P1' M=64    $ load
X5 e f inv P='P1' M=256   $ load on load
* Optimization setup
*----------------------------------------------------------------------
.param P1=optrange(8,4,16)        $ search from 4 to 16, guess 8
.model optmod opt itropt=30       $ maximum of 30 iterations
.measure bestratio param='P1/4'   $ compute best P/N ratio
* Stimulus
*----------------------------------------------------------------------
.tran 1ps 1000ps SWEEP OPTIMIZE=optrange RESULTS=diff MODEL=optmod
.measure tpdr                     $ rising propagation delay
+ TRIG v(c) VAL='SUPPLY/2' FALL=1
+ TARG v(d) VAL='SUPPLY/2' RISE=1
.measure tpdf                     $ falling propagation delay
+ TRIG v(c) VAL='SUPPLY/2' RISE=1
+ TARG v(d) VAL='SUPPLY/2' FALL=1
.measure tpd param='(tpdr+tpdf)/2' goal=0   $ average prop delay
.measure diff param='tpdr-tpdf' goal=0      $ minimize diff between delays
.end
5P/N Results
- P/N ratio for equal delay is 3.6:1
- tpd = tpdr = tpdf = 84 ps (slower than 2:1 ratio with tpd = 75 ps)
- Big pMOS transistors waste power too
- Seldom design for exactly equal delays
- Compromise between using small PMOS to save area and large PMOS to equalize delays
- What ratio gives lowest average delay?
- .tran 1ps 1000ps SWEEP OPTIMIZE=optrange RESULTS=tpd MODEL=optmod
- P/N ratio of 1.4:1
- tpdr = 87 ps, tpdf = 59 ps, tpd = 73 ps
6Power Measurement
- HSPICE can measure power
- Instantaneous P(t)
- Or average P over some interval
- .print P(vdd): measures instantaneous power delivered by Vdd
- .measure pwr AVG P(vdd) FROM=0ns TO=10ns
- Power in single gate
- Connect to separate VDD supply
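The average-power measurement amounts to integrating P(t) over the interval and dividing by its length. A minimal sketch of that arithmetic follows, assuming power samples have already been exported from the simulator; the function name and the sample values are hypothetical and do not come from any HSPICE run.

```python
def average_power(times, powers, t_start, t_stop):
    """Time-average of sampled power over [t_start, t_stop] (trapezoidal rule)."""
    energy = 0.0
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        p0, p1 = powers[i], powers[i + 1]
        # Clip each sample interval to the measurement window.
        lo, hi = max(t0, t_start), min(t1, t_stop)
        if hi <= lo:
            continue
        # Linearly interpolate the power at the clipped endpoints.
        slope = (p1 - p0) / (t1 - t0)
        p_lo = p0 + slope * (lo - t0)
        p_hi = p0 + slope * (hi - t0)
        energy += 0.5 * (p_lo + p_hi) * (hi - lo)
    return energy / (t_stop - t_start)


# Hypothetical samples: time in ns, power in mW (a simple ramp).
sample_times = [0.0, 5.0, 10.0]
sample_powers = [0.0, 1.0, 2.0]
avg_full = average_power(sample_times, sample_powers, 0.0, 10.0)
avg_late = average_power(sample_times, sample_powers, 5.0, 10.0)
print(avg_full, avg_late)
```

The window arguments play the role of FROM and TO in the .measure statement.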
7Logical Effort
- Logical effort can be measured from simulation
- As with FO4 inverter, shape input, load output
- Use the Sweep command to vary H
- .tran 1ps 1000ps SWEEP H 1 8 1
8Logical Effort Plots
- Plot tpd vs. h
- Normalize by τ
- y-intercept is parasitic delay
- Slope is logical effort
- Delay fits a straight line very well in any process, as long as input slope is consistent
- d = 1.93 + 1.1h from plot
- d = 2 + 1.3h from RC delay analysis
- τ = 15 ps
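Extracting logical effort and parasitic delay from a sweep is a straight-line fit of normalized delay against h. The sketch below shows one way to do that with ordinary least squares. The delay values are synthetic, generated from the fitted line quoted on the slide (d = 1.93 + 1.1h, τ = 15 ps), so they stand in for real .measure output rather than reproduce it.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


TAU_PS = 15.0                 # normalization constant from the slide
fanouts = list(range(1, 9))   # matches SWEEP H 1 8 1

# Synthetic stand-in for measured tpd values in ps (assumption, not real data).
delays_ps = [TAU_PS * (1.93 + 1.1 * h) for h in fanouts]

normalized = [t / TAU_PS for t in delays_ps]
g_fit, p_fit = fit_line(fanouts, normalized)
print(f"logical effort g = {g_fit:.2f}, parasitic delay p = {p_fit:.2f}")
```

With measured data in place of the synthetic list, the slope gives the logical effort and the intercept gives the parasitic delay.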
9SPICE Device Models
- Level 1
- Closely related to the Shockley model discussed earlier
- Enhanced with channel length modulation and drain-induced barrier lowering
- Enhanced with the body effect
- Level 2 and 3
- Enhanced with velocity saturation, mobility degradation, and subthreshold conduction
- Level 3 allows faster simulations and better convergence
- BSIM models
- Are derived from device physics, are very complex, and use a large number of parameters (over 100) to fit the behavior of modern transistors
- Are needed for accurate simulation of sub-micron technologies
10Design Corner SPICE Deck
* corner.sp
* Step response of unloaded inverter across process corners
*----------------------------------------------------------------------
.option scale=90n     $ specifies λ = 90 nm
.param SUPPLY=1.8     $ must be set before calling .lib
.lib '../models/tsmc180/opconditions.lib' TT    $ invoke the library card, read in TT
.option post
* Simulation netlist
*----------------------------------------------------------------------
Vdd vdd gnd 'SUPPLY'
Vin a gnd PULSE 0 'SUPPLY' 50ps 0ps 0ps 100ps 200ps
M1 y a gnd gnd NMOS W=4 L=2 AS=20 PS=18 AD=20 PD=18
M2 y a vdd vdd PMOS W=8 L=2 AS=40 PS=26 AD=40 PD=26
* Stimulus
*----------------------------------------------------------------------
.tran 1ps 200ps
.alter    $ repeat simulations for a different corner
11Design Corner SPICE Deck cont.
.lib '../models/tsmc180/opconditions.lib' FF    $ invoke the library card, read in FF
.alter
.lib '../models/tsmc180/opconditions.lib' SS    $ invoke the library card, read in SS
.end
12OPCONDITIONS Library
* opconditions.lib
* for TSMC 180 nm process

* TT: typical NMOS, PMOS, voltage, temperature
.lib TT
.temp 70
.param SUPPLY='SUPPLY'
.include modelsTT.sp
.endl TT

* SS: slow NMOS, PMOS, low voltage, high temperature
.lib SS
.temp 125
.param SUPPLY='0.9*SUPPLY'
.include modelsSS.sp
.endl SS

* FF: fast NMOS, PMOS, high voltage, low temperature
.lib FF
.temp 0
.param SUPPLY='1.1*SUPPLY'
.include modelsFF.sp
.endl FF
13Combinational Circuits
14Circuit Families
- Static CMOS with NMOS pull-down and PMOS pull-up networks
- Most widely used and available in most standard libraries
- Pass transistor
- Dynamic circuits
15Bubble Pushing
- Y = AB + CD
- Start with network of AND / OR gates
- Convert to NAND / NOR + inverters
- Push bubbles around to simplify logic
- Remember DeMorgan's Law
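The bubble-pushing result can be checked exhaustively. The sketch below compares the AND/OR form of Y = AB + CD with the NAND-NAND form that DeMorgan's Law produces; the function names are illustrative only.

```python
from itertools import product


def nand(a, b):
    return not (a and b)


def and_or_form(a, b, c, d):
    """Original network: two ANDs feeding an OR."""
    return (a and b) or (c and d)


def nand_nand_form(a, b, c, d):
    """After bubble pushing: the OR with inverted inputs becomes a NAND."""
    return nand(nand(a, b), nand(c, d))


# Compare both forms over all 16 input combinations.
forms_match = all(
    and_or_form(*bits) == nand_nand_form(*bits)
    for bits in product([False, True], repeat=4)
)
print(forms_match)  # prints True
```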
16Example
- Sketch a design using one compound gate and one NOT gate.
- Y = AB + CD
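One candidate answer is an AOI22 compound gate followed by an inverter. The sketch below checks that candidate against the target function and compares transistor counts with a three-NAND2 design, assuming the usual static CMOS count of two transistors per gate input. The choice of AOI22 is an assumption about the intended solution, since the slide figure is not available here.

```python
from itertools import product


def aoi22(a, b, c, d):
    """AND-OR-INVERT compound gate: NOT(AB + CD)."""
    return not ((a and b) or (c and d))


def compound_design(a, b, c, d):
    """AOI22 followed by a NOT gate."""
    return not aoi22(a, b, c, d)


design_ok = all(
    compound_design(*bits) == ((bits[0] and bits[1]) or (bits[2] and bits[3]))
    for bits in product([False, True], repeat=4)
)

# Static CMOS uses one nMOS and one pMOS per gate input.
TRANSISTORS_PER_INPUT = 2
compound_count = TRANSISTORS_PER_INPUT * (4 + 1)   # AOI22 (4 inputs) + inverter (1 input)
nand_count = TRANSISTORS_PER_INPUT * (2 + 2 + 2)   # three NAND2 gates
print(design_ok, compound_count, nand_count)  # prints True 10 12
```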
17Compound Gates
- Logical Effort of compound gates
18Example
- The multiplexer has a maximum input capacitance
of 16 units on each input (twice the unit size).
It must drive a load of 160 units. Estimate the
delay of the NAND and compound gate designs.
H = 160 / 16 = 10, B = 1 (branching), N = 2 (number of stages)
19NAND Solution
Work backwards to find sizes:
f = gh => 4.2 = (4/3)(160/y) => y ≈ 50 (input cap of 2nd NAND2)
Can do the same for the first set of NAND2s, or we already know y = 16 for them
20Compound Solution
Work backwards to find sizes:
f = gh => 4.5 = (1)(160/y) => y ≈ 36 (input cap of the 2nd-stage inverter)
Can do the same for the first (compound gate) stage, or we already know y = 16
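The stage efforts of 4.2 and 4.5 used above follow from the path effort F = GBH and f = F^(1/N). The sketch below reproduces them and the resulting delay estimates D = Nf + P. The logical effort and parasitic delay values (NAND2: g = 4/3, p = 2; AOI22 compound gate: g = 2, p = 4; inverter: g = 1, p = 1) are standard textbook figures assumed here, not values stated on these slides.

```python
from math import prod  # requires Python 3.8+


def best_stage_effort(efforts, H, B=1.0):
    """Stage effort f = F**(1/N), where path effort F = G * B * H."""
    G = prod(efforts)
    return (G * B * H) ** (1.0 / len(efforts))


def path_delay(efforts, parasitics, H, B=1.0):
    """Minimum path delay D = N*f + P, in units of tau."""
    return len(efforts) * best_stage_effort(efforts, H, B) + sum(parasitics)


def stage_input_caps(efforts, c_load, f):
    """Work backwards from the load: C_in = g * C_out / f for each stage."""
    caps = []
    c_out = c_load
    for g in reversed(efforts):
        c_out = g * c_out / f
        caps.append(c_out)
    return caps[::-1]


H = 160 / 16

# NAND design: NAND2 followed by NAND2 (assumed g and p values).
NAND_G, NAND_P = [4 / 3, 4 / 3], [2, 2]
# Compound design: AOI22 followed by an inverter (assumed g and p values).
CMP_G, CMP_P = [2, 1], [4, 1]

f_nand = best_stage_effort(NAND_G, H)
d_nand = path_delay(NAND_G, NAND_P, H)
caps_nand = stage_input_caps(NAND_G, 160, f_nand)

f_cmp = best_stage_effort(CMP_G, H)
d_cmp = path_delay(CMP_G, CMP_P, H)
caps_cmp = stage_input_caps(CMP_G, 160, f_cmp)

print(f"NAND:     f = {f_nand:.2f}, D = {d_nand:.1f} tau, caps = {caps_nand}")
print(f"Compound: f = {f_cmp:.2f}, D = {d_cmp:.1f} tau, caps = {caps_cmp}")
```

In both designs the first-stage input capacitance comes out at 16 units, which is a useful consistency check on the backward pass.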
21Example
- Annotate your designs with transistor sizes that
achieve this delay.
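Transistor widths follow from splitting each gate's input capacitance between its nMOS and pMOS in the same proportion as the unit gate. The sketch below assumes unit sizing for a 2:1 mobility ratio (inverter 1/2, NAND2 2/2, AOI22 2/4 for nMOS/pMOS width per input) and uses the rounded input capacitances 16, 50, and 36. The helper name is hypothetical.

```python
def split_widths(c_in, n_unit, p_unit):
    """Scale a unit gate so that one input presents c_in units of capacitance."""
    scale = c_in / (n_unit + p_unit)
    return n_unit * scale, p_unit * scale


# NAND design: both stages are NAND2 gates (unit nMOS 2, pMOS 2).
nand_stage1 = split_widths(16, 2, 2)
nand_stage2 = split_widths(50, 2, 2)

# Compound design: AOI22 (unit nMOS 2, pMOS 4), then inverter (unit nMOS 1, pMOS 2).
aoi_stage1 = split_widths(16, 2, 4)
inv_stage2 = split_widths(36, 1, 2)

for name, (wn, wp) in [
    ("NAND2 stage 1", nand_stage1),
    ("NAND2 stage 2", nand_stage2),
    ("AOI22 stage 1", aoi_stage1),
    ("Inverter stage 2", inv_stage2),
]:
    print(f"{name}: nMOS width = {wn:.1f}, pMOS width = {wp:.1f}")
```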
23Input Order
- Our parasitic delay model was too simple
- Calculate parasitic delay for Y falling
- If A arrives latest? 2τ
- If B arrives latest? 2.33τ
- Logical effort and parasitic delay are different for different inputs
- Some gates like AOI21 are inherently asymmetric
- Other gates have slightly different logical efforts and parasitic delays for different inputs
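The 2τ and 2.33τ figures can be reproduced with an Elmore delay calculation on a NAND2. The sketch assumes a unit NAND2 with nMOS and pMOS widths of 2, an output node capacitance of 6C, an internal node capacitance of 2C, A as the inner input (next to the output), B as the outer input (next to the rail), and τ = 3RC. These component values are assumptions taken from the usual unit-gate model, not from the slide figure.

```python
from fractions import Fraction

R = Fraction(1)      # resistance of a unit nMOS
C = Fraction(1)      # capacitance of a unit transistor
TAU = 3 * R * C      # delay unit: tau = 3RC

R_N = R / 2          # each series nMOS has width 2, so resistance R/2
C_OUT = 6 * C        # output node: diffusion of two pMOS and the top nMOS
C_X = 2 * C          # internal node x between the series nMOS


def falling_delay_inner_latest():
    """A (inner) arrives last: x is already discharged through B."""
    return (R_N + R_N) * C_OUT


def falling_delay_outer_latest():
    """B (outer) arrives last: x is still charged and must discharge too."""
    return R_N * C_X + (R_N + R_N) * C_OUT


p_a_latest = falling_delay_inner_latest() / TAU
p_b_latest = falling_delay_outer_latest() / TAU
print(p_a_latest, p_b_latest)  # prints 2 7/3
```

The extra 1/3 τ when B arrives last is the time spent discharging the internal node.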
24Inner Outer Inputs
- Outer input is closest to rail (B)
- Inner input is closest to output (A)
- If input arrival time is known
- Connect latest input to inner terminal for
smallest delay
25Asymmetric Gates
- Asymmetric gates favor one input over another
- Ex: suppose input A of a NAND gate is most critical
- Use smaller transistor on A (less capacitance)
- Boost size of noncritical input
- So total resistance is same
- gA = 10/9 < 4/3 for symmetric NAND
- gB = 5/3 > 4/3 for symmetric NAND
- Improvement in logical effort on input A comes at the cost of higher effort on the other input
26Symmetric Gates
- Inputs can be made perfectly symmetric
- If A comes earlier, then x is charged to 1; if B comes earlier, then x is charged to 1. In either case, the falling delay is the same
27Skewed Gates
- Skewed gates favor one edge over another
- Ex: suppose rising output of inverter is most critical
- Downsize noncritical nMOS transistor
- Calculate logical effort by comparing to unskewed inverter with same effective resistance on that edge.
- gu = 2.5 / 3 = 5/6
- gd = 2.5 / 1.5 = 5/3
28HI- and LO-Skew
- Def: Logical effort of a skewed gate for a particular transition is the ratio of the input capacitance of that gate to the input capacitance of an unskewed inverter delivering the same output current for the same transition.
- Skewed gates reduce size of noncritical transistors
- HI-skew gates favor rising output (small nMOS)
- LO-skew gates favor falling output (small pMOS)
- Logical effort is smaller for favored direction
- But larger for the other direction
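The definition can be applied directly to a HI-skew inverter. The sketch assumes a pMOS width of 2 and an nMOS width of 1/2, which is consistent with the 2.5-unit input capacitance used in the skewed-gate example, and a 2:1 mobility ratio for the unskewed reference inverters. The function name is illustrative.

```python
from fractions import Fraction


def skewed_inverter_efforts(p_width, n_width, mobility_ratio=2):
    """Return (g_up, g_down) for an inverter with the given widths."""
    c_in = p_width + n_width
    # Unskewed inverter with the same pull-up: same pMOS, nMOS = pMOS / ratio.
    c_ref_up = p_width + p_width / mobility_ratio
    # Unskewed inverter with the same pull-down: same nMOS, pMOS = nMOS * ratio.
    c_ref_down = n_width * mobility_ratio + n_width
    return c_in / c_ref_up, c_in / c_ref_down


g_up, g_down = skewed_inverter_efforts(Fraction(2), Fraction(1, 2))
print(g_up, g_down)  # prints 5/6 5/3
```

The favored (rising) edge gets a logical effort below 1, paid for by a larger effort on the falling edge.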
29Catalog of Skewed Gates
30Asymmetric Skew
- Combine asymmetric and skewed gates
- Downsize noncritical transistor on unimportant input
- Reduces parasitic delay for critical input
31Best P/N Ratio
- We have selected P/N ratio for unit rise and fall resistance (P/N ratio = 2, assuming μr = μn/μp = 2).
- Alternative: choose ratio for least average delay
- Ex: inverter
- Delay driving identical inverter
- tpdf = 2(P + 1)(RC)
- tpdr = 2(P + 1)(μr/P)(RC)
- tpd = (P + 1 + μr + μr/P)(RC)
- Differentiate tpd w.r.t. P
- Least delay for P = sqrt(μr)
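The minimum can also be located numerically, in the spirit of running many simulations with different P sizes, but using the RC delay model above instead of HSPICE. The sketch assumes μr = 2 and expresses delays in units of RC; the sweep range and step are arbitrary choices.

```python
from math import sqrt


def inverter_delays(P, mu_r=2.0):
    """Return (tpdf, tpdr, tpd) in units of RC for pMOS/nMOS width ratio P."""
    tpdf = 2 * (P + 1)
    tpdr = 2 * (P + 1) * (mu_r / P)
    return tpdf, tpdr, (tpdf + tpdr) / 2


# Sweep P from 0.5 to 4.0 in steps of 0.001 and keep the lowest average delay.
candidates = [0.5 + 0.001 * i for i in range(3501)]
best_P = min(candidates, key=lambda P: inverter_delays(P)[2])

tpd_equal = inverter_delays(2.0)[2]      # P = mu_r gives equal rise and fall delays
tpd_best = inverter_delays(best_P)[2]
print(f"best P = {best_P:.3f}, sqrt(mu_r) = {sqrt(2.0):.3f}")
print(f"tpd at P = 2: {tpd_equal:.3f} RC, tpd at best P: {tpd_best:.3f} RC")
```

In this model the average delay improves by only about 3% relative to the equal-delay ratio, while the pMOS width drops by roughly 30%.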
32P/N Ratios
- In general, the best P/N ratio (giving the lowest average delay) is the square root of the ratio giving equal rise and fall delays.
- Only improves average delay slightly for inverters
- But significantly decreases area and power
33Observations
- Best P/N ratio should be chosen on the basis of area, power, and reliability (not only average delay)
- Smaller P/N ratio reduces area and power consumption; however, unequal rise/fall times cause duty cycle distortion, longer path delays (if the worst edge is triggered), and reduced noise margin from lowering the switching point
34Other Circuit Families
- What makes a circuit fast?
- I = C dV/dt -> tpd ∝ (C/I) ΔV
- low capacitance
- high current
- small swing
- Logical effort is proportional to C/I
- pMOS are the enemy!
- High capacitance for a given current
- Can we take the pMOS capacitance off the input?
- Various circuit families try to do this
35Pseudo-nMOS
- Uses a pMOS that is always ON
- Benefits
- Input Cap is smaller than a 2/1 ratioed inverter
- Drawbacks
- Has slow rising transitions
- Dissipates power when output is low
- Lower noise margin since output low is non-zero
- Rarely used
36Pass Transistor Circuits
- Pass transistors are essential to the efficient design of specific circuits such as the 6-transistor static RAM (will discuss later)
- Inputs drive diffusion terminals as well as gates
- Other gates such as XORs can also be implemented efficiently using pass transistors (6 transistors using pass transistors vs. 8 using static CMOS)
- However, because of diffusion inputs, the delay depends on the input driver
- Therefore, in most general-purpose logic, static CMOS is superior in speed, power, and area.
37Different 2-input Mux Implementations
- 2-input multiplexer (Y = S'A + SB)
- CMOS Transmission Gates
- Compound gates
- Using tri-states