Title: Leakage Analysis and Minimization using MTCMOS and Dual-Vt
1 Leakage Analysis and Minimization using MTCMOS and Dual-Vt
David Z. Pan
Thanks to David Blaauw, ICCAD03 Tutorial
2 Outline
- Leakage estimation
- Transistor stacks
- Min / Max bounds and average circuit leakage
- Gate oxide leakage estimation
- MTCMOS leakage reduction
- Dual Vt leakage reduction
- Gate oxide leakage reduction
3 Standby Leakage Estimation for Transistor Stacks
- Leakage current of a gate depends on input state
- Consider a 4-input NAND
- For <1111>, the leakage current is determined by the pull-up network
- For other combinations, the leakage current is determined by the pull-down network
- Stack effect must be modeled
(VDD = 1.5V, VT = 0.25V)
Chen, et al., ISLPED98
4 Stack Leakage Model
- Subthreshold current model
- Based on BSIM2 MOS transistor model
- Transistors which are on are treated as short circuits
- Drain-Induced Barrier Lowering (DIBL) is taken into consideration
- Body effect is linear for small VS
- Equating the Isub of the transistors in the stack
[Figure: stack of four series transistors, numbered 1 to 4; voltage labels 1.5V and 0V]
Wei, et al., DAC98
5 Stack Leakage Model
- In general, VDS,i can be expressed in terms of VDS,i-1
- Closed form solution possible
[Figure: stack of four series transistors, numbered 1 to 4; voltage labels 1.5V and 0V]
Wei, et al., DAC98
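The idea of equating Isub across the devices of a stack can be sketched numerically. The snippet below is a minimal illustration, not the closed-form model from the cited paper: the function names, the simplified current expression, and every parameter value (i0, vt0, n, eta, gamma) are assumptions chosen for illustration. It finds the internal node voltage of a two-transistor OFF stack by bisection, with DIBL and a linear body effect included.

```python
import math

V_THERMAL = 0.0259  # kT/q near room temperature, in volts

def isub(vgs, vds, vsb, i0=1e-7, vt0=0.25, n=1.5, eta=0.05, gamma=0.2):
    """Simplified subthreshold current (A); every parameter value is illustrative."""
    # DIBL lowers the effective threshold by eta*vds; body effect raises it by gamma*vsb.
    exponent = (vgs - vt0 + eta * vds - gamma * vsb) / (n * V_THERMAL)
    return i0 * math.exp(exponent) * (1.0 - math.exp(-vds / V_THERMAL))

def two_stack(vdd=1.5, tol=1e-12):
    """Return (vx, current) for two series OFF NMOS devices with both gates at 0 V."""
    lo, hi = 0.0, vdd
    while hi - lo > tol:
        vx = 0.5 * (lo + hi)
        i_top = isub(vgs=-vx, vds=vdd - vx, vsb=vx)  # top device: source sits at vx
        i_bot = isub(vgs=0.0, vds=vx, vsb=0.0)       # bottom device: source grounded
        if i_top > i_bot:
            lo = vx   # more current flows in than out: equilibrium is at a higher vx
        else:
            hi = vx
    vx = 0.5 * (lo + hi)
    return vx, isub(0.0, vx, 0.0)

vx, i_stack = two_stack()
i_single = isub(0.0, 1.5, 0.0)
print(f"vx = {vx * 1e3:.1f} mV, stack current / single-device current = {i_stack / i_single:.3f}")
```

Bisection is used here only because the two currents are monotonic in vx, which makes the search robust; it stands in for the closed-form solution mentioned on the slide.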
6 Accounting of Node Settling Times
- Transient analysis required to account for settling times of nodes
- Settling times range from msec to hundreds of msec
- Leakage current is elevated during settling time
- Closed form solution to leakage during settling using simplified model
Johnson, et al., TCAD 6/99
7 Table Based DC Solution Acceleration
- Leakage model: use SPICE to pre-characterize Ids = f(Vd, Vs, W, Vg), where Vg is 0 or Vdd
- Estimation engine: set up KCL equations and solve for node voltages with Newton-Raphson
- Typical gates require < 3 iterations of Newton-Raphson
- NAND3 results
Sirichotiyakul, et al., DAC99
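A rough sketch of the estimation engine follows. It assumes a single unknown node between two OFF series devices. The function ids_lookup is a hypothetical analytic stand-in for the SPICE-characterized table (its form and constants are not from the slides), and the step clamp is an added safeguard rather than part of the cited method.

```python
import math

V_THERMAL = 0.0259  # volts

def ids_lookup(vd, vs, vg=0.0, w=1.0):
    """Stand-in for a pre-characterized table Ids = f(Vd, Vs, W, Vg); illustrative only."""
    vgs, vds = vg - vs, vd - vs
    exponent = (vgs - 0.25 + 0.05 * vds - 0.2 * vs) / (1.5 * V_THERMAL)
    return 1e-7 * w * math.exp(exponent) * (1.0 - math.exp(-vds / V_THERMAL))

def solve_internal_node(vdd=1.5, v0=0.1, max_iter=50, rel_tol=1e-9):
    """Newton-Raphson on the KCL residual at the node between two OFF series devices."""
    v, dv = v0, 1e-6
    for iteration in range(1, max_iter + 1):
        residual = ids_lookup(vdd, v) - ids_lookup(v, 0.0)  # current in minus current out
        if abs(residual) <= rel_tol * ids_lookup(v, 0.0):
            return v, iteration
        # Finite-difference slope of the residual with respect to the node voltage.
        slope = ((ids_lookup(vdd, v + dv) - ids_lookup(v + dv, 0.0)) - residual) / dv
        step = max(-0.1, min(0.1, -residual / slope))       # damp large steps
        v = min(vdd, max(0.0, v + step))
    raise RuntimeError("Newton-Raphson did not converge")

v_node, iterations = solve_internal_node()
print(f"node voltage = {v_node * 1e3:.1f} mV after {iterations} iterations")
```

The iteration count depends on the starting guess and on the device model, so this toy should not be read as reproducing the iteration figure quoted on the slide.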
9 State Dependence of Leakage Current
- Circuit state is partially known or unknown in sleep state
- Leakage variation is less for entire circuit than for individual gates
10 Leakage Current Profile
- Distribution of leakage states
- Distribution strongly dependent on circuit topology
[Figure panels: Decoder; Random logic]
11 Leakage Bound Computation
- Compute input state with maximum / minimum leakage
- Exhaustive search impractical due to exponential search space
- (2^n, where n is the number of primary inputs)
- Various heuristic approaches
- Random search: J. Halter and F. Najm, CICC97
- Genetic algorithms: Z. Chen, et al., ISLPED98
- Branch and bound: M. Johnson, et al., TCAD 6/99
- Satisfiability formulation: A. Fadi, et al., PATMOS02
- Use leakage controllability measures to prioritize input nodes: H. Kriplani, et al., TCAD 8/95
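The random-search heuristic can be illustrated on a toy circuit. This is a sketch under stated assumptions: the netlist, the per-state NAND2 leakage values, and all function names are hypothetical, and the circuit is small enough that the exhaustive 2^n search is still feasible for comparison.

```python
import itertools
import random

# Hypothetical NAND2 leakage (nA) per input state (a, b); illustrative values only.
NAND2_LEAK = {(0, 0): 0.2, (0, 1): 0.6, (1, 0): 0.9, (1, 1): 1.8}
# Hypothetical netlist in topological order: (output, input_a, input_b).
NETLIST = [("n1", "a", "b"), ("n2", "c", "d"), ("n3", "n1", "n2"), ("out", "n3", "a")]
INPUTS = ["a", "b", "c", "d"]

def circuit_leakage(vector):
    """Simulate the logic for one input vector and sum the state-dependent gate leakage."""
    values = dict(zip(INPUTS, vector))
    total = 0.0
    for out, x, y in NETLIST:
        state = (values[x], values[y])
        total += NAND2_LEAK[state]
        values[out] = 1 - (state[0] & state[1])
    return total

def random_search(samples, seed=0):
    """Estimate (min, max) leakage from a limited number of random vectors."""
    rng = random.Random(seed)
    leaks = [circuit_leakage([rng.randint(0, 1) for _ in INPUTS]) for _ in range(samples)]
    return min(leaks), max(leaks)

def exhaustive_search():
    """Exact (min, max) leakage over all 2**n vectors; only viable for tiny n."""
    leaks = [circuit_leakage(v) for v in itertools.product((0, 1), repeat=len(INPUTS))]
    return min(leaks), max(leaks)

print("random:", random_search(8), "exhaustive:", exhaustive_search())
```

Random search can only under-estimate the maximum and over-estimate the minimum, which is why the bounds it returns always lie inside the exhaustive ones.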
12 Average Leakage Measure
- Battery life is more directly related to average leakage than maximum leakage
- Device enters standby mode many times over battery life time
- Approaches
- Apply random vectors at input
- Accurate results for circuit level leakage with limited number of random vectors
- For gate/transistor optimization, accurate leakage current measurement on each gate is needed
- Leakage current varies dramatically on individual gates
- Random vectors not effective in computing average leakage of individual gates in circuit
13 Average Leakage Calculation Approach
- Probability based approach
- 1. Break circuit into gates
- 2. For each gate, calculate the leakage of all states
- 3. Propagate node probabilities and pair-wise correlation factors
- 4. Calculate each gate's state probability
- 5. Average leakage = sum of leakage of gate states, weighted by gate state probability
[Figure labels: P1, P2, P3, P4, Pi0, Pi1]
Sirichotiyakul, et al., DAC99
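The five steps can be sketched for a small fanout-free circuit. The sketch below makes simplifying assumptions that are not in the slides: signals are treated as independent (the pair-wise correlation factors of step 3 are omitted), only NAND2 gates are used, and the leakage table and netlist are hypothetical. For a fanout-free circuit with independent inputs the independence assumption is exact, so the result can be checked against full simulation.

```python
import itertools

# Hypothetical NAND2 leakage (nA) for each input state (a, b); illustrative values only.
NAND2_LEAK = {(0, 0): 0.2, (0, 1): 0.6, (1, 0): 0.9, (1, 1): 1.8}
# Hypothetical fanout-free netlist in topological order: (output, input_a, input_b).
NETLIST = [("n1", "a", "b"), ("n2", "c", "d"), ("out", "n1", "n2")]

def average_leakage(p_one):
    """p_one maps each primary input to P(signal = 1); signals assumed independent."""
    p = dict(p_one)
    total = 0.0
    for out, x, y in NETLIST:
        for a, b in itertools.product((0, 1), repeat=2):
            pa = p[x] if a else 1.0 - p[x]
            pb = p[y] if b else 1.0 - p[y]
            total += pa * pb * NAND2_LEAK[(a, b)]  # state leakage weighted by its probability
        p[out] = 1.0 - p[x] * p[y]                 # NAND output is 0 only in state (1, 1)
    return total

def simulated_average():
    """Reference value: average over all 16 equally likely input vectors."""
    vectors = list(itertools.product((0, 1), repeat=4))
    total = 0.0
    for vec in vectors:
        v = dict(zip("abcd", vec))
        for out, x, y in NETLIST:
            total += NAND2_LEAK[(v[x], v[y])]
            v[out] = 1 - (v[x] & v[y])
    return total / len(vectors)

print(average_leakage({name: 0.5 for name in "abcd"}), simulated_average())
```

With reconvergent fanout the independence assumption breaks down, which is where the correlation factors on the slide come in.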
14 Gate Level Leakage States
- Number of gate states grows exponentially with number of gate inputs
- Only a few of the gate states have significant leakage: the dominant leakage states
- A dominant leakage state has only one transistor OFF in any path from Vdd to Gnd
15 Dominant Leakage State Computation
- A dominant leakage state corresponds to a special cutset in the transistor graph such that
- Removing the cutset edges places Vdd and Gnd in different partitions
- Can directly enumerate cutsets and corresponding dominant leakage states
- Extension for transistor input correlation
Dominant Leakage State    Dominant Leakage Set
011                       N1
101                       N2
110                       N3
111                       P1, P2, P3
Sirichotiyakul, et al., DAC99
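The table above can be reproduced by brute-force enumeration for a NAND3. The sketch below uses one possible reading of the criterion, stated here as an assumption: a state is dominant when at least one Vdd-to-Gnd path contains exactly one OFF transistor, and the dominant leakage set collects those transistors. The transistor naming (P1..P3 in parallel, N1..N3 in series) is also assumed.

```python
import itertools

N_INPUTS = 3
# Every Vdd-to-Gnd path in a NAND3: one parallel PMOS, then the full NMOS series stack.
PATHS = [[f"P{i}"] + [f"N{j}" for j in range(1, N_INPUTS + 1)]
         for i in range(1, N_INPUTS + 1)]

def is_off(transistor, state):
    """PMOS is OFF when its input is 1; NMOS is OFF when its input is 0."""
    bit = state[int(transistor[1:]) - 1]
    return bit == 1 if transistor[0] == "P" else bit == 0

def dominant_states():
    """Map each dominant input state to its dominant leakage set."""
    result = {}
    for state in itertools.product((0, 1), repeat=N_INPUTS):
        leaking = set()
        for path in PATHS:
            off = [t for t in path if is_off(t, state)]
            if len(off) == 1:          # a single OFF device limits this path
                leaking.add(off[0])
        if leaking:
            result["".join(map(str, state))] = sorted(leaking)
    return result

for state, devices in dominant_states().items():
    print(state, devices)
```

Enumerating all 2^n states is only practical for single gates; the slide's cutset enumeration avoids visiting the non-dominant states at all.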
16 Average Leakage Measurement Result
- Number of circuit states is too large to run SPICE in reasonable time
17 Outline
- Leakage estimation
- Transistor stacks
- Min / Max bounds
- Average circuit leakage
- Gate oxide leakage estimation
- MTCMOS leakage reduction
- Dual Vt leakage reduction
- Gate oxide leakage reduction
18 Gate Oxide Leakage in an Inverter
- When input = Vdd
- NMOS: maximum Igate
- PMOS: maximum Isub, reduced Igate
- When input = 0V
- NMOS: Vgd negative
- => Igd restricted to reverse gate tunneling
- maximum Isub, reduced Igate
- PMOS: small Igate
- Igate and Isub can be independently calculated and added for total leakage
19 Isub and Igate Dependence in Transistor Stacks
- If all inputs have a high state
- Analysis is similar to that of the inverter
- Total leakage = Isub + Igate
- At least one input is low
- Combination of Isub through OFF transistor and Igate of ON transistor
- Igate and Isub add at intermediate nodes and interact in a non-trivial manner
- Approach: distinguish three leakage scenarios
Lee, et al., DAC03
20 Multi-Input Gate Scenario 1
- Internal nodes na and nb at 0V
- Have conducting path to ground node
- Igate of tn
- Does not affect the voltage at nodes na and nb
- Itotal = Igate + Isub
- Reverse Igate through tt is independent of Isub
21 Multi-Input Gate Scenario 2
- Internal nodes na and nb at Vdd - Vt
- Connected to the output of the gate through conducting NMOS
- Igate of tn
- Vgs and Vgd are small (one Vt)
- Over one order of magnitude smaller than in scenario 1
- Considered negligible
- Reverse Igate through tb is small and independent of Isub
22 Multi-Input Gate Scenario 3
- Internal nodes na and nb in the range of 100-200mV
- Vgs,n and Vgd,n close to Vdd => significant Igate
- Reverse Igate from na and nb is small and can be ignored
- Reverse Igate from Vdd in tt is independent of Isub
- Forward Igate results in a rise in the voltage at na and nb
- Reduces Igate,n: Vgs,n and Vgd,n reduce
- Reduces Isub,t: Vgs,t becomes further negative
- Exponential dependence of Isub,t on Vgs,t
- much stronger than dependence of Igate,n on Vgs,n and Vgd,n
- Igate effectively displaces the Isub
23 Total Leakage Current Analysis for Isub and Igate
- Isub,k = Isub,1 x Sk x st
- Isub,1: leakage current for a single off-transistor of unit size
- Sk: the stack factor for a stack with k off-transistors in series
- st: the size of the transistor
- Igate
- Measured for a single transistor of unit size in each of 3 scenarios
- Igate,l: l indicates the number of off-transistors below transistor tn
- Total leakage current
- Max (Isub, computed individually; Igate, computed individually)
- within 56 of combined Isub and Igate
Source: Lee, et al., DAC03
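As a worked illustration of the Isub,k expression, the sketch below plugs in made-up numbers. Everything numeric here is hypothetical, as is the choice to scale Igate linearly with transistor size and to index Igate by scenario only. The way the two components are combined per scenario (sum in scenario 1, Isub alone in scenario 2, maximum in scenario 3) is one reading of the three scenario slides and of the Max rule above, not a confirmed statement of the cited method.

```python
# Hypothetical characterization data (nA); none of these numbers come from the slides.
ISUB_1 = 10.0                             # Isub,1: one OFF transistor of unit size
STACK_FACTOR = {1: 1.0, 2: 0.2, 3: 0.1}   # Sk for k OFF transistors in series
IGATE_UNIT = {1: 6.0, 2: 0.3, 3: 4.0}     # unit-size Igate measured in each scenario

def isub_stack(k, size):
    """Isub,k = Isub,1 x Sk x st."""
    return ISUB_1 * STACK_FACTOR[k] * size

def total_leakage(scenario, k, size):
    """Combine Isub and Igate according to the assumed per-scenario rule."""
    isub = isub_stack(k, size)
    igate = IGATE_UNIT[scenario] * size   # assumption: Igate scales with transistor size
    if scenario == 1:
        return isub + igate               # components are independent and add
    if scenario == 2:
        return isub                       # Igate considered negligible
    return max(isub, igate)               # scenario 3: Igate displaces Isub

print(isub_stack(2, 3.0), total_leakage(3, 2, 1.0))
```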
24 Result: Isub and Igate Estimation
- Total leakage current estimation result compared with SPICE
- 100 random vectors
25 Outline
- Leakage estimation
- MTCMOS leakage reduction
- Header and footer devices
- State retaining latch configuration
- Extensions / variations of MTCMOS
- Gate leakage in MTCMOS
- Dual Vt leakage reduction
- Gate oxide leakage reduction
26 Leakage Reduction Overview
[Figure: MTCMOS]
Source: Johnson, et al., DAC99
27 MTCMOS Overview
- MTCMOS (Multi Threshold CMOS)
- Active mode
- Low Vt circuit operation
- Standby mode
- Disconnect power supplies through high Vt devices
- For fine grain sleep control
- Sequential circuits must retain state
- Dual sleep devices are needed for sneak paths in state retaining latches
Mutoh, et al., JSSC 8/95
28 State Retaining MTCMOS Latch
- High Vt inverters are always powered on
- Low Vt inverters are power gated
[Figure label: High Vth Inverters for State Retention]
Mutoh, et al., JSSC 8/95
30 Retaining State through Scan
- Scan out state before entering standby mode
- No state retaining flip-flop necessary
- Single footer is sufficient
- Non-power gated memory needed
- Use existing scan circuitry
- Slower transition to/from standby mode
[Figure labels: Low Vt Logic, High Vt, Scan out, Scan in, Local Memory]
31 Super Cut-Off CMOS (SCCMOS)
- For sleep transistor, use negative bias instead of high Vt
- Compatible with single Vt process
- Requires on chip bias generator
- Oxide reliability issues
Kawaguchi, et al., ISSCC98
32 SOI Sleep Transistor
- SIMOX-MTCMOS: variable-Vt PMOS for sleep mode control
- Use body contact
- Active mode: low Vt to reduce supply voltage drop
- Sleep mode: high Vt to reduce leakage
- Increased effectiveness with low supply voltage
[Figure: tpd (ns) versus VDD (V) for SIMOX-MTCMOS and BULK-MTCMOS; circuit labels: VDD 0.5 V, Virtual VDD, SL]
Douseki, et al., ISSCC96
33 Addressing Igate in MTCMOS
- Use header instead of footer sleep transistor
- Relies on lower Igate in PMOS transistor
[Figure labels: High Vt Gating, Low Vt Logic, sleep, Vgnd, Vsup]
Hamzaoglu, et al., ISLPED02
34 Sizing of Sleep Transistor
- Sleep transistor introduces additional supply voltage drop
- Degradation in performance
- Signal integrity issues
- Careful sizing of sleep transistor is needed
- Sharing virtual supply between gates reduces voltage fluctuation
Kao, et al., DAC97
35 Outline
- Leakage estimation
- MTCMOS leakage reduction
- Dual Vt leakage reduction
- Vt assignment approaches
- Simultaneous Vt and sizing approaches
- Extensions of dual Vt approach
- Gate oxide leakage reduction
36 Dual Vt Circuit Optimization
- Transistor is assigned either a high or low Vt
- Low-Vt transistor has reduced delay and increased leakage
- Use low-Vt transistors for speed critical portions, high-Vt for rest
- Trade-off degrades for lower supply voltage
- Objective
- Find an implementation between the two extremes of all low Vt and all high Vt, trading off leakage power for delay
- Delay constraint must be met
37 Dual Vt Example
- Dual Vt assignment approach
- Transistor on critical path: low Vt
- Non-critical transistor: high Vt
38 Vt Assignment Approach
- Greedy approach: backward traversal of circuit
- Select high Vt gate in critical path
- Set gate to low Vt
- Recompute critical paths
Wei, et al., DAC98
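The greedy loop can be sketched on a toy gate-level netlist. This is an illustrative sketch only: the netlist, the delay numbers, the choice to start from all high Vt, and the rule of picking the critical-path gate nearest the output first are assumptions, not details taken from the cited paper.

```python
# Hypothetical gates in topological order: name -> (fanins, low-Vt delay, high-Vt delay).
GATES = {
    "g1": ([], 1.0, 1.4),
    "g2": ([], 1.0, 1.4),
    "g3": (["g1", "g2"], 2.0, 2.8),
    "g4": (["g2"], 1.0, 1.4),
    "g5": (["g3", "g4"], 1.0, 1.4),
}

def critical_path(low_vt):
    """Return (circuit delay, critical path listed from output back to input)."""
    arrival, pred = {}, {}
    for gate, (fanins, d_low, d_high) in GATES.items():
        start, latest = 0.0, None
        for fanin in fanins:
            if arrival[fanin] > start:
                start, latest = arrival[fanin], fanin
        arrival[gate] = start + (d_low if gate in low_vt else d_high)
        pred[gate] = latest
    gate = max(arrival, key=arrival.get)
    delay, path = arrival[gate], []
    while gate is not None:
        path.append(gate)
        gate = pred[gate]
    return delay, path

def greedy_vt(target_delay):
    """Start from all high Vt; move critical-path gates to low Vt until timing is met."""
    low_vt = set()
    while True:
        delay, path = critical_path(low_vt)          # recompute critical path
        candidates = [g for g in path if g not in low_vt]
        if delay <= target_delay or not candidates:
            return low_vt, delay
        low_vt.add(candidates[0])                    # backward traversal: output side first

low_vt, delay = greedy_vt(4.6)
print(sorted(low_vt), delay)
```

In this example g4 stays at high Vt because it never lies on the critical path, which is the leakage saving the approach is after.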
39 Impact of High Vt Selection
Supply voltage = 1V
Legend: nodes in critical path (low Vt); nodes with low Vt; nodes with high Vt
(a) Initial: all low Vt = 0.2V
(b) high Vt = 0.25V
(c) high Vt = 0.396V
(d) high Vt = 0.46V
Source: Wei, et al., DAC98
40 Dual Vt Results
- Results for ISCAS benchmark circuits
Source: Wei, et al., DAC98
41 Vt Assignment Granularity
- Vt assignment can be at different levels of granularity
- Gate based assignment
- Pull up network / pull down network based assignment
- Single Vt in P pull up or N pull down trees
- Stack based assignment
- Single Vt in series connected transistors
- Individual assignment within transistor stacks
- Possible area penalty
- Number of library cells increases with finer control
- Better leakage / delay trade-off
Design rule constraint for different Vt assignment
42 Example of Different Vt Assignment Granularity
Gate based 26.7
Stack based 68.1
PU/PD based 63.5
Source: Wei, et al., DAC99
43 MaxFlow Formulation of Vt Assignment
- Formulated as a MaxFlow problem
- Weight assigned to each edge
- Find a Maximal Weighted Subset
- Use cuts based on topological level
- Find Maximum Weighted Level Cut, over all levels of the graph
Wang, et al., ICCAD98
44 Vt Assignment Refinement
- Swapping: local improvement step
- At the end of the cut procedure, we have two sets SH, SL
- No more feasible edges
- Moving an edge from SH back to SL (Swap Out), some previously infeasible edges may become feasible. These are potential candidates for moving from SL to SH (Swap In)
Wang, et al., ICCAD98
46 Vt Assignment Issues
- Improvement in Vt assignment degrades for balanced circuits
- Resizing necessary after Vt assignment to improve obtained trade-off
- Transistors changed to low Vt become oversized
[Figure: Path delay profile]
Blaauw, et al., ISLPED98
47 Simultaneous Vt and Sizing Approach
- Fix area and move from all-high to all-low Vt
- Assign transistors to low Vt
- Periodic redistribution of area after Vt selections
- Reduce widths of transistors affected by Vt change
- Apply width reduction on transistors in the cone of influence from the Vt changed device
- The size reduction factor is linear with the depth
Sirichotiyakul, et al., DAC99
48 Sensitivity Based Vt Selection
- In each iteration, pick a transistor with the best trade-off between leakage and delay, weighted by its path slack
- Delay change on timing arc a when transistor T is changed to low Vt (Δd_a(T))
Sirichotiyakul, et al., DAC99
49 Simultaneous Vt and Sizing Results
- Substantial improvement from considering both sizing and Vt-assignment
50 Vt and Size Assignment through Lagrangian Relaxation
- Incorporate delay constraint as a weighted term in optimization objective
- Weights are Lagrangian multipliers
- Adjust multipliers based on degree of constraint violation in each iteration
[Flowchart: pick initial widths and multipliers; inner loop; outer loop]
Karnik, et al., DAC02
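The inner/outer loop structure can be illustrated with a deliberately small toy. The sketch below is not the cited algorithm: it handles Vt choice only (no sizing), uses a single chain of gates with one delay constraint and one multiplier, and all numbers, names, and the fixed step size are hypothetical.

```python
# Hypothetical chain of gates: (leak_low_vt, delay_low_vt, leak_high_vt, delay_high_vt).
GATES = [(10.0, 1.0, 1.0, 1.2), (10.0, 1.0, 1.0, 1.5),
         (10.0, 1.0, 1.0, 1.3), (10.0, 1.0, 1.0, 2.0)]

def inner_loop(lam):
    """For a fixed multiplier, each gate independently minimizes leakage + lam * delay."""
    choice = []
    for leak_lo, d_lo, leak_hi, d_hi in GATES:
        choice.append("high" if leak_hi + lam * d_hi < leak_lo + lam * d_lo else "low")
    return choice

def evaluate(choice):
    """Total leakage and chain delay for a given Vt choice per gate."""
    leak = sum(g[2] if c == "high" else g[0] for g, c in zip(GATES, choice))
    delay = sum(g[3] if c == "high" else g[1] for g, c in zip(GATES, choice))
    return leak, delay

def lagrangian_vt(target_delay, step=10.0, iterations=20):
    """Outer loop: raise the multiplier when the delay constraint is violated, lower it otherwise."""
    lam, best = 0.0, None
    for _ in range(iterations):
        choice = inner_loop(lam)
        leak, delay = evaluate(choice)
        if delay <= target_delay and (best is None or leak < best[0]):
            best = (leak, delay, choice)             # remember the best feasible solution
        lam = max(0.0, lam + step * (delay - target_delay))
    return best

print(lagrangian_vt(4.6))
```

Because the Vt choice is discrete, the multiplier tends to oscillate around the constraint boundary, which is why the sketch keeps track of the best feasible solution seen rather than returning the last iterate.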
51 Vt and Size Assignment through L.R. - Results
- 10% power reduction by dual Vt + sizing versus dual Vt only
- 25% smaller power than single Vt designs with sizing
Source: Karnik, et al., DAC02
52 Sizing and Vt Assignment
- Roughly two regions in the area-delay curve
- Prior to the knee: area and leakage increase linearly
- After the knee: area and leakage increase exponentially
- Strategy
- First perform sizing until the knee (sensitivity-based method)
- Then optimize Vts (enumeration with aggressive pruning)
- Final post-processing step (1-2% improvements seen)
Ketkar, et al., ICCAD02
53 Vt Assignment Approach
- Vt assignment alone is also an integer program
- Heuristic approach is necessary
- Basic engine inspired by technology mapping approaches
- Circuit is represented as a graph G(V,E), where V corresponds to the set of gates and E to the set of interconnections
- Graph is decomposed into disjoint fanout-free regions
- Specifics
- From required times at primary outputs, required times at the root of every tree (a multi-fanout gate) are obtained using the critical path method (PERT traversal)
- Enumerative techniques for assigning Vt to gates in a tree
Ketkar, et al., ICCAD02
54 Simultaneous Vt, Size and Vdd Assignment
- Leakage reduced through either increasing Vt or lowering Vdd
- Lowering Vdd also reduces dynamic power
- Topological constraints on Vdd assignment
- Requires use of voltage level converters
- Assign Vdd first, then perform sizing/Vt assignment
[Flowchart: Begin; Topology Based Slack Distribution Using LP; Delay Minimize All Paths; Sensitivity Based Slack Distribution Using LP; Change VDD of Gates with Sufficient Slack; Change Gates With Sufficient Slack; End]
Nguyen, et al., ISLPED03
55 Simultaneous Vt, Size and Vdd Assignment - Result
- Adding Vdd to W/Vt resulted in average
- 60% decrease over W only
- 25% decrease over W/Vt
[Figure: results for benchmarks c17, c432, c499, c880, c1355, c1908, c2670, c3540, c5315, c6288, c7552]
Nguyen, et al., ISLPED03
56 Finding Optimal Second Vdd/Vt Values
- Optimal point is not overly sharp, and hence points close to optimal in terms of Vdd2 and Vt2 provide near-optimal power savings
- (Vdd2, Vt2) = f(Vdd1, Vt1, K)
- Strong dependence of both Vdd2 and Vt2 on Vt1
- Optimal Vdd2 is 0.5 Vdd1 (Vdd,high) and possibly lower depending on the initial ratio of dynamic to static power
Srivastava, et al., ASP-DAC03
58 Increasing Device Length
- Increase in length decreases leakage, due to short channel effects
- Delay penalty due to loss of device current and increased input loading
[Figure: delay normalized w.r.t. Hi-Vt transistor and leakage normalized w.r.t. Low-Vt transistor, versus length increase]
Blaauw, et al., ISLPED98
59 Stack Forcing Approach
- Circuit technique provides additional Vts
[Figure: Stack Forcing vs. Longer L, equal loading]
Narendra, et al., ISLPED01
60 Combining Vt and Input State Assignment
- Given a known input state in standby mode, only OFF transistors are set to high Vt
- All other transistors are kept at low Vt
Lee, et al., DAC03
61 Combining Vt and Input State Assignment
- Optimal input state with Vt assignment
- Increased reduction of leakage current
Lee, et al., DAC03
62 Vt and State Assignment Formulation
- Objective: the minimum sum of all gate leakage currents under delay constraints
- Combined Vt and state assignment
- Due to the interaction between circuit state and Vt, consider their assignment simultaneously
- Turn OFF specific transistors with favorable leakage-performance trade-offs in close interaction with the Vt-assignment algorithm
- Very large search space: all input state/Vt assignments
- Proposed solution
- Integer optimization problem
- Branch and bound method
- Exact solution is possible only for very small circuits due to the exponential nature of the problem
- Heuristic solutions for large circuits
Lee, et al., DAC03
63 Required Library Cell Versions
- NAND2: all possible Vt-assignments => 2^(number of transistors) = 2^4 = 16
- Not all assignments are useful
- Based on input states, only few assignments are meaningful
- Vt-group
- Cut-set of the graph connecting the Vdd and Gnd nodes
- Only a single group needs to be considered for high-Vt assignment
- => Significantly reduces the complexity of the Vt-optimization
Lee, et al., DAC03
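The cell-count arithmetic can be checked by direct enumeration. The sketch below assumes a NAND2 with transistors named P1, P2 (parallel pull-up) and N1, N2 (series pull-down), and applies the simple rule that only the OFF transistors of a given standby state get high Vt. It does not model the Vt-group (cut-set) refinement, and all names are hypothetical.

```python
import itertools

TRANSISTORS = ["P1", "P2", "N1", "N2"]   # NAND2: parallel PMOS pair, series NMOS pair

def all_vt_assignments():
    """Every high/low Vt labeling of the four transistors: 2**4 = 16 cell versions."""
    return [dict(zip(TRANSISTORS, vts))
            for vts in itertools.product(("low", "high"), repeat=len(TRANSISTORS))]

def off_transistors(a, b):
    """Input 1 turns the matching PMOS off; input 0 turns the matching NMOS off."""
    off = set()
    for idx, bit in enumerate((a, b), start=1):
        off.add(f"P{idx}" if bit == 1 else f"N{idx}")
    return off

def state_driven_versions():
    """High Vt only on the OFF transistors of each standby input state."""
    return {(a, b): sorted(off_transistors(a, b))
            for a, b in itertools.product((0, 1), repeat=2)}

print(len(all_vt_assignments()))
for state, high_vt in state_driven_versions().items():
    print(state, high_vt)
```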
64 Simultaneous Vt and State Assignment - Result
- Delay constraints of 10%, 25% and 50% from the all low Vt-group assignment delay
- 10%: most stringently constrained optimization
[Figure: delay scale from 0% (delay with all low Vt) to 100% (delay with all high Vt), with constraint points at 10%, 25% and 50%]
65 Simultaneous Vt and State Assignment - Result
- More improvement can be achieved at strict delay constraints
66 Outline
- Leakage estimation
- MTCMOS leakage reduction
- Dual Vt leakage reduction
- Gate oxide leakage reduction
67 P-type Domino Structures
- Relies on lower Igate of PMOS transistor
- Use thick oxide NMOS precharge transistor
- Leakage reduction with delay penalty
Hamzaoglu, et al., ISLPED02
68 Stack Order Dependence of Igate
- Key difference between the state dependence of Isub and Igate
- Isub primarily depends on the number of OFF transistors in stack
- Igate depends strongly on the position of ON/OFF transistors in stack
[Figure annotation: 5x]
Source: Lee, et al., DAC03
69 Pin Re-ordering for Igate Reduction
- Perform pin re-ordering to reduce Igate in standby mode
- Avg. 18% using state assignment alone
- Avg. 27% by using pin reordering along with state assignment
- Igate reduced by 45% up to 82%
Source: Lee, et al., DAC03
70 References
- Blaauw, et al., "Emerging power management tools for processor design," ISLPED 1998, pp. 143-148.
- Chen, et al., "Estimation of standby leakage power in CMOS circuit considering accurate modeling of transistor stacks," ISLPED 1998, pp. 239-244.
- Douseki, et al., "A 0.5 V SIMOX-MTCMOS circuit with 200 ps logic gate," ISSCC 1996, pp. 84-85, 423.
- Enomoto, et al., "A Self-Controllable-Voltage-Level (SVL) circuit for low-power, high-speed CMOS circuits," ESSCIRC 2002, pp. 411-414.
- Fadi, et al., "Robust SAT-based search algorithm for leakage power reduction," PATMOS 2002.
- Halter and Najm, "A gate-level leakage power reduction method for ultra-low-power CMOS circuits," CICC 1997, pp. 475-478.
- Hamzaoglu, et al., "Circuit-level techniques to control gate leakage for sub-100 nm CMOS," ISLPED 2002, pp. 60-63.
- Inukai, et al., "Boosted gate MOS (BGMOS) device/circuit cooperation scheme to achieve leakage-free giga-scale integration," CICC 2000, pp. 409-412.
- Johnson, et al., "Leakage control with efficient use of transistor stacks in single threshold CMOS," DAC 1999, pp. 442-445.
- Johnson, et al., "Models and algorithms for bounds on leakage in CMOS circuits," TCAD, June 1999, pp. 714-725.
- Kao, et al., "Transistor Sizing Issues And Tool For Multi-threshold CMOS Technology," DAC 1997, pp. 409-414.
- Kao, et al., "MTCMOS Sequential Circuits," ESSCIRC 2001, pp. 332-335.
71 References
- Karnik, et al., "Total power optimization by simultaneous dual-Vt allocation and device sizing in high performance microprocessors," DAC 2002, pp. 486-491.
- Kawaguchi, et al., "A CMOS scheme for 0.5 V supply voltage with pico-ampere standby current," ISSCC 1998, pp. 192-193, 436.
- Ketkar and Sapatnekar, "Standby power optimization via transistor sizing and dual threshold voltage assignment," ICCAD 2002, pp. 375-378.
- Ko, et al., "Hybrid dual-threshold design techniques for high-performance processors with low-power features," ISLPED 1997, pp. 307-311.
- Kriplani, et al., "Pattern independent maximum current estimation in power and ground buses of CMOS VLSI circuits: algorithms, signal correlations, and their resolution," TCAD, Aug. 1995, pp. 998-1012.
- Lee, et al., "Analysis and minimization techniques for total leakage considering gate oxide leakage," DAC 2003, pp. 175-180.
- Lee, et al., "Static leakage reduction through simultaneous threshold voltage and state assignment," DAC 2003, pp. 191-194.
- Mutoh, et al., "1-V power supply high-speed digital circuit technology with multithreshold-voltage CMOS," JSSC, Aug. 1995, pp. 847-854.
- Narendra, et al., "Scaling of stack effect and its application for leakage reduction," ISLPED 2001, pp. 195-200.
- Nguyen, et al., "Minimization of dynamic and static power through joint assignment of threshold voltages and sizing optimization," ISLPED 2003, pp. 158-163.
- Pant, et al., "Dual-threshold voltage assignment with transistor sizing for low power CMOS circuits," TVLSI, April 2001, pp. 390-394.
72 References
- Shigematsu, et al., "A 1-V high-speed MTCMOS circuit scheme for power-down application circuits," JSSC, June 1997, pp. 861-869.
- Sirichotiyakul, et al., "Stand-by power minimization through simultaneous threshold voltage selection and circuit sizing," DAC 1999, pp. 436-441.
- Srivastava and Sylvester, "Minimizing total power by simultaneous Vdd/Vth assignment," ASP-DAC 2003, pp. 400-403.
- Sundararajan, et al., "Low power synthesis of dual threshold voltage CMOS VLSI circuits," ISLPED 1999, pp. 139-144.
- Wang, et al., "Static power optimization of deep submicron CMOS circuits for dual Vt technology," ICCAD 1998, pp. 490-496.
- Wei, et al., "Design and optimization of low voltage high performance dual threshold CMOS circuits," DAC 1998, pp. 489-494.
- Wei, et al., "Mixed-Vth (MVT) CMOS circuit design methodology for low power applications," DAC 1999, pp. 430-435.
- Ye, et al., "A new technique for standby leakage reduction in high-performance circuits," Symp. VLSI Circuits 1998, pp. 40-41.
- Yeo, et al., "Direct tunneling gate leakage current in transistors with ultra thin silicon nitride gate dielectric," Electron Device Letters, Nov. 2000, pp. 540-542.