Title: Delay and Power Variability in CMOS
1Delay and Power Variability in CMOS
Ruth Wang, Paul Friedberg, Liang-Teck Pang, Huifang Qin, Louis Alarcón, Thuan Trinh
Professors Jan Rabaey, Mircea Stan, Yu (Kevin) Cao
January 13, 2005
2Outline
- Current YODA research
- Delay and Power Variability Study
  - Experimental Setup
  - Monte Carlo simulation framework
  - Circuits under study
- Results
  - Overall delay and power variability
  - Single parameter contributions to delay and power variability
- Conclusion
3YODA Group Research
- The ability to accurately predict power and timing margins for a given technology is crucial for assessing
  - Processor performance in next generation technologies
  - Power/performance sensitivity for new circuit topologies
- Circuit performance (e.g. delay, power, SRAM data retention voltage) is sensitive to statistical variations in process and environmental parameters: Vth, Lpoly, W, tox, Vdd, T
- Ongoing YODA research efforts
  - Probabilistic computing as a high-level approach to robust design
  - Silicon measurements of the spatial correlation of parameter variations and the effects of layout on variation
  - Ultra low voltage SRAM design
  - Techniques of robust design using PLA structures
  - Power and delay variability in CMOS designs
4Toward Probabilistic Computation
- Process variability should be analyzed and
optimized at all design layers
Professor Yu (Kevin) Cao (ASU)
5Spatial Correlation and Layout Effects
- Exhaustive critical dimension measurements of a 200mm wafer processed in 130nm bulk technology
- Developed a model for the spatial correlation of critical linewidths as a function of distance between logic stages
P. Friedberg, L.T. Pang, Profs. Spanos, Nikolic
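One way such a correlation-versus-distance curve could be estimated from linewidth data is sketched below. This is a minimal illustration only: the array layout (one row per scan line, one column per measurement site), the function name `correlation_vs_distance`, and the synthetic data are all assumptions, not the group's measurement flow or model.

```python
import numpy as np


def correlation_vs_distance(cd, max_lag):
    """Pearson correlation between linewidths separated by `lag` grid steps.

    cd: 2-D array, one row per scan line, one column per measurement site
    (an assumed layout, chosen only for this sketch).
    """
    rho = {}
    for lag in range(1, max_lag + 1):
        a = cd[:, :-lag].ravel()  # every site that has a neighbour `lag` steps away
        b = cd[:, lag:].ravel()   # the matching neighbours
        rho[lag] = np.corrcoef(a, b)[0, 1]
    return rho


# Synthetic data only: a smooth trend shared along each row plus independent noise.
rng = np.random.default_rng(0)
sites = np.arange(50)
trend = rng.normal(0.0, 1.0, (40, 1)) * np.cos(sites / 25.0)
cd = 130.0 + trend + rng.normal(0.0, 0.5, (40, 50))

rho = correlation_vs_distance(cd, max_lag=5)
for lag, r in rho.items():
    print(f"lag {lag}: rho = {r:.2f}")
```

Pooling all site pairs at a given separation gives one correlation value per distance, which is the kind of data a distance-dependent model would then be fitted to.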
6Ultra Low Voltage SRAM Design
- Goal: Investigate techniques for robust, ultra low voltage (ULV) SRAM design.
- Developed an analytical model for SRAM Data Retention Voltage (DRV), considering variations in process parameters.
- Explored the optimization of DRV using
  - ULV sizing techniques
  - Error-tolerant SRAM design
- Results: 80% leakage power saving in standby operation at DRV100mV by tuning the sizing, body bias voltage and bitline voltage.
- Ongoing work: Error-tolerant design to further reduce SRAM DRV and leakage power.
- 90nm SRAM test chip (tapeout October 2004)
- Silicon verification of ULV optimization techniques.
Huifang Qin, Thuan Trinh
7Robust Design Using PLA Structures
- Programmable Logic Arrays (PLAs) benefit from inherent advantages due to increased stack heights and their regular structure.
- Regularly structured design reduces system and silicon complexity
- Stacked structures enable aggressive voltage scaling
  - More robust to correlated (tune or adapt) and random variations (self-cancel)
  - Decreased short channel effect
  - Ion/Ioff increases with increasing stack height (leakage suppression)
L. Alarcón, Profs. M. Stan, R. Brayton
8Delay and Power Variability in CMOS
- Goal: Investigate the effects of variations in Vth, Lpoly, W, tox and Vdd on the performance of a family of representative circuits.
  - Quantify the statistical variability of circuit delay and power (active).
  - Identify single parameter contributions to overall variability levels.
- Circuits under study
  - NAND chain (six stages)
  - Adders (16-bit, various architectures)
  - Logic styles: Static, Dynamic Domino, Passgate
  - All transistor sizes optimized for minimum delay under an area constraint
- Experimental Setup
  - 90nm, pd-SOI technology
  - Industrial research site
  - All parameter distributions set by predictive BSIMSOI models, ITRS (2003)
9Monte Carlo Simulation I
- Goal I: Vary all parameters simultaneously to study the statistical variability of power and delay.
- Variable parameters
  - Vth, Lpoly, W, tox, Vdd = 1V (mean value)
  - Temperature held at 85°C
  - Interdependencies between parameters reconciled within the simulation
- N = 200 for adders, N = 1000 for NANDs
- The spatial correlation coefficient ρ defines parameter matching between adjacent transistors
  - Each parameter is assigned identically to all transistors within each circuit instance
  - ρ is set to 1, which indicates perfect correlation (worst-case)
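The sampling scheme on this slide can be sketched as follows. This is a minimal illustration under stated assumptions: a toy alpha-power-law delay expression stands in for the circuit simulator, and the nominal values, relative sigmas and function names are hypothetical placeholders, not the ITRS/BSIMSOI distributions used in the study.

```python
import numpy as np

# Hypothetical nominal values and relative sigmas; NOT the ITRS/BSIMSOI numbers.
NOMINAL = {"vth": 0.25, "lpoly": 65e-9, "w": 1e-6, "tox": 1.8e-9, "vdd": 1.0}
REL_SIGMA = {"vth": 0.05, "lpoly": 0.04, "w": 0.03, "tox": 0.02, "vdd": 0.03}
ALPHA = 1.3  # exponent of the toy alpha-power model (assumed)


def stage_delay(p, c_load=10e-15):
    """Toy alpha-power-law gate delay in arbitrary units; stands in for SPICE."""
    drive = (p["w"] / (p["lpoly"] * p["tox"])) * (p["vdd"] - p["vth"]) ** ALPHA
    return c_load * p["vdd"] / drive


def sample_instance(rng):
    """Draw one value per parameter, shared by every transistor (rho = 1)."""
    return {k: rng.normal(NOMINAL[k], REL_SIGMA[k] * NOMINAL[k]) for k in NOMINAL}


def monte_carlo(n_runs, n_stages=6, seed=0):
    """Return one chain delay per Monte Carlo run."""
    rng = np.random.default_rng(seed)
    delays = np.empty(n_runs)
    for i in range(n_runs):
        p = sample_instance(rng)
        # Perfect correlation: all stages see the same parameters.
        delays[i] = n_stages * stage_delay(p)
    return delays


delays = monte_carlo(n_runs=1000)
variability = delays.std() / delays.mean()
print(f"sigma/mean of chain delay: {variability:.3f}")
```

Because every transistor in an instance shares the same draw, stage-to-stage deviations cannot average out, which is why ρ = 1 is treated as the worst case.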
10Interdependencies Between Parameters
- The operating value of Vth is composed of its long channel Vth0 value modified by ΔVth factors (BSIMSOI Model)
- Interdependencies between parameters are reconciled within each simulation by separating Vth,OPERATING into independent and dependent components.
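Schematically, the decomposition can be written as below. This is a sketch only: the individual BSIMSOI correction terms are not listed on the slide, so they are lumped into a single ΔVth term, and its argument list is an assumption based on the parameters varied in this study.

```latex
V_{th,\mathrm{OPERATING}}
  = \underbrace{V_{th0}}_{\text{independent component}}
  + \underbrace{\Delta V_{th}\!\left(L_{poly},\, W,\, t_{ox},\, V_{dd}\right)}_{\text{dependent component}}
```

Under this reading, Vth0 would be sampled as its own random variable, while ΔVth follows from the other sampled parameters rather than being drawn separately.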
11Monte Carlo Simulation II
- Goal II: Isolate individual parameter contributions to overall power/delay variability
- Parameter distributions same as in previous setup
- Again, perfect spatial correlation of parameters is assumed (ρ = 1)
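A one-parameter-at-a-time experiment of this kind might look like the sketch below. The slides do not say how contributions were normalized, so dividing by the sum of the single-parameter variances is an assumption; the toy delay model, nominal values and sigmas are likewise hypothetical placeholders.

```python
import numpy as np

# Hypothetical nominal values and relative sigmas (placeholders, not the study's data).
NOMINAL = {"vth": 0.25, "lpoly": 65e-9, "w": 1e-6, "tox": 1.8e-9, "vdd": 1.0}
REL_SIGMA = {"vth": 0.05, "lpoly": 0.04, "w": 0.03, "tox": 0.02, "vdd": 0.03}


def delay(p, alpha=1.3):
    """Toy alpha-power-law delay in arbitrary units; a stand-in for circuit simulation."""
    return p["vdd"] * p["lpoly"] * p["tox"] / (p["w"] * (p["vdd"] - p["vth"]) ** alpha)


def single_parameter_variance(name, n_runs=1000, seed=0):
    """Variance of delay when only `name` varies and the rest stay at nominal."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(NOMINAL[name], REL_SIGMA[name] * NOMINAL[name], n_runs)
    return np.var([delay({**NOMINAL, name: v}) for v in samples])


variances = {name: single_parameter_variance(name) for name in NOMINAL}
total = sum(variances.values())
# Assumed normalization: share of the summed single-parameter variances.
shares = {name: var / total for name, var in variances.items()}

for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:6s} {100 * share:5.1f}% of summed single-parameter variance")
```

Holding the other parameters at their nominal values isolates each contribution but ignores interaction terms, so the shares need not match a full simultaneous-variation run exactly.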
12NAND Chains (6-stages)
- Logic styles: Static CMOS, Static Passgate (LEAP), Pulsed Static, Dynamic Domino
- Load configurations
  - Static capacitive load, CL = 10fF
  - Active, FO3 load (value varies with parameter fluctuations)
13Adders
- Carry select, logarithmic configuration
- Ripple carry with Manchester carry chain
(passgate-based)
Static
Dynamic
- Static, Dynamic Domino, Passgate
14Adders CLA Trees
- Kogge Stone, Radix 2
- Kogge Stone, Radix 4
  - Large stack height (static) 8
- Brent-Kung
  - Large intermediate load capacitance along critical path (Sum07 node)
15Delay, Power Variability NAND chains
- The static CMOS implementation is the most robust to process parameter variations
- The passgate style (LEAP) displays the highest levels of delay and power variability (30% higher than static)
16Delay Variability Adders
- Static carry select is the most robust
- The three most variable are passgate-based, with between 31% and 67% more spread than static carry select
17Power Variability Adders
- Most robust: static ripple with Manchester carry chain
- Least robust: designs with large/irregular intermediate load capacitance along critical paths (radix 4 Kogge Stone, Brent Kung)
18Single Parameter Breakdown NAND Chains
Static capacitive load
FO3 load
- Results vary depending on final loading stage (static vs. FO3)
- Vth is most significant contributor in all cases
- For active, FO3 loads
  - Passgate design is most sensitive to Vth variations
  - Increased significance of L variations
19Single Parameter Breakdown Adders (Delay)
- Vth is most significant contributor (33% average)
  - Passgate designs are the most sensitive to Vth variations
- L is nearly as significant (28% average)
20Single Parameter Breakdown Adders (Power)
- Vdd contributions dominate (41% average)
- Vth variations are also significant (30% average)
21Conclusion
- Static CMOS implementations are generally the most robust to parameter variations, for both delay and power
- Passgate designs display the least amount of robustness
  - Suffer spreads in delay and power variability between 30% and 70% higher than static designs
  - Tend to display highest sensitivity to Vth variations
- These are worst-case results, due to the assumption of perfect parameter correlation
- Vth variations account for 35% - 40% of delay variability
- Power variability trends suggest a dependence upon large or irregular intermediate load capacitances
- Vth, L and Vdd are consistently the highest contributors to both delay (85%) and power (80%) variation.
- Future Work: Ongoing efforts to more accurately model the spatial correlation of parameters.
22Additional Material
23Adder Building Blocks
- Dot operator implementation (P, G generation)
static, radix 2
dynamic, radix 2
passgate, radix 2
static, radix 4
24Adder Power Delay Product (PDP)
- In general, CLA styles exhibit the smallest PDP
values due to their superior speeds
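For reference, PDP is simply power multiplied by delay, evaluated per sample, and its variability can be summarized by a normalized spread. The sketch below assumes hypothetical sample values and a hypothetical helper name `pdp_stats`; none of the numbers come from the study, whose raw values are proprietary.

```python
import numpy as np


def pdp_stats(power_w, delay_s):
    """Return mean PDP (joules) and its normalized spread (sigma/mean)."""
    pdp = np.asarray(power_w) * np.asarray(delay_s)  # per-sample power-delay product
    return pdp.mean(), pdp.std() / pdp.mean()


# Hypothetical Monte Carlo samples for one adder; not measured or simulated data.
power_w = [1.00e-3, 1.10e-3, 0.90e-3, 1.05e-3]
delay_s = [200e-12, 190e-12, 215e-12, 195e-12]

mean_pdp, spread = pdp_stats(power_w, delay_s)
print(f"mean PDP = {mean_pdp:.3e} J, sigma/mean = {spread:.3f}")
```

Since power and delay tend to move in opposite directions under the same parameter shift, the spread of their product can be smaller than either individual spread.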
25Adder PDP Variability
- Most designs fall within 20% of one another
- The PDP of static ripple (Manchester carry) is most variable
- Raw (proprietary) values are needed to identify the true strongest design
26Future Work
- Ongoing efforts: Investigation into effects of systematic within-die variations (correlations between characteristics of adjacent devices)
- The circuit metrics under study may be broadened to include other important figures of merit (e.g. noise immunity and leakage power)
- Transistor sizing optimizations and Monte Carlo variation simulations may be combined into a unified, variation-aware sizing methodology.
  - More computationally efficient
  - May yield more strategic timing margins