Title: Variation-Tolerant Circuit Design Techniques
1. Variation-Tolerant Circuit Design Techniques
- Tanay Karnik
- Circuits Research Lab, Intel Corporation
- Acknowledgements
- Vivek De, Jim Tschanz, Jianping Xu, Ram Krishnamurthy, Steven Hsu, Keith Bowman, Ali Keshavarzi, Muhammad Khellah, Nam Sung Kim, Siva Narendra, Peter Hazucha, Gerhard Schrom, Fabrice Paillet, Noel Menezes, Shekhar Borkar, Intel Corporation
- Prof. David Blaauw, University of Michigan
- Prof. Kaushik Roy, Purdue University
- Dr. Michael Nicolaidis, TIMA, France
2. Outline
- Motivation
- Static techniques
- Dynamic techniques
- adaptive circuits
- error recovery
- regulator integration
- Summary
3. Transistor Research
4. Intel Mobile Products
- Single-core processors: Banias (130nm), Dothan (90nm)
- Dual-core processors: Yonah (65nm), Merom (65nm), Penryn (45nm)
Source: www.Chip-Architect.com
5. Technology Trends
- Continue Moore's Law
- $/transistor scaling
- Energy/operation scaling
- More transistors per chip
- Deliver higher performance systems
- Power is the limiter
- Larger delay and power variability
6. Technology Outlook
7. Sources of Variability
8. P, V, T Variations
Process
- Die-to-die variation
- Within-die variation
- Static for each die
Voltage
- Chip activity change
- Current delivery: RLC
- Dynamic: ns to 10-100us
- Within-die variation
Temperature
- Activity and ambient change
- Dynamic: 100-1000us
- Within-die variation
Aging
- Time-dependent degradation: NBTI
- Device Ion changes very slowly, over years
9. Cost of Variations
- Underestimating Variations
- Functional yield loss
- Performance reduction
- Increases silicon debug time
- Increases manufacturing effort
- Overestimating Variations
- Increases design time
- Larger die size
- Rejection of otherwise good design options
- Missed market windows
- Increases design effort
10. Static Techniques
11. Layout Restrictions - Orientation
12. Layout Restrictions - Quantization
[Figure: cell layout; labels: Vdd, Vss, Ip, Op]
13. Layout Restrictions - Routing
14. Static variation compensation
- Measure
- Processor frequency, power
- Variation sensors, ring oscillators
- Adapt
- Clock distribution delays
- On-die body bias
- Supply voltage
15. Adaptive Body Bias - Process Variations
180nm ABB testchip
- WID-ABB
- 20X lower Fmax variation
- 97% high-bin parts
- ABB
- 6X lower Fmax variation
- 30% high-bin parts
16. Effectiveness of Adaptive Biasing
17. Input offset compensation circuit
- Active input offset compensation circuit
- - simple structure
- - programmable output voltage with 8-bit high resolution
- Voltage offset range: -200 mV to 200 mV
- Voltage offset resolution: 1.56 mV per bit
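The quoted resolution follows from the range and the bit width: 400 mV spread over 2^8 codes is 1.5625 mV per code. The sketch below shows that arithmetic. The offset-binary mapping (code 0 at -200 mV) and the function names are assumptions made for illustration; the slide does not say how codes map to voltages.

```python
# Hypothetical code-to-offset mapping for an 8-bit offset control.
# Assumption: offset-binary coding, code 0 -> -200 mV, one LSB per code step.
OFFSET_MIN_MV = -200.0
OFFSET_MAX_MV = 200.0
BITS = 8
LSB_MV = (OFFSET_MAX_MV - OFFSET_MIN_MV) / (1 << BITS)  # 400 / 256 = 1.5625 mV


def code_to_offset_mv(code: int) -> float:
    """Return the programmed offset in mV for a control code in 0..255."""
    if not 0 <= code < (1 << BITS):
        raise ValueError("code out of range")
    return OFFSET_MIN_MV + code * LSB_MV


def offset_to_code(offset_mv: float) -> int:
    """Return the nearest control code for a requested offset, clamped to range."""
    code = round((offset_mv - OFFSET_MIN_MV) / LSB_MV)
    return max(0, min((1 << BITS) - 1, code))
```

Under this assumed mapping the mid-scale code 128 gives zero offset, and the largest code gives one LSB less than +200 mV.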
18. Input Offset Cancellation
[Figure: offset cancellation circuits; labels: offset control, Vi, Vi-, Vo, Vo-, Vin, Vin-, Vout, Vclk, Vbias]
- delay matching required
- Fswitching > 2 x Finput
- clocked compensation
- enables high speed operation
- simple structure
- programmable 8-bit high resolution
19. Time Borrowing Flip-Flops
TB Flip-Flop
- Insertion of clock path inverters provides a transparency window (TW)
- Connections to master-latch pass gate and tri-state inverter change for an odd number of TW inverters
- Clocking energy overhead for larger TW
K. Bowman et al., ISLPED 2006
20. TB Flip-Flops - N-Cycle Interconnect
TB N-Cycle Interconnect
- Transparency windows enable
- Amortization of clock skew and jitter
- Averaging of WID data delay variations
- Min-delay constraints for interconnects are less stringent than logic
- TW can be a relatively large fraction of cycle time
K. Bowman et al., ISLPED 2006
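The averaging effect can be illustrated with a small Monte Carlo sketch. With hard-edge flip-flops the slowest of the N stages sets the cycle time. With ideal time borrowing, slack is shared and only the average stage delay matters. Everything numeric here (Gaussian stage delays, 5% sigma, 4 stages) is an assumed toy model, not data from the cited paper, and it ignores finite transparency windows, skew, and jitter.

```python
import random
import statistics


def cycle_times(n_stages, mean_delay=1.0, sigma=0.05, trials=2000, seed=1):
    """Compare mean cycle time for hard-edge vs. ideal time-borrowing flops.

    Assumed toy model: independent Gaussian stage delays on an N-cycle path.
    """
    rng = random.Random(seed)
    hard, borrowed = [], []
    for _ in range(trials):
        stages = [rng.gauss(mean_delay, sigma) for _ in range(n_stages)]
        # Hard-edge flops: every stage must fit in one cycle, so the
        # slowest stage sets the cycle time.
        hard.append(max(stages))
        # Ideal borrowing: a slow stage borrows from a fast neighbour, so
        # only the total delay over N cycles matters.
        borrowed.append(sum(stages) / n_stages)
    return statistics.mean(hard), statistics.mean(borrowed)


hard, borrowed = cycle_times(n_stages=4)
print(f"mean FMAX gain with ideal borrowing: {hard / borrowed - 1:.1%}")
```

The gain in this model grows with the assumed sigma, which is in line with the deck's point that larger WID delay variance increases the benefit of time borrowing.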
21. TB Flip-Flops - Active Energy
- Active energy vs. FMAX trade-off quantified by sweeping tW in model
- FMAX gains saturate as data delay averaging approaches ideality
- Mean FMAX gain of 3.5% at equal active energy
K. Bowman et al., ISLPED 2006
22. TB Flip-Flops - Average Energy
- TB interconnects enable 4-6% mean FMAX gain and 10% average energy savings
K. Bowman et al., ISLPED 2006
23. TB Flip-Flops - FMAX Gain
[Figure: two panels, "65nm Variations" and "65nm Variations Increase WID Delay Variance by 2X"]
- Maximum mean FMAX benefit ranges from 4-7.5% for 65nm process
- For 2X larger WID delay variance (1.41X larger standard deviation), maximum mean FMAX gain rises to 5-10%
K. Bowman et al., ISLPED 2006
24. Dynamic Variation Compensation
25. Dynamic adaptive design
- Dynamic variations: voltage, temperature, aging
- Guardbanding: performance or power penalty
- Optimally sense and respond to environment
Sense
- Temperature: on-die thermal sensors
- Voltage droop (1st, 2nd, 3rd): on-die droop detectors
- Processor workload: off-chip current sensor, activity monitors, software hooks
Adapt
- Body bias (NMOS, PMOS): FBB increases F and ISB; RBB reduces F and ISB
- Supply voltage: off-chip voltage regulator
- Frequency: DLL for fine-grain F change; switch between multiple PLLs
26. Itanium Power Controller
Montecito: 90-nm Itanium processor
S. Naffziger et al., The implementation of a 2-core multi-threaded Itanium-Family Processor, ISSCC 2005
27. Itanium Power Controller
Reduced power variation!
S. Naffziger et al., ISSCC 2005
28. Itanium Power Controller
[Figure: voltage, power, and current traces during a transition to low power draw and a transition to high power draw]
- Transition is fast enough that thermal budget is maintained
R. McGowen, Power and temperature control on a 90nm Itanium family processor, ISSCC 2005
29. Dynamic Adaptive TCP/IP Processor
90nm CMOS, 964K transistors, 1.3W at 3GHz
J. Tschanz et al., Adaptive frequency and biasing techniques for tolerance to dynamic temperature-voltage variations and aging, ISSCC 2007.
30. Frequency Change Algorithms
[Figure: frequency change algorithms; axis label: Frequency]
J. Tschanz et al., ISSCC 2007
31. Dynamic Droop Response
- 32% frequency increase
- 23% average frequency improvement
J. Tschanz et al., ISSCC 2007.
32. Dynamic Response
[Figure: frequency, die temperature, and NMOS body bias traces; labels: Adaptive, Fixed frequency, 2943 MHz, 2598 MHz]
- 12% frequency gain at same power
- 1.4% avg frequency gain
J. Tschanz et al., ISSCC 2007.
33. Aging Compensation with Body Bias
- 3% Fmax improvement at 0.9V
- PMOS body bias for compensation
J. Tschanz et al., ISSCC 2007.
34. On-Die Leakage Sensor For Measuring Process Variation
- High leakage sensing gain
- 90nm dual-Vt, Vdd = 1.2V, 7-level resolution, 0.66 mW at 80°C
C. Kim et al., VLSI Circuits Symp. 2004. Source: Kaushik Roy, Purdue University
35. Path-level Delay Fault Detection
[Figure: timing waveforms; panels: Correct operation, Data sampling error due to voltage droop]
- Challenges
- Detection and propagation of fault
- Metastability
- Error correction scheme
36. Shadow Latches for Error Detection
- TIMA, France
- IROC Technologies
- http://www.iroctech.com
- Developed for soft error tolerance
- Tools to estimate FIT rates
- Tools to harden netlists
Source: Michael Nicolaidis, TIMA, France, VTS 1999, DATE 2000.
37. Razor I - Error Detection and Recovery
- Goal: reduce voltage margins with in-situ error detection and correction for delay failures
- Augment flip-flops on critical path with a shadow latch
- - samples off the negative clock edge or a delayed clock
[Figure: Razor flip-flop block diagram; labels: clk, D, Q, Main Flip-Flop, Local Meta Detector, Comparator, Error, Restore, 0/1 mux inputs, RAZOR FF]
- Upon failure: overwrite main flip-flop with correct data from the shadow latch
- Ensure that the shadow latch is always correct by conventional design
- Razor I developed in collaboration with Austin and Mudge
Source: David Blaauw, University of Michigan.
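The detect-and-restore behaviour described above can be captured in a small behavioural model. This is only a sketch: the class name, the two-argument `clock` interface, and the choice to pass the "early" and "settled" data values explicitly are illustrative assumptions, and the local metastability detector from the block diagram is not modelled.

```python
from dataclasses import dataclass


@dataclass
class RazorFlipFlop:
    """Behavioural sketch of a main flip-flop plus shadow latch (hypothetical API)."""

    q: int = 0            # main flip-flop output
    shadow: int = 0       # shadow latch contents
    error: bool = False   # comparator output for the last cycle

    def clock(self, d_at_edge: int, d_settled: int) -> int:
        """Advance one cycle.

        d_at_edge: value on D when the main flip-flop samples (may be stale
                   if the critical path is late).
        d_settled: value on D when the shadow latch samples later; assumed
                   correct by conventional worst-case design.
        """
        self.q = d_at_edge
        self.shadow = d_settled
        # Comparator: a mismatch means the main flip-flop sampled too early.
        self.error = self.q != self.shadow
        if self.error:
            # Restore: overwrite the main flip-flop with the shadow value.
            self.q = self.shadow
        return self.q
```

A cycle in which both samples agree leaves `error` low; a late-arriving transition sets `error` and the output ends up holding the shadow value.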
38. Razor Flip-Flop Circuit Schematic
[Figure: Razor flip-flop schematic with master and slave latches; labels: D_in, Q, Clk, nClk, G1, Restore, Error]
Source: David Blaauw, University of Michigan.
39. Razor Self-tuning Microprocessor
ARM processor with Razor Technology
- 0.18um
- Transistors: 1.6M
- Power overhead: 2.9%
[Figure: percentage error rate and normalized energy vs. voltage (in volts) at 120MHz and 140MHz; labels: Error Rate, Energy, Error detection, Point of First Failure]
- 20% energy reduction at same frequency
- Error rate must be small (<<1%)
S. Das et al., A self-tuning DVS processor using delay-error detection and correction, VLSI 2005.
40. Distribution of Net Energy Savings
[Figure: histograms of number of chips vs. percentage savings at 120MHz and 140MHz]
Distribution of net energy savings over worst case for 33 measured chips
Source: David Blaauw, University of Michigan.
41. Self-Regulating Voltage using Error Detection
- Razor voltage regulation
- Tune processor voltage based on error rate
- Eliminate safety margins, purposely run below critical voltage
- Data-dependent latency margins
- Trade-off: voltage power savings vs. overhead of correction
- Analogous to wireless power modulation
[Figure: pipeline IPC and energy vs. supply voltage; curves: Processor Energy, Processor Energy w/ overhead, Total Energy; marks the Optimal Voltage]
Source: David Blaauw, University of Michigan.
42. Razor Voltage Controller
[Figure: Razor voltage control loop; labels: CPU, Error Count, Esample, Eref, Ediff, Voltage Control Function, FPGA, reset, 12 bit DAC, DC-DC, Voltage Regulator, Vdd]
Ediff = Eref - Esample
[Figure: (b) run-time response of the voltage controller at 750KHz error sampling frequency; controller output voltage (V) and percentage error rate vs. time (seconds)]
Source: David Blaauw, University of Michigan.
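One way to read the loop: the sampled error rate is compared with a reference, and the difference steers the supply. The sketch below uses a plain proportional step on a 12-bit DAC code. The proportional law, the gain value, the assumption that a larger code means a higher voltage, and the function name are all illustrative; the slide only gives the error term and the block names.

```python
def next_dac_code(code: int, e_sample: float, e_ref: float,
                  gain: float = 50.0, bits: int = 12) -> int:
    """Return the next DAC code from the current code and error rates (in %).

    Assumptions (not from the slide): proportional control, and a DAC whose
    output voltage rises with the code.
    """
    e_diff = e_ref - e_sample  # error term as defined on the slide
    # Error rate below target (e_diff > 0): there is margin, lower the voltage.
    # Error rate above target (e_diff < 0): raise the voltage.
    code = code - round(gain * e_diff)
    # Clamp to the range of the DAC.
    return max(0, min((1 << bits) - 1, code))
```

With a 0.1% target, an error-free sample nudges the code down and a 1% sample pushes it up, so the supply settles near the voltage where the error rate matches the reference.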
43. Low-Energy Error Detection Sequentials
Source: Keith Bowman et al., ISSCC 2008.
44. Low-Energy Error Detection Sequentials
Source: Keith Bowman et al., ISSCC 2008.
45. Low-Energy Error Detection Sequentials
- Timing-error detection and recovery demonstration.
- Data from input buffer arrives late at 3rd pipeline stage
- (a) Detect error
- (b) Invalidate output data
- (c) Halve FCLK
- (d) Replay instruction
- (e) Validate output data
- (f) Resume target FCLK
- Avg path FCLK gains from eliminating V and T guardbands
- Path FCLK gains are measured by comparing EDS at nominal conditions to the conventional design at worst-case conditions
Source: Keith Bowman et al., ISSCC 2008
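Steps (a) through (f) can be sketched as a simple replay loop. The function name, the `has_timing_error` callback, and the event strings are hypothetical and stand in for hardware signals; the sketch only mirrors the order of the steps and assumes that the replay at half frequency is error-free.

```python
def run_with_recovery(instructions, has_timing_error, target_fclk):
    """Execute instructions, replaying any that hit a timing error.

    has_timing_error(instr, fclk) is a hypothetical callback that stands in
    for the error-detection sequentials. Returns a log of (instr, event).
    """
    log = []
    for instr in instructions:
        fclk = target_fclk                      # (f) each instruction starts at target FCLK
        if has_timing_error(instr, fclk):       # (a) detect error
            log.append((instr, "invalidate"))   # (b) invalidate output data
            fclk = target_fclk / 2              # (c) halve FCLK
            if has_timing_error(instr, fclk):
                raise RuntimeError("replay at half FCLK still failed")
            log.append((instr, "replay"))       # (d) replay instruction
        log.append((instr, "valid"))            # (e) validate output data
    return log


# Usage: instruction "i2" is assumed to fail above 1500 MHz.
fails = lambda instr, fclk: instr == "i2" and fclk > 1500
events = run_with_recovery(["i1", "i2", "i3"], fails, target_fclk=2000.0)
```

Only the failing instruction pays the replay penalty, which is why throughput stays close to the target as long as the error rate is low.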
46. Low-Energy Error Detection Sequentials
- Measured throughput (TP) and error rate versus clock frequency (FCLK)
- Measured maximum throughput (TP) for resilient and conventional designs versus supply voltage
Source: Keith Bowman et al., ISSCC 2008.
47. Power Supply Industry Trends
- USD 28 Billion industry
- 1000 manufacturers of AC-DC and DC-DC
- a reactive industry, non-marketable for OEMs
- Easy to enter, difficult to grow
- growth: AC-DC 4.6%, DC-DC 3%
- 15% R&D -> 50% profit to sustain
- 5% (USD 1 Billion) R&D spending
- Efficiency
- utility bills for server farms
- Power management and digital power
Source: Mohan Mankikar
48. Microprocessor Platforms
[Figure: platform power delivery for desktop, server, mobile, and handheld platforms]
- area intensive decaps
- platform area saving potential
49. Integrated-Converter Benefits
- 4:1 conversion reduces current to 0.29x, off-chip decoupling to 0.15x, and I2R loss to 0.09x
- Separate memory supply can avoid Vccmin problems
Gerhard Schrom et al., ISLPED 2004.
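As a rough consistency check on the current and loss figures, resistive loss scales with the square of the current. Assuming the resistance $R$ of the delivery path is unchanged (an assumption, not stated on the slide):

```latex
\[
\frac{P_{\text{loss,new}}}{P_{\text{loss,old}}}
  = \frac{(0.29\,I)^{2} R}{I^{2} R}
  = 0.29^{2}
  \approx 0.084
\]
```

This is close to the quoted 0.09x; the small gap is within what rounding of the 0.29x current ratio would explain.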
50. 25A/mm2 Linear Regulator for 2-VCC
- 90nm communication process
- <10% P-P droop with 100% load change
- 0.54ns response time
Peter Hazucha et al., VLSI Circuits Symposium 2004.
51. DC-DC Converter Test Chip
Peter Hazucha et al., VLSI Circuits Symposium 2004.
52. DC-DC Switching Converter
[Figure: converter efficiency curves; labels: VIN = 1.4V, VOUT = 1.1V and VIN = 1.2V, VOUT = 0.9V]
- Roll-off at high current due to series R
- Efficiency improves at 1.4V due to lower MOSFET resistance, peaks at 87.7%
Peter Hazucha et al., VLSI Circuits Symposium 2004.
53. Supply Resonance Suppression
Impedance peaks at 140 MHz
Jianping Xu et al., ISSCC 2007
54. Band-Limited Active Damping
55. Band-Limited Active Damping
- 77% resonance suppression (12.76dB)
- Obtained suppression current density of 8.5 A/mm2
- RSC cell occupies 0.001 mm2
- less than 1% area overhead and 3% power overhead
Jianping Xu et al., ISSCC 2007
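The percentage and decibel figures for resonance suppression are two views of the same ratio. The sketch below converts a fractional reduction to dB, assuming the figure is an amplitude ratio (hence the factor of 20); the function name is illustrative.

```python
import math


def suppression_db(fraction_removed: float) -> float:
    """Attenuation in dB for a fractional amplitude reduction (assumed amplitude ratio)."""
    remaining = 1.0 - fraction_removed
    return -20.0 * math.log10(remaining)


print(f"{suppression_db(0.77):.2f} dB")
```

For a 77% reduction this gives roughly 12.77 dB, which matches the quoted 12.76dB to within rounding.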
56. Memory Circuits
- Register Files
- Split wordline
- PVT compensated keeper
- mask, trim, tune
- NAND instead of stacked PMOS
- SRAM / cache
- Selective sleep for retention or collapse
- Voltage modulation
- bit-line pulsing
- word-line pulsing
- word-line overdrive/ underdrive
- cell voltage underdrive / collapse
- Data sensing
- sense-amplifier sharing
- optimized timing
- Variation tolerance
- self-repair with leakage monitors and tunable VBB
- active sleep, active voltage modulation
- tunable timing and trimming
Tanay Karnik et al., ICCAD 2007
57. Summary
- Must do
- Continue Moore's Law
- $/transistor scaling
- Energy/operation scaling
- More transistors per chip
- Deliver higher performance systems
- Solutions
- Static and dynamic variation tolerant circuits
- Memory/RF scaling circuits
58. References
- "A 256-Kb Dual-VCC SRAM Building Block in 65-nm CMOS Process With Actively Clamped Sleep Transistor," Khellah, M., et al., IEEE Journal of Solid-State Circuits, Volume 42, Issue 1, Jan. 2007, pp. 233-242
- "A 4.2GHz 0.3mm2 256kb Dual-Vcc SRAM Building Block in 65nm CMOS," Khellah, M., et al., IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, Feb. 6-9, 2006, pp. 2572-2581
- "Dynamic sleep transistor and body bias for active leakage power control of microprocessors," Tschanz, J.W.; Narendra, S.G.; Ye, Y.; Bloechel, B.A.; Borkar, S.; De, V., IEEE Journal of Solid-State Circuits, Volume 38, Issue 11, Nov. 2003, pp. 1838-1845
- "Effectiveness of adaptive supply voltage and body bias for reducing impact of parameter variations in low power and high performance microprocessors," Tschanz, J.W.; Narendra, S.; Nair, R.; De, V., IEEE Journal of Solid-State Circuits, Volume 38, Issue 5, May 2003, pp. 826-829
- "Dynamic-sleep transistor and body bias for active leakage power control of microprocessors," Tschanz, J.; Narendra, S.; Yibin Ye; Bloechel, B.; Borkar, S.; De, V., IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, 2003, pp. 102-481, vol. 1
- "Adaptive body bias for reducing impacts of die-to-die and within-die parameter variations on microprocessor frequency and leakage," Tschanz, J.W., et al., IEEE Journal of Solid-State Circuits, Volume 37, Issue 11, Nov. 2002, pp. 1396-1402
- "Adaptive body bias for reducing impacts of die-to-die and within-die parameter variations on microprocessor frequency and leakage," Tschanz, J., et al., IEEE International Solid-State Circuits Conference (ISSCC), Volume 2, Feb. 3-7, 2002, pp. 344-539
- "Adaptive body bias for reducing impacts of die-to-die and within-die parameter variations on microprocessor frequency and leakage," Tschanz, J., et al., IEEE International Solid-State Circuits Conference (ISSCC), Volume 1, Feb. 3-7, 2002, pp. 422-478, vol. 1
- "Impact of die-to-die and within-die parameter fluctuations on the maximum clock frequency distribution for gigascale integration," Bowman, K.A.; Duvall, S.G.; Meindl, J.D., IEEE Journal of Solid-State Circuits, Volume 37, Issue 2, Feb. 2002, pp. 183-190
- "A self-tuning DVS processor using delay-error detection and correction," Das, S.; Roberts, D.; Seokwoo Lee; Pant, S.; Blaauw, D.; Austin, T.; Flautner, K.; Mudge, T., IEEE Journal of Solid-State Circuits, Volume 41, Issue 4, April 2006, pp. 792-804
- Keith Bowman et al., ISSCC 2008
- "A process variation compensating technique with an on-die leakage current sensor for nanometer scale dynamic circuits," Kim, C.H., et al., TVLSI, Volume 14, Issue 6, June 2006, pp. 646-649
- "Self-Repairing SRAM for Reducing Parametric Failures in Nanoscaled Memory," Mukhopadhyay, S., et al., Symposium on VLSI Circuits, June 15-17, 2006, pp. 132-133
59. References - continued
- "High Voltage Tolerant Linear Regulator With Fast Digital Control for Biasing of Integrated DC-DC Converters," Hazucha, P., et al., IEEE Journal of Solid-State Circuits, Volume 42, Issue 1, Jan. 2007, pp. 66-73
- "A Linear Regulator with Fast Digital Control for Biasing Integrated DC-DC Converters," Hazucha, P., et al., IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, Feb. 6-9, 2006, pp. 2180-2189
- "Optimal Design of Monolithic Integrated DC-DC Converters," Schrom, G., et al., IEEE International Conference on Integrated Circuit Design and Technology (ICICDT '06), 24-26 May 2006, pp. 1-3
- "A 233-MHz 80%-87% efficient four-phase DC-DC converter utilizing air-core inductors on package," Hazucha, P.; Schrom, G.; Jaehong Hahn; Bloechel, B.A.; Hack, P.; Dermer, G.E.; Narendra, S.; Gardner, D.; Karnik, T.; De, V.; Borkar, S., IEEE Journal of Solid-State Circuits, Volume 40, Issue 4, April 2005, pp. 838-845
- "Area-efficient linear regulator with ultra-fast load regulation," Hazucha, P.; Karnik, T.; Bloechel, B.A.; Parsons, C.; Finan, D.; Borkar, S., IEEE Journal of Solid-State Circuits, Volume 40, Issue 4, April 2005, pp. 933-940
- "Feasibility of monolithic and 3D-stacked DC-DC converters for microprocessors in 90nm technology generation," Schrom, G.; Hazucha, P.; Jae-Hong Hahn; Kursun, V.; Gardner, D.; Narendra, S.; Karnik, T.; De, V., Proceedings of the 2004 International Symposium on Low Power Electronics and Design (ISLPED '04), 9-11 Aug. 2004, pp. 263-268
- "A 480-MHz, multi-phase interleaved buck DC-DC converter with hysteretic control," Schrom, G.; Hazucha, P.; Hahn, J.; Gardner, D.S.; Bloechel, B.A.; Dermer, G.; Narendra, S.G.; Karnik, T.; De, V., IEEE 35th Annual Power Electronics Specialists Conference (PESC 04), Volume 6, 20-25 June 2004, pp. 4702-4707
- "Power and temperature control on a 90-nm Itanium family processor," McGowen, R.; Poirier, C.A.; Bostak, C.; Ignowski, J.; Millican, M.; Parks, W.H.; Naffziger, S., IEEE Journal of Solid-State Circuits, Volume 41, Issue 1, Jan. 2006, pp. 229-237
- "A 90-nm variable frequency clock system for a power-managed Itanium architecture processor," Fischer, T.; Desai, J.; Doyle, B.; Naffziger, S.; Patella, B., IEEE Journal of Solid-State Circuits, Volume 41, Issue 1, Jan. 2006, pp. 218-228
- "Adaptive Frequency and Biasing Techniques for Tolerance to Dynamic Temperature-Voltage Variations and Aging," Tschanz, J., et al., IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, 2007
- "On-Die Supply Resonance Suppression Using Band-Limited Active Damping," Xu, J., et al., IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, 2007