Title: Scaling below 90nm: Designing with unreliable components
1 Scaling below 90nm: Designing with unreliable components
Author: TAD team. Presented by Bart Dierickx, Director of the Technology Aware Design program, IMEC, Leuven, Belgium. FSA-IET, 14 May 2007
2 We have to face it
In a 100-million-transistor SoC, at least one transistor is weak, is deficient, or will fail soon. We do not know which one. We do not know when.
- Technology aware design solves the challenges imposed by scaling (variability, reliability, static power).
- It is complementary to DFM.
- It brings back that happy scaling feel.
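The "at least one weak transistor" statement can be checked with simple arithmetic. The following is a minimal sketch, assuming that devices fail independently and using hypothetical per-transistor probabilities (the values of `p_weak` are illustrative and do not come from the presentation):

```python
import math


def prob_at_least_one_weak(n_transistors: int, p_weak: float) -> float:
    """P(at least one weak device) = 1 - (1 - p)^N, assuming independent devices."""
    # log1p/expm1 keep precision when p_weak is tiny and N is huge.
    return -math.expm1(n_transistors * math.log1p(-p_weak))


if __name__ == "__main__":
    n = 100_000_000  # 100 million transistors, as on the slide
    for p in (1e-10, 1e-9, 1e-8, 1e-7):  # hypothetical per-device probabilities
        prob = prob_at_least_one_weak(n, p)
        print(f"p = {p:.0e}: P(at least one weak) = {prob:.4f}")
```

Under these assumptions, even a very small per-device probability adds up to a substantial chip-level probability once the device count reaches 100 million.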
3 Scaling below 90nm: designing with unreliable components. Outline
- Introduction: Scaling hits fundamental limits
- Understand and analyze the issues routinely
- Countermeasures
- Conclusions
4 Scaling hits fundamental limits
[Picture: Asenov, IMEC/TAD workshop 2005]
- Atomic uncertainty in dimension and dopants
- Before that, other brick walls will have stopped us
  - Litho: molecular randomness, wavelength
  - Electric field, strain/stress, leakage
  - Economical
5 The insurance salesman's talk (1)
- In the happy scaling era
  - Cost and power budget were the only design constraints
  - Design and technology did not interact: we had SPICE models and standard libraries
  - A design was in, or was not in, spec
- Today
  - Cost, power, yield, operation and use, stress
  - Model and library complexity exploded
  - A device is in, or is not in, spec
- Scaling induces new awkward constraints
  - Variability
  - Leakage, static power dissipation
  - Reliability: finite and unpredictable lifetime of components
6 The insurance salesman's talk (2)
- Some issues have found stand-alone state-of-the-art (SotA) solutions
  - Technology itself: high-k, backend, novel device structures, ...
  - DFM: OPC, CMP, ...
  - Design solutions: footer switches, guard banding, multi-VTH, low VDD, system load balancing, ...
Persistent issues, requiring design-technology interaction. Solutions?
1. System-level yield, strongly affected by variability and reliability
2. Timing/energy closure in the presence of variability
7 Industry hits problems that cannot be solved standalone
Can I reach my power budget?
How does low-k affect my yield?
Corrections at run-time?
Please do not change design paradigms!
Field returns?
Must I fear variability?
Can I further reduce VDD?
Redundancy at all levels?
New thoughts on testability?
Must I redesign my libraries?
Is ultra-low VDD an option?
FinFETs?
Parallelism? Logic depth?
Must I pay for that process option?
Maybe high-k is a mistake?
Is variability the limit, or reliability?
Yield impact of process option?
- What are the next-node design issues?
  - Predicting and modeling the imperfection
- How to handle unreliability?
  - Design (circuit, architecture, system) solutions that allow us to live with variability and reliability issues
  - Systems must, and will, allow that a fraction of the devices is weak or fails, at first or during their lifetime
9 Trade-off speed versus leakage: technology options
[Figure: two design points for two MOSFET technology options, "High speed, high leakage" and "Low speed, low leakage"]
- If your specs are in this area: choose the high-speed, high-leakage option
- If your specs are in this area: any option is OK
- If your specs are in this area: choose the low-speed, low-leakage option
Q: Is this traditional view correct?
10 Intermezzo: specs, variability cloud and equal-yield curves
11 Trade-off speed versus leakage: technology options revisited
[Figure: two design points for two MOSFET technology options, "High speed, high leakage" and "Low speed, low leakage"]
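To make the idea of a variability cloud around each design point concrete, here is a minimal Monte Carlo sketch. It assumes a toy model that is not from the presentation: one normalized threshold-voltage deviate per circuit instance, a linear delay sensitivity and an exponential leakage sensitivity. All names, nominal values and specs are hypothetical.

```python
import math
import random


def parametric_yield(nominal_delay, nominal_leakage, sigma_vth,
                     delay_spec, leakage_spec, n_samples=20000, seed=1):
    """Monte Carlo estimate of the fraction of instances inside both specs.

    Toy model (assumption): a higher threshold voltage makes an instance
    slower but less leaky, so the cloud is stretched along a diagonal.
    """
    rng = random.Random(seed)
    passed = 0
    for _ in range(n_samples):
        x = rng.gauss(0.0, sigma_vth)                  # normalized Vth deviation
        delay = nominal_delay * (1.0 + x)              # assumed linear sensitivity
        leakage = nominal_leakage * math.exp(-3.0 * x)  # assumed exponential sensitivity
        if delay <= delay_spec and leakage <= leakage_spec:
            passed += 1
    return passed / n_samples


if __name__ == "__main__":
    delay_spec, leakage_spec = 1.2, 2.0  # hypothetical specs, arbitrary units
    options = {
        "high speed / high leakage": (0.8, 1.6),
        "low speed / low leakage": (1.1, 0.5),
    }
    for name, (delay, leak) in options.items():
        for sigma in (0.0, 0.1):
            y = parametric_yield(delay, leak, sigma, delay_spec, leakage_spec)
            print(f"{name}, sigma = {sigma}: yield = {y:.3f}")
```

With `sigma = 0.0` both nominal design points sit inside the specs, which corresponds to the traditional "any option is OK" view. With a non-zero spread, part of each cloud falls outside one of the specs, so the choice becomes a question of yield.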
12 Next node is faster but has higher variability
Similar issue: high-k versus low-k
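The trade-off between a faster mean and a wider spread can be sketched analytically. The example below assumes a Gaussian delay distribution, and the node parameters are hypothetical values chosen only to illustrate the effect:

```python
import math


def timing_yield(mean_delay, sigma_delay, delay_spec):
    """Fraction of instances with delay <= spec for a Gaussian delay distribution."""
    z = (delay_spec - mean_delay) / sigma_delay
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    delay_spec = 1.1  # hypothetical spec, arbitrary units
    nodes = {
        "current node": (1.00, 0.05),  # (mean delay, sigma), assumed values
        "next node": (0.85, 0.15),     # faster on average, wider spread
    }
    for name, (mean, sigma) in nodes.items():
        print(f"{name}: timing yield = {timing_yield(mean, sigma, delay_spec):.4f}")
```

With these assumed numbers the next node is faster on average yet has the lower timing yield at the given spec, because the spec lies fewer standard deviations from its mean.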
13 Understanding the issues routinely: strategic what-if questions
- Strategic questions
  - Example (1): trade-off speed vs leakage, technology options versus yield
  - Example (2): next node is faster, but has higher variability: yield impact
- Routinely
  - On a smaller scale, the same questions are operational design trade-offs
  - Cell, library optimization
  - Simulations must be variability aware
  - Sign-off in view of yield
14 Variability and reliability awareness in the design flow?
- All abstraction levels
  - Technology: geometrical and material variability
  - Component or compact model level: electrical variability
  - Gate level or standard cell level: delay and energy
  - Digital block level: energy / execution time variability
  - System level: yield
- Point tools emerge
  - SSTA (statistical static timing analysis)
  - Variability SPICE models
- Full yield prediction?
  - All levels covered
  - IMEC's variability aware modeling
16 Solutions are emerging
- Technology solutions
  - Ever better process control
  - Novel devices, materials, process steps
- DFM
  - Mask generation anticipates systematics: OPC, CMP, ...
  - Restricted design rules and restricted electrical rules
- Design time solutions / countermeasures
  - Variability and reliability tolerant circuit design
  - Redundancy (e.g. memory, imagers, data transmission)
- Runtime countermeasures
  - Activate countermeasures as variability or spec drift happens
- Our approach: SKM, Standardized Knobs & Monitors
  - Large class of runtime countermeasures for variability and reliability
  - Living / designing with unpredictable components
  - Propagating industry acceptance by standardization
17 Knobs & Monitors: how does it work? (1)
[Figure: energy per cycle (A.U.) versus circuit delay (A.U.), with the spec on power and the spec on circuit delay marked]
- A particular circuit instance happens to have this operating point
- Knob: selects the high-speed/high-energy configuration to regain the delay spec
18 Step 2: equip circuit parts with Knobs & Monitors
[Figure: a system containing any circuit part with a monitor, a control algorithm and a knob. Inputs: application, environment parameters, technology knowledge]
- What does it do?
  - Monitor flags spec (near-)failure
  - Knob changes the circuit's operating point so as to regain spec
  - Control algorithm
  - Knobs and Monitors interact via the circuit part's I/O
- Why standardize?
  - Acceptance by the HW design community
  - Interchangeability of each part
  - Delegation of design (even across companies)
  - Control algorithms become abstract paradigms
Example of a Monitor: completion detection circuit
Example of a Knob: line drivers with programmable current
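The monitor, knob and control algorithm interaction described on this slide can be sketched in a few lines. This is an illustrative model only, not IMEC's implementation: the `CircuitPart` class, the discrete knob settings and the speed-up and energy figures per knob step are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class CircuitPart:
    """Hypothetical circuit part with a discrete knob (0 = low energy, higher = faster)."""
    base_delay: float                # delay at knob setting 0, arbitrary units
    knob: int = 0
    max_knob: int = 3
    speedup_per_step: float = 0.15   # assumed fractional delay reduction per step
    energy_per_step: float = 0.25    # assumed fractional energy increase per step

    def delay(self) -> float:
        return self.base_delay * (1.0 - self.speedup_per_step * self.knob)

    def energy(self) -> float:
        return 1.0 + self.energy_per_step * self.knob


def monitor_flags_failure(part: CircuitPart, delay_spec: float) -> bool:
    """Monitor: reports whether the part currently violates the delay spec."""
    return part.delay() > delay_spec


def control_step(part: CircuitPart, delay_spec: float) -> bool:
    """Control algorithm: raise the knob until the monitor stops flagging.

    Returns True if the part ends up in spec, False if the knob range is exhausted.
    """
    while monitor_flags_failure(part, delay_spec) and part.knob < part.max_knob:
        part.knob += 1
    return not monitor_flags_failure(part, delay_spec)


if __name__ == "__main__":
    for base in (0.9, 1.2, 2.0):  # hypothetical instances from the variability cloud
        part = CircuitPart(base_delay=base)
        in_spec = control_step(part, delay_spec=1.0)
        print(f"base delay {base}: knob = {part.knob}, "
              f"energy = {part.energy():.2f}, in spec = {in_spec}")
```

In this sketch a fast instance keeps its knob at the low-energy setting, a slow instance spends extra energy to regain the delay spec, and an instance beyond the knob range is reported as failing.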
19 (3) Fine-grained Knobs and Monitors in SoC
System software
- Add monitors on performance-critical circuit parts (all!?)
- Add operating-point Knobs on performance-tunable circuit parts
- Put the control intelligence in embedded software
20 Example: knobs for runtime trade-offs in SRAM
- Buffers/drivers are present in different parts of the memory architecture
- Limited impact on area
- Large impact on energy vs delay
Ref: IMEC 2006
21 What does it bring us?
- Better than worst case design (BTWC)
  - Circuit parts are nominally designed to be just fast enough, with just enough energy
  - They have an alternate mode which is guaranteed in spec, at the expense of power dissipation
  - At run time, most circuit parts run at just enough energy; few knobs must be turned high
  - Fine-grained is key
- Correction is at run time. It can thus also compensate for
  - Temperature drift
  - Ageing: many degradation effects
- 32nm variability relaxes to a 65nm feel
- Q: why not design every circuit part with guaranteed timing spec?
  - Yes, that will work (the corner or guard band approach), but
  - with severely over-designed power dissipation
  - For deep submicron nodes, power over-design is > 200
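The better-than-worst-case argument can be illustrated with simple arithmetic. The sketch below assumes two operating modes per circuit part, a nominal one and a boosted one, and compares the average energy of run-time tuning with a design in which every part pays the boosted energy. The function names and all numbers are hypothetical and are not taken from the presentation.

```python
def average_energy_btwc(fraction_boosted, e_nominal, e_boosted):
    """Mean energy per part when only a fraction of parts run in the boosted mode."""
    if not 0.0 <= fraction_boosted <= 1.0:
        raise ValueError("fraction_boosted must be between 0 and 1")
    return (1.0 - fraction_boosted) * e_nominal + fraction_boosted * e_boosted


def overdesign_percent(e_worst_case, e_btwc):
    """Extra energy of a worst-case design relative to the run-time tuned design, in percent."""
    return 100.0 * (e_worst_case - e_btwc) / e_btwc


if __name__ == "__main__":
    # Hypothetical numbers, in arbitrary energy units.
    e_nominal, e_boosted = 1.0, 2.5
    e_btwc = average_energy_btwc(0.05, e_nominal, e_boosted)
    print(f"BTWC average energy: {e_btwc:.3f}")
    print(f"Worst-case over-design: {overdesign_percent(e_boosted, e_btwc):.0f}%")
```

The smaller the fraction of parts that need the boosted mode, the closer the average energy stays to the nominal value, which is why fine-grained knobs matter.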
22 Temperature drift and ageing
[Figure: energy per cycle (A.U.) versus circuit delay (A.U.), with the spec on power and the spec on circuit delay marked. Labels: nominal variability cloud; cloud shifts due to temperature; cloud shifts due to degradation; Knob]
23 Examples of Knobs and Monitors
- Monitors
  - Delay monitors
  - CRC or parity error flagging
  - Memory BIST result
  - Razor triggers
  - Detection of noise margins and signal levels (near or real failures)
  - Nearly failed completion detector (circuit specific)
  - Power or supply current metering
  - Local temperature sensors
  - Feedback (on actual circuit), feed forward (on replicated circuit)
  - ...
- Knobs
  - Back biasing
  - VDD
  - Speed vs power in drivers, also I/O drivers
  - Speed / power / noise margin in line drivers
  - Power / noise margin in transmission lines
  - Current biasing: power/speed in analog circuits
  - Quality of service, quality of experience
  - ...
24 Variability at system level is called yield
[Figure: yield estimator. Inputs: system architecture running application; specifications (power and timing constraints); module-level activity information; IP component energy/delay statistics]
- Estimation of parametric yield in the energy/timing domain allows what-if questions such as
  - How much yield for a given spec?
  - How much worst case and average energy?
  - What are the yield-critical blocks?
  - What if the architecture changes?
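A system-level yield estimator of this kind can be sketched as a Monte Carlo loop over block-level statistics. The model below is a simplified assumption and not the TAD tool: blocks execute sequentially, each block's delay and energy per access are independent Gaussians, and the activity count is the number of accesses per application run. The dictionary keys and all numbers are hypothetical.

```python
import random


def estimate_system_yield(blocks, time_spec, energy_spec, n_samples=10000, seed=7):
    """Fraction of sampled systems that meet both the timing and the energy spec.

    blocks: list of dicts with keys delay_mean, delay_sigma, energy_mean,
            energy_sigma (per access) and activity (accesses per run).
    """
    rng = random.Random(seed)
    ok = 0
    for _ in range(n_samples):
        total_time = 0.0
        total_energy = 0.0
        for b in blocks:
            # One fabricated instance of this block: fixed delay/energy per access.
            delay = rng.gauss(b["delay_mean"], b["delay_sigma"])
            energy = rng.gauss(b["energy_mean"], b["energy_sigma"])
            total_time += b["activity"] * delay
            total_energy += b["activity"] * energy
        if total_time <= time_spec and total_energy <= energy_spec:
            ok += 1
    return ok / n_samples


if __name__ == "__main__":
    blocks = [  # hypothetical IP component statistics, arbitrary units
        {"delay_mean": 1.0, "delay_sigma": 0.10, "energy_mean": 2.0,
         "energy_sigma": 0.2, "activity": 100},
        {"delay_mean": 0.5, "delay_sigma": 0.08, "energy_mean": 1.0,
         "energy_sigma": 0.1, "activity": 300},
    ]
    for time_spec in (260.0, 280.0, 300.0):  # what-if: how much yield for a given spec?
        y = estimate_system_yield(blocks, time_spec, energy_spec=560.0)
        print(f"time spec {time_spec}: estimated yield = {y:.3f}")
```

Re-running the estimator with a changed block list or changed specs is one way to pose the what-if questions listed above. Identifying yield-critical blocks would need an additional per-block sensitivity analysis that this sketch does not include.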
25 Variability Aware Modeling yields Yield
[Figure: top-down design flow; bottom-up synthesis and simulation flow]
27 Conclusions
- Sub-90nm scaling issues force us to design with unreliable components
- Understand and analyze the issues routinely
  - Routinely, in the design flow
  - Make the full design flow variability and reliability aware
  - Abandon guardband / corner / worst case design
  - Predict yield as a function of strategic and operational choices (as IMEC's TAD)
  - Sign off in view of yield versus power and performance
- Offer solutions or mitigation paths
  - Design time solutions, DFM, ...
  - Runtime countermeasures: Knobs & Monitors
28 ?