Title: Statistical VLSI Design Analysis
1Statistical VLSI Design Analysis
Bao Liu
UC San Diego
http://vlsicad.ucsd.edu/bliu
2Outline
- VLSI Process and System Runtime Variations
- Statistical Timing Analysis
- Circuit Operation Mode and Input Statistics
- DSM Effects
- Statistical Gate Level Simulation
- Process Variation Extraction
3VLSI Variabilities
- Technology scaling → shrinking layout feature sizes → larger percentage of variations
- Manufacturing process
- Chemical Mechanical Polishing (CMP) → tox, wire thickness
- Lithography → Leff, wire width
- Dopant variation → Vth
- System runtime
- Power/ground supply voltage drop/bounce
- Temperature
4Damascene and Dual-Damascene Process
- Damascene process named after the ancient Middle
Eastern technique for inlaying metal in ceramic
or wood for decoration
[Figure: process flow. Damascene: ILD deposition, oxide trench etch, metal fill, metal CMP. Dual damascene: ILD deposition, oxide trench / via etch, metal fill, metal CMP.]
5Optical Proximity Correction (OPC)
- Layout modifications improve process control
- improve yield (process latitude)
- improve device performance
- Complicates mask manufacturing and increases cost
- Post-design verification is needed
6Process Variations
- Sources of Variations
- 1. Environmental factors
- 2. Physical factors
7Delay impact of variations
Courtesy Kerim Kalafala
8Process Variations (..contd)
Figure: Statistical distribution of 16-bit adder critical path delay, 0.18 µm technology. 3σ: worst case of Monte Carlo simulation. CWC: classical worst case model. Process parameters: oxide thickness, length, width, threshold voltage.
Source: A. Nardi, A. Neviani, E. Zanoni, M. Quarantelli, "Impact of Unrealistic Worst Case Modeling on the Performance of VLSI Circuits in Deep Submicron Region CMOS Technologies," IEEE, 1999.
9Outline
- VLSI Process and System Runtime Variations
- Statistical Timing Analysis
- Circuit Operation Mode and Input Statistics
- DSM Effects
- Statistical Gate Level Simulation
- Process Variation Extraction
10Timing Constraints
11Timing Constraint: Setup Time
- For FFs to correctly capture data, data must be stable for
- Setup time (Tsetup) before clock arrives
Tclk_to_Q + Td_max + Tsetup < Tclk_skew + Tcycle
12Timing Constraint: Hold Time
- For FFs to correctly latch data, data must be stable during
- Hold time (Thold) after clock arrives
Tclk_to_Q + Td_min > Tclk_skew + Thold
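The two constraints can be checked numerically. The sketch below is a minimal illustration, assuming all quantities are given in the same time unit; the function names and the sample values are hypothetical, not taken from the slides.

```python
def setup_slack(t_clk_to_q, t_d_max, t_setup, t_clk_skew, t_cycle):
    """Margin of the setup constraint
    Tclk_to_Q + Td_max + Tsetup < Tclk_skew + Tcycle (positive means met)."""
    return (t_clk_skew + t_cycle) - (t_clk_to_q + t_d_max + t_setup)


def hold_slack(t_clk_to_q, t_d_min, t_hold, t_clk_skew):
    """Margin of the hold constraint
    Tclk_to_Q + Td_min > Tclk_skew + Thold (positive means met)."""
    return (t_clk_to_q + t_d_min) - (t_clk_skew + t_hold)


# Hypothetical numbers in nanoseconds.
s_setup = setup_slack(0.10, 0.70, 0.05, 0.02, 1.00)
s_hold = hold_slack(0.10, 0.05, 0.03, 0.02)
```

A positive return value means the constraint holds with that much margin; a negative value is a violation.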
13Timing Verification
- Simulation is time-consuming
- Path enabling → SAT → NP-hard
- Static analysis: vector-less worst case analysis
- Worst case delay for each component
- Find longest/shortest path
- No Boolean logic, no SAT
- Pessimistic bounds of actual path delays
14Timing Graph
- Data paths with timing constraints
- Starting from primary inputs/FF outputs
- Ending at primary outputs/FF inputs
- Represented by a labeled directed graph G = <V, E>
- Timing node V: pin / primary input / primary output
- Timing edge E: gate/wire delay
- (Timing arc: gate delay)
15Compute Longest Path
(Kirkpatrick 1966, IBM JRD)
- Compute longest path in a DAG G = <V, E, delay, Origin>
- // delay is the set of labels, Origin is the super-source of the DAG
- Forward-prop(W)
- for each vertex v in W
- for each edge <v,w> from v
- Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay(<v,w>))
- if all incoming edges of w have been traversed, add w to W
- Longest-path(G)
- Forward-prop(Origin)
- Dynamic programming
- How to exclude a set of paths?
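The forward propagation above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: node delays are folded into the edge delays, the accumulated arrival time of v (rather than a per-node label) is propagated, and the graph is a DAG given as a dictionary of edge delays. The name `longest_path` and the sample graph are hypothetical.

```python
from collections import defaultdict, deque


def longest_path(edges, origin):
    """Longest delay from `origin` to every node of a DAG.

    edges: dict mapping (v, w) -> delay of edge <v, w>.
    Returns a dict node -> longest accumulated delay (-inf if unreachable).
    """
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = {origin}
    for (v, w), d in edges.items():
        succ[v].append((w, d))
        indeg[w] += 1
        nodes.update((v, w))

    arrival = {n: float("-inf") for n in nodes}
    arrival[origin] = 0.0
    # A node becomes ready once all of its incoming edges have been traversed.
    ready = deque(n for n in nodes if indeg[n] == 0)
    while ready:
        v = ready.popleft()
        for w, d in succ[v]:
            arrival[w] = max(arrival[w], arrival[v] + d)
            indeg[w] -= 1
            if indeg[w] == 0:
                ready.append(w)
    return arrival


example = {("O", "a"): 1, ("O", "b"): 2, ("a", "c"): 5, ("b", "c"): 3, ("c", "d"): 1}
result = longest_path(example, "O")
```

Each edge is visited once, so the runtime is linear in the size of the graph, which is what makes this dynamic-programming formulation attractive compared with path enumeration.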
16Static Timing Analysis
- Actual arrival time (AAT) forward propagation
- Required arrival time (RAT) backward propagation
- Slack = RAT - AAT
- A measure of how much timing margin exists at each node
- Slack < 0 → timing violation
- Can optimize a particular branch
- Can trade slack for power, area, robustness
- Critical path
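A small sketch of the AAT/RAT/slack computation, assuming sources arrive at time 0, delays are non-negative and attached to edges, and required times are specified only at sinks. The function name `compute_slack` and the example graph are hypothetical; `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def compute_slack(edges, required):
    """edges: {(v, w): delay}; required: {sink: required arrival time}.
    Returns {node: slack}, with slack = RAT - AAT."""
    preds = {}
    for (v, w) in edges:
        preds.setdefault(w, set()).add(v)
        preds.setdefault(v, set())
    order = list(TopologicalSorter(preds).static_order())

    # Forward propagation: actual arrival time is the max over fan-in.
    aat = {n: 0.0 for n in order}
    for w in order:
        for v in preds[w]:
            aat[w] = max(aat[w], aat[v] + edges[(v, w)])

    # Backward propagation: required arrival time is the min over fan-out.
    rat = {n: required.get(n, float("inf")) for n in order}
    for w in reversed(order):
        for v in preds[w]:
            rat[v] = min(rat[v], rat[w] - edges[(v, w)])

    return {n: rat[n] - aat[n] for n in order}


slack = compute_slack({("a", "c"): 5, ("b", "c"): 3, ("c", "d"): 1}, {"d": 8})
```

In this example the nodes a, c and d share the smallest slack and form the critical path, while b has extra margin that could be traded for power or area.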
17Process Variation
- Classification
- inter-chip: from die to die / wafer to wafer / lot to lot
- intra-chip: within a single die
- spatially correlated
- systematic
- CMP and OPC related, lens aberration
- random
- fluctuations in doping concentration
- modeling limitations
18Counting Process Variation
- Min/Max-based
- Inter-die variation
- Pessimistic
- Corner-based
- Intra-die variation
- Computationally expensive
- Statistical
- pdf for delays
- Reports timing yield
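As an illustration of what "reports timing yield" means, the sketch below computes the yield of a single path in closed form. It assumes independent Gaussian stage delays, which is a simplification not made in the slides (correlated delays would need covariance terms); the function name and numbers are hypothetical.

```python
import math


def timing_yield(means, sigmas, t_cycle):
    """P(path delay <= t_cycle) for a path whose stage delays are
    independent normal random variables (assumption of this sketch)."""
    mu = sum(means)
    sigma = math.sqrt(sum(s * s for s in sigmas))
    z = (t_cycle - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Two stages of 100 ps mean with 3 ps and 4 ps sigma, 205 ps budget.
y = timing_yield([100.0, 100.0], [3.0, 4.0], 205.0)
```

Here the path delay has mean 200 ps and sigma 5 ps, so a 205 ps budget sits one sigma above the mean.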
19Statistical Timing Analysis
- Deterministic STA requires two algebraic operations
- Summation of delay
- Taking maximum
- Probabilistic max is difficult
- Identity of longest path is random
- Exact evaluation for arbitrary pdfs is
computationally prohibitive
20Gaussian Arrival Time Propagation
- Problem formulation: assuming gate delays to be correlated normal random variables, compute the mean and variance of the critical path delay MAX(D)
- Basic idea
- Transforming series-edges,
21Gaussian Arrival Time Propagation
- Transforming parallel edges: N = MAX(N1, N2) need not be normal!
C. Clark, The greatest of a finite set of random
variables. Operations Research, 1961
For two stochastic variables with a given correlation coefficient, the mean and variance of their maximum are obtained by the following equations, unless [equations and the excluded condition are not preserved]
22Gaussian Arrival Time Propagation
23Gaussian Arrival Time Propagation
- Algorithm: starts at the source node and propagates the mean, variance and covariance structure of the graph until the sink is reached
- At the sink node, we have the mean and variance of the critical path delay
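The series and parallel transformations used in this propagation can be sketched as below. The moment formulas for the maximum are the standard ones from Clark's 1961 paper cited above; the function names, the degenerate-case handling, and the choice to return (mean, standard deviation) pairs are assumptions of this sketch. The result of the max is re-approximated as a normal variable by matching its first two moments.

```python
import math


def _pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gaussian_sum(mu1, s1, mu2, s2, rho):
    """Series edges: mean and sigma of N1 + N2 for correlated normals."""
    return mu1 + mu2, math.sqrt(s1 * s1 + s2 * s2 + 2.0 * rho * s1 * s2)


def clark_max(mu1, s1, mu2, s2, rho):
    """Parallel edges: mean and sigma of MAX(N1, N2) by moment matching."""
    a2 = s1 * s1 + s2 * s2 - 2.0 * rho * s1 * s2
    if a2 <= 1e-24:
        # Degenerate case: the two variables differ only by a constant.
        return (mu1, s1) if mu1 >= mu2 else (mu2, s2)
    a = math.sqrt(a2)
    alpha = (mu1 - mu2) / a
    first = mu1 * _cdf(alpha) + mu2 * _cdf(-alpha) + a * _pdf(alpha)
    second = ((mu1 * mu1 + s1 * s1) * _cdf(alpha)
              + (mu2 * mu2 + s2 * s2) * _cdf(-alpha)
              + (mu1 + mu2) * a * _pdf(alpha))
    return first, math.sqrt(max(second - first * first, 0.0))


m, s = clark_max(0.0, 1.0, 0.0, 1.0, 0.0)
```

For two independent standard normals the maximum has mean 1/sqrt(pi) and variance 1 - 1/pi, which gives a quick sanity check of the formulas.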
Results
24Statistical Timing Analysis
- Block based
- Each timing node has an arrival time distribution
- Static worst case analysis
- Efficient for circuit optimization
- Path based
- Each timing node in each path has an arrival time distribution
- Corner-based or Monte Carlo analysis
- Accurate for signoff analysis
25Statistical timing tools
26Statistical Timing Analysis
- What are the issues here?
27Statistical Timing Analysis Issues
- Gaussian or non-Gaussian
- Correlations
- Physical effects
- Circuit operation mode and input statistics
- because SSTA evolves from STA
28Outline
- VLSI Process and System Runtime Variations
- Statistical Timing Analysis
- Circuit Operation Mode and Input Statistics
- DSM Effects
- Statistical Gate Level Simulation
- Process Variation Extraction
29CMOS Gate Power Consumption
- Dynamic
- Still dominant component in current technology
- Charging and discharging the capacitor
- Short-circuit
- During a transition, current flows through both P and N transistors simultaneously for a SHORT period of time
- Slow transitions worsen short-circuit power
- Leakage
- Even when a device is nominally OFF (VGS = 0), a small amount of current is still flowing
- With many devices, can add up to hundreds of mW
30Estimating Dynamic Power Consumption
Slide courtesy of Mary Jane Irwin, PSU
31Toggle Rate Estimation
- Simulation
- requires representative simulation vectors
- derived by designer
- automatic (Monte Carlo)
- Probabilistic Propagation
- no input vectors needed
- much faster than simulation
- less accurate than simulation
- glitches?
Slide courtesy, Prof. J. Rabaey, UCB
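The simulation-based option can be sketched as a Monte Carlo experiment. This is a minimal illustration assuming zero-delay gates (so glitches are not modelled), independent inputs, and independent consecutive vectors; `estimate_toggle_rate` and its parameters are hypothetical names.

```python
import random


def estimate_toggle_rate(gate, input_probs, n_vectors=20000, seed=0):
    """Fraction of consecutive random vectors on which the gate output
    changes value. gate: function of booleans; input_probs: P(input = 1)."""
    rng = random.Random(seed)

    def sample():
        return gate(*[rng.random() < p for p in input_probs])

    prev = sample()
    toggles = 0
    for _ in range(n_vectors):
        cur = sample()
        toggles += (cur != prev)
        prev = cur
    return toggles / n_vectors


rate = estimate_toggle_rate(lambda a, b: a and b, [0.5, 0.5])
```

For a 2-input AND gate with input probabilities of 0.5, the output is 1 with probability 0.25, so the expected toggle rate is 2 × 0.25 × 0.75 = 0.375; the estimate converges to that value as the number of vectors grows.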
32Signal Probability and Activity
- Signal probability and activity
- Signal probability: probability of a signal being logic ONE
- Signal activity (transition density): probability of signal switching
- ni(T): the number of switchings of signal i in the interval (-T/2, T/2]
Slide courtesy, Z. Chen, K. Roy
33Probability Propagation
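- Let y = f(x1, ..., xn) be a Boolean function with independent variables xi; the signal probability of f can be obtained in linear time as follows.
- $P(y) = P(x_1)\,P(f_{x_1}) + P(\bar{x}_1)\,P(f_{\bar{x}_1})$
- where $f_{x_1} = f(1, x_2, \ldots, x_n)$ and $f_{\bar{x}_1} = f(0, x_2, \ldots, x_n)$ are the cofactors of f with respect to x1.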
- Improve runtime by using a BDD
34Activity Propagation
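- Let y = f(x1, ..., xn) be a Boolean function with independent variables xi; the signal activity of f can be obtained in linear time as follows.
- $A(y) = \sum_{i=1}^{n} P\!\left(\frac{\partial y}{\partial x_i}\right) A(x_i)$
- where the Boolean difference is $\frac{\partial y}{\partial x_i} = f|_{x_i=1} \oplus f|_{x_i=0}$
- where $\oplus$ is the exclusive-or operation.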
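Both propagation rules can be sketched for a Boolean function given as a Python callable. The sketch assumes independent inputs and expands cofactors recursively, which is exponential in the number of inputs; the linear-time claim on the slides refers to evaluation on a BDD, which is not implemented here. Function names are hypothetical.

```python
def signal_probability(f, probs, assigned=()):
    """P(f = 1) by recursive cofactor (Shannon) expansion.
    probs[i] is the probability that input i is logic ONE."""
    k = len(assigned)
    if k == len(probs):
        return 1.0 if f(*assigned) else 0.0
    p = probs[k]
    return (p * signal_probability(f, probs, assigned + (True,))
            + (1.0 - p) * signal_probability(f, probs, assigned + (False,)))


def signal_activity(f, probs, activities):
    """Sum over inputs of P(Boolean difference w.r.t. x_i) * activity(x_i)."""
    total = 0.0
    for i, a_i in enumerate(activities):
        def boolean_difference(*x, i=i):
            hi = x[:i] + (True,) + x[i + 1:]
            lo = x[:i] + (False,) + x[i + 1:]
            return f(*hi) != f(*lo)  # exclusive-or of the two cofactors
        total += signal_probability(boolean_difference, probs) * a_i
    return total


and2 = lambda a, b: a and b
xor2 = lambda a, b: a != b
p_and = signal_probability(and2, [0.5, 0.5])
a_and = signal_activity(and2, [0.5, 0.5], [0.1, 0.2])
a_xor = signal_activity(xor2, [0.5, 0.5], [0.1, 0.2])
```

For the AND gate each input transition propagates only when the other input is 1, so the output activity is half the sum of the input activities; for XOR every input transition propagates.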
35Probability Propagation
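AND gate: sp(1) = sp1 × sp2; tp(0→1) = sp × (1 - sp)
Example: sp = 0.5 × 0.5 = 0.25; tp = 0.25 × (1 - 0.25) = 0.1875
[Figure: example circuit annotated with propagated signal probabilities: 1/2 at each of four inputs, 1/4 at two internal nodes, 7/16 at the output]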
Ignores Temporal and Spatial Correlations
Slide courtesy, Prof. J. Rabaey, UCB
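The gate-level rules can be written down directly. The sketch below assumes independent gate inputs, which is exactly the spatial-correlation approximation the slide warns about. The OR rule and the two-AND-into-OR topology are assumptions chosen because they reproduce the 1/4 and 7/16 values quoted in the example; the function names are hypothetical.

```python
import math


def sp_and(*sps):
    """Signal probability of an AND gate with independent inputs."""
    return math.prod(sps)


def sp_or(*sps):
    """Signal probability of an OR gate with independent inputs."""
    return 1.0 - math.prod(1.0 - s for s in sps)


def tp_rise(sp):
    """Probability of a 0 -> 1 transition, ignoring temporal correlation."""
    return sp * (1.0 - sp)


n1 = sp_and(0.5, 0.5)   # first internal node
n2 = sp_and(0.5, 0.5)   # second internal node
out = sp_or(n1, n2)     # output of the assumed OR gate
```

With these inputs the internal nodes evaluate to 1/4 and the output to 7/16, and `tp_rise(0.25)` gives the 0.1875 from the worked example.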
36SPSTA Signal Probability based STA
- SPSTA
- Input statistics
- Statistical
- Timing yield
- Signal transition occurrence probability
- Accurate
- SSTA
- No input statistics
- Static
- No timing yield
- No signal transition occurrence probability
- Pessimistic
- No bound
- Pessimism → larger deviation → less correlation → underestimation of DSM effects → optimism → no bound
37Outline
- VLSI Process and System Runtime Variations
- Statistical Timing Analysis
- Circuit Operation Mode and Input Statistics
- DSM Effects
- Statistical Gate Level Simulation
- Process Variation Extraction
38Multiple-Input Switching
- Simultaneous signal switching at multiple inputs of a gate leads to up to 20% (26%) gate delay mean (standard deviation) mismatch [Agarwal-Dartu-Blaauw, DAC'04]
39Crosstalk Aggressor Alignment
- We consider an equally significant source of
uncertainty in SSTA, which is crosstalk aggressor
alignment induced gate delay variation
[Figure: multiple-input switching (MIS) vs. crosstalk aggressor alignment (CAA)]
40Problem Formulation
- Given
- Coupled interconnect system
- Gate input signal arrival time distributions
- Find
- Gate output signal arrival time distributions
- We present signal arrival times in polynomial functions of normal distribution random variables
- E.g., for first order approximation of two input signal arrival times, their skew (crosstalk alignment) is given in normal distribution random variables with correlation taken into account
xi = fi(r1, r2, ...), ri ~ N(μi, 3σi)
x1 ~ N(μ1, 3σ1), x2 ~ N(μ2, 3σ2)
x = x2 - x1 ~ N(μ = μ2 - μ1, 3σ = 3(σ1^2 + σ2^2 + corr)^(1/2))
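A numeric sketch of the first-order skew computation. It assumes the correlation term for a difference of two normal variables takes the usual form -2·ρ·σ1·σ2, with ρ the correlation coefficient; the slides only name the term "corr", so this expansion and the function name are assumptions. The function works with plain σ values, whereas the slides quote 3σ.

```python
import math


def skew_distribution(mu1, sigma1, mu2, sigma2, rho=0.0):
    """Mean and sigma of the skew x2 - x1 of two jointly normal
    arrival times with correlation coefficient rho."""
    mu = mu2 - mu1
    var = sigma1 ** 2 + sigma2 ** 2 - 2.0 * rho * sigma1 * sigma2
    return mu, math.sqrt(max(var, 0.0))


# Hypothetical arrival times in picoseconds.
mu_skew, sigma_skew = skew_distribution(10.0, 3.0, 25.0, 4.0)
```

Positively correlated arrival times shrink the skew spread, and perfectly correlated signals with equal sigma have a deterministic skew.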
41Statistical Delay Calculation w. Coupled Interconnect
- Input: coupled interconnects
- gate input signal arrival time distributions
- process variations
- Output: gate output signal arrival time distributions
- Driver gate delay calculation for sampled crosstalk alignments
- Approximate driver gate delay in a piece-wise quadratic function of crosstalk alignment
- Compute output signal arrival time distribution by closed-form formulas
- Combine with other process variations
42Driver Gate Delay vs. Crosstalk Alignment
16X inverters driving 1000um global
interconnects in 70nm technology
43Driver Gate Delay vs. Crosstalk Alignment
- More complex than the timing window concept
- Can be computed by simulation or delay calculation
- Approximated in a piece-wise quadratic function
44Closed Form Driver Gate Delay Distribution
- For a normal distribution crosstalk alignment x
- Transform probabilities via inverse functions
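One way to read "transform probabilities via inverse functions" is sketched below: for each quadratic piece, solve for the alignments at which the delay equals a threshold, then sum normal probabilities over the resulting alignment intervals. This is an illustration under assumptions, not the closed-form formulas of the slides; the piece format `(lo, hi, a, b, c)` for delay = a·x² + b·x + c on lo <= x <= hi and the name `delay_cdf` are hypothetical.

```python
import math


def norm_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def delay_cdf(t, pieces, mu, sigma):
    """P(delay <= t) when alignment x ~ N(mu, sigma) and delay is a
    piece-wise quadratic function of x."""
    total = 0.0
    for lo, hi, a, b, c in pieces:
        if a == 0.0 and b == 0.0:
            sets = [(lo, hi)] if c <= t else []
        elif a == 0.0:
            root = (t - c) / b
            sets = [(lo, min(hi, root))] if b > 0 else [(max(lo, root), hi)]
        else:
            disc = b * b - 4.0 * a * (c - t)
            if disc < 0.0:
                sets = [] if a > 0 else [(lo, hi)]
            else:
                r = math.sqrt(disc)
                x1, x2 = sorted(((-b - r) / (2 * a), (-b + r) / (2 * a)))
                if a > 0:   # delay <= t between the two roots
                    sets = [(max(lo, x1), min(hi, x2))]
                else:       # delay <= t outside the two roots
                    sets = [(lo, min(hi, x1)), (max(lo, x2), hi)]
        for l, h in sets:
            if h > l:
                total += norm_cdf(h, mu, sigma) - norm_cdf(l, mu, sigma)
    return total


inf = float("inf")
p_quad = delay_cdf(1.0, [(-inf, inf, 1.0, 0.0, 0.0)], 0.0, 1.0)
p_lin = delay_cdf(1.0, [(-inf, inf, 0.0, 2.0, 1.0)], 0.0, 1.0)
```

With delay = x² and a standard normal alignment, P(delay <= 1) equals P(|x| <= 1), which provides a simple check; differentiating the CDF with respect to t would give the delay density.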
45Driver Gate Output Signal Arrival Time
Distribution
- For a normal distribution crosstalk alignment x
- Consider correlation via conditional probabilities
46Runtime Analysis
- Driver gate delay calculation for N sampled crosstalk alignments takes O(N) time, where N = min(t3 - t0, 6σ of crosstalk alignment) / time_step
- Fitting takes O(N) time
- Computing output signal arrival time distribution takes constant time, e.g., updating in an iterative SSTA
47DSM Effects
- SSTA must consider DSM effects!
- We take crosstalk aggressor alignment into account in statistical gate delay calculation
- We approximate driver gate delay in a piecewise quadratic function of crosstalk aggressor alignment
- We derive closed-form formulas for driver gate delay and output signal arrival time distribution for given input signal arrival times in polynomial functions of normal distributions
- Our experiments show that neglecting crosstalk alignment effect could lead to up to 159.4% (147.4%) mismatch of driver gate delay means (standard deviations), while our method gives output signal arrival time means (standard deviations) within 2.57% (3.86%) of SPICE results
48Outline
- VLSI Process and System Runtime Variations
- Statistical Timing Analysis
- Circuit Operation Mode and Input Statistics
- DSM Effects
- Statistical Gate Level Simulation
- Process Variation Extraction
49Traditional Gate Models
- K-factor lookup tables
- Dg = f(Cload, Tr)
- Trout = g(Cload, Tr)
- Effective capacitance Ceff for distributed load capacitance
- To achieve identical gate delay (and output signal transition time at the same time!)
- E.g., by going through an iteration to achieve the same average gate output current
- May not converge
- No equivalent gate delay and Trout at the same time
- Waveforms are not ramp functions!
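A lookup-table gate model of the form Dg = f(Cload, Tr) is typically evaluated by interpolating between characterized points. The sketch below uses bilinear interpolation with linear extrapolation outside the table; the table values, axis units, and the name `lookup` are made up for illustration.

```python
from bisect import bisect_right


def lookup(table, loads, slews, c_load, t_r):
    """Bilinear interpolation in a 2-D table indexed as table[i][j] for
    load loads[i] and input transition time slews[j] (both ascending)."""
    def bracket(axis, x):
        i = min(max(bisect_right(axis, x) - 1, 0), len(axis) - 2)
        frac = (x - axis[i]) / (axis[i + 1] - axis[i])
        return i, frac

    i, u = bracket(loads, c_load)
    j, v = bracket(slews, t_r)
    return ((1 - u) * (1 - v) * table[i][j] + u * (1 - v) * table[i + 1][j]
            + (1 - u) * v * table[i][j + 1] + u * v * table[i + 1][j + 1])


loads = [10.0, 20.0]            # hypothetical load capacitances, fF
slews = [50.0, 100.0]           # hypothetical input transition times, ps
delay_table = [[30.0, 40.0],    # hypothetical gate delays, ps
               [50.0, 60.0]]
d_mid = lookup(delay_table, loads, slews, 15.0, 75.0)
```

The same function can serve for the Trout = g(Cload, Tr) table; what it cannot capture is the limitation noted above, that real waveforms are not ramps.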
50Current-Based Transistor Models
- MOSFET is a voltage-controlled current source, e.g., as in the alpha-power-law model
- For a simple inverter, gate output current is given by one of the transistors
- An equivalent inverter macro-model for an inverting complex gate
- → current-based gate modeling
51Current-Based Gate Modeling
- Consists of a lookup table I(Vi, Vo) and C(Vi, Vo)
- Transient analysis for output signal waveform
[Figure: voltage-based gate model (Vi, R, Vo) vs. current-based gate model (Vi, I(Vi, Vo), C)]
52Current Gate Model Based Transient Analysis
- Input Vi(t), I(Vi, Vo), Cg, load interconnect
- Output Vo(t)
- Reduce load interconnect, e.g., to a Pi model
- For each time step t
- Find Vi(t) and Vo(t)
- Find I(Vi, Vo) by table lookup
- Compute Vo(t+1) with load interconnect
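The time-stepping loop can be sketched with a forward-Euler update. Several simplifications are assumed here and are not from the slides: the load interconnect is reduced to a single lumped capacitance rather than a Pi model, the I(Vi, Vo) table is replaced by a made-up analytic stand-in, and all names and values are hypothetical.

```python
def simulate_gate(vi_of_t, current, c_load, vo0, dt, n_steps):
    """Forward-Euler transient analysis of a current-source gate model
    driving a lumped capacitance. Returns the sampled Vo waveform."""
    vo = vo0
    wave = [vo]
    for k in range(n_steps):
        i_out = current(vi_of_t(k * dt), vo)   # stands in for table lookup
        vo += dt * i_out / c_load              # C dVo/dt = I(Vi, Vo)
        wave.append(vo)
    return wave


VDD = 1.0    # supply voltage, V (hypothetical)
G = 1e-3     # drive strength, S (hypothetical)


def inverter_current(vi, vo):
    """Made-up inverter model: current into the output node."""
    return G * ((VDD - vi) * (VDD - vo) - vi * vo) / VDD


def ramp(t, t_rise=50e-12):
    """Rising input ramp from 0 to VDD over t_rise seconds."""
    return VDD * min(t / t_rise, 1.0)


wave = simulate_gate(ramp, inverter_current, c_load=100e-15,
                     vo0=VDD, dt=1e-12, n_steps=1000)
```

With these values the output time constant is C/G = 100 ps, so a 1 ps step keeps the explicit update stable; a production simulator would use an implicit method and the real lookup tables.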
53Statistical Gate Level Simulation
- Given
- Variational input Vi(t)
- Current source gate model of I(Vi, Vo), Cg
- Find variational output signal waveform Vo(t)
[Figure: current-source gate model with input Vi, current source I(Vi, Vo), and capacitance C]
54Outline
- VLSI Process and System Runtime Variations
- Statistical Timing Analysis
- Circuit Operation Mode and Input Statistics
- DSM Effects
- Statistical Gate Level Simulation
- Process Variation Extraction
55Problem of Correlations
- Two types of correlation
- reconvergence
- spatial correlation
56Spatial Correlation
- Process variation
- inter-chip: from die to die / wafer to wafer / lot to lot
- intra-chip: within a single die
- spatially correlated
- systematic
- CMP and OPC related, lens aberration
- random
- fluctuations in doping concentration
- Spatial correlations!
- Spatial correlation is critical to VLSI performance variation
- We propose to extract spatial correlation based on production chip performance variations
57Summary
- VLSI Process and System Runtime Variations
- Statistical Timing Analysis
- Circuit Operation Mode and Input Statistics
- DSM Effects
- Statistical Gate Level Simulation
- Process Variation Extraction
58Thank you!