Title: Partitioning-Based Approach to Fast On-Chip Decap Budgeting and Minimization
1Partitioning-Based Approach to Fast On-Chip Decap
Budgeting and Minimization
- Hang Li, Zhenyu Qi
- Sheldon X.-D. Tan
- Mixed-Signal Nanometer VLSI Research Lab
- University of California, Riverside
- Yici Cai, Xianlong Hong
- Department of Computer Science and Technology
- Tsinghua University, Beijing, China
- This work is sponsored by
- Lifeng Wu
- Cadence Design Systems Inc., San Jose, CA
2Outline
- Introduction
- Review of existing decap budgeting algorithm
- Improved conjugate gradient algorithm
- Partitioning-based decap budgeting strategy
- Experimental results
- Conclusions and future work
3What is Decap?
- Decap: decoupling capacitor
4Why Is Adding Decap Important?
- Power source fluctuations increase significantly
Illustration of voltage drop variation in a modern VLSI chip (from Cadence's VoltageStorm product brochure)
- Static IR drop: $\Delta V = I \cdot R = (P / V_{dd}) \cdot R$
- Dynamic IR drop: $\Delta V = L \cdot di/dt$ noise
5Introduction - Voltage Drop Impacts on Timing
A 10% voltage drop can cause more than 10% delay
6Introduction - Effect of Adding Decaps
Adding decaps is the most effective way to reduce voltage noise in P/G grids
7Introduction - The Costs of Adding Decaps
- Decaps are mainly made of MOS gate capacitors
- They consume premium white space
- White space could otherwise be used for adding buffers and other logic gates for physical optimization.
- MOS gates are leaky and become leakier with scaling
- More leakage power
- Excessive decaps lead to low yield, low circuit resonant frequency, etc.
- Economical use of decaps is important!
8Review of Previous Decap Budgeting Algorithms
- Charge-based methods (Chen & Ling, DAC'97; Zhao & Roy, TCAD'02)
- Charges for removing the unwanted voltage drops are estimated based on simplified P/G grids.
- The decaps added are much more than necessary and far from minimum solutions.
- Sensitivity-based methods (Bai & Hajj, ICCAD'00; Su & Sapatnekar, TCAD'03; Wang, DAC'03; Fu & Tan, ASPDAC'04)
- Decaps are added based on the time-domain sensitivity of voltage drop w.r.t. decap area (value).
- More precise and accurate decap estimates.
- Not very scalable to large circuits.
9Main Contributions of the Proposed New Approach
to Decap Budgeting
- Improved conjugate gradient algorithm (efficiency issue)
- Efficient search step during the line search phase.
- Partitioning-based merged adjoint network for sensitivity computation.
- Localized decap budgeting based on partitioning (scalability issue)
- Decap has local effects on the voltage fluctuation.
- Partition large circuits into smaller ones and optimize each of them individually.
10Problem Definition - Introduction
- Given: P/G grids modeled as RC/RLC networks with time-varying current sources
- A violation occurs where the voltage is less than the user-specified value; the violation is measured as the area over time
- Objective: use minimum decaps to remove the shaded violation area
- Subject to
- Voltage drop limited to user-specified values
- Decap area constraints from physical layout
- Other reliability constraints
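The violation measure described above (the area, over time, by which a node voltage falls below the user-specified value) can be sketched for a single node as follows. This is a minimal illustration that assumes the waveform is piecewise linear; the function name `pwl_violation_area` and the parameter `v_min` are hypothetical and not taken from the original work.

```python
def pwl_violation_area(times, volts, v_min):
    """Area between v_min and a piecewise-linear waveform wherever it dips below v_min."""
    area = 0.0
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        d0 = v_min - volts[k]        # shortfall at segment start (positive = violation)
        d1 = v_min - volts[k + 1]    # shortfall at segment end
        if d0 <= 0.0 and d1 <= 0.0:
            continue                             # segment entirely at or above the limit
        if d0 >= 0.0 and d1 >= 0.0:
            area += 0.5 * (d0 + d1) * dt         # trapezoid, fully in violation
        else:
            d_pos = d0 if d0 > 0.0 else d1       # the violating endpoint
            frac = d_pos / abs(d0 - d1)          # fraction of the segment in violation
            area += 0.5 * d_pos * frac * dt      # triangle up to the crossing point
    return area


# Hypothetical waveform: the supply dips from 1.0 V to 0.8 V and recovers.
area = pwl_violation_area([0.0, 1.0, 2.0, 3.0], [1.0, 0.8, 0.8, 1.0], v_min=0.9)
```

Summing this quantity over all nodes gives one scalar that is zero exactly when no node violates the limit, which is what makes it usable as an optimization target.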
11Decap Budgeting Problem Formulation
- Optimize decap area subject to IR noise constraints present in the power grid network
Objective function
minimize
, where
Constraint
1.
or error bound
where
2.
-- A Nonlinear Optimization Problem!
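One way to write a formulation of this kind, consistent with the problem definition (minimum total decap, no remaining violation area, bounded decap per node), is given below. The symbols $w_i$ (decap at node $i$), $w_{\max,i}$ (layout limit), $V_{\min}$ (voltage limit), $v_j(t)$ (voltage at node $j$), $T$ (simulation window) and $\varepsilon$ (error bound) are assumed notation and may differ from the authors' own.

```latex
\begin{aligned}
\text{minimize}\quad & \sum_{i} w_i \\
\text{subject to}\quad
 & \sum_{j} \int_{0}^{T} \max\bigl(V_{\min} - v_j(t),\, 0\bigr)\, dt = 0
   \quad (\text{or} \le \varepsilon), \\
 & 0 \le w_i \le w_{\max,i} \quad \text{for every decap node } i .
\end{aligned}
```

Because $v_j(t)$ depends on the decap values through the transient response of the grid, the first constraint is nonlinear in $w$.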
12Previous Conjugate Gradient Based Method (Fu & Tan, ASPDAC'04)
- Transform the above constrained problem into an
unconstrained one
Penalty function to be minimized
- Solve the unconstrained problem by Conjugate
Gradient (CG) optimization
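A typical penalty transformation of this kind is sketched below. It is an assumed form, not necessarily the expression used in the cited work: $w_i$ is the decap at node $i$, $v_j(t)$ the voltage at node $j$, $V_{\min}$ the voltage limit, $T$ the simulation window, and $\alpha$ a weight that trades decap area against remaining violation.

```latex
f(w) \;=\; \alpha \sum_{i} w_i \;+\; \sum_{j} \int_{0}^{T} \max\bigl(V_{\min} - v_j(t),\, 0\bigr)\, dt
```

With a form like this, the minimizer depends on $\alpha$: a large weight on decap area can leave some violation in place, while a small one tolerates excess decap.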
13Conjugate Gradient Optimization: Gradient Computation
- Gradient of the penalty function with respect to each decap node
- Gradient of decap at a single node
Efficient sensitivity computation is required!
14Conjugate Gradient Optimization: Sensitivity Computation
- Adjoint Network Method
- Two simulations are required to calculate the sensitivity of a SINGLE violation node
- One for the original network
- One for the adjoint network
15Conjugate Gradient Optimization: Sensitivity Computation (Cont.)
- Merged adjoint network method in time domain
- Compute sensitivity for all the decaps together
Merge all the inputs for AT
- Two simulations to calculate a whole gradient vector, i.e. sensitivity for ALL decaps in the circuit
- Time Complexity
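The "two simulations for the whole gradient" idea can be illustrated with a small discrete-time sketch. It makes several assumptions that are not from the slides: the grid is an RC network written in terms of voltage drop (so a violation is a drop above `limit`), time stepping is backward Euler, capacitance is one decap per node, and the adjoint is the discrete adjoint of that time-stepping scheme rather than a continuous-time adjoint circuit. All function names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def be_matrix(G, c, h, transpose=False):
    """Backward-Euler system matrix G + C/h (optionally transposed), C = diag(c)."""
    n = len(c)
    return [[(G[j][i] if transpose else G[i][j]) + (c[i] / h if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def simulate(G, c, I, h):
    """Transient of C dv/dt + G v = i(t) from v = 0; returns one row of drops per time point."""
    A = be_matrix(G, c, h)
    v = [[0.0] * len(c)]
    for i_k in I:
        v.append(solve(A, [c[i] / h * v[-1][i] + i_k[i] for i in range(len(c))]))
    return v


def violation_area(v, limit, h):
    """Sum over all nodes of the area by which the drop exceeds the limit."""
    return h * sum(max(x - limit, 0.0) for row in v[1:] for x in row)


def merged_adjoint_gradient(G, c, I, h, limit):
    """d(violation_area)/d(c_i) for every node i from one forward and one adjoint sweep."""
    n = len(c)
    v = simulate(G, c, I, h)                      # simulation 1: original network
    A_T = be_matrix(G, c, h, transpose=True)
    lam, grad = [0.0] * n, [0.0] * n
    for k in range(len(I), 0, -1):                # simulation 2: adjoint, backward in time
        # Merged excitation: every violating node drives the adjoint at once.
        rhs = [c[i] / h * lam[i] - (h if v[k][i] > limit else 0.0) for i in range(n)]
        lam = solve(A_T, rhs)
        for i in range(n):
            grad[i] += lam[i] * (v[k][i] - v[k - 1][i]) / h
    return grad


# Hypothetical 3-node grid: a chain of unit resistors with both ends tied to the supply.
G = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
c = [0.5, 0.5, 0.5]
h = 0.05
I = [[0.0, 1.0 if 20 <= k < 60 else 0.0, 0.0] for k in range(100)]
grad = merged_adjoint_gradient(G, c, I, h, limit=0.3)
```

The cost is two transient sweeps regardless of how many decap nodes there are, whereas perturbing each decap separately would need one extra simulation per node.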
16Problems for Previous CG Method
- Objective is a function of α
- Difficult to select α
- Less efficient optimization (constraint violations may not be removed completely)
- Line search can be very costly and wasteful of CPU time for an inappropriate α
Figure: f(x) plotted against α
17Simplified Problem Definitions
- Simplified objective function, which consists of
the violation area only
minimize
subject to
- Avoids the inherently ambiguous optimization objective of the previous conjugate gradient method
- Eliminates the misleading α in the previous objective function
18Merged Adjoint Method (Fu, ASPDAC'04)
- Merged adjoint network method for efficient sensitivity computation
i: decap node
j: violation node
19Problem with Merged Adjoint Method
- Overestimation of decap budget due to the
sensitivity loss from merged adjoint network
20Partitioning-Based Merged Adjoint Method
- We observe a positive effect from the partitioning-based merged adjoint network method
- Create more objective functions for subcircuits to reduce the quality loss due to the merged sensitivity.
- In the extreme case, where every subcircuit has one node, we go back to the individual sensitivity case (un-converged optimization problem).
- Experimental results show that increasing the number of partitions can improve the decap optimization quality.
21Improved Conjugate Gradient Method
- Direct conjugate gradient method requires a number of line searches to compute the best search step at each optimization step
- Key Idea
- Step size is determined by computing the maximum decap value allowed on one or some nodes under the current search directions (sensitivity)
- Binary search is used to minimize the possibly overestimated decap values
- Timing Analysis
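The key idea above can be sketched as a two-stage step selection: first take the largest step that keeps every decap within its bounds, then shrink it by binary search to the smallest step that still clears the violation. This is a simplified illustration under stated assumptions, not the authors' implementation: `violation` is a hypothetical callable returning the total violation area for a decap vector, it is assumed to be non-increasing along the search direction `d`, and the lower bound on each decap is taken to be zero.

```python
def max_feasible_step(w, d, w_max):
    """Largest step s such that 0 <= w[i] + s * d[i] <= w_max[i] for every node."""
    steps = []
    for wi, di, wmi in zip(w, d, w_max):
        if di > 0.0:
            steps.append((wmi - wi) / di)    # upper bound reached first
        elif di < 0.0:
            steps.append(wi / -di)           # lower bound (zero decap) reached first
    return min(steps) if steps else 0.0


def smallest_clearing_step(w, d, w_max, violation, eps=0.0, tol=1e-6):
    """Binary search for the smallest step along d that brings violation down to eps."""
    def trial(s):
        return [wi + s * di for wi, di in zip(w, d)]

    hi = max_feasible_step(w, d, w_max)
    if violation(trial(hi)) > eps:
        return hi                 # even the largest allowed step leaves a violation
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if violation(trial(mid)) > eps:
            lo = mid              # not enough decap yet
        else:
            hi = mid              # violation cleared; try a smaller step
    return hi


# Toy stand-in for a circuit simulation: violation vanishes once total decap reaches 1.
toy_violation = lambda w: max(1.0 - sum(w), 0.0)
step = smallest_clearing_step([0.0, 0.0], [1.0, 1.0], [2.0, 2.0], toy_violation)
```

Each probe of `violation` corresponds to one circuit simulation, so the number of simulations per optimization step is bounded by the bisection depth instead of depending on how well a generic line search happens to bracket the minimum.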
22Optimization Algorithm Flow Chart
23Partitioning-Based Decap Budgeting
- For very large circuits, both CG- and iCG-based approaches will be slow
- Circuit simulation time, which is the inner loop of the optimization, will be significant.
- The localized effect of adding decap also favors the partitioning-based strategy
- Basic idea: partition the large circuit into small ones and optimize each of them individually.
- Rationale for such localized optimization: adding decap has local effects on voltage drops in a P/G circuit.
- Simulate the rest of the circuit by keeping the boundary node voltage waveform recorded from full circuit simulation
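The role of the recorded boundary waveform can be illustrated on a two-node toy grid. The sketch below assumes backward Euler time stepping and a voltage-drop formulation; node 0 is the partition being optimized and node 1 is its boundary node. All names and values are hypothetical.

```python
def solve2(A, b):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]


def simulate_full(c0, c1, g01, gs, i0, h):
    """Full grid: node 0 draws current i0, node 1 connects to the supply through gs."""
    A = [[g01 + c0 / h, -g01], [-g01, g01 + gs + c1 / h]]
    v0, v1 = [0.0], [0.0]
    for i_k in i0:
        x = solve2(A, [c0 / h * v0[-1] + i_k, c1 / h * v1[-1]])
        v0.append(x[0])
        v1.append(x[1])
    return v0, v1


def simulate_partition(c0, g01, i0, h, v1_boundary):
    """Node 0 alone; the rest of the circuit is replaced by the recorded node-1 waveform."""
    v0 = [0.0]
    for k, i_k in enumerate(i0):
        rhs = c0 / h * v0[-1] + i_k + g01 * v1_boundary[k + 1]
        v0.append(rhs / (g01 + c0 / h))
    return v0


h = 0.01
i0 = [1.0 if 10 <= k < 30 else 0.0 for k in range(60)]
v0_full, v1_full = simulate_full(0.2, 0.2, 1.0, 2.0, i0, h)
v0_part = simulate_partition(0.2, 1.0, i0, h, v1_full)
```

With unchanged decaps the partition reproduces the full-circuit waveform at node 0. Once the decap inside the partition changes, the frozen boundary waveform is only an approximation, which is why the combined result is re-simulated on the full circuit afterwards.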
24Partitioning Algorithm
- General partitioning algorithm using a graph-based multilevel minimum cut algorithm (METIS)
- Extremely fast speed ensures the partition phase won't bottleneck the entire optimization flow
- Boundary condition for each partition must be preserved to avoid overestimation of decap budget
- Boundary node waveform in PWL voltage form serves as the boundary condition
G. Karypis, R. Aggarwal, V. Kumar, and S. Shekhar, "Multilevel hypergraph partitioning: applications in VLSI domain," IEEE Trans. on Very Large Scale Integration (VLSI) Systems, vol. 7, no. 1, pp. 69-79, March 1999.
25Noise-Aware Partitioning (NAP)
- Violation nodes cannot be at the boundary
- Violation cannot be eliminated due to the unchanged PWL waveform
- Noise-aware partitioning must be introduced
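The requirement that violation nodes stay off partition boundaries can be checked, and repaired with a simple heuristic, as sketched below. The repair rule (pull the cross-partition neighbors of an exposed violation node into its partition) is an assumption made for illustration and is not necessarily how the authors' noise-aware partitioning works; the function names are hypothetical.

```python
def boundary_nodes(adj, part):
    """Nodes with at least one neighbor assigned to a different partition."""
    return {u for u, nbrs in adj.items() if any(part[n] != part[u] for n in nbrs)}


def pull_violations_inside(adj, part, violation_nodes, max_rounds=100):
    """Reassign neighbors until no violation node sits on a partition boundary."""
    part = dict(part)
    for _ in range(max_rounds):
        on_boundary = boundary_nodes(adj, part)
        exposed = [u for u in violation_nodes if u in on_boundary]
        if not exposed:
            break
        for u in exposed:
            for n in adj[u]:
                part[n] = part[u]     # move every neighbor into u's partition
    return part


# Hypothetical 6-node chain split in the middle; node 2 is a violation node.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
new_part = pull_violations_inside(adj, part, violation_nodes=[2])
```

In this example the cut moves from between nodes 2 and 3 to between nodes 3 and 4, so the violation node becomes interior and its voltage is free to change during the partition-level optimization.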
26Partitioning-Based Decap Budget Flow
- Call iCG solver to do the individual partition optimization
- Combine updated decap values and generate new netlist file
- Solve new circuit with updated decap values
- Violation criterion met?
- Y: finalize decap value
- N: decrease previous partition number
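The flow on this slide can be sketched as a small driver loop. The three callables (`partition`, `optimize_part`, `full_violation`) are hypothetical placeholders for the partitioner, the per-partition iCG solver and the full-circuit simulation, and the stopping rule at a single partition is an added assumption.

```python
def partitioned_budget(decaps, n_parts, partition, optimize_part, full_violation, eps=0.0):
    """Optimize each partition, recombine, re-check on the full circuit, coarsen if needed."""
    while True:
        updated = list(decaps)
        for nodes in partition(n_parts):                  # each partition is a list of node ids
            for node, value in optimize_part(nodes, decaps).items():
                updated[node] = value                     # combine updated decap values
        decaps = updated
        # Solve the full circuit with the combined decaps and test the violation criterion.
        if full_violation(decaps) <= eps or n_parts == 1:
            return decaps, n_parts                        # finalize decap values
        n_parts -= 1                                      # decrease the partition number


# Toy stand-ins: 4 nodes; single-node partitions under-budget, larger ones do not.
toy_partition = lambda n: [list(range(4))[i::n] for i in range(n)]
toy_optimize = lambda nodes, decaps: {u: (1.0 if len(nodes) >= 2 else 0.5) for u in nodes}
toy_violation = lambda decaps: max(4.0 - sum(decaps), 0.0)

final_decaps, final_parts = partitioned_budget([0.0] * 4, 4, toy_partition,
                                               toy_optimize, toy_violation)
```

In the toy run the loop coarsens from four partitions to two before the full-circuit check passes.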
27Comparison Between CG and iCG
- CG algorithm was modified by explicitly trying to bracket the minimum before each line search, for a fair and reasonable comparison to iCG
- Bracketing the minimum avoids useless line searches and improves the efficiency of the CG algorithm
28Comparison Between Flat and Partitioning-Based
Decap Budgeting
- At least 10X speed-up is achieved with comparable
decap budget value
29Observation on Partitioned Decap Budget
- Even better decap budget could be achieved from partitioning
- Partitioning makes some nodes harder to optimize
- Partitioned merged adjoint method can improve the quality of the decap optimization
30Conclusions and Future Work
- Applied the time-domain merged adjoint network method.
- Used improved conjugate gradient techniques for decap budgeting.
- The combination of the proposed partitioning scheme with the merged adjoint network method leads to both a faster optimization process and better results than those given by the flat merged adjoint network method.
- At least 10X speed-up can be achieved for large circuit sizes under the new algorithm.
- Parallel computing will be explored in the future to further improve the algorithm's efficiency.