Title: Weiping Shi
1 HiCap: A Fast Hierarchical Algorithm for 3D
Capacitance Extraction
- Weiping Shi
- Department of Computer Science
- University of North Texas
2 Outline
- Introduction
- Previous Research
- Integral Equation and N-Body Problem
- New Algorithm
- Experimental Results
- Conclusion
- Future Work
3 Introduction
- Capacitance Extraction: Given a set of conductors
in 3-D space, compute the capacitance between all
pairs of conductors.
[Figure: conductors with one held at 1 V and the induced charges; label: C = Q]
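One common way to make "capacitance between all pairs of conductors" precise is the capacitance matrix. The notation below ($Q_i$, $V_j$, $C_{ij}$, and the conductor count $m$) is introduced here for illustration and is not taken from the slides:

```latex
% Charge on conductor i as a linear function of all conductor potentials
Q_i = \sum_{j=1}^{m} C_{ij} V_j, \qquad i = 1, \dots, m .
% Holding conductor j at 1 V and all others at 0 V isolates column j of C:
C_{ij} = Q_i \,\Big|_{\,V_j = 1,\; V_k = 0 \ (k \neq j)} .
```

Under this reading, raising one conductor to 1 V and solving for the induced charges yields one column of the matrix, which is consistent with the 1 V setup pictured on this slide.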
4
- Signal delay = gate delay + interconnect delay.
- Interconnect delay is caused by RC (resistance
and capacitance) parasitics.
[Figure: interconnect modeled by a resistance R and capacitances C]
5
- Interconnect delay dominates gate delay in deep
sub-micron VLSI.
[Figure: delay (ps) versus technology generation (micron)]
6 Importance in VLSI
- Fast and accurate capacitance extraction is
crucial in the design and verification of VLSI
circuits and packaging.
- Current 3D tools are too slow.
- FastCap, Raphael, QuickCap, etc.
- 2D/2.5D/Quasi-3D tools use 3D engines to generate
their libraries. Accuracy depends on the 3D engines.
- Dracula, HyperExtract, Arcordia, FireIce,
Star-RC, Columbus, etc.
- For critical nets and clock trees, 3D accuracy is
necessary.
7 Importance in MEMS
- Accurate capacitance extraction of complex 3-D
structures is also important in the design of MEMS
(MicroElectroMechanical Systems).
- Design of most motion sensors needs an accurate
estimate of capacitance.
- Design of most drivers needs to solve a similar
potential problem.
- A recent ARPA report estimates the market for the
above applications at 1 to 3 billion dollars by
2004.
8 Enlarged comb driver
9 Previous Research
- Differential Maxwell Equation (Finite Difference
Method or Finite Element Method)
- Raphael Field Solver
- Integral Laplace Equation (Boundary Element
Method)
- Multipole algorithm FastCap by Nabors and White.
O(N) time. Kernel dependent.
- Pre-corrected FFT algorithm by Phillips and White.
O(N log N) time. Kernel independent.
- SVD algorithm IES3 by Kapur and Long. O(N log N)
time. Kernel independent.
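10 Integral Equation Approach
$$\psi(x) = \int_{\text{surfaces}} G(x, x')\,\sigma(x')\,da'$$
where $\psi(x)$ is the known surface potential,
$\sigma(x')$ is the charge density,
$da'$ is an incremental conductor surface area,
$x'$ is on $da'$, and
$G(x, x')$ is the kernel; in free space,
$G(x, x') = \frac{1}{4\pi\epsilon_0 \|x - x'\|}$.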
11
Partition conductor surfaces into N panels and
assume uniform charge density on each panel. Then
we have a linear system
$Pq = v$
where $P$ is an $N \times N$ matrix of potential
coefficients, $q$ is an N-vector of
panel charges, and $v$ is an N-vector of
known panel potentials.
12
Each entry $p_{ij}$ of the potential coefficient matrix $P$
represents the potential at panel $A_i$ due to a unit
charge on panel $A_j$.
The solution $q$ of the linear system $Pq = v$ gives the
capacitance.
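To make the panel formulation concrete, here is a minimal dense sketch in Python with NumPy for a single isolated square plate in free space. It is not the HiCap method. It assumes a uniform grid of panels, a point-charge approximation for the off-diagonal coefficients, and the closed-form potential at the center of a uniformly charged square for the diagonal. The function name `square_plate_capacitance` and the constant `EPS0` are illustrative choices, not names from the slides.

```python
import numpy as np

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def square_plate_capacitance(n, side=1.0):
    """Capacitance (F) of an isolated square plate split into n*n equal panels."""
    h = side / n                                   # panel edge length
    ticks = (np.arange(n) + 0.5) * h               # panel-center coordinates
    cx, cy = np.meshgrid(ticks, ticks)
    centers = np.column_stack([cx.ravel(), cy.ravel()])

    # Off-diagonal p_ij (assumption): the unit charge on panel j is treated
    # as a point charge located at the panel center.
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dist, 1.0)                    # placeholder, overwritten below
    P = 1.0 / (4.0 * np.pi * EPS0 * dist)

    # Diagonal p_ii: potential at the center of a uniformly charged square
    # carrying unit total charge, ln(1 + sqrt(2)) / (pi * eps0 * h).
    np.fill_diagonal(P, np.log(1.0 + np.sqrt(2.0)) / (np.pi * EPS0 * h))

    v = np.ones(n * n)                             # plate held at 1 V
    q = np.linalg.solve(P, v)                      # dense solve of P q = v
    return q.sum()                                 # C = Q / V with V = 1


if __name__ == "__main__":
    for n in (1, 2, 6):
        c = square_plate_capacitance(n)
        print(n * n, "panels:", c * 1e12, "pF")
```

The dense solve stores all N x N coefficients and costs O(N^3) time, which is the cost a hierarchical representation is meant to avoid.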
13 Challenge
- Partition the conductor surfaces into N panels,
- Calculate and store the dense $N \times N$ matrix $P$, and
- Solve the linear system $Pq = v$
in O(N) time?
14 N-body Problem
- N-body Problem: Given N particles in 3D space,
compute all forces between the particles.
- Hierarchical Algorithm (Appel 85)
- O(N) time (Esselink)
- Radiosity (Hanrahan, Salzman and Aupperle)
- Multipole Algorithm (Greengard and Rokhlin 87)
- O(N) time
- FastCap
15 Appel's Key Ideas
- For practical purposes, forces acting on a
particle need only be calculated to within the
given precision.
- The force due to a cluster of particles at some
distance can be approximated with a single term.
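The second idea can be checked numerically. The sketch below uses the electrostatic potential as a stand-in for force and replaces a cluster of positive charges by its total charge placed at the charge-weighted center. The cluster shape, the charge range, and the function names are assumptions made for illustration, and units are chosen so that 4*pi*eps0 = 1.

```python
import numpy as np


def exact_potential(point, positions, charges):
    """Sum every particle's contribution q_i / |point - x_i|."""
    r = np.linalg.norm(positions - point, axis=1)
    return float(np.sum(charges / r))


def single_term_potential(point, positions, charges):
    """Replace the whole cluster by its total charge at the charge-weighted center."""
    total = charges.sum()
    center = (charges[:, None] * positions).sum(axis=0) / total
    return float(total / np.linalg.norm(point - center))


def relative_error(distance, n_particles=200, seed=0):
    """Relative error of the single-term estimate at a point `distance` away."""
    rng = np.random.default_rng(seed)
    positions = rng.uniform(-0.5, 0.5, size=(n_particles, 3))  # cluster in a unit cube
    charges = rng.uniform(0.5, 1.5, size=n_particles)          # positive charges only
    point = np.array([distance, 0.0, 0.0])
    exact = exact_potential(point, positions, charges)
    approx = single_term_potential(point, positions, charges)
    return abs(approx - exact) / abs(exact)


if __name__ == "__main__":
    for d in (2.0, 20.0, 200.0):
        print("distance", d, "relative error", relative_error(d))
```

Because the dipole term vanishes about the charge-weighted center, the error of the single term falls off roughly with the square of (cluster size / distance), so a given precision can be met by one term once the cluster is far enough away.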
16 Outline of New Algorithm
- Adaptively partition conductor surfaces into
small panels according to a user-supplied error
bound $P_\epsilon$.
- Approximate the potential coefficient matrix $P$ and
store it in a hierarchical data structure of size
O(N).
- The data structure permits an O(N)-time
matrix-vector product $Px$ for any N-vector $x$.
- Solve the linear system $Pq = v$ using iterative
methods.
17 Adaptive Panel Partition
- If the potential coefficient estimate between two
panels is greater than $P_\epsilon$, then partition the
panels. Otherwise, record the coefficient.
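The refinement rule can be sketched as a short recursion. The Python below is a simplified illustration, not the HiCap implementation. Panels are 1-D intervals standing in for surface patches, the estimate (larger panel size divided by center distance) is a hypothetical stand-in for the real coefficient estimate, `eps` plays the role of the user-supplied error bound, and `min_size` is an added cutoff so that the recursion on a panel against itself terminates.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(eq=False)
class Panel:
    """A 1-D interval standing in for a rectangular surface panel."""
    lo: float
    hi: float
    children: List["Panel"] = field(default_factory=list)

    @property
    def size(self) -> float:
        return self.hi - self.lo

    @property
    def center(self) -> float:
        return 0.5 * (self.lo + self.hi)

    def split(self) -> List["Panel"]:
        """Halve the panel once; later calls reuse the same children."""
        if not self.children:
            self.children = [Panel(self.lo, self.center), Panel(self.center, self.hi)]
        return self.children


def estimate(a: Panel, b: Panel) -> float:
    """Hypothetical coefficient estimate: larger panel size over center distance."""
    d = abs(a.center - b.center)
    return float("inf") if d == 0.0 else max(a.size, b.size) / d


def refine(a: Panel, b: Panel, eps: float, min_size: float,
           links: List[Tuple[Panel, Panel]]) -> None:
    """Record a link if the estimate is within eps, else split the larger panel."""
    if estimate(a, b) <= eps or max(a.size, b.size) <= min_size:
        links.append((a, b))          # the coefficient would be computed and stored here
        return
    if a.size >= b.size:
        for child in a.split():
            refine(child, b, eps, min_size, links)
    else:
        for child in b.split():
            refine(a, child, eps, min_size, links)


def leaves(p: Panel) -> List[Panel]:
    """Leaf panels below p, in left-to-right order."""
    if not p.children:
        return [p]
    return [leaf for c in p.children for leaf in leaves(c)]


if __name__ == "__main__":
    for min_size in (1 / 16, 1 / 64, 1 / 256):
        root, links = Panel(0.0, 1.0), []
        refine(root, root, eps=0.5, min_size=min_size, links=links)
        n = len(leaves(root))
        print("N =", n, "links =", len(links), "N*N =", n * n)
```

Each recorded link covers every pair of leaf panels below its two endpoints, so the links together tile the full matrix while well-separated regions are represented by a single coarse entry.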
18 Coefficient Matrix Representation
- Entries of $P$ are stored in a hierarchical
data structure as links.
[Figure: hierarchy of panels labeled A through N]
19
[Figure: matrix with block entries; rows and columns labeled by panels A, B, C, D, E, H, I, J, K, L]
20
It can be shown that the matrix contains O(N) block
entries, where N is the number of panels. If
expanded explicitly, the matrix would contain $N \times N$
entries. If panel sizes were uniform, the matrix
would be much larger than $N \times N$.
21 Matrix-Vector Product Px
- Compute charge for all panels in O(N) time.
[Figure: hierarchy of panels labeled A through N]
22
- Compute potential for all panels in O(N) time.
[Figure: hierarchy of panels labeled A through N]
23
- Distribute potential to leaf panels in O(N) time.
[Figure: hierarchy of panels labeled A through N]
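The three passes (compute charge, compute potential, distribute potential) can be sketched on a tiny hand-built hierarchy. This Python example is an illustration under assumptions, not the HiCap data structure. The `Node` class, the four-leaf tree, and the coefficient values are invented for the example, and each link is stored on the receiving node as a (source, coefficient) pair.

```python
class Node:
    """A panel in the hierarchy; leaves carry the entries of the input vector."""

    def __init__(self, children=()):
        self.children = list(children)
        self.links = []        # (source_node, coefficient): one block entry of P each
        self.charge = 0.0
        self.potential = 0.0


def gather_charge(node):
    """Pass 1 (bottom-up): a parent's charge is the sum of its children's charges."""
    if node.children:
        node.charge = sum(gather_charge(c) for c in node.children)
    return node.charge


def apply_links(node):
    """Pass 2: every node gets the potential contributed by its own links only."""
    node.potential = sum(coef * src.charge for src, coef in node.links)
    for c in node.children:
        apply_links(c)


def distribute(node, inherited=0.0):
    """Pass 3 (top-down): add the ancestors' potential and hand the sum to children."""
    node.potential += inherited
    for c in node.children:
        distribute(c, node.potential)


def matvec(root, leaf_nodes, x):
    """Hierarchical product P x; each node and each link is visited once."""
    for leaf, xi in zip(leaf_nodes, x):
        leaf.charge = xi
    gather_charge(root)
    apply_links(root)
    distribute(root)
    return [leaf.potential for leaf in leaf_nodes]


def build_example():
    """Four leaves under two parents; made-up coefficients for illustration."""
    l0, l1, l2, l3 = Node(), Node(), Node(), Node()
    left, right = Node([l0, l1]), Node([l2, l3])
    root = Node([left, right])
    for leaf in (l0, l1, l2, l3):
        leaf.links.append((leaf, 4.0))             # self terms
    l0.links.append((l1, 2.0)); l1.links.append((l0, 2.0))   # near neighbors
    l2.links.append((l3, 2.0)); l3.links.append((l2, 2.0))
    left.links.append((right, 0.5))                # one coarse link covers a 2x2 block
    right.links.append((left, 0.5))
    return root, [l0, l1, l2, l3]


if __name__ == "__main__":
    root, leaf_nodes = build_example()
    print(matvec(root, leaf_nodes, [1.0, 2.0, 3.0, 4.0]))
```

In this example the seven stored coefficients reproduce the product with the expanded 4 x 4 matrix in which each coarse link fills its whole block, and the work is proportional to the number of nodes plus the number of links.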
24 Solving Linear Systems
- Use iterative methods such as GMRES or MINRES.
- Each iteration requires a matrix-vector product
$Px$ and can be completed in O(N) time.
- The number of iterations needed is very small,
normally 10-20 regardless of N.
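A Krylov solver such as GMRES touches the matrix only through matrix-vector products, which is why a fast product is enough. The sketch below is a small hand-written, unrestarted GMRES in NumPy with a zero initial guess. It is meant as an illustration rather than the solver used in HiCap, and the dense test matrix with entries 1 / (1 + |i - j|) is a made-up stand-in for a potential coefficient matrix.

```python
import numpy as np


def gmres(matvec, b, tol=1e-8, max_iter=50):
    """Unrestarted GMRES with zero initial guess; returns (x, iterations used)."""
    b = np.asarray(b, dtype=float)
    n = b.size
    beta = np.linalg.norm(b)
    if beta == 0.0:
        return np.zeros(n), 0
    max_iter = min(max_iter, n)
    Q = np.zeros((n, max_iter + 1))         # orthonormal Krylov basis
    H = np.zeros((max_iter + 1, max_iter))  # upper Hessenberg matrix
    Q[:, 0] = b / beta
    x = np.zeros(n)
    for k in range(max_iter):
        w = matvec(Q[:, k])                 # the only place the matrix is used
        for j in range(k + 1):              # modified Gram-Schmidt
            H[j, k] = Q[:, j] @ w
            w = w - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(w)
        rhs = np.zeros(k + 2)
        rhs[0] = beta
        Hk = H[:k + 2, :k + 1]
        y = np.linalg.lstsq(Hk, rhs, rcond=None)[0]   # small least-squares problem
        x = Q[:, :k + 1] @ y
        residual = np.linalg.norm(Hk @ y - rhs)       # residual norm estimate
        if residual <= tol * beta or H[k + 1, k] == 0.0:
            return x, k + 1
        Q[:, k + 1] = w / H[k + 1, k]
    return x, max_iter


def demo_matrix(n):
    """Made-up dense stand-in for P with entries 1 / (1 + |i - j|)."""
    idx = np.arange(n)
    return 1.0 / (1.0 + np.abs(idx[:, None] - idx[None, :]))


if __name__ == "__main__":
    P = demo_matrix(200)
    v = np.ones(200)
    q, iters = gmres(lambda x: P @ x, v)
    print("iterations:", iters, "residual:", np.linalg.norm(P @ q - v))
```

Swapping the `lambda` for a hierarchical product leaves the solver unchanged, so the cost per iteration is set by the product alone.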
25 Error and Complexity
- Error of approximation can be controlled by the
user-supplied error bound $P_\epsilon$.
- Time complexity is O(N) because each of the above
steps is O(N).
26 Experimental Results
- Test examples: bus crossings 2x2, 3x3, ..., 6x6. In
commercial tools, thousands of these crossings
will be computed to build the library.
[Figure: 2x2 bus crossing]
27 Previous 3D Algorithms
- FastCap expansion order 2 (assume accurate).
- FastCap expansion order 0.
- Pre-corrected FFT. 40% faster than FastCap(2)
and uses 1/4 of the memory of FastCap(2).
- IES3. 60% faster than FastCap(2) and uses 1/5 of
the memory of FastCap(2).
28
HiCap is 40 - 100 times faster than FastCap(2) and 14 - 40
times faster than FastCap(0).
29
HiCap uses 1/60 - 1/100 of the memory of FastCap(2) and 1/80 -
1/280 of the memory of FastCap(0).
30
- Error with respect to FastCap(2):
less than 2.7% error, 3 times more accurate than FastCap(0).
31 Conclusion
- A new algorithm significantly faster than the
previous best algorithms. It provides the
possibility for 3D extraction of clock trees and
critical nets. It can also be used to generate
libraries for commercial 2D/2.5D tools.
- Kernel independent. Can be applied to
multi-layered dielectrics.
- Adaptive refinement scheme produces a good
partition of conductor surfaces.
- Hierarchical data structure is much more
efficient than previous data structures.
32 Future Research
- Capacitance Extraction
- High order basis function
- Bottom-up construction of hierarchy
- Full chip and critical net extraction
- Inductance Extraction
- FastHenry is too slow
- No commercial tool for mutual inductance.
- Variational Parasitic Extraction
- MEMS application