Title: Multilevel Optimization for Large-Scale Circuit Placement
1Multilevel Optimization for Large-Scale Circuit
Placement
- Tony F. Chan, Jason Cong, Tianming Kong, Joseph R. Shinnerl
- Mathematics Department and Computer Science Department, University of California, Los Angeles, CA 90095
- Research supported by Intel, SRC, NSF.
2The Circuit Placement Problem
- Given:
- N circuits, a.k.a. blocks, modules, or cells
- A rectangle (the chip) in which the circuits must be placed without overlapping
- Connectivity specs (netlist)
- Constraints, e.g., timing, heat dissipation, routability
- Problem: Find an arrangement of the circuits on the chip that minimizes total wirelength subject to all constraints above.
- Difficulty: Modeling all $O(N^2)$ constraints when $10^4 \le N \le 10^7$.
3Overview
- Challenges for Circuit Placement
- Large Design Sizes (over one million movable objects)
- Complex Design Constraints (delay, noise, manufacturability, etc.)
- Existing methodologies are inadequate.
- Simulated Annealing-based methods can handle complex design constraints, but their runtime and quality scale poorly.
- Quadratic Programming-based methods are very efficient, but they cannot handle complex constraints well.
- We have implemented a fast placement engine capable of handling complex constraints and producing good placements.
4Key Components of Our Method
- Nonlinear Programming Formulation
- Multilevel Decomposition
- recursive clustering and declustering
- continuous and discrete refinement
- interior-point solution at coarsest level
- truncated preconditioned conjugate gradients on the Newton equations
- Fast Multipole Method for nonoverlap constraints
- Legalization and Postprocessing (Domino)
5Hypergraph model
- A NET is a subset of (interconnected) CELLS.
- A hyperedge is a subset of vertices (it's like a clique).
[Figure: hypergraph on Cell 1, Cell 2, Cell 3, Cell 4, Cell 5]
- Problem: arrange the cells to minimize total wirelength.
6Clique model
- Replace hyperedges by cliques; collapse edges with common endpoints.
[Figure: clique graph on Cell 1, Cell 2, Cell 3, Cell 4, Cell 5, with edge weights 1/3, 1/3, 1/3, and 1]
- Problem: arrange the cells to minimize total wirelength.
7Clique Model
- Replace each hyperedge $i$ by a clique. Weight each edge in the clique by $b_i = 2/(n_i(n_i-1))$, where $n_i$ is the clique size.
- Collapse edges with the same endpoints into single edges, summing weights $b_i$.
- The length of each edge is then weighted by this sum.
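The three steps above can be sketched in a few lines. This is an illustrative sketch only: the function name `clique_weights` and the two-net example netlist are made up for the illustration.

```python
from collections import defaultdict
from itertools import combinations

def clique_weights(nets):
    """Expand each net (hyperedge) into a clique and sum the weights of
    edges that share the same endpoints."""
    weights = defaultdict(float)
    for net in nets:
        cells = sorted(set(net))
        n = len(cells)
        if n < 2:
            continue  # a single-cell net contributes no edges
        b = 2.0 / (n * (n - 1))  # per-edge weight for a clique of size n
        for u, v in combinations(cells, 2):
            weights[(u, v)] += b  # collapse parallel edges by summing
    return dict(weights)

# Hypothetical netlist: one 3-cell net and one 2-cell net sharing edge (2, 3).
nets = [[1, 2, 3], [2, 3]]
edge_w = clique_weights(nets)
```

Here edge (2, 3) receives 1/3 from the 3-cell net plus 1 from the 2-cell net; the other two edges keep weight 1/3.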
9Nonoverlap Constraints
10Nonlinear Programming Formulation
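- $\min f(x)$
- subject to $c(x) \ge 0$ (NP)
- where $x \in \mathbb{R}^n$.
- $f: \mathbb{R}^n \to \mathbb{R}$: objective function ($n \approx 2N$ or $3N$)
- $c: \mathbb{R}^n \to \mathbb{R}^m$: constraint functions ($m \approx N(N-1)/2 + N$)
- $F = \{x \in \mathbb{R}^n : c(x) \ge 0\}$: feasible region
- $x^*$: local solution to NP (KKT conditions)
- Assumption: $f$ and $c$ are smooth
- Difficulty: active set $A = \{i : c_i(x^*) = 0\}$ is unknown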
11Multilevel Framework
[Figure: V-cycle diagram]
- Downward leg: Initial Fine-Grain Problem, then repeated MESC Clustering through the Intermediate Levels.
- Bottom: Coarse-Grain Problem. Solve Nonlinear Programming Problem with Interior-Point Method.
- Upward leg: repeated Declustering through the Intermediate Levels. Coarser Intermediate Levels: Continuous and Discrete Refinement. Finer Intermediate Levels: Discrete Refinement Only.
- Top: Final Fine-Grain Problem. Slot Assignment, Discrete Refinement, DOMINO.
12MESC--Multilevel Edge Separability Clustering
- Edge separability $q$ of edge $(s,t)$ approximates the $s$-$t$ min-cut of $G$.
- Using CAPFOREST (SIDMA 1992), we can approximate $q$ for all edges in $N \log N$ time.
- MESC ranks edge $e = (s,t)$ as $r(e) = q(e)/\min(\mathrm{area}(s), \mathrm{area}(t))$ in order to balance cluster sizes.
- Cell locations are not used.
13Interpolation ($P$) and Restriction ($P^T$)
- Suppose cells $x_1^h$ and $x_2^h$ are clustered together to form cluster $x_1^H$. Declustering is trivial: $x_1^h = x_2^h = x_1^H$.
- Interpolation is simply $x^h = P x^H$, where $P_{ij} = 1$ iff $x_i^h$ belongs to cluster $x_j^H$ and $0$ otherwise.
- $f(x^h) = \tfrac12 (x^h)^T Q x^h + b^T x^h = \tfrac12 (x^H)^T P^T Q P x^H + b^T P x^H = f^H(x^H)$
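The identity above can be checked numerically on a toy problem. The matrix `Q`, vector `b`, and cluster map below are made-up values; the helpers are plain-Python stand-ins for a linear algebra library.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def quad(Q, b, x):
    """f(x) = (1/2) x^T Q x + b^T x"""
    return 0.5 * dot(x, matvec(Q, x)) + dot(b, x)

def interpolation_matrix(cluster_of, n_clusters):
    """P[i][j] = 1 iff fine cell i belongs to cluster j, else 0."""
    return [[1.0 if cluster_of[i] == j else 0.0 for j in range(n_clusters)]
            for i in range(len(cluster_of))]

# Hypothetical 3-cell problem: cells 0 and 1 form cluster 0, cell 2 is cluster 1.
Q = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
b = [-1.0, 0.0, -3.0]
P = interpolation_matrix([0, 0, 1], 2)
Pt = transpose(P)
QH = matmul(matmul(Pt, Q), P)  # coarse matrix P^T Q P
bH = matvec(Pt, b)             # coarse vector P^T b
xH = [0.5, 2.0]
xh = matvec(P, xH)             # interpolation: x^h = P x^H
fine_value = quad(Q, b, xh)
coarse_value = quad(QH, bH, xH)
```

Both evaluations give the same number, so the coarse objective is exactly the fine objective restricted to placements in which clustered cells coincide.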
14Declustering and Slot Assignment
- Linear interpolation places cluster components concentrically at the cluster center.
- Linear assignment is used to distribute these concentric components to nearby locations.
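A minimal sketch of the linear assignment step, assuming a cost matrix is already available. The slides do not say how the cost is defined; here, as an illustration only, each component gets a hypothetical target position and the cost is the squared distance from slot to target. Brute-force enumeration is used for brevity and is only practical for a handful of components.

```python
from itertools import permutations

def linear_assignment(cost):
    """cost[i][j] = cost of putting component i in slot j.
    Returns the slot index chosen for each component (exhaustive search)."""
    n = len(cost)
    n_slots = len(cost[0])
    best = min(permutations(range(n_slots), n),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(n)))
    return list(best)

# Three components declustered at the same center, three nearby slots.
slots = [(4, 5), (5, 5), (6, 5)]
targets = [(9, 5), (1, 5), (5, 5)]  # hypothetical preferred positions
cost = [[(sx - tx) ** 2 + (sy - ty) ** 2 for (sx, sy) in slots]
        for (tx, ty) in targets]
assignment = linear_assignment(cost)
```

Component 0 moves to the rightmost slot, component 1 to the leftmost, and component 2 stays at the center.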
15Interior-Point Methods for Nonlinear Programming
- Construct $x_k$ such that $c(x_k) > 0$ for all $k$ and $x_k \to x^*$.
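A one-dimensional illustration of strictly feasible iterates converging to a constrained solution. The toy problem (minimize $x$ subject to $x - 1 \ge 0$), the logarithmic barrier $B(x) = x - \mu \log(x - 1)$, and the schedule for $\mu$ are assumptions chosen for the sketch; they are not the placement formulation.

```python
def barrier_minimizer(mu, x, iters=50):
    """Newton's method on B(x) = x - mu*log(x - 1), keeping x - 1 > 0."""
    for _ in range(iters):
        c = x - 1.0                 # constraint value, must stay positive
        grad = 1.0 - mu / c         # B'(x)
        hess = mu / (c * c)         # B''(x) > 0
        step = -grad / hess
        # Halve the step until the new iterate is strictly feasible.
        while x + step - 1.0 <= 0.0:
            step *= 0.5
        x += step
    return x

mus = [1.0, 0.1, 0.01, 0.001]       # decreasing barrier parameter
x = 3.0                             # strictly feasible starting point
path = []
for mu in mus:
    x = barrier_minimizer(mu, x)    # warm start from the previous solution
    path.append(x)
```

For this toy problem the barrier minimizer is $x = 1 + \mu$, so the iterates approach the solution $x^* = 1$ from the interior of the feasible region.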
16The (Dynamically Weighted) Slack Variable
Formulation
- Initially, cells may overlap arbitrarily. As
? is reduced, however, the amount of overlap
allowed is also reduced. Once ? decreases below
zero, all constraints are satisfied.
17Example Test Run on Circuit sioo
- Initial Configuration (all cells along upper boundary)
- Next page: subsequent major iteration configurations, from upper left to lower right. Initially, cells congregate in clusters. Then they "rush" to the boundary of the chip. Gradually, they move away from each other and from the boundary.
18Test Run on Circuit sioo (contd)
19Linesearch Algorithms
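- Given initial iterate $x = x_0$
- repeat
- Calculate search direction $p$
- Calculate steplength $\alpha$
- Update $x \leftarrow x + \alpha p$
- until done
- Notation
- $g \equiv \nabla f: \mathbb{R}^n \to \mathbb{R}^n$: Gradient of $f$
- $H_f: \mathbb{R}^n \to \mathbb{R}^{n \times n}$: Hessian of $f$
- $J \equiv Dc: \mathbb{R}^n \to \mathbb{R}^{m \times n}$: Jacobian of $c$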
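The generic loop can be sketched as follows. The choices made here, steepest descent for the search direction and Armijo backtracking for the steplength, are assumptions for the illustration; the talk's method uses Newton directions, which are not reproduced. The test function is made up.

```python
def linesearch_minimize(f, grad, x, tol=1e-8, max_iter=1000):
    """Generic linesearch loop: direction, steplength, update."""
    for _ in range(max_iter):
        g = grad(x)
        gnorm2 = sum(gi * gi for gi in g)
        if gnorm2 ** 0.5 < tol:          # "until done"
            break
        p = [-gi for gi in g]            # search direction (steepest descent)
        alpha, fx = 1.0, f(x)
        # Backtracking: shrink alpha until sufficient decrease (Armijo) holds.
        while alpha > 1e-12 and \
                f([xi + alpha * pi for xi, pi in zip(x, p)]) > fx - 1e-4 * alpha * gnorm2:
            alpha *= 0.5
        x = [xi + alpha * pi for xi, pi in zip(x, p)]   # x <- x + alpha p
    return x

# Hypothetical smooth test function with minimizer (1, -2).
f = lambda x: (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2
grad = lambda x: [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]
x_min = linesearch_minimize(f, grad, [0.0, 0.0])
```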
20Solving the Barrier subproblem
- Optimality conditions
- Newton-Barrier equations
- where
-
- Problem
- Solution
-
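For orientation, a standard logarithmic-barrier form written in the notation of the Linesearch Algorithms slide. This is a textbook sketch under the assumption of a plain log barrier with parameter $\mu$; the slack-variable formulation used in the talk may differ in detail. $C(x)$ denotes $\mathrm{diag}(c(x))$ and $e$ the vector of ones.

```latex
% Barrier function for: min f(x) subject to c(x) >= 0
B_\mu(x) = f(x) - \mu \sum_{i=1}^{m} \log c_i(x)

% Optimality condition of the barrier subproblem
\nabla B_\mu(x) = g(x) - \mu\, J(x)^T C(x)^{-1} e = 0

% Newton-barrier equations for the search direction p
H_\mu(x)\, p = -\nabla B_\mu(x),
\qquad
H_\mu(x) = H_f(x) - \mu \sum_{i=1}^{m} \frac{1}{c_i(x)} \nabla^2 c_i(x)
           + \mu\, J(x)^T C(x)^{-2} J(x)
```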
21The Fast Multipole Method (FMM)
- Designed to accelerate evaluation of decaying potential fields $\phi(r) = \sum_i \sigma_i / r_i$ used in computational particle simulations in astronomy, plasma physics, fluid dynamics, chemistry, etc.
- Recently applied successfully to large-scale computational problems in linear algebra and VLSI
- Can be used to evaluate the $n^2/2$ non-overlap constraints in $O(n)$ time
22Keys to FMM
- The contribution to the barrier function associated with the $i$th constraint is a sum
- To evaluate $m$ sums of this form in $O(pm) + O(pn)$ time, FMM uses
- an efficient strategy for clustering particles according to adaptive, hierarchical partitions (the more distant the cluster, the larger it can be),
- exact formulas for merging clusters and recentering multipole expansions.
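A sketch of the far-field idea behind the clustering bullet, not the FMM itself: a distant cluster is replaced by a single lumped source (the lowest-order term of a multipole expansion). There is no hierarchy and no higher-order expansion here; the $1/r$ kernel, the source strengths, and the positions are made-up values.

```python
import math

def direct_potential(target, sources):
    """Exact sum of q_i / r_i over all sources."""
    return sum(q / math.dist(target, p) for p, q in sources)

def monopole_potential(target, sources):
    """Lowest-order far-field approximation: lump the cluster's total
    strength at its strength-weighted center."""
    total = sum(q for _, q in sources)
    cx = sum(q * p[0] for p, q in sources) / total
    cy = sum(q * p[1] for p, q in sources) / total
    return total / math.dist(target, (cx, cy))

# Hypothetical cluster near the origin and a target far away.
cluster = [((0.5, 0.2), 1.0), ((-0.3, 0.4), 2.0), ((0.1, -0.6), 1.5)]
target = (100.0, 0.0)
exact = direct_potential(target, cluster)
approx = monopole_potential(target, cluster)
rel_err = abs(exact - approx) / abs(exact)
```

The relative error shrinks as the target moves farther from the cluster, which is why more distant clusters can be larger.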
23Fast Evaluation of $\nabla B_\mu(x)$ and $H_\mu(x)v$
- FMM is also applicable to potentials obtained from derivatives of the logarithm.
- Each component of $\nabla B_\mu(x)$ has the form
- Products $H_\mu(x)v$ are approximated by
24Discrete Refinement
- Combination with continuous optimization dramatically improves placement quality.
- Localized, i.e., restricted to small subsets of cells.
- Current version is based on Goto 1981: $\epsilon$-neighborhoods and (randomized) $\lambda$-exchange
- The $\epsilon$-neighborhood of location $(i,j)$ is all locations $(m,n)$ such that $|i-m| + |j-n| < \epsilon$.
- $\lambda$ denotes the total number of $\epsilon$-neighborhoods we consider at each iteration.
25Discrete Refinement Example: $\epsilon = 1$, $\lambda = 3$
- Find A's optimal location assuming all cells except A are held fixed (relaxation). Suppose A's optimal location is currently occupied by B.
- Pick an $\epsilon$-neighbor of B at random, say D.
- Find D's optimal location by relaxation, occupied by G, say.
- Pick an $\epsilon$-neighbor of G at random, say K.
- Select the optimal permutation of A, D, and K.
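The final step, choosing the best permutation of three cells over three locations, can be sketched by exhaustive search. The cost function below (squared distance to fixed neighbor cells, ignoring nets among the moved cells) and all coordinates are assumptions made for the illustration.

```python
from itertools import permutations

def best_permutation(cells, locations, cost):
    """Try every assignment of cells to locations; return the cheapest
    as a dict cell -> location."""
    best = min(permutations(locations),
               key=lambda perm: sum(cost(c, loc) for c, loc in zip(cells, perm)))
    return dict(zip(cells, best))

# Hypothetical fixed neighbors of each cell being permuted.
neighbors = {"A": [(0, 0)], "D": [(4, 0)], "K": [(2, 3)]}

def cost(cell, loc):
    """Squared distance from loc to the cell's fixed neighbors."""
    return sum((loc[0] - nx) ** 2 + (loc[1] - ny) ** 2
               for nx, ny in neighbors[cell])

locations = [(4, 1), (0, 1), (2, 2)]   # current locations of A, D, K
placement = best_permutation(["A", "D", "K"], locations, cost)
```

With three cells there are only six permutations, so exhaustive search is cheap at this step.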
26Test Results for mPL
- Sun UltraSparc-2/168MHz, 512MB memory
- 4 largest circuits from 1993 MCNC suite, all 18 circuits from ISPD98 suite.
- Even though we minimize quadratic wirelength, we still compute the half-perimeter result for comparison with Gordian-L/Domino.
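To make the two wirelength measures concrete, a small sketch computing both for one net. The cell positions are made up, and the quadratic measure is taken over clique edges with weight 1/3 for a 3-cell net; the exact scaling used in mPL is not given in the slides.

```python
def half_perimeter(net, pos):
    """Half-perimeter of the bounding box of the cells in one net."""
    xs = [pos[c][0] for c in net]
    ys = [pos[c][1] for c in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def quadratic_wirelength(edges, pos):
    """Sum of weight * squared Euclidean length over weighted 2-pin edges."""
    return sum(w * ((pos[u][0] - pos[v][0]) ** 2 + (pos[u][1] - pos[v][1]) ** 2)
               for (u, v), w in edges.items())

# Hypothetical placement of one 3-cell net.
pos = {1: (0.0, 0.0), 2: (3.0, 0.0), 3: (3.0, 4.0)}
net = [1, 2, 3]
edges = {(1, 2): 1.0 / 3.0, (1, 3): 1.0 / 3.0, (2, 3): 1.0 / 3.0}
hpwl = half_perimeter(net, pos)
qwl = quadratic_wirelength(edges, pos)
```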
27Gordian-L vs. mPL (ratios)
28Impact of Nonlinear Programming on Circuit Biomed
29Increasing the Impact of Nonlinear Programming
- Adopt a floorplanning formulation at the coarsest levels:
- correctly account for size differences of cell clusters at coarser levels
- allow rectangular cells of variable aspect ratio
- approximate rectangular cells by ellipsoids for near-field nonoverlap constraints
- hierarchical bin structure for far-field nonoverlap
- Smaller nonlinear programming problems can be solved to greater accuracy using off-the-shelf interior-point software.
30Comparison on Sun-Ultra60 360 MHz / 1024 MB with ITOOLS 4.0 (Timberwolf)
31ITOOLS Comparison RATIOS (Sun Ultra60 360MHz/1024MB, continued)
32Appendix 1 Basic Conjugate Gradients (CG)
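The body of this slide was not transcribed. As a reference point, a textbook conjugate gradient iteration for a symmetric positive definite system is sketched below; it is unpreconditioned and untruncated, so it is not the truncated preconditioned variant named under Key Components. The 2x2 system is a made-up example.

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive definite A (list of rows)."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                       # residual b - A x for x = 0
    p = r[:]                       # first search direction
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iter or 10 * n):
        if rs ** 0.5 < tol:
            break
        Ap = [sum(a * pj for a, pj in zip(row, p)) for row in A]
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        # New direction: residual plus a multiple of the old direction.
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# Hypothetical SPD system with exact solution (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_cg = conjugate_gradient(A, b)
```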
33Appendix 2 GORDIAN (ca.1990)
- 1) Minimize quadratic wirelength subject to a single center-of-mass constraint.
- 2) Partition the result into two blocks. Add 2 new center-of-mass constraints, one for each block.
- 3) Minimize the same quadratic wirelength function subject to the new set of constraints.
- 4) Repeat step 2 on each block and continue, repeatedly adding new constraints at each step.