Title: A Semi-Persistent Clustering Technique for VLSI Circuit Placement
1A Semi-Persistent Clustering Technique for VLSI
Circuit Placement
Charles J. Alpert (1), Andrew Kahng (2), Gi-Joon Nam (1),
Sherief Reda (2) and Paul G. Villarrubia (1)
(1) IBM Corp. (2) Department of CSE, UCSD
2bigblue4 design from ISPD2005 Suite
3Implications in Placement
- Scalability
- Tractability
- Runtime vs. quality trade-off
- SoC (System-on-Chip) designs
- Mixed-size objects
- White space
4Problem Statement
- What is the most effective and efficient
clustering strategy for analytic placement?
- Quality of solution
- CPU time
5Clustering Concept
6Clustering Literature
- Tremendous amounts of research here
- Edge-Coarsening (EC)
- First-Choice (FC)
- Edge-Separability (ESC)
- Peak-Clustering
- Etc
- General drawbacks
- Clique transformation
- Edge weight discrepancy
- Pass-based iteration
- Lack of global clustering view
7Best-Choice Clustering
- Avoid clique transformation
- Avoid pass-based iterations
- More global view of clustering sequence
- Priority-queue management
- Lazy-update speed-up technique
- Area-controlled balanced clustering
8Best-Choice Clustering
- Initialize the priority-queue PQ
- - For each cell u, calculate its clustering score c with its closest neighbor v
- - Insert the pair (u, v) into PQ based on their score c
- Until the target cell number is reached
- - Pick the top of the heap (m, n)
- - Cluster (m, n) into a new object mn and update the netlist
- - Calculate mn's closest neighbor k and insert (mn, k) into PQ
- - Recalculate the clustering score of all the neighbors of m and n
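The steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes a score of (shared net weight) / (combined size) with an n-pin net weighted 1/(n-1), string cell names, and a merged name formed by concatenation. Python's heapq has no decrease-key operation, so the "recalculate" step is emulated by pushing a fresh entry with a version stamp and skipping stale entries on pop. The function names and the small netlist are hypothetical.

```python
import heapq

def closest_neighbor(u, nets, size):
    """Return (score, v) for u's best partner, or None if u has no neighbor."""
    shared = {}
    for net in nets:
        if u in net and len(net) > 1:
            w = 1.0 / (len(net) - 1)  # n-pin net weight 1/(n-1)
            for v in net:
                if v != u:
                    shared[v] = shared.get(v, 0.0) + w
    if not shared:
        return None
    # Assumed score: shared net weight divided by the combined size.
    return max((w / (size[u] + size[v]), v) for v, w in shared.items())

def best_choice(nets, size, target):
    nets = [set(net) for net in nets]
    size = dict(size)
    version = dict.fromkeys(size, 0)
    heap, merges = [], []

    def push(u):
        # Recompute u's best pair and supersede any older heap entry for u.
        version[u] += 1
        best = closest_neighbor(u, nets, size)
        if best is not None:
            score, v = best
            heapq.heappush(heap, (-score, u, v, version[u]))

    for u in list(size):
        push(u)
    while len(size) > target and heap:
        neg_score, m, n, ver = heapq.heappop(heap)
        if m not in size or ver != version[m]:
            continue  # stale entry, superseded by a later push
        mn = m + n
        neighbors = set()
        for net in nets:  # update the netlist: replace m and n by mn
            if m in net or n in net:
                net.difference_update((m, n))
                neighbors |= net
                net.add(mn)
        size[mn] = size.pop(m) + size.pop(n)
        version[mn] = 0
        merges.append((m, n, -neg_score))
        for k in [mn, *sorted(neighbors)]:  # eager recalculation
            push(k)
    return merges, size

# Hypothetical 5-cell netlist, unit sizes, clustered down to 3 objects.
nets = [{"A", "B"}, {"B", "C", "D"}, {"D", "E"}]
merges, clusters = best_choice(nets, dict.fromkeys("ABCDE", 1), target=3)
```

On this toy input the sketch merges (A, B) and then (D, E). Every neighbor of a merged pair is rescored immediately, which is the cost the lazy-update technique targets.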
9Best-Choice Example
- Assume
- n-pin net weight = 1 / (n-1)
- Each object size = 1
- Timing criticality is 1 for all nets
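To make these assumptions concrete, the following sketch computes a pairwise clustering score with exact fractions. The score form (criticality-weighted shared net weight divided by combined size) is an assumption, and the netlist is hypothetical; it is not the netlist from the slide figures.

```python
from fractions import Fraction

def net_weight(num_pins):
    """Weight of an n-pin net under the assumption above: 1/(n-1)."""
    return Fraction(1, num_pins - 1)

def clustering_score(u, v, nets, size, criticality=1):
    """Assumed score: weight of nets shared by u and v over their combined size."""
    shared = sum(criticality * net_weight(len(net))
                 for net in nets if u in net and v in net)
    return shared / (size[u] + size[v])

# Hypothetical netlist: one 2-pin, one 3-pin and one 4-pin net.
nets = [("A", "B"), ("A", "B", "C"), ("C", "D", "E", "F")]
size = dict.fromkeys("ABCDEF", 1)  # each object size = 1

score_ab = clustering_score("A", "B", nets, size)  # (1 + 1/2) / 2
score_ef = clustering_score("E", "F", nets, size)  # (1/3) / 2
```

A and B share two nets, so their score is higher than that of E and F, which share only the 4-pin net of weight 1/3.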
10Best-Choice Example
11Best-Choice Example
12Best-Choice Example
[Figure: example netlist with objects A, B, C, D, E, F; labels EF 1/3 and ABCD 1/3]
clustering_score = 2.875
13Best-Choice Clustering Summary
- Globally optimal clustering sequence via
priority-queue data structure
- Produces better quality of results
- Clustering framework
- Arbitrary clustering score function can be
plugged in
14Best-Choice Clustering
- Clustering score distribution
- First-choice (FC): clustering_score = 5612.83
- Best-choice (BC): clustering_score = 6671.53
15Lazy Update Speed-up Technique
[Figure: priority queue PQ, showing the top of the PQ and node A]
- Observations
- Node A might be updated a number of times before
making it to the top of the PQ (if ever), but the
last update is what determines its final position
in PQ
- Statistics indicate that in 96% of our updating
steps, updating node A's score pushes A down in PQ
16Lazy Update Speed-up Technique
Main Idea: Wait until A gets to the top of the
priority-queue and then update its score if
necessary
- Until the target cell number is reached
- - Pick the top of the heap (m, n)
- - If (m, n) is invalid, then recalculate m's closest neighbor n and insert (m, n) into the heap
- - Else
- - - Cluster (m, n) into a new object mn and update the netlist
- - - Calculate mn's closest neighbor k and insert (mn, k) into the heap
- - - Mark all neighbors of m and n invalid
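The lazy-update loop can be sketched in Python as below. This is an illustration under assumptions, not the authors' code: the score is taken to be (shared net weight) / (combined size) with 1/(n-1) net weights, cell names are strings, and a merged name is the concatenation of its parts. The `recomputed` counter and the toy netlist are hypothetical additions used to show how few rescoring steps occur.

```python
import heapq

def closest_neighbor(u, nets, size):
    """Return (score, v) for u's best partner, or None if u has no neighbor."""
    shared = {}
    for net in nets:
        if u in net and len(net) > 1:
            w = 1.0 / (len(net) - 1)  # n-pin net weight 1/(n-1)
            for v in net:
                if v != u:
                    shared[v] = shared.get(v, 0.0) + w
    if not shared:
        return None
    # Assumed score: shared net weight divided by the combined size.
    return max((w / (size[u] + size[v]), v) for v, w in shared.items())

def best_choice_lazy(nets, size, target):
    nets = [set(net) for net in nets]
    size = dict(size)
    valid = dict.fromkeys(size, True)
    heap, merges, recomputed = [], [], 0

    def push(u):
        # Score u against its current closest neighbor and mark it valid.
        valid[u] = True
        best = closest_neighbor(u, nets, size)
        if best is not None:
            heapq.heappush(heap, (-best[0], u, best[1]))

    for u in list(size):
        push(u)
    while len(size) > target and heap:
        neg_score, m, n = heapq.heappop(heap)
        if m not in size:
            continue  # m was already absorbed into a cluster
        if not valid[m]:
            push(m)  # rescore only now that m has reached the top
            recomputed += 1
            continue
        mn = m + n
        for net in nets:  # update the netlist: replace m and n by mn
            if m in net or n in net:
                net.difference_update((m, n))
                for k in net:
                    valid[k] = False  # mark neighbors invalid, do not rescore
                net.add(mn)
        size[mn] = size.pop(m) + size.pop(n)
        merges.append((m, n, -neg_score))
        push(mn)
    return merges, size, recomputed

# Hypothetical 5-cell netlist, unit sizes, clustered down to 3 objects.
nets = [{"A", "B"}, {"B", "C", "D"}, {"D", "E"}]
merges, clusters, recomputed = best_choice_lazy(
    nets, dict.fromkeys("ABCDE", 1), target=3)
```

On this toy input only one invalid entry is rescored, because the other invalidated cells never reach the top of the heap before the target is met.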
17Lazy Update Runtime Characteristic
Note: Practically no impact on solution quality
18Experiments
- IBM CPLACE
- Analytic placement algorithm
- Semi-persistent clustering paradigm
- Up-front clustering
- Selective unclustering during main global
placement
- Full unclustering before detailed placement
- Order-of-magnitude reduction by clustering
- Industrial ASIC designs
- Size ranges from 56K to 880K placeable objects
19Placement Results w/ Clustering
- Average 4.3% WL improvement over EC
- BC is 8.76x slower than EC
20No Clustering vs. BCLazy Clustering
Design      WL(%)   CPU    CL-CPU
AL(270K)     2.09   0.40   1.17
BL(276K)    -4.28   0.52   1.35
CL(351K)     3.27   0.51   1.14
DL(426K)     0.87   0.45   1.35
EL(456K)     1.59   0.33   1.10
FL(880K)     1.41   0.46   1.68
AD(389K)     8.23   0.50   0.98
BD(285K)    -0.34   0.47   0.94
CD(56K)     -0.36   0.69   0.51
Avg.         1.39   0.48   1.14
21Conclusions
- Globally optimal clustering sequence framework
- Independent of clustering scoring function
- Better clustering sequence
- Allow significant placement speed-up
- Almost no loss of quality of solution
- Size control via clustering scoring function
- Effective for dense design
22Future Work
- Handling fixed blocks during clustering
- Ignoring nets connected to fixed objects
- Ignoring pins connected to fixed objects
- Including fixed blocks during clustering
- Etc.
- No visible improvement at the moment
23Cluster Size Control Results
Standard: $k = 1$
Automatic: $k = \lceil (\mathrm{size}(u) + \mathrm{size}(v)) / \mu \rceil$, where $\mu$ = expected avg. size
        Standard               Automatic
        Max     Avg     WL     Max    Avg     WL
AD      14823   171.4   0.00   1140   160.4   -0.88
BD      28600   150.0   0.00   1140   114.6    3.71
CD       9060   113.5   0.00    610   109.8   30.05
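The effect of the size-control exponent can be illustrated with a short sketch. It assumes the automatic rule is k = ceil((size(u) + size(v)) / mu), with mu the expected average cluster size, and that k is applied as an exponent on the combined size in the score denominator. The slide defines only k, so the score form here is an assumption, and the function names are hypothetical.

```python
import math

def exponent_k(size_u, size_v, mu=None):
    """Standard mode (mu is None): k = 1. Automatic mode: k = ceil((size_u + size_v) / mu)."""
    if mu is None:
        return 1
    return math.ceil((size_u + size_v) / mu)

def size_controlled_score(shared_weight, size_u, size_v, mu=None):
    """Assumed score form: shared net weight / (combined size) ** k."""
    k = exponent_k(size_u, size_v, mu)
    return shared_weight / (size_u + size_v) ** k

# Two objects whose combined size (15) exceeds the expected average size (10).
standard = size_controlled_score(1.0, 8, 7)          # k = 1
automatic = size_controlled_score(1.0, 8, 7, mu=10)  # k = 2
```

Under these assumptions, a pair whose combined size exceeds mu gets k >= 2 and is penalized much more heavily, which is consistent with the smaller maximum cluster sizes in the Automatic columns.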