Title: Task 1091.001: Highly Scalable Placement by Multilevel Optimization
1Task 1091.001 Highly Scalable Placement by Multilevel Optimization
- Task Leaders: Jason Cong (UCLA CS) and Tony Chan (UCLA Math)
- Students with Graduation Dates
- Michalis Romesis (UCLA CS, March 2005, graduated)
- Kenton Sze (UCLA Math, July 2006, graduated)
- Min Xie (UCLA CS, September 2006, graduated)
- Guojie Luo (UCLA CS, September 2010)
- Research Staff: Joe Shinnerl, UCLA CS
2Industrial Liaisons
- Patrick McGuinness, Freescale Semiconductor, Inc.
- Natesan Venkateswaran, IBM Corporation
- Amit Chowdhary, Intel Corporation
3Task Description and Anticipated Result
- Highly scalable multilevel, multiheuristic placement algorithms that address the critical placement needs of nanometer designs
- scalability
- multi-constraint optimization: timing, routability, power, manufacturability, etc.
- support of mixed-sized placement and incremental design
- Quantitative study of the optimality and scalability of placement algorithms
- Construction of synthetic benchmarks with known optima to identify the deficiencies of existing methods
- Our goal is to achieve one-process-generation benefit through innovation of physical-design technologies, especially placement.
4Task Deliverables
- Report on new placement benchmarks with known optimal or near optimal solutions for all major objectives and constraints. Scalability and optimization studies on existing placement techniques (Completed 3-Nov-2003)
- Experiments and reports on the applicability of integrated AMG-based weighted aggregation and weighted interpolation. Improvement measured on both PEKO examples and industrial examples from SRC member companies (Completed 1-Jun-2004)
- Experiments and reports on multiheuristic, multilevel relaxation and the scalable incorporation of complex constraints into the enhanced multilevel framework. Improvement measured on both PEKO and industrial examples (Completed 1-Jun-2005)
- A highly scalable placement tool that (i) supports multi-constraint optimization, mixed-sized placement, and incremental design and (ii) produces best-of-class results for both PEKO and industrial examples from SRC member companies (Completed 1-Jun-2006)
- Final report summarizing research accomplishments and future direction (Planned 31-Oct-2006)
5Accomplishments in the Past Year
- Improvements in mPL for routing density control
- Best quality, ISPD 2006 contest
- Thermal-Driven Placement
- Heterogeneous Placement
6A Brief History of mPL
(Figure: relative wirelength by year, 2000 to 2006, with regions labeled UNIFORM CELL SIZE and NON-UNIFORM CELL SIZE)
- mPL 1.0 (ICCAD00): ESC clustering; Goto relaxation
- mPL 1.1: FC clustering; partitioning added to legalization
- mPL 2.0: RDFL relaxation; primal-dual netlist pruning
- mPL 3.0 (ICCAD03): QRS relaxation; AMG interpolation; multiple V cycles
- mPL 4.0: improved DP; backtracking; V cycle
- mPL 5.0: multilevel force directed; mixed-size capability
- mPL 6.0: enhanced; routability handling
7mPL Generalized Force-Directed Placement
- Use of accurate objective functions (Bertsekas, 82; Naylor et al, 01)
- Optimization-based bin-density constraint formulation
- Iterative Uzawa solver
- Multilevel for better runtime and wirelength
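As an illustration of a smooth wirelength objective of the kind cited above, here is a minimal sketch of the log-sum-exp approximation of half-perimeter wirelength, which is commonly associated with Naylor et al. That this is the exact form used in mPL is an assumption; the function names and the smoothing parameter `gamma` are hypothetical.

```python
import math


def hpwl(net_x, net_y):
    """Exact half-perimeter wirelength of one net (non-smooth)."""
    return (max(net_x) - min(net_x)) + (max(net_y) - min(net_y))


def lse_wirelength(net_x, net_y, gamma=1.0):
    """Log-sum-exp approximation of one net's half-perimeter wirelength.

    gamma (assumed name) controls smoothness: smaller gamma tracks the
    exact value more closely but makes the gradient less smooth.
    """
    def smooth_span(coords):
        hi, lo = max(coords), min(coords)
        # Shift by the extreme value before exponentiating to avoid overflow.
        smooth_max = hi + gamma * math.log(
            sum(math.exp((c - hi) / gamma) for c in coords))
        smooth_min = lo - gamma * math.log(
            sum(math.exp((lo - c) / gamma) for c in coords))
        return smooth_max - smooth_min

    return smooth_span(net_x) + smooth_span(net_y)
```

The approximation never underestimates the exact value, and for a net with n pins it overestimates by at most 4 * gamma * ln(n), so gamma trades accuracy against smoothness.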
8Accomplishments in the Past Year
- Improvements in mPL for routing density control
- Best quality, ISPD 2006 contest
- Thermal-Driven Placement
- Heterogeneous Placement
9Core Engine for Density Control
- Overall scheme
- One V cycle with comparable quality
- Minimum perturbation in the last stages of GFD
- Significant speedup without losing solution quality
- Routing density handling
- Residual density in each bin
- Even distribution of dummy density into bins
- Cell area inflation for better convergence
(Figure labels: GFD with Density Control; Minimum perturbation)
10Macro Spreading
- Need area density below target value (Nam, ISPD06)
- Target distance between neighboring macros
- ? target density
- Spreading represented as objective
- dxi and dyi perturbation
- fxij and fyij piece-wise linear function
(Figure: macro-spreading diagram with labels H, Hij, A1, A2, w, w1, w2, W, and fij plotted against x)
11Experiment Results on ISPD06
mPL6 produces the best solution quality using
ISPD06 routability-driven metric
12Demonstration of mPL6
- http://cadlab.cs.ucla.edu/cpmo/videos/mPL6-density.wmv
13Accomplishments in the Past Year
- Improvements in mPL core engine for mixed-size global placement
- Thermal-Driven Placement
- Heterogeneous Placement
14Motivation
- High power density due to technology scaling
- Problems caused by high temperature
- Hot spots become more harmful
- Higher temperature -> Higher leakage power -> More heat
- Previously negligible effects become first-order effects
- Difficult estimation for power, timing, etc.
15Thermal Model
- One-layer mesh to model the substrate
- $\sum_j (T_i - T_j)\,C_{xy} + (T_i - T_{sink})\,C_z = P_i$
- $C_{xy}$, $C_z$ are the thermal conductances for the substrate and the heat sink
- Solved by Fast DCT
- Solve $T$ from $C\,T = P$, given $C$ and $P$
- Diagonalize $C = G^T \Lambda G$
- $G$ is the discrete cosine matrix
- $\Lambda$ is a diagonal matrix
- $T = G^{-1} \Lambda^{-1} G\,P$
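The DCT-based solve on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a rectangular mesh with uniform conductances, insulated lateral boundaries (which is what makes the cosine basis diagonalize the mesh operator), temperatures measured relative to the heat sink, and a naive matrix-product DCT rather than a fast transform. All function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix G (n x n); each row is one cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * k * (i + 0.5) / n)
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def solve_temperature(power, c_xy, c_z):
    """Temperature rise above the sink for a 2-D grid of power values."""
    n_r, n_c = len(power), len(power[0])
    g_r, g_c = dct_matrix(n_r), dct_matrix(n_c)
    # Forward transform: express the power map in the cosine basis.
    p_hat = matmul(matmul(g_r, power), transpose(g_c))
    # Divide by the eigenvalues of the mesh operator (the diagonal matrix).
    t_hat = [[p_hat[p][q] / (c_xy * (4.0 - 2.0 * math.cos(math.pi * p / n_r)
                                     - 2.0 * math.cos(math.pi * q / n_c)) + c_z)
              for q in range(n_c)] for p in range(n_r)]
    # Inverse transform (G is orthogonal, so its inverse is its transpose).
    return matmul(matmul(transpose(g_r), t_hat), g_c)


def heat_balance(temp, c_xy, c_z):
    """Power implied by a temperature map via the per-node balance equation."""
    n_r, n_c = len(temp), len(temp[0])
    out = [[c_z * temp[r][c] for c in range(n_c)] for r in range(n_r)]
    for r in range(n_r):
        for c in range(n_c):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_r and 0 <= cc < n_c:
                    out[r][c] += c_xy * (temp[r][c] - temp[rr][cc])
    return out
```

Applying `heat_balance` to the result of `solve_temperature` should reproduce the input power map, which gives a direct check that the transform-domain division inverts the mesh equations.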
16Formulation Solution
- Implement ?i(x) and ti(x) with filler cells and filler power without area
- Tdes is given by the user
- Solved by Uzawa Algorithm
- As additional thermal-aware GFD following a
WL-driven V-Cycle
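For readers unfamiliar with the Uzawa algorithm named on this slide, here is a minimal sketch of the generic iteration on a one-variable toy problem. It alternates a minimization of the Lagrangian over the primal variable with a projected ascent step on the multiplier. The toy objective, the step size, and all names are assumptions for illustration; in the placement setting the inner minimization would be carried out by the placer itself rather than in closed form.

```python
def uzawa(argmin_lagrangian, constraint, step=1.0, iterations=100):
    """Generic Uzawa iteration for: minimize f(x) subject to g(x) <= 0.

    argmin_lagrangian(lam) returns the x minimizing f(x) + lam * g(x).
    constraint(x) returns g(x).
    """
    lam = 0.0
    x = argmin_lagrangian(lam)
    for _ in range(iterations):
        # Primal step: minimize the Lagrangian for the current multiplier.
        x = argmin_lagrangian(lam)
        # Dual step: raise the multiplier where the constraint is violated,
        # and keep it non-negative.
        lam = max(0.0, lam + step * constraint(x))
    return x, lam


# Toy problem (assumed): minimize (x - 3)^2 subject to x - 1 <= 0.
# The Lagrangian (x - 3)^2 + lam * (x - 1) is minimized at x = 3 - lam / 2.
x_opt, lam_opt = uzawa(lambda lam: 3.0 - lam / 2.0, lambda x: x - 1.0)
```

On this toy problem the iteration settles at the constrained optimum x = 1 with multiplier 4.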
17Experiment Results on IBM-FastPlace
- Quality improvement
- Teven is the ideal temperature with the same total power
- Max. on-chip temperature
- Tinit after Step 1
- Tfinal Tdes after Step
- More than 90% quality improvement within 5% WL increase
18Accomplishments in the Past Year
- Improvements in mPL for routing density control
- 1st quality, ISPD 2006 contest
- Thermal-Driven Placement
- Heterogeneous Placement
19Motivation
- Need for placement on array-type chips with pre-fabricated resources
- FPGA
- Structured ASIC
- Need for heterogeneous capability
- Memory, DSP, etc
- Block on sites of the same type
20Related Work
- Academia
- VPR (Betz and Rose, 97), PATH (Kong, 02), SPCD (Chen and Cong, 04, 05), PPFF (Maidee et al, 03), CAPRI (Gopalakrishnan et al, 06)
- Most comparisons to outdated tools
- No heterogeneous capability
- Industry
- Quartus II (Altera Corp.), ISE (Xilinx Inc.)
- Proprietary chips only
- Techniques not publicly documented
21Heterogeneous Placement by mPL-H
- First analytical placer for heterogeneous placement
- Framework based on mPL6 (Chan et al, 05)
- Multiple layered placement
- One logical layer for each resource
- Forbidden regions blocked by obstacles
- Uniform wirelength computation
- Filler cells on each layer
(Figure: layers labeled DSP, M-RAM, and LAB)
22Demonstration of mPL-H
- http://cadlab.cs.ucla.edu/cpmo/videos/mPL-H.wmv
23Experiment Setting
(Figure: flow diagram with the labels Verilog netlist, Quartus_map, clustered .vqm netlist, Stratix description (.xml), Quartus_fitter, mPL-H, chip type, .qsf placement, Quartus_router)
24Wirelength Comparison
- WL still important for architecture evaluation
- mPL-H is 3% better in HPWL and 2% better in routed WL than Quartus II v5.0
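For reference, the HPWL metric used in this comparison is the sum over nets of each net's bounding-box half-perimeter. A minimal sketch of that computation and of a relative-improvement figure follows; the data layout (a dict of cell positions and a list of nets) and the function names are assumptions for illustration, not the format used by mPL-H or Quartus II.

```python
def total_hpwl(nets, positions):
    """Sum of half-perimeter wirelengths over all nets.

    nets: list of nets, each a list of cell names.
    positions: dict mapping cell name -> (x, y).
    """
    total = 0.0
    for net in nets:
        xs = [positions[cell][0] for cell in net]
        ys = [positions[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def percent_improvement(baseline, candidate):
    """Relative reduction of candidate versus baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline
```

With this convention, a placement whose wirelength is 97 against a baseline of 100 is reported as 3% better.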
25Runtime Comparison
- mPL-H can be 2X faster than Quartus II v5.0 when
the circuit becomes sufficiently large
26Overall Accomplishments Over the Funding Period
- 34% reduction in WL over 3 years
- One technology generation advancement
27Technology Transfer in 2006
- Discussions at conferences and workshops
- ASPDAC 2006, Yokohama, Japan
- ISPD 2006, San Jose, USA
- DAC 2006, San Francisco, USA
- Benchmark Releases (PEKO-MS): http://cadlab.cs.ucla.edu/pubbench
- mPL release: http://cadlab.cs.ucla.edu/src_686_mpl/
28Software Download Record
- PEKO/PEKU (2002 to now)
- More than 360 downloads
- SRC member companies
- Cadence, IBM, Intel, Mentor Graphics, etc.
- Non-SRC member companies
- Synopsys, Magma, Monterey Design, etc.
- Universities
- CMU, Michigan, MIT, UC Berkeley, UCSD, etc.
- mPL (2001 to now)
- More than 480 downloads
- SRC member companies
- Cadence, Intel, Mentor Graphics, etc.
- Non-SRC member companies
- Synopsys, Magma, Intrinsity, Oasys, etc.
- Universities
- CMU, Michigan, Stanford, UCSD, Natl Taiwan U., etc.
29Publications in 2006
- Conference papers
- ASPDAC 2006: J. Cong, M. Xie, "A Robust Detailed Placement for Mixed-size IC Designs."
- ISPD 2006: T. F. Chan, J. Cong, J. Shinnerl, K. Sze and M. Xie, "mPL6: Enhanced Multilevel Mixed-size Placement."
- Thesis
- Kenton Sze, "Multilevel Optimization for VLSI Circuit Placement."
- Min Xie, "Constraint-Driven Large Scale Circuit Placement Algorithms."
30Room for Further Improvement?
(Figure panels: mPL4, mPL5)
- Swirls are difficult to correct with localized
refinement