Large-Scale Optimization in VLSI CAD - PowerPoint PPT Presentation

About This Presentation

Title:

Large-Scale Optimization in VLSI CAD

Slides: 34

Provided by: igo63

Learn more at: https://ptolemy.berkeley.edu

Transcript and Presenter's Notes

Title: Large-Scale Optimization in VLSI CAD

1
Large-Scale Optimization in VLSI CAD

Igor Markov
http://www.eecs.umich.edu/imarkov

2
Goals/Outline of the Talk

Give a general idea about the field
success stories and applications
potential for cross-pollination
What drives the field
Reusable Intellectual Property in CAD
Consequences of large-scale
Sample wide-open problems

3
General (VLSI CAD)

Very Large Scale Integration
numerous components and interconnect
emergent properties
not apparent in isolated components
Computer-Aided Design
better than human design (super-human!)
and then some

FOR MORE INFO...
http://www.eecs.umich.edu/imarkov/EECS527
4
Integrated Circuits

Excellent examples of large systems
manufacturing is enormously expensive
research can prevent blunders and pays off
two Moore's laws keep everyone busy
circuits are growing
circuit design is getting harder
decreased market windows
must design quickly (or else)
digital circuits are amenable to auto-manipulation
have a lot of regularity (easier to represent)

5
Just How Large?

As large as we can handle
a priori (physical) limits are at least 20 years away
pushing the boundaries is our goal
Current limits
need to solve many NP-hard problems
poor understanding, mathematical models
lack of efficient algorithms
(typical problem sizes will follow)

6
Design via Optimization

Think of all possible design solutions
solution space
need to choose one solution (or several)
What parameters should be optimized?
objective functions f1(x), f2(x), ...
Need to observe design constraints
The EDA revolution of the 1980s
searching, combinatorial and mathematical
optimization may outperform engineering intuition
when implemented in software

7
A Meta-Approach to Optimization

Global Optimization
often cannot optimize accurate objectives
they can be hopeless to evaluate
e.g., min routed wirelength as f(placement)
find simpler objectives that correlate well
ditto for constraints
Detailed Optimization
improve global solutions by local search
can now worry about weird constraints
can optimize a better measure of signal delay, etc

8
Consequences of Large-Scale

Runtimes must scale near-linearly
strict limitation on used primitives (e.g., no
Gaussian elimination)
wide-spread use of multi-level methods
Same goes for memory consumption
cannot represent graphs as dense matrices
use random sampling/walks instead of enumeration
Trading solution quality for runtime
especially for randomized algorithms
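
To make the memory argument concrete, here is a rough back-of-the-envelope sketch comparing a dense adjacency matrix with adjacency lists. The 1M-vertex size echoes the 1M-gate circuits mentioned later in the talk; the average degree of 4 and the byte sizes per entry are illustrative assumptions, not figures from the slides.

```python
def dense_matrix_bytes(num_vertices, bytes_per_entry=1):
    """Memory for an n x n adjacency matrix (assumed 1 byte per entry)."""
    return num_vertices * num_vertices * bytes_per_entry


def adjacency_list_bytes(num_vertices, avg_degree, bytes_per_index=4):
    """Memory for adjacency lists: one index per stored neighbor.

    avg_degree and bytes_per_index are assumed values for illustration.
    """
    return num_vertices * avg_degree * bytes_per_index


n = 1_000_000
print(dense_matrix_bytes(n) / 1e9, "GB dense")        # quadratic growth
print(adjacency_list_bytes(n, 4) / 1e6, "MB sparse")  # linear growth
```

Under these assumptions the dense form needs about a terabyte while the sparse form needs tens of megabytes, which is why the sparse representation is the only one that scales near-linearly.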

9
Historic Opportunism

In early days of VLSI CAD
the Electronic Design Automation revolution
enabling, but short-lived results (can easily do
better)
e.g., this new algorithm addresses objective
f(x)
many proposed approaches never picked up
As ICs became larger, most CAD tools could not
handle leading-edge circuits
algorithms for Deep SubMicron circuits
soon turned out that many algos were weak
partitioning, placement, SAT, etc.

10
Competitiveness

Outdated algorithms cause costly software
rewrites and lost opportunity
commercial tools may sell for $400,000
Learning circuit physics, optics, semiconductor
technologies, applied math, CS theory, AI,
databases, proper software design, etc. is well
worth the effort
competitive edge
As a result of competitiveness, VLSI CAD offers
some of the best algorithms, very strong
implementations
frequent contributions to other fields

11
Success Stories

Min-cut hypergraph partitioning
(very good solutions)
200K 0/1 variables, 1-2 mins of CPU time
Minimal Steiner trees (optimal)
hundreds of points in 1 second
Provably good routing (approximation)
500K nets in several hours (!!!)

12
Min-cut Partitioning

Given
hypergraph
k bins
each accommodates up to N vertices
Seek
to assign each vertex to a bin
Minimize
number of hyperedges between bins
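
The formulation above can be sketched in a few lines of Python. This is only an evaluator for the objective and the capacity constraint, not a partitioning algorithm; the function names, the toy hypergraph, and the bin numbering 0..k-1 are assumptions made for illustration.

```python
def cut_size(hyperedges, assignment):
    """Count hyperedges whose vertices are spread over more than one bin."""
    return sum(
        1 for edge in hyperedges
        if len({assignment[v] for v in edge}) > 1
    )


def is_feasible(assignment, k, capacity):
    """Check that bins are numbered 0..k-1 and each holds at most `capacity` vertices."""
    counts = [0] * k
    for bin_id in assignment.values():
        if not 0 <= bin_id < k:
            return False
        counts[bin_id] += 1
    return all(c <= capacity for c in counts)


# Toy hypergraph (hypothetical): each tuple is one hyperedge.
hyperedges = [("a", "b", "c"), ("c", "d"), ("d", "e", "f"), ("a", "f")]
assignment = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

print(cut_size(hyperedges, assignment))      # hyperedges (c,d) and (a,f) are cut
print(is_feasible(assignment, k=2, capacity=3))
```

A partitioner searches over `assignment` to minimize `cut_size` while keeping `is_feasible` true.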

13
Min-cut Partitioning (cont'd)

Numerous apps in VLSI CAD and beyond
supercomputing, data mining, Internet, ...
Progress in partitioning algorithms
started in 1972 and still going
many approaches invented / discarded
now can auto-partition 1M-gate circuits
better than manually, with free software
couldn't, even commercially, just 3 years ago
(this has nothing to do with Deep SubMicron)

14
Min-cut Partitioning (cont'd)

UCLA MLPart (ASPDAC 2000)
faster than hMetis per start
returns better solutions on average
never worse than 5% off from hMetis
sometimes (ibm06, 2aa) 30% better
available in source code (C++) and binaries
at the bookshelf, free for any use w/o
notification
Used at Cadence, Intel, start-ups
Vital to UCLA Capo placer

15
Steiner Minimal Trees

Given
k points in the plane
Seek
a Steiner tree connecting the points
add extra points
connect all points by straight-line segments
Minimize
total edge-length of the tree
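
To illustrate why adding extra points helps, the sketch below compares a minimum spanning tree on the four corners of a unit square with a tree that routes everything through one added center point. The example points and helper names are assumptions; the star tree is shorter than the spanning tree but is not itself the Steiner minimal tree (for the unit square that uses two extra points and has length 1 + sqrt(3)).

```python
import math


def dist(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def mst_length(points):
    """Length of a minimum spanning tree (Prim's algorithm, O(k^2))."""
    if not points:
        return 0.0
    # best[i] = distance from point i to the tree built so far
    best = {i: dist(points[0], points[i]) for i in range(1, len(points))}
    total = 0.0
    while best:
        i = min(best, key=best.get)
        total += best.pop(i)
        for j in best:
            best[j] = min(best[j], dist(points[i], points[j]))
    return total


def tree_length(edges):
    """Total edge length of a tree given as (point, point) pairs."""
    return sum(dist(p, q) for p, q in edges)


terminals = [(0, 0), (1, 0), (1, 1), (0, 1)]
center = (0.5, 0.5)                      # one added (Steiner) point
star = [(center, t) for t in terminals]

print(mst_length(terminals))   # no extra points allowed
print(tree_length(star))       # shorter, thanks to the extra point
```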

16
Steiner Minimal Trees (cont'd)

Applications
routing signal nets
connecting cities by highways
1989, Scientific American
cannot find an SMT for 100 US cities
1999, SODA (Warme/Zachariasen)
with GeoSteiner can do that in <1 sec
implementation available in source code

17
Routing of Multiple Nets

Given
n-tuples of locations to be connected
with Steiner trees (think of signal nets)
Constraints (not trivial to satisfy!)
routes cannot occupy same space
Minimize
total length of routes, congestion

18
Routing Of Multiple Nets

One of the first circuit design automations (late
1960s)
Has enormous solution space
A classic AI problem
Current commercial tools (e.g., Cadence)
up to a day for 500K nets, no guarantees
ISPD 2000, Albrecht (using multi-commodity flows)
500K nets in several hours, within 20% of opt.
(IBM Power 3 chip)

19
What Makes a Breakthrough? (or at least a splash)

Study sample splashes
Is it enough to minimize a function? (function -
relevant, minimization - efficient)
Yes
Yes, but
No
Absolutely not

20
Background: VLSI Placement

(figure: bad placement vs. good placement)

21
Global WL-driven Placement

Objective
total Half-Perimeter WireLength
approximates Steiner Minimal Tree
UCLA Capo placer (DAC 2000)
beats Cadence QPlace on many benchmarks (<50k gates)
unpublished: 30% better on a 280K-gate benchmark
compared by routed WL after Cadence WarpRoute
in congestion-driven mode; 1 routing violation = failure
used for research at IBM, Intel, Philips, CMU, ...
available in source code (C++), free for any use
(timing-driven mode not yet released)
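
The Half-Perimeter WireLength objective named above can be sketched as follows: for each net, take the bounding box of its cells and add its width and height. The data layout (a dict from cell name to (x, y), nets as tuples of cell names) and the toy values are assumptions for illustration, not Capo's actual data structures.

```python
def hpwl(net, placement):
    """Half-perimeter of the bounding box around one net's cells."""
    xs = [placement[cell][0] for cell in net]
    ys = [placement[cell][1] for cell in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets, placement):
    """Objective of wirelength-driven placement: sum of HPWL over all nets."""
    return sum(hpwl(net, placement) for net in nets)


# Hypothetical toy placement: cell name -> (x, y)
placement = {"u": (0, 0), "v": (3, 1), "w": (1, 4)}
nets = [("u", "v"), ("u", "v", "w")]

print(hpwl(nets[0], placement))      # bounding box 3 wide, 1 tall
print(total_hpwl(nets, placement))
```

Each net costs time linear in its size, which is what makes HPWL attractive as a simpler objective that still correlates with routed wirelength.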

22
Background: Detailed Placement

Detailed circuit placement
given locations of circuit elements (cells),
improve them by local changes (e.g., swaps)
minimize total length of signal nets
Local, but large-scale problem
entails a very large number of small sub-problems
Practically important
local improvements directly translate to large
scale
very similar to floorplanning (a high-level
problem)

23
Background: Detailed Placement

Naïve detailed optimization
consider 7-8 cells at a time
enumerate all permutations
compute HPWL for each
pick the best permutation
repeat for another group of 7-8
Larger groups => better solutions
practical limit: 0.01 sec per group
Use branch-and-bound for each group (ISPD '99)
Overall linear runtime
Easy parallelization (optimize many groups in parallel)
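
The naive enumeration described above can be sketched as below: the cells in one small window are tried in every order over the slots they already occupy, and the arrangement with the lowest HPWL is kept. This is the brute-force baseline only, not the branch-and-bound variant; the names, the toy row, and the choice to recompute all nets on every trial are simplifying assumptions.

```python
from itertools import permutations


def total_hpwl(nets, pos):
    """Sum of bounding-box half-perimeters over all nets."""
    total = 0
    for net in nets:
        xs = [pos[c][0] for c in net]
        ys = [pos[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def optimize_window(window, nets, pos):
    """Try all permutations of the window's cells over their current slots.

    Returns (best placement, best cost). For clarity every net is
    re-evaluated; only nets touching the window can actually change.
    """
    slots = [pos[c] for c in window]
    best_pos, best_cost = dict(pos), total_hpwl(nets, pos)
    for perm in permutations(slots):
        trial = dict(pos)
        trial.update(zip(window, perm))
        cost = total_hpwl(nets, trial)
        if cost < best_cost:
            best_pos, best_cost = trial, cost
    return best_pos, best_cost


# Hypothetical row: fixed pins L and R, movable cells a, b, c between them.
pos = {"L": (0, 0), "a": (1, 0), "b": (2, 0), "c": (3, 0), "R": (4, 0)}
nets = [("L", "c"), ("a", "R"), ("a", "b")]

best_pos, best_cost = optimize_window(["a", "b", "c"], nets, pos)
print(total_hpwl(nets, pos), "->", best_cost)
```

A window of 8 cells already means 40,320 permutations, which is why the window size is capped and why windows that share no cells can be handled independently.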

24
Optimal Interleaving

ICCAD 2000, Hur and Lillis (TR available)

Two ordered sequences: A B C D E and 1 2 3 4 5
Interleaved optimally in O(n^2) time by Dynamic Programming
Result: A 1 2 B C 3 4 D 5 E

Can handle 30 elements at a time
easier to implement than B&B
the order constraint turns out very mild
Very good result
but, seemingly, nothing more than min f(x) !
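
A minimal sketch of interleaving two ordered sequences by dynamic programming follows. It assumes a simplified cost model in which each element pays an independent cost for the slot it lands in; the cost actually optimized by Hur and Lillis is wirelength-based and is not reproduced here. The function name, the `cost` callback, and the target-slot example are hypothetical.

```python
def optimal_interleaving(a, b, cost):
    """Merge sequences a and b, preserving the internal order of each.

    cost(element, slot) is an assumed additive cost of putting `element`
    at position `slot` of the merged row. dp[i][j] is the best cost of
    merging the first i elements of a with the first j elements of b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    dp = [[inf] * (m + 1) for _ in range(n + 1)]
    choice = [[None] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            slot = i + j - 1  # slot taken by the last element placed
            if i > 0:
                c = dp[i - 1][j] + cost(a[i - 1], slot)
                if c < dp[i][j]:
                    dp[i][j], choice[i][j] = c, "a"
            if j > 0:
                c = dp[i][j - 1] + cost(b[j - 1], slot)
                if c < dp[i][j]:
                    dp[i][j], choice[i][j] = c, "b"
    # Walk the recorded choices backwards to recover the merged order.
    merged, i, j = [], n, m
    while i > 0 or j > 0:
        if choice[i][j] == "a":
            i -= 1
            merged.append(a[i])
        else:
            j -= 1
            merged.append(b[j])
    merged.reverse()
    return dp[n][m], merged


# Hypothetical preferred slots, chosen to reproduce the order on the slide.
target = {e: s for s, e in enumerate("A12BC34D5E")}
best_cost, order = optimal_interleaving(
    list("ABCDE"), list("12345"), lambda e, s: abs(s - target[e])
)
print(best_cost, "".join(order))
```

The table has (n+1) x (m+1) entries and each is filled in constant time, which gives the quadratic runtime quoted on the slide.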

25
Popularity: Comparison w/ GeoSteiner

The Hur/Lillis algorithm
appeared several months ago (on paper)
already implemented by several groups
with great results
but Warme's GeoSteiner
is barely used
source code published 2 years ago
instead, used are simple heuristics that are
slower
Difference: ease of reuse!
of result itself and/or of its representation

26
Intellectual Property in CAD

Reuse?
today hundreds of VLSI CAD engineers are
implementing the same, known, but difficult
algorithms
Breakthroughs typically produce validated and
reusable intellectual property
yet another algorithm to min f(x) does not
automatically qualify for validated, reusable CAD
IP
applicability, generality, quality of
description, etc.
CAD IP is not just algorithms and code
CAD IP also includes benchmarks, evaluation techniques,
empirical studies/results, algorithm analyses, etc.
Studies of CAD IP suggest
to effectively reuse, need infrastructure

27
Intellectual Property in CAD

GSRC Bookshelf for Fundamental Algorithms in
CAD
a repository for reusable CAD IP, a publication
medium
a way to communicate with industry
problem formulations are also considered CAD IP
http://vlsicad.cs.ucla.edu/GSRC/bookshelf
Existing bookshelf slots include
SAT, Graph Coloring, Hypergraph Partitioning,
Mathematical Optimization, Circuit Placement,
Clock Tree Routing, Global Routing, Interconnect
Optimization, etc.
Leading-edge implementations (free for all uses)
UCLA Physical Design Tools (graph partitioners,
placers, etc.)
many more (SAT solvers from U. Michigan,
GeoSteiner, etc)

28
Reuse and Education

Both are necessary to sustain Moore's laws
not enough designers to implement new chips
not enough CAD engineers to automate design
Need to teach/study reusable design
hardware, software/CAD IP (similar? different?)
note: typical promising research demos are not
reusable
Design of reusable software
theory has been available for years (processes,
code metrics, interface languages, modeling,
robust public-domain tools, etc)
need more infrastructure, practice, experience
of reuse
first reuse software
then design reusable software

29
Research Directions (1)

Citius, Altius, Fortius
faster, leaner implementations
higher-quality solutions
stronger impact on applications
aids available: latest advances in CS theory,
Mathematics, AI, software engineering, etc.
Large-scale computing aspects of VLSI CAD
memory locality (big deal for irregular circuits)
memory-less algorithms (and trade-offs)

30
Research Directions (2)

Quantified suboptimality of heuristics
(for NP-hard problems)
how close can we get to optima in practice?
estimate suboptimality of specific solutions
study dependence on input distributions
related to CS theory / approximation algos
example: detection of symmetries in Logic
Synthesis
Kravets/Sakallah, ICCAD 2000 and TR
Lower bounds and impossibility arguments for
fundamental algorithms
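
One common way to estimate the suboptimality of a specific solution is to compare its cost against any valid lower bound on the optimum. The sketch below assumes a minimization problem with a positive lower bound; the function name and the numbers are illustrative only.

```python
def suboptimality_bound(cost, lower_bound):
    """Upper bound on how far `cost` can be from the optimum, as a fraction.

    Since lower_bound <= optimum <= cost, the true gap
    (cost - optimum) / optimum is at most (cost - lower_bound) / lower_bound.
    """
    if lower_bound <= 0:
        raise ValueError("lower_bound must be positive")
    if cost < lower_bound:
        raise ValueError("cost cannot be below a valid lower bound")
    return (cost - lower_bound) / lower_bound


# Hypothetical numbers: a heuristic solution of cost 120 against a bound of 100.
print(suboptimality_bound(120, 100))
```

A statement such as "within 20% of optimal" is a guarantee of this kind: it holds even when the optimum itself is unknown, as long as the lower bound is valid.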

31
Research Directions (3)

Using better, but still computable, models of
reality
simulation as a driver for optimization
modeling semiconductor effects
Alpert et al., ISPD 2000: a new interconnect
delay model, better than Elmore delay; all
optimizations assuming Elmore are open to
porting
inductance, noise, etc
effects of statistical variations
CAD for new types of semi technologies and styles
subwavelength lithography (optical proximity
correction, etc)
System-On-Chip (high-level partitioning, etc)
CAD for analog circuits (including RF, MW)