PPT – ECE260B - CSE241A VLSI Digital Circuits PowerPoint presentation


Slides: 91

Provided by: AndrewB201

Learn more at: https://vlsicad.ucsd.edu


Title: ECE260B - CSE241A VLSI Digital Circuits

1
ECE260B / CSE241A Winter 2005: Placement
Website: http://vlsicad.ucsd.edu/courses/ece260b-w05
Slides courtesy of Prof. Andrew B. Kahng
2
VLSI Design Flow and Physical Design Stage
3
Placement Problem

Input
A set of cells and their complete information (a
cell library).
Connectivity information between cells (netlist
information).
Output
A set of locations on the chip: one location for
each cell.
Goal
The cells are placed to produce a routable chip
that meets timing
and other constraints (e.g., low-power, noise,
etc.)
Challenge
The number of cells in a design is very large
(> 1 million).
The timing constraints are very tight.

4
Optimal Relative Order
[Figure: cells A, B, C]
5
To spread ...
[Figure: cells A, B, C]
6
.. or not to spread
[Figure: cells A, B, C]
7
Place to the left
8
or to the right
9
Optimal Relative Order
[Figure: cells A, B, C]
Without free space, the placement problem is
dominated by order
10
Placement Problem
11
Global and Detailed Placement
In global placement, we decide the approximate
locations of cells by placing them in global
bins. In detailed placement, we make local
adjustments to obtain the final
non-overlapping placement.
12

Placement Footprints

Standard Cell
Data Path
IP - Floorplanning
13
Placement Footprints
Reserved areas
Mixed Data Path & sea of gates
14
Placement Footprints
Perimeter IO
Area IO
15
Placement objectives are subject to user
constraints / design style

Hierarchical Design Constraints
pin location
power rail
reserved layers
Flat Design with Floorplan Constraints
Fixed Circuits
I/O Connections

16
Standard Cells
17
Standard Cells

Power connected by abutment, placed in
sea-of-rows
Rarely rotated
DRC clean in any combination
Circuit clean (i.e., no naked T-gates, no huge
input capacitances)
8, 9, or 10 tracks in height
Metal 1 only used (hopefully)
Multi-height stdcells possible
Buffer sizes, intrinsic delay steps, optimal
repeater selection
Special clock buffers and gates (balanced P/N)
Special metastability hardened flops
Cap cells (metal1 used?)
Gap fillers (metal1 used?)
Tie-high, tie-low

18
Unconstrained Placement
19
Floorplanned Placement
20
Placement Cube (4D)

Cost Function(s) to be used
Cut, wirelength, congestion, crossing, ...
Algorithm(s) to be used
FM, Quadratic, annealing, ...
Granularity of the netlist
Coarseness of the layout domain
2x2, 4x4, ...
An effective methodology picks the right mix from
the above and knows when to switch from one to
the next.
Most methods today are ad hoc

21
Advantages of Hierarchy

Design is carved into smaller pieces that can be
worked on in parallel (improved throughput)
A known floor plan provides the logic design team
with a large degree of placement control.
A known floor plan provides early knowledge of
long wires
Timing closure problems can be addressed by
tools, logic design, and hierarchy manipulation
Late design changes can be done with minimal
turmoil to the entire design

22
Disadvantages of Hierarchy

Results depend on the quality of the hierarchy.
The logic hierarchy must be designed with
Physical Design taken into account.
Additional methodology requirements must be met
to enable hierarchy, e.g., pin assignment, macro
abstract management, area budgeting, floor
planning, timing budgets, etc.
Late design changes may affect multiple
components.
Hierarchy allows divergent methodologies
Hierarchy hinders Design Automation algorithms.
They can no longer perform global optimizations.

23
Traditional Placement Algorithms

Quadratic Placement
Simulated Annealing
Bi-Partitioning / Quadrisection
Force Directed Placement
Hybrid

24
Quadratic Placement
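Min $F = (x_1 - x_3)^2 + (x_1 - x_2)^2 + (x_2 - x_4)^2$

Analytical Technique

$\partial F / \partial x_1 = 0, \quad \partial F / \partial x_2 = 0 \;\Rightarrow\; A x = B$

$A = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \quad B = \begin{pmatrix} x_3 \\ x_4 \end{pmatrix}$

[Figure: movable cells x1 and x2 connected in a chain between x3 and x4]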
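The four-cell example on this slide can be checked numerically. The sketch below is only an illustration: it assumes x3 and x4 are fixed (for instance pads) and solves the 2x2 system A x = B with Cramer's rule. The function names and the sample coordinates 0 and 3 are made up for this example.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a * x = b using Cramer's rule."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    x1 = (b[0] * a22 - a12 * b[1]) / det
    x2 = (a11 * b[1] - a21 * b[0]) / det
    return x1, x2


def place_chain(x3, x4):
    """Minimize (x1-x3)^2 + (x1-x2)^2 + (x2-x4)^2 over x1, x2.

    Setting dF/dx1 = dF/dx2 = 0 gives 2*x1 - x2 = x3 and -x1 + 2*x2 = x4.
    """
    A = [[2.0, -1.0], [-1.0, 2.0]]
    B = [x3, x4]
    return solve_2x2(A, B)


# Hypothetical fixed positions: x3 = 0 and x4 = 3.
x1, x2 = place_chain(0.0, 3.0)
print(x1, x2)  # 1.0 2.0
```

With the fixed cells at 0 and 3, the two movable cells land at 1 and 2, evenly spaced along the chain, which is the behavior expected from a quadratic objective.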
25
Analytical Placement

Get a solution with lots of overlap
What do we do with the overlap?

26
Pros and Cons of QP

Pros
Very Fast Analytical Solution
Can Handle Large Design Sizes
Can be Used as an Initial Seed Placement Engine
Cons
Can Generate Overlapped Solutions: Postprocessing
Needed
Not Suitable for Timing Driven Placement
Not Suitable for Simultaneous Optimization of
Other Aspects of Physical Design (clocks,
crosstalk)
Gives Trivial Solutions without Pads (and close
to trivial with pads)

27
Simulated Annealing Placement

Initial Placement Improved through
Swaps and Moves
Accept a Swap/Move if it improves cost
Accept a Swap/Move that degrades cost
under some probability conditions

[Figure: cost versus time]
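The accept/reject rule on this slide can be sketched as follows. This is a minimal illustration, not the deck's own implementation: it assumes a one-row placement where a cell's position is its index, two-pin nets, a swap-only move set, the common Metropolis acceptance probability exp(-delta / T), and a geometric cooling schedule. All names and parameter values are hypothetical.

```python
import math
import random


def wirelength(order, nets):
    """Total linear wirelength of two-pin nets for cells placed in a single row."""
    pos = {cell: i for i, cell in enumerate(order)}
    return sum(abs(pos[a] - pos[b]) for a, b in nets)


def anneal(order, nets, t_start=5.0, t_end=0.01, alpha=0.95,
           moves_per_t=50, seed=0):
    """Improve a row placement by random swaps with Metropolis acceptance."""
    rng = random.Random(seed)
    cur = list(order)
    cur_cost = wirelength(cur, nets)
    best, best_cost = list(cur), cur_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            i, j = rng.sample(range(len(cur)), 2)
            cur[i], cur[j] = cur[j], cur[i]          # tentative swap
            delta = wirelength(cur, nets) - cur_cost
            # Always accept improvements; accept degradations with
            # probability exp(-delta / t), which shrinks as t cools.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cur_cost += delta
                if cur_cost < best_cost:
                    best, best_cost = list(cur), cur_cost
            else:
                cur[i], cur[j] = cur[j], cur[i]      # reject: undo the swap
        t *= alpha                                   # geometric cooling
    return best, best_cost


# Hypothetical example: five cells connected in a chain a-b-c-d-e.
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
start = ["c", "a", "e", "b", "d"]
best, best_cost = anneal(start, nets)
print(wirelength(start, nets), best_cost)
```

Because the cost function is only called, never differentiated, any metric (congestion, timing, and so on) could be substituted for `wirelength`, which is what "open cost function" refers to on the next slide.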
28
Pros and Cons of SA

Pros
Can Reach Globally Optimal Solution (given
enough time)
Open Cost Function.
Can Optimize Simultaneously all Aspects of
Physical Design
Can be Used for End Case Placement
Cons
Extremely Slow Process of Reaching a Good Solution

29
Bi-Partitioning/Quadrisection
30
Pros and Cons of Partitioning Based Placement

Pros
More Suitable to Timing Driven Placement since it
is Move Based
New Innovations (hMetis) in Partitioning
Algorithms have Made this Extremely Fast
Open Cost Function
Move Based means Simultaneous Optimization of all
Design Aspects Possible
Cons
Not Well Understood
Lots of indifferent moves
May not work well with some cost functions.

31
Hypergraphs in VLSI CAD

Circuit netlist represented by hypergraph

32
Hypergraph Partitioning in VLSI

Variants
directed/undirected hypergraphs
weighted/unweighted vertices, edges
constraints, objectives, ...
Human-designed instances
Benchmarks
up to 4,000,000 vertices
sparse (vertex degree about 4, hyperedge size about 4)
small number of very large hyperedges
Efficiency, flexibility: KL-FM style preferred

33
Context: Top-Down VLSI Placement
34
Context: Top-Down Placement

Speed
6,000 cells/minute to final detailed placement
partitioning used only in top-down global
placement
implied partitioning runtime: 1 second for
25,000 cells, < 30 seconds for 750,000 cells
Structure
tight balance constraint on total cell areas in
partitions
widely varying cell areas
fixed terminals (pads, terminal propagation, etc.)

35
Fiduccia-Mattheyses (FM) Approach

Pass
start with all vertices free to move (unlocked)
label each possible move with immediate change in
cost that it causes (gain)
iteratively select and execute a move with
highest gain, lock the moving vertex (i.e.,
cannot move again during the pass), and update
affected gains
best solution seen during the pass is adopted as
starting solution for next pass
FM
start with some initial solution
perform passes until a pass fails to improve
solution quality
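The pass structure described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: gains are recomputed naively by re-evaluating the cut rather than with the gain-bucket structure that gives FM its linear-time behavior, and the balance rule (each side keeps at least `min_side` vertices) is a stand-in for a real area constraint. All function and parameter names are hypothetical.

```python
def cut_size(nets, side):
    """Number of nets that have pins on both sides of the partition."""
    return sum(1 for net in nets if len({side[v] for v in net}) == 2)


def fm_pass(nets, side, min_side):
    """One FM-style pass; returns the best partition seen and its cut."""
    side = dict(side)
    locked = set()
    best_side, best_cut = dict(side), cut_size(nets, side)
    for _ in range(len(side)):
        base = cut_size(nets, side)
        best_move, best_gain = None, None
        for v in sorted(side):
            if v in locked:
                continue
            # Balance rule: the side that v leaves must keep min_side vertices.
            if sum(1 for u in side if side[u] == side[v]) - 1 < min_side:
                continue
            side[v] ^= 1                              # try the move
            gain = base - cut_size(nets, side)
            side[v] ^= 1                              # undo it
            if best_gain is None or gain > best_gain:
                best_move, best_gain = v, gain
        if best_move is None:
            break
        side[best_move] ^= 1     # execute the highest-gain move, even if negative
        locked.add(best_move)    # the vertex cannot move again in this pass
        if base - best_gain < best_cut:
            best_side, best_cut = dict(side), base - best_gain
    return best_side, best_cut


def fm(nets, side, min_side):
    """Repeat passes until a pass fails to improve the cut."""
    cut = cut_size(nets, side)
    while True:
        new_side, new_cut = fm_pass(nets, side, min_side)
        if new_cut >= cut:
            return side, cut
        side, cut = new_side, new_cut


# Hypothetical example: two triangles joined by the single net (2, 3).
nets = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
start = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}
final_side, final_cut = fm(nets, start, min_side=2)
print(final_side, final_cut)
```

Executing the best available move even when its gain is negative, then keeping only the best prefix of the pass, is what lets this style of heuristic climb out of some local minima.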

36
Cut During One Pass (Bipartitioning)
[Figure: cut size versus number of moves]
37
Multilevel Partitioning
Refinement
Clustering
38
Force Directed Placement

Cells are dragged by forces.
Forces are generated by nets connecting cells.
Longer nets generate bigger forces.
Placement is obtained by either a constructive or
an iterative method.

[Figure: force Fij between cells i and j]
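One way to read the iterative method is as a relaxation in which each movable cell is repeatedly moved to the point where the net forces on it cancel. The sketch below assumes spring-like forces (force proportional to net length and net weight), two-pin nets, and fixed pads. These modeling choices and all names are assumptions made for illustration.

```python
def force_directed(positions, fixed, nets, iterations=200):
    """Iteratively move each free cell to its zero-force position.

    positions: {cell: (x, y)}, fixed: set of cells that may not move,
    nets: list of (cell_a, cell_b, weight) two-pin connections.
    With spring forces, the zero-force point is the weighted centroid
    of the cells a given cell is connected to.
    """
    pos = dict(positions)
    neighbors = {c: [] for c in pos}
    for a, b, w in nets:
        neighbors[a].append((b, w))
        neighbors[b].append((a, w))
    for _ in range(iterations):
        for cell in pos:
            if cell in fixed or not neighbors[cell]:
                continue
            total_w = sum(w for _, w in neighbors[cell])
            x = sum(w * pos[n][0] for n, w in neighbors[cell]) / total_w
            y = sum(w * pos[n][1] for n, w in neighbors[cell]) / total_w
            pos[cell] = (x, y)
    return pos


# Hypothetical example: two pads and two movable cells in a chain.
start = {"p0": (0.0, 0.0), "p1": (3.0, 0.0), "a": (5.0, 5.0), "b": (5.0, 5.0)}
nets = [("p0", "a", 1.0), ("a", "b", 1.0), ("b", "p1", 1.0)]
placed = force_directed(start, fixed={"p0", "p1"}, nets=nets)
print(placed["a"], placed["b"])
```

Note that nothing in this loop prevents two cells from landing on the same spot, and without the fixed pads every cell would collapse to a single point, which matches the cons listed on the next slide.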
39
Pros and Cons of Force Directed Placement

Pros
Very Fast Analytical Solution
Can Handle Large Design Sizes
Can be Used as an Initial Seed Placement Engine
The Force
Cons
Not sensitive to the non-overlapping constraints
Gives Trivial Solutions without Pads
Not Suitable for Timing Driven Placement

40
Hybrid Placement

Mixing and matching different placement algorithms
Effective algorithms are always hybrid

41
GORDIAN (quadratic + partitioning)
Initial Placement
Partition and Replace
42
Congestion Minimization

Traditional placement problem is to minimize
interconnection length (wirelength)
A valid placement has to be routable
Congestion is important because it represents
routability (lower congestion implies better
routability)
There is not yet enough research work on the
congestion minimization problem

43
Definition of Congestion
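Routing demand = 3. Assume routing supply is
1; overflow = 3 - 1 = 2 on this edge.
Overflow on each edge:
$\text{overflow}(e) = \begin{cases} \text{demand}(e) - \text{supply}(e) & \text{if } \text{demand}(e) > \text{supply}(e) \\ 0 & \text{otherwise} \end{cases}$
Total overflow:
$\text{Overflow} = \sum_{\text{all edges } e} \text{overflow}(e)$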
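The overflow definition translates directly into code. The sketch below is illustrative only. The edge names and the demand and supply numbers are invented, apart from the demand 3, supply 1 case taken from the slide.

```python
def edge_overflow(demand, supply):
    """Overflow on one routing edge: excess demand over supply, else 0."""
    return demand - supply if demand > supply else 0


def total_overflow(demands, supplies):
    """Sum of per-edge overflow over all edges (keys of `demands`)."""
    return sum(edge_overflow(demands[e], supplies[e]) for e in demands)


# The slide's example edge: demand 3, supply 1.
print(edge_overflow(3, 1))  # 2

# Hypothetical three-edge routing grid.
demands = {"e1": 3, "e2": 1, "e3": 0}
supplies = {"e1": 1, "e2": 1, "e3": 2}
print(total_overflow(demands, supplies))  # 2
```

Unused capacity on one edge (e3 here) does not offset overflow on another, which is why a placement with low total wirelength can still have high overflow.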
44
Correlation between Wirelength and Congestion
45
Wirelength ≠ Congestion
A wirelength minimized placement
A congestion minimized placement
46
Congestion Map of a Wirelength Minimized Placement
47
Congestion Map
48
Congestion Reduction Postprocessing
Reduce congestion globally by minimizing the
traditional wirelength
Post process the wirelength optimized placement
using the congestion objective
49
Congestion Reduction Postprocessing

Among a variety of cost functions and methods for
congestion minimization, wirelength alone
followed by a post processing congestion
minimization works the best and is one of the
fastest.
Cost functions such as a hybrid length plus
congestion do not work very well.

50
Cost Functions for Placement

The final goal of placement is to achieve
routability and meet timing constraints
Constraints are very hard to use in optimization,
thus we use cost functions (e.g., Wirelength) to
predict our goals.
We will show what happens when you try
constraints directly
The main challenge is a technical understanding
of various cost functions and their interaction.

51
Prediction

What is prediction?
every system has some critical cost functions
Area, wirelength, congestion, timing etc.
Prediction aims at estimating values of these
cost functions without having to go through the
time-consuming process of full construction.
Allows quick space exploration, localizes the
search
For example
statistical wire-load models
Wirelength in placement

52
Paradigms of Prediction

Two fundamental paradigms
statistical prediction
number of two-terminal nets in all designs
number of two-terminal nets with length greater
than 10 in all designs
constructive prediction
number of two-terminal nets with length greater
than 10 in this design
and everything in between, e.g.,
number of critical two-terminal nets in a design,
based on statistical data and a quick inspection
of the design in hand.
Absolute truth, or "I need it to make progress"
SLIP (System Level Interconnect Prediction)
community.

53
Cost Functions for Placement

Net-cut
Linear wirelength
Quadratic wirelength
Congestion
Timing
Coupling
Other performance related cost functions
Undiscovered crossing

54
Net-cut Cost for Global Placement

The net-cut cost is defined as the number of
external nets between different global bins
Minimizing net-cut in global placement tends to
put highly connected cells close to each other.
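A minimal sketch of the net-cut cost, assuming each net is given as a list of cells and each cell has already been assigned to a global bin. The bin labels and the example netlist are hypothetical.

```python
def net_cut(nets, bin_of):
    """Count nets whose cells are spread over more than one global bin."""
    return sum(1 for net in nets if len({bin_of[c] for c in net}) > 1)


# Hypothetical 2x1 global placement with four cells and three nets.
bin_of = {"a": (0, 0), "b": (0, 0), "c": (1, 0), "d": (1, 0)}
nets = [["a", "b"], ["b", "c", "d"], ["c", "d"]]
print(net_cut(nets, bin_of))  # 1
```

Only the three-pin net crosses the bin boundary here. Moving cell b into the right-hand bin would cut the net a-b instead, so the cost stays at 1.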

55
Linear Wirelength Cost
The linear length of a net between cell 1 and
cell 2 is $l_{12} = |x_1 - x_2| + |y_1 - y_2|$. The linear
wirelength cost is the summation of the linear
length of all nets.
56
Quadratic Wirelength Cost
The quadratic length of a net between cell 1 and
cell 2 is $l_{12} = (x_1 - x_2)^2 + (y_1 - y_2)^2$. The
quadratic wirelength cost is the summation of the
quadratic length of all nets.
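The linear and quadratic wirelength costs from these two slides can be compared side by side. The sketch assumes two-pin nets and cell positions given as (x, y) tuples. The names and coordinates are illustrative.

```python
def linear_length(p, q):
    """Manhattan length |x1 - x2| + |y1 - y2| of a two-pin net."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def quadratic_length(p, q):
    """Squared Euclidean length (x1 - x2)^2 + (y1 - y2)^2 of a two-pin net."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def wirelength_cost(nets, pos, length):
    """Sum the chosen length measure over all two-pin nets."""
    return sum(length(pos[a], pos[b]) for a, b in nets)


# Hypothetical placement: one long net and one short net.
pos = {1: (0, 0), 2: (3, 4), 3: (3, 5)}
nets = [(1, 2), (2, 3)]
print(wirelength_cost(nets, pos, linear_length))     # 8
print(wirelength_cost(nets, pos, quadratic_length))  # 26
```

The long net contributes 7 of 8 to the linear cost but 25 of 26 to the quadratic cost, so the quadratic objective penalizes long nets much more heavily.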
57
Congestion Cost
Routing demand = 3. Assume routing supply is
1; overflow = 3 - 1 = 2 on this edge.
Overflow on each edge
58
Cost Functions for Placement

Various cost functions (and a mix of them) have
been used in practice to model/estimate
routability and timing
We have a good feel for what each cost function
is capable of doing
We need to understand the interaction among cost
functions

59
Congestion Minimization and Congestion vs
Wirelength

Congestion is important because it closely
represents routability (especially at
lower-levels of granularity)
Congestion is not well understood
Ad-hoc techniques have been kind-of working since
congestion has never been severe
It has been observed that length minimization
tends to reduce congestion.
Goal: Reduce congestion in placement (willing to
sacrifice wirelength a little bit).

60
Correlation between Wirelength and Congestion
Total Wirelength = Total Routing Demand
61
Wirelength ≠ Congestion
A wirelength minimized placement
A congestion minimized placement
62
Congestion Map of a Wirelength Minimized Placement
63
Different Routing Models for modeling congestion

Bounding box router: fast but inaccurate.
Real router: accurate but slow.
A bounding box router can be used in placement if
its routing results correlate with those of the
real router.
Note: For different cost functions, the answer might
be different (e.g., for coupling, only a detailed
router can answer).

64
Different Routing Models
An MST + shortest_path routing model
A bounding box routing model
65
Objective Functions Used in Congestion
Minimization

WL: Standard total wirelength objective.
Ovrflw: Total overflow in a placement (a direct
congestion cost).
Hybrid: $(1 - \alpha)\,WL + \alpha\,Ovrflw$
QL: A quadratic plus linear objective.
LQ: A linear plus quadratic objective.
LkAhd: A modified overflow cost.
$(1 - \alpha_T)\,WL + \alpha_T\,Ovrflw$: A time-changing hybrid
objective which lets the cost function gradually
change from wirelength to overflow as
optimization proceeds.
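The hybrid and time-changing objectives differ only in how the weight is chosen. The sketch below assumes a linear ramp for the time-dependent weight. The slides do not say which schedule was used, so `alpha_schedule` is purely a placeholder, as are the sample cost values.

```python
def hybrid_cost(wl, ovrflw, alpha):
    """(1 - alpha) * WL + alpha * Ovrflw for a fixed weight alpha in [0, 1]."""
    return (1 - alpha) * wl + alpha * ovrflw


def alpha_schedule(step, total_steps):
    """Hypothetical schedule: the weight ramps linearly from 0 to 1."""
    return step / total_steps


def time_changing_cost(wl, ovrflw, step, total_steps):
    """Hybrid cost whose weight shifts from wirelength to overflow over time."""
    return hybrid_cost(wl, ovrflw, alpha_schedule(step, total_steps))


# Hypothetical values: wirelength 100, overflow 20, 10 optimization steps.
for step in (0, 5, 10):
    print(step, time_changing_cost(100, 20, step, 10))
```

At step 0 the cost is pure wirelength and at the last step it is pure overflow. In practice the two terms have different units and would likely need normalizing before being mixed.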

66
Post Processing to Reduce Congestion
Reduce congestion globally by minimizing the
traditional wirelength
Post process the wirelength optimized placement
using the congestion objective
67
Post Processing Heuristics

Greedy cell-centric algorithm: Greedily move
cells around and greedily accept moves.
Flow-based cell-centric algorithm: Use a
flow-based approach to move cells.
Net-centric algorithm: Move nets with bigger
contributions to the congestion first.

68
Greedy Cell-centric Heuristic
69
Flow-based Cell-centric Heuristic
Bin Nodes
Cell Nodes
70
Net-centric Heuristic
71
From Global Placement to Detailed Placement
Global Placement: All the cells are assumed to be
placed at the centers of global bins.
Detailed Placement: Cells are placed without
overlapping.
72
Correlation Between Global and Detailed Placement
Conclusion: Congestion at the detailed placement
level is correlated with congestion at the global
placement level. Thus reducing congestion in
global placement helps reduce congestion in the
final detailed placement.

WLg: Wirelength optimized global placement.
CONg: Congestion optimized global placement.
WLd: Wirelength optimized detailed placement.
CONd: Congestion optimized detailed placement.

73
Congestion

Wirelength minimization can reduce congestion
globally. A post processing congestion
minimization following wirelength minimization
works the best to reduce congestion in placement.
A number of congestion-related cost functions
were tested, including a hybrid length plus
congestion (commonly believed to be very
effective). Experiments show that they do not
work very well.
Net-centric post processing techniques are very
effective at minimizing congestion.
Congestion at the global placement level
correlates well with congestion of detailed
placement.

74
Shapes of Cost Functions
net-cut cost
wirelength
congestion
Solution Space
75
Relationships Between the Three Cost Functions

The net-cut objective function is smoother
than the wirelength objective function
The wirelength objective function is smoother
than the congestion objective function
Local minima of these three objectives are in
the same neighborhood.

76
Crossing: A routability estimator?

Replace each crossing with a gate
A planar netlist
Easy to place

77
Timing Cost
Critical Path

Delay of the circuit is defined as the longest
delay among all possible paths from primary
inputs to primary outputs.
Interconnection delay becomes more and more
important in deep sub-micron regime.
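The circuit delay defined above is a longest-path computation on the timing graph. The sketch below assumes an acyclic circuit, a fixed delay per gate, and an optional lumped delay per wire. The circuit, the delay numbers, and all names are hypothetical.

```python
from functools import lru_cache


def critical_path_delay(gate_delay, fanin, outputs, wire_delay=None):
    """Longest delay from any primary input to any primary output.

    gate_delay: {node: delay}, fanin: {node: [driver nodes]},
    wire_delay: optional {(driver, node): interconnect delay}.
    Nodes with no fan-in are treated as primary inputs.
    """
    wire_delay = wire_delay or {}

    @lru_cache(maxsize=None)
    def arrival(node):
        # Latest arrival among the drivers, including interconnect delay.
        latest = max(
            (arrival(p) + wire_delay.get((p, node), 0.0)
             for p in fanin.get(node, ())),
            default=0.0,
        )
        return latest + gate_delay[node]

    return max(arrival(o) for o in outputs)


# Hypothetical circuit: g3 is driven by g1 and g2.
gate_delay = {"in1": 0.0, "in2": 0.0, "g1": 2.0, "g2": 3.0, "g3": 1.0, "out": 0.0}
fanin = {"g1": ["in1", "in2"], "g2": ["in2"], "g3": ["g1", "g2"], "out": ["g3"]}

print(critical_path_delay(gate_delay, fanin, ["out"]))
print(critical_path_delay(gate_delay, fanin, ["out"], {("g1", "g3"): 2.0}))
```

With gate delays alone the critical path runs through g2. Adding a long wire between g1 and g3 moves the critical path to g1, a small illustration of why interconnect delay matters to placement.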

78
Timing Analysis
How do we get the delay numbers on the
gate/interconnect?
79
Approaches

Budgeting
Inaccurate information
Fast
Path Analysis
Most accurate information
Very slow
Path analysis with infrequent path substitution
Somewhere in between

80
Timing Metrics

How do we assess the change in a delay due to a
potential move during physical design?
Whether it is channel routing or area routing,
the problem is the same
translate geometrical change into delay change

81
Other costs: Coupling Cost

Hard to model during placement
Can run a global router in the middle of
placement
Even at the global routing level it is hard to
model it

Avoid it
82
Coupling Solutions

Once we have some metrics for coupling, we can
calculate sensitivities, and optimize the
physical design...

83
Other Performance Costs

Power usage of the chip.
Weighted nets
Dual voltages (severe constraint on placement)
Very little known about these cost functions and
their interaction with other cost functions
Fundamental research is needed to shed some light
on the structure of them

84
Netlist Granularity: Problem Size and Solution
Space Size

The most challenging part of the placement
problem is to solve a huge system within a given
amount of time
We need to effectively reduce the size of the
solution space and/or reduce the problem size
Netlist clustering: Edge extraction in the
netlist

85
Layout Coarsening

Reduce Solution Space
Edge extraction in the solution space
Only simple things have been tried
GP, DP (Twolf)
2x1, 2x2, ...
Coarsen only easy parts

86
Incremental Placement

Given an optimal placement for a given netlist,
how to construct optimal placements for netlists
modified from the given netlist.
Very little research in this area.
Different types of incremental changes (in one
region, or all over)
Methods to use
How global should the method be
An extremely important problem.

87
Incremental Placement

A placement move changes the interconnect
capacitance and resistance of the associated net
A net topology approximation is required to
estimate these changes

88
Placynthesis Algorithms
buffering
resizing
restructuring
89
Many other Design Metrics: Power Supply and Total
Power
Source: "The Incredible Shrinking Transistor,"
Yuan Taur, T. J. Watson Research Center, IBM,
IEEE Spectrum, July 1999
90
Dual Voltages: A harder problem

Layout synthesis with dual voltages: major
geometric constraints

[Figure: Cell Library with Dual Power Rails (VH, VL, and GND rails, IN and OUT pins, feedthrough) and Layout Structure (H and L blocks)]
H: High Voltage Block; L: Low Voltage Block
91
Placement References

C. J. Alpert, T. Chan, D. J.-H. Huang, I. Markov, and K. Yan, "Quadratic Placement Revisited," Proc. 34th IEEE/ACM Design Automation Conference, 1997, pp. 752-757.
C. J. Alpert, J.-H. Huang, and A. B. Kahng, "Multilevel Circuit Partitioning," Proc. 34th IEEE/ACM Design Automation Conference, 1997, pp. 530-533.
U. Brenner and A. Rohe, "An Effective Congestion Driven Placement Framework," International Symposium on Physical Design, 2002, pp. 6-11.
A. E. Caldwell, A. B. Kahng, and I. L. Markov, "Can Recursive Bisection Alone Produce Routable Placements?" Proc. 37th IEEE/ACM Design Automation Conference, 2000, pp. 477-482.
M. A. Breuer, "Min-Cut Placement," J. Design Automation and Fault Tolerant Computing, 1(4), 1977, pp. 343-362.
J. Vygen, "Algorithms for Large-Scale Flat Placement," Proc. 34th IEEE/ACM Design Automation Conference, 1997, pp. 746-751.
H. Eisenmann and F. M. Johannes, "Generic Global Placement and Floorplanning," Proc. 35th IEEE/ACM Design Automation Conference, 1998, pp. 269-274.
S.-L. Ou and M. Pedram, "Timing Driven Placement Based on Partitioning with Dynamic Cut-Net Control," Proc. 37th IEEE/ACM Design Automation Conference, 2000, pp. 472-476.
C. M. Fiduccia and R. M. Mattheyses, "A Linear Time Heuristic for Improving Network Partitions," Proc. ACM/IEEE Design Automation Conference, 1982, pp. 175-181.