No slide title

About This Presentation

Title: No slide title

Description: A Priori System-Level Interconnect Prediction: The Road to Future Computer Systems. Dirk Stroobandt, Ghent University, Electronics and Information Systems Department. PowerPoint PPT presentation.

Slides: 94

Provided by: Dirk88

Transcript and Presenter's Notes

Title: No slide title

1
A Priori System-Level Interconnect Prediction: The Road to Future Computer Systems
Dirk Stroobandt, Ghent University, Electronics and Information Systems Department
Presentation at Intel, June 16th, 2000
2
Outline

Why do we need a priori interconnect prediction?
Basic models
Rent's rule with extensions and applications
A priori wirelength prediction
New evolutions
Applications
CAD (extrapolation, achievable routing, layer assignment)
Evaluation of new computer architectures
Circuit characterization
Conclusions

4
Why do we need a priori interconnect prediction?

The importance of wires increases (they do not scale as components do).
For future designs, very little is known.
Roadmapping uses a priori estimation techniques.
To improve CAD tools for design layout generation:
CAD tools have to take into account timing constraints, area constraints, performance, and power dissipation.
All these constraints imply that wires should be as short as possible.
Estimation at an early stage helps the CAD tools find a better solution in fewer design cycle iterations.

5
Why do we need a priori interconnect prediction?

To evaluate new computer architectures:
To meet the increasing performance demands, new computer architectures are needed.
Each of them must be evaluated thoroughly.
A priori estimates immediately provide a basis for drawing preliminary conclusions.
Different architectures can be compared with each other.
Applications: evaluating three-dimensional (opto-electronic) architectures, FPGAs, MCMs, ...

6
Components of the physical design step
[Figure: the circuit and the architecture are the inputs of layout generation, which produces the layout.]
7
Circuit model
[Figure labels: logic block, net, external net, terminal / pin.]
Multi-terminal nets have a net degree > 2.
8
Model for partitioning
[Figure: two partitionings of the same circuit, one with 8 nets cut and one with 4 nets cut.]
Optimal partitioning: minimal number of nets cut.
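The cut count used as the partitioning objective can be illustrated with a minimal sketch. The circuit, the block names, and the function name `nets_cut` are hypothetical and chosen only for illustration; a net counts as cut when its terminals end up in more than one module.

```python
def nets_cut(nets, side):
    """Count the nets whose blocks are spread over more than one module."""
    return sum(1 for net in nets if len({side[block] for block in net}) > 1)

# Hypothetical circuit: four logic blocks (a-d) and three nets;
# the last net is a multi-terminal net of degree 3.
nets = [("a", "b"), ("c", "d"), ("a", "b", "c")]

# Two candidate bipartitions, given as block -> module index.
good = {"a": 0, "b": 0, "c": 1, "d": 1}  # keeps tightly connected blocks together
bad = {"a": 0, "b": 1, "c": 0, "d": 1}   # separates the blocks of every net

print(nets_cut(nets, good), nets_cut(nets, bad))
```

A partitioner searches over assignments like `good` and `bad` for the one with the fewest nets cut.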
9
Model for partitioning
10
The three basic models
11
The three basic models
Optimal placement: the placement with minimal total wire length over all possible placements.

Optimal routing: routing through the shortest path.
This requires channels with sufficiently high density (or enough routing layers).
For multi-terminal nets: Steiner trees.
This defines the net length for known endpoints.

Placement and routing model
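As a small illustration of "net length for known endpoints" on a Manhattan grid, the sketch below computes the shortest-path length of a two-terminal net and, for multi-terminal nets, the half-perimeter of the bounding box. The half-perimeter is used here only as a simple lower bound on the rectilinear Steiner tree length; it is not the Steiner tree itself, and the function names are illustrative assumptions.

```python
def manhattan(a, b):
    """Shortest-path length between two grid cells (x, y) in a Manhattan grid."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def half_perimeter(terminals):
    """Half-perimeter of the bounding box of all terminals of a net.

    For a two-terminal net this equals the Manhattan distance; for larger
    nets it is a lower bound on the rectilinear Steiner tree length.
    """
    xs = [x for x, _ in terminals]
    ys = [y for _, y in terminals]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

two_terminal = [(0, 0), (3, 4)]
multi_terminal = [(0, 0), (2, 3), (4, 1)]
print(manhattan(*two_terminal), half_perimeter(multi_terminal))
```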
13
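Rent's rule
Rent's rule was first described by Landman and Russo, 1971.
For the average number of terminals and blocks per module:
T: average number of terminals per module
B: number of blocks per module
t: average number of terminals per block
p: Rent exponent, a measure for the complexity of the interconnection topology
(simple) $0 \le p \le 1$ (complex)
Normal values: $0.5 \le p \le 0.75$
[Figure: log-log plot of T (1 to 100) versus B (1 to 1000), showing the measured averages and the straight line of Rent's rule.]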
14
Rent's rule
If $\Delta B$ cells are added, what is the increase $\Delta T$?
In the absence of any other information we guess
$\Delta T = \frac{T}{B}\,\Delta B$
This is an overestimate: many of the $\Delta T$ terminals connect to the T terminals and so do not contribute to the total.
We introduce a factor p ($p < 1$) which indicates how self-connected the netlist is:
$\Delta T = p\,\frac{T}{B}\,\Delta B$
(statistically homogeneous system)
Or, if $\Delta B$ and $\Delta T$ are small compared to B and T:
$\frac{dT}{T} = p\,\frac{dB}{B}$
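Assuming the small-increment relation takes the form $dT/T = p\,dB/B$, integrating both sides gives the power law. The integration constant is written here as $\ln t$, so that t is the number of terminals of a module containing a single block; that naming is an assumption made to match the symbols used elsewhere in the talk.

```latex
\frac{dT}{T} = p\,\frac{dB}{B}
\quad\Longrightarrow\quad
\ln T = p \ln B + \ln t
\quad\Longrightarrow\quad
T = t\,B^{p}
```

Setting $B = 1$ gives $T = t$, which is why t can be read as the average number of terminals per block.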
15
Rent's rule
$T = t B^p$
Rent's rule is experimentally validated for many real circuits and for different partitioning methodologies.
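A minimal sketch of how a Rent exponent could be extracted from partitioning data: Rent's rule is a straight line in a log-log plot, so a least-squares fit of log T against log B yields p as the slope and t from the intercept. The data points below are synthetic (generated from the rule itself with t = 4 and p = 0.6), and the function names are illustrative assumptions.

```python
import math

def rent_terminals(blocks, t, p):
    """Rent's rule: average number of terminals of a module with `blocks` blocks."""
    return t * blocks ** p

def fit_rent(points):
    """Least-squares line through (log B, log T) pairs; returns (t, p)."""
    xs = [math.log(b) for b, _ in points]
    ys = [math.log(terminals) for _, terminals in points]
    n = len(points)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope

# Synthetic (B, T) averages, one per hierarchical level of a partitioning.
data = [(b, rent_terminals(b, t=4.0, p=0.6)) for b in (1, 4, 16, 64, 256)]
t_fit, p_fit = fit_rent(data)
print(t_fit, p_fit)
```

On measured data, the points at very high B would typically be left out of the fit, since they deviate from the straight line.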

Distinguish between three variants of p:
the intrinsic Rent exponent
the Rent exponent for a given placement
the Rent exponent for a given partitioning

[Figure: measured averages and Rent's rule.]
Deviation for high B and T: Rent's region II (see later).
16
Rent's rule
Rent's rule is a result of the self-similarity within circuits.
Assumption: interconnection complexity is equal at all levels.
17
Extension: the local Rent exponent

Variations in Rent's rule:
global variations (e.g., lower complexity after technology mapping of the circuit, duplication)
local variations.
Two kinds of local variations in Rent's rule:
hierarchical locality: some hierarchical levels are more complex than others
spatial locality: some circuit parts are more complex than others.
Both are deviations from Rent's rule that can be modelled well.

18
Hierarchical locality: Rent's region II

Causes of region II:
- pin limitation problem
- parallel to serial (complexity is moved from space to time, number of pins is lowered)
- coding (compact input and output streams).

[Figure: measured averages and Rent's rule.]
19
Hierarchical locality: region III

For some circuits there is also a deviation at the low end [Stroobandt, GLSVLSI 99].
Mismatch between the available (library) and the desired (design) complexity of the interconnect topology.
Only for circuits with logic blocks that have many inputs.

20
Hierarchical locality: modelling

Use the incremental Rent exponent (proportional to the slope of Rent's curve in a single point) [Van Marck et al., ISYCS 1995].

21
Spatial locality in Rent's rule

Inhomogeneous circuits: different parts have different interconnection complexity.
For separate parts:

Only one Rent exponent (heterogeneous) might not be realistic. Clustering: simple parts will be absorbed by complex parts.
22
Local Rent exponent

Higher partitioning levels: Rent exponents will merge.
Spreading of the values, with a steep slope (decreasing) for the complex part and a gentle slope (increasing) for the simple part.
Local Rent exponent: tangent slope of the line that combines all partitions containing the local block(s).

[Figure: T versus B for partitions of a circuit with two parts, labelled 1 and 2.]
23
Heterogeneous Rent's rule

Suggested by Zarkesh-Ha, Davis, Loh, and Meindl, 98.
Weighted arithmetic average of the logarithm of T.
Heterogeneous Rent's rule (for 2 parts):
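A minimal sketch of what a weighted arithmetic average of log T implies, under one loudly stated assumption: each part i obeys its own Rent's rule with parameters (t_i, p_i), and the weights are the fractions of blocks in each part. The weights and the function name are assumptions for illustration, not taken from the slide. Averaging log T_i = log t_i + p_i log B then gives an equivalent exponent that is the weighted arithmetic mean of the p_i and an equivalent t that is the weighted geometric mean of the t_i.

```python
import math

def heterogeneous_rent(parts):
    """Combine per-part Rent parameters into one equivalent (t_eq, p_eq).

    parts: list of (n_blocks, t, p), one tuple per homogeneous part.
    Weights are the block fractions n_blocks / total (an assumption).
    """
    total = sum(n for n, _, _ in parts)
    p_eq = sum(n * p for n, _, p in parts) / total
    log_t_eq = sum(n * math.log(t) for n, t, _ in parts) / total
    return math.exp(log_t_eq), p_eq

# Two hypothetical parts of equal size: a simple one and a complex one.
t_eq, p_eq = heterogeneous_rent([(100, 4.0, 0.5), (100, 4.0, 0.7)])
print(t_eq, p_eq)
```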

24
Use of Rent's rule in CAD

Rent's rule is very powerful as a measure of the complexity of the interconnection topology.
It can aid in the partitioning process.
Benchmark generators are based on Rent's rule.
It is the basis for a priori estimates in CAD.

25
Rent's rule in partitioning

Actual goal: minimize the number of pins per module.
We should use a pin count criterion.
External multi-terminal nets lead to only one new pin instead of two when cut.
Preferring external nets to be cut will better keep clusters together.

26
Rent's rule in partitioning

Solution: use a new ratio value (in ratio cut partitioning) based on terminal count [Stroobandt, ISCAS 99].
Better partitions are obtained because the total number of pins for each module is taken into account by the cost function.

27
Rent's rule in partitioning

Better (ratio cut) heuristic by using terminal count prediction [Stroobandt, ISCAS 99].
Clustering property of the ratio cut: use Rent's rule instead of a uniformly distributed random graph.
New ratio:
Instead of old ratio:

28
Rent's rule in partitioning

Important (especially in pin-limited designs): terminal balancing [Stroobandt, Swiss CAD/CAM 99].
Minimizing the terminal count alone is not enough.

Additional cost function for terminal balancing:
29
Rent's rule in benchmark generation

Generating benchmarks in a hierarchical way:
Rent's rule is used for estimating the number of connections.
Other parameters have to be controlled as well.
Classical parameters:
total number of gates
total number of nets
total number of pins
Gate terminal distribution
Net degree distribution
Other issues: gate functionality, redundancy, timing constraints, ...

31
Donath's hierarchical placement model
1. Partition the circuit into 4 modules of equal size such that Rent's rule applies (minimal number of pins).
2. Partition the Manhattan grid into 4 subgrids of equal size in a symmetrical way.
32
Donath's hierarchical placement model
3. Each subcircuit (module) is mapped to a subgrid.
4. Repeat recursively until all logic blocks are assigned to exactly one grid cell in the Manhattan grid.
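The four steps can be sketched as a recursive quadrisection. This is only an illustration of the recursion structure: the partitioner below is a placeholder that splits the block list in order, whereas the model assumes a partitioning with a minimal number of pins. The function name `place` and the requirement that the number of blocks is a power of 4 are assumptions of this sketch.

```python
def place(blocks, x0=0, y0=0, size=None, out=None):
    """Recursively map `blocks` (length 4**k) onto a size x size Manhattan grid."""
    if out is None:
        out = {}
    if size is None:
        size = round(len(blocks) ** 0.5)
    if len(blocks) == 1:
        out[blocks[0]] = (x0, y0)  # step 4: one block per grid cell
        return out
    quarter = len(blocks) // 4
    half = size // 2
    # Step 1 (placeholder): split the list in order into 4 equal modules.
    # A real flow would use a min-cut partitioner here.
    modules = [blocks[i * quarter:(i + 1) * quarter] for i in range(4)]
    # Step 2: the four subgrids, identified by their lower-left corners.
    corners = [(x0, y0), (x0 + half, y0), (x0, y0 + half), (x0 + half, y0 + half)]
    # Step 3: map each module to a subgrid, then recurse.
    for module, (cx, cy) in zip(modules, corners):
        place(module, cx, cy, half, out)
    return out

positions = place(list(range(16)))
print(positions[0], positions[5], positions[15])
```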
33
Donath's length estimation model

At each level, Rent's rule gives the number of connections:
the number of terminals per module follows directly from Rent's rule (partitioning-based Rent exponent p)
every net not cut before (internal net): 2 new terminals
every net previously cut (external net): 1 new terminal
assumption: the ratio f = (internal nets)/(nets cut) is constant over all levels k [Stroobandt and Kurdahi, GLSVLSI 98]
the number of nets cut at level k ($N_k$) equals $\alpha$ times the number of new terminals at that level,
where $\alpha = 1/(1+f)$. $\alpha$ depends on the total number of nets in the circuit and is bounded by 0.5 and 1.
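A hedged sketch of how the number of nets cut at one level could be computed from these rules. If a fraction f of the nets cut is internal (2 new terminals each) and the rest is external (1 new terminal each), then the new terminals number (1 + f) times the nets cut, so the nets cut are the new terminals divided by (1 + f). The level convention (modules of 4**k blocks at level k), the parameter names, and the function names are assumptions of this sketch, not the slide's notation.

```python
def total_terminals(G, t, p, module_size):
    """Terminals summed over all G / module_size modules, using Rent's rule."""
    return (G / module_size) * t * module_size ** p

def nets_cut_at_level(G, t, p, f, k):
    """Nets cut when modules of 4**(k+1) blocks are split into modules of 4**k blocks.

    f is the fraction of the nets cut that were still internal before the cut.
    """
    alpha = 1.0 / (1.0 + f)  # between 0.5 (f = 1) and 1 (f = 0)
    new_terminals = (total_terminals(G, t, p, 4 ** k)
                     - total_terminals(G, t, p, 4 ** (k + 1)))
    return alpha * new_terminals

# Hypothetical circuit: 16 gates, 4 terminals per gate, Rent exponent 0.5.
print(nets_cut_at_level(G=16, t=4.0, p=0.5, f=1.0, k=0))
```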

34
Donath's length estimation model
Length of the connections at level k?
Adjacent (A-) combination
Diagonal (D-) combination
Donath assumes all connection source and destination cells are uniformly distributed over the grid.
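Under the uniform-distribution assumption, the average Manhattan length for both combinations can be checked by brute-force enumeration. In this sketch the two subgrids are squares of side `lam` (a name chosen here for illustration); the source cell ranges over one subgrid and the destination cell over the adjacent or diagonal neighbour.

```python
from itertools import product

def average_length(lam, dx_offset, dy_offset):
    """Average Manhattan distance between a cell in one lam x lam subgrid and
    a cell in a second subgrid shifted by (dx_offset, dy_offset)."""
    cells = list(product(range(lam), repeat=2))
    total = 0
    for (x1, y1), (x2, y2) in product(cells, cells):
        total += abs(x2 + dx_offset - x1) + abs(y2 + dy_offset - y1)
    return total / len(cells) ** 2

def adjacent(lam):
    """A-combination: the second subgrid is the horizontal neighbour."""
    return average_length(lam, lam, 0)

def diagonal(lam):
    """D-combination: the second subgrid is the diagonal neighbour."""
    return average_length(lam, lam, lam)

print(adjacent(2), diagonal(2))
```

For these uniform assumptions the enumeration agrees with the closed forms (4 lam^2 - 1) / (3 lam) for the adjacent case and 2 lam for the diagonal case.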
35
Average interconnection length

Number of connections at level k
Average length A-combination
Average length D-combination
Average length level k
Total average length with
and 2K G total number of gates

36
Results Donath
Scaling of the average length L as a function of
the number of logic blocks G
Similar to measurements on placed designs.
37
Results Donath
Theoretical average wire length is too high by a factor of 2.
38
Including optimal placement model

Keep the wire length scaling by hierarchical placement.
Improve on the uniform probability for all connections at one level (not a good model for an optimal placement).

Enumeration: site density function (only architecture dependent). Occupying probability: favours short interconnections (for an optimal placement) (darker in the figure).
39
Including optimal placement model

Wirelength distributions contain two parts: the site density function and the probability distribution.

all possibilities: requires enumeration (use generating polynomials) [Stroobandt and Van Marck, SLIP 2000]
probability of occurrence: shorter wires are more probable [Stroobandt, VLSI Design vol. 10, 1999]
40
Wire length distribution
Local distributions at each level have similar shapes (self-similarity), so the peak values scale. The integral of the local distributions equals the number of connections. The global distribution follows the peaks.

From this we can deduce that
For short lengths

41
Occupying probability results
Use probability on each hierarchical level (local
distributions).
[Figure: average wire length L (0 to 8) versus number of gates G (10 to 10000, logarithmic axis); curves: occupying probability, Donath, experiment.]
42
Occupying probability results
Effect of the occupying probability: boosting the local wire length distributions (per level) for short wire lengths.
[Figure: two log-log panels, "Occupying prob." and "Donath", showing the percentage of wires versus wire length (1 to 10000); curves in each panel: per level, total, global trend.]
43
Occupying probability results
Effect of the occupying probability on the total distribution: more short wires and fewer long wires, so the average wire length is shorter.
[Figure: log-log plot of the percentage of wires versus wire length (1 to 10000); curves: Donath, occupying probability.]
44
Occupying probability results
[Figure: percentage of wires (up to 60) versus wire length (1 to 10); curves: Donath, occupying probability, global trend.]
45
Occupying probability results
[Figure: log-log plot of the number of wires versus wire length (1 to 100); curves: Donath, occupying probability, measurement.]
46
Davis probability function

Introduced by Davis, De, and Meindl, IEEE T. El. Dev., 98.
The number of interconnections at distance l is calculated for every gate separately, using Rent's rule.
Three regions: the gate under investigation (A), the target gates (C), and the gates in between (B).
The number of connections between A and C is calculated.

This approach alleviates the discrete effects at the boundaries of the hierarchical levels while maintaining the scaling behaviour.
47
Davis probability function

$T_{A \to C} = T_{AB} + T_{BC} - T_B - T_{ABC}$

[Figure: gate A surrounded by the gates in between (B) and the target gates (C).]
Assumption: a net cannot connect A, B, and C.
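A minimal sketch of the terminal-count balance between the three regions, assuming it takes the form T(A to C) = T(AB) + T(BC) - T(B) - T(ABC) and that each term is given by Rent's rule applied to the number of gates in the corresponding region. The function and parameter names are illustrative assumptions.

```python
def rent(n_gates, t, p):
    """Rent's rule for a region containing n_gates gates."""
    return t * n_gates ** p

def terminals_a_to_c(n_a, n_b, n_c, t, p):
    """Terminals connecting region A to region C, from the balance
    T(AB) + T(BC) - T(B) - T(ABC)."""
    return (rent(n_a + n_b, t, p) + rent(n_b + n_c, t, p)
            - rent(n_b, t, p) - rent(n_a + n_b + n_c, t, p))

# One gate under investigation, 3 gates in between, 5 target gates.
print(terminals_a_to_c(n_a=1, n_b=3, n_c=5, t=4.0, p=0.66))
```

For p = 1 the four terms cancel exactly and the result is zero; for p < 1 the result is positive.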
48
Davis probability function
For cells placed in infinite 2D plane
49
Planar wirelength model A
Finite system, $B_{tot} = L^2$, no edges, approximate form for q(l)
50
Planar wirelength model B (Davis)
Finite system, $B_{tot} = L^2$, includes edge effects, use q(l)
51
Planar wirelength model comparison
$B_{tot} = 1024$, $p = 0.66$
Model A: $L_{av} = 4.53$
Model B: $L_{av} = 2.27$
[Figure: wire length distributions of model A and model B.]
52
Relationship between models from Davis (planar
model B) and Stroobandt (hierarchical model C)
$D_b(l)$
$D_c(l,h)$
Same q(l): essentially identical!
53
Hierarchical wirelength model comparison
$C_{tot} = 1024$, $p = 0.66$
Model C (Stroobandt): $L_{av} = 2.05$
Model D (Donath): $L_{av} = 5.14$
Model C: q(l) and hierarchy
Model D: only hierarchy ($q(l) = 1$)
[Figure: wire length distributions of model C and model D.]
54
Planar and hierarchical model comparison
[Figure: wire length distributions of models A, B, C, and D.]
Models B (planar) and C (hierarchical) are equivalent if the Rent exponent used for the probability function (which depends on the placement) and the one used for the number of nets per hierarchical level (based on partitioning) are the same.
56
Extension to three-dimensional grids
57
Anisotropic systems
58
Anisotropic systems

Basic method: Donath's method in 3D.
Not all dimensions are equal (e.g., optical links in the 3rd dimension):
possibly larger latency of the optical link (compared to an intra-chip connection)
influence of the spacing of the optical links across the area (detours may have to be made)
limitation of the number of optical layers
Introducing an optical cost

59
Anisotropic systems

If the number of layers is limited: use the third dimension for the topmost hierarchical levels (fewest interconnections).
For lower levels: 2D method.
2D and 1D partitioning are sometimes used to get closer to the cubic form (optimal in isotropic grids).
Depending on the optical cost, it is advantageous either to get to the electrical planes as soon as possible (high optical cost, use at high levels only) or to partition the electrical planes first (low optical cost).

60
External nets

Importance of good wire length estimates for external nets during the placement process.
For highly pin-limited designs: placement will be in a ring-shaped fashion (along the border of the chip).

61
Wire lengths at system level

At system level: many long wires (peak in the distribution).

How to model these? Davis and Meindl 98: estimation based on Rent's rule with the floorplanning blocks as logic blocks. IMPORTANT!
63
Improving CAD tools for design layout
Digital design is complex: computer-aided design (CAD).

More efficient layout generation requires good
wire length estimates.
Layer assignment in routing
effects of vias, blockages
congestion, ...

A priori estimates are rough but can already
provide us with a lot of information.
64
Evaluating new computer architectures

Estimation for evaluating and comparing different architectures.

Circuit characterization
We need parameters to classify circuits into classes and to optimize them. Benchmark generation is based on Rent's rule.
66
Technology extrapolation

Evaluates the impact of
design technology
process technology
Evaluates the impact on
achievable design
associated design problems
Sets new requirements for CAD tools and methodologies
Roadmaps: a familiar and influential example

Questions to be addressed:
What is the most power-efficient noise management strategy?
How and when do L, SOI, SER, etc. matter?
Will layout tools need to perform process simulation to efficiently model cross-die and cross-wafer manufacturing variation?
67
Current extrapolation systems

Previous and ongoing efforts:
ITRS Roadmaps
Tools: SUSPENS, GENESYS, RIPE, BACPAC, ...
Numerous tools in industry
Use models for
delay
power
architecture
wirelength estimation
...

68
GTX: GSRC technology extrapolation system

GTX is set up as a framework for technology extrapolation [Caldwell et al., DAC 2000].

69
GTX

Check it out at http://vlsicad.cs.ucla.edu/GSRC/GTX/

70
Models of achievable routing

wirelength estimation models (Donath, ...)
actual placement information

Required versus available resources
71
Models of achievable routing
Required versus available resources
limited by the routing efficiency factor $\eta_r$
72
Models of achievable routing
Required versus available resources
limited by power/ground (signal net fraction $s_i$)

73
Models of achievable routing
Required versus available resources
limited by the via impact factor $v_i$ (ripple effect) and the utilization factor $U_i$ (available / supplied area)
74
Use of achievable routing models

Optimizing interconnect process parameters for future designs (number of layers, wire width and pitch per layer, ...).
With given layer characteristics: predict the number of layers needed.
If the number of layers is fixed: oracle "(not) routable!"
(SUSPENS, GENESYS, RIPE, BACPAC, GTX)
Supplying objectives that guide layout tools to promising solutions (wire planning) [Kahng, Mantik and Stroobandt, ISPD 2000].

75
Layer assignment in routing

DSM design: routing tools have to account for
delay constraints
yield
power
Conventional technique:
the router assigns wires to layers
wire sizing and repeater insertion/sizing are applied
More interesting approach:
wire sizing etc. are used by the router to assign wires [Kahng and Stroobandt, SLIP 2000]

76
Our Layer Assignment Concept

Search for the optimal layer for a wire, with
optimal wire size, number and size of repeaters for each wire
meeting consistent stage delay constraints
taking a total repeater area constraint into account
accounting for the impact of vias
A priori estimation techniques make it useful both before and after placement.
Potential applications:
improving CAD layout tools
studying effects of technological parameters
optimizing the fabrication process

77
Problem and Models
Find the optimal assignment of wires to wiring layers subject to delay constraints and total repeater area constraints.

Optimization objective: number of layers needed
Degrees of freedom (for each wire):
choice of layer parameters
wire width
number of repeaters
size of the repeaters

78
Layer Assignment method

2 phases: optimize, then round to integers.
Phase 1:

Calculate the minimal delay $T_{min}$.
79
A Typical Example
[Figure: wire width (mm), delay (ps), and number of repeaters versus wirelength (mm) for tier types 0, 1, and 2.]
80
Target Delay Influence
[Figure: delay (ps) and wire width (mm) versus wirelength (mm) for tier types 0, 1, and 2.]
81
Uniform Versus Non-uniform Stacks
82
Optimal Layer Stack Monotonic?
84
Opto-electronic FPGAs
Research question: is the use of massive optical interconnects at the logic level in general-purpose electrical systems meaningful?

The answer depends on whether the properties of the optical interconnections are comparable to (or better than) those of the electrical ones they replace or complement.
Such a situation presents itself in electrical FPGAs.

85
Area I/O in FPGAs

Future multi-FPGA emulators will face the following problems:
ASICs will keep growing: the need for multi-FPGA emulators will stay
FPGAs will have increasing numbers of CLBs
Complex designs have high Rent exponents
So pin and inter-FPGA interconnect limitations will keep increasing.
Area I/O provides significant benefits in FPGAs.

J. Depreitere, H. Van Marck, J. Van Campenhout, Microprocessors and Microsystems 21, 1998, pp. 89-97
86
Why Area I/O in FPGAs?
(a) gain because wires do not have to be routed all the way to the perimeter
(b) gain when pin limitation problems are also considered

Area I/O provides significant benefits in FPGAs.

J. Depreitere, H. Van Marck, J. Van Campenhout, Microprocessors and Microsystems 21, 1998, pp. 89-97
87
Why 3D FPGAs?

Different asymptotic average wire length

Two-dimensional
Three-dimensional
J. Van Campenhout, H. Van Marck, J. Depreitere, J. Dambre, IEEE J. Sel. Topics in Quant. Electr. on Smart Phot. Comp., Interconnects, and Proc. 5(2), 1999, pp. 306-315
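The difference in asymptotic behaviour can be illustrated with a small sketch. It assumes the commonly quoted scaling for Donath-style hierarchical models, in which the average wire length grows as G^(p - 1 + 1/d) in a d-dimensional grid when p > 1 - 1/d and grows at most logarithmically otherwise. That scaling law and the function name are assumptions of this sketch.

```python
def length_scaling_exponent(p, dims):
    """Exponent e in L ~ G**e for large G; 0.0 means bounded or logarithmic growth."""
    return max(0.0, p - 1.0 + 1.0 / dims)

# For a complex design (p = 0.75) the exponent drops when moving from 2D to 3D.
for dims in (2, 3):
    print(dims, length_scaling_exponent(0.75, dims))
```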
88
Why 3D FPGAs?

The wire length distribution differs significantly.

J. Van Campenhout, H. Van Marck, J. Depreitere, J. Dambre, IEEE J. Sel. Topics in Quant. Electr. on Smart Phot. Comp., Interconnects, and Proc. 5(2), 1999, pp. 306-315
89
Effect of Anisotropy

Benefits are lower if anisotropy is higher

[Figure: two plots of the average wire length for a 4096-gate design, one versus the relative anisotropic cost and one versus the number of layers, with curves for cost 1, 8, 16, and 24.]
J. Van Campenhout, H. Van Marck, J. Depreitere, J. Dambre, IEEE J. Sel. Topics in Quant. Electr. on Smart Phot. Comp., Interconnects, and Proc. 5(2), 1999, pp. 306-315
90
What Could It Look Like?

A 3-D extension of the electrical on-chip interconnect fabric
offers a highly compact and densely interconnected multi-FPGA system
should provide an essentially 3-D routing environment, leading to shorter average wire lengths and hence faster systems
should provide increased routability of complex designs
Other (hierarchical) interconnect schemes could be envisioned.

91
Overview of the built system
Each FPGA has 128 optical receivers and 128 transmitters.
The system is designed for 80 MHz (85 MHz measured).
The FPGA chip contains about 165,000 transistors.
About 1/3 of the FPGA chip is used for the optics.
92
An optical prototype

[Figure labels: 0.6 mm chip; standard 145-pin PGA socket; 4 x 8 InP detector arrays; 4 x 8 LEDs (VCSELs).]

93
Conclusion

Wire length estimates are becoming more and more important.
A priori estimates can provide a lot of information at virtually no cost.
Methods are based on Rent's rule.
Important for future research: how can we build a priori estimates into CAD layout tools?
More information at http://www.elis.rug.ac.be/dstr/
Check out the International Workshop on System-level Interconnect Prediction (SLIP) at http://vlsicad.cs.ucla.edu/SLIP2000/