Title: Simulated Evolution Algorithm for Multi-Objective VLSI Netlist Bi-Partitioning
1Simulated Evolution Algorithm for Multi-Objective
VLSI Netlist Bi-Partitioning
- Sadiq M. Sait, Aiman El-Maleh, Raslan Al Abaji
- King Fahd University of Petroleum & Minerals
- Dhahran, Saudi Arabia
- 27 May 2003, IEEE ISCAS, Bangkok, Thailand
2Outline
- Introduction
- Problem Formulation
- Cost Functions
- Proposed Approaches
- Experimental results
- Conclusion
3VLSI Technology Trends
Design Characteristics: Key CAD Capabilities
- 7.5M, 333 MHz, 0.25 um: Cycle-based simulation, Formal Verification
- 3.3M, 200 MHz, 0.6 um: Top-Down Design, Emulation
- 1.2M, 50 MHz, 0.8 um: HDLs, Synthesis
- 0.13M, 12 MHz, 1.5 um: CAE Systems, Silicon compilation
- 0.06M, 2 MHz, 6 um: SPICE Simulation
The challenges of sustaining such fast growth to
achieve giga-scale integration have shifted, to a
large degree, from manufacturing process
technology to design technology.
4The VLSI Chip in 2006
- Technology: 0.1 um
- Transistors: 200 M
- Logic gates: 40 M
- Size: 520 mm2
- Clock: 2 - 3.5 GHz
- Chip I/Os: 4,000
- Wiring levels: 7 - 8
- Voltage: 0.9 - 1.2 V
- Power: 160 Watts
- Supply current: 160 Amps
Tradeoffs!!! Performance, Power consumption, Noise immunity, Area, Cost, Time-to-market
5VLSI Design Cycle
The VLSI design process is carried out at a number of
levels:
- System Specification
- Functional Design
- Logic Design
- Circuit Design
- Physical Design
- Design Verification
- Fabrication
- Packaging, Testing, and Debugging
6Physical Design
The physical design cycle consists of
- Partitioning
- Floorplanning and Placement
- Routing
- Compaction
Physical design converts a circuit description
(behavioral/structural) into a geometric
description, which is used to manufacture a chip.
7Why do we need Partitioning?
- Decomposition of a complex system into smaller subsystems.
- Each subsystem can be designed independently, speeding up the design process (divide-and-conquer approach).
- Dividing a complex IC into a number of functional blocks, each of them designed by one engineer or a team of engineers.
- The partitioning scheme has to minimize the interconnections between subsystems.
8Levels of Partitioning
System
System Level Partitioning
PCBs
Board Level Partitioning
Chips
Chip Level Partitioning
Subcircuits / Blocks
9Classification of Partitioning Algorithms
Partitioning Algorithms
- Group Migration
  - Kernighan-Lin
  - Fiduccia-Mattheyses (FM)
  - Multilevel K-way Partitioning
- Iterative Heuristics
  - Simulated annealing
  - Simulated evolution
  - Tabu Search
  - Genetic
- Performance Driven
  - Lawler et al.
  - Vaishnav
  - Choi et al.
  - Junichiro et al.
- Others
  - Spectral
  - Multilevel Spectral
10Related previous Works
1999: Choi et al. proposed two low-power-oriented techniques based on the simulated annealing (SA) algorithm.
1969: Lawler et al. proposed a bottom-up approach (clustering) for delay optimization.
1998: Junichiro et al. proposed a circuit partitioning algorithm under a path delay constraint. The algorithm consists of clustering and iterative improvement phases.
1999: Vaishnav et al. proposed an enumerative partitioning algorithm targeting low power. It enumerates alternate partitionings and selects one that has the same delay but lower power dissipation (not feasible for very large circuits).
11Motivation
- Need for power optimization
- Portable devices
- Power consumption is a hindrance to further integration
- Increasing clock frequency
- Need for delay optimization
- In current sub-micron designs, wire delay tends to dominate gate delay.
- Larger die sizes imply long on-chip global routes, which affect performance
- Optimizing delay due to off-chip capacitance
12Objective
- Design a class of iterative algorithms for VLSI multi-objective partitioning.
- Explore partitioning from a wider angle and consider circuit delay, power dissipation, and interconnect at the same time, under a given balance constraint.
13Problem formulation
- Objectives
- Power cost is optimized
- Delay cost is optimized
- Cutset cost is optimized
- Constraint
- Balanced partitions to a certain tolerance degree (10%)
14Problem formulation
- The circuit is modeled as a hypergraph $H(V, E)$, where $V = \{v_1, v_2, v_3, \ldots, v_n\}$ is a set of modules (cells).
- $E = \{e_1, e_2, e_3, \ldots, e_k\}$ is a set of hyperedges, the set of signal nets. Each net is a subset of $V$ containing the modules that the net connects.
- A two-way partitioning of a set of nodes $V$ determines two subsets $V_A$ and $V_B$ such that $V_A \cup V_B = V$ and $V_A \cap V_B = \emptyset$.
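The two-way partitioning conditions can be checked mechanically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the balance test (each block's share of cells within 0.5 plus or minus a tolerance) is only one assumed reading of the balance constraint.

```python
from typing import Set


def is_valid_bipartition(cells: Set[str], block_a: Set[str], block_b: Set[str]) -> bool:
    """True when block_a and block_b cover all cells and share none."""
    return (block_a | block_b) == cells and not (block_a & block_b)


def is_balanced(block_a: Set[str], block_b: Set[str], tolerance: float = 0.10) -> bool:
    """Assumed balance test: block A's share of the cells stays within 0.5 +/- tolerance."""
    total = len(block_a) + len(block_b)
    if total == 0:
        return True
    return abs(len(block_a) / total - 0.5) <= tolerance


cells = {"v1", "v2", "v3", "v4"}
print(is_valid_bipartition(cells, {"v1", "v2"}, {"v3", "v4"}))
print(is_balanced({"v1"}, {"v2", "v3", "v4"}))
```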
15Cutset
- Based on hypergraph model H (V, E)
- Cost 1: $c(e) = 1$ if $e$ spans more than 1 block
- Cutset = sum of hyperedge costs
- Efficient gain computation and update
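As an illustration of the cutset cost, the sketch below counts the hyperedges that span more than one block. The data layout (a dict from net name to its cells, and a dict from cell to block label) is an assumption made for the example. It recomputes the cost from scratch and does not attempt the efficient incremental gain update mentioned above.

```python
def cutset_size(nets, block_of):
    """Sum of hyperedge costs: each net spanning more than one block costs 1."""
    return sum(
        1
        for cells in nets.values()
        if len({block_of[c] for c in cells}) > 1
    )


# Hypothetical 4-cell circuit with three nets.
nets = {"e1": {"v1", "v2"}, "e2": {"v2", "v3"}, "e3": {"v3", "v4"}}
block_of = {"v1": "A", "v2": "A", "v3": "B", "v4": "B"}
print(cutset_size(nets, block_of))
```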
16Delay Model
- Path $\pi$: SE1 → C1 → C4 → C5 → SE2.
- $\mathrm{Delay}(\pi) = CD_{SE1} + CD_{C1} + CD_{C4} + CD_{C5} + CD_{SE2}$
- $CD_{C1} = BD_{C1} + LF_{C1} \times (C_{offchip} + C_{INP_{C2}} + C_{INP_{C3}} + C_{INP_{C4}})$
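The delay model above can be sketched in a few lines. This assumes a cell delay of base delay plus load factor times total driven capacitance, with the off-chip term added only when the driven net is cut. Function and parameter names are hypothetical and the numbers are made up.

```python
def cell_delay(base_delay, load_factor, input_caps, offchip_cap=0.0):
    """Cell delay = base delay + load factor * (off-chip cap + sum of driven input caps)."""
    return base_delay + load_factor * (offchip_cap + sum(input_caps))


def path_delay(cell_delays):
    """Path delay is the sum of the delays of the cells along the path."""
    return sum(cell_delays)


# Cell C1 drives C2, C3 and C4; its output net is cut, so the off-chip term applies.
cd_c1 = cell_delay(base_delay=1.0, load_factor=0.5,
                   input_caps=[0.2, 0.2, 0.2], offchip_cap=0.4)
print(cd_c1)
print(path_delay([0.3, cd_c1, 1.1, 0.9, 0.3]))
```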
17Delay
$\mathrm{Delay}(P_i)$
$P_i$ is any path between 2 cells or nodes; $P$ is the set of all paths of the circuit.
18Power
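The average dynamic power consumed by a CMOS logic
gate in a synchronous circuit is given by
$P_i^{average} = 0.5 \times \frac{V_{dd}^2}{T_{cycle}} \times C_i^{load} \times N_i$
$N_i$ is the number of output gate transitions per
cycle (switching probability)
$C_i^{load}$ is the load capacitance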
19Power
- Load capacitances driven by a cell before partitioning
- Additional load due to off-chip capacitance (cut net)
- Total power dissipation of a circuit
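A rough sketch of how these three quantities combine, assuming the standard CMOS switching-power expression $0.5 \times V_{dd}^2 / T_{cycle} \times C^{load} \times N$ per gate. The function names and the tuple layout (switching probability, load before partitioning, extra off-chip load) are hypothetical, and the values are made up.

```python
def gate_power(switching, c_load, vdd=1.0, t_cycle=1.0):
    """Assumed standard expression: 0.5 * Vdd^2 / Tcycle * C_load * N."""
    return 0.5 * vdd ** 2 / t_cycle * c_load * switching


def total_power(gates, vdd=1.0, t_cycle=1.0):
    """Sum gate powers; each gate is (switching, c_basic, c_extra).

    c_basic is the load driven before partitioning; c_extra is the additional
    off-chip load, which is 0.0 unless the gate drives a cut net.
    """
    return sum(
        gate_power(n, c_basic + c_extra, vdd, t_cycle)
        for n, c_basic, c_extra in gates
    )


gates = [(0.5, 2.0, 0.0),   # gate whose output net is not cut
         (1.0, 1.0, 1.0)]   # gate driving a cut net
print(total_power(gates))
```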
20Power
Can be assumed identical for all nets
Set of Visible gates Driving a load outside the
partition.
21Unifying Objectives, How ?
- Problems in choosing weights.
- Need to tune for every circuit.
22Fuzzy logic for cost function
- Imprecise values of the objectives
- Best represented by linguistic terms that are the basis of fuzzy algebra
- Conflicting objectives
- Operators for aggregating function
23Fuzzy logic for Multi-objective function
- The cost to membership mapping
- Linguistic fuzzy rule for combining the membership values in an aggregating function
- Translation of the linguistic rule in the form of appropriate fuzzy operators
- And-like operators: Min operator $\mu = \min(\mu_1, \mu_2)$
- And-like OWA: $\mu = \beta \times \min(\mu_1, \mu_2) + \frac{1}{2}(1 - \beta)(\mu_1 + \mu_2)$
- Or-like operators: Max operator $\mu = \max(\mu_1, \mu_2)$
- Or-like OWA: $\mu = \beta \times \max(\mu_1, \mu_2) + \frac{1}{2}(1 - \beta)(\mu_1 + \mu_2)$
- Where $\beta$ is a constant in the range $[0, 1]$
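The two OWA operators can be written compactly. The sketch below assumes the averaging term generalizes from two memberships to any number by using the mean, which matches the half-sum form for two values. The function names and the parameter name `beta` are hypothetical.

```python
def and_like_owa(memberships, beta):
    """beta * min(memberships) + (1 - beta) * mean(memberships)."""
    return beta * min(memberships) + (1 - beta) * sum(memberships) / len(memberships)


def or_like_owa(memberships, beta):
    """beta * max(memberships) + (1 - beta) * mean(memberships)."""
    return beta * max(memberships) + (1 - beta) * sum(memberships) / len(memberships)


# With beta = 1 the operators reduce to plain min and max.
print(and_like_owa([0.2, 0.8], beta=0.5))
print(or_like_owa([0.2, 0.8], beta=0.5))
```

A larger `beta` makes the AND-like operator behave more like a strict minimum, so the worst objective dominates the aggregate.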
24Membership functions
Where $O_i$ and $C_i$ are the lower bound and actual cost
of objective $i$, $\mu_i(x)$ is the membership of
solution $x$ in the set "good $i$", and $g_i$ is the relative
acceptance limit for each objective.
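One plausible shape for such a membership function is piecewise linear in the ratio of actual cost to lower bound. The sketch below is an assumption, not a formula recovered from the slide: membership is 1 at or below the lower bound, 0 once the cost reaches $g_i$ times the lower bound, and linear in between. The function name is hypothetical.

```python
def membership(cost, lower_bound, g):
    """Assumed piecewise-linear membership in the fuzzy set 'good'.

    cost: actual cost C_i; lower_bound: O_i (> 0); g: acceptance limit g_i (> 1).
    """
    if lower_bound <= 0 or g <= 1:
        raise ValueError("expected lower_bound > 0 and g > 1")
    ratio = cost / lower_bound
    if ratio <= 1.0:
        return 1.0
    if ratio >= g:
        return 0.0
    return (g - ratio) / (g - 1.0)


print(membership(cost=150, lower_bound=100, g=2.0))
```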
25Fuzzy linguistic rule
- A good partitioning can be described by the following fuzzy rule:
- IF solution has small cutset AND
- low power AND
- short delay AND
- good balance
- THEN it is a good solution
26Fuzzy cost function
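The above rule is translated to an AND-like OWA, with $\beta$ a constant in $[0, 1]$:
$\mu(x) = \beta \times \min(\mu_C(x), \mu_P(x), \mu_D(x), \mu_B(x)) + (1 - \beta) \times \frac{1}{4} \sum_{j \in \{C, P, D, B\}} \mu_j(x)$
$\mu(x)$ represents the total fuzzy fitness of the
solution; our aim is to maximize this fitness.
$\mu_C(x)$, $\mu_P(x)$, $\mu_D(x)$, $\mu_B(x)$ are, respectively, the (Cutset, Power, Delay, Balance)
fitness.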
27Simulated Evolution
- Algorithm Simulated evolution
- Begin
- Start with an initial feasible partition S
- Repeat
- Evaluation: Evaluate the Gi (goodness) of all modules
- Selection:
- For each Vi (cell) DO
- begin
- if Random Rm > Gi then select the cell
- End For
- Allocation: For each selected Vi (cell) DO
- begin
- Move the cell to destination block.
- End For
- Until stopping criteria are satisfied.
- Return best solution.
- End
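The loop above can be sketched as runnable code under several simplifying assumptions that are not taken from the slides: goodness is the fraction of a cell's nets that are uncut, the destination block is simply the other block, a move is skipped when it would push the block sizes more than `max_imbalance` apart, the stopping criterion is a fixed iteration count, and only cutset is optimized. All names are hypothetical.

```python
import random


def cut_size(nets, block_of):
    """Number of nets whose cells lie in more than one block."""
    return sum(1 for cells in nets.values()
               if len({block_of[c] for c in cells}) > 1)


def goodness(cell, nets, block_of):
    """Assumed goodness: fraction of the cell's nets that are not cut."""
    mine = [cells for cells in nets.values() if cell in cells]
    if not mine:
        return 1.0
    uncut = sum(1 for cells in mine if len({block_of[c] for c in cells}) == 1)
    return uncut / len(mine)


def simulated_evolution(nets, initial, iterations=50, max_imbalance=2, seed=0):
    """Evaluate, select, allocate; blocks are labeled 0 and 1."""
    rng = random.Random(seed)
    block_of = dict(initial)
    best, best_cost = dict(block_of), cut_size(nets, block_of)
    for _ in range(iterations):  # stopping criterion: fixed iteration count
        # Evaluation and selection: cells with low goodness are likely picked.
        selected = [c for c in block_of
                    if rng.random() > goodness(c, nets, block_of)]
        # Allocation: move each selected cell to the other block if balance allows.
        for c in selected:
            src, dst = block_of[c], 1 - block_of[c]
            sizes = [0, 0]
            for b in block_of.values():
                sizes[b] += 1
            if abs((sizes[dst] + 1) - (sizes[src] - 1)) <= max_imbalance:
                block_of[c] = dst
        cost = cut_size(nets, block_of)
        if cost < best_cost:
            best, best_cost = dict(block_of), cost
    return best, best_cost


nets = {"e1": ["a", "b"], "e2": ["b", "c"], "e3": ["c", "d"], "e4": ["a", "d"]}
initial = {"a": 0, "b": 1, "c": 0, "d": 1}
best, cost = simulated_evolution(nets, initial)
print(best, cost)
```

Unlike simulated annealing, a cell with goodness 1 is never selected here, so well-placed cells stay put while poorly placed ones keep being perturbed.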
28Cut goodness
$d_i$: set of all nets connected to cell $i$ and not cut.
$w_i$: set of all nets connected to cell $i$ and cut.
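With these two sets, one natural goodness measure is the fraction of a cell's nets that remain uncut. The sketch below uses that ratio as an assumption; the exact expression is not recoverable from the slide. The helper names are hypothetical.

```python
def split_nets(cell, nets, block_of):
    """Return (d, w): names of the cell's uncut nets and of its cut nets."""
    d, w = set(), set()
    for name, cells in nets.items():
        if cell in cells:
            is_cut = len({block_of[c] for c in cells}) > 1
            (w if is_cut else d).add(name)
    return d, w


def cut_goodness(d, w):
    """Assumed measure: |d| / (|d| + |w|); 1.0 for a cell with no nets."""
    total = len(d) + len(w)
    return 1.0 if total == 0 else len(d) / total


nets = {"e1": {"v1", "v2"}, "e2": {"v2", "v3"}}
block_of = {"v1": "A", "v2": "A", "v3": "B"}
d, w = split_nets("v2", nets, block_of)
print(cut_goodness(d, w))
```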
29Power Goodness
$V_i$ is the set of all nets connected to cell $i$, and $U_i$ is
the set of all nets connected to cell $i$ and cut.
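A possible power goodness weights each net by its switching probability, so that cutting a busy net hurts more than cutting a quiet one. The ratio used below, (sum over $V_i$ minus sum over $U_i$) divided by the sum over $V_i$, is an assumption for illustration. The function name and the `switching` mapping are hypothetical.

```python
def power_goodness(connected, cut, switching):
    """Assumed measure: share of the cell's switching activity on uncut nets.

    connected: names of all nets on the cell (V_i); cut: the cut subset (U_i);
    switching: dict mapping net name to switching probability.
    """
    total = sum(switching[n] for n in connected)
    if total == 0:
        return 1.0
    cut_total = sum(switching[n] for n in cut)
    return (total - cut_total) / total


switching = {"e1": 0.2, "e2": 0.6}
print(power_goodness({"e1", "e2"}, {"e2"}, switching))
```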
30Delay Goodness
$K_i$ is the set of cells in all paths passing
through cell $i$. $L_i$ is the set of cells in all paths
passing through cell $i$ that are not in the same block as
$i$.
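These two sets suggest a goodness of the form (size of $K_i$ minus size of $L_i$) divided by the size of $K_i$, that is, the share of path neighbours that sit in the same block as the cell. The sketch below assumes that form and also assumes $K_i$ includes cell $i$ itself; neither detail is confirmed by the slide. Names are hypothetical.

```python
def delay_goodness(cell, paths, block_of):
    """Assumed measure: (|K| - |L|) / |K| for the given cell.

    K: cells on any path through `cell` (including the cell itself).
    L: members of K placed in a different block than `cell`.
    """
    k = set()
    for path in paths:
        if cell in path:
            k.update(path)
    if not k:
        return 1.0  # cell lies on no path, so delay imposes no penalty
    l = {c for c in k if block_of[c] != block_of[cell]}
    return (len(k) - len(l)) / len(k)


paths = [["a", "b", "c"], ["b", "d"], ["e", "f"]]
block_of = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 0, "f": 1}
print(delay_goodness("b", paths, block_of))
```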
31Final selection Fuzzy rule
IF cell $i$ is near its optimal cut-set goodness as compared to other cells
AND near its optimal power goodness as compared to other cells
AND (near its optimal net delay goodness as compared to other cells OR $T_{max}(i)$ is much smaller than $T_{max}$)
THEN it has a high goodness.
32Fuzzy Goodness
$T_{max}$: delay of the most critical path in the current iteration.
$T_{max}(i)$: delay of the longest path traversing cell $i$.
$X_{path} = T_{max} / T_{max}(i)$
Fuzzy Goodness
Respectively (Cutset, Power, Delay) goodness.
33Experimental Results
ISCAS 85-89 Benchmark Circuits
34SimE versus Tabu Search and GA against time
Circuit S13207
35Experimental Results: SimE versus TS versus GA
SimE results were better than TS and GA, with
faster execution time.
36Conclusion
- The present work successfully addressed the important issue of reducing power consumption and delay in VLSI circuits.
- The present work successfully formulated and provided solutions to the problem of multi-objective VLSI partitioning.
- The TS partitioning algorithm outperformed GA in terms of quality of solution and execution time.
37 Thank you