Title: Simulated Evolution Algorithm for Multiobjective VLSI
1MS Thesis Presentation
Simulated Evolution Algorithm for Multiobjective
VLSI Netlist Bi-Partitioning
- By
- Dr Sadiq M. Sait
- Dr Aiman El-Maleh
- Raslan Al Abaji
- King Fahd University
- Computer Engineering Department
2Outline
- Introduction
- Problem Formulation
- Cost Functions
- Proposed Approaches
- Experimental results
- Conclusion
3VLSI Technology Trend
Design Characteristics and Key CAD Capabilities
- 7.5M, 333 MHz, 0.25 um: cycle-based simulation, formal verification
- 3.3M, 200 MHz, 0.6 um: top-down design, emulation
- 1.2M, 50 MHz, 0.8 um: HDLs, synthesis
- 0.13M, 12 MHz, 1.5 um: CAE systems, silicon compilation
- 0.06M, 2 MHz, 6 um: SPICE simulation
The challenges of sustaining such exponential growth toward
gigascale integration have shifted, to a large degree, from
process and manufacturing technology to design technology.
4The VLSI Chip in 2006
- Technology: 0.1 um
- Transistors: 200 M
- Logic gates: 40 M
- Size: 520 mm2
- Clock: 2 - 3.5 GHz
- Chip I/Os: 4,000
- Wiring levels: 7 - 8
- Voltage: 0.9 - 1.2 V
- Power: 160 Watts
- Supply current: 160 Amps
Tradeoffs: performance, power consumption, noise immunity, area, cost, time-to-market.
5VLSI Design Cycle
- The VLSI design process is carried out at a number of levels:
- System Specification
- Functional Design
- Logic Design
- Circuit Design
- Physical Design
- Design Verification
- Fabrication
- Packaging, Testing and Debugging
6Physical Design
The physical design cycle consists of
- Partitioning
- Floorplanning and Placement
- Routing
- Compaction
Physical Design converts a circuit description
into a geometric description. This description is
used to manufacture a chip.
7Why do we need Partitioning?
- Decomposition of a complex system into smaller subsystems.
- Each subsystem can be designed independently, speeding up the design process (divide-and-conquer approach).
- Decompose a complex IC into a number of functional blocks, each designed by one engineer or a team of engineers.
- The decomposition scheme has to minimize the interconnections between subsystems.
8Levels of Partitioning
- System Level Partitioning: System to PCBs
- Board Level Partitioning: PCBs to Chips
- Chip Level Partitioning: Chips to Subcircuits / Blocks
9Classification of Partitioning Algorithms
Partitioning Algorithms
Group Migration
- Kernighan-Lin
- Fiduccia-Mattheyses (FM)
- Multilevel K-way Partitioning
Simulation Based Iterative
- Simulated annealing
- Simulated evolution
- Tabu Search
- Genetic
Performance Driven
- Lawler et al.
- Vaishnav et al.
- Choi et al.
- Junichiro et al.
Others
- Spectral
- Multilevel Spectral
10Related previous Works
- 1999: Two low-power-oriented techniques based on the simulated annealing (SA) algorithm were proposed by Choi et al.
- 1969: A bottom-up approach for delay optimization (clustering) was proposed by Lawler et al.
- 1998: A circuit partitioning algorithm under a path delay constraint was proposed by Junichiro et al. The algorithm consists of a clustering phase and an iterative improvement phase.
- 1999: An enumerative partitioning algorithm targeting low power was proposed by Vaishnav et al. It enumerates alternate partitionings and selects one that has the same delay but less power dissipation (not feasible for huge circuits).
11Motivation
- Need for power optimization
  - Portable devices.
  - Power consumption is a hindrance to further integration.
  - Increasing clock frequency.
- Need for delay optimization
  - In current submicron designs, wire delay tends to dominate gate delay. Larger die sizes imply long on-chip global routes, which affect performance.
  - Optimizing delay due to off-chip capacitance.
12Objective
- Design a class of iterative algorithms for multiobjective VLSI partitioning.
- Explore partitioning from a wider angle and consider circuit delay, power dissipation and interconnect at the same time, under a balance constraint.
13Problem formulation
- Objectives
- Power cost is optimized AND
- Delay cost is optimized AND
- Cutset cost is optimized
- Constraint
- Balanced partitions to a certain tolerance degree (10%).
14Problem formulation
- The circuit is modeled as a hypergraph $H(V, E)$,
- where $V = \{v_1, v_2, v_3, \ldots, v_n\}$ is a set of modules (cells),
- and $E = \{e_1, e_2, e_3, \ldots, e_k\}$ is a set of hyperedges. $E$ is the set of signal nets; each net is a subset of $V$ containing the modules that the net connects.
- A two-way partitioning of a set of nodes $V$ determines two subsets $V_A$ and $V_B$ such that $V_A \cup V_B = V$ and $V_A \cap V_B = \emptyset$.
15Cutset
- Based on hypergraph model H (V, E)
- Cost: $c(e) = 1$ if $e$ spans more than 1 block, and 0 otherwise
- Cutset = sum of hyperedge costs
- Efficient gain computation and update
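The cutset cost on this slide can be sketched in a few lines. This is a minimal illustration, not the thesis implementation: the function name `cutset_cost`, the netlist and the block assignment are all hypothetical, and each net is stored simply as the set of cells it connects.

```python
from typing import Dict, List, Set


def cutset_cost(nets: List[Set[str]], block_of: Dict[str, int]) -> int:
    """Count the nets that span more than one block (cost 1 per cut net)."""
    cost = 0
    for net in nets:
        blocks = {block_of[cell] for cell in net}
        if len(blocks) > 1:  # the net connects cells in different blocks
            cost += 1
    return cost


# Hypothetical 5-cell netlist; each net is the set of cells it connects.
nets = [{"a", "b"}, {"b", "c", "d"}, {"d", "e"}, {"a", "e"}]
block_of = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1}
print(cutset_cost(nets, block_of))
```

This version recomputes the cost from scratch; the "efficient gain computation and update" bullet refers to incremental schemes that avoid doing so after every move.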
16Delay Model
- Path $\pi$: SE1 → C1 → C4 → C5 → SE2.
- $Delay_{\pi} = CD_{SE1} + CD_{C1} + CD_{C4} + CD_{C5} + CD_{SE2}$
- $CD_{C1} = BD_{C1} + LF_{C1} \times (C_{offchip} + CINP_{C2} + CINP_{C3} + CINP_{C4})$
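A small numeric sketch of this delay model follows. It assumes that BD is a cell's base delay, LF its load factor, and CINP the input capacitance of each driven cell; that reading of the abbreviations, the function names and all numbers are assumptions made for illustration.

```python
def cell_delay(base_delay, load_factor, input_caps, offchip_cap=0.0):
    """CD = BD + LF * (C_offchip + sum of the driven cells' input capacitances)."""
    return base_delay + load_factor * (offchip_cap + sum(input_caps))


def path_delay(cell_delays):
    """Delay of a path: the sum of the delays of the cells on it."""
    return sum(cell_delays)


def critical_path_delay(paths):
    """Delay of the slowest path among the given paths."""
    return max(path_delay(p) for p in paths)


# Hypothetical numbers: C1 drives C2, C3, C4 and one of its nets is cut,
# so the off-chip capacitance is added to its load.
cd_c1 = cell_delay(base_delay=1.0, load_factor=0.5,
                   input_caps=[0.2, 0.2, 0.4], offchip_cap=1.2)
path = [0.5, cd_c1, 1.5, 1.0, 0.5]  # SE1, C1, C4, C5, SE2
print(path_delay(path))
```

Setting `offchip_cap` to zero for an uncut net shows how a cut adds delay only to the cells that drive a load in the other partition.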
17Delay
Objective: minimize the maximum Delay(Pi) over all paths Pi in P.
Pi is any path between 2 cells or nodes; P is the
set of all paths of the circuit.
18Power
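The average dynamic power consumed by a CMOS logic
gate in a synchronous circuit is given by
$P_i^{average} = 0.5 \cdot \frac{V_{dd}^2}{T_{cycle}} \cdot C_i^{load} \cdot N_i$
where $N_i$ is the number of output gate transitions per
cycle (switching probability), $C_i^{load}$ is the load capacitance,
$V_{dd}$ is the supply voltage and $T_{cycle}$ is the clock period.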
19Power
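The load capacitance of a cell has two parts:
- the load capacitance driven by the cell before partitioning
- the additional load due to off-chip capacitance (cut net)
The total power dissipation of a circuit is the sum of the average dynamic power over all its gates.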
20Power
The additional off-chip load can be assumed identical for all nets.
The power cost then depends on the set of visible gates: the gates driving a load outside their
partition.
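One way to read the power slides is that, with an identical off-chip load on every cut net, the part of the power that partitioning can change is proportional to the summed switching probability of the visible gates. The sketch below works under that assumption; the names `visible_gates`, `power_cost`, `driver_of` and the data are hypothetical.

```python
def visible_gates(nets, driver_of, block_of):
    """Gates whose output net reaches at least one cell in another block."""
    visible = set()
    for net_id, cells in nets.items():
        driver = driver_of[net_id]
        if any(block_of[c] != block_of[driver] for c in cells):
            visible.add(driver)
    return visible


def power_cost(switching, nets, driver_of, block_of):
    """Assumed cost: sum of switching probabilities over the visible gates."""
    return sum(switching[g] for g in visible_gates(nets, driver_of, block_of))


# Hypothetical netlist: net id -> cells on the net, and net id -> driving gate.
nets = {"n1": {"a", "b"}, "n2": {"b", "c"}, "n3": {"c", "d"}}
driver_of = {"n1": "a", "n2": "b", "n3": "c"}
block_of = {"a": 0, "b": 0, "c": 1, "d": 1}
switching = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.5}
print(power_cost(switching, nets, driver_of, block_of))
```

Under this reading, cutting a net driven by a rarely switching gate costs less power than cutting one driven by a busy gate, even though both count equally in the cutset.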
21Balance
As a constraint, balance requires the two partitions to be equal in
size to within a tolerance.
However, balance as a constraint is not appealing
because it may prohibit many good moves.
Objective: |Cells(block1) - Cells(block2)|
22Unifying Objectives: How?
- Problems in choosing Weights.
- Need to tune for every circuit.
23Fuzzy logic for cost function
- Imprecise values of the objectives
  - best represented by linguistic terms that are the basis of fuzzy algebra
- Conflicting objectives
- Operators for aggregating function
24Use of fuzzy logic for Multi-objective cost function
- The cost to membership mapping.
- Linguistic fuzzy rule for combining the membership values in an aggregating function.
- Translation of the linguistic rule into appropriate fuzzy operators.
25Some fuzzy operators
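- And-like operators
  - Min operator: $\mu = \min(\mu_1, \mu_2)$
  - And-like OWA: $\mu = \beta \times \min(\mu_1, \mu_2) + \frac{1}{2}(1 - \beta)(\mu_1 + \mu_2)$
- Or-like operators
  - Max operator: $\mu = \max(\mu_1, \mu_2)$
  - Or-like OWA: $\mu = \beta \times \max(\mu_1, \mu_2) + \frac{1}{2}(1 - \beta)(\mu_1 + \mu_2)$
- where $\beta$ is a constant in the range [0, 1]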
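The two OWA operators can be written directly as functions of two membership values and the constant in [0, 1] (called `beta` here). The function names are illustrative only.

```python
def and_like_owa(mu1, mu2, beta):
    """And-like OWA: blend of the min operator and the arithmetic mean."""
    return beta * min(mu1, mu2) + 0.5 * (1 - beta) * (mu1 + mu2)


def or_like_owa(mu1, mu2, beta):
    """Or-like OWA: blend of the max operator and the arithmetic mean."""
    return beta * max(mu1, mu2) + 0.5 * (1 - beta) * (mu1 + mu2)


# beta = 1 gives the pure min/max operator, beta = 0 gives the plain average.
for beta in (0.0, 0.5, 1.0):
    print(beta, and_like_owa(0.2, 0.8, beta), or_like_owa(0.2, 0.8, beta))
```

The blend is what makes the operator "soft": with beta below 1, a solution that is poor in one objective still gets some credit for being good in the other.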
26Membership functions
where $O_i$ and $C_i$ are the lower bound and actual cost
of objective $i$, $\mu_i(x)$ is the membership of
solution $x$ in the set "good $i$", and $g_i$ is the relative
acceptance limit for each objective.
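The membership formula itself did not survive on this slide. The sketch below therefore assumes a common shape for such a mapping: full membership when the cost reaches the lower bound, zero membership once the cost exceeds the acceptance limit times the lower bound, and a linear decrease in between. The linear shape and the name `membership` are assumptions, not taken from the slide.

```python
def membership(cost, lower_bound, g):
    """Assumed linear membership of a solution in the set 'good i'.

    cost        -- actual cost C_i of the objective
    lower_bound -- lower bound O_i on that cost (must be > 0)
    g           -- relative acceptance limit g_i (must be > 1)
    """
    ratio = cost / lower_bound
    if ratio <= 1.0:
        return 1.0  # at or below the lower bound: fully "good"
    if ratio >= g:
        return 0.0  # beyond the acceptance limit: not "good" at all
    return (g - ratio) / (g - 1.0)  # linear decrease in between


# Hypothetical cutset objective: lower bound 100 cut nets, limit g = 2.
for cost in (100, 150, 250):
    print(cost, membership(cost, lower_bound=100, g=2.0))
```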
27Fuzzy linguistic rule
- A good partitioning can be described by the following fuzzy rule:
- IF solution has
- small cutset AND
- low power AND
- short delay AND
- good Balance.
- THEN it is a good solution
28Fuzzy cost function
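The above rule is translated to an AND-like OWA:
$\mu(x) = \beta \times \min(\mu_c, \mu_p, \mu_d, \mu_b) + (1 - \beta) \times \frac{1}{4}(\mu_c + \mu_p + \mu_d + \mu_b)$
$\mu(x)$ represents the total fuzzy fitness of the
solution; our aim is to maximize this fitness.
$\mu_c$, $\mu_p$, $\mu_d$ and $\mu_b$ are, respectively, the cutset, power, delay and balance
fitness.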
29Simulated Evolution
Algorithm Simulated_Evolution
Begin
  Start with an initial feasible partition S
  Repeat
    Evaluation: evaluate the goodness Gi of all modules
    Selection:
      For each cell Vi DO
        if Random Rm > Gi then select the cell
      End For
    Allocation:
      For each selected cell Vi DO
        Move the cell to the destination block
      End For
  Until stopping criteria are satisfied
  Return best solution
End
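The evaluation, selection and allocation loop can be sketched as runnable code. This is a deliberately simplified illustration under several assumptions: goodness is taken as the fraction of a cell's nets that are not cut, the random threshold is uniform in [0, 1), allocation simply flips a selected cell to the other block, only the cutset is optimized, and balance is ignored. All names are hypothetical.

```python
import random


def cut_cost(nets, block_of):
    """Number of nets spanning both blocks."""
    return sum(1 for net in nets if len({block_of[c] for c in net}) > 1)


def goodness(cell, nets, block_of):
    """Assumed goodness: fraction of the cell's nets that are not cut."""
    mine = [net for net in nets if cell in net]
    if not mine:
        return 1.0
    uncut = sum(1 for net in mine if len({block_of[c] for c in net}) == 1)
    return uncut / len(mine)


def simulated_evolution(nets, block_of, iterations=50, seed=0):
    rng = random.Random(seed)
    current = dict(block_of)
    best, best_cost = dict(current), cut_cost(nets, current)
    for _ in range(iterations):
        # Evaluation: goodness of every cell in the current partition.
        g = {cell: goodness(cell, nets, current) for cell in current}
        # Selection: cells with low goodness are more likely to be picked.
        selected = [cell for cell in current if rng.random() > g[cell]]
        # Allocation: in a bi-partition the destination is the other block.
        for cell in selected:
            current[cell] = 1 - current[cell]
        cost = cut_cost(nets, current)
        if cost < best_cost:
            best, best_cost = dict(current), cost
    return best, best_cost


# Hypothetical netlist and a poor initial partition.
nets = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d", "e"}, {"e", "f"}]
start = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}
best, best_cost = simulated_evolution(nets, start)
print(cut_cost(nets, start), best_cost)
```

A cell with goodness 1 is never selected in this sketch, which is the kind of zero selection probability that a biasless selection scheme is meant to avoid.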
30Simulated evolution Implementation.
- Cut goodness
- Power goodness
- Delay goodness
- The overall goodness is a fuzzy combination of these three.
31Cut goodness
$d_i$: set of all nets connected to cell $i$ and not cut.
$w_i$: set of all nets connected to cell $i$ and cut.
32Power Goodness
$V_i$ is the set of all nets connected to cell $i$, and $U_i$ is
the set of all nets connected to cell $i$ that are cut.
33Delay Goodness
$K_i$ is the set of cells in all paths passing
through cell $i$. $L_i$ is the set of cells in all paths
passing through cell $i$ that are not in the same block as
$i$.
34Final selection Fuzzy rule.
IF cell i is
- near its optimal cut-set goodness as compared to other cells AND
- near its optimal power goodness as compared to other cells AND
- near its optimal net delay goodness as compared to other cells OR T(max)(i) is much smaller than Tmax
THEN it has a high goodness.
35Fuzzy Goodness
- Tmax: delay of the most critical path in the current iteration.
- T(max)(i): delay of the longest path traversing cell i.
- Xpath = Tmax / T(max)(i)
The fuzzy goodness of a cell aggregates its cutset, power and delay goodness.
36Selection implementation
- Biasless selection scheme
- The goodness distribution among the cells is Gaussian, with mean Gm and standard deviation Gσ.
- A Gaussian random number is generated with mean Rm and standard deviation Rσ.
- This eliminates cells with zero selection probability.
37Selection implementation
- Rm = Gm - Gσ
- Rσ = Gσ
- Selection rule:
- if Random(Rm, Rσ) > Goodness(i) then select the cell.
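A sketch of this selection rule follows. It assumes that one Gaussian number is drawn per cell (the slides do not say whether it is drawn per cell or per iteration) and uses the population standard deviation of the goodness values; the name `select_cells` and the data are hypothetical.

```python
import random
import statistics


def select_cells(goodness, seed=0):
    """Select cells whose goodness is below a Gaussian random threshold.

    The threshold has mean Gm - Gs and standard deviation Gs, where Gm and Gs
    are the mean and standard deviation of the goodness values.
    """
    rng = random.Random(seed)
    values = list(goodness.values())
    g_mean = statistics.mean(values)
    g_std = statistics.pstdev(values)
    r_mean, r_std = g_mean - g_std, g_std
    return [cell for cell, g in goodness.items()
            if rng.gauss(r_mean, r_std) > g]


# Hypothetical goodness values for six cells.
goodness = {"a": 0.9, "b": 0.2, "c": 0.5, "d": 0.7, "e": 0.1, "f": 0.6}
print(select_cells(goodness))
```

Because the threshold follows the goodness distribution of the current solution rather than a fixed [0, 1] range, no hand-tuned bias term is needed as the average goodness rises over the iterations.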
38Experimental Results
ISCAS 85-89 Benchmark Circuits
39SimE vs TS vs GA against time, circuit S13207
40Experimental Results: SimE vs TS vs GA
SimE results were better than TS and GA, with
faster execution time.
41 Thank you. Questions?