VLSI Design Abstraction Levels

Transcript and Presenter's Notes

Title: VLSI Design Abstraction Levels


1
Introduction
  • VLSI Design Abstraction Levels

Idea → Architectural Design → Logical Design → Physical Design → Fabrication → New Chip
2
Introduction (VLSI Placement Problem Definition)
  • Informal definition and objective
  • M = {m1, m2, ..., mn} is the set of modules and
    S = {s1, s2, ..., sk} the set of signals (nets)
  • Each mi is associated with a set of signals Smi ⊆ S
  • Each si is associated with a set of modules
    Msi = {mj | si ∈ Smj}
  • L = {L1, L2, ..., Lp} is the set of locations, p ≥ n
  • Complexity is n! ⇒ NP-hard problem

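  A compact formal statement consistent with the notation above (a sketch;
  the cost actually optimized later in the talk is multi-objective, and WL
  here stands for any interconnection-length estimate):

    % Placement as an assignment of modules to locations (sketch).
    \begin{align*}
      &\text{find an injective assignment } \pi : M \rightarrow L,\quad p \ge n,\\
      &\text{minimizing } \mathrm{Cost}(\pi) = \sum_{s_i \in S}
        \mathrm{WL}\bigl(\{\pi(m_j) : m_j \in M_{s_i}\}\bigr).
    \end{align*}
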
3
Introduction (VLSI Placement Styles)
  • Full Custom Layout
  • Gate Array Methodology
  • Macro Cell
  • Standard Cell Methodology

[Figure: standard-cell layout with pads, cell rows, routing channels, a feedthrough cell and wasted space.]
4
Introduction (Heuristics Applied to VLSI
Placement)
  • Deterministic vs. Stochastic Heuristics
  • Constructive vs. Iterative Heuristics
  • Constructive Deterministic
  • Linear Placement Alg., Min-Cut Placement Alg. and
    Force Directed Alg.
  • Iterative Stochastic
  • Simulated Annealing, Genetic Alg., Simulated
    Evolution and Tabu Search

5
Multi-Objective Placement Problem
  • Multiple Objectives and Constraints
  • Conflicting Objectives
  • Interconnection Length
  • Area
  • Critical Path Delay
  • Overall Solution Quality Evaluation

6
Multi-Objective Placement (Interconnection Length)
  • Significance
  • Net Definition and Estimation Techniques
  • Steiner Tree Approximation

[Figure: example net with four terminals and its Steiner tree approximation.]
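
  As one concrete estimation technique, a minimal Python sketch of the
  half-perimeter (bounding-box) wire-length estimate, a common stand-in for
  the Steiner tree length; the pin-coordinate representation is an
  assumption, not taken from the slides:

    # Half-perimeter wire length (HPWL): a cheap estimate of a net's
    # Steiner tree length. A net is assumed to be a list of (x, y) pins.
    def hpwl(pins):
        """Half-perimeter of the bounding box enclosing all pins of one net."""
        xs = [x for x, _ in pins]
        ys = [y for _, y in pins]
        return (max(xs) - min(xs)) + (max(ys) - min(ys))

    def total_wire_length(nets):
        """Sum of HPWL estimates over all nets of a placement."""
        return sum(hpwl(pins) for pins in nets)

    # Example: a four-terminal net.
    print(hpwl([(0, 0), (3, 1), (1, 4), (2, 2)]))  # -> 3 + 4 = 7
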
7
Multi-Objective Placement (Area)
  • Standard Cell Placement uses fixed height cells
  • Fixed-height channels are assumed ⇒
  • Only the width varies from one solution to another
  • The width is given by the longest row

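  Under these assumptions the area objective reduces to the longest row; a
  minimal sketch (the rows-of-cell-widths input format is assumed):

    # Fixed-height rows and channels: layout width = width of the longest row.
    def layout_width(rows):
        """rows: list of rows, each a list of cell widths."""
        return max(sum(cell_widths) for cell_widths in rows)

    print(layout_width([[2, 3, 4], [5, 1], [2, 2, 2, 2]]))  # -> 9
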
8
Multi-Objective Placement (Critical Path Delay)
  • A VLSI ckt. is a collection of paths
  • Path and Critical Path Definitions
  • Given a path π, let v1, v2, ..., vk be the nets
    belonging to π. The delay of the path π is given
    by the formula sketched below

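  The delay formula itself was an image on the slide; a hedged
  reconstruction in the usual standard-cell timing model (the CD/ID split
  below is an assumption, not read from the slide):

    % Assumed path-delay model: switching delay of the driving cell plus
    % interconnect delay of each net v_i on the path pi.
    T_{\pi} = \sum_{i=1}^{k} \left( CD_{i} + ID_{i} \right),
    \qquad T_{c} = \max_{\pi} T_{\pi}
    % CD_i: switching delay of the cell driving net v_i
    % ID_i: interconnect (placement-dependent) delay of net v_i
    % T_c : critical path delay = delay of the longest path
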
9
Multi-Objective Placement (Overall Solution
Evaluation)
  • Multi-objective ⇒ vector quantity
  • Weighted Sum vs. Fuzzy Logic
  • Fuzzy Goal-Based Cost Measure
  • C(x) = (C1(x), C2(x), ..., Cp(x))
  • O = (O1, O2, ..., Op), Oi ≤ Ci(x) ∀i, ∀x ∈ K
  • G = (g1, g2, ..., gp)
  • x is acceptable if Ci(x) ≤ gi · Oi

10
Multi-Objective Placement (Acceptable Solutions)
[Figure: the region of acceptable solutions in the (Cwl(x), Cdelay(x), Cwidth(x)) space, bounded below by the optima Owl, Odelay, Owidth (points below these are impossible) and above by the goals gwl·Owl, gdelay·Odelay, gwidth·Owidth.]
11
Multi-Objective Placement (Membership Rule)
  • The rule for determining membership in the fuzzy
    set is
  • If the solution is within acceptable wire length
    AND within acceptable circuit delay AND within
    acceptable width, THEN it is acceptable.

12
Multi-Objective Placement (Membership Function)
[Figure: membership function μi^c(x) versus the normalized cost Ci(x)/Oi: equal to 1.0 up to Ci(x)/Oi = 1 and decreasing to 0 as Ci(x)/Oi approaches gi.]
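
  A minimal Python sketch of this goal-based fuzzy evaluation, assuming a
  piecewise-linear membership function and min() as the fuzzy AND over the
  three objectives (both are assumptions; the thesis may aggregate
  differently):

    # Goal-based fuzzy cost evaluation (sketch).
    def membership(cost, optimum, goal):
        """mu = 1 while cost <= optimum; falls linearly to 0 at cost = goal*optimum."""
        ratio = cost / optimum
        if ratio <= 1.0:
            return 1.0
        if ratio >= goal:
            return 0.0
        return (goal - ratio) / (goal - 1.0)

    def fuzzy_quality(costs, optima, goals):
        """Overall acceptability: the weakest objective dominates (fuzzy AND = min)."""
        return min(membership(c, o, g) for c, o, g in zip(costs, optima, goals))

    # Example: wire length, delay and width against their lower bounds and goals.
    print(fuzzy_quality(costs=(1200.0, 9.5, 400.0),
                        optima=(1000.0, 8.0, 380.0),
                        goals=(1.5, 1.4, 1.3)))
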
13
Multi-Objective Placement (Lower Bounds)
  • Optimum values for wire length, delay and width
    are computed according to

14
Tabu Search
  • Definition, Key feature and Memory
  • TS vs. Local Search
  • Basic TS Algorithm (flowchart):
  1. Start with an initial solution s. Initialize TL
     (Tabu List), AL (Aspiration Level) and Counter.
  2. Investigate a subset V of the neighborhood of s
     and find the best solution s' in V.
  3. If the move (s, s') is not tabu, accept it (s = s');
     if it is tabu, accept it only if C(s') < AL,
     otherwise reject it.
  4. Update TL and AL; Counter = Counter + 1.
  5. If Counter = N_Iterations, report the best
     solution; otherwise go to step 2.
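
  A compact Python sketch of this short-term-memory loop; the neighborhood
  sampler, cost function and hashable move attributes are assumptions
  (names such as sample_neighbors are illustrative, not from the slides):

    # Short-term-memory tabu search with aspiration by objective (sketch).
    def tabu_search(s0, cost, sample_neighbors, tenure, n_iterations):
        best, best_cost = s0, cost(s0)
        s = s0
        tabu_until = {}              # move attribute -> iteration it stays tabu until
        aspiration = best_cost       # aspiration level: best cost seen so far
        for it in range(n_iterations):
            candidates = sample_neighbors(s)          # subset V of the neighborhood
            move, s_new = min(candidates, key=lambda ms: cost(ms[1]))
            c_new = cost(s_new)
            is_tabu = tabu_until.get(move, -1) > it
            if (not is_tabu) or c_new < aspiration:   # aspiration overrides tabu status
                s = s_new
                tabu_until[move] = it + tenure        # forbid reversing this move for a while
                if c_new < best_cost:
                    best, best_cost = s_new, c_new
                    aspiration = best_cost
        return best, best_cost
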
15
Tabu Search Parameters (CL)
  • Candidate List
  • why a candidate list?
  • Construction Strategies
  • Aspiration Plus
  • Elite Candidate List
  • Successive Filter Strategy
  • Sequential Fan Candidate List
  • Bounded Change Candidate List

16
Tabu Search Parameters (Moves)
  • Moves and Move Attributes
  • what is a move?
  • why move attributes?
  • Complementing a binary variable in the solution
  • Having a change of C(s') − C(s) in the cost
  • Similar change in another problem based function
  • Any combination of the above

17
Tabu Search Parameters (Cost & TL)
  • Evaluation Function
  • Implementation vs. operation cost
  • single or multiple objectives
  • Tabu List
  • Local search, intensification and diversification
  • Tabu Tenure
  • problem size
  • Search objective

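  A small illustration of a move attribute and a size-dependent tabu tenure
  in Python; both the unordered-pair attribute and the square-root rule are
  illustrative choices, not the thesis's prescription:

    import math

    def swap_attribute(cell_a, cell_b):
        """Attribute of a swap move: the unordered pair of swapped cell numbers."""
        return frozenset((cell_a, cell_b))

    def tabu_tenure(n_cells):
        """A common rule of thumb: tenure grows slowly with circuit size."""
        return max(5, int(math.sqrt(n_cells)))

    recent = {swap_attribute(12, 47): 10}       # attribute -> tabu until iteration 10
    print(swap_attribute(47, 12) in recent)     # True: cell order doesn't matter
    print(tabu_tenure(900))                     # 30
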
18
Tabu Search Parameters (AC)
  • Aspiration Criteria
  • What and Why?
  • Global Aspiration by Objective
  • Regional Aspiration by Objective
  • Aspiration by Search Direction
  • Aspiration by Influence

19
Tabu Search Classes
  • Short Term Memory
  • Intermediate Term Memory
  • Long Term Memory

20
Literature Review (Heuristics Applied to VLSI
Placement)
  • Linear Placement Algorithm (CD)
  • Min-Cut Placement (CD)
  • Force Directed Placement (ID)
  • Simulated Annealing (IS)
  • Genetic Algorithm (IS)
  • Simulated Evolution (IS)
  • Tabu Search (IS)

21
Literature Review (Tabu Search in VLSI Placement)
  • Lim, Chee and Wu '92
  • Macro-Cell Placement with global routing
  • Quad Partitioning Using TS to minimize delay
  • Interconnect and cell delay (Weighted Sum)
  • Lin and Du '90
  • Capacitor Placement in radial distribution sys.
  • Minimize energy loss. Short-term TS with a random CL
  • Improvement in quality and time over SA

22
Literature Review (Tabu Search in VLSI Placement)
  • Handa and Kuga '95
  • Analog LSI chip designs placement
  • Wire length, area and ease of routing
  • TS performed better when imposed on GA
  • Mackey and Carothers '96
  • Quad-partitioning VLSI Macro-cell placement
  • Wire length minimization using Fuzzy Cost
  • Significant improvement over Lim, Chee and Wu

23
Literature Review (Tabu Search Parallelization)
  • Purposes
  • How to parallelize?
  • One search for time t or p searches for time t/p

24
Literature Review (Tabu Search Parallelization)
  • Taillard '90
  • Neighborhood examination for FSSP
  • A master broadcasts an initial solution
  • Slaves send their best found neighbors
  • The master picks the overall best move
  • It continues for a fixed number of iterations or
    until no improvement is observed

25
Literature Review (Tabu Search Parallelization)
  • Garcia, Potvin and Rousseau '94
  • Parallel TS for vehicle routing
  • A master and slaves investigate neighborhood
  • Each process sends its best move
  • The master broadcasts a set of best moves
  • De Falco et al. '94
  • Evolution principles were included in PTS
  • Neighbor machines exchange best solutions
  • If the incoming best is better, it replaces the
    local one

26
Literature Review (Tabu Search Parallelization)
  • Nair and Freville '97
  • PTS for 0-1 multi-knapsack problem
  • A master generates initial solutions and
    strategies
  • Mori and Hayashi '98
  • PTS for voltage and reactive power control
  • Two Schemes
  • Neighborhood Investigation
  • Search replication with different Tabu Tenures

27
Proposed Algorithm (Basic Proposed TS Algorithm)
  • Reads a randomly generated solution and initializes
  • The Alg. runs STMTS for a fixed # of iterations
  • CL is constructed using random moves
  • A move is swapping two cells
  • A move attribute is the swapped cells' numbers
  • A compound move can be made with depth d
    examining Nv neighbors at each step.
  • Tabu Tenure used depends on the circuit size

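  A Python sketch of building such a candidate list of compound swap moves
  with depth d and Nv trial neighbors per step; the list-based solution
  encoding and the trial evaluation are assumptions:

    import random

    def random_swap(solution):
        """Pick two distinct positions; the pair is the move."""
        return tuple(random.sample(range(len(solution)), 2))

    def apply_swap(solution, move):
        i, j = move
        s = solution[:]                  # work on a copy
        s[i], s[j] = s[j], s[i]
        return s

    def compound_move(solution, cost, depth, nv):
        """Chain `depth` swaps; at each step keep the best of `nv` random trials."""
        moves, current = [], solution
        for _ in range(depth):
            trials = [random_swap(current) for _ in range(nv)]
            best = min(trials, key=lambda m: cost(apply_swap(current, m)))
            current = apply_swap(current, best)
            moves.append(best)
        return moves, current
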
28
Proposed Algorithm (Basic Proposed TS Algorithm)
  • Aspiration by Objective
  • Cost Function used is the same as the one
    proposed by Ali in his MS thesis
  • Wire length, delay and width are computed
  • It uses Fuzzy Goal Based Cost Measure

29
Parallelization of the Proposed Algorithm
  • Parallelized on a NOW (network of workstations) using PVM
  • Why?
  • Two levels of parallelization
  • Candidate List Construction
  • Tabu search replication

[Figure: two-level hierarchy; the TS master coordinates several TS workers (TSWs), and each TSW coordinates its own candidate-list workers (CLWs).]
30
Parallelization of the Proposed Algorithm
(Scenario)
Tabu Search Master (TSM):
  • Initialize data structures and read the initial
    solution
  • Spawn TSWs and pass them the arguments
  • Repeat for the number of global iterations:
  • Send the current solution to the TSWs
  • Get the best cost from all TSWs and take the best
    solution as the overall best
Tabu Search Worker (TSW):
  • Receive arguments from the TSM
  • Spawn CLWs and pass them the arguments
  • Receive the initial solution from the TSM
  • Perform a diversification step
  • Repeat for the number of local iterations:
  • Send the current solution to the CLWs
  • Get the best cost from all CLWs and take the best
    solution as the overall best
  • If the move isn't tabu or satisfies the AC, accept
    it; otherwise reject it
  • Send the best cost and best solution when the TSM
    asks for them
Candidate List Worker (CLW):
  • Receive arguments from the TSW
  • Receive the initial solution from the TSW
  • Investigate the neighborhood and find the best move
  • Send the best cost and best solution when the TSW
    asks for them
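
  The implementation uses PVM on a network of workstations; the following is
  only an illustrative Python multiprocessing sketch of the master/worker
  message pattern (one level shown; process counts, the toy cost function and
  the improving-swap rule are invented for the example):

    import random
    from multiprocessing import Process, Queue

    def cost(sol):                       # toy stand-in for the placement cost
        return sum(i * v for i, v in enumerate(sol))

    def ts_worker(wid, task_q, result_q, local_iters):
        sol = task_q.get()                           # initial solution from the master
        for _ in range(local_iters):                 # local iterations (CLW work folded in)
            i, j = random.sample(range(len(sol)), 2)
            cand = sol[:]
            cand[i], cand[j] = cand[j], cand[i]
            if cost(cand) < cost(sol):               # accept improving swaps only (simplified)
                sol = cand
        result_q.put((cost(sol), wid, sol))          # report best cost and solution to the master

    if __name__ == "__main__":
        n_workers, local_iters = 3, 200
        init = list(range(20))
        random.shuffle(init)
        task_q, result_q = Queue(), Queue()
        workers = [Process(target=ts_worker, args=(w, task_q, result_q, local_iters))
                   for w in range(n_workers)]
        for p in workers:
            p.start()
        for _ in workers:
            task_q.put(init[:])                      # broadcast the current solution
        best = min(result_q.get() for _ in workers)  # overall best across workers
        for p in workers:
            p.join()
        print("best cost:", best[0], "from worker", best[1])
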
31
Parallelization of the Proposed Algorithm (cell
selection)
  • When a CLW chooses two cells for a swap, one of
    them has to be from its range
  • If not, the prob. that 2 CLWs pick the same two
    cells is (2/n)²
  • Prob. that k CLWs pick the same two cells is
    (2/n)^k
  • In our case, the prob. that 2 CLWs pick the same 2
    cells is (1/n)²
  • Prob. that more than 2 CLWs pick the same cells
    is 0 ⇒
  • Prob. that 2 CLWs pick the same cells is reduced
    by a factor of 4
  • Prob. that k > 2 CLWs pick the same cells is
    eliminated
  • CLWs make compound moves ⇒ Sequential Fan CL
    Strat.

32
Parallelization of the Proposed Algorithm (# of
choices)
  • If CLWs have no restriction in choosing cells ⇒
    C(n, 2) choices
  • Prob. that the 2 cells are taken from the same
    range is k(k−1)/n²
  • If each of the k CLWs has to choose both cells from
    its range ⇒ k · C(n/k, 2) choices
  • In this case, if we have 100 cells and 4 CLWs ⇒
    we normally have 4950 choices and only 1200
    choices with the restriction ⇒ 75.8% of the
    neighborhood is ignored
  • In our case, the prob. that the 2 cells are taken
    from the same range is 1 · (k−1)/n = (k−1)/n ⇒ the
    prob. of choosing cells from the same range is
    multiplied by n/k ⇒ the prob. of choosing 1 cell
    from outside the range is reduced by n/k

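  A quick numeric check of the 100-cell / 4-CLW example above:

    from math import comb

    n, k = 100, 4
    unrestricted = comb(n, 2)                 # all possible swaps: C(100, 2) = 4950
    restricted = k * comb(n // k, 2)          # both cells inside one range: 4 * C(25, 2) = 1200
    ignored = 1 - restricted / unrestricted
    print(unrestricted, restricted, f"{ignored:.1%}")   # 4950 1200 75.8%
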
33
Applying the Algorithm in a Heterogeneous
Environment
  • A NOW is normally heterogeneous in
  • Machine architecture, data format, computational
    speed, network type, machine load and network
    load
  • PVM takes care of the first 2
  • In our implementation we account for the others
  • The master gets the best sol. from any TSW that
    has finished its LI (local iterations)
  • Once the TSWs that have finished make up half the
    total, the master asks the others
  • TSWs check for such a message every 10 iterations
  • Once they receive it, they kill the currently
    running CLWs and report their best solutions to
    the master

34
Applying the Algorithm in a Heterogeneous
Environment
  • The same principle applies between TSW & CLW
  • CLWs check frequently for a message that either
    kills them or asks them for their best
  • By that, we account for machine load, machine
    speed and network load heterogeneity
  • Experiments were run on PX/SPARC, Sparc-Station
    10, LX/SPARC and UltraSparc 1
  • All have the same OS (Solaris 2.5)

35
Diversification of the Search Process
  • What and why?
  • Penalization of frequent moves
  • Kelly et al. '94 proposed the following strategy
    for QAP
  • Let the most recent minimum be πmin = (πmin(1),
    πmin(2), ..., πmin(n)) and the current sol. be
    πcur = (πcur(1), πcur(2), ..., πcur(n)); then all
    swaps πcur(x) ↔ πcur(y) such that πcur(x) = πmin(x)
    or πcur(y) = πmin(y) are considered. The swap with
    the highest improvement or least degradation is
    performed. Such moves are made until no more such
    moves are available

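  A Python sketch of this diversification step, assuming a permutation-encoded
  solution and a cost() function ("until no more moves are available" is
  interpreted as: stop when no position still matches the recent minimum):

    def diversify(pi_cur, pi_min, cost):
        """Swap away from the most recent minimum, after the strategy above."""
        pi = pi_cur[:]
        while True:
            # candidate swaps: at least one endpoint still matches the recent minimum
            cands = [(x, y)
                     for x in range(len(pi)) for y in range(x + 1, len(pi))
                     if pi[x] == pi_min[x] or pi[y] == pi_min[y]]
            if not cands:
                break
            def after(move):                      # solution obtained by applying the swap
                x, y = move
                trial = pi[:]
                trial[x], trial[y] = trial[y], trial[x]
                return trial
            # swap with the highest improvement (or least degradation) in cost
            pi = after(min(cands, key=lambda m: cost(after(m))))
        return pi
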
36
Diversification of the Search Process
  • In our work, the given scheme is modified to make
    every TSW investigate a different space
  • Every time a TSW gets a solution from the TSM, it
    diversifies within its assigned range. It
    performs swaps to a predetermined depth. At every
    swap, it makes Nv trials and accepts the best
  • The 1st cell has to be from the range ⇒ the
    probability that 2 CLWs make the same move is
    reduced by a factor of 4 and the probability that
    k > 2 CLWs make the same move is eliminated
  • A condition for the swap is that the new
    locations have to be different from original ones

37
Experiments and Results
  • Effect of Low-level parallelization degree
  • Effect of High-level parallelization degree
  • Effect of Accounting for Heterogeneity
  • Effect of Diversification
  • Weighted Sum vs. Fuzzy Evaluation
  • Comparison with Previous Results

38
Experiments and Results (Benchmark
Characteristics)
39
Experiments and Results (Experiments Parameters)
40
Experiments and Results (Effect of Number of CLWs)
  • CLWs from 1 to 4 and TSWs fixed to 4
  • 12 Machines are included in the PVM

41
Experiments and Results (Effect of CLWs on Qual.)
42
Experiments and Results (Runtime of CLWs)
43
Experiments and Results (Speedup of CLWs)
44
Experiments and Results (Efficiency of CLWs)