VLSI Design Abstraction Levels

Transcript and Presenter's Notes

Title: VLSI Design Abstraction Levels


1
Introduction
  • VLSI Design Abstraction Levels

Idea → Architectural Design → Logical Design → Physical Design → Fabrication → New Chip
2
Introduction (VLSI Placement Problem Definition)
  • Informal definition and objective
  • M = {m1, m2, ..., mn} is the set of modules and
    S = {s1, s2, ..., sk} the set of signals (nets)
  • Each mi is associated with a set of signals Smi ⊆ S
  • Each si is associated with a set of modules
    Msi = {mj | si ∈ Smj}
  • L = {L1, L2, ..., Lp} is the set of locations, p ≥ n
  • Complexity is n! ⇒ NP-hard problem

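  A compact formal statement consistent with the notation above (a sketch;
  the cost actually optimized later in the talk is multi-objective, and WL
  here stands for any interconnection-length estimate):

    % Placement as an assignment of modules to locations (sketch).
    \begin{align*}
      &\text{find an injective assignment } \pi : M \rightarrow L,\quad p \ge n,\\
      &\text{minimizing } \mathrm{Cost}(\pi) = \sum_{s_i \in S}
        \mathrm{WL}\bigl(\{\pi(m_j) : m_j \in M_{s_i}\}\bigr).
    \end{align*}
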
3
Introduction (VLSI Placement Styles)
  • Full Custom Layout
  • Gate Array Methodology
  • Macro Cell
  • Standard Cell Methodology

[Figure: standard-cell layout with pads, cell rows, routing channels, a feedthrough cell and wasted space.]
4
Introduction (Heuristics Applied to VLSI
Placement)
  • Deterministic vs. Stochastic Heuristics
  • Constructive vs. Iterative Heuristics
  • Constructive Deterministic
  • Linear Placement Alg., Min-Cut Placement Alg. and
    Force Directed Alg.
  • Iterative Stochastic
  • Simulated Annealing, Genetic Alg., Simulated
    Evolution and Tabu Search

5
Multi-Objective Placement Problem
  • Multiple Objectives and Constraints
  • Conflicting Objectives
  • Interconnection Length
  • Area
  • Critical Path Delay
  • Overall Solution Quality Evaluation

6
Multi-Objective Placement (Interconnection Length)
  • Significance
  • Net Definition and Estimation Techniques
  • Steiner Tree Approximation

[Figure: example net with four terminals and its Steiner tree approximation.]
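
  As one concrete estimation technique, a minimal Python sketch of the
  half-perimeter (bounding-box) wire-length estimate, a common stand-in for
  the Steiner tree length; the pin-coordinate representation is an
  assumption, not taken from the slides:

    # Half-perimeter wire length (HPWL): a cheap estimate of a net's
    # Steiner tree length. A net is assumed to be a list of (x, y) pins.
    def hpwl(pins):
        """Half-perimeter of the bounding box enclosing all pins of one net."""
        xs = [x for x, _ in pins]
        ys = [y for _, y in pins]
        return (max(xs) - min(xs)) + (max(ys) - min(ys))

    def total_wire_length(nets):
        """Sum of HPWL estimates over all nets of a placement."""
        return sum(hpwl(pins) for pins in nets)

    # Example: a four-terminal net.
    print(hpwl([(0, 0), (3, 1), (1, 4), (2, 2)]))  # -> 3 + 4 = 7
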
7
Multi-Objective Placement (Area)
  • Standard Cell Placement uses fixed height cells
  • Fixed-height channels are assumed ⇒
  • Only the width varies from one solution to another
  • The width is given by the longest row

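  Under these assumptions the area objective reduces to the longest row; a
  minimal sketch (the rows-of-cell-widths input format is assumed):

    # Fixed-height rows and channels: layout width = width of the longest row.
    def layout_width(rows):
        """rows: list of rows, each a list of cell widths."""
        return max(sum(cell_widths) for cell_widths in rows)

    print(layout_width([[2, 3, 4], [5, 1], [2, 2, 2, 2]]))  # -> 9
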
8
Multi-Objective Placement (Critical Path Delay)
  • A VLSI ckt. is a collection of paths
  • Path and Critical Path Definitions
  • Given a path π, let v1, v2, ..., vk be the nets
    belonging to π. The delay of the path π is given
    by the formula sketched below

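  The delay formula itself was an image on the slide; a hedged
  reconstruction in the usual standard-cell timing model (the CD/ID split
  below is an assumption, not read from the slide):

    % Assumed path-delay model: switching delay of the driving cell plus
    % interconnect delay of each net v_i on the path pi.
    T_{\pi} = \sum_{i=1}^{k} \left( CD_{i} + ID_{i} \right),
    \qquad T_{c} = \max_{\pi} T_{\pi}
    % CD_i: switching delay of the cell driving net v_i
    % ID_i: interconnect (placement-dependent) delay of net v_i
    % T_c : critical path delay = delay of the longest path
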
9
Multi-Objective Placement (Overall Solution
Evaluation)
  • Multi-objective ⇒ vector quantity
  • Weighted Sum vs. Fuzzy Logic
  • Fuzzy Goal-Based Cost Measure
  • C(x) = (C1(x), C2(x), ..., Cp(x))
  • O = (O1, O2, ..., Op), Oi ≤ Ci(x) ∀i, ∀x ∈ K
  • G = (g1, g2, ..., gp)
  • x is acceptable if Ci(x) ≤ gi · Oi

10
Multi-Objective Placement (Acceptable Solutions)
[Figure: the region of acceptable solutions in the (Cwl(x), Cdelay(x), Cwidth(x)) space, bounded below by the optima Owl, Odelay, Owidth (points below these are impossible) and above by the goals gwl·Owl, gdelay·Odelay, gwidth·Owidth.]
11
Multi-Objective Placement (Membership Rule)
  • The rule for determining membership in the fuzzy
    set is
  • If the solution is within acceptable wire length
    AND within acceptable circuit delay AND within
    acceptable width, THEN it is acceptable.

12
Multi-Objective Placement (Membership Function)
[Figure: membership function μi^c(x) versus the normalized cost Ci(x)/Oi: equal to 1.0 up to Ci(x)/Oi = 1 and decreasing to 0 as Ci(x)/Oi approaches gi.]
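
  A minimal Python sketch of this goal-based fuzzy evaluation, assuming a
  piecewise-linear membership function and min() as the fuzzy AND over the
  three objectives (both are assumptions; the thesis may aggregate
  differently):

    # Goal-based fuzzy cost evaluation (sketch).
    def membership(cost, optimum, goal):
        """mu = 1 while cost <= optimum; falls linearly to 0 at cost = goal*optimum."""
        ratio = cost / optimum
        if ratio <= 1.0:
            return 1.0
        if ratio >= goal:
            return 0.0
        return (goal - ratio) / (goal - 1.0)

    def fuzzy_quality(costs, optima, goals):
        """Overall acceptability: the weakest objective dominates (fuzzy AND = min)."""
        return min(membership(c, o, g) for c, o, g in zip(costs, optima, goals))

    # Example: wire length, delay and width against their lower bounds and goals.
    print(fuzzy_quality(costs=(1200.0, 9.5, 400.0),
                        optima=(1000.0, 8.0, 380.0),
                        goals=(1.5, 1.4, 1.3)))
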
13
Multi-Objective Placement (Lower Bounds)
  • Optimum values for wire length, delay and width
    are computed according to

14
Tabu Search
  • Definition, Key feature and Memory
  • TS vs. Local Search
  • Basic TS Algorithm (flowchart):
  1. Start with an initial solution s. Initialize TL
     (Tabu List), AL (Aspiration Level) and Counter.
  2. Investigate a subset V of the neighborhood of s
     and find the best solution s' in V.
  3. If the move (s, s') is not tabu, accept it (s = s');
     if it is tabu, accept it only if C(s') < AL,
     otherwise reject it.
  4. Update TL and AL; Counter = Counter + 1.
  5. If Counter = N_Iterations, report the best
     solution; otherwise go to step 2.
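
  A compact Python sketch of this short-term-memory loop; the neighborhood
  sampler, cost function and hashable move attributes are assumptions
  (names such as sample_neighbors are illustrative, not from the slides):

    # Short-term-memory tabu search with aspiration by objective (sketch).
    def tabu_search(s0, cost, sample_neighbors, tenure, n_iterations):
        best, best_cost = s0, cost(s0)
        s = s0
        tabu_until = {}              # move attribute -> iteration it stays tabu until
        aspiration = best_cost       # aspiration level: best cost seen so far
        for it in range(n_iterations):
            candidates = sample_neighbors(s)          # subset V of the neighborhood
            move, s_new = min(candidates, key=lambda ms: cost(ms[1]))
            c_new = cost(s_new)
            is_tabu = tabu_until.get(move, -1) > it
            if (not is_tabu) or c_new < aspiration:   # aspiration overrides tabu status
                s = s_new
                tabu_until[move] = it + tenure        # forbid reversing this move for a while
                if c_new < best_cost:
                    best, best_cost = s_new, c_new
                    aspiration = best_cost
        return best, best_cost
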
15
Tabu Search Parameters (CL)
  • Candidate List
  • why a candidate list?
  • Construction Strategies
  • Aspiration Plus
  • Elite Candidate List
  • Successive Filter Strategy
  • Sequential Fan Candidate List
  • Bounded Change Candidate List

16
Tabu Search Parameters (Moves)
  • Moves and Move Attributes
  • what is a move?
  • why move attributes?
  • Complementing a binary variable in the solution
  • Having a change of C(s') − C(s) in the cost
  • Similar change in another problem based function
  • Any combination of the above

17
Tabu Search Parameters (Cost & TL)
  • Evaluation Function
  • Implementation vs. operation cost
  • single or multiple objectives
  • Tabu List
  • Local search, intensification and diversification
  • Tabu Tenure
  • problem size
  • Search objective

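  A small illustration of a move attribute and a size-dependent tabu tenure
  in Python; both the unordered-pair attribute and the square-root rule are
  illustrative choices, not the thesis's prescription:

    import math

    def swap_attribute(cell_a, cell_b):
        """Attribute of a swap move: the unordered pair of swapped cell numbers."""
        return frozenset((cell_a, cell_b))

    def tabu_tenure(n_cells):
        """A common rule of thumb: tenure grows slowly with circuit size."""
        return max(5, int(math.sqrt(n_cells)))

    recent = {swap_attribute(12, 47): 10}       # attribute -> tabu until iteration 10
    print(swap_attribute(47, 12) in recent)     # True: cell order doesn't matter
    print(tabu_tenure(900))                     # 30
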
18
Tabu Search Parameters (AC)
  • Aspiration Criteria
  • What and Why?
  • Global Aspiration by Objective
  • Regional Aspiration by Objective
  • Aspiration by Search Direction
  • Aspiration by Influence

19
Tabu Search Classes
  • Short Term Memory
  • Intermediate Term Memory
  • Long Term Memory

20
Literature Review (Heuristics Applied to VLSI
Placement)
  • Linear Placement Algorithm (CD)
  • Min-Cut Placement (CD)
  • Force Directed Placement (ID)
  • Simulated Annealing (IS)
  • Genetic Algorithm (IS)
  • Simulated Evolution (IS)
  • Tabu Search (IS)

21
Literature Review (Tabu Search in VLSI Placement)
  • Lim, Chee and Wu '92
  • Macro-Cell Placement with global routing
  • Quad Partitioning Using TS to minimize delay
  • Interconnect and cell delay (Weighted Sum)
  • Lin and Du '90
  • Capacitor Placement in radial distribution sys.
  • Minimize energy loss. Short-term TS with a random CL
  • Improvement in quality and time over SA

22
Literature Review (Tabu Search in VLSI Placement)
  • Handa and Kuga '95
  • Analog LSI chip designs placement
  • Wire length, area and ease of routing
  • TS performed better when imposed on GA
  • Mackey and Carothers '96
  • Quad-partitioning VLSI Macro-cell placement
  • Wire length minimization using Fuzzy Cost
  • Significant improvement over Lim, Chee and Wu

23
Literature Review (Tabu Search Parallelization)
  • Purposes
  • How to parallelize?
  • One search for time t or p searches for time t/p

24
Literature Review (Tabu Search Parallelization)
  • Taillard '90
  • Neighborhood examination for FSSP
  • A master broadcasts an initial solution
  • Slaves send their best found neighbors
  • The master picks the overall best move
  • It continues for a fixed number of iterations or
    until no improvement is observed

25
Literature Review (Tabu Search Parallelization)
  • Garcia, Potvin and Rousseau '94
  • Parallel TS for vehicle routing
  • A master and slaves investigate neighborhood
  • Each process sends its best move
  • The master broadcasts a set of best moves
  • De Falco et al. '94
  • Evolution principles were included in PTS
  • Neighbor machines exchange best solutions
  • If the incoming best is better, it replaces the
    local one

26
Literature Review (Tabu Search Parallelization)
  • Nair and Freville '97
  • PTS for 0-1 multi-knapsack problem
  • A master generates initial solutions and
    strategies
  • Mori and Hayashi '98
  • PTS for voltage and reactive power control
  • Two Schemes
  • Neighborhood Investigation
  • Search replication with different Tabu Tenures

27
Proposed Algorithm (Basic Proposed TS Algorithm)
  • Reads a randomly generated solution and initializes
  • The Alg. runs STMTS for a fixed # of iterations
  • CL is constructed using random moves
  • A move is swapping two cells
  • A move attribute is the swapped cells' numbers
  • A compound move can be made with depth d
    examining Nv neighbors at each step.
  • Tabu Tenure used depends on the circuit size

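  A Python sketch of building such a candidate list of compound swap moves
  with depth d and Nv trial neighbors per step; the list-based solution
  encoding and the trial evaluation are assumptions:

    import random

    def random_swap(solution):
        """Pick two distinct positions; the pair is the move."""
        return tuple(random.sample(range(len(solution)), 2))

    def apply_swap(solution, move):
        i, j = move
        s = solution[:]                  # work on a copy
        s[i], s[j] = s[j], s[i]
        return s

    def compound_move(solution, cost, depth, nv):
        """Chain `depth` swaps; at each step keep the best of `nv` random trials."""
        moves, current = [], solution
        for _ in range(depth):
            trials = [random_swap(current) for _ in range(nv)]
            best = min(trials, key=lambda m: cost(apply_swap(current, m)))
            current = apply_swap(current, best)
            moves.append(best)
        return moves, current
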
28
Proposed Algorithm (Basic Proposed TS Algorithm)
  • Aspiration by Objective
  • Cost Function used is the same as the one
    proposed by Ali in his MS thesis
  • Wire length, delay and width are computed
  • It uses Fuzzy Goal Based Cost Measure

29
Parallelization of the Proposed Algorithm
  • Parallelized on a NOW (network of workstations) using PVM
  • Why?
  • Two levels of parallelization
  • Candidate List Construction
  • Tabu search replication

[Figure: two-level hierarchy; the TS master coordinates several TS workers (TSWs), and each TSW coordinates its own candidate-list workers (CLWs).]
30
Parallelization of the Proposed Algorithm
(Scenario)
Tabu Search Master (TSM):
  • Initialize data structures and read the initial
    solution
  • Spawn TSWs and pass them the arguments
  • Repeat for the number of global iterations:
  • Send the current solution to the TSWs
  • Get the best cost from all TSWs and take the best
    solution as the overall best
Tabu Search Worker (TSW):
  • Receive arguments from the TSM
  • Spawn CLWs and pass them the arguments
  • Receive the initial solution from the TSM
  • Perform a diversification step
  • Repeat for the number of local iterations:
  • Send the current solution to the CLWs
  • Get the best cost from all CLWs and take the best
    solution as the overall best
  • If the move isn't tabu or satisfies the AC, accept
    it; otherwise reject it
  • Send the best cost and best solution when the TSM
    asks for them
Candidate List Worker (CLW):
  • Receive arguments from the TSW
  • Receive the initial solution from the TSW
  • Investigate the neighborhood and find the best move
  • Send the best cost and best solution when the TSW
    asks for them
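
  The implementation uses PVM on a network of workstations; the following is
  only an illustrative Python multiprocessing sketch of the master/worker
  message pattern (one level shown; process counts, the toy cost function and
  the improving-swap rule are invented for the example):

    import random
    from multiprocessing import Process, Queue

    def cost(sol):                       # toy stand-in for the placement cost
        return sum(i * v for i, v in enumerate(sol))

    def ts_worker(wid, task_q, result_q, local_iters):
        sol = task_q.get()                           # initial solution from the master
        for _ in range(local_iters):                 # local iterations (CLW work folded in)
            i, j = random.sample(range(len(sol)), 2)
            cand = sol[:]
            cand[i], cand[j] = cand[j], cand[i]
            if cost(cand) < cost(sol):               # accept improving swaps only (simplified)
                sol = cand
        result_q.put((cost(sol), wid, sol))          # report best cost and solution to the master

    if __name__ == "__main__":
        n_workers, local_iters = 3, 200
        init = list(range(20))
        random.shuffle(init)
        task_q, result_q = Queue(), Queue()
        workers = [Process(target=ts_worker, args=(w, task_q, result_q, local_iters))
                   for w in range(n_workers)]
        for p in workers:
            p.start()
        for _ in workers:
            task_q.put(init[:])                      # broadcast the current solution
        best = min(result_q.get() for _ in workers)  # overall best across workers
        for p in workers:
            p.join()
        print("best cost:", best[0], "from worker", best[1])
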
31
Parallelization of the Proposed Algorithm (cell
selection)
  • When a CLW chooses two cells for a swap, one of
    them has to be from its range
  • If not, the prob. that 2 CLWs pick the same two
    cells is (2/n)²
  • Prob. that k CLWs pick the same two cells is
    (2/n)^k
  • In our case, the prob. that 2 CLWs pick the same 2
    cells is (1/n)²
  • Prob. that more than 2 CLWs pick the same cells
    is 0 ⇒
  • Prob. that 2 CLWs pick the same cells is reduced
    by a factor of 4
  • Prob. that k > 2 CLWs pick the same cells is
    eliminated
  • CLWs make compound moves ⇒ Sequential Fan CL
    Strat.

32
Parallelization of the Proposed Algorithm (# of
choices)
  • If CLWs have no restriction in choosing cells ⇒
    C(n, 2) choices
  • Prob. that the 2 cells are taken from the same
    range is k(k−1)/n²
  • If each of the k CLWs has to choose both cells from
    its range ⇒ k · C(n/k, 2) choices
  • In this case, if we have 100 cells and 4 CLWs ⇒
    we normally have 4950 choices and only 1200
    choices with the restriction ⇒ 75.8% of the
    neighborhood is ignored
  • In our case, the prob. that the 2 cells are taken
    from the same range is 1 · (k−1)/n = (k−1)/n ⇒ the
    prob. of choosing cells from the same range is
    multiplied by n/k ⇒ the prob. of choosing 1 cell
    from outside the range is reduced by n/k

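  A quick numeric check of the 100-cell / 4-CLW example above:

    from math import comb

    n, k = 100, 4
    unrestricted = comb(n, 2)                 # all possible swaps: C(100, 2) = 4950
    restricted = k * comb(n // k, 2)          # both cells inside one range: 4 * C(25, 2) = 1200
    ignored = 1 - restricted / unrestricted
    print(unrestricted, restricted, f"{ignored:.1%}")   # 4950 1200 75.8%
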
33
Applying the Algorithm in a Heterogeneous
Environment
  • A NOW is normally heterogeneous in
  • Machine architecture, data format, computational
    speed, network type, machine load and network
    load
  • PVM takes care of the first 2
  • In our implementation we account for the others
  • The master gets the best sol. from any TSW that
    has finished its LI (local iterations)
  • Once the TSWs that have finished make up half the
    total, the master asks the others
  • TSWs check for such a message every 10 iterations
  • Once they receive it, they kill the currently
    running CLWs and report their best solutions to
    the master

34
Applying the Algorithm in a Heterogeneous
Environment
  • The same principle applies between TSW & CLW
  • CLWs check frequently for a message that either
    kills them or asks them for their best
  • By that, we account for machine load, machine
    speed and network load heterogeneity
  • Experiments were run on PX/SPARC, Sparc-Station
    10, LX/SPARC and UltraSparc 1
  • All have the same OS (Solaris 2.5)

35
Diversification of the Search Process
  • What and why?
  • Penalization of frequent moves
  • Kelly et al. '94 proposed the following strategy
    for QAP
  • Let the most recent minimum be πmin = (πmin(1),
    πmin(2), ..., πmin(n)) and the current sol. be
    πcur = (πcur(1), πcur(2), ..., πcur(n)); then all
    swaps πcur(x) ↔ πcur(y) such that πcur(x) = πmin(x)
    or πcur(y) = πmin(y) are considered. The swap with
    the highest improvement or least degradation is
    performed. Such moves are made until no more such
    moves are available

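  A Python sketch of this diversification step, assuming a permutation-encoded
  solution and a cost() function ("until no more moves are available" is
  interpreted as: stop when no position still matches the recent minimum):

    def diversify(pi_cur, pi_min, cost):
        """Swap away from the most recent minimum, after the strategy above."""
        pi = pi_cur[:]
        while True:
            # candidate swaps: at least one endpoint still matches the recent minimum
            cands = [(x, y)
                     for x in range(len(pi)) for y in range(x + 1, len(pi))
                     if pi[x] == pi_min[x] or pi[y] == pi_min[y]]
            if not cands:
                break
            def after(move):                      # solution obtained by applying the swap
                x, y = move
                trial = pi[:]
                trial[x], trial[y] = trial[y], trial[x]
                return trial
            # swap with the highest improvement (or least degradation) in cost
            pi = after(min(cands, key=lambda m: cost(after(m))))
        return pi
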
36
Diversification of the Search Process
  • In our work, the given scheme is modified to make
    every TSW investigate a different space
  • Every time a TSW gets a solution from the TSM, it
    diversifies within its assigned range. It
    performs swaps to a predetermined depth. At every
    swap, it makes Nv trials and accepts the best
  • The 1st cell has to be from the range ⇒ the
    probability that 2 CLWs make the same move is
    reduced by a factor of 4 and the probability that
    k > 2 CLWs make the same move is eliminated
  • A condition for the swap is that the new
    locations have to be different from original ones

37
Experiments and Results
  • Effect of Low-level parallelization degree
  • Effect of High-level parallelization degree
  • Effect of Accounting for Heterogeneity
  • Effect of Diversification
  • Weighted Sum vs. Fuzzy Evaluation
  • Comparison with Previous Results

38
Experiments and Results (Benchmark
Characteristics)
39
Experiments and Results (Experiments Parameters)
40
Experiments and Results (Effect of Number of CLWs)
  • CLWs from 1 to 4 and TSWs fixed to 4
  • 12 Machines are included in the PVM

41
Experiments and Results (Effect of CLWs on Qual.)
42
Experiments and Results (Runtime of CLWs)
43
Experiments and Results (Speedup of CLWs)
44
Experiments and Results (Efficiency of CLWs)