Parallel Optimization Tools for High Performance Design of Integrated Circuits

1
Parallel Optimization Tools for High Performance
Design of Integrated Circuits

Azadeh Davoodi
Assistant Professor
(joint work with my student Tai-Hsuan Wu)
Department of Electrical and Computer Engineering

WISCAD VLSI Design Automation Lab
http://wiscad.ece.wisc.edu
Thanks to Jeff Linderoth
2
Research Optimality in IC Design

Optimality
required to assess the quality of existing design techniques
existing techniques currently use heuristics to solve large-scale, non-linear and discrete optimization problems
we have no idea how far a heuristic solution might be from the optimal solution

Optimality matters to shorten the design cycle
of Integrated Circuits and meet stringent
time-to-market requirements.
Source: MIPS Technologies
3
Optimization for High Performance Design

[Figure labels: j, dj, Tcons]

Discrete optimization problem
Typically the relaxed continuous version is
solved as a convex program and the result is
discretized
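
The relax-then-discretize flow described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' tool: the option lists and the relaxed solution are made-up values, `round_to_options` is a hypothetical helper, and the convex solve is assumed to have already produced `relaxed`.

```python
from typing import List, Sequence


def round_to_options(relaxed: Sequence[float],
                     options: Sequence[Sequence[float]]) -> List[float]:
    """Snap each continuous value to the nearest allowed discrete option."""
    return [min(opts, key=lambda o: abs(o - x))
            for x, opts in zip(relaxed, options)]


# Hypothetical data: three variables, each with its own discrete choices.
options = [[1.0, 2.0, 4.0], [1.0, 1.5, 3.0], [0.5, 1.0, 2.0, 4.0]]
relaxed = [2.7, 1.4, 3.1]   # assumed output of the convex relaxation

discrete = round_to_options(relaxed, options)
print(discrete)
```

Nearest-option rounding gives no guarantee on how close the result is to the discrete optimum, and in general it may even break a constraint that the relaxed solution satisfied.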

4
Examples of Optimization Complexity
Bench    # of Variables   Exhaustive Search Size   Reduced Search Size   Level in Search Tree
c5315    705              > E230                   E10                   35.11
c7552    822              > E230                   E08                   26.93
c6288    1256             > E230                   E11                   33.98
s1488    307              E230                     E11                   32.19
s1494    309              E227                     E09                   30.23
s9234    740              > E230                   E07                   18.77
s5378    930              > E230                   E09                   29.39
s38584   6950             > E230                   E09                   47.94
s35932   7260             > E230                   E10                   59.17
b20      24484            > E230                   E12                   68.34
Azadeh Davoodi--WISCAD
5
Using Master-Worker Framework of Condor for Grid
Optimization
http://www.cs.wisc.edu/condor/mw
Master

C++ APIs which facilitate
dynamic and opportunistic resource utilization
fault-tolerant implementation via checkpointing and job migration

[Figure: Master with pools of Unprocessed Tasks, Tasks in process, and Finished Tasks; tasks T1 to T9]
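
The task flow in the figure (unprocessed, in process, finished) can be mimicked with a small sketch. This is not the MW API: it uses Python's standard `concurrent.futures` thread pool as a stand-in, and `run_task`, `master`, and the squared-number workload are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_task(task_id: int) -> int:
    """Stand-in for the real work a worker would do on one task."""
    return task_id * task_id


def master(task_ids, num_workers=3):
    """Hand tasks to a pool of workers and collect results as they finish."""
    finished = {}
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Submitting moves a task from "unprocessed" to "in process".
        in_process = {pool.submit(run_task, t): t for t in task_ids}
        # Each completed future moves its task to "finished".
        for future in as_completed(in_process):
            finished[in_process[future]] = future.result()
    return finished


results = master(range(1, 10))   # tasks T1..T9, as in the figure
print(sorted(results.items()))
```

A thread pool has a fixed set of workers; the opportunistic allocation, checkpointing, and job migration listed on the slide are what a grid framework adds on top of this basic pattern.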
6
Master-Worker Implementation for High Performance
IC Design

Master
imposes variable ordering in the branch-and-bound
search tree
applies pruning of sub-optimal branches
checkpoints after every 5000 tasks completed by workers

Worker
each worker sequentially computes upper and lower bounds for K nodes in the search tree
and communicates the bounds to the Master
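
The division of labor above can be sketched on a toy problem. Everything here is an assumption for illustration: the instance (`OPTIONS`, `T_CONS`), the bound formulas, and the breadth-first frontier are made up, and the worker is called in-process rather than on a remote machine. The sketch picks one (delay, cost) option per variable to minimize total cost with total delay at most `T_CONS`.

```python
import math

# Hypothetical toy instance: per variable, a list of (delay, cost) options.
OPTIONS = [
    [(4, 1), (2, 3), (1, 6)],
    [(5, 2), (3, 4)],
    [(3, 1), (2, 2), (1, 5)],
]
T_CONS = 8   # assumed bound on the total delay
K = 2        # nodes evaluated per worker task


def bounds(node):
    """Return (lower, upper) bounds on the cost below a partial assignment."""
    delay = sum(OPTIONS[i][c][0] for i, c in enumerate(node))
    cost = sum(OPTIONS[i][c][1] for i, c in enumerate(node))
    rest = OPTIONS[len(node):]
    # Infeasible even with the fastest remaining options.
    if delay + sum(min(d for d, _ in opts) for opts in rest) > T_CONS:
        return math.inf, math.inf
    # Lower bound: cheapest remaining options, ignoring the delay constraint.
    lower = cost + sum(min(c for _, c in opts) for opts in rest)
    # Upper bound: a feasible completion using the fastest remaining options.
    upper = cost + sum(min(opts)[1] for opts in rest)
    return lower, upper


def worker(nodes):
    """Evaluate a batch of up to K nodes sequentially and report the bounds."""
    return [bounds(n) for n in nodes]


def master():
    """Impose the variable order, dispatch batches, prune, and branch."""
    best = math.inf
    frontier = [()]   # root node: no variable assigned yet
    while frontier:
        batch, frontier = frontier[:K], frontier[K:]
        for node, (lower, upper) in zip(batch, worker(batch)):
            best = min(best, upper)
            # Prune: nothing below this node can beat the incumbent.
            if lower >= best or len(node) == len(OPTIONS):
                continue
            # Branch on the next variable in the fixed ordering.
            frontier += [node + (c,) for c in range(len(OPTIONS[len(node)]))]
    return best


best_cost = master()
print(best_cost)
```

In a distributed run each `worker(batch)` call would be a task sent to a remote machine, and the Master would also write a checkpoint at a fixed task count.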

7
Dealing with Communication Overhead

Three types of data are exchanged between the Master and each Worker
scalar upper and lower bounds
circuit information (optimization problem
description)
partial variable assignment
Send above only once when the worker is allocated
and reuse each worker for future tasks as much as
possible
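
One way to picture the send-once idea is a worker object that stores the problem description when it is created and then receives only small per-task messages. This sketch assumes the one-time payload is the circuit information; the `Worker` class, the cost-only `circuit` data, and the bound formulas are hypothetical.

```python
class Worker:
    """Keeps the large, unchanging problem description between tasks."""

    def __init__(self, circuit):
        # Sent once, when the worker is allocated (assumed to be the big payload).
        self.circuit = circuit
        self.tasks_done = 0

    def run(self, partial_assignment):
        """Per-task input is a short assignment; output is two scalars."""
        self.tasks_done += 1
        rest = self.circuit[len(partial_assignment):]
        assigned = sum(self.circuit[i][c]
                       for i, c in enumerate(partial_assignment))
        lower = assigned + sum(min(opts) for opts in rest)   # best case
        upper = assigned + sum(opts[0] for opts in rest)     # one completion
        return lower, upper


# Hypothetical data: cost of each discrete option, per variable.
circuit = [[3, 1], [2, 5], [4, 2]]
worker = Worker(circuit)          # circuit transferred once
first = worker.run((1,))          # later tasks reuse the stored circuit
second = worker.run((0, 1))
print(first, second, worker.tasks_done)
```

Reusing the same worker for many tasks spreads the one-time transfer cost over all of them, which is why the slide recommends keeping workers as long as possible.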

8
MW Implementation in Condor

MASTER SUBMIT FILE
Universe = Scheduler
Executable = master_DGS_socket
Image_Size = 100000
MemoryRequirements = 100
Input = in_master.socket
Output = out_master.socket
Error = out_worker.socket
Log = _DGS.log
Requirements = (Arch == "INTEL" && OPSYS == "LINUX")
getenv = True
Queue

WORKER SUBMIT FILE
Universe = Vanilla
Worker 1
Executable = exec0.$(Opsys).$(Arch).exe
arguments = 0 8997 8997 144.92.240.35
Log = log_file
Output = output_file.0
Error = error_file.0
Requirements = (Arch == "INTEL" && OPSYS == "LINUX")
should_transfer_files = Yes
when_to_transfer_output = ON_EXIT
rank = Mips
on_exit_remove = false
Queue
Worker 2

Resource Information
179 CAE machines (Intel/Linux)
If all CAE machines are in use, jobs flock to the queue of Intel/Linux machines in CS

9
Results
On average, each variable had 4.5 discrete options to choose from.
Bench    # of Variables   Runtime    Max Workers   Average Workers
c5315    705              36 min     118           105.74
c6288    1256             66 min     126           114.94
c7552    822              31 min     113           101.97
s5378    930              39 min     129           95.57
s9234    740              52 min     119           95.8
s15850   617              48 min     139           112.67
s35932   7260             36 min     163           108.15
s38584   6950             62 min     133           113.86
b18      47191            52 hours   192           189.82
b20      15699            28 hours   187           167.29
b22      24484            38 hours   190           173.73
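
For a rough sense of scale, the number of complete assignments can be estimated as (options per variable) raised to the number of variables. The sketch below assumes every variable has exactly the reported average of 4.5 options, which the slides do not state, so the figures are order-of-magnitude estimates only; `log10_search_size` is a hypothetical helper.

```python
import math


def log10_search_size(num_variables, options_per_variable=4.5):
    """Base-10 logarithm of options_per_variable ** num_variables."""
    return num_variables * math.log10(options_per_variable)


# Variable counts taken from the results table above.
for bench, n in [("c5315", 705), ("b18", 47191)]:
    print(f"{bench}: roughly 10^{log10_search_size(n):.0f} assignments")
```

Working in logarithms avoids overflowing a floating-point number, since the sizes involved are far beyond what a double can represent.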
10
Future Plans
