Title: DISTRIBUTED GENERATION OF PAIRWISE COMBINATIONS
1PARALLEL GRAPH PARTITIONING ON A HYPERCUBE
- DISTRIBUTED GENERATION OF PAIRWISE COMBINATIONS
F. Ercal, P. Sadayappan, and J. Ramanujan
University of Missouri-Rolla and The Ohio State University
2PROBLEM DEFINITION
- Given a graph G(V,E) with $|V| = N$ and $|E| = e$
- Obtain K partitions from G subject to the following constraints
  - Balanced: each partition has equal size
  - Minimum cut: the number of edges crossing between partitions is minimized
- Arises in task allocation, VLSI layout, file placement, etc.
- Intractable: no polynomial-time algorithm is known
- Heuristics are needed
  - Kernighan-Lin mincut heuristic (1970)
    - Time complexity $O(N^2 \log N)$
  - Extension by Fiduccia and Mattheyses (1982)
    - Uses buckets and single-vertex moves; linear-time algorithm, $O(e)$
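The two constraints in the problem definition can be checked mechanically. The sketch below is illustrative only: the function names `cut_size` and `is_balanced`, the edge-list representation, and the tiny example graph are assumptions, not part of the original slides.

```python
def cut_size(edges, part):
    """Count edges whose two endpoints lie in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])


def is_balanced(part, k):
    """True if all k partitions contain the same number of vertices."""
    sizes = [0] * k
    for p in part.values():
        sizes[p] += 1
    return max(sizes) == min(sizes)


# Hypothetical 4-vertex graph: a square 0-1-2-3 plus the diagonal 0-2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
part = {0: 0, 1: 0, 2: 1, 3: 1}  # vertex -> partition index

print(cut_size(edges, part), is_balanced(part, 2))
```

A partitioning heuristic such as Kernighan-Lin tries to lower the first number while keeping the second check true.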
3MINCUT ALGORITHM
CUT = 5
IF V2 MOVES: GAIN = 2 and TOT_GAIN = 2
CUT = 3
IF V5 MOVES: GAIN = 1 and TOT_GAIN = 3
4MINCUT ALGORITHM (Contd..)
[Figure: graph with vertices v1 to v8; move gains shown: -2, 0, -1, -2, -3, -1]
CUT = 2
IF V1 MOVES: GAIN = 0 and TOT_GAIN = 3
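The CUT / GAIN / TOT_GAIN bookkeeping in the example can be sketched as one pass of a move-based mincut heuristic. This is a simplified illustration, not the authors' code: it omits the bucket structure of Fiduccia and Mattheyses, keeps balance with an assumed rule (a vertex may only leave the side that is currently at least as large), and uses a hypothetical six-vertex graph.

```python
def gain(v, adj, side):
    """Cut reduction if v switches sides: external minus internal edges."""
    external = sum(1 for u in adj[v] if side[u] != side[v])
    return external - (len(adj[v]) - external)


def cut_size(adj, side):
    """Number of edges crossing the bisection (each edge is seen twice)."""
    return sum(side[u] != side[v] for v in adj for u in adj[v]) // 2


def mincut_pass(adj, side):
    """Move each vertex once, then keep the prefix with the best total gain."""
    side = dict(side)
    count = [0, 0]
    for s in side.values():
        count[s] += 1
    start_diff = abs(count[0] - count[1])
    locked, moves = set(), []
    total = best_total = best_len = 0
    while True:
        # Assumed balance rule: only move from the side that is not smaller.
        free = [v for v in adj
                if v not in locked and count[side[v]] >= count[1 - side[v]]]
        if not free:
            break
        v = max(free, key=lambda x: gain(x, adj, side))
        total += gain(v, adj, side)          # running TOT_GAIN
        count[side[v]] -= 1
        side[v] = 1 - side[v]
        count[side[v]] += 1
        locked.add(v)
        moves.append(v)
        # Only record states that are as balanced as the starting bisection.
        if abs(count[0] - count[1]) <= start_diff and total > best_total:
            best_total, best_len = total, len(moves)
    for v in moves[best_len:]:               # undo moves past the best prefix
        side[v] = 1 - side[v]
    return side, best_total


# Hypothetical graph: two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
start = {0: 0, 1: 0, 2: 1, 3: 0, 4: 1, 5: 1}  # vertices 2 and 3 misplaced
result, total_gain = mincut_pass(adj, start)
print(cut_size(adj, start), total_gain, cut_size(adj, result))
```

As in the slides, the cut after each move equals the previous cut minus the gain of the moved vertex, and the pass keeps the point at which the accumulated gain was largest.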
5RECURSIVE BISECTION
6TIME COMPLEXITY
Sequential Time Complexity for Recursive Bisection
$N + 2(N/2) + 4(N/4) + \cdots + 2^p (N/2^p) \Rightarrow O(N \log K)$
Parallel Time Complexity for Recursive Bisection
$N + N/2 + N/4 + \cdots + N/2^p \Rightarrow O(N)$
- COMMENT
- speedup is very limited
- to increase speedup, bisection algorithm must be
parallelized
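To see why the speedup is limited, the two sums can be evaluated numerically. The sketch below assumes a bisection step whose cost is proportional to the size of the sub-graph being split; the function name `rb_costs` is hypothetical.

```python
def rb_costs(n, k):
    """Return (sequential, parallel) cost of recursive bisection into k parts.

    Assumes bisecting a sub-graph of size s costs s units, and that in the
    parallel case every sub-graph on a level has its own processor.
    """
    sequential = parallel = 0
    parts, size = 1, n
    while parts < k:
        sequential += parts * size  # all sub-graphs on this level, one by one
        parallel += size            # sub-graphs on this level run concurrently
        parts, size = parts * 2, size // 2
    return sequential, parallel


seq, par = rb_costs(1024, 8)
print(seq, par, seq / par)
```

For N = 1024 and K = 8 this gives 3072 and 1792 units, a speedup of about 1.7, because the first bisection of the whole graph still runs on a single processor.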
7PAIRWISE MINCUT
PAIRS TO BE CONSIDERED FOR MINCUT:
(1,2) (1,3) (1,4) (1,5) (1,6) (1,7) (1,8)
(2,3) (2,4) ... (2,8)
...
(7,8)
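The pair list for 8 partitions can be enumerated directly. This is a small illustration using the standard library; the partition labels 1 to 8 follow the slide.

```python
from itertools import combinations

# All unordered pairs of the 8 partitions, in the order listed on the slide.
pairs = list(combinations(range(1, 9), 2))

print(len(pairs), pairs[0], pairs[-1])
```

The count is K(K-1)/2 = 28 for K = 8, which is the number of two-way mincut problems a sequential pairwise pass has to solve.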
8TIME COMPLEXITY
Sequential Time Complexity for Pairwise Mincut
Parallel Time Complexity for Pairwise Mincut
(100% processor utilization)
- CONCLUSIONS
  - Sequential Recursive Bisection (RB) has much lower time complexity than Pairwise Mincut (PM)
  - but the superior parallelizability of PM renders its parallel time complexity comparable to that of parallel RB
9
1) RECURSIVE BISECTION
- Perform repeated bisection, each time doubling
the number of partitions, until K partitions are
obtained
Time Complexity
$N + 2(N/2) + 4(N/4) + \cdots + 2^p (N/2^p) \Rightarrow O(N \log K)$
2) PAIRWISE MINCUT
- Initially obtain K partitions. Try to reduce the
cut-size between each pair of partitions.
K(K-1)/2 pairs (each of size 2N/K) must be
considered
Time Complexity
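The pair count and pair size stated above suggest the following estimate. It is a reconstruction under two assumptions that are not stated on this slide: a two-way mincut on one pair costs time linear in the pair size $2N/K$, and in the parallel case $K/2$ processors each handle one disjoint pair per round, so that all pairs are covered in $K-1$ rounds.

```latex
\begin{align*}
T_{\text{seq}}^{\text{PM}} &= \frac{K(K-1)}{2} \cdot \frac{2N}{K} = (K-1)\,N = O(NK) \\
T_{\text{par}}^{\text{PM}} &= (K-1) \cdot \frac{2N}{K} = O(N)
\end{align*}
```

Under these assumptions the sequential cost exceeds the $O(N \log K)$ of recursive bisection, while the parallel cost is of the same order as that of parallel recursive bisection.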
3) Any combination of RECURSIVE BISECTION + PAIRWISE MINCUT
10DISTRIBUTED GENERATION OF PAIRWISE COMBINATIONS
ON A HYPERCUBE
Problem
- Given 2P disjoint items, P(2P-1) distinct pairs can be formed.
- How would you efficiently generate these pairs on the processors of a hypercube?
- Similar to the problem of distributed scheduling of a round-robin tournament between 2C players using C courts, where the paths between courts form a hypercube topology
  - maximum utilization of courts (processor utilization)
  - minimum walking between courts (min. comm. overhead)
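The tournament analogy can be made concrete with the classical circle method for round-robin scheduling. Note that this is a well-known sequential scheme used here only to illustrate the counting argument; it is not the distributed algorithm of these slides and it ignores the hypercube topology. The function name `round_robin` is hypothetical.

```python
def round_robin(players):
    """Circle method: 2C players, 2C-1 rounds, C simultaneous games per round."""
    n = len(players)
    assert n % 2 == 0, "needs an even number of players"
    ring = list(players)
    rounds = []
    for _ in range(n - 1):
        # Pair the i-th player from the front with the i-th from the back.
        rounds.append([(ring[i], ring[n - 1 - i]) for i in range(n // 2)])
        # Keep the first player fixed and rotate everyone else by one seat.
        ring = [ring[0], ring[-1]] + ring[1:-1]
    return rounds


schedule = round_robin(list(range(8)))  # 2C = 8 players, C = 4 courts
print(len(schedule), sum(len(r) for r in schedule))
```

With 8 players this yields 7 rounds of 4 games, that is P(2P-1) = 28 pairs with every court busy in every round, which is the utilization target stated above.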
11
[Figure: labels C1, C2; items A00 A01 A10 A11 and B00 B01 B10 B11; processors P00 P01 P10 P11; phases d=0, d=1, d=2]
Distributed PC Algorithm on a 2-d Hypercube (4 Processors)
12
A1 A2 A3 ... A(K/2) A(K/2+1) ... AK
B1 B2 B3 ... B(K/2) B(K/2+1) ... BK
1
CYCLIC-TOUR
RING-FRAGMENTATION
2
A1 A2 ... A(K/4) A(K/4+1) ... A(K/2)
A(K/2) A(K/2+1) ... A(3K/4) A(3K/4+1) ... AK
B1 B2 ... B(K/4) B(K/4+1) ... B(K/2)
B(K/2) B(K/2+1) ... B(3K/4) B(3K/4+1) ... BK
CYCLIC-TOUR
CYCLIC-TOUR
RING-FRAGMENTATION
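One possible reading of the CYCLIC-TOUR / RING-FRAGMENTATION diagram can be simulated sequentially. The sketch below is an interpretation, not the authors' implementation: it assumes K processors on a ring, each holding one A item and one B item; a cyclic tour rotates the B row once around the ring so every A meets every B; fragmentation then splits the ring in two, with one half receiving the A items and the other the B items, and the scheme recurses. The function name `distributed_pairs` is hypothetical, and actual message passing is not modeled.

```python
def distributed_pairs(a, b):
    """Return a list of rounds; each round holds one pair per processor."""
    k = len(a)
    assert k == len(b) and k & (k - 1) == 0, "ring size must be a power of two"
    rounds = []
    # CYCLIC-TOUR: in step s, processor i pairs its A item with the B item
    # that has travelled s positions around the ring.
    for s in range(k):
        rounds.append([(a[i], b[(i + s) % k]) for i in range(k)])
    if k == 1:
        return rounds
    # RING-FRAGMENTATION: one half-ring handles the A items (first half
    # against second half), the other half-ring handles the B items.
    half = k // 2
    left = distributed_pairs(a[:half], a[half:])
    right = distributed_pairs(b[:half], b[half:])
    # The two sub-rings work concurrently, so their rounds are merged.
    rounds.extend(l + r for l, r in zip(left, right))
    return rounds


items_a = ["A00", "A01", "A10", "A11"]
items_b = ["B00", "B01", "B10", "B11"]
phases = distributed_pairs(items_a, items_b)
print(len(phases), sum(len(r) for r in phases))
```

For the 4-processor case this produces 4 + 2 + 1 = 7 rounds of 4 pairs each, covering all 28 pairs of the 8 items with every processor busy in every round, which matches the ring sizes 4, 2, 1 for d=0, d=1, d=2.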
13Ring Communication in different phases of Distributed PC algorithm
[Figure: the 16 nodes of a 4-d hypercube, labeled with 4-bit addresses, grouped into rings]
(a) d=0: 1 ring of size 16
(b) d=1: 2 rings of size 8
14Ring Communication in different phases of Distributed PC algorithm (Contd..)
[Figure: the 16 nodes of a 4-d hypercube, labeled with 4-bit addresses, grouped into rings]
(c) d=2: 4 rings of size 4
(d) d=3: 8 rings of size 2
15Ring Communication in different phases of Distributed PC algorithm (Contd..)
[Figure: the 16 nodes of a 4-d hypercube, labeled with 4-bit addresses, grouped into rings]
(e) d=4: 16 rings of size 1
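The ring counts per phase (1, 2, 4, 8, 16 rings of sizes 16, 8, 4, 2, 1) can be reproduced with a standard ring embedding. The sketch below assumes a binary reflected Gray code ordering inside each sub-cube and assumes that phase d fixes the top d address bits; the node order actually used in the original figures is not recoverable from the text, so this is only one plausible layout. The function names are hypothetical.

```python
def gray_ring(bits):
    """Binary reflected Gray code: cyclic neighbours differ in exactly one bit."""
    return [i ^ (i >> 1) for i in range(1 << bits)]


def rings_in_phase(n, d):
    """Split a 2**n-node hypercube into 2**d rings of size 2**(n - d).

    Assumption: the top d address bits select the ring, and the remaining
    bits follow a Gray code so ring neighbours are hypercube neighbours.
    """
    sub = gray_ring(n - d)
    return [[(prefix << (n - d)) | g for g in sub] for prefix in range(1 << d)]


for d in range(5):
    rings = rings_in_phase(4, d)
    print(d, len(rings), len(rings[0]), [format(x, "04b") for x in rings[0]])
```

Because neighbouring ring positions differ in a single address bit, each ring step maps to one physical hypercube link, which is the property that keeps communication overhead low in every phase.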