Graph Partitioning and its Application to the Node-Rank Mapping Problem

1
Graph Partitioning and its Application to the
Node-Rank Mapping Problem

Gaurav Khanna
Dr. Rahul Garg
Dr. Nisheeth Vishnoi

2
RoadMap

Node-Rank Mapping Problem
Graph Partitioning
KRV: A new algorithm for graph partitioning
Graph Partitioning Results
Approaches for the node-rank mapping
Mapping Results
Conclusions/Future Work

3
Node-Rank Mapping Problem
[Figure: Mapping of the tasks p0, p1, p2, p3 of a parallel program onto processors P0, P1, P2, P3]
4
Node-Rank Mapping Goals

Processor Utilization
Ideally all processors have equal computation and communication costs
Minimization of inter-processor communication
Can be posed as a graph partitioning problem
Graph G(V, E)
Each task is represented by a vertex
Vertex weights represent computational costs
Each edge denotes communication between a pair of tasks
Edge weights represent communication costs
Goal
Partition vertices into P parts such that each part has equal vertex weight
Minimize the weight of edges cut
Problem is NP-hard
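
The two goals on this slide (equal vertex weight per part, small weight of cut edges) can be measured directly for any candidate partition. The following is a minimal sketch; the function name `partition_quality` and the four-task example are illustrative assumptions, not part of the presentation.

```python
from collections import defaultdict

def partition_quality(vertex_weight, edges, part):
    """Return (load per part, total weight of cut edges).

    vertex_weight: dict vertex -> computation cost
    edges: iterable of (u, v, communication cost)
    part: dict vertex -> part index
    """
    loads = defaultdict(float)
    for v, w in vertex_weight.items():
        loads[part[v]] += w
    # An edge is cut when its endpoints are assigned to different parts.
    cut = sum(w for u, v, w in edges if part[u] != part[v])
    return dict(loads), cut

# Hypothetical 4-task example on two processors.
vertex_weight = {"a": 2, "b": 2, "c": 3, "d": 1}
edges = [("a", "b", 5), ("b", "c", 1), ("c", "d", 4), ("a", "d", 2)]
part = {"a": 0, "b": 0, "c": 1, "d": 1}
loads, cut = partition_quality(vertex_weight, edges, part)
```

Here both parts carry a load of 4, and the cut consists of the edges b-c and a-d.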

5
Node-Rank Mapping / Graph Partitioning
[Figure: example assignments of tasks to processors P0 and P1]

Load balance and minimizing communication are often competing forces

6
Graph Partitioning - Metis

A widely used partitioning tool.
Goal
Minimize edge cut
Balance the sum of vertex weights in each
partition as much as possible
Uses a multilevel partitioning algorithm.
Coarsening phase.
Initial partitioning phase.
Uncoarsening phase.
KL-type (Kernighan-Lin) refinement algorithm.
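
As a rough illustration of what a KL-type refinement step evaluates, the sketch below computes the gain of moving one vertex to the other side of a bisection (external minus internal edge weight). It is only the gain computation under a simplified two-way setting; the names `move_gain`, `adj` and `side` are assumptions, and Metis's actual refinement additionally handles balance constraints and multilevel projection.

```python
def move_gain(adj, side, v):
    """Reduction in cut weight if v switches sides (external minus internal weight)."""
    external = sum(w for u, w in adj[v].items() if side[u] != side[v])
    internal = sum(w for u, w in adj[v].items() if side[u] == side[v])
    return external - internal

# Hypothetical weighted graph as an adjacency dict, with a two-way partition.
adj = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 1},
    "c": {"a": 4, "b": 1, "d": 2},
    "d": {"c": 2},
}
side = {"a": 0, "b": 0, "c": 1, "d": 1}
gains = {v: move_gain(adj, side, v) for v in adj}
```

A positive gain means the move would shrink the cut; a refinement pass would pick such moves while keeping the parts balanced.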

7
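Graph Partitioning: The Sparsest Cut Problem

Given a graph G(V,E)

Find a cut that minimizes the ratio of the weight of edges across the cut to the size of the smaller side
For a cut (S, T) with $T = V \setminus S$ and $W(S,T)$ the weight of edges between S and T:
Sparsity $\Phi(S) = \frac{W(S,T)}{\min(|S|, |T|)}$
Minimize $\Phi(S)$ over all cuts (S, T)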
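
To make the definition concrete, the sketch below evaluates the sparsity of a cut (weight of crossing edges divided by the size of the smaller side) and finds the sparsest cut of a tiny graph by exhaustive search. The exhaustive search is exponential and is shown only for illustration; the function names are assumptions.

```python
from itertools import combinations

def sparsity(vertices, edges, S):
    """Weight of edges crossing (S, V \\ S) divided by the size of the smaller side."""
    S = set(S)
    T = set(vertices) - S
    crossing = sum(w for u, v, w in edges if (u in S) != (v in S))
    return crossing / min(len(S), len(T))

def sparsest_cut_bruteforce(vertices, edges):
    """Try every subset of at most half the vertices; feasible only for tiny graphs."""
    vertices = list(vertices)
    best = None
    for k in range(1, len(vertices) // 2 + 1):
        for S in combinations(vertices, k):
            value = sparsity(vertices, edges, S)
            if best is None or value < best[0]:
                best = (value, set(S))
    return best

# Two triangles joined by a single edge (unit weights).
triangle_edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (3, 4, 1), (4, 5, 1), (3, 5, 1), (2, 3, 1)]
best_value, best_side = sparsest_cut_bruteforce(range(6), triangle_edges)
```

The sparsest cut separates the two triangles: one crossing edge and three vertices on the smaller side.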
8
KRV algorithm for the sparsest cut problem

Khandekar-Rao-Vazirani (KRV), STOC 2006
Graph partitioning using single commodity flows
$O(\log^2 n)$ approximation to sparsity using $O(\log^2 n)$ single commodity max-flow computations
Key idea
Expanders
Single commodity flows
Runtime complexity of $O(n^{3/2})$
Comparison to previous approaches
Leighton-Rao '88, based on multi-commodity flows: cut of sparsity $O(\Phi \log n)$, where $\Phi$ is the optimal sparsity
Alon-Milman '85, based on spectral methods: cut of sparsity $O(\sqrt{\Phi})$
Yield better approximations but take $O(n^2)$ time.
KRV is faster
Yields poly-logarithmic complexity

9
Main Theorem

Given a graph G(V,E) on n vertices and a parameter $\alpha$ (compared against 1 on the slide, relation not recoverable), there exists an algorithm that
either outputs a cut of sparsity at most $\alpha$,
or proves that every cut has sparsity at least $\alpha / \log^2 n$.
Procedure to output a cut
Employ a binary search on the sparsity value $\alpha$
Start with the midpoint of the range of $\alpha$
If a cut is found with sparsity $\alpha$, lower $\alpha$
If no cut is found in $O(\log^2 n)$ iterations, increase $\alpha$
Output the cut with the least value of $\alpha$
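
The binary search described on this slide can be sketched as follows. The callable `try_cut` is a hypothetical stand-in for one run of the cut-or-certify procedure at a given sparsity value, and the tolerance `tol` is an assumption added so that the floating-point loop terminates.

```python
def search_sparsity(try_cut, alpha_min, alpha_max, tol=1e-3):
    """Binary search on alpha; try_cut(alpha) returns a cut of sparsity <= alpha, or None."""
    best = None
    while alpha_max - alpha_min > tol:
        alpha = (alpha_max + alpha_min) / 2
        cut = try_cut(alpha)
        if cut is not None:
            best = (alpha, cut)   # cut found: remember it and lower alpha
            alpha_max = alpha
        else:
            alpha_min = alpha     # no cut found: increase alpha
    return best

# Toy oracle (assumption): pretends a cut exists exactly when alpha >= 0.3.
toy_oracle = lambda alpha: {"S"} if alpha >= 0.3 else None
result = search_sparsity(toy_oracle, 0.0, 1.0)
```

The returned pair holds the smallest value of alpha for which the oracle produced a cut, together with that cut.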
10
KRV algorithm: Pseudo code
Procedure KRV(G(V,E))

    H = (V, Empty)
    while (alpha_max > alpha_min)
        alpha = (alpha_max + alpha_min) / 2
        num_iterations = O(log^2 n)
        for (i = 0; i < num_iterations; i++)
            Vector = GenRandomOrthogonalVector(V)
            S = FindBisection(H, Vector, V)
            F = CreateFlowNetwork(G, S, alpha)
            flow = MaxFlow(F)
            if (flow == MAXFLOW)
                M = GenerateMatching(F, flow)
                Add matching M to H
                continue
            else
                alpha_max = alpha      (a cut was found: lower alpha)
        if no cut was found in any iteration
            alpha_min = alpha          (increase alpha)

Procedure FindBisection(H, Vector, V)
    for each matching M in H
        for each pair (i, j) which belongs to M
            V_i = V_j = (V_i + V_j) / 2
    Output the indices of the n/2 smallest values of V
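
The FindBisection procedure can be sketched in a few lines of Python. This is a minimal sketch under two assumptions not stated on the slide: the random vector is taken to be orthogonal to the all-ones vector (its entries sum to zero), and vertices are numbered 0 to n-1.

```python
import random

def gen_random_orthogonal_vector(n, rng):
    """Random vector whose entries sum to zero (orthogonal to the all-ones vector)."""
    r = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(r) / n
    return [x - mean for x in r]

def find_bisection(matchings, vector):
    """Average the vector along every matched pair, then take the n/2 smallest entries."""
    v = list(vector)
    for matching in matchings:          # matchings are applied in the order they were added to H
        for i, j in matching:
            v[i] = v[j] = (v[i] + v[j]) / 2
    order = sorted(range(len(v)), key=lambda i: v[i])
    return set(order[: len(v) // 2])

vec = gen_random_orthogonal_vector(8, random.Random(1))
half = find_bisection([], vec)
```

For example, with the single matching `[(0, 3)]` and the vector `[4.0, 1.0, 2.0, -6.0]`, entries 0 and 3 both become -1.0 and form the smaller half.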
11
KRV
[Figure: graph G(V,E) and graph H(V,F,w), with the vertices split into two sets of n/2]

Assign each edge corresponding to the original graph a capacity of $1/\alpha$
Assign each dotted edge a weight of 1
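
A possible way to build the flow network with these capacities is sketched below. It assumes that the dotted edges in the figure are the edges from an added source to one half of the bisection and from the other half to an added sink; the vertex names "source" and "sink" and the function signature are illustrative assumptions.

```python
from fractions import Fraction

def create_flow_network(edges, S, T, alpha):
    """Return a dict (u, v) -> capacity for the directed flow network.

    edges: undirected edges (u, v) of the original graph
    S, T: the two halves of the bisection
    alpha: current sparsity value
    """
    cap = {}
    for u, v in edges:
        # Each original edge gets capacity 1/alpha in both directions.
        cap[(u, v)] = cap[(v, u)] = 1 / Fraction(alpha)
    for s in S:
        cap[("source", s)] = Fraction(1)   # assumed dotted edge, weight 1
    for t in T:
        cap[(t, "sink")] = Fraction(1)     # assumed dotted edge, weight 1
    return cap

cap = create_flow_network([(0, 1), (1, 2), (2, 3)], {0, 1}, {2, 3}, Fraction(1, 2))
```

With unit source and sink edges, the largest possible flow equals the size of one half (n/2), which is the value the pseudo code compares against.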

16
KRV algorithm: Pseudo code
Procedure KRV(G(V,E))

    while (alpha_max > alpha_min)
        alpha = (alpha_max + alpha_min) / 2
        num_iterations = O(log^2 n)
        for (i = 0; i < num_iterations; i++)
            Vector = Generate_Random_Orthogonal_Vector(V)
            S = FindBisection(Vector, V)
            F = CreateFlowNetwork(G, S, alpha)
            flow = MaxFlow(F)
            if (flow == MAXFLOW)
                M = Generate_Matching(F, flow)
                continue
            else
                alpha_max = alpha
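
The Generate_Matching step can be illustrated by peeling source-to-sink paths off the computed flow and pairing the first and last graph vertex on each path. The sketch assumes an integral, cycle-free flow with unit flow on the source and sink edges; the flow representation and the names "source" and "sink" are assumptions.

```python
def generate_matching(flow, source="source", sink="sink"):
    """Pair up vertices by following paths that carry non-zero flow.

    flow: dict (u, v) -> integer flow on a valid, acyclic flow network.
    """
    residual = {edge: f for edge, f in flow.items() if f > 0}
    out = {}
    for u, v in residual:
        out.setdefault(u, []).append(v)
    matching = []
    while True:
        path = [source]
        while path[-1] != sink:
            u = path[-1]
            nxt = next((v for v in out.get(u, []) if residual[(u, v)] > 0), None)
            if nxt is None:
                break
            path.append(nxt)
        if path[-1] != sink:
            break                                   # no flow left at the source
        for u, v in zip(path, path[1:]):
            residual[(u, v)] -= 1                   # remove one unit along the path
        matching.append((path[1], path[-2]))        # first and last graph vertex
    return matching

flow = {("source", "a"): 1, ("source", "b"): 1, ("a", "b"): 1, ("b", "c"): 2,
        ("c", "sink"): 1, ("c", "d"): 1, ("d", "sink"): 1}
matching = generate_matching(flow)
```

In this example the two flow paths are source-a-b-c-sink and source-b-c-d-sink, so a is matched with c and b with d.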

21
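KRV
[Figure: cut (S, T) in the flow network, annotated with the counts k and l]
Assume $|S| \le |T|$
Cut-size $= n/2 - k + l + E(S,T) < n/2$
$E(S,T) < k - l \le |S| = \min(|S|, |T|)$
$E(S,T) = (\#\text{cut-edges}) / \alpha$
Sparsity of cut $= (\#\text{cut-edges} \times 1) / \min(|S|, |T|)$
Therefore, sparsity of cut $< \alpha$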
22
Our Implementation of KRV

Employs Dinic's max-flow algorithm for computing maximum flows
$O(n^2 m)$ complexity
Matching generation algorithm
Greedy approach
Iteratively find paths from source to sink with non-zero flow and match the corresponding vertices from the two partitions
KRV yields cuts while trying to minimize sparsity
But we need balanced cuts
More applicability
e.g. parallel computing, VLSI layouts, sparse linear system solving
For eventual application to the node-rank mapping problem
Run-time reduces significantly
KRV_Balanced
Yields a 1/3-2/3 balanced cut
Both partitions have at least n/3 vertices
Call KRV recursively, each time on the bigger partition
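
One way to read the KRV_Balanced idea is sketched below: keep cutting the bigger side and accumulate the smaller pieces until they hold at least n/3 vertices. The callable `find_cut` is a hypothetical stand-in for a single KRV call, and the loop structure is an interpretation of the slide, not the authors' code.

```python
def krv_balanced(vertices, find_cut):
    """Accumulate small sides of successive cuts until both sides hold >= n/3 vertices.

    find_cut(vertex_set) -> (side_a, side_b) is a hypothetical single sparse-cut call.
    """
    n = len(vertices)
    collected = set()
    remaining = set(vertices)
    while 3 * len(collected) < n:
        small, large = find_cut(remaining)
        if len(small) > len(large):
            small, large = large, small
        if not small:
            break                      # the cut routine could not split any further
        collected |= small             # keep the smaller piece
        remaining = large              # recurse on the bigger partition
    return collected, remaining

# Toy cut routine (assumption): peel off the two smallest-numbered vertices.
def toy_cut(vertex_set):
    ordered = sorted(vertex_set)
    return set(ordered[:2]), set(ordered[2:])

part_a, part_b = krv_balanced(range(12), toy_cut)
```

Because each small side holds at most half of the remaining vertices, the remaining side still has more than n/3 vertices when the loop stops.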

23
Graph Partitioning Results

Comparison across three schemes
KRV_balanced
Metis (default balance of 1/2-1/2)
Metis (input with the balance obtained by KRV_balanced)
Classes of graphs
Benchmark graphs obtained from the graph partitioning archive
http://staffweb.cms.gre.ac.uk/~c.walshaw/partition/
Graphs based on power-law degree distributions
R-MAT: A Recursive Model for Graph Mining
Degree distributions of the internet
Graphs representing dense components connected sparsely
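
For readers unfamiliar with R-MAT, the generator picks each edge by recursively choosing one of the four quadrants of the adjacency matrix with fixed probabilities. The sketch below is a minimal version; the probabilities 0.57, 0.19, 0.19 (and the implied 0.05) are commonly used illustrative values and are not taken from the presentation.

```python
import random

def rmat_edges(scale, num_edges, a=0.57, b=0.19, c=0.19, seed=0):
    """Generate directed edges on 2**scale vertices by recursive quadrant choice.

    a, b, c are the probabilities of the top-left, top-right and bottom-left
    quadrants; the bottom-right quadrant gets the remaining 1 - a - b - c.
    """
    rng = random.Random(seed)
    edges = []
    for _ in range(num_edges):
        u = v = 0
        for _ in range(scale):
            r = rng.random()
            u <<= 1
            v <<= 1
            if r < a:
                pass                  # top-left quadrant: both bits stay 0
            elif r < a + b:
                v |= 1                # top-right quadrant
            elif r < a + b + c:
                u |= 1                # bottom-left quadrant
            else:
                u |= 1                # bottom-right quadrant
                v |= 1
        edges.append((u, v))
    return edges

edges = rmat_edges(scale=4, num_edges=100)
```

Skewed quadrant probabilities concentrate edges on a few vertices, which produces the heavy-tailed degree distributions mentioned on the slide.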

24
Benchmark Graphs
25
Benchmark Graphs
26
Graphs based on power-law degree distributions
27
Graphs based on power-law degree distributions
28
Graphs: dense components connected sparsely
29
Graphs: dense components connected sparsely
30
Approaches to solve the Node-Rank Mapping Problem

Goal
Obtain a map of the graph vertices onto a torus
Minimize the cost function $\sum_{k,j} C(k,j)\, H(m(k), m(j))$
Two-phase approach
Linear 1-D arrangement of the graph vertices
Embedding the linear arrangement onto a d-dimensional torus
Linear arrangement of vertices
Employ the KRV algorithm
Apply recursively on each smaller sub-graph
Eventually obtain a one-dimensional ordering of vertices
Vertices closer together in the ordering have high communication
Minimize a metric which has a similar form to the designated cost function
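
The first phase (recursive partitioning into a one-dimensional ordering) can be sketched as below. The callable `bisect` is a hypothetical stand-in for a KRV-based cut, and `arrangement_cost` assumes the linear-arrangement metric is the communication volume times the distance in the ordering, which is one natural reading of "a similar form" but is not spelled out on the slide.

```python
def linear_arrangement(vertices, bisect):
    """Order vertices by recursively splitting them and concatenating the pieces."""
    vertices = list(vertices)
    if len(vertices) <= 1:
        return vertices
    left, right = bisect(vertices)
    if not left or not right:          # the partitioner could not split further
        return vertices
    return linear_arrangement(left, bisect) + linear_arrangement(right, bisect)

def arrangement_cost(comm, order):
    """Sum of C(k, j) * |position(k) - position(j)| over communicating pairs (assumed metric)."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(c * abs(pos[k] - pos[j]) for (k, j), c in comm.items())

# Toy stand-in for the partitioner (assumption): split a sorted vertex list in the middle.
def toy_bisect(vs):
    vs = sorted(vs)
    return vs[: len(vs) // 2], vs[len(vs) // 2:]

order = linear_arrangement([4, 2, 7, 1, 6, 3, 5], toy_bisect)
cost = arrangement_cost({(1, 2): 3, (1, 7): 1}, order)
```

With a real partitioner, heavily communicating vertices tend to stay in the same piece for many levels and therefore end up close together in the ordering.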

31
Approaches to solve the Node-Rank Mapping Problem

Map the vertices onto the torus using a space-filling curve
Generate a curve through a d-dimensional mesh
Any two vertices differing by a distance k along the curve are at a distance $O(d\,k^{1/d})$ in the mesh
Map the linear ordering of vertices onto this curve
Each vertex is mapped onto its corresponding point
Cost of embedding in the mesh is at most O(d) times the cost in the linear arrangement
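
As an illustration of the second phase, the sketch below uses a two-dimensional Hilbert curve as the space-filling curve. The slides do not say which curve was used, so the Hilbert curve, the restriction to d = 2, and the function names are assumptions; the mesh side must be a power of two.

```python
def hilbert_point(side, index):
    """Convert a position along the Hilbert curve into (x, y) on a side x side mesh."""
    x = y = 0
    t = index
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                       # rotate the current quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def place_on_curve(ordering, side):
    """Map the i-th vertex of the linear ordering to the i-th point of the curve."""
    return {task: hilbert_point(side, pos) for pos, task in enumerate(ordering)}

placement = place_on_curve(["a", "b", "c", "d"], 2)
```

Consecutive positions on the curve are always mesh neighbours, so vertices that are close in the linear ordering stay close after the embedding.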

32
Illustration
[Figure: the original graph G (vertices 1-7) is split recursively into sub-graphs G1 and G2, then G11, G12, G21 and G22, yielding the final map, a linear ordering of the seven vertices]
33
Illustration (contd.)
[Figure: final map of vertices 1-7]
34
Results

Existing work
Optimizing task layout on the BlueGene/L supercomputer
Bhanot et al., IBM J. Res. Dev., Vol. 49, No. 2/3, March/May 2005
SA (Simulated Annealing) based approach to optimize job layout
Mapping m(j): torus node location where MPI task j is mapped
H(m(k), m(j)): number of hops on the torus between the mappings of task k and task j
F = free energy = $\sum_{k,j} C(k,j)\, H(m(k), m(j))$
Minimize F by a series of swaps between randomly chosen torus positions
The KRV + space-filling based scheme does not perform as well as SA
SA directly tries to optimize the cost function and therefore performs better
KRV + space-filling does it in a two-step fashion
Moreover, the bounds for space-filling curves have been proved for d-dimensional meshes
The cost function is based on a torus topology
Another variant
Use the map obtained by KRV + space-filling as an initial map
Apply annealing
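
The free energy and an annealing loop over swaps can be sketched as follows. This is a generic simulated-annealing sketch with a Metropolis acceptance rule and a geometric cooling schedule; the schedule, the parameter values, and the simplification of swapping the positions of two tasks (rather than arbitrary torus positions) are assumptions, not details from Bhanot et al.

```python
import math
import random

def torus_hops(p, q, dims):
    """Hop count between torus coordinates p and q, using wraparound links."""
    return sum(min(abs(a - b), n - abs(a - b)) for a, b, n in zip(p, q, dims))

def free_energy(comm, mapping, dims):
    """F = sum over communicating pairs (k, j) of C(k, j) * H(m(k), m(j))."""
    return sum(c * torus_hops(mapping[k], mapping[j], dims) for (k, j), c in comm.items())

def anneal(comm, mapping, dims, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Swap the torus positions of two random tasks; accept by the Metropolis rule."""
    rng = random.Random(seed)
    mapping = dict(mapping)
    tasks = list(mapping)
    cost = free_energy(comm, mapping, dims)
    best, best_cost = dict(mapping), cost
    temp = t0
    for _ in range(steps):
        k, j = rng.sample(tasks, 2)
        mapping[k], mapping[j] = mapping[j], mapping[k]
        new_cost = free_energy(comm, mapping, dims)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(mapping), cost
        else:
            mapping[k], mapping[j] = mapping[j], mapping[k]   # undo the swap
        temp *= cooling
    return best, best_cost

# Hypothetical example: four tasks communicating in a ring, placed on a 1-D torus of size 4.
comm = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1}
start = {0: (0,), 1: (2,), 2: (1,), 3: (3,)}
best_map, best_cost = anneal(comm, start, dims=(4,))
```

The same `anneal` function can be started from any initial map, including one produced by a partitioning and space-filling scheme.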

35
Alternate Approach

Another approach employed in Bhanot et al.
Apply a graph partitioner to divide the original graph into sub-graphs
Choose a sub-graph and apply SA on it
Map the sub-graph onto the torus
Repeat the same procedure with the next sub-graph and the remaining available torus
Uses Metis to partition
Idea is to reduce the runtime without significant degradation in performance

36
Alternate Approach

Graph partitioning followed by annealing
Compare Metis vs. KRV applied to the above approach
Annealing optimizes each sub-graph separately
Oblivious to the edges crossing the cut
Map needs to be optimized further
We apply a hill-climbing style local search heuristic
In iteration i
Swap neighbors separated by distance i
If the cost improves, commit this move
Else, reject the move
Local optimization heuristic
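
The hill-climbing heuristic can be sketched as below. The slide does not define "neighbors separated by distance i" precisely; this sketch assumes it means entries that are i positions apart in a linear ordering of the tasks, and it takes the cost function as a parameter. Both are assumptions.

```python
def hill_climb(order, cost, max_distance=None):
    """In iteration i, try swapping entries i positions apart; keep only improving swaps."""
    order = list(order)
    n = len(order)
    best = cost(order)
    for i in range(1, (max_distance or n - 1) + 1):
        for p in range(n - i):
            order[p], order[p + i] = order[p + i], order[p]
            c = cost(order)
            if c < best:
                best = c                                          # commit the move
            else:
                order[p], order[p + i] = order[p + i], order[p]   # reject the move
    return order, best

# Hypothetical cost: communication volume times distance in the ordering.
comm = {(0, 1): 1, (1, 2): 1, (2, 3): 1}

def ordering_cost(order):
    pos = {v: idx for idx, v in enumerate(order)}
    return sum(c * abs(pos[k] - pos[j]) for (k, j), c in comm.items())

improved, improved_cost = hill_climb([0, 2, 1, 3], ordering_cost)
```

In this example the swap of the two middle entries lowers the cost from 5 to 3, and every other swap is rejected.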

37
Benchmarks

NAS parallel benchmarks
SP, BT, LU, CG, MG
Standard communication benchmarks
SMG2000
Parallel semi-coarsening multigrid solver for linear systems
UMT2K
3D, deterministic, multi-group, photon transport code for unstructured meshes
Collecting the communication matrices
All applications were linked to the MPI tracing library
Run on BlueGene/L bgd machines
Generates communication matrices in dense format

38
Results
39
Conclusions/Future Work

Graph Partitioning
Empirical results show that the KRV algorithm outperforms Metis in terms of cut quality
For certain benchmark graphs, power-law graphs
Holds good promise
Explore the utility of KRV in other problem domains
Node-Rank Mapping
For the cases when KRV cuts are better, the resulting maps obtained are also better
Runtime of KRV is significantly higher
Choice of the max-flow algorithm
Avoiding running $O(\log^2 n)$ iterations by checking if the graph H is already an expander
Graph sparsification (Benczúr et al.)
Issues in improving the mapping scheme further
Choice of the local search heuristic
Choice of the objective function

THANX
Questions ?
