Title: A Fine-Grain Hypergraph Model for 2D Decomposition of Sparse Matrices
1. A Fine-Grain Hypergraph Model for 2D Decomposition of Sparse Matrices
Umit V. Catalyurek and Cevdet Aykanat
Department of Computer Engineering, Bilkent University
2. Outline
- Graph Partitioning
- Hypergraph Partitioning
- Standard Graph Model for Sparse Matrix Representation
- Fine-Grain Hypergraph Model for Sparse Matrix Representation
- Experimental Results
- Applicability of the Fine-Grain Hypergraph Model
3. Graph Partitioning
- Graph G = (V, E): a set of vertices V and a set of edges E
  - every edge e_ij ∈ E connects a pair of distinct vertices v_i and v_j
- K-way graph partition by edge separator: Π = {V_1, V_2, …, V_K}
  - each V_k is a nonempty subset of V, i.e., V_k ⊆ V
  - parts are pairwise disjoint, i.e., V_k ∩ V_l = ∅ for k ≠ l
  - the union of the K parts equals V, i.e., ∪_{k=1..K} V_k = V
- an edge e_ij is said to be
  - cut if v_i ∈ V_k and v_j ∈ V_l with k ≠ l
  - uncut if v_i ∈ V_k and v_j ∈ V_k
- a partition is said to be balanced if
  - W_k ≤ W_avg (1 + ε) for every part k
  - W_k = weight of part V_k, ε = maximum imbalance ratio
- cost of a partition
  - cutsize(Π) = Σ_{e_ij ∈ E_E} w(e_ij)
  - where E_E is the set of cut edges
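The balance condition and the edge-cut cost above can be sketched in a few lines of Python. This is an illustrative sketch, not from the slides: the graph, its weights, and the partition below are invented so the numbers can be checked by hand.

```python
def cutsize(edges, part):
    """cutsize(Pi) = sum of w(e_ij) over cut edges (endpoints in different parts)."""
    return sum(w for (vi, vj, w) in edges if part[vi] != part[vj])

def is_balanced(vertex_weights, part, K, eps):
    """Check W_k <= W_avg * (1 + eps) for every part k."""
    totals = [0.0] * K
    for v, w in vertex_weights.items():
        totals[part[v]] += w
    w_avg = sum(totals) / K
    return all(wk <= w_avg * (1 + eps) for wk in totals)

# toy 2-way partition of a 4-cycle with edge weight 2 everywhere
edges = [(0, 1, 2), (1, 2, 2), (2, 3, 2), (0, 3, 2)]
part = {0: 0, 1: 0, 2: 1, 3: 1}
weights = {v: 1 for v in range(4)}
print(cutsize(edges, part))                # edges (1,2) and (0,3) are cut -> 4
print(is_balanced(weights, part, 2, 0.1))  # True: both parts weigh 2
```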
4. Hypergraph Partitioning
- Hypergraph H = (V, N): a set of vertices V and a set of nets N
- nets (hyperedges) connect two or more vertices
  - every net n_j ∈ N is a subset of the vertices, i.e., n_j ⊆ V
  - a graph is a special instance of a hypergraph
- K-way hypergraph partition: Π = {V_1, V_2, …, V_K}
- a net that has at least one pin in a part is said to connect that part
  - connectivity set Λ(n_j) of a net n_j: set of parts connected by n_j
  - connectivity λ(n_j) = |Λ(n_j)| of a net n_j: number of parts connected by n_j
- a net n_j is said to be
  - cut if λ(n_j) > 1
  - uncut if λ(n_j) = 1
- two cutsize definitions widely used in the VLSI community
  - net-cut metric: cutsize(Π) = Σ_{n_j ∈ N_E} w(n_j)
  - connectivity−1 metric: cutsize(Π) = Σ_{n_j ∈ N_E} w(n_j) (λ(n_j) − 1)
  - where N_E is the set of cut nets
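The two cutsize metrics differ only in how a cut net is charged: once, or once per extra part it touches. A minimal sketch with an invented three-net hypergraph (unit net weights assumed):

```python
def connectivity(pins, part):
    """lambda(n_j): number of distinct parts the net's pins touch."""
    return len({part[v] for v in pins})

def netcut(nets, part):
    """Net-cut metric with unit weights: count of cut nets."""
    return sum(1 for pins in nets.values() if connectivity(pins, part) > 1)

def conn_minus_1(nets, part):
    """Connectivity-1 metric with unit weights: sum of lambda(n_j) - 1."""
    return sum(connectivity(pins, part) - 1 for pins in nets.values())

# toy hypergraph: n1 is internal to part 0, n2 spans 2 parts, n3 spans 3
nets = {"n1": [0, 1], "n2": [1, 2, 3], "n3": [0, 3, 5]}
part = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}
print(netcut(nets, part))        # n2 and n3 are cut -> 2
print(conn_minus_1(nets, part))  # (2-1) + (3-1) = 3
```

The gap between the two metrics grows with how many parts a cut net straddles, which is exactly why the connectivity−1 metric can track communication volume while the net-cut metric cannot.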
5. Hypergraph Partitioning
- cut nets: N_E = {n_1, n_8, n_15}
- connectivity sets
  - Λ(n_1) = {V_1, V_2}
  - Λ(n_8) = Λ(n_15) = {V_1, V_2, V_3}
- connectivity values
  - λ(n_1) = 2, λ(n_8) = λ(n_15) = 3
- cutsize values, assuming unit net weights
  - net-cut metric: cutsize(Π) = |N_E| = 3
  - connectivity−1 metric: cutsize(Π) = 1 + 2 + 2 = 5
6. Parallel Matrix-Vector Multiplication y = Ax
- parallel iterative solvers
  - 1D rowwise or columnwise partitioning of A
- symmetric partitioning to avoid communication during linear vector operations
  - all vectors are divided conformally with the row or column partitioning
  - symmetric row/column permutation on A
- processor P_k computes the linear vector operations on the k-th blocks of the vectors
- rowwise: P_k computes y_k = A_r^k x
  - entries of the x-vector are communicated
- columnwise: P_k computes y^k = A_c^k x_k, where y = Σ_k y^k
  - entries of the y^k vectors are communicated
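The rowwise scheme can be sketched as follows. This is an illustrative sketch with an invented dict-of-dicts row storage, not the paper's implementation: each processor owns a set of rows plus the conforming x and y entries, and any x entry for an off-block column must first be received from its owner.

```python
def rowwise_spmv(rows, x, owner, my_part):
    """rows: {i: {j: a_ij}}; owner[i]: part owning row i and vector entries i.
    Returns the locally computed y entries and the foreign x indices needed."""
    # x entries for off-block columns would be received from their owners
    needed_x = {j for i, r in rows.items() if owner[i] == my_part
                for j in r if owner[j] != my_part}
    y = {i: sum(a * x[j] for j, a in r.items())
         for i, r in rows.items() if owner[i] == my_part}
    return y, needed_x

A = {0: {0: 2.0, 2: 1.0}, 1: {1: 3.0}, 2: {2: 1.0}}
x = {0: 1.0, 1: 1.0, 2: 1.0}
owner = {0: 0, 1: 0, 2: 1}          # P0 owns rows 0,1; P1 owns row 2
y, needed = rowwise_spmv(A, x, owner, 0)
print(y)       # {0: 3.0, 1: 3.0}
print(needed)  # {2}: x_2 must come from P1 before P0 can form y_0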
7. Graph Model for Representing Sparse Matrices
- standard graph model G = (V, E) for matrix A
- vertex set V: one vertex v_i for each row/column i of A
  - v_i ∈ V ↔ task i of computing the inner product y_i = ⟨r_i, x⟩
  - node weighting: w(v_i) = number of nonzeros in row r_i
- edge set E: (v_i, v_j) ∈ E ⇔ a_ij ≠ 0 and a_ji ≠ 0
  - each edge denotes a bidirectional interaction between tasks i and j
  - edge (v_i, v_j) ∈ E ↔ y_i ← y_i + a_ij x_j and y_j ← y_j + a_ji x_i
  - exchange of x_i and x_j values before the local matrix-vector products
- rows r_i and r_j assigned to different processors ⇒ communication of two words
  - edge weighting: w(v_i, v_j) = 2
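Constructing this model from a nonzero pattern is mechanical. A minimal sketch, assuming the matrix is given as a set of (i, j) nonzero coordinates (an illustrative format, not the paper's):

```python
def graph_model(nnz, n):
    """Standard graph model for an n x n structurally symmetric matrix.
    nnz: set of (i, j) positions with a_ij != 0."""
    # vertex weight = number of nonzeros in row i
    vweight = {i: sum(1 for (r, c) in nnz if r == i) for i in range(n)}
    # edge (v_i, v_j) iff both a_ij and a_ji are nonzero, i != j
    edges = {(min(i, j), max(i, j)) for (i, j) in nnz
             if i != j and (j, i) in nnz}
    eweight = {e: 2 for e in edges}   # two words exchanged if cut
    return vweight, edges, eweight

# symmetric 3x3 matrix: full diagonal plus the (0,1)/(1,0) pair
nnz = {(0, 0), (1, 1), (2, 2), (0, 1), (1, 0)}
vw, E, ew = graph_model(nnz, 3)
print(sorted(E))  # [(0, 1)]
print(vw)         # {0: 2, 1: 2, 2: 1}
```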
8. Graph Model Minimizes the Wrong Metric
- a 4-way rowwise partition of a sample symmetric matrix in the graph model
- cost(Π) = 2 × 5 = 10 words, but the actual communication volume is 7 words
  - P_1 sends x_i to both P_2 and P_4
  - P_2 and P_4 send {x_j, x_k, x_l} and {x_m, x_h}, respectively, to P_1
- the graph model tries to minimize the total number of off-block-diagonal nonzeros
  - it treats each off-block-diagonal nonzero entry as if it incurs a distinct communication
  - nonzeros in the same column of an off-diagonal block necessitate the communication of only a single x value
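The mismatch is easy to reproduce on a toy matrix. A sketch with invented data: the graph model charges two words per cut edge, while in reality each x_j is sent only once to every other part that has a nonzero in column j.

```python
def graph_cost(nnz, owner):
    """Graph-model cost: 2 words for every cut edge (symmetric nonzero pair)."""
    cut = {(min(i, j), max(i, j)) for (i, j) in nnz
           if i != j and owner[i] != owner[j]}
    return 2 * len(cut)

def true_volume(nnz, owner):
    """Actual rowwise volume: x_j goes once to each foreign part with a
    nonzero in column j."""
    cols = {}
    for (i, j) in nnz:
        if i != j:
            cols.setdefault(j, set()).add(owner[i])
    return sum(len(parts - {owner[j]}) for j, parts in cols.items())

# vertex 0 on P0; vertices 1 and 2 on P1; symmetric off-diagonal nonzeros
nnz = {(0, 0), (1, 1), (2, 2), (0, 1), (1, 0), (0, 2), (2, 0)}
owner = {0: 0, 1: 1, 2: 1}
print(graph_cost(nnz, owner))   # 4: two cut edges, charged 2 words each
print(true_volume(nnz, owner))  # 3: x_0 is sent to P1 only once
```

Here the graph model overcounts by one word because rows 1 and 2 live on the same processor and share the single incoming copy of x_0, the same effect the slide's 10-versus-7 example shows at larger scale.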
[Figure: the sample symmetric matrix with its 4-way rowwise partition (parts P_1–P_4) and the corresponding graph with vertices v_i, v_j, v_k, v_l, v_m, v_h.]
9. Fine-Grain Hypergraph Model
- an M × M matrix A with Z nonzeros is represented by H = (V, N)
- Z vertices: one vertex v_ij for each a_ij ≠ 0
- 2M nets: one net for each row and one for each column of A
  - N = N_R ∪ N_C
  - row nets N_R = {m_1, m_2, …, m_M}
  - column nets N_C = {n_1, n_2, …, n_M}
  - v_ij ∈ m_i and v_ij ∈ n_j iff a_ij ≠ 0
- column net n_j represents the dependency of the atomic tasks on x_j
- row net m_i represents the dependency of computing y_i on the partial y_i' results
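Building H(V, N) from a nonzero pattern is a single pass over the nonzeros. An illustrative sketch (coordinate-set input format assumed, not from the slides):

```python
def fine_grain_hypergraph(nnz, M):
    """One vertex v_ij = (i, j) per nonzero; pin it to row net m_i and
    column net n_j. nnz: set of (i, j) with a_ij != 0."""
    row_nets = {i: [] for i in range(M)}   # m_i: pins of row net i
    col_nets = {j: [] for j in range(M)}   # n_j: pins of column net j
    for (i, j) in nnz:
        v = (i, j)
        row_nets[i].append(v)
        col_nets[j].append(v)
    return row_nets, col_nets

nnz = {(0, 0), (0, 1), (1, 1), (2, 0), (2, 2)}
m, n = fine_grain_hypergraph(nnz, 3)
print(sorted(m[0]))  # [(0, 0), (0, 1)]: nonzeros of row 0
print(sorted(n[0]))  # [(0, 0), (2, 0)]: nonzeros of column 0
```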
10. Fine-Grain Hypergraph Model
[Figure: one vertex for each nonzero.]
11. Fine-Grain Hypergraph Model for 2D Decomposition
- unit net weighting: w(n) = 1 for each net n ∈ N
- use the connectivity−1 metric: cutsize(Π) = Σ_{n_j ∈ N_E} (λ(n_j) − 1)
- minimizing the cutsize corresponds to minimizing the total volume of communication
- consistency of the model
  - exact correspondence between cutsize and communication volume
  - maintain symmetric partitioning: y_i and x_i assigned to the same processor
- consistency condition
  - v_ii ∈ n_i and v_ii ∈ m_i for each vertex v_ii (holds iff a_ii ≠ 0)
- consider a K-way partition Π = {V_1, V_2, …, V_K} of H = (V, N)
  - Π induces a partition on the nonzeros of matrix A
  - decode: v_ii ∈ V_k ⇒ assign y_i and x_i to processor P_k
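Given an assignment of nonzeros to parts, the two halves of the communication volume, x words expanded before the multiply (column nets) and partial-y words folded after it (row nets), are each a connectivity−1 sum. A sketch under the model above, with an invented tiny example:

```python
def comm_volume(nnz_part, M):
    """nnz_part: {(i, j): part of vertex v_ij}. Returns (expand, fold):
    x words sent before the multiply, partial-y words sent after it."""
    row_parts = [set() for _ in range(M)]
    col_parts = [set() for _ in range(M)]
    for (i, j), p in nnz_part.items():
        row_parts[i].add(p)
        col_parts[j].add(p)
    expand = sum(len(s) - 1 for s in col_parts if s)  # column nets: lambda - 1
    fold = sum(len(s) - 1 for s in row_parts if s)    # row nets: lambda - 1
    return expand, fold

# 2x2 dense example: each column owned by one part, each row split across both
nnz_part = {(0, 0): 0, (1, 0): 0, (0, 1): 1, (1, 1): 1}
print(comm_volume(nnz_part, 2))  # (0, 2): no x traffic, 2 partial-y words
```

The total, expand + fold, equals the connectivity−1 cutsize over all 2M nets, which is the exact-correspondence claim of the model.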
12. Fine-Grain Hypergraph Model for 2D Decomposition
[Figure: an 8 × 8 sample matrix with its nonzeros assigned to parts.]
- cutsize(Π) = 8
- communication volume = 8 words
13. Experimental Results
14. Experimental Results
15. Applicability of the Model
- parallel reduction
  - columns ↔ x-vector inputs
  - rows ↔ y-vector output
  - nonzeros ↔ input-to-output mapping computation
- the fine-grain hypergraph model
  - models the workload partitioning
  - doesn't restrict the place of computation to the owner of the input or the output
  - for each input-output pair, the computation can be done on any of the processors
  - communication volume is minimized and the workload is balanced!
- directly and exactly models the total communication volume
  - column nets: dependency on the input → models pre-communication
  - row nets: dependency of the output on the partial outputs → models post-communication
16. End of Talk