Parallelizing the SVD Computation for Latent Semantic Analysis

1
Parallelizing the SVD Computation for Latent
Semantic Analysis
  • Jorge Civera Saiz
  • Borislav Stoyanov

2
Introduction: Information Retrieval Methods
  • Term-matching retrieval techniques.
  • Based on syntax.
  • Typical in search engines.
  • Latent Semantic Analysis.
  • Allows document-document, term-document and
    term-term comparison.
  • Idea: model the relationship between terms and
    documents using a frequency term-by-document
    matrix.
  • Element aij counts how many times word i occurs
    in document j.
  • We apply the SVD to this matrix to remove noisy
    information and reveal semantic structure.

3
SVD decomposition
  • Matrix A = T x S x D^T (term, singular value and
    document matrices).
  • We use the SVDPACKC library to carry out the SVD
    of sparse matrices stored in compact row format,
    described by three vectors:
  • pointr vector
  • rowind vector
  • value vector

4
How did we parallelize the SVD computation?
  • Parallelize the basic matrix and vector
    operations, because they account for 90% of the
    execution time.
  • This approach was chosen for simplicity, since
    parallelizing the SVD computation as a whole was
    not feasible.
  • Application of the master-worker paradigm to our
    case:
  • The master runs the main program.
  • Workers wait in a barrier.
  • The master opens the barrier and sends the
    operation code to the workers together with the
    data; at the end of the computation,
    synchronization uses MPI_Reduce or MPI_Gatherv.

5
How did we parallelize the SVD computation? (2)
  • How do we parallelize Matrix x Matrix x Vector
    and Matrix x Vector taking the compact row format
    into account?
  • The distribution of non-zero values may not be
    uniform, i.e. some columns have few non-zero
    values and others have many.
  • Splitting the matrix just by columns may result
    in an unbalanced distribution of the
    computational effort.
  • The solution is simple: split the data into
    chunks with an equal number of non-zero values,
    since these drive the real computation. Based on
    the value vector.

6
Advantages and disadvantages of this solution
  • Good news:
  • The matrix is sent only once, at the beginning;
    communication cost is reduced to the minimum.
  • Computation is balanced across all nodes.
  • We only need to send each node its piece of the
    vector every time we perform the operation.
  • Bad news:
  • Sometimes we need to split in the middle of a
    column (even in the best case).
  • pointr must be recalculated for each node, and
    likewise the piece of the vector.

7
Vector operations
  • Basic operations:
  • Constant x Vector + Vector.
  • Dot product.
  • Parallelization is simple.
  • Problem:
  • These operations are called around one million
    times per execution, and every call implies
    synchronization and communication cost.
  • Communication cost plus computation cost on the
    remote nodes is higher than the computation cost
    on the local node for medium matrices (a few
    thousand by a few thousand).
  • Only useful for big vectors; in our tests it
    decreased performance.

8
Performance Measurements
  • Speed-up curve for 2, 4, 6 and 8 processors

[Figure: speed-up vs. size of the sparse matrix (3% dense)]
9
Problems
  • Speed-up worse than expected a priori, maybe
    because we could not test with larger matrices.
  • Problems with MPI_Broadcast.
  • Parallelization is not always beneficial: small
    vector operations.
  • Difficulties in the matrix operations because of
    the compact row format:
  • Recalculating the pointr vector for each node is
    not so easy.
  • A similar problem arises with the vector that we
    send for the matrix operations.

10
Conclusion
  • Speed-up of about 3, not as high as we expected.
    Possible reasons:
  • We are using medium-size matrices, not large
    ones.
  • Communication cost is higher because of the
    compact row format: the rowind and pointr
    vectors.
  • Not possible to take advantage of parallel
    vector operations, since the problem is medium
    scale.
  • Synchronization overhead: we call an operation a
    few million times.
  • Questions?