Monte Carlo Linear Algebra Techniques and Their Parallelization - PowerPoint PPT Presentation

Transcript and Presenter's Notes
1
Monte Carlo Linear Algebra Techniques and Their
Parallelization
  • Ashok Srinivasan
  • Computer Science
  • Florida State University

www.cs.fsu.edu/asriniva
2
Outline
  • Background
  • Monte Carlo Matrix-vector multiplication
  • MC linear solvers
  • Non-diagonal splitting
  • Dense implementation
  • Sparse implementation
  • Parallelization
  • Conclusions and future work

3
Background
  • MC linear solvers are old!
  • von Neumann and Ulam (1950)
  • Were not competitive with deterministic
    techniques
  • Advantages of MC
  • Can give approximate solutions fast
  • Feasible in applications such as preconditioning, graph partitioning, information retrieval, pattern recognition, etc.
  • Can yield selected components fast
  • Are very latency tolerant

4
Matrix-vector multiplication
  • Compute $C^j h$, where $C \in \mathbb{R}^{n \times n}$
  • Choose probability and weight matrices such that
  • $C_{ij} = P_{ij} W_{ij}$ and $h_i = p_i w_i$
  • Take a random walk, based on these probabilities
  • Define random variables $X_i$
  • $X_0 = w_{k_0}$, and $X_i = X_{i-1} W_{k_i k_{i-1}}$
  • $E(X_j \delta_{i k_j}) = (C^j h)_i$
  • Each random walk can be used to estimate the $k_j$-th component of $C^j h$ (see the sketch below)
  • Convergence rate is independent of $n$

[Diagram: random walk $k_0 \to k_1 \to k_2 \to \cdots \to k_j$, started with probability $p_{k_0}$ and transition probabilities $P_{k_1 k_0}, P_{k_2 k_1}, \ldots$; at the end, update $(C^j h)_{k_j}$]
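As a concrete illustration, here is a minimal NumPy sketch of this estimator. The |entry|-proportional choices $P_{ij} = |C_{ij}| / \sum_i |C_{ij}|$ and $p_i = |h_i| / \sum_i |h_i|$ are assumptions (the slides only require $C_{ij} = P_{ij} W_{ij}$ and $h_i = p_i w_i$), as are the function name and the requirement that $C$ has no zero column and $h$ no zero entry:

```python
import numpy as np

def mc_matvec_power(C, h, j, n_walks=100_000, rng=None):
    """Estimate C^j h by random walks of length j."""
    rng = np.random.default_rng() if rng is None else rng
    C = np.asarray(C, dtype=float)
    h = np.asarray(h, dtype=float)
    n = C.shape[0]

    # C_ij = P_ij * W_ij with P column-stochastic: a step k -> k'
    # is taken with probability P[k', k].
    P = np.abs(C) / np.abs(C).sum(axis=0)
    W = np.divide(C, P, out=np.zeros_like(C), where=P > 0)
    p = np.abs(h) / np.abs(h).sum()               # h_i = p_i * w_i
    w = np.divide(h, p, out=np.zeros_like(h), where=p > 0)

    est = np.zeros(n)
    for _ in range(n_walks):
        k = rng.choice(n, p=p)                    # k_0 ~ p
        X = w[k]                                  # X_0 = w_{k_0}
        for _ in range(j):
            k2 = rng.choice(n, p=P[:, k])
            X *= W[k2, k]                         # X_i = X_{i-1} W_{k_i k_{i-1}}
            k = k2
        est[k] += X                               # walk ends at k_j
    return est / n_walks                          # E(X_j δ_{i k_j}) = (C^j h)_i
```

Each walk contributes to exactly one component, which is why selected components can be estimated cheaply and the accuracy depends on the number of walks, not on $n$.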
5
Matrix-vector multiplication ... continued
  • $\sum_{j=0}^{m} C^j h$ too can be similarly estimated
  • $\sum_{j=0}^{m} (BC)^j B h$ will be needed by us
  • It can be estimated using probabilities on both matrices, $B$ and $C$ (see the sketch below)
  • Length of the random walk is twice that of the previous case

[Diagram: random walk $k_0 \to k_1 \to k_2 \to k_3 \to \cdots \to k_{2m+1}$, started with probability $p_{k_0}$ and transition probabilities $P_{k_1 k_0}, P_{k_2 k_1}, P_{k_3 k_2}, \ldots$; after each application of $B$, update $((BC)^j B h)_{k_{2j+1}}$]
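A sketch of this two-matrix variant, under the same assumed |entry|-proportional probabilities: each walk has length $2m+1$, alternates $B$- and $C$-based transitions, and contributes a term after every application of $B$.

```python
import numpy as np

def mc_alternating_sum(B, C, h, m, n_walks=100_000, rng=None):
    """Estimate sum_{j=0}^{m} (BC)^j B h with walks of length 2m+1
    that alternate B- and C-based transitions."""
    rng = np.random.default_rng() if rng is None else rng
    B, C = np.asarray(B, dtype=float), np.asarray(C, dtype=float)
    h = np.asarray(h, dtype=float)
    n = h.size

    def tables(M):
        P = np.abs(M) / np.abs(M).sum(axis=0)     # column-stochastic
        return P, np.divide(M, P, out=np.zeros_like(M), where=P > 0)

    PB, WB = tables(B)
    PC, WC = tables(C)
    p = np.abs(h) / np.abs(h).sum()
    w = np.divide(h, p, out=np.zeros_like(h), where=p > 0)

    est = np.zeros(n)
    for _ in range(n_walks):
        k = rng.choice(n, p=p)                    # k_0 ~ p
        X = w[k]
        for step in range(2 * m + 1):             # visits k_1 .. k_{2m+1}
            P, W = (PB, WB) if step % 2 == 0 else (PC, WC)
            k2 = rng.choice(n, p=P[:, k])
            X *= W[k2, k]
            k = k2
            if step % 2 == 0:                     # B was just applied:
                est[k] += X                       # adds ((BC)^j B h)_{k_{2j+1}}
    return est / n_walks
```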
6
MC linear solvers
  • Solve $Ax = b$
  • Split $A$ as $A = N - M$
  • Write the fixed point iteration
  • $x_{m+1} = N^{-1} M x_m + N^{-1} b = C x_m + h$
  • If we choose $x_0 = h$, then we get
  • $x_m = \sum_{j=0}^{m} C^j h$
  • Estimate the above using the Markov chain technique mentioned earlier

7
Current techniques
  • Choose $N$ to be a diagonal matrix
  • Ensures efficient computation of $C$
  • $C$ is sparse when $A$ is sparse
  • Example: $N =$ diagonal of $A$ yields the Jacobi iteration, and the corresponding MC estimate (see the sketch below)
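To make the scheme concrete, here is a minimal dense sketch of the MC solver with this Jacobi splitting. The |entry|-proportional probabilities and the function name are assumptions, and convergence of $x_m$ requires the spectral radius of $C$ to be below 1:

```python
import numpy as np

def mc_jacobi_solve(A, b, m=20, n_walks=200_000, rng=None):
    """Estimate x_m = sum_{j=0}^{m} C^j h for C = N^{-1} M, h = N^{-1} b,
    with the Jacobi splitting N = diag(A), M = N - A."""
    rng = np.random.default_rng() if rng is None else rng
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    d = np.diag(A)
    C = -(A - np.diag(d)) / d[:, None]            # C = N^{-1} M, zero diagonal
    h = np.asarray(b, dtype=float) / d            # h = N^{-1} b

    P = np.abs(C) / np.abs(C).sum(axis=0)         # assumes no zero column
    W = np.divide(C, P, out=np.zeros_like(C), where=P > 0)
    p = np.abs(h) / np.abs(h).sum()
    w = np.divide(h, p, out=np.zeros_like(h), where=p > 0)

    est = np.zeros(n)
    for _ in range(n_walks):
        k = rng.choice(n, p=p)
        X = w[k]
        est[k] += X                               # j = 0 term: h
        for _ in range(m):
            k2 = rng.choice(n, p=P[:, k])
            X *= W[k2, k]
            k = k2
            est[k] += X                           # adds (C^j h)_{k_j}
    return est / n_walks
```

Note that every partial product along a walk is reused: one length-$m$ walk contributes one sample to each of the $m+1$ terms of the sum.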

8
Properties of MC linear solvers
  • MC techniques estimate the result of a stationary
    iteration
  • Errors from the iterative process
  • Errors from MC
  • Reduce the error by
  • Variance reduction techniques
  • Residual correction
  • Choose a better iterative scheme!

9
Non-diagonal splitting
  • Observations
  • It is possible to construct an efficient MC technique for specific splittings, even if explicit construction of $C$ is computationally expensive
  • It may be possible to represent $C$ implicitly and sparsely, even if $C$ is not actually sparse

10
Our example
  • Choose N to be the diagonal and sub-diagonal of A

[Diagram: $N$ is the lower bidiagonal matrix with diagonal $d_1, d_2, \ldots, d_n$ and sub-diagonal $s_2, \ldots, s_n$; its inverse $N^{-1}$ is lower triangular]
  • Computing $C = N^{-1} M$ explicitly is too expensive
  • Compute $x_m = \sum_{j=0}^{m} (N^{-1} M)^j N^{-1} b$ instead

11
Computing N-1
  • Using O(n) storage and precomputation time, any element of $N^{-1}$ can be computed in constant time (see the sketch below)
  • Define $T(1) = 1$, $T(i+1) = T(i)\, s_{i+1} / d_{i+1}$
  • $(N^{-1})_{ij} =$
  • $0$, if $i < j$
  • $1/d_i$, if $i = j$
  • $(-1)^{i-j}\, (1/d_j)\, T(i)/T(j)$, otherwise
  • The entire $N^{-1}$, if needed, can be computed in O(n²) time
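A direct transcription of these formulas (0-indexed, with s[i] denoting the sub-diagonal entry at position (i, i-1); s[0] is unused; the function names are illustrative):

```python
import numpy as np

def precompute_T(d, s):
    """O(n) precomputation: T[0] = 1, T[i] = T[i-1] * s[i] / d[i]."""
    T = np.ones(len(d))
    for i in range(1, len(d)):
        T[i] = T[i - 1] * s[i] / d[i]
    return T

def ninv_entry(i, j, d, T):
    """(N^{-1})_{ij} in constant time, given the precomputed T."""
    if i < j:
        return 0.0
    if i == j:
        return 1.0 / d[i]
    return (-1) ** (i - j) / d[j] * T[i] / T[j]
```

Filling an $n \times n$ array by calling ninv_entry for every pair reproduces the O(n²) construction of the full inverse mentioned in the last bullet.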

12
Dense implementation
  • Compute N-1 and store in O(n2) space
  • Choose probabilities proportional to the weight
    of the elements
  • Use the alias method to sample (see the sketch below)
  • Precomputation time proportional to the number of
    elements
  • Constant time to generate each sample
  • Estimate $\sum_{j=0}^{m} (N^{-1} M)^j N^{-1} b$
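For reference, here is a compact version of Walker's alias method, which gives the O(n) setup and constant-time-per-sample behavior cited above; this is a standard textbook construction, not code from the presentation:

```python
import numpy as np

def build_alias(p):
    """Walker's alias tables for a discrete distribution p.

    O(n) setup; afterwards each sample costs O(1).
    """
    prob = np.asarray(p, dtype=float) * len(p)
    alias = np.zeros(len(p), dtype=int)
    small = [i for i, q in enumerate(prob) if q < 1.0]
    large = [i for i, q in enumerate(prob) if q >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        alias[s] = l                      # s's leftover mass comes from l
        prob[l] -= 1.0 - prob[s]
        (small if prob[l] < 1.0 else large).append(l)
    for i in small + large:               # numerical leftovers round to 1
        prob[i] = 1.0
    return prob, alias

def alias_sample(prob, alias, rng):
    """Draw one index in constant time."""
    i = rng.integers(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

Building prob, alias = build_alias(P[:, k]) once per column lets every subsequent transition from state k be sampled in O(1).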

13
Experimental results
[Plot: experimental results, walk length 2]
14
Sparse implementation
  • We cannot use O(n²) space or time!
  • Sparse implementation for $M$ is simple
  • Sparse representation of $N^{-1}$ (see the sketch below):
  • Choose $P_{ij} =$
  • $0$, if $i < j$
  • $1/(n-j+1)$, otherwise
  • Sampled from the uniform distribution
  • Choose $W_{ij} = (N^{-1})_{ij} / P_{ij}$
  • Constant time to determine any $W_{ij}$
  • Minor modifications are needed when $s_i = 0$
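A sketch of this constant-time column sampler, written 0-indexed so the slide's $1/(n-j+1)$ becomes $1/(n-j)$; it reuses $d$ and the precomputed $T$ from slide 11 and assumes all $s_i \neq 0$ (the slide notes minor modifications otherwise):

```python
import numpy as np

def sample_ninv_col(j, n, d, T, rng):
    """Sample a row i from column j of N^{-1} and return (i, W_ij).

    P_ij is uniform over the n - j rows i >= j; W_ij = (N^{-1})_{ij} / P_ij.
    Both the sample and its weight cost O(1), so N^{-1} never needs to
    be stored explicitly.
    """
    i = int(rng.integers(j, n))                  # uniform row in {j, ..., n-1}
    if i == j:
        val = 1.0 / d[i]
    else:
        val = (-1) ** (i - j) / d[j] * T[i] / T[j]
    return i, (n - j) * val
```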

15
Parallelization
[Diagram: each processor runs the same algorithm with its own random number generator (Proc 2 with RNG 2, Proc 3 with RNG 3, ...); reported values: 23.7, 23.2, 23.6, 23.5]
  • MC is embarrassingly parallel
  • Identical algorithms are run independently on each processor; only the random number sequences differ (see the sketch below)
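A minimal sketch of this replicated parallelization using Python's multiprocessing: the integrand is only a stand-in for the solver's random walks, and the use of SeedSequence.spawn for independent streams is an assumption (the presentation's runs used one RNG per MPI process).

```python
import numpy as np
from multiprocessing import Pool

def worker(arg):
    """Identical algorithm on every process; only the RNG stream differs."""
    seed_seq, n_samples = arg
    rng = np.random.default_rng(seed_seq)        # independent stream
    # Stand-in estimator: in the solver, this would run n_samples random
    # walks and return a partial estimate vector instead.
    x = rng.random(n_samples)
    return np.exp(x).sum()                       # integral of e^x over [0, 1]

if __name__ == "__main__":
    n_procs, samples_each = 4, 250_000
    streams = np.random.SeedSequence(12345).spawn(n_procs)  # non-overlapping
    with Pool(n_procs) as pool:
        partials = pool.map(worker, [(s, samples_each) for s in streams])
    print(sum(partials) / (n_procs * samples_each))          # ≈ e - 1
```

Because the workers never communicate until the final reduction, the method tolerates high latency and scales trivially with the number of processors.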

16
MPI vs OpenMP on Origin 2000
  • Cache misses cause poor performance of the
    OpenMP parallelization

17
Conclusions and future work
  • Demonstrated that it is possible to have effective MC implementations with non-diagonal splittings too
  • Need to extend this to better iterative schemes
  • Non-replicated parallelization needs to be
    considered