CS 290H 24 October: Parallel computing and preconditioning

Provided by: JohnGi84
1
CS 290H 24 October: Parallel computing and preconditioning
  • Read Chen and Toledo, "Vaidya's preconditioners:
    Implementation and experimental study" (see
    references page)
  • Homework 2 on web page by end of today, due Mon 7
    Nov.
  • Parallel CG, matrix-vector multiplication, graph
    partitioning
  • Parallel triangular solves, graph coloring for
    ILU
  • Introduction to sparse approximate inverse
    preconditioners

2
Preconditioned conjugate gradient iteration
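$x_0 = 0,\quad r_0 = b,\quad d_0 = B^{-1} r_0,\quad y_0 = B^{-1} r_0$
for $k = 1, 2, 3, \ldots$
    $\alpha_k = (y_{k-1}^T r_{k-1}) / (d_{k-1}^T A d_{k-1})$    step length
    $x_k = x_{k-1} + \alpha_k d_{k-1}$    approx solution
    $r_k = r_{k-1} - \alpha_k A d_{k-1}$    residual
    $y_k = B^{-1} r_k$    preconditioning solve
    $\beta_k = (y_k^T r_k) / (y_{k-1}^T r_{k-1})$    improvement
    $d_k = y_k + \beta_k d_{k-1}$    search direction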
  • Several vector inner products per iteration (easy
    to parallelize)
  • One matrix-vector multiplication per iteration
    (medium to parallelize)
  • One solve with preconditioner per iteration (hard
    to parallelize)
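The iteration above can be sketched in a few lines of plain Python. This is a minimal illustration, not a tuned implementation: the function names (`pcg`, `jacobi`, `matvec`, `dot`) are made up for this example, the matrix is stored densely as a list of rows, and the preconditioner is assumed to be the diagonal of A (Jacobi), which is just the easiest B to apply. The comments mark the three kinds of work the bullets distinguish.

```python
def dot(u, v):
    # Vector inner product (the easy-to-parallelize part).
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    # Dense matrix-vector product; A is a list of rows.
    return [dot(row, x) for row in A]

def jacobi(A):
    # Assumed preconditioner for this sketch: B = diag(A).
    diag = [A[i][i] for i in range(len(A))]
    return lambda r: [ri / di for ri, di in zip(r, diag)]

def pcg(A, b, apply_Binv, tol=1e-10, max_iter=1000):
    """Preconditioned CG for symmetric positive definite A.

    Returns (x, iterations). apply_Binv(r) must return B^{-1} r.
    """
    x = [0.0] * len(b)          # x0 = 0
    r = list(b)                 # r0 = b
    y = apply_Binv(r)           # y0 = B^{-1} r0
    d = list(y)                 # d0 = y0
    yr = dot(y, r)
    for k in range(1, max_iter + 1):
        if dot(r, r) ** 0.5 <= tol:
            return x, k - 1
        Ad = matvec(A, d)                       # one matvec per iteration
        alpha = yr / dot(d, Ad)                 # step length
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        y = apply_Binv(r)                       # one preconditioner solve
        yr_new = dot(y, r)
        beta = yr_new / yr                      # improvement
        d = [yi + beta * di for yi, di in zip(y, d)]
        yr = yr_new
    return x, max_iter
```

A call looks like `x, its = pcg(A, b, jacobi(A))`. Swapping in a different `apply_Binv` (for example, a pair of triangular solves with an incomplete factor) changes only the hard-to-parallelize step.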

3
Matrix-vector product: Parallel implementation
  • Lay out matrix and vectors by rows
  • Hard part is matrix-vector product
    y = Ax
  • Algorithm: each processor j
      • Broadcast x(j)
      • Compute y(j) = A(j,:) x
  • May send more of x than needed
  • Partition / reorder matrix to reduce communication
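To make the "may send more of x than needed" point concrete, here is a small serial simulation of the row-wise layout. It is only a sketch under stated assumptions: processors own contiguous blocks of rows, each row is stored as a dict `{column: value}`, and the names `block_ranges` and `parallel_matvec` are invented for this example. It computes y = Ax block by block and counts the words of x each processor would receive under a full broadcast versus the entries its rows actually touch.

```python
def block_ranges(n, p):
    """Split indices 0..n-1 into p contiguous blocks, one per processor."""
    base, extra = divmod(n, p)
    ranges, start = [], 0
    for j in range(p):
        size = base + (1 if j < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges

def parallel_matvec(rows, x, p):
    """Simulate y = A x with rows (and matching pieces of x) split over p processors.

    rows[i] is a dict {col: value} holding the nonzeros of row i.
    Returns (y, broadcast_words, needed_words).
    """
    n = len(x)
    blocks = block_ranges(n, p)
    owner = [j for j, blk in enumerate(blocks) for _ in blk]
    y = [0.0] * n
    needed_words = 0
    for j, blk in enumerate(blocks):
        # Entries of x owned elsewhere that processor j's rows actually use.
        remote = {c for i in blk for c in rows[i] if owner[c] != j}
        needed_words += len(remote)
        for i in blk:
            y[i] = sum(v * x[c] for c, v in rows[i].items())
    # Full broadcast: every processor receives every entry it does not own.
    broadcast_words = sum(n - len(blk) for blk in blocks)
    return y, broadcast_words, needed_words
```

For a tridiagonal matrix each block needs only one neighboring entry of x per side, so the gap between the two counts grows with n. Reordering or partitioning the matrix aims to shrink `needed_words` for less regular sparsity patterns.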

4
Graph partitioning in theory
  • If G is a planar graph with n vertices, there
    exists a set of at most sqrt(6n) vertices whose
    removal leaves no connected component with more
    than 2n/3 vertices. (Planar graphs have
    sqrt(n)-separators.)
  • Well-shaped finite element meshes in 3
    dimensions have n^(2/3)-separators.
  • Also some other classes of graphs: trees, graphs
    of bounded genus, chordal graphs,
    bounded-excluded-minor graphs, ...
  • Mostly these theorems come with efficient
    algorithms, but they aren't used much.
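The planar separator statement can be checked by hand on a simple planar graph. The sketch below assumes a k-by-k grid graph and uses its middle column as the separator; that choice is hand-picked for the example and is not the output of any separator algorithm. The helper names `grid_graph` and `component_sizes` are made up here.

```python
from collections import deque

def grid_graph(k):
    """Adjacency lists of the k-by-k grid; vertex (i, j) has index i*k + j."""
    adj = {v: [] for v in range(k * k)}
    for i in range(k):
        for j in range(k):
            v = i * k + j
            if i + 1 < k:               # edge to the vertex below
                adj[v].append(v + k)
                adj[v + k].append(v)
            if j + 1 < k:               # edge to the vertex to the right
                adj[v].append(v + 1)
                adj[v + 1].append(v)
    return adj

def component_sizes(adj, removed):
    """Sizes of the connected components left after deleting `removed`."""
    seen = set(removed)
    sizes = []
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        queue = deque([s])
        size = 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        sizes.append(size)
    return sizes

k = 9
n = k * k
separator = {i * k + k // 2 for i in range(k)}   # middle column, k = sqrt(n) vertices
sizes = component_sizes(grid_graph(k), separator)
```

Here the separator has sqrt(n) vertices, comfortably below sqrt(6n), and both remaining pieces are smaller than 2n/3.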

5
Graph partitioning in practice
  • Graph partitioning heuristics have been an active
    research area for many years, often motivated by
    partitioning for parallel computation. See CS
    240A.
  • Some techniques
      • Spectral partitioning (uses eigenvectors of
        Laplacian matrix of graph)
      • Geometric partitioning (for meshes with specified
        vertex coordinates)
      • Iterative swapping (Kernighan-Lin,
        Fiduccia-Mattheyses)
      • Breadth-first search (GLN 7.3.3, fast but dated)
  • Many popular modern codes (e.g. Metis, Chaco) use
    multilevel iterative swapping
  • Matlab graph partitioning toolbox: see course web
    page
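Of the techniques listed, breadth-first search is the shortest to write down. The following is a rough sketch of the idea only, not the algorithm from the cited reference: it assumes a connected graph given as adjacency lists, visits vertices in BFS order from a chosen start vertex, and puts the first half of that order in one part. The names `bfs_order`, `bfs_bisect`, and `edge_cut` are hypothetical.

```python
from collections import deque

def bfs_order(adj, start):
    """Vertices of a connected graph in breadth-first order from `start`."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return order

def bfs_bisect(adj, start):
    """Split the vertices into two parts: near `start` and far from it."""
    order = bfs_order(adj, start)
    half = len(order) // 2
    return set(order[:half]), set(order[half:])

def edge_cut(adj, part_a):
    """Number of edges with exactly one endpoint in part_a."""
    return sum(1 for v in part_a for w in adj[v] if w not in part_a)
```

The quality of the cut depends heavily on the start vertex, which is one reason the iterative-swapping and multilevel methods are usually applied on top of (or instead of) a split like this.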

6
Parallel Incomplete Cholesky and ILU: Issues
  • Computing the preconditioner
      • Parallel direct methods well developed
      • But IC/ILU is sparser => harder to speed up
      • Still, you only have to do it once
  • Applying the preconditioner
      • Triangular solves are not very parallel
      • Reordering by graph coloring (see example)
      • But the orderings are not great for convergence
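The coloring idea can be sketched as follows. Vertices of the same color share no edge, so if the unknowns are ordered color by color, each unknown in the triangular solve depends only on unknowns of earlier colors, and a whole color class can be processed in one parallel step. This is a minimal serial sketch under several assumptions: a greedy coloring (not an optimal one), matrices stored as dicts of dicts, and L taken to be simply the lower triangle of the reordered A, standing in for a zero-fill incomplete factor with the same sparsity pattern. All function names are invented for this example.

```python
def greedy_coloring(adj):
    """Give each vertex the smallest color not used by an already-colored neighbor."""
    color = {}
    for v in adj:
        used = {color[w] for w in adj[v] if w in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

def color_classes(color):
    """List of vertex lists, one per color, in increasing color order."""
    classes = {}
    for v, c in color.items():
        classes.setdefault(c, []).append(v)
    return [classes[c] for c in sorted(classes)]

def lower_part(A, classes):
    """Lower triangle of A in the ordering 'color 0 first, then color 1, ...'."""
    pos = {u: k for k, u in enumerate(u for cls in classes for u in cls)}
    return {v: {w: a for w, a in A[v].items() if pos[w] <= pos[v]} for v in A}

def solve_lower_by_color(L, b, classes):
    """Solve L x = b, one color class at a time."""
    x = {}
    for cls in classes:                      # sequential loop over colors
        # Unknowns within a class are mutually independent, so this
        # comprehension could be evaluated as one parallel step.
        step = {
            v: (b[v] - sum(a * x[w] for w, a in L[v].items() if w != v)) / L[v][v]
            for v in cls
        }
        x.update(step)
    return x
```

The number of sequential steps equals the number of colors (two for a tridiagonal or 5-point grid matrix). The cost noted in the last bullet is that the incomplete factorization computed in this ordering usually makes a weaker preconditioner than the one from the natural ordering.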

7
Sparse approximate inverses
  • Compute $B^{-1} \approx A^{-1}$ explicitly
  • Minimize $\| A B^{-1} - I \|_F$ (in parallel, by
    columns)
  • Variants: factored form of $B^{-1}$, more fill, ...
  • Good: very parallel, seldom breaks down
  • Bad: effectiveness varies widely
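The "by columns" remark follows from the fact that the squared Frobenius norm of A M - I is the sum over j of the squared 2-norms of A m_j - e_j, so each column m_j of M is a separate small least-squares problem. The sketch below takes the simplest possible case as an assumption: M is restricted to be diagonal, so each column has a single unknown with a closed-form solution. Practical sparse approximate inverses allow a richer sparsity pattern per column. The names `spai_diagonal` and `frob_residual` are made up for this example, and A is a dense list of rows with no zero column.

```python
def spai_diagonal(A):
    """Diagonal M minimizing ||A M - I||_F; returns the diagonal as a list.

    For column j the problem is min over m of ||m * A[:, j] - e_j||_2,
    whose solution is m = A[j][j] / ||A[:, j]||_2^2.
    """
    n = len(A)
    m = []
    for j in range(n):                  # columns are independent: parallel loop
        col = [A[i][j] for i in range(n)]
        m.append(col[j] / sum(a * a for a in col))
    return m

def frob_residual(A, m):
    """Frobenius norm of A * diag(m) - I."""
    n = len(A)
    total = 0.0
    for i in range(n):
        for j in range(n):
            e = A[i][j] * m[j] - (1.0 if i == j else 0.0)
            total += e * e
    return total ** 0.5
```

Applying this preconditioner is just an entrywise scaling, which is why the approach parallelizes so well. With such a restricted pattern it can also be a poor approximation to the inverse, in line with the last bullet.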