The Landscape of Axb Solvers - PowerPoint PPT Presentation

About This Presentation

Title:

The Landscape of Axb Solvers

Description:

condition number usually improves (only) by constant factor (except MIC for some ... still, often good when tuned for a particular class of problems. Parallelism ... – PowerPoint PPT presentation

Number of Views:29

Avg rating:3.0/5.0

Slides: 27

Provided by: csU45

Learn more at: https://sites.cs.ucsb.edu

Category:

more less

Transcript and Presenter's Notes

Title: The Landscape of Axb Solvers

1
The Landscape of Axb Solvers
More Robust
Less Storage (if sparse)
D
2
Conjugate gradient iteration
x0 0, r0 b, p0 r0 for k 1, 2,
3, . . . ak (rTk-1rk-1) / (pTk-1Apk-1)
step length xk xk-1 ak pk-1
approx solution rk rk-1 ak
Apk-1 residual ßk
(rTk rk) / (rTk-1rk-1) improvement pk
rk ßk pk-1
search direction

One matrix-vector multiplication per iteration
Two vector dot products per iteration
Four n-vectors of working storage

3
Conjugate gradient Krylov subspaces

Eigenvalues Au ?u ?1, ?2 ,
. . ., ?n
Cayley-Hamilton theorem
(A ?1I)(A ?2I) (A ?nI) 0
Therefore S ciAi 0 for some ci
so A-1 S (ci/c0) Ai1
Krylov subspace
Therefore if Ax b, then x A-1 b and
x ? span (b, Ab, A2b, . . ., An-1b) Kn (A, b)

0 ? i ? n
1 ? i ? n
4
Conjugate gradient Orthogonal sequences

Krylov subspace Ki (A, b) span (b, Ab, A2b, .
. ., Ai-1b)
Conjugate gradient algorithm for i 1, 2, 3,
. . . find xi ? Ki (A, b) such that ri
(Axi b) ? Ki (A, b)
Notice ri ? Ki1 (A, b), so ri ? rj for all
j lt i
Similarly, the directions are
A-orthogonal (xi xi-1 )TA (xj xj-1 ) 0
The magic Short recurrences. . . A is symmetric
gt can get next residual and direction from
the previous one, without saving them all.

5
Conjugate gradient Convergence

In exact arithmetic, CG converges in n steps
(completely unrealistic!!)
Accuracy after k steps of CG is related to
consider polynomials of degree k that are equal
to 1 at 0.
how small can such a polynomial be at all the
eigenvalues of A?
Thus, eigenvalues close together are good.
Condition number ?(A) A2 A-12
?max(A) / ?min(A)
Residual is reduced by a constant factor by
O(?1/2(A)) iterations of CG.

6
(Matlab demo)

CG on grid5(15) and bcsstk08
n steps of CG on bcsstk08

7
Conjugate gradient Parallel implementation

Lay out matrix and vectors by rows
Hard part is matrix-vector product
y Ax
Algorithm
Each processor j
Broadcast x(j)
Compute y(j) A(j,)x
May send more of x than needed
Partition / reorder matrix to reduce communication

8
(Matlab demo)

2-way partition of eppstein mesh
8-way dice of eppstein mesh

9
Preconditioners

Suppose you had a matrix B such that
condition number ?(B-1A) is small
By z is easy to solve
Then you could solve (B-1A)x B-1b instead of Ax
b
B A is great for (1), not for (2)
B I is great for (2), not for (1)
Domain-specific approximations sometimes work
B diagonal of A sometimes works
Or, bring back the direct methods technology. . .

10
(Matlab demo)

bcsstk08 with diagonal precond

11
Incomplete Cholesky factorization (IC, ILU)

Compute factors of A by Gaussian elimination,
but ignore fill
Preconditioner B RTR ? A, not formed explicitly
Compute B-1z by triangular solves (in time
nnz(A))
Total storage is O(nnz(A)), static data structure
Either symmetric (IC) or nonsymmetric (ILU)

12
(Matlab demo)

bcsstk08 with ic precond

13
Incomplete Cholesky and ILU Variants

Allow one or more levels of fill
unpredictable storage requirements
Allow fill whose magnitude exceeds a drop
tolerance
may get better approximate factors than levels of
fill
unpredictable storage requirements
choice of tolerance is ad hoc
Partial pivoting (for nonsymmetric A)
Modified ILU (MIC) Add dropped fill to
diagonal of U or R
A and RTR have same row sums
good in some PDE contexts

14
Incomplete Cholesky and ILU Issues

Choice of parameters
good smooth transition from iterative to direct
methods
bad very ad hoc, problem-dependent
tradeoff time per iteration (more fill gt more
time) vs of iterations (more
fill gt fewer iters)
Effectiveness
condition number usually improves (only) by
constant factor (except MIC for some problems
from PDEs)
still, often good when tuned for a particular
class of problems
Parallelism
Triangular solves are not very parallel
Reordering for parallel triangular solve by graph
coloring

15
(Matlab demo)

2-coloring of grid5(15)

16
Sparse approximate inverses

Compute B-1 ? A explicitly
Minimize B-1A I F (in parallel, by
columns)
Variants factored form of B-1, more fill, . .
Good very parallel
Bad effectiveness varies widely

17
Support graph preconditioners example
Vaidya
G(A)
G(B)

A is symmetric positive definite with negative
off-diagonal nzs
B is a maximum-weight spanning tree for A (with
diagonal modified to preserve row sums)
factor B in O(n) space and O(n) time
applying the preconditioner costs O(n) time per
iteration

18
Support graph preconditioners example
G(A)
G(B)

support each edge of A by a path in B
dilation(A edge) length of supporting path in B
congestion(B edge) of supported A edges
p max congestion, q max dilation
condition number ?(B-1A) bounded by pq (at most
O(n2))

19
Support graph preconditioners example
G(A)
G(B)

can improve congestion and dilation by adding a
few strategically chosen edges to B
cost of factorsolve is O(n1.75), or O(n1.2) if A
is planar
in recent experiments Chen Toledo, often
better than drop-tolerance MIC for 2D problems,
but not for 3D.

20
Domain decomposition (introduction)
A

Partition the problem (e.g. the mesh) into
subdomains
Use solvers for the subdomains B-1 and C-1 to
precondition an iterative solver on the
interface
Interface system is the Schur complement
S G ET B-1E FT C-1F
Parallelizes naturally by subdomains

21
(Matlab demo)

grid and matrix structure for overlapping 2-way
partition of eppstein

22
Multigrid (introduction)

For a PDE on a fine mesh, precondition using a
solution on a coarser mesh
Use idea recursively on hierarchy of meshes
Solves the model problem (Poissons eqn) in
linear time!
Often useful when hierarchy of meshes can be
built
Hard to parallelize coarse meshes well
This is just the intuition lots of theory and
technology

23
Other Krylov subspace methods

Nonsymmetric linear systems
GMRES for i 1, 2, 3, . . . find xi ? Ki
(A, b) such that ri (Axi b) ? Ki (A,
b)But, no short recurrence gt save old vectors
gt lots more space
BiCGStab, QMR, etc.Two spaces Ki (A, b) and Ki
(AT, b) w/ mutually orthogonal basesShort
recurrences gt O(n) space, but less robust
Convergence and preconditioning more delicate
than CG
Active area of current research
Eigenvalues Lanczos (symmetric), Arnoldi
(nonsymmetric)

24
The Landscape of Sparse Axb Solvers
25
Complexity of direct methods
Time and space to solve any problem on any
well-shaped finite element mesh
26
Complexity of linear solvers
Time to solve model problem (Poissons equation)
on regular mesh

Write a Comment

User Comments (0)