Title: An Introduction to the Conjugate Gradient Method without the Agonizing Pain
1. An Introduction to the Conjugate Gradient Method without the Agonizing Pain
- Jonathan Richard Shewchuk
- Reading group presentation by David Cline
2. Linear System
We want to solve the linear system $Ax = b$, where:
- $x$ is the unknown vector (what we want to find)
- $b$ is a known vector
- $A$ is a known square matrix
3. Matrix Multiplication
4. Positive Definite Matrix
A matrix $A$ is positive definite if, for every nonzero vector $x = (x_1, x_2, \ldots, x_n)^T$,
$$x^T A x > 0.$$
Also, all eigenvalues of the matrix are positive.
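As a quick illustration (not from the slides), positive definiteness of a symmetric matrix can be checked numerically; the function name and the use of NumPy are my own choices:

```python
import numpy as np

def is_positive_definite(A):
    """Check whether a symmetric matrix A is positive definite (a sketch).

    Cholesky factorization succeeds exactly when a symmetric matrix is
    positive definite, matching the all-eigenvalues-positive criterion above.
    """
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False
```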
5. Quadratic form
- An expression of the form $f(x) = \tfrac{1}{2} x^T A x - b^T x + c$
6. Why do we care?
- The gradient of the quadratic form gives back our original system if A is symmetric (derivation sketched below).
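A minimal sketch of the derivation the slide alludes to, using the quadratic form from the previous slide:
$$f'(x) = \tfrac{1}{2} A^T x + \tfrac{1}{2} A x - b,$$
so if $A = A^T$ this reduces to $f'(x) = A x - b$, and setting the gradient to zero is exactly solving $Ax = b$.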
7. Visual interpretation
8. Example Problem
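The slide itself only shows the problem graphically; assuming it uses the running example from Shewchuk's paper, the numbers are
$$A = \begin{bmatrix} 3 & 2 \\ 2 & 6 \end{bmatrix}, \qquad b = \begin{bmatrix} 2 \\ -8 \end{bmatrix}, \qquad c = 0,$$
whose solution is $x = (2, -2)^T$.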
9. Visual representation
(figures: surface and contour plots of f(x))
10. Solution
- If A is symmetric, the solution to the system, x, is a critical point of f.
- And since A is also positive definite, x is the global minimum of f.
11. Definitions
Whenever you read "residual," think "the direction of steepest descent." (Both terms are defined in the sketch below.)
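A sketch of the definitions in the paper's notation, where $x$ is the exact solution:
$$e_{(i)} = x_{(i)} - x \quad \text{(error)}, \qquad r_{(i)} = b - A x_{(i)} = -A e_{(i)} = -f'(x_{(i)}) \quad \text{(residual)}.$$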
12. Method of steepest descent
- Start with an arbitrary point, x(0)
- Move in the direction opposite the gradient of f, i.e., along the residual r(0)
- Reach the minimum in that direction at a distance alpha
- Repeat (a code sketch follows below)
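A minimal NumPy sketch of the loop described above; the function name, tolerance, and iteration cap are my own choices, not from the slides:

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=1000):
    """Method of steepest descent for symmetric positive definite A (a sketch)."""
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        r = b - A @ x                    # residual = direction of steepest descent
        if np.linalg.norm(r) < tol:      # close enough to the minimum
            break
        alpha = (r @ r) / (r @ (A @ r))  # exact line search: distance to the minimum along r
        x = x + alpha * r                # take the step
    return x
```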
13. Steepest descent, mathematically
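The slide's formulas do not survive in this transcript; reconstructing them from Shewchuk's paper, one iteration of steepest descent is
$$r_{(i)} = b - A x_{(i)}, \qquad \alpha_{(i)} = \frac{r_{(i)}^T r_{(i)}}{r_{(i)}^T A r_{(i)}}, \qquad x_{(i+1)} = x_{(i)} + \alpha_{(i)} r_{(i)},$$
where the next residual is either recomputed from its definition, $r_{(i+1)} = b - A x_{(i+1)}$, or (equivalently, saving a matrix-vector product) updated recurrently as $r_{(i+1)} = r_{(i)} - \alpha_{(i)} A r_{(i)}$.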
14. Steepest descent, graphically
15. Eigenvectors
16. Steepest descent does well
Steepest descent converges in one iteration if the error term is an eigenvector of A.
Steepest descent also converges in one iteration if all of the eigenvalues are equal.
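A one-line justification of the first claim (reconstructed, not verbatim from the slide): if the error is an eigenvector, the exact line search lands on the solution. If $e_{(i)} = v$ with $A v = \lambda v$, then $r_{(i)} = -A e_{(i)} = -\lambda v$ and
$$\alpha_{(i)} = \frac{r_{(i)}^T r_{(i)}}{r_{(i)}^T A r_{(i)}} = \frac{1}{\lambda}, \qquad e_{(i+1)} = e_{(i)} + \alpha_{(i)} r_{(i)} = v - \frac{1}{\lambda}\,\lambda v = 0.$$
The second claim is the special case where all eigenvalues are equal: then $A = \lambda I$, so every error vector is an eigenvector.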
17. Steepest descent does poorly
If the error term is a mix of eigenvectors with large and small eigenvalues, steepest descent zig-zags back and forth toward the solution and takes many iterations to converge. The worst-case convergence is governed by the ratio of the largest to the smallest eigenvalue of A, called the condition number, $\kappa = \lambda_{max} / \lambda_{min}$.
18. Convergence of steepest descent
After $i$ iterations, the error measured in the energy norm $\|e\|_A = \sqrt{e^T A e}$ satisfies
$$\|e_{(i)}\|_A \le \left( \frac{\kappa - 1}{\kappa + 1} \right)^{i} \|e_{(0)}\|_A.$$
19. How can we speed up or guarantee convergence?
- Use the eigenvectors as search directions.
- This terminates in at most n iterations (sketched below).
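A sketch of why n iterations suffice, assuming the eigenvectors $v_j$ are used as directions: expand the initial error in the eigenvector basis,
$$e_{(0)} = \sum_{j=1}^{n} \xi_j v_j,$$
and note that an exact line search along $v_j$ removes exactly the $\xi_j v_j$ component of the error, so after stepping along each eigenvector once the error is zero.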
20. Method of conjugate directions
- Instead of eigenvectors, which are too hard to compute, use directions that are conjugate, or A-orthogonal, to each other (formalized below).
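The defining relations, reconstructed in the paper's notation since the slide's equations are missing from the transcript: two search directions are conjugate, or A-orthogonal, if
$$d_{(i)}^T A d_{(j)} = 0 \quad \text{for } i \ne j,$$
and each iteration takes an exact step along the current direction:
$$\alpha_{(i)} = \frac{d_{(i)}^T r_{(i)}}{d_{(i)}^T A d_{(i)}}, \qquad x_{(i+1)} = x_{(i)} + \alpha_{(i)} d_{(i)}.$$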
21. Method of conjugate directions
22. How to find conjugate directions?
- Gram-Schmidt conjugation
- Start with n linearly independent vectors u_0, ..., u_{n-1}
- For each vector, subtract the parts that are not A-orthogonal to the previously processed vectors (see the formula below)
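The Gram-Schmidt conjugation formulas, a sketch in the paper's notation:
$$d_{(i)} = u_i + \sum_{k=0}^{i-1} \beta_{ik}\, d_{(k)}, \qquad \beta_{ik} = -\frac{u_i^T A d_{(k)}}{d_{(k)}^T A d_{(k)}}.$$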
23. Problem
- Gram-Schmidt conjugation is slow and we have to
store all of the vectors we have created.
24. Conjugate Gradient Method
- Apply the method of conjugate directions, but use the residuals for the u values: $u_i = r_{(i)}$
25. How does this help us?
- It turns out that the residual $r_{(i)}$ is A-orthogonal to all of the previous residuals except $r_{(i-1)}$, so we simply make it A-orthogonal to $r_{(i-1)}$, and we are set.
26. Simplifying further
- Because successive residuals are orthogonal, the single remaining Gram-Schmidt coefficient reduces to $\beta_{(i)} = \dfrac{r_{(i)}^T r_{(i)}}{r_{(i-1)}^T r_{(i-1)}}$.
27. Putting it all together
Start with steepest descent.
Compute the distance to the bottom of the parabola.
Slide down to the bottom of the parabola.
Compute the steepest descent direction at the next location.
Remove the part of that vector that is not A-orthogonal to $d_{(i)}$.
(A code sketch of the full loop follows below.)
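A compact NumPy sketch of the complete algorithm described by these steps; the function name, defaults, and stopping test are my own choices, not from the slides:

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Conjugate Gradient for a symmetric positive definite A (a sketch)."""
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else x0.astype(float).copy()
    r = b - A @ x                         # initial residual = steepest descent direction
    d = r.copy()                          # first search direction
    delta_new = r @ r
    if max_iter is None:
        max_iter = n                      # at most n steps in exact arithmetic
    for _ in range(max_iter):
        if np.sqrt(delta_new) < tol:      # stop when the residual norm is small enough
            break
        q = A @ d
        alpha = delta_new / (d @ q)       # distance to the bottom of the parabola along d
        x = x + alpha * d                 # slide down to the bottom
        r = r - alpha * q                 # update the residual recurrently
        delta_old, delta_new = delta_new, r @ r
        beta = delta_new / delta_old      # simplified Gram-Schmidt coefficient
        d = r + beta * d                  # new direction, A-orthogonal to the previous ones
    return x
```

For example, conjugate_gradient(np.array([[3., 2.], [2., 6.]]), np.array([2., -8.])) returns approximately (2, -2) within two iterations.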
28. Starting and stopping
- Start either with a rough estimate of the solution, or with the zero vector.
- Stop when the norm of the residual is small enough.
29. Benefit over steepest descent
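The slide's content is missing from the transcript; the standard bound from Shewchuk's paper, presumably what it shows, is that the condition number enters only through its square root:
$$\|e_{(i)}\|_A \le 2 \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^{i} \|e_{(0)}\|_A,$$
compared with the $\left(\frac{\kappa - 1}{\kappa + 1}\right)^{i}$ factor for steepest descent, so CG converges much faster on ill-conditioned systems.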
30. Preconditioning
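The slide's content is missing from the transcript; a sketch of the idea as presented in the paper: choose an easily inverted matrix $M \approx A$ and apply CG (implicitly) to the better-conditioned system
$$M^{-1} A x = M^{-1} b, \qquad \kappa(M^{-1} A) \ll \kappa(A) \ \text{(ideally)}.$$
Since $M^{-1} A$ is generally not symmetric, the practical algorithm (preconditioned CG) is rearranged so that only products with $A$ and solves with $M$ are ever needed.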
31. Diagonal preconditioning
- Just use the diagonal of A as M. A diagonal matrix is easy to invert, but of course it isn't the best preconditioner out there. (A sketch follows below.)
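A tiny NumPy sketch of the diagonal (Jacobi) preconditioner; the helper name is my own:

```python
import numpy as np

def jacobi_preconditioner(A):
    """Return a function that applies M^{-1}, where M = diag(A) (a sketch)."""
    d = np.diag(A).copy()
    return lambda r: r / d    # inverting a diagonal matrix is just elementwise division
```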
32. CG on the normal equations
If A is not symmetric, not positive definite, or not square, we can't use CG directly to solve $Ax = b$. However, we can use it to solve the normal equations
$$A^T A x = A^T b,$$
because $A^T A$ is always symmetric, positive definite (provided A has full column rank), and square. The problem that we solve with this is the least-squares fit, minimizing $\|Ax - b\|^2$, but the condition number gets squared, so convergence is slower. Also note that we never actually have to form $A^T A$; instead we multiply by A and then by $A^T$.
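A one-line sketch of the matrix-vector product this refers to, showing that $A^T A$ never has to be formed; the function name is my own:

```python
import numpy as np

def normal_eq_matvec(A, x):
    """Apply A^T A to x without forming A^T A explicitly (a sketch)."""
    return A.T @ (A @ x)      # multiply by A first, then by A^T
```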