Title: CS 584
1 CS 584
2 Iterative Methods
- Gaussian elimination is considered to be a direct method to solve a system.
- An indirect (iterative) method produces a sequence of values that converges to the solution of the system.
- Computation is halted in an indirect algorithm when a specified accuracy is reached.
3 Why Iterative Methods?
- Sometimes we don't need to be exact.
- Input has inaccuracy, etc.
- Only requires a few iterations.
- If the system is sparse, the matrix can be stored in a different format.
- Iterative methods are usually stable.
- Errors are dampened.
4 Iterative Methods
- Consider the following system:

     7x1 - 6x2 =  3
    -8x1 + 9x2 = -4

- Now write out the equations and solve for the ith unknown in the ith equation:

    x1 = (6/7)x2 + 3/7
    x2 = (8/9)x1 - 4/9
5 Iterative Methods
- Come up with an initial guess for each xi.
- Hopefully, plugging these guesses into the equations produces better values, which can in turn be used to compute still better values, and so on, converging to the answer.
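- For example, starting from the initial guess x1 = x2 = 0, one sweep of the solved equations above gives x1 = 3/7 ≈ 0.4286 and x2 = -4/9 ≈ -0.4444; repeated sweeps pull both values toward the exact solution x1 = 1/5 and x2 = -4/15 ≈ -0.2667.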
6 Consider
- Are systems of equations and finite element methods related?
7 Iterative Methods
- Jacobi iteration
- Use all old values to compute new values

    k     x1        x2
    0     0.00000   0.00000
    10    0.14865  -0.19820
    20    0.18682  -0.24908
    30    0.19662  -0.26215
    40    0.19913  -0.26551
    50    0.19977  -0.26637
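A minimal C sketch of this Jacobi iteration (assuming the 2x2 system 7x1 - 6x2 = 3 and -8x1 + 9x2 = -4 behind the solved equations on slide 4); printing every 10th iterate should reproduce the table above up to rounding:

    #include <stdio.h>

    /* Jacobi iteration on the example 2x2 system:
     *    7*x1 - 6*x2 =  3
     *   -8*x1 + 9*x2 = -4
     * Both updates use only values from the previous iteration. */
    int main(void)
    {
        double x1 = 0.0, x2 = 0.0;                    /* initial guess */
        for (int k = 1; k <= 50; k++) {
            double new_x1 = (3.0 + 6.0 * x2) / 7.0;   /* x1 = (6/7)x2 + 3/7 */
            double new_x2 = (-4.0 + 8.0 * x1) / 9.0;  /* x2 = (8/9)x1 - 4/9 */
            x1 = new_x1;
            x2 = new_x2;
            if (k % 10 == 0)
                printf("k=%2d  x1=%8.5f  x2=%8.5f\n", k, x1, x2);
        }
        return 0;
    }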
8 Jacobi Iteration
- The ith equation has the form

    sum_{j=0..n-1} A[i,j] * x[j] = b[i]
- Which can be rewritten as

    x[i] = ( b[i] - sum_{j != i} A[i,j] * x[j] ) / A[i,i]
9 Jacobi Iteration
- The vector (b - Ax) is zero if and only if we have the exact answer.
- Define this vector to be the residual r.
- Now rewrite the solution equation in terms of r:

    x[i]_new = x[i]_old + r[i] / A[i,i]
10 Jacobi Pseudocode

void Jacobi(float A[n][n], float b[n], float x[n], float epsilon)
{
    int k = 0;
    float x1[n], r[n], r0norm;

    // Randomly select an initial x vector
    r = b - A*x;               // This involves a matrix-vector mult., etc.
    r0norm = ||r||_2;          // This is basically the magnitude of r
    while (||r||_2 > epsilon * r0norm) {
        for (j = 0; j < n; j++)
            x1[j] = r[j] / A[j][j] + x[j];
        x = x1;
        r = b - A*x;           // recompute the residual with the updated x
    }
}
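For completeness, here is a self-contained, runnable C version of the pseudocode above. It is a sketch under the assumption of a dense, row-major matrix; the helper names residual and norm2 (and the use of double) are ours, not from the slides:

    #include <math.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* r = b - A*x for an n x n row-major dense matrix */
    static void residual(int n, const double *A, const double *b,
                         const double *x, double *r)
    {
        for (int i = 0; i < n; i++) {
            double ax = 0.0;
            for (int j = 0; j < n; j++)
                ax += A[i * n + j] * x[j];
            r[i] = b[i] - ax;
        }
    }

    /* 2-norm (magnitude) of a vector */
    static double norm2(int n, const double *v)
    {
        double s = 0.0;
        for (int i = 0; i < n; i++)
            s += v[i] * v[i];
        return sqrt(s);
    }

    /* Jacobi iteration with the relative-residual stopping test
     * from the pseudocode above. */
    void jacobi(int n, const double *A, const double *b, double *x, double epsilon)
    {
        double *r  = malloc(n * sizeof *r);
        double *x1 = malloc(n * sizeof *x1);

        residual(n, A, b, x, r);            /* r = b - A*x             */
        double r0norm = norm2(n, r);        /* magnitude of initial r  */

        while (norm2(n, r) > epsilon * r0norm) {
            for (int j = 0; j < n; j++)
                x1[j] = r[j] / A[j * n + j] + x[j];
            memcpy(x, x1, n * sizeof *x);   /* x = x1                  */
            residual(n, A, b, x, r);        /* recompute the residual  */
        }
        free(r);
        free(x1);
    }

    int main(void)
    {
        /* the same example system: 7*x1 - 6*x2 = 3, -8*x1 + 9*x2 = -4 */
        double A[4] = { 7.0, -6.0, -8.0, 9.0 };
        double b[2] = { 3.0, -4.0 };
        double x[2] = { 0.0, 0.0 };         /* initial guess */
        jacobi(2, A, b, x, 1e-6);
        printf("x1 = %.5f  x2 = %.5f\n", x[0], x[1]);
        return 0;
    }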
11 Parallelization of Jacobi
- 3 main computations per iteration:
  - Inner product (2-norm)
  - Loop calculating the x1[j]'s
  - Matrix-vector mult. to calculate r
- If A[j,j], b[j], and r[j] are on the same proc.:
  - The loop requires no communication
  - The inner product and matrix-vector mult. require communication
12 Inner Product
- Suppose the data is distributed row-wise.
- The inner product is simply a dot product:

    IP = Sum(x[j] * x[j])

- This only requires a global sum collapse: O(log p)
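As a sketch of how this might look in MPI-based C code (our illustration; it assumes the vector is block-distributed so each process owns local_n entries), the global sum collapse is a single MPI_Allreduce:

    #include <mpi.h>

    /* Each process holds local_n entries of x. The global inner product
     * is a local dot product followed by a global sum collapse. */
    double distributed_dot(const double *x_local, int local_n, MPI_Comm comm)
    {
        double local_sum = 0.0, global_sum = 0.0;
        for (int j = 0; j < local_n; j++)
            local_sum += x_local[j] * x_local[j];
        /* global sum across all p processes: O(log p) steps */
        MPI_Allreduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, comm);
        return global_sum;
    }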
13 Matrix-Vector Multiplication
- Again, the data is distributed row-wise.
- Each proc. requires all of the elements in the vector to calculate its part of the resulting answer.
- This results in an all-to-all gather: O(n log n)
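A sketch of the row-wise distributed matrix-vector multiply in MPI-based C (our illustration; it assumes n is divisible by p, so each process owns local_n = n/p rows of A and the matching entries of x):

    #include <mpi.h>

    /* y_local = A_local * x, where A_local holds this process's local_n
     * rows of A (row-major) and x_local holds its local_n entries of x. */
    void distributed_matvec(const double *A_local, const double *x_local,
                            double *x_full, double *y_local,
                            int n, int local_n, MPI_Comm comm)
    {
        /* every process needs the entire vector: the all-to-all gather */
        MPI_Allgather(x_local, local_n, MPI_DOUBLE,
                      x_full,  local_n, MPI_DOUBLE, comm);

        /* purely local work: one output entry per locally owned row */
        for (int i = 0; i < local_n; i++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++)
                sum += A_local[i * n + j] * x_full[j];
            y_local[i] = sum;
        }
    }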
14 Jacobi Iteration
- Resulting communication cost, with 4-byte floats:

    Tcomm = iterations * (T_IP + T_MVM)
    T_IP  = log p * (ts + 4*tw)
    T_MVM = p log p * (ts + 4*tw * (nrows/p))
15 Iterative Methods
- Gauss-Seidel
- Use the new values as soon as they are available

    k     x1        x2
    0     0.00000   0.00000
    10    0.21977  -0.24909
    20    0.20130  -0.26531
    30    0.20008  -0.26659
    40    0.20000  -0.26666
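The same 2x2 C sketch as before, changed to Gauss-Seidel: the only difference from the Jacobi version is that the freshly computed x1 is used immediately when updating x2.

    #include <stdio.h>

    /* Gauss-Seidel on the example system: 7*x1 - 6*x2 = 3, -8*x1 + 9*x2 = -4 */
    int main(void)
    {
        double x1 = 0.0, x2 = 0.0;                /* initial guess */
        for (int k = 1; k <= 40; k++) {
            x1 = (3.0 + 6.0 * x2) / 7.0;          /* uses the latest x2 */
            x2 = (-4.0 + 8.0 * x1) / 9.0;         /* uses the x1 just computed */
            if (k % 10 == 0)
                printf("k=%2d  x1=%8.5f  x2=%8.5f\n", k, x1, x2);
        }
        return 0;
    }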
16 Gauss-Seidel Iteration
- The basic Gauss-Seidel iteration is

    x[i]_new = ( b[i] - sum_{j < i} A[i,j] * x[j]_new - sum_{j > i} A[i,j] * x[j]_old ) / A[i,i]
17 Gauss-Seidel Iteration
- Rule: always use the newest values of the previously computed variables.
- Problem: doesn't this make the method sequential?
- Gauss-Seidel is indeed sequential if the matrix is dense.
- Parallelism is a function of the sparsity and ordering of the equation matrix.
18 Gauss-Seidel Iteration
- We can increase the possible parallelism by changing the numbering (ordering) of the unknowns in the system, e.g., a red-black ordering.
19 Parallelizing Red-Black GSI
Block checkerboard.
2 phases per iteration:
  1. Compute red cells using values from black cells.
  2. Compute black cells using values from red cells.
Communication is required for each phase.
20 Partitioning
(Figure: the red-black grid partitioned among processors P0, P1, P2, and P3.)
21 Communication
22 Red-Black GSI Pseudocode

Procedure Gauss-SeidelRedBlack
    while ( error > limit )
        send black values to neighbors
        recv black values from neighbors
        compute red values
        send red values to neighbors
        recv red values from neighbors
        compute black values
        compute error        /* only do every so often */
    endwhile
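To make the two phases concrete, here is a minimal serial C sketch of one red-black Gauss-Seidel iteration for a 5-point-stencil grid problem (the specific stencil, grid size N, and array names are illustrative assumptions, not from the slides); in the parallel version, each phase is preceded by the neighbor exchange shown above:

    #define N 8   /* number of interior grid points per side; illustrative */

    /* One red-black Gauss-Seidel sweep on an (N+2) x (N+2) grid with fixed
     * boundary values: cell (i,j) is "red" when (i+j) is even, "black" when
     * (i+j) is odd, so each cell's four neighbors have the other color. */
    void red_black_sweep(double u[N + 2][N + 2], double f[N + 2][N + 2], double h)
    {
        /* Phase 1: update red cells from their (old) black neighbors */
        for (int i = 1; i <= N; i++)
            for (int j = 1; j <= N; j++)
                if ((i + j) % 2 == 0)
                    u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] +
                                      u[i][j - 1] + u[i][j + 1] + h * h * f[i][j]);

        /* Phase 2: update black cells from the freshly updated red neighbors */
        for (int i = 1; i <= N; i++)
            for (int j = 1; j <= N; j++)
                if ((i + j) % 2 == 1)
                    u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] +
                                      u[i][j - 1] + u[i][j + 1] + h * h * f[i][j]);
    }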
23 Extending Red-Black Coloring
- Goal: produce a graph coloring scheme such that no node has a neighbor of the same color.
- Finite element methods produce graphs with only 4 neighbors.
- Two colors suffice.
- What about more complex graphs?
24 More complex graphs
- Use graph coloring heuristics.
- Number the nodes one color at a time; a simple heuristic is sketched below.
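A greedy coloring heuristic is one common choice. This sketch (our illustration, not from the slides) gives each node the smallest color not used by an already-colored neighbor, for a graph stored as adjacency lists; the nodes can then be renumbered color by color:

    #include <stdbool.h>
    #include <stdlib.h>

    /* Greedy graph coloring: color[v] becomes the smallest color not already
     * used by one of v's colored neighbors. adj[v] lists v's neighbors and
     * deg[v] is its degree. Returns the total number of colors used. */
    int greedy_color(int n, int **adj, const int *deg, int *color)
    {
        bool *used = calloc(n + 1, sizeof *used);
        int ncolors = 0;

        for (int v = 0; v < n; v++)
            color[v] = -1;                          /* -1 = not yet colored */

        for (int v = 0; v < n; v++) {
            for (int k = 0; k < deg[v]; k++) {      /* mark neighbor colors */
                int u = adj[v][k];
                if (color[u] >= 0)
                    used[color[u]] = true;
            }
            int c = 0;
            while (used[c])                         /* smallest unused color */
                c++;
            color[v] = c;
            if (c + 1 > ncolors)
                ncolors = c + 1;
            for (int k = 0; k < deg[v]; k++) {      /* unmark for next node */
                int u = adj[v][k];
                if (color[u] >= 0)
                    used[color[u]] = false;
            }
        }
        free(used);
        return ncolors;
    }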
25 Conclusion
- Iterative methods are used when an exact answer is not computable or not needed.
- Gauss-Seidel converges faster than Jacobi, but its parallelism is trickier.
- Finite element codes are simply systems of equations.
- Solve with either Jacobi or Gauss-Seidel.