Title: Adaptive Mesh Refinement, Multigrid Project
1Adaptive Mesh Refinement, Multigrid Project
Danny Thorne Computer Science Department Universit
y of Kentucky
- In collaboration with
- Craig Douglas, my advisor.
- Ray Tuminaro, Sandia National Labs, California.
- Jonathan Hu, Sandia National Labs, California.
- Jaideep Ray, National Combustion Research
Facility, Sandia, California. - Karen Devine, Sandia National Labs, New Mexico.
Supported in part by Sandia National Laboratories
and the National Science Foundation.
2Adaptive Mesh Refinement, Multigrid Project
Danny Thorne University of Kentucky
Supported in part by Sandia National Laboratories
and the National Science Foundation.
3Definition of the Problem
Solve elliptic boundary value problems
(BVPs). using adaptive mesh refinement
(AMR) and multigrid (MG).
4Motivation
- Problems with variations in scale, e.g.
combustion.
CRF Example
5Outline
- Background and definitions.
- Multigrid on adaptively refined meshes, the
basic algorithm. - Cache aware smoothers.
- Cache aware AMRMG.
- Numerical Results.
- Future work.
6Background and Definitions
7Cell Centered Multigrid
Illustration in 1D with four grid levels
Smooth A4u4f4.
Smooth A4u4f4.
Smooth A3u3f3.
Smooth A3u3f3.
Smooth A2u2f2.
Smooth A2u2f2.
Solve A1u1f1 directly.
8Definition of a Grid Hierarchy
9(No Transcript)
10Transfer Operators
11Nesting
We require strict nesting of the grids
This requirement can be written in terms of the
domains as
and
12(No Transcript)
13Internal Patch Boundaries
Three important concepts for MG on AMR
- Ghost points Define ghost points near the
internal patch boundaries so that a regular
stencil can be used on the fine grid side of the
boundary. (This is illustrated in later slides.) - C1 continuity Also, these ghost points are
used to preserve C1 continuity across internal
patch boundaries. - Flux matching The process of preserving C1
continuity across internal patch boundaries (also
referred to as coarse/fine interfaces) is called
flux matching and is discussed more in later
slides.
Illustrations of ghost points and flux matching
in later slides.
Note We use cell-centered grids so that coarse
and fine grids wont share grid points at the
interfaces.
14Internal Patch Boundaries
a.k.a. Coarse/Fine Interfaces
C1 continuity guarantees consistency which is
important because consistency stability
convergence.
Note We use cell-centered grids so that coarse
and fine grids wont share grid points at the
interfaces.
15Example
16Poisson on a Unit Cube
Example
172D Stencils
For grid points away from refinement patches.
183D Stencils
Boundary values.
19Ghost Points
- Ghost points are needed at the interface between
coarse patches and fine patches. - On the finer level, they provide information
needed by the discrete composite operator to use
a regular stencil at the boundaries of patches. - On the coarser level, they provide information
needed by the composite operator for flux
matching across coarse/fine interfaces. - The following few slides will illustrate the
process of acquiring ghost points and applying
the flux matching technique.
20Ghost Points
Normal
- First, interpolate values at the blue xs using
coarse grid values. - Then interpolate the red points using fine grid
values along with the values at the blue xs. - The red points are the ghost points that will be
used by the discrete operator.
21Flux Matching
1
2
The same approximation for the derivative across
the coarse/fine interface is used for the grid
points on both sides of the interface, so C1
continuity is preserved across the interface.
Notice that these fluxes match the fluxes used in
the stencils for the boundary points on the fine
grid side. Hence, flux matching.
22Flux Matching in 3D
Image created with Povray (http//www.povray.org)
23Ghost Points
Hard to illustrate in 3D
Interpolate the green points using coarse grid
values and then interpolate the blue points using
the green points.
Interpolate the red points using fine grid values
and the blue points. The red points are the
ghost points.
A piece of a coarse grid thats adjacent to a
fine grid
24Flux Matching
Hard to illustrate in 3D
At the interface, average the fluxes across the
four fine grid cell walls using the interpolated
values denoted by the red points.
25Multigrid on AMR Hierarchies
An Illustration in 1D
26Preparation For Illustration
- Assume the hierarchy is already constructed
(e.g. from AMR codes at Sandia). - The following sequence of slides illustrates our
AMR MG algorithm on the smallest possible example
(in 1D) that captures all the aspects of the
algorithm - Three levels,
- One refinement patch per level,
- Four grid points per level.
- Problem
- Residual
- Correction
Fasten your seatbelts, grasp the safety bar,
place head firmly against head rest
27Stuff (Hidden Slide)
Checkerboard denotes values that were updated in
previous steps.
28Color Test
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
29Illustration in 1D
The dashed lines and dots represent the
completion of the composite grid on each level.
The algorithm is patch-based. The solid lines
represent the domains of the patches, and the
dark dots represent the grid points in the
patches.
30Illustration in 1D
Start with an initial guess for the solution on
the composite grid.
Represent the grid points with a 2-by-2 array of
spaces to keep track of the status of the
different variables during the algorithm. See
the key to the left.
Solution
Correction
Residual
Temporary
31Illustration in 1D
Interpolate ghost point(s) so that regular
stencils can be applied at patch boundary points.
Checkerboard pattern denotes values that were
updated in previous steps.
Solution
Correction
Residual
Temp
Temporary
32Illustration in 1D
Compute the residual on the finest grid level
(using the ghost point(s) to complete the
stencil(s) on the patch boundary points).
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
33Illustration in 1D
Smooth to get a correction.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
34Illustration in 1D
Store temporarily corrected solution, needed for
acquisition of ghost points for the residual
computation on the next level. (The permanently
corrected solution for this level of this V cycle
will be computed on the other side of the V
cycle.)
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
35Illustration in 1D
Project the residual onto the part of the coarser
grid that is covered by the finer grid.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
36Illustration in 1D
Interpolate ghost point(s) for the temporarily
corrected solution. This is needed for residual
computation on the uncovered part of the next
level.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
37Illustration in 1D
Interpolate ghost point(s) so that regular
stencils can be applied at patch boundary points.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
38Illustration in 1D
Compute the residual on the uncovered nodes. Use
fine grid data (including ghost points) to do
flux matching at the interface nodes. Use the
newly interpolated ghost points to apply a
regular stencil at the boundary points.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
39Illustration in 1D
Smooth to get a correction.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
40Illustration in 1D
Store temporarily corrected solution, needed for
acquisition of ghost points for the residual
computation on the next level. (The permanently
corrected solution for this level of this V cycle
will be computed on the other side of the V
cycle.)
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
41Illustration in 1D
Project the residual onto the part of the coarser
grid that is covered by the finer grid.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
42Illustration in 1D
Compute ghost point(s) for the temporarily
corrected solution.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
43Illustration in 1D
Compute the residual on the uncovered nodes. Use
fine grid data (including ghost points) to do
flux matching at the interface nodes.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
44Illustration in 1D
Solve for the correction on the coarsest level.
(We use geometric multigrid here.)
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
45Illustration in 1D
Correct the solution on the uncovered part of the
coarsest level.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
46Illustration in 1D
Update the correction on the next finer level
with interpolated values from the coarse level.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
47Illustration in 1D
Update the residual.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
48Illustration in 1D
Smooth to get a correction for the correction.
We avoid smoothing the current correction because
that would require interpolating ghost points at
the patch boundaries.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
49Illustration in 1D
Update the original correction with the newly
computed correction.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
50Illustration in 1D
Now update the solution on the uncovered nodes
with the latest correction.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
51Illustration in 1D
Update the correction on the next finer level
with interpolated values from the coarser level.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
52Illustration in 1D
Update the residual.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
53Illustration in 1D
Smooth to get a correction for the correction.
We avoid smoothing the current correction because
that would require interpolating ghost points at
the patch boundaries.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
54Illustration in 1D
Update the original correction with the newly
computed correction.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
55Illustration in 1D
Now update the solution on this level with the
latest correction.
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
Temporary
56Illustration in 1D
Final solution on the composite grid after one V
cycle.
Can do more V cycles by starting the procedure
again from the beginning with the new solution as
the initial guess.
57Illustration in 1D
Checkerboard denotes values that were updated in
previous steps.
Solution
Correction
Residual
Temp
58Composite Grid Operators
Level 1 operator
Level 2 operator
Level 3 operator
Level 1 operator
Level 2 operator
Level 3 operator
59Smoothing
- Typically a damped Gauss-Seidel iteration.
- Using natural, red-black, or a multi-color
ordering.
- Compute pointwise on level without
regard for any other level
60Residual Computation
- Where Average() is a standard weighted
restriction from the finer level to the coarser
level.
61Multigrid Algorithm (Old)
// Split residual computation.
// Intermediate correction.
62Multigrid Algorithm (Hidden)
Smooth a correction.
Temporarily corrected solution.
Project part of the residual.
Compute the other part of the residual.
Interpolate and correct.
Update the residual.
Smooth an intermediate correction.
Update the final correction.
Apply the final correction to the solution on
this level.
63AMR MG Algorithm
Solve on the coarsest grid.
Compute residual on finest grid.
Smooth a correction.
Store temporarily corrected solution.
Project part of the residual.
Compute the other part of the residual.
Recursively apply to coarser levels.
Interpolate and correct.
Update the residual.
Smooth an intermediate correction.
Update the final correction.
Apply the final correction to the solution on
this level.
64Cache Aware Smoothers
65Cache Aware Smoothers
Preparation for Illustration
- Want data to pass through cache only once per
application of the smoother. - Application of the smoother means multiple
(e.g. two to four) Gauss-Seidel updates per
point. - Whereas normally one sweeps the entire grid over
and over to apply successive updates, we want to
stagger the updates in such a way that we need to
sweep the grid only once. - Want the answer to be exactly the same as the
normal application of the smoother. (Bit-wise
equivalence.) - The following illustration is of a 2D grid with
12-by-4 grid points and for three updates per
point. Updates occur from left to right and
bottom to top.
66Cache Aware Smoothers
67Cache Aware Smoothers
Note that updating the third row again would
require updated data in the fourth row, so we
have to wait since the fourth row hasnt been
updated yet. Similarly for the second row which
depends on a second update of row three.
Before updating new points, apply more updates to
the current points in cache where possible.
Update points in the first partition.
68Cache Aware Smoothers
Update points in the next partition.
Now we can continue updating the rows in the
previous partition.
69Cache Aware Smoothers
70Cache Aware Smoothers
71Only Post-smoothing
- Apparently, the cache aware smoothing technique
will give better performance with greater numbers
of updates (within a limit imposed by the cache
size, anyway). - When doing both pre- and post- smoothing, it is
common to do two pre-smoothing updates and two
post-smoothing updates. - Q What if we instead do four post-smoothing
updates and no pre-smoothing? - A Better cache effects.
- In addition, doing only post-smoothing
simplifies our AMR MG algorithm, as illustrated
on the next slide, and is more conducive to
making multigrid cache aware.
72Only Post-Smoothing
- No pre-smoothing.
- No need for a temporarily corrected solution.
- Just project/compute the residual.
- Only need to compute ghost points once.
- No residual update.
- Can avoid intermediate correction by
interpolating ghost points (good for
implementation of cache aware multigrid). - Number of smoothing updates here should equal
the number of pre-smoothing updates plus the
number of post-smoothing updates used in the
original implementation.
73Cache Aware Multigrid
74Numerical Results
- The following slides contain graphs of numerical
results. - The results are for the 2D case.
- Three types of grid hierarchies were tested.
- Full domain refinements (normal, non-AMR MG).
- One refinement per patch
- Two refinements per patch
75Numerical Results
- We solve a Laplaces equation with solution
- Dirichlet boundary conditions.
- We iterate until the residual is less than
times the norm of the composite right hand
side. - We use three versions of the smoother
- Standard Implementation Uses one pair of loops
with lots of conditionals to determine which
stencil to apply (recall that different stencils
are used on the interior and boundaries in the
cell centered code). - Optimized Implementation Uses several loops to
apply the updates intelligently (with knowledge
of what order the points are updated) without
using conditionals. - Cache Aware Implementation As illustrated in
previous slides.
76Numerical Results
- We solve a Laplaces equation with solution
-
- Dirichlet boundary conditions.
- We iterate until the residual is less than
times the norm of the composite right hand
side. - We use three versions of the smoother
- Standard Implementation A standard
implementation that one would write in general
practice. - Optimized Implementation Optimized loop
structure requiring more complicated and tedious
coding. - Cache Aware Implementation As illustrated in
the previous slides. This uses the optimized
implementation as a basis.
77Pentium Results
78Itanium Results
79Speedups
80Pre-/Post- versus Post-
With the cache aware smoother
81Pre-/Post- versus Post-
With the cache aware smoother
82Future Work
83Future Work
- Cache aware methods
- Need to try a variety of techniques.
- In larger problems, especially in 3D, the row
based blocking is insufficient. - Try windshield wiper style approach.
- 3D
- I already have a basic 3D code.
- Need to extend the cache aware implementation to
3D. (How to apply cache aware techniques in 3D is
an active are of research.) - Need to generalize functionality -- more
complicated configurations of refinement patches
are possible than in 2D.
- Parallel Support
- Partitioning mechanisms.
- Load balancing algorithms (Zoltan).
84Future Work
- Variable Coefficients
- Currently the code only supports constant
coefficients. - I am already in the middle of building support
for variable coefficients.
- Support for more general boundary conditions.
- Currently only support Dirichlet boundaries.
- Clustering Algorithms
- This is related to the process of dynamically
determining useful refinement patterns based on
the behavior of the solver. - Given a set of points that are not sufficiently
accurate, cluster them into subsets that are
convenient to cover with rectangular patches. - Will probably use existing tools for this, at
least in the short term.
85Future Work
- Papers
- C. C. Douglas, J. Hu, J. Ray, D. T. Thorne, and
R. S. Tuminaro, Fast, Adaptively Refined
Computational Elements in 3D, in Proceedings of
the International Conference on Computational
Science, Amsterdam, 2002, Springer-Verlag,
Berlin, 2002, 10 pages. (Refereed) - C. C. Douglas and D. T. Thorne, A Note on Cache
Memory Methods for Multigrid in Three Dimensions,
11 pages, to appear, American Mathematics
Society, 2002-2003. - C. C. Douglas, J. Hu, J. Ray, D. T. Thorne, and
R. S. Tuminaro, Cache Aware Multigrid on Two
Dimensional Adaptively Refined Structured Meshes,
nearly ready for submission. - Paper to do 3D results
- Paper to do Parallel algorithms and results.
- Paper to do Fully cache aware multigrid
results. - Paper to do Tutorial about the code.
86Implementation
87Grids and Grid Functions
- These are the two main types of objects.
- A grid level is made from grids.
- The grid hierarchy is made from grid levels.
- A grid function level is made from grid
functions. - The composite grid function is made from grid
level functions. - An AMR MG Poisson object is made from a grid
hierarchy and a composite grid function. - The AMR MG Poisson object is what the user will
interface with. - These things will be illustrated on the next
several slides.
The next slide has most of this content in the
form of a diagram.
88AMR MG Poisson
Grid Hierarchy
Grid Function Composite
NumLevels
NumLevels
Grid Levels
Grid Function Levels
NumGrids
NumGrids
Grids
Grid Function Arrays
NumFunctionss
Grid Functions
n
A contains n instances of B.
A
B
A is derived from B.
A
B
Array
A contains a pointer to B.
A
B
89Grids
Real coordinates of the beginning and ending
corners of a rectangular region of the domain.
Integer coordinates of the beginning and ending
grid points of the rectangular region of the
domain.
Identity of the processor this grid belongs to.
Pointer to the grid level that this grid belongs
to.
Pointers to the parent and children of this grid.
Illustration on next slide.
90Grid Illustration
dx1
(ex1,ex2)
dx2
(ei1,ei2)
(si1,si2)
ni2 2
(sx1,sx2)
ni1 3
- The grid is characterized by the coordinates sx,
ex, dx, si, ei, and ni. - The grid does not store values for the grid
points. - The values associated with the grid points are
stored in grid functions (next slide).
91Grid Functions
A single value for each grid point.
Pointer to the corresponding grid.
Pointer to the base grid.
Identity of the processor that this grid function
belongs to.
92Grid Function Arrays
Array of NumGridFunctions pointers to grid
functions.
NumGridFunctions
Ghost points.
Pointer to the corresponding grid.
Pointer to the base grid.
Pointer to the containing grid function level.
Identity of the processor that this grid function
array belongs to.
Pointers to parent and children.
- This is where most of the work is done.
- Most of the multigrid related methods are here,
for example.
93Grid Levels
NumLevels, LevelIndex, NumGrids,
Array of NumGrids pointers to grids.
Pointer to the grid hierarchy that contains this
grid level.
94Grid Function Levels
NumGridFunctions, NumFunctionArrays
Array of NumFunctionArrays pointers to grid
function arrays.
Pointer to the parent grid function level.
Pointer to the base grid.
95Grid Hierarchy
NumLevels
Array of NumLevels pointers to grid levels.
Grid Function Composite
NumLevels, NumGridFunctions
Array of NumLevels grid function levels.
96AMR MG Poisson
Pointer to a grid hierarchy.
Pointer to a composite grid function.
Array of pointers to grids.
NumGrids, NumGridFunctions.
RefinementFactor, MinPatchSize.
The grids are allocated here, and the rest of the
grid hierarchy is initialized with pointers to
them.
97Additional Comments
- Not shown in the preceding diagrams
- Every object contains a pointer to an MPI
wrapper. - Every object has the usual constructors,
destructor, operators, accessors, - Some of the objects have methods related to the
mesh refinement process and/or multigrid process.
Those will be discussed in the following few
slides.
98Refinement
99Composite Operator
AMRMGPoissonClassLsuball()
GridFunctionCompositeClassLsuball()
GridFunctionLevelClassLsuball()
GridFunctionLevelClassLsuball()
- Interpolate ghost points around this grid on
level l using values from the parent grid on
level l-1. - Apply the operator to this grid on level l.
- Perform flux matching at the interfaces between
this grid on level l and all child grids on level
l1. - Note that the ghost points for the flux matching
were acquired when Lsuball() was called for level
l1 previously. - Note also that an argument can be passed to this
function indicating which level to treat as the
finest level. If that argument indicates that
this level, level l, is the finest level, then
flux matching will not be performed and Lsuball()
will correspond to Lnf,l in that case.
100Multigrid
AMRMGPoissonClassVCycle()
GridFunctionCompositeVcycle()
GridFunctionLevel( lmax )ComputeResidual()
GridFunctionLevel( l-1 )Vcycle_Project_Stage00()
GridFunctionLevel( l )Vcycle_Project_Stage01()
For l from finest to coarsest1.
GridFunctionLevel( l )Lsuball( l )
GridFunctionLevel( l-1 )Lsuball()
GridFunctionLevel( l-1 )Vcycle_Project_Stage02()
GridFunctionLevel( 0 )Vcycle_CoarseGridSolve()
GridFunctionLevel( l-1 )Vcycle_Correct_Stage01()
GridFunctionLevel( l )Vcycle_Correct_Stage02()
For l from coarsest to finest-1.
101Closing Remarks
- As mentioned before, the user will interface
with an AMRMGPoisson object. In the simplest
case, just specify the problem on a base grid and
then call the solve method. - AMRMGPoisson.InitBaseGrid( /Specify pointers to
functions and/or filenames of input
files/) - AMRMGPoisson.Solve()
- Experiment with caching techniques.
- Parallelization.
- Clustering (GrACE)
- Load balancing (Zoltan)