Paul Fischer - PowerPoint PPT Presentation

1 / 75
About This Presentation
Title:

Paul Fischer

Description:

Paul Fischer – PowerPoint PPT presentation

Number of Views:147
Avg rating:3.0/5.0
Slides: 76
Provided by: paulfi8
Category:
Tags: fischer | paul

less

Transcript and Presenter's Notes

Title: Paul Fischer


1
  • Paul Fischer
  • Mathematics and Computer Science Division
  • Argonne National Laboratory

fischer_at_mcs.anl.gov www.mcs.anl.gov/fischer
Temperature distribution in an fcc-packed array
of spheres subject to unit-flux boundary
conditions.
2
Outline
  • Objective Give an overview of motivation, and
    some detail, for the development of high-order
    methods in scientific applications
  • Introduction
  • Architecture / Algorithm Coupling
  • Accuracy
  • Stability
  • Solvers
  • Applications will be interwoven throughout

3
Motivation Spectral Element Simulation Examples
  • transitional boundary layers
  • heat transfer enhancement
    M. Greiner (UNR)
  • convection dynamics in deep atmospheres A. Deane
    (U.Md)
  • pattern formation in nonequilibrium flows
  • H.
    Greenside (Duke), M. Paul M. Cross (Caltech)
  • vascular flows F. Loth (UIC), S. Lee
    (MIT), H. Bassiouny (UofC)
  • MHD F. Cattaneo, A.
    Obabko, R. Rosner (UofC)

4
Heat Transfer in an fcc Lattice of Spheres at ReD
30,000
  • E1536 elements of order N14
  • 10 hours on 64 nodes for 1 flow-through time
  • sphere-centric view
  • void-centric view
  • Max temperature fluctuation
  • 14 x mean temperature change.

5
Transition for Pulsatile Flow in a Model
Stenosis
Calculation by Sonu Sam Varghese, Steve Frankel,
Purdue.
E1600, N10
6
Spatial Discretization Spectral Element Method

(Patera 84, Maday Patera 89)
  • Variational method, similar to FEM, using GL
    quadrature.
  • Domain partitioned into E high-order
    quadrilateral (or hexahedral) elements
    (decomposition may be nonconforming - localized
    refinement)
  • Trial and test functions represented as N
    th-order tensor-product polynomials within each
    element. (N 4 -- 15, typ.)
  • EN 3 gridpoints in 3D, EN 2 gridpoints in 2D.
  • Converges exponentially fast with N for smooth
    solutions.

3D nonconforming mesh for arterio-venous graft
simulations E 6168 elements, N 7
7
PN - PN-2 Spectral Element Method for
Navier-Stokes (MP 89)
Velocity, u in PN , continuous Pressure, p
in PN-2 , discontinuous
Gauss-Lobatto Legendre points (velocity)
Gauss Legendre points (pressure)
8
Local Spectral Element Basis in 2D
Nodal bases on the Gauss-Lobatto-Legendre points
N4
N10
9
  • Architecture / Algorithm Considerations

10
Ubiquitous Loosely-Coupled Network of
Processor/Memory Units
  • Paralellism easy SP h P S1 h O(1)
    even for P? 10,000.
  • Concurrency
  • Reasonable communication requirements
  • Difficulty is getting good single processor
    performance S1
  • Moores Law CPU speed doubles every 18 months

11
Moores Law Measured Performance 19862000
12
Ubiquitous Loosely-Coupled Network of
Processor/Memory Units
  • Paralellism easy SP h P S1 h O(1)
    even for P? 10,000.
  • Concurrency
  • Reasonable communication requirements
  • Difficulty is getting good single processor
    performance
  • Moores Law CPU speed doubles every 18 months
  • Memory speeds double every 120 months
  • ? Growing processor-memory gap.
  • On current machines, 10 of peak is considered
    reasonable.

13
Ubiquitous Loosely-Coupled Network of
Processor/Memory Units
  • Paralellism easy SP h P S1 h O(1)
    even for P? 10,000.
  • Concurrency
  • Reasonable communication requirements
  • Difficulty is getting good single processor
    performance
  • Moores Law CPU speed doubles every 18 months
  • Memory speeds double every 120 months
  • ? Growing processor-memory gap.
  • On current machines, 10 of peak is considered
    reasonable.
  • Need algorithms that re-use data, i.e. FLOPS gtgt
    memory accesses

14
Work vs. Memory Examples

  • ops mem ratio
  • Vector-Vector a a c b 2n
    2n 1
  • Matrix-Vector x Ay
    2n2 n2 2
  • Sparse Mat-Vec x Asy 2mn
    mn 2
  • Matrix-Matrix W AU 2n3
    2n2 n
  • Amortize cost of accessing A over n vectors

15
Matrix-Matrix Product Performance on
Cached-Based Architectures
  • Time for AB vs AB for N10 is
  • 2.0 x on DEC Alpha,
  • 1.5 x on IBM SP

16

  • Examples of data re-use via matrix-matrix
    products in PDE operator evaluation (kernel of
    iterative solvers)

17
Example 1 Spectral Element Derivative
Evaluation
  • Local tensor-product form (2D),
  • allows derivative to be evaluated as
    matrix-matrix product
  • In Rd (N1)d points/element, O(Nd1)
    ops/element
  • Similar complexities hold for all SE operations

18
Example 2 Derivative Evaluation for p-type or
DG Triangles

(Teng, Hesthaven, Warburton,)
  • Standard uxe Dxe ue , e 1,, E
  • O(E p4 ) memory references for O(E p2) gridpoints
  • Modified uxe rxe Dre ue sxe Dse ue
  • For triangles, metrics rxe and sxe are constants
  • Dre ue , Dse ue are matrix-matrix products on
    canonical triangle
  • O(E p2 ) memory references for O(E p2) gridpoints
  • In 3D, savings is more significant

s W r
y W e x
19
Example 3 Block-Jacobi Preconditioning for
Triangles / Tets
20
Summary Architectures
  • It is possible to construct high-order methods
    having memory-access demands identical to
    standard low-order schemes.
  • The extra work of the high-order methods can be
    (potentially) covered by the processor-memory
    performance gap.

21
  • Motivation for High-Order Accuracy

22
High-Order Methods for the Incompressible
Navier-Stokes Equations
  • Reynolds number Re 1000 - 2000
  • small amount of diffusion
  • highly nonlinear (small scale structures result)
  • High-order methods yield savings for simulations
    featuring
  • long time-integration
  • minimal physical dissipation (Regtgt1, no
    turbulence modeling)
  • Examples
  • DNS (direct numerical simulation)
  • LES (large eddy simulation)
  • Chaotic systems

23
Exponential Convergence
Exact Navier-Stokes solution due
to Kovazsnay(1948)
24
Excellent transport properties, even for
non-smooth solutions
Convection of non-smooth data on a 32x32
grid (K1 x K1 spectral elements of order N).
(cf. Gottlieb Orszag 77)
25
Long-time integration 1-D periodic convection
For 0.5 error, only 64 pts. are required for
N8, vs. 256 pts. for N2. Savings is cubed in
3D.
26
Relative Phase Error for h vs. p Refinement ut
ux 0
  • h-refinement
    p-refinement
  • Fraction of accurately resolved modes is
    increased only through increasing order ( N or p
    )

27
  • Example Applications of High-Order Methods

28
Scattered Field Solution of Maxwells Equations
  • Calculation by Jan Hesthaven Tim Warburton
    USEMe
  • Computed in the time domain
  • Discontinuous Galerkin in space 250,000 tets,
    p4
  • jan.hesthaven_at_brown.edu
  • timwar_at_math.umn.edu

29
Anisotropic Diffusion in Plasma Fusion Reactors
  • For the next generation fusion codes, the DOE
    Center for Extended Magnetohydrodynamics Modeling
    (CEMM) at PPPL has posed three challenging model
    problems to test new discretizations.
  • As part of SciDAC funded research, we have
    tackled the first of these.

30
Anisotropic Diffusion in a Tokamak (PPPL
Challenge Problem)
  • k 109 k 1

31
Anisotropic Diffusion in Plasma Fusion Reactors
  • Because of the high-degree of anisotropy, this
    problem is more like a (difficult) hyperbolic
    problem than straightforward diffusion.

Error vs. k and N
32
  • Stability of High-Order Methods

33
Stability Considerations
  • Stability for nodal bases essentially comes from
    use of Gauss-type nodal distributions avoids
    Runge phenomenon
  • This has been generalized to tets / triangles by
  • Hesthaven points based on steady-state charge
    distributions
  • Wingate Taylor Fekete points that minimize
    det Vandermonde matrix
  • Other stability issues also arise, because there
    is little numerical or physical dissipation in
    the numerical systems.
  • Problem usually stems from spurious eigenvalues
    and can be controlled by
  • Least-Squares
  • Upwinding
  • Bubble functions
  • Hyperviscosity or spectrally vanishing viscosity
  • Filtering

34
Filter-Based Stabilization
(Gottlieb et al., Don et al., Vandeven, Boyd, ...)
  • At end of each time step,
  • Interpolate u onto GLL points for PN-1
  • Interpolate back to GLL points for PN
  • Results are smoother with linear combination
    (F. Mullen 01)

F1 (u) IN-1 u
Fa (u) (1-a) u a IN-1 u (a 0.05
- 0.2)
  • Preserves interelement continuity and spectral
    accuracy
  • Equivalent to multiplying by (1-a) the N th
    coefficient in the expansion
  • u(x) S uk fk (x)

    (Boyd 98)
  • fk (x) Lk(x) - Lk-2(x)

35
Numerical Stability Test Shear Layer Roll-Up
(Bell et al.
JCP 89, Brown Minion, JCP 95, F. Mullen, CRAS
2001)
2562
2562
1282
2562
2562
1282
36
Error in Predicted Growth Rate for
Orr-Sommerfeld Problem at Re7500
Spatial and Temporal Convergence (FM,
2001)
Base velocity profile and perturbation streamlines
37
Filtering permits Red99 gt 700 for transitional
boundary layer calculations
Re 1000
Re 3500
38
  • Navier-Stokes Time Advancement
  • Iterative Linear Solvers

39
Navier-Stokes Time Advancement
  • Nonlinear term explicit
  • k th-order backward difference formula /
    extrapolation
  • characteristics (Pironneau 82, MPR 90)
  • Stokes problem pressure/viscous decoupling
  • 3 Helmholtz solves for velocity
  • (consistent) Poisson equation for pressure
    (computationally dominant)

40
PN - PN-2 Spectral Element System Matrices for
Stokes (MP 89)
41
Consistent Splitting for Unsteady Stokes(MPR 90,
Blair-Perot 93, Couzy 95)
  • E - consistent Poisson operator for pressure, SPD
  • Stiffest substep in N-S time advancement
  • Most compute-intensive phase

42
Pressure Solution Strategy Epn gn
  • Projection compute best approximation from
    previous time steps
  • Set p S gi p n-i, projection onto L previous
    time steps
  • Choose gi such that Dp E p n - p E
    is minimized
  • Cost 2 matvecs in E per timestep, L2 vectors of
    storage
  • Multilevel preconditioned CG or GMRES to solve
  • E Dp gn - E p

43
Initial guess for Axn bn via projection (
AE, SPD)
(best fit solution)
44
Initial guess for Epn gn via projection onto
previous solutions
(F 93, 98)
  • pn - pE O(Dtl) O( etol )
    (Chan Wan, 95)
  • Results with/without projection (1.6 million
    pressure nodes)
  • Similar results for pulsatile carotid artery
    simulations
    108-fold reduction in initial residual

Pressure Iteration Count
Initial Pressure Residual
45
Pressure Solution Strategy Epn gn
  • Projection compute best approximation from
    previous time steps
  • Multilevel preconditioned CG or GMRES to solve
  • E Dp gn - E p

46
Two-Level Overlapping Additive Schwarz
Preconditioner for Pressure Solve
(Dryja Widlund 87, Pahl 93, PF 97, FMT 00)
Local Overlapping Solves Poisson problems with
homogeneous Dirichlet boundary conditions.
Coarse Grid Solve Poisson problem using linear
finite elements on entire spectral element mesh
(GLOBAL). (Fast solver Tufo F. 01)
47
Overlapping Schwarz SEM Fast Local Solvers
  • Exploit local tensor-product structure --
  • A is separable in canonical domain
  • Fast diagonalization method (FDM) (Lynch et al
    64)
  • Local solve cost is 4d K N(d1)

48
Convergence Rates
  • Schwarz theory says that convergence rate depends
    only on d/H (Helement size), and not on number
    of elements or local order N.
  • In practice, however, d N-2
  • Recent work (Lottes F. 04a, 04b) indicates that
    N-dependence is diminished if AC-1 is replaced
    with a multigrid V-cycle (at nominal cost) using
    weighted additive Schwarz smoothing.
  • Convergence rate is strongly dependent on element
    geometry particularly aspect-ratio.

49
Two-Level Additive Schwarz Performance
  • Iteration count deteriorates in presence of
    high-aspect ratio elements
  • Nonconforming discretizations eliminate
    unnecessary elements in the far field and result
    in better conditioned systems.

50
  • Navier-Stokes Case Study
  • Blood Flow in Arteriovenous Grafts

51
A Study of Transition in Arterio-Venous Grafts
  • Provides a port for dialysis patients to obtain
    high flow rates and avoid repeated vessel injury
  • PTFE plastic tubing surgically attached from
    artery to vein
  • short-circuit of high to low pressure results in
    very high flow ratestransitional flow
  • Failure often results after 3 months from
    occlusion (intimal hyperplasia) forming
    downstream of attachment to vein, where flow is
    turbulent

52
In Vitro Validation Transitional Flow,
Arteriovenous Graft
Mesh constructed by Seung Lee, MIT
53
Comparison of spanwise vorticity for laminar and
turbulent cases
Re 1696
Re 2912
54
Spectral Element Simulation of Transitional Flow
in an Arteriovenous-Graft Model, Re2912
55
Experimental/Numerical Comparison at Re2912
LDA measurements CFD results
(Loth, et al. JBME 2002)
56
Canine In-Vivo Model of AV Graft
NIH RO1, PI
Bassiouny
57
Canine In-Vitro Model
  • CT Scan image (GE Medical Systems)
  • Slice thickness 1.25mm (contiguous)
  • Pixel size .19x.19mm

A
B(1-4)
C
A
B-1
B-2
B-3
C
C
B-4
CS view
58
Canine Graft Pulsatile Flow Simulation
Sang-Wook Lee, UIC
  • In vivo turbulence is observed
  • CFD does not indicate turbulence
  • Frequency peak 30 Hz, not 200-300
  • Re too low 1200
  • Re in arterial side is 2400
  • points to arterial flow as possible controllable
    source of turbulence

Venous Anastomosis
59
Where is Turbulence Coming From ?
  • Suspected that turbulence is generated at
    arterial anastomosis and sustained through the
    graft and into the PVS

60
Simulations of Full Graft Geometry
  • Arterial Anastomosis ? PTFE graft ? Venous
    Anastomosis

Preliminary CFD results for 3-6 mm transition at
arterial anastomosis 300-400 Hz fluctuation
Spectral Element Mesh for PTFE graft
61
Simulations of Full Graft Geometry
  • Arterial Anastomosis ? PTFE graft ? Venous
    Anastomosis

E4800 Spectral Element Mesh for PTFE graft
N10 ( 4.7 M points ) N14 ( 13.0 M
points )
62
Where is Turbulence Coming From ?
  • Turbulence generated at arterial anastomosis is
    not sustained through the graft and into the
    PVS.
  • Recent detective work by students David Smith and
    Sang-Wook Lee in the UIC Biofluids Lab points to
    flow-reversal in the DVS as a probable source of
    turbulence.
  • This conjecture is supported by data from
    concurrent numerical simulations (Lee) and
    in-vitro LDA measurements (Smith).

63
Where is Turbulence Coming From ?
  • Three simulations with steady inlet conditionsat
    ReG 1200
  • 100, 85, and 70 flow through PVS
  • 70 case shows strong turbulence, despite the
    fact that ReDVV 950

100
85
70
64
Coherent Structures for 70/30 Split _at_ ReG 1200
65
Conclusions
  • Highlighted several considerations in the
    development of high-order methods for transport
    applications
  • Architectures processor/memory gap
  • Accuracy minimal dispersion
  • Stability filtering
  • Solvers multilevel Schwarz methods

66
Future Directions
  • Expect to see further development of high-order
    methods as new physical problems are addressed.
  • Recent areas include
  • Vascular flows
  • Electromagnetics, Hesthaven Warburton (Brown,
    UNM)
  • MHD codes LANL, PPPL, ANL / U of C
  • Compressible aero codes D. Mavriplis UWY, D.
    Darmofals group MIT
  • Atmospheric models NCAR
  • Geophysics, etc.,

67
(No Transcript)
68
Conclusions
  • Hex-based SEM can be effectively used for
    transitional flow problems in moderately complex
    domains
  • Avoid severe mesh distortion impact on accuracy
    and conditioning
  • Gordon-Hall generally sufficient to blend local
    element distortion
  • Sub-parametric mappings and / or over-integration
    can be important for treating first-order
    derivative terms (i.e.,
  • div u , u.grad u )
  • PDE-based approach to mesh generation can be
    employed to generate high-quality meshes (dates
    back to aero-industry)

69


70
(No Transcript)
71
(No Transcript)
72
(No Transcript)
73
Vascular Flow Simulations
F. Loth (UIC), S. Lee (MIT), H. Bassiouny
(UofC)
  • Pulsatile carotid artery simulations
  • 3 million grid points (K 3000 elements of
    order N 10)
  • 20 hours / 256 processors for single cardiac
    cycle (temporal scales span 3 decades)
  • Fast turn-around time is required for clinical
    relevance and for large patient-population
    studies ? fast solvers are important

Patient-Specific Simulation Scenario
74
Challenges in Vascular Flow Simulation
  • Automated mesh creation based on medical image
    data is needed.
  • Boundary conditions must be deduced from
    ultrasound data.
  • The correct flow-split must be imposed in
    bifurcating geometries.
  • Strong turbulence can wreak havoc at outflow
    boundaries.
  • Modeling issues
  • Non-Newtonian at certain sites?
  • Role of wall compliance?

75
Performance pressure solve -- Epngn
  • Project solution onto previous solutions ( 6
    orders of magnitude residual reduction for
    carotid simulations)
  • p Sk bk pk
  • Solve remaining perturbed problem with conjugate
    gradients, preconditioned by the two-level
    additive Schwarz method of Dryja and Widlund (87).
Write a Comment
User Comments (0)
About PowerShow.com