1
AstroBEAR
  • Finite volume hyperbolic PDE solver
  • Discretizes and solves conservation laws of the form
    ∂U/∂t + ∇·F(U) = S (a minimal 1-D sketch follows this list)
  • Solves hydrodynamic and MHD equations
  • Written in Fortran, with MPI support libraries
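A minimal 1-D sketch of a finite volume update in this spirit, using plain linear advection and a Rusanov (upwind) flux. This is not AstroBEAR source; the grid size, flux choice, and variable names are illustrative assumptions.

  ! 1-D finite volume update of dU/dt + dF/dx = 0 (linear advection).
  ! Illustrative sketch only: sizes and the flux are placeholders.
  program fv1d
    implicit none
    integer, parameter :: nx = 100
    real :: u(0:nx+1), f(0:nx), dt, dx, a
    integer :: i, n

    dx = 1.0 / nx
    dt = 0.4 * dx          ! CFL-limited step for advection speed a = 1
    a  = 1.0
    u  = 0.0
    u(nx/4:nx/2) = 1.0     ! square pulse initial condition

    do n = 1, 100
       u(0)    = u(nx)     ! periodic ghost cells
       u(nx+1) = u(1)
       do i = 0, nx        ! interface fluxes (Rusanov / local Lax-Friedrichs)
          f(i) = 0.5*a*(u(i) + u(i+1)) - 0.5*abs(a)*(u(i+1) - u(i))
       end do
       do i = 1, nx        ! conservative cell update
          u(i) = u(i) - dt/dx * (f(i) - f(i-1))
       end do
    end do
    print *, 'final mass =', sum(u(1:nx))*dx
  end program fv1d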

2
Adaptive Mesh Refinement
  • Method of reducing computation in finite volume
    calculations
  • Starts with a base resolution and overlays grids
    of greater refinement where higher resolution is
    needed.
  • Grids must be properly nested
  • For parallelization purposes, each grid has only one
    parent (see the data-structure sketch after this list)
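As an illustration only (the field names are assumptions, not AstroBEAR's actual types), a grid record that captures the nesting rules above might look like:

  ! Hypothetical patch/grid record mirroring the "one parent per grid" rule.
  module amr_types
    implicit none
    type :: grid_t
       integer :: level                       ! refinement level (0 = base grid)
       integer :: lo(3), hi(3)                ! index bounds in level coordinates
       integer :: owner_rank                  ! MPI rank that integrates this grid
       type(grid_t), pointer :: parent => null()       ! exactly one parent grid
       type(grid_t), pointer :: children(:) => null()  ! refined child grids
       real, allocatable :: q(:,:,:,:)        ! conserved variables, incl. ghosts
    end type grid_t
  end module amr_types

A single parent pointer keeps ownership of each refined region unambiguous when grids are distributed across processors.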

3
AMR Algorithm (Cunningham et al., 2009)
  AMR(level, dt)
    if (level == 0) nsteps = 1
    if (level > 0)  nsteps = refine_ratio
    for n = 1, nsteps
      DistributeGrids(level)
      if (level < MaxLevel)
        CreateRefinedGrids(level+1)
        SwapGhostData(level+1)
      Integrate(level, n)
      if (level < MaxLevel) CALL AMR(level+1, dt/refine_ratio)
      if (level > 1) synchronize_data_to_parent(level)
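The same recursion written out as a compilable Fortran routine for clarity. The containing module amr_mod and the exact routine signatures are assumptions of this sketch; only the control flow follows the slide.

  ! Recursive subcycling driver: one coarse step drives refine_ratio fine steps.
  recursive subroutine amr(level, dt)
    use amr_mod, only: MaxLevel, refine_ratio, DistributeGrids, &
                       CreateRefinedGrids, SwapGhostData, Integrate, &
                       synchronize_data_to_parent
    implicit none
    integer, intent(in) :: level
    real,    intent(in) :: dt        ! only forwarded to finer levels, as on the slide
    integer :: n, nsteps

    if (level == 0) then
       nsteps = 1                    ! root level takes a single step
    else
       nsteps = refine_ratio         ! refined levels subcycle
    end if

    do n = 1, nsteps
       call DistributeGrids(level)
       if (level < MaxLevel) then
          call CreateRefinedGrids(level+1)
          call SwapGhostData(level+1)
       end if
       call Integrate(level, n)
       if (level < MaxLevel) call amr(level+1, dt/refine_ratio)
       if (level > 1) call synchronize_data_to_parent(level)
    end do
  end subroutine amr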

4
Parallel Communications
  • Grids rely on external ghost cells to perform
    calculations.
  • Data from neighboring grids must be copied into the
    ghost region (see the exchange sketch after this list)
  • Major source of scaling problems
  • Alternate fixed-grid code (AstroCUB) has
    different communication method
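A hedged, self-contained example of a 1-D ghost-cell exchange using MPI_SENDRECV. AstroBEAR's actual exchange is the overlap-transfer scheme on the next slide; this only illustrates the general idea of filling ghost cells from neighboring ranks.

  ! Each rank owns nx interior cells plus one ghost cell per side and fills
  ! its ghosts from its left/right neighbors. Sizes and data are placeholders.
  program ghost_exchange
    use mpi
    implicit none
    integer, parameter :: nx = 8
    real    :: u(0:nx+1)
    integer :: rank, nprocs, left, right, ierr
    integer :: status(MPI_STATUS_SIZE)

    call MPI_INIT(ierr)
    call MPI_COMM_RANK(MPI_COMM_WORLD, rank,   ierr)
    call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

    left  = merge(MPI_PROC_NULL, rank-1, rank == 0)         ! no periodic wrap
    right = merge(MPI_PROC_NULL, rank+1, rank == nprocs-1)

    u = real(rank)                                          ! dummy interior data

    ! send rightmost interior cell right, receive left ghost from the left
    call MPI_SENDRECV(u(nx), 1, MPI_REAL, right, 0, &
                      u(0),  1, MPI_REAL, left,  0, &
                      MPI_COMM_WORLD, status, ierr)
    ! send leftmost interior cell left, receive right ghost from the right
    call MPI_SENDRECV(u(1),    1, MPI_REAL, left,  1, &
                      u(nx+1), 1, MPI_REAL, right, 1, &
                      MPI_COMM_WORLD, status, ierr)

    call MPI_FINALIZE(ierr)
  end program ghost_exchange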

5
AstroBEAR Parallel Communication
  TransferOverlapData()
    TransferWorkerToWorkerOverlaps()
    TransferMasterToWorkerOverlaps()
    TransferWorkerToMasterOverlaps()

  foreach overlap transfer t
    if (Worker(t.source)) SignalSendingProcessor(t.source)
    if (Worker(t.dest))   SignalReceivingProcessor(t.dest)
    if (Worker(t.source)) SendLocalOverlapRegion(t.source)
    if (Worker(t.dest))   SendLocalOverlapRegion(t.dest)
    <t.source sends data to t.dest>

6
AstroCUB Parallel Communication
  TransferOverlapData(Grid g)
    for dim = 1, ndim
      foreach boundary along dim
        foreach field_type
          MPI_ISEND(<field_type data> at <boundary>)
          MPI_IRECV(<field_type data> at <boundary>)
          MPI_WAIT()
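One step of this posted-send/posted-receive pattern, fleshed out for a single boundary of a single field. The buffer arguments and the use of MPI_WAITALL on both requests are assumptions of this sketch, not AstroCUB's exact code.

  ! Exchange one boundary buffer with one neighbor using nonblocking MPI.
  subroutine exchange_boundary(sendbuf, recvbuf, n, neighbor, comm)
    use mpi
    implicit none
    integer, intent(in)    :: n, neighbor, comm
    real,    intent(in)    :: sendbuf(n)
    real,    intent(inout) :: recvbuf(n)
    integer :: req(2), ierr
    integer :: statuses(MPI_STATUS_SIZE, 2)

    ! post the receive first, then the send, then wait on both requests
    call MPI_IRECV(recvbuf, n, MPI_REAL, neighbor, 0, comm, req(1), ierr)
    call MPI_ISEND(sendbuf, n, MPI_REAL, neighbor, 0, comm, req(2), ierr)
    call MPI_WAITALL(2, req, statuses, ierr)
  end subroutine exchange_boundary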

7
AstroBEAR/AstroCub Comparison
  • AstroBEAR
  • Recalculates overlaps before each synchronization
  • Each send/receive operation is handled
    individually
  • Groups transfers based on source and destination
    processor (master or worker)
  • 10 MPI calls per grid per timestep in 3D hydro
    runs
  • AstroCUB
  • Calculates overlaps once, prior to first
    synchronization
  • Send/receive operations handled together
  • 6 MPI calls per processor per timestep in 3D
    hydro runs

8
Requirements
  • Physics
  • Hydro/MHD
  • Cooling
  • Cylindrical Source
  • Self-Gravity
  • Sink Particles
  • Numerics
  • MUSCL-Hancock, Runge-Kutta
  • Strang Splitting (toy illustration after this list)
  • Constrained Transport
  • Roe, Marquina Flux
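As a toy illustration of the Strang splitting requirement (not AstroBEAR code): advance one sub-operator a half step, the other a full step, then the first again a half step. The linear decay operators below are placeholders chosen so the program runs standalone.

  ! Strang splitting on du/dt = A(u) + B(u) with A(u) = -u and B(u) = -2u,
  ! each sub-operator advanced exactly. Purely illustrative.
  program strang_demo
    implicit none
    real :: u, dt
    integer :: n
    u  = 1.0
    dt = 0.1
    do n = 1, 10
       u = u * exp(-0.5*dt)   ! operator A for dt/2
       u = u * exp(-2.0*dt)   ! operator B for dt
       u = u * exp(-0.5*dt)   ! operator A for dt/2
    end do
    print *, 'u(t=1) =', u, '  exact =', exp(-3.0)
  end program strang_demo

For non-commuting operators, such as a hydro update paired with a stiff cooling step, this half/full/half ordering keeps the splitting error at second order in dt.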

9
Language Options
  • Python
  • Pros: Good stack traces, flexibility, resource
    management
  • Cons: Requires SciPy, GPU, or hybridization for
    numerics
  • C
  • Pros: Speed, no interfaces required
  • Cons: More memory and pointer management work falls
    on the developer
  • Fortran
  • Pros: Fast number-crunching
  • Cons: Clumsy data structures, more memory and
    pointer management for the developer

10
Hybridization
  • Not unheard-of in scientific codes
  • Cactus (Max Planck Institute)
  • We've tried it already (HYPRE)
  • Can benefit from strengths of scripting and
    compiled languages
  • May result in steeper learning curve for new
    developers

11
Parallelization Improvements
  • Transmission caching
  • Each processor stores its ghost zone transmission
    details until regrid
  • Message packing
  • Sending big blocks that pack many small messages
    together (see the packing sketch below)

[Diagram: several small messages (msg1, msg2, msg3) packed into one block]
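A hedged sketch of the packing idea using MPI_PACK: three small arrays are packed into one contiguous buffer and shipped with a single MPI_SEND instead of one call each. The names, counts, and tags are illustrative assumptions.

  ! Pack several small messages into one buffer and send it in one call.
  subroutine send_packed(msg1, msg2, msg3, n, dest, comm)
    use mpi
    implicit none
    integer, intent(in) :: n, dest, comm
    real,    intent(in) :: msg1(n), msg2(n), msg3(n)
    integer :: pos, bufsize, ierr
    character, allocatable :: buf(:)

    call MPI_PACK_SIZE(3*n, MPI_REAL, comm, bufsize, ierr)
    allocate(buf(bufsize))

    pos = 0
    call MPI_PACK(msg1, n, MPI_REAL, buf, bufsize, pos, comm, ierr)
    call MPI_PACK(msg2, n, MPI_REAL, buf, bufsize, pos, comm, ierr)
    call MPI_PACK(msg3, n, MPI_REAL, buf, bufsize, pos, comm, ierr)

    ! one send carries all three messages
    call MPI_SEND(buf, pos, MPI_PACKED, dest, 0, comm, ierr)
    deallocate(buf)
  end subroutine send_packed

The receiver would MPI_RECV the MPI_PACKED buffer and extract each message with MPI_UNPACK.
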
12
Parallelization Improvements, ctd.
  • Redundancy in root domains
  • Stretching root grids initially to pull in
    extra data from other grids
  • Reduces the need for refined ghost transmissions

[Diagram: core grid with a stretched root grid extending beyond it]
13
Further Options for Improvement
  • Refined grids: Can Berger-Rigoutsos be further
    simplified/parallelized?

14
Concerns for New Code
  • Solver Modularity
  • Code should run on CPU cluster or GPU cluster
  • Scalability
  • Code must run effectively on more than 64 CPUs