1
DFT requirements for leadership-class computers
http://unedf.org
  • N. Schunck
  • Department of Physics & Astronomy, University of
    Tennessee, Knoxville, TN-37996, USA
  • Physics Division, Oak Ridge National Laboratory,
    Oak Ridge, TN-37831, USA

A. Baran, J. Dobaczewski, J. McDonnell, J. Moré,
W. Nazarewicz, N. Nikolov, H. H. Nam, J. Pei, J.
Sarich, J. Sheikh, A. Staszczak, M. V. Stoitsov,
S. Wild
The 3rd LACM-EFES-JUSTIPEN Workshop, JIHIR, Oak
Ridge National Laboratory, February 23-25, 2009
2
Nuclear DFT: Why supercomputing?
DFT: A global theory
Principle: average out individual degrees of
freedom
  • Treatment of correlations?
  • Current lack of quantitative predictions at the
    100 keV level
  • Extrapolability?
  • No limit: one theory from light nuclei to the
    physics of neutron stars
  • Rich physics
  • Fast and reliable

The ground state of an even nucleus can be computed in a
matter of minutes on a standard laptop, so why
bother with supercomputing?
  • Why supercomputers:
  • Large-scale problems (LACM): fission, shape
    coexistence, time-dependent problems
  • Systematic restoration of broken symmetries and
    correlations made easy (QRPA, GCM?, etc.)
  • Optimization of extended functionals on larger
    sets of experimental data

Supercomputers: DFT at full power
3
Classes of DFT Solvers
Non-linear integro-differential fixed-point
problem
  • Coordinate space: direct integration of the HFB
    equations
  • Accurate: provides the "exact" result
  • Slow and CPU/memory intensive for 2D-3D
    geometries
  • Configuration space: expansion of the solutions
    on a basis (usually HO)
  • Fast and amenable to beyond-mean-field extensions
  • Truncation effects: source of
    divergences/renormalization issues
  • Wrong asymptotics unless different bases are used
    (WS, PTG, Gamow, etc.)

Computational packages used and developed at ORNL,
with an estimate of the resources needed for a
standard HFB calculation:

           1D                       2D                          3D
r-space    1 min, 1 core (HFBRAD)   5 hours, 70 cores (HFBAX)   -
HO basis   -                        2 min, 1 core (HFBTHO)      5 hours, 1 core (HFODD)
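
The "fixed-point problem" on this slide can be illustrated with a toy model. The sketch below is not taken from any of the codes listed above: the scalar map `toy_map` is a hypothetical stand-in for the cycle "fields, HFB solution, new fields", and simple linear mixing is used in place of the Broyden method mentioned later in the talk. It only shows why some form of mixing is needed when the plain iteration does not converge.

```python
def solve_fixed_point(g, x0, alpha=1.0, tol=1e-10, max_iter=200):
    """Iterate x <- (1 - alpha) * x + alpha * g(x) until |g(x) - x| < tol.

    alpha = 1.0 is the plain iteration; 0 < alpha < 1 damps each update.
    Returns (x, converged, iterations_used).
    """
    x = x0
    for iteration in range(1, max_iter + 1):
        gx = g(x)
        if abs(gx - x) < tol:
            return x, True, iteration
        x = (1.0 - alpha) * x + alpha * gx
    return x, False, max_iter


def toy_map(x):
    # Hypothetical scalar stand-in for one self-consistency cycle.
    # Its slope at the fixed point is steeper than -1, so plain iteration fails.
    return 4.0 / (1.0 + x * x)


plain = solve_fixed_point(toy_map, x0=1.0, alpha=1.0)
mixed = solve_fixed_point(toy_map, x0=1.0, alpha=0.4)
print("plain iteration converged:", plain[1])
print("mixed iteration converged:", mixed[1], "x =", mixed[0])
```

With this particular map the undamped iteration oscillates between two values, while a mixing fraction of 0.4 settles on the fixed point in a handful of steps.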
4
Recent physics achievements
Nuclear fission
Even-even, odd-even and odd-odd mass tables
Systematics of odd-proton states in odd nuclei
Cf. talks by M. Stoitsov, S. Wild and J. Moré
Online resources: http://massexplorer.org/, http://unedf.org/
5
Petascale and beyond
  • Hardware constraints (see R. Lusk's and J. Vary's
    talks):
  • Many cores (100,000) stacked into sockets;
    currently 4 cores/socket, evolving toward 8
    cores/socket and more
  • Small memory per core (shared memory per socket)
  • Short, crash-prone, expensive runtime
  • Consequences for the architecture of DFT solvers:
  • Optimize the time of one HFB calculation: reduce the
    number of iterations, use symmetries smartly by
    improving/interfacing codes, parallelization,
    etc.
  • Work on the parallel wrapper: load balancing,
    checkpoints, error control mechanisms, etc.

6
Optimization - Interface HFBTHO/HFODD
  • Restarting HFODD from HFB-THO means:
  • Tremendous gain in time of calculation
  • Increased numerical stability
  • Taking advantage of existing mass tables
  • Procedure:
  • Coordinate and phase transformations (both unitary)
  • Modify HFODD to restart from HFB matrix elements
    instead of density fields on the Gauss-Hermite mesh
  • Interface fully working for spherical HO bases
    (precision of restart at 10^-4 to 10^-6)
  • Memory issue for deformed bases

HFB-THO: axial symmetry; cylindrical coordinates; time-reversal symmetry; j-block diagonalization
HFODD: symmetry-unrestricted; Cartesian coordinates; y-simplex eigenbasis; no time-reversal symmetry; full diagonalization
7
Optimization - HFODD Profiling
Broyden routine: storage of NBroyden fields on the 3D
Gauss-Hermite mesh
Temporary array allocation for HFB matrix
diagonalization
Safe limit: memory/core on Jaguar/Franklin
[Figure: HFODD profiling plot with curves for neutrons and protons]
Calculations by J. McDonnell
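
The two memory consumers named on this slide can be sized with simple arithmetic. The sketch below is only an order-of-magnitude illustration: the number of stored Broyden iterations, the number of fields and the mesh size are hypothetical values, not taken from the profiling data, and the matrix is assumed to be stored as double-precision complex. The 4,000 x 4,000 dimension is the one quoted in the workplan later in the talk.

```python
BYTES_PER_REAL = 8       # double precision
BYTES_PER_COMPLEX = 16   # double-precision complex (assumed storage type)


def broyden_history_bytes(n_broyden, n_fields, mesh_points_per_axis):
    """Memory to keep n_broyden past iterations of n_fields real fields on a cubic 3D mesh."""
    mesh_size = mesh_points_per_axis ** 3
    return n_broyden * n_fields * mesh_size * BYTES_PER_REAL


def dense_matrix_bytes(dimension, bytes_per_element=BYTES_PER_COMPLEX):
    """Memory for one dense square matrix of the given dimension."""
    return dimension * dimension * bytes_per_element


# Hypothetical settings, chosen only to illustrate orders of magnitude.
history = broyden_history_bytes(n_broyden=8, n_fields=20, mesh_points_per_axis=40)
matrix = dense_matrix_bytes(4000)
print(f"Broyden history: {history / 1e6:.1f} MB")
print(f"One 4,000 x 4,000 complex matrix: {matrix / 1e6:.1f} MB")
```

Under these assumptions a single matrix copy already takes 256 MB, so a few temporary arrays of that size per core would be significant on a machine with small memory per core.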
8
Optimization - HFODD Parallelization
  • Two levels of parallelism handled by a simple MPI
    group structure:
  • Nuclear configuration (Z, N, interaction, Qλμ,
    etc.)
  • HFB solver
  • Standard PBLAS and ScaLAPACK libraries for
    distributed linear algebra
  • Natural splitting of the HFB matrix (OpenMP):
    perhaps not scalable enough
  • Splitting:
  • HFB matrix into N blocks
  • Eigenfunctions conserve the same N-block
    splitting
  • Densities must be reconstructed piecewise
  • Challenges:
  • Identify a self-contained set of all matrices
    required for one iteration
  • Handling of conserved symmetries, which
    give a different block structure
  • Identify and replace all BLAS calls by PBLAS
    equivalents
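
The two-level group structure can be sketched without running MPI at all. The function below is a hypothetical illustration, not HFODD code: it computes, for each global rank, the `color` and `key` integers that one would typically pass to `MPI_Comm_split` so that every nuclear configuration gets its own communicator and the ranks inside it share one HFB solve. The group size of 4 is an arbitrary assumption.

```python
def two_level_layout(world_size, ranks_per_solver):
    """Map each global rank to (color, key) for a two-level group structure.

    color: which nuclear configuration the rank works on (outer level)
    key:   the rank's position inside the group solving that HFB problem (inner level)
    """
    if world_size % ranks_per_solver != 0:
        raise ValueError("world_size must be a multiple of ranks_per_solver")
    return [(rank // ranks_per_solver, rank % ranks_per_solver)
            for rank in range(world_size)]


# Hypothetical job: 8 ranks, 4 ranks cooperating on each HFB calculation.
layout = two_level_layout(world_size=8, ranks_per_solver=4)
for rank, (color, key) in enumerate(layout):
    print(f"global rank {rank}: configuration group {color}, solver rank {key}")
```

In a real run the inner communicator would be the one handed to the distributed linear algebra layer, while the outer level only distributes independent configurations.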

9
Optimization - Finite-size spin instabilities
Convergence of the HFB calculation of 100 blocked
states in 157-165Ba
  • Response of the nucleus to a perturbation with
    finite momentum q, studied in RPA theory
  • Channels: scalar-isoscalar, scalar-isovector,
    vector-isoscalar, vector-isovector, etc.

Modern Skyrme functionals are highly unstable
with respect to finite-size spin perturbations!
Region of instability
Warning for the next generation of functionals:
stability must be assessed!
T. Lesinski et al., Phys. Rev. C 74, 044315 (2006)
D. Davesne et al., arXiv:0906.1927 (2009)
10
Work in progress - Fission
  • Example of a challenge for next-generation DFT:
    microscopic description of nuclear fission
  • Degrees of freedom at the HFB level: deformation,
    temperature
  • Potential energy surfaces depend critically on the
    interaction/functional and pairing correlations

Static HFB pre-requisites
  • Computational tools
  • Augmented Lagrangian Method ?
  • Broyden Method ?
  • Precision tools
  • Large bases ?
  • Benchmarks ?
  • Distributed computing tools
  • MPI wrapper ?
  • Load balancing ?
  • Efficient, independent, constrained calculations ?
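
The Augmented Lagrangian Method listed among the computational tools can be illustrated on a two-variable toy problem. This is a minimal sketch under stated assumptions, not the HFODD implementation: the quadratic "energy" `x^2 + 2 y^2`, the linear "constraint operator" `Q = x + y`, the penalty strength and the step size are all hypothetical choices made so that the exact answer (x = 2/3, y = 1/3 for a target of 1) is easy to check by hand.

```python
def augmented_lagrangian(q_target, penalty=10.0, outer_steps=20,
                         inner_steps=200, step=0.04):
    """Minimize E(x, y) = x^2 + 2 y^2 subject to Q(x, y) = x + y = q_target."""
    x, y, lam = 0.0, 0.0, 0.0
    for _ in range(outer_steps):
        # Inner loop: gradient descent on
        # L = E - lam * (Q - q_target) + (penalty / 2) * (Q - q_target)^2
        for _ in range(inner_steps):
            violation = x + y - q_target
            force = -lam + penalty * violation  # dL/dQ, shared by both coordinates
            x -= step * (2.0 * x + force)
            y -= step * (4.0 * y + force)
        # Outer loop: update the multiplier from the remaining constraint violation.
        lam -= penalty * (x + y - q_target)
    return x, y, lam


x, y, lam = augmented_lagrangian(q_target=1.0)
print(f"x = {x:.6f}, y = {y:.6f}, Q = {x + y:.6f}, multiplier = {lam:.6f}")
```

The point of the multiplier update is that the constraint is met accurately with a moderate penalty, rather than relying on a very stiff quadratic term alone.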

11
DFT Computing Infrastructure
Interfacing codes
Parallelize solver
Load balancing
12
Deliverables Year 2-3
Workplan Year 2-3 and Current Status

Workplan: have a DFT package combining HFB-THO and HFODD
available for large-scale calculations
Current status: done (for spherical bases); large-scale
calculations up to 14,112 cores (2 hours)

Workplan: optimize full diagonalization of large (4,000 ×
4,000) matrices in HFODD
  • Take advantage of N-core architecture
  • Increase speed for large bases (fission, heavy
    nuclei)
  • Overcome current memory limitations
Current status: well on target
  • Parallelization of the HFODD core (PBLAS,
    ScaLAPACK)
  • Will solve issues related to speed, memory and
    precision
  • Change of iteration cycle: updating HFB matrix
    elements instead of fields

Workplan: optimize Broyden method (cf. Jorge's talk) to
improve stability/convergence
Current status: done; numerical instabilities of large-scale
calculations can be tracked down to physical
instabilities built into current functionals (see
Mario's talk)

Workplan: papers on odd nuclei
  • Methodology and theoretical models
  • Systematics and comparison with experiment
Current status: delayed by the problem of instabilities
  • Paper 1 ready to be published
  • Paper 2 in preparation
  • Additional Paper 3 on finite-size spin
    instabilities in preparation

13
Work Plan (Year 4)
  • Remainder of the year:
  • New version of HFODD: HFBTHO interface, shell
    correction, finite temperature, Augmented
    Lagrangian Method, matrix elements mixing,
    parallel interface, etc.
  • 2 papers on odd nuclei and 1 on spin
    instabilities in preparation
  • Physics:
  • Optimization of DME-based functionals: genetic
    algorithm, Argonne optimizer (cf. Mario's talk)
  • Applications of DME functionals: UNEDF-1
  • Computing:
  • Implement DME functionals in HFODD (study of
    time-odd channels)
  • Complete version 1.0 of the parallel HFODD core
  • Demonstrate efficiency and scalability of the
    code
  • First applications: N-dimensional potential
    energy surface, fission pathways
  • Improve the parallel interface to HFODD:
  • Optimistic: it should be a good application of
    ADLB (moderately long to long work units of 1-2
    hours, little communication)
  • Realistic: remove the master and have him work
    like a slave (French Revolution spirit)
  • Replace sequential I/O by parallel I/O for HFODD
    records (used as checkpoints)
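
Why dynamic load balancing suits work units of uneven length can be seen in a small scheduling simulation. The sketch below does not use the ADLB API and is not the HFODD wrapper: it simply compares a fixed round-robin assignment with a self-scheduling scheme where the first free worker takes the next work unit. The run times and the worker count are hypothetical.

```python
import heapq


def static_makespan(durations, n_workers):
    """Round-robin assignment fixed in advance: task i goes to worker i % n_workers."""
    busy = [0.0] * n_workers
    for i, hours in enumerate(durations):
        busy[i % n_workers] += hours
    return max(busy)


def dynamic_makespan(durations, n_workers):
    """Self-scheduling: whichever worker becomes free first takes the next task."""
    free_at = [0.0] * n_workers
    heapq.heapify(free_at)
    for hours in durations:
        start = heapq.heappop(free_at)        # earliest available worker
        heapq.heappush(free_at, start + hours)
    return max(free_at)


work_units = [2.0, 1.0, 2.0, 1.0, 2.0, 1.0]  # hypothetical run times in hours
print("static :", static_makespan(work_units, 2), "hours")
print("dynamic:", dynamic_makespan(work_units, 2), "hours")
```

For this input the static split leaves one worker with all the long units (6 hours in total), while self-scheduling finishes in 5 hours.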

14
Microscopic Description of Nuclear Fission
Our Holy Grail
Scientific and computational challenges
Summary of research direction
  • Describe dynamics with novel energy functionals
    and ab initio methods
  • adiabatic approach
  • non-adiabatic/early stochastic
  • full time-dependent dynamics
  • Develop ultra-scale techniques for the
    description of fission
  • Build a spectroscopic precision nuclear energy
    density functional
  • Perform constrained minimization on a
    multi-dimensional potential energy surface
  • Find the full spectrum of dense matrices with
    dimensions in the millions

Expected Scientific and Computational Outcomes
Potential impact on Nuclear Science
  • Societal Impact
  • Nuclear Energy programs
  • Threat reduction
  • NNSA Stockpile Stewardship Program
  • Time-dependent many-body dynamics
  • Low-energy heavy-ion collisions and nucleon- and
    photon-induced reactions
  • Neutron star quakes
  • Vortex dynamics in quantum super-fluids
  • Predict half-lives, mass and kinetic energy
    distribution of fission fragments and fission
    cross-sections
  • Analyze the fission process through the
    visualization of time evolution
  • Develop scalable application software for
    time-dependent many-body dynamics