Evolution of Computational Approaches in NWChem

Transcript and Presenter's Notes
1
Evolution of Computational Approaches in NWChem
  • Eric J. Bylaska, W. De Jong, K. Kowalski, N.
    Govind, M. Valiev, and D. Wang
  • High Performance Software Development
  • Molecular Science Computing Facility

2
Outline
  • Overview of the capabilities of NWChem
  • Existing terascale→petascale simulations in
    NWChem
  • Challenges in developing massively parallel
    algorithms (that work)
  • Using embeddable languages (Python) to develop
    multifaceted simulations
  • Summary

3
Overview of NWChem Background
  • NWChem is part of the Molecular Science Software
    Suite
  • Developed as part of the construction of EMSL
  • Environmental Molecular Sciences Laboratory, a
    DOE BER user facility, located at PNNL
  • Designed and developed to be a highly efficient
    and portable Massively Parallel computational
    chemistry package
  • Provides computational chemistry solutions that
    are scalable with respect to chemical system size
    as well as MPP hardware size

4
Overview of NWChem Background
  • More than 2,000,000 lines of code (mostly
    FORTRAN)
  • About half of it computer generated (Tensor
    Contraction Engine)
  • A new version is released on a yearly basis
  • Addition of new modules and bug fixes
  • Ports to new hardware
  • Improved performance (serial and parallel)
  • Freely available after signing a user agreement
  • World-wide distribution (downloaded by 1,900
    groups)
  • 70% is academia, the rest government labs and
    industry

5
Overview of NWChem Available on many compute
platforms
  • Originally designed for parallel architectures
  • Scalability to 1000s of processors (parts even to
    10,000s)
  • NWChem performs well on small compute clusters
  • Portable: runs on a wide range of computers
  • Supercomputers to Macs or PCs running Windows
  • Including Cray XT4 and IBM Blue Gene
  • Various operating systems, interconnects, CPUs
  • Emphasis on modularity and portability

6
Overview of NWChem capabilities
  • NWChem brings a full suite of methodologies to
    solve large scientific problems
  • High accuracy methods (MP2/CC/CI)
  • Gaussian based density functional theory
  • Plane wave based density functional theory
  • Molecular dynamics (including QMD)
  • Will not list those things standard in most
    computational chemistry codes
  • Brochure with detailed listing available
  • http://www.emsl.pnl.gov/docs/nwchem/nwchem.html

7
Overview of NWChem high accuracy methods
  • Coupled Cluster
  • Closed-shell coupled cluster: CCSD and CCSD(T)
  • Tensor contraction engine (TCE)
  • Spin-orbital formalism with RHF, ROHF, UHF
    references
  • CCD, CCSDTQ, CCSD(T), CCSD[T], LCCD, LCCSD
  • CR-CCSD(T), LR-CCSD(T)
  • Multireference CC (available soon)
  • EOM-CCSDTQ for excited states
  • MBPT(2-4), CISDTQ, QCISD

8-water cluster on the EMSL computer: 1,376 bf, 32
electrons, 1,840 processors, achieving 63% of 11 TFlops
CF3CHFO CCSD(T) with TCE: 606 bf, 57 electrons,
open-shell
8
Overview of NWChem Latest high-accuracy capabilities

Vertical excitation energies (eV), gas phase vs. DNA
environment:
CR-EOMCCSD(T): 4.76, 5.24 (gas phase); 5.01, 5.79 (DNA)
TDDFT → ∞: 5.81 eV; EOMCCSD → ∞: 6.38 eV
Surface exciton energy: 6.35 ± 0.10 eV (expt.)

Dipole polarizabilities (in Å3): C60 molecule in the
ZM3POL basis set, 1,080 basis functions, 240 correlated
electrons, 1.5 billion T2 amplitudes

Wavelength (nm)   CC2     CCSD    Expt.
∞                 92.33   82.20   76.58
1064              94.77   83.62   79.4
532                       88.62
9
NWChem capabilities DFT
  • Density functional theory
  • Wide range of local and non-local
    exchange-correlation functionals
  • Truhlar's M05 and M06 (Yan Zhao)
  • Hartree-Fock exchange
  • Meta-GGA functionals
  • Self-interaction correction (SIC) and OEP
  • Spin-orbit DFT
  • ECP, ZORA, DK
  • Constrained DFT
  • Implemented by Qin Wu and Van Voorhis

DFT performance: Si75O148H66, 3,554 bf, 2,300
electrons, Coulomb fitting
10
Overview of NWChem TDDFT
  • Density functional theory
  • TDDFT for excited states

Expt. 696 nm; TDDFT 730 nm
Expt. 539 nm; TDDFT 572 nm
Modeling the λmax (absorption maximum) of optical
chromophores with dicyanovinyl and
3-phenylioxazolone groups with TDDFT, B3LYP, COSMO
(ε = 8.93, CH2Cl2), 6-31G; Andzelm (ARL) et al.
(in preparation, 2008)
Development and validation of new Coulomb-attenuated
(LC) functionals (Andzelm, Baer, Govind, in
preparation, 2008)
11
Overview of NWChem plane wave DFT
Silica/Water CPMD Simulation
  • Plane wave density functional theory
  • Gamma point pseudopotential and projector
    augmented wave
  • Band structure (w/ spin orbit)
  • Extensive dynamics
    functionality with Car-Parrinello
  • CPMD/MM molecular dynamics, e.g. SPC/E, CLAYFF,
    solid state MD
  • Various exchange-correlation
    functionals
  • LDA, PBE96, PBE0, (B3LYP)
  • Exact exchange
  • SIC and OEP
  • Can handle charged systems
  • A full range of pseudopotentials and a
    pseudopotential generator
  • A choice of state-of-the-art minimizers

CCl4(2-)/water CPMD/MM simulation
Spin-Orbit splitting in GaAs
12
Overview of NWChem classical molecular dynamics
  • Molecular dynamics
  • CHARMM and AMBER force fields
  • Various types of simulations
  • Energy minimization
  • Molecular dynamics simulation including ab initio
    dynamics
  • Free energy calculation
  • Multiconfiguration thermodynamic integration
  • Proton transfer through proton hopping (Q-HOP),
    i.e. semi-QM in classical MD
  • Implemented by the group of Volkhard Helms,
    Saarland University, Germany
  • Set up and analyze runs with Ecce

13
Overview of NWChem QM/MM
(Figure: MM, QM, and combined QM/MM regions)
  • Seamless integration of molecular dynamics with
    coupled cluster and DFT
  • Optimization and transition state search
  • QM/MM Potential of Mean Force (PMF)
  • Modeling properties at finite temperature
  • Excited States with EOM-CC
  • Polarizabilities with linear response CC
  • NMR chemical shift with DFT

QM/MM CR-EOM-CCSD excited-state calculations of the
cytosine base in DNA, Valiev et al., JCP 125 (2006)
14
Overview of NWChem Miscellaneous functionality
  • Other functionality available in NWChem
  • Electron transfer
  • Vibrational SCF and DFT for anharmonicity
  • COSMO
  • ONIOM
  • Relativity through spin-orbit ECP, ZORA, and DK
  • NMR shielding and indirect spin-spin coupling
  • Interface with VENUS for chemical reaction
    dynamics
  • Interface with POLYRATE, Python

15
Existing Terascale→Petascale capabilities
  • NWChem has several modules which are scaling to
    the terascale and work is ongoing to approach the
    petascale

NWPW scalability: UO2(2+) + 122 H2O (Ne = 500,
Ng = 96×96×96)
CCSD scalability: C60, 1,080 basis functions
MD scalability: octanol (216,000 atoms)
16
Existing terascale→petascale Capabilities
Embarrassingly Parallel Algorithms
  • Some types of problems can be decomposed and
    executed in parallel with virtually no need for
    tasks to share data. These problems are called
    embarrassingly parallel because they are so
    straightforward: very little (or no) inter-task
    communication is required.
  • Possible to implement with embeddable languages
    such as Python (see the sketch after this list)
  • Examples
  • Computing potential energy surfaces
  • Potential of Mean Force (PMF)
  • Monte Carlo
  • QM/MM Free Energy Perturbation
  • Dynamical Nucleation Theory
  • ONIOM
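A minimal sketch of the pattern in plain Python, not NWChem code: energy_at is a hypothetical stand-in for one independent electronic structure evaluation (in practice it would launch an NWChem task). Because no point of the scan needs another's data, a plain worker pool parallelizes it trivially.

    from multiprocessing import Pool

    def energy_at(r):
        # Hypothetical stand-in for one independent energy evaluation
        # at bond length r; a real driver would run an NWChem task here.
        return (r - 1.4)**2 - 2.0    # toy potential energy surface

    if __name__ == '__main__':
        grid = [1.0 + 0.05*i for i in range(21)]   # independent PES points
        with Pool() as pool:                       # no inter-task data sharing
            surface = pool.map(energy_at, grid)
        for r, e in zip(grid, surface):
            print('r = %.2f  E = %.4f' % (r, e))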

17
Challenges Tackling Amdahl's Law
There is a limit to the performance gain we can
expect from parallelization. If a fraction f of the
program is parallelizable and the remaining 1-f must
run serially, Amdahl's law bounds the speedup on P
processors:

    S(P) = 1 / ((1 - f) + f/P)  ≤  1 / (1 - f)

Significant time investment for each 10-fold
increase in parallelism!
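To make the limit concrete, a few lines of Python (an illustration, not slide material) evaluate the formula at P = 1000:

    def amdahl_speedup(f, p):
        # f: parallelizable fraction of the runtime, p: processor count
        return 1.0 / ((1.0 - f) + f / p)

    for f in (0.9, 0.99, 0.999):
        print('f = %.3f  S(1000) = %6.1f' % (f, amdahl_speedup(f, 1000)))
    # f = 0.900  S(1000) =    9.9
    # f = 0.990  S(1000) =   91.0
    # f = 0.999  S(1000) =  500.3

Each 10-fold gain in usable parallelism requires driving the serial fraction down by another factor of 10, which is where the development time goes.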
18
Challenges Gustafson's Law: make the problem big
Gustafson's Law (also known as the Gustafson-Barsis
law) states that any sufficiently large problem can
be efficiently parallelized. For problem size n,
split the single-processor work into a serial part
a(n) and a perfectly parallelizable part b(n).
Assuming a(n)/b(n) → 0 as n grows, the scaled
speedup

    S(P) = (a(n) + P·b(n)) / (a(n) + b(n)) → P

so efficiency stays high as long as the problem
grows with the machine.
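A companion sketch, under the same assumptions (serial time a(n), parallel time b(n)), shows the scaled speedup approaching P as the problem grows:

    def scaled_speedup(a, b, p):
        # One processor needs a + p*b for the scaled problem;
        # p processors need only a + b.
        return (a + p * b) / (a + b)

    p = 1024
    for n in (10, 100, 1000):
        a, b = 1.0, float(n)    # parallel work b(n) grows with size n
        print('n = %4d  S(%d) = %7.1f' % (n, p, scaled_speedup(a, b, p)))
    # speedup climbs toward p = 1024 as n increases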
19
Challenges Extending Time: Failure of
Gustafson's Law?
  • The need for up-scaling in time is especially
    critical for classical molecular dynamics
    simulations and ab initio molecular dynamics,
    where an up-scaling of about 1000× will be
    needed.
  • Ab initio molecular dynamics simulations run
    over several months currently cover only 10 to
    100 picoseconds, while the processes of interest
    are on the order of nanoseconds.
  • The step length in ab initio molecular dynamics
    simulation is on the order of 0.1-0.2 fs/step
  • 1 ns of simulation time ≈ 10,000,000 steps
  • At 1 second per step ≈ 115 days of computing time
  • At 10 seconds per step ≈ 3 years
  • At 30 seconds per step ≈ 9 years
  • Classical molecular dynamics simulations run
    over several months can only cover 10 to 100
    nanoseconds, but many of the physical processes
    of interest are on the order of at least a
    microsecond.
  • The step length in molecular dynamics simulation
    is on the order of 1-2 fs/step
  • 1 μs of simulation time ≈ 1,000,000,000 steps
  • At 0.01 seconds per step ≈ 115 days of computing
    time
  • At 0.1 seconds per step ≈ 3 years
  • At 1 second per step ≈ 30 years (the arithmetic
    is spelled out in the sketch below)
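The estimates above are pure arithmetic; this short sketch reproduces them:

    DAY = 86400.0                 # seconds
    YEAR = 365.0 * DAY

    cases = [
        ('AIMD, 1 ns at 0.1 fs/step', 1e-9 / 0.1e-15,  1.0),
        ('AIMD, 1 ns at 0.1 fs/step', 1e-9 / 0.1e-15, 30.0),
        ('MD,   1 us at 1 fs/step',   1e-6 / 1.0e-15,  1.0),
    ]
    for label, steps, sec_per_step in cases:
        wall = steps * sec_per_step   # total wall-clock seconds
        print('%s, %4.1f s/step: %8.1f days = %5.1f years'
              % (label, sec_per_step, wall / DAY, wall / YEAR))
    # 1e7 steps at  1 s/step ->  115.7 days
    # 1e7 steps at 30 s/step ->  ~9.5 years
    # 1e9 steps at  1 s/step -> ~31.7 years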

20
Challenges Need For Communication Tolerant
Algorithms
  • Data motion costs are increasing due to the
    processor-memory gap
  • Coping strategies
  • Trade computation for communication (SC'03)
  • Overlap computation and communication (see the
    sketch below)

(Figure: processor performance improves ~55%/year,
DRAM memory only ~7%/year)
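As an illustration of the overlap strategy, a hedged mpi4py sketch (not NWChem's actual Global Arrays implementation): a ring pipeline hides each transfer behind the computation on the block already in hand.

    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    left, right = (rank - 1) % size, (rank + 1) % size

    block = np.full(1000, float(rank))   # this rank's slice of the data
    incoming = np.empty_like(block)
    total = 0.0
    for step in range(size):
        if step < size - 1:              # start moving the next block now
            reqs = [comm.Irecv(incoming, source=left),
                    comm.Isend(block, dest=right)]
        total += block.sum()             # compute while the network works
        if step < size - 1:
            MPI.Request.Waitall(reqs)    # block only when data is needed
            block, incoming = incoming, block
    print('rank %d: ring sum = %.0f' % (rank, total))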
21
Basic Parallel Computing tasks
  • Break program into tasks
  • A task is a coherent piece of work (such as
    reading data from the keyboard or some part of a
    computation) that is to be done sequentially: it
    cannot be further divided into other tasks and
    thus cannot be parallelized.

22
Basic Parallel Computing Task dependency graph
a = a + 2
b = b + 3
for i = 0..3 do c[i+1] = c[i] + b
b = a + b
b = c[4]
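Reading the stripped operators as additions, the fragment's dependencies can be written down explicitly. This hypothetical Python sketch (not slide code) builds the task graph and measures its longest chain, anticipating the critical-path discussion on slide 24:

    from functools import lru_cache

    # Tasks of the fragment above; edges point to the tasks that
    # must finish first.
    deps = {
        'a=a+2':   [],
        'b=b+3':   [],
        'c1=c0+b': ['b=b+3'],
        'c2=c1+b': ['c1=c0+b'],
        'c3=c2+b': ['c2=c1+b'],
        'c4=c3+b': ['c3=c2+b'],
        'b=a+b':   ['a=a+2', 'b=b+3'],     # reads a and b
        'b=c4':    ['c4=c3+b', 'b=a+b'],   # reads c4, overwrites b
    }

    @lru_cache(maxsize=None)
    def chain(task):
        # Longest dependency chain ending at task (unit task cost).
        return 1 + max((chain(d) for d in deps[task]), default=0)

    print('critical path length:', max(chain(t) for t in deps))
    # prints 6: the c-chain forces six sequential steps regardless
    # of processor count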
23
Basic Parallel Computing Task dependency graph
(2)
(Figure: the task dependency graph with its critical path highlighted)
24
Basic Parallel Computing Critical path
  • The critical path determines the time required to
    execute a program. No matter how many processors
    are used, the time imposed by the critical path
    determines an upper limit to performance.
  • The first step in developing a parallel algorithm
    is to parallelize along the critical path
  • What if that is not enough?

25
Case Study Parallel Strategies for Plane-Wave
Programs
Ng basis functions; Ne molecular orbitals
Critical path parallelization: distribute the
molecular orbitals (one eigenvector per task)
  • Minimal load balancing
Distribute the Brillouin zone
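A small sketch of the two layouts (illustrative index bookkeeping only, not NWChem's actual Global Arrays distribution): the critical-path scheme hands each task whole orbitals, while a 2D scheme also splits each orbital's Ng coefficients.

    import numpy as np

    Ne, Ng, P = 8, 12, 4        # orbitals, plane-wave coefficients, tasks

    # Critical-path scheme: whole orbitals, one eigenvector per task.
    # Parallelism is capped at Ne tasks, however large Ng becomes.
    orbitals_of = {p: list(range(p, Ne, P)) for p in range(P)}

    # 2D alternative: split over orbitals *and* coefficients, so the
    # task count can grow to Ne x (coefficient blocks).
    Prow, Pcol = 2, 2           # P = Prow * Pcol task grid
    blocks_of = {(pr, pc): (np.arange(pr, Ne, Prow),   # my orbitals
                            np.arange(pc, Ng, Pcol))   # my coefficients
                 for pr in range(Prow) for pc in range(Pcol)}

    print(orbitals_of[0])       # e.g. [0, 4] under the 1D scheme
    print(blocks_of[(0, 0)])    # (orbital, coefficient) slices under 2D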
26
Case Study Speeding up plane-wave DFT
Old parallel distribution (scheme 3: critical path)
Each box represents a CPU
27
Case Study Analyze the algorithm
  • Collect timings for important components of the
    existing algorithm
  • For bottlenecks:
  • Remove all-to-all and log(P) global operations
    beyond 500 CPUs
  • Overlap computation and communication (e.g.
    pipelining)
  • Duplicate the data?
  • Redistribute the data?

28
Case Study Try a new strategy
New parallel distribution
Old parallel distribution
29
Case Study Analyze the new algorithm
  • Can trade efficiency in one component for
    efficiency in another component if timings are of
    different orders

30
Case Study Success!
31
Python Example AIMD Simulation
title "propane aimd simulation" start
propane2-db memory 600 mb permanent_dir
./perm scratch_dir ./perm geometry C
1.24480654 0.00000000
-0.25583795 C 0.00000000
0.00000000 0.58345271 H
1.27764005 -0.87801632 -0.90315343 mass
2.0 H 2.15111436
0.00000000 0.34795707 mass 2.0 H
1.27764005 0.87801632 -0.90315343
mass 2.0 H 0.00000000
-0.87115849 1.24301935 mass 2.0 H
0.00000000 0.87115849 1.24301935
mass 2.0 C -1.24480654
0.00000000 -0.25583795 H
-2.15111436 0.00000000 0.34795707 mass
2.0 H -1.27764005
-0.87801632 -0.90315343 mass 2.0 H
-1.27764005 0.87801632 -0.90315343
mass 2.0 end basis library 3-21G end python
from nwmd import surface
nwmd_run('geometry','dft',10.0, 5000) end task
python
32
Python Example Header of nwmd.py
# import nwchem specific routines
from nwchem import *

# other libraries you might want to use
from math import *
from numpy import *
from numpy.linalg import *
from numpy.fft import *
import Gnuplot

# basic rtdb routines to read and write coordinates

def geom_get_coords(name):
    # This routine returns a list with the cartesian coordinates
    # in atomic units for the geometry of given name
    try:
        actualname = rtdb_get(name)
    except NWChemError:
        actualname = name
    coords = rtdb_get('geometry:' + actualname + ':coords')
    return coords

def geom_set_coords(name, coords):
    # This routine, given a list with the cartesian coordinates
    # in atomic units, sets them in the geometry of given name.
    try:
        actualname = rtdb_get(name)
    except NWChemError:
        actualname = name
    coords = list(coords)
    rtdb_put('geometry:' + actualname + ':coords', coords)
33
Python Example basic part of Verlet loop
# do steps-1 verlet steps
for s in range(steps - 1):
    for i in range(nion3):
        rion0[i] = rion1[i]
    for i in range(nion3):
        rion1[i] = rion2[i]
    t += time_step

    # set coordinates and calculate energy (nwchem specific)
    geom_set_coords(geometry_name, rion1)
    (v, fion) = task_gradient(theory)

    # verlet step
    for i in range(nion3):
        rion2[i] = 2.0*rion1[i] - rion0[i] - dtii[i]*fion[i]

    # calculate ke
    vion1 = []
    for i in range(nion3):
        vion1.append(h*(rion2[i] - rion0[i]))
    ke = 0.0
    for i in range(nion3):
        ke += 0.5*massii[i]*vion1[i]*vion1[i]
    e = v + ke
    print ' '
    print '@@ %5d %9.1f %19.10e %19.10e %14.5e' % (s+2, t, e, v, ke)
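The loop relies on arrays prepared earlier in nwmd.py; the slide does not show that setup, but a hedged reconstruction consistent with the names used above (time_step, dtii, massii, h, and the masses rtdb key are assumptions) might look like:

    # Hypothetical setup for the Verlet loop; values are illustrative.
    time_step = 5.0                          # step length in atomic units
    rion1 = geom_get_coords(geometry_name)   # r(t)
    rion0 = list(rion1)                      # r(t - dt), cold start
    rion2 = list(rion1)                      # r(t + dt), filled by the loop
    nion3 = len(rion1)                       # 3 * number of ions

    # per-coordinate masses (assumed rtdb key) and Verlet constants
    masses = rtdb_get('geometry:' + geometry_name + ':masses')
    massii = [masses[i // 3] for i in range(nion3)]
    dtii = [time_step**2 / m for m in massii]  # dt^2/m_i; fion is a gradient
    h = 1.0 / (2.0 * time_step)                # central-difference velocity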