Title: New Algorithms to Make Quantum Monte Carlo More Efficient
1. New Algorithms to Make Quantum Monte Carlo More Efficient
- Daniel R. Fisher
- William A. Goddard III
Materials and Process Simulation Center (MSC), California Institute of Technology
2. Dirac's Thoughts on Chemistry
"The underlying physical laws necessary for the
mathematical theory of a large part of physics
and the whole of chemistry are thus completely
known, and the difficulty is only that the
application of these laws leads to equations
much too complicated to be soluble." - P. A. M.
Dirac, Proc. Roy. Soc. (London) A 123, 714 (1929)
3. Computational Quantum Mechanics Algorithms
Adapted from Morokuma et al., IBM J. Res. Dev., Vol. 45, No. 3/4, May/July 2001, pp. 367-395.
4. Hartree-Fock QM Calculations
Replace explicit electron-electron interactions with average electron-electron interactions to get N one-particle equations.
This works, BUT it does not deal correctly with electron-electron correlation.
5. When Is Electron Correlation Important?
- Transition Metal Oxides
- High-Temperature Copper Oxide Superconductors
- Heavy Fermion Metals
- Usually Rare Earths or Actinides
- Organic Charge Transfer Compounds
- One and Two Dimensional Electron Gas Systems
- Nano-scale Wires
- Positron Chemistry
- Defect Detection and Analysis
- Muon/Exotic Particle Physics
- Catalyzed Fusion
See the 21 April 2000 issue of Science for more examples.
6. QMC Basic Algorithm
Task conceived: an accurate result at small expense.
- Molecule: run a traditional QC package to get the trial wavefunction ΨT (HF, DFT, or MCSCF) and a starting Jastrow.
- Construct the initial walker(s).
- VMC equilibrate.
- VMC sampling; change the Jastrow to optimize it, and repeat until it is optimized well enough.
- DMC equilibrate.
- DMC sampling.
- Very accurate result.
7. Metropolis Algorithm and QMC
- Reformulate the energy expectation value as an integral.
- Use Monte Carlo integration for this 3N-dimensional integral.
- The Metropolis algorithm produces electron configurations distributed according to Ψ².
- The system's energy expectation value is the average of the local energies.
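The sampling loop outlined on this slide can be sketched in a few lines. This is a minimal 1D toy, not the real 3N-dimensional electron sampler: the Gaussian trial wavefunction, step size, and seed below are assumptions for illustration.

```python
import math
import random

def metropolis_vmc(psi, x0, n_steps, step_size=0.5, rng=random.Random(0)):
    """Sample points x with probability proportional to psi(x)**2.

    `psi` is any callable returning a real amplitude.  A symmetric
    proposal T(A->B) = T(B->A) is used, so the acceptance probability
    reduces to min(1, psi(B)**2 / psi(A)**2).
    """
    x = x0
    p_x = psi(x) ** 2
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step_size, step_size)   # symmetric move
        p_new = psi(x_new) ** 2
        if rng.random() < p_new / p_x:                   # Metropolis accept
            x, p_x = x_new, p_new
        samples.append(x)                                # rejected moves recount x
    return samples

# Toy trial wavefunction: 1D harmonic-oscillator ground state, psi ~ exp(-x^2/2)
psi = lambda x: math.exp(-x * x / 2.0)
samples = metropolis_vmc(psi, 0.0, 20000)
# For |psi|^2 ~ exp(-x^2), the expectation <x^2> is 0.5
mean_x2 = sum(s * s for s in samples) / len(samples)
```

Note that rejected moves append the old point again; that recounting is exactly what makes sequential energies serially correlated, the issue addressed later in the talk.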
8. VMC Wavefunction for a Molecule
What a typical ΨHe might look like.
9. Diffusion QMC
Change variables ...
Key point: no matter what wavefunction we start with, it decays to the true ground state as we take steps in imaginary time, τ!
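In standard notation, the key point can be written out explicitly: propagating in imaginary time filters out every excited state.

```latex
% Imaginary-time Schroedinger equation with reference energy E_T:
-\frac{\partial \Psi}{\partial \tau} = (\hat{H} - E_T)\,\Psi
% Expanding the starting wavefunction in the exact eigenstates \phi_n:
\Psi(\tau) = \sum_n c_n\, \phi_n\, e^{-(E_n - E_T)\tau}
% Every component with E_n > E_0 decays exponentially, so for
% E_T \approx E_0 only the ground state \phi_0 survives as \tau \to \infty.
```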
10. New Algorithms and Areas Affected
The same flowchart, with the steps affected by the new algorithms:
- Task conceived; molecule: run a traditional QC package to get ΨT (HF, DFT, or MCSCF) and a starting Jastrow.
- Construct the initial walker(s).
- VMC equilibrate.
- VMC sampling; change the Jastrow to optimize it until it is optimized well enough.
- DMC equilibrate.
- DMC sampling, analyzed with the new decorrelation algorithm.
- Very accurate result.
11. Problem with Metropolis Sampling
- From introductory statistics, the variance formula for uncorrelated data gives the uncertainty in the energy.
- The probability of a move being accepted is related to the transition probability T(A→B).
- This formula does not work in this case, because the energies calculated at sequential points are serially correlated: after a rejected move the old configuration is counted again, and an accepted move stays close to the previous point, so the energies at these points are correlated!
12. Flyvbjerg-Petersen Decorrelation Algorithm
Average blocks of the original data into new data elements.
If the blocks are sufficiently large, the blocked data points are uncorrelated and the standard variance equation for uncorrelated data can be used.
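The blocking transformation is easy to sketch: at each level, record the naive standard error, then average adjacent pairs and repeat. The AR(1) series in the usage example is an assumption standing in for serially correlated QMC local energies.

```python
import math
import random

def blocked_errors(data):
    """Flyvbjerg-Petersen blocking: report the naive standard error at
    each block level.  For correlated data the estimate grows with block
    size until it plateaus at the true statistical error."""
    data = list(data)
    errors = []
    while len(data) >= 2:
        n = len(data)
        mean = sum(data) / n
        var = sum((x - mean) ** 2 for x in data) / (n - 1)
        errors.append(math.sqrt(var / n))            # naive estimate at this level
        data = [(data[i] + data[i + 1]) / 2.0        # block adjacent pairs
                for i in range(0, n - n % 2, 2)]
    return errors

# Usage: serially correlated AR(1) data, x_{t+1} = 0.9 x_t + noise
rng = random.Random(42)
x, series = 0.0, []
for _ in range(4096):
    x = 0.9 * x + rng.gauss(0.0, 1.0)
    series.append(x)
errs = blocked_errors(series)
# errs[0] badly underestimates the error; later entries plateau near the truth
```

This is the whole diagnostic: plot `errs` against block level and read off the plateau.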
13. VMC Particle-in-a-Box Standard Deviation Calculation
[Plots of the estimated standard deviation for uncorrelated data and for correlated data.]
14. New Statistical Analysis Algorithm
- Flyvbjerg-Petersen Algorithm
  - One processor must do all of the work.
  - Must communicate O(N) data when used on a parallel computer.
  - Must store O(N) data.
  - Must be used at the end of the calculation (can't check convergence on-the-fly).
- Dynamic Distributable Decorrelation Algorithm (DDDA)
  - Feldmann, M. T., D. R. Kent IV, R. P. Muller, and W. A. Goddard III, "Efficient Algorithm for On-the-fly Error Analysis of Local or Distributed Serially-Correlated Data," J. Chem. Phys. (submitted).
  - Perfectly parallel, even on inhomogeneous computers.
  - Must communicate only O(log2 N) data when used on a parallel computer.
  - Must store only O(log2 N) data.
  - Can provide on-the-fly results (convergence-based termination).
For N = 10^7 to 10^12, log2 N is only 23 to 40.
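The details of the published algorithm are in the cited paper; the sketch below only illustrates the storage idea as I understand it: keep running statistics plus at most one pending half-block per blocking level, so memory is O(log2 N) instead of O(N), and the per-level statistics can be updated (or merged across processors) at any time.

```python
import random

class DDDA:
    """On-the-fly blocking sketch: each block level stores running
    (count, sum, sum-of-squares) plus at most one pending sample,
    so total storage is O(log2 N) for N data points."""

    def __init__(self):
        self.levels = []   # one dict of running statistics per block level

    def add(self, x, level=0):
        if level == len(self.levels):
            self.levels.append({"n": 0, "s": 0.0, "s2": 0.0, "pending": None})
        st = self.levels[level]
        st["n"] += 1
        st["s"] += x
        st["s2"] += x * x
        if st["pending"] is None:
            st["pending"] = x                      # wait for a partner sample
        else:
            y = (st["pending"] + x) / 2.0          # form one blocked value
            st["pending"] = None
            self.add(y, level + 1)                 # push it to the next level

    def errors(self):
        """Naive standard-error estimate at each block level (cf. blocking)."""
        out = []
        for st in self.levels:
            n = st["n"]
            if n < 2:
                break
            mean = st["s"] / n
            var = (st["s2"] - n * mean * mean) / (n - 1)
            out.append((var / n) ** 0.5)
        return out

# Usage: stream in 1024 uncorrelated samples, then read the error estimates
rng = random.Random(7)
acc = DDDA()
for _ in range(1024):
    acc.add(rng.random())
level_errors = acc.errors()
```

For uncorrelated input all levels agree; for correlated input the estimates grow to a plateau, and `errors()` can be called at any point during the run to test convergence.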
15. New Manager-Worker Parallelization Algorithm
- Current Algorithm: Pure Iterative
  - All processors do an equal amount of work.
  - If one processor finishes first, it must wait on all of the others.
  - Intended only for homogeneous machines (all processors the same).
  - Convergence-based termination is not possible.
- New Algorithm: Manager-Worker
  - Feldmann, M. T., D. R. Kent IV, R. P. Muller, and W. A. Goddard III, "Manager-Worker-Based Model for Massively Parallel and Perfectly Load-Balanced Quantum Monte Carlo," J. Comp. Chem. (submitted).
  - All processors do as much work as they can.
  - All processors complete at the same time.
  - Intended for either homogeneous or inhomogeneous machines (processors can be different).
  - Convergence-based termination is possible with DDDA.
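A thread-based toy can illustrate the scheme (the real implementation is MPI across processors; `sample_fn`, the batch size, and the stopping rule here are assumptions). Workers generate batches at their own pace; the manager accumulates statistics and stops everyone once the standard error is small enough.

```python
import queue
import threading

def manager_worker(sample_fn, n_workers, target_error, batch=200):
    """Manager-worker sketch: fast workers simply deliver more batches,
    so the load balances itself, and the manager terminates the run
    based on convergence rather than a fixed iteration count."""
    results = queue.Queue()
    stop = threading.Event()

    def worker():
        while not stop.is_set():                     # work until told to stop
            results.put([sample_fn() for _ in range(batch)])

    for _ in range(n_workers):
        threading.Thread(target=worker, daemon=True).start()

    n, s, s2 = 0, 0.0, 0.0
    while True:
        for x in results.get():                      # manager accumulates stats
            n += 1
            s += x
            s2 += x * x
        mean = s / n
        err = (((s2 - n * mean * mean) / (n - 1)) / n) ** 0.5
        if err < target_error:                       # convergence-based stop
            stop.set()
            return mean, err, n

# Usage: random.random stands in for one Monte Carlo sample
import random
mean, err, n = manager_worker(random.random, n_workers=4, target_error=0.003)
```

Because workers pull work instead of being assigned equal shares, a 866 MHz node simply contributes more batches than a 200 MHz one, which is the load-balancing claim on this slide.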
16. Parallel Algorithm Performance: Heterogeneous Computer
8 total processors: a mixture of 200 MHz Pentium II and 866 MHz Pentium III.
17. Parallel Algorithm Performance: Heterogeneous Computer
Performance depends on what is being measured; the scaling is not quite linear.
[Plots: LLNL Blue Pacific and LANL Nirvana; linear to 2048 processors.]
18. Are Transferable Parameter Sets Possible?
Can parameter sets be found that are transferable between different systems? This is a similar idea to contracted Gaussian basis sets (e.g., 6-31G).
19. Correlation Energy Recovered by a Generic Jastrow Function
All of these hydrocarbons have nearly the same optimal Jastrow b parameter!
20. Generic Jastrow Performance
[DMC results for CH4 and C2H2.]
21. Why Isn't QMC Really Linearly Scaling?
- Each processor needs to perform statistically independent calculations.
- Each processor must begin calculating from independent, statistically significant points in configuration space.
- A separate Metropolis calculation must be equilibrated for each independent point in configuration space.
[Plot: efficiency vs. number of processors on LANL Nirvana.]
22. Current Work: Initialization Solution
- Make an algorithm that reduces the initialization time:
  - A better guess for the placement of initial walkers in the Monte Carlo random walk.
  - Reduce the number of steps needed to equilibrate each walker.
  - Make the initialization algorithm itself parallelizable.
- There are two criteria by which we can judge the quality of a configuration:
  - The sum of one-electron probability densities from the SCF calculation.
  - The distance of the electrons from each other.
23. Constructing Initial Configurations
- Orbitals are linear combinations of primitive Gaussians.
- The square of the orbital is its probability density. We can change to spherical coordinates and separate the dependence on r, θ, and φ.
24. Probability Distribution Functions
- By integrating over one variable at a time, we can get the marginal probability distribution function in each direction.
- These are the probability distribution functions for the 2px orbital of Ne.
- Now we can generate uniform random numbers between 0 and 1 to distribute electrons with respect to the probability density of this orbital.
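Mapping uniform deviates onto an orbital's marginal density is inverse-transform sampling: tabulate the CDF on a grid and invert it. In the sketch below, a hydrogen-like 2p radial function with an assumed exponent zeta = 1 stands in for the actual Ne orbital, and the grid extent is an assumption.

```python
import bisect
import math
import random

def inverse_cdf_sampler(pdf, grid):
    """Tabulate the CDF of `pdf` on `grid` (trapezoid rule) and return a
    sampler mapping uniform deviates in [0, 1) to the distribution."""
    cdf = [0.0]
    for i in range(1, len(grid)):
        dx = grid[i] - grid[i - 1]
        cdf.append(cdf[-1] + 0.5 * (pdf(grid[i]) + pdf(grid[i - 1])) * dx)
    total = cdf[-1]
    cdf = [c / total for c in cdf]                  # normalize to a true CDF

    def sample(rng=random):
        u = rng.random()
        i = bisect.bisect_left(cdf, u)              # invert: find CDF(r) >= u
        return grid[min(i, len(grid) - 1)]
    return sample

# Hydrogen-like 2p radial density (an assumption standing in for Ne):
# |R(r)|^2 r^2 with R ~ r exp(-zeta r)  =>  p(r) ~ r^4 exp(-2 zeta r)
zeta = 1.0
p_r = lambda r: r ** 4 * math.exp(-2.0 * zeta * r)
grid = [i * 0.01 for i in range(2001)]              # r in [0, 20] bohr
draw_r = inverse_cdf_sampler(p_r, grid)
```

The same construction applies to the θ and φ marginals once the dependence has been separated.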
25. Deciding Which Configurations to Keep
- We can invert each orbital of the molecule and distribute electrons in them according to their occupancy. This ensures that our initial configurations are in regions of high one-electron probability density.
- This algorithm, however, will not prevent electrons from being placed near each other.
- We plan to develop a heuristic score function that takes a 3N-dimensional walker as input and returns a real number, such that the larger the result, the fewer steps the walker will need to take to reach an equilibrium region of configuration space.
26. The Score Function
- The score function will depend on the one-electron density and on the distances between pairs of electrons.
- Our goal is to determine a functional form of the score function G that is transferable between different molecular systems.
27. Using the Score Function
- Once a reliable score function is developed, it will be possible to construct high-quality walkers:
  - For an N-electron molecule, we could generate M > N electron positions and then find which combination gives the best walker.
  - Alternatively, a large set of N-electron configurations could be generated, and the most favorable could be chosen.
  - The number of equilibration steps would be determined by the result of the score function.
- Open question: could the score function be used to dynamically determine when a walker is equilibrated?
If this is possible, we could terminate the initialization of each walker individually. This would mean we could start gathering data from a walker as soon as it is equilibrated, rather than waiting for the entire ensemble.
28. Conclusion
- For parallel calculations, the efficiency drops off drastically as the initialization time and the number of processors increase.
- This new class of Metropolis initialization algorithms will greatly decrease the time required to initialize a QMC calculation.
- The efficient use of parallel processors will allow highly accurate quantum mechanical calculations of larger and more interesting chemical systems.
29. Acknowledgements
- General
- Mike Feldmann
- Chip Kent
- Rick Muller
- William A. Goddard III
- Goddard Group
- CACR staff
- Funding
- ASCI (Caltech-ASCI-MP)