Title: Automatic Construction of Ab Initio Potential Energy Surfaces
1Automatic Construction of Ab Initio Potential
Energy Surfaces
Interpolative Moving Least Squares (IMLS) Fitting
of Ab Initio Data for Constructing Global
Potential Energy Surfaces for Spectroscopy and
Dynamics
Donald L. Thompson University of Missouri
Columbia Richard Dawes, Al Wagner, Michael
Minkoff Fourth International meeting
"Mathematical Methods for Ab Initio Quantum
Chemistry" 13-14 November 2008Laboratoire J.A.
DieudonnéCNRS et Université de Nice -
Sophia-Antipolis
2Potential Energy Surfaces
? Basis for quantum and classical dynamics,
spectroscopy ? Electronic structure calculations
can provide accurate energies (even gradients
and Hessians) but at a high cost (Highly
accurate energy calculations for a single
geometry can take hours or days)
- We want to
- Generate accurate global PESs fit to a minimum
number - (100s 1000s) of ab initio points
- Make ab initio dynamics feasible for the highest
levels of - quantum chemistry methods (for which gradients
may - not be directly available)
- As blackbox as possible
3Requirements
? Minimize number of ab initio points ? Minimal
human effort and cost of fitting ? Low-cost
accurate evaluations
Our approach Interpolating Moving Least Squares
(IMLS) ? Much cheaper than high-level quantum
chemistry ? Doesnt need gradients, but can use
gradients and Hessians ? Can use high-degree
polynomials
- How to make efficient and practical
- Optimally place minimum number of points
- ? Weight functions
- ? Reuse fitting coefficients (store local
expansions) - Use zeroth-order PES and fit difference
- ? Other techniques
4Least-Squares Fitting
Usual applications are for data with
statistical errors, but trends that follow known
functional forms.
- Ab initio energies do not have random errors
- A PES does not have a precisely known functional
form ? - the energy points lie on a surface
- of unknown shape
- Thus, fit with a general basis
- set (e.g., polynomials)
- Basis functions the true
- function provides a more compact
- representation
Fitting ab initio energies
5Weighted least squares equations
W1 gives standard least squares We use
standard routines
BTW(z) B a(z) BTW(z)V
6Weighted vs. standard least squares
Standard, first degree fit to the 5 points
First Degree
IMLS, first degree
IMLS fits perfectly at each point
Second Degree
Standard, second degree
IMLS, second degree
7Optimum Point Placement
- ? We want to do the fewest number of ab initio
calculations - ? A non-uniform distribution of points is best
- ? We can use the fact that IMLS fits perfectly at
each point - to determine where to place points for the most
accurate - fit using the fewest possible points
- Use fits of different degree IMLS fits
Illustrate for 1-D Morse potential 5
seed points
8Automatic Point Placement 1-D Illustration
Start with 5 uniformly placed points Fit with 2nd
3rd degree IMLS Add new point where they differ
the most
Squared difference indicates where new points
are needed
9Automatic Point Placement
5 initial points
1 new point added
2 new points added
3 new points added
10Density adaptive weight function
Automatic point placement will generate a
nonuniform density of points. Thus, we use
a flexible, density-dependent weight function
11High Dimensional Model Representation(HDMR)
basis set
- Can represent high dimensional function through
an expansion of lower order terms - Can also use full dimensional expansion but
restrict the order of terms differently - Evaluation scales as NM2. HDMR greatly reduces
M. - This also reduces the number of points required.
12Accurate PESs from Low-Density Data
Initial testing for 3-D HCN-HNC We used the
global PES fit to ab initio points by van Mourik
et al. as a source for (cheap) points. ? Saves
time obtaining points ? Allows extensive error
analyses We fit using (12,9,7) HDMR basis
1-coordinate term truncated at 12th degree
2-coordinate term truncated at 9th degree
3-coordinate term truncated at 7th degree 180
basis functions
T. van Mourik, G. J. Harris, O. L. Polyansky,
J. Tennyson, A. G. Császár, and P. J.
Knowles, J. Chem. Phys. 115, 3706 (2001).
13Error as function of automatically selected data
points
Seed points Start with 4, 6, 8 for r, R cos?
Energy cutoff 100 kcal/mol
Data Points van Mourik et al. PES
3-D HCNHNC
Automatic surface generation Using (12,9,7)
(11,8,6) bases
Successive Order Solid True Error Open
The difference in successive orders follows
closely the true error. Thus, adding points
based on difference criteria results in converged
true error
RMS
Mean
14Convergence rate dependence on basis set HCN
Accuracy follows Farwigs formula for power-law
convergence ? Linear on log-log plot with slope
(n1)/D, where n degree of basis
Obeys power law over 3 orders of magnitude
RMS Error (kcal/mol)
8th degree HDMR (12,9,7) both have 180 fcts.,
but HDMR converges faster
Number of Points
R. Farwig, J. Comput. Appl. Math. 16, 79
(1986) Math. Comput. 46, 577 (1986).
15Cutting cost Local IMLS
- Cost of evaluation scales as NM2 for standard
IMLS - (N ab initio points, M basis functions)
- High-degree standard IMLS is too costly to use
directly, - thus we use local-IMLS Local approximants
(polynomials) of the potential near data points
are calculated using IMLS (expensive) the
interpolated value is taken to be a weighted sum
of them
- In standard IMLS they are recomputed at each
evaluation point (very accurate, but too costly) - The coefficients are generally slowly varying
- In the L-IMLS approach coefficients are computed
stored at a relatively small number of points - Evaluations are low cost weighted interpolations
between stored points
16Overcoming scaling problem for automatic point
selection
- We get high accuracy low cost with high-degree
L-IMLS But must find optimum place to add each
ab initio point - Trivial in 1-D
- as shown
- earlier
- With L-IMLS the functions whose maxima we seek
are continuously globally defined as are their
gradients - So, define negative of the squared-difference
surface - We can use efficient minimization schemes such as
conjugate gradient to find local minima - Difference between successive orders of IMLS
- Can also use variance of weighted contributions
to interpolated value with local IMLS - Grid or random search scales very poorly with
dimension
17Method schematic
18Automated PES fitting in 3-D HCN-HNC
Used 30 random starting points for minimizations
Basis set not well supported
Spectroscopic accuracy ? To less than 1
cm-1 within 792 pts with Hessians or 1000 pts
with gradients
223
828
318
For 0.1 kcal/mol
cm-1
HDMR (12,9,7)
But we can do even better Discussed below
The PES is fit up to 100 kcal/mol
19Dynamic Basis Procedure
Avoids including points in the seed data that
are not optimally located Start with very small
initial grid of points use automatic surface
generation with a small basis, successively
increasing the basis as points are added
20Automated Dynamic Basis 6-D (HOOH)
Fit to analylic H2O2 PES
Dynamic basis
Fit up to 100 kcal/mol
A min. of 591 pts. would be needed if we
started with the (10,7,5,4) basis. We started
with 108. Convergence also much faster
164
114
754
RMS error based on randomly selected test points
B. Kuhn et al. J. Chem. Phys. 111, 2565 (1999).
21Spectroscopic Accuracy 9-D (CH4)
Test Case Schwenke Partridge PES a least
squares fit to 8000 CCSD(T)/cc-pVTZ ab initio
data over the range 0-26,000 cm-1 ? We fit the
range 0-20,000 cm-1 (57.2 kcal/mol). ? Energies
gradients only (Hessians data not cost
effective as shown earlier) ? Bond distances ?
Exploited permutation symmetry ? Dynamic basis
procedure
D. W. Schwenke H. Partridge, Spectrochim Acta
Part A 57, 887 (2001)
22Automated PES fitting in 9-D (CH4)
With 1552 pts. the E only RMS error is 0.41
kcal/mol including gradients brings it down to
0.32 kcal/mol. The RMS error for the
Schwenke-Partridge PES (based on 8000 pts)
is 0.35 kcal/mol The IMLS fitting is
essentially automatic, little human effort, and
no prior knowledge of the topology
9,6,4,4
9,6,4,4
23A General 3-Atom IMLS-QC Code
- Input file
- Accuracy target
- Energy range
- Basis set
- Number of seed points and coordinate ranges
- Type of coordinates, Jacobi, valence, bond
distances - Generates input files for Gaussian, MolPro, and
Aces II - Energies only or energies gradients
24A New PES for the Methylene Radical
We have generated a spectroscopically accurate
PES for CH2 for energies up to 20,000 cm-1 (216
vibrational states). CASSCF calculations in
valence coordinates. Vibrational levels were
computed using a discrete variable
representation (DVR) method. DVR typically
requires 10s of thousands of ab initio points.
For a benchmark we performed a DVR calculation
using ab initio calculations at all 22,400 DVR
points.
25Singlet Methylene fit to energies and gradients
True and estimated errors are in near perfect
agreement
Black estimated errors
Red true errors
CASSCF calculation in valence coordinates. Energy
range of 20000 cm-1. Estimated error vs. true
error (sets of 500 random ab initio calcs). True
error (RMS and mean) are sub-wavenumber using 355
points.
26Singlet Methylene Vibrational Levels Discrete
Variable Representation (DVR) Calculation
Absolute errors for 216 vibrational levels (below
20,000 cm-1). Variational vibrational
calculations were performed using DVR and a PES
fitted with a mean estimated error of 2.0
cm-1 Exact levels were benchmarked by a DVR
calculation using ab initio calculations at all
22,400 DVR points.
27Singlet Methylene Vibrational Levels Discrete
Variable Representation (DVR) Calculation
Plot of absolute errors for 216 vibrational
levels (below 20,000 cm-1). Variational
vibrational calculations were performed using a
DVR and fitted PESs with mean estimated errors
of 0.5 cm-1 Exact levels were benchmarked by a
DVR calculation using ab initio calculations at
all 22,400 DVR points.
28Singlet Methylene Vibrational Levels Comparisons
2.0 cm-1 mean estimated error
0.5 cm-1 mean estimated error
29Singlet Methylene Vibrational Levels Discrete
Variable Representation (DVR) Calculation
Absolute errors for 216 vibrational levels (below
20,000 cm-1). Variational vibrational
calculations were performed using a DVR and PES
fitted with mean estimated errors of 0.33
cm-1 Exact levels were benchmarked by a DVR
calculation using ab initio calculations at all
22,400 DVR points. Mean and maximum errors for
levels computed with this PES are 0.10 and 0.41
cm-1.
30Singlet Methylene Vibrational Levels Comparisons
2.0 cm-1 mean estimated error
0.33 cm-1 mean estimated error
31IMLS Classical TrajectoriesPreliminary Efforts
Two difference approaches IMLS-accelerate
direct dynamics Dynamics Driven Fitting
(both under development) In both cases IMLS
intercepts ab initio PES calls the electronic
structure code is called only if necessary (based
on error estimate)
32Accelerated Direct DynamicsTest case HONO
cis-trans isomerization
- Trajectories were initiated with 8 quanta in the
HON bend to cause rapid IVR then isomerization
(Want rapid exploration of configuration space) - ? Integration stepsize 0.05 fs
- ? Trajectories were stopped once they spent 3
times the period of the torsion mode in the range
of the trans torsion angle or violated energy
conservation criterion - Used HF/cc-pVDZ want fast ab initio
calculation to test the method - IMLS intercepts direct dynamics ab initio PES
calls. Electronic structure code is called only
if necessary (based on error estimate) Data
collection trajectories are moved back in time if
the rare event of adding new ab initio data
occurs
33Accelerated direct dynamics with IMLS HONO
(10,7,5,5) basis of 651 functions Values and
gradients used The fit began after 25 ab initio
"seed" points were generated
10-2
10-5
Speedup depends on error tolerance 7.6
evaluations per ab initio call for 10-5 error
tolerance 76.3 evaluations per ab initio
call for 10-2 error tolerance
Factor of 20 speed up with 0.06 drift in total
energy
34Dynamics Driven Fitting HONO cis-trans
isomerization rate
A series of sets of trajectories, with various
energy conservation limits, are used to explore
configuration space.
35Accelerated direct dynamics HONO cis-trans
isomerization rate
Results for PESs fit with 8 different maximum
error tolerances
36Concluding Comments
- IMLS allows automated generation of PESs for
various applications - Spectroscopy
- Dynamics
- Flexible fits to energies, energies and
gradients, or higher derivatives - Interfaced to general classical trajectory code
GenDyn - Interfaced to electronic structure codes
- Gaussian, Molpro, Aces II
- Robust, efficient, practical methods that assures
fidelity to - the ab initio data