Title: Parallel Simulated Annealing for EAM potential fitting
1. Parallel Simulated Annealing for EAM potential fitting
By Tao Xu, CS6230 Final Presentation, 05/05/05
2. Outline
- Introduction to EAM potential
- The objective/cost function
- Simulated Annealing Algorithm
- Synchronous Parallel Simulated Annealing
- Asynchronous Parallel Simulated Annealing
- Conclusions and References
3. The EAM potential
The EAM potential is built from a pair potential φ(r), an atomic
density function ρ(r), and an embedding function U(n). Let α denote
the entire set of L parameters α1, α2, ..., αL used to characterize
these functions. The goal is to determine the optimal set α by
matching the forces from first-principles calculations with those
predicted by the classical potential.
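For orientation, the standard embedded-atom total energy from Daw and Baskes (reference 1) can be written as below, with φ the pair potential, ρ the atomic density function, and U the embedding function. The atom indices i, j and the distance r_ij are notation assumed here, not taken from the slide.

```latex
E_{\text{tot}} = \frac{1}{2} \sum_{i \neq j} \phi(r_{ij}) + \sum_{i} U(n_i),
\qquad
n_i = \sum_{j \neq i} \rho(r_{ij})
```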
4. The objective function
The key to the force-matching method is to minimize the objective
function Z(α) = Z_F(α) + Z_C(α), where:
- M is the number of sets of atomic configurations (e.g. structures).
- N_k is the number of atoms in configuration k.
- F_ki(α) is the force on the i-th atom in set k obtained with parameter set α.
- F_ki^0 is the reference force from first principles.
- Z_C contains contributions from N_c additional constraints.
- A_r(α) are physical quantities as calculated from the potential.
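A minimal sketch of evaluating this cost, assuming the usual force-matching form: Z_F is the squared force error summed over all atoms and configurations and divided by 3 times the total atom count, and Z_C is a weighted sum of squared deviations of each A_r(α) from a target value. The weights, the target values, and the data layout are assumptions for illustration; the slide does not give them.

```python
def force_matching_objective(predicted, reference, constraints=()):
    """Return Z = Z_F + Z_C under the assumptions stated above.

    predicted, reference: one list per configuration k, each holding a
        3-component force tuple per atom (F_ki(alpha) and F_ki^0).
    constraints: iterable of (weight, value, target) triples; this layout
        is hypothetical, chosen only for the example.
    """
    total_atoms = sum(len(config) for config in reference)
    squared_error = 0.0
    for config_pred, config_ref in zip(predicted, reference):
        for force, force_ref in zip(config_pred, config_ref):
            squared_error += sum((a - b) ** 2 for a, b in zip(force, force_ref))
    z_f = squared_error / (3 * total_atoms)
    z_c = sum(w * (value - target) ** 2 for w, value, target in constraints)
    return z_f + z_c
```

In a fitting run, `predicted` would be recomputed from the potential for every trial parameter set α, while `reference` stays fixed.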
5. Simulated Annealing Algorithm
Flowchart steps:
- Initial configuration α
- Create new random configuration α (using the random number generator)
- Evaluate the cost function
- Acceptance probability: Yes, accept new config; No, keep the current one
- Terminate search? If not, adjust temperature and repeat; otherwise END
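The loop in the flowchart can be sketched as follows. The Metropolis acceptance rule exp(-ΔZ/T), the geometric cooling schedule, the uniform random moves, and all parameter names and defaults are assumptions made for illustration; the slides do not specify them.

```python
import math
import random


def simulated_annealing(cost, start, step=0.5, t_start=1.0, t_end=1e-3,
                        cooling=0.9, moves_per_temp=100, seed=0):
    """Serial simulated annealing over a list of real-valued parameters."""
    rng = random.Random(seed)
    current, current_cost = list(start), cost(start)
    best, best_cost = list(current), current_cost
    t = t_start
    while t > t_end:                                  # terminate search?
        for _ in range(moves_per_temp):
            # Create a new random configuration near the current one.
            trial = [x + rng.uniform(-step, step) for x in current]
            trial_cost = cost(trial)                  # evaluate the cost function
            delta = trial_cost - current_cost
            # Always accept downhill moves; accept uphill ones with exp(-delta/t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = trial, trial_cost
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
        t *= cooling                                  # adjust temperature
    return best, best_cost
```

For example, `simulated_annealing(lambda a: sum(x * x for x in a), [3.0, -2.0])` searches for the minimum of a simple quadratic bowl.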
6. A typical SA run
7. Details at the low-temperature limit
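8. Synchronous Parallel Simulated Annealing
Diagram steps:
- ROOT 0 sends the initial configuration to the workers P1, P2, P3, P4
- ROOT 0 collects data from the workers
- ROOT 0 sends the best configuration back to the workers
- . . . .
- ROOT 0 collects the final configuration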
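The scatter, collect, and rebroadcast cycle can be sketched in one process, with the workers simulated by a plain loop instead of real message passing. The function names, the Metropolis sweep, the fixed number of moves per temperature, and the cooling schedule are all assumptions for illustration.

```python
import math
import random


def anneal_at_temperature(cost, config, t, rng, moves=50, step=0.5):
    """One worker's Metropolis sweep at a fixed temperature t."""
    current, current_cost = list(config), cost(config)
    for _ in range(moves):
        trial = [x + rng.uniform(-step, step) for x in current]
        trial_cost = cost(trial)
        delta = trial_cost - current_cost
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = trial, trial_cost
    return current, current_cost


def synchronous_psa(cost, start, n_workers=4, t_start=1.0, t_end=1e-3,
                    cooling=0.9, seed=0):
    """Root loop: every worker anneals the same configuration at each temperature."""
    rngs = [random.Random(seed + w) for w in range(n_workers)]
    shared, shared_cost = list(start), cost(start)
    t = t_start
    while t > t_end:
        # "Send" the shared configuration and "collect" each worker's result.
        results = [anneal_at_temperature(cost, shared, t, rng) for rng in rngs]
        # Root keeps the lowest-cost result and rebroadcasts it.
        shared, shared_cost = min(results, key=lambda r: r[1])
        t *= cooling
    return shared, shared_cost
```

In a real parallel run the list comprehension would be replaced by a broadcast and a gather across processes; the selection step on the root stays the same.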
9. How to make a selection from the workers' results
- There are three ways:
  - Minimum: the root chooses the configuration with the lowest cost function.
  - Random: the root chooses one of the configurations at random.
  - Metropolis-like: the root chooses the minimum sometimes but accepts others with some nonzero probability.
- In the current implementation, the minimum one is chosen at each temperature.
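The three selection rules can be sketched as small functions over a list of (configuration, cost) pairs. The Metropolis-like rule shown is only one possible reading: it draws a random candidate and accepts it over the minimum with probability exp(-Δ/T). The slide does not state the exact rule, so that formula and all names here are assumptions.

```python
import math
import random


def select_minimum(results):
    """Root picks the (config, cost) pair with the lowest cost."""
    return min(results, key=lambda r: r[1])


def select_random(results, rng):
    """Root picks one worker's result uniformly at random."""
    return rng.choice(results)


def select_metropolis_like(results, t, rng):
    """Root usually keeps the minimum but may accept a worse candidate.

    Assumed rule: a random candidate replaces the minimum with
    probability exp(-(candidate_cost - min_cost) / t).
    """
    best = select_minimum(results)
    candidate = rng.choice(results)
    delta = candidate[1] - best[1]
    if rng.random() < math.exp(-delta / t):
        return candidate
    return best
```

At low temperature the Metropolis-like rule behaves like the minimum rule, since worse candidates are almost never accepted.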
10. Speedup for Synchronous PSA
11. Advantages and disadvantages of synchronous PSA
- The advantages of synchronous PSA
  - Attains a near-linear speedup. With n processors, the program searches n times as many configurations, which increases the chance of stumbling onto the correct configuration more quickly.
  - Easy to implement. The only message passing occurs at the synchronization steps.
- The disadvantages of synchronous PSA
  - Idle time: if one processor obtains the required number of successes before another one, it must wait for the other processors to finish.
  - Synchronization cost: a global gathering and rebroadcasting of large configurations can be time-consuming. However, this is not usually a problem with smaller systems.
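The idle-time cost can be made concrete with a small accounting sketch: at every synchronization step all workers wait for the slowest one. The function names and the table layout are hypothetical, and any timings passed in are illustrative inputs, not measurements from this work.

```python
def synchronous_wall_time(step_times):
    """Total wall time when every step ends at a barrier.

    step_times[s][w] is the time worker w needs at temperature step s.
    """
    return sum(max(step) for step in step_times)


def synchronous_idle_time(step_times):
    """Total time workers spend waiting for the slowest worker at each step."""
    return sum(max(step) - t for step in step_times for t in step)
```

For example, with the made-up table `[[3, 1, 2], [1, 4, 1]]` (two temperature steps, three workers) the wall time is 3 + 4 = 7 and the summed idle time is 9 worker-time units.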
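12. Asynchronous Parallel Simulated Annealing
Diagram steps:
- ROOT sends the initial configuration to the workers P1, P2, P3, P4
- ROOT holds the Register (diagram labels: T, ?)
- ROOT collects data from the workers
- . . . .
- ROOT collects the final configuration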
13. Differences between Synchronous and Asynchronous PSA
- Every processor controls its own cooling schedule.
- Each processor works independently of the others, so none sits idle waiting for others to finish. When it finishes at one temperature, it checks its value against the global register. If its value is worse, it takes the configuration from the register. Otherwise, it writes its value to the register.
- The best configuration is always stored in a global register on a master processor.
- Advantages of Asynchronous PSA
  - No processor sits idle between temperatures. When a processor finishes at one temperature, it goes on to the next. However, there might still be some idle time at the end of the program.
  - No expensive synchronization steps. Communications are smaller but more frequent.
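The register protocol can be sketched with threads standing in for processors and a lock-protected object standing in for the register on the master. The class and function names, the Metropolis sweep, and the cooling parameters are assumptions for illustration. In CPython this shows the protocol rather than delivering a real speedup.

```python
import math
import random
import threading


class Register:
    """Global register holding the best configuration seen so far."""

    def __init__(self, config, cost):
        self._lock = threading.Lock()
        self.config, self.cost = list(config), cost

    def exchange(self, config, cost):
        """Write if the caller is better; otherwise hand back the stored entry."""
        with self._lock:
            if cost < self.cost:
                self.config, self.cost = list(config), cost
            return list(self.config), self.cost


def worker(cost_fn, start, register, seed, t=1.0, t_end=1e-3,
           cooling=0.9, moves=50, step=0.5):
    """One processor: own cooling schedule, register check after each temperature."""
    rng = random.Random(seed)
    config, cost = list(start), cost_fn(start)
    while t > t_end:
        for _ in range(moves):
            trial = [x + rng.uniform(-step, step) for x in config]
            trial_cost = cost_fn(trial)
            delta = trial_cost - cost
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                config, cost = trial, trial_cost
        config, cost = register.exchange(config, cost)  # no barrier, no waiting
        t *= cooling


def asynchronous_psa(cost_fn, start, n_workers=4):
    register = Register(start, cost_fn(start))
    threads = [threading.Thread(target=worker, args=(cost_fn, start, register, w))
               for w in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()                                       # collect final configuration
    return register.config, register.cost
```

Each worker only touches the register briefly after finishing a temperature, which mirrors the "smaller but more frequent" communication pattern.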
14. Synchronous vs. Asynchronous PSA
15. Conclusion
- Simulated annealing always converges, but it takes a long time to find the minimum.
- Thus, parallelization of simulated annealing is desirable. Because the search space is explored faster, a near-linear speedup in convergence time is obtained.
- Asynchronous annealing converges faster than synchronous annealing because of its near-zero idle time, and it has a better speedup.
16. References and Questions
[1] M. S. Daw and M. I. Baskes, "Embedded-atom method: Derivation and application to impurities, surfaces, and other defects in metals," Physical Review B, Vol. 29, No. 12, June 1984, pp. 6443-6453.
[2] R. A. Johnson, "Analytic nearest-neighbor model for fcc metals," Physical Review B, Vol. 37, No. 8, March 1988, pp. 3924-3931.
[3] S. Kirkpatrick, C. D. Gelatt, M. P. Vecchi, "Optimization by Simulated Annealing," Science, Vol. 220, No. 4598, May 1983, pp. 671-680.