Title: Long time dynamics of Complex Systems
1Long time dynamics of Complex Systems
Ron Elber Avijit Ghosh and Alfredo
CÃ rdenas Computer Science, Cornell University
NSF NIH
2The R to T transition in hemoglobin
Time scale problems
3The R to T transition in hemoglobin
- Oxygen binding and transport proteins.
- A conformational transition from oxygen binding
to oxygen releasing protein. - Time scale of the complete transition ten
microseconds - Something happens on every time scale (ligand
dissociation femtoseconds, iron relaxation
picoseconds, ligand diffusion nanoseconds,
quaternary relaxation microseconds) - CANNOT BE MODELLED BY ACTIVATED DYNAMICS!
4Many paths, compute rate from many trajectories
5Dynamics on rough energy landscapes with broad
range of time scales
NOT transition state theory, activated dynamics,
or transition path sampling.
6The Molecular Dynamics Approach
Verlet algorithm
To maintain numerical stability must be
small
A million steps provide only a nanosecond TIME
SCALE AND SAMPLING LIMITATION
7Molecular Dynamics Integration with respect to
time or the path length
Time
Path length
8Functional formulation for classical dynamics
Gauss action
What happens when the integral is approximated
with a large step? Consider the length
representation...
9Large step Length formulation
- Finite difference approximation
- At the limit of large (still small enough
to obtain the path direction). We obtain an
expression for the steepest descent path
(reaction coordinate)
10Steepest Descent Paths as Quenched
trajectoriesResemble The inherent structure
of Stillinger and Weber
e
Filtering high frequency modes
11Dipeptide trajectories
12Global formulation of reaction coordinates
- Pratt (diffusive path in time, first global
approach) - Elber Karplus (Not the SDP, application to
proteins) - Ulitsky Elber (SDP)
- Jonsson (SDP)
- Olender Elber (SDP, connection to diffusion
process)
13Global formulation of classical trajectories
- Czerminski Elber (approx. functional)
- Wilson (Elastic band, Verlet algorithm)
- Freeman and Doll (Fourier space)
- Olender Elber (large step in time)
- Parrinello (large step in time, energy
conservation) - Ghosh Elber (large step in length)
14How to recover high frequency modes, missing in a
filtered trajectory?
15Trajectories versus Quenched trajectories
e
Statistical properties of high frequency modes
16Correlation of Missing high frequency modes for
blocked valine peptide (13 atoms)
17High frequency motion correlation function for
the (much) larger system folding of C peptide
Or Central Limit Theorem wins again
18Correlation of High Frequencies Motions
Longer memory but still not too long (lt20 ps)
19Distributions of the errors
Valine dipeptide
C peptide folding
20Probability of a single trajectory
Since the errors have no correlations in time
Optimize a target function of the whole trajectory
If is modeled by a Gaussian, then
21Optimization of a functionalOr What We Compute
P1
P3
P5
P2
P4
S is optimized on a Power-Edge Dell Clusters at
the Cornell Theory Center (256 or 128 Pentium
III Windows 2000 O/S) http//www.tc.cornell.edu/CB
IO
22Implementation on a PC cluster
Parallel simulation of ten microsecond
trajectories of hemoglobin on Velocity (work by
Veaceslav Zaloj).
23In contrast, usual parallelization of MD Space
decomposition
P1
P2
P3
P4
Communication is a problem. Requires very large
systems.
24Functional optimization versus initial value
solvers
- Arbitrary time steps, filters
- State to state
- Easy to parallel
- Large system
- Not good for searches (fixed end points)
25Filtering high frequencies
The simplest trajectory (one moving point)
26Use an exact expansion for the path
Substitute into the integral expression to
minimize S as a function of a,b, c and
a(k) Repeat the approximation of the error
estimate force assumed a constant in the time
interval.
27Filtering high frequencies
The approximate integration (that follows the SDE
algortihm) gives
We optimize S as a function of the independent
variables
To have
High frequencies removed. Stability maintained.
28Numerical examples Filtering high frequency modes
29Comparing spatial and temporal integrations
- Total energy is fixed (not time)
- Uniform distribution of points along the path
30Fractal like refinement of trajectory time scale
How long is the coast of England?
Mandelbrot How long is an MD trajectory in
length? There is one-to-one correspondence
between time and length
31End of refinement Time converged agreement
with initial value solver (glycine dipeptide)
32Main references for the new stochastic path
algorithm for long time dynamics
R. Olender and R. Elber, Calculation of
classical trajectories with a very large time
step Formalism and numerical examples, J. Chem.
Phys. 105,9299(1996) SMALL SYSTEMS R. Elber,
J. Meller, and R. Olender, A stochastic path
approach to compute atomically detained
trajectories Application to the folding of C
peptide, J. Phys. Chem. B 103,899(1999) C
PEPTIDE V. Zaloj and R. Elber, Parallel
computations of molecular dynamics trajectories
using stochastic path approach, Comp. Phys.
Comm. 128,118(2000) HEMOGLOBIN Joost C.M.
Uitdehaag, Bart A. van der Veen, Lubbert
Dijkhuizen, Ron Elber and Bauke W. Dijkstra, The
enzymatic circularization of a malto-octaose
linear chain studied by stochastic reaction path
calculations on cyclodextrin glycosyltransferase,
Proteins, 327-335,43(2001). Glycosyltransferase
Eastman P., Gronbech-Jensen N., Doniach S.,
Simulation of protein folding by reaction path
annealing, J. Chem. Phys. 114, 3823-3841, 2001
Huo SH, Straub JE Direct computation of long time
processes in peptides and proteins Reaction path
study of the coil-to-helix transition in
polyalanine, Proteins 36249-261,1999
33Computational protocol
- We use the length parameterization
- Optimize or sample from the following target
function - Subject to constraint on overall translation and
rotation - Compute along projected gradient
34Computational protocol continue
Start from a straight line interpolation minimize
T
35Mechanisms of protein folding
- Hydrophobic collapse?
- Framework model?
- Other reaction coordinates?
- Formation of native/non-native contacts?
- Order of events.
- Comparison to experiment using atomically
detailed models.
36Folding of two proteins
- Protein A small helical protein, other
theoretical studies, a few experiments - Cytochrome C Not so small protein, many
experiments, no calculations, microsecond
collapse, millisecond folder
37Folding Protein A Path Parameterized by
Length Avi Ghosh (with Harold Scheraga)
- Small 3 helix bundle (H1-H3) , 62 residues.
- H1 at residues 6-17
- H2 at residues 21-32
- H3 at residues 38-57
- Interacts with Immunoglobin binding proteins.
- Fast folding.
38Protein A , frag. B Crystal Structure.
- Amber/OPLS United Atom Force Field
- Analytical Continuum Solvent Model (GB/SA) by
Dave Case implemented into Moil Package
39Experimental DataCurrent Literature
- Wright Prot. Sci 1997
- H1-H2 are the helices that bind to Immunoglobin.
- Fragment studies on H1, H2, H3.
- H3 to be the most stable.
- Experimental evidence shows folding occurs lt 6
ms. - No intermediates detected.
- Bottomley Prot. Eng 1994
- H3 forms early in folding process
40Theoretical Data
- Brooks PNAS 1997 Equilibrium studies
- Rg as a reaction coordinate
- Hydrogen bonds /Native contacts reaction
coordinate studied. - H3 is the least stable, folds last.
- Daggett PNAS 2000 Unfolding simulations
- show H3 is most stable, unfolding last.
- High temperature unfolding.
41Computational protocol
- 100 length slices, optimizing the action in
length - Initial paths were generated as minimum energy
coordinates - Simulated annealing for each path.
- Time of the process sub-milliseconds.
Approximately one day to optimize a trajectory on
30 CPU-s.
42Generation of Ensemble of Unfolded Structures
that are used in path calculations Protein A
(Fragment B)
- High temperature MD run at 1000k
- Structures taken every 50 ps.
- Minimized locally.
- Total of 1000 structures generated.
- Clustering
- Top 130 structures gt 8.5 A RMS apart taken.
- 130 FOLDING TRAJECTROEIS
43Relative Helix Formation
(1) Helix 3 forms first (2) Helix 1 follows
closely (3) Helix 2 forms last
44A comparison of collapse and secondary structure
formation Average over all trajectories and all
length slices
45Log(prob) map of structuresalong first 20 of
path length (Rg /Hb)
46Free energy map of structuresalong 20-40 of
path length (Rg / Hb)
47Free energy map of structuresalong 40-60 of
path length (Rg / Hb)
48Free energy map of structuresalong 60-80 of
path length (Rg / Hb)
49Free energy map of structuresalong last 20 of
path length (Rg /Hb)
50Secondary and tertiarystructure formation (Nc /
Hb)
51Overview of the folding processfor Protein A
- Collapse to Compact structure occurs smoothly
throughout the folding process. No barrier - Nuclei of secondary structure forms early in the
folding process, majority occurs ½ way through
folding process - Mechanism Partial formation of transient
secondary structure, rapid collapse and
simultaneous formation of complete secondary
structure and tertiary contacts - Helix 3 forms first in agreement with experiment
(Oas)
52Cytochrome C(Alfredo Cardenas)
53Cytochrome C
- A small single domain protein (104 aa)
- Millisecond folder
- Well defined structure, bound to heme
- Extensive experimental studies by Englander,
Roder, Rousseau, Eaton, Takahashi... - Rapid collapse before significant formation of
secondary structure - Helices N and C form first
- Potential intermediates, HIS mis-ligated to heme
54Cyt C Computational protocol
- 4 ns High temperature runs produce unfolded
structures - 900 intermediate structures between unfolded and
folded configurations (typical length step 0.6
angstrom) - Individual trajectories were produced on (about)
40 CPU for 48 hours by full action optimization - Generalized Born model was used for solvation
- Total of 26 folding trajectories is analyzed
55Collapse Helix Formation
Molten globule
Qualitative agreement with Takahashi experiments
56First length slice
57Second length slice
58Third length slice
59Fourth length slice
60Last (fifth) length slice
61Sequence of secondary structure formation
Agrees with Englander and Roder experiments
62Energy barrier Another definition of the molten
globule?
Measured by Hagen and Eaton
63History of native and none native contacts
Do we observe non-native contacts? Is the time
history of non-native contacts significantly
different from native? (suggestion To dissolve
non-native contacts an increase in the radius of
gyration is required).
64native contacts and radius of gyration
65non-native contacts
66A trajectory of a native contact
Molten globule
67A trajectory of a non-native contact
Molten globule
68Summary of cytochrome C studies
- SDE simulation of the millisecond folding process
of Cytochrome C - Rapid collapse that is followed by formation of
secondary structure in agreement with
experiment different form protein A - First the N/C helices form and the 60 helix forms
last, in agreement with experiment - Non-native contacts are fixed within the molten
globule. No significant changes in compactness
are required.