Title: Molecular dynamics and applications to amyloidogenic sequences
1Molecular dynamics and applications to
amyloidogenic sequences
- Nurit Haspel, David Zanuy, Ruth Nussinov
- (in cooperation with Ehud Gazits group)
2Contents
- Molecular dynamics goals, applications and basic
principles - Newtons equations of motion.
- Energy conservation and equations.
- The force field.
- Solvation models.
- Periodic boundary conditions.
- A molecular dynamic protocol
- Energy minimization.
3Contents (cont.)
- MD protocol (cont)
- Assignment of initial velocities.
- Equilibration.
- Methods of integration.
- Case study Human calcitonin hormone
- Basic background.
- Simulated models.
- Initial results and future work plans.
4The goal of MD
- Predicting the structure and energy of molecular
systems (in our case short peptide structures). - Simulating the behavior of the molecules in the
solution (by solving the energy equations at
every time interval). - Trying to find a model that explains the behavior
of the system.
5Applications of MD
- Sampling the conformational space over time.
Important for ligand docking, for example. - Determine equilibrium averages, structural and
motional properties of the system. - Study the time development of the system.
- Today, most (if not all) biomolecular structures
obtained by X-ray crystallography or NMR are MD
refined.
6Types of simulated systems
- Peptidic systems
- Micelle formation
- Nucleotides
- Small molecules
- Ligand docking
-
- Note Each type of system has its own unique
parameters and equations.
7The basic principle
- Solving the classical mechanics equations
(Newtons equations) over the pairs of atom
distances, angles, dihedrals, VdW interactions
and electrostatics in small time intervals (other
parameters can be added). - classical equations are usually sufficient for
large scale systems. Quantum mechanical
modifications are extremely costly and are used
only on small scale system or where more accuracy
is needed.
8Newtons mechanical equation (based on Newtons
second law)
F
V2
V1
F Ma M(dv/dt) M(d2r/dt2)
Or, with a small enough time interval ?t
?V (F/M) ?t ? V2 V1(F/M) ?t
9Newton equations (cont.)
The new position, r2 is determined by the old
position, r1 and the velocity v2 over time ?t
(which should be very small!). The above equation
describes the changes in the positions of the
atoms over time.
10The process of MD
- The simulation is the numerical integration of
the Newton equations over time. - Positions and velocities at time t
- Positions and velocities at time tdt.
- Positions velocities trajectory.
11The connection between force and energy
F-dU/dr ?U-?Fdr-1/2Mv2
U the energy (scalar). r the position vector.
12Conservation of energy
½SMiVi2 SEpot,iconst
The potential energy is taken from the force
field parameters.
13The potential energy equations bonded
interactions
14The potential energy equations (cont., non-bonded
interactions)
-
Van der Waals -
electrostatic - Etc
- The energy parameters are defined in the force
field
15The force field definition
- All the equations and the adjusted parameters
that allow to describe quantitatively the energy
of the chemical system. - Note, that mixing equations and parameters from
different systems always results in errors! - Force field examples FF2, FF3, Sybyl, charmm etc.
16Solvation models
- No solvent constant dielectric.
- Continuum referring to the solvent as a bulk.
No explicit representation of atoms (saving
time). - Explicit representing each water molecule
explicitly (accurate, but expensive). - Mixed mixing two models (for example explicit
continuum. To save time).
17Periodic boundary conditions
- Problem Only a small number of molecules can be
simulated and the molecules at the surface
experience different forces than those at the
inner side. - The simulation box is replicated infinitely in
three dimensions (to integrate the boundaries of
the box). - When the molecule moves, the images move in the
same fashion. - The assumption is that the behavior of the
infinitely replicated box is the same as a
macroscopic system.
18Periodic boundary conditions
19A sample MD protocol
- Read the force fields data and parameters.
- Read the coordinates and the solvent molecules.
- Slightly minimize the coordinates (the created
model may contain collisions), a few SD steps
followed by some ABNR steps. - Warm to the desired temperature (assign initial
velocities). - Equilibrate the system.
- Start the dynamics and save the trajectories
every 1ps (trajectorythe collection of
structures at any given time step).
20Why is minimization required?
- Most of the coordinates are obtained using X-ray
diffraction or NMR. - Those methods do not map the hydrogen atoms of
the system. - Those are added later using modeling programs
(such as insight), which are not 100 accurate. - Minimization is therefore required to resolve the
clashes that may blow up the energy function.
21Common minimization protocols
- First order algorithms
- Steepest descent
- Conjugated gradient
- Second order algorithms
- Newton-Raphson
- Adopted basis Newton Raphson (ABNR)
22Steepest descent
- This is the simplest minimization method
- The first directional derivative (gradient) of
the potential is calculated and displacement is
added to every coordinate in the opposite
direction (the direction of the force). - The step is increased if the new conformation has
a lower energy. - Advantages Simple and fast.
- Disadvantages Inaccurate, usually does not
converge.
23Conjugated gradient
- Uses first derivative information information
from previous steps the weighted average of the
current gradient and the previous step direction. - The weight factor is calculated from the ratio of
the previous and current steps. - This method converges much better than SD.
24Newton-Raphson algorithm
- Uses both first derivative (slope) and second
(curvature) information. - In the one-dimensional case
- In the multi-dimensional case much more
complicated (calculates the inverse of a hessian
curvature matrix at each step) - Advantage Accurate and converges well.
- Disadvantage Computationally expensive, for
convergence, should start near a minimum.
25Adopted basis Newton Raphson (ABNR)
- An adaptation of the NR method that is especially
suitable for large systems. - Instead of using a full matrix, it uses a basis
that represents the subspace in which the system
made the most progress in the past. - Advantage Second derivative information,
convergence, faster than the regular NR method. - Disadvantages Still quite expensive, less
accurate than NR.
26Assignment of initial velocities
- At the beginning the only information available
is the desired temperature. Initial velocities
are assigned randomly according to the
Maxwell-Bolzmann distribution - Pv - the probability of finding a molecule with
velocity between v and dv. - Note that 1. the velocity has x,y,z components.
- 2. The velocities exhibit a gaussian distribution
27Bond and angle constraints (SHAKE algorithm)
- Constrain some bond lengths and/or angles to
fixed values using a restraining force Gi. - Solve the equations once with no constraint
force. - Determine the magnitude of the force (using
lagrange multipliers) and correct the positions
accordingly. - Iteratively adjust the positions of the atoms
until the constraints are satisfied.
28Equilibrating the system
- Velocity distribution may change during
simulation, especially if the system is far from
equilibrium. - Perform a simulation, scaling the velocities
occasionally to reach the desired temperature. - The system is at equilibrium if
- The quantities fluctuate around an average value.
- The average remains constant over time.
29The verlet integration method
- Taylor expansion about r(t)
- Combining the equation results in
- Which is velocity independent.
- The error is of order dt4 (the next expression of
the series)
30The verlet method (cont.)
- The velocities can be calculated using the
derivation formula - Here the error is of order dt2
- Note the time interval is in the order of 1fs.
(10-15s)
31The verlet algorithm
- Start with r(t) and r(t-dt)
- Calculate a(t) from the Newton equation
- a(t) fi(t)/mi .
- Calculate r(tdt) according to the aforementioned
equation. - Calculate v(t).
- Replace r(t-dt) with r and r with r(tdt).
- Repeat as desired.
32Amyloid fibril formation
- Associated with a large number of degenerative
diseases such as Alzheimers, Parkinsons etc. - Associated with a structural change in the
protein structure, resulting in the formation of
stable fibrils. - The fibrils are richer in ß-sheets (although
their tertiary arrangements are usually
undetermined). - Amyloid forming proteins do not share sequence
homology, but the fibrillar structures exhibit
similar physicochemical and structural
characteristics.
33The human calcitonin (hCT)
- A 32 amino acid polypeptide hormone, produced by
the C-cells of the thyroid and involved in
calcium homeostasis. - Fibrillation of hCT was found to be associated
with carcinoma of the thyroid. - Synthetic hCT can form amyloid fibrils in vitro
with similar morphology to the deposits found in
the thyroid. - The in vitro process is affected by the pH of the
system.
34The structure of hCT
- In monomeric state, hCT has little ordered
secondary structure in room temperature. - Fibrillated hCT have both helical and sheet
components. - In DMSO/H2O a short double stranded anti-parallel
ß-sheet is formed in the region of residues
16-21. - Previous research indicated a critical role to
residues 18-19.
35The sequence of hCT
- -
- NH2-CGNLSTCMLGTYQDFNKFHTFPQTAIGVGAP-COOH
36Experimental data regarding the fibril forming
region
- The DFNKF area was found to form fibrils rich in
anti-parallel ß-sheets. - The spectrum observed with the DFNK tetrapeptide
is less typical of ß-sheets, but may be
interpreted as such. - The FNKF tetrapeptide exhibits a spectrum that is
typical of a non-ordered structure. - The DFN tripeptide seems to be a mixture of
ß-sheet and non-ordered structure.
37The effect of F?A mutation
- The DANKA mutation does not exhibit a typical
spectrum of the ß-sheet structure, although they
exhibit a certain degree of order. - This implies on the effect of the Phe aromatic
residues in the fibrillation process.
38Tested models
- Combinations of parallel/anti parallel within
sheet and between sheets. So far about 20
models. - Each model is simulated for 4ns. (each such
simulation takes about 5 days on a powerful
cluster). - The tested parameters for model stability
distance within/between sheets, aromatic
interactions, HB contact conservation etc.
39Topologically different models
40An example of a model
41Initial results (trajectory analysis)
- A model thats totally unstable
- Before After
42Average intra-sheet distance analysis
43Percentage of conserved H-bonds over time
44Future work plans
- Test mutations once we focus on the correct
model. - Make more analyses and find out what causes the
fibril formation (suspicion The aromatic ring
p-stacking, salt bridges between the oppositely
charged residues D and K) - ???