Modeling real molecules in virtual water. - PowerPoint PPT Presentation

Title: Modeling real molecules in virtual water.


1
Modeling real molecules in virtual water.
Alexey Onufriev, The Scripps Research Institute, La Jolla, CA
Acknowledgements: David Case (Scripps), Don Bashford (Scripps).
Support: NIH grant GM 45607
2
Outline
  • Key features of the virtual water model.
  • Simulated folding of a 46-residue protein
    using all-atom molecular dynamics based on
    the GB model.
  • Structural basis of stability of halophilic
    proteins.

3
THEME I. Protein folding.
Amino-acid sequence (translated genetic code):
MET-ALA-ALA-ASP-GLU-GLU-...
How?
Experiment: the amino-acid sequence uniquely
determines the protein's 3D shape (ground state).
Nature does it all the time. Can we?
4
The magnitude of the protein folding challenge
A small protein is a chain of 50 amino acids
(more for most). Assume that each amino
acid has only 10 conformations (a vast
underestimate). Total number of possible
conformations: 10^50. Say you make one MC
step per femtosecond. Exhaustive search for
the ground state will take about 10^27 years.
Why bother? A protein's shape determines its
biological function.
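The estimate on this slide can be checked with a few lines of Python. This is a minimal sketch: the variable names are illustrative and the length of a year (365.25 days) is an assumption.

```python
# Back-of-the-envelope check of the exhaustive-search estimate.
n_residues = 50          # chain length quoted on the slide
conf_per_residue = 10    # conformations per amino acid quoted on the slide
step_time_s = 1e-15      # one Monte Carlo step per femtosecond

n_conformations = conf_per_residue ** n_residues     # 10^50
seconds_per_year = 365.25 * 24 * 3600                # assumption: Julian year
search_years = n_conformations * step_time_s / seconds_per_year

print(f"conformations: {n_conformations:.1e}")
print(f"search time:   {search_years:.1e} years")
```

The result is a few times 10^27 years, the same order of magnitude as the figure quoted on the slide.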
5
Complexity of protein design
Example: PCNA, a human DNA-binding protein.
Drawn to scale.
6
"Everything that living things do can be
reduced to wiggling and jiggling of
atoms." R. Feynman
This suggests the approach: model what nature does,
i.e. let the molecule evolve with time according
to the underlying laws of physics.
7
Principles of Molecular Dynamics (MD)
Each atom moves by Newton's 2nd Law: $F = ma$,
where the force comes from the energy: $F = -dE/dr$
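As an illustration of how Newton's 2nd law is stepped forward in time, here is a minimal sketch using the velocity Verlet scheme on a one-dimensional harmonic oscillator. The slide does not name an integrator, so the scheme, the toy force and all names are assumptions made for illustration.

```python
import math

def harmonic_force(x, k=1.0):
    """Force F = -dE/dx for the toy energy E = k x^2 / 2 (illustrative only)."""
    return -k * x

def velocity_verlet(x, v, mass, dt, n_steps, force):
    """Advance one particle under F = m a using the velocity Verlet scheme."""
    f = force(x)
    for _ in range(n_steps):
        v += 0.5 * dt * f / mass   # half-step velocity update with the old force
        x += dt * v                # full-step position update
        f = force(x)               # force at the new position
        v += 0.5 * dt * f / mass   # second half-step velocity update
    return x, v

# Start at x = 1 with zero velocity; the exact solution is x(t) = cos(t).
x_end, v_end = velocity_verlet(x=1.0, v=0.0, mass=1.0, dt=0.01,
                               n_steps=1000, force=harmonic_force)
print(x_end, math.cos(10.0))
```

In a real MD code the force routine would evaluate the gradient of the full molecular energy function for every atom at each step.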
9
Computational advantages of representing water
implicitly, via a continuum solvent model
Implicit water as dielectric continuum:
low computational cost, fast dynamics.
Explicit water (traditional):
large computational cost, slow dynamics.
Other advantages of implicit water:
  • Instant dielectric response => no water
    equilibration necessary.
  • No viscosity => faster conformational
    transitions.
  • Solvation in an infinite volume => no boundary
    artifacts.
  • Solvent degrees of freedom taken into
    account implicitly => easy to
    estimate total energy of solvated
    system.

10
Traditional approach
To obtain the electrostatic potential, solve the
Poisson-Boltzmann equation.
Use the molecular structure to define the
dielectric boundary:
inside, ε = 4
outside, ε = 80
11
Computational challenges
1. Solution of a PDE is expensive. Slow.
2. Need a fine grid => run out of memory fast.
3. Problems with obtaining derivatives of the energy.
Need a fast method. Can't afford to solve the Poisson
equation on a grid every 2 fs of simulation
time. Also, an analytical formula for the
energy would be nice here, to get the forces via
$F = -dE/dr$.
12
Contributions to solvation free energy
$\Delta G_{solv} = \Delta G_{nonpolar} + \Delta W_{electr}$
(Figure labels: vacuum, water.)
13
Alternative: generalized Born approximation (GB)
Total electrostatic energy = vacuum part + solvent polarization, $\Delta W$
The solvent polarization term contains a function $f$ to be determined.
14

Solvation energy of individual ion

15
The magic formula
$f = \left[ r_{ij}^2 + R_i R_j \exp\left( -r_{ij}^2 / 4 R_i R_j \right) \right]^{1/2}$
(Still et al. 1990)
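The interpolation function of Still et al. is easy to evaluate in code. The sketch below also sums it into a solvent polarization energy using the commonly quoted GB expression, ΔW = -1/2 (1/ε_in - 1/ε_out) Σ q_i q_j / f. That energy expression, the unit conversion constant and all function names are assumptions, since the slide's own equation was not transcribed.

```python
import math

COULOMB = 332.06  # kcal*Å/(mol*e^2); assumed unit conversion constant

def f_gb(r_ij, R_i, R_j):
    """Interpolation function of Still et al. (1990)."""
    return math.sqrt(r_ij ** 2 + R_i * R_j * math.exp(-r_ij ** 2 / (4.0 * R_i * R_j)))

def gb_polarization_energy(coords, charges, radii, eps_in=1.0, eps_out=80.0):
    """Assumed GB form: -1/2 (1/eps_in - 1/eps_out) * sum_ij q_i q_j / f_gb."""
    prefactor = -0.5 * COULOMB * (1.0 / eps_in - 1.0 / eps_out)
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):  # the double sum includes i == j (self terms)
            r_ij = math.dist(coords[i], coords[j])
            total += charges[i] * charges[j] / f_gb(r_ij, radii[i], radii[j])
    return prefactor * total

# Limits of f: the effective radius at r = 0, the plain distance at large r.
print(f_gb(0.0, 2.0, 2.0), f_gb(100.0, 2.0, 2.0))
# A single ion (unit charge, 2 Å radius) recovers the Born self energy.
print(gb_polarization_energy([(0.0, 0.0, 0.0)], [1.0], [2.0]))
```

The two limits are what make the function useful: the self terms reduce to the Born formula, and distant pairs reduce to screened Coulomb interactions.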
17
All pairs of atoms
f: a simple, smooth function
18
Key result
(Onufriev et al., J. Comp. Chem. 2002)
Given a good set of effective radii Ri, the
Generalized Born formula approximates the
essentially exact Poisson-Boltzmann solution
very well. That is, given the correct self
energy, the interaction energy can be computed
(for the vast majority of atom pairs in the molecule).
19
How to calculate the effective Born radius, Ri?
Invert the Born formula.
Calculate the solvation energy from
classical electrostatics.
Make some approximations.
Do the integral analytically, over atom spheres.
20
Computing the effective Born radii. Step I
Effective radius
Solvation energy of individual atom inside the
molecule
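Inverting the Born formula can be sketched as follows, assuming the standard Born expression ΔG = -1/2 (1 - 1/ε_out) q² / R for an ion in a vacuum-like interior. The unit constant, the function names and the example self energy are hypothetical and used only for illustration.

```python
COULOMB = 332.06  # kcal*Å/(mol*e^2); assumed unit conversion constant

def born_self_energy(q, radius, eps_out=80.0):
    """Born solvation energy of a charge q in a sphere of the given radius (Å)."""
    return -0.5 * COULOMB * (1.0 - 1.0 / eps_out) * q ** 2 / radius

def effective_born_radius(q, self_energy, eps_out=80.0):
    """Invert the Born formula: the radius that reproduces a given self energy."""
    return -0.5 * COULOMB * (1.0 - 1.0 / eps_out) * q ** 2 / self_energy

# Hypothetical example: a unit charge whose self energy inside a molecule is
# -20 kcal/mol behaves like an isolated ion with this effective radius.
print(effective_born_radius(1.0, -20.0))
```

A deeply buried atom has a small self energy in magnitude, so its effective radius comes out large.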
21
Computing the effective Born radii. Step II
22
Computing the effective Born radii. Step III
Integral over atom spheres
23
Computational challenges. Example.
Molecular volume gt Si 4/3pR3i
24
This is how it is done
General idea:
λ = 4/3, compact packing of hard spheres
(Figure label: atom spheres.)
25
How effective Born radii are computed in the new
GB model.
Original: $R^{-1} = (\rho - 0.09\,\text{Å})^{-1} - I$
New: (equation not transcribed)
(Label: degree of burial.)
26
Effects of radii re-scaling in the new GB model.
Compared to the original GB, the new approach:
  • Makes Reff of buried atoms larger.
  • Removes numerical instability for large Reff.
  • Gives small-molecule results that are as good (small Reff).

(Figure: New GB and Original GB curves versus normalized degree of atom burial.)
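The numerical instability of the original expression, 1/R = 1/(ρ - 0.09 Å) - I, can be seen with a short sketch. The intrinsic radius and the values of the burial term I below are hypothetical, chosen only to show the behaviour near the singularity.

```python
def original_effective_radius(rho, burial_term, offset=0.09):
    """Effective radius from 1/R = 1/(rho - offset) - I (original GB form)."""
    inverse_radius = 1.0 / (rho - offset) - burial_term
    return 1.0 / inverse_radius

rho = 1.7  # hypothetical intrinsic radius in Å
for burial_term in (0.0, 0.30, 0.60, 0.62, 0.63):
    print(burial_term, original_effective_radius(rho, burial_term))
```

As I approaches 1/(ρ - 0.09 Å), a tiny change in I changes Reff enormously, and a slightly larger I even makes it negative. The slide credits the re-scaling in the new model with removing this instability.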
27
The first-generation Generalized Born model
(Hawkins, Cramer, Truhlar, 1996)
worked very well for small compounds, but not so
well for macromolecules.
The problem is addressed in the second-generation
GB models.
Onufriev et al., J. Phys. Chem. 104, 3712 (2000).
Most of the error in the effective radii
(relative to PB) can be eliminated by
integration over the correct molecular volume.
28
Computing the effective Born radii for
macromolecules.
λ = 4/3, compact packing of hard spheres
29
The improved GB model is validated by:
1. Calculations of pKs of titratable groups in
   proteins: comparison to experiment.
2. Comparison to Poisson-Boltzmann (exact)
   theory.
3. MD simulations of native proteins
   to see how close the simulated structure is to
   the known X-ray one.
30
GB: the Generalized Born model (modified).
PB: Poisson-Boltzmann theory.
31
How well does the new GB work (compared to
Poisson-Boltzmann)?

System                              PB (0.5 Å grid)   GB (new), AMBER-7   GB (old), AMBER-6
Apomyoglobin (153 residues), pH 2   -2088.9           -2089.9             -2161.1
Protein A (46 residues), pH 7        145.1             143                 131

The change in the electrostatic part of the solvation
energy, ΔW, of the protein in going from its
native (N) to the unfolded (U) state,
calculated using the PB and GB models. The
change, ΔW(N) - ΔW(U), is in kcal/mol. The
quantity is calculated as an average over 50
structures representative of each state.
32
Potential function
Total energy:
$E_{tot} = E_{vacuum} + \Delta E_{elec}^{GB} + \Delta G_{nonpolar}$
$E_{vacuum}$: gas-phase potential (AMBER-7).
$\Delta E_{elec}^{GB}$: electrostatic part of the solvation free
energy (calculated by the GB).
$\Delta G_{nonpolar}$: non-polar contribution to solvation,
approximated as surface energy $\sigma A$.
Folding a 46-residue protein using Molecular
Dynamics based upon the GB model.
Native Protein A
Molecular Dynamics: The protein is equilibrated
for 9 ns at 300 K; the native state is stable. To
model unfolding conditions, the temperature is
quickly raised to 450 K, and remains at 450 K for 1
ns. The resulting state appears unfolded (RMSD to
native > 15 Å), with no tertiary and little
secondary structure left. The protein is then
gradually cooled down (2 ns) and the simulation
continues at 300 K for another 5 ns.
Energy function
Computations: We use AMBER-7, parm94 all-atom
ff. Starting structure: PDB 1BDD. ΔGnonpolar =
5 cal/Å² × solvent accessible area. Salt:
0.15 M NaCl.
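The temperature protocol described on this slide can be written down as a simple piecewise schedule. This is only a sketch: the slide says the temperature is raised "quickly" and lowered "gradually", so the instantaneous jump and the linear cooling ramp below are assumptions, as is the function name.

```python
def target_temperature(t_ns):
    """Target temperature (K) at simulation time t_ns for the protocol above."""
    if t_ns < 9.0:        # equilibration of the native state
        return 300.0
    if t_ns < 10.0:       # unfolding at high temperature (assumed instant jump)
        return 450.0
    if t_ns < 12.0:       # cooling over 2 ns (assumed linear ramp)
        return 450.0 - 150.0 * (t_ns - 10.0) / 2.0
    return 300.0          # refolding run, 5 more ns (total 17 ns)

for t in (0.0, 9.5, 11.0, 15.0):
    print(t, target_temperature(t))
```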
34
Using Molecular Dynamics based on the
Generalized Born model to simulate
protein-protein docking.
(Figure labels: unbound barstar, barnase.)
35
Molecular mechanics (free) energy can be used as
a scoring function, to identify the correctly
bound conformation
37
The bottom of the folding funnel.
(Figure; panel labels 4 to 7.)
38
Simulated refolding pathway of the 46-residue
protein
Movie available at www.scripps.edu/onufriev/RESEARCH/in_virtuo.html
(Figure: axis ticks 0 to 6; panel labels 1, 3, 5.)
NB: due to the absence of viscosity, folding
occurs on a much shorter time-scale than in
experiment.
39
Recent landmark attempt to fold a (36-residue)
protein in virtuo using Molecular Dynamics:
Duan Y., Kollman P., Science 282, 740 (1998).
Simulation time: 3 months on 256 processors = 64
years on one processor.
Result: partially folded structure.
Problem: explicit water simulations are too
expensive computationally; can't wait long
enough.
40
Folding a protein in virtuo using Molecular
Dynamics based on the Generalized Born (implicit
solvation) model.
Simulation time: overnight on 16 processors.
Protein to fold: 46-residue protein A (one of
the guinea pigs in folding studies).
No prior structural knowledge is necessary
(homology, etc.).
Protocol details: AMBER-7 package, parm94
force field. New GB model.
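The two quoted simulation costs can be put on a common scale of single-processor time. This is a rough sketch: "overnight" is assumed to mean about 12 hours, and the two runs used different proteins and different hardware, so the ratio is only indicative.

```python
MONTHS_PER_YEAR = 12
HOURS_PER_YEAR = 365.25 * 24

# Explicit water run: 3 months on 256 processors.
explicit_cpu_years = 3 * 256 / MONTHS_PER_YEAR

# Implicit water (GB) run: overnight on 16 processors.
overnight_hours = 12.0   # assumption: "overnight" is about 12 hours
implicit_cpu_years = overnight_hours * 16 / HOURS_PER_YEAR

print(explicit_cpu_years)                       # single-processor years, explicit water
print(explicit_cpu_years / implicit_cpu_years)  # rough cost ratio
```

Under these assumptions the explicit water attempt cost roughly three thousand times more processor time.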
41
Protein-A re-folding steps. Formation of
residue-residue contacts
42
Initial stages of re-folding.
Contacts are formed between residues in the loops
(mostly hydrophobic).
Panels: 0 to 0.5 ns; 0.5 to 1 ns.
Contacts are superimposed on the native backbone.
t = 0 ns corresponds to the unfolded structure.
Hypothesis: restricted motion in the loops
(hinges) directs fast folding.
NMR evidence for restricted motions in the hinge
regions in the unfolded state of apomyoglobin:
Schwarzinger S., Wright P., Dyson J., Biochem.
41, 12681 (2002).
43
Restricted motion in the loops in the unfolded
state may be important for fast
folding. (directing formation of the correct
topology)
44
Conclusions to part I
  • Molecular Dynamics based on the improved GB
    model can be used to fold a 46-residue protein
    (to a backbone RMSD of 2.4 Å from the X-ray
    structure, starting from an unfolded state at 450 K).
  • Contacts formed in the early stages of folding
    between residues in the loop regions may direct
    fast formation of the correct topology.

45
Restricted motion in the loops in the unfolded
state may be important for fast folding.
If, instead of 450 K, the protein is unfolded at
750 K (the rest of the simulation protocol remains
the same), it misfolds upon cooling. The
misfolded structure has the wrong
topology (and higher energy) compared to the
native fold achieved in the previous
simulation, and represents a kinetic trap. If,
however, the (φ,ψ) dihedral angles of loop residues
(4 in each loop) are slightly restrained during
the simulation, the protein finds the correct
topology immediately upon cooling.
Experimentally, restricted motions in the loops
are observed in 8 M urea-unfolded apomyoglobin (S.
Schwarzinger, P. Wright, et al.).
End result (300 K):
Correct topology is achieved despite
considerable fluctuations of dihedral angles in
the high-temperature unfolded state.
Fluctuations in the unfolded state (0.1 to 1 ns);
red diamonds: values in the native state.
Panel 1: T = 450 K
Panel 2: T = 750 K
Panel 3: T = 750 K, loop fluctuations slightly restricted
(Figure labels: loop, loop.)
46
Conclusions
The Generalized Born approximation can provide
an accurate estimate of charge-charge
interactions in solvated macromolecules. The
use of a continuum solvent model based upon a
Generalized Born approximation speeds up
Molecular Dynamics simulations considerably. A
46-residue protein can be successfully folded in
a Molecular Dynamics simulation based on the
Generalized Born model with a reasonable
computational effort. Loop regions appear to be
important for steering the folding protein
towards the correct topology.
48
The generalized Born formula can be more than
just a computationally effective alternative to
the Poisson-Boltzmann method.
After all, it's a (simple) formula, and as such
can be used for back-of-the-envelope
calculations.
49
BIOLOGY + INFORMATICS = bioinformatics
CHEMISTRY + INFORMATICS = cheminformatics
PHYSICS + INFORMATICS = ? (physionformatics)
50
Application II
What makes proteins from halophilic organisms
stable at very high salt?
Why study extremophiles?
  • Natural curiosity (generally not fundable).
  • Industrial uses of extremophiles (PCR,
    pharmaceuticals, food industry, etc.). Nature 409,
    1092 (2001).
  • Life on Mars?

51
Unlike ordinary proteins, stability of
halophilic proteins increases with salt
concentrations.
Why? Can we try to find out?
(Figure: Folded, from 0 to 100, versus NaCl concentration, 1 to 4 M.)
52
A necessary condition for halophilism:
d(stability)/d(salt concentration) > 0.
53
Salt dependence enters through the electrostatics
Only this term is affected by salt
Can estimate using GB
54
Which can be approximated via the generalized
Born formula
ε = 80 (water)
Salt dependence: $\exp(-\kappa f_{GB})$
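The way the screening factor exp(-κ f_GB) enters can be sketched for a single pair term. The slide's full equation was not transcribed, so the form used below, -1/2 q_i q_j / f_GB × (1 - exp(-κ f_GB)/ε), is a commonly used one and is assumed here. The Debye length estimate of about 3.04 Å / sqrt(I) for a 1:1 salt in water near room temperature and the function names are also assumptions.

```python
import math

COULOMB = 332.06  # kcal*Å/(mol*e^2); assumed unit conversion constant

def debye_kappa(ionic_strength_molar):
    """Inverse Debye length (1/Å), using lambda_D ~ 3.04 Å / sqrt(I)."""
    return math.sqrt(ionic_strength_molar) / 3.04

def gb_pair_energy_with_salt(q_i, q_j, f_gb, salt_molar, eps_out=80.0):
    """One (i, j) term of the assumed salt-dependent GB double sum."""
    kappa = debye_kappa(salt_molar)
    screening = 1.0 - math.exp(-kappa * f_gb) / eps_out
    return -0.5 * COULOMB * q_i * q_j / f_gb * screening

# Two like charges with f_GB = 10 Å at increasing salt concentration.
for salt in (0.0, 0.15, 2.0):
    print(salt, gb_pair_energy_with_salt(-1.0, -1.0, 10.0, salt))
```

For like charges the term becomes more negative as salt increases, which is the sense in which salt weakens unfavorable charge-charge interactions.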
55
The salt dependence of folding free energy is
therefore
Unfolded
Folded
56
Increasing the charge on a protein may indeed make it
MORE stable at high salt (salt decreases
strongly unfavorable charge-charge interactions
in the folded state).
Example protein: apomyoglobin.
Onufriev, Bashford, Case, JMB 325, 555 (2003).
57
Halophilic proteins are indeed strongly charged.
Randomly selected N = 125 non-homologous proteins
from SwissProt database.
GLU, ASP count as -1; ARG, LYS count as +1; HIS
counts as 0.
58
(Figure labels: Homo sapiens, halophilic.)
Randomly selected N = 125 non-homologous proteins
from SwissProt database.
59
As expected, (Σqiqj)/(N(N-1))
is large and positive in halophilic proteins.
Randomly selected N = 125 non-homologous proteins from SwissProt database.

Quantity             All halophilic   Homo sapiens
(Σqi)/N              -0.107           -0.0065
(Σqiqj)/(N(N-1))      0.170            0.0077
60
How can adding more charge stabilize a protein
(at high salt)?
Energy
61
Additional charges near the surface
CAN make the protein more stable
at high salt.
E12 < 0 (stabilizing)
(Figure labels: E12; solution, ε = 80 + salt; protein, ε = 4.)
62
Halophilic proteins are rich in negatively
charged surface residues. The positive
countercharges are next to them.
Structural data from 3 halophilic proteins
available in PDB (1D3A, 1DOI, 1HLP)
63
Structure of a halophilic protein showing
negatively charged surface residues and the
positive countercharges nearby.
(Figure labels: ARG, LYS; GLU, ASP.)
Halophilic malate dehydrogenase from Haloarcula
marismortui, PDB ID 1D3A.
64
Conclusions to part II
  • The generalized Born model can also be used as
    an analytical tool to help understand
    structure-function relationships.
  • Additional charge near the surface of halophilic
    proteins appears to be necessary for their
    stability at high salt concentrations.

65
Summary
The fast, analytical virtual water models can
be used to explore large-scale conformational
transitions in bio-molecules, previously
inaccessible via traditional explicit water
methods.
66
The emergence of in virtuo Science.
67
(Figure: free energy versus evolution (via single mutations); labeled states: Unstable (at high salt), 0, Stable (at high salt).)
68
My attempt to use finite difference
Poisson-Boltzmann for back-of-the-envelope
calculations
69
Possible strategy for larger proteins
Amino-acid sequence, no structure
Knowledge-based modeling (homology, etc.)
Correct topology, 10 Å uncertainty
Virtual water modeling
Mostly correct structure, 2-3 Å uncertainty
Real water modeling
Accurate structural model comparable to X-ray or
NMR (1 Å)
70
CONCLUSIONS
Molecular Dynamics methodology based on implicit
solvent model (modified Generalized Born) has
potential to predict full structure of a
protein from its amino-acid sequence. The
methodology IS NOT SPECIFIC TO PROTEIN
FOLDING, and can be applied to other problems in
molecular modeling, such as rational drug design.

71
Advantages of implicit solvation methodology
  • Can model larger systems on longer time-scales.
    Low computational cost (compared to explicit
    water representation).
  • The models describe instantaneous
    solvent dielectric response. Eliminates the
    need for lengthy equilibration of water necessary
    in explicit water simulations.
  • No viscosity. The molecule can explore
    the available conformational space much
    faster.
  • Corresponds to solvation in an infinite volume
    of solvent. No artifacts of replica
    interactions in periodic systems.
  • Solvent degrees of freedom are taken into
    account implicitly.
    Estimating energies of solvated
    structures is much more straightforward
    (than with explicit water models).
  • Analytical Generalized Born is easy to
    integrate into molecular dynamics programs.

72
Plans
1. Further development of the Implicit Solvation
methods.
   A. Better (but not slower) way to compute effective radii in the
      Generalized Born.
   B. Better functional form.
   C. Proper account for internal dielectric.
   D. Define electrostatic potential in GB (to tackle
      large systems like the ribosome, where PB chokes).
   E. Better account for the non-polar term,
      more than just E = σA.
2. Applications.
   A. Myoglobin dynamics on long time scales (oxygen uptake pathways).
   B. Two-state model of the photo-cycle.
   C. Folding of small proteins. Why do some proteins fold so fast?
   D. Dynamics of nucleosome folding/unfolding. Possible molecular
      mechanisms facilitating transcription/replication.