The Probabilistic Roadmap Approach to Study Molecular Motion

Slides: 57

Provided by: lato

Learn more at: http://robotics.stanford.edu

1
The Probabilistic Roadmap Approach to Study
Molecular Motion

Jean-Claude Latombe
Kwan Im Thong Hood Cho Temple Visiting Professor,
NUS
Kumagai Professor, Computer Science, Stanford

2
Molecular motion is an essential process of life
CspA
3
Understanding molecular motion could help cure
many diseases
Mad cow disease is caused by misfolding
Drug molecules act by binding to proteins
4
As few experimental tools are available,
computational tools are critical

Computer simulation
Monte Carlo simulation
Molecular Dynamics

5
But MD and MC simulation have two major drawbacks

Each simulation run yields a single pathway,
while molecules tend to move along many different
pathways

7
But MD and MC simulation have two major drawbacks

Each simulation run yields a single pathway,
while molecules tend to move along many different
pathways → Interest in ensemble
properties

8
Example of Ensemble Property: Probability of
Folding (pfold)
Measure kinetic distance to folded state
9
Other Examples of Ensemble Properties

Order of formation of secondary structure
elements
Average time for a ligand to escape a binding
site
Folding rate of a protein
Key intermediates along folding pathways
Etc ...

10
But MD and MC simulation have two major drawbacks

Each simulation run yields a single pathway,
while molecules tend to move along many different
pathways → Interest in ensemble properties
Each simulation run tends to waste much time in
local minima

11
Roadmap-Based Representation

Network of conformations connected by local
motion pathways
Compact representation of huge number of motion
pathways
Coarse resolution relative to MC and MD
simulation
Efficient algorithms for analyzing multiple
pathways

12
Roadmaps for Robot Motion Planning
13
Initial Work: Application of Roadmaps to Ligand
Binding
[A.P. Singh, J.C. Latombe, and D.L.
Brutlag. A Motion Planning Approach to Flexible
Ligand Binding. Proc. 7th Int. Conf. on
Intelligent Syst. for Molecular Biology (ISMB),
pp. 252-261, 1999]

The ligand is modeled as a flexible molecule,
but the protein is assumed rigid
A conformation of the ligand is defined by the
position and orientation of a group of 3 atoms
relative to the protein and by the torsional
angles of the ligand

14
Roadmap Construction (Node Generation)

Conformations of the ligand are sampled at random
around the protein
The energy E at each sampled conformation is
computed
$E = E_{interaction} + E_{internal}$
$E_{interaction}$ = electrostatic + van der Waals potential
$E_{internal} = \sum_{\text{non-bonded pairs of atoms}}$ (electrostatic + van der Waals)
A sampled conformation is retained as a node with
probability
$P = 0$ if $E > E_{max}$
$P = (E_{max} - E) / (E_{max} - E_{min})$ if $E_{min} \le E \le E_{max}$
$P = 1$ if $E < E_{min}$
→ Denser distribution of nodes in low-energy
regions of conformational space
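
The retention rule above can be sketched in a few lines of Python. This is only an illustration: the names `retention_probability`, `sample_nodes`, `sample_conformation` and `energy_fn` are hypothetical, and the sampler and energy function are left to the caller.

```python
import random

def retention_probability(energy, e_min, e_max):
    """Probability of keeping a sampled conformation as a roadmap node."""
    if energy > e_max:
        return 0.0
    if energy < e_min:
        return 1.0
    # Linear ramp between the two thresholds: lower energy -> kept more often.
    return (e_max - energy) / (e_max - e_min)

def sample_nodes(sample_conformation, energy_fn, e_min, e_max, n_samples, rng):
    """Draw n_samples conformations and keep each one with the probability above.

    sample_conformation(rng) and energy_fn(q) are caller-supplied (hypothetical) callables.
    """
    nodes = []
    for _ in range(n_samples):
        q = sample_conformation(rng)
        if rng.random() < retention_probability(energy_fn(q), e_min, e_max):
            nodes.append(q)
    return nodes
```

Because the acceptance test is applied independently to every sample, low-energy regions end up with more nodes without any explicit density control.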

15
Roadmap Construction (Edge Generation)

Each node is connected to each of its closest
neighbors by a straight edge
Each edge is discretized at some resolution ε
(1 Å)
If any E(qi) > Emax, then the edge is rejected

16
Roadmap Construction (Edge Generation)

Each node is connected to each of its closest
neighbors by a straight edge
Each edge is discretized at some resolution ε
(1 Å)
If all E(qi) ≤ Emax, then the edge is retained
and is assigned two weights w(q→q') and w(q'→q),
defined in terms of the
probability that the ligand moves from qi to
qi+1 when it is constrained to move along the
edge
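
A minimal sketch of the edge test, assuming conformations are plain coordinate tuples and that straight-line interpolation with Euclidean distance is acceptable (a simplification, since real ligand conformations mix positions, orientations and torsional angles). The names `discretize`, `edge_is_valid` and `energy_fn` are hypothetical, and the edge weights are not computed here.

```python
import math

def discretize(q_a, q_b, resolution):
    """Points q_0..q_n on the straight segment, spaced at most `resolution` apart."""
    length = math.dist(q_a, q_b)
    n = max(1, math.ceil(length / resolution))
    return [tuple(a + (b - a) * k / n for a, b in zip(q_a, q_b))
            for k in range(n + 1)]

def edge_is_valid(q_a, q_b, energy_fn, e_max, resolution=1.0):
    """Keep the edge only if every intermediate conformation has E(q_i) <= e_max."""
    return all(energy_fn(q) <= e_max for q in discretize(q_a, q_b, resolution))
```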

17
Querying the Roadmap

For a given goal node qg (e.g., binding
conformation), Dijkstra's single-source
algorithm computes the lowest-weight paths from
qg to each node (in either direction) in O(N
log N) time, where N = number of nodes
Various quantities can then be easily computed
in O(N) time, e.g., average weights of all
paths entering qg and of all paths leaving qg
(related to the binding and dissociation rates Kon and Koff)

Protein: Lactate dehydrogenase; Ligand: Oxamate (7
degrees of freedom)
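
The query step can be illustrated with a standard heap-based Dijkstra on a small directed roadmap. The graph below is made up for illustration, and averaging the lowest-weight path per node is an assumption about how the "average weights" are formed. Paths entering qg are obtained by running the same routine on the reversed graph.

```python
import heapq

def dijkstra(adjacency, source):
    """Lowest path weight from source to every reachable node.

    adjacency: {node: [(neighbor, weight), ...]} with non-negative weights.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adjacency.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def reverse_graph(adjacency):
    """Flip every directed edge so that paths *into* a node can be searched."""
    rev = {}
    for u, edges in adjacency.items():
        rev.setdefault(u, [])
        for v, w in edges:
            rev.setdefault(v, []).append((u, w))
    return rev

# Toy roadmap (illustrative values only); each edge carries one weight per direction.
roadmap = {
    "qg": [("a", 2.0), ("b", 5.0)],
    "a": [("qg", 1.0), ("b", 1.0)],
    "b": [("qg", 4.0), ("a", 3.0)],
}
leaving = dijkstra(roadmap, "qg")                  # qg -> node
entering = dijkstra(reverse_graph(roadmap), "qg")  # node -> qg
others = [n for n in roadmap if n != "qg"]
avg_leaving = sum(leaving[n] for n in others) / len(others)
avg_entering = sum(entering[n] for n in others) / len(others)
```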
18
Experiments on 3 Complexes

PDB ID: 1ldm
  Receptor: Lactate Dehydrogenase (2386 atoms, 309 residues)
  Ligand: Oxamate (6 atoms, 7 dofs)
PDB ID: 4ts1
  Receptor: Mutant of tyrosyl-transfer-RNA synthetase (2423 atoms, 319 residues)
  Ligand: L-leucyl-hydroxylamine (13 atoms, 9 dofs)
PDB ID: 1stp
  Receptor: Streptavidin (901 atoms, 121 residues)
  Ligand: Biotin (16 atoms, 11 dofs)

19
Computation of Potential Binding Conformations

Sample many (several 1000s) ligand
conformations at random around the protein
Repeat several times
Select lowest-energy conformations that are
close to protein surface
Resample around them
Retain k (10) lowest-energy conformations
whose centers of mass are at least 5Å apart
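
The final retention step could be sketched as a greedy pass over the conformations sorted by energy. The greedy order is an assumption (the slide does not say how ties between nearby candidates are resolved), and `select_binding_candidates` is a hypothetical name.

```python
import math

def select_binding_candidates(conformations, k=10, min_separation=5.0):
    """Pick up to k lowest-energy conformations with well-separated centers of mass.

    conformations: iterable of (energy, center_of_mass) pairs, centers as 3-tuples in Å.
    """
    selected = []
    for energy, center in sorted(conformations, key=lambda c: c[0]):
        # Accept only if far enough from every candidate already kept.
        if all(math.dist(center, other) >= min_separation for _, other in selected):
            selected.append((energy, center))
            if len(selected) == k:
                break
    return selected
```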

lactate dehydrogenase
20
Results for 1ldm

Some potential binding sites have slightly lower
energy than the active site → Energy is not a
discriminating factor for recognizing the active site
Average path weights (energetic difficulty) to
enter and leave the binding site are significantly
greater for the active site → Indicates that the
active site is surrounded by an energy barrier
that traps the ligand

21
Application of Roadmaps to Protein Folding
[N.M. Amato, K.A. Dill, and G. Song. Using Motion
Planning to Map Protein Folding Landscapes and
Analyze Folding Kinetics of Known Native
Structures. J. Comp. Biology, 10(2):239-255, 2003]

Known native state
Degrees of freedom: φ-ψ angles
Energy: van der Waals, hydrogen bonds,
hydrophobic effect
New idea: Sampling strategy

22
Sampling Strategy (Node Generation)

High dimensionality → non-uniform sampling
Conformations are sampled using a Gaussian
distribution around the native state
Conformations are sorted into bins by number of
native contacts (pairs of Cα atoms that are
close to each other in the native structure)
Sampling ends when all bins have a minimum number
of conformations → good coverage of
conformational space
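
A toy sketch of this bin-filling loop, under several assumptions: a conformation is a list of angles, the Gaussian perturbation is applied independently to each angle, a few standard deviations are mixed so that low-contact bins can also fill, and `count_native_contacts` is a caller-supplied (hypothetical) function.

```python
import random

def sample_until_bins_filled(native, count_native_contacts, n_bins, max_contacts,
                             min_per_bin, sigmas, rng, max_tries=100_000):
    """Sample around the native state until every contact-count bin is populated."""
    bins = [[] for _ in range(n_bins)]
    for _ in range(max_tries):
        if all(len(b) >= min_per_bin for b in bins):
            break
        sigma = rng.choice(sigmas)  # small sigma: near-native; large sigma: unfolded
        q = [angle + rng.gauss(0.0, sigma) for angle in native]
        contacts = count_native_contacts(q)
        # Map a contact count in 0..max_contacts to a bin index in 0..n_bins-1.
        index = min(n_bins - 1, contacts * n_bins // (max_contacts + 1))
        bins[index].append(q)
    return bins
```

The `max_tries` cap is only a safeguard for this sketch; if it is hit, some bins may hold fewer than `min_per_bin` conformations.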

23
Application: Order of Formation of Secondary
Structure Elements

The lowest-weight path is extracted from each
denatured conformation to the folded one
The order of formation of SSEs is computed along
each path
The formation order that appears the most often
over all paths is considered the SSE formation
order of the protein
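
The majority vote over paths can be sketched with a counter. It assumes each path has already been reduced to an ordered list of SSE names; `most_frequent_formation_order` is a hypothetical name.

```python
from collections import Counter

def most_frequent_formation_order(orders):
    """Return (order, count) for the formation order seen on the most paths.

    orders: one sequence of SSE names per lowest-weight path.
    """
    counts = Counter(tuple(order) for order in orders)
    return counts.most_common(1)[0]
```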

24
Order of Formation of Secondary Structures along
a Path

The contact matrix showing the time step when
each native contact appears is built

25
Protein CI2 (1 α + 4 β)
26
Protein CI2 (1 α + 4 β)
27
Order of Formation of Secondary Structures along
a Path

The contact matrix showing the time step when
each native contact appears is built
The time step at which a structure appears is
approximated as the average of the appearance
time steps of its contacts
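
A small sketch of this computation, assuming a path is given as one set of native contacts per time step and that "appears" means the first step at which a contact is present. All function names and the data layout are hypothetical.

```python
def contact_appearance_steps(path_contacts):
    """First time step at which each native contact is present along a path."""
    first_seen = {}
    for step, contacts in enumerate(path_contacts):
        for contact in contacts:
            first_seen.setdefault(contact, step)
    return first_seen

def structure_formation_step(first_seen, structure_contacts):
    """Average appearance step over the contacts of one secondary structure."""
    steps = [first_seen[c] for c in structure_contacts if c in first_seen]
    return sum(steps) / len(steps) if steps else None

def formation_order(first_seen, structures):
    """Structure names sorted by their approximate formation step."""
    times = {name: structure_formation_step(first_seen, contacts)
             for name, contacts in structures.items()}
    formed = [name for name in times if times[name] is not None]
    return sorted(formed, key=lambda name: times[name])
```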

28
α forms at time step 122 (II)
β3 and β4 come together at 187 (V)
β2 and β3 come together at 210 (IV)
β1 and β4 come together at 214 (III)
Protein CI2 (1 α + 4 β)
30
Comparison with Experimental Data
31
Stochastic Roadmaps
[M.S. Apaydin, D.L. Brutlag,
C. Guestrin, D. Hsu, J.C. Latombe and C. Varma.
Stochastic Roadmap Simulation: An Efficient
Representation and Algorithm for Analyzing
Molecular Motion. J. Comp. Biol.,
10(3-4):257-281, 2003]

New Idea: Capture the stochastic nature of
molecular motion by assigning probabilities to
edges

32
Edge Probabilities
Follow the Metropolis criterion
Self-transition probability
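
The slide's formulas are not in the transcript, so the following is only a sketch of one common Metropolis-style assignment. It assumes a uniform proposal over the N_i neighbors of node i, acceptance exp(-ΔE/kT) for uphill moves, and a self-transition that absorbs the remaining probability mass. The function name and data layout are hypothetical.

```python
import math

def transition_probabilities(energies, neighbors, kT=1.0):
    """Build {i: {j: P_ij}} including the self-transition P_ii.

    energies: {node: energy}; neighbors: {node: [adjacent nodes]} (no self-loops).
    """
    P = {}
    for i, nbrs in neighbors.items():
        row = {}
        for j in nbrs:
            dE = energies[j] - energies[i]
            accept = math.exp(-dE / kT) if dE > 0 else 1.0  # Metropolis acceptance
            row[j] = accept / len(nbrs)                      # uniform proposal (assumed)
        row[i] = 1.0 - sum(row.values())                     # self-transition
        P[i] = row
    return P
```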
33
Stochastic Roadmap Simulation
[Figure: roadmap; labels V, Pij]
34
Roadmap as Markov Chain
[Figure: nodes i and j, edge labeled Pij]

Transition probability Pij depends only on i and
j

35
Probability of Folding pfold
Unfolded state
Folded state
36
First-Step Analysis
Let $f_i = \mathrm{pfold}(i)$. After one step:
$f_i = P_{ii} f_i + P_{ij} f_j + P_{ik} f_k + P_{il} f_l + P_{im} f_m$
37
First-Step Analysis

One linear equation per node
Solution gives pfold for all nodes
No explicit simulation run
All pathways are taken into account
Sparse linear system

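[Figure: node i with self-transition Pii and edges Pij, Pik, Pil, Pim to nodes j, k, l, m]
Let $f_i = \mathrm{pfold}(i)$. After one step:
$f_i = P_{ii} f_i + P_{ij} f_j + P_{ik} f_k + P_{il} f_l + P_{im} f_m$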
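
First-step analysis amounts to solving one linear equation per node with f = 1 on folded nodes and f = 0 on unfolded nodes. The sketch below uses a small dense Gaussian elimination for readability; a real roadmap would call for a sparse solver. Names and the `{i: {j: P_ij}}` layout are hypothetical.

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting (dense; for small examples)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def pfold(P, folded, unfolded):
    """Solve f_i = sum_j P_ij f_j with f = 1 on folded and f = 0 on unfolded nodes."""
    free = [i for i in P if i not in folded and i not in unfolded]
    index = {node: k for k, node in enumerate(free)}
    A = [[0.0] * len(free) for _ in free]
    b = [0.0] * len(free)
    for i in free:
        r = index[i]
        A[r][r] += 1.0
        for j, p in P[i].items():
            if j in folded:
                b[r] += p                 # known value f_j = 1 moves to the right side
            elif j not in unfolded:
                A[r][index[j]] -= p       # unknown f_j (includes the P_ii term)
    f = solve_linear(A, b)
    result = {i: f[index[i]] for i in free}
    result.update({i: 1.0 for i in folded})
    result.update({i: 0.0 for i in unfolded})
    return result
```

On a four-node chain U - 1 - 2 - F with symmetric transitions, this gives pfold(1) = 1/3 and pfold(2) = 2/3, as expected for an unbiased walk.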
38
Number of Self-Avoiding Walks on a 2D Grid
1, 2, 12, 184, 8512, 1262816, 575780564,
789360053252, 3266598486981642,
(10x10) 41044208702632496804,
(11x11) 1568758030464750013214100,
(12x12) 182413291514248049241470885236 > 10^28
http://mathworld.wolfram.com/Self-AvoidingWalk.html
39
In contrast

Computing pfold with MC simulation requires:
For every conformation q of interest:
Perform many MC simulation runs from q
Count the number of times the folded state F is attained first
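
For comparison, the simulation-based estimate can be sketched on a toy Markov chain (an assumption: the slide refers to MC runs on the energy landscape, not on a precomputed chain). Each start node needs its own batch of runs, and the estimate carries sampling noise. `mc_pfold` is a hypothetical name.

```python
import random

def mc_pfold(P, start, folded, unfolded, n_runs, rng):
    """Fraction of random walks from `start` that reach F before U.

    P: {i: {j: P_ij}}; every run ends when it enters `folded` or `unfolded`.
    """
    hits = 0
    for _ in range(n_runs):
        state = start
        while state not in folded and state not in unfolded:
            targets = list(P[state].keys())
            weights = list(P[state].values())
            state = rng.choices(targets, weights=weights)[0]
        if state in folded:
            hits += 1
    return hits / n_runs
```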

40
Computational Tests

1ROP (repressor of primer)
2 α helices
6 DOF

1HDD (Engrailed homeodomain)
3 α helices
12 DOF

H-P energy model with steric clash exclusion [Sun
et al., 95]
41
pfold for β hairpin
Immunoglobulin binding protein (Protein G)
Last 16 amino acids
Cα based representation
Go model energy function
42 DOFs
[Zhou and Karplus, 99]
42
Correlation with MC Approach
1ROP
43
Computation Times (β hairpin)
Monte Carlo (30 simulations):
Over 10^7 energy computations
10 hours of computer time
1 conformation
Roadmap:
50,000 energy computations
23 seconds of computer time
2000 conformations
6 orders of magnitude speedup!
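
The speedup figure can be checked with a short calculation, assuming it is meant as computer time per conformation (an interpretation, not stated on the slide).

```python
import math

# Time per conformation for each method, from the figures on the slide.
mc_seconds_per_conformation = 10 * 3600 / 1        # 10 hours for 1 conformation
roadmap_seconds_per_conformation = 23 / 2000       # 23 seconds for 2000 conformations

speedup = mc_seconds_per_conformation / roadmap_seconds_per_conformation
orders_of_magnitude = math.log10(speedup)          # between 6 and 7
```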
44
Using Path Sampling to Construct Roadmaps
[N. Singhal, C.D. Snow, and V.S. Pande. Using Path
Sampling to Build Better Markovian State Models:
Predicting the Folding Rate and Mechanism of a
Tryptophan Zipper Beta Hairpin. J. Chemical
Physics, 121(1):415-425, 2004]

New idea:
Paths computed with Molecular Dynamics
simulation techniques are used to create the
nodes of the roadmap
→ More pertinent/better distributed nodes
→ Edges are labeled with the time needed to
traverse them

45
Sampling Nodes from Computed Paths (Path Shooting)
F
U
47
Node Merging

If two nodes are closer than some ε, they
are merged into a single roadmap node
Rules are applied to update edge probabilities
and times

48
Application: Computation of MFPT

Mean First Passage Time (MFPT): the average time at which a
protein first reaches its folded state
First-Step Analysis yields
$\mathrm{MFPT}(i) = \sum_j P_{ij} \times (t_{ij} + \mathrm{MFPT}(j))$
$\mathrm{MFPT}(i) = 0$ if $i \in F$
Assuming first-order kinetics, the folding time is
exponentially distributed with rate r,
where r is the folding rate
MFPT = 1/r
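
The MFPT equations can be solved the same way as pfold. The sketch below uses a simple in-place fixed-point iteration instead of a direct solver, which is an illustrative choice and converges only when every node can reach F. The name `mean_first_passage_times` and the nested-dict layout for P and t are hypothetical.

```python
def mean_first_passage_times(P, t, folded, tol=1e-12, max_iter=100_000):
    """Iterate MFPT(i) = sum_j P_ij * (t_ij + MFPT(j)), with MFPT = 0 on folded nodes.

    P: {i: {j: P_ij}}; t: {i: {j: t_ij}} edge traversal times; folded: set F.
    """
    mfpt = {i: 0.0 for i in P}
    for _ in range(max_iter):
        delta = 0.0
        for i in P:
            if i in folded:
                continue  # boundary condition MFPT(i) = 0
            new = sum(p * (t[i][j] + mfpt[j]) for j, p in P[i].items())
            delta = max(delta, abs(new - mfpt[i]))
            mfpt[i] = new
        if delta < tol:
            break
    return mfpt
```

Under the first-order assumption, a folding rate estimate would then be r = 1 / MFPT for the chosen start node.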

49
Computational Test

12-residue tryptophan zipper beta hairpin (TZ2)
Folding@Home used to generate trajectories (fully
atomistic simulation) ranging from 10 to 450 ns
1750 trajectories (14 reaching folded state)
→ 22,400-node roadmap
MFPT: 2-9 μs, which is similar to experimental
measurements (from fluorescence and IR)

50
Conformational Analysis of Protein Loops
[J. Cortés, T. Siméon, M. Renaud-Siméon, and V. Tran.
Geometric Algorithms for the Conformational
Analysis of Long Protein Loops. J. Comp.
Chemistry, 25:956-967, 2004]

New idea
Explore the clash-free subset of the
conformational space of a loop, by building a
tree-shaped roadmap
Kinematic model: φ-ψ angles on the backbone + χi
torsional angles in side-chains

Amylosucrase (AS)
- Only enzyme in its family that acts on
sucrose substrate
- The 17-residue loop (named loop 7) between
Gly433 and Gly449 is
believed to play a pivotal role

52
Roadmap Construction

A tree-shaped roadmap is created from a start
conformation qstart
At each step of the roadmap construction, a
conformation qrand of the loop is picked at
random, and a new roadmap node is created by
iteratively pulling toward it the existing node
that is closest to qrand

53
Roadmap Construction
[Figure: tree rooted at qstart; labels C, Cfree, Cclosed]
Stops when one can't get closer to qrand or a
clash is detected
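
The tree expansion described on these two slides can be sketched on a toy problem. This sketch assumes conformations are coordinate tuples with Euclidean distance and ignores the loop-closure constraint (Cclosed) that the real method must maintain. `extend_toward`, `build_tree`, `sample` and `is_clash_free` are hypothetical names.

```python
import math
import random

def extend_toward(q_near, q_rand, is_clash_free, step=0.1):
    """Pull q_near toward q_rand until it is reached or the next step would clash."""
    q = q_near
    while True:
        d = math.dist(q, q_rand)
        if d < 1e-9:
            return q                      # cannot get any closer
        s = min(step, d)
        q_next = tuple(a + (b - a) * s / d for a, b in zip(q, q_rand))
        if not is_clash_free(q_next):
            return q                      # clash detected: keep the last valid point
        q = q_next

def build_tree(q_start, sample, is_clash_free, n_iterations, rng, step=0.1):
    """Grow a tree-shaped roadmap; returns {node: parent} with the root mapped to None."""
    parent = {q_start: None}
    for _ in range(n_iterations):
        q_rand = sample(rng)
        q_near = min(parent, key=lambda q: math.dist(q, q_rand))  # closest existing node
        q_new = extend_toward(q_near, q_rand, is_clash_free, step)
        if q_new != q_near and q_new not in parent:
            parent[q_new] = q_near
    return parent
```

The linear scan for the closest node keeps the sketch short; a spatial index would be the usual choice for larger trees.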
54
Computational Results

Surprisingly, loop 7 can't move much
Main bottleneck is residue Asp231

Positions of the Cα atom of the middle residue
(Ser441)
55
Computational Results

If residue Asp231 is removed, then loop 7's
mobility increases dramatically. The Cα atom of
Ser441 can be displaced by more than 9 Å from its
crystallographic position

56
Conclusion

Probabilistic roadmaps are a recent, but
promising tool for exploring conformational
spaces and computing ensemble properties of
molecular pathways
Current/future research
Better sampling strategies able to handle more
complex molecular models (protein-protein
binding)
More work to include time information in
roadmaps
More thorough experimental validation to compare
computed and measured quantitative properties
