Title: Energy Maintenance for Molecular Simulation
1Energy Maintenancefor Molecular Simulation
kinematics energy ? motion structure
Main computational issue Proximity computation
2Energy
Function defined over large dimensionalconformati
on space
3Energy Function
- E ES EQ ESQ ETor EvdW Edipole
-
4Energy Function
- E ES EQ ESQ ETor EvdW Edipole
-
EvdW
5Role of vdW Terms
- vdW terms ? maze of in conformational space
- Other terms steer the molecule in this maze
6Heuristic Energy Terms(e.g., Go Models)
7Interaction with Solvent
- Explicit solvent models 100s or 1000s of
discrete solvent molecules - Implicit solvent models solvent as continuous
medium, interface is solvent-accessible surface
8Energy Function
- E S bonded terms S non-bonded terms
S solvation terms - Bonded terms - Relatively few
- Non-bonded terms - Depend on distances between
pairs of atoms - Quadratic number ? Expensive to
compute - Solvation terms- May require computing molecular
surface
9Energy Function
- E S bonded terms S non-bonded terms
S solvation terms - Bonded terms - Relatively few
- Non-bonded terms - Depend on distances between
pairs of atoms - Quadratic number ? Expensive to
compute - Solvation terms- May require computing molecular
surface
10Uses of Energy Function
- Generate energetically plausible conformations
sample (at random), minimize, cluster - Generate meaningful distributions (e.g.,
Boltzman) of conformations Monte Carlo
simulation - Generate motion pathways to study molecular
kinetics molecular dynamics, MC simulation
11Monte Carlo Simulation (MCS)
- Popular approach to study thermodynamic and
kinetic properties of proteins - Random walk through conformation space
- At each cycle
- Perturb current conformation at random
- Accept step with probability
-
- (Metropolis acceptance criterion)
- The conformations generated by an arbitrarily
long MCS are Boltzman distributed, i.e.,
conformations in V
12Uses of Energy Function
- Generate energetically plausible conformations
sample (at random), minimize, cluster - Generate meaningful distributions (e.g.,
Boltzman) of conformations Monte Carlo
simulation - Generate motion pathways to study molecular
kinetics molecular dynamics, MC simulation - ? One issue in common Energy must be evaluated
frequently E.g., MD and MC simulation runs may
consist of millions of steps, each
13Uses of Energy Function
- Generate energetically plausible conformations
sample (at random), minimize, cluster - Generate meaningful distributions (e.g.,
Boltzman) of conformations Monte Carlo
simulation - Generate motion pathways to study molecular
kinetics molecular dynamics, MC simulation - ProblemHow to efficiently compute and update
energy during minimization and simulation?
14Non-Bonded Energy Terms
- Quadratic number of pairs of atoms
- Energy terms go to 0 when distance increases ?
Cutoff distance (6 - 12Å) - vdW forces prevent atoms from bunching up ?
Only O(n) interacting pairs
HalperinOvermars 98 - Problems
- How can we find the interacting pairs without
enumerating all atom pairs? - How can we detect atomic clashes quickly?
- Main computational issue Proximity computation
15Grid Method
- Subdivide 3-space into cubic cells
- Compute cell that contains each atom center
- Represent grid as hashtable
16Grid Method
- O(n) time to build grid
- O(1) time to find interactive pairs for each atom
- T(n) to find all interactive pairs of atoms
HalperinOvermars, 98 - Asymptotically optimal in worst-case
17Energy Update
- Compare the interacting pairs at new step with
those at previous step - For every pair that has disappeared, subtract the
corresponding energy term from energy value - For every new pair, add the corresponding energy
term to energy value - Takes T(n) time, even if very few pairs have
changed
18The grid method is unable to recognize and re-use
such partial sums
19Grid Method
- O(n) time to build grid
- O(1) time to find interactive pairs for each atom
- T(n) to find all interactive pairs of atoms
HalperinOvermars, 98 - Asymptotically optimal in worst-case
- But
- - Energy partial sums?
- - Atomic clashes?
- second grid with small cutoff distance
20Grid Method ? Surface Halperin and Shelton, 97
- Each sphere intersects O(1) spheres
- Computing each atoms contribution to molecular
surface takes O(1) time - Computation of molecular surface takes T(n) time
- ? implicit solvation term in T(n) time
21General Problem
- Molecules form geometrically complex objects that
deform and move relative to each other - (Self-)collision detection
- Distance computation
- Several computational approaches
- Space occupancy grid, octree
- Tracking pairs of closest features
- Polynomial equation
- Bounding-volume hierarchies (BVH)
- Spanners
22Bounding Volume Hierarchies (BVHs)
- Outline
- Case of rigid objects
- Bounding volume (BV)
- BV hierarchy (BVH)
- Types of BVs
- Collision detection with BVHs
- Distance computation
- Application to deformable objects
- Application to protein simulation
23Basic Problem
- Given the geometric models and relative positions
of two objects, determine whether they overlap
24Basic Problem
- Given the geometric models and relative positions
of two objects, determine whether they overlap
distance 0 ? collision
25Applications
- Computer graphics simulation
- Robotics
- Haptics
26(No Transcript)
27Basic Idea of Solution
- Enclose objects into bounding volumes (spheres
or boxes) - Check the bounding volumes first
28Basic Idea of Solution
- Enclose objects into bounding volumes (spheres
or boxes) - Check the bounding volumes first
- Decompose an object into two
29Basic Idea of Solution
- Enclose objects into bounding volumes (spheres
or boxes) - Check the bounding volumes first
- Decompose an object into two
- Proceed hierarchically
30Basic Idea of Solution
- Enclose objects into bounding volumes (spheres
or boxes) - Check the bounding volumes first
- Decompose an object into two
- Proceed hierarchically
31Bounding Volume Hierarchy (BVH)
- BVH is pre-computed for each object
- BVH is typically a balanced binary tree
32BVH in 3D
33Collision Detection
Two objects described by their precomputed BVHs
34Collision Detection
Search tree
AA
A
A
35Collision Detection
Search tree
AA
A
A
36Collision Detection
Search tree
AA
CC
CB
BC
BB
37Collision Detection
Search tree
AA
CC
CB
BC
BB
G
D
38Variant
Search tree
A
A
39Collision Detection
- Pruning discards subsets of the two objects that
are separated by the BVs - Each path is followed until pruning or until two
leaves overlap - When two leaves overlap, their contents are
tested for overlap
40Search Strategy and Heuristics
- If there is no collision, all paths must
eventually be followed down to pruning or a leaf
node - But if there is collision, it is desirable to
detect it as quickly as possible - ? Greedy best-first search strategy with f(N)
d/(rXrY) Expand the node XY with
largest relative overlap (most likely to
contain a collision)
41Recursive (Depth-First) Collision Detection
Algorithm
- Test(A,B)
- If A and B do not overlap, then return 1
- If A and B are both leaves, then return 0 if
their contents overlap and 1 otherwise - Switch A and B if A is a leaf, or if B is bigger
and not a leaf - Set A1 and A2 to be As children
- If Test(A1,B) 1 then return Test(A2,B) else
return 0
42Performance
- Several thousand collision checks per second for
2 three-dimensional objects each described by
500,000 triangles, on a 1-GHz PC
43Greedy Distance Computation(same recursion as
collision detection)
- Greedy-Distance(A,B)
- If dist(A,B) gt 0, then return dist(A,B)
- If A and B are both leaves, then return distance
between their contents - Switch A and B if A is a leaf, or if B is bigger
and not a leaf - Set A1 and A2 to be As children
- d1 ? Greedy-Distance(A1,B)
- If d1 gt 0 then
- d2 ? Greedy-Distance(A2,B)
- If d2 gt 0 then return Min(d1,d2)
- Return 0
44Exact Distance Computation
M (upper bound on distance) is initialized to
very large number
- Distance(A,B)
- If dist(A,B) gt M, then return M
- If A and B are both leaves, then
- d ? distance between their contents
- Return Min(d,M)
- Switch A and B if A is a leaf, or if B is bigger
and not a leaf - Set A1 and A2 to be As children
- M ? Distance(A1,B)
- If M gt 0 then return Distance(A2,B)
- Else return 0
45Approximate Distance Computation
M (upper bound on distance) is initialized to
very large number
- Approx-Distance(A,B) ? da da ? de and de-da ?
ade - If dist(A,B) gt M, then return M
- If A and B are both leaves, then
- d ? distance between their contents
- If d lt M then return (1-a)?d else return M
- Switch A and B if A is a leaf, or if B is bigger
and not a leaf - Set A1 and A2 to be As children
- M ? Approx-Distance(A1,B)
- If M gt 0 then return Approx-Distance(A2,B)
- Return 0
46Approximate Distance Computation
M (upper bound on distance) is initialized to
very large number
- Approx-Distance(A,B) ? da da ? de and de-da ?
ade - If dist(A,B) gt M, then return M
- If A and B are both leaves, then
- d ? distance between their contents
- If d lt M then return (1-a)?d
- Switch A and B if A is a leaf, or if B is bigger
and not a leaf - Set A1 and A2 to be As children
- M ? Approx-Distance(A1,B)
- If M gt 0 then return Approx-Distance(A2,B)
- Return 0
Garanteed to return an approximate distance
between (1-?)d and d
47- Collision detection
- lt Greedy distance computation
- lt 0.5 Approximate distance computation
- ltlt Exact distance computation
- lt slightly faster
- ltlt much faster
48Desirable Properties of BVs and BVHs
- BVs
- Tightness
- Efficient testing
- Invariance
- BVH
- Separation
- Balanced tree
49Desirable Properties of BVs and BVHs
- BVs
- Tightness
- Efficient testing
- Invariance
- BVH
- Separation
- Balanced tree
50Spheres
- Invariant
- Efficient to test
- But tight?
51Axis-Aligned Bounding Box (AABB)
52Axis-Aligned Bounding Box (AABB)
- Not invariant
- Efficient to test
- Not tight
53Oriented Bounding Box (OBB) Gottschalk, Lin,
and Manocha, 96
54Oriented Bounding Box (OBB)
- Invariant
- Less efficient to test
- Tight
55Rectangle Swept Spheres (RSS)
? Efficient distance computation
56Computation of Distance Between Two RSSs
- Compute the distance between the two underlying
rectangles - Subtract the growing radius
57Comparison of BVs
No type of BV is optimal for all situations
58Computation of a BV Sphere
- Each intermediate sphere encloses the geometry
contained in its descendant leaf nodes - Simple solution Compute each intermediate sphere
to minimally enclose its two children - Tighter-fitting solution each intermediate
sphere is computed to minimally enclose the
spheres leaf descendants Welzl, 91 ? expected
O(N) time
59Computation of an OBBGottschalk, Lin, and
Manocha, 96
- N points ai (xi, yi, zi)T, i 1,, N
- SVD of A (a1 a2 ... aN)
- ? A UDVT where
- D diag(s1,s2,s3) such that s1 ? s2 ? s3 ? 0
- U is a 3x3 rotation matrix that defines the
principal axes of variance of the ais ? OBBs
directions - The OBB is defined by max and min coordinates of
the ais along these directions - Possible improvements use vertices of convex
hull of the ais or dense uniform sampling of
convex hull
rotation described by matrix U
60OBB of a Collection of Spheres
- Compute the OBB of the centers
- Grow the OBB by moving each of its faces
outwardby the atom radius
61Computation of an RSSLarsen, Gottschalk, Lin,
and Manocha, 00
- Similar to OBB. Compute the two principal axes of
variance of the ais (atom centers) - Project all ais into the plane P defined by
these two directions - Compute minimum enclosing rectangle R contained
in P and aligned with these directions - Grow R by half the length of the interval spanned
by the ais along the direction perpendicular to
P increased by the atom radius
62Desirable Properties of BVs and BVHs
- BVs
- Tightness
- Efficient testing
- Invariance
- BVH
- Separation
- Balanced tree
63Desirable Properties of BVs and BVHs
- BVs
- Tightness
- Efficient testing
- Invariance
- BVH
- Separation
- Balanced tree
64Construction of a BVH
- Top-down recursive algorithm from the root to the
leaves - At each step, create the two children of a BV
65Subdivision of a Sphere BV
- Split longest axis of AABB at mid or median point
- Median point guarantees balanced BVH, but takes
slightly more time to compute
66Subdivision of an OBB/RSS
- Split longest axis at mid or median point
67Application to Deformable Objects
- The BVH computed for some initial or nominal
geometry may become useless
68Application to Deformable Objects
- The BVH computed for some initial or nominal
geometry may become useless - ? Group pieces hierarchically based on
topological rather than geometric proximity - Topological proximity is
- invariant
- implies geometric proximity (converse is not true)
69Particular Case Long Chain
70Application to Deformable Objects
- The BVH computed for some initial or nominal
geometry may become useless - ? Group pieces hierarchically based on
topological rather than geometric proximity - Topological proximity is
- invariant
- implies geometric proximity (converse is not
true) - ? BVH with fixed topology, but BVs must still be
adjusted in size and position - Self-collision detection is done by testing a BVH
against itself
71Particular Case Long Chain
- A chain of spheres is well-behaved iff
- The ratio of the radii of the largest and
smallest spheres is less than some g - The distance between any two spherecenters is
greater than some d
72Application to Monte Carlo Simulation of
Proteins(ChainTree)
I. Lotan, D. Halperin, F. Schwarzer and J.C.
Latombe. Algorithm and Data Structures for
Efficient Energy maintenance During Monte Carlo
Simulation of Proteins, J. Computational Biology,
2004
73Monte Carlo Simulation (MCS)
- Random walk through conformation space
- At each cycle
- - Perturb current conformation at random
- Accept step with probability
- Problem Update energy value
74Energy Function
- E S bonded terms S non-bonded terms
S solvation terms - Bonded terms - Relatively few
- Non-bonded terms - Depend on distances between
pairs of atoms - Quadratic number ? Expensive to
compute - Solvation terms- May require computing molecular
surface
75Non-Bonded Energy Terms
- They go to 0 when distance increases ? Use
cutoff distance (6 - 12Å) - vdW forces prevent atoms from bunching up ? Only
O(n) interacting pairs HalperinOvermars 98
Problem How to find these interacting pairs
without enumerating all atom pairs?
76Can We Do Better on Average than Grid method?
- Few DOFs are changed at each MC step
77Can We Do Better on Average than Grid method?
- Few DOFs are changed at each MC step
simulationof 100,000 attempted steps
78Can We Do Better on Average?
- Few DOFs are changed at each MC step
- Proteins are long chain kinematics
- Long sub-chains stay rigid at each step
- Many partial energy sums remain constant
Problem How to retrieve the unchanged partial
sums?
79ChainTree(Twofold Hierarchy BVs Transforms)
links
80ChainTree(Twofold Hierarchy BVs Transforms)
joints
81Updating the ChainTree
- Update path to root
- Recompute transforms that shortcut the DOF
change - Recompute BVs that contain the DOF change
- O(k (log(n/k)1)) work for k changes
82Finding Interacting Pairs
??
83Finding Interacting Pairs
84Finding Interacting Pairs
- Do not search inside rigid sub-chains (unmarked
nodes) - Do not test two nodes with no marked node between
them
85Finding Interacting Pairs
- Do not search inside rigid sub-chains (unmarked
nodes) - Do not test two nodes with no marked node between
them
86EnergyTree
E(N,N)
E(K.L)
E(M,M)
E(L,L)
E(J,L)
87EnergyTree
E(N,N)
E(K.L)
E(M,M)
E(L,L)
E(J,L)
88Computational Complexity
- n total number of DOFs
- k number of DOF changes at each MCS step
- k ltlt n
- Complexity of
- updating ChainTree O(k (log(n/k)1))
- finding interacting pairs O(n4/3) but performs
much better in practice!!!
89Experimental Setup
- Energy function
- Van der Waals
- Electrostatic
- Attraction between native contacts
- Cutoff at 12Å
- 300,000 steps MCS with Grid and ChainTree
- Steps are the same with both methods
- Early rejection for large vdW terms
90Results 1-DOF change
12.5
7.8
speedup
5.8
3.5
amino acids
91Results 5-DOF change
5.9
speedup
4.5
3.4
2.2
92Two-Pass ChainTree (ChainTree)
1st pass small cutoff distance to detect steric
clashes 2nd pass Normal cutoff distance
Tests around native state
gt5
93Interaction with Solvent
- Explicit solvent models 100s or 1000s of
discrete solvent molecules - Implicit solvent models solvent as continuous
medium, interface is solvent-accessible surface
E. Eyal, D. Halperin. Dynamic Maintenance of
Molecular Surfaces under Conformational Changes.
http//www.give.nl/movie/publications/telaviv/EH0
4.pdf
94Conclusion
- ChainTree significantly reduces average time of
MCS for proteins (vs. grid) - It exploits
- Atomic exclusion
- Cutoff distance on potentials
- Chain kinematics of protein
- Small of DOF changes at each MC step
- Larger speed-up for bigger proteins and smaller
of simultaneous DOF changes - Extension to updating protein surface
- http//robotics.stanford.edu/itayl/mcs
Already exploitedby grid method