Title: Analysis of PRMs for Computational Biology
1Analysis of PRMs for Computational Biology
- Shawna Thomas, Randomized Algorithms, 12/5/03
2Motion Planning and PRMs
- The motion planning problem
- Probabilistic Roadmap Methods (PRMs)
[Figure: configuration space showing the start, the goal, and several C-obstacles (C-obst)]
3Application to Computational Biology
- Protein Folding: how does a protein fold from an unstructured configuration to its final native/most stable structure?
- Ligand Binding: what are potential binding sites on a protein for the ligand/drug molecule?
- RNA Folding: what are the folding kinetics of an RNA molecule (e.g., population kinetics, transition states, folding rates)?
5PRMs for Computational Biology
- Use the same PRM technique as for traditional robotics problems.
- Replace the collision check with an energy calculation.
- Nodes are accepted based on the following probability:
[Figure: potential energy plotted over the C-space]
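The acceptance probability itself is not shown on this slide. As a minimal sketch of how an energy calculation can stand in for a collision check, the Python below assumes a piecewise-linear acceptance rule of the kind used in PRM-based protein folding work: always accept below a low-energy threshold, never accept above a high-energy threshold, and interpolate linearly in between. The thresholds `e_min` and `e_max`, and the callables `sample_configuration` and `potential_energy`, are hypothetical names introduced for illustration only.

```python
import random


def acceptance_probability(energy, e_min, e_max):
    """Assumed piecewise-linear rule (not taken from the slides).

    Requires e_min < e_max.
    """
    if energy < e_min:
        return 1.0
    if energy > e_max:
        return 0.0
    return (e_max - energy) / (e_max - e_min)


def sample_nodes(sample_configuration, potential_energy, n_attempts,
                 e_min, e_max, rng=None):
    """Draw n_attempts configurations and keep each one with a
    probability that depends on its potential energy."""
    rng = rng or random.Random()
    nodes = []
    for _ in range(n_attempts):
        q = sample_configuration(rng)      # hypothetical sampler
        p = acceptance_probability(potential_energy(q), e_min, e_max)
        if rng.random() < p:               # probabilistic accept
            nodes.append(q)
    return nodes
```

Unlike a collision check, which returns a hard yes or no, this rule retains some higher-energy nodes, which is one way to read the "fuzzy" C-space idea.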
6C-Spaces for Computational Biology
[Figure: two panels, "Fuzzy C-space" and "Traditional C-space"]
7Analysis of Failure Probability
- Bound using Min. Path Clearance
- Bound using Varying Path Clearance
9Bound based on Path Clearance
- Idea: cover $\gamma$ with a few overlapping balls of radius $R/2$.
- Put ball centers at $x_0 = a, x_1, \ldots, x_n = b$ so that the distance between adjacent ball centers is less than $R/2$.
- The planner will succeed if it samples at least one node in each ball.
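The covering idea can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the construction from the talk: the path is taken to be a straight segment in the plane from `a` to `b`, and the number of intervals is chosen as `floor(2L/R) + 1` so that adjacent centers are strictly closer than `R/2`. The function names are hypothetical.

```python
import math


def ball_centers(a, b, R):
    """Centers x_0 = a, ..., x_n = b on the segment from a to b,
    spaced so that adjacent centers are closer than R/2."""
    L = math.dist(a, b)                  # path length (straight segment)
    n = math.floor(2 * L / R) + 1        # smallest n with L/n < R/2
    return [(a[0] + (b[0] - a[0]) * j / n,
             a[1] + (b[1] - a[1]) * j / n) for j in range(n + 1)]


def every_ball_sampled(centers, samples, R):
    """True if each ball of radius R/2 around x_1, ..., x_{n-1}
    contains at least one sample."""
    interior = centers[1:-1]
    return all(any(math.dist(c, s) < R / 2 for s in samples)
               for c in interior)
```

For example, `ball_centers((0, 0), (4, 0), 1.0)` yields centers along the x-axis, and `every_ball_sampled` reports whether a given set of sampled nodes hits every interior ball, which is the success condition stated above.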
10A Little Geometry
- Let $x_j$ and $x_{j+1}$ be two points along $\gamma$ whose distance is less than $R/2$.
- For any two points $c \in B_{R/2}(x_j)$ and $d \in B_{R/2}(x_{j+1})$, the line segment $cd$ is contained inside $B_R(x_j)$ and is therefore valid.
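The containment claim follows from the triangle inequality. The short derivation below spells it out, assuming (as the path-clearance setting suggests) that every point of $\gamma$ has clearance at least $R$, so that $B_R(x_j)$ lies entirely in valid space.

```latex
% d lies in the small ball around x_{j+1}, and the centers are close:
\|d - x_j\| \;\le\; \|d - x_{j+1}\| + \|x_{j+1} - x_j\|
          \;<\; \tfrac{R}{2} + \tfrac{R}{2} \;=\; R
\quad\Longrightarrow\quad d \in B_R(x_j).

% c lies in the small ball around x_j itself:
\|c - x_j\| \;<\; \tfrac{R}{2} \;<\; R
\quad\Longrightarrow\quad c \in B_R(x_j).

% A ball is convex, so it contains the segment between any two of its points:
c,\, d \in B_R(x_j) \;\Longrightarrow\; \overline{cd} \subset B_R(x_j).
```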
13Failure Probability
- The planner might fail if it doesn't sample at least one node in each of the balls at $x_1, \ldots, x_{n-1}$.
- $n = 2L/R$
- $N$ = number of samples
- $\Pr[\text{failure}] \le \Pr[\text{don't sample every ball}]$
- Let $E_j$ denote the event that ball $j$ is not sampled.
- $\Pr[\text{failure}] \le \Pr[E_1 \cup E_2 \cup \cdots \cup E_{n-1}] \le \sum_{j=1}^{n-1} \Pr[E_j]$
- Samples are independent, so this is
- $\le (n-1)\,(\Pr[\text{don't sample } B_{R/2}])^N$
- $\le (2L/R)\,(1 - \Pr[\text{sample } B_{R/2}])^N$
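To get a feel for the bound, the sketch below evaluates $(2L/R)(1 - s)^N$ and inverts it to estimate how many samples keep the bound under a target value. Here `s` stands for the probability that a single sample lands in a given ball of radius $R/2$, and `L` is taken to be the length of the path, which the slides do not define explicitly; both function names are hypothetical.

```python
import math


def failure_bound(L, R, s, N):
    """Upper bound (2L/R) * (1 - s)**N on the failure probability.

    The value can exceed 1 for small N, in which case it says nothing.
    """
    return (2 * L / R) * (1 - s) ** N


def samples_needed(L, R, s, delta):
    """Smallest N for which the bound is at most delta.

    Assumes 0 < s < 1 and delta > 0.
    """
    if 2 * L / R <= delta:
        return 0
    # (2L/R)(1-s)^N <= delta  <=>  N >= log(delta*R/(2L)) / log(1-s)
    return math.ceil(math.log(delta * R / (2 * L)) / math.log(1 - s))
```

Because the bound decays geometrically in `N`, halving the target `delta` adds only a constant number of samples, while shrinking `s` (smaller balls or a lower acceptance probability) increases the required `N` roughly in proportion to `1/s`.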
16Pr[sample $B_{R/2}$]
- Three worst-case scenarios
[Figure: three worst-case scenarios, labeled "Area of BR-R / Area of F", "Area of gray region / Area of F", and "p"]
17Putting it all Together
Using the inequality we can simplify the
expression to
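Neither the inequality nor the simplified expression is shown on this slide. A standard step at this point, assumed here rather than recovered from the talk, is the bound $1 - x \le e^{-x}$. Writing $s = \Pr[\text{sample } B_{R/2}]$ for the single-sample probability (a symbol introduced only for this sketch), the failure bound becomes exponential in the number of samples $N$:

```latex
(1 - s)^N \;\le\; e^{-sN}
\qquad\Longrightarrow\qquad
\Pr[\text{failure}] \;\le\; \frac{2L}{R}\,(1 - s)^N \;\le\; \frac{2L}{R}\, e^{-sN}.
```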
18Bound using Varying Path Clearance
- Idea: most of the points along $\gamma$ have clearance greater than $R$. Cover $\gamma$ with as few large balls as possible.
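The saving from larger balls can be illustrated by counting. The sketch below compares a uniform cover, whose spacing is dictated by the minimum clearance, with a greedy cover that steps ahead by just under half the local clearance. It is a counting illustration only: it parametrizes the path by arc length, assumes a strictly positive clearance function, and does not check the containment condition needed to guarantee valid connections between adjacent balls. The function names and the `shrink` factor are hypothetical.

```python
import math


def count_uniform_balls(L, r_min):
    """Number of centers x_0..x_n when spacing must stay below r_min/2."""
    n = math.floor(2 * L / r_min) + 1
    return n + 1


def count_adaptive_balls(L, clearance, shrink=0.99):
    """Greedy count: from arc length t, advance by slightly less than
    half the local clearance until the end of the path is reached."""
    t, count = 0.0, 1
    while t < L:
        t = min(L, t + shrink * clearance(t) / 2)
        count += 1
    return count
```

For a path of length 10 whose clearance is 1 at the midpoint and grows linearly away from it, for example `lambda t: 1.0 + 0.5 * abs(t - 5.0)`, the adaptive count is noticeably smaller than the uniform count based on the minimum clearance alone.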
20Conclusion
- This work is a first step in analyzing the
performance of PRMs on computational biology
applications.
- We've presented two bounds on the failure
probability for the probability distribution
$p(q) = p$ in a 2D C-space.
- This analysis extends to higher dimensions and to
the probability distribution based on clearance.
21Conclusion
- This work is a first step in analyzing the
performance of PRMs on computational biology
applications.
- We've presented two bounds on the failure
probability for the probability distribution
$p(q) = r(q)/R$ in a 2D C-space.
- This analysis extends to higher dimensions and to
the probability distribution based on clearance.
22Future Work
- The dimensionality of the C-space for computational biology applications is extremely large.
- A protein of length $n$ has C-space dimension $2n$.
- $n$ can be anywhere from 50 to 50000.
- Typically, these applications bias their node sampling towards more interesting areas of the C-space.
- In the future, we want to extend the analysis to biased node sampling.