Structure and Motion - PowerPoint PPT Presentation

Description:

Title: ProShape Understanding the Shapes of Protein Structures Patrice Koehl, Michael Levitt (Stanford) Herbert Edelsbrunner (Duke) Author: Patrice Koehl – PowerPoint PPT presentation

Slides: 54
Provided by: Patric689
Learn more at: http://ai.stanford.edu
1
Structure and Motion
  • Jean-Claude Latombe, Computer Science Department,
    Stanford University
  • NSF-ITR Meeting on

November 14, 2002
2
Stanford's Participants
  • PIs: L. Guibas, J.C. Latombe, M. Levitt
  • Research Associate: P. Koehl
  • Postdocs: F. Schwarzer, A. Zomorodian
  • Graduate students: S. Apaydin (EE), S. Ieong
    (CS), R. Kolodny (CS), I. Lotan (CS), A. Nguyen
    (Sc. Comp.), D. Russel (CS), R. Singh (CS), C.
    Varma (CS)
  • Undergraduate students: J. Greenberg (CS), E.
    Berger (CS)
  • Collaborating faculty
  • A. Brunger (Molecular and Cellular Physiology)
  • D. Brutlag (Biochemistry)
  • D. Donoho (Statistics)
  • J. Milgram (Math)
  • V. Pande (Chemistry)

3
Problems Addressed
  • Biological functions derive from the structures
    (shapes) achieved by molecules through motions
  • → Determination, classification, and
    prediction of 3D protein structures
  • → Modeling of molecular energy and
    simulation of folding and binding motion

4
What's New for Computer Science?
  • Massive amount of experimental data
  • Importance of similarities
  • Multiple representations of structure
  • Continuous energy functions
  • Many objects forming deformable chains
  • Many degrees of freedom
  • Ensemble properties of pathways

5
Massive amount of experimental data
  • → Abstract/simplify data sets into compact data
    structures

E.g., electron density map → medial axis
6
Importance of similarities
  • → Segmentation/matching/scoring techniques

E.g., libraries of protein fragments [Kolodny,
Koehl, Guibas, Levitt, JMB (2002)]
7
[Figure: approximations of 1TIM compared with the real protein]
8
Alignment of Structural Motifs [Singh and Saha;
Kolodny and Linial]
  • Problem:
  • Determine if two structures share common motifs
  • Two (labelled) structures in $\mathbb{R}^3$:
    $A = \{a_1, a_2, \ldots, a_n\}$, $B = \{b_1, b_2, \ldots, b_m\}$
  • Find subsequences $s_a$ and $s_b$ such that the
    substructures $\{a_{s_a(1)}, a_{s_a(2)}, \ldots, a_{s_a(l)}\}$ and
    $\{b_{s_b(1)}, b_{s_b(2)}, \ldots, b_{s_b(l)}\}$ are similar
  • Twofold problem: alignment and correspondence
  • Score ↔ Approximation ↔ Complexity

9
R. Singh and M. Saha. Identifying Structural
Motifs in Proteins. Pacific Symp. on
Biocomputing, Jan. 2003.
Iterative Closest Point (Besl-McKay) for
alignment
→ Score: RMSD distance
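
The alignment-and-scoring loop named on this slide can be illustrated with a toy version. The sketch below is a minimal 2D Iterative Closest Point loop scored by RMSD. It is an illustration only, not the Singh-Saha implementation: the 2D setting, the function names (`best_rigid_2d`, `icp_2d`) and the fixed iteration count are assumptions made to keep the example short. Proteins are 3D, where the rotation fit is typically done with an SVD-based method instead of the closed-form angle used here.

```python
import math

def centroid(pts):
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    (csx, csy), (cdx, cdy) = centroid(src), centroid(dst)
    cos_acc = sin_acc = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        sx, sy, dx, dy = sx - csx, sy - csy, dx - cdx, dy - cdy
        cos_acc += sx * dx + sy * dy   # sum of dot products
        sin_acc += sx * dy - sy * dx   # sum of 2D cross products
    theta = math.atan2(sin_acc, cos_acc)
    c, s = math.cos(theta), math.sin(theta)
    # The translation sends the rotated source centroid onto the target centroid.
    return theta, cdx - (c * csx - s * csy), cdy - (s * csx + c * csy)

def transform(pts, theta, tx, ty):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in pts]

def rmsd(a, b):
    return math.sqrt(sum((ax - bx) ** 2 + (ay - by) ** 2
                         for (ax, ay), (bx, by) in zip(a, b)) / len(a))

def closest(p, pts):
    return min(pts, key=lambda q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def icp_2d(moving, fixed, iterations=20):
    """Alternate correspondence (closest point) and alignment (rigid fit)."""
    current = list(moving)
    for _ in range(iterations):
        matched = [closest(p, fixed) for p in current]
        current = transform(current, *best_rigid_2d(current, matched))
    matched = [closest(p, fixed) for p in current]
    return current, rmsd(current, matched)

# Hypothetical toy data: a small point set and a slightly rotated, shifted copy.
fixed = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 6.0), (-3.0, 2.0)]
moving = transform(fixed, 0.1, 0.3, -0.2)
aligned, score = icp_2d(moving, fixed)
print(f"RMSD after ICP: {score:.6f}")
```

ICP only converges to a good alignment when the starting pose is close enough that most closest-point matches are correct, which is why the toy copy is perturbed only slightly.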
10
Trypsin
Trypsin active site
11
Trypsin active site against 42 trypsin-like
proteins
12
Multiple representations of structure

ProShape software [Koehl, Levitt
(Stanford), Edelsbrunner (Duke)]
13
Statistical potentials for proteins based on
alpha complex [Guibas, Koehl, Zomorodian]
  • Decoys generated using physical potentials
  • Select best decoys using distance information

14
  • Continuous energy functions
  • Many objects in deformable chains
  • → Many pairs of objects, but relatively few are
    close enough to interact
  • → Data structures that capture proximity, but
    undergo small or rare changes

During motion simulation:
  - detect steric clashes (self-collisions)
  - find pairs of atoms closer than cutoff
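
For reference, the baseline way to find all atom pairs closer than a cutoff is a uniform grid (the "grid method" that a later slide compares against). The sketch below is a minimal version under stated assumptions: the function name `pairs_within_cutoff`, the cell size equal to the cutoff, and the toy chain coordinates are all illustrative choices, not taken from the presentation.

```python
from collections import defaultdict
from itertools import product

def pairs_within_cutoff(coords, cutoff):
    """Return sorted index pairs (i, j), i < j, with distance <= cutoff."""
    # Bin atoms into cubic cells of side `cutoff`; interacting atoms must
    # then lie in the same cell or in one of the 26 adjacent cells.
    cells = defaultdict(list)
    for i, (x, y, z) in enumerate(coords):
        cells[(int(x // cutoff), int(y // cutoff), int(z // cutoff))].append(i)
    cutoff2 = cutoff * cutoff
    found = []
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            # .get() avoids creating empty cells while iterating.
            for j in cells.get((cx + dx, cy + dy, cz + dz), ()):
                for i in members:
                    if i < j:  # report each unordered pair once
                        d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
                        if d2 <= cutoff2:
                            found.append((i, j))
    return sorted(found)

# Hypothetical zigzag chain of 12 point "atoms".
chain = [(1.5 * i, (i % 3) * 0.8, (i % 2) * 0.5) for i in range(12)]
close = pairs_within_cutoff(chain, cutoff=2.0)
print(len(close), "pairs within cutoff")
```

The grid has to be rebuilt after every move, even when only a few atoms changed position; that cost is what the hierarchy-based structures on the following slides try to avoid.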
15
  • Other application domains
  • Modular reconfigurable robots
  • Reconstructive surgery

16
  • Fixed Bounding-Volume hierarchies don't work

17
  • Instead, exploit what doesn't change: chain
    topology → Adaptive BV hierarchies [Guibas,
    Nguyen, Russel, Zhang; Lotan, Schwarzer,
    Halperin, Latombe (SoCG'02)]

18
  • Wrapped bounding sphere hierarchies [Guibas,
    Nguyen, Russel, Zhang (SoCG 2002)]
  • WBSH undergoes small number of changes
  • Self-collision:
  • $O(n \log n)$ in $\mathbb{R}^2$; $O(n^{2-2/d})$ in $\mathbb{R}^d$, $d \geq 3$

19
  • ChainTrees [Lotan, Schwarzer, Halperin, Latombe
    (SoCG'02)]

Assumption: Few degrees of freedom change at each
motion step (e.g., Monte Carlo simulation)
  • Find all pairs of atoms closer than a given
    cutoff
  • Find which energy terms can be reused
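
The idea of a hierarchy whose shape follows the chain can be sketched as follows. This is a simplified illustration, not the ChainTree or wrapped-hierarchy construction from the cited papers: it assumes point atoms, uses spheres that bound the two child spheres, and omits the update step after a move. The names `Node`, `merge`, `build` and `close_pairs` are hypothetical.

```python
import math

class Node:
    def __init__(self, center, radius, left=None, right=None, index=None):
        self.center, self.radius = center, radius
        self.left, self.right, self.index = left, right, index  # index set on leaves only

def merge(a, b):
    """Smallest sphere enclosing the spheres of nodes a and b."""
    d = math.dist(a.center, b.center)
    if d + b.radius <= a.radius:
        return a.center, a.radius
    if d + a.radius <= b.radius:
        return b.center, b.radius
    r = (d + a.radius + b.radius) / 2.0
    t = (r - a.radius) / d
    return tuple(p + t * (q - p) for p, q in zip(a.center, b.center)), r

def build(coords, lo=0, hi=None):
    """Balanced binary tree over consecutive atoms lo..hi-1 of the chain."""
    hi = len(coords) if hi is None else hi
    if hi - lo == 1:
        return Node(tuple(coords[lo]), 0.0, index=lo)
    mid = (lo + hi) // 2
    left, right = build(coords, lo, mid), build(coords, mid, hi)
    center, radius = merge(left, right)
    return Node(center, radius, left, right)

def close_pairs(a, b, cutoff, out):
    """Collect atom pairs (i, j), i < j, within cutoff, pruning far-apart subtrees."""
    if a is b:
        if a.index is None:  # internal node tested against itself
            close_pairs(a.left, a.left, cutoff, out)
            close_pairs(a.left, a.right, cutoff, out)
            close_pairs(a.right, a.right, cutoff, out)
        return
    if math.dist(a.center, b.center) - a.radius - b.radius > cutoff:
        return  # bounding spheres too far apart: nothing below can interact
    if a.index is not None and b.index is not None:
        out.append((a.index, b.index))
    elif a.index is None and (b.index is not None or a.radius >= b.radius):
        close_pairs(a.left, b, cutoff, out)
        close_pairs(a.right, b, cutoff, out)
    else:
        close_pairs(a, b.left, cutoff, out)
        close_pairs(a, b.right, cutoff, out)

# Hypothetical helix-like chain of 16 point atoms.
helix = [(2.0 * math.cos(i), 2.0 * math.sin(i), 0.6 * i) for i in range(16)]
root = build(helix)
pairs = []
close_pairs(root, root, 2.5, pairs)
pairs.sort()
print(len(pairs), "interacting pairs")
```

Because the tree is built over chain order, its topology never changes during simulation; after a move, only the spheres along the affected root-to-leaf paths would need to be refitted, which is the property the slides rely on.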

20
  • ChainTrees [Lotan, Schwarzer, Halperin, Latombe
    (SoCG'02)]

Updating
Finding interacting pairs
(in practice, sublinear)
21
  • ChainTrees: Application to MC simulation
    (comparison to grid method)

[Figure panels: m = 1 and m = 5]
22
  • Future work: ChainTrees
  • Run new series of experiments with more complex
    energy field EEF1 [Lazaridis & Karplus]
    (with Pande)
  • Use library of fragments (with Koehl)

Open problem: How to find good moves to make when
the conformation is compact and random moves are
rejected with high probability?
23
  • Future work: Spanner for deformable
    chain [Agarwal, Gao (Duke); Nguyen, Zhang
    (Stanford)]

3HVT
Capture proximity information with a sparse
spanner
24
Many degrees of freedom
  • → Tools to explore high-dimensional conformation
    space
  • - Sampling strategies
  • - Nearest neighbors

25
Sampling structures by combining
fragments [Kolodny, Levitt]
Library of protein fragments
→ Discrete set of candidate structures
26
Nearest neighbors in high-dimensional
space [Lotan and Schwarzer]
Find k nearest neighbors of a given protein
conformation in a set of n conformations (cRMS,
dRMS)
Idea: Cut backbone into m equal subsequences
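
One way to read this idea is sketched below: replace each of the m backbone pieces by the average of its atom positions, then compare conformations by dRMS on the m averaged points. The averaging step is an assumption about what the reduced representation contains, and the function names and toy conformations are hypothetical. dRMS here is the root-mean-square difference between corresponding intra-structure distances, so it needs no superposition.

```python
import math
from itertools import combinations

def averaged_representation(backbone, m):
    """Cut the backbone into m equal consecutive pieces and average each piece."""
    if len(backbone) % m:
        raise ValueError("backbone length must be divisible by m")
    size = len(backbone) // m
    pieces = (backbone[k * size:(k + 1) * size] for k in range(m))
    return [tuple(sum(axis) / size for axis in zip(*piece)) for piece in pieces]

def drms(a, b):
    """Root-mean-square difference between corresponding intra-structure distances."""
    pairs = list(combinations(range(len(a)), 2))
    total = sum((math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2 for i, j in pairs)
    return math.sqrt(total / len(pairs))

def k_nearest(query, conformations, k, m):
    """Brute-force k nearest neighbors of query under dRMS on the averaged points."""
    q = averaged_representation(query, m)
    scored = [(drms(q, averaged_representation(c, m)), idx)
              for idx, c in enumerate(conformations)]
    return [idx for _, idx in sorted(scored)[:k]]

# Hypothetical toy set: straight 8-atom chains stretched by different factors.
conformations = [[(s * i, 0.0, 0.0) for i in range(8)] for s in (1.0, 1.1, 2.0, 1.05)]
neighbors = k_nearest(conformations[0], conformations, k=2, m=4)
print(neighbors)
```

The saving comes from the pair count: dRMS on n atoms compares n(n-1)/2 distances, while the averaged version compares only m(m-1)/2.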
27
Nearest neighbors in high-dimensional
space [Lotan and Schwarzer]
100,000 decoys of 1CTF (Park-Levitt set)
Computation of 100 NN of each conformation

Representation   Distance   Search        Time
Full rep.        dRMS       brute force   84 h
Ave. rep.        dRMS       brute force   4.8 h
SVD red. rep.    dRMS       brute force   41 min
SVD red. rep.    dRMS       kd-tree       19 min

80% of computed NNs are true NNs.
kd-tree software from ANN library (U. Maryland)
28
Ensemble properties of pathways
  • → Stochastic nature of molecular motion requires
    characterizing average properties of many
    pathways

29
Example 1: Probability of Folding (pfold)
"We stress that we do not suggest using pfold as
a transition coordinate for practical purposes as
it is very computationally intensive." [Du,
Pande, Grosberg, Tanaka, and Shakhnovich, "On the
Transition Coordinate for Protein Folding,"
Journal of Chemical Physics (1998)]
Folded set
Unfolded set
30
Example 2: Ligand-Protein Interaction [Sept,
Elcock and McCammon '99]
10K to 30K independent simulations
31
Probabilistic Roadmap [Apaydin, Brutlag, Hsu,
Guestrin, Latombe (RECOMB'02, ECCB'02)]
Idea: Capture the stochastic nature of molecular motion
by a network of randomly selected conformations
and by assigning probabilities to edges
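
How probabilities might be attached to roadmap edges can be sketched with a Metropolis-style rule. This rule is an assumption made for illustration; the slide does not state the formula. In the sketch, a move from node i to a neighbor j is proposed with probability 1/(number of neighbors of i) and accepted with probability min(1, exp(-ΔE/kT)); the leftover probability stays on the self-transition. The names `transition_matrix`, `energies`, `neighbors` and `kT` are hypothetical.

```python
import math

def transition_matrix(energies, neighbors, kT=1.0):
    """Row-stochastic matrix P, where P[i][j] is the probability of stepping i -> j."""
    n = len(energies)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            dE = energies[j] - energies[i]
            accept = 1.0 if dE <= 0 else math.exp(-dE / kT)  # Metropolis acceptance
            P[i][j] = accept / len(neighbors[i])
        P[i][i] = 1.0 - sum(P[i])  # remaining probability: stay at node i
    return P

# Hypothetical 4-node roadmap arranged in a ring, with made-up energies.
energies = [2.0, 1.0, 0.0, 1.5]
neighbors = [[1, 3], [0, 2], [1, 3], [0, 2]]
P = transition_matrix(energies, neighbors)
for row in P:
    print(" ".join(f"{p:.3f}" for p in row))
```

Each row sums to one, so the roadmap can be treated as a Markov chain over the sampled conformations.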
32
Probabilistic Roadmap [Apaydin, Brutlag, Hsu,
Guestrin, Latombe (RECOMB'02, ECCB'02)]
  • One linear equation per node
  • Solution gives pfold for all nodes
  • No explicit simulation run
  • All pathways are taken into account
  • Sparse linear system

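[Figure: roadmap node i with self-transition probability P_ii and edges to neighbors j, k, l, m carrying probabilities P_ij, P_ik, P_il, P_im]
Let $f_i = \mathrm{pfold}(i)$. After one step:
$f_i = P_{ii} f_i + P_{ij} f_j + P_{ik} f_k + P_{il} f_l + P_{im} f_m$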
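
The one-step relation above gives one linear equation per node, with pfold fixed at 1 on the folded set and 0 on the unfolded set. The sketch below solves that system on a tiny example. It is a minimal illustration under stated assumptions: it uses a dense matrix and plain Gauss-Seidel iteration as a stand-in for a sparse solver, it assumes every free node has P_ii < 1, and the name `solve_pfold` and the 5-node chain are hypothetical.

```python
def solve_pfold(P, folded, unfolded, sweeps=10000, tol=1e-12):
    """Gauss-Seidel iteration for f_i = sum_j P[i][j] * f_j with fixed boundary values."""
    n = len(P)
    f = [1.0 if i in folded else 0.0 for i in range(n)]
    free = [i for i in range(n) if i not in folded and i not in unfolded]
    for _ in range(sweeps):
        change = 0.0
        for i in free:
            # Move the self-transition term to the left-hand side:
            # (1 - P_ii) f_i = sum over j != i of P_ij f_j
            rest = sum(P[i][j] * f[j] for j in range(n) if j != i)
            new = rest / (1.0 - P[i][i])
            change = max(change, abs(new - f[i]))
            f[i] = new
        if change < tol:
            break
    return f

# Hypothetical 5-node chain: node 0 is unfolded, node 4 is folded,
# interior nodes step left or right with equal probability.
P = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
pfold = solve_pfold(P, folded={4}, unfolded={0})
print([round(v, 4) for v in pfold])
```

No trajectory is simulated: the solution accounts for every path through the roadmap at once, which is the point made in the bullets above.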
33
Probabilistic Roadmap
Correlation with MC Approach
  • 1ROP (repressor of primer)
  • 2 α helices
  • 6 DOF

34
Probabilistic Roadmap
Computation Times (1ROP)
Monte Carlo:
  • Over 10^6 energy computations
  • Over 11 days of computer time
  • 49 conformations
Roadmap:
  • 15,000 energy computations
  • 1 - 1.5 hours of computer time
  • 5000 conformations
4 orders of magnitude speedup!
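
As a rough check of the speedup figure, the times on this slide can be compared per conformation. The calculation below assumes exactly 11 days for Monte Carlo and the upper figure of 1.5 hours for the roadmap; both are readings of the slide's approximate numbers, not additional measurements.

```latex
\[
t_{\mathrm{MC}} \approx \frac{11 \times 24\ \mathrm{h}}{49} \approx 5.4\ \mathrm{h/conformation},
\qquad
t_{\mathrm{roadmap}} \approx \frac{1.5\ \mathrm{h}}{5000} = 3 \times 10^{-4}\ \mathrm{h/conformation}
\]
\[
\frac{t_{\mathrm{MC}}}{t_{\mathrm{roadmap}}} \approx \frac{5.4}{3 \times 10^{-4}} \approx 1.8 \times 10^{4}
\]
```

Under these assumptions the ratio is on the order of 10^4 per conformation, consistent with the stated four orders of magnitude.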
35
Future work: Probabilistic Roadmap
  • Non-uniform sampling strategies
  • Encoding molecular dynamics into probabilistic
    roadmaps (with V. Pande)
  • Quantitative experiments with ligand-protein
    binding (with V. Pande)

36
Bio-X Clark Center
37
The following slides relate to non-research
issues. I do not plan to present them. Jack and
Leo may want to use the contents of some of them
for their own presentations.
38
Education
  • Tutorial on Delaunay, Alpha-Shape and Pockets
    (Koehl)
  • A biocomputing Notebook (Koehl)
  • Biocomputation lectures in pre-existing classes:
  • CS326 (motion planning): molecular motion,
    probabilistic roadmaps, self-collision detection
    (Latombe)
  • CS468 (intro to computational topology): finding
    pockets and tunnels in molecules, computing surface
    areas and volumes and their derivatives
    (Zomorodian)
  • New class on Algorithmic Biology (Batzoglou,
    Guibas, Latombe)
  • Graduate Curriculum Committee, Bio-Engineering
    Dept., Stanford (Latombe)

39
Trained Students (1/2)
  • PhD students
  • Serkan Apaydin, EE
  • An Nguyen, Scientific Computing
  • Carlos Guestrin, CS (Daphne Koller's group)
  • Itay Lotan, CS
  • Rachel Kolodny, CS
  • Daniel Russel, CS
  • Samuel Ieong, CS

Most graduate students have a principal advisor
in CS and a secondary one in a bio-related
department (Levitt, Brutlag, Pande)
40
Trained Students (2/2)
  • Graduated Master's students:
  • Rohit Singh: finding motifs in proteins, best
    Stanford CS master's thesis, June '02; current
    position: bioinformatics company in San Diego
  • Chris Varma: study of ligand-protein interaction
    with probabilistic roadmaps, June '02; current
    position: PhD student, Harvard/MIT Biomedical
    program
  • Current Master's student:
  • Ben Wong: modeling T cell activity
  • Undergraduates:
  • Eric Berger, CS, Stanford, summer internship
  • Julie Greenberg, CS, Harvard, summer internship

41
Visitors
  • Prof. Alberto Munoz, Math Dept., University of
    Yucatan, Mexico; 3 months, Summer '02; haptic
    interaction and probabilistic roadmaps
  • Prof. Ileana Streinu, Smith College; 6 months, from
    Sept. '02; protein folding

42
Interactions Within Stanford
  • Guibas and Levitt, with J. Milgram (Math):
    topology of configuration spaces of chains
  • Guibas, with V. Pande (Chemistry) and D. Donoho
    (Statistics): non-linear multi-resolution analysis
    of molecular motions
  • Latombe and Apaydin, with D. Brutlag
    (Biochemistry) and V. Pande: probabilistic roadmaps
  • Latombe and Lotan, with V. Pande: efficient MC
    simulation

43
Interactions Outside Stanford
  • Collision Detection for Deforming Necklaces,
    P. Agarwal, L. Guibas, A. Nguyen, D. Russel, and
    L. Zhang. Invited to special issue of Comp.
    Geom., Theory and Applications, following
    presentation at SoCG'02.
  • Kinetic Medians and kd-Trees, P. Agarwal, J. Gao,
    and L. Guibas. Proc. 10th European Symp.
    Algorithms, LNCS 2461, Springer-Verlag, 5-16, 2002.
  • Stochastic Roadmap Simulation: An Efficient
    Representation and Algorithm for Analyzing
    Molecular Motion, M.S. Apaydin, D.L. Brutlag,
    C. Guestrin, D. Hsu, and J.C. Latombe. Proc.
    RECOMB'02, Washington D.C., pp. 12-21, 2002.
  • Efficient Maintenance and Self-Collision Testing
    for Kinematic Chains, I. Lotan, F. Schwarzer,
    D. Halperin, and J.C. Latombe, SoCG'02,
    pp. 43-42, June 2002.
  • Stochastic Conformational Roadmaps for Computing
    Ensemble Properties of Molecular Motion, M.S.
    Apaydin, D.L. Brutlag, C. Guestrin, D. Hsu, and
    J.C. Latombe. Workshop on Algorithmic Foundations
    of Robotics (WAFR), Nice, Dec. 2002.

44
Attendance to Conferences
  • BCATS'01 and '02: Bio-Computation At Stanford
  • RECOMB'02: Int. Conf. on Research in
    Computational Biology
  • ISMB'02: Int. Conf. on Intelligent Syst. for
    Molecular Biology
  • ECCB 2002: European Conf. on Computational Biology
  • Biophysical Society Symp. on Molecular
    Simulations in Structural Biology, 2002
  • SoCG 2002: ACM Symp. on Computational Geometry

45
Outreach
  • Latombe and Levitt serve as members of the
    Scientific Leadership Council of Stanford's
    Bio-X program
  • Presentations: Stanford's Bio-X Symposium (3/02),
    Stanford's Computer Forum (3/02), Berkeley's
    Broad Area Seminar (4/02)
  • Conference committees: Guibas, program
    committee, WAFR'02 and SoCG'03; Latombe,
    program committee, 1st IEEE Bioinformatics Conf.
    '03; Apaydin, organization committee of BCATS'02

46
The following slides are extra slides that I
removed from my presentation for lack of time
47
General Goals
  • Larger proteins considered → computational
    efficiency
  • Diversity of molecules and interactions →
    computational abstractions
  • Extension of in-silico experiments →
    computational correctness
  • → Enable biological studies that were not
    possible before, more systematically

48
Approach
  • Select hard problems
  • Close interaction between computer scientists
    (Guibas, Koehl, Latombe) and biologists (Koehl,
    Levitt, Brutlag, Pande, Brunger)
  • Most graduate students are CS students with
    secondary advisor in biology
  • Perform extensive tests

49
  • Electron density map → Medial axis [Guibas,
    Brunger, Russel]
  • Medial axis of iso-surfaces to estimate backbone
  • Cleaning and simplification of axis to filter
    noise out
  • Persistence of features across multiple
    iso-surfaces

50
Continuous energy function
  • → Essential for protein structure prediction and
    molecular motion simulation
  • - Statistical potentials based on alpha
    complex
  • - Maintenance of energy values during
    simulation

51
  • Instead, exploit what doesn't change: chain
    topology → Adaptive BV hierarchies
  • Balanced binary trees of constant topology
  • Efficient repair of position/size of BVs [Guibas,
    Nguyen, Russel, Zhang; Lotan, Schwarzer,
    Halperin, Latombe (SoCG'02)]

52
  • Future work: Spanner for deformable
    chain [Agarwal, Gao (Duke); Nguyen, Zhang
    (Stanford)]

53
Probabilistic Roadmap
  • 1ROP (repressor of primer)
  • 2 α helices
  • 6 DOF
  • 1HDD (Engrailed homeodomain)
  • 3 α helices
  • 12 DOF

H-P energy model with steric clash exclusion [Sun
et al., '95]