Title: Geometric computing in protein design
1Geometric computing in protein design
- Jack Snoeyink
- UNC Computer Science
2Outline
- Introducing Computational Geometry and Proteins
- Projects
- 3-d jigsaw puzzles of Protein design
- Precision requirements for exact geometric computation
- Classes
- COMP 281 Computational Geometry
- COMP 290-079 seminar/lab in Applied Optimization in Computational Biology
- Collaboration
- program in cellular and molecular biophysics
- program in bioinformatics
3Computational Geometry
- A branch of the theory of computer science that considers
- the design and analysis of algorithms and data structures
- for problems that are best stated in geometric form.
- Application areas include robotics, databases, GIS, computer graphics, mathematics, and molecular biology.
4Beautiful geometry (?)
5Diagrammatic representations
- PXR with bound ligand
- Ball and stick / van der Waals spheres
- Model diagram
- Solvent accessible surface
6Geometry on computers
- Where we can see structure, shape, connections, regions,
- the computer sees only coordinates
- For example, this PXR protein and ligand is in the Protein Data Bank as
7REMARK Written by O version 7.0.0
REMARK Sun Jan 21 15:24:51 2001
CRYST1   91.345   91.345   85.302  90.00  90.00  90.00
ORIGX1   1.000000  0.000000  0.000000   0.00000
ORIGX2   0.000000  1.000000  0.000000   0.00000
ORIGX3   0.000000  0.000000  1.000000   0.00000
SCALE1   0.010948  0.000000  0.000000   0.00000
SCALE2   0.000000  0.010948  0.000000   0.00000
SCALE3   0.000000  0.000000  0.011723   0.00000
ATOM    1  C    GLY  142   -5.808  44.753  13.561  1.00  58.97  6
ATOM    2  O    GLY  142   -5.723  45.523  14.515  1.00  59.54  8
ATOM    3  N    GLY  142   -4.377  43.177  14.842  1.00  59.37  7
ATOM    4  CA   GLY  142   -5.307  43.330  13.685  1.00  59.68  6
ATOM    5  N    LEU  143   -6.324  45.108  12.387  1.00  58.87  7
ATOM    6  CA   LEU  143   -6.839  46.455  12.152  1.00  58.50  6
ATOM    7  CB   LEU  143   -6.483  46.907  10.736  1.00  57.90  6
ATOM    8  CG   LEU  143   -5.849  48.290  10.555  1.00  57.77  6
ATOM    9  CD1  LEU  143   -4.599  48.411  11.407  1.00  56.51  6
ATOM   10  CD2  LEU  143   -5.505  48.492   9.090  1.00  56.92  6
ATOM   11  C    LEU  143   -8.352  46.446  12.333  1.00  58.92  6
ATOM   12  O    LEU  143   -9.046  45.640  11.714  1.00  59.85  8
ATOM   13  N    THR  144   -8.862  47.341  13.174  1.00  58.88  7
ATOM   14  CA   THR  144  -10.299  47.407  13.444  1.00  59.76  6
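To illustrate how geometry is recovered from bare coordinates, here is a minimal Python sketch that reads a few of the ATOM records above and computes two bond lengths. It assumes whitespace-separated fields in the order shown on the slide (serial, atom name, residue, residue number, x, y, z, occupancy, B-factor, atomic number); standard PDB files use fixed columns instead. The names `Atom`, `parse_atoms`, and `distance` are illustrative, not part of any library.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    residue: str
    resnum: int
    xyz: tuple


# Four records copied from the slide (whitespace-separated, not fixed-column).
SAMPLE = """\
ATOM 1 C GLY 142 -5.808 44.753 13.561 1.00 58.97 6
ATOM 3 N GLY 142 -4.377 43.177 14.842 1.00 59.37 7
ATOM 4 CA GLY 142 -5.307 43.330 13.685 1.00 59.68 6
ATOM 5 N LEU 143 -6.324 45.108 12.387 1.00 58.87 7
"""


def parse_atoms(text):
    """Return an Atom for every line whose first field is ATOM."""
    atoms = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] != "ATOM":
            continue
        xyz = tuple(float(v) for v in fields[5:8])
        atoms.append(Atom(int(fields[1]), fields[2], fields[3],
                          int(fields[4]), xyz))
    return atoms


def distance(a, b):
    """Euclidean distance between two atoms, in the file's units (angstroms)."""
    return math.dist(a.xyz, b.xyz)


atoms = {(a.resnum, a.name): a for a in parse_atoms(SAMPLE)}
ca_c = distance(atoms[(142, "CA")], atoms[(142, "C")])
c_n = distance(atoms[(142, "C")], atoms[(143, "N")])
print(f"CA-C within GLY 142:        {ca_c:.2f}")
print(f"C-N peptide bond 142 to 143: {c_n:.2f}")
```

The two distances come out near 1.51 and 1.33 angstroms, which is how a program can infer the backbone connections that a viewer sees at a glance.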
8Protein Design
- Goal: CAD for protein design
- BioGeometry Project (NSF ITR)
- F. Brooks, C. Carter, A. Tropsha, Duke, Stanford, NC A&T
- Enzyme design (DARPA seed funding)
- H. Hellinga, DJ Richardson (Duke), J. Snoeyink (UNC)
9Protein structure hierarchy
- Primary: amino acid (AA) sequence
- Secondary: helices/sheets
- Tertiary: folding in 3D
- Quaternary: conformation of two or more molecules
10Primary: amino acid sequence
- 20 amino acids
- Backbone: linked peptide units
- Side chains differ
- Geometry: φ, ψ angles at bonds with Cα carbon
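The backbone angles on this slide are dihedral angles, each defined by four consecutive backbone atoms: φ by C(i-1), N(i), Cα(i), C(i) and ψ by N(i), Cα(i), C(i), N(i+1). A minimal Python sketch of the computation follows, using the standard atan2 formula. The helper names are illustrative, and the coordinates are the GLY 142 and LEU 143 atoms from the PDB excerpt earlier in the talk.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond (0 = cis, 180 = trans)."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Backbone atoms copied from the PDB excerpt.
N_142 = (-4.377, 43.177, 14.842)
CA_142 = (-5.307, 43.330, 13.685)
C_142 = (-5.808, 44.753, 13.561)
N_143 = (-6.324, 45.108, 12.387)
CA_143 = (-6.839, 46.455, 12.152)
C_143 = (-8.352, 46.446, 12.333)

psi_142 = dihedral(N_142, CA_142, C_142, N_143)
phi_143 = dihedral(C_142, N_143, CA_143, C_143)
print(f"psi(GLY 142) = {psi_142:.1f} degrees")
print(f"phi(LEU 143) = {phi_143:.1f} degrees")
```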
11Primary: amino acid sequence
12Protein synthesis
- Transcription and translation in protein synthesis
- DNA and RNA have nucleotides that determine the kind of protein
- 3 nucleotides encode 1 amino acid of a protein
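The three-nucleotides-per-amino-acid rule can be sketched as a table lookup. The Python below is only an illustration: `CODON_TABLE` holds a small subset of the standard genetic code (enough for the Gly, Leu, Thr residues seen in the PDB excerpt), and the function name `translate` and the example mRNA string are made up for this sketch.

```python
# Subset of the standard genetic code; None marks a stop codon.
CODON_TABLE = {
    "AUG": "Met",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "CUU": "Leu", "CUC": "Leu", "CUA": "Leu", "CUG": "Leu",
    "ACU": "Thr", "ACC": "Thr", "ACA": "Thr", "ACG": "Thr",
    "UAA": None, "UAG": None, "UGA": None,
}


def translate(mrna):
    """Read an mRNA string three nucleotides at a time until a stop codon."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # KeyError if outside the subset
        if amino_acid is None:
            break
        residues.append(amino_acid)
    return residues


print(translate("AUGGGCCUGACCUAA"))  # ['Met', 'Gly', 'Leu', 'Thr']
```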
13Secondary: α-helices
- Stabilized by hydrogen bonds
14Secondary: β-sheets
- Parallel and Anti-parallel
- Also stabilized by H-bonds
15Tertiary: folding
- Critically important
- 3D Structure → Function
- Protein Folding problem
- Given the sequence, determine the structure
- Decoy Discrimination problem
- Given several folds, determine the native
16Quaternary Structure
- GroEL is made of 14 protein units, in two 7-member rings, that form a pair of cavities in which proteins can be unfolded.
- GroES subunits form another 7-membered ring that caps the GroEL ring
17Structure - Function
19NSF ITR Project Team
21Protein design
- Dezymer software
- H. Hellinga, L. Looger
- Input: fixed backbone and ligand
- Output: top-ranked receptor designs
- Example: RBP (Ribose Binding Protein)
- Redesigned receptor site to bind TNT
- Generated different receptor designs with
modified backbone
22Backbone modification
- Goal: CAD for protein design
- Objective: local backbone motion
- modify segment of backbone, leaving remainder of chain fixed
- develop a library of allowed motions
- Motivation
- crystallographic refinement
- receptor design
23Crystallographic refinement
fit structure to electron density from x-ray diffraction
Crystal structure obtained without hydrogen atoms
Some clashes result after adding hydrogen atoms
Red spikes: bad clashes
Blue dots: favorable interactions
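A clash of the kind drawn as red spikes can be approximated as two non-bonded atoms whose van der Waals spheres overlap by more than some tolerance. The Python sketch below is a hedged illustration only: the radii are commonly quoted Bondi values, the 0.4 angstrom cutoff is an assumed threshold, and neither is taken from the software behind the slide. Bonded pairs must be excluded by the caller, since they always sit closer than the sum of their radii.

```python
import math

# Assumed van der Waals radii in angstroms (Bondi values).
VDW_RADIUS = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}

# Assumed cutoff: overlaps at least this large count as bad clashes.
CLASH_OVERLAP = 0.4


def overlap(elem_a, xyz_a, elem_b, xyz_b):
    """How far the two van der Waals spheres interpenetrate (negative = gap)."""
    return VDW_RADIUS[elem_a] + VDW_RADIUS[elem_b] - math.dist(xyz_a, xyz_b)


def is_clash(elem_a, xyz_a, elem_b, xyz_b):
    """True when two NON-BONDED atoms overlap by at least CLASH_OVERLAP."""
    return overlap(elem_a, xyz_a, elem_b, xyz_b) >= CLASH_OVERLAP


# Two hydrogens 1.5 angstroms apart interpenetrate by about 0.9: a clash.
print(is_clash("H", (0.0, 0.0, 0.0), "H", (0.0, 0.0, 1.5)))
# A hydrogen and an oxygen 3.0 angstroms apart do not touch.
print(is_clash("H", (0.0, 0.0, 0.0), "O", (0.0, 0.0, 3.0)))
```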
24Crystallographic refinement
fit structure to electron density from x-ray
diffraction
modified backbone resolves clashes
better choice of side chain
25Problems and Approaches
- Problems from biology
- threading in electron density
- crystal packing
- motion libraries
- Robotics techniques
- exact inverse kinematics
- probabilistic roadmaps
- Geometric algorithms
- data representation
- code optimization
26Arithmetic precision as a computational resource
- Provably correct algorithms optimal in time, space, and arithmetic precision
- Examples
- line segment intersection (w/ Mantler)
- ray/polygon intersection
- 3d/4d Delaunay (w/ Liu, Mascarenhas)
- Geometric rounding on output
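To see why precision behaves like a resource, consider the 2D orientation test, a degree-2 polynomial in the input coordinates. With 24-bit integer inputs every intermediate value fits in a double's 53-bit significand, so the floating-point answer is exact; with 54-bit inputs it is not. The Python sketch below illustrates that point under these assumptions and is not one of the algorithms listed on the slide; the function names are illustrative.

```python
import random


def orient2d(a, b, c):
    """Twice the signed area of triangle abc; > 0 means c is left of a->b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def sign(x):
    return (x > 0) - (x < 0)


def as_doubles(p):
    return (float(p[0]), float(p[1]))


def float_sign(a, b, c):
    """Sign of the same determinant evaluated in double precision."""
    return sign(orient2d(as_doubles(a), as_doubles(b), as_doubles(c)))


# 24-bit inputs: differences < 2**25, products < 2**50, so doubles are exact.
rng = random.Random(0)
for _ in range(1000):
    a, b, c = [(rng.randrange(2**24), rng.randrange(2**24)) for _ in range(3)]
    assert float_sign(a, b, c) == sign(orient2d(a, b, c))

# 54-bit inputs: 2**53 + 1 is not representable, so c rounds onto b.
a, b, c = (0, 0), (2**53, 2**53), (2**53 + 1, 2**53)
print(sign(orient2d(a, b, c)))  # exact integers: -1 (c is right of a->b)
print(float_sign(a, b, c))      # doubles: 0 (reported as collinear)
```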
27Restricted tests imply...
- Restricted to double precision
- Can't test where an intersection is
- Can't sort on lines
- Can't sort by x
28spaghetti lines.
- Restricted to double precision
- Push segments as far right as possible
- Endpoints witness intersections
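One way to read the two slides above: whether two segments cross can be decided from orientation tests on their endpoints alone (degree-2 polynomials), while the position of the crossing is a ratio of a degree-3 numerator to a degree-2 denominator and so needs more bits. The Python sketch below illustrates that contrast with exact integers and `fractions.Fraction`; it is not the algorithm from the talk, and the function names are illustrative.

```python
from fractions import Fraction


def orient(a, b, c):
    """Degree-2 orientation determinant of points a, b, c."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def sign(x):
    return (x > 0) - (x < 0)


def properly_cross(a, b, c, d):
    """True if segments ab and cd cross at a point interior to both.

    Uses only orientation tests on the four endpoints.
    """
    return (sign(orient(a, b, c)) * sign(orient(a, b, d)) < 0 and
            sign(orient(c, d, a)) * sign(orient(c, d, b)) < 0)


def intersection_x(a, b, c, d):
    """Exact x-coordinate of the crossing of lines ab and cd.

    The numerator is degree 3 and the denominator degree 2 in the inputs.
    """
    den = (a[0] - b[0]) * (c[1] - d[1]) - (a[1] - b[1]) * (c[0] - d[0])
    num = ((a[0] * b[1] - a[1] * b[0]) * (c[0] - d[0])
           - (a[0] - b[0]) * (c[0] * d[1] - c[1] * d[0]))
    return Fraction(num, den)


a, b, c, d = (0, 0), (4, 4), (0, 4), (4, 0)
print(properly_cross(a, b, c, d))   # True
print(intersection_x(a, b, c, d))   # 2
```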
293D Ray/Polygon test
If (pq)(pd) return MISS
hitmiss = MISS
For (i0 ipi1 pipi1qr0 return GRAZE // hit edge pi,pi1
Else if (pipi1qrtoggle hitmiss // pi,pi1 r separates q?
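The pseudocode above lost its operators in extraction, so the following Python is not a reconstruction of it. It is a simplified sketch of the same idea, a ray/polygon test built only from signs of 3D orientation determinants, under stated assumptions: integer coordinates (so Python's integers make every test exact), a planar convex polygon whose first three vertices are not collinear, and hypothetical names `orient3d` and `ray_convex_polygon`.

```python
MISS, GRAZE, HIT = "MISS", "GRAZE", "HIT"


def orient3d(a, b, c, d):
    """Determinant with rows b-a, c-a, d-a: which side of plane abc holds d."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def sign(x):
    return (x > 0) - (x < 0)


def ray_convex_polygon(q, d, poly):
    """Classify the ray from q in direction d against a convex polygon."""
    r = tuple(q[i] + d[i] for i in range(3))  # second point on the ray
    fq = orient3d(poly[0], poly[1], poly[2], q)
    fr = orient3d(poly[0], poly[1], poly[2], r)
    # The ray reaches the plane only if it starts off the plane and heads to it.
    if sign(fq) * sign(fr - fq) >= 0:
        return MISS
    n = len(poly)
    signs = {sign(orient3d(q, r, poly[i], poly[(i + 1) % n])) for i in range(n)}
    if 1 in signs and -1 in signs:
        return MISS   # the ray passes outside at least one edge
    if 0 in signs:
        return GRAZE  # the ray meets the boundary of the polygon
    return HIT


square = [(0, 0, 0), (4, 0, 0), (4, 4, 0), (0, 4, 0)]
print(ray_convex_polygon((1, 1, 5), (0, 0, -1), square))  # HIT
print(ray_convex_polygon((0, 2, 5), (0, 0, -1), square))  # GRAZE
print(ray_convex_polygon((5, 5, 5), (0, 0, -1), square))  # MISS
```

Handling non-convex polygons would need a parity (toggle) count over the edges, as the slide's `hitmiss` variable suggests, rather than the all-same-sign rule used here.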
30Topology for visualization
31Teaching this semester
- Computational Geometry, 3 credit course, COMP 281, TTh 2:00-3:15, Sitterson 325
- blackboard site link
32Teaching this semester
- Applied Optimization in Computational Biology, 1 credit seminar/working lab, COMP 290-079, Th 4:00-5:00, Beard 116 (Pharmacy)
- www.cs.unc.edu/snoeyink/comp290opt/syllabus
33Biophysics training program
- Interdisciplinary program led by Barry Lentz
- www.hekto.med.unc.edu
- Module courses
- Research rotations
34Bioinformatics training program
- New initiative at UNC
- Biophysics, Biochemistry, SILS, Pharmacy, OR, Appl. Math, Comp. Sci.
- Faculty: Wei Wang, Jan Prins, JS
- Students: Luke Huan, Andrew Leaver-Fay
- Fits with CS PhD program
- 7 credits of module courses over 2 years
- Seminar/journal club 4pm Weds
- Optimization seminar/lab 4pm Thur, Beard 116
- Talk to me, Jan, Wei, or Todd Vision
35Puzzles
- What motivates the researcher is the conviction that, if only he (or she) is skilled enough, he (or she) will succeed in solving a puzzle that no-one before has solved or solved so well.
- The Structure of Scientific Revolutions
- Thomas Kuhn
Jack Snoeyink
snoeyink_at_cs.unc.edu
Sitterson 333
Tonight 6-8pm
Monday 9-12am