Protein Structure Prediction

1
Protein Structure Prediction
  • Samantha Chui
  • Oct. 26, 2004

2
Central Dogma of Biology
  • Question: Given a protein sequence, into what
    conformation will it fold?

3
How does nature do it?
  • Hydrophobicity vs. hydrophilicity
  • Van der Waals interaction
  • Electrostatic interaction
  • Hydrogen bonds
  • Disulfide bonds

4
Current Approaches
  • Experimental Methods
      • X-ray crystallography
      • NMR spectroscopy
  • Computational Methods
      • Homology modeling
          • Similar sequences fold into similar structures
      • Threading
          • Dissimilar sequences may fold into similar
            structures
      • Ab initio
          • No similarity assumptions
          • Conformational search

5
Assembly of sub-structural units
  [Diagram labels: known structures, predicted structure]

6
Small Libraries of Protein Fragments Model
Native Protein Structures Accurately
Rachel Kolodny, Patrice Koehl, Leonidas Guibas, and
Michael Levitt, 2002
  • Goal: Find a finite set of protein fragments that
    can be used to construct accurate discrete
    conformations for any protein
  • 1. Generate fragments from known proteins
  • 2. Cluster fragments to identify common
    structural motifs
  • 3. Test library accuracy on proteins not in the
    initial set

7
Datasets of protein fragments
  • 200 unique protein domains from the Protein Data
    Bank (PDB)
      • 36,397 residues
  • Four sets of backbone fragments
      • 4-, 5-, 6-, and 7-residue-long fragments
      • Divide each protein domain into consecutive
        fragments beginning at a random initial position
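
The fragment-generation step can be sketched as follows. This is a minimal illustration, assuming that "consecutive" means non-overlapping and that the random start is drawn from the first `frag_len` positions; the function name and arguments are hypothetical, not taken from the paper.

```python
import random

def split_into_fragments(chain, frag_len, rng=None):
    """Cut a chain (list of residues or coordinates) into consecutive,
    non-overlapping fragments of frag_len residues.

    Assumption: the random initial position is chosen in [0, frag_len).
    Leftover residues at either end are dropped.
    """
    rng = rng or random.Random()
    start = rng.randrange(frag_len)
    return [chain[i:i + frag_len]
            for i in range(start, len(chain) - frag_len + 1, frag_len)]
```

Calling it once per fragment length (4, 5, 6, and 7) would give the four fragment sets listed above.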

8
Fragment structural similarity
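  • Coordinate root-mean-square (cRMS) deviation of
    Cα atoms
  • $\mathrm{cRMS}(A,B) = \sqrt{\frac{1}{N}\sum_{i=1}^{N} d_i^2}$
      • One-to-one mapping between the N atoms in
        structure A and structure B; $d_i$ is the
        distance between the i-th matched pair
      • Translate and rotate to find the best alignment
      • Equals 0 if the structures superimpose perfectly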
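
A minimal sketch of the cRMS computation, assuming NumPy is available and using the Kabsch algorithm (SVD-based optimal rotation) for the "translate and rotate" step. The slides do not say which superposition method the authors used, so the choice of Kabsch here is an assumption.

```python
import numpy as np

def crms(a, b):
    """cRMS deviation between two equally sized sets of C-alpha coordinates
    (N x 3 arrays) after optimal translation and proper rotation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("structures must have the same number of atoms")
    # Translate both structures so their centroids sit at the origin.
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)
    # Kabsch: the SVD of the covariance matrix gives the optimal rotation.
    u, _, vt = np.linalg.svd(a0.T @ b0)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = a0 @ rot.T - b0
    return float(np.sqrt((diff ** 2).sum() / len(a)))
```

A rigidly moved copy of a fragment gives a value of (numerically) zero, while its mirror image does not, because reflections are excluded.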

9
Pruning and clustering
  • Outliers have large cRMS deviation from all other
    fragments
      • Discard according to a fragment-length-specific
        threshold
  • k-means simulated annealing clustering
      • Repeatedly run k-means clustering, merge nearby
        clusters, and split dispersed clusters
      • Scoring function: total variance $\sum (x - \mu)^2$
      • Less sensitive to initial choice of cluster
        centers than k-means
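
The total variance score can be illustrated with a short sketch. It assumes each fragment has already been reduced to a fixed-length numeric vector and uses plain Euclidean distance; a real implementation working on fragments would need to account for superposition, which is left out here. The function name is hypothetical.

```python
def total_variance(points, labels):
    """Sum, over all clusters, of squared distances from each point
    to the mean of its cluster."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    score = 0.0
    for members in clusters.values():
        dim = len(members[0])
        mean = [sum(m[k] for m in members) / len(members) for k in range(dim)]
        score += sum((m[k] - mean[k]) ** 2 for m in members for k in range(dim))
    return score
```

A lower score means tighter clusters, so a merge or split would be kept only when it reduces this value (or is accepted by the annealing schedule).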

10
Compiling the libraries
  • Select cluster centroids as library entries
      • Minimum sum of cRMS deviations from all the
        other cluster fragments
      • Form a representative set of protein fragments
  • Library contents highly dependent upon clustering
    procedure
      • For each set of fragments, start with 50 random
        seeds and choose the library with minimal total
        variance score
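
Both selection rules on this slide can be sketched in a few lines. The helper names, and the `run_clustering` and `score` callables, are hypothetical stand-ins for the clustering procedure and the total variance score; `dist` stands in for cRMS.

```python
def cluster_representative(members, dist):
    """Return the cluster member with the minimum summed distance
    to all other members of the same cluster."""
    return min(members, key=lambda m: sum(dist(m, other) for other in members))

def best_of_restarts(run_clustering, score, n_seeds=50):
    """Run the clustering once per seed and keep the lowest-scoring result."""
    return min((run_clustering(seed) for seed in range(n_seeds)), key=score)
```

The representative is always an actual fragment from the data set, so every library entry is a conformation observed in a real protein.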

11
Evaluating quality of a library
  • Local-fit
      • How well the library fits the local conformation
        of all proteins in the test set
  • Global-fit
      • How well the library fits the global
        three-dimensional conformation of all proteins
        in the test set

12
Local-fit method
  • Protein structures are broken into the set of all
    overlapping fragments of length f
  • For each protein fragment, find the most similar
    fragment in the library (by cRMS)
  • Score: average cRMS value over all fragments in
    all proteins in the test set
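
The local-fit score described above can be sketched as follows. The distance function is passed in as `dist` so the example stays self-contained; in practice it would be a cRMS routine with superposition. All names are hypothetical.

```python
def overlapping_fragments(chain, f):
    """All overlapping windows of length f along a chain."""
    return [chain[i:i + f] for i in range(len(chain) - f + 1)]

def local_fit_score(test_proteins, library, f, dist):
    """Average, over every overlapping fragment of every test protein,
    of the distance to the closest library fragment."""
    best = [min(dist(frag, entry) for entry in library)
            for chain in test_proteins
            for frag in overlapping_fragments(chain, f)]
    return sum(best) / len(best)
```

The indices of the best-matching library entries, not just their distances, are what the global-fit step would reuse.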

13
Local-fit results
14
Global-fit method
  • Concatenate the best local-fit library fragments
    just found
  • Determine each fragment's orientation by
    superimposing its first three Cα atoms onto the
    last three Cα atoms of the preceding fragment
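
The concatenation rule can be sketched with NumPy. This is an illustration under the assumption that the three-atom superposition is a least-squares (Kabsch) fit; the function names are hypothetical.

```python
import numpy as np

def superpose_transform(mobile, target):
    """Return (rot, trans) such that mobile @ rot.T + trans best fits target."""
    mc, tc = mobile.mean(axis=0), target.mean(axis=0)
    u, _, vt = np.linalg.svd((mobile - mc).T @ (target - tc))
    # Keep the transform a proper rotation (no reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return rot, tc - rot @ mc

def attach_fragment(chain, fragment, overlap=3):
    """Place a fragment so its first `overlap` atoms sit on the last `overlap`
    atoms of the chain, then append the fragment's remaining atoms."""
    chain = np.asarray(chain, dtype=float)
    fragment = np.asarray(fragment, dtype=float)
    rot, trans = superpose_transform(fragment[:overlap], chain[-overlap:])
    placed = fragment @ rot.T + trans
    return np.vstack([chain, placed[overlap:]])
```

Each attached fragment of length f therefore extends the chain by f - 3 residues.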

15
Global-fit method
  • The number of possible sequences of fragments is
    exponential in the protein's length
  • Greedy algorithm finds a good rather than the best
    global-fit approximation
      • Start at the N terminus and approximate
        increasingly larger segments of the protein
      • Concatenate the library fragment that yields the
        structure of minimal cRMS deviation from the
        corresponding segment
      • Deterministic, linear time
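
A minimal sketch of the greedy construction, assuming NumPy, a Kabsch-based cRMS, and a three-atom overlap between consecutive fragments. It is not the authors' implementation: for simplicity it recomputes the cRMS over the whole prefix at each step, so this version is not linear time, and any tail residues that do not fit a whole fragment step are left unmodeled.

```python
import numpy as np

def kabsch(mobile, target):
    """Return (rot, trans) such that mobile @ rot.T + trans best fits target."""
    mc, tc = mobile.mean(axis=0), target.mean(axis=0)
    u, _, vt = np.linalg.svd((mobile - mc).T @ (target - tc))
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return rot, tc - rot @ mc

def crms(a, b):
    """cRMS deviation after optimal superposition of a onto b."""
    rot, trans = kabsch(a, b)
    return float(np.sqrt(((a @ rot.T + trans - b) ** 2).sum() / len(a)))

def attach(chain, frag, overlap=3):
    """Append frag to chain by superimposing the overlapping atoms."""
    rot, trans = kabsch(frag[:overlap], chain[-overlap:])
    return np.vstack([chain, (frag @ rot.T + trans)[overlap:]])

def greedy_global_fit(target, library, overlap=3):
    """Build a model from the N terminus, one library fragment at a time,
    always keeping the extension closest to the matching target segment."""
    target = np.asarray(target, dtype=float)
    library = [np.asarray(f, dtype=float) for f in library]
    f = len(library[0])
    # First fragment: the library entry closest to the first f residues.
    chain = min(library, key=lambda frag: crms(frag, target[:f]))
    while len(chain) + (f - overlap) <= len(target):
        candidates = [attach(chain, frag, overlap) for frag in library]
        chain = min(candidates, key=lambda c: crms(c, target[:len(c)]))
    return chain, crms(chain, target[:len(chain)])
```

Because each step commits to a single fragment and never backtracks, the result is deterministic but may miss the best overall combination.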

16
Global-fit results
  • Global-fit cRMS values shown: 0.91 Å, 1.85 Å, 2.78 Å
  • Libraries shown:
      • 50 fragments, 7 residues: 2.66 states/residue
      • 100 fragments, 5 residues: 10 states/residue
      • 20 fragments, 5 residues: 4.47 states/residue

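
The states/residue figures are consistent with a complexity of s^(1/(f-3)) for a library of s fragments of length f, where 3 is the Cα overlap between consecutive fragments. The slide does not state this formula, so treat it as an assumption; the sketch below only checks that it reproduces the three values shown.

```python
def states_per_residue(library_size, fragment_length, overlap=3):
    """Effective number of choices per residue: each fragment adds
    (fragment_length - overlap) new residues and offers library_size options."""
    return library_size ** (1.0 / (fragment_length - overlap))

for size, length in [(50, 7), (100, 5), (20, 5)]:
    print(size, length, round(states_per_residue(size, length), 2))
```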
17
Assembly of sub-structural units
  [Diagram labels: known structures, predicted structure]

18
Protein structure prediction via combinatorial
assembly of sub-structural units
Yuval Inbar, Hadar Benyamini, Ruth Nussinov, and
Haim J. Wolfson, 2003

19
CombDock
  • Input: structural units (SUs) with known 3D
    conformations
      • SUs considered rigid bodies
      • Rotated and translated with respect to each
        other
  • Goal: predict the overall structure
  • Constraints
      • Penetration: avoid steric clashes
      • Backbone: restriction on the maximum distance
        between consecutive SUs

20
All pairs docking
  • N(N-1)/2 pairs of SUs
  • Calculate candidate transformations according to
    matching complementary local features on surface
    of SUs
  • Apply transformation on 2nd SU of pair
  • Keep the K best transformations for each pair
  • Clustering to ensure all K transformations yield
    significantly different complexes
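
The all-pairs stage can be sketched as a loop over the N(N-1)/2 unordered pairs. The `dock` callable is a hypothetical placeholder that returns `(score, transformation)` candidates for a pair; higher scores are assumed to be better, and the clustering step that removes near-duplicate transformations is omitted.

```python
from itertools import combinations
import heapq

def all_pairs_docking(units, dock, k):
    """For every unordered pair of structural units, keep the k
    highest-scoring candidate transformations returned by dock()."""
    results = {}
    for i, j in combinations(range(len(units)), 2):
        candidates = dock(units[i], units[j])
        results[(i, j)] = heapq.nlargest(k, candidates, key=lambda c: c[0])
    return results
```

The resulting dictionary maps each pair `(i, j)` with `i < j` to its shortlist, which is the natural input for the assembly stage.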

21
Combinatorial assembly
  • Multigraph representation
      • Vertices: SUs
      • Edges: transformations between two SUs
      • K parallel edges between any two vertices
  • Final protein conformation: a spanning tree
      • N SUs, one connected component, no cycles
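
Whether a chosen set of transformations forms a spanning tree of the multigraph can be checked with a union-find structure. In this sketch an edge is a hypothetical triple `(u, v, k)`, where `k` selects one of the parallel edges between SUs `u` and `v`.

```python
def is_spanning_tree(n, edges):
    """True if edges (triples u, v, k) connect all n vertices without cycles."""
    edges = list(edges)
    if len(edges) != n - 1:
        return False
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, _ in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge would close a cycle
        parent[root_u] = root_v
    # n - 1 edges and no cycle imply a single connected component.
    return True
```

Two parallel edges between the same pair of SUs always form a cycle, so a valid tree uses at most one transformation per pair.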

22
Combinatorial Assembly
  • $N^{N-2}K^{N-1}$ different spanning trees
  • Not all spanning trees are valid complexes
  • Use a heuristic algorithm
  • Two subtrees are adjacent iff there exists an
    index i such that vertex i is in one subtree and
    vertex i+1 is in the other
  • Sequential tree: recursive definition
      • One vertex
      • Tree with an edge that connects two adjacent
        sequential trees
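
The count follows from Cayley's formula (N^(N-2) labeled trees on N vertices) multiplied by K choices for each of the N-1 edges. The sketch below computes it and implements the adjacency test; both function names are hypothetical, and the subtrees are assumed to have disjoint vertex sets.

```python
def count_spanning_trees(n, k):
    """Number of spanning trees of a complete multigraph with n vertices
    and k parallel edges between every pair of vertices."""
    if n < 2:
        return 1
    return n ** (n - 2) * k ** (n - 1)

def are_adjacent(vertices_a, vertices_b):
    """True if some index i lies in one vertex set and i + 1 in the other."""
    a, b = set(vertices_a), set(vertices_b)
    return any(i + 1 in b for i in a) or any(i + 1 in a for i in b)
```

With 10 SUs and 100 transformations per pair the count is already 10^26, which is why exhaustive enumeration is ruled out.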

23
Combinatorial Assembly
  • Hierarchical algorithm of N stages
      • ith stage: generate sequential trees with i
        vertices
      • Construct trees by connecting adjacent
        sequential trees of smaller sizes generated
        earlier
  • Keep the D best sequential trees at each step
  • Discard trees that do not meet the backbone and
    penetration constraints
  • Score: sum of the scores of the transformations
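
The hierarchical stage can be sketched as dynamic programming over consecutive index ranges. Several choices here are assumptions rather than details from the slides: the D best trees are kept per index range, higher scores are better, and the constraint check is a hypothetical `is_valid` callback that receives the tree's edges. `edges[(u, v)]` lists `(score, transformation)` candidates for SUs `u < v`.

```python
import heapq

def assemble(n, edges, d, is_valid=lambda tree_edges: True):
    """Return up to d best sequential trees spanning SUs 0..n-1 as
    (score, edges) pairs, where each edge is (u, v, candidate_index)."""
    # Stage 1: every single vertex is a sequential tree with score 0.
    trees = {(a, a): [(0.0, ())] for a in range(n)}
    for size in range(2, n + 1):               # stage `size`
        for a in range(n - size + 1):
            b = a + size - 1
            found = {}                         # edge set -> total score
            for m in range(a, b):              # join [a..m] with [m+1..b]
                for left_score, left_edges in trees[(a, m)]:
                    for right_score, right_edges in trees[(m + 1, b)]:
                        for u in range(a, m + 1):
                            for v in range(m + 1, b + 1):
                                for idx, (score, _t) in enumerate(edges.get((u, v), [])):
                                    key = tuple(sorted(left_edges + right_edges + ((u, v, idx),)))
                                    if key not in found and is_valid(key):
                                        found[key] = left_score + right_score + score
            best = heapq.nlargest(d, found.items(), key=lambda kv: kv[1])
            trees[(a, b)] = [(s, e) for e, s in best]
    return trees[(0, n - 1)]
```

Pruning to D trees per range keeps the work bounded, at the cost of possibly discarding a partial tree that would have led to the best full assembly.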

24
Combinatorial Assembly
25
CombDock Results
26
Conclusion
  [Diagram labels: protein sequence, fragment library,
  known structures, predicted structure]
  • Experimental Methods
      • X-ray crystallography
      • NMR spectroscopy
  • Computational Methods
      • Homology modeling
          • Similar sequences fold into similar structures
      • Threading
          • Dissimilar sequences may fold into similar
            structures
      • Ab initio
          • No similarity assumptions
          • Conformational search


27
References
  • Kolodny et al., Small libraries of protein
    fragments model native protein structures
    accurately, 2002
  • Inbar et al., Protein structure prediction via
    combinatorial assembly of sub-structural units,
    2003