Protein Conformation Prediction (Part III) - PowerPoint PPT Presentation

1
Protein Conformation Prediction (Part III)
  • Doug Raiford
  • Lesson 19

2
Review: two folding models
  • Framework model
  • Secondary structure forms first
  • Then assemble the secondary structure segments
  • Hydrophobic collapse
  • Molten globule: compact but denatured
  • Secondary structure forms after the collapse
    settles in
  • van der Waals forces and hydrogen bonds require
    close proximity

3
Review: approaches
  • Two main approaches
  • Focus this lesson: de novo

4
Review
  • Did a quick look at threading (homology based)
  • Chou-Fasman (frequency of occurrence of amino
    acids at specific locations in structure)
  • Looked at HMMs (HMMER and Protein Families, Pfam)
  • Looked at ROSETTA (de novo, knowledge based)

Name            P(α)   P(β)   P(turn)
Alanine          142     83      66
Arginine          98     93      95
Aspartic Acid    101     54     146
Valine           106    170      50

5
An ab initio example
  • Lattice approach
  • Abstraction: take a problem of extreme complexity
    and simplify
  • Levinthal's paradox (Levinthal: physicist;
    Berkeley, MIT, Columbia)
  • Protein with 100 amino acids: > 3^100 possible
    structures
  • Even if really fast (10^-13 seconds to sample
    each structure)
  • 1.6 × 10^27 years to go through all structures
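
The arithmetic behind Levinthal's paradox can be checked in a few lines. This is a minimal sketch; the function name and the assumption of 3 states per residue (which gives the 3^100 figure on the slide) are illustrative.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def levinthal_years(n_residues=100, states_per_residue=3, seconds_per_sample=1e-13):
    """Years needed to sample every conformation once (assumed 3 states per residue)."""
    conformations = states_per_residue ** n_residues
    return conformations * seconds_per_sample / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{3 ** 100:.2e} conformations")
    print(f"{levinthal_years():.1e} years to sample them all")
```

Even a sampling rate of 10^13 structures per second leaves the exhaustive search far beyond reach, which is why the lattice approach restricts the search space so aggressively.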

6
Approach: big picture
  • Premise: proteins fold into the lowest energy
    conformation
  • Reduce complexity by restricting amino acid
    locations to evenly spaced lattice points
  • Generate all possible conformations (within
    certain constraints)
  • Lowest energy models should be representative

7
Reduce complexity
  • Only occupy nodes of a lattice
  • Globular
  • Limit number of nodes to 50
  • Ellipsoidal bounding volume
  • No nodes without at least 2 connecting edges (no
    dead-ends)
  • Fewer nodes than amino acids in sequence (n/2)
  • Must align after the fact
  • From 0 to 3 residues between nodes

8
Reduce complexity (cont'd)
  • Limit to sequence length of 100 (n)
  • Energy function statistically derived (versus
    computationally expensive energy calculations)
  • Minimal-edge lattice: diamond lattice
  • Between 10^5 and 10^7 enumerated conformations

9
How Exhausting Is Exhaustive: Time
  • We are able to do exhaustive searches of
    compact, bounded lattice structures with up to
    approximately 40 vertices. These searches take on
    the order of a few hours on a fast workstation,
    and can easily be executed in parallel over
    several machines.

10
Complexity Reduction: Tetrahedral Lattice
  • At most 3 choices at each node
  • Self-avoiding, therefore much pruning
  • Constrained to small volume (ellipsoid)
  • Probably recursive enumeration with
    self-avoidance
  • Filter
  • Symmetry check: remove conformations that differ
    only in their orientation
  • 26 already
  • Remember, total of 50
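
The recursive enumeration with self-avoidance can be sketched as follows. This is an illustration under stated assumptions, not the enumeration code used in the work the slides describe: the names `neighbors` and `enumerate_walks` are hypothetical, the optional `inside` predicate stands in for the ellipsoidal bounding volume, and the symmetry filter is left out.

```python
# Bond vectors of the diamond (tetrahedral) lattice for sites on the "even"
# sublattice; sites on the "odd" sublattice use the negated vectors.
EVEN_STEPS = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]


def neighbors(site):
    """Return the four lattice neighbours of a site."""
    x, y, z = site
    sign = 1 if x % 2 == 0 else -1  # even sites have all-even coordinates
    return [(x + sign * dx, y + sign * dy, z + sign * dz) for dx, dy, dz in EVEN_STEPS]


def enumerate_walks(n_nodes, inside=lambda site: True):
    """Yield every self-avoiding walk of n_nodes sites that starts at the origin.

    `inside` is an assumed hook for the bounding volume: sites for which it
    returns False are never visited.
    """
    path = [(0, 0, 0)]
    visited = {(0, 0, 0)}

    def extend():
        if len(path) == n_nodes:
            yield tuple(path)
            return
        for nxt in neighbors(path[-1]):
            if nxt in visited or not inside(nxt):
                continue  # prune: site already occupied or outside the volume
            path.append(nxt)
            visited.add(nxt)
            yield from extend()
            path.pop()
            visited.remove(nxt)

    yield from extend()


if __name__ == "__main__":
    # Example bounding ellipsoid with semi-axes 4, 3, 3 (illustrative values).
    in_ellipsoid = lambda s: (s[0] / 4) ** 2 + (s[1] / 3) ** 2 + (s[2] / 3) ** 2 <= 1
    print(sum(1 for _ in enumerate_walks(8, inside=in_ellipsoid)))
```

After the first step each node offers at most 3 continuations, because the fourth neighbour is the site the walk just came from and is already occupied.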

11
Given All Possible Conformations
  • How to align the sequence?
  • Remember there are more amino acids than nodes
    (from 0 to 3 residues between nodes)
  • How to score the overall energy of a conformation?
  • How to judge similarity to the known protein
    (native) conformation?

12
Aligning
  • Iterative/Dynamic
  • Start out evenly spaced
  • For each node determine the seven possible
    residues
  • Choose lowest energy not taken previously
  • Rinse and repeat
  • Converges in 3 to 5 iterations

Sequence Position: 1 2 3 4 5 6 7 8 9 10 11
Nodal Position 1: -1 0 1 0
Nodal Position 2: -1 0 0 0
Nodal Position 3: 0 0 -1 0 0 0
Nodal Position 4: 0 -1 0 0 0
Nodal Position 5: -1 0 0 0
Nodal Position 6: -1 0 0

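
The iterative alignment described on this slide can be sketched as a greedy loop. Everything here is an assumption made for illustration: the function name, the `energy(node, residue, assignment)` callback, and the rule that each node must take a residue later in the sequence than the previous node are not taken from the slides.

```python
def align_sequence_to_nodes(n_residues, n_nodes, energy, max_iter=10):
    """Greedy, iterative assignment of one residue index to each lattice node.

    `energy(node, residue, assignment)` is a hypothetical callback returning
    the energy of placing `residue` at `node` given the current assignment.
    Requires 2 <= n_nodes <= n_residues.
    """
    # Start out evenly spaced along the sequence.
    assign = [(i * (n_residues - 1)) // (n_nodes - 1) for i in range(n_nodes)]
    for _ in range(max_iter):
        changed = False
        prev = -1
        for node in range(n_nodes):
            cur = assign[node]
            upper = n_residues - (n_nodes - node)  # leave room for later nodes
            # Seven candidates: the current residue and three on either side,
            # restricted to residues not taken by an earlier node.
            candidates = [r for r in range(cur - 3, cur + 4) if prev < r <= upper]
            best = min(candidates, key=lambda r: energy(node, r, assign))
            if best != cur:
                assign[node] = best
                changed = True
            prev = best
        if not changed:
            break  # converged
    return assign


if __name__ == "__main__":
    # Toy energy: node k "prefers" residue 2k + 1 (purely illustrative).
    toy = lambda node, residue, assignment: abs(residue - (2 * node + 1))
    print(align_sequence_to_nodes(20, 10, toy))
```

A real energy function would depend on which residues are in contact under the current assignment, which is why more than one pass is needed before the assignment stops changing.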
13
Scoring Energy
  • Energy associated with an m,n contact: weighted
    average of 5 adjacent energies
  • The m,n pair itself is given double weight
  • Rest given single weight
  • Average of all energies (divide by 6)
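
The weighting can be written out as a short function. This is a sketch: the slide does not say which four pairs count as "adjacent", so the choice below (m±1 with n, and m with n±1) is an assumption, and `pair_energy` is a hypothetical lookup of the residue-residue energy.

```python
def contact_energy(seq, m, n, pair_energy):
    """Weighted average of five pair energies around the (m, n) contact.

    The (m, n) pair has weight 2, the four assumed neighbouring pairs have
    weight 1, and the sum is divided by 6.
    """
    if not (1 <= m < len(seq) - 1 and 1 <= n < len(seq) - 1):
        raise ValueError("m and n must be interior positions in this sketch")
    terms = [
        (m, n, 2),      # the contact itself, double weight
        (m - 1, n, 1),  # assumed adjacent pairs, single weight
        (m + 1, n, 1),
        (m, n - 1, 1),
        (m, n + 1, 1),
    ]
    total = sum(w * pair_energy(seq[i], seq[j]) for i, j, w in terms)
    return total / 6


if __name__ == "__main__":
    # Toy energy: -1 for a Leu-Leu pair, 0 otherwise (illustrative only).
    toy = lambda a, b: -1.0 if a == b == "L" else 0.0
    print(contact_energy("AALAAALAA", 2, 6, toy))
```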

14
Scoring Energy
  • But where did e(r_m, r_n) come from?
  • Statistically derived

15
Scoring Energy
  • Given a database of proteins the energy of any
    given combination of two amino acids is given by

16
Discrete State Off-Lattice Models
  • Instead of limiting residues to regularly spaced
    lattice nodes in space
  • Limit phi and psi angles to a reduced set of
    discrete angles
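
A discrete-state model can be sketched by giving every residue a small menu of (phi, psi) pairs. The three states and their angle values below are illustrative assumptions, not the set used by any particular published model.

```python
from itertools import product

# Illustrative (phi, psi) states in degrees; the values are assumptions.
STATES = {
    "helix": (-60.0, -45.0),
    "strand": (-120.0, 130.0),
    "left": (60.0, 45.0),
}


def enumerate_conformations(n_residues):
    """Yield every assignment of one discrete (phi, psi) state per residue."""
    for labels in product(STATES, repeat=n_residues):
        yield [STATES[label] for label in labels]


if __name__ == "__main__":
    conformations = list(enumerate_conformations(4))
    print(len(conformations))  # number of states raised to the chain length
```

The search space is still exponential in chain length (states^n), so the number of allowed states per residue has to stay small.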

17
Scoring energy
  • Off lattice models often attempt to minimize
    total energy

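G: free energy, H: enthalpy, S: entropy
$\Delta E = q - w$
$S = k \ln \Omega$
$\Delta H = \Delta E + \Delta(PV)$
$\Delta G = \Delta G_{\text{van der Waals}} + \Delta G_{\text{H-bonds}} + \Delta G_{\text{solvent}} + \Delta G_{\text{Coulomb}}$
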
18
Scoring accuracy
  • Backbone RMSD
  • Root mean square deviation
  • Usually choose the top 100 or so predictions and
    show that the actual structure resides in the set

[Figure: the top 100 conformations, with the actual conformation marked among them]

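
Root mean square deviation over matched backbone atoms can be computed as below. This minimal sketch assumes the two structures have already been superimposed and that their atoms are listed in the same order; the function name is illustrative.

```python
import math


def backbone_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    model = [(0.0, 0.5, 0.0), (1.5, -0.5, 0.0), (3.0, 0.5, 0.0)]
    print(backbone_rmsd(native, model))
```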
19
(No Transcript)
20
PDB files
                                     X       Y       Z  Occu  Temp     Element
ATOM      1  N   THR A   5      23.200  72.500  13.648  1.00 51.07           N
ATOM      2  CA  THR A   5      23.930  72.550  12.350  1.00 51.27           C
ATOM      3  C   THR A   5      23.034  72.048  11.220  1.00 50.34           C
ATOM      4  O   THR A   5      22.819  72.747  10.228  1.00 51.19           O
ATOM      5  CB  THR A   5      25.221  71.703  12.416  1.00 51.94           C
ATOM      6  OG1 THR A   5      26.159  72.326  13.305  1.00 53.51           O
ATOM      7  CG2 THR A   5      25.849  71.583  11.046  1.00 53.33           C

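
PDB ATOM records are fixed-width, so they are best read by column position rather than by splitting on whitespace. The sketch below uses the standard column ranges of the PDB format; the function name and the returned dictionary keys are illustrative choices.

```python
def parse_atom_record(line):
    """Parse one fixed-column PDB ATOM record into a dictionary."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "temp_factor": float(line[60:66]),
        "element": line[76:78].strip(),
    }


# The first record from the slide, assembled field by field to show the widths.
SAMPLE = (
    "ATOM  " + "    1" + " " + " N  " + " " + "THR" + " " + "A" + "   5"
    + "    " + "  23.200" + "  72.500" + "  13.648" + "  1.00" + " 51.07"
    + " " * 10 + " N"
)

if __name__ == "__main__":
    print(parse_atom_record(SAMPLE))
```

Reading by column keeps the parser correct when adjacent fields run together, for example when a coordinate fills its whole 8-character field.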
21
Algorithm
Name            P(α)   P(β)   P(turn)   f(i)    f(i+1)   f(i+2)   f(i+3)
Alanine          142     83      66     0.060    0.076    0.035    0.058
Arginine          98     93      95     0.070    0.106    0.099    0.085
Aspartic Acid    101     54     146     0.147    0.110    0.179    0.081
Asparagine        67     89     156     0.161    0.083    0.191    0.091
Cysteine          70    119     119     0.149    0.050    0.117    0.128
Glutamic Acid    151     37      74     0.056    0.060    0.077    0.064
Glutamine        111    110      98     0.074    0.098    0.037    0.098
Glycine           57     75     156     0.102    0.085    0.190    0.152
Histidine        100     87      95     0.140    0.047    0.093    0.054
Isoleucine       108    160      47     0.043    0.034    0.013    0.056
Leucine          121    130      59     0.061    0.025    0.036    0.070
Lysine           114     74     101     0.055    0.115    0.072    0.095
Methionine       145    105      60     0.068    0.082    0.014    0.055
Phenylalanine    113    138      60     0.059    0.041    0.065    0.065
Proline           57     55     152     0.102    0.301    0.034    0.068
Serine            77     75     143     0.120    0.139    0.125    0.106
Threonine         83    119      96     0.086    0.108    0.065    0.079
Tryptophan       108    137      96     0.077    0.013    0.064    0.167
Tyrosine          69    147     114     0.082    0.065    0.114    0.125
Valine           106    170      50     0.062    0.048    0.028    0.053
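
The slide lists the parameters without spelling out the rule, so the following is a sketch of the commonly cited Chou-Fasman turn test rather than a transcription of the lecture. The 0.000075 cutoff, the one-letter codes, and the function names are assumptions; the numbers are copied from the table for a subset of residues.

```python
# one-letter code: (P_alpha, P_beta, P_turn, f(i), f(i+1), f(i+2), f(i+3))
TABLE = {
    "A": (142, 83, 66, 0.060, 0.076, 0.035, 0.058),
    "G": (57, 75, 156, 0.102, 0.085, 0.190, 0.152),
    "I": (108, 160, 47, 0.043, 0.034, 0.013, 0.056),
    "L": (121, 130, 59, 0.061, 0.025, 0.036, 0.070),
    "N": (67, 89, 156, 0.161, 0.083, 0.191, 0.091),
    "P": (57, 55, 152, 0.102, 0.301, 0.034, 0.068),
    "S": (77, 75, 143, 0.120, 0.139, 0.125, 0.106),
    "V": (106, 170, 50, 0.062, 0.048, 0.028, 0.053),
}
TURN_CUTOFF = 0.000075  # assumed threshold on the product of the four f values


def turn_probability(seq, i):
    """Product f(i) * f(i+1) * f(i+2) * f(i+3) for the window starting at i."""
    p = 1.0
    for offset in range(4):
        p *= TABLE[seq[i + offset]][3 + offset]
    return p


def is_turn(seq, i):
    """Apply the turn test to the four-residue window starting at index i."""
    window = seq[i:i + 4]
    if len(window) < 4:
        return False

    def average(column):
        return sum(TABLE[aa][column] for aa in window) / 4

    p_alpha, p_beta, p_turn = average(0), average(1), average(2)
    return (
        turn_probability(seq, i) > TURN_CUTOFF
        and p_turn > 100
        and p_turn > p_alpha
        and p_turn > p_beta
    )


if __name__ == "__main__":
    for start, window in [(0, "NPGS"), (0, "AVIL")]:
        print(window, turn_probability(window, start), is_turn(window, start))
```

Extending `TABLE` to all twenty residues from the table lets the same test slide along a full sequence one position at a time.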