Protein Structure Prediction - PowerPoint PPT Presentation

Provided by: profng
1
Protein Structure Prediction
N. Gautham
Department of Crystallography and Biophysics
University of Madras, Guindy Campus
Chennai 600 025
2
Lecture Outline
  • The problem of protein structure prediction: its statement
  • The Levinthal paradox and computational complexity
  • Methods of structure prediction
  • Ab initio methods
  • Genetic algorithms
  • Potential energy minimisation
  • MOLS

3
Information transfer pathway within the cell
DNA (ATGCATGCATGCATGCATGC..) -> RNA (CGUACGUACGUACGU) -> decoding mechanism -> protein sequence -> protein structure -> biological function
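The DNA -> RNA -> protein sequence steps of this pathway can be sketched in a few lines of Python. This is only an illustration: it assumes the DNA string is the coding strand (so transcription is a T -> U substitution), uses the standard genetic code, and the function names `transcribe` and `translate` are made up for this sketch.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def transcribe(dna: str) -> str:
    """Coding-strand DNA to mRNA (assumption: simply replace T with U)."""
    return dna.upper().replace("T", "U")

def translate(rna: str) -> str:
    """Read the mRNA codon by codon until a stop codon or the end of the string."""
    protein = []
    for i in range(0, len(rna) - len(rna) % 3, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(translate(transcribe("ATGCATGCATGC")))
print(translate("CGUACGUACGUACGU"))
```

The second call translates the RNA string shown on the slide. The step from protein sequence to protein structure has no such lookup table, which is the subject of the rest of the lecture.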
4
Statement of the problem
  • The sequence of amino acids in a protein determines its three-dimensional structure

.AVTYRGSED.
  • The structure of a protein is essential for its function

5
Statement of the problem
  • Structures are determined experimentally using X-ray crystallography and NMR
  • This is expensive and time-consuming
  • Instead, can this be done using computers?
  • The Problem

Given the sequence of a protein, can we use available information from physics, chemistry (and databases of previous structures, etc.) to calculate its three-dimensional structure?
6
Levinthal Paradox
  • The Levinthal paradox arises when we consider protein folding as a thermodynamic phenomenon (driven by entropy)
  • This means
  • - the native fold of a protein is its minimum energy state
  • - the protein folds by sampling its conformational space to find the conformation with the least energy

7
Levinthal Paradox
  • The time taken to search all possibilities increases exponentially with the size of the protein (by known algorithms)
  • In other words, the problem of protein structure prediction (or protein folding) is NP-hard in terms of computational complexity
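A rough back-of-the-envelope version of the exhaustive-search estimate is sketched below. The numbers are illustrative assumptions, not values from the slides: three possible states per residue and 10^13 conformations sampled per second. The function name `levinthal_years` is made up for this sketch.

```python
SECONDS_PER_YEAR = 3600 * 24 * 365

def levinthal_years(n_residues, states_per_residue=3, samples_per_second=1e13):
    """Years needed to visit every conformation once (both defaults are assumptions)."""
    conformations = states_per_residue ** n_residues
    return conformations / samples_per_second / SECONDS_PER_YEAR

for n in (10, 50, 100):
    print(n, f"{levinthal_years(n):.3g} years")
```

Under these assumptions a 100-residue chain would need far longer than the age of the universe, whereas real proteins fold in seconds or less. That mismatch is the paradox.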

8
Levinthal Paradox
The Golf Course model of the potential energy landscape
9
Levinthal Paradox: The new view of protein folding
The folding funnel model of the potential energy landscape
10
Computational complexity
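  • If the computation time of an algorithm increases as a polynomial function of the size of the problem, it is a polynomial-time algorithm. Problems that can be solved this way belong to the set P
  • e.g. $\text{Time} = \text{const} \times (\text{size})^{2}$ or $\text{const} \times (\text{size})^{5}$
  • If the computation time increases as an exponential function of the size of the problem, it is an exponential (non-polynomial) time algorithm
  • e.g. $\text{Time} = \text{const} \times 2.5^{\text{size}}$
  • NP (nondeterministic polynomial time) is the set of problems whose proposed solutions can be checked in polynomial time. P is contained in NP, and for the hardest problems in NP only exponential-time algorithms are known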
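To see how differently the polynomial and exponential examples above grow, the short sketch below tabulates them for a few problem sizes. The constant is set to 1 and the function names are chosen here purely for illustration.

```python
def poly_time(size, power=2, const=1.0):
    """Polynomial cost: const * size**power."""
    return const * size ** power

def exp_time(size, base=2.5, const=1.0):
    """Exponential cost: const * base**size."""
    return const * base ** size

print("size   size^2      size^5      2.5^size")
for size in (5, 10, 20, 40, 80):
    print(f"{size:>4}   {poly_time(size, 2):<10.3g}  {poly_time(size, 5):<10.3g}  {exp_time(size):.3g}")
```

For small sizes the fifth-power cost is actually the largest of the three, but the exponential term overtakes it quickly and then dominates by many orders of magnitude.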

11
Computational complexity
  • The Travelling Salesman problem
  • An example of a problem in NP (its decision version is NP-complete)
  • Problem: Find the path with the least distance that covers all cities at least once.
  • The number of paths to be tried by exhaustive search increases factorially, i.e. faster than any exponential function, with the number of cities
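A minimal brute-force sketch makes the cost of exhaustive search concrete. It assumes the common closed-tour form of the problem (start and end at the same city, straight-line distances), and the function name `shortest_tour` and the example coordinates are invented for this illustration.

```python
from itertools import permutations
from math import dist, factorial

def shortest_tour(cities):
    """Try every ordering of the cities after the first and keep the shortest closed tour."""
    start, *rest = range(len(cities))
    best_len, best_tour = float("inf"), None
    for perm in permutations(rest):
        tour = (start, *perm, start)
        length = sum(dist(cities[a], cities[b]) for a, b in zip(tour, tour[1:]))
        if length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour

cities = [(0, 0), (0, 1), (1, 1), (1, 0)]  # corners of a unit square
length, tour = shortest_tour(cities)
print(length, tour)

# Orderings examined for n cities with a fixed start: (n - 1)!
for n in (5, 10, 15, 20):
    print(n, factorial(n - 1))
```

With the starting city fixed, n cities require (n - 1)! orderings, so this approach is already impractical at around 15 to 20 cities.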

12
Computational complexity
  • Problems in computational biology that are in NP
  • Construction of phylogenetic trees
  • Multiple sequence alignment
  • Protein Structure Prediction

13
Methods of Protein Structure Prediction
  • Homology modelling
  • Fold recognition
  • Ab initio methods
  • Genetic algorithms
  • Potential energy minimisation
  • MOLS

14
Ab initio methods: Genetic algorithms
  • a.k.a. Evolutionary Computation
  • The method operates on pieces of information (like Nature on genes)
  • Start with a group of individuals (binary coded?) that represent possible solutions to the problem
  • Apply mutation, variation and crossover operators to the individuals
  • From the resulting population, select individuals with high values of fitness to populate the next generation
  • Iterate till the best individual is obtained
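The loop described above can be sketched on a toy problem. This is a generic illustration, not the lecturer's implementation: individuals are bit lists, fitness is simply the number of 1 bits, the fitter half of each generation is kept unchanged, and the variation operator is left out because small increments only make sense for continuous parameters. All names and parameter values are assumptions.

```python
import random

def fitness(bits):
    """Toy fitness: count of 1 bits (higher is better)."""
    return sum(bits)

def mutate(bits, rng, rate=0.05):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]

def crossover(a, b, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def run_ga(n_bits=20, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # selection: keep the fitter half
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness), best_per_generation

best, history = run_ga()
print(fitness(best), history[0], history[-1])
```

Because the selected parents are carried over unchanged, the best fitness never decreases from one generation to the next in this sketch.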

15
Genetic algorithms: Application to Protein Structure Prediction
  • The initial generation consisted of protein structures with a random choice of torsion angle values
  • The fitness function was a semi-empirical potential energy function, i.e. $E_{vdW} + E_{el} + E_{tor} + E_{pseudoentropy}$
  • The mutate operator randomly changed torsion angle values
  • The variate operator made small, random increments or decrements to torsion angle values
  • The crossover operator interchanged portions of randomly selected pairs of individuals
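The three operators can be sketched on individuals represented as lists of torsion angles in degrees. This is a hedged illustration only: the mutation rate, step size, single cut point and especially `toy_energy` (a lone threefold torsional term standing in for the full semi-empirical energy) are assumptions, not details taken from the slides.

```python
import math
import random

def wrap(angle):
    """Map an angle in degrees into the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def random_individual(n_torsions, rng):
    """One structure of the initial generation: random torsion angles."""
    return [rng.uniform(-180.0, 180.0) for _ in range(n_torsions)]

def mutate(ind, rng, rate=0.1):
    """Replace some torsion angles by completely new random values."""
    return [rng.uniform(-180.0, 180.0) if rng.random() < rate else a for a in ind]

def variate(ind, rng, max_step=5.0):
    """Make small random increments or decrements to every torsion angle."""
    return [wrap(a + rng.uniform(-max_step, max_step)) for a in ind]

def crossover(a, b, rng):
    """Interchange the portions of two individuals after a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def toy_energy(ind):
    """Placeholder fitness (assumption): threefold torsional term, lower is better."""
    return sum(1.0 + math.cos(math.radians(3.0 * a)) for a in ind)

rng = random.Random(1)
a, b = random_individual(10, rng), random_individual(10, rng)
c, d = crossover(a, b, rng)
print(toy_energy(a), toy_energy(variate(a, rng)), toy_energy(mutate(c, rng)))
```

In a real application the placeholder would be replaced by an energy computed from the three-dimensional structure built from the torsion angles.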

16
Potential Energy Minimization
  • Minimize the potential energy (Least squares, Conjugate Gradient, Molecular Dynamics.)
  • The problem: where to start? How to avoid local minima?
  • Many methods

- Build-up method
- Conformational Space Annealing
- Monte Carlo Minimization
- Diffusion Equation and Distance Scaling
- Simulated Annealing
- .
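As an illustration of one of the listed methods, the sketch below applies simulated annealing to a one-dimensional toy function with many local minima. The function `energy`, the geometric cooling schedule and every parameter value are assumptions chosen for this example. A real calculation would use a molecular potential energy over many torsion angles.

```python
import math
import random

def energy(x):
    """Toy landscape (assumption): a parabola with a sinusoidal ripple."""
    return 0.1 * x * x + math.sin(3.0 * x)

def simulated_annealing(x0, steps=5000, t_start=2.0, t_end=0.01, step_size=0.5, seed=0):
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for k in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        x_new = x + rng.uniform(-step_size, step_size)
        e_new = energy(x_new)
        # Metropolis rule: always accept downhill moves, sometimes accept uphill ones.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

best_x, best_e = simulated_annealing(x0=4.0)
print(best_x, best_e)
```

Accepting occasional uphill moves at high temperature is what lets the search leave the local minimum it starts near, which a pure downhill method such as conjugate gradient cannot do.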
17
Potential Energy Minimization
  • Build-up method

18
Mutually Orthogonal Latin Squares
OBJECTIVE: To build a library of the lowest energy conformations of an oligopeptide
19
Mutually Orthogonal Latin Squares
METHOD (IN BRIEF): The MOLS cycle
Parameterize the search space
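The transcript does not preserve the details of the MOLS cycle, so the sketch below only shows what a set of mutually orthogonal Latin squares is and one standard way to construct it. It assumes the order n is prime and uses the textbook formula L_k(i, j) = (k*i + j) mod n; the function names are invented here.

```python
def mols(n):
    """Return n - 1 mutually orthogonal Latin squares of prime order n."""
    return [[[(k * i + j) % n for j in range(n)] for i in range(n)]
            for k in range(1, n)]

def is_latin(square):
    """Every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok

def are_orthogonal(a, b):
    """Superimposing the two squares yields every ordered pair of symbols exactly once."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n

squares = mols(5)
for row in squares[0]:
    print(row)
print(all(is_latin(s) for s in squares))
print(all(are_orthogonal(squares[p], squares[q])
          for p in range(len(squares)) for q in range(p + 1, len(squares))))
```

The orthogonality property means a small number of superimposed squares covers all pairwise combinations of symbols evenly, which is what makes such designs attractive for sampling a large parameter space sparsely.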
20
Mutually Orthogonal Latin Squares
Results: The (23) best structures for Met-enkephalin
21
GA
  • Initial population of 50 individual structures for the sequence
  • Mutations
  • Variations
  • Crossing over
  • Selection of individuals with lower energy
MOLS
  • Sequence is split into overlapping fragments of five/seven/nine residues
  • Conformational search for all fragments using MOLS, yielding 1000 structures each
  • Resulting structures are clustered
  • A library of structures for each fragment
Structures from MOLS libraries
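The fragment-splitting step on the MOLS side can be sketched as a sliding window. The slide does not say by how much neighbouring fragments overlap, so a shift of one residue is assumed here, and the function name `overlapping_fragments` is hypothetical. The example sequence is the nine-residue stretch shown on the earlier problem-statement slide.

```python
def overlapping_fragments(sequence, length, step=1):
    """Slide a window of `length` residues along the sequence (step of 1 is an assumption)."""
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, step)]

sequence = "AVTYRGSED"
for length in (5, 7, 9):
    print(length, overlapping_fragments(sequence, length))
```

Each fragment would then get its own conformational search and its own library of low-energy structures.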
22
Avian pancreatic polypeptide: 36 residues, RMSD 4.0 Å
Prediction
Experiment (X-ray crystallography)
23
Villin headpiece: 36 residues, RMSD 5.2 Å
Experiment (X-ray crystallography)
Prediction
24
Tryptophan zipper: 16 residues, RMSD 2.7 Å
Prediction
Experiment (NMR)
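The RMSD values quoted on these slides measure how far the predicted atom positions lie from the experimental ones. A minimal sketch of the calculation follows. It assumes the two structures have already been optimally superimposed and that atoms are listed in the same order. The slides do not say which atoms were compared, and the coordinates below are made-up numbers.

```python
from math import sqrt

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of atoms")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return sqrt(total / len(coords_a))

# Hypothetical coordinates in angstroms, for illustration only.
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.5, 0.0)]
experiment = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.5, 1.0)]
print(rmsd(predicted, experiment))
```

In practice the superposition step (for example by a least-squares rotation fit) is done first, since the RMSD of unaligned structures is not meaningful.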
25
Bovine Pancreatic Trypsin Inhibitor: 58 residues, 3 disulphide bridges
Prediction
Experiment (X-ray)
26
Protein Structure Prediction: Conclusion
  • If the sequence of the protein of unknown structure has greater than 40% identity with one of known structure, the structure prediction problem may be considered solved, especially with the structural genomics initiative
  • Ab initio structure prediction, using knowledge only of sequence, and of physics and chemistry, is as yet an unsolved problem