Tertiary Structure Prediction Methods

Title: Homology Modeling: principles, tools and techniques
Author: supa
Last modified by: Muhammed Sayed
Created: 6/15/2002 7:21:30 AM

1
Tertiary Structure Prediction Methods
Any given protein sequence
2
Why Homology Modelling?
  • X-ray Diffraction
    • Only a small number of proteins can be made to
      form crystals.
    • A crystal is not the protein's native
      environment.
    • Very time consuming.
  • NMR Distance Measurement
    • Not all proteins are found in solution.
    • This method generally looks at isolated
      proteins rather than protein complexes.
    • Very time consuming.

3
Homology Modeling: Principles, tools and
techniques
  • Development of molecular biology: rapid
    identification, isolation and sequencing of
    genes.
  • Problem: obtaining the 3D structure of a protein
    is a time-consuming task.
  • An alternative strategy in structural biology is
    to develop models of proteins when the
    constraints from X-ray diffraction or NMR are not
    yet available.
  • Homology modeling is a method that can be
    applied to generate reasonable models of protein
    structure.

4
Database approach to homology modelling
As of June 2000, 12,500 protein structures had
been deposited in the Protein Data Bank (PDB)
and 86,500 protein sequence entries were
contained in the SwissProt protein sequence
database. This is a ratio of about 1:7, so
relatively few structures are known. The number
of sequences will increase much faster than the
number of structures due to advances in sequencing.
5
Sequence similarity methods
These methods can be very accurate if there is
> 50% sequence similarity. They are rarely
accurate if the sequence similarity is < 30%.
They use methods similar to those used for
sequence alignment, such as the dynamic programming
algorithm, hidden Markov models, and clustering
algorithms.
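To make the dynamic programming idea concrete, here is a minimal sketch of a global (Needleman-Wunsch style) alignment. The scoring values (match = 1, mismatch = -1, gap = -2) and the function name are illustrative assumptions, not taken from the slides; real tools use substitution matrices such as BLOSUM and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of two sequences by dynamic programming.

    Scoring parameters are illustrative assumptions, not a
    biologically tuned scheme.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("ACGT", "AGT")
print(aligned_a)  # ACGT
print(aligned_b)  # A-GT
print(total)      # 1
```

The table is filled in O(n*m) time, which is why this approach is practical for pairwise target-template alignment.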
6
What is Homology Modeling?
  • Predicts the three-dimensional structure of a
    given protein sequence (TARGET) based on an
    alignment to one or more known protein structures
    (TEMPLATES)
  • If similarity between the TARGET sequence and the
    TEMPLATE sequence is detected, structural
    similarity can be assumed.
  • In general, at least 30% sequence identity is
    required for generating useful models.
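As a rough illustration of how the sequence identity figure is obtained from a TARGET-TEMPLATE alignment, the sketch below counts identical residues over aligned (non-gap) columns. The function name and the choice of denominator are assumptions; several conventions exist (for example, dividing by the length of the shorter sequence), and they can give different percentages.

```python
def percent_identity(aln_target, aln_template):
    """Percent identity over columns where both sequences have a residue.

    Both arguments are rows of the same alignment, with '-' for gaps.
    The denominator convention used here is an assumption.
    """
    if len(aln_target) != len(aln_template):
        raise ValueError("aligned sequences must have equal length")
    aligned = identical = 0
    for t, s in zip(aln_target, aln_template):
        if t == "-" or s == "-":
            continue  # skip gap columns
        aligned += 1
        if t == s:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


print(percent_identity("ACDE", "ACFG"))  # 50.0
```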

7
Structural Prediction by Homology Modeling
[Flowchart. Box labels, in extracted order:]
  • Structural Databases
  • SeqFold, Profiles-3D, PSI-BLAST, BLAST, FASTA,
    fold-recognition methods (FUGUE)
  • Reference Proteins
  • Cα Matrix Matching
  • Conserved Regions
  • Protein Sequence
  • Sequence Alignment, Coordinate Assignment
  • Predicted Conserved Regions
  • Loop Searching/Generation
  • MODELER
  • Initial Model
  • Structure Analysis
  • Sidechain Rotamers and/or MM/MD
  • WHAT IF, PROCHECK, PROSAII, ...
  • Refined Model
8
How good can homology modeling be?
  • Sequence identity 60-100%: comparable to medium
    resolution NMR; substrate specificity
  • Sequence identity 30-60%: molecular replacement
    in crystallography; support for site-directed
    mutagenesis through visualization
  • Sequence identity < 30%: serious errors
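The identity ranges on this slide can be written as a simple lookup. This is only a sketch: the function name is hypothetical, and because the slide lists 60 and 30 in two ranges each, assigning each boundary value to the upper range is an assumption.

```python
def expected_model_quality(identity_percent):
    """Map percent sequence identity to the expectation given on the slide.

    Boundary handling (60 and 30 go to the upper range) is an assumption.
    """
    if not 0 <= identity_percent <= 100:
        raise ValueError("identity must be between 0 and 100")
    if identity_percent >= 60:
        return "comparable to medium resolution NMR"
    if identity_percent >= 30:
        return "molecular replacement; supports site-directed mutagenesis"
    return "serious errors"


print(expected_model_quality(45))  # molecular replacement; supports site-directed mutagenesis
```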

9
Significance of Protein Structure
  • What does a structure offer in the way of
    biological knowledge?
  • Location of mutants and conserved residues
  • Ligand and functional sites
  • Clefts/Cavities
  • Evolutionary Relationships
  • Mechanisms

10
The importance of the sequence alignment
  • The quality of the sequence alignment is of
    crucial importance
  • Misplaced gaps, representing insertions or
    deletions, will cause residues to be misplaced in
    space
  • Careful inspection and adjustment of the automatic
    alignment may improve the quality of the modeling.
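To see why a misplaced gap moves residues in space, the sketch below derives which template residue each target residue would copy its coordinates from. The sequences and the function name are made up for illustration.

```python
def target_to_template_map(aln_target, aln_template):
    """Map each target residue index to a template residue index.

    Indices are 0-based positions in the ungapped sequences. A target
    residue aligned to a gap maps to None (no template coordinates).
    """
    if len(aln_target) != len(aln_template):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    ti = si = 0  # next target / template residue index
    for t, s in zip(aln_target, aln_template):
        if t != "-":
            mapping[ti] = si if s != "-" else None
            ti += 1
        if s != "-":
            si += 1
    return mapping


# Hypothetical target ACDEFG and template ACEFG (residue D is an insertion).
good = target_to_template_map("ACDEFG", "AC-EFG")
bad = target_to_template_map("ACDEFG", "ACEF-G")  # gap misplaced by two columns
print(good)  # {0: 0, 1: 1, 2: None, 3: 2, 4: 3, 5: 4}
print(bad)   # {0: 0, 1: 1, 2: 2, 3: 3, 4: None, 5: 4}
```

With the misplaced gap, target residues 2 and 3 inherit coordinates from the wrong template residues, and no later modeling step can tell that this happened.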

11
Programs for Model Protein Construction
  • MODELLER 4.0
    (guitar.rockefeller.edu/modeller/modeller.html)
  • SWISS-MODEL Server
    (www.expasy.ch/swissmod/SWISS-MODEL.html)
  • SCWRL (SideChain placement With Rotamer Library)
    (www.fccc.edu/research/labs/dunbrack/scwrl/)

12
Protein Structural Databases
  • Templates can be found using the TARGET sequence
    as a query for searching using FASTA or BLAST
  • PDB (http://www.rcsb.org/pdb)
  • MODELLER
    (http://guitar.rockefeller.edu/modeller/modeller.html)
  • ModBase
    (http://pipe.rockefeller.edu/modbase/general-info.html)
  • 3DCrunch
    (http://www.expasy.ch/swissmod/SM_3DCrunch.html)

13
Gaining confidence in template searching
  • Once a suitable template is found, it is a good
    idea to do a literature search (PubMed) on the
    relevant fold to determine what biological
    role(s) it plays.
  • Does this match the biological/biochemical
    function that you expect?

14
Other factors to consider in selecting templates
  • Template environment
    • pH
    • Ligands present?
  • Resolution of the templates
  • Family of proteins
    • Phylogenetic tree construction can help find the
      subfamily closest to the target sequence
  • Multiple templates?

15
Target-Template Alignment
  • No current comparative modeling method can
    recover from an incorrect alignment
  • Use multiple sequence alignments as initial
    guide.
  • Consider alternative alignments in areas of
    uncertainty and build multiple models
  • Sequence-structure alignment programs try to put
    gaps in variable regions/loops
  • Note: the sequence from a sequence database and
    the sequence in the actual PDB entry are not
    always identical

16
Target-Multiple Template Alignment
  • Alignment is prepared by superimposing all
    template structures
  • Add target sequence to this alignment
  • Compare with multiple sequence alignment and
    adjust

17
Adjusting the alignment
  • Use tools such as Joy
    (www-cryst.bioc.cam.ac.uk/joy/) to view secondary
    structure along the alignment, and use this
    information as a criterion for adjustments
  • Avoid gaps in secondary structure elements
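The rule about gaps in secondary structure elements can be checked mechanically once the template's secondary structure is known. The sketch below assumes a per-residue string for the template ('H' helix, 'E' strand, anything else loop), such as one derived from DSSP output; the function name and the example sequences are hypothetical.

```python
def gaps_in_secondary_structure(aln_target, aln_template, template_ss,
                                structured="HE"):
    """Return alignment columns where a gap falls inside a helix or strand.

    template_ss has one character per template residue (ungapped).
    Deletions are flagged when the deleted template residue is structured;
    insertions are flagged when both flanking template residues belong to
    the same type of structured element.
    """
    if len(aln_target) != len(aln_template):
        raise ValueError("aligned sequences must have equal length")
    if len(template_ss) != len(aln_template.replace("-", "")):
        raise ValueError("template_ss must match the ungapped template")
    flagged = []
    si = 0  # index of the next template residue
    for col, (t, s) in enumerate(zip(aln_target, aln_template)):
        if s != "-":
            if t == "-" and template_ss[si] in structured:
                flagged.append(col)  # deletion inside an element
            si += 1
        elif t != "-":
            inside = (0 < si < len(template_ss)
                      and template_ss[si - 1] == template_ss[si]
                      and template_ss[si] in structured)
            if inside:
                flagged.append(col)  # insertion inside an element
    return flagged


ss = "CHHHHHCC"  # hypothetical template MKVLAAGG with one helix
print(gaps_in_secondary_structure("MKV-AAGG", "MKVLAAGG", ss))    # [3]
print(gaps_in_secondary_structure("MKVLAAGWG", "MKVLAAG-G", ss))  # []
```

Flagged columns are candidates for manual adjustment, typically by sliding the gap into the nearest loop.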

18
Secondary Structure Prediction
  • The PredictProtein server
    (http://www.embl-heidelberg.de/predictprotein/)
  • Adding secondary structure prediction algorithms
    can help make decisions on whether helices should
    be shortened/extended in areas of poor sequence
    identity.
  • PHD program

19
Constructing Multi-domain protein models
  • Building a multi-domain protein using templates
    corresponding to the individual domains

    proteinA  aaaaaaaaaaaaa---------------
    proteinB  -------------bbbbbbbbbbbbbbb
    Target    aaaaaaaaaaaaabbbbbbbbbbbbbbb
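The stacked layout on this slide can be generated from the per-domain template sequences. This is a minimal sketch that assumes each target domain aligns to its template without internal gaps, as in the slide's schematic; the function name is hypothetical.

```python
def stack_domain_templates(templates):
    """Pad each domain template with gaps so all rows share one length.

    templates is an ordered list of (name, sequence) pairs, one per
    domain, in the order the domains occur in the target.
    """
    total = sum(len(seq) for _, seq in templates)
    rows = []
    offset = 0
    for name, seq in templates:
        trailing = total - offset - len(seq)
        rows.append((name, "-" * offset + seq + "-" * trailing))
        offset += len(seq)
    return rows


for name, row in stack_domain_templates([("proteinA", "aaaaa"),
                                         ("proteinB", "bbb")]):
    print(f"{name:10s}{row}")
```

Each row can then be paired with the full-length target sequence as one multi-template alignment.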

20
Multiple model approach
  • Reminder: consider the effects of different
    substitution matrices, different gap penalties,
    and different algorithms. (Vogt et al. J. Mol.
    Biol. 1995, 249:816-831.)
  • Construct multiple models
  • Use structural analysis programs to determine
    best model

Jaroszewski, Pawlowski and Godzik, J. Molecular
Modeling, 1998, 4:294-309
Venclovas, Ginalski and Fidelis, PROTEINS, 1999,
3:73-80 (Suppl)
21
Model Building
  • Rigid-Body Assembly
    • Assembles a model from a small number of rigid
      bodies obtained from aligned protein structures
    • Implemented in COMPOSER
  • Segment Matching
  • Satisfaction of Spatial Restraints
    • Implemented in MODELLER
      (guitar.rockefeller.edu/modeller/modeller.html)

22
Initial model and procedures
  • Calculate coordinates for atoms that have
    equivalent atoms in the templates as an average
    over all templates
  • CHARMM internal coordinates are used for
    remaining unknown coordinates
  • Generate stereochemical and homology-derived
    restraints
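The first step, averaging the coordinates of equivalent atoms over all templates, might look like the sketch below. It assumes the templates are already superimposed in a common frame and that atoms are keyed by (target residue index, atom name); both the data layout and the function name are assumptions for illustration.

```python
def average_template_coordinates(template_coords):
    """Average coordinates of equivalent atoms over superimposed templates.

    template_coords is a list with one dict per template, mapping an
    atom key such as (residue_index, atom_name) to an (x, y, z) tuple.
    Atoms with no equivalent in any template are simply absent from the
    result and must be built by other means.
    """
    sums = {}
    counts = {}
    for coords in template_coords:
        for key, (x, y, z) in coords.items():
            sx, sy, sz = sums.get(key, (0.0, 0.0, 0.0))
            sums[key] = (sx + x, sy + y, sz + z)
            counts[key] = counts.get(key, 0) + 1
    return {key: tuple(v / counts[key] for v in total)
            for key, total in sums.items()}


templates = [
    {(1, "CA"): (0.0, 0.0, 0.0), (2, "CA"): (3.8, 0.0, 0.0)},
    {(1, "CA"): (2.0, 4.0, 6.0)},
]
print(average_template_coordinates(templates))
# {(1, 'CA'): (1.0, 2.0, 3.0), (2, 'CA'): (3.8, 0.0, 0.0)}
```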

23
Modeller
  • The main input is a set of restraints on the
    spatial structure of the amino acids and ligands
    to be modeled.
  • The output is a 3D structure that satisfies these
    restraints
  • Restraints are obtained automatically from related
    protein structures (homology modeling), and can
    also come from NMR structures, secondary structure
    packing and other experimental data

24
Spatial restraints?
  • MODELLER minimizes the objective function, F,
    with respect to the Cartesian coordinates of the
    protein atoms
  • $F(R) = \sum_i c_i(f_i, p_i)$
  • R are the Cartesian coordinates of the atoms
  • c_i is a restraint dependent on f_i and p_i
  • f_i is a geometric feature of a molecule, such as
    a distance, angle or dihedral value
  • p_i are parameters that help describe the restraint
25
What are the Restraints?
  • Distances, angles, dihedral angles, pairs of
    dihedral angles and some other spatial features
    defined by atoms or pseudo atoms.

26
Sidechain Conformation
  • Protein sidechains play a key role in molecular
    recognition and packing of hydrophobic cores of
    globular proteins
  • Protein sidechain conformations tend to exist in
    a limited number of canonical shapes, usually
    called rotamers
  • Rotamer libraries can be constructed where only
    3-50 conformations are taken into account for
    each side chain
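A rough way to picture a rotamer library is to enumerate a few preferred values for each side-chain dihedral (chi) angle and keep the best-scoring combination. The sketch assumes three staggered values per chi angle (-60, 60 and 180 degrees), which is a simplification; real libraries are derived from known structures and are pruned and often backbone-dependent. The scoring function is a placeholder supplied by the caller.

```python
from itertools import product

# Assumed idealized staggered chi values in degrees (a simplification).
STAGGERED = (-60.0, 60.0, 180.0)


def enumerate_rotamers(n_chi):
    """All combinations of staggered values for n_chi dihedral angles."""
    return list(product(STAGGERED, repeat=n_chi))


def best_rotamer(rotamers, score):
    """Return the rotamer with the lowest value of a caller-supplied score."""
    return min(rotamers, key=score)


print(len(enumerate_rotamers(1)))  # 3
print(len(enumerate_rotamers(3)))  # 27
```

Even this crude enumeration stays within a few dozen candidates for most side chains, which is what makes exhaustive scoring per residue feasible.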

27
Sidechains on surface of protein
  • Exposed sidechains on surface can be highly
    flexible without a single dominant conformation
  • If these solvent-exposed sidechains do not form
    binding interactions with other molecules and are
    not involved in, say, a catalytic reaction, then
    their accuracy may not be crucial; also look at
    the B-factors
  • Can refine the sidechains with molecular
    mechanics minimization
  • Sampling?
  • Scoring?

28
Errors in Homology Modeling
  • a) Side chain packing; b) Distortions and
    shifts; c) No template

29
Errors in Homology Modeling
  • d) Misalignments; e) Incorrect template
  • Marti-Renom et al., Ann. Rev. Biophys. Biomol.
    Struct., 2000, 29:291-325.

30
Detection of Errors
  • The first check should be a stereochemical check
    on the modeled structure: PROCHECK, WHATCHECK and
    DISTAN will show deviations from normal bond
    lengths, dihedrals, etc.
  • Visualization: follow the backbone trace and then
    move out to the Cα-Cβ orientation.
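As a toy version of such a check, the sketch below flags consecutive Cα atoms whose separation deviates from the roughly 3.8 Å expected across a trans peptide bond. The tolerance and the function name are assumptions, and this is no substitute for the full set of checks performed by programs such as PROCHECK.

```python
import math


def flag_ca_distances(ca_coords, expected=3.8, tolerance=0.2):
    """Flag consecutive C-alpha pairs with unusual separation.

    ca_coords is a list of (x, y, z) tuples in Angstroms, in chain order.
    Returns (i, i + 1, distance) for each pair outside expected +/- tolerance.
    The tolerance is an illustrative assumption; cis peptides and chain
    breaks will also be flagged.
    """
    flagged = []
    for i in range(len(ca_coords) - 1):
        d = math.dist(ca_coords[i], ca_coords[i + 1])
        if abs(d - expected) > tolerance:
            flagged.append((i, i + 1, round(d, 2)))
    return flagged


trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 5.0, 0.0)]
print(flag_ca_distances(trace))  # [(1, 2, 5.0)]
```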

31
PROCHECK
http://www.biochem.ucl.ac.uk/roman/procheck/procheck.html