BCB 444/544 - PowerPoint PPT Presentation


1
BCB 444/544
  • Lab 7
  • Protein Structure Prediction
  • Oct 11, 2007

2
Chp 14 - Secondary Structure Prediction
  • SECTION V STRUCTURAL BIOINFORMATICS
  • Xiong Chp 14
  • Protein Secondary Structure Prediction
  • Secondary Structure Prediction for Globular
    Proteins
  • Secondary Structure Prediction for Transmembrane
    Proteins
  • Coiled-Coil Prediction

3
Secondary Structure Prediction
  • Has become highly accurate in recent years (>85%)
  • Usually 3 (or 4) state predictions
  • H = α-helix
  • E = β-strand
  • C = coil (or loop)
  • (T = turn)

4
Secondary Structure Prediction Methods
  • 1st Generation methods
  • Ab initio - used relatively small dataset of
    structures available
  • Chou-Fasman - based on amino acid propensities
    (3-state)
  • GOR - also propensity-based (4-state)
  • 2nd Generation methods
  • based on much larger datasets of structures now
    available
  • GOR II, III, IV, SOPM, GOR V, FDM
  • 3rd Generation methods
  • Homology-based & neural network-based
  • PHD, PSIPRED, SSPRO, PROF, HMMSTR, CDM
  • Meta-Servers
  • combine several different methods
  • Consensus & ensemble-based
  • JPRED, PredictProtein, Proteus
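
The consensus idea behind meta-servers can be sketched as a per-residue majority vote over several 3-state predictions. This is only an illustration: the function name and the tie-breaking rule (fall back to coil) are assumptions, and real meta-servers such as JPRED use more elaborate combination schemes.

```python
from collections import Counter


def consensus_prediction(predictions, tie_breaker="C"):
    """Combine equal-length H/E/C strings by per-residue majority vote."""
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    consensus = []
    for column in zip(*predictions):
        counts = Counter(column).most_common()
        top_state, top_count = counts[0]
        # Assumption: a tie between the leading states falls back to coil.
        if len(counts) > 1 and counts[1][1] == top_count:
            top_state = tie_breaker
        consensus.append(top_state)
    return "".join(consensus)


if __name__ == "__main__":
    methods = ["HHHHCCEEE", "HHHCCCEEE", "HHHHCCCEE"]
    print(consensus_prediction(methods))
```

Each input string stands for the output of one method on the same sequence; an odd number of methods keeps ties rare.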

5
Secondary Structure Prediction Servers
  • Prediction Evaluation?
  • Q3 score - % of residues correctly predicted
    (3-state)
  • in cross-validation experiments
  • Best results? Meta-servers
  • http://expasy.org/tools/ (scroll for 2°
    structure prediction)
  • http://www.russell.embl-heidelberg.de/gtsp/secstrucpred.html
  • JPred: www.compbio.dundee.ac.uk/www-jpred
  • PredictProtein: http://www.predictprotein.org/
    (Rost, Columbia)
  • Best "individual" programs?
  • CDM: http://gor.bb.iastate.edu/cdm/
    (Sen & Jernigan, ISU)
  • FDM (not available separately as server)
    (Cheng & Jernigan, ISU)
  • GOR V: http://gor.bb.iastate.edu/
    (Kloczkowski & Jernigan, ISU)
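
The Q3 score on this slide is simple to compute once a predicted and an observed 3-state string are available for the same protein. A minimal sketch (the function name is an assumption, not taken from any of the servers listed):

```python
def q3_score(predicted, observed):
    """Percentage of residues whose predicted state (H/E/C) matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # 8 of the 10 positions agree in this made-up example.
    print(q3_score("HHHHCCEEEE", "HHHCCCEEEC"))
```

In a cross-validation experiment the observed strings come from proteins held out of training, and Q3 is averaged over them.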

6
Consensus Data Mining (CDM)
  • Developed by Jernigan Group at ISU
  • Basic premise: combination of 2 complementary
    methods can enhance performance by harnessing
    distinct advantages of both methods; combines
    FDM & GOR V
  • FDM - Fragment Data Mining - exploits
    availability of sequence-similar fragments in the
    PDB, which can lead to highly accurate prediction
    - much better than GOR V - for such fragments,
    but such fragments are not available for many
    cases
  • GOR V - Garnier, Osguthorpe, Robson V - predicts
    secondary structure of less similar fragments
    with good performance; these are protein
    fragments for which FDM method cannot find
    suitable structures
  • For references & additional details:
    http://gor.bb.iastate.edu/cdm/
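
The premise above (use FDM where similar fragments exist, GOR V elsewhere) can be illustrated with a per-residue switch. This is a hedged sketch, not the published CDM algorithm: the `fragment_similarity` scores, the 0-to-1 scale and the `threshold` value are all hypothetical.

```python
def combine_fdm_gor(fdm_pred, gor_pred, fragment_similarity, threshold=0.5):
    """Per residue, take the FDM state if a sufficiently similar fragment
    was found (hypothetical similarity score), otherwise the GOR V state."""
    if not (len(fdm_pred) == len(gor_pred) == len(fragment_similarity)):
        raise ValueError("all inputs must have the same length")
    return "".join(
        fdm_state if similarity >= threshold else gor_state
        for fdm_state, gor_state, similarity
        in zip(fdm_pred, gor_pred, fragment_similarity)
    )


if __name__ == "__main__":
    fdm = "HHHHEE"
    gor = "CCHHCC"
    similarity = [0.9, 0.9, 0.2, 0.2, 0.7, 0.1]  # made-up values
    print(combine_fdm_gor(fdm, gor, similarity))
```

Raising the threshold shifts more residues to GOR V, which matches the slide's point that FDM is only trusted where good fragments are available.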

7
Secondary Structure Prediction for Different
Types of Proteins/Domains
  • For Complete proteins
  • Globular Proteins - use methods previously
    described
  • Transmembrane (TM) Proteins - use special
    methods
  • (next slides)
  • For Structural Domains: many under development
  • Coiled-Coil Domains (Protein interaction
    domains)
  • Zinc Finger Domains (DNA binding domains),
  • others

8
SS Prediction for Transmembrane Proteins
  • Transmembrane (TM) Proteins
  • Only a few in the PDB - but 30% of cellular
    proteins are membrane-associated!
  • Hard to determine experimentally, so prediction
    important
  • TM domains are relatively 'easy' to predict!
  • Why? Constraints due to hydrophobic environment
  • 2 main classes of TM proteins
  • α-helical
  • β-barrel

9
SS Prediction for TM α-Helices
  • α-Helical TM domains
  • Helices are 17-25 amino acids long (span the
    membrane)
  • Predominantly hydrophobic residues
  • Helices oriented perpendicular to membrane
  • Orientation can be predicted using "positive
    inside" rule
  • Residues at cytosolic (inside or cytoplasmic)
    side of TM helix, near hydrophobic anchor, are
    more positively charged than those on lumenal
    (inside an organelle in eukaryotes) or
    periplasmic side (space between inner & outer
    membrane in gram-negative bacteria)
  • Alternating polar & hydrophobic residues provide
    clues to interactions among helices within
    membrane
  • Servers?
  • TMHMM or HMMTOP - 70% accuracy - confused by
    hydrophobic signal peptides (short hydrophobic
    sequences that target proteins to the
    endoplasmic reticulum, ER)
  • Phobius - 94% accuracy - uses distinct HMM
    models for TM helices &
    signal peptide sequences
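
The two ideas on this slide (hydrophobic stretches long enough to span the membrane, and the "positive inside" rule) can be sketched with a sliding-window hydropathy scan. Note the swap: this uses the classic Kyte-Doolittle scale with a 19-residue window and a 1.6 cutoff, not the HMMs used by TMHMM, HMMTOP or Phobius. The function names and the 15-residue flank are assumptions.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def hydrophobic_segments(seq, window=19, threshold=1.6):
    """Return half-open (start, end) spans covered by windows whose mean
    hydropathy is >= threshold; overlapping windows are merged."""
    spans = []
    for i in range(len(seq) - window + 1):
        mean = sum(KYTE_DOOLITTLE[aa] for aa in seq[i:i + window]) / window
        if mean >= threshold:
            if spans and i <= spans[-1][1]:
                spans[-1] = (spans[-1][0], i + window)
            else:
                spans.append((i, i + window))
    return spans


def cytosolic_side(seq, start, end, flank=15):
    """Positive-inside rule: the flank with more K/R is predicted cytosolic."""
    n_positive = sum(aa in "KR" for aa in seq[max(0, start - flank):start])
    c_positive = sum(aa in "KR" for aa in seq[end:end + flank])
    if n_positive == c_positive:
        return "undetermined"
    return "N-terminal flank" if n_positive > c_positive else "C-terminal flank"


if __name__ == "__main__":
    # Made-up sequence: basic N-terminal flank, 20 Leu, acidic C-terminal flank.
    seq = "MSDKRKRKRS" + "L" * 20 + "DESGTEDNQS"
    for start, end in hydrophobic_segments(seq):
        print(start, end, cytosolic_side(seq, start, end))
```

Because whole windows are merged, a reported span can extend a few residues past the hydrophobic core; this crude scan also shares the weakness noted above, since a hydrophobic signal peptide would be flagged as a TM helix.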

10
SS Prediction for TM β-Barrels
  • β-Barrel TM domains
  • β-strands are amphipathic (partly hydrophobic,
    partly hydrophilic)
  • Strands are 10 - 22 amino acids long
  • Every 2nd residue is hydrophobic, facing lipid
    bilayer
  • Other residues are hydrophilic, facing "pore" or
    opening
  • Servers? Harder problem, fewer servers
  • TBBPred - uses NN or SVM (more on these ML
    methods later)
  • Accuracy ?
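
The "every 2nd residue is hydrophobic" pattern can be checked directly. The sketch below scores how well a segment alternates between hydrophobic and hydrophilic residues; it is a toy illustration, not how TBBPred works (which uses NN or SVM models). The hydrophobic residue set, window length and cutoff are assumptions.

```python
# Assumed set of hydrophobic residues; other classifications exist.
HYDROPHOBIC = set("AVLIMFWY")


def alternation_score(segment):
    """Fraction of residues that fit the better of the two alternating
    phases (hydrophobic at even positions, or hydrophobic at odd positions)."""
    if not segment:
        raise ValueError("segment must not be empty")
    is_hydrophobic = [aa in HYDROPHOBIC for aa in segment]
    # Phase 0 expects hydrophobic residues at even indices.
    phase0 = sum(h == (i % 2 == 0) for i, h in enumerate(is_hydrophobic))
    # Every position fits exactly one phase, so phase 1 is the complement.
    phase1 = len(segment) - phase0
    return max(phase0, phase1) / len(segment)


def candidate_strands(seq, window=10, min_score=0.9):
    """Half-open (start, end) windows whose alternation score passes the cutoff."""
    return [
        (i, i + window)
        for i in range(len(seq) - window + 1)
        if alternation_score(seq[i:i + window]) >= min_score
    ]


if __name__ == "__main__":
    print(alternation_score("VSLTVNFDYK"))  # perfectly alternating, made-up strand
    print(alternation_score("LLLLLLLLLL"))  # uniformly hydrophobic
```

The default window of 10 corresponds to the short end of the 10 - 22 residue strand length given above.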

11
Chp 15 - Tertiary Structure Prediction
  • SECTION V STRUCTURAL BIOINFORMATICS
  • Xiong Chp 15
  • Protein Tertiary Structure Prediction
  • Methods
  • Homology Modeling
  • Threading and Fold Recognition
  • Ab Initio Protein Structural Prediction
  • CASP

12
Protein Tertiary Structure Prediction
  • 3 Major Methods
  • Homology Modeling (easiest!)
  • Threading and Fold Recognition (harder)
  • Ab Initio Protein Structural Prediction (really
    hard)

13
Comparative Modeling?
  • Comparative modeling - term is sometimes used
    interchangeably with homology modeling, but also
    sometimes used to cover both homology modeling
    and threading/fold recognition

14
Ab Initio Prediction
  • Develop energy function
  • bond energy
  • bond angle energy
  • dihedral angle energy
  • van der Waals energy
  • electrostatic energy
  • Calculate structure by minimizing energy function
  • (usually Molecular Dynamics or Monte Carlo
    methods)
  • Ab initio prediction - impractical for most real
    (long) proteins
  • Computationally? very expensive
  • Accuracy? Usually poor for all except short
    peptides
  • (but much improvement recently!)

Provides both folding pathway & folded structure
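
The "develop energy function, then minimize" recipe can be illustrated on a toy system. The sketch below is a heavily simplified assumption: a 2D chain of beads with only two of the listed terms (harmonic bond energy and a Lennard-Jones van der Waals term), minimized with Metropolis Monte Carlo. All parameter values are arbitrary, and real force fields also include angle, dihedral and electrostatic terms.

```python
import math
import random


def energy(coords, bond_length=1.0, k_bond=100.0, epsilon=1.0, sigma=1.0):
    """Toy energy: harmonic bonds between chain neighbors plus a
    Lennard-Jones term between all non-bonded pairs."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(coords[i], coords[j])
            if j == i + 1:
                total += 0.5 * k_bond * (r - bond_length) ** 2  # bond energy
            else:
                r = max(r, 1e-6)  # avoid division by zero for overlapping beads
                sr6 = (sigma / r) ** 6
                total += 4.0 * epsilon * (sr6 ** 2 - sr6)  # van der Waals energy
    return total


def monte_carlo_minimize(coords, steps=5000, step_size=0.1,
                         temperature=0.5, seed=0):
    """Metropolis Monte Carlo; returns the lowest-energy conformation seen."""
    rng = random.Random(seed)
    current = [tuple(c) for c in coords]
    e_current = energy(current)
    best, e_best = list(current), e_current
    for _ in range(steps):
        i = rng.randrange(len(current))
        trial = list(current)
        trial[i] = (current[i][0] + rng.uniform(-step_size, step_size),
                    current[i][1] + rng.uniform(-step_size, step_size))
        delta = energy(trial) - e_current
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, e_current = trial, e_current + delta
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best


if __name__ == "__main__":
    extended = [(float(i), 0.0) for i in range(6)]  # fully extended start
    folded, e_min = monte_carlo_minimize(extended)
    print(energy(extended), e_min)
```

Even here the cost per step grows with the square of the chain length, which hints at why the approach is impractical for long proteins.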
15
Comparative Modeling
  • Two types
  • 1) Homology modeling
  • 2) Threading (fold recognition)
  • Both rely on availability of experimentally
    determined structures that are "homologous" or
    at least structurally very similar to target

Provide folded structure only
16
Homology Modeling
  • Identify homologous protein sequences (PSI-BLAST)
  • Among available structures (in PDB), choose one
    with closest sequence to target as template
  • (can combine steps 1 & 2 by using PDB-BLAST)
  • Build model by placing target sequence residues
    in corresponding positions of homologous
    structure & refine by "tweaking" modeled
    structure (energy minimization)
  • Homology modeling - works "well"
  • Computationally? "relatively" inexpensive
  • Accuracy? higher sequence identity → better
    model
  • Requires at least 30% sequence identity with
    sequence for which structure is known
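
The template-selection step can be sketched as: compute percent identity for each target/template alignment and keep the best one if it clears the 30% rule of thumb. The function names and input format are assumptions, and percent identity is computed here over gap-free columns only, which is one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def choose_template(alignments, min_identity=30.0):
    """alignments maps template id -> (aligned target, aligned template).
    Returns (best id, identity), or (None, identity) if below the cutoff."""
    if not alignments:
        raise ValueError("need at least one alignment")
    scored = {tid: percent_identity(target, template)
              for tid, (target, template) in alignments.items()}
    best_id = max(scored, key=scored.get)
    if scored[best_id] < min_identity:
        return None, scored[best_id]
    return best_id, scored[best_id]


if __name__ == "__main__":
    # Made-up pairwise alignments of one target against two templates.
    alignments = {
        "templateA": ("MKTAYIAKQR", "MKTAYLAKQR"),
        "templateB": ("MKTAYIAKQR", "MRS-YLGKQA"),
    }
    print(choose_template(alignments))
```

The alignments themselves would come from a search such as PSI-BLAST against PDB sequences; this sketch only covers the ranking step.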

17
Threading - Fold Recognition
  • Identify best fit between target sequence &
    template structure
  • Develop energy function
  • Develop template library
  • Align target sequence with each template & score
  • Identify top scoring template (1D to 3D
    alignment)
  • Refine structure as in homology modeling
  • Threading - works "sometimes"
  • Computationally? Can be expensive or cheap,
    depends on energy function & whether "all atom"
    or "backbone only" threading
  • Accuracy? in theory, should not depend on
    sequence identity (should depend on quality of
    template library & "luck")
  • Usually, higher sequence identity to protein of
    known structure → better model
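
The threading steps (energy function, template library, align & score, pick the top template) can be sketched with a toy gapless version. Everything here is a simplifying assumption: templates are reduced to strings of buried (B) / exposed (E) positions, the energy rewards hydrophobic residues in buried positions and polar residues in exposed ones, and no gaps are allowed. Real threading programs use much richer potentials and alignments.

```python
# Assumed hydrophobic residue set for the toy energy function.
HYDROPHOBIC = set("AVLIMFWC")


def position_energy(residue, environment):
    """-1 if the residue fits its environment (hydrophobic in buried 'B',
    polar in exposed 'E'), +1 otherwise."""
    return -1.0 if (residue in HYDROPHOBIC) == (environment == "B") else 1.0


def thread(target, template_env):
    """Lowest gapless threading energy of target on one template profile.
    Returns (energy, offset), or None if the template is too short."""
    if len(target) > len(template_env):
        return None
    best = None
    for offset in range(len(template_env) - len(target) + 1):
        e = sum(position_energy(res, env)
                for res, env in zip(target, template_env[offset:]))
        if best is None or e < best[0]:
            best = (e, offset)
    return best


def rank_templates(target, library):
    """Score the target against every template; lowest energy first."""
    results = []
    for name, env in library.items():
        hit = thread(target, env)
        if hit is not None:
            results.append((hit[0], name, hit[1]))
    return sorted(results)


if __name__ == "__main__":
    # Hypothetical two-fold library of burial profiles.
    library = {"foldA": "EEBBEEBBEE", "foldB": "BBBBBBBBBB"}
    for e, name, offset in rank_templates("LVKDLIEK", library):
        print(name, offset, e)
```

The score depends only on how well the sequence fits each structural environment, not on sequence identity to the template, which is the "in theory" point made above.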

18
Today's Lab
  • Homology Modeling - using SWISS-MODEL
  • http://swissmodel.expasy.org//SWISS-MODEL.html
  • Threading - using 3-D JURY (BioinfoBank, a
    METAserver)
  • http://meta.bioinfo.pl/submit_wizard.pl
  • Take a look at CASP contest
  • http://predictioncenter.gc.ucdavis.edu/
  • CASP7 contest in 2006
  • http://www.predictioncenter.org/casp7/Casp7.html