BCB 444/544 - PowerPoint PPT Presentation

About This Presentation
Title:

BCB 444/544

Description:

11. Difficulty of Tertiary Structure Prediction ... State of the art. Critical Assessment of Structure Prediction ... Protein-DNA. Protein-RNA. Protein-Ligand ...

Slides: 30
Provided by: dobbslabG

Transcript and Presenter's Notes


1
BCB 444/544
  • Lecture 22
  • Protein Structure Prediction
  • (ctd.)
  • 23 Oct 17

2
Chp 15 - Tertiary Structure Prediction
SECTION V: STRUCTURAL BIOINFORMATICS (Xiong)
Chp 15: Protein Tertiary Structure Prediction
  • Methods
  • Homology Modeling
  • Threading and Fold Recognition
  • Ab Initio Protein Structural Prediction
  • CASP
Some slides based on those by Mark Gerstein
3
Structural Genomics - Status & Goal
  • 20,000 "traditional" genes in human genome
  • (recall, this is fewer than earlier
    estimate of 30,000)
  • 2,000 proteins in a typical cell
  • > 4.9 million sequences in UniProt (Oct 2007)
  • > 46,000 protein structures in the PDB (Oct
    2007)
  • Experimental determination of protein structure
    lags far behind sequence determination!
  • Goal: Determine structures of "all" protein folds
    in nature, using a combination of experimental
    structure determination methods (X-ray
    crystallography, NMR, mass spectrometry) and
    structure prediction

4
Problem statement
  • Given the primary structure (sequence) of a
    protein chain, what is its 3D shape (i.e. what
    are the 3D coordinates of its constituent atoms)?
  • Also:
  • Given a desired 3D shape, what amino acid
    sequence(s) will assume that conformation?
  • Recently accomplished by David Baker's lab for
    one designed structure!

5
Steps in Protein Folding
  • 1- "Collapse" - driving force is burial of
    hydrophobic amino acids (fast - msecs)
  • 2- Molten globule - helices & sheets form, but
    "loose" (slow - secs)
  • 3- "Final" native folded state - compaction &
    rearrangement of secondary (2°) structures

Native state? - assumed to be lowest free
energy - may be an ensemble of structures
6
Energy
  • Energy Function
  • Force field
  • bond energy
  • bond angle energy
  • dihedral angle energy
  • van der Waals energy
  • electrostatic energy
  • Knowledge-based statistical potential
  • Conformational Search Function
  • Molecular dynamics
  • Monte Carlo
  • How can we reduce computational complexity of
    each step?
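
The force-field terms listed above can be sketched as simple functions. This is only an illustration under stated assumptions, not any particular published force field: the functional forms are the common textbook ones, and every default constant (k, r0, theta0, barrier, epsilon, sigma) is a placeholder chosen for the example.

```python
import math

# Standard conversion factor for charges in units of e and distances in angstroms.
COULOMB_CONSTANT = 332.06  # kcal*angstrom/(mol*e^2)

def bond_energy(r, r0=1.53, k=300.0):
    """Harmonic bond stretch around the reference length r0 (placeholder constants)."""
    return k * (r - r0) ** 2

def angle_energy(theta_deg, theta0_deg=109.5, k=50.0):
    """Harmonic bond-angle bend around theta0 (placeholder constants)."""
    d = math.radians(theta_deg - theta0_deg)
    return k * d * d

def dihedral_energy(phi_deg, barrier=1.4, periodicity=3, phase_deg=0.0):
    """Periodic torsion term: 0.5 * barrier * (1 + cos(n*phi - phase))."""
    arg = math.radians(periodicity * phi_deg - phase_deg)
    return 0.5 * barrier * (1.0 + math.cos(arg))

def van_der_waals_energy(r, epsilon=0.1, sigma=3.4):
    """Lennard-Jones 12-6 term; its minimum of -epsilon lies at r = 2^(1/6) * sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def electrostatic_energy(r, q1, q2, dielectric=1.0):
    """Coulomb term in kcal/mol for point charges q1 and q2."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)
```

A total energy would sum these terms over all bonds, angles, dihedrals, and non-bonded atom pairs; the non-bonded pair sum is the part whose cost grows fastest with system size.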

7
Rotamers
Alkane stereochemistry. (2007, April 25). In
Wikipedia, The Free Encyclopedia. Retrieved April
28, 2007, from http://en.wikipedia.org/w/index.php?title=Alkane_stereochemistry&oldid=126676235
Image modified slightly from http://nook.cs.ucdavis.edu:8080/koehl/ProModel/sidechain.html
8
Structure modeling methods
  • Comparative modeling
  • (e.g. MODELLER, SWISS-MODEL)
  • Threading
  • (e.g. FUGUE)
  • Ab initio - from first principles
  • Molecular dynamics (physical simulation)
  • (e.g. NAMD, GROMACS)
  • Hybrid methods
  • (e.g. ROSETTA, CABS, I-TASSER)

9
Tertiary Structure Prediction Methods
  • 2 (or 3) Major Methods
  • Comparative Modeling
  • Homology Modeling (easiest!)
  • Threading and Fold Recognition (harder)
  • Ab Initio Protein Structural Prediction (really
    hard)

10
Protein Dynamics
  • Protein in native state is NOT static
  • Function of many proteins requires conformational
    changes, sometimes large, sometimes small
  • Globular proteins are inherently "unstable"
  • (NOT evolved for maximum stability)
  • Energy difference between native and denatured
    state is very small (5-15 kcal/mol)
  • (this is equivalent to 2 H-bonds!)
  • Folding involves changes in both entropy &
    enthalpy
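
To put the 5-15 kcal/mol stability margin in perspective, here is a rough sketch that assumes a simple two-state (native vs. denatured) equilibrium. The two-state model and the function name are assumptions made for illustration; the slide itself does not specify a model.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol*K)

def fraction_unfolded(delta_g_unfolding, temperature=298.15):
    """Equilibrium fraction of denatured molecules in a two-state model.

    delta_g_unfolding is G(denatured) - G(native) in kcal/mol, so a positive
    value means the native state is the more stable one.
    """
    rt = GAS_CONSTANT * temperature
    return 1.0 / (1.0 + math.exp(delta_g_unfolding / rt))
```

Under these assumptions a 5 kcal/mol margin leaves well under 1% of molecules unfolded at room temperature. Because the dependence is exponential, losing only a few kcal/mol (a handful of weak interactions) shifts the balance substantially.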

11
Difficulty of Tertiary Structure Prediction
  • Folding or tertiary structure prediction problem
    can be formulated as a search for minimum energy
    conformation
  • Search space is defined by psi/phi angles of
    backbone and side-chain rotamers
  • Search space is enormous even for small proteins!
  • Number of local minima increases exponentially
    with number of residues

Computationally it is an exceedingly difficult
problem!
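
A back-of-the-envelope calculation shows how quickly the search space grows. The numbers here are assumptions for illustration only: 3 allowed backbone states per residue, and a hypothetical sampling rate of 10^13 conformations per second.

```python
SECONDS_PER_YEAR = 3600 * 24 * 365

def conformations(n_residues, states_per_residue=3):
    """Size of the search space if every residue independently adopts one of a few states."""
    return states_per_residue ** n_residues

def years_to_enumerate(n_residues, states_per_residue=3, rate_per_second=1e13):
    """Time needed to visit every conformation once at the assumed sampling rate."""
    seconds = conformations(n_residues, states_per_residue) / rate_per_second
    return seconds / SECONDS_PER_YEAR
```

Even with these generous assumptions, a 100-residue chain has more than 10^47 conformations, and exhaustive enumeration would take far longer than the age of the universe. Practical methods therefore rely on guided searches rather than enumeration.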
12
How do these methods stack up?
Baker and Sali (2001). Science 294 (5540), 93-96
13
Comparative modeling
  • Requires that the structure has been solved for a
    protein with similar sequence
  • Backbone of query structure is initially
    positioned identically to template structure
  • Side chains and loops are then built and
    adjusted to remove steric clashes

14
Threading
  • In some ways similar to comparative modeling,
    but
  • Used when no structures with high sequence
    similarity are available
  • Thread target sequence to a library of structures
  • Evaluate energy function
  • Incompatible structures will have high energy
  • Only accept structures below a cutoff

15
Steps in Threading
  • Align target sequence with template structures
    in fold library (usually from the PDB)
  • Calculate energy score to evaluate "goodness of
    fit" between target sequence & template structure
  • Rank models based on energy scores
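
The three steps above can be sketched with a toy example. Everything in it is a simplifying assumption: templates are reduced to strings of buried (B) or exposed (E) positions, the alignment is gapless, and the score only rewards hydrophobic residues at buried positions and polar residues at exposed ones. Real threading programs use gapped alignments and far richer scoring.

```python
# Toy residue classification (an assumption for this example).
HYDROPHOBIC = set("AVLIMFWC")

def position_score(residue, environment):
    """Lower is better: -1 for a compatible residue/environment pair, +1 otherwise."""
    buried = environment == "B"
    hydrophobic = residue in HYDROPHOBIC
    return -1.0 if buried == hydrophobic else 1.0

def thread_score(sequence, template_env):
    """Best (lowest) gapless score found by sliding the sequence along the template."""
    if len(sequence) > len(template_env):
        return float("inf")
    best = float("inf")
    for offset in range(len(template_env) - len(sequence) + 1):
        score = sum(position_score(res, template_env[offset + i])
                    for i, res in enumerate(sequence))
        best = min(best, score)
    return best

def rank_templates(sequence, library):
    """Return (score, template_name) pairs sorted from best to worst fit."""
    return sorted((thread_score(sequence, env), name) for name, env in library.items())
```

For example, with a hypothetical two-fold library `{"fold_A": "BBEEBBEE", "fold_B": "EEEEBBBB"}`, the sequence `LLKKVVDD` ranks `fold_A` first, because its hydrophobic residues line up with the buried positions.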

16
Threading Goal - Issues
Find correct sequence-structure alignment of a
target sequence with its native-like fold in
template library (usually derived from PDB)
  • Structure database - must be "complete"
  • Can't build a good model if there is no good
    template in library!
  • Sequence-structure alignment algorithm
  • Bad alignment → Bad score!
  • Energy function or Scoring Scheme
  • Must distinguish correct sequence-fold alignment
    from incorrect sequence-fold alignments
  • Must distinguish correct fold from close
    decoys
  • Prediction reliability assessment - How to
    determine whether predicted structure is correct?
    (or even close?)

17
Threading Template database
  • Build a database of structural templates
  • e.g., ASTRAL domain library derived from the
    PDB

Sometimes, supplement with additional decoys
e.g., generated using ab initio approach such as
Rosetta (Baker)
18
Threading Energy function
  • Two main methods (& combinations of these)
  • Structural profile (environmental):
    physicochemical properties of amino acids
  • Contact potential (statistical)
  • based on contact statistics from PDB
  • famous one: Miyazawa & Jernigan (ISU)
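
A statistical contact potential can be illustrated with a simple log-odds calculation: residue pairs seen in contact more often than expected by chance get a negative (favorable) energy. This is a deliberately simplified sketch and not the Miyazawa-Jernigan derivation; the function names and the input format (a flat list of contacting residue pairs) are assumptions for this example.

```python
import math
from collections import Counter

def contact_potential(contacts):
    """Map each residue pair to -ln(observed / expected) contact counts.

    contacts is a list of (residue_a, residue_b) pairs observed in contact.
    Expected counts assume residues pair at random given their frequencies.
    """
    pair_counts = Counter(tuple(sorted(pair)) for pair in contacts)
    residue_counts = Counter(res for pair in contacts for res in pair)
    total_pairs = len(contacts)
    total_residues = 2 * total_pairs
    potential = {}
    for (a, b), observed in pair_counts.items():
        freq_a = residue_counts[a] / total_residues
        freq_b = residue_counts[b] / total_residues
        # Unlike pairs can occur in two orders, hence the factor of 2.
        expected = total_pairs * freq_a * freq_b * (1 if a == b else 2)
        potential[(a, b)] = -math.log(observed / expected)
    return potential

def score_contacts(model_contacts, potential):
    """Sum pair energies over the contacts of a model; unseen pairs count as 0."""
    return sum(potential.get(tuple(sorted(pair)), 0.0) for pair in model_contacts)
```

In a threading context, `score_contacts` would be evaluated on the contacts implied by each sequence-template alignment, and lower totals would indicate a better fit.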

19
Ab Initio Prediction
  • Develop energy function
  • bond energy
  • bond angle energy
  • dihedral angle energy
  • van der Waals energy
  • electrostatic energy
  • Calculate structure by minimizing energy function
  • usually Molecular Dynamics (MD) or Monte Carlo
    (MC)
  • Ab initio prediction - impractical for most real
    (long) proteins
  • Computationally? Very expensive
  • Accuracy? Usually poor for all except short
    peptides
  • (but much improvement recently!)

Provides both folding pathway & folded structure
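
The Monte Carlo option for minimizing the energy function can be sketched with a Metropolis search over torsion angles. The energy function here is a hypothetical toy (each angle has a single minimum at 180 degrees), and the step size, temperature factor kT, and step count are arbitrary choices for illustration.

```python
import math
import random

def toy_energy(angles):
    """Hypothetical torsional energy: each angle contributes 1 + cos(angle), minimal at 180."""
    return sum(1.0 + math.cos(math.radians(a)) for a in angles)

def metropolis_minimize(energy, angles, steps=5000, max_move=30.0, kT=0.2, seed=0):
    """Metropolis Monte Carlo search; returns the lowest-energy angles seen and their energy."""
    rng = random.Random(seed)
    current = list(angles)
    e_current = energy(current)
    best, e_best = list(current), e_current
    for _ in range(steps):
        # Propose a random change to one randomly chosen angle.
        trial = list(current)
        i = rng.randrange(len(trial))
        trial[i] = (trial[i] + rng.uniform(-max_move, max_move)) % 360.0
        e_trial = energy(trial)
        delta = e_trial - e_current
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best
```

Accepting some uphill moves is what lets the search escape local minima. On a real protein energy landscape the number of such minima is the main obstacle, which this single-minimum toy does not capture.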
20
Molecular dynamics
  • Physical, all-atom simulation
  • Calculates force (classical, not quantum) between
    all atoms
  • Iterate over very small time-steps (usually 1 or
    2 femtoseconds)
  • Due to computational cost, typically limited to
    simulation runs on the order of ns
  • Decent at folding very short sequences, but
    impractical for full folding of most sequences
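
The time-step arithmetic and the integration loop can be sketched as follows. Velocity Verlet is used here as a common choice of integrator; the slide does not say which integrator is meant, and the one-dimensional particle and function names are assumptions for illustration.

```python
def steps_needed(sim_time_ns, timestep_fs=2.0):
    """Number of integration steps for a run of the given length (1 ns = 1e6 fs)."""
    return round(sim_time_ns * 1e6 / timestep_fs)

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one particle in one dimension with the velocity Verlet scheme."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
    return x, v
```

At 2 fs per step, 1 ns already requires 500,000 force evaluations over all atoms, and a folding event on the millisecond scale would require on the order of 5 x 10^11 steps. This is why full folding is out of reach for most sequences.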

21
State of the art
  • Critical Assessment of Structure Prediction (CASP)
  • Top groups
  • David Baker, University of Washington
  • ROSETTA
  • open source
  • also available through ROBETTA server (4 month
    queue)
  • Andrzej Kolinski, University of Warsaw, Poland
  • CABS
  • Yang Zhang (originally from Jeffrey Skolnick's
    lab), University of Kansas
  • I-TASSER (#1 server in CASP7 competition, shorter
    queue than ROBETTA, for now)

22
CABS
Source: http://biocomp.chem.uw.edu.pl/multiscale_modeling.php
23
Disorder?
  • Some proteins (or segments of proteins) appear to
    be intrinsically disordered in the cell

24
Dynamics?
  • No such thing as a static structure
  • Fluctuation about minimum energy structure
  • Elastic network model - Jernigan

25
Interactions?
  • Protein-Protein
  • Protein-DNA
  • Protein-RNA
  • Protein-Ligand

26
Essential Reading
  • Baker, D. and Sali, A. (2001) Protein Structure
    Prediction and Structural Genomics. Science 294
    (5540), 93. DOI: 10.1126/science.1065659
    http://tinyurl.com/3njfez

27
Protein Structure Classification
  • SCOP: Structural Classification of Proteins
  • Levels reflect both evolutionary and structural
    relationships
  • http://scop.mrc-lmb.cam.ac.uk/scop
  • CATH: Classification by Class, Architecture,
    Topology & Homology
    http://cathwww.biochem.ucl.ac.uk/latest/
  • DALI - (recently moved to EBI & reorganized)
  • DALI Database (fold classification)
    http://ekhidna.biocenter.helsinki.fi/dali/start

Each method has strengths & weaknesses.
28
SCOP - Structure Classification
http://scop.mrc-lmb.cam.ac.uk/scop/
29
CATH - Structure Classification
http://www.cathdb.info/latest/index.html