Structural Bioinformatics 2

Slides: 34
Provided by: davidrober
1
Structural Bioinformatics (2)
  • Dr. D.R. Westhead

2
Reminder of the previous lecture
  • Protein 3D structure
  • primary, secondary, tertiary and quaternary
    structure
  • SSEs and surface loops
  • rate of evolution of different parts of protein
    structures
  • relevance to multiple sequence alignment

3
Reminder of previous lecture
  • Evolution of overall structure
  • sequence similarity guarantees structural
    similarity
  • 25% sequence identity over an alignment of 80 or
    more residues means that the two proteins have
    the same basic 3D structure
  • BUT some proteins which share the same structure
    have much lower sequence identities
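
The 25% identity over 80 residues rule of thumb can be checked with a few lines of code. The following is a minimal sketch, assuming the two sequences are already aligned and use '-' for gaps; the function names and the choice to count only gap-free columns are illustrative assumptions, not part of the lecture (other definitions of percent identity use a different denominator).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> tuple[float, int]:
    """Return (percent identity, number of aligned residue pairs).

    Assumption: both strings come from the same pairwise alignment,
    have equal length, and use '-' for gaps. Columns containing a gap
    are skipped, so they count neither as matches nor as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        aligned += 1
        if a == b:
            identical += 1
    if aligned == 0:
        return 0.0, 0
    return 100.0 * identical / aligned, aligned


def likely_same_fold(aligned_a: str, aligned_b: str,
                     min_identity: float = 25.0,
                     min_length: int = 80) -> bool:
    """Apply the rule of thumb: >= 25% identity over >= 80 aligned residues."""
    pid, length = percent_identity(aligned_a, aligned_b)
    return length >= min_length and pid >= min_identity
```

A result of False here does not mean the folds differ: as the last bullet notes, some proteins share a structure at much lower sequence identity.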

4
Objectives of this lecture (1)
  • To understand the importance of evolutionary
    relationships in prediction problems in
    bioinformatics
  • To understand what protein structure prediction
    means
  • To understand why protein structure prediction is
    important

5
Objectives of this lecture (2)
  • To be aware of a number of structure prediction
    methods
  • secondary structure prediction
  • comparative modelling
  • To understand the conditions under which they can
    and should be applied
  • To understand their expected accuracy and some
    factors which affect it

6
Prediction in bioinformatics
  • Important prediction problems
  • prediction of protein sequence from genomic DNA
  • prediction of protein function from sequence
  • prediction of protein 3D structure from sequence
  • prediction of protein function from structure

7
Why are these prediction methods important?
  • Genome sequencing projects
  • generate large quantities of genomic sequences
  • BUT what does it mean?
  • Prediction of protein sequence, structure and
    function can give clues
  • Predictions can be verified experimentally
  • often slow

8
Prediction based on similarity
  • Many successful prediction methods rely on the
    detection of similarity
  • similar sequences have similar functions
  • similar sequences have similar structures (last
    lecture)
  • For instance doing a BLAST/FASTA search with a
    new protein sequence (previous lectures)

9
The importance of evolution
  • Divergent evolution gives rise to many similar
    sequences with related structures and often with
    related functions
  • an evolutionary family

10
An evolutionary prediction paradigm
  • If a sequence of unknown structure/function can
    be shown to be similar to one or more of known
    structure/function then functional/structural
    details can be transferred to the new sequence.

11
Predicting structure is easier
  • As sequences evolve 3D structure tends to be
    conserved
  • Function is often conserved as well
  • BUT function tends to change with evolution
    faster than structure
  • WHY? Or HOW?

12
Protein evolutionary families
  • Within an evolutionary protein family we expect
  • only one basic 3D structure
  • perhaps more than one different function
  • Functional differences can be minor or major
  • changes in enzyme specificity (minor)
  • change from enzyme to structural protein (seen in
    GST family) (major).

13
What is protein structure prediction?
  • In its most general form
  • a prediction of the (relative) spatial position
    of each atom in the tertiary structure generated
    from knowledge only of the primary structure
    (sequence)

14
Why predict protein structure?
  • The sequence-structure gap
  • 1 000 000 known sequences, 20 000 known
    structures
  • Structural knowledge brings understanding of
    function and mechanism of action (last lecture
    and practical)
  • Can help in prediction of function

15
Why predict protein structure?
  • Predicted structures can be used in structure
    based drug design
  • It can help us understand the effects of
    mutations on structure or function
  • It is a very interesting scientific problem
  • still unsolved in its most general form after
    more than 20 years of effort

16
Methods of structure prediction
  • Comparative modelling
  • Secondary structure prediction
  • Fold recognition/threading
  • Ab initio protein folding approaches

17
Terminology
  • The (prediction) target sequence
  • a sequence of unknown structure for which we
    require a structure prediction

18
Comparative modelling
  • Makes a prediction of tertiary structure based on
  • sequences of known structure which are similar to
    the target sequence (called template structures)
  • an alignment between these and the target
    sequence
  • Remember: 25% sequence identity means two
    proteins have the same basic structure

19
Choice of prediction methods
  • If you can find similar sequences of known
    structure then comparative modelling is the best
    way to predict structure
  • all other methods are less reliable
  • Of course, you can't always find similar
    sequences of known structure.

20
When you can't do comparative modelling
  • The next step is secondary structure prediction
  • less detailed results
  • only predicts the H (helix), E (extended) or C
    (coil/loop) state of each residue, does not
    predict the full atomic structure
  • Example: http://www.bioinformatics.leeds.ac.uk/group/undergraduate/biol3000_blgy2212/ex3_1utg.html

21
Beyond secondary structure prediction
  • When you can't do comparative modelling there are
    some things you could do beyond a secondary
    structure prediction
  • fold recognition or threading
  • ab initio protein folding
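
The choice between methods described in the last few slides can be summarised as a small decision function. This is only a sketch of the lecture's guidance: the function name, its parameters, and the reuse of the 25% identity over 80 residues thresholds as the test for a usable template are assumptions made for illustration.

```python
def choose_prediction_methods(best_template_identity, aligned_length,
                              min_identity=25.0, min_length=80):
    """Return prediction methods in the order suggested by the lecture.

    best_template_identity: percent identity to the best sequence of
        known structure, or None if no similar sequence was found.
    aligned_length: number of residues in that alignment.
    The thresholds are a rule of thumb, not a hard limit.
    """
    has_template = (best_template_identity is not None
                    and best_template_identity >= min_identity
                    and aligned_length >= min_length)
    if has_template:
        # A similar sequence of known structure exists: comparative
        # modelling is the most reliable option.
        return ["comparative modelling"]
    # Otherwise fall back to less detailed or less reliable methods.
    return ["secondary structure prediction",
            "fold recognition / threading",
            "ab initio folding"]
```

For example, `choose_prediction_methods(40.0, 120)` selects comparative modelling, while `choose_prediction_methods(None, 0)` returns the fallback list starting with secondary structure prediction.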

22
Fold recognition or threading
  • Aimed at detecting when the target sequence
    adopts a known fold, even if it has no
    significant similarity to sequences of known fold
  • remember the globin example last lecture
  • Beyond the scope of this module

23
Ab initio protein folding
  • Aims to predict tertiary structure from basic
    physico-chemical principles
  • does not rely on any detection of similarity to
    sequences of known structure
  • An important scientific question
  • As yet very unreliable for practical predictions

24
Accuracy of structure prediction
  • Comparative modelling
  • when template and target sequences are closely
    related high accuracy is possible
  • sometimes < 1.0 Angstrom RMSD (RMSD is the root
    mean square deviation between atomic positions in
    predicted and actual structures)
  • See handout graph
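
The RMSD definition above translates directly into code. The sketch below assumes the two structures have already been superimposed and that the atoms are listed in the same order in both; in practice an optimal superposition step comes first, which is not shown here. The function name and the coordinate format (tuples of x, y, z in Angstroms) are illustrative choices.

```python
import math


def rmsd(coords_pred, coords_actual):
    """Root mean square deviation between two equal-length atom lists.

    Each argument is a sequence of (x, y, z) tuples in Angstroms.
    Assumption: the structures are already superimposed and atom i in
    one list corresponds to atom i in the other.
    """
    if len(coords_pred) != len(coords_actual) or len(coords_pred) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(coords_pred, coords_actual):
        # Squared distance between corresponding atoms.
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(coords_pred))
```

Two atoms that are each displaced by exactly 1 Angstrom give an RMSD of 1.0, the level the slide describes as achievable for closely related template and target sequences.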

25
Factors affecting accuracy
  • The accuracy of comparative modelling is
    controlled by the quality of the alignment
    between target sequence and template structures
  • Alignment is easier if the sequences are closely
    related (e.g. sequence identity > 80%).

26
Accuracy of secondary structure prediction
  • The best methods have an average accuracy of just
    about 73% (the percentage of residues predicted
    correctly)
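
The accuracy measure quoted here, the percentage of residues assigned the correct state, is often called Q3. A minimal sketch of the calculation follows, assuming the predicted and observed states are given as strings over the three letters H, E and C; the function name is an illustrative choice.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state matches the observed one.

    Both strings must have the same length and use only the three
    states H (helix), E (extended) and C (coil/loop).
    """
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("need two non-empty state strings of equal length")
    for state in predicted + observed:
        if state not in "HEC":
            raise ValueError(f"unexpected state: {state!r}")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)
```

For instance, `q3_accuracy("HHHHEECC", "HHHCEECC")` gives 87.5, because seven of the eight residues are predicted correctly.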

27
Secondary structure prediction methods
  • PHD - Rost and Sander (artificial neural network)
  • DSC - King and Sternberg (linear discriminant
    analysis)
  • NNSSP - Salamov and Solovyev (nearest neighbour
    algorithm)
  • PREDATOR - Frishman and Argos
  • Jnet - Barton Group

28
Use of evolutionary information
  • All the previous secondary structure prediction
    methods make use of evolutionary information in
    related sequences
  • This improves prediction accuracy enormously
    (without this, accuracy would be less than 70%)
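
One common way to represent the evolutionary information in a set of related sequences is a per-column residue frequency profile built from a multiple alignment. The sketch below illustrates that idea only; the slides do not say how each of the listed methods encodes its input, so this should not be read as the implementation of any of them. The function name and the '-' gap convention are assumptions.

```python
from collections import Counter


def column_profile(alignment):
    """Per-column residue frequencies for a multiple sequence alignment.

    alignment: list of equal-length strings, with '-' marking gaps.
    Returns one dict per column mapping residue -> fraction of the
    non-gap residues in that column. All-gap columns give an empty dict.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for i in range(length):
        residues = [seq[i].upper() for seq in alignment if seq[i] != "-"]
        counts = Counter(residues)
        total = len(residues)
        profile.append({aa: n / total for aa, n in counts.items()})
    return profile
```

A column where every sequence has the same residue yields a single entry with frequency 1.0, while a variable column spreads the frequency over several residues; a predictor can use that difference as a signal that a single sequence cannot provide.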

29
Structure prediction resources
  • Secondary structure prediction
  • Jpred (http://www.compbio.dundee.ac.uk/Software/JPred/jpred.html)
  • several others also on the WWW
  • Comparative modelling
  • SWISS-MODEL (http://www.expasy.ch/swissmod/SWISS-MODEL.html)
  • We will use these in the practical

30
Caveats
  • The prediction methods we have covered are mainly
    aimed at water soluble (globular) proteins
  • we still do not know very many 3D structures of
    integral membrane proteins

31
Other aspects of structure prediction
  • There are methods for predicting which parts of
    sequences are transmembrane segments
  • see for example the ExPASy WWW site
  • http://www.expasy.ch/
  • There are links to lots of protein prediction
    resources available from ExPASy

32
Summary
  • The main points of this lecture were
  • prediction methods based on the detection of
    similarity are important in bioinformatics
  • the underlying reason for this is divergent
    evolution of sequences/structures
  • prediction of structure by comparative modelling
    is an example of such a method

33
Summary (2)
  • Continued
  • Comparative modelling should be used to predict
    structures whenever possible
  • If it is not possible then secondary structure
    prediction methods can be used, followed by more
    sophisticated methods
  • These prediction methods are mainly for globular
    proteins