Homology Modelling Course - PowerPoint PPT Presentation

Slides: 24
Provided by: mdur8
1
Homology Modelling Course
Dr Marcus Durrant
Computational Biology Group
Genome Centre Room 101b
John Innes Centre
marcus.durrant_at_bbsrc.ac.uk
2
Premise: every protein has a single structure,
defined by its primary amino acid sequence.
Why?
All chemical structures at equilibrium are
determined by thermodynamics:
G = H - TS
where G = free energy, H = enthalpy, T = temperature,
S = entropy.
At equilibrium, the value of G is as small as it
can be.
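As a minimal sketch of the relation G = H - TS, the snippet below compares two states of one molecule and picks the one with the lower free energy. The enthalpy and entropy values are hypothetical numbers chosen for illustration, not measured data, and the function name is an assumption of this example.

```python
def gibbs_free_energy(h_kj_mol, t_kelvin, s_kj_mol_k):
    """Return G = H - T*S in kJ/mol."""
    return h_kj_mol - t_kelvin * s_kj_mol_k


# Hypothetical values: the folded state has a favourable (negative) enthalpy
# but pays an entropy penalty relative to the unfolded reference state.
folded = gibbs_free_energy(h_kj_mol=-250.0, t_kelvin=298.0, s_kj_mol_k=-0.70)
unfolded = gibbs_free_energy(h_kj_mol=0.0, t_kelvin=298.0, s_kj_mol_k=0.0)

# At equilibrium the state with the smaller G dominates.
favoured = "folded" if folded < unfolded else "unfolded"
print(f"G(folded) = {folded:.1f} kJ/mol, favoured state: {favoured}")
```

With these made-up numbers the enthalpy gain outweighs the entropy loss, so the folded state is favoured; raising T far enough would reverse that.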
3
  • Can we calculate G from first principles?
  • YES, but at present, only for small molecules...
  • "The underlying physical laws necessary for the
    mathematical theory of a large part of physics and
    the whole of chemistry are thus completely known,
    and the difficulty is only that the exact
    application of these laws leads to equations much
    too complicated to be soluble."
    - P.A.M. Dirac, 1929
  • In practice, we can approximate the enthalpy term
    H quite well by molecular mechanics.
  • However,
      • proteins have very many possible conformations.
      • the entropy term is very hard to calculate.
  • Hence, we need homology modelling.
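
To give a feel for "very many possible conformations", here is a back-of-envelope count. It assumes, purely for illustration, that each residue independently adopts one of three backbone states and that a search could test 10^13 conformations per second; neither number comes from the slides.

```python
def count_conformations(n_residues, states_per_residue=3):
    """Number of conformations if every residue picks a state independently."""
    return states_per_residue ** n_residues


# Hypothetical 100-residue protein with 3 states per residue.
n_conformations = count_conformations(100)

# Hypothetical sampling rate, used only to show the scale of the problem.
SAMPLES_PER_SECOND = 1e13
SECONDS_PER_YEAR = 3600 * 24 * 365
years_needed = n_conformations / SAMPLES_PER_SECOND / SECONDS_PER_YEAR

print(f"{n_conformations:.2e} conformations, about {years_needed:.1e} years to enumerate")
```

Even under these generous assumptions, exhaustive enumeration is out of reach, which is why a template structure is such a useful starting point.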

4
It has been predicted that the total number of
protein folds will be about 1,000 (C. Chothia,
Nature 357, 543-4, 1992).
At present, about 700 folds have been characterised
(SCOP database).
Note, however, that some folds are much less well
represented, e.g. membrane surface proteins (12 known
so far).
Structure is much more highly conserved than
sequence.
In homology modelling, we exploit this principle
to build an initial model, which is then refined.
5
Principles of Protein Structure
Normally, only 20 amino acids are used in proteins.
[Figure labels: sidechain, N-terminus, C-terminus, backbone]
6
Protein Structural Elements
The principal protein structural elements are formed
by hydrogen bonding between the backbone amide
groups.
The local pattern of hydrogen bonding is
determined by the sidechains.
7
Protein Structural Elements
  • Helix
  • Sheet
      • parallel
      • antiparallel (preferred)
  • Loop
  • Turn
8
(No Transcript)
9
Example of Structure Conservation/Divergence
[Figure labels: missing N-terminus, extra loop, conserved sheets, conserved helices]
These two structures are only 28% identical.
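A figure such as "28% identical" comes from counting matching positions in a pairwise alignment. The sketch below uses one common convention (matches divided by the number of columns where neither sequence has a gap); other conventions divide by the alignment length or the shorter sequence length. The two sequences are made-up toy strings, not the proteins shown on the slide.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns that contain no gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not aligned:
        return 0.0
    matches = sum(a == b for a, b in aligned)
    return 100.0 * matches / len(aligned)


# Toy alignment (hypothetical): 9 gap-free columns, 7 of them identical.
identity = percent_identity("ACDEFG-HIK", "ACDQFGWHLK")
print(f"{identity:.1f}% identical")
```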
10
The α-helix
nth backbone O group hydrogen bonded to the
(n+4)th NH group
Destabilised by Pro, Gly
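The n to n+4 hydrogen-bonding rule can be written out directly. This is a small sketch with hypothetical function names: it lists the backbone C=O to N-H pairs inside a helix spanning given residue numbers, and flags Pro and Gly positions in a toy sequence.

```python
def helix_hbond_pairs(first, last):
    """(acceptor, donor) residue numbers: C=O of n bonds to N-H of n+4.

    Only pairs where both residues lie inside the helix first..last are listed.
    """
    return [(n, n + 4) for n in range(first, last - 3)]


def helix_breakers(sequence):
    """1-based positions of Pro (P) and Gly (G), which destabilise helices."""
    return [(i, res) for i, res in enumerate(sequence, start=1) if res in "PG"]


pairs = helix_hbond_pairs(1, 10)
breakers = helix_breakers("AEELLKPAGL")  # toy sequence, not a real protein
print(pairs)
print(breakers)
```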
11
The antiparallel β-sheet

Turn (note Gly)
Note bifurcated H-bond
Destabilised by unfavourable R-group interactions
(steric/electronic)
12
How useful will my model be?
[Figure: model usefulness against sequence identity (axis ticks: 0, 30, 60, 100). Labels: twilight zone; overall fold; residue-specific effects; mutagenesis; electrostatics; cavity volume; comparable to medium-resolution X-ray/NMR structure]
In the twilight zone, a good alignment is often
more useful than a 3D model.
13
Homology Modelling Procedure
validate model
14
Software and servers available at JIC
  • Online resources for locating structural
    homologues
      • Fugue (Shi, Blundell & Mizuguchi (2001), J Mol
        Biol, 310, 243-57)
      • FASTA on PDB website
  • Modelling software
      • DeepView (AKA Swiss PDB viewer)
      • Insight II (commercial package)
  • See course homepage for links to other protein
    tools

15
Types of PDB structure file
  • X-ray crystal structure
      • Done on solid at very low temperature (to minimise
        thermal motion/entropy)
      • Quality is indicated by resolution: the lower
        the number, the better (typically 1-4 Å)
      • Frequently contains more than one molecule of
        protein
      • May have missing residues; check header
      • Always read the paper!
  • NMR structure
      • Done on protein in solution
      • Generally small proteins only
      • Usually presented as a set of similar
        structures
      • No resolution value, but roughly equivalent to
        2 Å
      • Always read the paper!
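
The "check header" advice can be partly automated. The sketch below assumes the legacy PDB flat-file layout, where `EXPDTA` names the experimental method, `REMARK 2` carries the resolution and `REMARK 465` lists residues missing from the model. The sample text is a hand-written fragment in that layout, not a real entry, and the parser handles only the simple single-model form of `REMARK 465`.

```python
def summarise_pdb_header(text):
    """Extract method, resolution (Å) and missing residues from PDB header text."""
    info = {"method": None, "resolution": None, "missing": []}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "EXPDTA":
            info["method"] = " ".join(fields[1:])
        elif fields[:3] == ["REMARK", "2", "RESOLUTION."]:
            try:
                info["resolution"] = float(fields[3])
            except (IndexError, ValueError):
                pass  # e.g. "NOT APPLICABLE." for NMR structures
        elif fields[:2] == ["REMARK", "465"] and len(fields) == 5:
            name, chain, number = fields[2:]
            if len(name) == 3 and number.lstrip("-").isdigit():
                info["missing"].append((name, chain, int(number)))
    return info


# Hand-written sample in PDB header layout (hypothetical, not a real entry).
SAMPLE_HEADER = """\
EXPDTA    X-RAY DIFFRACTION
REMARK   2 RESOLUTION.    1.80 ANGSTROMS.
REMARK 465 MISSING RESIDUES
REMARK 465   M RES C SSSEQI
REMARK 465     MET A     1
REMARK 465     SER A     2
"""

summary = summarise_pdb_header(SAMPLE_HEADER)
print(summary)
```

A script like this only surfaces what the depositors recorded; it is no substitute for reading the paper.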

16
Principles of structure-guided sequence alignment
  • The degree of structure conservation is closely
    linked to the biological function. Structure is
    highly conserved when it needs to be.
  • Some residues (e.g. catalytic triad) will be
    strictly conserved.
  • At least some of the core structural elements
    (e.g. turns) should show significant conservation
    across multiple sequences.
  • Some regions of the sequence will be better
    conserved than others.
  • Deletions and insertions are much more likely
    to occur in loop regions.
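
One simple way to see which positions are strictly conserved is to score each column of a multiple alignment by the fraction of sequences sharing its most common residue. The sketch below does that for a toy alignment; the sequences and the function name are hypothetical, and gaps are counted against a column's score.

```python
from collections import Counter


def column_conservation(alignment):
    """Per column: (most common residue, fraction of sequences that have it)."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("need a non-empty alignment of equal-length sequences")
    n_sequences = len(alignment)
    scores = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != "-")
        if counts:
            residue, count = counts.most_common(1)[0]
        else:
            residue, count = "-", 0  # column made up entirely of gaps
        scores.append((residue, count / n_sequences))
    return scores


# Toy alignment (hypothetical sequences).
toy_alignment = ["GDSGGP", "GDSGAP", "GDSG-P", "GESGGA"]
scores = column_conservation(toy_alignment)
strictly_conserved = [i for i, (_, frac) in enumerate(scores) if frac == 1.0]
print(scores)
print("strictly conserved columns (0-based):", strictly_conserved)
```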

17
Dayhoff Mutation Matrix
  • The different chemical properties of the amino
    acids mean that some are more similar than others
  • The probability of a given mutation can be
    estimated by comparing many related sequences
  • The results are expressed as a probability
    matrix

-another tool for local sequence alignment
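The idea of estimating mutation probabilities from related sequences can be sketched as follows. This is a deliberately simplified illustration, not the actual Dayhoff procedure: it just counts residue pairs in aligned sequences symmetrically and normalises each row, whereas the real matrices are built from closely related sequences with further normalisation steps. The input pairs are toy data.

```python
from collections import defaultdict


def substitution_probabilities(aligned_pairs):
    """Estimate P(b | a) from aligned sequence pairs, skipping gap columns."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq_a, seq_b in aligned_pairs:
        for a, b in zip(seq_a, seq_b):
            if a == "-" or b == "-":
                continue
            # Count each observation in both directions (symmetric model).
            counts[a][b] += 1
            counts[b][a] += 1
    probabilities = {}
    for a, row in counts.items():
        total = sum(row.values())
        probabilities[a] = {b: c / total for b, c in row.items()}
    return probabilities


# Toy aligned pairs (hypothetical).
toy_pairs = [("LIVK", "LIVR"), ("LLVK", "LIVK")]
probs = substitution_probabilities(toy_pairs)
print(f"P(L -> I) = {probs['L']['I']:.2f}, P(K -> R) = {probs['K']['R']:.2f}")
```

Each row of the result sums to 1, so it can be read as one row of a probability matrix of the kind the slide describes.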
18
Why use multiple structures in the sequence
alignment?
VLPGDMMHFAADEKRNDLLDQQEGARHFSSPYMDA
LLPGDDDIYGVDT------NDQDLTRHLTSPFQNA

VLPGDMMHFAADEKNLDLRDQQEGARHFSSPYMDA
LLPGDDDIYGVDTNDQDLTR------HLTSPFQNA
MLPGRKMVFALPIKVGDLHHR---SKKVTSPYNNA
MVPGHHTLFGITQDLADLVTR----SPQSSPFNDG
VVPGKHSPYVVSTRDQDLITRPG--TVRSSPYQNG
19
Use structure to refine sequence alignment
---helix---------loop-------helix--
HFNVKVRTMQAHRAAAV--PVYYAGKGLTTENFTT
HFQAKVRSMQAKKTGLYTKLKKPGVQALTSENWNS
HFNVKVRT-----HAIYLYTKLKKAVTLTNDNFKT

HFNVKVRTMQAHRAAAV--PVYYAGKGLTTENFTT
HFQAKVRSMQAKKTGLYTKLKKPGVQALTSENWNS
HFNVKVRTHAIYLYTKL-----KKAVTLTNDNFKT
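
The refinement on this slide amounts to moving a gap out of a helix and into a loop. The sketch below checks gap placement against a per-column secondary-structure string. The two sequences are the third sequence from the slide, before and after refinement; the exact helix and loop boundaries in `SEC_STRUCT` are an assumption made for this example, since the slide only labels the regions approximately.

```python
def gaps_in_structured_regions(aligned_seq, sec_struct, loop_symbol="L"):
    """0-based columns where a gap falls outside a loop region."""
    if len(aligned_seq) != len(sec_struct):
        raise ValueError("sequence and annotation must have the same length")
    return [
        i
        for i, (res, ss) in enumerate(zip(aligned_seq, sec_struct))
        if res == "-" and ss != loop_symbol
    ]


# Assumed annotation: helix (H) in columns 0-15, loop (L) in 16-23, helix in 24-34.
SEC_STRUCT = "H" * 16 + "L" * 8 + "H" * 11

before = "HFNVKVRT-----HAIYLYTKLKKAVTLTNDNFKT"  # gap placed by sequence alone
after = "HFNVKVRTHAIYLYTKL-----KKAVTLTNDNFKT"   # gap moved using the structure

print("before:", gaps_in_structured_regions(before, SEC_STRUCT))
print("after: ", gaps_in_structured_regions(after, SEC_STRUCT))
```

Under that assumed annotation, the first alignment puts five gap columns inside the first helix, while the refined alignment puts none outside the loop.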

20
Structure assignment procedures
  • Structurally conserved regions
      • Conserved residue: copy all coordinates
      • Non-conserved residue: copy sidechains using the
        maximum overlap principle (Summers & Karplus)

21
  • Loops
      • If possible, use a reference structure with the
        same length of loop
      • Otherwise, either
          • Search the database for comparable loops
            (preferred)
          • Build the loops using random conformational
            searching

22
Molecular Mechanics
Treats molecules using classical mechanics rather
than quantum mechanics
Electrostatics, etc. also included as classical terms
Very efficient but limited accuracy; avoid
over-minimisation
MM optimisation can't make a bad model into a
good one!
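To make "classical terms" concrete, here is a sketch of three energy terms of the kind typically found in a molecular mechanics force field: a harmonic bond stretch, a Coulomb term and a Lennard-Jones term. The functional forms are standard textbook ones, but the parameter values are illustrative assumptions and do not come from any particular force field or from the packages named in this course.

```python
# Approximate Coulomb constant in kcal * Å / (mol * e^2).
COULOMB_CONSTANT = 332.06


def bond_energy(r, r0, k):
    """Harmonic bond stretch: 0.5 * k * (r - r0)^2 (some force fields omit the 0.5)."""
    return 0.5 * k * (r - r0) ** 2


def coulomb_energy(q1, q2, r):
    """Electrostatic energy of two point charges (in units of e) at distance r (Å)."""
    return COULOMB_CONSTANT * q1 * q2 / r


def lennard_jones_energy(r, epsilon, sigma):
    """12-6 Lennard-Jones term: short-range repulsion plus dispersion attraction."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 ** 2 - ratio6)


# Illustrative (hypothetical) parameters for one bond and one non-bonded pair.
total = (
    bond_energy(r=1.55, r0=1.53, k=600.0)
    + coulomb_energy(q1=0.4, q2=-0.4, r=4.0)
    + lennard_jones_energy(r=4.0, epsilon=0.1, sigma=3.4)
)
print(f"total energy of this toy system: {total:.3f} kcal/mol")
```

Every term is a cheap closed-form expression, which is where the efficiency comes from; the limited accuracy comes from the same simplicity.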
23
Now for the practical