Title: Macromolecular structure
1Macromolecular structure
2Contents
- Determination of protein structure
- Structure databases
- Secondary structure elements (SSE)
- Tertiary structure
- Structure analysis
- Structure alignment
- Domain recognition
- Structure prediction
- Homology modelling
- Threading/folder recognition
- Secondary structure prediction
- ab initio prediction
3Determination of protein structure
4Crystallisation
Hanging drop method / vapour diffusion method
Microscope
1-Dilute protein solution
Microscope slide
many different conditions of 12 must be tried
2-Concentrated salt solution
Crystal
Slide courtesy from Shoshana Wodak
5Determination of protein structure
Diffraction pattern
Atomic model
Slide courtesy from Shoshana Wodak
6The resolution problem
q
q
q
A high resolution protein structure 1.5 - 2.0 Å
resolution
Slide courtesy from Shoshana Wodak
7Nuclear Magnetic Resonance (NMR)
Source Branden Tooze (1991)
8Interatomic forces
- Covalent interactions
- Hydrogen bonds
- Hydrophobic/hydrophilic interactions
- Ionic interactions
- van der Waals force
- Repulsive forces
9Structure databases
10Structure databases
- PDB (Protein database)
- Official structure repository
- SCOP (Stuctural Classification Of Proteins)
- Structure classification. Top level reflect
structural classes.The second level, called Fold,
includes topological and similarity criteria. - CATH (Class, Architecture, Topology and
Homologous superfamily)
11PDB entry header
12CATH - A protein domain classification
- In CATH, protein domains are classified according
to a tree with 4 levels of hierarchically - Class
- Architecture
- Topology
- Homology
Figure from Shoshana Wodak
13Classifications of protein structures (domains)
CATH structural classification of proteins,
http//www.biochem.ucl.ac.uk/bsm/cath/
SCOP Structural classification of
proteins http//scop.mrc-lmb.cam.ac.uk/s
cop/ FSSPFold classification based on
structure alignments http//www.sander.e
bi.ac.uk/fssp/ HSSP Homology derived
secondary structure assignments
http//www.sander.ebi.ac.uk/hssp/
DALIClassification of protein domains
http//www.ebi.ac.uk/dali/domain/ VAST
structural neighbours by direct 3D structure
comparison http//www.ncbi.nlm.nih.gov
80/Structure/VAST/vast.shtml CE Structure
comparisons by Combinatorial Extension
http//cl.sdsc.edu/ce.html
Slide courtesy from Shoshana Wodak
14Books
- Branden, C. Tooze, J. (1991). Introduction to
protein structure. 1 edit, Garland Publishing
Inc., New York and London. - Westhead, D.R., J.H. Parish, and R.M. Twyman.
2002. Bioinformatics. BIOS Scientific Publishers,
Oxford. - Mount, M. (2001). Bioinformatics Sequence and
Genome Analysis. 1 edit. 1 vols, Cold Spring
Harbor Laboratory Press, New York. - Gibas, C. Jambeck, P. (2001). Developing
Bioinformatics Computer Skills, O'Reilly.
15Secondary structure elements
16Secondary structure - ?-helix
3.6 residues
hydrogen bond
Source Branden Tooze (1991)
17Hydrophobicity of side-chain residues in helices
Blue polar Red basic or acidic
Source Branden Tooze (1999)
18Secondary structure - ? sheets
Antiparallel
Parallel
Source Branden Tooze (1991)
19Secondary structure - twist of ? sheets
Mixed ? sheet
Source Branden Tooze (1991)
20Angles of rotation
- Each dipeptide unit is characterized by two
angles of rotation - Phi around the N-Calpha bond
- Psi around the Calpha-C bond
Image from Branden Tooze (1999)
21The Ramachandran map
Dipeptide unit
Dipeptide unit
Slide courtesy from Shoshana Wodak
22Tertiary structure
23Combinations of secondary structures
24Analysis of structure
25Structure-structure alignment and comparison
Structure B
Structure A
Question Is structure A similar to structure B ?
Approach structure alignments
Slide courtesy from Shoshana Wodak
26Analyzing conformational changes
Open form
Closed form
Citrate synthase, ligand induced conformational
changes Domain motion and small structural
distortions
Slide courtesy from Shoshana Wodak
27 Defining Domains What for?
- Link between domain structure and function
Different structural domains can be associated
with different functions
Enzyme active sites are often at domain
interfaces domain movements play a functional
role
Cathepsin D
DNA Methyltransferase
Slide courtesy from Shoshana Wodak
28Methods for Identifying Domains
- Underlying principle
- Domain limits are defined by identifying groups
of residues such that the number of contacts
between groups is minimized.
N
N
C
C
4-cuts
1-cut
N
C
2-cuts
Slide courtesy from Shoshana Wodak
29Lactate dehydrogenase
Domains From Contact Map
Slide courtesy from Shoshana Wodak
30Structure prediction
31Methods for structure prediction
- Homology modelling
- Building a 3D model on the basis of similar
sequences - Threading
- Threading the sequence on all known protein
structures, and testing the consistency - Secondary structure prediction
- ab initio prediction of tertiary structure
- For proteins of normal size, it is almost
impossible to predict structures ab initio. - Some results have been obtained in the prediction
of oligopeptide structures.
32Homology modelling - steps
- Similarity search
- Modelling of backbone
- Secondary structure elements
- Loops
- Modelling of side chains
- Refinement of the model
- Verification
- Steric compatibility of the residues
33Homology modelling - similarity search
- Starting from a query sequence, search for
similar sequences with known structure. - Search for similar sequences in a database of
protein structures. - Multiple alignment.
- A weight can be assigned to each matching protein
(higher score to more similar proteins) - The higher is the sequence similarity, the more
accurate will be the predicted structure. - When one disposes of structure for proteins with
gt70 similarity with the query, a good model can
be expected. - When the similarity is lt40, homology modeling
gives poor results. - The lack of available structures constitutes one
of the main limitations to homology modeling - In 2004, PDB contains
34Homology modelling - Backbone modelling
- Modelling of secondary structure elements
- a-helices
- b-sheets
- For each secondary structure element of the
template, align the backbone of query and
template. - Loop modelling
- Databases of loop regions
- Loop main chain depends on number of aa and
neighbour elements (a-a, a-b, b-a, b-b)
35Homology modelling - Side chain modelling
- Side-chain conformation (model building and
energy refinement) - Conserved side chains take same coordinates as in
the template. - For non-conserved side chains, use rotamer
libraries to determine the most favourable
conformation.
36Homology modelling - refinement
- After the steps above have been completed, the
model can be refined by modifying the positions
of some atoms in order to reduce the energy.