Title: Principles of protein structure and stability.
1. Principles of protein structure and stability.
2. A peptide bond is formed between two amino acids.
3. Backbone conformation is described by φ and ψ angles.
Picture from T. Przytycka, 2002
4. Hierarchy of protein structure.
- Amino acid sequence
- Secondary structure
- Tertiary structure
- Quaternary structure
Picture from Branden & Tooze, Introduction to Protein Structure
5. Right-handed alpha-helix.
- The helix is stabilized by hydrogen bonds (HB) between backbone NH groups and backbone carbonyl oxygen atoms.
- Geometrical characteristics:
- 3.6 residues per turn
- translation of 5.4 Å per turn
- translation of 1.5 Å per residue
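The three helix parameters above are related by simple arithmetic: 3.6 residues per turn times 1.5 Å per residue gives the 5.4 Å pitch. A minimal sketch of that relation follows; the function names are illustrative and not part of the original slides.

```python
# Geometry of an ideal right-handed alpha-helix, using the slide's values.
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE = 1.5  # translation along the helix axis, in Å


def helix_pitch():
    """Translation per full turn, in Å."""
    return RESIDUES_PER_TURN * RISE_PER_RESIDUE


def rotation_per_residue():
    """Rotation about the helix axis per residue, in degrees."""
    return 360.0 / RESIDUES_PER_TURN


def helix_length(n_residues):
    """Approximate axial length of an ideal helix of n_residues, in Å."""
    return n_residues * RISE_PER_RESIDUE
```

For example, `helix_length(10)` estimates the axial length of a 10-residue ideal helix.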
6. β-strand and β-sheet.
7. Loop regions are at the surface of protein molecules.
Adjacent antiparallel β-strands are joined by hairpin loops. Loops are more flexible than helices and strands. Loops can carry binding sites, active sites, and other functionally important sites.
Picture from Branden & Tooze, Introduction to Protein Structure
8. Protein classification based on the secondary structure content.
- Class α: proteins with only α-helices
- Class β: proteins with only β-sheets
- Class αβ: proteins with both α-helices and β-sheets
9. Protein stability. Anfinsen's experiments
10. Native proteins have low stability
- Scale of interactions in proteins:
- Interactions weaker than kT ≈ 0.6 kcal/mol are neglected.
- Interactions stronger than ΔG ≈ 10 kcal/mol are too large.
- Potential energy: van der Waals, electrostatic, hydrophobic.
[Figure: free energy G along the reaction coordinate, with the unfolded (U) and folded (F) states separated by ΔG.]
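The 0.6 kcal/mol figure for kT can be checked directly from the gas constant at room temperature. The sketch below assumes T = 298.15 K and uses per-mole units (RT); the Boltzmann weight shows why an energy near kT barely matters while one near 10 kcal/mol dominates. The function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def thermal_energy(temperature_k=298.15):
    """Thermal energy kT per mole (RT), in kcal/mol."""
    return R_KCAL * temperature_k


def boltzmann_weight(energy, temperature_k=298.15):
    """Relative population exp(-E/kT) of a state with energy E (kcal/mol)."""
    return math.exp(-energy / thermal_energy(temperature_k))
```

Comparing `boltzmann_weight(0.6)` with `boltzmann_weight(10.0)` gives a feel for the two ends of the scale.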
11. Electrostatic force.
Coulomb's law for two point charges q1 and q2 separated by a distance d: E = q1 q2 / (ε d)
q: point charge, ε: dielectric constant (ε = 1 in a vacuum)
ε ≈ 2-3 inside the protein, ε ≈ 80 in water
Na+ and Cl- in a vacuum: d = 2.76 Å, |E| ≈ 120 kcal/mol
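The Na+/Cl- number on this slide can be reproduced with Coulomb's law in units convenient for proteins. The sketch assumes charges in units of the elementary charge, distances in Å, and the usual conversion factor of about 332 kcal·Å/(mol·e²); the function name is illustrative.

```python
COULOMB_CONSTANT = 332.06  # kcal*Å/(mol*e^2), converts e^2/Å to kcal/mol


def coulomb_energy(q1, q2, distance, dielectric=1.0):
    """Coulomb energy (kcal/mol) of two point charges.

    q1, q2: charges in units of the elementary charge.
    distance: separation in Å.
    dielectric: 1 in a vacuum, about 2-3 inside a protein, about 80 in water.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * distance)
```

With `coulomb_energy(+1, -1, 2.76)` the vacuum value is recovered (negative, since the charges attract); passing `dielectric=80` shows how strongly water screens the same ion pair.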
12. Dipolar interactions.
Dipole moment
Interaction energy of two dipoles separated by the vector r
Peptide bond μ = 3.5 D, water molecule μ = 1.85 D.
[Figure: partial charges of the peptide group: O -0.42, C +0.42, N -0.20, H +0.20.]
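The slide's own equation for the dipole-dipole energy did not survive extraction. As a stand-in, the sketch below uses the standard point-dipole expression E = [μ1·μ2 - 3(μ1·r̂)(μ2·r̂)] / (ε r³), with dipoles in Debye, distances in Å, and a conversion factor of about 14.39 kcal·Å³/(mol·D²). That the slide used exactly this form is an assumption, and the function names are illustrative.

```python
import math

DIPOLE_CONSTANT = 14.39  # kcal*Å^3/(mol*D^2), i.e. 332.06 * (0.2082 e*Å per D)^2


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def dipole_dipole_energy(mu1, mu2, r, dielectric=1.0):
    """Point-dipole interaction energy in kcal/mol.

    mu1, mu2: dipole vectors in Debye.
    r: vector from dipole 1 to dipole 2, in Å.
    """
    dist = math.sqrt(dot(r, r))
    if dist == 0:
        raise ValueError("dipoles must be separated")
    r_hat = [x / dist for x in r]
    angular = dot(mu1, mu2) - 3.0 * dot(mu1, r_hat) * dot(mu2, r_hat)
    return DIPOLE_CONSTANT * angular / (dielectric * dist ** 3)
```

Two water-sized dipoles (1.85 D) placed head to tail attract, while the same dipoles placed side by side and parallel repel, which is the orientation dependence the formula encodes.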
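13. Van der Waals interactions.
Lennard-Jones potential
[Figure: Lennard-Jones potential E (kcal/mol, axis from -0.2 to 0.2) versus the distance between centers of atoms (axis from 2 to 12): repulsion at short distances, attraction at larger distances from the London dispersion energy between induced dipoles (δ+, δ-).]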
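The shape of the curve on this slide can be sketched with the common 12-6 form of the Lennard-Jones potential, E(r) = ε[(r_min/r)¹² - 2(r_min/r)⁶]. The well depth of 0.2 kcal/mol is chosen to match the plot's scale, and r_min = 4.0 Å is a purely illustrative assumption, not a value from the slides.

```python
def lennard_jones(r, well_depth=0.2, r_min=4.0):
    """12-6 Lennard-Jones energy in kcal/mol.

    r: distance between atom centers, in Å.
    well_depth: depth of the minimum (assumed 0.2 kcal/mol).
    r_min: distance of the minimum (assumed 4.0 Å, hypothetical).
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    ratio6 = (r_min / r) ** 6
    # The r^-12 term models repulsion, the r^-6 term London dispersion attraction.
    return well_depth * (ratio6 * ratio6 - 2.0 * ratio6)
```

The energy is positive (repulsive) at short range, reaches `-well_depth` at `r_min`, and decays towards zero at large distances.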
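14. Hydrogen bonds
[Figure: hydrogen bond between a donor (D) and an acceptor (A) carrying partial charges δ- and δ+, with a separation of about 3 Å; example: hydrogen bonds between water molecules.]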
15. Hydrogen bonding patterns in globular proteins.
- 1. Most HB are local, close in sequence.
- 2. Most HB are between backbone atoms.
- 3. Most HB are within single elements of secondary structure.
- 4. Proteins are almost equally saturated by HB: about 0.75 HB per amino acid.
16. Disulfide bonds.
- PROTEIN(SH)(SH) + GS-SG ⇌ PROTEIN(SH)(S-SG) + GSH ⇌ PROTEIN(S-S) + 2 GSH
- Breakdown and formation of S-S bonds are catalyzed by disulfide isomerase.
- In the cell S-S bonds are reversible; the free energy of their formation is close to zero.
- Secreted proteins have many S-S bonds, since outside the cell the equilibrium is shifted towards their formation.
17. Hydrophobic effect.
- Hydrophobic interaction: the tendency of nonpolar compounds to transfer from an aqueous solution to an organic phase.
- The entropy of water molecules decreases when they make contact with a nonpolar surface, so the free energy increases.
- As a result, upon folding nonpolar amino acids (AA) are buried inside the protein, while polar and charged AA are outside.
18. Hydrophobicities of amino acids.
19. Cooperativity of protein interactions
- Protein denaturation is a first-order (all-or-none) transition.
- As T increases:
- 1. Globule expansion, loose packing.
- 2. As expansion crosses the barrier, liberation of side chains and increase in entropy.
[Figure: energy E as a function of temperature T, and the energy distribution W(E) at temperatures T1, T and T2.]
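An all-or-none transition is often illustrated with a two-state model, in which only the native and denatured states are populated. The sketch below assumes ΔG(T) = ΔH(1 - T/Tm), neglecting any heat capacity change; ΔH = 100 kcal/mol and Tm = 330 K are hypothetical values chosen only to show how sharp the transition becomes, and they do not come from the slides.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def unfolding_free_energy(temperature_k, delta_h=100.0, t_melt=330.0):
    """ΔG of unfolding (kcal/mol), assuming ΔG = ΔH * (1 - T/Tm)."""
    return delta_h * (1.0 - temperature_k / t_melt)


def fraction_unfolded(temperature_k, delta_h=100.0, t_melt=330.0):
    """Fraction of denatured molecules in a two-state model."""
    delta_g = unfolding_free_energy(temperature_k, delta_h, t_melt)
    return 1.0 / (1.0 + math.exp(delta_g / (R_KCAL * temperature_k)))
```

With these assumed parameters the protein goes from almost fully native to almost fully denatured within a few tens of kelvin around Tm, with half of the molecules unfolded exactly at Tm.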
20. Summary
- The hydrophobic effect is mostly responsible for making a compact globule. The final specific tertiary structure is formed by van der Waals interactions, HB, and disulfide bonds.
- The secret of the stability of native structures is not in the magnitude of the interactions but in their cooperativity.
21. Classwork I: CN3D viewer.
- Go to http://ncbi.nlm.nih.gov
- Select alpha-helical protein (hemoglobin)
- Select beta-stranded protein (immunoglobulin)
- Select multidomain protein 1I50, chain A
- View them in CN3D
22. PDB databank.
- The archive of protein crystal structures was established in 1971 with several structures.
- In 2002: 17000 structures, including NMR structures.
- Data processing: data deposition, annotation and validation.
- PDB code nXYZ: n is an integer; X, Y, Z are characters.
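The nXYZ pattern of a PDB code can be checked with a short regular expression. The sketch assumes that n is a single digit and that X, Y, Z may be letters or digits, as in the code 1I50 used in the classwork; the function name is illustrative.

```python
import re

# One digit followed by three alphanumeric characters (assumed pattern).
PDB_CODE = re.compile(r"[0-9][A-Za-z0-9]{3}")


def is_pdb_code(text):
    """Return True if text looks like a four-character PDB code."""
    return PDB_CODE.fullmatch(text) is not None
```

This only checks the shape of the identifier; it does not confirm that an entry with that code exists in the archive.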
23. Content of Data in the PDB.
- Organism, species name
- Full protein sequence
- Chemical structure of cofactors and prosthetic groups
- Names of all components of the structure
- Qualitative description of the structural characteristics
- Literature citations
- Three-dimensional coordinates
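The three-dimensional coordinates are stored in fixed-column ATOM and HETATM records of a PDB-format file. A minimal sketch of reading one such record follows, using the column positions of the legacy PDB format (x in columns 31-38, y in 39-46, z in 47-54); the function name and the returned dictionary keys are illustrative choices.

```python
def parse_atom_record(line):
    """Parse one ATOM/HETATM line of a PDB-format file.

    Returns a dict with the atom name, residue name, chain, residue
    number and x, y, z coordinates (Å), or None for other record types.
    """
    if not line.startswith(("ATOM  ", "HETATM")):
        return None
    return {
        "atom_name": line[12:16].strip(),      # columns 13-16
        "residue_name": line[17:20].strip(),   # columns 18-20
        "chain_id": line[21],                  # column 22
        "residue_number": int(line[22:26]),    # columns 23-26
        "x": float(line[30:38]),               # columns 31-38
        "y": float(line[38:46]),               # columns 39-46
        "z": float(line[46:54]),               # columns 47-54
    }
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields in PDB records can run together without a separating space.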
24. Protein secondary structure prediction.
- Assumptions:
- There should be a correlation between amino acid sequence and secondary structure (SS). A short amino acid sequence is more likely to form one type of SS than another.
- Local interactions determine SS. The SS of a residue is determined by its neighbors (usually a sequence window of 13-17 residues is used).
- Exceptions: short identical amino acid sequences can sometimes be found in different SS.
- Accuracy: 65-75%; the accuracy is highest for the prediction of α-helices.
25. Methods of SS prediction.
- Chou-Fasman method
- GOR (Garnier, Osguthorpe and Robson)
- Neural network method
26. Chou-Fasman method.
- Analysis of the frequencies with which each amino acid occurs in different types of SS.
- Ala, Glu, Leu and Met are strong predictors of alpha-helices.
- Pro and Gly are predicted to break the helix.
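The frequency analysis above is usually expressed as a propensity: the fraction of an amino acid found in a given SS type, divided by the overall fraction of residues in that SS type. The sketch below computes such propensities from a sequence and a per-residue SS string; it illustrates only the counting step, not the full Chou-Fasman prediction rules, and the input encoding ("H" for helix) is an assumption.

```python
from collections import Counter


def propensities(sequence, structure, state="H"):
    """Propensity of each amino acid for one SS state.

    sequence: one-letter amino acid string.
    structure: per-residue SS string of the same length (assumed encoding,
               e.g. "H" helix, "E" strand, "C" coil).
    A value above 1 means the residue is enriched in that state.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    state_total = structure.count(state)
    if state_total == 0:
        raise ValueError("state does not occur in the structure string")
    background = state_total / len(sequence)
    residue_counts = Counter(sequence)
    in_state = Counter(aa for aa, ss in zip(sequence, structure) if ss == state)
    return {aa: (in_state[aa] / n) / background
            for aa, n in residue_counts.items()}
```

Real propensity tables are derived from many known structures; a single short sequence, as in a quick test, only demonstrates the arithmetic.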
27. GOR method.
- Assumption: formation of the SS of an amino acid is determined by the neighboring residues (usually a window of 17 residues is used).
- GOR uses principles of information theory for predictions.
- The method maximizes the information difference between two competing hypotheses: that residue a is in structure S, and that a is not in structure S.
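The information difference between the two hypotheses can be written as ln[P(S|a)/P(not S|a)] - ln[P(S)/P(not S)]. The sketch below evaluates this single-residue term from observed counts; the full method sums such terms over the positions of the sequence window, which is not shown here. The argument names are illustrative.

```python
import math


def information_difference(n_res_in_s, n_res_total, n_s_total, n_total):
    """Information difference (in nats) for one residue type and one SS state.

    n_res_in_s: times the residue type was observed in state S.
    n_res_total: times the residue type was observed at all.
    n_s_total: residues of any type observed in state S.
    n_total: all residues in the data set.
    A positive value favours "in S", a negative value favours "not in S".
    """
    n_res_not_s = n_res_total - n_res_in_s
    n_not_s = n_total - n_s_total
    if min(n_res_in_s, n_res_not_s, n_s_total, n_not_s) <= 0:
        raise ValueError("all four counts must be positive")
    return math.log(n_res_in_s / n_res_not_s) - math.log(n_s_total / n_not_s)
```

A residue whose frequency in S equals the background frequency carries no information, so the function returns zero for it.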
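28. Neural network method.
[Figure: feed-forward neural network. Input layer S_i: the input sequence window (L A W P G E V G A S T Y P), each residue encoded as a binary vector of 21 units with a single 1. Hidden layer H_j, connected to the input through weights W_ij. Output layer O_i: predicted SS, with one unit each for α, β and coil (in the example α = 1, β = 0, coil = 0).]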
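The network in the figure can be sketched as a one-hot encoding of the sequence window followed by one hidden layer and three output units. The sketch below is a forward pass only, with untrained random weights; the 21st input unit as a spacer, the sigmoid hidden units, the softmax output and all function names are assumptions for illustration, not the architecture of any specific program.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = AMINO_ACIDS + "-"  # 21st unit: spacer / unknown residue (assumed)
STATES = ("alpha", "beta", "coil")


def encode_window(window):
    """One-hot encode a sequence window: 21 input units per residue."""
    vector = []
    for residue in window:
        unit = [0.0] * len(ALPHABET)
        unit[ALPHABET.index(residue if residue in AMINO_ACIDS else "-")] = 1.0
        vector.extend(unit)
    return vector


def layer(inputs, weights, biases):
    """Weighted sums of the inputs, one per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(values):
    """Turn output activations into probabilities that sum to 1."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_weights(n_out, n_in, rng):
    """Small random weights; a real predictor would train these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def predict(window, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: input window -> hidden layer -> SS probabilities."""
    hidden = [1.0 / (1.0 + math.exp(-v))
              for v in layer(encode_window(window), w_hidden, b_hidden)]
    return dict(zip(STATES, softmax(layer(hidden, w_out, b_out))))
```

The prediction applies to the central residue of the window; sliding the window along the sequence gives one prediction per residue.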
29. PHD: neural network program with multiple sequence alignments.
- A BLAST search of the input sequence is performed, and similar sequences are collected.
- A multiple alignment of the similar sequences is used as the input to a neural network.
- The sequence pattern in a multiple alignment is enhanced compared with using a single sequence as the input.
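One simple way to turn a multiple alignment into network input is a profile: for each alignment column, the frequency of each amino acid. The sketch below computes such a profile; it is an illustration of the idea only, and the exact input encoding used by PHD is not specified in the slides. Gaps are assumed to be written as "-".

```python
from collections import Counter


def alignment_profile(aligned_sequences):
    """Per-column amino acid frequencies of a multiple alignment.

    aligned_sequences: equal-length strings, with "-" for gaps (assumed).
    Returns one dict per column mapping residue -> frequency among
    the non-gap residues of that column.
    """
    if len({len(seq) for seq in aligned_sequences}) != 1:
        raise ValueError("need a non-empty set of equal-length sequences")
    profile = []
    for column in zip(*aligned_sequences):
        residues = [c for c in column if c != "-"]
        counts = Counter(residues)
        profile.append({aa: n / len(residues) for aa, n in counts.items()})
    return profile
```

A conserved column gives a single residue with frequency 1.0, while a variable column spreads the weight over several residues, which is the extra signal a single sequence cannot provide.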
30. Classwork
- Go to http://ncbi.nlm.nih.gov, search for the protein flavodoxin in Entrez, and retrieve its amino acid sequence.
- Go to http://cubic.bioc.columbia.edu/predictprotein and run PHD on the sequence.
31. Definition of protein domains.
- Geometry: a group of residues with a high contact density; the number of contacts within domains is higher than the number of contacts between domains.
- chain-continuous domains
- chain-discontinuous domains
- Kinetics: a domain as an independently folding unit.
- Physics: a domain as a rigid body linked to other domains by flexible linkers.
- Genetics: the minimal fragment of a gene that is capable of performing a specific function.
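The geometric definition can be tested on a proposed domain assignment by counting residue contacts within and between domains. The sketch below assumes one representative coordinate per residue (for example the Cα atom) and a distance cutoff of 8 Å; both choices, and the function name, are illustrative assumptions.

```python
import math


def count_contacts(coords, domain_of, cutoff=8.0):
    """Count residue pairs in contact within and between domains.

    coords: list of (x, y, z) tuples, one per residue, in Å.
    domain_of: list of domain labels, one per residue.
    cutoff: contact distance in Å (assumed value).
    Returns (contacts_within_domains, contacts_between_domains).
    """
    within = between = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                if domain_of[i] == domain_of[j]:
                    within += 1
                else:
                    between += 1
    return within, between
```

A sensible domain assignment should give many more contacts within domains than between them.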
32. Domains as recurrent units of proteins.
- The same or similar domains are found in different proteins.
- Each domain performs a specific function.
- Proteins evolve through duplication and domain shuffling.
- The total number of different types of domains is small (1000-3000).
33. The Conserved Domain Architecture Retrieval Tool (CDART).
- Performs similarity searches of the NCBI Entrez Protein Database based on domain architecture, defined as the sequential order of conserved domains in proteins.
- The algorithm finds protein similarities across significant evolutionary distances using sensitive protein domain profiles. Proteins similar to a query protein are grouped and scored by architecture.