Title: NMR Spectroscopy and Protein Structures
1NMR Spectroscopy and Protein Structures
Chem 991A Special Topics in Physical Chemistry
MWF 10:30-11:20, Rm 733 Hamilton Hall
COURSE OUTLINE
Instructor: Dr. Robert Powers
Office: 722 HaH, Phone: 472-3039
Labs: 721 HaH, Phone: 472-5316
Office Hours: 11:30 am-12:30 pm MWF or by Special Appointment
Web page: http://bionmr-c1.unl.edu/
Text:
- J. N. S. Evans, Biomolecular NMR Spectroscopy, Oxford University Press
- M. H. Levitt, Spin Dynamics: Basics of Nuclear Magnetic Resonance, Wiley
Course Work:
- Exam 1: 100 pts (Mon., Sept. 28th)
- Exam 2: 100 pts (Wed., Nov. 4th)
- Final Exam: 200 pts (Fri., Dec. 18th, 10am-12pm)
- Written/Oral Reports (2): 100 pts (due Nov. 23rd and Dec. 7th)
- Problem Sets (2): 150 pts (due Sept. 21st and Nov. 16th)
- Total: 650 pts
2WRITTEN REPORTS CRITICALLY REVIEWING STRUCTURE
PAPERS
- Paper General
- Written in your own words. Do Not Copy or Directly Paraphrase the paper. Do not just summarize the paper.
- > 2-3 pages single space text
- Additional pages for figures, references if necessary
- 12 pitch font
- Double spacing between paragraphs and headings
- Prepare a 15 minute PowerPoint presentation to make to the class.
- Paper Topic
- Protein Structure Using NMR
- Can include complex structures
- Small molecules, protein, DNA, RNA, etc.
- Review Background of the Protein and its Biological Significance
- Summarize the Findings/Results of the Paper
- Outline/Explain the Methods Used
3WRITTEN REPORTS CRITICALLY REVIEWING STRUCTURE
PAPERS
- Provide Your Critical Analysis of the Paper
- Are there Any Problems or Issues With any of the Experiments or Data?
- Were the Experiments Sufficient to Address the Stated Problem?
- Would You Have Liked to Have Seen Other Experiments?
- What are those Experiments? Why?
- Were the Experiments Adequately Described or Referenced?
- Were the Interpretations Consistent With the Data?
- Did the Authors Over Interpret the Data?
- Does the Data Suggest Other Equally Plausible Conclusions?
- Do the Authors Acknowledge this?
- There is No Right Answer!
- Your Review Does Not Need to Be Negative or Positive
- Just an Accurate Presentation of the Paper
- You Need to Support Your Analysis
- Do not just Say the Authors Didn't Make Their Point. Explain Why!
4WRITTEN REPORTS CRITICALLY REVIEWING STRUCTURE
PAPERS
- Recommended Sources of Papers
- Nature Structural & Molecular Biology, Science, Nature, Cell, Molecular Cell, Structure, Protein Science, Proteins, PNAS, Journal of Molecular Biology, Biochemistry, Journal of Biological Chemistry, Journal of Biomolecular NMR
- Grading (50 points each paper)
- Due Dates
- First Paper - Mon., Nov. 23rd (presentations start Monday)
- You will receive critical feedback to help improve the 2nd paper
- Second Paper - Mon., Dec. 7th (presentations start Monday)
- Problem Set (150 points)
- Writing Awk Scripts (50 points)
- Due Date: Mon., Sept. 21st
- Protein structure analysis & modeling questions using XPLOR (100 points)
- Due Date: Mon., Nov. 16th
5- Lecture Topics
- Topic Chapters in Biomol. NMR Spect.
- Overview of Protein Structures
- Introduction to Linux and Awk
- Protein Structures from an NMR Perspective 4
- Protein Modeling Software 3.9
- Molecular Mechanics and Dynamics 3.5-3.9
- Comparison of X-ray and NMR Structures
- Isotope Labeling of Proteins 4.2.2 4.2.3
- NMR Assignment Problem 2
- NMR Software 3.9
- 2D NMR 2.1
- 3D NMR 2.2
- 4D NMR 2.3
- NMR Structure Determination 3
- NOEs 3.1
- Coupling constants and stereospecific assignments 3.2, 4.1.2
- Chemical shifts 4.1.4
- Amide Exchanges 4.1.3, 5.2
6Some Other Recommended Resources
NMR of Proteins and Nucleic Acids, Kurt Wuthrich
Protein NMR Spectroscopy: Principles and Practice, John Cavanagh, Arthur Palmer, Nicholas J. Skelton, Wayne Fairbrother
Principles of Protein Structure, G. E. Schulz & R. H. Schirmer
Introduction to Protein Structure, C. Branden & J. Tooze
Enzymes: A Practical Introduction to Structure, Mechanism, and Data Analysis, R. Copeland
Biophysical Chemistry Parts I to III, C. Cantor & P. Schimmel
Principles of Nucleic Acid Structure, W. Saenger
7Some Important Web Sites
RCSB Protein Data Bank (PDB): Database of NMR & X-ray Structures, http://www.rcsb.org/pdb/
BMRB (BioMagResBank): Database of NMR resonance assignments, http://www.bmrb.wisc.edu/
CATH Protein Structure Classification: Classification of All Proteins in PDB, http://www.biochem.ucl.ac.uk/bsm/cath/
SCOP Structural Classification of Proteins: Classification of All Structures into Families, Super Families etc., http://scop.berkeley.edu
DALI: Compares 3D-Structures of Proteins to Determine Structural Similarities of New Structures, http://www.ebi.ac.uk/dali/
NMR Information Server: NMR Groups, News, Links, Conferences, Jobs, http://www.spincore.com/nmrinfo/
NMR Knowledge Base: A lot of useful NMR links, http://www.spectroscopynow.com/
8Protein Structures from an NMR Perspective
- Background
- We are using NMR Information to FOLD the Protein.
- We need to know how this NMR data relates to a protein structure.
- We need to know the specific details of properly folded protein structures to verify the accuracy of our own structures.
- We need to know how to determine what NMR experiments are required.
- We need to know how to use the NMR data to calculate a protein structure.
- We need to know how to use the protein structure to understand biological function.
9Protein Structures from an NMR Perspective
Analyzing NMR Data is a Non-Trivial Task! There is an abundance of data that needs to be interpreted.
Initial rapid convergence to approximate correct
fold
Iterative guesses allow correct fold to emerge
Interpreting NMR Data Requires Making Informed
Guesses to Move Toward the Correct Fold
10Protein Structures from an NMR Perspective
What Do We Mean By Informed Guesses? As we will see in detail, analysis of NMR data is commonly ambiguous.
A simple illustration:
Diagonal peak assigned to Ala 97 CαH
NOE cross-peak assigned to Thr 17 CγH
Chemical shift assignment of peak consistent with Ala 16 CβH, Thr 43 CγH, Ile 36 Cγ2H, etc.
Options: 1) be conservative and leave the ambiguous peak unassigned; 2) guess the assignment as Ala 16 CβH based on the proximity to assigned Thr 17 CγH
11Protein Structures from an NMR Perspective
Initial rapid convergence to approximate correct
fold
Iterative guesses allow correct fold to emerge
- To progress to the correct protein fold, it is important to make limited guesses
- Do Not Be Afraid or Hesitant to Make Reasonable Guesses!
- if the guess is wrong
- within limits, process is self-correcting → too many guesses are a problem
- the structure combined with the abundance of other correct data will identify the wrong guess
- if the guess is correct
- the assignment will be consistent with the structure → more correct DATA!
- may resolve other ambiguous data
- allow for other guesses to further the structure analysis
12Protein Structures from an NMR Perspective
What Information Do We Know at the Start of
Determining A Protein Structure By NMR?
- Amino Acids (building blocks of protein structures)
- Important features of Amino Acids that Impact the Overall Structure of a Protein Include:
- Size
- Charge
- Polarity
- Hydrophobicity
- Aromaticity
- Conformationally unusual side chains
13Protein Structures from an NMR Perspective
What Information Do We Know at the Start of
Determining A Protein Structure By NMR?
- Amino Acids (building blocks of protein structures)
- Important features of Amino Acids that Impact the Overall Structure of a Protein Include:
- Zwitterion (charge)
- The chemistry of amino acids is complicated by the fact that the -NH2 group is a base and the -CO2H group is an acid. At physiological pH (7.4), an H+ ion is transferred from one end of the molecule to the other to form a zwitterion or salt-like structure.
14Protein Structures from an NMR Perspective
Illustration of Zwitterion Characteristics of
Amino Acids from the pH Titration of Alanine
15Protein Structures from an NMR Perspective
Polar Amino Acids
Asparagine, ASN, N
Cysteine, CYS, C
Glutamine, GLN, Q
Histidine, HIS, H (depends on pH)
Serine, SER, S
Threonine, THR, T
Tryptophan, TRP, W
Tyrosine, TYR, Y
Carbon: gray, Oxygen: red, Hydrogen: white, Nitrogen: light blue, Sulfur: yellow
16Protein Structures from an NMR Perspective
Hydrophobic Amino Acids
Alanine, ALA, A
Isoleucine, ILE, I
Leucine, LEU, L
Methionine, MET, M
Phenylalanine, PHE, F
Proline, PRO, P
Valine, VAL, V
Glycine, GLY, G
Carbon: gray, Oxygen: red, Hydrogen: white, Nitrogen: light blue, Sulfur: yellow
17Protein Structures from an NMR Perspective
Charged Amino Acids
Positive
Arginine, ARG, R
Histidine, HIS, H (depends on pH)
Lysine, LYS, K
Negative
Aspartate, ASP, D
Glutamate, GLU, E
Carbon: gray, Oxygen: red, Hydrogen: white, Nitrogen: light blue, Sulfur: yellow
18Protein Structures from an NMR Perspective
Amino Acid Structures as Part of a Protein
Structure
Knowing the shape and composition of individual
amino acids makes it easier to identify them as
part of a more complex protein structure
19Protein Structures from an NMR Perspective
Amino Acid Structures as Part of a Protein
Structure
20Protein Structures from an NMR Perspective
Venn diagram grouping amino acids according to
their properties
Livingstone & Barton, CABIOS, 9, 745-756, 1993
21Protein Structures from an NMR Perspective
22Protein Structures from an NMR Perspective
- Some General Rules Regarding the Distribution of Amino Acids in Proteins
- Charged residues are hardly ever buried.
- if buried, generally involved in a salt-bridge
- Polar residues are usually found on the surface of the protein, but can be buried.
- if buried, generally involved in a hydrogen bond
- The inside, or core, of a protein contains mostly non-polar residues.
- Non-polar residues are also found on the outside of proteins.
23Energetic Cost of Putting Amino Acid in Interior
or Surface of Protein
24Protein Structures from an NMR Perspective
- Kyte-Doolittle Hydropathy Ranking of Relative Amino Acid Hydrophobicity
- Does it make sense for the residue to be on the protein surface or buried in its core?
- Based on an amalgam of experimental observations derived from the literature.
- Web page to calculate hydrophobicity plots for protein sequence
- http://fasta.bioch.virginia.edu/o_fasta/grease.htm
J. Mol. Biol. (1982) 157:105-132.
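A hydropathy plot of the kind the web page produces can be sketched in a few lines. The sketch below assumes the standard published Kyte-Doolittle values and a simple unweighted sliding-window average; the function name, the window size, and the example sequence are illustrative choices, not taken from the course material.

```python
# Kyte-Doolittle hydropathy values (J. Mol. Biol. (1982) 157:105-132).
# Positive = hydrophobic, negative = hydrophilic.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean Kyte-Doolittle score of every full window in the sequence."""
    scores = [KD[res] for res in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical sequence: a hydrophobic stretch flanked by charged residues.
example = "KDEKRLIVALLIVAGKDEK"
for start, value in enumerate(hydropathy_profile(example, window=5), start=1):
    print(f"window starting at residue {start:2d}: {value:6.2f}")
```

Windows with strongly positive averages are candidates for buried or membrane-spanning segments; strongly negative windows are more likely surface exposed.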
25Protein Structures from an NMR Perspective
- Biological Base Hydrophobicity Scale (Nature (2005) 433:377)
- based on the stability of a peptide sequence in a membrane
- where n = 0-7
- also, variable stability based on position
Decreasing stability
26Protein Structures from an NMR Perspective
Consensus Hydrophobicity Scale (Journal of Chromatography A (2003) 1000:637)
Distribution of hydrophobicity rankings (panels: Ala, Arg, Asn, Asp, Cys, Gln)
Comparison of four commonly used oil partitioning scales to measure hydrophobicity: ethanol-dioxane, N-methylacetamide, octanol-water, water-cyclohexane
27Protein Structures from an NMR Perspective
Consensus Hydrophobicity Scale (Journal of Chromatography A (2003) 1000:637)
Distribution of hydrophobicity rankings (panels: Glu, Gly, Phe, Met, Ile, His, Pro, Ser, Leu, Lys, Thr, Trp)
28Protein Structures from an NMR Perspective
- Some General Rules Regarding the Distribution of Amino Acids in Proteins
- To bury charged or polar residues, residues are probably involved in a salt bridge or hydrogen bond.
Salt Bridge
Hydrogen Bond
Salt-bridge
- This minimizes or eliminates the ΔG transfer energy needed to bury polar or charged residues
29Protein Structures from an NMR Perspective
- Propensity of Amino-Acids To Be Present In A Protein's Active-Site
- probability of contact with a non-protein atom
- positive number means higher than random → likely to be part of active-site
- negative number means lower than random → unlikely to be part of active-site
- does not include protein-protein or protein-peptide interactions
- roles for tryptophan and proline
HIS  0.360    ALA  0.025
CYS  0.210    MET  0.025
SER  0.130    ILE -0.005
LYS  0.100    TYR -0.040
THR  0.100    VAL -0.060
ASN  0.080    GLY -0.070
ARG  0.055    PHE -0.120
GLN  0.050    TRP -0.140
GLU  0.050    LEU -0.180
ASP  0.045    PRO -0.200
Holm & Sander, Intelligent Systems for Molecular Biology, 5, 140-146, 1997
30Protein Structures from an NMR Perspective
- All Amino Acids (except Gly) Have at Least One Chiral Center
- All amino acids in proteins are L-configuration
- Gly Increases Main Chain Flexibility
- well-conserved during evolution
- Branched Side Chains are Stiffer
- Val, Ile, Leu
- chain folding is facilitated (ΔS is small)
- Pro is a Very Rigid Side-Chain
- Also Fixes backbone conformation
- Phi (φ) is constrained to about -60°
- His is Suitable for Enzyme Catalytic Site
- Commonly Found in Protein Active-Site
- pKa (6.0) Near Physiological pH
- Cys can form intra or inter-strand bonds
31Protein Structures from an NMR Perspective
- pH Titration of Histidine Side Chain
- observed pKa is very dependent on the local
structure around the histidine
32Protein Structures from an NMR Perspective
- pH Titration of Histidine Side Chain
- Experimentally measure pKa of His by following chemical shift difference of His ring proton as a function of pH.
- Will observe different pKas for different His in a single protein based on their local structure and involvement in the protein's function/activity.
- pKa = pH where the observed chemical shift is half-way between protonated and deprotonated state
pKa
His fully protonated
His fully deprotonated
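The half-way definition of the pKa can be made concrete with a short sketch. It assumes a single titrating group in fast exchange, so the observed shift is the population-weighted average of the two limiting shifts (Henderson-Hasselbalch behavior). The limiting shifts of 8.6 and 7.7 ppm and the pKa of 6.3 are hypothetical numbers for illustration, and the linear interpolation is a simple stand-in for a proper curve fit.

```python
def observed_shift(pH, pKa, shift_protonated, shift_deprotonated):
    """Population-weighted chemical shift for a single titrating site in fast exchange."""
    ratio = 10.0 ** (pKa - pH)  # [protonated] / [deprotonated]
    return (shift_protonated * ratio + shift_deprotonated) / (1.0 + ratio)


def estimate_pKa(pH_values, shifts, shift_protonated, shift_deprotonated):
    """Estimate the pKa as the pH where the shift crosses the half-way value.

    Uses linear interpolation between the two measured points that bracket
    the midpoint; returns None if the titration never crosses it.
    """
    midpoint = 0.5 * (shift_protonated + shift_deprotonated)
    points = list(zip(pH_values, shifts))
    for (p1, s1), (p2, s2) in zip(points, points[1:]):
        if (s1 - midpoint) * (s2 - midpoint) <= 0.0:
            if s1 == s2:
                return p1
            return p1 + (midpoint - s1) * (p2 - p1) / (s2 - s1)
    return None


# Synthetic titration: hypothetical His ring proton with pKa 6.3.
pH_grid = [4.0 + 0.5 * i for i in range(9)]  # pH 4.0 to 8.0
data = [observed_shift(p, 6.3, 8.6, 7.7) for p in pH_grid]
print(estimate_pKa(pH_grid, data, 8.6, 7.7))
```

With real data the limiting shifts are usually fitted together with the pKa rather than assumed known.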
33Protein Structures from an NMR Perspective
- pH Titration of Histidine Side Chain
- Experimental data for Human Myoglobin
- Similar Titrations for Other Side-Chains (Tyr, Glu)
- Measure presence of salt-bridge, hydrogen bonds, etc.
34Protein Structures from an NMR Perspective
- pH Titration of Histidine Side Chain
- Presence of a protonated side chain affects the local carbon chemical shifts
- Unprotonated
- Cα 54.3 ppm
- Cβ 30.7 ppm
- Protonated
- Cα 53.3 ppm
- Cβ 28.5 ppm
35Protein Structures from an NMR Perspective
- Spectral properties of amino acids
- Trp, Tyr, and Phe contain conjugated aromatic rings and absorb UV light.
- Extinction coefficients are:
- Trp 5,050 M⁻¹cm⁻¹ (280 nm)
- Tyr 1,440 M⁻¹cm⁻¹ (274 nm)
- Phe 220 M⁻¹cm⁻¹ (257 nm)
- Extinction coefficients are additive
- Therefore, if a protein contained 3 Tyr and one Trp its extinction coefficient would be
- ε = 3 x 1,440 + 1 x 5,050 = 9,370 M⁻¹cm⁻¹
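The additivity rule is easy to apply to a whole sequence. This sketch simply counts Trp, Tyr, and Phe and sums the per-residue values quoted on the slide; like the slide, it ignores the fact that the three values are quoted at different wavelengths. The function name is an illustrative choice.

```python
# Per-residue extinction coefficients from the slide, in M^-1 cm^-1.
EXTINCTION = {"W": 5050, "Y": 1440, "F": 220}


def extinction_coefficient(sequence):
    """Sum the aromatic-residue contributions for a one-letter-code sequence."""
    return sum(EXTINCTION.get(res, 0) for res in sequence.upper())


# The slide's example: 3 Tyr and 1 Trp (other residues contribute nothing).
print(extinction_coefficient("WYYY"))
```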
36Protein Structures from an NMR Perspective
Basic Amino Acid Nomenclature
37Protein Structures from an NMR Perspective
More Detail Amino Acid Nomenclature
Each atom is given a unique identifier. This
includes equivalent methyl hydrogens.
Two Versions of Naming Convention 3?1
38Protein Structures from an NMR Perspective
Amino Acid 1H NMR Assignments
39Protein Structures from an NMR Perspective
Amino Acid 13C NMR Assignments
40Protein Structures from an NMR Perspective
- NMR Chemical Shifts Exhibit Specific Amino-Acid Trends
- By combining 2 or more correlated chemical shifts
41Protein Structures from an NMR Perspective
- Local Protein Structure Affects NMR Chemical Shifts
- Significant Deviations From Random-Coil Chemical Shifts Are Routinely Observed
- Charge state, conformation, covalent modification, etc.
- Structure-Based Deviations May be Larger than Residue Based Differences
- Ring Current Effect
- Proximity to Aromatic Rings will have a pronounced effect on NMR Chemical shifts.
- Effect also depends on spatial orientation → above/below plane has different impact than edge on.
- Which amino acids are next to aromatic rings depends on the overall fold of the protein
42Protein Structures from an NMR Perspective
- Local Protein Structure Affects NMR Chemical Shifts
- Hydrogen Bond
- a dipole-dipole attraction
- typical ranges
- 2.4 Å < d < 4.5 Å
- 90° < φ < 180°
HN Chemical Shifts and Hydrogen Bond Length
sobs-sring 8.29dN-1 4.11
43Protein Structures from an NMR Perspective
- A Number of Amino Acid Hydrogens are Labile and Exchange Readily with Water
- Exchange Rate is pH Dependent
- As Exchange Increases → NMR Lines Broaden Beyond Detection
- Backbone NH is Critical Hydrogen that Exchanges with Water
- Hydrogen Bonds and buried NHs (protected from solvent) decrease Exchange Rate
- Reason Why NMR Samples Use low pH Buffers (typically pH 5.0 to 6.5)
NMR Line widths
Increase Exchange Rate
44Protein Structures from an NMR Perspective
- Overview of Some Basic Structural Principles
- Primary Structure: the amino acid sequence arranged from the amino (N) terminus to the carboxyl (C) terminus → polypeptide chain
- Secondary Structure: regular arrangements of the backbone of the polypeptide chain without reference to the side chain types or conformation
- Tertiary Structure: the three-dimensional folding of the polypeptide chain to assemble the different secondary structure elements in a particular arrangement in space.
- Quaternary Structure: Complexes of 2 or more polypeptide chains held together by noncovalent forces but in precise ratios and with a precise three-dimensional configuration.
45Protein Structures from an NMR Perspective
Primary Structure: linear arrangement of amino-acid sequence
N-Alanine-Glycine-Phenylalanine-Tyrosine-Serine-C
Three letter code: N-Ala-Gly-Phe-Tyr-Ser-C
Single letter code: AGFYS
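Converting between the three-letter and single-letter codes is a simple table lookup. The sketch below uses the standard codes for the 20 common amino acids; the function name and the convention of skipping one-character tokens (the N and C terminus markers used on the slide) are illustrative assumptions.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_one_letter(three_letter_sequence):
    """Convert a hyphen-separated three-letter sequence to single-letter code.

    One-character tokens such as the 'N' and 'C' terminus markers are skipped.
    """
    tokens = [t.strip() for t in three_letter_sequence.split("-")]
    return "".join(THREE_TO_ONE[t.upper()] for t in tokens if len(t) == 3)


print(to_one_letter("N-Ala-Gly-Phe-Tyr-Ser-C"))
```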
46Protein Structures from an NMR Perspective
The amino acids in the linear arrangement are joined by the formation of a peptide bond.
The Peptide Bond: chemical linkage -CO-NH- formed by the condensation of the amino group and carboxyl group of a pair of amino acids to form an amide bond.
47Protein Structures from an NMR Perspective
- Important Features of the Peptide Bond
- 1) the bond is always planar.
- Rotation about peptide bond is inhibited
- 2) The bond is very stable
- Not generally pH, buffer or temperature labile
- Boil the sample in very high or low pH to cleave
- Cleavage more efficient at high pH
- Exception: cleavage occurs at Asp-Pro peptide bond at low pH and elevated temperatures
- Half-life at pH 2.5 and 40 °C is 50 hrs
48Protein Structures from an NMR Perspective
- Important Features of the Peptide Bond
- 3) the bond is always trans except for proline
- Cis-proline and trans-proline exhibit unique H-H distances
- Trans: distance of Hα of residue preceding proline and the proline Hδ is short (< 2.5 Å)
- Cis: distance of Hα of residue preceding the proline and the proline Hα is short (< 2.5 Å)
49Protein Structures from an NMR Perspective
- Important Features of the Peptide Bond
- 4) Structural Dimensions are well defined
- Bond lengths and bond angles of peptide bond are
known
50Protein Structures from an NMR Perspective
General PolyPeptide Nomenclature
51Protein Structures from an NMR Perspective
- Amino Acid Structural Nomenclature
- Definitions of Torsion Angles
- Backbone
- Phi (φ): C(i-1)-N(i)-Cα(i)-C(i)
- Psi (ψ): N(i)-Cα(i)-C(i)-N(i+1)
- Omega (ω): Cα(i-1)-C(i-1)-N(i)-Cα(i)
- constrained to 180°
- Side-chain
- Chi-1 (χ1): N(i)-Cα(i)-Cβ(i)-Cγ(i)
- Chi-2 (χ2): Cα(i)-Cβ(i)-Cγ(i)-Cδ1(i)
Note: χ1 uses Ile Cγ1, Ser Oγ, Thr Oγ1, Val Cγ1; χ2 uses His Nδ1
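Each torsion angle above is the dihedral defined by four consecutive atoms, so any of them can be computed from Cartesian coordinates with the same routine. The sketch below is one common vector formulation, written in plain Python; the function names are illustrative, and the test coordinates are made-up points rather than real protein atoms. The sign follows the usual convention that a clockwise rotation, viewed down the central bond, is positive.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees (-180 to 180) for four points p0-p1-p2-p3.

    For phi, pass C(i-1), N(i), CA(i), C(i); for psi, N(i), CA(i), C(i), N(i+1).
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Remove the components of b0 and b2 that lie along the central bond.
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: the outer bonds are perpendicular when viewed down the central bond.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```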
52Protein Structures from an NMR Perspective
- Ramachandran Plot
- Peptide Conformation is Defined by φ,ψ dihedrals (ω constrained)
- Steric Configuration Limits the Range of φ,ψ dihedrals Available to the Amino Acid.
- Pro is more restricted, where φ is constrained to -60°
- Gly is less restricted, wider range of φ,ψ dihedrals
Gly
Non-Gly/Proline
Allowable Regions in φ,ψ space. Dark Gray Corresponds to Most Favorable Regions. Significant region of φ,ψ is unallowed
53Protein Structures from an NMR Perspective
- Ramachandran Plot
- If φ,ψ dihedral values were listed for every amino acid
- Protein Topology is Defined!
- Ramachandran considered what combinations of φ,ψ were favorable for each amino acid
- Only van der Waals forces were considered.
- How many backbone conformations of a 300 residue protein are possible?
- Only φ,ψ important.
- φ,ψ need only be given to ±15°
- i.e., sampled every 30°
- Consider only minima of Ramachandran plot.
- Still Encounter Approximately 10^300 conformations!
- Levinthal paradoxes
- How is the right conformation found?
- Why are there only 5,000 protein folds?
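The size of this number follows from simple counting. Sampling every 30° gives 360/30 = 12 values per angle, or 144 (φ,ψ) combinations per residue; restricting to the Ramachandran minima leaves far fewer. The sketch below assumes about 10 allowed states per residue, which is an inference that reproduces the slide's figure rather than a number stated in the course material.

```python
import math


def log10_conformations(states_per_residue, n_residues):
    """log10 of the number of backbone conformations, assuming independent residues."""
    return n_residues * math.log10(states_per_residue)


n_residues = 300
full_grid = (360 // 30) ** 2  # 12 phi values x 12 psi values = 144 per residue
print(f"full 30-degree grid: 10^{log10_conformations(full_grid, n_residues):.0f}")
# Assumption: about 10 low-energy (phi, psi) states per residue.
print(f"minima only:         10^{log10_conformations(10, n_residues):.0f}")
```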
54Protein Structures from an NMR Perspective
- Ramachandran Plot
- Sensitivity of the protein structure to relatively small changes in φ,ψ
Same Number of Amino Acids
φ = -57°, ψ = -70°
φ = -57°, ψ = -47°
φ = -74°, ψ = -4°
55Protein Structures from an NMR Perspective
- Similar Issues For Side Chain Conformation
- Steric considerations define allowable χ
- Staggered configuration is lowest energy
- 60°, -60° or 180°
Valine χ1
180° 60° -60°
56Protein Structures from an NMR Perspective
57Protein Structures from an NMR Perspective
- Limited Number of Possible Conformers for χ1, χ2, χ3
- All conformers are not equal energy
- Different amino acids have different χ energy profile and different population
- Example Potential Energy Surfaces for Side Chain Dihedrals
- Still combination of 60°, 180°, or -60° (300°)
Gln/Glu χ1 χ2 Map
Gln/Glu χ2 χ3 Map
http://spin.niddk.nih.gov/clore/Software/Torsion_angles/protein-tor/protein_side.html
58Protein Structures from an NMR Perspective
- Example Potential Energy Surfaces for Side Chain Dihedrals
- Still combination of 60°, 180°, or -60° (300°)
Leu χ1 χ2 Map
Ile χ1 χ2 Map
59Protein Structures from an NMR Perspective
- χ2 for Phe, Trp and Tyr are Restricted to 90° or -90°
- χ1 can still be 60°, 180° or -60°
Trp χ1 χ2 Map
Phe/Tyr χ1 χ2 Map
60Protein Structures from an NMR Perspective
- Primary Structure: Disulphide Bonds
- Distinct regions of the primary polypeptide sequence may be joined by the formation of a disulphide bond between two spatially adjacent Cysteines.
- Disulphide bonds are formed by the oxidation of two cysteine residues to form a covalent sulphur-sulphur bond, which can be an intra- or inter-molecular bridge.
- Multiple disulphide bonds are possible in a protein structure.
- Presence of a disulphide bond(s) restricts the conformations available to the protein.
- Disulphide bonds stabilize the overall protein fold by 2.5 - 3.5 kcal/mol.
- Disulphide bond is present in both folded and unfolded protein. Probably only contributes entropically, not enthalpically.
61Protein Structures from an NMR Perspective
- Primary Structure: Disulphide Bonds
- Restriction of conformational space is more apparent in small protein structures
- Presence of free Cysteines in the protein structure may cause problems in NMR/X-ray structural work
62Protein Structures from an NMR Perspective
- Primary Structure: Disulphide Bonds
- Geometry of a disulphide bond
- Sγ-Sγ covalent bond length of 2.08 Å
- Defined by 5 dihedral angles
- Two main types
- Left-handed: χ1 -60°, χ2 -60°, χ3 -85°, χ2' -60°, χ1' -60°; Cα-Cα distance 5.88 ± 0.49 Å
- Right-handed: χ1 -60°, χ2 120°, χ3 99°, χ2' -50°, χ1' -60°; Cα-Cα distance 5.07 ± 0.73 Å
63Protein Structures from an NMR Perspective
- Primary Structure: Disulphide Bonds
- Presence of a disulphide bond affects the local carbon chemical shifts
- Reduced
- Cα 56.9 ppm
- Cβ 28.9 ppm
- Oxidized
- Cα 54.05 ppm
- Cβ 42.25 ppm
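Because the Cβ shift differs by more than 13 ppm between the two states, a measured Cys Cα/Cβ pair can suggest whether the residue is in a disulphide. The sketch below is a crude nearest-reference heuristic using only the four values on this slide; the function name is illustrative, and a real assignment would also weigh secondary-structure effects on the shifts.

```python
import math

# Reference Cys carbon shifts from the slide, in ppm.
CYS_REFERENCE = {
    "reduced": {"CA": 56.9, "CB": 28.9},
    "oxidized": {"CA": 54.05, "CB": 42.25},
}


def guess_cys_state(ca_shift, cb_shift):
    """Return 'reduced' or 'oxidized', whichever reference pair is closer in ppm."""
    def distance(ref):
        return math.hypot(ca_shift - ref["CA"], cb_shift - ref["CB"])

    return min(CYS_REFERENCE, key=lambda state: distance(CYS_REFERENCE[state]))


# Hypothetical measured shifts for one cysteine.
print(guess_cys_state(55.1, 40.3))
```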
64Protein Structures from an NMR Perspective
What Information Do We Know at the Start of
Determining A Protein Structure By NMR?
- Effectively Everything We have Discussed to this Point!
- The primary amino acid sequence of the protein of interest.
- All the known properties and geometry associated with each amino acid and peptide bond within the protein.
- General NMR data and trends for the unstructured (random coil) amino acids in the protein.
- The number and location of disulphide bonds.
- Not Necessary → can be deduced from structure.
65Protein Structures from an NMR Perspective
Secondary Structure: regular arrangements of the backbone of the polypeptide chain without reference to the side chain types or conformation
- Major Types of Secondary Structure Elements
- helices
- α-helix
- 3₁₀-helix
- π-helix
- β-strands
- parallel
- anti-parallel
- Turns
- β turns
- types I, I', II, II', III, III', VIa, VIb
- γ turns
- Inverse
- Other or random coil
Assigning the Secondary Structure is the First
Stage of Determining an NMR Protein Structure
66Protein Structures from an NMR Perspective
- Secondary Structure Helices
- Helix Nomenclature
67Protein Structures from an NMR Perspective
- Secondary Structure Helices
- Secondary structures are typically distinguished by φ,ψ values and hydrogen bonding pattern
68Protein Structures from an NMR Perspective
- Secondary Structure Helices
- Secondary structures are typically distinguished by φ,ψ values and hydrogen bonding pattern
69Protein Structures from an NMR Perspective
- Secondary Structure: Helices
- α-helix: most common helix found in protein structures → most thermodynamically stable
- 31% of secondary structure elements
- Right-handed twist to helix.
- Helix Dipole
- 85% of helices are distorted (φ,ψ ≠ -60°)
- Amino-acid preference in α-helix
- Side-chains on the Surface of Helix
70Protein Structures from an NMR Perspective
- Secondary Structure: Helices
- Amino Acid Preference for α-Helix
α-Helix Propensity (larger number better)
Ala 1.489    Leu 1.236
Arg 1.224    Lys 1.172
Asn 0.772    Met 1.363
Asp 0.924    Phe 1.195
Cys 0.966    Pro 0.492
Gln 1.164    Ser 0.739
Glu 1.504    Thr 0.785
Gly 0.510    Trp 1.090
His 1.003    Tyr 0.787
Ile 1.003    Val 0.990
Protein Engineering 1:289-294 (1987).
J. Mol. Biol. (2004) 337, 1195-1205
71Protein Structures from an NMR Perspective
- Secondary Structure: Helices
- Amphipathic α-helix
- has a polar and a non-polar side
- hydrophobic residues are regularly spaced three or four positions apart in a linear sequence.
- plays a crucial role in
- helix-helix interaction
- interaction of small peptides that have a helical conformation
- interaction with membranes
- air-water interfaces
- self-assembly processes
Helical wheel representation of amphipathic α-helix
leucine zipper
72Protein Structures from an NMR Perspective
- Secondary Structure: Helices
- Amphipathic α-helix
- has a polar and a non-polar side
Amphipathic α-helix interacts with membrane
73Protein Structures from an NMR Perspective
- Secondary Structure: Helix Dipole
- CO-HN H-bonds are almost parallel with the helix axis → H-bond dipoles reinforce in the helix to form helix dipole
- Helix dipole (+ end towards N-terminus)
- capping by hydrogen bonding to NH and CO groups at the N- and C-termini
- charge-dipole interactions
- charged side chains form stabilizing interactions with the helix dipole.
74Protein Structures from an NMR Perspective
- Secondary Structure: Helix Dipole
- Residues preferred at N- and C-terminus of an α-helix
Protein Science (1995) 4:1325-1336.
75Protein Structures from an NMR Perspective
- Secondary Structure: Helices
- 85% of helices are distorted (φ,ψ ≠ -60°)
- radius of curvature > 90 Å
- deviation of axis from straight line is 0.25 Å.
- Distortions caused by
- A substantial amount of all 3₁₀-helices occur at the ends of α-helices.
- π-helices also occur at the ends of α-helices.
- Packing of buried helices against other secondary structural elements in the core of a protein can lead to distortions since the side chains are on the surface of helices.
- Proline residues induce distortions of around 20° in the direction of a helix.
- Proline causes 2 hydrogen bonds in the helix to be broken.
- Helices containing proline are usually long because shorter helices would be destabilized.
- Exposed helices are often bent away from the solvent.
76Protein Structures from an NMR Perspective
- Secondary Structure: Helix Length
- Average Length of α-helix is 10 residues
- One helical turn requires 4 residues → defines minimal length
- Helix Nomenclature: ...-N''-N'-Ncap-N1-N2-N3-...-C3-C2-C1-Ccap-C'-C''-...
- Ncap: N-terminus of helix, Ccap: C-terminus of helix
Stability of Helix Length Depends on Relative Spatial Orientation of Ncap, Ccap, etc.
Position of C-cap relative to N-cap as a function of length. The good lengths are black circles, the bad lengths are white circles. The N-cap is a cross.
Position of C2 relative to N-cap as a function of length. The good lengths are black circles, the bad lengths are white circles. The N-cap is a cross.
77Protein Structures from an NMR Perspective
- Secondary Structure: 3₁₀-helix and π-helix
- 3₁₀-helix is rare
- Only 3.4% of helical residues.
- Found at end of α-helix.
- Dipoles not aligned as in α-helix.
- 3 residues per turn; 10 atoms enclosed in ring formed by each hydrogen bond.
- CO forms H-bond with NH 3 residues along chain (i, i+3)
- π-helix is extremely rare
- Found at end of α-helix
- φ,ψ at edge of allowed region of Ramachandran plot
- τ (N-Cα-C') angle is 114.9°, larger than standard 109.5°
- Larger radius causes axial hole too small for solvent
- Side-chains less staggered than α-helix
78Protein Structures from an NMR Perspective
- Secondary Structure: β-strands
- β-sheet is an abundant secondary structure (about 25% of globular proteins)
- β-strands adopt an extended structure with an average length of 6 residues
- Single β-strands are not stable.
- If the β-strand contains alternating polar and non-polar residues → amphipathic β-sheet.
- β-strands occur in association with other strands to form β-sheets.
- Strands can be parallel (N→C, N→C) or anti-parallel (N→C, C→N)
- β-strand has right-handed twist (0-30° per residue)
- Hydrogen bonding occurs between strands
- H-bond geometry is different between parallel and anti-parallel strands
Rise: distance between adjacent residues. Pitch: distance between repeat structure.
rise
pitch
79Protein Structures from an NMR Perspective
- Secondary Structure: β-Sheets
- Secondary structures are typically distinguished by φ,ψ values and hydrogen bonding pattern
β
α
80Protein Structures from an NMR Perspective
- Secondary Structure: β-strands
- anti-parallel β-sheet
- Left-handed twist (25°)
- Majority of bulges occur in anti-parallel β-sheets
Note alternating spaced H-bonds
β-strand II
Hydrogen bonds between NH (blue) and CO (red)
C-terminus
N-terminus
H-bond length 2.9 ± 0.3 Å
β-strand I
81Protein Structures from an NMR Perspective
- Secondary Structure: β-strands
- β-bulge
- hydrogen-bonding of two residues from one strand with one residue from another strand
Bulge
Hydrogen bonds from residue 33 to both residues 41 and 42
82Protein Structures from an NMR Perspective
- Secondary Structure: β-strands
- parallel β-sheet
- Less twisted than anti-parallel β-sheets
- Less likely to have a bulge compared to anti-parallel β-sheets (only 5%)
- Hydrogen bonds are not perpendicular to individual strands
- Has macrodipole that is 5 times less than average α-helix dipole
Hydrogen bonds between NH (blue) and CO (red)
β-strand II
N-terminus
C-terminus
β-strand III
β-strand I
Note: Individual strands that comprise a sheet do not need to be sequentially related or the same size
83Protein Structures from an NMR Perspective
- Secondary Structure: β-Sheets
- β-sheet can continue in both directions.
- Most β-sheets have < 6 β-strands with an average of 6 residues per strand.
- H-bonds are 0.1 Å shorter than α-helix
- β-sheets can be all parallel, all anti-parallel or mixed.
- Formed from strands that are very often from distant portions of the polypeptide sequence.
- Lengths of individual strands can vary.
- Do not need to be of uniform length
- Most β-sheets exhibit a left-handed twist (25°).
- results from a relative rotation of each residue in the strands by 30° per amino acid in a right-handed sense.
84Protein Structures from an NMR Perspective
- Secondary Structure: β-sheet
- Amino Acid Preference for β-Sheet
- Hydrophobic and steric effects are unimportant
- inductive effect largely determines the β-sheet propensities
- amino acid side chains shielding of the Cα nucleus
- No capping preference has been identified to date
β-Sheet Propensity (larger number better)
Ala 0.79    Leu 1.17
Arg 0.94    Lys 0.73
Asn 0.66    Met 1.01
Asp 0.66    Phe 1.23
Cys 1.07    Pro 0.62
Gln 1.00    Ser 0.94
Glu 0.51    Thr 1.33
Gly 0.87    Trp 1.24
His 0.83    Tyr 1.31
Ile 1.57    Val 1.64
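The α-helix and β-sheet propensity tables in these slides can be combined into a rough first guess at what a short segment prefers. The sketch below averages each table over a segment and reports the larger mean. This is an illustrative simplification: the two scales come from different sources, so their magnitudes are not strictly comparable, and the function names and example peptides are hypothetical.

```python
# Propensity values copied from the slides (one-letter codes as keys).
HELIX = {
    "A": 1.489, "R": 1.224, "N": 0.772, "D": 0.924, "C": 0.966,
    "Q": 1.164, "E": 1.504, "G": 0.510, "H": 1.003, "I": 1.003,
    "L": 1.236, "K": 1.172, "M": 1.363, "F": 1.195, "P": 0.492,
    "S": 0.739, "T": 0.785, "W": 1.090, "Y": 0.787, "V": 0.990,
}
SHEET = {
    "A": 0.79, "R": 0.94, "N": 0.66, "D": 0.66, "C": 1.07,
    "Q": 1.00, "E": 0.51, "G": 0.87, "H": 0.83, "I": 1.57,
    "L": 1.17, "K": 0.73, "M": 1.01, "F": 1.23, "P": 0.62,
    "S": 0.94, "T": 1.33, "W": 1.24, "Y": 1.31, "V": 1.64,
}


def mean_propensity(segment, table):
    """Average propensity of a one-letter-code segment under the given table."""
    return sum(table[res] for res in segment.upper()) / len(segment)


def preferred_structure(segment):
    """Return 'helix' or 'sheet', whichever mean propensity is larger."""
    helix = mean_propensity(segment, HELIX)
    sheet = mean_propensity(segment, SHEET)
    return "helix" if helix >= sheet else "sheet"


# Hypothetical six-residue segments.
for peptide in ("AELLKK", "VIVTYV"):
    print(peptide, preferred_structure(peptide))
```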
85Protein Structures from an NMR Perspective
- Secondary Structure: Turns
- Short and tight structural regions that connect other secondary structure elements
- Comprised of 3 to 5 residues
- Allow the peptide chain to reverse direction
- Therefore, proline and glycine are prevalent in turns
- Connect adjacent β-strands
- Reverse turns occur mainly on the surface
- Therefore, charged residues are prevalent in turns
- Two common turns
- β-turns
- More common turn
- Four consecutive residues, two do not form H-bonds
- Carbonyl of one residue is H-bonded to the amide proton of a residue three residues away
- Distance < 7 Å between the Cα atoms of residues i and i+3
- Nine types of β-turns differ by φ, ψ of the i+1 and i+2 residues
- Types I', II', III' are mirror images of Types I, II, III
- Type III β-turns may be considered as short regions of 3₁₀-helix
- γ-turns
- Very tight turn
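The Cα distance criterion for β-turns listed above can be sketched as a short scan over a Cα trace. This is an illustration only: the function name and the input format (a list of (x, y, z) tuples in Å) are assumptions, and the distance test alone is not a full β-turn definition. Residues in an α-helix also have Cα(i) to Cα(i+3) distances of about 5 Å, so a practical assignment would additionally exclude helical residues and check the φ, ψ angles.

```python
import math


def beta_turn_candidates(ca_coords, cutoff=7.0):
    """Return indices i where the Cα(i) to Cα(i+3) distance is below cutoff.

    ca_coords: list of (x, y, z) tuples in Å, one per residue (assumed format).
    cutoff: 7 Å, the distance criterion quoted in the list above.
    """
    hits = []
    for i in range(len(ca_coords) - 3):
        if math.dist(ca_coords[i], ca_coords[i + 3]) < cutoff:
            hits.append(i)
    return hits


if __name__ == "__main__":
    # Idealized chain reversal: four Cα atoms on the corners of a square.
    turn = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
    print(beta_turn_candidates(turn))
```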
86Protein Structures from an NMR Perspective
- Secondary Structure: Turns
- Secondary structures are typically distinguished by φ, ψ values and hydrogen bonding pattern
- Some preferred residues are indicated; bold are most significant
87Protein Structures from an NMR Perspective
- Secondary Structure: β-turns
- Illustration of the Type I and II β-turns and their mirror images
Figure: β-turn structures with the hydrogen bond indicated.
88Protein Structures from an NMR Perspective
- Secondary Structure: γ-turns
- Illustration of the classical and inverse γ-turn
Figure: γ-turn structures with the hydrogen bond indicated.
89Protein Structures from an NMR Perspective
- Secondary Structure: Turns
- Amino acid preferences for turns
90Protein Structures from an NMR Perspective
- SuperSecondary Structure
- Arrangements of two, three or more consecutive secondary structures
- α-helices or β-strands
- Common features in many different proteins
- Completely different amino acid sequences
(a) βαβ: two parallel strands of β-sheet connected by a stretch of α-helix
(b) αα: two anti-parallel α-helices
(c) β meander: an anti-parallel sheet formed by a series of tight reverse turns connecting stretches of a polypeptide chain
(d) Greek key: a repetitive super-secondary structure formed when an anti-parallel sheet doubles back on itself
91Protein Structures from an NMR Perspective
- SuperSecondary Structure
- Coiled coils
- 2 or more α-helices
- Contains a heptad repeat (H: hydrophobic, P: polar)
- a b c d e f g: (H P P H P P P)n
- Leucine zippers: leucine in d position
- N is 3
- Knob (a and d) into hole interactions
Figure label: Knobs
Moutevelis and Woolfson (2009) J. Mol. Biol. 385:726
92Protein Structures from an NMR Perspective
- SuperSecondary Structure
- Coiled coils
- Periodic Table
- Leucine zippers: leucine in d position
- N is 3
- Knob into hole interactions (KIH)
- Number of coils increases to the right
- Circle: helix; lines: KIH; grey: hydrophobic core
- Population and percentage of occupancy are given below each architecture.
- Complexity increases down each column
- Helix shared between two-helix coiled coils
- Interface between 2 or more coiled coils
93Protein Structures from an NMR Perspective
- SuperSecondary Structure
- Coiled coils
- Diversity of structures
Kohn et al. (1997) J. Biol. Chem. 272:2583
94Protein Structures from an NMR Perspective
- SuperSecondary Structure
- Coiled coils
- Packing angle (Ω) and axial separation
- Packing angle: angle between the two helices
- Axial separation: shortest distance between the two helices
Walther et al. (1998) Proteins 33:457
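Both quantities defined above follow from simple vector geometry once each helix is approximated by a straight axis. The sketch below assumes each axis is given as a point and a direction vector (how the axis is fitted to the helix is not covered), treats the axes as infinite lines, and returns an unsigned angle. The sign convention used for Ω in the cited work is not reproduced here, and all function names are illustrative.

```python
import math


def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _norm(u):
    return math.sqrt(_dot(u, u))


def packing_angle_deg(d1, d2):
    """Unsigned angle in degrees between two helix axis directions."""
    cos_angle = _dot(d1, d2) / (_norm(d1) * _norm(d2))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle))


def axial_separation(p1, d1, p2, d2):
    """Shortest distance between two axes treated as infinite lines."""
    w = _sub(p2, p1)
    n = _cross(d1, d2)
    if _norm(n) < 1e-12:  # parallel axes: distance from p2 to line 1
        return _norm(_cross(w, d1)) / _norm(d1)
    return abs(_dot(w, n)) / _norm(n)


if __name__ == "__main__":
    # Two perpendicular axes offset by 9.6 Å along z.
    print(packing_angle_deg((1, 0, 0), (0, 1, 0)))
    print(axial_separation((0, 0, 0), (1, 0, 0), (0, 0, 9.6), (0, 1, 0)))
```

For real helices of finite length, the closest approach of the infinite lines may fall outside the helices, so a segment-to-segment distance would be the more careful choice.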
95Protein Structures from an NMR Perspective
- SuperSecondary Structure
- Coiled coils
- Average axial separation differs for transmembrane and soluble coiled coils
- Solution: 9.6 Å
- Transmembrane: 9.0 Å. Two clusters at 7.3 Å and 10.8 Å
- Transmembrane coiled coils are more compact and contain smaller amino acids (Gly)
Eilers et al. (2000) PNAS 97:5796
96Protein Structures from an NMR Perspective
- SuperSecondary Structure
- Coiled coils
- Average axial separation varies linearly with amino-acid volumes
- Size (volume) of residues at helix contact
97Protein Structures from an NMR Perspective
- SuperSecondary Structure
- Coiled coils
- Packing angle (Ω) distribution
- Preferential angles are:
- -45°
- 23°
- 75°
Bowie (1997) Nature Structural Biology 4:915
98Protein Structures from an NMR Perspective
- SuperSecondary Structure
- Coiled coils
- Packing angle (Ω)
- Depends on geometry of hydrophobic residues
- Steric compatibility alone defines packing angle
Figure: helix packing for heptad and 11-residue repeats. Panel labels: side by side, face to face, long, normal. Ω values shown: 20°, 20°, 0° to -10°, -30° to -40°.
Efimov (1999) FEBS Letters 463:3
99Protein Structures from an NMR Perspective
- Tertiary Structure
- The three-dimensional folding of the polypeptide chain to assemble the different secondary structure elements in a particular arrangement in space
100Protein Structures from an NMR Perspective
- Tertiary Structure
- Periodic table of Protein Folds
- Set of idealized structures
- Experimental structures are compared to the idealized set to find the best match and classification
Basis Set: most biologically important protein structures are derived from these idealized structures
Taylor (2002) Nature 416:657
101Protein Structures from an NMR Perspective
- Tertiary Structure
- Periodic table of Protein Folds
- Set of idealized structures looking edge on
- 4 layers thick
Legend: small circles: α-helix; bars: β-sheet; arc: curved β-sheet; open circle: β-barrel
102Protein Structures from an NMR Perspective
- Tertiary Structure
- The three-dimensional folding of the polypeptide chain to assemble the different secondary structure elements in a particular arrangement in space
- 800 unique folds have been identified
- 1,000 to 5,000 protein folds are predicted
SCOP: Structural Classification of Proteins. 1.75 release: 38221 PDB Entries (23 Feb 2009). 110800 Domains. 1 Literature Reference (excluding nucleic acids and theoretical models)
http://scop.mrc-lmb.cam.ac.uk/scop/index.html
103Protein Structures from an NMR Perspective
Tertiary Structure
Family: Clear evolutionary relationship. Proteins clustered together into families are clearly evolutionarily related. Generally, this means that pairwise residue identities between the proteins are 30% and greater. However, in some cases similar functions and structures provide definitive evidence of common descent in the absence of high sequence identity; for example, many globins form a family though some members have sequence identities of only 15%.
Superfamily: Probable common evolutionary origin. Proteins that have low sequence identities, but whose structural and functional features suggest that a common evolutionary origin is probable, are placed together in superfamilies. For example, actin, the ATPase domain of the heat shock protein, and hexokinase together form a superfamily.
Fold: Major structural similarity. Proteins are defined as having a common fold if they have the same major secondary structures in the same arrangement and with the same topological connections. Different proteins with the same fold often have peripheral elements of secondary structure and turn regions that differ in size and conformation. In some cases, these differing peripheral regions may comprise half the structure. Proteins placed together in the same fold category may not have a common evolutionary origin: the structural similarities could arise just from the physics and chemistry of proteins favoring certain packing arrangements and chain topologies.
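The 30% pairwise identity guideline for families can be illustrated with a small calculation on two pre-aligned sequences. This is a sketch under stated assumptions: the sequences are already aligned with "-" as the gap character, identity is counted over columns where neither sequence has a gap (one of several conventions in use), and the function names are hypothetical. As the globin example shows, the threshold is a rule of thumb and not a strict criterion.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def suggests_same_family(aligned_a, aligned_b, threshold=30.0):
    """True if identity meets the 30% guideline quoted above (rule of thumb)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy alignment: 4 identical residues out of 5 gap-free columns.
    print(percent_identity("ACDE-F", "ACDQWF"))
```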
104Protein Structures from an NMR Perspective
- Tertiary Structure
- Classifying protein structures is not straightforward or definitive
- Multiple equally valid approaches
CATH v3.2
http://www.cathdb.info/
CATH assigns each protein domain a four-number code based on its class (C), architecture (A), topology (T), and homologous superfamily (H).
Example: chain A from PDB ID 1kbl is assigned a CATH code of 1.20.80.30
class: 1, mainly alpha
architecture: 20, up-down bundle
topology: 80, Acyl-CoA Binding Protein
homologous superfamily: 30, no description
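The four-number code in the example above can be split into its named levels with a few lines of code. This is a minimal sketch with hypothetical names (CathCode, parse_cath_code); it only handles the four-part C.A.T.H form shown on the slide and does not query the CATH database.

```python
from typing import NamedTuple


class CathCode(NamedTuple):
    """The four levels of a CATH code (hypothetical container)."""
    cath_class: int
    architecture: int
    topology: int
    homologous_superfamily: int


def parse_cath_code(code):
    """Split a code such as '1.20.80.30' into its C, A, T and H numbers."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated numbers: %r" % code)
    return CathCode(*(int(part) for part in parts))


if __name__ == "__main__":
    code = parse_cath_code("1.20.80.30")  # chain A of PDB ID 1kbl, from the slide
    print(code.cath_class, code.architecture,
          code.topology, code.homologous_superfamily)
```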
105Protein Structures from an NMR Perspective
CATH is a novel hierarchical classification of protein domain structures, which clusters proteins at four major levels:
Class (C): derived from secondary structure content; assigned automatically for more than 90% of protein structures.
Architecture (A): describes the gross orientation of secondary structures, independent of connectivities; currently assigned manually.
Topology (T): clusters structures according to their topological connections and numbers of secondary structures; assignments are made by sequence and structure comparisons.
Homologous superfamily (H): clusters proteins with highly similar structures and functions; assignments are made by sequence and structure comparisons.
Other Levels:
Sequence Family: clusters proteins based on 35% sequence identity; members nearly always have identical structure.
Non-Identical: clusters proteins based on 95% sequence identity.
Identical: numerous cases where the protein structure based on the identical sequence has been deposited into the PDB.
Domain: semi-independent folding unit.
106Protein Structures from an NMR Perspective
Tertiary Structure: Some Common Examples
Mainly α (4-helix bundle)
Mainly β (β-sandwich)
Mixed α/β (α/β-barrel)
Minimal Secondary Structure (Kringle Domain)
107Protein Structures from an NMR Perspective
Tertiary Structure Continuit