Title: Outline
1Outline
- Introduction
- What's it all about
- Course Mechanics
- From Genomics to Proteomics
- Protein Separation
- Protein Identification
- Mass Spectrometry
- Fundamentals
- Protein Chemistry
- Ionization Techniques
- Fragmentation techniques
- Mass Analyzers
- CID MS/MS Interpretation
- Peptide Fragmentation Chemistry
- Interpretation of Spectra
2Analyzing Spectra
3Identifying Proteins with MS
- There are three main approaches for identifying proteins.
- Peptide sequence tag
- Database search
- Using a de novo algorithm to build the peptide sequence.
4Identifying Proteins with MS: Using Peptide Sequence Tags
- In tandem mass spectrometry, peptides are fragmented.
- The location and identity of some amino acids in these fragments can be determined from the spacing between fragment peaks, which equals the mass of that amino acid.
- Since there are 20 kinds of amino acids
- Some combinations of amino acids are sufficient to identify a protein in a database
- Or at least help greatly to reduce the number of matched peptide sequences
- Called peptide sequence tags
- Mann and Wilm used a combination of a partial de novo algorithm and a database search to implement this method
- The limitation of this method is that if the peptide sequence tag is not suitably selected
- It will produce both false positives
- And false negatives.
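A minimal sketch of the idea, assuming a tag is given as a short residue string plus the summed residue mass that precedes it. The function names, the tolerance, and the toy peptide list are hypothetical; this is a simplification for illustration, not the Mann and Wilm implementation.

```python
# Monoisotopic residue masses in Da (standard values, not taken from the slides).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def matches_tag(peptide, prefix_mass, tag, tol=0.5):
    """True if `tag` occurs in `peptide` and the residues before that
    occurrence sum to `prefix_mass` within `tol` Da."""
    start = peptide.find(tag)
    while start != -1:
        before = sum(RESIDUE_MASS[aa] for aa in peptide[:start])
        if abs(before - prefix_mass) <= tol:
            return True
        start = peptide.find(tag, start + 1)
    return False


def filter_database(peptides, prefix_mass, tag, tol=0.5):
    """Keep only the database peptides that are consistent with the tag."""
    return [p for p in peptides if matches_tag(p, prefix_mass, tag, tol)]


# Toy database: only GASK has "AS" preceded by one glycine (57.02 Da).
print(filter_database(["GASK", "ASGK", "SGASK"], 57.02, "AS"))
```

A poorly chosen tag (too short, or read from noise peaks) lets unrelated peptides through or rejects the true one, which is the false positive and false negative risk noted above.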
5Identifying Proteins with MS: Using Database Search
- The mass spectra acquired during the experiment are the experimental spectra.
- For all sequences in a protein database or genomic database
- We construct a theoretical mass spectrum.
- An exhaustive search is then used to compare the experimental spectrum with all of these theoretical ones.
- This method has the following limitations
- Post-translational modifications will reduce the accuracy of identification
- With the accumulation of protein sequences, this method can be slow.
- This method will not find new proteins that are not in the database.
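The exhaustive comparison can be sketched as follows, assuming singly charged b and y ions only and a simple shared-peak count as the score. The scoring function, the tolerance, and the toy database are assumptions for illustration; real search engines use more elaborate scores.

```python
# Monoisotopic residue masses in Da (standard values, not taken from the slides).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def theoretical_spectrum(peptide):
    """Sorted m/z values of the singly charged b and y ions of `peptide`."""
    peaks = []
    prefix = 0.0
    for aa in peptide[:-1]:
        prefix += RESIDUE_MASS[aa]
        peaks.append(prefix + PROTON)            # b ion
    suffix = 0.0
    for aa in reversed(peptide[1:]):
        suffix += RESIDUE_MASS[aa]
        peaks.append(suffix + WATER + PROTON)    # y ion
    return sorted(peaks)


def shared_peak_count(experimental, theoretical, tol=0.5):
    """Number of theoretical peaks that have an experimental peak within tol."""
    return sum(
        1 for t in theoretical
        if any(abs(t - e) <= tol for e in experimental)
    )


def search(experimental, database, tol=0.5):
    """Return the database peptide whose theoretical spectrum matches best."""
    return max(
        database,
        key=lambda p: shared_peak_count(experimental, theoretical_spectrum(p), tol),
    )


# Toy run: the "experimental" spectrum is an ideal spectrum of GASK.
experimental = theoretical_spectrum("GASK")
print(search(experimental, ["SAGK", "GGGK", "GASK"]))
```

Every candidate is scored, so run time grows with database size, and a peptide that is absent from `database` can never be returned, which mirrors the limitations listed above.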
6Identifying Proteins with MS: By de novo Sequencing
- First generate a spectrum graph.
- Then attempt to find what is called a feasible path through this graph.
- The feasible path corresponds to a valid peptide sequence.
7De novo Peptide Sequencing
Sequence
8Generating a Theoretical Spectrum for a Database Search
9Generating a Theoretical Spectrum for a Database Search
10Generating a Theoretical Spectrum for a Database Search
11Building a Spectrum Graph for de novo Sequencing
- How to calculate masses of b and y ions
- How to create vertices (from masses)
- How to create edges (from mass differences)
- How to score paths
- How to find best path
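The steps listed on this slide can be sketched in one small program. It assumes singly charged peaks, a known neutral peptide mass, and a score that simply counts how many peak interpretations support each vertex; the function names, the tolerance, and the scoring rule are assumptions, not the method used in the course.

```python
# Monoisotopic residue masses in Da (standard values, not from the slides).
# Isoleucine is left out because it has the same mass as leucine.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def residue_for(diff, tol):
    """Residue whose mass equals `diff` within `tol`, or None."""
    for aa, mass in RESIDUE_MASS.items():
        if abs(diff - mass) <= tol:
            return aa
    return None


def de_novo(peaks, peptide_mass, tol=0.02):
    """Best-supported sequence through the spectrum graph, or None."""
    total = peptide_mass - WATER  # sum of all residue masses
    # Vertices are candidate prefix masses. Each peak is read twice:
    # once as a b ion and once as a y ion.
    candidates = []
    for mz in peaks:
        for prefix in (mz - PROTON, total - (mz - PROTON - WATER)):
            if tol < prefix < total - tol:
                candidates.append(prefix)
    vertices, support = [0.0], [0]
    for mass in sorted(candidates):
        if mass - vertices[-1] <= tol:
            support[-1] += 1          # same vertex, more evidence
        else:
            vertices.append(mass)
            support.append(1)
    vertices.append(total)
    support.append(0)
    # Edges join vertices that differ by one residue mass. Dynamic
    # programming finds the path from 0 to `total` with the most support.
    n = len(vertices)
    best = [float("-inf")] * n
    back = [None] * n
    best[0] = 0
    for j in range(1, n):
        for i in range(j):
            if best[i] == float("-inf"):
                continue
            aa = residue_for(vertices[j] - vertices[i], tol)
            if aa is not None and best[i] + support[j] > best[j]:
                best[j] = best[i] + support[j]
                back[j] = (i, aa)
    if back[-1] is None:
        return None                   # no feasible path
    sequence, j = [], n - 1
    while j != 0:
        j, aa = back[j]
        sequence.append(aa)
    return "".join(reversed(sequence))


# Ideal b and y ions of the toy peptide GASK (neutral mass 361.19612 Da).
peaks = [58.02874, 129.06585, 216.09788, 147.11280, 234.14483, 305.18194]
print(de_novo(peaks, 361.19612))
```

In the toy run, the path G, A, S, K beats the shorter path Q, S, K (glycine plus alanine is isobaric with glutamine) because it passes through more supported vertices.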
13Protein Backbone Plus Water at Termini
H...-HN-CH-CO-NH-CH-CO-NH-CH-CO-OH
[Figure labels: N-terminus on the left, C-terminus on the right; side chains R(i-1), R(i), R(i+1) sit on the three CH groups and belong to AA residues i-1, i, and i+1]
14y Ion Mass
- C terminal peptide has an extra OH
- N terminal peptide has an extra H
- Must add
- 18 Da for the H2O
- 1 Da for the proton
- So, add the residue masses and add 19
- Assuming singly charged
15b Ion Mass
- b ion is different
- C terminal peptide forms cyclic structure (see next slide)
- Thus, has no extra OH
- In fact, the N on the final residue (on the C terminal end) has an extra covalent bond with the backbone carbons
- Thus, it doesn't have an H
- Thus, the residue is 1 Da less
- (see next slide again)
- N terminal peptide has an extra Hydrogen
- Must add
- 1 Da. for the N terminal H
- 1 Da. for the proton
- -1 Da for the loss of the H on the final residue
- So, add the residue masses and add 1
- Assuming singly charged
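The two rules (residue masses plus 19 for y ions, plus 1 for b ions) can be written out as a short sketch. It uses monoisotopic values, so the offsets become 19.018 Da and 1.007 Da instead of the rounded 19 and 1; the function names and the residue table are assumptions that are not part of the slides.

```python
# Monoisotopic residue masses in Da (standard values, not taken from the slides).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O
PROTON = 1.00728   # charge-carrying proton


def b_ions(peptide):
    """Singly charged b1..b(n-1): N-terminal residue sums plus a proton."""
    masses, total = [], 0.0
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        masses.append(total + PROTON)
    return masses


def y_ions(peptide):
    """Singly charged y1..y(n-1): C-terminal residue sums plus water and a proton."""
    masses, total = [], 0.0
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        masses.append(total + WATER + PROTON)
    return masses


print([round(m, 3) for m in b_ions("GASK")])
print([round(m, 3) for m in y_ions("GASK")])
```

For any split of the peptide, the b ion and its complementary y ion add up to the neutral peptide mass plus two protons, which is a handy consistency check.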
16b and y Ion Chemistry
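17 [Figure: b ions of the peptide S E Q U E N C E; x-axis: Mass/Charge (m/z)]
18 [Figure: a ions of the peptide S E Q U E N C E; x-axis: Mass/Charge (m/z)]
19a is an ion type shift in b
[Figure: peptide S E Q U E N C E; x-axis: Mass/Charge (m/z)]
20 [Figure: y ions labeled E C N E U Q E S; x-axis: Mass/Charge (m/z)]
21y with corresponding intensities
[Figure: y ions labeled E C N E U Q E S; axes: Intensity vs. Mass/Charge (m/z)]
22 [Figure: axes: Intensity vs. Mass/Charge (m/z)]
23 [Figure: axes: Intensity vs. Mass/Charge (m/z)]
24 [Figure: noise; x-axis: Mass/Charge (m/z)]
25MS/MS Spectrum
[Figure: axes: Intensity vs. Mass/Charge (m/z)]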
26Mass Differences Correspond to Amino Acids
[Figure: MS/MS spectrum in which the mass differences between peaks are labeled with the residues s, e, q, u, e, n, c, e]
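The idea in this slide's title can be sketched by labeling the gap between neighboring peaks of one ion series with the residues of matching mass. The function name, the tolerance, and the toy peak list are assumptions; leucine and isoleucine share a mass, so both are reported.

```python
# Monoisotopic residue masses in Da (standard values, not taken from the slides).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def label_differences(peaks, tol=0.02):
    """For each pair of neighboring peaks, return the residues whose mass
    equals the peak spacing within `tol` ("" when nothing matches)."""
    ordered = sorted(peaks)
    labels = []
    for low, high in zip(ordered, ordered[1:]):
        diff = high - low
        labels.append("".join(
            aa for aa, mass in RESIDUE_MASS.items() if abs(diff - mass) <= tol
        ))
    return labels


# y1, y2, y3 and the singly protonated precursor of the toy peptide GASK.
print(label_differences([147.1128, 234.14483, 305.18194, 362.2034]))
```

Because this toy ladder is a y series, the labels come out from the C-terminal side inward (S, then A, then G).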
27de novo Sequencing from the C terminus (1/4)
- To begin sequencing a tryptic peptide, assume that the C-terminus of the peptide is either lysine or arginine.
- This assumption is usually true except for
- Tryptic peptides derived from non-tryptic cleavage
- Due to contaminating chymotryptic activity
- Or tryptic peptides encompassing the C-terminus of the original protein
- Where the C-terminal residue of the protein is not lysine or arginine.
- The y1 ion is calculated by adding 19.018 Da to the residue mass of lysine or arginine
- Three hydrogens and one oxygen
- H2O plus proton
- Lysine
- 128.095 + 19.018 = 147.113 Da
- Arginine
- 156.101 + 19.018 = 175.119 Da
- If either mass is present, make a note.
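As a small sketch, the y1 check amounts to looking for 147.113 or 175.119 in the peak list. The masses come from the slide; the function name, the tolerance, and the example peaks are assumptions.

```python
Y_OFFSET = 19.018                      # H2O plus proton, as on the slide
Y1_MASS = {
    "K": 128.095 + Y_OFFSET,           # lysine y1, 147.113 Da
    "R": 156.101 + Y_OFFSET,           # arginine y1, 175.119 Da
}


def find_y1(peaks, tol=0.02):
    """Return the C-terminal residues (K and/or R) whose y1 ion is present."""
    return [
        aa for aa, mass in Y1_MASS.items()
        if any(abs(peak - mass) <= tol for peak in peaks)
    ]


print(find_y1([147.11, 204.13, 305.18]))   # a lysine y1 is present
```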
28de novo Sequencing from the C terminus (2/4)
- On ion traps, the y1 ions will be below the mass cutoff
- The corresponding high m/z b-type ion is often found, though
- This b ion contains all of the residues except the arginine or lysine at the C-terminus.
- These b ions are calculated by
- Subtract 17.002 Da from the precursor mass
- 17.002 Da = one oxygen and one hydrogen
- Water loss plus proton gain
- Subtract the residue mass of arginine or lysine.
- If this b ion is found, make a note of it.
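The same calculation as a sketch. It assumes that "precursor mass" here means the neutral peptide mass, since the 17.002 Da step is described as water loss plus proton gain; the function name and the example mass are hypothetical.

```python
C_TERMINAL_RESIDUE = {"K": 128.095, "R": 156.101}   # residue masses from the slides


def high_mass_b_ion(peptide_mass, c_terminal):
    """m/z of the singly charged b ion that lacks only the C-terminal K or R.

    `peptide_mass` is assumed to be the neutral peptide mass in Da.
    """
    return peptide_mass - 17.002 - C_TERMINAL_RESIDUE[c_terminal]


# Toy peptide GASK: neutral mass 361.196 Da, expected b3 near 216.1.
print(round(high_mass_b_ion(361.196, "K"), 3))
```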
29de novo Sequencing from the C terminus (3/4)
- If a y1 ion for lysine or arginine is found
- Take the peaks at higher product ion masses
- Subtract the y1 mass from each
- Check the residue mass table to see if any mass differences correspond to an amino acid.
- If any differences equal an amino acid residue mass
- Make a note of what each putative y2 ion might be
- Also subtract each possible residue mass from the high mass b ions
- Recall these correspond to loss of arginine or lysine from the precursor
- High m/z b ions are rarely seen for ion traps or QTOFs
- Proceed to the y3 ion and higher
- For each, check to see if the corresponding high m/z b-type ion is present.
- Eventually, as amino acid residue masses are added to the y ion series, it passes the b ion series.
30de novo Sequencing from the C terminus (4/4)
- Favored partial sequences have both
- High m/z b ions
- Corresponding y ions.
- Eventually, you might get a complete sequence for the peptide
- The hypothesized sequence should have a calculated mass that equals the observed precursor mass
- Within the error tolerance of the peptide mass measurement
- Often you cannot get a complete sequence all the way to the N-terminus
- It is common for a CID spectrum to lack fragmentations between the first and second amino acids at the N-terminus.
- Therefore, no b1 ion is observed
- The N-terminus of the proposed sequence then has the combined residue mass of the first two amino acids.
- Make sure that this unsequenced mass at the N-terminus corresponds to the sum of two amino acid residue masses.
- For example, an unsequenced N-terminal mass of 150 Da is not possible in the absence of the additional mass of a post-translational modification.
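The final check can be sketched by enumerating every pair of unmodified residues and comparing the pair sums with the unsequenced mass. The function name and tolerances are assumptions; the residue table holds standard monoisotopic values.

```python
from itertools import combinations_with_replacement

# Monoisotopic residue masses in Da (standard values, not taken from the slides).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residue_pairs(mass, tol=0.02):
    """All unordered residue pairs whose summed mass is within `tol` of `mass`."""
    return [
        a + b
        for a, b in combinations_with_replacement(sorted(RESIDUE_MASS), 2)
        if abs(RESIDUE_MASS[a] + RESIDUE_MASS[b] - mass) <= tol
    ]


print(residue_pairs(128.05857))        # glycine plus alanine
print(residue_pairs(150.0, tol=0.5))   # empty: no unmodified pair sums to 150 Da
```

The nearest pair sums to 150 Da are about 144.05 (G plus S) and 154.07 (G plus P), so 150 Da stays unmatched even with a generous tolerance.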
31de novo Sequencing from the Middle (1/4)
- Procedure
- Get some partial sequence from the middle of the peptide
- Try to connect this partial sequence to the peptide N-terminus.
- Proceed to the C terminus.
- For Qtof or triple quads there is a short stretch of fairly intense ions at an m/z greater than the precursor m/z
- The mass differences between these ions in the series correspond to amino acid residue masses
- These are the so-called sequence tags introduced by Matthias Mann
- In principle, one does not know if these are b-type or y-type ions
- And hence, whether the partial sequence goes forward or backwards
- For Qtof and triple quad tryptic peptides it is usually safe to guess that this is a partial y ion series.
- The precursor must be doubly or triply charged
32de novo Sequencing from the Middle (2/4)
- Take the peptide mass + 2.016 Da, and subtract the highest mass ion in this series
- 2.016 Da is the mass of two protons
- The peptide mass includes no protons, but the b and y ions each have one.
- This mass difference would correspond to a hypothetical lower mass b ion.
- (Precursor mass + 2) - (y ion with proton) = (b ion with proton)
- Oftentimes a y ion series will encompass all but the two N-terminal amino acids
- Then the mass difference between that y ion and the precursor mass + 2 corresponds to a b2 ion.
- If that b2 ion is present, check to see if there is another ion 27.995 Da lower
- Which would possibly be the matching a2 ion.
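The arithmetic on this slide, as a sketch. The numbers in the usage lines are the ones from the worked example slides later in the deck (precursor m/z 389.9 at charge 2, y ion at 518.3, observed b2 at 261.2); the function names are assumptions.

```python
TWO_PROTONS = 2.016   # mass of two protons, as given on the slide
CO_LOSS = 27.995      # spacing between a b ion and its matching a ion


def peptide_mass_from_mz(mz, charge):
    """Neutral peptide mass from a precursor m/z and its charge state."""
    return mz * charge - charge * (TWO_PROTONS / 2)


def complementary_b(peptide_mass, y_mz):
    """b ion m/z implied by a y ion: (peptide mass + two protons) - y."""
    return peptide_mass + TWO_PROTONS - y_mz


def matching_a(b_mz):
    """a ion m/z expected 27.995 Da below a b ion."""
    return b_mz - CO_LOSS


print(round(peptide_mass_from_mz(389.9, 2), 1))   # 777.8
print(round(complementary_b(777.8, 518.3), 1))    # 261.5
print(round(matching_a(261.2), 1))                # 233.2
```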
33de novo Sequencing from the Middle (3/4)
- If all of these ions are found
- The putative high m/z y ion
- Plus the alleged low m/z b ion
- Plus the a ion
- Then it's most likely a real peptide ion.
- If you are lucky, the high m/z y-type ion series extends all the way to the N-terminus
- In which case this mass difference corresponds to a b1 ion
- An amino acid residue mass plus a hydrogen
- Don't bother looking for a b1 ion; they don't exist.
- If a partial y ion series is found, then try to identify the low mass b ion series that corresponds to the high m/z y ion series.
34de novo Sequencing from the Middle (4/4)
- For Qtof and triple quad tryptic peptides, the b ions usually decrease in intensity to negligible values at the high end
- At the same time the corresponding y ions are going into the low m/z end of the spectrum, where it's harder to see them.
- This portion of the spectrum usually contains many more fragment ions of different types
- Immonium ions
- b ions
- a ions
- y ions
- Other charged fragments
- Keep trying to connect the high m/z y-type ion series until you reach a y1 ion for the C-terminal lysine or arginine
- 147.113 Da or 175.119 Da, respectively
35Peak Tips: What to Look for in a Spectrum
- Look for the tallest peak in the spectrum above 200 mass units.
- The low end of a spectrum can often be confounded with solvent noise.
- Then look for peaks that are roughly double or half the mass.
- This may tell you whether there are multiply charged species present, which can help with mass determination.
- Counts.
- Resist interpreting meager spectra.
- If you know that on a particular day 1.0 x 10^6 counts is a respectable, reliable signal, then you will know that a spectrum that tops out at 1.0 x 10^4 counts (or at background) may not be a reliable spectrum to interpret.
36Peak Tips: What to Look for in a Spectrum
- The quality of the spectrum is important.
- You will waste valuable time interpreting a low quality spectrum.
- If you are not happy with the quality of the spectrum, try averaging several or many low level spectra to obtain a better quality "averaged mass spectrum."
- Compare this to a background spectrum to see if the peaks really stand out.
- Reproducibility is important.
- The peak must be consistent to be considered a relevant peak.
- You should not lend credence to "one scan wonders."
37Peak Tips: What to Look for in a Spectrum
- Determining the charge state of a peak when only one peak is obvious
- A molecule will often have adduct ions associated with it other than hydrogen.
- Look for sodium or ammonium adducts.
- These adducts can often give you a hint as to the charge state of a peak.
- For example if there is only one major species in a spectrum, look for the sodium adduct following that peak.
- If it is a singly charged species the sodium adduct will be found at 22 mass units higher than the MH+ peak.
- If the peak is doubly charged the adduct will appear 11 mass units higher.
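A minimal sketch of the adduct rule: the spacing between a peak and its sodium adduct is about 22/z. It uses 21.982 Da (sodium minus hydrogen, monoisotopic) where the slide rounds to 22; the function name and tolerance are assumptions.

```python
NA_MINUS_H = 21.982   # Da, one sodium replacing one hydrogen


def charge_from_sodium_adduct(peak_mz, adduct_mz, tol=0.3):
    """Charge state implied by the m/z spacing to a sodium adduct peak,
    or None if the spacing does not fit any whole charge within `tol`."""
    spacing = adduct_mz - peak_mz
    if spacing <= 0:
        return None
    charge = round(NA_MINUS_H / spacing)
    if charge >= 1 and abs(NA_MINUS_H / charge - spacing) <= tol:
        return charge
    return None


print(charge_from_sodium_adduct(500.0, 522.0))   # spacing 22: singly charged
print(charge_from_sodium_adduct(500.0, 511.0))   # spacing 11: doubly charged
```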
38Isotopes
- If the mass spectrometer you are working with has sufficient resolution, look at the isotopes
- A singly charged ion will show isotopic peaks that differ by 1 mass unit
- A doubly charged ion will show peaks that differ by 0.5 mass units and so on.
- This is another way to deduce the charge state of a peak and thus the mass.
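The isotope rule can be sketched as charge = 1 / spacing, followed by converting m/z to a neutral mass. The function names and example peaks are assumptions, and the spacing is taken as exactly 1 mass unit per charge, as on the slide.

```python
PROTON = 1.00728   # Da


def charge_from_isotopes(isotope_mzs):
    """Charge state from the mean spacing of neighboring isotope peaks."""
    peaks = sorted(isotope_mzs)
    if len(peaks) < 2:
        raise ValueError("need at least two isotope peaks")
    spacings = [high - low for low, high in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(1.0 / mean_spacing)


def neutral_mass(mz, charge):
    """Neutral mass of a species observed at `mz` with `charge` protons."""
    return mz * charge - charge * PROTON


cluster = [501.0, 501.5, 502.0]
z = charge_from_isotopes(cluster)
print(z, round(neutral_mass(cluster[0], z), 3))
```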
39Frequently Asked Questions
- Q: How can I tell if a peak is real?
- A: Wow, this is some question.
- All peaks are real.
- In an LC/MS run, we look for peaks that recur in multiple adjacent scans (spectra) but not in every scan.
- If the peak occurs in every scan it may be a background peak.
- It is possible to get system noise or spikes that occur in only one scan or sporadically; these are most likely electronic or some other form of system noise.
40Frequently Asked Questions
- Q: How can I be sure of the identity of a peak?
- A: Well, a mass is just a mass and many compounds have isobaric masses, so you can't be sure from just a mass.
- In the old days we would
- Perform an enzymatic digest on a protein
- Run an LC/MS peptide map and
- Match up the mass with the theoretical fragments.
- Today the bar is rightly higher and we go one step further in the identification
- We take the peak through a fragmentation
- And match up the fragment masses with the theoretical CID fragment masses for that peptide.
- This gives us a positive ID.
- Another overlooked component in LC/MS
- The correlation of mass and LC retention time.
- This is part of what makes LC/MS so powerful
- If the retention time of a molecule has been previously characterized, this information can be linked with the mass information for a positive ID.
- If you are characterizing a new molecule
- Try modifying the molecule to see if you can modify the mass.
- Try an enzyme digest if the unknown is a protein
41Frequently Asked Questions
- Q: How can I differentiate a compound at one mass from another at twice the mass?
- For example a compound with mass 1000 will display peaks at m/z 1001 and 501,
- (1000 + 1) / 1 = 1001
- (1000 + 2) / 2 = 501
- A compound with mass 2000 may display peaks at m/z 2001, 1001, 667.7 and 501.
- (2000 + 1) / 1 = 2001
- (2000 + 2) / 2 = 1001
- (2000 + 3) / 3 = 667.7
- (2000 + 4) / 4 = 501
- The mass determination can further be confounded if the peptide at 1000 forms dimers during the electrospray process.
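The charge series in this question can be reproduced with a short sketch. Like the slide, it takes the proton mass as 1 for simplicity (1.00728 Da would be more exact); the function name is an assumption.

```python
def charge_series(mass, max_charge):
    """m/z values (M + z) / z for charge states 1..max_charge, rounded to 0.1."""
    return [round((mass + z) / z, 1) for z in range(1, max_charge + 1)]


light = charge_series(1000, 2)
heavy = charge_series(2000, 4)
print(light)                               # [1001.0, 501.0]
print(heavy)                               # [2001.0, 1001.0, 667.7, 501.0]
print(sorted(set(heavy) - set(light)))     # peaks only the 2000 Da compound shows
```

The peaks at 2001 and 667.7 appear only for the heavier compound, so finding either one separates the two cases.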
42Frequently Asked Questions
- A:
- The peak envelope does not skip peaks
- For example, even if the 2000 mass does not have an obvious peak at 2001, it should have the 667.7 peak between the 1001 and 501 peaks.
- Also try to determine the charge state of the ions from the adducts or from the isotopes.
- This will tell you what the mass of the compound is.
- Dimer formation can be a major problem in some analyses.
- Try to reduce the concentration of the analyte.
- Often if the concentration is too high, dimers will be observed in the spectrum.
- Also, dimers can be reduced by changing some of the settings on the mass spectrometer.
43Frequently Asked Questions
- With the peak envelope of larger molecules (10kDa) look for smooth peak distributions.
- The peak distribution should have a smooth bell shaped curve appearance, sometimes trailing off to the right.
- The peak to peak relationship should be predictable
- If one observes an alternating pattern of peak intensities this may be a clue to a co-eluting dimer.
44Example
54Example
55Example
389.9 (doubly charged) x 2 = 779.8; 779.8 - 2 = 777.8
(777.8 + 2) - 518.3 = 261.5, close to 261.2
261.2 - 27.995 = 233.2
59Example
60Example
389.9 (doubly charged) x 2 = 779.8; 779.8 - 2 = 777.8
(777.8 + 2) - 431.2 = 348.6
61Example
348.0 - 27.995 = 320.0
66Example
67Example
389.9 (doubly charged) x 2 = 779.8; 779.8 - 2 = 777.8
(777.8 + 2) - 303.1 = 476.7
71Example
72Example
389.9 (doubly charged) x 2 = 779.8; 779.8 - 2 = 777.8
(777.8 + 2) - 204.1 = 575.7
76Example
77Example
389.9 (doubly charged) x 2 = 779.8; 779.8 - 2 = 777.8
(777.8 + 2) - 147.1 = 632.7