Protein Identification Using Tandem Mass Spectrometry

1
Protein Identification Using Tandem Mass Spectrometry

Nathan Edwards
Center for Bioinformatics and Computational Biology
University of Maryland, College Park

2
Outline

Proteomics context
Tandem mass spectrometry
Peptide fragmentation
Peptide identification
De novo
Sequence database search
Mascot screen shots
Traps and pitfalls
Summary

3
Proteomics Context

High-throughput proteomics focus
(Differential) Quantitation
How much of each protein is there?
Identification
What proteins are present?
Two established workflows
2-D Gels
LC-MS, LC-MALDI

4
Sample Preparation for Tandem Mass Spectrometry
5
Single Stage MS
6
Tandem Mass Spectrometry (MS/MS)
7
Peptide Fragmentation
Peptides consist of amino acids arranged in a linear backbone.
N-terminus: H-HN-CH(R_i-1)-CO-NH-CH(R_i)-CO-NH-CH(R_i+1)-CO-OH :C-terminus
The three side chains R_i-1, R_i, R_i+1 belong to AA residue i-1, AA residue i, and AA residue i+1.
8
Peptide Fragmentation
9
Peptide Fragmentation
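[Figure: backbone segment -HN-CH(R_i)-CO-NH-CH(R_i+1)-CO-NH-. The N-terminal fragments are labeled b_i and b_i+1; the complementary C-terminal fragments are labeled y_n-i and y_n-i-1.]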
10
Peptide Fragmentation
[Figure: backbone segment -HN-CH(R_i)-CO-NH-CH(R_i+1)-CO-NH-, with fragment labels a_i, b_i+1, x_n-i, and y_n-i-1.]
11
Peptide Fragmentation
Peptide: S-G-F-L-E-E-D-E-L-K
12
Peptide Fragmentation
Peptide S-G-F-L-E-E-D-E-L-K, fragment ion m/z ladder:

Residue   b ion m/z   y ion m/z
S            88         1166
G           145         1080
F           292         1022
L           405          875
E           534          762
E           663          633
D           778          504
E           907          389
L          1020          260
K          1166          147

[Figure: MS/MS spectrum, intensity (0-100) vs. m/z (250-1000)]
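The ladder on this slide can be reproduced with a few lines of Python. This is a minimal sketch, assuming monoisotopic residue masses and singly charged fragments; the names MONO, b_ions, and y_ions are illustrative and do not come from the presentation. The computed values agree with the slide's integer labels to within 1 Da.

```python
# Monoisotopic residue masses (Da) for the residues in the example peptide.
MONO = {"S": 87.03203, "G": 57.02146, "F": 147.06841, "L": 113.08406,
        "E": 129.04259, "D": 115.02694, "K": 128.09496}
WATER = 18.01056   # H2O, carried by every y ion
PROTON = 1.00728   # charge carrier for a 1+ ion


def b_ions(peptide):
    """m/z of singly charged b ions b1..b(n-1) (N-terminal prefixes)."""
    total, ions = 0.0, []
    for aa in peptide[:-1]:
        total += MONO[aa]
        ions.append(total + PROTON)
    return ions


def y_ions(peptide):
    """m/z of singly charged y ions y1..y(n-1) (C-terminal suffixes)."""
    total, ions = 0.0, []
    for aa in reversed(peptide[1:]):
        total += MONO[aa]
        ions.append(total + WATER + PROTON)
    return ions


if __name__ == "__main__":
    pep = "SGFLEEDELK"
    for i, (b, y) in enumerate(zip(b_ions(pep), y_ions(pep)), start=1):
        print(f"b{i} = {b:8.2f}    y{i} = {y:8.2f}")
```

The last entry in each column of the slide (1166) is the intact singly protonated peptide rather than a true fragment, so the sketch stops at b9 and y9.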
13
Peptide Fragmentation
[Figure: the same b/y ion ladder for S-G-F-L-E-E-D-E-L-K, with peaks y2 through y9 labeled in the spectrum; intensity (0-100) vs. m/z (250-1000)]
14
Peptide Fragmentation
[Figure: the same b/y ion ladder for S-G-F-L-E-E-D-E-L-K, with peaks y2 through y9 and b3 through b9 labeled in the spectrum; intensity (0-100) vs. m/z (250-1000)]
15
Peptide Identification

Given
The mass of the parent ion, and
The MS/MS spectrum
Output
The amino-acid sequence of the peptide

16
Peptide Identification

Two paradigms
De novo interpretation
Sequence database search

17
De Novo Interpretation
[Figure: unannotated MS/MS spectrum, intensity (0-100) vs. m/z (250-1000)]
18
De Novo Interpretation
[Figure: the same spectrum with one peak-to-peak mass difference labeled E; intensity (0-100) vs. m/z (250-1000)]
19
De Novo Interpretation
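[Figure: the same spectrum with peak-to-peak mass differences labeled by residue; labels as extracted: G, E, E, E, D, KL, E, E, E, D; intensity (0-100) vs. m/z (250-1000)]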
20
De Novo Interpretation
21
De Novo Interpretation
from Lu and Chen (2003), JCB 101
22
De Novo Interpretation
23
De Novo Interpretation
from Lu and Chen (2003), JCB 101
24
De Novo Interpretation

Find good paths in spectrum graph
Can't use same peak twice
Forbidden pairs: NP-hard
Nested forbidden pairs: dynamic programming
Simple peptide fragmentation model
Usually many apparently good solutions
Needs better fragmentation model
Needs better path scoring
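A spectrum graph can be sketched as follows: peaks are nodes, and two peaks are joined by an edge when their m/z difference matches a residue mass. This is a deliberately simplified illustration, not the algorithm of any published tool. It assumes a single, clean y-ion series (so the forbidden-pairs constraint never arises), monoisotopic masses, and a hypothetical tolerance of 0.02 Da; the names spectrum_graph and longest_path are made up for this example.

```python
# Monoisotopic residue masses (Da). I and L are isobaric, so they share a label.
RESIDUES = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
            "V": 99.06841, "T": 101.04768, "C": 103.00919, "I/L": 113.08406,
            "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
            "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
            "R": 156.10111, "Y": 163.06333, "W": 186.07931}


def spectrum_graph(peaks, tol=0.02):
    """Return sorted peaks and edges[i] = [(j, residue), ...] with peaks[j] > peaks[i]."""
    peaks = sorted(peaks)
    edges = {i: [] for i in range(len(peaks))}
    for i, low in enumerate(peaks):
        for j in range(i + 1, len(peaks)):
            diff = peaks[j] - low
            for residue, mass in RESIDUES.items():
                if abs(diff - mass) <= tol:
                    edges[i].append((j, residue))
    return peaks, edges


def longest_path(peaks, edges):
    """Longest chain of residue-labeled edges (the graph is a DAG ordered by m/z)."""
    n = len(peaks)
    best = [[] for _ in range(n)]          # best[i]: labels of longest path from i
    for i in range(n - 1, -1, -1):
        for j, residue in edges[i]:
            if len(best[j]) + 1 > len(best[i]):
                best[i] = [residue] + best[j]
    start = max(range(n), key=lambda i: len(best[i]))
    return peaks[start], best[start]


# y1..y9 of S-G-F-L-E-E-D-E-L-K plus the intact peptide and one noise peak (300.00).
PEAKS = [147.11, 260.20, 300.00, 389.24, 504.27, 633.31,
         762.35, 875.44, 1022.50, 1079.53, 1166.56]

if __name__ == "__main__":
    start, labels = longest_path(*spectrum_graph(PEAKS))
    print(start, labels)
```

Because the path climbs a y-ion ladder, the residues come out from the C-terminal side toward the N-terminus; the starting peak at 147.11 is itself consistent with a C-terminal K. A real spectrum mixes b and y ions, noise, and gaps, which is where the difficulties listed on this slide come from.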

25
De Novo Interpretation

Amino acids have duplicate masses!
Incomplete ladders create ambiguity.
Noise peaks and unmodeled fragments create ambiguity
Best de novo interpretation may have no biological relevance
Current algorithms cannot model many aspects of peptide fragmentation
Identifies relatively few peptides in high-throughput workflows
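The duplicate-mass problem is easy to demonstrate. The sketch below, which assumes monoisotopic residue masses and an arbitrary 0.01 Da tolerance, lists single residues that share a mass and residue pairs whose combined mass equals that of a single residue. The function name mass_collisions is hypothetical.

```python
from itertools import combinations, combinations_with_replacement

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
        "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
        "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
        "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
        "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931}


def mass_collisions(tol=0.01):
    """Return (x, y) pairs where x and y are indistinguishable within tol Da.

    x is a single residue or a two-residue combination (letters sorted),
    y is a single residue.
    """
    names = sorted(MONO)
    hits = []
    # Single residue vs. single residue.
    for a, b in combinations(names, 2):
        if abs(MONO[a] - MONO[b]) <= tol:
            hits.append((a, b))
    # Two residues (in either order) vs. one residue.
    for a, b in combinations_with_replacement(names, 2):
        pair_mass = MONO[a] + MONO[b]
        for c in names:
            if abs(pair_mass - MONO[c]) <= tol:
                hits.append((a + b, c))
    return hits


if __name__ == "__main__":
    for x, y in mass_collisions():
        print(f"{x} ~ {y}")
```

At this tolerance the collisions include I and L (identical mass), G+G versus N, and A+G versus Q. Widening the tolerance to what a low-resolution instrument can resolve adds more, for example K versus Q, which differ by about 0.036 Da.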

26
Sequence Database Search

Compares peptides from a protein sequence database with spectra
Filter peptide candidates by
Parent mass
Digest motif
Score each peptide against spectrum
Generate all possible peptide fragments
Match putative fragments with peaks
Score and rank
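The filter-then-score loop on this slide can be sketched in Python. This is a toy illustration under stated assumptions: the score is a plain count of theoretical b and y ions that have a matching peak, which is far cruder than what Mascot or any other production search engine computes. The function names, the tolerances, and the three-peptide "database" are all made up for the example.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
        "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
        "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
        "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
        "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931}
WATER, PROTON = 18.01056, 1.00728


def peptide_mass(pep):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[a] for a in pep) + WATER


def fragments(pep):
    """Singly charged b and y ion m/z values for every backbone cleavage."""
    frags = []
    for i in range(1, len(pep)):
        frags.append(sum(MONO[a] for a in pep[:i]) + PROTON)           # b ion
        frags.append(sum(MONO[a] for a in pep[i:]) + WATER + PROTON)   # y ion
    return frags


def search(peaks, precursor_mz, charge, candidates, parent_tol=1.0, frag_tol=0.5):
    """Filter candidates by parent mass, then rank by number of matched fragments."""
    parent = charge * (precursor_mz - PROTON)
    ranked = []
    for pep in candidates:
        if abs(peptide_mass(pep) - parent) > parent_tol:
            continue  # fails the parent mass filter
        score = sum(1 for f in fragments(pep)
                    if any(abs(f - p) <= frag_tol for p in peaks))
        ranked.append((score, pep))
    return sorted(ranked, reverse=True)


# Peaks y2..y9 and b3..b9 of S-G-F-L-E-E-D-E-L-K, as labeled in the slides.
PEAKS = [260.20, 292.13, 389.24, 405.21, 504.27, 534.26, 633.31, 663.30,
         762.35, 778.33, 875.44, 907.37, 1020.45, 1022.50, 1079.53]
CANDIDATES = ["SGFLEEDELK", "SGFLEDEELK", "LVNEVTEFAK"]

if __name__ == "__main__":
    for score, pep in search(PEAKS, 583.78, 2, CANDIDATES):
        print(score, pep)
```

In this example the third candidate is removed by the parent mass filter, and the permuted sequence SGFLEDEELK passes the filter but matches two fewer fragments than the correct peptide. The small margin hints at why score statistics matter.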

27
Sequence Database Search
[Figure: candidate peptide S-G-F-L-E-E-D-E-L-K aligned against the MS/MS spectrum; intensity (0-100) vs. m/z (250-1000)]
28
Sequence Database Search
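[Figure: theoretical b ions (88, 145, 292, 405, 534, 663, 778, 907, 1020, 1166, reading S to K) and y ions (147, 260, 389, 504, 633, 762, 875, 1022, 1080, 1166, reading K to S) for S-G-F-L-E-E-D-E-L-K, drawn over the MS/MS spectrum; intensity (0-100) vs. m/z (250-1000)]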
29
Sequence Database Search
[Figure: the same theoretical b/y ion ladder for S-G-F-L-E-E-D-E-L-K, with matched peaks y2 through y9 and b3 through b9 labeled in the spectrum; intensity (0-100) vs. m/z (250-1000)]
30
Sequence Database Search

No need for complete ladders
Possible to model all known peptide fragments
Sequence permutations eliminated
All candidates have some biological relevance
Practical for high-throughput peptide identification
Correct peptide might be missing from database!

31
Peptide Candidate Filtering

Digestion enzyme: trypsin
Cuts just after K or R unless followed by a P.
Basic residues (K, R) at the C-terminus attract the ionizing charge, leading to strong y-ions
Average peptide length about 10-15 amino acids
Must allow for missed cleavage sites

32
Peptide Candidate Filtering

>ALBU_HUMAN
MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFEDHVKLVNEVTEFAK

No missed cleavage sites:
MK WVTFISLLFLFSSAYSR GVFR R DAHK SEVAHR FK DLGEENFK ALVLIAFAQYLQQCPFEDHVK LVNEVTEFAK
33
Peptide Candidate Filtering

>ALBU_HUMAN
MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFEDHVKLVNEVTEFAK

One missed cleavage site:
MKWVTFISLLFLFSSAYSR WVTFISLLFLFSSAYSRGVFR GVFRR RDAHK DAHKSEVAHR SEVAHRFK FKDLGEENFK DLGEENFKALVLIAFAQYLQQCPFEDHVK ALVLIAFAQYLQQCPFEDHVKLVNEVTEFAK
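The two peptide lists above follow mechanically from the trypsin rule (cut after K or R unless the next residue is P). A minimal in-silico digest, assuming exactly that rule and nothing else, might look like this; the function names cleavage_sites and digest are illustrative.

```python
ALBU_FRAGMENT = ("MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDL"
                 "GEENFKALVLIAFAQYLQQCPFEDHVKLVNEVTEFAK")


def cleavage_sites(seq):
    """Positions i where trypsin cuts between seq[i-1] and seq[i]."""
    return [i for i in range(1, len(seq))
            if seq[i - 1] in "KR" and seq[i] != "P"]


def digest(seq, missed=0):
    """Peptides spanning exactly `missed` uncut cleavage sites."""
    bounds = [0] + cleavage_sites(seq) + [len(seq)]
    peptides = []
    for start in range(len(bounds) - 1):
        end = start + 1 + missed
        if end < len(bounds):
            peptides.append(seq[bounds[start]:bounds[end]])
    return peptides


if __name__ == "__main__":
    print("0 missed:", " ".join(digest(ALBU_FRAGMENT, missed=0)))
    print("1 missed:", " ".join(digest(ALBU_FRAGMENT, missed=1)))
```

A search that allows up to one missed cleavage would consider the union of both lists, which roughly doubles the number of candidates for this sequence.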
34
Peptide Candidate Filtering

Peptide molecular weight
Only have m/z value
Need to determine charge state
Ion selection tolerance
Mass for each amino-acid symbol?
Monoisotopic vs. Average
Default residue mass
Depends on sample preparation protocol
Cysteine almost always modified
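Two of the points above lend themselves to a short calculation: the same m/z value implies a different peptide mass for each possible charge state, and monoisotopic and average masses differ noticeably even for a ten-residue peptide. The sketch below assumes an unmodified peptide; the mass tables cover only the residues of the example peptide S-G-F-L-E-E-D-E-L-K, and the function names are hypothetical.

```python
# Residue masses (Da), restricted to the residues of the example peptide.
MONO = {"S": 87.03203, "G": 57.02146, "F": 147.06841, "L": 113.08406,
        "E": 129.04259, "D": 115.02694, "K": 128.09496}
AVERAGE = {"S": 87.0782, "G": 57.0519, "F": 147.1766, "L": 113.1594,
           "E": 129.1155, "D": 115.0886, "K": 128.1741}
WATER_MONO, WATER_AVG = 18.01056, 18.01528
PROTON = 1.00728


def peptide_mass(pep, table, water):
    """Neutral peptide mass: sum of residue masses plus one water."""
    return sum(table[a] for a in pep) + water


def neutral_mass(mz, charge):
    """Neutral mass implied by an observed m/z at an assumed charge state."""
    return charge * (mz - PROTON)


def candidate_masses(mz, max_charge=3):
    """One mass hypothesis per charge state when the charge is unknown."""
    return {z: neutral_mass(mz, z) for z in range(1, max_charge + 1)}


if __name__ == "__main__":
    pep = "SGFLEEDELK"
    print("monoisotopic:", round(peptide_mass(pep, MONO, WATER_MONO), 4))
    print("average:     ", round(peptide_mass(pep, AVERAGE, WATER_AVG), 4))
    for z, mass in candidate_masses(583.7825).items():
        print(f"z = {z}: {mass:.4f} Da")
```

Only the z = 2 hypothesis reproduces the monoisotopic mass of the example peptide, and the average mass is about 0.7 Da heavier than the monoisotopic one, which is larger than many parent mass tolerances.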

35
Peptide Molecular Weight
[Figure: isotope cluster of a single peptide; peaks labeled i = 0, 1, 2, 3, 4, where i is the number of C13 isotopes in the peptide]
36
Peptide Molecular Weight
[Figure: the same isotope cluster; peaks labeled i = 0, 1, 2, 3, 4, where i is the number of C13 isotopes in the peptide]
37
Peptide Molecular Weight
from "Isotopes", an IonSource.com tutorial
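The isotope cluster also reveals the charge state: neighboring isotope peaks differ by one C13-for-C12 substitution (about 1.00335 Da), so on the m/z axis they sit roughly 1.00335/z apart. A minimal sketch of that inference, assuming a clean, already-isolated isotope cluster; infer_charge is a hypothetical helper, not a function from any particular package.

```python
C13_SPACING = 1.00335  # mass difference between 13C and 12C, in Da


def infer_charge(isotope_mzs, max_charge=6):
    """Estimate charge from the mean m/z gap between adjacent isotope peaks."""
    peaks = sorted(isotope_mzs)
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    if not gaps:
        raise ValueError("need at least two isotope peaks")
    spacing = sum(gaps) / len(gaps)
    charge = round(C13_SPACING / spacing)
    if not 1 <= charge <= max_charge:
        raise ValueError(f"implausible charge {charge} for spacing {spacing:.3f}")
    return charge


if __name__ == "__main__":
    print(infer_charge([1166.56, 1167.56, 1168.57]))  # gaps near 1.0
    print(infer_charge([583.78, 584.28, 584.79]))     # gaps near 0.5
    print(infer_charge([389.52, 389.86, 390.19]))     # gaps near 0.33
```

Real spectra need peak picking and de-isotoping first; overlapping clusters or poor resolution can make the spacing ambiguous.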
38
Peptide Molecular Weight

Peptide sequence: WVTFISLLFLFSSAYSR
Potential phosphorylation?
S, T, Y: +80 Da

7 molecular weights
64 peptides
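The counts on this slide can be checked by enumeration: the peptide has six S, T, or Y residues, each either phosphorylated or not, giving 2^6 = 64 modified forms but only seven distinct masses (zero to six phosphate groups). The sketch below assumes the monoisotopic phosphorylation shift of 79.96633 Da, which the slide rounds to 80 Da; phospho_variants is an illustrative name.

```python
from itertools import product

PHOSPHO = 79.96633  # monoisotopic mass added per phosphorylation (HPO3), in Da


def phospho_variants(peptide, targets="STY"):
    """All on/off phosphorylation patterns as (modified positions, mass shift)."""
    sites = [i for i, aa in enumerate(peptide) if aa in targets]
    variants = []
    for flags in product((False, True), repeat=len(sites)):
        modified = tuple(s for s, on in zip(sites, flags) if on)
        variants.append((modified, len(modified) * PHOSPHO))
    return variants


if __name__ == "__main__":
    variants = phospho_variants("WVTFISLLFLFSSAYSR")
    shifts = sorted({round(shift, 4) for _, shift in variants})
    print(len(variants), "peptides,", len(shifts), "molecular weights")
    print(shifts)
```

Each additional variable modification site doubles the number of candidate peptides, which is the exponential growth in search time noted later under traps and pitfalls.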

39
Peptide Scoring

Peptide fragments vary based on
The instrument
The peptide's amino-acid sequence
The peptide's charge state
Etc.
Search engines model peptide fragmentation to various degrees.
Speed vs. sensitivity tradeoff
y-ions and b-ions occur most frequently

40
Mascot Search Engine
41
Mascot MS/MS Ions Search
42
Mascot Peptide Mass Fingerprint
43
Mascot Sequence Query
44-53
Mascot MS/MS Search Results
[Screenshots: ten slides of Mascot MS/MS search result pages]
54
Sequence Database Search: Traps and Pitfalls

Search options may eliminate the correct peptide
Parent mass tolerance too small
Fragment m/z tolerance too small
Incorrect parent ion charge state
Non-tryptic or semi-tryptic peptide
Incorrect or unexpected modification
Sequence database too conservative
Unreliable taxonomy annotation

55
Sequence Database Search: Traps and Pitfalls

Search options can cause infinite search times
Variable modifications increase search times exponentially
Non-tryptic search increases search time by two orders of magnitude
Large sequence databases contain many irrelevant peptide candidates

56
Sequence Database Search: Traps and Pitfalls

Best available peptide isn't necessarily correct!
Score statistics (e-values) are essential!
What is the chance a peptide could score this well by chance alone?
The wrong peptide can look correct if the right peptide is missing!
Need scores (or e-values) that are invariant to spectrum quality and peptide properties
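One simple way to put a number on "by chance alone" is to compare a peptide's score with the scores of peptides known to be wrong, such as shuffled or reversed sequences scored against the same spectrum. The sketch below is a generic empirical estimate under that assumption; it is not how Mascot or any specific search engine derives its statistics, and the function names and example scores are invented.

```python
def empirical_p_value(score, null_scores):
    """Estimated chance that a random peptide scores at least `score`.

    Uses the fraction of null scores >= score, with add-one smoothing so
    the estimate is never exactly zero.
    """
    hits = sum(1 for s in null_scores if s >= score)
    return (hits + 1) / (len(null_scores) + 1)


def e_value(score, null_scores, num_candidates):
    """Expected number of candidates scoring this well by chance alone."""
    return empirical_p_value(score, null_scores) * num_candidates


# Hypothetical scores of ten deliberately wrong peptides against one spectrum.
NULL_SCORES = [3, 4, 5, 5, 6, 7, 8, 9, 10, 13]

if __name__ == "__main__":
    for score in (9, 15):
        p = empirical_p_value(score, NULL_SCORES)
        e = e_value(score, NULL_SCORES, num_candidates=50)
        print(f"score {score}: p = {p:.3f}, e-value = {e:.2f}")
```

Because the null scores come from the same spectrum, the estimate adapts to spectrum quality, which is the kind of invariance the last bullet asks for. With only a handful of null scores the estimate is coarse; practical schemes use many more or fit a distribution to the tail.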

57
Sequence Database Search: Traps and Pitfalls

Search engines often make incorrect assumptions about sample prep
Proteins with lots of identified peptides are not more likely to be present
Peptide identifications do not represent independent observations
All proteins are not equally interesting to report

58
Sequence Database Search: Traps and Pitfalls

Good spectral processing can make a big difference
Poorly calibrated spectra require large m/z tolerances
Poorly baselined spectra make small peaks hard to believe
Poorly de-isotoped spectra have extra peaks and misleading charge state assignments

59
Summary