CSE182L13 - PowerPoint PPT Presentation

Slides: 50
Provided by: vineet50
Learn more at: https://cseweb.ucsd.edu

Transcript and Presenter's Notes

1
CSE182-L13
  • Mass Spectrometry
  • Quantitation and other applications

2
What happens to the spectrum upon modification?
  • Consider the peptide MSTYER.
  • Any of S, T, or Y (one or more) can be
    phosphorylated
  • Upon phosphorylation, the b- and y-ions shift in
    a characteristic fashion. Can you determine where
    the modification has occurred?

[Figure: peptide MSTYER with its b-ion and y-ion indices (1 to 6) marked along the sequence]
If T is phosphorylated, b3, b4, b5, b6, and y4,
y5, y6 will shift
3
Effect of PT modifications on identification
  • The shifts do not affect de novo interpretation
    too much. Why?
  • Database matching algorithms are affected, and
    must be changed.
  • Given a candidate peptide and a spectrum, can
    you identify the sites of modification?

4
Db matching in the presence of modifications
  • Consider MSTYER
  • The number of modifications can be obtained by
    the difference in parent mass.
  • With 1 phosphorylation event, we have 3
    possibilities (* marks the phosphorylated residue)
  • MS*TYER
  • MST*YER
  • MSTY*ER
  • Which of these is the best match to the spectrum?
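  • If 2 phosphorylations occurred, we would again have 3
    possibilities (choose 2 of the 3 sites), and the count
    grows quickly for peptides with more candidate sites.
    Can you compute more efficiently?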
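
The two steps on this slide (read the number of modifications off the parent-mass difference, then list the candidate placements) can be sketched as follows. This is a minimal illustration, not the course's code: the function names, the `*` notation for a phosphorylated residue, and the use of the monoisotopic phosphate shift of 79.96633 Da are assumptions made here.

```python
from itertools import combinations

PHOSPHO = 79.96633  # monoisotopic mass added by one phosphorylation (HPO3), in Da

def num_phospho(observed_mass, unmodified_mass):
    """Estimate the number of phosphorylation events from the parent-mass difference."""
    return round((observed_mass - unmodified_mass) / PHOSPHO)

def placements(peptide, k, sites="STY"):
    """Yield every way to place k phosphates on distinct S/T/Y residues.

    A '*' after a residue marks it as phosphorylated (notation chosen here).
    """
    positions = [i for i, aa in enumerate(peptide) if aa in sites]
    for chosen in combinations(positions, k):
        yield "".join(aa + "*" if i in chosen else aa
                      for i, aa in enumerate(peptide))

print(list(placements("MSTYER", 1)))  # ['MS*TYER', 'MST*YER', 'MSTY*ER']
```

Scoring every placement separately costs one spectrum comparison per candidate, and the number of candidates is a binomial coefficient in the number of S/T/Y residues, which is what motivates the table-based method on the next slide.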

5
Scoring spectra in the presence of modification
  • Can we predict the sites of the modification?
  • A simple trick lets us predict the
    modification sites.
  • Consider the peptide ASTYER. The peptide may have
    0,1, or 2 phosphorylation events. The difference
    of the parent mass will give us the number of
    phosphorylation events. Assume it is 1.
  • Create a table with the number of b,y ions
    matched at each breakage point assuming 0, or 1
    modifications
  • Arrows determine the possible paths. Note that
    there are only 2 downward arrows. The max scoring
    path determines the phosphorylated residue

[Table: columns are the residues A S T Y E R (breakage points); rows are 0 and 1 modifications]
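
The table-and-arrows idea can be sketched as a small dynamic program. The code below is one possible reading of the slide, under stated assumptions: singly charged b/y ions only, a fixed matching tolerance `tol`, standard monoisotopic residue masses, and hypothetical helper names (`by_ions`, `theoretical_ions`, `localize`). Cell (i, m) counts the b- and y-ions matched at breakage point i when the first i residues carry m phosphates; a "downward arrow" is allowed only at S, T, or Y.

```python
RESIDUE = {"A": 71.03711, "S": 87.03203, "T": 101.04768,
           "Y": 163.06333, "E": 129.04259, "R": 156.10111}
PHOSPHO, WATER, PROTON = 79.96633, 18.01056, 1.00728
SITES = "STY"

def by_ions(prefix_mass, total_mass):
    """Singly charged b- and y-ion m/z for one breakage point."""
    return prefix_mass + PROTON, total_mass - prefix_mass + WATER + PROTON

def theoretical_ions(peptide, modified):
    """b/y ions of a peptide whose residues at the indices in `modified` are phosphorylated."""
    masses = [RESIDUE[a] + (PHOSPHO if i in modified else 0.0)
              for i, a in enumerate(peptide)]
    total, prefix, ions = sum(masses), 0.0, []
    for m in masses[:-1]:
        prefix += m
        ions.extend(by_ions(prefix, total))
    return ions

def localize(peptide, peaks, k, tol=0.02):
    """Return (modified indices, score) for the max-scoring path with k modifications."""
    n = len(peptide)
    total = sum(RESIDUE[a] for a in peptide) + k * PHOSPHO
    NEG = float("-inf")
    best = [[NEG] * (k + 1) for _ in range(n + 1)]    # best[i][m]: best path score
    took = [[False] * (k + 1) for _ in range(n + 1)]  # True if residue i was modified
    best[0][0] = 0
    prefix = 0.0
    for i in range(1, n + 1):
        prefix += RESIDUE[peptide[i - 1]]
        for m in range(k + 1):
            stay = best[i - 1][m]                     # horizontal arrow
            down = (best[i - 1][m - 1]                # downward arrow
                    if m > 0 and peptide[i - 1] in SITES else NEG)
            if max(stay, down) == NEG:
                continue                              # cell not reachable
            cell = 0
            if i < n:                                 # no fragment after the last residue
                cell = sum(any(abs(ion - p) <= tol for p in peaks)
                           for ion in by_ions(prefix + m * PHOSPHO, total))
            best[i][m] = max(stay, down) + cell
            took[i][m] = down > stay
    sites, m = [], k
    for i in range(n, 0, -1):                         # trace the best path back
        if took[i][m]:
            sites.append(i - 1)
            m -= 1
    return sorted(sites), best[n][k]

# Simulated spectrum of ASTYER phosphorylated on T (index 2).
peaks = theoretical_ions("ASTYER", {2})
print(localize("ASTYER", peaks, 1))  # ([2], 10)
```

The table has (peptide length) x (k + 1) cells, so the cost is linear in the peptide length for a fixed number of modifications, instead of one full comparison per placement.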
6
Modifications
  • Modifications significantly increase the time of
    search.
  • The algorithm speeds it up somewhat, but is still
    expensive

7
Fast identification of modified peptides
8
Filtering Peptides to speed up search
[Figure: filtering pipeline; labels: De novo, Filter, Db (55M peptides), extension, Candidate Peptides, Score, Significance]
As with genomic sequence, we build computational
filters that eliminate much of the database,
leaving only a few candidates for the more
expensive scoring.
9
Basic Filtering
  • Typical tools score all peptides with close
    enough parent mass and tryptic termini
  • Filtering by parent mass is problematic when PTMs
    are allowed, as one must consider multiple parent
    masses
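
A small sketch of why parent-mass filtering gets harder with PTMs: each database peptide must be tested against several shifted masses rather than one. The function names, the tolerance, and the restriction to phosphorylation are assumptions made for illustration; the residue masses are standard monoisotopic values.

```python
RESIDUE = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "T": 101.04768,
           "M": 131.04049, "E": 129.04259, "K": 128.09496,
           "R": 156.10111, "Y": 163.06333}
WATER = 18.01056
PHOSPHO = 79.96633  # Da added per phosphorylation

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[a] for a in peptide) + WATER

def parent_mass_candidates(db, observed_mass, max_mods=2, tol=0.5):
    """Return (peptide, n_mods) pairs whose shifted mass matches the observed parent mass."""
    hits = []
    for peptide in db:
        n_sites = sum(peptide.count(a) for a in "STY")
        # One comparison per allowed number of modifications.
        for n in range(min(max_mods, n_sites) + 1):
            if abs(peptide_mass(peptide) + n * PHOSPHO - observed_mass) <= tol:
                hits.append((peptide, n))
    return hits

db = ["MSTYER", "ASTYER", "GGGGGR"]
observed = peptide_mass("MSTYER") + PHOSPHO   # singly phosphorylated MSTYER
print(parent_mass_candidates(db, observed, max_mods=0))  # []
print(parent_mass_candidates(db, observed, max_mods=2))  # [('MSTYER', 1)]
```

With `max_mods=0` the true peptide is filtered out, and every extra allowed modification type multiplies the number of mass windows that must be checked.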

10
Tag-based filtering
  • A tag is a short peptide with a prefix and suffix
    mass
  • Efficient: an average tripeptide tag matches
    Swiss-Prot 700 times
  • Analogy: using tags to search the proteome is
    similar to moving from full Smith-Waterman
    alignment to BLAST

11
Tag generation
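[Figure: spectrum graph whose edges are labeled with residues (W, R, V, A, L, T, G, E, P, L, K, C, W, D, T), and the tags read off its local paths]
TAG   Prefix Mass
AVG   0.0
WTD   120.2
PET   211.4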
  • Using local paths in the spectrum graph,
    construct peptide tags.
  • Use the top ten tags to filter the database
  • Tagging is related to de novo sequencing yet
    different.
  • Objective: compute a set of short strings, at
    least one of which must be correct. Longer tags
    give a better filter.

12
Tag based search using tries
[Figure: the de novo tags YFD, DST, STD, TDY, YNM are stored in a trie, which is used to scan the database sequence]
..YFDSTGSGIFDESTMTKTYFDSTDYNMAK.
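
The trie scan in the figure can be sketched in a few lines. This is a simplified illustration using nested dictionaries; the function names are chosen here, and a real tag search would additionally check each hit against the tag's prefix and suffix masses, which is omitted.

```python
def build_trie(tags):
    """Store the tags in a nested-dict trie; the key '$' marks the end of a tag."""
    root = {}
    for tag in tags:
        node = root
        for ch in tag:
            node = node.setdefault(ch, {})
        node["$"] = tag
    return root

def scan(database, trie):
    """Return (start position, tag) for every tag occurrence in the database string."""
    hits = []
    for start in range(len(database)):
        node = trie
        for pos in range(start, len(database)):
            node = node.get(database[pos])
            if node is None:
                break                      # no tag continues with this residue
            if "$" in node:
                hits.append((start, node["$"]))
    return hits

trie = build_trie(["YFD", "DST", "STD", "TDY", "YNM"])
print(scan("YFDSTGSGIFDESTMTKTYFDSTDYNMAK", trie))
# [(0, 'YFD'), (2, 'DST'), (18, 'YFD'), (20, 'DST'), (21, 'STD'), (22, 'TDY'), (24, 'YNM')]
```

All tags are matched in a single pass over the database, so the cost does not grow with the number of tags the way repeated single-tag searches would.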
13
Modification Summary
  • Modifications shift spectra in characteristic
    ways.
  • A modification-sensitive database search can
    identify modifications, but is computationally
    expensive.
  • Filtering using de novo tag generation can speed
    up the process making identification of modified
    peptides tractable.

14
MS based quantitation
15
The consequence of signal transduction
  • The signal from extracellular stimuli is
    transduced via phosphorylation.
  • At some point, a transcription factor might be
    activated.
  • The TF goes into the nucleus and binds to DNA
    upstream of a gene.
  • Subsequently, it switches the downstream gene
    on or off

16
Transcription
  • Transcription is the process of transcribing or
    copying a gene from DNA to RNA

17
Translation
  • The transcript goes outside the nucleus and is
    translated into a protein.
  • Therefore, the consequence of a change in the
    environment of a cell is a change in
    transcription, or a change in translation

18
Counting transcripts
  • cDNA from the cell hybridizes to complementary
    DNA fixed on a chip.
  • The intensity of the signal is a count of the
    number of copies of the transcript

19
Quantitation: transcript versus protein expression
[Figure: two expression matrices with columns Sample 1 and Sample 2; one has rows Protein 1 to Protein 3, the other has one row per mRNA. Example entries shown: 35, 4, 100, 20.]
Our goal is to construct a matrix as shown for
proteins and for RNA, and use it to identify
differentially expressed transcripts/proteins
20
Gene Expression
  • Measuring expression at transcript level is done
    by micro-arrays and other tools
  • Expression at the protein level is being done
    using mass spectrometry.
  • Two problems arise:
  • Data: how to populate the matrices on the
    previous slide? (Easy for mRNA, difficult for
    proteins.)
  • Analysis: is a change in expression significant?
    (Identical for both mRNA and proteins.)
  • We will consider the data problem here. The
    analysis problem will be considered when we
    discuss micro-arrays.

21
MS based Quantitation
  • The intensity of the peak depends upon
    abundance, ionization potential, substrate, etc.
  • We are interested in abundance.
  • Two peptides with the same abundance can have
    very different intensities.
  • Assumption: relative abundance can be measured by
    comparing the ratio of a peptide in 2 samples.

22
Quantitation issues
  • The two samples might be from a complex mixture.
    How do we identify identical peptides in two
    samples?
  • In micro-arrays this is possible because the cDNA
    is spotted in a precise location. Can we have a
    "location" for proteins/peptides?

23
LC-MS based separation
[Figure: peptides p1, p2, p3, p4, ..., pn pass through HPLC, then ESI and TOF; each scan yields one spectrum]
  • As the peptides elute (separated by
    physicochemical properties), spectra are acquired.

24
LC-MS Maps
[Figure: LC-MS map with axes m/z, time, and intensity (I), showing Peptide 1 and Peptide 2]
  • A peptide/feature can be labeled with the triple
    (M,T,I)
  • monoisotopic M/Z, centroid retention time, and
    intensity
  • An LC-MS map is a collection of features

[Figure: elution of Peptide 2, drawn as rows of peaks (x) in the m/z versus time plane]
25
Peptide Features
Capture ALL peaks belonging to a peptide for
quantification!
26
Data reduction (feature detection)
  • First step in LC-MS data analysis
  • Identify features: each feature is represented
    by its monoisotopic M/Z, centroid retention time,
    and aggregate intensity

27
Feature Identification
  • Input: a collection of peaks (Time, M/Z,
    Intensity)
  • Output: a collection of features, each with
  • Mono-isotopic m/z, mean time, sum of intensities.
  • Time range [Tbeg-Tend] for elution profile.
  • List of peaks in the feature.

[Figure: intensity (Int) versus M/Z]
28
Feature Identification
  • Approximate method:
  • Select the dominant peak.
  • Collect all peaks in the same M/Z track
  • For each peak, collect isotopic peaks.
  • Note: the dominant peak is not necessarily the
    mono-isotopic one.
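
The approximate method above can be sketched as a greedy loop. Everything beyond the three listed steps is an assumption made for illustration: the charge is taken as known, isotopic tracks are spaced by about 1.00335/charge in m/z, peaks within a fixed time window of the dominant peak are grouped, and the lowest m/z track found is reported as the mono-isotopic one.

```python
from dataclasses import dataclass

ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference (Da)

@dataclass
class Feature:
    mono_mz: float     # lowest m/z track found (taken as mono-isotopic)
    mean_time: float   # intensity-weighted centroid of retention time
    intensity: float   # sum of peak intensities
    t_beg: float       # start of elution profile
    t_end: float       # end of elution profile
    peaks: list        # the (time, mz, intensity) peaks in this feature

def detect_features(peaks, charge=1, mz_tol=0.01, time_win=30.0, max_iso=3):
    """Greedy feature detection over a list of (time, mz, intensity) peaks."""
    remaining = sorted(peaks, key=lambda p: p[2], reverse=True)
    step = ISOTOPE_SPACING / charge
    features = []
    while remaining:
        t0, mz0, _ = remaining[0]                      # dominant remaining peak
        # Its own m/z track plus isotopic tracks on both sides, since the
        # dominant peak need not be the mono-isotopic one.
        tracks = [mz0 + k * step for k in range(-max_iso, max_iso + 1)]
        members = [p for p in remaining
                   if abs(p[0] - t0) <= time_win
                   and any(abs(p[1] - m) <= mz_tol for m in tracks)]
        remaining = [p for p in remaining if p not in members]
        total = sum(p[2] for p in members)
        features.append(Feature(
            mono_mz=min(p[1] for p in members),
            mean_time=sum(p[0] * p[2] for p in members) / total,
            intensity=total,
            t_beg=min(p[0] for p in members),
            t_end=max(p[0] for p in members),
            peaks=members))
    return features
```

For example, peaks at m/z 500.0, 501.00335, and 502.0067 that elute within a few seconds of each other would be merged into one feature with `mono_mz` 500.0 even if the 501.00335 peak is the most intense. Taking the lowest track as mono-isotopic is a simplification that breaks down when two isotope envelopes overlap.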

29
Relative abundance using MS
  • Recall that our goal is to construct an
    expression data-matrix with abundance values for
    each peptide in a sample. How do we identify that
    it is the same peptide in the two samples?
  • Differential Isotope labeling (ICAT/SILAC)
  • External standards (AQUA)
  • Direct Map comparison

30
ICAT
  • The reactive group attaches to Cysteine
  • Only Cys-peptides will get tagged
  • The biotin at the other end is used to pull down
    peptides that contain this tag.
  • The X is either Hydrogen or Deuterium (heavy)
  • Difference: 8 Da
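
Given the 8 Da difference, light/heavy partners can be found by looking for features that co-elute and differ by 8/charge in m/z. The sketch below is illustrative only: the function name, tolerances, and the assumptions of a known charge and exactly one labeled Cys per peptide are not from the slides, and the nominal 8 Da spacing is used as stated above.

```python
def find_icat_pairs(features, charge=1, delta=8.0, mz_tol=0.05, time_tol=10.0):
    """Pair light and heavy ICAT features.

    features: list of (mz, time, intensity) with positive intensities.
    Returns (light, heavy, heavy/light intensity ratio) triples.
    """
    pairs = []
    for light in features:
        for heavy in features:
            co_eluting = abs(heavy[1] - light[1]) <= time_tol
            shifted = abs(heavy[0] - light[0] - delta / charge) <= mz_tol
            if co_eluting and shifted:
                pairs.append((light, heavy, heavy[2] / light[2]))
    return pairs

features = [(500.0, 100.0, 2000.0), (508.0, 101.0, 1000.0), (650.0, 300.0, 500.0)]
for light, heavy, ratio in find_icat_pairs(features):
    print(light[0], heavy[0], ratio)  # 500.0 508.0 0.5
```

A peptide with several Cys residues shifts by a multiple of 8 Da, so a practical version would test several values of `delta` and resolve features that could belong to more than one pair.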

31
ICAT
[Figure: ICAT workflow (Nat. Biotechnol. 17: 994-999, 1999). Proteins from cell state 1 and cell state 2 (normal, diseased) are fractionated (membrane, cytosolic); one prep is labeled with heavy ICAT and the other with light ICAT; the preps are combined, proteolysed, and the ICAT-labeled peptides are isolated.]
  • ICAT reagent is attached to particular
    amino-acids (Cys)
  • Affinity purification leads to simplification of
    complex mixture

32
Differential analysis using ICAT
[Figure: LC-MS map with axes Time and M/Z]
33
ICAT issues
  • The tag is heavy, and decreases the dynamic range
    of the measurements.
  • The tag might break off
  • Only Cysteine-containing peptides are retrieved
  • Non-specific binding to streptavidin

34
Serum ICAT data
[Figure: MA13_02011_02_ALL01Z3I9A overview (exhibits stack-ups)]
35
Serum ICAT data
  • Instead of pairs, we see entire clusters at 0,
    8, 16, 22
  • ICAT based strategies must clarify ambiguous
    pairing.

[Figure: peak clusters at mass offsets 0, 8, 16, 22, 24, 30, 32, 38, 40, 46]
36
ICAT problems
  • Tag is bulky, and can break off.
  • Cys is low abundance
  • MS2 analysis to identify the peptide is harder.

37
SILAC
  • A novel stable isotope labeling strategy
  • Mammalian cell-lines do not manufacture all
    amino-acids. Where do they come from?
  • Labeled amino-acids are added to amino-acid
    deficient culture, and are incorporated into all
    proteins as they are synthesized
  • No chemical labeling or affinity purification is
    performed.
  • Leucine was used (10% abundance vs 2% for Cys)

38
SILAC vs ICAT
Ong et al. MCP, 2002
  • Leucine is higher abundance than Cys
  • No affinity tagging done
  • Fragmentation patterns for the two peptides are
    identical
  • Identification is easier

39
Incorporation of Leu-d3 at various time points
  • Doubling time of the cells is 24 hrs.
  • Peptide: VAPEEHPVLLTEAPLNPK
  • What is the charge on the peptide?
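
One way to approach the charge question: each Leu-d3 replaces three hydrogens with deuterium, so the fully labeled peptide is heavier by (number of leucines) x 3 x (D minus H mass), and the spacing seen on the m/z axis is that mass divided by the charge. The sketch below assumes this reading; the observed spacing is on the slide's figure and not in the transcript, so the value 4.53 in the usage line is purely hypothetical.

```python
D_MINUS_H = 1.006277  # mass difference between deuterium and hydrogen (Da)

def labeled_shift(peptide, deuteriums_per_leu=3):
    """Mass increase (Da) when every leucine in the peptide is Leu-d3."""
    return peptide.count("L") * deuteriums_per_leu * D_MINUS_H

def infer_charge(peptide, observed_mz_shift):
    """Charge implied by the m/z spacing between unlabeled and fully labeled peaks."""
    return round(labeled_shift(peptide) / observed_mz_shift)

peptide = "VAPEEHPVLLTEAPLNPK"
print(peptide.count("L"))             # 3
print(infer_charge(peptide, 4.53))    # 2, for a hypothetical spacing of 4.53 m/z
```

The peptide has three leucines, so the full shift is about 9.06 Da; the charge follows once the actual spacing is read off the spectrum.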

40
Quantitation on controlled mixtures
41
Identification
  • MS/MS of differentially labeled peptides

42
Peptide Matching
  • SILAC/ICAT allow us to compare relative peptide
    abundances without identifying the peptides.
  • Another way to do this is computational. Under
    identical Liquid Chromatography conditions,
    peptides will elute in the same order in two
    experiments.
  • These peptides can be paired computationally

43
Map Comparison for Quantification
44
Comparison of features across maps
  • Hard to reduce features to single spots
  • Matching paired features is critical
  • M/Z is accurate, but time is not. A time scaling
    might be necessary

45
Time scaling: Approach 1 (geometric matching)
  • Match features based on M/Z, and (loose) time
    matching. Objective: $\sum_f (t_1 - t_2)^2$
  • Let $t_2' = a t_2 + b$. Select $a, b$ so as to minimize
    $\sum_f (t_1 - t_2')^2$
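
Once features have been matched, choosing a and b to minimize the squared time differences is ordinary least squares, which has a closed form. The sketch below assumes the matched pairs are already given as (t1, t2) tuples with at least two distinct t2 values; the function name is chosen here.

```python
def fit_time_scaling(pairs):
    """Return (a, b) minimizing sum over pairs of (t1 - (a * t2 + b)) ** 2.

    pairs: list of (t1, t2) retention times of matched features.
    """
    n = len(pairs)
    mean1 = sum(t1 for t1, _ in pairs) / n
    mean2 = sum(t2 for _, t2 in pairs) / n
    cov = sum((t2 - mean2) * (t1 - mean1) for t1, t2 in pairs)
    var = sum((t2 - mean2) ** 2 for _, t2 in pairs)
    a = cov / var
    b = mean1 - a * mean2
    return a, b

# Run 1 times are exactly 2 * (run 2 time) + 5 in this toy example.
print(fit_time_scaling([(25.0, 10.0), (45.0, 20.0), (65.0, 30.0)]))  # (2.0, 5.0)
```

In practice the matching and the fit depend on each other, so one would typically alternate: match loosely, fit a and b, rescale, and match again with a tighter time window.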

46
Geometric matching
  • Make a graph. Peptide a in LCMS1 is linked to all
    peptides with identical m/z.
  • Each edge has score proportional to t1/t2
  • Compute a maximum weight matching.
  • The ratio of times of the matched pairs gives a.
    Rescale and compute the scaling factor

[Figure: features plotted with axes M/Z and T]
47
Approach 2: Scan alignment
  • Each time scan is a vector of intensities.
  • Two scans in different runs can be scored for
    similarity (using a dot product)

[Figure: scans S11, S12, ... of run 1 compared against scans S21, S22, ... of run 2]
S1i = (10 5 0 0 7 0 0 2 9)
S2j = (9 4 2 3 7 0 6 8 3)
$M(S_{1i}, S_{2j}) = \sum_k S_{1i}(k)\, S_{2j}(k)$
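
As a worked instance of the dot-product score, the two example scans from this slide can be compared directly. The function name is chosen here; the vectors are the ones shown above.

```python
def scan_similarity(s1, s2):
    """Dot product of two scans given as equal-length intensity vectors."""
    return sum(x * y for x, y in zip(s1, s2))

s1i = [10, 5, 0, 0, 7, 0, 0, 2, 9]
s2j = [9, 4, 2, 3, 7, 0, 6, 8, 3]
print(scan_similarity(s1i, s2j))  # 90 + 20 + 49 + 16 + 27 = 202
```

Only m/z bins that are non-zero in both scans contribute, so peaks present in just one scan neither reward nor penalize the pairing.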
48
Scan Alignment
  • Compute an alignment of the two runs
  • Let W(i,j) be the best scoring alignment of the
    first i scans in run 1, and first j scans in run
    2
  • Advantage: does not rely on feature detection.
  • Disadvantage: might not handle affine shifts in
    time scaling, but is better for local shifts
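
The slide defines W(i,j) but does not give the recurrence, so the sketch below assumes a standard global-alignment form: either scan i is matched to scan j (adding their dot-product score) or one of the two scans is skipped at no cost. The zero gap penalty and the function names are assumptions made for illustration.

```python
def dot(s1, s2):
    """Similarity of two scans given as equal-length intensity vectors."""
    return sum(x * y for x, y in zip(s1, s2))

def align_runs(run1, run2):
    """Align two runs (lists of scans); return (score, matched scan index pairs)."""
    n, m = len(run1), len(run2)
    # W[i][j]: best score aligning the first i scans of run 1 with the first j of run 2.
    W = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            W[i][j] = max(W[i - 1][j - 1] + dot(run1[i - 1], run2[j - 1]),
                          W[i - 1][j],      # skip scan i of run 1
                          W[i][j - 1])      # skip scan j of run 2
    # Trace back to recover which scans were matched.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if W[i][j] == W[i - 1][j]:
            i -= 1
        elif W[i][j] == W[i][j - 1]:
            j -= 1
        else:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
    return W[n][m], pairs[::-1]

run1 = [[1, 0], [0, 1]]
run2 = [[1, 0], [0, 0], [0, 1]]   # same scans with an empty scan inserted
print(align_runs(run1, run2))     # (2.0, [(0, 0), (1, 2)])
```

Because raw dot products favor intense scans, a practical version would likely normalize each scan first and charge a small penalty for skipped scans.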
