NoDupe algorithm to detect and group similar mass spectra

Provided by: syst50

Learn more at: http://darwin.informatics.indiana.edu

1
NoDupe algorithm to detect and group similar
mass spectra.
2

Reducing the number of similar spectra in
proteomic experiments: why?
Identifying peptides from spectral collections is
time-consuming.
Detecting similarities reduces the number of spectra
to be processed.
Dynamic exclusion feature of the mass
spectrometer does not eliminate all duplicate
spectra.
a. Peptides may elute over a period of time.
b. The peptide mixture may have high complexity.

3
MS/MS spectra from the same peptide may look
different

4
Finding degree of similarity between two spectra

5
NoDupe Algorithm

Implemented in the Java programming language.
Spectra are grouped based on their similarity.
Preprocessing is done to reduce complexity.
Optionally removes duplicate spectra from each LC
run retaining only one representative spectrum.

6
NoDupe Preprocessing

7
Results of preprocessing
8
NoDupe Finding similarities

Scans are sorted based on the precursor m/z.
Spectral contrast angles are calculated for pairs
of spectra within 3 m/z of each other.
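
The sort-then-compare step above can be sketched as follows. This is a minimal Python illustration, not NoDupe's own Java code: the function name `candidate_pairs`, the list-of-floats input, and the choice to treat a difference of exactly 3 m/z as inside the window are all assumptions made for the example.

```python
from typing import List, Tuple


def candidate_pairs(precursor_mz: List[float], window: float = 3.0) -> List[Tuple[int, int]]:
    """Return index pairs of scans whose precursor m/z differ by at most `window`.

    Hypothetical helper: the slides only say that scans are sorted by
    precursor m/z and compared when within 3 m/z of each other.
    """
    # Sort scan indices by precursor m/z so close precursors become neighbours.
    order = sorted(range(len(precursor_mz)), key=lambda i: precursor_mz[i])
    pairs = []
    for a in range(len(order)):
        i = order[a]
        for b in range(a + 1, len(order)):
            j = order[b]
            # Because the list is sorted, the first scan outside the window
            # ends the search for partners of scan i.
            if precursor_mz[j] - precursor_mz[i] > window:
                break
            pairs.append((min(i, j), max(i, j)))
    return pairs


# Toy precursor values, invented for illustration only.
toy_mz = [500.2, 812.7, 501.9, 503.5, 810.1]
print(candidate_pairs(toy_mz))
```

Sorting first means each scan is compared only with its near neighbours rather than with every other scan, which is presumably why the slides list the sort before the angle calculation.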

9
Spectral contrast angles
10
The similarity angle cutoff is taken as 1.1.
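
A spectral contrast angle is commonly computed as the angle between two spectra treated as intensity vectors. The sketch below shows that standard definition in Python. Several things here are assumptions rather than details from the slides: the spectra are represented as dictionaries from an integer m/z bin to intensity, the cutoff of 1.1 is read as radians, and angles at or below the cutoff are taken to mean "similar". The names `contrast_angle` and `is_similar` are hypothetical.

```python
import math
from typing import Dict

# Assumed representation: integer m/z bin -> intensity.
Spectrum = Dict[int, float]


def contrast_angle(a: Spectrum, b: Spectrum) -> float:
    """Angle in radians between two binned spectra viewed as intensity vectors."""
    dot = sum(intensity * b.get(mz_bin, 0.0) for mz_bin, intensity in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        # An empty spectrum shares no peaks with anything: treat as orthogonal.
        return math.pi / 2
    # Clamp to guard against rounding pushing the cosine just outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_theta)


def is_similar(a: Spectrum, b: Spectrum, cutoff: float = 1.1) -> bool:
    """Assumption: a smaller angle means more similar, and the cutoff is in radians."""
    return contrast_angle(a, b) <= cutoff


# Toy spectra, invented for illustration only.
s1 = {100: 10.0, 200: 5.0}
s2 = {100: 10.0, 200: 5.0}
s3 = {300: 7.0}
print(is_similar(s1, s2), is_similar(s1, s3))
```

With non-negative intensities the angle ranges from 0 (identical peak patterns) to pi/2 (no shared peaks), so a cutoff of 1.1 sits inside that range under the radian reading.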
11
NoDupe Selecting representative spectra

12
Samples used

13
Experimental process

LC separations were performed for all three samples.
The 2to3 algorithm was applied to remove spectral
copies with incorrect charge state assignments.
The authors then used NoDupe to reduce the number of spectra.

14
Observations

A large number of peaks were removed.
For the peptide VAAPEEHPVLLTEAPLNPK,
approximately 70% of the peaks in the spectra
were removed.
Both the number of peaks and the relative
standard deviation diminished; the relative
standard deviation fell from 26% to 20%.

15
Observations Clusters

16
Identifications lost

4% to 14% of the identifications were lost.
Without removing the duplicate spectra, 5% to 19%
of the identifications were lost.
The angle is found to be 0.847.

17
For group size 2

Since there are only two spectra in this group,
the more representative one is chosen.
Scan 491 is chosen because only 21 of its peaks
remain, as opposed to 24.
Since pairs are common, there might be a
significant loss of protein identifications.

18
Lost spectra
Scan 4892 was not found to be similar enough by
NoDupe.
19
Duplicate spectra and peptides identified
20
Where it can be used

Grouping results in substantial savings in time.
Instead of finding the best sequence for each
spectrum, a search can find the sequence that best
matches all of the spectra in a group.
The larger the database, the greater the time
savings.
A narrower mass window can be used.
Grouping alleviates random matching.
Spectral libraries will be more effective if they
contain representative spectra rather than randomly
chosen ones.
Spectra that are in the same group but receive
different identifications by de novo examination
can be flagged.
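
The last point, flagging groups whose members disagree on identification, could look roughly like this. It is a hedged Python sketch: the two input mappings (scan number to group id, and scan number to identified sequence), the function name `conflicting_groups`, and the toy sequences are all hypothetical and not taken from the presentation.

```python
from collections import defaultdict
from typing import Dict, Set


def conflicting_groups(group_of: Dict[int, int], peptide_of: Dict[int, str]) -> Dict[int, Set[str]]:
    """Return groups whose identified spectra disagree on the peptide sequence.

    group_of:   scan number -> group id (assumed output of a grouping step)
    peptide_of: scan number -> identified sequence (scans without an
                identification are simply absent)
    """
    sequences = defaultdict(set)
    for scan, group in group_of.items():
        if scan in peptide_of:
            sequences[group].add(peptide_of[scan])
    # A group is flagged only when it holds more than one distinct sequence.
    return {group: seqs for group, seqs in sequences.items() if len(seqs) > 1}


# Toy data, invented for illustration only.
groups = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
peptides = {1: "PEPTIDEK", 2: "PEPTLDEK", 3: "SAMPLER", 4: "SAMPLER"}
print(conflicting_groups(groups, peptides))
```

Groups containing unidentified scans are not flagged by themselves here; only an actual disagreement between two identifications counts as a conflict.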

21
Acknowledgments

The paper presented was "Similarity among tandem
mass spectra from proteomic experiments:
detection, significance, and utility" by David L. Tabb,
Michael J. MacCoss, Christine C. Wu, Scott
D. Anderson, and John R. Yates.
Thanks to Prof. Haixu Tang for guiding me.
