Title: Mass Spectrometry
1Mass Spectrometry
2What are mass spectrometers?
- They are analytical tools used to measure the
molecular weight of a sample.
- Accuracy is within 0.01% of the total molecular weight
of a large sample (biomolecule), which is enough
to identify substitutions and post-translational
modifications.
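To make the accuracy figure concrete, the short sketch below turns a relative accuracy into an absolute mass error. It assumes the figure is a percentage (0.01%); the 40,000 Da protein and the phosphorylation shift of about 79.966 Da are illustrative values, not taken from the slides.

```python
def mass_tolerance_da(molecular_weight_da, accuracy_percent=0.01):
    """Absolute mass error (Da) implied by a relative accuracy given in percent."""
    return molecular_weight_da * accuracy_percent / 100.0

# Hypothetical 40,000 Da protein measured at the assumed 0.01% accuracy.
tolerance = mass_tolerance_da(40_000)

# Phosphorylation adds about 79.966 Da (HPO3). As a rough rule for this
# sketch, a modification is treated as detectable when its mass shift is
# larger than the measurement tolerance.
PHOSPHORYLATION_SHIFT_DA = 79.966
detectable = PHOSPHORYLATION_SHIFT_DA > tolerance

print(f"tolerance = {tolerance:.1f} Da, phosphorylation detectable: {detectable}")
```

Under these assumptions the error stays at a few daltons even for a large protein, which is why common modifications stand out.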
3Use to BioChemists
- Accurate molecular weight measurements
- Determine sample purity
- Verify amino acid substitutions and post-translational
modifications.
- Amino acid sequencing
- Oligonucleotide structure
- Protein structure determination (protein folding,
macromolecular structure determination)
4Use in the Industry
- Biotechnology (Analysis of proteins, peptides,
oligonucleotides)
- Pharmaceuticals (Drug discovery,
pharmacokinetics, drug metabolism)
- Clinical (neonatal screening, haemoglobin
analysis)
- Geological (Oil composition)
- Environmental (Water quality, food contamination)
5Mass Spectrometer has
- 3 parts
- Ionization source
- Analyzer
- Detector
http://www.astbury.leeds.ac.uk/Facil/MStut/mstutorial.htm
6Matrix-Assisted Laser Desorption/Ionization
(MALDI)
From lectures by Vineet Bafna, UCSD
7Protein to peptide to fragment ion
- Proteins are digested using a protease such as
trypsin.
- Trypsin cleaves the protein backbone after K (lysine)
and R (arginine), which are basic residues and
readily form positive ions.
- The mass spectrometer further breaks these
peptides into fragment ions.
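The digestion step can be sketched in a few lines. This is a simplified in-silico digest that assumes cleavage after every lysine (K) or arginine (R) with no missed cleavages; the function name and the example sequence are made up for illustration.

```python
def tryptic_digest(protein):
    """Split a protein sequence after every K or R (simplified trypsin rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        if residue in "KR":
            # Cut on the C-terminal side of the basic residue.
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Whatever follows the last K/R is the C-terminal peptide.
        peptides.append(protein[start:])
    return peptides

# Hypothetical sequence, used only to show the cut points.
peptides = tryptic_digest("MKWVTFISLLRGSAKE")
print(peptides)
```

Each resulting peptide except possibly the last ends in a basic residue, which helps it carry a positive charge in the instrument.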
8Peptide Cleavage
9Glycan Cleavage
10Mass Spectrum
11Experimental Spectrum
Theoretical Spectrum
12Peptide sequencing problem
- Goal: Find a peptide whose theoretical spectrum
best matches the given experimental spectrum.
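A minimal sketch of the matching idea: generate the singly charged b- and y-ion m/z values of a candidate peptide and count how many land near an experimental peak. The shared-peak count is a stand-in scoring function chosen for simplicity, not the scoring used by any particular tool; the mass table is a subset of standard monoisotopic residue masses, and the peak list is made up.

```python
# Monoisotopic residue masses in daltons (subset of the 20 amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111,
}
WATER, PROTON = 18.01056, 1.00728

def theoretical_spectrum(peptide):
    """Sorted singly charged b- and y-ion m/z values for a peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    peaks = []
    for i in range(1, len(peptide)):
        peaks.append(sum(masses[:i]) + PROTON)          # b ion: N-terminal prefix
        peaks.append(sum(masses[i:]) + WATER + PROTON)  # y ion: C-terminal suffix
    return sorted(peaks)

def shared_peak_count(experimental, theoretical, tol=0.5):
    """Number of theoretical peaks with an experimental peak within tol."""
    return sum(any(abs(e - t) <= tol for e in experimental) for t in theoretical)

# Made-up experimental peak list and candidate peptides.
experimental = [58.03, 129.07, 147.11, 218.15]
candidates = ["GAK", "AGK", "GKA"]
best = max(candidates,
           key=lambda p: shared_peak_count(experimental, theoretical_spectrum(p)))
print(best)
```

Real spectra also carry intensities and noise peaks, so practical scoring functions weigh more than simple peak overlap.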
13How can it be done?
- Database search
- De Novo search
14Database search
- Given an experimental spectrum and the parent mass
of the experimental peptide, find candidate
peptides with the same parent mass in the
database that match the experimental peptide the
best.
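The two steps of a database search, filtering by parent mass and then ranking by spectral match, might look like the sketch below. The tolerances, the tiny peptide "database", and the shared-peak score are all assumptions made for illustration.

```python
# Monoisotopic residue masses in daltons (subset of the 20 amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "L": 113.08406, "K": 128.09496, "E": 129.04259, "R": 156.10111,
}
WATER, PROTON = 18.01056, 1.00728

def parent_mass(peptide):
    """Neutral peptide mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def fragment_mzs(peptide):
    """Singly charged b- and y-ion m/z values."""
    total = sum(RESIDUE_MASS[aa] for aa in peptide)
    prefix, mzs = 0.0, []
    for aa in peptide[:-1]:
        prefix += RESIDUE_MASS[aa]
        mzs.append(prefix + PROTON)                  # b ion
        mzs.append(total - prefix + WATER + PROTON)  # y ion
    return mzs

def database_search(spectrum, observed_parent_mass, database,
                    mass_tol=0.5, peak_tol=0.5):
    """Return the best-scoring candidate with a matching parent mass, or None."""
    candidates = [p for p in database
                  if abs(parent_mass(p) - observed_parent_mass) <= mass_tol]

    def score(peptide):
        # Count theoretical fragments that have a nearby experimental peak.
        return sum(any(abs(mz - peak) <= peak_tol for peak in spectrum)
                   for mz in fragment_mzs(peptide))

    return max(candidates, key=score, default=None)

# Toy database; the "observed" spectrum is simulated from one of its entries.
database = ["GAK", "AGK", "SVR", "GLEK"]
observed = fragment_mzs("AGK")
hit = database_search(observed, parent_mass("AGK"), database)
print(hit)
```

Note that GAK and AGK share a parent mass, so only the fragment ions tell them apart.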
15De novo Search
- Build a spectrum graph from the masses to create
the nodes
- Use mass differences to create the edges
- Find the best path
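The three steps above can be sketched as follows. This toy version assumes a clean, complete list of neutral prefix masses (one per cleavage site), takes "best path" to mean the path with the most edges, and uses a reduced residue table that omits isobaric pairs such as I/L; real de novo tools score nodes and edges and must cope with missing and noise peaks.

```python
# Monoisotopic residue masses in daltons (reduced table, no isobaric pairs).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111,
}

def de_novo(prefix_masses, total_mass, tol=0.02):
    """Read a sequence off the spectrum graph, or return None if no path exists."""
    # Nodes: observed prefix masses plus the source (0) and the sink (total mass).
    nodes = sorted(set([0.0, total_mass] + list(prefix_masses)))
    # best[j] = (edge count, sequence) of the best path from node 0 to node j.
    best = {0: (0, "")}
    for j in range(1, len(nodes)):
        for i in range(j):
            if i not in best:
                continue  # node i is not reachable from the source
            diff = nodes[j] - nodes[i]
            # Edge: the mass difference equals one amino acid residue.
            for aa, mass in RESIDUE_MASS.items():
                if abs(diff - mass) <= tol:
                    candidate = (best[i][0] + 1, best[i][1] + aa)
                    if j not in best or candidate[0] > best[j][0]:
                        best[j] = candidate
    return best.get(len(nodes) - 1, (0, None))[1]

# Simulated input: prefix masses of the made-up peptide GAVK.
target = "GAVK"
prefixes = [sum(RESIDUE_MASS[aa] for aa in target[:k]) for k in range(1, len(target))]
total = sum(RESIDUE_MASS[aa] for aa in target)
sequence = de_novo(prefixes, total)
print(sequence)
```

Dropping one prefix mass from the input breaks the path, which illustrates why de novo methods depend on good-quality spectra.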
17Database vs De Novo Search
- Database search is very successful in
identifying already known proteins.
- De novo helps in identifying proteins not
in the database.
- Database search is not as fast as de novo.
- De novo needs good-quality spectra without
any modifications to work with.
- De novo is not very accurate.
- Database: SEQUEST (Yates et al.)
- De novo: PepNovo (Frank and Pevzner)
18Our Project
- Database search tool for Peptide Identification
from Mass Spectrometry data using a Machine
Learning approach
19What makes it different from the traditional
search tools?
- Traditional search tools use ad-hoc rules and/or
unified probabilistic models.
- Our tool is based on the machine learning
approach.
20What is Machine Learning?
- An area of artificial intelligence concerned
with the development of techniques which allow
computers to learn from data.
- The researcher feeds a set of training examples
to a computer program that aims to learn the
connection between features of the examples and a
specified target concept.
21Examples of Machine learning techniques
- Linear Regression
- Decision Tree learning
- Artificial Neural Networks
- Bayesian Learning
- Analytical Learning
- Reinforcement Learning
- etc.
22Our Choice....
- Artificial Neural Networks
- Reason?
- Peptide fragmentation is a non-linear problem
governed by complex rules.
23A brief overview of Neural Networks
24Brain Cells
25From Human Neurons to Artificial Neurons....
26Feed Forward Networks
27Work flow of the project
28Data Preparation
- Protein samples were isolated from rat brains.
- Samples were digested with trypsin and passed
through an LCQ Deca XP ion-trap mass spectrometer,
and spectra of peptide ions were recorded.
- All spectra were searched against protein
sequences for Rattus in the Swiss-Prot database using
Mascot.
29
- Precursor peptides were divided into doubly and
triply charged sets.
- Peak intensities were estimated for the following
ion types:
- precursor-H2O
- b, b-H2O, b-NH3, b-H2O-NH3
- y, y-H2O, y-NH3, y-H2O-NH3
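For reference, the m/z positions of these ion types follow from the base b or y value by subtracting the neutral-loss mass. The sketch below assumes singly charged fragments and standard monoisotopic masses for water and ammonia; the function names and input values are illustrative only.

```python
WATER, AMMONIA, PROTON = 18.01056, 17.02655, 1.00728

# Mass removed by each neutral loss, keyed by the suffix used in the ion name.
NEUTRAL_LOSSES = {
    "": 0.0,
    "-H2O": WATER,
    "-NH3": AMMONIA,
    "-H2O-NH3": WATER + AMMONIA,
}

def ion_ladder(b_mz, y_mz):
    """m/z of the eight b/y ion types at one cleavage site (singly charged)."""
    ions = {}
    for series, base in (("b", b_mz), ("y", y_mz)):
        for suffix, loss in NEUTRAL_LOSSES.items():
            ions[series + suffix] = base - loss
    return ions

def precursor_minus_water(neutral_mass, charge):
    """m/z of the precursor ion after losing one water, at the given charge."""
    return (neutral_mass - WATER + charge * PROTON) / charge

# Illustrative values: b2 and y1 of a small peptide, doubly charged precursor.
ions = ion_ladder(b_mz=129.06585, y_mz=147.11280)
precursor = precursor_minus_water(neutral_mass=274.16409, charge=2)
for name, mz in ions.items():
    print(f"{name:10s} {mz:.4f}")
```

For multiply charged fragments the same loss would be divided by the charge, as done for the precursor here.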
30Network Training
- Features for each ion were extracted.
- Target is the peak intensity for each ion.
- Separate ensembles of two-layer feed-forward
networks were constructed for each ion type and
trained on the data.
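A rough sketch of what one such ensemble could look like, written in plain Python. Everything specific here is an assumption: the two toy input features, the tanh hidden layer, the hidden-layer size, the learning rate, and the product-shaped target that stands in for a peak intensity. The project's actual features and training procedure are not described in the slides.

```python
import math
import random

class TwoLayerNet:
    """One hidden tanh layer followed by a linear output unit."""

    def __init__(self, n_inputs, n_hidden, rng):
        # The last weight in each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [math.tanh(w[-1] + sum(wi * xi for wi, xi in zip(w, x)))
                  for w in self.w1]
        output = self.w2[-1] + sum(wi * hi for wi, hi in zip(self.w2, hidden))
        return hidden, output

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr):
        """One stochastic gradient step on the squared error."""
        hidden, output = self.forward(x)
        error = output - target
        for j, h in enumerate(hidden):
            grad_hidden = error * self.w2[j] * (1.0 - h * h)  # back through tanh
            self.w2[j] -= lr * error * h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_hidden * xi
            self.w1[j][-1] -= lr * grad_hidden
        self.w2[-1] -= lr * error

def train_ensemble(examples, n_networks=5, n_hidden=4, epochs=500, lr=0.05, seed=0):
    """Train several networks that differ only in their random initialisation."""
    ensemble = []
    for k in range(n_networks):
        rng = random.Random(seed + k)
        net = TwoLayerNet(len(examples[0][0]), n_hidden, rng)
        for _ in range(epochs):
            for x, target in rng.sample(examples, len(examples)):  # shuffled pass
                net.train_step(x, target, lr)
        ensemble.append(net)
    return ensemble

def ensemble_predict(ensemble, x):
    """Average the outputs of all networks in the ensemble."""
    return sum(net.predict(x) for net in ensemble) / len(ensemble)

# Toy data: two made-up features and a non-linear "intensity" target.
grid = [0.0, 0.5, 1.0]
examples = [([a, b], a * b) for a in grid for b in grid]
ensemble = train_ensemble(examples)
mse = sum((ensemble_predict(ensemble, x) - t) ** 2 for x, t in examples) / len(examples)
print(f"training MSE of the ensemble: {mse:.4f}")
```

Averaging several independently initialised networks tends to smooth out the quirks of any single training run, which is one common reason to use an ensemble per ion type.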
31Prediction of Spectrum
- The predicted fragment spectrum was constructed by
combining the outputs of the individual predictors
for each ion type.
- A black box was constructed from the trained
neural network models.
- The black box, when presented with a peptide as
input, outputs the predicted spectrum.
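The black-box idea can be sketched as a function that takes a peptide and a set of per-ion-type predictors and returns (m/z, intensity) pairs. The predictor interface below, a callable taking the peptide and a cleavage index, is a hypothetical one chosen for this example, and the constant-valued predictors merely stand in for trained networks. Only plain b and y ions are shown.

```python
# Monoisotopic residue masses in daltons (subset of the 20 amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "L": 113.08406, "K": 128.09496, "E": 129.04259, "R": 156.10111,
}
WATER, PROTON = 18.01056, 1.00728

def predict_spectrum(peptide, predictors):
    """Combine per-ion-type intensity predictors into one predicted spectrum.

    predictors maps an ion type ("b" or "y") to a callable
    (peptide, cleavage_index) -> predicted intensity.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    spectrum = []
    for i in range(1, len(peptide)):
        mz = {
            "b": sum(masses[:i]) + PROTON,
            "y": sum(masses[i:]) + WATER + PROTON,
        }
        for ion_type, predictor in predictors.items():
            spectrum.append((mz[ion_type], predictor(peptide, i)))
    return sorted(spectrum)

# Stand-in predictors returning fixed intensities instead of network outputs.
toy_predictors = {
    "b": lambda peptide, i: 0.3,
    "y": lambda peptide, i: 0.7,
}
predicted = predict_spectrum("GAK", toy_predictors)
for mz, intensity in predicted:
    print(f"{mz:9.4f}  {intensity:.2f}")
```

Swapping the stand-ins for trained models leaves the combining logic unchanged, which is what lets the whole pipeline be treated as a black box.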
33Thank You