Title: Proteomics: Strategies for protein Identification
1Proteomics Strategies for protein Identification
- Yao-Te Huang
- Oct 12, 2009
2Two major methods to determine proteins ID
- Determine proteins ID by traditional chemical method (Edman degradation)
- Determine proteins ID by mass spectrometry
3- Part One
- Determine proteins ID by traditional chemical method (Edman degradation)
41A complete hydrolysis
- The first step is to determine the amino acid composition of a protein.
- The protein is hydrolyzed into its constituent amino acids by heating it in 6 N HCl at 110 °C for 24-72 hours. Amino acids in the hydrolysate can then be labeled with ninhydrin or fluorescamine and separated by ion-exchange chromatography on columns of sulfonated polystyrene.
- The identity of each amino acid is revealed by its elution volume (the volume of buffer used to remove the amino acid from the column), and the height of the absorption peak is proportional to the number of times that particular amino acid occurs in the protein.
5If the unknown protein is A-G-D-F-R-G
Determination of Amino Acid Composition.
Different amino acids in a protein hydrolysate
can be separated by ion-exchange chromatography
on a sulfonated polystyrene resin (such as
Dowex-50). Buffers (in this case, sodium citrate)
of increasing pH are used to elute the amino
acids from the column. The amount of each amino
acid present is determined from the absorbance.
Aspartate, which has an acidic side chain, is
first to emerge, whereas arginine, which has a
basic side chain, is the last.
61A complete hydrolysis (contd.)
- Amino acids treated with ninhydrin give an intense blue color, except for proline, which gives a yellow color because it contains a secondary amino group.
- The concentration of an amino acid in a solution, after heating with ninhydrin, is proportional to the optical absorbance of the solution. This technique can detect a microgram (10 nmol) of an amino acid. As little as a nanogram (10 pmol) of an amino acid can be detected by fluorescamine, which reacts with the α-amino group to form a highly fluorescent product.
91A complete hydrolysis (contd.)
- Complete hydrolysis gives the amino acid composition of a protein, not its sequence.
- However, there are algorithms such as AACompIdent that attempt to identify a protein from its amino acid composition by searching protein sequence databases for entries that would give a similar composition profile.
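The database search idea can be illustrated with a minimal sketch. This is not AACompIdent's actual scoring; the squared-difference distance, the function names, and the three-entry toy database are all assumptions made for illustration.

```python
from collections import Counter


def composition(seq):
    """Return the fraction of each residue in a one-letter sequence."""
    counts = Counter(seq)
    total = len(seq)
    return {aa: n / total for aa, n in counts.items()}


def composition_distance(a, b):
    """Sum of squared differences between two composition profiles."""
    residues = set(a) | set(b)
    return sum((a.get(r, 0.0) - b.get(r, 0.0)) ** 2 for r in residues)


def rank_candidates(observed, database):
    """Rank database entries from most to least similar composition."""
    scored = [
        (composition_distance(observed, composition(seq)), name)
        for name, seq in database.items()
    ]
    return sorted(scored)


# Hypothetical toy database (names and sequences are made up).
TOY_DATABASE = {
    "protein_X": "AGDFRG",
    "protein_Y": "GRFDGA",      # same composition, different sequence
    "protein_Z": "MKWVTFISLL",
}

observed = composition("AGDFRG")
ranking = rank_candidates(observed, TOY_DATABASE)
for distance, name in ranking:
    print(f"{name}: distance = {distance:.4f}")
```

protein_X and protein_Y tie at a distance of zero, which shows why composition alone cannot give the sequence.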
10Polypeptides have characteristic amino acid
compositions
111B protein sequencing by Edman degradation
- Edman degradation involves labeling the
N-terminal amino acid of a protein or peptide
with phenyl isothiocyanate.
121B protein sequencing by Edman degradation
(contd.)
- Mild acid hydrolysis then results in the cleavage of the peptide bond immediately adjacent to this modified residue, but leaves the rest of the protein intact.
- The terminal amino acid can then be identified by chromatography, and the procedure is repeated on the next residue and the next, thus building up a longer sequence.
141B protein sequencing by Edman degradation
(contd.)
- Edman degradation is not suitable for sequencing proteins larger than 50 residues in a single run because each cycle of degradation is less than 100% efficient.
- This problem is addressed by cleaving large proteins into peptides, using either chemical reagents or specific endoproteases.
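The effect of imperfect cycle efficiency can be made concrete with a short calculation. The sketch assumes the same efficiency in every cycle, which is a simplification, and the efficiency values are illustrative rather than taken from the slides.

```python
def cumulative_yield(efficiency, cycles):
    """Fraction of chains still in register after a number of Edman cycles.

    Assumes every cycle has the same efficiency (a simplification).
    """
    return efficiency ** cycles


# Illustrative per-cycle efficiencies (assumed values).
for eff in (0.99, 0.98, 0.95):
    remaining = cumulative_yield(eff, 50)
    print(f"efficiency {eff:.2f}: {remaining:.1%} correct signal after 50 cycles")
```

At an assumed 98% per cycle, only about 36% of the chains are still in register after 50 cycles, so the signal from the correct residue is increasingly buried under out-of-phase background.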
171C Edman degradation in proteomics
181C Edman degradation in proteomics (contd.)
- Disadvantages: (a) laborious and time-consuming (sequencing about 10 residues per day); (b) in some proteins, the α-amino group of the N-terminal amino acid residue is modified and therefore fails to react with phenyl isothiocyanate.
- Advantages: (a) the most convenient method for determining the N-terminal sequence of a protein; (b) also a very sensitive method (it can sequence 0.5-1 pmol of pure protein).
19- Part Two
- Determine proteins ID by mass spectrometry
20What is a mass spectrometer, and what does it do?
- A mass spectrometer is an analytical device that determines the molecular weight of chemical compounds by separating molecular ions according to their mass-to-charge ratio (m/z).
- The ions are generated by inducing either the loss or the gain of a charge (e.g., deprotonation or protonation).
- Once the ions are formed, they can be separated according to their m/z and finally detected.
- The resulting mass spectrum may provide information about the MW of a chemical compound or even about its structure.
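As a small worked example of the mass-to-charge ratio, the sketch below computes m/z for an ion formed by protonation. The function name is hypothetical, and the example mass of 1000 Da is arbitrary.

```python
PROTON_MASS = 1.007276  # Da


def mz_protonated(neutral_mass, charge):
    """m/z of an ion formed by adding `charge` protons to a neutral molecule."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# The same 1000 Da molecule appears at different m/z for different charge states.
for z in (1, 2, 3):
    print(f"z = {z}: m/z = {mz_protonated(1000.0, z):.4f}")
```

Because the analyzer measures m/z rather than mass, the charge state must be known or inferred before the molecular weight can be read from the spectrum.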
21Major components of a mass spectrometer (1)
(1) The ion source unit (including sample introduction), in which molecular ions are generated and then electrostatically propelled into the mass resolution unit, e.g., MALDI or ESI.
(2) The mass resolution unit (the mass analyzer), in which molecular ions are resolved (separated or filtered) according to their m/z ratios, e.g., TOF (time-of-flight).
(3) The ion detector unit, in which the signal is detected and transferred to a computer for further processing.
22Major components of a mass spectrometer (2)
23MALDI (Matrix-assisted laser desorption-ionization)
- We may pump much energy into a solid matrix (in which macromolecules are embedded) to ionize and desorb (into a vacuum) the macromolecule without significant degradation.
- The best way to pump energy is through a laser pulse, and the matrix is chosen as a substance that absorbs strongly at the laser wavelength.
- The power density required to generate a significant ion current corresponds to an energy flux of 20 mJ/cm².
- Aromatic molecules such as 2,5-dihydroxybenzoic acid, which absorbs in the UV, are favorite matrices because of the common use of UV lasers in the MALDI method.
- The pulsed nature of the excitation (from a few tens of nanoseconds to a few hundred microseconds) simplifies the data analysis because all molecules begin their flight in a nearly synchronous fashion.
24MALDI (Matrix-assisted laser desorption-ionization)
- In MALDI, the analyte is first co-crystallized with a large molar excess of a matrix compound, usually a UV-absorbing weak organic acid.
- Irradiation of this analyte-matrix mixture by a laser results in the vaporization of the matrix, which carries the analyte with it into the vapor phase. That is, both the matrix and any sample embedded in the matrix are vaporized.
- Ionization of the analyte results from exchanges of electrons (or protons) with the matrix compound.
- Once in the gas phase, the desorbed charged molecules are directed electrostatically into the mass resolution unit (the mass analyzer).
25MALDI (Matrix-assisted laser desorption-ionization)
26Commonly used MALDI matrices
28ESI (electrospray ionization)
- Charged microdroplets containing the macromolecules to be studied are sprayed into the mass spectrometer through a charged nozzle (which ionizes the drops that are exiting the tip).
- As the droplets accelerate away from the tip, the solvent evaporates until, at some point, the concentration of charges is so high that the coulombic forces overcome the surface tension of the drop, resulting in dispersion of the drop into a spray of smaller droplets.
- These droplets continue to evaporate and will themselves disperse into even finer sprays until all the solvent is gone, leaving the macroions they contained for analysis.
29ESI (electrospray ionization)
30ESI
31nanoESI
- The spray needle has been made very small and is positioned close to the entrance to the mass analyzer.
- The flow rates for nanoESI sources are on the order of tens to hundreds of nanoliters per minute.
- The end result of this rather simple adjustment is increased efficiency, which includes a reduction in the amount of sample needed.
- NanoESI is more tolerant of salts and other impurities (because less evaporation means the impurities are not concentrated down as much as they are in ESI).
32nanoESI
33Mass analyzers come in many kinds, including TOF, quadrupoles, etc.
- Performance characteristics: accuracy, resolution, mass range, tandem analysis capabilities, and scan speed
34Accuracy
- Accuracy is the ability of the analyzer to provide correct m/z information and is largely a function of an instrument's stability and resolution.
- For example, an instrument with 0.01% accuracy can provide information on a 1000.00 Da peptide to within 0.1 Da or on a 10000 Da protein to within 1.0 Da.
- An alternative means of describing accuracy is parts-per-million (ppm) terminology, where a 1000.00 Da peptide measured to within 0.1 Da could also be described as measured to within 100 ppm.
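The conversion between a relative accuracy in ppm and an absolute tolerance in Da is simple arithmetic. The helper names below are hypothetical; the numbers reproduce the example on this slide.

```python
def tolerance_da(mass, ppm):
    """Absolute mass tolerance in Da for a given relative accuracy in ppm."""
    return mass * ppm / 1e6


def ppm_error(measured, theoretical):
    """Signed relative error of a measurement in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


# 0.01% accuracy is the same as 100 ppm.
print(tolerance_da(1000.0, 100))    # tolerance for a 1000 Da peptide
print(tolerance_da(10000.0, 100))   # tolerance for a 10000 Da protein
print(ppm_error(1000.1, 1000.0))    # error of a reading that is 0.1 Da too high
```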
35Resolution
- It is the ability of a mass spectrometer to distinguish between ions of different mass-to-charge ratios.
  Resolution = M / ΔM
- where ΔM represents the peak width at half maximum,
- and M corresponds to the m/z.
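Resolution, defined as M divided by the peak width at half maximum, can be computed directly. The second helper uses a rough rule of thumb that is an assumption of this sketch, not a statement from the slides: two peaks are treated as separable when their spacing is at least one peak width.

```python
def resolution(mz, fwhm):
    """Resolution = M / delta-M, with delta-M the full width at half maximum."""
    return mz / fwhm


def is_resolved(mz1, mz2, res):
    """Rough criterion (an assumption): peaks separate if spacing >= FWHM."""
    fwhm = ((mz1 + mz2) / 2) / res
    return abs(mz1 - mz2) >= fwhm


print(resolution(1000.0, 0.5))             # a 0.5-wide peak at m/z 1000
print(is_resolved(1000.0, 1001.0, 2000))   # peaks 1 unit apart
print(is_resolved(1000.0, 1000.1, 2000))   # peaks 0.1 unit apart
```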
36Resolution (contd.)
37Mass Range
- It is the m/z range of the mass analyzer. For
instance, time-of-flight (TOF) analyzers have
virtually unlimited m/z range, and quadrupole
analyzers typically scan up to m/z 3000.
38Tandem MS analysis
- It is the ability of the analyzer to separate an ion, generate fragment ions from the original ion, and then analyze the fragment ions.
- Typically, tandem MS experiments are performed by generating the ion of interest and selecting it with an analyzer. The ion is then collided with inert gas molecules such as argon or helium, and the fragments generated by the collision are analyzed.
39Tandem MS analysis
- Information obtained via tandem analysis can be used to sequence peptides or to structurally characterize carbohydrates, small oligonucleotides, and lipids.
40Scan speed
- It refers to the rate at which the analyzer scans
over a particular mass range. Most instruments
require seconds to perform a full scan; however,
this can vary widely depending on the analyzer.
Time-of-flight analyzers, for example, complete
analyses in milliseconds or less.
41The principles of a time-of-flight (TOF) mass
spectrometer
42The principles of a time-of-flight (TOF) mass
spectrometer
43reflector time-of-flight (TOF)
In reflector time-of-flight (TOF) instruments,
the ions are accelerated to high kinetic energy
and are separated along a flight tube as a
result of their different velocities. The ions
are turned around in a reflector, which
compensates for slight differences in kinetic
energy, and then impinge on a detector that
amplifies and counts arriving ions.
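The separation by velocity follows from all ions of the same charge receiving the same kinetic energy, zeV = mv²/2, so heavier ions fly more slowly. The sketch below estimates drift times for a simple linear, field-free tube. It ignores the reflector and the acceleration region, and the 20 kV voltage and 1 m tube length are assumed values chosen only for illustration.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # C
DALTON = 1.66053906660e-27           # kg


def flight_time(mass_da, charge, accel_voltage=20_000.0, tube_length=1.0):
    """Drift time (s) through a field-free tube of `tube_length` metres.

    Assumes the ion gains kinetic energy z*e*V before entering the tube.
    """
    kinetic_energy = charge * ELEMENTARY_CHARGE * accel_voltage        # J
    velocity = math.sqrt(2.0 * kinetic_energy / (mass_da * DALTON))    # m/s
    return tube_length / velocity


for mass in (1000.0, 2000.0, 4000.0):
    print(f"{mass:.0f} Da, z = 1: {flight_time(mass, 1) * 1e6:.2f} microseconds")
```

The flight time scales with the square root of m/z, so a fourfold increase in mass doubles the arrival time.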
44Quadrupoles
Quadrupole mass spectrometers select ions by means of
time-varying RF fields between four rods, which
permit a stable trajectory only for ions of a
particular desired m/z.
45Triple Quadrupoles
Again, ions of a particular m/z are selected in
a first section (Q1), fragmented in a collision
cell (Q2), and the fragments separated in Q3.
46Triple quadrupoles (contd.)
- The first quadrupole (Q1) is used to scan across a preset m/z range or to select an ion of interest.
- The second quadrupole (Q2), also known as the collision cell, transmits the ions while introducing a collision gas (argon) into the flight path of the selected ion. After colliding with Ar, the selected ion is fragmented, a process called CID (collision-induced dissociation).
- The third quadrupole (Q3) serves to analyze the fragment ions generated in the collision cell (Q2).
47Peptide Mass Fingerprinting (PMF)
- More recently, MS has been combined with protease digestion to enable peptide mass fingerprinting.
- (1) Sequence-specific proteases or certain chemical cleaving agents are used to obtain a set of peptides from the target protein, which are then mass analyzed.
- (2) The observed masses of the proteolytic fragments are compared with theoretical in silico digests of all the proteins listed in a sequence database.
- (3) The matches or hits are then statistically evaluated and ranked from highest to lowest probability.
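The three steps can be sketched in a few lines. This is a toy illustration, not how any particular search program works: it assumes trypsin specificity (cleavage after K or R, but not before P) with no missed cleavages, scores candidates by a simple count of matching masses instead of a statistical probability, and uses a made-up two-entry database.

```python
# Monoisotopic residue masses in Da (standard values, rounded).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(seq):
    """In silico trypsin digest: cleave after K or R unless followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(MONO[aa] for aa in peptide) + WATER


def count_matches(observed, theoretical, tol_da=0.2):
    """Number of observed masses within tol_da of any theoretical mass."""
    return sum(any(abs(o - t) <= tol_da for t in theoretical) for o in observed)


def rank_proteins(observed, database, tol_da=0.2):
    """Rank database entries by how many observed masses they explain."""
    scores = {
        name: count_matches(
            observed, [peptide_mass(p) for p in tryptic_digest(seq)], tol_da
        )
        for name, seq in database.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical database and simulated measurement (0.05 Da error added).
TOY_DB = {"toy_A": "AGDFRGLSKTPEVR", "toy_B": "MKWVTFISLLR"}
observed = [peptide_mass(p) + 0.05 for p in tryptic_digest(TOY_DB["toy_A"])]
ranking = rank_proteins(observed, TOY_DB)
print(ranking)
```

Real search programs replace the simple match count with a probability-based score and allow for missed cleavages and modifications.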
48Protein identification by PMF
49- Various databases are available on the web and
can be used in conjunction with computer
programs such as ProFound, ProteinProspector, and
Mascot.
50PMF: possible causes of incorrect protein identification
- An error in the sequence database
- An inaccurate experimental mass determination
- Existence of two or more polymorphic variants (e.g., SNPs)
- Post-translational modifications
- Occasional nonspecific cleavage of the protein by trypsin
51Protein identification using Tandem Mass
Spectrometry
52Protein identification using Tandem Mass
Spectrometry (contd.)
- Tandem mass spectrometry can induce fragmentation and perform successive mass spectrometry experiments on the resulting ions. It is generally used to obtain structural information. (Abbreviated MSn, where n refers to the number of generations of fragment ions being analyzed.)
54CID (collision-induced dissociation)
- CID is accomplished by selecting an ion of interest with the mass analyzer and then subjecting that ion to collisions with neutral atoms or molecules. The selected ion collides with the collision gas (e.g., Ar, He, or Xe), resulting in fragment ions, which are then mass analyzed. CID can be accomplished with a variety of instruments, including triple quadrupoles and TOF/TOF mass analyzers.
55CID (contd.)
- The fragment ions produced in this process can be separated into two classes.
- (1) One class retains the charge on the N-terminal side; fragmentation occurs at three different positions, designated as types a_n, b_n, and c_n.
- (2) The second class retains the charge on the C-terminal side; fragmentation occurs at three different positions, designated as types x_n, y_n, and z_n.
56CID (contd.)
- Most fragment ions are obtained from cleavage between a carbonyl and a nitrogen (the amide bond).
- Thus, if the charge is retained on the N-terminal end of the molecule, the cleavage is b-type. If the charge is retained on the C-terminal end of the peptide, the cleavage is y-type.
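The b- and y-type fragments of a peptide can be enumerated with a short calculation. The sketch assumes singly protonated fragments and standard monoisotopic masses; the peptide AGDFR is an arbitrary example, and the function name is hypothetical.

```python
# Monoisotopic residue masses in Da (standard values, rounded).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_ions(peptide):
    """Singly protonated b- and y-ion m/z values, keyed by ion number."""
    masses = [MONO[aa] for aa in peptide]
    n = len(peptide)
    b_ions, y_ions = {}, {}
    for i in range(1, n):
        # b_i: the first i residues, charge kept on the N-terminal piece.
        b_ions[i] = sum(masses[:i]) + PROTON
        # y_(n-i): the remaining residues plus water, charge on the C-terminal piece.
        y_ions[n - i] = sum(masses[i:]) + WATER + PROTON
    return b_ions, y_ions


b_ions, y_ions = fragment_ions("AGDFR")
for i in sorted(b_ions):
    print(f"b{i} = {b_ions[i]:.3f}   y{i} = {y_ions[i]:.3f}")
```

Each amide bond gives one complementary b/y pair, so for every cleavage site the two m/z values add up to the neutral peptide mass plus two protons.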
59Ladder sequencing by mass spectrometry
The differences in mass between consecutive ions in either series should correspond to the masses of individual amino acids.
E = Glu, T = Thr
60Ladder sequencing by mass spectrometry
Two pairs of residues are hard to distinguish by MS: (1) Gln (128.13) and Lys (128.17); (2) Leu (113) and Ile (113).
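Reading a sequence from a ladder, and the ambiguity of these two residue pairs, can be illustrated with a short sketch. It uses monoisotopic residue masses (the values quoted above are average masses), and the peptide, the tolerances, and the function name are assumptions made for illustration.

```python
from itertools import accumulate

# Monoisotopic residue masses in Da (standard values, rounded).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728


def read_ladder(ion_mzs, tol=0.05):
    """Turn mass differences between consecutive ions into residue calls.

    Each call lists every residue whose mass lies within `tol` Da of the
    difference, joined by "/", or "?" if nothing matches.
    """
    ordered = sorted(ion_mzs)
    calls = []
    for low, high in zip(ordered, ordered[1:]):
        delta = high - low
        candidates = [aa for aa, m in MONO.items() if abs(m - delta) <= tol]
        calls.append("/".join(candidates) if candidates else "?")
    return calls


# Simulated b-ion series (b1 to b4) of the made-up peptide GELKT.
b_series = [m + PROTON for m in accumulate(MONO[aa] for aa in "GELK")]

coarse = read_ladder(b_series, tol=0.05)
fine = read_ladder(b_series, tol=0.01)
print(coarse)
print(fine)
```

With the looser tolerance the Lys step cannot be told apart from Gln, while the tighter tolerance resolves it. Leu and Ile stay ambiguous at any tolerance because they have identical masses.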