Title: Amino acids, Peptide bond and Amino acid analysis
1Introduction to Proteomics
Masaru Miyagi, Ph.D.
Case Center for Proteomics
January 16, 2008
2Outline of Today's Talk
- What is a proteome?
- Nature of the proteome
- Tools to study the proteome
- Current proteomic studies
3Human Genome
- 3 billion chemical nucleotide bases (A, C, T, and G)
- About 2% of the genome (30,000 genes) encodes instructions for the synthesis of proteins
- The largest known human gene is dystrophin, at 2.4 million bases
- The functions are unknown for more than 50% of discovered genes
- The human genome sequence is almost (99.9%) exactly the same in all people
- Over 40% of the predicted human proteins share similarity with fruit-fly or C. elegans proteins
http://www.ornl.gov/sci/techresources/Human_Genome/publicat/primer2001/4.shtml
4Comparison of Human Genomes
http://www.ornl.gov/sci/techresources/Human_Genome/publicat/primer2001/4.shtml
5What is Proteome?
- A PROTEOME is the entire PROTein complement expressed by a genOME, or by a cell or tissue type.
- There is only one definitive genome of an organism, whereas the proteome is an entity that can change under different conditions and can differ between tissues of a single organism.
- The number of proteins in a proteome can exceed the number of genes present, because protein products expressed by alternative gene splicing or carrying different post-translational modifications are observed as separate molecules on a 2-D gel.
Wilkins, M.R. et al., Biotechnol. Genet. Eng. Rev., 13, 19-50 (1996)
6Human Proteome
- Human genome: 30,000 genes
- Human proteome: 1,000,000 protein variants??
- Alternative splicing, co-/post-translational modifications, post-translational processing, etc. add complexity to the human proteome
7Genome vs Proteome
- Genome: static
- Proteome: very dynamic
8One Genome, Two Proteomes
9One Genome, Many Proteomes
10One Genome, Many Proteomes
11Proteomes Respond to External Factors
12Proteomes Respond to External Factors
[Diagram: external factors acting on the proteome: caffeine, drugs, temperature, infections, foods, toxins, circadian cycle]
Possible mechanisms include
- Protein expression
- Protein-protein interactions
- Post-translational modifications
13Proteome
- Proteins expressed by a cell or organ at a particular time and under specific conditions.
Sample preparation is critical in evaluating a proteome!!
14Proteomics
- Proteomics is the study of total protein
complements, proteomes, e.g. from a given tissue
or cell type.
15-omes
- Genome: the complete set of genes in a system
- Transcriptome: the complete set of mRNAs in a system
- Proteome: the complete set of proteins expressed in a system
16Why Proteomics?
We cannot accurately predict protein functions, expression levels, post-translational modifications, protein-protein interactions, subcellular localizations, 3D structures, etc. from the genome sequence.
[Diagram] DNA -> (transcriptional control) -> mRNA -> (translational control) -> protein -> (post-translational control) -> functional protein
We need to look at proteins expressed in cells/tissues!!
17Why Proteomics?
mRNA level (transcriptome) does not always reflect protein expression level
- Comparative expression analyses of mRNAs and proteins have shown that expression levels of mRNAs are not necessarily correlated with those of the encoded proteins. Possible reasons:
- Different stabilities of mRNAs
- Different protein translation rates
- Different half-lives of proteins
- Post-translational processing/modification
Gygi, S. P. et al. (1999) Mol. Cell. Biol., 19, 1720
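One way to see why equal mRNA levels need not give equal protein levels is a toy steady-state calculation. The sketch below is not from the talk. It assumes a simple first-order model (constant synthesis per mRNA, exponential protein decay), and the gene names and rate values are hypothetical numbers chosen only for illustration.

```python
import math


def steady_state_protein(mrna_copies, translation_rate, protein_half_life_h):
    """Steady-state protein copies under an assumed first-order model.

    Assumed model: dP/dt = translation_rate * mRNA - k_deg * P,
    with k_deg = ln(2) / half-life, so P_ss = translation_rate * mRNA / k_deg.
    """
    k_deg = math.log(2) / protein_half_life_h
    return translation_rate * mrna_copies / k_deg


# Hypothetical genes: identical mRNA level, different protein half-lives.
gene_a = steady_state_protein(mrna_copies=10, translation_rate=100, protein_half_life_h=2)
gene_c = steady_state_protein(mrna_copies=10, translation_rate=100, protein_half_life_h=20)

print(f"gene A: {gene_a:.0f} protein copies")
print(f"gene C: {gene_c:.0f} protein copies")
print(f"ratio C/A: {gene_c / gene_a:.1f}")
```

In this toy model, a tenfold longer protein half-life gives a tenfold higher protein level at the same mRNA level, which is the kind of discrepancy the bullets above describe.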
18Goal of Proteomics
- Quantitatively characterize all proteins expressed by a tissue or organism
- Complete protein expression profiles
- Complete covalent structures
- Complete protein networks
- Complete 3D structures
- Complete functional assignments
- Understand how proteins function in living cells
19Tools to Study Proteomes
20History of Protein Studies
- Protein studies: slow growth at first, boom since 1995
- 1937: Electrophoresis of proteins (Tiselius)
- 1948: Amino acid analysis (Stein and Moore)
- 1955: Complete amino acid sequence of insulin (Sanger)
- 1967: Automated protein sequencer (Edman)
- 1975: Two-dimensional electrophoresis (O'Farrell)
- 1988: Matrix-assisted laser desorption ionization mass spectrometry of proteins (Tanaka)
- 1989: Electrospray mass spectrometry of proteins (Fenn)
- 1993: Identification of proteins from 2D gels by mass spectrometry (Henzel)
21Proteomics Analytical Challenges
- Proteome
- Very dynamic
- Very complex
- A huge range of protein abundances
- Variable solubility
- Cannot be amplified
22Sensitivity and Resolution Limits of Current
Global Proteomic Profiling Methods
[Figure: sensitivity scale running through the prefixes milli, micro, nano, pico, femto, atto, and zepto; one label reads "Million"]
Kenyon et al., Mol. Cell. Prot. 2002
23Nine Orders of Magnitude (10^9) Difference in Concentration
[Figure: size analogy] 12 mm versus a diameter of 12,000 km
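The two lengths on this slide can be checked with a few lines of arithmetic. This is a minimal sketch; the variable names are illustrative, and both lengths are converted to millimetres so that the ratio is exact integer arithmetic.

```python
import math

# Both lengths in millimetres.
small_mm = 12                    # 12 mm
large_mm = 12_000 * 1_000_000    # 12,000 km = 12,000 * 10^6 mm

ratio = large_mm // small_mm
orders_of_magnitude = round(math.log10(ratio))

print(ratio)                 # 1000000000
print(orders_of_magnitude)   # 9
```

The ratio of the two lengths is 10^9, the same span as the difference in protein concentration named in the slide title.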
24Proteomics needs
- High-resolution protein/peptide separation techniques
- Electrophoresis
- High Performance Liquid Chromatography
- Highly sensitive detection techniques
- Mass spectrometry
- Reliable bioinformatics tools
- Database search tools
- Data interpretation tools
25General Flow Scheme for Proteomic Analysis
[Flow diagram]
Top-down method: protein mixture -> separation (2D-PAGE) -> proteins -> digestion -> peptides
Bottom-up/shotgun methods: protein mixture -> digestion -> peptide mixture -> separation (HPLC) -> peptides
Both routes: peptides -> MS analysis (electrospray/MALDI) -> MS data -> database search software -> protein identification, post-translational modification
26Electrophoresis - History
- 1937, Tiselius: first electrophoresis
- 1959, Raymond and Weintraub: introduction of polyacrylamide gels
- 1969, Weber and Osborn: introduction of denaturing agents (SDS)
- 1970, Laemmli: SDS gel electrophoresis with a stacking gel
- 1975, O'Farrell: isoelectric focusing followed by SDS gel electrophoresis
- 1981, Jorgenson: capillary electrophoresis
- 1988, Görg: IEF using immobilized pH gradient (IPG) strips
Arne Tiselius won the Nobel Prize for Chemistry in 1948
27Two-Dimensional (2D) Electrophoresis
Capable of separating several thousand proteins
in a single gel
28Chromatography - History
- 1906, Michael Tswett: first chromatography. He separated various plant pigments, such as chlorophylls, by passing solutions of them through glass columns packed with finely divided calcium carbonate. The separated species appeared as colored bands on the column, which accounts for the name he chose for the method (Greek chroma, meaning color, and graphein, meaning to write).
- Mid 1970s, Majors and Kirkland: introduction of high-performance liquid chromatography (HPLC).
Skoog, D. A. (1996) Fundamentals of Analytical Chemistry, Seventh Edition, Thomson Learning
29High-Performance Liquid Chromatography (HPLC)
- HPLC is a type of chromatography that employs a liquid mobile phase and a very finely divided stationary phase.
- The diameter of the particles of the solid stationary phase is 3-10 µm.
- In order to obtain satisfactory flow rates, the liquid must be pressurized to several hundred pounds per square inch or more.
30Advantages of HPLC
- Speed
- High resolution
- Sensitivity
- Reproducibility
- Accuracy
- Automation
[Figure: reverse-phase HPLC with UV/VIS and MS detection]
Miyagi et al., Eur. J. Biochem., 2000
31Mass Spectrometry - History
- 1907, Thomson: first mass spectrometer
- 1918, Dempster: electron ionization
- 1919, Aston: discovery of isotopes
- 1934, Mattauch and Herzog: double-focusing mass spectrometer
- 1956, McLafferty and Gohlke: first GC-MS
- 1984, Fenn: ESI-MS
- 1985, Tanaka: MALDI-MS
Sir Joseph John Thomson won the Nobel Prize for Physics in 1906
32Concept of Mass Spectrometry
[Figure: a mixture of molecules M1, M2, and M3 is ionized, the ions are separated along an m/z axis, and each species is detected]
Ionization -> Ion Separation -> Ion Detection
33Basic Components of a Mass Spectrometer
- Mass spectrometers generate charged species (e.g. molecular ions) and then sort them based on their mass-to-charge (m/z) ratio
[Diagram] Sample inlet -> ion source (ionization) -> mass analyzer (ion sorting) -> ion detector (ion detection), with a high-vacuum region labeled in the diagram
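To make the m/z ratio concrete, the sketch below computes it for one molecule carrying different numbers of charges. The slides do not give a formula. This sketch assumes positive-ion mode in which each charge comes from one added proton, as is typical for peptides in ESI and MALDI. The function name and the 1000 Da example mass are illustrative.

```python
PROTON_MASS = 1.007276  # Da, mass of one proton


def mass_to_charge(neutral_mass, charge):
    """m/z of an ion formed by adding `charge` protons to a neutral molecule.

    Assumption: positive-ion mode, one proton per charge ([M + zH]z+).
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# The same 1000 Da molecule appears at different m/z values
# depending on how many charges it carries.
for z in (1, 2, 3):
    print(z, round(mass_to_charge(1000.0, z), 4))
```

The analyzer sorts ions by m/z rather than by mass alone, so a multiply charged ion of a large molecule can appear at a lower m/z than a singly charged ion of a smaller one.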
34Mass Spectrometry Techniques Used in Analysis of
Peptides and Proteins
Mass Spectrometers are usually classified on the
basis of how samples are ionized and how the
mass separation is accomplished.
- Ionization techniques
- Electrospray ionization (ESI), Nano-ESI
- Matrix-assisted laser desorption ionization (MALDI)
- Mass analyzers
- Quadrupole
- Time-of-flight
- Ion trap
- Quadrupole/quadrupole
- Quadrupole/Time-of-flight
- Quadrupole/Ion trap
- Time-of-flight/Time-of-flight
- Ion-trap/Fourier-transform ion cyclotron resonance
35Peptide MS/MS Spectrum
36Protein Identification using MS/MS Data
Bioinformatics tools (database search software) are used.
[Diagram, experimental side] Protein of interest -> trypsin -> tryptic peptides -> LC-ESI-MS/MS -> MS/MS spectra (m/z)
[Diagram, database side] Proteins in the database -> trypsin -> tryptic peptides -> MS/MS -> spectra (m/z)
Comparing the two sets of spectra gives the protein ID.
Proteomics relies on genome sequence databases!!!
37- Proteomics relies on genome sequence databases!!!
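The dependence on sequence databases can be illustrated with a deliberately simplified search. The sketch below is not the software used in the talk. It digests each database sequence in silico with a common trypsin rule (cleave after K or R, but not before P), computes monoisotopic peptide masses, and counts how many observed masses each protein explains. Real MS/MS search engines also score fragment-ion spectra, which this sketch omits. The protein names and sequences are hypothetical.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # Da, added once per peptide for the free termini


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def best_match(observed_masses, database, tol=0.01):
    """Count observed masses explained by each protein; return the top hit."""
    scores = {}
    for name, seq in database.items():
        theoretical = [peptide_mass(p) for p in tryptic_digest(seq)]
        scores[name] = sum(
            any(abs(obs - theo) <= tol for theo in theoretical)
            for obs in observed_masses
        )
    return max(scores, key=scores.get), scores


# Hypothetical two-protein database and "observed" masses for illustration.
database = {"protein_A": "AAKGGRPLLKMM", "protein_B": "GASKLLVR"}
observed = [peptide_mass("AAK"), peptide_mass("GGRPLLK")]

best, scores = best_match(observed, database)
print(best, scores)
```

A protein whose sequence is absent from the database cannot be returned by this kind of search, which is why complete genome sequences matter so much to proteomics.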
38Genome Projects (as of January 03, 2008)
- Published complete genomes: 702
- Archaeal: 50
- Bacterial: 575
- Eukaryal: 77
- Ongoing genome projects
- Archaeal: 68
- Prokaryote: 1628
- Eukaryote: 851
- http://www.genomesonline.org/
39Complete Genomes Mammals
- Homo sapiens
- Mus musculus
- Rattus norvegicus
40Current Proteomic Studies
41Categories of Proteomics
- Expression proteomics
- Comparative proteomics
- Diseased vs. normal, drug treatment vs. control, etc.
- Organellar proteomics
- Interaction proteomics
- Immunoprecipitation, FLAG epitope-tagged protein, etc.
- Post-translational modifications
- Phosphoproteome, nitroproteome, etc.
- Structural proteomics
- X-ray crystallography/NMR based
- Deuterium exchange, protein footprinting, and other techniques
42Expression Proteomics
43[Figure panels: Control; Light exposed]
Exp Eye Res. 2003 Jan;76(1):131-3
44Interaction Proteomics
- Immunoprecipitation/Immunopurification
[Workflow diagram] Immunoprecipitation with an antibody (label truncated in the source: "Anti-") -> excise bands and digest -> MALDI-TOF MS or ESI-MS/MS -> database search -> identification of components
45Mol Syst Biol. 2007;3:89
46Post-translational proteomics: Phosphoproteome
Liebler, D. C. (2002) Introduction to Proteomics,
Humana Press
47Cell. 2006 Nov 3;127(3):635-48
48Structural Proteomics
[Figure label: •OH (hydroxyl radical)]
49Proc Natl Acad Sci U S A. 2007 May 8;104(19):7910-5