Title: INTERNATIONAL COLLABORATION IN PROTEOMICS AND INFORMATICS
1 INTERNATIONAL COLLABORATION IN PROTEOMICS AND INFORMATICS
- Bibliotheca Alexandrina, 9 October, 2007
- Gilbert S. Omenn, M.D., Ph.D.
- Center for Computational Medicine and Biology
- Chair, HUPO Plasma Proteome Project
- University of Michigan, Ann Arbor, MI, USA
2 It Is Such a Great Pleasure to Visit the Bibliotheca Alexandrina
- One of the Wonders of the Modern World!
- The First Digital Library, from its Birth
- Facilitating International Collaboration in Science and Technology
3 Nearly-Complete Human Genome Sequence, 15-16 Feb 2001
4 We Live in a New World of Life Sciences
- New Biology, New Technology: a parts list
- Genome Expression Microarrays
- Comparative Genomics, CNV, miRNA
- Proteomics and Metabolomics
- Bioinformatics and Computational Biology
- Mechanism- and Evidence-Based Medicine
- What were you doing up to now?!
- Predictive, personalized, preventive, participatory healthcare and community health services
5 Key Components of the Vision of Biology As An Information Science
- An avalanche of genomic information: validated SNPs, haplotype blocks, candidate genes/alleles, proteins, and metabolites associated with disease risk
- Powerful computational methods
- Effective linkages with better environmental and behavioral datasets for eco-genetic analyses
- Credible privacy and confidentiality protections
- Breakthrough tests, vaccines, drugs, behaviors, and regulatory actions to reduce health risks and cost-effectively treat patients globally.
6 A Golden Age for the Public Health Sciences
- Sequencing and analyzing the human genome is generating genetic information that must be linked with information about:
- Nutrition and metabolism
- Lifestyle behaviors
- Diseases and medications
- Microbial, chemical, physical exposures
- Every discipline of the public health sciences is needed.
7 Definitions
- Genetics is the scientific study of genes and their roles in health and disease, physiology, and evolution.
- Genomics is a modern subset of the broader field of genetics, made feasible by remarkable advances in molecular biology, biotechnology, and computational sciences, to examine the entire complement of genes and their actions.
- Global analyses permit us and require us to go beyond the known lamp-posts of individual gene associations and effects.
8
- Proteins are the action molecules of the cell and the leading candidates for biomarkers in tissues and in the blood. Proteins are coded for by genes. Understanding one protein can be a lifetime's work!
- Proteomics is the global analysis of proteins in cells or body fluids. Techniques for global analysis of proteins are advancing rapidly, especially for discovery of biomarkers for diagnosis, treatment, and prevention.
- Metabolomics is the global analysis of metabolites.
- Proteomics, metabolomics, epigenomics: functional genomics
9 Protein, DNA
10 Rationale for Proteomics
- Proteins are much closer to the pathophysiologic changes and molecular targets for drugs than are mRNAs.
- Changes in mRNAs are clues, but changes in corresponding proteins often are not highly correlated.
- Advances in fractionation of complex tissue and plasma protein mixtures, in mass spectrometry, and in curated databases of proteins help address complexity, dynamic range, and uncertainty of protein identifications.
11 A Vision For Proteomics
- Multiple protein biomarkers discovered
- Biomarkers combined on diagnostic chips
- Detect organ location of cancers, for surgery or radiation
- Detect mechanism of disease for chemotherapy, even if location unknown
- Mechanistic, rather than geographic classification
- Better efficacy/less toxicity for all types of patients
12 Status of Proteomics Assays
- Many technology platforms of increasing sensitivity and resolution
- Patterns or specific proteins are still just biomarker candidates; most lack independent confirmation and coefficient of variation, let alone validation with standard clinical chemistry parameters of sensitivity, specificity, and especially positive predictive value
- Approaches of clinical chemistry needed to guide further development of the field
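Why positive predictive value gets the "especially": a short sketch of the standard Bayes relation between sensitivity, specificity, prevalence, and PPV. The assay figures (90% sensitive, 95% specific) and the two prevalence values are hypothetical, chosen only for illustration; they are not from the PPP data.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """PPV = true positives / (true positives + false positives),
    expressed as fractions of the tested population."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical assay: the same test applied to a high-risk clinic
# population (50% prevalence) and to a screening population (1%).
for prevalence in (0.5, 0.01):
    ppv = positive_predictive_value(0.90, 0.95, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}")
```

At low prevalence most positive calls are false positives even for a seemingly good assay, which is why a biomarker candidate needs more than sensitivity and specificity figures before clinical use.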
13 Barriers for Proteomic Cancer Biomarker Discovery in Plasma
- Human cancers are very heterogeneous
- Tumor proteins are in low abundance for early detection of cancers
- Tumor proteins are greatly diluted upon release to ECF and blood
- Plasma is an extraordinarily complex specimen dominated by high abundance proteins (50% by weight is albumin)
- Knowledge of the plasma proteome is still limited
14 Outline of Lecture
- Review of the vision, strategy, and output of the HUPO Human Plasma Proteome Project Pilot Phase
- Objectives for the New Phase of the Plasma Proteome Project
- Example of the power of computational tools and collaborations (if time)
15 HUPO
- The international Human Proteome Organization (HUPO) was founded in 2001. Its aims are:
- To advance the science of proteomics
- To enhance training in proteomics
- To build international initiatives by organ (liver, brain, kidney), biofluid (plasma, urine, CSF, saliva), and disease (cardiovascular, cancers), plus antibodies and data standards.
16 Proteomics Interaction Map (Ruth McNally, sociologist)
17 Samir Hanash, founding President of HUPO; Gil Omenn, leader of HUPO PPP
18 THE PLASMA PROTEOME
- Advantages: the most available human specimen; the most comprehensive sample of tissue-derived proteins; the basis for a Disease Biomarkers Initiative tied to organ proteomes.
- Specific Disadvantages
- Extreme complexity/enormous dynamic range
- High risk of ex vivo modifications
- Lack of highly standardized protocols
- General Challenges: inadequate appreciation of incomplete sampling by MS/MS; evolving annotations and unstable databases
19 Long-Term Scientific Goals of the HUPO Human Plasma Proteome Project
- 1. Comprehensive analysis of plasma and serum protein constituents in people
- 2. Identification of biological sources of variation within individuals over time, with validation of biomarkers
- Physiological: age, sex/menstrual cycle, exercise
- Pathological: selected diseases/special cohorts
- Pharmacological: common medications
- 3. Determination of the extent of variation across populations and within populations
20 Scheme Showing Aims and Linkages of the HUPO Plasma Proteome Project, Pilot Phase
(Figure: HUPO HUMAN PLASMA PROTEOME PROJECT (PPP), linked to the following elements)
- Serum vs Plasma
- Technology Platforms: Separation and Identification
- Reference Specimens
- Development and Validation of Biomarkers
- HUPO PPP Participating Labs
- Technology Vendors
- Liver and Brain Proteome, Antibody, Protein Stds Projects
Omenn GS. The Human Proteome Organization Plasma Proteome Project Pilot Phase: Reference Specimens, Technology Platform Comparisons, and Standardized Data Submissions and Analyses. Proteomics 2004;4:1235-1240.
21 OUTPUT FROM PPP Pilot Phase
- Special Issue Aug 2005, Proteomics, Exploring the Human Plasma Proteome: 28 papers (collaborative analyses and annotations, plus lab-specific analyses), and Wiley book (2006)
- Publicly-accessible datasets
- www.ebi.ac.uk/pride (EBI)
- www.peptideatlas.org/repository (ISB)
- www.bioinformatics.med.umich.edu/hupo/ppp
- Additional papers are encouraged
- Nature Biotechnology 2006;24:333-338 (States et al)
- Genome Biology 2006;7:R35 (Fermin et al)
- Proteomics 2006;6:5662-5673 (Omenn)
- Numerous citations/comparisons of datasets
23 SERUM AND PLASMA REFERENCE SPECIMENS
- 1. BD: specially prepared male/female pooled samples, divided into EDTA-, Heparin-, and Citrate-anti-coagulated Plasma and Serum (250 ul x 4 of each). BD clot activator. No protease inhibitors. Three separate ethnic pools prepared. Shipped frozen.
- 2. Chinese Academy of Medical Sciences: sets of three plasmas plus serum, similar to BD protocol.
- 3. National Institute for Biological Standards and Control, UK: citrate-anti-coagulated, freeze-dried plasma, from 25 donors, prepared for Intl Soc Thrombosis and Hemostasis, 1 ml aliquots/ampoules.
24 Specifications for Data Submission
- Each of 55 labs agreed (July 2003 Workshop) to provide, and 31 labs did provide:
- a) a detailed experimental protocol, to push the limits to detect low-abundance proteins
- b) peptide sequences, rated as high or lower confidence, based on MS/MS criteria
- c) protein IDs from IPI 2.21 (July 2003) and search engine parameters used to align peptide sequences with proteins in human database
- Later, we obtained m/z peak lists and raw spectra (by DVD) for independent analyses.
25 From Peptides to Genome Annotation
(Figure: workflow diagram. Sample > extraction > Proteins > digestion > Peptides > LC-MS/MS > Mass Spectrum > database search > Peptides.)
- Per-spectrum table with columns Spectrum, Peptide, Probability; e.g., Spectrum 1, LGEYGH, 1.0; Spectrum N, EIQKKF, 0.3
- Further stages shown: statistical filtering; BLAST protein database; SBEAMS; PeptideAtlas Database; Map to genome; visualization; Genome Browser
- Genome mapping table with columns Peptide, Chrom, Start_Coord, End_Coord; e.g., PAp00007336, X, 132217318, 132217368
26 Numbers of Proteins Identified (LC-MS/MS or FTICR-MS, 18 labs)
- From 15,519 reported distinct protein IDs in IPI 2.21, we chose one representative per cluster:
- (a) 9504: 1 or more peptide matches
- (b) 3020: 2 or more peptide matches (Core Dataset)
- (c) 1274: 3 or more peptide matches
- 889: follow-up high-stringency analysis with adjustments for protein length and multiple (43,000) comparisons in IPI v2.21 (Nature Biotech 2006;24:333-338)
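The tiers above differ only in how many distinct peptides must support a protein. A minimal sketch of that counting rule follows; the protein IDs and most peptide sequences are made up for illustration, and the real PPP pipeline (clustering to one representative per IPI cluster, length and multiple-comparison adjustments) is not reproduced here.

```python
from collections import defaultdict


def proteins_by_peptide_support(assignments, min_peptides):
    """Return the protein IDs supported by at least `min_peptides`
    distinct peptide sequences."""
    peptides = defaultdict(set)
    for protein_id, peptide in assignments:
        peptides[protein_id].add(peptide)  # a set ignores repeat observations
    return {p for p, seqs in peptides.items() if len(seqs) >= min_peptides}


# Hypothetical (protein ID, peptide sequence) pairs.
assignments = [
    ("P1", "LGEYGH"), ("P1", "EIQKKF"), ("P1", "AVDTK"),
    ("P2", "LGEYGH"), ("P2", "LGEYGH"),  # same peptide seen twice counts once
    ("P3", "SSNPK"), ("P3", "TTLER"),
]

for k in (1, 2, 3):
    print(k, sorted(proteins_by_peptide_support(assignments, k)))
```

Raising the threshold trades coverage for confidence, which is the pattern in the 9504 / 3020 / 1274 counts.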
27 GREATEST RESOLUTION AND SENSITIVITY
- The most extensive high-confidence yield was from combined methods of immunoaffinity (top-6) depletion, 2 or 3-D high-resolution fractionation, and then ESI-MS/MS with ion-trap LTQ instrument.
- LTQ gave several fold more IDs (1168) than did LCQ (271) in same hands (B1-serum vs B1-heparin) and obtained multiple peptides for many proteins which had just one hit with LCQ.
28 SPECIFIC OBSERVATIONS: DEPLETION
- Many investigators depleted albumin and/or immunoglobulins
- Several were provided Agilent immunoaffinity column to remove top-6 proteins
- Much higher numbers of identifications after depletion if sufficient fractionation
- Inadvertent removal of other proteins: sponge effect of albumin
- Assay both flow-through and bound fractions
29 SPECIMEN VARIABLES
- What evidence have we developed for choice of specimens for analysis?
- Plasma preferred over serum: more consistent, less degradation
- EDTA-plasma preferred over heparin (interferences) and citrate (dilution)
- Clot activator? Necessary only for serum
- Minimize freeze/thaw cycles (archives)
- Minimal evidence of platelet activation at 4 C
- Protease inhibitors desirable, but alter proteins
30 INFLUENCE OF ABUNDANCE
- Using quantitative immunoassays and microarrays (generally unknown epitopes), we have found very high rates of detection of the more abundant proteins, less in the mid-range, and occasional detection of very low abundance proteins, as expected.
- High correlation (r ~ 0.9) between peptides and measured concentrations
31 Least Abundant Proteins Identified with two distinct peptides (pg/ml; range 200 pg/ml to 20 ng/ml)
- Alpha fetoprotein: 2.9E-02
- TNF-R-8: 3.3E+02
- TNF-ligand-6: 1.5E+03
- PDGF-R alpha: 4.6E+03
- Leukemia inhibitory factor receptor: 5.0E+03
- MMP-2/gelatinase: 8.8E+03
- EGFR: 1.1E+04
- TIMP-1: 1.4E+04
- IGFBP-2: 1.5E+04
- Activated leukocyte adhesion mol: 1.6E+04
- Selectin L (five labs, 10 peptides): 1.7E+04
32 BIOLOGICAL INSIGHTS
- The proteins identified can be annotated by many methods. We have searched multiple databases, including Gene Ontology, Novartis Atlas, Online Mendelian Inheritance in Man (OMIM), incomplete or unidentified sequences in the human genome, microbial genomes, InterPro protein domains, transmembrane domains, secretion signals.
- See Proteomics 2005;5:3226-3519; Wiley, 2006
33 GENE ONTOLOGY: SPECIFIC TERMS
- Over-represented in PPP 3020 (vs whole genome): extracellular, immune response, blood coagulation, lipid transport, complement activation, regulation of blood pressure, as expected; also cytoskeletal proteins, receptors and transporters.
- Proteins from most cellular locations and molecular processes are recognized.
- Under-represented: perception of smell (1 vs 25 exp), cation transporters, ribosomal proteins, G-protein coupled receptors, and nucleic acid binding proteins.
34 InterPro Protein Domain Analysis
- Compared with the whole human genome, the 3020 PPP proteins are:
- Over-represented for EGF, intermediate filament protein, sushi, thrombospondin, complement C1q, and cysteine protease inhibitor.
- Under-represented for Zinc finger (C2H2, B-box, RING), tyrosine protein phosphatase, tyrosine and serine/threonine protein kinases, helix-turn-helix motif, and IQ calmodulin binding region domains.
35 TRANSMEMBRANE AND SECRETED PROTEIN FEATURES
- 1297 of 3020 (columns: SwissProt Annotated / ProFun / Both)
- Transmembrane: 230 / 151 / 104
- Secretion signal: 373 / 420 / 358
- 1723 of 3020 (ProFun Predicted)
- TM domain(s): 137
- Secretion signal: 255
36 Cardiovascular-Related Proteins: Biomarker Candidates in the PPP Database
- Proteins characterized in eight groups
- Inflammation
- Vascular
- Signaling
- Growth and differentiation
- Cytoskeletal
- Transcription factors
- Channels
- Receptors
37 Comparison of Five Search Algorithms
- Using PPP data, Kapp et al (Proteomics 2005) found Sequest and Spectrum Mill more sensitive and MASCOT, Sonar, and X!Tandem more specific for peptide identifications at specified false-positive rates.
- Some investigators have reported using combinations of two or more search engines. Decision rules are necessary.
38 Can We Overcome the Idiosyncrasies of Individual Instruments and Laboratories?
- Several informatics investigators approached the human PPP with an offer to re-analyze the complete MS/MS datasets using their own software and criteria from the raw spectra (or peaklists).
- These analyses eliminated the heterogeneity of search algorithms, search parameters, and idiosyncrasies of individual labs.
- The results are hard to compare, given different extent of analysis. However, each can be compared with the Core Dataset.
39 Independent Analyses from Raw Spectra (IDs with 2 or more peptides)
- Core Dataset (18 datasets, 3020)
- PepMiner (Beer, 8 large datasets, 2895): 1051 in 3020 dataset, 700 in the 9504
- X!Tandem (Beavis/States, 18 datasets, 2678): 577 in the 3020; 218 in the 889
- PeptideProphet/ProteinProphet (Deutsch, 7 datasets, 960): 479 in 3020
- Mascot/Digger (Kapp, Australia, 14 datasets, 513 with 1.4% error rate): ongoing analysis
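Each comparison above is a set intersection between a re-analysis and the Core Dataset. The sketch below shows the operation on made-up protein IDs, then computes the shared fractions directly from the counts quoted on this slide; the function name and the toy IDs are assumptions for illustration only.

```python
def overlap_with_core(core_ids, other_ids):
    """Count IDs shared with the core set and the fraction of the
    other analysis that the core set also contains."""
    shared = core_ids & other_ids
    fraction = len(shared) / len(other_ids) if other_ids else 0.0
    return len(shared), fraction


# Toy example with hypothetical protein IDs.
core = {"IPI001", "IPI002", "IPI003", "IPI004"}
reanalysis = {"IPI003", "IPI004", "IPI005"}
print(overlap_with_core(core, reanalysis))

# Fractions implied by the slide's own counts: (IDs reported, IDs in the 3020).
slide_counts = {
    "PepMiner": (2895, 1051),
    "X!Tandem": (2678, 577),
    "PeptideProphet/ProteinProphet": (960, 479),
}
for name, (total, in_core) in slide_counts.items():
    print(f"{name}: {in_core / total:.1%} of its IDs are in the Core Dataset")
```

Fractions like these are only comparable across analyses when the same input datasets were searched, which is the caveat the previous slide raises.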
40 What is Required and Feasible to Enhance the Statistical Robustness of Findings?
- Many complex proteomics analyses are done once, without replicates required to estimate coefficient of variation or other standard parameters for clinical chemistry use.
- "Five to ten independent repetitions of the experiments are a must" (Hamacher et al, Proteomics in Drug Discovery, 2006).
- How should we determine how similar or different are samples A and B, or the results of methods X and Y? What decision rules apply?
- We have a long way to go from discovery research to clinical applications.
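The coefficient of variation mentioned above cannot be estimated from a single run. A minimal sketch, using hypothetical replicate intensities for one protein (not PPP data), shows the calculation with the sample standard deviation:

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean.
    Needs at least two replicates and a non-zero mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean is zero; CoV is undefined")
    return statistics.stdev(values) / mean


# Hypothetical measurements of one protein across five replicate runs.
replicates = [10.0, 12.0, 11.0, 9.0, 13.0]
print(f"CoV = {coefficient_of_variation(replicates):.1%}")
```

With one measurement `statistics.stdev` raises an error, which mirrors the slide's point: an analysis done once gives no estimate of variability at all.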
41 Comparison of 5 Published Reports on Plasma Proteins with HUPO PPP Datasets
- Report: IDs / IPI / in 3020 / in 9504
- Anderson: 1175 / 990 / 316 / 471
- Shen: 1682 / 1842 / 213 / 526
- Chan: 1444 / 1019 / 257 / 402
- Zhou: 210 / 148 / 51 / 88
- Rose: 405 / 287 / 142 / 159
42 Comparison of New Biofluid Proteome Findings with HUPO PPP-3020 Proteins
- Proteome: Proteins / IPI 2.21 / PPP-3020
- Urine: 1543 / 910 / 293
- Tears: 491 / 313 / 117
- Semen: 923 / 560 / 180
- Refs from Matthias Mann Lab, Genome Biology, 2007, different IPI versions.
- Comparison: Omenn, Proteomics-Clinical Applications (2007).
43 NEXT PHASE OF PPP (PPP-2)
- Standard operating procedures (SOPs), including EDTA-plasma as standard specimen; replication and confirmation of results
- Quantitation and subproteomes, using new methods and advanced instruments
- Databases and robust bioinformatics
- Clinical chem/disease-related studies
44 PPP-2 Research Technology Thrusts
- Learned a lot from Pilot Phase: plasma is a very complex specimen; no single platform sufficient; analyses currently far from comprehensive, let alone reproducible; now have improved data quality and informatics resources.
- PPP-2: use multiple methods; focus on biomarker discovery; build upon already-funded laboratories and repositories.
45 Specific Technology Recommendations
- N-Glycosite (proteotypic) peptide resource is a special subproteome likely to have high biomarker relevance.
- Capture glycoproteins, digest with trypsin and PNGase F to yield N-linked glycopeptides. Choose one unique to each protein: a finite number, not all proteins. Use complementary lectin approach to characterize glycans.
- Prepare isotope-labeled N-glyco-peptides for multiple uses as standards and to spike specimens.
46 N-Glycosites
- Glycoproteins are enriched on cell surface, in secreted proteome and in plasma
- Glycoproteins tend to be stable
- Only few glycosites per protein: reduction in sample complexity (excludes albumin)
- Inherent validation of N-glycosite by fragment ion spectrum
- N-glycosite subproteome is probably the one easiest to completely map
47 Glycopeptide Isolation
Zhang H., Li X.-J., Martin D.B., Aebersold R. (2003) Nat Biotech 21:660-666
48 Flow chart of process
(Figure: flow chart with the following boxes)
- Tissue Samples and Plasma Samples (Normal, Disease)
- Capture / Digestion
- 'Glycopeptide' Fractions
- LC-MS, LC/MS Maps, Data Analysis
- Target peptides
- Targeted LC/MS/MS (MRM LC/MS/MS), Data Analysis
49 Reducing Complexity: Glycoprotein-Enriched Subproteomes
- Methods: Lab 2 / Lab 11
- Enrichment: hydrazide chem / lectin chromatography
- Peptide Fxn: SCX, RP / RP
- Mass Spec: qtof / deca-xp
- Search engine: Seq/ProteinProphet / Sequest
- Protein IDs in B1-serum: 222 / 83; 51 in common
- Of total 254, 164 found among data from 11 other labs without glycoprotein enrichment.
50 Technology Recommendations (contd)
- Orbitrap and other advanced instruments with high mass accuracy and increased throughput
- Multiple Reaction Monitoring (Q-Trap, triple quad): LOD <50 amol, 5 logs range, probably ng/ml range for GP.
- Extensive fractionation and newer labeling methods.
- Recruit several major labs; be open to volunteers.
- Determine interest in reference specimen.
- Make peptide standards available through PPP-2; post lists and make labeled compounds.
51 Multiple Reaction Monitoring (MRM)
- High selectivity: two levels of mass selection (increased S/N)
- High sensitivity because of high duty cycle (Q1 and Q3 are static)
- Only known peptides (candidates) are detected
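The two levels of mass selection can be pictured as two filters applied in series: an ion contributes signal only if its precursor m/z passes the Q1 window and its fragment m/z passes the Q3 window. The sketch below is a toy model of that logic, not instrument software; the `Transition` class, the m/z values, and the 0.5 tolerance are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    peptide: str   # label for the targeted (already known) peptide
    q1_mz: float   # precursor m/z selected in Q1
    q3_mz: float   # fragment m/z selected in Q3


def monitor(events, transitions, tol=0.5):
    """Sum the intensity of events that pass both the Q1 and the Q3
    window of a transition. Events are (precursor m/z, fragment m/z,
    intensity) tuples."""
    signal = {t.peptide: 0.0 for t in transitions}
    for precursor_mz, fragment_mz, intensity in events:
        for t in transitions:
            passes_q1 = abs(precursor_mz - t.q1_mz) <= tol
            passes_q3 = abs(fragment_mz - t.q3_mz) <= tol
            if passes_q1 and passes_q3:
                signal[t.peptide] += intensity
    return signal


transitions = [Transition("candidate_A", 500.3, 720.4)]
events = [
    (500.2, 720.5, 100.0),  # passes both filters
    (500.3, 650.0, 900.0),  # right precursor, wrong fragment: rejected
    (612.8, 720.4, 400.0),  # wrong precursor: rejected
    (500.4, 720.3, 50.0),   # passes both filters
]
print(monitor(events, transitions))
```

The intense interfering ions are rejected by one filter or the other, which is the source of the improved S/N; the price is that nothing outside the transition list is ever seen.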
52 Technology Recommendations (contd)
- Compare pooled samples from disease and control; high throughput not essential for discovery phase
- Continue to build the catalog
- Do longitudinal repeat measures on individuals to establish CoV; must reliably tell whether two samples are the same or different, including PTMs
- Pay attention to precursor ions
- Known interested labs: Aebersold, Paik, Smith, Speicher, Hancock, Mann; probably Chinese, Michigan, FHCRC, Japanese/glycomics.
53 Issues for PPP Bioinformatics
- What are imperatives for project design?
- How can many more spectra be interpreted?
- How can more confident protein IDs be generated?
- How do we add value and benefit from EBI/PRIDE and ISB/PeptideAtlas repositories?
- What is required to make the datasets more useful for other investigators?
- Can quantitation, including of PTMs, be achieved with statistical robustness?
54 A Robust Bioinformatics Architecture
(Figure: architecture diagram with the elements Individual labs, Level I repository, PRIDE, Peptide Atlas, Genome annotation, Dissemination)
55 Repositories and Resources for Proteomics Informatics
- PRIDE at EBI, repository for protein identifications (Martens)
- PeptideAtlas, repository for raw data processed through Trans-Proteomic Pipeline at ISB (Deutsch), plus SpectraST barcodes from NIST
- Tranche Distributed File System/DFS (Andrews, UM) at ProteomeCommons.org, National Resource for Proteomics and Pathways
- CPAS, developed as part of Mouse Models of Human Cancers Consortium, at Fred Hutchinson (McIntosh)
- GPMdb, developed by Beavis (Canada)
56 Tranche Distributed (P2P) File System
- Open, simple, cross-platform protocols
- e-Commerce-grade encryption makes it appropriate for scientific research (peer-review and traceability)
- Can easily grow to accommodate very large amounts of data and users
- Commodity hardware @ 0.37 per GB storage
- 16 TB over 12 servers (30 additional TB ordered) and funding for additional 20 TB
- Documentation, tools, code, credits: http://www.proteomecommons.org/dev/dfs
- Data sets: GPM, PNNL, Aurum, QqTOF vs QSTAR, sPRG ABRF 2006, HUPO PPP
- Links with PeptideAtlas, OPD, HPRD, TheGPM
57 Can We Identify More High Confidence Peptides from the MS/MS spectra?
- The spectra, not protein lists, are the raw data. <20% of spectra are confidently assigned to peptide sequences; the rest are typically discarded.
- More high quality spectra can be mined (Nesvizhskii et al, MCP 2006).
- Higher mass accuracy greatly enhances results (with some complications, per Eric Deutsch).
- Error estimates and thresholds should be routine for peptide IDs and protein matches. Trans-Proteomic Pipeline (TPP) from ISB has been designed for this purpose.
58 Mining Un-assigned High Quality Spectra (Nesvizhskii)
- Typical search: SEQUEST, IPI database
- semi-constrained (tryptic on one end)
- Met +16
- +/- 3 Da, average mass
- Average numbers (LCQ/LTQ data): 10-15% of all spectra assigned peptide with high confidence
- 20-25% of all high quality spectra are not assigned
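"Tryptic on one end" means a candidate peptide must begin or end at a trypsin cleavage site, but not necessarily both. The sketch below uses the commonly stated rule (cleave after K or R, except before P) on a made-up protein sequence; it illustrates the constraint only and is not the SEQUEST implementation.

```python
def tryptic_sites(protein):
    """Positions where a tryptic peptide may start or end: the protein
    termini, plus every position after K or R not followed by P."""
    sites = {0, len(protein)}
    for i, residue in enumerate(protein[:-1]):
        if residue in "KR" and protein[i + 1] != "P":
            sites.add(i + 1)
    return sites


def is_semi_tryptic(protein, start, end):
    """True if protein[start:end] has at least one tryptic terminus."""
    sites = tryptic_sites(protein)
    return start in sites or end in sites


protein = "MKTAYIAKQRPLGEYGHK"  # hypothetical sequence
print(sorted(tryptic_sites(protein)))
for start, end in [(2, 8), (2, 5), (3, 6)]:
    print(protein[start:end], is_semi_tryptic(protein, start, end))
```

Allowing one non-tryptic end enlarges the search space considerably, so more real peptides become findable at the cost of more candidate sequences per spectrum.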
59 Why Are Spectra Not Assigned?
- Possible causes of failure to assign peptide:
- Imperfect scoring scheme
- Constrained search (PTM, not tryptic etc.)
- Incorrect mass/charge state
- Low spectrum quality / contaminant ion
- Correct sequence may not be in the database searched (e.g., SNP)
- Novel sequence (splice variants, fusion peptides?)
- Use MS/MS data for genome annotation
60 Finding and Mining High Quality Unassigned Spectra (Nesvizhskii)
61 Further Analyses at the Peptide Level
- The PPP, GPM, and PeptideAtlas databases are rich with peptide-level findings, which can be analyzed for many questions, e.g.: Which peptides are most likely to be detected from among the predicted tryptic peptides of various proteins, and why? Can peptides be used directly to identify sequences of splice isoforms and SNPs? Can PTMs be identified more readily? Answers: Yes to all three questions.
- Proteotypic peptides will be a major feature of Next Phase PPP.
62 What Kinds of Biological Insights Emerge from Annotation?
- The aim of proteomics analyses is not just to create lists of peptides and proteins, but to advance our understanding of complex biological processes in health and disease. Going forward, quantitation of proteins and their PTMs will be increasingly important, and feasible.
63 High Throughput Proteomics and Systems Biology
(Figure: data from condition 1, condition 2, and condition 3; integration of genomic, transcriptomic, proteomic, metabolomic data; understanding and modeling cell behavior: Systems Biology)
64 SUMMARY
- Enthusiasm for continuing and expanding Plasma Proteome Project, confirmed at Seoul, Korea, World Congress of Proteomics, Oct 2007
- Commitment to combine PPP with concept of Disease Biomarker Initiative
- Interest in linking with and absorbing datasets from other Biofluid Proteomes (saliva, urine, CSF, organ-related proximal fluids)
65 Biology as an Information Science: NIH Roadmap National Centers for Biomedical Computing
- Informatics for Integrating Biology and the Bedside (i2b2), Isaac Kohane, PI
- Physics-Based Simulation of Biological Structures (SIMBIOS), Russ Altman, PI
- National Center for Integrative Biomedical Informatics (NCIBI), Brian Athey, PI
- National Alliance for Medical Imaging Computing (NA-MIC), Ron Kikinis, PI
- The National Center For Biomedical Ontology (NCBO), Mark Musen, PI
- Multiscale Analysis of Genomic and Cellular Networks (MAGNet), Andrea Califano, PI
- Center for Computational Biology (CCB), Arthur Toga, PI
66 A Bioinformatics Approach to Discover Candidate Oncogenes
- Few causal cancer genes have been discovered using gene expression microarrays
- Oncogenic events are often heterogeneous
- ERBB2/HER2 amplification in 20% of breast CA
- Activating Ras mutations in 25% of melanomas
- E2A-PBX1 translocation in 5-10% of leukemias
- Chromosomal aberrations that result in marked over-expression of an oncogene should be detectable in transcriptome data
- Protein products then may be identified in tumor, biofluids, and plasma
68 COPA of microarray data revealed ETV1 and ERG as outlier genes across multiple prostate cancer gene expression data sets
Tomlins et al., Science 2005, 310:644-648
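COPA (Cancer Outlier Profile Analysis) looks for genes that are markedly over-expressed in only a subset of samples, which a mean-based statistic tends to miss. The sketch below is a simplified reading of the idea, assuming the usual description of the transform (center each gene on its median, scale by the median absolute deviation, then score the gene by an upper percentile); the expression values are invented and the nearest-rank percentile is a choice made here for brevity.

```python
import math
import statistics


def copa_score(values, percentile=90):
    """Median-center, scale by the median absolute deviation (MAD),
    and return the requested upper percentile (nearest-rank)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; the transform is undefined")
    transformed = sorted((v - med) / mad for v in values)
    rank = math.ceil(percentile / 100 * len(transformed))  # 1-based rank
    return transformed[rank - 1]


# Hypothetical expression values for two genes across ten tumors.
uniform_gene = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 1.0]
outlier_gene = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 6.0, 5.5, 7.0]

print("uniform:", round(copa_score(uniform_gene), 2))
print("outlier:", round(copa_score(outlier_gene), 2))
```

Because the median and MAD are barely moved by the three high samples, the outlier gene keeps a large score at the 90th percentile, while a gene with ordinary spread does not.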
69 COPA Unveils Androgen-Responsive TF Fusion Genes
70 The Molecular Concept Map Project (Chinnaiyan, Rhodes)
72 Our Genetic Future
- "Mapping the human genetic terrain may rank with the great expeditions of Lewis and Clark, Sir Edmund Hillary, and the Apollo Program." (Francis Collins, Director, National Human Genome Research Institute, 1999)
- Next:
- Understand gene and protein expression
- Elucidate genetic, environmental, and behavioral interactions in health and disease
- Engage scientists globally
73 Acknowledgements
- HUPO PPP: Ruedi Aebersold and Young-Ki Paik, co-chairs; Eric Deutsch, Lennart Martens, Alexey Nesvizhskii, David States, bioinformatics; lab leaders and sponsors (see Proteomics 2005)
- UM Proteomics Alliance for Cancer Research: Phil Andrews, David States, Alexey Nesvizhskii, George Michailidis, Mike Pisano, Arul Chinnaiyan, Dan Rhodes, Scott Tomlins, Arun Sreekumar, Adai Vellaichamy, Brian Haab
- UM National Center for Integrative Biomedical Informatics: Brian Athey, David States, HV Jagadish, Jignesh Patel, Peter Woolf, Biaoyang Lin