Title: BeeSpace Meta-Analysis: Methods and Plans
1 BeeSpace Meta-Analysis: Methods and Plans
- Nathan D. Price
- Department of Chemical and Biomolecular Engineering
- Center for Biophysics and Computational Biology
- Institute for Genomic Biology
- University of Illinois, Urbana-Champaign
- May 22, 2009
2 Meta-Analysis: What can we learn from integrating data from 2000 microarrays?
- How distinct and separable are the phenotypes?
- What molecular features are unique to each phenotype?
- Are these differences sufficient to identify the phenotype given only the microarray?
- What molecular features are shared between subsets of phenotypes?
- Can we reliably reconstruct networks active in bee brains, and at what scale?
3 A few key issues
- Data normalization from the loop design arrays
- Using knowledge from Drosophila to aid in interpretation at multiple levels
  - Single gene
  - Pathway
  - Network
- Cell specificity?
- Interpretation of data from homogenized brain
  - Multiple cell types
- Scale of network reconstruction/inference possible
  - Particularly since we don't have time series or molecular perturbation experiments
4 Minimal sample needs for statistical learning
- Differential expression: 10s
- Pathway analysis: 10s
- Classification: 10s-100s
- Network inference: 100s-1000s
Data sets in the 1000s are rare in biology today, so this is a tremendous opportunity!
5 Examples of methods and results from previous meta-analysis studies
6 Classification: molecular signatures to differentiate phenotypes
- Price, N.D. et al, PNAS 104:3414-9 (2007)
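The cited paper reports a two-gene classifier built on the relative expression of a gene pair. The sketch below illustrates that general idea (a top-scoring-pair search) and is not the published implementation: the function names `top_scoring_pair` and `predict`, the tie handling, and the toy expression values are all assumptions made for illustration.

```python
from itertools import combinations


def top_scoring_pair(samples, labels):
    """Find the gene pair whose relative ordering best separates classes 0 and 1.

    samples: list of expression profiles (one list of floats per array)
    labels:  list of 0/1 class labels, one per profile
    Returns (score, i, j, class_when_i_below_j).
    """
    class0 = [s for s, y in zip(samples, labels) if y == 0]
    class1 = [s for s, y in zip(samples, labels) if y == 1]
    best = (-1.0, None, None, None)
    for i, j in combinations(range(len(samples[0])), 2):
        # Fraction of each class in which gene i is expressed below gene j.
        p0 = sum(s[i] < s[j] for s in class0) / len(class0)
        p1 = sum(s[i] < s[j] for s in class1) / len(class1)
        score = abs(p0 - p1)
        if score > best[0]:
            best = (score, i, j, 0 if p0 > p1 else 1)
    return best


def predict(profile, pair):
    """Classify one profile from the ordering of the selected gene pair."""
    _, i, j, class_when_below = pair
    return class_when_below if profile[i] < profile[j] else 1 - class_when_below


# Toy data (invented): three genes measured on six arrays.
samples = [
    [1.0, 5.0, 3.0], [2.0, 6.0, 1.0], [0.5, 4.0, 9.0],  # class 0
    [7.0, 2.0, 3.5], [8.0, 1.0, 0.5], [6.0, 3.0, 9.5],  # class 1
]
labels = [0, 0, 0, 1, 1, 1]

pair = top_scoring_pair(samples, labels)
print(pair)  # (1.0, 0, 1, 0): gene 0 below gene 1 indicates class 0
```

Because the rule depends only on which of two genes is higher within a single array, it does not change under any monotone rescaling of that array, which is one reason such classifiers are attractive when pooling data from many experiments.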
7 Multi-class classification: Example from brain disease
8 Novel method for pathway analysis: Differential rank conservation (DIRAC)
- Comparisons: across pathways in a phenotype; across phenotypes for a pathway
- Tightly regulated pathway: highest conservation
- Weakly regulated pathway: lowest conservation
- Shuffled pathway ranking between phenotypes
[Figure: rank orderings of genes g1-g8 in phenotypes GIST and LMS]
Eddy et al, In preparation
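The slide names rank conservation but gives no formulas, so the following is only a minimal sketch of one way to quantify it, assuming a pairwise-ordering formulation: build a majority-vote ordering template for the genes in a pathway, score each sample by how many gene pairs agree with the template, and average over the phenotype. The function names and the toy expression values are hypothetical.

```python
from itertools import combinations


def rank_template(samples, genes):
    """Majority ordering per gene pair: True if gene a < gene b in most samples."""
    template = {}
    for a, b in combinations(genes, 2):
        below = sum(s[a] < s[b] for s in samples)
        template[(a, b)] = below > len(samples) / 2
    return template


def matching_score(sample, template):
    """Fraction of gene pairs in one sample whose ordering agrees with the template."""
    agree = sum((sample[a] < sample[b]) == expected
                for (a, b), expected in template.items())
    return agree / len(template)


def rank_conservation(samples, genes):
    """Mean matching score of a phenotype's samples against their own template."""
    template = rank_template(samples, genes)
    return sum(matching_score(s, template) for s in samples) / len(samples)


# Toy data (invented): one pathway with a stable ordering, one without.
tight_genes = ["g1", "g2", "g3", "g4"]
tight = [
    {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": 4.0},
    {"g1": 1.5, "g2": 2.5, "g3": 3.1, "g4": 5.0},
    {"g1": 0.9, "g2": 2.2, "g3": 2.8, "g4": 4.4},
]
weak_genes = ["g5", "g6", "g7", "g8"]
weak = [
    {"g5": 1.0, "g6": 2.0, "g7": 3.0, "g8": 4.0},
    {"g5": 4.0, "g6": 3.0, "g7": 2.0, "g8": 1.0},
    {"g5": 2.0, "g6": 1.0, "g7": 4.0, "g8": 3.0},
]

tight_score = rank_conservation(tight, tight_genes)  # every sample matches the template
weak_score = rank_conservation(weak, weak_genes)     # orderings disagree across samples
print(tight_score, weak_score)
```

Under these assumptions a sample could also be assigned to the phenotype whose template it matches best, which is one plausible route from rank conservation to classification.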
9 Rank difference scores in GBM and normal
10 Diverse rank conservation in brain disease
- Low conservation: Astrocytoma, grade I
- Lower conservation: Astrocytoma, grade III
- Lowest conservation: Glioblastoma (grade IV)
[Figure axes: Pathways, Phenotypes; scale from highest to lowest rank conservation]
11 Differential regulation of pathway ranking in disease
12 Classification with DIRAC
13 Network inference
- Training Set (268 conditions)
- Test Set (24 conditions)
- Similar accuracies on both sets: the model is not overfitting and has predictive capacity!
- Bi-clustering for data reduction and learning: SAMBA, cMonkey
- Bonneau, R. et al, Genome Biology 7:R36 (2006)
14 Visual Representations
My lab has made a functional MATLAB-based version of the Inferelator, and we are looking forward to testing it out on the BeeSpace data!
15 Acknowledgments
Price Lab
- Postdocs: Pan-Jun Kim, Amit Ghosh
- Graduate Students: James Eddy, Shu-wen Huang, Matt Gonnerman, Swati Gupta, Caroline Milne, Ravali Raju, Jaeyun Sung, Chunjing Wang, Sriram Chandrasekaran
Collaborators
- Donald Geman, Johns Hopkins
- Lee Hood, Institute for Systems Biology
- Ilya Shmulevich, Institute for Systems Biology
- Jonathan Trent, MD Anderson Cancer Center
- Wei Zhang, MD Anderson Cancer Center
Funding Sources
- NIH Howard Temin Pathway to Independence Award
- NSF CAREER
- Department of Defense TATRC
- Energy Biosciences Institute (BP)