Title: National Center for Biomedical Computing at Columbia University
1. National Center for Biomedical Computing at Columbia University
2. Mission
- Basic Science: To study the organization of the complex networks of biochemical interactions whose concerted activity determines cellular processes at increasing levels of granularity.
- Software Tools: To provide an integrative computational framework to organize molecular interactions in the cell into manageable, context-dependent components.
- Biomedical Applications: To develop interoperable computational models and tools that can leverage such a map of cellular interactions to elucidate important biological processes and to address a variety of biomedical applications.
3. MAGNet Organization
4. Core I: Computational Sciences
Coordinator: Waltz. Project Leads: Leslie, Wiggins, Friedman, Califano, Yemini. Investigators: Servedio, Lussier, Kaiser, Ofran, Ross.
- Machine Learning: Classification, network analysis, functional analysis.
- NLP: Analysis of literature for biomedical content (genotypic/phenotypic).
- Software Design: BISON, an ontology for bioinformatics interoperability.
- Biomedical Database Integration: GeneTegrate, a semantic layer for bioinformatics data integration.
5. Core II: Bioinformatics
Coordinator: Rost. Leaders: Honig, Bussemaker, Califano, Rzhetsky, Lussier. Investigators: Yemini, Ofran, Petrey, Long, Anastassiou, Leslie, Pavlidis, Wiggins, Friedman.
- Protein Structure and Function: Sequence- and structure-based annotation of protein function (specifically protein-protein interactions).
- Reverse Engineering of Cellular Networks: An integrated knowledge base of cellular interactions in human B lymphocytes.
- Cellular and Molecular Context: Using cellular and molecular phenotypes for context filtering.
- MAGNet Tools: Software platform (geWorkbench).
Hot Topic
Hot Topic
6. Core III: Driving Biological Projects
Coordinator: Califano. Leaders: Shapiro, Dalla Favera, Gilliam. Investigators:
- Cell Adhesion: Structural and energetic basis of cadherin binding specificity. A combined computational and experimental study.
- Pathway Dysregulation: Regulatory modules in normal and transformed B cells.
- Complex Diseases: Genomic and bioinformatics solutions to the search for genetic determinants of common, heritable disorders: Alzheimer's disease and autism.
Hot Topic
7. Core IV: Infrastructure (cont.)
MAGNet/C2B2
8. Hot Topic: B-Cell Knowledge Base
Basic Science
- Everything you always wanted to know about B cells but were afraid to ask
H. Bussemaker, A. Califano, R. Dalla Favera, C. Leslie, A. Rzhetsky, C. Wiggins
9. Knowledge Base for Human B Lymphocytes
- Integrative
  - Bayesian evidence integration of pairwise interactions
  - Protein-protein, protein-DNA
- Context-specific
  - ARACNE, GeneWays, REDUCE
  - B-cell data or B-cell-specific criteria
- Linked to one of the largest B-cell expression profile microarray datasets, ChIP-chip assays (MYC/BCL6), miRNA profiles, and literature
- Captures multivariate dependencies
  - Three-way interactions via MINDY and MatrixREDUCE
  - Post-translational modulation of transcriptional regulation
  - Combinatorial transcriptional regulation
  - Signal transduction control of transcriptional regulation, i.e. the Transferome meets the Transcriptome
- Links to literature, via GeneWays
10. Knowledge Base for Human B Lymphocytes V1.0
11. Integrating Protein-DNA and Protein-Protein Interactions via Naïve Bayes Classification
- Protein-Protein Interactions (PPIs)
  - Human PPI databases
    - Human Protein Reference Database (HPRD)
    - Biomolecular Interaction Network Database (BIND)
    - Database of Interacting Proteins (DIP) and IntAct
    - Y2H studies (2 in human)
  - Eukaryotic PPIs via orthologous genes (Inparanoid): MIPS, BIND, IntAct
  - GeneWays predictions (context-specific literature analysis)
  - Co-expression analysis (mutual information)
  - Gene Ontology classification (biological process/compartment)
- Protein-DNA Interactions (PDIs)
  - Human PDI databases
    - TRANSFAC, BIND, MycDB
  - Mouse PDI databases (TRANSFAC, BIND) via orthologous genes (Inparanoid)
  - ARACNE (bootstrap-TF)
  - GeneWays predictions (context-specific literature analysis)
- 49,719 interactions (4,944 genes)
  - 27,705 PPIs (4,209 genes)
  - 22,014 PDIs (3,216 genes/457 TFs)
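As a rough illustration of the naive Bayes integration named on this slide: each evidence source contributes a likelihood ratio estimated from gold-standard positive and negative interaction sets, and the ratios of the sources observed for a candidate pair multiply into the prior odds. The sketch below assumes conditionally independent sources. The class and method names, the add-one smoothing, and all counts are hypothetical and are not taken from the MAGNet pipeline.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: naive Bayes integration of evidence sources for one candidate interaction. */
class NaiveBayesEvidenceSketch {

    /** Likelihood ratio of one evidence source, estimated from gold-standard counts. */
    static double likelihoodRatio(int posWithEvidence, int posTotal,
                                  int negWithEvidence, int negTotal) {
        // Add-one smoothing (an assumption) keeps the ratio finite when a count is zero.
        double pGivenTrue = (posWithEvidence + 1.0) / (posTotal + 2.0);
        double pGivenFalse = (negWithEvidence + 1.0) / (negTotal + 2.0);
        return pGivenTrue / pGivenFalse;
    }

    /** Posterior odds = prior odds times the product of the observed likelihood ratios. */
    static double posteriorOdds(double priorOdds, List<Double> observedRatios) {
        double logOdds = Math.log(priorOdds);
        for (double ratio : observedRatios) {
            logOdds += Math.log(ratio); // summing logs avoids overflow and underflow
        }
        return Math.exp(logOdds);
    }

    /** Converts odds to a probability in [0, 1). */
    static double oddsToProbability(double odds) {
        return odds / (1.0 + odds);
    }

    public static void main(String[] args) {
        // Hypothetical gold-standard counts for two evidence sources.
        double literature = likelihoodRatio(300, 1000, 20, 10000);
        double coExpression = likelihoodRatio(400, 1000, 1000, 10000);
        // Hypothetical prior odds that a random gene pair interacts.
        double odds = posteriorOdds(1.0 / 600.0, Arrays.asList(literature, coExpression));
        System.out.printf("posterior probability = %.3f%n", oddsToProbability(odds));
    }
}
```

Working in log space is a design convenience: with many evidence sources the product of ratios can become very large or very small.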
12. Network Motifs
Protein complexes
Regulatory Motifs
13. Definition of a Modulator
- Modulator: a gene capable of modulating the activity of a transcription factor at the post-transcriptional level, i.e. without affecting its mRNA concentration (e.g. an activating kinase, a co-factor, etc.)
14. Algorithm Workflow
- Statistical tests
- The gene g_m is a modulator of the interaction g_TF → g_t if:
  - The modulator has sufficient expression range
  - The modulator is statistically independent of the TF, i.e. it does not condition the TF expression range
  - The conditional MI difference is statistically different from zero
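The last test on this slide compares the TF-target mutual information (MI) between samples where the candidate modulator is highly expressed and samples where it is lowly expressed. Below is a minimal sketch of that comparison, assuming a simple equal-width histogram estimator of MI and a user-chosen tail fraction. Both are illustrative choices, not necessarily those of the method presented here, and all names are hypothetical. Assessing whether the difference is statistically different from zero, for example against a permutation null, is not shown.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Sketch: difference in TF-target mutual information between high- and low-modulator samples. */
class ConditionalMiSketch {

    /** Histogram estimate of mutual information (in nats) using equal-width bins. */
    static double mutualInformation(double[] x, double[] y, int bins) {
        int n = x.length;
        int[] bx = binIndices(x, bins);
        int[] by = binIndices(y, bins);
        double[][] joint = new double[bins][bins];
        double[] px = new double[bins];
        double[] py = new double[bins];
        for (int i = 0; i < n; i++) {
            joint[bx[i]][by[i]] += 1.0 / n;
            px[bx[i]] += 1.0 / n;
            py[by[i]] += 1.0 / n;
        }
        double mi = 0.0;
        for (int a = 0; a < bins; a++) {
            for (int b = 0; b < bins; b++) {
                if (joint[a][b] > 0) {
                    mi += joint[a][b] * Math.log(joint[a][b] / (px[a] * py[b]));
                }
            }
        }
        return mi;
    }

    /** Maps each value to a bin index in [0, bins - 1]. */
    static int[] binIndices(double[] v, int bins) {
        double min = Arrays.stream(v).min().orElse(0.0);
        double max = Arrays.stream(v).max().orElse(0.0);
        double width = (max - min) / bins;
        int[] idx = new int[v.length];
        for (int i = 0; i < v.length; i++) {
            int b = width == 0 ? 0 : (int) ((v[i] - min) / width);
            idx[i] = Math.min(b, bins - 1); // the maximum value falls into the last bin
        }
        return idx;
    }

    /** MI(TF; target) in the top tail of modulator expression minus MI in the bottom tail. */
    static double deltaMi(double[] modulator, double[] tf, double[] target,
                          double fraction, int bins) {
        int n = modulator.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Sort sample indices by increasing modulator expression.
        Arrays.sort(order, Comparator.comparingDouble(i -> modulator[i]));
        int k = (int) Math.floor(fraction * n);
        double[] tfLow = new double[k];
        double[] targetLow = new double[k];
        double[] tfHigh = new double[k];
        double[] targetHigh = new double[k];
        for (int j = 0; j < k; j++) {
            tfLow[j] = tf[order[j]];
            targetLow[j] = target[order[j]];
            tfHigh[j] = tf[order[n - 1 - j]];
            targetHigh[j] = target[order[n - 1 - j]];
        }
        return mutualInformation(tfHigh, targetHigh, bins)
                - mutualInformation(tfLow, targetLow, bins);
    }

    public static void main(String[] args) {
        // Toy data: the target tracks the TF only when the modulator is high.
        double[] modulator = {1, 2, 3, 4, 5, 6, 7, 8};
        double[] tf = {0, 0, 1, 1, 0, 0, 1, 1};
        double[] target = {0, 1, 0, 1, 0, 0, 1, 1};
        System.out.println("delta MI = " + deltaMi(modulator, tf, target, 0.5, 2));
    }
}
```

The other two tests on the slide could reuse the same estimator, for example by checking that the MI between modulator and TF is close to zero.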
15. An Example: JUN as Cofactor of MYC
DNA binding sites (-2 kb, 2 kb)
Mutual Information
Candidate Targets
16. MYC Modulation by Transferases
17. MYC Modulation by Co-Transcription Factors
18. Hot Topic: Cadherins
Biomedical Applications
- Elucidating cadherin binding specificity
L. Shapiro, B. Honig
19. Differential expression of cadherins is critical for vertebrate development
20. Cadherin structure
21. Electron tomography reconstructions of desmosomes.
He et al., Science, 302, 109-113, 2003.
22. [Figure labels: residues 8 and 23; Asparagine, Arginine; Serine, Glutamine]
25. 8N>S, 23R>Q N-Cadherin
26. Homophilic adhesion is retained in the N-8/23 mutant
[Figure panels: N-cadherin, N-8/23 mutant, E-cadherin]
27. But adhesive specificity is converted to that of E-cadherin
[Figure labels: N-cad and E-cad; N-8/23 and E-cad; N-8/23 and N-cad; magnifications 4X, 10X]
28. Hot Topic: geWorkbench
MAGNet Software Tools
- An interoperable platform for integrative genomics research
Columbia University: A. Califano, A. Floratos
The Broad Institute (GenePattern): Jill Mesirov, Michael Reich
29. geWorkbench (genomic Workbench)
- Based on caWorkbench, an NCI/caBIG-funded effort
- Open-source, Java-based platform
- Integrated genomics platform
  - Support for gene expression data, sequences, pathways, structure, etc. (40 visualization and analysis modules).
  - Access to local and remote data sources and analytical services.
  - Support for workflow scripting.
  - Integration with caGRID.
- Development framework
  - Open-source development.
  - Modular/extensible architecture, supporting pluggable components with a configurable user interface.
  - Formal (caBIG-registered) data models for a multitude of bioinformatics concepts.
  - Easy integration of 3rd party components.
30. geWorkbench
- Major effort on making the platform broadly available to and extensible by the biomedical research community
- Main vehicle for integration and dissemination of MAGNet tools
  - B-cell interaction knowledge base
  - ARACNE
  - REDUCE
  - MEDUSA
  - GeneWays
  - Protein Structure Pipeline
  - JMol (open-source molecular viewer)
  - Functional Annotation Pipeline
  - Etc.
31. Integration of 3rd party components
Cytoscape
GenePattern
MatrixREDUCE
GoMiner
32. BISON: Biomedical Informatics Structured Ontology
Component A

    @Publish
    public DSDataSet publish(. . .) {
        DSDataSet dataSet;
        // do some work that assigns a value to dataSet.
        return dataSet;
    }

Component B

    @Subscribe
    public void receive(DSDataSet dataSet, Object source) {
        // Consume the argument dataSet, as appropriate
    }

- Provide reusable models of common bioinformatics concepts
  - Data: sequence, expression, genotype, structure, proteomics
  - Complex data structures: patterns, clusters, HMMs, PSSMs, alignments
  - Algorithms: clustering, matching, discovery, normalization, filtering
- Provide a foundation for the development of interoperable geWorkbench components
- Endorsed by multiple communities (caBIG, AMDeC, NCBCs)
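The publish/subscribe pattern on this slide lets one component hand a data set to another without holding a direct reference to it. The sketch below is a self-contained toy version of that idea. The `Publish` and `Subscribe` annotations, `DataSetStub`, and `MiniBus` are declared locally and are hypothetical stand-ins, not the geWorkbench or BISON API.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical dispatcher: forwards results of @Publish methods to @Subscribe methods. */
class MiniBus {
    private final List<Object> components = new ArrayList<>();

    void register(Object component) {
        components.add(component);
    }

    /** Invokes every no-argument @Publish method on source and delivers each result. */
    void publishAll(Object source) {
        try {
            for (Method pub : source.getClass().getMethods()) {
                if (!pub.isAnnotationPresent(Publish.class) || pub.getParameterCount() != 0) {
                    continue;
                }
                Object payload = pub.invoke(source);
                for (Object target : components) {
                    if (target != source) {
                        deliver(target, payload, source);
                    }
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("dispatch failed", e);
        }
    }

    /** Calls each @Subscribe method whose first parameter accepts the payload type. */
    private void deliver(Object target, Object payload, Object source)
            throws ReflectiveOperationException {
        for (Method sub : target.getClass().getMethods()) {
            Class<?>[] params = sub.getParameterTypes();
            if (sub.isAnnotationPresent(Subscribe.class)
                    && params.length == 2 && params[0].isInstance(payload)) {
                sub.invoke(target, payload, source);
            }
        }
    }

    public static void main(String[] args) {
        MiniBus bus = new MiniBus();
        ComponentA a = new ComponentA();
        ComponentB b = new ComponentB();
        bus.register(a);
        bus.register(b);
        bus.publishAll(a);
        System.out.println(b.received);
    }
}

/** Hypothetical marker: the method's return value is broadcast to subscribers. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Publish {}

/** Hypothetical marker: the method receives published objects of a matching type. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Subscribe {}

/** Hypothetical stand-in for a shared data model type. */
class DataSetStub {
    final String name;

    DataSetStub(String name) {
        this.name = name;
    }
}

class ComponentA {
    @Publish
    public DataSetStub publish() {
        return new DataSetStub("expression-matrix");
    }
}

class ComponentB {
    final List<String> received = new ArrayList<>();

    @Subscribe
    public void receive(DataSetStub dataSet, Object source) {
        received.add(dataSet.name + " from " + source.getClass().getSimpleName());
    }
}
```

Matching subscribers by the payload's type is what makes shared data models useful here: two components only need to agree on the model class, not on each other.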
33. geWorkbench Resources
http://www.geworkbench.org/
34. The DREAM Project
35. Web Resources
- Center Overview
  - magnet.c2b2.columbia.edu
- geWorkbench
  - www.geworkbench.org
- DREAM Project
  - www.nyas.org/ebriefreps/splash.asp?intEbriefID=534