Title: Inferring transcriptional networks II
1Inferring transcriptional networks II
CSCI5461 Functional Genomics, Systems Biology
and Bioinformatics
- Rui Kuang and Chad Myers
- Department of Computer Science and Engineering
- University of Minnesota
2Announcements
- Project proposals due Wed. 4/8!
- Get started on your projects! (See us if you need help with data, etc.)
- Remember to read the paper for next time!
- M. Middendorf, E. Ziv, C. H. Wiggins. Inferring network mechanisms: the Drosophila melanogaster protein interaction network. Proc Natl Acad Sci U S A, 102(9):3192-7.
3Outline for today
- Finish inferring transcriptional networks from gene-expression data using Bayesian networks
- Paper discussion (Dynamic Bayes nets)
- Wrap-up of regulatory network inference, new directions, etc.
4A reminder about gene transcription
Transcription Factors (proteins)
RNA polymerase (protein)
5' C T A A T G T . . . 3'
3' G A T T A C A . . . 5'
Binding sites
Transcription factors recognize transcription factor binding sites and bind to them, forming a complex. RNA polymerase binds the complex.
Protein-DNA interaction!
(eukaryotes)
5Gene Transcription
Transcription Factors (proteins)
RNA polymerase (protein)
5' C T A A T G T . . . 3'
3' G A T T A C A . . . 5'
Transcription factors recognize transcription factor binding sites and bind to them, forming a complex. RNA polymerase binds the complex.
(eukaryotes)
6Gene Transcription
5' G A T T A C A . . . 3'
3' C T A A T G T . . . 5'
The two strands are separated
(eukaryotes)
7Gene Transcription
5' G A T T A C A . . . 3'
G A U U A C A (RNA copy)
3' C T A A T G T . . . 5'
An RNA copy of the 5'→3' sequence is created from the 3'→5' template until a termination sequence is reached.
(eukaryotes)
8Inferring regulatory networks from expression data
Transcriptional regulatory network
Microarray data
9The sprinkler Bayes net
Prior probability that it is cloudy
Conditional probability that it rains when it's cloudy
Probability that grass is wet when the sprinkler is off and it rains
- What are the conditional independence relations implied?
- What is the joint probability?
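As a rough illustration of the two questions above, the sketch below writes the joint probability of the sprinkler network as a product of one term per node given its parents, then checks one conditional independence numerically. The probability values are assumptions borrowed from the commonly used textbook version of this example, since the slide's own tables are not preserved in this transcript.

```python
from itertools import product

# Illustrative CPT values (assumed; not taken from the slide).
P_C = 0.5                                   # P(Cloudy = 1)
P_S = {1: 0.1, 0: 0.5}                      # P(Sprinkler = 1 | Cloudy)
P_R = {1: 0.8, 0: 0.2}                      # P(Rain = 1 | Cloudy)
P_W = {(1, 1): 0.99, (1, 0): 0.9,           # P(WetGrass = 1 | Sprinkler, Rain)
       (0, 1): 0.9, (0, 0): 0.0}


def bern(p_one, value):
    """Probability of a binary value given the probability of value 1."""
    return p_one if value == 1 else 1.0 - p_one


def joint(c, s, r, w):
    """P(C, S, R, W) = P(C) P(S|C) P(R|C) P(W|S,R)."""
    return (bern(P_C, c) * bern(P_S[c], s)
            * bern(P_R[c], r) * bern(P_W[(s, r)], w))


states = list(product((0, 1), repeat=4))

# The factorized joint must sum to 1 over all 16 assignments.
total = sum(joint(*x) for x in states)

# Marginal probability that the grass is wet.
p_wet = sum(joint(c, s, r, w) for c, s, r, w in states if w == 1)

# Sprinkler and Rain are conditionally independent given Cloudy:
# P(S=1, R=1 | C=1) should equal P(S=1 | C=1) * P(R=1 | C=1).
p_sr_given_c = sum(joint(1, 1, 1, w) for w in (0, 1)) / P_C

print(total, p_wet, p_sr_given_c, P_S[1] * P_R[1])
```

The point of the factorization is that four small tables stand in for a full table over all 16 joint states; the same saving is what makes Bayes nets attractive for networks over many genes.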
10Applying Bayes nets to transcriptional network
inference
Gene 2
Gene 4
Gene 3
Gene 1
Gene 5
Gene 6
Transcriptional regulatory network
Microarray data
- Challenges:
- We don't know either the structure or the conditional probabilities!
- Expression data are noisy
- Even if expression data were perfect, they don't capture the complete picture (e.g. post-transcriptional/translational regulation)
11Bayesian structure/parameter learning
Likelihood of data given model
Posterior probability of model given data
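The two labeled quantities are related by Bayes' rule. As a hedged reconstruction (the slide's own notation is not preserved in this transcript, so the symbols G for the network structure, D for the expression data, and θ for the conditional probability parameters are assumptions):

```latex
P(G \mid D) \;=\; \frac{P(D \mid G)\, P(G)}{P(D)} \;\propto\; P(D \mid G)\, P(G),
\qquad
P(D \mid G) \;=\; \int P(D \mid G, \theta)\, P(\theta \mid G)\, d\theta
```

Because P(D) does not depend on G, candidate structures can be compared using only the likelihood term and the structure prior.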
12Structure scoring criteria
Assuming multinomial distribution with Dirichlet prior (analytical solution)
- Heckerman, "A Tutorial on Learning With Bayesian Networks", and Neapolitan, "Learning Bayesian Networks"
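With multinomial conditional distributions and Dirichlet priors, the marginal likelihood of the data has a closed form that factors into one term per node and its parents. The sketch below computes that per-family term in log space. It is a minimal illustration, not the exact score shown on the slide: the function name `bd_log_score`, the uniform pseudocount `alpha = 1`, and the toy data are all assumptions.

```python
from collections import Counter
from itertools import product
from math import lgamma


def bd_log_score(data, child, parents, arity=2, alpha=1.0):
    """Log marginal likelihood of one node given its parents.

    data    : list of tuples of discrete values in range(arity)
    child   : column index of the node being scored
    parents : tuple of column indices of its candidate parents
    alpha   : Dirichlet pseudocount per (parent config, child value); assumed uniform
    """
    counts = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    score = 0.0
    for config in product(range(arity), repeat=len(parents)):
        n_jk = [counts[(config, k)] for k in range(arity)]
        n_j = sum(n_jk)
        # Normalizer term for this parent configuration.
        score += lgamma(arity * alpha) - lgamma(arity * alpha + n_j)
        # One term per child value under this configuration.
        score += sum(lgamma(alpha + n) - lgamma(alpha) for n in n_jk)
    return score


# Toy discretized expression data: column 1 copies column 0 exactly.
toy = [(0, 0)] * 10 + [(1, 1)] * 10

with_parent = bd_log_score(toy, child=1, parents=(0,))
without_parent = bd_log_score(toy, child=1, parents=())
print(with_parent, without_parent)
```

On this toy data the family with the parent scores higher than the family without it, which is the behavior a structure search relies on. The full network score would sum such terms over all nodes and add the log structure prior.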
13Dynamic Bayesian networks paper discussion
14Question: can we use prior knowledge of transcription factors to help in network inference?
Transcription Factors (proteins)
RNA polymerase (protein)
5' C T A A T G T . . . 3'
3' G A T T A C A . . . 5'
Binding sites
(eukaryotes)
15Answer: Yes!
Segal et al. Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nature Genetics 34, 166-176 (2003)
- Basic idea:
- start with known transcription factors
- simultaneously learn the regulatory program and the regulated module groups
16Preprocessing
- Candidate regulators are chosen from among known and suspected transcription factors and signal transduction molecules. An informed choice of candidate regulators makes the algorithm workable; without selectivity, bad results are likely.
17Module network procedure
- Genes are partitioned into modules, and a regulation program is sought for each module to explain gene expression in that module.
18Post-processing
- Enrichment of annotations for predicted modules is sought in the literature; enrichment of regulatory motifs is sought within 500 base pairs upstream of genes.
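One common way to quantify this kind of enrichment is a hypergeometric tail probability: given how many genes in the genome carry an annotation (or motif), how surprising is the overlap with a module? The sketch below is an assumption about how such a test could be set up, not a description of the statistics used in the paper; the function name and the example numbers are hypothetical.

```python
from math import comb


def enrichment_pvalue(n_genome, n_annotated, n_module, n_overlap):
    """P(X >= n_overlap) when n_module genes are drawn without replacement
    from n_genome genes, of which n_annotated carry the annotation."""
    upper = min(n_module, n_annotated)
    tail = sum(
        comb(n_annotated, i) * comb(n_genome - n_annotated, n_module - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / comb(n_genome, n_module)


# Hypothetical numbers: 6000 genes, 100 annotated, a 50-gene module, 10 in common.
p = enrichment_pvalue(6000, 100, 50, 10)
print(p)
```

When many modules and annotations are tested, the resulting p-values would normally need a multiple-testing correction.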
19Regulator program
20Learning module networks/regulatory programs
- Iteration:
- Search for a regulation program for each module
- Re-assign each gene to the module whose program best predicts its behavior
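The alternation above can be sketched as a small loop. This is a deliberately simplified stand-in, not the module networks algorithm itself: each "program" here is a single up/down split on one candidate regulator (a one-level regression tree), and all names (`fit_program`, `learn_modules`, the toy regulators R1 and R2) are hypothetical.

```python
def fit_program(reg_expr, members):
    """Pick the regulator whose up/down state best explains the module mean."""
    n_cond = len(members[0])
    mean = [sum(g[c] for g in members) / len(members) for c in range(n_cond)]
    overall = sum(mean) / n_cond
    best_reg, best_pred = None, [overall] * n_cond   # fallback: no regulator
    best_err = sum((m - overall) ** 2 for m in mean)
    for name, r in reg_expr.items():
        up = [c for c in range(n_cond) if r[c] > 0]
        down = [c for c in range(n_cond) if r[c] <= 0]
        if not up or not down:
            continue                                  # regulator never changes state
        mu_up = sum(mean[c] for c in up) / len(up)
        mu_down = sum(mean[c] for c in down) / len(down)
        pred = [mu_up if r[c] > 0 else mu_down for c in range(n_cond)]
        err = sum((m - p) ** 2 for m, p in zip(mean, pred))
        if err < best_err:
            best_reg, best_pred, best_err = name, pred, err
    return best_reg, best_pred


def sq_err(profile, pred):
    return sum((x - p) ** 2 for x, p in zip(profile, pred))


def fit_all(genes, reg_expr, assign):
    """Fit one program per non-empty module."""
    return {m: fit_program(reg_expr, [genes[g] for g in genes if assign[g] == m])
            for m in set(assign.values())}


def learn_modules(genes, reg_expr, assign, max_iter=20):
    """Alternate between fitting programs and re-assigning genes."""
    for _ in range(max_iter):
        programs = fit_all(genes, reg_expr, assign)
        new_assign = {g: min(programs, key=lambda m: sq_err(genes[g], programs[m][1]))
                      for g in genes}
        if new_assign == assign:
            break                                     # converged
        assign = new_assign
    return assign, fit_all(genes, reg_expr, assign)


# Toy data: four conditions, two regulators, four genes; gene d starts misassigned.
regulators = {"R1": [1, 1, -1, -1], "R2": [1, -1, 1, -1]}
genes = {"a": [2.0, 2.0, -2.0, -2.0], "b": [1.8, 2.1, -2.0, -1.9],
         "c": [2.0, -2.0, 2.0, -2.0], "d": [1.9, -2.1, 2.0, -1.8]}
final_assign, final_programs = learn_modules(
    genes, regulators, {"a": 0, "b": 0, "c": 1, "d": 0})
print(final_assign, {m: p[0] for m, p in final_programs.items()})
```

Restricting the search to a short list of candidate regulators is what keeps the program-fitting step tractable, which matches the preprocessing point made earlier in the lecture.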
21An example result: respiration and carbon regulation module
22Validating module results
- Compare module gene set with GO terms
- 31 modules: > 50% functional coherence
- 4 modules: < 30% functional coherence
- Look for enriched upstream sequence elements (regulator binding sites)
- No sequence information was used for defining groups!
23Module summary
25Are these learned models predictive?
- Experimental validation
- Knock out 3 predicted regulators and check:
- Does it cause a change under the predicted condition?
- Does it affect the predicted set of genes?
- Does the function of the predicted regulator match the prediction?
26Results summary
- The method is able to accurately predict:
- (1) functions for regulators
- (2) known transcription factor targets
- (3) the conditions under which regulation occurs
- What general principles about successful methods for inferring networks can we learn from this example?
27New trends in transcriptional network inference
- Sub-structure learning
- Can we extend these models to infer complete networks over all genes?
- Basic idea: learn small sub-networks and stitch them together
- Incorporating perturbations into the network inference process (more on this later)
- Models that leverage more prior knowledge (e.g. TF binding site info)
28Experimental determination of protein-DNA
interactions
ChIP-chip: chromatin immunoprecipitation + chip (microarray)
(antibodies bind the transcription factor of interest)
(TF-bound sequences are hybridized to a microarray)
Simon et al., Cell 2001
29Mapped transcription factor binding sites in
yeast (based on ChIP-chip)
Harbison C., Gordon B., et al. Nature 2004
30Example: MEDUSA - learning regulatory programs from known TFs and binding sites
Kundaje A, Lianoglou S, Li X, Quigley D, Arias M, Wiggins CH, Zhang L, Leslie C. Learning regulatory programs that accurately predict differential expression with MEDUSA. Ann N Y Acad Sci. 2007 Dec;1115:178-202.
31MEDUSA performance (yeast gene expression)
Evaluation: can regulatory programs predict up/down expression of held-out differentially expressed genes?
Kundaje A, Lianoglou S, Li X, Quigley D, Arias M, Wiggins CH, Zhang L, Leslie C. Learning regulatory programs that accurately predict differential expression with MEDUSA. Ann N Y Acad Sci. 2007 Dec;1115:178-202.
32Summary of transcriptional network inference
- Learning network structure is hard! (model space >> data)
- Bayesian models provide a reasonable framework for learning these models
- Incorporating prior knowledge is key!
- Validation is and will continue to be an issue (very few gold standards)