Title: Probabilistic Models that uncover the hidden Information Flow in Signalling Networks
1Probabilistic Models that uncover the hidden
Information Flow in Signalling Networks
2Which model?
A model that explains the data merely finds
associations. E.g. epidemiology (predict colon
cancer risk from SNPs).
A model that explains the mechanism
finds explanations. E.g. physics, systems
biology (predict the signal flow through a
cascade of transcription factors).
3Which model?
Our choice: Graphical Models. Nodes correspond to
physical entities, arrows correspond to
interactions.
Two different types of nodes: observable
components and perturbed components (signals).
Need for interventional data.
4How do marionettes walk?
5How do marionettes walk?
This is what we observe
This is the true model
Both models explain the observations perfectly.
What makes the right model (biologically) more
plausible?
6How do marionettes walk?
This is what we observe
This is the true model
Both models explain the observations perfectly.
Signals, signal graph G
Observables, effects graph T
What makes a model (biologically) more plausible?
7Nested Effects Models
Signals
Signal graph, adjacency matrix G (with 1s on
the diagonal)
Observables
Effects graph, adjacency matrix T
Predicted effects Ft
Parsimony assumption: each observable is linked
to exactly one action.
Definition (Markowetz, Bioinformatics 2005): A
Nested Effects Model (NEM) is a model F for
which F = GT.
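The definition F = GT can be made concrete with a small sketch. It assumes the parsimony assumption holds, so the effects graph T can be stored as a list `theta` that names the single signal each observable is attached to; the function name, the variable names and the example graph are illustrative and do not come from the slides.

```python
def predicted_effects(G, theta):
    """Return F = G T as a 0/1 matrix with rows = signals, columns = observables.

    G[s][t] == 1 means signal s influences signal t (diagonal entries are 1).
    theta[a] is the one signal that observable a is attached to; this encodes
    the effects graph T under the parsimony assumption (an assumption of this
    sketch about how T is stored).
    """
    n = len(G)
    # T has exactly one 1 per column, so (G T)[s][a] reduces to G[s][theta[a]].
    return [[G[s][theta[a]] for a in range(len(theta))] for s in range(n)]


# Hypothetical example: chain s0 -> s1 -> s2 without the shortcut s0 -> s2.
G = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
theta = [0, 1, 1, 2]  # observable 0 hangs on s0, observables 1 and 2 on s1, 3 on s2
F = predicted_effects(G, theta)
```

Each row of `F` lists the observables predicted to respond when the corresponding signal is perturbed.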
8Nested Effects Models
Signals
Why "nested"?
If the signal graph is transitively closed, then
the observed effects are nested in the sense that
a → b implies effects(a) ⊇ effects(b).
Observables
Predicted effects Ft
Predicted effects
The present formulation of a NEM drops the
transitivity requirement.
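The nesting property can be checked mechanically. The sketch below assumes the same encoding as before (signal graph `G` as a 0/1 matrix with 1s on the diagonal, effects graph as a list `theta`); all names and the example are hypothetical. It shows that a chain without its shortcut edge violates nesting, while its transitive closure satisfies it.

```python
def transitive_closure(G):
    """Warshall's algorithm on a 0/1 adjacency matrix (returns a copy)."""
    n = len(G)
    C = [row[:] for row in G]
    for k in range(n):
        for i in range(n):
            if C[i][k]:
                for j in range(n):
                    if C[k][j]:
                        C[i][j] = 1
    return C


def effects(G, theta, s):
    """Set of observables predicted to respond when signal s is perturbed."""
    return {a for a, t in enumerate(theta) if G[s][t]}


def is_nested(G, theta):
    """True if every edge a -> b satisfies effects(a) >= effects(b)."""
    n = len(G)
    return all(effects(G, theta, a) >= effects(G, theta, b)
               for a in range(n) for b in range(n) if G[a][b])


# Hypothetical chain s0 -> s1 -> s2 without the shortcut s0 -> s2.
G = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
theta = [0, 1, 1, 2]

nested_before = is_nested(G, theta)                     # shortcut missing
nested_after = is_nested(transitive_closure(G), theta)  # shortcut added
```

In the non-transitive chain, perturbing s0 does not reach the observable attached to s2, so effects(s0) is not a superset of effects(s1).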
9Nested Effects Models
Effect of signal s on observable a
Signals
s
Ra,s
a
Observables
Predicted effects Ft
Measured effects Rt
The final ingredient: a quantification of the
measured effect strength.
Ra,s > 0 if the data favours an effect of s on a.
10Nested Effects Models
Assuming independent data, it follows that
Note: Missing data is handled easily: set Ra,s = 0.
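The formula on this slide did not survive extraction. One reading that is consistent with the surrounding text is sketched here, under two assumptions that the transcript does not confirm: each entry of R is a log-likelihood ratio (effect versus no effect), and the score of a model F is the sum of R over all entries where F predicts an effect. In the code, `R[s][a]` holds the slide's Ra,s, with rows indexed by signals.

```python
def log_score(F, R):
    """Sum R[s][a] over all (s, a) pairs for which the model predicts an effect.

    Assumption of this sketch: R[s][a] is a log-likelihood ratio, so under
    independence the per-entry contributions simply add up.
    """
    return sum(r
               for f_row, r_row in zip(F, R)
               for f, r in zip(f_row, r_row)
               if f)


# Hypothetical 2 signals x 2 observables example.
# R[0][1] = 0.0 marks a missing measurement.
R = [[2.0, 0.0],
     [-1.5, 0.5]]
F_with = [[1, 1],
          [0, 1]]     # predicts an effect at the missing entry
F_without = [[1, 0],
             [0, 1]]  # predicts no effect at the missing entry

score_with = log_score(F_with, R)
score_without = log_score(F_without, R)
```

Both models receive the same score, which illustrates why an entry set to 0 neither supports nor penalises a predicted effect.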
11NEM Estimation
There are two ways of finding a high-scoring NEM:
Maximum Likelihood
Theorem (Tresch, SAGeMB 2008): For ideal data,
the maximum likelihood estimate is unique up to
reversals (Corollary if G is a DAG).
Bayesian, posterior mode
For n ≤ 5 signals, an exhaustive parameter space
search is possible. For larger n, apply standard
optimization strategies (gradient ascent,
simulated annealing) or heuristics tailored to
NEMs: module networks (Fröhlich et al., BMC
Bioinformatics 2007), triplet search (Markowetz
et al., Bioinformatics 2007).
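For a handful of signals the exhaustive search can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation of either Bioconductor package: the score is taken to be the sum of R over predicted effects, each observable is attached to whichever signal maximises its contribution, and `R[s][a]` is indexed with signals as rows. All function names are hypothetical.

```python
from itertools import product


def best_attachment_score(G, R):
    """Score of G when every observable is attached to its best signal."""
    n, m = len(G), len(R[0])
    total = 0.0
    for a in range(m):
        # Attaching a to signal t predicts effects for all s with G[s][t] == 1.
        total += max(sum(R[s][a] for s in range(n) if G[s][t])
                     for t in range(n))
    return total


def all_signal_graphs(n):
    """Yield all 2**(n*(n-1)) adjacency matrices with 1s on the diagonal."""
    off_diag = [(i, j) for i in range(n) for j in range(n) if i != j]
    for bits in product((0, 1), repeat=len(off_diag)):
        G = [[int(i == j) for j in range(n)] for i in range(n)]
        for (i, j), b in zip(off_diag, bits):
            G[i][j] = b
        yield G


def exhaustive_search(R):
    """Return (score, G) with the highest score; ties keep the first graph."""
    best = None
    for G in all_signal_graphs(len(R)):
        score = best_attachment_score(G, R)
        if best is None or score > best[0]:
            best = (score, G)
    return best


# Hypothetical ideal data for the transitively closed chain s0 -> s1 -> s2,
# one observable per signal: +1 where an effect is expected, -1 elsewhere.
R = [[1, 1, 1],
     [-1, 1, 1],
     [-1, -1, 1]]
score, G_hat = exhaustive_search(R)
n_graphs = sum(1 for _ in all_signal_graphs(3))
```

The number of candidate graphs grows as 2^(n(n-1)), which is why the enumeration is only practical for very small n and heuristics take over beyond that.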
12Simulation
R/Bioconductor package Nessy
True graphs G, T
simulated measurements (R)
ideal measurements (GT)
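The simulation setup can be imitated with a toy generator. The noise model used by the Nessy package is not described on the slide, so the sketch below substitutes a simple assumption: ideal measurements are the entries of GT coded as +1 (effect) and -1 (no effect), and each entry has its sign flipped independently with probability `flip_prob`. Function and parameter names are hypothetical.

```python
import random


def simulate_measurements(G, theta, flip_prob, seed=0):
    """Return (ideal, R): ideal 0/1 effects G T and noisy +/-1 measurements.

    Sign-flip noise is an assumption of this sketch, not the package's model.
    """
    rng = random.Random(seed)  # local generator keeps the run reproducible
    n = len(G)
    ideal = [[G[s][theta[a]] for a in range(len(theta))] for s in range(n)]
    R = []
    for row in ideal:
        noisy_row = []
        for f in row:
            value = 1.0 if f else -1.0
            if rng.random() < flip_prob:
                value = -value
            noisy_row.append(value)
        R.append(noisy_row)
    return ideal, R


# Hypothetical true graphs: transitively closed chain, four observables.
G = [[1, 1, 1],
     [0, 1, 1],
     [0, 0, 1]]
theta = [0, 1, 1, 2]
ideal, R_clean = simulate_measurements(G, theta, flip_prob=0.0)
_, R_flipped = simulate_measurements(G, theta, flip_prob=1.0)
```

With `flip_prob=0.0` the measurements reproduce the ideal pattern exactly; intermediate values give data of the kind an estimation procedure has to cope with.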
13Simulation
True graph
Estimated graph
12 edges, 2^12 = 4096 signal graphs, 4 seconds
Distribution of the likelihoods
14Application Synthetic Lethality
- Hypotheses
- SL between two genes occurs if the genes are
located in different pathways
- Genes sharing the same synthetic lethality
partners have an increased chance of being
located in the same pathway (Ye, Bader et al.,
Mol. Systems Biology 2005)
Figure: Pathway I and Pathway II (node labels 1, a, 2, b, 3)
- Consequence
- A gene b whose SL partners are nested into the SL
partners of another gene a is likely to be
located beneath a in the same pathway.
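The consequence above suggests a simple screening step: look for gene pairs whose SL partner sets are nested. The sketch below is only an illustration with made-up gene names; it treats "nested" as a proper, non-empty subset, which is an assumption of this example rather than something the slide specifies.

```python
def nested_pairs(sl_partners):
    """Return sorted (a, b) pairs where b's SL partners are a proper subset of a's.

    sl_partners maps each gene to the set of its synthetic lethality partners.
    Genes without any partners are skipped, since the empty set is trivially
    nested in everything.
    """
    return sorted((a, b)
                  for a, pa in sl_partners.items()
                  for b, pb in sl_partners.items()
                  if a != b and pb and pb < pa)


# Hypothetical SL data: gene names are placeholders, not real genes.
sl_partners = {
    "a": {"x", "y", "z"},
    "b": {"x", "y"},
    "c": {"w"},
}
pairs = nested_pairs(sl_partners)
```

Here the pair ("a", "b") is reported, meaning b is a candidate for lying beneath a in the same pathway; c shares no partners with either and is left out.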
15Application Synthetic Lethality
Pan et al., Cell 2006
16Application Synthetic Lethality
7 of 10 Genes directly linked to DNA repair
Tresch, unpublished
17Software, References
- R/Bioconductor packages
- NEM (Markowetz, Fröhlich, Beissbarth)
- Nessy (Tresch)
- References
- Structure Learning in Nested Effects Models. A.
Tresch, F. Markowetz, to appear in SAGeMB 2008,
available on arXiv
- Nested Effects Models as a Means to learn
Signaling Networks from Intervention Effects. H.
Fröhlich, A. Tresch, F. Markowetz, M. Fellmann,
R. Spang, T. Beissbarth, in preparation
- Computational identification of cellular networks
and pathways. F. Markowetz, Olga G. Troyanskaya,
Dennis Kostka, Rainer Spang. Molecular
BioSystems, Bioinformatics 2007
- Non-transcriptional Pathway Features
Reconstructed from Secondary Effects of RNA
Interference. F. Markowetz, J. Bloch, R. Spang,
Bioinformatics 2005
18Acknowledgements
- Florian Markowetz, Lewis-Sigler Institute,
Princeton
- Tim Beissbarth, Holger Fröhlich, German Cancer
Research Center, Heidelberg
- Rainer Spang, Computational Diagnostics Group,
Regensburg
19Conclusion
Exercise: Why is this administration model
inefficient? Construct a model that scores
better!
Thank You!
21What I did not show
Automatic feature selection, without control
experiment
Estimated graph (120 genes selected)
22What I did not show
The observed graph of the Fellmann estrogen
receptor dataset
23What I did not show
15 genes, 17 knockdown experiments, 6 of them
double knockdowns
24What I did not show
Same data, with prior knowledge.