Title: Recovering Temporally Rewiring Networks: A Model-based Approach
1Recovering Temporally Rewiring Networks: A Model-based Approach
- Fan Guo, Steve Hanneke, Wenjie Fu, Eric P. Xing
- School of Computer Science, Carnegie Mellon University
2Social Networks
Physicist Collaborations
High School Dating
The Internet
All the images are from http://www-personal.umich.edu/~mejn/networks/. That page includes original citations.
3Biological Networks
Model for the Yeast cell cycle transcriptional regulatory network. Fig. 4 from (T.I. Lee et al., Science 298, 799-804, 25 Oct 2002)
Protein-Protein Interaction Network in S. cerevisiae. Fig. 1 from (H. Jeong et al., Nature 411, 41-42, 3 May 2001)
4When interactions are hidden
- Infer the hidden network topology from node attribute observations.
- Methods
  - Optimizing a score function
  - Information-theoretic approaches
  - Model-based approach
- Most of them pool the data together to infer a static network topology.
5And changing over time
- Network topologies and functions are not static
  - Social networks can grow as we make more friends
  - Biological networks rewire under different conditions
Fig. 1b from "Genomic analysis of regulatory network dynamics reveals large topological changes", N. M. Luscombe, et al. Nature 431, 308-312, 16 September 2004
6Overview
- Network topologies and functions are not always static.
- We propose probabilistic models and algorithms for recovering latent network topologies that change over time from node attribute observations.
7Rewiring Networks of Genes
- Networks rewire over discrete timesteps
Part of the image is modified from Fig. 3b (E. Segal et al., Nature Genetics 34, 166-176, June 2003).
8The Graphical Model
Transition Model
Emission Model
9Technical Challenges
- Latent network structures are higher-dimensional than the observed node attributes
  - How to place constraints on the latent space?
- Limited evidence per timestep
  - How to share the information across time?
10Energy Based Conditional Probabilities
- Energy-based conditional probability model (recall Markov random fields)
- Energy-based models are easier to analyze, but even the design of an approximate inference algorithm can be hard.
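As a reminder of the general form, here is a standard log-linear sketch of an energy-based conditional probability. It is not necessarily the exact expression shown on the original slide; the symbols y, x, the feature functions f_k, the weights \theta_k, and the partition function Z are illustrative names.

```latex
p(y \mid x) = \frac{1}{Z(x,\theta)} \exp\Big( \sum_k \theta_k f_k(x, y) \Big),
\qquad
Z(x,\theta) = \sum_{y'} \exp\Big( \sum_k \theta_k f_k(x, y') \Big)
```

The sum over y' inside Z ranges over every configuration of y, which is why inference can be hard when y is an entire network.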
9/2/2015
10
ICML 2007 Presentation
11Transition Model
- Based on our previous work on discrete temporal network models in the ICML06 SNA-Workshop.
- Model network rewiring as a Markov process.
- An expressive framework using energy-based local probabilities (based on ERGM)
- Features of choice
(Density)
(Edge Stability)
(Transitivity)
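The three features named above can be sketched in Python. The exact formulas are not given in this text, so the definitions below are assumptions chosen for illustration: density counts edges at time t, edge stability counts node pairs whose status is unchanged from t-1, and transitivity counts edges at time t that close a two-path in the previous network. The function names and the weight vector `theta` are hypothetical.

```python
import numpy as np

def upper(A):
    """Entries above the diagonal of a symmetric matrix, one per node pair."""
    return A[np.triu_indices(A.shape[0], k=1)]

def density(A_t, A_prev):
    # Number of edges present at time t (A_prev is unused, kept for a uniform signature).
    return float(upper(A_t).sum())

def edge_stability(A_t, A_prev):
    # Number of node pairs whose edge status is the same at t-1 and t.
    return float((upper(A_t) == upper(A_prev)).sum())

def transitivity(A_t, A_prev):
    # Number of edges (i, j) at time t for which some k had edges i-k and k-j at t-1.
    two_paths = (A_prev @ A_prev) > 0
    return float((upper(A_t) * upper(two_paths)).sum())

def log_potential(A_t, A_prev, theta):
    """Unnormalized log-probability of A_t given A_prev: theta . features."""
    feats = np.array([f(A_t, A_prev)
                      for f in (density, edge_stability, transitivity)])
    return float(theta @ feats)

# Toy example on 3 nodes: edges 0-1 and 1-2 at t-1, edges 0-1 and 0-2 at t.
A_prev = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]])
A_t = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]])
theta = np.array([-1.0, 2.0, 0.5])
print(density(A_t, A_prev), edge_stability(A_t, A_prev), transitivity(A_t, A_prev))
print(log_potential(A_t, A_prev, theta))
```

Because each of these features is a sum over node pairs once the previous network is fixed, the resulting conditional distribution factorizes over edges.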
12Emission Model in General
- Given the network topology, how to generate the binary node attributes?
- Another energy-based conditional model
- All features are pairwise, which induces an undirected graph corresponding to the time-specific network topology
- Additional information shared over time is represented by a matrix of parameters ?
- The design of feature function F is application-specific.
13Design of Features for Gene Expression
- The feature function
- If there is no edge between i and j, F equals 0
- Otherwise, the sign of F depends on ?ij and the empirical correlation of xi, xj at time t.
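A minimal Python sketch of a pairwise feature with the behavior described above. The concrete form is an assumption: for binary attributes, agreement between two nodes is scored +1 and disagreement -1, and the time-invariant parameter for the pair (called `lam_ij` here as a placeholder name) sets which of the two is favored. The overall weight `eta` and all function names are hypothetical.

```python
import numpy as np

def pair_feature(x_i, x_j, a_ij, lam_ij):
    """Feature for one node pair; zero when the pair is not connected."""
    if a_ij == 0:
        return 0.0
    agreement = 1.0 if x_i == x_j else -1.0  # assumed stand-in for empirical correlation
    return lam_ij * agreement

def emission_score(x, A, lam, eta):
    """Unnormalized log-probability of attributes x given network A."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += pair_feature(x[i], x[j], A[i, j], lam[i, j])
    return eta * total

# Toy example: 3 nodes, edges 0-1 and 1-2.
x = [1, 1, 0]
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
lam = np.ones((3, 3))
lam[1, 2] = lam[2, 1] = -1.0  # pair (1, 2) is assumed to be anti-correlated
print(emission_score(x, A, lam, eta=0.5))
```

Only connected pairs contribute, so the pairwise terms form an undirected graph with the same edges as the network at that timestep.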
14Graphical Structure Revisit
Hidden rewiring networks
Initial network to define the prior on A1
Time-invariant parameters dictating the direction
of pairwise correlation in the example
15Inference
- A natural approach to inferring the hidden networks A1:T is Gibbs sampling
- To evaluate the log-odds
- Conditional probabilities in a Markov blanket
Tractable transition model: the partition function is the product of per-edge terms
Computation is straightforward
Given the graphical structure, run variable elimination algorithms; this works well for small graphs
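To illustrate why a per-edge factorization makes the transition model tractable, here is a minimal Python sketch. It assumes a simplified transition model with only density and edge-stability features (the weights `theta_d` and `theta_s` are hypothetical names), checks the factorized partition function against brute-force enumeration, and computes the transition part of the Gibbs log-odds for a single edge. The emission term that a full sampler would also need is omitted.

```python
import itertools
import numpy as np

def edge_score(a, a_prev, theta_d, theta_s):
    # Contribution of one edge: density term plus stability term.
    return theta_d * a + theta_s * float(a == a_prev)

def edge_log_z(a_prev, theta_d, theta_s):
    # Log normalizer for one edge given its state at the previous timestep.
    return np.logaddexp(edge_score(0, a_prev, theta_d, theta_s),
                        edge_score(1, a_prev, theta_d, theta_s))

def log_z_factorized(prev_edges, theta_d, theta_s):
    # Log of a product of per-edge terms = sum of per-edge log normalizers.
    return float(sum(edge_log_z(p, theta_d, theta_s) for p in prev_edges))

def log_z_brute_force(prev_edges, theta_d, theta_s):
    # Enumerate all 2^m edge configurations (feasible only for tiny m).
    scores = [sum(edge_score(a, p, theta_d, theta_s)
                  for a, p in zip(cur, prev_edges))
              for cur in itertools.product([0, 1], repeat=len(prev_edges))]
    return float(np.logaddexp.reduce(scores))

def transition_log_odds(a_prev, a_next, theta_d, theta_s):
    """log p(a=1 | a_prev, a_next) - log p(a=0 | a_prev, a_next), transitions only."""
    def log_joint(a):
        # p(a | a_prev) * p(a_next | a); the normalizer of the first factor
        # does not depend on a and cancels in the odds.
        return (edge_score(a, a_prev, theta_d, theta_s)
                + edge_score(a_next, a, theta_d, theta_s)
                - edge_log_z(a, theta_d, theta_s))
    return float(log_joint(1) - log_joint(0))

prev_edges = [1, 0, 1, 1, 0, 0]
print(log_z_factorized(prev_edges, -0.5, 1.5))
print(log_z_brute_force(prev_edges, -0.5, 1.5))
print(transition_log_odds(1, 1, 0.0, 1.5))
```

The factorized version costs time linear in the number of node pairs, while the brute-force check is exponential and is included only to confirm the two agree on a tiny example.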
16Parameter Estimation
- Grid search is very helpful, although Monte Carlo EM can be implemented.
- Trade-off between the transition model and emission model
  - Larger ? better fit of the rewiring processes
  - Larger ? better fit of the observations.
17Results from Simulation
- Data generated from the proposed model.
- Starting from a network (A0) of 10 nodes and 14 edges.
- The length of the time series T = 50.
- Compare three approaches using F1 score
  - avg: averaged network from ground truth (approx. upper bounds the performance of any static network inference algorithm)
  - htERG: infer timestep-specific networks
  - sERG: the static counterpart of the proposed algorithm
- Study the edge-switching events
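The F1 comparison and the avg baseline can be sketched as follows. This is an illustration under assumptions, not the evaluation code behind the reported results: edges are pooled over all timesteps before computing F1, and the averaged network keeps an edge if it is present in more than half of the ground-truth timesteps (the `threshold` value is a guess). Function names are hypothetical.

```python
import numpy as np

def edge_f1(A_true, A_pred):
    """F1 over undirected edges, pooled across timesteps. Inputs have shape (T, n, n)."""
    iu = np.triu_indices(A_true.shape[-1], k=1)
    t = A_true[:, iu[0], iu[1]].astype(bool)
    p = A_pred[:, iu[0], iu[1]].astype(bool)
    tp = np.sum(t & p)
    fp = np.sum(~t & p)
    fn = np.sum(t & ~p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))

def averaged_network(A_true, threshold=0.5):
    """Static baseline: one thresholded average network, repeated for every timestep."""
    static = (A_true.mean(axis=0) > threshold).astype(int)
    return np.repeat(static[None], A_true.shape[0], axis=0)

# Toy ground truth: 2 timesteps, 3 nodes. Edge 0-1 is always on, edge 1-2 appears at t=1.
A_true = np.zeros((2, 3, 3), dtype=int)
A_true[0, 0, 1] = A_true[0, 1, 0] = 1
A_true[1, 0, 1] = A_true[1, 1, 0] = 1
A_true[1, 1, 2] = A_true[1, 2, 1] = 1

print(edge_f1(A_true, A_true))
print(edge_f1(A_true, averaged_network(A_true)))
```

In the toy example the static baseline misses the edge that appears only at the second timestep, which lowers its recall.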
18Varying Parameter Values
- F1 scores on different parameter settings
(varying )
19Varying the Amount of Data
- F1 scores on different number of examples
20Capturing Edge Switching
- Summary on capturing edge switching in networks
- Three cases studied: offset, false positive, missing (false negative)
- mean and rms of offset timesteps
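One way to compute these three quantities for a single edge is sketched below in Python. The matching rule is an assumption, since this text does not describe how true and inferred switches were paired: each true switch is matched greedily to the nearest unused inferred switch within `window` timesteps, unmatched inferred switches count as false positives, and unmatched true switches count as missing. All names are hypothetical.

```python
import math

def switch_times(series):
    """Timesteps t at which a 0/1 edge indicator differs from its value at t-1."""
    return [t for t in range(1, len(series)) if series[t] != series[t - 1]]

def match_switches(true_times, pred_times, window):
    """Return (offsets, false_positives, missing) under greedy nearest matching."""
    unused = list(pred_times)
    offsets, missing = [], 0
    for t in true_times:
        candidates = [p for p in unused if abs(p - t) <= window]
        if candidates:
            best = min(candidates, key=lambda p: abs(p - t))
            unused.remove(best)
            offsets.append(best - t)  # positive means the inferred switch is late
        else:
            missing += 1
    return offsets, len(unused), missing

def offset_summary(offsets):
    """Mean and root-mean-square of the offsets, or None if nothing was matched."""
    if not offsets:
        return None
    mean = sum(offsets) / len(offsets)
    rms = math.sqrt(sum(o * o for o in offsets) / len(offsets))
    return mean, rms

true_edge = [0, 0, 1, 1, 1, 0, 0, 0]
pred_edge = [0, 0, 0, 1, 1, 1, 0, 1]
offsets, false_pos, missing = match_switches(
    switch_times(true_edge), switch_times(pred_edge), window=2)
print(offsets, false_pos, missing)
print(offset_summary(offsets))
```

The mean shows whether inferred switches tend to be early or late, while the rms reflects how far off they are regardless of direction.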
21Results on Drosophila Data
- The proposed model was applied to infer the muscle development sub-network (Zhao et al., 2006) on Drosophila lifecycle gene expression data (Arbeitman et al., 2002).
- 11 genes, 66 timesteps over 4 development stages
- Further biological experiments are necessary for verification.
Network in (Zhao et al. 2006)
Embryonic
Larval
Pupal
Adult
22Summary
- A new class of probabilistic models to address the problem of recovering hidden, time-dependent network topologies and an example in a biological context.
- An example of employing an energy-based model to define meaningful features and simplify parameterization.
- Future work
  - Larger-scale network analysis (100?)
  - Developing emission models for richer context
23Acknowledgement
- Yanxin Shi, CMU
- Wentao Zhao, Texas A&M University
- Hetunandan Kamisetty, CMU
24Thank You!