Title: Example: Intel, Novelus, Motorola, Dell depend on the pric
1Learning Module Networks
- Eran Segal
- Stanford University
Joint work with Dana Pe'er (Hebrew U.), Daphne Koller (Stanford),
Aviv Regev (Harvard), Nir Friedman (Hebrew U.)
2Learning Bayesian Networks
- Density estimation
- Model data distribution in population
- Probabilistic inference
- Prediction
- Classification
- Dependency structure
- Interactions between variables
- Causality
- Scientific discovery
3Stock Market
- Learn dependency of stock prices as a function of
- Global influencing factors
- Sector influencing factors
- Price of other major stocks
4Stock Market
[Figure: stock variables MSFT, DELL, INTL, NVLS, MOT]
5Stock Market
[Figure: Bayesian network over DELL, INTL, MSFT, NVLS, MOT]
6Stock Market
- 4411 stocks (variables)
- 273 trading days (instances), from Jan. '02 to Mar. '03
- Problems
- Statistical robustness
- Interpretability
7Key Observation
- Many stocks depend on the same influencing factors in much the same way
- Example: Intel, Novellus, Motorola, Dell depend on the price of Microsoft
- Many other domains with similar characteristics
- Gene expression
- Collaborative filtering
- Computer network performance
8The Module Network Idea
[Figure: Bayesian network over MSFT, MOT, INTL, DELL, AMAT, HPQ]
9Problems and Solutions
- Statistical robustness
- Interpretability
10Outline
- Module Network
- Probabilistic model
- Learning the model
- Experimental results
11Module Network Components
- Module Assignment Function
- A(MSFT) = MI
- A(MOT) = A(DELL) = A(INTL) = MII
- A(AMAT) = A(HPQ) = MIII
[Figure: Module I (MSFT), Module II (MOT, INTL, DELL), Module III (AMAT, HPQ)]
12Module Network Components
- Module Assignment Function
- Set of parents for each module
- Pa(MI) = ∅
- Pa(MII) = {MSFT}
- Pa(MIII) = {DELL, INTL}
[Figure: Module I (MSFT), Module II (MOT, INTL, DELL), Module III (AMAT, HPQ)]
13Module Network Components
- Module Assignment Function
- Set of parents for each module
- CPD template for each module
[Figure: Module I (MSFT), Module II (MOT, INTL, DELL), Module III (AMAT, HPQ)]
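The three components listed above can be held in one small container. The following is a minimal sketch, not the authors' implementation: the class name `ModuleNetwork`, its field names, and the choice of an empty parent set for Module I are assumptions made for illustration, and the CPD templates are left as opaque placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleNetwork:
    # Assignment function A: variable name -> module name.
    assignment: dict
    # Parent sets Pa(Mj): module name -> set of parent variable names.
    parents: dict
    # CPD template per module, shared by every variable in it (opaque here).
    cpd: dict = field(default_factory=dict)

    def members(self, module):
        """Return the variables assigned to the given module, sorted by name."""
        return sorted(v for v, m in self.assignment.items() if m == module)


# The running stock example; Pa(MI) is assumed to be empty.
net = ModuleNetwork(
    assignment={"MSFT": "MI", "MOT": "MII", "DELL": "MII",
                "INTL": "MII", "AMAT": "MIII", "HPQ": "MIII"},
    parents={"MI": set(), "MII": {"MSFT"}, "MIII": {"DELL", "INTL"}},
)
print(net.members("MII"))
```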
14Ground Bayesian Network
- A module network induces a ground BN over X
- A module network defines a coherent probability distribution over X if the ground BN is acyclic
[Figure: Module I (MSFT), Module II (MOT, INTL, DELL), Module III (AMAT, HPQ)]
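One way to read "induces a ground BN" is that every variable inherits the parent set of its module. The sketch below illustrates that reading on the stock example. The function name `ground_bn` and the dictionary encoding are hypothetical, and the empty parent set for Module I is an assumption.

```python
def ground_bn(assignment, module_parents):
    """Map each variable to its parents in the ground Bayesian network.

    Every variable simply inherits the parent set of the module it is
    assigned to.
    """
    return {var: sorted(module_parents[mod]) for var, mod in assignment.items()}


assignment = {"MSFT": "MI", "MOT": "MII", "DELL": "MII",
              "INTL": "MII", "AMAT": "MIII", "HPQ": "MIII"}
module_parents = {"MI": set(), "MII": {"MSFT"}, "MIII": {"DELL", "INTL"}}

ground = ground_bn(assignment, module_parents)
for var, pars in sorted(ground.items()):
    print(var, "<-", pars)
```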
15Module Graph
- Nodes correspond to modules
- Mi → Mj if at least one variable in Mi is a parent of Mj
- Acyclicity checked efficiently using the module graph
[Figure: Module I (MSFT), Module II (MOT, INTL, DELL), Module III (AMAT, HPQ)]
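The module graph and its acyclicity test can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the cycle test is an ordinary topological sort (Kahn's algorithm) rather than anything the slides specify, and Pa(MI) is assumed empty.

```python
def module_graph(assignment, module_parents):
    """Build module-level edges: Mi -> Mj if some variable in Mi is a parent of Mj."""
    modules = set(assignment.values()) | set(module_parents)
    edges = {m: set() for m in modules}
    for child, pars in module_parents.items():
        for p in pars:
            edges[assignment[p]].add(child)
    return edges


def is_acyclic(edges):
    """Kahn's algorithm: the graph is acyclic iff every node can be removed."""
    indegree = {m: 0 for m in edges}
    for outs in edges.values():
        for o in outs:
            indegree[o] += 1
    ready = [m for m, d in indegree.items() if d == 0]
    removed = 0
    while ready:
        m = ready.pop()
        removed += 1
        for o in edges[m]:
            indegree[o] -= 1
            if indegree[o] == 0:
                ready.append(o)
    return removed == len(edges)


assignment = {"MSFT": "MI", "MOT": "MII", "DELL": "MII",
              "INTL": "MII", "AMAT": "MIII", "HPQ": "MIII"}
module_parents = {"MI": set(), "MII": {"MSFT"}, "MIII": {"DELL", "INTL"}}

print(is_acyclic(module_graph(assignment, module_parents)))
# Making INTL (in MII) a parent of MI closes the loop MI -> MII -> MI.
cyclic_parents = dict(module_parents, MI={"INTL"})
print(is_acyclic(module_graph(assignment, cyclic_parents)))
```

The check runs over modules rather than individual variables, which is why it stays cheap when there are thousands of variables but few modules.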
16Outline
- Module Network
- Probabilistic model
- Learning the model
- Experimental results
17Learning Overview
- Given data D, find assignment function A and structure S that maximize the Bayesian score
- Marginal data likelihood
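The slide's own formula is not recoverable from the text. For orientation, the Bayesian score is commonly written as the log marginal likelihood plus a log prior over the model; the notation below (θ for the CPD parameters) is an assumption and may differ from what the slide showed.

```latex
\[
\mathrm{score}(S, A : D) \;=\; \log P(D \mid S, A) \;+\; \log P(S, A),
\qquad
P(D \mid S, A) \;=\; \int P(D \mid S, A, \theta)\, P(\theta \mid S, A)\, d\theta .
\]
```

The integral is the marginal data likelihood named in the bullet: the parameters are averaged out rather than fixed at a single estimate.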
18Likelihood Function
- Likelihood function decomposes by modules
- Sufficient statistics of (X,Y)
[Figure: Module I (MSFT), Module II (MOT, INTL, DELL), Module III (AMAT, HPQ), with data Instance 1, Instance 2, Instance 3]
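Because all variables in a module share one CPD, their sufficient statistics can be pooled across both instances and module members. The sketch below shows this for discretized values ("up"/"down"), which is an assumption made for illustration; the function name `module_counts` and the toy data are hypothetical.

```python
from collections import Counter


def module_counts(data, members, parents):
    """Count (parent values, child value) pairs pooled over all module members.

    data    : list of instances, each a dict variable -> discrete value
    members : variables assigned to the module
    parents : parent variables of the module
    """
    counts = Counter()
    for row in data:
        parent_values = tuple(row[p] for p in parents)
        for x in members:
            counts[(parent_values, row[x])] += 1
    return counts


# Two made-up trading days for Module II (MOT, INTL, DELL) with parent MSFT.
data = [
    {"MSFT": "up", "MOT": "up", "INTL": "up", "DELL": "down"},
    {"MSFT": "down", "MOT": "down", "INTL": "down", "DELL": "down"},
]
counts = module_counts(data, ["MOT", "INTL", "DELL"], ["MSFT"])
print(counts)
```

With three members, each instance contributes three observations to the shared CPD, which is one way to see where the statistical robustness comes from.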
19Bayesian Score Decomposition
- Bayesian score decomposes by modules
- Each term depends on: module j variables, module j parents
- Example operator: delete INTL → Module III
[Figure: Module I (MSFT), Module II (MOT, INTL, DELL), Module III (AMAT, HPQ)]
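The decomposition stated on this slide can be written schematically as a sum of per-module terms. The symbols below are assumptions for illustration: K is the number of modules, X^j denotes the variables assigned to module j, and Pa_{M_j} its parents.

```latex
\[
\mathrm{score}(S, A : D) \;=\; \sum_{j=1}^{K} \mathrm{score}_{M_j}\!\left(\mathrm{Pa}_{M_j},\, \mathbf{X}^{j} : D\right),
\qquad
\mathbf{X}^{j} = \{\, X : A(X) = M_j \,\}.
\]
```

Under this form, changing the parents of one module changes only that module's term, so the remaining terms can be reused.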
20Bayesian Score Decomposition
- Bayesian score decomposes by modules
- A(MOT) = 2 → A(MOT) = 1
[Figure: Module I (MSFT), Module II (MOT, INTL, DELL), Module III (AMAT, HPQ)]
21Algorithm Overview
- Find assignment function A and structure S that maximize the Bayesian score
Find initial assignment A
Dependency structure S
22Initial Assignment Function
- Find variables that are similar across instances
- A(MOT) = MII, A(INTL) = MII, A(DELL) = MII
[Figure: data matrix of variables (stocks: AMAT, MOT, MSFT, DELL, INTL, HPQ) by instances (trading days: x1, x2, x3, x4)]
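Grouping variables that behave similarly across instances can be sketched with a small agglomerative clustering. This is an illustrative stand-in, not necessarily the clustering used in the talk: the Pearson-correlation similarity, the centroid merging rule, the function names, and the price profiles are all assumptions.

```python
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va and vb else 0.0


def centroid(cluster, profiles):
    """Mean profile of the variables in a cluster."""
    return [mean(col) for col in zip(*(profiles[v] for v in cluster))]


def initial_assignment(profiles, k):
    """Merge the two most correlated clusters until only k remain."""
    clusters = [[v] for v in sorted(profiles)]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                r = pearson(centroid(clusters[i], profiles),
                            centroid(clusters[j], profiles))
                if best is None or r > best[0]:
                    best = (r, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # j > i, so index i is unaffected
    return {v: idx for idx, members in enumerate(clusters) for v in members}


# Made-up profiles over four trading days (x1..x4).
profiles = {
    "MOT": [1.0, 2.0, 3.0, 4.0],
    "INTL": [1.1, 2.1, 2.9, 4.2],
    "DELL": [0.9, 2.2, 3.1, 3.9],
    "AMAT": [4.0, 3.0, 2.0, 1.0],
    "HPQ": [4.1, 2.9, 2.2, 0.8],
    "MSFT": [1.0, 3.0, 1.0, 3.0],
}
A = initial_assignment(profiles, k=3)
print(A)
```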
23Algorithm Overview
- Find assignment function A and structure S that maximize the Bayesian score
Find initial assignment A
Dependency structure S
24Learning Dependency Structure
- Heuristic search with operators
- Add/delete parent for module
- Cannot reverse edges
- Handle acyclicity
- Can be checked efficiently on the module graph
- Efficient computation
- After applying operator for module Mj, only update score of operators for module Mj
[Figure: candidate operators MSFT → Module II, INTL → Module I, INTL → Module III, shown on the module network (Module I: MSFT; Module II: MOT, INTL, DELL; Module III: AMAT, HPQ) and on the module graph (MI, MII, MIII)]
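Whether an "add parent" operator is legal can be decided with a reachability query on the module graph: adding variable → module closes a cycle exactly when the variable's own module is already reachable from the target module. The sketch below assumes a module graph in which only MSFT → Module II has been added so far; the function names are hypothetical.

```python
def reachable(edges, src, dst):
    """True if dst can be reached from src along module-graph edges (src == dst counts)."""
    stack, seen = [src], set()
    while stack:
        m = stack.pop()
        if m == dst:
            return True
        if m not in seen:
            seen.add(m)
            stack.extend(edges.get(m, ()))
    return False


def legal_add_parent(edges, assignment, var, module):
    """Adding var -> module is legal if it does not close a cycle in the module graph."""
    return not reachable(edges, module, assignment[var])


assignment = {"MSFT": "MI", "MOT": "MII", "DELL": "MII",
              "INTL": "MII", "AMAT": "MIII", "HPQ": "MIII"}
# Assumed current module graph: only MI -> MII (from MSFT -> Module II).
edges = {"MI": {"MII"}, "MII": set(), "MIII": set()}

print(legal_add_parent(edges, assignment, "INTL", "MI"))    # would close MI -> MII -> MI
print(legal_add_parent(edges, assignment, "INTL", "MIII"))  # no path from MIII back to MII
```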
25Learning Dependency Structure
- Structure search done at module level
- Parent selection
- Reduced search space relative to BN
- Acyclicity checking
- Individual variables only used for computation of
sufficient statistics
26Algorithm Overview
- Find assignment function A and structure S that maximize the Bayesian score
Find initial assignment A
Dependency structure S
27Learning Assignment Function
[Figure: variable DELL alongside Module I (MSFT), Module II (MOT, INTL), Module III (AMAT, HPQ)]
28Learning Assignment Function
- A(DELL) = MI
- Score: 0.7
- A(DELL) = MII
- Score: 0.9
[Figure: candidate placements of DELL among Module I (MSFT), Module II (MOT, INTL), Module III (AMAT, HPQ)]
29Learning Assignment Function
- A(DELL) = MI
- Score: 0.7
- A(DELL) = MII
- Score: 0.9
- A(DELL) = MIII
- Score: cyclic!
[Figure: candidate placements of DELL among Module I (MSFT), Module II (MOT, INTL), Module III (AMAT, HPQ)]
30Learning Assignment Function
- A(DELL) = MI
- Score: 0.7
- A(DELL) = MII
- Score: 0.9
- A(DELL) = MIII
- Score: cyclic!
[Figure: Module I (MSFT), Module II (MOT, INTL, DELL), Module III (AMAT, HPQ)]
31Ideal Algorithm
- Learn the module assignment of all variables simultaneously
32Problem
- Due to acyclicity, cannot optimize assignment for variables separately
- A(DELL) = Module IV
- A(MSFT) = Module III
[Figure: module network (Module I, Module II, Module III, Module IV, with DELL, MSFT, AMAT, HPQ) and its module graph (MI, MII, MIII, MIV)]
34Learning Assignment Function
- Sequential update algorithm
- Iterate over all variables
- For each variable, find its optimal assignment given the current assignment to all other variables
- Efficient computation
- When changing assignment from Mi to Mj, only need to recompute score for modules i and j
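The sequential update can be sketched as coordinate ascent over variables. This is a simplified sketch under stated assumptions: `score` and `is_legal` are hypothetical callables supplied by the caller (a real implementation would plug in the Bayesian score and the module-graph acyclicity check), and the full score is recomputed on every trial instead of only the terms for modules i and j.

```python
def sequential_update(variables, modules, assignment, score, is_legal):
    """Reassign one variable at a time until no single move improves the score."""
    improved = True
    while improved:
        improved = False
        for var in variables:
            current = assignment[var]
            best_mod, best_score = current, score(assignment)
            for mod in modules:
                if mod == current:
                    continue
                trial = {**assignment, var: mod}
                if not is_legal(trial):  # e.g. the move would create a cycle
                    continue
                trial_score = score(trial)
                if trial_score > best_score:
                    best_mod, best_score = mod, trial_score
            if best_mod != current:
                assignment[var] = best_mod
                improved = True
    return assignment


# Toy setup echoing the DELL example: MII scores higher than MI,
# and MIII is ruled out as illegal no matter how well it would score.
toy_scores = {"MI": 0.7, "MII": 0.9, "MIII": 5.0}
result = sequential_update(
    variables=["DELL"],
    modules=["MI", "MII", "MIII"],
    assignment={"DELL": "MI"},
    score=lambda a: toy_scores[a["DELL"]],
    is_legal=lambda a: a["DELL"] != "MIII",
)
print(result)
```

Each accepted move strictly increases the score, so the loop terminates; it may stop at a local optimum rather than the best joint assignment.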
35Learning the Model
- Initialize module assignment A
- Optimize structure S
- Optimize module assignment A
- For each variable, find its optimal assignment given the current assignment to all other variables
[Figure: variables MSFT, AMAT, HPQ, INTL, MOT, DELL and their modules (Module I, Module II, Module III)]
36Related Work
Bayesian networks
Parameter sharing
PRMs
OOBNs
Module Networks
37Outline
- Module Network
- Probabilistic model
- Learning the model
- Experimental results
- Statistical validation
- Case study: Gene regulation
38Learning Algorithm Performance
[Plot: Bayesian score (avg. per gene), axis from -131 to -128, vs. algorithm iterations, 0 to 20]
39Generalization to Test Data
- Synthetic data: 10 modules, 500 variables
40Generalization to Test Data
- Synthetic data: 10 modules, 500 variables
- Gain beyond 100 instances is small
[Plot: test data likelihood (per instance) vs. number of modules, for 25, 50, 100, 200, and 500 instances]
41Structure Recovery Graph
- Synthetic data: 10 modules, 500 variables
[Plot: recovered structure (% correct) vs. number of modules, for 25, 50, 100, 200, and 500 instances]
42Stock Market
- 4411 variables (stocks), 273 instances (trading days)
- Comparison to Bayesian networks (cross validation)
43Regulatory Networks
- Learn structure of regulatory networks
- Which genes are regulated by each regulator
44Gene Expression Data
- Measures mRNA level for all genes in one condition
- Learn dependency of the expression of genes as a function of expression of regulators
[Figure: expression matrix of genes by experiments, colored as induced or repressed]
45Gene Expression
- 2355 variables (genes), 173 instances (arrays)
- Comparison to Bayesian networks
46Biological Evaluation
- Find sets of co-regulated genes (regulatory module)
- Find the regulators of each module
46/50
30/50
Segal et al., Nature Genetics, 2003
47Experimental Design
- Hypothesis: Regulator X activates process Y
- Experiment: Knock out X and repeat experiment
Segal et al., Nature Genetics, 2003
48Differentially Expressed Genes
Segal et al., Nature Genetics, 2003
49Biological Experiments Validation
- Were the differentially expressed genes predicted as targets?
- Rank modules by enrichment for diff. expressed genes
Segal et al., Nature Genetics, 2003
50Summary
- Probabilistic model for learning modules of variables and their structural dependencies
- Improved performance over Bayesian networks
- Statistical robustness
- Interpretability
- Application to gene regulation
- Reconstruction of many known regulatory modules
- Prediction of targets for unknown regulators
51Thank You!