Using Bayesian Networks to Analyze Expression Data
- By Nir Friedman, Michal Linial, Iftach Nachman and Dana Pe'er (2000)
- Presented by
- Nikolaos Aravanis
- Lysimachos Zografos
- Alexis Ioannou
Outline
- Introduction
- Bayesian Networks
- Application to expression data
- Application to cell cycle expression patterns
- Discussion and future work
The Road to Microarray Data Analysis
- Development of microarrays
- Measure the expression of all the genes of an organism
- Enormous amount of data
- Challenge: analyze the datasets and infer biological interactions
Most Common Analysis Tool
- Clustering algorithms
- Identify groups of genes with similar expression patterns over a set of experiments
- Discover genes that are co-regulated
Problems
- The data give only a partial picture
- Key events are not reflected (translation and protein (in)activation)
- The small number of samples gives little information for constructing fully detailed models
- With current technologies, even these few samples have a high noise-to-signal ratio
Possible Solution
- Analyze gene expression patterns to uncover properties of the transcriptional program
- Examine dependence and conditional independence in the data
- → Bayesian networks
Bayesian Networks
- Represent the dependence structure between multiple interacting quantities
- Capable of handling noise and of estimating the confidence in the different features of the network
- Focus on interactions whose signal is strong
- Useful for describing processes composed of locally interacting components
- The statistical foundations for learning Bayesian networks are well understood and have been successfully applied
- Provide models of causal influence
Informal Introduction to Bayesian Networks
- Let P(X,Y) be a joint distribution over variables X and Y
- X and Y are independent if P(X,Y) = P(X)P(Y) for all values of X and Y
- Suppose gene A is a transcription factor of gene B
- We expect their expression levels to be dependent
- A is a parent of B
- Suppose B is a transcription factor of C
- The expression levels of each pair are dependent
- If A does not directly affect C, then once we fix the expression level of B, we observe that A and C are independent
- P(A | B, C) = P(A | B) (A and C are conditionally independent given B)
- Ind(A; C | B)
- (see the sketch below)
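A minimal sketch in Python (not from the paper) of the chain example above: the joint distribution is built from made-up CPDs for A → B → C, and brute-force enumeration confirms that P(A | B, C) = P(A | B).

```python
# Hypothetical CPD numbers for the chain A -> B -> C; only the structure matters.
import itertools

p_a = {0: 0.6, 1: 0.4}                          # P(A)
p_b_given_a = {0: {0: 0.8, 1: 0.2},             # P(B | A)
               1: {0: 0.3, 1: 0.7}}
p_c_given_b = {0: {0: 0.9, 1: 0.1},             # P(C | B)
               1: {0: 0.25, 1: 0.75}}

# Joint distribution from the product form P(A,B,C) = P(A) P(B|A) P(C|B)
joint = {(a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
         for a, b, c in itertools.product((0, 1), repeat=3)}

def p_a1_given(b=None, c=None):
    """P(A=1 | evidence) by summing and renormalising the joint."""
    rows = [k for k in joint
            if (b is None or k[1] == b) and (c is None or k[2] == c)]
    z = sum(joint[k] for k in rows)
    return sum(joint[k] for k in rows if k[0] == 1) / z

for b, c in itertools.product((0, 1), repeat=2):
    # Fixing B makes C irrelevant to A: Ind(A; C | B) holds in this network.
    assert abs(p_a1_given(b=b, c=c) - p_a1_given(b=b)) < 1e-12
print("P(A | B, C) = P(A | B) for all values: A, C independent given B")
```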
Informal Introduction to Bayesian Networks (contd)
- A key component of Bayesian networks is that each variable is a stochastic function of its parents
- Stochastic models are natural in the gene expression domain
- The biological systems we want to model are stochastic
- Measurements are noisy
Representing Distributions with Bayesian Networks
- Representation of a joint probability distribution consisting of 2 components
- A directed acyclic graph G
- A conditional distribution for each variable given its parents in G
- G encodes the Markov assumption: each variable is independent of its non-descendants, given its parents in G
- By applying the chain rule, the joint distribution decomposes into the product form P(X_1, ..., X_n) = ∏_i P(X_i | Pa^G(X_i)) (derivation below)
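Spelled out, the two steps the slide refers to (a standard result; Pa^G(X_i) denotes the parents of X_i in G):

```latex
\begin{align*}
P(X_1,\dots,X_n)
  &= \prod_{i=1}^{n} P\bigl(X_i \mid X_1,\dots,X_{i-1}\bigr)
     && \text{(chain rule, any ordering consistent with } G\text{)} \\
  &= \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}^G(X_i)\bigr)
     && \text{(Markov assumption drops the non-parents)}
\end{align*}
```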
Equivalence Classes of BNs
- A BN implies further independence statements → Ind(G)
- More than one graph can imply the same statements
- Networks are equivalent if Ind(G) = Ind(G')
Equivalence Classes of BNs (contd)
- For equivalent networks:
- The DAGs have the same underlying undirected graph.
- PDAGs (partially directed acyclic graphs) are used to represent them; an edge whose direction the equivalent DAGs disagree on is left undirected.
Learning BNs
- Question
- Given a dataset D, which BN B = <G, T> best matches D?
- Answer
- A statistically motivated scoring function to evaluate each BN, e.g. the Bayesian score
- S(G : D) = log P(G | D) = log P(D | G) + log P(G) + C,
- where C is a constant independent of G,
- and P(D | G) = ∫ P(D | G, T) P(T | G) dT
- is the marginal likelihood, averaging over all parameter assignments for G (closed-form sketch below).
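For discrete variables with Dirichlet (BDe) priors, this integral has a closed form that decomposes into one term per family (a variable together with its parents). A minimal sketch, with hypothetical names (`family_log_marginal`, uniform pseudo-count `alpha`):

```python
import numpy as np
from scipy.special import gammaln

def family_log_marginal(child, parents, data, arity=3, alpha=1.0):
    """log marginal-likelihood term for one family (child given its parents).

    data: (n_samples, n_vars) integer array with values in {0, ..., arity-1}.
    """
    cols = data[:, parents] if parents else np.zeros((len(data), 0), dtype=int)
    score = 0.0
    for config in {tuple(row) for row in cols}:
        # Dirichlet-multinomial term for one parent configuration j:
        # log G(a_j) - log G(a_j + N_j) + sum_k [log G(a_jk + N_jk) - log G(a_jk)]
        rows = data[np.all(cols == config, axis=1)]
        counts = np.bincount(rows[:, child], minlength=arity)
        a = np.full(arity, alpha)
        score += (gammaln(a.sum()) - gammaln(a.sum() + counts.sum())
                  + np.sum(gammaln(a + counts) - gammaln(a)))
    return score

# The full structure score sums the family terms over variables (plus log P(G)
# and a constant): log P(D | G) = sum_i family_log_marginal(i, Pa_G(i), data)
```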
Learning BNs (contd)
- Steps
- Decide priors (P(T | G), P(G))
- → Use BDe priors (structure-equivalent, decomposable)
- Find G to maximize S(G : D)
- This is an NP-hard problem
- → local search using local modifications (edge addition, deletion, reversal) of the candidate G (Heckerman et al. 1995; sketch below)
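A minimal sketch of such a local search (first-improvement hill climbing over edge additions and deletions; reversal can be composed from the two). `score_family` is assumed to be a decomposable family score such as the BDe term sketched above; this is an illustration, not the exact procedure of Heckerman et al.:

```python
import itertools

def creates_cycle(parents, child, new_parent):
    """Would adding new_parent -> child create a directed cycle?
    True iff child is already an ancestor of new_parent."""
    stack, seen = [new_parent], set()
    while stack:
        node = stack.pop()
        if node == child:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def hill_climb(n_vars, score_family, data):
    parents = {i: set() for i in range(n_vars)}
    total = {i: score_family(i, sorted(parents[i]), data) for i in range(n_vars)}
    improved = True
    while improved:
        improved = False
        for x, y in itertools.permutations(range(n_vars), 2):
            if x in parents[y]:                     # candidate move: delete x -> y
                cand = parents[y] - {x}
            elif not creates_cycle(parents, y, x):  # candidate move: add x -> y
                cand = parents[y] | {x}
            else:
                continue
            # A decomposable score means only y's family term changes.
            new = score_family(y, sorted(cand), data)
            if new > total[y]:                      # keep any improving move
                parents[y], total[y] = cand, new
                improved = True
    return parents
```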
Learning Causal Patterns
- A Bayesian network is a model of dependencies
- We are interested in modelling the process that generated them
- → model the flow of causality in the system of interest and create a causal network (CN)
- A causal network models the probability distribution as well as the effects of interventions (causality).
Learning Causal Patterns (contd)
- CNs vs BNs
- CNs interpret parents as immediate causes (cf. BNs)
- CNs and BNs are related through the Causal Markov Assumption:
- given the values of a variable's immediate causes, it is independent of its earlier causes
- If this holds, the CN can be read as a BN (BN = CN)
- (figure: two graphs that are equivalent as BNs but not as CNs)
Applying BNs to Expression Data
- Model the expression level of each gene as a random variable
- Other attributes that affect the system (e.g. temperature, experimental conditions) can also be modelled as random variables
- The Bayesian network / dependency structure can answer queries
- CON: problems with computational complexity and with the statistical significance of the resulting networks
- PRO: genetic regulation networks are sparse
Representing Partial Models
- Gene networks involve many variables
- → more than one plausible model (not enough data); we can learn only up to an equivalence class
- Focus on learning features in order to move toward a causal network
Representing Partial Models (contd)
- Features
- Markov relations (e.g. Y is in the Markov blanket of X)
- Order relations (e.g. X is an ancestor of Y in all networks)
- Feature learning leads toward a causal network (see the sketch below)
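A minimal sketch of testing the two feature types on a single learned DAG, using the networkx library (the gene names and edges below are made up for illustration; in practice these indicators are averaged over bootstrap networks, as the next slides describe):

```python
import networkx as nx

def order_relation(G, x, y):
    """Order feature: is x an ancestor of y in G?"""
    return x in nx.ancestors(G, y)

def markov_relation(G, x, y):
    """Markov feature: is y in the Markov blanket of x
    (a parent, a child, or a co-parent of a child of x)?"""
    blanket = set(G.predecessors(x)) | set(G.successors(x))
    for child in G.successors(x):
        blanket |= set(G.predecessors(child))
    blanket.discard(x)
    return y in blanket

# Made-up toy network, not a learned result.
G = nx.DiGraph([("CLN2", "RNR3"), ("CLN2", "SVS1"), ("SWI4", "CLN2")])
print(order_relation(G, "SWI4", "RNR3"))   # True: SWI4 -> CLN2 -> RNR3
print(markov_relation(G, "RNR3", "SVS1"))  # False: linked only through CLN2
```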
Statistical Confidence of Features
- The likelihood that a given feature is actually true
- We cannot compute the posterior P(G | D) directly
- → Bootstrap method:
- for i = 1..n: resample D with replacement → D_i; learn G_i from D_i
Statistical Confidence of Features (contd)
- Individual feature confidence (IFC)
- IFC = (1/n) Σ_i f(G_i),
- where f(G_i) = 1 if the feature exists in G_i, and 0 otherwise (sketch below)
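A minimal sketch of the estimator, with `learn_network` standing in for the full structure-learning procedure and `feature` for an indicator function such as an order or Markov relation (both hypothetical names):

```python
import numpy as np

def feature_confidence(data, learn_network, feature, n=200, seed=0):
    """IFC = (1/n) * sum_i f(G_i) over networks learned from resampled data."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n):
        # Resample the experiments (rows) with replacement -> D_i
        rows = rng.integers(0, len(data), size=len(data))
        G_i = learn_network(data[rows])
        hits += 1 if feature(G_i) else 0
    return hits / n
```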
Efficient Learning Algorithms
- Vast search space
- → need efficient algorithms
- Focus attention on the relevant regions of the search space
- → Sparse Candidate Algorithm
Efficient Learning Algorithms (contd)
- Sparse Candidate Algorithm
- Identify a small number of candidate parents for each gene, based on simple local statistics (e.g. correlation)
- Restrict the search to networks in which each gene's parents are drawn from its candidate set
- Potential pitfall: an early bad choice of candidates
- → Solution: an adaptive algorithm (sketch below)
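A minimal sketch of the candidate-selection step, assuming absolute Pearson correlation as the "simple local statistic" (the paper also considers information-theoretic measures):

```python
import numpy as np

def candidate_parents(expr, k=5):
    """expr: (n_experiments, n_genes) array. Returns {gene: [candidate parents]}."""
    corr = np.corrcoef(expr, rowvar=False)      # gene-by-gene correlation matrix
    np.fill_diagonal(corr, 0.0)                 # a gene is not its own parent
    return {g: list(np.argsort(-np.abs(corr[:, g]))[:k])
            for g in range(expr.shape[1])}

# The search is then restricted so Pa(g) is a subset of candidate_parents[g];
# the adaptive version re-estimates the candidate sets after each search
# iteration, so an early bad choice can be revised.
```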
Discretization
- The practical side
- We need to define the local probability model for each variable
- → discretize the expression data into {-1, 0, 1}
- (expression level lower than, similar to, or higher than control)
- Set the control by averaging.
- Set a threshold ratio for significantly higher/lower expression (sketch below).
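A minimal sketch of this rule, assuming `expr` holds log-ratios of expression to the averaged control and using an illustrative threshold of 0.5 (not necessarily the paper's value):

```python
import numpy as np

def discretize(expr, threshold=0.5):
    """Map each measurement to -1 (lower), 0 (similar), or +1 (higher)."""
    out = np.zeros_like(expr, dtype=int)
    out[expr >  threshold] = 1
    out[expr < -threshold] = -1
    return out
```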
Application to Cell Cycle Expression Patterns
- 76 gene expression measurements of the mRNA levels of 6177 Saccharomyces cerevisiae ORFs: six time series under different cell cycle synchronization methods (Spellman et al. 1998).
- 800 genes were differentially expressed; 250 of them fall into 8 distinct clusters. The network variables represent the expression levels of the 800 genes.
- An additional variable denoting the cell cycle phase was introduced to deal with the temporal nature of the cell cycle process, and was forced to be a root in the network.
- Applied the Sparse Candidate Algorithm to a 200-fold bootstrap of the original data.
- Used no prior biological knowledge in the learning algorithm.
Network with all edges
Network with edges that represent relations with confidence level above 0.3
YNL058C Local Map
- Edges
- Markov
- Ancestors
- Descendants
- SGD entry
- YPD entry
Robustness analysis
- Used the 250-gene data set for the robustness analysis
- Created a random data set by permuting the order of the experiments independently for each gene
- No real features are expected to be found in the randomized data
Robustness analysis (contd)
- Lower confidence for order and Markov relations in the random data set
- Longer and heavier tail in the high-confidence region for the original data set
- Sparser networks learned from the real data
- Features learned from the original data with a high confidence level are not an artifact of the bootstrap estimation
Robustness analysis (contd)
- Compared the confidence levels of learned features between the 250-gene and 800-gene data sets
- Strong linear correlation
- Compared the confidence levels of learned features between different discretization thresholds
- Definite linear tendency
Biological Analysis
- Order relations
- Dominant genes indicate potential causal sources of the cell cycle process
- Dominance score of X:
- d_o(X) = Σ_{Y : C(X,Y) > t} C(X,Y)^k,
- where C(X,Y) is the confidence in X being an ancestor of Y, k is used to reward high-confidence features, and t is a threshold to discard low-confidence ones (sketch below)
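A minimal sketch of the score, assuming `C` is a dict mapping gene pairs (X, Y) to the bootstrap confidence that X is an ancestor of Y:

```python
def dominance_score(x, C, genes, k=2, t=0.3):
    """d_o(x) = sum over Y with C(x, Y) > t of C(x, Y) ** k."""
    return sum(C.get((x, y), 0.0) ** k
               for y in genes if y != x and C.get((x, y), 0.0) > t)
```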
Biological Analysis (contd)
- Dominant genes are key genes in basic cell functions
Biological Analysis (contd)
Markov relations
- The top Markov relations reveal functional relations between genes
- 1. Both genes known: the relations make sense biologically
- 2. One gene unknown: firm homologies to proteins functionally related to the other gene
- 3. Both genes unknown: physically adjacent on the chromosome, presumably regulated by the same mechanism
- FAR1 and ASH1: low correlation, in different clusters, yet known to participate in mating-type switching
- CLN2 is likely to be a parent of RNR3, SVS1, SRO4 and RAD51. All appeared in the same cluster, with no links among the four genes. CLN2 is known to be a central cell cycle control, and there is no clear biological relationship among the others.
Discussion and Future Work
- Applied the Sparse Candidate Algorithm and bootstrap resampling to extract a Bayesian network from the 800-gene data set of Spellman et al.
- Used no prior biological knowledge
- Derived biologically plausible conclusions
- Capable of discovering causal relationships, interactions between genes, and rich structure between clusters
- Developing hybrid algorithms with clustering to learn models over clustered genes
- Extensions
- Learn local probability models that handle continuous data
- Improve the theory and algorithms
- Include biological knowledge as prior knowledge
- Improve the search heuristics
- Apply Dynamic Bayesian Networks to temporal data
- Discover causal patterns (using interventional data)