Folie 1 - PowerPoint PPT Presentation

Slides: 18
Provided by: tre70
Transcript and Presenter's Notes

1
Graphical Models and Biological Networks
Recommended: Christopher Bishop's tutorial on
graphical models, based on his book Pattern
Recognition and Machine Learning
http://research.microsoft.com/en-us/um/people/cmbishop/prml/slides/prml-slides-8.ppt
2
Learning from non-interventional data
  • Gaussian Graphical Models
  • Pruning

Best suited for high-dimensional, noisy data
What do the arrows mean? Do they have a
biological interpretation?
3
Learning from non-interventional data
  • Possible models include:
      • Correlation Graphs
      • Gaussian Graphical Models
      • Bayesian Networks
  • However, correct reconstruction of the complete
    regulatory network is impossible due to:
      • Lack of data
      • Measurement error
      • Oversimplified/wrong model assumptions

"All models are wrong, some of them are
useful" (Edwards Deming, George Box)
4
Correlation Graphs
5
Correlation Graphs
  • An expression profile is a collection of
    expression vectors $X_g = (X_{g,s})_{s \in \text{samples}}$, $g \in \text{genes}$.
  • Correlation graph: Depict genes as vertices of a
    graph and draw an undirected edge $(i, j)$ if some
    correlation measure (Pearson correlation,
    Spearman rank correlation, Kendall's tau) between
    $X_i$ and $X_j$ is sufficiently different from zero.
  • Advantage: This representation of the marginal
    dependence structure is easy to interpret and can
    be estimated accurately even if there are many more
    genes than measurements (the $p \gg N$ situation).
  • Application: Stuart et al. (Science, 2003) build
    a graph from coexpression across multiple
    organisms.
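
As a minimal sketch of the construction described above, assuming a genes x samples expression matrix, the edges of a Pearson correlation graph could be computed with NumPy as follows. The function name `correlation_graph`, the cutoff 0.8, and the toy data are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def correlation_graph(X, threshold=0.8):
    """Return undirected edges (i, j), i < j, with |Pearson r| >= threshold.

    X is a genes x samples array; each row is one expression vector X_g.
    The threshold is a hypothetical stand-in for "sufficiently different from zero".
    """
    R = np.corrcoef(X)                              # genes x genes correlation matrix
    i_idx, j_idx = np.triu_indices_from(R, k=1)     # visit each unordered pair once
    keep = np.abs(R[i_idx, j_idx]) >= threshold
    return [(int(i), int(j)) for i, j in zip(i_idx[keep], j_idx[keep])]

# Toy data (assumed): gene 1 follows gene 0 closely, gene 2 is unrelated noise.
rng = np.random.default_rng(0)
g0 = rng.normal(size=50)
g1 = g0 + 0.1 * rng.normal(size=50)
g2 = rng.normal(size=50)
edges = correlation_graph(np.vstack([g0, g1, g2]))
print(edges)
```

A fixed cutoff is the simplest possible choice; a significance test on the correlation coefficient would be an alternative way to decide what counts as "sufficiently different from zero".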

6
Correlation Measures
  • Pearson correlation
  • Spearman rank correlation
  • Kendall's tau

Figure: sample (Pearson) correlations (taken from
Wikipedia)
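
For reference, the standard textbook definitions of the three measures for paired observations $(x_1, y_1), \dots, (x_n, y_n)$ can be written as follows. The notation is an assumption and may differ from the original slide; the Kendall formula is the tie-free version (tau-a), with $n_c$ and $n_d$ the numbers of concordant and discordant pairs.

```latex
\begin{align*}
\text{Pearson:}\quad & r_{xy} = \frac{\sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})}
  {\sqrt{\sum_{k=1}^{n} (x_k - \bar{x})^2}\,\sqrt{\sum_{k=1}^{n} (y_k - \bar{y})^2}}
  \quad\text{with}\quad \bar{x} = \frac{1}{n}\sum_{k=1}^{n} x_k,\;\; \bar{y} = \frac{1}{n}\sum_{k=1}^{n} y_k \\
\text{Spearman:}\quad & \rho_S = r_{\operatorname{rank}(x)\,\operatorname{rank}(y)}
  \quad\text{(Pearson correlation of the ranks)} \\
\text{Kendall:}\quad & \tau = \frac{n_c - n_d}{\binom{n}{2}}
\end{align*}
```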
7
Correlation Graphs
  • It is impossible to distinguish direct from
    indirect dependence.
  • Three reasons why X, Y, and Z may be highly
    correlated
  • Possible remedies:
      • Search for correlations which cannot be explained
        by other variables.
      • Measure effects of gene perturbations.
  • A strong correlation is not strong evidence for
    regulatory dependence (lots of false positives);
    rather, a low correlation is strong evidence
    for the absence of a regulatory edge.
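
The remedy of searching for correlations that cannot be explained by other variables can be illustrated with a first-order partial correlation, $\rho_{xy \cdot z} = (\rho_{xy} - \rho_{xz}\rho_{yz}) / \sqrt{(1 - \rho_{xz}^2)(1 - \rho_{yz}^2)}$. The sketch below assumes a hypothetical common-cause scenario in which Z drives both X and Y; the simulated data and the helper name `partial_corr` are illustrative, not from the slides.

```python
import numpy as np

def partial_corr(x, y, z):
    """First-order partial correlation of x and y given a single variable z."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Hypothetical common cause: Z regulates both X and Y; X and Y do not interact.
rng = np.random.default_rng(1)
z = rng.normal(size=2000)
x = z + 0.5 * rng.normal(size=2000)
y = z + 0.5 * rng.normal(size=2000)

r_marginal = np.corrcoef(x, y)[0, 1]   # high: induced entirely by Z
r_partial = partial_corr(x, y, z)      # close to zero once Z is accounted for
print(f"marginal r = {r_marginal:.2f}, partial r given Z = {r_partial:.2f}")
```

In a correlation graph X and Y would be joined by an edge here, although the dependence between them is entirely indirect.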

8
Recap: Conditional Independence
  • In other words:
  • Knowing Z, knowing Y is irrelevant for knowing X
    (and vice versa).
  • Z explains any observed dependence between X and
    Y.
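
Written out with densities $p$ (notation assumed here), the standard definition of conditional independence behind these statements reads:

```latex
X \perp\!\!\!\perp Y \mid Z
\;\iff\; p(x, y \mid z) = p(x \mid z)\, p(y \mid z)
\;\iff\; p(x \mid y, z) = p(x \mid z),
\quad \text{wherever the conditional densities are defined.}
```

The last form matches the first bullet: once $z$ is known, additionally conditioning on $y$ does not change the distribution of $X$.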

taken from Florian Markowetz
9
Gaussian Graphical Models
taken from Florian Markowetz
10
Gaussian Graphical Models
11
Gaussian Graphical Models
If we assume that the joint expression
distribution of all genes follows a multivariate
Gaussian distribution (which is of course never
the case), conditional independence can be
assessed as follows:
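
Under the Gaussian assumption, the standard result is that $X_i$ and $X_j$ are conditionally independent given all other variables exactly when entry $(i, j)$ of the inverse covariance (precision) matrix $\Omega = \Sigma^{-1}$ is zero; the corresponding partial correlation is $-\omega_{ij} / \sqrt{\omega_{ii}\,\omega_{jj}}$. A minimal sketch, assuming a samples x variables data matrix with many more samples than variables (the function name and the toy chain are hypothetical):

```python
import numpy as np

def partial_correlations(data):
    """Full-order partial correlations from a samples x variables array.

    Requires more samples than variables so the sample covariance is invertible.
    """
    sigma = np.cov(data, rowvar=False)      # variables x variables covariance
    omega = np.linalg.inv(sigma)            # precision matrix
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)          # rho_ij = -omega_ij / sqrt(omega_ii * omega_jj)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Hypothetical chain X0 -> X1 -> X2: X0 and X2 depend on each other only through X1.
rng = np.random.default_rng(2)
n = 5000
x0 = rng.normal(size=n)
x1 = x0 + rng.normal(size=n)
x2 = x1 + rng.normal(size=n)
pcor = partial_correlations(np.column_stack([x0, x1, x2]))
print(np.round(pcor, 2))
```

In this toy chain the entry for the pair (X0, X2) is close to zero, so a GGM would draw edges X0 - X1 and X1 - X2 only, whereas a correlation graph would connect all three variables.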
12
Problems in high dimensions
  • Full conditional relationships can only be
    accurately estimated if the number of samples N
    is relatively large compared to the number of
    variables p.
  • Thus, if $p \gg N$, you can ...
      • use the Moore-Penrose pseudoinverse, bootstrap
        aggregation and shrinkage estimators to stabilize
        the result
      • resort to a simpler model that does not rely on
        full conditional independence
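
A rough sketch of two of these stabilisation options for data with more variables than samples: the Moore-Penrose pseudoinverse of the singular sample covariance, and a simple shrinkage of the covariance toward its diagonal. The shrinkage intensity `lam = 0.2`, the cutoff `rcond=1e-10`, and the random data are arbitrary illustrative assumptions; in practice the intensity would be chosen from the data.

```python
import numpy as np

rng = np.random.default_rng(3)
n_samples, p = 10, 30                        # fewer samples than variables
data = rng.normal(size=(n_samples, p))

sigma = np.cov(data, rowvar=False)           # p x p, rank at most n_samples - 1
rank = np.linalg.matrix_rank(sigma)          # below p, so sigma has no ordinary inverse

# Option 1: Moore-Penrose pseudoinverse as a stand-in for the inverse.
# The explicit cutoff discards singular values that are numerically zero.
omega_pinv = np.linalg.pinv(sigma, rcond=1e-10)

# Option 2: shrink toward a diagonal target; lam is a hypothetical fixed intensity.
lam = 0.2
target = np.diag(np.diag(sigma))
sigma_shrunk = (1 - lam) * sigma + lam * target
omega_shrunk = np.linalg.inv(sigma_shrunk)   # well defined: sigma_shrunk is positive definite

print(rank, np.linalg.cond(sigma_shrunk))
```

Either precision estimate can then be converted to partial correlations in the usual way; the shrinkage variant has the advantage that the regularised covariance is guaranteed to be invertible.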

Graph from Basso et al. (Nat Genet, 2005)
13
Problems in high dimensions
14
Modified GGMs
Correlation Graphs
GGMs
Wille / Bühlmann
Recall that independence does not imply
conditional independence and vice versa; thus all
these methods are distinct.
All methods failed to accurately reconstruct
networks, even networks of very moderate
size (20).
15
Markov Random Fields
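Definition: An undirected graph $G = (V, E)$, together
with a family of random variables $(X_v)_{v \in V}$, is a
Markov network (Markov random field, MRF) if one
of the following three conditions holds (they are
equivalent when the joint density is strictly positive):
Pairwise Markov property
For all non-adjacent $u, v \in V$: $X_u \perp X_v \mid X_{V \setminus \{u, v\}}$
Local Markov property
For all $v \in V$: $X_v \perp X_{V \setminus (\mathrm{ne}(v) \cup \{v\})} \mid X_{\mathrm{ne}(v)}$, where $\mathrm{ne}(v)$ denotes the neighbours of $v$
Global Markov property
For all subsets $A, B, S$ of $V$ such that $S$ separates
$A$ and $B$: $X_A \perp X_B \mid X_S$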
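
The separation condition in the global Markov property is purely graph-theoretic: S separates A and B if every path from a node in A to a node in B passes through S. A minimal sketch of such a check via breadth-first search, assuming the graph is given as an adjacency dictionary (the function name `separates` and the toy graph are hypothetical):

```python
from collections import deque

def separates(adj, A, B, S):
    """Return True if every path from a node in A to a node in B passes through S."""
    A, B, S = set(A), set(B), set(S)
    seen = A - S                         # start from A, never stepping onto S
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        if u in B:
            return False                 # reached B without crossing S
        for v in adj.get(u, ()):
            if v not in S and v not in seen:
                seen.add(v)
                queue.append(v)
    return True

# Toy undirected chain a - b - c - d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(separates(adj, {"a"}, {"c"}, {"b"}))   # b blocks the only path from a to c
print(separates(adj, {"a"}, {"d"}, set()))   # with an empty separator, a reaches d
```

For a Markov random field on this chain, the first call corresponds to the statement that the variable at a is conditionally independent of the variable at c given the variable at b.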
16
Markov Random Fields
The joint density of a Markov random field can be
factorized into clique potentials,
$p(x) = \frac{1}{Z} \prod_{C \in \mathcal{C}(G)} \psi_C(x_C)$,
if the density is positive (Hammersley-Clifford
theorem), or if the graph is chordal (without
proof).
A Gaussian Graphical Model is a particular Markov
random field with $X \sim N(\mu, \Sigma)$,
and $(u, v) \in E$ whenever $(\Sigma^{-1})_{uv} \neq 0$.
(Proof: blackboard)
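
One standard argument for the Gaussian case, sketched here under the assumption of a zero-mean Gaussian with precision matrix $\Omega = \Sigma^{-1}$ (the symbols $\Omega$, $\omega_{uv}$ and $\psi$ are notation chosen for this sketch and need not match the blackboard proof), expands the exponent of the density into node and edge terms:

```latex
\begin{align*}
p(x) &\propto \exp\!\Big(-\tfrac{1}{2}\, x^{\top} \Omega\, x\Big)
      = \exp\!\Big(-\tfrac{1}{2} \sum_{v \in V} \omega_{vv}\, x_v^2 \;-\; \sum_{u < v} \omega_{uv}\, x_u x_v\Big) \\
     &= \prod_{v \in V} \underbrace{\exp\!\big(-\tfrac{1}{2}\,\omega_{vv}\, x_v^2\big)}_{\psi_v(x_v)}
        \;\prod_{\substack{u < v \\ \omega_{uv} \neq 0}} \underbrace{\exp\!\big(-\omega_{uv}\, x_u x_v\big)}_{\psi_{uv}(x_u, x_v)} .
\end{align*}
```

Only pairs with $\omega_{uv} \neq 0$ contribute a factor, so the density factorizes over the graph whose edges are exactly those pairs.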
17
Graphical Models - Overview
Probabilistic models
  • Graphical models
      • Undirected (Markov random fields, MRFs)
      • Directed (Bayesian networks)