Title: Untangling graphs: denoising protein-protein interaction networks
1. Untangling graphs: denoising protein-protein interaction networks
- Quaid Morris
- (joint work with Brendan Frey)
2. Motivation
- High-throughput graph data is noisy, e.g.,
- protein-protein interaction networks
- synthetic lethal interaction networks
- Real-world graphs are highly structured
Idea: use prior knowledge about structure to denoise graphs.
3. Protein-protein interaction network
(Figure: Jeong et al., Nature 2001)
4. Overview
- Illustrative example
- Model and inference algorithm
- Protein-protein interaction network denoising
5. Example: spy rings
(Figure: suspects and their phone records)
- Spies call exactly two other spies
- Suspects may call other suspects
- Phone records may be lost
6. Example: spy rings (cont.)
(Figure: phone records and possible spy rings)
7. Denoising example
Noise assumptions:
- No lost calls
- Rare, independent social calls
(Figure: possible rings consistent with the phone records)
10. Untangling example: telemarketing
Noise assumptions:
- Lost calls and rare social calls
- Telemarketing
(Figure: possible decompositions of the observed call graph)
11. Summary
- With structured noise, the observed graph is composed of several different graphs, each with its own properties.
12. Graph generative model
1) Sample hidden graphs E^h from P(E^h), for h = 1, ..., H.
2) Sample each observation x_{i,j} from P(x_{i,j} | e^1_{i,j}, e^2_{i,j}).
(Figure: hidden graphs E^1 and E^2 combine to produce the observed graph X.)
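The two-step generative process can be sketched in code. This is a minimal sketch: the graph sizes, edge probabilities, and loss rate are illustrative assumptions, and independent Bernoulli edges stand in for the degree-based priors introduced later in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
N, H = 5, 2                     # nodes and hidden graphs (illustrative sizes)
p_edge = [0.4, 0.2]             # per-graph edge probabilities (assumed)

# 1) Sample hidden graphs E^h from P(E^h).  Independent Bernoulli edges
#    stand in here for the degree-based priors introduced later.
E = []
for h in range(H):
    upper = np.triu(rng.random((N, N)) < p_edge[h], k=1)
    E.append(upper | upper.T)   # undirected, no self-loops

# 2) Sample x_{i,j} from P(x_{i,j} | e^1_{i,j}, e^2_{i,j}): an edge is
#    observed if present in either hidden graph, but each observation is
#    independently lost with probability 0.1.
p_lost = 0.1
present = E[0] | E[1]
kept = np.triu(rng.random((N, N)) < 1 - p_lost, k=1)
X = present & (kept | kept.T)
```

The observation model here (union of hidden graphs with dropout) is one simple choice of P(x | e^1, e^2); any likelihood over the hidden edge indicators fits the same template.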
13. Model and inference
Joint:
P(X, E^1, E^2, ..., E^H) = [∏_h P(E^h)] [∏_{i>j} P(x_{i,j} | e^1_{i,j}, e^2_{i,j}, ..., e^H_{i,j})]
Posterior marginal:
P(e^h_{i,j} | X) = P(e^h_{i,j}, X) / P(X)
Probability of evidence:
P(X) = Σ_{E^1} Σ_{E^2} ... Σ_{E^H} P(X, E^1, E^2, ..., E^H)
These are generally intractable sums.
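To see why P(X) is problematic, here is a brute-force evaluation for a single hidden graph over just six edge slots; all probabilities are toy values, not numbers from the talk. The sum already has 2^6 terms and grows as 2^(number of edge variables). With the independent-edge prior used in this toy, the sum happens to factorize per edge (which the sanity check exploits); a degree-based prior couples the edges and destroys that factorization, which is why approximate sum-product inference is needed.

```python
from itertools import product

# Brute-force P(X) for a single hidden graph over M = 6 edge slots.
M = 6
x_obs = [1, 0, 1, 1, 0, 0]          # a hypothetical observed graph
p_prior = 0.3                       # P(e = 1): toy independent-edge prior
p_keep, p_false = 0.9, 0.05         # P(x=1 | e=1), P(x=1 | e=0) (assumed noise)

def lik(x, e):
    p1 = p_keep if e else p_false
    return p1 if x else 1.0 - p1

p_X = 0.0
for E in product([0, 1], repeat=M):  # enumerate all 2^M hidden configurations
    term = 1.0
    for e, x in zip(E, x_obs):
        term *= (p_prior if e else 1.0 - p_prior) * lik(x, e)
    p_X += term
```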
14. Three tricks for tractability
- Degree-based graph priors
- Sum-product approximate inference
- Dynamic programming trick
15. Degree-based graph priors
P(E^h) = ∏_i f^h(d^h_i) / Z
where d^h_i = Σ_j e^h_{i,j} is the degree of vertex i, and f^h is the degree potential for graph h.
- Captures real-world network structure
- Admits a clean sum-product (loopy belief propagation) algorithm
- Introduces a dummy degree variable, d_i
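Evaluating the unnormalized prior ∏_i f(d_i) is straightforward; a minimal sketch, assuming a Poisson-shaped degree potential with mean degree 2 for illustration (not the potential fitted in the talk):

```python
import numpy as np
from math import exp, factorial, log

# Unnormalized degree-based prior P(E) ∝ ∏_i f(d_i).
def f(d, mean=2.0):
    # Illustrative Poisson-shaped degree potential (an assumption).
    return exp(-mean) * mean**d / factorial(d)

def log_prior(E):
    """E: symmetric 0/1 adjacency matrix.  Returns log ∏_i f(d_i), up to log Z."""
    degrees = E.sum(axis=1)          # d_i = Σ_j e_{i,j}
    return sum(log(f(int(d))) for d in degrees)

# A 4-cycle: every vertex has degree exactly 2, near the potential's peak,
# so it scores higher than a star with degrees (3, 1, 1, 1).
ring = np.array([[0, 1, 0, 1],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 0, 1, 0]])
```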
16. Two types of random graphs
Exponential vs. scale-free
(Figure: Jeong et al., Nature 2000)
17. Random graph degree distributions
- Scale-free: f(k) = C k^{-p}, with p > 1
- Exponential: Poisson(<k>)
(Figure: Jeong et al., Nature 2000)
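The contrast between the two laws can be checked numerically; the exponent, mean degree, and cutoff below are illustrative choices, not fitted values. The point is that the power-law (scale-free) distribution keeps substantial mass on high-degree hubs, while the Poisson (exponential) tail vanishes.

```python
import numpy as np
from math import exp, factorial

# Compare the two degree laws over k = 1..kmax (illustrative parameters).
kmax, p, mean_k = 50, 2.5, 4.0
k = np.arange(1, kmax + 1)

scale_free = k.astype(float) ** -p              # f(k) = C k^{-p}
scale_free /= scale_free.sum()                  # normalize over 1..kmax

poisson = np.array([exp(-mean_k) * mean_k**int(kk) / factorial(int(kk))
                    for kk in k])
poisson /= poisson.sum()                        # exponential (Poisson) network

# Mass on high-degree hubs (k >= 20): large for scale-free, negligible
# for Poisson.
tail_sf = scale_free[k >= 20].sum()
tail_po = poisson[k >= 20].sum()
```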
18. Other real-world structure
- Small-worldness
- Degree correlations
- Clustering
Google "Mark Newman Michigan" for more info.
19. Factor graph for denoising
(Figure, built up over slides 19-23: a four-node example with edge variables e_{1,2}, ..., e_{3,4}, observed variables x_{1,2}, ..., x_{3,4}, and degree variables d_1, ..., d_4. Degree potentials f(d) attach to each d_i; indicator functions I(d_i, Σ_j e_{i,j}) tie each degree variable to its incident edges; likelihood functions P(x|e) connect each observation to its edge variable.)
24. Sum-product approximate inference
- Two types of binary messages:
  - edge variables → constraint nodes
  - constraint nodes → edge variables
25. Calculating edge → constraint messages
For e ∈ {0, 1}, the message from edge e_{i,j} to degree constraint I_i is
m_{e_{i,j} → I_i}(e) = m_{I_j → e_{i,j}}(e) · m_{x_{i,j} → e_{i,j}}(e)
where m_{x_{i,j} → e_{i,j}}(e) is the likelihood message, i.e. P(x_{i,j} | e_{i,j} = e), and m_{I_j → e_{i,j}}(e) is the constraint → edge message from the edge's other degree constraint.
26. Calculating constraint → edge messages
The message from degree constraint I_i to edge e_{i,j} is
m_{I_i → e_{i,j}}(e) = Σ_d f(d) Σ_{e_1, ..., e_N s.t. e_j = e and Σ_k e_k = d} ∏_{k ≠ j} m_{e_{i,k} → I_i}(e_k)
where f(d) is the degree prior. An intractable sum? No: use dynamic programming.
27. Dynamic programming solution
Introduce a chain of partial-sum variables s_1, s_2, ..., s_N over the edges e_{1,j}, ..., e_{N,j}, with the constraint s_{i+1} = s_i + e_{i+1,j} and d_j = s_N.
Summing over edge configurations directly takes O(2^N) time; passing messages along the partial-sum chain takes O(N^2) time.
(Figure: the degree constraint I_j unrolled into a chain of partial sums s_1, ..., s_N.)
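The partial-sum chain amounts to folding in the incoming binary messages one at a time (a sequence of small convolutions), then weighting by the degree potential f(d). A sketch in O(N^2) time; the function and variable names are mine, not from the talk:

```python
import numpy as np

# Constraint → edge message via the partial-sum chain:
#   m_{I→e_j}(e) = Σ_d f(d) Σ_{e_k, k≠j : Σ_k e_k = d} ∏_{k≠j} m_k(e_k)
def constraint_to_edge(msgs, f, j):
    """msgs: list of N incoming messages (m_k(0), m_k(1)); f: degree potential."""
    others = [m for k, m in enumerate(msgs) if k != j]
    dist = np.array([1.0])            # dist[s] = total weight of partial sum s
    for m0, m1 in others:             # one O(chain length) update per edge
        new = np.zeros(len(dist) + 1)
        new[:-1] += dist * m0         # e_k = 0 leaves the sum unchanged
        new[1:] += dist * m1          # e_k = 1 increments it
        dist = new
    # Fold in e_j itself and weight by the degree potential f(d).
    out0 = sum(f(s) * w for s, w in enumerate(dist))       # e_j = 0: d = s
    out1 = sum(f(s + 1) * w for s, w in enumerate(dist))   # e_j = 1: d = s + 1
    return out0, out1
```

Each of the N updates touches a vector of length at most N, giving the O(N^2) total instead of the O(2^N) enumeration.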
29. Inference for untangling
- Message passing is the same as for denoising, except that the likelihood message must be recalculated.
- The likelihood message incorporates edge
30. Factor graph for untangling
(Figure: two hidden graphs over a three-node example. Each pair (i, j) has edge variables e^1_{i,j} and e^2_{i,j} feeding a shared likelihood P(x | e^1, e^2); each graph h has its own degree variables d^h_i, indicator functions I(d, Σe), and degree potentials f^h(d).)
31. Protein-protein interaction network denoising
von Mering et al. (2002) dataset:
- Eight PPI networks consisting of:
  - low-quality direct evidence (high-throughput)
  - indirect evidence
- Gold standard: a small set of confirmed interactions
32. Empirical degree distributions
33. Methods
- Split ~6,000 ORFs into training and test sets.
- On the training set:
  - Fit degree priors to both the true graph and the false graph.
  - Construct the likelihood function from all observations using Naïve Bayes.
- Every observed interaction must be placed in exactly one of the two hidden graphs.
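The Naïve Bayes likelihood step can be sketched as a log-likelihood ratio over independent evidence sources. The source names and conditional probabilities below are invented placeholders for illustration, not the values fitted on the von Mering data.

```python
import numpy as np

# Naive Bayes likelihood over evidence sources: P(obs | e) = ∏_s P(obs_s | e).
# All names and probabilities here are hypothetical placeholders.
p_given_true = {"y2h": 0.6, "coexpr": 0.7, "colocalization": 0.8}
p_given_false = {"y2h": 0.1, "coexpr": 0.3, "colocalization": 0.5}

def log_likelihood_ratio(obs):
    """obs maps source -> bool.  Returns log [P(obs | true) / P(obs | false)]."""
    llr = 0.0
    for s in p_given_true:
        pt, pf = p_given_true[s], p_given_false[s]
        if obs.get(s, False):
            llr += np.log(pt / pf)      # source observed the interaction
        else:
            llr += np.log((1.0 - pt) / (1.0 - pf))  # source did not
    return llr
```

A positive ratio favors placing the interaction in the true graph; a negative one favors the false graph.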
34. Results
(Figure: untangling vs. baseline performance)
35. Summary
- A generative model for observed graphs composed of many hidden graphs
- A sum-product approximate inference algorithm for degree-based priors
- An application to protein-protein interaction network noise removal