Network Motifs: Simple Building Blocks of Complex Network - PowerPoint PPT Presentation


1
Network Motifs: Simple Building Blocks of Complex
Network

Lecturer: Jian Li

2
Introduction

Recently, it was found that biochemical and
neuronal networks share a similar property: they
contain recurring circuit elements that occur
far more often than in randomized networks.
We call such simple building blocks network
motifs.

3
Introduction

In the case of biological regulation networks, it
has been suggested that network motifs play key
information processing roles.

4
Introduction

Some examples

Three major network motifs were found in the
transcription networks of bacteria and yeast.
One of these, the feed-forward loop, has been
shown theoretically to perform information
processing tasks such as sign-sensitive
filtering, response acceleration and
pulse generation.
5
Introduction

Some examples

6
Introduction

Schematic Illustration

Red dashed lines indicate edges that participate
in the feed-forward loop motif, which occurs five
times in the real network.
7
Introduction

Applications in other networks
Ecology (food webs)
Neurobiology (neuron connectivity)
Engineering (electronic circuits, WWW)

8
Introduction

Some remarks
The solution we get is closely tied to the
randomized network model, so a reasonable choice
of randomized network model is very important.
Some functionally important but infrequent
building blocks will be missed no matter how we
choose the model. Finding them requires specific
knowledge and information that are beyond the
scope of a graph-theoretic approach.

9
Related Problems

Theoretical Perspective
Efficiently counting cycles.
Counting spanning trees.
Counting nonisomorphic graphs.
Testing isomorphism.
Approximating perfect matchings.
Approximating frequent subgraphs based on the
regularity lemma.

10
Related Problems

Data mining perspective.
Mining frequent subgraphs.
Mining a given subgraph.
Mining subgraphs in sparse networks.
Graph-based substructure pattern mining (gSpan).

11
Related Problems

Random network.
Generating randomized networks with a prescribed
degree sequence.
Estimating subgraphs in random networks.

12
Related Problems

Random network.
Erdős-Rényi model
- the number of edges per node follows a Poisson
distribution.
Scale-free model
- the number of edges per node follows a
power-law distribution.

13
Randomized Network

Generating randomized networks
Here we give only a simple algorithm.
We employ a Markov-chain algorithm that starts
with the real network and repeatedly swaps
randomly chosen pairs of connections
(X1 -> Y1, X2 -> Y2 is replaced by X1 -> Y2, X2 -> Y1)
until the network is well randomized.
A switch is prohibited if either of the
connections X1 -> Y2 or X2 -> Y1 already exists.
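
The switching procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the lecturer's implementation: the function name `randomize_by_switching`, the fixed number of switch attempts used as a stopping rule, and the extra check that skips self-loops are all assumptions added here.

```python
import random


def randomize_by_switching(edges, num_switches, seed=None):
    """Degree-preserving randomization of a directed edge list.

    edges: iterable of (source, target) pairs without duplicates.
    num_switches: number of switch attempts (assumed stopping rule).
    """
    rng = random.Random(seed)
    edge_list = list(edges)
    edge_set = set(edge_list)
    if len(edge_list) < 2:
        return edge_list
    for _ in range(num_switches):
        a, b = rng.sample(range(len(edge_list)), 2)
        x1, y1 = edge_list[a]
        x2, y2 = edge_list[b]
        # Prohibit the switch if either new connection already exists.
        if (x1, y2) in edge_set or (x2, y1) in edge_set:
            continue
        # Assumption: also skip switches that would create self-loops.
        if x1 == y2 or x2 == y1:
            continue
        edge_set.discard((x1, y1))
        edge_set.discard((x2, y2))
        edge_set.add((x1, y2))
        edge_set.add((x2, y1))
        edge_list[a] = (x1, y2)
        edge_list[b] = (x2, y1)
    return edge_list
```

Each accepted switch keeps every node's in-degree and out-degree unchanged, which is why the randomized network has the same degree sequence as the real one.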

14
Randomized Network

Controlling for Appearances of (n-1)-Node
Motifs
We generate a series of randomized network
ensembles, each of which has the same
(n-1)-node subgraph counts as the real network,
as a null hypothesis for detecting n-node motifs.
This is done to avoid assigning high significance
to a structure only because it includes a highly
significant substructure.

15
Randomized Network

Controlling for Appearances of (n-1)-Node
Motifs
Metropolis Monte Carlo approach
Let $V_{real,k}$ be the number of appearances of
the k-th (n-1)-node subgraph in the real network,
and let $V_{rand,k}$ be the corresponding count in
the randomized network.
We define an energy
$E = \sum_k |V_{real,k} - V_{rand,k}| / (V_{real,k} + V_{rand,k})$.
The energy E is zero only when all the three-node
subgraph counts of the real and randomized graphs
are equal.
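
As a small illustration of the energy, the sketch below assumes it has the form of a sum over subgraph types of |V_real,k - V_rand,k| / (V_real,k + V_rand,k), which vanishes exactly when the two count vectors agree. The function name `energy` and the handling of a type that is absent from both networks are assumptions made here.

```python
def energy(v_real, v_rand):
    """Sum of normalized count differences between two subgraph-count vectors."""
    total = 0.0
    for real_count, rand_count in zip(v_real, v_rand):
        denom = real_count + rand_count
        if denom == 0:
            # Assumption: a type absent from both networks contributes zero.
            continue
        total += abs(real_count - rand_count) / denom
    return total
```

For example, `energy([3, 5], [3, 5])` is zero, while any mismatch in a count gives a positive value.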

16
Randomized Network

Controlling for Appearances of (n-1)-Node
Motifs
Start by fully randomizing the network according
to the first algorithm.
Then generate a random switch (X1 -> Y1, X2 -> Y2
to X1 -> Y2, X2 -> Y1, and similarly for double
edges, as described above).
If this switch lowers E, it is accepted.
Otherwise, it is accepted with probability
$\exp(-\Delta E / T)$, where $\Delta E$ is the
difference in energy before and after the switch
and T is an effective temperature.
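
The acceptance rule can be written as a short helper. This sketch assumes the standard Metropolis form, in which a switch that raises the energy by delta_e (energy after minus energy before) is accepted with probability exp(-delta_e / T). The name `accept_switch` and its parameters are illustrative.

```python
import math
import random


def accept_switch(delta_e, temperature, rng=random):
    """Metropolis rule: always accept a switch that lowers the energy,
    otherwise accept with probability exp(-delta_e / temperature)."""
    if delta_e < 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)
```

Lowering `temperature` makes energy-raising switches progressively less likely to be accepted.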

17
Graph Theoretical Results

Controlling for Appearances of (n-1)-Node
Motifs
This process is repeated, with a simulated
annealing regimen that lowers T slowly, until a
solution with E = 0 is obtained.
This can be readily generalized to form
(n-1)-node null-hypothesis networks.

18
Algorithm Counting

Goal: find all n-node network motifs.
Method:
Do the following for both the real network and
the randomized networks.
Simply enumerate all possible n-node subgraphs
and classify them into non-isomorphic classes.
Count the number of subgraphs in each class (see
all types of 3- and 4-node nonisomorphic graphs).
19
Algorithm Counting

Efficiently count all connected n-node subgraphs
in a connectivity matrix M.
main:
  for all rows i
    for each nonzero element (i, j)
      search(i, j)
search(i, j):
  for each k such that Mik = 1 and k != j
    if an n-node subgraph is obtained then
      record it and return
    else search(i, k)
  do similar things for each Mki = 1, Mkj = 1,
  Mjk = 1
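
The counting step can be illustrated with a brute-force Python sketch. It is not the recursive matrix search outlined above: it grows connected node sets through an undirected neighbor map and classifies each set by a canonical form taken as the minimum adjacency pattern over all node orderings, which is only practical for small n. The names `canonical_form` and `count_subgraphs` are hypothetical.

```python
from collections import Counter
from itertools import permutations


def canonical_form(nodes, edge_set):
    """Smallest directed adjacency pattern over all orderings of the nodes."""
    best = None
    for order in permutations(nodes):
        key = tuple(
            1 if (u, v) in edge_set else 0
            for u in order for v in order if u != v
        )
        if best is None or key < best:
            best = key
    return best


def count_subgraphs(edges, n):
    """Count connected induced n-node subgraphs per isomorphism class."""
    edge_set = set(edges)
    neighbors = {}
    for u, v in edge_set:
        if u == v:
            continue  # self-loops do not connect distinct nodes
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    found = set()

    def extend(current):
        # Grow the node set one neighbor at a time until it has n nodes.
        if len(current) == n:
            found.add(current)
            return
        frontier = set()
        for node in current:
            frontier |= neighbors[node]
        for node in frontier - current:
            extend(current | {node})

    for start in neighbors:
        extend(frozenset([start]))
    counts = Counter()
    for nodes in found:
        counts[canonical_form(tuple(nodes), edge_set)] += 1
    return counts
```

Running the same function on the real network and on each randomized network gives the per-class counts that are compared in the following slides.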

20
Algorithm Counting

A table is formed that counts the number of
appearances of each type of subgraph in the
network.
This process is repeated for each of the
randomized networks. The number of appearances of
each type of subgraph in the random ensemble is
recorded, to assess its statistical significance.

21
Algorithm Counting

Criteria for Network Motif Selection
(i) The probability that it appears in a
randomized network an equal or greater number of
times than in the real network is smaller than
P = 0.01.
(ii) The number of times it appears in the real
network with distinct sets of nodes is at least
4.
(iii) The number of appearances in the real
network is significantly larger than in the
randomized networks: $N_{real} - N_{rand} > 0.1 N_{rand}$.
This avoids detecting as motifs some common
subgraphs that have only a slight difference
between $N_{rand}$ and $N_{real}$ but have a narrow
distribution in the randomized networks.
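
The three criteria can be combined into one check, sketched below. Two things are assumptions made for this illustration: the probability in (i) is estimated empirically as the fraction of randomized networks whose count reaches the real count, and N_rand in (iii) is taken as the mean count over the randomized ensemble. The function and parameter names are hypothetical.

```python
from statistics import mean


def is_motif(n_real, n_real_distinct, n_rand_counts,
             p_max=0.01, min_distinct=4, min_excess=0.1):
    """Apply criteria (i)-(iii) to one subgraph type.

    n_real: appearances in the real network.
    n_real_distinct: appearances on distinct node sets in the real network.
    n_rand_counts: appearances in each randomized network.
    """
    # (i) empirical probability of seeing at least n_real appearances.
    p_value = sum(1 for c in n_rand_counts if c >= n_real) / len(n_rand_counts)
    n_rand = mean(n_rand_counts)
    return (
        p_value < p_max
        and n_real_distinct >= min_distinct          # (ii)
        and n_real - n_rand > min_excess * n_rand    # (iii)
    )
```

With this estimate of P, at least 100 randomized networks are needed before a threshold of 0.01 can be resolved at all.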

22
Algorithm Counting

Result

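Concentrations: $C_i = N_i / \sum_j N_j$
Z-scores: $Z = (C_{real} - \langle C_{rand} \rangle) / \sigma_{rand}$,
where $\langle C_{rand} \rangle$ and $\sigma_{rand}$ are the mean and
standard deviation in the randomized networks
(note the inequality $P(|X - E(X)| \ge Z \sqrt{\mathrm{Var}(X)}) \le 1/Z^2$).
High Z-scores indicate that the event is quite
unlikely.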
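
A Z-score of this kind can be computed as below. The sketch assumes Z is the real concentration minus the mean over the randomized networks, divided by their standard deviation. The function name `z_score` and the use of the population standard deviation (`statistics.pstdev`) rather than the sample version are choices made here.

```python
from statistics import mean, pstdev


def z_score(c_real, c_rand_samples):
    """Z = (c_real - mean of randomized values) / SD of randomized values."""
    sd = pstdev(c_rand_samples)
    if sd == 0:
        raise ValueError("randomized values have zero spread; Z is undefined")
    return (c_real - mean(c_rand_samples)) / sd
```

For instance, a real concentration of 10.0 against randomized values 4.0 and 6.0 (mean 5, standard deviation 1) gives Z = 5.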
23
Algorithm Sampling

A clever trade-off between accuracy and
efficiency.
The counting algorithm enumerates the subgraphs
exactly, but to detect network motifs we only
need to know which types of subgraph occur more
frequently in the real network than in the
randomized networks.

24
Algorithm Sampling

Random sampling can give quite good estimates.
Random sampling has many applications.
- approximating dense subsets
- approximating #P-complete problems
- machine learning

25
Algorithm Sampling

This algorithm does not enumerate subgraphs
exhaustively but instead samples subgraphs in
order to estimate their relative frequency.
The runtime of the algorithm asymptotically does
not depend on the network size.
Surprisingly, few samples are needed to detect
network motifs reliably.
The sampling method is useful for analyzing very
large networks or for detection of high-order
motifs, which are beyond the reach of exhaustive
enumeration algorithms.

26
Algorithm Sampling

Definitions: Es is the set of picked edges.
Vs is the set of all nodes that are touched by
the edges in Es.
ALGORITHM Sampling
Initialize Vs = ∅ and Es = ∅.
1. Pick a random edge e1 = (vi, vj). Update
Es = {e1}, Vs = {vi, vj}.
2. Make a list L of all neighboring edges of Es,
omitting all edges between members of Vs. If
L = ∅, return to step 1.
3. Pick a random edge e = (vk, vl) from L. Update
Es = Es ∪ {e}, Vs = Vs ∪ {vk, vl}.
4. Repeat steps 2-3 until an n-node subgraph S
is completed.
5. Calculate the probability P of sampling S.
27
Algorithm Sampling

The probability P of sampling the subgraph is the
sum of the probabilities of all possible ordered
sets of n-1 edges that lead to it.
Here $S_m$ is the set of all (n-1)-permutations of
the edges of the specific subgraph that could
lead to a sample of the subgraph, and $E_j$ is the
j-th edge in a specific (n-1)-permutation
$\sigma$.
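
One way to write this sum that is consistent with the definitions above, offered as a reconstruction rather than as the slide's own formula, expands the probability of each ordered edge set by the chain rule into conditional probabilities of picking each edge given the edges picked before it. The lowercase $e_j$, denoting the particular edge chosen at step j, is notation introduced here.

```latex
P = \sum_{\sigma \in S_m} \; \prod_{E_j \in \sigma}
    \Pr\left[ E_j = e_j \,\middle|\, (E_1, \dots, E_{j-1}) = (e_1, \dots, e_{j-1}) \right]
```

Under uniform random picks, each conditional probability is one over the number of candidate edges available at that step.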

28
Algorithm Sampling
29
Algorithm Sampling

Add the score $W = 1/P$ to the accumulated score
$S_i$ of the relevant subgraph type i:
$S_i = S_i + W$.
After $S_T$ samples, assuming we sampled L
different subgraph types, we calculate the
estimated subgraph concentrations
$C_i = S_i / \sum_{k=1}^{L} S_k$
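
The weighting and normalization can be sketched as below, assuming each sample arrives as a pair of subgraph type and sampling probability P. The function name `estimate_concentrations` and the input format are illustrative.

```python
from collections import defaultdict


def estimate_concentrations(samples):
    """samples: iterable of (subgraph_type, probability) pairs.

    Each sample adds W = 1/P to the score of its type; the scores are then
    normalized by their total to give estimated concentrations.
    """
    scores = defaultdict(float)
    for subgraph_type, p in samples:
        scores[subgraph_type] += 1.0 / p
    total = sum(scores.values())
    return {t: s / total for t, s in scores.items()}
```

Weighting by 1/P compensates for the fact that some subgraphs are more likely to be sampled than others.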

30
Algorithm Sampling

Z-scores are calculated as before:
$Z = (C_{real} - \langle C_{rand} \rangle) / \sigma_{rand}$
where $C_{real}$ is the concentration in the real
network, and $\langle C_{rand} \rangle$ and
$\sigma_{rand}$ are the mean and SD in the
randomized networks.

31
Algorithm Sampling
Sampling method versus exhaustive enumeration.
Highlighted subgraphs were found to be network
motifs.
32
Algorithm Sampling

Algorithm convergence
The subgraph concentrations calculated by the
sampling algorithm converged to the fully
enumerated concentrations. Different numbers of
samples were required for achieving good
estimations for different subgraphs and in
different networks.
All of the simulations we performed, on a variety
of networks, showed that the results converge
toward the real values within $S_T = 10^5$ samples
or fewer.

33
Algorithm Sampling

Algorithm convergence
Even with a small number of samples, one can
reliably estimate concentrations as low as
$C = 10^{-5}$.
Convergence studies can be used to decide the
required number of samples (an adaptive sampling
method that uses the instantaneous convergence
rate to decide how many samples are enough).

34
Algorithm Sampling

The sampling method allows accurate counting of
rare, high-order subgraphs and motifs

35
Discussion and Future Work

We focus on comparing the real network with
randomized networks that have a prescribed degree
sequence. So the question is whether some
frequent building blocks in the real network are
caused by the degree sequence itself.
If so, what we have done will miss this type of
building block. Other randomized network models
(rather than those with a prescribed degree
sequence) could be introduced to deal with such
cases.

36
Discussion and Future Work