Network Motifs: Simple Building Blocks of Complex Network

Provided by: csUmdEdu7

1
Network Motifs: Simple Building Blocks of Complex
Network
  • Lecturer: Jian Li

2
Introduction
  • Recently, it was found that biochemical and
    neuronal networks share a similar property: they
    contain recurring circuit elements that occur
    far more often than in randomized networks.
  • We call such simple building blocks network
    motifs.

3
Introduction
  • In the case of biological regulation networks, it
    has been suggested that network motifs play key
    information processing roles.

4
Introduction
  • Some examples

Three major network motifs were found in the
transcription networks of bacteria and yeast.
One of these, the feed-forward loop, has been
shown theoretically to perform information
processing tasks such as sign-sensitive
filtering, response acceleration, and
pulse generation.
5
Introduction
  • Some examples

6
Introduction
  • Schematic Illustration

Red dashed lines indicate edges that participate
in the feed-forward loop motif, which occurs five
times in the real network.
7
Introduction
  • Applications in other networks
  • Ecology (food webs)
  • Neurobiology (neuron connectivity)
  • Engineering (electronic circuits, WWW)

8
Introduction
  • Some remarks
  • The result we get depends closely on the
    randomized network model, so a reasonable choice
    of randomized network model is very important.
  • Some functionally important but less frequent
    building blocks will be missed no matter how we
    select the model. Finding them requires
    specific knowledge and information that are
    beyond the scope of a graph-theoretic approach.

9
Related Problems
  • Theoretical perspective
  • Efficiently counting cycles.
  • Counting spanning trees.
  • Counting the number of nonisomorphic graphs.
  • Testing isomorphism.
  • Approximating perfect matchings.
  • Approximating frequent subgraphs based on the
    regularity lemma.

10
Related Problems
  • Data mining perspective
  • Mining frequent subgraphs.
  • Mining a given subgraph.
  • Mining subgraphs in sparse networks.
  • Graph-based substructure pattern
    mining (gSpan).

11
Related Problems
  • Random networks.
  • Generating randomized networks with a prescribed
    degree sequence.
  • Estimating subgraph counts in random networks.

12
Related Problems
  • Random networks.
  • Erdős-Rényi model
  • - the distribution of the number of edges per node
    follows a Poisson distribution.
  • Scale-free model
  • - the distribution of the number of edges per node
    follows a power-law distribution.

13
Randomized Network
  • Generating randomized networks
  • Here we give only a simple algorithm.
  • We employ a Markov-chain algorithm that
    starts with the real network and repeatedly
    swaps randomly chosen pairs of connections
    (X1->Y1, X2->Y2 is replaced by X1->Y2, X2->Y1)
    until the network is well randomized.
  • Switching is prohibited if either of the
    connections X1->Y2 or X2->Y1 already exists.
14
Randomized Network
  • Controlling for Appearances of (n-1)-Node
    Motifs
  • We generate a series of randomized network
    ensembles, each of which has the same
    (n-1)-node subgraph counts as the real network, as a
    null hypothesis for detecting n-node motifs.
  • This is done to avoid assigning high significance
    to a structure only because it
    includes a highly significant substructure.

15
Randomized Network
  • Controlling for Appearances of (n-1)-Node
    Motifs
  • Metropolis Monte-Carlo approach
  • Let V_{real,k} be the number of appearances of
    the kth (n-1)-node subgraph in the real network
    and V_{rand,k} be the corresponding count in the
    randomized network.
  • We define an energy
  • E = \sum_k |V_{real,k} - V_{rand,k}| / (V_{real,k} + V_{rand,k}).
  • The energy E is zero only when all the (n-1)-node
    subgraph counts of the real and randomized graphs
    are equal (the three-node counts when n = 4).
16
Randomized Network
  • Controlling for Appearances of (n-1)-Node
    Motifs
  • Start by fully randomizing the network according
    to the first algorithm.
  • Then generate a random switch (X1->Y1, X2->Y2
    to X1->Y2, X2->Y1, and similarly for double
    edges, as described above).
  • If this switch lowers E, it is accepted.
  • Otherwise, it is accepted with probability
    exp(-\Delta E / T), where \Delta E is the difference in energy before
    and after the switch and T is an effective
    temperature.

17
Graph Theoretical Results
  • Controlling for Appearances of (n-1)-Node
    Motifs
  • This process is repeated, with a simulated
    annealing regimen that lowers T slowly, until a
    solution with E = 0 is obtained.
  • This can be readily generalized to form
    (n-1)-node null-hypothesis networks.
18
Algorithm Counting
  • Goal: find all n-node network motifs
  • Method
  • Do the following for both the real network and
    the randomized networks:
  • Simply enumerate all possible n-node
    subgraphs and classify them into non-isomorphic
    classes.
  • Count the number of subgraphs in each class (see
    all types of 3- and 4-node nonisomorphic graphs).
19
Algorithm Counting
  • Efficiently count all connected n-node subgraphs
    in a connectivity matrix M
  • main:
  • for all rows i
  • for each nonzero element (i, j)
  • search(i, j)
  • search(i, j):
  • for each k such that M_{ik} = 1 and k != j
  • if an n-node subgraph is obtained, then
    record it and return
  • else search(i, k)
  • do similar things for each M_{ki} = 1, M_{kj} = 1,
    M_{jk} = 1
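
The recursive search above can be sketched in Python as growing a node set one neighbor at a time. This is a loose interpretation of the slide's pseudocode rather than a line-by-line translation: it tracks whole node sets instead of a single (i, j) pair, and it memoizes visited sets so each connected subgraph is recorded once. The name `connected_subgraphs` and the list-of-lists matrix format are assumptions.

```python
def connected_subgraphs(adj, n):
    """Return the node sets of all weakly connected n-node subgraphs (n >= 2).

    adj is a square 0/1 matrix (list of lists); adj[i][j] = 1 means i -> j.
    """
    size = len(adj)
    # Treat an edge in either direction as a neighbor relation.
    neighbors = [
        {k for k in range(size) if k != i and (adj[i][k] or adj[k][i])}
        for i in range(size)
    ]
    found = set()
    visited = set()

    def search(nodes):
        if nodes in visited:
            return
        visited.add(nodes)
        if len(nodes) == n:
            found.add(nodes)  # an n-node subgraph is obtained: record it
            return
        frontier = set().union(*(neighbors[v] for v in nodes)) - nodes
        for k in frontier:
            search(nodes | {k})

    for i in range(size):
        for j in neighbors[i]:
            search(frozenset((i, j)))
    return found
```

For example, on the directed path 0 -> 1 -> 2 -> 3 the call `connected_subgraphs(adj, 3)` yields the two node sets {0, 1, 2} and {1, 2, 3}.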

20
Algorithm Counting
  • A table is formed that counts the number of
    appearances of each type of subgraph in the
    network.
  • This process is repeated for each of the
    randomized networks. The number of appearances of
    each type of subgraph in the random ensemble is
    recorded, to assess its statistical significance.

21
Algorithm Counting
  • Criteria for Network Motif Selection
  • (i) The probability that it appears in a
    randomized network an equal or greater number of
    times than in the real network is smaller than
    P = 0.01.
  • (ii) The number of times it appears in the real
    network with distinct sets of nodes is at least
    4.
  • (iii) The number of appearances in the real
    network is significantly larger than in the
    randomized networks: N_real - N_rand > 0.1 N_rand.
    This is done to avoid detecting as motifs some
    common subgraphs that have only a slight
    difference between N_rand and N_real but have a
    narrow distribution in the randomized networks.
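
The three criteria can be checked with a small helper once the counts are available. In this sketch the function name `is_motif` and its parameters are hypothetical. It assumes `n_real` already counts appearances on distinct node sets, that `n_rand_samples` holds one count per randomized network, and that N_rand in criterion (iii) is the ensemble mean.

```python
def is_motif(n_real, n_rand_samples, p_max=0.01, min_count=4, min_excess=0.1):
    """Apply criteria (i)-(iii) to one subgraph type."""
    n_rand_mean = sum(n_rand_samples) / len(n_rand_samples)
    # (i) empirical probability that a randomized network has >= n_real appearances
    p_value = sum(1 for c in n_rand_samples if c >= n_real) / len(n_rand_samples)
    return (
        p_value < p_max                                      # criterion (i)
        and n_real >= min_count                              # criterion (ii)
        and n_real - n_rand_mean > min_excess * n_rand_mean  # criterion (iii)
    )
```

Note that with an ensemble of 1000 randomized networks, the smallest nonzero empirical probability is 0.001, so the ensemble size limits how small a P value can be resolved.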

22
Algorithm Counting
  • Result

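Concentration: C_i = N_i / \sum_j N_j
Z-score: Z = (C_{real} - \langle C_{rand} \rangle) / \sigma_{rand},
where \langle C_{rand} \rangle and \sigma_{rand} are the mean and standard
deviation in the randomized networks
(note the inequality P(|X - E(X)| > Z \sigma(X)) < 1/Z^2).
High Z-scores indicate the event is quite unlikely.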
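
A Z-score of this form can be computed directly from the concentration in the real network and the concentrations measured in the randomized ensemble. The sketch below is illustrative: the name `z_score` is hypothetical, and using the population standard deviation (`statistics.pstdev`) rather than the sample standard deviation is an assumption.

```python
import statistics

def z_score(c_real, c_rand_samples):
    """Z = (c_real - mean(c_rand)) / sd(c_rand) over the randomized ensemble."""
    mean = statistics.mean(c_rand_samples)
    sd = statistics.pstdev(c_rand_samples)  # assumption: population SD
    if sd == 0:
        raise ValueError("randomized ensemble has zero spread; Z is undefined")
    return (c_real - mean) / sd
```

By the inequality on the slide, a subgraph with Z = 10 would arise by chance with probability at most 1/100 under any distribution of the randomized counts, and far less if that distribution is close to normal.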
23
Algorithm Sampling
  • A clever trade-off between accuracy and
    efficiency.
  • The counting algorithm enumerates the exact
    number of subgraphs, but to detect network motifs
    we only need to know which types of subgraph occur
    more frequently in the real network than in
    randomized networks.

24
Algorithm Sampling
  • Random sampling can give a fairly good
    estimate.
  • Random sampling has many applications:
  • - approximating dense subsets
  • - approximating #P-complete problems
  • - machine learning

25
Algorithm Sampling
  • This algorithm does not enumerate subgraphs
    exhaustively but instead samples subgraphs in
    order to estimate their relative frequency.
  • The runtime of the algorithm asymptotically does
    not depend on the network size.
  • Surprisingly, few samples are needed to detect
    network motifs reliably.
  • The sampling method is useful for analyzing very
    large networks or for detection of high-order
    motifs, which are beyond the reach of exhaustive
    enumeration algorithms.

26
Algorithm Sampling
  • Definitions: E_s is the set of picked edges;
    V_s is the set of all nodes that are touched by the
    edges in E_s.
  • ALGORITHM Sampling
  • Initialize V_s = \emptyset and E_s = \emptyset.
  • 1. Pick a random edge e_1 = (v_i, v_j). Update
    E_s = {e_1}, V_s = {v_i, v_j}.
  • 2. Make a list L of all neighboring edges of E_s,
    omitting all edges between members of V_s. If
    L = \emptyset, return to 1.
  • 3. Pick a random edge e = (v_k, v_l) from L. Update
    E_s = E_s \cup {e}, V_s = V_s \cup {v_k, v_l}.
  • 4. Repeat steps 2-3 until completing an n-node
    subgraph S.
  • 5. Calculate the probability P to sample S.
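
Steps 1 to 4 of the sampling algorithm can be sketched as below. This is a minimal sketch under stated assumptions: the function name `sample_subgraph` is hypothetical, edges are given as (source, target) tuples, "neighboring edges" is read as edges with exactly one endpoint in V_s, and the graph is assumed to contain at least one connected n-node subgraph (otherwise the loop would not end). Step 5, the probability P, is not computed here.

```python
import random

def sample_subgraph(edges, n, rng=random):
    """Sample the node set of one connected n-node subgraph by edge sampling."""
    edges = list(edges)
    while True:
        # Step 1: start from a uniformly random edge.
        v_s = set(rng.choice(edges))
        while len(v_s) < n:
            # Step 2: edges touching V_s, omitting edges between members of V_s.
            candidates = [e for e in edges if (e[0] in v_s) != (e[1] in v_s)]
            if not candidates:
                break  # L is empty: return to step 1
            # Step 3: add a random neighboring edge and its new endpoint.
            v_s.update(rng.choice(candidates))
        if len(v_s) == n:
            return frozenset(v_s)  # Step 4: an n-node subgraph is complete
```

Each call scans the edge list once per added node, which is simple but not how one would implement it for very large networks; an adjacency index would avoid the repeated scans.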

27
Algorithm Sampling
  • The probability of sampling the subgraph is the
    sum of the probabilities of all possible
    ordered sets of n-1 edges that could lead to it.
  • Here S_m is the set of all (n-1)-permutations of
    the edges of the specific subgraph that
    could lead to a sample of the subgraph, and E_j is the
    j-th edge in a specific (n-1)-permutation \sigma.
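
The formula itself did not survive in the transcript. Based on the definitions of S_m and E_j given here, it presumably has the following form; this is a reconstruction under that assumption, with e_j denoting the edge actually picked at step j (a symbol introduced for illustration).

```latex
P(S) = \sum_{\sigma \in S_m} \; \prod_{E_j \in \sigma}
       \Pr\left[ E_j = e_j \;\middle|\; (E_1, \ldots, E_{j-1}) = (e_1, \ldots, e_{j-1}) \right]
```

Under the algorithm's uniform choices, the first factor is 1 over the total number of edges in the network, and each later factor is 1 over the size of the candidate list L at that step.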

28
Algorithm Sampling
29
Algorithm Sampling
  • Add the score W = 1/P to the accumulated score S_i
    of the relevant subgraph type i: S_i = S_i + W.
    After S_T samples, assuming we sampled L different
    subgraph types, we calculate the estimated
    subgraph concentrations
  • C_i = S_i / \sum_{k=1}^{L} S_k
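
The weighting step can be sketched as follows, assuming each sample arrives as a pair of its subgraph type and its sampling probability P. The function name `estimate_concentrations` and the input format are assumptions made for illustration.

```python
from collections import defaultdict

def estimate_concentrations(samples):
    """samples: iterable of (subgraph_type, probability) pairs, one per sample.

    Each sample adds W = 1/P to the score of its type; concentrations are the
    scores normalized by their total.
    """
    scores = defaultdict(float)
    for subgraph_type, p in samples:
        scores[subgraph_type] += 1.0 / p
    total = sum(scores.values())
    return {t: s / total for t, s in scores.items()}
```

Weighting by 1/P corrects the sampling bias: subgraphs that are easy to reach are sampled often but contribute little per sample, while hard-to-reach subgraphs are sampled rarely but contribute more.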

30
Algorithm Sampling
  • Z-scores are calculated as before:
  • Z = (C_{real} - \langle C_{rand} \rangle) / \sigma_{rand}
  • where C_{real} is the concentration in the real
    network, and \langle C_{rand} \rangle and \sigma_{rand} are the mean and SD
    in the randomized networks.

31
Algorithm Sampling
Sampling method versus exhaustive enumeration.
Highlighted subgraphs were found to be network
motifs.
32
Algorithm Sampling
  • Algorithm convergence
  • The subgraph concentrations calculated by the
    sampling algorithm converged to the fully
    enumerated concentrations. Different numbers of
    samples were required for achieving good
    estimations for different subgraphs and in
    different networks.
  • All of the simulations we performed, on a variety
    of networks, showed that the results converge
    toward the real values within S_T = 10^5 samples or
    fewer.

33
Algorithm Sampling
  • Algorithm convergence
  • Even with a small number of
    samples, one can reliably estimate concentrations
    as low as C = 10^{-5}.
  • Convergence studies can be used to
    decide the required number of
    samples (adaptive sampling: use the
    instantaneous convergence rate to decide how many
    samples are enough).

34
Algorithm Sampling
  • The sampling method allows accurate counting of
    rare, high-order subgraphs and motifs

35
Discussion and Future Work
  • We focus on comparing the real network
    with randomized networks that share its degree
    sequence. So our question is whether some
    frequent building blocks in real networks are
    caused by the degree sequence itself.
  • If so, what we have done will miss this type
    of building block. Other randomized network
    models (rather than the ones with a prescribed
    degree sequence) could be introduced to deal with
    such cases.

36
Discussion and Future Work
  • Embedding the graph in Euclidean space and
    considering subgraphs with not only topological
    properties but also geometric properties.

37
THANKS