Module Networks - PowerPoint PPT Presentation

Slides: 42
Provided by: csta3

1
Module Networks
  • Discovering Regulatory Modules and their
    Condition Specific Regulators from Gene
    Expression Data

Cohen Jony
2
Outline
  • The Problem
  • Regulators
  • Module Networks
  • Learning Module Networks
  • Results
  • Conclusion

3
The Problem
  • Inferring regulatory networks from gene
    expression data.

4
Regulators
5
Regulation types
6
Regulators example
This is an example of a regulatory module.
7
Known solution: Bayesian Networks
The problem: too many variables and too little
data allow statistical noise to produce spurious
dependencies, resulting in models that
significantly overfit the data.
8
From Bayesian To Module
9
Module Networks
  • We assume that we are given a domain of random
    variables $X = \{X_1, \dots, X_n\}$.
  • We use $\mathrm{Val}(X_i)$ to denote the domain of
    values of the variable $X_i$.
  • A module set $C$ is a set of formal variables
    $M_1, \dots, M_K$. All the variables assigned to
    a module share the same CPD.
  • Note that all the variables in a module must
    therefore have the same domain of values!

10
Module Networks
  • A module network template $T = (S, \theta)$ for $C$
    defines, for each module $M_j$ in $C$:
  • 1) a set of parents $\mathrm{Pa}_{M_j} \subseteq X$;
  • 2) a conditional probability template (CPT)
    $P(M_j \mid \mathrm{Pa}_{M_j})$, which specifies a
    distribution over $\mathrm{Val}(M_j)$ for each
    assignment in $\mathrm{Val}(\mathrm{Pa}_{M_j})$.
  • We use $S$ to denote the dependency structure
    encoded by $\{\mathrm{Pa}_{M_j} : M_j \in C\}$ and
    $\theta$ to denote the parameters required for the
    CPTs $\{P(M_j \mid \mathrm{Pa}_{M_j}) : M_j \in C\}$.

11
Module Networks
  • A module assignment function for $C$ is a function
    $A : X \to \{1, \dots, K\}$ such that $A(X_i) = j$
    only if $\mathrm{Val}(X_i) = \mathrm{Val}(M_j)$.
  • A module network is defined by both the module
    network template and the assignment function.

12
Example
  • In our example, we have three modules M1, M2, and
    M3.
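  • $\mathrm{Pa}_{M_1} = \emptyset$, $\mathrm{Pa}_{M_2} = \{\mathrm{MSFT}\}$,
    and $\mathrm{Pa}_{M_3} = \{\mathrm{AMAT}, \mathrm{INTL}\}$.
  • In our example, we have that $A(\mathrm{MSFT}) = 1$,
    $A(\mathrm{MOT}) = 2$, $A(\mathrm{INTL}) = 2$, and so on.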
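
The stock example on this slide can be written down as two small lookup tables, which may make the split between template and assignment easier to see. This is only an illustrative sketch: the names `module_parents`, `assignment`, `parents_of`, and `members` are hypothetical, and only the variables actually listed on the slide are included.

```python
# Structure S of the template: the parent set of each module (slide example).
module_parents = {
    1: frozenset(),                  # Pa_M1 is empty
    2: frozenset({"MSFT"}),          # Pa_M2 = {MSFT}
    3: frozenset({"AMAT", "INTL"}),  # Pa_M3 = {AMAT, INTL}
}

# Assignment function A: variable -> module index (only those given on the slide).
assignment = {"MSFT": 1, "MOT": 2, "INTL": 2}


def parents_of(variable):
    """Parents a variable inherits from the module it is assigned to."""
    return module_parents[assignment[variable]]


def members(module_index):
    """Variables currently assigned to the given module, in sorted order."""
    return sorted(v for v, m in assignment.items() if m == module_index)
```

Because MOT and INTL are both assigned to module 2, they share the same parent set (and, in the full model, the same CPD), which is the parameter sharing that distinguishes a module network from an ordinary Bayesian network.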

13
Learning Module Networks
  • The iterative learning procedure searches for the
    model with the highest score by using the
    Expectation Maximization (EM) algorithm.
  • An important property of the EM algorithm is that
    each iteration is guaranteed to improve the
    likelihood of the model, until convergence to a
    local maximum of the score.
  • Each iteration of the algorithm consists of two
    steps: an M-step and an E-step.
14
Learning Module Networks cont.
  • In the M-step, the procedure is given
    a partition of the genes into modules and learns
    the best regulation program (regression tree) for
    each module.
  • The regulation program is learned via a
    combinatorial search over the space of trees.
  • The tree is grown from the root to its leaves. At
    any given node, the query which best partitions
    the gene expression into two distinct
    distributions is chosen, until no such split
    exists.
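
The choice of a single tree node can be sketched as follows. This is a simplified stand-in, not the procedure from the slides: it scores a split by the reduction in squared error of the module's expression values rather than by the Bayesian score, and the function name `best_split` and its arguments are hypothetical.

```python
import numpy as np


def best_split(module_expr, regulators):
    """Pick the regulator/threshold query that best separates a module's expression.

    module_expr: array of shape (genes, arrays) for the genes in one module.
    regulators: dict mapping regulator name -> array of shape (arrays,).
    Returns (name, threshold, gain), or None if no split is possible.
    """
    def sse(block):
        # Sum of squared deviations from the block mean (0 for an empty block).
        return float(((block - block.mean()) ** 2).sum()) if block.size else 0.0

    base = sse(module_expr)
    best = None
    for name, reg in regulators.items():
        values = np.unique(reg)
        # Candidate thresholds lie midway between consecutive regulator values.
        for lo, hi in zip(values[:-1], values[1:]):
            thr = (lo + hi) / 2.0
            left = module_expr[:, reg <= thr]
            right = module_expr[:, reg > thr]
            gain = base - sse(left) - sse(right)
            if best is None or gain > best[2]:
                best = (name, thr, gain)
    return best
```

Growing a full regulation program would apply this choice recursively to the two resulting groups of arrays, stopping when no split improves the score.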

15
Learning Module Networks cont.
  • In the E-step, given the inferred
    regulation programs, we determine the module
    whose associated regulation program best predicts
    each gene's behavior.
  • We test the probability of a gene's measured
    expression values in the dataset under each
    regulation program, obtaining an overall
    probability that this gene's expression profile
    was generated by this regulation program.
  • We then select the module whose program gives the
    gene's expression profile the highest
    probability, and re-assign the gene to this
    module.
  • We take care not to assign a regulator gene to a
    module in which it is also a regulatory input.
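
The reassignment rule above can be sketched as an argmax over per-module log-probabilities, with the forbidden modules masked out. This is a minimal sketch under stated assumptions: the matrix `loglik` is taken as already computed, and the names `reassign` and `regulator_of` are hypothetical.

```python
import numpy as np


def reassign(loglik, regulator_of):
    """Assign each gene to the module whose program explains it best.

    loglik: array of shape (genes, modules); entry [g, m] is the log-probability
        of gene g's expression profile under module m's regulation program.
    regulator_of: dict mapping a gene index to the set of module indices in
        which that gene is a regulatory input (these modules are excluded).
    Returns an array with the chosen module index for every gene.
    """
    scores = np.array(loglik, dtype=float)  # copy so the input is not modified
    for gene, blocked in regulator_of.items():
        for module in blocked:
            scores[gene, module] = -np.inf  # a regulator may not join this module
    return scores.argmax(axis=1)
```

For instance, a gene that regulates module 1 is never placed in module 1, even if that module's program gives its profile the highest probability.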

16
Bayesian score
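  • When the priors satisfy the assumptions stated on
    the following slides, the Bayesian score
    decomposes into local module scores:
    $$\mathrm{score}(S, A : D) = \sum_{j=1}^{K} \mathrm{score}_{M_j}(\mathrm{Pa}_{M_j}, X^j : D)$$
  • where
    $$\mathrm{score}_{M_j}(U, X : D) = \log \int L_j(U, X, \theta_{M_j} : D)\, P(\theta_{M_j} \mid S_j = U)\, d\theta_{M_j} + \log P(S_j = U) + \log P(A_j = X)$$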

17
Bayesian score cont.
  • $L_j(U, X, \theta_{M_j} : D)$ is the likelihood
    function.
  • $P(\theta_{M_j} \mid S_j = U)$ is the parameter prior.
  • $S_j = U$ denotes that we chose a structure
    where $U$ are the parents of module $M_j$.
  • $A_j = X$ denotes that $A$ is such that $X^j = X$,
    where $X^j$ is the set of variables assigned to
    module $M_j$.

18
Assumptions
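  • Let $P(A)$, $P(S \mid A)$, and $P(\theta \mid S, A)$ be
    the assignment, structure, and parameter priors.
  • $P(\theta \mid S, A)$ satisfies parameter independence if
    $$P(\theta \mid S, A) = \prod_{j=1}^{K} P(\theta_{M_j} \mid S, A).$$
  • $P(\theta \mid S, A)$ satisfies parameter modularity if
    $$P(\theta_{M_j} \mid S_1, A) = P(\theta_{M_j} \mid S_2, A)$$
  • for all structures $S_1$ and $S_2$ such that
    $\mathrm{Pa}_{M_j}^{S_1} = \mathrm{Pa}_{M_j}^{S_2}$.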

19
Assumptions
  • $P(\theta, S \mid A)$ satisfies assignment independence if
    $P(\theta \mid S, A) = P(\theta \mid S)$ and $P(S \mid A) = P(S)$.
  • $P(S)$ satisfies structure modularity if
    $$P(S) \propto \prod_{j} \rho_j(S_j),$$
  • where $S_j$ denotes the choice of parents for
    module $M_j$, and $\rho_j$ is a distribution over the
    possible parent sets for module $M_j$.
  • $P(A)$ satisfies assignment modularity if
    $$P(A) \propto \prod_{j} \alpha_j(A_j),$$
  • where $A_j$ is the choice of variables assigned to
    module $M_j$, and $\{\alpha_j : j = 1, \dots, K\}$ is a
    family of functions from $2^X$ to the positive reals.

20
Assumptions - Explanations
  • Parameter independence, parameter modularity, and
    structure modularity are the natural analogues of
    standard assumptions in Bayesian network
    learning.
  • Parameter independence implies that $P(\theta \mid S, A)$
    is a product of terms that parallels the
    decomposition of the likelihood, with one prior
    term per local likelihood term Lj.
  • Parameter modularity states that the prior for
    the parameters of a module Mj depends only on the
    choice of parents for Mj and not on other aspects
    of the structure.
  • Structure modularity implies that the prior over
    the structure S is a product of terms, one per
    each module.

21
Assumptions - Explanations
  • The remaining two assumptions, assignment
    independence and assignment modularity, are new
    to module networks.
  • Assignment independence makes the priors on the
    parents and parameters of a module independent of
    the exact set of variables assigned to the
    module.
  • Assignment modularity implies that the prior on
    A is proportional to a product of local terms,
    one corresponding to each module.
  • Thus, the reassignment of one variable from one
    module Mi to another module Mj does not change
    our preferences on the assignment of variables in
    modules other than Mi and Mj.

22
Experiments
  • The network learning procedure was evaluated on
    synthetic data, gene expression data, and stock
    market data.
  • The data consisted solely of continuous values.
    As all of the variables have the same domain, the
    definition of the module set reduces simply to a
    specification of the total number of modules.
  • Beam search was used as the search algorithm,
    with a lookahead of three splits to evaluate
    each operator.
  • As a comparison, Bayesian networks were used with
    precisely the same structure learning algorithm,
    simply treating each variable as its own module.

23
Synthetic data
  • The synthetic data was generated by a known
    module network.
  • The generating model had 10 modules and a total
    of 35 variables that were a parent of some
    module. From the learned module network, 500
    variables were selected, including the 35
    parents.
  • This procedure was run for training sets of
    various sizes ranging from 25 instances to 500
    instances, each repeated 10 times for different
    training sets.

24
Synthetic data - results
  • Generalization to unseen test data was measured
    by the likelihood ascribed by the learned model
    to 4500 unseen instances.
  • As expected, models learned with larger training
    sets do better but, when run using the correct
    number of 10 modules, the gain of increasing the
    number of data instances beyond 100 samples is
    small.
  • Models learned with a larger number of modules
    had a wider spread for the assignments of
    variables to modules and consequently achieved
    poor performance.

25
Synthetic data results cont.
  • Log-likelihood per instance assigned to held-out
    data.
  • For all training set sizes, except 25, the model
    with 10 modules performs the best.

26
Synthetic data results cont.
  • Fraction of variables assigned to the largest 10
    modules.
  • Models learned using 100, 200, or 500 instances
    and up to 50 modules assigned 80% of the
    variables to 10 modules.

27
Synthetic data results cont.
  • Average percentage of correct parent-child
    relationships recovered.
  • The total number of parent-child relationships in
    the generating model was 2250.
  • The procedure recovers 74% of the true
    relationships when learning from a dataset of
    500 instances.

28
Synthetic data results cont.
  • As the variables begin fragmenting over a large
    number of modules, the learned structure contains
    many spurious relationships.
  • Thus in domains with a modular structure,
    statistical noise is likely to prevent overly
    detailed learned models such as Bayesian networks
    from extracting the commonality between different
    variables with a shared behavior.

29
Gene Expression Data
  • Expression data which measured the response of
    yeast to different stress conditions was used.
  • The data consists of 6157 genes and 173
    experiments.
  • 2355 genes that varied significantly in the data
    were selected, and a module network was learned
    over these genes.
  • A Bayesian network was also learned over this
    data set.

30
Candidate regulators
  • A set of 466 candidate regulators was compiled
    from SGD and YPD.
  • The set includes both transcription factors and
    signaling proteins that may have transcriptional
    impact.
  • It also includes genes described as similar to
    such regulators.
  • Global regulators, whose regulation is not
    specific to a small set of genes or processes,
    were excluded.

31
Gene Expression results
  • The figure demonstrates that module networks
    generalize much better than Bayesian networks to
    unseen data for almost all choices of the number
    of modules.

32
Biological validity
  • Biological validity of the learned module network
    with 50 modules was tested.
  • The enriched annotations reflect the key
    biological processes expected in our dataset.
  • For example, the "protein folding" module
    contains 10 genes, 7 of which are annotated as
    protein folding genes. In the whole data set,
    there are only 26 genes with this annotation.
    Thus, the p-value of this annotation, that is,
    the probability of choosing 7 or more genes in
    this category by choosing 10 random genes, is
    less than $10^{-12}$.
  • 42 modules, out of 50, had at least one
    significantly enriched annotation with a p-value
    less than 0.005.
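
The enrichment p-value described on this slide is a hypergeometric tail probability, which can be checked with a few lines of code. This is a sketch under one assumption not stated on the slide: the population is taken to be the 2355 genes over which the module network was learned. The function name `enrichment_p_value` is hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, drawn, hits):
    """P(at least `hits` annotated genes among `drawn` genes picked at random).

    Genes are drawn without replacement from `population` genes, of which
    `annotated` carry the annotation (hypergeometric upper tail).
    """
    total = comb(population, drawn)
    upper = min(drawn, annotated)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Protein folding module: 7 of its 10 genes are annotated, 26 annotated overall.
# The population size of 2355 is an assumption (the genes used for learning).
p = enrichment_p_value(population=2355, annotated=26, drawn=10, hits=7)
```

Under that assumption the result is on the order of $10^{-12}$, in line with the figure quoted on the slide.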

33
Biological validity Cont.
  • The module is enriched for both the HAP4 motif
    and STRE, recognized by Hap4 and Msn4,
    respectively, supporting their inclusion in the
    module's regulation program.
  • Lines represent 500 bp of genomic sequence
    located upstream of the start codon of each of
    the genes; colored boxes represent the presence
    of cis-regulatory motifs located in these
    regions.

34
Stock Market Data
  • NASDAQ stock prices for 2143 companies, covering
    273 trading days.
  • Each stock is a variable, and each trading day is
    an instance.
  • The value of the variable is the log of the ratio
    between that day's and the previous day's closing
    stock price.
  • As potential controllers, the 250 of the 2143
    stocks whose average trading volume was the
    largest across the dataset were selected.
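
The variable value defined above is the daily log return, which can be computed directly from a series of closing prices. A minimal sketch, assuming the prices are ordered by trading day; the function name `log_returns` is hypothetical.

```python
import numpy as np


def log_returns(closing_prices):
    """Log of the ratio between each day's close and the previous day's close.

    closing_prices: sequence ordered by trading day (one stock), or a 2-D
        array of shape (days, stocks). The result has one fewer row than
        the input, because the first day has no previous close.
    """
    prices = np.asarray(closing_prices, dtype=float)
    return np.log(prices[1:] / prices[:-1])
```

A rise from 100 to 110 gives a positive value and a fall gives a negative one, so every stock is measured on the same continuous scale regardless of its price level.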

35
Stock Market Data
  • Cross validation is used to evaluate the
    generalization ability of different models.
  • Module networks perform significantly better than
    Bayesian networks in this domain.

36
Stock Market Data
  • Module networks were compared with AutoClass.
  • Significant enrichment for 21 annotations,
    covering a wide variety of sectors, was found.
  • In 20 of the 21 cases, the enrichment was far
    more significant in the modules learned using
    module networks than in those learned by
    AutoClass.

37
Conclusions
  • The results show that learned module networks
    have much higher generalization performance than
    a Bayesian network learned from the same data.
  • Parameter sharing between variables in the same
    module allows each parameter to be estimated
    based on a much larger sample. This allows us to
    learn dependencies that would be considered too
    weak based on statistics of single variables
    (these are well-known advantages of parameter
    sharing).
  • An interesting aspect of the method is that it
    automatically determines which variables have
    shared parameters.

38
Conclusions
  • The assumption of shared structure significantly
    restricts the space of possible dependency
    structures, allowing us to learn more robust
    models than those learned in a classical Bayesian
    network setting.
  • In a module network, a spurious correlation would
    have to arise between a possible parent and a
    large number of other variables before the
    algorithm would introduce the dependency.

39
Overview on Module Networks
40
Literature
  • Reference: Discovering Regulatory Modules and
    their Condition Specific Regulators from Gene
    Expression Data.
  • By Eran Segal, Michal Shapira, Aviv Regev, Dana
    Pe'er, David Botstein, Daphne Koller, and Nir
    Friedman.
  • Bibliography:
  • P. Cheeseman, J. Kelly, M. Self, J. Stutz, W.
    Taylor, and D. Freeman. AutoClass: a Bayesian
    classification system. In ML '88, 1988.

41
THE END