Learning disjunctions in Geronimo

Provided by: feli61
Learn more at: http://www.columbia.edu
Transcript and Presenter's Notes

1
Learning disjunctions in Geronimo's regression
trees
  • Felix Sanchez Garcia
  • supervised by Prof. Dana Pe'er

2
Motivation
  • Glioblastoma is the most common primary brain
    tumour in adults.
  • Newly diagnosed patients have an average survival
    of 1 year.
  • Need for better models of the network.
  • Data used to create models: microarrays
  • genes ≈ 8000
  • candidate regulators ≈ 800
  • samples ≈ 120

3
Module networks
  • Bayesian model that benefits from the high
    correlation of groups of variables [2]
  • Algorithm similar to EM (but with hard decisions).
    Loop:
  • Module assignment step: assign variables to
    modules
  • Structure search step: calculate the CPD for each
    module
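
The loop above can be sketched as follows. This is a minimal illustration, not the Geronimo implementation: it assumes each module's CPD is simplified to the mean expression profile of its member genes (the slides use regression trees instead), and under that assumption the loop reduces to k-means clustering. The function name `learn_modules` and its parameters are hypothetical.

```python
import numpy as np

def learn_modules(expr, n_modules, n_iter=20, init=None, seed=0):
    """Hard-decision, EM-like loop over a (n_genes, n_samples) matrix.

    Assumption: a module's "CPD" is just the mean profile of its genes.
    Returns an array with one module index per gene.
    """
    expr = np.asarray(expr, dtype=float)
    rng = np.random.default_rng(seed)
    n_genes = expr.shape[0]
    if init is None:
        assign = rng.integers(n_modules, size=n_genes)  # random start
    else:
        assign = np.asarray(init)
    for _ in range(n_iter):
        # Structure search step (simplified): refit each module's model.
        profiles = np.zeros((n_modules, expr.shape[1]))
        for k in range(n_modules):
            members = expr[assign == k]
            if len(members) > 0:
                profiles[k] = members.mean(axis=0)
            else:
                # Empty module: re-seed it with a randomly chosen gene.
                profiles[k] = expr[rng.integers(n_genes)]
        # Module assignment step: hard-assign each gene to its best module.
        dist = ((expr[:, None, :] - profiles[None, :, :]) ** 2).sum(axis=2)
        new_assign = dist.argmin(axis=1)
        if np.array_equal(new_assign, assign):
            break  # assignments stable: converged
        assign = new_assign
    return assign
```

Unlike EM, each gene belongs to exactly one module after every pass, which is what "hard decisions" refers to.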

4
Regression trees as CPD
[Figure: regression tree with node conditions x < 0.3 and y > -0.2]
  • Regression trees are used for each module's CPD
  • Internal nodes: condition on a single variable
  • Leaf nodes: parameters for a normal distribution
  • Bayesian score, annotated on the slide with two
    terms: pdf of normal-gamma, and prior on structure
    (complexity and biological penalties)
  • Exhaustively calculates the score for each split
    for each regulator
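
A minimal sketch of the exhaustive split search, assuming the data term of the score is the standard marginal likelihood of a normal model under a normal-gamma prior. The hyperparameters (`mu0`, `kappa0`, `alpha0`, `beta0`) and the flat `split_penalty` standing in for the structure prior are illustrative assumptions, not values from the slides.

```python
import math
import numpy as np

def log_marginal(x, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Log marginal likelihood of x under a normal-gamma prior (assumed values)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n == 0:
        return 0.0  # an empty leaf contributes nothing
    mean = x.mean()
    kappa_n = kappa0 + n
    alpha_n = alpha0 + n / 2.0
    beta_n = (beta0 + 0.5 * ((x - mean) ** 2).sum()
              + kappa0 * n * (mean - mu0) ** 2 / (2.0 * kappa_n))
    return (math.lgamma(alpha_n) - math.lgamma(alpha0)
            + alpha0 * math.log(beta0) - alpha_n * math.log(beta_n)
            + 0.5 * (math.log(kappa0) - math.log(kappa_n))
            - 0.5 * n * math.log(2.0 * math.pi))

def best_split(regulators, target, split_penalty=0.0):
    """Try every regulator and every midpoint threshold.

    regulators: (n_regulators, n_samples); target: (n_samples,).
    Returns (score, regulator_index, threshold); the index and threshold
    are None when leaving the node unsplit scores best.
    """
    regulators = np.asarray(regulators, dtype=float)
    target = np.asarray(target, dtype=float)
    best = (log_marginal(target), None, None)  # baseline: no split
    for r, values in enumerate(regulators):
        uniq = np.unique(values)
        for thr in (uniq[:-1] + uniq[1:]) / 2.0:
            left = target[values < thr]
            right = target[values >= thr]
            score = log_marginal(left) + log_marginal(right) - split_penalty
            if score > best[0]:
                best = (score, r, thr)
    return best
```

The cost is one score evaluation per regulator per distinct threshold, which is why the search is feasible for single-variable conditions.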

5
Incorporating pathway information
  • Biological pathways contain sets of genes and
    represent chains of biochemical reactions that
    perform some function
  • Aberrations in glioblastoma tend to occur as
    disjunctions within pathways: deregulating one
    component is usually enough to alter the function
    of the whole pathway [4]
  • Idea: use pathway information to obtain a better
    model
  • Methodology: extend node conditions to
    disjunctions of conditions on pathway elements
  • We will use 15 sets of regulators (20-30 genes
    per set):
  • 5 sets of regulators of pathways known to be
    related to cancer
  • 5 sets of regulators of other pathways
  • 5 sets of regulators chosen at random
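
To illustrate what a disjunctive node condition might look like, here is a small sketch. It assumes a condition is a list of (gene index, comparison, threshold) triples and that a sample satisfies the node when at least one triple holds; the function name `disjunction_mask` and this representation are hypothetical.

```python
import numpy as np

def disjunction_mask(expr, conditions):
    """Evaluate an OR of single-variable threshold conditions.

    expr: (n_genes, n_samples) expression matrix.
    conditions: list of (gene_index, op, threshold), op is "<" or ">".
    Returns a boolean array, True for samples meeting any condition.
    """
    expr = np.asarray(expr, dtype=float)
    mask = np.zeros(expr.shape[1], dtype=bool)
    for gene, op, thr in conditions:
        if op == "<":
            mask |= expr[gene] < thr
        elif op == ">":
            mask |= expr[gene] > thr
        else:
            raise ValueError(f"unknown comparison: {op!r}")
    return mask
```

The resulting mask splits the samples into the two children of the node, in the same way a single-variable threshold would.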

6
Problem setting
  • Concept class: disjunctions of threshold functions,
    each on a single variable
  • Loss function: negative Bayesian score (biological
    penalty?)
  • Potential number of hypotheses: 2^m
  • Related classification problem tackled by
    Marchand and Shah (2005) and Kestler et al.
    (2006).
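
A quick check of the 2^m count. The slide does not define m; this sketch assumes m is the number of elements in a pathway set and that each element contributes one fixed condition that is either included in the disjunction or not. The gene names in the usage note are placeholders.

```python
from itertools import combinations

def n_disjunctions(m):
    """Number of subsets of m candidate conditions (includes the empty one)."""
    return 2 ** m

def enumerate_disjunctions(genes):
    """Yield every subset of genes; only practical for small sets."""
    for k in range(len(genes) + 1):
        for subset in combinations(genes, k):
            yield subset
```

For example, `list(enumerate_disjunctions(["g1", "g2", "g3"]))` yields 8 subsets, while sets of 20 to 30 genes give between 2^20 = 1,048,576 and 2^30 = 1,073,741,824 candidates per node, before thresholds are even considered.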

7
Bibliography
  • [1] Pe'er, D., Bayesian Network Analysis of
    Signaling Networks: A Primer. Sci. STKE, 2005.
    2005(281): p. pl4.
  • [2] Segal, E., et al., Module networks: identifying
    regulatory modules and their condition-specific
    regulators from gene expression data. Nat Genet,
    2003. 34(2): p. 166-176.
  • [3] Lee, S.-I., et al., Identifying regulatory
    mechanisms using individual variation reveals key
    role for chromatin modification. Proceedings of
    the National Academy of Sciences, 2006. 103(38):
    p. 14062-14067.
  • [4] Comprehensive genomic characterization defines
    human glioblastoma genes and core pathways.
    Nature, 2008. 455(7216): p. 1061-1068.
  • [5] Kestler, H., W. Lindner, and A. Müller, Learning
    and Feature Selection Using the Set Covering
    Machine with Data-Dependent Rays on Gene
    Expression Profiles, in Artificial Neural
    Networks in Pattern Recognition. 2006. p.
    286-297.
  • [6] Marchand, M. and M. Shah, PAC-Bayes Learning of
    Conjunctions and Classification of
    Gene-Expression Data, in Advances in Neural
    Information Processing Systems 17. 2005, MIT
    Press: Cambridge, MA. p. 881-888.