Consensus NMF: A unified approach to two-sided testing of microarray data


1
Consensus NMF: A unified approach to two-sided
testing of microarray data
  • Paul Fogel, Consultant, Paris, France
  • S. Stanley Young, National Institute of
    Statistical Sciences
  • Post-Genomic workshop, Rennes, France
  • February 1, 2008

2
Principle
  • Tests of means of several variables.
  • Variables are ordered according to a
    data-dependent criterion and tested in that
    order, without alpha adjustment, until the
    first non-significant test.
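
The principle above can be sketched in a few lines. This is a minimal illustration, not the presenters' implementation: it assumes the ordering criterion is each variable's total sum of squares (one of the criteria listed on the next slide) and that unadjusted p-values from the tests of means are already available. The function names and the toy numbers are hypothetical.

```python
def total_sum_of_squares(column):
    """Sum of squared deviations from the overall mean, pooled over all groups."""
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column)


def sequential_test(data, p_values, alpha=0.05):
    """Test variables in data-driven order, stopping at the first p-value > alpha.

    data     : list of columns, one list of observations per variable
    p_values : unadjusted p-value of each variable's test of means (same order)
    Returns the indices of the variables declared significant.
    """
    scores = [total_sum_of_squares(col) for col in data]
    # Assumption: the variable with the largest sum of squares is tested first.
    order = sorted(range(len(data)), key=lambda j: scores[j], reverse=True)
    significant = []
    for j in order:
        if p_values[j] > alpha:
            break  # the first non-significant test ends the procedure
        significant.append(j)
    return significant


# Three variables, four observations each (made-up numbers).
data = [
    [1.0, 1.2, 3.0, 3.1],   # large spread
    [2.0, 2.1, 2.0, 2.1],   # small spread
    [0.5, 0.6, 1.9, 2.0],   # medium spread
]
p_values = [0.01, 0.001, 0.20]
print(sequential_test(data, p_values))  # prints [0]
```

In this toy run the second variable has the smallest p-value but is never tested, because the variable ranked before it fails at the 0.05 level. That is the early-stopping behaviour the later slide on misordered variables addresses.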

3
Data-driven ordering of hypotheses
  • Simple methods that ignore correlation structure
      • Sums of squares (Kropf and Läuter)
  • Matrix factorization methods
      • Singular value decomposition (SVD)
      • Non-negative matrix factorization (NMF)
      • Consensus NMF

4
Matrix Factorization: Key Papers
  • Good (1969), Technometrics: SVD.
  • Liu et al. (2003), PNAS: rSVD.
  • Lee and Seung (1999), Nature: NMF.
  • Brunet et al. (2004), PNAS: microarray.
  • Fogel et al. (2007), Bioinformatics: microarray.

5
Non-negative matrix factorization: Sum of
positive parts
  • → Identifies only up-regulated genes.
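
For readers unfamiliar with NMF, the following is a minimal sketch of a standard factorization X ≈ WH with non-negative factors. It assumes the squared-error multiplicative updates commonly attributed to Lee and Seung; the slides do not say which objective the authors used, and the function name `nmf`, the iteration count, and the small constant `eps` are illustrative choices.

```python
import numpy as np


def nmf(X, k, n_iter=500, eps=1e-9, seed=0):
    """Factorize non-negative X (n x m) as W (n x k) @ H (k x m)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates for the squared-error objective;
        # eps guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy non-negative matrix of rank 2 (made-up numbers).
X = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0],
              [3.0, 2.0, 5.0]])
W, H = nmf(X, k=2)
print(np.round(W @ H, 2))  # reconstruction of X from the two parts
```

Because every entry of W and H is non-negative, each row of X is rebuilt as a sum of positive parts, which is why large values (up-regulation) are what the factors pick up.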

6
Consensus NMF: Simultaneous factorization of X
and 1/X
  • → Identifies both up- and down-regulated genes.
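
A small sketch of why the reciprocal block helps: values that are small in X become large in 1/X, so a non-negative factorization can pick them up. The helper name `reciprocal_blocks` and the toy ratios are hypothetical, and the sketch assumes strictly positive data.

```python
import numpy as np


def reciprocal_blocks(X):
    """Return the two blocks [X, 1/X] for a strictly positive matrix X."""
    X = np.asarray(X, dtype=float)
    if np.any(X <= 0):
        raise ValueError("X must be strictly positive to form 1/X")
    return [X, 1.0 / X]


# Rows are genes, columns are samples (two controls, two treated); made-up ratios.
X = np.array([[1.0, 1.0, 4.0, 4.0],      # up-regulated in treated samples
              [1.0, 1.0, 0.25, 0.25]])   # down-regulated in treated samples
X_block, inv_block = reciprocal_blocks(X)
print(inv_block[1])  # the down-regulated gene now shows large values
```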

7
CNMF Algorithm
  • Stepwise approach implementing standard NMF
    updating rules at each step:
  1. Run standard NMF, ignoring block information, to
     obtain initial consensus row factors W and
     block-wise column factors Hb.
  2. For each block Xb, update Wb from Hb and scale Wb
     to 1.
  3. Calculate the consensus W between the Wb's (see
     below).
  4. Update H from W.
  5. Go to 2 until convergence.
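
The two block-wise updates in this algorithm (Wb from Hb, and H from W) might look as follows. This is a sketch under stated assumptions, not the authors' code: it uses squared-error multiplicative updates, starts each Wb from the current consensus W, and reads "scale to 1" as scaling each column to unit sum. The function names are hypothetical, and the consensus calculation itself is left out here.

```python
import numpy as np


def update_block_row_factors(X_blocks, H_blocks, W, eps=1e-9):
    """For each block X_b, update W_b from H_b, then scale W_b columns to sum 1."""
    W_blocks = []
    for X_b, H_b in zip(X_blocks, H_blocks):
        # Assumption: each W_b starts from the current consensus W.
        W_b = W * (X_b @ H_b.T) / (W @ H_b @ H_b.T + eps)
        W_blocks.append(W_b / (W_b.sum(axis=0, keepdims=True) + eps))
    return W_blocks


def update_column_factors(X_blocks, H_blocks, W, eps=1e-9):
    """Update each block's column factors H_b from the consensus W."""
    return [H_b * (W.T @ X_b) / (W.T @ W @ H_b + eps)
            for X_b, H_b in zip(X_blocks, H_blocks)]


# Two random positive blocks with 5 rows, 4 columns, and K = 2 factors.
rng = np.random.default_rng(0)
X_blocks = [rng.random((5, 4)) + 0.1 for _ in range(2)]
H_blocks = [rng.random((2, 4)) + 0.1 for _ in range(2)]
W = rng.random((5, 2)) + 0.1

W_blocks = update_block_row_factors(X_blocks, H_blocks, W)
H_blocks = update_column_factors(X_blocks, H_blocks, W)
print([W_b.sum(axis=0).round(3) for W_b in W_blocks])
```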

8
Building the consensus
  • To calculate the consensus row factors, we run
    one iteration of NMF in order to factorize the
    concatenated matrix of row factors:
    $W_{1,\ldots,B} = [W_1 \mid W_2 \mid \cdots \mid W_B] \approx W \, I_{1,\ldots,B}$
  • Start with the previous consensus W and
    $I_{1,\ldots,B} = [I_1 \mid I_2 \mid \cdots \mid I_B]$ with
    $I_1 = I_2 = \cdots = I_B = I$, where I is the
    $K \times K$ identity matrix.
  • Update H.
  • Update W and scale to 1.
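
One way to read the consensus step is sketched below. It is an interpretation, not the authors' implementation: it assumes squared-error multiplicative updates, treats the concatenated identity matrices as the starting value of the right-hand factor (called G here, a hypothetical name), and reads "scale to 1" as scaling each column of W to unit sum.

```python
import numpy as np


def scale_columns(W, eps=1e-9):
    """Scale each column to unit sum (one possible reading of 'scale to 1')."""
    return W / (W.sum(axis=0, keepdims=True) + eps)


def consensus_step(W_blocks, W_prev, eps=1e-9):
    """One NMF iteration on [W_1 | ... | W_B] ~ W @ [I | ... | I]."""
    B = len(W_blocks)
    K = W_prev.shape[1]
    C = np.hstack(W_blocks)          # n x (K*B) concatenated row factors
    G = np.hstack([np.eye(K)] * B)   # K x (K*B), starts as [I | ... | I]
    W = W_prev.copy()
    # Update the right-hand factor; entries that start at zero stay zero.
    G = G * (W.T @ C) / (W.T @ W @ G + eps)
    # Update the consensus row factors, then rescale.
    W = W * (C @ G.T) / (W @ G @ G.T + eps)
    return scale_columns(W)


# Two block-wise row-factor matrices with K = 2 (made-up numbers).
W1 = np.array([[0.2, 0.5], [0.3, 0.1], [0.5, 0.4]])
W2 = np.array([[0.3, 0.4], [0.3, 0.2], [0.4, 0.4]])
W_prev = scale_columns((W1 + W2) / 2)
W_new = consensus_step([W1, W2], W_prev)
print(W_new.sum(axis=0))  # columns are rescaled to sum to about 1
```

A quick sanity check on this reading: when all blocks already equal the previous consensus, the step leaves W essentially unchanged.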

9
Simulated experiment
  • Simulate a microarray experiment:
      • One normal group
      • Two treated groups
      • 10 observations per group, with various settings
        for the numbers of regulated genes.
  • Up- and down-regulated genes simulated in equal
    proportion.
  • Added correlation structure between regulated
    genes.

10
Experimental design

11
Simulation results

12
Real experiment
  • RT-PCR experiment, 38 genes, 2 groups
  • CNMF factorization and sequential testing → 12
    significant genes.
  • BH (Benjamini-Hochberg) adjustment:
      • Standard → 1 significant gene.
      • Using NMF to cluster genes and BH adjusting by
        cluster → 9 significant genes!
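
To make the comparison concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment applied once to all p-values and once within each cluster. The p-values and cluster labels are made up for illustration and are not the RT-PCR data from the talk; the function names are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def bh_adjust_by_cluster(p_values, clusters):
    """Apply the BH adjustment separately within each cluster label."""
    adjusted = [None] * len(p_values)
    for label in set(clusters):
        idx = [i for i, c in enumerate(clusters) if c == label]
        for i, adj in zip(idx, bh_adjust([p_values[i] for i in idx])):
            adjusted[i] = adj
    return adjusted


# Six made-up p-values; the small ones all fall in cluster 1.
p_values = [0.01, 0.02, 0.03, 0.5, 0.6, 0.9]
clusters = [1, 1, 1, 2, 2, 2]
n_global = sum(p < 0.05 for p in bh_adjust(p_values))
n_cluster = sum(p < 0.05 for p in bh_adjust_by_cluster(p_values, clusters))
print(n_global, n_cluster)  # prints 0 3
```

With the small p-values concentrated in one small cluster, the multiplier m/rank is smaller within that cluster than across all genes, so more tests pass the same threshold.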

13
Heatmap view
  • All significant genes were clustered into
    cluster 1.
  • Cluster 1 is smaller → BH adjustment is less
    conservative.

14
Overcoming misordered variables
  • Misordered variables may stop the procedure at an
    early stage. Possible causes are:
      • Varying scale of the variables: those with a
        higher scale tend to be top-ranked, so scaling is
        necessary.
      • Outliers or other sources of high variability.
  • Consecutive p-values can be pooled using Fisher's
    method to prevent early stopping.
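
Fisher's method combines k p-values through the statistic -2 Σ log p, which is referred to a chi-square distribution with 2k degrees of freedom. The sketch below uses the closed-form chi-square tail for even degrees of freedom, so it needs only the standard library. How many consecutive p-values are pooled, and where in the sequence, is not specified in the slides; the pair used here is purely illustrative.

```python
import math


def fisher_combined_p(p_values):
    """Fisher's method: -2 * sum(log p) is chi-square with 2k df under H0."""
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # test statistic / 2
    # Closed-form chi-square survival function for even degrees of freedom:
    # exp(-x/2) * sum_{i<k} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Two consecutive p-values; the second alone would stop the procedure at 0.05.
pooled = fisher_combined_p([0.04, 0.06])
print(round(pooled, 4))  # prints 0.0169
```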

15
Other applications of Consensus NMF
  • Signal originating from different sources: microarray,
    Taqman, proteomics
  • FastNMF
  • Image analysis
  • Large microarray datasets

16
NMF Software
  • irMF: inferential, robust Matrix Factorization
    (JMP script), http://www.niss.org/irMF/
  • Array Studio: software package that provides
    state-of-the-art statistics and visualization for
    the analysis of high-dimensional quantification
    data (e.g. microarray or Taqman data). OmicSoft
    Corporation, http://www.omicsoft.com