Robust diagnosis of DLBCL from gene expression data from different laboratories

1
Robust diagnosis of DLBCL from gene expression
data from different laboratories
DIMACS - RUTCOR Workshop on Boolean and
Pseudo-Boolean Functions in Memory of Peter L.
Hammer, January 19-22, 2009
2
Peter L. Hammer, Sorin Alexe, David E. Axelrod (Rutgers University)
Gustavo Stolovitzky (IBM TJ Watson Research)
Gyan Bhanot, Arnold J. Levine (Institute for Advanced Study, Princeton)
David Weissmann (Cancer Institute of New Jersey)
3
Overview
  • Motivation
  • Pattern-based ensemble classifiers
  • Case study: comparing data from two labs for DLBCL
    vs FL diagnosis

Shipp et al. (2002) Nature Med. 8(1), 68-74. (Whitehead Lab)
Stolovitzky G. (2005) In Deisboeck et al., Complex Systems Science in
BioMedicine (in press) (preprint
http://www.wkap.nl/prod/a/Stolovitzky.pdf). (Dalla-Favera Lab)
Alexe, Alexe, Axelrod, Hammer, Weissmann (2005) Artificial Intelligence in
Medicine
Bhanot, Alexe, Stolovitzky, Levine (2005) Genome Informatics
4
Non-Hodgkin lymphomas
  • FL: low-grade non-Hodgkin lymphoma; no cure if
    advanced stage
      • second most frequent subtype of nodal lymphoid
        malignancies
      • Incidence has risen from 23/ to more than
        57/ 100,000/year (50 00)
      • t(14;18) translocation: over-expression of
        anti-apoptotic bcl2
      • 25-60% of FL cases evolve to DLBCL
  • DLBCL: high-grade non-Hodgkin lymphoma; high
    variability in response to treatment
      • most frequent subtype of NHL
      • < 2 years survival if untreated
  • Biomarkers of FL transformation to DLBCL:
      • p53/MDM2 (Moller et al., 1999)
      • p16 (Pyniol, 1998)
      • p38MAPK (Elenitoba-Johnson et al., 2003)
      • c-myc (Lossos et al., 2002)

5
Gene arrays
  • Gene arrays are a way to study the variation of
    mRNA levels between different types of cells.
  • This allows diagnosis and inference of pathways
    that cause disease / early stage diagnosis
  • Identify molecular profiles of disease:
    personalized medicine

6
Lymphoma datasets
  • Data
      • WI (Shipp et al., 2002): Affy HuGeneFL
      • CU (Dalla-Favera Lab; Stolovitzky, 2005): Affy Hu95Av2
  • Samples
      • WI: 58 DLBCL, 19 FL
      • CU: 14 DLBCL, 7 FL
  • Genes
      • WI: 6817
      • CU: 12581

7
Diagnosis problem
  • Input
      • Training (biomedical) data
      • 2 classes: FL and DLBCL
      • m samples described by N >> m features
  • Output
      • Collection of robust biomarkers, models
      • Robust, accurate classifier, tested on
        out-of-sample data

8
(No Transcript)
9
Patterns (Logical Analysis of Data, Hammer 1988)
Positive Patterns
Negative Patterns
Model
  • Exhaustive collections of patterns
  • Pattern space
  • Classification / attribute analysis / new class
    identification

10
Data Preprocessing
  • 50% P (present) calls; upper limit (UL) 16000, lower limit (LL) 20
  • 2/1 stratified split of WI data into train/test; CU data used as test
  • Normalize data to median 1000 per array
  • Generate 500 data sets using noise, k-fold
    stratified sampling, jackknife
  • Find genes with high correlation to phenotype
    using t-test or SNR. Keep genes that are in >
    90% of datasets
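
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the order of thresholding and normalization, the `top` list size, and the use of plain stratified subsampling (in place of the noise, k-fold and jackknife resampling on the slide) are all assumptions.

```python
import numpy as np

def preprocess(x, lower=20.0, upper=16000.0, target_median=1000.0):
    """Clip values to [lower, upper], then scale each array (column) to a common median."""
    x = np.clip(np.asarray(x, dtype=float), lower, upper)
    medians = np.median(x, axis=0)          # one median per array
    return x * (target_median / medians)

def snr(x, is_pos):
    """Per-gene signal-to-noise ratio: (mean_pos - mean_neg) / (std_pos + std_neg)."""
    pos, neg = x[:, is_pos], x[:, ~is_pos]
    spread = pos.std(axis=1, ddof=1) + neg.std(axis=1, ddof=1) + 1e-12
    return (pos.mean(axis=1) - neg.mean(axis=1)) / spread

def stable_genes(x, is_pos, n_rounds=500, top=50, keep_frac=0.9, seed=0):
    """Return indices of genes ranked in the top |SNR| list in more than keep_frac of rounds.

    Assumption: each round keeps 90% of each class, drawn without replacement.
    """
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(is_pos)
    neg_idx = np.flatnonzero(~is_pos)
    counts = np.zeros(x.shape[0], dtype=int)
    for _ in range(n_rounds):
        p = rng.choice(pos_idx, size=max(2, int(0.9 * len(pos_idx))), replace=False)
        n = rng.choice(neg_idx, size=max(2, int(0.9 * len(neg_idx))), replace=False)
        cols = np.concatenate([p, n])
        scores = np.abs(snr(x[:, cols], is_pos[cols]))
        counts[np.argsort(scores)[-top:]] += 1   # genes in this round's top list
    return np.flatnonzero(counts > keep_frac * n_rounds)
```

Here `x` is a genes-by-samples matrix and `is_pos` is a boolean vector marking one class (for example DLBCL). A t-test statistic could be substituted for `snr` without changing the rest of the sketch.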

11
Choosing support sets
  • Create quality patterns using small subsets of
    genes; validate using weighted voting with
    10-fold cross-validation
  • Sort genes by their appearance in good patterns
  • Select top genes to cover each sample by at least
    10 patterns

Alexe, Alexe, Hammer, Vizvari (2005)
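
One way to read the last two bullets is as a ranking step followed by a coverage check. The sketch below assumes each pattern is given as a pair (set of genes used, set of sample indices covered); this data layout, the function name `select_support_set`, and the tie-breaking are hypothetical and not taken from the cited paper.

```python
from collections import Counter

def select_support_set(patterns, n_samples, min_cover=10):
    """Add genes in order of how often they appear in patterns until every
    sample is covered by at least min_cover patterns built only from the
    selected genes. Returns the sorted gene list, or None if unreachable."""
    freq = Counter(g for genes, _ in patterns for g in genes)
    ranked = [g for g, _ in freq.most_common()]   # most frequent genes first
    selected = set()
    for gene in ranked:
        selected.add(gene)
        cover = [0] * n_samples
        for genes, covered in patterns:
            if set(genes) <= selected:            # pattern usable with current genes
                for s in covered:
                    cover[s] += 1
        if min(cover) >= min_cover:
            return sorted(selected)
    return None
```

With `min_cover=10` this mirrors the "at least 10 patterns per sample" target on the slide; smaller values are convenient for toy inputs.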
12
The 30 genes that best distinguish FL from DLBCL
13
Genes identified by LAD (AIIM 2005) to
distinguish DLBCL from FL
14
Examples of FL and DLBCL patterns
WI training data: each DLBCL case satisfies at
least one of the patterns P1 and P2. Each FL case
satisfies the pattern N1 (and none of the
patterns P1 and P2).
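
The transcript does not preserve the actual patterns, so the following is only a sketch of how a pattern of this kind can be represented and checked: a pattern is taken to be a conjunction of conditions of the form "expression of gene above (or not above) a cut-point". The gene names, cut-points, and the unweighted vote in `classify` are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Condition:
    gene: str
    cut: float
    above: bool   # True: expression > cut; False: expression <= cut

def satisfies(sample, pattern):
    """A sample satisfies a pattern when every condition in it holds."""
    return all((sample[c.gene] > c.cut) == c.above for c in pattern)

def classify(sample, positive, negative):
    """Unweighted vote: count satisfied positive (DLBCL) and negative (FL) patterns."""
    pos = sum(satisfies(sample, p) for p in positive)
    neg = sum(satisfies(sample, p) for p in negative)
    if pos == neg:
        return "unclassified"
    return "DLBCL" if pos > neg else "FL"

# Hypothetical patterns in the spirit of P1, P2 and N1 (not the slide's real ones).
P1 = (Condition("geneA", 500.0, True),)
P2 = (Condition("geneB", 300.0, True), Condition("geneC", 800.0, False))
N1 = (Condition("geneA", 500.0, False), Condition("geneB", 300.0, False))
```

For example, `classify({"geneA": 900, "geneB": 100, "geneC": 1000}, [P1, P2], [N1])` is decided by P1 alone, since neither P2 nor N1 is satisfied.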
15
Pattern data
16
Meta-classifier performance
17
Error distribution raw and pattern data
18
Biology based method
19
p53 related genes identified by filtering
procedure
FL → DLBCL progression
20
p53 pattern data
21
Examples of p53 responsive genes patterns
WI data: each DLBCL case satisfies one of the
patterns P1, P2, P3. Each FL case satisfies one of
the patterns N1, N2, N3.
22
p53 combinatorial biomarker
77% of FL and 21% of DLBCL cases (3.7-fold): at most one
gene over-expressed
79% of DLBCL and 23% of FL cases (3.4-fold): at least two
genes over-expressed
Each individual gene is over-expressed in about
40-70% of DLBCL and 20-40% of FL cases (specificity 50-60%,
sensitivity 60-70%)
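
The rule on this slide amounts to counting how many genes in the panel are over-expressed and comparing the count with a threshold of two. A minimal sketch follows; the gene keys and cut-off values are placeholders, and the labels "DLBCL-like" and "FL-like" are names chosen here, not terms from the slides.

```python
def count_overexpressed(sample, cutoffs):
    """Number of panel genes whose expression exceeds its cut-off."""
    return sum(sample[gene] > cut for gene, cut in cutoffs.items())

def biomarker_call(sample, cutoffs, min_genes=2):
    """At least min_genes over-expressed -> DLBCL-like, otherwise FL-like."""
    if count_overexpressed(sample, cutoffs) >= min_genes:
        return "DLBCL-like"
    return "FL-like"

def fold(pct_a, pct_b):
    """Ratio of two class percentages, as quoted in parentheses on the slide."""
    return pct_a / pct_b

# Placeholder cut-offs (hypothetical values, for illustration only).
cutoffs = {"plk1": 1200.0, "cdk2": 900.0, "gene_x": 1500.0}
```

The fold values on the slide follow directly from the percentages: 77/21 rounds to 3.7 and 79/23 rounds to 3.4.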
23
What are these genes?
  • Plk1 (stpk13): polo-like kinase; serine/threonine
    protein kinase 13, M-phase specific
      • cell transformation, neoplastic; drives quiescent
        cells into mitosis
      • over-expressed in various human tumors
      • Takai et al., Oncogene, 2005: plk1 is a potential
        target for cancer therapy and a new prognostic
        marker for cancer
      • Mito et al., Leuk Lymph, 2005: plk1 is a biomarker
        for DLBCL
  • Cdk2 (p33): cyclin-dependent kinase; G2/M
    transition of mitotic cell cycle; interacts with
    cyclins A, B3, D, E
  • p53: tumor suppressor gene (Levine 1982)

24
Conclusions
  • Pattern-based meta-classifier is robust against
    noise
  • Good prediction of FL → DLBCL
  • Biology based analysis also possible
  • Yields useful biomarker
  • Should study biologically motivated sets of genes
    → build pathways

25
  • Thank you for your attention!