Logical Analysis of Diffuse Large B Cell Lymphoma - PowerPoint PPT Presentation

... Sorin Alexe1, David Axelrod2, Peter Hammer1, and David Weissmann3 ... David Axelrod: axelrod_at_biology.rutgers.edu. Peter Hammer: hammer_at_rutcor.rutgers.edu ... – PowerPoint PPT presentation

Provided by: RUT7
Learn more at: https://cs.nyu.edu
Transcript and Presenter's Notes

1
Logical Analysis of Diffuse Large B Cell
Lymphoma
  • Gabriela Alexe(1), Sorin Alexe(1), David Axelrod(2),
    Peter Hammer(1), and David Weissmann(3)
  • RUTCOR(1) and Department of Genetics(2),
    Rutgers University, and Robert Wood Johnson
    Medical School(3)

2
This Talk
  • Lymphoma
  • Gene Expression Level Analysis
  • cDNA Microarray
  • Applied to Diffuse Large B-Cell Lymphoma
  • Logical Analysis of Data
  • Discretization/Binarization
  • Support Sets
  • Pattern Generation
  • Theories and Models
  • Prediction

3
Lymphoma
4
Lymphoma
  • Cancer of lymphoid cells
  • Clonal
  • Uncontrolled growth
  • Metastasis
  • Lymphoma
  • Diagnosis
  • Grade

5
Diffuse Large B Cell Lymphoma (DLBCL)
  • 31% of non-Hodgkin lymphoma cases
  • 50% long-term, disease-free survival
  • Clinical variability
  • Prognosis therapy
  • IPI
  • Morphology
  • Gene expression

6
Diffuse Large B Cell Lymphoma
7
Spleen with Diffuse Large B Cell Lymphoma
8
Gene Expression Level Analysis
9
DNA-RNA Hybridization
10
Gene Expression Profiling
cDNA microarray analysis
11
DLBCL cDNA Microarray Analysis
  • "Distinct types of diffuse large B-cell lymphoma
    identified by gene expression profiling", Alizadeh
    et al., Nature, vol 403, pp 503-511
  • cDNA microarray data -> unsupervised hierarchical
    agglomerative clustering
  • Germinal center signature: 76% survival at 5
    years
  • Activated B cell signature: 16% survival at 5 years

12
DLBCL Clustering
Each case (patient) is a point in N-dimensional
space, where N = number of genes
13
DLBCL Survival by Type
14
Supervised Learning Classification of DLBCL
  • "Diffuse large B-cell lymphoma outcome prediction by
    gene-expression profiling and supervised machine
    learning", Shipp et al., Nature Medicine, vol 8, pp
    68-74
  • Prognosis of DLBCL
  • Highly correlated genes -> weighted voting
    algorithm
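
The slide names a weighted voting algorithm without giving its formula. A minimal sketch of one common signal-to-noise formulation follows; the function names `fit_weighted_voting` and `predict` are hypothetical, and this is not necessarily the exact procedure used by Shipp et al.

```python
from statistics import mean, stdev

def fit_weighted_voting(samples, labels):
    """Per-gene weight and decision boundary from two labelled classes.

    samples: list of equal-length expression vectors; labels: list of 0/1.
    Assumes each class has at least two samples (needed for stdev).
    """
    n_genes = len(samples[0])
    params = []
    for g in range(n_genes):
        pos = [s[g] for s, y in zip(samples, labels) if y == 1]
        neg = [s[g] for s, y in zip(samples, labels) if y == 0]
        # Signal-to-noise weight: large when the class means are far apart
        # relative to the within-class spread.
        weight = (mean(pos) - mean(neg)) / (stdev(pos) + stdev(neg))
        boundary = (mean(pos) + mean(neg)) / 2
        params.append((weight, boundary))
    return params

def predict(params, sample):
    """Sum the per-gene votes; a positive total votes for class 1."""
    total = sum(w * (x - b) for (w, b), x in zip(params, sample))
    return 1 if total > 0 else 0
```

Each gene votes in proportion to how far the new sample lies from that gene's boundary, so strongly class-correlated genes dominate the outcome.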

15
Shipp's 13-Gene Predictor
16
Logical Analysis of Data
17
Logical Analysis of Data (LAD)
  • Non-statistical method based on
  • Combinatorics
  • Optimization
  • Logic
  • Based on dataset of cases/patients
  • LAD learns patterns characteristic of classes
  • Subsets of patients who are +/- for a condition
  • Collections of patterns are extensible
  • Predictions

18
The Problem: Approximation of Hidden Function
Dataset
Hidden Function
LAD Approximation
19
Main Components of LAD
  • Discretization/Binarization
  • Support Sets
  • Pattern Generation
  • Theories and Models
  • Prediction

20
Discretization
Separating Cutpoints
Minimum Set of Separating Cutpoints
21
Cutpoints and Support Set
  • Minimization is NP-hard
  • Numerous powerful methods
  • Support set
  • Cutpoints define a grid in which ideally no cell
    contains both + and - cases
  • Cutpoints simplify data and decrease noise
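
The cutpoint idea can be illustrated with a small sketch, assuming numeric expression values and two class labels. The helper names (`candidate_cutpoints`, `binarize`, `grid_is_pure`) are hypothetical, and the sketch only proposes and checks cutpoints; it does not attempt the NP-hard minimization.

```python
def candidate_cutpoints(values, labels):
    """Midpoints between consecutive distinct values whose classes differ."""
    by_value = {}
    for v, y in zip(values, labels):
        by_value.setdefault(v, set()).add(y)
    ordered = sorted(by_value)
    cuts = []
    for lo, hi in zip(ordered, ordered[1:]):
        # A cutpoint helps only if it separates cases of different classes.
        if len(by_value[lo] | by_value[hi]) > 1:
            cuts.append((lo + hi) / 2)
    return cuts

def binarize(row, cutpoints):
    """cutpoints maps gene index -> list of cutpoints; returns a 0/1 tuple."""
    return tuple(int(row[g] > c) for g in sorted(cutpoints) for c in cutpoints[g])

def grid_is_pure(rows, labels, cutpoints):
    """True if no grid cell contains both + and - cases."""
    cells = {}
    for row, y in zip(rows, labels):
        cells.setdefault(binarize(row, cutpoints), set()).add(y)
    return all(len(classes) == 1 for classes in cells.values())
```

For one gene with values 1, 2, 5, 6 and labels -, -, +, +, the only candidate cutpoint is 3.5, and that single cutpoint already yields a pure grid.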

22
Patterns
  • Examples
  • Gene A > 34 and gene B < 24 and gene C < 2
  • Positive and negative patterns
  • Pattern parameters
  • Degree (# of conditions)
  • Prevalence (% of +/- cases that satisfy it)
  • Homogeneity (proportion of +/- cases among those
    it covers)
  • Best: low degree, large prevalence, high
    homogeneity
  • Patterns are extensible!
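
The three pattern parameters can be computed directly from a labelled dataset. The sketch below assumes a pattern is a list of (gene, operator, threshold) conditions and that cases are dictionaries of expression levels; this representation and the names `covers` and `pattern_stats` are illustrative assumptions, and the data in the usage note is made up.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt}

def covers(pattern, case):
    """True if the case satisfies every condition of the pattern."""
    return all(OPS[op](case[gene], t) for gene, op, t in pattern)

def pattern_stats(pattern, cases, labels, target="+"):
    """Degree, prevalence and homogeneity of a pattern for the target class."""
    covered = [y for c, y in zip(cases, labels) if covers(pattern, c)]
    n_target = sum(1 for y in labels if y == target)
    hits = sum(1 for y in covered if y == target)
    return {
        "degree": len(pattern),
        # Share of target-class cases that satisfy the pattern.
        "prevalence": hits / n_target if n_target else 0.0,
        # Share of covered cases that belong to the target class.
        "homogeneity": hits / len(covered) if covered else 0.0,
    }

# The example pattern from the slide.
pattern = [("A", ">", 34), ("B", "<", 24), ("C", "<", 2)]
```

On a toy set of three positive and two negative cases where the pattern covers two positives and no negatives, this gives degree 3, prevalence 2/3 and homogeneity 1.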

23
Pattern Generation
  • Generate patterns based on learning set
  • Stipulate control parameters. For example:
  • Degree 4
  • +/- prevalences > 70%
  • +/- homogeneities 100%
  • All 75 patterns generated in 1.2 seconds on a
    1 GHz Pentium IV PC
  • Evaluate set
  • Average # of patterns covering each observation
  • Accuracy applied to evaluation set
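
How control parameters filter candidate patterns can be sketched with a brute-force enumeration over binarized data. This is an illustration only, not the authors' generation algorithm; the function name `generate_patterns` and the pattern encoding as (feature index, required bit) pairs are assumptions.

```python
from itertools import combinations, product

def generate_patterns(rows, labels, target, max_degree,
                      min_prevalence, min_homogeneity):
    """Enumerate conjunctions of up to max_degree binary conditions and
    keep those meeting the prevalence and homogeneity thresholds."""
    n_features = len(rows[0])
    n_target = sum(1 for y in labels if y == target)
    if n_target == 0:
        return []
    found = []
    for degree in range(1, max_degree + 1):
        for features in combinations(range(n_features), degree):
            for values in product((0, 1), repeat=degree):
                covered = [y for r, y in zip(rows, labels)
                           if all(r[f] == v for f, v in zip(features, values))]
                if not covered:
                    continue
                hits = sum(1 for y in covered if y == target)
                if (hits / n_target >= min_prevalence
                        and hits / len(covered) >= min_homogeneity):
                    found.append(tuple(zip(features, values)))
    return found
```

The search space grows combinatorially with the degree, which is one reason a low degree bound is a natural control parameter.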

24
Patterns Illustration
Negative Pattern
Positive Pattern
25
Theories: Approximations of the 2 Regions
A theory is a set of positive (or negative)
patterns such that every positive (or negative)
case is covered.
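
The definition above amounts to a covering condition, which can be checked directly. The sketch below also builds a small theory with a greedy cover, which is one possible heuristic and an assumption here, not necessarily the method used in the talk; patterns are encoded as (feature index, required bit) pairs over binarized cases, and all names are hypothetical.

```python
def covers(pattern, row):
    """True if the binarized row satisfies every condition of the pattern."""
    return all(row[f] == v for f, v in pattern)

def is_theory(patterns, rows):
    """True if every case in rows is covered by at least one pattern."""
    return all(any(covers(p, r) for p in patterns) for r in rows)

def greedy_theory(patterns, rows):
    """Repeatedly pick the pattern covering the most uncovered cases.

    Returns the chosen patterns, or None if some case cannot be covered.
    """
    uncovered = set(range(len(rows)))
    chosen = []
    while uncovered:
        best, newly = None, set()
        for p in patterns:
            hit = {i for i in uncovered if covers(p, rows[i])}
            if len(hit) > len(newly):
                best, newly = p, hit
        if best is None:
            return None
        chosen.append(best)
        uncovered -= newly
    return chosen
```

Passing only the positive cases and positive patterns yields a positive theory; the same call on the negative side yields a negative theory.
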
26
Models
  • A set of a positive and a negative theory
  • A good model
  • Small number of features (genes)
  • Patterns are high quality
  • Low degrees
  • High prevalences
  • High homogeneities
  • Number of patterns is small
  • Maximize their biologic interpretability

27
Theories and Models
Unexplained Area
Positive Theory
Negative Theory
Model
Positive Area
Discordant Area
Negative Area
28
LAD Prediction
  • A new case: a set of gene expression levels
  • Satisfies some positive, no negative?
  • Satisfies some negative, no positive?
  • Satisfies some of both?
  • Which more?
  • Does not satisfy any (rare)
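
The decision rule outlined above can be sketched as follows. Comparing the fraction of positive patterns satisfied against the fraction of negative patterns satisfied is an assumption about what "which more" means; the name `classify` and the binarized pattern encoding are hypothetical.

```python
def covers(pattern, case):
    """True if the binarized case satisfies every condition of the pattern."""
    return all(case[f] == v for f, v in pattern)

def classify(case, positive_patterns, negative_patterns):
    """Return "positive", "negative" or "unclassified" for a new case."""
    pos = sum(1 for p in positive_patterns if covers(p, case))
    neg = sum(1 for p in negative_patterns if covers(p, case))
    # Normalize by theory size so a larger theory does not win by default.
    pos_share = pos / len(positive_patterns) if positive_patterns else 0.0
    neg_share = neg / len(negative_patterns) if negative_patterns else 0.0
    if pos_share > neg_share:
        return "positive"
    if neg_share > pos_share:
        return "negative"
    # Covers both the tie and the rare case where no pattern is satisfied.
    return "unclassified"
```

A case that satisfies only positive patterns or only negative patterns falls out of the same comparison, since the other share is zero.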

29
8 Gene Classification Model
30
Accuracy of Prognosis
31
Conclusion
  • Logical Analysis of Data (LAD): a versatile new
    classification method, here applied to diagnosis
    and prognosis of lymphoma.
  • LAD genes differ almost entirely from those
    specified by other studies.
  • Genes not individually correlated with diagnosis
    or prognosis but highly correlated in
    combinations of as few as two genes.
  • Patterns suggest biologic pathways
  • LAD provides highly accurate prognosis of DLBCL

32
Contacts
  • Gabriela Alexe: galexe_at_us.ibm.com
  • Sorin Alexe: salexe_at_rutcor.rutgers.edu
  • David Axelrod: axelrod_at_biology.rutgers.edu
  • Peter Hammer: hammer_at_rutcor.rutgers.edu
  • David Weissmann: weissmdj_at_umdnj.edu