Molecular Classification of Cancer - PowerPoint PPT Presentation

Slides: 21
Provided by: csUnmEdu1
Transcript and Presenter's Notes

1
Molecular Classification of Cancer
  • Christopher Davis
  • Mark Fleharty

2
Introduction
  • Clinical applications of computational molecular
    biology
  • Class prediction
  • Class discovery

3
Topics of Discussion
  • Acute Leukemia
      • AML
      • ALL
  • DNA Microarrays
  • Data mining methods

4
Acute Leukemia
  • Different types
      • Acute myeloid leukemia (AML)
      • Acute lymphoblastic leukemia (ALL)
  • Importance of correct diagnosis
      • Maximize efficacy
      • Minimize toxicity
  • Morphological vs. molecular characteristics

5
DNA Microarrays
  • Hybridization of mRNAs onto chips with
    complementary strands of DNA
  • What they tell us
      • How much a gene is expressed
      • When genes are expressed
      • Where genes are expressed
      • Under what conditions they are expressed

6
Gene Expression Example
  • mRNAs are indicators of gene expression
  • Yeast in wine
      • Anaerobic
      • Alcohol
  • Yeast in bread
      • Aerobic
      • CO2

7
Data Mining
  • Correlation Weighting Methods
  • Self-Organizing Maps
  • K-means
  • PCA (Principal Component Analysis)

8
Correlation Weighting Methods
  • The magnitude of each gene's vote depends on its
    expression level in the new sample and on its
    correlation with the class distinction

9
Pearson's r Correlation
  • Continuous interval between -1 and 1
  • 1 if 2 genes are perfectly positively correlated
  • -1 if 2 genes are perfectly negatively correlated
  • 0 if there is no correlation
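
The coefficient described on this slide can be computed directly from its definition. The following is a minimal Python sketch, not taken from the presentation; the function name pearson_r and the toy expression profiles are assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    covariance = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / norm


# Hypothetical expression levels of four genes across four samples.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]    # rises with gene_a
gene_c = [8.0, 6.0, 4.0, 2.0]    # falls as gene_a rises
gene_d = [1.0, -1.0, -1.0, 1.0]  # no linear relationship to gene_a

for name, profile in [("b", gene_b), ("c", gene_c), ("d", gene_d)]:
    print(name, pearson_r(gene_a, profile))
```

The three toy profiles are chosen to hit the three landmark values listed on the slide: perfect positive, perfect negative, and no correlation.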

10
Example: r = 0.8

11
Idealized AML/ALL Gene

12
High Correlation With Idealized Gene

13
  • Sort genes by the strength of their correlation
    (this list is often informative)
  • Genes cast weighted votes based on their
    correlation with the idealized gene and how much
    they are expressed in the patient
  • Votes are summed and, based on a predetermined
    threshold, the patient is classified as
    AML, ALL, or inconclusive
  • Prediction Strength
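
A rough Python sketch of the voting scheme above, under assumptions the slides do not spell out: the idealized gene is encoded as 1 for AML samples and 0 for ALL samples, each vote is the gene's correlation multiplied by the distance of the patient's expression level from the midpoint of the two class means, and prediction strength is the winning margin divided by the total vote. The names train_voters and classify, the threshold value, and the toy data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 for a constant sequence (assumption)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / norm if norm else 0.0


def train_voters(expression, labels, n_genes):
    """Pick the n_genes most strongly correlated with the idealized gene.

    expression maps gene name -> expression level per training sample.
    labels holds "AML" or "ALL" per sample; both classes must be present.
    """
    ideal = [1.0 if lab == "AML" else 0.0 for lab in labels]
    voters = []
    for gene, levels in expression.items():
        r = pearson_r(levels, ideal)
        aml = [v for v, lab in zip(levels, labels) if lab == "AML"]
        other = [v for v, lab in zip(levels, labels) if lab == "ALL"]
        midpoint = (sum(aml) / len(aml) + sum(other) / len(other)) / 2
        voters.append((gene, r, midpoint))
    voters.sort(key=lambda voter: abs(voter[1]), reverse=True)
    return voters[:n_genes]


def classify(voters, sample, threshold=0.3):
    """Sum the weighted votes; return (label, prediction strength)."""
    votes = {"AML": 0.0, "ALL": 0.0}
    for gene, r, midpoint in voters:
        vote = r * (sample[gene] - midpoint)  # positive favors AML
        if vote > 0:
            votes["AML"] += vote
        else:
            votes["ALL"] -= vote
    total = votes["AML"] + votes["ALL"]
    if total == 0:
        return "Inconclusive", 0.0
    strength = abs(votes["AML"] - votes["ALL"]) / total
    if strength < threshold:
        return "Inconclusive", strength
    return ("AML" if votes["AML"] > votes["ALL"] else "ALL"), strength


# Hypothetical training data: three AML samples followed by three ALL samples.
labels = ["AML", "AML", "AML", "ALL", "ALL", "ALL"]
expression = {
    "g1": [9.0, 8.0, 10.0, 1.0, 2.0, 1.5],  # high in AML
    "g2": [1.0, 2.0, 1.0, 9.0, 8.0, 9.5],   # high in ALL
    "g3": [5.0, 4.0, 6.0, 5.0, 6.0, 4.0],   # uninformative
}
voters = train_voters(expression, labels, n_genes=2)
print(classify(voters, {"g1": 9.5, "g2": 1.0}))  # genes agree
print(classify(voters, {"g1": 9.5, "g2": 9.0}))  # genes disagree
```

When the two informative genes pull in opposite directions, the margin is small relative to the total vote, so the sample falls below the threshold and is reported as inconclusive rather than forced into a class.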

14
Self-Organizing Maps
  • An unsupervised learning method that reduces
    high-dimensional data to low-dimensional data
  • Based on a grid of artificial neurons
  • Each grid location has a weight vector

15
Self-Organizing Maps Continued
  • The node with the weight vector closest to the
    input vector is chosen, and its weights are
    adjusted closer to the input vector
  • This node's neighbors are also adjusted to be
    closer to the input vector according to some
    decay function
  • Process all vectors and repeat until stable
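
The update rule above can be sketched in a few lines of Python. This is an illustrative sketch only: the Gaussian neighborhood, the linearly decaying learning rate and radius, the fixed number of epochs standing in for "until stable", and the names train_som and best_matching_unit are all assumptions, not details from the presentation.

```python
import math
import random


def best_matching_unit(weights, x):
    """Grid location whose weight vector is closest to input vector x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = max(rows, cols) / 2
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        fraction_done = epoch / epochs
        lr = lr0 * (1 - fraction_done)                       # step size decays
        radius = max(radius0 * (1 - fraction_done), 0.5)     # neighborhood shrinks
        for x in data:
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_dist2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                # Decay function: influence falls off with grid distance.
                influence = math.exp(-grid_dist2 / (2 * radius ** 2))
                for i in range(dim):
                    w[i] += lr * influence * (x[i] - w[i])
    return weights


# Hypothetical two-dimensional samples forming two well-separated groups.
samples = [
    [0.0, 0.0], [1.0, 0.5], [0.5, 1.0],
    [10.0, 10.0], [9.0, 9.5], [9.5, 9.0],
]
weights = train_som(samples, rows=1, cols=2)
for sample in samples:
    print(sample, "->", best_matching_unit(weights, sample))
```

After training, samples that map to the same grid location can be treated as members of one candidate class.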

16
Use SOM to discover classes
  • SOM is used to find the class members used to
    train the predictors
  • Predictors are tested on a new set of samples
    with known classifications
  • If the cross-validation is positive and the
    prediction strength is good, the cluster
    discovery and prediction are considered good
  • Iterate if you want to find finer classes

17
K-Means
  • Dataset is partitioned into K clusters randomly
  • For each data point, calculate the distance from
    the point to each cluster; if it is closest to
    its current cluster, leave it there, otherwise
    move it to the closest cluster
  • Repeat until stable
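
The three steps above can be sketched as follows in Python. The slide does not say how the distance to a cluster is measured; this sketch assumes squared Euclidean distance to the cluster centroid, and it reseeds an empty cluster from a random point, which is also an assumption. The function name kmeans and the toy points are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Return (assignment, centroids) after iterating until stable."""
    rng = random.Random(seed)
    # Step 1: partition the data set into k clusters at random.
    assignment = [rng.randrange(k) for _ in points]
    centroids = []
    for _ in range(max_iter):
        centroids = []
        for cluster in range(k):
            members = [p for p, a in zip(points, assignment) if a == cluster]
            if not members:
                members = [rng.choice(points)]  # reseed an empty cluster
            centroids.append([sum(col) / len(members) for col in zip(*members)])
        # Step 2: move every point to the cluster with the closest centroid.
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Step 3: stop once no point changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
    return assignment, centroids


# Hypothetical points forming two well-separated groups.
points = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.2],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.3],
]
assignment, centroids = kmeans(points, k=2)
print(assignment)
print(centroids)
```

Which numeric label each group receives depends on the random initial partition, so only the grouping itself is meaningful.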

18
Principal Components Analysis
  • A transform that chooses a new coordinate system
    for the data set such that the greatest variance
    comes to lie on the first axis (the first
    principal component), the 2nd greatest variance
    on the 2nd axis, etc.
  • Can be used to reduce dimensionality by
    eliminating later principal components
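
To make the idea concrete, here is a small Python sketch that finds only the first principal component and projects the data onto it. It uses power iteration on the sample covariance matrix instead of a full eigendecomposition, which is a simplification chosen to keep the example dependency-free; the function names and the toy data are assumptions.

```python
import math


def first_principal_component(data, iterations=200):
    """Return (direction, variance along it, mean-centered data)."""
    n, dim = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(dim)]
    centered = [[row[j] - means[j] for j in range(dim)] for row in data]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]
    # Power iteration: repeatedly multiply by the covariance matrix and
    # renormalize. The all-ones start is an assumption; it fails if it is
    # exactly orthogonal to the leading direction.
    v = [1.0] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(dim) for j in range(dim))
    return v, variance, centered


def project(centered, direction):
    """Reduce each centered row to one coordinate along the direction."""
    return [sum(a * b for a, b in zip(row, direction)) for row in centered]


# Hypothetical two-dimensional data lying exactly on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, variance, centered = first_principal_component(data)
print(direction, variance)
print(project(centered, direction))
```

Because the toy data lie on a single line, the first component captures all of the variance, and dropping the second component loses no information.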

19
What This Means
  • Diagnostic Tools
  • Use in diagnosis of other diseases
  • Look for toxins in environment
  • Decoding regulatory networks
  • Use of time sensitive data
  • Use of stress data
  • Drug discovery
  • New classifications of disease

20
Future Work
  • Algorithm research
  • How do we gear experiments to maximize the amount
    of information we get?