Gene expression profiling identifies molecular subtypes of gliomas

1
Gene expression profiling identifies molecular
subtypes of gliomas
  • Ruty Shai, Tao Shi, Thomas J Kremen, Steve
    Horvath, Linda M Liau, Timothy F Cloughesy, Paul
    S Mischel and Stanley F Nelson
  • Presented by Stephanie Tsung

2
Outline
  • Descriptions of Data
  • Statistical Methods
  • Multidimensional Scaling Plot
  • Hierarchical Clustering
  • K-means Clustering
  • Gene Filtering/Selection
  • Predictor Comparison
  • Conclusion / Future work

3
Background
  • Brain tumors can be classified by tumor origin,
    cell type of origin, tumor site, etc.
  • Tumor classification has been critical in
    treatment selection and outcome prediction.
    However, current classification methods are still
    far from perfect.
  • DNA microarrays, a newer technology, have been
    introduced to classify cancers on the basis of
    gene expression levels.

4
Background: Cancer Classification
  • Cancer classification can be divided into two
    challenges: class discovery and class prediction.
  • Class discovery refers to defining previously
    unrecognized tumor subtypes.
  • Class prediction refers to the assignment of
    particular tumor samples to already-defined
    classes.

5
Objectives
  • To test whether gene expression measurements can
    be used to classify different brain tumors
  • To determine sets of significant genes that
    distinguish brain tumors of different
    pathological types, grades and survival times
  • To validate the selected informative genes in
    brain tumor classification and prediction.

6
Data and Pre-Processing
  • Affymetrix HG-U95Av2 chips
  • 12,555 genes (probe sets) and 42 samples in total
  • Tumor types, with sample counts in parentheses:
    N (7), O (3), D (18), A (2), AA (3), P (9)
  • Data pre-processing
  • Each tumor was examined by a neuropathologist and
    dissected into two portions: one for tissue
    diagnosis and one for RNA extraction.
  • Normalization and model-based expression indices
    were computed in dChip.

7
Q. Are the global transcriptional signatures of
the different pathologic subtypes of gliomas
molecularly distinct?
8
Multidimensional Scaling Plot (MDS Plot)
  • To uncover the hidden structure of data.
  • D(N) -> D(2)
  • Dimension reduction technique
  • Maps the 12,555-dimensional expression space to a
    low-dimensional Euclidean space
  • Explains observed similarities and dissimilarities
    between objects, measured by correlation,
    Euclidean distance, etc.
  • R: cmd1 <- cmdscale(dist(dat1[, 1:30]), k = 2, eig = T)
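
The dimension reduction described above can be sketched outside R as well. The following is a minimal Python sketch of classical (Torgerson) MDS, the technique that R's cmdscale performs. The function name classical_mds and the power-iteration eigen-solver are illustrative assumptions, not what the slides or R use, and the solver assumes Euclidean input distances.

```python
import math

def classical_mds(dist, k=2, iters=500):
    """Classical (Torgerson) MDS of a symmetric distance matrix.

    Returns one k-dimensional coordinate list per object.
    Hypothetical helper for illustration; assumes Euclidean distances.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * k for _ in range(n)]
    for dim in range(k):
        # Deterministic, normalized start vector for power iteration
        v = [math.sin(i + dim + 1.0) for i in range(n)]
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the unit vector v
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][dim] = scale * v[i]
        # Deflate so the next pass finds the next eigenvector
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords
```

Given a sample-by-sample distance matrix d, classical_mds(d, k=2) returns two coordinates per sample, which is the kind of input an MDS scatter plot needs.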

9
MDS Plot
Figure 1. (a) Multidimensional scaling plot of all
42 tissue samples plotted in two-dimensional
space using expression values from all 12,555
probe sets.
10
Hierarchical Clustering
  • Evaluate all pairwise distances between objects
  • Look for the pair with the shortest distance
  • Construct a new object by averaging the two
    objects
  • Evaluate the distances from the new object to all
    other objects, then repeat from the second step
  • R: h1 <- hclust(dist(x), method = "average")
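
The merge loop above can be sketched in a few lines of Python. This sketch follows the average-linkage rule named in the R call (the distance between two clusters is the mean of all pairwise distances between their members), not a literal averaging of the two objects. The function names are illustrative assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(points):
    """Agglomerative clustering with average linkage (UPGMA).

    Returns the merge history as (members_a, members_b, distance)
    tuples, where members are indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Mean of all pairwise distances between the two clusters
                total = sum(euclidean(points[i], points[j])
                            for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        # Replace the two merged clusters by their union
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return merges
```

The merge history plays the role of the dendrogram: reading it from the first entry to the last gives the order in which samples and groups were joined.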

11
Hierarchical Clustering
Figure 1. (b) The same 42 tissue samples were
grouped into hierarchical clusters. Tissue
samples are color-coded.
Clusters I vs. II: P = 0.00006 (Fisher's exact test).
Clusters III vs. IV: P = 0.00001.
12
Fisher's Exact Test
Sample   w/ characteristic   w/o characteristic   Total
1        A                   B                    A+B
2        C                   D                    C+D
Total    A+C                 B+D                  N
H0: the proportion of interest does not differ
between the two groups.
The two-tailed probability = .326 + .007 + .093 +
.163 + .019 = .608
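
The two-tailed sum can be reproduced for any 2x2 table. Below is a minimal Python sketch, assuming the common convention of adding up every table with the same margins that is no more probable than the observed one. The function names are illustrative, and the counts in the usage note are hypothetical, not the slide's data.

```python
from math import comb

def table_probability(a, b, c, d):
    """Hypergeometric probability of one 2x2 table with fixed margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test p-value for the table
    [[a, b], [c, d]]: sums the probabilities of all tables with the
    same margins that are no more probable than the observed table."""
    observed = table_probability(a, b, c, d)
    row1, col1, n = a + b, a + c, a + b + c + d
    lowest = max(0, col1 - (n - row1))  # smallest feasible top-left cell
    highest = min(row1, col1)           # largest feasible top-left cell
    p = 0.0
    for x in range(lowest, highest + 1):
        px = table_probability(x, row1 - x, col1 - x, n - row1 - col1 + x)
        # Small tolerance so equally probable tables are not lost to rounding
        if px <= observed * (1 + 1e-9):
            p += px
    return p
```

For example, fisher_exact_two_tailed(3, 1, 1, 3) evaluates a hypothetical table with row totals 4 and 4 and column totals 4 and 4.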
13
Q. Can we uncover these subtypes without prior
knowledge?
  • i.e., how many categories of gliomas are suggested
    by the gene expression data?

14
K-means Clustering
  • To find a K-partition of the observations that
    minimizes the within-cluster sum of squares (WSS),
    summed over all clusters
  • The number of clusters, k, needs to be
    pre-specified.
  • Tibshirani's prediction strength can be used to
    determine the optimal k.
  • R: cl1 <- kmeans(x, 3)
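
The partitioning idea can be sketched with Lloyd's algorithm, which alternates an assignment step and a center update step. This is a minimal Python sketch, assuming random initial centers drawn from the data. R's kmeans defaults to a different algorithm variant, so results need not match it exactly, and the function names are illustrative.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm. Returns (labels, centers)."""
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct observations
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # no change, so converged
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members)
                              for col in zip(*members)]
    return labels, centers

def within_ss(points, labels, centers):
    """Total within-cluster sum of squares (WSS)."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
```

Because the result depends on the starting centers, it is common to run several seeds and keep the run with the smallest within_ss.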

15
Figure 2. Grouping of tumors. All tumor samples
were plotted by multidimensional scaling using
all 12,555 probe sets. We performed
nonhierarchical k-means clustering (Kaufman and
Rousseeuw, 1990).
16
Gene Filtering/Selection
  • To find the genes that are differentially
    expressed in 6 two-group comparisons
  • Using the top 30 genes based on the t-test
  • 170 most differentially expressed genes using
    the t-test
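
Ranking genes by a two-group t-test can be sketched as follows. The slides do not say which t-test variant was used, so this Python sketch assumes Welch's unequal-variance statistic and ranks by its absolute value. The data layout (a dict from gene name to per-sample values) and the function names are illustrative assumptions.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch two-sample t statistic (unequal variances assumed)."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    diff = mean(x) - mean(y)
    if se == 0.0:
        # Degenerate case: no spread in either group
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se

def top_genes(expr, group_a, group_b, n_top):
    """Rank genes by |t| between two groups of samples.

    expr maps a gene name to a list of expression values, one per
    sample; group_a and group_b are lists of sample indices.
    """
    scores = {}
    for gene, values in expr.items():
        a = [values[i] for i in group_a]
        b = [values[i] for i in group_b]
        scores[gene] = abs(welch_t(a, b))
    return sorted(scores, key=scores.get, reverse=True)[:n_top]
```

Calling top_genes once per two-group comparison with n_top=30 would mirror the "top 30 genes" step, under the assumptions stated above.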

17
Predictor Comparison
  • Compare the performance of predictors
  • Gene Vote
  • Leave-one-out cross-validation error rates were
    calculated.
  • For a given method and sample size, n, a
    classifier is generated using (n - 1) cases and
    tested on the single remaining case. This is
    repeated n times, each time designing a
    classifier by leaving one case out. Thus, each
    case in the sample is used as a test case, and
    each time nearly all the cases are used to
    design a classifier.
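
The leave-one-out procedure can be sketched independently of the predictor being compared. This minimal Python sketch uses a nearest-centroid classifier purely as a stand-in; it is an assumption for illustration, not the Gene Vote predictor or any other method from the slides.

```python
def fit_centroids(samples, labels):
    """Stand-in classifier: one mean vector per class."""
    centroids = {}
    for label in sorted(set(labels)):
        rows = [s for s, lab in zip(samples, labels) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, sample):
    """Assign the class whose centroid is nearest (squared Euclidean)."""
    def dist(label):
        return sum((x - c) ** 2 for x, c in zip(sample, centroids[label]))
    return min(centroids, key=dist)

def loocv_error(samples, labels):
    """Leave-one-out cross-validation error rate.

    Each of the n cases is held out once; the classifier is built on
    the remaining n - 1 cases and tested on the held-out case.
    """
    errors = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit_centroids(train_x, train_y)
        if predict(model, samples[i]) != labels[i]:
            errors += 1
    return errors / len(samples)
```

Swapping fit_centroids and predict for another predictor leaves loocv_error unchanged, which is what allows error rates to be compared across methods.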

18
Table 1.
19
Using 170 filtered genes based on t-test
20
Table 2.
21
Conclusion
  • Performed MDS plots and K-means clustering
    analysis and found evidence for three clusters:
    glioblastomas, lower-grade astrocytomas, and
    oligodendrogliomas (p < 0.00001).
  • A relatively small number of genes can be used
    to distinguish between molecular subtypes.
  • Subsets of gliomas could potentially be used for
    patient stratification and may suggest potential
    targets for treatment.

22
Future Directions
  • Construct predictors using different gene
    selection methods.
  • Validate the selected genes with new tumor
    samples.

23
K = 3 gave us the best prediction power
24
Statistical problems in response-based classification
  • Identification of new or unknown classes:
    unsupervised learning
  • Classification into known classes: supervised
    learning
  • Identification of best predictor variables:
    variable selection, e.g. marker genes in
    microarray data (gene voting, hierarchical
    clustering)