Gene expression profiling identifies molecular subtypes of gliomas

1
Gene expression profiling identifies molecular
subtypes of gliomas

Ruty Shai, Tao Shi, Thomas J Kremen, Steve
Horvath, Linda M Liau, Timothy F Cloughesy, Paul
S Mischel and Stanley F Nelson
Presented by Stephanie Tsung

2
Outline

Descriptions of Data
Statistical Methods
Multidimensional Scaling Plot
Hierarchical Clustering
K-means Clustering
Gene Filtering/Selection
Predictor Comparison
Conclusion/ Future works

3
Background

Brain tumors can be classified by tumor origin,
cell type of origin, tumor site, etc.
Tumor classification has been critical in
treatment selection and outcome prediction.
However, current classification methods are still
far from perfect.
DNA microarrays, a newer technology, have been
introduced to classify cancers on the basis of
gene expression levels.

4
Background: Cancer Classification

Cancer classification can be divided into two
challenges: class discovery and class prediction.
Class discovery refers to defining previously
unrecognized tumor subtypes.
Class prediction refers to the assignment of
particular tumor samples to already-defined
classes.

5
Objectives

To test whether gene expression measurements can
be used to classify different brain tumors
To determine sets of significant genes that
distinguish brain tumors of different
pathological types, grades and survival times
To validate the selected informative genes in
brain tumor classification and prediction.

6
Data and Pre-Processing

Affymetrix HG-U95Av2 chips
12,555 genes (probe sets) and 42 samples in total
Tumor types (number of samples)
N (7), O (3), D (18), A (2), AA (3), P (9)
Data pre-processing
Each tumor was examined by a neuropathologist and
dissected into two portions: one for tissue
diagnosis and one for RNA extraction.
Normalization and model-based expression indices
were computed in dChip.

7
Q. Are the global transcriptional signatures of
the different pathologic subtypes of gliomas
molecularly distinct?
8
Multidimensional Scaling Plot (MDS Plot)

To uncover the hidden structure of the data.
D(N) -> D(2)
Dimension reduction technique
From the 12,555-dimensional space to a
low-dimensional Euclidean space
Explains observed similarities and dissimilarities
between objects, measured by, e.g., correlation
or Euclidean distance.
R: cmd1 <- cmdscale(dist(dat1,130), k = 2, eig = T)
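
The dimension-reduction step can be sketched in a few lines. This is a minimal illustration of classical (metric) MDS, written in plain Python rather than the R call on the slide; the function names, the power-iteration eigen-solver and the toy points are assumptions made for the example, not the presenter's code.

```python
import math
import random


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def classical_mds(points, k=2, n_iter=500, seed=0):
    """Embed points in k dimensions, preserving Euclidean distances as well as possible."""
    n = len(points)
    # Squared Euclidean distances between all pairs of samples.
    d2 = [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    # Double centering turns squared distances into an inner-product matrix B.
    row_mean = [sum(row) / n for row in d2]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * k for _ in range(n)]
    for axis in range(k):
        # Power iteration finds the leading eigenvector of the current B.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(n_iter):
            w = mat_vec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        eigval = sum(x * y for x, y in zip(v, mat_vec(b, v)))
        if eigval <= 1e-9:
            continue  # no variance left; this axis stays at zero
        for i in range(n):
            coords[i][axis] = v[i] * math.sqrt(eigval)
        # Deflation removes the component just found before the next axis.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Hypothetical toy data: four samples measured on three variables.
toy = [(0.0, 0.0, 1.0), (3.0, 0.0, 1.0), (0.0, 4.0, 1.0), (3.0, 4.0, 1.0)]
embedding = classical_mds(toy, k=2)
```

With 42 samples the matrix B is only 42 x 42, so even this simple solver is fast; the 12,555 probe sets enter only through the pairwise distances.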

9
MDS Plot
Figure 1. (a) Multidimensional scaling plot of all
42 tissue samples plotted in two-dimensional
space using expression values from all 12,555
probesets.
10
Hierarchical Clustering

1. Evaluate all pairwise distances between objects.
2. Look for the pair with the shortest distance.
3. Construct a new object by averaging the two
objects.
4. Evaluate the distance from the new object to all
other objects and go to Step 2.
R: h1 <- hclust(dist(x), method = "average")
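
The merge loop can be sketched as follows, in plain Python rather than R. It assumes average linkage, as in the slide's `method = "average"`, which averages the pairwise distances between two clusters instead of averaging the objects themselves; the function name and the toy points are illustrative only.

```python
from math import dist  # Euclidean distance, Python 3.8+


def average_linkage(points):
    """Agglomerative clustering; returns a list of merges (members_a, members_b, distance)."""
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest average pairwise distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(dist(points[a], points[b])
                            for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with their union and repeat.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return merges


# Hypothetical toy data: two tight pairs far apart.
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 3.0)]
toy_merges = average_linkage(toy_points)
```

The list of merges, read in order with its distances, is the information a dendrogram draws.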

11
Hierarchical Clustering
Figure 1. (b) The same 42 tissue samples were
grouped into hierarchical clusters. Tissue
samples are color-coded.
I, II: P = 0.00006 (Fisher's exact test); III, IV:
P = 0.00001
12
Fisher's Exact Test
Sample   w/ characteristic   w/o characteristic   Total
1        A                   B                    A + B
2        C                   D                    C + D
Total    A + C               B + D                N
H0: the proportion with the characteristic is the
same in the two groups.
The two-tailed probability = .326 + .007 + .093 +
.163 + .019 = .608
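
The two-tailed probability can be computed directly from the table counts. The sketch below, in plain Python, sums the hypergeometric probabilities of every table with the same margins that is no more probable than the observed one, which is one common definition of the two-sided test; the function names and example counts are hypothetical.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of one 2x2 table given its row and column totals."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    p_observed = table_probability(a, b, c, d)
    row1, col1, n = a + b, a + c, a + b + c + d
    lowest = max(0, col1 - (c + d))   # smallest feasible count in cell A
    highest = min(row1, col1)         # largest feasible count in cell A
    total = 0.0
    for x in range(lowest, highest + 1):
        p = table_probability(x, row1 - x, col1 - x, n - row1 - col1 + x)
        # Small relative tolerance guards against floating-point ties.
        if p <= p_observed * (1 + 1e-9):
            total += p
    return total


# Hypothetical counts: 3 of 3 samples in group 1 and 0 of 3 in group 2
# carry the characteristic.
p_value = fisher_exact_two_tailed(3, 0, 0, 3)
```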
13
Q. Can we uncover these subtypes without prior
knowledge?

i.e. How many categories of gliomas are suggested
by the gene expression data?

14
K-means Clustering

To find a K-partition of the observations that
minimizes the within-cluster sum of squares (WSS)
The number of clusters, k, needs to be
pre-specified.
Tibshirani's prediction strength can be used to
determine the optimal k.
R: cl1 <- kmeans(x, 3)
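
A minimal sketch of the K-means idea, in plain Python rather than R, using Lloyd's algorithm: assign each sample to its nearest center, recompute the centers, and repeat until the assignment stops changing. The function names, the random initialization and the toy data are assumptions for illustration, and this version may stop at a local minimum of the WSS.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns (labels, centers, within-cluster sum of squares)."""
    rng = random.Random(seed)
    # Start from k distinct samples chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: squared_distance(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    wss = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, wss


# Hypothetical toy data: two well-separated pairs of samples.
toy_samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
toy_labels, toy_centers, toy_wss = kmeans(toy_samples, k=2)
```

Because the result depends on the starting centers, running the algorithm from several random starts and keeping the partition with the smallest WSS is a common precaution.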

15
Figure 2. Grouping of tumors. All tumor samples
were plotted using multidimensional scaling using
all 12,555 probesets. We performed
nonhierarchical K-means clustering (Kaufman and
Rousseeuw, 1990).
16
Gene Filtering/Selection

To find genes that are differentially expressed
in 6 two-group comparisons
Using the top 30 genes based on the t-test
170 most differentially expressed genes using
the t-test
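
Ranking genes by a t-statistic for one two-group comparison can be sketched as below, in plain Python. The slides do not say which variant of the t-test was used, so the Welch (unequal-variance) form here is an assumption, as are the function names, the gene names and the expression values.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def t_statistic(group_a, group_b):
    """Welch two-sample t statistic (assumed variant; needs at least 2 values per group)."""
    diff = mean(group_a) - mean(group_b)
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    if se == 0.0:
        # Both groups are constant: no evidence if equal, unbounded if different.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / se


def top_genes(expression, labels, class_a, class_b, n_top):
    """Return the n_top genes with the largest absolute t statistic.

    expression maps a gene name to a list of values, one per sample;
    labels gives the class of each sample in the same order.
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == class_a]
        b = [v for v, lab in zip(values, labels) if lab == class_b]
        scores[gene] = abs(t_statistic(a, b))
    return sorted(scores, key=scores.get, reverse=True)[:n_top]


# Hypothetical toy data: two genes measured on three samples per class.
toy_expression = {
    "gene_1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "gene_2": [1.0, 5.0, 3.0, 2.0, 4.0, 6.0],
}
toy_classes = ["A", "A", "A", "B", "B", "B"]
selected = top_genes(toy_expression, toy_classes, "A", "B", n_top=1)
```

Repeating the call for each of the 6 two-group comparisons and pooling the selected genes is one way a combined gene list could be assembled.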

17
Predictor Comparison

Compare the performance of predictors
Gene Vote
Leave-one-out cross-validation error rates were
calculated.
For a given method and sample size, n, a
classifier is generated using (n - 1) cases and
tested on the single remaining case. This is
repeated n times, each time designing a
classifier by leaving one case out. Thus, each
case in the sample is used as a test case, and
each time nearly all the cases are used to
design a classifier.
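
The leave-one-out procedure can be sketched as below, in plain Python. A nearest-centroid classifier stands in for the predictors compared on the slides (it is not the Gene Vote method), and all names and data are hypothetical.

```python
def fit_centroids(samples, labels):
    """Compute the mean expression profile (centroid) of each class."""
    centroids = {}
    for lab in set(labels):
        members = [s for s, l in zip(samples, labels) if l == lab]
        centroids[lab] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, sample):
    """Assign a sample to the class with the closest centroid."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(sample, centroids[lab])),
    )


def loocv_error_rate(samples, labels):
    """Train on n - 1 cases, test on the one left out, and repeat n times."""
    errors = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit_centroids(train_x, train_y)
        if predict(model, samples[i]) != labels[i]:
            errors += 1
    return errors / len(samples)


# Hypothetical toy data: one gene, five samples, the last one atypical for its class.
toy_x = [(0.0,), (1.0,), (10.0,), (11.0,), (0.5,)]
toy_y = ["A", "A", "B", "B", "B"]
error_rate = loocv_error_rate(toy_x, toy_y)
```

If genes are filtered before classification, repeating the filtering inside each leave-one-out round, using only the n - 1 training cases, keeps the error estimate from being optimistic.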

18
Table 1.
19
Using 170 filtered genes based on t-test
20
Table 2.
21
Conclusion

MDS plots and K-means clustering analysis gave
evidence for three clusters: glioblastomas,
lower-grade astrocytomas, and oligodendrogliomas
(p < 0.00001).
A relatively small number of genes can be used
to distinguish between molecular subtypes.
Molecular subsets of gliomas could potentially be
used for patient stratification and to identify
potential targets for treatment.

22
Future Directions

Construct predictors using different gene
selection methods.
Validate the selected genes with new tumor
samples.

23
K = 3 gave us the best prediction power
24
Statistical problems in response-based
classification

Identification of new or unknown classes:
unsupervised learning
Classification into known classes: supervised
learning
Identification of best predictor variables:
variable selection, e.g. marker genes in
microarray data (gene voting, hierarchical
clustering)
