Analysis of Gene Expression Metadata In Studies of HNSCC and LSCC Louise Showe Molecular and Cellula - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Analysis of Gene Expression Metadata In Studies of HNSCC and LSCC
Louise Showe
Molecular and Cellular Oncogenesis Program
2
  • Cancer Functional Genomics
  • Gene Expression Microarrays
  • Classification, Diagnosis, Prognosis, Response to
    Treatment
  • Who will analyze the data?

3
Molecular Biology Computational Biology
4
(No Transcript)
5
Data Pre-processing
6
Classification Class Discovery
7
Biomarker Selection

8
Recursive Feature Elimination (RFE) Iteratively Discards Genes That
Contribute Least
  • Univariate Filter Approach: Data, Univariate Feature Selection,
    Learning Machine
  • Multivariate Wrapper Approach: Data, Learning Machine, Multivariate
    Feature Selection
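A minimal sketch of the elimination loop, not the PDA-RFE implementation used in the study. It assumes a plain logistic regression fitted by gradient descent as a stand-in learning machine, and the toy data and the names `train_logistic` and `rfe` are illustrative only.

```python
import math


def train_logistic(X, y, lr=0.1, epochs=300):
    """Fit logistic regression by batch gradient descent; return (weights, bias)."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = yi - p
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad_w)]
        b += lr * grad_b / len(X)
    return w, b


def rfe(X, y, n_keep):
    """Return the indices of the n_keep features that survive elimination."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        # Refit the learning machine on the features still in play.
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = train_logistic(X_sub, y)
        # Discard the feature whose weight has the smallest magnitude.
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]
    return active


# Hypothetical toy data: 8 samples x 4 "genes"; only gene 0 tracks the class.
X = [
    [-1.0, -0.2, 1.0, -1.0], [-1.2, 0.1, -1.0, 1.0],
    [-0.8, -0.1, 1.0, 1.0], [-1.0, 0.2, -1.0, -1.0],
    [1.0, 0.2, 1.0, -1.0], [1.2, -0.1, -1.0, 1.0],
    [0.8, 0.1, 1.0, 1.0], [1.0, -0.2, -1.0, -1.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]
selected = rfe(X, y, n_keep=1)
```

Because the model is refitted after every removal, each gene is ranked in the context of the genes that remain. A univariate filter scores every gene once, in isolation.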
9
Resampling Methods
  • Goals
  • estimate the value and uncertainty of model
    parameters
  • identify the best classification model for the
    dataset
  • boost the probability of classification of
    hard samples
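The first goal (estimating a value and its uncertainty) can be sketched with a simple bootstrap. This is an illustration only: the expression values are hypothetical and the slides do not say which resampling scheme was used.

```python
import random
import statistics


def bootstrap_mean(values, n_resamples=2000, seed=0):
    """Return (bootstrap estimate of the mean, bootstrap standard error)."""
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Draw n values with replacement and record the resample mean.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(sum(resample) / n)
    return statistics.mean(means), statistics.stdev(means)


# Hypothetical log-expression values of one gene across 8 samples.
expression = [2.1, 2.5, 1.9, 3.0, 2.7, 2.2, 2.8, 2.4]
estimate, std_error = bootstrap_mean(expression)
```

The spread of the resampled means serves as the uncertainty estimate. The same loop can wrap any statistic, including a classifier's accuracy.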

10
Biomedical Problem
  • Patients with previous HNSCC are at risk for both
    lung metastases and new primary lung cancers.
  • Distinction between a primary LSCC and a HNSCC
    lung metastasis can be very difficult.
  • Clinical dilemma: depending on the origin,
    patients have drastically different treatment
    options and prognoses.

11
Can We Distinguish Primary Lung SCC Tumors From
Head and Neck Derived SCC Metastases?
12
Study Schema
18 HNSCC and 10 LSCC (Training Set For Gene Selection on U133A)
Identify Best Diagnostic Genes
Validate Genes On Independent Data Sets (Data for 122 Samples, 4
Sources, 2 Different Affy Chips)
Test On Lung Nodules From Patients With Previous History Of HNSCC
(MSKCC)
13
Top genes by PDA-RFE
Top PDA genes have distinct profiles. Top genes
selected by t-test alone have a common profile.
14
Penn Data Set U133A
10 Genes Can Accurately Classify Penn Samples
15
Summary
  • Excellent classification accuracy on small data
    set.
  • Could it be validated on new samples?

16
External Validation Datasets
  • U133A
  • 40 HNSCC Samples from Minnesota (MN)
  • U95Av2
  • 21 LSCC Samples from Dana-Farber (DF)
  • 11 LSCC Samples from Columbia (CU)
  • Note: Only the 9530 probe sets common between
    U95Av2 and U133A were considered. Raw data
    were re-analyzed (RMA).
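Restricting two platforms to their shared probe sets amounts to a key intersection. The sketch below assumes probe sets on both chips have already been mapped to a common key. The slides do not describe that mapping, and the gene keys and values here are hypothetical.

```python
def restrict_to_common(platform_a, platform_b):
    """Keep only the keys present on both platforms.

    Each platform is a dict {probe_key: [expression value per sample]}.
    Returns the sorted common keys plus both dicts in that same key order.
    """
    common = sorted(set(platform_a) & set(platform_b))
    a = {key: platform_a[key] for key in common}
    b = {key: platform_b[key] for key in common}
    return common, a, b


# Hypothetical keys and values; real data would come from RMA output.
u133a = {"GENE_A": [7.1, 6.8], "GENE_B": [5.0, 5.3], "GENE_C": [9.2, 9.0]}
u95av2 = {"GENE_B": [4.1], "GENE_C": [8.5], "GENE_D": [6.6]}
common, a, b = restrict_to_common(u133a, u95av2)
```

Keeping both dictionaries in the same key order makes it straightforward to stack the samples into one matrix for clustering or classification.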

17
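Unsupervised Clustering Shows Strong Systematic
Bias Due To Chip Type And Institution
(Figure: clustering of sample groups LC.DF, LC.CU, HN.MN, HN.UP and LC.UP,
labeled by chip type (U95Av2, U133A), institution (Penn, Minnesota, CU,
Dana-Farber) and cancer type (Head and Neck, Lung))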
18
DWD Visualization
courtesy of S. Marron, UNC
19
DWD Correction
  • Correct Bias Due To Different Hybridization
    Sites
  • 1. Merge Dana-Farber Set with Columbia Set
    (same chip U95, same phenotype LSCC)
  • 2. Merge Minnesota Set with Penn Head and Neck Set
    (same chip U133, same phenotype HNSCC)
  • Correct Bias Due To Affy Chip Type And Cancer
    Type
  • 3. Merge Penn LSCC Set with combined Dana-Farber
    and Columbia Sets
    (different chips U133 vs. U95, same phenotype)
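Each merge step removes a shift between two batches along a separating direction. The sketch below is a simplified stand-in, not DWD itself: it uses the difference between batch means as the direction, whereas DWD obtains the direction by solving an optimization problem. Function names and data are hypothetical.

```python
import math


def column_means(rows):
    """Mean of each column (gene) over the rows (samples)."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def shift_along_direction(batch_a, batch_b):
    """Remove each batch's mean projection onto the mean-difference direction."""
    mean_a, mean_b = column_means(batch_a), column_means(batch_b)
    diff = [a - b for a, b in zip(mean_a, mean_b)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        return batch_a, batch_b  # batch means already coincide
    direction = [d / norm for d in diff]

    def corrected(batch, mean):
        # Scalar projection of the batch mean onto the unit direction.
        offset = sum(m * d for m, d in zip(mean, direction))
        return [[x - offset * d for x, d in zip(row, direction)] for row in batch]

    return corrected(batch_a, mean_a), corrected(batch_b, mean_b)


# Hypothetical batches: 2 samples x 2 genes each.
batch_a = [[1.0, 2.0], [3.0, 4.0]]
batch_b = [[5.0, 1.0], [7.0, 3.0]]
fixed_a, fixed_b = shift_along_direction(batch_a, batch_b)
```

Every sample in a batch is moved by the same vector, so the differences between samples within a batch are preserved and only the offset between batches changes.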

20
DWD Corrected For Systematic Bias
(Figure: clustering with groups labeled Head and Neck, Lung)
Samples cluster by cancer type and not by chip
type or institution
21
Reduction in systematic bias improved global
correlation
(Figure panels: Before Correction, After Correction)
22
Reduction in systematic bias regularized
classification
(Figure panels: After DWD, Before DWD)
23
10-Gene Classification of Independent Test Set
(Figure: samples labeled Head and Neck or Lung, from Dana-Farber, CU and
Minnesota)
24
Study at MSKCC
  • Training Set
  • 52 subjects
  • 31 HNSCC
  • 18 LSCC
  • Test Set
  • 12 lung nodules in patients with prior HNSCC

Talbot et al, Cancer Res (2005) 65 (8)
25
Methods used
  • Gene Selection by t-test
  • Classification by Support Vector Machines (SVM)
  • Accuracy by Leave-one-out cross-validation (LOOCV)

Main Conclusions
  • A minimum set of 500 genes is needed for robust
    classification
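The pipeline listed on this slide (t-test gene selection, a classifier, leave-one-out cross-validation) can be sketched as below. This is an illustration under assumptions: a nearest-centroid rule stands in for the SVM, the data are hypothetical, and gene selection is repeated inside each fold so that the held-out sample never influences which genes are chosen.

```python
import math
import statistics


def welch_t(a, b):
    """Welch t-statistic for two lists of values."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0


def top_genes(X, y, k):
    """Indices of the k genes with the largest |t| between classes 0 and 1."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(welch_t(a, b)))
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]


def nearest_centroid(X, y, genes, sample):
    """Predict the class whose centroid over `genes` is closest to `sample`."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y)):
        rows = [row for row, l in zip(X, y) if l == label]
        dist = sum(
            (sample[j] - statistics.mean(r[j] for r in rows)) ** 2 for j in genes
        )
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, k):
    """Fraction of samples classified correctly when each is held out in turn."""
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = top_genes(X_train, y_train, k)  # selection inside the fold
        if nearest_centroid(X_train, y_train, genes, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical toy data: 8 samples x 3 genes; gene 0 separates the classes.
X = [
    [1.0, 5.0, 3.1], [1.2, 4.0, 2.9], [0.9, 5.5, 3.0], [1.1, 4.5, 3.2],
    [3.0, 5.1, 3.0], [3.2, 4.2, 3.1], [2.9, 5.4, 2.9], [3.1, 4.4, 3.2],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]
accuracy = loocv_accuracy(X, y, k=1)
```

Selecting genes once on the full data set and then cross-validating only the classifier would leak information from the held-out sample and tend to inflate the accuracy estimate.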

26
10-Gene Classification of Talbot et al Samples
(Figure: samples labeled Head and Neck or Lung)
27
10 genes on MSKCC Samples
28
PDA Classification of 50 New Lung and 72 New HN
Samples Is 96% Accurate With 10 Genes
29
Additional Test Set Lung Nodules from patients
with prior HNSCC
  • 13 samples from patients with prior HNSCC
  • 11 samples clinically classified as primary lung
    cancer (U01-U11)
  • 2 samples (lung nodule and pancreatic nodule from
    the same patient) classified as metastases (U12,
    U13)

30
10 gene classifier predictions agree with the
clinical assessment
Based on Data from Talbot et al Cancer Res. 2005
31
10 genes separate HNSCC from adjacent tissue
(Figure: samples labeled HNSCC or Adjacent Tissue (donor matched))
32
Validation by QRT-PCR
33
Expression Ratios From QPCR For Selected Gene
Pairs Correctly Classify New HNSCC and LSCC Samples
9 7
Vachani, Nebozhyn et al. Submitted to Cancer
Research
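A gene-pair ratio rule can be sketched as a comparison of two expression values on the log2 scale. Everything specific below is an assumption for illustration: the slides do not name the gene pairs or the thresholds, or say which direction of the ratio indicates which tumor type.

```python
import math


def log2_ratio(expr_a, expr_b):
    """Log2 ratio of two relative expression values from the same sample."""
    return math.log2(expr_a / expr_b)


def classify_by_ratio(expr_a, expr_b, threshold=0.0):
    """Call 'HNSCC' when gene A exceeds gene B by more than `threshold`
    on the log2 scale, otherwise 'LSCC'. The mapping of ratio direction
    to tumor type is hypothetical."""
    return "HNSCC" if log2_ratio(expr_a, expr_b) > threshold else "LSCC"


# Hypothetical relative expression values for one gene pair in two samples.
call_1 = classify_by_ratio(8.0, 2.0)
call_2 = classify_by_ratio(1.0, 4.0)
```

Since both genes are measured in the same sample, a ratio of this kind does not depend on the sample's overall signal level.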
34
Summary
  • Microarray analysis enables diagnostics for HNSCC
    vs. LSCC with just a few genes
  • Using different analysis methods can make a big
    difference in the results: 10 genes vs. 500 for
    HNSCC
  • Selection of biomarkers by RFE is much superior
    to t-test
  • When combining data from separate experiments,
    observed batch effect needs to be addressed and
    alleviated
  • Resampling and validation on independent data
    sets and experimental platforms are crucial for
    assessing the reliability and reproducibility of
    the results
  • Close clinical, biological and mathematical
    collaboration is essential for proper design and
    analysis of experiments

35
  • Wistar
  • Michael Showe
  • Michael Nebozhyn
  • Malik Yousef
  • Wen-Hwai Horng
  • Linda Alila
  • UNC
  • Steve Marron
  • Everett Zhou
  • Xuxin Liu
  • Univ of Pennsylvania
  • Steve Albelda
  • Anil Vachani
  • Charles Powell (Columbia)
  • Patrick Gaffney (U. of Minn)
  • Bhuvanesh Singh (MSKCC)
  • Matt Myerson (Harvard)
  • Ruth Muschel (Penn)
