Title: An Array of FDA Efforts in Pharmacogenomics
1. An Array of FDA Efforts in Pharmacogenomics
- Weida Tong
- Director, Center for Toxicoinformatics, NCTR/FDA
- weida.tong@fda.hhs.gov
CAMDA 08, BOKU University, Vienna, Austria, Dec 4-6, 2008
2. Pipeline Problem: Spending More, Getting Less
While research spending (pharma and NIH) has increased, fewer NMEs and BLAs have been submitted to the FDA.
3. The FDA Critical Path to New Medical Products
- Pharmacogenomics and toxicogenomics have been identified as crucial in advancing:
  - Medical product development
  - Personalized medicine
4. Guidance for Industry: Pharmacogenomic Data Submissions
www.fda.gov/cder/genomics
www.fda.gov/cder/genomics/regulatory.htm
5. A Novel Data Submission Path: Voluntary Genomics Data Submission (VGDS)
- Defined in the Guidance for Industry on Pharmacogenomic (PGx) Data Submissions (draft released in 2003; final publication, 2005)
- To encourage sponsors to interact with the FDA through submission of PGx data on a voluntary basis
- To provide a forum for scientific discussions with the FDA outside of the application review process
- To establish a regulatory environment (both the tools and the expertise) within the FDA for receiving, analyzing and interpreting PGx data
6. VGDS Status
- A total of >40 submissions have been received
- The submissions contain PGx data from:
  - DNA microarrays
  - Proteomics
  - Metabolomics
  - Genotyping, including genome-wide association studies (GWAS)
  - Others
- Bioinformatics has played an essential role in accomplishing:
  - Objective 1: Data repository
  - Objective 2: Reproduce the sponsor's results
  - Objective 3: Conduct alternative analyses
7. FDA Genomic Tool ArrayTrack: Supporting FDA Regulatory Research and Review
- Developed by NCTR/FDA
  - An integrated solution for microarray data management, analysis and interpretation
  - Supports meta-analysis across various omics platforms and study data
  - SNPTrack, a sister product developed in collaboration with Rosetta
- FDA agency-wide application
  - Review tool for FDA VGDS data submissions
  - >100 FDA reviewers and scientists have participated in the training
  - Integrating with Janus for e-Submission
8. ArrayTrack: An Integrated Solution for Omics Research
[Diagram: ArrayTrack at the center, integrating omics data with clinical/non-clinical data and chemical data]
9. (No transcript)
10. Specific Functionality Related to VGDS
- Phenotypic anchoring
- Systems approach
[Screenshot: gene expression results linked to clinical pathology data; gene and clinical chemistry names hidden]
11. ArrayTrack: Freely Available to the Public
[Chart: number of unique users (web access and local installation), calculated quarterly]
- Free availability is consistent with common practice in the research community
- Over 10 training courses have been offered, including two in Europe
- Education: part of bioinformatics courses at UCLA, UMDNJ and UALR
- Eli Lilly chose ArrayTrack to support its clinical gene-expression studies after rigorously assessing its architecture, functionality, security and customer support
12. ArrayTrack Website
http://www.fda.gov/nctr/science/centers/toxicoinformatics/ArrayTrack/
13. MicroArray Quality Control (MAQC): An FDA-Led Community-Wide Effort to Address the Challenges and Issues Identified in VGDS
- QC issue: How good is good enough?
  - Assessing the best achievable technical performance of microarray platforms (QC metrics and thresholds)
- Analysis issue: Can we reach a consensus on analysis methods?
  - Assessing the advantages and disadvantages of various data analysis methods
- Cross-platform issue: Do different platforms generate different results?
  - Assessing cross-platform consistency
14. MAQC Way of Working
- Participants: Everyone was welcome; however, cutoff dates had to be imposed
- Cost-sharing: Every participant contributed, e.g., arrays, RNA samples, reagents, time and resources in generating and analyzing the MAQC data
- Decision-making: Face-to-face meetings (1st, 2nd, 3rd and 4th); biweekly, regular MAQC teleconferences (>20 times); smaller-scale teleconferences on specific issues (many)
- Outcome: Peer-reviewed publication following the normal journal-defined publication process; 9 papers submitted to Nature Biotechnology, 6 accepted and 3 rejected
- Transparency: MAQC data are freely available at GEO, ArrayExpress, and ArrayTrack; RNA samples are available from commercial vendors
15. MicroArray Quality Control (MAQC) Project: Phase I
- MAQC-I: Technical performance
  - Reliability of microarray technology
  - Cross-platform consistency
  - Reproducibility of microarray results
- MAQC-II: Practical application
  - Molecular signatures (or classifiers) for risk assessment and clinical application
  - Reliability, cross-platform consistency and reproducibility
  - Develop guidance and recommendations
[Timeline: MAQC-I, Feb 2005 to Sept 2006, 137 scientists from 51 organizations; MAQC-II, Sept 2006 to Dec 2008, >400 scientists from >150 organizations]
16. Results from the MAQC-I Study, Published in Nature Biotechnology in Sept/Oct 2006
- Six research papers (Nat. Biotechnol. 24(9) and 24(10s), 2006)
  - MAQC main paper
  - Validation of microarray results
  - RNA sample titrations
  - One-color vs. two-color microarrays
  - External RNA controls
  - Rat toxicogenomics validation
- Plus editorial and commentaries
  - Nature Biotechnology foreword: Casciano DA and Woodcock J
  - Stanford commentary: Ji H and Davis RW
  - FDA commentary: Frueh FW
  - EPA commentary: Dix DJ et al.
17. Key Findings from the MAQC-I Study
- When standard operating procedures (SOPs) are followed and the data are analyzed properly, the following is demonstrated:
  - High within-lab and cross-lab reproducibility
  - High cross-platform comparability, including one- vs. two-color platforms
  - High correlation between quantitative gene expression (e.g., TaqMan) and microarray platforms
  - The few discordant measurements found were mainly due to differences in probe sequence and thus target location
18. How to Determine DEGs: Do We Really Know What We Know?
- A circular path for DEGs
  - Fold change: biologist-initiated (frugal approach)
    - Magnitude of difference
    - Biological significance
  - P-value: statisticians joined in (expensive approach)
    - Specificity and sensitivity
    - Statistical significance
  - FC(P): a MAQC finding (statistics got to know its limitation)
    - FC ranking with a nonstringent P-value cutoff, FC(P), should be considered for class-comparison studies (see the sketch after this slide)
    - Reproducibility
19. [Figure: journal covers - Nature, Science, Nature Methods, Cell, Analytical Chemistry]
20. Post-MAQC-I Study on Reproducibility of DEGs: A Statistical Simulation Study
[Diagram: DEG lists from Lab 1 and Lab 2, selected by P-value vs. fold change, are compared by the percentage of overlapping genes (POG) as a measure of reproducibility; a sketch of the POG calculation follows below]
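A minimal sketch of how the POG between two labs' DEG lists could be computed, assuming each list is ordered from most to least significant; the function and argument names are placeholders.

def pog(degs_lab1, degs_lab2, list_length=None):
    """Percentage of overlapping genes (POG) between two ranked DEG lists."""
    if list_length is not None:
        # Compare only the top-N genes of each list.
        degs_lab1 = degs_lab1[:list_length]
        degs_lab2 = degs_lab2[:list_length]
    overlap = set(degs_lab1) & set(degs_lab2)
    denominator = min(len(degs_lab1), len(degs_lab2))
    return 100.0 * len(overlap) / denominator if denominator else 0.0

# Example: overlap of the top-100 genes from two replicated studies
# (lab1_ranked_genes and lab2_ranked_genes are placeholder lists).
# print(pog(lab1_ranked_genes, lab2_ranked_genes, list_length=100))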
21. How to Determine DEGs: Do We Really Know What We Don't Know?
- A struggle between reproducibility and specificity/sensitivity
- A monotonic relationship between specificity and sensitivity
- A ??? relationship between reproducibility and specificity/sensitivity
22. More on Reproducibility
- General impressions (conclusions)
  - Reproducibility is a complicated phenomenon
  - There is no straightforward way to assess the reproducibility of DEGs
- Reproducibility and statistical power
  - More samples -> higher reproducibility
- Reproducibility and statistical significance
  - An inverse relationship, but not a simple trade-off
- Reproducibility and gene-list length
  - A complex relationship with the length of the DEG list
- Irreproducible does not mean biologically irrelevant
  - If two DEG lists from two replicated studies are not reproducible, both could still be true discoveries
23. MicroArray Quality Control (MAQC) Project: Phase II
- MAQC-I: Technical performance
  - Reliability of microarray technology
  - Cross-platform consistency
  - Reproducibility of microarray results
- MAQC-II: Practical application
  - Molecular signatures (or classifiers) for risk assessment and clinical application
  - Reliability, cross-platform consistency and reproducibility
  - Develop guidance and recommendations
[Timeline: MAQC-I, Feb 2005 to Sept 2006, 137 scientists from 51 organizations; MAQC-II, Sept 2006 to Dec 2008, >400 scientists from >150 organizations]
24. Application of Predictive Signatures
[Diagram: in clinical application (pharmacogenomics), signatures relate treatment to long-term effect for diagnosis, prognosis and treatment outcome; in safety assessment (toxicogenomics), short-term exposure data are used to predict long-term effects, with phenotypic anchoring]
25. Challenge 1
[Pipeline diagram showing the choices at each step of classifier development, rendered as a list; a sketch of one such pipeline follows below]
- Data set: batch effects
- QC: which QC methods?
- Normalization: e.g., raw data, MAS5, RMA, dChip, Plier
- Preprocessing: how to generate an initial gene pool for modeling
- Feature selection: P, FC, P(FC), FC(P)
- Classifier: which method? KNN, NC, SVM, DT, PLS
- Validation: how to assess success (chemical-based prediction, animal-based prediction)
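A minimal sketch of one candidate pipeline among the many combinations listed above, assuming a normalized expression matrix is already available; the use of scikit-learn, a univariate F-test filter and a KNN classifier are illustrative assumptions, not MAQC-II recommendations.

from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

# One candidate pipeline: univariate feature selection followed by KNN.
# Every step (normalization, gene pool, feature selection, classifier) could
# be swapped for any of the alternatives listed on the slide.
candidate_model = Pipeline([
    ("select", SelectKBest(score_func=f_classif, k=100)),  # initial gene pool
    ("classify", KNeighborsClassifier(n_neighbors=5)),     # KNN classifier
])

# Cross-validation on the training set gives a first estimate of performance
# before the model is frozen and applied to a blind test set
# (X_train and y_train are placeholder names for the training data).
# scores = cross_val_score(candidate_model, X_train, y_train, cv=5)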
26. Challenge 2: Assessing the Performance of a Classifier
[Diagram: three criteria; a sketch of the accuracy metrics follows below]
- Prediction accuracy: sensitivity, specificity
- Robustness: reproducibility of signatures
- Mechanistic relevance: biological understanding
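A minimal sketch of the prediction-accuracy criterion, assuming binary class labels and predictions from a frozen classifier on a blind test set; the variable names are placeholders.

from sklearn.metrics import accuracy_score, confusion_matrix

def prediction_performance(y_true, y_pred):
    """Accuracy, sensitivity and specificity for a binary classifier."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    return {"accuracy": accuracy_score(y_true, y_pred),
            "sensitivity": sensitivity,
            "specificity": specificity}

# Example (y_test and X_test are placeholders for a blind test set):
# print(prediction_performance(y_test, candidate_model.predict(X_test)))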
27. [The same pipeline diagram, annotated with the MAQC-II way of working]
- Data set, QC, Normalization: a consensus approach (12 teams)
- Preprocessing, Feature selection, Classifier: freedom of choice (35 analysis teams)
- Validation: validation, validation and validation!
28. What We Are Looking For
- Which factors (or parameters) are critical to the performance of a classifier
- A standard procedure to determine these factors; the procedure should be dataset-independent
- A best practice that could be used as guidance for developing microarray-based classifiers
29. Three-Step Approach
- Step 1: Training set - develop classifiers, signature genes and data analysis protocols (DAPs), then freeze them as the best practice
- Step 2: Blind test set - prediction and assessment with the frozen classifiers
- Step 3: Future sets - new experiments for selected endpoints to validate the best practice
(A sketch of the freeze-then-predict idea follows below.)
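A minimal sketch of the freeze-then-predict idea, reusing the candidate pipeline and metric helper from the earlier sketches; the point illustrated is that nothing from the blind test set is used during training, and all data names are placeholders.

from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline

# Step 1: develop the classifier on the training set only, then freeze it.
frozen_model = Pipeline([("select", SelectKBest(f_classif, k=100)),
                         ("classify", KNeighborsClassifier(n_neighbors=5))])
# frozen_model.fit(X_train, y_train)   # nothing from the blind test set is used

# Step 2: apply the frozen classifier once to the blind test set for assessment.
# blind_predictions = frozen_model.predict(X_test)
# performance = prediction_performance(y_test, blind_predictions)

# Step 3: future data sets for selected endpoints are scored the same way
# to validate the best practice.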
30. MAQC-II Data Sets

Provider            Dataset                               Step 1 - Training   Step 2 - Test
Clinical data:
  MDACC             Breast cancer                         130                 100
  UAMS              Multiple myeloma                      350                 209
  Univ. of Cologne  Neuroblastoma                         251                 300
Toxicogenomics data:
  Hamner            Lung tumor                            70 (18 cmpds)       40 (5 cmpds)
  Iconix            Non-genotoxic hepatocarcinogenicity   216                 201
  NIEHS             Liver injury (necrosis)               214                 204
31. Where We Are
[Same three-step diagram as slide 29: Step 1, training set (classifiers, signature genes and DAPs frozen as the best practice); Step 2, blind test set (prediction and assessment); Step 3, future sets with new experiments for selected endpoints to validate the best practice]
32. 18 Proposed Manuscripts
- Main manuscript: study design and main findings
- Assessing Modeling Factors (4 proposals)
- Prediction Confidence (5 proposals)
- Robustness (3 proposals)
- Mechanistic Relevance (2 proposals)
- Consensus Document (3 proposals)
33. Consensus Document (3 proposals)
- Principles of classifier development: Standard Operating Procedures (SOPs)
- Good Clinical Practice (GCP) in using microarray gene expression data
- MAQC, VXDS and FDA guidance on genomics
[Diagram: modeling -> assessing -> consensus -> guidance]
34. Best Practice Document
- One of the VGDS and MAQC objectives is to communicate with private industry and the research community to reach consensus on:
  - How to exchange genomic data (data submission)
  - How to analyze genomic data
  - How to interpret genomic data
- Lessons learned from VGDS and MAQC have led to the development of the Best Practice Document (led by Federico Goodsaid)
- Companion to the Guidance for Industry on Pharmacogenomic Data Submissions (Docket No. 2007D-0310); http://www.fda.gov/cder/genomics/conceptpaper_20061107.pdf
- Over 10 pharmas have provided comments
35. An Array of FDA Endeavors: The Integrated Nature of VGDS, ArrayTrack, MAQC and the Best Practice Document
36. Members of the Center for Toxicoinformatics