Title: Personalized Predictive Medicine and Genomic Clinical Trials
1 Personalized Predictive Medicine and Genomic Clinical Trials
- Richard Simon, D.Sc.
- Chief, Biometric Research Branch
- National Cancer Institute
- http://brb.nci.nih.gov
2 Biometric Research Branch Website: brb.nci.nih.gov
- PowerPoint presentations
- Reprints
- BRB-ArrayTools software
- Web-based Sample Size Planning
3 Personalized Oncology is Here Today and Rapidly Advancing
- Key information is in the tumor genome, not in inherited genetics
- Personalization is based on limited stratification of traditional diagnostic categories using key treatment-specific predictive biomarkers
4
- Although the randomized clinical trial remains of fundamental importance for predictive genomic medicine, some of the conventional wisdom of how to design and analyze RCTs requires re-examination
- The paradigm of doing a broad-eligibility RCT of thousands of patients to answer a single question about average treatment effect for a target population presumed homogeneous with regard to the direction of treatment efficacy no longer has a scientific basis in oncology
5 Standard Approach is Based on Assumptions
- Qualitative treatment by subset interactions are unlikely
  - i.e., if new treatment T is better than control C on average, it is better for all subsets of patients
- Costs of over-treatment are less than costs of under-treatment
6
- Cancers of a primary site often represent a heterogeneous group of diverse molecular diseases which vary fundamentally with regard to
  - the oncogenic mutations that cause them
  - their responsiveness to specific drugs
7 How Can We Develop New Drugs in a Manner More Consistent With Modern Tumor Biology and Obtain Reliable Information About What Regimens Work for What Kinds of Patients?
8 Prospective Co-Development of Drugs and Companion Diagnostics
- Develop a completely specified genomic classifier of the patients likely to benefit from a new drug
  - Based on drug target, pre-clinical, phase 1/2
- Establish analytical validity of the classifier
- Use the completely specified classifier to design and analyze a focused clinical trial to evaluate effectiveness of the new treatment and how it relates to the candidate biomarker
9 Targeted (Enrichment) Design
- Restrict entry to the phase III trial based on
the binary predictive classifier
10 Develop Predictor of Response to New Drug
[Flow diagram: Using phase II data, develop predictor of response to new drug. Patient predicted responsive: randomized to New Drug or Control. Patient predicted non-responsive: Off Study.]
11 Applicability of Targeted Design
- Primarily for settings where the classifier is based on a single gene whose protein product is the target of the drug and there is substantial biological evidence that the drug will not be effective for classifier-negative patients
- Because most cancer drugs have serious side effects and limit the doses at which other drugs can be administered, it is ethically problematic to ask patients to participate in a clinical trial of a regimen from which they are not expected to benefit
- e.g., trastuzumab
- PARP inhibitors
12 Evaluating the Efficiency of Targeted Design
- Simon R and Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clinical Cancer Research 10:6759-63, 2004; Correction and supplement 12:3229, 2006
- Maitournam A and Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 24:329-339, 2005.
13
- Relative efficiency of targeted design depends on
  - proportion of patients who are test positive
  - effectiveness of new drug (compared to control) for test-negative patients
  - Specificity of treatment
  - Sensitivity of test
- When less than half of patients are test positive and the drug has little or no benefit for test-negative patients, the targeted design requires dramatically fewer randomized patients
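
A rough calculation can illustrate how the saving depends on prevalence and on the effect in test-negative patients. The sketch below rests on assumptions that are not on the slide: the test is perfectly accurate, the required number of randomized patients scales as one over the squared treatment effect, and the effect seen in an unselected trial is the prevalence-weighted average of the effects in test-positive and test-negative patients. The function and parameter names are hypothetical.

```python
def randomized_ratio(prevalence, effect_pos, effect_neg):
    """Approximate (patients randomized, untargeted) / (patients randomized, targeted).

    Assumption: required sample size scales as 1 / effect**2, and the effect
    in an unselected trial is the prevalence-weighted average effect.
    """
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    average_effect = prevalence * effect_pos + (1 - prevalence) * effect_neg
    if average_effect <= 0:
        raise ValueError("average effect must be positive")
    return (effect_pos / average_effect) ** 2


def screened_ratio(prevalence, effect_pos, effect_neg):
    """Same comparison, counting patients screened for the targeted design."""
    # The targeted design screens about 1 / prevalence patients per patient randomized.
    return prevalence * randomized_ratio(prevalence, effect_pos, effect_neg)


if __name__ == "__main__":
    for prevalence in (0.25, 0.50, 0.75):
        for effect_neg in (0.0, 0.5):
            ratio = randomized_ratio(prevalence, 1.0, effect_neg)
            print(f"prevalence={prevalence:.2f} effect_neg={effect_neg:.1f} ratio={ratio:.2f}")
```

Under these assumptions, a test-positive prevalence of 25% with no benefit in test-negative patients gives a ratio of 16 in patients randomized, and a ratio of 4 once the screened patients are counted.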
14 Developmental Strategy (II)
15
- Do not use the test to restrict eligibility, but to structure a prospective analysis plan
- Having a prospective analysis plan is essential
- Stratifying (balancing) the randomization is useful to ensure that all randomized patients have the test performed, but it is not required for valid inference and is not a substitute for a prospective analysis plan
- Size the study for adequate evaluation of T vs C separately by marker status
16
- R Simon. Using genomics in clinical trial design, Clinical Cancer Research 14:5984-93, 2008
- R Simon. Designs and adaptive analysis plans for pivotal clinical trials of therapeutics and companion diagnostics, Expert Opinion in Medical Diagnostics 2:721-29, 2008
18 Analysis Plan C: Limited Confidence in Classifier
- Test for difference (interaction) between treatment effect in test-positive patients and treatment effect in test-negative patients at an elevated level (e.g., 0.10)
- If interaction is significant at that level, then compare treatments separately for test-positive patients and test-negative patients
- Otherwise, compare treatments overall
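
The branching in this plan can be written out as a small decision function. This is only a sketch: it takes already-computed p-values as inputs, the function name and the returned dictionary layout are hypothetical, and the 0.05 level used for the follow-up comparisons is an assumption, since the slide specifies only the elevated level for the interaction test.

```python
def analysis_plan_c(p_interaction, p_positive, p_negative, p_overall,
                    interaction_level=0.10, alpha=0.05):
    """Decide which treatment comparisons are reported under Analysis Plan C.

    alpha=0.05 for the follow-up comparisons is an assumption of this sketch.
    """
    if p_interaction < interaction_level:
        # Interaction significant at the elevated level: analyze subsets separately.
        return {
            "analysis": "separate",
            "positive_significant": p_positive < alpha,
            "negative_significant": p_negative < alpha,
        }
    # Otherwise the single overall comparison of T vs C is used.
    return {"analysis": "overall", "overall_significant": p_overall < alpha}


if __name__ == "__main__":
    print(analysis_plan_c(p_interaction=0.03, p_positive=0.01,
                          p_negative=0.60, p_overall=0.08))
```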
19 Sample Size Planning for Analysis Plan C
- 88 events in test-positive patients needed to detect 50% reduction in hazard at 5% two-sided significance level with 90% power
- If 25% of patients are positive, when there are 88 events in positive patients there will be about 264 events in negative patients
- 264 events provide 90% power for detecting 33% reduction in hazard at 5% two-sided significance level
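
The event counts above can be checked with a short calculation. The sketch below uses the common approximation for the number of events in a two-arm log-rank comparison with 1:1 randomization, events = 4 (z_alpha/2 + z_beta)^2 / (ln HR)^2. That this is the formula behind the slide is an assumption, although it reproduces the quoted figures. Function names are hypothetical, and the step from 88 to 264 events additionally assumes similar event rates in the two marker groups.

```python
from math import ceil, exp, log, sqrt
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.90):
    """Events needed to detect the given hazard ratio (two-sided alpha, 1:1 randomization)."""
    z_total = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(4 * z_total ** 2 / log(hazard_ratio) ** 2)


def detectable_hazard_ratio(events, alpha=0.05, power=0.90):
    """Hazard ratio detectable with the given number of events."""
    z_total = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return exp(-2 * z_total / sqrt(events))


if __name__ == "__main__":
    prevalence = 0.25
    events_pos = required_events(0.5)  # 50% reduction in hazard
    # Assumes similar event rates in test-positive and test-negative patients.
    events_neg = round(events_pos * (1 - prevalence) / prevalence)
    hr_neg = detectable_hazard_ratio(events_neg)
    print(events_pos, events_neg, f"{1 - hr_neg:.1%} reduction detectable")
```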
20 Futility Analysis
- Interim futility analyses separately in test-positive and test-negative patients
- Conservative futility analysis
- After observing 132 (264/2) events in test-negative patients, if hazard ratio of new treatment vs control is < 1, then terminate accrual of test-negative patients but continue accrual of test-positive patients till planned 88 events
- Futility analysis may be performed using a conditional surrogate intermediate endpoint to protect the test-negative patients
21 Does the RCT Need to Be Significant Overall for the T vs C Treatment Comparison?
- No
- That requirement has been traditionally used to
protect against data dredging. It is
inappropriate for focused trials of a treatment
with a companion test.
22
- Because of the complexity of cancer biology, it is often difficult to have the right completely defined predictive biomarker identified and analytically validated by the time the pivotal trial of a new drug is ready to start accrual
23 Multiple Biomarker Design
- Have identified K candidate binary classifiers B_1, ..., B_K thought to be predictive of patients likely to benefit from T relative to C
- Eligibility not restricted by candidate classifiers
24 Fallback Analysis Plan
- Compare outcomes of T to C overall
- If p < 0.01, claim effectiveness of T overall
- Otherwise, conduct planned, type I error protected subset analysis
25
- Compute p_k comparing T vs C restricted to patients positive for B_k. Do this for k = 0, 1, ..., K
- Let p* = min{p_k}, k* = argmin{p_k}
- For a global test of significance
  - Compute null distribution of p* by permuting treatment labels
  - If the data value of p* is less than the 4th percentile of the null distribution, then claim effectiveness of T for patients positive for B_k*
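
The minimum-p permutation test described above might be sketched as follows. Several things here are assumptions of the sketch rather than part of the slides: outcomes are binary responses, the within-subset comparison is a two-sided normal-approximation test of response rates (a stand-in for whatever test the trial would actually use, such as the log-rank test), classifiers are indexed from 0, and all function names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def two_sample_p(outcomes, arms):
    """Two-sided p-value for a difference in response rates between T and C."""
    t = [y for y, a in zip(outcomes, arms) if a == "T"]
    c = [y for y, a in zip(outcomes, arms) if a == "C"]
    if not t or not c:
        return 1.0
    pooled = (sum(t) + sum(c)) / (len(t) + len(c))
    se = sqrt(pooled * (1 - pooled) * (1 / len(t) + 1 / len(c)))
    if se == 0:
        return 1.0
    z = (sum(t) / len(t) - sum(c) / len(c)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def min_p(outcomes, arms, markers):
    """Return (p*, k*): the smallest subset p-value and the classifier attaining it.

    markers[i][k] is 1 when patient i is positive for classifier k.
    """
    p_values = []
    for k in range(len(markers[0])):
        idx = [i for i, m in enumerate(markers) if m[k] == 1]
        p_values.append(two_sample_p([outcomes[i] for i in idx],
                                     [arms[i] for i in idx]))
    k_star = min(range(len(p_values)), key=p_values.__getitem__)
    return p_values[k_star], k_star


def fallback_subset_test(outcomes, arms, markers, n_perm=1000, level=0.04, seed=0):
    """Compare observed p* with its null distribution under permuted treatment labels."""
    rng = random.Random(seed)
    p_star, k_star = min_p(outcomes, arms, markers)
    shuffled = list(arms)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(min_p(outcomes, shuffled, markers)[0])
    # Share of permuted p* values at or below the observed one.
    perm_p = sum(p <= p_star for p in null) / n_perm
    return {"k_star": k_star, "p_star": p_star,
            "permutation_p": perm_p, "claim": perm_p < level}
```

Because the minimum is recomputed over all candidate classifiers within every permutation, the comparison against the null distribution accounts for having searched over K subsets.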
26
- Repeating the analysis for bootstrap samples of cases provides
  - an estimate of the stability of k* (the indication)
  - an interval estimate of the size of treatment effect in the target population
27 Adaptive Signature Design
- Boris Freidlin and Richard Simon
- Clinical Cancer Research 11:7872-8, 2005
28 Adaptive Signature Design: End of Trial Analysis
- Compare T to C for all patients at significance level α_0 (e.g., 0.01)
- If overall H_0 is rejected, then claim effectiveness of T for eligible patients
- Otherwise
29
- Otherwise
  - Using a randomly selected training set consisting of a pre-specified proportion of patients accrued during the trial, develop a binary classifier that predicts the subset of patients most likely to benefit from the new treatment T compared to control C
  - Compare T to C for patients accrued in second stage who are predicted responsive to T based on classifier
  - Perform test at significance level 0.05 - α_0 (e.g., 0.04)
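
The way the overall significance level is split between the two tests can be sketched as a decision function. It assumes a total two-sided level of 0.05 shared between the overall test and the subset test, takes already-computed p-values as inputs, and uses hypothetical names throughout.

```python
def adaptive_signature_decision(p_overall, p_subset,
                                alpha_overall=0.01, alpha_total=0.05):
    """Return the claim supported by the end-of-trial analysis.

    p_subset is the p-value for T vs C among held-out patients whom the
    classifier (developed on the training set) predicts to be responsive.
    """
    if p_overall < alpha_overall:
        return "T effective for all eligible patients"
    # The subset test spends whatever part of the total level remains.
    if p_subset < alpha_total - alpha_overall:
        return "T effective for classifier-positive patients"
    return "no claim"


if __name__ == "__main__":
    print(adaptive_signature_decision(p_overall=0.20, p_subset=0.03))
```

Since the two rejection regions together spend at most alpha_total, the chance of any false claim under the global null hypothesis stays at or below that level.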
30 Treatment effect restricted to subset. 10% of patients sensitive, 10 sensitivity genes, 10,000 genes, 400 patients.
Test                                                                                        Power (%)
Overall .05 level test                                                                      46.7
Overall .04 level test                                                                      43.1
Sensitive subset .01 level test (performed only when overall .04 level test is negative)    42.2
Overall adaptive signature design                                                           85.3
31 Cross-Validated Adaptive Signature Design
- Freidlin B, Jiang W, Simon R
- Clinical Cancer Research 16(2), 2010
32 70% Response to T in Sensitive Patients; 25% Response to T Otherwise; 25% Response to C; 20% Patients Sensitive
Test                          ASD      CV-ASD
Overall 0.05 Test             0.486    0.503
Overall 0.04 Test             0.452    0.471
Sensitive Subset 0.01 Test    0.207    0.588
Overall Power                 0.525    0.731
33 Prediction Based Analysis of Clinical Trials
- Using cross-validation we can evaluate our methods for analysis of clinical trials, including complex subset analysis algorithms, in terms of their effect on improving patient outcome via informing therapeutic decision making
- This approach can be used with any set of candidate predictor variables
34
- Define an algorithm A for developing a classifier of whether patients benefit preferentially from a new treatment T relative to C
- For patients with covariate vector x, the algorithm predicts preferred treatment
- Applying A to a training dataset D provides a classifier model M(A, D)
- R(x | M(A, D)) = T
- R(x | D) = C
35
- At the conclusion of the trial randomly partition the patients into K approximately equally sized sets P_1, ..., P_K (e.g., K = 10)
- Let D_{-i} denote the full dataset minus data for patients in P_i
- Using K-fold complete cross-validation, omit patients in P_i
- Apply the defined algorithm to analyze the data in D_{-i} to obtain a classifier M_{-i}
- For each patient j in P_i record the treatment recommendation, i.e., R_j = T or R_j = C
36
- Repeat the above for all K loops of the cross-validation
- All patients have been classified as what their optimal treatment is predicted to be
37
- Let S_T denote the set of patients for whom treatment T is predicted optimal, i.e., S_T = {j : R_j = T}
- Compare outcomes for patients in S_T who actually received T to those in S_T who actually received C
- Let z_T = standardized log-rank statistic
- Let HR_T denote the estimated hazard ratio in S_T
- Compute statistical significance of z_T by randomly permuting treatment labels and repeating the entire procedure
- Do this 1000 or more times to generate the permutation null distribution of treatment effect for the patients in the subset
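
The cross-validated procedure of the preceding slides can be sketched end to end on toy data. Everything specific in this sketch is an assumption made for illustration: each patient is a tuple (x, arm, response) with a single discrete covariate and a binary response, the algorithm A is a toy rule (`fit_rule`, a hypothetical name) that recommends T for a covariate value when T has the higher observed response rate in the training folds, and the difference in response rates within the predicted-benefit set stands in for the standardized log-rank statistic.

```python
import random


def fit_rule(rows):
    """Toy algorithm A: recommend T for covariate value x when T's response
    rate in the training rows exceeds that of C; otherwise recommend C."""
    counts = {}
    for x, arm, response in rows:
        cell = counts.setdefault(x, {"T": [0, 0], "C": [0, 0]})
        cell[arm][0] += response
        cell[arm][1] += 1

    def rate(pair):
        return pair[0] / pair[1] if pair[1] else 0.0

    prefers_t = {x: rate(c["T"]) > rate(c["C"]) for x, c in counts.items()}
    return lambda x: "T" if prefers_t.get(x, False) else "C"


def cross_validated_effect(rows, k_folds, rng):
    """Treatment effect (T minus C response rate) among patients whose
    held-out recommendation is T."""
    order = list(range(len(rows)))
    rng.shuffle(order)
    folds = [order[i::k_folds] for i in range(k_folds)]
    s_t = []
    for fold in folds:
        held_out = set(fold)
        training = [row for i, row in enumerate(rows) if i not in held_out]
        recommend = fit_rule(training)  # classifier never sees the held-out fold
        s_t.extend(j for j in fold if recommend(rows[j][0]) == "T")
    got_t = [rows[j][2] for j in s_t if rows[j][1] == "T"]
    got_c = [rows[j][2] for j in s_t if rows[j][1] == "C"]
    if not got_t or not got_c:
        return 0.0
    return sum(got_t) / len(got_t) - sum(got_c) / len(got_c)


def permutation_test(rows, k_folds=10, n_perm=1000, seed=0):
    """One-sided permutation p-value; the entire cross-validation is repeated
    for every permutation of the treatment labels."""
    rng = random.Random(seed)
    observed = cross_validated_effect(rows, k_folds, rng)
    arms = [arm for _, arm, _ in rows]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(arms)
        permuted = [(x, a, y) for (x, _, y), a in zip(rows, arms)]
        if cross_validated_effect(permuted, k_folds, rng) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

Re-running classifier development inside each permutation is what keeps the test valid: the null distribution then reflects the subset selection step as well as the treatment comparison.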
38
- The significance test based on comparing T vs C for S_T = {j : R_j = T} is the basis for demonstrating that T is more effective than C for some patients.
39
- By applying the analysis algorithm to the full RCT dataset D, recommendations are developed for how future patients should be treated
  - R(x | D) for all x vectors
- The cross-validated estimate HR_T of treatment effect in S_T provides a conservative estimate of the treatment effect in the subset for which R(x | D) = T
40
- Although imperfect, identification of the subset of patients who benefit from T vs C will generally be substantially more accurate than in the standard clinical trial, in which all patients are classified based on the result of testing the single overall null hypothesis
42 Prediction Based Clinical Trials
- New methods for determining from RCTs which
patients, if any, benefit from new treatments can
be evaluated directly using the actual RCT data
in a manner that separates model development from
model evaluation, rather than basing treatment
recommendations on the results of a single
hypothesis test or on exploratory subset analyses
of the full dataset.
43 Prediction Based Clinical Trials
- Hypothesis testing has value for ensuring that ineffective treatments are not approved
- Hypothesis testing is not an effective paradigm for identifying which patients benefit from a new treatment
- The current paradigm results in over-treatment of populations of patients with many patients not benefiting
- Conventional post-hoc subset analysis is also unsatisfactory as it provides no internal validation of classification accuracy
44
- Using a combination of hypothesis testing of a global null hypothesis of no treatment effect and cross-validated prediction analysis based on a pre-specified algorithm for prognostic classification, we can accomplish both objectives
  - Preserve type I error to ensure that most ineffective treatments are not approved
  - Provide an internally validated classifier of which kinds of patients benefit from the new treatment
45 Acknowledgements
- Boris Freidlin
- Yingdong Zhao
- Wenyu Jiang
- Aboubakar Maitournam