Use of Prognostic

1
Use of Prognostic & Predictive Biomarkers in
Clinical Trial Design

Richard Simon, D.Sc.
Chief, Biometric Research Branch
National Cancer Institute
http://brb.nci.nih.gov

2
BRB Website: brb.nci.nih.gov

Powerpoint presentations
Reprints
BRB-ArrayTools software
Data archive
Q/A message board
Web based Sample Size Planning
Clinical Trials
Optimal 2-stage phase II designs
Phase III designs using predictive biomarkers
Phase II/III designs
Development of gene expression based predictive
classifiers

3
Prognostic & Predictive Biomarkers

Most cancer treatments benefit only a minority of
patients to whom they are administered
Being able to predict which patients are likely
or unlikely to benefit would
Save patients from unnecessary toxicity, and
enhance their chance of receiving a drug that
helps them
Control medical costs
Improve the success rate of clinical drug
development

Predictive biomarkers
Measured before treatment to identify who will or
will not benefit from a particular treatment
ER, HER2, KRAS
Prognostic biomarkers
Measured before treatment to indicate long-term
outcome for patients untreated or receiving
standard treatment
Only have medical utility if therapeutically
relevant
Used to identify who does or does not require
more intensive than standard treatment
OncotypeDx

5
Prognostic and Predictive Biomarkers in Oncology

Single gene or protein measurement
Scalar index or classifier that summarizes
expression levels of multiple genes

6
Prognostic Factors in Oncology

Many prognostic factors are not used because they
are not actionable
Most prognostic factor studies are not conducted
with an intended use
They use a convenience sample of heterogeneous
patients for whom tissue is available
Retrospective studies of prognostic markers
should be planned and analyzed with specific
focus on intended use of the marker
Design of prospective studies depends on context
of use of the biomarker
Treatment options and practice guidelines
Other prognostic factors

7
Clinical Utility

Biomarker benefits patient by improving treatment
decisions
Identify patients who have very good prognosis on
standard treatment and do not require more
intensive regimens
Identify patients who have poor prognosis on
standard chemotherapy who are good candidates for
experimental regimens

8
Prospective Evaluation of Prognostic Biomarker

Identify low stage patients for whom standard of
care is chemotherapy
Find dataset of low stage patients who did not
receive chemotherapy for whom archived tissue is
available
Develop prognostic classifier of risk without
chemotherapy of low stage patients
If the classifier identifies a group with a very
low risk of recurrence in the absence of
chemotherapy then
Conduct RCT in which low stage patients who are
low risk by biomarker classifier are randomized
to ± chemotherapy

If the predicted risk of recurrence is
sufficiently low, then randomization may be
omitted
The test of the biomarker is a test of whether
the risk is as low as predicted
Absolute benefit of very low risk patients is by
necessity very small
This is the approach of TAILORx

10
How Does This Approach Compare to the So Called
Gold Standard of Randomizing Patients to Receive
or Not Receive the Test?
11
Prospective Marker Strategy Design

Patients are randomized to either
have marker measured and treatment determined
based on marker result and clinical features
don't have marker measured and receive standard
of care treatment based on clinical features
alone

12
Randomize Patients to Test or No Test
Rx Determined by Test
Rx Determined By SOC
13
Marker Strategy Design

Inefficient
Many patients get the same treatment regardless
of which arm they are randomized to
Uninformative
Since patients in the standard of care arm do not
have the marker measured, it is not possible to
compare outcome for patients whose treatment is
changed based on the marker result

14
Apply Test to All Eligible Patients
Using phase II data, develop predictor of
response to new drug
Test Determined Rx Different From SOC
Test Determined Rx Same as SOC
Off Study
Use Test Determined Rx
Use SOC
15

MINDACT randomizes breast cancer patients whose
Mammaprint based Rx differs from SOC
Trial is sized to estimate risk of relapse of low
risk Mammaprint patients randomized to no
chemotherapy

16
Predictive Biomarkers
17
(No Transcript)
18
(No Transcript)
19

Cancers of a primary site are in many cases a
molecularly heterogeneous group of diseases which
vary enormously in their responsiveness to
treatment, particularly molecularly targeted
treatment
Can we develop new drugs in a manner more
consistent with modern tumor biology and obtain
reliable information about what regimens work for
what kinds of tumors?

Evaluating a predictive biomarker for treatment T
involves an RCT of T versus a control C.
Analysis of RCT determines whether the biomarker
distinguishes the patients who benefit from T vs
C from those who don't
In this RCT, the biomarker should ideally be
completely specified in advance
focused on the single specific biomarker
the trial sized with sufficient marker + and
marker - patients for adequately powered separate
analysis of T vs C differences in each stratum.
Evaluating a predictive biomarker does not
involve comparison of outcome of marker + vs
marker - patients

21
(No Transcript)
22
Prospective Co-Development of Drugs and Companion
Diagnostics

Develop a completely specified genomic classifier
of the patients likely to benefit from a new drug
Establish analytical validity of the classifier
Use the completely specified classifier in the
primary analysis plan of a phase III trial of the
new drug

23
Guiding Principle

The data used to develop the classifier should be
distinct from the data used to test hypotheses
about treatment effect in subsets determined by
the classifier
Developmental studies can be exploratory
Studies on which treatment effectiveness claims
are to be based should not be exploratory

24
Develop Predictor of Response to New Drug
Using phase II data, develop predictor of
response to new drug
Patient Predicted Responsive
Patient Predicted Non-Responsive
Off Study
New Drug
Control
25
Applicability of Targeted/Enrichment Design

Primarily for settings where the classifier is
based on a single gene whose protein product is
the target of the drug or the biology seems well
understood
eg trastuzumab
With a strong biological basis for the
classifier, it may be unacceptable to expose
classifier negative patients to the new drug
Analytical validation, biological rationale and
phase II data provide basis for regulatory
approval of the test
Phase III study focused on test + patients to
provide data for approving the drug

26
Principle

If a drug is found safe and effective in a
defined (test +) patient population, approval
should not depend on finding the drug ineffective
in some other (test -) population

27
Evaluating the Efficiency of Enrichment Design

Simon R and Maitournam A. Evaluating the
efficiency of targeted designs for randomized
clinical trials. Clinical Cancer Research
10:6759-63, 2004; Correction and supplement
12:3229, 2006
Maitournam A and Simon R. On the efficiency of
targeted clinical trials. Statistics in Medicine
24:329-339, 2005.
Reprints and interactive sample size calculations
at http://linus.nci.nih.gov

Relative efficiency of targeted design depends on
proportion of patients test positive
effectiveness of new drug (compared to control)
for test negative patients
When less than half of patients are test positive
and the drug has little or no benefit for test
negative patients, the targeted design requires
dramatically fewer randomized patients

29
Trastuzumab (Herceptin)

Metastatic breast cancer
234 randomized patients per arm
90% power for 13.5% improvement in 1-year
survival over 67% baseline at 2-sided .05 level
If benefit were limited to the 25% assay +
patients, overall improvement in survival would
have been 3.375%
4025 patients/arm would have been required
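
The dilution arithmetic above can be checked with a short sketch. The slide does not say which sample size formula was used; the code below assumes the normal approximation for comparing two proportions with a continuity correction, and the name `n_per_arm` is illustrative only. Under that assumption the diluted 3.375% difference comes out close to the quoted 4025 patients per arm, and the undiluted 13.5% difference comes out in the neighborhood of the 234 per arm on the slide.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_arm(p_control, p_treated, alpha=0.05, power=0.90):
    """Patients per arm for a two-sided comparison of two proportions.

    Assumption: normal approximation with continuity correction;
    the slides do not state the formula they used.
    """
    z = NormalDist().inv_cdf
    z_a, z_b = z(1 - alpha / 2), z(power)
    d = abs(p_treated - p_control)
    p_bar = (p_control + p_treated) / 2
    pooled = sqrt(2 * p_bar * (1 - p_bar))
    unpooled = sqrt(p_control * (1 - p_control) + p_treated * (1 - p_treated))
    n = (z_a * pooled + z_b * unpooled) ** 2 / d ** 2
    # Continuity correction applied to the uncorrected sample size
    n_cc = n / 4 * (1 + sqrt(1 + 4 / (n * d))) ** 2
    return ceil(n_cc)

baseline = 0.67              # 1-year survival on control
effect_in_positives = 0.135  # improvement if all patients were assay +
fraction_positive = 0.25     # share of assay + patients

# If only assay + patients benefit, the overall effect is diluted
diluted_effect = fraction_positive * effect_in_positives

n_targeted = n_per_arm(baseline, baseline + effect_in_positives)
n_untargeted = n_per_arm(baseline, baseline + diluted_effect)
print(n_targeted, n_untargeted)
```

The point of the comparison is the order of magnitude: a fourfold dilution of the effect inflates the required sample size by far more than fourfold.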

30
(No Transcript)
31
Model for Two Treatments With Binary Response

Molecularly targeted treatment T
Control treatment C
$1-\gamma$ = proportion of patients that express target
$p_c$ = control response probability
Response probability for T patients who express
target (R+) is $p_c + \delta_1$
Response probability for T patients who do not
express target (R-) is $p_c + \delta_0$

32
(No Transcript)
33
(No Transcript)
34
Randomized Ratio (normal approximation)

RandRat = $n_{\text{untargeted}} / n_{\text{targeted}}$
$\delta_1$ = rx effect in marker + patients
$\delta_0$ = rx effect in marker - patients
$\gamma$ = proportion of marker - patients
If $\delta_0 = 0$, RandRat = $1/(1-\gamma)^2$
If $\delta_0 = \delta_1/2$, RandRat = $1/(1-\gamma/2)^2$

35
Randomized Ratio $n_{\text{untargeted}} / n_{\text{targeted}}$
36
Screened Ratio

$N_{\text{untargeted}} = n_{\text{untargeted}}$
$N_{\text{targeted}} = n_{\text{targeted}} / (1-\gamma)$
ScreenRat = $N_{\text{untargeted}} / N_{\text{targeted}} = (1-\gamma)\,\text{RandRat}$
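
The randomized and screened ratios can be sketched as two small functions. The general expression used here, the squared ratio of the marker + effect to the population-average effect, is an assumption: it is not written out on the slides, but it reproduces both special cases quoted for the randomized ratio. The names `rand_ratio` and `screen_ratio` are illustrative.

```python
def rand_ratio(delta1, delta0, gamma):
    """Approximate n_untargeted / n_targeted (normal approximation).

    delta1: treatment effect in marker + patients
    delta0: treatment effect in marker - patients
    gamma:  proportion of marker - patients
    Assumed form: (effect in targeted trial / average effect in
    untargeted trial) squared.
    """
    average_effect = (1 - gamma) * delta1 + gamma * delta0
    return (delta1 / average_effect) ** 2

def screen_ratio(delta1, delta0, gamma):
    """Approximate N_untargeted / N_targeted, counting screened patients."""
    return (1 - gamma) * rand_ratio(delta1, delta0, gamma)

# 25% of patients marker +, no benefit in marker - patients
print(rand_ratio(0.2, 0.0, 0.75), screen_ratio(0.2, 0.0, 0.75))
```

With only a quarter of patients marker + and no effect in the rest, the targeted design randomizes far fewer patients, although the advantage shrinks once the patients who must be screened are counted.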

37
Screened Ratio
38
Decomposing Specificity of Treatment Effect from
Accuracy of Test

RandRat = $n_{\text{untargeted}} / n_{\text{targeted}}$

39
Randomized Ratio, sensitivity = specificity = 0.9
40
Screened Ratio

$N_{\text{untargeted}} = n_{\text{untargeted}}$

41
Screened Ratio, sensitivity = specificity = 0.9
42
Web Based Software for Designing RCT of Drug and
Predictive Biomarker

http://brb.nci.nih.gov

43
(No Transcript)
44
(No Transcript)
45
(No Transcript)
46

It can be very difficult to develop an effective
and analytically validated predictive biomarker
prior to launch of the phase III trial
Even for anti-EGFR antibodies, a more effective
biomarker turned out to be KRAS mutation, not
EGFR expression
For small molecule kinase inhibitors the task is
more difficult
In some settings it can be easier to use an
analytically validated biomarker of poor outcome
on the standard therapy

Score function S for distinguishing patients with
favorable outcome on standard rx vs those with
unfavorable outcome
Developed on training set of pts receiving std rx
$G_F(s)$ = CDF of S in favorable pts
$G_U(s)$ = CDF of S in unfavorable pts
Computed on test set of pts receiving std rx

$G_U(s)$ = sensitivity of test for selecting pts with
unfavorable outcome on std rx using threshold s
$1-G_F(s)$ = specificity of test
Plot of $G_U(s)$ vs $G_F(s)$ = ROC curve

Latent classes
LCF
LCU
Pr[LCF] = $\gamma$
Pr[S Resp = F | LCF] = $p_1$
Pr[S Resp = F | LCU] = $p_0$
Pr[E Resp = F | LCF] = $p_1$
Pr[E Resp = F | LCU] = $p_0 + \delta$

51
(No Transcript)
52
(No Transcript)
53
(No Transcript)
54
(No Transcript)
55

The maximum treatment effect is $\delta$. It can be
achieved if one selects a threshold t small
enough that the specificity of the test for
excluding cases with favorable outcome on the
standard treatment is 1. If the specificity is 1,
then the size of the treatment effect does not
depend on the sensitivity of the test
Proportion randomized = $(1-\gamma)G_U(t) + \gamma G_F(t)$

Simon and Maitournam showed that the ratio of the
number of patients needed to randomize for a
targeted design compared to a standard design
that does not use the biomarker is approximately
equal to the square of the ratio of the treatment
effects for the two designs
For the standard design the treatment effect is
$(1-\gamma)\delta$

57
(No Transcript)
58

If the threshold is selected for specificity 1,
then the randomization ratio equals $(1-\gamma)^2$
Hence if half of the patients have favorable
outcome with standard treatment, i.e. $\gamma = 0.5$, then
the targeted design requires only one quarter as
many randomized patients as the standard
design.
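
A minimal sketch of this calculation follows, assuming the latent class reading of the preceding slides: a fraction $\gamma$ of patients do well on standard treatment and gain nothing from the new one, and the remaining patients gain $\delta$. The expression for the treatment effect among selected patients is derived from that reading rather than taken from the slides, and all function names are illustrative.

```python
def proportion_randomized(gamma, g_u, g_f):
    """Share of screened patients selected by the test at threshold t.

    g_u = G_U(t), sensitivity for unfavorable patients
    g_f = G_F(t), one minus specificity
    """
    return (1 - gamma) * g_u + gamma * g_f

def targeted_effect(gamma, delta, g_u, g_f):
    """Average treatment effect among selected patients (assumed form):
    delta times the share of selected patients who are truly unfavorable."""
    return delta * (1 - gamma) * g_u / proportion_randomized(gamma, g_u, g_f)

def randomization_ratio(gamma, delta, g_u, g_f):
    """Randomized patients, targeted design relative to standard design:
    squared ratio of the standard effect to the targeted effect."""
    standard_effect = (1 - gamma) * delta
    return (standard_effect / targeted_effect(gamma, delta, g_u, g_f)) ** 2

# Specificity 1 (g_f = 0) and half of patients favorable on standard rx
print(randomization_ratio(gamma=0.5, delta=0.2, g_u=0.8, g_f=0.0))
```

With specificity 1 the sensitivity `g_u` cancels out of the effect size, which matches the statement that the treatment effect then does not depend on sensitivity.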

59
Stratification Design
60
Stratification Design

Use the test to structure a prospectively specified
primary analysis plan
Having a prospective analysis plan is essential
Stratifying (balancing) the randomization is
useful to ensure that all randomized patients
have tissue available but is not a substitute for
a prospective analysis plan
The purpose of the study is to evaluate the new
treatment overall and for the pre-defined
subsets, not to modify or refine the classifier
The purpose is not to demonstrate that repeating
the classifier development process on independent
data results in the same classifier

61
Not Interaction Design

Requiring a significant interaction at 5% level
to justify evaluating treatment effects in
subsets
was useful in the context of post-hoc subset
analysis when drugs were non-specific cytotoxins,
the subsets were not biology based and the prior
probability of qualitative interactions was low
is not useful for focused co-development of
molecularly targeted drugs when the subset
analysis is part of the primary analysis plan and
the study-wise type I error is controlled
is an example of how progress could be
unnecessarily stymied by making co-development
impracticably expensive

R Simon. Using genomics in clinical trial design,
Clinical Cancer Research 14:5984-93, 2008
R Simon. Designs and adaptive analysis plans for
pivotal clinical trials of therapeutics and
companion diagnostics, Expert Opinion in Medical
Diagnostics 2:721-29, 2008

63
Analysis Plan A

Compare the new drug to the control for
classifier positive patients
If $p > 0.05$ make no claim of effectiveness
If $p \le 0.05$ claim effectiveness for the
classifier positive patients and
Compare new drug to control for classifier
negative patients using 0.05 threshold of
significance

64
Sample size for Analysis Plan A

88 events in classifier + patients needed to
detect 50% reduction in hazard at 5% two-sided
significance level with 90% power
If 25% of patients are positive, then when there
are 88 events in positive patients there will be
about 264 events in negative patients
264 events provides 90% power for detecting 33%
reduction in hazard at 5% two-sided significance
level
Sequential futility monitoring may have enabled
early cessation of accrual of classifier negative
patients
Not much earlier with time-to-event endpoint
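
The event counts above are consistent with a widely used approximation for the number of events in a 1:1 randomized trial under proportional hazards. The slides do not say which formula was used, so the sketch below is an assumption, and `events_needed` is an illustrative name.

```python
from math import log
from statistics import NormalDist

def events_needed(hazard_ratio, alpha=0.05, power=0.90):
    """Approximate total events for a two-sided test of the hazard ratio,
    assuming 1:1 randomization and proportional hazards."""
    z = NormalDist().inv_cdf
    return 4 * (z(1 - alpha / 2) + z(power)) ** 2 / log(hazard_ratio) ** 2

events_pos = events_needed(0.50)  # 50% reduction in hazard
events_neg = events_needed(0.67)  # 33% reduction in hazard
print(round(events_pos, 1), round(events_neg, 1))
```

Rounding the first value up gives the 88 events quoted for classifier + patients, and the second is within a couple of events of the 264 expected in classifier negative patients.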

Study-wise false positivity rate is limited to 5%
with analysis plan A
It is not necessary or appropriate to require
that the treatment vs control difference be
significant overall before doing the analysis
within subsets

66
Analysis Plan B (Limited confidence in test)

Compare the new drug to the control overall for
all patients ignoring the classifier.
If $p_{\text{overall}} \le 0.03$ claim effectiveness for the
eligible population as a whole
Otherwise perform a single subset analysis
evaluating the new drug in the classifier +
patients
If $p_{\text{subset}} \le 0.02$ claim effectiveness for the
classifier + patients.
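
The decision rule of Analysis Plan B can be written as a short function. This is a sketch of the logic only; the function name and the returned strings are illustrative, and the p-values are assumed to come from the prespecified two-sided tests.

```python
def analysis_plan_b(p_overall, p_subset,
                    alpha_overall=0.03, alpha_subset=0.02):
    """Return the claim supported under Analysis Plan B.

    The overall test is examined first; the single subset test in
    classifier + patients is used only if the overall test fails.
    The two thresholds sum to the study-wise 0.05 level.
    """
    if p_overall <= alpha_overall:
        return "effective in eligible population as a whole"
    if p_subset <= alpha_subset:
        return "effective in classifier + patients"
    return "no claim"

print(analysis_plan_b(p_overall=0.20, p_subset=0.01))
```

Splitting the 0.05 between the two tests is what keeps the overall false positive rate controlled without requiring a significant overall result before the subset is examined.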

This analysis strategy is designed to not
penalize sponsors for having developed a
classifier
It provides sponsors with an incentive to develop
genomic classifiers

68
Sample size for Analysis Plan B

To have 90% power for detecting uniform 33%
reduction in overall hazard at 3% two-sided level
requires 297 events (instead of 263 for similar
power at 5% level)
If 25% of patients are positive, then when there
are 297 total events there will be approximately
75 events in positive patients
75 events provides 75% power for detecting 50%
reduction in hazard at 2% two-sided significance
level
By delaying evaluation in test positive patients,
80% power is achieved with 84 events and 90%
power with 109 events
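
The power figures for the subset analysis can be reproduced approximately by inverting the usual events formula. As before, the choice of approximation is an assumption (1:1 randomization, proportional hazards), and `power_for_events` is an illustrative name.

```python
from math import log, sqrt
from statistics import NormalDist

def power_for_events(events, hazard_ratio, alpha):
    """Approximate power of a two-sided test of the hazard ratio
    given the number of observed events (1:1 randomization assumed)."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(sqrt(events) * abs(log(hazard_ratio)) / 2 - z_alpha)

# 50% reduction in hazard tested at the 2% two-sided level
for d in (75, 84, 109):
    print(d, round(power_for_events(d, 0.5, 0.02), 2))
```

Under this approximation 75, 84 and 109 events correspond to roughly 75%, 80% and 90% power, in line with the slide.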

69
Analysis Plan C

Test for difference (interaction) between
treatment effect in test positive patients and
treatment effect in test negative patients at an
elevated level $\alpha_{\text{int}}$ (e.g. .10)
If interaction is significant at level $\alpha_{\text{int}}$ then
compare treatments separately for test positive
patients and test negative patients
Otherwise, compare treatments overall
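
As a sketch, the branching in Analysis Plan C looks like this. The slide gives the interaction level but not the level for the follow-up comparisons; the 0.05 used below is an assumption taken from the sample size slide for this plan, and the function name and return format are illustrative.

```python
def analysis_plan_c(p_interaction, p_positive, p_negative, p_overall,
                    alpha_int=0.10, alpha=0.05):
    """Return which comparisons are made and whether each is significant.

    alpha_int: elevated level for the interaction test
    alpha:     assumed level for the follow-up treatment comparisons
    """
    if p_interaction <= alpha_int:
        # Interaction detected: analyze the two strata separately
        return {"test +": p_positive <= alpha,
                "test -": p_negative <= alpha}
    # No interaction detected: a single overall comparison
    return {"overall": p_overall <= alpha}

print(analysis_plan_c(0.04, 0.001, 0.60, 0.03))
```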

70
Sample Size Planning for Analysis Plan C

88 events in test + patients needed to detect 50%
reduction in hazard at 5% two-sided significance
level with 90% power
If 25% of patients are positive, when there are
88 events in positive patients there will be
about 264 events in negative patients
264 events provides 90% power for detecting 33%
reduction in hazard at 5% two-sided significance
level

71
Simulation Results for Analysis Plan C

Using $\alpha_{\text{int}} = 0.10$, the interaction test has power
93.7% when there is a 50% reduction in hazard in
test positive patients and no treatment effect in
test negative patients
A significant interaction and significant
treatment effect in test positive patients is
obtained in 88% of cases under the above
conditions
If the treatment reduces hazard by 33% uniformly,
the interaction test is negative and the overall
test is significant in 87% of cases

72
Does the RCT Need to Be Significant Overall for
the T vs C Treatment Comparison?

No
It is incorrect to require that the overall T vs
C comparison be significant to claim that T is
better than C for test + patients but not for
test - patients
That requirement has been traditionally used to
protect against data dredging. It is
inappropriate for focused trials of a treatment
with a companion test.

73
(No Transcript)
74
Development of Genomic Classifiers

During phase II development or
Adaptively during phase III trial
Using archived specimens from previous phase III
trial

75
(No Transcript)
76
(No Transcript)
77
(No Transcript)
78
Biomarker Adaptive Threshold Design

Wenyu Jiang, Boris Freidlin & Richard Simon
JNCI 99:1036-43, 2007

79
Biomarker Adaptive Threshold Design

Randomized trial of T vs C
Have identified a biomarker score B thought to be
predictive of patients likely to benefit from T
relative to C
Eligibility not restricted by biomarker
No threshold for biomarker determined
Biomarker value scaled to range (0,1)
Time-to-event data

80
Procedure A

Compare T vs C for all patients
If results are significant at level .04 claim
broad effectiveness of T
Otherwise proceed as follows

81
Procedure A

Test T vs C restricted to patients with biomarker
$B > b$
Let $S(b)$ be log likelihood ratio statistic
Repeat for all values of b
Let $S = \max_b S(b)$
Compute null distribution of S by permuting
treatment labels
If the data value of S is significant at 0.01
level, then claim effectiveness of T for a
patient subset
Compute point and bootstrap interval estimates of
the threshold b
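
The second stage of Procedure A can be sketched as a permutation test of a maximized subset statistic. To keep the example self-contained, a simple squared z-statistic on a continuous outcome stands in for the log likelihood ratio statistic on time-to-event data that the slides describe; that substitution, the synthetic data, and every function name are assumptions made for illustration.

```python
import random

def subset_stat(outcome, treated, keep):
    """Stand-in for S(b): squared z-statistic for the T vs C difference
    in mean outcome among kept patients (not the slides' statistic)."""
    t = [y for y, a, k in zip(outcome, treated, keep) if k and a]
    c = [y for y, a, k in zip(outcome, treated, keep) if k and not a]
    if len(t) < 2 or len(c) < 2:
        return 0.0
    mt, mc = sum(t) / len(t), sum(c) / len(c)
    vt = sum((y - mt) ** 2 for y in t) / (len(t) - 1)
    vc = sum((y - mc) ** 2 for y in c) / (len(c) - 1)
    se2 = vt / len(t) + vc / len(c)
    return (mt - mc) ** 2 / se2 if se2 > 0 else 0.0

def max_stat(outcome, treated, biomarker, cutpoints):
    """Maximum of the subset statistic over candidate thresholds b."""
    return max(subset_stat(outcome, treated, [b > cut for b in biomarker])
               for cut in cutpoints)

def permutation_p_value(outcome, treated, biomarker, cutpoints,
                        n_perm=200, seed=0):
    """Null distribution of the maximum obtained by permuting labels."""
    rng = random.Random(seed)
    observed = max_stat(outcome, treated, biomarker, cutpoints)
    labels = list(treated)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if max_stat(outcome, labels, biomarker, cutpoints) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Synthetic data: benefit only for treated patients with biomarker > 0.5
rng = random.Random(1)
n = 200
biomarker = [i / n for i in range(n)]
treated = [i % 2 == 0 for i in range(n)]
outcome = [rng.gauss(0.0, 1.0) + (2.0 if a and b > 0.5 else 0.0)
           for a, b in zip(treated, biomarker)]
p_value = permutation_p_value(outcome, treated, biomarker,
                              cutpoints=[0.0, 0.25, 0.5, 0.75])
print(p_value)
```

Because the maximum over thresholds is recomputed inside every permutation, the p-value accounts for the search over b, which is why no separate multiplicity adjustment is needed.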

82
Estimation of Threshold
83
Estimated Power of Broad Eligibility Design
(n = 386 events) vs Adaptive Design A (n = 412
events): 80% power for 30% hazard reduction
84
(No Transcript)
85
Procedure B

$S(b)$ = log likelihood ratio statistic for treatment
effect in subset of patients with $B \ge b$
$T = \max\{S(0) + R,\ \max_b S(b)\}$
Compute null distribution of T by permuting
treatment labels
If the data value of T is significant at 0.05
level, then reject null hypothesis that T is
ineffective
Compute point and interval estimates of the
threshold b

86
(No Transcript)
87
Sample Size Planning (A)

Standard broad eligibility trial is sized for 80%
power to detect reduction in hazard D at
significance level 5%
Biomarker adaptive threshold design is sized for
80% power to detect same reduction in hazard D at
significance level 4% for overall analysis

88
Sample Size Planning (B)

Estimate power of procedure B relative to
standard broad eligibility trial based on Table 1
for the row corresponding to the expected
proportion of sensitive patients and the
target hazard ratio for sensitive patients
e.g. 25% sensitive patients and hazard ratio 0.4 gives RE = .429/.641 = .67
When B has power 80%, overall test has power
80% × .67 = 53%
Use formula B.2 to determine the approximate
number of events needed for overall test to have
power 53% for detecting hazard ratio 0.4 limited to 25% of
patients

89
Events Needed to Detect a Given Hazard Ratio With
Proportional Hazards
90
Events (D) Needed for Overall Test to Detect a
Hazard Ratio Limited to a Fraction of Patients
91
Example Sample Size Planning for Procedure B

Design a trial to detect hazard ratio 0.4 (60% reduction)
limited to 25% of patients
Relative efficiency from Table 1: .429/.641 = .67
When procedure B has power 80%, standard test has
power 80% × .67 = 53%
Formula B.2 gives D = 230 events to have 53% power
for overall test and thus approximate 80% power
for B
Overall test needs D = 472 events for 80% power for
detecting the diluted treatment effect
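
The relative efficiency step in this example is plain arithmetic and can be checked directly. The two Table 1 entries are taken from the slide as quoted; the variable names are illustrative.

```python
# Table 1 entries quoted on the slide for 25% sensitive patients, hazard ratio 0.4
power_overall_entry = 0.429
power_procedure_b_entry = 0.641

relative_efficiency = power_overall_entry / power_procedure_b_entry

# Target power for procedure B, converted to the power the overall
# test would have at the same number of events
target_power_b = 0.80
implied_overall_power = target_power_b * relative_efficiency

print(round(relative_efficiency, 2), round(implied_overall_power, 3))
```

The implied overall power of a little over one half is what is then fed into formula B.2 to obtain the number of events.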

92
(No Transcript)
93
Multiple Biomarker Design

Have identified K candidate binary classifiers
$B_1, \ldots, B_K$ thought to be predictive of patients
likely to benefit from T relative to C
Eligibility not restricted by candidate
classifiers
For notation let $B_0$ denote the classifier with
all patients positive

Test T vs C restricted to patients positive for
$B_k$ for $k = 0, 1, \ldots, K$
Let $S(B_k)$ be log likelihood ratio statistic for
treatment effect in patients positive for $B_k$
($k = 1, \ldots, K$)
Let $S^* = \max_k S(B_k)$, $k^* = \arg\max_k S(B_k)$
For a global test of significance
Compute null distribution of $S^*$ by permuting
treatment labels
If the data value of $S^*$ is significant at 0.05
level, then claim effectiveness of T for patients
positive for $B_{k^*}$

Test T vs C restricted to patients positive for
$B_k$ for $k = 0, 1, \ldots, K$
Let $S(B_k)$ be log likelihood ratio statistic for
treatment effect in patients positive for $B_k$
($k = 1, \ldots, K$)
Let $S^* = \max_k S(B_k)$, $k^* = \arg\max_k S(B_k)$
The new treatment is superior to control for the
population defined by $k^*$
Repeating the analysis for bootstrap samples of
cases provides
an estimate of the stability of $k^*$ (the
indication)
an interval estimate of $S^*$ (the size of
treatment effect in the target population)

96
Adaptive Signature Design

Boris Freidlin and Richard Simon
Clinical Cancer Research 11:7872-8, 2005

97
Adaptive Signature Design: End of Trial Analysis

Compare E to C for all patients at significance
level 0.04
If overall H0 is rejected, then claim
effectiveness of E for eligible patients
Otherwise

Otherwise
Using only the first half of patients accrued
during the trial, develop a binary classifier
that predicts the subset of patients most likely
to benefit from the new treatment T compared to
control C
Compare T to C for patients accrued in second
stage who are predicted responsive to T based on
classifier
Perform test at significance level 0.01
If H0 is rejected, claim effectiveness of T for
subset defined by classifier

99
Classifier Development

Using data from stage 1 patients, fit all single
gene logistic models ($j = 1, \ldots, M$)
Select genes with interaction significant at
level $\eta$

100
Classification of Stage 2 Patients

For ith stage 2 patient, selected gene j votes
to classify patient as preferentially sensitive
to T if

101
Classification of Stage 2 Patients

Classify ith stage 2 patient as differentially
sensitive to T relative to C if at least G
selected genes vote for differential sensitivity
of that patient

102
Treatment effect restricted to subset. 10% of
patients sensitive, 10 sensitivity genes, 10,000
genes, 400 patients.
103
Empirical Power: RR for Control Patients 25%
104
Cross-Validated Adaptive Signature Design (to be
submitted for publication)

Wenyu Jiang, Boris Freidlin, Richard Simon

105
Cross-Validated Adaptive Signature Design: End of
Trial Analysis

Compare T to C for all patients at significance
level $\alpha_{\text{overall}}$
If overall H0 is rejected, then claim
effectiveness of T for eligible patients
Otherwise

106
Otherwise

Partition the full data set into K parts
Form a training set by omitting one of the K
parts. The omitted part is the test set
Using the training set, develop a predictive
classifier of the subset of patients who benefit
preferentially from the new treatment T compared
to control C using the methods developed for the
ASD
Classify the patients in the test set as
sensitive (classifier +) or insensitive
(classifier -)
Repeat this procedure K times, leaving out a
different part each time
After this is completed, all patients in the full
dataset are classified as sensitive or
insensitive

107

Compare T to C for sensitive patients by
computing a test statistic S e.g. the difference
in response proportions or log-rank statistic
(for survival)
Generate the null distribution of S by permuting
the treatment labels and repeating the entire
K-fold cross-validation procedure
Perform test at significance level
$0.05 - \alpha_{\text{overall}}$
If H0 is rejected, claim effectiveness of T for
subset defined by classifier
The sensitive subset is determined by developing
a classifier using the full dataset
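
The K-fold classification step of the cross-validated design can be sketched as follows. The classifier development method itself is not reproduced here: `toy_developer` is a hypothetical placeholder that thresholds a single covariate at the training median, and `cross_validated_labels` is an illustrative name. Only the fold logic reflects the procedure described on the slides.

```python
import random
from statistics import median

def cross_validated_labels(patients, develop_classifier, k=10, seed=0):
    """Label every patient as sensitive (True) or insensitive (False)
    using a classifier developed without that patient's fold."""
    rng = random.Random(seed)
    order = list(range(len(patients)))
    rng.shuffle(order)
    folds = [order[i::k] for i in range(k)]  # K disjoint parts
    labels = [None] * len(patients)
    for fold in folds:
        held_out = set(fold)
        training = [patients[i] for i in order if i not in held_out]
        classify = develop_classifier(training)  # training set only
        for i in fold:
            labels[i] = classify(patients[i])
    return labels

def toy_developer(training):
    """Hypothetical placeholder for classifier development."""
    cut = median(p["x"] for p in training)
    return lambda p: p["x"] > cut

patients = [{"x": i} for i in range(50)]
labels = cross_validated_labels(patients, toy_developer, k=5)
print(sum(labels), "of", len(labels), "classified sensitive")
```

For the permutation test, the treatment labels would be permuted and this entire function rerun for each permutation, so that classifier development is repeated inside the null distribution.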

108
70% Response to T in Sensitive Patients, 25%
Response to T Otherwise, 25% Response to C, 20%
Patients Sensitive
109
Does It Matter If the Randomization in the RCT
Was Not Stratified By the Test?

No
Stratification improves balance of stratification
factors in overall comparisons
Stratification does not improve comparability of
treatment (T) and control (C) groups within test
positive patients or within test negative
patients.
In a fully prospective trial, stratification of
the randomization by the test is only useful for
ensuring that all patients have adequate test
performed

110
Information about a predictive biomarker may
develop following completion of the pivotal
trials

It may be infeasible to conduct a new
prospective trial for a previously approved drug
KRAS for anti-EGFR antibodies in colorectal
cancer
HER2 for doxorubicin in breast cancer

111

In some cases the benefits of a prospective trial
can be closely achieved by the carefully planned
use of archived tissue from a previously
conducted randomized clinical trial

112
Use of Archived Specimens in Evaluation of
Prognostic and Predictive Biomarkers
Richard M. Simon, Soonmyung Paik and Daniel F. Hayes

Claims of medical utility for prognostic and
predictive biomarkers based on analysis of
archived tissues can be considered to have either
a high or low level of evidence depending on
several key factors.
Studies using archived tissues, when conducted
under ideal conditions and independently
confirmed can provide the highest level of
evidence.
Traditional analyses of prognostic or predictive
factors, using non analytically validated assays
on a convenience sample of tissues and conducted
in an exploratory and unfocused manner provide a
very low level of evidence for clinical utility.

113
Use of Archived Specimens in Evaluation of
Prognostic and Predictive Biomarkers
Richard M. Simon, Soonmyung Paik and Daniel F. Hayes

For Level I Evidence
(i) archived tissue adequate for a successful
assay must be available on a sufficiently large
number of patients from a phase III trial that
the appropriate analyses have adequate
statistical power and that the patients included
in the evaluation are clearly representative of
the patients in the trial.
(ii) The test should be analytically and
pre-analytically validated for use with archived
tissue.
(iii) The analysis plan for the biomarker
evaluation should be completely specified in
writing prior to the performance of the biomarker
assays on archived tissue and should be focused
on evaluation of a single completely defined
classifier.
(iv) The results from archived specimens should be
validated using specimens from a similar, but
separate, study.

114
(No Transcript)
115
Revised Levels of Evidence for Tumor Marker
Studies
116
New Paradigms for Clinical Trials in Predictive
Medicine

Developments in biotechnology have forced
statisticians to focus on prediction problems
This has led to important new methodological
developments for $p \gg n$ problems in which number of
genes is much greater than the number of cases
Statistics has over-focused on inference. Many of
the methods and much of the conventional wisdom
of biostatistics are based on inference problems

117
Some statisticians believe that accurate
prediction is not possible for $p \gg n$

Accurate prediction is often possible, but
standard statistical methods for model building
and evaluation are not effective

118

$p > n$ prediction problems are not multiple
comparison problems
Feature selection should be optimized for
accurate prediction, not for controlling the
false discovery rate
Goodness of fit to training data should not be
used to guide model building nor to evaluate
model performance

119

Odds ratios, hazard ratios and statistical
significance of regression coefficients are not
proper measures of predictive accuracy

120

Validation of a predictive model means that the
model predicts accurately for independent data

121
Prediction Based Clinical Trials

Using cross-validation we can evaluate new
methods for analysis of clinical trials in terms
of their intended use which is informing
therapeutic decision making

122

$f_j(x)$ = probability of response for patient with
covariate vector x who receives treatment j

123
Single Hypothesis Testing Based Decision Making
in an RCT

Test $H_0$: $E_x[f_T(x)] = E_x[f_C(x)]$
or $f_T(x) = f_C(x)$ for all x
If you reject H0 then treat future patients with
T, otherwise treat future patients with C

124
Other Approaches
125
Predicting the Effect of Analysis Methods on
Patient Outcome

At the conclusion of the trial randomly partition
the patients into 10 equally sized sets
$P_1, \ldots, P_{10}$
Let $D_{-i}$ denote the full dataset minus data for
patients in $P_i$
Using 10-fold complete cross-validation, omit
patients in $P_i$
Analyze trial using only data in $D_{-i}$ with both
the standard analysis and the alternative
analysis

126

For each patient j in Pi record the
cross-validated treatment recommendations based
on $D_{-i}$

127

Let ST denote the set of cases for which the
standard analysis recommends C and the
alternative analysis recommends T
Let SC denote the set of cases for which the
standard analysis recommends T and the
alternative analysis recommends C

128

For patients in ST compare outcomes for patients
who received T versus those who received C
For patients in SC compare outcomes for patients
who received T versus those who received C
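
The comparison described on these slides can be sketched in a few lines, assuming the cross-validated recommendations of the two analyses have already been recorded for every patient. A binary response is used for simplicity, and all names are illustrative.

```python
def discordant_sets(standard_rec, alternative_rec):
    """Indices where the two analyses disagree.

    s_t: standard recommends C, alternative recommends T
    s_c: standard recommends T, alternative recommends C
    """
    pairs = list(enumerate(zip(standard_rec, alternative_rec)))
    s_t = [i for i, (s, a) in pairs if s == "C" and a == "T"]
    s_c = [i for i, (s, a) in pairs if s == "T" and a == "C"]
    return s_t, s_c

def response_difference(indices, received, responded):
    """Response rate on T minus response rate on C within a set,
    or None if one arm is empty in that set."""
    t = [responded[i] for i in indices if received[i] == "T"]
    c = [responded[i] for i in indices if received[i] == "C"]
    if not t or not c:
        return None
    return sum(t) / len(t) - sum(c) / len(c)

standard = ["T", "T", "C", "C", "T"]
alternative = ["T", "C", "T", "T", "C"]
received = ["T", "C", "T", "C", "T"]   # randomized assignment
responded = [1, 0, 1, 0, 0]

s_t, s_c = discordant_sets(standard, alternative)
print(response_difference(s_t, received, responded),
      response_difference(s_c, received, responded))
```

Patients on whom the two analyses agree contribute nothing to the comparison, so the evaluation rests entirely on the randomized outcomes within the two discordant sets.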

129

Hence, alternative methods for analyzing RCTs
can be evaluated in an unbiased manner with
regard to their value to patients using the
actual RCT data

130
Conclusions

New biotechnology and knowledge of tumor biology
provide important opportunities to improve
therapeutic decision making
Treatment of broad populations with regimens that
do not benefit most patients is increasingly no
longer necessary nor economically sustainable
The established molecular heterogeneity of human
diseases requires the use of new approaches to the
development and evaluation of therapeutics

131
Conclusions

Some of the conventional wisdom about statistical
analysis of clinical trials is not applicable to
trials dealing with co-development of drugs and
diagnostics
e.g. subset analysis if the overall results are
not significant or if an interaction test is not
significant or if the randomization was not
stratified by the subsetting variable

132
Conclusions

Can we develop new drugs in a manner more
consistent with modern tumor biology and obtain
reliable information about what regimens work for
what kinds of patients?
The information doesn't have to be perfect to be
much better than what we currently have

133
Conclusions

Co-development of drugs and companion diagnostics
increases the complexity of drug development
It does not make drug development simpler,
cheaper and quicker
But it may make development more successful and
it has great potential value for patients and for
the economics of health care
