Predictive Biomarkers and Their Use in Clinical Trial Design presentation

Title: Predictive Biomarkers and Their Use in Clinical Trial Design

1
Predictive Biomarkers and Their Use in Clinical
Trial Design

Richard Simon, D.Sc.
Chief, Biometric Research Branch
National Cancer Institute
http://linus.nci.nih.gov

2
BRB Website: http://linus.nci.nih.gov

PowerPoint presentations and audio files
Reprints and Technical Reports
BRB-ArrayTools software
BRB-ArrayTools Data Archive
Sample Size Planning for Targeted Clinical Trials

Many cancer treatments benefit only a small
proportion of the patients to whom they are
administered
Many early stage patients don't need systemic
treatment
Many tumors are not sensitive to the drugs
administered
Targeting treatment to the right patients
Benefits patients
May reduce health care costs
May improve the success rate of clinical
development

Conducting a phase III trial in the traditional
way with tumors of a specified
site/stage/pre-treatment category may result in a
false negative trial
Unless a sufficiently large proportion of the
patients have tumors driven by the targeted
pathway

Positive results in traditionally designed broad
eligibility phase III trials may result in
subsequent treatment of many patients who do not
benefit

6
Biomarkers

Surrogate endpoints
A measurement made before and after treatment to
determine whether the treatment is working
Prognostic markers
A measurement made before treatment to indicate
long-term outcome for patients untreated or
receiving standard treatment
Predictive classifiers
A measurement made before treatment to select
good patient candidates for the specific
treatment

7
Surrogate Endpoints

It is very difficult to properly validate a
biomarker as a surrogate for clinical outcome. It
requires a series of randomized trials with both
the candidate biomarker and clinical outcome
measured
Must demonstrate that treatment vs control
differences for the candidate surrogate are
concordant with the treatment vs control
differences for clinical outcome
It is not sufficient to demonstrate that the
biomarker responders survive longer than the
biomarker non-responders

Biomarkers for use as endpoints in phase I or II
studies need not be validated as surrogates for
clinical outcome
Unvalidated biomarkers can also be used for early
futility analyses in phase III trials

9
Prognostic Factors

Most prognostic factors are not used because they
are not therapeutically relevant
Most prognostic factor studies use a convenience
sample of patients for whom tissue is available.
Often the patients are too heterogeneous to
support therapeutically relevant conclusions
Prognostic factors in a focused population can be
therapeutically useful
Oncotype DX

10
Validation = Fit for Purpose

FDA terminology of "valid biomarker" and
"probable valid biomarker" is not applicable to
predictive classifiers
"Validation" has meaning only as fitness for
purpose, and the purpose of predictive classifiers
is completely different from that of surrogate
endpoints

12
The Roadmap

Develop a completely specified predictive
classifier of the patients likely to benefit from
a new drug
Establish reproducibility of measurement of the
classifier
Use the completely specified classifier to design
and analyze a new clinical trial to evaluate
effectiveness of the new treatment with a
pre-defined analysis plan.

13
Guiding Principle

The data used to develop the classifier must be
distinct from the data used to test hypotheses
about treatment effect in subsets determined by
the classifier
Developmental studies are exploratory
Studies on which treatment effectiveness claims
are to be based should be definitive studies that
test a treatment hypothesis in a patient
population completely pre-specified by the
classifier

14
Predictive Classifier

Based on biological measurements of one or more
genes, transcripts, or protein products
If multivariate, includes a specified form for
combining measurements of components to provide a
binary prediction
Weights and cut-off for positivity specified

15
Predictive Index

Based on biological measurements of one or more
genes, transcripts, or protein products
If multivariate, includes a specified form for
combining measurements of components to provide a
multi-level or quantitative index
Weights specified

16
Development of Genomic Classifiers

Single gene or protein based on knowledge of
therapeutic target
Indicates whether drug can inhibit targeted gene
or protein and whether tumor progression is
driven by the targeted pathway
Empirically determined based on evaluation of a
set of candidate genes or assays
e.g. EGFR assays
Empirically determined based on genome-wide
correlation of gene expression with response

17
Developing Predictive Classifiers

During phase II development or
After failed phase III trial using archived
specimens.
Adaptively during early portion of phase III
trial.

18
Developing Predictive Classifiers

To predict response from new drug using response
data for single arm phase II trials
To predict non-response from control regimen
using response data for control treated patients
To predict preferential response or delayed
progression from randomized phase II (or phase
III) trial data of new drug vs control

19
New Drug Developmental Strategy (I)

Develop a predictive classifier that identifies
the patients likely to benefit from the new drug
Develop a reproducible assay for the classifier
Use the classifier to restrict eligibility to a
prospectively planned evaluation of the new drug
Demonstrate that the new drug is effective in the
prospectively defined set of patients determined
by the classifier

20
Develop Predictor of Response to New Drug
Using phase II data, develop predictor of
response to new drug
Patient predicted responsive: randomize to new drug vs. control
Patient predicted non-responsive: off study
21
Applicability of Design I

Primarily for settings where the classifier is
based on a single gene whose protein product is
the target of the drug
e.g. Herceptin
With a strong biological basis for the
classifier, it may be unacceptable to expose
classifier negative patients to the new drug
Without strong biological basis or adequate phase
II data, FDA may have difficulty approving the
test based on this phase III design

22
"We don't think that this drug will help you
because your tumor is test negative. But we need
to show the FDA that a drug we don't think will
help test negative patients actually doesn't."
23
Evaluating the Efficiency of Strategy (I)

Simon R and Maitournam A. Evaluating the
efficiency of targeted designs for randomized
clinical trials. Clinical Cancer Research
10:6759-63, 2004; Correction and supplement
12:3229, 2006
Maitournam A and Simon R. On the efficiency of
targeted clinical trials. Statistics in Medicine
24:329-339, 2005.
Reprints and interactive sample size calculations
at http://linus.nci.nih.gov

24
Compared two Clinical Trial Designs

Standard design
Randomized comparison of T to C without screening
or selection using classifier
Targeted design
Obtain tissue and evaluate classifier on
candidate patients
Randomize only classifier positive patients
Classifier negative patients not further studied

Efficiency of targeted design relative to
standard design depends on
proportion of patients test positive
effectiveness of new drug (compared to control)
for test negative patients
When less than half of patients are test positive
and the drug has little or no benefit for test
negative patients, the targeted design requires
dramatically fewer randomized patients
The targeted design may require fewer or more
screened patients than the standard design

26
No Treatment Benefit for Assay Negative Patients: n_std / n_targeted

Proportion Assay Positive   Randomized   Screened
0.75                        1.78         1.33
0.5                         4            2
0.25                        16           4
27
Treatment Benefit for Assay Negative Patients Half That of
Assay Positive Patients: n_std / n_targeted

Proportion Assay Positive   Randomized   Screened
0.75                        1.31         0.98
0.5                         1.78         0.89
0.25                        2.56         0.64
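
The ratios in the two tables above can be reproduced with a short sketch. It assumes the usual approximation that required sample size scales with the inverse square of the treatment effect, so a trial in all comers sees the assay positive effect diluted by the assay negative patients. The function names and the `neg_fraction` parameter are illustrative and do not come from the slides.

```python
def randomized_ratio(p_pos, neg_fraction):
    """Approximate n_std / n_targeted for randomized patients.

    p_pos: proportion of patients who are assay positive.
    neg_fraction: benefit in assay negative patients as a fraction of the
        benefit in assay positive patients (0 = none, 0.5 = half).
    Assumption: sample size is proportional to 1 / (treatment effect)^2.
    """
    diluted = p_pos + (1 - p_pos) * neg_fraction  # overall effect / positive effect
    return 1 / diluted ** 2


def screened_ratio(p_pos, neg_fraction):
    """Approximate ratio for screened patients: the targeted design screens
    about 1 / p_pos patients for each patient it randomizes."""
    return p_pos * randomized_ratio(p_pos, neg_fraction)


for p in (0.75, 0.5, 0.25):
    print(p,
          round(randomized_ratio(p, 0.0), 2), round(screened_ratio(p, 0.0), 2),
          round(randomized_ratio(p, 0.5), 2), round(screened_ratio(p, 0.5), 2))
```

A screened ratio below 1 means the targeted design needs more screened patients than the standard design needs randomized patients, which is the case in the second table.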
28
Trastuzumab

Metastatic breast cancer
234 randomized patients per arm
90% power for 13.5% improvement in 1-year
survival over 67% baseline at 2-sided .05 level
If benefit were limited to the 25% assay positive
patients, overall improvement in survival would
have been 3.375%
4025 patients/arm would have been required
If assay negative patients benefited half as much, 627
patients per arm would have been required
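
The 3.375% figure is a weighted average of the improvement in assay positive and assay negative patients. A minimal sketch of that arithmetic follows. The function name is illustrative, and the second call applies the same averaging to the "half as much" scenario; its result is computed by this sketch and is not a number stated on the slide.

```python
def overall_improvement(p_pos, gain_pos, gain_neg=0.0):
    """Average absolute improvement (percentage points) over all patients.

    p_pos: proportion of patients who are assay positive.
    gain_pos / gain_neg: improvement in assay positive / negative patients.
    """
    return p_pos * gain_pos + (1 - p_pos) * gain_neg


# Benefit limited to the 25% assay positive patients
print(overall_improvement(0.25, 13.5))        # 3.375

# Assay negative patients benefit half as much (13.5 / 2 = 6.75)
print(overall_improvement(0.25, 13.5, 6.75))  # 8.4375
```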

29
Comparison of Targeted to Untargeted Design
Simon R, Development and Validation of Biomarker
Classifiers for Treatment Selection, JSPI

Treatment hazard ratio for marker positive patients: 0.5
Number of events for targeted design: 74

Percent of Patients Marker Positive   Number of Events for Traditional Design
20                                    2040
33                                    720
50                                    316

30
Web Based Software for Comparing Sample Size
Requirements

http://linus.nci.nih.gov

31
Developmental Strategy (II)
32
Developmental Strategy (II)

Do not use the diagnostic to restrict
eligibility, but to structure a prospective
analysis plan
Having a prospective analysis plan is essential
Stratifying (balancing) the randomization is
not sufficient but ensures that all randomized
patients will have tissue available
The purpose of the study is to evaluate the new
treatment overall and for the pre-defined
subsets, not to modify or refine the classifier
The purpose is not to demonstrate that repeating
the classifier development process on independent
data results in the same classifier

33
Analysis Plan A (confidence in classifier)

Compare the new drug to the control for
classifier positive patients
If p > 0.05 make no claim of effectiveness
If p ≤ 0.05 claim effectiveness for the
classifier positive patients and
Compare new drug to control for classifier
negative patients using 0.05 threshold of
significance

34
Sample size for Analysis Plan A

88 events in classifier positive patients needed to
detect 50% reduction in hazard at 5% two-sided
significance level with 90% power
If 25% of patients are positive, then when there
are 88 events in positive patients there will be
about 264 events in negative patients
264 events provides 90% power for detecting 33%
reduction in hazard at 5% two-sided significance
level
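
These event counts are consistent with the standard normal-approximation formula for a log-rank test with 1:1 randomization, events = 4 (z_alpha/2 + z_beta)^2 / (ln HR)^2. The sketch below assumes that formula, which the slide does not name, and uses an illustrative function name. It gives 88 events for a hazard ratio of 0.5, and roughly 262 for a hazard ratio of 0.67, close to the 264 quoted above.

```python
from math import ceil, log
from statistics import NormalDist


def events_needed(hazard_ratio, alpha=0.05, power=0.90):
    """Approximate total events for a two-sided log-rank test, 1:1 allocation.

    Assumption: normal approximation with var(log HR) ~ 4 / events.
    """
    z = NormalDist().inv_cdf
    return 4 * (z(1 - alpha / 2) + z(power)) ** 2 / log(hazard_ratio) ** 2


print(ceil(events_needed(0.5)))   # 88 events for a 50% reduction in hazard
print(events_needed(0.67))        # roughly 262 events for a 33% reduction

# With 25% of patients positive and similar event rates, negative patients
# contribute about three events for every event in positive patients.
print(88 * (1 - 0.25) / 0.25)     # 264.0
```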

Study-wise false positivity rate is limited to 5%
with analysis plan A
It is not necessary or appropriate to require
that the treatment vs control difference be
significant overall before doing the analysis
within subsets

36
Analysis Plan B (confidence in overall effect)

Compare the new drug to the control overall for
all patients ignoring the classifier.
If p_overall ≤ 0.03 claim effectiveness for the
eligible population as a whole
Otherwise perform a single subset analysis
evaluating the new drug in the classifier
positive patients
If p_subset ≤ 0.02 claim effectiveness for the
classifier positive patients.
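
The decision rule of Analysis Plan B can be written as a small function. This is only a sketch: the function name and return strings are illustrative. Because the two thresholds sum to 0.03 + 0.02 = 0.05, the chance of any false claim is bounded by 5%.

```python
def analysis_plan_b(p_overall, p_subset, alpha_overall=0.03, alpha_subset=0.02):
    """Return the claim supported under Analysis Plan B.

    p_overall: p-value for new drug vs control in all randomized patients.
    p_subset:  p-value for new drug vs control in classifier positive patients.
    """
    if p_overall <= alpha_overall:
        return "effective in eligible population as a whole"
    # The subset test is performed only when the overall test is not significant.
    if p_subset <= alpha_subset:
        return "effective in classifier positive patients"
    return "no claim of effectiveness"


print(analysis_plan_b(0.01, 0.50))
print(analysis_plan_b(0.20, 0.015))
print(analysis_plan_b(0.20, 0.04))
```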

This analysis strategy is designed to not
penalize sponsors for having developed a
classifier
It provides sponsors with an incentive to develop
genomic classifiers

38
Sample size for Analysis Plan B

To have 90% power for detecting uniform 33%
reduction in overall hazard at 3% two-sided
level requires 297 events (instead of 263 for
similar power at 5% level)
If 25% of patients are positive, when there are
297 total events there will be approximately 75
events in positive patients
75 events provides 75% power for detecting 50%
reduction in hazard at 2% two-sided significance
level
By delaying evaluation in test positive patients,
80% power is achieved with 84 events and 90%
power with 109 events
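
The power figures for the subset test can be checked with the same normal approximation for the log-rank test (an assumption; the slide does not say how the values were obtained). The function name is illustrative.

```python
from math import log, sqrt
from statistics import NormalDist


def logrank_power(events, hazard_ratio, alpha):
    """Approximate power of a two-sided log-rank test with 1:1 allocation.

    Assumption: log hazard ratio estimate is normal with variance 4 / events.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(sqrt(events) / 2 * abs(log(hazard_ratio)) - z_alpha)


# 50% reduction in hazard tested at the 2% two-sided level
for events in (75, 84, 109):
    print(events, round(logrank_power(events, 0.5, 0.02), 2))
```

Under these assumptions the three event counts give power of about 0.75, 0.80 and 0.90, in line with the slide.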

39
Analysis Plan C

Test for interaction between treatment effect in
test positive patients and treatment effect in
test negative patients
If interaction is significant at level α_int then
compare treatments separately for test positive
patients and test negative patients
Otherwise, compare treatments overall

40
Sample Size Planning for Analysis Plan C

88 events in classifier positive patients needed to
detect 50% reduction in hazard at 5% two-sided
significance level with 90% power
If 25% of patients are positive, when there are
88 events in positive patients there will be
about 264 events in negative patients
264 events provides 90% power for detecting 33%
reduction in hazard at 5% two-sided significance
level

41
Simulation Results for Analysis Plan C

Using α_int = 0.10, the interaction test has power
93.7% when there is a 50% reduction in hazard in
test positive patients and no treatment effect in
test negative patients
A significant interaction and significant
treatment effect in test positive patients is
obtained in 88% of cases under the above
conditions
If the treatment reduces hazard by 33% uniformly,
the interaction test is negative and the overall
test is significant in 87% of cases
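
The slide reports simulation results. As a rough cross-check, the power of the interaction test can also be approximated analytically. The sketch below makes several assumptions that are not stated in the slides: a one-sided interaction test, log hazard ratio estimates that are normal with variance 4 / events, and the 88 and 264 events from the sample size slide. Under those assumptions the result is about 94%, close to the simulated 93.7%.

```python
from math import log, sqrt
from statistics import NormalDist


def interaction_power(events_pos, events_neg, hr_pos, hr_neg, alpha_int=0.10):
    """Approximate power of a one-sided treatment-by-marker interaction test.

    Assumptions (not from the slides): one-sided test and
    var(log HR) ~ 4 / events within each marker subset.
    """
    se = sqrt(4 / events_pos + 4 / events_neg)      # SE of difference in log HRs
    effect = abs(log(hr_pos) - log(hr_neg)) / se    # standardized interaction
    return NormalDist().cdf(effect - NormalDist().inv_cdf(1 - alpha_int))


# 50% hazard reduction in test positive patients, none in test negative patients
print(interaction_power(88, 264, 0.5, 1.0))
```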

42
Web Based Software for Designing Stratified
Trials Using Predictive Biomarkers

http://linus.nci.nih.gov

43
The Roadmap

Develop a completely specified genomic classifier
of the patients likely to benefit from a new drug
Establish reproducibility of measurement of the
classifier
Use the completely specified classifier to design
and analyze a new clinical trial to evaluate
effectiveness of the new treatment with a
pre-defined analysis plan.

44
Guiding Principle

The data used to develop the classifier must be
distinct from the data used to test hypotheses
about treatment effect in subsets determined by
the classifier
Developmental studies are exploratory
And not closely regulated by FDA
Studies on which treatment effectiveness claims
are to be based should be definitive studies that
test a treatment hypothesis in a patient
population completely pre-specified by the
classifier

45
Test

How does this approach differ from conducting a
RCT comparing a new treatment to a control and
then performing numerous post-hoc subset
analyses?

46
Use of Archived Samples

Develop a binary classifier of the patients most
likely to benefit from the new treatment using
archived specimens from a negative phase III
clinical trial
Evaluate the new treatment compared to control
treatment in the classifier positive subset in a
separate clinical trial
Prospective targeted type I trial
Using archived specimens from a second previously
conducted clinical trial

48
Biomarker Adaptive Threshold Design

Wenyu Jiang, Boris Freidlin and Richard Simon
JNCI 99:1036-43, 2007

49
Biomarker Adaptive Threshold Design

Randomized phase III trial comparing new
treatment E to control C
Survival or DFS endpoint

50
Biomarker Adaptive Threshold Design

Have identified a predictive index B thought to
be predictive of patients likely to benefit from
E relative to C
Eligibility not restricted by biomarker
No threshold for biomarker determined

51
Analysis Plan

S(b) = log likelihood ratio statistic for treatment
versus control comparison in subset of patients
with B ≥ b
Compute S(b) for all possible threshold values
Determine T = max S(b)
Compute null distribution of T by permuting
treatment labels
Permute the labels of which patients are in which
treatment group
Re-analyze to determine T for permuted data
Repeat for 10,000 permutations

If the data value of T is significant at 0.05
level using the permutation null distribution of
T, then reject null hypothesis that E is
ineffective
Compute point and bootstrap confidence interval
estimates of the threshold b
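
The permutation procedure above can be sketched in a few lines. To keep the example self-contained, it swaps in a simpler statistic for S(b): a scaled absolute difference in mean outcome between arms, not the survival log likelihood ratio used in the design. The function names, the toy data and the default of 200 permutations (the slide uses 10,000) are all illustrative assumptions.

```python
import random


def subset_statistic(outcome, treat, marker, b, min_per_arm=2):
    """Stand-in for S(b): scaled |difference in mean outcome| between arms
    among patients with marker >= b (not a survival likelihood ratio)."""
    e = [y for y, t, m in zip(outcome, treat, marker) if m >= b and t == 1]
    c = [y for y, t, m in zip(outcome, treat, marker) if m >= b and t == 0]
    if len(e) < min_per_arm or len(c) < min_per_arm:
        return 0.0
    scale = (len(e) * len(c) / (len(e) + len(c))) ** 0.5
    return abs(sum(e) / len(e) - sum(c) / len(c)) * scale


def max_statistic(outcome, treat, marker):
    """T = max of S(b) over every observed marker value used as a threshold."""
    return max(subset_statistic(outcome, treat, marker, b) for b in set(marker))


def permutation_p_value(outcome, treat, marker, n_perm=200, seed=0):
    """Permute treatment labels, recompute T, and compare with the observed T."""
    rng = random.Random(seed)
    t_obs = max_statistic(outcome, treat, marker)
    labels = list(treat)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if max_statistic(outcome, labels, marker) >= t_obs:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


# Toy data: treatment helps only patients with marker >= 0.5
data_rng = random.Random(1)
marker = [i / 80 for i in range(80)]
treat = [i % 2 for i in range(80)]
outcome = [(1.0 if t == 1 and m >= 0.5 else 0.0) + data_rng.gauss(0, 0.1)
           for t, m in zip(treat, marker)]

p_value = permutation_p_value(outcome, treat, marker)
print(p_value)
```

Because T is recomputed as a maximum over all thresholds for every permutation, the p-value accounts for having searched over thresholds.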

54
Adaptive Biomarker Threshold Design

Sample size planning methods described by Jiang,
Freidlin and Simon, JNCI 99:1036-43, 2007

55
Adaptive Signature Design: An adaptive design for
generating and prospectively testing a gene
expression signature for sensitive patients

Boris Freidlin and Richard Simon
Clinical Cancer Research 11:7872-8, 2005

56
Adaptive Signature Design: End of Trial Analysis

Compare E to C for all patients at significance
level 0.03
If overall H0 is rejected, then claim
effectiveness of E for eligible patients
Otherwise
Using only the first half of patients accrued
during the trial, develop a binary classifier
that predicts the subset of patients most likely
to benefit from the new treatment E compared to
control C
Compare E to C for patients accrued in second
stage who are predicted responsive to E based on
classifier
Perform test at significance level 0.02
If H0 is rejected, claim effectiveness of E for
subset defined by classifier

58
Treatment effect restricted to subset.
10% of patients sensitive, 10 sensitivity genes, 10,000
genes, 400 patients.

Test                                                                                        Power (%)
Overall .05 level test                                                                      46.7
Overall .04 level test                                                                      43.1
Sensitive subset .01 level test (performed only when overall .04 level test is negative)    42.2
Overall adaptive signature design                                                           85.3
59
Conclusions

New technology makes it increasingly feasible to
identify which patients are likely or unlikely to
benefit from a specified treatment
Targeting treatment can benefit patients, reduce
health care costs and improve the success rate of
new drug development

60
Conclusions

Some of the conventional wisdom about
biomarkers, how to develop predictive
classifiers and how to use them in clinical
trials is seriously flawed
Prospectively specified analysis plans for phase
III studies are essential to achieve reliable
results
Biomarker analysis does not mean exploratory
analysis except in developmental studies

61
Conclusions

Achieving the potential of new technology
requires paradigm changes in correlative
science and in important aspects of design and
analysis of clinical trials

62
Collaborators

Boris Freidlin
Aboubakar Maitournam
Kevin Dobbin
Wenyu Jiang
Yingdong Zhao
