Predictive Biomarkers and Their Use in Clinical Trial Design PowerPoint PPT Presentation


Title: Predictive Biomarkers and Their Use in Clinical Trial Design


1
Predictive Biomarkers and Their Use in Clinical
Trial Design
  • Richard Simon, D.Sc.
  • Chief, Biometric Research Branch
  • National Cancer Institute
  • http://linus.nci.nih.gov

2
BRB Website: http://linus.nci.nih.gov
  • Powerpoint presentations and audio files
  • Reprints and Technical Reports
  • BRB-ArrayTools software
  • BRB-ArrayTools Data Archive
  • Sample Size Planning for Targeted Clinical Trials

3
  • Many cancer treatments benefit only a small
    proportion of the patients to whom they are
    administered
  • Many early stage patients don't need systemic
    treatment
  • Many tumors are not sensitive to the drugs
    administered
  • Targeting treatment to the right patients
  • Benefits patients
  • May reduce health care costs
  • May improve the success rate of clinical
    development

4
  • Conducting a phase III trial in the traditional
    way with tumors of a specified
    site/stage/pre-treatment category may result in a
    false negative trial
  • Unless a sufficiently large proportion of the
    patients have tumors driven by the targeted
    pathway

5
  • Positive results in traditionally designed broad
    eligibility phase III trials may result in
    subsequent treatment of many patients who do not
    benefit

6
Biomarkers
  • Surrogate endpoints
  • A measurement made before and after treatment to
    determine whether the treatment is working
  • Prognostic markers
  • A measurement made before treatment to indicate
    long-term outcome for patients untreated or
    receiving standard treatment
  • Predictive classifiers
  • A measurement made before treatment to select
    good patient candidates for the specific
    treatment

7
Surrogate Endpoints
  • It is very difficult to properly validate a
    biomarker as a surrogate for clinical outcome. It
    requires a series of randomized trials with both
    the candidate biomarker and clinical outcome
    measured
  • Must demonstrate that treatment vs control
    differences for the candidate surrogate are
    concordant with the treatment vs control
    differences for clinical outcome
  • It is not sufficient to demonstrate that the
    biomarker responders survive longer than the
    biomarker non-responders

8
  • Biomarkers for use as endpoints in phase I or II
    studies need not be validated as surrogates for
    clinical outcome
  • Unvalidated biomarkers can also be used for early
    futility analyses in phase III trials

9
Prognostic Factors
  • Most prognostic factors are not used because they
    are not therapeutically relevant
  • Most prognostic factor studies use a convenience
    sample of patients for whom tissue is available.
    Often the patients are too heterogeneous to
    support therapeutically relevant conclusions
  • Prognostic factors in a focused population can be
    therapeutically useful
  • Oncotype DX

10
Validation = Fit for Purpose
  • FDA terminology of "valid biomarker" and
    "probable valid biomarker" is not applicable to
    predictive classifiers
  • Validation has meaning only as fitness for
    purpose, and the purpose of predictive classifiers
    is completely different from that of surrogate
    endpoints

11
(No Transcript)
12
The Roadmap
  1. Develop a completely specified predictive
    classifier of the patients likely to benefit from
    a new drug
  2. Establish reproducibility of measurement of the
    classifier
  3. Use the completely specified classifier to design
    and analyze a new clinical trial to evaluate
    effectiveness of the new treatment with a
    pre-defined analysis plan.

13
Guiding Principle
  • The data used to develop the classifier must be
    distinct from the data used to test hypotheses
    about treatment effect in subsets determined by
    the classifier
  • Developmental studies are exploratory
  • Studies on which treatment effectiveness claims
    are to be based should be definitive studies that
    test a treatment hypothesis in a patient
    population completely pre-specified by the
    classifier

14
Predictive Classifier
  • Based on biological measurements of one or more
    genes, transcripts, or protein products
  • If multivariate, includes a specified form for
    combining measurements of components to provide a
    binary prediction
  • Weights and cut-off for positivity specified

15
Predictive Index
  • Based on biological measurements of one or more
    genes, transcripts, or protein products
  • If multivariate, includes a specified form for
    combining measurements of components to provide a
    multi-level or quantitative index
  • Weights specified

16
Development of Genomic Classifiers
  • Single gene or protein based on knowledge of
    therapeutic target
  • Indicates whether drug can inhibit targeted gene
    or protein and whether tumor progression is
    driven by the targeted pathway
  • Empirically determined based on evaluation of a
    set of candidate genes or assays
  • e.g. EGFR assays
  • Empirically determined based on genome-wide
    correlation of gene expression with response

17
Developing Predictive Classifiers
  • During phase II development or
  • After failed phase III trial using archived
    specimens.
  • Adaptively during early portion of phase III
    trial.

18
Developing Predictive Classifiers
  • To predict response to the new drug using response
    data from single-arm phase II trials
  • To predict non-response to the control regimen
    using response data for control treated patients
  • To predict preferential response or delayed
    progression from randomized phase II (or phase
    III) trial data of new drug vs control

19
New Drug Developmental Strategy (I)
  • Develop a predictive classifier that identifies
    the patients likely to benefit from the new drug
  • Develop a reproducible assay for the classifier
  • Use the classifier to restrict eligibility to a
    prospectively planned evaluation of the new drug
  • Demonstrate that the new drug is effective in the
    prospectively defined set of patients determined
    by the classifier

20
Develop Predictor of Response to New Drug
  • Using phase II data, develop predictor of
    response to new drug
  • Patient predicted responsive: randomize to new
    drug or control
  • Patient predicted non-responsive: off study

21
Applicability of Design I
  • Primarily for settings where the classifier is
    based on a single gene whose protein product is
    the target of the drug
  • e.g. Herceptin
  • With a strong biological basis for the
    classifier, it may be unacceptable to expose
    classifier negative patients to the new drug
  • Without strong biological basis or adequate phase
    II data, FDA may have difficulty approving the
    test based on this phase III design

22
"We don't think that this drug will help you
because your tumor is test negative. But we need
to show the FDA that a drug we don't think will
help test negative patients actually doesn't."

23
Evaluating the Efficiency of Strategy (I)
  • Simon R and Maitournam A. Evaluating the
    efficiency of targeted designs for randomized
    clinical trials. Clinical Cancer Research
    10:6759-63, 2004. Correction and supplement
    12:3229, 2006
  • Maitournam A and Simon R. On the efficiency of
    targeted clinical trials. Statistics in Medicine
    24:329-339, 2005.
  • Reprints and interactive sample size calculations
    at http://linus.nci.nih.gov

24
Compared two Clinical Trial Designs
  • Standard design
  • Randomized comparison of T to C without screening
    or selection using classifier
  • Targeted design
  • Obtain tissue and evaluate classifier on
    candidate patients
  • Randomize only classifier positive patients
  • Classifier negative patients not further studied

25
  • Efficiency of targeted design relative to
    standard design depends on
  • proportion of patients test positive
  • effectiveness of new drug (compared to control)
    for test negative patients
  • When less than half of patients are test positive
    and the drug has little or no benefit for test
    negative patients, the targeted design requires
    dramatically fewer randomized patients
  • The targeted design may require fewer or more
    screened patients than the standard design

26
No Treatment Benefit for Assay Negative Patients: n_std / n_targeted
Proportion Assay Positive   Randomized   Screened
0.75                        1.78         1.33
0.5                         4            2
0.25                        16           4
27
Treatment Benefit for Assay Negative Patients Half That of
Assay Positive Patients: n_std / n_targeted
Proportion Assay Positive   Randomized   Screened
0.75                        1.31         0.98
0.5                         1.78         0.89
0.25                        2.56         0.64
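
The ratios in the two tables above are consistent with a simple approximation. It is sketched here as an assumption, not as the exact method of the cited papers. If the treatment effect in assay negative patients is a fraction `r` of the effect in assay positive patients, the average effect in an unselected trial is diluted by the factor `p + (1 - p) * r`, and the required sample size scales with the inverse square of the effect. The function names are illustrative.

```python
def randomized_ratio(p_positive, relative_effect_negative):
    """Approximate n_std / n_targeted for randomized patients.

    Assumes equal outcome variance in both subsets (an assumption of
    this sketch, not a statement from the slides).
    """
    dilution = p_positive + (1.0 - p_positive) * relative_effect_negative
    return 1.0 / dilution ** 2


def screened_ratio(p_positive, relative_effect_negative):
    """Approximate ratio for screened patients.

    The targeted design screens about 1 / p_positive patients for each
    patient it randomizes, so the randomized ratio is scaled by p_positive.
    """
    return p_positive * randomized_ratio(p_positive, relative_effect_negative)


# r = 0.0: no benefit for assay negative patients; r = 0.5: half the benefit
for r in (0.0, 0.5):
    for p in (0.75, 0.5, 0.25):
        print(r, p,
              round(randomized_ratio(p, r), 2),
              round(screened_ratio(p, r), 2))
```

A screened ratio below 1 means the targeted design has to screen more patients than the standard design enrolls, even though it randomizes fewer.
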
28
Trastuzumab
  • Metastatic breast cancer
  • 234 randomized patients per arm
  • 90% power for 13.5% improvement in 1-year
    survival over 67% baseline at 2-sided .05 level
  • If benefit were limited to the 25% assay positive
    patients, overall improvement in survival would
    have been 3.375%
  • 4025 patients/arm would have been required
  • If assay negative patients benefited half as much, 627
    patients per arm would have been required
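
The diluted overall improvement quoted on this slide is a weighted average of the subset effects. A minimal sketch of that arithmetic follows. The function name is illustrative, and the per-arm sample sizes (4025 and 627) are quoted from the slide and not recomputed here.

```python
def overall_improvement(p_positive, gain_positive, gain_negative):
    """Average survival improvement (percentage points) in an unselected
    population where a fraction p_positive is assay positive."""
    return p_positive * gain_positive + (1.0 - p_positive) * gain_negative


# Benefit limited to the 25% of patients who are assay positive
print(overall_improvement(0.25, 13.5, 0.0))    # 3.375

# Assay negative patients benefit half as much (6.75 points)
print(overall_improvement(0.25, 13.5, 6.75))   # 8.4375
```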

29
Comparison of Targeted to Untargeted Design
Simon R, Development and Validation of Biomarker
Classifiers for Treatment Selection, JSPI
Number of events required, by percent of patients marker positive
Hazard Ratio (Marker Positive)   Targeted Design   Traditional Design
                                                   20%     33%     50%
0.5                              74                2040    720     316

30
Web Based Software for Comparing Sample Size
Requirements
  • http://linus.nci.nih.gov

31
Developmental Strategy (II)
32
Developmental Strategy (II)
  • Do not use the diagnostic to restrict
    eligibility, but to structure a prospective
    analysis plan
  • Having a prospective analysis plan is essential
  • Stratifying (balancing) the randomization is
    not sufficient but ensures that all randomized
    patients will have tissue available
  • The purpose of the study is to evaluate the new
    treatment overall and for the pre-defined
    subsets, not to modify or refine the classifier
  • The purpose is not to demonstrate that repeating
    the classifier development process on independent
    data results in the same classifier

33
Analysis Plan A (confidence in classifier)
  • Compare the new drug to the control for
    classifier positive patients
  • If p > 0.05, make no claim of effectiveness
  • If p ≤ 0.05, claim effectiveness for the
    classifier positive patients and
  • Compare new drug to control for classifier
    negative patients using 0.05 threshold of
    significance
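
The sequential rule of Analysis Plan A can be written as a short decision function. This is a sketch of the logic on the slide. The function name, argument names, and returned labels are illustrative assumptions, and the p-values are taken as already computed from the pre-specified tests.

```python
def analysis_plan_a(p_positive, p_negative, alpha=0.05):
    """Return the subsets for which effectiveness may be claimed.

    p_positive: p-value, new drug vs control, classifier positive patients
    p_negative: p-value, new drug vs control, classifier negative patients
    """
    claims = []
    if p_positive > alpha:
        # Not significant in positives: stop, no claim at all
        return claims
    claims.append("classifier positive")
    # The negative subset is tested only after the positive subset succeeds
    if p_negative <= alpha:
        claims.append("classifier negative")
    return claims


print(analysis_plan_a(0.01, 0.30))  # ['classifier positive']
print(analysis_plan_a(0.20, 0.01))  # []
```

Because the negative subset is tested only after a significant result in the positive subset, the chance of any false claim stays at the 0.05 level.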

34
Sample size for Analysis Plan A
  • 88 events in classifier positive patients needed to
    detect 50% reduction in hazard at 5% two-sided
    significance level with 90% power
  • If 25% of patients are positive, then when there
    are 88 events in positive patients there will be
    about 264 events in negative patients
  • 264 events provides 90% power for detecting 33%
    reduction in hazard at 5% two-sided significance
    level
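
These event counts can be approximated with the standard log-rank formula for 1:1 randomization, events ≈ 4 (z_{1-α/2} + z_{power})² / (ln HR)². The slides do not say which formula was used, so treat this as an assumed approximation. A 33% reduction in hazard is taken here as a hazard ratio of 0.67.

```python
from math import log
from statistics import NormalDist


def events_required(hazard_ratio, alpha=0.05, power=0.90):
    """Approximate number of events for a two-sided log-rank test
    with equal allocation (assumed approximation, see lead-in)."""
    z = NormalDist().inv_cdf
    return 4.0 * (z(1.0 - alpha / 2.0) + z(power)) ** 2 / log(hazard_ratio) ** 2


print(events_required(0.50))   # a little under 88
print(events_required(0.67))   # close to 262

# If 25% of patients are classifier positive and both subsets have similar
# event rates (an assumption), events in negatives accrue 3 times as fast:
events_positive = 88
print(events_positive * (1 - 0.25) / 0.25)   # 264.0
```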

35
  • Study-wise false positivity rate is limited to 5%
    with analysis plan A
  • It is not necessary or appropriate to require
    that the treatment vs control difference be
    significant overall before doing the analysis
    within subsets

36
Analysis Plan B (confidence in overall effect)
  • Compare the new drug to the control overall for
    all patients ignoring the classifier.
  • If p_overall ≤ 0.03 claim effectiveness for the
    eligible population as a whole
  • Otherwise perform a single subset analysis
    evaluating the new drug in the classifier
    positive patients
  • If p_subset ≤ 0.02 claim effectiveness for the
    classifier positive patients.
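
Analysis Plan B splits the 0.05 significance level between the overall test (0.03) and a single fallback subset test (0.02). A minimal sketch of the rule follows. The function name and return labels are illustrative assumptions.

```python
def analysis_plan_b(p_overall, p_subset, alpha_overall=0.03, alpha_subset=0.02):
    """Return the population for which effectiveness may be claimed.

    p_overall: p-value for new drug vs control in all patients
    p_subset:  p-value for new drug vs control in classifier positive patients
    """
    if p_overall <= alpha_overall:
        return "eligible population as a whole"
    # The subset test is used only when the overall test fails
    if p_subset <= alpha_subset:
        return "classifier positive patients"
    return "no claim"


print(analysis_plan_b(0.02, 0.50))   # eligible population as a whole
print(analysis_plan_b(0.20, 0.01))   # classifier positive patients
print(analysis_plan_b(0.04, 0.03))   # no claim
```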

37
  • This analysis strategy is designed to not
    penalize sponsors for having developed a
    classifier
  • It provides sponsors with an incentive to develop
    genomic classifiers

38
Sample size for Analysis Plan B
  • To have 90% power for detecting uniform 33%
    reduction in overall hazard at 3% two-sided
    level requires 297 events (instead of 263 for
    similar power at 5% level)
  • If 25% of patients are positive, when there are
    297 total events there will be approximately 75
    events in positive patients
  • 75 events provides 75% power for detecting 50%
    reduction in hazard at 2% two-sided significance
    level
  • By delaying evaluation in test positive patients,
    80% power is achieved with 84 events and 90%
    power with 109 events
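
The power figures for the subset test can be checked with the usual normal approximation for the log-rank test under 1:1 randomization, where the test statistic has mean sqrt(events)/2 × |ln HR|. The formula is an assumption of this sketch, since the slides do not state how the figures were obtained.

```python
from math import log, sqrt
from statistics import NormalDist


def power_from_events(events, hazard_ratio, alpha):
    """Approximate power of a two-sided log-rank test with equal
    allocation, given the number of observed events."""
    nd = NormalDist()
    z_effect = sqrt(events) / 2.0 * abs(log(hazard_ratio))
    return nd.cdf(z_effect - nd.inv_cdf(1.0 - alpha / 2.0))


# 50% reduction in hazard tested at the 2% two-sided level
for events in (75, 84, 109):
    print(events, round(power_from_events(events, 0.5, 0.02), 2))
```

Under this approximation the three event counts give roughly 75%, 80% and 90% power, in line with the slide.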

39
Analysis Plan C
  • Test for interaction between treatment effect in
    test positive patients and treatment effect in
    test negative patients
  • If interaction is significant at level α_int then
    compare treatments separately for test positive
    patients and test negative patients
  • Otherwise, compare treatments overall
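
Analysis Plan C uses the interaction test as a gate that decides which comparisons are reported. The sketch below encodes that branching. The names are illustrative, and the default of 0.05 for the treatment comparisons is an assumption, since this slide specifies only the interaction level α_int.

```python
def analysis_plan_c(p_interaction, p_positive, p_negative, p_overall,
                    alpha_int=0.10, alpha=0.05):
    """Return which comparisons are significant under Analysis Plan C."""
    if p_interaction <= alpha_int:
        # Significant interaction: report the two subsets separately
        return {
            "test positive": p_positive <= alpha,
            "test negative": p_negative <= alpha,
        }
    # No evidence the effect differs by test result: report overall only
    return {"overall": p_overall <= alpha}


print(analysis_plan_c(0.04, 0.01, 0.60, 0.10))
print(analysis_plan_c(0.50, 0.20, 0.20, 0.03))
```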

40
Sample Size Planning for Analysis Plan C
  • 88 events in classifier positive patients needed to
    detect 50% reduction in hazard at 5% two-sided
    significance level with 90% power
  • If 25% of patients are positive, when there are
    88 events in positive patients there will be
    about 264 events in negative patients
  • 264 events provides 90% power for detecting 33%
    reduction in hazard at 5% two-sided significance
    level

41
Simulation Results for Analysis Plan C
  • Using α_int = 0.10, the interaction test has power
    93.7% when there is a 50% reduction in hazard in
    test positive patients and no treatment effect in
    test negative patients
  • A significant interaction and significant
    treatment effect in test positive patients is
    obtained in 88% of cases under the above
    conditions
  • If the treatment reduces hazard by 33% uniformly,
    the interaction test is negative and the overall
    test is significant in 87% of cases

42
Web Based Software for Designing Stratified
Trials Using Predictive Biomarkers
  • http://linus.nci.nih.gov

43
The Roadmap
  1. Develop a completely specified genomic classifier
    of the patients likely to benefit from a new drug
  2. Establish reproducibility of measurement of the
    classifier
  3. Use the completely specified classifier to design
    and analyze a new clinical trial to evaluate
    effectiveness of the new treatment with a
    pre-defined analysis plan.

44
Guiding Principle
  • The data used to develop the classifier must be
    distinct from the data used to test hypotheses
    about treatment effect in subsets determined by
    the classifier
  • Developmental studies are exploratory
  • And not closely regulated by FDA
  • Studies on which treatment effectiveness claims
    are to be based should be definitive studies that
    test a treatment hypothesis in a patient
    population completely pre-specified by the
    classifier

45
Test
  • How does this approach differ from conducting an
    RCT comparing a new treatment to a control and
    then performing numerous post-hoc subset
    analyses?

46
Use of Archived Samples
  • Develop a binary classifier of the patients most
    likely to benefit from the new treatment using
    archived specimens from a negative phase III
    clinical trial
  • Evaluate the new treatment compared to control
    treatment in the classifier positive subset in a
    separate clinical trial
  • Prospective targeted type I trial
  • Using archived specimens from a second previously
    conducted clinical trial

47
(No Transcript)
48
Biomarker Adaptive Threshold Design
  • Wenyu Jiang, Boris Freidlin and Richard Simon
  • JNCI 99:1036-43, 2007

49
Biomarker Adaptive Threshold Design
  • Randomized phase III trial comparing new
    treatment E to control C
  • Survival or DFS endpoint

50
Biomarker Adaptive Threshold Design
  • Have identified a predictive index B thought to
    be predictive of patients likely to benefit from
    E relative to C
  • Eligibility not restricted by biomarker
  • No threshold for biomarker determined

51
Analysis Plan
  • S(b) = log likelihood ratio statistic for treatment
    versus control comparison in subset of patients
    with B ≥ b
  • Compute S(b) for all possible threshold values
  • Determine T = max S(b)
  • Compute null distribution of T by permuting
    treatment labels
  • Permute the labels of which patients are in which
    treatment group
  • Re-analyze to determine T for permuted data
  • Repeat for 10,000 permutations
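
The permutation procedure above can be sketched in a few lines. To keep the example self-contained, S(b) is replaced here by a stand-in: the squared two-sample z statistic on a continuous outcome, not the log likelihood ratio from a survival model that the design uses. All names are illustrative, and the default of 1,000 permutations (instead of 10,000) only keeps the sketch fast.

```python
import random
from statistics import mean, variance


def subset_stat(outcome, treat, marker, b):
    """Stand-in for S(b): squared z statistic comparing arms among
    patients with marker >= b (NOT the survival likelihood ratio)."""
    e = [y for y, t, m in zip(outcome, treat, marker) if m >= b and t == 1]
    c = [y for y, t, m in zip(outcome, treat, marker) if m >= b and t == 0]
    if len(e) < 2 or len(c) < 2:
        return 0.0
    se2 = variance(e) / len(e) + variance(c) / len(c)
    if se2 == 0:
        return 0.0
    return (mean(e) - mean(c)) ** 2 / se2


def max_stat(outcome, treat, marker, thresholds):
    """T = max over candidate thresholds b of S(b)."""
    return max(subset_stat(outcome, treat, marker, b) for b in thresholds)


def permutation_p_value(outcome, treat, marker, thresholds,
                        n_perm=1000, seed=0):
    """Permutation p-value for T, obtained by shuffling treatment labels."""
    rng = random.Random(seed)
    t_obs = max_stat(outcome, treat, marker, thresholds)
    labels = list(treat)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute which patients are in which arm
        if max_stat(outcome, labels, marker, thresholds) >= t_obs:
            exceed += 1
    # Add-one correction so the estimated p-value is never exactly zero
    return t_obs, (exceed + 1) / (n_perm + 1)
```

Because the maximum over thresholds is recomputed for every permutation, the resulting p-value already accounts for having searched over all candidate cut-offs.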

52
  • If the data value of T is significant at 0.05
    level using the permutation null distribution of
    T, then reject null hypothesis that E is
    ineffective
  • Compute point and bootstrap confidence interval
    estimates of the threshold b

53
(No Transcript)
54
Adaptive Biomarker Threshold Design
  • Sample size planning methods described by Jiang,
    Freidlin and Simon, JNCI 99:1036-43, 2007

55
Adaptive Signature Design: An adaptive design for
generating and prospectively testing a gene
expression signature for sensitive patients
  • Boris Freidlin and Richard Simon
  • Clinical Cancer Research 11:7872-8, 2005

56
Adaptive Signature Design: End of Trial Analysis
  • Compare E to C for all patients at significance
    level 0.03
  • If overall H0 is rejected, then claim
    effectiveness of E for eligible patients
  • Otherwise

57
  • Otherwise
  • Using only the first half of patients accrued
    during the trial, develop a binary classifier
    that predicts the subset of patients most likely
    to benefit from the new treatment E compared to
    control C
  • Compare E to C for patients accrued in second
    stage who are predicted responsive to E based on
    classifier
  • Perform test at significance level 0.02
  • If H0 is rejected, claim effectiveness of E for
    subset defined by classifier
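
The end-of-trial analysis can be outlined as below. This is a schematic under stated assumptions. The three callables (`overall_test`, `develop_classifier`, `subset_test`) are hypothetical placeholders for the pre-specified statistical procedures, and `patients` is assumed to be ordered by accrual.

```python
def adaptive_signature_analysis(patients, overall_test, develop_classifier,
                                subset_test, alpha_overall=0.03,
                                alpha_subset=0.02):
    """Schematic end-of-trial analysis for the adaptive signature design.

    overall_test(patients)       -> p-value for E vs C in all patients
    develop_classifier(stage1)   -> function patient -> True if predicted
                                    to benefit from E
    subset_test(patients)        -> p-value for E vs C in the given subset
    """
    if overall_test(patients) <= alpha_overall:
        return "claim for all eligible patients"

    half = len(patients) // 2
    stage1, stage2 = patients[:half], patients[half:]

    # The classifier is built only from first-half patients ...
    classifier = develop_classifier(stage1)
    # ... and evaluated only in second-half patients it calls sensitive
    sensitive = [p for p in stage2 if classifier(p)]

    if sensitive and subset_test(sensitive) <= alpha_subset:
        return "claim for classifier-defined subset"
    return "no claim"
```

Keeping the development and test halves disjoint is what lets the subset test be interpreted as a prospective test rather than a post-hoc subset analysis.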

58
Treatment effect restricted to subset.
10% of patients sensitive, 10 sensitivity genes, 10,000
genes, 400 patients.
Test                                                      Power (%)
Overall .05 level test                                    46.7
Overall .04 level test                                    43.1
Sensitive subset .01 level test (performed only when
  overall .04 level test is negative)                     42.2
Overall adaptive signature design                         85.3
59
Conclusions
  • New technology makes it increasingly feasible to
    identify which patients are likely or unlikely to
    benefit from a specified treatment
  • Targeting treatment can benefit patients, reduce
    health care costs and improve the success rate of
    new drug development

60
Conclusions
  • Some of the conventional wisdom about
    biomarkers, how to develop predictive
    classifiers and how to use them in clinical
    trials is seriously flawed
  • Prospectively specified analysis plans for phase
    III studies are essential to achieve reliable
    results
  • Biomarker analysis does not mean exploratory
    analysis except in developmental studies

61
Conclusions
  • Achieving the potential of new technology
    requires paradigm changes in correlative
    science and in important aspects of design and
    analysis of clinical trials

62
Collaborators
  • Boris Freidlin
  • Aboubakar Maitournam
  • Kevin Dobbin
  • Wenyu Jiang
  • Yingdong Zhao