Predictive Analysis of Clinical Trials - PowerPoint PPT Presentation

1
Predictive Analysis of Clinical Trials
  • Richard Simon, D.Sc.
  • Chief, Biometric Research Branch
  • National Cancer Institute
  • http://brb.nci.nih.gov

2
Biomarker = Biological Measurement
  • Early detection biomarker
  • Endpoint biomarker
  • Prognostic biomarkers
  • Predictive biomarkers

3
Kinds of Biomarkers
  • Endpoint
  • Measured before, during and after treatment to
    monitor pace of disease and treatment effect
  • Pharmacodynamic (phase 0-1)
  • Does drug hit target
  • Intermediate response (phase 2)
  • Does drug have anti-tumor effect
  • Surrogate for clinical outcome (phase 3)

4
  • Prognostic biomarkers
  • Measured before treatment to indicate long-term
    outcome for patients untreated or receiving
    standard treatment
  • May reflect both disease aggressiveness and
    effect of standard treatment
  • Used to determine who needs more intensive
    treatment
  • Predictive biomarkers
  • Measured before treatment to identify who will
    benefit from a particular treatment

5
Validation = Fitness for Intended Use
6
  • Single gene or protein measurement
  • Scalar index or classifier that summarizes
    contributions of multiple genes

7
Prognostic and Predictive Biomarkers in Genomic
Oncology
  • Many cancer treatments benefit only a minority of
    patients to whom they are administered
  • Being able to predict which patients are likely
    to benefit can
  • Help patients get an effective treatment
  • Help control medical costs
  • Improve the success rate of clinical drug
    development

8
Biomarker Validity
  • Analytical validity
  • Measures what it's supposed to
  • Reproducible and robust
  • Clinical validity (correlation)
  • It correlates with something clinically
  • Medical utility
  • Actionable resulting in patient benefit

9
Clinical Utility
  • Prognostic and predictive biomarkers have utility
    if they are actionable for informing treatment
    decisions in a manner that results in patient
    benefit

10
Clinical Utility
  • Biomarker benefits patient by improving treatment
    decisions
  • Identify patients who have very good prognosis on
    standard treatment and do not require more
    intensive regimens
  • Identify patients who are likely or unlikely to
    benefit from a specific regimen

11
Prognostic markers
  • There is an enormous published literature on
    prognostic markers in cancer.
  • Very few prognostic markers (factors) are
    recommended for measurement by ASCO, are approved
    by FDA or are reimbursed for by payers. Very few
    play a role in treatment decisions.

12
Pusztai et al. The Oncologist 8:252-8, 2003
  • 939 articles on prognostic markers or
    prognostic factors in breast cancer in past 20
    years
  • ASCO guidelines only recommend routine testing
    for ER, PR and HER-2 in breast cancer
  • With the exception of ER or progesterone
    receptor expression and HER-2 gene amplification,
    there are no clinically useful molecular
    predictors of response to any form of anticancer
    therapy.

13
Prognostic Factors in Oncology
  • Most prognostic factors are not used because they
    are not therapeutically relevant
  • Most prognostic factor studies are not conducted
    with an intended use clearly in mind
  • They use a convenience sample of patients for
    whom tissue is available.
  • Generally the patients are too heterogeneous to
    support therapeutically relevant conclusions
  • There is rarely a validation study separate from
    the developmental study that addresses medical
    utility
  • An analytically validated test is rarely developed

14
  • Prognostic factors for such a heterogeneous group
    of patients are not actionable, i.e. they do not
    help with treatment decision making.

15
(No Transcript)
16
Major problems with prognostic studies of gene
expression signatures
  • Inadequate focus on intended use
  • Cases selected based on availability of specimens
    rather than for relevance to intended use
  • Heterogeneous sample of patients with mixed
    stages and treatments. Attempt to disentangle
    effects using regression modeling
  • Too great a focus on which marker is prognostic
    or independently prognostic, not whether the
    marker is effective for intended use

17
"If you don't know where you are going, you might
not get there." (Yogi Berra)
18
Prognostic Biomarkers Can be Therapeutically
Relevant
  • Fewer than 10% of node negative ER+ breast cancer patients
    require or benefit from the cytotoxic
    chemotherapy that they receive

19
OncotypeDx Recurrence Score
  • Intended use
  • Patients with node negative estrogen receptor
    positive breast cancer who are going to receive
    an anti-estrogen drug following local
    surgery/radiotherapy
  • Identify patients who have such good prognosis
    that they are unlikely to derive much benefit
    from adjuvant chemotherapy

20
  • Selected patients relevant for the intended use
  • Analyzed the data to see if the recurrence score
    identified a subset with such good prognosis that
    the absolute benefit of chemotherapy would at
    best be very small

21
Biotechnology Has Forced Biostatistics to Focus
on Prediction
  • This has led to many exciting methodological
    developments
  • p >> n problems in which number of genes is much
    greater than the number of cases
  • And many erroneous publications
  • And growing pains in transitioning from an
    over-dependence on inference
  • Many of the methods and much of the conventional
    wisdom of statistics are based on inference
    problems and are not applicable to prediction
    problems

22
  • Goodness of fit is not a proper measure of
    predictive accuracy
  • Odds ratios and hazards ratios are not proper
    measures of prediction accuracy
  • Statistical significance of regression
    coefficients is not a proper measure of
    predictive accuracy

23
Goodness of Fit vs Prediction Accuracy
  • Fit of a model to the same data used to develop
    it is no evidence of prediction accuracy for
    independent data
  • "Prediction is difficult, particularly the
    future."
  • Dan Quayle or Niels Bohr?

24
(No Transcript)
25
Prediction on Simulated Null Data
Simon et al. J Natl Cancer Inst 95:14, 2003
  • Generation of Gene Expression Profiles
  • 20 specimens (Pi is the expression profile for
    specimen i)
  • Log-ratio measurements on 6000 genes
  • Pi ~ MVN(0, I_6000)
  • Can we distinguish between the first 10
    specimens (Class 1) and the last 10 (Class 2)?
  • Prediction Method
  • Compound covariate predictor built from the
    log-ratios of the 10 most differentially
    expressed genes.

26
(No Transcript)
27
Cross Validation
  • Cross-validation simulates the process of
    separately developing a model on one set of data
    and predicting for a test set of data not used in
    developing the model
  • The cross-validated estimate of misclassification
    error is an estimate of the prediction error for
    the model obtained by applying the specified
    algorithm to the full dataset

28
Cross-validation Estimate of Prediction Error
29
  • Cross validation is only valid if the test set is
    not used in any way in the development of the
    model. Using the complete set of samples to
    select genes violates this assumption and
    invalidates cross-validation.
  • With proper cross-validation, the model must be
    developed from scratch for each leave-one-out
    training set. This means that feature selection
    must be repeated for each leave-one-out training
    set.
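The point of slides 25 to 29 can be illustrated with a small simulation. The following is a minimal sketch, not the code used in the cited paper: it generates pure-noise expression profiles, builds a compound covariate predictor from the 10 genes with the largest t statistics, and compares leave-one-out error when genes are selected once on all samples (improper) with error when selection is repeated inside every training set (proper). The gene count of 1000 (rather than 6000, to keep pure Python fast), the Welch-type t statistic, and the nearest-class-mean cut rule are assumptions made for illustration.

```python
import random

def mean(v):
    return sum(v) / len(v)

def t_stat(a, b):
    """Welch-type t statistic comparing two lists of values."""
    ma, mb = mean(a), mean(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    return (ma - mb) / (va / len(a) + vb / len(b)) ** 0.5

def select_genes(X, y, k):
    """Return [(gene index, t)] for the k genes with the largest |t|."""
    scored = []
    for g in range(len(X[0])):
        a = [row[g] for row, c in zip(X, y) if c == 1]
        b = [row[g] for row, c in zip(X, y) if c == 2]
        scored.append((g, t_stat(a, b)))
    return sorted(scored, key=lambda s: abs(s[1]), reverse=True)[:k]

def fit_ccp(X, y, genes):
    """Compound covariate predictor: t-weighted sum, assigned to the nearer class mean."""
    def score(row):
        return sum(t * row[g] for g, t in genes)
    s1 = mean([score(r) for r, c in zip(X, y) if c == 1])
    s2 = mean([score(r) for r, c in zip(X, y) if c == 2])
    return lambda row: 1 if abs(score(row) - s1) < abs(score(row) - s2) else 2

def loocv_error(X, y, k=10, reselect=True):
    """Leave-one-out error rate; reselect=False reuses genes chosen on ALL samples."""
    full_genes = select_genes(X, y, k)
    errors = 0
    for i in range(len(X)):
        Xtr, ytr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = select_genes(Xtr, ytr, k) if reselect else full_genes
        errors += fit_ccp(Xtr, ytr, genes)(X[i]) != y[i]
    return errors / len(X)

def null_data(n_genes=1000, n=20, seed=0):
    """Independent N(0, 1) 'log-ratios'; the class labels carry no information."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n)]
    y = [1] * (n // 2) + [2] * (n - n // 2)
    return X, y

if __name__ == "__main__":
    X, y = null_data()
    print("improper CV error:", loocv_error(X, y, reselect=False))
    print("proper CV error:  ", loocv_error(X, y, reselect=True))
```

Because the data contain no class signal, one would expect the proper estimate to sit near chance, while the improper estimate is typically far too optimistic.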

30
Predictive Biomarkers
  • Cancers of a primary site often represent a
    heterogeneous group of diverse molecular entities
    which vary fundamentally with regard to
  • the oncogenic mutations that cause them
  • their responsiveness to specific drugs

31
Most cancer treatments benefit only a minority of
patients to whom they are administered
  • Being able to predict who requires intensive
    treatment and who is likely to benefit from which
    treatments could
  • save patients from unnecessary debilitating
    adverse effects of treatments that they don't
    need or benefit from
  • enhance their chance of receiving a treatment
    that helps them
  • Help control medical costs
  • Improve the success rate of clinical drug
    development

32
  • In most positive phase III clinical trials
    comparing a new treatment to control, most of the
    patients treated with the new treatment did not
    benefit.
  • Adjuvant breast cancer: 70% long-term
    disease-free survival on control. 80%
    disease-free survival on new treatment. 70% of
    patients don't need the new treatment. Of the
    remaining 30%, only 1/3rd benefit.

33
Predictive Biomarkers
  • Estrogen receptor over-expression in breast
    cancer
  • Anti-estrogens, aromatase inhibitors
  • HER2 amplification in breast cancer
  • Trastuzumab, Lapatinib
  • OncotypeDx gene expression recurrence score in
    breast cancer
  • Low score for ER+ node- -> no chemotherapy
  • KRAS in colorectal cancer
  • WT KRAS cetuximab or panitumumab
  • EGFR mutation in NSCLC
  • EGFR inhibitor
  • V600E mutation in BRAF of melanoma
  • vemurafenib
  • ALK translocation in NSCLC
  • crizotinib

34
Standard Paradigm of Broad Eligibility Phase III
Clinical Trials Sometimes Leads to
  • Treating many patients with few benefiting
  • Small average treatment effects
  • Problematic for health care economics
  • Inconsistency in results among studies
  • False negative studies

35
The standard approach to designing phase III
clinical trials is based on two assumptions
  • Qualitative treatment by subset interactions are
    unlikely
  • Costs of over-treatment are less than costs
    of under-treatment

36
  • Oncology therapeutics development is now focused
    on molecularly targeted drugs that are only
    expected to be effective in a subset of patients
    whose tumors are driven by the molecular targets
  • Most new cancer drugs are very expensive
  • the aspirin paradigm on which some current
    clinical trial dogma is based is a roadblock to
    progress

37
Subset Analysis
  • In the past often studied as un-focused post-hoc
    analyses
  • Numerous subsets examined
  • Same data used to define subsets for analysis and
    for comparing treatments within subsets
  • No control of type I error
  • Led to conventional wisdom
  • Only hypothesis generation
  • Only valid if overall treatment difference is
    significant
  • Only valid if there is a significant treatment by
    subset interaction

38
  • Neither current practices of subset analysis nor
    current practices of ignoring differences in
    treatment effect among patients are effective for
    evaluating treatments where qualitative
    interactions are likely or for informing labeling
    indications

39
  • Although the randomized clinical trial remains of
    fundamental importance for predictive genomic
    medicine, some of the conventional wisdom of how
    to design and analyze RCTs requires
    re-examination
  • The concept of doing an RCT of thousands of
    patients to answer a single question about
    average treatment effect for a target population
    presumed homogeneous with regard to the direction
    of treatment efficacy in many cases no longer has
    an adequate scientific basis

40
  • How can we develop new drugs in a manner more
    consistent with modern tumor biology and obtain
    reliable information about what regimens work for
    what kinds of patients?

41
Development is Most Efficient When the Scientific
Basis for the Clinical Trial is Strong
  • Having an important molecular target
  • Having a drug that can inhibit the target in an
    overwhelming proportion of tumor cells at an
    achievable concentration
  • Having a pre-treatment assay that can identify
    the patients for whom the molecular target is
    driving progression of disease

42
When the Biology is Clear
  • Develop a classifier that identifies the patients
    likely (or unlikely) to benefit from the new drug
  • Classifier is based on either a single
    gene/protein or composite score
  • Develop an analytically validated test
  • Measures what it should accurately and
    reproducibly
  • Design a focused clinical trial to evaluate
    effectiveness of the new treatment in test+
    patients

43
Using phase II data, develop predictor of
response to new drug
Targeted (Enrichment) Design
44
(No Transcript)
45
Evaluating the Efficiency of Targeted Design
  • Simon R and Maitournam A. Evaluating the
    efficiency of targeted designs for randomized
    clinical trials. Clinical Cancer Research
    10:6759-63, 2004; Correction and supplement
    12:3229, 2006
  • Maitournam A and Simon R. On the efficiency of
    targeted clinical trials. Statistics in Medicine
    24:329-339, 2005.

46
  • Relative efficiency of targeted design depends on
  • proportion of patients test positive
  • specificity of treatment effect for test positive
    patients
  • When less than half of patients are test positive
    and the drug has minimal benefit for test
    negative patients, the targeted design requires
    dramatically fewer randomized patients than the
    standard design in which the marker is not used

47
Two Clinical Trial Designs
  • Standard design
  • Randomized comparison of new drug E to control C
    without the test for screening patients
  • Targeted design
  • Test patients
  • Randomize only test+ patients
  • Treatment effect D+ in test+ patients
  • Treatment effect D- in test- patients
  • Proportion of patients test+ is p
  • Size each design to have power 0.9 and
    significance level 0.05

48
RandRat = n_untargeted / n_targeted
  • If D- = 0, RandRat = 1/p^2
  • if p = 0.5, RandRat = 4
  • If D- = D+/2, RandRat = 4/(p + 1)^2
  • if p = 0.5, RandRat = 16/9 = 1.78
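The two special cases on this slide follow from one expression if the average effect in an unselected trial is taken as the prevalence-weighted mix of the subset effects and the required sample size scales as one over the squared effect. The sketch below encodes that reading. The function name `rand_ratio` and the general formula are assumptions chosen because they reproduce both cases shown, not a quotation from the cited papers.

```python
def rand_ratio(p_pos, delta_pos, delta_neg):
    """Approximate ratio of randomized patients, untargeted / targeted.

    p_pos      proportion of patients who are test positive (0 < p_pos <= 1)
    delta_pos  treatment effect in test-positive patients (must be nonzero)
    delta_neg  treatment effect in test-negative patients
    """
    if not 0 < p_pos <= 1:
        raise ValueError("p_pos must be in (0, 1]")
    # Average effect when all comers are randomized (assumed mixing rule).
    overall = p_pos * delta_pos + (1 - p_pos) * delta_neg
    # Sample size is taken as proportional to 1 / effect^2.
    return (delta_pos / overall) ** 2

if __name__ == "__main__":
    print(rand_ratio(0.5, 1.0, 0.0))   # no benefit in test-negative patients
    print(rand_ratio(0.5, 1.0, 0.5))   # half the benefit in test-negative patients
```

The ratio counts randomized patients only; a targeted trial still has to screen roughly 1/p patients for each one it randomizes.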

49
Comparing T vs C on Survival or DFS: 5% 2-sided
Significance and 90% Power
Reduction in Hazard    Number of Events Required
25%                    509
30%                    332
35%                    227
40%                    162
45%                    118
50%                    88
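The slide does not say how these event counts were obtained. A common approximation for a 1:1 randomized comparison is events = 4 (z_(1-alpha/2) + z_power)^2 / (ln HR)^2, and the sketch below uses it as an assumption. With that formula the results agree with the table to within one event; the small differences presumably reflect rounding or a slightly different formula in the original.

```python
from math import ceil, log
from statistics import NormalDist

def events_required(hazard_reduction, alpha=0.05, power=0.90):
    """Approximate number of events for a two-arm, 1:1 randomized log-rank comparison.

    hazard_reduction  e.g. 0.40 for a 40% reduction (hazard ratio 0.60)
    alpha             two-sided significance level
    power             desired power
    """
    hazard_ratio = 1.0 - hazard_reduction
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(4 * z ** 2 / log(hazard_ratio) ** 2)

if __name__ == "__main__":
    for reduction in (0.25, 0.30, 0.35, 0.40, 0.45, 0.50):
        print(f"{reduction:.0%} reduction: {events_required(reduction)} events")
```

Calling `events_required(0.40, alpha=0.04)` gives 171, the figure quoted later in the deck for an a-priori defined subset tested at the 4% level.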
50
  • Hazard ratio 0.60 for test+ patients
  • 40% reduction in hazard
  • Hazard ratio 1.0 for test- patients
  • 0% reduction in hazard
  • 33% of patients test positive
  • Hazard ratio for unselected population is
  • 0.33 x 0.60 + 0.67 x 1 = 0.87
  • 13% reduction in hazard

51
  • To have 90% power for detecting 40% reduction in
    hazard within a biomarker positive subset
  • Number of events within subset: 162
  • To have 90% power for detecting 13% reduction in
    hazard overall
  • Number of events: 2172

52
Trastuzumab (Herceptin)
  • Metastatic breast cancer
  • 234 randomized patients per arm
  • 90% power for 13.5% improvement in 1-year
    survival over 67% baseline at 2-sided .05 level
  • If benefit were limited to the 25% test+
    patients, overall improvement in survival would
    have been 3.375%
  • 4025 patients/arm would have been required

53
Web Based Software for Planning Clinical Trials
of Treatments with a Candidate Predictive
Biomarker
  • http://brb.nci.nih.gov

54
(No Transcript)
55
Principle
  • If a drug is found safe and effective in a
    defined patient population, approval should not
    depend on finding the drug ineffective in some
    other population

56
Implications for Early Phase Studies
  • Need to design and size early phase studies to
    discover an effective predictive biomarker for
    identifying the correct target population
  • Need to establish an analytically validated test
    for measuring the predictive marker in the phase
    III pivotal studies

57
When the drug is specific for one target and the
biology is well understood
  • May need to evaluate several candidate tests
  • e.g. protein expression of target or
    amplification of gene
  • Need to decide whether to include test negative
    patients in phase II trials
  • Phase II trials sized for adequate numbers of
    test positive patients

58
When the drug has several targets or the biology
is not well understood
  • Should biologically characterize tumors for all
    patients on phase II studies with regard to
    candidate targets and response moderators
  • Phase II trials sized for evaluating candidates
  • Opportunity for sequential and adaptive designs
    to improve efficiency

59
Empirical screening of expression profiles or
mutations to develop predictive marker
  • Larger sample size required
  • Dobbin, Zhao, Simon, Clinical Cancer Res 14:108,
    2008.
  • Use of archived samples from previous negative
    phase III trial
  • Use of large disease specific panel of
    molecularly characterized human tumor cell lines
    to identify predictive marker

60
(No Transcript)
61
Stratification Design
Interaction Design
62
Develop prospective analysis plan for evaluation
of treatment effect and how it relates to
biomarker
  • Defined analysis plan that protects type I error
  • Trial sized for evaluating treatment effect in
    test+ and test- subsets
  • Test negative patients should be adequately
    protected using interim futility analysis

63
Fallback Analysis Plan
  • Test average treatment effect at reduced level p0
    (e.g. .01)
  • If significant, claim broad effectiveness
  • If overall effect is not significant, test
    treatment effect in marker subset at level
    .05-p0
  • If significant, claim effectiveness for marker
    subset
  • Test of marker subset should not require either
  • Overall significance or
  • Significant interaction
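The decision rule on this slide is short enough to write down directly. The sketch below is illustrative: the function name, argument names, and returned strings are hypothetical, and the two p-values are assumed to come from pre-specified tests of the overall effect and of the effect in the marker subset.

```python
def fallback_analysis(p_overall, p_subset, alpha=0.05, p0=0.01):
    """Apply the fallback plan, splitting the total alpha between two tests.

    p_overall  p-value for the average treatment effect in all patients
    p_subset   p-value for the treatment effect in the marker subset
    alpha      total study-wise type I error
    p0         reduced level spent on the overall test
    """
    if p_overall <= p0:
        return "claim broad effectiveness"
    # The subset test needs neither overall significance nor an interaction test.
    if p_subset <= alpha - p0:
        return "claim effectiveness for marker subset"
    return "no claim"

if __name__ == "__main__":
    print(fallback_analysis(p_overall=0.003, p_subset=0.20))
    print(fallback_analysis(p_overall=0.08, p_subset=0.02))
    print(fallback_analysis(p_overall=0.08, p_subset=0.045))
```

Spending p0 on the overall test and the remainder on the subset test keeps the chance of any false claim at or below the total alpha, by the usual Bonferroni-style argument.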

64
Sample size for Analysis Plan
  • To have 90% power for detecting uniform 33%
    reduction in overall hazard at 1% two-sided
    level requires 370 events.
  • If 33% of patients are positive, then when there
    are 370 total events there will be approximately
    123 events in positive patients
  • 123 events provides 90% power for detecting a 45%
    reduction in hazard at a 4% two-sided
    significance level.

65
(No Transcript)
66
(No Transcript)
67
(No Transcript)
68
Bayesian Two-Stage Design
RCT With Single Binary Marker
69
The Biology is Often Not So Clear
  • Cancer biology is complex and it is not always
    possible to have the right single predictive
    classifier identified with an appropriate
    cut-point by the time the phase 3 trial of a new
    drug is ready to start accrual

70
The Objectives of a Phase III Clinical Trial
  • Test the strong null hypothesis that the test
    treatment is uniformly ineffective compared to
    control for primary endpoint
  • If the null hypothesis is rejected, develop a
    labeling indication for informing physicians in
    their decisions about which patients they treat
    with the drug.

71
  • The test of the null hypothesis of no average
    treatment effect is not necessarily a good test
    of the strong null hypothesis that the new
    treatment is uniformly ineffective
  • Rejection of the null hypothesis is not in itself
    adequate information for guiding physicians on
    how to use the treatment

72
Biomarker Selection Design
  • Based on Adaptive Threshold Design
  • W Jiang, B Freidlin, R Simon
  • JNCI 99:1036-43, 2007

73
Biomarker Selection Design
  • Have identified K candidate biomarkers B1, ..., BK
    thought to be predictive of patients likely to
    benefit from T relative to C
  • Cut-points not necessarily established for each
    biomarker
  • Eligibility not restricted by candidate markers

74
Marker Selection Design
75
  • Compute p* = min{p1, p2, ..., pK}
  • Compute whether the value of p* is statistically
    significant when adjusted for multiple testing
  • Adjust for multiple testing by permuting the
    treatment labels and re-calculating p1, ..., pK
    and p* for the permuted treatment labels
  • Repeat for 10,000 random permutations to
    approximate the null distribution of p*
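The permutation adjustment can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure from the cited paper: each candidate marker is taken as already dichotomized, the outcome is a continuous score rather than survival time, and each per-marker p-value comes from a normal-approximation two-sample test within marker-positive patients. All function names are hypothetical.

```python
import random
from statistics import NormalDist, fmean, variance

def subset_p(outcome, treat, in_subset):
    """Two-sided p-value for E (treat=1) vs C (treat=0) within one marker subset."""
    e = [y for y, t, s in zip(outcome, treat, in_subset) if s and t == 1]
    c = [y for y, t, s in zip(outcome, treat, in_subset) if s and t == 0]
    if len(e) < 2 or len(c) < 2:
        return 1.0
    se = (variance(e) / len(e) + variance(c) / len(c)) ** 0.5
    if se == 0:
        return 1.0
    z = (fmean(e) - fmean(c)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def min_p(outcome, treat, markers):
    """Smallest of the K per-marker p-values."""
    return min(subset_p(outcome, treat, m) for m in markers)

def permutation_adjusted_p(outcome, treat, markers, n_perm=10000, seed=0):
    """Return (observed min p, multiplicity-adjusted p-value)."""
    rng = random.Random(seed)
    observed = min_p(outcome, treat, markers)
    labels = list(treat)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between treatment and outcome
        if min_p(outcome, labels, markers) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

Permuting treatment labels while leaving the marker values attached to each patient preserves the correlation among markers, which is why this adjustment is usually less conservative than a Bonferroni correction.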

76
  • To detect a 40% reduction in hazard in an
    a-priori defined subset with 90% power and a 4%
    two-sided significance level requires 171 events
    in the subset.
  • To adjust for multiplicity with 4 independent
    binary tests, 171 -> 224.
  • If 33% are positive for each marker, then the
    trial might be sized for 3 x 224 = 672 total
    events.

77
Designs When there are Many Candidate Markers and
too Much Patient Heterogeneity for any Single
Marker
78
(No Transcript)
79
Adaptive Signature Design
80
  • The indication classifier is not a binary
    classifier of whether a patient has good
    prognosis or poor prognosis
  • It is a two sample classifier of whether the
    prognosis of a patient on E is better than the
    prognosis of the patient on C

81
  • The indication classifier can be a binary
    classifier that maps the vector of candidate
    covariates into {E, C}, indicating which treatment
    is predicted superior for that patient
  • The classifier need not use all the covariates
    but variable selection must be determined using
    only the training set
  • Variable selection may be based on selecting
    variables with apparent interactions with
    treatment, with cut-off for variable selection
    determined by cross-validation within training
    set for optimal classification
  • The indication classifier can be a probabilistic
    classifier

82
(No Transcript)
83
(No Transcript)
84
Treatment effect restricted to subset. 10% of
patients sensitive, 400 patients.
Test                                        Power (%)
Overall .05 level test                      46.7
Overall .04 level test                      43.1
Sensitive subset .01 level test             42.2
  (performed only when overall .04 level
  test is negative)
Overall adaptive signature design           85.3
85
Overall treatment effect, no subset effect. 400
patients
Test                                        Power (%)
Overall .05 level test                      74.2
Overall .04 level test                      70.9
Sensitive subset .01 level test             1.0
Overall adaptive signature design           70.9
86
  • This approach can be used with any set of
    candidate predictor variables
  • This approach can also be used to identify the
    subset of patients who don't benefit from E in
    cases where E is superior to C overall

87
(No Transcript)
88
(No Transcript)
89
Cross-Validated Adaptive Signature Design
  • Define indication classifier development
    algorithm A
  • Apply algorithm to full dataset P to develop
    indication classifier for use in future patients,
    M(x; A, P)
  • Using K-fold cross-validation
  • Classify patients in test sets based on
    classifiers developed in training sets, e.g.
    y_i = M(x_i; A, P_-i)
  • S = {i : y_i = E}
  • Compare E to C in S and estimate size of
    treatment effect
  • The resulting value is an estimate of the size
    of the treatment effect for future patients
    with M(x; A, P) = E
90
Cross-Validated Adaptive Signature Design
  • Approximate null distribution of the
    cross-validated treatment effect estimate
  • Permute treatment labels
  • Repeat complete cross-validation procedure
  • Generate permutation distribution of the
    estimates for permuted data
  • Test null hypothesis that the treatment effect in
    classifier positive patients is null using as
    test statistic cross-validated estimate of
    treatment effect in positive patients
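The cross-validated procedure and its permutation test can be sketched in a few functions. This is a toy illustration, not the authors' implementation. The `develop` function is a hypothetical stand-in for algorithm A (it picks the median split of a single covariate with the largest apparent E-versus-C difference), the outcome is a continuous score rather than survival, and folds are assigned by position, which assumes the rows are already in random order.

```python
import random
from statistics import fmean, median

# Each row is (x, treatment, y): covariate list, "E" or "C", outcome (higher is better).

def effect(rows):
    """Mean outcome on E minus mean outcome on C; 0.0 if either arm is empty."""
    e = [y for _, t, y in rows if t == "E"]
    c = [y for _, t, y in rows if t == "C"]
    return fmean(e) - fmean(c) if e and c else 0.0

def develop(train):
    """Hypothetical algorithm A: choose the covariate half with the largest effect."""
    best = None
    for j in range(len(train[0][0])):
        cut = median(x[j] for x, _, _ in train)
        for above in (True, False):
            region = [r for r in train if (r[0][j] > cut) == above]
            d = effect(region)
            if best is None or d > best[0]:
                best = (d, j, cut, above)
    _, j, cut, above = best
    return lambda x: "E" if (x[j] > cut) == above else "C"

def cv_effect(data, k=5):
    """Treatment effect in S, the patients classified 'E' while held out."""
    s = []
    for fold in range(k):
        train = [r for i, r in enumerate(data) if i % k != fold]
        test = [r for i, r in enumerate(data) if i % k == fold]
        clf = develop(train)  # developed from scratch, test fold never seen
        s.extend(r for r in test if clf(r[0]) == "E")
    return effect(s)

def permutation_p(data, k=5, n_perm=1000, seed=0):
    """Return (observed cross-validated effect, one-sided permutation p-value)."""
    rng = random.Random(seed)
    observed = cv_effect(data, k)
    labels = [t for _, t, _ in data]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        permuted = [(x, t, y) for (x, _, y), t in zip(data, labels)]
        # The whole cross-validation is repeated for every permutation.
        if cv_effect(permuted, k) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Repeating classifier development inside every permutation is what makes the test valid: the null distribution then reflects the same data-driven subset search that produced the observed statistic.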

91
Key Ideas
  • Replace multiple significance testing by
    development of one indication classifier
  • Control study-wise type I error for significance
    test of
  • Overall average treatment effect
  • Treatment effect in classifier positive patients
  • Test of treatment effect in classifier positive
    patients does not depend on significance of
    overall test nor on significant interaction
  • Obtain unbiased or conservative estimate of the
    treatment effect of future classifier positive
    patients

92
  • The size of the E vs C treatment effect for the
    indicated population is (conservatively)
    estimated from the cross validation by the Kaplan
    Meier survival curves of E and of C in S
  • The Kaplan-Meier curves of E and C for patients
    in S provides an estimate of

93
  • The stability of the indication classifier
    M(x; A, D) can be evaluated by examining the
    consistency of classifications M(x_i; A, B) for
    bootstrap samples B from D.
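One way to carry out this stability check is sketched below. It is an illustration under assumptions: `develop` is a hypothetical single-covariate stand-in for algorithm A, the outcome is continuous, and "consistency" is measured as the fraction of bootstrap classifiers that agree with the full-data classifier for each patient. The slide does not specify a particular consistency measure.

```python
import random
from statistics import fmean, median

# Each row is (x, treatment, y): one covariate, "E" or "C", outcome (higher is better).

def effect(rows):
    """Mean outcome on E minus mean outcome on C; 0.0 if either arm is empty."""
    e = [y for _, t, y in rows if t == "E"]
    c = [y for _, t, y in rows if t == "C"]
    return fmean(e) - fmean(c) if e and c else 0.0

def develop(rows):
    """Hypothetical algorithm A: median split, recommend E on the better side."""
    cut = median(x for x, _, _ in rows)
    above = effect([r for r in rows if r[0] > cut])
    below = effect([r for r in rows if r[0] <= cut])
    above_better = above >= below
    return lambda x: "E" if (x > cut) == above_better else "C"

def stability(data, n_boot=200, seed=0):
    """Per-patient fraction of bootstrap classifiers agreeing with the full-data one."""
    rng = random.Random(seed)
    full = develop(data)
    agree = [0] * len(data)
    for _ in range(n_boot):
        boot = [rng.choice(data) for _ in data]  # sample patients with replacement
        clf = develop(boot)
        for i, (x, _, _) in enumerate(data):
            agree[i] += clf(x) == full(x)
    return [a / n_boot for a in agree]
```

Patients whose agreement fraction is well below 1 are those whose recommended treatment depends on which sample happened to be drawn, typically patients near the classifier's boundary.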

94
  • Although there may be less certainty about
    exactly which types of patient benefit from E
    relative to C, classification may be better than
    in many standard clinical trials in which all
    patients are classified based on results of
    testing the single overall null hypothesis

95
70% Response to E in Sensitive Patients
25% Response to E Otherwise
25% Response to C
30% Patients Sensitive
                             ASD     CV-ASD
Overall 0.05 Test            0.830   0.838
Overall 0.04 Test            0.794   0.808
Sensitive Subset 0.01 Test   0.306   0.723
Overall Power                0.825   0.918
96
25% Response to T, 25% Response to C
No Subset Effect
                             ASD     CV-ASD
Overall 0.05 Test            0.047   0.056
Overall 0.04 Test            0.04    0.048
Sensitive Subset 0.01 Test   0.001   0
Overall Power                0.041   0.048
97
For Binary Outcome
98
For Binary Outcome
99
(No Transcript)
100
506 prostate cancer patients were randomly
allocated to one of four arms: placebo, 0.2 mg,
1.0 mg, or 5.0 mg of diethylstilbestrol (DES).
Placebo and 0.2 mg DES were combined as control
arm C; 1.0 mg and 5.0 mg DES were combined as T.
The end-point was overall survival (death from
any cause).
Covariates: age, performance status (pf), tumor
size (sz), stage/grade index (sg), serum acid
phosphatase (ap)
101
Figure 1: Overall analysis. The value of the
log-rank statistic is 2.9 and the corresponding
p-value is 0.09. The new treatment thus shows no
significant benefit overall at the 0.05 level.
102
Figure 2: Cross-validated survival curves for
patients predicted to benefit from the new
treatment. The log-rank statistic is 10.0 and the
permutation p-value is .002.
103
Figure 3: Survival curves for cases predicted not
to benefit from the new treatment. The value of
the log-rank statistic is 0.54.
104
(No Transcript)
105
(No Transcript)
106
(No Transcript)
107
Prediction Based Clinical Trials
  • We can evaluate our methods for analysis of
    clinical trials in terms of their effect on
    patient outcome via informing therapeutic
    decision making

108
Expected Survival Distribution for Future
Patients With Standard Analysis
109
Expected Survival Distribution for Future
Patients With Indication Classifier
110
  • Hence, alternative methods for analyzing RCTs
    can be evaluated in an unbiased manner with
    regard to their value to patients using the
    actual RCT data

111
Conclusions
  • New biotechnology and knowledge of tumor biology
    provide important opportunities to improve
    therapeutic decision making
  • Treatment of broad populations with regimens that
    do not benefit most patients is increasingly
    neither necessary nor economically sustainable
  • The established molecular heterogeneity of human
    diseases requires the use of new approaches to the
    development and evaluation of therapeutics

112
Acknowledgements
  • Boris Freidlin
  • Wenyu Jiang
  • Aboubakar Maitournam
  • Jyothi Subramanian