Multiple Endpoint Testing in Clinical Trials

Mohammad Huque, Ph.D. Division of Biometrics III/Office of Biostatistics/OPaSS/CDER/FDA ... Complexities? ... Underlying concepts and complexities? 10/6/09 ...

1
Multiple Endpoint Testing in Clinical Trials
Some Issues and Considerations
  • Mohammad Huque, Ph.D.
  • Division of Biometrics III/Office of
    Biostatistics/OPaSS/CDER/FDA
  • 2005 Industry/FDA Workshop, Washington, DC

2
Disclaimer
  • The views expressed here are those of the presenter
    and not necessarily those of the FDA

3
Sources of Multiplicity in Clinical Trials
  • Multiple endpoints
  • Multiple comparisons
  • Interim analysis
  • Subgroup analysis
  • Selection of covariates in an analysis model
  • Others

4
OUTLINE
  1. Type I error concept and type I error control
    when testing for multiple endpoints.
    Complexities?
  2. Multiple endpoints are often triaged into
    primary, secondary and other types of endpoints.
    Reasons for doing so and how these endpoints are
    tested?
  3. Sequential testing of endpoints - no alpha
    adjustment is needed. Issues and fixes?
  4. Some trials require that 2 or more endpoints must
    show effects for clinical evidence. Reasons for
    doing so and consequences?
  5. Composite endpoints. Underlying concepts and
    complexities?

5
Trial has a single endpoint to test: type I and
type II errors

                               Concludes treatment    Concludes treatment
                               not beneficial         beneficial
  Truly not beneficial (H0)    Correct decision       Type I error
  Truly beneficial (Ha)        Type II error          Correct decision

  • Conduct a test for claiming that a new treatment
    is beneficial
  • $\alpha$ = probability of the Type I error
  • $\beta$ = probability of the Type II error
    (power = $1 - \beta$)

6
Trial has multiple endpoints to test
  • Consider a two arm superiority trial, a test
    treatment versus a control
  • Endpoints: $y_1, y_2, \ldots, y_K$
  • Multiple null hypotheses: $F = \{H_{01}, H_{02}, \ldots, H_{0K}\}$
  • $H_{0j}: \delta_j = 0$, $H_{aj}: \delta_j \neq 0$, $j = 1, \ldots, K$

7
Trial has multiple endpoints to test
  • Two scenarios
  • (A) In the family F all are true null hypotheses
  • (B) Some may be true null hypotheses, and some
    may be false null hypotheses, but their true
    states are unknown.

8
Testing under scenario (A)
  • Scenario (A), and the trial has 3 endpoints $y_1$,
    $y_2$, and $y_3$
  • A test procedure can give a type I error in
    multiple ways: (-, -, +), (-, +, -), (+, -, -),
    (-, +, +), (+, -, +), (+, +, -), (+, +, +), where
    + marks a falsely significant endpoint. These
    are chance events caused by the multiplicity of
    tests when in fact there is no treatment benefit
    for any of the endpoints.
  • $\alpha_0 = \Pr(\text{at least one of these chance events} \mid \text{test procedure}, H_0)$,
    where $H_0 = \cap_j H_{0j}$

9
Testing under scenario (A)
  • $\alpha_0$ is called the global alpha (or overall
    alpha). It is also called the familywise type I
    error rate (FWER) under $H_0$, where
  • $H_0 = \cap_j H_{0j}$ is the global null hypothesis.
  • A test procedure for testing $H_0$ is called a
    global test procedure
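
To make the size of the global alpha concrete, here is a minimal sketch that assumes the K tests are independent and each is run at the same unadjusted level. The independence assumption and the function name `fwer_independent` are illustrative and not from the slides; correlated endpoints would give a different (typically smaller) inflation.

```python
def fwer_independent(alpha: float, k: int) -> float:
    """Chance of at least one false rejection among k independent tests,
    each run at level alpha, when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** k


# Global alpha for an unadjusted per-endpoint level of 0.05.
for k in (1, 2, 3, 5):
    print(k, round(fwer_independent(0.05, k), 4))
```

With 3 endpoints, the unadjusted global alpha under this assumption is about 0.143 rather than 0.05.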

10
Global Test procedures
  • Useful for non-specific global claims. The result
    is difficult to interpret. The type I error rate
    can remain inflated for specific claims.
  • Examples: Simes test, O'Brien's OLS/GLS tests,
    Hotelling's $T^2$ test (Sankoh et al., DIA J., 1999)
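
As an illustration of one of the global tests named above, the following is a minimal sketch of the Simes test. The function name and the example p-values are hypothetical; the test is usually justified for independent or positively dependent test statistics.

```python
def simes_global_test(p_values, alpha=0.05):
    """Return True if the Simes test rejects the global null hypothesis."""
    k = len(p_values)
    ordered = sorted(p_values)
    # Reject if any ordered p-value p_(i) is at or below i * alpha / k.
    return any(p <= (i * alpha) / k for i, p in enumerate(ordered, start=1))


# Hypothetical p-values for 3 endpoints.
print(simes_global_test([0.030, 0.040, 0.045]))  # global null rejected
print(simes_global_test([0.20, 0.30, 0.40]))     # global null not rejected
```

A rejection here supports only the non-specific claim that some effect exists; it does not say which endpoint carries it.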

11
Testing under scenario (B)
  • Some of the null hypotheses in
    $F = \{H_{01}, H_{02}, \ldots, H_{0K}\}$ may be true
    and some may be false, but it is not known which
    ones are which.
  • Question: Is there a treatment effect
    specifically for the endpoint $y_1$?
  • For answering this question, the null hypothesis
    is not a single null hypothesis like a global
    null hypothesis. Rather, it is a class of null
    hypothesis configurations in which there is no
    treatment effect for $y_1$, combined with all
    possible scenarios for treatment effects for the
    remaining endpoints $y_2, \ldots, y_K$

12
Testing under scenario (B)
  • Consider 3 endpoints $y_1$, $y_2$, and $y_3$.
  • Question: Is there a treatment effect
    specifically for the endpoint $y_1$?
  • Null hypothesis configurations $F_1$ for testing for
    a treatment effect specifically for the endpoint
    $y_1$:
  • $F_1 = \{(\delta_1 = 0, \delta_2 = 0, \delta_3 = 0),$
  • $(\delta_1 = 0, \delta_2 = 0, \delta_3 \neq 0),$
  • $(\delta_1 = 0, \delta_2 \neq 0, \delta_3 = 0),$
  • $(\delta_1 = 0, \delta_2 \neq 0, \delta_3 \neq 0)\}$.
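
The class of configurations above grows as 2^(K-1) with the number of endpoints. A minimal sketch that enumerates it is given below; the function name and the True/False encoding (True meaning "no treatment effect") are illustrative choices, not part of the slides.

```python
from itertools import product


def null_configurations(k, endpoint=1):
    """List every configuration of k endpoints in which `endpoint` has no effect.

    Each configuration is a tuple of booleans; True means delta_j = 0
    (no treatment effect) and False means delta_j != 0.
    """
    configs = []
    for others in product([True, False], repeat=k - 1):
        config = list(others)
        config.insert(endpoint - 1, True)  # the endpoint of interest is always null
        configs.append(tuple(config))
    return configs


for config in null_configurations(3):
    print(config)
```

For 3 endpoints this prints the 4 configurations listed on the slide, in the same order.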

13
Control of FWER (two types)
  • Weak control: control FWER only under the global
    null configuration
  • Strong control: control FWER under all null
    configurations
  • Strong control has the specificity property,
    which is useful for making specific claims.
  • Examples of methods: Bonferroni, Holm, Hochberg,
    closed statistical tests, and other methods
    (with some caveats)
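
For reference, here is a minimal sketch of three of the methods named above, written from their standard textbook definitions. The function names and the example p-values are hypothetical. One of the caveats is that Hochberg's procedure is usually justified only for independent or positively dependent test statistics.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H_0j when p_j <= alpha / K."""
    k = len(p_values)
    return [p <= alpha / k for p in p_values]


def holm(p_values, alpha=0.05):
    """Step-down: compare ordered p-values with alpha/K, alpha/(K-1), ..., alpha."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    reject = [False] * k
    for rank, idx in enumerate(order):  # rank 0 is the smallest p-value
        if p_values[idx] <= alpha / (k - rank):
            reject[idx] = True
        else:
            break  # stop at the first non-significant result
    return reject


def hochberg(p_values, alpha=0.05):
    """Step-up: start from the largest p-value and work downwards."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    reject = [False] * k
    for rank in range(k - 1, -1, -1):
        if p_values[order[rank]] <= alpha / (k - rank):
            for idx in order[: rank + 1]:
                reject[idx] = True
            break
    return reject


p = [0.030, 0.040, 0.012]  # hypothetical p-values for y1, y2, y3
print(bonferroni(p), holm(p), hochberg(p))
```

On these p-values Bonferroni and Holm reject only the third hypothesis, while Hochberg rejects all three because the largest p-value is below 0.05.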

14
Triaging of multiple endpoints into meaningful
families by trial objectives
  • Two important families, both (1) prospectively
    defined and (2) FWE controlled: primary endpoints
    and secondary endpoints
  • Exploratory endpoints (usually not prospectively
    defined)
  • Primary endpoints are the primary focus of the
    trial. Their results determine the main benefits
    of the clinical trial's intervention.
  • Secondary endpoints by themselves are generally
    not sufficient for characterizing treatment
    benefit. They are generally tested for
    statistical significance for extended indication
    and labeling after the primary objectives of the
    trial are met.

15
Statistical methods
  • Prospective alpha allocation schemes (PAAS),
    Moyé (2000)
  • Spend $\alpha_1$ for the primary endpoints and the
    remaining alpha for the secondary endpoints, so
    that FWER is controlled
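
The allocation idea can be sketched as follows. This is a simplified additive split with a Bonferroni test inside each family, which controls FWER by the union bound; it is an assumption made for illustration and is not claimed to be Moyé's exact scheme. The function names and the 0.04/0.01 split are hypothetical.

```python
def split_alpha(total_alpha, primary_alpha):
    """Return (alpha for the primary family, alpha left for the secondary family)."""
    if not 0.0 < primary_alpha < total_alpha:
        raise ValueError("primary_alpha must lie strictly between 0 and total_alpha")
    return primary_alpha, total_alpha - primary_alpha


def bonferroni_within_family(p_values, family_alpha):
    """Split the family's alpha equally across its endpoints."""
    k = len(p_values)
    return [p <= family_alpha / k for p in p_values]


alpha_primary, alpha_secondary = split_alpha(0.05, 0.04)
print(bonferroni_within_family([0.015, 0.30], alpha_primary))    # primary endpoints
print(bonferroni_within_family([0.004, 0.02], alpha_secondary))  # secondary endpoints
```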

16
Statistical methods
  • Parallel gatekeeping strategies for clinical
    trials
  • Dmitrienko-Offen-Westfall (SM 2003)
  • Chen-Luo-Capizzi (SM 2005)
  • Allow testing of secondary endpoints when at
    least one of the primary endpoints exhibits a
    statistically significant result
  • These methods control FWER for both the primary
    and secondary endpoints in the strong sense.

17
Sequential testing of multiple endpoints
  • A fixed-sequence approach allows testing of each
    of the k null hypotheses at the same significance
    level $\alpha$ without any adjustment, as long as
    the null hypotheses to be tested are
    hierarchically ordered and are tested in a
    pre-defined sequential order.
  • Hierarchical ordering of null hypotheses can be
    achieved, for example, by their clinical
    relevance.

18
Sequential testing of multiple endpoints
  • For this fixed-sequence approach, however,
    there are two caveats:
  • Pre-specification of the testing sequence
  • No further testing once the sequence breaks
  • Problem: the sequence breaks and the next
    p-value is extreme (e.g., $p_1 = 0.50$, $p_2 = 0.001$)
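
A minimal sketch of the fixed-sequence rule makes the problem visible. The function name is hypothetical, and the p-values are assumed to be supplied in the pre-specified testing order.

```python
def fixed_sequence_test(p_values, alpha=0.05):
    """Test hypotheses in the given order, each at the full level alpha.

    Testing stops at the first non-significant result; every later
    hypothesis is left unrejected regardless of its p-value.
    """
    reject = []
    sequence_broken = False
    for p in p_values:
        if not sequence_broken and p <= alpha:
            reject.append(True)
        else:
            sequence_broken = True
            reject.append(False)
    return reject


print(fixed_sequence_test([0.50, 0.001]))  # the slide's example: nothing is rejected
print(fixed_sequence_test([0.01, 0.03, 0.20, 0.001]))
```

In the slide's example the second endpoint cannot be claimed even though its p-value is 0.001, because the first test failed.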

19
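A flexible fixed-sequence approach
  • Test H(01) at level $\alpha_1$
  • If H(01) is rejected, test H(02) at level $\alpha$
  • If H(01) is not rejected, test H(02) at level
    $\gamma$
  • e.g., $\alpha_1 = 0.04$, $\alpha = 0.05$:
    $\gamma = 0.0104$ when $\rho = 0$ ($\gamma = 0.0214$
    when $\rho = 0.8$), where $\rho$ is the
    correlation between the two test statistics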
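
The first set of example values on this slide (0.04, 0.05, 0.0104) can be reproduced under the assumption that the two test statistics are independent: the reduced level for H(02), called `gamma` here as an illustrative name, is chosen so that 1 - (1 - alpha1)(1 - gamma) = alpha. The sketch below covers that independent case only; the correlated case (0.0214 at correlation 0.8) would need a bivariate normal calculation that is not attempted here.

```python
def fallback_level(alpha, alpha1):
    """Level for H(02) when H(01) is not rejected, assuming independent tests.

    Solves 1 - (1 - alpha1) * (1 - gamma) = alpha for gamma.
    """
    return 1.0 - (1.0 - alpha) / (1.0 - alpha1)


def flexible_fixed_sequence(p1, p2, alpha=0.05, alpha1=0.04):
    """Return (reject H(01), reject H(02)) for the two-hypothesis procedure."""
    gamma = fallback_level(alpha, alpha1)
    reject1 = p1 <= alpha1
    # H(02) gets the full alpha after a win on H(01), otherwise the reduced level.
    reject2 = p2 <= (alpha if reject1 else gamma)
    return reject1, reject2


print(round(fallback_level(0.05, 0.04), 4))
print(flexible_fixed_sequence(0.50, 0.001))
```

Unlike the strict fixed-sequence rule, a very small second p-value can still lead to a rejection when the first test fails.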

20
Example: flexible fixed-sequence method

21
Some trials require that 2 or more endpoints must
show effects
  • Examples:
  • Alzheimer trial: (win on ADAS-Cognitive
    Sub-scale) and (win on Clinician's
    Interview-Based Impression of Change)
  • Many other examples (PhRMA draft paper)
  • Main reason: clinical expectations of the desired
    clinical benefit (concept beyond statistics)

22
Adjustments in the Type I error rate: some
winning criteria require adjustments and some
don't
  • Adjustment by Šidák's method, accounting for
    correlation
  • Note: which method to use depends on the
    clinical decision rule set in advance
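
To illustrate how the winning criterion drives the adjustment, here is a minimal sketch of two decision rules. It uses the plain Šidák level, which assumes independent tests; the correlation-adjusted version referred to on the slide is not recoverable from the transcript and is not reproduced. The function names are hypothetical.

```python
def sidak_level(alpha, k):
    """Per-endpoint level so that k independent tests have overall level alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / k)


def win_on_at_least_one(p_values, alpha=0.05):
    """Rule 'win on at least one endpoint': each test needs an adjusted level."""
    level = sidak_level(alpha, len(p_values))
    return any(p <= level for p in p_values)


def win_on_all(p_values, alpha=0.05):
    """Rule 'win on all endpoints': each test may use the full alpha."""
    return all(p <= alpha for p in p_values)


print(round(sidak_level(0.05, 2), 4))
print(win_on_at_least_one([0.030, 0.20]), win_on_all([0.030, 0.045]))
```

Requiring a win on every endpoint needs no alpha adjustment, but it costs power, which is the subject of the slides that follow.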

23
Power Comparison: Case of K = 2 endpoints

24
Loss in Power when Win in All Endpoints (K = number of endpoints)

25
Sample Size Increase (%) (1) When Win in All K
Endpoints Compared to Single Endpoint Case
  • Alpha = 0.025 (1-sided), Power = 0.90

    Correlation    K = 2    K = 3    K = 4
    0.0            22.8     35.9     45.0
    0.3            21.1     33.1     41.2
    0.4            20.2     31.7     39.7
    0.5            19.1     29.8     37.3
    0.6            17.7     27.5     34.4
    0.7            15.9     24.6     30.7
    0.8            13.5     20.8     25.8
    0.9            10.0     15.3     18.9

  • (1) Calculations using the multivariate normal
    distribution of the test statistics comparing
    active treatment versus placebo for a 2-arm
    trial, assuming the same delta/sigma for all K
    endpoints
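
The zero-correlation row of this table can be checked with a short calculation. The sketch below assumes independent, normally distributed test statistics with the same delta/sigma, so that the joint power is the product of the per-endpoint powers; the function name is hypothetical, and the correlated rows would need a multivariate normal integral that is not attempted here.

```python
from statistics import NormalDist


def sample_size_increase_pct(k, alpha=0.025, power=0.90):
    """Percent increase in sample size when all k independent endpoints must win.

    Sample size is proportional to (z_alpha + z_power)^2, and each endpoint
    must reach power ** (1 / k) for the joint power to equal `power`.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1.0 - alpha)
    per_endpoint_power = power ** (1.0 / k)
    single = (z_alpha + z(power)) ** 2
    joint = (z_alpha + z(per_endpoint_power)) ** 2
    return 100.0 * (joint / single - 1.0)


for k in (2, 3, 4):
    print(k, round(sample_size_increase_pct(k), 1))
```

Under these assumptions the results agree with the table's first row to the precision shown.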

26
Composite Endpoints
  • Two types
  • First type: total score or index based on a
    rating scale, e.g., HAMD totals in depression
    trials, ACR20/ACR70 in rheumatoid arthritis trials
  • Issues: validity and reliability

27
Composite Endpoints
  • Another type
  • The composite endpoint is defined in terms of the
    time to the first event, where the event is one
    of several possible event types
  • LIFE study: composite of cardiovascular
    death, stroke, and myocardial infarction events.
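
A minimal sketch of how a time-to-first-event composite can be derived for one patient is shown below. The component names, the follow-up time, and the data are hypothetical and are not taken from the LIFE study.

```python
from typing import Dict, Optional, Tuple


def time_to_first_event(
    event_times: Dict[str, Optional[float]], follow_up: float
) -> Tuple[float, Optional[str]]:
    """Return (time, component) for the earliest component event.

    `event_times` maps each component to its event time, or None if that
    event never occurred. If no event falls within follow-up, the patient
    is censored at `follow_up` and the component is None.
    """
    observed = {
        name: t for name, t in event_times.items() if t is not None and t <= follow_up
    }
    if not observed:
        return follow_up, None
    first = min(observed, key=observed.get)
    return observed[first], first


patient = {"cv_death": None, "stroke": 2.5, "mi": 1.0}  # times in years
print(time_to_first_event(patient, follow_up=4.0))
```

Keeping the component label alongside the composite time is what later allows the first events to be broken down by component.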

28
Composite Endpoint Issues
  • LIFE study
  • The composite endpoint was significantly
    positive. However, analysis of the first events
    by individual components and sub-composite
    endpoints indicates that the overall composite
    result was mainly due to a reduction in fatal and
    non-fatal stroke.
  • Issue:
  • How to interpret composite endpoint results? How
    to characterize benefits in terms of the
    component endpoints?

29
Extent of multiplicity adjustments between
endpoints
  • Figure: 2 x 2 grid with correlation (low to
    high) on one axis and homogeneity of treatment
    effects across endpoints (low to high) on the
    other
  • Cell labels: Practically no adjustments; Small
    adjustments; Good case for combining endpoints;
    Large adjustments

30
Concluding Remarks
  • For endpoint-specific claims, strong control of
    the type I error is needed
  • Parallel gatekeeping strategies can be used for
    the primary and secondary endpoint claims
  • A flexible sequential test procedure can be used
    to gain power
  • There is a scientific basis when a reasonable
    clinical decision rule asks for statistically
    significant efficacy results in more than one
    endpoint. Issue: loss of power?
  • When 4 or more endpoints are included as primary
    (e.g., arthritis trials), and homogeneity of
    treatment effects across endpoints is expected,
    a composite or responder endpoint approach will
    be effective.