STATISTICS 542 Introduction to Clinical Trials: SAMPLE SIZE ISSUES


1
STATISTICS 542 Introduction to Clinical Trials
SAMPLE SIZE ISSUES
  • Ref: Lachin, Controlled Clinical Trials 2:93-113,
    1981.

2
Sample Size Issues
  • Fundamental Point
  • Trial must have sufficient statistical power to
    detect differences of clinical interest
  • A high proportion of published negative trials do
    not have adequate power
  • Freiman et al, NEJM (1978)
  • 50/71 could have missed a 50% benefit

3
Example: How many subjects?
  • Compare new treatment (T) with a control (C)
  • Previous data suggest a Control Failure Rate (PC)
    of 40%
  • Investigator believes treatment can reduce PC by
    25%
  • i.e. PT = .30, PC = .40
  • N = number of subjects/group?

4
  • Estimates only approximate
  • Uncertain assumptions
  • Over-optimism about treatment
  • Healthy screening effect
  • Need a series of estimates
  • Try various assumptions
  • Must pick the most reasonable
  • Be conservative, yet be reasonable

5
Statistical Considerations
Null Hypothesis (H0): No difference in the
response exists between treatment and control
groups
Alternative Hypothesis (HA): A difference of a
specified amount (δ) exists between treatment
and control
Significance Level (α) = Type I Error: The
probability of rejecting H0 given that H0 is true
Power (1 - β) (β = Type II Error): The
probability of rejecting H0 given that H0 is not true
6
Standard Normal Distribution
Ref: Brown & Hollander. Statistics: A Biomedical
Introduction. John Wiley & Sons, 1977.
7
Standard Normal Table
Ref: Brown & Hollander. Statistics: A Biomedical
Introduction. John Wiley & Sons, 1977.
8
Distribution of Sample Means (1)
Ref: Brown & Hollander. Statistics: A Biomedical
Introduction. John Wiley & Sons, 1977.
9
Distribution of Sample Means (2)
Ref: Brown & Hollander. Statistics: A Biomedical
Introduction. John Wiley & Sons, 1977.
10
Distribution of Sample Means (3)
Ref: Brown & Hollander. Statistics: A Biomedical
Introduction. John Wiley & Sons, 1977.
11
Distribution of Sample Means (4)
Ref: Brown & Hollander. Statistics: A Biomedical
Introduction. John Wiley & Sons, 1977.
12
Test Statistics
13
Distribution of Test Statistics
  • Many have this common form
  • Testing a population parameter (e.g.
    difference in means)
  • T = sample estimate of the population
    parameter
  • Then
  • Z = [T - E(T)] / √V(T)
  • And then Z has a Normal (0,1) distribution for
    large sample sizes

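This common form can be made concrete for the two-proportion case used throughout these slides; a minimal sketch (the function name and the pooled-variance choice are illustrative, not from the slides):

```python
from math import sqrt

def two_prop_z(x_c, n_c, x_t, n_t):
    """Z = [T - E(T)] / sqrt(V(T)) with T = pc_hat - pt_hat.
    Under H0, E(T) = 0 and V(T) uses the pooled proportion."""
    pc_hat, pt_hat = x_c / n_c, x_t / n_t
    p_pool = (x_c + x_t) / (n_c + n_t)  # pooled estimate under H0
    v = p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t)
    return (pc_hat - pt_hat) / sqrt(v)

# e.g. 40/100 control failures vs. 30/100 treatment failures
z = two_prop_z(40, 100, 30, 100)  # about 1.48: short of 1.96, so H0 not rejected
```

With only 100 per group, the observed .40 vs. .30 difference does not reach the 1.96 cutoff, which is exactly why the power calculation below calls for several hundred per group.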
14
  • If the statistic z is large enough (e.g. falls into
    the red area of the scale), we believe this result is
    too large to have come from a distribution with mean 0
    (i.e. PC - PT = 0)
  • Thus we reject H0: PC - PT = 0, since there is
    at most a 5% chance this result could have
    come from a distribution with no difference

15
Normal Distribution
Ref Brown Hollander. Statistics A Biomedical
Introduction. John Wiley Sons, 1977.
16
Two Groups
17
Test of Hypothesis
  • Two-sided vs. one-sided
  • e.g. HA: PT ≠ PC (two-sided) vs. HA: PT < PC
    (one-sided)
  • Classic test: zα = critical value
  • If |z| > zα (two-sided) or z > zα (one-sided),
    reject H0
  • α = .05: zα = 1.96 (two-sided), zα = 1.645
    (one-sided)
  • where z = test statistic
  • Recommend:
  • zα be the same value in both cases (e.g. 1.96)
  • two-sided α = .05, or one-sided α = .025
  • zα = 1.96 in both cases

18
Typical Design Assumptions (1)
  • 1. α = .05, .025, .01
  • 2. Power = .80, .90
  • Should be at least .80 for design
  • 3. δ = smallest difference we hope to detect
  • e.g. δ = PC - PT
  •     = .40 - .30
  •     = .10, a 25% reduction!

19
Typical Design Assumptions (2)
Two Sided
Power
Significance Level
20
Sample Size Exercise
  • How many do I need?
  • Next question: what's the question?
  • The reason is that sample size depends on the outcome
    being measured and the method of analysis to be
    used

21
Simple Case - Binomial
  • 1. H0: PC = PT
  • 2. Test Statistic (Normal Approx.)
  • 3. Sample Size
  • Assume
  • NT = NC = N
  • HA: δ = PC - PT

22
Sample Size Formula (1): Two Proportions
  • Simple Case
  • Zα = constant associated with α:
    P(|Z| > Zα) = α (two-sided!)
  • (e.g. α = .05, Zα = 1.96)
  • Zβ = constant associated with 1 - β:
    P(Z < Zβ) = 1 - β
  • (e.g. 1 - β = .90, Zβ = 1.282)
  • Solve for Zβ (→ 1 - β) or Δ

23
Sample Size Formula (2): Two Proportions
  • Zα = constant associated with α:
    P(|Z| > Zα) = α (two-sided!)
  • (e.g. α = .05, Zα = 1.96)
  • Zβ = constant associated with 1 - β:
    P(Z < Zβ) = 1 - β
  • (e.g. 1 - β = .90, Zβ = 1.282)

24
Sample Size Formula
  • Power
  • Solve for Zβ → 1 - β
  • Difference Detected
  • Solve for Δ

25
Simple Example (1)
  • H0: PC = PT
  • HA: PC = .40, PT = .30
  • δ = .40 - .30 = .10
  • Assume
  • α = .05, Zα = 1.96 (two-sided)
  • 1 - β = .90, Zβ = 1.282
  • p̄ = (.40 + .30)/2 = .35

26
Simple Example (2)
  • Thus
  • a.
  • N = 476
  • 2N = 952
  • b.
  • 2N = 956
  • N = 478

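Both numbers on this slide can be reproduced in a few lines; a sketch assuming the two standard normal-approximation formulas (version a uses separate variance terms under H0 and HA, version b uses the pooled variance throughout; the function name is mine):

```python
from math import sqrt

def n_two_proportions(pc, pt, z_a=1.96, z_b=1.282):
    """Per-group sample size for comparing two proportions."""
    pbar = (pc + pt) / 2
    d2 = (pc - pt) ** 2
    # a. separate variance terms under H0 and HA
    n_a = (z_a * sqrt(2 * pbar * (1 - pbar))
           + z_b * sqrt(pc * (1 - pc) + pt * (1 - pt))) ** 2 / d2
    # b. simpler version: pooled variance throughout
    n_b = 2 * pbar * (1 - pbar) * (z_a + z_b) ** 2 / d2
    return n_a, n_b

n_a, n_b = n_two_proportions(0.40, 0.30)  # about 476 and 478 per group
```

The two versions differ only slightly here (952 vs. 956 total), which is why either is acceptable for planning purposes.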
27
Approximate Total Sample Size for Comparing
Various Proportions in Two Groups with
Significance Level (α) of 0.05 and Power (1-β) of
0.80 and 0.90
28
(No Transcript)
29
Comparison of Means
  • Some outcome variables are continuous
  • Blood Pressure
  • Serum Chemistry
  • Pulmonary Function
  • Hypothesis tested by comparison of mean values
    between groups, or comparison of mean changes

30
Comparison of Two Means
  • H0: μC = μT ⇔ μC - μT = 0
  • HA: μC - μT = δ
  • Test statistic based on sample means, each ~ N(μ, σ²/N)
  • Let N = NC = NT for design

Test statistic ~ N(0,1) under H0
31
Comparison of Means
  • Power Calculation

32
Example
  • e.g. IQ: σ = 15, δ = 0.3 × 15 = 4.5
  • Set 2α = .05
  • β = 0.10, 1 - β = 0.90
  • HA: δ = 0.3σ ⇒ δ/σ = 0.3
  • Sample Size
  • N = 234
  • ⇒ 2N = 468

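This calculation is easy to check; a sketch using N = 2(Zα + Zβ)²/(δ/σ)² per group (function name mine):

```python
from math import ceil

def n_two_means(effect_size, z_a=1.96, z_b=1.282):
    """Per-group N for comparing two means; effect_size = delta/sigma."""
    return 2 * (z_a + z_b) ** 2 / effect_size ** 2

n = ceil(n_two_means(0.3))  # 234 per group, so 2N = 468
```

Note that only the standardized difference δ/σ matters: halving the detectable effect quadruples the required N.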
33
(No Transcript)
34
Comparing Time to Event Distributions
  • Primary efficacy endpoint is the time to an event
  • Compare the survival distributions for the two
    groups
  • Measure of treatment effect is the ratio of the
    hazard rates in the two groups (equivalently, the
    ratio of the medians)
  • Must also consider the length of follow-up

35
Assuming Exponential Survival Distributions
  • Then define the effect size by
  • Standardized difference

36
Time to Failure (1)
  • Use a parametric model for sample size
  • Common model: exponential
  • S(t) = e^(-λt), λ = hazard rate
  • H0: λI = λC
  • Estimate N
  • George & Desu (1974)
  • Assumes all patients followed to an event
  • (no censoring)
  • Assumes all patients immediately entered

37
Assuming Exponential Survival Distributions
  • Simple case
  • The statistical test is powered by the total
    number of events observed at the time of the
    analysis, d.

38
Converting Number of Events (D) to Required
Sample Size (2N)
  • d = 2N × P(event)  ⇒  2N = d/P(event)
  • P(event) is a function of the length of total
    follow-up at the time of analysis and the average
    hazard rate
  • Let AR = accrual rate (patients per year)
  • A = period of uniform accrual (2N = AR × A)
  • F = period of follow-up after accrual is complete
  • A/2 + F = average total follow-up at planned
    analysis
  • λ = average hazard rate
  • Then P(event) = 1 - P(no event)

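The conversion above can be sketched as follows. The P(event) step is an approximation I am assuming here (treat every patient as followed for the average time A/2 + F under a constant hazard); the slide leaves the exact expression to a formula image:

```python
from math import exp, ceil

def total_sample_size(d, hazard, accrual_years, followup_years):
    """2N = d / P(event), with P(event) approximated as
    1 - exp(-hazard * (A/2 + F)), i.e. average follow-up A/2 + F."""
    avg_followup = accrual_years / 2 + followup_years
    p_event = 1 - exp(-hazard * avg_followup)
    return ceil(d / p_event)

# hypothetical numbers: 100 required events, hazard 0.2/yr,
# 4 years of uniform accrual, 1 additional year of follow-up
n_total = total_sample_size(100, 0.2, 4, 1)  # 222 patients
```

The lower the event rate or the shorter the follow-up, the larger the gap between the event target d and the sample size 2N needed to observe it.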
39
Time to Failure (2)
  • In many clinical trials
  • 1. Not all patients are followed to an event
  • (i.e. censoring)
  • 2. Patients are recruited over some period of
    time
  • (i.e. staggered entry)
  • More General Model (Lachin, 1981)
  • where g(λ) is defined as follows

40
  • 1. Instant Recruitment, Study Censored at Time T
  • 2. Continuous Recruitment (0, T), Censored at T
  • 3. Recruitment (0, T0), Study Censored at T
    (T > T0)

41
  • Example
  • Assume α = .05 (2-sided), 1 - β = .90
  • λC = .3 and λI = .2
  • T = 5 years of follow-up
  • T0 = 3
  • 0. No Censoring, Instant Recruitment: N = 128
  • 1. Censoring at T, Instant Recruitment: N = 188
  • 2. Censoring at T, Continual Recruitment: N = 310
  • 3. Censoring at T, Recruitment to T0: N = 233

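All four Ns can be reproduced under the exponential model; a sketch in which the φ(λ) variance factors are the standard forms for these entry/censoring patterns (my reconstruction, since the slide's g(λ) formulas were images), with zα = 1.96 and zβ = 1.2816:

```python
from math import exp, log

lam_c, lam_i, T, T0 = 0.3, 0.2, 5.0, 3.0
zz = (1.96 + 1.2816) ** 2  # alpha = .05 (2-sided), power = .90

# 0. George & Desu: no censoring, instant entry (log hazard-ratio scale)
n0 = 2 * zz / log(lam_c / lam_i) ** 2

def phi1(l):  # instant entry, censored at T
    return l ** 2 / (1 - exp(-l * T))

def phi2(l):  # uniform entry on (0, T), censored at T
    return l ** 3 * T / (l * T - 1 + exp(-l * T))

def phi3(l):  # uniform entry on (0, T0), censored at T > T0
    return l ** 2 / (1 - (exp(-l * (T - T0)) - exp(-l * T)) / (l * T0))

def n_per_group(phi):
    return zz * (phi(lam_c) + phi(lam_i)) / (lam_c - lam_i) ** 2

# rounding gives N = 128, 188, 310, 233, matching cases 0-3 above
```

Censoring and staggered entry reduce the information per patient, which is why N climbs from 128 to 310 as the assumptions become more realistic.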
42
Sample Size Adjustment for Non-Compliance (1)
  • References
  • 1. Schork & Remington (1967) Journal of Chronic
    Diseases
  • 2. Halperin et al (1968) Journal of Chronic
    Diseases
  • 3. Wu, Fisher & DeMets (1980) Controlled
    Clinical Trials
  • Problem
  • Some patients may not adhere to the treatment
    protocol
  • Impact
  • Dilutes whatever true treatment effect exists

43
Sample Size Adjustment for Non-Compliance (2)
  • Fundamental Principle
  • Analyze All Subjects Randomized
  • Called the Intent-to-Treat (ITT) Principle
  • Noncompliance will dilute the treatment effect
  • A Solution
  • Adjust sample size to compensate for the dilution
    effect (reduced power)
  • Definitions of Noncompliance
  • Dropout: Patient in treatment group stops taking
    therapy
  • Dropin: Patient in control group starts taking
    experimental therapy

44
  • Comparing Two Proportions
  • Assumes event rates will be altered by
    non-compliance
  • Define
  • PT' = adjusted treatment group rate
  • PC' = adjusted control group rate
  • If PT < PC, the adjusted rates lie between the
    original ones: PT ≤ PT' < PC' ≤ PC
45
Adjusted Sample Size
  • Simple Model -
  • Compute unadjusted N
  • Assume no dropins
  • Assume dropout proportion R
  • Thus PC' = PC
  • PT' = (1-R) PT + R PC
  • Then adjust N by the factor 1/(1-R)²
  • Example
  • R     1/(1-R)²   % Increase
  • .1    1.23       23
  • .25   1.78       78

46
Sample Size Adjustment for Non-Compliance
  • Dropouts and dropins (R0, RI)
  • Example
  • R0    RI    1/(1-R0-RI)²   % Increase
  • .1    .1    1.56           56
  • .25   .25   4.0            4 times

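The inflation factors in both tables come from one formula; a one-function sketch of this simple dilution model:

```python
def inflation_factor(r_dropout, r_dropin=0.0):
    """Multiply the unadjusted N by 1 / (1 - R0 - RI)**2
    to compensate for dilution from noncompliance."""
    return 1 / (1 - r_dropout - r_dropin) ** 2

# dropouts only:         10% -> 1.23 (+23%),  25% -> 1.78 (+78%)
# dropouts and dropins:  10% each -> 1.56,    25% each -> 4.0
```

Because the factor is quadratic in the total noncompliance rate, even moderate dropout plus dropin can double or quadruple the required sample size.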
47
Sample Size Adjustments
  • More Complex Model
  • Ref: Wu, Fisher & DeMets (1980)
  • Further Assumptions
  • Length of follow-up divided into intervals
  • Hazard rate may vary
  • Dropout rate may vary
  • Dropin rate may vary
  • Lag in time for treatment to become fully effective

48
Example Beta-Blocker Heart Attack Trial (BHAT)
(1)
  • Used complex model
  • Assumptions
  • 1. α = .05 (Two-sided), 1 - β = .90
  • 2. 3-year follow-up
  • 3. PC = .18 (Control Rate)
  • 4. PT = .13 (Treatment), an assumed
    28% reduction
  • 5. Dropout
  •    26% (12%, 8%, 6%)
  • 6. Dropin
  •    21% (7%, 7%, 7%)

49
Example Beta-Blocker Heart Attack Trial (BHAT)
(2)
  •            Unadjusted   Adjusted
  • PC         .18          .175
  • PT         .13          .14
  • Reduction  28%          20%
  • N          1100         2000
  • 2N         2200         4000

50
Multiple Response Variables
  • Many trials measure several outcomes
  • (e.g. MILIS, NOTT)
  • Must force investigators to rank them by
    importance
  • Do sample size calculations on a few outcomes (2-3)
  • If the estimates agree, OK
  • If not, must seek a compromise

51
Equivalency or Non-Inferiority Trials
  • Compare new therapy with a standard
  • Wish to show the new one is "as good as" the standard
  • Rationale may be cost, toxicity, profit
  • Examples
  • Intermittent Positive Pressure Breathing Trial
  • Expensive IPPB vs. Cheaper Treatment
  • Nocturnal Oxygen Therapy Trial (NOTT)
  • 12 Hours of Oxygen vs. 24 Hours
  • Problem
  • Can't show H0: δ = 0
  • A Solution
  • Specify a minimum difference δmin

52
Sample Size Formula Two Proportions
  • Simple Case
  • Zα = constant associated with α
  • Zβ = constant associated with 1 - β
  • Solve for Zβ (→ 1 - β) or Δ

53

(Figure: Difference in events, test drug vs. standard drug)
54
Mid-Stream Adjustments
  • Murphy's Law applies to sample size
  • May find event rate assumptions way off from
    early results, and the power of the study very
    inadequate
  • Problem
  • Quit?
  • Continue toward almost certain doom?
  • Adjust sample size?
  • Extend follow-up?
  • Early Decision
  • Best to decide early, without looking at treatment
    comparisons

55
Adaptive Designs
  • One class allows re-estimating the sample size
    once the trial is under way
  • Chung et al
  • Chen, Lan & DeMets
  • Methods have been criticized for allowing bias
    (e.g. Mehta & Tsiatis)
  • Thus, these methods are still not widely used
  • The A-HeFT trial is one example
  • Will be discussed later in the data monitoring lecture

56
Event Rate Assumptions
  • Challenging to get event rate assumptions correct
  • Inclusion/exclusion criteria effect
  • Healthy volunteer effect
  • Changing background therapy/standard of care
  • Even if trials are conducted back to back

57
PRAISE I vs. PRAISE II: Placebo Arms
58
Event Driven Trials
  • For time-to-event trials, most of the information
    is in the events
  • Power is a function of the number of events
  • The target is really the number of observed
    events (D), not the total sample size (2N)
  • Thus, target the number of events

59
Event Driven Trials
  • Can adjust or adapt the trial to reach the target
    number of events if the assumed event rate was
    too high
  • The steering committee can
  • Increase sample size
  • Increase follow-up
  • A combination of both

60
Examples of Event Driven Trials
  • PROMISE (Based on control arm)
  • PRAISE I & II
  • COPERNICUS
  • CARS (Based on control arm)

61
Response Adaptive Designs
  • The observed treatment effect may be smaller
    than assumed
  • Treatment actually less effective
  • Compliance worse than assumed
  • Background therapy changed
  • A smaller observed effect may still be of clinical
    interest if real

62
Response Adaptive Designs
  • The probability of rejecting H0 is then also small
  • Power
  • Conditional Power
  • Question is whether to
  • quit and start over, or
  • make a design modification and continue

63
Response Adaptive Designs
  • Stopping and starting over is problematic
  • Waste of financial resources
  • Ethical issues of wasting the contributions of
    patients who have already participated
  • Probably can't afford a policy of designing all
    trials for the minimum treatment effect of interest

64
Response Adaptive Designs
  • Adjust/increase the sample size if the assumed
    treatment effect was too large
  • Traditionally, this approach was discouraged
  • Recent methodology suggests possible approaches

65
Response Adaptive Designs
  • These methods are relatively new and still
    controversial
  • Many leading biostatisticians are very critical
    (e.g., Fleming, Emerson, Turnbull, Tsiatis)
  • Issues often go beyond statistical control of
    Type I error
  • Introducing other sources of bias

66
Response Adaptive Designs
  • Increase sample size based on observed treatment
    effect
  • May inflate the false positive rate
  • By 30% to 40% (Cui et al)
  • Can double (Proschan et al)
  • Inflation of Type I error of that magnitude is not
    acceptable

67
Response Adaptive Designs
  • Statistical adjustments to control alpha
  • Weighted z-statistic
  • Adjustment to the critical value
  • Enforcing rules for sample size recalculation

68
Weighted Z Statistic
  • References
  • Cui, Hung & Wang (1999, Biometrics)
  • Fisher (1998, Stat Med)
  • Shen & Fisher (1999, Biometrics)
  • Tsiatis & Mehta (2003, Biometrika)

69
Weighted Z
  • Xi ~ N(0,1) distribution
  • n = current sample size
  • N0 = initial total sample size
  • δa = hypothesized treatment effect
  • t = n/N0

70
Weighted Z
  • N = proposed sample size based on interim results
  • Reject H0 if the weighted statistic exceeds the
    critical value
  • Note: less weight is assigned to new/additional
    observations

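A minimal sketch of the weighted statistic (this is the standard Cui, Hung & Wang combination; I am assuming the usual √t and √(1-t) weights, fixed at design from t = n/N0, which is why adding observations later does not disturb the null distribution):

```python
from math import sqrt

def weighted_z(z1, z2, t):
    """Combine the stage-1 statistic z1 (first n of N0 patients, t = n/N0)
    with the stage-2 statistic z2 using weights fixed in advance.
    Under H0 each piece is N(0,1), so Zw ~ N(0,1): reject if Zw > z_alpha,
    regardless of how large the second stage actually turns out to be."""
    return sqrt(t) * z1 + sqrt(1 - t) * z2

# interim at half the originally planned information (t = 0.5)
zw = weighted_z(1.5, 1.2, 0.5)  # about 1.91, just below 1.96
```

Because the weights depend on the original N0 rather than the enlarged N, late observations carry less weight per patient; that is the price of the flexibility noted on the next slide.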
71
Weighted Z
  • Possible to modify the design and increase the
    sample size based on an interim analysis while
    controlling the Type I error
  • Flexibility has a price

72
Tsiatis & Mehta Criticism
  • Argue that a properly designed group sequential
    trial is more efficient than these adaptive
    designs
  • Challenge is to properly design
  • (However, that can be a bigger challenge than
    often realized)

73
Weighted/Unweighted Modification
  • Both
  • Type I error < α
  • No real loss of power
  • Ref: Chen, DeMets & Lan

74
P-Value Method
  • Reference
  • Proschan & Hunsberger (1995, Biometrics)
  • Requires a promising p-value before allowing an
    increase in sample size
  • Requires stopping if the first-stage p-value is not
    promising
  • Requires a larger critical value at the second
    stage to control the Type I error

75
P-value Method
  • One-sided alpha = 0.05
  • P(1):  .10   .15   .20    .25    .50
  • Z(2):  1.77  1.82  1.85   1.875  1.95
  • The second-stage critical value Z(2) applies
    regardless of n2, the second-stage sample size

76
Proschan & Hunsberger Method
  • The simple method may make the Type I error
    substantially less than 0.05
  • They developed another method to obtain exact Type I
    error as a function of Z1 and n2, using a
    conditional-power-type calculation (details to be
    discussed later)

77
Proschan & Hunsberger
Conditional power and p-value required in stage 2
as a function of R = n2/n1 for the NHLBI Type II
study example
78
Proschan & Hunsberger
  • Allows for sample size adjustment based on
    observed treatment effect
  • Requires increasing final critical value

79
Adaptive Design Remarks
  • A need exists for adaptive designs (even FDA
    statisticians agree)
  • Technical advances have been made through several
    new methods
  • Adaptive designs are still not widely accepted and
    are subject to (strong) criticism
  • May be useful for non-pivotal trials
  • Practice precedes theory; perhaps theory will
    catch up in time

80
Sample Size Summary
  • Ethically, the size of the study must be large
    enough to achieve the stated goals with
    reasonable probability (power)
  • Sample size estimates are only approximate due to
    uncertainty in assumptions
  • Need to be conservative but realistic

81
Demo of Sample Size Program (www.biostat.wisc.edu)
  • Program covers comparison of proportions, means,
    and time to failure
  • Can vary control group rates or responses, alpha,
    power, and hypothesized differences
  • Program develops a sample size table and a power
    curve for a particular sample size

82
Sample Size Program Output
83
Union Terrace/Lakefront