Title: STATISTICS 542 Introduction to Clinical Trials SAMPLE SIZE ISSUES
1. STATISTICS 542 Introduction to Clinical Trials
SAMPLE SIZE ISSUES
- Ref: Lachin JM. Controlled Clinical Trials 2:93-113, 1981.
2. Sample Size Issues
- Fundamental point: the trial must have sufficient statistical power to detect differences of clinical interest
- A high proportion of published negative trials do not have adequate power (Freiman et al, NEJM, 1978)
- 50/71 could have missed a 50% benefit
3. Example: How Many Subjects?
- Compare a new treatment (T) with a control (C)
- Previous data suggest a control failure rate (PC) of 40%
- Investigator believes treatment can reduce PC by 25%, i.e. PT = .30, PC = .40
- N = number of subjects per group?
4.
- Estimates are only approximate
- Uncertain assumptions
- Over-optimism about treatment
- Healthy screening effect
- Need a series of estimates
- Try various assumptions
- Must pick the most reasonable
- Be conservative, yet reasonable
5. Statistical Considerations
- Null Hypothesis (H0): No difference in response exists between treatment and control groups
- Alternative Hypothesis (HA): A difference of a specified amount (Δ) exists between treatment and control
- Significance Level (α) = Type I Error: The probability of rejecting H0 given that H0 is true
- Power (1 - β), where β = Type II Error: The probability of rejecting H0 given that H0 is not true
6. Standard Normal Distribution
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
7. Standard Normal Table
8. Distribution of Sample Means (1)
9. Distribution of Sample Means (2)
10. Distribution of Sample Means (3)
11. Distribution of Sample Means (4)
12. Test Statistics
13. Distribution of Test Statistics
- Many test statistics have this common form
- Testing a population parameter (e.g. a difference in means)
- T = sample estimate of the population parameter
- Then Z = [T - E(T)] / √V(T)
- Z has a Normal(0,1) distribution for large sample sizes
14.
- If the statistic z is large enough (e.g. falls into the red area of the scale), we believe this result is too large to have come from a distribution with mean 0 (i.e. PC - PT = 0)
- Thus we reject H0: PC - PT = 0, claiming that there is at most a 5% chance this result could have come from a distribution with no difference
15. Normal Distribution
16. Two Groups
(two equivalent forms of the test statistic; formulas not transcribed)
17. Test of Hypothesis
- Two-sided vs. one-sided
- e.g. two-sided H0: PT = PC; one-sided HA: PT < PC
- Classic test: zα = critical value; if z > zα, reject H0
- Two-sided: α = .05, zα = 1.96; one-sided: α = .05, zα = 1.645
- where z = test statistic
- Recommend zα be the same value in both cases (e.g. 1.96)
- i.e. two-sided α = .05 or one-sided α = .025, so zα = 1.96 either way
18. Typical Design Assumptions (1)
- 1. α = .05, .025, .01
- 2. Power = .80, .90
- Should be at least .80 for design
- 3. Δ = smallest difference we hope to detect
- e.g. Δ = PC - PT = .40 - .30 = .10, a 25% reduction!
19. Typical Design Assumptions (2)
- Critical values for two-sided significance level: α = .05 → Zα = 1.96; α = .025 → Zα = 2.24; α = .01 → Zα = 2.58
- Critical values for power: 1 - β = .80 → Zβ = 0.842; 1 - β = .90 → Zβ = 1.282
20. Sample Size Exercise
- How many do I need?
- Next question: what's the question?
- Reason: the sample size depends on the outcome being measured and the method of analysis to be used
21. Simple Case - Binomial
- 1. H0: PC = PT
- 2. Test statistic (normal approximation)
- 3. Sample size
- Assume NT = NC = N
- HA: Δ = PC - PT
22. Sample Size Formula (1): Two Proportions
- Simple case
- Zα = constant associated with α
- P(|Z| > Zα) = α (two-sided!)
- (e.g. α = .05, Zα = 1.96)
- Zβ = constant associated with 1 - β
- P(Z < Zβ) = 1 - β
- (e.g. 1 - β = .90, Zβ = 1.282)
- Solve for Zβ (→ 1 - β) or Δ
23. Sample Size Formula (2): Two Proportions
- Zα = constant associated with α
- P(|Z| > Zα) = α (two-sided!)
- (e.g. α = .05, Zα = 1.96)
- Zβ = constant associated with 1 - β
- P(Z < Zβ) = 1 - β
- (e.g. 1 - β = .90, Zβ = 1.282)
24. Sample Size Formula
- Power: solve for Zβ → 1 - β
- Difference detected: solve for Δ
25. Simple Example (1)
- H0: PC = PT
- HA: PC = .40, PT = .30
- Δ = .40 - .30 = .10
- Assume:
- α = .05, Zα = 1.96 (two-sided)
- 1 - β = .90, Zβ = 1.282
- p̄ = (.40 + .30)/2 = .35
26. Simple Example (2)
- Thus:
- a. N = 476, 2N = 952
- b. 2N = 956, N = 478
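As a numerical check, both versions of the two-proportion formula are easy to compute. A minimal sketch (the function names are mine; the z-values are the rounded constants used in these slides) reproduces the two answers above:

```python
import math

def n_per_group_unpooled(p_c, p_t, z_alpha=1.96, z_beta=1.282):
    """Formula (a): pooled variance under H0, unpooled variance under HA."""
    p_bar = (p_c + p_t) / 2
    num = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar)) +
           z_beta * math.sqrt(p_c * (1 - p_c) + p_t * (1 - p_t))) ** 2
    return num / (p_c - p_t) ** 2

def n_per_group_pooled(p_c, p_t, z_alpha=1.96, z_beta=1.282):
    """Formula (b): simpler version using the pooled variance throughout."""
    p_bar = (p_c + p_t) / 2
    return (z_alpha + z_beta) ** 2 * 2 * p_bar * (1 - p_bar) / (p_c - p_t) ** 2

print(round(n_per_group_unpooled(0.40, 0.30)))  # 476 per group (2N = 952)
print(round(n_per_group_pooled(0.40, 0.30)))    # 478 per group (2N = 956)
```

The two formulas differ only in which variance estimate is used, which is why the answers (476 vs. 478) are so close.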
27. Approximate Total Sample Size for Comparing Various Proportions in Two Groups with Significance Level (α) of 0.05 and Power (1 - β) of 0.80 and 0.90
28. (No Transcript)
29. Comparison of Means
- Some outcome variables are continuous
- Blood pressure
- Serum chemistry
- Pulmonary function
- Hypothesis tested by comparison of mean values between groups, or comparison of mean changes
30. Comparison of Two Means
- H0: μC = μT, i.e. μC - μT = 0
- HA: μC - μT = Δ
- Test statistic based on sample means ~ N(Δ, σ)
- Let N = NC = NT for design
- ~ N(0,1) under H0
31. Comparison of Means
32. Example
- e.g. IQ: σ = 15, Δ = 0.3 × 15 = 4.5
- Set 2α = .05
- β = 0.10, 1 - β = 0.90
- HA: Δ = 0.3σ, i.e. Δ/σ = 0.3
- Sample size: N = 234, so 2N = 468
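The IQ example can be sketched with the usual per-group formula for two means, N = 2(Zα + Zβ)² / (Δ/σ)² (an assumption on my part, since the slide's formula is not transcribed; the function name is mine):

```python
def n_per_group_means(delta_over_sigma, z_alpha=1.96, z_beta=1.282):
    """Sample size per group for comparing two means:
    N = 2 (Z_alpha + Z_beta)^2 / (delta/sigma)^2."""
    return 2 * (z_alpha + z_beta) ** 2 / delta_over_sigma ** 2

# IQ example: delta = 0.3 sigma (i.e. 4.5 points when sigma = 15)
n = n_per_group_means(0.3)
print(round(n))   # 234 per group, so 2N = 468
```

Note that only the standardized difference Δ/σ matters, not Δ and σ separately.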
33. (No Transcript)
34. Comparing Time-to-Event Distributions
- Primary efficacy endpoint is the time to an event
- Compare the survival distributions for the two groups
- Measure of treatment effect is the ratio of the hazard rates in the two groups (equivalently, the ratio of the medians)
- Must also consider the length of follow-up
35. Assuming Exponential Survival Distributions
- Then define the effect size by the standardized difference
36. Time to Failure (1)
- Use a parametric model for sample size
- Common model: exponential
- S(t) = e^(-λt), λ = hazard rate
- H0: λI = λC
- Estimate N
- George & Desu (1974)
- Assumes all patients are followed to an event (no censoring)
- Assumes all patients are entered immediately
37. Assuming Exponential Survival Distributions
- Simple case
- The statistical test is powered by the total number of events observed at the time of the analysis, d
38. Converting Number of Events (d) to Required Sample Size (2N)
- d = 2N × P(event), so 2N = d / P(event)
- P(event) is a function of the length of total follow-up at the time of analysis and the average hazard rate
- Let AR = accrual rate (patients per year)
- A = period of uniform accrual (2N = AR × A)
- F = period of follow-up after accrual is complete
- A/2 + F = average total follow-up at the planned analysis
- λ̄ = average hazard rate
- Then P(event) = 1 - P(no event)
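The conversion can be sketched under an exponential model. The required-events formula below is the standard log-rank approximation, d = 4(Zα + Zβ)² / (ln HR)² for 1:1 allocation (my assumption, since the slide's own formula is not transcribed), and P(event) uses the slide's average total follow-up A/2 + F; all numbers are illustrative:

```python
import math

def required_events(hazard_ratio, z_alpha=1.96, z_beta=1.282):
    """Total events d for a two-arm comparison, standard approximation:
    d = 4 (Z_alpha + Z_beta)^2 / (ln HR)^2, assuming 1:1 allocation."""
    return 4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2

def p_event(lam_bar, accrual_years, extra_follow_up):
    """P(event) under an exponential model with average hazard lam_bar,
    using the average total follow-up A/2 + F from the slide."""
    return 1 - math.exp(-lam_bar * (accrual_years / 2 + extra_follow_up))

d = required_events(0.3 / 0.2)  # hazard ratio 1.5
pe = p_event(lam_bar=0.25, accrual_years=2, extra_follow_up=3)
total_n = d / pe                # 2N = d / P(event)
```

The smaller P(event) is (short follow-up, low hazard), the larger the sample size needed to observe the same number of events.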
39. Time to Failure (2)
- In many clinical trials:
- 1. Not all patients are followed to an event (i.e. censoring)
- 2. Patients are recruited over some period of time (i.e. staggered entry)
- More general model (Lachin, 1981), where g(λ) is defined as follows:
40.
- 1. Instant recruitment, study censored at time T
- 2. Continuous recruiting over (0, T), censored at T
- 3. Recruitment over (0, T0), study censored at T (T > T0)
41.
- Example
- Assume α = .05 (two-sided), 1 - β = .90
- λC = .3 and λI = .2
- T = 5 years of follow-up
- T0 = 3
- 0. No censoring, instant recruiting: N = 128
- 1. Censoring at T, instant recruiting: N = 188
- 2. Censoring at T, continual recruitment: N = 310
- 3. Censoring at T, recruitment to T0: N = 233
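The four answers can be reproduced under the exponential model. The g(λ) expressions below are my reconstruction of the Lachin (1981) forms (the slide's formulas did not survive transcription); they match all four of the example's results:

```python
import math

Z = 1.96 + 1.282          # Z_alpha + Z_beta from the example
LC, LI = 0.3, 0.2         # hazard rates: control vs. intervention
T, T0 = 5, 3              # total follow-up and recruitment period

def g_censored_at_T(lam, T):
    """Case 1: instant recruitment, censoring at T."""
    return lam ** 2 / (1 - math.exp(-lam * T))

def g_continuous_recruitment(lam, T):
    """Case 2: uniform recruitment over (0, T), censoring at T."""
    return lam ** 3 * T / (lam * T - 1 + math.exp(-lam * T))

def g_recruit_to_T0(lam, T, T0):
    """Case 3: uniform recruitment over (0, T0), censoring at T > T0."""
    return lam ** 3 * T0 / (lam * T0 - math.exp(-lam * (T - T0)) + math.exp(-lam * T))

def n_per_group(g):
    """Lachin-style sample size: N = Z^2 [g(LC) + g(LI)] / (LC - LI)^2."""
    return Z ** 2 * (g(LC) + g(LI)) / (LC - LI) ** 2

# Case 0 (George & Desu, no censoring): N = 2 Z^2 / [ln(LC/LI)]^2
n0 = 2 * Z ** 2 / math.log(LC / LI) ** 2                    # ~128
n1 = n_per_group(lambda lam: g_censored_at_T(lam, T))       # ~188
n2 = n_per_group(lambda lam: g_continuous_recruitment(lam, T))   # ~310
n3 = n_per_group(lambda lam: g_recruit_to_T0(lam, T, T0))   # ~233
```

Note how censoring and staggered entry inflate N (128 → 188 → 310), while extending follow-up past the recruitment window (case 3) recovers some of that loss (233).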
42. Sample Size Adjustment for Non-Compliance (1)
- References:
- 1. Schork & Remington (1967) Journal of Chronic Diseases
- 2. Halperin et al (1968) Journal of Chronic Diseases
- 3. Wu, Fisher & DeMets (1988) Controlled Clinical Trials
- Problem: some patients may not adhere to the treatment protocol
- Impact: dilutes whatever true treatment effect exists
43. Sample Size Adjustment for Non-Compliance (2)
- Fundamental principle: analyze all subjects randomized
- Called the Intent-to-Treat (ITT) principle
- Noncompliance will dilute the treatment effect
- A solution: adjust the sample size to compensate for the dilution effect (reduced power)
- Definitions of noncompliance:
- Dropout: a patient in the treatment group stops taking therapy
- Dropin: a patient in the control group starts taking the experimental therapy
44.
- Comparing two proportions
- Assumes event rates will be altered by non-compliance
- Define:
- PT* = adjusted treatment group rate
- PC* = adjusted control group rate
- If PT < PC, the adjusted rates lie between them on the (0, 1) scale: PT ≤ PT* < PC* ≤ PC
45. Adjusted Sample Size
- Simple model:
- Compute the unadjusted N
- Assume no dropins
- Assume dropout proportion R
- Thus PC* = PC
- PT* = (1 - R) PT + R PC
- Then adjust N: N* = N / (1 - R)²
- Example:
- R = .1: 1/(1 - R)² = 1.23, a 23% increase
- R = .25: 1/(1 - R)² = 1.78, a 78% increase
46. Sample Size Adjustment for Non-Compliance
- Dropouts and dropins (R0, RI)
- Adjustment factor: 1/(1 - R0 - RI)²
- Example:
- R0 = RI = .1: factor = 1.56, a 56% increase
- R0 = RI = .25: factor = 4.0, i.e. 4 times the sample size
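Both adjustment tables above follow from the single expression 1/(1 - R0 - RI)², with the dropin rate set to zero in the simple model; a minimal sketch (the function name is mine):

```python
def inflation_factor(dropout, dropin=0.0):
    """Sample size inflation for an ITT analysis with noncompliance:
    multiply the unadjusted N by 1 / (1 - R0 - RI)^2."""
    return 1 / (1 - dropout - dropin) ** 2

print(round(inflation_factor(0.10), 2))        # 1.23 -> 23% increase
print(round(inflation_factor(0.25), 2))        # 1.78 -> 78% increase
print(round(inflation_factor(0.10, 0.10), 2))  # 1.56 -> 56% increase
print(inflation_factor(0.25, 0.25))            # 4.0  -> 4 times
```

Because the factor is quadratic in the total noncompliance rate, dropouts and dropins together are far more costly than either alone.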
47. Sample Size Adjustments
- More complex model
- Ref: Wu, Fisher & DeMets (1980)
- Further assumptions:
- Length of follow-up divided into intervals
- Hazard rate may vary
- Dropout rate may vary
- Dropin rate may vary
- Lag in time for treatment to become fully effective
48. Example: Beta-Blocker Heart Attack Trial (BHAT) (1)
- Used the complex model
- Assumptions:
- 1. α = .05 (two-sided), 1 - β = .90
- 2. 3-year follow-up
- 3. PC = .18 (control rate)
- 4. PT = .13, treatment assumed to give a 28% reduction
- 5. Dropout: 26% (12%, 8%, 6%)
- 6. Dropin: 21% (7%, 7%, 7%)
49. Example: Beta-Blocker Heart Attack Trial (BHAT) (2)
- Unadjusted: PC = .18, PT = .13 (a 28% reduction); N = 1100, 2N = 2200
- Adjusted: PC = .175, PT = .14 (a 20% reduction); N = 2000, 2N = 4000
50. Multiple Response Variables
- Many trials measure several outcomes (e.g. MILIS, NOTT)
- Must force the investigator to rank them by importance
- Do sample size calculations on a few outcomes (2-3)
- If the estimates agree, OK
- If not, must seek a compromise
51. Equivalency or Non-Inferiority Trials
- Compare a new therapy with a standard
- Wish to show the new therapy is "as good as" the standard
- Rationale may be cost, toxicity, profit
- Examples:
- Intermittent Positive Pressure Breathing (IPPB) Trial: expensive IPPB vs. cheaper treatment
- Nocturnal Oxygen Therapy Trial (NOTT): 12 hours of oxygen vs. 24 hours
- Problem: can't show H0: Δ = 0
- A solution: specify a minimum difference Δmin
52. Sample Size Formula: Two Proportions
- Simple case
- Zα = constant associated with α
- Zβ = constant associated with 1 - β
- Solve for Zβ (→ 1 - β) or Δ
53. Difference in Events: Test Drug vs. Standard Drug
54. Mid-Stream Adjustments
- Murphy's Law applies to sample size
- May find event rate assumptions are way off from early results, and the power of the study is very inadequate
- Problem:
- Quit?
- Continue toward almost certain doom?
- Adjust sample size?
- Extend follow-up?
- Early decision: best to decide early, and not to look at treatment comparisons
55. Adaptive Designs
- One class allows re-estimating the sample size once the trial is underway
- Chung et al
- Chen, Lan & DeMets
- Methods have been criticized for allowing bias (e.g. Mehta & Tsiatis)
- Thus, these methods are still not widely used
- The A-HeFT Trial is one example
- Will be discussed later in the data monitoring lecture
56. Event Rate Assumptions
- Challenging to get event rate assumptions correct
- Inclusion/exclusion criteria effect
- Healthy volunteer effect
- Changing background therapy/standard of care
- Even if trials are conducted back to back
57. PRAISE I vs. PRAISE II: Placebo Arms
58. Event Driven Trials
- For time-to-event trials, most of the information is in the events
- Power is a function of the number of events
- The real target is the number of observed events (d), not the total sample size (2N)
- Thus, target the number of events
59. Event Driven Trials
- Can adjust or adapt the trial to hit the target number of events if the assumed event rate was too high
- The steering committee can:
- Increase the sample size
- Increase follow-up
- A combination of both
60. Examples of Event Driven Trials
- PROMISE (based on control arm)
- PRAISE I & II
- COPERNICUS
- CARS (based on control arm)
61. Response Adaptive Designs
- The observed treatment effect may differ from (i.e. be less than) assumptions
- Treatment actually less effective
- Compliance worse than assumed
- Background therapy changed
- A smaller observed effect may still be of clinical interest, if real
62. Response Adaptive Designs
- The probability of rejecting H0 is then also small
- Power
- Conditional power
- The question is whether to:
- quit and start over, or
- make a design modification and continue
63. Response Adaptive Designs
- Stopping and starting over is problematic
- Waste of financial resources
- Ethical issues of wasting the contributions of patients who have already participated
- Probably can't afford a policy of designing all trials for the minimum treatment effect of interest
64. Response Adaptive Designs
- Adjust/increase the sample size if the assumed treatment effect was too large
- Traditionally, this approach was discouraged
- Recent methodology suggests possible approaches
65. Response Adaptive Designs
- These methods are relatively new and still controversial
- Many leading biostatisticians are very critical (e.g., Fleming, Emerson, Turnbull, Tsiatis)
- The issues are often more than statistical control of Type I error
- Introducing other sources of bias
66. Response Adaptive Designs
- Increasing the sample size based on the observed treatment effect may inflate the false positive rate
- By 30% to 40% (Cui et al)
- Can double it (Proschan et al)
- Inflation of Type I error of that magnitude is not acceptable
67. Response Adaptive Designs
- Statistical adjustments to control alpha:
- Weighted z-statistic
- Adjustment to the critical value
- Enforcing rules for sample size recalculation
68. Weighted Z Statistic
- References:
- Cui, Hung & Wang (1999, Biometrics)
- Fisher (1998, Stat Med)
- Shen & Fisher (1999, Biometrics)
- Tsiatis & Mehta (2003, Biometrika)
69. Weighted Z
- Xi ~ N(0,1) distribution
- n = current sample size
- N0 = initial total sample size
- δa = hypothesized treatment effect
- t = n/N0
70. Weighted Z
- N = proposed sample size based on the interim data
- Reject H0 if the weighted statistic exceeds the critical value
- Note: less weight is assigned to the new/additional observations
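The weighting idea can be sketched as follows (the slide's formulas are not transcribed, so this is my sketch of the Cui-Hung-Wang statistic; the function name and the illustrative numbers are mine). The stage weights √t and √(1 - t) are fixed by the original design, so observations added beyond N0 receive less weight and cannot inflate the Type I error:

```python
import math

def weighted_z(z_first, z_second, t):
    """Cui-Hung-Wang weighted statistic: combine the first-stage and
    second-stage z-statistics with the pre-planned weights sqrt(t) and
    sqrt(1 - t), where t = n / N0 is the originally planned information
    fraction at the interim look, even if the second-stage sample size
    was later increased."""
    return math.sqrt(t) * z_first + math.sqrt(1 - t) * z_second

# Interim look at half the originally planned sample size (t = 0.5):
zw = weighted_z(1.5, 1.5, t=0.5)
reject = zw > 1.96   # compare to the usual critical value
```

Because the weights do not depend on the re-estimated sample size, the final statistic keeps its N(0,1) null distribution; the price is that each additional observation contributes less than it would in an unweighted analysis.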
71. Weighted Z
- Possible to modify the design and increase the sample size based on an interim analysis while controlling Type I error
- Flexibility has a price
72. Tsiatis & Mehta Criticism
- Argue that a properly designed group sequential trial is more efficient than these adaptive designs
- The challenge is to design it properly
- (However, that can be a bigger challenge than often realized)
73. Weighted/Unweighted Modification
- Both:
- Type I error < α
- No real loss of power
- Ref: Chen, DeMets & Lan
74. P-Value Method
- Reference: Proschan & Hunsberger (1995, Biometrics)
- Requires a promising p-value before allowing an increase in sample size
- Requires stopping if the first-stage p-value is not promising
- Requires a larger critical value at the second stage to control the Type I error
75. P-Value Method
- One-sided alpha = 0.05
- P(1): .10, .15, .20, .25, .50
- Z(2): 1.77, 1.82, 1.85, 1.875, 1.95
- These second-stage critical values apply regardless of n2
76. Proschan & Hunsberger Method
- The simple method may make the Type I error substantially less than 0.05
- Developed another method to obtain the exact Type I error as a function of Z1 and n2, using a conditional-power-type calculation (details to be discussed later)
77. Proschan & Hunsberger
- Conditional power and the p-value required in stage 2 as a function of R = n2/n1, for the NHLBI Type II study example
78. Proschan & Hunsberger
- Allows for sample size adjustment based on the observed treatment effect
- Requires increasing the final critical value
79. Adaptive Design Remarks
- A need exists for adaptive designs (even FDA statisticians agree)
- Technical advances have been made through several new methods
- Adaptive designs are still not widely accepted and are subject to (strong) criticism
- May be useful for non-pivotal trials
- Practice precedes theory; perhaps theory will catch up in time
80. Sample Size Summary
- Ethically, the size of the study must be large
enough to achieve the stated goals with
reasonable probability (power) - Sample size estimates are only approximate due to
uncertainty in assumptions - Need to be conservative but realistic
81. Demo of Sample Size Program (www.biostat.wisc.edu)
- Program covers comparison of proportions, means, and time to failure
- Can vary control group rates or responses, alpha, power, and hypothesized differences
- Program produces a sample size table and a power curve for a particular sample size
82. Sample Size Program Output
83. Union Terrace/Lakefront