Basic Biostatistics for the Clinical Trialist - PowerPoint PPT Presentation

1
Basic Biostatistics for the Clinical Trialist
  • Susan Hilsenbeck, Ph.D.
  • Breast Center and Dan L. Duncan Cancer Center at
    Baylor College of Medicine
  • Houston, TX USA

2
Overview of Material
  • Types of data and summary statistics
  • Confidence Intervals
  • Tests of hypothesis
  • Sample size calculations

3
Sample vs Target Population
[Figure labels: Protocol, Do It]
4
Sample vs Target Population
5
Design of Clinical Trials: Striking a Balance
  • Answer the question (correctly)
  • Control risk of errors in conclusions
  • Minimize potential harm and maximize potential
    benefit
  • Limit number of participants treated at
    sub-therapeutic doses
  • Limit number of participants treated with ineffective
    therapy or exposed to toxicity
  • Maximize feasibility
  • Simple enough to carry out

6
Types of Data Typical of Early Phase Trials
[Table: example variables (sex, ethnicity, gender, tumor location, performance, stage, grade of toxicity) and their summary statistics (frequency count, proportion, mean, median, etc.)]
7
Summary Statistics: Location
8
Summary Statistics: Spread
9
Graphs as Summary Statistics
10
Summary Statistics and Confidence Intervals
  • Response rate is point estimate of the effect of
    drug
  • Confidence interval gives a range of population
    response rates that are consistent with the
    sample data

11
Thought Experiment: Catching the real response
rate
  • Suppose the real response rate for a new therapy
    is 0.3 (30%)
  • Suppose we run a small safety and efficacy
    clinical trial, and calculate the response rate
    and a 95% confidence interval for the response
    rate over and over and over
  • How often will the interval capture the real
    value?

12
True Rate = 0.3, N = 30, Confidence = 95%
95% of CIs contain the true rate
13
True Rate = 0.3, N = 30, Confidence = 99.9%
What happens if we want to be more confident?
14
True Rate = 0.3, N = 120, Confidence = 95%
What happens if we want to be 95% confident, but
we increase the sample size?
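The thought experiment on the last few slides can be checked without running any trials. The sketch below is a minimal illustration, assuming the simple normal-approximation (Wald) interval; the slides do not say which interval was used, and the function names are hypothetical. It computes the exact chance that the interval contains the true rate, and the average interval width, for each scenario.

```python
from math import comb, sqrt
from statistics import NormalDist


def wald_interval(x, n, confidence):
    """Normal-approximation (Wald) interval for x responders out of n.

    Assumption: the slides do not name the interval method.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = x / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def binom_pmf(x, n, p):
    """Probability of exactly x responders out of n at true rate p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def coverage(true_rate, n, confidence):
    """Exact probability that the interval contains the true rate."""
    total = 0.0
    for x in range(n + 1):
        low, high = wald_interval(x, n, confidence)
        if low <= true_rate <= high:
            total += binom_pmf(x, n, true_rate)
    return total


def expected_width(true_rate, n, confidence):
    """Average interval width over all possible trial outcomes."""
    total = 0.0
    for x in range(n + 1):
        low, high = wald_interval(x, n, confidence)
        total += (high - low) * binom_pmf(x, n, true_rate)
    return total


# The three scenarios from the slides: (N, confidence level).
for n, conf in [(30, 0.95), (30, 0.999), (120, 0.95)]:
    print(n, conf, round(coverage(0.3, n, conf), 3),
          round(expected_width(0.3, n, conf), 3))
```

Raising the confidence level widens every interval, so coverage can only go up; raising N keeps the target coverage but narrows the interval.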
15
Making Decisions: Test of Hypothesis
α = probability of Type I error (level of significance)
β = probability of Type II error
1 - β = Power
16
Hypothesis Testing and Jury Trials
17
Hypothesis Testing and Drug Trials
18
Type I and Type II Errors
  • Common choices
  • α = 5%
  • β = 20%
  • Exploratory study?
  • α = 10%
  • β = 10%
  • Confirmatory study?
  • α = 1%
  • β = 10%

19
Study Paradigm
Hypothesis
20
Example of a test of hypothesis
Compare the rate of new breast cancers in
Tamoxifen treated and placebo treated subjects
over 5 years?

Frequency (Expected)   BRCA        Dis Free      Total
------------------------------------------------------
TAM                    16 (25)     984 (975)     1000
Placebo                34 (25)     966 (975)     1000
------------------------------------------------------
Total                  50          1950          2000

Test Statistic   DF   Value   P-value
Chi-Square       1    6.65    ?

Hypothetical data representative of Fisher et al,
1998, JNCI 90:1371-1388
21
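Chi Square Distribution
[Figure: chi-square distribution with 1 DF; 3.84 marks P = 0.05 and 6.69 marks P = 0.01. Small values: observed data very close to expected. Large values: observed data very different from expected.]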
22
Example of a test of hypothesis
Compare the rate of new breast cancers in
Tamoxifen treated and placebo treated subjects
over 5 years?

Frequency (Expected)   BRCA        Dis Free      Total
------------------------------------------------------
TAM                    16 (25)     984 (975)     1000
Placebo                34 (25)     966 (975)     1000
------------------------------------------------------
Total                  50          1950          2000

Test Statistic   DF   Value   P-value
Chi-Square       1    6.65    0.01

Hypothetical data representative of Fisher et al,
1998, JNCI 90:1371-1388
23
What if we double the sample size?
Compare the rate of new breast cancers in
Tamoxifen treated and placebo treated subjects
over 5 years?
Frequency (Expected)   BRCA        Dis Free       Total
-------------------------------------------------------
TAM                    32 (50)     1968 (1950)    2000
Placebo                68 (50)     1932 (1950)    2000
-------------------------------------------------------
Total                  100         3900           4000

Test Statistic   DF   Value   P-value
Chi-Square       1    13.29   0.0003

Hypothetical data representative of Fisher et al,
1998, JNCI 90:1371-1388
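The chi-square values for the two tamoxifen tables can be reproduced by hand. The sketch below is a minimal illustration with hypothetical function names, not the presenter's code. It builds the expected counts from the row and column totals, sums (observed - expected)² / expected over the four cells, and converts the statistic to a P-value using the 1-DF identity P = erfc(sqrt(χ²/2)).

```python
from math import erfc, sqrt


def chi_square_2x2(rows):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    rows: two (events, non-events) pairs, one per treatment arm.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    # With 1 DF the chi-square tail area equals a two-sided normal tail.
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


original = [(16, 984), (34, 966)]    # TAM, Placebo (N = 2000)
doubled = [(32, 1968), (68, 1932)]   # same rates, N = 4000

for label, table in [("original", original), ("doubled", doubled)]:
    stat, p = chi_square_2x2(table)
    print(label, round(stat, 2), round(p, 4))
```

Doubling every cell leaves the rates unchanged but doubles the statistic, which is why the P-value shrinks even though the effect is the same size.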
24
Chi Square Distribution
[Figure: chi-square distribution with 1 DF; 3.84 marks P = 0.05, 6.69 marks P = 0.01, and 13.29 marks P = 0.0003]
25
P-Value
  • Descriptive statement: how consistent or
    inconsistent are the observed data with what we
    would have expected to see by chance (H0 true)?
  • P = 0.01 means: IF H0 is true, 1 time in 100 we
    would get something like this OR something even
    more inconsistent with H0

26
Effect Size and Confidence Interval
Frequency (Expected) [Row Pct]   BRCA               Dis Free              Total
-------------------------------------------------------------------------------
TAM                              16 (25) [1.60%]    984 (975) [98.40%]    1000
Placebo                          34 (25) [3.40%]    966 (975) [96.60%]    1000
-------------------------------------------------------------------------------
Total                            50                 1950                  2000

RR = 1.6/3.4 = 0.47, 95% CI 0.26 to 0.85
27
What if we double the sample?
Frequency (Expected) [Row Pct]   BRCA               Dis Free                Total
---------------------------------------------------------------------------------
TAM                              32 (50) [1.60%]    1968 (1950) [98.40%]    2000
Placebo                          68 (50) [3.40%]    1932 (1950) [96.60%]    2000
---------------------------------------------------------------------------------
Total                            100                3900                    4000

RR = 0.47, 95% CI 0.31 to 0.71
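The relative risk and its interval for both sample sizes can be sketched as follows. This assumes the usual log-scale (Katz) standard error; the slides do not state the method, although this approach gives the same rounded limits. The function name is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def relative_risk_ci(events_trt, n_trt, events_ctl, n_ctl, confidence=0.95):
    """Relative risk with a log-scale (Katz) confidence interval.

    Assumption: the interval method is not given in the slides.
    """
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


# Tamoxifen vs placebo, original and doubled sample sizes.
print([round(v, 2) for v in relative_risk_ci(16, 1000, 34, 1000)])
print([round(v, 2) for v in relative_risk_ci(32, 2000, 68, 2000)])
```

The point estimate is identical in both cases; only the interval narrows as the sample grows.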
28
P-values and Confidence Intervals
  • Before start of trial
  • specify α and β errors
  • After analysis of trial
  • summarize results of testing with p-value
  • BUT: small p ≠ big effect
  • Summarize size of effect with estimate and
    confidence interval
  • Report estimates, confidence intervals and
    p-values

29
When you observe a small P-value
  • It means the null hypothesis is unlikely to be
    true? NO
  • It means that the treatment effect is big and
    clinically important? Not necessarily
  • It means your results are unusual if there is
    actually NO EFFECT? Yes

Pr(H0 | data) ≠ Pr(data | H0)
30
Planning a Study: Sample Size and Power Analysis
  • Sample size calculations estimate the number of
    patients needed to accomplish study goals
  • Power analysis estimates the power to detect
    specified differences, given a particular sample
    size

31
Break
32
Ingredients
  • Test: t-test, chi-square test?
  • N: sample size (per group?)
  • K: imbalance in size of groups
  • Effect size (clinically important difference
    and expected variability)
  • α: alpha error rate
  • β: beta error rate
  • Other: censoring, correlations among variables,

33
The TTEST Procedure

Summary Statistics
Group    N     Mean      Std Dev
A        15    98.86     10.66
B        15    110.02    9.57

Equality of Variances
Variable       Method      Num DF   Den DF   F Value   Pr > F
assay_value    Folded F    14       14       1.24      0.6933

T-Tests
Variable       Method    Variances   DF   t Value   Pr > |t|
assay_value    Pooled    Equal       28   -3.02     0.0054

The FREQ Procedure

Table of Group by Category
Frequency   High   Low   Total
------------------------------
A           4      11    15
B           9      6     15
------------------------------
Total       13     17    30

Statistic                     DF   Value    Prob
Continuity Adj. Chi-Square    1    2.1719   0.1405

Fisher's Exact Test
Two-sided Pr <= P    0.1394
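The same 30 observations give a small P-value when analyzed as a continuous measurement and a non-significant one when dichotomized into High/Low. As a rough check on the SAS output above, the sketch below recomputes the pooled t statistic from the printed summary statistics and the continuity-adjusted chi-square from the 2x2 counts. The function names are hypothetical, and the t-test P-value is not recomputed because the Python standard library has no t distribution.

```python
from math import erfc, sqrt


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Equal-variance (pooled) two-sample t statistic and its DF."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


def yates_chi_square(a, b, c, d):
    """Continuity-adjusted chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    adjusted = max(abs(a * d - b * c) - n / 2, 0)
    stat = n * adjusted**2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # 1-DF chi-square tail area via the complementary error function.
    return stat, erfc(sqrt(stat / 2))


t_value, df = pooled_t(98.86, 10.66, 15, 110.02, 9.57, 15)
chi2_adj, p_adj = yates_chi_square(4, 11, 9, 6)
print(round(t_value, 2), df)
print(round(chi2_adj, 4), round(p_adj, 4))
```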
34
Relationships between Ingredients
[Figure: relationships among the ingredients]
35
Example 1
  • Suppose you want to compare the average test
    scores among research fellows following two
    different training programs, a web-based
    self-paced course, and an intensive course at a
    plush resort. Based on previous experience, you
    expect the web course students to score about 75
    (standard deviation = 10) and you hope that the
    one-on-one teaching at the resort will result in
    a 6 point improvement. This comparison will
    provide objective evidence to justify funding
    future courses.
  • You plan to compare the test scores at the α = 5%
    level of significance, and you want the study to
    have 90% power to detect this difference.
  • How many students do you need to study?
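One way to approximate the answer to Example 1 is the standard normal-approximation formula for comparing two means, n per group = 2 × ((z_α/2 + z_β) × S / E)². The sketch below assumes a two-sided test and equal group sizes; the function name is hypothetical, and the slides themselves read the answer from a published table.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(effect, sd, alpha=0.05, power=0.90):
    """Approximate per-group sample size for a two-sample comparison of means.

    Assumptions: two-sided alpha, equal group sizes, normal approximation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(power)            # quantile matching the target power
    return ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)


# Example 1: detect a 6-point difference when the standard deviation is 10.
print(n_per_group_two_means(effect=6, sd=10))
```

Exact t-based calculations and published tables tend to give a slightly larger number, because the normal approximation ignores the extra uncertainty from estimating the standard deviation.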

36
Brief Classification of Tests
From Hulley and Cummings, 1988
37
E = 6, S = 10, α = 0.05, Power = 90%
From Hulley and Cummings, 1988
38
Phase I
  • Design
  • Small sample size (10-40)
  • Escalating/de-escalating dose
  • Usually route and schedule fixed
  • Nonrandomized
  • Questions
  • Safe dose for further study (MTD, MED, OBD)?
  • Toxicity profile? (Hematopoietic, GI, CNS)
  • Hints of efficacy?
  • Pharmacologic profile? (AUC, half-life, etc)
  • Endpoints - toxicity, change in biomarker,
    response

39
Phase I Designs
  • 3+3
  • Modified Continual Reassessment Method
  • Accelerated titration
  • Other
  • Pharmacologically guided
  • Storer Up and Down
  • Escalation with overdose control (EWOC)
  • Various Bayesian

40
Dose Response
[Figure: probability of DLT (0.0 to 1.0) versus dose (0 to 10 mg/m2), with the MTD marked at P(DLT) = 0.3. Point labels: 100, 67, 50, 33, 33.]
In theory, this idea could be used to home in
on the optimal dose for any outcome, but
dose/toxicity is assumed to be monotonically
increasing; dose/target modulation may not be
monotonic
41
Dose Response
[Figure: probability of DLT (0.0 to 1.0) versus dose (0 to 10 mg/m2), with P(DLT) = 0.3 marked. Point labels: 100, 67, 50, 33, 33.]
42
Hypothetical 3+3
Define MTD: highest dose with Pr(DLT) < 30%

True Pr(DLT)   Level   Cohort (patients/DLTs)
<1%            1       3/0
<1%            2       3/0
4%             3       3/1, 3/0
13%            4       3/0, 3/0    MTD: Expand?
54%            5       3/2
90%            6

3+3 picks a dose, but does not give any
precision for the estimate of MTD
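To see how a 3+3 design behaves across the hypothetical dose levels, the sketch below computes the probability of escalating past each level. It assumes the common rule set: escalate after 0 DLTs in 3 patients, or after 1 DLT in the first 3 followed by 0 in the next 3, and halt otherwise. It deliberately ignores the final expansion of the chosen dose, and the value 0.01 is a stand-in assumption for the "<1%" entries.

```python
def p_escalate(p_dlt):
    """Probability that a 3+3 cohort escalates past a dose level.

    Assumed rule: escalate on 0/3 DLTs, or on 1/3 followed by 0/3.
    """
    none_of_three = (1 - p_dlt) ** 3
    one_of_three = 3 * p_dlt * (1 - p_dlt) ** 2
    return none_of_three + one_of_three * none_of_three


def stop_distribution(true_rates):
    """Probability that escalation first halts at each level.

    The final entry is the probability of escalating through every level.
    """
    reach = 1.0          # probability of reaching the current level
    halts = []
    for p_dlt in true_rates:
        escalate = p_escalate(p_dlt)
        halts.append(reach * (1 - escalate))
        reach *= escalate
    halts.append(reach)
    return halts


# True Pr(DLT) for levels 1 to 6; 0.01 stands in for "<1%" (assumption).
true_rates = [0.01, 0.01, 0.04, 0.13, 0.54, 0.90]
for level, prob in enumerate(stop_distribution(true_rates)[:-1], start=1):
    print(level, round(prob, 3))
```

Under this rule set, escalation that halts at level k points to level k - 1 as the MTD, so the spread of the halting probabilities gives a feel for how variable the selected dose can be.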
43
Continual Reassessment Method and Modified CRM
  • Designed to treat more patients near therapeutic
    doses
  • Original CRM
  • Begin with a prior guess as to dose-response and
    MTD
  • Treat a patient near the MTD, observe DLT or not
  • Update and choose new dose near MTD
  • Treat next patient and repeat
  • Modified to improve safety, but increases N
  • Start at the traditionally determined 1st level
  • Treat several patients in a cohort (2 or 3?)
  • Don't skip doses

44
Why choose one design over another?
  • Most commonly used design is still 3+3
  • Is there support for complex design?
  • 3+3 easy to implement
  • CRM may be better but requires statistician and
    special software
  • Is drug class and toxicity profile already
    well-known?
  • Prior dose-response curve known, rapid escalation
  • Biologically targeted agents that require
    expanded cohorts to estimate target modulation?

45
Phase II
  • Design
  • Moderate sample size (20-100)
  • Defined treatment and population
  • Nonrandomized (usually)
  • Test of hypothesis
  • Questions
  • Efficacy clinically interesting?
  • Toxicity profile acceptable?
  • Endpoints: response, TTP, RFS, toxicity, change
    in biomarker

46
Wide Variety of Phase II Designs
  • Single stage
  • Two-stage (multi-stage)
  • Simon Minimax
  • Simon Optimal
  • Other admissible
  • Multiple outcomes: efficacy vs toxicity
  • Bryant-Day
  • Bayesian trade-off
  • Randomized Phase II
  • Other (Non-Cytotoxic agents?)
  • Mick-Ratain paired TTP
  • Randomized discontinuation

47
Example 2: Single Arm
  • Suppose we are planning a Phase II study of a new
    treatment, theobromococanib, a small molecule
    inhibitor of CHLTR, the nearly ubiquitously
    expressed chocolate receptor
  • Outcome of interest is 6 month PFS
  • If the rate is low (P0 = 10%), then we want the
    probability of keeping the drug < 5% (Type I)
  • If the rate is high (P1 = 30%), then we want the
    probability of discarding the drug < 20% (Type II)

48
Comparison of Designs: Phase II Trial of
theobromacanib
Single Stage: P0 = 10%, P1 = 30%, α = 5%, β = 20%; N = 25,
R = 5; EN = 25
  P0: unacceptably low response rate of bad drug
  P1: acceptably high response rate of good drug
  α: risk of keeping a bad drug
  β: risk of missing a good drug
  R: conclude drug is bad
  EN: expected sample size if drug is bad
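The error rates of the single-stage design can be checked directly from the binomial distribution. This sketch assumes the decision rule "conclude the drug is bad if R or fewer of N patients respond", which is how the R threshold is read here; the function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def single_stage_errors(n, r, p0, p1):
    """Type I error and power when the drug is kept only if responses > r.

    Assumption: 'R' is the largest response count that rejects the drug.
    """
    alpha = 1 - binom_cdf(r, n, p0)   # keep a drug whose true rate is p0
    power = 1 - binom_cdf(r, n, p1)   # keep a drug whose true rate is p1
    return alpha, power


alpha, power = single_stage_errors(n=25, r=5, p0=0.10, p1=0.30)
print(round(alpha, 3), round(power, 3))
```

Lowering R to 4 would raise the power but push the Type I error above the 5% target, which is why N = 25 with R = 5 is the design that fits both constraints.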
49
P0 = 0.10 vs P1 = 0.30, α = 5%, Power = 80%
Two-stage design
Designs for p1 - p0 = 0.20

                              Reject drug if
p0     p1     α      β      <= r1/n1   <= r/n   EN(p0)   PET(p0)
0.05   0.25   0.10   0.10   0/9        2/24     14.5     0.63
              0.05   0.20   0/9        2/17     12.0     0.63
              0.05   0.10   0/9        3/30     16.8     0.63
0.10   0.30   0.10   0.10   1/12       5/35     19.8     0.65
              0.05   0.20   1/10       5/29     15.0     0.74
              0.05   0.10   2/18       6/36     22.5     0.71

(Example 1 in Appendix)
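The operating characteristics of a two-stage design of this kind can be sketched from first principles. The example uses the design with stage 1 threshold 1/10 and overall threshold 5/29 for P0 = 0.10 vs P1 = 0.30, and assumes the rule "stop and reject after stage 1 if responses <= r1; otherwise continue and reject if total responses <= r". The function names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def two_stage(r1, n1, r, n, p):
    """Return (PET, EN, probability of keeping the drug) at response rate p."""
    pet = binom_cdf(r1, n1, p)            # probability of early termination
    expected_n = n1 + (1 - pet) * (n - n1)
    n2 = n - n1
    keep = 0.0
    for x1 in range(r1 + 1, n1 + 1):      # outcomes that continue to stage 2
        still_needed = r - x1             # stage 2 responses must exceed this
        if still_needed < 0:
            p_pass = 1.0
        else:
            p_pass = 1 - binom_cdf(still_needed, n2, p)
        keep += binom_pmf(x1, n1, p) * p_pass
    return pet, expected_n, keep


pet0, en0, alpha = two_stage(1, 10, 5, 29, p=0.10)
_, _, power = two_stage(1, 10, 5, 29, p=0.30)
print(round(pet0, 2), round(en0, 1), round(alpha, 3), round(power, 3))
```

The early stop is what pulls the expected sample size under P0 down to about 15, even though the maximum sample size (29) is larger than the single-stage N of 25.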
50
Comparison of Designs: Phase II Trial of
theobromacanib
Single Stage: P0 = 10%, P1 = 30%, α = 5%, β = 20%; N = 25,
R = 5; EN = 25
Optimal Two Stage: P0 = 10%, P1 = 30%, α = 5%, β = 20%;
Stage 1: N1 = 10, R1 = 1; Stage 2: N = 29, R = 5; EN = 15
Bryant-Day: P0 = 10%, P1 = 30%, α = 5%, β = 20%, αT = 5%,
P0nt = 60%, P1nt = 80%; Stage 1: N1 = 12, R1 = 0, NT1 = 7;
Stage 2: N = 33, R = 3, NT = 22; EN = 20.9
  P0: unacceptably low response rate of bad drug
  P1: acceptably high response rate of good drug
  α: risk of keeping a bad drug
  β: risk of missing a good drug
  αT: risk of keeping a toxic drug
  P0nt: unacceptably low rate of nontoxicity
  P1nt: acceptably high rate of nontoxicity
  R: conclude drug is bad
  EN: expected sample size if drug is bad
51
P1 = 0.1 vs P2 = 0.3, α = 5% one-tailed, Power = 80%
The Awful Truth about Comparative Trials! Entire
study will be not just 2 times, but nearly 4
times as big as single arm.
From Hulley and Cummings, 1988 (Chi square
without continuity correction)
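The "nearly 4 times" claim can be checked with the usual normal-approximation formula for comparing two proportions without continuity correction. The sketch below assumes equal group sizes and uses a hypothetical function name; the slide's own figure comes from a published table.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80, one_tailed=True):
    """Approximate per-group n for comparing two proportions.

    Assumptions: equal group sizes, normal approximation,
    no continuity correction.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - (alpha if one_tailed else alpha / 2))
    z_beta = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return ceil((numerator / (p1 - p2)) ** 2)


per_group = n_per_group_two_proportions(0.1, 0.3)
total = 2 * per_group
print(per_group, total, round(total / 25, 2))  # ratio to the single-arm N of 25
```

The comparative trial needs roughly twice as many patients per arm as the single-arm design because both response rates are now estimated, and it has two arms, so the total approaches four times the single-arm N.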
52
Randomized Phase II: Popular but sometimes misused
  • NOT a cheap Phase III
  • No power to compare arms
  • Original RP2
  • Pick winner of two or more treatment variations
    (i.e. schedules, drug analogs)
  • Each arm can stop early
  • If there is a difference (pre-specified) the
    better drug will win with high probability
  • If there is NO difference outcome is like coin
    flip
  • Parallel Phase IIs, Adaptive Randomization

53
Why choose one design over another?
  • Simon Two-stage designs probably most common
  • If toxicity is a concern?
  • Efficacy/Toxicity design
  • If outcome can be evaluated in short term?
  • 1 vs 2 stage
  • If there are competing schedules, analogs,
    formulations?
  • Randomized Phase II or Adaptive design
  • Is there support for complex design?

54
Two-sided versus one-sided tests
  • Two-tailed test - Any difference
  • One-tailed test - Specific direction of
    difference
  • Same power from smaller sample size, but
  • Only appropriate when ONLY ONE direction is
    important or biologically meaningful

55
Special Considerations
  • Equivalence
  • Interim testing (multi-stage)
  • Complex designs, no tables

56
What have we done?
  • Summary statistics
  • Confidence intervals
  • Tests of hypotheses
  • Sample size calculations

57
Questions?