Title: Clinical Trials Overview
1Clinical Trials Overview
2Clinical Trials
- A clinical trial is a prospectively planned experiment for the purpose of evaluating one or more potentially beneficial therapies or treatments
- In general, these studies are conducted under as many controlled conditions as possible in order to provide definitive answers to well-defined questions
3Primary vs. Secondary Questions
- Primary
- most important, central question
- ideally, only one
- stated in advance
- basis for design and sample size
- Secondary
- related to primary
- stated in advance
- limited in number
4Examples
- Physicians' Health Study (PHS), started in fall 1982
- risks and benefits of aspirin and beta carotene in the prevention of cardiovascular disease and cancer
- low-dose aspirin vs placebo
- Primary: total mortality
- Secondary: fatal and nonfatal myocardial infarction
- Eastern Cooperative Oncology Group (ECOG)
- tamoxifen vs placebo
- Primary: tumor recurrence/relapse, disease-free survival
- Secondary: total mortality
5Definitions
- Single Blind Study: A clinical trial where the participant does not know the identity of the treatment received
- Double Blind Study: A clinical trial in which neither the patient nor the treating investigators know the identity of the treatment being administered.
6Definitions
- Placebo
- Used as a control treatment
- 1. An inert substance made up to physically resemble a treatment being investigated
- 2. Best standard of care if placebo unethical
- 3. Sham control
7Definitions
- Adverse event
- An incident in which harm resulted to a person receiving health care.
- Examples: death, irreversible damage to liver, nausea
- Not always easy to specify in advance because many variables will be measured
- May be known adverse effects from earlier trials
8Adverse Events
- Challenges
- Long term follow-up versus early benefit
- Rare AEs may be seen only with very large numbers of exposed patients and long term follow-up
- Example: COX II inhibitors
- Vioxx, Celebrex
- Immediate pain reduction vs longer term increase in cardiovascular risk
9Surrogate Endpoints
- Response variables used to address questions are often called endpoints
- Surrogates used as alternative to desired or ideal clinical response to save time and/or resources
- Examples
- Suppression of arrhythmia (sudden death)
- T4 cell counts (AIDS or ARC)
- Often used in therapeutic exploratory trials
- Use with caution in therapeutic confirmatory trials
10The General Flow of Statistical Inference
[Diagram: Patient Population → (Protocol to Obtain Participants) → Sample → Observed Results → Inference about Population]
Sample protocol / design is key to analysis and inference, and may redefine the population for future experiments
11Types of Clinical Trials
- Randomized
- Non-Randomized
- Single-Center
- Multi-Center
- Phase I, II, III Trials
12Phase I Trial
- Objective: To determine an acceptable range of doses and schedules for a new drug
- Usually seeking maximum tolerated dose (MTD)
- Participants often those that have failed other treatments
- Important, however, that they still have normal organ functions
13Phase II Trial
- Objective: To determine if new drug has any beneficial activity and is thus worthy of further testing / investment of resources.
- Doses and schedules may not be optimum
- Begin to focus on population for whom this drug will likely show favorable effect
14Phase III Trial
- Objective: To compare experimental or new therapies with standard therapy or competitive therapies.
- Very large, expensive studies
- Required by FDA for drug approval
- If drug approved, usually followed by Phase IV trials to follow up on long-range adverse events; concern is safety
16Characterization of Trials
Phase   Single Center   Single Center                 Multi Center   Multi Center
        Randomized      Non-Rand.                     Randomized     Non-Rand.
I       Never           Yes                           Never          Sometimes
II      Rare            Yes                           Yes            Sometimes
III     Yes             Use of Historical Controls    Yes            Use of Historical Controls
Carrying out a multi-center randomized clinical
trial is the most difficult way to generate
scientific information.
17Why Clinical Trials?
- 1. Most definitive method to determine whether a treatment is effective.
- Other designs have more potential biases
- One cannot determine in an uncontrolled setting whether an intervention has made a difference in the outcome.
18Observational Studies
- Correlation vs. Causation
- Examples of False Positives
- 1. High cholesterol diet and rectal cancer
- 2. Smoking and breast cancer
- 3. Vasectomy and prostate cancer
- 4. Red meat and colon cancer
- 5. Red meat and breast cancer
- 6. Drinking water frequently and bladder cancer
- 7. Not consuming olive oil and breast cancer
- Replication of observational studies may not
overcome confounding and bias
19Why Clinical Trials?
- 2. Help determine incidence of side effects and complications.
- Example: Coronary Drug Project
- A. Detection of side effect (Cardiac Arrhythmias)
- Clofibrate 33.3%
- Niacin 32.7% (p > .05)
- Placebo 38.2%
- B. Natural occurring side effect (nausea)
- Clofibrate 7.6%
- Placebo 6.2%
20Typical Side Effect Report - Lyrica
21Why Clinical Trials?
- 3. Theory not always best path
- Intermittent positive pressure breathing (IPPB) → reduced use, no benefit
- High O2 in premature infants → retrolental fibroplasia, harmful
- Tonsillectomy → reduced use
- Bypass surgery → restricted use
22Phase I Design Strategy
- Designs based largely on tradition
- Typically do some sort of dose escalation to reach maximum tolerated dose (MTD)
- Has been shown to be safe and reasonably effective
- Dose escalation often based on Fibonacci series
- 1, 2, 3, 5, 8, 13, ...
23Dose-response curve (animal study)
24Typical Scheme
- 1. Enter 3 patients at a given dose
- 2. If no toxicity, go to next dosage and repeat step 1
- 3. a. If 1 patient has serious toxicity, add 3 more patients at that dose (go to 4)
- b. If 2/3 have serious toxicity, consider MTD
- 4. a. If 2 or more of 6 patients have toxicity, MTD reached
- b. If 1 of 6 has toxicity, increase dose and go back to step 1
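The escalation rules in this scheme can be sketched as a small decision function. This is an illustrative Python sketch only, not a validated trial algorithm: the function name `next_step` and the returned labels are hypothetical, and treating 3/3 toxicities the same as 2/3 is an assumption.

```python
def next_step(n_treated: int, n_toxic: int) -> str:
    """Decision after a cohort at the current dose.

    n_treated is 3 (first cohort) or 6 (after expanding the cohort);
    n_toxic is the cumulative number of serious toxicities at this dose.
    """
    if n_treated == 3:
        if n_toxic == 0:
            return "escalate"   # step 2: go to next dose
        if n_toxic == 1:
            return "expand"     # step 3a: add 3 more patients
        return "mtd"            # step 3b: 2/3 toxic (3/3 assumed the same)
    if n_treated == 6:
        if n_toxic >= 2:
            return "mtd"        # step 4a: MTD reached
        return "escalate"       # step 4b: increase dose, back to step 1
    raise ValueError("cohort size must be 3 or 6")


if __name__ == "__main__":
    for treated, toxic in [(3, 0), (3, 1), (3, 2), (6, 1), (6, 2)]:
        print(treated, toxic, next_step(treated, toxic))
```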
25Summary of Schemes (Storer, Biometrics 45:925-37, 1989)
- A. Standard
- Observe group of 3 patients
- No toxicity → increase dose
- Any toxicity → observe 3 or more
- One toxicity out of 6 → increase dose
- Two or more toxicity → stop
- B. 1 Up, 1 Down
- Observe single patients
- No toxicity → increase dose
- Toxicity → decrease dose
26Summary of Schemes (Storer, Biometrics 45:925-37, 1989)
- C. 2 Up, 1 Down
- Observe single patients
- No toxicity in two consecutive → increase dose
- Toxicity → decrease dose
- D. Extended Standard
- Observe groups of 3 patients
- No toxicity → increase dose
- One toxicity → dose unchanged
- Two or three toxicity → decrease dose
27Summary of Schemes (Storer, Biometrics 45:925-37, 1989)
- E. 2 Up, 2 Down
- Observe groups of 2 patients
- No toxicity → increase dose
- One toxicity → dose unchanged
- Both toxicity → decrease dose
- B, C, D, E: fixed sample sizes ranging from 12 to 32 patients
- Can speed up process to get to target dose range
- F. Bayesian sequential/adaptive designs
28Phase II Designs
- References
- Gehan (1961) Journal of Chronic Diseases
- Fleming (1982) Biometrics
- Storer (1989) Statistics in Medicine
- Goal
- Screen for therapeutic activity
- Further evaluate toxicity
- Test using MTD from Phase I
- If drug passes screen, test further
29Phase II Design
- Design of Gehan
- No control (is this wise?)
- Two-stage (small initial sample; if at least one benefit is observed, take a second, larger sample)
- Goal is to reject ineffective drugs ASAP
- Decision I: Drug is unlikely to be effective in ≥ x% of patients
- Decision II: Drug could be effective in ≥ x% of patients
30Phase II Design
- Example: Gehan Design
- Let x = 20%; want to check if drug is likely to work in at least 20% of patients
- 1. Enter 14 patients
- 2. If 0/14 respond, stop and declare true drug response ≤ 20%
- 3. If at least 1/14 respond, add 15-40 more patients
- 4. Estimate response rate and C.I.
31Gehan Design
- Why 14 patients initially?
- If drug is ≥ 20% effective, there would be a 95.6% chance of at least one success
- If 0/14 successes observed, reject drug
Patient   Prob
1         0.8
2         0.64 (0.8 x 0.8)
3         0.512 (0.8 x 0.8 x 0.8)
---       ---
8         0.16
---       ---
14        0.044
32Phase II Design
- Stage I Sample Size - Gehan
- Table I
Rejection    Effectiveness (%)
Error (%)    5    10   15   20   25   40   50
5            59   29   19   14   11   6    5
10           45   22   15   11   9    5    4
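The Stage I sizes in Table I follow from the same reasoning as the 14-patient example: pick the smallest n for which the chance of seeing no responses at all, (1 - effectiveness)^n, falls to the rejection error or below. A minimal Python sketch under that reading (the function name `gehan_stage1_n` is illustrative, not from the source):

```python
def gehan_stage1_n(effectiveness: float, rejection_error: float) -> int:
    """Smallest n with P(0 responses in n patients) <= rejection_error,
    assuming independent patients who each respond with probability
    `effectiveness`."""
    n = 1
    while (1.0 - effectiveness) ** n > rejection_error:
        n += 1
    return n


if __name__ == "__main__":
    rates = [0.05, 0.10, 0.15, 0.20, 0.25, 0.40, 0.50]
    for err in (0.05, 0.10):
        print(err, [gehan_stage1_n(p, err) for p in rates])
    # 0.05 [59, 29, 19, 14, 11, 6, 5]
    # 0.1 [45, 22, 15, 11, 9, 5, 4]
```

With 20% effectiveness and a 5% rejection error this gives n = 14, since 0.8^14 is about 0.044.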
33Stage II Sample Size
- Based on desired precision of effectiveness estimate
- r1 = # of successes in Stage 1
- n1 = # of patients in Stage 1
-
- Now precision of total sample N = (n1 + n2)
34Stage II Sample Size
- To be conservative, Gehan suggested using the upper 75% confidence limit from the first sample
- Thus, we can generate a table for the size of the second stage (n2) based on desired precision
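One way to turn a desired precision into a second-stage size is sketched below. It assumes precision is measured as the standard error of the estimated response rate, SE = sqrt(p(1 - p) / (n1 + n2)), and that the planning value p (for example, the upper 75% confidence limit from the first sample) is supplied by the caller; how that limit is computed is not shown here. The name `stage2_n` is hypothetical.

```python
import math


def stage2_n(n1: int, p_plan: float, target_se: float) -> int:
    """Additional patients n2 so that sqrt(p(1-p)/(n1+n2)) <= target_se.

    p_plan is a conservative planning value for the response rate
    (an assumption of this sketch, passed in by the caller).
    """
    # Small epsilon guards against floating-point noise just above an integer.
    total = math.ceil(p_plan * (1.0 - p_plan) / target_se ** 2 - 1e-9)
    return max(total - n1, 0)


if __name__ == "__main__":
    # Worst case p = 0.5 with 14 first-stage patients and a 10% standard error
    print(stage2_n(14, 0.5, 0.10))
```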
35Additional Patients for Stage II (n2, α1 = .05)
36Phase II Trial Designs
- Many cancer Phase II trials follow Gehan design
- Many other diseases could; there seems to be no standard non-cancer Phase II design
- Might also randomize patients into multiple arms, each with a different dose; can then get a dose-response curve
- Other two-stage designs based on determining p1 - p0 > x, where p0 is the standard care combination
37Phase III Trial Designs
- The foundation for the design of controlled experiments was established for agricultural experiments
- The need for control groups in clinical studies recognized, but not widely accepted until 1950s
- No comparison groups needed when results dramatic
- Penicillin for pneumococcal pneumonia
- Rabies vaccine
- Use of proper control group necessary due to
- Natural history of most diseases
- Variability of a patient's response to intervention
38Phase III Design
- Comparative Studies
- Experimental Group vs. Control Group
- Establishing a Control
- 1. Historical
- 2. Concurrent
- 3. Randomized
- Randomized Control Trial (RCT) is the gold standard
- Eliminates several sources of bias
39Purpose of Control Group
- To allow discrimination of patient outcomes caused by test treatment from those caused by other factors
- Natural progression of disease
- Observer/patient expectations
- Other treatment
- Fair comparisons
- Necessary to be informative
40Goals of Phase III Clinical Trial
- Superiority Trials
- A controlled trial may demonstrate efficacy of the test treatment by showing that it is superior to the control
- No treatment (placebo)
- Best standard of current care
41Goals of Phase III Clinical Trials
- Non-Inferiority Trials
- Controlled trial may demonstrate efficacy by showing the test treatment is similar in efficacy to a known effective treatment
- The active control has to be effective under the conditions of the trials
- New treatment cannot be worse by a pre-specified amount
- New treatment may not be better than the standard but may have other advantages
- Cost
- Toxicity and/or side effects
- Invasiveness
42Significance of Control Group
- Inference drawn from the trial
- Ethical acceptability of the trial
- Degree to which bias is minimized
- Type of subjects
- Kind of endpoints that can be studied
- Credibility of the results
- Acceptability of the results by regulatory authorities
- Other features of the trial, its conduct, and interpretation
43Use of Placebo Control
- The placebo effect is well documented (as high as 33% according to some studies)
- Could be
- No treatment + placebo
- Standard care + placebo
- Matched placebos are necessary so patients and investigators cannot decode the treatment assignment
- E.g. Vitamin C trial for common cold
- Placebo was used, but was distinguishable
- Many on placebo dropped out of study; not blinded
- Those who knew they were on vitamin C reported fewer cold symptoms and shorter duration than those on vitamin C who didn't know
44Unbiased Evaluation
- Subject Bias (NIH Cold Study) (Karlowski, 1975)
- Duration of Cold (Days)
                 Blinded Subjects   Unblinded Subjects
Placebo          6.3                8.6
Ascorbic Acid    6.5                4.8
45Historical Control Study
- A new treatment used in a series of subjects
- Outcome compared with previous series of comparable subjects
- Non-randomized
- Rapid, inexpensive, good for initial testing of new treatments
- Vulnerable to biases
- Different underlying populations
- Criteria for selecting patients
- Patient care
- Diagnostic or evaluating criteria
46Historical Control Study
- When might we consider a historical control study?
- When preliminary data strongly suggest efficacy.
- When course of disease predictable, generally a consistently poor outcome.
- When endpoints objective, like death or metastasis.
- When impact of baseline and other variables on endpoint is well characterized.
47Randomized Control Clinical Trial
- Reference: Byar et al. (1976) New England Journal of Medicine
- Patients assigned at random to either treatment(s) or control
- Considered to be Gold Standard
48Disadvantages of Randomized Control Clinical Trial
- 1. Generalizable Results?
- Subjects may not represent general patient population (volunteer effect)
- 2. Recruitment
- Twice as many new patients
- 3. Acceptability of Randomization Process
- Some physicians will refuse
- Some patients will refuse
- 4. Administrative Complexity
49Ethics of Randomization
- Statistician/clinical trialist must sell benefits of randomization
- Ethics ⇒ MD should do what he thinks is best for his patient
- Two MDs might ethically treat same patient quite differently
- Chalmers & Shaw (1970) Annals of the New York Academy of Sciences
- 1. If MD "knows" best treatment, should not participate in trial
- 2. If in doubt, randomization gives each patient equal chance to receive one of therapies (i.e. best)
- 3. More ethical way of practicing medicine
- Bayesian adaptive designs → more likely to assign better treatment
50Comparing Treatments
- Fundamental principle
- Groups must be alike in all important aspects and only differ in the treatment each group receives
- In practical terms, comparable treatment groups means alike on the average
- Randomization
- Each patient has the same chance of receiving any of the treatments under study
- Allocation of treatments to participants is carried out using a chance mechanism so that neither the patient nor the physician knows in advance which therapy will be assigned
- Blinding
- Avoidance of psychological influence
- Fair evaluation of outcomes
51Randomized Phase III Experimental Designs
- Assume
- Patients enrolled in trial have satisfied eligibility criteria and have given consent
- Balanced randomization: each treatment group will be assigned an equal number of patients
- Issue
- Different experimental designs can be used to answer different therapeutic questions
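Balanced randomization can be implemented in several ways; the slides do not name one. The sketch below uses permuted blocks, a common choice and an assumption here, so that the arms stay equal in size after every complete block. The function name, arm labels, and default block size are illustrative.

```python
import random


def permuted_block_schedule(n_patients, arms=("A", "B"), block_size=4, seed=None):
    """Treatment assignments in randomly permuted blocks.

    Within each complete block every arm appears equally often, so the
    groups are balanced whenever n_patients is a multiple of block_size.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_patients:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # chance mechanism: order within the block is random
        schedule.extend(block)
    return schedule[:n_patients]


if __name__ == "__main__":
    print(permuted_block_schedule(12, seed=2024))
```

In practice the schedule would be generated and held centrally so that neither the patient nor the physician can see the next assignment in advance.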
52Commonly Used Phase III Designs
- Parallel
- Withdrawal
- Group/Cluster
- Randomized Consent
- Cross Over
- Factorial
- Large Simple
- Equivalence/Non-inferiority
- Sequential
53Parallel Design
- Screen → Randomize → Trt A or Trt B
- H0: A vs. B
- Advantage
- Simple, General Use
- Valid Comparison
- Disadvantage
- Few Questions/Study
54Fundamental Design
[Flow diagram: Eligible? (No → Dropped) → Consent? (No → Dropped) → Randomize → A or B]
Comment: Compare A with B
55Run-In Design
- Problem
- Non-compliance by patient may seriously impair efficiency and possibly distort conclusions.
- Possible Solution: Drug Trials
- Assign all eligible patients a placebo to be taken for a brief period of time. Patients who are judged compliant are enrolled into the study. This is often referred to as the Placebo Run-In period.
- Can also use active drug to test for compliance.
56Run-In Design
[Flow diagram: Screen and Consent → Run-In Period → Satisfactory → Randomize → A or B; Unsatisfactory → Dropped]
Note: It is assumed that all patients entering the run-in period are eligible and have given consent
57Withdrawal Study
- Treatment A → randomize → Treatment A or Not Treatment A (placebo)
- Advantage
- Easy access to subjects
- Show if continued treatment is beneficial
- Disadvantage
- Selected Population
- Different Disease Stage
58Cluster Randomization Designs
- Groups (clinics, communities) are randomized to treatment or control
- Examples
- Community trials on fluoridation of water
- Breast self-examination programs in different clinic settings in USSR
- Smoking cessation intervention trial in different school districts in the state of Washington
- Advantages
- Sometimes logistically more feasible
- Avoid contamination
- Allow mass intervention, thus public health trial
- Disadvantages
- Effective sample size less than number of subjects
- Many units must participate to overcome unit-to-unit variation, thus requires larger sample size
- Need cluster sampling methods
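To illustrate why the effective sample size is smaller than the number of subjects, the sketch below uses the usual design-effect formula, 1 + (m - 1) x ICC, for equal clusters of size m. The formula, the intracluster correlation (ICC) parameter, and the function name are assumptions brought in for illustration; the slides do not state them.

```python
def effective_sample_size(n_clusters: int, cluster_size: int, icc: float) -> float:
    """Number of independent subjects that carry the same information.

    Assumes equal cluster sizes and a common intracluster correlation.
    """
    design_effect = 1.0 + (cluster_size - 1) * icc
    return n_clusters * cluster_size / design_effect


if __name__ == "__main__":
    # 20 clinics of 50 patients each, with a modest within-clinic correlation
    print(round(effective_sample_size(20, 50, 0.05), 1))
```

With no within-cluster correlation the 1,000 subjects count fully; with perfect correlation they count as only 20, one per cluster.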
59Cross Over Design (H0: A vs. B)
- Scheme
                 Period
Group            I        II
1 (AB)           TRT A    TRT B
2 (BA)           TRT B    TRT A
- Advantage
- Each patient their own control
- Smaller sample size
- Disadvantage
- Not useful for acute disease
- Disease must be stable
- Assumes no period carry over
- If carryover, have a study half sized (Period I A vs. Period I B)
60Superiority vs. Non-Inferiority Trials
- Superiority Design: Show that new treatment is better than the control or standard (maybe a placebo)
- Non-inferiority: Show that the new treatment
- Is not worse than the standard by more than some margin
- Would have beaten placebo if a placebo arm had been included (regulatory)
61Equivalence/Non-inferiority Trial
- Trial with active (positive) controls.
- The question is whether new (easier or cheaper) treatment is as good as the current treatment.
- Must specify margin of equivalence or non-inferiority
- Can't statistically prove equivalency; can only show that difference is less than something with specified probability.
- Historical evidence of sensitivity to treatment
- Sample size issues are crucial.
- Small sample size, leading to low power and
subsequently lack of significant difference, does
not imply equivalence.
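The point about sample size can be made concrete with a confidence-interval check: non-inferiority is concluded only if the lower confidence limit for (new minus standard) lies above minus the margin. The sketch below uses a simple normal-approximation interval for a difference in success proportions; the function name, the 10-point margin, and the example counts are all illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_check(x_new, n_new, x_std, n_std, margin, alpha=0.05):
    """Two-sided (1 - alpha) CI for p_new - p_std and the non-inferiority call."""
    p_new, p_std = x_new / n_new, x_std / n_std
    diff = p_new - p_std
    se = sqrt(p_new * (1 - p_new) / n_new + p_std * (1 - p_std) / n_std)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = diff - z * se, diff + z * se
    return lower, upper, lower > -margin


if __name__ == "__main__":
    # Same observed rates (70% vs 75%), very different sample sizes
    print(noninferiority_check(14, 20, 15, 20, margin=0.10))
    print(noninferiority_check(700, 1000, 750, 1000, margin=0.10))
```

In the small trial the interval includes zero (no significant difference) yet extends far below the margin, so non-inferiority is not shown; the large trial with the same rates can support it.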
62Non-Inferiority Challenges
- Requires high quality trial
- Poor execution favors non-inferiority
- Treatment margin somewhat arbitrary
63Sequential Design
- Continue to randomize subjects until H0 is either rejected or accepted
- A large statistical literature for classical sequential designs
- Developed for industrial setting
- Modified for clinical trials
- (e.g. Armitage 1975, Sequential Medical Trials)
64Classical Sequential Design
- Continue to randomize subjects until H0 is either rejected or accepted
[Figure: classical sequential design. Net Treatment Effect (-20 to 20) plotted against No. of Paired Observations (100 to 300), with regions labeled Trt Better, Continue, Accept H0, Continue, Trt Worse]
65Sample Size Considerations
66Comparing Time to Event Distributions
- Primary endpoint is the time to an event
- Compare the survival distributions
- Measure of treatment effect is the ratio of the hazard rates
- Must also consider the length of follow-up
67Exponential Survival Distributions
- Survival function: $P(T > t) = e^{-\lambda t}$
- George & Desu (1974)
- Assumes all patients followed to an event (no censoring)
- Assumes all patients immediately entered
68Converting Number of Events (D) to Required
Sample Size (2N)
- $d = 2N \times P(\text{event})$, so $2N = d / P(\text{event})$
- P(event) is a function of the length of total follow-up at time of analysis and the average hazard rate
- Let AR = accrual rate (patients per year)
- A = period of uniform accrual (2N = AR x A)
- F = period of follow-up after accrual complete
- A/2 + F = average total follow-up at planned analysis
- $\bar{\lambda}$ = average hazard rate
- Then $P(\text{event}) = 1 - P(\text{no event}) = 1 - e^{-\bar{\lambda}(A/2 + F)}$
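The conversion from required events to total sample size can be sketched as follows, assuming exponential survival so that P(no event) = exp(-average hazard x average follow-up), with average follow-up A/2 + F. The function name and the example inputs are illustrative.

```python
import math


def total_sample_size(d_events, avg_hazard, accrual_years, followup_years):
    """Total sample size 2N needed to observe d_events, and P(event).

    Assumes uniform accrual over accrual_years (A), a further
    followup_years (F) after accrual ends, and exponential survival.
    """
    avg_followup = accrual_years / 2.0 + followup_years
    p_event = 1.0 - math.exp(-avg_hazard * avg_followup)
    return math.ceil(d_events / p_event), p_event


if __name__ == "__main__":
    n_total, p = total_sample_size(100, avg_hazard=0.25,
                                   accrual_years=3, followup_years=2)
    print(n_total, round(p, 3))
```

Dividing the resulting total by the accrual rate AR shows whether the assumed accrual period A is consistent with 2N = AR x A; if not, A can be revised and the calculation repeated.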
69Time to Failure
- In many clinical trials
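- 1. Not all patients are followed to an event (i.e. censoring)
- 2. Patients are recruited over some period of time (i.e. staggered entry)
- More General Model (Lachin, 1981)
- $N = \dfrac{(Z_\alpha + Z_\beta)^2 \, [\varphi(\lambda_C) + \varphi(\lambda_I)]}{(\lambda_C - \lambda_I)^2}$ per group
- where $\varphi(\lambda)$ depends on the recruitment and censoring pattern.
70- 1. Instant Recruitment, Study Censored at Time T: $\varphi(\lambda) = \lambda^2 / (1 - e^{-\lambda T})$
- 2. Continuous Recruiting (0, T), Censored at T: $\varphi(\lambda) = \lambda^2 / \left(1 - \dfrac{1 - e^{-\lambda T}}{\lambda T}\right)$
- 3. Recruitment (0, T0), Study Censored at T (T > T0): $\varphi(\lambda) = \lambda^2 / \left(1 - \dfrac{e^{-\lambda (T - T_0)} - e^{-\lambda T}}{\lambda T_0}\right)$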
71- Example
- Assume α = .05 (2-sided), 1 - β = .90
- λC = .3 and λI = .2
- T = 5 years follow-up
- T0 = 3
- 0. No Censoring, Instant Recruiting
- N = 128
- 1. Censoring at T, Instant Recruiting
- N = 188
- 2. Censoring at T, Continual Recruitment
- N = 310
- 3. Censoring at T, Recruitment to T0
- N = 233
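The four sample sizes in this example can be reproduced with the sketch below. It assumes N is per group, uses the log hazard-ratio formula usually attributed to George & Desu (1974) for case 0, and uses the Lachin (1981) form N = (Zα + Zβ)² [φ(λC) + φ(λI)] / (λC - λI)² for cases 1 to 3. The formulas are written from standard references rather than taken from the slides, and the function names are illustrative.

```python
from math import exp, log
from statistics import NormalDist


def z_sum_sq(alpha=0.05, power=0.90):
    """(Z_alpha + Z_beta)^2 for a two-sided alpha."""
    nd = NormalDist()
    return (nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)) ** 2


def n_no_censoring(lam_c, lam_i, alpha=0.05, power=0.90):
    """Case 0: all patients followed to an event, instant entry."""
    return 2 * z_sum_sq(alpha, power) / log(lam_c / lam_i) ** 2


def phi_instant(lam, T):
    """Case 1: instant recruitment, censored at T."""
    return lam ** 2 / (1 - exp(-lam * T))


def phi_continuous(lam, T):
    """Case 2: uniform recruitment over (0, T), censored at T."""
    return lam ** 2 / (1 - (1 - exp(-lam * T)) / (lam * T))


def phi_recruit_then_follow(lam, T, T0):
    """Case 3: uniform recruitment over (0, T0), censored at T > T0."""
    return lam ** 2 / (1 - (exp(-lam * (T - T0)) - exp(-lam * T)) / (lam * T0))


def n_lachin(lam_c, lam_i, phi, alpha=0.05, power=0.90):
    """Per-group N for a given phi(lambda) function."""
    return z_sum_sq(alpha, power) * (phi(lam_c) + phi(lam_i)) / (lam_c - lam_i) ** 2


if __name__ == "__main__":
    lc, li, T, T0 = 0.3, 0.2, 5.0, 3.0
    print(round(n_no_censoring(lc, li)))                                    # 128
    print(round(n_lachin(lc, li, lambda l: phi_instant(l, T))))             # 188
    print(round(n_lachin(lc, li, lambda l: phi_continuous(l, T))))          # 310
    print(round(n_lachin(lc, li, lambda l: phi_recruit_then_follow(l, T, T0))))  # 233
```

The progression from 128 to 310 shows how much censoring and staggered entry inflate the required sample size; shortening recruitment to T0 = 3 recovers part of the loss.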
72Sample Size Adjustment for Non-Compliance
- References
- 1. Schork & Remington (1967) Journal of Chronic Diseases
- 2. Halperin et al (1968) Journal of Chronic Diseases
- 3. Wu, Fisher & DeMets (1980) Controlled Clinical Trials
- Problem
- Some patients may not adhere to treatment protocol
- Impact
- Dilute whatever true treatment effect exists
73Sample Size Adjustment for Non-Compliance
- Fundamental Principle
- Analyze All Subjects Randomized
- Called Intent-to-Treat (ITT) Principle
- Noncompliance will dilute treatment effect
- A Solution
- Adjust sample size to compensate for dilution effect (reduced power)
- Definitions of Noncompliance
- Dropout: Patient in treatment group stops taking therapy
- Dropin: Patient in control group starts taking experimental therapy
74- Comparing Two Proportions
- Assumes event rates will be altered by non-compliance
- Define
- PT* = adjusted treatment group rate
- PC* = adjusted control group rate
- If PT < PC,
[Figure: number line from 0 to 1.0 showing PC, PT and the adjusted rates PC*, PT*]
75Adjusted Sample Size
- Simple Model
- Compute unadjusted N
- Assume no dropins
- Assume dropout proportion R
- Thus PC* = PC
- PT* = (1 - R) PT + R PC
- Then adjust N: N* = N / (1 - R)^2
- Example
R      1/(1-R)^2   % Increase
.1     1.23        23
.25    1.78        78
76Sample Size Adjustment for Non-Compliance
- Dropouts & dropins (R0, RI)
- Example
R0     RI     1/(1-R0-RI)^2   Increase
.1     .1     1.56            56%
.25    .25    4.0             4 times
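The inflation factors in the two example tables come from dividing the unadjusted N by the square of the proportion of the treatment effect that survives non-compliance. A minimal sketch (the function names are illustrative):

```python
import math


def inflation_factor(dropout: float = 0.0, dropin: float = 0.0) -> float:
    """Multiplier 1 / (1 - R0 - RI)^2 applied to the unadjusted sample size."""
    retained = 1.0 - dropout - dropin
    if retained <= 0:
        raise ValueError("dropout + dropin must be less than 1")
    return 1.0 / retained ** 2


def adjusted_n(n_unadjusted: int, dropout: float = 0.0, dropin: float = 0.0) -> int:
    """Sample size after compensating for the diluted treatment effect."""
    return math.ceil(n_unadjusted * inflation_factor(dropout, dropin))


if __name__ == "__main__":
    for r0, ri in [(0.1, 0.0), (0.25, 0.0), (0.1, 0.1), (0.25, 0.25)]:
        print(r0, ri, round(inflation_factor(r0, ri), 2))
```

Because the factor grows with the square of the lost effect, 25% dropouts combined with 25% dropins quadruple the required sample size.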
77Sample Size Adjustments
- More Complex Model
- Ref: Wu, Fisher, DeMets (1980)
- Further Assumptions
- Length of follow-up divided into intervals
- Hazard rate may vary
- Dropout rate may vary
- Dropin rate may vary
- Lag in time for treatment to be fully effective
78Sample Size Summary
- Ethically, the size of the study must be large enough to achieve the stated goals with reasonable probability (power)
- Sample size estimates are only approximate due to uncertainty in assumptions
- Need to be conservative but realistic