Title: Statistics for Non-Statisticians
1Statistics for Non-Statisticians
- Kay M. Larholt, Sc.D.
- Vice President, Biometrics Clinical Operations
- Abt Bio-Pharma Solutions
2Topics
- 1) Basic Statistical Concepts
- 2) Study Design
- 3) Blinding and Randomization
- 4) Hypothesis testing
- 5) Power and Sample Size
3Basic Statistical Concepts
4Statistics
- Per the American Heritage dictionary: "The mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of population characteristics by inference from sampling."
- Two broad areas
- Descriptive: the science of summarizing data
- Inferential: the science of interpreting data in order to make estimates, test hypotheses, make predictions, or reach decisions about the target population from the sample.
5Introduction to Clinical Statistics
- Statistics - The science of making decisions in the face of uncertainty
- Probability - The mathematics of uncertainty
- The probability of an event is a measure of how likely the event is to happen
6Sample versus Population
7Clinical Statistics
- Biostatisticians are statisticians who apply statistics to the biological sciences.
- Clinical statistics are statistics that are applied to clinical trials
8Basic Statistical Concepts
- Types of data
- Descriptive statistics
- Graphs
- Basic probability concepts
- Types of probability distributions in clinical statistics
- Sample vs. population
9Types of Data
10Types of Quantitative Variables
11Continuous Data
- Data should be collected in its rawest form. We can always categorize data later. (We can never uncategorize data.)
- e.g. If you measure prostate size as part of the clinical trial, then capture the size in mm on the CRF.
12Basic Data Summarization Techniques
- The objective of data summarization is to describe the characteristics of a data set. Ultimately, we want to make the data set more comprehensible and meaningful.
- To put data in a concise form, use
- Summary descriptive statistics
- Graphs
- Tables
13Descriptive Statistics for Continuous Variables
- Measures of central tendency
- Mean, Median, Mode
- Measures of dispersion
- Range, Variance, Standard deviation
- Measures of relative standing
- Lower quartile (Q1)
- Upper quartile (Q3)
- Interquartile range (IQR)
14Mean
- Arithmetic average: the sum of all observations divided by the number of observations.
- Example
- The average age of a group of 10 people is 24.2 years
- Who are they?
15Mean
- Answer
- They could be ten twenty-somethings who go out to dinner together
- Pete aged 24, Jane aged 26, Louise aged 21, Bob aged 22, Julie aged 23, Sue aged 22, Jenn aged 27, John aged 28, Jeff aged 20 and Mark aged 29.
- The mean age for these 10 people is
- (24 + 26 + 21 + 22 + 23 + 22 + 27 + 28 + 20 + 29)/10 = 24.2 years
16Mean
- Or alternatively
- They could be Mr. & Mrs. Smith and their 8 grandchildren
- Susie aged 3, Abby aged 5, Max aged 8, Laura aged 10, Joshua aged 10, Emma aged 12, Jane aged 13, Sarah aged 18, Mrs. Smith aged 80, Mr. Smith aged 83.
- The mean age for these 10 people is
- (3 + 5 + 8 + 10 + 10 + 12 + 13 + 18 + 80 + 83)/10 = 24.2 years
17Mean
- Presenting the average alone does not give you
much information about the data you are looking
at.
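To make the point concrete, here is a minimal Python sketch (standard library only) that summarizes the two groups from the previous slides. The variable names are illustrative and not part of the original slides.

```python
from statistics import mean, median

# Ages taken from the two "Mean" example slides.
diners = [24, 26, 21, 22, 23, 22, 27, 28, 20, 29]
smiths = [3, 5, 8, 10, 10, 12, 13, 18, 80, 83]

for label, ages in (("dinner party", diners), ("Smith family", smiths)):
    # Same mean, very different medians and extremes.
    print(f"{label}: mean={mean(ages):.1f}, median={median(ages):.1f}, "
          f"min={min(ages)}, max={max(ages)}")
```

Both groups have a mean of 24.2 years, yet the median and the extremes show that they are very different sets of people.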
18Median
- The midpoint of the values after they have been ordered from the smallest to the largest, or the largest to the smallest.
- There are as many values above the median as below it in the data array.
19Median
Example: The ages of the people in our data set are 24, 26, 21, 23, 22, 27, 28, 20, 29 (I took out one of the 22 year olds to make this example easier).
Arranging the data in ascending order gives 20, 21, 22, 23, 24, 26, 27, 28, 29.
The median is 24.
20 "There are three kinds of lies: lies, damned lies, and statistics."
This well-known saying is part of a phrase attributed to Benjamin Disraeli and popularized in the U.S. by Mark Twain
21Median Home Price
- Darien, Connecticut
- Median home price: $1,295,000
- Location: about 40 miles northeast of midtown Manhattan
- Population 20,209; households 6,592
22Properties of Mean and Median
- There are unique means and medians for each variable in the data set.
- Median is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.
- Mean is a poor measure of central tendency in skewed distributions.
23Mode
- The value of the observation that appears most frequently.
- Example
- The exam scores for ten students are
- 81, 93, 84, 75, 68, 87, 81, 75, 81, 87.
- Since the score of 81 occurs the most, the modal score is 81.
24Averages and What Else?
- As we have seen, just knowing the mean or even
the median of a data set does not tell us enough
about the data. We need more information to
really describe the data.
25Measures of Dispersion
- Once we know something about the centre of the data we need to understand how the data are dispersed around this centre.
- How variable are the data?
26Range
- Maximum value in the data set minus minimum value in the data set
- 1. The age of the patients in our data set is
- 21, 25, 19, 20, 22
- Range = 25 - 19 = 6
- 2. The age of the patients in our data set is
- 21, 45, 19, 20, 22.
- Range = 45 - 19 = 26
- When max and min are unusual values, range may be a misleading measure of dispersion. The range only uses the 2 extreme values in the data.
27Variance and Standard Deviation
- The variance of a data set measures how far each data point is from the mean of the data set.
- It provides a measure of how spread out the data points are
- The Standard Deviation is the square root of the variance
28Variance and Standard Deviation
Variance: measure of dispersion, the average of the squared deviations of the data from the mean
Standard deviation: positive square root of the variance
Small std dev: observations are clustered tightly around the mean
Large std dev: observations are scattered widely about the mean
29Standard Deviation
- 1) Take each observation and subtract the mean of the observations from it
- 2) Square each result
- 3) Sum up all the squared results
- 4) Divide by n-1
- 5) Take the square root
30Example Standard Deviation
- 1. The age of the patients in our data set is
- 21, 25, 19, 20, 22
- Mean = 21.4, Median = 21, StdDev = 2.302
- 2. The age of the patients in our data set is
- 21, 45, 19, 20, 22.
- Mean = 25.4, Median = 21, StdDev = 11.014
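The step-by-step recipe for the standard deviation can be sketched in Python as follows. This is a minimal illustration using the two example data sets; the function name `sample_std_dev` is an assumption made here, and the standard library's `statistics.stdev` performs the same n-1 calculation.

```python
from math import sqrt
from statistics import mean, median, stdev

def sample_std_dev(values):
    """Sample standard deviation, following the slide's steps."""
    m = mean(values)
    squared_deviations = [(x - m) ** 2 for x in values]  # subtract the mean, square
    variance = sum(squared_deviations) / (len(values) - 1)  # sum, divide by n-1
    return sqrt(variance)  # take the square root

set_1 = [21, 25, 19, 20, 22]
set_2 = [21, 45, 19, 20, 22]

for ages in (set_1, set_2):
    print(f"range={max(ages) - min(ages)}, mean={mean(ages):.1f}, "
          f"median={median(ages)}, std dev={sample_std_dev(ages):.3f}")
```

One unusual value (45 instead of 25) leaves the median unchanged but moves the mean, the range and the standard deviation considerably.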
31Choosing an Appropriate Method of Central Tendency
- The mean is ordinarily the preferred measure of central tendency. The mean should always be presented along with the variance or the standard deviation
- There are situations when a median might be more appropriate
- - a skewed distribution
- - a small number of subjects
32Measures of Relative Standing
- Descriptive measures that locate the relative
position of an observation in relation to the
other observations.
33Measures of Relative Standing
- The pth percentile is a number such that p% of the observations of the data set fall below it and (100-p)% of the observations fall above it.
- Lower quartile = 25th percentile (Q1)
- Mid-quartile = 50th percentile (median or Q2)
- Upper quartile = 75th percentile (Q3)
- Interquartile range (IQR = Q3-Q1)
34Measures of Relative Standing an Example
1. The age of the patients in our data set is 21, 25, 19, 20, 22
Q1 = 20, Q2 = 21, Q3 = 22, IQR = 2
2. The age of the patients in our data set is 21, 45, 19, 20, 22
Q1 = 20, Q2 = 21, Q3 = 22, IQR = 2
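The quartiles above can be checked with a short Python sketch. Several conventions exist for computing quartiles on small samples; the `method="inclusive"` option of `statistics.quantiles` is chosen here only because it reproduces the slide's values, and the helper name `quartile_summary` is an assumption.

```python
from statistics import quantiles

def quartile_summary(values):
    """Return (Q1, Q2, Q3, IQR) using the 'inclusive' quartile convention."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q2, q3, q3 - q1

print(quartile_summary([21, 25, 19, 20, 22]))
print(quartile_summary([21, 45, 19, 20, 22]))
```

Unlike the range and the standard deviation, the IQR is the same for both data sets because it ignores the extreme values.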
35 Definitions
- Statistics - The science of making decisions in the face of uncertainty
- Probability - The mathematics of uncertainty
- The probability of an event is a measure of how likely the event is to happen
36Basic Probability Concepts
- Sample spaces and events
- Simple probability
- Joint probability
37Sample Spaces
- Collection of all possible outcomes
- Example All six faces of a die
- Example All 52 cards in a deck
38Sample Space
- Gumballs in a gumball machine
60 red, 50 green, 40 yellow, 30 white, 25 pink, 20 blue, 16 purple
Total: 241 gumballs
39Events
- Simple event
- Outcome from a sample space with one characteristic
- Examples: A red card from a deck of cards; a purple gumball from the gumball machine
- Joint event
- Involves two outcomes simultaneously
- Example: An ace that is also red from a deck of cards
40Events
- Mutually exclusive events
- Two events cannot occur together
- Example: Drawing one card from a deck
- A: Drawing a queen of diamonds
- B: Drawing a queen of clubs
- As only one of these can happen, events A and B are mutually exclusive
41Probability
- Probability is the numerical measure of the likelihood that an event will occur
- Value is between 0 and 1
- Scale: 1 (certain), .5, 0 (impossible)
42Computing Probabilities
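- The probability of an event E
- $P(E) = \frac{\text{number of outcomes in } E}{\text{total number of outcomes in the sample space}}$
- Assumes each of the outcomes in the sample space is equally likely to occur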
43Computing Probabilities
- Example
- What is the probability of rolling a 4 when you roll a die?
- # of possible outcomes in the sample space = 6
- # of 4s in the sample space = 1
- Prob (rolling a 4 when you roll a die) = 1/6
44Computing Probabilities
- Example
- What is the probability of rolling a six and a four when you roll 2 dice?
- # of possible outcomes in the sample space = 36
- # of ways to roll one 6 and one 4 = 2
- Prob (rolling a six and a four when you roll 2 dice) = 2/36 = 1/18
45Computing Joint Probability
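- The probability of a joint event, A and B
- $P(A \text{ and } B) = \frac{\text{number of outcomes satisfying both } A \text{ and } B}{\text{total number of outcomes in the sample space}}$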
46Computing Joint Probability
P(Red Card and an Ace) = 2 Red Aces / 52 Total Cards = 2/52 = 1/26
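The counting rule used in these examples (favourable outcomes divided by all equally likely outcomes) can be sketched by enumerating the sample space in Python. The helper `probability` and the way cards are represented are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def probability(sample_space, event):
    """P(event) when every outcome in the sample space is equally likely."""
    outcomes = list(sample_space)
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

# One die: probability of rolling a 4.
die = range(1, 7)
p_four = probability(die, lambda face: face == 4)

# Two dice: probability of one six and one four (in either order).
two_dice = list(product(die, repeat=2))
p_six_and_four = probability(two_dice, lambda roll: sorted(roll) == [4, 6])

# Deck of cards: joint event "red AND ace".
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))
p_red_ace = probability(
    deck, lambda card: card[0] == "A" and card[1] in ("hearts", "diamonds")
)

print(p_four, p_six_and_four, p_red_ace)
```

Using `Fraction` keeps the answers as exact ratios, which matches how the slides present them.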
47Type of Probability Distributions in Clinical
Statistics
- Bernoulli
- Binomial
- Normal
48Bernoulli Distribution
- The Bernoulli distribution is the "coin flip" distribution.
- X is Bernoulli if its probability function is $P(X = 1) = p$, $P(X = 0) = 1 - p$
Examples: X = 1 for heads in coin toss
X = 1 for male in survey
X = 1 for defective in a test of product
49Binomial Distribution
- The binomial distribution is just n independent Bernoullis added up.
- It is the number of successes in n trials.
- Probability of success is usually denoted by p, and therefore probability of failure is 1-p.
- Example: Number of heads when we flip a coin 10 times. Here n = 10, p = 0.5 (the probability of getting a head when we toss the coin once).
50Binomial Distribution
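- The binomial probability function
$P(X = x) = \frac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n - x}$, for x = 0, 1, ..., n
Example: X = number of heads when we flip a coin 10 times. Here X ~ Binomial(n = 10, p = 0.5).
n! = n factorial = n·(n-1)·(n-2)·...·1, so 10! = 10·9·8·7·6·5·4·3·2·1 = 3,628,800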
51Binomial Distribution
X = number of heads when we flip a coin 10 times. Here X ~ Binomial(n = 10, p = 0.5). Then E(X) = np = 5 (on average we expect to get 5 heads) and Var(X) = np(1-p) = 2.5.
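As a check on the coin-flip example, the following Python sketch evaluates the standard binomial probability function for every possible number of heads and recovers the mean and variance from it. The function name `binomial_pmf` is an assumption made here.

```python
from math import comb, factorial

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.5
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed directly from the probability function.
expected = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - expected) ** 2 * px for x, px in enumerate(pmf))

print(f"10! = {factorial(10):,}")
print(f"P(X = 5) = {pmf[5]:.3f}, E(X) = {expected}, Var(X) = {variance}")
```

The results agree with the shortcuts E(X) = np and Var(X) = np(1-p).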
52Gaussian or Normal Distribution aka Bell Curve
- Most important probability distribution in the statistical analysis of experimental data.
- Data from many different types of processes follow a normal distribution
- Heights of American women
- Returns from a diversified asset portfolio
- Even when the data do not follow a normal distribution, the normal distribution often provides a good approximation
53Gaussian or Normal Distribution aka Bell Curve
- The Normal Distribution is specified by two parameters
- The mean, μ
- The standard deviation, σ
54Standard Normal Distribution
55Characteristics of the Standard Normal
Distribution
- Mean μ of 0 and standard deviation σ of 1.
- It is symmetric about 0 (the mean, median and the mode are the same).
- The total area under the curve is equal to one. One half of the total area under the curve is on either side of zero.
56Area in the Tails of Distribution
- The total area under the curve that is more than 1.96 units away from zero is equal to 5%. Because the curve is symmetrical, there is 2.5% in each tail.
57Normal Distribution
- About 68% of observations lie within 1 std dev of mean
- About 95% of observations lie within 2 std dev of mean
- About 99.7% of observations lie within 3 std dev of mean
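These areas under the standard normal curve can be verified with Python's `statistics.NormalDist`. This is a minimal sketch; the helper name `area_within` is an assumption.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal distribution

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return z.cdf(k) - z.cdf(-k)

two_tail_area = 1 - area_within(1.96)  # area beyond +/-1.96, both tails
print(f"beyond +/-1.96: {two_tail_area:.3f} (each tail {two_tail_area / 2:.3f})")

for k in (1, 2, 3):
    print(f"within {k} std dev: {area_within(k):.3f}")
```

The 1.96 cut-off gives exactly the 5% two-tailed area, which is why "2 standard deviations" is only an approximation to the 95% rule.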
58Study Design
59Sample versus Population
- A population is a whole, and a sample is a fraction of the whole.
- A population is a collection of all the elements we are studying and about which we are trying to draw conclusions.
- A sample is a collection of some, but not all, of the elements of the population
60Sample versus Population
61Sample versus Population
- To make generalizations from a sample, it needs to be representative of the larger population from which it is taken.
- In the ideal scientific world, the individuals for the sample would be randomly selected. This requires that each member of the population has an equal chance of being selected each time a selection is made.
62Type of Studies and Study Design
- Phase I - IV
- Controlled vs. non-controlled studies
- Single arm, parallel groups, cross-over designs, and stratified designs
- Selecting an appropriate study design
- Analysis population: Intent-to-treat vs. per-protocol
63Phases of Clinical Trials
- Clinical trials are generally categorized into four phases.
- An investigational medicine or product may be evaluated in two or more phases simultaneously in different trials, and some trials may overlap two different phases.
64Phase 1 Studies Safety and Dosing
- Initial safety trials in which investigators attempt to establish the dose range tolerated by 20-80 healthy volunteers.
- Although usually conducted on healthy volunteers, Phase 1 trials are sometimes conducted with severely ill patients, for example those with cancer or AIDS.
65Phase 2 Studies Safety and Limited Efficacy
- Pilot clinical trials to evaluate safety and efficacy in selected populations of about 100-300 patients who have the disease or condition to be treated, diagnosed, or prevented. Often referred to as feasibility studies
- Used as dose finding studies as different doses and regimens are investigated
66Phase 3 studies - efficacy
- Large definitive studies that are carried out once safety has been established and doses that are likely to be effective have been found
- Often called pivotal studies
- FDA usually requires 2 Phase III studies for registration
67Phase 4 studies post marketing surveillance
- After the product is marketed, Phase 4 studies provide additional details about the product's safety and efficacy.
- May be used to evaluate formulations, dosages, durations of the treatment, medicine interactions, and other factors.
- Patients from various demographic groups may be studied.
68Phase 4 studies post marketing surveillance
- Important part of many Phase 4 studies: detecting and defining previously unknown or inadequately quantified adverse reactions and related risk factors.
- Phase 4 studies are often observational studies rather than experimental.
69Hierarchy of medical evidence
- From weakest to strongest evidence
- Case reports
- Case series
- Database studies
- Observational studies
- Controlled clinical trials
- Randomized controlled trial
-
Byar, 1978
70Clarke MJ. Ovarian ablation in breast cancer, 1896 to 1998: milestones along hierarchy of evidence from case report to Cochrane review. BMJ 1998; 317
71Controlled studies
- Studies in which a test article is compared with a treatment that has known effects.
- The control group may receive no treatment, standard treatment or placebo.
72What is a randomized clinical trial?
- A prospective study in humans
- Randomization
- Comparable control group
- Complete accounting of all cases
- Carefully monitored for safety and efficacy
- Adheres to regulatory requirements: GCP, FDA, ICH guidelines
73Blinded studies
- Blinded study: one in which the subject or the investigator (or both) are unaware of what trial product a subject is receiving.
- Single-blind study: subjects do not know what treatment they are receiving (active or control)
- Double-blind study: neither the subjects nor the investigators know what treatment a subject is receiving
74 75Intent-to-Treat Principle
- Primary analysis in most randomized clinical trials testing new therapies or devices.
- Requires that any comparison among treatment groups in a randomized clinical trial is based on the results for all subjects in the treatment group to which they were randomly assigned.
- Full analysis includes compliers and non-compliers
76Intent-to-Treat
- ITT Population includes the following
- All randomized patients: preserves the initial randomization
- - Prevents biased comparison
- - Basis for statistical tests and inference
77Intent-to-Treat
- Problems: predictable or unpredictable
- Ineligible patients allowed in the trial
- Non-compliance, i.e. not following the assigned treatment
- Patients refusing a trial procedure
- Prohibited medication
- Early withdrawal/termination
- Invalid data
78Intent-to-Treat
FDA guideline related to regulatory submission states: "As a general rule, even if the sponsor's preferred analysis is based on a reduced subset of the patients with data, there should be an additional intent-to-treat analysis using all randomized patients."
Ref: ICH E3, Structure and Content of Clinical Study Reports
79Intent-to-Treat
- When can we exclude randomized patients?
- Failure to satisfy major entry criteria
- Failure to take at least one dose of medication
- Failure to complete procedure
- Lack of any data post-randomization
- Lost to follow up
- Missing data randomly, not related to treatment
assignment
80Intent-to-Treat
Problem: In a 6-month study, what should be done with the patient who drops out and provides no further data after 2 months?
81Intent-to-Treat
Last Observation Carried Forward (LOCF): Use the last available valid post-baseline observation on a particular variable for each missing visit through the end of study
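A minimal Python sketch of LOCF for a single patient and a single variable is shown below. The function name `locf`, the use of `None` to mark a missing visit, and the pain scores are all hypothetical and chosen only to illustrate the rule.

```python
def locf(post_baseline_values):
    """Fill missing visits (None) with the last available observation.

    Visits before the first valid post-baseline observation stay missing,
    because there is nothing yet to carry forward.
    """
    filled, last_seen = [], None
    for value in post_baseline_values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled

# Hypothetical monthly scores in a 6-month study; the patient drops out
# after month 2 and provides no further data.
monthly_scores = [7.0, 6.0, None, None, None, None]
print(locf(monthly_scores))
```

The month 2 value is simply repeated for months 3 to 6, which is why LOCF can mislead when the reason for dropping out is related to treatment.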
82LOCF last observation carried forward
83 Last Observation Carried Forward (LOCF): biased if the early withdrawal is treatment related
84Example
The primary analysis sample will be based on the
principle of intention-to-treat. All patients
who sign the written Informed Consent form, meet
the study entry criteria, and undergo
randomization will be included in the analysis,
regardless of whether or not the assigned
treatment device was implanted.
85Intent-to-Treat Principle
- Using the complete analysis data set
- Preserves the randomization at the time of analysis which helps prevent bias
- Provides the foundation for statistical testing.
- Provides estimates of treatment effects which are more likely to mirror those observed in clinical practice.
86Argument against ITT
- An ITT analysis, by including subjects who were randomized to the drug but received little or no drug, will dilute the treatment effect when compared to the placebo group
87How can we improve the ITT analysis?
- Careful identification of inclusion/exclusion criteria
- Careful review of reasons for failure, missing data, and exclusions
- Adherence to Good Clinical Practices
- Better monitoring practices to reduce protocol deviations and non-compliance
- Appropriate and detailed statistical plan and analysis
88Per-Protocol aka Evaluable patient population
- Subset of the ITT population who are compliant with the protocol, excluding patients with
- Major protocol violation/deviation
- Use of prohibited medication as per protocol
- Technical or procedural failure
- Loss to follow-up, lack of efficacy/response
- Wrong treatment assignment
89Per-Protocol Population
- Advantages and disadvantages
- Analysis in its pure form, completely as per the protocol
- Maximizes the efficacy from new treatment
- Not a conservative approach, results in bias due to exclusion
90Per-Protocol Population
- Advantages and disadvantages
- May not have enough power and sample size
- Both analyses are done in confirmatory trials
- If the results and conclusions are the same from the two analyses, the confidence is higher.
91Blinding and Randomization
92 Randomisation
93History
- The concept of randomisation was introduced by R.A. Fisher in 1926 in the area of agricultural research.
- Previous to that, clinical trials in the 18th and 19th centuries had used controls from the literature, other historical controls and concurrent controls.
94Randomisation
- To guard against any use of judgement or systematic arrangements, i.e. to avoid bias
- To provide a basis for the standard methods of statistical analysis such as significance tests
- Assures that treatment groups are balanced (on average) in all regards.
- i.e. balance occurs for known prognostic variables and for unknown or unrecorded variables
95 - Inferential statistics calculated from a clinical trial make an allowance for differences between patients, and this allowance will be correct on average if randomisation has been employed.
96 - Randomisation promotes confidence that we have acted in utmost good faith. It is not to be used as an excuse for ignoring the distribution of known prognostic factors.
- Randomisation is essential for the effective blinding of a clinical trial.
97Non-Randomised Trials
- It is difficult to obtain a reliable assessment
of treatment effect from non-randomised studies.
98Uncontrolled Trials
- Medical practice implies that a doctor prescribes a treatment for a patient that in his/her judgement, based on past experience, offers the best prognosis.
- Clinicians are always looking for new therapies, improvements in therapies and alternative therapies.
99 - When a new treatment is proposed some clinicians might try it on a few patients in an uncontrolled trial.
- The new treatment is studied without any direct comparison with a similar group of patients on more standard therapy.
100 - Uncontrolled trials have the potential to provide a very distorted view of therapy.
- Why?
101Laetrile
- In the 1970s in the US, Laetrile achieved widespread popular support for treating advanced cancer of all types without any formal testing in clinical trials.
- NCI tried to collect documented cases of tumour response after Laetrile therapy. Although an estimated 70,000 cancer patients had tried Laetrile, only 93 cases were submitted for evaluation and 6 were judged to have a response.
102Laetrile
- An uncontrolled trial of 178 patients found no benefit and evidence of cyanide toxicity
- The final conclusion of NCI was that Laetrile is a toxic drug that is not effective as a cancer treatment
103 - Uncontrolled trials are much more likely to lead
to enthusiastic recommendation of the treatment
as compared with properly controlled trials.
104Historical Controls
- Instead of randomising groups, some studies compare the current patients on the new treatment with previous patients who had received the standard treatment.
- This is a Historical Control group.
105 - Major flaw - How can we be sure that the comparison is fair? How do we know whether the 2 groups differ with respect to any feature other than the treatment itself?
106Patient Selection
- Historical control group is less likely to have clearly defined criteria for patient inclusion because the patients on the standard treatment were not known to be in the clinical trial when their treatment began.
- Historical controls were recruited earlier and possibly from a different source and therefore might be a different type of patients.
- Investigator might be more restrictive in choice of patients for new treatment
107Concurrent Non-randomised Controls
- Use some pre-determined systematic method or
investigator judgement to assign patients to
groups
108Non-Randomised controls
- Date of birth: odd/even day of birth = new/standard treatment
- Date of presentation: odd/even days = new/standard treatment
- Alternate assignment: odd/even patients = new/standard treatment
109Example
- Trial of anticoagulant therapy for MI
- Patients admitted on odd days of the month received anticoagulant and patients admitted on even days did not.
- N: Treated = 589, Control = 442
110 - Is it ethical to randomise?
- Assuming we have sufficient supply of the new treatment, why shouldn't every new patient be given the new treatment?
111 - Tendency is to do a non-randomised trial first and then follow up with an RCT.
- However it is difficult to do the RCT if the results from the non-randomised trial are too good.
112 - We assume that the new treatment has a reasonable chance of being an improvement.
- Before agreeing to enter patients into a randomised trial the investigator must be prepared to stay objective about the treatments involved.
- Randomised trials often produce scientific evidence that contradicts prior beliefs.
113Equipoise
- What is equipoise and why is it important?
- A state of being equally balanced
- Clinical equipoise provides the ethical basis for
medical research involving randomly assigning
patients to different treatment arms.
114Clinical Equipoise
- Term was first used by B. Freedman in 1987, in the article "Equipoise and the ethics of clinical research", NEJM 1987; 317(3).
- The ethics of clinical research requires equipoise - a state of genuine uncertainty on the part of the clinical investigator regarding the comparative therapeutic merits of each arm in a trial. Should the investigator discover that one treatment is of superior therapeutic merit, he or she is ethically obliged to offer that treatment.
115Clinical Equipoise
Freedman suggests that as long as there is genuine uncertainty within the expert medical community about the preferred treatment then there can be clinical equipoise, even if a specific investigator has a preference.
116Randomisation
117Randomisation
- Randomised trial with two treatments, A or B
- How do we assign treatments?
- Toss a coin each time: Heads = A, Tails = B
- Random Numbers Table
- Random Permuted Blocks
118Flip a coin
- Could flip coin for each participant (called complete randomisation or simple randomisation)
- Problem: can get imbalance in groups, especially in smaller trials
- Imbalance in prognostic factors more likely
- Inefficient for estimating treatment effect
119Probability of 5 Treated and 5 Controls in 10
patients
- What is the probability of getting 5 Treated patients out of 10?
- Remember the binomial distribution
120Binomial Distribution
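- The binomial probability function
$P(X = x) = \frac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n - x}$
X ~ Binomial(n = 10, p = 0.5). In this case, we want x = 5, so P(X = 5) = 252/1024 ≈ 0.246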
121Imbalance with 10 Participants
- (T, C)              Probability   Efficiency
- (5,5)               .246          1
- (4,6) or (6,4)      .410          .96
- (3,7) or (7,3)      .234          .84
- (2,8) or (8,2)      .088          .64
- (1,9) or (9,1)      .020          .36
- (0,10) or (10,0)    .002          0
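The table above can be reproduced with a few lines of Python. The probability column follows from the binomial distribution with p = 0.5. The slides do not define "efficiency"; the formula 4·T·C/N² used below is an assumption that happens to reproduce the tabulated values. Function names are illustrative.

```python
from math import comb

def split_probability(t, n):
    """Probability of a (t, n-t) split in either direction with fair coin flips."""
    c = n - t
    ways = comb(n, t) if t == c else comb(n, t) + comb(n, c)
    return ways / 2**n

def efficiency(t, n):
    """Relative efficiency of a (t, n-t) split (assumed formula: 4*T*C / N^2)."""
    return 4 * t * (n - t) / n**2

n = 10
for t in range(5, -1, -1):
    print(f"({t},{n - t}): probability={split_probability(t, n):.3f}, "
          f"efficiency={efficiency(t, n):.2f}")
```

With only 10 participants a perfectly balanced 5-5 split happens about a quarter of the time, so some imbalance is the norm rather than the exception.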
122 - Even if treatment is balanced at end of trial, it may be unbalanced at some time
- E.g., may be balanced at end with 400 participants, but first 10 might be
- CCCCTCTCTC
123Random Permuted Blocks
- To balance over time, could randomize in blocks (called random permuted blocks)
- Conceptually, for blocks of size 4, put 2 T labels and 2 C labels in a hat; for the next 4 participants, draw labels at random without replacement from the hat
- TTCC, TCTC, TCCT, CTTC, CTCT, CCTT are all equally likely
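The "labels in a hat" idea can be sketched in Python as follows. This is an illustration only, not a validated randomisation system; the function name `permuted_block_schedule` and the seed value are assumptions.

```python
import random
from itertools import permutations

def permuted_block_schedule(n_participants, block_size=4, rng=None):
    """Assign T/C in randomly permuted blocks with equal T and C per block."""
    if block_size % 2:
        raise ValueError("block_size must be even for 1:1 allocation")
    rng = rng or random.Random()
    labels = ["T"] * (block_size // 2) + ["C"] * (block_size // 2)
    schedule = []
    while len(schedule) < n_participants:
        block = labels[:]      # put the labels in the hat
        rng.shuffle(block)     # draw them at random without replacement
        schedule.extend(block)
    return schedule[:n_participants]

# The six distinct arrangements of a block of size 4.
distinct_blocks = sorted({"".join(p) for p in permutations("TTCC")})
print(distinct_blocks)

schedule = permuted_block_schedule(12, rng=random.Random(2024))
print("".join(schedule))
```

Every consecutive group of 4 assignments in the schedule contains exactly 2 T and 2 C, whatever the seed.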
124Forces balance after every 4
- Participant: 1 2 3 4 | 5 6 7 8 | 9 10 11 12
- Assignment:  T C T C | C C T T | C T C T
125Randomisation by blocks 5 sites, 6 patients per
site
126Incomplete Blocks
- What happens if a site does not enroll all the patients in a block?
- What happens if multiple sites do not enroll all the patients in a block?
127 - The smaller the block size, the more often balance is forced, e.g., in a trial of 100,
- blocks of size 2 force balance after every 2
- A block of size 100 forces balance only at end
128 - With blocks of size 2 in an unblinded trial, we know every second participant's assignment in advance
- I can veto potential participants until I find one I like (sick one if next assignment is control, healthy one if next assignment is treatment)
- Schulz KF. Subverting Randomization in Controlled Trials. JAMA 1995; Vol. 274
129 - Even with larger blocks, in an unblinded trial you know some assignments in advance
- With blocks of size 8, if the first 6 are TCTTCT, we know the next 2 are C
- Using a variable block size in a study makes it harder to guess
- Never include the block size in a protocol
130Subgroup balance
- Sometimes want to balance treatment assignments within subgroups
- Especially important if subgroup size is small
- E.g., with 6 diabetics in a trial, with a complete randomisation, there is a 22% chance of a 5-1 or 6-0 split!
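The 22% figure can be checked directly from the binomial distribution. The short Python sketch below counts the lopsided splits among the 2^6 equally likely assignment sequences; the variable names are illustrative.

```python
from math import comb

n = 6  # diabetics in the trial, each assigned by a fair coin flip

# Splits of 6-0, 5-1, 1-5 and 0-6 correspond to 0, 1, 5 or 6 treated patients.
lopsided_ways = sum(comb(n, treated) for treated in (0, 1, 5, 6))
p_lopsided = lopsided_ways / 2**n

print(f"{lopsided_ways} of {2**n} sequences, probability = {p_lopsided:.3f}")
```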
131Stratified Randomisation
- To avoid this problem could stratify the randomisation (use blocked randomisation separately within each level of a factor, such as diabetics and nondiabetics)
- E.g., for blocks of size 6,
- Diabetics Nondiabetics
- CTTCCT TTCTCC TCCTTC
132Stratified Block randomisation
- Typical examples of such factors are age group,
severity of condition, and treatment centre.
Stratification simply means having separate block
randomisation schemes for each combination of
characteristics (stratum)
133Stratified Block randomisation
- For example, in a study where you expect treatment effect to differ with age and sex you may have four strata
- male over 65,
- male under 65,
- female over 65
- female under 65
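A minimal sketch of stratified block randomisation in Python: each stratum keeps its own stream of permuted blocks, so balance is forced within every stratum separately. The function name `stratified_block_randomiser`, the block size, the seed and the arrival order are assumptions made for illustration; a real trial would use a validated system.

```python
import random

def stratified_block_randomiser(strata, block_size=4, seed=None):
    """Return an assign(stratum) function with separate permuted blocks per stratum."""
    if block_size % 2:
        raise ValueError("block_size must be even for 1:1 allocation")
    rng = random.Random(seed)
    pending = {stratum: [] for stratum in strata}  # unused labels per stratum

    def assign(stratum):
        if not pending[stratum]:
            # Start a new permuted block for this stratum only.
            block = ["T"] * (block_size // 2) + ["C"] * (block_size // 2)
            rng.shuffle(block)
            pending[stratum] = block
        return pending[stratum].pop()

    return assign

strata = ["male over 65", "male under 65", "female over 65", "female under 65"]
assign = stratified_block_randomiser(strata, block_size=4, seed=7)

# Hypothetical order in which patients present.
arrivals = ["female under 65", "male over 65", "female under 65",
            "female under 65", "male under 65", "female under 65"]
for stratum in arrivals:
    print(f"{stratum}: {assign(stratum)}")
```

After every 4 patients in a given stratum, that stratum has exactly 2 T and 2 C, regardless of how patients from other strata arrive in between.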
134Stratification
- If we believe that gender is a prognostic factor, that is, the treatment effect for males may be different than the treatment effect for females, then we should stratify the randomisation (and the analysis) on gender
- This does not mean that we need identical numbers of males and females in the trial, but rather that the males be equally distributed between treatment and control and the females also be equally distributed between treatment and control
135Stratification
- Example
- In RA trials there are usually about 70% females and 30% males.
- Stratification at randomisation would help ensure that each treatment group had about 70% females and 30% males.
- If we believe that males and females may have different responses to treatment this would be important.
136Blinding
137Blinding
- Many potential problems can be avoided if everyone involved in the study is blinded to the actual treatment the patient is receiving.
- Blinding (also called masking or concealment of treatment) is intended to avoid bias caused by subjective judgment in reporting, evaluation, data processing, and analysis due to knowledge of treatment.
138Hierarchy of Blinding
- open label: no blinding
- single blind: patient blinded to treatment
- double blind: patient and assessors blinded to treatment
- complete blind: everyone involved in the study blinded to treatment
139Open Label Studies
- These may be useful for
- pilot studies
- dose ranging studies
- However knowledge of treatment can lead to
- over or under reporting of toxicity
- over estimation of efficacy
- Even a small fraction of patients assigned at
random to placebo will reduce these potential
problems substantially.
140Single Blind Studies
- Usually justified when it is practically infeasible to blind the investigator
- Patients should be blinded if the endpoints are patient reported outcomes and for safety
- Where possible use blinded assessor to elicit adverse events or patient outcomes
141Double Blind Studies
- When both the subjects and the investigators are kept from knowing who is assigned to which treatment, the experiment is called "double blind"
- Serves as a standard by which all studies are judged, since it minimizes both potential patient biases and potential assessor biases
142Double Blinding Techniques
- Coded treatment groups
- Sham treatments
- If impossible try to use a blinded assessor for
assessing endpoints.
143Double Blind Studies issues
- Side effects
- Side effects (observable by patient or assessor) are much harder to blind and are one of the major ways in which blinding is broken
- Efficacy
- A truly effective treatment can be recognized by its efficacy in patients
144Hypothesis Testing
145Hypothesis Testing
- Steps in hypothesis testing: state problem, define endpoint, formulate hypothesis, choose statistical test, decision rule, calculation, decision, and interpretation
- Statistical significance: types of errors, p-value, one-tail vs. two-tail tests, confidence intervals
- Significance vs. non-significance
- Equivalence vs. superiority tests
146Descriptive and inferential statistics
- Descriptive statistics is devoted to the summarization and description of data (population or sample).
- Inferential statistics uses sample data to make an inference about a population.
147Objectives and Hypotheses
- Objectives are questions that the trial was designed to answer
- Hypotheses are more specific than objectives and are amenable to explicit statistical evaluation
148Examples of Objectives
- To determine the efficacy and safety of Product ABC in diabetic patients
- To evaluate the efficacy of Product DEF in the prevention of disease XYZ
- To demonstrate that images acquired with product GHI are comparable to images acquired with product JKL for the diagnosis of cancer
149How do you measure the objectives?
- Endpoints need to be defined in order to measure
the objectives of a study.
150Endpoints Examples
- Primary Effectiveness Endpoint
- Percentage of patients requiring intervention due
to pain, where an intervention is defined as:
- Change in pain medication
- Early device removal
151Endpoints Examples
- Primary Endpoint
- Percentage of patients with a reduction in pain
- Reduction in the Brief Pain Inventory (BPI) worst
pain scores of 2 points at 4 weeks over
baseline.
152Endpoints Examples
- Patient Survival
- Proportion of patients surviving two years
post-treatment
- Average length of survival of patients
post-treatment
153Objectives and Hypotheses
- Primary outcome measure
- greatest importance in the study
- used for sample size
- More than one primary outcome measure -
multiplicity issues
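
The multiplicity issue can be made concrete with a little arithmetic. The sketch below is an illustration only: it assumes the primary outcome measures are tested independently, each at the same significance level, and it uses the Bonferroni adjustment as one common remedy (neither assumption comes from the slides). The function names are hypothetical.

```python
def familywise_error_rate(alpha: float, k: int) -> float:
    """Chance of at least one false positive across k independent tests,
    each run at significance level alpha."""
    return 1 - (1 - alpha) ** k


def bonferroni_alpha(alpha: float, k: int) -> float:
    """Per-test level that keeps the overall false positive rate
    at or below alpha (Bonferroni adjustment)."""
    return alpha / k


if __name__ == "__main__":
    # Show how the overall error rate grows with the number of primary endpoints.
    for k in (1, 2, 3, 5):
        overall = familywise_error_rate(0.05, k)
        adjusted = bonferroni_alpha(0.05, k)
        print(f"{k} endpoint(s): overall rate {overall:.3f}, "
              f"Bonferroni per-test level {adjusted:.4f}")
```

With two independent primary endpoints each tested at 0.05, the chance of at least one false positive is already close to 10%, which is why a study with more than one primary outcome measure needs a pre-specified adjustment.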
154Hypothesis Testing
- Null Hypothesis (H0)
- Status Quo
- Usually Hypothesis of no difference
- Hypothesis to be questioned/disproved
- Alternative Hypothesis (HA)
- Ultimate goal
- Usually Hypothesis of difference
- Hypothesis of interest
155Hypothesis Testing
- Type I Error: Society's Risk
- Type II Error: Sponsor's Risk
156Hypothesis testing
- Null Hypothesis
- No difference between Treatment and Control
- Type I error, aka alpha (α): the significance
level against which the p-value is compared
- The probability of declaring a difference between
treatment and control groups even though one does
not exist (i.e., the treatment does not truly
differ from control)
- As this is society's risk, it is conventionally
set at 0.05 (5%)
157Hypothesis testing
- Type II error, aka beta (β)
- The probability of not declaring a difference
between treatment and control groups even though
one does exist (i.e., the treatment truly differs
from control but the study fails to show it)
- 1 - β is the power of the study
- Power is often set at 0.8 (80% power); however,
many companies use 0.9
- Underpowered studies have less probability of
showing a difference if one exists
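
The link between beta, power and sample size can be sketched numerically. The example below assumes a two-sided comparison of two group means with a known, common standard deviation (a normal-approximation z-test); a real trial would use the method named in its protocol. The function names and the effect size used in the demonstration are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def power_two_sample_z(delta, sigma, n_per_group, alpha=0.05):
    """Power (1 - beta) of a two-sided two-sample z-test.

    delta: true difference in means, sigma: common standard deviation,
    n_per_group: patients in each arm, alpha: type I error rate.
    """
    z = NormalDist()
    se = sigma * sqrt(2 / n_per_group)   # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)    # rejection threshold for |z|
    shift = delta / se                   # true effect in standard-error units
    # Probability that the test statistic lands in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


def n_per_group_for_power(delta, sigma, power=0.8, alpha=0.05):
    """Smallest per-group sample size reaching the requested power
    (standard normal-approximation formula)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


if __name__ == "__main__":
    # Hypothetical effect: half a standard deviation between the arms.
    for target in (0.8, 0.9):
        n = n_per_group_for_power(delta=0.5, sigma=1.0, power=target)
        achieved = power_two_sample_z(0.5, 1.0, n)
        print(f"target power {target}: n per group = {n}, achieved = {achieved:.3f}")
```

Moving from 80% to 90% power requires noticeably more patients per arm for the same effect size, which is the trade-off behind the choice between 0.8 and 0.9.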
158Steps in Hypothesis Testing
- Choose the null hypothesis (H0) that is to be
tested
- Choose an alternative hypothesis (HA) that is of
interest
- Select a test statistic, define the rejection
region for decision making about when to reject
H0
- Draw a random sample by conducting a clinical
trial
159Steps in Hypothesis Testing
- Calculate the test statistic and its
corresponding p-value
- Make conclusion according to the pre-determined
rule specified in step 3
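
The steps above can be walked through in a few lines of code. This is a minimal sketch, assuming a two-sided comparison of two group means using a large-sample z-test; with small samples like the made-up ones below a t-test would normally be preferred. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def two_sample_z_test(treatment, control, alpha=0.05):
    """Return (z statistic, two-sided p-value, reject_h0)."""
    # Steps 1-2: H0 is "no difference in means"; HA is "the means differ".
    # Step 3: the test statistic is z; reject H0 when |z| exceeds z_crit.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 4 is the trial itself: the two samples passed to this function.
    # Step 5: calculate the test statistic and its p-value.
    diff = mean(treatment) - mean(control)
    se = sqrt(stdev(treatment) ** 2 / len(treatment)
              + stdev(control) ** 2 / len(control))
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 6: apply the rule fixed in step 3.
    return z, p_value, abs(z) > z_crit


if __name__ == "__main__":
    # Made-up pain score reductions, for illustration only.
    treatment = [9.1, 10.4, 8.7, 11.2, 9.8, 10.1, 9.5, 10.9]
    control = [7.9, 8.8, 9.2, 8.1, 7.5, 9.0, 8.4, 8.6]
    z, p, reject = two_sample_z_test(treatment, control)
    print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```

The decision rule is fixed before the data are examined; the code only applies it.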
160Hypothesis Testing Normal Distribution
161Test of Significance and p-value
- Statistically significant
- Conclusion that the results of a study are not
likely to be due to chance alone.
- Clinical significance is a separate question
from statistical significance: a statistically
significant difference can be too small to matter
clinically
162Test of Significance and p-value
- p-value
- Probability of observing a relationship (e.g.,
between variables) or a difference (e.g., between
means) at least as large as the one seen in the
sample, assuming that in the population from
which the sample was drawn no such relationship
or difference exists.
- It is not the probability that a given result is
wrong.
163Test of Significance and p-value
- p-value
- The smaller the p-value, the stronger the
evidence that the relation observed between
variables in the sample reflects a real relation
between the respective variables in the
population.
164Test of Significance and p-value
- A p-level of .05 (i.e., 1/20) indicates that, if
there were no relation between the variables in
the population, a relation at least as strong as
the one found in our sample would arise by chance
alone 5% of the time.
- In other words, assuming that in the population
there was no relation between those variables
whatsoever, and we were repeating experiments
like ours one after another, we could expect that
in approximately one of every 20 replications of
the experiment the relation between the variables
in question would be equal to or stronger than in
ours.
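
The replication argument can be checked by simulation. The sketch below assumes two groups drawn from the same normal distribution (so the null hypothesis is true by construction) with a known standard deviation of 1, and counts how often a two-sided z-test gives p below 0.05. The function name, sample sizes and seed are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def simulate_null_trials(n_trials=2000, n_per_group=30, alpha=0.05, seed=1):
    """Fraction of simulated trials declared significant when there is
    truly no difference between the groups."""
    rng = random.Random(seed)
    z = NormalDist()
    se = sqrt(2 / n_per_group)  # standard error with known SD of 1 per group
    false_positives = 0
    for _ in range(n_trials):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        stat = (mean(group_a) - mean(group_b)) / se
        p_value = 2 * (1 - z.cdf(abs(stat)))
        if p_value < alpha:
            false_positives += 1
    return false_positives / n_trials


if __name__ == "__main__":
    rate = simulate_null_trials()
    print(f"Share of null trials with p < 0.05: {rate:.3f}")
```

The printed share should sit close to 0.05, that is, roughly one replication in 20.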
165Sample versus population
166Estimation
- We use results from our sample to make inferences
about the population
- How reliable are the sample data at representing
the population data?
- Is the sample mean a good estimate of the
population mean?
167Confidence Intervals
- The results of the analysis are estimates of the
truth in the population.
- The average reduction in pain score is an
estimate based on the sample in the study.
- Confidence Intervals indicate the precision of
the estimate. The wider the confidence interval,
the less precise the estimate
168Confidence Intervals
- Example
- Average reduction in pain score from baseline to
month 6 was 9.7 (95% Confidence Interval 8.3 to
11.1)
- Strictly, this does not mean there is a 95%
probability that the true result lies between 8.3
and 11.1; rather, if we were to repeat the study
100 times with the same sample size and
characteristics and compute an interval each
time, about 95 of those intervals would contain
the true mean reduction in pain score
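
The repeated-sampling reading of a confidence interval, in which about 95% of the intervals computed across repeated studies contain the true mean, can be illustrated with a short simulation. This is a sketch under stated assumptions: pain score reductions are taken to be normally distributed, the interval uses the normal approximation, and the true mean (9.7), standard deviation (5.0) and sample size (50) are hypothetical values chosen only so that the interval width resembles the example.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the mean of a sample."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = mean(sample)
    half_width = z * stdev(sample) / sqrt(len(sample))
    return centre - half_width, centre + half_width


def coverage(true_mean=9.7, sd=5.0, n=50, n_studies=1000, seed=2):
    """Fraction of simulated studies whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        low, high = confidence_interval(sample)
        if low <= true_mean <= high:
            hits += 1
    return hits / n_studies


if __name__ == "__main__":
    print(f"Share of intervals containing the true mean: {coverage():.3f}")
```

Each simulated study produces a different interval; it is the procedure, not any single interval, that captures the true mean about 95% of the time.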
169What have we learnt?
- Statistics doesn't have to be frightening.
- Statistics is all about a way of thinking
- If you don't have uncertainty you don't need
statistics
- p-values are probability statements that tell you
something about your experiment
170What haven't we learnt?
- All the detailed theory and formulae that back up
everything we have discussed
- How to be a statistician (for that you do have to
go to graduate school)
- How to get the perfect answer each time we run a
clinical trial
- We are working with patients not widgets and
human beings are incredibly complex
171References
- ICH Guidelines E9, E3 and others
- Senn S. Statistical Issues in Drug Development.
John Wiley & Sons, 1997
- Freedman B. Equipoise and the ethics of clinical
research. NEJM 1987; 317(3)
- Schulz KF. Subverting Randomization in Controlled
Trials. JAMA 1995; Vol. 274
172Thank You!
- kay.larholt_at_abtbiopharma.com