Title: Factors that threaten the validity of research findings
1Factors that threaten the validity of research
findings
- Material for this presentation has been taken
from the seminal article by Don Campbell and
Julian Stanley, "Experimental and
quasi-experimental designs for research on
teaching," which was first published as Chapter 5
in N. L. Gage (Ed.) (1963), Handbook of Research
on Teaching.
2Two classes of factors that jeopardize the
validity of research findings
- Factors concerned with internal validity.
- Do the research conditions warrant the
conclusions?
- Without internal validity results are
uninterpretable.
- Factors concerned with external validity.
- To what extent can the results be generalized?
- To what populations, settings, treatment
variables, and measurement variables?
3Factors affecting Internal Validity
- Internal validity is threatened whenever there
exists the possibility of uncontrolled
extraneous variables that might otherwise account
for the results of a study.
- Eight classes of extraneous variables can be
identified.
- History
- Maturation
- Testing
- Instrumentation
- Statistical regression
- Selection
- Research mortality
- Interactions with selection
4History
- Specific events, in addition to the treatment,
that occur between the first and second
measurement.
- The longer the interval between the pretest and
posttest, the more viable this threat.
5Maturation
- Changes in physical, intellectual, or emotional
characteristics that occur naturally over time
and that influence the results of a research study.
- In longitudinal studies, for instance,
individuals grow older, become more
sophisticated, maybe more set in their ways.
6Testing
- Also called pretest sensitization, this refers
to the effects of taking a test upon performance
on a second testing.
- Merely having been exposed to the pretest may
influence performance on a posttest.
- Testing becomes a more viable threat to internal
validity as the time between pretest and posttest
is shortened.
7Instrumentation
- Changes in the way a test or other measuring
instrument is calibrated that could account for
results of a research study (different forms of a
test can have different levels of difficulty).
- This threat typically arises from unreliability
in the measuring instrument.
- Can also be present when using observers.
8Statistical Regression
- Occurs when individuals are selected for an
intervention or treatment on the basis of extreme
scores on a pretest.
- Extreme scores are more likely to reflect larger
(positive or negative) errors in measurement
(chance factors).
- Such extreme measurement errors are NOT likely to
occur on a second testing.
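The regression effect described on this slide can be illustrated with a small simulation. This is a minimal sketch under assumed conditions, not a result from the source: each observed score is modeled as a stable true score plus independent measurement error, and the means, standard deviations, sample size, and 10% cutoff are all hypothetical choices. No treatment is applied between the two testings.

```python
import random
import statistics


def simulate_regression_to_mean(n=2000, cutoff_fraction=0.10, seed=0):
    """Simulate two testings with NO treatment between them.

    Assumed model (hypothetical): observed = true score + independent error.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50, 10) for _ in range(n)]
    pretest = [t + rng.gauss(0, 10) for t in true_scores]
    posttest = [t + rng.gauss(0, 10) for t in true_scores]

    # Select the individuals with the most extreme (lowest) pretest scores.
    k = int(n * cutoff_fraction)
    selected = sorted(range(n), key=lambda i: pretest[i])[:k]

    pre_mean = statistics.mean(pretest[i] for i in selected)
    post_mean = statistics.mean(posttest[i] for i in selected)
    return pre_mean, post_mean


pre_mean, post_mean = simulate_regression_to_mean()
print(f"selected group, pretest mean:  {pre_mean:.1f}")
print(f"selected group, posttest mean: {post_mean:.1f}")
```

Under these assumptions the selected group scores closer to the overall mean on the second testing even though nothing was done to it, which is the pattern that can be mistaken for a treatment effect.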
9Differential Selection
- This can occur when intact groups are compared.
- The groups may have been different to begin
with.
- If three different classrooms are each exposed
to a different intervention, the classroom
performances may differ only because the groups
were different to begin with.
10Selection-Maturation Interaction
- Occurs when differential selection is confounded
with maturational effects.
- The treatment group might be composed of
higher-aptitude students, or
- The treatment group might have more students who
were born during the summer months.
11Research Mortality
- The differential loss of individuals from
treatment and/or comparison groups.
- This is often a problem when research
participants are volunteers.
- Volunteers may drop out of the study if they
find it is consuming too much of their time.
- Others may drop out if they find the task to be
too arduous.
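One simple way to check for differential loss is to compare dropout rates across groups. The sketch below is illustrative only: the function names and the enrollment and completion counts are hypothetical, and the source does not prescribe any particular threshold for how large a gap is a problem.

```python
def attrition_rate(enrolled, completed):
    """Fraction of enrolled participants who did not complete the study."""
    return (enrolled - completed) / enrolled


def differential_attrition(groups):
    """Return per-group attrition rates and the largest gap between groups.

    `groups` maps a group name to an (enrolled, completed) pair.
    """
    rates = {name: attrition_rate(e, c) for name, (e, c) in groups.items()}
    gap = max(rates.values()) - min(rates.values())
    return rates, gap


# Hypothetical counts, for illustration only.
rates, gap = differential_attrition({
    "treatment": (50, 35),
    "comparison": (50, 45),
})
print(rates, round(gap, 2))
```

A large gap suggests that the groups remaining at posttest may no longer be comparable, whatever their status at the start.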
12Interaction of Selection with the Other Factors
Affecting Internal Validity
- Occurs when intact groups, which may not be
equivalent, are selected to participate in
research interventions.
- As in a previous example, three different
classrooms may be exposed to different
treatments, but one of the classrooms might be
composed of students having higher achievement
trajectories.
13External Validity
- Concerned with whether the results of a study can
be generalized beyond the study itself.
- Population validity (when the sample does not
adequately represent the population).
- Personological validity (when personal/
psychological characteristics interact with the
treatment).
- Ecological validity (when the situational
characteristics of the study are not
representative of the population).
14Factors affecting External Validity
- External validity is threatened whenever
conditions inherent in the research design are
such that the generalizability of the results is
limited.
- Four classes of threats to external validity can
be identified.
- Reactive or interactive effects of testing
- Interaction effect of selection bias and the
intervention
- Reactive effects of treatment arrangements
- Multiple treatment interference
15Reactive effect of testing
- Occurs whenever a pretest increases or decreases
the respondents' sensitivity to the treatment.
- Studies involving self-report measures of
attitude and interest are very susceptible to
this threat.
16Selection x Treatment Interaction
- This can occur when selected treatment or
comparison groups are more or less sensitive to
the treatment prior to initiating the treatment
(or intervention).
- Most likely to occur when the treatment and
comparison groups are not randomly selected.
17Reactive Effects of Experimental Arrangements
- These can occur when the conditions of the study
are such that the results are not likely to be
replicated in non-experimental situations.
- Hawthorne effects
- John Henry effects
- Placebo effects
- Novelty effects
18Multiple-treatment Interference
- This is likely to occur whenever the
same research participants are exposed to
multiple treatments.
- Sequence effects
- Carry-over effects
19Research Designs
- We will examine the operative threats to internal
and external validity in twelve specific types of
research designs.
- Some symbols to be used
20Design 1. One-shot Case Study
- This is a widely-used research design in
education.
- A single group receives a treatment or
intervention.
- Following the treatment, individuals are measured
on some outcome variable.
- It can be diagramed as follows
21Design 1. One-shot Case Study (Continued)
- This design is typical of a case study.
- Inferences typically are based upon expectations
of what the results would have been had X not
occurred.
- These designs often are subject to the error of
misplaced precision, since they often involve
tedious collection of specific detail and careful
observations.
- The problem is that there usually are numerous
rival, plausible sources of effect on the outcome
other than X.
22Design 2. One-group Pretest-Posttest Design
- This, also, is a widely-used research design in
education (see the diagram).
- A pretest is given, followed by a treatment or
intervention, followed by a posttest.
- The difference between O1 and O2 is used to infer
an effect due to X.
- This design is subject to five of the eight
threats to internal validity and one of the
threats to external validity. Can you name them?
23One-group Pretest-Posttest Design (Continued)
- Threats to internal validity
- History
- Many change-producing events may have occurred
between O1 and O2.
- History is more viable the longer the lapse
between the pretest and posttest.
- Maturation
- During the time between O1 and O2 the
individuals may have grown older, wiser, more
tired, more wary, or more cynical.
- Testing
- The fact that the participants in the study were
exposed to a pretest may, by itself, influence
performance on the posttest.
24One-group Pretest-Posttest Design (Continued)
- Threats to internal validity (continued)
- Instrumentation
- If O1 and O2 are obtained from judges (or
raters), for example, then the judges may become
more skillful between the two sets of
observations.
- Standardized achievement tests might be
re-normed between pretesting and posttesting.
- Statistical regression
- For example, if students are selected to
participate in a remedial intervention because of
extremely low scores on a pretest, they are very
likely, as a group, to score higher upon
receiving the same (or similar) test as a
posttest.
- This results mainly from errors in measurement
(or unreliability in the tests).
25Design 3. Static-group Comparison
- In this design (diagramed below) a non-random
treatment group is compared to a non-random
comparison group.
- Problems associated with this design stem from
the fact that there is no way to
substantiate that the treatment and comparison
groups were equivalent to begin with.
26Static-group Comparison (Continued)
- Threats to internal validity
- Selection
- Here, intact groups are being compared. It is
possible that the treatment group was already
prepared to do better (or worse) than the
comparison group on O; hence the treatment group
might have performed differently from the
comparison group even in the absence of X.
- Mortality
- It is possible that differences between O1 and
O2 arise because the nature of the treatment
leads its participants to drop out at
higher rates than do participants in the
comparison group.
27Static-group Comparison (Continued)
- Threats to internal validity (continued)
- Interactive effects (e.g., selection and
maturation).
- It may be that one of the groups being compared
has a higher (or lower) achievement trajectory
(e.g., when a more advanced class is compared
with a less advanced class).
- The three designs discussed so far are usually
referred to as pre-experimental designs.
- We will now turn to a consideration of three true
experimental designs.
28True Experiments
- True experiments are characterized by random
assignment:
- Random assignment of individuals to treatment
conditions.
- Random assignment of treatment conditions to
individuals.
- When comparison groups are large enough (usually,
n > 20) and individuals are selected at random,
then representativeness can be assumed.
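Random assignment of individuals to treatment conditions can be sketched in a few lines. This is a minimal illustration, not a procedure taken from the source: the function name, the participant labels, and the fixed seed (used only to make the example reproducible) are hypothetical.

```python
import random


def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them out to conditions in turn.

    Dealing round-robin after a shuffle keeps group sizes as equal
    as possible while leaving membership to chance.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups


# Hypothetical participant IDs, for illustration only.
participants = [f"P{i:02d}" for i in range(1, 41)]
groups = randomly_assign(participants, ["treatment", "comparison"], seed=42)
print(len(groups["treatment"]), len(groups["comparison"]))
```

Because chance alone decides group membership, pre-existing differences are expected to balance out across groups as group sizes grow.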
29Design 4. Pretest-posttest Control Group Design
- Here, individuals are randomly assigned to one of
two groups: the treatment group and a comparison
group.
- The treatment group receives the intervention.
- The groups are compared in terms of their
difference scores
- (M_O3 - M_O1) vs. (M_O4 - M_O2)
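The difference-score comparison can be worked through numerically. The sketch below assumes, as the slide's formula implies, that O1 and O3 are the treatment group's pretest and posttest and that O2 and O4 are the comparison group's pretest and posttest. The scores are invented for illustration, and the function names are hypothetical.

```python
from statistics import mean


def mean_gain(pre, post):
    """Mean posttest score minus mean pretest score for one group."""
    return mean(post) - mean(pre)


def gain_difference(o1, o3, o2, o4):
    """(M_O3 - M_O1) - (M_O4 - M_O2): treatment gain minus comparison gain."""
    return mean_gain(o1, o3) - mean_gain(o2, o4)


# Invented scores, for illustration only.
o1 = [10, 12, 11, 13]   # treatment group, pretest
o3 = [15, 17, 16, 18]   # treatment group, posttest
o2 = [10, 11, 12, 13]   # comparison group, pretest
o4 = [11, 12, 13, 14]   # comparison group, posttest

estimate = gain_difference(o1, o3, o2, o4)
print(estimate)
```

Here the treatment group gains 5 points and the comparison group gains 1, so the difference in gains is 4. In practice a significance test would be applied to the gain scores; that step is omitted here.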
30Pretest-posttest Control Group Design (Continued)
- This design, and the next two true-experimental
designs, control for all eight of the threats to
internal validity.
- Any differences between groups that might have
existed prior to X are (assumed to be) controlled
through random assignment.
- Any effects due to history, maturation, testing,
instrumentation, regression and so on would be
expected to occur with equal frequency in both
groups.
31Pretest-posttest Control Group Design (Continued)
- Factors affecting external validity
- Interactions between the treatment and testing.
- This occurs whenever the pretest sensitizes the
treatment group to the effects of the treatment.
- Interactions between the treatment and group
selection.
- This can happen when the population from which
the comparison group samples were selected is not
the same as the target population.
- Reactive arrangements
- Sometimes the setting for the study is
artificially restrictive. When this occurs,
generalizability suffers.
32Design 5. Solomon Four-group Design
- This design enjoys several advantages.
- Both the main effect of testing and the
interaction of testing and treatment are
testable.
- There are multiple tests of the effect of X
- O2 > O1, O2 > O4, O5 > O6, O5 > O3
33Design 6. Posttest-only Design
- Pretests are not always necessary. Given
randomization of subjects to treatment conditions,
we can assume that the groups were equivalent
prior to the treatment intervention.
- In this design all the threats to internal
validity are controlled for.
- As far as external validity is concerned, we might
still question whether there might be reactive
effects.
34Design 8. Non-equivalent Pretest-Posttest
- Most widely-used quasi-design in education
research.
- O1  X  O2
- ------------------------------
- O3     O4
- The pretests (O1 and O3) are used to determine
(and adjust for, where necessary) whether the
groups were equivalent before the onset of
treatment.
35Design 7. Time Series Design
- O1 O3 O5 O7 X9 O11 O13 O15 O17
- -----------------------------------------------
- O2 O4 O6 O8 X10 O12 O14 O16 O18
- [Figure: outcome plotted across observations
O2 to O18, with the treatment at X10]
36Design 9. Counterbalanced Designs
- X1 O1 X2 O2 X3 O3
- --------------------------------------------------
- X3 O4 X1 O5 X2 O6
- --------------------------------------------------
- X2 O7 X3 O8 X1 O9
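The treatment orders in this diagram form a Latin square: each treatment appears once in every row (group) and once in every column (time position). A minimal sketch of generating such an arrangement by cyclic rotation follows. The function names are hypothetical, and rotation is only one of several ways to build a Latin square.

```python
def cyclic_latin_square(treatments):
    """Build a Latin square by shifting the treatment list one step per row."""
    n = len(treatments)
    return [
        [treatments[(col - row) % n] for col in range(n)]
        for row in range(n)
    ]


def is_latin_square(square):
    """Check that every row and every column contains each item exactly once."""
    items = set(square[0])
    rows_ok = all(set(row) == items and len(row) == len(items) for row in square)
    cols_ok = all(set(col) == items for col in zip(*square))
    return rows_ok and cols_ok


square = cyclic_latin_square(["X1", "X2", "X3"])
for row in square:
    print(" ".join(row))
```

For three treatments this reproduces the orders shown on the slide: X1 X2 X3, then X3 X1 X2, then X2 X3 X1.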
37Treatment Reversal Design with Randomization
- R O1 O3 X5 O7 X9 O11
- ------------------------------------
- R O2 O4 X6 O8 X10 O12
38Treatment Reversal Design without Randomization
- O1 O3 X5 O7 X9 O11
- -------------------------------
- O2 O4 X6 O8 X10 O12
39Single (or few) Subject Designs
- In certain types of situations these designs are
very appropriate.
- When the target population is very small.
- Particularly applicable to clinical settings.
- When studying specific behaviors of unique
individuals.
- Individuals serve as their own controls.
- When we want to show that an intervention can
have an effect.
40Requirements of Single-Subject Designs
- External validity is often difficult to
establish.
- Internal validity requires three things
- Repeated and reliable measurement.
- Valid and reliable measuring instruments (or
techniques).
- Baseline stability.
- Single variable rule (manipulate only one
variable at a time).
41Design 8. A-B-A Withdrawal Design
- This design involves alternating phases of
baseline observation and treatment intervention,
X.
- O O O O          X O X O X O
- ______________   _______________
- Baseline Phase   Treatment Phase
- During the treatment phase the intervention is
turned on and off.
42Design 9. A-B-A Single Subject Design
- O O O O          X X X X           O O O O
- ______________   _______________   ______________
- Baseline Phase   Treatment Phase   Post-treatment
- One problem with this design is that it is
sometimes considered unethical to discontinue
treatment when the treatment has been shown to be
effective.
43Design 10. A-B-A-B Single Subject Design
- O O O O    X X X X    O O O O    X X X X
- ________   _________  ________   _________
- Baseline   Treatment  Baseline   Treatment
- The advantage is that it leaves an effective
treatment in place.
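A first look at A-B-A-B data often compares the mean level of the behavior in each phase. The sketch below is illustrative only: the phase labels (A for baseline, B for treatment), the observations, and the function name are hypothetical, and a real analysis would also examine trend and variability within each phase.

```python
from itertools import groupby
from statistics import mean


def phase_means(phases, observations):
    """Return (phase label, mean observation) for each consecutive phase."""
    results = []
    position = 0
    for label, run in groupby(phases):
        length = len(list(run))
        results.append((label, mean(observations[position:position + length])))
        position += length
    return results


# Invented data: four observations in each of the four phases.
phases = list("AAAA" + "BBBB" + "AAAA" + "BBBB")
observations = [2, 3, 2, 3, 6, 7, 6, 7, 3, 2, 3, 2, 7, 6, 7, 6]

summary = phase_means(phases, observations)
print(summary)
```

In this invented series the level rises in each treatment phase and returns toward baseline when treatment is withdrawn, which is the pattern the design is meant to reveal.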
44Other Single-Subject Designs
- There are a wide variety of single-subject
designs
- Multiple baseline designs.
- Alternating treatment designs.
- Increasing/decreasing treatment intervention
designs.
- Replicated single-subject designs.