Title: Outline
1Outline
- Validity of Inference
- Theory of Validity
- Statistical Conclusion Validity
- Internal Validity
- Construct Validity (Jill)
- External Validity (Tim)
- Trade-offs (Tim et al.)
- Discussion
2Validity of Inference
3VALIDITY
- The approximate truth of an inference
- Judgment about the extent to which relevant
evidence supports the inference as being true
- Always entails fallible human judgments
- Evidence comes from both empirical findings and
their consistency with past findings and theories
- Validity judgments are never absolute
- No certainty that inferences are true or that all
possible alternatives have been falsified
4Validity of Inferences
- Validity is a property of inferences
- not of designs or methods
- Even using a randomized experiment does not
guarantee a valid causal inference
- Could be broken by
- Differential attrition
- Low statistical power
- Improper statistical analysis
- Sampling error
5Why is it important to remember that validity is
a property of a knowledge claim, not a property
of the design?
6Three Theories of Truth
- Correspondence theory
- A knowledge claim is true if it corresponds to
the world (e.g., see it raining)
- Coherence theory
- A claim is true if it belongs to a coherent set
of claims
- Pragmatism
- A claim is true if it is useful to believe it
- Philosophers do not agree on which theory of
truth is correct, and for us it doesn't matter!
- Science uses them all to approximate the truth
7The Theory of Validity is pragmatic and uses them
all
- Correspondence between empirical evidence and
abstract inferences
- Sensitive to degree of coherence between findings
and theory
- Pragmatic ruling out of alternative explanations
- Truth is a social construction!
8Campbell & Stanley, 1963
- Followed Campbell (1957) closely in defining
internal and external validity.
- Internal validity: inferences about whether "the
experimental treatments make a difference in this
specific experimental instance" (p. 5)
- External validity asks "to what populations,
settings, treatment variables and measurement
variables can this effect be generalized?" (p. 5)
9Cook & Campbell (1979) Expanded Typology of
Validity
- To draw generalized causal inferences it is
useful to treat the causal and generalizability
aspects of the inferences separately
- Statistical Conclusion Validity
- Internal Validity
- Construct Validity
- External Validity
10Corresponds to 4 Questions
- How large and reliable is covariation between the
presumed cause and the effect?
- Is the covariation causal, or would it have been
obtained without the treatment?
- Which general constructs are involved in the
persons (units), treatments, observations, and
settings (UTOS)?
- How generalizable is the locally-embedded causal
relationship over varied UTOS?
11- These questions and inferences are often
considered separately, so it is practical to have
the typology reflect that
- However, they are often related, and different
combinations are possible (e.g., internal
validity with or without construct validity)
- Interesting to consider the limits of
combinations (e.g., to what extent are both high
internal and external validity possible?)
12Threats To Validity
- Are specific reasons why we can be partly or
completely wrong in our inferences
- About covariation, causation, constructs or
variations across UTOS
- It is useful to anticipate criticisms of
inferences by considering the types of
limitations encountered by past research.
- Heuristics, such as a list of potential threats,
allow us to account for threats in the design or
by including measures of anticipated threats.
13Three Critical Questions about Threats
- For any particular experiment and finding
- How would the threat apply in this case?
- Is there other evidence that the threat is
plausible rather than just possible?
- Does the threat operate in the same direction as
the observed effect (so that it could partially
or totally explain it)?
- But ruling out threats is a falsification
enterprise, so is always limited
14Statistical Conclusion Validity
- The validity of inferences about the covariation
or correlation between the treatment and the
outcome
- How large and reliable is the covariation?
- Whether the variables covary or not
- How strongly they covary (SCC, p. 42)
15Testing covariation
- Null hypothesis significance testing (NHST)
- Common misunderstandings of p value
- NHST tells little about effect size
- Effect size bound by confidence intervals
- An alternative approach
- SCC recommend these along with the exact
probability (p) of a Type I error
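The effect-size-plus-confidence-interval approach can be sketched in a few lines. This is a minimal illustration, not a method taken from SCC: the function name `cohens_d_with_ci`, the made-up scores, and the choice of Cohen's d with a large-sample normal approximation for its standard error are all assumptions made for the example.

```python
import math
from statistics import NormalDist, mean, variance


def cohens_d_with_ci(treatment, control, level=0.95):
    """Return (d, lower, upper): standardized mean difference and an approximate CI.

    Assumption: the interval uses the common large-sample normal
    approximation for the standard error of d.
    """
    n1, n2 = len(treatment), len(control)
    # Pooled variance from the two sample variances.
    pooled_var = ((n1 - 1) * variance(treatment)
                  + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    d = (mean(treatment) - mean(control)) / math.sqrt(pooled_var)
    # Approximate standard error of d.
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d, d - z * se, d + z * se


# Hypothetical outcome scores, invented for illustration only.
treatment_scores = [12, 15, 14, 16, 13, 17, 15, 14]
control_scores = [11, 13, 12, 14, 12, 13, 11, 12]

d, lower, upper = cohens_d_with_ci(treatment_scores, control_scores)
print(f"d = {d:.2f}, {95}% CI [{lower:.2f}, {upper:.2f}]")
```

A wide interval signals an imprecise estimate of magnitude even when the p value falls below .05, which is the information NHST alone does not give.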
16Classical Interpretation of p value
- In the classic interpretation, exact Type I
probability levels tell us the probability that
the results that were observed in the experiment
could have been obtained by chance from a
population in which the null hypothesis is true
(Cohen, 1994 as cited in SCC, p. 44).
- Perhaps not the most interesting hypothesis
(SCC)
17Alternative Interpretation of p value
- p value (probability level) signifies the
confidence we can have in deciding among the
following claims
- 1) Treatment A did better than treatment B
- (sign of effect is +)
- 2) Treatment B did better than treatment A
- (sign of effect is -)
- 3) The sign is uncertain (p > .05 signifies 3,
too close to call)
18Incorrect statistical conclusions (SCC, p.42)
- 1) Whether the variables covary
- Type I error (claim of a difference when there is
none)
- Type II error (conclude that there is no effect
when in fact there is one)
- 2) How strongly they covary
- Overestimate magnitude of covariation (and
confidence in estimate of magnitude)
- Underestimate magnitude of covariation (and
confidence in estimate of magnitude)
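The two error types can be made concrete with a small simulation. This is a sketch under stated assumptions rather than anything from SCC: outcomes are drawn from normal distributions with a known standard deviation of 1, the test is a two-sided z-test, and the function name `rejection_rate`, the sample size, and the effect size of 0.5 are all hypothetical choices.

```python
import math
import random
from statistics import NormalDist, mean


def rejection_rate(true_effect, n_per_group, alpha=0.05, n_sims=2000, seed=0):
    """Share of simulated two-group experiments in which the null is rejected."""
    rng = random.Random(seed)
    # Standard error of the mean difference when both groups have SD = 1.
    std_error = math.sqrt(2 / n_per_group)
    rejections = 0
    for _ in range(n_sims):
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        z = (mean(treated) - mean(control)) / std_error
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
        rejections += p_value < alpha
    return rejections / n_sims


# No true effect: every rejection is a Type I error.
type_i_rate = rejection_rate(true_effect=0.0, n_per_group=20)
# True effect present: every failure to reject is a Type II error.
type_ii_rate = 1 - rejection_rate(true_effect=0.5, n_per_group=20)

print(f"Type I error rate:  {type_i_rate:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

With these assumed settings the Type I rate stays near alpha, while the Type II rate is large because 20 units per group gives low power for an effect of this size.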
19Threats to Statistical Conclusion Validity
- Low statistical power
- See Table 2.3 (pp. 46-7) for methods to increase
power
- Violated assumptions of the test statistics
- Fishing and the error rate problem
- Unreliability of measures
- Always attenuates bivariate relationships
- Restriction of range (floor and ceiling effects)
- Unreliability of treatment implementation
- Extraneous variance in experimental setting
- Heterogeneity of respondents (units)
- Inaccurate effect size estimation
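Two of the threats above have simple arithmetic behind them, sketched here. Both formulas are standard textbook results used as illustrations, not quotations from SCC: the fishing calculation assumes the tests are independent, the attenuation calculation assumes classical test theory, and the function names are hypothetical.

```python
import math


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one Type I error across n independent tests (fishing)."""
    return 1 - (1 - alpha) ** n_tests


def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Observed bivariate correlation after measurement unreliability.

    Assumption: classical test theory, where the observed correlation is
    the true correlation times the square root of the two reliabilities.
    """
    return true_r * math.sqrt(reliability_x * reliability_y)


for n_tests in (1, 5, 20):
    rate = familywise_error_rate(0.05, n_tests)
    print(f"{n_tests:2d} tests at alpha = .05: familywise error rate {rate:.2f}")

observed = attenuated_correlation(true_r=0.5, reliability_x=0.7, reliability_y=0.7)
print(f"True r = .50 with reliabilities of .70: observed r about {observed:.2f}")
```

The first function shows why uncorrected multiple testing inflates Type I error; the second shows why unreliable measures shrink an observed bivariate relationship.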
20Can we prove that covariation between a treatment
and an outcome is zero?
21To support the causal inferences, three things
must be established (p. 53)
- 1) A precedes B in time (use design)
- 2) A covaries with B (use statistics)
- 3) No other explanation for the relationship is
plausible (use design if possible)
22Internal Validity
- The ability to infer with confidence that an
independent variable has produced the observed
differences in the dependent variable (Singleton &
Straits, 2005, p. 188)
- Isolating the independent variable
- Controlling confounds
- Validity: the approximate truth of an inference
(SCC, p. 34)
23Internal validity
- The validity of inferences about whether observed
covariation between A (treatment) and B (outcome)
reflects a causal relationship from A to B as
those variables were manipulated or measured.
- Is the covariation causal or would the same
effect be obtained without treatment?
24Internal validity
- Local Molar Causal Validity
- Local: generalizability is zero, limited to UTOS
- Molar: treatments are a complex package
- Causal: restricted to claims that A caused B
- "One of the things that's most difficult to grasp
about internal validity is that it is only
relevant to the specific study in question"
(Trochim, 2006).
25Threats to Internal Validity
- Each threat signifies a distinct class of
extraneous other possible causes (p. 55)
- Ambiguous temporal precedence
- Selection bias
- History
- Maturation
- Statistical regression
- Attrition (a special case of selection bias)
- Testing effects
- Instrumentation
- Additive/Interactive effects of these threats
26GARDASIL DAILY DOUBLE
Threats to internal validity are not necessarily
independent of each other. Define two threats to
internal validity and explain how they could be
related / co-occur in a study.
27Randomization Controls Most Threats to Internal
Validity
- Indeed, all except
- Differential attrition
- Differential testing
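A short simulation can illustrate why random assignment removes selection as a rival explanation. This is a hypothetical sketch, not an analysis from the source: the true treatment effect is set to zero, a made-up "motivation" variable drives the outcome, and the function names `naive_effect`, `self_selected`, and `randomized` are invented for the example.

```python
import random
from statistics import mean


def naive_effect(assign, n_units=4000, seed=1):
    """Treated-minus-control difference in mean outcome under a given assignment rule."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n_units):
        motivation = rng.gauss(0.0, 1.0)
        # The true treatment effect is zero: outcome depends only on motivation and noise.
        outcome = motivation + rng.gauss(0.0, 1.0)
        (treated if assign(motivation, rng) else control).append(outcome)
    return mean(treated) - mean(control)


def self_selected(motivation, rng):
    """Selection: more motivated units choose the treatment."""
    return motivation > 0


def randomized(motivation, rng):
    """Random assignment: a fair coin flip, unrelated to motivation."""
    return rng.random() < 0.5


selection_estimate = naive_effect(self_selected)
randomized_estimate = naive_effect(randomized)

print(f"Self-selected groups: apparent effect {selection_estimate:.2f}")
print(f"Randomized groups:    apparent effect {randomized_estimate:.2f}")
```

Under self-selection the groups differ before treatment, so a spurious effect appears; under randomization the difference stays near zero. The sketch does not model dropout after assignment, which is why differential attrition remains a threat even in a randomized experiment.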
28Relating Statistical and Internal Validity
- Both concern operations (not the constructs they
represent)
- Statistical conclusion validity is concerned with
errors in assessing covariation
- Internal validity is concerned with errors in
causal reasoning
- Internal validity depends substantially on
statistical conclusion validity
29Jill and Tim
- Jill
- Construct Validity
- Tim
- External Validity
- Trade-offs
30Shadish 2011
- Evaluators discuss external validity much less
than internal validity
- Some idea of disagreements in the field(s)
- Threats to validity overlap
- E.g., attrition is listed as a threat to internal
validity. But because sample size drops, it can
threaten power (statistical conclusion validity),
may require changing how we describe who is and
is not in the study (construct validity), and may
raise questions about whether the intervention
would have the same effect in those who dropped
out (external validity).
31Threats to validity: Discussion questions
32What happens to the precision, and confidence
intervals, of effect size estimates when a study
has low power? What kind of validity is
threatened?
33A specific instance of selection bias is also
defined in SCC's list as a separate threat to
internal validity. What is it?
34Confounding of treatment effects with population
differences threatens _______ validity
35You are a part of a research team that has been
funded to tackle the adult obesity epidemic. The
hypothesis is that adults receiving the
intervention will have a healthier weight than
adults who do not receive the intervention. You
ask your boss, "How will we measure healthy
weight?" To which your boss replies, "Simple, we
will ask each participant their height and
weight." You ask, "That's it?", and your boss
replies, "Yes." You're new to the team, but
you really want to speak up because this is a
threat to _________ validity, known as
____________________.
36HERPES DAILY DOUBLE
Random sampling, though rarely performed in
experimental designs, improves what kind of
validity?
37HEPATITIS DAILY DOUBLE
You work at High Times Community College and your
coworker comes to work sharing the results of a
new study. He says, "Listen to this! In a new
study, students were randomly assigned to take
10, 15, or 20 units of course credit. Results
show that college students who took 20 or more
credits were less likely to engage in marijuana
use. So to reduce the prevalence of marijuana use
here at High Times CC, we have to implement a
policy setting a minimum of 20 credit hours for
all students!" You, having taken H699, take a
closer look at the report and see that the study
was conducted at one university: Harvard. Your
response to your colleague is, "Sorry, my friend,
but this study most likely lacks _________
because ____________________."
38You want to test out a novel approach to
reducing psychological distress among college
students. Your technique is provided to students
who come into the campus counseling center. You
conduct two-week follow-ups with these students
and see that their self-reported levels of
psychological distress have improved. You are
ready to tell your boss about the success of your
program when your colleague points out that your
study has a threat to _______ validity known as
____________.
39Secular trends pose a threat to _________ validity.
40You have completed an RCT in which you examined
the impact of an SAT preparation course on SAT
performance. You want to see if results differ
for boys versus girls. What will happen to your
power if your sample is divided by gender?