Title: Research Methods for the Learning Sciences
1Research Methods for the Learning Sciences
- Kenneth R. Koedinger
- Philip I. Pavlik Jr.
- TA: Benjamin Shih
Lecture 2
Validity and Design
2Management Issues
- In a few minutes, I will get started with our
second lecture
- But first, I'd like to cover a few management
issues
3Management Issues
- Has everyone successfully purchased the book and
accessed the online reading for today?
- Trochim & Donnelly, Chapters 1 and 7
4Management Issues
- Has anyone had any difficulty with accessing or
posting to Google Wave?
- You should post for the next class
5Your first assignment, part 1
- Are there any questions or concerns on part 1 of
the first assignment?
6Quibbles
- Discussing, or even quibbling, about details of
examples is a good thing, since it helps us think
about the concepts discussed
- Although...
7The Trouble With Quibbles
8They Multiply!
9Three Types of Study
- Descriptive Studies
- Relational Studies
- Many people call them correlational studies
- Like me
- Causal Studies
- Can you define each type?
10Three Types of Study
- Descriptive Studies
- Correlational Studies
- Causal Studies
- Who here has done studies of each type? (Say a
little more?)
11Feasibility and Validity
- Descriptive Studies
- Correlational Studies
- Causal Studies
Feasibility
Validity
12Feasibility and Validity
- A tradeoff you will see many times
13Issues With Correlational Studies
- Does the Dog Wag the Tail?
- Or Does the Tail Wag The Dog?
14The Tail Wagging The Dog
- Fowler et al. (2005) report that "There is a 41%
increase in risk of being overweight for every
can or bottle of diet soft drink a person
consumes each day."
- People who drink more diet soda gain weight
- Therefore, diet drinks must be stimulating
appetite, and making people eat more and gain
weight, right?
15The Tail Wagging The Dog
- Well, maybe
- But maybe people drink more diet soda because
they are gaining weight or are already overweight
- With just a correlation, you can't tell
- With a causal study, you can
- So how would you make this study causal?
16Free Soda
- You get all the free diet soda you can drink,
but you over there, you get all the free regular
soda you can drink.
- See the later section on Ethics
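The difference between letting people pick their own soda and assigning it at random can be sketched in a small simulation. This is a toy illustration, not the Fowler et al. analysis: the BMI distribution, the 0.8/0.2 choice probabilities, and the names `baseline_gap`, `self_selected`, and `randomized` are all assumptions invented for the example.

```python
import random

def baseline_gap(assign, n=10000, seed=2):
    """Mean baseline BMI of the diet-soda group minus the regular-soda group."""
    rng = random.Random(seed)
    diet, regular = [], []
    for _ in range(n):
        bmi = rng.gauss(27, 4)  # hypothetical baseline BMI, before any soda
        (diet if assign(bmi, rng) else regular).append(bmi)
    return sum(diet) / len(diet) - sum(regular) / len(regular)

def self_selected(bmi, rng):
    # Assumption: heavier people are more likely to choose diet soda.
    return rng.random() < (0.8 if bmi > 27 else 0.2)

def randomized(bmi, rng):
    # Random assignment: a coin flip that ignores baseline BMI.
    return rng.random() < 0.5

gap_self = baseline_gap(self_selected)
gap_random = baseline_gap(randomized)
print(f"baseline gap, self-selected: {gap_self:.2f}")
print(f"baseline gap, randomized:    {gap_random:.2f}")
```

Under these assumptions the self-selected groups already differ in weight before anyone drinks anything, while the randomized groups start out roughly equal, so a later difference is easier to attribute to the soda.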
17Issues with Correlational Studies
- The Third-Variable Problem
- Not always referred to by this name, but
definitely always important
18Let's Consider a Possible Relationship
- We are handing out slips describing a
correlational relationship
- Please write down a third variable that could
directly lead to an increase in both
variables (it could be a quantitative or a
categorical variable)
19Let's read out what you've got
- Read out the original relationship and your third
variable
20As you can see
- A lot of different third variables can explain
the relationships found
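A short simulation can make the third-variable problem concrete. In this sketch, which uses made-up data, X and Y never influence each other; both are driven by a hypothetical third variable Z. The helper names `pearson` and `simulate_third_variable` are illustrative, and subtracting Z directly works only because the simulation fixes Z's effect at exactly 1.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_third_variable(n=5000, seed=0):
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]   # hypothetical third variable
    x = [zi + rng.gauss(0, 1) for zi in z]    # X depends on Z plus noise
    y = [zi + rng.gauss(0, 1) for zi in z]    # Y depends on Z plus noise
    # Remove Z's contribution (known to be exactly 1 * Z in this toy setup).
    x_rest = [xi - zi for xi, zi in zip(x, z)]
    y_rest = [yi - zi for yi, zi in zip(y, z)]
    return pearson(x, y), pearson(x_rest, y_rest)

r_raw, r_without_z = simulate_third_variable()
print(f"correlation of X and Y:            {r_raw:.2f}")
print(f"correlation after removing Z:      {r_without_z:.2f}")
```

X and Y come out clearly correlated, yet the correlation all but vanishes once Z's contribution is removed, which is the pattern a third variable would produce.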
21The Just So Story Problem
- As a class, you were able to find reasonable
explanations for two contradictory hypotheses
- This is called the Just So Story problem
- People can find a reasonable-sounding explanation
for just about any finding
- Which is why we should always question both our
findings, and our reasonable-sounding
explanations for them
22Confirmation Bias
- A particular danger is when you find what you're
expecting to find
- You may not double-check your results quite as
carefully as when your results are surprising
- Always double-check everything and keep records!
- Coding errors, mis-copied data, subjects
eliminated for good reasons without propagating
the change to the sample pool, using the wrong
variable in an analysis, running the wrong test
23Exception and Ecological Fallacies (from Chapter
1)
- Roughly opposites of each other
- Ecological fallacy
- A property general to a group is assumed to apply
to all group members
- Students who have used Cognitive Tutors know more
math than students who used traditional curricula
- Therefore Sheela (who used a Cognitive Tutor)
knows more math than Indira (who used a
traditional curriculum)
- Exception fallacy
- A property found in one individual is assumed to
apply to the whole group
- Roberto used a Cognitive Tutor and cannot
distinguish categorical variables from numerical
variables
- Therefore all students who used a Cognitive Tutor
will have this difficulty
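The ecological fallacy can be illustrated with a quick simulation. The numbers here are invented for the sketch (tutor students averaging 70, traditional students averaging 65, both with a standard deviation of 10), and `simulate_pairs` is a hypothetical helper; nothing below comes from real Cognitive Tutor data.

```python
import random

def simulate_pairs(n=20000, seed=1):
    """Compare group means, then compare randomly paired individuals."""
    rng = random.Random(seed)
    tutor = [rng.gauss(70, 10) for _ in range(n)]        # made-up scores
    traditional = [rng.gauss(65, 10) for _ in range(n)]  # made-up scores
    mean_gap = sum(tutor) / n - sum(traditional) / n
    # Fraction of random pairs in which the traditional student scores higher.
    reversed_share = sum(t < c for t, c in zip(tutor, traditional)) / n
    return mean_gap, reversed_share

mean_gap, reversed_share = simulate_pairs()
print(f"group mean difference: {mean_gap:.1f} points")
print(f"pairs where the traditional student scores higher: {reversed_share:.0%}")
```

Even with a real group-level advantage, a sizable share of "Sheela vs. Indira" pairings go the other way under these assumptions, so the group result does not license a claim about any particular pair.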
24Now, Validity!
- What are
- Conclusion validity
- Internal validity
- Construct validity
- External validity
- Ecological validity
25Sub-categories of External Validity
- Non-representative and/or nonrandom sample of
users
- Inappropriate tasks
- Inappropriate measures
26Ecological vs. External Validity
- A critical issue in studies of learning is
- whether they generalize to people and places
(have 'external validity')
- that are representative of "real life" (an
ecological validity concern)
- Ecological validity, in common usage
- not about generalization to real-life
situations
- about whether the "methods, materials and
settings" are similar (or identical) to real
life.
- One can separate the ideas
- ecological validity is about real-world
relevance
- external validity is about generalizability
27Examples: ecological vs. external validity
distinction
- Strong ecological validity, but lower external
validity
- Koedinger, Anderson, Hadley, & Mark study
- Strong ecological validity because methods,
materials, and setting are real classroom
instruction in real schools
- Not strong external validity because the study was
only done with urban students in Pittsburgh
- Strong external validity, but lower ecological
validity
- Lab study of seductive details finds that
instruction that does not include interesting but
ultimately irrelevant details leads to better
learning, for students of a variety of ages;
performed at 2 universities, with children of
different socio-economic status (SES) and race
- Strong external validity because it was
demonstrated across a range of persons and
places, but because it was done in the lab, it
may not have high ecological validity
- Maybe seductive details only have benefit in
ecologically valid settings, with distractions,
where they increase attention
28Study features to consider for external &
ecological validity
- External validity
- Generalizability of study features
- Trochim, 2nd edition: persons, places, times
- Brewer (2000) (see Wikipedia): settings
(places), procedures, participants (persons)
- Koedinger: procedures, materials
- Ecological validity
- Relevance of study features to the real world
- Brewer (2000) (see Wikipedia): methods
(procedures), materials, setting
30Ecological validity increases probability of
external validity
- It is commonly conjectured that high ecological
validity is likely to improve external validity.
- A study done in a classroom rather than the lab
(more ecologically valid) is more likely to
generalize to other classrooms (external
validity) than a lab study
- Not clear that this common conjecture has been
proven
- How would one prove it?
- But a good rule of thumb is: the more similar
your study is to the context of application
(ecological), and the more different the contexts
of study (external), the more likely your results
will generalize to natural settings with other
people, procedures, places, and times (ecological
and external)
31Example (Baker, D'Mello, Rodrigo, & Graesser, in
preparation)
- Is boredom or frustration more persistent over
time, as students use a learning environment?
- If we just did one study, you might ask
- Will this effect be general across contexts,
student ages, cultures, learning systems,
domains, etc.?
32Example (Baker, D'Mello, Rodrigo, & Graesser, in
preparation)
- So we ran studies analyzing this
- USA, college students, lab study, AutoTutor,
computer literacy domain
- Philippines, 17-19 year olds, classroom study,
The Incredible Machine, concrete problem-solving
domain
- Philippines, 12-15 year olds, classroom study,
Aplusix, algebra
- And got the same result (boredom is much more
persistent)
33Example (Baker, D'Mello, Rodrigo, & Graesser, in
preparation)
- Do these three studies have external validity?
- Do these three studies have ecological validity?
34Another key feature
- Participant motivation, affect, knowledge
factors.
- Example: Study with students in classroom,
materials from course -> ecologically valid
- But, students not getting a grade -> may approach
task differently; results may differ
- E.g., a treatment designed to enhance motivation
may work better than it does when it is applied
as an actual, graded part of a class
35A quiz
36Let's consider a few examples
- Vote on which type of validity is violated (any
of the five, could be multiple, could even be
none)
- Explain your reasoning
37Which type of validity is violated?
- Students who read bug messages perform more
poorly on post-test
- So bug messages hurt learning!
- Example bug message: "You have chosen a
categorical variable for the X axis; however,
scatterplot graphs can only contain numerical
variables."
(Baker, Corbett, Koedinger, Schneider, 2004)
38Which type of validity is violated?
- I have proven that students learn more Calculus
from my Calculus tutoring system
- Here is my test, used both pre and post
- How well do you know Calculus?
- 1 (Not well)   2   3   4   5 (Very well)
39Which type of validity is violated?
- My new tutoring system is much better than the
previous tutoring system!
42Which type of validity is violated?
- I conducted a study comparing my new tutoring
system to a previous one
- Students who completed the whole tutoring system
performed significantly better on post-test in
the experimental condition than control condition
- Oops, did I mention only 3% of students completed
the whole tutoring system in the experimental
condition?
43Which type of validity is violated?
- Now that I have tested my new learning
environment that responds to off-task behavior by
giving it to single students in the guidance
counselor's office after school, we can be
confident it will work in all school settings
44Which type of validity is violated?
- Now that I have tested my new learning
environment with a set of 10 8th graders in
Tuktoyaktuk (Northwest Territories, Canada),
all bilingual English-Inuvialuit, with parents
who work in the mine nearby, we can be confident
it will work for all students
45Which type of validity is violated?
- Now that I have tested my new learning
environment with a set of 41 8th graders in a
predominantly upper-class Caucasian suburb of
Pittsburgh, we can be confident it will work for
all students
46Threats to Validity
- Selection threat / Self-selection threat
- Internal validity (Accuracy of cause-effect
inference)
- History threat
- Maturation threat
- Testing threat
- Instrumentation threat
- Mortality threat
- Regression threat
- Social/Motivational threats
- Diffusion of treatment
- Compensatory rivalry/resentful demoralization
- Compensatory Equalization
- Demand threat
47Confounding
- What is a confounding variable?
- Examples?
48Regression toward the mean example
- (From davidmlane.com)
- "Consider an actual study that received
considerable media attention. This study sought
to determine whether a drug that reduces anxiety
could raise SAT scores by reducing test anxiety.
A group of students whose SAT scores were
surprisingly low (given their grades) was chosen
to be in the experiment.
- These students, who presumably scored lower than
expected on the SAT because of test anxiety, were
administered the anti-anxiety drug before taking
the SAT for the second time. The results
supported the hypothesis that the drug could
improve SAT scores by lowering anxiety: the SAT
scores were higher the second time than the first
time. Since SAT scores normally go up from the
first to the second administration of the test,
the researchers compared the improvement of the
students in the experiment with nationwide data
on how much students usually improve. The
students in the experiment improved significantly
more than the average improvement nationwide. The
problem with this study is that by choosing
students who scored lower than expected on the
SAT, the researchers inadvertently chose students
whose scores on the SAT were lower than their
"true" scores. The increase on the retest could
just as easily be due to regression toward the
mean as to the effect of the drug. The degree to
which there is regression toward the mean depends
on the relative role of skill and luck in the
test."
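The selection effect described in the quote can be reproduced in a simulation with no drug at all. This is a sketch under stated assumptions: scores are a fixed "true" score plus independent luck on each sitting, the spreads (80 and 40 points) and the 50-point cutoff are invented, and selecting on "first score far below true score" stands in for "surprisingly low given their grades". The function name `simulate_retest` is hypothetical.

```python
import random

def simulate_retest(n=20000, noise_sd=40.0, cutoff=-50.0, seed=3):
    """Return (mean gain for everyone, mean gain for selected, number selected)."""
    rng = random.Random(seed)
    all_gains, selected_gains = [], []
    for _ in range(n):
        true_score = rng.gauss(500, 80)               # skill component
        first = true_score + rng.gauss(0, noise_sd)   # skill + luck, sitting 1
        second = true_score + rng.gauss(0, noise_sd)  # skill + new luck, sitting 2
        gain = second - first
        all_gains.append(gain)
        # Select students whose first score was surprisingly low.
        if first - true_score < cutoff:
            selected_gains.append(gain)
    return (sum(all_gains) / len(all_gains),
            sum(selected_gains) / len(selected_gains),
            len(selected_gains))

mean_gain_all, mean_gain_selected, n_selected = simulate_retest()
print(f"mean gain, all students:      {mean_gain_all:.1f}")
print(f"mean gain, selected students: {mean_gain_selected:.1f} (n={n_selected})")
```

The selected students improve substantially on the retest even though the simulation contains no treatment and no practice effect, while the full population shows essentially no change. Raising `noise_sd` relative to the spread of true scores makes the effect larger, matching the quote's point about the relative role of skill and luck.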
49Any issues with this example?
50Feasibility
- One of the big things you crash into, when
planning a study or a program of research, is the
need for feasibility
- It would be awesome if we all had access to
unlimitedly large subject pools, in any setting
we wanted
51Feasibility
- It would be awesome if we all had access to
unlimited research support for things like
running studies and coding data
52Feasibility
- Often, when a study we want to do is not quite
feasible, we can find corners to cut to make it
possible
- The key is finding the right corners to cut
53That Said
- Being willing to do something painful that no one
else has been willing to do so far can enable
great new research
- Like driving out to schools every morning at 7am
for 2 months in 3 separate years (Ryan's
dissertation)
54But
- It's even better to discover a new method that
provides data which is verifiably almost as
good, with vastly less effort
55Experimental Design Feasibility Considerations
- Cost of running experiment
- Subjects, experimenter time, equipment
- Converting results into economic or practical
terms
- Important trade-offs
- Lower cost for subjects vs. higher reliability/
believability of results
- More pilot subjects/time vs. faster/cheaper
results but with greater risk
56Ethics
- This is a big issue
- It is not one that can be summarized in just a
few minutes
- These days there is often a lot of paperwork
- CMU is sometimes extremely reasonable about this
- But there have been real abuses in the past
- And not just in the past
57Ethics
- I feel odd not saying much about ethics; it's a
very key subject
- But at some level, ethics is a key part of the
apprenticeship model of graduate school
- I genuinely believe that it's hard to teach out
of context
59Guidelines
- Protect people's anonymity
- Enable people to give informed consent, as much
as possible
- Give people an avenue for complaint
- Don't use conditions known to be bad unless
you're going to compensate for it somehow
- If unexpected bad things happen, don't ignore them
- The subject is always right (until they leave the
scene)
60Ethical Guidelines
- Does anybody want to disagree with any of these
guidelines?
- Does anybody want to add in some other guidelines
they think are important?
61Thanks!
- Make sure to read Trochim chapters 8, 9, 10 for
next week!