1
Week 2
  • Research Design

2
PART 1
  • Scientific Theory: Its Nature and Utility
  • Its Elements: Concepts and Definitions
3
Naive Science and Theory
  • People regularly observe events around them and
    speculate about their causes.
  • Personal observations frequently form the basis
    of people's explanations of these events.
  • In these instances, people are behaving like
    scientists, in part. They are trying to
    understand and explain events and predict
    outcomes.
  • But they are doing so without awareness of the
    rules of science, hence the term naïve science.

4
Naive Science and Theory
  • As naïve scientists, we try to understand some
    interesting situation in a way that will predict
    or explain its operation.
  • A Definition of Theory
  • A set of interrelated constructs (concepts),
    definitions, and propositions that present a
    systematic view of phenomena by specifying
    relations among variables, with the purpose of
    predicting and explaining the phenomena.
    (Kerlinger, Foundations of Behavioral Research,
    1986).
  • Naïve science/understanding is a kind of
    theory, but it could be considered mere
    speculation. We'll use the term theory to mean
    a simplified explanation of reality.

5
Theory: Its Purpose and Components
  • The goal: to predict and explain events.
  • Important practical ramifications.
  • A theory achieves prediction and explanation by
    stating relationships between concepts, when they
    are operationalized as variables.
  • Variables: things that vary (take on different
    intensities, values, or states).
  • Concepts (or constructs): the mental image of
    the thing that varies.
  • Example: Fire is the concept; size, heat, or
    other details about the fire are the variables
    based on the concept.

6
Naive Theory Building An Example
  • Jill decides to vacation at an ocean resort. The
    first day at the beach, the water is warm and
    great for swimming. The second day, it is very
    cold. The next day, the water is again very warm.
    This phenomenon (variability of the water
    temperature) interests her, because she likes to
    swim, but doesn't like cold water.

What is the cause of this day-to-day variation?
7
Potential Contributing Factors
  • The sun has been out each day. Jill reasons that
    the sun can't be the cause of differing water
    temperatures. She therefore doesn't include the
    sun in her naïve theory.
  • She observes the water very carefully each day
    and notices the water is clearer on the days it
    is cold, and murkier on the days that are better
    for swimming.
  • Jill can now predict whether swimming will be
    good by observing the clarity of the water. We
    can stop now if the goal is merely to pick the
    best days to swim!
  • But the identification of the pattern doesn't
    explain why the water temperature should shift.

8
Additional Factors to Consider
  • Jill next notices a relationship between
    variations in the prevailing wind direction on
    the previous day and the water temperature the
    next day. Days with winds out of the Northeast
    are followed by days with cold water. Days with
    winds from another direction are followed by warm
    water.
  • Why should wind direction affect water
    temperature? She consults a map and improves her
    naïve theory by adding some process or mechanism
    to explain these events.
  • Open ocean lies to the Northeast, while the bay
    she swims in is protected on all other sides by
    land. Thus, one possibility is that the Northeast
    winds may blow colder deep ocean waters (which
    are clearer, as less algae grow in cold
    temperatures) into the bay.

9
The Beginnings of Theory Development
  • Jill has identified variables (bay water
    temperature, bay water clarity, and wind
    direction) and specified relationships among
    them.
  • She is likely to call one of these variables the
    cause, and the other two variables the effects.
  • Jill now has an intuitive idea of what
    constitutes a causal relationship. It is:
  • A specific condition of a variable (Northeast
    wind) which occurs earlier in time than a
    corresponding condition of another variable (cold
    water), combined with some reasonable explanation
    for the relationship between these two variables
    (the nature of the geography of the region).

10
Is Jill finished?
  • Given the data thus far, it is too early to
    accept the proposed causal explanation. So what's
    next?
  • Jill should collect more data, so she extends her
    vacation for a month and continues her
    observations.
  • If the pattern continues, we can be increasingly
    certain that the relationship accurately reflects
    reality. More evidence can improve the
    probability that Jill's theory is true.
  • Naïve scientists will consider their personal
    observations to be sufficient to construct a
    completed theory. For the true scientist,
    personal observations are only the beginning.
  • The scientific method: a highly formalized,
    systematic, and controlled approach to theory
    development and testing.

11
Testing Theories: Naïve Science vs. Science
  • Naive scientists are likely to be satisfied with
    Jill's evidence because it is self-evident,
    common sense, what any reasonable person
    would conclude.
  • It is important to rule out alternative
    explanations (competing causes of the phenomena)
    by building controls into the experimental
    design.
  • There are many procedures to guard against biased
    testing of theories:
  • Randomization (random selection and random
    assignment).
  • Appropriate research design and methodology.
  • Valid and reliable instrumentation.
  • Statistical procedures.

12
Methods of Knowing (fixing belief)
  1. Method of Tenacity: Least sophisticated, but
    commonly used. Establishes explanations by
    asserting that something is true because it is
    commonly known to be true. Occurs entirely within
    a given individual and is therefore subject to
    their beliefs, values, and idiosyncrasies.
    Surprisingly resistant to contrary evidence.
  2. Method of Authority: Truth is established when
    something or someone held in high regard states
    the truth. Relies on the actual truth of the
    expert or source. Widespread in marketing.
    Potentially dangerous.
  3. Method of Reasonable Men (a priori method): Relies
    on the idea that the propositions are
    self-evident or reasonable. The criterion for fixing
    belief lies in the reasonableness of the
    argument and how reasonable is defined. May
    agree with reason but not with the observable facts.
  4. Scientific Method: Critical shift: all three
    previous methods are focused inward. Science
    shifts the locus of truth from single individuals
    to groups, by establishing mutually agreed-upon
    rules for establishing truth.

13
Basic Requirements of the Scientific Method
  1. The Use and Selection of Concepts
  2. Linking Concepts by Propositions
  3. Testing Theories with Observable Evidence
  4. Defining Concepts
  5. Publication of Definitions and Procedures
  6. Control of Alternative Explanations
  7. Unbiased Selection of Evidence
  8. Reconciliation of Theory and Observation
  9. Limitations of the Scientific Method

14
  1. The Use and Selection of Concepts: First,
    develop a verbal (conceptual) description or name
    for the events. Here we seek to explain events
    by linking two concepts: a cause to an
    effect. Scientists arrive at causally related
    concepts through a thorough review of previous
    research, by using logical deduction, and through
    insight and personal observations.
  2. Linking Concepts by Propositions: To explain a
    phenomenon, we must specify the functional
    mechanism whereby changes in variable A (a
    cause) should lead to changes in some variable
    B (an effect). Such a functional statement
    distinguishes causal relationships (which
    have such an explanation) from covariance
    relationships (which do not).
  3. Testing Theories with Observable Evidence: No
    theory is regarded as probable truth until it has
    been empirically tested against some observable
    reality.

15
  4. Defining Concepts: Testing theory against
    observable evidence generates this requirement: we
    must bridge the gap between theory (stated at a
    high level of abstraction) and observation (which
    occurs at a very concrete level).
  • The gap is bridged by defining both the meanings
    of concepts and the indicators or measures used
    to capture those meanings, a process that
    produces an operational definition.
  • An operational definition adds three things to
    the theoretical definition:
  • - Describes the unit of measurement
  • - Specifies the level of measurement
  • - Provides a mathematical or logical statement
    that clearly states how measurements are to be
    made and combined to create a single value for
    the abstract concept.

16
(No Transcript)
17
  5. Publication of Definitions/Procedures: The
    scientific method is public. All other
    researchers need to have the ability to carry out
    the same procedures to arrive at the same
    conclusions. This requires that we be as explicit and
    objective as possible in stating and publicizing
    definitions and procedures.
  6. Control of Alternative Explanations: Scientific
    studies must be designed to rule out alternative
    causes. Isolating a true causal variable means
    that these other confounding variables have to be
    identified and their effects eliminated or
    controlled.
  7. Unbiased Selection of Evidence: The decision to
    accept a theory as probably true or probably
    false will be based on observations of limited
    evidence (e.g., a few hundred college students).
    Generalizing results beyond the (limited) study
    sample requires the evidence to be selected so as
    to eliminate biases and be representative of some
    broader population.

18
  8. Reconciliation of Theory and Observation: The degree
    of agreement between what theory predicts we
    should observe and what we actually observe is the
    basis of the self-correcting nature of this
    iterative approach.
  9. Limitations of the Scientific Method: The scientific
    method cannot be used when objective observation
    is not possible (e.g., determining whether a
    social policy is good or bad, if objective
    measurement of good and bad is not possible).
    Basic beliefs or assumptions are not testable
    propositions, as they can never be disproved, and
    thus cannot be investigated scientifically.

19
PART 2
  • Types of Relationships
  • Testing Hypotheses: Confounds and Controls

20
Types of Relationships: Null, Covariance, Causal
  1. Null relationship: No relationship at all.
    Concepts operate independently of each other.
  2. Covariance relationship: Concepts vary together
    (directly or inversely).
  3. Causal relationship: Concepts covary (are
    related), changes in one concept precede changes
    in the other concept, and a causal relationship
    between the two (a cause and an effect) can be
    justified logically.

21
(No Transcript)
22
  • Covariance relationships can provide prediction,
    but not necessarily a valid explanation of the
    relationship.
  • Accepting covariance relationships as true
    without empirical testing fails to identify
    spurious relationships. Two variables may covary
    because they are both the effects of a common
    cause.

23
The unobserved, but real, causal variable (Amount
of Education) is termed a confounding variable,
since it may mislead us by creating the
appearance of a relationship between the observed
variables.
24
Covariation vs. Causality: Key Differences
  • Covariance alone does not imply causality.
  • Covariance merely means that a change in one
    variable is associated with a change in the other
    variable.
  • Causality requires that a change in one variable
    (IV) creates the change in the other (DV).
  • Covariance is one of four conditions that must be
    met. The others are:
  • Spatial Contiguity (connected in the same time
    and space).
  • Temporal Ordering (change in the IV occurs before
    the change in the DV).
  • Necessary Connection (statement specifying why
    the cause can bring about a change in the
    effect).

25
Covariation vs. Causality: An Example
  • Consider the example of an observed relationship
    between the first letter of a person's last name and
    the person's exam grade.
  • Spatial Contiguity requirement? Yes
  • The name and the exam score both exist within the
    same person.
  • Covariance requirement? Yes
  • Last names A through M scored lower than the
    others.
  • Temporal Ordering requirement? Yes
  • A person's last name was established before exams
    were taken.
  • Necessary Connection requirement? Not so fast
  • Is there a sensible reason why a person's last
    name should create different levels of
    performance on an exam?

26
Covariation vs. Causality: An Example
We expect persons with higher incomes to read
more newspapers (covariation) because the income
provides the purchasing power and leisure time
for such readership (necessary connection).
We expect older persons to read more newspapers
(covariation) for two reasons: they have fewer
children at home and thus more leisure time, and
they developed the habit of reading before the
dominance of TV and the Internet (necessary
connection).
27
Spurious Relationships
A city's ice cream sales are found to be highest
when the rate of drownings in the city's swimming
pools is highest. To allege that ice cream sales
cause drowning, or vice-versa, would be to imply
a spurious relationship between the two. In
reality, a third variable, in this instance a
heat wave, more likely caused both.
28
Testing Hypotheses: Confounds and Controls
  • Life would be simpler if every effect variable
    (DV) had only one cause!
  • This is hardly ever the case, and it becomes
    difficult to sort out how variables affect each other.
  • An observed covariance relationship between two
    variables could occur because of some real
    relationship or due to the spurious effect of a
    third confounding variable.
  • Suppose we are interested in determining whether
    there is a real relationship between Exposure to
    Movie Violence and the Number of Violent Acts
    committed by adolescents.

29
  • If we ignore, or are unaware of, the confounding
    variable (Predisposition to Violence) we may
    erroneously conclude that all change in the
    number of Acts of Violence is due to the direct
    action of level of Exposure to Movie Violence.

30
Controlling for Confounding Variables
  • Identifying Control Variables
  • Internal Validity, External Validity, and
    Information
  • Methods for Controlling Confounding Variables

31
Identifying Control Variables
32
Internal Validity, External Validity, and
Information
  • Internal Validity: the extent to which we can be
    sure that no confounding variables have obscured
    the true relationship between the variables in
    the hypothesis test; that is, that a change in the
    IV causes a change in the DV.
  • External Validity: the ability to generalize from
    the results of a study to the real world.
  • Information: pertains to the amount of
    information we can obtain about any confounding
    variable and its relationship with the relevant
    variables.

33
Methods of Controlling Confounding Variables
  • Manipulated Control: we eliminate the effect of a
    confounding variable by not allowing it to vary
    (e.g., selecting and/or matching subjects on
    potentially important confounding variables).
  • Statistical Control: we build the confounding
    variable(s) into the research design as
    additional measured variables.
  • Randomization: randomly assign study participants
    to the experimental groups or conditions so that
    the potential effects of confounding variables
    are distributed equally among the groups.

34
Manipulated Control: Eliminating effects of
confounding variables through research design and
sampling decisions
  • Example:
  • A researcher investigating the effects of seeing
    justified violence in video games on children
    knows that young children cannot interpret the
    motives of characters accurately. She decides to
    limit her study to older children only, to
    eliminate random responses or unresponsiveness of
    younger children.

35
Statistical Control: Confounding variables are
measured, and mathematical procedures are used to
remove their effects
  • Example:
  • A political communication researcher interested
    in studying emotional appeals versus rational
    appeals in political commercials suspects that
    the effects vary with the age of the viewer. She
    measures age, and uses it as an independent
    predictor to isolate, describe, and remove its
    effect.
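As a rough illustration of statistical control, the sketch below uses made-up data and an assumed numpy dependency (variable names are hypothetical, not from the slide): the measured confounding variable, age, is entered into a regression alongside appeal type so its effect can be estimated and separated out.

```python
# A minimal sketch of statistical control with synthetic data (assumptions noted above).
import numpy as np

rng = np.random.default_rng(0)
n = 200
appeal = rng.integers(0, 2, n)            # 0 = rational appeal, 1 = emotional appeal
age = rng.uniform(18, 80, n)              # measured confounding variable
attitude = 2.0 * appeal - 0.05 * age + rng.normal(0, 1, n)  # hypothetical outcome

# Design matrix: intercept, appeal, age (age serves as the statistical control)
X = np.column_stack([np.ones(n), appeal, age])
coefs, *_ = np.linalg.lstsq(X, attitude, rcond=None)
print("intercept, appeal effect, age effect:", np.round(coefs, 2))
```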

36
Randomization: Unknown sources of error are
equalized by randomly assigning subjects to
research conditions
  • Example:
  • Many different factors are known to affect the
    amount of use of Internet social networking
    sites. A researcher wants to test two different
    site designs. He randomly assigns subjects to
    work with each of the two designs. This approach
    aims to distribute the amount of confounding
    error from unknown factors equally across groups.
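A minimal sketch of this kind of random assignment, using only Python's standard library and hypothetical subject IDs (not from the slide):

```python
# Shuffle the pool and split it, so every subject has an equal chance of
# landing in either site-design condition and unknown confounds are spread
# across groups on average.
import random

subjects = [f"S{i:02d}" for i in range(1, 21)]  # 20 hypothetical participants
random.seed(42)                                  # fixed seed for a reproducible example
random.shuffle(subjects)

design_a = subjects[:len(subjects) // 2]
design_b = subjects[len(subjects) // 2:]
print("Design A:", design_a)
print("Design B:", design_b)
```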

37
Methods of Controlling Confounding Variables: A
Summary
  • Manipulated and statistical control give high
    internal validity, while randomization is a bit
    weaker.
  • Statistical control and randomization give high
    external validity, while manipulated control is
    weaker.
  • The key difference between randomization and the
    other techniques is that randomization doesn't
    involve identifying/measuring the confounding
    variables.
  • A major advantage of randomization is that we can
    assume that all confounding variables have been
    controlled to a certain extent, but any random
    process will occasionally result in
    disproportionate outcomes. Randomization also
    provides little information about the action of
    any confounding variables.

38
PART 3
  • Classes of Research Variables
  • Measurement: The Foundation of Scientific
    Inquiry
  • Essential Elements of Research: Reliability,
    Validity, Control, and Importance

39
Classes of Research Variables: Variables
defined by their use in research
40
Classes of Research Variables: Levels of
Measurement
Depending on our operational definition, a
measurement can give us differing kinds of
information about a theoretical concept.
  1. Nominal. A variable made up of discrete,
    unordered categories. Each category is either
    present or absent, and categories are mutually
    exclusive and exhaustive (e.g., gender).
  2. Ordinal. A variable for which different values
    indicate a difference in the relative amount of
    the characteristic being measured. Not always
    possible to determine the absolute distance
    between adjacent categories.
  3. Interval. A variable for which equal intervals
    between variable values indicate equal
    differences in amount of the characteristic being
    measured.
  4. Ratio. Ratios between measurements as well as
    intervals are meaningful because there is a
    starting point (zero).

41
Nominal Measurement: An Example
A nominal measurement makes a simple distinction
between the presence or absence of the
theoretical concept within the unit of analysis.
Theoretical concepts can have more than two
nominal response categories (nominal factors) as
in the example below.
42
Ordinal Measurement: An Example
Categories of a nominal level variable cannot be
arranged in any order of magnitude. By adding
ordering by quantity to the definition of the
categories, the sensitivity of our observations
is improved.
Example Subjects in a study are asked to sort a
stack of photographs according to their physical
attractiveness so that the most attractive photo
is on top and the least attractive photo is on
the bottom. This introduces the general idea of
comparative similarity in observations. We can
now say that the 2nd photo in the stack is more
attractive to the subject than all the photos
below it, but less attractive than the photo on
top of the pile. We can assign an
attractiveness score to each photo by
numbering, starting at the top of the pile
(1 = most attractive, 2 = second most attractive,
etc.). This is called a rank-order measurement.
43
With ordinal measurement, we cannot determine the
absolute distance between adjacent categories.
Suppose we knew the real attractiveness scores
of the photos for two subjects. Although their
real evaluations of the photos are quite
different, they rank the comparative
attractiveness identically.
44
Interval Measurement: An Example
  • If we can rank order observations and assign them
    numerical scores that register the degree of
    distance between observations or points on the
    measurement scale, we have improved the level of
    measurement to interval-level.
  • Interval scales are numerical scales in which
    intervals have the same interpretation
    throughout. As an example, consider the
    Fahrenheit scale of temperature. The difference
    between 30 degrees and 40 degrees represents the
    same temperature difference as the difference
    between 80 degrees and 90 degrees. This is
    because each 10-degree interval has the same
    physical meaning (in terms of the kinetic energy
    of molecules).
  • Interval scales are not perfect, however. In
    particular, they do not have a true zero point.

45
Scales of Measurement
  • Nominal. Examples: diagnostic categories, brand
    names, political or religious affiliation.
    Properties: identity. Mathematical operations:
    none. Type of data: nominal. Typical statistics:
    chi-square.
  • Ordinal. Examples: socioeconomic class, ranks.
    Properties: identity, magnitude. Mathematical
    operations: rank order. Type of data: ordered.
    Typical statistics: Mann-Whitney U-test.
  • Interval. Examples: test scores, personality and
    attitude scales. Properties: identity, magnitude,
    equal intervals. Mathematical operations: add,
    subtract. Type of data: score. Typical
    statistics: t-test, ANOVA.
  • Ratio. Examples: weight, length, reaction time,
    number of responses. Properties: identity,
    magnitude, equal intervals, true zero point.
    Mathematical operations: add, subtract, multiply,
    divide. Type of data: score. Typical statistics:
    t-test, ANOVA.
46
Evaluating Measures: Effective
Range
47
Essential Elements of Measurement: Reliability,
Validity, Control, and Importance
48
Types of Reliability
  • Test-retest Reliability
  • Consistency of measurement over time
  • Internal Consistency
  • Inter-item correlation
  • Interrater Reliability
  • Level of agreement between independent
    observers of behavior(s). Assessed via
    correlation or the percent-agreement
    procedure below:

Percent agreement = [Agreements / (Agreements + Disagreements)] x 100
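A minimal sketch of the percent-agreement calculation above, using hypothetical ratings from two observers:

```python
# Percent agreement = agreements / (agreements + disagreements) * 100
rater_1 = ["hit", "miss", "hit", "hit", "miss", "hit", "hit", "miss"]
rater_2 = ["hit", "miss", "hit", "miss", "miss", "hit", "hit", "hit"]

agreements = sum(a == b for a, b in zip(rater_1, rater_2))
disagreements = len(rater_1) - agreements
percent_agreement = agreements / (agreements + disagreements) * 100
print(f"Percent agreement: {percent_agreement:.1f}%")  # 75.0% for these ratings
```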
49
Types of Validity
Face validity. The (non-empirical) degree to
which a test appears to be a sensible
measure. Content validity. The extent to which a
test adequately samples the domain of
information, knowledge, or skill that it purports
to measure. Criterion validity. Now (concurrent)
and Later (predictive). Involves determining the
relationship (correlation) between the predictor
(IV) and the criterion (DV). Construct validity.
The degree to which the theory or theories behind
the research study provide(s) the best
explanation for the results observed.
50
Internal vs. External Validity
  • Internal Validity
  • Extent to which causal/independent variable(s)
    and no other extraneous factors caused the change
    being measured.
  • External Validity (generalizability)
  • Degree to which the results and conclusions of
    your study would hold for other persons, in other
    places, and at other times.

51
Threats to Internal Validity: Factors that reduce
our ability to draw valid conclusions
  • Selection
  • History
  • Maturation
  • Repeated testing
  • Instrumentation
  • Regression to the mean
  • Subject mortality
  • Selection interactions
  • Experimenter bias
52
Reducing Threats to Internal Validity
The role of control: Behavior is influenced by
many factors, termed confounding variables, that
tend to distort the results of a study, thereby
making it impossible for the researcher to draw
meaningful conclusions. Some of these may be
unknown to the researcher. Control refers to the
systematic methods (e.g., research designs)
employed to reduce threats to the validity of the
study posed by extraneous influences on both the
participants and the observer (researcher).
53
Group/Selection threat
  • Occurs when nonrandom procedures are used to
    assign subjects to conditions or when random
    assignment fails to balance out differences among
    subjects across the different conditions of the
    experiment.
  • Example
  • A researcher is interested in determining the
    factors most likely to elicit aggressive behavior
    in male college students. He exposes subjects in
    the experimental group to stimuli thought to
    provoke aggression and subjects in the control
    group to stimuli thought to reduce aggression and
    then measures aggressive behaviors of the
    students. How would the selection threat operate
    in this instance?

54
History threat
  • Events that happen to participants during the
    research which affect results but are not linked
    to the independent variable.
  • Example
  • The reported effects of a program designed to
    improve medical residents' prescription-writing
    practices by the medical school may have been
    confounded by a self-directed continuing
    education series on medication errors provided to
    the residents by a pharmaceutical firm's medical
    education liaison.

55
Maturation threat
  • Can operate when naturally occurring biological
    or psychological changes occur within subjects
    and these changes may account in part or in total
    for effects discerned in the study.
  • Example
  • A reported decrease in emergency room visits in
    a long-term study of pediatric patients with
    asthma may be due to subjects outgrowing
    childhood asthma rather than to any treatment
    regimen introduced to treat the asthma.

56
Repeated testing threat
  • May occur when changes in test scores occur not
    because of the intervention but rather because of
    repeated testing. This is of particular concern
    when researchers administer identical pretests
    and posttests.
  • Example
  • A reported improvement in medical resident
    prescribing behaviors and order-writing practices
    in the study previously described may have been
    due to repeated administration of the same short
    quiz. That is, the residents simply learned to
    provide the right answers rather than truly
    achieving improved prescribing habits.

57
Instrumentation threat
  • When study results are due to changes in
    instrument calibration or observer changes rather
    than to a true treatment effect, the
    instrumentation threat is in operation.
  • Example
  • In Kalsher's Experimental Methods and Statistics
    course, he evaluates students' progress in
    understanding principles of research design at
    week 3 of the semester. A graduate T.A.
    evaluates the students at the conclusion of the
    course. If the evaluators are dissimilar enough
    in their approach, perhaps because of lack of
    training, this difference may contribute to
    measurement error in trying to determine how much
    learning occurred over the semester.

58
Statistical Regression threat
  • The regression threat can occur when subjects
    have been selected on the basis of extreme
    scores, because extreme (low and high) scores in
    a distribution tend to move closer to the mean
    (i.e., regress) in repeated testing.
  • Example
  • If a group of subjects is recruited on the basis
    of extremely high stress scores and an
    educational intervention is then implemented, any
    improvement seen could be due partly, if not
    entirely, to regression to the mean rather than
    to the coping techniques presented in the
    educational program.

59
Experimental Mortality threat
  • Experimental mortality, also known as attrition,
    withdrawals, or dropouts, is problematic when
    there is a differential loss of subjects from
    comparison groups subsequent to randomization,
    resulting in unequal groups at the end of a
    study.
  • Example
  • Suppose a researcher conducts a study to compare
    the effects of a corticosteroid nasal spray with
    a saline nasal spray in alleviating symptoms of
    allergic rhinitis (irritation and inflammation of
    the nasal passages). If subjects with the most
    severe symptoms preferentially drop out of the
    active treatment group, the treatment may appear
    more effective than it really is.

60
Selection Interaction threats
  • A family of threats to internal validity
    produced when a selection threat combines with
    one or more of the other threats to internal
    validity. When a selection threat is already
    present, other threats can affect some
    experimental groups, but not others.
  • Example
  • If one group is dominated by members of one
    fraternity (selection threat), and that
    fraternity has a party the night before the
    experiment (history threat), the results may be
    altered for that group.

61
Threats to External Validity: Ways you might be
wrong in making generalizations
  • People, Places, and Times
  • Demand Characteristics
  • Hawthorne Effects
  • Order Effects (or carryover effects)
62
People threat: Are the results due to the unusual
type of people in the study?
Example: You learn that the grant you submitted
to assess average drinking rates among college
students in the U.S. has been funded. In late
November, you post an announcement about the
study on campus to get subjects for the study.
100 students sign up for the study. Of these, 78
are members of campus fraternities; the other 22
are members of the school's football team.
63
Places threat: Did the study work because of the
unusual place you did the study in?
Example: Suppose that you conduct an
educational study in a college town with lots
of high-achieving, educationally oriented kids.
64
Time threat: Was the study conducted at a
peculiar time?
Example: Suppose that you conducted a smoking
cessation study the week after the U.S. Surgeon
General issued the well-publicized results of the
latest smoking and cancer studies. In this
instance, you might get different results than if
you had conducted the study the week before.
65
Demand Characteristics
  • Participants are often provided with cues to the
    anticipated results of a study. 
  • Example
  • When asked a series of questions about
    depression, participants may become wise to the
    hypothesis that certain treatments may work
    better in treating mental illness than others. 
    When participants become wise to anticipated
    results (termed a placebo effect), they may begin
    to exhibit performance that they believe is
    expected of them. 
  • Making sure that subjects are not aware of
    anticipated outcomes (termed a blind study)
    reduces the possibility of this threat.

66
Hawthorne Effects
  • Similar to a placebo effect, research has found
    that the mere presence of others watching a
    person's performance causes a change in their
    performance. If this change is significant, can
    we be reasonably sure that it will also occur
    when no one is watching?
  • Addressing this issue can be tricky, but
    employing a control group to measure the
    Hawthorne effect in those not receiving any
    treatment can be very helpful. In this sense,
    the control group is also being observed and will
    exhibit changes in behavior similar to those of
    the experimental group, thereby negating the
    Hawthorne effect.

67
Order Effects (carryover effects)
  • Order effects refer to the order in which
    treatment is administered and can be a major
    threat to external validity if multiple
    treatments are used. 
  • Example
  • If subjects are given medication for two months,
    therapy for another two months, and no treatment
    for another two months, it would be possible, and
    even likely, that the level of depression would
    be least after the final no treatment phase. 
    Does this mean that no treatment is better than
    the other two treatments?  It likely means that
    the benefits of the first two treatments have
    carried over to the last phase, artificially
    elevating the no treatment success rates.

68
PART 4
  • Describing Data: Measures of Central Tendency and
    Dispersion
  • The Role of Variance

69
Describing Data
Measures of Central Tendency
  - Mean (the average)
  - Median (the middle number)
  - Mode (the most frequently occurring number)
Measures of Dispersion
  - Range
  - Standard Deviation (square root of the variance)
  - Variance (the average squared deviation from the mean)
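A minimal sketch of these descriptive statistics, computed with Python's built-in statistics module on a small set of hypothetical scores:

```python
import statistics

scores = [4, 8, 6, 5, 3, 8, 9, 5, 8]            # hypothetical data

mean = statistics.mean(scores)                   # the average
median = statistics.median(scores)               # the middle number
mode = statistics.mode(scores)                   # the most frequent value
data_range = max(scores) - min(scores)           # range
variance = statistics.pvariance(scores)          # average squared deviation from the mean
std_dev = statistics.pstdev(scores)              # square root of the variance

print(mean, median, mode, data_range, variance, std_dev)
```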
70
The Role of Variance
- In an experiment, IV(s) are manipulated to
cause variation between experimental and control
conditions.
- Experimental design helps control extraneous
variation--the variance due to factors other than
the manipulated variable(s).
Sources of Variance:
- Systematic between-subjects variance:
experimental variance due to manipulation of the
IV(s) (the good stuff); extraneous variance due to
confounding variables; natural variability due to
sampling error.
- Non-systematic within-groups variance: error
variance due to chance factors (individual
differences) that affect some participants more
than others within a group (the not-so-good stuff).
71
Separating Out The Variance
SST = SSM + SSR
SST: Sums of Squares Total
SSM: Sums of Squares Model
SSR: Sums of Squares Error (residual)
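A minimal sketch of this partition for two hypothetical groups, where each group mean serves as the model's prediction (assumed data, not from the slides):

```python
control = [3.0, 4.0, 5.0, 4.0]
treatment = [6.0, 7.0, 8.0, 7.0]
scores = control + treatment

grand_mean = sum(scores) / len(scores)
group_means = [sum(control) / len(control)] * len(control) + \
              [sum(treatment) / len(treatment)] * len(treatment)

sst = sum((x - grand_mean) ** 2 for x in scores)              # total variation
ssm = sum((m - grand_mean) ** 2 for m in group_means)         # variation explained by the model
ssr = sum((x - m) ** 2 for x, m in zip(scores, group_means))  # residual (error) variation

print(sst, ssm, ssr, ssm + ssr)  # SSM + SSR equals SST
```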
72
Controlling Variance in Experiments
  • In experimentation, each study is designed to:
  • Maximize experimental variance.
  • Control extraneous variance.
  • Minimize error variance.
  • Good measurement
  • Manipulated and Statistical control

73
Test Statistics
Essentially, most test statistics are of the
following form:

Test statistic = Systematic variance / Unsystematic variance

Test statistics are used to estimate the
likelihood that an observed difference is real
(not due to chance), and are usually accompanied
by a p value (e.g., p < .05, p < .01, etc.).
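As one concrete instance of this ratio, the sketch below runs a one-way ANOVA F test on two hypothetical groups; scipy is assumed to be available, and the data are illustrative, not from the slides:

```python
from scipy import stats

control = [3.0, 4.0, 5.0, 4.0]
treatment = [6.0, 7.0, 8.0, 7.0]

# F = variance explained by the group manipulation / error variance within groups
f_stat, p_value = stats.f_oneway(control, treatment)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```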
74
A Very Simple Statistical Model
  • outcome_i = (model) + error_i
  • model: an equation made up of variables and
    parameters
  • variables: measurements from our research (X)
  • parameters: estimates based on our data (b)
  • outcome_i = (b X_i) + error_i
  • outcome_i = (b1 X1_i + b2 X2_i + b3 X3_i) + error_i
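A minimal sketch of fitting the one-predictor version, outcome_i = (b X_i) + error_i, by ordinary least squares on synthetic data (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 50)                       # a measured variable (X)
outcome = 2.5 * x + 4.0 + rng.normal(0, 1, 50)   # hypothetical outcome = model + error

# Estimate the parameters (b) from the data: slope and intercept
slope, intercept = np.polyfit(x, outcome, 1)
predicted = slope * x + intercept
error = outcome - predicted                      # what the model leaves unexplained

print(f"b = {slope:.2f}, intercept = {intercept:.2f}")
```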

75
Types of Mistakes
Statistical decision vs. true state of the null hypothesis:

                    Ho true          Ho false
Reject Ho           Type I error     Correct
Don't reject Ho     Correct          Type II error
76
Statistical Power
  • A measure of how well Type II errors have been
    avoided (i.e., how well a test is able to find an
    effect).
  • Power = 1 - Type II error rate.
  • Power should be 0.8 or higher, so the Type II error
    rate should not exceed .20.
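A minimal sketch of estimating power by simulation, under an assumed effect size and sample size (numpy and scipy assumed; the settings are illustrative, not from the slides):

```python
# power = proportion of simulated experiments in which the test detects a true effect
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, effect, alpha, n_sims = 30, 0.5, 0.05, 2000
rejections = 0
for _ in range(n_sims):
    group_a = rng.normal(0.0, 1.0, n)            # control
    group_b = rng.normal(effect, 1.0, n)         # true effect of 0.5 SD
    _, p = stats.ttest_ind(group_a, group_b)
    rejections += p < alpha

print(f"Estimated power: {rejections / n_sims:.2f}")  # prints an estimate near 0.47 here
```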

77
Effect Sizes: The Correlation Coefficient
The statistical test only tells us whether it is
safe to conclude that the means come from
different populations. It doesn't tell us
anything about how strong these differences are.
So, we need a standard metric to gauge the
strength of the effects. The correlation
coefficient (r) is one metric for gauging effect
size.
  • Ranges from 0 to 1 (no effect to perfect effect).
  • Rough cutoffs (nonlinear; that is, twice the r
    value doesn't necessarily mean twice the effect):
  • 0.10 = small effect (explains 1% of the variance)
  • 0.30 = medium effect (explains 9% of the
    variance)
  • 0.50 = large effect (explains 25% of the variance)
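A minimal sketch of computing r and the variance it explains for two hypothetical variables (numpy assumed; variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
hours_studied = rng.uniform(0, 10, 40)                   # hypothetical predictor
exam_score = 5 * hours_studied + rng.normal(0, 10, 40)   # hypothetical outcome

r = np.corrcoef(hours_studied, exam_score)[0, 1]         # correlation coefficient
print(f"r = {r:.2f}, variance explained = {r**2:.1%}")   # r squared = proportion of variance
```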


78
Effect Sizes: The Coefficient of Determination
The statistical test only tells us whether it is
safe to conclude that the means come from
different populations. It doesn't tell us
anything about how strong these differences are.
So, we need a standard metric to gauge the
strength of the effects. r2 (r-square), or the
coefficient of determination, is one metric for
gauging effect size.

r2 = SSM / SST

Rules of thumb regarding effect sizes:
Small effect: 1-3% of the total variance
Medium effect: 10% of the total variance
Large effect: 25% of the total variance
79
Reporting Statistical Models
  • APA recommends exact p-values for all reported
    results; it is best to include an effect size, too.
  • Effect x was not statistically significant in
    condition y, p = .24, d = .21.
  • Report a mean and the upper and lower boundaries
    of the confidence interval as M = 30, 95% CI
    [20, 40].
  • If all confidence intervals you are reporting are
    95%, it's acceptable to say so and then later say
    something like: In this condition, effect x
    increased, M = 30 [20, 40].

80
A Model of the Research Process: Levels of
Constraint (Model used to illustrate the
continuum of demands placed on the adequacy of
the information used in research and on the
nature of the processing of that information.)
From high constraint (demand) to low constraint:
  • Experimental Research
  • Differential Research
  • Correlational Research
  • Case-study Research
  • Naturalistic Observation (exploratory research)
At the high end, the research plan becomes
increasingly detailed (e.g., precise hypotheses
and analyses) but less flexible. At the low end,
the research plan may be general, with ideas,
questions, and procedures relatively unrefined.
81
Observational Methods
  • No direct manipulation of variables by the
    researcher. Behavior is merely recorded--but
    systematically and objectively so that the
    observations are potentially replicable.
  • Advantages
  • Reveals how people normally behave.
  • Experimentation without prior careful observation
    can lead to a distorted or incomplete picture.
  • Disadvantages
  • Generally more time-consuming.
  • Doesn't allow identification of cause and effect.

82
Quasi-Experimental Design
  • In a quasi-experimental study, the experimenter
    does not have complete control over manipulation
    of the independent variable or how participants
    are assigned to the different conditions of the
    study.
  • Advantages
  • Natural setting
  • Higher face validity (from practitioner
    viewpoint)
  • Disadvantages
  • Not possible to isolate cause and effect as
    conclusively as with a true experiment.

83
Types of Quasi-Experimental Designs
84
One Group Post-Test Design
Design: Treatment → Measurement (over time)
Change in participants' behavior may or may not
be due to the intervention. Prone to time
effects, and lacks a baseline against which to
measure the strength of the intervention.
85
One Group Pre-test Post-test Design
Design: Measurement → Treatment → Measurement (over time)
Comparison of pre- and post-intervention scores
allows assessment of the magnitude of the
treatment's effects. Prone to time effects, and
it is not possible to determine whether
performance would have changed without the
intervention.
86
Interrupted Time-Series Design
Design: Measurement → Measurement → Measurement →
Treatment → Measurement → Measurement → Measurement
(repeated measurements over time, interrupted by the treatment)
We don't have full control over manipulations of
the IV. There is no way of ruling out other
factors, and there may be changes in measurement.
87
Static Group Comparison Design
Design:
Group A (experimental group): Treatment → Measurement
Group B (control group): No Treatment → Measurement
(over time)
Participants are not assigned to the conditions
randomly. Observed differences may be due to
other factors. Strength of conclusions depends
on the extent to which we can identify and
eliminate alternative explanations.
88
Experimental Research: Between-Groups and
Within-Groups Designs
89
Between-Groups Designs
  • Separate groups of participants are used for each
    condition of the experiment.
  • Within-Groups (Repeated Measures) Designs: Each
    participant is exposed to each condition of the
    experiment (requires fewer participants than a
    between-groups design).

90
Between-Groups Designs
  • Advantages
  • Simplicity
  • Less chance of practice and fatigue effects
  • Useful when it is not possible for an individual
    to participate in all of the experimental
    conditions
  • Disadvantages
  • Can be expensive in terms of time, effort, and
    number of participants
  • Less sensitive to experimental manipulations

91
Examples of Between-Groups Designs
92
Post-test Only / Control Group Design
Design: Random allocation to
Group A (experimental group): Treatment → Measurement
Group B (control group): No Treatment → Measurement
(over time)
If randomization fails to produce equivalence,
there is no way of knowing that it has failed.
Experimenter cannot be certain that the two
groups were comparable before the treatment.
93
Pre-test / Post-test Control Group Design
Design: Random allocation to
Group A: Measurement → Treatment → Measurement
Group B: Measurement → No Treatment → Measurement
(over time)
Pre-testing allows the experimenter to determine
the equivalence of the groups prior to the
intervention. However, pre-testing may affect
participants' subsequent performance.
94
Solomon Four-Group Design
Design: Random allocation to
Group A: Measurement → Treatment → Measurement
Group B: Measurement → No Treatment → Measurement
Group C: Treatment → Measurement
Group D: No Treatment → Measurement
(over time)
95
Within-Groups Designs: Repeated Measures
Advantages
  • Economy
  • Sensitivity

Disadvantages
  • Carry-over effects from one condition to another
  • The need for conditions to be reversible

96
Repeated-Measures Design
Design: Random allocation to one of two orders
Order 1: Treatment → Measurement, then No Treatment → Measurement
Order 2: No Treatment → Measurement, then Treatment → Measurement
(over time)
Potential for carryover effects can be avoided by
randomizing the order of presentation of the
different conditions or counterbalancing the
order in which participants experience them.
97
Latin Squares Design
Three conditions or trials (order of conditions or trials):
  One group of participants:         A B C
  Another group of participants:     B C A
  Yet another group of participants: C A B
The order of presentation of conditions in a
within-subjects design can be counterbalanced so
that each condition appears in each ordinal
position just once. The problem is not completely
eliminated, because A precedes B twice but B
precedes A only once. The same holds for C and A.
98
Balanced Latin Squares Design
Four conditions or trials (order of conditions or trials):
  One group of participants:             A B C D
  Another group of participants:         B D A C
  Yet another group of participants:     D C B A
  And yet another group of participants: C A D B
Note: This approach works only for experiments
with an even number of conditions. For
additional help with more complex multi-factorial
designs, see http://www.jic.bbsrc.ac.uk
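A minimal sketch of generating a balanced Latin square for an even number of conditions, using a standard construction; it produces a valid square, though not necessarily the exact ordering shown on the slide:

```python
def balanced_latin_square(conditions):
    n = len(conditions)
    if n % 2:
        raise ValueError("This construction assumes an even number of conditions.")
    rows = []
    for i in range(n):  # one row (condition order) per participant group
        order = [(i + (j // 2 + 1 if j % 2 else n - j // 2)) % n for j in range(n)]
        rows.append([conditions[k] for k in order])
    return rows

for row in balanced_latin_square(["A", "B", "C", "D"]):
    print(" ".join(row))
# Each condition appears once in every ordinal position, and each condition
# immediately precedes every other condition exactly once across the orders.
```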
99
Factorial Designs
  • include multiple independent variables
  • allow for analysis of interactions between
    variables
  • facilitate increased generalizability