Title: Analysis of Variance
1. Analysis of Variance
Bidin Yatim, PhD (Exeter), MSc (Aston), BSc (Nottingham)
Statistics Department, Faculty of Quantitative Sciences
- Data Analysis Using SPSS
- Topic 6
2. Objectives
- Perform and interpret a one-factor independent measures ANOVA
- Understand the necessary assumptions for a one-factor independent measures ANOVA
- Perform and interpret post-hoc tests for a one-factor independent measures ANOVA
- Perform and interpret a one-factor repeated measures ANOVA
- Understand the necessary assumptions for a one-factor repeated measures ANOVA
3. Research Problem
- Does the temperature of the lecture hall affect the rate at which students fall asleep during class?
- The IV (factor) is room temperature
- Three levels: 50, 70 and 90 degrees
- The DV is reaction time
- How many minutes after the start of the lecture does the first student fall asleep?
4. Raw Data
5. Introduction
- Analysis of variance compares two or more populations of interval data.
- Specifically, we are interested in determining whether differences exist between the population means.
- The procedure works by analyzing the sample variance.
6. One Way Analysis of Variance
- The analysis of variance is a procedure that tests whether differences exist between two or more population means.
- To do this, the technique analyzes the sample variances.
7. One Way Analysis of Variance: Example 1
- An apple juice manufacturer is planning to develop a new product, a liquid concentrate.
- The marketing manager has to decide how to market the new product.
- Three strategies are considered:
- Emphasize the convenience of using the product.
- Emphasize the quality of the product.
- Emphasize the product's low price.
8. One Way Analysis of Variance: Example 1
- An experiment was conducted as follows:
- In three cities an advertisement campaign was launched.
- In each city only one of the three characteristics (convenience, quality, and price) was emphasized.
- Weekly sales were recorded for 20 weeks following the beginning of the campaigns.
9. One Way Analysis of Variance
[Table: weekly sales in each of the three cities]
10. One Way Analysis of Variance
- Solution
- The data are interval.
- The problem objective is to compare mean sales in three cities.
- We hypothesize that the three population means are equal.
11. Defining the Hypotheses
H0: $\mu_1 = \mu_2 = \mu_3$
H1: At least two means differ
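12. Notation
Independent samples are drawn from k populations (treatments).
Sample 1: $x_{11}, x_{21}, \ldots, x_{n_1,1}$
Sample 2: $x_{12}, x_{22}, \ldots, x_{n_2,2}$
Sample k: $x_{1k}, x_{2k}, \ldots, x_{n_k,k}$
Each sample has its own sample size $n_j$ and sample mean $\bar{x}_j$.
X is the response variable. The values of the variable are called responses.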
13. Terminology
- In the context of this problem:
- Response variable: weekly sales.
- Responses: the actual sales values.
- Experimental unit: the weeks in the three cities in which we record sales figures.
- Factor: the criterion by which we classify the populations (the treatments). In this problem the factor is the marketing strategy.
- Factor levels: the population (treatment) names. In this problem the factor levels are the three marketing strategies.
14. The Rationale of the Name Analysis of Variance (ANOVA)
- We are testing the difference between means, so why ANOVA?
- Two types of variability are employed when testing for the equality of the population means: within samples and between samples.
15. One Way Analysis of Variance
Graphical demonstration: employing two types of variability, within samples and between samples.
16. One Way Analysis of Variance
[Figure: two plots of Treatment 1, Treatment 2 and Treatment 3 with the same sample means but different within-sample spread]
- A small variability within the samples makes it easier to draw a conclusion about the population means.
- The sample means are the same as before, but the larger within-sample variability makes it harder to draw a conclusion about the population means.
17. Rationale I: Variability Between Samples
- If the null hypothesis is true, we would expect all the sample means to be close to one another (and as a result, close to the grand mean).
- If the alternative hypothesis is true, at least some of the sample means would differ.
- Thus, we measure the variability between sample means (and hence compute MSTr, the mean square for treatments).
18. Rationale II: Variability Within Samples
- Large variability within the samples weakens the ability of the sample means to represent their corresponding population means.
- Therefore, even though the sample means may differ markedly from one another, we have to consider the within-samples variability (and hence compute MSE, the mean square for error).
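The two rationales can be made concrete with a small calculation. The sketch below is illustrative only: the function name `one_way_anova` and both data sets are hypothetical (they are not the apple juice data), chosen so that the two data sets share the same sample means (10, 15, 20) but differ in within-sample spread.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (MSTr, MSE, F) for a list of samples, one list per treatment."""
    k = len(groups)                      # number of treatments
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-samples variability: spread of the sample means around the grand mean
    sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-samples variability: spread of observations around their own sample mean
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    mstr = sst / (k - 1)
    mse = sse / (n - k)
    return mstr, mse, mstr / mse

# Hypothetical data: identical sample means, different within-sample spread
tight = [[9, 10, 11], [14, 15, 16], [19, 20, 21]]
spread = [[4, 10, 16], [9, 15, 21], [14, 20, 26]]

mstr_tight, mse_tight, f_tight = one_way_anova(tight)
mstr_spread, mse_spread, f_spread = one_way_anova(spread)
print(f_tight, f_spread)
```

MSTr is the same for both data sets because the sample means are the same. Only MSE grows with the within-sample spread, which is why F shrinks for the second data set.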
19. Test Statistic (F), Critical Value and Rejection Criterion
And finally, the hypothesis test.
20. The F Test
H0: $\mu_1 = \mu_2 = \mu_3$
H1: At least two means differ
Test statistic: F = MSTr / MSE = 3.23
Since 3.23 > 3.15 (the critical value), there is sufficient evidence to reject H0 in favor of H1, and to argue that at least one of the mean sales is different from the others.
21. The F Test p-value
- Use SPSS to find the p-value
- fx Statistical Table: DIST(3.23, 2, 57) = .0467
- p-value = P(F > 3.23) = .0467
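As a rough cross-check of this p-value without SPSS, the upper-tail probability of the F distribution has a simple closed form in the special case of 2 numerator degrees of freedom, which is the case here (three treatments). The helper name `f_upper_tail_df1_2` is hypothetical, and the formula does not apply to other numerator degrees of freedom.

```python
def f_upper_tail_df1_2(f_value, df2):
    """P(F > f_value) for an F distribution with 2 and df2 degrees of freedom.

    Closed form valid ONLY when the numerator degrees of freedom equal 2.
    """
    return (1 + 2 * f_value / df2) ** (-df2 / 2)

# F = 3.23 with 2 and 57 degrees of freedom, as on the slide
p_value = f_upper_tail_df1_2(3.23, 57)
print(round(p_value, 3))  # about 0.047
```

The result agrees with the slide's .0467 to about three decimal places. Small differences in the fourth decimal are to be expected because 3.23 is itself a rounded value. Either way the p-value is below .05, which matches the decision reached with the critical value.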
22. Hey! Let's get our hands dirty using SPSS.
23. One Way Analysis of Variance Using SPSS
- Suppose we want to know whether students who have to work many hours outside school to support themselves find their grades suffering.
- We examine this question by comparing the GPAs of students who work various hours outside school.
- Let's examine this question using the data in our Student file: File > Open > Student
24. One Way Analysis of Variance Using SPSS
- First examine the average GPA for each of the three work categories (0 hrs, 1-19 hrs, >20 hrs), held in the variable WorkCat.
- Graphs > Boxplot, then choose Simple and click Define. Select GPA as the variable and WorkCat for the Category Axis. Click Options.
25. After clicking Options, click off "Display groups defined by missing values", then click Continue and OK.
26. What Is the Box-plot Telling Us?
- There is some variation across the groups.
- The median GPAs (the dark line in the middle of each box) differ slightly between groups. That's natural, probably because of sampling error.
- So, should we attribute the observed differences to sampling error, or do the groups genuinely differ?
- Neither the box-plot nor the medians offer decisive evidence. Hence we need ANOVA.
27. One Way Analysis of Variance Using SPSS
- We are testing
- H0: The mean GPAs of the three work categories are equal
- H1: At least two means differ
- Before attempting ANOVA, we need to review its assumptions: (i) independent samples, (ii) normality, (iii) equality of variances. We can test both (ii) and (iii).
- Analyze > Descriptive Statistics > Explore
28. Analyze > Descriptive Statistics > Explore
- In the Explore dialog box, select GPA as the Dependent List variable, WorkCat as the Factor List variable and Plots as the Display. Next, click Plots.
- We are interested in a normality test, so select only the normality option highlighted on the slide and deselect the other highlighted option. Click Continue and OK. See next slide.
29. The output has several parts; let's focus on the tests of normality
- The Kolmogorov-Smirnov test assesses whether there is a significant departure from normality in the population distribution of each of the 3 groups. Null hypothesis: the distributions are normal.
- Look at the p-values: all are > 0.05. Hence there is no evidence of non-normality.
30. One Way Analysis of Variance Using SPSS
- We still need to validate the homogeneity of variance assumption. We do this within ANOVA.
- Analyze > Compare Means > One-Way ANOVA
- The Dependent List variable is GPA and the Factor variable is WorkCat.
31. One Way Analysis of Variance Using SPSS
- Click Options; under Statistics, select Descriptive and Homogeneity of variance test. Click Continue, then OK.
- The One-Way ANOVA output consists of many parts.
- Focus on the test of homogeneity of variances: on the basis of its p-value (shown in the output), we do not reject H0: the variances are equal.
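To see what a homogeneity of variance test measures, here is a minimal sketch of the mean-centred Levene statistic, which amounts to a one-way ANOVA F computed on the absolute deviations of each observation from its own group mean. The function name `levene_w` and the GPA values are hypothetical and are not taken from the Student file. The sketch returns only the statistic, not a p-value.

```python
from statistics import mean

def levene_w(groups):
    """Mean-centred Levene statistic W for a list of samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    # Absolute deviation of each observation from its own group mean
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_bar = sum(sum(zi) for zi in z) / n
    between = sum(len(zi) * (mean(zi) - z_bar) ** 2 for zi in z)
    within = sum((v - mean(zi)) ** 2 for zi in z for v in zi)
    return (n - k) / (k - 1) * between / within

# Hypothetical GPAs: three groups with the same spread
equal_spread = [[2.8, 3.0, 3.2, 3.4], [3.1, 3.3, 3.5, 3.7], [2.6, 2.8, 3.0, 3.2]]
# Hypothetical GPAs: the third group is far more variable
unequal_spread = [[2.8, 3.0, 3.2, 3.4], [3.1, 3.3, 3.5, 3.7], [1.0, 2.0, 3.0, 4.0]]

w_equal = levene_w(equal_spread)
w_unequal = levene_w(unequal_spread)
print(w_equal, w_unequal)
```

A statistic near zero, as for the first data set, is consistent with equal variances. A large statistic, as for the second, would lead to a small p-value and rejection of H0.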
32. Normality and homogeneity of variances assumptions met, hence
- Let's find out whether students who work various hours outside school differ in their GPAs.
- The p-value is very small (SPSS reports it as .000, meaning it is below .0005), hence we reject H0 and conclude that the mean GPAs are not all the same.
- Where are the differences? Hence the post-hoc test.
33. One Way Analysis of Variance Using SPSS: Post-hoc Test
- Before doing the post-hoc test, let's look at the group means. Please comment.
- Eyeballing the group means cannot tell us decisively whether significant differences exist.
- Many options exist. A commonly used one is Tukey's test. It compares all pairs of group means without inflating the overall probability of a Type I error.
- Analyze > Compare Means > One-Way ANOVA
34. One Way Analysis of Variance Using SPSS: Post-hoc Test
- The variables are still selected. Click Post Hoc, select only Tukey, then click Continue and OK.
35. One Way Analysis of Variance Using SPSS: Post-hoc Test
- The results of the Tukey test can be summarized as follows:
- The 1-19 hrs group had better GPAs than the 0 hrs group.
- The 1-19 hrs group is comparable to the 20 hrs group.
- The 20 hrs group is comparable to the 0 hrs group.
36. One Way Analysis of Variance Using SPSS: Exercise
- File > Open > GSS94
- This is an extract from the 1994 General Social Survey.
- One variable in the file groups respondents into four age categories. Does the mean number of TV hours vary by age group?
37. Analysis of Variance: Experimental Designs
- Several elements may distinguish one experimental design from another:
- Whether independent or dependent samples are used.
- The number of factors.
- Each characteristic investigated is called a factor.
- Each factor has several levels.
38. One-Factor Repeated Measures Analysis of Variance
- There are many situations where we are interested in examining the same sample across three or more treatments (several measurements on the same set of individuals).
- Does blood pressure change during various stressor tasks?
- Let's examine this using the data in the file called BP.
39. One-Factor Repeated Measures Analysis of Variance
- We are testing
- H0: The treatment means are all equal
- H1: At least two means differ
- Repeated measures ANOVA requires 4 conditions: (i) independent observations within each treatment, (ii) normality, (iii) homogeneous variances, (iv) sphericity.
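The arithmetic behind the repeated measures F test can be sketched as follows. Because every subject is measured under every treatment, the subject-to-subject variability is removed from the error term before F is formed. The function name `repeated_measures_f` and the blood pressure readings are hypothetical, not values from the BP file. The F computed here corresponds to the uncorrected, sphericity-assumed case.

```python
def repeated_measures_f(data):
    """data: one row per subject, one column per treatment.

    Returns (F, df_treatment, df_error) with no sphericity correction.
    """
    n = len(data)        # number of subjects
    k = len(data[0])     # number of treatments
    grand = sum(sum(row) for row in data) / (n * k)
    treat_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]
    ss_treat = n * sum((m - grand) ** 2 for m in treat_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    # Error is what remains after removing treatment and subject effects
    ss_error = ss_total - ss_treat - ss_subj
    df_treat = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_treat / df_treat) / (ss_error / df_error)
    return f_stat, df_treat, df_error

# Hypothetical diastolic BP for 4 subjects under 3 conditions
bp = [
    [70, 75, 80],
    [72, 78, 83],
    [68, 74, 77],
    [75, 79, 86],
]
f_stat, df_treat, df_error = repeated_measures_f(bp)
print(f_stat, df_treat, df_error)
```

In this made-up example the subjects differ a good deal from one another, but each shows the same rising pattern across conditions, so the error term is small and F is large.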
40. One-Factor Repeated Measures Analysis of Variance: BP File
- The BP file contains:
- Dbprest: diastolic BP at rest
- Dbpma: diastolic BP during mental arithmetic
- Dbpcp: diastolic BP while immersing a hand in ice water
- Test the normality of the variables.
- See the next slides for the sphericity test.
41. One-Factor Repeated Measures Analysis of Variance Using SPSS
- Analyze > General Linear Model > Repeated Measures. These instructions will open the dialog box shown. Assign our repeated measure (also called the within-subject factor) and indicate the number of levels as 3.
- Factor1(3) will appear in the list after we click Add. After clicking Add, click Define (refer to the next slide) and watch for a new dialog box.
42. One-Factor Repeated Measures Analysis of Variance Using SPSS
- Select a specific variable for each level of our repeated measure: dbprest, dbpma and dbpcp, in the order indicated.
- Click on Options and select Descriptive Statistics, then click Continue and OK.
43. One-Factor Repeated Measures Analysis of Variance Using SPSS
- The output has several parts. First look at Mauchly's sphericity test. It will determine which ANOVA test to use. H0 is that sphericity holds, that is, the variances of the differences between each pair of the 3 measures are equal. Look at the p-value: it is small, hence reject H0.
- The sphericity assumption is not met.
44. One-Factor Repeated Measures Analysis of Variance Using SPSS
- Then look at the output labeled Tests of Within-Subjects Effects. The 1st line for factor1 reads Sphericity Assumed. This is the F test line to refer to if the sphericity condition is met.
- So? There are other F tests to choose from. The Greenhouse-Geisser adjusted test is commonly used. Look at its p-value.
- So what?
45. One-Factor Repeated Measures Analysis of Variance Using SPSS: Conclusion
- The mean diastolic BP (DBP) is not the same for the 3 tasks. Hence DBP changes significantly during the various mental and physical stressors investigated in this study.
- But where are the differences? Determine manually.
46. One-Factor Repeated Measures Analysis of Variance: Exercise
- Does systolic BP change significantly during the three tasks examined in this study? The 3 tasks are at rest (sbprest), performing mental arithmetic (sbpma), and immersing a hand in ice water (sbpcp).
47. Two Way Analysis of Variance
- Objectives
- Perform and interpret a two-factor independent measures ANOVA
- Understand the necessary assumptions for a two-factor independent measures ANOVA
- Understand and interpret statistical main effects
- Understand and interpret statistical interactions
48. Two Way Analysis of Variance
- Previously we learned how to perform ANOVA for research situations involving a single IV.
- However, in many situations we want to consider two IVs simultaneously. The analysis used is a two-way analysis of variance.
49. One-way ANOVA (single factor) versus two-way ANOVA (two factors)
[Figure: the response plotted against Treatments 1 to 3 (levels of a single factor) for one-way ANOVA, and against Levels 1 to 3 of Factor A and Levels 1 and 2 of Factor B for two-way ANOVA]
50. Two Way Analysis of Variance: Research Problem
- Prior research found that people at risk of having hypertension (HBP) showed changes in cardiovascular responses to various stressors.
- A researcher wants to explore this finding further by looking at several variables that might be implicated, in particular a person's sex and whether the person has a parent with HBP.
- Various DVs can be measured. We focus on systolic BP during a mental arithmetic task.
51. Two Way Analysis of Variance: Hypotheses
- Two-way ANOVA will test for:
- A mean difference between the levels of the 1st factor (here, comparing systolic BP during mental arithmetic (sbpma) between genders)
- A mean difference between the levels of the 2nd factor (here, comparing sbpma for individuals with and without a parent with HBP)
- Any other mean differences that result from a unique combination of the two factors, called interaction effects.
52. Two Way Analysis of Variance: Hypotheses
- The 1st two hypothesis tests are called tests for the main effects. H0: there are no differences between the levels of the factor.
- The 3rd hypothesis test is for the interaction between the two factors. H0: there is no interaction between the factors.
- The three tests are independent. The outcome of one does not affect the others.
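The three tests can be illustrated with a minimal two-way ANOVA sketch for a balanced design (the same number of observations in every cell). The function name `two_way_anova` and the systolic BP values are hypothetical. The real BP file may well be unbalanced, in which case this simple partition of the sums of squares does not apply as written.

```python
def two_way_anova(cells):
    """cells[i][j]: observations for level i of factor A and level j of factor B.

    Assumes a balanced design. Returns the F statistics for A, B and A x B.
    """
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])
    n = a * b * r
    grand = sum(x for row in cells for cell in row for x in cell) / n
    cell_means = [[sum(cell) / r for cell in row] for row in cells]
    a_means = [sum(row) / b for row in cell_means]
    b_means = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]
    ss_a = b * r * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * r * sum((m - grand) ** 2 for m in b_means)
    ss_cells = r * sum((m - grand) ** 2 for row in cell_means for m in row)
    # Interaction: cell-to-cell variation not explained by the two main effects
    ss_ab = ss_cells - ss_a - ss_b
    ss_error = sum((x - cell_means[i][j]) ** 2
                   for i in range(a) for j in range(b) for x in cells[i][j])
    mse = ss_error / (a * b * (r - 1))
    return {
        "A": (ss_a / (a - 1)) / mse,
        "B": (ss_b / (b - 1)) / mse,
        "AxB": (ss_ab / ((a - 1) * (b - 1))) / mse,
    }

# Hypothetical sbpma values: factor A is sex (male, female),
# factor B is parental hypertension (yes, no)
sbpma = [
    [[128, 130, 132], [124, 126, 128]],   # male: PH yes, PH no
    [[122, 124, 126], [118, 120, 122]],   # female: PH yes, PH no
]
f_values = two_way_anova(sbpma)
print(f_values)
```

The made-up cell means are purely additive (each factor shifts the mean by a fixed amount), so the interaction F is zero while both main-effect F values are large. Each F would then be compared with its own critical value or p-value.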
53. Two Way Analysis of Variance Using SPSS: File > Open > BP
- Analyze > General Linear Model > Univariate
- Select systolic BP during mental arithmetic (sbpma) as the Dependent Variable, and sex and parental hypertension (PH) as Fixed Factors.
- Click Options and...
54. Two Way Analysis of Variance Using SPSS
- Under Display, select Descriptive statistics and Homogeneity tests. Click Continue, then OK.
55. Two Way Analysis of Variance Using SPSS
- The Univariate ANOVA output has several parts (descriptive statistics, Levene's test of equality of variances, and tests of between-subjects effects).
- 1st look at Levene's test, which tests variance homogeneity. Its p-value is .225 > .05, hence the data do not violate the variance homogeneity assumption.
- So, we can proceed to interpret the ANOVA.
56. Two Way Analysis of Variance Using SPSS
- Let's check whether sbpma is related to a person's gender, parental history of hypertension (PH), or some combination of these factors. The p-values for testing the 1st, 2nd and 3rd hypotheses are shown in the red, blue and green lines respectively.
57. Two Way Analysis of Variance Using SPSS: Conclusions
- Reject the 1st hypothesis, hence there is a significant main effect for gender.
- Reject the 2nd hypothesis, hence there is a significant main effect for parental history.
- The p-value for testing the interaction effect is greater than .05, hence do not reject H0 and conclude that there is no significant interaction between the two factors.
58. Two Way Analysis of Variance Using SPSS: Conclusions
- What exactly do these results mean?
- We have two significant main effects and a non-significant interaction.
- One very helpful way to make sense of two-factor ANOVA results is to graph the data.
- Graphs > Interactive > Bar. Drag sbpma to the vertical axis, PH to the horizontal axis and sex to the Panel Variables.
- Click the Titles tab to title your bar chart and put your name in the caption.
59.
- The bars are about the same height, so it is hard to differentiate between them.
60. Editing the Bar Chart
- Let's use the interactive graphing capabilities to make a bar chart where the y-axis does not start at zero, and thus shows the differences more clearly. The lowest sbpma is 118, so let's start the y-axis at 118.
- Double-click anywhere on your chart. Find Chart Manager and click on it.
- In the Chart Manager, click on Scale Axis and then on Edit.
- In the Scale Axis dialog box, under Scale, find Minimum and deselect Auto. Type 118. Click OK.
61. Edited Bar Chart... At Last
- So? We can see that males have higher SBP, regardless of parental hypertension.
- People who have a parent with hypertension have higher SBP, regardless of gender.
- Both factors affect SBP separately, but their combination does not.
62. Exercise
- File > Open > BP
- Is heart rate while immersing a hand in ice water (hrcp) related to a person's sex, parental hypertension (PH), or some combination of these factors?
63. End of Part Six
See you later. Please don't go away.
What if the data are not normal?