Title: 14. Analysis of Variance III / Análisis de Varianza III
- Professor Simon Wilson
- Departamento de Estadística y Econometría
2. Testing that more than 2 means are equal / Contrastar que las medias son iguales
- In the last section, we tested whether 2 means are equal.
- In our example, this does not really tell us whether all 3 means together are equal (it only tells us about each pair of means).
- The method of Analysis of Variance tells us how to do this.
- In general, we have I levels of the factor, so:
- H0: $\mu_1 = \mu_2 = \dots = \mu_I = \mu$ (null hypothesis / hipótesis nula)
- H1: not all the $\mu_i$ are equal
3. Testing the Equality of Means (1)
- The logic of the test is easy (recall the 3 plots from Section 2).
- If the differences between the group means are large relative to the experimental error $\sigma$, we conclude that the means are different.
4. Testing the Equality of Means (2)
- Three groups with means 20, 22 and 24. First, the error $\sigma$ is large relative to the difference in group means.
- Plot legend: o = observations, x = group means
- Here we accept H0.
5. Testing the Equality of Means (3)
- Second, the error $\sigma$ is small relative to the difference in group means.
- Plot legend: o = observations, x = group means
- Here we reject H0.
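The contrast between these two pictures can be reproduced numerically. The sketch below is only an illustration: the observations are made up (hypothetical offsets placed around the group means 20, 22 and 24 from the slides), and the helper names `f_ratio` and `make_groups` are assumptions, not part of the course material. It computes the ratio of between-group to within-group variability that the following slides develop.

```python
from statistics import mean

def f_ratio(groups):
    """Ratio of between-group to within-group variance estimates."""
    n = sum(len(g) for g in groups)   # total number of observations
    k = len(groups)                   # number of groups (I in the slides)
    grand = mean(x for g in groups for x in g)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (between / (k - 1)) / (within / (n - k))

def make_groups(spread):
    """Hypothetical data: five observations around each mean 20, 22, 24."""
    offsets = [-2, -1, 0, 1, 2]
    return [[m + spread * o for o in offsets] for m in (20, 22, 24)]

f_large_error = f_ratio(make_groups(3.0))   # error large vs. mean differences
f_small_error = f_ratio(make_groups(0.2))   # error small vs. mean differences
print(f_large_error, f_small_error)
```

With these made-up numbers the first ratio is below 1 and the second is about 200, matching the "accept H0" and "reject H0" pictures.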
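6. Testing the Equality of Means (4)
- Remember, we estimate the experimental error $\sigma^2$ by $\hat{\sigma}^2 = \frac{1}{n-I}\sum_{i=1}^{I}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_{i\cdot})^2$, where $y_{ij}$ is observation $j$ in group $i$, $\bar{y}_{i\cdot}$ is the mean of group $i$, $n_i$ is its size and $n$ is the total number of observations.
- The variability between the groups can be measured using the sum of squares / suma de cuadrados $\sum_{i=1}^{I} n_i(\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot})^2$, where $\bar{y}_{\cdot\cdot}$ is the mean of all observations.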
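7. A diversion: Decomposition of the Variance (1)
- Suppose H0 is true (all the means equal).
- Then our estimate of this mean would be the mean of all observations together, $\bar{y}_{\cdot\cdot}$.
- Our estimate of the variance would use the sum of squares $\sum_{i=1}^{I}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_{\cdot\cdot})^2$.
- This is called the total variability / variabilidad total (TV).
8. Decomposition of the Variance (2)
- Now we can write $y_{ij}-\bar{y}_{\cdot\cdot} = (y_{ij}-\bar{y}_{i\cdot}) + (\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot})$, where $\bar{y}_{i\cdot}$ is the mean of group $i$.
- And therefore $\sum_{i}\sum_{j}(y_{ij}-\bar{y}_{\cdot\cdot})^2 = \sum_{i}\sum_{j}(y_{ij}-\bar{y}_{i\cdot})^2 + \sum_{i} n_i(\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot})^2 + 2\sum_{i}(\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot})\sum_{j}(y_{ij}-\bar{y}_{i\cdot})$.
9. Decomposition of the Variance (3)
- Now, clearly, $\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_{i\cdot}) = 0$ for every group $i$.
- (Why?)
- And so we have $\sum_{i}\sum_{j}(y_{ij}-\bar{y}_{\cdot\cdot})^2 = \sum_{i}\sum_{j}(y_{ij}-\bar{y}_{i\cdot})^2 + \sum_{i} n_i(\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot})^2$.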
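The decomposition can be checked numerically. The following is a minimal sketch on made-up data with unequal group sizes (the observations and the function name `decompose` are hypothetical). It computes the total sum of squares about the overall mean, the within-group and between-group sums of squares, and the cross term, which should vanish.

```python
from statistics import mean

def decompose(groups):
    """Return (total, between, within, cross) sums of squares."""
    all_obs = [x for g in groups for x in g]
    grand = mean(all_obs)
    total = sum((x - grand) ** 2 for x in all_obs)
    within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Cross term: zero because deviations from a group mean sum to zero.
    cross = sum((x - mean(g)) * (mean(g) - grand) for g in groups for x in g)
    return total, between, within, cross

# Hypothetical observations, three groups of different sizes.
groups = [[18.0, 21.0, 23.0], [20.0, 22.0, 25.0, 27.0], [24.0, 26.0]]
total, between, within, cross = decompose(groups)
print(total, between + within, cross)
```

Up to floating-point rounding, the first two printed values agree and the third is zero.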
10. Testing the Equality of Means (5)
- Recall that $\sum_{i}\sum_{j}(y_{ij}-\bar{y}_{i\cdot})^2$ (the squared deviations of each observation from its own group mean) is the sum of squares used to estimate $\sigma^2$ when we think that the group means are different.
- This is called the unexplained variability / variabilidad no explicada (UV) of the data, or unexplained variance (i.e. the differences between the observations and their estimated expected value in the group).
11. Testing the Equality of Means (6)
- The other part, $\sum_{i} n_i(\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot})^2$ (the squared deviations of the group means from the overall mean), measures the variability between the means of the groups.
- We call this term the explained variability / variabilidad explicada (EV).
- If there are large differences in the means of each group then EV is large.
- Note that TV = EV + UV.
12. Testing the Equality of Means (7)
- Further, the following is true:
- If H0 is true then the explained variability is close to 0 (the group means differ only through experimental error).
- If H1 is clearly true (large differences between group means) then the explained variability is large relative to the unexplained variability (go back to the example).
13. Testing the Equality of Means (8)
- So look at the ratio / cociente of explained to unexplained variability.
- If this ratio is small then the means are more or less equal (accept H0).
- If this ratio is large then the means are different (accept H1).
14. The ANOVA Table (1)
- All this information about variances is usually put into a table called the ANalysis Of VAriance (ANOVA) table.
- This table is on the next page.
- It shows the three types of variability and the degrees of freedom of each.
- Finally, it shows the estimate of the variance (sum of squares divided by the degrees of freedom).
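15. The ANOVA Table (2)
Source of variability   Sum of squares   Degrees of freedom   Variance estimate
Explained (EV)          EV               I - 1                EV / (I - 1)
Unexplained (UV)        UV               n - I                UV / (n - I)
Total (TV)              TV               n - 1                TV / (n - 1)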
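As a rough sketch of how such a table could be filled in from raw data, the function below returns one row per source of variability: sum of squares, degrees of freedom, and their quotient. The function name `anova_table` and the observations are hypothetical and chosen only for illustration.

```python
from statistics import mean

def anova_table(groups):
    """One-way ANOVA table as (source, sum of squares, df, SS/df) rows."""
    all_obs = [x for g in groups for x in g]
    n, k = len(all_obs), len(groups)   # k plays the role of I in the slides
    grand = mean(all_obs)
    ev = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    uv = sum((x - mean(g)) ** 2 for g in groups for x in g)
    rows = [
        ("Explained (EV)", ev, k - 1),
        ("Unexplained (UV)", uv, n - k),
        ("Total (TV)", ev + uv, n - 1),   # uses the identity TV = EV + UV
    ]
    return [(name, ss, df, ss / df) for name, ss, df in rows]

# Made-up observations for three groups, for illustration only.
example = [[19.0, 21.0, 20.0], [22.0, 24.0, 23.0], [25.0, 23.0, 24.0]]
table = anova_table(example)
for name, ss, df, ms in table:
    print(f"{name:<18}{ss:>10.3f}{df:>5d}{ms:>10.3f}")
```

Dividing the variance estimate in the first row by the one in the second row gives the ratio used in the test.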
16. Distributions of EV and UV
- When H0 is true, $EV/\sigma^2 \sim \chi^2_{I-1}$ and $UV/\sigma^2 \sim \chi^2_{n-I}$.
- This helps us to say when F is small or large.
17. The F test (1)
- We can show that the following ratio of the EV and UV (ratio of two $\chi^2$ distributions), $F = \frac{EV/(I-1)}{UV/(n-I)}$,
- has a distribution called the F-distribution with I-1 and n-I degrees of freedom.
18. The F test (2)
- If the value of F is large according to the F-distribution then we reject H0.
- We therefore conduct the test as follows:
- Compute F.
- Decide on the significance level $\alpha$.
- Look up (in tables) the value $F_{\alpha;\,I-1,\,n-I}$ such that $P(F > F_{\alpha;\,I-1,\,n-I}) = \alpha$.
- If $F > F_{\alpha;\,I-1,\,n-I}$ then reject H0.
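The steps above can be sketched as a small function. This is an illustrative sketch, not the course's own code: the name `f_test` and the input numbers are hypothetical, and the critical value is passed in by hand because, as in the slides, it is assumed to come from printed tables.

```python
def f_test(ev, uv, n_groups, n_obs, critical_value):
    """Apply the decision rule: reject H0 when F exceeds the table value."""
    df_explained = n_groups - 1          # I - 1
    df_unexplained = n_obs - n_groups    # n - I
    f_value = (ev / df_explained) / (uv / df_unexplained)
    reject_h0 = f_value > critical_value
    return f_value, df_explained, df_unexplained, reject_h0

# Hypothetical inputs: EV = 26, UV = 6, I = 3 groups, n = 9 observations.
# 5.14 is the usual tabulated 5% point for 2 and 6 degrees of freedom.
f_value, df1, df2, reject_h0 = f_test(26.0, 6.0, 3, 9, critical_value=5.14)
print(f_value, (df1, df2), "reject H0" if reject_h0 else "do not reject H0")
```

Passing the critical value explicitly keeps the sketch free of external libraries; a statistics package could supply it instead of the tables.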
19. The F test / contraste de la F for the whisky data (1)
20. The F test for the whisky data (2)
- First, we calculate everything in the ANOVA table.
- Using the table of information that we have already:
- Then the EV is
21. The F test for the whisky data (3)
- For the UV
- We already have calculated
- So
22. The F test for the whisky data (4)
23. The F test for the whisky data (5)
- So F = 57.755 / 9.3556 = 6.173.
- If H0 is true, F has the F distribution with 2 and 18 degrees of freedom.
- Decide to use a 5% level of significance.
- $F_{0.05;\,2,\,18} = 3.55$ (from tables)
- Since F > 3.55, we reject H0.
- Conclude that the means are not all equal.
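The arithmetic of this slide can be reproduced in a few lines. The two variance estimates and the degrees of freedom are taken from the slide; everything else is an assumption added for illustration. In particular, the closed-form tail probability used as a cross-check of the table value holds only when the numerator has 2 degrees of freedom, and it is not part of the course material.

```python
# Variance estimates and degrees of freedom as given on the slide.
ms_explained = 57.755
ms_unexplained = 9.3556
df1, df2 = 2, 18
alpha = 0.05

f_value = ms_explained / ms_unexplained
table_value = 3.55                      # F(0.05; 2, 18) from tables
reject_h0 = f_value > table_value

# Cross-check, valid only for df1 == 2:
#   P(F > x) = (1 + 2 * x / df2) ** (-df2 / 2)
critical_closed_form = (df2 / 2) * (alpha ** (-2 / df2) - 1)
p_value = (1 + 2 * f_value / df2) ** (-df2 / 2)

print(round(f_value, 3), round(critical_closed_form, 2), reject_h0)
print(p_value)
```

The closed-form critical value agrees with the tabulated 3.55 to two decimals, and the tail probability is well below 0.05, consistent with rejecting H0.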
24. Class Example
- Sales of a fast food company have increased in the last year. A director of the company wants to know if the increase has been the same in the 4 regions of the country (North, South, East, West).
- Five establishments from each region are randomly chosen. The percentage increase in sales is observed.
- Construct the ANOVA table and test to see if there is a difference in mean percentage increase in sales between the 4 regions (use a 5% level of significance).
25. Class Example Data
26. ANOVA Table
27. The F test
- F = _________
- If H0 is true, F has the F distribution with ___ and ___ degrees of freedom.
- Decide to use a 5% level of significance.
- F0.05; __,__ = ______ (from tables)
- Conclude that