Week 12 objectives presentation

1
Week 12 objectives

1. Overview of hypothesis tests
2. ANOVA
(a) Example
(b) Minitab
(c) ANOVA Table
(d) Conceptual view of ANOVA
3. Assumptions and Conditions for ANOVA

2
1. Overview of hypothesis tests
3
When may ANOVA be needed?

A supermarket chain store executive needs to
determine whether the sales of a new product are
affected by the aisle in which the product is
stored.
Possible experiment
10 aisles in the store
Locate the product in each of the 10 aisles for a
week and record daily sales
Are there differences among mean daily sales?
A claim about ten populations of quantitative
data is to be tested using sample information

4
2. ANOVA introduction

An analysis of variance (ANOVA) is a test about
population means
The usual null hypothesis states that the means of
some group of populations are all equal
The alternative hypothesis is simply that those
means are not all equal
This test about means is carried out by
decomposing certain sample variances

5
ANOVA hypotheses
6
2(a) Example 1

Random samples of 15 students from each of three
faculties record the dollar amounts spent on
textbooks and stationery
Are the mean spending levels different across the
three faculties?
How to answer the question?

7
The boxplots for the faculty spending example
8
Discussion

Conclusions about population mean spending levels
are to be based on samples
This requires a statistical inference method
Which inference method?
The question to be investigated is whether
population mean spending levels are the same or
not, i.e. it needs a yes/no answer
This indicates a hypothesis test
For differences between more than two population
means, ANOVA is required

9
ANOVA: the F ratio test statistic
The form of the test statistic is the F ratio
10
ANOVA: the F ratio test statistic (cont.)

The ratio is sensitive to whether or not the null
hypothesis is true
The null distribution of the F-ratio is the
F-distribution
Using the F-ratio to carry out a test is called
an F-test
The results are laid out in a very convenient
ANOVA Table

11
2(b) ANOVA using Minitab

For one-way ANOVA, data can either be presented
in separate columns
(use Stat > ANOVA > One-way (Unstacked))
or
stacked in a single column, in which case another
column is needed to contain the sample labels.
Then use Stat > ANOVA > One-way.
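The two data layouts can be illustrated outside Minitab. The following is a minimal Python sketch, not Minitab code; the faculty names and dollar amounts are hypothetical, and the helper names stack and unstack are made up for this illustration.

```python
# Hypothetical spending figures (dollars); these are NOT the lecture's data.
unstacked = {
    "Arts": [120, 95, 140],
    "Science": [210, 180, 195],
    "Business": [160, 150, 175],
}

def stack(columns):
    """Turn {label: [values]} into a response column plus a label column."""
    response, labels = [], []
    for label, values in columns.items():
        response.extend(values)
        labels.extend([label] * len(values))
    return response, labels

def unstack(response, labels):
    """Inverse of stack(): group the response values by their sample label."""
    columns = {}
    for value, label in zip(response, labels):
        columns.setdefault(label, []).append(value)
    return columns

response, labels = stack(unstacked)
for label, value in zip(labels, response):
    print(label, value)
```

In the stacked layout every row is one observation, so samples of different sizes fit without padding; the label column is what tells the analysis which sample each value belongs to.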

12
2(c) Use the ANOVA Table to write out a six-step
solution
Minitab output for Example 1
(Note: the within samples SSQ is called the
error SSQ in Minitab.)
13
Six steps Solution
14
The six steps (cont.)

P-value = 0.003
Decision rule: Reject H0 if P-value < 0.05, but
if P-value > 0.05 then we cannot reject H0. In
the present case P-value = 0.003 < 0.05, so H0
is rejected.
There is strong evidence to conclude that at
least two population mean expenditure levels are
different.
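The decision rule can be written as a tiny function. This is only a sketch: the function name decide and the second P-value (0.20) are hypothetical, while 0.003 and the 0.05 level come from the slide.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when the P-value is below alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P-value must lie between 0 and 1")
    return "reject H0" if p_value < alpha else "cannot reject H0"

print(decide(0.003))  # the P-value from Example 1
print(decide(0.20))   # a hypothetical, much larger P-value
```

Note that "cannot reject H0" is not the same as accepting H0; it only says the sample does not provide enough evidence against it at the chosen level.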

15
More comments: P-values and the distribution of
the F ratio

The F-ratio,

F = Between samples MSQ / Within samples MSQ,

tends to have a value around 1 if H0 is true,
but becomes inflated if H0 is not true.
Thus large values of F are significant.
The P-value is the probability
Pr(an F-distributed variable > observed F value),
obtained from tables of the F(k-1, N-k) distribution, or by Minitab.
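For a rough check without tables, the upper-tail probability can be computed directly in one special case. When exactly k = 3 samples are compared, the numerator degrees of freedom are k - 1 = 2, and the F(2, N-k) distribution then has the closed-form tail Pr(F > f) = (1 + 2f/(N-k))^(-(N-k)/2). This formula is a standard property of the F distribution that is not stated on the slides, and it does NOT apply for other values of k. The Python sketch below assumes that special case; the function name and the observed F value of 4.0 are hypothetical, while the 42 error degrees of freedom match three samples of 15 as in Example 1.

```python
def f_upper_tail_df1_2(f_value, df_error):
    """Pr(F > f_value) for an F distribution with (2, df_error) degrees of freedom.

    Valid ONLY when the numerator degrees of freedom k - 1 equal 2,
    i.e. when exactly k = 3 samples are compared.
    """
    if f_value < 0:
        raise ValueError("an F ratio cannot be negative")
    if df_error <= 0:
        raise ValueError("error degrees of freedom must be positive")
    return (1.0 + 2.0 * f_value / df_error) ** (-df_error / 2.0)

# Three samples of 15 give N - k = 45 - 3 = 42 error degrees of freedom.
# The observed F value of 4.0 is hypothetical.
p_value = f_upper_tail_df1_2(4.0, 42)
print(f"P-value = {p_value:.4f}")
```

For any other number of samples, the tail probability has to come from F tables or from statistical software such as Minitab.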

16
Example F distribution
17
More about the F distribution
18
2(d) Conceptual View of ANOVA (1)
Consider the following two experiments to examine
the effectiveness of three different teaching
methods on two campuses (City West and Mawson
Lakes). Here is the raw data
Which experiment has better evidence of a
difference in the true (POPULATION) average
results among the methods?
19
Conceptual view of ANOVA (2)
Could variations among the means this large
plausibly be due to chance,
or is it good evidence that the
POPULATION means differ?
It seems that in experiment 1, it is easier to
justify the differences between the levels of the
factor because the results are so consistent. The
heart of ANOVA is to compare the variability
among the group means to the variability within
each group.
20
Conceptual View of ANOVA (3)

In experiment 1, the variability among the group
means is much larger than the variability of
individual observations within each single group.
This is the basic idea behind ANOVA.
This technique examines the data for evidence of
differences in the corresponding population means
by looking at the ratio of the variability among
the group means to the variability within the groups.
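The comparison can be tried on made-up numbers. The Python sketch below uses two hypothetical experiments (not the lecture's raw data) that share the same three group means but differ in how spread out the results are within each group. It assumes equal group sizes, in which case the between-groups mean square is the group size times the variance of the group means, and the within-groups mean square is the average of the group variances.

```python
from statistics import mean, variance

def variability_ratio(groups):
    """Between-groups over within-groups variability, for equal-sized groups only."""
    sizes = {len(g) for g in groups}
    if len(sizes) != 1:
        raise ValueError("this shortcut assumes equal group sizes")
    n = sizes.pop()
    among = n * variance([mean(g) for g in groups])  # variability among group means
    within = mean([variance(g) for g in groups])     # average variability inside a group
    return among / within

# Hypothetical marks for three teaching methods; both experiments have
# group means 70, 80 and 90, but experiment 2 is far less consistent.
experiment_1 = [[69, 70, 71], [79, 80, 81], [89, 90, 91]]
experiment_2 = [[50, 70, 90], [60, 80, 100], [70, 90, 110]]

ratio_1 = variability_ratio(experiment_1)
ratio_2 = variability_ratio(experiment_2)
print(ratio_1, ratio_2)
```

The group means are identical in the two experiments, so the large difference between the two ratios comes entirely from the within-group spread.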

21
Review of SSQ (from week 2)
In ANOVA a variance is called a mean square.
22
2(a) ANOVA decomposition of SSQ (leave out as
non-examinable)
23
ANOVA decomposition of SSQ, contd.

The decomposition
Total SSQ = Between samples SSQ + Within samples SSQ
applies also to degrees of freedom:
N - 1 = (k - 1) + (N - k)
Dividing SSQ terms by degrees of freedom gives
mean-square, or MSQ terms
If some of the SSQ terms become inflated more
than others, so will the corresponding MSQ terms
But if the null hypothesis is true, both MSQ
terms are estimates of the same thing: the
natural experimental variability of the data.
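The decomposition can be checked numerically. The following Python sketch uses hypothetical data with unequal sample sizes; the function name ssq_decomposition is made up for this illustration.

```python
def ssq_decomposition(groups):
    """Return (total, between, within) sums of squares for a list of samples."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    total = sum((x - grand_mean) ** 2 for x in values)
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return total, between, within

groups = [[4.0, 5.0, 6.0], [7.0, 9.0], [1.0, 2.0, 3.0, 6.0]]  # hypothetical data
total, between, within = ssq_decomposition(groups)
N, k = sum(len(g) for g in groups), len(groups)

# The sums of squares and the degrees of freedom both add up.
print(abs(total - (between + within)) < 1e-9)
print(N - 1 == (k - 1) + (N - k))

# Dividing each SSQ by its degrees of freedom gives the MSQ terms.
msq_between = between / (k - 1)
msq_within = within / (N - k)
print(msq_between, msq_within)
```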

24
ANOVA: how the decomposition leads to a test

If the null hypothesis of equal population means
is true, the corresponding MSQ terms both
estimate error variance and are approximately
equal.
But if the population means differ, the Between
samples SSQ becomes inflated and its MSQ tends to
be bigger than the Within samples, or error, MSQ.
Thus the ratio of MSQ terms provides a test
statistic: if H0 is true, both MSQ terms estimate
the same thing, and the ratio is about 1.
So the value 1 is roughly in the middle of the
null distribution of the test ratio.
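A small simulation can make this behaviour visible. The Python sketch below is an illustration under assumed settings, none of which come from the slides: normally distributed data, three samples of 10, a common standard deviation of 5, and 2000 repetitions with a fixed random seed.

```python
import random

def f_ratio(groups):
    """Between samples MSQ divided by within samples MSQ."""
    values = [x for g in groups for x in g]
    n_total, k = len(values), len(groups)
    grand_mean = sum(values) / n_total
    means = [sum(g) / len(g) for g in groups]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (between / (k - 1)) / (within / (n_total - k))

def average_f(population_means, n=10, sd=5.0, repeats=2000, seed=0):
    """Average F ratio over many simulated experiments (assumed normal data)."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(repeats):
        groups = [[rng.gauss(mu, sd) for _ in range(n)] for mu in population_means]
        ratios.append(f_ratio(groups))
    return sum(ratios) / repeats

f_null = average_f([50, 50, 50])  # H0 true: equal population means
f_alt = average_f([50, 55, 60])   # H0 false: population means differ
print(f"average F when H0 is true:  {f_null:.2f}")
print(f"average F when H0 is false: {f_alt:.2f}")
```

With equal population means the average ratio stays close to 1, while unequal means inflate it, which is why only large values of F count as evidence against H0.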

25
Lecture Exercise 1: the textbook example (12.2.1)
26
Will there be evidence of different population
means?
Boxplots in the Lecture Exercise 1
27
Lecture Exercise 1 cont.
Analysis of Variance
Source   DF     SS    MS     F      P
Brands    2   5.09  2.54  0.87  0.437
Error    18  52.87  2.94
Total    20  57.95

The between and within samples MSQ terms are
5.09/2 = 2.54 and 52.87/18 = 2.94
The ratio of these MSQ terms, 2.54/2.94, is reported
by Minitab as F = 0.87, which is less than 1.
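The arithmetic behind the table can be reproduced in a few lines. This Python sketch uses only the SS and DF figures printed in the Minitab output; because those figures are already rounded, the results may differ from Minitab's in the last digit.

```python
# SS and DF copied from the Minitab table for Lecture Exercise 1.
ss_brands, df_brands = 5.09, 2
ss_error, df_error = 52.87, 18

ms_brands = ss_brands / df_brands  # between samples MSQ
ms_error = ss_error / df_error     # within samples (error) MSQ
f_ratio = ms_brands / ms_error

print(f"MS(Brands) = {ms_brands:.3f}")
print(f"MS(Error)  = {ms_error:.3f}")
print(f"F          = {f_ratio:.3f}")
```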

28
Solution Steps (i), (ii), (iii)
29
Solution Steps (iv), (v) and (vi)

P-value = 0.437
Decision rule: Reject H0 if P-value < 0.05, but
if P-value > 0.05 then we cannot reject H0. In
the present case P-value = 0.437 > 0.05, so H0
cannot be rejected.
There is no evidence suggesting differences
between population mean brand levels of toxin.

30
Lecture Exercise 2

The XYZ Corporation is interested in possible
differences in days worked by salaried employees
in three different departments in the financial
area.
A survey of 23 randomly chosen employees reveals
the data shown below.
At the 1% significance level, are the mean annual
attendance rates different for employees in these
three departments?

31
Boxplots for Lecture Exercise 2
32
Minitab output for Exercise 2
One-way ANOVA: Budgets, Payables, Pricing

Analysis of Variance
Source  DF    SS   MS     F      P
Factor   2  1804  902  3.43  0.052
Error   20  5257  263
Total   22  7060

Level      N    Mean  StDev
Budgets    5  261.20  11.95
Payables  10  238.00  21.24
Pricing    8  244.38   9.46

Pooled StDev = 16.21
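The ANOVA table and the pooled standard deviation can be rebuilt from the per-department summary alone, which is a useful check on how the pieces of the output fit together. The Python sketch below uses only the N, Mean and StDev figures from the Minitab output; since those are rounded, the results agree with the table only up to rounding.

```python
from math import sqrt

# (N, Mean, StDev) per department, copied from the Minitab summary.
summary = {
    "Budgets": (5, 261.20, 11.95),
    "Payables": (10, 238.00, 21.24),
    "Pricing": (8, 244.38, 9.46),
}

n_total = sum(n for n, _, _ in summary.values())
k = len(summary)
grand_mean = sum(n * m for n, m, _ in summary.values()) / n_total

# Between samples (Factor) SSQ from the group means,
# within samples (Error) SSQ from the group standard deviations.
ss_factor = sum(n * (m - grand_mean) ** 2 for n, m, _ in summary.values())
ss_error = sum((n - 1) * s ** 2 for n, _, s in summary.values())

ms_factor = ss_factor / (k - 1)
ms_error = ss_error / (n_total - k)
f_ratio = ms_factor / ms_error
pooled_sd = sqrt(ms_error)  # the pooled StDev is the square root of the error MSQ

print(f"Factor: DF={k - 1}, SS={ss_factor:.0f}, MS={ms_factor:.0f}, F={f_ratio:.2f}")
print(f"Error:  DF={n_total - k}, SS={ss_error:.0f}, MS={ms_error:.0f}")
print(f"Pooled StDev = {pooled_sd:.2f}")
```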
33
Solution Steps (i), (ii), (iii)
34
Solution Steps (iv), (v) and (vi)

P-value = 0.052
Decision rule: Reject H0 if P-value < 0.01, but
if P-value > 0.01 then we cannot reject H0. In
the present case P-value = 0.052 > 0.01, so H0
cannot be rejected.
There is not enough evidence to suggest
differences between mean annual attendance rates
in the three departments.

35
3. Assumptions and Conditions

Well defined continuous variables?
Representative sample?
Large sample sizes or normally distributed
variables?
Look at a normal probability plot of residuals,
but large degrees of freedom for error term helps
the CLT to work
Equal variances?
Look at the sample sizes. If equal, this
protects against adverse consequences. If not
equal, look at sample s.d.s.
Independence?

36
Assumptions and conditions in Lecture Exercise 1
(Toxin readings)

Well defined continuous variables?
Toxin readings are continuous variables
Equal variances?
The sample sizes are equal, which protects against
the consequences of unequal variances
Representative sample?
Yes, as a result of random sampling.
Independence?
It would be OK if, say, the readings were carried
out separately

37
Normality condition?

The Error (Within Samples) degrees of freedom is
18 < 30, which is not large enough
However, a normal probability plot of the
residuals shown on the next slide confirms the
normal distribution of the residuals.

38
Normal probability plot of residuals
39
The case k = 2: comparing two means

Can be tested either with
a 2-sample t-test or
a 2-sample z-test, or
one way ANOVA based on two samples
The t-test and z-test use a CI for the difference
between means; see Week 10
The P-value is identical for the two-sided
pooled-variance 2-sample t-test and ANOVA
See the textbook for an example, and more
explanation
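The link between the two tests can be seen numerically: with two samples, the ANOVA F ratio equals the square of the pooled-variance t statistic, which is why the two P-values coincide. The Python sketch below checks this on hypothetical data; the equality is assumed to hold only for the pooled (equal variances) version of the t-test, not for the unpooled one.

```python
from math import sqrt
from statistics import mean, variance

a = [12.1, 14.3, 11.8, 13.5, 12.9]  # hypothetical sample 1
b = [15.2, 16.1, 14.8, 17.0]        # hypothetical sample 2
n_a, n_b = len(a), len(b)

# Pooled-variance two-sample t statistic.
pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
t_stat = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))

# One-way ANOVA F ratio for the same two samples (k = 2).
grand_mean = mean(a + b)
ss_between = n_a * (mean(a) - grand_mean) ** 2 + n_b * (mean(b) - grand_mean) ** 2
ss_within = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
f_ratio = (ss_between / (2 - 1)) / (ss_within / (n_a + n_b - 2))

print(f"t squared = {t_stat ** 2:.6f}")
print(f"F ratio   = {f_ratio:.6f}")
```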
