Chi-Square Procedures - PowerPoint PPT Presentation

Title: Chapter 11 Chi-Square Procedures. Author: Michael J Sullivan. Created: 3/28/2002 12:31:58 PM.

1
Chi-Square Procedures
  • Chi-Square Test for Goodness of Fit, Independence
    of Variables, and Homogeneity of Proportions

2
The chi-square goodness of fit test: you have
only one set of data on a single characteristic,
and you want to know whether it matches an
expected distribution based on the laws of
probability (1 variable, 1 population).
3
In a chi-square goodness of fit test, the null
hypothesis is always
H0: The data follow a specified distribution.
The alternative hypothesis is always
Ha: The data do not follow the specified
distribution.
4
The idea behind testing these types of claims is
to compare actual counts to the counts we would
expect if the null hypothesis were true. If a
significant difference between the actual counts
and expected counts exists, we would take this as
evidence against the null hypothesis.
5
The method for obtaining the expected counts
requires that we determine the number of
observations within each cell under the
assumption the null hypothesis is true.
6
Test Statistic for the Test of Goodness of Fit
Let O_i represent the observed count in the ith
cell and E_i the expected count in the ith cell.
Then
$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$
approximately follows the chi-square distribution
with (number of cells − 1) degrees of freedom.
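As a minimal sketch of the statistic on this slide, the sum over all cells of (O_i − E_i)² / E_i can be computed in a few lines of Python. The function name `chi_square_statistic` and the die-roll counts are hypothetical, chosen only for illustration; they do not come from the presentation.

```python
def chi_square_statistic(observed, expected):
    """Return the sum over cells of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, 10 expected per face if the die is fair.
observed = [8, 12, 10, 10, 9, 11]
expected = [10] * 6

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # number of cells minus 1

print(statistic, degrees_of_freedom)
```

A small statistic like this one means the observed counts sit close to the expected counts.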
9
The Chi-Square Test for Goodness of Fit
If a claim is made regarding the data following a
certain distribution, we can use the following
steps to test the claim, provided
1. the data are randomly selected,
2. all expected frequencies are greater than or
   equal to 1, and
3. at least 80% of the expected cell counts are
   greater than or equal to 5.
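The two conditions on expected counts can be checked mechanically. The sketch below assumes condition 3 means "at least 80% of the expected counts are 5 or more"; the function name `check_expected_counts` and the sample counts are hypothetical. Condition 1 (random selection) depends on how the data were collected and cannot be checked from the counts.

```python
def check_expected_counts(expected):
    """Check conditions 2 and 3 on a list of expected cell counts."""
    all_at_least_one = all(e >= 1 for e in expected)
    share_at_least_five = sum(1 for e in expected if e >= 5) / len(expected)
    return all_at_least_one and share_at_least_five >= 0.80


# Hypothetical expected counts, not taken from the presentation.
ok_counts = [12.0, 7.5, 6.0, 5.0, 4.2]    # 4 of 5 cells (80%) are >= 5
bad_counts = [12.0, 7.5, 6.0, 0.8, 4.2]   # one expected count is below 1

print(check_expected_counts(ok_counts), check_expected_counts(bad_counts))
```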
10
EXAMPLE Testing for Goodness of Fit
In consumer marketing, a common problem that any
marketing manager faces is the selection of
appropriate colors for package design. Assume
that a marketing manager wishes to compare five
different colors of package design. He is
interested in knowing if there is a preference
among the five colors so that it can be
introduced in the market. A random sample of 400
consumers reveals the following. Do the consumer
preferences for package colors show any
significant difference?
Package Color         Red  Blue  Green  Pink  Orange  Total
Customer Preference    70   106     80    70      74    400
12
Step 1. A claim is made regarding the data fit to
a certain distribution.
H0: The numbers of customers who prefer each
color are the same.
Ha: The numbers of customers who prefer each
color are not all the same.
14
Step 2: Calculate the expected frequencies
(counts) for each cell in the contingency table.
Observed Counts
Package Color         Red  Blue  Green  Pink  Orange  Total
Customer Preference    70   106     80    70      74    400
Expected Counts
Package Color         Red  Blue  Green  Pink  Orange  Total
Customer Preference    80    80     80    80      80    400
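The expected counts of 80 follow from the null hypothesis of equal preference: 400 consumers spread evenly over five colors. A short sketch of that arithmetic (the variable names are illustrative only):

```python
observed = {"Red": 70, "Blue": 106, "Green": 80, "Pink": 70, "Orange": 74}

total = sum(observed.values())           # 400 consumers in the sample
expected_each = total / len(observed)    # equal share per color under H0
expected = {color: expected_each for color in observed}

print(total, expected)
```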
15
Step 3: Verify that the requirements for the
chi-square test for goodness of fit are satisfied:
(1) the data are randomly selected,
(2) all expected frequencies are greater than or
    equal to 1, and
(3) at least 80% of the expected cell counts are
    greater than or equal to 5.
Step 4: Select a proper level of significance α.
17
Step 5: Compute the test statistic and P-value.
P-value = χ²cdf(min, max, df)
18
Step 5: Compute the test statistic and P-value.
χ² = 11.4
P-value = 0.0224
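The values 11.4 and 0.0224 can be reproduced with the standard library alone. This is a sketch, not the calculator routine the slides refer to: the helper `chi_square_sf_even_df` is a hypothetical name, and it relies on a closed form for the upper-tail probability that holds only when the degrees of freedom are even (here df = 4).

```python
import math


def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    Uses exp(-x/2) * sum_{j < df/2} (x/2)^j / j!, which is valid only
    when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


observed = [70, 106, 80, 70, 74]
expected = [80] * 5

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi_square_sf_even_df(statistic, df=len(observed) - 1)

print(round(statistic, 2), round(p_value, 4))
```

For odd degrees of freedom a statistics library would be needed; none is assumed here.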
20
If P-value < α, reject the null hypothesis.
11.4 > 9.49 and 0.0224 < 0.05 (9.49 is the
critical value for 4 degrees of freedom at
α = 0.05). Therefore I would reject the null
hypothesis. The result is statistically
significant, and I am led to believe that there
is a difference in preference of package color.
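The two comparisons on this slide are equivalent ways of stating the same decision. The sketch below checks that the quoted critical value 9.49 leaves about 0.05 in the upper tail for 4 degrees of freedom, then applies both rules. The helper `upper_tail_df4` is a hypothetical name and uses a closed form that is valid for df = 4 only.

```python
import math

alpha = 0.05
statistic = 11.4
critical_value = 9.49  # value quoted on the slide, taken as given


def upper_tail_df4(x):
    """P(X >= x) for a chi-square variable with exactly 4 degrees of freedom."""
    return math.exp(-x / 2) * (1 + x / 2)


tail_at_critical = upper_tail_df4(critical_value)  # should be close to alpha
p_value = upper_tail_df4(statistic)

reject_by_p_value = p_value < alpha
reject_by_critical_value = statistic > critical_value

print(round(tail_at_critical, 3), reject_by_p_value, reject_by_critical_value)
```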
21
The chi-square independence test: you have two
characteristics of a population, and you want to
see whether there is any association between the
characteristics (2 variables, 1 population).
22
In a chi-square independence test, the null
hypothesis is always
H0: The variables are independent.
The alternative hypothesis is always
Ha: The variables are dependent.
23
The idea behind testing these types of claims is
to compare actual counts to the counts we would
expect if the null hypothesis were true (if the
variables are independent). If a significant
difference between the actual counts and expected
counts exists, we would take this as evidence
against the null hypothesis.
24
The method for obtaining the expected counts
requires that we determine the number of
observations within each cell under the
assumption the null hypothesis is true.
25
Expected Frequencies in a Chi-Square Independence
Test
To find the expected frequency in a cell when
performing a chi-square independence test,
multiply the row total of the row containing the
cell by the column total of the column containing
the cell and divide this result by the table
total. That is,
$E = \frac{(\text{row total})(\text{column total})}{\text{table total}}$
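The rule (row total times column total, divided by table total) can be sketched as a small function over a table stored as a list of rows. The name `expected_counts` and the 2 x 2 table are hypothetical and serve only to illustrate the rule.

```python
def expected_counts(observed):
    """Expected count per cell: row total * column total / table total."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in column_totals] for r in row_totals]


# Hypothetical 2 x 2 table, not taken from the presentation.
observed = [[10, 20],
            [20, 10]]
expected = expected_counts(observed)

print(expected)
```

Every row and column total here is 30 and the table total is 60, so each expected count is 30 * 30 / 60 = 15.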
26
Test Statistic for the Test of Independence
Let O_i represent the observed count in the ith
cell and E_i the expected count in the ith cell.
Then
$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$
approximately follows the chi-square distribution
with (r − 1)(c − 1) degrees of freedom, where r is
the number of rows and c is the number of columns
in the contingency table.
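For a table with r rows and c columns, the same sum runs over every cell and the degrees of freedom are (r − 1)(c − 1). A minimal sketch follows; the function name `chi_square_for_table` and the 2 x 2 counts are hypothetical.

```python
def chi_square_for_table(observed, expected):
    """Return (statistic, degrees of freedom) for an r x c table."""
    statistic = sum(
        (o - e) ** 2 / e
        for observed_row, expected_row in zip(observed, expected)
        for o, e in zip(observed_row, expected_row)
    )
    rows, columns = len(observed), len(observed[0])
    return statistic, (rows - 1) * (columns - 1)


# Hypothetical 2 x 2 table with an expected count of 15 in every cell.
observed = [[10, 20], [20, 10]]
expected = [[15.0, 15.0], [15.0, 15.0]]
statistic, degrees_of_freedom = chi_square_for_table(observed, expected)

print(round(statistic, 3), degrees_of_freedom)
```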
29
The Chi-Square Test for Independence
If a claim is made regarding the association
between (or independence of) two variables in a
contingency table, we can use the following steps
to test the claim, provided
1. the data are randomly selected,
2. all expected frequencies are greater than or
   equal to 1, and
3. at least 80% of the expected cell counts are
   greater than or equal to 5.
30
EXAMPLE Testing for Independence
         Money  Health  Love
Men         82     446   355
Women       46     574   273
32
Step 1. A claim is made regarding the
independence of the data.
H0: There is no association between gender and
lifestyle choice; the variables are independent.
Ha: There is an association between gender and
lifestyle choice; the variables are dependent.
34
Step 2: Calculate the expected frequencies
(counts) for each cell in the contingency table.
Observed Counts
         Money  Health    Love
Men         82     446     355
Women       46     574     273
Expected Counts
         Money  Health    Love
Men      63.64  507.13  312.23
Women    64.36  512.87  315.77
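The expected counts in this table come from the row total times column total over table total rule. As a check, the sketch below recomputes them from the observed counts; the dictionary layout is just one convenient way to hold the table.

```python
observed = {
    "Men":   {"Money": 82, "Health": 446, "Love": 355},
    "Women": {"Money": 46, "Health": 574, "Love": 273},
}

row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
column_totals = {}
for cells in observed.values():
    for column, count in cells.items():
        column_totals[column] = column_totals.get(column, 0) + count
table_total = sum(row_totals.values())

# Expected count = row total * column total / table total, rounded to 2 places.
expected = {
    row: {column: round(row_totals[row] * column_totals[column] / table_total, 2)
          for column in column_totals}
    for row in observed
}

for row, cells in expected.items():
    print(row, cells)
```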
35
Step 3: Verify that the requirements for the
chi-square test for independence are satisfied:
(1) the data are randomly selected,
(2) all expected frequencies are greater than or
    equal to 1, and
(3) at least 80% of the expected cell counts are
    greater than or equal to 5.
Step 4: Select a proper level of significance α.
36
Step 5: Compute the test statistic and P-value.
P-value = χ²cdf(min, max, df)
37
Step 5: Compute the test statistic and P-value.
χ² = 36.84
P-value ≈ 0.00000001
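A sketch that reproduces the statistic and P-value for the gender and lifestyle table with the standard library only. It assumes df = (2 − 1)(3 − 1) = 2, for which the upper-tail probability of the chi-square distribution has the closed form exp(−x/2); that shortcut does not apply to other degrees of freedom.

```python
import math

observed = [[82, 446, 355],   # Men:   Money, Health, Love
            [46, 574, 273]]   # Women: Money, Health, Love

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
table_total = sum(row_totals)

statistic = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * column_totals[j] / table_total
        statistic += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = math.exp(-statistic / 2)  # upper tail, valid for df = 2 only

print(round(statistic, 2), df, p_value)
```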
39
If P-value < α, reject the null hypothesis.
36.84 > 5.99 and 0.00000001 < 0.05 (5.99 is the
critical value for 2 degrees of freedom at
α = 0.05). Therefore I would reject the null
hypothesis. The result is statistically
significant, and I am led to believe that there
is an association between gender and lifestyle
choice and that these variables are dependent.
40
In a chi-square test for homogeneity, you take
samples from different populations, and you want
to test whether the proportions in various
categories are the same for each population (1
variable, multiple populations).
41
In a chi-square homogeneity test, the null
hypothesis is always
H0: The populations have the same proportion of
individuals with some characteristic.
The alternative hypothesis is always
Ha: The populations have different proportions
of individuals with some characteristic.
42
The idea behind testing these types of claims is
to compare actual counts to the counts we would
expect if the null hypothesis were true
(proportions are equal). If a significant
difference between the actual counts and expected
counts exists, we would take this as evidence
against the null hypothesis.
43
The method for obtaining the expected counts
requires that we determine the number of
observations within each cell under the
assumption the null hypothesis is true.
44
Expected Frequencies in a Chi-Square Homogeneity
Test
To find the expected frequency in a cell when
performing a chi-square homogeneity test,
multiply the row total of the row containing the
cell by the column total of the column containing
the cell and divide this result by the table
total. That is,
$E = \frac{(\text{row total})(\text{column total})}{\text{table total}}$
45
Test Statistic for the Test of Homogeneity
Let O_i represent the observed count in the ith
cell and E_i the expected count in the ith cell.
Then
$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$
approximately follows the chi-square distribution
with (r − 1)(c − 1) degrees of freedom, where r is
the number of rows and c is the number of columns
in the contingency table.
48
The Chi-Square Test for Homogeneity
If a claim is made that different populations
have the same proportion of individuals with some
characteristic, we can use the following steps to
test the claim, provided
1. the data are randomly selected,
2. all expected frequencies are greater than or
   equal to 1, and
3. at least 80% of the expected cell counts are
   greater than or equal to 5.
49
EXAMPLE A Test of Homogeneity of Proportions
The following question was asked of a random
sample of individuals in 1992, 1998, and 2001:
"Would you tell me if you feel being a teacher is
an occupation of very great prestige?" The
results of the survey are presented below.
       1992  1998  2001
Yes     549   539   570
No      522   578   599
51
Step 1. A claim is made regarding the homogeneity
of the data.
H0: The proportions of individuals who feel
teaching is an occupation of very great prestige
are equal in each year.
Ha: The proportions of individuals who feel
teaching is an occupation of very great prestige
are not equal in each year.
53
Step 2: Calculate the expected frequencies
(counts) for each cell in the contingency table.
Observed Counts
         1992    1998    2001
Yes       549     539     570
No        522     578     599
Expected Counts
         1992    1998    2001
Yes    528.96  551.68  577.36
No     542.04  565.32  591.64
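One way to read these expected counts: under the null hypothesis every year shares the same pooled "Yes" proportion, so each expected count is that pooled proportion times the year's sample size. This gives the same numbers as the row total times column total over table total rule. The sketch below shows both the per-year sample proportions and the expected counts; the variable names are illustrative.

```python
years = ["1992", "1998", "2001"]
yes_counts = [549, 539, 570]
no_counts = [522, 578, 599]

year_totals = [y + n for y, n in zip(yes_counts, no_counts)]
pooled_yes = sum(yes_counts) / sum(year_totals)  # overall "Yes" proportion under H0

sample_yes = [y / t for y, t in zip(yes_counts, year_totals)]
expected_yes = [round(pooled_yes * t, 2) for t in year_totals]
expected_no = [round((1 - pooled_yes) * t, 2) for t in year_totals]

for year, p, e_yes, e_no in zip(years, sample_yes, expected_yes, expected_no):
    print(year, round(p, 4), e_yes, e_no)
```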
54
Step 3: Verify that the requirements for the
chi-square test for homogeneity are satisfied:
(1) the data are randomly selected,
(2) all expected frequencies are greater than or
    equal to 1, and
(3) at least 80% of the expected cell counts are
    greater than or equal to 5.
Step 4: Select a proper level of significance α.
55
Step 5: Compute the test statistic and P-value.
P-value = χ²cdf(min, max, df)
56
Step 5: Compute the test statistic and P-value.
χ² = 2.26
P-value = 0.3228
58
If P-value < α, reject the null hypothesis.
2.26 < 9.21 and 0.323 > 0.01 (9.21 is the
critical value for 2 degrees of freedom at
α = 0.01). Therefore I would fail to reject the
null hypothesis. The result is not statistically
significant, and I cannot conclude that the
proportions of individuals who feel teaching is
an occupation of very great prestige differ from
year to year.
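The whole homogeneity example can be traced end to end in a short sketch: expected counts, statistic, P-value, and the decision at α = 0.01. As before, the P-value uses the closed form exp(−x/2), which is valid only because df = (2 − 1)(3 − 1) = 2 here; this is an illustration with the standard library, not the calculator procedure from the slides.

```python
import math

observed = [[549, 539, 570],   # Yes: 1992, 1998, 2001
            [522, 578, 599]]   # No:  1992, 1998, 2001
alpha = 0.01

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
table_total = sum(row_totals)

# Sum of (O - E)^2 / E with E = row total * column total / table total.
statistic = sum(
    (o - r * c / table_total) ** 2 / (r * c / table_total)
    for row, r in zip(observed, row_totals)
    for o, c in zip(row, column_totals)
)
p_value = math.exp(-statistic / 2)  # upper tail, valid for df = 2 only
reject_null = p_value < alpha

print(round(statistic, 2), round(p_value, 4), reject_null)
```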