Chi Square - χ²

Provided by: john77
1
Chapter 14
  • Chi Square - χ²

2
Chi Square
  • Chi Square is a non-parametric statistic used to
    test the null hypothesis.
  • It is used for nominal data.
  • It plays the same role for nominal data that the
    F test played for single factor and factorial
    analysis.

3
Chi Square
  • Nominal data puts each participant in a category.
    Categories are best when mutually exclusive and
    exhaustive. This means that each and every
    participant fits in one and only one category.
  • Chi Square looks at frequencies in the
    categories.

4
Expected frequencies and the null hypothesis ...
  • Chi Square compares the expected frequencies in
    categories to the observed frequencies in
    categories.
  • Expected frequencies are the frequencies in each
    cell predicted by the null hypothesis.

5
Expected frequencies and the null hypothesis ...
  • The null hypothesis
  • H0: fo = fe
  • There is no difference between the observed
    frequency and the frequency predicted (expected)
    by the null.
  • The experimental hypothesis
  • H1: fo ≠ fe
  • The observed frequency differs significantly from
    the frequency predicted (expected) by the null.

6
Calculating χ²
For each cell
  • Calculate the deviations of the observed from the
    expected.
  • Square the deviations.
  • Divide the squared deviations by the expected
    value.

7
Calculating χ²
  • Add 'em up.
  • Then, look up χ² in the Chi Square table.
  • df = k - 1, where k is the number of categories
    (one sample χ²)
  • OR df = (Columns - 1)(Rows - 1)
    (2 or more samples)
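
The steps on slides 6 and 7 can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original slides; the function names chi_square, df_one_sample, and df_multi_sample are assumptions chosen for this example.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for o, e in zip(observed, expected):
        deviation = o - e              # deviation of observed from expected
        total += deviation ** 2 / e    # square it, divide by expected, add up
    return total


def df_one_sample(k):
    """Degrees of freedom for a one sample chi square with k categories."""
    return k - 1


def df_multi_sample(rows, columns):
    """Degrees of freedom for a table with the given rows and columns."""
    return (columns - 1) * (rows - 1)
```

The resulting value is then compared with the tabled critical value for the matching df.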

8
Critical values of χ²
df      1      2      3      4      5      6      7      8
.05   3.84   5.99   7.82   9.49  11.07  12.59  14.07  15.51
.01   6.63   9.21  11.34  13.28  15.09  16.81  18.48  20.09

df      9     10     11     12     13     14     15     16
.05  16.92  18.31  19.68  21.03  22.36  23.68  25.00  26.30
.01  21.67  23.21  24.72  26.22  27.69  29.14  30.58  32.00

df     17     18     19     20     21     22     23     24
.05  27.59  28.87  30.14  31.41  32.67  33.92  35.17  36.42
.01  33.41  34.81  36.19  37.57  38.93  40.29  41.64  42.98

df     25     26     27     28     29     30
.05  37.65  38.89  40.11  41.34  42.56  43.77
.01  44.31  45.64  46.96  48.28  49.59  50.89
9
Critical values of χ²
(Table of critical values as on slide 8.)
Highlighted: degrees of freedom (the df rows)
10
Critical values of χ²
(Table of critical values as on slide 8.)
Highlighted: critical values for α = .05
11
Critical values of χ²
(Table of critical values as on slide 8.)
Highlighted: critical values for α = .01
12
Example
If there were 5 degrees of freedom, how big would
χ² have to be for significance at the .05 level?
13
Critical values of χ²
(Table of critical values as on slide 8.)
For df = 5 at the .05 level, the critical value is 11.07.
14
Using the χ² table.
If there were 2 degrees of freedom, how big would
χ² have to be for significance at the .05 level?
Note: Unlike most other tables you have seen,
the critical values for Chi Square get larger as
df increases. This is because you are summing
over more cells, each of which usually
contributes to the total observed value of chi
square.
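
Looking up a critical value can be sketched as a small table lookup in Python. The dictionary below holds only the df 1 to 8 entries of the table of critical values (with the standard 7.82 for df = 3 at the .05 level); the names CRITICAL, critical_value, and is_significant are assumptions for illustration.

```python
# Critical values of chi square for df 1 to 8, keyed by alpha level.
CRITICAL = {
    0.05: [3.84, 5.99, 7.82, 9.49, 11.07, 12.59, 14.07, 15.51],
    0.01: [6.63, 9.21, 11.34, 13.28, 15.09, 16.81, 18.48, 20.09],
}


def critical_value(df, alpha=0.05):
    """Return the tabled critical value for df between 1 and 8."""
    if not 1 <= df <= 8:
        raise ValueError("this sketch only tabulates df 1 to 8")
    return CRITICAL[alpha][df - 1]


def is_significant(chi_sq, df, alpha=0.05):
    """True when the observed chi square reaches the critical value."""
    return chi_sq >= critical_value(df, alpha)


print(critical_value(5))  # 11.07
print(critical_value(2))  # 5.99
```

Each row of the dictionary grows from left to right, which is the pattern the note describes: larger df, larger critical value.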
15
Critical values of χ²
(Table of critical values as on slide 8.)
For df = 2 at the .05 level, the critical value is 5.99.
16
One sample example
Party: 75% male, 25% female. There are 40 swimmers.
Since 75% of people at the party are male, 75% of
swimmers should be male. So the expected value for
males is .750 × 40 = 30. For women it is
.250 × 40 = 10.
                Male  Female
Observed          20      20
Expected          30      10
O-E              -10      10
(O-E)²           100     100
(O-E)²/E        3.33   10.00
df = k - 1 = 2 - 1 = 1
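
The swimmer example can be checked with a short Python sketch. The variable names are assumptions for illustration; the numbers come from the slide.

```python
n_swimmers = 40
proportions = {"male": 0.75, "female": 0.25}   # make-up of the party
observed = {"male": 20, "female": 20}          # who actually went swimming

# Expected frequency = proportion predicted by the null x n
expected = {group: p * n_swimmers for group, p in proportions.items()}

chi_sq = sum(
    (observed[g] - expected[g]) ** 2 / expected[g] for g in observed
)
df = len(observed) - 1

print(expected)               # {'male': 30.0, 'female': 10.0}
print(round(chi_sq, 2), df)   # 13.33 1
```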
17
χ²(1, n = 40) = 13.33
Critical values of χ²
(Table of critical values as on slide 8.)
Exceeds critical value at α = .01. Reject the null
hypothesis.
Gender does affect who goes swimming.
Women go swimming more than expected.
Men go swimming less than expected.
18
2 sample example
Freshmen and sophomores who like horror movies.
                         Freshmen   Sophomores
Likes horror films          150          50
Dislikes horror films       100         200
19
There are 500 altogether. 200 (a proportion of
.400) like horror movies; 300 (.600) dislike
horror films. (Proportions appear in parentheses
in the margins.) Multiplying the proportion in
the likes horror films row by the number in the
Freshman column yields the expected frequency for
the first cell. The formula is
Expected Frequency = Prop(row) × n(col). (EF appears
in parentheses in each cell.)
                         Freshmen     Sophomores   Total
Likes horror films       150 (100)     50 (100)    200 (.400)
Dislikes horror films    100 (150)    200 (150)    300 (.600)
Total                    250          250          500
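
The expected frequency rule (row proportion times column total) can be sketched for a table of any size. The function name expected_frequencies is an assumption for this illustration; the counts are the freshman and sophomore figures used in this example.

```python
def expected_frequencies(table):
    """Expected cell counts: row proportion x column total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # r * c / n is the same quantity as (r / n) * c, the row proportion
    # times the column total, written to avoid rounding in the proportion.
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: likes, dislikes. Columns: freshmen, sophomores.
horror = [[150, 50],
          [100, 200]]
print(expected_frequencies(horror))  # [[100.0, 100.0], [150.0, 150.0]]
```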
20
Computing χ²
                 Fresh Likes  Fresh Dislikes      Soph Likes   Soph Dislikes
Observed                 150             100              50             200
Expected                 100             150             100             150
df = (C-1)(R-1) = (2-1)(2-1) = 1
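
Written out cell by cell with the observed and expected values above, the sum works out as follows (a worked check in LaTeX notation, using the slides' symbols fo and fe):

```latex
\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}
       = \frac{(150-100)^2}{100} + \frac{(100-150)^2}{150}
       + \frac{(50-100)^2}{100} + \frac{(200-150)^2}{150}
       = 25 + \tfrac{50}{3} + 25 + \tfrac{50}{3} \approx 83.33
```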
21
χ²(1, n = 500) = 83.33
Critical values of χ²
(Table of critical values as on slide 8.)
Significant at α = .01. Reject the null hypothesis.
Fresh/Soph dimension does affect liking for
horror movies.
Proportionally, more freshmen than sophomores
like horror movies.
22
The only (slightly) hard part is computing
expected frequencies
  • In the one sample case, multiply n by a hypothetical
    proportion based on the null hypothesis that
    frequencies will be random.

23
Simple Example - 100 teenagers listen to radio
stations
H1: Some stations are more popular with teenagers
than others.
H0: Radio stations do not differ in popularity
with teenagers.
Expected frequencies are the frequencies predicted
by the null hypothesis. In this case, the problem is
simple because the null predicts that an equal
proportion of teenagers will prefer each of the four
radio stations.
Is the observed significantly different from the
expected?
24
Radio station
              Station 1  Station 2  Station 3  Station 4
Observed             40         30         20         10
Expected             25         25         25         25
O-E                  15          5         -5        -15
(O-E)²              225         25         25        225
(O-E)²/E           9.00       1.00       1.00       9.00
df = k - 1 = (4 - 1) = 3    χ²(3, n = 100) = 20.00, p < .01
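
When the null predicts equal proportions, the expected frequencies are simply n divided by the number of categories. A minimal Python sketch of the radio station example, with variable names chosen only for illustration:

```python
observed = [40, 30, 20, 10]                      # listeners per station
n = sum(observed)                                # 100 teenagers
expected = [n / len(observed)] * len(observed)   # null: equal popularity

contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(contributions)
df = len(observed) - 1

print(contributions)  # [9.0, 1.0, 1.0, 9.0]
print(chi_sq, df)     # 20.0 3
```

Since 20.00 exceeds 11.34, the tabled value for df = 3 at the .01 level, the result agrees with p < .01.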
25
Example - Admissions to Psychiatric Hospitals
Close to a once/year final
H1: More people are admitted to psychiatric
hospitals when it is near their final exam.
H0: Time from final exam does not have an effect
on hospital admissions.
Category 1: Within 7 days of final. (11 admitted)
Category 2: Between 8 and 30 days. (24 admitted)
Category 3: Between 31 and 90 days. (69 admitted)
Category 4: More than 90 days. (96 admitted)
26
Psychiatric Admissions
  • Expected frequency = expected proportion of days × n
  • There are 365 days and 1 final and 200 patients
    admitted each year.
  • Proportion of each kind of day computed below

27
Expected Frequencies
To obtain expected frequencies with 200
admissions, multiply the proportion of days of each
type by n = 200. This time the proportions are not
equal.
28
Closeness to final exam
              Category 1  Category 2  Category 3  Category 4
Observed              11          24          69          96
Expected               8          26          66         100
O-E                    3          -2           3          -4
(O-E)²                 9           4           9          16
(O-E)²/E            1.12        0.15        0.14        0.16
df = k - 1 = (4 - 1) = 3    χ²(3, n = 200) = 1.57, n.s.
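
The non-significant outcome can be checked the same way. In this sketch the observed and expected counts are taken from the slide, 7.82 is the standard tabled critical value for df = 3 at the .05 level, and the variable names are assumptions. Computed without intermediate rounding, the sum comes out near 1.58 rather than 1.57; the conclusion is unchanged.

```python
observed = [11, 24, 69, 96]     # admissions per category
expected = [8, 26, 66, 100]     # proportion of days of each type x n = 200

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
critical_05_df3 = 7.82          # tabled value for df = 3, alpha = .05

decision = "significant" if chi_sq >= critical_05_df3 else "n.s."
print(decision)  # n.s.
```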
29
The only (slightly) hard part is computing
expected frequencies
  • In the multi-sample case, multiply proportion in
    row by numbers in each column to obtain EF in
    each cell.

30
A 3 x 4 Chi Square
Women, stress, and seating preferences
(perimeter vs. interior, front vs. back).
                              Front     Front     Back      Back
                              Perim     Inter     Perim     Inter     Total
Very Stressed Females         10        70         5        15        100
Moderately Stressed Females   15        50        10        25        100
Control Group Females         35        30        15        20        100
Total                         60        150       30        60        300
31
Expected frequencies
Women, stress, and perimeter versus interior
seating preferences.
                              Front     Front     Back      Back
                              Perim     Inter     Perim     Inter     Total
Very Stressed Females         10 (20)   70         5        15        100
Moderately Stressed Females   15 (20)   50        10        25        100
Control Group Females         35 (20)   30        15        20        100
Total                         60        150       30        60        300
32
Column 2
Women, stress, and perimeter versus interior
seating preferences.
                              Front     Front     Back      Back
                              Perim     Inter     Perim     Inter     Total
Very Stressed Females         10 (20)   70 (50)    5        15        100
Moderately Stressed Females   15 (20)   50 (50)   10        25        100
Control Group Females         35 (20)   30 (50)   15        20        100
Total                         60        150       30        60        300
33
Column 3
Women, stress, and perimeter versus interior
seating preferences.
                              Front     Front     Back      Back
                              Perim     Inter     Perim     Inter     Total
Very Stressed Females         10 (20)   70 (50)    5 (10)   15        100
Moderately Stressed Females   15 (20)   50 (50)   10 (10)   25        100
Control Group Females         35 (20)   30 (50)   15 (10)   20        100
Total                         60        150       30        60        300
34
All the expected frequencies
Women, stress, and perimeter versus interior
seating preferences.
                              Front     Front     Back      Back
                              Perim     Inter     Perim     Inter     Total
Very Stressed Females         10 (20)   70 (50)    5 (10)   15 (20)   100
Moderately Stressed Females   15 (20)   50 (50)   10 (10)   25 (20)   100
Control Group Females         35 (20)   30 (50)   15 (10)   20 (20)   100
Total                         60        150       30        60        300
35
Very Stressed
              FrontP  FrontI   BackP   BackI
Observed          10      70       5      15
Expected          20      50      10      20
O-E              -10      20      -5      -5
(O-E)²           100     400      25      25
(O-E)²/E        5.00    8.00    2.50    1.25
Moderately Stressed
              FrontP  FrontI   BackP   BackI
Observed          15      50      10      25
Expected          20      50      10      20
O-E               -5       0       0       5
(O-E)²            25       0       0      25
(O-E)²/E        1.25    0.00    0.00    1.25
Control Group
              FrontP  FrontI   BackP   BackI
Observed          35      30      15      20
Expected          20      50      10      20
O-E               15     -20       5       0
(O-E)²           225     400      25       0
(O-E)²/E       11.25    8.00    2.50    0.00
df = (C-1)(R-1) = (4-1)(3-1) = 6
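
The whole 3 x 4 computation, from marginal totals to χ² and df, can be sketched in Python. The dictionary layout and variable names are assumptions for illustration; the counts are the observed seating frequencies for the three groups.

```python
seating = {
    "very stressed":       [10, 70, 5, 15],
    "moderately stressed": [15, 50, 10, 25],
    "control":             [35, 30, 15, 20],
}   # columns: front perimeter, front interior, back perimeter, back interior

col_totals = [sum(col) for col in zip(*seating.values())]
n = sum(col_totals)

chi_sq = 0.0
row_chi = {}   # each group's share of the total chi square
for group, row in seating.items():
    row_total = sum(row)
    # expected = row proportion x column total, written as r * c / n
    expected = [row_total * c / n for c in col_totals]
    row_chi[group] = sum((o - e) ** 2 / e for o, e in zip(row, expected))
    chi_sq += row_chi[group]

df = (len(col_totals) - 1) * (len(seating) - 1)
print(row_chi)     # {'very stressed': 16.75, 'moderately stressed': 2.5, 'control': 21.75}
print(chi_sq, df)  # 41.0 6
```

Keeping the per-group sums shows where the total comes from: most of it is contributed by the very stressed and control groups.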
36
χ²(6, N = 300) = 41.00
Critical values of χ²
(Table of critical values as on slide 8.)
There is a relationship between stress in women
and seating position.
Significant at α = .01. Reject the null hypothesis.
37
Very Stressed
              FrontP  FrontI   BackP   BackI
Observed          10      70       5      15
Expected          20      50      10      20
O-E              -10      20      -5      -5
(O-E)²           100     400      25      25
(O-E)²/E        5.00    8.00    2.50    1.25
Moderately Stressed
              FrontP  FrontI   BackP   BackI
Observed          15      50      10      25
Expected          20      50      10      20
O-E               -5       0       0       5
(O-E)²            25       0       0      25
(O-E)²/E        1.25    0.00    0.00    1.25
Control Group
              FrontP  FrontI   BackP   BackI
Observed          35      30      15      20
Expected          20      50      10      20
O-E               15     -20       5       0
(O-E)²           225     400      25       0
(O-E)²/E       11.25    8.00    2.50    0.00
Very stressed women avoid the perimeter and
prefer the front interior.
The control group prefers the perimeter and
avoids the front interior.
χ² = 41.00
df = (C-1)(R-1) = (4-1)(3-1) = 6
38
Summary: Different Ways of Computing the
Frequencies Predicted by the Null Hypothesis
  • One sample
  • Expect subjects to be distributed equally in each
    cell. OR
  • Expect subjects to be distributed proportionally
    in each cell. OR
  • Expect subjects to be distributed in each cell
    based on prior knowledge, such as previous
    research.
  • Multi-sample
  • Expect subjects in different conditions to be
    distributed similarly to each other. Find the
    proportion in each row and multiply by the number
    in each column to do so.

39
Conclusion - Chi Square
  • Chi Square is a non-parametric statistic, used for
    nominal data.
  • It plays the same role for nominal data that the
    F test played for single factor and factorial
    analysis.
  • Chi Square compares the expected frequencies in
    categories to the observed frequencies in
    categories.

40
Conclusion - Chi Square
  • The null hypothesis
  • H0: fo = fe
  • There is no difference between the observed
    frequency and frequency predicted by the null
    hypothesis.
  • The experimental hypothesis
  • H1: fo ≠ fe
  • The observed frequency differs significantly from
    the frequency expected by the null hypothesis.

41
The end. Hope you found the slides helpful! RK
42
Example - Vitamin C and Flu
Experimental Hypothesis: Vitamin C prevents
influenza.
Null Hypothesis: Vitamin C has no effect on
getting the flu.
30 subjects in each experimental group.
Are the observed significantly different from the
expected?
43
How I computed expected frequencies
Multiply the proportion in each row times the
number in each column. Here the Vitamin C row has 30
research participants. Total N = 60. So
30/60 = .500 (half). Twenty-five got influenza. So
half of those 25 should come from the Vitamin C
group (25 × .500 = 12.5). Same for placebo.
Thirty-five did not get influenza, so 35 × .500 =
17.5 of each group should not have.
Are the observed significantly different from the
expected?