Analysis of frequency data: the chi-square test of independence

Slides: 16
Provided by: inform1
1
Analysis of frequency data: the chi-square test
of independence
2
Chi-square test of independence
  • Is used when
  • a score is categorized on two independent
    dimensions
  • a between-subjects design is used
  • the scores are nominal measures (representing
    only the frequency of occurrence of a response).

3
Example
  • Is there a statistically significant relationship
    between gender and choice of program of study
    (e.g., physics or library science)?
  • Would, for example, male students more frequently
    choose the physics program of study than the
    library science program of study?

4
(No Transcript)
5
Test statistic
  • $\chi^2 = \sum_{r} \sum_{c} \frac{(O_{rc} - E_{rc})^2}{E_{rc}}$
  • χ = chi, the 22nd letter of the Greek alphabet
    (pronounced "ky").
  • $O_{rc}$ = observed frequency in the rc cell, where r
    represents the row number, and c the column
    number.
  • $E_{rc}$ = expected frequency of the corresponding rc
    cell.
  • r = number of categories of the row variable.
  • c = number of levels of the column variable.
  • The double summation signs ($\sum_{r} \sum_{c}$)
    indicate that the values of
    $(O_{rc} - E_{rc})^2 / E_{rc}$ are summed over all
    rows and columns of the contingency table.
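
A minimal Python sketch of the statistic described on this slide. The transcript does not show how the expected frequencies are obtained, so the sketch assumes the standard formula for a test of independence: E_rc = (row r total × column c total) / N. The function names and the example counts are hypothetical and are not taken from the slides.

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def chi_square(observed):
    """Sum (O - E)^2 / E over all rows and columns of the table."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2x2 counts (assumption: not the table used in the slides).
table = [[36, 24],
         [48, 72]]
print(expected_frequencies(table))  # [[28.0, 32.0], [56.0, 64.0]]
print(round(chi_square(table), 2))  # 6.43
```

The nested comprehension mirrors the double summation: the outer loop runs over rows, the inner loop over columns.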

6
Hypothesis testing
  • A null hypothesis, H0, and an alternative
    hypothesis, H1, are formulated. The null
    hypothesis provides the sampling distribution of
    the χ² statistic.
  • A significance level α is selected.
  • A critical value of χ², identified as χ²crit, is
    found from the sampling distribution of χ², which
    is tabulated in the standard pre-calculated
    statistical tables for the critical values of the
    chi-square distribution.
  • Using the value of χ²crit, a rejection region is
    located in the sampling distribution of χ².
  • The observed value, χ²obs, is calculated from the
    frequencies in the contingency table.
  • A decision to reject or not reject H0 is made on
    the basis of whether χ²obs falls into the
    rejection region.

7
1) Statistical hypotheses
  • H0: The row variable and the column variable are
    independent in the population. In our case:
  • H0: The program of study choice of the student is
    independent of the gender of the student.
  • H1: The row and column variables are related in
    the population. In our case:
  • H1: The program of study choice of the student is
    related to the gender of the student.

8
2) Significance level
  • In line with common practice, let's choose
    α = .05.

9
3) Critical value χ²crit
  • depends on the degrees of freedom (df) of the
    statistic.
  • df = (r - 1)(c - 1)
  • For our example, r = 2 and c = 2; thus,
  • df = (2 - 1)(2 - 1), which equals 1.
  • The critical value of χ² for a particular df and α
    can be found in the standard pre-calculated
    statistical tables for the critical values of the
    chi-square distribution; in our case, χ²crit = 3.84.
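
As a cross-check on the tabled value, the critical value for this example can be recovered numerically. The sketch below is an illustration rather than the method used in the slides: it covers only the df = 1 case, where the chi-square CDF reduces to erf(sqrt(x / 2)), and finds the cut-off by bisection. All function names are hypothetical.

```python
import math


def degrees_of_freedom(r, c):
    """df = (r - 1)(c - 1) for an r x c contingency table."""
    return (r - 1) * (c - 1)


def chi2_cdf_df1(x):
    """CDF of the chi-square distribution with 1 degree of freedom."""
    return math.erf(math.sqrt(x / 2.0))


def critical_value_df1(alpha, lo=0.0, hi=100.0, tol=1e-10):
    """Find x with CDF(x) = 1 - alpha by bisection (valid for df = 1 only)."""
    target = 1.0 - alpha
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_cdf_df1(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


df = degrees_of_freedom(2, 2)
crit = critical_value_df1(0.05)
print(df, round(crit, 2))  # 1 3.84
```

For tables larger than 2x2 the df = 1 shortcut no longer applies, and a statistical table or a statistics library would be needed instead.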

10
4) Rejection region
  • For our example, if χ²obs is equal to or greater
    than 3.84, it falls into the rejection region and
    will lead to rejection of H0 and acceptance of H1.

11
5) Calculating χ²obs
12
Practical calculation of χ²obs
  • Subtract the expected frequency (E) from the
    observed frequency (O) for each cell.
  • Square each O - E difference.
  • Divide each squared O - E difference by the
    expected frequency of the cell.
  • Sum the resulting (O - E)²/E values over all
    cells in the contingency table to obtain χ²obs.

13
Numerical value of χ²obs
  • can be calculated as follows:

χ²obs = 2.286 + 2.000 + 1.143 + 1.000 = 6.43 when
rounded to two decimal places.
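
The four calculation steps and the sum above can be traced in a few lines of Python. The transcript does not include the contingency table itself (slide 4 has no transcript), so the counts below are hypothetical: they were chosen only because they reproduce the four cell values 2.286, 2.000, 1.143 and 1.000 quoted above, and other tables would give the same values. The row and column labels are likewise an assumption.

```python
# Hypothetical observed counts, cell by cell (assumed layout:
# male/physics, male/library science, female/physics, female/library science).
observed = [36, 24, 48, 72]
# Expected counts under independence (row total * column total / N, N = 180).
expected = [28.0, 32.0, 56.0, 64.0]

# Step 1: subtract the expected from the observed frequency in each cell.
differences = [o - e for o, e in zip(observed, expected)]
# Step 2: square each O - E difference.
squared = [d ** 2 for d in differences]
# Step 3: divide each squared difference by the cell's expected frequency.
contributions = [s / e for s, e in zip(squared, expected)]
# Step 4: sum the contributions over all cells.
chi2_obs = sum(contributions)

print([round(c, 3) for c in contributions])  # [2.286, 2.0, 1.143, 1.0]
print(round(chi2_obs, 2))                    # 6.43
```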
14
6) Decisions
  • The χ²obs of 6.43 is larger than χ²crit of 3.84
    and, thus, falls into the rejection region.
  • H0 is rejected and H1 is accepted.
  • There is a statistically significant relationship
    between the row and column variables.
  • Gender and program of study selection are related
    in the case of physics and library science.
  • A greater-than-chance frequency of the physics
    program of study selection is associated with
    male gender, and a greater-than-chance frequency
    of the library science program of study selection
    is associated with female gender.
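
The decision rule can be written as a simple comparison. The sketch below uses hypothetical function names; the p-value helper is an addition that does not appear in the slides and is valid only for df = 1, where the upper-tail probability equals erfc(sqrt(x / 2)).

```python
import math


def decide(chi2_obs, chi2_crit):
    """Reject H0 when the observed statistic falls in the rejection region."""
    return "reject H0" if chi2_obs >= chi2_crit else "do not reject H0"


def p_value_df1(chi2_obs):
    """Upper-tail probability for chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2_obs / 2.0))


print(decide(6.43, 3.84))        # reject H0
print(p_value_df1(6.43) < 0.05)  # True
```

Comparing the statistic with the critical value and comparing the p-value with the significance level lead to the same decision.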

15
Assumptions for the use of the chi-square test of
independence
  • The test may be used on a contingency table of
    any size (e.g., 2x3, 3x4, or 4x4).
  • Each participant may contribute only one response
    to the contingency table.
  • The number of responses obtained should be large
    enough so that no expected frequency is less than
    10 in a 2x2 contingency table, or less than 5 in
    a contingency table larger than 2x2.
  • Alternative: Fisher's exact test.
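
The minimum-expected-frequency rule above lends itself to an automatic check. This is a minimal sketch under two assumptions: expected frequencies are computed with the standard row total × column total / N formula, and the thresholds are exactly those stated on this slide (10 for a 2x2 table, 5 otherwise). The function name and both example tables are hypothetical.

```python
def min_expected_ok(observed):
    """Check the slide's rule on the smallest expected frequency."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    smallest = min(rt * ct / n for rt in row_totals for ct in col_totals)
    is_2x2 = len(row_totals) == 2 and len(col_totals) == 2
    threshold = 10 if is_2x2 else 5
    return smallest >= threshold


print(min_expected_ok([[36, 24], [48, 72]]))  # True  (smallest E = 28)
print(min_expected_ok([[11, 4], [10, 20]]))   # False (smallest E = 7)
```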