Testing Hypothesis with Categorical Data - PowerPoint PPT Presentation

Slides: 26
Provided by: SEMO
Transcript and Presenter's Notes

1
Testing Hypothesis with Categorical Data
  • Chapter Nine

2
Introduction
  • Categorical variables are measured at either the
    nominal or the ordinal level, and the values of
    these variables consist of distinct categories
  • Chi-square goodness-of-fit tests (one-variable
    tests): are the data consistent with the null
    hypothesis?
  • Two-variable tests (test of independence): is
    there a difference?

3
One-Variable Goodness of Fit Chi-Square Test
  • χ² = Σ (f_o - f_e)² / f_e, summed over the k
    categories, where
  • f_o = the observed frequencies from our sample data,
  • f_e = the expected frequencies we should get under
    the null hypothesis, and
  • k = the number of categories for the variable

4
One-Variable Goodness of Fit Chi-Square Test
  • For each category, subtract the expected
    frequency from the observed frequency, square
    this difference, and then divide by the expected
    frequency. The sum of these values across all
    categories is our obtained value of the
    chi-square statistic
  • Degrees of freedom are important: df = k - 1
    (see Table E-4)
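
The computation described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's procedure: the function name and the frequencies are hypothetical, and df = k - 1 is the standard rule for a goodness-of-fit test with no estimated parameters.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Return (chi-square, degrees of freedom) for a one-variable test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one entry per category")
    # Sum (f_o - f_e)^2 / f_e over the k categories.
    chi_square = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    df = len(observed) - 1  # k - 1
    return chi_square, df


# Hypothetical data: 100 cases in 4 categories; the null hypothesis
# assumed here is that all categories are equally likely.
observed = [30, 20, 25, 25]
expected = [25, 25, 25, 25]
chi_square, df = chi_square_goodness_of_fit(observed, expected)
print(chi_square, df)
```

The obtained value is then compared with the critical chi-square value for the same degrees of freedom.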

5
Two-Variable Chi-Square Test of Independence
  • Independent variable (cause)
  • Dependent variable (effect)
  • Are IV and DV related?
  • If so how strong is that relationship?

6
Two-Variable Chi-Square Test of Independence
  • Contingency Table Shows the joint distribution
    of two categorical variables. A contingency
    table is defined by the number of rows and number
    of columns it has. A contingency table with 3
    rows and 2 columns is a 3 x 2 contingency table

7
Two-Variable Chi-Square Test of Independence
  • There are the row marginals (R1 and R2) and the
    column marginals (C1 and C2)
  • The row marginals correspond to the number of
    cases in each row of the table
  • The column marginals correspond to the frequency
    in each column of the table

8
Two-Variable Chi-Square Test of Independence
  • χ² = Σ (f_o - f_e)² / f_e, summed over the k
    cells, where
  • f_o = the observed cell frequencies from our
    sample data,
  • f_e = the expected cell frequencies we should get
    under the null hypothesis, and
  • k = the number of cells in the table

9
Two-Variable Chi-Square Test of Independence
  • The observed frequencies are the joint
    distribution of two categorical variables that we
    actually observed in our sample data
  • The expected frequencies are the joint frequency
    distribution we would expect to see if the two
    categorical variables were in fact independent of
    each other

10
Two-Variable Chi-Square Test of Independence
  • Multiplication rule for independent events:
    P(A and B) = P(A) × P(B)
  • Expected frequency: multiply this joint
    probability by the total number of cases

11
Two-Variable Chi-Square Test of Independence
  • f_e = (RM_i × CM_j) / n, where
  • RM_i = the row marginal frequency for row i,
  • CM_j = the column marginal frequency for column j,
    and
  • n = the total number of cases
  • Pg. 330
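
As an illustration of how the expected cell frequencies follow from the marginals, here is a small Python sketch. The function name and the 2 x 2 table are hypothetical; the calculation is the multiplication rule applied to the marginals, (RM_i / n) × (CM_j / n) × n, which simplifies to RM_i × CM_j / n.

```python
def expected_frequencies(observed):
    """Expected cell counts under independence for a table of counts."""
    row_marginals = [sum(row) for row in observed]        # RM_i
    col_marginals = [sum(col) for col in zip(*observed)]  # CM_j
    n = sum(row_marginals)                                # total cases
    return [[rm * cm / n for cm in col_marginals] for rm in row_marginals]


# Hypothetical 2 x 2 contingency table (rows and columns are categories).
observed = [[30, 20],
            [10, 40]]
expected = expected_frequencies(observed)
print(expected)
```

For this table the row marginals are 50 and 50, the column marginals are 40 and 60, and n is 100, so each row has expected frequencies of 20 and 30.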

12
Two-Variable Chi-Square Test of Independence
  • Specifically, the chi-square test takes the
    difference between the observed and expected cell
    frequencies for each cell in the table. If the
    observed frequencies are equal to the expected
    frequencies (i.e., if the difference between them
    is zero), then we can be confident in concluding
    that the two variables are independent

13
Two-Variable Chi-Square Test of Independence
  • If the difference between the observed and
    expected cell frequencies is zero, the chi-square
    statistic will also be zero
  • As the difference between the observed and
    expected cell frequencies increases, the
    magnitude of the chi-square statistic increases
    and our assumption of independence becomes more
    and more suspicious

14
Two-Variable Chi-Square Test of Independence
  • What we have to determine, therefore, is how
    large a difference we must find between the
    observed and expected cell frequencies, or how
    large a chi-square must we see, before we are
    willing to abandon the null hypothesis of
    independence
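
The decision described above can be sketched as follows. The table is hypothetical, and two details are assumptions not stated on the slides: the degrees of freedom for an r x c table are taken as (r - 1)(c - 1), and 3.841 is the standard critical chi-square value for 1 degree of freedom at an alpha level of .05.

```python
def chi_square_independence(observed):
    """Return (chi-square, df) for an r x c table of observed counts."""
    row_marginals = [sum(row) for row in observed]
    col_marginals = [sum(col) for col in zip(*observed)]
    n = sum(row_marginals)
    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_marginals[i] * col_marginals[j] / n  # expected count
            chi_square += (fo - fe) ** 2 / fe
    df = (len(row_marginals) - 1) * (len(col_marginals) - 1)
    return chi_square, df


# Hypothetical 2 x 2 table of observed cell frequencies.
observed = [[30, 20],
            [10, 40]]
chi_square, df = chi_square_independence(observed)

# Assumed critical value: df = 1, alpha = .05.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
reject_null = df == 1 and chi_square > CRITICAL_VALUE_DF1_ALPHA_05
print(round(chi_square, 3), df, reject_null)
```

If the obtained chi-square exceeds the critical value, we abandon the null hypothesis of independence; otherwise we fail to reject it.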

15
Two-Variable Chi-Square Test of Independence
  • Chi-square test of independence: see Tables 9-15
    and 9-16

16
Measures of Association
  • Nominal-Level Variables
  • Phi coefficient (φ) is appropriate when we have a
    2 × 2 table
  • Magnitudes of phi near zero indicate a very weak
    relationship, while those nearing 1.0 indicate a
    very strong relationship

17
Nominal-Level Variables
  • Between 0 and .29 (weak)
  • Between .30 and .59 (moderate)
  • Between .60 and 1.00 (strong)

18
Measures of Association
  • Contingency coefficient (C)
  • Based on value of k
  • C = square root of [χ² / (n + χ²)]

19
Measures of Association
  • Cramer's V
  • Based on k (here, the smaller of the number of
    rows and the number of columns)
  • V = square root of [χ² / (n(k - 1))]
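
The three chi-square-based measures for nominal variables can be compared side by side in a short Python sketch. The function names and input values are hypothetical. The phi formula, the square root of χ² / n, is the standard one and is assumed here because the slides do not spell it out; k in Cramer's V is taken as the smaller of the number of rows and columns.

```python
import math


def phi_coefficient(chi_square, n):
    """Phi for a 2 x 2 table: sqrt(chi-square / n)."""
    return math.sqrt(chi_square / n)


def contingency_coefficient(chi_square, n):
    """Contingency coefficient C: sqrt(chi-square / (n + chi-square))."""
    return math.sqrt(chi_square / (n + chi_square))


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V: sqrt(chi-square / (n * (k - 1))), k = min(rows, cols)."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi_square / (n * (k - 1)))


# Hypothetical result: chi-square of 16.0 from a 2 x 2 table with 100 cases.
chi_square, n = 16.0, 100
phi = phi_coefficient(chi_square, n)
c = contingency_coefficient(chi_square, n)
v = cramers_v(chi_square, n, n_rows=2, n_cols=2)
print(phi, round(c, 4), v)
```

For a 2 × 2 table k - 1 equals 1, so Cramer's V and phi give the same value.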

20
Measures of Association
  • Lambda (λ)
  • Proportionate Reduction in Error (PRE)
  • Varies between 0 and 1.0
  • A value of 0 means we cannot reduce our errors in
    predicting the dependent variable from knowledge
    of the independent variable, while a value of 1.0
    means that we can reduce all of our errors, or
    that knowledge of the independent variable will
    allow us to predict with perfect accuracy the
    value of the dependent variable

21
Measures of Association
  • λ = (Number of errors using mode of DV - number
    of errors using mode of DV within categories of
    the IV) / Number of errors using mode of DV

22
Measures of Association
  • λ = (Σ f_i - f_d) / (n - f_d), where
  • f_i = the largest cell frequency in each category
    of the IV,
  • f_d = the largest marginal frequency of the DV, and
  • n = the total number of cases
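
A minimal Python sketch of lambda, counting prediction errors directly. The function name, the table, and its layout (rows as categories of the IV, columns as categories of the DV) are assumptions made for illustration.

```python
def lambda_pre(table):
    """Lambda for a table with IV categories as rows, DV categories as columns."""
    n = sum(sum(row) for row in table)
    # f_i: the largest cell frequency within each category of the IV.
    sum_fi = sum(max(row) for row in table)
    # f_d: the largest marginal frequency of the DV (largest column total).
    f_d = max(sum(col) for col in zip(*table))
    errors_without_iv = n - f_d   # always guessing the overall mode of the DV
    errors_with_iv = n - sum_fi   # guessing the mode within each IV category
    if errors_without_iv == 0:
        raise ValueError("the DV has no variation, so lambda is undefined")
    return (errors_without_iv - errors_with_iv) / errors_without_iv


# Hypothetical 2 x 2 table: rows = IV categories, columns = DV categories.
table = [[30, 20],
         [10, 40]]
lam = lambda_pre(table)
print(lam)
```

Here guessing the overall mode of the DV produces 40 errors and guessing the mode within each IV category produces 30, so knowing the IV reduces prediction errors by one quarter.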

23
Measures of Association
  • Ordinal Level
  • Goodman and Kruskal's gamma
  • Gamma is a proportionate reduction in error
    measure whose magnitude ranges between 0 and 1.0
  • Between 0 and .29 (weak)
  • Between .30 and .59 (moderate)
  • Between .60 and 1.00 (strong)

24
Measures of Association
  • Yule's Q
  • Q = [(f cell a × f cell d) - (f cell b × f cell c)] /
    [(f cell a × f cell d) + (f cell b × f cell c)]
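
Yule's Q for a 2 × 2 table can be sketched in Python as below. The function name and the cell counts are hypothetical; the assumed cell layout is a and b in the first row, c and d in the second.

```python
def yules_q(a, b, c, d):
    """Yule's Q for a 2 x 2 table laid out as [[a, b], [c, d]]."""
    ad = a * d  # product of the main-diagonal cells
    bc = b * c  # product of the off-diagonal cells
    if ad + bc == 0:
        raise ValueError("Q is undefined when both cross products are zero")
    return (ad - bc) / (ad + bc)


# Hypothetical cell frequencies.
q = yules_q(a=30, b=20, c=10, d=40)
print(round(q, 4))
```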

25
Measures of Association
  • Gamma = (CP - DP) / (CP + DP)
  • CP = the number of concordant pairs of
    observations, and
  • DP = the number of discordant pairs of observations
  • Magnitude 0 to 1.0
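
One way to count concordant and discordant pairs is to loop over the cells of a table whose rows and columns are both ordered from low to high. The sketch below assumes that layout; the function name and the table are hypothetical.

```python
def goodman_kruskal_gamma(table):
    """Gamma for a table whose rows and columns are ordered low to high."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = 0  # CP: pairs ranked in the same order on both variables
    discordant = 0  # DP: pairs ranked in opposite order on the two variables
    for i in range(n_rows):
        for j in range(n_cols):
            for i2 in range(i + 1, n_rows):   # cells in a lower row
                for j2 in range(n_cols):
                    pairs = table[i][j] * table[i2][j2]
                    if j2 > j:                # below and to the right
                        concordant += pairs
                    elif j2 < j:              # below and to the left
                        discordant += pairs
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined without untied pairs")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical 2 x 2 table with ordered categories.
gamma = goodman_kruskal_gamma([[30, 20],
                               [10, 40]])
print(round(gamma, 4))
```

For a 2 × 2 table the concordant pairs are a × d and the discordant pairs are b × c, so gamma gives the same value as Yule's Q.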