Testing Hypothesis with Categorical Data - PowerPoint PPT Presentation

About This Presentation

Title:

Testing Hypothesis with Categorical Data

Slides: 26

Provided by: SEMO

Transcript and Presenter's Notes

Title: Testing Hypothesis with Categorical Data

1
Testing Hypothesis with Categorical Data

Chapter Nine

2
Introduction

Categorical variables are measured at either the
nominal or the ordinal level, and the values of
these variables consist of distinct categories
Chi-square goodness of fit tests (one-variable
tests): are the data consistent with the null
hypothesis?
Two-variable tests (test of independence): is
there a difference?

3
One-Variable Goodness of Fit Chi-Square Test

X2 = sum over the k categories of (fo - fe)^2 / fe
Where
fo = the observed frequencies from our sample data,
fe = the expected frequencies we should get under
the null hypothesis, and
k = the number of categories for the variable

4
One-Variable Goodness of Fit Chi-Square Test

Subtract the expected frequencies from the
observed frequencies, square this difference, and
then divide by the expected frequencies. The sum
of these values across all k categories is our
obtained value of the chi-square statistic
Degrees of freedom are important: df = k - 1
(see Table E-4)
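
The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's worked example: the observed counts are hypothetical, the null hypothesis is assumed to be an even spread across categories, and df = k - 1 is the standard rule for a one-variable goodness of fit test.

```python
# Hypothetical data: 100 cases observed across k = 4 categories.
observed = [30, 20, 25, 25]

n = sum(observed)
k = len(observed)

# Assumed null hypothesis: cases are spread evenly over the categories.
expected = [n / k] * k


def chi_square(observed, expected):
    """Sum (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


chi2 = chi_square(observed, expected)
df = k - 1  # degrees of freedom for the one-variable test

print(chi2, df)
```

The obtained value is then compared with the critical value for the chosen alpha level and df in a chi-square table.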

5
Two-Variable Chi-Square Test of Independence

Independent variable (cause)
Dependent variable (effect)
Are IV and DV related?
If so how strong is that relationship?

6
Two-Variable Chi-Square Test of Independence

Contingency Table Shows the joint distribution
of two categorical variables. A contingency
table is defined by the number of rows and number
of columns it has. A contingency table with 3
rows and 2 columns is a 3 x 2 contingency table

7
Two-Variable Chi-Square Test of Independence

There are the row marginals (R1 and R2) and column
marginals (C1 and C2)
The row marginals correspond to the number of
cases in each row of the table
The column marginals correspond to the frequency
in each column of the table
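
As a small sketch of how the marginals come out of a contingency table, the snippet below sums a hypothetical 2 x 2 table by row and by column. The counts and variable names are assumptions made for illustration only.

```python
# Hypothetical 2 x 2 contingency table of observed frequencies.
# Each inner list is one row of the table.
table = [
    [20, 30],
    [40, 10],
]

# Row marginals (R1, R2): number of cases in each row.
row_marginals = [sum(row) for row in table]

# Column marginals (C1, C2): number of cases in each column.
col_marginals = [sum(col) for col in zip(*table)]

# Total number of cases; row and column marginals both sum to n.
n = sum(row_marginals)

print(row_marginals, col_marginals, n)
```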

8
Two-Variable Chi-Square Test of Independence

X2 = sum over the k cells of (fo - fe)^2 / fe
Where
fo = the observed cell frequencies from our
sample data,
fe = the expected cell frequencies we should get
under the null hypothesis, and
k = the number of cells in the table

9
Two-Variable Chi-Square Test of Independence

The observed frequencies are the joint
distribution of two categorical variables that we
actually observed in our sample data
The expected frequencies are the joint frequency
distribution we would expect to see if the two
categorical variables were in fact independent of
each other

10
Two-Variable Chi-Square Test of Independence

Multiplication rule
Expected frequency: multiply the joint probability
by the total number of cases
P(A and B) = P(A) X P(B) when A and B are
independent

11
Two-Variable Chi-Square Test of Independence

fe for the cell in row i and column j
= (RMi X CMj) / n
Where
RMi = the row marginal frequency for row i,
CMj = the column marginal frequency for column j,
and
n = the total number of cases
Pg. 330
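
A minimal sketch of the expected-frequency calculation follows. It applies the multiplication rule, (RMi / n) X (CMj / n) X n, which simplifies to RMi X CMj / n. The table is hypothetical and the function name is chosen here for illustration.

```python
def expected_frequencies(table):
    """Return the expected cell frequencies under independence.

    Each expected frequency is row marginal * column marginal / n.
    """
    row_marginals = [sum(row) for row in table]
    col_marginals = [sum(col) for col in zip(*table)]
    n = sum(row_marginals)
    return [[rm * cm / n for cm in col_marginals] for rm in row_marginals]


# Hypothetical 2 x 2 table of observed frequencies.
observed = [
    [20, 30],
    [40, 10],
]

expected = expected_frequencies(observed)
print(expected)
```

The expected table always has the same row and column marginals as the observed table.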

12
Two-Variable Chi-Square Test of Independence

Specifically, the chi-square test takes the
difference between the observed and expected cell
frequencies for each cell in the table. If the
observed frequencies are equal to the expected
frequencies (i.e., if the difference between them
is zero), then we can be confident in concluding
that the two variables are independent

13
Two-Variable Chi-Square Test of Independence

If the difference between the observed and
expected cell frequencies is zero, the chi-square
statistic will also be zero
As the difference between the observed and
expected cell frequencies increases, the
magnitude of the chi-square statistic increases
and our assumption of independence becomes more
and more suspicious

14
Two-Variable Chi-Square Test of Independence

What we have to determine, therefore, is how
large a difference we must find between the
observed and expected cell frequencies, or how
large a chi-square must we see, before we are
willing to abandon the null hypothesis of
independence
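
The decision described here can be sketched as follows, assuming a hypothetical 2 x 2 table and an alpha level of .05. The degrees of freedom rule, (rows - 1) X (columns - 1), and the critical value of 3.841 for df = 1 are standard values from a chi-square table, not figures taken from these slides.

```python
def chi_square_independence(table):
    """Return (chi-square, df) for a table of observed frequencies."""
    row_marginals = [sum(row) for row in table]
    col_marginals = [sum(col) for col in zip(*table)]
    n = sum(row_marginals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            # Expected frequency if the two variables are independent.
            fe = row_marginals[i] * col_marginals[j] / n
            chi2 += (fo - fe) ** 2 / fe

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical observed frequencies.
table = [
    [20, 30],
    [40, 10],
]

chi2, df = chi_square_independence(table)

# Critical chi-square for df = 1 at alpha = .05 (assumed alpha level).
CRITICAL_VALUE = 3.841
reject = chi2 > CRITICAL_VALUE

print(round(chi2, 3), df, reject)
```

With these made-up counts the obtained chi-square exceeds the critical value, so the null hypothesis of independence would be rejected.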

15
Two-Variable Chi-Square Test of Independence

Chi-Square Test of Independence Table 9-15 and
9-16.

16
Measures of Association

Nominal-Level Variables
Phi coefficient (φ) is appropriate when we have a
2 X 2 table
Magnitudes of phi near zero indicate a very weak
relationship, while those nearing 1.0 indicate a
very strong relationship

17
Nominal-Level Variables

Between 0 and .29 (weak)
Between .30 and .59 (moderate)
Between .60 and 1.00 (strong)
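
As a rough sketch, phi for a 2 x 2 table can be computed as the square root of chi-square divided by n (the standard formula, which this transcript does not show) and then labeled with the cutoffs listed on this slide. The chi-square value, sample size, and function names are hypothetical.

```python
import math


def phi_coefficient(chi2, n):
    """Phi for a 2 x 2 table: square root of chi-square over n."""
    return math.sqrt(chi2 / n)


def strength_label(value):
    """Label a magnitude using the weak/moderate/strong cutoffs above."""
    magnitude = abs(value)
    if magnitude < 0.30:
        return "weak"
    if magnitude < 0.60:
        return "moderate"
    return "strong"


# Hypothetical obtained chi-square and sample size.
chi2 = 50 / 3
n = 100

phi = phi_coefficient(chi2, n)
label = strength_label(phi)

print(round(phi, 3), label)
```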

18
Measures of Association

Contingency coefficient (C)
Based on value of k
C = square root of [X2 / (n + X2)]
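
A short sketch of the contingency coefficient, computed as the square root of chi-square divided by (n + chi-square). The chi-square value and sample size are hypothetical.

```python
import math


def contingency_coefficient(chi2, n):
    """Pearson's C: square root of chi-square / (n + chi-square)."""
    return math.sqrt(chi2 / (n + chi2))


# Hypothetical obtained chi-square and sample size.
chi2 = 50 / 3
n = 100

c = contingency_coefficient(chi2, n)
print(round(c, 3))
```

Because chi-square appears in both the numerator and the denominator, C stays below 1.0 even for a very strong relationship.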

19
Measures of Association

Cramer's V
Based on k, the smaller of the number of rows and
the number of columns
V = square root of [X2 / (n(k - 1))]
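
A minimal sketch of Cramer's V, assuming k is the smaller of the number of rows and the number of columns (the usual definition for this measure). The chi-square value, sample size, and table dimensions are hypothetical.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V: square root of chi-square / (n * (k - 1))."""
    k = min(n_rows, n_cols)  # smaller table dimension
    return math.sqrt(chi2 / (n * (k - 1)))


# Hypothetical values for a 3 x 2 table.
chi2 = 12.0
n = 150

v = cramers_v(chi2, n, n_rows=3, n_cols=2)
print(round(v, 3))
```

For a 2 x 2 table k - 1 equals 1, so V reduces to the square root of chi-square over n.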

20
Measures of Association

Lambda (λ)
Proportionate Reduction in Error (PRE)
Values vary between 0 and 1.0
A value of 0 means we cannot reduce our errors in
predicting the dependent variable from knowledge
of the independent variable, while a value of 1.0
means that we can reduce all of our errors, or
that knowledge of the independent variable will
allow us to predict with perfect accuracy the
value of the dependent variable
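
The PRE logic can be sketched as lambda = (E1 - E2) / E1, where E1 is the number of prediction errors made without knowing the independent variable and E2 is the number made when it is known. This is the standard formula, which the transcript does not show. In this hypothetical table, rows are assumed to be categories of the independent variable and columns categories of the dependent variable.

```python
def lambda_pre(table):
    """Lambda for predicting the column variable from the row variable."""
    col_marginals = [sum(col) for col in zip(*table)]
    n = sum(col_marginals)

    # E1: errors when always guessing the modal dependent category.
    e1 = n - max(col_marginals)

    # E2: errors when guessing the modal dependent category in each row.
    e2 = sum(sum(row) - max(row) for row in table)

    if e1 == 0:
        return 0.0  # no errors to reduce
    return (e1 - e2) / e1


# Hypothetical table: rows = independent variable, columns = dependent.
table = [
    [20, 30],
    [40, 10],
]

lam = lambda_pre(table)
print(lam)
```

Here knowing the row cuts prediction errors from 40 to 30, a 25 percent reduction.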

21
Measures of Association