Chi Square - PowerPoint PPT Presentation


Slides: 24
Provided by: Politica6
Tags: chi | square | test


Transcript and Presenter's Notes

Title: Chi Square


1
Chi Square & Correlation
2
Nonparametric Test of χ²
  • Used when too many of the assumptions behind
    t-tests are violated
  • Sample size is too small to reflect the population
  • Data are not continuous and thus not appropriate
    for parametric tests based on normal
    distributions.
  • χ² is another way of showing that some pattern in
    the data was not created randomly by chance.
  • χ² can be one- or two-dimensional.
  • χ² deals with the question of whether what we
    observed is different from what is expected.

3
Calculating χ²
  • What would a contingency table look like if no
    relationship exists between gender and voting for
    Bush? (i.e. statistical independence)

                   Male   Female   Total
Voted for Bush      25      25       50
Voted for Kerry     25      25       50
Total               50      50      100

NOTE: INDEPENDENT VARIABLES ON COLUMNS AND
DEPENDENT ON ROWS
4
Calculating χ²
  • What would a contingency table look like if a
    perfect relationship exists between gender and
    voting for Bush?

                   Male   Female
Voted for Bush      50       0
Voted for Kerry      0      50
5
Calculating the expected value
$$E_{ij} = \frac{F_i F_j}{N}$$
E_ij = The expected frequency of the cell in the ith row
and jth column
F_i = The total in the ith row marginal
F_j = The total in the jth column marginal
N = The grand total, or sample size for the entire table
Expected (Voted for Bush) = 50 × 50 / 100 = 25
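The expected-value rule on this slide can be sketched in a few lines of Python. The function name `expected_frequencies` and the list-of-rows layout are illustrative assumptions, not part of the original slides.

```python
def expected_frequencies(observed):
    """Return expected counts, one per cell: row total * column total / N."""
    row_totals = [sum(row) for row in observed]        # row marginals
    col_totals = [sum(col) for col in zip(*observed)]  # column marginals
    n = sum(row_totals)                                # grand total
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: Voted for Bush, Voted for Kerry. Columns: Male, Female.
observed = [[50, 0],
            [0, 50]]
print(expected_frequencies(observed))  # [[25.0, 25.0], [25.0, 25.0]]
```

Every cell has row total 50, column total 50, and N = 100, so each expected count is 25, matching the slide's worked value.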
6
Nonparametric Test of χ²
  • Again, the basic question is: was what you are
    observing in some given data created by chance or
    through some systematic process?

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
O = Observed frequency, E = Expected frequency
7
Nonparametric Test of χ²
  • The null hypothesis we are testing here is that
    the proportions of occurrences in each category
    are equal to each other (H0: B = K). Our research
    hypothesis is that they are not equal (Ha: B ≠ K).
  • Given the sample size, how many cases could we
    expect in each category (n/categories)? Comparing
    the obtained value with the critical value gives
    a coefficient and a probability (Pr.) that results
    like these would arise by chance.
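As a rough illustration of the one-dimensional case, the sketch below uses n/categories as the expected count for every category. The vote counts (60 and 40) are hypothetical numbers chosen for the example, and `chi_square_one_way` is an illustrative name.

```python
def chi_square_one_way(observed):
    """One-dimensional chi-square with equal expected counts per category."""
    n = sum(observed)
    expected = n / len(observed)  # n / categories, as on the slide
    return sum((o - expected) ** 2 / expected for o in observed)


# Hypothetical sample: 60 votes for Bush (B), 40 for Kerry (K).
votes = [60, 40]
print(chi_square_one_way(votes))  # 4.0
```

If the two categories were exactly equal (50 and 50), the statistic would be 0, which is what the null hypothesis predicts apart from sampling noise.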

8
Let's do a χ²
  • (50 - 25)² / 25 = 25
  • (0 - 25)² / 25 = 25
  • (0 - 25)² / 25 = 25
  • (50 - 25)² / 25 = 25
  • χ² = 100

                   Male   Female
Voted for Bush      50       0
Voted for Kerry      0      50

What would χ² be when there is statistical
independence?
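The cell-by-cell arithmetic on this slide can be checked with a short script. This is a minimal sketch: the function name `chi_square` is an assumption, and the independence table is the 25/25/25/25 table from the earlier slide.

```python
def chi_square(observed):
    """Sum (O - E)^2 / E over every cell of a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count for cell (i, j)
            statistic += (o - e) ** 2 / e
    return statistic


perfect = [[50, 0], [0, 50]]        # perfect relationship
independent = [[25, 25], [25, 25]]  # statistical independence
print(chi_square(perfect))      # 100.0
print(chi_square(independent))  # 0.0
```

Under independence every observed count equals its expected count, so each term is zero and the statistic is 0.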
9
Let's corroborate with SPSS
10
Testing for significance
  • How do we know if the relationship is
    statistically significant?
  • We need to know the df: df = (R - 1)(C - 1)
  • (2 - 1)(2 - 1) = 1
  • We go to the χ² distribution to look for the
    critical value (CV = 3.84 at the .05 level)
  • Because the obtained χ² exceeds the critical
    value, we conclude that the relationship between
    gender and voting is statistically significant.

                   Male   Female
Voted for Bush      20      30
Voted for Kerry     30      20

χ² = 4
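The significance check can be sketched as follows. The helper names are illustrative assumptions. The p-value shortcut relies on the fact that a chi-square variable with 1 df is a squared standard normal, so it is valid for df = 1 only; a statistics package would be the usual tool for other df.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """df = (R - 1)(C - 1) for an R x C contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def p_value_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 df (df = 1 only)."""
    return math.erfc(math.sqrt(statistic / 2))


CRITICAL_VALUE = 3.84  # from the slide: df = 1, .05 level

statistic = 4.0  # obtained chi-square for the 20/30/30/20 table
df = degrees_of_freedom(2, 2)
p = p_value_df1(statistic)
print(df, statistic > CRITICAL_VALUE, round(p, 4))  # 1 True 0.0455
```

The obtained value of 4 clears the critical value of 3.84 only narrowly, which is why the p-value sits just under .05.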
11
When is χ² appropriate to use?
  • χ² is perhaps the most widely used statistical
    technique to analyze nominal and ordinal data
  • Nominal × nominal (gender and voting preferences)
  • Nominal × ordinal (gender and opinion of W)

12
χ² can also be used with larger tables
Opinion of Bush    MALE         FEMALE       Total
Favorable          40 (19.4)     5 (15.8)      45
Indifferent        10 (.88)     20 (.72)       30
Unfavorable        15 (8.6)     55 (6.9)       70
Total              65           80            145
(Values in parentheses are each cell's contribution
to χ², i.e. (O - E)² / E.)
χ² = 52.3. Do we reject the null hypothesis?
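The same calculation extends to the 3 × 2 table. The sketch below recomputes each cell's contribution; variable names are illustrative assumptions. Working from unrounded values gives a total of about 52.42 rather than the slide's 52.3, which is the sum of the rounded cell values. The p-value line uses the closed form exp(-χ²/2), which holds for df = 2 only.

```python
import math

# Rows: Favorable, Indifferent, Unfavorable. Columns: Male, Female.
observed = [[40, 5],
            [10, 20],
            [15, 55]]

row_totals = [sum(row) for row in observed]        # 45, 30, 70
col_totals = [sum(col) for col in zip(*observed)]  # 65, 80
n = sum(row_totals)                                # 145

# Each cell's contribution (O - E)^2 / E, with E = row total * column total / N.
contributions = [
    [(o - r * c / n) ** 2 / (r * c / n) for o, c in zip(row, col_totals)]
    for row, r in zip(observed, row_totals)
]
statistic = sum(sum(row) for row in contributions)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (3 - 1)(2 - 1) = 2
p_value = math.exp(-statistic / 2)                 # exact for df = 2 only

print(round(statistic, 2), df)  # 52.42 2
print(p_value < 0.05)           # True
```

With df = 2 the .05 critical value is 5.99, so a statistic above 52 points to rejecting the null hypothesis of independence.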
13
Correlation (Does not mean causation)
  • We want to know how two variables are related to
    each other
  • Does eating doughnuts affect weight?
  • Does spending more hours studying increase test
    scores?
  • Correlation means how much two variables overlap
    with each other

14
Types of Correlations
X (cause)                 Y (effect)        Correlation   Values
Increases                 Increases         Positive      0 to 1
Decreases                 Decreases         Positive      0 to 1
Increases                 Decreases         Negative      -1 to 0
Decreases                 Increases         Negative      -1 to 0
Increases or decreases    Does not change   Independent   0
15
Conceptualizing Correlation
[Figure: Measuring Development, contrasting a strong and a weak correlation; variables shown: GDP, POP WEIGHT, EDUCATION]
Correlation will be associated with what type of
validity?
16
Correlation Coefficient
17
Home Value & Square Footage
         Log value   Log sqft   Value²     Sqft²      Value × sqft
         5.13        4.02       26.3169    16.1604    20.6226
         5.2         4.54       27.04      20.6116    23.608
         4.53        3.53       20.5209    12.4609    15.9909
         4.79        3.8        22.9441    14.44      18.202
         4.78        3.86       22.8484    14.8996    18.4508
         4.72        4.17       22.2784    17.3889    19.6824
Total    29.15       23.92      141.95     95.96      116.56
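The column sums in this table (ΣX, ΣY, ΣX², ΣY², ΣXY) are the ingredients of the computational form of Pearson's r, r = (NΣXY - ΣXΣY) / sqrt([NΣX² - (ΣX)²][NΣY² - (ΣY)²]). The slide's own formula did not survive in this transcript, so treating it as this form is an assumption. A minimal sketch using the six rows above:

```python
import math

log_value = [5.13, 5.2, 4.53, 4.79, 4.78, 4.72]
log_sqft = [4.02, 4.54, 3.53, 3.8, 3.86, 4.17]


def pearson_r(x, y):
    """Pearson correlation from raw sums (computational formula)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(log_value, log_sqft)
print(round(r, 2))  # 0.78
```

The numerator is a small difference between two large products, so the sketch works from the raw rows rather than the rounded totals to limit rounding error.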
18
Correlation Coefficient
19
Rules of Thumb
Size of correlation coefficient    General Interpretation
.8 - 1.0                           Very Strong
.6 - .8                            Strong
.4 - .6                            Moderate
.2 - .4                            Weak
.0 - .2                            Very Weak or no relationship
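These rules of thumb can be written as a small lookup. Two choices here are assumptions rather than anything stated on the slide: a value that falls exactly on a shared boundary (such as .8) is assigned to the stronger band, and negative coefficients are judged by their absolute size.

```python
def interpret_correlation(r):
    """Map a correlation coefficient to the slide's rule-of-thumb label."""
    size = abs(r)  # assumption: strength ignores the sign
    if size > 1:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if size >= 0.8:
        return "Very Strong"
    if size >= 0.6:
        return "Strong"
    if size >= 0.4:
        return "Moderate"
    if size >= 0.2:
        return "Weak"
    return "Very Weak or no relationship"


print(interpret_correlation(0.65))   # Strong
print(interpret_correlation(-0.85))  # Very Strong
```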
20
Multiple Correlation Coefficients
21
Limitation of correlation coefficients
  • They tell us how strongly two variables are related
  • However, r coefficients are limited because they
    cannot tell us anything about:
  • Causation between X and Y
  • Marginal impact of X on Y
  • What percentage of the variation of Y is
    explained by X
  • Forecasting
  • Because of the above, Ordinary Least Squares (OLS)
    is most useful
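To show what OLS adds, here is a minimal one-variable fit on the home value data from the earlier slide. Treating log square footage as X and log home value as Y is an assumption, as are the function names and the forecast point of 4.0.

```python
log_sqft = [4.02, 4.54, 3.53, 3.8, 3.86, 4.17]    # X (assumed predictor)
log_value = [5.13, 5.2, 4.53, 4.79, 4.78, 4.72]   # Y (assumed outcome)


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


intercept, slope = ols_fit(log_sqft, log_value)
fit = r_squared(log_sqft, log_value, intercept, slope)
forecast = intercept + slope * 4.0  # predicted log value at log sqft = 4.0

print(round(slope, 3), round(intercept, 3), round(fit, 3))
```

The slope is the marginal impact of X on Y, the R² value is the share of variation explained, and the fitted line supports forecasting. None of this settles causation, which still depends on the research design.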

22
Do you have the BLUES?
  • B for Best (Minimum error)
  • L for Linear (The form of the relationship)
  • U for Unbiased (does the parameter truly reflect
    the effect?)
  • E for Estimator

23
Home value and sq. Feet
Does the above line meet the BLUE criteria?