Title: Contingency Tables
1 Contingency Tables
- Tables representing all combinations of levels of explanatory and response variables
- Numbers in the table represent counts of the number of cases in each cell
- Row and column totals are called marginal counts
2 Example - EMT Assessment of Kids
- Explanatory variable: Child age (Infant, Toddler, Pre-school, School-age, Adolescent)
- Response variable: EMT assessment (Accurate, Inaccurate)
Source: Foltin, et al. (2002)
3 Pearson's Chi-Square Test
- Can be used for nominal or ordinal explanatory and response variables
- Variables can have any number of distinct levels
- Tests whether the distribution of the response variable is the same for each level of the explanatory variable (H0: No association between the variables)
- r = number of levels of explanatory variable
- c = number of levels of response variable
4 Pearson's Chi-Square Test
- Intuition behind test statistic:
- Obtain the marginal distribution of outcomes for the response variable
- Apply this common distribution to all levels of the explanatory variable, by multiplying each proportion by the corresponding sample size
- Measure the difference between the actual cell counts and the expected cell counts from the previous step
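The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own computation: the function names and the 2x2 table are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def expected_counts(observed):
    """Expected cell counts under no association.

    `observed` is a list of rows (levels of the explanatory variable);
    each row holds the counts for the levels of the response variable.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Step 1: marginal distribution of the response variable.
    marginal = [c / n for c in col_totals]
    # Step 2: apply that common distribution to each row's sample size.
    return [[p * r for p in marginal] for r in row_totals]


def cell_differences(observed, expected):
    # Step 3: observed minus expected, cell by cell.
    return [[o - e for o, e in zip(orow, erow)]
            for orow, erow in zip(observed, expected)]


# Hypothetical counts (NOT from either example in these slides).
observed = [[30, 20],
            [10, 40]]
expected = expected_counts(observed)
diffs = cell_differences(observed, expected)
print(expected)
print(diffs)
```

In any table the differences sum to zero within each row and each column, which is one reason the test statistic squares them before adding.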
5 Pearson's Chi-Square Test
- Notation to obtain test statistic:
- Rows represent explanatory variable (r levels)
- Columns represent response variable (c levels)
6 Pearson's Chi-Square Test
- Marginal distribution of response and expected cell counts under the hypothesis of no association
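The formulas for this slide did not survive extraction. One common way to write them is given below. The symbols are assumptions rather than the slides' own notation: n_{i.} is the total for row i, n_{.j} the total for column j, n the overall total, and f_e the expected count.

```latex
\hat{p}_{j} = \frac{n_{\cdot j}}{n},
\qquad
f_{e,ij} = n_{i\cdot}\,\hat{p}_{j} = \frac{n_{i\cdot}\; n_{\cdot j}}{n},
\qquad i = 1,\dots,r,\quad j = 1,\dots,c
```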
7 Pearson's Chi-Square Test
- H0: No association between variables
- HA: Variables are associated
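The test statistic and decision rule for these hypotheses are usually written as follows. This is a standard form, with notation assumed rather than taken from the slides: f_o and f_e are the observed and expected counts in cell (i, j), and alpha is the chosen significance level.

```latex
\chi^2_{obs} = \sum_{i=1}^{r}\sum_{j=1}^{c}
  \frac{\left(f_{o,ij} - f_{e,ij}\right)^2}{f_{e,ij}},
\qquad
\text{reject } H_0 \text{ if } \chi^2_{obs} \ge \chi^2_{\alpha,\,(r-1)(c-1)}
```

The P-value is the area above the observed statistic in the chi-square distribution with (r-1)(c-1) degrees of freedom.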
8 Example - EMT Assessment of Kids
Observed
Expected
9 Example - EMT Assessment of Kids
- Note that each expected count is the row total times the column total, divided by the overall total. For the first cell in the table:
- The contribution to the test statistic for this cell is:
10 Example - EMT Assessment of Kids
- H0: No association between variables
- HA: Variables are associated
Reject H0 and conclude that the accuracy of assessments differs among age groups.
11 Example - SPSS Output
12 Example - Cyclones Near Antarctica
- Period of study: September 1973 to May 1975
- Explanatory variable: Region (40-49, 50-59, 60-79) (degrees south latitude)
- Response: Season (Aut(4), Wtr(5), Spr(4), Sum(8)) (number of months in parentheses)
- Units: Cyclones in the study area
- Treating the observed cyclones as a random sample of all cyclones that could have occurred
Source: Howarth (1983), "An Analysis of the Variability of Cyclones around Antarctica and Their Relation to Sea-Ice Extent," Annals of the Association of American Geographers, Vol. 73, pp. 519-537
13 Example - Cyclones Near Antarctica
For each region (row) we can compute the percentage of storms occurring during each season, which is the conditional distribution. Of the 1517 cyclones in the 40-49 band, 370 occurred in Autumn, a proportion of 370/1517 = .244, or 24.4%.
14 Example - Cyclones Near Antarctica
Graphical Conditional Distributions for Regions
15 Example - Cyclones Near Antarctica
Observed Cell Counts (fo)
Note that overall, (1876/9165)100 = 20.5% of all cyclones occurred in Autumn. If we apply that percentage to the 1517 that occurred in the 40-49°S band, we would expect (0.2047)(1517) = 310.5 to have occurred in the first cell of the table.
The full table of fe
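As a check on the arithmetic for the first cell, the short Python sketch below recomputes the expected count from the totals quoted on this slide, along with that cell's contribution to the test statistic. The variable names are illustrative. Only the four numbers quoted in the slides are used, since the full observed table is not reproduced here.

```python
# Values quoted in the slides for the 40-49 degrees S band and Autumn.
row_total = 1517     # cyclones in the 40-49 degrees S band
col_total = 1876     # cyclones in Autumn, all bands combined
grand_total = 9165   # all cyclones in the study
observed = 370       # Autumn cyclones in the 40-49 degrees S band

# Expected count: row total times column total, divided by overall total.
expected = row_total * col_total / grand_total

# This cell's contribution to the chi-square statistic.
contribution = (observed - expected) ** 2 / expected

print(f"expected = {expected:.1f}")
print(f"contribution = {contribution:.2f}")
```

The expected count agrees with the 310.5 quoted above. Using the rounded proportion 0.205 instead would give about 311.0, so the unrounded ratio is the safer choice.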
16 Example - Cyclones Near Antarctica
Computation of χ²
17 Example - Cyclones Near Antarctica
- H0: Seasonal distribution of cyclone occurrences is independent of latitude band
- HA: Seasonal distributions of cyclone occurrences differ among latitude bands
- Test statistic: χ² = 71.2
- P-value: Area in the chi-squared distribution with (3-1)(4-1) = 6 degrees of freedom above 71.2
- From Table 8.5, P(χ² ≥ 22.46) = .001, so P < .001
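The tail area can also be computed directly rather than bounded from a table. The sketch below is an illustration, not the method used in the slides. It relies on a closed form that holds only when the degrees of freedom are even, which is the case here (6), and the function name is hypothetical.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(chi-square with `df` degrees of freedom >= x), for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    # For df = 2k the upper tail equals
    #   exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    total = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * total


df = (3 - 1) * (4 - 1)                                    # 6 degrees of freedom
p_value = chi2_upper_tail_even_df(71.2, df)               # area above 71.2
tail_at_table_value = chi2_upper_tail_even_df(22.46, df)  # area above 22.46

print(f"df = {df}")
print(f"P(chi2 >= 71.2)  = {p_value:.3g}")
print(f"P(chi2 >= 22.46) = {tail_at_table_value:.4f}")
```

The area above 22.46 comes out very close to .001, matching the tabled value, and the area above 71.2 is many orders of magnitude smaller. For odd degrees of freedom a statistics library would be needed instead of this shortcut.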
18 SPSS Output - Cyclone Example
P-value
19 Data Sources
- Foltin, G., D. Markinson, M. Tunik, et al. (2002). "Assessment of Pediatric Patients by Emergency Medical Technicians-Basic," Pediatric Emergency Care, 18:81-85.
- Howarth, D.A. (1983). "An Analysis of the Variability of Cyclones around Antarctica and Their Relation to Sea-Ice Extent," Annals of the Association of American Geographers, 73:519-537.