Title: Contingency Tables
1Lesson 12 - 2
- Contingency Tables and Association
2Objectives
- Compute the marginal distribution of a variable
- Use the conditional distribution to identify
association among categorical data
3Vocabulary
- Contingency Table: relates two categories of data
- Marginal Distribution: a frequency or relative
frequency distribution of either the row or column
variable in the contingency table
- Conditional Distribution: lists the relative
frequency of each category of a variable, given a
specific value of the other variable in the
contingency table
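The two definitions above can be written as formulas. This is only a sketch: the notation (n_{ij} for the count in row i and column j, dots for summed-over indices) is an assumption made here for illustration and does not come from the lesson.

```latex
% Assumed notation (not from the lesson):
% n_{ij} = count in row i, column j of the contingency table; n = grand total.
\[
n_{i\cdot} = \sum_{j} n_{ij}, \qquad
n_{\cdot j} = \sum_{i} n_{ij}, \qquad
n = \sum_{i}\sum_{j} n_{ij}
\]
% Marginal distributions use only the row totals or only the column totals.
\[
\text{marginal relative frequency of row } i = \frac{n_{i\cdot}}{n}, \qquad
\text{marginal relative frequency of column } j = \frac{n_{\cdot j}}{n}
\]
% Conditional distribution of the row variable, given column j.
\[
\text{conditional relative frequency of row } i \text{ given column } j
  = \frac{n_{ij}}{n_{\cdot j}}
\]
```

For a fixed column j, the conditional relative frequencies add up to 1 over the rows, which is what makes columns with different totals comparable.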
4Requirements
- To describe the association between two
categorical variables, relative frequencies
(percentages) must be used, because there will
likely be different numbers of observations for
each of the categories
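A minimal sketch of why raw counts can mislead when the categories have different numbers of observations. The counts and the names `group_a`, `group_b`, and `relative_frequencies` are hypothetical, chosen only for illustration; they are not data from the lesson.

```python
# Hypothetical counts (NOT from the lesson): candy type preferred in two groups
# that have very different numbers of observations.
group_a = {"Plain": 30, "Peanut": 20}    # 50 observations in total
group_b = {"Plain": 40, "Peanut": 160}   # 200 observations in total


def relative_frequencies(counts):
    """Convert a dict of counts into a dict of relative frequencies."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


rel_a = relative_frequencies(group_a)
rel_b = relative_frequencies(group_b)

# Raw counts suggest group B prefers Plain more often (40 versus 30),
# but the relative frequencies show the opposite pattern.
print("Plain, raw counts:          ", group_a["Plain"], "vs", group_b["Plain"])
print("Plain, relative frequencies:", rel_a["Plain"], "vs", rel_b["Plain"])
```

In this made-up example, comparing 40 with 30 would point the wrong way; the relative frequencies put both groups on the same scale.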
5Contingency Table
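Cat 1 (color of M&M, rows) by Cat 2 (type of M&M, columns)
Cat 1 \ Cat 2   Plain   Peanut   Almond   Total
Red
Blue
Yellow
Brown
Green
Total
- Interior cells (each color and type combination): conditional distributions
- Total column: totals by color of M&M
- Total row: totals by type of M&M
- Total row and Total column: marginal distributions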
6Conditional Distribution graph
- Problem 11, page 650: Abortions (in thousands)
completed in a year, by age and year
- Percentages in the Total cells represent the
marginal distributions
- Percentages in the other cells represent the
conditional distributions (cell / column total)
- Total numbers going down
- Conditional for under 19: decreasing
- Conditional for over 25: increasing
- Conditional for 20 - 24: pretty constant
Counts in thousands, with percentages in parentheses; columns are years
Age (yrs)   1990            1995            2000            Total
Under 19    369 (22.86%)    274 (20.15%)    244 (18.57%)    887 (20.69%)
20 - 24     532 (32.96%)    441 (32.43%)    430 (32.72%)    1403 (32.72%)
Over 25     713 (44.18%)    645 (47.43%)    640 (48.71%)    1998 (46.60%)
Total       1614 (37.64%)   1360 (31.72%)   1314 (30.64%)   4288
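The percentages in the table can be reproduced from the counts alone. The sketch below uses the counts from the table (in thousands); the row labels, variable names, and function names are illustrative choices, not part of the lesson.

```python
# Counts (in thousands) taken from the table above.
# Rows are age groups, columns are the years 1990, 1995, 2000.
years = ["1990", "1995", "2000"]
counts = {
    "Under 19": [369, 274, 244],
    "20 - 24": [532, 441, 430],
    "Over 25": [713, 645, 640],
}


def column_totals(table):
    """Total count for each column (each year)."""
    return [sum(column) for column in zip(*table.values())]


def marginal_by_row(table):
    """Marginal distribution of the row variable: row total / grand total, in percent."""
    grand_total = sum(sum(row) for row in table.values())
    return {label: 100 * sum(row) / grand_total for label, row in table.items()}


def marginal_by_column(table):
    """Marginal distribution of the column variable: column total / grand total, in percent."""
    totals = column_totals(table)
    grand_total = sum(totals)
    return [100 * total / grand_total for total in totals]


def conditional_by_column(table):
    """Conditional distribution of the row variable given each column: cell / column total, in percent."""
    totals = column_totals(table)
    return {
        label: [100 * cell / total for cell, total in zip(row, totals)]
        for label, row in table.items()
    }


for label, percentages in conditional_by_column(counts).items():
    print(label, [round(p, 2) for p in percentages])
```

Reading across a row of the printed output shows how each age group's share changes from year to year, which is the comparison the slide's observations are based on.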
An alternative graph
7Summary and Homework
- Summary
- Contingency tables are categorical data that have
a specific structure
- A row variable
- A column variable
- There are counts associated with each combination
of row variable value and column variable value
- Various row and column totals and row and column
frequencies can be used to summarize this data
- Homework
- pg 647 - 651: 1, 3, 4, 7, 11, 13
8Comments
- Since the data is population data, no inferential
statistical comparisons are done.
- Since much of the data is observational, beware
of making any statements regarding causation.
9Even Homework Answers
- 4: Since each category could have a different
total number of observations, the only safe way to
compare is through percentages (or relative
frequencies).