Title: Multiple Contingency-Table Analysis
A. Philosophical Introduction

We are now in a position to begin dealing with cause and effect, that is, causality. Let's take a look at what we are saying and what we are NOT saying when we describe something as the cause (X) of some effect (Y):

      X → Y

There is nothing mystical, metaphysical, or superhuman about this. We are simply playing a game, one with rules created by human beings.
To label one variable the cause of another variable is nothing more than to have gathered the evidence required by these rules in order to persuade other human beings that the labels "cause" and "effect" are being properly used. We have not ripped back the surface and exposed the gears and circuits that make the universe work. All we have done is satisfy the rules of the game sufficiently to be granted by others the right to use these labels.

What are the rules? There are three of them.
B. Criteria for Evaluating Causality

The three criteria (rules) that you must demonstrate to be allowed to label some X the cause of some Y are:

- Covariation: that is, the independent variable (X) and the dependent variable (Y) must covary (i.e., must NOT be statistically independent).
Remember statistical independence? It can look like this, . . .

Years of Formal      Annual Salary
Education (X)        (in 1,000) (Y)
-----------------------------------
      10                  35
      16                  35
      21                  35
-----------------------------------
. . . or it can look like this. (A constant cannot explain a variable.)

Years of Formal      Annual Salary
Education (X)        (in 1,000) (Y)
-----------------------------------
      16                  25
      16                  85
      16                  45
-----------------------------------
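As a small numerical check of the two tables above, the sketch below computes the covariance of X and Y for each. The function name `covariance` is illustrative and is not part of the original slides. In both tables one of the two variables is a constant, so the covariance comes out as zero.

```python
def covariance(xs, ys):
    """Population covariance of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

# First table: education varies, salary is constant.
cov_constant_y = covariance([10, 16, 21], [35, 35, 35])

# Second table: education is constant, salary varies.
cov_constant_x = covariance([16, 16, 16], [25, 85, 45])

print(cov_constant_y, cov_constant_x)
```

Whichever variable is held constant, every deviation from its mean is zero, so no product of deviations can contribute to the covariance.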
- Temporal priority: the proposed cause (X) MUST precede in time the proposed effect (Y).

      X → Y
      t1  t2

- Nonspuriousness: NO variables OTHER THAN the proposed cause (X) could have produced the proposed effect (Y).
Before going any further, two qualifiers must be noted:

- Monocausal: sounds like a search for THE ONE cause.
- Deterministic: seems to say that the presence of the cause GUARANTEES the production of the effect.
[Diagram: controlled experiment]

Test Group:       R     Xij     Yij
Control Group:    R             Yij
                        t1      t2
The principal weapon that the controlled experiment possesses is physical control.

- Covariation: established with a t-test or analysis of variance at the end of the experiment.
- Time order: no problem. We physically manipulate the treatment (X), so we know the temporal sequence.
- Nonspuriousness: control all potentially spurious variables through both random selection (R1) and random assignment (R2) to groups (test and control) and through control of the physical environment during the experiment.

At the end of the experiment, change could ONLY have been caused by the ONE THING that varied: the treatment, present in the test group and absent in the control group.
Statistical control is the next best thing in non-experimental settings.

- Covariation: use a statistical measure of association (like λ) AND a significance test (χ²).
- Time order: can be a problem. Research design, measurement, and logic (especially in the case of demographic variables) are ways of establishing it.
- Nonspuriousness: this is the real issue.
    - We usually have no physical control over subjects in field research.
    - Strategy: homogenize samples with respect to categories of control variables.
    - We need to both know and be able to measure potentially spurious variables in order to do this.
Steps in Statistical Control Using Contingency Tables

1. Create a zero-order table and statistics (bivariate relationship, nothing controlled).
2. Sort the data by categories of the control variable(s).
3. Generate first-order partial tables and recalculate the same test statistics as for the zero-order table.
4. Compare zero-order results with partial-table results for each potentially spurious (control) variable.
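The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `crosstab` and `partial_tables` and the toy records are hypothetical, and the sketch stops at building the tables (the test statistics would then be computed on each one).

```python
from collections import Counter

def crosstab(records, x, y):
    """Count the cases in each (x category, y category) cell."""
    return Counter((r[x], r[y]) for r in records)

def partial_tables(records, x, y, z):
    """Group the records by category of the control variable z,
    then crosstabulate x by y inside each group."""
    groups = {}
    for r in records:
        groups.setdefault(r[z], []).append(r)
    return {level: crosstab(rows, x, y) for level, rows in groups.items()}

# Hypothetical toy data, made up purely for illustration.
records = [
    {"x": "a", "y": "yes", "z": "young"},
    {"x": "a", "y": "no", "z": "young"},
    {"x": "b", "y": "no", "z": "young"},
    {"x": "a", "y": "yes", "z": "old"},
    {"x": "b", "y": "yes", "z": "old"},
]

zero_order = crosstab(records, "x", "y")            # step 1
partials = partial_tables(records, "x", "y", "z")   # steps 2 and 3
for level, table in partials.items():               # step 4: compare by eye
    print(level, dict(table))
```

Because every case falls into exactly one category of the control variable, the cells of the partial tables always add back up to the zero-order table.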
The Elaboration Model

The introduction of a control variable will result in one of three outcomes. Each has a special name, and each outcome says something different about X as the possible cause of Y:

1. Replication
2. Explanation
3. Specification
(1) Covariation and (2) time order: zero-order

      X → Y
      t1  t2
Replication

The introduction of a control variable produces NO CHANGES in the measures of association for the partial tables AND the relationship remains statistically significant.

1. Replication (first-order partial)

        Z              not-Z
      X → Y            X → Y
      t1  t2           t1  t2
Explanation

The introduction of a control variable completely "washes out" all associations in the partial tables (i.e., measures of association are all close to 0.0 AND the relationships are no longer statistically significant). This could mean one of two things, depending on the time order of the three variables.

2. Washed out (first-order partial): statistical independence of X and Y

        Z              not-Z
      X   Y            X   Y
      t1  t2           t1  t2

2a. Spuriousness (first-order partial)

      Z     X     Y
      t1    t2    t3

Z precedes both X and Y and accounts for their association.

2b. Explanation (first-order partial)

      Z  →  X  →  Y
      t1    t2    t3
Specification

The introduction of a control variable results in increased strength of association between the two variables in (at least) one of the partial tables compared to the zero-order table (with the relationship remaining statistically significant), while the other partial table(s) show(s) decreased association AND the ABSENCE of statistical significance. Here we have identified a CONTEXTUAL variable (like a catalyst in a chemical reaction).

3. Specification (first-order partial)

        Z              not-Z
      X → Y            X   Y
      t1  t2           t1  t2
Zero-order table

                     Gender (X)
Smoke (Y)        Male    Female     Total
Yes               239        80       319
No                174       523       697
Total             413       603     1,016

Lambda = 0.4720, Chi-Square = 226.3868
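The chi-square for a 2x2 table such as this one can be checked by hand or with a short script. The sketch below uses the standard shortcut formula for a 2x2 table and only the Python standard library; the function name `chi_square_2x2` is illustrative. It also computes phi, the square root of chi-square divided by N. For this table phi works out to about 0.472, which is the value the slide reports next to the chi-square.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction), its p-value for
    1 degree of freedom, and phi for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of a chi-square with 1 df, via the normal distribution.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    phi = math.sqrt(chi2 / n)
    return chi2, p_value, phi

# Cells from the gender-by-smoking table: Yes row, then No row.
chi2, p_value, phi = chi_square_2x2(239, 80, 174, 523)
print(round(chi2, 4), round(phi, 4))  # about 226.39 and 0.472
```

The same function can be applied to each partial table in turn, which is all that the "recalculate the same test statistics" step requires for dichotomous variables.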
(1) Covariation and (2) time order: zero-order

      X → Y
      t1  t2
Respondents Under the Age of 40 (Z1)

                     Gender (X)
Smoke (Y)        Male    Female     Total
Yes               143        48       191
No                104       314       418
Total             247       362       609

Lambda = 0.4724, Chi-Square = 135.8831

Respondents 40 Years of Age and Over (Z2)

                     Gender (X)
Smoke (Y)        Male    Female     Total
Yes                96        32       128
No                 70       209       279
Total             166       241       407

Lambda = 0.4716, Chi-Square = 90.5035

1. Replication (first-order partial)

        Z              not-Z
      X → Y            X → Y
      t1  t2           t1  t2
Respondents Under the Age of 40 (Z1)

                     Gender (X)
Smoke (Y)        Male    Female     Total
Yes               152       152       304
No                152       153       305
Total             304       305       609

Lambda = 0.0016, Chi-Square = 0.0013

Respondents 40 Years of Age and Over (Z2)

                     Gender (X)
Smoke (Y)        Male    Female     Total
Yes               101       102       203
No                102       102       204
Total             203       204       407

Lambda = 0.0024, Chi-Square = 0.0024

2. Washed out (first-order partial): statistical independence of X and Y

        Z              not-Z
      X   Y            X   Y
      t1  t2           t1  t2

2a. Spuriousness (first-order partial)

      Z     X     Y
      t1          t2
Respondents Under the Age of 40 (Z1)

                     Gender (X)
Smoke (Y)        Male    Female     Total
Yes               164       140       304
No                156       149       305
Total             320       289       609

Lambda = 0.0281, Chi-Square = 0.4786

Respondents 40 Years of Age and Over (Z2)

                     Gender (X)
Smoke (Y)        Male    Female     Total
Yes               173        31       204
No                 30       173       203
Total             203       204       407

Lambda = 0.7003, Chi-Square = 199.5759

3. Specification (first-order partial)

        Z              not-Z
      X → Y            X   Y
      t1  t2           t1  t2
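The three outcomes can be told apart mechanically once cut-offs are chosen. The sketch below is one possible reading of the definitions in these slides, not a standard procedure: the function name `classify` and the cut-offs `tolerance` and `near_zero` are hypothetical choices made for illustration. The inputs are the (association, chi-square) pairs from the gender-and-smoking tables, and significance is judged at 1 degree of freedom.

```python
import math

def p_value_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))

def classify(zero_order, partials, alpha=0.05, tolerance=0.05, near_zero=0.10):
    """Label the outcome of introducing a control variable.

    zero_order and each partial are (association, chi_square) pairs.
    tolerance and near_zero are assumed cut-offs, not slide values.
    """
    zero_assoc, _ = zero_order
    assocs = [assoc for assoc, _ in partials]
    significant = [p_value_1df(chi2) < alpha for _, chi2 in partials]
    if all(significant) and all(abs(a - zero_assoc) <= tolerance for a in assocs):
        return "replication"
    if not any(significant) and all(abs(a) <= near_zero for a in assocs):
        return "explanation (washed out)"
    stronger = any(s and a > zero_assoc for a, s in zip(assocs, significant))
    if stronger and not all(significant):
        return "specification"
    return "unclear"

zero = (0.4720, 226.3868)
replication = classify(zero, [(0.4724, 135.8831), (0.4716, 90.5035)])
washed_out = classify(zero, [(0.0016, 0.0013), (0.0024, 0.0024)])
specification = classify(zero, [(0.0281, 0.4786), (0.7003, 199.5759)])
print(replication, washed_out, specification, sep="\n")
```

Real data rarely fall this cleanly into one category, so the "unclear" branch matters, and the time order of the variables still has to be argued separately before a washed-out result is read as spuriousness.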
Problems with the Elaboration Model

- The greatest difficulty is running out of cases
    - with multiple control variables
    - with control variables having multiple categories
    - with polytomous variables anywhere
    - thus, it tends to be used with ONE control variable at a time, which is not a true evaluation of causality
- Difficulty of interpretation in complex tables
- How different is "different" in concluding specification versus some form of explanation?
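The "running out of cases" problem is simple arithmetic: each added control variable multiplies the number of cells among which the same cases must be spread. The sketch below illustrates this with 63 cases, the sample size of the SAS example in these slides. The function name and the particular category counts are assumptions made for illustration.

```python
from math import prod

def average_cases_per_cell(n_cases, x_categories, y_categories, control_categories=()):
    """Average number of cases per cell once the X-by-Y table is split
    by every combination of the control variables' categories."""
    cells = x_categories * y_categories * prod(control_categories)
    return n_cases / cells

no_control = average_cases_per_cell(63, 2, 2)            # zero-order 2x2 table
one_control = average_cases_per_cell(63, 2, 2, (2,))     # one dichotomous control
two_controls = average_cases_per_cell(63, 2, 2, (2, 3))  # add a three-category control
print(no_control, one_control, two_controls)
```

With two modest control variables the average cell already holds fewer than three cases, far too few for a chi-square test to be trusted.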
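Using SAS to Perform Multiple Contingency-Table Analysis

    libname old 'a:\';
    libname library 'a:\';

    options formchar='|----|+|---+=|-/\<>*' ps=66 nodate nonumber;

    /* Zero-order table: crime rate by city spending, nothing controlled */
    proc freq data=old.cities;
      table crimes*cityspnd / all;
      title1 'Multiple Contingency-Table Analysis';
      title2;
      title3 'Zero-Order Table';
    run;

    /* BY-group processing requires data sorted by the control variable */
    proc sort data=old.cities out=temp;
      by citysize;
    run;

    /* First-order partial tables: one table per category of city size */
    proc freq data=temp;
      table crimes*cityspnd / all;
      by citysize;
      title1 'Multiple Contingency-Table Analysis';
      title2;
      title3 'First-Order Partial Tables';
    run;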
Multiple Contingency-Table Analysis

Zero-Order Table

TABLE OF CRIMES BY CITYSPND

CRIMES (CRIME RATE, DICHOTOMY)
CITYSPND (CITY SPENDING, DICHOTOMY)

Frequency |
Percent   |
Row Pct   |
Col Pct   |   Less  |   More  |   Total
----------+---------+---------+---------
Lo_Crime  |     24  |     11  |      35
          |  38.10  |  17.46  |   55.56
          |  68.57  |  31.43  |
          |  55.81  |  55.00  |
----------+---------+---------+---------
Hi_Crime  |     19  |      9  |      28
          |  30.16  |  14.29  |   44.44
          |  67.86  |  32.14  |
          |  44.19  |  45.00  |
----------+---------+---------+---------
Total     |     43  |     20  |      63
          |  68.25  |  31.75  |  100.00
Multiple Contingency-Table Analysis

Zero-Order Table

STATISTICS FOR TABLE OF CRIMES BY CITYSPND

Statistic                            DF     Value      Prob
------------------------------------------------------------
Chi-Square                            1     0.004     0.952
Likelihood Ratio Chi-Square           1     0.004     0.952
Continuity Adj. Chi-Square            1     0.000     1.000
Mantel-Haenszel Chi-Square            1     0.004     0.952
Fisher's Exact Test (Left)                            0.631
                    (Right)                           0.582
                    (2-Tail)                          1.000
Phi Coefficient                             0.008
Contingency Coefficient                     0.008
Cramer's V                                  0.008

Statistic                                   Value       ASE
------------------------------------------------------------
Gamma                                       0.016     0.272
Kendall's Tau-b                             0.008     0.126
Stuart's Tau-c                              0.007     0.117
Somers' D C|R                               0.007     0.118
Somers' D R|C                               0.008     0.135
Pearson Correlation                         0.008     0.126
Spearman Correlation                        0.008     0.126
Lambda Asymmetric C|R                       0.000     0.000
Lambda Asymmetric R|C                       0.000     0.000
Lambda Symmetric                            0.000     0.000
Uncertainty Coefficient C|R                 0.000     0.002
Uncertainty Coefficient R|C                 0.000     0.001
Uncertainty Coefficient Symmetric           0.000     0.001
Multiple Contingency-Table Analysis

First-Order Partial Tables

----------------------- SIZE OF CITY, DICHOTOMY=Small ------------------------

TABLE OF CRIMES BY CITYSPND

CRIMES (CRIME RATE, DICHOTOMY)
CITYSPND (CITY SPENDING, DICHOTOMY)

Frequency |
Percent   |
Row Pct   |
Col Pct   |   Less  |   More  |   Total
----------+---------+---------+---------
Lo_Crime  |     19  |      9  |      28
          |  42.22  |  20.00  |   62.22
          |  67.86  |  32.14  |
          |  57.58  |  75.00  |
----------+---------+---------+---------
Hi_Crime  |     14  |      3  |      17
          |  31.11  |   6.67  |   37.78
          |  82.35  |  17.65  |
          |  42.42  |  25.00  |
----------+---------+---------+---------
Total     |     33  |     12  |      45
          |  73.33  |  26.67  |  100.00
Multiple Contingency-Table Analysis

First-Order Partial Tables

----------------------- SIZE OF CITY, DICHOTOMY=Small ------------------------

STATISTICS FOR TABLE OF CRIMES BY CITYSPND

Statistic                            DF     Value      Prob
------------------------------------------------------------
Chi-Square                            1     1.137     0.286
Likelihood Ratio Chi-Square           1     1.184     0.277
Continuity Adj. Chi-Square            1     0.516     0.472
Mantel-Haenszel Chi-Square            1     1.111     0.292
Fisher's Exact Test (Left)                            0.239
                    (Right)                           0.924
                    (2-Tail)                          0.488
Phi Coefficient                            -0.159
Contingency Coefficient                     0.157
Cramer's V                                 -0.159

Statistic                                   Value       ASE
------------------------------------------------------------
Gamma                                      -0.377     0.323
Kendall's Tau-b                            -0.159     0.139
Stuart's Tau-c                             -0.136     0.121
Somers' D C|R                              -0.145     0.128
Somers' D R|C                              -0.174     0.152
Pearson Correlation                        -0.159     0.139
Spearman Correlation                       -0.159     0.139
Lambda Asymmetric C|R                       0.000     0.000
Lambda Asymmetric R|C                       0.000     0.000
Lambda Symmetric                            0.000     0.000
Uncertainty Coefficient C|R                 0.023     0.040
Uncertainty Coefficient R|C                 0.020     0.035
Uncertainty Coefficient Symmetric           0.021     0.038
Multiple Contingency-Table Analysis

First-Order Partial Tables

----------------------- SIZE OF CITY, DICHOTOMY=Large ------------------------

TABLE OF CRIMES BY CITYSPND

CRIMES (CRIME RATE, DICHOTOMY)
CITYSPND (CITY SPENDING, DICHOTOMY)

Frequency |
Percent   |
Row Pct   |
Col Pct   |   Less  |   More  |   Total
----------+---------+---------+---------
Lo_Crime  |      5  |      2  |       7
          |  27.78  |  11.11  |   38.89
          |  71.43  |  28.57  |
          |  50.00  |  25.00  |
----------+---------+---------+---------
Hi_Crime  |      5  |      6  |      11
          |  27.78  |  33.33  |   61.11
          |  45.45  |  54.55  |
          |  50.00  |  75.00  |
----------+---------+---------+---------
Total     |     10  |      8  |      18
          |  55.56  |  44.44  |  100.00
Multiple Contingency-Table Analysis

First-Order Partial Tables

----------------------- SIZE OF CITY, DICHOTOMY=Large ------------------------

STATISTICS FOR TABLE OF CRIMES BY CITYSPND

Statistic                            DF     Value      Prob
------------------------------------------------------------
Chi-Square                            1     1.169     0.280
Likelihood Ratio Chi-Square           1     1.197     0.274
Continuity Adj. Chi-Square            1     0.354     0.552
Mantel-Haenszel Chi-Square            1     1.104     0.293
Fisher's Exact Test (Left)                            0.943
                    (Right)                           0.278
                    (2-Tail)                          0.367
Phi Coefficient                             0.255
Contingency Coefficient                     0.247
Cramer's V                                  0.255

Statistic                                   Value       ASE
------------------------------------------------------------
Gamma                                       0.500     0.387
Kendall's Tau-b                             0.255     0.223
Stuart's Tau-c                              0.247     0.218
Somers' D C|R                               0.260     0.227
Somers' D R|C                               0.250     0.220
Pearson Correlation                         0.255     0.223
Spearman Correlation                        0.255     0.223
Lambda Asymmetric C|R                       0.125     0.388
Lambda Asymmetric R|C                       0.000     0.452
Lambda Symmetric                            0.067     0.363
Uncertainty Coefficient C|R                 0.048     0.086
Uncertainty Coefficient R|C                 0.050     0.089
Uncertainty Coefficient Symmetric           0.049     0.087
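The lambda values in this output can be reproduced by hand. The sketch below implements Goodman and Kruskal's lambda as a proportional reduction in prediction errors, following the usual textbook definition. The function names are illustrative, and the tables are entered as lists of rows (Lo_Crime, then Hi_Crime). C|R means the column variable is predicted from the row variable.

```python
def lambda_c_given_r(table):
    """Asymmetric lambda: how much knowing the row category reduces
    errors in predicting the column category."""
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    best_guess_alone = max(col_totals)                 # guess the modal column
    best_guess_by_row = sum(max(row) for row in table) # guess the mode in each row
    return (best_guess_by_row - best_guess_alone) / (n - best_guess_alone)

def lambda_r_given_c(table):
    """Asymmetric lambda with the roles of rows and columns swapped."""
    return lambda_c_given_r([list(col) for col in zip(*table)])

def lambda_symmetric(table):
    """Symmetric lambda: pools the error reduction in both directions."""
    n = sum(sum(row) for row in table)
    max_row_total = max(sum(row) for row in table)
    max_col_total = max(sum(col) for col in zip(*table))
    row_modes = sum(max(row) for row in table)
    col_modes = sum(max(col) for col in zip(*table))
    return (row_modes + col_modes - max_col_total - max_row_total) / (
        2 * n - max_col_total - max_row_total)

large_cities = [[5, 2], [5, 6]]
zero_order = [[24, 11], [19, 9]]
print(lambda_c_given_r(large_cities),
      lambda_r_given_c(large_cities),
      round(lambda_symmetric(large_cities), 3))
print(lambda_c_given_r(zero_order))
```

For the large cities this gives 0.125, 0.0, and about 0.067, in line with the SAS listing. A lambda of zero, as in the zero-order table, means only that the modal prediction does not change across categories, which is why lambda can be zero even when other measures of association are not.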
Causal Modeling with Discrete Variables

Attached is (selected) hypothetical output from three crosstabulations conducted using SAS. The first is the result of a (zero-order) crosstabulation between serious crimes per 1,000 population (CRIME) and per capita income (INCOME) for a random sample of 63 cities in the United States. The second and third results are for the crosstabulation of crime and per capita income with size of city (CITYSIZE), large and small respectively, held constant.

In this exercise, CRIME is the dependent variable (Y), INCOME is the independent variable (X), and CITYSIZE is the control variable (Z). Assume the following time order among the three variables: CITYSIZE precedes INCOME, which precedes CRIME. Set α = 0.05 and answer the following questions.

1. Is the criterion of covariation between INCOME and CRIME satisfied by the zero-order crosstabulation? ________

2. Is the relationship between INCOME and CRIME spurious based upon the results in the partial tables? ________

3. How would you describe the relationship between the three variables, INCOME, CRIME, and CITYSIZE?
Results for Zero-Order Table

TABLE OF INCOME (ROWS) BY CRIME (COLUMNS)

TEST STATISTIC            VALUE     DF     PROB
PEARSON CHI-SQUARE        5.818      1     .039

COEFFICIENT               VALUE
LAMBDA                    .3427

Results for First-Order Partial Table: Large Cities ONLY

TABLE OF INCOME (ROWS) BY CRIME (COLUMNS)
FOR THE FOLLOWING VALUES: CITYSIZE = 1 (Large)

TEST STATISTIC            VALUE     DF     PROB
PEARSON CHI-SQUARE        4.967      1     .041

COEFFICIENT               VALUE
LAMBDA                    .2996

Results for First-Order Partial Table: Small Cities ONLY

TABLE OF INCOME (ROWS) BY CRIME (COLUMNS)
FOR THE FOLLOWING VALUES: CITYSIZE = 0 (Small)

TEST STATISTIC            VALUE     DF     PROB
PEARSON CHI-SQUARE        4.833      1     .044

COEFFICIENT               VALUE
LAMBDA                    .2895
Causal Modeling with Discrete Variables: Answers

The answers below use the same output and assumptions as the exercise (α = 0.05; CITYSIZE precedes INCOME, which precedes CRIME).

1. Is the criterion of covariation between INCOME and CRIME satisfied by the zero-order crosstabulation?
   Yes. The zero-order chi-square is statistically significant (PROB = .039 < 0.05), with lambda = .3427.

2. Is the relationship between INCOME and CRIME spurious based upon the results in the partial tables?
   No. Both partial tables remain statistically significant (PROB = .041 and .044), and lambda stays close to its zero-order value (.2996 and .2895 versus .3427), so controlling for CITYSIZE does not wash out the relationship.

3. How would you describe the relationship between the three variables, INCOME, CRIME, and CITYSIZE?
   Replication: the relationship between INCOME and CRIME holds within both large and small cities. Concluding that CITYSIZE causes both INCOME and CRIME would require the partial associations to fall close to 0.0 and to lose statistical significance.