Title: 18: Cross-Tabulated Counts
1 Chapter 18: Cross-Tabulated Counts, Part A
2 Chapter 18, Part A
- 18.1 Types of Samples
- 18.2 Naturalistic and Cohort Samples
- 18.3 Chi-Square Test of Association
3 Types of Samples
- I. Naturalistic Samples: simple random sample or complete enumeration of the population
- II. Purposive Cohorts: select fixed number of individuals in each exposure group
- III. Case-Control: select fixed number of diseased and non-diseased individuals
4 Naturalistic (Type I) Sample
Random sample of study base
- How did we study the relationship between CMV (the exposure) and restenosis (the disease) via a naturalistic sample?
- A population was identified and sampled.
- The sample was classified as CMV+ and CMV-.
- Disease occurrence (restenosis) was studied and compared in the groups.
6 Purposive Cohorts (Type II sample)
Fixed numbers in exposure groups
- How would we study CMV and restenosis with a purposive cohort design?
- A population of CMV+ individuals would be identified.
- From this population, select, say, 38 individuals.
- A population of CMV- individuals would be identified.
- From this population, select, say, 38 individuals.
- Disease occurrence (restenosis) would be studied and compared among the groups.
7 Case-Control (Type III sample)
Set number of cases and non-cases
- How would we study CMV and restenosis with a case-control design?
- A population of patients who experienced restenosis (cases) would be identified.
- From this population, select, say, 38 individuals.
- A population of patients who did not restenose (controls) would be identified.
- From this population, select, say, 38 individuals.
- The exposure (CMV) would be studied and compared among the groups.
9 Naturalistic Sample: Illustrative Example
Edu.     Smoke? +   Smoke? -   Tot
HS          12         38       50
JC          18         67       85
Some        27         95      122
UG          32        239      271
Grad         5         52       57
Total       94        491      585
- SRS, N = 585
- Cross-classify education level (categorical exposure) and smoking status (categorical disease)
- Tally the R-by-C table (cross-tab)
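10 Cross-tabulation (cont.)
Educ.    Smoke? +   Smoke? -   Tot
HS          12         38       50
JC          18         67       85
Some        27         95      122
UG          32        239      271
Grad         5         52       57
Total       94        491      585
- Row margins: the row totals (right-hand Tot column)
- Column margins: the column totals (bottom Total row)
- Total: N = 585 (lower right corner)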
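The margins can be recomputed from the cell counts as a quick check on the tally. The following is a minimal sketch using only the counts shown in the table; the variable names are illustrative.

```python
# Observed counts from the education-by-smoking example: (smokers, non-smokers).
counts = {
    "HS":   (12, 38),
    "JC":   (18, 67),
    "Some": (27, 95),
    "UG":   (32, 239),
    "Grad": (5, 52),
}

# Row margins: number of people at each education level.
row_margins = {level: sum(cells) for level, cells in counts.items()}

# Column margins: total smokers and total non-smokers.
column_margins = tuple(sum(column) for column in zip(*counts.values()))

# Grand total N: the sum of either set of margins.
n_total = sum(row_margins.values())

print(row_margins)
print(column_margins)
print(n_total)
```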
11 Cross-tabulation of counts
For uniformity, we will always put the exposure variable in rows and the disease variable in columns.
12 Exposure / Disease relationship
Use conditional proportions to describe
relationships between exposure and disease
13 Conditional Proportions: Exposure / Disease Relationship
In naturalistic and cohort samples → row percents!
R-by-2 Table
          +     -     Total
Grp 1     a1    b1    n1
Grp 2     a2    b2    n2
⋮         ⋮     ⋮     ⋮
Grp R     aR    bR    nR
Total     m1    m2    N
14 Example
Prevalence of smoking by education
Lower education associated with higher prevalence
(negative association between education and
smoking)
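As a sketch of how these row percents could be computed, the code below divides the number of smokers in each education group (a) by the group total (n), using the counts from the cross-tabulation. The names are illustrative, not from the slides.

```python
# (smokers, group total) for each education level, from the cross-tabulation.
groups = {
    "HS":   (12, 50),
    "JC":   (18, 85),
    "Some": (27, 122),
    "UG":   (32, 271),
    "Grad": (5, 57),
}

# Row proportion = smokers / group total (prevalence of smoking in the group).
prevalence = {level: a / n for level, (a, n) in groups.items()}

for level, p in prevalence.items():
    print(f"{level:5s} {100 * p:5.1f}%")
```

The high school group has the highest prevalence and the graduate group the lowest, which is the negative association described above.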
15 Relative Risks
Let group 1 represent the least exposed group
16 Illustration: RRs
Note trend
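A minimal sketch of the relative risk calculation follows: each group's proportion is divided by the proportion in group 1. The slides do not show which row served as group 1; this sketch assumes the first row (HS) is the reference, which is an assumption made only for illustration.

```python
# (smokers, group total) for each education level, from the cross-tabulation.
groups = {
    "HS":   (12, 50),
    "JC":   (18, 85),
    "Some": (27, 122),
    "UG":   (32, 271),
    "Grad": (5, 57),
}

# ASSUMPTION: the first row (HS) is treated as the reference group 1.
reference = "HS"

prevalence = {level: a / n for level, (a, n) in groups.items()}

# Relative risk of group i = proportion in group i / proportion in group 1.
relative_risk = {
    level: p / prevalence[reference] for level, p in prevalence.items()
}

for level, rr in relative_risk.items():
    print(f"{level:5s} RR = {rr:.2f}")
```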
17 k Levels of Disease
Efficacy of Echinacea example
- Randomized controlled clinical trial: echinacea vs. placebo in treatment of URI
- Exposure: echinacea vs. placebo
- Disease: severity of illness
- Source: JAMA 2003, 290(21), 2824-30
18 Row Percents for Echinacea Example
Echinacea group fared slightly worse than placebo
group
19 Chi-Square Test of Association
- A. H0: no association in population; Ha: association in population
- B. Test statistic
20 Observed
Degree   Smoke +   Smoke -   Tot
HS          12        38      50
JC          18        67      85
Some        27        95     122
UG          32       239     271
Grad         5        52      57
Total       94       491     585
21 Expected
Degree   Smoke +                     Smoke -                       Total
HS       (50 × 94) / 585 = 8.034     (50 × 491) / 585 = 41.966       50
JC       13.658                      71.342                          85
Some     19.603                      102.397                        122
UG       43.545                      227.455                        271
Grad     9.159                       47.841                          57
Total    94                          491                            585
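The expected counts follow the pattern shown for the HS row: (row total × column total) / N. A minimal sketch that reproduces the whole table from the observed counts, with illustrative names:

```python
# Observed counts: (smokers, non-smokers) per education level.
observed = {
    "HS":   (12, 38),
    "JC":   (18, 67),
    "Some": (27, 95),
    "UG":   (32, 239),
    "Grad": (5, 52),
}

row_totals = {level: sum(cells) for level, cells in observed.items()}
col_totals = [sum(column) for column in zip(*observed.values())]
n_total = sum(row_totals.values())

# Expected count under H0 = (row total * column total) / N.
expected = {
    level: tuple(row_totals[level] * c / n_total for c in col_totals)
    for level in observed
}

for level, (e_pos, e_neg) in expected.items():
    print(f"{level:5s} {e_pos:8.3f} {e_neg:8.3f}")
```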
22 Continuity Corrected Chi-Square
- Pearson's (uncorrected) chi-square
- Yates' continuity-corrected chi-square
23 Chi-Square Hand Calc.
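The hand calculation can be sketched in code. This block uses the standard textbook forms, which are assumed here because the formulas themselves are not reproduced in this text: Pearson's statistic sums (O − E)² / E over all cells, Yates' version sums (|O − E| − 0.5)² / E, and df = (R − 1)(C − 1).

```python
# Observed counts: (smokers, non-smokers) per education level.
observed = [
    (12, 38),    # HS
    (18, 67),    # JC
    (27, 95),    # Some
    (32, 239),   # UG
    (5, 52),     # Grad
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(column) for column in zip(*observed)]
n_total = sum(row_totals)

pearson = 0.0
yates = 0.0
for row, r_total in zip(observed, row_totals):
    for o, c_total in zip(row, col_totals):
        e = r_total * c_total / n_total          # expected count under H0
        pearson += (o - e) ** 2 / e              # uncorrected contribution
        yates += (abs(o - e) - 0.5) ** 2 / e     # continuity-corrected contribution

# Degrees of freedom = (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(col_totals) - 1)

print(f"Pearson X2 = {pearson:.2f}, df = {df}")
print(f"Yates   X2 = {yates:.2f}")
```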
24 Chi-Square → P-value
- X²stat = 13.20 with 4 df
- Table E → 4 df row → bracket chi-square statistic → look up right tail (P-value) regions
- Example: bracket X²stat between 11.14 (P = .025) and 13.28 (P = .01) → .01 < P < .025
Right tail   0.975   0.25   0.20   0.15   0.10   0.05   0.025   0.01    0.005
df = 4       0.48    5.39   5.99   6.74   7.78   9.49   11.14   13.28   14.86
25 Illustration: X²stat = 13.20 with 4 df
The P-value = AUC in the tail beyond X²stat
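The tail area can also be computed directly rather than bracketed from a table. The sketch below relies on a closed form for the chi-square right tail that holds only when df is even; the function name is hypothetical, and a statistics library would normally be used for general df.

```python
import math


def chi2_right_tail_even_df(x, df):
    """Right-tail area of a chi-square distribution for EVEN df only.

    Uses the closed form exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0    # j = 0 term of the sum
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j    # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total


p_value = chi2_right_tail_even_df(13.20, 4)
print(f"P = {p_value:.4f}")
```

The result falls between .01 and .025, in agreement with the bracketing from the table.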
26 WinPEPI > Compare2 > F1
[Screenshot: input screen (row 5 not visible)]
[Screenshot: output]
27 Chi-Square, cont.
- How the chi-square works. When observed values = expected values, the chi-square statistic is 0. As the differences between observed and expected values get large → evidence against H0 mounts.
- Avoid chi-square tests in small samples. Do not use a chi-square test when more than 20% of the cells have expected values that are less than 5.
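The small-sample rule can be checked mechanically before running the test. A minimal sketch, applied to the education-by-smoking counts from earlier in the chapter; the function name and the `threshold` and `max_share` parameters are illustrative.

```python
def share_of_small_expected(observed, threshold=5.0):
    """Return the fraction of cells whose expected count is below threshold."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(column) for column in zip(*observed)]
    n_total = sum(row_totals)
    expected = [r * c / n_total for r in row_totals for c in col_totals]
    small = sum(1 for e in expected if e < threshold)
    return small / len(expected)


# Observed counts: (smokers, non-smokers) for HS, JC, Some, UG, Grad.
observed = [(12, 38), (18, 67), (27, 95), (32, 239), (5, 52)]

share = share_of_small_expected(observed)
max_share = 0.20  # rule of thumb: no more than 20% of cells below 5
chi_square_ok = share <= max_share

print(f"{100 * share:.0f}% of cells have expected counts below 5")
print("chi-square test acceptable:", chi_square_ok)
```

Here the smallest expected count is about 8.03, so no cell falls below 5 and the rule is satisfied.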
28 Chi-Square, cont.
- Supplement chi-squares with descriptive statistics. Chi-square statistics do not quantify effects.
- For 2-by-2 tables, chi-square and z tests produce identical P-values.
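The 2-by-2 equivalence can be illustrated numerically. The sketch below compares the uncorrected Pearson chi-square (1 df) with the pooled two-proportion z test (two-sided). The counts are hypothetical, made up only for this illustration, and the equivalence shown is for the uncorrected statistics.

```python
import math

# HYPOTHETICAL 2-by-2 counts (not from the chapter): (diseased, not diseased).
a1, b1 = 30, 70   # exposed group
a2, b2 = 15, 85   # non-exposed group
n1, n2 = a1 + b1, a2 + b2
n_total = n1 + n2

# Pooled two-proportion z statistic.
p1, p2 = a1 / n1, a2 / n2
p_pooled = (a1 + a2) / n_total
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_z = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal tail area

# Uncorrected Pearson chi-square on the same table.
col_totals = (a1 + a2, b1 + b2)
x2 = 0.0
for row, r_total in (((a1, b1), n1), ((a2, b2), n2)):
    for o, c_total in zip(row, col_totals):
        e = r_total * c_total / n_total
        x2 += (o - e) ** 2 / e
p_x2 = math.erfc(math.sqrt(x2 / 2))         # right tail of chi-square, 1 df

print(f"z^2 = {z ** 2:.4f}, X2 = {x2:.4f}")
print(f"P (z test) = {p_z:.4f}, P (chi-square) = {p_x2:.4f}")
```

Because z² equals the chi-square statistic, the two-sided z test and the 1 df chi-square test give the same P-value.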
29 Discussion and demo on power and sample size
- For estimation
- For testing
- Power
- Sample size