Title: Data analysis: cross-tabulation
1. Data analysis: cross-tabulation
GAP Toolkit 5
Training in basic drug abuse data
management and analysis
Training session 11
2. Objectives
- To introduce cross-tabulation as a method of investigating the relationship between two categorical variables
- To describe the SPSS facilities for cross-tabulation
- To discuss a range of simple statistics to describe the relationship between two categorical variables
- To reinforce the range of SPSS skills learnt to date
3. Bivariate analysis
- The relationship between two variables
- A two-way table
- Rows: categories of one variable
- Columns: categories of the second variable
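As a minimal sketch of how a two-way table is assembled, the following Python tallies paired observations into rows and columns. The records are hypothetical illustration data, not values from the training data set, and the names `records`, `joint` and `table` are assumptions made for this example.

```python
from collections import Counter

# Hypothetical records: one (gender, mode of ingestion) pair per client.
records = [
    ("Male", "Swallow"), ("Male", "Smoke"), ("Female", "Swallow"),
    ("Male", "Smoke"), ("Female", "Snort"), ("Male", "Swallow"),
]

# Joint frequencies: count each (row category, column category) combination.
joint = Counter((mode, gender) for gender, mode in records)

rows = sorted({mode for _, mode in records})      # categories of one variable
cols = sorted({gender for gender, _ in records})  # categories of the second variable

# Two-way table as a nested dict: table[row][column] -> count.
table = {r: {c: joint[(r, c)] for c in cols} for r in rows}

print("Mode".ljust(10) + "".join(c.rjust(8) for c in cols))
for r in rows:
    print(r.ljust(10) + "".join(str(table[r][c]).rjust(8) for c in cols))
```

Each cell holds the number of cases that fall into that combination of categories; combinations that never occur are reported as zero.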
4. Gender
                  Frequency  Percent  Valid Percent  Cumulative Percent
Valid    Male          1251     79.6           79.9                79.9
         Female         314     20.0           20.1               100.0
         Total         1565     99.6          100.0
Missing  System           6       .4
Total                  1571    100.0
5. Mode of ingestion Drug 1
                  Frequency  Percent  Valid Percent  Cumulative Percent
Valid    Swallow        794     50.5           51.0                51.0
         Smoke          634     40.4           40.7                91.7
         Snort           62      3.9            4.0                95.6
         Inject          30      1.9            1.9                97.6
         12.00            2       .1             .1                97.7
         15.00            1       .1             .1                97.8
         23.00           10       .6             .6                98.4
         24.00           11       .7             .7                99.1
         25.00            5       .3             .3                99.4
         34.00            4       .3             .3                99.7
         234.00           5       .3             .3               100.0
         Total         1558     99.2          100.0
Missing  System          13       .8
Total                  1571    100.0
6. Cleaning Mode1
- Save a copy of the original
- Recode the out-of-range values into a new value (for example, 12, 15, 23, 24, 25, 34, 234 into the value 8)
- Set the new value as a user-defined missing value (for example, 8 is declared a missing value and given the label Out-of-range).
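The cleaning steps can be sketched outside SPSS as follows. This is an illustration only: the raw values are hypothetical, and the assumption that the valid codes are 1 to 4 (Swallow, Smoke, Snort, Inject) is made for this example and is not confirmed by the slides.

```python
# Assumed valid codes (hypothetical mapping): 1 Swallow, 2 Smoke, 3 Snort, 4 Inject.
VALID_CODES = {1, 2, 3, 4}
OUT_OF_RANGE = 8  # new value that will be treated as user-defined missing

# Hypothetical raw values; None stands for a system-missing value.
mode1 = [1, 2, 12, 4, None, 234, 3, 23]

# Step 1: save a copy of the original before changing anything.
original = list(mode1)

def recode(value):
    """Step 2: map any out-of-range code to OUT_OF_RANGE; keep valid codes and None."""
    if value is None:
        return None
    return value if value in VALID_CODES else OUT_OF_RANGE

mode1_clean = [recode(v) for v in mode1]

# Step 3: treat OUT_OF_RANGE as missing, so it is excluded from valid cases.
valid = [v for v in mode1_clean if v is not None and v != OUT_OF_RANGE]

print(mode1_clean)  # recoded values
print(len(valid), "valid cases out of", len(mode1_clean))
```

Keeping the out-of-range cases under their own code, rather than deleting them, means they can still be counted and reported separately from system-missing values.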
7. Mode of ingestion Drug 1
                       Frequency  Percent  Valid Percent  Cumulative Percent
Valid    Swallow             794     50.5           52.2                52.2
         Smoke               634     40.4           41.7                93.9
         Snort                62      3.9            4.1                98.0
         Inject               30      1.9            2.0               100.0
         Total              1520     96.8          100.0
Missing  Out-of-range         38      2.4
         System               13       .8
         Total                51      3.2
Total                       1571    100.0
9. Mode of ingestion Drug1 by Gender (counts)
                                 Gender
Mode of ingestion Drug1      Male  Female   Total
Swallow                       600     194     794
Smoke                         553      77     630
Snort                          44      17      61
Inject                         20      10      30
Total                        1217     298    1515
10. Percentages
- The difference in sample size for men and women makes comparison of raw numbers difficult
- Percentages facilitate comparison by standardizing the scale
- There are three options for the denominator of the percentage:
  - Grand total
  - Row total
  - Column total
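The three denominators can be illustrated with a short Python sketch that uses the cell counts from the Mode of ingestion Drug1 by Gender table. The variable names are assumptions made for this example; the marginal totals are computed from the cells rather than typed in.

```python
# Cell counts: rows are mode of ingestion, columns are gender.
counts = {
    "Swallow": {"Male": 600, "Female": 194},
    "Smoke":   {"Male": 553, "Female": 77},
    "Snort":   {"Male": 44,  "Female": 17},
    "Inject":  {"Male": 20,  "Female": 10},
}

# Marginal frequencies derived from the cells.
row_totals = {row: sum(cells.values()) for row, cells in counts.items()}
col_totals = {}
for cells in counts.values():
    for col, n in cells.items():
        col_totals[col] = col_totals.get(col, 0) + n
grand_total = sum(row_totals.values())

def pct(n, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * n / denominator, 1)

# Option 1: grand total as denominator (all cells sum to 100).
total_pct = {r: {c: pct(n, grand_total) for c, n in cells.items()}
             for r, cells in counts.items()}
# Option 2: row total as denominator (each row sums to 100).
row_pct = {r: {c: pct(n, row_totals[r]) for c, n in cells.items()}
           for r, cells in counts.items()}
# Option 3: column total as denominator (each column sums to 100).
col_pct = {r: {c: pct(n, col_totals[c]) for c, n in cells.items()}
           for r, cells in counts.items()}

for row in counts:
    print(row, total_pct[row], row_pct[row], col_pct[row])
```

For the Swallow/Male cell, for instance, the same count of 600 yields three different percentages depending on whether it is divided by the grand total, the Swallow row total, or the Male column total.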
11. Mode of ingestion Drug1 * Gender cross-tabulation
                                       Gender
Mode of ingestion Drug1            Male  Female   Total
Swallow   Count                     600     194     794
          % of Total               39.6    12.8    52.4
Smoke     Count                     553      77     630
          % of Total               36.5     5.1    41.6
Snort     Count                      44      17      61
          % of Total                2.9     1.1     4.0
Inject    Count                      20      10      30
          % of Total                1.3      .7     2.0
Total     Count                    1217     298    1515
          % of Total               80.3    19.7   100.0
12. Mode of ingestion Drug1 * Gender cross-tabulation
                                                          Gender
Mode of ingestion Drug1                               Male  Female   Total
Swallow   Count                                        600     194     794
          % within Mode of ingestion Drug1            75.6    24.4   100.0
Smoke     Count                                        553      77     630
          % within Mode of ingestion Drug1            87.8    12.2   100.0
Snort     Count                                         44      17      61
          % within Mode of ingestion Drug1            72.1    27.9   100.0
Inject    Count                                         20      10      30
          % within Mode of ingestion Drug1            66.7    33.3   100.0
Total     Count                                       1217     298    1515
          % within Mode of ingestion Drug1            80.3    19.7   100.0
13. Mode of ingestion Drug1 * Gender cross-tabulation
                                       Gender
Mode of ingestion Drug1            Male  Female   Total
Swallow   Count                     600     194     794
          % within Gender          49.3    65.1    52.4
Smoke     Count                     553      77     630
          % within Gender          45.4    25.8    41.6
Snort     Count                      44      17      61
          % within Gender           3.6     5.7     4.0
Inject    Count                      20      10      30
          % within Gender           1.6     3.4     2.0
Total     Count                    1217     298    1515
          % within Gender         100.0   100.0   100.0
14. Choosing percentages
- Construct the proportions so that they sum to one within the categories of the explanatory variable.
- Source: C. Marsh, Exploring Data: An Introduction to Data Analysis for Social Scientists (Cambridge, Polity Press, 1988), p. 143.
19. Dimensions
Definitions of vertical and horizontal variables
20. Two-by-two tables
- Tables with two rows and two columns
- A range of simple descriptive statistics can be applied to two-by-two tables
- It is possible to collapse larger tables to these dimensions
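Collapsing a larger table can be sketched as follows, using the Mode of ingestion Drug1 by Gender counts and merging every row other than one target category. The choice of "Inject" as the target and the helper name `collapse` are assumptions made for illustration.

```python
# Four-by-two table of counts: mode of ingestion by gender.
counts = {
    "Swallow": {"Male": 600, "Female": 194},
    "Smoke":   {"Male": 553, "Female": 77},
    "Snort":   {"Male": 44,  "Female": 17},
    "Inject":  {"Male": 20,  "Female": 10},
}

def collapse(table, target):
    """Merge every row except `target` into a single 'Not <target>' row."""
    other = "Not " + target
    collapsed = {target: {}, other: {}}
    for row, cells in table.items():
        key = target if row == target else other
        for col, n in cells.items():
            collapsed[key][col] = collapsed[key].get(col, 0) + n
    return collapsed

two_by_two = collapse(counts, "Inject")
for row, cells in two_by_two.items():
    print(row, cells)
```

The collapsed table keeps the same grand total as the original, so no cases are lost; only the detail among the merged categories is given up.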
21. Gender * White pipe cross-tabulation
                                White pipe
Gender                         Yes      No   Total
Male     Count                 290     961    1251
         % within Gender      23.2    76.8   100.0
Female   Count                  22     292     314
         % within Gender       7.0    93.0   100.0
Total    Count                 312    1253    1565
         % within Gender      19.9    80.1   100.0
22. Gender by White pipe (proportions within Gender)
                White pipe
Gender         Yes      No
Male        0.2318  0.7682
Female      0.0701  0.9299
23. Relative risk
- Divide the probabilities of success
- For example:
  P(White pipe = Yes | Gender = Male) = 0.2318
  P(White pipe = Yes | Gender = Female) = 0.0701
  Relative risk is 0.2318/0.0701 = 3.309 (calculated from the unrounded proportions)
- The proportion of males using white pipe was over three times that of females
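The relative risk calculation can be checked directly from the counts in the Gender by White pipe cross-tabulation. This is a minimal sketch; the variable names are assumptions made for the example.

```python
# Counts from the Gender by White pipe cross-tabulation.
male_yes, male_total = 290, 1251
female_yes, female_total = 22, 314

# Probability of "success" (White pipe = Yes) within each gender.
p_male = male_yes / male_total
p_female = female_yes / female_total

# Relative risk: ratio of the two probabilities of success.
relative_risk = p_male / p_female

print(f"P(Yes | Male)   = {p_male:.4f}")
print(f"P(Yes | Female) = {p_female:.4f}")
print(f"Relative risk   = {relative_risk:.3f}")
```

Working from the counts avoids the small discrepancy that arises when the already rounded proportions are divided by hand.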
24. Odds
- The odds of success are the ratio of the probability of success to the probability of failure
- For example:
  - For males the odds of success are 0.2318/0.7682 = 0.302
  - For females the odds of success are 0.0701/0.9299 = 0.075
25. Odds ratio
- Divide the odds of success for males by the odds of success for females
- For example: 0.302/0.075 = 4.005 (calculated from the unrounded odds)
- The odds of taking white pipe as a male are four times those for a female
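The odds and the odds ratio can likewise be computed from the cell counts of the Gender by White pipe table. The sketch below is illustrative only, and its variable names are assumptions.

```python
# Counts from the Gender by White pipe cross-tabulation.
male_yes, male_no = 290, 961
female_yes, female_no = 22, 292

# Odds of success: successes divided by failures (equivalently p / (1 - p)).
odds_male = male_yes / male_no
odds_female = female_yes / female_no

# Odds ratio: odds for males divided by odds for females.
odds_ratio = odds_male / odds_female

# The same value follows from the cross-product of the two-by-two table.
cross_product = (male_yes * female_no) / (male_no * female_yes)

print(f"Odds (male)   = {odds_male:.3f}")
print(f"Odds (female) = {odds_female:.3f}")
print(f"Odds ratio    = {odds_ratio:.3f}")
```

Note that the odds ratio (about 4.0) is larger than the relative risk (about 3.3) for the same table; the two measures agree closely only when the outcome is rare in both groups.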
27. Risk estimate
                                                     95% Confidence interval
                                           Value     Lower     Upper
Odds ratio for Gender (Male / Female)      4.005     2.547     6.299
For cohort white pipe = Yes                3.309     2.184     5.012
For cohort white pipe = No                  .826      .791      .862
N of valid cases                            1565
Notes: the first row is the odds ratio (M/F); the "Yes" cohort row is the relative risk of success; the "No" cohort row is the relative risk of failure.
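One common way to obtain intervals of this kind is a large-sample (Wald) interval on the log scale. The sketch below applies that method to the Gender by White pipe counts; that SPSS uses exactly this method is an assumption here, and the helper name `log_wald_ci` and the multiplier z = 1.96 are choices made for the example.

```python
import math

def log_wald_ci(estimate, se_log, z=1.96):
    """Interval built on the log scale, then transformed back (assumed method)."""
    centre = math.log(estimate)
    return math.exp(centre - z * se_log), math.exp(centre + z * se_log)

# Cell counts: a, b = male yes/no; c, d = female yes/no.
a, b = 290, 961
c, d = 22, 292
n_male, n_female = a + b, c + d

# Odds ratio and the standard error of its logarithm.
odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
or_ci = log_wald_ci(odds_ratio, se_log_or)

# Relative risk of success (White pipe = Yes).
rr_yes = (a / n_male) / (c / n_female)
se_log_rr_yes = math.sqrt(1 / a - 1 / n_male + 1 / c - 1 / n_female)
rr_yes_ci = log_wald_ci(rr_yes, se_log_rr_yes)

# Relative risk of failure (White pipe = No).
rr_no = (b / n_male) / (d / n_female)
se_log_rr_no = math.sqrt(1 / b - 1 / n_male + 1 / d - 1 / n_female)
rr_no_ci = log_wald_ci(rr_no, se_log_rr_no)

for label, est, ci in [("Odds ratio (M/F)", odds_ratio, or_ci),
                       ("Relative risk, Yes", rr_yes, rr_yes_ci),
                       ("Relative risk, No", rr_no, rr_no_ci)]:
    print(f"{label}: {est:.3f} ({ci[0]:.3f}, {ci[1]:.3f})")
```

None of the three intervals contains 1, which is consistent with a difference between males and females in white pipe use.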
28. Exercise 1: cross-tabulations
- Create and comment on the following cross-tabulations:
  - Age vs Gender
  - Race vs Gender
  - Education vs Gender
  - Primary drugs vs Mode of ingestion
- Suggest other cross-tabulations that would be useful
29. Exercise 2: cross-tabulation
- Construct a dichotomous variable for age: "Up to 24 years" and "Above 24 years"
- Construct a dichotomous variable for the primary drug of use: "Alcohol" and "Not Alcohol"
- Create a cross-tabulation of the two new variables and interpret
- Generate relative risks and odds ratios and interpret
30. Summary
- Cross-tabulations
- Joint frequencies
- Marginal frequencies
- Row/Column/Total percentages
- Relative risk
- Odds
- Odds ratios