Title: Review of Coding Schemes for Categorical Data
1. G89.2229 Multiple Regression Week 9 (Wednesday)
- Review of Coding Schemes for Categorical Data
- Example revisited
- Inclusion of Covariates
- Example extended
- Adjusting in Regression
2. Coding Schemes for k categories
- Dummy variables
  - Ideal when paired contrasts of interest
  - Must choose a reference group
- Unweighted effect codes
  - ANOVA approach
  - Means of categories are compared without taking into account possibly different ns.
  - Each of (k-1) categories compared to mean of means
- Weighted effect codes
  - Like UEC but compares category means to weighted grand mean
- Special Contrasts
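The three schemes differ only in the codes given to the reference group. A minimal Python sketch of that idea follows; the function name `coding_table` and its arguments are illustrative assumptions, and the group sizes are read off the ratios used in the weighted effect codes of the example slide (numerator taken as the group n, denominator as the reference group n).

```python
from fractions import Fraction

def coding_table(levels, reference, scheme="dummy", counts=None):
    """Return {level: codes on the k-1 non-reference columns}.

    scheme is "dummy", "effect" (unweighted) or "weighted" (needs counts).
    This helper is a hypothetical illustration, not part of the course material.
    """
    others = [lv for lv in levels if lv != reference]
    table = {}
    for lv in levels:
        if lv != reference:
            # Non-reference groups: 1 in their own column, 0 elsewhere (all schemes).
            table[lv] = [Fraction(int(lv == col)) for col in others]
        elif scheme == "dummy":
            table[lv] = [Fraction(0) for _ in others]
        elif scheme == "effect":
            table[lv] = [Fraction(-1) for _ in others]
        elif scheme == "weighted":
            # Reference group gets -n_j / n_ref in column j.
            table[lv] = [Fraction(-counts[col], counts[reference]) for col in others]
        else:
            raise ValueError(f"unknown scheme: {scheme}")
    return table

levels = [18, 16, 14, 12]
counts = {18: 222, 16: 380, 14: 356, 12: 356}  # assumed from the slide's ratios

dummy = coding_table(levels, reference=12, scheme="dummy")
effect = coding_table(levels, reference=12, scheme="effect")
weighted = coding_table(levels, reference=12, scheme="weighted", counts=counts)

for lv in levels:
    print(lv, dummy[lv], effect[lv], weighted[lv])
```

Exact fractions are used so the weighted codes stay as ratios of group sizes rather than rounded decimals.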
3. Example Revisited: Depression in PR Youth
Compute dummy codes with 12 year group as reference.
COMPUTE AGE18=0.
COMPUTE AGE16=0.
COMPUTE AGE14=0.
IF AGE EQ 18 AGE18=1.
IF AGE EQ 16 AGE16=1.
IF AGE EQ 14 AGE14=1.
Computing unweighted effect codes.
COMPUTE AGE18E=0.
COMPUTE AGE16E=0.
COMPUTE AGE14E=0.
IF AGE EQ 18 AGE18E=1.
IF AGE EQ 16 AGE16E=1.
IF AGE EQ 14 AGE14E=1.
IF AGE EQ 12 AGE18E=-1.
IF AGE EQ 12 AGE16E=-1.
IF AGE EQ 12 AGE14E=-1.
FREQUENCIES VARIABLES=age /ORDER= ANALYSIS .
Compute weighted effect codes with 12 year group as reference.
COMPUTE AGE18EW=0.
COMPUTE AGE16EW=0.
COMPUTE AGE14EW=0.
IF AGE EQ 18 AGE18EW=1.
IF AGE EQ 16 AGE16EW=1.
IF AGE EQ 14 AGE14EW=1.
IF AGE EQ 12 AGE18EW=-222/356.
IF AGE EQ 12 AGE16EW=-380/356.
IF AGE EQ 12 AGE14EW=-356/356.
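To see what each coding scheme does to the regression intercept, the sketch below solves for the coefficients that reproduce a set of group means exactly (with only the k-1 codes in the model, the fitted values are the group means). The group means are HYPOTHETICAL numbers chosen for illustration, not results from the depression data, and the helper `solve` is an assumed name.

```python
from fractions import Fraction as F

def solve(A, y):
    """Solve A b = y by Gauss-Jordan elimination, exactly, using Fractions."""
    n = len(A)
    M = [list(map(F, row)) + [F(v)] for row, v in zip(A, y)]
    for c in range(n):
        # Find a row with a non-zero pivot and move it into place.
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c:
                M[r] = [a - M[r][c] * b for a, b in zip(M[r], M[c])]
    return [row[-1] for row in M]

# Rows: age 18, 16, 14, 12 (reference). Columns: intercept, AGE18, AGE16, AGE14.
top = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1]]
designs = {
    "dummy": top + [[1, 0, 0, 0]],
    "unweighted effect": top + [[1, -1, -1, -1]],
    "weighted effect": top + [[1, F(-222, 356), F(-380, 356), F(-356, 356)]],
}
means = [11, 15, 12, 10]  # HYPOTHETICAL group means for ages 18, 16, 14, 12

coefs = {name: solve(X, means) for name, X in designs.items()}
for name, b in coefs.items():
    print(name, "intercept =", b[0], "slopes =", b[1:])
```

Under these assumptions the intercept is the reference group mean with dummy codes, the unweighted mean of the four group means with unweighted effect codes, and the n-weighted grand mean with weighted effect codes.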
4. Inclusion of Covariates
- In MR we often add new variables into prediction/structural model
- In ANOVA quantitative variables added to model are called covariates
- In experiments covariates can increase precision
- In nonexperimental research, covariates are often used to adjust for selection effects
- Example: Adjust age groups for level of adaptive functioning
5. Computing adjusted category means
- When no covariate is included, the expected means (from the model) are identical to the sample means
- When a covariate is included, the expected means vary according to the covariate value
- Relative group mean differences are not affected (unless there is interaction)
- Adjusted means often reported for covariate set at its own mean
- Adjusted means are particularly easy to compute for centered covariates.
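A small sketch of the adjusted-mean calculation, assuming a common within-group slope (no interaction). Each group's mean is moved along that slope to the grand mean of the covariate. The data are HYPOTHETICAL and the function name `adjusted_means` is an illustrative assumption.

```python
from fractions import Fraction

def adjusted_means(groups):
    """groups: {name: (covariate values, outcome values)}.

    Returns (pooled within-group slope, {name: mean adjusted to the grand
    covariate mean}). Assumes the same slope in every group.
    """
    def mean(v):
        return Fraction(sum(v), len(v))

    grand_x = mean([x for xs, _ in groups.values() for x in xs])
    sxy = sxx = Fraction(0)
    for xs, ys in groups.values():
        mx, my = mean(xs), mean(ys)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    adjusted = {
        g: mean(ys) - slope * (mean(xs) - grand_x)
        for g, (xs, ys) in groups.items()
    }
    return slope, adjusted

# HYPOTHETICAL data: group B scores higher on both covariate and outcome.
groups = {
    "A": ([1, 2, 3], [3, 5, 7]),
    "B": ([3, 4, 5], [6, 8, 10]),
}
slope, adjusted = adjusted_means(groups)
print("pooled slope:", slope)
print("adjusted means:", adjusted)
```

In this made-up example the raw means favor group B (8 vs. 5), but at the common covariate value the ordering reverses. If the covariate is centered at its grand mean, the same adjusted means can be read directly from the intercept and group coefficients.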
6. Adjusting for Selection Effects
- Often we wish to make causal inferences from group comparisons
  - Drug use vs. no drug use
  - Active vs. negligent parenting
  - Participation in training programs
- Without random assignment, group differences are difficult to interpret
- Covariates often used to adjust for alternative selection explanations
- This is a difficult area
7. Some References Relevant to Selection
- Rosenbaum, Paul R. (2002). Observational studies (Second Edition). New York: Springer.
- Campbell, D. T., & Kenny, D. A. (1999). A primer on regression artifacts. New York: Guilford.
- Lord, F. M. (1967). A paradox in the interpretation of group comparisons. Psychological Bulletin, 68, 304-305.
8. "Lord's Paradox" (Lord, 1967)
- Consider
  - Weight change in two dorms, Sept-May
  - Is there an effect of food service?
  - Adjust for September Wt
[Figure: plot with axes May Wt and September Wt]
9. The groups differ when Sept weight is a covariate
- The regressed change analysis focuses on May weight holding constant September weight
- Suppose we found that women were more likely to be in Dorm B and men in Dorm A
- When we compare a man and a woman who are the same weight in September, we expect the man to gain weight, and the woman to lose weight.
- Even though it is reliable and valid, September weight is not a perfect proxy for selection effects
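The paradox can be reproduced with a tiny constructed data set. In the sketch below the weights are HYPOTHETICAL (in kg) and are built so that neither dorm changes on average, while the within-dorm slope of May weight on September weight is below 1. The change-score analysis and the regressed-change analysis then give different answers.

```python
from fractions import Fraction

def mean(v):
    return Fraction(sum(v), len(v))

# HYPOTHETICAL weights: Dorm A is heavier on average than Dorm B.
dorms = {
    "A": {"sept": [70, 80, 90], "may": [75, 80, 85]},
    "B": {"sept": [50, 60, 70], "may": [55, 60, 65]},
}

# Change-score analysis: mean May weight minus mean September weight per dorm.
change = {d: mean(w["may"]) - mean(w["sept"]) for d, w in dorms.items()}

# Regressed-change analysis: pooled within-dorm slope of May on September,
# then the dorm difference in May weight at a fixed September weight.
sxy = sxx = Fraction(0)
for w in dorms.values():
    ms, mm = mean(w["sept"]), mean(w["may"])
    sxy += sum((s - ms) * (m - mm) for s, m in zip(w["sept"], w["may"]))
    sxx += sum((s - ms) ** 2 for s in w["sept"])
slope = sxy / sxx

raw_sept_gap = mean(dorms["A"]["sept"]) - mean(dorms["B"]["sept"])
raw_may_gap = mean(dorms["A"]["may"]) - mean(dorms["B"]["may"])
adjusted_gap = raw_may_gap - slope * raw_sept_gap

print("mean change per dorm:", change)
print("pooled slope:", slope)
print("dorm difference holding September weight constant:", adjusted_gap)
```

With these numbers the mean change is zero in both dorms, yet at the same September weight a Dorm A resident is expected to be heavier in May than a Dorm B resident. The gap comes from regression toward different group means, not necessarily from the food service.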
10. When Does Adjustment Work?
- When samples are likely to differ only because of random assignment fluctuations
- When the covariate is a direct measure of selection effects
  - AND when the covariate is measured reliably
  - AND when the covariate does not capture transient state effects
  - AND when there is no interaction between covariate and group differences
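Why reliability matters can be sketched with a short derivation. The setup and symbols here are assumptions introduced for illustration, not taken from the slides: the outcome Y depends only on a true selection variable T (no real group effect), the two groups differ in mean T by \delta, the observed covariate X equals T plus independent measurement error u, and \rho is the within-group reliability of X. The expressions are large-sample values.

```latex
\begin{aligned}
Y &= \alpha + \beta T + \varepsilon, \qquad X = T + u, \\
\rho &= \frac{\operatorname{Var}_w(T)}{\operatorname{Var}_w(T) + \operatorname{Var}(u)}, \\
b_X &= \rho\,\beta, \\
b_G &= (\bar{Y}_1 - \bar{Y}_0) - b_X\,(\bar{X}_1 - \bar{X}_0)
     = \beta\delta - \rho\,\beta\delta
     = (1 - \rho)\,\beta\,\delta .
\end{aligned}
```

Under these assumptions the adjusted group difference b_G is zero only when \rho = 1. With an unreliable covariate, part of the selection difference remains and can be mistaken for a group effect.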