Workshop Moderated Regression Analysis

1
Workshop Moderated Regression Analysis

EASP summer school 2008, Cardiff
Wilhelm Hofmann

2
Overview of the workshop

Introduction to moderator effects
Case 1: continuous × continuous variable
Case 2: continuous × categorical variable
Higher-order interactions
Statistical power
Outlook 1: dichotomous DVs
Outlook 2: moderated mediation analysis

3
Main resources

The primer: Aiken & West (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
Cohen, Aiken, & West (2004). Regression analysis for the behavioral sciences, Chapters 7 and 9.
West, Aiken, & Krull (1996). Experimental personality designs: Analyzing categorical by continuous variable interactions. Journal of Personality, 64, 1-48.
Whisman & McClelland (2005). Designing, testing, and interpreting interactions and moderator effects in family research. Journal of Family Psychology, 19, 111-120.
This presentation, the dataset, the syntax files, and the Excel sheets are available at the Summer School webpage!

4
What is a moderator effect?

The effect of a predictor variable (X) on a criterion
(Y) depends on a third variable (M), the
moderator
Synonymous term: interaction effect

5
Examples from social psychology

Social facilitation: Effect of the presence of others
on performance depends on the dominance of
responses (Zajonc, 1965)
Effects of stress on health depend on social
support (Cohen & Wills, 1985)
Effect of provocation on aggression depends on
trait aggressiveness (Marshall & Brown, 2006)

6
Simple regression analysis
[Path diagram: X → Y]
7
Simple regression analysis
[Plot: regression line of Y on X with intercept b0 and slope b1]
8
Multiple regression with additive predictor
effects
[Path diagram: X → Y and M → Y]
9
Multiple regression with additive predictor
effects
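
The intercept of the regression of Y on X depends
upon the specific value of M
The slope of the regression of Y on X (b1) stays constant

[Plot: parallel regression lines of Y on X at different values of M (e.g., high M, medium M); annotations: intercept, b2]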
10
Multiple regression including interaction among
predictors
[Path diagram: X, M, and X × M → Y]
11
Multiple regression including interaction among
predictors

The slope and the intercept of the regression of Y on X
depend upon the specific value of M
Hence, there is a different line for every
individual value of M (simple regression line)

[Plot: regression lines of Y on X for high, medium, and low M, differing in both intercept and slope]
12
Regression model with interaction: quick facts

The interaction is carried by the XM term, the
product of X and M
The b3 coefficient reflects the interaction
between X and M only if the lower-order terms b1X
and b2M are included in the equation!
Leaving out these terms confounds the additive
and multiplicative effects, producing misleading
results
Each individual has a score on X and M. To form
the XM term, multiply together the individual's
scores on X and M.
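
The equation images of the slides are not part of this transcript. As a sketch in the standard notation (an assumption, following the b0 to b3 labels used throughout the workshop), the model with interaction can be written and regrouped so that both the intercept and the slope of X become functions of M:

```latex
\begin{align*}
Y &= b_0 + b_1 X + b_2 M + b_3 XM \\
  &= \underbrace{(b_0 + b_2 M)}_{\text{intercept}}
   + \underbrace{(b_1 + b_3 M)}_{\text{slope}}\, X
\end{align*}
```

With b3 = 0 the slope term reduces to b1 for every value of M, which is the additive model with parallel lines.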

13
Regression model with interaction

There are two equivalent ways to evaluate whether
an interaction is present:
Test whether the increment in the squared
multiple correlation (ΔR²) given by the
interaction is significantly greater than zero
Test whether the coefficient b3 differs
significantly from zero
Interactions work with both continuous and
categorical predictor variables. In the latter
case, we have to agree on a coding scheme (dummy
vs. effects coding)
Workshop Case I: continuous × continuous variable
interaction
Workshop Case II: continuous × categorical variable
interaction

14
Case 1: both predictors (and the criterion) are
continuous

X: height
M: age
Y: life satisfaction
Does the effect of height on life satisfaction
depend on age?

[Path diagram: height, age, and height × age → life satisfaction]
15
The Data (available at the summer school
homepage)
16
Descriptives
17
Advance organizer for Case 1

I) Why median splits are not an option
II) Estimating, plotting, and interpreting the
interaction
Unstandardized solution
Standardized solution
III) Inclusion of control variables
IV) Computation of effect size for interaction
term

18
I) Why we all despise median splits: The costs
of dichotomization
(For more details, see Cohen, 1983; Maxwell &
Delaney, 1993; West, Aiken, & Krull, 1996)

So why not simply split both X and M into two
groups each and conduct an ordinary ANOVA to test
for the interaction?
Disadvantage 1: Median splits are highly sample
dependent
Disadvantage 2: Drastically reduced power to
detect (interaction) effects by willfully
throwing away useful information
Disadvantage 3: In moderated regression, median
splits can strongly bias results

19
II) Estimating the unstandardized solution

Unstandardized = original metrics of variables
are preserved
Recipe:
Center both X and M around the respective sample
means
Compute the cross-product of cX and cM
Regress Y on cX, cM, and cX×cM
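
The recipe can be sketched outside SPSS as well. The following is a minimal illustration in plain Python, not the workshop's own procedure: the helper names (`solve`, `ols`, `moderated_regression`) and the toy data are assumptions made for this example, and the toy data are not the workshop dataset.

```python
from statistics import mean

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(predictors, y):
    """Least-squares weights [b0, b1, ...] obtained from the normal equations."""
    rows = [[1.0, *vals] for vals in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def moderated_regression(x, m, y):
    """Center X and M, form the cross-product, regress Y on all three terms."""
    cx = [v - mean(x) for v in x]
    cm = [v - mean(m) for v in m]
    cxcm = [a * b for a, b in zip(cx, cm)]
    return ols([cx, cm, cxcm], y)

# Made-up toy data (NOT the workshop dataset), generated without noise from
# known weights so that the recovered coefficients can be checked.
height = [160, 165, 170, 175, 180, 185]
age = [20, 35, 25, 40, 30, 30]
lifesat = [5 + 0.03 * (h - 172.5) + 0.02 * (a - 30) - 0.01 * (h - 172.5) * (a - 30)
           for h, a in zip(height, age)]

b0, b1, b2, b3 = moderated_regression(height, age, lifesat)
```

Because the toy criterion was built from the weights 5, 0.03, 0.02, and -0.01, the regression should return those values up to rounding error. Real data would of course also need standard errors and significance tests, which this sketch leaves out.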

20
Why centering the continuous predictors is
important

Centering provides a meaningful zero-point for X
and M (gives you effects at the mean of X and M,
respectively)
Having clearly interpretable zero-points is
important because, in moderated regression, we
estimate conditional effects of one variable when
the other variable is fixed at 0; e.g., at M = 0 the
equation Y = b0 + b1X + b2M + b3XM reduces to
Y = b0 + b1X
Thus, b1 is not a main effect, it is a
conditional effect at M = 0!
The same applies when viewing the effect of M on Y as a
function of X.
Centering the predictors does not affect the
coefficient of the interaction term (b3), but it changes
all of the other coefficients (b0, b1, b2) in the model
Other transformations may be useful in certain
cases, but mean centering is usually the best
choice
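
Why centering leaves b3 untouched can be seen with a short derivation. This is a sketch in assumed notation (the bars denote sample means, cX and cM the centered scores); it substitutes X = cX + mean(X) and M = cM + mean(M) into the raw-score equation:

```latex
\begin{align*}
Y &= b_0 + b_1 X + b_2 M + b_3 XM \\
  &= b_0 + b_1 (cX + \bar{X}) + b_2 (cM + \bar{M})
     + b_3 (cX + \bar{X})(cM + \bar{M}) \\
  &= \underbrace{(b_0 + b_1 \bar{X} + b_2 \bar{M} + b_3 \bar{X}\bar{M})}_{\text{new intercept}}
   + \underbrace{(b_1 + b_3 \bar{M})}_{\text{new weight of } cX}\, cX
   + \underbrace{(b_2 + b_3 \bar{X})}_{\text{new weight of } cM}\, cM
   + b_3\, cX\, cM
\end{align*}
```

The weight of the product term is b3 in both versions, while the intercept and the two lower-order weights absorb the means.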

21
SPSS Syntax

* unstandardized.
* center height and age (on grand mean) and compute interaction term.
DESC var=height age.
COMPUTE heightc = height - 173.
COMPUTE agec = age - 29.8625.
COMPUTE heightc.agec = heightc*agec.
REGRESSION
  /STATISTICS R CHA COEFF
  /DEPENDENT lifesat
  /METHOD=ENTER heightc agec
  /METHOD=ENTER heightc.agec.

22
SPSS output
Do not interpret the betas as given by SPSS; they are
wrong!
[SPSS coefficients table; annotations: b0, b1, b2, b3; test of significance of interaction]
23
Plotting the interaction

SPSS does not provide a straightforward module
for plotting interactions
There is an infinite number of slopes we could
compute for different combinations of X and M
Minimum: We need to calculate values for high (+1
SD) and low (-1 SD) X as a function of high (+1
SD) and low (-1 SD) values on the moderator M

24
Unstandardized plot: Compute values for the plot
either by hand ...

Effect of height on life satisfaction
1 SD below the mean of age (M):
-1 SD of height
+1 SD of height
1 SD above the mean of age (M):
-1 SD of height
+1 SD of height

25
or let Excel do the job!
Adapted from Dawson, 2006
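
As an alternative to hand calculation or the Excel sheet, the four plot points can be sketched in a few lines of Python. The function name `plot_points` is an assumption for this example. The weights and SDs are the values that can be read off the interpretation slides that follow; the intercept is not given in the transcript, so `b0=5.0` is a hypothetical placeholder.

```python
def plot_points(b0, b1, b2, b3, sd_x, sd_m):
    """Predicted Y at -1 SD and +1 SD of centered X, for low and high centered M."""
    points = {}
    for m_label, cm in (("low M", -sd_m), ("high M", sd_m)):
        for x_label, cx in (("low X", -sd_x), ("high X", sd_x)):
            points[(m_label, x_label)] = b0 + b1 * cx + b2 * cm + b3 * cx * cm
    return points

sd_height, sd_age = 9.547, 4.9625   # SDs as read from the interpretation slides

# b0 = 5.0 is a hypothetical placeholder (the intercept is not in the transcript).
points = plot_points(b0=5.0, b1=0.034, b2=0.017, b3=-0.008,
                     sd_x=sd_height, sd_m=sd_age)

# The slope of height at high age can be recovered from the two plotted end points.
slope_high_age = (points[("high M", "high X")]
                  - points[("high M", "low X")]) / (2 * sd_height)
```

The recovered slope at high age equals b1 + b3 × SD(age), which is the -.0057 reported on the interpretation slide for height moderated by age.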
26
Interpreting the unstandardized plot: Effect of
height moderated by age
Intercept: LS at mean of height and age (when
both are centered)
Simple slope of height at mean age: b = .034
Change in the slope of height for each one-unit
increase in age
Change in the slope of height for a 1 SD increase
in age: b = .034 + (-.008 × 4.9625) = -.0057
Simple slope of age at mean height (difficult to
illustrate)
[Plot x-axis: height at 163, 173 (mean height), and 183]
27
Interpreting the unstandardized plot: Effect of
age moderated by height
Intercept: LS at mean of age and height (when
centered)
Simple slope of age at mean height: b = .017
Change in the slope of age for each one-unit
increase in height
Change in the slope of age for a 1 SD increase in
height: b = .017 + (-.008 × 9.547) = -.059
Simple slope of height at mean age (difficult to
illustrate)
28
Estimating the proper standardized solution

Standardized solution (to get the beta-weights):
Z-standardize X, M, and Y
Compute the product of the z-standardized scores for X
and M
Regress zY on zX, zM, and zX×zM
The unstandardized coefficients (B) from this output are
the correct standardized solution (Friedrich, 1982)!

29
Why the standardized betas given by SPSS are wrong

SPSS takes the z-score of the product (zXM) when
calculating the standardized solution.
Except in unusual circumstances, zXM is different
from zX×zM, the product of the two z-scores we are
interested in.
Solution (Friedrich, 1982): feed zX, zM, and their
product zX×zM into an ordinary regression. The Bs
from the output will correspond to the correct
standardized coefficients.
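
The difference between the two quantities is easy to check numerically. A minimal sketch with made-up scores (not the workshop data); `zscores` is a helper defined here for illustration:

```python
from statistics import mean, stdev

def zscores(values):
    """Z-standardize a list of scores using the sample standard deviation."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]

# Made-up scores for illustration only.
x = [160, 165, 170, 175, 180, 185]
m = [20, 35, 25, 40, 30, 30]

# What we want: the product of the two z-scores.
product_of_z = [zx * zm for zx, zm in zip(zscores(x), zscores(m))]

# What standardizing the raw product term amounts to: the z-score of X*M.
z_of_product = zscores([a * b for a, b in zip(x, m)])

largest_gap = max(abs(p - q) for p, q in zip(product_of_z, z_of_product))
```

By construction the z-score of the product has mean 0 and SD 1, whereas the product of z-scores generally has neither, so the two columns are different predictors.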

30
SPSS Syntax

* standardized.
* let spss z-standardize height, age, and lifesat.
DESC var=height age lifesat /save.
* compute interaction term from z-standardized scores.
COMPUTE zheight.zage = zheight*zage.
REGRESSION
  /DEPENDENT zlifesat
  /METHOD=ENTER zheight zage
  /METHOD=ENTER zheight.zage.

31
SPSS output

Side note: What happens if we do not standardize
Y?
→ Then we get so-called half-standardized
regression coefficients (i.e., how does one SD
on X/M affect Y in terms of original units?)

32
Standardized plot
β = .240
Change in the beta of height for a 1 SD increase
in age:
β = .240 + (-.270 × 1) = -.030
33
Simple slope testing

Test of interaction term: Does the relationship
between X and Y reliably depend upon M?
Simple slope testing: Is the regression weight
for high (+1 SD) or low (-1 SD) values on M
significantly different from zero?

34
Simple slope testing

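Best done for the standardized solution
Simple slope testing for low (-1 SD) values of M:
add 1 (sic!) to M, which moves the zero point of M to 1 SD below the mean
Simple slope testing for high (+1 SD) values of M:
subtract 1 (sic!) from M, which moves the zero point of M to 1 SD above the mean
Now run a separate regression analysis with each
transformed score; the weight of X is then the simple slope at the new zero point

[Figure: original scale (centered), shifted by adding 1 SD and by subtracting 1 SD]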
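
The counterintuitive direction of the shift can be checked with the standardized weights from the standardized plot (β = .240 for height, -.270 for the product term). The function below is a sketch under the assumption of the usual model Y = b0 + b1X + b2M + b3XM; its name is hypothetical. It only gives the point estimate of the simple slope. The standard error and p value still require rerunning the regression with the shifted moderator.

```python
def slope_after_shift(b1, b3, shift):
    """Weight of X after replacing M by M + shift.

    Substituting M = M_new - shift into b1*X + b3*X*M gives
    (b1 - b3*shift)*X + b3*X*M_new, so the new zero point sits at M = -shift.
    """
    return b1 - b3 * shift

beta_height, beta_product = 0.240, -0.270   # values from the standardized plot

slope_low_age = slope_after_shift(beta_height, beta_product, shift=+1)   # zero at -1 SD
slope_high_age = slope_after_shift(beta_height, beta_product, shift=-1)  # zero at +1 SD
```

Adding 1 therefore yields the slope at low age (about .51) and subtracting 1 yields the slope at high age (about -.03), in line with the simple slope results reported on the following slides.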
35
SPSS Syntax

* simple slope testing in standardized solution.
* regression at -1 SD of M: add 1 to zage in order to shift new zero point one sd below the mean.
compute zagebelow = zage + 1.
compute zheight.zagebelow = zheight*zagebelow.
REGRESSION
  /DEPENDENT zlifesat
  /METHOD=ENTER zheight zagebelow
  /METHOD=ENTER zheight.zagebelow.
* regression at +1 SD of M: subtract 1 from zage in order to shift new zero point one sd above the mean.
compute zageabove = zage - 1.
compute zheight.zageabove = zheight*zageabove.
REGRESSION
  /DEPENDENT zlifesat
  /METHOD=ENTER zheight zageabove
  /METHOD=ENTER zheight.zageabove.

36
Simple slope testing: Results
37
Illustration
β = .509, p = .003 (age 1 SD below the mean)
β = -.030, p = .844 (age 1 SD above the mean)
38
III) Inclusion of control variables

Often, you want to control for other variables
(covariates)
Simply add centered/z-standardized continuous
covariates as predictors to the regression
equation
In the case of categorical control variables, effects
coding is recommended
Example: Depression, measured on a 5-point scale
(1-5) with the Beck Depression Inventory (continuous)

39
SPSS

COMPUTE deprc = depr - 3.02.
REGRESSION
  /DEPENDENT lifesat
  /METHOD=ENTER heightc agec deprc
  /METHOD=ENTER heightc.agec.

40
A note on centering the control variable(s)

If you do not center the control variable, the
intercept will be affected since you will be
estimating the regression at the true zero-point
(instead of the mean) of the control variable.

Depression centered
Depression uncentered (intercept estimated at
meaningless value of 0 on the depr. scale)
41
IV) Effect size calculation

Beta-weight (β) is already an effect size
statistic, though not perfect
f² (see Aiken & West, 1991, p. 157)

42
Calculating f²

$f^2 = \frac{R^2_{Y.AI} - R^2_{Y.A}}{1 - R^2_{Y.AI}}$

$R^2_{Y.AI}$: Squared multiple correlation resulting from combined prediction of Y by the additive set of predictors (A) and their interaction (I) (= full model)
$R^2_{Y.A}$: Squared multiple correlation resulting from prediction by set A only (= model without interaction term)

In words: f² gives you the proportion of
systematic variance accounted for by the
interaction relative to the unexplained variance
in the criterion
Conventions by Cohen (1988):
f² = .02: small effect
f² = .15: medium effect
f² = .35: large effect
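
The effect size is a one-line computation once both R² values are known. A minimal sketch: the function name is an assumption, and the two R² values in the usage line are hypothetical numbers, not the workshop results.

```python
def f_squared(r2_full, r2_additive):
    """Effect size f2 of the interaction.

    r2_full:     R squared of the model with additive predictors and interaction
    r2_additive: R squared of the model with the additive predictors only
    """
    if not 0 <= r2_additive <= r2_full < 1:
        raise ValueError("expected 0 <= r2_additive <= r2_full < 1")
    return (r2_full - r2_additive) / (1 - r2_full)

# Hypothetical values for illustration: the interaction adds 5% explained variance.
example_f2 = f_squared(r2_full=0.30, r2_additive=0.25)
```

With these hypothetical inputs the interaction explains .05 of the variance against .70 left unexplained, giving f² of about .07.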

43
Example
→ small to medium effect
44
Case 2: continuous × categorical variable
interaction (on continuous DV)

Fictitious example
X: Body height (continuous)
Y: Life satisfaction (continuous)
M: Gender (categorical: male vs. female)
Does the effect of body height on life satisfaction
depend on gender? Our hypothesis: body height is
more important for life satisfaction in males

45
Advance organizer for Case 2

I) Coding issues
II) Estimating the solution using dummy coding
Unstandardized solution
Standardized solution
III) Estimating the solution using unweighted
effects coding
(Unstandardized solution)
Standardized solution
IV) What if there are more than two levels on
categorical scale?
V) Inclusion of control variables
VI) Effect size calculation

46
Descriptives
47
I) Coding options

Dummy coding (0, 1)
Allows you to compare the effects of X on Y between
the reference group (d = 0) and the other group(s)
(d = 1)
Definitely preferred if you are interested in
the specific regression weights for each group
Unweighted effects coding (-1, +1): yields the
unweighted mean effect of X on Y across groups
Preferred if you are interested in the overall mean
effect (e.g., when inserting M as a nonfocal
variable); all groups are viewed in comparison to
the unweighted mean effect across groups
Results are directly comparable with ANOVA
results when you have 2 or more categorical
variables
Weighted effects coding: also takes into account
the sample size of the groups
Similar to unweighted effects coding except that
the size of each group is taken into
consideration
Useful for representative panel analyses

48
II) Estimating the unstandardized solution using
dummy coding

Unstandardized solution:
Dummy-code M (0 = reference group; 1 = comparison
group)
Center X → cX
Compute the product of cX and M
Regress Y on cX, M, and cX×M

49
SPSS Syntax

* Create dummy coding.
IF (gender = 0) genderd = 0.
IF (gender = 1) genderd = 1.
* center height (on grand mean) and compute interaction term.
DESC var=height.
COMPUTE heightc = height - 173.
* Compute product term.
COMPUTE genderd.heightc = genderd*heightc.
* Regress lifesat on heightc and genderd, adding the interaction term.
REGRESSION
  /DEPENDENT lifesat
  /METHOD=ENTER heightc genderd
  /METHOD=ENTER genderd.heightc.

50
SPSS output
[SPSS coefficients table; annotations: b0, b1, b2, b3]
51
Estimating the standardized solution using dummy
coding

Standardized solution:
Dummy-code M (0 = reference group; 1 = comparison
group)
Z-standardize X and Y
Compute the cross-product of zX and M
Regress zY on zX, M, and zX×M
The unstandardized coefficients (B) from this output are
the correct standardized solution (Friedrich, 1982)!

52
SPSS Syntax

* compute z-scores of all continuous variables involved and then compute interaction term.
DESC var=lifesat height /save.
COMPUTE genderd.zheight = genderd*zheight.
EXECUTE.
REGRESSION
  /DEPENDENT zlifesat
  /METHOD=ENTER zheight genderd
  /METHOD=ENTER genderd.zheight.

53
SPSS output: standardized solution
.507 = estimated difference in regression weights
between groups
54
Correct regression equations
55
Plotting the interaction

Convention: calculate predicted values for high
(+1 SD) and low (-1 SD) values of X in both
groups of M

56
Unstandardized Plot

Females (reference group, M = 0):
-1 SD
+1 SD
Males (M = 1):
-1 SD
+1 SD

57
Excel spreadsheet
Adapted from Dawson, 2006
58
Interpreting the unstandardized plot
Intercept for the reference group at mean of height
(when height is centered)
Slope of height for the reference group
Change in the slope when going from the reference
group to the other group
Difference in intercept between reference and
comparison group at mean of height
[Plot x-axis: height at 163, 173 (mean height), and 183]
59
Interpreting the standardized plot
Intercept for reference group at mean of height
(when height is centered)
Difference in the slope when going from
reference group to other group
Slope of height for the reference group
Difference in intercept between both groups at
mean of height
60
Simple slope testing

The test of the interaction term answers the question:
Are the two regression weights in group A and B
significantly different from each other?
Simple slope testing answers: Is the regression
weight in group A (or B) significantly different
from zero?

61
Simple slope testing

Use dummy coding
Simple slope test of the reference group (women):
Is already given in the SPSS output as the test of
the conditional effect of X (b1) at M = 0!
Simple slope test of the comparison group (men):
Easiest way: recode M such that group B is now
the reference group (0). Then do the regression
analysis all over again.

* Simple slopes comparison group (recode men=0, women=1).
IF (gender = 0) genderd2 = 1.
IF (gender = 1) genderd2 = 0.
COMPUTE genderd2.zheight = genderd2*zheight.
REGRESSION
  /MISSING LISTWISE
  /DEPENDENT zlifesat
  /METHOD=ENTER zheight genderd2
  /METHOD=ENTER genderd2.zheight.

β = .544, p = .003 (men)
β = .036, p = .807 (women)

→ The effect of height on life satisfaction is
significant for men, but not for women.

63
III) Estimating the unstandardized solution using
unweighted effects coding

Unstandardized solution:
Effect-code M (-1 = group A; +1 = group B)
Center X
Compute the cross-product of centered Xc and M
Regress Y on Xc, M, and Xc×M
Interpret the unstandardized solution from the
output

64
Estimating the standardized solution using
unweighted effects coding

Standardized solution (to get the beta-weights):
Effect-code M (-1 = group A; +1 = group B)
Z-standardize X and Y
Compute the cross-product of the z-standardized scores for
X and M
Regress zY on zX, M, and zX×M
Again, the unstandardized solution from the
output is the correct (standardized) solution
(Friedrich, 1982)!

65
SPSS Syntax (standardized solution only)

IF (gender = 0) gendere = -1.
IF (gender = 1) gendere = 1.
COMPUTE gendere.zheight = gendere*zheight.
REGRESSION
  /DEPENDENT zlifesat
  /METHOD=ENTER zheight gendere
  /METHOD=ENTER gendere.zheight.

66
Interpreting the standardized plot
Unweighted grand mean of both groups at mean of
height (when height is centered)
Unweighted mean slope across both groups
Deviation of the slope for the group coded 1
from the unweighted mean slope
Difference in intercept between the group coded 1
and the unweighted grand mean
67
To sum up and compare
Dummy coding
Unweighted effects coding

In dummy coding, the contrasts are with the
reference group (0)
In unweighted effects coding, the contrasts are
with the unweighted mean of the sample
With two groups, the regression weights of M and of
the X × M product under unweighted effects coding (-1, +1)
equal exactly half of the corresponding weights under
dummy coding (0, 1).
Dummy/effects coding does not change the
significance test of the interaction (and the
simple slope tests)
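
The relation between the two coding schemes can be written as a small conversion. This is a sketch for the two-group case, assuming dummy code d in {0, 1} and effects code e in {-1, +1}, so that d = (e + 1) / 2. The function name is hypothetical. The slope values .036 and .507 are the standardized weights reported for the dummy-coded solution; b0 and b2 are not in the transcript, so they are set to placeholder zeros.

```python
def dummy_to_effects(b0, b1, b2, b3):
    """Convert two-group weights from dummy coding (0/1) to effects coding (-1/+1).

    Substituting d = (e + 1) / 2 into b0 + b1*X + b2*d + b3*X*d gives
    (b0 + b2/2) + (b1 + b3/2)*X + (b2/2)*e + (b3/2)*X*e.
    """
    return (b0 + b2 / 2, b1 + b3 / 2, b2 / 2, b3 / 2)

# b1 = slope for women (reference group), b3 = difference in slopes (men - women).
# b0 and b2 are hypothetical placeholders.
e0, e1, e2, e3 = dummy_to_effects(b0=0.0, b1=0.036, b2=0.0, b3=0.507)

slope_women = e1 - e3   # group coded -1
slope_men = e1 + e3     # group coded +1
```

Here `e1` is the unweighted mean slope across both groups and `e3` is half of the dummy-coded interaction weight; adding and subtracting them returns the two group slopes (.036 and about .543, matching the simple slopes up to rounding).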

68
Further issues

V) What if there are more than 2 groups?
VI) Adding control variables
VII) Computing the effect size for the
interaction term

69
V) What if there are more than 2 groups?

Coding systems can be easily extended to N levels
of the categorical variable
Example: 3 groups (dummy coding) give you 3
possibilities
You need N-1 dummy variables
Include each dummy and its interaction with the other
predictor in the equation
Interpretation: each dummy captures the difference
between the reference group and the group coded 1
Statistical evaluation of the overall interaction
effect: R² change
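
Building the N-1 dummy variables and their product terms is mechanical. A minimal sketch with hypothetical group labels and centered scores; the function names are assumptions for this example:

```python
def dummy_code(groups, reference):
    """Return N-1 dummy variables (0/1), one per non-reference level."""
    if reference not in groups:
        raise ValueError("reference level does not occur in the data")
    levels = sorted(set(groups) - {reference})
    return {level: [1 if g == level else 0 for g in groups] for level in levels}

def product_terms(dummies, cx):
    """Multiply each dummy with the centered continuous predictor."""
    return {level: [d * x for d, x in zip(codes, cx)]
            for level, codes in dummies.items()}

# Hypothetical data: three groups and a centered continuous predictor.
groups = ["a", "b", "c", "a", "c"]
cx = [-2.0, -1.0, 0.0, 1.0, 2.0]

dummies = dummy_code(groups, reference="a")   # two dummies for three groups
products = product_terms(dummies, cx)         # one product term per dummy
```

All dummies and all product terms then enter the regression, and the set of product terms is evaluated jointly through the R² change.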

70
V) What if there are more than 2 groups?

Example: 3 groups using effects coding
Interpretation: each coding variable captures the
difference between the group coded 1 and the unweighted
grand mean
Statistical evaluation of the overall interaction
effect: R² change

71
VI) Adding control variables

Simply add centered covariates as predictors to
the unstandardized regression equation (or
z-standardized covariates to the standardized
regression equation).

72
VII) Effect size calculation

Again, f² should be used

$f^2 = \frac{R^2_{Y.AI} - R^2_{Y.A}}{1 - R^2_{Y.AI}}$

$R^2_{Y.AI}$: Squared multiple correlation resulting from combined prediction of Y by the additive set of predictors (A) and their interaction (I) (= full model)
$R^2_{Y.A}$: Squared multiple correlation resulting from prediction by set A only (= model without interaction term)
73
Higher-order interactions

Higher-order interactions: interactions among
more than 2 variables
All basic principles (centering, coding, probing,
simple slope testing, effect size) generalize to
higher-order interactions (see Aiken & West,
1991, Chapter 4)

74
Example

Y: Life satisfaction (continuous)
X: Body height (continuous)
M1: Age (continuous)
M2: Gender (categorical: male vs. female)
Is the height × age interaction
different in males and females?
Important: Include all lower-level (e.g.,
two-way) interactions before inserting the
higher-order (e.g., three-way) term!
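
Written out, the three-way model contains all seven predictor terms. The following is a sketch in assumed notation (the numbering of the weights b1 to b7 is chosen for this illustration and is not taken from the slides); regrouping shows how the slope of X depends on both moderators:

```latex
\begin{align*}
Y &= b_0 + b_1 X + b_2 M_1 + b_3 M_2
     + b_4 X M_1 + b_5 X M_2 + b_6 M_1 M_2 + b_7 X M_1 M_2 \\
  &= (b_0 + b_2 M_1 + b_3 M_2 + b_6 M_1 M_2)
     + (b_1 + b_4 M_1 + b_5 M_2 + b_7 M_1 M_2)\, X
\end{align*}
```

The weight b7 only reflects the three-way interaction if the three two-way products are in the equation, which is why the lower-level terms have to be entered first.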

75
Syntax

Standardized solution
* compute z-scores of all continuous variables involved and then compute two-way and three-way interaction term(s).
* two-way.
COMPUTE genderd.zheight = genderd*zheight.
COMPUTE genderd.zage = genderd*zage.
COMPUTE zheight.zage = zheight*zage.
* three-way.
COMPUTE genderd.zheight.zage = genderd*zheight*zage.
REGRESSION
  /DEPENDENT zlifesat
  /METHOD=ENTER zheight zage genderd
  /METHOD=ENTER zheight.zage genderd.zheight genderd.zage
  /METHOD=ENTER genderd.zheight.zage.

76
SPSS output
Three-way interaction: p = .090
77
SPSS output (contd)
Slope of height in females at mean of age
Change in slope of height for males at mean of age
Difference in slope of height for males at mean
of age as compared to males 1 SD above the mean
of age
78
Plotting the interaction

Plot the first-level moderator effect (e.g., height ×
age) at different levels of the third variable
(e.g., gender)
It is best to use separate graphs for that
There are 6 different ways to plot the three-way
interaction
Best presentation should be determined by theory
In the case of categorical vars it often makes
sense to plot the separate graphs as a function
of group
The logic to compute the values for different
combinations of high and low values on predictors
is the same as in the two-way case

79
Excel sheet for three-way IA
Adapted from Dawson, 2006
80
Plotting the three-way interaction
β = .029 + .346 = .375
β = .375 + (-.435) = -.06
β = .029
81
Simple slope tests

This syntax estimates the beta of the steep slope
of height for males low in age (see previous
slide)
* recode group membership.
IF (gender = 0) genderd2 = 1.
IF (gender = 1) genderd2 = 0.
* transform age.
COMPUTE zagebelow = zage + 1.
* compute new product terms.
COMPUTE zheight.zagebelow = zheight*zagebelow.
COMPUTE genderd2.zheight = genderd2*zheight.
COMPUTE genderd2.zagebelow = genderd2*zagebelow.
COMPUTE genderd2.zheight.zagebelow = genderd2*zheight*zagebelow.
REGRESSION
  /DEPENDENT zlifesat
  /METHOD=ENTER zheight zagebelow genderd2
  /METHOD=ENTER zheight.zagebelow genderd2.zheight genderd2.zagebelow
  /METHOD=ENTER genderd2.zheight.zagebelow.

82
Output simple slope test
Slope of height in males one SD below the mean of
age
83
The challenge of statistical power when testing
moderator effects

If variables were measured without error, the
following sample sizes would be needed to detect
small, medium, and large interaction effects with
adequate power (.80):
Large effect (f² = .35): N = 26
Medium effect (f² = .15): N = 55
Small effect (f² = .02): N = 392
Busemeyer & Jones (1983): the reliability of the product
term of two uncorrelated variables is the product
of the reliabilities of the two variables, e.g.,
.80 × .80 = .64
Required sample size is more than doubled
(tripled) when predictor reliabilities drop from
1 to .80 (.70) (Aiken & West, 1991)
The problem gets even worse for higher-order
interactions

84
Outlook 1: Dichotomous DV

What if the DV is dichotomous (e.g., group
membership, voting decision etc.)?
Use moderated logistic regression (Jaccard, 2001)

85
Outlook 2: Moderated Mediation Analysis
[Path diagram with X, Y, and Z]
86
Outlook 2: Moderated mediated regression analysis

Preacher, K. J., Rucker, D. D., & Hayes, A. F.
(2007). Assessing moderated mediation
hypotheses: Theory, methods, and prescriptions.
Multivariate Behavioral Research, 42, 185-227.
Check out http://www.comm.ohio-state.edu/ahayes/SPSS%20programs/modmed.htm for a copy of the paper
and a convenient SPSS macro that does all the
computations

87
End of presentation

Thank you very much for your attention!

88
Appendix
89
Some don'ts for Case II
Useful procedures to get a first feel for the
data, but not appropriate tests for interaction:
a) Testing the difference in subgroup
correlations
- confounds true moderator effects with differences in predictor variance (Whisman & McClelland, 2005)
- does not control for possible interdependence among predictor and moderator
- loss of power
b) Splitting the file and regressing Y on X
separately by the two groups
- does not control for possible interdependence among predictor and moderator
- does not test for the difference in regression weights
Difference in regression weights = .428
90
Dummy coding: Standardized plot