Title: Structural Equation Modeling
1Structural Equation Modeling
2Other Names
- SEM: Structural Equation Modeling
- CSA: Covariance Structure Analysis
- Causal Models
- Simultaneous Equation Modeling
- Path Analysis (with Latent Variables)
- Confirmatory Factor Analysis
3SEM in a nutshell
- Combination of factor analysis and regression
- Continuous and discrete predictors and outcomes
- Relationships among measured or latent variables
- Direct link between Path Diagrams and equations and fit statistics
- Models contain both measurement and path models
4An Example of a Path Diagram
5Vocabulary
- Measured variable
- Observed variables, indicators, or manifest variables in an SEM design
- Predictors and outcomes in path analysis
- Squares in the diagram
- Latent Variable
- Unobservable variable in the model: factor, construct
- Construct driving measured variables in the measurement model
- Circles in the diagram
6More Vocabulary
- Error or E
- Variance left over after prediction of a measured variable
- Disturbance or D
- Variance left over after prediction of a factor
- Exogenous Variable
- Variable that predicts other variables and is not itself predicted by any variable in the model
- Endogenous Variables
- A variable that is predicted by another variable
- A predicted variable is endogenous even if it in turn predicts another variable
7Still more Vocabulary
- Measurement Model
- The part of the model that relates indicators to latent factors
- The measurement model is the factor analytic part of SEM
- Path model
- The part of the model that relates variables or factors to one another (prediction)
- If no factors are in the model, then only a path model exists between indicators
8Even more Vocabulary
- Direct Effect
- Regression coefficients of direct prediction
- Indirect Effect
- Mediating effect of x1 on y through x2
- Confirmatory Factor Analysis
- Covariance Structure
- Relationships based on variance and covariance
- Mean Structure
- Includes means (intercepts) in the model
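A minimal sketch of direct versus indirect effects, assuming a simple mediation chain x1 → x2 → y estimated with ordinary regression on simulated data. The path values (0.5, 0.4, 0.3), the sample size, and the helper `ols_slopes` are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def ols_slopes(X, y):
    """Return OLS slopes of y on the columns of X (an intercept is fitted, then dropped)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]

rng = np.random.default_rng(0)
n = 20_000
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)              # a path: x1 -> x2
y = 0.4 * x2 + 0.3 * x1 + rng.normal(size=n)    # b path (x2 -> y) plus direct path (x1 -> y)

a = ols_slopes(x1, x2)[0]                               # effect of x1 on the mediator
direct, b = ols_slopes(np.column_stack([x1, x2]), y)    # direct effect of x1, effect of x2
indirect = a * b                                        # mediated effect of x1 on y through x2
total = ols_slopes(x1, y)[0]                            # total effect = direct + indirect

print(f"direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

In a linear model the total effect decomposes into the direct effect plus the product of the two mediating paths, which is what the last line lets you check.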
9Back to Path Diagrams
- Single-headed arrow ($\rightarrow$)
- This is prediction
- Regression Coefficient or factor loading
- Double-headed arrow ($\leftrightarrow$)
- This is correlation
- Missing Paths
- Hypothesized absence of relationship
- Can also set path to zero
10The Previous Example
11Types of SEM questions
- Does the model produce an estimated population covariance matrix that fits the sample data?
- SEM calculates many indices of fit: close fit, absolute fit, etc.
- Which model best fits the data?
- What is the percent of variance in the variables explained by the factors?
- What is the reliability of the indicators?
- What are the parameter estimates from the model?
12SEM questions
- Are there any indirect or mediating effects in the model?
- Are there group differences?
- Multi-group models
- Can change in the variance (or mean) be tracked over time?
- Growth Curve or Latent Growth Curve Analysis
13SEM questions
- Can a model be estimated with individual and group level components?
- Multilevel Models
- Can latent categorical variables be estimated?
- Mixture models
- Can a latent group membership be estimated from continuous and discrete variables?
- Latent Class Analysis
14SEM questions
- Can we predict the rate at which people will drop out of a study or end treatment?
- Discrete-time survival mixture analysis
- Can these techniques be combined into a huge mess?
- Multiple group multilevel growth curve latent class analysis???????
15SEM limitations
- SEM is a confirmatory approach
- You need to have established theory about the relationships
- Cannot be used to explore possible relationships when you have more than a handful of variables
- Exploratory methods (e.g. model modification) can be used on top of the original theory
- SEM is not causal by itself; causal conclusions come from experimental design
16SEM limitations
- SEM is often thought of as strictly correlational, but it can be used (like regression) with experimental data if you know how to use it.
- SEM is a sophisticated technique, but this does not make up for a bad experiment, and the results can only be generalized to the population that was sampled
17SEM limitations
- Biggest limitation is sample size
- It needs to be large to get stable estimates of the covariances/correlations
- 200 subjects for small to medium sized model
- A minimum of 10 subjects per estimated parameter
- Also affected by effect size and required power
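The two rules of thumb above can be combined into a quick planning check. This is only a sketch: taking the larger of the two numbers is an assumption made here, and the function name and defaults are hypothetical. It ignores effect size and required power, which the slide notes also matter.

```python
def minimum_sample_size(n_free_parameters, per_parameter=10, floor=200):
    """Rule-of-thumb N: at least `floor` subjects and `per_parameter` subjects per parameter."""
    return max(floor, per_parameter * n_free_parameters)

small_model = minimum_sample_size(12)   # 12 free parameters
large_model = minimum_sample_size(35)   # 35 free parameters
print(small_model, large_model)
```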
18SEM limitations
- Missing data
- Can be dealt with in the typical ways (e.g. regression, EM algorithm, etc.) through SPSS and data screening
- Most SEM programs will estimate missing data and run the model simultaneously
- Multivariate Normality and no outliers
- Screen for univariate and multivariate outliers
- SEM programs have tests for multi-normality
- SEM programs have corrected estimators when there's a violation
19SEM limitations
- Linearity
- No multicollinearity/singularity
- Residual Covariances (R minus reproduced R)
- Should be small
- Centered around zero
- Symmetric distribution of errors
- If asymmetric, then some covariances are being estimated better than others
20Technical Stuff to Follow
21Basic Structure
Simple regression: $y = \gamma x + \zeta$
Implied Covariance Matrix
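The matrix itself did not survive on this slide. As a hedged reconstruction, assume the simple regression is written $y = \gamma x + \zeta$ with $\phi = \mathrm{Var}(x)$, $\psi = \mathrm{Var}(\zeta)$, and $\mathrm{Cov}(x, \zeta) = 0$ (these symbols are assumptions chosen to match the notation used later in the deck). The covariance matrix implied by the model is then:

```latex
\Sigma(\theta) =
\begin{pmatrix}
\mathrm{Var}(y) & \mathrm{Cov}(y, x) \\
\mathrm{Cov}(x, y) & \mathrm{Var}(x)
\end{pmatrix}
=
\begin{pmatrix}
\gamma^{2}\phi + \psi & \gamma\phi \\
\gamma\phi & \phi
\end{pmatrix},
\qquad \theta = (\gamma, \phi, \psi)'
```

Three distinct sample moments determine three parameters, so this model is exactly identified.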
22The univariate consequences of measurement error
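- $x = \text{True Score} + \text{Error} = \xi + \delta$
- $\Rightarrow \mathrm{Var}(x) = \mathrm{Var}(\xi) + \mathrm{Var}(\delta) = \phi + \theta_{\delta}$ (the error is assumed uncorrelated with the true score)
- Thus, Var(x) overestimates the variance of the true score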
23The bivariate consequences of measurement error
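- A simple regression model with measurement error
- $y = \gamma \xi + \zeta$
- $x = \xi + \delta$
- $y = \gamma^{*} x + \zeta^{*}$, with $\gamma^{*} = \rho_{xx}\gamma$,
where $\rho_{xx}$ is the measurement reliability of $x$.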
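A small simulation can make the attenuation concrete. The sketch below assumes the classical setup (observed x equals true score plus independent error, y depends on the true score) and made-up values for the slope and variances. It shows that regressing y on the error-laden x recovers roughly the true slope times the reliability.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
gamma = 2.0                       # true slope of y on the true score (assumed value)
var_true, var_error = 1.0, 0.5    # Var(true score) and Var(measurement error) (assumed)

true_score = rng.normal(scale=np.sqrt(var_true), size=n)
x = true_score + rng.normal(scale=np.sqrt(var_error), size=n)   # observed, error-laden x
y = gamma * true_score + rng.normal(size=n)

reliability = var_true / (var_true + var_error)          # rho_xx = 2/3 here
slope_on_x = np.cov(x, y)[0, 1] / np.var(x, ddof=1)      # OLS slope of y on observed x

print(f"Var(x)={np.var(x, ddof=1):.3f}  slope={slope_on_x:.3f}  "
      f"gamma*reliability={gamma * reliability:.3f}")
```

The observed variance exceeds the true-score variance, and the estimated slope is biased toward zero by the factor $\rho_{xx}$.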
24Introduction: The bivariate consequences of measurement error
- Impact on goodness-of-fit
- What's the impact on sample inference?
- Generally, the distortions are not as systematic
for multiple regression and simultaneous equation
models
25Confirmatory Factor Analysis Model
$x = \Lambda_x \xi + \delta$
Where
- $x$: $(q \times 1)$ vector of indicator/manifest variables
- $\xi$: $(n \times 1)$ vector of latent constructs (factors)
- $\delta$: $(q \times 1)$ vector of errors of measurement
- $\Lambda_x$: $(q \times n)$ matrix of factor loadings
26Confirmatory Factor Analysis Example
- Measures for positive emotions $\xi_1$
- x1 = Happiness, x2 = Pride
- Measures for negative emotions $\xi_2$
- x3 = Sadness, x4 = Fear
- Model
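The model equations are missing from the slide. A plausible reconstruction, assuming each indicator loads only on its own factor and using the usual $\lambda_{ij}$ (loading of indicator $i$ on factor $j$) and $\delta_i$ (measurement error) notation, is:

```latex
\begin{aligned}
x_1 &= \lambda_{11}\,\xi_1 + \delta_1 \\
x_2 &= \lambda_{21}\,\xi_1 + \delta_2 \\
x_3 &= \lambda_{32}\,\xi_2 + \delta_3 \\
x_4 &= \lambda_{42}\,\xi_2 + \delta_4
\end{aligned}
\qquad\Longleftrightarrow\qquad
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
=
\begin{pmatrix}
\lambda_{11} & 0 \\
\lambda_{21} & 0 \\
0 & \lambda_{32} \\
0 & \lambda_{42}
\end{pmatrix}
\begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix}
+
\begin{pmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \\ \delta_4 \end{pmatrix}
```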
27Confirmatory Factor Analysis Example
28Confirmatory Factor Analysis: Graphical Representation
29Confirmatory Factor Analysis: Model Assumptions
$E(\xi) = 0$, $E(\delta) = 0$, $\mathrm{Var}(\xi) = \Phi$, $\mathrm{Var}(\delta) = \Theta_{\delta}$, $\mathrm{Cov}(\xi, \delta) = 0$
Implied Mean Vector
Implied Covariance Matrix
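Under the assumptions on this slide (zero-mean factors and errors, errors uncorrelated with factors), the standard CFA results are an implied mean vector of zero and an implied covariance matrix $\Sigma(\theta) = \Lambda \Phi \Lambda' + \Theta$. The sketch below evaluates that expression for a two-factor, four-indicator layout like the emotions example; all numeric values are made up for illustration.

```python
import numpy as np

def implied_covariance(Lambda, Phi, Theta):
    """Model-implied covariance of the indicators: Lambda Phi Lambda' + Theta."""
    return Lambda @ Phi @ Lambda.T + Theta

# Hypothetical parameter values: x1, x2 load on factor 1; x3, x4 load on factor 2.
Lambda = np.array([[1.0, 0.0],
                   [0.8, 0.0],
                   [0.0, 1.0],
                   [0.0, 0.7]])
Phi = np.array([[1.0, -0.3],
                [-0.3, 1.0]])              # factor variances and covariance
Theta = np.diag([0.5, 0.4, 0.6, 0.5])      # error variances (diagonal)

Sigma = implied_covariance(Lambda, Phi, Theta)
print(np.round(Sigma, 3))
```

Indicators of the same factor covary through their loadings and the factor variance; indicators of different factors covary only through the factor covariance.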
30Confirmatory Factor Analysis: Example
31Confirmatory Factor Analysis: Model Identification
- Definition
- The set of parameters $\theta = \{\Lambda, \Phi, \Theta\}$ is not identified if there exist $\theta_1 \neq \theta_2$ such that $\Sigma(\theta_1) = \Sigma(\theta_2)$.
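32Confirmatory Factor Analysis: Is the one-factor, two-indicator model identified?
- Example: Measures for temperature $\xi$: x1 = Celsius, x2 = Fahrenheit
- Measurement Model
- $x_1 = \tau_1 + \lambda_1 \xi + \delta_1$, $x_2 = \tau_2 + \lambda_2 \xi + \delta_2$
- where $\tau_1$ and $\tau_2$ are measurement intercepts.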
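33Confirmatory Factor Analysis: Scale indeterminacy
- Recall measurement model
- $x_1 = \tau_1 + \lambda_1 \xi + \delta_1$
- $x_2 = \tau_2 + \lambda_2 \xi + \delta_2$
- Origin indeterminacy $\Rightarrow E(\xi) = 0$
- Scale (unit) indeterminacy
- How should single-indicator factors be handled?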
34Confirmatory Factor Analysis: The one-factor, two-indicator model is under-identified
- Population covariance matrix
- Implied covariance matrix
- Solution 1 Solution 2
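The two solutions shown on the slide are lost, so the values below are made up. The sketch assumes the first loading is fixed to 1 to set the scale. It shows two different parameter sets that imply exactly the same 2 × 2 covariance matrix, which is what under-identification means.

```python
import numpy as np

def implied_cov(lam2, phi, theta1, theta2):
    """One factor, two indicators, first loading fixed to 1."""
    lam = np.array([[1.0], [lam2]])
    return lam @ np.array([[phi]]) @ lam.T + np.diag([theta1, theta2])

# Two hypothetical parameter sets (not the ones from the original slide).
solution_1 = dict(lam2=1.0, phi=1.0, theta1=1.0, theta2=2.0)
solution_2 = dict(lam2=2.0, phi=0.5, theta1=1.5, theta2=1.0)

sigma_1 = implied_cov(**solution_1)
sigma_2 = implied_cov(**solution_2)
print(sigma_1)
print(sigma_2)
```

Both calls give the same matrix: three distinct moments cannot pin down four free parameters.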
35Confirmatory Factor Analysis: Is the one-factor, three-indicator model identified?
[Figure: path diagram of one factor with three indicators; loadings 1, $\lambda_{21}$, $\lambda_{31}$]
36Confirmatory Factor Analysis: The one-factor, three-indicator model is exactly identified
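Exact identification means the six distinct elements of the covariance matrix can be solved uniquely for the six free parameters. A minimal sketch, assuming the first loading is fixed to 1 and the errors are uncorrelated; the function name and the numeric values are illustrative.

```python
import numpy as np

def solve_one_factor_three_indicators(S):
    """Closed-form solution with lambda_1 = 1 and uncorrelated errors."""
    s21, s31, s32 = S[1, 0], S[2, 0], S[2, 1]
    phi = s21 * s31 / s32                          # factor variance
    lam = np.array([1.0, s32 / s31, s32 / s21])    # loadings
    theta = np.diag(S) - lam**2 * phi              # error variances
    return lam, phi, theta

# Build a covariance matrix from known (made-up) parameters, then recover them.
true_lam = np.array([1.0, 0.8, 0.6])
true_phi = 2.0
true_theta = np.array([0.5, 0.4, 0.3])
S = true_phi * np.outer(true_lam, true_lam) + np.diag(true_theta)

lam, phi, theta = solve_one_factor_three_indicators(S)
print(lam, phi, theta)
```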
37Confirmatory Factor Analysis: Identification Rules
- Number of free parameters $\le \tfrac{1}{2} q(q+1)$
- Three-Indicator Rule
- $n \ge 1$
- One non-zero element per row of $\Lambda$
- Three or more indicators per factor
- $\Theta$ diagonal
- Two-Indicator Rule
- $n > 1$
- $\phi_{ij} \neq 0$ for at least one pair $i, j$, $i \neq j$
- One non-zero element per row of $\Lambda$
- Two or more indicators per factor
- $\Theta$ diagonal
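The first rule (the counting rule) is easy to check by hand or in code. A minimal sketch, with a hypothetical helper name; it returns the degrees of freedom, and a negative value means the necessary condition fails.

```python
def counting_rule_df(n_indicators, n_free_parameters):
    """Distinct variances/covariances minus free parameters: q(q+1)/2 - t."""
    n_moments = n_indicators * (n_indicators + 1) // 2
    return n_moments - n_free_parameters

# One factor, first loading fixed to 1, uncorrelated errors:
two_indicator_df = counting_rule_df(2, 4)     # free: lambda_2, phi, theta_1, theta_2
three_indicator_df = counting_rule_df(3, 6)   # free: lambda_2, lambda_3, phi, 3 thetas
print(two_indicator_df, three_indicator_df)
```

The counting rule is necessary but not sufficient, which is why the three-indicator and two-indicator rules are also listed.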
38Confirmatory Factor Analysis: Maximum Likelihood Estimation
$x_i \overset{\text{i.i.d.}}{\sim} \mathrm{MVN}_q(0, \Sigma(\theta)), \quad i = 1, \dots, N$
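The fitting function is not preserved on the slide. Under the multivariate normal assumption above, the discrepancy that is usually minimized is $F_{ML} = \log|\Sigma(\theta)| + \mathrm{tr}(S\Sigma(\theta)^{-1}) - \log|S| - q$, where $S$ is the sample covariance matrix. A sketch of evaluating it for a candidate $\Sigma$ (the matrices are made up):

```python
import numpy as np

def f_ml(S, Sigma):
    """ML discrepancy between sample covariance S and model-implied Sigma."""
    q = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_s - q

S = np.array([[2.0, 1.0],
              [1.0, 3.0]])
Sigma_independence = np.diag([2.0, 3.0])   # a model that ignores the covariance

perfect = f_ml(S, S)                       # zero when the model reproduces S exactly
misfit = f_ml(S, Sigma_independence)       # positive when it does not
print(perfect, misfit)
```

In practice an optimizer searches over $\theta$ to minimize this value; the sketch only evaluates it at fixed matrices.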
39Confirmatory Factor Analysis: Other Estimation Methods
- Unweighted Least Squares
- Generalized Least Squares
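The slide lists the estimators without their fitting functions. The commonly used forms are $F_{ULS} = \tfrac{1}{2}\mathrm{tr}[(S - \Sigma(\theta))^2]$ and $F_{GLS} = \tfrac{1}{2}\mathrm{tr}[(I - \Sigma(\theta)S^{-1})^2]$. A sketch evaluating both on made-up matrices:

```python
import numpy as np

def f_uls(S, Sigma):
    """Unweighted least squares: half the sum of squared residual elements."""
    R = S - Sigma
    return 0.5 * np.trace(R @ R)

def f_gls(S, Sigma):
    """Generalized least squares: residuals weighted by the inverse of S."""
    q = S.shape[0]
    M = np.eye(q) - Sigma @ np.linalg.inv(S)
    return 0.5 * np.trace(M @ M)

S = np.array([[2.0, 1.0],
              [1.0, 3.0]])
Sigma = np.diag([2.0, 3.0])   # hypothetical model-implied matrix

uls_value = f_uls(S, Sigma)
gls_value = f_gls(S, Sigma)
print(uls_value, gls_value)
```

ULS depends on the scale of the variables, while GLS rescales the residuals by the sample covariance matrix.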
40Confirmatory Factor Analysis: The Asymptotic Covariance Matrix
Information Matrix
41Confirmatory Factor Analysis: Goodness-of-fit measures
Root Mean-Square Residual
Correlation Residuals
Goodness-of-Fit Index
Communalities/Reliabilities
Coefficient of Determination
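The formulas for these measures are missing from the slide. As a sketch, two of them can be computed from the sample matrix $S$ and the fitted matrix $\hat\Sigma$ using commonly cited definitions: the root mean-square residual averages the squared residuals over the $q(q+1)/2$ distinct elements, and the ML goodness-of-fit index is $1 - \mathrm{tr}[(\hat\Sigma^{-1}S - I)^2] / \mathrm{tr}[(\hat\Sigma^{-1}S)^2]$. The example matrices are made up.

```python
import numpy as np

def rmr(S, Sigma):
    """Root mean-square residual over the distinct (lower-triangle) elements."""
    q = S.shape[0]
    rows, cols = np.tril_indices(q)
    resid = (S - Sigma)[rows, cols]
    return np.sqrt(2.0 * np.sum(resid**2) / (q * (q + 1)))

def gfi_ml(S, Sigma):
    """Goodness-of-fit index in its ML form."""
    q = S.shape[0]
    A = np.linalg.inv(Sigma) @ S
    D = A - np.eye(q)
    return 1.0 - np.trace(D @ D) / np.trace(A @ A)

S = np.array([[2.0, 1.0],
              [1.0, 3.0]])
Sigma_hat = np.diag([2.0, 3.0])   # hypothetical fitted matrix

rmr_value = rmr(S, Sigma_hat)
gfi_value = gfi_ml(S, Sigma_hat)
print(rmr_value, gfi_value)
```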
42Confirmatory Factor Analysis: Goodness-of-fit measures
43Confirmatory Factor Analysis: Other Goodness-of-fit indices
- Root Mean Square Error of Approximation (RMSEA)
- where $df = q(q+1)/2 - t$ (degrees of freedom).
- RMSEA $\le 0.05 \Rightarrow$ Close fit
- $0.05 <$ RMSEA $\le 0.08 \Rightarrow$ Reasonable fit
- RMSEA $> 0.1 \Rightarrow$ Poor fit
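The RMSEA formula itself is not preserved. One widely used form is $\sqrt{\max(\chi^2 - df, 0) / (df\,(N-1))}$; some programs divide by $N$ instead of $N-1$, so treat the sketch below as one common variant rather than the slide's exact definition. The labels follow the cutoffs listed above; the range between 0.08 and 0.1 is not labeled on the slide.

```python
import math

def rmsea(chi_square, df, n_obs):
    """RMSEA from the model chi-square, its degrees of freedom, and sample size."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n_obs - 1)))

def describe_fit(value):
    """Map an RMSEA value to the verbal labels used on the slide."""
    if value <= 0.05:
        return "close fit"
    if value <= 0.08:
        return "reasonable fit"
    if value > 0.10:
        return "poor fit"
    return "between reasonable and poor (not labeled on the slide)"

example = rmsea(chi_square=36.0, df=20, n_obs=201)   # made-up fit results
print(round(example, 4), describe_fit(example))
```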
44Confirmatory Factor Analysis: Multitrait-Multimethod Example
[Figure: diagram of the correlations among x1, x2, x3, and x4]
45Confirmatory Factor Analysis: Multitrait-Multimethod Example
[Figure: path diagram with indicators x1, x2, x3, and x4]
46Brand Halos and Brand Evaluations: Lynd Bacon (1999)
[Figure: path diagram with Performance, Quality, DirtyScooter, and TrailBomber, and indicators Pd1, Pd2, Pt1, Pt2, Qd1, Qd2, Qt1, Qt2]
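47Brand Halos and Brand Evaluations: Sources of Variance
- Sources of variance per indicator: Brand, Attribute
- DirtyScooter
- Pd1: 0.71 (Brand), 0.04 (Attribute)
- Pd2: 0.74 (Brand), 0.02 (Attribute)
- TrailBomber
- Pt1: 0.40 (Brand), 0.39 (Attribute)
- Pt2: 0.41 (Brand), 0.30 (Attribute)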
48Convergent and Discriminant Validity: Bagozzi and Yi (1993)
- Attitude towards coupons ($\xi_1$) with three semantic differential measures
- x1 = pleasant/unpleasant
- x2 = good/bad
- x3 = favorable/unfavorable
- Subjective norms ($\xi_2$) with two measures
- x4 = Most people who are important to me think I definitely should use coupons for shopping in the supermarket
- x5 = Most people who are important to me probably consider my use of coupons to be wise.
49Convergent and Discriminant Validity: Bagozzi and Yi (1993)
[Figure: path diagram with estimates; only the labels x2, .82, and .33 survive]
50Convergent and Discriminant Validity: Bagozzi and Yi (1993)
- Convergent validity
- - Goodness-of-fit
- - All loadings are high and significant
- Discriminant validity: $H_0\colon \phi = 1$ is rejected
- Measurement reliability (x1 = .56, x2 = .67, x3 = .53, x4 = .48, x5 = .81)
51The Full Structural Equation Model: Measurement Model
- $x = \Lambda_x \xi + \delta$, $y = \Lambda_y \eta + \varepsilon$
- Where
- $x$: $(q \times 1)$ vector of exogenous indicator/manifest variables
- $y$: $(p \times 1)$ vector of endogenous indicator/manifest variables
- $\xi$: $(n \times 1)$ vector of exogenous latent constructs with mean 0 and variance $\Phi$
- $\eta$: $(m \times 1)$ vector of endogenous latent constructs
- $\delta$: $(q \times 1)$ vector of errors of measurement with mean 0 and variance $\Theta_{\delta}$
- $\varepsilon$: $(p \times 1)$ vector of errors of measurement with mean 0 and variance $\Theta_{\varepsilon}$
- $\Lambda_x$: $(q \times n)$ matrix of factor loadings
- $\Lambda_y$: $(p \times m)$ matrix of factor loadings
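52The Full Structural Equation Model: Structural Model
- $\eta = B\eta + \Gamma\xi + \zeta$
- where
- $B$: $(m \times m)$ coefficient matrix for the effect of $\eta$ on $\eta$
- $\Gamma$: $(m \times n)$ coefficient matrix for the effect of $\xi$ on $\eta$
- $\zeta$: $(m \times 1)$ vector of errors, $E(\zeta) = 0$, $\mathrm{Cov}(\zeta, \zeta) = \Psi$, $\mathrm{Cov}(\zeta, \xi) = 0$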
53The Full Structural Equation Model: The Implied Covariance Matrix
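The matrix on this slide is lost. In the standard notation ($\Lambda_y$, $\Lambda_x$, $B$, $\Gamma$, $\Phi$, $\Psi$, $\Theta_\varepsilon$, $\Theta_\delta$) the implied covariance matrix of $(y, x)$ has three blocks: $\Sigma_{yy} = \Lambda_y (I-B)^{-1}(\Gamma\Phi\Gamma' + \Psi)(I-B)^{-1\prime}\Lambda_y' + \Theta_\varepsilon$, $\Sigma_{yx} = \Lambda_y (I-B)^{-1}\Gamma\Phi\Lambda_x'$, and $\Sigma_{xx} = \Lambda_x\Phi\Lambda_x' + \Theta_\delta$. The sketch below assembles them for a tiny model with made-up values (one exogenous and one endogenous factor, two indicators each).

```python
import numpy as np

def sem_implied_covariance(Lambda_y, Lambda_x, B, Gamma, Phi, Psi, Theta_eps, Theta_delta):
    """Implied covariance of the stacked vector (y, x) for the full SEM."""
    m = B.shape[0]
    A = np.linalg.inv(np.eye(m) - B)                      # (I - B)^{-1}
    cov_eta = A @ (Gamma @ Phi @ Gamma.T + Psi) @ A.T     # Cov(eta)
    Sigma_yy = Lambda_y @ cov_eta @ Lambda_y.T + Theta_eps
    Sigma_yx = Lambda_y @ A @ Gamma @ Phi @ Lambda_x.T
    Sigma_xx = Lambda_x @ Phi @ Lambda_x.T + Theta_delta
    return np.block([[Sigma_yy, Sigma_yx],
                     [Sigma_yx.T, Sigma_xx]])

# Hypothetical parameter values for illustration only.
Sigma = sem_implied_covariance(
    Lambda_y=np.array([[1.0], [0.5]]),
    Lambda_x=np.array([[1.0], [2.0]]),
    B=np.array([[0.0]]),
    Gamma=np.array([[0.5]]),
    Phi=np.array([[2.0]]),
    Psi=np.array([[1.0]]),
    Theta_eps=np.diag([0.3, 0.3]),
    Theta_delta=np.diag([0.2, 0.2]),
)
print(np.round(Sigma, 3))
```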
54The Full Structural Equation Model: Identification
- Number of parameters $\le (p+q)(p+q+1)/2$
- Two-Step Rule
- - Measurement Model Identification
- - Structural Model Identification
55The Full Structural Equation Model: Structural Model Identification
- Null B Rule ($B = 0$)
- Recursive Rule
- - $B$ triangular
- - $\Psi$ diagonal
- Order Condition
- - ith equation is identified if the number of variables excluded from the ith equation is $\ge m - 1$
- Rank Condition
- - Form $C = [(I - B) \mid -\Gamma]$
- - ith equation is identified if the rank of $C_i$ is $m - 1$, where $C_i$ is formed from those columns of $C$ that have 0 in the ith row.
56The Full Structural Equation Model: Structural Model Identification Example
[Figure: path diagram of the two-equation example]
57The Full Structural Equation Model: Structural Model Identification Example
- Form $C = [(I - B) \mid -\Gamma]$
- Rank of $C_1$ is $m - 1 = 2 - 1 = 1$
- Rank of $C_2$ is $m - 1 = 2 - 1 = 1$
- Both equations are identified
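The rank condition can be checked numerically. The sketch below uses a hypothetical two-equation nonrecursive model (not necessarily the one drawn on the slide): each latent outcome affects the other, and each has its own exogenous predictor. Arbitrary non-zero numbers stand in for the free parameters, and exact zeros mark the excluded paths.

```python
import numpy as np

def rank_condition(B, Gamma):
    """For each equation i, test whether rank(C_i) == m - 1, with C = [(I - B) | -Gamma]."""
    m = B.shape[0]
    C = np.hstack([np.eye(m) - B, -Gamma])
    results = []
    for i in range(m):
        zero_cols = np.isclose(C[i], 0.0)          # variables excluded from equation i
        C_i = C[:, zero_cols]
        rank = np.linalg.matrix_rank(C_i) if C_i.size else 0
        results.append(bool(rank == m - 1))
    return results

# Hypothetical model: eta1 <- eta2, xi1 and eta2 <- eta1, xi2.
B = np.array([[0.0, 0.4],
              [0.3, 0.0]])
Gamma = np.array([[0.5, 0.0],
                  [0.0, 0.7]])
identified = rank_condition(B, Gamma)

# If xi2 also predicts eta1, equation 1 excludes nothing and fails the condition.
Gamma_full_row = np.array([[0.5, 0.2],
                           [0.0, 0.7]])
not_all_identified = rank_condition(B, Gamma_full_row)
print(identified, not_all_identified)
```

Because the check substitutes numbers for free parameters, it is a screening aid under those assumed values rather than a proof of identification.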
58Construct Validation by Use of Panel Model: Bagozzi and Yi (1993)
59Construct Validation by Use of Panel Model: Bagozzi and Yi (1993)
- $\phi_{31}$ and $\phi_{42}$ capture temporal stability
- $\phi_{21}$ and $\phi_{43}$ reflect discriminant validity
- Convergent validity is assessed by overall model fit and by the magnitude and significance of the factor loadings
- The covariance between two serially correlated errors is a measure of specific variance
60Construct Validation by Use of Panel Model: Bagozzi and Yi (1993)
61Construct Validation by Use of Panel Model: Bagozzi and Yi (1993)
- Convergent validity
- - Goodness-of-fit
- - All loadings are high and significant
- - Factorial invariance holds
- Discriminant validity: $H_0\colon \phi_{21} = 1$ and $\phi_{43} = 1$ are rejected
- Temporal stability and