Title: Challenges posed by Structural Equation Models
1 Challenges posed by Structural Equation Models
- Thomas Richardson
- Department of Statistics
- University of Washington
Joint work with Mathias Drton (UC Berkeley) and Peter Spirtes (CMU)
2 Overview
- Challenges for Likelihood Inference
- Problems in Model Selection and Interpretation
- Partial Solution
  - sub-class of path diagrams: ancestral graphs
3 Problems for Likelihood Inference
- Likelihood may be multimodal
  - e.g. the bivariate Gaussian Seemingly Unrelated Regression (SUR) model may have up to 3 local maxima.
- A consistent starting value does not guarantee that iterative procedures will find the MLE.
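The dependence on the starting value can be sketched with a toy example. The objective below is a hypothetical one-dimensional bimodal function standing in for a log-likelihood; it is not the SUR likelihood, and the step size and starting values are arbitrary choices made for illustration.

```python
def loglik(x):
    """Toy bimodal objective standing in for a log-likelihood (NOT the SUR likelihood)."""
    return -(x**2 - 1.0) ** 2 + 0.3 * x


def score(x):
    """Derivative of the toy objective."""
    return -4.0 * x * (x**2 - 1.0) + 0.3


def ascend(x0, step=0.05, iters=2000):
    """Plain gradient ascent from the starting value x0."""
    x = x0
    for _ in range(iters):
        x += step * score(x)
    return x


starts = [-1.5, -0.5, 0.5, 1.5]
modes = [ascend(x0) for x0 in starts]
for x0, m in zip(starts, modes):
    print(f"start {x0:+.1f} -> stationary point {m:+.3f}, objective {loglik(m):+.3f}")

# Only a comparison across several starts reveals which mode is the higher one.
best = max(modes, key=loglik)
```

Runs started on the left converge to the lower local maximum, so a single run reports a mode without indicating whether it is the global one.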
4 Problems for Likelihood Inference
- Discrete latent variable models are not curved exponential families
[Figure: ternary latent class variable C with binary observed variables X1, X2, X3, X4]
- 15 parameters in saturated model, 14 model parameters, BUT model has 2 d.f. (Goodman)
- Usual asymptotics may not apply
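The parameter counts on this slide can be checked numerically. The sketch below counts parameters for a 3-class model with 4 binary indicators and computes the rank of the Jacobian of the map from the 14 parameters to the 16 cell probabilities. The parameter values are arbitrary (an assumption made for illustration), and reading the 2 d.f. as 15 minus the Jacobian rank is an interpretation of the slide, not something it states.

```python
import itertools

import numpy as np

K, J = 3, 4  # latent classes, binary observed variables

n_saturated = 2**J - 1     # free cell probabilities
n_model = (K - 1) + K * J  # class weights plus conditional probabilities


def jacobian(pi, theta):
    """Jacobian of the 2**J cell probabilities w.r.t. the free parameters.

    pi: class probabilities (the last one is determined by the others).
    theta[c, j] = P(X_j = 1 | C = c).
    """
    rows = []
    for cell in itertools.product([0, 1], repeat=J):
        x = np.array(cell)
        # factors[c, j] = P(X_j = x_j | C = c)
        factors = np.where(x == 1, theta, 1.0 - theta)
        f = factors.prod(axis=1)  # P(x | C = c)
        d_pi = [f[c] - f[K - 1] for c in range(K - 1)]
        d_theta = [
            pi[c] * (2 * x[j] - 1) * np.delete(factors[c], j).prod()
            for c in range(K)
            for j in range(J)
        ]
        rows.append(d_pi + d_theta)
    return np.array(rows)


rng = np.random.default_rng(0)
pi = np.array([0.5, 0.3, 0.2])              # arbitrary class weights
theta = rng.uniform(0.2, 0.8, size=(K, J))  # arbitrary conditional probabilities
rank = np.linalg.matrix_rank(jacobian(pi, theta), tol=1e-9)

print("saturated:", n_saturated, "model parameters:", n_model)
print("Jacobian rank:", rank, "residual d.f.:", n_saturated - rank)
```

A rank below the nominal parameter count means the parameters are not locally identified at that point, which is why naive parameter counting gives the wrong d.f.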
5 Problems for Likelihood Inference
- Likelihood may be highly multimodal in the asymptotic limit
  - After accounting for label switching/aliasing
- d.f. may vary as a function of model parameters
[Figure: latent class variable C with observed variables X1, X2, X3, X4]
Why report one mode?
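Label switching, which the slide sets aside before counting modes, can be sketched as follows. The class weights and conditional probabilities are hypothetical values chosen for illustration; the point is only that permuting the latent class labels leaves the distribution of the observed variables unchanged.

```python
import itertools

import numpy as np


def joint(pi, theta):
    """Cell probabilities P(x) of a latent class model with binary indicators."""
    n_items = theta.shape[1]
    probs = []
    for cell in itertools.product([0, 1], repeat=n_items):
        # factors[c, j] = P(X_j = x_j | C = c)
        factors = np.where(np.array(cell) == 1, theta, 1.0 - theta)
        probs.append(float(pi @ factors.prod(axis=1)))
    return np.array(probs)


# Hypothetical parameter values for a 3-class, 4-indicator model.
pi = np.array([0.5, 0.3, 0.2])
theta = np.array([[0.9, 0.8, 0.7, 0.6],
                  [0.2, 0.3, 0.4, 0.5],
                  [0.5, 0.1, 0.9, 0.3]])

base = joint(pi, theta)
relabelled = [
    joint(pi[list(perm)], theta[list(perm)])
    for perm in itertools.permutations(range(3))
]
same = all(np.allclose(base, q) for q in relabelled)
print("relabellings:", len(relabelled), "all give the same distribution:", same)
```

Every parameter point therefore has relabelled copies with identical likelihood. The slide's claim concerns modes that remain distinct after these copies are identified.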
6 Problems for Model Selection
- SEM models with latent variables are not curved exponential families
- Standard χ² asymptotics do not necessarily apply, e.g. for LRTs
- Model selection criteria such as BIC are not asymptotically consistent
- The effective degrees of freedom may vary depending on the values of the model parameters
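For reference, the standard forms of the two quantities mentioned above, as they hold under the usual regularity conditions. The symbols are introduced here for illustration and do not come from the slides: d_M is the dimension of model M, n is the sample size, and M_0 is nested in M_1.

```latex
\mathrm{BIC}(M) = -2 \log L(\hat{\theta}_M) + d_M \log n,
\qquad
-2 \log \frac{L(\hat{\theta}_{M_0})}{L(\hat{\theta}_{M_1})}
  \;\xrightarrow{d}\; \chi^2_{\,d_{M_1} - d_{M_0}}
```

Both expressions presume a single well-defined dimension d_M. If the effective degrees of freedom depend on the parameter values, neither the penalty nor the reference distribution is pinned down.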
7 Problems for Model Selection
- Many models may be equivalent
[Figure: three path diagrams over X1, Y1, X2, Y2]
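The simplest instance of model equivalence can be sketched with two variables. This is a hypothetical two-variable Gaussian example, not the four-variable diagrams on the slide, whose edges are not preserved here. The linear models X -> Y and Y -> X both reproduce any given covariance matrix exactly, so the data cannot distinguish them.

```python
import numpy as np


def fit_forward(S):
    """Parameters of X -> Y: Var(X), slope of Y on X, residual variance of Y."""
    var_x = S[0, 0]
    slope = S[0, 1] / var_x
    return var_x, slope, S[1, 1] - slope * S[0, 1]


def implied_forward(var_x, slope, resid):
    """Covariance matrix of (X, Y) implied by the model X -> Y."""
    return np.array([[var_x, slope * var_x],
                     [slope * var_x, slope**2 * var_x + resid]])


# Arbitrary positive definite covariance matrix, used only for illustration.
S = np.array([[2.0, 1.2],
              [1.2, 3.0]])

# X -> Y is fitted directly; Y -> X is fitted by swapping the variables' roles.
sigma_xy = implied_forward(*fit_forward(S))
swap = np.array([[0, 1], [1, 0]])
sigma_yx = swap @ implied_forward(*fit_forward(swap @ S @ swap)) @ swap

print(np.allclose(sigma_xy, S), np.allclose(sigma_yx, S))
```

Both directions fit perfectly, yet they support opposite substantive readings of which variable influences the other.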
9 Problems for Model Selection
- Models with different numbers of latents may be equivalent
  - e.g. unrestricted error covariance within blocks
[Figure: two path diagrams relating X1, ..., Xp to Y1, ..., Yq (additional nodes ω, ξ in the first; ψ in the second)]
Wegelin & Richardson (2001)
11 Two scenarios
- A single SEM model is proposed and fitted. The results are reported.
- The researcher fits a sequence of models, making modifications to an original specification.
- Model equivalence implies
  - Final model depends on initial model chosen
  - Sequence of changes is often ad hoc
  - Equivalent models may lead to very different substantive conclusions
- Often many equivalence classes of models give reasonable fit. Why report just one?
12 Partial Solution
- Embed each latent variable model in a larger model without latent variables characterized by conditional independence restrictions.
- We ignore non-independence constraints and inequality constraints.
[Figure: sets of distributions; the latent variable model lies inside the model imposing only independence constraints on observed variables]
13 The Generating graph
- Begin with a graph, and associated set of independences
[Figure: Toy Example: graph G over t, a, b, c, d, with a list of its independences (and others)]
14 Marginalization
- Suppose now that some variables are unobserved
- Find the independence relations involving only the observed variables
[Figure: Toy Example: graph G over t, a, b, c, d with t hidden. Unobserved independencies in red (those relating a and t; d and t; b, c and t); others remain]
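The marginalization step can be sketched in code. The edges of the toy graph are not preserved here, so the DAG below (a -> b <- t -> c <- d, with t hidden) is an assumption made for illustration. The sketch tests d-separation with the moralized ancestral graph criterion and lists every independence statement that involves only observed variables.

```python
from itertools import combinations

# Hypothetical DAG (an assumption): a -> b <- t -> c <- d, stored as parent sets.
PARENTS = {"a": set(), "d": set(), "t": set(), "b": {"a", "t"}, "c": {"t", "d"}}


def ancestors(nodes, parents):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(x, y, z, parents):
    """True if x and y are d-separated given the set z."""
    keep = ancestors({x, y} | set(z), parents)
    # Moralize the ancestral subgraph: link each child to its parents and
    # link every pair of parents that share a child.
    nbrs = {v: set() for v in keep}
    for child in keep:
        for p in parents[child]:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for p, q in combinations(parents[child], 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # Separated iff no path from x to y avoids z in the moral graph.
    seen, stack = {x}, [x]
    while stack:
        v = stack.pop()
        for w in nbrs[v] - set(z):
            if w == y:
                return False
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return True


def observed_independences(parents, hidden):
    """All triples (x, y, z) over observed variables with x indep. of y given z."""
    observed = sorted(set(parents) - set(hidden))
    found = []
    for x, y in combinations(observed, 2):
        rest = [v for v in observed if v not in (x, y)]
        for k in range(len(rest) + 1):
            for z in combinations(rest, k):
                if d_separated(x, y, z, parents):
                    found.append((x, y, z))
    return found


for x, y, z in observed_independences(PARENTS, hidden={"t"}):
    print(f"{x} indep. {y} given {set(z) or '{}'}")
```

Under the assumed DAG, b and c are separated only by conditioning on the hidden t, so no statement about b and c survives among the observed variables.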
16 Graphical Marginalization
- Now construct a graph that represents the conditional independence relations among the observed variables.
- Bi-directed edges are required.
[Figure: Toy Example: graph G over t, a, b, c, d and a second graph over a, b, c, d; the second graph represents all and only the distributions in which these independencies hold]
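A simplified sketch of how bi-directed edges arise. It assumes a hypothetical DAG a -> b <- t -> c <- d (the slide's edges are not preserved here) and handles only the special case where each hidden variable has no parents. The general construction of an ancestral graph is more involved, and the output of this shortcut need not be ancestral when one child of a hidden variable is an ancestor of another.

```python
from itertools import combinations

# Hypothetical DAG (an assumption): a -> b <- t -> c <- d, stored as parent sets.
PARENTS = {"a": set(), "d": set(), "t": set(), "b": {"a", "t"}, "c": {"t", "d"}}


def project_out_root_latents(parents, hidden):
    """Return (directed, bidirected) edge sets over the observed variables.

    Simplifying assumption: every hidden variable is parentless, so it acts
    purely as a common cause of its (observed) children.
    """
    hidden = set(hidden)
    if any(parents[h] for h in hidden):
        raise ValueError("this sketch handles parentless hidden variables only")
    # Directed edges between observed variables are kept unchanged.
    directed = {
        (p, v)
        for v, ps in parents.items()
        if v not in hidden
        for p in ps
        if p not in hidden
    }
    # Each pair of children of a hidden variable gets a bi-directed edge.
    bidirected = set()
    for h in hidden:
        children = sorted(v for v, ps in parents.items() if h in ps)
        bidirected.update(combinations(children, 2))
    return directed, bidirected


directed, bidirected = project_out_root_latents(PARENTS, hidden={"t"})
print("directed:", sorted(directed))
print("bi-directed:", sorted(bidirected))
```

For the assumed DAG the result is a -> b <-> c <- d: the hidden common cause t is replaced by a bi-directed edge between its children.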
17 Equivalence re-visited
- Restrict model class to path diagrams including only observed variables characterized by conditional independence
  - Ancestral Graph Markov models
- For such models we can
  - Determine the entire class of equivalent models
  - Identify which features they have in common
- Models are curved exponential families, so the usual asymptotics do apply
18 Ancestral Graph
[Figure: graph over T, A, B, C, D and the corresponding ancestral graph over A, B, C, D]
19 Equivalent ancestral graphs
[Figure: graphs over A, B, C, D with additional variables (T; U, V), mapped (⇒) to ancestral graphs over A, B, C, D]
20 Equivalent ancestral graphs
[Figure: Markov equivalence class of graphs with latent variables (T; U, V; R, P, Q) over A, B, C, D, mapped (⇒) to ancestral graphs over A, B, C, D]
21 Equivalence Classes
[Figure: Markov equivalence class of graphs with latent variables (T; U, V; R, P, Q; R, M, N; L; infinitely many others) over A, B, C, D, mapped (⇒) to equivalent ancestral graphs over A, B, C, D]
22 Equivalence class of Ancestral Graphs
[Figure: Markov equivalence class of graphs with latent variables (T; U, V; R, P, Q; R, M, N; L; infinitely many others) and the equivalent ancestral graphs over A, B, C, D, summarized (⇓) by a Partial Ancestral Graph over A, B, C, D]
24 Measurement models
- If we have pure measurement models with several indicators per latent
  - May apply similar search methods among the latent variables (Spirtes et al. 2001; Silva et al. 2003)
25 Other Related Work
- Iterative ML estimation methods exist
  - Guaranteed convergence
  - Multimodality is still possible
  - Implemented in R package ggm (Drton & Marchetti, 2003)
- Current work
  - Extension to discrete data
    - Parameterization and ML fitting for binary bi-directed graphs already exist
  - Implementing search procedures in R
26 References
- Richardson, T., Spirtes, P. (2002) Ancestral graph Markov models. Ann. Stat., 30, 962-1030.
- Richardson, T. (2003) Markov properties for acyclic directed mixed graphs. Scand. J. Statist., 30(1), 145-157.
- Drton, M., Richardson, T. (2003) A new algorithm for maximum likelihood estimation in Gaussian graphical models for marginal independence. UAI 03, 184-191.
- Drton, M., Richardson, T. (2004) Iterative conditional fitting in Gaussian ancestral graph models. UAI 04, 130-137.
- Drton, M., Richardson, T. (2004) Multimodality of the likelihood in the bivariate seemingly unrelated regressions model. Biometrika, 91(2), 383-392.
- Marchetti, G., Drton, M. (2003) ggm package. Available from http://cran.r-project.org