Title: General Structural Equation (LISREL) Models
1General Structural Equation (LISREL) Models
2Today's class
- Solution to covariance algebra homework
- Brief discussion on scaling and reference indicators
- IDENTIFICATION of models (major topic for today)
- A look at AMOS model (homework)
3Solutions to covariance algebra exercise
COV(X1,X2) COV(X3,X4) VAR(X6)
4COV(X1,X2)
Relevant equations:
X1 = LV1 + e1
X2 = b1 LV1 + e2
COV(X1,X2) = COV(LV1 + e1, b1 LV1 + e2)
= COV(LV1, b1 LV1) + COV(e1, b1 LV1) + COV(LV1, e2) + COV(e1, e2)
= b1 COV(LV1,LV1)
= b1 VAR(LV1)
5COV(X3,X4)
Relevant equations:
X3 = b3 LV1 + e3
X4 = b4 LV2 + e4
LV2 = b6 LV1 + d1
COV(X3,X4) = COV(b3 LV1 + e3, b4 LV2 + e4)
= COV(b3 LV1 + e3, b4 b6 LV1 + b4 d1 + e4)
= COV(b3 LV1, b4 b6 LV1) + COV(b3 LV1, b4 d1) + COV(b3 LV1, e4) + COV(e3, b4 b6 LV1) + COV(e3, b4 d1) + COV(e3, e4)
(every term after the first equals 0; these were marked in orange on the slide)
= b3 b4 b6 COV(LV1,LV1)
= b3 b4 b6 VAR(LV1)
6VAR(X6)
Relevant equations:
X6 = LV2 + e6
LV2 = b6 LV1 + d1
VAR(X6) = COV(X6,X6) = COV(LV2 + e6, LV2 + e6)
= COV(b6 LV1 + d1 + e6, b6 LV1 + d1 + e6)
= COV(b6 LV1, b6 LV1) + COV(d1,d1) + COV(e6,e6) (all other terms drop out!)
= b6² VAR(LV1) + VAR(d1) + VAR(e6)
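The three covariance-algebra results can be checked numerically. The following is a minimal sketch, assuming that LV1 and all error and disturbance terms are mutually uncorrelated; the helper `cov` and every parameter value are hypothetical and chosen only for illustration.

```python
# Each variable is written as a dict {source: coefficient}, where the
# sources (LV1, d1, e1, ...) are assumed to be mutually uncorrelated.

def cov(u, v, variances):
    """Covariance of two linear combinations of uncorrelated sources."""
    return sum(coef * v[src] * variances[src]
               for src, coef in u.items() if src in v)

# Hypothetical parameter values (not taken from the slides).
b1, b3, b4, b6 = 0.8, 0.7, 0.9, 0.5
variances = {"LV1": 2.0, "d1": 0.6, "e1": 0.3, "e2": 0.4,
             "e3": 0.5, "e4": 0.2, "e6": 0.1}

LV2 = {"LV1": b6, "d1": 1.0}                       # LV2 = b6 LV1 + d1
X1 = {"LV1": 1.0, "e1": 1.0}                       # X1 = LV1 + e1
X2 = {"LV1": b1, "e2": 1.0}                        # X2 = b1 LV1 + e2
X3 = {"LV1": b3, "e3": 1.0}                        # X3 = b3 LV1 + e3
X4 = {**{s: b4 * c for s, c in LV2.items()}, "e4": 1.0}  # X4 = b4 LV2 + e4
X6 = {**LV2, "e6": 1.0}                            # X6 = LV2 + e6

cov_x1_x2 = cov(X1, X2, variances)   # should equal b1 VAR(LV1)
cov_x3_x4 = cov(X3, X4, variances)   # should equal b3 b4 b6 VAR(LV1)
var_x6 = cov(X6, X6, variances)      # should equal b6^2 VAR(LV1) + VAR(d1) + VAR(e6)

print(cov_x1_x2, cov_x3_x4, var_x6)
```

Terms that involve two different sources never enter the sum, which is the code equivalent of "all other terms drop out".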
7REPRODUCED COVARIANCES
What have we been doing with our covariance exercises, and why?
Given model parameters:
a) structural equation coefficients
b) variances of exogenous variables (LVs, manifest vars that are completely exogenous, errors)
c) covariances among exogenous variables
8Given model parameters:
a) structural equation coefficients
b) variances of exogenous variables (LVs, manifest vars that are completely exogenous, errors)
c) covariances among exogenous variables
We can estimate the covariances of the observed variables:
Σ (SIGMA; more properly Σ-hat, "Sigma hat") = reproduced covariances = implied covariances
9Given model parameters
We can estimate the covariances of the observed variables: Σ(θ)-hat (we will use the symbol Σ)
Observed covariances: S
S - Σ: we seek to find a set of parameter estimates which minimizes this
Fit function: an expression of how far S is from Σ
ML fit function: F_ML = log|Σ| + tr(S Σ⁻¹) - log|S| - q, where q = no. of manifest vars
Note that when S = Σ exactly, F_ML = 0
A test of significance (χ²) is based on F_ML: χ² = (N-1) F_ML
10Fit function: an expression of how far S is from Σ
ML fit function: F_ML = log|Σ| + tr(S Σ⁻¹) - log|S| - q, where q = no. of manifest vars
Simplest case: one-parameter model
[Figure: fit function plotted against b1, with the solution marked]
Iterative methods required: guess b1, then take the derivative of F with respect to b1 to determine whether the solution has a higher or lower value of b1.
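The guess-and-check-the-derivative idea can be sketched in a few lines. This is an illustrative toy, not what AMOS or LISREL actually do internally. It assumes a hypothetical two-variable model (X1 = LV1 + e1, X2 = b1 LV1 + e2) in which VAR(LV1), VAR(e1) and VAR(e2) are treated as known, so that b1 is the only free parameter. The "observed" matrix S is generated from b1 = 0.8, so the search should return to that value.

```python
import math

# Assumed known (fixed) variances, so that b1 is the only free parameter.
VAR_LV1, VAR_E1, VAR_E2 = 1.0, 0.5, 0.5

def implied(b1):
    """Model-implied covariance matrix Sigma for X1 = LV1 + e1, X2 = b1 LV1 + e2."""
    return [[VAR_LV1 + VAR_E1, b1 * VAR_LV1],
            [b1 * VAR_LV1, b1 * b1 * VAR_LV1 + VAR_E2]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def f_ml(S, b1):
    """ML fit function log|Sigma| + tr(S Sigma^-1) - log|S| - q for q = 2."""
    sigma = implied(b1)
    d = det2(sigma)
    inv = [[sigma[1][1] / d, -sigma[0][1] / d],
           [-sigma[1][0] / d, sigma[0][0] / d]]
    trace = sum(S[i][j] * inv[j][i] for i in range(2) for j in range(2))
    return math.log(d) + trace - math.log(det2(S)) - 2

def estimate_b1(S, guess=0.2, step=0.5, tol=1e-6, h=1e-6):
    """Move b1 against the sign of dF/db1; halve the step when F stops improving."""
    b1 = guess
    while step > tol:
        slope = (f_ml(S, b1 + h) - f_ml(S, b1 - h)) / (2 * h)  # numerical derivative
        candidate = b1 - step if slope > 0 else b1 + step
        if f_ml(S, candidate) < f_ml(S, b1):
            b1 = candidate
        else:
            step /= 2
    return b1

S = implied(0.8)            # "observed" covariances generated from b1 = 0.8
b1_hat = estimate_b1(S)
print(b1_hat, f_ml(S, b1_hat))
```

A positive slope means the solution lies at a lower value of b1, a negative slope at a higher value. Real programs use more efficient update rules, but the logic is the same.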
11Degrees of freedom in a model
- Number of observed variances/covariances
- minus the number of parameters
- 4 manifest variables: k = 4
- (k+1)(k)/2 variances/covariances = 10
- 8 manifest variables: (8+1)(8)/2 = 36
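The counting rule is easy to wrap in a helper. The sketch below uses hypothetical function names; the example of 4 manifest variables with 8 free parameters is only an illustration of the arithmetic.

```python
def n_moments(k):
    """Number of distinct variances and covariances among k manifest variables."""
    return k * (k + 1) // 2

def degrees_of_freedom(k, n_free_parameters):
    """Observed variances/covariances minus the number of free parameters."""
    return n_moments(k) - n_free_parameters

print(n_moments(4))               # 10
print(n_moments(8))               # 36
print(degrees_of_freedom(4, 8))   # 2
```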
12# of parameters
VAR(X1), VAR(X2), VAR(e3), VAR(e4)
b1, b2, b3
COV(X1,X2)
Total: 8
Number of degrees of freedom in this model: 2
Some testable assumptions in this model:
X1 → X4 (direct path) = 0
X2 → X3 (direct path) = 0
13Model parameters
VAR(X1), VAR(X2), VAR(e3), VAR(e4)
b1, b2, b3, b4, b5
COV(X1,X2)
Total: 10
Model df = 0
When df = 0, S = Σ exactly (at the proper solution for parameters θ).
When df > 0, generally S ≠ Σ (the closer they are to each other, the better the fit of the model).
14It is possible to have models with insufficient
information to (uniquely) estimate parameters
15Identification
- A model is identified if there is sufficient information to uniquely estimate parameters
- A model is over-identified if there is more than sufficient information
- A model is just-identified if df = 0 (just enough information)
- A model is under-identified if there is insufficient information
16Identification
- It is possible to have a model that has positive
degrees of freedom, yet is under-identified
17Identification
- If a model has negative degrees of freedom, it will be under-identified
- df ≥ 0 is a necessary condition (sometimes referred to as the t-rule)
- Not, however, a sufficient condition
18Identification
- Identification status of a 3-indicator latent variable model
Parameters: VAR(F1), VAR(E1), VAR(E2), VAR(E3), b1, b2, b3
(one of VAR(F1), b1, b2, b3 must be fixed to 1.0)
Empirical covariances, variances: 6
Parameters: 6
df = 0
Three-indicator rule
19Three indicator rule
- Latent variable (factor) model with 3 indicators will be identified
- Sufficient condition
- subject to note below
- All 3 coefficients must be non-trivial.
- If one is very close to 0, then the model reduces to a 2-indicator model
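One way to see why three indicators are enough is to solve for the parameters directly from the six observed variances and covariances. The sketch below assumes the first loading is fixed to 1.0 (X1 = F1 + E1, X2 = b2 F1 + E2, X3 = b3 F1 + E3) with uncorrelated errors; the function names and numeric values are hypothetical.

```python
def implied_cov_three_indicator(var_f1, b2, b3, error_vars):
    """Implied covariance matrix of X1, X2, X3 with the first loading fixed to 1.0."""
    loadings = [1.0, b2, b3]
    return [[loadings[i] * loadings[j] * var_f1 + (error_vars[i] if i == j else 0.0)
             for j in range(3)] for i in range(3)]

def solve_three_indicator(S):
    """Recover the parameters from S. Every step divides by a covariance,
    so it breaks down when a loading (and hence a covariance) is near 0."""
    var_f1 = S[0][1] * S[0][2] / S[1][2]   # (b2 phi)(b3 phi) / (b2 b3 phi)
    b2 = S[1][2] / S[0][2]                 # (b2 b3 phi) / (b3 phi)
    b3 = S[1][2] / S[0][1]                 # (b2 b3 phi) / (b2 phi)
    loadings = [1.0, b2, b3]
    error_vars = [S[i][i] - loadings[i] ** 2 * var_f1 for i in range(3)]
    return var_f1, b2, b3, error_vars

# Hypothetical "true" values, used only to generate a covariance matrix.
S = implied_cov_three_indicator(2.0, 0.5, 1.5, [0.3, 0.4, 0.5])
var_f1_hat, b2_hat, b3_hat, error_vars_hat = solve_three_indicator(S)
print(var_f1_hat, b2_hat, b3_hat, error_vars_hat)
```

Six equations, six unknowns, one solution: df = 0. The divisions also show why all three coefficients must be non-trivial: if b3 approaches 0, COV(X1,X3) and COV(X2,X3) approach 0 and the ratios become unstable.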
20What if only 2 indicators?
Parameters: VAR(F1), VAR(E1), VAR(E2), b1, b2 (5 parameters)
Fixing one of VAR(F1), b1 or b2 to 1.0 reduces the # of parameters to 4
3 variances and covariances: VAR(X1), COV(X1,X2), VAR(X2)
Under-identified! df = -1
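Under-identification means that different parameter values reproduce exactly the same covariances, so the data cannot choose between them. A minimal sketch, assuming b1 is fixed to 1.0 (X1 = F1 + E1, X2 = b2 F1 + E2); the two parameter sets are hypothetical.

```python
def implied_cov_two_indicator(var_f1, b2, var_e1, var_e2):
    """Implied covariance matrix of X1 and X2 with b1 fixed to 1.0."""
    return [[var_f1 + var_e1, b2 * var_f1],
            [b2 * var_f1, b2 * b2 * var_f1 + var_e2]]

# Two quite different sets of parameter values ...
sigma_a = implied_cov_two_indicator(var_f1=1.0, b2=0.5, var_e1=1.0, var_e2=0.75)
sigma_b = implied_cov_two_indicator(var_f1=0.5, b2=1.0, var_e1=1.5, var_e2=0.5)

# ... imply the same VAR(X1), COV(X1,X2) and VAR(X2).
print(sigma_a)
print(sigma_b)
```

With 4 free parameters and only 3 observed variances/covariances, there are infinitely many such sets.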
21What if 4 indicators?
Parameters:
b1, b2, b3, b4, VAR(L1) (one fixed to 1.0)
VAR(e1), VAR(e2), VAR(e3), VAR(e4)
Total: 8
With 4 manifest variables, we have 10 empirical variances and covariances
df = 2: over-identified
22What if 4 indicators?
df = 2: over-identified
Does this mean we can/should just toss out one of the manifest variables?
When a model is over-identified, we can test assumptions (with df = 0, no testable assumptions)
Some examples of assumptions:
Cov(e1,e2) = 0
Cov(e2,e4) = 0
23What if 4 indicators?
Implications of adding cov(e1,e2):
The originally estimated b1 and b2 were too strong: some of the covariance between X1 and X2 is not due to F1 but to some extraneous factor.
24Does this mean we must throw out variables with
fewer than 3 indicators?
- No, but to include these variables in the model,
we have to make stronger assumptions, and these
assumptions are not themselves testable.
25Does this mean we must throw out variables with fewer than 3 indicators?
Single indicator variables
This model is under-identified
Parameters: VAR(LV1), b1, VAR(e1)
Even fixing b1 = 1.0: 2 parameters, 1 variance.
One possible solution: impose constraints
Fix b1 = 1.0
Fix VAR(e1) = 0 (!) Implies perfect measurement
Now, VAR(LV1) = VAR(X1)
26Does this mean we must throw out variables with fewer than 3 indicators?
Single indicator variables
One possible solution: impose constraints
Fix b1 = 1.0
Fix VAR(e1) = 0 (!) Implies perfect measurement
Now, VAR(LV1) = VAR(X1)
LV1 now used as predictor of other variables in model
27Does this mean we must throw out variables with fewer than 3 indicators?
Single indicator variables
LV1 now used as predictor of other variables in model
Above model reduces to this. Most software packages will allow single-indicator estimation without the need to explicitly define b1 = 1.0, var(e1) = 0, etc.
28Does this mean we must throw out variables with fewer than 3 indicators?
Single indicator variables
While the constraints b1 = 1.0, VAR(e1) = 0 are common, other constraints are possible. For example, it might be more reasonable to assume 20% measurement error as opposed to 0%.
29Single indicator models with assumed (non-zero) error variances.
- Find out what the total variance of the manifest variable is (e.g., look at the variance-covariance matrix of manifest variables, which is usually available on printouts)
- If the variance of variable v239 is 2.4, and you wish to assume 20% measurement error, then the variance of the error term will be set to .20 x 2.4 = 0.48.
30Single indicator models with assumed (non-zero) error variances.
- If the variance of variable v239 is 2.4, and you wish to assume 20% measurement error, then the variance of the error term will be set to .20 x 2.4 = 0.48.
In AMOS, right-click on the circled error term, then enter object properties
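The arithmetic for the fixed error variance can be written out as follows. This is a small sketch with hypothetical function and variable names; it also assumes the loading is fixed to 1.0, so that the latent variance is whatever remains after the error variance is removed.

```python
def fixed_error_variance(total_variance, error_share):
    """Error variance to fix when a given share of the variance is assumed to be error."""
    return error_share * total_variance

var_v239 = 2.4                                    # variance of the manifest variable
var_e1 = fixed_error_variance(var_v239, 0.20)     # value to enter for the error term
var_lv1 = var_v239 - var_e1                       # implied latent variance when b1 = 1.0

print(round(var_e1, 2), round(var_lv1, 2))
```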
31If 2 indicators
- Model is under-identified BUT we could impose constraints
- Force b1 = b2 (we haven't learned how to do this in the software packages yet though)
  - This is only appropriate if the 2 indicators have the same variance
- Force var(e1) = var(e2)
  - This is only appropriate if the 2 indicators have the same variance
32If 2 indicators
- we could impose constraints
- Force b1 = b2 (we haven't learned how to do this in the software packages yet though)
  - This is only appropriate if the 2 indicators have the same variance
- Force var(e1) = var(e2)
  - This is only appropriate if the 2 indicators have the same variance
  - The variance constraint implies an assumption that the 2 indicators have the same reliability (this assumption is not testable), but at least we are able to estimate what this reliability (R² in measurement equations) is
- If the variances of the manifest variables are unequal, we must impose a more complex constraint
  - Example: VAR(X1) = 1.0, VAR(X2) = 4.0
  - Constraint: 4 x Var(e1) = Var(e2) (difficult with AMOS)
  - OR, we could adjust the variance of X2 before using it in the model (this could be accomplished by mean-centering the variable and dividing values by 2 in an SPSS RECODE statement)
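The unequal-variance case can be checked with a short calculation. The sketch assumes that "same reliability" means the same proportion of each indicator's variance is error; the reliability value of 0.7 and all names are hypothetical.

```python
def error_variance(total_variance, reliability):
    """Error variance implied by a given reliability: (1 - reliability) * VAR(X)."""
    return (1.0 - reliability) * total_variance

reliability = 0.7                 # hypothetical common reliability
var_x1, var_x2 = 1.0, 4.0         # variances of the two indicators

var_e1 = error_variance(var_x1, reliability)
var_e2 = error_variance(var_x2, reliability)
ratio = var_e2 / var_e1           # ratio the error-variance constraint must respect

# Alternative: rescale X2 so that both indicators have the same variance.
scale = (var_x1 / var_x2) ** 0.5  # 0.5, i.e. divide X2 by 2
var_x2_rescaled = var_x2 * scale ** 2

print(ratio, scale, var_x2_rescaled)
```

The ratio of error variances equals the ratio of total variances whatever reliability is assumed, which is why dividing X2 by 2 allows the simple constraint var(e1) = var(e2).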
33More on 2-indicator models
Parameters: Var(e1), var(e2), var(e3), var(e4), b1, b2, VAR(F1), VAR(F2), COV(F1,F2) = 9 parameters
Empirical variances/covariances: 10
Despite the fact that this is a 2-indicator model, it IS identified!
TWO INDICATOR RULE
34More on 2-indicator models
BEWARE: This works ONLY if COV(F1,F2) is non-zero. If COV(F1,F2) = 0, then this reduces to 2 under-identified 1-LV/2-indicator models.
BEWARE! Two-indicator models relying on the 2-indicator rule for identification (no other constraints) become unstable long before COV(F1,F2) reduces to 0. As a rule, look for covariances with standardized values of at least 0.30 before accepting models based on this principle.
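The instability can be illustrated with a rough sketch. It assumes reference loadings of 1.0 on X1 (for F1) and X3 (for F2), with X2 = loading x F1 + e2. Under those assumptions COV(X1,X3) = COV(F1,F2) and COV(X2,X3) = loading x COV(F1,F2), so the free loading is recovered as a ratio of two covariances. The loading of 0.8 and the sampling error of 0.01 are hypothetical.

```python
def loading_estimate(cov_x2_x3, cov_x1_x3):
    """Free loading of X2 on F1, recovered as COV(X2,X3) / COV(X1,X3)."""
    return cov_x2_x3 / cov_x1_x3

true_loading = 0.8       # hypothetical
sampling_error = 0.01    # hypothetical error in the observed COV(X1,X3)

spreads = {}
for cov_f1_f2 in (0.5, 0.3, 0.1, 0.02):
    cov_x1_x3 = cov_f1_f2
    cov_x2_x3 = true_loading * cov_f1_f2
    low = loading_estimate(cov_x2_x3, cov_x1_x3 + sampling_error)
    high = loading_estimate(cov_x2_x3, cov_x1_x3 - sampling_error)
    spreads[cov_f1_f2] = high - low
    print(f"COV(F1,F2)={cov_f1_f2}: loading estimate between {low:.3f} and {high:.3f}")
```

The same small error in an observed covariance barely moves the estimate when COV(F1,F2) is 0.5, but swings it widely when COV(F1,F2) is close to 0, because the denominator of the ratio is itself close to 0.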
35Causal models (construct equations)
- RECURSIVE RULE: Recursive models are identified (sufficient condition)
- T-rule: Necessary but not sufficient
A non-recursive model: Is it identified?
36Causal models (construct equations)
- THE RANK CONDITION
- We now construct a matrix in which the rows represent endogenous variables, and the columns represent both endogenous and exogenous variables

        X3    X4    X1    X2
  X3    1,1   1,2   1,3   1,4
  X4    2,1   2,2   2,3   2,4

- The numbers in this matrix refer to the matrix elements. Elements 1,1 and 2,2 are replaced with 1s, and other elements are replaced by the parameters [i]
- Matrix C

        X3    X4    X1    X2
  X3    1     e     b     0
  X4    d     1     0     c

- [i] In presentations such as those in Bollen (1989), these are actually the negative values of the coefficients, but the distinction is unimportant for purposes of this current discussion. The use of negative coefficients is associated with the matrix equations which are used in the rank condition test.
- See Baer manuscript, chapter 3
- For each equation, we delete all columns that do not have zeros in the row in question. Thus, for the first row, X3, we delete all columns that do not have a zero in that row. We are left with the following sub-matrix:

  C1 =  0
        c

  and, for the second row, X4:

  C2 =  b
        0

- The rank of each sub-matrix must be at least p-1, where p is the number of endogenous variables
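The column-deletion procedure can be sketched in code. This is a minimal illustration, not a general identification checker. The free parameters b, c, d, e are replaced by arbitrary non-zero stand-in values (an assumption; only the zero/non-zero pattern matters here), and the helper names are hypothetical.

```python
def rank(matrix, tol=1e-12):
    """Rank of a small matrix by Gaussian elimination."""
    m = [row[:] for row in matrix]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if abs(m[i][col]) > tol), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def rank_condition(C):
    """For each row, keep only the columns with a zero in that row and
    check that the remaining sub-matrix has rank >= p - 1."""
    p = len(C)
    results = []
    for row in C:
        keep = [j for j, value in enumerate(row) if value == 0]
        sub = [[other[j] for j in keep] for other in C]
        results.append(rank(sub) >= p - 1)
    return results

b, c, d, e = 0.5, 0.6, 0.4, 0.3      # arbitrary non-zero stand-ins
C = [[1, e, b, 0],                   # equation for X3
     [d, 1, 0, c]]                   # equation for X4
print(rank_condition(C))             # one True/False per equation

# If X2 had no effect on X4 (c = 0), the X3 equation would fail the test.
C_no_c = [[1, e, b, 0],
          [d, 1, 0, 0]]
print(rank_condition(C_no_c))
```

Each equation is identified only because the other equation contains an exogenous variable that it excludes; removing that path removes the identifying information.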
37Identification exercise examples
41Example 3, continued
42Note: rank condition is sufficient.
43Setting the variance of latent variables
Parameters:
b1, b2, b3, b4
Var(e1), var(e2), var(e3), var(e4)
Var(L1)
THIS MODEL WOULD APPEAR TO BE IDENTIFIED (T-RULE) BUT IS NOT UNLESS WE FIX THE VARIANCE OF L1 IN SOME WAY
44Setting the variance of latent variables
THIS MODEL WOULD APPEAR TO BE IDENTIFIED (T-RULE) BUT IS NOT UNLESS WE FIX THE VARIANCE OF L1 IN SOME WAY:
b1 = 1.0 OR b2 = 1.0 OR b3 = 1.0 OR b4 = 1.0 OR VAR(L1) = 1.0
Could be some number other than 1 (must be positive in the case of var(L1) though).
45Setting the variance of latent variables
            1st model    2nd model    3rd model
Var(L1)     1.0          Free         Free
b1          .6           1.0          1.5
b2          .4           .67          1.0
b3          .8           1.3          2.0
b4          .6           1.0          1.5

VAR(L1) = VAR(X1) - VAR(E1) (2nd model). If the error is small, VAR(L1) ≈ VAR(X1); if not, VAR(L1) is smaller.
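The three columns describe the same model under different scalings, and one can be converted into another by simple division. A minimal sketch, with a hypothetical helper name, that starts from the loadings obtained with VAR(L1) fixed at 1.0:

```python
def rescale_to_reference(loadings, ref):
    """Convert loadings estimated with VAR(L1) = 1.0 to the scaling in which
    the loading at index `ref` is fixed to 1.0. Returns (loadings, VAR(L1))."""
    scale = loadings[ref]
    return [b / scale for b in loadings], scale ** 2

standardized = [0.6, 0.4, 0.8, 0.6]       # b1..b4 with VAR(L1) fixed at 1.0

b_ref1, var_l1_ref1 = rescale_to_reference(standardized, ref=0)   # b1 fixed to 1.0
b_ref2, var_l1_ref2 = rescale_to_reference(standardized, ref=1)   # b2 fixed to 1.0

print([round(b, 2) for b in b_ref1], round(var_l1_ref1, 2))
print([round(b, 2) for b in b_ref2], round(var_l1_ref2, 2))
```

The ratios between loadings, and therefore the implied covariances, are unchanged. Only the unit of measurement of L1 differs.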
46AMOS Example (distributed yesterday)
Religiosity Indicators
V9 Importance of Religion: 1=Very important, 2=quite imp., 3=not very, 4=not at all
V147 Church attendance: 1=more than 1/week, 2=1/week, 3=1/month, 4=Christmas, Easter, 5=other holy days, 6=once/year, 7=less, 8=never
V175 Views: 1=there is a personal God, 2=some sort of spirit/life force, 3=don't know, 4=don't think there is any sort of God
V176 Importance of God in your life: 1=Not at all through 10=very
Sexual Morality
Can each be always justified (10), never justified (1) or something in between.
V304 Married men/women having an affair
V305 Sex under legal age of consent
V307 Homosexuality
V308 Prostitution
V309 Abortion
V310 Divorce
47AMOS Example (distributed yesterday)
AMOS Programs and Output Listings For Religiosity / Sexual Morality Problem
RSMProg1: 4-indicator model for religiosity. Fit indices OK. Modification indices suggest a small improvement if a correlated error term is added. Model has acceptable fit without this addition; whether or not to add the term is probably best adjudicated by whether or not it makes sense theoretically.
RSMProg2: 4-indicator model for religiosity with correlated error term added. Note the drop in R-square for the 2 indicators involved in the correlated error term. Also, it would now be slightly preferable to use as a reference indicator a variable that is not associated with a correlated error term.
RSMProg3: (correlated error term dropped) Add 5-indicator latent variable for sex-morality attitudes.
RSMProg4: Add 2 correlated error terms.
RSMProg5: (correlated errors dropped) Like RSMProg3, but with causal model (Religiosity exogenous, Sex/morality attitudes endogenous).
48(end)
- Exercises
- Do the covariance algebra exercises in section (h) of chapter 2. Appendix 2 is missing from the manuscript BUT we will distribute answers in a couple of days.
- Work through part 2 of the AMOS introduction