Title: Meeting 9
1. Meeting 9: Covariate Latent Class Analysis
2. Two-Class Covariate Model (see Equations 7.1 and 7.4)
3. Logistic regression model for the latent class proportion with 2 covariates
Note: For each case, in addition to observations for the response variables A, B, C, and D (say), there are data for the covariates Z1 and Z2.
4. Note that the latent class proportion is conditional on the covariates but that the conditional probabilities are not. From the perspective of LC models incorporating groups, this is a partially heterogeneous model.
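The structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the coefficient values (b0, b1, b2) and the conditional probabilities are hypothetical and are not taken from the LEM output. The point is that the covariates enter only through the latent class proportion, while the conditional probabilities are the same for every case.

```python
import math

def class1_proportion(z1, z2, b0, b1, b2):
    """P(X = 1 | z1, z2) from a logistic model for a two-class latent variable."""
    logit = b0 + b1 * z1 + b2 * z2
    return 1.0 / (1.0 + math.exp(-logit))

def pattern_probability(pattern, z1, z2, coefs, cond_yes):
    """Probability of a response pattern given the covariates.

    pattern  : tuple of 0/1 flags (1 = Yes) for items A, B, C, D
    coefs    : (b0, b1, b2), hypothetical logistic coefficients
    cond_yes : cond_yes[c][j] = P(item j = Yes | class c); these do not
               depend on z1 or z2
    """
    p1 = class1_proportion(z1, z2, *coefs)
    total = 0.0
    for proportion, yes_probs in zip((p1, 1.0 - p1), cond_yes):
        likelihood = 1.0
        for y, p in zip(pattern, yes_probs):
            likelihood *= p if y == 1 else 1.0 - p
        total += proportion * likelihood
    return total

# Hypothetical values, for illustration only.
coefs = (0.5, -0.8, 0.3)
cond_yes = [[0.6, 0.5, 0.2, 0.4],      # class 1
            [0.01, 0.04, 0.03, 0.2]]   # class 2
print(pattern_probability((1, 1, 0, 0), 2.0, 1.0, coefs, cond_yes))
```

Changing z1 or z2 shifts the mixing weights for the two classes but leaves cond_yes untouched, which is what distinguishes this model from a fully heterogeneous one.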
5. INPUT
    * Cheating data with GPA as covariate
    * Two-Class Model
    lat 1
    man 4
    con 1
    dim 2 2 2 2 2
    lab X A B C D P
    mod X|P {P,cov(P,1,X,c,-1)}
        A|X
        B|X
        C|X
        D|X
    rec 315
    des [1 0]
    dat cheat4.prn
LEM input for cheating data with GPA as covariate
P = GPA is the continuous covariate; see the next slide for the structure of the data file.
The mod record shown is the required form for covariate models; see the LEM Manual.
Two cases lack GPA, so n = 315, not 317.
The des record influences the form of the logistic regression model; see later.
The raw data are in a file named cheat4.prn; see below for the format.
6. Description from page 44 of the LEM Manual
Note that W = latent variable and x = covariate vector (x1 and x2).
Note that there is a cov( ) segment for each covariate, x1 and x2.
The des record contains a row for each covariate. The coding shown is contrast coding, but different coding forms, such as dummy coding, can be used.
7. Contents of the file cheat4.prn
Columns: A B C D P
A, B, C, D: 1 = No, 2 = Yes
P: 1 = lowest GPA, 5 = highest GPA
(and so on, for 315 cases)
For raw data input to LEM, you need a rec record giving the number of cases in the input file (e.g., rec 315) and a dat record with the name of the flat ASCII data file (e.g., dat cheat4.prn). The input file should be in the same subdirectory as the executable file, LEMWIN.EXE. DO NOT create the data file with a program such as WORD. Use NOTEPAD, or a similar program, since it outputs flat ASCII files.
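Before running LEM it can help to confirm that the flat ASCII file has the layout and the number of cases that the rec record claims. The sketch below is a hypothetical check, not part of LEM: the function name read_flat_ascii and the three sample rows are invented for illustration, and the file is assumed to be whitespace-delimited with one case per line in the order A B C D P.

```python
import io

ITEMS = ("A", "B", "C", "D")

def read_flat_ascii(handle, n_expected):
    """Read whitespace-delimited cases (A B C D P) and check the case count."""
    cases = []
    for line_no, line in enumerate(handle, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != len(ITEMS) + 1:
            raise ValueError(f"line {line_no}: expected 5 fields, found {len(fields)}")
        responses = [int(value) for value in fields[:4]]
        if any(r not in (1, 2) for r in responses):
            raise ValueError(f"line {line_no}: responses must be 1 (No) or 2 (Yes)")
        case = dict(zip(ITEMS, responses))
        case["P"] = float(fields[4])  # covariate (GPA)
        cases.append(case)
    if len(cases) != n_expected:
        raise ValueError(f"rec says {n_expected} cases, file has {len(cases)}")
    return cases

# Made-up rows standing in for the real file; with the real data one would
# pass open("cheat4.prn") and n_expected=315 instead.
sample = io.StringIO("1 1 1 1 3\n2 1 1 2 2\n1 2 1 1 5\n")
cases = read_flat_ascii(sample, n_expected=3)
print(cases[1])
```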
8. STATISTICS
    Number of iterations     414
    Converge criterion       0.0000009826
    Seed random values       6026

    X-squared                5430.0381 (0.0000)
    L-squared                859.3242 (1.0000)
    Cressie-Read             2098.2545 (1.0000)
    Dissimilarity index      0.2749
    Degrees of freedom       4716
    Log-likelihood           -2241.72248
    Number of parameters     9 (315)
    Sample size              315.0
    BIC(L-squared)           -26269.8084
    AIC(L-squared)           -8572.6758
    BIC(log-likelihood)      4535.2181
    AIC(log-likelihood)      4501.4450

    Eigenvalues information matrix
    273.3488  116.6123  96.2470  75.2482  72.6527  47.9488
    35.1337  14.4093  4.5089
The chi-square statistics are not meaningful since they are based on a cross-tabulation involving individual cases; that is, a table with 315 x 2 x 2 x 2 x 2 = 5040 cells.
The BIC(L-squared) and AIC(L-squared) values are not interpretable since they are based on invalid chi-squares.
The BIC(log-likelihood) and AIC(log-likelihood) values are OK since they are based on the log-likelihood.
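The log-likelihood-based criteria can be reproduced by hand from the printout, using the standard definitions AIC = -2 logL + 2p and BIC = -2 logL + p ln(n). The short sketch below assumes those definitions apply to the LEM output and uses the log-likelihood, parameter count and sample size reported above; the function name is illustrative.

```python
import math

def information_criteria(log_lik, n_params, n_cases):
    """AIC and BIC computed from the log-likelihood."""
    deviance = -2.0 * log_lik
    return {
        "AIC": deviance + 2.0 * n_params,
        "BIC": deviance + n_params * math.log(n_cases),
    }

n_cases, n_params, log_lik = 315, 9, -2241.72248

# One row of the cross-tabulation per case, times the 2x2x2x2 response table.
n_cells = n_cases * 2 ** 4

ic = information_criteria(log_lik, n_params, n_cases)
print(n_cells, round(ic["BIC"], 4), round(ic["AIC"], 4))
```

The cell count shows why the chi-square statistics are uninformative here: 5040 cells are being compared against only 315 observations.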
9. Conditional Probabilities for Covariate LCA with Cheat4 Data
Columns: item response, latent class X, estimate (standard error), response label (N = No, Y = Yes)
    P(A|X)   1 1   0.4268 (0.1239)   N
             2 1   0.5732 (0.1239)   Y
             1 2   0.9900 (0.0128)   N
             2 2   0.0100 (0.0128)   Y
    P(B|X)   1 1   0.4764 (0.0993)   N
             2 1   0.5236 (0.0993)   Y
             1 2   0.9641 (0.0186)   N
             2 2   0.0359 (0.0186)   Y
    P(C|X)   1 1   0.7834 (0.0673)   N
             2 1   0.2166 (0.0673)   Y
             1 2   0.9649 (0.0143)   N
             2 2   0.0351 (0.0143)   Y
    P(D|X)   1 1   0.5945 (0.0837)   N
             2 1   0.4055 (0.0837)   Y
             1 2   0.8240 (0.0272)   N
             2 2   0.1760 (0.0272)   Y
Latent class 1 has relatively higher conditional probabilities for a 2 = Yes response and can be considered the Cheater class. Except for variable D, latent class 2 has very low rates of Yes responses.
10. Covariate (P) portion of the Log-Linear Printout
    LOG-LINEAR PARAMETERS
    TABLE XP or P(X|P)
    effect          beta      std err   z-value   exp(beta)   Wald    df   prob
    cov(P) X 1 1    -0.7997   0.1780    -4.494    0.4495      20.20   1    0.000
Significance test for the regression coefficient: Wald = z^2.
exp(beta) is the odds ratio.
With this scaling, β0 = 0.
The regression slope (-0.7997) is negative; thus, increasing GPA is associated with a lower predicted probability of membership in latent class 1, the Cheater class.
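11. NOTE: Odds ratio = .4495. Each one-unit increase in GPA multiplies the odds of membership in latent class 1 by this value:
    .202 = .4495 x .449
    .091 = .4495 x .202
    .041 = .4495 x .091
    .018 = .4495 x .041
Reminder: Odds = Prob/(1 - Prob)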
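The chain of odds above can be turned into predicted probabilities with the reminder formula rearranged as Prob = Odds/(1 + Odds). The sketch below assumes a zero intercept, so that the odds at GPA = g are exp(beta x g), and it assumes GPA is coded 1 through 5 as in the data file; the function names are illustrative.

```python
import math

BETA = -0.7997  # slope for GPA from the log-linear printout (des 1 0)

def odds_class1(gpa, beta=BETA, intercept=0.0):
    """Odds of membership in latent class 1 at a given GPA code."""
    return math.exp(intercept + beta * gpa)

def prob_from_odds(odds):
    """Invert Odds = Prob/(1 - Prob)."""
    return odds / (1.0 + odds)

for gpa in range(1, 6):
    odds = odds_class1(gpa)
    print(gpa, round(odds, 3), round(prob_from_odds(odds), 3))
```

Because the odds shrink by the same factor at every step, the predicted probability of being in the Cheater class falls steadily from the lowest to the highest GPA category.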
12. STATISTICS: output using des 1 0 and output using des 1 -1
                             des 1 0               des 1 -1
    Number of iterations     414                   136
    Converge criterion       0.0000009826          0.0000009153
    Seed random values       6026                  4204

    X-squared                5430.0381 (0.0000)    5433.0708 (0.0000)
    L-squared                859.3242 (1.0000)     859.3241 (1.0000)
    Cressie-Read             2098.2545 (1.0000)    2098.6872 (1.0000)
    Dissimilarity index      0.2749                0.2749
    Degrees of freedom       4716                  4716
    Log-likelihood           -2241.72248           -2241.72245
    Number of parameters     9 (315)               9 (315)
    Sample size              315.0                 315.0
    BIC(L-squared)           -26269.8084           -26269.8084
    AIC(L-squared)           -8572.6758            -8572.6759
    BIC(log-likelihood)      4535.2181             4535.2181
    AIC(log-likelihood)      4501.4450             4501.4449
13. Logistic Coefficient using des 1 -1
    LOG-LINEAR PARAMETERS
    TABLE XP or P(X|P)
    effect          beta      std err   z-value   exp(beta)   Wald    df   prob
    cov(P) X 1 1    -0.3997   0.0889    -4.495    0.6705      20.20   1    0.000
Note that with des 1 0 coding, the beta was -0.7997.
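The two codings describe the same fitted model. A minimal sketch of why, assuming the des row supplies the multipliers applied to beta for latent class 1 and latent class 2: the slope of the log-odds of class 1 versus class 2 is beta times the difference between the two entries. The function name is illustrative.

```python
import math

def log_odds_slope(beta, des_row):
    """Slope of log[P(class 1)/P(class 2)] per unit of the covariate."""
    weight_class1, weight_class2 = des_row
    return beta * weight_class1 - beta * weight_class2

slope_dummy = log_odds_slope(-0.7997, (1, 0))    # des 1 0
slope_effect = log_odds_slope(-0.3997, (1, -1))  # des 1 -1

print(round(slope_dummy, 4), round(slope_effect, 4))
print(round(math.exp(slope_dummy), 4), round(math.exp(slope_effect), 4))
```

With des 1 -1 the reported beta is half the log-odds slope, so doubling it recovers the des 1 0 value up to convergence error, and the z-value and Wald statistic are essentially unchanged.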
14. LEM input for cheating data with Sex, GPA, and College as Covariates
INPUT
    * Cheating data with Covariates
    * Two-Class Model
    * x = sex (1,2), GPA (1-5), 3 college dummies, GPA x College products
    lat 1
    man 4
    con 5
    dim 2 2 2 2 2
    lab W A B C D x
    mod W|x {x,cov(x,1,W,c,-1),cov(x,2,W,c,-1),
             cov(x,3,W,c,-1),cov(x,4,W,c,-1),cov(x,5,W,c,-1)}
        A|W
        B|W
        C|W
        D|W
    rec 314
    des [1 0
         1 0
         1 0
         1 0
         1 0]
    dat cheat4SGC.prn
One more case has missing data, so n = 314.
15. Contents of the file cheat4GSC.prn
Columns: A B C D S P C1-C3 Prod1-Prod3
C1-C3 are dummy variables for the 4 colleges. Prod1-Prod3 are products of College and GPA (P); the products are not used in this analysis.
(and so on, for 314 cases)
16. STATISTICS
    Number of iterations     238
    Converge criterion       0.0000009955
    Seed random values       5177

    X-squared                5351.9911 (0.0000)
    L-squared                846.0682 (1.0000)
    Cressie-Read             2047.8551 (1.0000)
    Dissimilarity index      0.2729
    Degrees of freedom       4697
    Log-likelihood           -2228.34350
    Number of parameters     13 (314)
    Sample size              314.0
    BIC(L-squared)           -26158.8307     Meaningless
    AIC(L-squared)           -8547.9318      Meaningless
    BIC(log-likelihood)      4531.4291       OK
    AIC(log-likelihood)      4482.6870       OK

    Eigenvalues information matrix
    251.4948  177.0134  112.5420  82.8747  76.2716  43.7782
    30.2218  13.2959  9.2520  6.5526  5.4114  4.1367  0.8306
17. Covariate (x) portion of the Log-Linear Printout
    LOG-LINEAR PARAMETERS
    TABLE Wx or P(W|x)
    effect          beta      std err   z-value   exp(beta)   Wald    df   prob    covariate
    cov(x) W 1 1    0.0361    0.3844    0.094     1.0368      0.01    1    0.925   Sex
    cov(x) W 1 2    -0.9821   0.2718    -3.614    0.3745      13.06   1    0.000   GPA
    cov(x) W 1 3    0.1555    0.6959    0.223     1.1682      0.05    1    0.823   College 1
    cov(x) W 1 4    1.0624    0.6146    1.729     2.8934      2.99    1    0.084   College 2
    cov(x) W 1 5    0.5914    0.7733    0.765     1.8065      0.58    1    0.444   College 3
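The z-value, Wald, exp(beta) and prob columns all follow from beta and its standard error. The sketch below recomputes them, assuming a two-sided normal test (equivalently, a 1-df chi-square test on Wald = z^2); the function name and the dictionary layout are illustrative, and the estimates are copied from the printout above.

```python
import math

def wald_summary(beta, std_err):
    """Return (z, Wald, two-sided p-value, odds ratio) for one coefficient."""
    z = beta / std_err
    wald = z * z
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return z, wald, p_value, math.exp(beta)

estimates = {
    "Sex": (0.0361, 0.3844),
    "GPA": (-0.9821, 0.2718),
    "College 1": (0.1555, 0.6959),
    "College 2": (1.0624, 0.6146),
    "College 3": (0.5914, 0.7733),
}

for name, (beta, se) in estimates.items():
    z, wald, p, odds_ratio = wald_summary(beta, se)
    print(f"{name:10s} z={z:7.3f} Wald={wald:6.2f} p={p:.3f} exp(beta)={odds_ratio:.4f}")

significant = [name for name, (b, se) in estimates.items()
               if wald_summary(b, se)[2] < 0.05]
print(significant)
```

Only the GPA coefficient reaches the conventional .05 level; the sex and college coefficients do not.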
18. Conditional Probabilities for Covariate LCA with Cheat4GSC Data
Columns: item response, latent class W, estimate (standard error)
    P(A|W)   1 1   0.5018 (0.1405)
             2 1   0.4982 (0.1405)
             1 2   0.9901 (0.0117)
             2 2   0.0099 (0.0117)
    P(B|W)   1 1   0.5143 (0.1070)
             2 1   0.4857 (0.1070)
             1 2   0.9750 (0.0197)
             2 2   0.0250 (0.0197)
    P(C|W)   1 1   0.7964 (0.0630)
             2 1   0.2036 (0.0630)
             1 2   0.9676 (0.0142)
             2 2   0.0324 (0.0142)
    P(D|W)   1 1   0.5891 (0.0793)
             2 1   0.4109 (0.0793)
             1 2   0.8325 (0.0312)
             2 2   0.1675 (0.0312)
Compare to Slide 9: the conditional probabilities are very similar.
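One way to make "very similar" concrete is to take the absolute differences between the two sets of estimates. The sketch below uses the probabilities of a No response (response 1) from Slide 9 and from this slide; the variable names are illustrative, and the Yes probabilities would give the same differences since each pair sums to one.

```python
# P(item = 1 (No) | class), keyed by (item, latent class).
gpa_only = {
    ("A", 1): 0.4268, ("A", 2): 0.9900,
    ("B", 1): 0.4764, ("B", 2): 0.9641,
    ("C", 1): 0.7834, ("C", 2): 0.9649,
    ("D", 1): 0.5945, ("D", 2): 0.8240,
}
sex_gpa_college = {
    ("A", 1): 0.5018, ("A", 2): 0.9901,
    ("B", 1): 0.5143, ("B", 2): 0.9750,
    ("C", 1): 0.7964, ("C", 2): 0.9676,
    ("D", 1): 0.5891, ("D", 2): 0.8325,
}

differences = {key: abs(gpa_only[key] - sex_gpa_college[key]) for key in gpa_only}
largest_key = max(differences, key=differences.get)
print(largest_key, round(differences[largest_key], 4))
```

The largest shift is for item A in latent class 1, and it is small relative to the standard errors reported for that estimate in both runs.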
19. LEM input for cheating data with GPA, College, and Products as Covariates
    * Cheating data with GPA
    * Two-Class Model
    * x = sex (1,2), GPA (1-5), 3 college dummies, 3 GPA x College products
    lat 1
    man 4
    con 8
    dim 2 2 2 2 2
    lab W A B C D x
    mod W|x {x,cov(x,2,W,c,-1),cov(x,3,W,c,-1),
             cov(x,4,W,c,-1),cov(x,5,W,c,-1),cov(x,6,W,c,-1),
             cov(x,7,W,c,-1),cov(x,8,W,c,-1)}
        A|W
        B|W
        C|W
        D|W
    rec 314
    des [1 0
         1 0
         1 0
         1 0
         1 0
         1 0]
    dat cheat4SGC.prn
Note: x1 (sex) is not included as a covariate.