Model Selections and Comparisons

1
Model Selections and Comparisons
(Categorical Data Analysis, Ch 9.2)
Yumi Kubo, Alvin Hsieh
2
Survey Data
  • Conducted in 1992 by the Wright State University School of
    Medicine and United Health Services in Dayton, Ohio
  • 2276 students in their final year of high school
    (nonurban area)
  • We add more dimensions to the data of Section 8.2.4
  • Original variables: Alcohol (A), Cigarette (C), Marijuana (M)
  • Added variables: Gender (G), Race (R)

3
Association Graphs (Definitions)
  • association graph - a set of vertices, each
    vertex representing a variable
  • edge - a conditional association between two
    variables
  • path - a sequence of edges leading from one
    variable to another
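
As a minimal sketch of these definitions, an association graph can be stored as an adjacency matrix and searched for paths. The edge set below is read off the model name fit.AC.AM.CM.AG.AR.GM.GR used in the R program; the helper has_path is a hypothetical name introduced here for illustration, not part of the original slides.

```r
# Vertices: one per variable in the survey.
vars <- c("A", "C", "M", "G", "R")

# Edges: conditional associations in the model (AC, AM, CM, AG, AR, GM, GR).
edges <- c("AC", "AM", "CM", "AG", "AR", "GM", "GR")

# Build a symmetric adjacency matrix from the two-letter edge labels.
adj <- matrix(FALSE, length(vars), length(vars), dimnames = list(vars, vars))
for (e in edges) {
  v <- strsplit(e, "")[[1]]
  adj[v[1], v[2]] <- TRUE
  adj[v[2], v[1]] <- TRUE
}

# has_path (hypothetical helper): breadth-first search for a sequence of
# edges leading from one variable to another.
has_path <- function(adj, from, to) {
  visited <- from
  queue <- from
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    if (current == to) return(TRUE)
    neighbours <- setdiff(colnames(adj)[adj[current, ]], visited)
    visited <- c(visited, neighbours)
    queue <- c(queue, neighbours)
  }
  FALSE
}

adj["C", "R"]             # is there a direct edge between C and R?
has_path(adj, "C", "R")   # is there a path between C and R?
```

In this graph C and R share no edge, yet a path such as C - A - R connects them, which is the distinction the definitions draw between an edge and a path.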

4
Association Graphs (Saturated)
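[Figure: saturated association graph on the vertices A, C, M, G, R, with every pair of vertices joined by an edge; callouts mark a variable (vertex), a path, and a conditional association (edge)]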
5
Association Graphs (Reduced)
[Figure: reduced association graph on the vertices M, G, R, A, C]
6
Data Set
Marijuana use (yes / no) by alcohol use, cigarette use, gender, and race

                            Race: White                Race: Other
                        Female        Male         Female        Male
Alcohol  Cigarette     yes    no    yes    no     yes    no    yes    no
yes      yes           405   268    453   228      23    23     30    19
yes      no             13   218     28   201       2    19      1    18
no       yes             1    17      1    17       0     1      1     8
no       no              1   117      1   133       0    12      0    17
7
SAS Program
Too large to place here. Go to survey.sas.
8
R Program
Original code (modified below):
http://math.cl.uh.edu/thompsonla/RCode.txt

survey <- data.frame(expand.grid(cigarette = c("Yes", "No"),
                                 alcohol   = c("Yes", "No"),
                                 marijuana = c("Yes", "No"),
                                 gender    = c("female", "male"),
                                 race      = c("white", "other")),
                     count = c(405, 13, 1, 1, 268, 218, 17, 117, 453,
                               28, 1, 1, 228, 201, 17,
                               133, 23, 2, 0, 0, 23, 19, 1, 12, 30, 1, 1, 0, 19,
                               18, 8, 17))
library(MASS)
# mutual independence + GR
fit.GR <- glm(count ~ . + gender*race, data = survey, family = poisson)
# homogeneous association
fit.homog.assoc <- glm(count ~ .^2, data = survey, family = poisson)
# all three-factor terms
fit.3fact <- glm(count ~ .^3, data = survey, family = poisson)
summary(res <- stepAIC(fit.homog.assoc,
                       scope = list(lower = ~ cigarette + alcohol +
                                      marijuana + gender*race),
                       direction = "backward"))
fit.AC.AM.CM.AG.AR.GM.GR.MR <- res
fit.AC.AM.CM.AG.AR.GM.GR <- update(fit.AC.AM.CM.AG.AR.GM.GR.MR, ~ . - marijuana:race)
fit.AC.AM.CM.AG.AR.GR <- update(fit.AC.AM.CM.AG.AR.GM.GR, ~ . - marijuana:gender)
9
R Program (P-values)
1 - pchisq((15.8 - 15.3), 1)
1 - pchisq((16.7 - 15.8), 1)
1 - pchisq((19.9 - 16.7), 1)
1 - pchisq((28.8 - 19.9), 1)
1 - pchisq((40.3 - 28.8), 1)
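
The five calculations above share one pattern: the change in deviance between two successive models is referred to a chi-squared distribution with 1 degree of freedom. A compact sketch of the same arithmetic, assuming the deviances are listed in the order they appear above (the vector names g2, change, and p_values are introduced here for illustration):

```r
# Deviances of the successive models, as used in the calculations above.
g2 <- c(15.3, 15.8, 16.7, 19.9, 28.8, 40.3)

# Change in deviance at each step; each step removes one term (1 df).
change <- diff(g2)

# Upper-tail chi-squared probability; same value as 1 - pchisq(change, 1).
p_values <- pchisq(change, df = 1, lower.tail = FALSE)

round(p_values, 4)
```

Using lower.tail = FALSE asks pchisq for the upper tail directly, which avoids the subtraction from 1 when a P-value is very small.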
10
Model Selection
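  • Select an alpha level (0.05 is the usual default)
  • Look at the P-value for each term removed from the model
  • In R, use 1-pchisq(G2, df), where G2 is the change in G^2 and df is the change in degrees of freedom
  • Stop removing terms once a P-value falls below the alpha level chosen in the first step
  • Model 1: G + R + A + C + M + GR
  • Model 2: G + R + A + C + M + GR + (all pairs)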
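
The stopping rule in the steps above can be sketched as follows, assuming the successive deviances from the P-value slide and an alpha of 0.05. The names first_rejected and accepted are introduced here for illustration.

```r
alpha <- 0.05

# Deviances after each successive removal (values from the P-value slide).
g2 <- c(15.3, 15.8, 16.7, 19.9, 28.8, 40.3)
p_values <- 1 - pchisq(diff(g2), df = 1)

# First removal whose P-value falls below alpha; that removal is rejected.
# (This is NA if no P-value falls below alpha.)
first_rejected <- which(p_values < alpha)[1]

# Number of removals accepted before stopping.
accepted <- first_rejected - 1
accepted
```

With these values the first three removals are kept, and the fourth (deviance rising from 19.9 to 28.8) is the one that crosses the alpha level.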

11
Model Selection (Continued)
  • Model 3: G + R + A + C + M + GR + (all pairs) + (all 3-factor terms)
  • Model 4g: lowest change in G^2, taking out CR
  • Model 5: lowest change in G^2, taking out CG
  • Model 6: lowest change in G^2, taking out MR
  • Model 7: lowest change in G^2, taking out GM
  • Consider A, C, M: AC, AM, CM
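
One way to find the term with the lowest change in deviance is to let drop1 test every pairwise term of the homogeneous association model at once. This is a minimal sketch under that assumption, not necessarily how the presenters produced their table; the data are entered in the same cell order as on the R Program slide, and the names tests, candidates, and weakest are introduced here for illustration.

```r
# Survey counts, entered in the same cell order as the R Program slide.
survey <- data.frame(
  expand.grid(cigarette = c("Yes", "No"),
              alcohol   = c("Yes", "No"),
              marijuana = c("Yes", "No"),
              gender    = c("female", "male"),
              race      = c("white", "other")),
  count = c(405, 13, 1, 1, 268, 218, 17, 117, 453, 28, 1, 1, 228, 201, 17,
            133, 23, 2, 0, 0, 23, 19, 1, 12, 30, 1, 1, 0, 19, 18, 8, 17))

# Homogeneous association model: all main effects and all pairs.
fit <- glm(count ~ .^2, data = survey, family = poisson)

# Likelihood-ratio test for deleting each pairwise term in turn.
tests <- drop1(fit, test = "Chisq")

# Term whose removal changes the deviance least (skip the "<none>" row).
candidates <- tests[-1, ]
weakest <- rownames(candidates)[which.min(candidates$LRT)]
weakest
```

Repeating the same step on the refitted model, for example after update(fit, ~ . - cigarette:race), continues the backward elimination one term at a time.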

12
Goodness-of-Fit Tests (Table 9.2)
13
Thank You!
Any Questions???