Title: Analysis of the Non-Human Primate Study: Update
1Analysis of the Non-Human Primate Study Update
David Madigan Rutgers University
stat.rutgers.edu/madigan
2Goal of the Analysis
- Are measurable aspects of the state of the immune system predictive of survival?
- Problem: hundreds of different assay timepoints but fewer than one hundred macaques
- Initial descriptive analysis
- Regularized predictive modeling
- New functional decision tree modeling
14IgG
15ED50
16IFNeli
17SI
18IL4eli
19IFNm
20r_ED-IgG
21Logistic Regression Model
- Linear model for log odds of category membership
\[
\log \frac{p(y = 1 \mid x_i)}{p(y = -1 \mid x_i)} = \sum_j b_j x_{ij} = b \cdot x_i
\]
- Conditional probability model
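As a minimal sketch of the model above, the log odds are a linear function of the predictors, and the conditional probability follows from the logistic function. The coefficient and predictor values are hypothetical, chosen only to make the arithmetic easy to check; they are not taken from the study.

```python
import math

def log_odds(b, x):
    """Linear predictor: sum over j of b_j * x_ij."""
    return sum(bj * xj for bj, xj in zip(b, x))

def prob_positive(b, x):
    """p(y = +1 | x), obtained from the log odds via the logistic function."""
    return 1.0 / (1.0 + math.exp(-log_odds(b, x)))

# Hypothetical coefficients and one hypothetical predictor vector.
b = [0.5, -1.0]
x = [2.0, 1.0]

eta = log_odds(b, x)      # 0.5 * 2.0 + (-1.0) * 1.0
p = prob_positive(b, x)   # probability of the +1 class
```

With these values the log odds are zero, so both classes are equally likely.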
22Maximum Likelihood Training
- Choose parameters (b_j's) that maximize the probability (likelihood) of the class labels (y_i's) given the predictor vectors (x_i's)
- Tends to overfit
- Not defined if d > n
- Feature selection
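The following sketch illustrates why unpenalized maximum likelihood breaks down when there are more predictors than observations. The toy data are hypothetical (n = 2, d = 3) and are perfectly separable, so scaling the coefficient vector keeps increasing the log-likelihood and no finite maximizer exists.

```python
import math

def log_likelihood(b, X, y):
    """Sum over i of log p(y_i | x_i), with labels y_i in {+1, -1}."""
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * sum(bj * xj for bj, xj in zip(b, xi))
        # log(sigmoid(margin)), written in a numerically stable form
        if margin >= 0:
            total += -math.log1p(math.exp(-margin))
        else:
            total += margin - math.log1p(math.exp(margin))
    return total

# Hypothetical toy data: n = 2 observations, d = 3 predictors (d > n).
X = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
y = [+1, -1]
direction = [1.0, -1.0, 0.0]  # this direction separates the two classes

# The log-likelihood keeps rising toward 0 as the coefficients are scaled up.
values = [log_likelihood([c * d for d in direction], X, y) for c in (1, 10, 100)]
```

Each larger scale factor gives a strictly higher log-likelihood, which is the behavior that shrinkage is meant to control.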
23Shrinkage Methods
- Feature selection is a discrete process: individual variables are either in or out. Combinatorial nightmare.
- This method can have high variance: a different dataset from the same source can result in a totally different model
- Shrinkage methods allow a variable to be partly included in the model. That is, the variable is included but with a shrunken coefficient
- Elegant way to tackle over-fitting
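To illustrate what a shrunken coefficient looks like in practice, here is a minimal sketch that fits a logistic regression with a squared-coefficient penalty by plain gradient descent. The penalized form used here, the function name `fit_ridge_logistic`, the step size, the iteration count, and the toy data are all assumptions made for illustration, not the method or data used in the study.

```python
import math

def fit_ridge_logistic(X, y, lam, step=0.05, iters=5000):
    """Gradient descent on  -log-likelihood + (lam / 2) * sum_j b_j^2."""
    d = len(X[0])
    b = [0.0] * d
    for _ in range(iters):
        grad = [lam * bj for bj in b]            # gradient of the penalty term
        for xi, yi in zip(X, y):
            margin = yi * sum(bj * xj for bj, xj in zip(b, xi))
            w = 1.0 / (1.0 + math.exp(margin))   # equals 1 - p(y_i | x_i)
            for j in range(d):
                grad[j] -= w * yi * xi[j]        # gradient of -log-likelihood
        b = [bj - step * gj for bj, gj in zip(b, grad)]
    return b

# Hypothetical, non-separable toy data with labels in {+1, -1}.
X = [[1.0, 0.5], [2.0, 1.0], [-1.0, 1.0], [-2.0, 0.0], [0.5, 1.5], [1.5, -1.0]]
y = [+1, -1, -1, -1, +1, +1]

# Coefficient norm for increasing penalty strength.
norms = []
for lam in (0.0, 1.0, 5.0):
    b = fit_ridge_logistic(X, y, lam)
    norms.append(math.sqrt(sum(bj * bj for bj in b)))
```

Every variable stays in the model, but the overall size of the coefficient vector drops as the penalty grows.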
24Ridge Logistic Regression
Maximum likelihood plus a constraint
Lasso Logistic Regression
Maximum likelihood plus a constraint
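The constraint formulas themselves are not present in the transcript. As a sketch of the standard textbook forms, assuming the same coefficients b_j as in the logistic model and a bound s that controls the amount of shrinkage, the two estimators can be written as:

```latex
\[
\hat{b}^{\text{ridge}} = \arg\max_{b} \sum_i \log p(y_i \mid x_i; b)
\quad \text{subject to} \quad \sum_j b_j^2 \le s
\]
\[
\hat{b}^{\text{lasso}} = \arg\max_{b} \sum_i \log p(y_i \mid x_i; b)
\quad \text{subject to} \quad \sum_j |b_j| \le s
\]
```

The lasso (L1) constraint can set some coefficients exactly to zero, whereas the ridge constraint shrinks them toward zero without eliminating them.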
25s
27- L1 Logistic Regression
- complete imputation
- common weeks only (0,4,8,26,30,38,42,46,50)
- no interactions
IGG_38         -0.16  (0.17)
ED50_30        -0.11  (0.14)
SI_8           -0.09  (0.30)
IFNeli_8       -0.07  (0.24)
ED50_38        -0.03  (0.35)
ED50_42        -0.03  (0.36)
IFNeli_26      -0.02  (0.26)
IL4/IFNeli_0    0.04  (0.36)
bbrtrain -p 1 -s --autosearch --accurate commonBBR.txt commonBBR.mod
28- L1 Logistic Regression
- limited imputation
- common weeks only (0,4,8,26,30,38,42,46,50)
- no interactions
IGG_34         -0.16
ED50_30        -0.06
SI_8           -0.03
bbrtrain -p 1 -s --autosearch --accurate commonBBRreduced.txt commonBBRreduced.mod
29Tree Models
- Easy to understand: recursively divide predictor space into regions where response variable has small variance
- Can model complex interactions
- Hypothesis generation
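The recursive division described above rests on one repeated step: choosing the split of a predictor that leaves the response with the smallest variance inside each resulting region. The sketch below shows that single step for one predictor. The function names and the toy data are hypothetical, and this is not the functional decision tree method mentioned earlier in the talk.

```python
def sse(values):
    """Sum of squared deviations from the mean (n times the variance)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, total_sse) for the split x <= threshold that
    minimizes the within-region sum of squares of y."""
    pairs = sorted(zip(x, y))
    best = (None, sse(y))  # baseline: no split at all
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical predictor values
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        left = [v for _, v in pairs[:k]]
        right = [v for _, v in pairs[k:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best

# Hypothetical predictor values and a binary response.
x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
threshold, total = best_split(x, y)
```

A full tree would apply `best_split` over every predictor and then recurse into each of the two regions.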
34 Groups 1-3
36Work in Progress
- Data summary visualization ?
- Regularized logistic regression ?
- Characterize assay trajectories rather than individual time points ?
- Assessment of out-of-sample predictive performance ?
- Report ?