AVRP: Initial Analysis of the Non-Human Primate Study

1
AVRP: Initial Analysis of the Non-Human Primate Study
David Madigan, Rutgers University
stat.rutgers.edu/~madigan
2
Goal of the Analysis
  • Are measurable aspects of the state of the immune
    system predictive of survival?
  • Problem: hundreds of different assays but fewer
    than one hundred macaques
  • Initial descriptive analysis
  • Regularized predictive modeling


14
Logistic Regression Model
  • Linear model for log odds of category membership

\[
\log \frac{p(y = +1 \mid x_i)}{p(y = -1 \mid x_i)} = \sum_j b_j x_{ij} = b^{T} x_i
\]
  • Conditional probability model
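
A minimal sketch of the model on this slide, assuming labels coded as +1 / -1 and a coefficient vector b. The function names and the numbers are illustrative and do not come from the slides. The log odds are linear in the features, and the class probability follows by inverting that link:

```python
import math

def log_odds(b, x):
    """Linear predictor: sum_j b_j * x_j."""
    return sum(bj * xj for bj, xj in zip(b, x))

def prob_positive(b, x):
    """p(y = +1 | x), obtained by inverting the log-odds link."""
    return 1.0 / (1.0 + math.exp(-log_odds(b, x)))

# Hypothetical coefficients and feature vector, for illustration only.
b = [0.5, -1.0]
x = [2.0, 1.0]
p = prob_positive(b, x)
print(log_odds(b, x), p)
```

Taking log(p / (1 - p)) of the returned probability recovers the linear predictor, which is the sense in which this is a linear model for the log odds.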

15
Maximum Likelihood Training
  • Choose parameters (b_j's) that maximize
    probability (likelihood) of class labels (y_i's)
    given documents (x_i's)
  • Tends to overfit
  • Not defined if d > n
  • Feature selection
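
A small sketch of why the maximum likelihood estimate can fail to exist when there are more features than cases. The toy data and the separating direction are hypothetical. Once some coefficient vector classifies every case correctly, scaling it up keeps increasing the likelihood, so no finite maximizer exists:

```python
import math

def log_likelihood(b, X, y):
    """Sum of log p(y_i | x_i) under the logistic model, labels in {+1, -1}."""
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * sum(bj * xj for bj, xj in zip(b, xi))
        total += -math.log1p(math.exp(-margin))
    return total

# Toy data with d = 3 features but only n = 2 cases (hypothetical values).
X = [[1.0, 0.0, 2.0],
     [0.0, 1.0, -1.0]]
y = [+1, -1]

# This direction separates the two cases perfectly ...
direction = [1.0, -1.0, 0.0]
# ... so scaling it up keeps pushing the log-likelihood toward its supremum of 0.
values = [log_likelihood([c * d for d in direction], X, y) for c in (1, 5, 25)]
print(values)
```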

16
Shrinkage Methods
  • Feature selection is a discrete process:
    individual variables are either in or out.
    Combinatorial nightmare.
  • This method can have high variance: a different
    dataset from the same source can result in a
    totally different model
  • Shrinkage methods allow a variable to be partly
    included in the model. That is, the variable is
    included but with a shrunken coefficient
  • Elegant way to tackle overfitting


17
Ridge Regression
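\[
\hat{b}^{\text{ridge}} = \arg\min_{b} \sum_{i=1}^{n} \Big( y_i - b_0 - \sum_{j=1}^{d} x_{ij} b_j \Big)^2
\quad \text{subject to} \quad \sum_{j=1}^{d} b_j^2 \le s
\]
Equivalently
\[
\hat{b}^{\text{ridge}} = \arg\min_{b} \Big\{ \sum_{i=1}^{n} \Big( y_i - b_0 - \sum_{j=1}^{d} x_{ij} b_j \Big)^2 + \lambda \sum_{j=1}^{d} b_j^2 \Big\}
\]
This leads to
\[
\hat{b}^{\text{ridge}} = (X^{T} X + \lambda I)^{-1} X^{T} y
\]
Choose λ by cross-validation.
Works even when X^T X is singular.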
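
A minimal sketch of the standard closed-form ridge estimate, (X^T X + λI)^{-1} X^T y, written in plain Python with no intercept. The helper names and the data are illustrative assumptions. The example uses two identical columns, so X^T X is singular and ordinary least squares has no unique solution, while the ridge system still solves:

```python
def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * z[c] for c in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def ridge(X, y, lam):
    """Closed-form ridge estimate (X^T X + lam * I)^{-1} X^T y, no intercept."""
    d = len(X[0])
    A = [[sum(row[j] * row[k] for row in X) + (lam if j == k else 0.0)
          for k in range(d)] for j in range(d)]
    rhs = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(d)]
    return solve(A, rhs)

# Two identical columns make X^T X singular, yet the ridge system is solvable.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
y = [2.0, 4.0, 6.0]
b_hat = ridge(X, y, lam=1.0)
print(b_hat)
```

The penalty splits the effect evenly between the two duplicated columns and shrinks their sum slightly below the unpenalized value of 2.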
19
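Least Absolute Shrinkage and Selection Operator
(LASSO)
Tibshirani
\[
\hat{b}^{\text{lasso}} = \arg\min_{b} \sum_{i=1}^{n} \Big( y_i - b_0 - \sum_{j=1}^{d} x_{ij} b_j \Big)^2
\quad \text{subject to} \quad \sum_{j=1}^{d} |b_j| \le s
\]
  • Quadratic programming algorithm needed to solve
    for the parameter estimates
  • Modified Gauss-Seidel; highly tuned C
    implementation
  • http://stat.rutgers.edu/~madigan/BBR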
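
To illustrate the one-coordinate-at-a-time idea behind Gauss-Seidel-style solvers, here is a sketch of plain cyclic coordinate descent for the squared-error lasso in its penalized form, 0.5 * RSS + λ Σ|b_j|. This is not the BBR implementation, which is a tuned C program for logistic regression. The function names and the data are assumptions made for illustration:

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(X, y, lam, sweeps=100):
    """Cyclic coordinate descent for 0.5 * RSS + lam * sum_j |b_j| (no intercept)."""
    n, d = len(X), len(X[0])
    b = [0.0] * d
    for _ in range(sweeps):
        for j in range(d):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(b[k] * X[i][k] for k in range(d) if k != j))
                for i in range(n)
            )
            z = sum(X[i][j] ** 2 for i in range(n))
            b[j] = soft_threshold(rho, lam) / z
    return b

# Hypothetical data: y depends strongly on feature 0 and barely on feature 1.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
y = [3.0, 0.1, 3.1, 5.9]
b = lasso_cd(X, y, lam=1.0)
print(b)
```

On this data the least-squares fit is (3, 0.1). The lasso shrinks the first coefficient and sets the second to exactly zero, which is the selection behaviour that ridge regression does not have.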

21
Same as putting a double exponential (Laplace)
prior on each b_j
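
The equivalence can be sketched in one step, assuming independent Laplace priors with a common rate λ (the symbol λ and the notation L(b) for the likelihood are introduced here for illustration):

```latex
p(b_j) = \frac{\lambda}{2} \exp(-\lambda |b_j|)
\quad \Longrightarrow \quad
\log p(b \mid \text{data})
  = \log L(b) + \sum_j \log p(b_j) + \text{const}
  = \log L(b) - \lambda \sum_j |b_j| + \text{const}
```

Maximizing the posterior is therefore the same as maximizing the likelihood with an L1 penalty on the coefficients, which is the penalized form of the lasso.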
23
Data Sets
  • ModApte subset of Reuters-21578:
    90 categories, 9,603 training docs, 18,978 features
  • Reuters RCV1-v2:
    103 categories, 23,149 training docs, 47,152 features
  • OHSUMED heart disease categories:
    77 categories, 83,944 training docs, 122,076 features
  • Cosine-normalized TFxIDF weights
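
A sketch of cosine-normalized TFxIDF weighting. The slides do not say which TFxIDF variant was used, so this assumes raw term counts for TF and log(N / df) for IDF. The function name and the toy documents are hypothetical:

```python
import math
from collections import Counter

def tfidf_cosine(docs):
    """Return one {term: weight} dict per document, each scaled to unit length.

    Assumes raw term counts for TF and log(N / df) for IDF; other TFxIDF
    variants exist and the slides do not say which one was used.
    """
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        weights = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        # Guard against documents whose terms all occur in every document.
        vectors.append({t: (w / norm if norm else 0.0) for t, w in weights.items()})
    return vectors

docs = [["oil", "price", "oil"], ["wheat", "price"], ["oil", "wheat", "export"]]
vecs = tfidf_cosine(docs)
print(vecs[0])
```

After normalization every document vector has Euclidean length 1, so long and short documents contribute features on the same scale.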

24
Dense vs. Sparse Models (Macroaveraged F1)
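
For reference, a short sketch of how a macroaveraged F1 score is computed: F1 is calculated separately for each category and the results are averaged without weighting. The counts below are hypothetical and are not results from the slides:

```python
def f1(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); taken as 0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(counts):
    """Unweighted mean of per-category F1, so rare categories count as much as common ones."""
    return sum(f1(*c) for c in counts) / len(counts)

# Hypothetical (TP, FP, FN) counts for three categories.
counts = [(90, 10, 10), (5, 5, 15), (0, 2, 3)]
score = macro_f1(counts)
print(score)
```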
27
Groups 1-3: TNA at week 38, IFNm at week 4, TNFe at week 4

            Estimate Std. Error z value Pr(>|z|)
(Intercept) -14.9800     9.6559  -1.551   0.1208
tna38        -0.4594     0.5611  -0.819   0.4129
ifnm4         1.8591     1.4046   1.324   0.1856
tnfe4        16.2882     8.7637   1.859   0.0631 .

Groups 4-8: IgG at week 46, TNA at week 8, SI at week 38, IL6m at week 38

            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -2.7190     1.6131  -1.686  0.09186 .
dose         31.5690    19.6857   1.604  0.10879
igg46        -0.9257     0.6544  -1.415  0.15718
tna8         -0.1901     0.2356  -0.807  0.41971
si38          1.1912     0.8243   1.445   0.1345
il6m38       -0.9989     0.5405  -1.848  0.06457
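
As a reading aid for the coefficient tables, the sketch below shows how a z value and a two-sided p-value follow from an estimate and its standard error under the usual normal approximation. The function name is an assumption, and the inputs are the tnfe4 estimate and standard error from the Groups 1-3 output:

```python
import math

def wald_test(estimate, std_error):
    """Return (z, two-sided p-value) using the standard normal distribution."""
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p

# Values copied from the tnfe4 row of the Groups 1-3 table.
z, p = wald_test(16.2882, 8.7637)
print(round(z, 3), round(p, 4))
```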
28
Groups 1-3
29
Groups 4-8