Title: Bayesian Logistic Regression
1 Bayesian Logistic Regression
2O-Ring Data
- The O-Ring data consist of 23 observations on
Pre-Challenger Space Shuttle Launches - On each launch, it is observed whether there is
at least one O-ring failure, and the temperature
at launch - The goal is to model the probability of at least
one O-ring failure as a function of temperature
4 Logistic Regression Model
- P(T) = Pr(Y = 1 | T)
- Y is distributed as Bernoulli(P(T))
- log(P(T)/(1 - P(T))) = logit(P(T)) = beta1 + T*beta2
- P(T) = exp(beta1 + T*beta2) / (1 + exp(beta1 + T*beta2))
- Interested in P(55) and P(75) (see the numerical sketch below)
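Not in the original deck, but as a quick illustration of the two formulas above, here is a minimal Python sketch of the inverse-logit map; the beta values plugged in are the posterior means reported later on slide 29, used only to show the arithmetic (plug-in values, so the results need not equal the posterior means of the probabilities).

import numpy as np

def inv_logit(z):
    # P = exp(z) / (1 + exp(z)), the inverse of logit(P) = log(P/(1-P))
    return np.exp(z) / (1.0 + np.exp(z))

def p_failure(T, beta1, beta2):
    # P(T) = inv_logit(beta1 + T*beta2), per the model on this slide
    return inv_logit(beta1 + T * beta2)

beta1, beta2 = 10.79, -0.17            # posterior means from slide 29, for illustration
print(p_failure(55, beta1, beta2))     # ~0.81, near the reported P(55) of 0.77
print(p_failure(75, beta1, beta2))     # ~0.12, near the reported P(75) of 0.15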
5 WinBUGS Code (Noninformative Prior)
model {
  for (i in 1:n) {
    y[i] ~ dbern(p[i])
    logit(p[i]) <- beta[1] + T[i]*beta[2]
  }
  Or <- exp(beta[2])
  Prob1 <- exp(beta[1] + 55*beta[2]) / (1 + exp(beta[1] + 55*beta[2]))  # P(55)
  Prob2 <- exp(beta[1] + 75*beta[2]) / (1 + exp(beta[1] + 75*beta[2]))  # P(75)
  for (j in 1:2) { beta[j] ~ dnorm(0, 0.25) }  # dnorm takes a precision: variance 4
}
6 The Data and Initial Values
list(T = c(53,57,58,63,66,67,67,67,68,69,
           70,70,70,70,72,73,75,75,76,76,78,79,81),
     y = c(1,1,1,1,0,0,0,0,0,0,0,0,1,
           1,0,0,0,1,0,0,0,0,0), n = 23)
list(beta = c(0,0))    list(beta = c(-10,10))
- T: temperature at launch. y: 1 if at least one O-ring failure, 0 otherwise (checked in the snippet below).
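A quick sanity check of the data listing above (Python; not part of the original slides):

T = [53,57,58,63,66,67,67,67,68,69,70,70,70,70,72,73,75,75,76,76,78,79,81]
y = [1,1,1,1,0,0,0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0]
assert len(T) == 23 and len(y) == 23  # n = 23, as declared in the data list
print(sum(y), "of", len(y), "launches had at least one O-ring failure")  # 7 of 23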
7 Script Code
- We can either run the previous WinBUGS code by manually clicking, or
- We can run a script, which will do the clicking automatically
- A script is especially useful if you will run WinBUGS several times before you are done analyzing a data set
8 Here Is the Script Code
display('log')                Show results on screen
check('c:/oringlr.odc')       Run the program
data('c:/oringdat.odc')       Use these data
compile(2)                    Compile two Markov chains
inits(1, 'c:/orinit1.odc')    Use these initials for chain 1
inits(2, 'c:/orinit2.odc')    And these for chain 2
9 More Script
gen.inits()     Generate any initial values that were not specified already
update(500)     Burn-in of 500
set(beta)       Tell WinBUGS that you want to monitor the betas,
set(Prob1)      P(55),
set(Prob2)      P(75),
set(Or)         and Or
10 More Script
dic.set()       Set up the Deviance Information Criterion
update(2000)    Use 2000 MC samples
gr()            Get the Gelman-Rubin diagnostic
stats()         Get summary statistics
history()       Plot the histories for all parameters
trace()         Real-time monitoring of the chains
11 More Script
density()        Plot posterior densities
autoC()          Monitor autocorrelation
quantiles()      Monitor quantiles
dic.stats()      Obtain Deviance Information statistics
coda(*, output)  Save the Markov chains for future use
save('OringLog') Save the entire log in this file
12
- The script is run by going to the Model menu and clicking on Script
- Clearly convergence hadn't occurred before 1200 or so iterations after the burn-in; the burn-in was too small
13 Convergence is not happening
14 Autocorrelation is high, even higher for the betas
15 Quantiles are also not looking good
16 Now we took a 50,000 burn-in with 50,000 more samples
- It's still not converging
17 Further Analysis
- We considered another prior on the betas with larger variance (10,000)
- Convergence seemed better with a 100,000 burn-in, but it was still questionable
- So we consider a proper, slightly informative prior on the betas
18 A Sensible Informative Prior for Beta
- Now consider the probability of O-ring failure at 55 degrees and at 75 degrees, respectively
- We already defined P(55) and P(75)
- Suppose that a priori the scientist is 2/3 sure that P(55) > 0.5, and also 2/3 sure that P(75) < 0.5
- Using BetaBuster, we get Beta(1.6, 1) and Beta(1, 1.6) distributions for P(55) and P(75), respectively (verified in the sketch below)
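As a check on the BetaBuster elicitation (my addition, assuming scipy is available), the chosen Beta distributions do put probability of about 2/3 on the stated events:

from scipy.stats import beta

# Pr(P(55) > 0.5) under Beta(1.6, 1): 1 - 0.5**1.6 ~ 0.67, about 2/3
print(1 - beta.cdf(0.5, 1.6, 1))
# Pr(P(75) < 0.5) under Beta(1, 1.6): also ~0.67
print(beta.cdf(0.5, 1, 1.6))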
19 The Priors for P(55) and P(75)
(Figure: prior density for P(55) and prior density for P(75))
20
- This is not particularly strong prior information
- We have mainly suggested that the probability of O-ring failure may be decreasing as temperature increases
- This kind of prior may be preferable to having arbitrary normal priors on the betas
21 WinBUGS for LR with an Informative Prior
- We now need to tell WinBUGS what to do with the prior information
- This is very easy for this simple problem
- Two things are different from last time:
- (1) Specify Beta priors for the Ps rather than normal priors for the betas
- (2) Tell WinBUGS how to relate the Ps to the betas
22 We Know
- logit(P(55)) = beta1 + 55*beta2
- logit(P(75)) = beta1 + 75*beta2
- So we solve for beta1 and beta2 in terms of P(55) and P(75)
- This is standard algebra for two linear equations in two unknowns: subtracting the first equation from the second gives beta2 = (logit(P(75)) - logit(P(55)))/20, and substituting back gives beta1 = (75*logit(P(55)) - 55*logit(P(75)))/20
23 First Part Is the Same as Before
model {
  for (i in 1:n) {
    y[i] ~ dbern(p[i])
    logit(p[i]) <- beta1 + T[i]*beta2
  }
  Prob1 <- exp(beta1 + 55*beta2) / (1 + exp(beta1 + 55*beta2))
  Prob2 <- exp(beta1 + 75*beta2) / (1 + exp(beta1 + 75*beta2))
  # model continues on the next slide
24 This Is the New Part
  P55 ~ dbeta(1.6, 1)
  P75 ~ dbeta(1, 1.6)
  beta1 <- (75/20)*logit(P55) + (-55/20)*logit(P75)
  beta2 <- (-1/20)*logit(P55) + (1/20)*logit(P75)
}  # closes the model block opened on the previous slide
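The (75/20, -55/20, -1/20, 1/20) coefficients are just the inverse of the 2x2 design matrix from slide 22; a small numpy check (not in the original deck):

import numpy as np

# logit(P55) = beta1 + 55*beta2 and logit(P75) = beta1 + 75*beta2,
# i.e. X @ (beta1, beta2) = (logit(P55), logit(P75)) with X as below
X = np.array([[1.0, 55.0],
              [1.0, 75.0]])
print(np.linalg.inv(X))
# [[ 3.75 -2.75]    rows give beta1 = (75/20)*logit(P55) - (55/20)*logit(P75)
#  [-0.05  0.05]]   and       beta2 = (-1/20)*logit(P55) + (1/20)*logit(P75)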
25
- We need new initial values for the Ps:
- list(P = c(0.5,0.5))
- list(P = c(0.1,0.9))
- We ran this code with a burn-in of 10 and 10,000 samples
26 History Plots
27 Gelman-Rubin Plot
28 Quantile Plot
29 Posterior Inferences
          est      sd     2.5%    97.5%
  P55     0.77    0.15    0.42     0.97
  P75     0.15    0.08    0.03     0.34
  Beta0  10.79    4.75    2.424   21.05
  Beta2  -0.17    0.07   -0.32    -0.05
30 Beta_i ~ N(0, 10000)
          est      sd     2.5%    97.5%
  P55     0.88    0.14    0.49     0.999
  P75     0.086   0.07    0.005    0.27
  Beta0  18.37    8.89    4.40    38.75
  Beta1  -0.28    0.13   -0.58    -0.08
31 Induced Prior on Probs of O-Ring Failure
- Clearly, these priors are not sensible for this problem. They result in posteriors that shrink estimates of probabilities toward zero and one.
32 Logistic Regression with Error in the Response
- Now we consider situations where the binary response may be in error
- For example, Magder and Hughes analyzed data on 361 pregnant women who participated in a smoking cessation program
- The response is "Yes, I quit" or "No, I didn't"
33
- If they really quit, you would expect that they would tell the truth 100% of the time, but
- If they didn't quit, you might expect them to lie and say they did some proportion of the time
34 Model for Probability of Quitting as a Function of Covariate Information
- logit(P(Quit | x_i)) = x_i'Beta
- Covariates are age (3 categories), previous smoking status (2 categories), and education (3 categories)
- So we have 1 intercept plus 2 + 1 + 2 (dummy) regression coefficients, 6 in all (see the coding sketch below)
- We model prior information for 6 probabilities of quitting and induce a prior on Beta
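A sketch of one possible dummy coding giving the 6 coefficients; the reference levels and category labels here are my assumptions, since the actual design matrix is not preserved in this transcript:

def design_row(age, prev_smoking, education):
    # age: "20s" (reference), "30s", "40s"; prev_smoking: ">=1 pack/day" (reference),
    # "<1 pack/day"; education: "<HS" (reference), "HS grad", "college"
    return [
        1,                                          # intercept
        1 if age == "30s" else 0,                   # age dummy 1
        1 if age == "40s" else 0,                   # age dummy 2
        1 if prev_smoking == "<1 pack/day" else 0,  # smoking dummy
        1 if education == "HS grad" else 0,         # education dummy 1
        1 if education == "college" else 0,         # education dummy 2
    ]

print(design_row("20s", "<1 pack/day", "HS grad"))  # [1, 0, 0, 1, 1, 0]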
35 Informative Prior on Probability of Quitting
(Table: the six types of individual and their elicited prior information; not preserved in the transcript.)
36
- Let pi_i be the probability of actually quitting for individuals corresponding to the ith row of the previous table
- We will specify independent Beta priors for each of the types of individual that have been specified
- The rows of the matrix have been chosen so that no two are too close to one another, which helps keep X invertible and well conditioned
37
- Let pi be the vector (pi_1, ..., pi_6)
- Let x_i correspond to the ith row of the previous matrix, with a one in the first slot
- Let X be the 6-by-6 matrix formed by stacking the 6 row vectors x_i
- We elicit the following Beta priors for the pi_i
38 The Actual Prior Specification
(Table: for each of the six types, the Beta(a,b) prior with its prior mode and 95% probability interval. The table body did not survive the transcript; the (a,b) values appear in the WinBUGS code on slide 43, and the modes and intervals can be recomputed as sketched below.)
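A scipy sketch (my addition) recomputing the prior mode and central 95% interval from the Beta(a,b) parameters given in the slide 43 code:

from scipy.stats import beta

priors = [(8, 15), (10, 15), (3, 13), (8, 10), (4, 15), (6, 15)]  # from slide 43
for a, b in priors:
    mode = (a - 1) / (a + b - 2)             # mode of a Beta(a, b) when a, b > 1
    lo, hi = beta.ppf([0.025, 0.975], a, b)  # central 95% probability interval
    print(f"Beta({a},{b}): mode {mode:.2f}, 95% interval ({lo:.2f}, {hi:.2f})")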
39 Induced Prior on Beta
- We have selected X so that it is invertible. If it were not invertible, we would have to try again
- We have the vector equation logit(pi) = X*Beta
- which is solved by Beta = inv(X)*logit(pi) (see the numpy sketch below)
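In numpy the induced prior is a single solve. The X below is a hypothetical invertible design matrix (the actual rows from the slides' table are not in the transcript), and the pi values are the rounded prior modes implied by the slide 43 Beta priors:

import numpy as np

X = np.array([[1, 0, 0, 0, 0, 0],   # hypothetical 6x6 design matrix,
              [1, 1, 0, 0, 0, 0],   # one in the first slot of each row
              [1, 0, 1, 0, 0, 0],
              [1, 0, 0, 1, 0, 0],
              [1, 0, 0, 0, 1, 0],
              [1, 0, 0, 0, 0, 1]], dtype=float)
pi = np.array([0.33, 0.39, 0.14, 0.44, 0.18, 0.26])   # rounded prior modes
beta_vec = np.linalg.solve(X, np.log(pi / (1 - pi)))  # Beta = inv(X) @ logit(pi)
print(beta_vec)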
40 The Model for the Data
- Let Se = P(Says she quit | Actually quit)
- Sp = P(Says she didn't quit | Actually didn't)
- The response is Y_i = 1 if the ith woman says she quit, and Y_i = 0 if she says she didn't
- Let q_i be the actual probability of quitting for women with the same covariates as woman i in the data
41 q_i Plays the Role of Prevalence
- So the data are the Y_i's and their corresponding x_i's
- P(Y_i = 1 | x_i) = q_i*Se + (1 - q_i)*(1 - Sp), as in the WinBUGS code on slide 43 (and the sketch below)
- logit(q_i) = x_i'Beta
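A one-line Python version of this adjusted response probability (my addition; the Se and Sp values are the prior medians reported on the next slide, and q = 0.30 is the cessation probability later estimated on slide 47):

def p_says_quit(q, Se, Sp):
    # P(Y=1 | x) = q*Se + (1-q)*(1-Sp): true quitters report quitting with
    # probability Se; non-quitters falsely claim quitting with probability 1-Sp
    return q * Se + (1 - q) * (1 - Sp)

print(p_says_quit(0.30, 0.993, 0.89))  # ~0.375: reported quitting exceeds true quitting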
42 Prior on Se and Sp
- Se ~ Beta(a,b), Sp ~ Beta(c,d)
- For these data, a = 99, b = 1, c = 14, and d = 2 were selected
- The prior median for Se is 0.993, and the corresponding 95% probability interval is (0.963, 0.9997)
- For Sp, these are 0.89 and (0.68, 0.98)
43 WinBUGS Code
model {
  for (i in 1:N) {
    y[i] ~ dbern(s[i])
    s[i] <- q[i]*Se + (1 - q[i])*(1 - Sp)
    logit(q[i]) <- beta[6] + x1[i]*beta[1] + x2[i]*beta[2]
                 + x3[i]*beta[3] + x4[i]*beta[4] + x5[i]*beta[5]
  }
  Se ~ dbeta(99, 1)
  Sp ~ dbeta(14, 2)
  pi[1] ~ dbeta(8, 15)    pi[2] ~ dbeta(10, 15)
  pi[3] ~ dbeta(3, 13)    pi[4] ~ dbeta(8, 10)
  pi[5] ~ dbeta(4, 15)    pi[6] ~ dbeta(6, 15)
  # model continues on the next slide
44 Beta = inv(X)*logit(pi)
  beta[1] <- xinv[1,1]*logit(pi[1]) + xinv[1,2]*logit(pi[2])
           + xinv[1,3]*logit(pi[3]) + xinv[1,4]*logit(pi[4])
           + xinv[1,5]*logit(pi[5]) + xinv[1,6]*logit(pi[6])
  beta[2] <- xinv[2,1]*logit(pi[1]) + xinv[2,2]*logit(pi[2])
           + xinv[2,3]*logit(pi[3]) + xinv[2,4]*logit(pi[4])
           + xinv[2,5]*logit(pi[5]) + xinv[2,6]*logit(pi[6])
  beta[3] <- xinv[3,1]*logit(pi[1]) + xinv[3,2]*logit(pi[2])
           + xinv[3,3]*logit(pi[3]) + xinv[3,4]*logit(pi[4])
           + xinv[3,5]*logit(pi[5]) + xinv[3,6]*logit(pi[6])
  beta[4] <- xinv[4,1]*logit(pi[1]) + xinv[4,2]*logit(pi[2])
           + xinv[4,3]*logit(pi[3]) + xinv[4,4]*logit(pi[4])
           + xinv[4,5]*logit(pi[5]) + xinv[4,6]*logit(pi[6])
  # etc. for beta[5] and beta[6]
}
45
- The adjusted odds ratio estimate for smoking was 3.62 (1.97, 7.78)
- So, if you smoked less than a pack a day, your estimated odds of quitting were about 3.6 times greater than if you smoked more than a pack a day
46
- The odds of quitting for those who graduated from HS and did not attend college were estimated to be 2.36 (1.27, 5.14) times greater than for those who did not graduate from high school
47 Estimated Probabilities of Smoking Cessation Among:
- (a) high school graduates in their 20s without any college experience who smoked < 1 pack/day: 0.30 (0.15, 0.41)
- (b) high school graduates in their 20s without any college experience who smoked >= 1 pack/day: 0.10 (0.03, 0.20)
48 Estimated Difference and Ratio
- (c) the risk difference: 0.19 (0.09, 0.28)
- (d) the risk ratio: 2.82 (1.68, 6.00)
49 Probit Regression with or without Error
- Simply replace logit(P) with probit(P) throughout, where probit(P) = Phi^(-1)(P) and Phi() is the standard normal cdf (see the sketch below)
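For instance, the probit version of the O-ring model in Python terms (a sketch, assuming scipy; Phi is scipy.stats.norm.cdf):

from scipy.stats import norm

def p_failure_probit(T, beta1, beta2):
    # probit link: probit(P(T)) = beta1 + T*beta2, so P(T) = Phi(beta1 + T*beta2)
    return norm.cdf(beta1 + T * beta2)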
50 Reference
- McInturff, Johnson, Gardner, and Cowling. "Modeling Risk When Binary Outcomes Are Subject to Error." Statistics in Medicine (to appear).