Title: Logistic Regression II
1. Logistic Regression II
2. How to calculate ORs from a model with an interaction term
Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept    1    -1.2858           0.6247            4.2360       0.0396
psa          1     0.0608           0.0280           11.6952       0.0006
race         1     0.0954           0.5421            0.0310       0.8603
psa*race     1    -0.0349           0.0193            3.2822       0.0700
Increased odds for every 5 mg/ml increase in PSA:
- If white (race=0)
- If black (race=1)
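The two odds ratios follow directly from the estimates in the parameter table. A minimal sketch in Python, assuming the fitted model is logit(p) = b0 + b_psa*psa + b_race*race + b_int*psa*race with race coded 0 = white and 1 = black; the constant and function names are illustrative and not from the slides.

```python
import math

# Estimates copied from the parameter table (psa and psa*race rows).
B_PSA = 0.0608
B_PSA_RACE = -0.0349

def or_for_psa_increase(delta_psa, race):
    """Odds ratio for a delta_psa increase in psa, at race = 0 or 1."""
    # With an interaction term, the psa slope depends on race.
    slope = B_PSA + B_PSA_RACE * race
    return math.exp(delta_psa * slope)

or_white = or_for_psa_increase(5, race=0)  # exp(5 * 0.0608)
or_black = or_for_psa_increase(5, race=1)  # exp(5 * (0.0608 - 0.0349))
print(f"white: {or_white:.2f}, black: {or_black:.2f}")
```

Under these assumptions the odds rise by roughly 36% per 5 mg/ml in white men and roughly 14% in black men.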
3-4. ORs for increasing psa at different levels of race.
5. OR for being black (vs. white) at different levels of psa.
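Because of the interaction term, the race odds ratio changes with psa: OR(black vs. white) = exp(b_race + b_int*psa). A short sketch under the same model assumptions (race coded 0 = white, 1 = black); the psa levels 0, 10, and 20 are arbitrary illustrative choices, and the names are hypothetical.

```python
import math

# Estimates copied from the parameter table (race and psa*race rows).
B_RACE = 0.0954
B_PSA_RACE = -0.0349

def or_black_vs_white(psa):
    """Odds ratio for black vs. white men at a fixed psa level."""
    return math.exp(B_RACE + B_PSA_RACE * psa)

for psa in (0, 10, 20):  # illustrative psa levels only
    print(f"psa = {psa:2d}: OR = {or_black_vs_white(psa):.2f}")
```

The odds ratio is slightly above 1 at psa = 0 and falls below 1 as psa increases, which is what the negative interaction estimate implies.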
6. Predictions
- What's the predicted probability for a white man with a psa level of 10 mg/ml?
7. Predictions
- What's the predicted probability for a black man with a psa level of 10 mg/ml?
8. Predictions
- What's the predicted probability for a white man with a psa level of 0 mg/ml (reference group)?
9. Predictions
- What's the predicted probability for a black man with a psa level of 0 mg/ml?
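Each of these predictions plugs the covariate values into the fitted linear predictor and applies the inverse logit. A minimal sketch, assuming the estimates from the parameter table and race coded 0 = white, 1 = black; the function name is illustrative.

```python
import math

# Estimates copied from the parameter table.
B0, B_PSA, B_RACE, B_PSA_RACE = -1.2858, 0.0608, 0.0954, -0.0349

def predicted_probability(psa, race):
    """Predicted probability of capsule penetration for given psa and race."""
    logit = B0 + B_PSA * psa + B_RACE * race + B_PSA_RACE * psa * race
    return 1.0 / (1.0 + math.exp(-logit))  # inverse logit

cases = [("white, psa 10", 10, 0), ("black, psa 10", 10, 1),
         ("white, psa 0", 0, 0), ("black, psa 0", 0, 1)]
for label, psa, race in cases:
    print(f"{label}: {predicted_probability(psa, race):.3f}")
```

For the reference group (white, psa = 0) only the intercept contributes, so the prediction is simply the inverse logit of -1.2858.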
10. Diagnostics: Residuals
- What's a residual in the context of logistic regression?
- Residual = observed - predicted
- For logistic regression:
  - residual = 1 - predicted probability
  - OR residual = 0 - predicted probability
11. Diagnostics: Residuals
- What's the residual for a white man with a psa level of 0 mg/ml who has capsule penetration?
- What's the residual for a white man with a psa level of 0 mg/ml who does not have capsule penetration?
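Both residuals use the predicted probability for the reference group (white, psa = 0), which depends only on the intercept. A small sketch using the interaction-model intercept from the parameter table; the deviance residual is included because the SAS `resdev` output option reports that kind of residual rather than the raw one, and its formula here is the standard definition, not something taken from the slides. Function names are illustrative.

```python
import math

B0 = -1.2858  # intercept; at psa = 0 and race = 0 all other terms vanish

def raw_residual(y, p):
    """Observed outcome (0 or 1) minus predicted probability."""
    return y - p

def deviance_residual(y, p):
    """Signed square root of the observation's contribution to the deviance."""
    sign = 1.0 if y == 1 else -1.0
    likelihood = p if y == 1 else 1.0 - p
    return sign * math.sqrt(-2.0 * math.log(likelihood))

p_ref = 1.0 / (1.0 + math.exp(-B0))  # predicted probability, white man, psa = 0
for y in (1, 0):
    print(f"capsule = {y}: raw = {raw_residual(y, p_ref):+.3f}, "
          f"deviance = {deviance_residual(y, p_ref):+.3f}")
```

A man with penetration gets a large positive residual because the model predicted a low probability; a man without penetration gets a small negative one.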
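12. In SAS: recall the model with psa and gleason
proc logistic data = hrp261.psa;
  model capsule (event="1") = psa gleason;
  /* save predicted probabilities, their confidence limits, and deviance residuals */
  output out=MyOutdata l=MyLowerCI p=Mypredicted u=MyUpperCI resdev=Myresiduals;
run;
proc gplot data = MyOutdata;
  /* replace "predictor" with the covariate to plot against, e.g. psa */
  plot Myresiduals*predictor;
run;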
13. Plot: residual*psa
14. Plot: estimated prob*gleason
15. Model-Checking: Goodness of Fit
- Partition observations into groups by level of one predictor (e.g., 8 groups by psa level)
- Sum the predicted probabilities for all ~n/8 observations in each of the groups (expected)
- Count the total number of events (e.g., capsule=1) in each of the groups (observed)
- You have 8 observed vs. expected counts
- Sum of (observed - expected)^2 / expected ~ chi-square distribution with df = groups (8) - number of parameters in the model (2) = 6
- Null hypothesis: model is a good fit
16. Hosmer and Lemeshow Lack-of-Fit Test
- Divide observations into (as best as possible) deciles from lowest to highest predicted probabilities
- May not be exactly even deciles because of ties and if the total number of observations is not a multiple of 10
- Sum the predicted probabilities for all ~n/10 observations in each decile (expected)
- Count the total number of events in each decile (observed)
- You have 10 observed vs. expected counts
- Sum of (observed - expected)^2 / expected ~ chi-square distribution (df = 8)
- Null hypothesis: model is a good fit
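The grouping steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SAS algorithm: groups are formed by rank, so tied probabilities are not kept together, and the statistic sums over both event and non-event cells, as in the usual Hosmer and Lemeshow statistic. The data are simulated and entirely made up.

```python
import random

def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, df, table) for 0/1 outcomes y and predicted probabilities p."""
    order = sorted(range(len(y)), key=lambda i: p[i])  # lowest to highest prediction
    n = len(order)
    stat = 0.0
    table = []
    for g in range(groups):
        # Rank-based split into near-equal groups (ties are not handled specially).
        idx = order[g * n // groups:(g + 1) * n // groups]
        total = len(idx)
        exp_events = sum(p[i] for i in idx)   # expected events in the group
        obs_events = sum(y[i] for i in idx)   # observed events in the group
        exp_non = total - exp_events
        obs_non = total - obs_events
        stat += (obs_events - exp_events) ** 2 / exp_events
        stat += (obs_non - exp_non) ** 2 / exp_non
        table.append((total, obs_events, exp_events, obs_non, exp_non))
    return stat, groups - 2, table

# Made-up data for illustration only: outcomes are simulated from the same
# probabilities used as "predictions", so the model is correct by construction.
random.seed(0)
p = [random.uniform(0.05, 0.95) for _ in range(377)]
y = [1 if random.random() < pi else 0 for pi in p]

stat, df, table = hosmer_lemeshow(y, p)
print(f"chi-square = {stat:.2f} on {df} df")
```

A large statistic relative to the chi-square distribution with 8 df would be evidence against the null hypothesis of good fit.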
17. In SAS
proc logistic data = hrp261.psa;
  model capsule (event="1") = psa race psa*race / lackfit;
run;
18. Results: lackfit option

Partition for the Hosmer and Lemeshow Test

                   capsule = 1           capsule = 0
Group   Total   Observed   Expected   Observed   Expected
    1      39          4       9.80         35      29.20
    2      38         11      10.40         27      27.60
    3      38         11      11.10         27      26.90
    4      40         13      12.33         27      27.67
    5      38         17      12.46         21      25.54
    6      38         12      13.27         26      24.73
    7      38         13      14.61         25      23.39
    8      38         22      17.18         16      20.82
    9      38         21      21.84         17      16.16
   10      32         27      28.01          5       3.99

Hosmer and Lemeshow Goodness-of-Fit Test
INDICATES GOOD FIT!
19. In SAS
proc logistic data = hrp261.psa;
  model capsule (event="1") = psa / lackfit;
run;
20. Results: lackfit option

                   capsule = 1           capsule = 0
Group   Total   Observed   Expected   Observed   Expected
    1      40          4      10.51         36      29.49
    2      39         12      11.02         27      27.98
    3      38          9      11.35         29      26.65
    4      37         15      11.65         22      25.35
    5      40         17      13.24         23      26.76
    6      40          9      14.04         31      25.96
    7      38         13      14.56         25      23.44
    8      39         28      17.44         11      21.56
    9      38         21      21.86         17      16.14
   10      31         25      27.32          6       3.68

Hosmer and Lemeshow Goodness-of-Fit Test (Chi-Square, DF, Pr > ChiSq)
INDICATES POOR FIT!
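The test statistic can be recomputed by hand from the two partition tables. A sketch in Python, assuming the statistic sums (observed - expected)^2 / expected over both the capsule = 1 and capsule = 0 columns with df = 8; the p-value uses a closed form that is valid only for even degrees of freedom. The expected counts are the rounded values shown in the tables, so results may differ slightly from the SAS output. All names are illustrative.

```python
import math

def hl_statistic(rows):
    """rows: (observed events, expected events, observed non-events, expected non-events)."""
    return sum((o1 - e1) ** 2 / e1 + (o0 - e0) ** 2 / e0
               for o1, e1, o0, e0 in rows)

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))

# Model with psa, race, and psa*race (first partition table).
interaction_model = [
    (4, 9.80, 35, 29.20), (11, 10.40, 27, 27.60), (11, 11.10, 27, 26.90),
    (13, 12.33, 27, 27.67), (17, 12.46, 21, 25.54), (12, 13.27, 26, 24.73),
    (13, 14.61, 25, 23.39), (22, 17.18, 16, 20.82), (21, 21.84, 17, 16.16),
    (27, 28.01, 5, 3.99),
]
# Model with psa only (second partition table).
psa_only_model = [
    (4, 10.51, 36, 29.49), (12, 11.02, 27, 27.98), (9, 11.35, 29, 26.65),
    (15, 11.65, 22, 25.35), (17, 13.24, 23, 26.76), (9, 14.04, 31, 25.96),
    (13, 14.56, 25, 23.44), (28, 17.44, 11, 21.56), (21, 21.86, 17, 16.14),
    (25, 27.32, 6, 3.68),
]

for name, rows in [("psa race psa*race", interaction_model), ("psa only", psa_only_model)]:
    stat = hl_statistic(rows)
    print(f"{name}: chi-square = {stat:.2f}, p = {chi2_sf_even_df(stat, 8):.4f}")
```

From the rounded counts, the first model gives a statistic of about 10.5 (p about 0.23) and the psa-only model about 25.7 (p about 0.001), in line with the good-fit and poor-fit verdicts on the slides.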
21. Conditional Logistic Regression for Matched Data
22. Recall: Matching
- Matching can control for extraneous sources of variability and increase the power of a statistical test.
- Match M controls to each case based on potential confounders, such as age and gender.
23. Recall: Agresti example, diabetes and MI
- Match each MI case to a control without MI based on age and gender.
- Ask about history of diabetes to find out if diabetes increases your risk for MI.
24. odds(favors case | discordant pair)
25. Conditional Logistic Regression
26. The Conditional Likelihood: each discordant stratum (rather than each individual) gets 1 term in the likelihood
For each stratum, we add to the likelihood the
CONDITIONAL probability that the case got disease
and the control did not, given that we have a
case-control pair.
Note: the marginal probability of disease may
differ in each age-gender stratum, but we assume
that the (multiplicative) increase in disease
risk due to exposure is constant across strata.
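For 1:1 matching with a single binary exposure, each pair contributes exp(beta*x_case) / (exp(beta*x_case) + exp(beta*x_control)) to the conditional likelihood, so the stratum-specific baseline risks cancel. A minimal sketch that maximizes this likelihood by a crude grid search and compares the result with the log ratio of discordant pairs; the pair counts are hypothetical and chosen only for illustration, and the names are not from the slides.

```python
import math

def conditional_log_likelihood(beta, pairs):
    """pairs: list of (case_exposure, control_exposure), one tuple per matched pair."""
    total = 0.0
    for x_case, x_control in pairs:
        # log P(case has x_case and control has x_control | one case, one control)
        total += beta * x_case - math.log(math.exp(beta * x_case) +
                                          math.exp(beta * x_control))
    return total

# Hypothetical matched pairs (not data from the slides):
# 30 discordant pairs with only the case exposed, 15 with only the control
# exposed, and 55 concordant pairs.
pairs = [(1, 0)] * 30 + [(0, 1)] * 15 + [(1, 1)] * 10 + [(0, 0)] * 45

# Crude grid search over beta; a real analysis would use an optimizer.
grid = [i / 1000 for i in range(-3000, 3001)]
beta_hat = max(grid, key=lambda b: conditional_log_likelihood(b, pairs))

print(f"beta_hat = {beta_hat:.3f}, log(30/15) = {math.log(30 / 15):.3f}")
print(f"OR = {math.exp(beta_hat):.2f}")
```

Concordant pairs add the same constant, log(1/2), at every beta, which is why only discordant strata carry information about the odds ratio.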
27. Recall: probability terms
29. The conditional likelihood
30. Conditional Logistic Regression
31. Example: MI and diabetes
32. Conditional Logistic Regression
33. In SAS
proc logistic data = YourData;
  model MI (event="Yes") = diabetes;
  strata PairID;
run;
34. Example: Prenatal ultrasound examinations and risk of childhood leukemia: case-control study. BMJ 2000;320:282-283
- Could there be an association between exposure to ultrasound in utero and an increased risk of childhood malignancies?
- Previous studies have found no association, but they have had poor statistical power to detect an association.
- Swedish researchers performed a nationwide population-based case-control study using prospectively assembled data on prenatal exposure to ultrasound.
35. Example: Prenatal ultrasound examinations and risk of childhood leukemia: case-control study. BMJ 2000;320:282-283
- 535 cases: all children born and diagnosed as having myeloid leukemia between 1973 and 1989 in Swedish registers of birth, cancer, and causes of death.
- 535 matched controls: 1 control was randomly selected for each case from the Swedish Birth Registry, matched by sex and year and month of birth.
36. [Table of the 535 matched pairs; cell counts 115, 85, 235, 100; row and column labels not preserved]
But this type of analysis is limited to a single dichotomous exposure.
37.
- Used conditional logistic regression to look at dose-response with number of ultrasounds
- Results:
  - Reference (OR = 1.0): no ultrasounds
  - OR = 0.91 for 1-2 ultrasounds
  - OR = 0.64 for >3 ultrasounds
- Conclusion: no evidence of a positive association between prenatal ultrasound and childhood leukemia; even evidence of an inverse association (which could be explained by reasons for frequent ultrasound)
38. Extension: 1:M matching
- Each term in the likelihood represents a stratum of 1+M individuals
- More complicated likelihood expression!
- Just as easy to implement in SAS, as we'll see Wednesday
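With 1:M matching, the usual form of each stratum's term is exp(beta*x_case) divided by the sum of exp(beta*x) over all 1+M members of the stratum. A short sketch of that term for a single exposure variable; the strata are hypothetical and the function names are illustrative, not from the slides.

```python
import math

def stratum_log_term(beta, x_case, x_controls):
    """Log conditional probability that the actual case is the case in its stratum."""
    denom = math.exp(beta * x_case) + sum(math.exp(beta * x) for x in x_controls)
    return beta * x_case - math.log(denom)

def conditional_log_likelihood(beta, strata):
    """strata: list of (case_exposure, [control_exposures]) with M controls each."""
    return sum(stratum_log_term(beta, x_case, x_controls)
               for x_case, x_controls in strata)

# Hypothetical 1:3 matched strata, for illustration only.
strata = [(1, [0, 0, 1]), (0, [1, 0, 0]), (1, [0, 0, 0]), (1, [1, 0, 1])]

for beta in (0.0, 0.5, 1.0):
    print(f"beta = {beta:.1f}: log-likelihood = "
          f"{conditional_log_likelihood(beta, strata):.3f}")
```

With M = 1 the term reduces to the matched-pair expression, and at beta = 0 every member of a stratum is equally likely to be the case, giving a term of log(1 / (1 + M)).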