Title: Logistic Regression


1
Logistic Regression Clinical Prediction Rules
  • DOC Research
  • October 21, 2009
  • Brian F. Gage, MD
  • Thanks to Curtis A. Parvin, Ph.D

2
Regression: Relate one or more predictor (independent)
variables to an outcome (dependent) variable
  • Cox regression
      • Binary outcome variable
      • Used to quantify the time to event (the hazard)
      • Assumptions
          1. Proportional hazards (multiplicative risk) and
          2. Non-informative censoring
  • Logistic regression
      • Binary outcome variable
      • Quantify the relationship between the odds of the
        outcome occurring and the predictor variable(s)
  • Ordinary linear regression
      • Continuous outcome variable
      • Determine the relationship between a continuous
        outcome variable and the predictor variable(s)

3
Other Flavors of Logistic Regression
  • Conditional Logistic Regression
      • Matched pairs data (1:1, 1:k, k1:k2 matching)
  • Ordinal Logistic Regression
      • More than two ordered groups for outcome
  • Multinomial Logistic Regression
      • More than two unordered groups for outcome

4
Example 1: Odds of CA After Background Exposure
to Radiation
  • Study Question: Does the background level of
    radiation fallout cause CA?
  • Predictor Variable: ?
  • http://www.elementsdatabase.com/

5
  • Hypothesis: Higher levels of radioactive
    strontium (Sr-90) in deciduous teeth lost in the
    1960s predict subsequent CA over the next 40
    years.
  • Subjects: St. Louis children
  • Statistical Analysis: Could be Cox or logistic
    regression
  • Study Design: What do you recommend?
  • Assume that you are collaborating w/ the tooth
    fairy
  • Finding: http://newsok.com/st.-louis-baby-teeth-yield-new-findings-on-nuclear-fallout/article/feed/95578

6
Example 2: Development of Angina
LaValley MP. Logistic regression. Circulation
2008;117:2395-2399.
7
Angina: Goodness of Fit (Calibration)
8
Angina: Discrimination
  • C-statistic measures how well we can
    differentiate volunteers in the 2 groups
  • Generally ranges from 0.5 (no discrimination
    better than chance) to 1.0 (perfect
    discrimination)
  • 0.8-0.9 is excellent
  • C = 0.64 for this model
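
One common reading of the C-statistic is the proportion of all (event, non-event) pairs in which the subject with the event received the higher predicted probability, with ties counted as one half. The sketch below computes it under that definition. The function name and the four-subject data are hypothetical and do not come from the angina model.

```python
def c_statistic(outcomes, scores):
    """Proportion of (event, non-event) pairs ranked correctly; ties count 0.5."""
    event_scores = [s for y, s in zip(outcomes, scores) if y == 1]
    nonevent_scores = [s for y, s in zip(outcomes, scores) if y == 0]
    concordant = 0.0
    for s_event in event_scores:
        for s_non in nonevent_scores:
            if s_event > s_non:
                concordant += 1.0
            elif s_event == s_non:
                concordant += 0.5
    return concordant / (len(event_scores) * len(nonevent_scores))


# Hypothetical example: 2 events, 2 non-events, 3 of 4 pairs ranked correctly.
example_c = c_statistic([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2])  # 0.75
```

A model that assigns every subject the same probability scores 0.5 here, which matches the "no better than chance" end of the range.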

9
Example 3: Odds of Major Bleeding Around Time of
NSTEMI
  • Background: Tx of MI often causes bleeding
  • Significance: Being able to predict major
    bleeding could allow us to minimize that risk
  • Hypothesis: We could develop and validate an
    accurate clinical prediction rule for bleeding
  • Study Design: Split-sample, retrospective cohort
  • Subjects: 89,134 participants in CRUSADE

10
Split Samples: 80% Derivation, 20% Validation
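
A split-sample design can be sketched as a random shuffle followed by a cut at 80%. This is only an illustration: the function name, the seed, and the use of integer subject IDs are assumptions, and the slides do not describe how the CRUSADE split was actually performed.

```python
import random


def split_sample(subject_ids, derivation_fraction=0.8, seed=0):
    """Randomly partition subjects into derivation and validation sets."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    cut = int(round(derivation_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical cohort of 100 subject IDs.
derivation, validation = split_sample(range(100))
```

The rule is developed only on the derivation set and then assessed on the validation set, which the model has not seen.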
16
http://www.crusadebleedingscore.org
17
Example 4: Relationship between gestational age
(GA) and whether an infant is breast feeding at
time of hospital discharge
18
Ordinary Linear Regression
19
Logistic Regression
20
How do we get an S-shaped curve?
  • Rather than using probability as our outcome
    variable, we use a transformation that is a
    function of probability
  • We choose our transformation so that it ranges
    between (-∞, ∞) as probability ranges between
    (0, 1)
  • We will use the logit transform
  • Fitting a straight line using the logit transform
    as the outcome variable is called logistic
    regression
  • After we estimate the straight line we can
    transform back to get our S-shaped curve

21
Probability, Odds, and the Logit Transform
  • Probability, P, ranges between 0 and 1
  • Define Odds = P/(1-P)
  • Odds range between 0 and ∞
  • Note: P = Odds/(1 + Odds)
  • The Logit transform is the logarithm of the Odds
  • Logit = log(Odds) = log[P/(1-P)]
  • Logit ranges between -∞ and ∞
  • Note: Odds = e^Logit
  • Note: P = e^Logit/(1 + e^Logit)
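
The conversions between probability, odds, and logit can be written out directly. The sketch below follows the definitions on this slide; the function names are illustrative, and the functions are only defined for 0 < P < 1.

```python
import math


def odds_from_prob(p):
    """Odds = P / (1 - P)."""
    return p / (1.0 - p)


def prob_from_odds(odds):
    """P = Odds / (1 + Odds)."""
    return odds / (1.0 + odds)


def logit(p):
    """Logit = log(Odds), using the natural logarithm."""
    return math.log(odds_from_prob(p))


def inv_logit(x):
    """P = e^Logit / (1 + e^Logit)."""
    return math.exp(x) / (1.0 + math.exp(x))
```

For example, P = 0.8 corresponds to odds of about 4, and P = 0.5 corresponds to a logit of 0.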

22
Log(Odds) = -16.72 + 0.577 × GA
24
Logistic Regression
  • Model the logarithm of the odds of an outcome as
    a linear combination of predictor variables
  • Logit = log(Odds) = a + bX + cY + ...
  • Estimate the coefficients a, b, c based on a
    random sample of subjects' data
  • Determine which of the predictors are good
  • Assess model fit
  • Use the model to predict future cases
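
The slides do not say how the coefficients are estimated. Statistical packages typically use maximum likelihood, and for a single predictor one way to sketch that is a Newton-Raphson iteration on the log-likelihood. The function name, the fixed iteration count, and the eight-subject data set are all hypothetical.

```python
import math


def fit_logistic(xs, ys, iterations=25):
    """Maximum-likelihood fit of log(Odds) = a + b*x by Newton-Raphson."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # entries of the information matrix
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (information matrix) * step = gradient.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical data: binary predictor x, binary outcome y.
xs = [1, 1, 1, 1, 0, 0, 0, 0]
ys = [1, 1, 1, 0, 1, 0, 0, 0]
a, b = fit_logistic(xs, ys)
odds_ratio = math.exp(b)  # about 9, the cross-product ratio (3*3)/(1*1)
```

This sketch has no convergence check and will fail if the outcome is perfectly separated by the predictor; real software handles both cases.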

25
Logistic Regression Coefficients
  • For a single predictor variable, logistic
    regression fits a straight line to the log of the
    Odds
  • log(Odds) = a + bX
  • b is the slope coefficient for X
  • Each 1-unit change in X changes the log(Odds) by
    b units

27
Logistic Regression Coefficients
  • b = log[Odds(X+1)] - log[Odds(X)]
  • Note: log(A) - log(B) = log(A/B)
  • b = log[Odds(X+1)/Odds(X)]
  • Note: Odds(X+1)/Odds(X) is called an Odds ratio

28
Odds and Odds Ratios
  • Odds are defined as the probability that an event
    occurs divided by the probability that the event
    doesn't occur
  • An Odds ratio is the ratio of two Odds
  • An Odds ratio could represent the ratio of the
    odds in two different groups
  • An Odds ratio could represent the ratio of the
    odds at two different values for a risk variable

29
Breast Feeding Example
The Odds ratio for breast feeding at hospital
discharge for GA = 32 compared to GA = 28 is
4.0/0.5 = 8.0
30
Logistic Regression Coefficients and Odds Ratios
  • b = log[Odds(X+1)/Odds(X)]
  • b estimates the log of the Odds ratio associated
    with a 1 unit increase in X
  • e^b estimates the odds ratio for a 1 unit
    increase in X
  • For the breast feeding example
  • log(Odds) = -16.72 + 0.577 × GA
  • the odds of breast feeding at hospital discharge
    increase by a factor of e^0.577 = 1.78 for each
    additional week of GA
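
The breast feeding equation can be evaluated directly to see how the coefficient turns into an odds ratio and a predicted probability. The coefficients below are the ones quoted on the slide; the function names are illustrative.

```python
import math

INTERCEPT = -16.72   # a, from the slide
SLOPE_GA = 0.577     # b, change in log(Odds) per week of GA, from the slide


def log_odds(ga_weeks):
    """log(Odds) = a + b * GA."""
    return INTERCEPT + SLOPE_GA * ga_weeks


def probability(ga_weeks):
    """Back-transform the log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds(ga_weeks)))


def odds_ratio_per_week():
    """Odds ratio for a 1-week increase in GA, e^b."""
    return math.exp(SLOPE_GA)
```

Under this equation the log odds cross zero near 29 weeks, so the predicted probability there is close to 0.5.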

31
Logistic Regression Odds Ratios
32
Logistic Regression When There is Only One Binary
Predictor
  • This situation can be handled as a classic
    case-control study

                       Disease
                   Cases   Controls
  Risk      Yes      a        b
  Factor    No       c        d

  Odds Ratio (OR) = (a/c)/(b/d) = ad/bc
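
As a small worked example, the cross-product formula can be applied to the four cell counts of the table. The counts used here are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def odds_ratio_2x2(a, b, c, d):
    """Odds ratio (a/c)/(b/d) = ad/bc for a 2x2 case-control table.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)


# Hypothetical counts: 20 exposed cases, 10 exposed controls,
# 80 unexposed cases, 90 unexposed controls.
example_or = odds_ratio_2x2(20, 10, 80, 90)  # (20*90)/(10*80) = 2.25
```

The function divides by b*c, so it is undefined when either of those cells is zero.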
33
The Real Strength of Logistic Regression is When
There are Multiple Predictor Variables
  • The independent variables (predictors, risk
    factors) can be categorical or continuous
  • Example: TDx-FLM II and gestational age as
    predictors of risk for respiratory distress
    syndrome (RDS)
  • TDx-FLM II measures mg surfactant/g of albumin in
    amniotic fluid

34
The Data (some of it)
38
Logistic Regression Parameter Estimates
---------------------------------------------------------------------------
     rds |     Coef.   Std. Err.      z    P>|z|    [95% Conf. Interval]
---------+-----------------------------------------------------------------
  tdxflm | -.1121656   .0163848   -7.11   0.000   -.1442792   -.0800520
      ga | -.3661113   .1192559   -2.58   0.010   -.5998486   -.1323740
   _cons |  15.68597   4.322678    3.63   0.000    7.213680    24.15827
---------------------------------------------------------------------------
log(Odds) = 15.69 - 0.112 × TDxFLM - 0.366 × GA
Odds Ratio for a 1 mg/g increase in TDxFLM = e^(-0.112) = 0.894
Odds Ratio for a 1 week increase in GA = e^(-0.366) = 0.693
39
Using the Logistic Model to Predict Risk of RDS
  • We can use the logistic model equation to
      • Identify variables that are significant
        predictors
      • Calculate the absolute risk (probability) of RDS
        (may give biased estimates)
      • Calculate the relative risk (odds ratio) of RDS
      • Develop a classifier for diagnosing RDS
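
A rough sketch of the first three uses, taking the coefficients from the RDS parameter-estimate table. The function names are hypothetical, and the reference point of 70 mg/g at 37 weeks is borrowed from the odds-ratio slide. As the slide cautions, the absolute probabilities may be biased, and this sketch is not claimed to reproduce any predicted probability tabulated in the slides.

```python
import math

# Coefficients copied from the parameter-estimate table.
CONST = 15.68597
B_TDXFLM = -0.1121656   # per 1 mg/g of TDx-FLM II
B_GA = -0.3661113       # per week of gestational age


def rds_log_odds(tdxflm, ga):
    """log(Odds) = const + b1 * TDxFLM + b2 * GA."""
    return CONST + B_TDXFLM * tdxflm + B_GA * ga


def rds_probability(tdxflm, ga):
    """Absolute risk implied by the equation (may be biased)."""
    return 1.0 / (1.0 + math.exp(-rds_log_odds(tdxflm, ga)))


def rds_odds_ratio(tdxflm, ga, ref_tdxflm=70.0, ref_ga=37.0):
    """Odds of RDS relative to a reference TDx-FLM II value and GA."""
    return math.exp(rds_log_odds(tdxflm, ga) - rds_log_odds(ref_tdxflm, ref_ga))
```

Moving one unit away from the reference in either predictor returns that predictor's odds ratio, for example `rds_odds_ratio(71, 37)` is about 0.894.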

40
Logistic Regression Parameter Estimates
Significant coefficients mean significantly
different from zero
---------------------------------------------------------------------------
     rds |     Coef.   Std. Err.      z    P>|z|    [95% Conf. Interval]
---------+-----------------------------------------------------------------
  tdxflm | -.1121656   .0163848   -7.11   0.000   -.1442792   -.0800520
      ga | -.3661113   .1192559   -2.58   0.010   -.5998486   -.1323740
   _cons |  15.68597   4.322678    3.63   0.000    7.213680    24.15827
---------------------------------------------------------------------------
Significant Odds ratios mean significantly
different from one
---------------------------------------------------------------------------
     rds | Odds Ratio  Std. Err.      z    P>|z|    [95% Conf. Interval]
---------+-----------------------------------------------------------------
  tdxflm |   .893896   .0154269   -7.11   0.000    .8636601    .9241324
      ga |  .6934256   .0871025   -2.58   0.010    .5227078    .8641434
---------------------------------------------------------------------------
41
Odds ratios for RDS relative to a TDX FLM II
ratio of 70 mg/g at 37 weeks gestational age
42
Logistic Regression Predicted Probabilities and
Classification with 0.20 cutoff
  TDxFLM   GA   RDS   Logistic P   Classify   Result
    75     30    0     .0115517       0         TN
     7     31    1     .9521286       1         TP
  14.8     31    1     .8912354       1         TP
  18.3     31    1     .8462539       1         TP
    27     31    1     .6718219       1         TP
    22     31    0     .7832782       1         FP
    29     31    0     .6198854       1         FP
   135     31    0     .0000095       0         TN
     4     32    1     .9543484       1         TP
    15     32    1     .8568574       1         TP
  16.5     32    1     .8346432       1         TP
    25     32    1     .6575863       1         TP
  44.2     32    1     .1779585       0         FN
  35.5     32    0     .3679177       1         FP
    41     32    0     .2374989       1         FP
    48     32    0     .1232235       0         TN
    49     32    0     .1114575       0         TN
  55.8     32    0     .0547323       0         TN
    59     32    0     .0386864       0         TN
    59     32    0     .0386864       0         TN
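
The classification step can be sketched by applying the 0.20 cutoff to the predicted probabilities and tallying the four outcomes. The (RDS, probability) pairs below are the 20 rows shown on this slide, which may be only part of the full data set. The function name and the choice to treat a probability equal to the cutoff as positive are assumptions.

```python
# (observed RDS, predicted probability) for the 20 rows on the slide.
ROWS = [
    (0, .0115517), (1, .9521286), (1, .8912354), (1, .8462539),
    (1, .6718219), (0, .7832782), (0, .6198854), (0, .0000095),
    (1, .9543484), (1, .8568574), (1, .8346432), (1, .6575863),
    (1, .1779585), (0, .3679177), (0, .2374989), (0, .1232235),
    (0, .1114575), (0, .0547323), (0, .0386864), (0, .0386864),
]


def confusion_counts(rows, cutoff=0.20):
    """Tally TP, FP, TN, FN when classifying as RDS if P >= cutoff."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for rds, p in rows:
        predicted = 1 if p >= cutoff else 0
        if predicted == 1:
            counts["TP" if rds == 1 else "FP"] += 1
        else:
            counts["FN" if rds == 1 else "TN"] += 1
    return counts


counts = confusion_counts(ROWS)
sensitivity = counts["TP"] / (counts["TP"] + counts["FN"])   # 8/9
specificity = counts["TN"] / (counts["TN"] + counts["FP"])   # 7/11
```

On these rows the cutoff yields 8 TP, 4 FP, 7 TN, and 1 FN. Raising the cutoff would trade false positives for false negatives.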
45
Software Packages that perform Logistic Regression
  • STATA
  • SAS
  • SPSS
  • R
  • JMP
  • Others

46
References
  • Hosmer DW, Lemeshow S. Applied logistic
    regression, 2nd ed. New York, NY: John Wiley &
    Sons, 2000.
  • Kleinbaum DG. Logistic regression: a
    self-learning text. New York, NY:
    Springer-Verlag, 1994.
  • Bagley SC, White H, Golomb BA. Logistic
    regression in the medical literature: standards
    for use and reporting, with particular attention
    to one medical domain. J Clin Epidemiol
    2001;54:979-85.
  • (http://www.sciencedirect.com/science/publications/journal)
  • Ostir GV, Uchida T. Logistic regression: a
    nontechnical review. Am J Phys Med Rehabil
    2000;79:565-72.
  • (pdf file available online through Ovid gateway)
  • http://www.ioa.pdx.edu/newsom/pa551/lectur21.htm
  • http://personal.ecu.edu/whiteheadj/data/logit/
  • Parvin CA, Kaplan LA, Chapman JF, McManamon TG,
    Gronowski AM. Predicting respiratory distress
    syndrome using gestational age and fetal lung
    maturity by fluorescent polarization. Am J Obstet
    Gynecol 2005;192:199-207.

47
Next Week
  • Attend 4th Annual Research Symposium and Poster
    Session, 12:30-4:30
  • Farrell Learning and Teaching Center's Connor
    Auditorium
  • Lecture 4:40-5:45
  • Read Hulley et al. Chapter 9
  • Problem Set 4