Chap 8: Introduction to Logistic Regression - PowerPoint PPT Presentation

Title:

Chap 8: Introduction to Logistic Regression

Slides: 20
Provided by: CNH4

Transcript and Presenter's Notes

Title: Chap 8: Introduction to Logistic Regression


1
Chap 8: Introduction to Logistic Regression
2
Logistic regression
  • Models the relationship between a set of
    variables xi
      • dichotomous (eat: yes/no)
      • categorical (social class, ...)
      • continuous (age, ...)
  • and a dichotomous variable Y
  • A dichotomous (binary) outcome is the most common
    situation in biology and epidemiology

3
Logistic regression (1)
Table 2: Age and signs of coronary heart
disease (CD)
4
How can we analyse these data?
  • Comparison of the mean age of diseased and
    non-diseased women
  • Non-diseased: 38.6 years
  • Diseased: 58.7 years (p < 0.0001)
  • Linear regression?

5
Dot-plot: Data from Table 2
6
Logistic regression (2)
  • Table 3: Prevalence (%) of signs of CD
    according to age group

7
Dot-plot: Data from Table 3
[Figure: Diseased plotted against Age (years)]
8
The logistic function (1)
[Figure: Probability of disease plotted against x]
9
The logistic function (2)
log odds
10
The logistic function (3)
  • Advantages of the logit
  • Simple transformation of P(y|x)
  • Linear relationship with x
  • Can be continuous (logit between -∞ and +∞)
  • Known binomial distribution (P between 0 and 1)
  • Directly related to the notion of odds of disease
  • O = P/(1 - P)
  • log O = log(P/(1 - P))
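
A minimal Python sketch of the transformations on this slide, assuming only the standard definitions of odds and logit (the function names are illustrative and not from the slides). It shows that the logit maps probabilities in (0, 1) onto the whole real line and that the logistic function maps them back.

```python
import math

def odds(p):
    """Odds of disease for a probability p in (0, 1)."""
    return p / (1.0 - p)

def logit(p):
    """Log odds: the logit transformation of p."""
    return math.log(odds(p))

def logistic(z):
    """Inverse of the logit: maps any real log odds z back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Probabilities near 0 or 1 give large negative or positive logits.
for p in (0.01, 0.1, 0.5, 0.9, 0.99):
    print(p, round(odds(p), 3), round(logit(p), 3))
```

A probability of 0.5 corresponds to odds of 1 and a logit of 0, which is why the logit scale is symmetric around "even chances".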

11
Interpretation of b (1)
  • X = 1
  • X = 0

12
Interpretation
  • Intercept is the point on the Y-axis (log odds)
    crossed by the regression line when X = 0.
  • Slope is the rate at which the predicted log odds
    increase (or, in some cases, decrease) with
    each successive unit of X.
  • Within the context of logistic regression, you
    will usually find the intercept of the log odds
    regression line referred to as the "constant."
  • The exponent of the slope, exp(slope), describes
    the proportionate rate at which the predicted
    odds change with each successive unit of X.

13
Example
  • If exp(slope) is 1.81, then at X = 29 we say
  • that
  • The predicted odds for X = 29 are
  • 1.81 times as large as those for X = 28,
  • those for X = 30 are 1.81 times as large
  • as those for X = 29, and so on.
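
The example can be checked numerically. The sketch below assumes a log odds line with exp(slope) = 1.81 as in the example; the intercept is a hypothetical value chosen only for illustration, since the slide does not give one.

```python
import math

odds_ratio = 1.81              # exp(slope), taken from the example
slope = math.log(odds_ratio)   # slope on the log odds scale
intercept = -17.0              # hypothetical value, not from the slides

def predicted_odds(x):
    """Predicted odds at x under the line log O = intercept + slope * x."""
    return math.exp(intercept + slope * x)

# The ratio of odds for consecutive values of X is always exp(slope).
ratios = [predicted_odds(x + 1) / predicted_odds(x) for x in (28, 29, 30)]
print([round(r, 2) for r in ratios])
```

The ratio does not depend on the intercept or on the starting value of X, which is what makes exp(slope) interpretable as a constant multiplicative change in the odds.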

14
Interpretation of b (2)
  • b = increase in log-odds for a one-unit
    increase in x
  • Test of the hypothesis that b = 0 (Wald test)
  • Interval testing
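
A sketch of how the Wald test and the interval could be computed, assuming the usual large-sample forms z = b / SE(b) and b ± 1.96 × SE(b). The slide does not show its formulas, and the function name, the standard error, and the numbers in the usage line are hypothetical.

```python
import math

def wald_test(b, se, z_crit=1.96):
    """Wald z statistic, two-sided p-value, and confidence intervals
    for a coefficient b with standard error se (assumed inputs)."""
    z = b / se
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    ci_b = (b - z_crit * se, b + z_crit * se)        # log-odds scale
    ci_or = (math.exp(ci_b[0]), math.exp(ci_b[1]))   # odds-ratio scale
    return z, p_value, ci_b, ci_or

# Hypothetical coefficient and standard error, for illustration only.
z, p_value, ci_b, ci_or = wald_test(b=0.5, se=0.2)
print(round(z, 2), round(p_value, 4), ci_b, ci_or)
```

If the interval for b excludes 0, the interval for exp(b) excludes 1, so both scales lead to the same conclusion about the hypothesis b = 0.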

15
(No Transcript)
16
-0.123 is the rate at which the predicted log odds
of CD decrease with each successive unit of X. It
also means that the predicted CD odds for age = 30
are exp(-0.123) ≈ 0.9 times as large as those for
age = 29, those for age = 31 are 0.9 times as large
as those for age = 30, and so on.
17
  • Results of fitting Logistic Regression Model
  • log O = 6.43 + (-0.121 × 31) ≈ 2.67. The corresponding
    predicted odds would be exp(log O) = exp(2.67) ≈ 14.43
  • And the corresponding predicted probability would
    be probability = O/(1 + O) = 14.43/(1 + 14.43) ≈ 0.93
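
The same chain of steps (log odds, then odds, then probability) can be sketched in Python using the coefficients quoted on this slide. Because the slide rounds its intermediate values, the unrounded results are close to, but not digit-for-digit identical with, the figures above.

```python
import math

intercept = 6.43   # from the slide
slope = -0.121     # from the slide
age = 31

log_odds = intercept + slope * age       # log O
odds = math.exp(log_odds)                # O = exp(log O)
probability = odds / (1.0 + odds)        # P = O / (1 + O)

print(round(log_odds, 3), round(odds, 2), round(probability, 3))
```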

18
Link between Logistic regression and threshold
variable
  • When y is continuous we may use
  • yi = β0 + β1 xi1 + ... + βk xik + ei
  • If we create a new variable y*, called the threshold
    variable, such that
  • y*i = 1 when yi ≥ 0
  • y*i = 0 when yi < 0
  • if we use a logistic distribution on ei instead
    of a normal distribution, defined such that
  • p(ei ≤ a) = 1/(1 + e^(-a)).
  • Then we have
  • pi = 1/(1 + exp(-(β0 + β1 xi1 + ... + βk xik)))
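
The link can be illustrated by simulation. The sketch below draws logistic errors by inverting the distribution function given above, forms the continuous variable, applies the threshold at 0, and compares the share of ones with the logistic formula for pi. The value of the linear predictor (eta) is hypothetical, and the function names are illustrative.

```python
import math
import random

def logistic_cdf(a):
    """p(e <= a) = 1 / (1 + exp(-a)), the error distribution on the slide."""
    return 1.0 / (1.0 + math.exp(-a))

def simulate_threshold(eta, n=200_000, seed=0):
    """Share of observations with y* = 1, where y = eta + e and
    y* = 1 when y >= 0. eta stands for beta0 + beta1*x1 + ... + betak*xk."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        u = rng.random()
        while u == 0.0:                  # avoid log(0)
            u = rng.random()
        e = math.log(u / (1.0 - u))      # logistic error via inverse CDF
        if eta + e >= 0:
            hits += 1
    return hits / n

eta = 0.8  # hypothetical value of the linear predictor
print(simulate_threshold(eta), logistic_cdf(eta))
```

The two printed numbers should agree up to simulation noise, because P(y ≥ 0) = P(e ≥ -eta) = 1 - 1/(1 + e^(eta)) = 1/(1 + e^(-eta)).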

19
Discrimination: classification rules
  • The aim of this section is to classify an
    observation into a given group.
  • We need to find a rule which allows us to
    determine whether or not an observation falls
    into a certain group/category/class.
  • We define a categorical variable and use this
    indicator as the response variable.
  • We need to establish a classification rule for
    discriminating between our populations.
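
One simple rule of this kind, sketched under stated assumptions: assign an observation to the group when its predicted probability from a fitted logistic model reaches a cutoff. The coefficients reuse the values quoted on slide 17 for illustration; the 0.5 cutoff and the function names are assumptions, and the section may go on to develop a different rule.

```python
import math

INTERCEPT = 6.43   # coefficient quoted on slide 17
SLOPE = -0.121     # coefficient quoted on slide 17
CUTOFF = 0.5       # assumed cutoff, not from the slides

def predicted_probability(age):
    """Predicted probability from the logistic model."""
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + SLOPE * age)))

def classify(age, cutoff=CUTOFF):
    """Return 1 if the predicted probability reaches the cutoff, else 0."""
    return 1 if predicted_probability(age) >= cutoff else 0

for age in (31, 45, 60):
    print(age, round(predicted_probability(age), 3), classify(age))
```

With a cutoff of 0.5 the rule is equivalent to checking whether the predicted log odds are at least 0, so the boundary between the two groups falls where INTERCEPT + SLOPE × age = 0.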