Linear Models III Thursday May 31, 10:15-12:00

1
Linear Models III Thursday May 31, 10:15-12:00
  • Deborah Rosenberg, PhD
  • Research Associate Professor
  • Division of Epidemiology and Biostatistics
  • University of IL School of Public Health
  • Training Course in MCH Epidemiology

2
Ordinal and Nominal Outcomes
  • Outcomes with More than 2 Categories
  • Examples of outcomes that might be suited to
    ordinal or nominal regression:
  • Ordinal or nominal BMI categories
  • Nominal cause of death categories
  • Ordinal or nominal severity of illness categories
  • Ordinal or nominal categories of program
    participation

3
Ordinal and Nominal Outcomes
  • The Cumulative Logit Model
  • The primary motivation for using a logistic model
    with an ordinal outcome is to accommodate a truly
    ordinal variable that has "ceiling" and "floor"
    effects and in which the intervals between
    response categories can be somewhat arbitrary;
    that is, it is not a continuous variable.
  • Modeling an ordinal outcome as a continuous
    variable can yield biased results because it can
    produce predicted values outside the range of the
    ordinal variable.

4
Ordinal and Nominal Outcomes
  • The Cumulative Logit Model
  • An ordered outcome may reflect an underlying
    continuous variable for which we have no data or
    for which we don't know the "real" threshold
    values.
  • For example, a Likert scale for satisfaction (very
    dissatisfied to very satisfied) or for agreement
    (strongly disagree to strongly agree) has
    response categories reflecting a continuous scale
    for which there is no data.

5
Modeling Ordinal Outcomes
  • Some other ordinal variables may reflect an
    underlying continuous construct that cannot be
    measured as such. The ordered values are
    intended to reflect distinct threshold values.
  • Examples of ordinal variables of this type:
  • access to care index
  • reports of experience of life stress
  • assessment of overall health status
  • satisfaction with care

6
Ordinal and Nominal Outcomes
  • The Cumulative Logit Model
  • To appropriately model an outcome as ordinal, the
    proportional odds assumption must hold.
  • The proportional odds assumption:
  • If an independent variable increases (or
    decreases) the odds of being in category 1 v. the
    remaining categories, then it similarly
    increases (or decreases) the odds of being in
    categories 1 and 2 combined v. the remaining
    categories, in categories 1, 2, and 3 combined v.
    the remaining categories, etc.

7
Ordinal and Nominal Outcomes
  • The Cumulative Logit Model
  • The null hypothesis for the proportional odds
    assumption is that the odds ratios for the
    association between a risk factor and an ordinal
    outcome are constant regardless of how the
    category boundaries are drawn.
  • If the proportional odds assumption holds, then
    the association between an independent variable
    and the outcome can be expressed as a single
    summary estimate (a common odds ratio) across all
    categories.

8
Ordinal and Nominal Outcomes
  • The Cumulative Logit Model
  • The proportional odds assumption can be tested
    with a chi-square statistic (a score test). A
    nonsignificant result means that the null
    hypothesis is not rejected and that the
    cumulative logit model is appropriate; a
    significant result means that the proportional
    odds assumption may not hold.

9
Ordinal and Nominal Outcomes
  • The Cumulative Logit Model
  • For an ordered outcome with k categories:
  • Both the numerator and denominator change
  • http://www.indiana.edu/%7Estatmath/stat/all/cat/2b1.html
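
The changing numerator and denominator can be written out as a short sketch. The notation is an assumption for illustration, not taken from the slides: Y is the ordered outcome with categories 1 through k, and x is a covariate pattern.

```latex
\[
\text{Odds}_j(x) \;=\; \frac{P(Y \le j \mid x)}{P(Y > j \mid x)},
\qquad j = 1, \dots, k-1
\]
```

Each time j increases by one, a category moves from the denominator into the numerator, which is why both parts of the ratio change from one cumulative logit to the next.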

10
Ordinal and Nominal Outcomes
  • Odds among the exposed = a / (b + c + d)
  • Odds among the exposed = (a + b) / (c + d)
  • Odds among the exposed = (a + b + c) / d

11
Ordinal and Nominal Outcomes
  • The Cumulative Logit Model
  • Given k categories of an ordered outcome
    variable, a cumulative logit model yields k-1
    intercept terms. Each intercept corresponds to a
    category combined with all adjacent lower-ordered
    categories.
  • Since proportional odds are assumed, and
    therefore a common odds ratio, the effect of each
    covariate is reflected in a single beta
    coefficient.

12
Ordinal and Nominal Outcomes
  • The Cumulative Logit Model
  • Suppose an outcome variable has 4 categories and
    we are modeling one independent variable. The
    cumulative logit model will look as follows:
  • $\ln(\text{Odds}_{1}) = b_{0,1} + b_1 x$
  • $\ln(\text{Odds}_{12}) = b_{0,12} + b_1 x$
  • $\ln(\text{Odds}_{123}) = b_{0,123} + b_1 x$
  • The odds ratio is the same regardless of
    category
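
A brief derivation may help show why the odds ratio does not depend on the cut. It assumes, for illustration only, that the independent variable x is coded 1 for exposed and 0 for unexposed, and it writes c for whichever cumulative grouping (1, 12, or 123) is being modeled.

```latex
\[
\ln \text{Odds}_c(x) = b_{0,c} + b_1 x,
\qquad c \in \{1,\ 12,\ 123\}
\]
\[
\text{OR}_c
= \frac{\text{Odds}_c(x = 1)}{\text{Odds}_c(x = 0)}
= \frac{e^{\,b_{0,c} + b_1}}{e^{\,b_{0,c}}}
= e^{\,b_1}
\]
```

The cut-specific intercept cancels, so only the single slope remains in the odds ratio.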

13
Ordinal and Nominal Outcomes
  • A stratified approach to mimic a cumulative logit
    model for a 4-category variable would mean
    creating new dichotomous variables, something like
    the following:
  • if ordvar = 1 then ordvar1 = 1;
  • else if ordvar ^= . then ordvar1 = 0;
  • if 1 <= ordvar <= 2 then ordvar2 = 1;
  • else if ordvar ^= . then ordvar2 = 0;
  • if 1 <= ordvar <= 3 then ordvar3 = 1;
  • else if ordvar ^= . then ordvar3 = 0;
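
The same recoding logic can be sketched outside SAS. The Python function below is a hypothetical helper (its name and the use of `None` for a missing value are assumptions for illustration); it presumes the outcome is coded 1 through k.

```python
from typing import List, Optional


def cumulative_indicators(ordvar: Optional[int], k: int = 4) -> List[Optional[int]]:
    """Return k-1 dichotomous indicators for an ordinal value coded 1..k.

    Indicator j (j = 1..k-1) is 1 when ordvar <= j and 0 otherwise.
    A missing value (None) stays missing in every indicator.
    """
    if ordvar is None:
        return [None] * (k - 1)
    return [1 if ordvar <= j else 0 for j in range(1, k)]


# Show the three indicators (ordvar1, ordvar2, ordvar3) for each possible value.
for value in (1, 2, 3, 4, None):
    print(value, cumulative_indicators(value))
```

Every non-missing observation receives a 0 or 1 on each indicator, so each binary model built on these variables uses the full sample.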

14
Ordinal and Nominal Outcomes
  • Mimicking Cumulative Logit with Binary Logistic
    Models
  • proc logistic;
  • model ordvar1 = factors;
  • run;
  • proc logistic;
  • model ordvar2 = factors;
  • run;
  • proc logistic;
  • model ordvar3 = factors;
  • run;
  • The OR from each model will be approx. the
    same if the proportional odds assumption holds.
  • Note that all observations are used in each
    model.

15
Ordinal and Nominal Outcomes
  • The Cumulative Logit Model
  • If the proportional odds assumption does not
    hold, it might be because the outcome variable is
    nominal rather than ordinal, or it might be that
    we have mis-specified the categories, failing to
    pinpoint important thresholds on the underlying
    continuum.
  • The score test is quite sensitive; it is up to the
    analyst to examine the pattern of ORs for
    different dichotomous cutpoints and decide
    whether it is reasonable to use a cumulative
    logit model.

16
Ordinal and Nominal Outcomes
  • The Generalized Logit Model
  • In contrast to the cumulative logit model, in a
    generalized logit model, the outcome categories
    are like dummy variables: mutually exclusive
    categories compared to a common reference group.

17
Ordinal and Nominal Outcomes
  • The Generalized Logit Model
  • For a nominal outcome with k categories:
  • Fixed denominator (reference category)
  • http://www.indiana.edu/%7Estatmath/stat/all/cat/2b1.html
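
The fixed denominator can be written out as a short sketch. The notation is an assumption for illustration: Y is the outcome with categories 1 through k, x is a covariate pattern, and the last category k is taken as the reference.

```latex
\[
\text{Odds}_j(x) \;=\; \frac{P(Y = j \mid x)}{P(Y = k \mid x)},
\qquad j = 1, \dots, k-1
\]
```

Only the numerator changes from one logit to the next; every category is compared to the same reference group.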

18
Ordinal and Nominal Outcomes
  • Odds among the exposed = a / d
  • Odds among the exposed = b / d
  • Odds among the exposed = c / d

19
Ordinal and Nominal Outcomes
  • The Generalized Logit Model
  • Given k categories of an outcome variable, a
    generalized logit model yields k-1 intercept
    terms. Each intercept corresponds to a single
    category.
  • Since proportional odds are not assumed, odds
    ratios can vary across categories, and therefore
    the effect of each covariate is reflected in k-1
    slope parameters.

20
Ordinal and Nominal Outcomes
  • The Generalized Logit Model
  • Suppose an outcome variable has 4 categories and
    we are modeling one independent variable. The
    generalized logit model is as follows:
  • 1. $\ln(\text{Odds}_{1}) = b_{0,1} + b_{1,1} x$
  • 2. $\ln(\text{Odds}_{2}) = b_{0,2} + b_{1,2} x$
  • 3. $\ln(\text{Odds}_{3}) = b_{0,3} + b_{1,3} x$
  • The odds ratios are distinct for each category
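
As a sketch of what the three equations imply, assume (for illustration only) that x is coded 1 for exposed and 0 for unexposed and that category 4 is the reference. Each category then has its own odds ratio, and the category probabilities follow from the three logits.

```latex
\[
\text{OR}_j = \frac{e^{\,b_{0,j} + b_{1,j}}}{e^{\,b_{0,j}}} = e^{\,b_{1,j}},
\qquad j = 1, 2, 3
\]
\[
P(Y = j \mid x) = \frac{e^{\,b_{0,j} + b_{1,j} x}}{1 + \sum_{m=1}^{3} e^{\,b_{0,m} + b_{1,m} x}},
\qquad
P(Y = 4 \mid x) = \frac{1}{1 + \sum_{m=1}^{3} e^{\,b_{0,m} + b_{1,m} x}}
\]
```

Because the slope carries the category index j, nothing forces the three odds ratios to be equal.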

21
Ordinal and Nominal Outcomes
  • The Generalized Logit Model
  • Each slope parameter tests the odds of being in
    one outcome category compared to the odds of
    being in the reference category:
  • Compared to those without Factor A, individuals
    with Factor A have ___ times the odds of having
    the outcome (category 1)
  • Compared to those without Factor A, individuals
    with Factor A have ___ times the odds of having
    the outcome (category 2)
  • Compared to those without Factor A, individuals
    with Factor A have ___ times the odds of having
    the outcome (category 3)

22
Ordinal and Nominal Outcomes
  • A stratified approach to mimic a generalized logit
    model for a 4-category variable would not
    require creation of new variables, but would mean
    running models like the following:

23
Ordinal and Nominal Outcomes
  • Mimicking Generalized Logit with Binary
    Logistic Models
  • proc logistic;
  • where ordvar in(1,4);
  • model ordvar = factors;
  • run;
  • proc logistic;
  • where ordvar in(2,4);
  • model ordvar = factors;
  • run;
  • proc logistic;
  • where ordvar in(3,4);
  • model ordvar = factors;
  • run;
  • The ORs from the models will differ.
  • Note that different subsets of observations
    are used in each model.

24
Example 1.
  • The Association of Smoking and Fetal/Infant Death
  • in Preterm Deliveries
  • Crude OR = 1.07

25
Example 1.
  • The Association of Smoking and Fetal/Infant Death
    in Preterm Deliveries
  • Crude Logistic Model with Dichotomous Outcome

26
Example 1.
  • Cumulative Logit: Odds of type of death among
    smokers and the OR for smoker v. nonsmoker
  • Odds = 46 / (33 + 1135) = 0.04, OR = 1.04
  • Odds = (46 + 33) / 1135 = 0.07, OR = 1.07
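
The two cumulative odds among smokers can be checked with a few lines of Python. This is a sketch only: the three counts come from the slide's arithmetic, while the labels attached to them (fetal death, neonatal death, survivor) are an assumption based on the ordering of the outcome.

```python
# Counts among smokers, taken from the slide's arithmetic.
# The labels are assumed from the ordered outcome categories.
fetal, neonatal, survivor = 46, 33, 1135

# Cut 1: fetal death v. (neonatal death + survivor)
odds_cut1 = fetal / (neonatal + survivor)

# Cut 2: (fetal death + neonatal death) v. survivor
odds_cut2 = (fetal + neonatal) / survivor

print(f"{odds_cut1:.2f} {odds_cut2:.2f}")  # prints "0.04 0.07"
```

Moving the cut shifts the 33 neonatal deaths from the denominator to the numerator, which is the "both numerator and denominator change" behavior of cumulative logits.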

27
Example 1.
  • Cumulative Logit Model with 3 Categories
  • Ordered Value   outcome5                    Frequency
    1               fetal death >20 wks         332
    2               neonatal death 0-28 days    229
    3               survivor >28 days           8520
  • Probabilities modeled are cumulated over the
    lower Ordered Values.
  • Score Test for the Proportional Odds Assumption
  • Chi-Square   DF   Pr > ChiSq
    0.0400       1    0.8414
  • The proportional odds assumption holds

28
Example 1.
  • Cumulative Logit: Each intercept corresponds to a
    category plus all categories with lower ordered
    values v. the remaining categories.
  • The odds ratio is an average of the cumulative
    logits
  • $46 / (33 + 1135) \approx e^{-3.2803 + 0.0635} \approx 0.04$
  • $(46 + 33) / 1135 \approx e^{-2.7291 + 0.0635} \approx 0.07$
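
The model-based odds can be reproduced from the reported estimates. In this sketch the variable names are hypothetical, and reading -3.2803 and -2.7291 as the two intercepts and 0.0635 as the common smoking slope is an interpretation of the slide's expressions rather than something the transcript states outright.

```python
import math

b0_cut1 = -3.2803  # assumed intercept: fetal death v. the rest
b0_cut2 = -2.7291  # assumed intercept: fetal or neonatal death v. survivor
b1 = 0.0635        # assumed common slope for smoking (smoker = 1)

# Predicted odds among smokers at each cut, and the single common odds ratio.
odds_smokers_cut1 = math.exp(b0_cut1 + b1)
odds_smokers_cut2 = math.exp(b0_cut2 + b1)
common_or = math.exp(b1)

print(f"{odds_smokers_cut1:.2f} {odds_smokers_cut2:.2f} {common_or:.2f}")
```

The predicted odds round to the same 0.04 and 0.07 obtained from the raw counts, and the exponentiated slope rounds to 1.07.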

29
Example 1.
  • Generalized Logit Model with 3 Categories
  • In a generalized logit model, each intercept and
    slope correspond to a single category.
  • Is 1.07 a reasonable summary of 1.047 and 1.096?

30
Example 2.
  • The Association of Maternal Risk and Fetal/Infant
    Death in Preterm Deliveries

31
Example 2.
  • The Association of Maternal Risk and Fetal/Infant
    Death in Preterm Deliveries
  • Crude Logistic Model with Dichotomous Outcome

32
Example 2.
  • Cumulative Logit Model with 3 Categories
  • Ordered Value   outcome5                    Frequency
    1               fetal death >20 wks         418
    2               neonatal death 0-28 days    261
    3               survivor >28 days           9549
  • Probabilities modeled are cumulated over the
    lower Ordered Values.
  • Score Test for the Proportional Odds Assumption
  • Chi-Square   DF   Pr > ChiSq
    10.7077      1    0.0011
  • The proportional odds assumption does not hold.
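
The p-values reported for the two score tests can be recovered from their chi-square statistics. This is a sketch that works only for 1 degree of freedom, where the upper-tail probability equals erfc(sqrt(x / 2)); the function name is hypothetical.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


p_example1 = chi2_sf_1df(0.0400)    # smoking example
p_example2 = chi2_sf_1df(10.7077)   # maternal risk example

print(f"{p_example1:.3f} {p_example2:.4f}")
```

Up to rounding of the printed chi-square values, these agree with the reported 0.8414 (assumption not rejected) and 0.0011 (assumption rejected).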

33
Example 2.
  • Cumulative Logit Model with 3 Categories
  • The odds ratio is an average of the cumulative
    logits
  • $e^{-3.1750 + 0.0473} \approx 0.04$
  • $e^{-2.6629 + 0.0473} \approx 0.07$
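
Cumulative odds can also be turned into predicted category probabilities. The sketch below assumes that -3.1750 and -2.6629 are the two intercepts, that 0.0473 is the common slope for maternal risk, and that x = 1 denotes women with maternal risk; the function and variable names are hypothetical.

```python
import math

b0_cut1, b0_cut2, b1 = -3.1750, -2.6629, 0.0473  # assumed intercepts and slope


def cumulative_prob(intercept: float, slope: float, x: int) -> float:
    """Convert a cumulative logit into a cumulative probability."""
    odds = math.exp(intercept + slope * x)
    return odds / (1.0 + odds)


p_le1 = cumulative_prob(b0_cut1, b1, 1)  # P(fetal death)
p_le2 = cumulative_prob(b0_cut2, b1, 1)  # P(fetal or neonatal death)

# Category probabilities are differences of adjacent cumulative probabilities.
probs = {"fetal": p_le1, "neonatal": p_le2 - p_le1, "survivor": 1.0 - p_le2}
common_or = math.exp(b1)

print({name: round(p, 3) for name, p in probs.items()})
print(round(common_or, 3))  # 1.048
```

The exponentiated slope, 1.048, is the single odds ratio the cumulative model imposes at both cuts.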

34
Example 2.
  • Generalized Logit Model with 3 Categories
  • Is 1.048 a reasonable summary of 0.86 and 1.5?

35
Example 3. LBW
  • Modeling a 3-category birthweight variable
  • /* cumulative logit */
  • proc logistic order=formatted;
  • model bwcat = smoking late_no_pnc;
  • run;

36
Example 3. LBW

37
Example 3. LBW
  • /* mimicking cumulative logit with binary models */
  • proc logistic order=formatted;
  • model vlbw = smoking late_no_pnc;
  • run;
  • vlbw v. mlbw and normal
  • proc logistic order=formatted;
  • model lbw = smoking late_no_pnc;
  • run;
  • vlbw and mlbw v. normal
  • Both models include all observations in the sample

38
Example 3. LBW
  • /* generalized logit */
  • proc logistic order=formatted;
  • model bwcat(ref='normal bw') = smoking late_no_pnc
    / link=glogit;
  • run;

39
Example 3. LBW
  • vlbw v. normal and mlbw v. normal

40
Example 3. LBW
  • /* mimicking generalized logit with binary
    models */
  • proc logistic order=formatted;
  • where bwcat = 2 or bwcat = 0;
  • model bwcat(ref='normal bw') = smoking late_no_pnc
    / link=glogit;
  • run;
  • proc logistic order=formatted;
  • where bwcat = 1 or bwcat = 0;
  • model bwcat(ref='normal bw') = smoking late_no_pnc
    / link=glogit;
  • run;

41
Example 3. LBW
  • Generalized logit approach using binary models
    with only a subset of observations in each model
  • vlbw v. normal
  • mlbw v. normal

42
Example 3. LBW
  • Generalized logit models can get complicated,
    but custom estimates can still be obtained in the
    usual way.
  • proc logistic order=formatted;
  • where 2 <= momage <= 3;
  • class parityrisk(ref='no hx preterm') / param=ref;
  • model bwcat = smoking late_no_pnc matrisk momage
    parityrisk smoking*parityrisk / link=glogit;
  • contrast 'sm-risk, hxpreterm' smoking 1 matrisk 1
    smoking*parityrisk 1 0 / estimate=exp;
  • contrast 'sm-risk, primips' smoking 1 matrisk 1
    smoking*parityrisk 0 1 / estimate=exp;
  • contrast 'sm-risk, lorisk multips' smoking 1 matrisk 1
    smoking*parityrisk 0 0 / estimate=exp;
  • run;

43
Example 3. LBW
  • The tests for the constructs in the model are all
    statistically significant

44
Example 3. LBW
  • Not all beta coefficients are statistically
    significant.

45
Example 3. LBW
  • Parity-specific contrasts of the joint effect of
    smoking and having some antepartum medical risk,
    adjusting for entry into prenatal care and
    maternal age.
  • Should we leave the smoking*parityrisk term in
    the model?

46
Example 4. Prenatal Care
  • Should we consider the categories ordinal or
    nominal?

47
Example 4. Prenatal Care
  • Overlapping dichotomous contrasts
  • No PNC v. Any PNC, OR = 3.2
  • Inad/No v. Adeq+/Adeq/Inter, OR = 2.7
  • Inter/Inad/No v. Adeq+/Adeq, OR = 1.8
  • All others v. Adeq+, OR = 0.60

48
Example 4. Prenatal Care
  • Non-overlapping dichotomous contrasts

49
Example 4. Prenatal Care
  • Cumulative Logit
  • The null hypothesis of proportional odds is
    rejected.
  • Any association is obscured by averaging across
    levels of APNCU.

50
Example 4. Prenatal Care
  • Generalized Logit

51
Example 4. Prenatal Care
  • Women with a prior lbw delivery had more than 4
    times the odds of receiving no or inadequate
    prenatal care rather than adequate care compared
    to women with no history of lbw delivery.
  • Compared to women without a history of lbw
    delivery, however, these high risk women also had
    more than twice the odds of appropriately
    receiving care beyond what is considered adequate
    for most women.

52
Example 5.
  • Outcome is a 3-level rating of MCH epidemiology
    functioning:
  • above average
  • average
  • below average

53
Summary: Ordinal and Nominal Outcomes
  • Cumulative--Ordinal
  • Proportional odds assumption: assess the series of
    binary comparisons from collapsing categories
  • k-1 intercepts
  • 1 slope / 1 odds ratio
  • Generalized--Nominal
  • No assumption of the shape of the association
  • Categories compared to a reference group
  • k-1 intercepts
  • k-1 slopes / k-1 odds ratios

54
Summary: Ordinal and Nominal Outcomes
  • Issues for categorizing an outcome variable are
    similar to those for defining categories for
    independent variables
  • Conceptual meaning of the categories
  • Statistical tests v. judgment about differences
    between categories
  • Sample size and power

55
Summary: Ordinal and Nominal Outcomes
  • Model Building
  • Similar to beginning with examining dummy
    variables for an independent variable prior to
    deciding whether to use it in an ordinal form,
    sometimes it is useful to run a generalized logit
    model first, since it requires no assumption
    about the ordering of the categories, and
    empirically assess whether the variation in
    category-specific odds ratios is important or
    negligible.

56
Summary: Ordinal and Nominal Outcomes
  • And even if the proportional odds assumption
    holds, reporting separate odds ratios for each
    category (using generalized logit) may be important
    in order to emphasize the similarity of the
    strength of the association across categories.
  • In addition, the cumulative logit model will not
    only force the strength of association to be
    uniform, the predicted values will also be forced
    to be linear. Using generalized logit, the
    predicted odds and odds ratios will both more
    closely reflect the observed values.

57
Summary: Ordinal and Nominal Outcomes
  • Why Not Just Always Run Stratified Models for
    Generalized Logit?
  • For nominal outcomes, using a single model may be
    more efficient than using separate binary models
  • With separate binary models, need to decide
    whether each model should include the same
    independent variables or whether different final,
    category-specific models make sense, each
    including only those variables which are risk or
    protective factors for a particular binary
    comparison

58
Summary: Ordinal and Nominal Outcomes
  • Using a single multinomial model permits a
    unified profile of risk and protective factors
    across the categories, both significant and
    insignificant

59
Summary: Ordinal and Nominal Outcomes
  • For a variable that is actually continuous, are
    there reasons to use a cumulative logit model
    instead of a continuous outcome model?
  • For example, when would modeling ordinal
    categories of birthweight be preferable either to
    modeling birthweight continuously in grams or
    to categorizing it into nominal groups?
  • Using a variable as ordinal (with fewer
    categories) as opposed to continuous will yield
    odds ratios instead of mean differences
  • No assumption of normality required

60
Summary: Ordinal and Nominal Outcomes
  • For a variable that meets the proportional odds
    assumption, is it still appropriate to choose to
    use a generalized logit approach?
  • Using ordinal as opposed to nominal categories
    will be more efficient if there is truly an
    ordinal effect
  • Why "waste" degrees of freedom on multiple odds
    ratios if the effect is constant across
    categories?

61
Which Modeling Approach?
  • Choosing the form of the outcome variable
  • Stressful Life Events
  • Any stressful life event (y/n) = independent vars
    (dichotomous)
  • Fin. Emot. Traum. Partner = independent vars
    (nominal, no stressful life events as the
    reference)
  • Sum of stressful life events = independent vars
    (continuous)
  • Scale of stressful life events = independent vars
    (ordinal)

62
Which Modeling Approach?
  • Choosing the form of the outcome variable
  • Maternal Depression
  • Any depression (y/n) = independent vars
  • PrePost Pre_Only PP_Only = independent vars
    (nominal, no depression as the reference)
  • Severe Moderate Mild = independent vars
    (ordinal or nominal)
  • Depression Severity Scale = independent vars
    (ordinal)

63
Which Modeling Approach?
  • Choosing the form of the outcome variable
  • Breastfeeding
  • Ever Breastfed (yes v. no) = independent vars
  • Exclusive BF >2 mos. (yes v. no) = independent
    vars
  • Exclusive >2 mo., Exclusive BF <2 mo. =
    independent vars
    (Never Breastfed as reference)
  • BF <2 mo., BF 2-6 mo., BF >6 mo. = independent vars
    (Never Breastfed as reference)
  • Breastfeeding duration in weeks = independent
    vars