Statistics 262: Intermediate Biostatistics - PowerPoint PPT Presentation

Statistics 262: Intermediate Biostatistics May 18, 2004: Cox Regression III: residuals and diagnostics, repeated events Jonathan Taylor and Kristin Cobb – PowerPoint PPT presentation

1
Statistics 262: Intermediate Biostatistics
May 18, 2004: Cox Regression III: residuals and
diagnostics, repeated events
  • Jonathan Taylor and Kristin Cobb

2
Residuals
  • Residuals are used to investigate the lack of fit
    of a model to a given subject.
  • For Cox regression, there's no easy analog to the
    usual observed minus predicted residual of
    linear regression

3
Deviance Residuals
  • Deviance residuals are based on martingale
    residuals: ci (1 if event, 0 if censored) minus
    the estimated cumulative hazard to ti (as a
    function of the fitted model) for individual i:
  • $c_i - \hat{H}(t_i, X_i, \hat{\beta})$
  • See Hosmer and Lemeshow for more discussion
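
The martingale residual on this slide, and the deviance residual built from it, can be sketched in a few lines of Python (the slides use SAS; Python is used here only for illustration). The transformation shown is the standard one, sign(M) * sqrt(-2[M + c log(c - M)]). The function names are hypothetical, and the cumulative hazard is taken as an input rather than estimated from a fitted model; it is assumed to be positive for anyone with an event.

```python
import math

def martingale_residual(event, cum_hazard):
    """c_i minus the estimated cumulative hazard at t_i."""
    return event - cum_hazard

def deviance_residual(event, cum_hazard):
    """Symmetrizing transform of the martingale residual M:
    sign(M) * sqrt(-2 * [M + c * log(c - M)])."""
    m = martingale_residual(event, cum_hazard)
    # c - M equals the cumulative hazard, so the log term is c * log(H);
    # it vanishes for censored subjects (c = 0).
    log_term = math.log(cum_hazard) if event else 0.0
    inner = max(0.0, -2.0 * (m + log_term))  # guard against rounding below 0
    if m == 0:
        return 0.0
    return math.copysign(math.sqrt(inner), m)
```

For example, a censored subject whose estimated cumulative hazard has reached 0.5 gets `deviance_residual(0, 0.5) == -1.0`: the subject has survived longer than the model expected, so the residual is negative.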

4
Deviance Residuals
  • Behave like residuals from ordinary linear
    regression
  • Should be symmetrically distributed around 0 and
    have standard deviation of 1.0.
  • Negative for observations with longer than
    expected observed survival times.
  • Plot deviance residuals against covariates to
    look for unusual patterns.

5
Deviance Residuals
  • In SAS, option on the output statement
  • output out=outdata resdev=<variable name>

6
Schoenfeld residuals
  • Schoenfeld (1982) proposed the first set of
    residuals for use with Cox regression packages
  • Schoenfeld D. Partial residuals for the proportional
    hazards regression model. Biometrika, 1982,
    69(1):239-241.
  • Instead of a single residual for each individual,
    there is a separate residual for each individual
    for each covariate
  • Based on the individual contributions to the
    derivative of the log partial likelihood (see
    chapter 6 in Hosmer and Lemeshow for more math
    details, p.198-199)
  • Note: Schoenfeld residuals are not defined for
    censored individuals.

7
Schoenfeld residuals
  • Where k is the covariate of interest,
  • the Schoenfeld residual is the covariate value,
    Xik, for the person (i) who actually died at time
    ti minus the expected value of the covariate for
    the risk set at ti (a weighted average of the
    covariate, weighted by each individual's
    likelihood of dying at ti).
  • Plot Schoenfeld residuals against time to
    evaluate PH assumption
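
The definition on this slide can be written out directly. The following is a minimal Python sketch for a single covariate, assuming no tied event times and a coefficient `beta` that has already been estimated elsewhere; the function name and the three-subject data set are hypothetical. Each person in the risk set is weighted by exp(beta * x), which is proportional to that person's hazard at ti.

```python
import math

def schoenfeld_residuals(times, events, x, beta):
    """Schoenfeld residuals for one covariate, assuming no tied event times.

    Returns one value per subject; censored subjects get None because the
    residual is only defined at event times.
    """
    residuals = []
    for t_i, c_i, x_i in zip(times, events, x):
        if not c_i:
            residuals.append(None)
            continue
        # Risk set: everyone still under observation at t_i.
        risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
        weights = [math.exp(beta * x_j) for x_j in risk]
        expected = sum(w * x_j for w, x_j in zip(weights, risk)) / sum(weights)
        residuals.append(x_i - expected)
    return residuals

# Hypothetical data: two events, one censored subject, beta fixed at 0.
res = schoenfeld_residuals(times=[1, 2, 3], events=[1, 1, 0],
                           x=[1.0, 0.0, 1.0], beta=0.0)
```

With `beta = 0` all weights are equal, so the expected covariate value is just the risk-set mean: the first residual is 1 - 2/3, the second is 0 - 1/2, and the censored subject has none.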

8
Schoenfeld residuals
  • In SAS
  • option on the output statement
  • ressch=

9
Influence diagnostics
  • How would the result change if a particular
    observation is removed from the analysis?

10
Influence statistics
  • Likelihood displacement (ld) measures influence
    of removing one individual on the model as a
    whole. What's the change in the likelihood when
    this individual is omitted?
  • DFBETA: how much each coefficient will change by
    removal of a single observation
  • A negative DFBETA indicates that the coefficient
    increases when the observation is removed
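
The leave-one-out idea behind DFBETA can be illustrated by brute force: refit the model once per observation and record how the coefficient moves. The Python sketch below does this for a one-covariate Cox model fitted by Newton-Raphson. It is an illustration under stated assumptions, not how SAS computes the statistic (statistical packages typically report a one-step approximation instead of refitting); the function names and the eight-subject data set are hypothetical, and tied event times are not handled.

```python
import math

def score_and_info(beta, times, events, x):
    """Score and observed information of the log partial likelihood."""
    score, info = 0.0, 0.0
    for t_i, c_i, x_i in zip(times, events, x):
        if not c_i:
            continue
        risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
        w = [math.exp(beta * x_j) for x_j in risk]
        s0 = sum(w)
        mean = sum(wj * xj for wj, xj in zip(w, risk)) / s0
        mean_sq = sum(wj * xj * xj for wj, xj in zip(w, risk)) / s0
        score += x_i - mean               # Schoenfeld-type contribution
        info += mean_sq - mean * mean     # weighted variance in the risk set
    return score, info

def fit_cox(times, events, x, tol=1e-12, max_iter=100):
    """Newton-Raphson for a single coefficient, starting from beta = 0."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_info(beta, times, events, x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

def exact_dfbeta(times, events, x):
    """Full-data estimate minus the estimate with observation i removed."""
    full = fit_cox(times, events, x)
    changes = []
    for i in range(len(times)):
        keep = [j for j in range(len(times)) if j != i]
        loo = fit_cox([times[j] for j in keep],
                      [events[j] for j in keep],
                      [x[j] for j in keep])
        changes.append(full - loo)
    return full, changes

# Hypothetical data set (no tied times).
times = [2, 3, 5, 7, 8, 11, 13, 14]
events = [1, 1, 0, 1, 1, 0, 1, 1]
x = [1.2, 0.3, 0.8, 1.5, 0.1, 0.9, 0.4, 1.1]
beta_hat, dfbeta = exact_dfbeta(times, events, x)
```

With this sign convention (full estimate minus leave-one-out estimate), a negative entry in `dfbeta` means the coefficient is larger once that observation is dropped, matching the interpretation on the slide.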

11
Influence statistics
  • In SAS
  • option on the output statement
  • ld= dfbeta=

12

What about repeated events?
  • Death (presumably) can only happen once, but many
    outcomes could happen twice
  • Fractures
  • Heart attacks
  • Pregnancy
  • Etc

13

Repeated events: Strategy 1
  • Strategy 1: run a second Cox regression (among
    those who had a first event) starting with first
    event time as the origin
  • Repeat for third, fourth, fifth events, etc.
  • Problem: increasingly smaller sample sizes.

14

Repeated events: Strategy 2
  • Treat each interval as a distinct observation,
    such that someone who had 3 events, for example,
    gives 3 observations to the dataset
  • Major problem: dependence among observations
    from the same individual
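
The data layout this strategy implies can be sketched as follows. This is a minimal Python illustration, assuming event times are measured from a common origin of 0; the function name, the record layout, and the optional `followup_end` argument (which appends a final event-free interval) are hypothetical choices, not something the slides specify.

```python
def to_spells(subject_id, event_times, followup_end=None):
    """One record per interval: (id, spell number, start, stop, event)."""
    records = []
    start = 0
    for k, stop in enumerate(sorted(event_times), start=1):
        records.append((subject_id, k, start, stop, 1))
        start = stop
    # Optional final spell: time after the last event with no further event.
    if followup_end is not None and followup_end > start:
        records.append((subject_id, len(records) + 1, start, followup_end, 0))
    return records

# Someone with events at times 2, 5 and 9 contributes three observations.
rows = to_spells("A", [2, 5, 9])
```

All three rows share the same `subject_id`, which is exactly the within-person dependence the slide warns about: an analysis that treats them as independent observations will tend to understate the standard errors.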

15

Repeated events: Strategy 3
  • Stratify by individual (fixed effects partial
    likelihood)
  • In PROC PHREG: strata id
  • Problems:
  • does not work well with RCT data
  • requires that most individuals have at least 2
    events
  • can only estimate coefficients for those
    covariates that vary across successive spells for
    each individual; this excludes constant personal
    characteristics such as age, education, gender,
    ethnicity, genotype
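
Why constant characteristics drop out can be seen from the score of the stratified partial likelihood. In the Python sketch below, risk sets are formed only among spells of the same individual, so a covariate that never changes within a person always equals its own risk-set average and contributes nothing. The function name, the record layout (id, spell duration, event, covariate), and both small data sets are hypothetical.

```python
import math

def stratified_score(beta, records):
    """Score of the partial likelihood stratified by individual.

    records: (subject_id, duration, event, x) for each spell; risk sets are
    formed only among spells of the same individual.
    """
    score = 0.0
    for sid, dur, event, x_i in records:
        if not event:
            continue
        risk = [x for s, d, _, x in records if s == sid and d >= dur]
        w = [math.exp(beta * x) for x in risk]
        score += x_i - sum(wj * xj for wj, xj in zip(w, risk)) / sum(w)
    return score

# Covariate constant within each person (e.g. genotype).
constant = [("A", 2, 1, 1.0), ("A", 3, 1, 1.0), ("A", 4, 0, 1.0),
            ("B", 1, 1, 0.0), ("B", 6, 1, 0.0)]
# Covariate that changes between spells of the same person.
varying = [("A", 2, 1, 1.0), ("A", 3, 1, 0.0), ("A", 4, 0, 1.0),
           ("B", 1, 1, 0.0), ("B", 6, 1, 1.0)]
```

For `constant`, the score is zero at every value of `beta`, so the likelihood is flat and the coefficient cannot be estimated; for `varying`, the score changes with `beta` and an estimate exists in principle.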