Title: Cox Regression II
1Cox Regression II
Kristin Sainani, Ph.D.
http://www.stanford.edu/~kcobb
Stanford University, Department of Health Research and Policy
2Topics
- Stratification
- Age as time scale
- Residuals
- Repeated events
- Intention-to-treat analysis for RCTs
31. Stratification
- Violations of the PH assumption can be resolved by:
- Adding a time*covariate interaction
- Adding another time-dependent version of the covariate
- Stratification
4Stratification
- Different strata are allowed to have different baseline hazard functions.
- Hazard functions do not need to be parallel between different strata.
- Essentially results in a weighted hazard ratio being estimated, weighted over the different strata.
- Useful for nuisance confounders (where you do not care to estimate the effect).
- Does not allow you to evaluate interaction or confounding of the stratification variable (will miss possible interactions).
5Example stratify on gender
- Males: 1, 3, 4, 10, 12, 18 (subjects 1-6)
- Females: 1, 4, 5, 9 (subjects 7-10)
6The PL
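As a sketch of the general form (assuming no tied event times within a stratum), the stratified partial likelihood for the gender example multiplies one ordinary partial likelihood per stratum. The symbols $D_s$ (subjects with an observed event in stratum $s$) and $R_s(t_i)$ (subjects in stratum $s$ still at risk at time $t_i$) are notation introduced here for illustration:

$$
L_p(\beta) = \prod_{s \in \{\text{male},\, \text{female}\}} \; \prod_{i \in D_s} \frac{\exp(\beta^\top x_i)}{\sum_{j \in R_s(t_i)} \exp(\beta^\top x_j)}
$$

Each subject is compared only with members of the same stratum, so the baseline hazard cancels separately within males and within females, while a single $\beta$ is shared across strata.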
72. Using age as the time-scale in Cox Regression
- Age is a common confounder in Cox regression, since age is strongly related to death and disease.
- You may control for age by adding baseline age as a covariate to the Cox model.
- A better strategy for large-scale longitudinal surveys, such as NHANES, is to use age as your time-scale (rather than time-in-study).
- You may additionally stratify on birth cohort to control for cohort effects.
8Age as time-scale
- The risk set becomes everyone who was at risk at a certain age rather than at a certain event time.
- The risk set contains everyone who was still event-free at the age of the person who had the event.
- Requires enough people at risk at all ages (such as in a large-scale, longitudinal survey).
9The likelihood with age as time
Event times: 3, 5, 7, 12, 13 (years-in-study)
Baseline ages: 28, 25, 40, 29, 30 (years)
Age at event or censoring: 31, 30, 47, 41, 43
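To see how the risk sets change when age is the time scale, here is a minimal Python sketch using the five subjects above. It assumes delayed entry: a subject counts as at risk at a given age only if already enrolled (baseline age strictly below that age) and not yet past the age at event or censoring. That boundary convention and the helper name `risk_set_at_age` are assumptions made for illustration.

```python
# Illustrative sketch: risk sets when age (not time-in-study) is the time scale.
# Data are the five subjects from the example; the delayed-entry rule is an assumption.
baseline_age = [28, 25, 40, 29, 30]
years_in_study = [3, 5, 7, 12, 13]

# Age at event or censoring = baseline age + years in study.
exit_age = [b + y for b, y in zip(baseline_age, years_in_study)]


def risk_set_at_age(age):
    """Return 1-based subject numbers who are enrolled and still event-free at `age`."""
    return [
        i + 1
        for i, (entry, exit_) in enumerate(zip(baseline_age, exit_age))
        if entry < age <= exit_
    ]


# Risk set at each age at which someone leaves observation.
risk_sets = {age: risk_set_at_age(age) for age in sorted(exit_age)}
for age, members in risk_sets.items():
    print(age, members)
```

Under these assumptions, the subject who enrolled at age 40 does not appear in any risk set before age 41, even though that subject was in the study for 7 years.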
103. Residuals
- Residuals are used to investigate the lack of fit of a model to a given subject.
- For Cox regression, there's no easy analog to the usual "observed minus predicted" residual of linear regression.
11Martingale residual
- $c_i$ (1 if event, 0 if censored) minus the estimated cumulative hazard to $t_i$ (as a function of the fitted model) for individual $i$:
- $c_i - \hat{H}(t_i, X_i, \hat{\beta})$
- E.g., for a subject who was censored at 2 months, and whose predicted cumulative hazard to 2 months was 20%:
- Martingale = 0 - 0.20 = -0.20
- E.g., for a subject who had an event at 13 months, and whose predicted cumulative hazard to 13 months was 50%:
- Martingale = 1 - 0.50 = 0.50
- Gives "excess failures."
- Martingale residuals are not symmetrically distributed, even when the fitted model is correct, so transform to deviance residuals...
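The two worked examples can be reproduced with a few lines of Python. This is only a sketch: the function name `martingale_residual` is hypothetical, and the cumulative hazards are taken as given rather than estimated from a fitted model.

```python
def martingale_residual(event, cum_hazard):
    """Event indicator (1 = event, 0 = censored) minus estimated cumulative hazard."""
    return event - cum_hazard


# Subject censored at 2 months with predicted cumulative hazard 0.20.
censored_at_2_months = martingale_residual(0, 0.20)

# Subject with an event at 13 months with predicted cumulative hazard 0.50.
event_at_13_months = martingale_residual(1, 0.50)

print(censored_at_2_months, event_at_13_months)
```

Because the cumulative hazard is never negative, a censored subject's residual can never be positive and an event subject's residual can never exceed 1, which is one way to see why the distribution is skewed.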
12Deviance Residuals
- The deviance residual is a normalized transform of the martingale residual. These residuals are much more symmetrically distributed about zero.
- Observations with large deviance residuals are poorly predicted by the model.
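One common form of this transform is $d_i = \operatorname{sign}(m_i)\sqrt{-2\,[\,m_i + c_i \log(c_i - m_i)\,]}$, where $m_i$ is the martingale residual and $c_i$ the event indicator. The sketch below applies it to two illustrative subjects (a censored subject with $m = -0.20$ and an event subject with $m = 0.50$); the function name is hypothetical.

```python
import math


def deviance_residual(event, martingale):
    """Deviance residual from an event indicator (0/1) and a martingale residual."""
    # For an event, event - martingale is the estimated cumulative hazard (> 0).
    log_term = math.log(event - martingale) if event == 1 else 0.0
    magnitude = math.sqrt(-2.0 * (martingale + log_term))
    # The deviance residual keeps the sign of the martingale residual.
    return math.copysign(magnitude, martingale)


d_censored = deviance_residual(0, -0.20)
d_event = deviance_residual(1, 0.50)
print(round(d_censored, 3), round(d_event, 3))
```

The transform stretches residuals near 1 and shrinks large negative ones, which is what pulls the distribution toward symmetry.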
13Deviance Residuals
- Behave like residuals from ordinary linear regression.
- Should be symmetrically distributed around 0 and have standard deviation of 1.0.
- Negative for observations with longer than expected observed survival times.
- Plot deviance residuals against covariates to look for unusual patterns.
14Deviance Residuals
- In SAS, this is an option on the OUTPUT statement of PROC PHREG:
- output out=outdata resdev=Varname;
- SAS cannot produce these diagnostics if there is a time-dependent covariate in the model.
15Example uis data
Pattern looks fairly symmetric around 0.
16Example uis data
17Example censored only
18Example had event only
19Schoenfeld residuals
- Schoenfeld (1982) proposed the first set of residuals for use with Cox regression packages.
- Schoenfeld D. Residuals for the proportional hazards regression model. Biometrika, 1982; 69(1): 239-241.
- Instead of a single residual for each individual, there is a separate residual for each individual for each covariate.
- Note: Schoenfeld residuals are not defined for censored individuals.
20Schoenfeld residuals
- The Schoenfeld residual is defined as the covariate value for the individual that failed minus its expected value. (Yields residuals for each individual who failed, for each covariate.)
- Expected value of the covariate at time $t_i$ = a weighted average of the covariate, weighted by the likelihood of failure for each individual in the risk set at $t_i$.
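Written out, and assuming the usual Cox weights (each person's share of the total fitted hazard in the risk set), the Schoenfeld residual for covariate $k$ of the individual who fails at $t_i$ is as follows. The symbols $r_{ik}$, $w_j$, and $R(t_i)$ are notation introduced here for illustration:

$$
r_{ik} = x_{ik} - \sum_{j \in R(t_i)} w_j(t_i)\, x_{jk},
\qquad
w_j(t_i) = \frac{\exp(\hat{\beta}^\top x_j)}{\sum_{l \in R(t_i)} \exp(\hat{\beta}^\top x_l)}
$$

With this definition the weights sum to 1 over the risk set, so the second term is a proper weighted average of the covariate.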
21Example
- 5 people left in our risk set at event time = 7 months:
- Female 55-year-old smoker
- Male 45-year-old non-smoker
- Female 67-year-old smoker
- Male 58-year-old smoker
- Male 70-year-old non-smoker
- The 55-year-old female smoker is the one who has the event.
22Example
- Based on our model, we can calculate a predicted probability of death by time 7 for each person (call it p-hat):
- Female 55-year-old smoker: p-hat = .10
- Male 45-year-old non-smoker: p-hat = .05
- Female 67-year-old smoker: p-hat = .30
- Male 58-year-old smoker: p-hat = .20
- Male 70-year-old non-smoker: p-hat = .30
- Thus, the expected value for the AGE of the person who failed is:
- 55(.10) + 45(.05) + 67(.30) + 58(.20) + 70(.30) = 60.45, or about 60
- And, the Schoenfeld residual is approximately 55 - 60 = -5.
23Example
- Based on our model, we can calculate a predicted probability of death by time 7 for each person (call it p-hat):
- Female 55-year-old smoker: p-hat = .10
- Male 45-year-old non-smoker: p-hat = .05
- Female 67-year-old smoker: p-hat = .30
- Male 58-year-old smoker: p-hat = .20
- Male 70-year-old non-smoker: p-hat = .30
- The expected value for the GENDER (1 = male, 0 = female) of the person who failed is:
- 0(.10) + 1(.05) + 0(.30) + 1(.20) + 1(.30) = .55
- And, the Schoenfeld residual is 0 - .55 = -.55.
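The age and gender calculations can be checked with a short Python sketch. It uses the p-hat values exactly as listed in the example; the variable and function names are hypothetical.

```python
# Risk set at event time 7 months; the first person listed is the one who fails.
ages = [55, 45, 67, 58, 70]
male = [0, 1, 0, 1, 1]  # 1 = male, 0 = female
p_hat = [0.10, 0.05, 0.30, 0.20, 0.30]  # weights as given in the example


def expected_value(values, weights):
    """Weighted sum of a covariate over the risk set."""
    return sum(v * w for v, w in zip(values, weights))


failed = 0  # index of the 55-year-old female smoker

expected_age = expected_value(ages, p_hat)
expected_male = expected_value(male, p_hat)

age_residual = ages[failed] - expected_age
gender_residual = male[failed] - expected_male

print(round(expected_age, 2), round(age_residual, 2))
print(round(expected_male, 2), round(gender_residual, 2))
```

The unrounded expected age is 60.45, which the example rounds to 60. Note also that the listed p-hat values sum to 0.95; in a fitted Cox model the weights would be each person's share of the total hazard in the risk set and would sum to exactly 1.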
24Schoenfeld residuals
- Since the Schoenfeld residuals are, in principle, independent of time, a plot that shows a non-random pattern against time is evidence of violation of the PH assumption.
- Plot Schoenfeld residuals against time to evaluate the PH assumption.
- Regress Schoenfeld residuals against time to test for independence between residuals and time.
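A bare-bones version of the regression check might look like the sketch below: fit a least-squares line of residual on event time and look at the t-statistic for the slope. The residual values here are made up purely to illustrate a drifting pattern and do not come from any real dataset; the function name is hypothetical, and a full analysis would use a statistical package to obtain a p-value.

```python
import math


def slope_t_statistic(times, residuals):
    """Least-squares slope of residuals on time, with its t-statistic."""
    n = len(times)
    mean_t = sum(times) / n
    mean_r = sum(residuals) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times, residuals))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_t
    # Residual sum of squares around the fitted line.
    sse = sum((r - intercept - slope * t) ** 2 for t, r in zip(times, residuals))
    std_error = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope / std_error


# Hypothetical Schoenfeld residuals that drift downward over time.
event_times = [1, 2, 3, 4, 5, 6]
schoenfeld = [0.9, 0.6, 0.1, -0.2, -0.5, -0.9]

slope, t_stat = slope_t_statistic(event_times, schoenfeld)
print(round(slope, 3), round(t_stat, 1))
```

A slope far from zero relative to its standard error, as in this contrived example, is the kind of pattern that would suggest the covariate's effect changes over time.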
25Example no pattern with time
26Example violation of PH
27Schoenfeld residuals
- In SAS:
- Option on the OUTPUT statement
- output out=outdata ressch=Covariate1 Covariate2 Covariate3;
28Summary of the many ways to evaluate PH
assumption
- 1. Examine log(-log(S(t))) plots
- PH assumption is supported by parallel lines and refuted by lines that cross or nearly cross.
- Must use categorical predictors or categories of a continuous predictor.
- 2. Include interaction with time in the model
- PH assumption is supported by a non-significant interaction coefficient and refuted by a significant interaction coefficient.
- Retaining the interaction term in the model corrects for the violation of PH.
- Don't complicate your model in this way unless it's absolutely necessary!
- 3. Plot Schoenfeld residuals
- PH assumption is supported by a random pattern with time and refuted by a non-random pattern.
- 4. Regress Schoenfeld residuals against time to test for independence between residuals and time.
- PH assumption is supported by a non-significant relationship between residuals and time, and refuted by a significant relationship.
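The reason parallel log(-log(S(t))) curves support PH follows from a short derivation, sketched here for a single covariate $x$ with baseline survival $S_0(t)$ (notation introduced for illustration):

$$
S(t \mid x) = S_0(t)^{\exp(\beta x)}
\quad\Longrightarrow\quad
\log\bigl(-\log S(t \mid x)\bigr) = \beta x + \log\bigl(-\log S_0(t)\bigr)
$$

Under proportional hazards, the curves for two covariate groups therefore differ only by the constant $\beta(x_1 - x_2)$ at every $t$, which is why they should look parallel.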
294. Repeated events
- Death (presumably) can only happen once, but many outcomes could happen more than once:
- Fractures
- Heart attacks
- Pregnancy
- Etc.
30Repeated events 1
- Strategy 1: run a second Cox regression (among those who had a first event), starting with the first event time as the origin.
- Repeat for third, fourth, fifth events, etc.
- Problem: increasingly smaller sample sizes.
31Repeated events Strategy 2
- Treat each interval as a distinct observation, such that someone who had 3 events, for example, gives 3 observations to the dataset.
- Major problem: dependence among observations from the same individual.
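To make the data layout concrete, the sketch below splits one person's follow-up into one row per interval, with the clock restarting after each event. The function name `to_intervals`, the field names, and the example times are all hypothetical; adding a final censored interval when follow-up continues past the last event is an assumption of this sketch.

```python
def to_intervals(subject_id, event_times, end_of_followup):
    """Split one subject's follow-up into one row per between-event interval."""
    rows = []
    start = 0.0
    for t in sorted(event_times):
        rows.append(
            {"id": subject_id, "start": start, "stop": t, "gap": t - start, "event": 1}
        )
        start = t
    # Assumption: time after the last event contributes one censored interval.
    if end_of_followup > start:
        rows.append(
            {
                "id": subject_id,
                "start": start,
                "stop": end_of_followup,
                "gap": end_of_followup - start,
                "event": 0,
            }
        )
    return rows


# Hypothetical subject with events at months 2, 5, and 9, followed until month 9.
three_rows = to_intervals("A", [2, 5, 9], end_of_followup=9)

# Same events, but followed until month 12: a censored fourth interval is added.
four_rows = to_intervals("A", [2, 5, 9], end_of_followup=12)

for row in four_rows:
    print(row)
```

All rows for a subject share the same `id`, which is exactly where the dependence problem comes from: a standard Cox model would treat them as independent observations.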
32Strategy 3
- Stratify by individual (fixed effects partial likelihood)
- In PROC PHREG: strata id;
- Problems:
- Does not work well with RCT data
- Requires that most individuals have at least 2 events
- Can only estimate coefficients for those covariates that vary across successive spells for each individual; this excludes constant personal characteristics such as age, education, gender, ethnicity, genotype
335. Considerations when analyzing data from an RCT
34Intention-to-Treat Analysis
- Intention-to-treat analysis: compare outcomes according to the groups to which subjects were initially assigned, regardless of which intervention they actually received.
- Evaluates treatment effectiveness rather than treatment efficacy.
35Why intention to treat?
- Non-intention-to-treat analyses lose the benefits of randomization, as the groups may no longer be balanced with regard to factors that influence the outcome.
- Intention-to-treat analysis simulates "real life," where patients often don't adhere perfectly to treatment or may discontinue treatment altogether.
36Drop-ins and Drop-outs example, WHI
Women's Health Initiative Writing Group. JAMA. 2002; 288: 321-333.
37Effect of Intention to treat on the statistical
analysis
- Intention-to-treat analyses tend to underestimate treatment effects; increased variability "waters down" results.
38Example
- Take the following hypothetical RCT:
- Treated subjects have a 25% chance of dying during the 2-year study vs. placebo subjects, who have a 50% chance of dying.
- TRUE RR = 25/50 = .50 (treated have a 50% lower chance of dying)
- You do a 2-yr RCT of 100 treated and 100 placebo subjects.
- If nobody switched, you would see about 25 deaths in the treated group and about 50 deaths in the placebo group (give or take a few due to random chance).
- Observed RR ≈ .50
39Example, continued
- BUT, suppose that early in the study, 25 treated subjects switch to placebo and 25 placebo subjects switch to treatment.
- You would see about:
- 25(.25) + 75(.50) = 43.75, or 43-44 deaths in the placebo group
- And about:
- 25(.50) + 75(.25) = 31.25, or about 31 deaths in the treated group
- Observed RR = 31/44 ≈ .70
- Diluted effect!
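The dilution arithmetic can be reproduced with the short sketch below. It assumes, as the example does, that anyone who switches early takes on the full risk of the other arm; the function name is hypothetical.

```python
def expected_deaths(n_assigned, n_switched, risk_assigned, risk_other):
    """Expected deaths in an arm, analyzed by original assignment (intention to treat)."""
    n_stayed = n_assigned - n_switched
    return n_stayed * risk_assigned + n_switched * risk_other


risk_treated, risk_placebo = 0.25, 0.50

# No switching: the observed risk ratio matches the true one.
rr_no_switching = expected_deaths(100, 0, risk_treated, risk_placebo) / expected_deaths(
    100, 0, risk_placebo, risk_treated
)

# 25 subjects in each arm switch to the other intervention early on.
treated_arm = expected_deaths(100, 25, risk_treated, risk_placebo)
placebo_arm = expected_deaths(100, 25, risk_placebo, risk_treated)
rr_with_switching = treated_arm / placebo_arm

print(rr_no_switching, treated_arm, placebo_arm, round(rr_with_switching, 2))
```

With unrounded expected counts the diluted ratio is 31.25/43.75, about 0.71, consistent with the roughly .70 obtained from the rounded counts.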
40References
- Paul Allison. Survival Analysis Using SAS. SAS
Institute Inc., Cary, NC 2003.