Title: Introduction to Survival Analysis October 13
1. Introduction to Survival Analysis
October 13 and 20, 2009
- Brian F. Gage, MD, MSc
- with thanks to Bing Ho, MD, MPH
- Dept. of Medicine
- Washington University in St. Louis
2. Goal: Conceptual and Graphical Understanding of Survival Analyses
- What is survival analysis?
- When to use it?
- How it compares to alternative statistics
- Univariate method: Kaplan-Meier curves
- Multivariate methods: Cox proportional hazards model
- Assessment of adequacy of model
3. Sample Kidney Transplant (Tx) Data
PID  Years    Donor           Tx Fails
1    6.1790   Cadaveric       1
2    10.1604  Living-related  1
3    0.0260   Cadaveric       1
4    4.2967   Living-related  1
5    3.8560   Cadaveric       1
6    2.3644   Living-related  1
7    0.8420   Cadaveric       1
8    2.8048   Living-related  1
9    2.7940   Cadaveric       1
10   5.4670   Living-related  1
11   5.1450   Cadaveric       1
12   3.7554   Living-related  1
4. Univariate analysis of Tx survival in recipients of cadaveric kidney
Mean = 1.9 years; Median = 1.3 years
5. Univariate analysis of Tx survival in recipients of living-related kidney
Mean = 3.0 years; Median = 2.15 years
6. How Would You Analyze Those Data?
- All 2000 simulated pts. were followed until time of rejection or Tx failure.
- <write your data analysis plan here>
7. Univariate analysis of logarithm (Tx survival) in recipients of living-related kidney
8. Univariate analysis of logarithm (Tx survival) in recipients of cadaveric kidney
9. Comparisons of Log (Tx Survival)
Variable  Method         Variances  DF    Pr > |t|
LnYears   Pooled         Equal      1998  <.0001
LnYears   Satterthwaite  Unequal    1988  <.0001

Variable  Method                                 Two-Sided Pr > |Z|
LnYears   Wilcoxon/Mann-Whitney Two-Sample Test  <.0001
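As a rough illustration of the pooled and Satterthwaite rows, the sketch below computes both t statistics and their degrees of freedom on log(years) for the 12 sample rows from slide 3, using only the Python standard library. The function name `two_sample_t` is made up for this example. P-values are left out because the standard library has no t distribution, and 12 rows will not reproduce the DF of 1998 and 1988 that come from the full 2000 simulated patients.

```python
import math
from statistics import mean, variance

# Uncensored sample rows from the kidney Tx table (years to Tx failure).
cadaveric = [6.1790, 0.0260, 3.8560, 0.8420, 2.7940, 5.1450]
living = [10.1604, 4.2967, 2.3644, 2.8048, 5.4670, 3.7554]


def two_sample_t(x, y):
    """Return (t, df) for the pooled and the Satterthwaite two-sample t-tests."""
    nx, ny = len(x), len(y)
    mx, my = mean(x), mean(y)
    vx, vy = variance(x), variance(y)  # sample variances (n - 1 denominator)

    # Pooled test: assumes equal variances, df = nx + ny - 2.
    sp2 = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)
    t_pooled = (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))
    df_pooled = nx + ny - 2

    # Satterthwaite test: allows unequal variances, df is approximated.
    se2 = vx / nx + vy / ny
    t_satt = (mx - my) / math.sqrt(se2)
    df_satt = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))

    return {"pooled": (t_pooled, df_pooled), "satterthwaite": (t_satt, df_satt)}


ln_cadaveric = [math.log(v) for v in cadaveric]
ln_living = [math.log(v) for v in living]
results = two_sample_t(ln_living, ln_cadaveric)

for method, (t, df) in results.items():
    print(f"{method:14s} t = {t:.3f}, df = {df:.1f}")
```

The pooled DF is always n1 + n2 - 2 (1998 for 2000 patients), while the Satterthwaite DF shrinks as the two variances diverge.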
10. Suppose You Only Have Time/Money to Follow Participants for 4.5 Years or that some Patients Enrolled Late
PID  Years    Donor           Tx Fails
1    4.5      Cadaveric       0
2    4.5      Living-related  0
3    0.0260   Cadaveric       1
4    4.2967   Living-related  1
5    3.8560   Cadaveric       1
6    2.3644   Living-related  1
7    0.8420   Cadaveric       1
8    2.8048   Living-related  1
9    2.7940   Cadaveric       1
10   4.5      Living-related  0
11   4.5      Cadaveric       0
12   3.7554   Living-related  1
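The rule that turns the fully observed table into this censored one can be sketched in a few lines of Python. The names `Obs` and `censor_at` are illustrative, not from the lecture. The sketch assumes every patient enrolled at time 0, so censoring happens only at the 4.5-year cutoff. Late enrollment would give each patient a different cutoff.

```python
from typing import NamedTuple


class Obs(NamedTuple):
    pid: int
    years: float  # observed follow-up time
    donor: str
    failed: int   # 1 = Tx failure observed, 0 = censored


# Fully observed sample rows: every graft followed until failure.
full = [
    Obs(1, 6.1790, "Cadaveric", 1), Obs(2, 10.1604, "Living-related", 1),
    Obs(3, 0.0260, "Cadaveric", 1), Obs(4, 4.2967, "Living-related", 1),
    Obs(5, 3.8560, "Cadaveric", 1), Obs(6, 2.3644, "Living-related", 1),
    Obs(7, 0.8420, "Cadaveric", 1), Obs(8, 2.8048, "Living-related", 1),
    Obs(9, 2.7940, "Cadaveric", 1), Obs(10, 5.4670, "Living-related", 1),
    Obs(11, 5.1450, "Cadaveric", 1), Obs(12, 3.7554, "Living-related", 1),
]


def censor_at(rows, cutoff):
    """Stop follow-up at `cutoff` years; later failures become censored."""
    censored_rows = []
    for row in rows:
        if row.years > cutoff:
            # Failure happens after the window closes: we only know the
            # graft was still working at the cutoff.
            censored_rows.append(row._replace(years=cutoff, failed=0))
        else:
            censored_rows.append(row)
    return censored_rows


censored = censor_at(full, 4.5)
for row in censored:
    print(row.pid, row.years, row.donor, row.failed)
```

Patients 1, 2, 10, and 11 come out with Years = 4.5 and Tx Fails = 0, matching the table.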
11. Univariate analysis of Tx survival in recipients of cadaveric kidney
Data censored at 4.5 years
12. Univariate analysis of Tx survival in recipients of living-related kidney
Data censored at 4.5 years
13. Now, Survival Times are Censored
- A t-test is no longer appropriate
- We don't know how long patients will survive past the observation window
- We can't compute the mean (or SD) of survival time for the 2 cohorts
- although we may be able to observe medians
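A small numeric check of the last two bullets, using the six living-related sample rows. It assumes the same 4.5-year cutoff, and the variable names are illustrative. Treating censored times as failure times pulls the mean down, while the median is unchanged here because more than half of the grafts failed before the cutoff.

```python
from statistics import mean, median

# Living-related rows from the sample table: true failure times (years).
true_years = [10.1604, 4.2967, 2.3644, 2.8048, 5.4670, 3.7554]

CUTOFF = 4.5  # end of the observation window (years)

# What we actually get to see: follow-up stops at the cutoff.
observed_years = [min(t, CUTOFF) for t in true_years]

naive_mean = mean(observed_years)  # wrongly treats censored times as failures
true_mean = mean(true_years)       # unknowable in a real censored study

observed_median = median(observed_years)
true_median = median(true_years)

print(f"mean:   naive {naive_mean:.2f} vs true {true_mean:.2f}")
print(f"median: observed {observed_median:.2f} vs true {true_median:.2f}")
```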
14. To Analyze Censored Data, We Need to Use Time-to-Event Analysis, Such as S(t)
- Survivor function, S(t), defines the probability of surviving longer than time t
- Its estimate is known as the Kaplan-Meier curve or product-limit estimate
- Does not account for other covariates
- Model time to failure or time to event
- Survival analysis has a dichotomous (binary) outcome
- Unlike logistic regression, survival analysis analyzes the time to an event
- Able to account for censoring
- But not covariates
- When is this OK?
- Can compare survival between 2 groups
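A minimal sketch of the product-limit (Kaplan-Meier) estimate, applied to the 12 sample rows censored at 4.5 years. The function names `kaplan_meier` and `median_survival` are illustrative. `Fraction` keeps the survival probabilities exact so the comparison with 1/2 is not affected by floating-point rounding.

```python
from fractions import Fraction


def kaplan_meier(times, events):
    """Product-limit estimate: list of (t, S(t)) at each distinct failure time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = Fraction(1)
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0  # failures plus censored observations at time t
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            leaving += 1
            i += 1
        if failures:
            # Multiply by the conditional probability of surviving past t.
            surv *= Fraction(n_at_risk - failures, n_at_risk)
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve


def median_survival(curve):
    """Smallest failure time with S(t) <= 1/2, or None if never reached."""
    for t, s in curve:
        if s <= Fraction(1, 2):
            return t
    return None


# Sample rows censored at 4.5 years (event 1 = Tx failure, 0 = censored).
cad_times = [4.5, 0.0260, 3.8560, 0.8420, 2.7940, 4.5]
cad_events = [0, 1, 1, 1, 1, 0]
liv_times = [4.5, 4.2967, 2.3644, 2.8048, 4.5, 3.7554]
liv_events = [0, 1, 1, 1, 0, 1]

km_cad = kaplan_meier(cad_times, cad_events)
km_liv = kaplan_meier(liv_times, liv_events)

for label, curve in (("Cadaveric", km_cad), ("Living-related", km_liv)):
    print(label, [(t, float(s)) for t, s in curve], "median:", median_survival(curve))
```

Censored patients never trigger a drop in the curve, but they stay in the at-risk denominator until their censoring time. That is how the method uses incomplete follow-up.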
15. Kaplan-Meier Plots of Kidney Tx
[Figure: Kaplan-Meier curves of S(t) for living-related donor and cadaveric donor, with median survival marked; P < .0001]
16. How to Compare Kaplan-Meier Curves?
- Hypothesis test (test of significance)
- H0: the curves are statistically the same
- HA: the curves are statistically different
- Compares observed to expected cell counts
- Test statistic is compared to the χ² distribution
- Do you weigh each failure equally?
- Yes -> Log-Rank (Mantel-Haenszel) Test
- or do you penalize early failure more?
- Yes -> Generalized Wilcoxon (Breslow) Test.
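The observed-versus-expected comparison can be sketched as follows. The function name `weighted_logrank` and the `wilcoxon` flag are illustrative. With the flag off, every failure time gets weight 1 (log-rank). With it on, each failure time is weighted by the number still at risk, so early failures count more (generalized Wilcoxon). The data are the 12 sample rows censored at 4.5 years, far too few to reproduce the slides' P < .0001 from 2000 patients.

```python
import math


def weighted_logrank(times1, events1, times2, events2, wilcoxon=False):
    """Compare two survival curves; returns (chi-square, p-value on 1 df)."""
    rows = [(t, e, 1) for t, e in zip(times1, events1)]
    rows += [(t, e, 2) for t, e in zip(times2, events2)]
    failure_times = sorted({t for t, e, _ in rows if e == 1})

    score = 0.0     # sum of w * (observed - expected) failures in group 1
    variance = 0.0  # sum of w^2 * hypergeometric variance
    for t in failure_times:
        n1 = sum(1 for u, _, g in rows if g == 1 and u >= t)  # at risk, group 1
        n2 = sum(1 for u, _, g in rows if g == 2 and u >= t)  # at risk, group 2
        d1 = sum(1 for u, e, g in rows if g == 1 and u == t and e == 1)
        d2 = sum(1 for u, e, g in rows if g == 2 and u == t and e == 1)
        n, d = n1 + n2, d1 + d2
        w = n if wilcoxon else 1
        score += w * (d1 - d * n1 / n)
        if n > 1:
            variance += w * w * d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)

    chi2 = score ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


cad_times = [4.5, 0.0260, 3.8560, 0.8420, 2.7940, 4.5]
cad_events = [0, 1, 1, 1, 1, 0]
liv_times = [4.5, 4.2967, 2.3644, 2.8048, 4.5, 3.7554]
liv_events = [0, 1, 1, 1, 0, 1]

lr_chi2, lr_p = weighted_logrank(cad_times, cad_events, liv_times, liv_events)
gw_chi2, gw_p = weighted_logrank(cad_times, cad_events, liv_times, liv_events,
                                 wilcoxon=True)
print(f"log-rank:             chi2 = {lr_chi2:.3f}, p = {lr_p:.3f}")
print(f"generalized Wilcoxon: chi2 = {gw_chi2:.3f}, p = {gw_p:.3f}")
```

In this small sample the two early cadaveric failures make the generalized Wilcoxon statistic larger than the log-rank statistic, which is the weighting difference the slide describes.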
17. Time to Cardiovascular Adverse Event in VIGOR Trial
[Figure: cumulative incidence curves, 1-S(t); P < .001]
18. Censoring is Variable
- Subject does not experience event of interest
- Incomplete follow-up
- Lost to follow-up
- Withdraws from study
- Death (if not an endpoint)
19. Importance of censored data
- Why are censored data important?
- In a Cox model, what is the key assumption of
censoring?
20. When to use Survival Analysis
- When one suspects that 1 or more explanatory variable(s) explain the differences in time to an event
- Examples
- Time to death or clinical endpoint
- Time in remission after treatment of disease
- Recidivism rate after addiction treatment
- Especially when follow-up is incomplete or variable
21. [Figure: P = .0001]
Gage B et al. Adverse outcomes and predictors of underuse of antithrombotic therapy in Medicare beneficiaries with chronic atrial fibrillation. Stroke 2000;31:822-7.
22. Limitation of Kaplan-Meier curves
- What happens when you have several covariates that you believe contribute to survival?
- Example
- Smoking, hyperlipidemia, diabetes, and hypertension contribute to time to myocardial infarct or stroke.
- Can use stratified K-M curves, but only for 2 or maybe 3 categorical covariates.
- Need another approach: Cox proportional hazards model is most common for many covariates, esp. continuous ones
23. Multivariate method: Cox proportional hazards
- Can assess the effect of multiple covariates on survival
- Cox proportional hazards is the most commonly used multivariate survival method
- Easy to implement in SPSS, Stata, JMP, SAS, or R
- Parametric approaches are an alternative, but they require stronger assumptions about h(t)
- They yield a closed eqn. for S(t) and H(t)
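To make the last two bullets concrete, here is a sketch of the simplest parametric model, the exponential, which assumes a constant hazard h(t) = λ. That choice is an assumption for illustration, since the slides do not name a particular parametric model. Under it, the closed forms are H(t) = λt and S(t) = exp(-λt), and with censoring the maximum-likelihood estimate of λ is the number of failures divided by total follow-up time. The function names are illustrative.

```python
import math


def fit_exponential(times, events):
    """MLE of the constant hazard: failures / total follow-up time."""
    return sum(events) / sum(times)


def cumulative_hazard(t, lam):
    """H(t) = lam * t under the exponential model."""
    return lam * t


def survival(t, lam):
    """S(t) = exp(-H(t)) = exp(-lam * t) under the exponential model."""
    return math.exp(-cumulative_hazard(t, lam))


# Cadaveric sample rows censored at 4.5 years (1 = Tx failure, 0 = censored).
cad_times = [4.5, 0.0260, 3.8560, 0.8420, 2.7940, 4.5]
cad_events = [0, 1, 1, 1, 1, 0]

lam = fit_exponential(cad_times, cad_events)
model_median = math.log(2) / lam  # solves S(t) = 0.5

print(f"hazard = {lam:.4f} per year")
print(f"S(1) = {survival(1, lam):.3f}, S(4.5) = {survival(4.5, lam):.3f}")
print(f"model-based median survival = {model_median:.2f} years")
```

The stronger assumption is visible here: one number, λ, fixes the whole curve, whereas the Cox model leaves the baseline hazard free to take any shape.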
24. Cox model: Proportional hazard assumption
- Hazard Ratio (HR) = exp(B) is a multiplicative risk; this is the proportional hazard assumption
- Can handle both continuous and categorical predictor variables
- Can stratify results using a categorical variable
- Cox models distinguish individual contributions of covariates on survival.
25. Hazard Rate h(t)
h(t) = (# of pts. dying per unit time in the interval) / (# of pts. alive at t)
- h(t) is called the hazard rate, hazard function, conditional failure rate, or instantaneous failure rate.
26. The Hazard Rate h(t)
$h(t) = \lim_{\Delta t \to 0} \dfrac{\Delta(1 - S(t)) / \Delta t}{S(t)}$
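The limit definition can be checked numerically with a small forward difference. The function `hazard` and the two example survivor functions are illustrative choices, not from the lecture: an exponential S(t), whose hazard should be constant, and a Weibull S(t) with shape 2, whose hazard should rise linearly with t.

```python
import math


def hazard(S, t, dt=1e-6):
    """Approximate h(t) = [change in (1 - S) per unit time] / S(t)."""
    delta_failed = (1 - S(t + dt)) - (1 - S(t))
    return (delta_failed / dt) / S(t)


def S_exponential(t, lam=0.3):
    """Exponential survivor function: the hazard is lam at every t."""
    return math.exp(-lam * t)


def S_weibull(t, shape=2.0, scale=3.0):
    """Weibull survivor function: hazard is (shape/scale) * (t/scale)**(shape-1)."""
    return math.exp(-((t / scale) ** shape))


for t in (0.5, 2.0, 5.0):
    print(f"t = {t}: exponential h = {hazard(S_exponential, t):.4f}, "
          f"Weibull h = {hazard(S_weibull, t):.4f}")
```

Dividing by S(t) is what makes the rate conditional: it counts failures only among patients still alive at t.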
27. Cox proportional hazard model
- Separates baseline hazard function ($h_0(t)$, which can be any shape) from covariates
- Baseline hazard function over time
- $h(t) = h_0(t)\exp(B_1 X + B_0)$
- Covariates are not usually time dependent
- But they can be
- $B_1$ is used to calculate the hazard ratio, which is similar to the relative risk
- semiparametric
28. Time to Cardiovascular Adverse Event in VIGOR Trial Should be Summarized w/ a Single HR, Instead of
[Figure labels: RR 2.6, RR 2.4, RR 1.9, RR 1.9, RR 1.2]
29. Use These 2 Eqns. to Show How the Hazard Ratio Changes when Binary Factor B1 Is Present (X=1) Rather than Absent (X=0)
- $h(t) = h_0(t)\exp(B_1 X + B_0)$
- Hazard ratio (HR) $= h(t \mid X=1) / h(t \mid X=0)$
- Hint: $\exp(a) / \exp(b) = \exp(a-b)$
- Relative risk reduction, RRR, $= 1 - \mathrm{HR}$
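One way to work the exercise, substituting X = 1 and X = 0 into the model equation and applying the hint:

```latex
\begin{align*}
\mathrm{HR}
  &= \frac{h(t \mid X=1)}{h(t \mid X=0)}
   = \frac{h_0(t)\,\exp(B_1 \cdot 1 + B_0)}{h_0(t)\,\exp(B_1 \cdot 0 + B_0)} \\
  &= \exp\bigl((B_1 + B_0) - B_0\bigr)
   = \exp(B_1)
\end{align*}
```

The baseline hazard $h_0(t)$ and the constant $B_0$ cancel, so the hazard ratio does not depend on t. It follows that $\mathrm{RRR} = 1 - \exp(B_1)$.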
30. Cox proportional hazards models
- Hazard Ratio (HR) = exp(B) is a multiplicative risk; this is the proportional hazard assumption
- Violations of this assumption can sometimes be compensated for by using an interaction term
- Can handle both continuous and categorical predictor variables
- Can stratify results using a categorical variable
- No distribution assumption is required in that case
31. Output of Cox Proportional Hazard Model From Simulated Kidney Tx Data
Analysis of Maximum Likelihood Estimates
Parameter  DF  Parameter Estimate  Standard Error  Chi-Square  Pr > ChiSq
Donor      1   0.474               0.0493          92.3        <.0001

Parameter  Hazard Ratio  95% Hazard Ratio Confidence Limits
Donor      1.61          1.46  1.77

Thus, cadaveric Tx were 61% more likely to fail.
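The hazard ratio and its confidence limits follow directly from the parameter estimate and standard error in this output. The sketch below recomputes them with a Wald interval on the log-hazard scale. The function name `hazard_ratio_summary` is illustrative, and the interval method is an assumption, though it reproduces the printed limits.

```python
import math
from statistics import NormalDist


def hazard_ratio_summary(beta, se, level=0.95):
    """Return (HR, lower limit, upper limit, Wald chi-square) for a Cox coefficient."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    hr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    wald_chi2 = (beta / se) ** 2
    return hr, lower, upper, wald_chi2


# Donor row from the output above: estimate 0.474, standard error 0.0493.
hr, lower, upper, wald_chi2 = hazard_ratio_summary(0.474, 0.0493)
print(f"HR = {hr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, chi-square = {wald_chi2:.1f}")
```

The recomputed chi-square comes out slightly above the printed 92.3, presumably because the displayed estimate and standard error are rounded.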
32. Limitations of Cox PH model
- Normally, does not include variables that change over time
- Luckily, many covariates (e.g. gender, ethnicity, congenital condition, and birth year) are constant
33. Example: Tumor Extent
- 3000 patients derived from SEER cancer registry and Medicare billing information
- Explore the relationship between tumor extent and survival
- Hypothesis is that more extensive tumor involvement is related to poorer survival
34. Log-Rank χ² = 269, p < .0001
35. Example: Tumor Extent
- Tumor stage may not be the only covariate that affects survival
- Medical comorbidities and poor functional status may be associated with poorer outcome
- Ethnicity and gender may contribute
- Tumor grade and genotype may contribute
- Etc.
- Cox proportional hazards model could quantify these relationships
36. Summary of Kaplan-Meier Curves
- Model time to failure or time to event
- Survival analysis has a dichotomous (binary) outcome
- Unlike logistic regression, survival analysis analyzes the time to an event
- Able to account for censoring
- Can compare survival between 2 groups
37. Summary of time-to-event analyses
- Quantifies time to a single, dichotomous event
- Handles censored data well
- Cox models distinguish individual contributions of covariates on survival, provided certain assumptions are met
- Cox models are used commonly in outcomes research. To learn more, take a full course in survival analysis
- E.g. Math 434 - Survival Analysis or http://k30.im.wustl.edu/program/interm%20biostats%20syllabus.doc
- BST.520 Survival Data Analysis at SLU