Statistics 262: Intermediate Biostatistics - PowerPoint PPT Presentation

Slides: 74

Provided by: kristinc

Learn more at: http://web.stanford.edu

1
Statistics 262: Intermediate Biostatistics
Kaplan-Meier methods and Parametric Regression methods
2
More on Kaplan-Meier estimator of S(t) (product-limit estimator or KM estimator)

When there are no censored data, the KM estimator is simple and intuitive:
Estimated S(t) = proportion of observations with failure times > t.
For example, if you are following 10 patients, and 3 of them die by the end of the first year, then your best estimate of S(1 year) = 70%.
When there are censored data, KM provides an estimate of S(t) that takes censoring into account (see last week's lecture).
If the censored observation had actually been a failure: S(1 year) = 4/5 × 3/4 × 2/3 = 2/5 = 40%
KM estimator is defined only at times when events occur! (empirically defined)

3
KM (product-limit) estimator, formally
4
KM (product-limit) estimator, formally
This formula gives the product-limit estimate of
survival at each time an event happens.
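In the conventional notation (assumed here, not taken from the slide), the product-limit estimator is usually written as:

```latex
\hat{S}(t) = \prod_{i:\, t_i \le t} \left( 1 - \frac{d_i}{n_i} \right)
```

Here t_i are the distinct event times, d_i is the number of events at t_i, and n_i is the number still at risk just before t_i.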
5
Example 1: time-to-conception for subfertile women
"Failure" here is a good thing. 38 women (in 1982) were treated for infertility with laparoscopy and hydrotubation. All women were followed for up to 2 years to describe time-to-conception. The event is conception, and women "survived" until they conceived.
Example from BMJ, Dec 1998; 317: 1572-1580.
6
Raw data: Time (months) to conception or censoring in 38 sub-fertile women after laparoscopy and hydrotubation (1982 study)
Data from: Luthra P, Bland JM, Stanton SL. Incidence of pregnancy after laparoscopy and hydrotubation. BMJ 1982; 284: 1013-1014.
7
Corresponding Kaplan-Meier Curve
S(t) is estimated at 9 event times. (step-wise
function)
8
Raw data (same table and source as slide 6).
9
Raw data (same table and source as slide 6).
10
Corresponding Kaplan-Meier Curve
6 women conceived in 1st month (1st menstrual
cycle). Therefore, 32/38 survived
pregnancy-free past 1 month.
11
Corresponding Kaplan-Meier Curve
12
Important detail of how the data were coded: Censoring at t=2 indicates survival PAST the 2nd cycle (i.e., we know the woman survived her 2nd cycle pregnancy-free). Thus, for calculating the KM estimator at 2 months, this person should still be included in the risk set. Think of it as 2+ months, e.g., 2.1 months.
Raw data (same table and source as slide 6).
13
Corresponding Kaplan-Meier Curve
14
Corresponding Kaplan-Meier Curve
5 women conceive in 2nd month. The risk set at event time 2 included 32 women. Therefore, 27/32 = 84.4% survived event time 2 pregnancy-free.
Can get an estimate of the hazard rate here: h(t=2) = 5/32 = 15.6%. Given that you didn't get pregnant in month 1, you have an estimated 5/32 chance of conceiving in the 2nd month.
And estimate of density (marginal probability of conceiving in month 2): f(t) = h(t) × S(t) = (.711)(.156) = 11%
15
Raw data (same table and source as slide 6).
16
Corresponding Kaplan-Meier Curve
17
Corresponding Kaplan-Meier Curve
3 women conceive in the 3rd month. The risk set at event time 3 included 26 women. 23/26 = 88.5% survived event time 3 pregnancy-free.
18
Raw data (same table and source as slide 6).
Risk set at 4 months includes 22 women.
19
Corresponding Kaplan-Meier Curve
20
Corresponding Kaplan-Meier Curve
3 women conceive in the 4th month, and 1 was censored between months 3 and 4. The risk set at event time 4 included 22 women. 19/22 = 86.4% survived event time 4 pregnancy-free.
And estimate of density (marginal probability of conceiving in month 4): f(t) = h(t) × S(t) = (.136)(.542) = 7.4%
21
Raw data (same table and source as slide 6).
Risk set at 6 months includes 18 women.
22
Corresponding Kaplan-Meier Curve
23
Corresponding Kaplan-Meier Curve
2 women conceive in the 6th month of the study, and one was censored between months 4 and 6. The risk set at the 5th event time (6 months) included 18 women. 16/18 = 88.9% survived event time 5 pregnancy-free.
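The five steps walked through so far can be chained into a running product. A minimal sketch in Python, assuming the risk-set sizes and event counts quoted on the slides (38/6, 32/5, 26/3, 22/3, 18/2); the variable names are illustrative only:

```python
# (month, number at risk, number of conceptions) at the first five event times,
# as quoted on the slides.
steps = [(1, 38, 6), (2, 32, 5), (3, 26, 3), (4, 22, 3), (6, 18, 2)]

km_curve = []
surv = 1.0
for month, at_risk, events in steps:
    # Multiply by the conditional probability of "surviving" this event time.
    surv *= (at_risk - events) / at_risk
    km_curve.append((month, surv))

for month, s in km_curve:
    print(f"month {month}: S(t) = {s:.4f}")
```

The values at months 1 and 6 can be compared with the Survival column of the SAS output (0.8421 and 0.4825).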
24
Skipping ahead to the 9th and final event time (months = 16)
25
Raw data (same table and source as slide 6).
2 remaining at 16 months (9th event time).
26
Skipping ahead to the 9th and final event time (months = 16)
Tail here just represents that the final 2 women did not conceive (cannot make many inferences from the end of a KM curve)!
27
Kaplan-Meier SAS output

The LIFETEST Procedure
Product-Limit Survival Estimates

   time   Survival   Failure   Std Error   Failed   Left
 0.0000     1.0000    0           0             0     38
 1.0000      .         .           .            1     37
 1.0000      .         .           .            2     36
 1.0000      .         .           .            3     35
 1.0000      .         .           .            4     34
 1.0000      .         .           .            5     33
 1.0000     0.8421    0.1579      0.0592        6     32
 2.0000      .         .           .            7     31
 2.0000      .         .           .            8     30
 2.0000      .         .           .            9     29

28
Kaplan-Meier SAS output

   time   Survival   Failure   Std Error   Failed   Left
 6.0000      .         .           .           18     17
 6.0000     0.4825    0.5175      0.0834       19     16
 7.0000      .         .           .           19     15
 7.0000      .         .           .           19     14
 8.0000      .         .           .           19     13
 8.0000      .         .           .           19     12
 9.0000      .         .           .           20     11
 9.0000      .         .           .           21     10
 9.0000     0.3619    0.6381      0.0869       22      9
 9.0000      .         .           .           22      8
 9.0000      .         .           .           22      7
 9.0000      .         .           .           22      6
10.0000     0.3016    0.6984      0.0910       23      5

29
Monday Gut Check Problem

Calculate the product-limit estimate of survival for the following data (n = 9):

Time-to-event (months)   Survival (1 = died / 0 = censored)
        10                  0
         2                  1
         4                  0
         8                  1
        12                  0
        14                  0
        10                  1
         1                  0
         3                  0
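One way to check a hand calculation for a problem like this is a few lines of Python. This is only a sketch: the function name `kaplan_meier` is made up for illustration, and it follows the coding convention from the conception example (a subject censored at time t is still counted in the risk set at t).

```python
def kaplan_meier(times, events):
    """Product-limit estimate; events: 1 = died, 0 = censored.

    Returns a list of (event_time, S(t)) pairs, one per distinct event time.
    Subjects censored at time t are kept in the risk set at t.
    """
    curve = []
    surv = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve

# Data from the gut check problem above.
times = [10, 2, 4, 8, 12, 14, 10, 1, 3]
events = [0, 1, 0, 1, 0, 0, 1, 0, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

The estimate only changes at the event times, so the output has one line per distinct death time.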
30
Not so easy to get a plot of the actual hazard function! In SAS, you need a complicated macro, and the result depends on assumptions. Here's what I get from Paul Allison's macro for these data:
31
At best, you can get the cumulative hazard
function
32
Cumulative Hazard Function

If the hazard function is constant, e.g. h(t) = k, then the cumulative hazard function will be linear (and higher hazards will have steeper slopes).

If the hazard function is increasing with time, e.g. h(t) = kt, then the cumulative hazard function will be curved up; for example, h(t) = kt gives a quadratic.

If the hazard function is decreasing over time, e.g. h(t) = k/t, then the cumulative hazard function should be curved down.
33
Kaplan-Meier example 2

Researchers randomized 44 patients with chronic active hepatitis to receive prednisolone or no treatment (control), then compared survival curves.

Example from BMJ 1998; 317: 468-469 (15 August)
34
Survival times (months) of 44 patients with
chronic active hepatitis randomised to receive
prednisolone or no treatment.
Data from BMJ 1998; 317: 468-469 (15 August)
censored
35
Kaplan-Meier example 2
Are these two curves different?
Misleading to the eye: apparent convergence by end of study. But this is due to 6 controls who survived fairly long, and 3 events in the treatment group when the sample size was small.
36
Control group

   time   Survival   Failure   Std Error   Failed   Left
  0.000     1.0000    0           0             0     22
  2.000     0.9545    0.0455      0.0444        1     21
  3.000     0.9091    0.0909      0.0613        2     20
  4.000     0.8636    0.1364      0.0732        3     19
  7.000     0.8182    0.1818      0.0822        4     18
 10.000     0.7727    0.2273      0.0893        5     17
 22.000     0.7273    0.2727      0.0950        6     16
 28.000     0.6818    0.3182      0.0993        7     15
 29.000     0.6364    0.3636      0.1026        8     14
 32.000     0.5909    0.4091      0.1048        9     13
 37.000     0.5455    0.4545      0.1062       10     12
 40.000     0.5000    0.5000      0.1066       11     11
 41.000     0.4545    0.5455      0.1062       12     10
 54.000     0.4091    0.5909      0.1048       13      9
 61.000     0.3636    0.6364      0.1026       14      8

6 controls made it past 100 months.
37
treated group

   time   Survival   Failure   Std Error   Failed   Left
  0.000     1.0000    0           0             0     22
  2.000     0.9545    0.0455      0.0444        1     21
  6.000     0.9091    0.0909      0.0613        2     20
 12.000     0.8636    0.1364      0.0732        3     19
 54.000     0.8182    0.1818      0.0822        4     18
 56.000      .         .           .            4     17
 68.000     0.7701    0.2299      0.0904        5     16
 89.000     0.7219    0.2781      0.0967        6     15
 96.000      .         .           .            7     14
 96.000     0.6257    0.3743      0.1051        8     13
125.000      .         .           .            8     12
128.000      .         .           .            8     11
131.000      .         .           .            8     10
140.000      .         .           .            8      9

38
Point-wise confidence intervals
We will not worry about the mathematical formula for confidence bands. The important point is that there is a confidence interval for each estimate of S(t). (SAS uses Greenwood's formula.)
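For readers who do want to see where the standard errors come from, here is a minimal sketch of Greenwood's formula in Python. The function name `greenwood_se` is hypothetical; the inputs are the risk-set sizes and event counts for the first three event times in the control group of the hepatitis example.

```python
import math

def greenwood_se(steps):
    """steps: list of (at_risk, events) at successive event times.

    Returns (S(t), standard error) pairs using Greenwood's formula:
    Var[S(t)] ~ S(t)^2 * sum over event times of d_i / (n_i * (n_i - d_i)).
    """
    out = []
    surv, acc = 1.0, 0.0
    for n, d in steps:
        surv *= (n - d) / n
        acc += d / (n * (n - d))
        out.append((surv, surv * math.sqrt(acc)))
    return out

# First three event times in the control group (2, 3 and 4 months).
control = [(22, 1), (21, 1), (20, 1)]
for s, se in greenwood_se(control):
    print(f"S = {s:.4f}, SE = {se:.4f}")
```

The results can be compared with the Survival and Standard Error columns of the control-group output (0.9545/0.0444, 0.9091/0.0613, 0.8636/0.0732).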
39
Log-rank test

Test of Equality over Strata

Test         Chi-Square   DF   Pr > Chi-Square
Log-Rank         4.6599    1            0.0309
Wilcoxon         6.5435    1            0.0105
-2Log(LR)        5.4096    1            0.0200

Chi-square test (with 1 df) of the (overall)
difference between the two groups. Groups appear
significantly different.
40
Log-rank test
Log-rank test is just a Cochran-Mantel-Haenszel
chi-square test! Anyone remember (know) what
this is?
41
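CMH test of conditional independence
K strata = unique event times
(2x2 table for stratum k: group by event status, N_k at risk in total)
42
CMH test of conditional independence
K strata = unique event times
(2x2 table for stratum k: group by event status, N_k at risk in total)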
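In the usual textbook notation (assumed here; the slide's own symbols are not recoverable beyond N_k), let a_k be the observed number of events in group 1 in stratum k, d_k the total number of events in that stratum, and n_{1k}, n_{0k} the numbers at risk in each group, with N_k = n_{1k} + n_{0k}. The CMH statistic is then commonly written as:

```latex
\begin{align*}
E(a_k) &= \frac{d_k \, n_{1k}}{N_k}, \qquad
\operatorname{Var}(a_k) = \frac{n_{1k} \, n_{0k} \, d_k \, (N_k - d_k)}{N_k^2 \, (N_k - 1)} \\
\chi^2_{\mathrm{CMH}} &= \frac{\left[ \sum_{k=1}^{K} \bigl( a_k - E(a_k) \bigr) \right]^2}{\sum_{k=1}^{K} \operatorname{Var}(a_k)} \;\sim\; \chi^2_1
\end{align*}
```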
43
CMH test of conditional independence
How do you know that this is a chi-square with 1
df?
44
Event time 1 (2 months), control group

(Same LIFETEST output as the control group table on slide 36.)

45
Event time 1 (2 months), treated group

(Same LIFETEST output as the treated group table on slide 37.)

46
Stratum 1 = event time 1
Event time 1: 1 died from each group (22 at risk in each group; 44 at risk in total).
47
Event time 2 (3 months), control group

(Same LIFETEST output as the control group table on slide 36.)

48
Event time 2 (3 months), treated group

(Same LIFETEST output as the treated group table on slide 37.)

49
Stratum 2 = event time 2
Event time 2: At 3 months, 1 died in the control group. At that time 21 from each group were at risk (42 in total).
50
Event time 3 (4 months), control group

(Same LIFETEST output as the control group table on slide 36.)

51
Event time 3 (4 months), treated group

(Same LIFETEST output as the treated group table on slide 37.)

52
Stratum 3 = event time 3 (4 months)
Event time 3: At 4 months, 1 died in the control group. At that time 21 from the treated group and 20 from the control group were at risk (41 in total).
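The three strata just described can be turned into the running sums that feed the log-rank (CMH) statistic. A sketch in Python, assuming the standard hypergeometric expectation and variance for each 2x2 table; the function name `logrank_terms` and the tuple layout are illustrative, and only the first three of the K strata are included, so this is not the full test statistic:

```python
def logrank_terms(strata):
    """strata: list of (deaths_control, at_risk_control,
                        deaths_treated, at_risk_treated).

    Returns the observed control deaths, their expected number under the
    null hypothesis, and the hypergeometric variance, summed over strata.
    """
    observed = expected = variance = 0.0
    for d1, n1, d0, n0 in strata:
        d, n = d1 + d0, n1 + n0
        observed += d1
        expected += d * n1 / n
        variance += n1 * n0 * d * (n - d) / (n ** 2 * (n - 1))
    return observed, expected, variance

# The first three strata described on the slides (2, 3 and 4 months).
strata = [(1, 22, 1, 22), (1, 21, 0, 21), (1, 20, 0, 21)]
obs, exp_, var = logrank_terms(strata)
print(obs, round(exp_, 3), round(var, 3))
# With all K strata included, the statistic would be (obs - exp_)**2 / var.
```

Continuing the same sums over every event time in the data set is what produces the single chi-square value reported by SAS.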
53
Etc.
54
Log-rank test, et al.

Test of Equality over Strata

Test         Chi-Square   DF   Pr > Chi-Square
Log-Rank         4.6599    1            0.0309
Wilcoxon         6.5435    1            0.0105
-2Log(LR)        5.4096    1            0.0200

55
Estimated log(S(t))
Maybe hazard function decreases a little then
increases a little? Hard to say exactly
56
Approximated h(t)
57
One more graph from SAS
log(-log(S(t))) = log(cumulative hazard). If group plots are parallel, this indicates that the proportional hazards assumption is valid. This is a necessary assumption for calculation of hazard ratios.
58
Uses of Kaplan-Meier

Commonly used to describe survivorship of study
population/s.
Commonly used to compare two study populations.
Intuitive graphical presentation.

59
Limitations of Kaplan-Meier

Mainly descriptive
Doesn't control for covariates
Requires categorical predictors
SAS does let you easily discretize continuous variables for KM methods, for exploratory purposes.
Can't accommodate time-dependent variables

60
Parametric Models for the hazard/survival function

The class of regression models estimated by PROC
LIFEREG is known as the accelerated failure time
models.

61
Parameters of the Weibull distribution
Shape parameter (inverse of the scale parameter):
< 1: hazard rate is decreasing
> 1: hazard rate is increasing
62
Constant hazard rate (special case of Weibull where shape parameter = 1.0)
63
Recall two parametric models

Components
A baseline hazard function (that may change over
time).
A linear function of a set of k fixed covariates
that when exponentiated (and a few other things)
gives the relative risk.

64
To get Hazard Ratios (relative risk)

Weibull (and thus exponential) are proportional
hazards models, so hazard ratio can be
calculated.
For other parametric models, you cannot calculate
hazard ratio (hazards are not necessarily
proportional over time).

More tricky to get confidence intervals here!
65
What's a hazard ratio?

Distinction between hazard/rate ratio and odds ratio/risk ratio:
Hazard/rate ratio = ratio of incidence rates
Odds/risk ratio = ratio of proportions

66
Example 1

Using data from pregnancy study
Recall: roughly, hazard rates were similar over time (implies exponential model should be a good fit).

67
The LIFEREG Procedure
Analysis of Parameter Estimates

Parameter       DF   Estimate   Std Error   95% Confidence Limits   Chi-Square   Pr > ChiSq
Intercept        1     2.2636      0.2049     1.8621     2.6651         122.08       <.0001
Scale            1     1.0217      0.1638     0.7462     1.3987
Weibull Shape    1     0.9788      0.1569     0.7149     1.3401

Scale of 1.0 makes a Weibull an exponential, so looks exponential.
68
Parametric estimates of survival function based
on a Weibull model (left) and exponential (right).
69
Example 2: 2 groups

Using data from hepatitis trial, I fit
exponential and Weibull models in SAS using
LIFEREG (Weibull is default in LIFEREG)

70
The LIFEREG Procedure

Dependent Variable           Log(time)
Right Censored Values        17
Left Censored Values         0
Interval Censored Values     0
Name of Distribution         Exponential
Log Likelihood               -68.03461345

Analysis of Parameter Estimates

Parameter       DF   Estimate   Std Error   95% Confidence Limits   Chi-Square   Pr > ChiSq
Intercept        1     4.4886      0.2500     3.9986     4.9786         322.37       <.0001
group            1     0.9008      0.3917     0.1332     1.6685           5.29       0.0214
Scale            0     1.0000      0.0000     1.0000     1.0000
Weibull Shape    0     1.0000      0.0000     1.0000     1.0000

Hazard ratio (treated vs. control) = e^(-0.9008) = .406
Interpretation: the mortality rate is about 60% lower in the treated group; equivalently, under the exponential model, the median time to death is about 2.5 times longer in the treated group (e^(0.9008) = 2.46).
71
Model Information

Dependent Variable           Log(time)
Right Censored Values        17
Left Censored Values         0
Interval Censored Values     0
Name of Distribution         Weibull
Log Likelihood               -66.94904552

Analysis of Parameter Estimates

Parameter       DF   Estimate   Std Error   95% Confidence Limits   Chi-Square   Pr > ChiSq
Intercept        1     4.4811      0.3169     3.8601     5.1022         200.00       <.0001
group            1     1.0544      0.5096     0.0556     2.0533           4.28       0.0385
Scale            1     1.2673      0.2139     0.9103     1.7643
Weibull Shape    1     0.7891      0.1332     0.5668     1.0985

Comparison of models using Likelihood Ratio test: [-2 LogLikelihood(simpler model)] - [-2 LogLikelihood(more complex model)] is chi-square with 1 df (1 extra parameter estimated for the Weibull model). 136 - 134 = 2; NS. No evidence that the Weibull model is much better than the exponential.
Hazard ratio (treated vs. control) = e^(-1.05/1.267) ≈ .44
Shape parameter is just 1/scale parameter!
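The arithmetic on the last two slides can be reproduced directly from the printed LIFEREG values. A minimal sketch in Python; the variable names are illustrative, and the p-value uses the identity that a chi-square with 1 df is a squared standard normal:

```python
import math

# Values copied from the two LIFEREG outputs for the hepatitis trial.
loglik_exponential = -68.03461345
loglik_weibull = -66.94904552
beta_exp = 0.9008                        # 'group' coefficient, exponential model
beta_weib, scale_weib = 1.0544, 1.2673   # 'group' coefficient and scale, Weibull

# Likelihood ratio statistic: 1 extra parameter, so 1 df.
lr_stat = 2 * (loglik_weibull - loglik_exponential)
# Upper tail of a chi-square with 1 df, via the complementary error function.
p_value = math.erfc(math.sqrt(lr_stat / 2))

# Convert the accelerated-failure-time coefficients to hazard ratios.
hr_exponential = math.exp(-beta_exp)
hr_weibull = math.exp(-beta_weib / scale_weib)

print(f"LR = {lr_stat:.2f}, p = {p_value:.3f}")
print(f"HR exponential = {hr_exponential:.3f}, HR Weibull = {hr_weibull:.3f}")
print(f"Weibull shape = 1/scale = {1 / scale_weib:.4f}")
```

Working from the unrounded log likelihoods gives a statistic slightly above 2, which is still well short of the 3.84 cutoff for significance at the 5% level with 1 df.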
72
Parametric estimates of cumulative survival based
on Weibull model (left) and exponential (right),
by group.
73
Compare to Cox regression