Statistics 262: Intermediate Biostatistics - PowerPoint PPT Presentation

Slides: 74
Provided by: kristinc
Learn more at: http://web.stanford.edu
1
Statistics 262: Intermediate Biostatistics
Kaplan-Meier Methods and Parametric Regression
Methods
2
More on the Kaplan-Meier estimator of S(t)
(product-limit estimator or KM estimator)
  • When there are no censored data, the KM estimator
    is simple and intuitive:
  • Estimated S(t) = proportion of observations with
    failure times > t.
  • For example, if you are following 10 patients,
    and 3 of them die by the end of the first year,
    then your best estimate of S(1 year) = 70%.
  • When there are censored data, KM provides an
    estimate of S(t) that takes censoring into
    account (see last week's lecture).
  • If the censored observation had actually been a
    failure: S(1 year) = 4/5 × 3/4 × 2/3 = 2/5 = 40%.
  • The KM estimator is defined only at times when
    events occur! (empirically defined)

3
KM (product-limit) estimator, formally
4
KM (product-limit) estimator, formally
This formula gives the product-limit estimate of
survival at each time an event happens.
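For reference, the product-limit estimator is usually written as below. The notation is the common textbook one and is an assumption here, not necessarily the symbols used on the slide: t_i are the distinct event times, d_i is the number of events at t_i, and n_i is the number still at risk just before t_i.

```latex
\hat{S}(t) \;=\; \prod_{i:\, t_i \le t} \left( 1 - \frac{d_i}{n_i} \right)
```

Each factor is the estimated conditional probability of surviving past t_i given survival up to t_i, which is why the estimate changes only at event times.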
5
Example 1: time-to-conception for subfertile women
"Failure" here is a good thing. 38 women (in
1982) were treated for infertility with
laparoscopy and hydrotubation. All women were
followed for up to 2 years to describe
time-to-conception. The event is conception, and
women "survived" until they conceived.
Example from BMJ, Dec 1998; 317: 1572-1580.
6
Raw data: Time (months) to conception or
censoring in 38 sub-fertile women after
laparoscopy and hydrotubation (1982 study).
Data from Luthra P, Bland JM, Stanton SL.
Incidence of pregnancy after laparoscopy and
hydrotubation. BMJ 1982; 284: 1013-1014.
7
Corresponding Kaplan-Meier Curve
S(t) is estimated at 9 event times. (step-wise
function)
10
Corresponding Kaplan-Meier Curve
6 women conceived in the 1st month (1st menstrual
cycle). Therefore, 32/38 = 84.2% survived
pregnancy-free past 1 month.
11
Corresponding Kaplan-Meier Curve
12
Important detail of how the data were coded:
Censoring at t = 2 indicates survival PAST
the 2nd cycle (i.e., we know the woman survived
her 2nd cycle pregnancy-free). Thus, for
calculating the KM estimator at 2 months, this person
should still be included in the risk set. Think
of it as 2+ months, e.g., 2.1 months.
13
Corresponding Kaplan-Meier Curve
14
Corresponding Kaplan-Meier Curve
5 women conceive in the 2nd month. The risk set at
event time 2 included 32 women. Therefore,
27/32 = 84.4% survived event time 2
pregnancy-free.
Can get an estimate of the hazard rate here:
h(t=2) = 5/32 = 15.6%. Given that you didn't get
pregnant in month 1, you have an estimated 5/32
chance of conceiving in the 2nd month. And an
estimate of the density (marginal probability of
conceiving in month 2): f(t) = h(t) × S(t) =
(.711)(.156) = 11%.
16
Corresponding Kaplan-Meier Curve
17
Corresponding Kaplan-Meier Curve
3 women conceive in the 3rd month. The risk set
at event time 3 included 26 women. 23/26 = 88.5%
survived event time 3 pregnancy-free.
18
Raw data: Time (months) to conception or
censoring in 38 sub-fertile women (1982 study;
Luthra, Bland, Stanton. BMJ 1982; 284: 1013-1014).
Risk set at 4 months includes 22 women.
19
Corresponding Kaplan-Meier Curve
20
Corresponding Kaplan-Meier Curve
3 women conceive in the 4th month, and 1 was
censored between months 3 and 4. The risk set at
event time 4 included 22 women. 19/22 = 86.4%
survived event time 4 pregnancy-free.
And an estimate of the density (marginal probability of
conceiving in month 4): f(t) = h(t) × S(t) =
(.136)(.542) = 7.4%.
21
Raw data: Time (months) to conception or
censoring in 38 sub-fertile women (1982 study;
Luthra, Bland, Stanton. BMJ 1982; 284: 1013-1014).
Risk set at 6 months includes 18 women.
22
Corresponding Kaplan-Meier Curve
23
Corresponding Kaplan-Meier Curve
2 women conceive in the 6th month of the study,
and one was censored between months 4 and 6. The
risk set at event time 5 (6 months) included 18
women. 16/18 = 88.9% survived event time 5
pregnancy-free.
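The step-by-step walk above can be checked with a short sketch. It uses only the risk-set sizes and conception counts quoted on the slides for the first five event times; the function and variable names are illustrative, not from the lecture.

```python
# (month, number at risk, number conceiving) for event times 1-5,
# as quoted on the slides above.
steps = [(1, 38, 6), (2, 32, 5), (3, 26, 3), (4, 22, 3), (6, 18, 2)]


def km_running_product(steps):
    """Return [(month, hazard, survival)] using S(t) = product of (1 - d/n)."""
    survival = 1.0
    rows = []
    for month, at_risk, events in steps:
        hazard = events / at_risk      # conditional probability of the event
        survival *= 1.0 - hazard       # multiply in this step's survival
        rows.append((month, hazard, survival))
    return rows


rows = km_running_product(steps)
for month, hazard, survival in rows:
    print(f"month {month}: hazard = {hazard:.3f}, S(t) = {survival:.4f}")
```

The running product after month 1 and after month 6 should agree with the 0.8421 and 0.4825 printed in the SAS product-limit output for these data.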
24
Skipping ahead to the 9th and final event time
(months = 16)
25
Raw data: Time (months) to conception or
censoring in 38 sub-fertile women (1982 study;
Luthra, Bland, Stanton. BMJ 1982; 284: 1013-1014).
2 remaining at 16 months (9th event time).
26
Skipping ahead to the 9th and final event time
(months = 16)
The tail here just represents that the final 2 women
did not conceive (you cannot make many inferences
from the end of a KM curve)!
27
Kaplan-Meier SAS output
The LIFETEST Procedure
Product-Limit Survival Estimates

                                Survival
                                Standard   Number   Number
    time   Survival   Failure      Error   Failed     Left
  0.0000     1.0000         0          0        0       38
  1.0000          .         .          .        1       37
  1.0000          .         .          .        2       36
  1.0000          .         .          .        3       35
  1.0000          .         .          .        4       34
  1.0000          .         .          .        5       33
  1.0000     0.8421    0.1579     0.0592        6       32
  2.0000          .         .          .        7       31
  2.0000          .         .          .        8       30
  2.0000          .         .          .        9       29

28
Kaplan-Meier SAS output
                                Survival
                                Standard   Number   Number
    time   Survival   Failure      Error   Failed     Left
  6.0000          .         .          .       18       17
  6.0000     0.4825    0.5175     0.0834       19       16
  7.0000          .         .          .       19       15
  7.0000          .         .          .       19       14
  8.0000          .         .          .       19       13
  8.0000          .         .          .       19       12
  9.0000          .         .          .       20       11
  9.0000          .         .          .       21       10
  9.0000     0.3619    0.6381     0.0869       22        9
  9.0000          .         .          .       22        8
  9.0000          .         .          .       22        7
  9.0000          .         .          .       22        6
 10.0000     0.3016    0.6984     0.0910       23        5

29
Monday Gut Check Problem
  • Calculate the product-limit estimate of survival
    for the following data (n = 9):

Time-to-event (months)   Survival (1 = died / 0 = censored)
          10                  0
           2                  1
           4                  0
           8                  1
          12                  0
          14                  0
          10                  1
           1                  0
           3                  0
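One way to check a hand calculation for this problem is a small product-limit routine. This is a minimal sketch, not the course's SAS code: the function name is made up for illustration, and it follows the coding convention described earlier in the lecture, where a subject censored at time t still counts in the risk set at t.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate.

    Returns [(event_time, at_risk, deaths, survival)].
    events[i] is 1 for a death and 0 for a censored observation.
    A subject censored at time t is kept in the risk set at t.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    table = []
    for t in sorted(deaths):
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        table.append((t, at_risk, deaths[t], survival))
    return table


# Gut-check data from the slide (n = 9)
times = [10, 2, 4, 8, 12, 14, 10, 1, 3]
events = [0, 1, 0, 1, 0, 0, 1, 0, 0]

km_table = kaplan_meier(times, events)
for t, n, d, s in km_table:
    print(f"t = {t:>2}: at risk = {n}, deaths = {d}, S(t) = {s:.3f}")
```

Only the three death times produce rows; the censored observations matter only through the shrinking risk set.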
30
Not so easy to get a plot of the actual hazard
function! In SAS, you need a complicated macro, and it
depends on assumptions. Here's what I get from
Paul Allison's macro for these data:
31
At best, you can get the cumulative hazard
function
32
Cumulative Hazard Function
  • If the hazard function is constant, e.g. h(t) = k,
    then the cumulative hazard function will be
    linear (and higher hazards will have steeper
    slopes).
  • If the hazard function is increasing with time,
    e.g. h(t) = kt, then the cumulative hazard function
    will be curved up; for example, h(t) = kt gives a
    quadratic.
  • If the hazard function is decreasing over time,
    e.g. h(t) = k/t, then the cumulative hazard
    function should be curved down; for example,
    h(t) = k/t gives a logarithmic curve.
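The three shapes follow from integrating the hazard. The worked integrals below are a sketch in standard notation; the lower limit t_0 > 0 in the last line is an assumption added here, because k/t cannot be integrated from zero.

```latex
H(t) = \int_0^t h(u)\,du
\qquad
\begin{aligned}
h(t) = k   &\;\Rightarrow\; H(t) = k\,t \\
h(t) = k\,t &\;\Rightarrow\; H(t) = \tfrac{1}{2}\,k\,t^{2} \\
h(t) = k/t &\;\Rightarrow\; H(t) - H(t_0) = k \log(t / t_0), \quad t \ge t_0 > 0
\end{aligned}
```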

33
Kaplan-Meier example 2
  • Researchers randomized 44 patients with chronic
    active hepatitis to receive prednisolone or
    no treatment (control), then compared survival
    curves.

Example from BMJ 1998; 317: 468-469 (15 August)
34
Survival times (months) of 44 patients with
chronic active hepatitis randomised to receive
prednisolone or no treatment.
Data from BMJ 1998; 317: 468-469 (15 August)
35
Kaplan-Meier example 2
Are these two curves different?
Misleading to the eye: apparent convergence by the end
of the study. But this is due to 6 controls who
survived fairly long, and 3 events in the
treatment group when the sample size was small.
36
Control group
                                Survival
                                Standard   Number   Number
    time   Survival   Failure      Error   Failed     Left
   0.000     1.0000         0          0        0       22
   2.000     0.9545    0.0455     0.0444        1       21
   3.000     0.9091    0.0909     0.0613        2       20
   4.000     0.8636    0.1364     0.0732        3       19
   7.000     0.8182    0.1818     0.0822        4       18
  10.000     0.7727    0.2273     0.0893        5       17
  22.000     0.7273    0.2727     0.0950        6       16
  28.000     0.6818    0.3182     0.0993        7       15
  29.000     0.6364    0.3636     0.1026        8       14
  32.000     0.5909    0.4091     0.1048        9       13
  37.000     0.5455    0.4545     0.1062       10       12
  40.000     0.5000    0.5000     0.1066       11       11
  41.000     0.4545    0.5455     0.1062       12       10
  54.000     0.4091    0.5909     0.1048       13        9
  61.000     0.3636    0.6364     0.1026       14        8

6 controls made it past 100 months.
37
treated group

                                Survival
                                Standard   Number   Number
    time   Survival   Failure      Error   Failed     Left
   0.000     1.0000         0          0        0       22
   2.000     0.9545    0.0455     0.0444        1       21
   6.000     0.9091    0.0909     0.0613        2       20
  12.000     0.8636    0.1364     0.0732        3       19
  54.000     0.8182    0.1818     0.0822        4       18
  56.000          .         .          .        4       17
  68.000     0.7701    0.2299     0.0904        5       16
  89.000     0.7219    0.2781     0.0967        6       15
  96.000          .         .          .        7       14
  96.000     0.6257    0.3743     0.1051        8       13
 125.000          .         .          .        8       12
 128.000          .         .          .        8       11
 131.000          .         .          .        8       10
 140.000          .         .          .        8        9

38
Point-wise confidence intervals
We will not worry about the mathematical formula for
confidence bands. The important point is that
there is a confidence interval for each estimate
of S(t). (SAS uses Greenwood's formula.)
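For readers who do want to see it, Greenwood's formula can be sketched in a few lines. The inputs are the risk-set sizes and event counts quoted earlier for the first five event times of the conception data. The plain "estimate ± 1.96 × SE" interval, truncated to [0, 1], is an assumption made for illustration; SAS offers several interval transformations and the lecture does not say which was used.

```python
import math

# (number at risk, number of events) at the first five event times
# of the conception data, as quoted on the earlier slides.
steps = [(38, 6), (32, 5), (26, 3), (22, 3), (18, 2)]


def greenwood(steps):
    """Return [(survival, standard_error)] after each event time.

    Greenwood: Var[S(t)] = S(t)^2 * sum of d / (n * (n - d)).
    """
    survival = 1.0
    variance_sum = 0.0
    out = []
    for at_risk, events in steps:
        survival *= 1.0 - events / at_risk
        variance_sum += events / (at_risk * (at_risk - events))
        out.append((survival, survival * math.sqrt(variance_sum)))
    return out


estimates = greenwood(steps)
for s, se in estimates:
    lower = max(0.0, s - 1.96 * se)   # simple linear interval (assumed)
    upper = min(1.0, s + 1.96 * se)
    print(f"S = {s:.4f}, SE = {se:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The first and last standard errors should match the 0.0592 and 0.0834 shown in the LIFETEST output at months 1 and 6.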
39
Log-rank test
Test of Equality over Strata

Test          Chi-Square    DF    Pr > Chi-Square
Log-Rank          4.6599     1             0.0309
Wilcoxon          6.5435     1             0.0105
-2Log(LR)         5.4096     1             0.0200

Chi-square test (with 1 df) of the (overall)
difference between the two groups. Groups appear
significantly different.
40
Log-rank test
Log-rank test is just a Cochran-Mantel-Haenszel
chi-square test! Anyone remember (know) what
this is?
41
CMH test of conditional independence
K strata = unique event times
Nk = total number at risk in stratum k (one 2×2 table per event time)
43
CMH test of conditional independence
How do you know that this is a chi-square with 1
df?
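The slide's formula did not survive as text, so the statistic is sketched here in a common textbook form. The symbols are assumptions: in stratum k, a_k is the observed number of events in group 1, n_{1k} and n_{0k} are the numbers at risk in the two groups, d_k is the total number of events, and N_k = n_{1k} + n_{0k}.

```latex
\chi^2_{\mathrm{CMH}}
  = \frac{\left[ \sum_{k=1}^{K} \bigl( a_k - E(a_k) \bigr) \right]^2}
         {\sum_{k=1}^{K} \operatorname{Var}(a_k)},
\qquad
E(a_k) = \frac{n_{1k}\, d_k}{N_k},
\qquad
\operatorname{Var}(a_k) = \frac{n_{1k}\, n_{0k}\, d_k\, (N_k - d_k)}{N_k^{2}\,(N_k - 1)}
```

On the 1 df question: the numerator is a single summed deviation that is approximately normal, and the square of one standardized normal variable is a chi-square with 1 degree of freedom.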
44
Event time 1 (2 months), control group
45
Event time 1 (2 months), treated group

46
Stratum 1 = event time 1
Event time 1: 1 died from each group (22 at risk
in each group).
Total at risk: 44
47
Event time 2 (3 months), control group
48
Event time 2 (3 months), treated group

49
Stratum 2 = event time 2
Event time 2: At 3 months, 1 died in the control
group. At that time 21 from each group were at
risk.
Total at risk: 42
50
Event time 3 (4 months), control group
51
Event time 3 (4 months), treated group

52
Stratum 3 = event time 3 (4 months)
Event time 3: At 4 months, 1 died in the control
group. At that time 21 from the treated group and
20 from the control group were at risk.
Total at risk: 41
53
Etc.
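The per-stratum bookkeeping for the three event times above can be sketched as follows. This is an illustration under the usual hypergeometric assumptions for the log-rank test, with made-up function and variable names. Because it uses only the first three strata, its partial statistic is not expected to reproduce the full-data log-rank chi-square of 4.6599.

```python
def stratum_terms(deaths_control, risk_control, deaths_treated, risk_treated):
    """Observed, expected, and variance of control-group deaths in one stratum."""
    total = risk_control + risk_treated
    deaths = deaths_control + deaths_treated
    expected = risk_control * deaths / total
    variance = (risk_control * risk_treated * deaths * (total - deaths)
                / (total ** 2 * (total - 1)))
    return deaths_control, expected, variance


# (control deaths, control at risk, treated deaths, treated at risk)
strata = [
    (1, 22, 1, 22),  # event time 1: 2 months
    (1, 21, 0, 21),  # event time 2: 3 months
    (1, 20, 0, 21),  # event time 3: 4 months
]

terms = [stratum_terms(*s) for s in strata]
observed = sum(t[0] for t in terms)
expected = sum(t[1] for t in terms)
variance = sum(t[2] for t in terms)

# Statistic over these three strata only (illustration, not the full test)
partial_chi_square = (observed - expected) ** 2 / variance
print(f"O = {observed}, E = {expected:.4f}, Var = {variance:.4f}, "
      f"partial chi-square = {partial_chi_square:.4f}")
```

Continuing the same sums over every unique event time in the trial is what gives the full log-rank statistic.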
54
Log-rank test, et al.
Test of Equality over Strata

Test          Chi-Square    DF    Pr > Chi-Square
Log-Rank          4.6599     1             0.0309
Wilcoxon          6.5435     1             0.0105
-2Log(LR)         5.4096     1             0.0200

55
Estimated log(S(t))
Maybe hazard function decreases a little then
increases a little? Hard to say exactly
56
Approximated h(t)
57
One more graph from SAS
log(-log(S(t))) = log(cumulative hazard). If the group
plots are parallel, this indicates that the
proportional hazards assumption is
valid. This is a necessary assumption for the
calculation of hazard ratios.
58
Uses of Kaplan-Meier
  • Commonly used to describe survivorship of study
    population/s.
  • Commonly used to compare two study populations.
  • Intuitive graphical presentation.

59
Limitations of Kaplan-Meier
  • Mainly descriptive
  • Doesn't control for covariates
  • Requires categorical predictors
  • SAS does let you easily discretize continuous
    variables for KM methods, for exploratory
    purposes.
  • Can't accommodate time-dependent variables

60
Parametric Models for the hazard/survival function
  • The class of regression models estimated by PROC
    LIFEREG is known as the accelerated failure time
    models.

61
Parameters of the Weibull distribution
Shape parameter (inverse of the scale parameter):
< 1: hazard rate is decreasing
> 1: hazard rate is increasing
62
Constant hazard rate (special case of Weibull
where shape parameter = 1.0)
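The effect of the shape parameter can be seen numerically with a small sketch. The parameterization h(t) = shape × rate × (rate × t)^(shape − 1) is one common convention and is an assumption here; it is not taken from the slides, and PROC LIFEREG reports its parameters on a different (log-time) scale.

```python
def weibull_hazard(t, shape, rate=1.0):
    """Weibull hazard h(t) = shape * rate * (rate * t) ** (shape - 1), t > 0."""
    return shape * rate * (rate * t) ** (shape - 1)


for shape in (0.5, 1.0, 2.0):
    values = [weibull_hazard(t, shape) for t in (1.0, 2.0, 4.0)]
    print(f"shape = {shape}: h(1), h(2), h(4) =",
          [round(v, 3) for v in values])
```

With shape below 1 the printed hazards fall over time, with shape equal to 1 they stay constant (the exponential case), and with shape above 1 they rise.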
63
Recall two parametric models
  • Components
  • A baseline hazard function (that may change over
    time).
  • A linear function of a set of k fixed covariates
    that when exponentiated (and a few other things)
    gives the relative risk.

64
To get Hazard Ratios (relative risk)
  • Weibull (and thus exponential) are proportional
    hazards models, so hazard ratio can be
    calculated.
  • For other parametric models, you cannot calculate
    hazard ratio (hazards are not necessarily
    proportional over time).

More tricky to get confidence intervals here!
65
What's a hazard ratio?
  • Distinction between hazard/rate ratio and odds
    ratio/risk ratio:
  • Hazard/rate ratio = ratio of incidence rates
  • Odds/risk ratio = ratio of proportions

66
Example 1
  • Using data from pregnancy study
  • Recall roughly, hazard rates were similar over
    time
  • (implies exponential model should be a good fit).

67
The LIFEREG Procedure
Analysis of Parameter Estimates

Parameter      DF  Estimate  Std Error  95% Lower  95% Upper  Chi-Square  Pr > ChiSq
Intercept       1    2.2636     0.2049     1.8621     2.6651      122.08      <.0001
Scale           1    1.0217     0.1638     0.7462     1.3987
Weibull Shape   1    0.9788     0.1569     0.7149     1.3401

A scale of 1.0 makes a Weibull an exponential, so
this looks exponential.
68
Parametric estimates of survival function based
on a Weibull model (left) and exponential (right).
69
Example 2 2 groups
  • Using data from hepatitis trial, I fit
    exponential and Weibull models in SAS using
    LIFEREG (Weibull is default in LIFEREG)

70
The LIFEREG Procedure
Dependent Variable          Log(time)
Right Censored Values       17
Left Censored Values        0
Interval Censored Values    0
Name of Distribution        Exponential
Log Likelihood              -68.03461345

Analysis of Parameter Estimates

Parameter      DF  Estimate  Std Error  95% Lower  95% Upper  Chi-Square  Pr > ChiSq
Intercept       1    4.4886     0.2500     3.9986     4.9786      322.37      <.0001
group           1    0.9008     0.3917     0.1332     1.6685        5.29      0.0214
Scale           0    1.0000     0.0000     1.0000     1.0000
Weibull Shape   0    1.0000     0.0000     1.0000     1.0000

Hazard ratio (treated vs. control) = e^(-0.9008) = 0.406
Interpretation: median time to death was
increased about 2.5-fold (e^0.9008 = 2.46) in the
treated group or, equivalently, the mortality rate
is about 60% lower in the treated group.
71
Model Information
Dependent Variable          Log(time)
Right Censored Values       17
Left Censored Values        0
Interval Censored Values    0
Name of Distribution        Weibull
Log Likelihood              -66.94904552

Analysis of Parameter Estimates

Parameter      DF  Estimate  Std Error  95% Lower  95% Upper  Chi-Square  Pr > ChiSq
Intercept       1    4.4811     0.3169     3.8601     5.1022      200.00      <.0001
group           1    1.0544     0.5096     0.0556     2.0533        4.28      0.0385
Scale           1    1.2673     0.2139     0.9103     1.7643
Weibull Shape   1    0.7891     0.1332     0.5668     1.0985

Comparison of models using the likelihood ratio
test: [-2 LogLikelihood(simpler model)] - [-2 LogLikelihood(more
complex)] ~ chi-square with 1 df (1 extra
parameter estimated for the Weibull model). 136 - 134
= 2; NS. No evidence that the Weibull model is much
better than the exponential.
Hazard ratio (treated vs. control) = e^(-1.05/1.267)
= 0.43
Shape parameter is just 1/scale parameter!
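The hazard ratios and the likelihood ratio comparison can be re-derived from the numbers printed in the two LIFEREG outputs. This is a minimal sketch with illustrative variable names. The p-value uses the identity P(X > x) = erfc(sqrt(x/2)) for a chi-square variable with 1 df, which avoids needing a statistics library.

```python
import math

# Estimates copied from the two LIFEREG outputs (hepatitis trial)
exp_group, exp_loglik = 0.9008, -68.03461345
wei_group, wei_scale, wei_loglik = 1.0544, 1.2673, -66.94904552

# Exponential model: scale is fixed at 1, so HR = exp(-coefficient)
hr_exponential = math.exp(-exp_group)

# Weibull model: divide the log-time coefficient by the scale first
hr_weibull = math.exp(-wei_group / wei_scale)

# Shape is the reciprocal of scale
weibull_shape = 1.0 / wei_scale

# Likelihood ratio test with 1 df (the Weibull's extra scale parameter)
lr_statistic = 2.0 * (wei_loglik - exp_loglik)
p_value = math.erfc(math.sqrt(lr_statistic / 2.0))

print(f"HR (exponential) = {hr_exponential:.3f}")
print(f"HR (Weibull)     = {hr_weibull:.3f}")
print(f"Weibull shape    = {weibull_shape:.4f}")
print(f"LR chi-square    = {lr_statistic:.3f}, p = {p_value:.3f}")
```

The unrounded likelihood ratio statistic is a little above the slide's "136 - 134 = 2", and the conclusion (not significant at the 0.05 level) is unchanged.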
72
Parametric estimates of cumulative survival based
on Weibull model (left) and exponential (right),
by group.
73
Compare to Cox regression
          Parameter  Standard                           Hazard  95% Hazard Ratio
Variable  DF  Estimate     Error  Chi-Square  Pr > ChiSq   Ratio  Confidence Limits
group      1  -0.83230   0.39739      4.3865      0.0362   0.435    0.200    0.948