Title: ISCB2007DeLorenzoAntoliniValsecchi
1Evaluation of alternative prognostic
stratifications by prediction accuracy measures
on individual survival
Paola De Lorenzo, Laura Antolini and Maria Grazia Valsecchi
Department of Clinical Medicine and Prevention, University of Milano-Bicocca, Italy
paola.delorenzo_at_unimib.it
28th ISCB Conference, Alexandroupolis, July 29 - August 2, 2007
2Overview
- Introduction
- Motivating example: prognostic models in acute lymphoblastic leukaemia (ALL) in infancy
- Measures of discrimination/accuracy
- Results
- References
3Introduction
- In clinical research, interest commonly lies in identifying groups of patients with different prognoses, in order to tailor future treatment interventions.
- Identification of factors that explain heterogeneity in outcome.
- Few groups are desirable, e.g. Low Risk (LR), Intermediate Risk (IR) and High Risk (HR).
Problem: the availability of many candidate factors may give rise to alternative stratifications with similar prognostic discrimination ability.
Comparison of stratifications
4Motivating example
- International clinical trial on infant ALL, 374 patients
- aims at classifying patients into risk groups, defined by presenting features and early response to PDN treatment.
- preliminary analysis led to 2 alternative stratifications
Stratification 1: LR: no genetic lesion; IR: otherwise; HR: genetic lesion, age < 6 m, WBC 300K
Stratification 2: LR: no genetic lesion; IR: otherwise; HR: genetic lesion, age < 6 m, PDN response
5Motivating example
Figure: K-M EFS estimates for the LR, IR and HR groups
6Motivating example
Patients classification
Stratifications are concordant for 320 pts. (86%), discordant for 30 + 24 = 54 pts. (14%)
7Measures
Stratification 1 and Stratification 2 may be compared by
- Measures of discrimination
  - based on agreement between the ranking of predicted times and of individual observed times (Harrell's C)
  - based on the hazard ratio of prognostic groups (SEP, D)
- Measure of inaccuracy of individual prediction
  - based on comparisons between observed and predicted survival (Brier Score)
8Notation
For the i-th individual, we observe
$T_i$ = time-to-event, possibly censored
$\delta_i$ = event status indicator ($\delta_i = 1$ if $T_i$ is a failure time)
$x_i$ = fixed covariates
- Stratification rules produce risk strata based on $x_i$. Let
  - $g_i = j$ if the i-th individual is assigned to stratum j
  - $\hat{S}_j(t)$ be the estimated survival in stratum j
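9Harrell's C -definition-
Aim: to evaluate agreement between predicted and observed times
- Given a pair of subjects (i, l) such that $T_i < T_l$, C is the probability that the predicted time of subject i is also shorter than that of subject l.
- Assuming separation between predicted survival curves (one-to-one correspondence between predicted times and predicted survival probabilities), $C = P(\hat{S}(t \mid x_i) < \hat{S}(t \mid x_l) \mid T_i < T_l)$, where $\hat{S}(t \mid x_i)$ is the predicted survival of subject i, which may be estimated by the proportion of concordant pairs among the usable pairs (those in which the shorter observed time is a failure).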
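The pairwise estimate described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `harrell_c`, the coding of risk scores (higher score = worse predicted survival, e.g. LR=0, IR=1, HR=2), the half-credit for tied predictions and the exclusion of pairs with tied observed times are all assumptions made here.

```python
from itertools import combinations

def harrell_c(times, events, risk):
    """Pairwise estimate of Harrell's C for right-censored data.

    times  : observed times (failure or censoring)
    events : 1 if the observed time is a failure, 0 if censored
    risk   : predicted risk score, higher = shorter predicted survival
             (assumed coding for strata: LR=0, IR=1, HR=2)
    """
    concordant = 0.0
    usable = 0
    for i, l in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter observed time.
        a, b = (i, l) if times[i] <= times[l] else (l, i)
        # Usable pair: distinct times and the shorter one is a failure.
        if times[a] == times[b] or events[a] != 1:
            continue
        usable += 1
        if risk[a] > risk[b]:
            concordant += 1.0      # prediction agrees with observed order
        elif risk[a] == risk[b]:
            concordant += 0.5      # tied prediction (same stratum)
    return concordant / usable

# Toy data: two HR, one IR, one LR subject; the third time is censored.
c = harrell_c([2, 5, 7, 9], [1, 1, 0, 1], [2, 2, 1, 0])
```

With only a few strata, many pairs fall in the same stratum and receive half credit, which is why C says little about subsets in which every patient shares the same predicted risk.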
10Harrell's C -results-
Stratification2 PDN
Stratification1 WBC
0.676
0.679
95% CI 0.624 - 0.735
95% CI 0.618 - 0.734
- equivalent discrimination ability
- not informative on performance in the 2 relevant subsets (the 30 and the 24 discordantly classified pts.), because of ties
11Brier Score -definition-
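Aim: to evaluate the prediction error at the individual level (say, the i-th subject in the j-th stratum)
At time t, compare the observed status $I(T_i > t)$ with the predicted survival for stratum j, $\hat{S}_j(t)$
With a quadratic loss function, the prediction error is
$(I(T_i > t) - \hat{S}_j(t))^2$
expected Brier Score
$BS(t) = E[(I(T > t) - \hat{S}_g(t))^2]$, with g the stratum of a random subject
which can be estimated by (no censoring)
$\widehat{BS}(t) = \frac{1}{n} \sum_{i=1}^{n} (I(T_i > t) - \hat{S}_{g_i}(t))^2$, where $g_i$ is the stratum assigned to subject i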
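As a small illustration of the uncensored estimate, the sketch below averages the squared difference between the observed status at t and the stratum-specific predicted survival. The function name `brier_uncensored` and the toy survival values are hypothetical and not taken from the trial data.

```python
def brier_uncensored(times, strata, surv_at_t, t):
    """Empirical Brier score at time t when no observation is censored.

    times     : observed failure times
    strata    : stratum label assigned to each subject
    surv_at_t : dict mapping stratum label -> predicted survival at t
    """
    total = 0.0
    for T, j in zip(times, strata):
        observed = 1.0 if T > t else 0.0        # observed status at t
        total += (observed - surv_at_t[j]) ** 2  # quadratic loss
    return total / len(times)

# Toy example (made-up numbers): prediction at t = 12 months.
bs = brier_uncensored(
    times=[5, 20, 30, 8],
    strata=["HR", "LR", "LR", "HR"],
    surv_at_t={"LR": 0.9, "HR": 0.3},
    t=12,
)
```

A score of 0 means perfect prediction; predicting 0.5 for everyone gives 0.25 regardless of outcome.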
12Brier Score -definition-
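Estimation with censoring
$BS^c(t) = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{\hat{S}_{g_i}(t)^2 \, I(T_i \le t, \delta_i = 1)}{\hat{G}(T_i)} + \frac{(1 - \hat{S}_{g_i}(t))^2 \, I(T_i > t)}{\hat{G}(t)} \right]$
where $\hat{G}(u)$ = estimated probability of being free from censoring at time u, and $g_i$ is the stratum assigned to subject i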
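The weighting by the probability of being free from censoring can be sketched as follows. This is a hedged illustration in the spirit of Graf et al. (1999): the helper names `censoring_km` and `brier_ipcw` are hypothetical, the censoring distribution is estimated by a plain Kaplan-Meier on the censoring times, and ties between failure and censoring times are handled in the simplest possible way, which is an assumption.

```python
def censoring_km(times, events):
    """Kaplan-Meier estimate of G(u) = P(free from censoring at u).

    Censored observations (events == 0) play the role of 'events' here.
    Simplification: every subject with T >= u counts as at risk at u.
    """
    cens_times = sorted({T for T, d in zip(times, events) if d == 0})
    steps = []
    g = 1.0
    for u in cens_times:
        at_risk = sum(1 for T in times if T >= u)
        n_cens = sum(1 for T, d in zip(times, events) if T == u and d == 0)
        g *= 1.0 - n_cens / at_risk
        steps.append((u, g))

    def G(u):
        value = 1.0
        for s, g_s in steps:
            if s <= u:
                value = g_s
        return value

    return G

def brier_ipcw(times, events, strata, surv_at_t, t):
    """Censoring-weighted Brier score at time t."""
    G = censoring_km(times, events)
    total = 0.0
    for T, d, j in zip(times, events, strata):
        s = surv_at_t[j]
        if T <= t and d == 1:
            total += s ** 2 / G(T)            # failed by t
        elif T > t:
            total += (1.0 - s) ** 2 / G(t)    # still event-free at t
        # censored before t: contributes 0; the weights compensate
    return total / len(times)

# Toy example (made-up numbers), one subject censored at 10 months.
bs_c = brier_ipcw([5, 10, 20, 30], [1, 0, 1, 1],
                  ["HR", "HR", "LR", "LR"], {"LR": 0.9, "HR": 0.3}, t=12)
```

When nobody is censored, all weights equal 1 and the score reduces to the uncensored estimate. The sketch does not guard against a weight of zero, which can occur if t lies beyond the last censored observation.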
Explained residual variation
$R^2(t) = 1 - BS^c(t) / BS^c_0(t)$
where $BS^c_0(t)$ is calculated for the overall K-M estimate (no stratification)
95% Bootstrap CI (B = 1000 samples with replacement)
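A minimal sketch of the two steps above: the explained residual variation as one minus a ratio of Brier scores, and a bootstrap interval obtained by resampling subjects with replacement. The slide does not say which bootstrap interval was used; the percentile method below, the function names and the fixed seed are assumptions.

```python
import random

def explained_variation(bs_model, bs_null):
    """R^2(t) = 1 - BS(t) of the stratification / BS(t) with no stratification."""
    return 1.0 - bs_model / bs_null

def bootstrap_ci(data, statistic, B=1000, level=0.95, seed=0):
    """Percentile bootstrap CI for statistic(data).

    data      : list of per-subject records (resampled with replacement)
    statistic : function mapping a list of records to a number
    """
    rng = random.Random(seed)
    n = len(data)
    stats = []
    for _ in range(B):
        sample = [data[rng.randrange(n)] for _ in range(n)]
        stats.append(statistic(sample))
    stats.sort()
    alpha = (1.0 - level) / 2.0
    lower = stats[int(alpha * B)]
    upper = stats[min(B - 1, int((1.0 - alpha) * B))]
    return lower, upper

# Usage with a placeholder statistic (the mean); in practice the statistic
# would recompute BS^c(t) or R^2(t) on each resampled set of subjects.
lo, hi = bootstrap_ci(list(range(1, 21)), lambda s: sum(s) / len(s))
```

Resampling whole subject records, rather than individual loss contributions, lets the K-M estimates and censoring weights be recomputed within each bootstrap sample.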
13Brier Score -results-
$BS^c(t)$ and 95% CI at relevant time-points
Compared: No Stratification (overall EFS), Stratification1 WBC, Stratification2 PDN
14Brier Score -results-
$R^2$ and 95% CI at relevant time-points
Stratification1 WBC
Stratification2 PDN
explained variation is higher in PDN than WBC
15Brier Score -partition-
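$BS^c(t) = \frac{1}{374} [\, 320\, BS^c_{conc}(t) + 30\, BS^c_{d1}(t) + 24\, BS^c_{d2}(t) \,]$
where conc denotes the 320 concordantly classified pts., and d1, d2 the two discordant subsets of 30 and 24 pts.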
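The partition holds because the overall score is an average of per-subject contributions, so it equals the size-weighted average of the subset scores. A small sketch with made-up contributions (the function name and subset labels are hypothetical):

```python
def partition_brier(contributions, subsets):
    """Split a pooled Brier score into subset-specific scores.

    contributions : per-subject (weighted) squared errors at time t
    subsets       : subset label of each subject (e.g. concordant,
                    discordant subset 1, discordant subset 2)
    Returns the pooled score, {label: (size, subset score)} and the
    size-weighted recombination of the subset scores.
    """
    n = len(contributions)
    pooled = sum(contributions) / n
    grouped = {}
    for c, s in zip(contributions, subsets):
        grouped.setdefault(s, []).append(c)
    by_subset = {s: (len(v), sum(v) / len(v)) for s, v in grouped.items()}
    recomposed = sum(size * score for size, score in by_subset.values()) / n
    return pooled, by_subset, recomposed

# Toy example: 4 concordant subjects and 1 subject in each discordant subset.
pooled, parts, recomposed = partition_brier(
    [0.09, 0.01, 0.01, 0.09, 0.25, 0.04],
    ["conc", "conc", "conc", "conc", "d1", "d2"],
)
```

For the identity to hold exactly with censoring, the per-subject contributions in each subset must use the same censoring weights as the pooled score.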
16Brier Score -partition-
Prediction at time t = 12 months
in both discordant subsets, prediction seems to be more accurate when the assigned stratum is HR
17Brier Score -partition-
Figure: K-M EFS estimates in subgroups. Panels: IR with PDN, IR with WBC, HR with PDN, HR with WBC
18Brier Score -conclusions-
HR NEW = HR by WBC and/or HR by PDN
Stratification NEW (groups: LR, IR, HR)
Comparison by $BS^c(t)$
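The combined rule can be written as a small function of the two original assignments. Only the HR rule is stated on the slide; keeping LR when both stratifications say LR (they share the same LR definition) and assigning everything else to IR is an assumption, as is the function name.

```python
def new_stratum(by_wbc, by_pdn):
    """Combine the two stratifications into Stratification NEW.

    by_wbc, by_pdn : "LR", "IR" or "HR" under Stratification 1 (WBC)
                     and Stratification 2 (PDN) respectively
    """
    if by_wbc == "HR" or by_pdn == "HR":
        return "HR"   # HR NEW = HR by WBC and/or HR by PDN
    if by_wbc == "LR" and by_pdn == "LR":
        return "LR"   # assumed: LR is defined identically in both
    return "IR"       # assumed: otherwise

# A patient who is IR by WBC but HR by PDN moves to HR in Stratification NEW.
label = new_stratum("IR", "HR")
```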
19References
- Harrell's C
  - Harrell FE Jr, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statistics in Medicine, 1996; 15(4):361-387.
  - Antolini L, Boracchi P, Biganzoli E. A time-dependent discrimination index for survival data. Statistics in Medicine, 2005; 24:3927-3944.
- SEP, D
  - Royston P, Sauerbrei W. A new measure of prognostic separation in survival data. Statistics in Medicine, 2004; 23:723-748.
- Brier Score
  - Graf E, Schmoor C, Sauerbrei W, Schumacher M. Assessment and comparison of prognostic classification schemes for survival data. Statistics in Medicine, 1999; 18:2529-2545.
21D -definition-
Stratification rules induce a risk ordering among individuals, based on a suitable prognostic index PI (e.g. the log hazard ratio, $PI_i = x_i'\hat{\beta}$). The discrimination predicted by PI could be evaluated by quantifying the variation among the $PI_i$.
$\mathrm{var}(PI)$ is a natural but unsatisfactory choice:
in a validation setting, with a new sample, the $PI_i$ do not depend on the outcome data
=> use only the risk ordering
22D -definition-
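- assume $PI_i \sim N(\mu, \sigma^2)$
- express $PI_i$ in terms of standard gaussian ordered rankits $u_i$, $PI_i \approx \mu + \sigma u_i$
- fit $h(t \mid u_i) = h_0(t) \exp(\sigma^* u_i)$
- $\hat{\sigma}^*$: estimate of the standard error of PI
After convenient re-scaling with $\kappa = \sqrt{8/\pi}$ obtain $D = \kappa \hat{\sigma}^*$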
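The steps above can be sketched without external libraries. This is an illustrative approximation in the spirit of Royston and Sauerbrei (2004), not their reference implementation. Rankits use Blom's approximation and are averaged over tied PI values (as happens with strata). The Cox slope is obtained by a bare Newton-Raphson on the partial likelihood with Breslow-type handling of ties. All function names are hypothetical.

```python
import math
from statistics import NormalDist

KAPPA = math.sqrt(8.0 / math.pi)  # re-scaling constant

def rankits(pi):
    """Approximate expected standard normal order statistics of PI
    (Blom's formula), averaged within groups of tied PI values."""
    n = len(pi)
    order = sorted(range(n), key=lambda i: pi[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = NormalDist().inv_cdf((rank - 0.375) / (n + 0.25))
    tied = {}
    for i, value in enumerate(pi):
        tied.setdefault(value, []).append(i)
    for members in tied.values():
        mean_u = sum(u[i] for i in members) / len(members)
        for i in members:
            u[i] = mean_u
    return u

def cox_slope(times, events, z, iters=25):
    """One-covariate Cox regression slope by Newton-Raphson.
    No safeguards: assumes the data are not perfectly separated."""
    n = len(times)
    beta = 0.0
    for _ in range(iters):
        score, info = 0.0, 0.0
        for i in range(n):
            if events[i] != 1:
                continue
            # Weights of subjects still at risk at the i-th failure time.
            w = [math.exp(beta * z[l]) if times[l] >= times[i] else 0.0
                 for l in range(n)]
            s0 = sum(w)
            s1 = sum(wl * zl for wl, zl in zip(w, z))
            s2 = sum(wl * zl * zl for wl, zl in zip(w, z))
            score += z[i] - s1 / s0
            info += s2 / s0 - (s1 / s0) ** 2
        if info <= 0.0:
            break
        step = score / info
        beta += step
        if abs(step) < 1e-10:
            break
    return beta

def d_statistic(times, events, pi):
    """D = kappa * slope of the Cox model fitted on the rankits of PI."""
    return KAPPA * cox_slope(times, events, rankits(pi))

# Toy data (made up): PI coded 0 = LR, 1 = IR, 2 = HR.
d = d_statistic([2, 3, 5, 7, 11, 13, 4, 6],
                [1, 1, 1, 0, 1, 1, 1, 1],
                [2, 2, 1, 1, 0, 0, 0, 2])
```

Because only the ranks of PI enter the calculation, any monotone recoding of the strata gives the same D.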
23D -results-
Stratification2 PDN
Stratification1 WBC
D = 1.030, p < 0.001
D = 0.937, p < 0.001
- equivalent discrimination ability
- not informative on performance in the 2 relevant subsets
Stratification NEW
D = 1.114, p < 0.001
24Brier Score -partition-
Partition of $BS^c(t)$ at relevant time-points
Components: $BS^c(t)$ overall, in the concordant pts., and in each of the two discordant subsets
in both discordant subsets, prediction is more accurate when the assigned stratum is HR
25Motivating example
Figure panels: Stratification2 PDN, Stratification1 WBC