Title: Meta-analysis of diagnostic test studies using individual patient data and aggregate data
1 Meta-analysis of diagnostic test studies using individual patient data and aggregate data
Richard D Riley, Susanna R Dodd, Jean Craig, John R Thompson, Paula R Williamson
Centre for Medical Statistics and Health Evaluation, University of Liverpool, UK. E-mail: rdriley_at_liv.ac.uk
- Background
- Bivariate meta-analysis
- Patient-level covariates
- Aggregate data (AD)
- Individual patient data (IPD)
- IPD and AD
3 Motivating Example: Fever in children
- How do we know when a child has an abnormally
high temperature (fever)?
- NHS guidelines state that
- Fever in children is defined as a rectal temperature ≥ 38 °C
- Rectal measurements are clearly not ideal
- Less-invasive alternatives are preferable, especially for non-infants
- Q: What is the diagnostic accuracy of ear temperature measurements compared to the rectal reference standard?
7 Meta-Analysis Dataset of Craig et al (2002)
- 23 studies identified
- Each assesses the accuracy of ear thermometers for diagnosing fever
- Reference standard was a rectal thermometer
- Most studies use 38 °C to define fever, as per guidelines
- Want to summarise diagnostic accuracy across studies
- However ...
- Age of children varies across studies (0 to 18
years)
- Different rectal and ear measuring devices used
9 Meta-Analysis Dataset of Craig et al (2002)
- 23 studies identified
- 12 studies provide aggregate data (AD), i.e. a 2 by 2 table of diagnostic accuracy
- 11 studies give individual patient data (IPD) with age, i.e. test response, true fever status and age for each patient
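To make the AD versus IPD distinction concrete, the following minimal sketch collapses patient-level records into the 2 by 2 table that an AD study would report. The records and the function name `to_two_by_two` are hypothetical, invented purely for illustration; they are not taken from Craig et al.

```python
from collections import Counter

# Hypothetical IPD for one study: (true fever status, ear test positive).
# These values are made up for illustration only.
ipd = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1), (1, 1), (0, 0)]


def to_two_by_two(records):
    """Collapse patient-level records into 2 by 2 table counts."""
    counts = Counter(records)
    return {
        "TP": counts[(1, 1)],  # fever present, test positive
        "FN": counts[(1, 0)],  # fever present, test negative
        "FP": counts[(0, 1)],  # fever absent, test positive
        "TN": counts[(0, 0)],  # fever absent, test negative
    }


table = to_two_by_two(ipd)
sensitivity = table["TP"] / (table["TP"] + table["FN"])
specificity = table["TN"] / (table["TN"] + table["FP"])
print(table, sensitivity, specificity)
```

Going from IPD to AD loses the patient-level covariate (here, age), which is why the two study types need different handling later on.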
10 Meta-Analysis Dataset of Craig et al (2002)
- 23 studies identified
- Methodological question: how can we meta-analyse this dataset to
- (i) summarise diagnostic accuracy across studies
- (ii) assess the effect of measurement device
- (iii) examine whether and how age modifies diagnostic accuracy
- (iv) utilise IPD and AD appropriately?
11 Utilise Bivariate Meta-Analysis framework
- Bivariate model currently proposed within an AD framework
- One diagnostic accuracy table per study is modelled
- Reitsma et al (2005); Chu & Cole (2006); Harbord et al (2007)
12-18 Bivariate model (equations not captured in the transcript)
- Within-studies
- Between-studies: mean logit-sensitivity and mean logit-specificity across studies
- Study-level covariates can also be introduced here, e.g. measurement device, to explain heterogeneity
- Between-study heterogeneity
- Between-study correlation
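The equations on these slides did not survive transcription. As a hedged reminder of what the within-study and between-study labels usually refer to, here is the standard bivariate random-effects model with binomial within-study likelihoods, in the form associated with Chu & Cole (2006). The notation is an assumption chosen for this sketch and may differ from the original slides: $r_{1i}$ of $n_{1i}$ diseased patients and $r_{0i}$ of $n_{0i}$ non-diseased patients are correctly classified in study $i$.

```latex
% Within-studies (AD model): binomial counts from each 2 by 2 table
r_{1i} \sim \mathrm{Binomial}(n_{1i},\, p_{1i}), \qquad
r_{0i} \sim \mathrm{Binomial}(n_{0i},\, p_{0i})

% Between-studies: logit-sensitivity and logit-specificity vary jointly
\begin{pmatrix} \mathrm{logit}(p_{1i}) \\ \mathrm{logit}(p_{0i}) \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} \beta_1 \\ \beta_0 \end{pmatrix},\;
\begin{pmatrix} \tau_1^2 & \rho\,\tau_1\tau_0 \\ \rho\,\tau_1\tau_0 & \tau_0^2 \end{pmatrix}
\right)
```

Here $\beta_1$ and $\beta_0$ are the mean logit-sensitivity and mean logit-specificity across studies, $\tau_1^2$ and $\tau_0^2$ are the between-study variances (heterogeneity), and $\rho$ is the between-study correlation. A study-level covariate such as measurement device would enter by replacing $\beta_1$ and $\beta_0$ with linear predictors.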
19 Application to all 23 temperature data studies
- Maximum likelihood estimation using Stata gives
- A summary specificity of 0.96 (95% CI 0.93 to 0.98)
- A summary sensitivity of 0.71 (95% CI 0.60 to 0.82)
- Between-study correlation of -0.63, which emphasises the non-independence of sensitivity and specificity across studies
- Between-study variances of 1.23 and 1.47
- Summary sensitivity too low to recommend ear
temperature for diagnosing fever in children
- BUT! Large heterogeneity limits this conclusion
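To give a feel for how large this heterogeneity is, the sketch below converts a between-study variance into a rough range of study-specific sensitivities around the summary value of 0.71. It makes several assumptions: the variances are taken to be on the logit scale, the transcript does not say which of 1.23 and 1.47 belongs to sensitivity (so both are tried), and uncertainty in the summary estimate itself is ignored. It is an illustration, not the authors' calculation.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def approx_study_range(summary_prob, tau2, z=1.96):
    """Rough central 95% range of study-specific values: expit(mu +/- z * tau)."""
    mu = logit(summary_prob)
    tau = math.sqrt(tau2)
    return expit(mu - z * tau), expit(mu + z * tau)


summary_sensitivity = 0.71
ranges = {}
for tau2 in (1.23, 1.47):  # which variance applies to sensitivity is an assumption
    ranges[tau2] = approx_study_range(summary_sensitivity, tau2)
    low, high = ranges[tau2]
    print(f"tau^2 = {tau2}: study sensitivities roughly {low:.2f} to {high:.2f}")
```

Under either variance the implied range spans most of the probability scale, which is one way to see why the summary sensitivity alone is hard to interpret.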
21 Assessing measurement device
- Want to explain between-study heterogeneity if possible
- e.g. Is diagnostic accuracy affected by the ear and rectal temperature measurement device (mercury, electronic)?
- 8 different pairs of devices across the 23 studies
- Fit the bivariate model again, including a covariate for device pair
- Between-study variance estimates reduced to 0.49 and 0.54
- Measurement device explains about 60% of the unexplained between-study heterogeneity
- But large uncertainty for each device pair
- Plus concern of confounding across studies?
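The "about 60%" figure can be checked from the reported variances as the proportional reduction in between-study variance. The pairing of before and after values (1.23 with 0.49, 1.47 with 0.54) follows the order in which they are listed and is an assumption; the function name is hypothetical.

```python
def proportion_explained(tau2_before, tau2_after):
    """Proportional reduction in between-study variance after adding a covariate."""
    return (tau2_before - tau2_after) / tau2_before


# Variances before and after adjusting for device pair, in the order reported.
reductions = [
    proportion_explained(1.23, 0.49),
    proportion_explained(1.47, 0.54),
]
for r in reductions:
    print(f"{100 * r:.0f}% of between-study variance explained")
```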
24 The issue of patient-level covariates
- Diagnostic accuracy may depend on patient characteristics such as age, sex, smoking status, and BMI
- Can we produce diagnostic accuracy results tailored to the individual patient?
- e.g. Perhaps ear thermometers perform better for non-infants than for infants?
- In such situations the previous bivariate models using the AD framework are theoretically wrong
- Underlying sensitivity and specificity are no longer fixed across patients in the same study
26 Previously we modelled AD from the 2 by 2 tables; this collapses everything down to the study level
- Within-studies (AD model)
- Need to now model patient responses (y = 0 or 1) directly
- Enables us to model at the patient level
Patient  Study  Disease?  Correct test result (y)
1        1      0         1
2        1      1         1
3        1      1         0
4        1      0         1
etc
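A small check helps show what moving from the 2 by 2 table to patient responses does and does not change. Without covariates, the Bernoulli log-likelihood of the individual y values differs from the binomial log-likelihood of the collapsed count only by a constant that does not involve the accuracy parameter, so both give the same estimate. The y values below are hypothetical.

```python
import math


def bernoulli_loglik(ys, p):
    """Log-likelihood of patient-level responses y (1 = correct test result)."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p) for y in ys)


def binomial_loglik(successes, n, p):
    """Log-likelihood of the collapsed count from the 2 by 2 table."""
    return (
        math.log(math.comb(n, successes))
        + successes * math.log(p)
        + (n - successes) * math.log(1.0 - p)
    )


# Hypothetical responses for the diseased patients in one study.
y_diseased = [1, 0, 1, 1, 1, 0, 1, 1]
r, n = sum(y_diseased), len(y_diseased)

differences = [
    binomial_loglik(r, n, p) - bernoulli_loglik(y_diseased, p)
    for p in (0.5, 0.75, 0.9)
]
print(differences)  # the same constant, log C(n, r), for every p
```

The patient-level form only starts to matter once the probability is allowed to differ between patients in the same study, for example through a covariate such as age.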
- Within-studies (IPD model)
- allows each patient to have their own diagnostic
accuracy
- Include patient-level covariates if desired
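The IPD model equation itself is missing from the transcript. One way to write a patient-level model of this kind is sketched below; it is a plausible form under assumed notation, not necessarily the authors' exact specification. Here $y_{ik}$ is 1 if the test result is correct for patient $k$ in study $i$, $d_{ik}$ is the true disease status, and $x_{ik}$ is a patient-level covariate such as age.

```latex
% Within-studies (IPD model): one Bernoulli response per patient
y_{ik} \sim \mathrm{Bernoulli}(p_{ik})

% Diseased patients contribute to sensitivity, non-diseased to specificity
\mathrm{logit}(p_{ik}) =
  d_{ik}\,\bigl(\beta_1 + u_{1i} + \gamma_1 x_{ik}\bigr)
  + (1 - d_{ik})\,\bigl(\beta_0 + u_{0i} + \gamma_0 x_{ik}\bigr)

% Between-studies: correlated random effects, as in the AD model
\begin{pmatrix} u_{1i} \\ u_{0i} \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},\;
\begin{pmatrix} \tau_1^2 & \rho\,\tau_1\tau_0 \\ \rho\,\tau_1\tau_0 & \tau_0^2 \end{pmatrix}
\right)
```

With $\gamma_1 = \gamma_0 = 0$ every patient in a study shares the same sensitivity and specificity, and the model reduces to the bivariate AD model; non-zero $\gamma$ terms let accuracy vary from patient to patient.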
33 To estimate the effect of patient-level covariates we usually require IPD with patient-level covariates
- Allows us to model within-study and across-study effects
Patient  Study  Disease?  Correct test result (y)  Age
1        1      0         1                        7
2        1      1         1                        13
3        1      1         0                        9
4        1      0         1                        17
etc
34 Within-study effect
- Effect of individual covariates on diagnostic accuracy
- Results tailored to the individual patient
- e.g. the diagnostic accuracy for infants compared to non-infants, or males compared to females, is ...
- Within each study, include the covariate centred about its mean
- Across-study effect
- How mean sensitivity and mean specificity in a study are associated with the mean of the patient-level covariate
- Between-studies, include the covariate mean
- Results relate to the study level (population)
- e.g. In a population with a proportion of 70% males, the underlying mean diagnostic accuracy will be ...
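The separation of within-study and across-study effects rests on splitting each patient's covariate into two parts: its deviation from the study mean and the study mean itself. A minimal sketch of that split follows. Study 1 uses the four ages from the example table; study 2 and the function name are hypothetical.

```python
from collections import defaultdict

# (study, age in years). Study 1 ages come from the example table above;
# study 2 is invented for illustration.
ipd_ages = [(1, 7), (1, 13), (1, 9), (1, 17), (2, 1), (2, 2), (2, 3)]


def split_within_across(rows):
    """Return (study, age centred on its study mean, study mean age) per patient."""
    by_study = defaultdict(list)
    for study, age in rows:
        by_study[study].append(age)
    means = {s: sum(ages) / len(ages) for s, ages in by_study.items()}
    return [(s, age - means[s], means[s]) for s, age in rows]


decomposed = split_within_across(ipd_ages)
for study, centred, mean in decomposed:
    print(study, centred, mean)
```

The centred value would carry the within-study effect and the study mean would carry the across-study effect, so the two can be estimated separately rather than being forced to share a single coefficient.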
36 Within versus across-study effect estimates
- Within-study effects are meaningful to the individual patient
- But not obtainable if IPD including the covariate are not available
- Across-study effects are meaningful at the population level
- Available when the mean covariate is available for each study
- Simulation studies show that in ideal conditions across-study effects will reflect within-study effects (unbiased)
- But across-study effects are prone to confounding across studies (e.g. measurement device) and ecological bias
- Interpret with caution!
42 Application to temperature data
- 11 of the 23 studies provide IPD with age
- How does being an infant modify sensitivity?
- Within-study effect (logit scale): 0.10 (S.E. 0.18)
- If non-infants have a summary sensitivity of 70%, then infants have a summary sensitivity of 72% (non-significant)
- Across-study effect (logit scale): -3.81 (S.E. 1.32); with no IPD, this is the only estimate available
- If non-infant studies have an underlying sensitivity of 70%, then infant studies have an underlying sensitivity of just 5% (significant)
- Very different conclusions!
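The 72% and 5% figures follow from adding each effect estimate to the logit of the 70% baseline and transforming back. The sketch below reproduces that arithmetic, together with a simple Wald ratio (estimate divided by standard error) as a rough guide to significance. The function names are hypothetical and the Wald check is an assumption about how significance was judged.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def shifted_probability(baseline, effect):
    """Apply a logit-scale effect to a baseline probability."""
    return expit(logit(baseline) + effect)


def wald_z(estimate, standard_error):
    return estimate / standard_error


within = shifted_probability(0.70, 0.10)   # within-study effect
across = shifted_probability(0.70, -3.81)  # across-study effect
print(f"within-study: {within:.2f}, z = {wald_z(0.10, 0.18):.2f}")
print(f"across-study: {across:.2f}, z = {wald_z(-3.81, 1.32):.2f}")
```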
45 Combining IPD and aggregate data
- Sometimes a mixture of IPD and AD studies is obtained
- e.g. 12 temperature studies did not provide IPD with age
- Want to utilise all the evidence
- Simultaneously fit
- (1) IPD studies: model including all covariates
- (2) AD studies: model including all but patient-level covariates; need to include random effects to account for the unknown patient-level covariate
- Models (1) and (2) are linked by common parameters
- Estimation: use
- (i) dummy variables (frequentist approach)
- or (ii) simultaneous models (Bayesian approach)
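One plausible reading of the dummy-variable idea is to stack all studies into a single patient-level dataset: AD studies are expanded back into one row per patient from their 2 by 2 tables, and an indicator switches the patient-level covariate term off for those rows. The sketch below builds such a dataset. It is an illustration under stated assumptions, not the authors' implementation: the column names, the `has_ipd` indicator and the AD counts are hypothetical, and the extra random effects mentioned for AD studies are not shown.

```python
# Hypothetical inputs, for illustration only.
ipd_patients = [
    # (study, disease status, correct test result y, age in years)
    (1, 0, 1, 7), (1, 1, 1, 13), (1, 1, 0, 9), (1, 0, 1, 17),
]
ad_tables = {
    # study: (true positives, false negatives, true negatives, false positives)
    2: (3, 1, 5, 1),
}


def ipd_rows(patients):
    """One row per IPD patient, with age centred on its study mean."""
    ages = {}
    for study, _, _, age in patients:
        ages.setdefault(study, []).append(age)
    means = {s: sum(a) / len(a) for s, a in ages.items()}
    return [
        {"study": s, "disease": d, "y": y, "has_ipd": 1, "age_centred": age - means[s]}
        for s, d, y, age in patients
    ]


def ad_rows(tables):
    """Rebuild one row per patient from each 2 by 2 table; age is unknown."""
    rows = []
    for study, (tp, fn, tn, fp) in tables.items():
        # y = 1 means the test result was correct for that patient.
        for disease, y, count in ((1, 1, tp), (1, 0, fn), (0, 1, tn), (0, 0, fp)):
            rows.extend(
                {"study": study, "disease": disease, "y": y,
                 "has_ipd": 0, "age_centred": 0.0}
                for _ in range(count)
            )
    return rows


combined = ipd_rows(ipd_patients) + ad_rows(ad_tables)
print(len(combined), "rows in the stacked dataset")
```

In a model fitted to this stacked dataset, the age term would be multiplied by `has_ipd`, so AD rows inform the summary sensitivity and specificity while only IPD rows inform the within-study age effect.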
46 Combining IPD and aggregate data
- Application to all 23 studies
- 11 IPD studies assess the infant-accuracy effect
- Within-study effects (logit scale): 0.10 (S.E. 0.18) for sensitivity and 0.12 (S.E. 0.36) for specificity
- No evidence that diagnostic accuracy is different for infants and non-infants
- All 23 studies give summary sensitivity and specificity for each measurement device pair
- No evidence that ear thermometers are suitable for diagnosing fever
- Further studies need to standardise devices
47 Further research issues
- Non-linear relationships
- Confounding within studies
- Allow within-study effects to vary across studies?
- Are IPD and AD studies of comparable quality?
- AD studies may add heterogeneity due to greater variation in threshold level
- In IPD studies, is it better to pool the individual ROC curves directly (Kester & Buntinx, 2000)?
48 Conclusions
- Bivariate random-effects meta-analysis
- AD framework using binomial distribution
- Patient-level covariates?
- IPD framework using Bernoulli distribution
- IPD enables within-study accuracy-covariate
effects
- Preferable to across-study effects
- Models to combine IPD and AD
49 E-mail: rdriley_at_liv.ac.uk
References
Chu H, Cole SR. Bivariate meta-analysis of sensitivity and specificity with sparse data: a generalized linear mixed model approach. J Clin Epi 2006, 59:1331-1332.
Reitsma JB, et al. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epi 2005, 58:982-990.
Craig JV, et al. Infrared ear thermometry compared with rectal thermometry in children: a systematic review. Lancet 2002, 360:603-609.
Harbord RM, et al. A unification of models for meta-analysis of diagnostic accuracy studies. Biostatistics 2007, 8:239-251.
Riley RD, et al. Meta-analysis of continuous outcomes combining individual patient data and aggregate data. Stat Med 2008, 27:1870-93.
Kester AD, Buntinx F. Meta-analysis of ROC curves. Med Decis Making 2000, 20:430-439.
Riley et al. Meta-analysis of diagnostic test studies using individual patient data and aggregate data. Stat Med (in press).