1
Studies of Diagnostic Tests
  • Thomas B. Newman, MD, MPH
  • October 15, 2009

2
Reminders/Announcements
  • Door must be closed
  • Write down answers to problems in the book and
    check your answers!
  • Final exam to be passed out 12/3, reviewed 12/10
  • Send questions!

3
Overview
  • Common biases of studies of diagnostic test
    accuracy
  • Prevalence, spectrum and nonindependence
  • Meta-analysis of diagnostic tests
  • Checklist/systematic approach
  • Examples
    • Physical examination for presentation
    • Pain with percussion, hopping or cough for appendicitis
    • Pertussis
    • Predicting hyperbilirubinemia

4
Bias 1: Example
  • Study of BNP to diagnose congestive heart failure
    (CHF, Chapter 4, Problem 3)

5
Bias 1: Example
  • Gold standard: determination of CHF by two cardiologists blinded to BNP
  • Chest x-rays found to be highly predictive of CHF
  • Is there a problem with assessing accuracy of
    chest x-rays to diagnose CHF in this study?

Maisel AS, Krishnaswamy P, Nowak RM, McCord J, Hollander JE, Duc P, et al. Rapid measurement of B-type natriuretic peptide in the emergency diagnosis of heart failure. N Engl J Med 2002;347(3):161-7.
6
Bias 1: Incorporation Bias
  • Cardiologists not blinded to Chest X-ray
  • Probably used (incorporated) it to make final
    diagnosis
  • Incorporation bias for assessment of Chest X-ray
    (not BNP)
  • Biases both sensitivity and specificity upward
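The direction of this bias can be illustrated with a small simulation. This is a sketch only: the prevalence, the chest x-ray accuracy, and the fraction of cases in which the adjudicators defer to the x-ray are all assumed numbers, not values from the Maisel study.

```python
import random

random.seed(0)
N = 100_000
rows = []  # (true CHF, CXR positive, adjudicated CHF)

for _ in range(N):
    chf = random.random() < 0.30                      # assumed 30% prevalence of CHF
    # assumed CXR accuracy vs. true CHF: 70% sensitivity, 85% specificity
    cxr = random.random() < (0.70 if chf else 0.15)
    # adjudicating cardiologists see the CXR; assume they simply defer to it in
    # 20% of patients (the index test is "incorporated" into the gold standard)
    adjudicated = cxr if random.random() < 0.20 else chf
    rows.append((chf, cxr, adjudicated))

def sens_spec(pairs):
    """pairs of (disease, test); returns (sensitivity, specificity)."""
    tp = sum(d and t for d, t in pairs)
    fn = sum(d and not t for d, t in pairs)
    fp = sum((not d) and t for d, t in pairs)
    tn = sum((not d) and not t for d, t in pairs)
    return tp / (tp + fn), tn / (tn + fp)

print("CXR vs. true CHF:        sens=%.2f spec=%.2f" % sens_spec([(c, x) for c, x, _ in rows]))
print("CXR vs. adjudicated CHF: sens=%.2f spec=%.2f" % sens_spec([(a, x) for _, x, a in rows]))
```

Against the adjudicated diagnosis, both numbers come out higher than against the true disease status, which is the upward bias described above.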

7
Bias 2: Example
  • Visual assessment of jaundice in newborns
  • Study patients who are getting a bilirubin
    measurement
  • Ask clinicians to estimate extent of jaundice at
    time of blood draw

8
Visual Assessment of Jaundice: Results
  • Sensitivity of jaundice below the nipple line for TSB ≥ 12 mg/dL: 97%
  • Specificity: 19%
  • What is the problem?

Editor's Note: The take-home message for me is that no jaundice below the nipple line equals no bilirubin test, unless there's some other indication. --Catherine D. DeAngelis, MD
Moyer et al. Arch Pediatr Adolesc Med 2000;154:391
9
Bias 2: Verification Bias
  • Inclusion criterion for study: gold standard test was done
  • in this case, a blood test for bilirubin
  • Subjects with positive index tests are more likely to get the gold standard and to be included in the study
  • clinicians don't order a blood test for bilirubin if the jaundice is minimal
  • How does this affect sensitivity and specificity?

10
Bias 2: Verification Bias
Sensitivity, a/(a+c), is biased ___.
Specificity, d/(b+d), is biased ___.
AKA Work-up Bias, Referral Bias, or Ascertainment Bias
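A worked example (all numbers hypothetical, chosen only to show the direction of the bias): start from a "true" 2x2 table, then verify test-positives far more often than test-negatives and count only the verified patients.

```python
# Hypothetical true 2x2 table for an index test vs. the gold standard
a, b, c, d = 80, 100, 20, 800        # a=TP, b=FP, c=FN, d=TN

true_sens = a / (a + c)              # 0.80
true_spec = d / (b + d)              # 0.89

# Assumed verification probabilities: 90% of test-positives get the gold
# standard, but only 10% of test-negatives do
p_pos, p_neg = 0.90, 0.10
a_obs, b_obs = a * p_pos, b * p_pos
c_obs, d_obs = c * p_neg, d * p_neg

obs_sens = a_obs / (a_obs + c_obs)   # ~0.97  (biased upward)
obs_spec = d_obs / (b_obs + d_obs)   # ~0.47  (biased downward)

print(f"true:     sens={true_sens:.2f} spec={true_spec:.2f}")
print(f"observed: sens={obs_sens:.2f} spec={obs_spec:.2f}")
```

Only verified patients enter the study's 2x2 table, so sensitivity is pushed up and specificity is pushed down, the same pattern as the 97%/19% jaundice result above.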
11
Bias 3
  • Example: PIOPED study of accuracy of V/Q scan to diagnose pulmonary embolism
  • Study population: all patients presenting to the ED who received a V/Q scan
  • Test: V/Q scan
  • Disease: pulmonary embolism (PE)
  • Gold standards:
  • 1. Pulmonary arteriogram (PA-gram) if done (more likely with more abnormal V/Q scan)
  • 2. Clinical follow-up in other patients (more likely with a normal V/Q scan)

PIOPED. JAMA 1990;263(20):2753-9.
12
Double Gold Standard Bias
  • Two different gold standards
  • One gold standard (e.g., surgery, invasive test)
    is more likely to be applied in patients with
    positive index test,
  • Other gold standard (e.g., clinical follow-up) is
    more likely to be applied in patients with a
    negative index test.
  • There are some patients in whom the two gold standards do not give the same answer:
  • spontaneously resolving disease
  • newly occurring disease

13
Double Gold Standard Bias: effect of spontaneously resolving cases
Sensitivity, a/(a+c), biased __; specificity, d/(b+d), biased __
  • Double gold standard compared with follow-up for all
  • Double gold standard compared with PA-gram for all
14
Double Gold Standard Bias: effect of newly occurring cases
Sensitivity, a/(a+c), biased __; specificity, d/(b+d), biased __
  • Double gold standard compared with follow-up for all
  • Double gold standard compared with PA-gram for all
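A numerical sketch of both effects, using the PA-gram-for-all classification as the comparison; the cohort sizes, V/Q accuracy, and numbers of resolving or newly occurring cases are all assumptions for illustration, not PIOPED data.

```python
# Assumed cohort: 100 patients with PE at scan time, 400 without;
# V/Q scan assumed 80% sensitive and 90% specific vs. PE at scan time.
def sens_spec(a, b, c, d):
    return a / (a + c), d / (b + d)

a, b, c, d = 80, 40, 20, 360
print("PA-gram for all:             sens=%.2f spec=%.2f" % sens_spec(a, b, c, d))

# Spontaneously resolving cases: suppose 4 of the 20 scan-negative PE patients
# resolve before follow-up. Under the double gold standard they are followed
# clinically and look disease-free, so they move from FN to TN.
print("Double standard, resolving:  sens=%.2f spec=%.2f" % sens_spec(a, b, c - 4, d + 4))

# Newly occurring cases: suppose 9 scan-negative patients without PE at scan
# time develop PE during follow-up. Under the double gold standard they are
# labeled diseased, so they move from TN to FN.
print("Double standard, new cases:  sens=%.2f spec=%.2f" % sens_spec(a, b, c + 9, d - 9))
```

In this sketch, spontaneously resolving cases push both sensitivity and specificity up, and newly occurring cases push both down, relative to the single-gold-standard classification.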
15
Double Gold Standard Bias: Ultrasound diagnosis of intussusception
16
What if 10 of the 86 U/S- followed subjects
actually had intussusceptions that resolved
spontaneously?
17
Spectrum of Disease, Nondisease and Test Results
  • Disease is often easier to diagnose if severe
  • Nondisease is easier to diagnose if patient is
    well than if the patient has other diseases
  • Test results will be more reproducible if
    ambiguous results excluded

18
Spectrum Bias
  • Sensitivity depends on the spectrum of disease in
    the population being tested.
  • Specificity depends on the spectrum of
    non-disease in the population being tested.
  • Example Absence of Nasal Bone (on 13-week
    ultrasound) as a Test for Chromosomal Abnormality

19
Spectrum Bias Example: Absence of Nasal Bone as a Test for Chromosomal Abnormality
Sensitivity 229/333 = 69%, BUT the D+ group only included fetuses with Trisomy 21
Cicero et al. Ultrasound Obstet Gynecol 2004;23:218-23
20
Spectrum Bias: Absence of Nasal Bone as a Test for Chromosomal Abnormality
  • D+ group excluded 295 fetuses with other chromosomal abnormalities (esp. Trisomy 18)
  • Among these fetuses, sensitivity was 32% (not 69%)
  • What decision is this test supposed to help with?
  • If it is whether to test chromosomes using
    chorionic villus sampling or amniocentesis,
    these 295 fetuses should be included!

21
Spectrum Bias: Absence of Nasal Bone as a Test for Chromosomal Abnormality, effect of including other trisomies in the D+ group
Sensitivity 324/628 = 52%, NOT the 69% obtained when the D+ group only included fetuses with Trisomy 21
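The arithmetic behind the two sensitivity figures, using only the counts quoted on the slides above:

```python
sens_t21_only = 229 / 333           # 69% when the D+ group is Trisomy 21 only
other_detected = 324 - 229          # 95 of the 295 other-abnormality fetuses
sens_other = other_detected / 295   # about 32% among the other abnormalities
sens_all = 324 / 628                # 52% when all 628 affected fetuses are counted
print(f"T21 only: {sens_t21_only:.0%}   other abnormalities: {sens_other:.0%}   all: {sens_all:.0%}")
```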
22
Quiz: What if we considered nasal bone absence as a test for Trisomy 21?
  • Then instead of excluding subjects with other chromosomal abnormalities or including them as D+, we should count them as D-. Compared with excluding them,
  • What would happen to sensitivity?
  • What would happen to specificity?

23
Prevalence, spectrum and nonindependence
  • Prevalence (prior probability) of disease may be
    related to disease severity
  • One mechanism is different spectra of disease or
    nondisease
  • Another is that whatever is causing the high
    prior probability is related to the same aspect
    of the disease as the test

24
Prevalence, spectrum and nonindependence
  • Examples
  • Iron deficiency
  • Diseases identified by screening
  • Urinalysis as a test for UTI in women with more
    and fewer symptoms (high and low prior
    probability)
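A hypothetical sketch of the urinalysis point (all accuracy numbers and priors below are made up for illustration): if the factors that raise the prior probability also make the test more likely to be positive, the test's performance differs between the high- and low-prior groups, so the test result and the prior probability are not independent.

```python
def lr_positive(sens, spec):
    return sens / (1 - spec)

# Assumed urinalysis accuracy in two symptom strata (hypothetical numbers):
# many symptoms -> more severe disease (higher sensitivity) but also more
# positive dipsticks among women without UTI (lower specificity).
strata = {
    "many symptoms (high prior)": {"prior": 0.60, "sens": 0.90, "spec": 0.60},
    "few symptoms (low prior)":   {"prior": 0.10, "sens": 0.70, "spec": 0.85},
}

for name, g in strata.items():
    lr = lr_positive(g["sens"], g["spec"])
    post_odds = (g["prior"] / (1 - g["prior"])) * lr
    post_prob = post_odds / (1 + post_odds)
    print(f"{name}: LR+ = {lr:.1f}, post-test P(UTI) = {post_prob:.2f}")
```

Because the LR+ differs between strata, applying a single "average" LR to every patient would misstate the post-test probability in both groups.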

25
Overfitting
26
Meta-analyses of Diagnostic Tests
  • Systematic and reproducible approach to finding
    studies
  • Summary of results of each study
  • Investigation into heterogeneity
  • Summary estimate of results, if appropriate
  • Unlike other meta-analyses (risk factors, treatments), results aren't summarized with a single number (e.g., RR), but with two related numbers (sensitivity and specificity)
  • These can be plotted on an ROC plane

27
MRI for the diagnosis of MS
Whiting et al. BMJ 2006;332:875-84
28
Studies of Diagnostic Test Accuracy Checklist
  • Was there an independent, blind comparison with a
    reference (gold) standard of diagnosis?
  • Was the diagnostic test evaluated in an
    appropriate spectrum of patients (like those in
    whom we would use it in practice)?
  • Was the reference standard applied regardless of
    the diagnostic test result?
  • Was the test (or cluster of tests) validated in a
    second, independent group of patients?

From Sackett et al., Evidence-Based Medicine, 2nd ed. (NY: Churchill Livingstone), 2000, p. 68
29
Systematic Approach
  • Authors and funding source
  • Research question
  • Study design
  • Study subjects
  • Predictor variable
  • Outcome variable
  • Results Analysis
  • Conclusions

30
A clinical decision rule to identify children at
low risk for appendicitis (Problem 5.6)
  • Study design: prospective cohort study
  • Subjects:
  • Of 4140 patients 3-18 years presenting to Boston Children's Hospital ED with chief complaint of abdominal pain
  • 767 (19%) received surgical consultation for possible appendicitis
  • 113 excluded (chronic diseases, recent imaging)
  • 53 missed
  • 601 included in the study (425 in derivation set)

Kharbanda et al. Pediatrics 2005;116(3):709-16
31
A clinical decision rule to identify children at
low risk for appendicitis
  • Predictor variable:
  • Standardized assessment by PEM attending
  • Focus on Pain with percussion, hopping or cough (complete data in N=381)
  • Outcome variable:
  • Pathologic diagnosis of appendicitis for those who received surgery (37%)
  • Follow-up telephone call to family or pediatrician 2-4 weeks after the ED visit for those who did not receive surgery (63%)

Kharbanda et al. Pediatrics 2005;116(3):709-16
32
A clinical decision rule to identify children at
low risk for appendicitis
  • Results: Pain with percussion, hopping or cough
  • 78% sensitivity seems low to me. Is it valid for me in deciding whom to image? (see the sketch below)

Kharbanda et al. Pediatrics 2005;116(3):709-16
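One way to examine whether a 78% sensitivity is good enough is to compute the probability of appendicitis after a negative exam finding. The prevalence and specificity below are assumptions chosen for illustration, not figures from Kharbanda et al.; only the 78% sensitivity comes from the slide.

```python
# Assumed values for illustration (not from the study)
prevalence = 0.35      # assumed P(appendicitis) among children getting a surgical consult
sensitivity = 0.78     # from the slide above
specificity = 0.60     # assumed

lr_negative = (1 - sensitivity) / specificity     # ~0.37
pre_odds = prevalence / (1 - prevalence)
post_odds = pre_odds * lr_negative
post_prob = post_odds / (1 + post_odds)           # ~0.17

print(f"LR- = {lr_negative:.2f}; P(appendicitis | no pain with percussion/hop/cough) = {post_prob:.2f}")
```

Under these assumptions the post-test probability after a negative finding is still roughly 17%, which is why a sensitivity of 78% by itself may not settle the imaging decision.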
33
Checklist
  • Was there an independent, blind comparison with a
    reference (gold) standard of diagnosis?
  • Was the diagnostic test evaluated in an
    appropriate spectrum of patients (like those in
    whom we would use it in practice)?
  • Was the reference standard applied regardless of
    the diagnostic test result?
  • Was the test (or cluster of tests) validated in a
    second, independent group of patients?

From Sackett et al., Evidence-Based Medicine, 2nd ed. (NY: Churchill Livingstone), 2000, p. 68
34
Systematic approach
  • Study design: prospective cohort study
  • Subjects:
  • Of 4140 patients 3-18 years presenting to Boston Children's Hospital ED with chief complaint of abdominal pain
  • 767 (19%) received surgical consultation for possible appendicitis

Kharbanda et al. Pediatrics 2005;116(3):709-16
35
A clinical decision rule to identify children at
low risk for appendicitis
  • Predictor variable:
  • Pain with percussion, hopping or cough (complete data in N=381)
  • Outcome variable:
  • Pathologic diagnosis of appendicitis for those who received surgery (37%)
  • Follow-up telephone call to family or pediatrician 2-4 weeks after the ED visit for those who did not receive surgery (63%)

Kharbanda et al. Pediatrics 2005;116(3):709-16
36
Issues
  • Sample representative?
  • Verification bias?
  • Double-gold standard bias?
  • Spectrum bias?

37
For children presenting with abdominal pain to
SFGH 6-M
  • Sensitivity probably valid (not falsely low)
  • But whether all of them tried to hop is not clear
  • Specificity probably low
  • PPV is high
  • NPV is low
  • Does not address surgical consultation decision

38
Does this coughing patient have pertussis?
  • RQ (for us): What are the LRs for coughing fits, whoop, and post-tussive vomiting in adults with persistent cough?
  • Design (for one study we reviewed): Prospective cross-sectional study
  • Subjects: 217 adults ≥18 years with cough of 7-21 days, no fever or other clear cause for cough, enrolled by 80 French GPs
  • In a subsample from 58 GPs, of 710 who met inclusion criteria only 99 (14%) enrolled

Gilberg S et al. J Infect Dis 2002;186:415-8
39
Pertussis diagnosis
  • Predictor variables: GPs interviewed patients using a standardized questionnaire
  • Outcome variable: evidence of pertussis based on
  • Culture (N=1)
  • PCR (N=36)
  • Or 2-fold change in anti-pertussis toxin IgG (N=40)
  • Total: N=70/217 with evidence of pertussis

Gilberg S et al. J Infect Dis 2002;186:415-8
40
Results
  • 89% in both groups met CDC criteria for pertussis

41
Issues
  • Verification (selection) bias: only 14% of eligible subjects included
  • Questionable gold standard (internally inconsistent)
  • Nice illustration of the difficulty of doing a systematic review!

42
Questions?