Assessing Information from Multilevel Ordinal and Continuous Tests
1
Chapter 7: Prognostic Tests
Chapter 8: Combining Tests and Multivariable Decision Rules
Michael A. Kohn, MD, MPP. 10/29/2009
2
Outline of Topics
  • Prognostic Tests
  • Differences from diagnostic tests
  • Quantifying predictions: calibration and
    discrimination
  • Value of prognostic information
  • Comparing predictions
  • Combining Tests/Diagnostic Models
  • Importance of test non-independence
  • Recursive Partitioning
  • Logistic Regression
  • Variable (Test) Selection
  • Importance of validation separate from derivation

3
Prognostic Tests (Ch 7)
  • Differences from diagnostic tests
  • Validation/Quantifying Accuracy (calibration and
    discrimination)
  • Assessing the value of prognostic information
  • Comparing predictions by different people or
    different models

Will not discuss time-to-event analysis or
predicting continuous outcomes. (Covered in
Chapter 7.)
4
Chance determines whether you get the disease
Spin the needle
5
Diagnostic Test
  • Spin needle to see if you develop disease.
  • Perform test for disease.
  • Gold standard determines true disease state.
    (Can calculate sensitivity, specificity, LRs.)

6
Prognostic Test
  • Perform test to predict the risk of disease.
  • Spin needle to see if you develop disease.
  • How do you assess the validity of the
    predictions?

7
Example: Mastate Cancer
  • Once developed, always fatal.
  • Can be prevented by mastatectomy.
  • Two oncologists separately assign each of N
    individuals a risk for developing mastate cancer
    in the next 5 years.

8
(No Transcript)
9
How do you assess the validity of the predictions?
10
How many like this?
Oncologist 1 assigns risk of 50%
Spin the needles.
How many get mastate cancer?
11
How many like this?
Oncologist 1 assigns risk of 35%
Spin the needles.
How many get mastate cancer?
12
How many like this?
Oncologist 1 assigns risk of 20%
Spin the needles.
How many get mastate cancer?
13
Calibration
  • How accurate are the predicted probabilities?
  • Break the population into groups
  • Compare actual and predicted probabilities for
    each group

Related to Goodness-of-Fit and diagnostic model
validation, which will be discussed shortly.
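As a minimal sketch of this grouping idea (toy predicted risks and outcomes, not data from the lecture), bin subjects by predicted probability and compare each bin's mean prediction to its observed event rate:

    import numpy as np

    # Toy predicted risks and observed outcomes (1 = developed disease);
    # values are hypothetical, chosen only to illustrate the comparison.
    predicted = np.array([0.50, 0.35, 0.20, 0.50, 0.35, 0.20, 0.50, 0.35, 0.20])
    observed = np.array([1, 0, 0, 0, 1, 0, 1, 0, 0])

    # Group subjects by predicted probability, then compare the predicted
    # risk to the actual proportion who developed disease in each group.
    for p in np.unique(predicted):
        in_group = predicted == p
        print(f"predicted {p:.2f}  observed {observed[in_group].mean():.2f}  n={in_group.sum()}")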
14
Calibration
15
Calibration
16
Discrimination
  • How well does the test separate subjects in the
    population, moving them from the mean probability
    toward values closer to zero or 1?
  • May be more generalizable than calibration
  • Often measured with the C-statistic (AUROC); see
    the sketch below
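A minimal sketch of the C-statistic using toy values (not the lecture's data): the AUROC equals the probability that a randomly chosen diseased subject has a higher predicted risk than a randomly chosen non-diseased subject, with ties counting half:

    import itertools

    # Hypothetical predicted risks for diseased and non-diseased subjects.
    risks_diseased = [0.50, 0.35, 0.50]
    risks_not_diseased = [0.20, 0.35, 0.20, 0.50]

    # C-statistic: fraction of diseased/non-diseased pairs ranked
    # correctly, counting ties as half.
    pairs = list(itertools.product(risks_diseased, risks_not_diseased))
    c = sum(1.0 if d > n else 0.5 if d == n else 0.0 for d, n in pairs) / len(pairs)
    print(c)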

17
Discrimination
18
Discrimination
19
Discrimination
AUROC = 0.63
20
True Risk
Oncologist 1: 20%, Oncologist 2: 20%, True Risk: 11.1%
Oncologist 1: 35%, Oncologist 2: 20%, True Risk: 16.7%
Oncologist 1: 50%, Oncologist 2: 20%, True Risk: 33.3%
21
True Risk -- Calibration
22
True Risk -- Calibration
23
True Risk -- Discrimination
24
True Risk -- Discrimination
25
True Risk -- Discrimination
AUROC = 0.63
26
ROC curve depends only on rankings, not
calibration
27
Random event occurs AFTER prognostic test.
1) Perform test to predict the risk of
disease. 2) Spin needle to see if you develop
disease.
Only a crystal ball allows perfect prediction.
28
Maximum AUROC
True Risk 11.1%
True Risk 16.7%
True Risk 33.3%
Maximum AUROC = 0.65
29
Diagnostic versus Prognostic Tests
  • Purpose: identify prevalent disease (diagnostic) vs.
    predict incident disease/outcome (prognostic)
  • Disease state determined by chance: prior to test
    (diagnostic) vs. after test (prognostic)
  • Study design: cross-sectional (diagnostic) vs.
    cohort (prognostic)
  • Result: +/-, ordinal, or continuous (diagnostic) vs.
    risk/probability (prognostic)
  • Maximum AUROC: 1 (diagnostic) vs. <1 (prognostic;
    not clairvoyant)
30
Value of Prognostic Information
  • Why do you want to know risk of mastate cancer?

To decide whether to do a mastatectomy.
31
Value of Prognostic Information
  • It is 4 times worse to die of mastate gland
    cancer than to have a mastatectomy, so B (the net
    benefit of mastatectomy when cancer would develop)
    = 4C - C = 3C, and B + C = 4C.
  • Ptt = C/(B + C) = C/4C = 0.25 = 25% (see the
    sketch below)
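A quick check of this arithmetic (a sketch; the 4:1 harm ratio is the slide's assumption):

    # Treatment threshold, using the slide's assumption that dying of
    # mastate cancer is 4 times worse than an unnecessary mastatectomy.
    C = 1.0          # harm of mastatectomy in a woman who would not get cancer
    B = 4.0 * C - C  # net benefit of mastatectomy in a woman who would: 3C
    p_tt = C / (B + C)
    print(p_tt)      # 0.25: operate when predicted risk exceeds 25%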

32
Value of Prognostic Information: 300 patients (100
per risk group)
  • Oncologist 1: 31% (> 25%): Mastatectomy.
    89 out of 100 unnecessary; no mastate cancer deaths.
  • Oncologist 1: 37% (> 25%): Mastatectomy.
    83 out of 100 unnecessary; no mastate cancer deaths.
  • Oncologist 1: 53% (> 25%): Mastatectomy.
    67 out of 100 unnecessary; no mastate cancer deaths.

33
Value of Prognostic Information: 300 patients (100
per risk group)
  • Oncologist 2: 20% (< 25%): No mastatectomy.
    11 out of 100 die of mastate cancer; no mastatectomies.
  • Oncologist 2: 20% (< 25%): No mastatectomy.
    17 out of 100 die; no mastatectomies.
  • Oncologist 2: 20% (< 25%): No mastatectomy.
    33 out of 100 die; no mastatectomies.

34
Value of Prognostic Information: 300 patients (100
per risk group)
  • True Risk 11% (< 25%): No mastatectomy.
    11 out of 100 die of mastate cancer; no mastatectomies.
  • True Risk 17% (< 25%): No mastatectomy.
    17 out of 100 die; no mastatectomies.
  • True Risk 33% (> 25%): Mastatectomy.
    67 out of 100 unnecessary; no mastate cancer deaths.

35
Value of Prognostic Information: 300 patients (100
per risk group)
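To see why the true-risk predictions are the most valuable, here is a sketch totaling the harms of each strategy over the 300 patients, weighting each death 4 times as heavily as each mastatectomy (per slide 31); the surgery and death counts come from the three preceding slides:

    # Each strategy: (mastatectomies performed, mastate cancer deaths)
    # per 300 patients, tallied from the preceding slides.
    strategies = {
        "Oncologist 1 (operate on all groups)": (300, 0),
        "Oncologist 2 (operate on no one)": (0, 11 + 17 + 33),
        "True risk (operate on the 33% group)": (100, 11 + 17),
    }
    for name, (surgeries, deaths) in strategies.items():
        harm = surgeries * 1 + deaths * 4  # death is 4x as bad as surgery
        print(f"{name}: total harm = {harm}")
    # True risk wins: 212 units vs. 244 (no one) and 300 (everyone).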
36
Value of Prognostic Information
  • Doctors and patients like prognostic information
  • But hard to assess its value
  • Most objective approach is decision-analytic.
    Consider:
  • What decision is to be made?
  • Costs of errors?
  • Cost of test?

37
Comparing Predictions
  • Compare ROC Curves and AUROCs
  • Reclassification Tables, Net Reclassification
    Improvement (NRI), Integrated Discrimination
    Improvement (IDI)
  • See Jan. 30, 2008 Issue of Statistics in
    Medicine (? and EBD Edition 2 ?)

Pencina et al. Stat Med. 2008 Jan 30;27(2):157-72.
38
Common Problems with Studies of Prognostic Tests
  • See Chapter 7

39
Combining Tests/Diagnostic Models
  • Importance of test non-independence
  • Recursive Partitioning
  • Logistic Regression
  • Variable (Test) Selection
  • Importance of validation separate from derivation
    (calibration and discrimination revisited)

40
Combining Tests: Example
  • Prenatal sonographic Nuchal Translucency (NT) and
    Nasal Bone Exam as dichotomous tests for Trisomy
    21

Cicero, S., G. Rembouskos, et al. (2004).
"Likelihood ratio for trisomy 21 in fetuses with
absent nasal bone at the 11-14-week scan."
Ultrasound Obstet Gynecol 23(3): 218-23.
41
If NT ≥ 3.5 mm: Positive for Trisomy 21
What's wrong with this definition?
42
(No Transcript)
43
  • In general, don't make multi-level tests like NT
    into dichotomous tests by choosing a fixed cutoff
  • I did it here to make the discussion of multiple
    tests easier
  • I arbitrarily chose to call ≥ 3.5 mm positive

44
One Dichotomous Test

Nuchal Translucency   D+    D-     LR
≥ 3.5 mm              212   478    7.0
< 3.5 mm              121   4745   0.4
Total                 333   5223

Do you see that this is (212/333)/(478/5223)?
Review of Chapter 3: What are the sensitivity,
specificity, PPV, and NPV of this test? (Be
careful.)
45
Nuchal Translucency
  • Sensitivity = 212/333 = 64%
  • Specificity = 4745/5223 = 91%
  • Prevalence = 333/(333 + 5223) = 6%
  • (Study population: pregnant women about to
    undergo CVS, so high prevalence of Trisomy 21)
  • PPV = 212/(212 + 478) = 31%
  • NPV = 4745/(121 + 4745) = 97.5%

Not that great: prior to the test, P(D-) = 94%
46
Clinical Scenario, One Test: Pre-Test Probability
of Down's = 6%; NT Positive
  • Pre-test prob = 0.06
  • Pre-test odds = 0.06/0.94 = 0.064
  • LR(+) = 7.0
  • Post-Test Odds = Pre-Test Odds x LR(+)
  • = 0.064 x 7.0 = 0.44
  • Post-Test prob = 0.44/(0.44 + 1) = 0.31
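A sketch of the odds arithmetic on this slide:

    def prob_to_odds(p):
        return p / (1 - p)

    def odds_to_prob(o):
        return o / (1 + o)

    pre_prob = 0.06                            # pre-test probability
    post_odds = prob_to_odds(pre_prob) * 7.0   # multiply by LR(+) for NT
    print(round(odds_to_prob(post_odds), 2))   # ~0.31, as on the slide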

47
NT Positive
  • Pre-test Prob = 0.06
  • P(Result | Trisomy 21) = 0.64
  • P(Result | No Trisomy 21) = 0.09
  • Post-Test Prob = ?
  • http://www.quesgen.com/Calculators/PostProdOfDisease/PostProdOfDisease.html

Slide Rule
48
Nasal Bone Seen (NBA = No): Negative for Trisomy 21
Nasal Bone Absent (NBA = Yes): Positive for Trisomy 21
49
Second Dichotomous Test

Nasal Bone Absent   Tri21+   Tri21-   LR
Yes                 229      129      27.8
No                  104      5094     0.32
Total               333      5223

Do you see that this is (229/333)/(129/5223)?
50
Clinical Scenario, Two Tests: Using Probabilities
Pre-Test Probability of Trisomy 21 = 6%
NT Positive for Trisomy 21 (≥ 3.5 mm)
Post-NT Probability of Trisomy 21 = 31%
NBA Positive (no bone seen)
Post-NBA Probability of Trisomy 21 = ?
51
Clinical Scenario, Two Tests: Using Odds
Pre-Test Odds of Tri21 = 0.064
NT Positive (LR = 7.0)
Post-Test Odds of Tri21 = 0.44
NBA Positive (LR = 27.8?)
Post-Test Odds of Tri21 = 0.44 x 27.8? = 12.4?
(P = 12.4/(1 + 12.4) = 92.5%?)
52
Clinical Scenario, Two Tests: Pre-Test
Probability of Trisomy 21 = 6%; NT ≥ 3.5 mm AND
Nasal Bone Absent
53
Question
  • Can we use the post-test odds after a positive
    Nuchal Translucency as the pre-test odds for the
    positive Nasal Bone Examination?
  • i.e., can we combine the positive results by
    multiplying their LRs?
  • LR(NT+, NBE+) = LR(NT+) x LR(NBE+)?
  • = 7.0 x 27.8?
  • = 194?

54
Answer: No
Not 194
158/(158 + 36) = 81%, not 92.5%
55
Non-Independence
  • Absence of the nasal bone does not tell you as
    much if you already know that the nuchal
    translucency is ≥ 3.5 mm.

56
Clinical Scenario
Using Odds
Pre-Test Odds of Tri21 = 0.064
NT+/NBE+ (combined LR = 68.8)
Post-Test Odds = 0.064 x 68.8 = 4.40
(P = 4.40/(1 + 4.40) = 81%, not 92.5%)
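A sketch computing the combined LR directly from the doubly positive cell (counts from these slides: 158 of 333 Trisomy 21 fetuses and 36 of 5223 chromosomally normal fetuses had both NT ≥ 3.5 mm and the nasal bone absent):

    # Combined LR from the both-positive cell, not from multiplying LRs.
    lr_both = (158 / 333) / (36 / 5223)
    print(round(lr_both, 1))                 # 68.8, not 7.0 x 27.8 = 194

    post_odds = (0.06 / 0.94) * lr_both      # pre-test odds x combined LR
    print(round(post_odds / (1 + post_odds), 2))   # 0.81, as on the slide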
57
Non-Independence
58
Non-Independence of NT and NBA
  • Apparently, even in chromosomally normal fetuses,
    enlarged NT and absence of the nasal bone are
    associated. A false positive on the NT makes a
    false positive on the NBE more likely. Of normal
    (D-) fetuses with NT < 3.5 mm, only 2.0% had the
    nasal bone absent. Of normal (D-) fetuses with NT
    ≥ 3.5 mm, 7.5% had the nasal bone absent.

Some (but not all) of this may have to do with
ethnicity. In this London study, chromosomally
normal fetuses of Afro-Caribbean ethnicity had
both larger NTs and more frequent absence of the
nasal bone.
In Trisomy 21 (D+) fetuses, normal NT was
associated with the presence of the nasal bone,
so a false negative on the NT was associated with
a false negative on the NBE.
59
Non-Independence
  • Instead of looking for the nasal bone, what if
    the second test were just a repeat measurement of
    the nuchal translucency?
  • A second positive NT would do little to increase
    your certainty of Trisomy 21. If it was false
    positive the first time around, it is likely to
    be false positive the second time.

60
Reasons for Non-Independence
  • Tests measure the same aspect of disease.
  • One aspect of Down's syndrome is slower fetal
    development: the NT decreases more slowly and the
    nasal bone ossifies later. Chromosomally NORMAL
    fetuses that develop slowly will tend to have
    false positives on BOTH the NT Exam and the Nasal
    Bone Exam.

61
Reasons for Non-Independence
  • Heterogeneity of Disease (e.g. spectrum of
    severity).
  • Heterogeneity of Non-Disease.
  • (See EBD page 158.)

Not particularly important in the Down's syndrome
example
62
Unless tests are independent, we can't combine
results by multiplying LRs
63
Ways to Combine Multiple Tests
  • On a group of patients (derivation set), perform
    the multiple tests and (independently)
    determine true disease status (apply the gold
    standard)
  • Measure LR for each possible combination of
    results
  • Recursive Partitioning
  • Logistic Regression

Beware of incorporation bias
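A sketch of the "LR for each combination" bookkeeping. The four-cell counts below are reconstructed from the marginal tables on the earlier slides (e.g., of the 212 NT-positive Trisomy 21 fetuses, 158 also had the nasal bone absent), so treat them as an illustration:

    import pandas as pd

    # One row per result combination: counts of Trisomy 21 (D+) and
    # chromosomally normal (D-) fetuses, reconstructed from the slides.
    counts = pd.DataFrame(
        [("NT+", "NBA+", 158, 36),
         ("NT+", "NBA-", 54, 442),
         ("NT-", "NBA+", 71, 93),
         ("NT-", "NBA-", 50, 4652)],
        columns=["NT", "NBA", "tri21_pos", "tri21_neg"],
    )
    counts["LR"] = (counts.tri21_pos / counts.tri21_pos.sum()) / \
                   (counts.tri21_neg / counts.tri21_neg.sum())
    print(counts.sort_values("LR", ascending=False))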
64
Determine LR for Each Result Combination
Assumes pre-test prob = 6%
65
Sort by LR (Descending)
66
Apply Chapter 4: Multilevel Tests
  • Now you have a multilevel test (in this case, 4
    levels)
  • Have LR for each test result
  • Can create ROC curve and calculate AUROC
  • Given pre-test probability and treatment
    threshold probability (C/(B + C)), can find
    optimal cutoff.

67
Create ROC Table
68
AUROC = 0.896
69
Optimal Cutoff
  • Assume:
  • Pre-test probability = 6%
  • Threshold for CVS is 2%
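A sketch of the implied LR cutoff: do CVS when the post-test probability exceeds the 2% threshold, which happens exactly when a result's LR exceeds the threshold odds divided by the pre-test odds:

    pretest_odds = 0.06 / 0.94     # 6% pre-test probability
    threshold_odds = 0.02 / 0.98   # 2% threshold for CVS
    lr_cutoff = threshold_odds / pretest_odds
    print(round(lr_cutoff, 2))     # ~0.32: CVS for any result combination
                                   # whose LR exceeds this value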

70
Determine LR for Each Result Combination
2 dichotomous tests: 4 combinations
3 dichotomous tests: 8 combinations
4 dichotomous tests: 16 combinations
Etc.
2 three-level tests: 9 combinations
3 three-level tests: 27 combinations
Etc.
71
Determine LR for Each Result Combination
How do you handle continuous tests?
Not always practical for groups of tests.
72
Recursive Partitioning: Measure NT First
73
Recursive Partitioning: Examine Nasal Bone First
74
Do Nasal Bone Exam First
  • Better separates Trisomy 21 from chromosomally
    normal fetuses
  • If your threshold for CVS is between 11% and 43%,
    you can stop after the nasal bone exam
  • If your threshold is between 1% and 11%, you
    should do the NT exam only if the NBE is normal.

75
Recursive Partitioning: Examine Nasal Bone First.
CVS if P(Trisomy 21) > 5%
76
Recursive Partitioning: Examine Nasal Bone First.
CVS if P(Trisomy 21) > 5%
77
Recursive Partitioning
  • Same as Classification and Regression Trees
    (CART)
  • Don't have to work out probabilities (or LRs) for
    all possible combinations of tests, because of
    tree pruning (see the sketch below)
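The lecture builds these trees by hand from the 2x2 tables; as a sketch with simulated data (all variable names and parameters below are made up for illustration), scikit-learn's DecisionTreeClassifier, a CART implementation, grows a similar two-level tree:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(0)
    n = 1000
    nt = rng.normal(2.0, 1.0, n)      # simulated nuchal translucency, mm
    nba = rng.integers(0, 2, n)       # simulated nasal-bone-absent flag
    risk = 1 / (1 + np.exp(-(-4 + 1.0 * nt + 2.0 * nba)))
    y = rng.random(n) < risk          # simulated disease status

    tree = DecisionTreeClassifier(max_depth=2, min_samples_leaf=50)
    tree.fit(np.column_stack([nt, nba]), y)
    print(export_text(tree, feature_names=["NT", "NBA"]))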

78
Recursive Partitioning
  • Does not deal well with continuous test results
    when there is a monotonic relationship between
    the test result and the probability of disease

79
Logistic Regression
  • Ln(Odds(D+)) = a + bNT(NT) + bNBA(NBA) +
    binteract(NT)(NBA)
  • where a positive result is coded 1 and a negative
    result 0
  • More on this later in ATCR!
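A sketch fitting this interaction model with statsmodels, expanding the reconstructed four-cell counts (see the LR-combination sketch above) into one row per fetus, with positive results coded 1:

    import pandas as pd
    import statsmodels.formula.api as smf

    # Expand the four result combinations into one row per fetus.
    rows = []
    for nt, nba, n_pos, n_neg in [(1, 1, 158, 36), (1, 0, 54, 442),
                                  (0, 1, 71, 93), (0, 0, 50, 4652)]:
        rows += [{"nt": nt, "nba": nba, "tri21": 1}] * n_pos
        rows += [{"nt": nt, "nba": nba, "tri21": 0}] * n_neg
    df = pd.DataFrame(rows)

    # tri21 ~ nt * nba estimates a, b_NT, b_NBA, and the interaction term.
    model = smf.logit("tri21 ~ nt * nba", data=df).fit()
    print(model.params)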

80
Why does logistic regression model log-odds
instead of probability?
Related to why the LR Slide Rule's log-odds scale
helps us visualize combining test results.
81
Probability of Trisomy 21 vs. Maternal Age
82
Ln(Odds) of Trisomy 21 vs. Maternal Age
83
Combining 2 Continuous Tests
> 1% Probability of Trisomy 21
< 1% Probability of Trisomy 21
84
Choosing Which Tests to Include in the Decision
Rule
  • Have focused on how to combine results of two or
    more tests, not on which of several tests to
    include in a decision rule.
  • Variable Selection: Options include
  • Recursive partitioning
  • Automated stepwise logistic regression

Choice of variables in derivation data set
requires confirmation in a separate validation
data set.
85
Variable Selection
  • Especially susceptible to overfitting

86
Need for Validation Example
  • Study of clinical predictors of bacterial
    diarrhea.
  • Evaluated 34 historical items and 16 physical
    examination questions.
  • 3 questions (abrupt onset, > 4 stools/day, and
    absence of vomiting) best predicted a positive
    stool culture (sensitivity 86%, specificity 60%
    for all 3).
  • Would these 3 be the best predictors in a new
    dataset? Would they have the same sensitivity
    and specificity?

DeWitt TG, Humphrey KF, McCarthy P. Clinical
predictors of acute bacterial diarrhea in young
children. Pediatrics. Oct 1985;76(4):551-556.
87
Need for Validation
  • Develop prediction rule by choosing a few tests
    and findings from a large number of
    possibilities.
  • Takes advantage of chance variations in the data.
  • Predictive ability of rule will probably
    disappear when you try to validate on a new
    dataset.
  • Can be referred to as overfitting.

88
VALIDATION
  • No matter what technique (CART or logistic
    regression) is used, the tests included in a
    model and the way in which their results are
    combined must be tested on a data set different
    from the one used to derive the rule.
  • Beware of studies that use a validation set to
    tweak the model. This is really just a second
    derivation step.

89
Prognostic Tests and Multivariable Diagnostic
Models
  • Commonly express results in terms of a
    probability
  • -- risk of the outcome by a fixed time point
    (prognostic test)
  • -- posterior probability of disease (diagnostic
    model)
  • Need to assess both calibration and
    discrimination.

90
Validation Dataset
  • Measure all the variables needed for the model.
  • Determine disease status (D+ or D-) on all
    subjects.

91
VALIDATIONCalibration
  • -- Divide dataset into probability groups
    (deciles, quintiles, ...) based on the model (no
    tweaking allowed).
  • -- In each group, compare the actual D+ proportion
    to the model-predicted probability.

92
VALIDATIONDiscrimination
  • Discrimination
  • -- Test result is model-predicted probability of
    disease.
  • -- Use Walking Man to draw ROC curve and
    calculate AUROC.

93
Outline of Topics
  • Prognostic Tests
  • Differences from diagnostic tests
  • Quantifying predictions: calibration and
    discrimination
  • Comparing predictions
  • Value of prognostic information
  • Combining Tests/Diagnostic Models
  • Importance of test non-independence
  • Recursive Partitioning
  • Logistic Regression
  • Variable (Test) Selection
  • Importance of validation separate from derivation

94
Please return game spinners to the red bag!