Biomarkers: The Good, The Bad and The Beautiful

1
Biomarkers: The Good, The Bad and The Beautiful
  • Enrique F. Schisterman, PhD
  • Division of Epidemiology, Statistics and
    Prevention
  • NICHD NIH - DHHS

2
  • Background
  • ROC Curve
  • Pooling Biomarkers
  • Confounding Adjustment
  • Measurement Error Corrections
  • Conclusions
  • Future Directions

3
Background
  • Biomarker: a specific physical trait used to
    measure or indicate the effects or progress of a
    disease or condition
  • Newly developed laboratory methods expand the
    number of biomarkers on a daily basis

4
Constraints when Investigating Biomarkers
  • Money: which marker to choose in large-scale
    studies, and how?
  • Measurement Issues
  • Random measurement error: when unaccounted for,
    it biases the results.
  • Limit of Detection (LOD): replacement values are
    used, which might bias the results towards or away
    from the null depending on the choice of replacement
    value. Usually researchers use 0, LOD/2,
    LOD/sqrt(2) or the LOD.
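
The effect of the replacement value can be seen in a small simulation. This is only a sketch: the log-normal marker, its parameters and the LOD of 0.5 are illustrative assumptions, not values from the talk. It replaces every value below the LOD with each of the four common choices and compares the resulting mean with the mean of the uncensored data.

```python
import math
import random
import statistics


def mean_with_lod_replacement(values, lod, replacement):
    """Mean after substituting `replacement` for every value below `lod`."""
    return statistics.fmean(replacement if v < lod else v for v in values)


random.seed(1)
# Hypothetical log-normal marker; parameters are for illustration only.
marker = [random.lognormvariate(0.0, 1.0) for _ in range(20000)]
lod = 0.5  # assumed limit of detection

true_mean = statistics.fmean(marker)
candidates = {
    "0": 0.0,
    "LOD/2": lod / 2,
    "LOD/sqrt(2)": lod / math.sqrt(2),
    "LOD": lod,
}
results = {
    name: mean_with_lod_replacement(marker, lod, value)
    for name, value in candidates.items()
}

for name, estimate in results.items():
    print(f"replace with {name:12s} mean = {estimate:.4f}  bias = {estimate - true_mean:+.4f}")
```

Replacing with 0 can only pull the mean down and replacing with the LOD can only push it up, so the direction of the bias depends on the replacement value chosen.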

5
  • Background
  • ROC Curve
  • Pooling Biomarkers
  • Limit of Detection
  • Measurement Error Corrections
  • Conclusions
  • Future Directions

6
ROC curve: background
  • Before a marker is used its discriminating
    accuracy needs to be evaluated
  • Most commonly used tool - Receiver Operating
    Characteristic (ROC) curve.
  • We consider only continuous markers

7
ROC curve: description
  • Two populations, healthy (H) and diseased (D),
    with corresponding measured markers $X_H$ and $X_D$,
    with cdfs $F_H$, $F_D$ and pdfs $f_H$ and $f_D$.

8
  • Choose a threshold c and classify individuals
    as
  • healthy (negative) if marker value < c, or
  • diseased (positive) if marker value ≥ c
  • Sensitivity: probability of a true positive
  • $\Pr(X_D > c) = 1 - F_D(c) = q$
  • Specificity: probability of a true negative
  • $\Pr(X_H \le c) = F_H(c) = p$
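
As a minimal sketch of these two definitions, the snippet below evaluates q and p at a few thresholds. The normal distributions and their parameters are assumptions chosen for illustration; any continuous cdfs could be used instead.

```python
from statistics import NormalDist

# Hypothetical marker distributions (illustrative values, not from the talk).
healthy = NormalDist(mu=0.0, sigma=1.0)   # X_H with cdf F_H
diseased = NormalDist(mu=1.5, sigma=1.0)  # X_D with cdf F_D


def sensitivity(c):
    """q = Pr(X_D > c) = 1 - F_D(c)."""
    return 1.0 - diseased.cdf(c)


def specificity(c):
    """p = Pr(X_H <= c) = F_H(c)."""
    return healthy.cdf(c)


for c in (0.0, 0.75, 1.5):
    print(f"c = {c:4.2f}  q = {sensitivity(c):.3f}  p = {specificity(c):.3f}")
```

Raising the threshold trades sensitivity for specificity, which is the trade-off the ROC curve traces out.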

9
Distribution of marker X in healthy and diseased
populations
10
  • ROC curve: graph of (1 − p, q) for all c.
  • $ROC(1-p) = 1 - F_D(F_H^{-1}(p))$
  • ROC defined over all possible thresholds gives
    the entire range of possible sensitivities and
    specificities.
  • Diagnostic accuracy is evaluated using measures
    including the area under the curve (AUC;
    $AUC = \Pr(X_H < X_D)$) and Youden's index (J)
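
The relation between the ROC formula and AUC = Pr(X_H < X_D) can be checked numerically. The sketch below assumes two normal populations with made-up parameters. It integrates the ROC curve by the trapezoid rule, compares it with the closed form for two normals, and also computes the empirical (Mann-Whitney) proportion of healthy/diseased pairs with x_H < x_D.

```python
from statistics import NormalDist

# Hypothetical populations (illustrative parameters only).
healthy = NormalDist(0.0, 1.0)
diseased = NormalDist(1.0, 1.5)


def roc(t):
    """Sensitivity at false-positive rate t = 1 - p: 1 - F_D(F_H^{-1}(1 - t))."""
    return 1.0 - diseased.cdf(healthy.inv_cdf(1.0 - t))


# Trapezoidal AUC on an interior grid (inv_cdf is undefined at 0 and 1).
n = 2000
points = [(0.0, 0.0)] + [(i / n, roc(i / n)) for i in range(1, n)] + [(1.0, 1.0)]
auc_curve = sum(
    (x1 - x0) * (y0 + y1) / 2.0 for (x0, y0), (x1, y1) in zip(points, points[1:])
)

# Closed form for two normals: Phi((mu_D - mu_H) / sqrt(var_H + var_D)).
auc_exact = NormalDist().cdf(
    (diseased.mean - healthy.mean) / (healthy.variance + diseased.variance) ** 0.5
)

# Empirical estimate: fraction of (healthy, diseased) pairs with x_H < x_D.
xh = healthy.samples(400, seed=1)
xd = diseased.samples(400, seed=2)
auc_mw = sum(h < d for h in xh for d in xd) / (len(xh) * len(xd))

print(f"AUC from curve = {auc_curve:.4f}, closed form = {auc_exact:.4f}, empirical = {auc_mw:.4f}")
```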

11
ROC curve example: Complete Separation
[Figure: marker distributions with threshold c, showing q(c) and p(c); ROC plot of q(c) against 1 − p(c) with the chance line]
q(c) = 1 and p(c) = 1 for some c
12
ROC curve example: Complete Overlap
[Figure: marker distributions with threshold c, showing q(c) and p(c)]
q(c) = 1 − p(c), for all c
13
ROC curve example: Partial Separation
[Figure: marker distributions with threshold c, showing q(c) and p(c); ROC plot of q(c) against 1 − p(c) with the chance line]
Sensitivity by 1 − Specificity, i.e., P(True
Pos.) by P(False Pos.), across all c
14
Youden's Index (J)
The greatest difference in the rates of a correct
versus an incorrect diagnosis by a continuous
biomarker.
15
Youden's index: optimal cutpoint
When the distributions of the diseased ($X_D$) and
healthy ($X_H$) are continuous and unimodal,
Youden's index occurs at the intersection ($c^*$)
of the pdfs, $f_H(c^*) = f_D(c^*)$, such that
$J = \max_c \{ q(c) + p(c) - 1 \} = F_H(c^*) - F_D(c^*)$,
where $c^*$ denotes the optimum cutpoint.
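
A grid search makes the link between Youden's index and the crossing of the pdfs concrete. This is a sketch under assumed normal distributions with equal variances (parameters invented for illustration). It maximizes q(c) + p(c) − 1 = F_H(c) − F_D(c) over a grid of cutpoints and checks that the two densities are nearly equal at the maximizer.

```python
from statistics import NormalDist

# Hypothetical populations (illustrative parameters only).
healthy = NormalDist(0.0, 1.0)
diseased = NormalDist(2.0, 1.0)


def youden_at(c):
    """q(c) + p(c) - 1, which simplifies to F_H(c) - F_D(c)."""
    return healthy.cdf(c) - diseased.cdf(c)


# Candidate cutpoints from -4 to 6 in steps of 0.001.
grid = [-4.0 + 0.001 * i for i in range(10001)]
c_star = max(grid, key=youden_at)
J = youden_at(c_star)
pdf_gap = abs(healthy.pdf(c_star) - diseased.pdf(c_star))

print(f"c* = {c_star:.3f}, J = {J:.4f}, |f_H(c*) - f_D(c*)| = {pdf_gap:.5f}")
```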
16
Parametric evaluation of ROC
  • Assume some parametric model, for example normal
    distributions for $X_H$ and $X_D$
  • Substitute parameters with sample estimates

17
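Youden's Index - Normal Case
Youden's index is found by substituting these
equations for sensitivity (q) and specificity (p)
at the cutpoint c.
Cases: $q(c) = \Phi\left(\frac{\mu_D - c}{\sigma_D}\right)$
and, Controls: $p(c) = \Phi\left(\frac{c - \mu_H}{\sigma_H}\right)$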
18
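Optimal Cut Point: Normal Case
  • A special occurrence of the optimum c is the
    midpoint between the means, $c^* = (\mu_H + \mu_D)/2$,
    when $\sigma_H = \sigma_D$
  • otherwise c must be calculated using a specific
    form of the
  • quadratic equation:

$c^* = \frac{\mu_H (b^2 - 1) - a + b\sqrt{a^2 + (b^2 - 1)\,\sigma_H^2 \ln(b^2)}}{b^2 - 1}$
Where $a = \mu_D - \mu_H$
and $b = \sigma_D / \sigma_H$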
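
Setting the two normal densities equal and solving the resulting quadratic in c gives a closed-form cutpoint. The sketch below implements that solution under the assumptions that both populations are normal and that the diseased mean exceeds the healthy mean. The function names and example parameters are hypothetical. Here a = mu_D − mu_H and b = sigma_D / sigma_H.

```python
import math
from statistics import NormalDist


def optimal_cutpoint(mu_h, sd_h, mu_d, sd_d):
    """Cutpoint where the two normal pdfs cross, assuming mu_d > mu_h."""
    a = mu_d - mu_h
    b = sd_d / sd_h
    if math.isclose(b, 1.0):
        # Equal variances: the optimum is the midpoint between the means.
        return (mu_h + mu_d) / 2.0
    root = math.sqrt(a * a + (b * b - 1.0) * sd_h ** 2 * math.log(b * b))
    return (mu_h * (b * b - 1.0) - a + b * root) / (b * b - 1.0)


def youden_index(mu_h, sd_h, mu_d, sd_d, c):
    """q(c) + p(c) - 1 with normal sensitivity and specificity."""
    phi = NormalDist().cdf
    return phi((mu_d - c) / sd_d) + phi((c - mu_h) / sd_h) - 1.0


# Illustrative parameters only.
params = (0.0, 1.0, 2.0, 2.0)  # mu_h, sd_h, mu_d, sd_d
c_opt = optimal_cutpoint(*params)
j_opt = youden_index(*params, c_opt)
print(f"c* = {c_opt:.4f}, J = {j_opt:.4f}")
```

For these parameters the two densities agree at the returned cutpoint, and moving the cutpoint slightly in either direction lowers the index.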
19
  • Background
  • ROC Curve
  • Pooling Biomarkers
  • Limit of Detection
  • Measurement Error Corrections
  • Conclusions
  • Future Directions

20
Background I
  • Michael Jordan, PhD, a leading investigator at
    NIH, is looking to evaluate a new biomarker that
    he believes discriminates extremely well between
    women delivering pre-term and full term.
  • The cost per measurement of the (continuous)
    biomarker is $1,000.

21
Background II
  • A power analysis revealed that 200 women who
    delivered pre-term and 200 women who delivered
    full-term are necessary to efficiently estimate
    the area under the ROC curve.
  • The cost of this small study is
  • $1,000 × 400 = $400,000.
  • His budget is $200,000.

22
What should he do?
  • Nothing
  • Do a study with 100 individual measurements and
    obtain an estimate of the area under the ROC
    curve with very large confidence intervals.
  • Develop new methodology

23
The New Methodology: Pooling
  • Specimens are separated by case status, i.e.,
    cases with cases, controls with controls
  • Specimens are grouped into p groups, each
    of size g, so that p × g = N
  • The individual specimens within each group are
    combined
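
The arithmetic behind the pooling design can be sketched as follows, using the cost of 1000 per assay, the budget of 200,000 and the 200 + 200 specimens from the example. Treating a pooled measurement as the average of the g individual values is an assumption of this sketch, and `pool_specimens` is a hypothetical helper.

```python
from statistics import fmean


def pool_specimens(values, g):
    """Average consecutive groups of g specimens, giving p = N / g pooled values."""
    if len(values) % g != 0:
        raise ValueError("number of specimens must be a multiple of the pool size")
    return [fmean(values[i:i + g]) for i in range(0, len(values), g)]


cost_per_assay = 1000
budget = 200_000
n_cases = n_controls = 200

affordable_assays = budget // cost_per_assay                  # assays the budget covers
total_specimens = n_cases + n_controls
pool_size = -(-total_specimens // affordable_assays)           # ceiling division

# Cases are pooled only with cases (controls would be handled the same way).
case_values = [float(v) for v in range(n_cases)]               # placeholder measurements
case_pools = pool_specimens(case_values, pool_size)

print(f"assays affordable: {affordable_assays}, pool size g: {pool_size}, "
      f"case pools p: {len(case_pools)}")
```

With a pool size of 2, all 400 specimens contribute to the analysis while only 200 assays are run.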

24
Pooling: Normal Case
Where
25
The parameters of the distribution of the
individual marker values can be estimated from
the pooled data as
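
A minimal sketch of recovering individual-level parameters from pooled measurements, assuming that each pooled value is the average of g independent normal specimens. Under that assumption the pooled values have mean mu and variance sigma^2 / g, so the pooled sample mean estimates mu and g times the pooled sample variance estimates sigma^2. The true values below are invented for the simulation.

```python
import random
from statistics import fmean, variance

random.seed(11)
mu, sigma = 5.0, 2.0      # hypothetical individual-level parameters
g, p = 4, 5000            # pool size and number of pools

# Each pooled measurement is modeled as the average of g individual values.
pooled = [fmean(random.gauss(mu, sigma) for _ in range(g)) for _ in range(p)]

mu_hat = fmean(pooled)                 # estimates mu
sigma2_hat = g * variance(pooled)      # estimates sigma^2, since Var(pool) = sigma^2 / g

print(f"mu_hat = {mu_hat:.3f} (true {mu}), sigma2_hat = {sigma2_hat:.3f} (true {sigma ** 2})")
```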
26
Under Normality and Pooling Assumptions
And the almost MLE Estimator is
27
Estimate of Youden's Index - Normal Case
28
Area Under ROC under Gamma
Where
29
The maximum likelihood estimate of these
parameters can be obtained numerically from the
observations on the pooled specimens
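
The same idea carries over to gamma-distributed markers. The sketch below is not the maximum likelihood procedure described above: it swaps in method-of-moments estimates for simplicity. It assumes a pooled value is the average of g independent Gamma(shape, scale) specimens, so the pooled mean is shape × scale and the pooled variance is shape × scale² / g. The AUC is then approximated by Monte Carlo with the estimated parameters. All parameter values are invented.

```python
import random
from statistics import fmean, variance


def pooled_gamma_moments(pooled, g):
    """Method-of-moments (shape, scale) of the individual gamma from pooled averages."""
    m = fmean(pooled)
    v = variance(pooled)
    scale = g * v / m
    shape = m / scale
    return shape, scale


def simulate_pools(shape, scale, g, n_pools):
    """Pooled measurements modeled as averages of g individual gamma values."""
    return [
        fmean(random.gammavariate(shape, scale) for _ in range(g))
        for _ in range(n_pools)
    ]


random.seed(21)
g, n_pools = 2, 5000
est_h = pooled_gamma_moments(simulate_pools(2.0, 1.0, g, n_pools), g)  # controls
est_d = pooled_gamma_moments(simulate_pools(2.0, 2.0, g, n_pools), g)  # cases

# Monte Carlo approximation of Pr(X_H < X_D) using the estimated parameters.
draws = 20000
auc_hat = sum(
    random.gammavariate(*est_h) < random.gammavariate(*est_d) for _ in range(draws)
) / draws

print(f"controls (shape, scale) = {est_h}, cases = {est_d}, AUC = {auc_hat:.3f}")
```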
30
Simulation - Bias, n = m = 200
31
Simulation - RMSE
32
Number of Assays Required to Reach Equivalency of
MSE for an Area of 0.7 Under Normal Assumptions
33
Number of Assays Required to Reach Equivalency
for an Area of 0.8 Under Normal Assumptions
34
Comparing multiple pooled biomarkers
Define $X^{p}_{ij}$ ($i = 1, 2$; $j = 1, \ldots, n$) as the jth pooled
observation for biomarker i, and p as the
pool size
35
Hypothesis testing: comparison of AUC for pooled
biomarkers
Where
Then, the null is rejected at significance level $\alpha$
if
And s is a consistent estimator of the standard
error S of
36
[Figure: power against pool size (1 is naïve), for AUC1 = 0.65, AUC2 = 0.70, ρ = 0.8 and 60 assays]
37
Example
  • Interleukin-6 is a biomarker that may potentially
    aid with the prediction of pre-term delivery.
  • Marker values in the 40 cases and 40 controls
    follow a gamma distribution.
  • Individual specimens were tested, g = 1
  • Pooled in pairs and tested again, g = 2
  • Pooled in pairs again and retested, g = 4

38
Example: operating characteristics under
pooling, gamma assumption
39
Example: distribution of IL-6 in cases and controls
40
  • Background
  • ROC Curve
  • Pooling Biomarkers
  • Limit of Detection
  • Measurement Error Corrections
  • Conclusions
  • Future Directions

41
Definitions
Un-pooled Specimens
Pooled Specimens
42
What Happens When a Biomarker Is Affected by an LOD?
[Figure: ROC plot of q(c) against 1 − p(c)]
43
Effect of Pooling
[Figure panels: Un-pooled and Pooled]
44
Likelihood Function
Differentiating with respect to
To obtain estimates of
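
As a rough sketch of likelihood-based estimation with an LOD, the code below handles only the simplest case: un-pooled normal data that are left-censored at the LOD. Observed values contribute the normal density and each value below the LOD contributes Pr(X < LOD). Instead of differentiating, it maximizes the log-likelihood by a crude grid search. The parameters, the LOD and the grid are assumptions for illustration, and the pooled likelihood from the talk is not reproduced here.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(5)
mu_true, sd_true, lod = 1.0, 1.0, 0.5   # hypothetical values
sample = [random.gauss(mu_true, sd_true) for _ in range(2000)]

observed = [x for x in sample if x >= lod]      # values at or above the LOD
n_below = len(sample) - len(observed)           # only the count is known below the LOD
n_obs = len(observed)
s1 = sum(observed)
s2 = sum(x * x for x in observed)


def log_likelihood(mu, sd):
    """Left-censored normal log-likelihood using sufficient statistics."""
    ss = s2 - 2.0 * mu * s1 + n_obs * mu * mu   # sum of (x - mu)^2 over observed values
    ll_obs = -n_obs * math.log(sd * math.sqrt(2.0 * math.pi)) - ss / (2.0 * sd * sd)
    ll_cens = n_below * math.log(NormalDist(mu, sd).cdf(lod))
    return ll_obs + ll_cens


grid = [0.5 + 0.01 * i for i in range(101)]     # 0.50 to 1.50 for both parameters
mu_hat, sd_hat = max(
    ((m, s) for m in grid for s in grid), key=lambda ms: log_likelihood(*ms)
)
naive_mean = fmean(observed)                    # ignores the censored values entirely

print(f"mu_hat = {mu_hat:.2f}, sd_hat = {sd_hat:.2f}, naive mean = {naive_mean:.2f}")
```

The naive mean of the observed values overshoots because it discards everything below the LOD, while the censored likelihood uses the count of those values.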
45
Number of Observations Above the LOD
[Figure: marker distributions with the LOD marked]
46
Efficiency of the maximum likelihood estimators
of mean and variance
47
  • Background
  • ROC Curve
  • Pooling Biomarkers
  • Limit of Detection
  • Measurement Error Corrections
  • Conclusions
  • Future Directions

48
Random Measurement Error
Observed Value = True Value + Random Measurement
Error
$x_i = X_i + e_i$, $i = 1, \ldots, n_x$
$y_j = Y_j + e_j$, $j = 1, \ldots, n_y$
with $X_i$, $Y_j$ and the errors $e$ independent of each other.
Reliability Index: quantifies the level of
measurement error.
49
Reliability Study
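Let $w_{ij} = W_i + e_{ij}$, $i = 1, \ldots, n_0$, $j = 1, \ldots, p_i$.
Also let $\bar{w}_i = \frac{1}{p_i}\sum_{j} w_{ij}$ and $N_0 = \sum_{i} p_i$.
Consequently,
$\hat{\sigma}_e^2 = \frac{\sum_{i}\sum_{j}(w_{ij} - \bar{w}_i)^2}{N_0 - n_0}$
is an unbiased estimator of $\sigma_e^2$.
Now, we have $E(s_x^2) = \sigma_X^2 + \sigma_e^2$, making
$s_x^2 - \hat{\sigma}_e^2$ an unbiased estimator of
$\sigma_X^2$. Similarly, $s_y^2 - \hat{\sigma}_e^2$ is unbiased
for $\sigma_Y^2$.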
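
A reliability study of this kind can be sketched in a few lines. The simulation below assumes additive, normally distributed measurement error with the same variance in the main study and in the reliability study. All numeric values are invented. It estimates the error variance from replicate measurements (the pooled within-subject variance) and subtracts it from the observed variance of the main-study marker.

```python
import random
from statistics import fmean, variance

random.seed(3)
sigma_x, sigma_e = 2.0, 1.0     # hypothetical true-value and error SDs
n_x, n0, reps = 4000, 2000, 2   # main-study size, reliability-study size, replicates

# Main study: one error-prone measurement per subject.
x_obs = [random.gauss(10.0, sigma_x) + random.gauss(0.0, sigma_e) for _ in range(n_x)]

# Reliability study: `reps` replicate measurements on each of n0 subjects.
replicates = []
for _ in range(n0):
    w_true = random.gauss(10.0, sigma_x)
    replicates.append([w_true + random.gauss(0.0, sigma_e) for _ in range(reps)])

# Pooled within-subject variance: squared deviations from each subject's own mean,
# divided by the total within-subject degrees of freedom.
ss_within = 0.0
for ws in replicates:
    w_bar = fmean(ws)
    ss_within += sum((w - w_bar) ** 2 for w in ws)
df_within = sum(len(ws) - 1 for ws in replicates)
sigma2_e_hat = ss_within / df_within

# Observed variance = true variance + error variance, so subtract the error part.
sigma2_x_hat = variance(x_obs) - sigma2_e_hat

print(f"error variance = {sigma2_e_hat:.3f}, corrected marker variance = {sigma2_x_hat:.3f}")
```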
50
Measurement error (ME) correction: cutpoint,
Youden's index
Substituting estimators that take ME into
account results in the corrected estimates
and
where and
,
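
To illustrate why the correction matters, the sketch below simplifies to two normal populations sharing a common variance, so the cutpoint is the midpoint of the means and J = 2Φ((μ_D − μ_H) / (2σ)) − 1. This equal-variance setup and the numbers are assumptions of the example, not the general corrected estimators. Using the observed variance (true plus error) understates J, while subtracting the error variance recovers it.

```python
from statistics import NormalDist


def youden_common_variance(mu_h, mu_d, var):
    """Youden's index and cutpoint for two normals sharing one variance."""
    cutpoint = (mu_h + mu_d) / 2.0
    j = 2.0 * NormalDist().cdf((mu_d - mu_h) / (2.0 * var ** 0.5)) - 1.0
    return j, cutpoint


# Hypothetical values for illustration.
mu_h, mu_d = 0.0, 2.0
true_var, error_var = 1.0, 0.5
observed_var = true_var + error_var   # what an error-prone assay would show

j_naive, c_naive = youden_common_variance(mu_h, mu_d, observed_var)
j_corrected, c_corrected = youden_common_variance(mu_h, mu_d, observed_var - error_var)

print(f"naive J = {j_naive:.4f}, corrected J = {j_corrected:.4f}, cutpoint = {c_corrected}")
```

In this symmetric setup the cutpoint is unaffected by the error. With unequal variances the cutpoint also shifts.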
51
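Measurement error (ME) correction: confidence
intervals
Using the delta method to approximate the
variance of $\hat{J}$, $\hat{J} \pm z_{1-\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\hat{J})}$ is a two-tailed $(1-\alpha)$
confidence interval for J. Similarly, via the
delta method, $\hat{c} \pm z_{1-\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\hat{c})}$ is a two-tailed $(1-\alpha)$
confidence interval for c.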
52
Simulation Study
[Figure: simulation results against the reliability index, R, with panels for n = (50, 50, 19), n = (100, 100, 49) and n = (1000, 1000, 299)]
53
Measurement error correction: TBARS example
54
  • Background
  • ROC Curve
  • Pooling Biomarkers
  • Limit of Detection
  • Measurement Error Corrections
  • Conclusions
  • Future Directions

55
Conclusions
  • Cost-effective methods have been developed for
    use of ROC curves to select markers with pooled
    samples and effectively estimate the
    discrimination ability when constrained by an LOD.
  • Methods have been developed to account for Limit
    of Detection adjustment in the evaluation of
    biomarkers with the ROC curve
  • Methods for correction for random measurement
    error, an important consideration in the design
    and analysis of markers, have been developed for
    use with the ROC curve

56
  • Background
  • ROC Curve
  • Pooling Biomarkers
  • Limit of Detection
  • Measurement Error Corrections
  • Conclusions
  • Future Directions

57
Future Directions
  • Linear combination of multiple markers corrected
    for LOD and ME
  • Logistic regression with Pooled data affected by
    LOD
  • Design issues with pooling
  • Vacation!!!!

58
Bibliography
  • Schisterman EF and Perkins N. Confidence
    Intervals for the Youden Index and Corresponding
    Optimal Cut-point. Communications in Statistics:
    Simulation and Computation.
  • Mumford SL, Schisterman EF, Vexler A and Liu A.
    Pooling Biospecimens and Limits of Detection:
    Effects on ROC Curve Analysis. Biostatistics.
    (In Press).
  • Perkins NJ and Schisterman EF. The
    Inconsistency of Optimal Cut-points Using Two
    ROC Based Criteria. American Journal of
    Epidemiology. (In Press).
  • Bondell H, Liu A and Schisterman EF. Statistical
    Inference Based on Pooled Data: A Moment-Based
    Estimating Equation Approach. Applied Statistics.
    (In Press).
  • Rotnitzky A, Faraggi D and Schisterman EF.
    Doubly robust estimators of the area under
    the Receiver Operating Characteristic curve in
    the presence of verification bias. Journal of the
    American Statistical Association, Theory and
    Methods. (In Press).
  • Schisterman EF, Vexler A, Whitcomb BW and Liu A.
    The Limitations due to Exposure Detection Limits
    for Regression Models. American Journal of
    Epidemiology. 2006; 163: 374-383.
  • Schisterman EF, Faraggi D and Reiser B.
    Estimation of the ROC Curve for Compound
    Distributions. Statistics in Medicine. 2006;
    25: 623-638.
59
And, as Dr. Miguel Hernan says: Life is
Simple. Eat, Sleep and Control for Confounding