Title: Receiver Operating Characteristic Curve (ROC) Analysis for Prediction Studies
1 Receiver Operating Characteristic Curve (ROC) Analysis for Prediction Studies
Ruth O'Hara, Helena Kraemer, Jerome Yesavage, Jean Thompson, Art Noda, Joy Taylor, Jared Tinklenberg
Stanford University, Department of Psychiatry and Behavioral Sciences
Stanford University School of Medicine
Sierra Pacific MIRECC, Veterans Affairs Palo Alto Health Care System
2 The Clinical Need for Signal Detection Procedures
- Clinical practice is often hit-or-miss therapy
- Try one thing; if that does not work, try another
- This is frustrating for the patient and expensive
- The goal: find the best treatment for the patient with specific characteristics
- New news in psychiatry; old hat in internal medicine
3 Receiver Operating Characteristic Curve (ROC) Analysis
- Signal detection technique
- Traditionally used to evaluate diagnostic tests
- Now employed to identify subgroups of a population at differential risk for a specific outcome (clinical decline, treatment response)
- Identifies moderators
4 Receiver Operating Characteristic Curve (ROC) Analysis
5 ROC Analysis: Historical Development (1)
- Derived from early radar in the WW2 Battle of Britain to address the question: how can the signals on the radar scan be accurately identified to predict the outcome of interest (enemy planes) when there are many extraneous signals (e.g., geese)?
6 ROC Analysis: Historical Development (2)
- True Positives: radar operator interpreted the signal as enemy planes and there were enemy planes (good result: no wasted resources)
- True Negatives: radar operator said no planes and there were none (good result: no wasted resources)
- False Positives: radar operator said planes, but there were none (geese: wasted resources)
- False Negatives: radar operator said no planes, but there were planes (bombs dropped: very bad outcome)
7 ROC Analysis: Historical Development
- Sensitivity: probability of correctly interpreting the radar signal as enemy planes among those times when enemy planes were actually coming
- SE = True Positives / (True Positives + False Negatives)
- Specificity: probability of correctly interpreting the radar signal as no enemy planes among those times when no enemy planes were actually coming
- SP = True Negatives / (True Negatives + False Positives)
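The two ratios on this slide can be checked with a few lines of Python. This is a minimal sketch; the function names and the radar counts are hypothetical and are not taken from the slides.

```python
def sensitivity(tp: int, fn: int) -> float:
    """SE = TP / (TP + FN): share of actual raids that were called as raids."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """SP = TN / (TN + FP): share of quiet periods that were called as quiet."""
    return tn / (tn + fp)


# Hypothetical counts for 100 radar scans (illustration only).
tp, fn, fp, tn = 18, 2, 8, 72
print(f"SE = {sensitivity(tp, fn):.2f}")  # 18 / (18 + 2)
print(f"SP = {specificity(tn, fp):.2f}")  # 72 / (72 + 8)
```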
8 ROC: Prediction of Enemy Planes by RAF Radar Operators
9 Receiver Operating Characteristic Curve (ROC) Analysis: Applications
10 ROC Analysis: Evaluating Medical Tests
- The evaluation of the ability of a diagnostic test to identify a disease involves considering:
- P = Prevalence: occurrence in the population of the outcome of interest (e.g., disease)
- True Positives
- True Negatives
- False Positives
- False Negatives
- P = Prevalence = True Positives + False Negatives
11 ROC Analysis: Medical Test Evaluation
- True Positives: test states you have the disease when you do have the disease
- True Negatives: test states you do not have the disease when you do not have the disease
- False Positives: test states you have the disease when you do not have the disease
- False Negatives: test states you do not have the disease when you do
12 ROC Analysis: Evaluating Medical Tests
- Sensitivity: the probability of having a positive test result among those with a positive diagnosis for the disease
- SE = True Positives / (True Positives + False Negatives)
- Specificity: the probability of having a negative test result among those with a negative diagnosis for the disease
- SP = True Negatives / (True Negatives + False Positives)
13 The Basic Tool: 2X2
        Test+       Test-
O+      TP (a)      FN (b)      P = a + b
O-      FP (c)      TN (d)      P' = 1 - P
        Q = a + c   Q' = 1 - Q
Sensitivity (SE) = a/P    Specificity (SP) = d/P'
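As an illustration of the 2X2 table, the sketch below tabulates paired 0/1 outcome and test results into the cell proportions a, b, c, d and derives P, Q, SE and SP from them. The class name, function name and data are hypothetical; the slides do not show how the ROC program stores the table.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TwoByTwo:
    """Cell proportions of the 2X2 table (a + b + c + d == 1)."""
    a: float  # TP: outcome present, test positive
    b: float  # FN: outcome present, test negative
    c: float  # FP: outcome absent, test positive
    d: float  # TN: outcome absent, test negative

    @property
    def p(self) -> float:
        """Prevalence, P = a + b."""
        return self.a + self.b

    @property
    def q(self) -> float:
        """Proportion testing positive, Q = a + c."""
        return self.a + self.c

    @property
    def se(self) -> float:
        """Sensitivity, a / P."""
        return self.a / self.p

    @property
    def sp(self) -> float:
        """Specificity, d / P'."""
        return self.d / (1 - self.p)


def tabulate(outcome: Sequence[int], test: Sequence[int]) -> TwoByTwo:
    """Count each (outcome, test) combination and convert to proportions."""
    n = len(outcome)
    pairs = list(zip(outcome, test))
    return TwoByTwo(
        a=pairs.count((1, 1)) / n,
        b=pairs.count((1, 0)) / n,
        c=pairs.count((0, 1)) / n,
        d=pairs.count((0, 0)) / n,
    )


# Hypothetical data for ten people (illustration only).
outcome = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
test = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
table = tabulate(outcome, test)
print(f"P = {table.p:.2f}, Q = {table.q:.2f}, SE = {table.se:.2f}, SP = {table.sp:.2f}")
```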
14 ROC: GDS (Test) for Diagnosis of Clinically Confirmed Depression
15 Which Test Do You Use? Medical Tests Evaluation
- GDS: SE = .80, SP = .85
- Beck Depression Inventory: SE = .85, SP = .75
- Major Depression Inventory: SE = .66, SP = .63
16 ROC Analysis
- ROC first calculates sensitivity and specificity
- Quality indices measure the quality of the sensitivity and specificity
- ROC computes the quality indices for each predictor to find the ones with optimal sensitivity and specificity
17 To Detect the Optimal Sensitivity and Specificity
- Depends on the relative CLINICAL importance of false negatives versus false positives.
- w = 1 means only false negatives matter.
- w = 0 means only false positives matter.
- w = 1/2 means both matter equally.
- Analytically: use weighted kappa.
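18 ROC Analysis
- TP, FN, FP and TN are expressed as proportions of the total sample
- P = TP + FN; P' = 1 - (TP + FN)
- Q = TP + FP; Q' = 1 - (TP + FP)
- EFF = TP + TN
- κ(0.5, 0) = [(TP + TN) - (TP + FN)(TP + FP) - (1 - (TP + FN))(1 - (TP + FP))] / [1 - (TP + FN)(TP + FP) - (1 - (TP + FN))(1 - (TP + FP))]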
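The kappa expression on this slide covers only the case w = 1/2. The sketch below computes it and also a general weighted kappa for any w between 0 and 1. The general form used here, (TP·TN - FP·FN) / (w·P·Q' + (1 - w)·P'·Q), is an assumption taken from Kraemer's published weighted kappa, not from the slides; at w = 1/2 it reduces algebraically to the slide's expression. Function names and data are hypothetical.

```python
def weighted_kappa(tp: float, fn: float, fp: float, tn: float, w: float = 0.5) -> float:
    """Weighted kappa k(w, 0); cells may be counts or proportions.

    Assumed general form (not given on the slide):
        (TP*TN - FP*FN) / (w*P*Q' + (1 - w)*P'*Q)
    """
    total = tp + fn + fp + tn
    tp, fn, fp, tn = tp / total, fn / total, fp / total, tn / total
    p = tp + fn  # prevalence P
    q = tp + fp  # proportion testing positive Q
    denom = w * p * (1 - q) + (1 - w) * (1 - p) * q
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def kappa_half_from_slide(tp: float, fn: float, fp: float, tn: float) -> float:
    """k(0.5, 0) written exactly as on the slide: (EFF - PQ - P'Q') / (1 - PQ - P'Q')."""
    total = tp + fn + fp + tn
    tp, fn, fp, tn = tp / total, fn / total, fp / total, tn / total
    p, q = tp + fn, tp + fp
    chance = p * q + (1 - p) * (1 - q)
    return ((tp + tn) - chance) / (1 - chance)


# Hypothetical cell proportions (illustration only).
cells = dict(tp=0.25, fn=0.05, fp=0.15, tn=0.55)
for w in (0.0, 0.5, 1.0):
    print(f"w = {w}: kappa = {weighted_kappa(**cells, w=w):.4f}")
```

With this form, w = 1 gives (SE - Q)/Q' and w = 0 gives (SP - Q')/Q, so the weight moves the index between a sensitivity-based and a specificity-based quality measure.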
19 ROC Plane and Curve
[Figure: ROC plane showing the ROC curve, the Ideal Point, and the Random ROC line, with the points (Q, Q) and (P, P) labeled.]
20 Receiver Operating Characteristic Curve (ROC) Analysis: Applications
- Identifying Predictors of Clinical Outcome
21 ROC Analysis: Prediction Studies (Dr. Kraemer)
- ROC can identify predictors/characteristics of patients that are at differential risk for a specific outcome of interest, e.g.:
- What are the characteristics of AD patients who are at risk for rapid decline and are high priority for treatment?
- What are the clinical predictors of Alzheimer Disease patients who are good responders (or poor responders) to cholinesterase inhibitor treatments?
- Useful in real-world clinical medicine, where multiple variables affect the clinical outcome and patients seldom have one pure diagnosis
22 ROC: Identifying Predictors of an Outcome
- 1. ROC relates a predictor (test) to the clinical outcome of interest (diagnosis/gold standard)
- 2. ROC searches all predictors and their associated cut-points
- 3. ROC determines which predictor and associated cut-point yields the optimal sensitivity and specificity for identifying the outcome of interest, yielding two groups at differential risk for the outcome
23 ROC: Identifying Predictors of an Outcome
- 4. ROC is an iterative process: it is rerun automatically within each group yielded in Step 3 to examine which predictor and associated cut-point may further divide the groups
- 5. ROC keeps searching within each resulting group until one of three stopping rules applies (see the Stopping Rules slide)
- 6. ROC thus identifies subgroups of individuals that are at increased risk for the outcome of interest
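Steps 1 to 6 can be sketched as a recursive search. This is not the ROC4 program: it is a simplified illustration that assumes kappa at w = 1/2 as the quality index, treats every predictor as a "value >= cut" rule, and uses a hypothetical minimum group size as its only stopping rule. All names and data are invented for the example.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, float]  # predictor values plus a 0/1 "outcome" entry


def kappa_half(rows: List[Row], name: str, cut: float) -> float:
    """Kappa at w = 1/2 for the rule 'predictor >= cut' against the outcome."""
    n = len(rows)
    tp = sum(1 for r in rows if r[name] >= cut and r["outcome"] == 1) / n
    fp = sum(1 for r in rows if r[name] >= cut and r["outcome"] == 0) / n
    fn = sum(1 for r in rows if r[name] < cut and r["outcome"] == 1) / n
    tn = sum(1 for r in rows if r[name] < cut and r["outcome"] == 0) / n
    p, q = tp + fn, tp + fp
    denom = 0.5 * p * (1 - q) + 0.5 * (1 - p) * q
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def best_split(rows: List[Row], predictors: List[str]) -> Optional[Tuple[float, str, float]]:
    """Steps 2-3: try every predictor and every cut-point, keep the best."""
    best = None
    for name in predictors:
        values = sorted({r[name] for r in rows})
        for cut in values[1:]:  # cuts that leave both sides non-empty
            # abs() so that predictors protective against the outcome also count
            k = abs(kappa_half(rows, name, cut))
            if best is None or k > best[0]:
                best = (k, name, cut)
    return best


def grow(rows: List[Row], predictors: List[str], min_n: int = 4) -> dict:
    """Steps 4-6: split, then rerun the search inside each resulting group."""
    node = {"n": len(rows), "risk": sum(r["outcome"] for r in rows) / len(rows)}
    split = best_split(rows, predictors) if len(rows) >= 2 * min_n else None
    if split is None or split[0] == 0:
        return node  # nothing left that separates the outcome
    _, name, cut = split
    low = [r for r in rows if r[name] < cut]
    high = [r for r in rows if r[name] >= cut]
    if min(len(low), len(high)) < min_n:
        return node  # hypothetical sample-size stopping rule
    node.update(predictor=name, cut=cut,
                low=grow(low, predictors, min_n), high=grow(high, predictors, min_n))
    return node


# Hypothetical data: the outcome depends only on age >= 70.
rows = [{"age": a, "female": a % 2, "outcome": int(a >= 70)} for a in range(62, 78)]
tree = grow(rows, ["age", "female"])
print(tree["predictor"], tree["cut"], tree["low"]["risk"], tree["high"]["risk"])
```

Each leaf of the returned dictionary is a subgroup with its own size and observed risk, which is the form of result described in Step 6.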
24 ROC Analysis: Advantages and Disadvantages
- No assumptions of normal distribution
- Multiple predictors can be evaluated simultaneously
- Indicates interactions among predictors
- Indicates cut-points on these predictors
- Yields clinically relevant information
- Non-hypothesis testing
- Requires large samples
- Capitalizes on chance: needs a stringent stopping rule
25 ROC Analysis: Procedure
- Start with large sample size
- Define the outcome of interest (always binary)
- Choose Success/Failure criteria
- Select predictor variables of interest (as many as you like)
- Run ROC Program that systematically finds best predictors for Success/Failure
26 The Basic Tool: 2X2
        RF+         RF-
O+      TP (a)      FN (b)      P = a + b
O-      FP (c)      TN (d)      P' = 1 - P
        Q = a + c   Q' = 1 - Q
Sensitivity (SE) = a/P    Specificity (SP) = d/P'
27 ROC: Identifying Predictors and Their Cut-points
- Dichotomous variables, such as gender:
- ROC calculates the Se and Sp for female vs. male
- Continuous variables, such as age:
- ROC would calculate Se and Sp for the cut-point of 60 vs. 61, 62, 63, ..., 85; then for the cut-point of 60, 61 vs. 62, 63, 64, ..., 85; and so forth.
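The cut-point scan for a continuous variable can be sketched as follows. The sketch assumes that values above the cut count as "test positive" (for example, older age predicting the outcome); the function name and the data are hypothetical.

```python
from typing import List, Sequence, Tuple


def scan_cutpoints(values: Sequence[float], outcomes: Sequence[int]) -> List[Tuple[float, float, float]]:
    """Return (cut, Se, Sp) for every rule 'value > cut'.

    Assumes at least one person with and one without the outcome.
    """
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    results = []
    # The largest value is skipped: nobody would fall above it.
    for cut in sorted(set(values))[:-1]:
        tp = sum(1 for v, o in zip(values, outcomes) if v > cut and o == 1)
        tn = sum(1 for v, o in zip(values, outcomes) if v <= cut and o == 0)
        results.append((cut, tp / n_pos, tn / n_neg))
    return results


# Hypothetical ages and 0/1 outcomes (illustration only).
ages = [60, 61, 62, 63, 64, 65]
outcome = [0, 0, 1, 0, 1, 1]
results = scan_cutpoints(ages, outcome)
for cut, se, sp in results:
    print(f"<= {cut} vs. > {cut}: Se = {se:.2f}, Sp = {sp:.2f}")
```

A dichotomous predictor coded 0/1 is the special case with a single possible cut.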
28 ROC: Gender as Predictor of Clinically Confirmed Depression
29 ROC: Identifying Predictors and Their Cut-points
- Dichotomous variables: ROC calculates the Se and Sp for female vs. male, aphasia vs. no aphasia, etc.
- Continuous variables, such as age:
- ROC would calculate Se and Sp for the cut-point of 60 vs. 61, 62, 63, ..., 85; then for the cut-point of 60, 61 vs. 62, 63, 64, ..., 85; and so forth.
30 ROC: Age as Predictor of Clinically Confirmed Depression
31 ROC: Age as Predictor of Clinically Confirmed Depression
32 Receiver Operating Characteristic Curve (ROC) Analysis
- Conducting the ROC: An Example
33 ROC Analysis: Procedure
- Start with large sample size
- Define the outcome of interest
- Choose Success/Failure criteria
- Identify predictor variables of interest
- Run ROC Program that systematically finds best predictors for Success/Failure
34 ROC Analysis: Example
- Population under investigation: 1,472 AD patients from 10 centers with a 12-month follow-up
- Clinically significant outcome: more rapid decline, defined as a loss of 3 or more MMSE points per year, post-visit
- O'Hara R et al. (2002). Which Alzheimer patients are at risk for rapid cognitive decline? J Geriatr Psychiatry Neurol 15(4):233-8.
35 Predictor Variables
- Age at patient visit
- Reported age of symptom onset
- Gender
- Years of education
- Ethnicity
- MMSE score
- Living Arrangement
- Presence of Aphasia
- Presence of Hallucinations
- Presence of Extrapyramidal Signs
37 Stopping Rules
- No more possibilities (rare!)
- Inadequate sample size
- Optimal test (if a priori) would not have been statistically significant (p < .001)
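The second and third stopping rules can be sketched as a check applied to the best split found at a node. The slides name neither the significance test nor the minimum sample size, so both are assumptions here: a Pearson chi-square on the 2X2 counts (1 degree of freedom, no continuity correction) compared with the p = .001 critical value of 10.828, and a hypothetical minimum of 10 people per resulting group. The first rule (no more possibilities) is left to the caller, which simply has no candidate split to pass in.

```python
CHI2_CRITICAL_P001 = 10.828  # chi-square critical value for 1 df at p = .001


def chi_square_2x2(tp: int, fn: int, fp: int, tn: int) -> float:
    """Pearson chi-square statistic for a 2X2 table of counts."""
    n = tp + fn + fp + tn
    row1, row2 = tp + fn, fp + tn  # outcome present / absent
    col1, col2 = tp + fp, fn + tn  # predictor positive / negative
    if 0 in (row1, row2, col1, col2):
        return 0.0  # a degenerate table carries no evidence of association
    return n * (tp * tn - fp * fn) ** 2 / (row1 * row2 * col1 * col2)


def should_stop(tp: int, fn: int, fp: int, tn: int, min_group: int = 10) -> bool:
    """True if the best available split should NOT be made."""
    if min(tp + fp, fn + tn) < min_group:
        return True  # inadequate sample size (threshold is hypothetical)
    return chi_square_2x2(tp, fn, fp, tn) < CHI2_CRITICAL_P001  # not significant at p < .001


# Hypothetical counts (illustration only).
print(should_stop(40, 10, 10, 40))  # strong association, large groups
print(should_stop(26, 24, 24, 26))  # almost no association
```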
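38 Figure 10.3
[Figure: ROC decision tree for the IHDP control group with outcome of low IQ at age 3 (w = 0.5). Node labels recovered from the figure are listed below; the branch layout is not recoverable from the transcript.]
- Root: N = 512 (100%), P = .53
- First split: minority status (non-minority, minority); nodes N = 191 (37%), P = .25 and N = 321 (63%), P = .70
- Second-level split labels: Bayley Mental Dev. Index < 115 vs. 115 and above; mother never attended college vs. mother attended college
- Second-level nodes: N = 110 (21%), P = .48; N = 87 (17%), P = .45; N = 104 (20%), P = .09; N = 211 (41%), P = .81
- Third-level split labels: Bayley Mental Dev. Index < 106 vs. 106 and above (shown twice); graduated from college vs. attended, did not graduate
- Third-level nodes: N = 131 (26%), P = .91; N = 80 (16%), P = .65; N = 30 (6%), P = .73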
39 ROC Plane and Swarm of Points
[Figure: ROC plane showing a swarm of points and the ROC curve.]
40 To Detect the Optimal Sensitivity and Specificity
- Depends on the relative CLINICAL importance of false negatives versus false positives.
- w = 1 means only false negatives matter.
- w = 0 means only false positives matter.
- w = 1/2 means both matter equally.
- Analytically: use weighted kappa.
- Geometrically: draw a line through the Ideal Point with slope determined by P and w. Push this line down until it just touches the ROC curve. That point is optimal.
41 ROC Analysis: Conclusion
- Yields clinically relevant information
- Identifies complex interactions
- Identifies individuals with different characteristics but at the same risk for the clinically relevant outcome
- Identifies individuals at the least risk
- Can take differential clinical costs of false positives and false negatives into account
42 Conclusion
- It is not sufficient to identify risk factors, or even to identify moderators and mediators or a structural model.
- It is necessary to present and interpret the results so that clinicians, policy makers, consumers, and other researchers can apply them.
- ROC trees are one method to accomplish this purpose.
43 Receiver Operating Characteristic Curve (ROC) Analysis
44 Using the ROC Program: A. How to Get the ROC Program
- Go to http://mirecc.stanford.edu
- Go to Top Information Requests
- Go to "ROC4 is available for download HERE."
- Double-click on HERE
- A pop-up window will give you the option to open or save the ROC4 zip file
- Best option is to save it to a folder you have already created, e.g. "ROC analysis"
45 Using the ROC Program: B. Opening the ROC Program
- Go to your ROC analysis folder
- Unzip the ROC4.zip file (some computers will automatically unzip when you double-click on it, or you may need to use an unzip program)
- Once unzipped, the following 5 files will appear:
- Read_Me.doc: a help file which explains what to do
- ROC4.19.exe: the actual ROC program
- rDemoData.bat: batch file that gets the ROC program to run
- Demo.txt: a demo data input file
- runDemoData.doc: a demo data output file
46 What the Files Look Like
47 Using the ROC Program: C. Preparing Data for ROC Program
- First prepare your data file
- Put your data in Excel form
- Your outcome measure should always be:
- Dichotomous
- Coded as a 1 or 0
- In the far right column
- All dichotomous predictor variables coded as a 1 or 0
- All missing data coded as 9999.99
- Remove all IDs or other non-predictor information
- Save your Excel data file as Text (Tab delimited)
- Give it a name that has no spaces. This will be your data input file
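For data that is not already in Excel, the same preparation rules can be applied in a short script. This is a sketch under stated assumptions: the function name and column names are hypothetical, and the slides do not say whether the ROC program expects a header row of variable names, so the `header` switch is a guess to be checked against Read_Me.doc.

```python
import csv
import os
from typing import Dict, List, Optional

MISSING_CODE = "9999.99"  # missing-data code given on the slide


def write_roc_input(rows: List[Dict[str, Optional[float]]], predictors: List[str],
                    outcome: str, path: str, header: bool = True) -> None:
    """Write a tab-delimited text file with the 0/1 outcome in the far right column.

    Only the listed predictors and the outcome are written, so IDs and other
    non-predictor fields in `rows` are dropped.
    """
    if " " in os.path.basename(path):
        raise ValueError("the file name must not contain spaces")
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t")
        if header:  # assumption: a header row of variable names is accepted
            writer.writerow(predictors + [outcome])
        for row in rows:
            if row[outcome] not in (0, 1):
                raise ValueError("the outcome must be coded as 1 or 0")
            cells = [MISSING_CODE if row.get(name) is None else row[name]
                     for name in predictors]
            writer.writerow(cells + [row[outcome]])
```

A call such as `write_roc_input(rows, ["age", "female"], "decline", "Workshopdata.txt")` would then produce the data input file, with `rows` being a list of dictionaries, one per patient.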
48 What Your Data Input File Looks Like
49 Using the ROC Program: D. Executing the ROC Program
- Open up Microsoft Word
- Within Word, open the rDemoData batch file
- It will open to read as follows:
- echo "Program running- check folder with output file and REFRESH to confirm running
- roc4.19 Demo.txt 50 > runDemoData.doc
- Where you see "Demo", replace it with the name of your data input file
- Where you see "runDemoData", replace it with the name of your data output file
- Then save your new batch file with a new name and put .bat at the end of the name (the easiest name is one associated with the data names you have assigned).
50 Using the ROC Program: D. An Example of Executing the ROC Program
- Helena has data entitled Workshop, saved as text and now called Workshopdata.txt
- Within Word, open the rDemoData batch file, which reads as follows:
- echo "Program running- check folder with output file and REFRESH to confirm running
- roc4.19 Demo.txt 50 > runDemoData.doc
- "Demo" is replaced with "Workshopdata"
- "runDemoData" is replaced with "runWorkshopdata"
- The new batch file is saved as rWorkshopdata.bat and reads:
- echo "Program running- check folder with output file and REFRESH to confirm running
- roc4.19 Workshopdata.txt 50 > runWorkshopdata.doc
- Double left-click on the new batch file and, as if by magic, your output file entitled runWorkshopdata.doc will appear
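The name substitution in this example can also be done by a small script instead of editing the batch file in Word. The sketch below only reproduces the two lines of the demo batch file with the data name swapped in; the function name is hypothetical, the `50` argument is copied unchanged because the slides do not explain it, and the echo line is kept as it appears on the slide.

```python
def batch_file_text(data_name: str) -> str:
    """Return the batch file contents for <data_name>.txt, following the demo file."""
    if " " in data_name:
        raise ValueError("the data file name must not contain spaces")
    return (
        'echo "Program running- check folder with output file and REFRESH to confirm running\n'
        # "50" is copied from the demo batch file; its meaning is not given on the slides.
        f"roc4.19 {data_name}.txt 50 > run{data_name}.doc\n"
    )


text = batch_file_text("Workshopdata")
print(text)
```

Saving this text as rWorkshopdata.bat in the ROC analysis folder corresponds to the manual steps on this slide.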
51 Using the ROC Program: E. How to Read Your Output File
- Open up your data output file, which will be in Word
- Select All
- Change the font size to 6
- Go to Page Setup and change from Portrait to Landscape
- Expand your margins if you are still getting wrap-around