Title: ROC Curves
1ROC Curves Wilcoxon and Mann-Whitney Tests
- Lindsay Jacks
- Tutorial Presentation
- CHL 5210 Categorical Data Analysis
- October 16th, 2007
2Outline
- Binary Classification Model
- ROC Curve
- Area under the ROC Curve
- Nonparametric Methods
- Mann-Whitney Test
- Wilcoxon Signed-Rank Test
- SAS Code
3Binary Classification Model
- True Positive: The actual value is positive and it is classified as positive
- False Negative (Type II Error): The actual value is positive but it is classified as negative
- True Negative: The actual value is negative and it is classified as negative
- False Positive (Type I Error): The actual value is negative but it is classified as positive
Confusion Matrix
4Evaluation Metrics
- True Positive Rate (TPR)
- TPR = Positives correctly classified / Total positives
- Also called sensitivity
- False Positive Rate (FPR)
- FPR = Negatives incorrectly classified / Total negatives
- Equal to 1 - Specificity
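As a minimal sketch of the two rates defined on this slide, the Python function below computes TPR and FPR from the four confusion-matrix counts. The function name `rates` and the counts in the example are hypothetical, chosen only for illustration.

```python
def rates(tp, fn, tn, fp):
    """Return (TPR, FPR) from the four confusion-matrix counts."""
    tpr = tp / (tp + fn)  # sensitivity: positives correctly classified / total positives
    fpr = fp / (fp + tn)  # 1 - specificity: negatives incorrectly classified / total negatives
    return tpr, fpr


# Hypothetical counts: 50 actual positives, 100 actual negatives.
tpr, fpr = rates(tp=40, fn=10, tn=80, fp=20)
print(f"TPR (sensitivity) = {tpr:.2f}, FPR (1 - specificity) = {fpr:.2f}")
```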
5ROC Curve
- Receiver Operating Characteristic (ROC) curve
- A technique for visualizing, organizing and selecting classifiers based on their performance
- Two-dimensional graph in which the TPR is plotted on the Y axis and the FPR is plotted on the X axis
- Sensitivity vs. (1 - Specificity)
- Depicts relative tradeoffs between benefits (true positives) and costs (false positives)
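One common way to obtain the points of such a graph, sketched here in Python, is to sweep a decision threshold over the classifier's scores and record (FPR, TPR) at each threshold. This is an illustrative construction, not a method taken from the slides; the function name `roc_points`, the scores and the labels are all hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, sweeping the threshold from high to low.

    labels are 1 for actual positives and 0 for actual negatives;
    an observation is classified positive when its score >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing classified positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical scores and true classes for six observations.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
for fpr, tpr in roc_points(scores, labels):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```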
6ROC Curve
- The relationship between sensitivity and specificity can be described in the graph below
- The best possible prediction method produces a point in the upper left corner, representing 100% sensitivity and 100% specificity
- If a diagnostic procedure has no predictive value, the relationship between sensitivity and specificity is linear
7ROC Space
- Each prediction result, or one instance of a confusion matrix, represents one point in the ROC space
- A completely random guess gives a point along the diagonal line (B)
- Points above the diagonal line (A, C′) indicate good classification results
- Points below the diagonal line (C) indicate poor results, worse than random guessing
8Area under ROC curve (AUC)
- The area under the ROC curve depends on the overlap of two normal distribution curves, one for normal subjects and one for diseased subjects
- The greater the overlap of the curves, the smaller the area under the ROC curve (the lower the predictive power of the test)
- The area of overlap indicates where the test cannot distinguish normal from disease
- When the normal distribution curves overlap totally, the ROC curve turns into a diagonal line
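The link between overlap and area can be illustrated with a small Python sketch. It assumes, purely for illustration, that test values are N(0, 1) in the normal group and N(d, 1) in the disease group; under that equal-variance assumption the AUC has the closed form Φ(d / √2). The function names are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def binormal_auc(separation):
    """AUC when scores are N(0, 1) for normals and N(separation, 1) for diseased.

    The difference of two such scores is N(separation, 2), so
    P(diseased score > normal score) = Phi(separation / sqrt(2)).
    """
    return normal_cdf(separation / sqrt(2.0))


# Smaller separation means more overlap and a smaller AUC;
# zero separation (total overlap) gives the diagonal line, AUC = 0.5.
for d in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(f"separation={d:.1f}  AUC={binormal_auc(d):.3f}")
```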
9Area under ROC curve (AUC)
- To compare classifiers we may want to reduce the ROC performance to a single scalar value representing expected performance
- → Calculate the AUC
- Since the AUC is a portion of the area of the unit square, its value will always be between 0 and 1
- However, because random guessing produces the diagonal line between (0, 0) and (1, 1), which has an area of 0.5, no realistic classifier should have an AUC less than 0.5
- An ideal classifier has an area of 1
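A minimal Python sketch of the calculation, assuming the ROC curve is given as a list of (FPR, TPR) points joined by straight lines and integrated with the trapezoidal rule. The function name `auc_trapezoid` and the example curves are hypothetical.

```python
def auc_trapezoid(points):
    """Area under a ROC curve given as (FPR, TPR) points, by the trapezoidal rule."""
    pts = sorted(points)  # order by FPR, then TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


diagonal = [(0.0, 0.0), (1.0, 1.0)]             # random guessing
ideal = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]    # passes through the upper left corner
example = [(0.0, 0.0), (0.2, 0.8), (1.0, 1.0)]  # hypothetical classifier

print(auc_trapezoid(diagonal))  # 0.5
print(auc_trapezoid(ideal))     # 1.0
print(round(auc_trapezoid(example), 3))
```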
10Area under ROC curve (AUC)
- Important statistical property: AUC is equivalent to the probability that the classifier will rank a randomly chosen positive instance higher than a randomly chosen negative instance
- This is equivalent to the Mann-Whitney statistic
- Comparing two ROC curves
- The graph represents the areas under two ROC curves, A and B. Classifier B has greater area and therefore better average performance
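The ranking interpretation can be checked directly with a short Python sketch: count, over all positive/negative pairs, how often the positive instance scores higher. That count is the Mann-Whitney U statistic for the positives, and dividing by the number of pairs gives the AUC. Counting ties as one half is a common convention assumed here; the function name and the scores are hypothetical.

```python
def auc_pairwise(pos_scores, neg_scores):
    """Estimate P(score of random positive > score of random negative).

    The pair count `wins` is the Mann-Whitney U statistic for the positives;
    AUC = U / (n_pos * n_neg). Ties count as 1/2 (an assumed convention).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores.
pos = [0.9, 0.8, 0.6, 0.55]
neg = [0.7, 0.5, 0.4, 0.3, 0.2]
print(auc_pairwise(pos, neg))  # 18 of 20 pairs are ranked correctly -> 0.9
```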
11ROC Curve Applications
- ROC analysis provides a tool to select possibly optimal models and to discard suboptimal ones
- Related to cost/benefit analysis of diagnostic decision making
- Widely used in medicine, radiology, psychology; recently becoming more popular in areas like machine learning and data mining
- The area under the ROC curve is equivalent to the Mann-Whitney statistic; however, summarizing the ROC curve into a single number loses information about the pattern
12Nonparametric Methods
- Parametric methods usually require the use of interval- or ratio-scaled data
- Nonparametric methods provide an alternative series of statistical methods that require no or very limited assumptions to be made about the data
- Require no assumptions about the population probability distributions
- → Distribution-free methods
13Mann-Whitney Test
- Also known as Mann-Whitney-Wilcoxon (MWW) or Wilcoxon rank-sum test
- A nonparametric alternative to the two-sample t-test which is based solely on the order in which the observations from the two samples fall
- Method for determining whether there is a difference between two populations
- Requirements
- Data must be ordinal or continuous measurements
- The two samples must be independent
14Mann-Whitney Test
- Null hypothesis H0: The two populations are identical.
- Process
- Combine independent samples into one sample ($n = n_1 + n_2$)
- Rank the combined data from lowest to highest values, with tied values being assigned the average of the tied rankings
- Compute T, the sum of the ranks for the observations in the first sample
- If the two populations are identical, the sum of the ranks of the first sample and those in the second sample should be close to the same value (for equal sample sizes; in general, the average ranks should be close)
- Compare the observed value of T to the sampling distribution of T for identical populations
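The ranking steps of the process can be sketched in Python as follows: combine the samples, assign average ranks to ties, and sum the ranks of the first sample. The function names and the two samples are hypothetical.

```python
def average_ranks(values):
    """Rank values from lowest (rank 1) to highest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum(sample1, sample2):
    """Return T, the sum of the combined-sample ranks of the observations in sample1."""
    combined = list(sample1) + list(sample2)
    ranks = average_ranks(combined)
    return sum(ranks[:len(sample1)])


# Hypothetical data; the three 15s occupy ranks 4, 5, 6 and each gets rank 5.
sample1 = [12, 15, 15, 20]
sample2 = [9, 11, 15, 18, 25]
print(rank_sum(sample1, sample2))  # 3 + 5 + 5 + 8 = 21.0
```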
15Mann-Whitney Test
- Sampling distribution of T for identical populations (under H0)
- Mean: $\mu_T = \frac{n_1 (n_1 + n_2 + 1)}{2}$
- Variance: $v_T = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$
- Test statistic: $z = \frac{T - \mu_T}{\sqrt{v_T}}$, asymptotically N(0,1) distribution
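A minimal Python sketch of the normal approximation on this slide. It applies the mean and variance exactly as stated, with no tie or continuity correction (an assumption of this sketch), and adds a two-sided p-value from the N(0,1) distribution. The function name and the input values are hypothetical; for samples this small an exact test would usually be preferred.

```python
from math import erfc, sqrt


def mann_whitney_z(T, n1, n2):
    """Return (z, two-sided p-value) for rank sum T of the first sample.

    Uses the mean and variance of T under H0, without tie or
    continuity corrections.
    """
    mean_T = n1 * (n1 + n2 + 1) / 2.0
    var_T = n1 * n2 * (n1 + n2 + 1) / 12.0
    z = (T - mean_T) / sqrt(var_T)
    p_two_sided = erfc(abs(z) / sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_two_sided


# Hypothetical values: rank sum T = 21 with n1 = 4 and n2 = 5.
z, p = mann_whitney_z(T=21.0, n1=4, n2=5)
print(f"z = {z:.3f}, two-sided p = {p:.3f}")
```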
16Wilcoxon Signed-Rank Test
- A nonparametric alternative to the paired t-test for the case of two related samples or repeated measurements on a single sample
- Method for determining whether there is a difference between two populations
- Requirements
- Data must be interval measurements
- Does not require assumptions about the form of
the distribution of the measurements
17Wilcoxon Signed-Rank Test
- Test assumes there is information in the magnitudes of the differences between paired observations, as well as the signs
- Null hypothesis H0: The two populations are identical.
- Process
- Compute the differences between the paired observations (discard any differences of zero)
- Rank the absolute value of the differences from lowest to highest, with tied differences being assigned the average ranking of their positions
- Give the ranks the sign of the original difference in the data
- Sum the signed ranks and determine whether the sum is significantly different from zero
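The steps of this process can be sketched in Python as follows. The function name `signed_rank_sum` and the paired measurements are hypothetical; the function returns the signed-rank sum T together with n, the number of non-zero differences.

```python
def signed_rank_sum(before, after):
    """Return (T, n): the sum of signed ranks and the number of non-zero differences."""
    # Step 1: paired differences, discarding zeros.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    # Step 2: rank the absolute differences, averaging the ranks of ties.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2.0 + 1.0  # positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    # Steps 3 and 4: attach the sign of each difference and sum.
    T = sum(r if d > 0 else -r for r, d in zip(ranks, diffs))
    return T, len(diffs)


# Hypothetical paired measurements on six subjects.
before = [10, 12, 9, 14, 11, 13]
after = [12, 11, 9, 17, 13, 9]
# Differences: 2, -1, 0 (discarded), 3, 2, -4 -> signed ranks 2.5, -1, 4, 2.5, -5
print(signed_rank_sum(before, after))  # (3.0, 5)
```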
18Wilcoxon Signed-Rank Test
- Sampling distribution of T for identical populations (under H0)
- Mean: $\mu_T = 0$
- Variance: $v_T = \frac{n(n+1)(2n+1)}{6}$
- Test statistic: $z = \frac{T}{\sqrt{v_T}}$, asymptotically N(0,1) distribution
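A short Python sketch of this normal approximation, applying the variance as stated on the slide with no tie correction (an assumption of this sketch) and adding a two-sided p-value from the N(0,1) distribution. The function name and the input values are hypothetical; with so few pairs an exact test would usually be preferred.

```python
from math import erfc, sqrt


def signed_rank_z(T, n):
    """Return (z, two-sided p-value) for signed-rank sum T over n non-zero differences."""
    var_T = n * (n + 1) * (2 * n + 1) / 6.0
    z = T / sqrt(var_T)  # the mean of T under H0 is 0
    p_two_sided = erfc(abs(z) / sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_two_sided


# Hypothetical values: signed-rank sum T = 3 from n = 5 non-zero differences.
z, p = signed_rank_z(T=3.0, n=5)
print(f"z = {z:.3f}, two-sided p = {p:.3f}")
```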
19SAS Code
- ROC Curve
- ROCPLOT macro
- Produces a plot showing the ROC curve associated with a fitted binary-response model
- Plot of the sensitivity against 1-specificity values associated with the observations' predicted event probabilities
- You must first run the LOGISTIC procedure to fit the desired model
20SAS Code
- ROC Curve
- ROC macro
- Nonparametric comparison of areas under correlated ROC curves
- Provides point and confidence interval estimates of each curve's area and of the pairwise differences among the areas
- Tests of the pairwise differences are also given
- You must first run the LOGISTIC procedure to fit each of the models whose ROC curves are to be compared
21SAS Code
- Mann-Whitney-Wilcoxon Test
- PROC NPAR1WAY WILCOXON;
- CLASS variable;
- VAR variable;
- EXACT WILCOXON;
- RUN;
- Wilcoxon Signed-Rank Test
- PROC UNIVARIATE;
- VAR variable;
- RUN;
- You must first perform a DATA step to create the difference; SAS will not calculate the difference in PROC UNIVARIATE