Title: Data Mining, Session 7: Evaluation in Context
1. Data Mining, Session 7: Evaluation in context
- Luc Dehaspe
- K.U.L. Computer Science Department
2. Course overview
- Session 2-3: Data preparation
- Data Mining
3. Evaluation in Context
Foster Provost and David Jensen. Evaluating
Knowledge Discovery and Data Mining. Tutorial
Notes for Fourth International Conference on
Knowledge Discovery and Data Mining (KDD 1998).
- Introduction
- Classifier evaluation
- Accuracy
- Other evaluation metrics
- Comparing rankings
- Costs and benefits
- Lift and cumulative response curves
- ROC curves
4. Introduction: Types of context for evaluation
- Analytic vs. Engineering
- Primary goal: understanding vs. building
- Evaluation must support primary goal
- The problem to be solved
- Does the evaluation capture the crucial details?
- Does a successful evaluation imply a successful
solution?
5. Introduction: Types of context for evaluation
- Example: scaling up
- Analytic questions
- Do I understand scaling up better?
- Why is my technique effective?
- Why scale up in the first place?
- Engineering questions
- Does my system scale up effectively?
- Are my assumptions valid?
- Is performance satisfactory?
6. Classifier evaluation: Accuracy
- Percent correct (1 - error rate)
- One aspect of performance is predictive ability: how well does a model classify?
- The vast majority of research results and many practical projects use accuracy (error rate) for comparison
- Obvious first step, the default metric
7. Accuracy: parameter estimation
- Infer the value of a population parameter based
on a sample statistic
8. Accuracy: parameter estimation
- Hold-out set: partition the data into training and test sets
- Cross-validation: create k systematic partitions (see the sketch below)
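A minimal sketch of both estimation schemes, assuming scikit-learn; the built-in dataset, the decision tree classifier, and the split sizes are illustrative placeholders, not part of the lecture.

```python
# Hold-out and k-fold accuracy estimation (illustrative dataset and model).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Hold-out: partition the data into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("hold-out accuracy:", model.score(X_test, y_test))

# Cross-validation: k systematic partitions (here k = 10), each fold used once as test set.
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=10)
print("10-fold CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```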
9. Accuracy: problematic issues
- Higher accuracy does not necessarily imply better performance on the target task
- The use of error rate often suggests insufficiently careful thought about the real objectives of the research (David Hand, Construction and Assessment of Classification Rules, 1997)
- Assumes equal misclassification costs
- Cost(false positive) ≠ cost(false negative)
- Assumes a (relatively) uniform class distribution
- In case of a 99:1 split, 99% accuracy is obtained by default
10. Accuracy: problematic issues
- Each classifier produces a confusion matrix
- Which classifier should be used?
- Under what conditions?
- Confusion-matrix-based evaluation metrics
- Accuracy = (TP + TN) / Total
- 7/14 vs. 7/14
- Recall = true positive rate (TPR) = sensitivity = TP / Actual(+)
- 3/9 vs. 5/9
- Precision = positive predictive value = TP / Predicted(+)
- 3/4 vs. 5/8
- False positive rate (FPR) = FP / Actual(-)
- 1/5 vs. 3/5
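A small Python sketch recomputing these metrics. The cell counts are reconstructed from the ratios on the slide (both classifiers total 14 cases with accuracy 7/14; classifier A has TPR 3/9, precision 3/4, FPR 1/5; classifier B has TPR 5/9, precision 5/8, FPR 3/5); the labels "A" and "B" are mine.

```python
# Metrics from the two confusion matrices on the slide:
# both score 7/14 accuracy but differ on TPR, precision and FPR.

def metrics(tp, fn, fp, tn):
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,      # (TP + TN) / Total
        "recall_TPR": tp / (tp + fn),       # TP / Actual(+)
        "precision_PPV": tp / (tp + fp),    # TP / Predicted(+)
        "FPR": fp / (fp + tn),              # FP / Actual(-)
    }

# Classifier A: TP=3, FN=6, FP=1, TN=4  -> 7/14, 3/9, 3/4, 1/5
# Classifier B: TP=5, FN=4, FP=3, TN=2  -> 7/14, 5/9, 5/8, 3/5
print("A:", metrics(3, 6, 1, 4))
print("B:", metrics(5, 4, 3, 2))
```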
11. Comparing rankings
- Rankers produce continuous output (e.g., in [0, 1])
- Rather than +/-
- Neural nets, decision trees, etc., can rank cases
- Probabilistic models
- Combine with a threshold to form a classifier
- Above threshold: +; below threshold: -
- One ranker can define many classifiers
12. Comparing rankings
Key question: what cut-off is appropriate?
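A sketch of how one ranker defines many classifiers as the cut-off moves; the scores and labels below are invented for illustration, not data from the lecture.

```python
import numpy as np

# A ranker assigns each case a score in [0, 1]; a threshold turns it into a classifier.
scores = np.array([0.95, 0.80, 0.70, 0.55, 0.40, 0.30, 0.20, 0.10])
labels = np.array([   1,    1,    0,    1,    0,    1,    0,    0])

for threshold in (0.25, 0.50, 0.75):
    predicted = (scores >= threshold).astype(int)   # above threshold -> +
    tp = np.sum((predicted == 1) & (labels == 1))
    fp = np.sum((predicted == 1) & (labels == 0))
    tpr = tp / np.sum(labels == 1)
    fpr = fp / np.sum(labels == 0)
    print(f"threshold={threshold:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```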
13. Actions have costs and benefits
- Medical diagnosis
- Cost of falsely indicating cancer is different from cost of missing a true cancer case
- Fraud detection
- Cost of falsely challenging a customer is different from cost of leaving fraud undetected
- Customer segmentation (mail targeting)
- Cost of contacting a non-buyer is different from cost of not contacting a buyer
- Benefit of not contacting a non-buyer is different from benefit of contacting a buyer
14. Precise comparison based on an objective function
- Must take costs and the +/- class distribution into account
- f(FP_A, FN_A, distrib, C_fp, C_fn)
- f(FP_B, FN_B, distrib, C_fp, C_fn)
- f(FP_C, FN_C, distrib, C_fp, C_fn)
- ...
- FP and FN counts come from the confusion matrices; the class distribution and the costs C_fp, C_fn are the target conditions
15. Example: calculate expected profit
- Profit =
- probability that a + will occur, times
- TPR times the benefit of correctly classifying a +
- plus (1 - TPR) times the cost of classifying a + as a -
- plus the probability that a - will occur, times
- (1 - FPR) times the benefit of correctly classifying a -
- plus FPR times the cost of classifying a - as a +
- More formally:
  Profit = p(+) * (TPR * b(Y,+) + (1 - TPR) * c(N,+))
         + p(-) * ((1 - FPR) * b(N,-) + FPR * c(Y,-))
- Choose the classifier that maximizes profit (see the sketch below)
- Notice that in the formula above, Benefit ≥ 0 and Cost ≤ 0
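A minimal sketch of the expected-profit formula applied to the two classifiers from the confusion-matrix slide. The class prior p(+) and the benefit/cost values are illustrative assumptions, not figures from the lecture.

```python
# Expected profit = p(+)*(TPR*b(Y,+) + (1-TPR)*c(N,+)) + p(-)*((1-FPR)*b(N,-) + FPR*c(Y,-))
def expected_profit(tpr, fpr, p_pos, b_yes_pos, c_no_pos, b_no_neg, c_yes_neg):
    p_neg = 1.0 - p_pos
    return (p_pos * (tpr * b_yes_pos + (1.0 - tpr) * c_no_pos)
            + p_neg * ((1.0 - fpr) * b_no_neg + fpr * c_yes_neg))

# (TPR, FPR) of the two classifiers from the confusion-matrix slide.
candidates = {"A": (3/9, 1/5), "B": (5/9, 3/5)}
for name, (tpr, fpr) in candidates.items():
    # Benefits >= 0, costs <= 0; all numbers below are made-up assumptions.
    profit = expected_profit(tpr, fpr, p_pos=0.1,
                             b_yes_pos=10.0, c_no_pos=0.0,
                             b_no_neg=0.0, c_yes_neg=-1.0)
    print(name, round(profit, 3))
# Choose the classifier that maximizes expected profit under these assumptions.
```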
16. Problems with objective functions
- Target conditions are difficult to know precisely
- Cost/benefit estimates often imprecise
- What if - 10?
- Class distributions may change over time
- Need for visual comparison framework that
separates classification performance from
cost/benefit and class distribution assumptions
17. Lift and cumulative response curves
18. Example: lift and profit
- Target condition
- Mailing for a product priced at 3; price per mailing: 2; maximum return on investment: 2
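The slide's mailing figures are not fully recoverable here, so the sketch below only illustrates the general mechanics of a cumulative response / lift curve on synthetic scores: rank prospects by model score and measure what fraction of all buyers falls in the top x% contacted.

```python
import numpy as np

# Synthetic prospects: higher score -> more likely buyer (illustrative data only).
rng = np.random.default_rng(0)
scores = rng.random(1000)
labels = (rng.random(1000) < scores * 0.2).astype(int)

order = np.argsort(-scores)                 # best-ranked prospects first
hits = np.cumsum(labels[order])             # buyers reached so far
frac_contacted = np.arange(1, 1001) / 1000
cum_response = hits / labels.sum()          # fraction of all buyers captured
lift = cum_response / frac_contacted        # ratio vs. mailing at random

for pct in (0.1, 0.2, 0.5):
    i = int(pct * 1000) - 1
    print(f"contact top {pct:.0%}: capture {cum_response[i]:.0%} of buyers, lift {lift[i]:.2f}")
```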
19. ROC curves
- Cumulative response curves
- Pro: separate classification performance from precise cost/benefit specifications
- Con: assume the target distribution is the same as the training distribution
- Solution: ROC curves
- Receiver Operating Characteristic analysis (from signal detection theory)
- Each classifier is represented by plotting its (FPR, TPR) pair
- Isomorphic to cumulative response curves
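A small sketch of ROC points, assuming scikit-learn's roc_curve and the same illustrative scores as in the ranking sketch: a ranker traces a whole curve as its threshold sweeps, while a discrete classifier is a single (FPR, TPR) point.

```python
import numpy as np
from sklearn.metrics import roc_curve

labels = np.array([1, 1, 0, 1, 0, 1, 0, 0])                       # illustrative labels
scores = np.array([0.95, 0.80, 0.70, 0.55, 0.40, 0.30, 0.20, 0.10])  # illustrative ranker scores

# One (FPR, TPR) point per threshold of the ranker.
fpr, tpr, thresholds = roc_curve(labels, scores)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold >= {th:.2f}: (FPR, TPR) = ({f:.2f}, {t:.2f})")

# The two discrete classifiers from the confusion-matrix slide as single ROC points.
print("A:", (1/5, 3/9), "B:", (3/5, 5/9))
```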
20. ROC curves
21. ROC curves: convex hull
- Identifies the best classifiers under any and all conditions
- Any point on the convex hull can be reached with a combination of classifiers on the hull (see the sketch below)
- New classifiers are only interesting if they extend the convex hull
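A short sketch of the interpolation argument behind the convex hull: applying classifier A with probability lam and classifier B otherwise gives an expected operating point on the segment between them in ROC space. The (FPR, TPR) pairs are the ones from the confusion-matrix slide; the mixture itself is the standard construction, not a method specific to this lecture.

```python
import numpy as np

A = np.array([1/5, 3/9])   # (FPR, TPR) of classifier A
B = np.array([3/5, 5/9])   # (FPR, TPR) of classifier B

for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
    mixed = lam * A + (1 - lam) * B     # expected operating point of the random mixture
    print(f"P(use A) = {lam:.2f} -> (FPR, TPR) = ({mixed[0]:.2f}, {mixed[1]:.2f})")
```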
22. Evaluation in Context
Foster Provost and David Jensen. Evaluating
Knowledge Discovery and Data Mining. Tutorial
Notes for Fourth International Conference on
Knowledge Discovery and Data Mining (KDD 1998).
- Introduction
- Classifier evaluation
- Accuracy
- Other evaluation metrics
- Comparing rankings
- Costs and benefits
- Lift and cumulative response curves
- ROC curves