1
Data Mining, Session 7: Evaluation in context
  • Luc Dehaspe
  • K.U.L. Computer Science Department

2
Course overview
Sessions 2-3: Data preparation
Data Mining
3
Evaluation in Context
Foster Provost and David Jensen. Evaluating
Knowledge Discovery and Data Mining. Tutorial
Notes for Fourth International Conference on
Knowledge Discovery and Data Mining (KDD 1998).
  • Introduction
  • Classifier evaluation
  • Accuracy
  • Other evaluation metrics
  • Comparing rankings
  • Costs and benefits
  • Lift and cumulative response curves
  • ROC curves

4
Introduction: Types of context for evaluation
  • Analytic vs. engineering
  • Primary goal: understanding vs. building
  • Evaluation must support the primary goal
  • The problem to be solved
  • Does the evaluation capture the crucial details?
  • Does a successful evaluation imply a successful solution?

5
Introduction: Types of context for evaluation
  • Example: scaling up
  • Analytic questions
  • Do I understand scaling up better?
  • Why is my technique effective?
  • Why scale up in the first place?
  • Engineering questions
  • Does my system scale up effectively?
  • Are my assumptions valid?
  • Is performance satisfactory?

6
Classifier evaluation: Accuracy
  • Percent correct (1 - error rate)
  • One aspect of performance is predictive ability: how well does a model classify?
  • The vast majority of research results and many practical projects use accuracy (error rate) for comparison
  • Obvious first step, default metric

7
Accuracy: parameter estimation
  • Infer the value of a population parameter based
    on a sample statistic

8
Accuracy: parameter estimation
  • Hold-out set: partition the data into training and test sets
  • Cross-validation: create k systematic partitions (folds); see the sketch below
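A minimal sketch of both estimators in Python; the dataset, classifier, and split sizes below are illustrative choices, not taken from the course.

```python
# Hold-out and k-fold cross-validation estimates of accuracy.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)

# Hold-out: partition into a training set and a test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
holdout_acc = DecisionTreeClassifier(random_state=0).fit(X_train, y_train).score(X_test, y_test)

# Cross-validation: k systematic partitions, each used once as the test fold.
cv_accs = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=10)

print(f"hold-out accuracy: {holdout_acc:.3f}")
print(f"10-fold CV accuracy: {cv_accs.mean():.3f} +/- {cv_accs.std():.3f}")
```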

9
Accuracy: problematic issues
  • Higher accuracy does not necessarily imply better performance on the target task
  • "The use of error rate often suggests insufficiently careful thought about the real objectives of the research" (David Hand, Construction and Assessment of Classification Rules, 1997)
  • Assumes equal misclassification costs
  • Cost(false positive) ≠ Cost(false negative)
  • Assumes a (relatively) uniform class distribution
  • With a 99/1 split, the default accuracy is already 99%

10
Accuracy: problematic issues
  • Each classifier produces a confusion matrix
  • Which classifier should be used?
  • Under what conditions?
  • Confusion-matrix based evaluation metrics (see the sketch after this list)
  • Accuracy = (TP + TN) / Total
  • 7/14 vs. 7/14
  • Recall = True positive rate (TPR) = Sensitivity = TP / Actual(+)
  • 3/9 vs. 5/9
  • Precision = Positive predictive value = TP / Predicted(+)
  • 3/4 vs. 5/8
  • False positive rate (FPR) = FP / Actual(-)
  • 1/5 vs. 3/5
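A small Python helper that computes these metrics directly from the four confusion-matrix counts. The function name is mine, and the example counts are inferred from the first classifier's ratios on the slide (accuracy 7/14, recall 3/9, precision 3/4, FPR 1/5).

```python
# Confusion-matrix based metrics for a binary classifier.
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    actual_pos = tp + fn          # Actual(+)
    actual_neg = fp + tn          # Actual(-)
    predicted_pos = tp + fp       # Predicted(+)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "recall_tpr": tp / actual_pos,        # recall = TPR = sensitivity
        "precision_ppv": tp / predicted_pos,  # precision = positive predictive value
        "fpr": fp / actual_neg,               # false positive rate
    }

# Counts consistent with the slide's first classifier:
# accuracy 7/14, recall 3/9, precision 3/4, FPR 1/5.
print(confusion_metrics(tp=3, fp=1, fn=6, tn=4))
```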

11
Comparing rankings
  • Rankers produce continuous output (e.g., in [0,1]) rather than +/-
  • Neural nets, decision trees, etc., can rank cases
  • Probabilistic models
  • Combine with a threshold to form a classifier
  • Above the threshold: +; below the threshold: -
  • One ranker can define many classifiers (see the sketch below)
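A sketch of how one ranker's scores yield a family of classifiers by sweeping the threshold; the scores and threshold values below are invented for illustration.

```python
# One ranker (continuous scores in [0, 1]) defines a family of classifiers,
# one per threshold: score >= threshold -> "+", otherwise "-".
scores = [0.95, 0.80, 0.70, 0.55, 0.40, 0.30, 0.10]   # illustrative ranker output

def classify(scores, threshold):
    return ["+" if s >= threshold else "-" for s in scores]

for threshold in (0.25, 0.50, 0.75):
    print(threshold, classify(scores, threshold))
```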

12
Comparing rankings
Key question: what cut-off is appropriate?
13
Actions have costs and benefits
  • Medical diagnosis
  • Cost of falsely indicating cancer is different
    from cost of missing a true cancer case
  • Fraud detection
  • Cost of falsely challenging customer is different
    from cost of leaving fraud undetected
  • Customer segmentation (mail-targeting)
  • Cost of contacting a non-buyer is different from
    cost of not contacting a buyer
  • Benefit of not contacting a non-buyer is
    different from benefit of contacting a buyer

14
Precise comparison based on an objective function
  • Must take costs and the +/- class distribution into account
  • f(FP_A, FN_A, distrib, C_fp, C_fn)
  • f(FP_B, FN_B, distrib, C_fp, C_fn)
  • f(FP_C, FN_C, distrib, C_fp, C_fn)
  • ...

(FP and FN come from the confusion matrices; distrib, C_fp, C_fn are the target conditions)
15
Example: calculate expected profit
  • Profit =
  • the probability that a + will occur, times
  • TPR times the benefit of correctly classifying a +
  • plus (1 - TPR) times the cost of classifying a + as a -
  • plus the probability that a - will occur, times
  • (1 - FPR) times the benefit of correctly classifying a -
  • plus FPR times the cost of classifying a - as a +
  • More formally:
  • Profit = p(+) · (TPR · b(Y,+) + (1 - TPR) · c(N,+)) + p(-) · ((1 - FPR) · b(N,-) + FPR · c(Y,-))
  • Choose the classifier that maximizes profit (see the sketch below)
  • Notice that in the formula above, Benefit ≥ 0 and Cost ≤ 0
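A direct transcription of that formula in Python. The classifier (TPR, FPR) pairs reuse the ratios from the confusion-matrix slide, while p(+) and the cost/benefit values are placeholders, not from the slides.

```python
# Expected profit of a classifier, following the slide's formula:
#   Profit = p(+) * (TPR*b(Y,+) + (1-TPR)*c(N,+))
#          + p(-) * ((1-FPR)*b(N,-) + FPR*c(Y,-))
# Benefits are >= 0, costs are <= 0.
def expected_profit(tpr, fpr, p_pos, b_yp, c_np, b_nn, c_yn):
    p_neg = 1.0 - p_pos
    return (p_pos * (tpr * b_yp + (1 - tpr) * c_np)
            + p_neg * ((1 - fpr) * b_nn + fpr * c_yn))

# Choose the classifier that maximizes expected profit (costs/benefits are illustrative).
classifiers = {"A": (0.33, 0.20), "B": (0.56, 0.60)}   # name -> (TPR, FPR)
best = max(classifiers,
           key=lambda name: expected_profit(*classifiers[name],
                                            p_pos=0.1, b_yp=10, c_np=-1, b_nn=0, c_yn=-2))
print("best classifier:", best)
```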

16
Problems with objective functions
  • Target conditions are difficult to know precisely
  • Cost/benefit estimates are often imprecise
  • What if they are off by ±10?
  • Class distributions may change over time
  • Need for a visual comparison framework that separates classification performance from cost/benefit and class-distribution assumptions

17
Lift and cumulative response curves
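The slide itself shows the curves as figures; below is a minimal sketch of how a cumulative response curve and lift are computed from a ranker's scores. The labels and scores are invented for illustration.

```python
import numpy as np

# Cumulative response: of all positives, what fraction is captured in the
# top x% of cases ranked by score?  Lift = that fraction divided by x.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 0])                # illustrative labels
scores = np.array([0.9, 0.8, 0.75, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1])

order = np.argsort(-scores)                 # best-ranked cases first
hits = np.cumsum(y_true[order])             # positives captured so far
frac_targeted = np.arange(1, len(y_true) + 1) / len(y_true)
cumulative_response = hits / y_true.sum()
lift = cumulative_response / frac_targeted

for x, cr, l in zip(frac_targeted, cumulative_response, lift):
    print(f"top {x:.0%}: cumulative response {cr:.2f}, lift {l:.2f}")
```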
18
Example: lift and profit
  • Target condition:
  • Mailing for a product priced at 3; price per mailing 2

Maximum return on investment: 2
19
ROC curves
  • Cumulative response curves:
  • pro: separate classification performance from precise cost/benefit specifications
  • con: assume the target distribution is the same as the training distribution
  • Solution: ROC curves
  • Receiver Operating Characteristic analysis (from signal detection theory)
  • Each classifier is represented by plotting its (FPR, TPR) pair (see the sketch after this list)
  • Isomorphic to cumulative response curves
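A sketch of placing classifiers in ROC space, reusing the confusion-matrix counts inferred earlier; the function name is mine.

```python
# Each classifier is a single point (FPR, TPR) in ROC space.
def roc_point(tp, fp, fn, tn):
    return fp / (fp + tn), tp / (tp + fn)   # (FPR, TPR)

# The two classifiers from the confusion-matrix slide (counts inferred there).
print("A:", roc_point(tp=3, fp=1, fn=6, tn=4))   # (0.2, 0.33)
print("B:", roc_point(tp=5, fp=3, fn=4, tn=2))   # (0.6, 0.56)
```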

20
ROC Curves
21
ROC Curves: Convex hull
  • Identifies the best classifiers under any and all conditions
  • Any point on the convex hull can be reached with a combination of classifiers on the hull
  • New classifiers are only interesting if they extend the convex hull (see the sketch below)
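A minimal sketch of the hull construction under the usual convention: sort the (FPR, TPR) points, add the trivial classifiers (0,0) and (1,1), and keep only the upper hull. The two extra points in the example are invented; the other two reuse the classifiers inferred earlier.

```python
# ROC convex hull: keep only classifiers on the upper hull of the (FPR, TPR)
# cloud; every other classifier is dominated by a point (or mixture of points)
# on the hull under every cost/benefit and class-distribution setting.
def roc_convex_hull(points):
    # Add the trivial "always negative" (0,0) and "always positive" (1,1) classifiers.
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Pop the previous point while it lies on or below the chord to p
        # (counter-clockwise or collinear turn), i.e. while it is dominated.
        while len(hull) >= 2:
            (ox, oy), (ax, ay) = hull[-2], hull[-1]
            cross = (ax - ox) * (p[1] - oy) - (ay - oy) * (p[0] - ox)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

print(roc_convex_hull([(0.2, 1/3), (0.6, 5/9), (0.3, 0.8), (0.5, 0.4)]))
# -> [(0.0, 0.0), (0.3, 0.8), (1.0, 1.0)]: only the hull classifiers survive.
```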

22
Evaluation in Context
Foster Provost and David Jensen. Evaluating
Knowledge Discovery and Data Mining. Tutorial
Notes for Fourth International Conference on
Knowledge Discovery and Data Mining (KDD 1998).
  • Introduction
  • Classifier evaluation
  • Accuracy
  • Other evaluation metrics
  • Comparing rankings
  • Costs and benefits
  • Lift and cumulative response curves
  • ROC curves