Fraud Detection Experiments

Title: JAM: Java Agents for Meta-Learning over Distributed Databases
Author: Andreas Leonidas Prodromidis
Last modified by: Andreas Leonidas Prodromidis

1
Fraud Detection Experiments
  • Chase Credit Card
    • 500,000 records spanning one year
    • Evenly distributed
    • 20% fraud, 80% non-fraud
  • First Union Credit Card
    • 500,000 records spanning one year
    • Unevenly distributed
    • 15% fraud, 85% non-fraud

2
Intra-bank experiments
  • Classifier selection algorithm: Coverage/TP-FP
    • Let V be the validation set
    • Until no more examples in V can be covered:
      • Select the classifier with the highest TP-FP rate on V
      • Remove the covered examples from V
  • Setting
    • 12 subsets
    • 5 algorithms (Bayes, C4.5, CART, ID3, Ripper)
    • 6-fold cross-validation
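
The Coverage/TP-FP loop above can be sketched roughly as follows. This is an illustrative sketch, not the presenter's implementation: the `Classifier` interface, the function names, and the toy data are hypothetical, and it assumes that a classifier "covers" an example when it classifies that example correctly.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[dict, int]          # (features, label); 1 = fraud, 0 = non-fraud
Classifier = Callable[[dict], int]  # hypothetical interface: features -> 0 or 1


def tp_fp_spread(clf: Classifier, examples: List[Example]) -> float:
    """TP rate minus FP rate of clf on the given examples."""
    frauds = [x for x, y in examples if y == 1]
    legit = [x for x, y in examples if y == 0]
    tp = (sum(clf(x) == 1 for x in frauds) / len(frauds)) if frauds else 0.0
    fp = (sum(clf(x) == 1 for x in legit) / len(legit)) if legit else 0.0
    return tp - fp


def select_by_coverage(classifiers: Dict[str, Classifier],
                       validation: List[Example]) -> List[str]:
    """Greedily pick classifiers until no remaining example can be covered."""
    remaining = list(validation)
    pool = dict(classifiers)
    selected: List[str] = []
    while remaining:
        # Assumption: "covered" means "classified correctly".
        useful = [name for name, clf in pool.items()
                  if any(clf(x) == y for x, y in remaining)]
        if not useful:
            break
        best = max(useful, key=lambda n: tp_fp_spread(pool[n], remaining))
        clf = pool.pop(best)
        selected.append(best)
        # Remove the covered examples from V.
        remaining = [(x, y) for x, y in remaining if clf(x) != y]
    return selected


# Toy validation set with two hypothetical features.
toy_validation = [
    ({"amt": 500, "hour": 12}, 1),
    ({"amt": 20, "hour": 3}, 1),
    ({"amt": 30, "hour": 14}, 0),
    ({"amt": 200, "hour": 15}, 0),
]
toy_classifiers = {
    "high_amount": lambda x: int(x["amt"] > 100),
    "night": lambda x: int(x["hour"] < 6),
    "never": lambda x: 0,
}
chosen = select_by_coverage(toy_classifiers, toy_validation)
```

The selection order matters here: each round re-scores the candidates on the examples that are still uncovered, so later picks are the ones that complement the earlier ones.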

3
TP-FP vs number of classifiers
  • Input base classifiers: Chase
  • Test data set: Chase
  • Best meta-classifier: Naïve Bayes with 25-32 base
    classifiers.

4
TP-FP vs number of classifiers
  • Input base classifiers: First Union
  • Test data set: First Union
  • Best meta-classifier: Naïve Bayes with 10-17
    base classifiers.

5
Accuracy vs number of classifiers
  • Input base classifiers: Chase
  • Test data set: Chase
  • Best meta-classifier: Ripper with 50 base
    classifiers. Comparable performance is attained
    with 25-30 classifiers.

6
Accuracy vs number of classifiers
  • Input base classifiers: First Union
  • Test data set: First Union
  • Best meta-classifier: Ripper with 13 base
    classifiers.

7
Intra-bank experiments
  • Combined coverage and cost-model metric algorithm
    • Let V be the validation set
    • Until no more examples in V can be covered:
      • Select the classifier Cj that achieves the highest
        savings on V
      • Remove the covered examples from V
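
A rough sketch of the savings-driven variant follows. The slides do not define how savings are computed, so the `savings` function here is an assumption: a caught fraud recovers its amount minus the overhead, a false alarm costs the overhead, and the overhead of 75 is the figure quoted on the later cost slides. All names and the prediction-table layout are hypothetical.

```python
from typing import Dict, List, Tuple

Txn = Tuple[float, int]  # (amount, label); 1 = fraud, 0 = non-fraud
OVERHEAD = 75.0          # overhead figure quoted on the cost slides


def savings(preds: List[int], txns: List[Txn]) -> float:
    """Assumed savings: a caught fraud recovers amount - OVERHEAD,
    a false alarm costs OVERHEAD, unflagged transactions contribute 0."""
    total = 0.0
    for pred, (amount, label) in zip(preds, txns):
        if pred == 1:
            total += (amount - OVERHEAD) if label == 1 else -OVERHEAD
    return total


def select_by_savings(predictions: Dict[str, List[int]],
                      txns: List[Txn]) -> List[str]:
    """Greedily pick the classifier with the highest savings on the
    still-uncovered part of the validation set V."""
    remaining = list(range(len(txns)))  # indices of uncovered examples
    pool = dict(predictions)
    chosen: List[str] = []
    while remaining:
        # Assumption: "covered" means "classified correctly".
        useful = [name for name, p in pool.items()
                  if any(p[i] == txns[i][1] for i in remaining)]
        if not useful:
            break
        best = max(useful, key=lambda n: savings(
            [pool[n][i] for i in remaining], [txns[i] for i in remaining]))
        chosen.append(best)
        preds = pool.pop(best)
        remaining = [i for i in remaining if preds[i] != txns[i][1]]
    return chosen


# Toy data: per-classifier predictions on four validation transactions.
toy_txns = [(1000.0, 1), (50.0, 1), (200.0, 0), (30.0, 0)]
toy_predictions = {
    "A": [1, 0, 0, 0],
    "B": [0, 1, 0, 0],
    "C": [1, 0, 1, 1],
}
picked = select_by_savings(toy_predictions, toy_txns)
```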

8
Savings vs number of classifiers
  • Input base classifiers: Chase
  • Test data set: Chase
  • Best meta-classifier: a single naïve Bayesian base
    classifier (820K)

9
Savings of base classifiers
  • Input base classifiers: Chase
  • Test data set: Chase
  • Conclusion: learning algorithms focus on the binary
    classification problem. If base classifiers
    fail to detect expensive fraud, meta-learning
    cannot improve savings.

10
Savings vs number of classifiers
  • Input base classifiers: First Union
  • Test data set: First Union
  • Best meta-classifier: Naïve Bayes with 22 base
    classifiers (945K)

11
Savings of base classifiers
  • Input base classifiers: First Union
  • Test data set: First Union
  • Conclusion: the majority of base classifiers are
    able to detect transactions that are both fraudulent
    and expensive. Meta-learning saves an additional
    100K.

12
Different distributions experiments
  • Number of data sites: 6
  • Training sets: 50-50 fraud/non-fraud
  • Testing sets: 20-80 fraud/non-fraud
  • Base classifiers: ID3, CART
  • Meta-classifiers: ID3, CART, Bayes, Ripper
  • Base classifiers: 81% TP, 29% FP
  • Meta-classifiers: 86% TP, 25% FP

13
Inter-bank experiments
  • Chase includes 2 attributes not present in First
    Union data
    • Add two fictitious fields
    • Classifier agents support unknown values
  • Chase and First Union define an attribute with
    different semantics
    • Project Chase values on First Union semantics
14
Inter-bank experiments
  • Input base classifiers: Chase
  • Test data set: Chase and First Union
  • Task: compare TP and FP rates of a classifier on
    different test sets.
  • Conclusion: Chase classifiers CAN be applied to
    First Union data, but not without penalty.

15
Inter-bank experiments
  • Input base classifiers: First Union
  • Test data set: First Union and Chase
  • Task: compare TP and FP rates of a classifier on
    different test sets.
  • Conclusion: First Union classifiers CAN be
    applied to Chase data, but not without penalty.

16
TP-FP vs number of classifiers
  • Input base classifiers: First Union and Chase
  • Test data set: Chase
  • Result
    • Ripper, CART comparable
    • Naïve Bayes slightly superior
    • C4.5, ID3 inferior

17
Accuracy vs number of classifiers
  • Input base classifiers: First Union and Chase
  • Test data set: Chase
  • Result
    • CART, Ripper comparable
    • Naïve Bayes, C4.5, ID3 inferior

18
TP-FP vs number of classifiers
  • Input base classifiers: First Union and Chase
  • Test data set: First Union
  • Result
    • Naïve Bayes, C4.5, CART comparable only when
      using all classifiers
    • Ripper superior only when using all classifiers
    • ID3 inferior

19
Accuracy vs number of classifiers
  • Input base classifiers: First Union and Chase
  • Test data set: First Union
  • Result
    • Naïve Bayes, C4.5, CART, Ripper comparable only
      when using all classifiers
    • ID3 inferior

20
Chase: max fraud loss 1,470K, overhead 75

21
First Union: max fraud loss 1,085K, overhead 75

22
Aggregate Cost Model
  • X: overhead to challenge a fraud
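
The slide defines only the overhead X, so the following is a sketch under stated assumptions rather than the presenter's cost model: challenging a transaction costs X whether or not it is fraudulent, a missed fraud costs its full amount, and a correctly ignored legitimate transaction costs nothing. Function names and the tuple layout are hypothetical.

```python
from typing import List, Tuple

# (amount, is_fraud, flagged_by_classifier)
Txn = Tuple[float, bool, bool]


def aggregate_cost(txns: List[Txn], overhead: float) -> float:
    """Total cost under the assumed model described above."""
    cost = 0.0
    for amount, is_fraud, flagged in txns:
        if flagged:
            cost += overhead  # every challenge costs X
        elif is_fraud:
            cost += amount    # missed fraud: the whole amount is lost
    return cost


def max_fraud_loss(txns: List[Txn]) -> float:
    """Loss if nothing is detected: the sum of all fraudulent amounts."""
    return sum(amount for amount, is_fraud, _ in txns if is_fraud)


def amount_saved(txns: List[Txn], overhead: float) -> float:
    """Savings relative to doing no detection at all."""
    return max_fraud_loss(txns) - aggregate_cost(txns, overhead)


toy = [
    (500.0, True, True),    # fraud, caught
    (40.0, True, False),    # fraud, missed
    (120.0, False, True),   # legitimate, false alarm
    (60.0, False, False),   # legitimate, ignored
]
toy_cost = aggregate_cost(toy, overhead=75.0)
toy_saved = amount_saved(toy, overhead=75.0)
```

Under these assumptions, savings can be negative when false alarms and low-value catches outweigh the recovered fraud, which is one reason accuracy and savings can rank classifiers differently.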

23
Experiment Set-up
  • Training data set: 10/1995 - 7/1996
  • Testing data set: 9/1996
  • Each data point is the average of the 10
    classifiers (Oct. 1995 to July 1996)
  • Training set size: 6,400 transactions (to allow
    90% of frauds)

24
Average Aggregate Cost (C4.5)

25
Accuracy (C4.5)

26
Average Aggregate Cost (CART)

27
Accuracy (CART)

28
Average Aggregate Cost (RIPPER)

29
Accuracy (RIPPER)

30
Average Aggregate Cost (BAYES)

31
Accuracy (BAYES)

32
Amount Saved (Overhead 100)
  • Fraud in training data: 30.00%
  • Fraud in training data: 23.14%
  • Maximum saving: 1337K
  • Losses/transaction if no detection: 40.81

33
Do patterns change over time?
  • Entire Chase credit card data set
  • Original fraud rate (20% - 80%)
  • Due to billing cycle and fraud investigation
    delays, training data are 2 months older than
    testing data
  • Two experiments were conducted with different
    training data sets
  • Test data set: 9/1996 (last month)

34
Training data sets
  • Back-in-time experiment
    • July 1996
    • June - July 1996
    • ...
    • October 1995 ... July 1996
  • Forward-in-time experiment
    • October 1995
    • October 1995 - November 1995
    • October 1995 ... July 1996
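
The two families of training windows listed above can be generated mechanically. A minimal sketch, assuming monthly granularity and the October 1995 to July 1996 range named on the slide; the function names and the "YYYY-MM" labels are illustrative.

```python
from typing import List

# The ten training months named on the slide, October 1995 to July 1996.
MONTHS: List[str] = [
    "1995-10", "1995-11", "1995-12", "1996-01", "1996-02",
    "1996-03", "1996-04", "1996-05", "1996-06", "1996-07",
]


def back_in_time(months: List[str]) -> List[List[str]]:
    """Start from the most recent month and grow the window backwards."""
    return [months[-k:] for k in range(1, len(months) + 1)]


def forward_in_time(months: List[str]) -> List[List[str]]:
    """Start from the oldest month and grow the window forwards."""
    return [months[:k] for k in range(1, len(months) + 1)]


back_windows = back_in_time(MONTHS)
forward_windows = forward_in_time(MONTHS)
```

Both sequences end with the same full ten-month window; they differ only in which months are available when the window is small.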

35
Patterns don't change: Accuracy

36
Patterns don't change: Savings

37
Divide and Conquer: Conflict Resolving
  • Conflicts: base-level data with different class
    labels yet the same predicted classifications.

38
Class-combiner meta-level training data

39
Prevalence of Conflicts in Meta-level Training Data
  • Note: columns are True Label, ID3, CART, RIPPER
  • 1 = fraud, 0 = non-fraud

40
Divide and Conquer: Conflict Resolving (cont'd)
  • We divide the training sets into subsets of
    training data according to each conflict pattern
  • For each subset, recursively apply divide-and-conquer
    until a stopping criterion is met
  • We use a rote table to learn meta-level training
    data
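
A minimal sketch of the first level of this procedure: build a rote table over the meta-level data, find the conflict patterns, and split the base-level examples by pattern. The slides do not say how a rote table resolves a conflicting pattern, so the majority-label rule here is an assumption, and the recursion and stopping criterion are left out. All names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Pattern = Tuple[int, ...]           # base predictions, e.g. (ID3, CART, RIPPER)
MetaExample = Tuple[int, Pattern]   # (true label, pattern); 1 = fraud, 0 = non-fraud


def build_rote_table(meta: List[MetaExample]):
    """Memorize, for each prediction pattern, the majority true label
    (assumed tie-breaking rule), and report patterns seen with both labels."""
    counts: Dict[Pattern, Counter] = defaultdict(Counter)
    for label, pattern in meta:
        counts[pattern][label] += 1
    table = {p: c.most_common(1)[0][0] for p, c in counts.items()}
    conflicts = {p: dict(c) for p, c in counts.items() if len(c) > 1}
    return table, conflicts


def split_by_conflict(meta: List[MetaExample],
                      conflicts: Dict[Pattern, dict]) -> Dict[Pattern, List[int]]:
    """Group indices of base-level examples by the conflict pattern they fall in."""
    subsets: Dict[Pattern, List[int]] = defaultdict(list)
    for i, (_, pattern) in enumerate(meta):
        if pattern in conflicts:
            subsets[pattern].append(i)
    return dict(subsets)


toy_meta = [
    (1, (1, 1, 0)),
    (0, (1, 1, 0)),   # same predictions as above, different true label
    (1, (1, 1, 0)),
    (0, (0, 0, 0)),
    (1, (1, 1, 1)),
]
rote_table, conflict_patterns = build_rote_table(toy_meta)
conflict_subsets = split_by_conflict(toy_meta, conflict_patterns)
```

Each subset returned by `split_by_conflict` is where new base classifiers would be trained in the next round of the recursion.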

41
Experiment Set-up
  • A full year's Chase credit card data. Natural
    fraud percentage (20%). Fields not available at
    authorization were removed
  • Each month from Oct. 1995 to July 1996 was used
    as a training set
  • The testing set was chosen from the month that is 2
    months later. In the real world, it takes 2 months for
    billing and fraud investigation
  • Results are averages of 10 runs

42
Results
  • Without the conflict-resolving technique, but using
    a rote table to learn meta-level data
    • Overall accuracy: 88.8%
    • True positive: 59.8%
    • False positive: 3.81%
  • With the conflict-resolving technique
    • Overall accuracy: 89.1% (increase of 0.3)
    • True positive: 61.2% (increase of 1.4)
    • False positive: 3.88% (increase of 0.07)

43
Achievable Maximum Accuracy
  • We use a nearest-neighbor approach to estimate a
    loose upper bound on the maximum accuracy we can
    achieve
  • The algorithm calculates the percentage of noise
    in the training data
  • Approximately 91.0%, so we are within 1.9
    percentage points of the maximum accuracy
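
One way such a nearest-neighbor noise estimate could be computed is sketched below. The slides give no details, so this is an assumption throughout: an example counts as noise when its nearest other example (Euclidean distance) carries a different label, and the accuracy bound is taken as one minus that noise rate. The function names and toy data are hypothetical.

```python
import math
from typing import List, Sequence


def nn_noise_rate(points: List[Sequence[float]], labels: List[int]) -> float:
    """Fraction of examples whose nearest other example has a different label."""
    noisy = 0
    for i, p in enumerate(points):
        others = (k for k in range(len(points)) if k != i)
        nearest = min(others, key=lambda k: math.dist(p, points[k]))
        if labels[nearest] != labels[i]:
            noisy += 1
    return noisy / len(points)


def accuracy_upper_bound(points: List[Sequence[float]], labels: List[int]) -> float:
    """Assumed loose bound: examples flagged as noise cannot be learned."""
    return 1.0 - nn_noise_rate(points, labels)


# Toy 1-D data: two well-separated clusters plus one point (3.0) whose
# label disagrees with its neighborhood.
toy_points = [(0.0,), (1.0,), (2.0,), (2.1,), (3.0,),
              (10.0,), (11.0,), (12.0,), (13.0,), (14.0,)]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
toy_noise = nn_noise_rate(toy_points, toy_labels)
toy_bound = accuracy_upper_bound(toy_points, toy_labels)
```

This brute-force version is quadratic in the number of examples, which is fine for a sketch but would need an index structure for data sets of the size discussed in the slides.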

44
Accuracy Result