Rule Discovery for Fraud Detection

About This Presentation

Description:

Given a set of page views, predict whether the visitor will view ... Non-crawlers. Hand selected rules with near perfect accuracy. Rule Generator. Applying ...

Slides: 14

Provided by: gadip

Transcript and Presenter's Notes

1
KDD Cup 2000 Question 1
2
Overview

Objective
Given a set of page views, predict whether the
visitor will view another page or not
Data
Raw Data - Clicks
Aggregated Data - Sessions
Some sessions clipped in the middle
Indicator: session continues
Methods and Tools
Exploratory Data Analysis - SAS
Classification Tree - Amdocs Business Insight Tool
Decision tree
Rules Extraction
Modeling
Combining models

3
The Winning Model - Introduction
This model combines artificial intelligence
(automated procedures) with human intuition
(decisions based on domain knowledge)
4
The Winning Model - general scheme
5
Building Main Model
Decision Tree: 5 trees, built on 34000 cases
6
Description of sub-models
Each model captures a different aspect of the
overall behavior in the data. Combining or
ensembling the models provides the best
prediction results.
Best Rule
Chooses the most accurate rule satisfied by each
record
Hybrid Model
Logistic regression on the rule set and raw field
values, combined to define a score for each record
Merged Rules
Logistic regression on the rule set defines the
score for each record as a combination of the
rules the record satisfies
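As an illustration of the sub-models above, here is a minimal Python sketch of Best Rule selection and of a Merged Rules style score. It is not the presenters' implementation: the Rule class, the two example rules, the field names page_views and seconds, and the accuracy and weight values are all hypothetical. In practice the weights would come from a fitted logistic regression rather than being set by hand.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Record = Dict[str, float]


@dataclass
class Rule:
    name: str
    matches: Callable[[Record], bool]  # True if the record satisfies the rule
    accuracy: float                    # hypothetical training accuracy
    weight: float                      # hypothetical logistic coefficient


def best_rule(record: Record, rules: List[Rule]) -> Optional[Rule]:
    """Best Rule: pick the most accurate rule the record satisfies."""
    satisfied = [r for r in rules if r.matches(record)]
    return max(satisfied, key=lambda r: r.accuracy) if satisfied else None


def merged_score(record: Record, rules: List[Rule], intercept: float = 0.0) -> float:
    """Merged Rules: logistic score over the rules the record satisfies."""
    z = intercept + sum(r.weight for r in rules if r.matches(record))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical rules, for illustration only.
RULES = [
    Rule("many_views", lambda rec: rec["page_views"] >= 5, accuracy=0.80, weight=1.2),
    Rule("short_visit", lambda rec: rec["seconds"] < 30, accuracy=0.70, weight=-0.9),
]

record = {"page_views": 6, "seconds": 20}
chosen = best_rule(record, RULES)    # both rules match; "many_views" is more accurate
score = merged_score(record, RULES)  # logistic(1.2 - 0.9)
```

A Hybrid Model in this sketch would add raw field values (for example, page_views itself) as extra terms in the sum inside merged_score.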
7
Applying Main Model
Decision Tree: 5 trees, built on 34000 cases
Rule Generator: 1466 rules, 111 continue rules
Sub-models: Best Rule, Hybrid Model, Merged Rules
8
The Winning Model - general scheme
9
Small Whitebox
10
Small Whitebox
Decision Tree
Applying The Model
11
The prediction
The prediction is not that much better than
choosing the majority class. But it is enough to
win first place!
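The comparison with the majority class can be made concrete with a small Python sketch. The labels and predictions below are made up for illustration (0 = does not continue, 1 = continues) and are not KDD Cup data or results.

```python
from collections import Counter
from typing import List


def majority_baseline_accuracy(labels: List[int]) -> float:
    """Accuracy obtained by always predicting the most frequent class."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def accuracy(labels: List[int], predictions: List[int]) -> float:
    """Fraction of records whose prediction equals the true label."""
    return sum(y == p for y, p in zip(labels, predictions)) / len(labels)


# Made-up example data, for illustration only.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
predictions = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]

baseline = majority_baseline_accuracy(labels)
model_accuracy = accuracy(labels, predictions)
improvement = model_accuracy - baseline
```

A model is only useful here if improvement is positive, which is the bar the slide refers to.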
12
Final Considerations

Since both types of errors (false positives and
false negatives) are given the same weight, a
segment must have a very high probability of
continuing to justify not being classified as the
majority class.
The ratio of continue / not continue in the test
set must be estimated as accurately as possible.
The cutoff point (which score threshold divides
the two classes) must be carefully chosen.
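One way to choose the cutoff, sketched below in Python, is to scan candidate thresholds on a labelled validation set and keep the one with the highest accuracy, which matches giving both error types equal weight. This is an assumed procedure, not necessarily the one used for the winning model. The scores and labels are made up, and the validation set is assumed to have the same continue / not continue ratio as the test set.

```python
from typing import List, Tuple


def accuracy_at(cutoff: float, scores: List[float], labels: List[int]) -> float:
    """Accuracy when records with score >= cutoff are predicted to continue."""
    correct = sum((s >= cutoff) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)


def best_cutoff(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (cutoff, accuracy) for the most accurate candidate cutoff.

    Candidates are the observed scores plus infinity, which corresponds to
    predicting "not continue" for every record.
    """
    candidates = sorted(set(scores)) + [float("inf")]
    return max(
        ((c, accuracy_at(c, scores, labels)) for c in candidates),
        key=lambda pair: pair[1],
    )


# Made-up validation data: 1 = continues, 0 = does not continue.
scores = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 0, 1, 1]

cutoff, acc = best_cutoff(scores, labels)
```

On this toy data the chosen cutoff is high (0.80), so only records with a high score are classified as continuing.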

13

The End
