Rare Category Detection

Slides: 39

Provided by: carbonVide1

Transcript and Presenter's Notes

1
Rare Category Detection

Jingrui He
Machine Learning Department
Carnegie Mellon University
Joint work with Jaime Carbonell

2
What's Rare Category Detection?

Start de-novo
Very skewed classes
Majority classes
Minority classes
Labeling oracle
Goal
Discover minority classes with a few label requests

3
Comparison with Outlier Detection

Rare classes
A group of points
Clustered
Non-separable from the majority classes

Outliers
A single point
Scattered
Separable

4
Comparison with Active Learning

Rare category detection
Initial condition: NO labeled examples
Goal: discover the minority classes with the fewest label requests

Active learning
Initial condition: labeled examples from each class
Goal: improve the performance of the current classifier with the fewest label requests

5
Applications
Network intrusion detection
Fraud detection
Astronomy
Spam image detection
6
The Big Picture
Classifier
Unbalanced Unlabeled Data Set
Rare Category Detection
Learning in Unbalanced Settings
Feature Extraction
Spatial
Raw Data
Relational
Temporal
7
Outline

Problem definition
Related work
Rare category detection for spatial data
Prior-dependent rare category detection
Prior-free rare category detection
Conclusion

8
Related Work

Pelleg & Moore 2004
Mixture model
Different selection criteria
Fine & Mansour 2006
Generic consistency algorithm
Upper bounds and lower bounds
Papadimitriou et al. 2003
LOCI algorithm for groups of outliers

Separable or Near-separable
9
Outline

Problem definition
Related work
Rare category detection for spatial data
Prior-dependent rare category detection
Prior-free rare category detection
Conclusion

10
Notations

Unlabeled examples
m classes
m-1 rare classes
One majority class
Goal: find at least ONE example from each rare class by requesting a few labels
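
The goal above can be made concrete with a small evaluation harness. The sketch below is only an illustration, not part of the presentation: `LabelOracle` and `requests_until_all_rare_found` are hypothetical names, the toy labels are made up, and the query order shown is the plain random sampling baseline.

```python
import random
from typing import Iterable, Optional, Sequence, Set


class LabelOracle:
    """Hypothetical labeling oracle that counts how many labels were requested."""

    def __init__(self, labels: Sequence[int]):
        self._labels = list(labels)
        self.requests = 0

    def query(self, index: int) -> int:
        self.requests += 1
        return self._labels[index]


def requests_until_all_rare_found(
    order: Iterable[int], oracle: LabelOracle, rare_classes: Set[int]
) -> Optional[int]:
    """Query examples in the given order and stop once every rare class was seen."""
    remaining = set(rare_classes)
    for index in order:
        remaining.discard(oracle.query(index))
        if not remaining:
            return oracle.requests
    return None  # at least one rare class was never found


# Toy data (made up): class 0 is the majority, classes 1 and 2 are rare.
labels = [0] * 95 + [1] * 3 + [2] * 2
order = list(range(len(labels)))
random.Random(0).shuffle(order)  # random sampling baseline
oracle = LabelOracle(labels)
needed = requests_until_all_rare_found(order, oracle, rare_classes={1, 2})
print("label requests needed by random sampling:", needed)
```

Any detection method can be plugged in by replacing `order` with the sequence of indices it chooses to query; fewer requests is better.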

11
Assumptions

The distribution of the majority class is
sufficiently smooth
Examples from the minority classes form compact
clusters in the feature space

12
Overview of the Algorithms

Nearest-neighbor-based methods
Methodology: local density differential sampling
Intuition: select examples according to the change in local density

13
Two Classes: NNDB
1. Calculate class-specific radius
2. Calculate nearest neighbors
3. Calculate the scores
4. Query
5. Rare class?
No: increase t by 1 and return to step 3
Yes: continue to step 6
6. Output
14
NNDB: Calculate Class-Specific Radius

Number of examples from the minority class: K
For each example, calculate the distance between it and its K-th nearest neighbor
The class-specific radius: the minimum of these distances
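
A minimal sketch of this step, under stated assumptions: the symbols on the slide did not survive extraction, so the code assumes K is the data set size times the minority prior, and that the radius is the smallest K-th-nearest-neighbor distance over all examples. Counting each point as its own first neighbor is also an assumption of this sketch, and `class_specific_radius` is a hypothetical name.

```python
import numpy as np


def class_specific_radius(X: np.ndarray, prior: float) -> float:
    """Smallest distance from any example to its K-th nearest neighbor."""
    n = X.shape[0]
    # Assumption: K is the expected number of minority examples, n * prior.
    k = min(n, max(1, int(round(n * prior))))
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Assumption: a point is its own first neighbor (distance 0), so after
    # sorting each row, column k - 1 holds the K-th nearest distance.
    kth_distance = np.sort(dist, axis=1)[:, k - 1]
    return float(kth_distance.min())


# Toy 1-D data (made up): ten spread-out majority points and a tight
# minority cluster of three points.
majority = np.arange(10, dtype=float).reshape(-1, 1)
minority = np.array([[4.50], [4.51], [4.52]])
X = np.vstack([majority, minority])
r_prime = class_specific_radius(X, prior=3 / 13)
print("class-specific radius:", r_prime)
```

With this convention, a ball of the returned radius around the tightest point contains about K examples, which is why the radius tracks the scale of the compact minority cluster.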

15
NNDB: Calculate Nearest Neighbors
16
NNDB: Calculate the Scores
Query
17
NNDB: Pick the Next Candidate
Increase t by 1
Query
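
The three steps on slides 15 to 17 can be sketched as one loop. This is a hedged reading rather than the authors' code: it assumes each example's neighbor count is taken within the class-specific radius, that an example's score is the largest drop in count to any example within t times that radius, and that the highest-scoring unlabeled example is queried. The function name `nndb`, the callable `oracle`, and the toy data are all hypothetical.

```python
import numpy as np


def nndb(X, r_prime, oracle, rare_label=1):
    """Query examples by local density change until a rare example is found."""
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Calculate nearest neighbors: how many examples lie within r_prime.
    counts = (dist <= r_prime).sum(axis=1)
    selected = np.zeros(n, dtype=bool)
    queried = []
    for t in range(1, n + 1):
        # Calculate the scores: largest count difference to any example
        # within the enlarged radius t * r_prime.
        scores = np.full(n, -np.inf)
        for i in np.flatnonzero(~selected):
            nearby = dist[i] <= t * r_prime
            scores[i] = (counts[i] - counts[nearby]).max()
        # Pick the next candidate and ask the oracle for its label.
        pick = int(np.argmax(scores))
        selected[pick] = True
        queried.append(pick)
        if oracle(pick) == rare_label:
            return pick, queried
        # Otherwise t increases by 1 on the next pass.
    return None, queried


# Toy 1-D data (made up): an evenly spaced majority and a tight cluster.
majority = np.linspace(0.0, 10.0, 41)
minority = 5.125 + np.arange(4) / 64
X = np.concatenate([majority, minority]).reshape(-1, 1)
labels = [0] * len(majority) + [1] * len(minority)
found, queried = nndb(X, r_prime=3 / 64, oracle=lambda i: labels[i])
print("found index:", found, "after", len(queried), "queries")
```

Early on, when the enlarged radius does not yet reach outside the cluster, all scores tie and the sketch falls back to the lowest index; how ties are broken is a choice of this sketch.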
18
Why NNDB Works

Theoretically
Theorem 1 [He & Carbonell 2007]: under certain conditions, with high probability, after a few iteration steps, NNDB queries at least one example whose probability of coming from the minority class is at least 1/3
Intuitively
The score measures the change in local density

19
Multiple Classes: ALICE

m-1 rare classes
One majority class

1. For each rare class c:
2. Have we found examples from class c?
Yes: move on to the next rare class
No: continue to step 3
3. Run NNDB with the prior of class c
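
The outer loop can be sketched as follows. This is an illustration under assumptions, not the published algorithm: the inner routine is a compact nearest-neighbor scoring loop written for this example, already queried examples are never queried again across classes, and any class discovered by accident is recorded so its own run is skipped. The names `alice` and `radius` and the toy data are hypothetical.

```python
import numpy as np


def radius(dist, prior):
    """Smallest K-th-nearest distance, with K = n * prior (assumed)."""
    k = min(dist.shape[0], max(1, int(round(dist.shape[0] * prior))))
    return float(np.sort(dist, axis=1)[:, k - 1].min())


def alice(X, priors, oracle):
    """priors maps each rare class label to its prior; oracle maps index to label."""
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    selected = np.zeros(n, dtype=bool)
    found = {}  # class label -> index of the first example discovered
    for c, prior in priors.items():      # step 1: for each rare class c
        if c in found:                   # step 2: already discovered, skip
            continue
        r = radius(dist, prior)          # step 3: inner run with the prior of c
        counts = (dist <= r).sum(axis=1)
        for t in range(1, n + 1):
            if selected.all():
                break
            scores = np.full(n, -np.inf)
            for i in np.flatnonzero(~selected):
                scores[i] = (counts[i] - counts[dist[i] <= t * r]).max()
            pick = int(np.argmax(scores))
            selected[pick] = True
            label = oracle(pick)
            found.setdefault(label, pick)
            if label == c:
                break
    return {c: found.get(c) for c in priors}, int(selected.sum())


# Toy 1-D data (made up): one majority class and two compact rare classes.
majority = np.linspace(0.0, 10.0, 41)
rare_1 = 5.125 + np.arange(4) / 64
rare_2 = 2.125 + np.arange(3) / 64
X = np.concatenate([majority, rare_1, rare_2]).reshape(-1, 1)
labels = [0] * 41 + [1] * 4 + [2] * 3
found, n_queries = alice(X, {1: 4 / 48, 2: 3 / 48}, lambda i: labels[i])
print(found, "label requests:", n_queries)
```

Because the scoring for class 2 still favors the dense region of class 1, the sketch can spend a query on an already discovered class, which is the repeated-sampling problem raised under implementation issues.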
20
Why ALICE Works

Theoretically
Theorem 2 [He & Carbonell 2008]: under certain conditions, with high probability, in each outer loop of ALICE, after a few iteration steps in NNDB, ALICE queries at least one example whose probability of coming from one minority class is at least 1/3

21
Implementation Issues

ALICE
Problem: repeatedly sampling from the same rare class
MALICE
Solution: relevance feedback

Class-specific radius
22
Results on Synthetic Data Sets
23
Summary of Real Data Sets

Abalone
4177 examples
7-dimensional features
20 classes
Largest class: 16.50%
Smallest class: 0.34%

Shuttle
4515 examples
9-dimensional features
7 classes
Largest class: 75.53%
Smallest class: 0.13%
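
To get a feel for how rare the smallest classes are, the short calculation below reads the largest and smallest class figures as percentages of the data set (an assumption, since the percent signs are missing in the transcript) and converts them to approximate counts. It then computes the chance that plain random sampling without replacement hits the smallest class within a given number of label requests. The function names are hypothetical and the counts are rounded estimates, not figures from the presentation.

```python
from math import comb


def rare_count(n_examples: int, share_percent: float) -> int:
    """Approximate number of examples in a class given its share in percent."""
    return round(n_examples * share_percent / 100)


def p_hit_random(n_examples: int, n_rare: int, n_queries: int) -> float:
    """P(at least one rare example among n_queries draws without replacement)."""
    return 1 - comb(n_examples - n_rare, n_queries) / comb(n_examples, n_queries)


abalone_smallest = rare_count(4177, 0.34)
shuttle_smallest = rare_count(4515, 0.13)
print("Abalone smallest class, approx. examples:", abalone_smallest)
print("Shuttle smallest class, approx. examples:", shuttle_smallest)
print("Shuttle, P(hit within 100 random queries):",
      p_hit_random(4515, shuttle_smallest, 100))
```

The second function is the standard hypergeometric complement; it gives a baseline against which any guided query strategy can be compared.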

24
Results on Real Data Sets
[Plots for Abalone and Shuttle comparing MALICE, Interleave, and Random sampling]
25
Imprecise priors
[Plots for Abalone and Shuttle]
26
Outline

Problem definition
Related work
Rare category detection for spatial data
Prior-dependent rare category detection
Prior-free rare category detection
Conclusion

27
Overview of the Algorithm

Density-based method
Methodology: specially designed exponential families
Intuition: select examples according to the change in local density
Difference from NNDB (ALICE): NO prior information needed

28
Specially Designed Exponential Families [Efron & Tibshirani 1996]

Favorable compromise between parametric and nonparametric density estimation
Estimated density: $g(x) = g_0(x) \exp(\beta_0 + \beta_1^T t(x))$
$g_0(x)$: carrier density
$\beta_1$: parameter vector
$\beta_0$: normalizing parameter
$t(x)$: vector of sufficient statistics
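
A small numerical sketch may help show how the four pieces fit together. It assumes the usual form of such a family, a carrier density multiplied by exp(beta0 + beta1 . t(x)), with beta0 fixed by the requirement that the result integrates to one. The 1-D Gaussian kernel carrier, the choice t(x) = (x, x^2), the parameter values, and all function names are assumptions made for illustration only.

```python
import numpy as np


def gaussian_kde(grid, data, bandwidth):
    """Plain 1-D Gaussian kernel density estimate evaluated on a grid."""
    z = (grid[:, None] - data[None, :]) / bandwidth
    kernel_sum = np.exp(-0.5 * z ** 2).sum(axis=1)
    return kernel_sum / (len(data) * bandwidth * np.sqrt(2 * np.pi))


def tilted_density(grid, carrier, beta1, t):
    """g(x) = g0(x) * exp(beta0 + beta1 . t(x)) on an evenly spaced grid.

    beta0 is the normalizing parameter: it is chosen so that g sums to one
    under a Riemann sum over the grid.
    """
    tilt = np.exp(t(grid) @ beta1)
    dx = grid[1] - grid[0]
    beta0 = -np.log(np.sum(carrier * tilt) * dx)
    return carrier * tilt * np.exp(beta0), beta0


def sufficient_stats(x):
    """Assumed sufficient statistics t(x) = (x, x^2)."""
    return np.stack([x, x ** 2], axis=1)


rng = np.random.default_rng(0)
data = rng.normal(size=200)
grid = np.linspace(-6.0, 6.0, 1201)
carrier = gaussian_kde(grid, data, bandwidth=0.3)

g, beta0 = tilted_density(grid, carrier, np.array([0.2, -0.1]), sufficient_stats)
g_same, beta0_same = tilted_density(grid, carrier, np.zeros(2), sufficient_stats)
print("integral of g:", np.sum(g) * (grid[1] - grid[0]))
print("beta0 when beta1 = 0:", beta0_same)
```

With beta1 set to zero the tilt disappears and the estimate falls back to the carrier, which is the sense in which the family sits between a fully nonparametric and a fully parametric estimate.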
29
SEDER Algorithm

Carrier density: kernel density estimator
To decouple the estimation of different
parameters
Decompose
Relax the constraint such that

30
Parameter Estimation

Theorem 3 [To appear]: the maximum likelihood estimates of the two parameters satisfy the following conditions
where

31
Parameter Estimation cont.

Let
where ,

positive parameter
in most cases
32
Scoring Function

The estimated density
Scoring function: norm of the gradient
where
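
The idea of scoring by the norm of the density gradient can be illustrated with a simpler stand-in. The sketch below does not use the estimate from this presentation, whose formula is not recoverable from the transcript; it swaps in a plain Gaussian kernel density estimate, whose gradient has a closed form, and scores every example by the norm of that gradient. Querying the highest-scoring example is an assumption of the sketch, and `kde_gradient_norm`, the bandwidth, and the toy data are hypothetical.

```python
import numpy as np


def kde_gradient_norm(X: np.ndarray, bandwidth: float) -> np.ndarray:
    """Norm of the gradient of a Gaussian KDE, evaluated at every example."""
    n, d = X.shape
    diff = X[:, None, :] - X[None, :, :]            # diff[i, j] = x_i - x_j
    sq_dist = (diff ** 2).sum(axis=2)
    kernel = np.exp(-0.5 * sq_dist / bandwidth ** 2)
    norm_const = n * (2 * np.pi) ** (d / 2) * bandwidth ** d
    # d/dx of exp(-|x - x_j|^2 / (2 h^2)) is -(x - x_j) / h^2 times the kernel.
    grad = -(kernel[:, :, None] * diff).sum(axis=1) / (bandwidth ** 2 * norm_const)
    return np.linalg.norm(grad, axis=1)


# Toy 2-D data (made up): a uniform majority plus a compact cluster.
rng = np.random.default_rng(0)
majority = rng.uniform(0.0, 1.0, size=(200, 2))
cluster = rng.normal(loc=0.5, scale=0.01, size=(10, 2))
X = np.vstack([majority, cluster])

scores = kde_gradient_norm(X, bandwidth=0.05)
next_query = int(np.argmax(scores))   # assumed selection rule
print("next example to query:", next_query, "score:", scores[next_query])
```

The gradient is near zero both in flat regions and at the exact peak of a cluster, and large on the slope around it, so high scores mark places where the local density changes sharply.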

33
Results on Synthetic Data Sets
34
Summary of Real Data Sets
Moderately Skewed
Extremely Skewed
35
Moderately Skewed Data Sets
[Plots for Ecoli and Glass; MALICE is labeled in each plot]
36
Extremely Skewed Data Sets
[Plots for Page Blocks, Abalone, and Shuttle; MALICE is labeled in each plot]
37
Conclusion