1. Discriminative Frequent Pattern Analysis for Effective Classification
- Hong Cheng, Xifeng Yan, Jiawei Han and Chih-Wei Hsu
- ICDE 2007
2. Outline
- Introduction
- The Framework of Frequent Pattern-based Classification
- Experimental Results
- Conclusion
3.
- How does frequent pattern-based classification achieve both high scalability and accuracy for the classification of large datasets?
- What is the strategy for setting the minimum support threshold?
- Given a set of frequent patterns, how should we select high-quality ones for effective classification?
4. Introduction
- The use of frequent patterns without feature selection will result in a huge feature space.
- This might slow down the model learning process.
- The classification accuracy deteriorates.
- An effective and efficient feature selection algorithm is proposed to select a set of frequent and discriminative patterns for classification.
5. Frequent Pattern vs. Single Feature
- The discriminative power of some frequent patterns is higher than that of single features.
[Fig. 1. Information Gain vs. Pattern Length; panels: (a) Austral, (b) Cleve, (c) Sonar]
6. The Framework of Frequent Pattern-based Classification
- It includes three steps (see the sketch after this list):
- Feature generation
- Feature selection
- Model learning
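Below is a minimal sketch of how the three steps might be wired together. The helpers mine_frequent_patterns, binarize, and select_features are hypothetical placeholders for the components described on the following slides, and the SVM is one of the standard learners the framework can plug in (the paper also uses C4.5).

```python
# Skeleton of the three-step framework; `mine_frequent_patterns`,
# `binarize`, and `select_features` are illustrative stand-ins for the
# components sketched on later slides, not the paper's implementation.
from sklearn.svm import LinearSVC

def frequent_pattern_classifier(data, labels, min_sup, k):
    patterns = mine_frequent_patterns(data, min_sup)   # 1. feature generation
    X = binarize(data, patterns)                       # map data into pattern space
    keep = select_features(X, labels, k)               # 2. feature selection
    model = LinearSVC().fit(X[:, keep], labels)        # 3. model learning
    return model, [patterns[j] for j in keep]
```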
7. Problem Formulation
- Let D = {(s_i, y_i)}, i = 1..n, be the training set, where s_i is a data point and y_i its class label.
- Let x be the feature vector of a data point s.
- Given the set of frequent patterns F = {alpha_1, ..., alpha_d} mined from D, the dataset is represented in B^d (B = {0, 1}) as D' = {(x_i, y_i)}, where x_ij = 1 if pattern alpha_j is contained in s_i and x_ij = 0 otherwise (see the sketch below).
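A small self-contained sketch of this mapping, assuming data points and patterns are represented as Python sets; all identifiers are illustrative:

```python
import numpy as np

def binarize(data, patterns):
    """Map each data point s_i to x_i in {0,1}^d with
    x_ij = 1 iff pattern alpha_j is contained in s_i."""
    return np.array([[1 if pat <= s else 0 for pat in patterns]
                     for s in data], dtype=np.int8)

# Example: two transactions, two mined patterns.
data = [{"a", "b", "c"}, {"b", "d"}]
patterns = [frozenset({"a", "b"}), frozenset({"b"})]
print(binarize(data, patterns))   # [[1 1], [0 1]]
```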
8. Discriminative Power vs. Pattern Frequency
- This paper demonstrates that the discriminative power of low-support features is limited.
- Low-support features could harm the classification accuracy due to overfitting.
9. Cont.
- The discriminative power of a pattern is closely related to its support.
- For a pattern represented by a random variable X, IG(C|X) = H(C) - H(C|X).
- Given a DB with a fixed class distribution, H(C) is a constant, so the upper bound IG_ub(C|X) is closely related to the lower bound of H(C|X).
- If q = P(c = 1 | x = 1), H(C|X) reaches its lower bound when q = 0 or q = 1, i.e., when the pattern occurs purely in one class.
- Therefore, the discriminative power of low-frequency patterns is bounded by a small value (see the sketch below).
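A small sketch of the bound for a binary class, assuming the class prior p = P(c = 1) is known. It minimizes H(C|X) by pushing q to 0 or 1 (whichever is feasible for the given support) and shows that IG_ub shrinks as the support theta shrinks:

```python
import numpy as np

def h2(p):
    """Binary entropy in bits; defined as 0 at p = 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

def ig_upper_bound(theta, p):
    """Upper bound on IG(C|X) for a pattern with support theta = P(x=1),
    given a binary class prior p = P(c=1); assumes 0 < theta < 1.
    H(C|X) is minimized when q = P(c=1|x=1) is 1 (pattern pure in
    class 1) or 0 (pattern pure in class 0)."""
    h_cx = []
    if theta <= p:          # q = 1 is feasible
        h_cx.append((1.0 - theta) * h2((p - theta) / (1.0 - theta)))
    if theta <= 1.0 - p:    # q = 0 is feasible
        h_cx.append((1.0 - theta) * h2(p / (1.0 - theta)))
    if not h_cx:
        raise ValueError("no pure assignment is feasible for this support")
    return h2(p) - min(h_cx)

# The bound grows with support: for p = 0.5,
# ig_upper_bound(0.01, 0.5) ~ 0.01 while ig_upper_bound(0.3, 0.5) ~ 0.40.
```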
10. Empirical Results
[Fig. 2. Information Gain vs. Pattern Frequency; panels: (a) Austral, (b) Breast, (c) Sonar]
11. Set min_sup
- A subset of high-quality features is selected for classification, with information gain no less than a threshold IG_0.
- Because IG_ub(theta) is small when the support theta is small, features with support theta < theta* can be skipped.
- The major steps (see the sketch after this list):
- Compute the information gain of the available single features.
- Choose the threshold IG_0 (e.g., the information gain of a reasonably good single feature).
- Find theta* = min{theta | IG_ub(theta) >= IG_0}.
- Mine frequent patterns with min_sup = theta*.
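A sketch of the threshold computation, reusing ig_upper_bound from the previous slide; the grid scan and the example threshold IG_0 = 0.1 are illustrative:

```python
import numpy as np

# Assumes ig_upper_bound(theta, p) from the previous sketch.
def choose_min_sup(ig0, p, grid=1000):
    """Smallest support theta at which IG_ub(theta) reaches the threshold
    ig0. Any pattern with lower support can never be discriminative enough,
    so mining can start at this min_sup."""
    for theta in np.linspace(1.0 / grid, min(p, 1.0 - p), grid):
        if ig_upper_bound(theta, p) >= ig0:
            return theta
    return None  # ig0 is unreachable at any support level

# Example: choose_min_sup(0.1, 0.5) is roughly 0.09, so patterns with
# support below ~9% can be pruned before mining.
```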
12. Feature Selection
- Given a set of frequent patterns, both non-discriminative and redundant patterns exist.
- We want to single out the discriminative patterns and remove redundant ones.
- The notion of Maximal Marginal Relevance (MMR) is borrowed (see the sketch after this list):
- The relevance of a pattern is measured by a score such as its information gain.
- The redundancy between two patterns is estimated from the overlap in the data they cover.
- Patterns are selected greedily, each time picking the one with the largest relevance minus redundancy with respect to the already selected set.
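A greedy sketch of MMR-style selection over the binary pattern matrix from the problem formulation. The Jaccard overlap used as the redundancy estimate, the trade-off weight lam, and the budget k are illustrative choices rather than the paper's exact MMRFS definitions; in the paper's setting, the relevance scores would be the patterns' information gains against the class labels.

```python
import numpy as np

def mmr_feature_select(X, relevance, k, lam=0.5):
    """Greedy MMR-style selection over a binary pattern matrix X (n x d),
    where X[i, j] = 1 iff pattern j matches data point i. `relevance`
    holds a per-pattern score (e.g. information gain)."""
    n, d = X.shape
    selected = []
    candidates = set(range(d))

    def jaccard(a, b):
        # Redundancy proxy: overlap of the data points the patterns cover.
        inter = np.logical_and(X[:, a], X[:, b]).sum()
        union = np.logical_or(X[:, a], X[:, b]).sum()
        return inter / union if union else 0.0

    while candidates and len(selected) < k:
        def marginal(j):
            red = max((jaccard(j, s) for s in selected), default=0.0)
            return lam * relevance[j] - (1.0 - lam) * red
        best = max(candidates, key=marginal)   # most relevant, least redundant
        selected.append(best)
        candidates.remove(best)
    return selected
```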
13. Experimental Results
14. Scalability Tests
15. Conclusion
- An effective and efficient feature selection algorithm is proposed to select a set of frequent and discriminative patterns for classification.
- Scalability issue:
- It is computationally infeasible to generate all feature combinations and filter them with an information gain threshold.
- An efficient method (DDPMine, with FP-tree pruning) is given in H. Cheng, X. Yan, J. Han, and P. S. Yu, "Direct Discriminative Pattern Mining for Effective Classification", ICDE'08.