Title: Unsupervised Feature Selection for Ensemble of Classifiers
1. Unsupervised Feature Selection for Ensemble of Classifiers

9th International Workshop on Frontiers in Handwriting Recognition, October 26-29, 2005, Tokyo, JAPAN

Marisa Morita, Luiz S. Oliveira, and Robert Sabourin
Pontifical Catholic University of Paraná (PUCPR), Curitiba, BRAZIL
École de Technologie Supérieure (ETS), Montreal, CANADA
2. Introduction

- Ensembles of classifiers have been widely used to:
  - Reduce model uncertainty.
  - Improve generalization performance.
- A good ensemble consists of:
  - Good classifiers.
  - Classifiers that make errors on different parts of the feature space.
3. Methods for Ensembles

- Classical methods:
  - Bagging, Boosting.
  - Random subspace: varies the subset of features given to each classifier (a minimal sketch follows this list).
  - Feature selection.
- Most of the work on ensembles has been carried out under the supervised learning paradigm.
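The snippet below is a minimal sketch of the random-subspace idea, assuming scikit-learn, decision trees, and the digits toy dataset purely for illustration; it is not the toolset or data used in this work.

    # Random-subspace sketch: each base learner sees only a random subset of
    # the features, so diversity comes from the feature view, not the data.
    # scikit-learn and the digits toy data are assumptions for illustration.
    from sklearn.datasets import load_digits
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_digits(return_X_y=True)
    X_tr, X_ts, y_tr, y_ts = train_test_split(X, y, random_state=0)

    ensemble = BaggingClassifier(
        DecisionTreeClassifier(),
        n_estimators=20,
        max_features=0.5,        # each tree is trained on 50% of the features
        bootstrap=False,         # keep every sample; only the features vary
        bootstrap_features=True, # draw the feature subset at random per tree
        random_state=0,
    )
    ensemble.fit(X_tr, y_tr)
    print("random-subspace accuracy:", ensemble.score(X_ts, y_ts))

Since only the feature view changes between members, the diversity of the ensemble comes entirely from each learner seeing a different projection of the data.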
4. Unsupervised Feature Selection

- This work is based on the unsupervised feature selection of (ICDAR 2003).
- It searches for the subset of features that best uncovers natural groupings (clusters) in the data according to some criterion.
- To find the subset of features that maximizes the criterion, the clusters have to be defined.
- Continuous features are converted to discrete ones, and the best subset of features is then sought.
5. Unsupervised Feature Selection

- For a given subset of features, the number of clusters is unknown, so clustering can become a trial-and-error task.
- Multi-criterion optimization function (sketched after this list):
  - Number of features.
  - Validity index (measures the quality of the clustering).
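The sketch below shows how one candidate feature subset could be scored under such a multi-criterion setup, assuming k-means for clustering and the Davies-Bouldin index as the validity measure; the clustering algorithm, validity index, and the way the number of clusters is searched in the actual method may differ.

    # Scoring one candidate feature subset: cluster the projected data and
    # measure cluster quality. k-means and the Davies-Bouldin index are
    # assumptions for illustration; in the actual method the number of
    # clusters is itself part of the search rather than fixed as here.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import davies_bouldin_score

    def score_subset(X, feature_mask, n_clusters):
        """Return the two objectives for one candidate subset:
        (number of features, Davies-Bouldin index). With this index,
        both values are minimized; a lower index means better-separated
        clusters."""
        Xs = X[:, feature_mask]
        labels = KMeans(n_clusters=n_clusters, n_init=10,
                        random_state=0).fit_predict(Xs)
        return int(feature_mask.sum()), davies_bouldin_score(Xs, labels)

    # Toy example: 8 features selected at random out of 32.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 32))
    mask = np.zeros(32, dtype=bool)
    mask[rng.choice(32, size=8, replace=False)] = True
    print(score_subset(X, mask, n_clusters=5))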
6. The Proposed Method

- Based on a hierarchical Multi-Objective Genetic Algorithm (ICDAR 2003).
- 1st level: performs unsupervised feature selection and finds a set of good classifiers.
- 2nd level: combines those classifiers, maximizing the generalization power of the ensemble and a measure of diversity, and produces a set of ensembles.
7. Overproduce and Choose: New Search Space

- Straightforward strategy: find the set of classifiers that maximizes performance.
- A single objective leads to premature convergence.
- Adding a diversity measure lets the search explore different trade-offs between performance and diversity.
8. Classifiers and Feature Sets

- HMM-based classifiers trained to recognize Brazilian month words.
- 1,200, 400, and 400 words for training (TR), validation (VL), and test (TS), respectively.
- 10,200 words from the legal amount database.
- Character models.
- 500 words as a second validation set.
9. Classifiers and Feature Sets

[Table: classifiers and feature sets obtained by trial-and-error.]
10. Finding Ensembles

- 1) Perform unsupervised feature selection.
- 2) Combine the classifiers produced in the 1st level to provide a set of powerful ensembles (see the sketch after this list).
- Each gene of the chromosome stands for one classifier of the Pareto front generated during feature selection.
- If a chromosome has all bits set, all classifiers of the Pareto front are included in the ensemble.
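A minimal sketch of this encoding follows: a binary chromosome with one bit per 1st-level Pareto classifier, where the decoded ensemble is scored here by a simple majority vote. The voting rule and integer class labels are assumptions made for illustration, not necessarily the combination rule used in the paper.

    # 2nd-level encoding sketch: one bit per 1st-level Pareto classifier.
    # Majority voting and integer class labels are illustrative assumptions.
    import numpy as np

    def decode(chromosome, pareto_classifiers):
        """Return the classifiers whose bit is set in the chromosome."""
        return [clf for bit, clf in zip(chromosome, pareto_classifiers) if bit]

    def ensemble_recognition_rate(chromosome, pareto_classifiers, X, y):
        """One objective: recognition rate of the decoded ensemble."""
        members = decode(chromosome, pareto_classifiers)
        if not members:                 # an empty ensemble is invalid
            return 0.0
        votes = np.stack([clf.predict(X) for clf in members])  # (members, samples)
        majority = np.apply_along_axis(
            lambda col: np.bincount(col).argmax(), 0, votes)
        return float(np.mean(majority == y))

    # Example chromosome over a Pareto front of 7 classifiers: the 1st, 3rd,
    # 4th, and 7th classifiers form the ensemble.
    chromosome = np.array([1, 0, 1, 1, 0, 0, 1])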
11. Summary of the 1st-Level Classifiers
12. 2nd-Level Population

[Figure: classifiers 1, 2, ..., n of the 1st-level Pareto front mapped to the bits of the 2nd-level population.]
13. Objective Functions

- Goal: find the most diverse set of classifiers that yields good generalization.
- Maximization of the recognition rate of the ensemble.
- Maximization of a measure of diversity (overlap, entropy, ambiguity, etc.).
- Explore different trade-offs between performance and diversity.
14. Ambiguity

- If the classifiers implement the same function, the ambiguity will be small (a sketch of this measure follows).
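Below is a minimal sketch of an ambiguity-style diversity measure in the spirit of Krogh and Vedelsby: the average squared deviation of each member's output from the ensemble mean, assuming each classifier emits a per-sample score vector. The exact definition used in the paper may differ.

    # Ambiguity-style diversity sketch: average squared deviation of each
    # member's output from the ensemble mean. Identical members give zero
    # ambiguity. Per-sample score vectors are an illustrative assumption.
    import numpy as np

    def ambiguity(member_outputs):
        """member_outputs: array of shape (n_members, n_samples, n_classes)."""
        ensemble_mean = member_outputs.mean(axis=0)      # (n_samples, n_classes)
        return float(((member_outputs - ensemble_mean) ** 2).mean())

    # Two identical members -> ambiguity 0; disagreeing members -> > 0.
    agree = np.array([[[0.9, 0.1]], [[0.9, 0.1]]])
    disagree = np.array([[[0.9, 0.1]], [[0.2, 0.8]]])
    print(ambiguity(agree), ambiguity(disagree))   # 0.0  0.1225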
15. Experimental Results

- Ensembles produced by the 2nd level for:
  - Concavities (F1)
  - Distances 32 (F2)
  - Distances 64 (F3)
  - F1 ∪ F2 ∪ F3
16. Performance on the Test Set and Improvements
17. Adding a Different Type of Classifier to the Ensemble

- Goal: show that the algorithm is able to find the most complementary classifiers to compose the ensembles.
- Global features: ascenders, descenders, loops, etc.
- Good performance when combined with other features.
- 87.2% on the test set.
18. Adding a Different Type of Classifier to the Ensemble

[Figure: ensembles found in the experiments, each including the global-feature classifier (G).]
19. Results with Global Features

- Worthy of remark:
  - G was selected in all four experiments.
  - Reduction in the size of the teams.
  - Improvement in the recognition rates.
20. Conclusion

- Based on the results on different feature sets, we can conclude that unsupervised feature selection (UFS) is able to generate a set of diverse classifiers.
- The second level of the algorithm succeeds in finding complementary classifiers to compose the ensembles.
- Remarkable improvements in terms of recognition rates at the zero-rejection level and at fixed error rates.
21. Thanks!!
22. Concavities Contour (F1)
23. Distances 32 (F2)
24. Distances 64 (F3)
25. F1 ∪ F2 ∪ F3