Feature Subset Selection Using Genetic Algorithms for Handwritten Digit Recognition

14th Brazilian Symposium on Computer Graphics and Image Processing

Provided by: luizeso
1
Feature Subset Selection Using Genetic Algorithms
for Handwritten Digit Recognition
  • L. S. Oliveira, N. Benahmed, R. Sabourin,
    F. Bortolozzi, and C. Y. Suen

Pontifícia Universidade Católica do Paraná (PUCPR), Brazil
École de Technologie Supérieure (ETS), Canada
Centre for Pattern Recognition and Machine Intelligence (CENPARMI), Canada
2
Introduction
  • Features can distinguish one class of patterns
    from another in a more concise and meaningful way.
  • It is not unusual to find problems involving
    hundreds of features.
  • Beyond a certain point, the inclusion of
    additional features leads to worse rather than
    better performance.
  • This is the feature subset selection problem.

3
Feature Subset Selection
  • Reduce the number of features used in
    classification while maintaining an acceptable
    classification accuracy.
  • Filter approach: feature selection is done
    independently of the learning algorithm used to
    build the classifier.
  • Wrapper approach: feature selection takes the
    learning algorithm into account.
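
The contrast between the two approaches can be sketched in a few lines. This is only an illustration: the variance score used for the filter and the `evaluate` callback used for the wrapper are assumptions made for the example, not the criteria used in the presentation.

```python
from statistics import pvariance
from typing import Callable, List, Sequence


def filter_select(X: Sequence[Sequence[float]], k: int) -> List[int]:
    """Filter: rank features by a classifier-independent score.

    Variance is used here purely as a stand-in score (assumption).
    Returns the indices of the k highest-scoring features, sorted.
    """
    n_features = len(X[0])
    scores = [pvariance([row[j] for row in X]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def wrapper_score(subset: List[int],
                  evaluate: Callable[[List[int]], float]) -> float:
    """Wrapper: a subset is scored by the learner itself.

    `evaluate` is a hypothetical callback that trains or runs the
    classifier on the given feature indices and returns its accuracy.
    """
    return evaluate(subset)
```

The filter never calls the classifier, so it is cheap; the wrapper pays one classifier evaluation per candidate subset.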

4
Feature Set and Classifier
  • Feature set: a mixture of concavity and
    contour-based features (132 components).
  • Classifier: neural network trained with the
    backpropagation (BP) algorithm.
  • Database: NIST SD19, with 195,000, 60,089 and
    58,646 images for training, validation and test,
    respectively.
  • Performance: 99.13% on the validation set (hsf_7)
    and 97.52% on the test set (hsf_4).

5
Genetic Algorithms for Feature Subset Selection
  • Practical applications such as handwriting
    recognition present a multi-criterion
    optimization problem with two criteria: the
    number of features and the accuracy of
    classification.
  • Genetic algorithms (GA) are quite effective for
    rapid global search of large, non-linear and
    poorly understood spaces.
  • They are also effective in solving large-scale
    problems.
  • Two variants are considered: simple GA and
    iterative GA.

6
Genetic Algorithm
  • A model of machine learning whose behavior is
    derived from a metaphor of some of the
    mechanisms of evolution in nature.
  • Population based.
  • The quality of each individual is evaluated
    through a fitness function.
  • Main operators: selection, crossover and
    mutation.

7
Genetic Algorithm
Procedure
Begin
  t ← 0
  initialize P(t)
  while (not termination condition)
    t ← t + 1
    select P(t) from P(t-1)
    crossover P(t)
    mutate P(t)
    evaluate P(t)
  end
end
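
The procedure above can be sketched in Python as follows. This is a minimal skeleton under stated assumptions: the operators are passed in as callables, the termination condition is a fixed number of generations, and the initial population is evaluated once before the loop so that the first selection step has fitness values to work with (the slide does not show that step). The name `run_ga` and its signature are hypothetical.

```python
import random
from typing import Callable, List

Chromosome = List[int]
Population = List[Chromosome]


def run_ga(
    pop_size: int,
    n_bits: int,
    generations: int,
    fitness: Callable[[Chromosome], float],
    select: Callable[[Population, List[float], random.Random], Population],
    crossover: Callable[[Population, random.Random], Population],
    mutate: Callable[[Population, random.Random], Population],
    rng: random.Random,
) -> Chromosome:
    # t = 0: initialize P(t) with random bit strings.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    # Assumption: score P(0) so the first selection has fitness values.
    scores = [fitness(c) for c in population]
    for _t in range(1, generations + 1):              # t <- t + 1
        population = select(population, scores, rng)  # select P(t) from P(t-1)
        population = crossover(population, rng)       # crossover P(t)
        population = mutate(population, rng)          # mutate P(t)
        scores = [fitness(c) for c in population]     # evaluate P(t)
    # Return the fittest individual of the final population.
    best = max(range(len(population)), key=lambda i: scores[i])
    return population[best]
```

Passing the operators as arguments keeps the loop independent of any particular selection, crossover or mutation scheme.
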
8
Representation and Operators
  • Binary representation, which is the most
    straightforward scheme.
  • Operators: bit-flip mutation, one-point
    crossover and roulette wheel selection.
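
The three operators named above have standard textbook forms, sketched below on bit lists. The function names are hypothetical, and the roulette wheel version assumes non-negative fitness values where higher is better.

```python
import random
from typing import List, Tuple

Chromosome = List[int]


def bit_flip_mutation(chrom: Chromosome, p_mut: float,
                      rng: random.Random) -> Chromosome:
    """Flip each bit independently with probability p_mut."""
    return [1 - b if rng.random() < p_mut else b for b in chrom]


def one_point_crossover(a: Chromosome, b: Chromosome,
                        rng: random.Random) -> Tuple[Chromosome, Chromosome]:
    """Cut both parents at one random point and swap the tails.

    Assumes both parents have the same length, at least 2.
    """
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def roulette_wheel_select(population: List[Chromosome],
                          fitnesses: List[float],
                          rng: random.Random) -> Chromosome:
    """Pick one individual with probability proportional to its fitness."""
    pick = rng.random() * sum(fitnesses)
    running = 0.0
    for chrom, fit in zip(population, fitnesses):
        running += fit
        if pick < running:
            return chrom
    return population[-1]  # guard against floating-point round-off
```

With a binary chromosome, a 1 at position i would mean that feature i is kept and a 0 that it is dropped.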

9
Parameters
  • Parameter settings:
  • Population size: 30.
  • Number of generations: 1000.
  • Probability of crossover: 0.8.
  • Probability of mutation: 0.007.
  • Values based on the results of several
    preliminary runs.
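
A quick calculation shows what these settings imply. It assumes one bit per feature (a 132-bit chromosome, matching the 132-component feature set) and that the mutation probability applies per bit; neither is stated explicitly on the slide.

```python
# Settings from the slide.
POP_SIZE = 30
N_GENERATIONS = 1000
P_CROSSOVER = 0.8
P_MUTATION = 0.007

# Assumption: one bit per feature in the 132-component feature set.
N_FEATURES = 132

# Expected number of flipped bits per chromosome per generation,
# assuming the mutation probability is applied to each bit.
expected_flips = N_FEATURES * P_MUTATION

# Upper bound on fitness evaluations if every individual is
# re-evaluated in every generation.
max_evaluations = POP_SIZE * N_GENERATIONS

print(f"expected bit flips per chromosome: {expected_flips:.3f}")
print(f"fitness evaluations (upper bound): {max_evaluations}")
```

Under these assumptions the mutation rate is close to 1/132, so each chromosome has slightly less than one bit flipped per generation on average.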

10
Objective Function
  • Two objectives: minimization of the number of
    features and minimization of the error rate of
    the classifier.
  • There is a set of alternative trade-offs.
  • Weighting method: aggregates the objectives into
    a single, parameterized objective through a
    linear combination.
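
A weighted linear combination of the two objectives could look like the sketch below. The weight `w_features`, its default value and the normalization of the feature count to the range 0 to 1 are assumptions for illustration; the slides do not give the actual weights or scaling.

```python
def weighted_cost(n_selected: int, n_total: int, error_rate: float,
                  w_features: float = 0.5) -> float:
    """Single cost to minimize, combining both objectives linearly.

    n_selected / n_total is the fraction of features kept, and
    error_rate is the classifier error as a fraction in [0, 1].
    w_features is a hypothetical weight in [0, 1].
    """
    feature_ratio = n_selected / n_total
    return w_features * feature_ratio + (1.0 - w_features) * error_rate


# Two hypothetical candidates as (features kept, error rate).
candidates = {"A": (100, 0.01), "B": (60, 0.03)}


def best_candidate(w_features: float) -> str:
    """Name of the candidate with the lowest weighted cost."""
    return min(
        candidates,
        key=lambda name: weighted_cost(candidates[name][0], 132,
                                       candidates[name][1], w_features),
    )
```

Changing the weight changes which candidate wins, which is why the weighting method yields one point of the trade-off set per parameter choice.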

11
Fitness Evaluation
  • Wrapper approach: evaluating each chromosome
    requires training the corresponding neural net
    and computing its accuracy.
  • This is not feasible because of the training
    time required on a huge database.
  • Sensitivity analysis: uses the sensitivity of
    the net to estimate the relationship between
    input features and network performance.
  • Unselected features are replaced by their
    average value computed on the training set.
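
The replacement step can be sketched as follows: the network is trained once on all features, and a chromosome is scored by feeding the already-trained network inputs in which every unselected feature is set to its training-set mean. The `predict` callable stands in for the trained network and, like the function names, is a hypothetical placeholder.

```python
from typing import Callable, List, Sequence


def feature_means(X_train: Sequence[Sequence[float]]) -> List[float]:
    """Average value of each feature over the training set."""
    n = len(X_train)
    return [sum(row[j] for row in X_train) / n
            for j in range(len(X_train[0]))]


def mask_features(x: Sequence[float], mask: Sequence[int],
                  means: Sequence[float]) -> List[float]:
    """Keep selected features (mask bit 1); replace the rest by their mean."""
    return [xi if m else mu for xi, m, mu in zip(x, mask, means)]


def masked_accuracy(predict: Callable[[List[float]], int],
                    X: Sequence[Sequence[float]], y: Sequence[int],
                    mask: Sequence[int], means: Sequence[float]) -> float:
    """Accuracy of an already-trained classifier on masked inputs.

    No retraining happens here, which is what makes each fitness
    evaluation cheap compared with a full wrapper.
    """
    correct = sum(predict(mask_features(x, mask, means)) == label
                  for x, label in zip(X, y))
    return correct / len(X)
```

Because only forward passes are needed, the cost of scoring one chromosome is a single pass over the evaluation set.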

12
Different Approaches of GA
  • Simple GA.
  • Iterative GA: speeds up the convergence of the
    algorithm by restricting the search space.
  • Main difference: the search mechanism.

13
Iterative GA
14
Experimental Results
15
Conclusion and Future Work
  • Two different approaches of GA for feature subset
    selection.
  • Modified wrapper approach using sensitivity
    analysis.
  • SGA.
  • Provide a reduction of about 30.
  • Error rates at the same level.
  • Future work: multi-objective optimization with
    Pareto-based approaches.