1
Fuzzy Entropy based feature selection for classification of hyperspectral data
Mahesh Pal
Department of Civil Engineering
National Institute of Technology, Kurukshetra
2
  • Hyperspectral data
  • Measurement of radiation in the visible to the infrared spectral region in many finely spaced spectral wavebands.
  • Provides greater detail on the spectral variation of targets than conventional multispectral systems.
  • The availability of large amounts of data represents a challenge to classification analyses.
  • Each spectral waveband used in the classification process should add an independent set of information. In practice, however, the features are highly correlated, suggesting a degree of redundancy in the available information, which can have a negative impact on classification accuracy.

3
An example
MULTISPECTRAL DATA
Discrete wavebands, for example Landsat 7: Band 1: 0.45-0.515 µm; Band 2: 0.525-0.605 µm; a total of six bands between 0.45 and 2.235 µm.
HYPERSPECTRAL DATA
DAIS data: a total of 72 bands between 0.502 and 2.395 µm; continuous bands at 10-45 nm bandwidth.
(Spectral regions: 0.4-0.7 µm visible, 0.7-1.3 µm NIR, 1.3-3.0 µm MIR, 3-100 µm thermal.)
4
  • Various approaches could be adopted for the appropriate classification of high dimensional data:
  • Adoption of a classifier that is relatively insensitive to the Hughes effect (Vapnik, 1995).
  • Using methods to effectively increase training set size, i.e. semi-supervised classification (Chi and Bruzzone, 2005) and the use of unlabelled data (Shahshahani and Landgrebe, 1994).
  • Use of some form of dimensionality reduction procedure prior to the classification analysis.

5
Feature reduction
  1. Two broad categories are feature selection and feature extraction.
  2. Feature reduction may speed up the classification process by reducing data set size.
  3. May increase the predictive accuracy.
  4. May increase the ability to understand the classification rules.
  5. Feature selection selects a subset of the original features that maintains the information useful for separating the classes, removing redundant features.

6
Feature selection
Three approaches to feature selection are:
Filters: use a search algorithm to search through the space of possible features and evaluate each feature using a filter measure such as correlation or mutual information.
Wrappers: use a search algorithm to search through the space of possible feature subsets and evaluate each subset using a classification algorithm.
Embedded: some classification processes, such as random forests, produce a ranked list of features during classification.
This study aims to explore the usefulness of four filter based feature selection approaches.
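As an illustration of the filter idea, the sketch below scores each band by mutual information with the class labels and keeps the top-ranked bands. This is a generic filter measure, not necessarily the exact one used in this study, and the data arrays are random stand-ins.

```python
# Filter-style band ranking: score each band independently against the
# class labels, then keep the k best. Assumes scikit-learn is available;
# X and y are hypothetical stand-ins for the image data.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def filter_rank(X, y, k=20):
    """Rank bands by mutual information with the class labels."""
    scores = mutual_info_classif(X, y, random_state=0)
    order = np.argsort(scores)[::-1]      # best-scoring bands first
    return order[:k]

rng = np.random.default_rng(0)
X = rng.random((800, 65))                 # 800 pixels, 65 bands (as here)
y = rng.integers(0, 8, size=800)          # eight land-cover classes
print(filter_rank(X, y))
```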
7
Feature selection approaches
  • Four filter based feature selection approaches were used:
  • Entropy
  • Fuzzy entropy
  • Signal-to-noise ratio
  • RELIEF

8
Entropy and Fuzzy Entropy
For a finite set X = {x_1, x_2, ..., x_n}, if P = {p_1, p_2, ..., p_n} is the probability distribution on X, Yager's entropy provides a measure of the information in P; its fuzzy extension below follows Hu and Yu (2005).
For a fuzzy information system (U, A, V, f) (Hu and Yu, 2005), where U is a finite set of objects and A is a set of features: if Q is a subset of the attribute set A and M(Q) = (r_ij) is the fuzzy relation matrix induced by the indiscernibility relation of Q, the fuzzy entropy of Q is

    H(Q) = -(1/|U|) Σ_i log2(|[x_i]_Q| / |U|),  where |[x_i]_Q| = Σ_j r_ij.

The significance of a feature a with respect to Q is defined as

    Sig(a, Q) = H(Q ∪ {a}) - H(Q).

If Sig(a, Q) = 0, attribute a is considered redundant. Further details of this algorithm can be found in Hu and Yu (2005).
9
Signal to noise ratio
This approach ranks all features according to how well each feature discriminates between two classes. To use this approach for a multiclass classification problem, a one-against-one approach was used in this study.
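A minimal sketch of the S2N score for one class pair, s = |mean_1 - mean_2| / (std_1 + std_2) per band, with a simple average over all class pairs standing in for the one-against-one extension; that aggregation rule is an assumption, as the slides do not specify it.

```python
# Signal-to-noise ranking: per-band separability for each class pair,
# averaged over all pairs; X and y are hypothetical stand-ins.
import numpy as np
from itertools import combinations

def s2n_pair(X, y, c1, c2):
    a, b = X[y == c1], X[y == c2]
    return np.abs(a.mean(0) - b.mean(0)) / (a.std(0) + b.std(0) + 1e-12)

def s2n_rank(X, y, k=20):
    scores = np.mean([s2n_pair(X, y, c1, c2)
                      for c1, c2 in combinations(np.unique(y), 2)], axis=0)
    return np.argsort(scores)[::-1][:k]   # k most discriminative bands

rng = np.random.default_rng(0)
X, y = rng.random((800, 65)), rng.integers(0, 8, 800)
print(s2n_rank(X, y))
```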
10
RELIEF
  • The general idea of RELIEF is to choose the features that best distinguish between classes.
  • At each step of an iterative process, an instance is chosen at random from the dataset and the weight of each feature is updated according to the distance of this instance to its near-miss and near-hit (Kira and Rendell, 1992); a sketch of this update follows below.
  • An instance from the dataset is a near-hit of X if it belongs to the close neighbourhood of X and to the same class as X.
  • An instance is a near-miss if it belongs to the neighbourhood of X but not to the same class as X.

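A compact sketch of the RELIEF weight update described above, in the basic form of Kira and Rendell (1992); the Manhattan distance, iteration count and stand-in data are assumptions.

```python
# RELIEF: repeatedly pick a random instance, find its near-hit (same
# class) and near-miss (other class), and shift feature weights toward
# features that separate the near-miss and agree with the near-hit.
import numpy as np

def relief(X, y, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        i = rng.integers(len(X))
        d = np.abs(X - X[i]).sum(axis=1)  # Manhattan distances to X[i]
        d[i] = np.inf                     # exclude the instance itself
        same, diff = y == y[i], y != y[i]
        hit = X[np.where(same)[0][np.argmin(d[same])]]
        miss = X[np.where(diff)[0][np.argmin(d[diff])]]
        w += (X[i] - miss) ** 2 - (X[i] - hit) ** 2
    return w / n_iter

rng = np.random.default_rng(0)
X, y = rng.random((800, 65)), rng.integers(0, 8, 800)
print(np.argsort(relief(X, y))[::-1][:20])    # 20 highest-weight bands
```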
11
Data Set
  • DAIS 7915 sensor of the German Space Agency, flown on 29 June 2000.
  • The sensor acquires information in 79 bands at a spatial resolution of 5 m in the wavelength range of 0.502-12.278 µm.
  • Seven features located in the mid- and thermal-infrared region and seven features from the spectral region of 0.502-2.395 µm were removed due to striping noise.
  • An area of 512 pixels by 512 pixels and 65 features covering the test site was used.

13
Training and test data
  1. Random sampling using a ground reference image was used to collect training and test data.
  2. Eight land cover classes: wheat, water, salt lake, hydrophytic vegetation, vineyards, bare soil, pasture and built-up land.
  3. A total of 800 training pixels and 3800 test pixels were used.

14
  • Classification Method
  • Support vector machines with a one-against-one approach for multiclass data were used.
  • A radial basis function kernel was used.
  • A regularisation parameter C = 5000 and kernel parameter gamma = 2 were used; a sketch of this setup follows below.
  • For each feature selection approach, classification accuracy was obtained with the test dataset.
  • A test for non-inferiority based on the McNemar test was used.

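A sketch of this setup with scikit-learn's SVC, which implements the one-against-one strategy for multiclass data internally; the data arrays and the set of selected bands are stand-ins.

```python
# SVM classification on a selected band subset with the slide's settings
# (RBF kernel, C = 5000, gamma = 2); data are random stand-ins.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train, y_train = rng.random((800, 65)), rng.integers(0, 8, 800)
X_test, y_test = rng.random((3800, 65)), rng.integers(0, 8, 3800)
selected = np.arange(14)                  # e.g. 14 fuzzy-entropy bands

clf = SVC(kernel="rbf", C=5000, gamma=2)  # one-against-one by default
clf.fit(X_train[:, selected], y_train)
print("test accuracy:", clf.score(X_test[:, selected], y_test))
```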
15
Selected features with different feature selection approaches

Feature selection method    Selected features
Entropy                     32, 51, 63, 35, 8, 49, 42, 27, 48, 64, 6, 50, 65, 11, 53, 39, 22
Fuzzy entropy               32, 41, 50, 6, 27, 63, 36, 49, 10, 22, 65, 51, 40, 48
Relief                      3, 4, 2, 11, 10, 5, 8, 6, 9, 7, 12, 1, 13, 23, 22, 25, 24, 20, 31, 30
Signal to noise ratio       5, 7, 8, 9, 6, 10, 11, 4, 12, 3, 32, 31, 33, 30, 24, 23, 25, 29, 13, 26
16
Classification accuracy with the SVM classifier and different selected features

Feature selection method    Number of features used    Classification accuracy (%)
No feature selection        65                         91.76
Fuzzy entropy               14                         91.68
Entropy                     17                         91.61
Signal to noise ratio       20                         91.68
Relief                      20                         88.61
17
Difference and non-inferiority test results, based on the 95% confidence interval of the estimated difference in accuracy between the accuracy achieved with all 65 features and that achieved with the feature sets selected using the different approaches.

Number of features    Accuracy (%)    Difference in accuracy (%)    95% confidence interval    Conclusion (at 0.05 level of significance)
65                    91.76           0.00                          0.000-0.000                -
14                    91.68           0.36                          0.071-0.089                Non-inferior
17                    91.61           0.13                          0.142-0.158                Non-inferior
20                    91.68           0.26                          0.071-0.089                Non-inferior
20                    88.61           3.00                          3.140-3.160                Inferior
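A sketch of the McNemar comparison underlying this table, computed from paired predictions on the same test pixels; the non-inferiority margin and the confidence-interval construction used in the study are not given on the slides, so only the basic paired test is shown.

```python
# McNemar test on paired predictions: count pixels where exactly one of
# the two classifiers is correct, then apply the continuity-corrected
# chi-square statistic with one degree of freedom.
import numpy as np
from scipy.stats import chi2

def mcnemar_test(y_true, pred_a, pred_b):
    a_ok, b_ok = pred_a == y_true, pred_b == y_true
    n01 = np.sum(a_ok & ~b_ok)            # A correct, B wrong
    n10 = np.sum(~a_ok & b_ok)            # A wrong, B correct
    stat = (abs(int(n01) - int(n10)) - 1) ** 2 / max(n01 + n10, 1)
    return stat, chi2.sf(stat, df=1)      # statistic and p-value

rng = np.random.default_rng(0)
y = rng.integers(0, 8, 3800)
pa, pb = y.copy(), y.copy()
pa[:100] = (pa[:100] + 1) % 8             # classifier A errs on 100 pixels
pb[:250] = (pb[:250] + 1) % 8             # classifier B errs on 250 pixels
print(mcnemar_test(y, pa, pb))
```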
18
Conclusions
  • The fuzzy entropy based feature selection approach works well with this dataset and provides comparable performance with a small number of selected features.
  • The accuracy achieved by the signal to noise ratio and entropy based approaches is also comparable to that achieved with the full dataset, but these approaches require more selected features than the fuzzy entropy based approach.
  • Results with the Relief based approach show a significant decline in classification accuracy in comparison to the full dataset.