10. Feature subset selection & feature weighting

1
(No Transcript)
2
Filter methods
  • T statistic
  • Information
  • Distance
  • Correlation
  • Separability
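
As a rough illustration of the first criterion above (the T statistic), here is a minimal sketch, not taken from the slides, that scores each feature of a binary-class problem with a two-sample t statistic; the toy data and the choice of keeping the top-k features are assumptions made only for the example.

import numpy as np

def t_statistic_scores(X, y):
    """Score each feature by a two-sample t statistic (classes 0/1)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    v0, v1 = X0.var(axis=0, ddof=1), X1.var(axis=0, ddof=1)
    n0, n1 = len(X0), len(X1)
    # A large |t| means the feature separates the two classes well.
    return np.abs(m0 - m1) / np.sqrt(v0 / n0 + v1 / n1 + 1e-12)

# Toy usage: keep the 2 highest-scoring features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 2] + 0.1 * rng.normal(size=100) > 0).astype(int)
scores = t_statistic_scores(X, y)
print("ranking:", np.argsort(scores)[::-1], "selected:", np.argsort(scores)[::-1][:2])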

3
FSS Algorithms
  • Exponential
    • Exhaustive search
    • Branch & Bound
    • Beam search
  • Sequential
    • SFS and SBS
    • Plus-l / Minus-r
    • Bidirectional
    • Floating (exponential in worst case)
  • Randomized
    • Sequential with randomness
    • Genetic

4
SFS performs best when the optimal subset has a small number of features. When the search is near the empty set, a large number of states can be potentially evaluated. Towards the full set, the region examined by SFS is narrower, since most of the features have already been selected.
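A minimal SFS sketch consistent with this description: start from the empty set and greedily add the feature that most improves a subset criterion J. The criterion used below (correlation of a subset average with a toy target) is purely illustrative and not taken from the slides.

import numpy as np

def sfs(n_features, J, k):
    """Sequential Forward Selection: greedily add the feature that most
    improves the subset criterion J, starting from the empty set."""
    selected, remaining = [], set(range(n_features))
    while len(selected) < k:
        best = max(remaining, key=lambda f: J(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Illustrative criterion: correlation of the subset average with a toy target.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
y = X[:, 0] + 0.5 * X[:, 3] + 0.1 * rng.normal(size=200)
J = lambda S: abs(np.corrcoef(X[:, list(S)].mean(axis=1), y)[0, 1])
print(sfs(X.shape[1], J, k=2))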
5
Example
The optimal feature subset turns out to be {x1, x4}, because x4 provides the only information that x1 needs: discrimination between classes ω4 and ω5.
6
SBS works best when the optimal feature subset has a large number of features, since it spends most of its time visiting large subsets. The main limitation of SBS is its inability to re-evaluate the usefulness of a feature after it has been discarded.
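The mirror-image SBS sketch, with the same illustrative criterion as in the SFS example; note that a dropped feature is never reconsidered, which is exactly the limitation described above.

import numpy as np

def sbs(n_features, J, k):
    """Sequential Backward Selection: starting from the full set, repeatedly
    drop the feature whose removal leaves the best criterion value."""
    selected = list(range(n_features))
    while len(selected) > k:
        drop = max(selected, key=lambda f: J([g for g in selected if g != f]))
        selected.remove(drop)   # the dropped feature is never reconsidered
    return selected

# Same illustrative criterion as in the SFS sketch.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
y = X[:, 0] + 0.5 * X[:, 3] + 0.1 * rng.normal(size=200)
J = lambda S: abs(np.corrcoef(X[:, list(S)].mean(axis=1), y)[0, 1])
print(sbs(X.shape[1], J, k=2))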
7
SFS is performed from the empty set, SBS from the full set. Features selected by SFS are not removed by SBS, and features removed by SBS are not selected by SFS; these two constraints guarantee that both searches converge to the same solution.
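A sketch of this bidirectional scheme under the two constraints above; the criterion J and the toy data are illustrative assumptions.

import numpy as np

def bidirectional_search(n_features, J):
    """Run SFS and SBS in parallel. Features selected by SFS are never removed
    by SBS, and features removed by SBS are never selected by SFS, so the two
    searches converge to the same subset."""
    forward = set()                     # SFS side, grows from the empty set
    backward = set(range(n_features))   # SBS side, shrinks from the full set
    while forward != backward:
        # SFS step: add the best feature among those SBS has not yet removed.
        candidates = backward - forward
        forward.add(max(candidates, key=lambda f: J(forward | {f})))
        if forward == backward:
            break
        # SBS step: among features not selected by SFS, drop the one whose
        # removal leaves the highest criterion value.
        candidates = backward - forward
        backward.discard(max(candidates, key=lambda f: J(backward - {f})))
    return forward

# Illustrative criterion: correlation of the subset average with a toy target.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
y = X[:, 0] + 0.5 * X[:, 3] + 0.1 * rng.normal(size=200)
J = lambda S: abs(np.corrcoef(X[:, sorted(S)].mean(axis=1), y)[0, 1])
print(sorted(bidirectional_search(X.shape[1], J)))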
8
(No Transcript)
9
Plus-l / Minus-r offers some backtracking ability. Its main limitation is that there is no theoretical way of predicting the optimal values of l and r.
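A sketch of Plus-l / Minus-r selection under the usual assumptions l > r and k not larger than the number of features; apart from the add-l / remove-r structure, everything below (criterion, data, default l and r) is illustrative.

import numpy as np

def plus_l_minus_r(n_features, J, k, l=2, r=1):
    """Plus-l / Minus-r selection: add the l best features by forward steps,
    then discard the r least useful ones by backward steps, until k features
    are selected. Assumes l > r and k <= n_features."""
    selected = set()
    while True:
        for _ in range(l):                                   # l forward steps
            if len(selected) < n_features:
                selected.add(max(set(range(n_features)) - selected,
                                 key=lambda f: J(selected | {f})))
        if len(selected) >= k:
            return selected
        for _ in range(r):                                   # r backward steps
            selected.discard(max(selected, key=lambda f: J(selected - {f})))

# Same illustrative criterion as in the SFS sketch.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
y = X[:, 0] + 0.5 * X[:, 3] + 0.1 * rng.normal(size=200)
J = lambda S: abs(np.corrcoef(X[:, sorted(S)].mean(axis=1), y)[0, 1])
print(sorted(plus_l_minus_r(X.shape[1], J, k=3)))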
10
(No Transcript)
11
(No Transcript)
12
Monotonicity in J: for nested subsets S ⊆ S', J(S) ≤ J(S'), so the criterion can only decrease as features are removed.
13
Let B be the criterion value J at the current node and A the best leaf value found so far, initialized with the first leaf reached, A = J(1,2)
  • The value of A is updated when a greater one is found in a leaf
  • A > B ⇒ purge the subtree: by monotonicity, no leaf below the node can exceed B, so none can beat A
  • Stop whenever every leaf has been purged or evaluated
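A sketch of this pruning rule, assuming a monotone criterion J; the per-feature scores below are made up purely so that monotonicity holds.

def branch_and_bound(features, J, d):
    """Branch & Bound: leaves are subsets of size d, A is the best leaf value
    found so far, B the criterion value at the current node. Because J is
    monotone, no leaf below a node can exceed that node's B, so A > B lets
    the whole subtree be purged without losing the optimum."""
    best = {"A": float("-inf"), "subset": None}

    def expand(subset, start):
        B = J(subset)
        if best["A"] > B:                 # A > B -> purge
            return
        if len(subset) == d:              # leaf: update the bound A
            best["A"], best["subset"] = B, subset
            return
        # Branch by removing one more feature; `start` prevents reaching the
        # same subset through two different removal orders.
        for i in range(start, len(subset)):
            expand(subset[:i] + subset[i + 1:], i)

    expand(list(features), 0)
    return best["subset"], best["A"]

# Illustrative monotone criterion: each feature contributes a fixed
# non-negative amount, so removing a feature can never increase J.
scores = [0.9, 0.1, 0.7, 0.4, 0.2]
J = lambda S: sum(scores[f] for f in S)
print(branch_and_bound(range(5), J, d=2))   # expected subset: [0, 2]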
14
(No Transcript)
15
(No Transcript)
16
With a proper queue size, Beam Search can avoid getting trapped in local minima by preserving solutions from varying regions of the search space. In the example, however, the optimal subset 2-3-4 (J = 9) is never explored.
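A sketch of Beam Search over feature subsets; the interaction-heavy criterion J below is a made-up example meant to mimic the situation in the slide, where a narrow beam never explores the optimal subset while a wider one can.

def beam_search(n_features, J, k, width):
    """Beam search over feature subsets: at each level keep only the `width`
    best subsets instead of a single one (width = 1 reduces to SFS)."""
    beam = [frozenset()]
    for _ in range(k):
        candidates = {s | {f} for s in beam
                      for f in range(n_features) if f not in s}
        beam = sorted(candidates, key=J, reverse=True)[:width]
    return max(beam, key=J)

# Made-up criterion in which features 2, 3 and 4 are only strong together.
WEIGHTS = [0.5, 0.4, 0.1, 0.1, 0.1]

def J(S):
    score = sum(WEIGHTS[f] for f in S)
    if {3, 4} <= S:
        score += 1.0     # pairwise interaction
    if {2, 3, 4} <= S:
        score += 3.0     # the full interaction only appears at size 3
    return score

print(sorted(beam_search(5, J, k=3, width=1)))   # width 1 behaves like SFS and misses 2-3-4
print(sorted(beam_search(5, J, k=3, width=5)))   # a wider beam keeps 3-4 alive and finds 2-3-4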
17
(No Transcript)
18
(No Transcript)
19
(No Transcript)
20
(No Transcript)
21
The RELIEF algorithm (1)
22
The RELIEF algorithm (2)
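Since the RELIEF slides are image-only, here is a hedged sketch of the commonly cited form of the algorithm (binary classes, numeric features, L1 distance, weight updates averaged over the sampled instances); the iteration count and the toy data are assumptions.

import numpy as np

def relief(X, y, n_iter=100, seed=0):
    """RELIEF feature weighting: for each sampled instance, a feature's weight
    grows if it differs from the nearest miss (different class) and shrinks if
    it differs from the nearest hit (same class)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        x, c = X[i], y[i]
        dist = np.abs(X - x).sum(axis=1)      # L1 distance to every instance
        dist[i] = np.inf                      # exclude the instance itself
        hit = np.argmin(np.where(y == c, dist, np.inf))
        miss = np.argmin(np.where(y != c, dist, np.inf))
        w += (np.abs(x - X[miss]) - np.abs(x - X[hit])) / n_iter
    return w

# Toy usage: feature 0 carries the class, feature 1 is noise.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=200)
X = np.column_stack([y + 0.1 * rng.normal(size=200), rng.normal(size=200)])
print(relief(X, y))   # the weight of feature 0 should clearly dominate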
23
Conclusions
  • Difficult and pervasive problem!
  • Lack of accepted and useful definitions for
    relevance, redundancy and irrelevance
  • Nesting problem
  • Abundance of algorithms and filters
  • Lack of proper comparative benchmarks
  • Obligatory step, usually well worth the pain