A Combinatorial Fusion Method for Feature Mining

1
A Combinatorial Fusion Method for Feature Mining
  • Ye Tian, Gary Weiss, D. Frank Hsu, Qiang Ma
  • Fordham University
  • Presented by Gary Weiss

2
Introduction
  • Feature construction/engineering is often a
    critical step in the data mining process
  • Can be very time-consuming and may require a lot
    of manual effort
  • Our approach is to use a combinatorial method to
    automatically construct new features
  • We refer to this as feature fusion
  • Geared toward helping to predict rare classes
  • For now it is restricted to numerical features,
    but it can be extended to other feature types

3
How does this relate to MMIS?
  • One MMIS category is local pattern analysis
  • How to efficiently identify quality knowledge
    from a single data source
  • This category lists data preparation and
    selection as subtopics and also mentions fusion
  • We acknowledge that this work probably is not
    what most people think of as MMIS

4
How can we view this work as MMIS?
  • Think of each feature as a piece of information
  • Our fusion approach integrates these pieces
  • Fusion itself is a proper topic for MMIS since it
    can also be used with multiple info sources
  • The fusion method we employ does not really care
    whether the information (i.e., the features)
    comes from a single source
  • As complexity of constructed features increases,
    each can be viewed as a classifier
  • Each fused feature is an information source
  • This view is bolstered by other work on data
    fusion that uses ensembles to combine the fused
    features

5
Description of the Method
  • A data set is a collection of records where each
    feature has a score
  • We assume numerical features
  • We then replace scores by ranks
  • Ordering of ranks is determined by whether larger
    or smaller scores better predict the class
  • Compute performance of each feature
  • Compute performance of feature combinations
  • Decide which combinations to evaluate/use

6
Step 1: A data set
7
Step 2: Scores replaced by Ranks
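The replacement of scores by ranks can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `scores_to_ranks` and the `larger_is_better` flag are hypothetical, and ties are broken by row order, which the slides do not specify.

```python
def scores_to_ranks(scores, larger_is_better=True):
    """Return 1-based ranks for one feature column.

    Rank 1 goes to the record whose score most strongly suggests the
    class of interest. Ties keep their original row order (an assumption;
    the slides do not say how ties are handled).
    """
    order = sorted(range(len(scores)),
                   key=lambda i: scores[i],
                   reverse=larger_is_better)
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


# Hypothetical scores for one feature over four records.
example_scores = [3.2, 1.5, 4.8, 2.0]
ranks_desc = scores_to_ranks(example_scores)                         # [2, 4, 1, 3]
ranks_asc = scores_to_ranks(example_scores, larger_is_better=False)  # [3, 1, 4, 2]
```

The `larger_is_better` flag stands in for the per-feature decision about whether larger or smaller scores better predict the class.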
8
Step 3: Compute Feature Performance
  • Performance measures how well a feature predicts
    the minority class
  • We sort rows by feature rank and measure
    performance on the top n% of rows, where n% of
    the records belong to the minority class
  • In this case we evaluate on the top 3 rows. Since
    2 of 3 are minority (class 1), performance is .66
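
The performance computation can be sketched as below, assuming the cutoff equals the number of minority-class records (3 in the slide's example). The function name and the sample ranks and labels are hypothetical; only the "2 of the top 3" outcome mirrors the slide.

```python
def feature_performance(ranks, labels, minority_label=1):
    """Fraction of the top-ranked rows that belong to the minority class.

    The number of rows inspected equals the number of minority-class
    records in the data set.
    """
    n_minority = sum(1 for y in labels if y == minority_label)
    if n_minority == 0:
        raise ValueError("no minority-class records")
    # Row indices sorted so that the best (smallest) rank comes first.
    top_rows = sorted(range(len(ranks)), key=lambda i: ranks[i])[:n_minority]
    hits = sum(1 for i in top_rows if labels[i] == minority_label)
    return hits / n_minority


# Hypothetical data: 8 records, 3 of them in the minority class (label 1).
ranks = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [1, 0, 1, 0, 0, 1, 0, 0]
performance = feature_performance(ranks, labels)  # 2 of the top 3 are minority
```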

9
Step 3 (continued)
10
Step 4: Compute Performance of Feature Combinations
  • Let F6 be the fused feature F1F2F3F4F5
  • Rank combination function is average of ranks
  • Compute rank of F6 for each record
  • Compute performance of F6 as in step 3
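
A minimal sketch of the rank combination step, assuming each feature has already been converted to a list of ranks of equal length. The name `fuse_ranks` and the sample ranks are hypothetical.

```python
def fuse_ranks(rank_columns):
    """Fuse several features by averaging their ranks record by record."""
    if not rank_columns:
        raise ValueError("need at least one feature")
    n_records = len(rank_columns[0])
    if any(len(column) != n_records for column in rank_columns):
        raise ValueError("all features must cover the same records")
    return [sum(column[i] for column in rank_columns) / len(rank_columns)
            for i in range(n_records)]


# Hypothetical ranks of two features over four records.
f1_ranks = [1, 2, 3, 4]
f2_ranks = [2, 4, 1, 3]
fused = fuse_ranks([f1_ranks, f2_ranks])  # [1.5, 3.0, 2.0, 3.5]
```

A smaller average rank marks a record the fused feature considers more likely to be in the minority class, so the fused values can be sorted and evaluated in the same way as a single feature.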

11
Step 5: What Combinations to Use?
  • Given n features there are 2^n - 1 possible
    combinations
  • C(n,1) + C(n,2) + ... + C(n,n)
  • This fully exhaustive fusion strategy is
    practical for many values of n
  • We try other strategies in case it is not feasible
  • k-exhaustive strategy selects k best features and
    tries all combinations
  • k-fusion strategy uses all n features but fuses
    at most k features at once
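
The three strategies can be sketched as enumerations over feature names. This is an illustration under assumptions: the function names, the feature names, and the performance values are hypothetical, and single features are counted as combinations of size 1, in line with the 2^n - 1 count above.

```python
from itertools import combinations


def fully_exhaustive(features):
    """All non-empty subsets of the features: 2^n - 1 combinations."""
    return [combo
            for size in range(1, len(features) + 1)
            for combo in combinations(features, size)]


def k_fusion(features, k):
    """Use all n features, but fuse at most k of them at once."""
    return [combo
            for size in range(1, min(k, len(features)) + 1)
            for combo in combinations(features, size)]


def k_exhaustive(features, performance, k):
    """Keep the k best features by performance, then try all combinations."""
    best = sorted(features, key=lambda f: performance[f], reverse=True)[:k]
    return fully_exhaustive(best)


features = ["F1", "F2", "F3", "F4", "F5"]
# Hypothetical per-feature performance values, for illustration only.
performance = {"F1": 0.33, "F2": 0.66, "F3": 0.00, "F4": 0.66, "F5": 1.00}

all_combos = fully_exhaustive(features)           # 2^5 - 1 = 31 combinations
pairs_at_most = k_fusion(features, 2)             # C(5,1) + C(5,2) = 15
top3_combos = k_exhaustive(features, performance, 3)  # 2^3 - 1 = 7
```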

12
Combinatorial Fusion Table
13
Combinatorial Fusion Algorithm
  • Combinatorial strategy generates features
  • Performance metric determines which are best
  • Used to determine which k features to use for
    k-exhaustive
  • Also used to determine order of features to add
  • We add a feature if it leads to a statistically
    significant improvement (p ≤ .10)
  • As measured on validation data
  • This limits the number of features
  • But requires a lot of computation
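
The selection loop described above can be sketched as a greedy pass over candidates that are already ordered by the performance metric. Everything named here is an assumption: `evaluate` and `is_significant` are hypothetical callbacks supplied by the caller, and the slides do not say which statistical test is used. The toy threshold check at the bottom is a stand-in for a real significance test, not a substitute for one.

```python
def select_fused_features(candidates, evaluate, is_significant):
    """Greedily keep fused features that significantly improve validation results.

    candidates     -- fused features, best first by the performance metric
    evaluate       -- hypothetical callback: list of selected features ->
                      list of validation scores for a classifier built with them
    is_significant -- hypothetical callback: (new_scores, old_scores) -> bool
    """
    selected = []
    best_scores = evaluate(selected)
    for candidate in candidates:
        # One classifier is built per candidate, which is the costly part.
        trial_scores = evaluate(selected + [candidate])
        if is_significant(trial_scores, best_scores):
            selected.append(candidate)
            best_scores = trial_scores
    return selected


# Toy stand-ins so the sketch runs end to end (all values are made up).
toy_gains = {"F1F2": 0.05, "F3F4": 0.001, "F2F5": 0.03}


def toy_evaluate(selected):
    return [0.70 + sum(toy_gains[f] for f in selected)]


def toy_is_significant(new_scores, old_scores):
    mean_new = sum(new_scores) / len(new_scores)
    mean_old = sum(old_scores) / len(old_scores)
    return mean_new - mean_old > 0.01  # placeholder, not a statistical test


chosen = select_fused_features(["F1F2", "F3F4", "F2F5"],
                               toy_evaluate, toy_is_significant)
```

Each call to `evaluate` trains and validates a classifier, which is why the procedure limits the number of added features but is computationally expensive.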

14
Example Run of Algorithm
15
Description of Experiments
  • We use Weka's DT, 1-NN, and Naïve Bayes methods
  • Analyze performance on 10 data sets
  • With and without fused features
  • Focus on AUC as the main metric
  • More appropriate than accuracy especially with
    skewed data
  • Use 3 combinatorial fusion strategies
  • 2-fusion, 3-fusion, and 6-exhaustive
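
To illustrate why AUC is more informative than accuracy on skewed data, here is a small sketch that computes AUC as the fraction of (positive, negative) pairs ranked correctly, with ties counted as half. The function name and the sample data are hypothetical, and this pairwise form is one standard way to compute AUC, not necessarily the computation used in the experiments.

```python
def auc(scores, labels, positive_label=1):
    """AUC as the probability that a positive outscores a negative."""
    positives = [s for s, y in zip(scores, labels) if y == positive_label]
    negatives = [s for s, y in zip(scores, labels) if y != positive_label]
    if not positives or not negatives:
        raise ValueError("need at least one record of each class")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical skewed data: 1 minority record out of 10.
labels = [1] + [0] * 9
constant_scores = [0.1] * 10  # a model that never separates the classes

# Predicting "majority" for every record is 90% accurate here,
# yet the constant scores carry no ranking information.
accuracy_all_majority = sum(1 for y in labels if y == 0) / len(labels)
auc_constant = auc(constant_scores, labels)
auc_mixed = auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```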

16
Results
Summary Results over all 10 Data Sets
Results over 4 Most Skewed Data Sets (< 10% Minority)
17
Discussion of Results
  • None of the 3 fusion schemes is clearly best
  • The methods seem to help, but the biggest
    improvement is clearly with the DT method
  • May be explained by traditional DT methods having
    limited expressive power
  • They can only consider 1 feature at a time
  • Can never perfectly learn simple concepts like
    F1 + F2 > 10, but can with feature-fusion
  • Bigger improvement for highly skewed data sets
  • Identifying rare cases is difficult and may
    require looking at many features in parallel
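
The expressive-power point can be illustrated with a small sketch, under several assumptions. It compares the best single threshold test on F1 or F2 alone (one decision node) against the same test on a combined feature. The data grid is made up, and the combined feature here is the raw sum F1 + F2 rather than the rank average used by the method. A deeper tree can still fit a finite sample like this one; the sketch only shows that no single one-feature test captures the concept.

```python
def best_split_accuracy(values, labels):
    """Best accuracy of any single rule 'value > t' or its negation."""
    thresholds = sorted(set(values))
    thresholds.insert(0, thresholds[0] - 1)  # a threshold below every value
    best = 0.0
    for t in thresholds:
        correct = sum((v > t) == y for v, y in zip(values, labels))
        accuracy = correct / len(labels)
        best = max(best, accuracy, 1.0 - accuracy)
    return best


# Hypothetical data: every integer pair (F1, F2) with values 0..10.
points = [(f1, f2) for f1 in range(11) for f2 in range(11)]
labels = [f1 + f2 > 10 for f1, f2 in points]

f1_only = best_split_accuracy([p[0] for p in points], labels)
f2_only = best_split_accuracy([p[1] for p in points], labels)
combined = best_split_accuracy([p[0] + p[1] for p in points], labels)
```

On this grid the single-feature tests fall short of perfect accuracy, while one threshold on the combined feature separates the classes exactly.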

18
Future Work
  • More comprehensive experiments
  • More data sets, more skewed data sets, more
    combinatorial fusion strategies
  • Use of heuristics to more intelligently choose
    fused features
  • The performance measure is currently used only to
    order features
  • Use of diversity measures
  • Avoid building classifier to determine which
    fused features to add
  • Handle non-numerical features

19
Conclusion
  • Showed how a method from information fusion can
    be applied to feature construction
  • Results encouraging but more study needed
  • Extending the method should lead to further
    improvements

20
Questions?
21
Detailed Results: Accuracy
22
Detailed Results: AUC