Baseline Methods for the Feature Extraction Class
Isabelle Guyon
BACKGROUND

We present supplementary course material complementing the book Feature Extraction, Fundamentals and Applications, I. Guyon et al., Eds., to appear in Springer. Classical algorithms of feature extraction were reviewed in class. More attention was given to feature selection than to feature construction because of the recent success of methods involving a large number of "low-level" features. The book includes the results of a NIPS 2003 feature selection challenge. The students learned techniques employed by the best challengers and tried to match the best performances. A Matlab toolbox was provided with sample code. The students could make post-challenge entries to http://www.nipsfsc.ecs.soton.ac.uk/.

DATASETS

Challenge: Good performance with few features. Tasks: Two-class classification. Data split: Training/validation/test. Valid entry: Results on all 5 datasets.

METHODS
  • Scoring
  • Ranking according to test set balanced error rate (BER), i.e. the average of the positive class error rate and the negative class error rate (see the sketch after this list).
  • Ties broken by the feature set size.
  • Learning objects
  • CLOP learning objects implemented in Matlab.
  • Two simple abstractions: data and algorithm (a usage sketch follows this list).
  • Download: http://www.modelselect.inf.ethz.ch/models.php.
  • Task of the students
  • Baseline method provided, with performance BER0 and n0 features.
  • Get BER < BER0, or BER = BER0 but n < n0.
  • Extra credit for beating the best challenge entry.
  • OK to use the validation set labels for training.
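For concreteness, here is a minimal Matlab sketch of the BER computation referenced under Scoring. It is not part of the provided toolbox and assumes labels coded as -1/+1.

% Minimal sketch of the balanced error rate (BER), assuming labels coded as -1/+1.
% Not part of the CLOP toolbox; shown only to make the scoring rule explicit.
function ber = balanced_error_rate(y_true, y_pred)
  err_pos = mean(y_pred(y_true == +1) ~= +1);  % error rate on the positive class
  err_neg = mean(y_pred(y_true == -1) ~= -1);  % error rate on the negative class
  ber = 0.5 * (err_pos + err_neg);             % average of the two class-wise error rates
end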

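To illustrate the "data and algorithm" abstractions, the sketch below shows how a chain might be fit and applied using Spider-style train and test calls; the exact signatures should be checked against the CLOP documentation, and X_train, Y_train, X_valid, Y_valid are placeholder arrays.

% Hedged sketch of the two CLOP abstractions (data objects and learning objects),
% assuming Spider-style train/test calls; X_*, Y_* are placeholder arrays.
D_train = data(X_train, Y_train);                  % wrap raw arrays in a data object
D_valid = data(X_valid, Y_valid);
my_classif = svc('coef0=1', 'degree=1', 'gamma=0', 'shrinkage=0.5');
my_model = chain({s2n('f_max=300'), normalize, my_classif});   % DEXTER-style baseline
[Res_train, my_model] = train(my_model, D_train);  % fit feature selection, scaling and SVC
Res_valid = test(my_model, D_valid);               % apply the trained chain to new data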
RESULTS
ARCENE: Best BER = 11.9 ± 1.2% - n0 = 1100 (11%), BER0 = 14.7%
my_svc=svc('coef0=1', 'degree=3', 'gamma=0', 'shrinkage=0.1')
my_model=chain({standardize, s2n('f_max=1100'), normalize, my_svc})
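The s2n filter used here presumably ranks features with a Golub-style signal-to-noise criterion. Below is a stand-alone Matlab sketch of that criterion, independent of CLOP; X (a patterns-by-features matrix) and Y (a -1/+1 label vector) are assumed names.

% Stand-alone sketch of the signal-to-noise (S2N) criterion the s2n filter is
% assumed to implement: |mu+ - mu-| ./ (sigma+ + sigma-), computed per feature.
mu_pos = mean(X(Y == +1, :), 1);   sd_pos = std(X(Y == +1, :), 0, 1);
mu_neg = mean(X(Y == -1, :), 1);   sd_neg = std(X(Y == -1, :), 0, 1);
s2n_score = abs(mu_pos - mu_neg) ./ (sd_pos + sd_neg + eps);  % eps avoids division by zero
[sorted_scores, order] = sort(s2n_score, 'descend');
selected = order(1:1100);          % keep the f_max top-ranked features (f_max=1100 for ARCENE)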
DEXTER: Best BER = 3.30 ± 0.40% - n0 = 300 (1.5%), BER0 = 5%
my_classif=svc('coef0=1', 'degree=1', 'gamma=0', 'shrinkage=0.5')
my_model=chain({s2n('f_max=300'), normalize, my_classif})

DEXTER: text categorization
NEW YORK, October 2, 2001 - Instinet Group Incorporated (Nasdaq: INET), the world's largest electronic agency securities broker, today announced tha
DOROTHEA: Best BER = 8.54 ± 0.99% - n0 = 1000 (1%), BER0 = 12.37%
my_model=chain({TP('f_max=1000'), naive, bias})
DOROTHEA: drug discovery
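The bias object closing this chain presumably re-adjusts the decision threshold of the classifier output, which matters on a dataset as unbalanced as DOROTHEA. A hedged sketch of one such threshold rule follows; scores and y_true are assumed names, and this is not necessarily how CLOP implements bias.

% Hedged sketch of a bias-style post-processor: shift the decision threshold to
% the value that minimizes the balanced error rate on the training outputs.
% scores: real-valued classifier outputs; y_true: labels coded as -1/+1.
candidates = unique(scores(:))';                  % candidate thresholds (row vector)
best_ber = inf;  best_thr = 0;
for t = candidates
    y_pred = sign(scores - t);  y_pred(y_pred == 0) = 1;
    ber = 0.5 * (mean(y_pred(y_true == +1) ~= +1) + mean(y_pred(y_true == -1) ~= -1));
    if ber < best_ber, best_ber = ber; best_thr = t; end
end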
GISETTE: Best BER = 1.26 ± 0.14% - n0 = 1000 (20%), BER0 = 1.80%
my_classif=svc('coef0=1', 'degree=3', 'gamma=0', 'shrinkage=1')
my_model=chain({normalize, s2n('f_max=1000'), my_classif})
GISETTE: digit recognition
MADELON: Best BER = 6.22 ± 0.57% - n0 = 20 (4%), BER0 = 7.33%
my_classif=svc('coef0=1', 'degree=0', 'gamma=1', 'shrinkage=1')
my_model=chain({probe(relief, 'p_num=2000', 'pval_max=0'), standardize, my_classif})
MADELON: artificial data
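The probe wrapper presumably implements the random-probe trick from the challenge: fake "probe" features are appended, real and fake features are ranked together (with Relief in this baseline), and only real features that outrank all probes are kept (pval_max=0). Below is a hedged Matlab sketch of the idea; score_fun is a hypothetical placeholder for the Relief criterion, and X, Y are assumed names.

% Hedged sketch of the random-probe selection idea assumed to underlie
% probe(relief, 'p_num=2000', 'pval_max=0'). score_fun is a hypothetical
% placeholder for the Relief criterion; X: patterns x features, Y: -1/+1 labels.
p_num = 2000;                                      % number of random probes to append
[n, d] = size(X);
probes = X(:, randi(d, 1, p_num));                 % copy randomly chosen real features...
for j = 1:p_num
    probes(:, j) = probes(randperm(n), j);         % ...and break their link to Y by permutation
end
scores_real  = score_fun(X, Y);                    % one relevance score per real feature
scores_probe = score_fun(probes, Y);               % scores of the fake (irrelevant) probes
selected = find(scores_real > max(scores_probe));  % pval_max=0: no probe may outrank a kept feature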