Baseline Methods for the Feature Extraction Class
Isabelle Guyon
BACKGROUND

We present supplementary course material complementing the book Feature Extraction, Fundamentals and Applications, I. Guyon et al., Eds., to appear in Springer. Classical algorithms of feature extraction were reviewed in class. More attention was given to feature selection than to feature construction because of the recent success of methods involving a large number of "low-level" features. The book includes the results of a NIPS 2003 feature selection challenge. The students learned techniques employed by the best challengers and tried to match the best performances. A Matlab toolbox was provided with sample code. The students could make post-challenge entries to http://www.nipsfsc.ecs.soton.ac.uk/.

DATASETS

Challenge: Good performance with few features. Tasks: Two-class classification. Data split: Training/validation/test. Valid entry: Results on all 5 datasets.

METHODS
  • Scoring
  • Ranking according to test set balanced error rate (BER), i.e. the average of the positive class error rate and the negative class error rate (see the sketch after this list).
  • Ties broken by the feature set size.
  • Learning objects
  • CLOP learning objects implemented in Matlab.
  • Two simple abstractions: data and algorithm (a usage sketch follows this list).
  • Download: http://www.modelselect.inf.ethz.ch/models.php.
  • Task of the students
  • Baseline method provided, with performance BER0 and n0 features.
  • Get BER < BER0, or BER = BER0 but n < n0.
  • Extra credit for beating the best challenge entry.
  • OK to use the validation set labels for training.
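For concreteness, here is a minimal Matlab sketch of the BER computation referenced under Scoring. It is not part of the provided toolbox and assumes labels coded as -1/+1.

% Minimal sketch of the balanced error rate (BER), assuming labels coded as -1/+1.
% Not part of the CLOP toolbox; shown only to make the scoring rule explicit.
function ber = balanced_error_rate(y_true, y_pred)
  err_pos = mean(y_pred(y_true == +1) ~= +1);  % error rate on the positive class
  err_neg = mean(y_pred(y_true == -1) ~= -1);  % error rate on the negative class
  ber = 0.5 * (err_pos + err_neg);             % average of the two class-wise error rates
end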

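To illustrate the "data and algorithm" abstractions, the sketch below shows how a chain might be fit and applied using Spider-style train and test calls; the exact signatures should be checked against the CLOP documentation, and X_train, Y_train, X_valid, Y_valid are placeholder arrays.

% Hedged sketch of the two CLOP abstractions (data objects and learning objects),
% assuming Spider-style train/test calls; X_*, Y_* are placeholder arrays.
D_train = data(X_train, Y_train);                  % wrap raw arrays in a data object
D_valid = data(X_valid, Y_valid);
my_classif = svc('coef0=1', 'degree=1', 'gamma=0', 'shrinkage=0.5');
my_model = chain({s2n('f_max=300'), normalize, my_classif});   % DEXTER-style baseline
[Res_train, my_model] = train(my_model, D_train);  % fit feature selection, scaling and SVC
Res_valid = test(my_model, D_valid);               % apply the trained chain to new data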
RESULTS
ARCENE: Best BER = 11.9 ± 1.2% - n0 = 1100 (11%), BER0 = 14.7%
my_svc=svc('coef0=1', 'degree=3', 'gamma=0', 'shrinkage=0.1')
my_model=chain({standardize, s2n('f_max=1100'), normalize, my_svc})
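The s2n filter used here presumably ranks features with a Golub-style signal-to-noise criterion. Below is a stand-alone Matlab sketch of that criterion, independent of CLOP; X (a patterns-by-features matrix) and Y (a -1/+1 label vector) are assumed names.

% Stand-alone sketch of the signal-to-noise (S2N) criterion the s2n filter is
% assumed to implement: |mu+ - mu-| ./ (sigma+ + sigma-), computed per feature.
mu_pos = mean(X(Y == +1, :), 1);   sd_pos = std(X(Y == +1, :), 0, 1);
mu_neg = mean(X(Y == -1, :), 1);   sd_neg = std(X(Y == -1, :), 0, 1);
s2n_score = abs(mu_pos - mu_neg) ./ (sd_pos + sd_neg + eps);  % eps avoids division by zero
[sorted_scores, order] = sort(s2n_score, 'descend');
selected = order(1:1100);          % keep the f_max top-ranked features (f_max=1100 for ARCENE)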
DEXTER: Best BER = 3.30 ± 0.40% - n0 = 300 (1.5%), BER0 = 5%
my_classif=svc('coef0=1', 'degree=1', 'gamma=0', 'shrinkage=0.5')
my_model=chain({s2n('f_max=300'), normalize, my_classif})

DEXTER: text categorization
NEW YORK, October 2, 2001 - Instinet Group Incorporated (Nasdaq: INET), the world's largest electronic agency securities broker, today announced tha
DOROTHEA: Best BER = 8.54 ± 0.99% - n0 = 1000 (1%), BER0 = 12.37%
my_model=chain({TP('f_max=1000'), naive, bias})
DOROTHEA: drug discovery
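The bias object closing this chain presumably re-adjusts the decision threshold of the classifier output, which matters on a dataset as unbalanced as DOROTHEA. A hedged sketch of one such threshold rule follows; scores and y_true are assumed names, and this is not necessarily how CLOP implements bias.

% Hedged sketch of a bias-style post-processor: shift the decision threshold to
% the value that minimizes the balanced error rate on the training outputs.
% scores: real-valued classifier outputs; y_true: labels coded as -1/+1.
candidates = unique(scores(:))';                  % candidate thresholds (row vector)
best_ber = inf;  best_thr = 0;
for t = candidates
    y_pred = sign(scores - t);  y_pred(y_pred == 0) = 1;
    ber = 0.5 * (mean(y_pred(y_true == +1) ~= +1) + mean(y_pred(y_true == -1) ~= -1));
    if ber < best_ber, best_ber = ber; best_thr = t; end
end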
GISETTE: Best BER = 1.26 ± 0.14% - n0 = 1000 (20%), BER0 = 1.80%
my_classif=svc('coef0=1', 'degree=3', 'gamma=0', 'shrinkage=1')
my_model=chain({normalize, s2n('f_max=1000'), my_classif})
GISETTE: digit recognition
MADELON: Best BER = 6.22 ± 0.57% - n0 = 20 (4%), BER0 = 7.33%
my_classif=svc('coef0=1', 'degree=0', 'gamma=1', 'shrinkage=1')
my_model=chain({probe(relief, 'p_num=2000', 'pval_max=0'), standardize, my_classif})
MADELON: artificial data
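The probe wrapper presumably implements the random-probe trick from the challenge: fake "probe" features are appended, real and fake features are ranked together (with Relief in this baseline), and only real features that outrank all probes are kept (pval_max=0). Below is a hedged Matlab sketch of the idea; score_fun is a hypothetical placeholder for the Relief criterion, and X, Y are assumed names.

% Hedged sketch of the random-probe selection idea assumed to underlie
% probe(relief, 'p_num=2000', 'pval_max=0'). score_fun is a hypothetical
% placeholder for the Relief criterion; X: patterns x features, Y: -1/+1 labels.
p_num = 2000;                                      % number of random probes to append
[n, d] = size(X);
probes = X(:, randi(d, 1, p_num));                 % copy randomly chosen real features...
for j = 1:p_num
    probes(:, j) = probes(randperm(n), j);         % ...and break their link to Y by permutation
end
scores_real  = score_fun(X, Y);                    % one relevance score per real feature
scores_probe = score_fun(probes, Y);               % scores of the fake (irrelevant) probes
selected = find(scores_real > max(scores_probe));  % pval_max=0: no probe may outrank a kept feature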