Title: Agnostic Learning vs. Prior Knowledge Challenge
1. MILESTONE RESULTS, Mar. 1st, 2007
Agnostic Learning vs. Prior Knowledge challenge
Isabelle Guyon, Amir Saffari, Gideon Dror, Gavin Cawley, Olivier Guyon, and many other volunteers; see http://www.agnostic.inf.ethz.ch/credits.php
2. Thanks
3. Agnostic Learning vs. Prior Knowledge challenge
- When everything else fails, ask for additional domain knowledge.
- Two tracks:
  - Agnostic learning: preprocessed datasets in a nice feature-based representation, but no knowledge about the identity of the features.
  - Prior knowledge: raw data, sometimes not in a feature-based representation. Information given about the nature and structure of the data.
4. Part I
5. Datasets

Type           Dataset  Domain          Features  Training examples  Validation examples  Test examples
Dense          ADA      Marketing       48        4147               415                  41471
Dense          GINA     Digits          970       3153               315                  31532
Dense          HIVA     Drug discovery  1617      3845               384                  38449
Sparse binary  NOVA     Text classif.   16969     1754               175                  17537
Dense          SYLVA    Ecology         216       13086              1308                 130858

http://www.agnostic.inf.ethz.ch
6. ADA
- ADA is the marketing database.
- Task: discover high-revenue people from census data. Two-class problem.
- Source: Census Bureau, "Adult" database from the UCI machine-learning repository.
- Features: 14 original attributes, including age, workclass, education, marital status, occupation, and native country. Continuous, binary, and categorical features.
7. GINA
- GINA is the digit database.
- Task: handwritten digit recognition. Separate the odd from the even digits. Two-class problem with heterogeneous classes.
- Source: MNIST database formatted by LeCun and Cortes.
- Features: 28x28 pixel map.
8. HIVA
- HIVA is the HIV database.
- Task: find compounds active against the AIDS HIV infection. We brought it back to a two-class problem (active vs. inactive), but provide the original labels (active, moderately active, and inactive).
- Data source: National Cancer Institute.
- Data representation: the compounds are represented by their 3D molecular structure.
9. NOVA

Sample message:
Subject: Re: Goalie masks
Lines: 21
Tom Barrasso wore a great mask, one time, last season. He unveiled it at a game in Boston. It was all black, with Pgh city scenes on it. The "Golden Triangle" graced the top, along with a steel mill on one side and the Civic Arena on the other. On the back of the helmet was the old Pens' logo, the current (at the time) Pens logo, and a space for the "new" logo. A great mask done in by a goalie's superstition.
Lori

- NOVA is the text classification database.
- Task: classify newsgroup emails into politics or religion vs. other topics.
- Source: the 20-Newsgroups dataset from the UCI machine-learning repository.
- Data representation: the raw text, with an estimated vocabulary of 17000 words.
10. SYLVA
- SYLVA is the ecology database.
- Task: classify forest cover types into Ponderosa pine vs. everything else.
- Source: US Forest Service (USFS).
- Data representation: forest cover type for 30 x 30 meter cells, encoded with 108 features (elevation, hill shade, wilderness type, soil type, etc.).
11. Part II
12. Protocol
- Data split: training/validation/test.
- Data proportions: 10/1/100 (see the split sketch below).
- Online feedback on validation data (1st phase).
- Validation labels released in February 2007.
- Challenge prolonged until August 1st, 2007.
- Final ranking on test data using the five last complete submissions for each entrant.
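To illustrate the 10/1/100 proportions, here is a small hypothetical sketch (not the challenge's actual splitting code) that partitions an index pool accordingly; the ADA sizes from the datasets table serve as a check.

```python
import numpy as np

def split_10_1_100(n_total, seed=0):
    """Partition n_total examples into training/validation/test
    with proportions 10 : 1 : 100 (hypothetical helper)."""
    idx = np.random.default_rng(seed).permutation(n_total)
    n_train = round(n_total * 10 / 111)
    n_valid = round(n_total * 1 / 111)
    return idx[:n_train], idx[n_train:n_train + n_valid], idx[n_train + n_valid:]

# ADA has 4147 + 415 + 41471 = 46033 examples in total.
train, valid, test = split_10_1_100(46033)
print(len(train), len(valid), len(test))  # 4147 415 41471
```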
13. Performance metrics
- Balanced Error Rate (BER): the average of the error rates of the positive class and the negative class (see the sketch below).
- Area Under the ROC Curve (AUC).
- Guess error (for the Performance Prediction Challenge only): dBER = abs(testBER - guessedBER).
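To make the BER concrete, here is a minimal sketch (not taken from the challenge kit), assuming labels encoded as +1/-1; adapt the encoding if your data uses 0/1.

```python
import numpy as np

def balanced_error_rate(y_true, y_pred):
    """BER = average of the error rates on the positive and the negative class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    err_pos = np.mean(y_pred[y_true == 1] != 1)    # fraction of positives misclassified
    err_neg = np.mean(y_pred[y_true == -1] != -1)  # fraction of negatives misclassified
    return 0.5 * (err_pos + err_neg)

# 1 of 4 positives and 2 of 4 negatives misclassified -> BER = 0.5*(0.25 + 0.5)
y_true = [1, 1, 1, 1, -1, -1, -1, -1]
y_pred = [1, 1, 1, -1, -1, -1, 1, 1]
print(balanced_error_rate(y_true, y_pred))  # 0.375
```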
14. Ranking
- Compute an overall score (see the sketch below):
  - For each dataset, regardless of the track, rank all the entries by test BER: score = entry_rank / max_rank.
  - Overall_score = average score over datasets.
- Keep only the last five complete entries of each participant, regardless of track.
- Individual dataset ranking: for each dataset, make one ranking per track using test BER.
- Overall ranking: rank the entries separately in each track by their overall score. Entries having prior knowledge results for at least one dataset are entered in the prior knowledge track.
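To make the scoring concrete, here is a minimal sketch with made-up entry names and BER values; ties are not handled, and the real evaluation was of course done by the organizers.

```python
import numpy as np

# Hypothetical test BERs: one row per entry, one column per dataset (ADA..SYLVA).
bers = {
    "entry_A": [0.17, 0.03, 0.28, 0.05, 0.007],
    "entry_B": [0.18, 0.05, 0.27, 0.04, 0.009],
    "entry_C": [0.16, 0.04, 0.30, 0.06, 0.006],
}
names = list(bers)
table = np.array([bers[n] for n in names])

# Rank entries by test BER within each dataset (1 = best), normalize by the
# number of ranked entries, then average over datasets to get the overall score.
ranks = table.argsort(axis=0).argsort(axis=0) + 1
scores = ranks / ranks.shape[0]
overall = scores.mean(axis=1)

for name, s in sorted(zip(names, overall), key=lambda t: t[1]):
    print(f"{name}: overall score = {s:.3f}")
```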
15. Part III
16. Challenge statistics
- Date started: October 1st, 2006.
- Milestone (NIPS 06): December 1st, 2006.
- Milestone: March 1st, 2007.
- End: August 1st, 2007.
- Total duration: 10 months.
- Five last complete entries ranked (Aug. 1st):
  - Total ALvsPK challenge entrants: 37.
  - Total ALvsPK development entries: 1070.
  - Total ALvsPK complete entries: 90 prior + 167 agnostic.
  - Number of ranked participants: 13 (prior), 13 (agnostic).
  - Number of ranked submissions: 7 prior + 12 agnostic.
17. Learning curves
18. Learning curves
19. BER distribution
(Figure: BER distributions for the agnostic learning and prior knowledge tracks.)
The black vertical line indicates the best ranked entry (only the 5 last entries of each participant were ranked). Beware of overfitting!
20. Final AL results
Agnostic learning: best ranked entries as of August 1st, 2007.
The best average BER is still held by Reference (Gavin Cawley) with the entry "the bad". Note that the best entry for each dataset is not necessarily the best entry overall.
21. Method comparison (PPC)
Agnostic track: no significant improvement so far.
(Figure: guess error dBER vs. test BER.)
22. LS-SVM
Gavin Cawley, July 2006
23. Logitboost
Roman Lutz, July 2006
24. Final PK results
Prior knowledge: best ranked entries as of August 1st, 2007.
The best average BER is held by Reference (Gavin Cawley) with the entry "interim all prior". Louis Duclos-Gosselin is second on ADA with "Neural Network13", and S. Joshua Swamidass is second on HIVA, but they are not entered in the table because they did not submit complete entries. The overall entry ranking is performed with the overall score (average rank over all datasets). The best performing complete entry may not contain all the best performing entries on the individual datasets. We indicate the ranks of the prior entries only for individual datasets.
25. AL vs. PK, who wins?
We compare the best results of the ranked entries for entrants who entered both tracks. If the Agnostic Learning BER is larger than the Prior Knowledge BER, a 1 is shown in the table. The sign test is not powerful enough to reveal a significant advantage of PK or AL (see the sketch below).
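For reference, a minimal sketch of a two-sided sign test on paired AL/PK outcomes; the win counts below are hypothetical, and ties are assumed to have been dropped beforehand.

```python
from math import comb

def sign_test_p(n_pk_wins, n_al_wins):
    """Two-sided sign test p-value: probability of an imbalance at least this
    large if PK and AL were equally likely to win a paired comparison."""
    n = n_pk_wins + n_al_wins          # ties are dropped before calling this
    k = max(n_pk_wins, n_al_wins)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical example: PK beats AL for 4 of 5 entrants who entered both tracks.
print(sign_test_p(4, 1))  # 0.375 -- far from significant with so few pairs
```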
26. Progress?
- On ADA and NOVA, the best results obtained by the participants are in the agnostic track! But it is possible to do better with prior knowledge: on ADA, the PK winner has a worse AL entry, and on NOVA the best PK reference entry yields the best results.
- On GINA and SYLVA, significantly better results are achieved in the prior knowledge track, and all but one participant who entered both tracks did better with PK.
- On HIVA, experts achieve significantly better results with prior knowledge, but non-experts entering both tracks do worse in the PK track.
27. Conclusion
- PK wins, but not by a huge margin. Improving performance using PK is not that easy!
- AL using fairly simple low-level features is a fast way of getting hard-to-beat results.
- The website will remain open for post-challenge entries: http://www.agnostic.inf.ethz.ch.