KDDCup 2004 presentation

Title: KDDCup 2004

1
KDD-Cup 2004

Chairs: Rich Caruana, Thorsten Joachims
Web Master: Lars Backstrom
Cornell University

2
KDD-Cup Tasks

Goal: Optimize learning for different performance
metrics
Task 1: Particle Physics
Accuracy
Cross-Entropy
ROC Area
SLAC Q-Score
Task 2: Protein Matching
Squared Error
Average Precision
Top 1
Rank of Last

3
Competition Participation

Timeline
April 28: tasks and datasets available
July 14: submission of predictions
Participation
500 registrants/downloads
102 teams submitted predictions
Physics: 65 submissions
Protein: 59 submissions
Both: 22 groups
Demographics
Registrations from 49 countries (including .com)
Winners from China, Germany, India, New Zealand,
USA
Winners: half from companies, half from
universities

4
Task 1 Particle Physics

Data contributed by Charles Young et al., SLAC
(Stanford Linear Accelerator Center)
Binary classification: distinguishing B from
B-Bar particles
Balanced: 50-50 B/B-Bar
78 features (most real-valued) describing track
Some missing values
Train: 50,000 cases
Test: 100,000 cases

5
Task 1 Particle Physics Metrics

4 performance metrics
Accuracy: had to specify threshold
Cross-Entropy: probabilistic predictions
ROC Area: only ordering is important
SLAC Q-Score: domain-specific performance metric
from SLAC
Participants submit separate predictions for each
metric
About half of participants submitted different
predictions for different metrics
Winner submitted four sets of predictions, one
for each metric
Calculate performance using the PERF software we
provided to participants
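
The first three metrics can be sketched in a few lines of Python. This is an illustrative sketch only, not the PERF implementation: the function names, the default threshold of 0.5, and the clipping constant are assumptions, and the SLAC Q-Score is left out because the slides do not define it. The ROC area is computed by brute-force pair counting, which is fine for small examples but too slow for 100,000 cases.

```python
import math

def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases where (score >= threshold) matches the 0/1 label."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return hits / len(labels)

def cross_entropy(labels, probs, eps=1e-12):
    """Mean negative log-likelihood of the 0/1 labels under the predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0); eps is an assumption
        total -= math.log(p) if y else math.log(1.0 - p)
    return total / len(labels)

def roc_area(labels, scores):
    """Chance that a random positive is scored above a random negative (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Rescaling all scores by a positive constant leaves `roc_area` unchanged but can change `accuracy` and `cross_entropy`, which is one reason a single set of predictions is rarely best for all metrics.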

7
Determining the Winners

For each performance metric:
Calculate performance using the same PERF software
available to participants
Rank participants by performance
Honorable mention for participant ranked first
Overall winner is the participant with the best average
rank across all metrics
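
The ranking procedure above can be sketched as follows. This is a minimal illustration, not the organizers' code: the function names and the input layout (metric name mapped to per-participant scores) are hypothetical, and letting tied participants share the best rank is an assumption the slides do not specify.

```python
def rank_participants(scores, higher_is_better=True):
    """Map participant -> rank (1 = best) for one metric; ties share the best rank."""
    ordered = sorted(scores.values(), reverse=higher_is_better)
    return {name: ordered.index(value) + 1 for name, value in scores.items()}

def overall_ranking(results, higher_is_better):
    """results: metric -> {participant: score}.
    Returns (participant, average rank) pairs, best average rank first."""
    per_metric = {m: rank_participants(s, higher_is_better[m]) for m, s in results.items()}
    participants = next(iter(results.values())).keys()
    avg = {p: sum(r[p] for r in per_metric.values()) / len(per_metric)
           for p in participants}
    return sorted(avg.items(), key=lambda item: item[1])

# Toy example with made-up scores for three hypothetical participants.
results = {
    "accuracy": {"A": 0.72, "B": 0.73, "C": 0.70},
    "cross_entropy": {"A": 0.55, "B": 0.60, "C": 0.58},
}
direction = {"accuracy": True, "cross_entropy": False}  # lower cross-entropy is better
ranking = overall_ranking(results, direction)
```

In the toy example, B would receive the honorable mention for accuracy and A for cross-entropy, while A wins overall on average rank.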

8
And the winners are ...

9
Task 1 Physics Winners

Christophe Lambert (Golden Helix Inc.): 3rd
place overall (out of 65)
Lalit Wangikar et al. (Inductis Inc.): 2nd place
overall, HM Acc
David Vogel et al. (MEDai Inc./University of
Central Florida): 1st place overall, HM ROC, HM
Cross-Entropy, HM SLQ
10
Bootstrap Analysis of Results

How much does selection of the winner depend on
the specific test set (100k)?
Algorithm:
Repeat many times:
Take 100k bootstrap sample (with replacement)
from test set
Evaluate performance on bootstrap sample and
re-rank participants
What is the probability of winning/placing?
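
A minimal sketch of this bootstrap procedure is shown below. It is simplified to a single metric (accuracy on hard 0/1 predictions) rather than the average rank over four metrics used in the competition, and the function names, the seed, and the tie-breaking rule are assumptions made for illustration.

```python
import random

def accuracy(labels, preds):
    return sum(y == p for y, p in zip(labels, preds)) / len(labels)

def bootstrap_win_probability(labels, predictions, metric=accuracy,
                              n_rounds=1000, seed=0):
    """predictions: participant -> list of predictions aligned with labels.
    Returns participant -> fraction of bootstrap samples in which it ranked first."""
    rng = random.Random(seed)
    n = len(labels)
    wins = {name: 0 for name in predictions}
    for _ in range(n_rounds):
        # Draw a sample of the same size as the test set, with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        scores = {name: metric(y, [p[i] for i in idx])
                  for name, p in predictions.items()}
        best = max(scores, key=scores.get)  # ties go to the first participant listed
        wins[best] += 1
    return {name: count / n_rounds for name, count in wins.items()}
```

Counting how often each participant lands in the top two or three, instead of only first, would give the corresponding probability of placing.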

11
Physics Winners Bootstrap Analysis

1000 bootstrap samples

12
Physics Full Table of Results
13
Task 2 Protein Matching

Data contributed by Ron Elber, Cornell University
Finding homologous proteins (structural
similarity)
74 real-valued features describing match between
two proteins
Data comes in blocks
Unbalanced: typically < 10 homologs per
block of 1000
Train: 153 proteins (145,751 cases)
Test: 150 proteins (139,658 cases)

14
Task 2 Protein Matching Metrics

Four performance metrics
Mean Squared Error: probabilistic predictions
Mean Average Precision: only ordering within each
block is important
Mean Top 1: best predicted match is true homolog
in each block
Mean Rank of Last: finding all homologs
Again, participants submitted separate predictions
for each metric
Again, about half of participants submitted
multiple sets of predictions
19/20 top participants submitted multiple sets of
predictions
Optimizing to each metric separately helped more
on Protein than on Physics
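
The three ranking-based metrics can be sketched per block and then averaged over blocks. This is an illustrative reading of the metric names, not the PERF implementation: the function names are hypothetical, ties are broken by input order, and every block is assumed to contain at least one homolog.

```python
def rank_block(labels, scores):
    """Labels of one block, reordered from highest to lowest predicted score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]

def top1(ranked):
    """1.0 if the best predicted match is a true homolog, else 0.0."""
    return 1.0 if ranked[0] else 0.0

def rank_of_last(ranked):
    """1-based position of the lowest-ranked true homolog (lower is better)."""
    return max(i for i, y in enumerate(ranked, start=1) if y)

def average_precision(ranked):
    """Mean of the precision values measured at each true homolog."""
    hits, total = 0, 0.0
    for i, y in enumerate(ranked, start=1):
        if y:
            hits += 1
            total += hits / i
    return total / hits

def mean_over_blocks(metric, blocks):
    """blocks: list of (labels, scores) pairs, one per protein block."""
    values = [metric(rank_block(labels, scores)) for labels, scores in blocks]
    return sum(values) / len(values)
```

All three depend only on the ordering within a block, whereas squared error depends on the predicted values themselves, which is consistent with separate submissions per metric helping on this task.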

16
Task 2 Protein Winners
Katharina Morik et al. (University of Dortmund):
HM Rank Last
David Vogel et al. (MEDai Inc./University of Central
Florida): 3rd place overall, HM Top1
Yan Fu et al. (Inst. of Comp. Tech., Chinese
Academy of Sci.): 2nd place overall, HM Squared
Error, HM Average Precision
Bernhard Pfahringer (University of Waikato): 1st
place overall
17
Protein Winners Bootstrap Analysis

10,000 bootstrap samples

18
Protein Full Table of Results
19
Does Optimizing to Each Metric Help?

About half of participants submitted different
predictions for each metric
Among winners
Some evidence that top performers benefit from
optimizing to each metric
Some metrics are incompatible, e.g., optimizing to
APR hurts RMS

20
Did Groups Effectively Optimize to Different
Measures?

Score predictions for one measure using the other
measures.
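
One way to picture this cross-scoring is a table with one row per evaluation measure and one column per submission. The sketch below is an assumption-laden simplification: it compares raw scores for a single group, whereas the analysis on the slides compares ranks among all groups, and the function names and the two toy metrics are hypothetical.

```python
def accuracy(labels, preds):
    return sum((p >= 0.5) == bool(y) for y, p in zip(labels, preds)) / len(labels)

def squared_error(labels, preds):
    return sum((y - p) ** 2 for y, p in zip(labels, preds)) / len(labels)

def cross_score(submissions, metrics, labels):
    """submissions: metric name -> predictions submitted for that metric.
    metrics: metric name -> (function(labels, preds), higher_is_better).
    Returns table[evaluated_metric][submitted_for] = score."""
    return {
        name: {src: fn(labels, preds) for src, preds in submissions.items()}
        for name, (fn, _) in metrics.items()
    }

def better_alternatives(table, metrics):
    """Per metric, the submissions meant for another metric that scored strictly better."""
    found = {}
    for name, (_, higher) in metrics.items():
        own = table[name][name]
        found[name] = [src for src, score in table[name].items()
                       if src != name and (score > own if higher else score < own)]
    return found
```

A non-empty list for some metric means the group would have done better on that metric by submitting predictions it had prepared for a different one.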

21
Did Groups Effectively Optimize to Different
Measures?

How often did a submission for another measure
perform better?
Do not count screw-ups and invalid predictions
Count only those predictions where the rank
stays within a window of ± (x-axis)
Count only the groups in the top 40

Plots: Physics, Protein
22
Did Good Groups Benefit more than Bad Groups?

How often did a submission for another measure
perform better?
Do not count screw-ups and invalid predictions
Count only those predictions where the rank
stays within a window of ± 10
Count only the groups in the top k (x-axis)

Plots: Physics, Protein
23
How Big is the Benefit?

How much does swapping predictions change rank?
Count only those predictions where the rank
stays within a window of ± (x-axis)
Count only the groups in the top 40

Plots: Physics, Protein
24
How Much did Predictions Differ Between Groups?

Fit MDS to Euclidean distance between prediction
vectors
Top 30 Groups
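
The input to such an MDS fit is a matrix of pairwise distances between the groups' prediction vectors. The sketch below builds only that matrix; the function name and the dictionary layout are hypothetical, and the MDS step itself would be handled by a separate library.

```python
import math

def euclidean_distance_matrix(prediction_vectors):
    """prediction_vectors: group name -> list of predictions on the same test cases.
    Returns (names, matrix) with matrix[i][j] = distance between groups i and j."""
    names = list(prediction_vectors)
    matrix = [[math.dist(prediction_vectors[a], prediction_vectors[b]) for b in names]
              for a in names]
    return names, matrix

# Toy example with made-up predictions for three hypothetical groups.
names, dist = euclidean_distance_matrix({
    "G1": [0.0, 0.0, 1.0],
    "G2": [0.0, 0.0, 1.0],
    "G3": [3.0, 4.0, 1.0],
})
```

Groups with near-identical predictions end up at distance close to zero and would appear clustered together in the MDS plot.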

MDS Plot: Protein, APR
MDS Plot: Physics, RMSE
25
The Easy, the Difficult, and the Impossible

How often do the competitors agree on a
classification?
X-axis: number of competitors
Y-axis: percentage of test examples that x competitors
classified correctly
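
The quantity on the y-axis can be sketched as a histogram over test examples. This assumes the axis means "exactly x competitors correct" and that predictions are hard 0/1 labels; both are interpretations of the slide, and the function name is hypothetical.

```python
from collections import Counter

def agreement_histogram(labels, predictions):
    """predictions: list of per-competitor 0/1 prediction lists aligned with labels.
    Returns {x: percentage of test examples classified correctly by exactly x competitors}."""
    counts = Counter(sum(p[i] == y for p in predictions) for i, y in enumerate(labels))
    n = len(labels)
    return {x: 100.0 * counts.get(x, 0) / n for x in range(len(predictions) + 1)}
```

The entry for x equal to the number of competitors gives the "easy" examples that everybody gets right, and the entry for x = 0 gives the examples that nobody gets right.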

Physics Accuracy: Top 10
Physics Accuracy: Top 30
26
The Easy and the Impossible

How often does everybody agree?
X-axis: number of competitors from the top
Y-axis: percentage of test examples everybody
classified correctly / incorrectly

Physics Accuracy: Everybody Correct
Physics Accuracy: Everybody Incorrect
27
How to Win KDD-Cup 2005: Collaborate

Ensemble that averages predictions of best
participants
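
Averaging predictions is a one-line operation once the submissions are aligned case by case. The sketch below assumes all participants predict on the same scale (for example, probabilities); the function name is hypothetical, and the slides do not say whether any rescaling or weighting was applied.

```python
def average_ensemble(predictions):
    """predictions: list of equally long score lists, one per participant.
    Returns the per-case mean score across participants."""
    return [sum(column) / len(column) for column in zip(*predictions)]

# Toy example with made-up scores from three hypothetical participants.
ensemble = average_ensemble([[0.9, 0.2], [0.7, 0.4], [0.8, 0.0]])
```

If submissions are on different scales, converting each to ranks before averaging is one possible adjustment, though that goes beyond what the slides describe.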

29
Lessons Learned

Use WWW site for organizing competition.
Data and all results still available online
Approx. 400 new registrations since end of
competition (used in courses, papers, research)
Registration process that provides anonymity, but
allows tracking
Selection of suitable tasks
Sample size large enough that evaluation is
statistically reliable
But small enough to be tractable for most
methods
Two tasks: one traditional, one that required
non-standard techniques
Well-defined evaluation criteria, if possible
Automation if possible
Provide evaluation software for download (PERF
software)
Automatic format and plausibility checking of
submissions
Crucial team members
Web Master: Lars Backstrom (Cornell)
Data Providers: Charles Young (SLAC), Ron Elber
(Cornell)
PERF: Alex Niculescu (Cornell), Filip Radlinski
(Cornell), Claire Cardie (Cornell), and
participants who found bugs (Chinese Academy of
Sciences, University of Dortmund)
Who is interested in results?
Data providers get connected with Data Mining
experts
Data Mining community

30
Closing

Data and all results available online: http://kodiak.cs.cornell.edu/kddcup
PERF software download: http://www.cs.cornell.edu/caruana
Thanks to
Web Master: Lars Backstrom (Cornell)
Physics Data: Charles Young (SLAC)
Protein Data: Ron Elber (Cornell)
PERF: Alex Niculescu (Cornell), Filip Radlinski
(Cornell), Claire Cardie (Cornell)
Thanks to participants who found bugs in the PERF
software:
Chinese Academy of Sciences
University of Dortmund
And of course, thanks to everyone who
participated!

31
The Contest Goes On
Physics
Protein
