1
PASCAL CHALLENGE ON EVALUATING MACHINE LEARNING
FOR INFORMATION EXTRACTION
Neil Ireson, Local Challenge Coordinator
Web Intelligence Group, Department of Computer Science, University of Sheffield, UK
2
Organisers
  • Sheffield: Fabio Ciravegna
  • UCD Dublin: Nicholas Kushmerick
  • ITC-IRST: Alberto Lavelli
  • University of Illinois: Mary-Elaine Califf
  • Fair Isaac: Dayne Freitag

3
Outline
  • Challenge Goals
  • Data
  • Tasks
  • Participants
  • Experimental Results
  • Conclusions

4
Goal: Provide a testbed for comparative evaluation of ML-based IE
  • Standardisation
  • Data
  • Partitioning
  • Same set of features
  • Corpus preprocessed using GATE
  • No features allowed other than the ones provided
  • Explicit Tasks
  • Evaluation Metrics
  • For future use
  • Available for further tests with the same or new systems
  • Possible to publish new corpora or tasks

5
Data (Workshop CFP)
[Timeline figure, shown incrementally across slides 5-8: Training Data, 400 Workshop CFPs (1993-2000); Testing Data, 200 Workshop CFPs (2000-2005)]
9
(No Transcript)
10
Annotation Slots
11
Preprocessing
  • GATE
  • Tokenisation
  • Part-Of-Speech
  • Named-Entities
  • Date, Location, Person, Number, Money (example token representation sketched below)

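A minimal sketch, not part of the original slides, of how a preprocessed token carrying only the provided features might be represented; the Token class and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Token:
    """One preprocessed token, restricted to the challenge-provided features."""
    text: str                 # surface form from tokenisation
    pos: str                  # part-of-speech tag
    ne: Optional[str] = None  # named-entity type: Date, Location, Person, Number or Money

# Hypothetical fragment of a CFP after preprocessing
tokens = [
    Token("Submission", "NN"),
    Token("deadline", "NN"),
    Token(":", ":"),
    Token("1", "CD", ne="Date"),
    Token("March", "NNP", ne="Date"),
    Token("2005", "CD", ne="Date"),
]
```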
12
Evaluation Tasks
  • Task1 - ML for IE: annotating implicit information
  • 4-fold cross-validation on the 400 training documents (split sketched after this list)
  • Final test on the 200 unseen test documents
  • Task2a - Learning Curve
  • Effect of increasing amounts of training data on learning
  • Task2b - Active Learning: learning to select documents
  • Given seed documents, select the documents to add to the training set
  • Task3a - Semi-supervised Learning: Given Data
  • Same as Task1, but can use the 500 unannotated documents
  • Task3b - Semi-supervised Learning: Any Data
  • Same as Task1, but can use all available unannotated documents

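A minimal sketch of the Task1 partitioning protocol listed above: 4-fold cross-validation over the 400 training documents, followed by a final run on the 200 unseen test documents. The document identifiers and helper function are hypothetical.

```python
def four_fold_splits(doc_ids, folds=4):
    """Yield (train_ids, held_out_ids) pairs for k-fold cross-validation."""
    fold_size = len(doc_ids) // folds
    for k in range(folds):
        held_out = doc_ids[k * fold_size:(k + 1) * fold_size]
        train = doc_ids[:k * fold_size] + doc_ids[(k + 1) * fold_size:]
        yield train, held_out

training_docs = [f"train_cfp_{i:03d}" for i in range(400)]
for fold, (train_ids, dev_ids) in enumerate(four_fold_splits(training_docs), 1):
    print(f"fold {fold}: {len(train_ids)} train / {len(dev_ids)} held out")
```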
13
Evaluation
  • Precision / Recall / F1 Measure
  • MUC Scorer
  • Automatic evaluation server
  • Exact matching
  • Extract every slot occurrence (simplified scoring sketch below)

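A simplified exact-match scorer in the spirit of the evaluation above; it is not the MUC scorer used by the challenge, and the slot names in the example are only illustrative.

```python
from collections import Counter

def exact_match_prf(gold_slots, predicted_slots):
    """Exact-match precision/recall/F1 over slot occurrences.

    Both arguments are lists of (slot_name, fill_text) pairs; every occurrence
    must be extracted, so duplicate fills count separately.
    """
    gold, pred = Counter(gold_slots), Counter(predicted_slots)
    correct = sum((gold & pred).values())               # multiset intersection
    precision = correct / sum(pred.values()) if pred else 0.0
    recall = correct / sum(gold.values()) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = [("workshopdate", "1 March 2005"), ("workshopname", "IE Challenge Workshop")]
pred = [("workshopdate", "1 March 2005"), ("workshopname", "IE Workshop")]
print(exact_match_prf(gold, pred))  # (0.5, 0.5, 0.5)
```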
14
Participants
15
Task1
  • Information Extraction with all the available data

16
Task1 Test Corpus
17
Task1 Test Corpus
18
Task1 Test Corpus
19
Task1 4-Fold Cross-validation
20
Task1 4-Fold Test Corpus
21
Task1 Slot F-Measure
22
Best Slot F-Measures: Task1 Test Corpus
23
Task 2a
  • Learning Curve (see the sketch below)

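A minimal sketch of how a Task2a-style learning curve could be produced: train on increasing amounts of annotated data and score each model on the fixed test corpus. train_system and f_measure are hypothetical stand-ins for a participant's learner and for the challenge scorer.

```python
def learning_curve(train_docs, test_docs, train_system, f_measure,
                   steps=(50, 100, 200, 300, 400)):
    """Return (number of training documents, F1) points for plotting."""
    points = []
    for n in steps:
        model = train_system(train_docs[:n])    # more annotated data at each step
        points.append((n, f_measure(model, test_docs)))
    return points
```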
24
Task2a Learning Curve: F-Measure
25
Task2a Learning Curve: Precision
26
Task2a Learning Curve: Recall
27
Task 2b
  • Active Learning

28
Task2b Active Learning
  • Amilcare
  • Maximum divergence from the expected number of tags.
  • Hachey
  • Maximum divergence between two classifiers built on different feature sets (committee-style sketch below).
  • Yaoyong (Gram-Schmidt)
  • Maximum divergence between example subsets.

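A minimal committee-style selection step in the spirit of the Hachey approach listed above: two classifiers trained on different feature sets, with unannotated documents ranked by how strongly their predictions diverge. All function names here are hypothetical; the actual challenge systems differ.

```python
def select_documents(unlabelled_docs, predict_a, predict_b, disagreement, batch_size=20):
    """Rank unlabelled documents by classifier disagreement and return the most
    contentious ones for annotation.

    predict_a / predict_b: hypothetical classifiers trained on different feature sets.
    disagreement: scores how much the two predictions diverge.
    """
    scored = [(disagreement(predict_a(doc), predict_b(doc)), doc)
              for doc in unlabelled_docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)   # most divergent first
    return [doc for _, doc in scored[:batch_size]]

# Example disagreement measure: count token positions where the two taggers differ
def token_disagreement(tags_a, tags_b):
    return sum(a != b for a, b in zip(tags_a, tags_b))
```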
29
Task2b Active Learning: Increased F-Measure over Random Selection
30
Task 3
  • Semi-supervised learning
  • (no significant participation)

31
Conclusions (Task1)
  • Top three (four) systems use different algorithms
  • Rule Induction, SVM, CRF, HMM
  • The same algorithm (SVM) produced different results
  • Brittle performance
  • Large variation in slot performance
  • Post-processing

32
Conclusions (Task2 & Task3)
  • Task 2a Learning Curve
  • System performance is largely as expected
  • Task 2b Active Learning
  • Two approaches, Amilcare and Hachey, showed benefits
  • Task 3 Semi-supervised Learning
  • Insufficient participation to evaluate the use of enriched data

33
Future Work
  • Performance differences
  • Systems: what determines good/bad performance?
  • Slots: different systems were better/worse at identifying different slots
  • Combine approaches
  • Active Learning
  • Semi-supervised Learning
  • Overcoming the need for annotated data
  • Extensions
  • Data: use different data sets and other features, e.g. (HTML) structured data
  • Tasks: relation extraction

34
Thank You
  • http://tyne.shef.ac.uk/Pascal