Across Framework Evaluation Metrics

Provided by: miyaoy

1
Across Framework Evaluation Metrics
2
Motivation
  • Deep processing technologies have matured
  • Example questions for deep parsing:
    • Are there any differences among deep parsers?
    • Are deep parsers really better than shallow parsers?

3
Topics
  • Cross-framework evaluation of
    • Parsers
    • Generators
    • Grammars
  • Evaluation metrics
    • What should be measured?
  • Gold standard
    • How to define framework-independent answers?

4
Parser evaluation
  • Labeled brackets
  • Grammatical relations
  • Predicate-argument relations (framework-dependent)
  • Semantic roles
  • Logical forms
  • Applications: IE, MT, QA
    • Most applications require complex modules between parser output and final output
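
As an illustration of the first two metrics in this list, the sketch below scores parser output by precision, recall, and F1 over labeled brackets, and reuses the same function for grammatical-relation triples. The tuple encodings (label, start, end) and (relation, head, dependent), the function name `prf`, and the toy data are assumptions made for this example, not a format defined in the slides.

```python
from collections import Counter


def prf(gold, predicted):
    """Return (precision, recall, F1) over two multisets of hashable items."""
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: each gold item can be matched at most once.
    matched = sum((gold_counts & pred_counts).values())
    n_pred = sum(pred_counts.values())
    n_gold = sum(gold_counts.values())
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labeled brackets: (label, start, end) over token positions.
gold_brackets = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
pred_brackets = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)]
bracket_scores = prf(gold_brackets, pred_brackets)

# Hypothetical grammatical relations: (relation, head, dependent).
gold_grs = [("subj", "saw", "John"), ("obj", "saw", "Mary")]
pred_grs = [("subj", "saw", "John"), ("obj", "saw", "John")]
gr_scores = prf(gold_grs, pred_grs)

print(bracket_scores, gr_scores)
```

The scoring function is the same for both representations; what differs between frameworks is how parser output gets converted into the tuples being compared.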

5
Problems
  • Difficulty of format conversion
    • Establishing a conversion takes more than a month
    • Final figures are largely affected by conversion quality
  • Difficulty of defining a framework-independent gold standard
    • Without any framework/theory, the gold standard can be arbitrary
    • Many peripheral constructions complicate the problem
    • How to define the policy/guideline?

6
Issues
  • What should be evaluated (to show the strengths of deep parsers)?
    • Shallow parsers should be able to join the game
  • How to ease format conversion?
    • Allow for soft matching (e.g., the hierarchy of GRs, the MIN tag in coreference resolution tasks)
  • How to define/develop a gold standard?
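
One way to read "soft matching" over a hierarchy of grammatical relations is sketched below: a predicted relation counts as correct if its label equals the gold label or is a more general ancestor of it. The hierarchy in `PARENT`, the label names, and the function names are hypothetical and chosen only for illustration; they are not the hierarchy the slides refer to.

```python
# Hypothetical GR hierarchy: each label maps to its more general parent.
PARENT = {
    "subj": "arg",
    "obj": "arg",
    "arg": "dependent",
    "mod": "dependent",
    "dependent": None,
}


def ancestors(label):
    """Return the label followed by all of its more general ancestors."""
    chain = []
    while label is not None:
        chain.append(label)
        label = PARENT.get(label)
    return chain


def soft_match(pred_label, gold_label):
    """True if the prediction equals the gold label or generalizes it."""
    return pred_label in ancestors(gold_label)


def recall(gold, predicted, match):
    """Recall over (relation, head, dependent) triples under `match`."""
    pred_by_pair = {(head, dep): rel for rel, head, dep in predicted}
    hits = sum(
        1
        for rel, head, dep in gold
        if (head, dep) in pred_by_pair and match(pred_by_pair[(head, dep)], rel)
    )
    return hits / len(gold) if gold else 0.0


gold = [("subj", "saw", "John"), ("obj", "saw", "Mary"), ("mod", "saw", "yesterday")]
# An underspecified output: "arg" instead of "subj", plus one wrong label.
shallow = [("arg", "saw", "John"), ("obj", "saw", "Mary"), ("subj", "saw", "yesterday")]

exact = recall(gold, shallow, lambda p, g: p == g)
soft = recall(gold, shallow, soft_match)
print(exact, soft)
```

Under exact matching the underspecified "arg" is simply wrong; under soft matching it earns credit. That gives a parser with a coarser label set a way to be scored against the same gold standard.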

7
Evaluation of generators/grammars
  • How to define the input to generators?
  • How to measure the correctness of the output?
    • cf. BLEU, ROUGE
  • How to evaluate grammars in a framework/theory-independent way?
    • Different versions of the same grammar
    • Different implementations of the same theory
    • Different grammar theories (LFG vs. HPSG, etc.), languages
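
For the question of measuring generator output, the sketch below computes the two n-gram overlap quantities that BLEU and ROUGE build on: clipped n-gram precision (the core of BLEU) and n-gram recall (as in ROUGE-N). It is a simplified illustration with a single reference and no brevity penalty or smoothing, not a full implementation of either metric; the function names and example sentences are assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams found in the reference, with counts clipped."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    total = sum(cand.values())
    overlap = sum((cand & ref).values())
    return overlap / total if total else 0.0


def ngram_recall(candidate, reference, n):
    """Share of reference n-grams recovered by the candidate."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    total = sum(ref.values())
    overlap = sum((cand & ref).values())
    return overlap / total if total else 0.0


reference = "the cat sat on the mat".split()
candidate = "the cat is on the mat".split()

unigram_p = clipped_precision(candidate, reference, 1)
bigram_p = clipped_precision(candidate, reference, 2)
unigram_r = ngram_recall(candidate, reference, 1)
print(unigram_p, bigram_p, unigram_r)
```

Surface-overlap scores of this kind need only a reference string, so they apply to any generator regardless of framework, but they do not directly measure grammatical or semantic correctness.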