NLG Evaluation: Let's Open up the Box, by Chris Mellish and Donia Scott

Transcript and Presenter's Notes


1
NLG Evaluation: Let's Open up the Box
  • Chris Mellish and Donia Scott

2
End-to-End Evaluation
  • Start from some available, neutral data, e.g.,
    numbers
  • End with simple text
  • Measure text fluency and/or accuracy
  • Appealingly simple idea
  • Minimal constraints: anyone can play
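
The black-box setup above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_generator` and the `accuracy` measure are hypothetical stand-ins, not a system or metric proposed in the slides. The score looks only at input numbers and output text, so it reveals nothing about what happens inside the generator.

```python
import re

def toy_generator(readings):
    """Hypothetical black-box system: turns numbers into one sentence."""
    low, high = min(readings), max(readings)
    return f"Temperatures ranged from {low} to {high} degrees."

def accuracy(readings, text):
    """Fraction of numbers mentioned in the text that occur in the input."""
    mentioned = [float(m) for m in re.findall(r"-?\d+(?:\.\d+)?", text)]
    if not mentioned:
        return 0.0
    return sum(v in readings for v in mentioned) / len(mentioned)

readings = [12.0, 15.5, 9.0]          # neutral input data
text = toy_generator(readings)        # simple output text
score = accuracy(readings, text)      # end-to-end measurement only
```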

3
Problems
  • Can overfit the task: the solutions may not be
    interesting
  • What does this have to say about NLG in general
    (i.e. the things we talk about at conferences)?
  • May attract few serious participants, because of
    lack of perceived relevance

4
Opening the Box
  • End-to-end evaluation is a black-box approach,
    which sheds no light on what is happening in the
    systems
  • Is there no prospect for white-box evaluation
    that sheds light on how best to do general NLG
    tasks?

5
The MUC experience
  • MUC-1: no formal evaluation
  • MUC-2: template filling
  • MUC-3, 4: more complex templates
  • MUC-5: nested templates
  • MUC-6: added named entities, some
    domain-independent templates, coreference
  • WHY THE CHANGE?

6
The MUC-5/6 Transition
  • demonstrating task-independent component
    technologies of information extraction which
    would be immediately useful
  • while so much effort had been expended, a large
    portion was specific to the particular tasks. It
    wasn't clear whether much progress was being made
    on the underlying technologies which would be
    needed for better understanding
  • (Grishman and Sundheim, COLING-96)

7
How fast can we open the box?
  • It took MUC 8 years from MUC-1 to MUC-6
  • Do we have that long?
  • Do we have the funding/commitment to learn slowly
    through experience?

8
Can we open it?
  • Do we agree on enough? Isn't NLG riven by
    controversy and disagreements?
  • The RAGS project showed that there is a basis for
    agreement, though it's not simple
  • RAGS gave a way formally to define complex NLG
    interfaces

9
The RAGS model

10
Devising a whiteish box
  • Devise some example end-to-end NLG task
  • Choose certain internal interfaces
  • Formalise these interfaces, e.g., using RAGS or
    something that improves on it
  • Devise an XML format and create example data and
    requirements for data to be produced
  • Make using that format a condition of
    participation
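
As a rough sketch of what the steps above could look like in practice, the Python below defines example data for one internal interface and checks a submission against simple requirements. Everything here is an assumption made for illustration: the element names (`nlg-interface`, `message`, `arg`), the `range` message type and its required roles are hypothetical, and this is not the actual RAGS XML format.

```python
import xml.etree.ElementTree as ET

# Hypothetical interchange data for one internal interface
# (output of content selection); NOT the actual RAGS XML.
EXAMPLE = """<nlg-interface stage="content-selection">
  <message id="m1" type="range">
    <arg role="low">9.0</arg>
    <arg role="high">15.5</arg>
  </message>
</nlg-interface>"""

# Assumed requirement: each message type must supply these roles.
REQUIRED_ROLES = {"range": {"low", "high"}}

def check_submission(xml_text):
    """Return a list of problems; an empty list means the data conforms."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "nlg-interface":
        problems.append(f"unexpected root element: {root.tag}")
    for msg in root.iter("message"):
        roles = {arg.get("role") for arg in msg.findall("arg")}
        missing = REQUIRED_ROLES.get(msg.get("type"), set()) - roles
        if missing:
            problems.append(
                f"message {msg.get('id')} lacks roles: {sorted(missing)}"
            )
    return problems
```

Making such a check a condition of participation would let organisers inspect intermediate data, not just the final text.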

11
Why start from RAGS?
  • It exists, has been tried, is based on existing
    NLG work
  • It is very flexible (can describe complex mixed
    and partial structures)
  • It provides an XML interchange format
  • RAGS can be (and should be) reinterpreted in
    terms of the semantic web. This would bring
    access to many generic tools.

12
The reinterpretation
  • RAGS abstract type definitions → (upper)
    ontologies (e.g., using OWL)
  • Agreed instantiations → (lower) ontologies
  • XML offline representation → better to use RDF
    representation
  • Native formats → RDF input/output models provided
    by most programming languages
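
To make the RDF idea concrete, here is a minimal Python sketch that expresses one intermediate structure as subject-predicate-object triples and writes them in N-Triples syntax. The namespace `http://example.org/nlg#` and the names `m1`, `hasType`, `RangeMessage`, `low` and `high` are hypothetical, chosen only for this example; they are not taken from RAGS. A real system would more likely use an existing RDF library (rdflib is one for Python) than hand-written serialisation.

```python
NS = "http://example.org/nlg#"  # hypothetical namespace for this sketch

def uri(name):
    """Wrap a local name as an N-Triples IRI in the example namespace."""
    return f"<{NS}{name}>"

def literal(value):
    """Wrap a string as an N-Triples literal, escaping backslash and quote."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

# One message from an intermediate NLG stage, as RDF-style triples.
triples = [
    (uri("m1"), uri("hasType"), uri("RangeMessage")),
    (uri("m1"), uri("low"), literal("9.0")),
    (uri("m1"), uri("high"), literal("15.5")),
]

def to_ntriples(triples):
    """Serialise triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)

print(to_ntriples(triples))
```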

13
RAGS as the Les Demoiselles d'Avignon of NLG
ugly!!
A veritable cataclysm!!
Scandalous!!
Shocking!!
A horror!!
  • long recognized as one of the most significant
    paintings of the twentieth century
  • represents a revolutionary breakthrough in the
    history of modern art
  • a pivotal work in the development of modern art

16
Let's open the box!
  • This may not lead to the easiest game to devise
    or play, but it may lead to a much more
    meaningful one.
  • Which would be most fun?