1
Stochastic Language Generation for Dialog Systems
  • Alice Oh
  • aliceo_at_cs.cmu.edu
  • 27 September 2009

2
Motivation
  • To build a generation engine for a dialog system
    that can combine the advantages, as well as
    overcome the difficulties, of the two current
    approaches (template-based generation, and
    traditional, linguistic NLG)

3

General Spec of NLG input, output, parts (from
Kevin Knight's presentation at the DARPA
Communicator Meeting, June 1999)
  • Pipeline stages: macro planning → micro
    planning → sentence realizing → prosody →
    speech synthesis (output: .wav file)
  • Intermediate representations (all allow for
    varying depth): speech acts + discourse history;
    semantic frame, pronouns, speech act; labeled
    syntactic bracketing + speech act; SABLE markup
  • Inputs: database, discourse history
  • Systems placed on this spec: CMU, SRI CTalk,
    SRI Comm, MIT Comm, MIT/Envoice, chatterbot
4
Current Approaches
  • Traditional NLG produces high-quality output but
    needs hand-crafted rules and other knowledge
    sources
  • Template-based NLG does not offer the same
    quality but is simple to build
  • Recent workshop on NLG vs. Templates:
  • http://www.dfki.de/service/NLG/KI99.html

5
Our Approach
  • corpus-driven
  • easy to build (no expert knowledge)
  • fast prototyping
  • minimal input
  • natural output
  • leverages data-collecting/tagging effort
  • modular (enables plug-n-play)

6
Stochastic NLG overview
  • Language Model: an n-gram language model built
    from a corpus of travel reservation dialogs
  • Generation: given an utterance class, randomly
    generates a set of candidate utterances based on
    the LM distributions
  • Scoring: based on a set of rules, scores the
    candidates and picks the best one
  • Slot filling: substitutes slots in the utterance
    with the appropriate values from the input frame
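The Language Model and Generation steps above can be sketched as follows. This is a minimal illustration, not the CMU implementation: it uses bigram counts for brevity where the system uses n-grams, and all function names are hypothetical.

```python
import random
from collections import defaultdict

def train_bigram_lm(utterances):
    """Count bigram successors over one utterance class's corpus."""
    counts = defaultdict(lambda: defaultdict(int))
    for words in utterances:
        tokens = ["<s>"] + words + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    return counts

def sample_utterance(counts, max_len=30):
    """Randomly walk the bigram distribution from <s> until </s>."""
    word, out = "<s>", []
    while word != "</s>" and len(out) < max_len:
        successors = counts[word]
        if not successors:
            break
        # draw the next word in proportion to its bigram count
        total = sum(successors.values())
        r = random.uniform(0, total)
        for cand, c in successors.items():
            r -= c
            if r <= 0:
                word = cand
                break
        if word != "</s>":
            out.append(word)
    return out
```

A per-class model trained this way reproduces corpus phrasing: with a single training utterance, every word has exactly one successor and sampling returns that utterance verbatim, which is the "automatically built templates" view of the next slide.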

7
Stochastic NLG can also be thought of as a way to
automatically build templates from a corpus
  • If you set n equal to a large enough number, most
    utterances generated by LM-NLG will be exact
    duplicates of the utterances in the corpus.

8
Stochastic NLG Language Model
  • Human-Human dialogs in travel reservations
  • (Leah, ATIS/American Express dialogs)

9
Tags
  • Utterance classes (29)
  • query_arrive_city, query_arrive_time,
    query_arrive_time, query_confirm,
    query_depart_date, query_depart_time,
    query_pay_by_card, query_preferred_airport,
    query_return_date, query_return_time
  • hotel_car_info, hotel_hotel_chain,
    hotel_hotel_info, hotel_need_car,
    hotel_need_hotel, hotel_where
  • inform_airport, inform_confirm_utterance,
    inform_epilogue, inform_flight,
    inform_flight_another, inform_flight_earlier,
    inform_flight_earliest, inform_flight_later,
    inform_flight_latest, inform_not_avail,
    inform_num_flights, inform_price, other
  • Attributes (24)
  • airline, am, arrive_airport, arrive_city,
    arrive_date, arrive_time, car_company,
    car_price, connect_airline, connect_airport,
    connect_city, depart_airport, depart_city,
    depart_date, depart_time, depart_tod
  • flight_num, hotel, hotel_city, hotel_price,
    name, num_flights, pm, price

10
Tagging
  • CMU corpus tagged manually
  • SRI corpus tagged semi-automatically using
    trigram language models built from CMU corpus
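The semi-automatic tagging step amounts to assigning each SRI utterance the class whose language model gives it the highest likelihood. A minimal sketch, assuming per-class bigram counts with add-one smoothing (the talk uses trigram models; function names and the vocabulary-size constant are hypothetical):

```python
import math
from collections import defaultdict

def bigram_counts(utterances):
    """Bigram successor counts for one utterance class."""
    counts = defaultdict(lambda: defaultdict(int))
    for words in utterances:
        tokens = ["<s>"] + words + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    return counts

def log_likelihood(counts, words, vocab_size=1000):
    """Add-one-smoothed log-likelihood of an utterance under one class LM."""
    tokens = ["<s>"] + words + ["</s>"]
    ll = 0.0
    for prev, curr in zip(tokens, tokens[1:]):
        c = counts[prev]
        ll += math.log((c[curr] + 1) / (sum(c.values()) + vocab_size))
    return ll

def tag_utterance(class_lms, words):
    """Pick the class whose LM scores the utterance highest."""
    return max(class_lms, key=lambda cls: log_likelihood(class_lms[cls], words))
```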

11
Stochastic NLG Generation
  • Given an utterance class, randomly generates a
    set of candidate utterances based on the LM
    distributions
  • Generation stops when an utterance has a
    penalty score of 0 or the maximum number of
    iterations (50) has been reached
  • Average time: 238 msec for Communicator dialogs

12
Stochastic NLG Scoring
  • Assign penalty scores for:
  • unusual length of utterance (thresholds for
    too-long and too-short)
  • a slot in the generated utterance with an
    invalid (or no) value in the input frame
  • a new and required attribute in the input
    frame that's missing from the generated utterance
  • repeated slots in the generated utterance
  • Pick the utterance with the lowest penalty (or
    stop generating at an utterance with 0 penalty)
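The penalty rules above, together with the generate-until-zero-penalty loop from the Generation slide, can be sketched as below. This is a minimal sketch: slots are assumed to be written as `{slot_name}` tokens (the bracketing convention is an assumption), and the length thresholds are illustrative, not the system's.

```python
def penalty(utterance, frame, required, min_len=4, max_len=25):
    """Score one candidate (list of tokens) against the input frame."""
    score = 0
    if not (min_len <= len(utterance) <= max_len):
        score += 1                          # unusual length
    slots = [w for w in utterance if w.startswith("{") and w.endswith("}")]
    for s in slots:
        if frame.get(s.strip("{}")) is None:
            score += 1                      # slot with no value in the frame
    for attr in required:
        if "{" + attr + "}" not in utterance:
            score += 1                      # required attribute missing
    if len(slots) != len(set(slots)):
        score += 1                          # repeated slot
    return score

def best_utterance(generate, frame, required, max_iters=50):
    """Keep sampling candidates until one scores 0 or max_iters is reached."""
    best, best_score = None, float("inf")
    for _ in range(max_iters):
        cand = generate()                   # e.g. a draw from the class LM
        s = penalty(cand, frame, required)
        if s < best_score:
            best, best_score = cand, s
        if best_score == 0:
            break
    return best
```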

13
Stochastic NLG Slot Filling
  • Substitute slots in the utterance with the
    appropriate values in the input frame
  • Example
  • What time do you need to arrive in arrive_city?
  • What time do you need to arrive in New York?
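Slot filling reduces to a string substitution against the input frame; a minimal sketch, again assuming slots are marked as `{slot_name}` (the delimiter convention is an assumption):

```python
import re

def fill_slots(utterance, frame):
    """Replace each {slot} in the utterance with its value from the frame."""
    def sub(match):
        # leave the slot untouched if the frame has no value for it
        return str(frame.get(match.group(1), match.group(0)))
    return re.sub(r"\{(\w+)\}", sub, utterance)
```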

14
Examples
  • I have a u.s. air flight at ten ten a.m. from
    pittsburgh arriving at twelve eleven p.m.
  • I have a flight departing seattle at one thirty
    arrives into pittsburgh international at eight
    fifty seven.
  • There is a u.s. air flight departing pittsburgh
    at ten ten a.m. arriving at twelve eleven p.m.
  • Which one is from the template-based system?
  • You WILL know after calling the template-based
    system a few times.

15
Rejected Examples
  • Not enough info:
  • There is a flight at depart_time ampm.
  • Contains attributes not specified in the frame:
  • I have an airline flight at depart_time
    ampm from depart_city arriving at
    arrive_time ampm with a stop-over in
    connect_city at connect_airport.
  • Scoring to get the best utterance is important!

16
Stochastic NLG Shortcomings
  • What might sound natural (imperfect grammar,
    intentional omission of words, etc.) for a human
    speaker may sound awkward (or wrong) for the
    system.
  • It is difficult to define utterance boundaries
    and utterance classes. Some utterances in the
    corpus may be a conjunction of more than one
    utterance class.
  • Factors other than the utterance class may affect
    the words (e.g., discourse history).
  • Some sophistication built into traditional NLG
    engines is not available (e.g., aggregation,
    anaphorization).

17
Related Work
  • Statistical NLG
  • Irene Langkilde and Kevin Knight (USC/ISI)
  • Jon Oberlander and Chris Brew (U. Edinburgh)
  • NLG in dialog systems
  • Amanda Stent (U. Rochester)
  • Lena Santamarta (Linköping Univ.)

18
Evaluation
  • User satisfaction questionnaire
  • Comparative evaluation:
  • two systems with different NLG modules
  • a human reading the output aloud, to factor
    out TTS effects
  • compare task completion as well as user
    satisfaction
  • Batch-mode generation, output evaluated by a
    human grader

19
Future Work
  • How big of a corpus do we need?
  • How much of it needs manual tagging?
  • How does the n in n-gram affect the output?
  • What happens to output when two different human
    speakers are modeled in one model?
  • Can we replace scoring with a search algorithm?