Learning the Structure of Task-Oriented Conversations from the Corpus

1
Learning the Structure of Task-Oriented
Conversations from the Corpus
  • Ananlada Chotimongkol
  • Language Technologies Institute
  • School of Computer Science
  • Carnegie Mellon University

2
Outline
  • Introduction
  • Form-based dialog structure
  • Task structure
  • Dialog mechanisms
  • Dialog structure learning
  • Concept identification and clustering
  • Form identification
  • Operation Classification

4
Building a new dialog system
  • Example exchange
  • System: When would you like to leave?
  • User: I would like to fly to Seattle tomorrow.
  • [Diagram: dialog system components: Speech
    Recognizer, Natural Language Understanding,
    Dialog Manager, Natural Language Generator,
    Speech Synthesizer, Domain Knowledge]

5
Domain knowledge
  • Steps in the task
  • Specify the desired flight
  • Search for flights that match the criteria
  • Negotiate the flights
  • Make a reservation
  • Important information, keywords
  • Destination, date, time, airlines, etc.
  • Domain language: how do people talk?

6
What is the problem?
  • Can't reuse
  • Time consuming
  • May need an expert
  • [Diagram: same example exchange and dialog system
    components as on slide 4]

7
Research goal
  • Reduce the human effort of acquiring domain knowledge
    when creating a dialog system in a new domain
  • By learning the domain knowledge from data

8
Observations
  • Task-oriented conversations have a clear
    structure
  • Reflects domain information, e.g., a task is
    divided into sub-tasks
  • Has recurring patterns that are observable
    through the language

9
The solutions
  • To learn domain knowledge from data
  • Specify the structure of task-oriented
    conversations
  • Capture sufficient domain knowledge
  • Domain-independent
  • Learnable
  • Learn the structure from a corpus of human-human
    conversations

10
Dialogue structure
  • Task Structure (data representation)
  • Necessary information for achieving a task goal
  • Steps in the task
  • Domain keywords
  • Dialog mechanism (operations)
  • The ways that the participants communicate and
    perform the task

11
Outline
  • Introduction
  • Form-based dialog structure
  • Task structure
  • Dialog mechanisms
  • Dialog structure learning
  • Concept identification and clustering
  • Form identification
  • Operation Classification

12
Existing dialog structures: Theoretical-oriented
  • Examples
  • Theory of Discourse Structure (Grosz and Sidner,
    1986)
  • Discourse Representation Theory (DRT) (Kamp and
    Reyle, 1993)
  • Focus on developing a theory that helps interpret
    discourse meaning
  • Might be too complex to be implemented in a
    dialog system
  • Use hand-written rules to recognize the structure

13
Existing dialog structures: Engineering-oriented
  • Examples
  • Plan-based theory (Allen and Perrault, 1980)
  • The theory of Conversation Acts (Traum and
    Hinkelman, 1992)
  • Focus on practical issues
  • Predictability of each dialog component
  • The implementation of the structure in a dialog
    system

14
What is missing?
  • Don't describe key domain information that the
    participants communicate in a dialog.
  • The role of city names in a travel domain
  • It is not clear how to apply the structure in a
    dialog system
  • The relations between dialog structure components
    and dialog system components
  • How a dialog manager should treat each component

15
Form-based dialog structure
  • Describe a dialog structure with an existing
    dialog manager framework
  • Have a concrete mapping between dialog structure
    components and dialog system components
  • A form-based architecture has been used
    successfully in many dialog systems
  • A form-based structure consists of
  • A task structure (forms and slots)
  • Dialogue mechanisms (form operators) that advance
    the dialog

16
Outline
  • Introduction
  • Form-based dialog structure
  • Task structure
  • Dialog mechanisms
  • Dialog structure learning
  • Concept identification and clustering
  • Form identification
  • Operation Classification

17
Task Structure
  • Three levels of organization
  • Task: a subset of conversations that has a
    specific goal
  • Sub-task: a step in a task that contributes
    toward a task goal
  • -> form
  • Concept: key information
  • -> slot

18
Task Structure: Bus schedule enquiry domain
  • Task (multiple tasks)
  • Which bus runs between A and B?
  • When will bus X arrive?
  • Sub-tasks: no further decomposition
  • Concepts
  • Bus Number: 61C, 28X, ...
  • Location: CMU, airport, ...

19
Departure time query form
  • Form: Query_Departure_Time
  • Depart_Location: carnegie_mellon
  • Arrive_Location: the airport
  • Arrive_Time: Hour: four, Minute: thirty
  • Bus_Number: 28X
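
A form like the one on this slide can be held in a small data structure. The sketch below is illustrative only: the `Form` class and its `unfilled` helper are hypothetical names, and nesting Hour and Minute under Arrive_Time is an assumption about how the slide's form is organized.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Form:
    """A form (sub-task) whose slots hold concept values; None means unfilled."""
    name: str
    slots: Dict[str, Any] = field(default_factory=dict)

    def unfilled(self) -> List[str]:
        """Names of slots that have no value yet."""
        return [slot for slot, value in self.slots.items() if value is None]


# Slot names and values copied from the slide; the nesting of Arrive_Time is assumed.
form = Form(
    name="Query_Departure_Time",
    slots={
        "Depart_Location": "carnegie_mellon",
        "Arrive_Location": "the airport",
        "Arrive_Time": {"Hour": "four", "Minute": "thirty"},
        "Bus_Number": "28X",
    },
)
print(form.unfilled())  # every slot has a value, so this prints []
```
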
20
Task Structure: Travel planning domain
  • Task: create travel itinerary
  • Sub-tasks
  • Flight reservation
  • Hotel reservation
  • Car rental reservation
  • Concepts
  • Airlines: Continental, US-Airways, ...
  • Hotel: Hilton, Marriott, ...

21
Task Structure: Map reading domain
  • Task: draw a line (a route)
  • Sub-tasks
  • Draw a segment of a line
  • Concepts
  • Landmark: white_mountain, Machete, ...
  • Orientation: down, left, ...
  • Distance: a couple of centimeters, an inch, ...

22
Outline
  • Introduction
  • Form-based dialog structure
  • Task structure
  • Dialog mechanisms
  • Dialog structure learning
  • Concept identification and clustering
  • Form identification
  • Operation Classification

23
Dialogue mechanisms
  • Operations that the participants use to advance
    the dialog toward the goal
  • Task-oriented operations
  • Manipulate a form (data structure)
  • Examples: init_form, fill_form
  • Discourse-oriented operations
  • Manage the flow of a conversation
  • Examples: acknowledgement, greeting

24
Dialogue mechanisms (2)
  • Have a unique consequence on the state of the
    conversation
  • init_form causes a system to create a new form
  • Domain-independent; only the operation parameters
    differ across domains
  • Fill city_name in flight_information form
  • Fill bus_number in bus_information form
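
The point that the operations stay the same while only their parameters change can be sketched in a few lines. This is a minimal illustration, not the presenter's implementation: forms are plain dictionaries, and the extra slot names `date` and `depart_location` are made up for the example.

```python
from typing import Dict, List, Optional

Form = Dict[str, Optional[str]]


def init_form(slot_names: List[str]) -> Form:
    """Create a new form with every slot empty."""
    return {name: None for name in slot_names}


def fill_form(form: Form, slot: str, value: str) -> Form:
    """Return a copy of the form with one slot filled."""
    if slot not in form:
        raise KeyError(f"unknown slot: {slot}")
    updated = dict(form)
    updated[slot] = value
    return updated


# The same two operations serve both domains; only the parameters differ.
flight_information = fill_form(init_form(["city_name", "date"]), "city_name", "Seattle")
bus_information = fill_form(init_form(["bus_number", "depart_location"]), "bus_number", "28X")
```

Returning a copy rather than mutating the form is a design choice of this sketch; it keeps each operation's effect on the dialog state easy to inspect.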

25
Air travel-planning domain
  • PT8 request_form_info: WHAT TIME WOULD YOU LIKE
    TO DEPART [DepLoc: PITTSBURGH]
  • X9 fill_form_info: /UM/ EARLY [DepT: MORNING] NOT
    BEFORE [DepTH: SEVEN]
  • PT10 acknowledge: OKAY
  • access_DB
  • inform_result: U.S. AIRWAYS HAS A NON-STOP

26
Bus schedule enquiry domain
  • U2 fill_form_info: i wanted to take the 28X bus
    from /um/ [DepLoc: forbes avenue] to [ArLoc: the
    airport]
  • Form (empty): Query_Departure_Time with
    Depart_Location, Arrive_Location, Arrive_Time,
    Bus_Number
  • Form (filled): Query_Departure_Time with
    Depart_Location: forbes avenue, Arrive_Location:
    the airport, Arrive_Time: (empty), Bus_Number: 28X

27
Outline
  • Introduction
  • Form-based dialog structure
  • Task structure
  • Dialog mechanisms
  • Dialog structure learning
  • Concept identification and clustering
  • Form identification
  • Operation Classification

28
Learning framework
  • Goal: minimize human effort
  • Use unsupervised learning when possible
  • Incorporate information from existing knowledge
    sources
  • If additional knowledge from a human is required
  • Train an initial model with a small amount of
    annotated data
  • Use unsupervised learning or active learning to
    selectively explore un-annotated data
  • A human can correct a mistake

29
Dialog structure components
  • Domain-dependent -> have to be learned in every
    domain
  • Task structure (forms, slots)
  • Expressions for task-oriented operations
  • Domain-independent -> infrastructure, or have to
    be learned only once
  • List of operations
  • Expressions for discourse-oriented operations

30
Outline
  • Introduction
  • Form-based dialog structure
  • Task structure
  • Dialog mechanisms
  • Dialog structure learning
  • Concept identification and clustering
  • Form identification
  • Operation Classification

31
Concept identification and clustering
  • Goal: identify concept members and cluster
    together the ones that belong to the same concept
  • City: Pittsburgh, Boston, Austin, ...
  • Assumption
  • Word boundaries, including compound-word
    boundaries, are given

32
Concept identification steps
  • Identify potential concept members
  • Filter out noise and function words
  • Cluster similar words together
  • Statistical clustering: mutual information-based
    and Kullback-Leibler-based
  • Knowledge-based clustering: WordNet
  • Select clusters that represent domain concepts
  • Use the same criteria as in the first step
    (identifying potential concept members), but work
    on a cluster level
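
The Kullback-Leibler-based clustering step can be illustrated with a small sketch. The slides do not say which features are used, so everything specific here is an assumption: each word is described by the distribution of the word to its left, counts get add-one smoothing, and the symmetrized KL divergence serves as the distance. The function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter, defaultdict


def context_distributions(sentences):
    """Map each word to a smoothed distribution over the word on its left."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        for prev, word in zip(tokens, tokens[1:]):
            counts[word][prev] += 1
    vocab = sorted(counts)
    contexts = ["<s>"] + vocab
    dists = {}
    for word in vocab:
        total = sum(counts[word].values()) + len(contexts)  # add-one smoothing
        dists[word] = [(counts[word][c] + 1) / total for c in contexts]
    return dists


def symmetric_kl(p, q):
    """Symmetrized Kullback-Leibler divergence between two distributions."""
    return sum(a * math.log(a / b) + b * math.log(b / a) for a, b in zip(p, q))


def closest_pair(dists):
    """Return the two words whose context distributions are most alike."""
    words = sorted(dists)
    pairs = [(symmetric_kl(dists[a], dists[b]), a, b)
             for i, a in enumerate(words) for b in words[i + 1:]]
    return min(pairs)[1:]


corpus = [
    "fly to pittsburgh", "fly to boston",
    "leave from pittsburgh", "leave from boston",
    "please leave now",
]
dists = context_distributions(corpus)
pair = closest_pair(dists)  # candidates for the first merge
```

Repeatedly merging the closest pair (and pooling their counts) would give a simple agglomerative clustering; on this toy corpus the two city names share identical contexts and would be merged first.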

33
Outline
  • Introduction
  • Form-based dialog structure
  • Task structure
  • Dialog mechanisms
  • Dialog structure learning
  • Concept identification and clustering
  • Form identification
  • Operation Classification

34
Form Identification
  • Goal: determine the different types of forms that
    occur in the domain
  • Assumption
  • A dialog may be annotated with concept labels

35
Approach
  • Segment a dialog into a sequence of sub-tasks
    (form boundaries identification)
  • Train a classifier on lexical cohesion (Hearst,
    1994) and prosodic features
  • Group together the sub-tasks that belong to the
    same form type
  • Use unsupervised clustering based on cosine
    similarity
  • Identify a set of slots that are associated with each
    form type
  • Analyze a cluster of similar form instances
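
The grouping step (unsupervised clustering based on cosine similarity) might look like the sketch below. It assumes each sub-task segment is represented as a bag of concept labels and uses a greedy single-pass scheme with a fixed threshold; the slides specify neither, and the labels `HotelName`, `City`, `Airline` and `Date` are invented for the example.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-concepts vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def cluster_segments(segments, threshold=0.5):
    """Greedy clustering: a segment joins the first cluster whose summed
    concept counts are similar enough, otherwise it starts a new cluster."""
    clusters = []  # list of (summed concept counts, member indices)
    for idx, concepts in enumerate(segments):
        vec = Counter(concepts)
        for centroid, members in clusters:
            if cosine(vec, centroid) >= threshold:
                centroid.update(vec)
                members.append(idx)
                break
        else:
            clusters.append((Counter(vec), [idx]))
    return [members for _, members in clusters]


segments = [
    ["DepLoc", "ArLoc", "DepT"],
    ["HotelName", "City"],
    ["DepLoc", "ArLoc", "Airline"],
    ["HotelName", "City", "Date"],
]
groups = cluster_segments(segments)
```

The union of concept labels in each resulting group is then a natural candidate for that form type's slot set.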

36
Outline
  • Introduction
  • Form-based dialog structure
  • Task structure
  • Dialog mechanisms
  • Dialog structure learning
  • Concept identification and clustering
  • Form identification
  • Operation Classification

37
Operation Classification
  • Goal: learn the expressions that are associated with
    each operation
  • by classifying an utterance into a pre-defined
    set of operations
  • Assumption
  • A dialog may be annotated with concept labels
  • A list of operation types is given
  • Operation boundaries are known

38
Supervised classification
  • Use a Markov model (Woszczyna and Waibel, 1994)
  • States: operation types
  • Transition probability: dependency between
    operation types
  • Emission probability: P(W | operation_type)
  • Enhanced models
  • Use domain concepts as word classes to reduce the
    data sparseness problem
  • Add prosodic features
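
Decoding with such a Markov model can be sketched with the Viterbi algorithm. This is an illustration under stated assumptions rather than the cited model: P(W | operation_type) is approximated as a product of per-word probabilities, unseen words get a small floor probability, and every number in the toy tables is invented.

```python
import math


def viterbi(utterances, states, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most likely operation-type sequence for a list of utterances.

    emit_p[state][word] approximates P(word | operation_type);
    words missing from the table receive the probability `floor`.
    """
    def emit(state, utterance):
        return sum(math.log(emit_p[state].get(w, floor)) for w in utterance.split())

    # best[s] = (log probability, path) of the best sequence ending in state s
    best = {s: (math.log(start_p[s]) + emit(s, utterances[0]), [s]) for s in states}
    for utt in utterances[1:]:
        new_best = {}
        for s in states:
            score, path = max(
                (best[p][0] + math.log(trans_p[p][s]), best[p][1]) for p in states
            )
            new_best[s] = (score + emit(s, utt), path + [s])
        best = new_best
    return max(best.values())[1]


states = ["request_form_info", "fill_form_info", "acknowledge"]
start_p = {"request_form_info": 0.6, "fill_form_info": 0.3, "acknowledge": 0.1}
trans_p = {
    "request_form_info": {"request_form_info": 0.1, "fill_form_info": 0.8, "acknowledge": 0.1},
    "fill_form_info": {"request_form_info": 0.3, "fill_form_info": 0.2, "acknowledge": 0.5},
    "acknowledge": {"request_form_info": 0.5, "fill_form_info": 0.3, "acknowledge": 0.2},
}
emit_p = {
    "request_form_info": {"what": 0.3, "time": 0.3, "depart": 0.2},
    "fill_form_info": {"early": 0.3, "morning": 0.3, "seven": 0.2},
    "acknowledge": {"okay": 0.8},
}
labels = viterbi(["what time depart", "early morning", "okay"],
                 states, start_p, trans_p, emit_p)
```

Replacing concept members with their concept label before the emission lookup (for example, every city name with a single City token) would be one way to apply the word-class idea from the slide.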

39
Unsupervised learning and active learning
  • Train an initial classifier from human-labeled
    data
  • Apply the current classifier to an unlabeled
    operation
  • (Unsupervised learning) if the confidence is
    high, add this instance and the predicted label
    into the training set
  • (Active learning) if the confidence is low, ask a
    human to label this instance and then add it into
    the training set
  • Train a new classifier on all labeled data (both
    machine-labeled and human-labeled)
  • The last two steps (applying the classifier and
    retraining) can be iterated
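
One pass of the combined scheme on this slide could be sketched as follows. The word-vote classifier is a deliberately trivial stand-in, the thresholds `high` and `low` are invented, and leaving mid-confidence instances unlabeled is an assumption; the slide only describes the high- and low-confidence cases.

```python
from collections import Counter, defaultdict


def train_word_votes(labeled):
    """Toy classifier: each word votes for the labels it was seen with."""
    votes = defaultdict(Counter)
    for text, label in labeled:
        for word in text.split():
            votes[word][label] += 1

    def classify(text):
        tally = Counter()
        for word in text.split():
            tally.update(votes.get(word, Counter()))
        if not tally:
            return None, 0.0  # nothing known about this utterance
        label, count = tally.most_common(1)[0]
        return label, count / sum(tally.values())

    return classify


def bootstrap(labeled, unlabeled, train, ask_human, high=0.9, low=0.6):
    """Label confident instances automatically, send unconfident ones to a
    human, then retrain on all labeled data."""
    classify = train(labeled)
    labeled = list(labeled)
    remaining = []
    for instance in unlabeled:
        label, confidence = classify(instance)
        if confidence >= high:
            labeled.append((instance, label))                # machine-labeled
        elif confidence <= low:
            labeled.append((instance, ask_human(instance)))  # human-labeled
        else:
            remaining.append(instance)
    return train(labeled), labeled, remaining


seed = [("okay", "acknowledge"), ("what time", "request_form_info")]
model, labeled, remaining = bootstrap(
    seed, ["okay okay", "seven"], train_word_votes,
    ask_human=lambda text: "fill_form_info",  # stands in for a human annotator
)
```

Calling `bootstrap` again with the returned `labeled` and `remaining` lists gives the iteration described on the slide.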

40
Classifier confidence score
  • Difference in probability between the first rank
    and the second rank
  • The entropy of the classifier output
  • High entropy means low confidence
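
Both confidence scores are short to compute from a classifier's output distribution. The sketch below assumes the classifier returns a list of at least two class probabilities; the function names are illustrative, and measuring entropy in bits is a choice of this example.

```python
import math


def margin_confidence(probs):
    """Difference between the highest and second-highest probability."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def entropy(probs):
    """Shannon entropy in bits; higher entropy means lower confidence."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


peaked = [0.7, 0.2, 0.1]
flat = [0.4, 0.3, 0.3]
print(margin_confidence(peaked) > margin_confidence(flat))  # True
print(entropy(peaked) < entropy(flat))                      # True
```
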

41
Suggestion?