Title: Dialog Structure Design and Annotation
1Dialog Structure Design and Annotation
- Ananlada Chotimongkol
- Language Technologies Institute
- School of Computer Science
- Carnegie Mellon University
2Outline
- Existing Annotation Schemes
- Linguistic Oriented
- Engineering Oriented
- HCRC dialog structure
- Conversation Acts
- DAMSL
- Comparison
- Form-based dialog structure
3Structure of a dialog
- Explains how the conversation is organized
- To create a theory of dialog in order to understand the meaning of the dialog (linguistic-oriented)
- To develop a procedure that supports a computer agent in a dialog system (engineering-oriented)
4Linguistic-Oriented
- Some are extended from discourse structure (focus on monologue text)
- Provide the basic theory for the engineering-oriented ones
- Speech Act Theory: captures the speaker's intention
- Rhetorical Structure Theory: explains the coherence between parts of a text
- Dialog Grammar: captures regular patterns in the dialog
5Engineering-Oriented
- HCRC structure (Edinburgh)
- Conversation Acts (Rochester)
- DAMSL (Multiparty Discourse Group)
6HCRC Dialog Structure
- Carletta, J., Isard, A., Isard, S., Kowtko, J., Doherty-Sneddon, G., Anderson, A., HCRC dialogue structure coding manual, 1996
- http://www.ltg.ed.ac.uk/amyi/maptask/demo.html
- Domain: map description
- Focuses on describing the phenomena that occur in the Map Task corpus
- But claims to be task-independent
- Focuses on high-level structure
- Can be used in conjunction with other coding schemes
7Three-level structure
- Transaction: a sub-dialog that accomplishes a major goal of the task
- In Map Task: 1 segment of the route
- Game (interaction, exchange): a set of utterances composed of an initiation and a sequence of responses that fulfill the initiation's purpose
- Move (dialog act): an utterance or part of an utterance that serves a particular purpose, e.g. as an initiation or a response
8Move Coding Scheme
- Tradeoff between semantic distinction and coding consistency
- 12 moves from 3 categories
- Initiating moves: set up an expectation at the beginning of the game
- Instruct, Explain, Check, Align, Query-YN and Query-W
- Response moves: follow the initiation and fulfill the expectation
- Acknowledge, Reply-Y, Reply-N, Reply-W and Clarify
- Ready: occurs in the transition between games
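The move inventory above can be written down as a small lookup table. The following is a minimal sketch in Python; the names `MoveCategory`, `MOVES` and `category_of` are illustrative assumptions and do not come from the HCRC coding manual.

```python
from enum import Enum


class MoveCategory(Enum):
    INITIATING = "initiating"
    RESPONSE = "response"
    READY = "ready"


# The 12 HCRC moves listed on the slide, grouped by category.
MOVES = {
    MoveCategory.INITIATING: ["Instruct", "Explain", "Check", "Align", "Query-YN", "Query-W"],
    MoveCategory.RESPONSE: ["Acknowledge", "Reply-Y", "Reply-N", "Reply-W", "Clarify"],
    MoveCategory.READY: ["Ready"],
}


def category_of(move: str) -> MoveCategory:
    """Return the category of a move label, or raise KeyError if it is unknown."""
    for category, labels in MOVES.items():
        if move in labels:
            return category
    raise KeyError(move)
```

For example, `category_of("Check")` returns `MoveCategory.INITIATING`.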
9Game Coding Scheme
- Game's purpose: the name of the game's initiating move
- All games begin with an initiating move, but not all initiating moves begin games
- Games can be nested, e.g. contain a clarification sub-dialog
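The game properties above (a game is named after its initiating move, and games can nest) might be represented as follows. This is a sketch only: the `Move` and `Game` classes and the example dialog are hypothetical, not part of the HCRC scheme.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Move:
    speaker: str
    label: str  # e.g. "Instruct", "Check", "Reply-Y"


@dataclass
class Game:
    # Moves and nested games, in order of occurrence.
    items: List[Union[Move, "Game"]] = field(default_factory=list)

    @property
    def purpose(self) -> str:
        """A game is named after its initiating (first) move."""
        first = self.items[0]
        if not isinstance(first, Move):
            raise ValueError("a game must begin with a move")
        return first.label

    def depth(self) -> int:
        """Nesting depth: 1 for a flat game, 2 if it embeds a game, and so on."""
        nested = [item.depth() for item in self.items if isinstance(item, Game)]
        return 1 + max(nested, default=0)


# An Instruct game that embeds a Check game (a clarification sub-dialog).
check = Game([Move("follower", "Check"), Move("giver", "Reply-Y")])
instruct = Game([Move("giver", "Instruct"), check, Move("follower", "Acknowledge")])
```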
10Transaction Coding Scheme
- Divide the dialog into transactions
- Differs between the giver's and the follower's perspectives
- For a giver: how he divides a route into sub-tasks
- 4 types of transactions: normal, review, overview and irrelevant
- Each transaction (except irrelevant) is associated with a route segment on the map
- For a follower: how he perceives a segment and performs some actions
- 2 types of actions: drawing a line and crossing out a line
- A transaction isn't nested (too large)
11Discussion
- No real dialog application; used as data for analyzing phenomena in dialog
- Emphasizes how the information is conveyed, e.g. as a question or a response, rather than what information is conveyed (concept)
- Annotates the purpose of the utterance in general, e.g. instruct, explain, question, rather than the purpose that each utterance serves according to the task, e.g. describe the movement or describe the landmark
12Conversation Acts
- David R. Traum and Elizabeth A. Hinkelman, "Conversation Acts in Task-Oriented Spoken Dialogue", Computational Intelligence, 8(3):575-599, 1992. Also appears as TR 425, Computer Science Dept.
- Emphasizes
- Mutual understanding between participants
- Dialog mechanisms that serve the coordination and maintenance of the dialog itself rather than the direct task
13Dialog units
- Utterance unit (UU)
- Continuous speech by the same speaker
- Each speaker turn can contain more than one UU
- Discourse Unit (DU)
- A sequence of an initial presentation and
subsequent utterances by each party that are
needed to make a unit grounded
14Classes of Conversation acts
- 4 classes
- Turn-taking acts (sub-UU acts)
- Grounding acts (UU acts)
- Core speech acts (DU acts?)
- Argumentation acts (multiple DUs)
- More general than speech act theory
15Turn-taking Act
- Can have more than one turn-taking act in an utterance (sub-UU act)
- Coordinate the control of the speaking channel
- Types of turn-taking acts
- take-turn, keep-turn, release-turn, assign-turn and pass-up-turn
- Turn-taking acts occur all the time
- Should we annotate all of them?
- Which one is important?
16Grounding Act
- Correspond to one utterance unit (UU act)
- Coordinate mutual understanding
- Types of grounding acts
- Initiate (an initial component of a DU)
- Continue
- Acknowledge
- Repair
- ReqRepair
- ReqAck
- Cancel (close off the current DU as ungrounded)
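To make the role of these acts concrete, the sketch below tracks whether one discourse unit ends up grounded. It is a deliberate simplification, not the full grounding model of Traum and Hinkelman: it assumes that the first Acknowledge grounds the unit, that Cancel closes it as ungrounded, and that every other act leaves it open. The function name `du_status` is hypothetical.

```python
from typing import Iterable

GROUNDING_ACTS = {
    "Initiate", "Continue", "Acknowledge", "Repair", "ReqRepair", "ReqAck", "Cancel",
}


def du_status(acts: Iterable[str]) -> str:
    """Return 'grounded', 'ungrounded' or 'open' for one discourse unit.

    Simplifying assumptions: Acknowledge grounds the unit, Cancel closes it
    as ungrounded, and every other act leaves it open.
    """
    acts = list(acts)
    if not acts or acts[0] != "Initiate":
        raise ValueError("a discourse unit must start with Initiate")
    for act in acts[1:]:
        if act not in GROUNDING_ACTS:
            raise ValueError(f"unknown grounding act: {act}")
        if act == "Acknowledge":
            return "grounded"
        if act == "Cancel":
            return "ungrounded"
    return "open"
```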
17Core Speech Act
- Similar to a traditional speech act
- Coordinates the local flow of changes in belief, intentions and obligations
- Types of core speech acts
- Inform, WHQ, YNQ, Accept, Request, Reject, Suggest, Eval, ReqPerm, Offer, Promise
- Doesn't correspond to any of the dialog units?
18Argumentation Act
- Composed of combinations of core speech acts (multiple-DU acts)
- Coordinates discourse purpose
- Is at the same level as Rhetorical Relations and Adjacency Pairs
- Types of argumentation acts: Elaborate, Summarize, Clarify, QA, Convince, Find-Plan
- Build up a hierarchy within the same class
- The higher-level acts correspond to steps in the task structure (task-dependent?)
- The lower-level acts: QA
19DAMSL (Dialog Act Markup in Several Layers)
- Mark Core and James Allen. Coding Dialogs with the DAMSL Annotation Scheme. AAAI Fall Symposium on Communicative Action in Humans and Machines, 1997.
- J. Allen and M. Core. Draft of DAMSL: Dialog Act Markup in Several Layers, 1997.
20DAMSL Tag Set
- Developed by the Multiparty Discourse Group
- Contains primitive communicative actions that manipulate the common ground directly
- Allows multiple labels in multiple layers
- Eliminates the restriction in Speech Act Theory
- Designed to be domain-independent
- But can add domain relevant acts
- The annotation can be used to
- Interpret utterances in dialog
- Design appropriate dialog strategy
21DAMSL Annotation Scheme
- 3 layers of annotation for each utterance
- Forward Communicative Functions
- Backward Communicative Functions
- Utterance Features
- These 3 layers are orthogonal
- But some utterances may not have a label for every layer
- Can have more than one label in each layer
- Utterance segmentation is based on the intentions of the speaker
- An utterance can have several clauses or just an initial word
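One way to picture an annotation with several layers, each holding zero or more labels, is a record with one label set per layer. The sketch below is illustrative only: the class `DamslAnnotation`, its field names and the example utterance are assumptions, and only the label names are taken from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class DamslAnnotation:
    utterance: str
    forward: Set[str] = field(default_factory=set)   # Forward Communicative Functions
    backward: Set[str] = field(default_factory=set)  # Backward Communicative Functions
    features: Set[str] = field(default_factory=set)  # Utterance Features

    def labelled_layers(self) -> List[str]:
        """Names of the layers that carry at least one label."""
        layers = {"forward": self.forward, "backward": self.backward, "features": self.features}
        return [name for name, labels in layers.items() if labels]


# Hypothetical example: one utterance with two backward labels and no utterance features.
ann = DamslAnnotation(
    utterance="okay, I will",
    forward={"Committing Speaker Future Action"},
    backward={"Agreement", "Understanding"},
)
```

Here `ann.labelled_layers()` lists only the forward and backward layers, which mirrors the point that an utterance need not have a label in every layer.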
22Forward Communicative Function
- Indicates how the current utterance constrains the future beliefs and actions
- Similar to actions in speech act theory
- Types of Forward Communicative Functions
- Statement
- Influencing Addressee Future Action
- Committing Speaker Future Action
- Performative (make a fact true by saying it)
- Other Forward Function
23Backward Communicative Function
- Indicates how the current utterance relates to the previous dialog
- Types of Backward Communicative Functions
- Agreement (accept/reject)
- Understanding
- Answer (associated with an info-request act)
- Information Relation (how this utterance relates to the previous one)
- Similar to Rhetorical Relations
24Utterance Feature
- Capture content and form of utterance
- The features are
- Information Level: task, task management, communication management
- Communicative Status: abandoned, uninterpretable
- Syntactic Features: conventional form, exclamatory form
25Discussion
- Focus on the primitive purpose of the utterance
- Need a more detailed representation to get the key information in the utterance
- Also need higher-level representations such as plans and discourse structures
- Are these 3 layers orthogonal?
- Are there too many tags for each utterance?
26Comparison Levels of Annotation
- HCRC
- Transaction
- Game
- Move
- Conver. Acts
- Argumentation acts
- Core speech acts
- Grounding
- Turn-taking
- DAMSL
- Forward
- Backward
- Utterance Features
27Comparison Levels of Annotation
- HCRC
- Transaction
- Game
- Move
- (The same level as all DAMSL tags)
- Conver. Acts
- Argumentation acts
- (Dialog Unit)
- Core speech acts
- Grounding
- Turn-taking
28Comparison tags for utterance level
- HCRC
- Initiation
- Instruct, Explain, Check, Align, Query-YN and Query-W
- Response
- Acknowledge, Reply-Y, Reply-N, Reply-W and Clarify
- DAMSL
- Forward
- Statement, Influencing-Addressee-Future-Action, Committing-Speaker-Future-Action, Performative
- Backward
- Agreement (accept/reject), Understanding, Answer, Information Relation
- Conver. Acts
- Inform, Suggest, Offer, Promise, Request, ReqPerm, WHQ, YNQ, Accept, Reject, Eval
29Form-based dialog structure
- Why we need a new structure
- The existing structures are too general
- Want to capture domain information, e.g. task structure, key concepts
- Want to create a dialog system from a structure
- Choose to work on a form-based dialog system
- Represent the structure of a dialog in terms of forms and slots
30Three-level organization
- Task (dialog)
- A task is a subset of a conversation that serves a particular goal of the dialog
- Episode (sub-task)
- A set of utterances that corresponds to a smaller step in a task
- Concept
- An important piece of domain information that the participants would like to communicate in the dialog
31Form representation
- A form is a repository of related pieces of information (concepts)
- A sub-task is equivalent to a form
- A sub-task is the smallest practical unit
- A task: a collection of forms (sub-tasks)
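The form representation above could be modelled as follows. This is a minimal sketch under stated assumptions: the class names `Form` and `Task`, their methods, and the slot names in the example are hypothetical and are not taken from the original annotation scheme.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Form:
    """One sub-task: a named set of slots, each holding a concept value."""
    name: str
    slots: Dict[str, Optional[str]] = field(default_factory=dict)

    def fill(self, slot: str, value: str) -> None:
        if slot not in self.slots:
            raise KeyError(f"{self.name} has no slot {slot!r}")
        self.slots[slot] = value

    def is_complete(self) -> bool:
        return all(value is not None for value in self.slots.values())


@dataclass
class Task:
    """A task is a collection of forms (sub-tasks)."""
    goal: str
    forms: List[Form] = field(default_factory=list)


# Slot names here are illustrative only.
hotel = Form("reserve_hotel", {"city": None, "date": None})
trip = Task("reserve a hotel", [hotel])
hotel.fill("city", "HOUSTON")
```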
32How the task can be accomplished using a form?
- The sub-task is accomplished by manipulating the form
- Fill in the slots
- Execute the form
- Discuss the result
- Operations
33Operations
- An operation is an utterance or a part of an utterance (turn) that causes a unique consequence in the conversation
- U: fill_form_info I'D LIKE TO FLY TO [ArLoc: HOUSTON] [ArLoc: TEXAS]
- S: access_DB
- inform_result I HAVE A NON-STOP ON CONTINENTAL
34Question Answer pair
- A question-answer pair is separated into 2 operations by a turn boundary
- The consequence of the answer depends on the question, especially for a yes/no answer
- Dialog 1
- U: init_form I NEED A HOTEL IN HOUSTON
- Dialog 2
- S: ask_init_form AND WOULD YOU NEED A HOTEL WHILE YOU'RE IN HOUSTON
- U: respond YES
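The dependence of a yes/no answer on its question can be sketched as a lookup of the preceding turn. The code below is an illustration under one assumption: that question operations are named `ask_<operation>` (as in `ask_init_form`) and that a YES response has the consequence of `<operation>`, while any other response has none. The function name `effective_operation` is hypothetical.

```python
from typing import List, Optional, Tuple

Turn = Tuple[str, str, str]  # (speaker, operation, words)


def effective_operation(turns: List[Turn], index: int) -> Optional[str]:
    """Return the operation a turn amounts to once its question is considered.

    Assumption: a question operation is named "ask_<operation>", and a YES
    response to it has the consequence of <operation>; other responses have none.
    """
    _speaker, operation, words = turns[index]
    if operation != "respond":
        return operation
    if index == 0:
        raise ValueError("a response needs a preceding question")
    question = turns[index - 1][1]
    if not question.startswith("ask_"):
        raise ValueError("a response needs a preceding question")
    return question[len("ask_"):] if words.strip().upper() == "YES" else None


dialog2 = [
    ("S", "ask_init_form", "AND WOULD YOU NEED A HOTEL WHILE YOU'RE IN HOUSTON"),
    ("U", "respond", "YES"),
]
```

Under this assumption, the user's YES in Dialog 2 resolves to `init_form`, the same operation as the single user turn in Dialog 1.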
35Let's Go
- Goal: request information about the bus schedule
- Tasks (multiple system functions)
- Ask bus number
- Ask departure time
- Ask stop
- Etc.
- One form for each task (a simple task)
- Concepts: bus_number, hour, minute, departure_location
36List of Operations
- Form-filling operations
- init_form
- fill_form_info
- change_form_info
- Form execution operations
- access_DB (task-specific)
- Discuss-result operations
- inform_result
- navigate_results
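The grouping of operations into form filling, form execution and result discussion can be expressed as a lookup table. In the sketch below the phase names `fill`, `execute` and `discuss` and the helper `phases` are assumptions introduced for illustration; only the operation names come from the list above.

```python
# Operation name -> phase of form manipulation (phase names are illustrative).
OPERATION_PHASE = {
    "init_form": "fill",
    "fill_form_info": "fill",
    "change_form_info": "fill",
    "access_DB": "execute",
    "inform_result": "discuss",
    "navigate_results": "discuss",
}


def phases(operations):
    """Collapse an operation sequence into the sequence of phases it visits."""
    visited = []
    for operation in operations:
        phase = OPERATION_PHASE[operation]
        # Record a phase only when it differs from the previous one.
        if not visited or visited[-1] != phase:
            visited.append(phase)
    return visited
```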
37Air Travel Domain
- Goal: reserve a flight with optional hotel and car
- Tasks
- Reserve a flight
- Reserve a car
- Reserve a hotel
- But car and hotel are always parts of a flight reservation
- So it is better to think of them as sub-tasks
- One form for each sub-task
- Concepts: airline, city, date, time
38Flight Reservation
- There are 3 form executions (DB access) in the flight reservation episode
- Retrieve departure flight
- Retrieve arrival flight
- Retrieve fare
- Fare depends on the flights
- Embedded forms
[Figure: embedded forms, with labels Trip, Departure Leg, Arrival Leg, flight info (twice) and fare]
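Because the fare depends on the flights, the three form executions have a partial order. The sketch below derives a valid order with the standard-library `graphlib.TopologicalSorter`; the execution names and the `DEPENDS_ON` table are hypothetical labels for the three DB accesses listed above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each form execution maps to the executions whose results it needs.
DEPENDS_ON = {
    "retrieve_departure_flight": set(),
    "retrieve_arrival_flight": set(),
    "retrieve_fare": {"retrieve_departure_flight", "retrieve_arrival_flight"},
}

# Any order returned places the fare lookup after both flight lookups.
execution_order = list(TopologicalSorter(DEPENDS_ON).static_order())
```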
39Map Task description
- Conversation between 2 participants
- Giver has a map with a route on it
- Follower has a map without a route
- Task: the giver tells the follower how to draw the route on the follower's map
- The maps are not exactly the same
41Map Task Characteristic
- More casual conversation
- Disfluency
- Repetition
- Anaphora
- No well-defined form
- No constraint from the backend
- There are many ways to describe a segment
- Need a lot of grounding processes
42Map Task Structure
- Goal: draw a map from a description
- Task: draw a line (a route)
- Sub-tasks
- Draw a segment of a line
- Locate a new landmark (can be embedded)
43Grounding Process
- Create mutual understanding between participants
- Check understanding, correctness of communication
- Confirmation and clarification
- Define a new term
- Discuss the attributes of the object e.g. check
landmark and create landmark
44Grounding process in form-based structure
- Confirmation
- If yes, increase the confidence in the slot value
- If no, cross out the value from the slot
- Clarification
- S: ask_fill_form_info INTO [ArLoc: INTERCONTINENTAL] AIRPORT OR [ArLoc: HOBBY]
- U: fill_form_info AT THE /UH/ [ArLoc: INTERCONTINENTAL]
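The confirmation rule above (a yes raises confidence in the slot value, a no crosses the value out) might look like this in code. It is a sketch only: the `Slot` class, the numeric confidence scale from 0 to 1, and the default values are assumptions, since the slides do not say how confidence is represented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    value: Optional[str] = None
    confidence: float = 0.0  # assumed scale: 0.0 (unknown) to 1.0 (certain)

    def fill(self, value: str, confidence: float = 0.5) -> None:
        self.value, self.confidence = value, confidence

    def confirm(self, answer_is_yes: bool, boost: float = 0.4) -> None:
        """Apply the answer to a confirmation question about this slot."""
        if self.value is None:
            raise ValueError("nothing to confirm")
        if answer_is_yes:
            self.confidence = min(1.0, self.confidence + boost)
        else:
            # "Crossing out" is modelled here by clearing the slot.
            self.value, self.confidence = None, 0.0
```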
45Grounding process in form-based structure (2)
- Define a new term
- A form is a collection of object attributes
- FOLLOWER: fill_form_info but golden beach is away in [Loc: the far right].
- Landmark: golden beach
- Location: the far right
46Plane simulation task
- 3 participants work on the plane simulation
- Task: take pictures of a list of targets
- Each participant has a different role: flying the plane, navigating the route, taking a picture
- There are some restrictions on controlling the plane, such as speed, altitude and radius from a destination
47Dialog Structure
- Task: take pictures of a given list of targets
- Sub-task: take a picture of one target
- Concepts
- target
- waypoint
- distance
- speed
- altitude
48Task Characteristic
- 3-party conversation
- Command-and-control style
- The physical actions have a time constraint
- Can't execute the form right away after all the slots get filled
- The list of sub-tasks (targets) is not fixed and not known in advance
49Sub-task
- Main sub-task: take a picture of the target
- Also have to control the plane
- Set destination, altitude and speed (have restrictions)
- Report the result in terms of the plane status: altitude, speed, destination and the distance from the destination
- Grounding process
- Define a landmark as a target or a waypoint
50Forms
- target form (take a picture)
- target name
- required distance from target
- control form (fly a plane), containing only a single slot
- Altitude
- Speed
- Destination (may have radius)
- grounding form (grounding process)
- object name
- attributes e.g. type of landmark