Title: Applications of Discourse Structure for Spoken Dialogue Systems
1Applications of Discourse Structure for Spoken
Dialogue Systems
- Mihai Rotaru
- Ph.D. Proposal Defense
- Committee
- Diane J. Litman (advisor)
- Rebecca Hwa
- Janyce M. Wiebe
- Carolyn P. Rosé (CMU)
2Spoken Dialogue Systems
- Systems that interact with users via speech
- Advantages
- Naturalness
- Efficiency
- Eyes- and hands-free
- Domains
- Information access (Raux et al., 2005; Rudnicky et al., 1999; Zue et al., 2000)
- Tutoring (Graesser et al., 2001; Litman and Silliman, 2004; Pon-Barry et al., 2006)
- Assistants (Allen et al., 2001; Rayner et al., 2005)
3ITSPOKE
- ITSPOKE (Intelligent Tutoring SPOKEn Dialogue System) (Litman and Silliman, 2004)
- Speech-enabled version of the Why2-Atlas computer tutor (VanLehn, Jordan, Rosé et al., 2002)
- Domain: qualitative physics
- Sample ITSPOKE problem
- Suppose a man is in a free-falling elevator and
is holding his keys motionless right in front of
his face. He then lets go. What will be the
position of the keys relative to the man's face
as time passes? Explain.
4Sample dialogue with ITSPOKE
Dialogue Structure
TUTOR1: To analyze this problem we will first describe the motion of the person and his keys while he is holding them. Then we will look at the motion of the person and his keys after he lets go of them. Let's begin by looking at the motion of the man and his keys while he is holding them. How does his velocity compare to that of his keys?
USER1: its the same
TUTOR2: Yes. So while he is holding his keys they both have the same velocity. Now let's see what happens when he releases them. To do so we will first find the individual forces exerted on the man and his keys, then both net forces, both accelerations, and finally we will describe and compare the velocities. So what are the forces exerted on the man after he releases his keys? Please, specify their directions (e.g., vertically up).
USER2: vertically down gravity
TUTOR3: Good. Because there is just one force, gravity's, it is trivial to determine the NET force (i.e., the vector sum of all forces). So what is the direction of the NET force on the person?
...
5Research problem
- What is the utility of discourse structure for spoken dialogue systems?
- Questions
- Why discourse structure?
- Why would discourse structure be useful for dialogue systems?
- Why now?
6Discourse structure
- Discourse: a group of utterances
- Monologue
- Dialogue
- Discourse structure
- Grosz and Sidner theory (Grosz and Sidner, 1986)
7Intention/purpose structure
Solution walkthrough
Two time frames: before release, after release
Before release
Man's velocity vs. keys' velocity
After release
Recipe: Forces → Net force → Acceleration → Velocity
Man: Forces/acceleration
Forces on the man
Net force on the man
...
8Why discourse structure?
- Useful for other tasks
- Understand specific lexical and prosodic phenomena (Hirschberg and Nakatani, 1996; Levow, 2004; Passonneau and Litman, 1997)
- Anaphoric expressions (Allen et al., 2001)
- Natural language generation (Hovy, 1993)
- Predictive/generative models of posture shifts (Cassell et al., 2001)
- Useful for spoken dialogue systems?
- 4 intuitions
9Intuition 1: Context matters
Correctness: Incorrect, Correct, Correct, Incorrect, Correct, Incorrect, Incorrect, Correct, Correct, Incorrect, Incorrect, Correct, Correct, Correct
- It is more important to be correct at specific places in the dialogue.
- Phenomena related to performance
- are not uniformly important across the dialogue
- have more weight at specific places in the dialogue.
- Discourse structure can be used to define places in the dialogue
10Intuition 2 Structure matters
Student that learned more
Student that learned less
Different discourse structure
11Intuition 3: Dialogue phenomena
Certainty: Uncertain, Certain, Certain, Neutral, Neutral, Uncertain, Certain, Neutral, Neutral, Certain, Certain, Neutral, Certain, Uncertain
- Certainty is not uniformly distributed across the dialogue.
- Dialogue phenomena
- are not uniformly distributed across the dialogue
- are more frequent at specific places in the dialogue.
- Discourse structure can be used to define places in the dialogue
12Intuition 4: Graphical representation
- A graphical representation of the discourse structure
- Easier for users to follow the conversation
- Preferred / learn more
13Why now?
- Underlying domain
- Previous work: simple domains (e.g. information access)
- Complex domains
- Tutoring
- Discuss laws, concepts
- Complex knowledge remediation subdialogues
- Complex dialogue → richer discourse structure
14Proposed research program
- Investigate the utility of discourse structure for spoken dialogue systems
- System side
- Performance analysis
- Characterization of user affect
- Characterization of speech recognition problems
- User side
- Graphical representation of the discourse structure
- Navigation Map
- Contributions
- Discourse structure: important information source
- Novel applications of discourse structure
- Advance state-of-the-art in speech-based computer tutors
Intuition 1,2
Tutoring ITSPOKE
Intuition 3
Intuition 4
15Details and current status
- System side
- Discourse transitions: defining places in the dialogue
- Performance analysis
- Parameters for performance analysis
- Inform and evaluate a modification of ITSPOKE
- Characterization of user affect
- Characterization of speech recognition problems
- User side
- Implement the Navigation Map
- Users' perceived utility of the Navigation Map
- Objective utility of the Navigation Map
16Places in the dialogue
- Requirements
- Domain independent
- Automatic
- Discourse structure transitions
- Relationship between current system turn and previous system turn: 6 labels
- Ingredients
- Discourse segment hierarchy
- Transition labeling
17Discourse segment hierarchy
- Automatically annotate the discourse segment hierarchy
- Tutoring information encoded in a hierarchical structure
18ESSAY SUBMISSION ANALYSIS
ITSPOKE behavior / Discourse structure annotation
- Similar automatic annotation possible in other dialogue managers (e.g. COLLAGEN (Rich and Sidner, 1998), RavenClaw (Bohus and Rudnicky, 2003))
[Figure: discourse segment hierarchy; labels: Q1, Q2, Q3, Q2.1, Q2.2, Remediation subdialogue]
19ESSAY SUBMISSION ANALYSIS
Discourse structure transitions
- Properties
- Domain independent
- Automatic
- Places in the dialogue
- Group turns by transition
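As a rough illustration of how transitions could be labeled automatically from the discourse segment hierarchy, here is a minimal Python sketch. It assumes each system turn carries a hierarchical question ID such as Q2.1; this ID scheme and the function names are hypothetical, not necessarily how ITSPOKE represents turns. The slides name five of the six labels (Push, Advance, PopUp, PopUpAdv, NewTopLevel); the sixth label used here, SameGoal, and the exact condition for each label are assumptions of this sketch.

```python
def parse_id(qid):
    """Turn a hypothetical question ID like 'Q2.1' into a tuple (2, 1)."""
    return tuple(int(part) for part in qid.lstrip("Q").split("."))


def label_transition(prev, curr):
    """Label the transition from the previous system turn to the current one.

    prev and curr are tuples from parse_id; prev is None for the first
    system turn of a walkthrough (assumed here to open a top-level segment).
    """
    if prev is None:
        return "NewTopLevel"
    if curr == prev:
        return "SameGoal"      # assumed label: the same question is asked again
    if len(curr) == len(prev) and curr[:-1] == prev[:-1]:
        return "Advance"       # next question within the same segment
    if len(curr) > len(prev) and curr[:len(prev)] == prev:
        return "Push"          # entering a remediation subdialogue
    if len(curr) < len(prev):
        if curr == prev[:len(curr)]:
            return "PopUp"     # back to the question that was being remediated
        return "PopUpAdv"      # back up and on to the next question
    raise ValueError(f"unexpected transition {prev} -> {curr}")


def label_dialogue(question_ids):
    """Label every system turn in a sequence of question IDs."""
    labels, prev = [], None
    for qid in question_ids:
        curr = parse_id(qid)
        labels.append(label_transition(prev, curr))
        prev = curr
    return labels
```

For the sequence Q1, Q2, Q2.1, Q2.2, Q3, this sketch yields NewTopLevel, Advance, Push, Advance, PopUpAdv. Student turns can then be grouped by the label of the system turn they answer.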
21Performance analysis
- Understand where and why a Spoken Dialogue System fails or succeeds
- Performance models
- Performance metrics, e.g. user satisfaction
- Interaction parameters, e.g. number of turns, speech recognition performance
- PARADISE framework (Walker et al., 2000)
Multivariate linear regression: performance metric as a function of interaction parameters
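For reference, a PARADISE-style performance function can be written as a linear combination of normalized interaction parameters. The notation below (P for the performance metric, p_i for the interaction parameters, w_i for the regression weights) is assumed for this sketch and does not come from the slides.

```latex
\hat{P} = \sum_{i=1}^{n} w_i \, \mathcal{N}(p_i),
\qquad
\mathcal{N}(p_i) = \frac{p_i - \bar{p}_i}{\sigma_{p_i}}
```

The weights w_i are estimated by multivariate linear regression. The z-score normalization puts the parameters on a common scale, so the magnitudes of the weights can be compared.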
22Performance modeling in tutoring
- Tutoring domain
- Performance metric: student learning
- Interaction parameters
- Correctness
- Time on task
- User affect (e.g. certainty)
- Number of hints, number of help requests
- Models
- Correlation with learning (e.g. Chi et al., 2001)
- PARADISE models (Forbes-Riley and Litman, 2006; Feng et al., 2006)
- Previous work makes limited use of
- Context in which events occur
- Dialogue patterns
23Intuition 1: Context matters
Correctness: Incorrect, Correct, Correct, Incorrect, Correct, Incorrect, Incorrect, Correct, Correct, Incorrect, Incorrect, Correct, Correct, Correct
- Correctness overall versus correctness after discourse transitions
- It is more important to be correct at specific places in the dialogue.
- Correctness overall versus correctness at specific places in the dialogue
Push
24Intuition 2 Structure matters
Student that learned more
Student that learned less
Push
Push
Push
Advance
Trajectories: 2 consecutive transitions
Different discourse structure
25Discourse structure and performance analysis
- Quality of parameters derived from discourse structure transitions
- Correctness overall versus correctness after specific discourse transitions
- Discourse structure patterns for low learners versus discourse structure patterns for high learners
26Experiment setup - corpus
- Corpus - ITSPOKE
- 20 students, 5 problems per student
- 100 dialogues, 2334 student turns
- Annotations
- Correctness (manual)
- Perfect recognition
- Perfect understanding
- Discourse structure transitions (automatic)
27Experiment setup - parameters
Correctness parameters: counts (#) and percentages (%) for each correctness value per student (e.g. C, PC)
- Comparisons
- Correctness overall versus correctness after specific discourse transitions
- Discourse structure patterns for low learners versus discourse structure patterns for high learners
Transition-correctness parameters: counts (#) and percentages (%) for each transition-correctness value per student (e.g. PopUp-C, Push-UA); relative percentage (% rel) (e.g. PopUp-I % rel)
Transition-transition parameters: counts (#), percentages (%) and relative percentages (% rel) for each transition-transition value per student (e.g. Push-Push)
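The parameters above can be derived from a per-student list of turns. The sketch below is illustrative only: the input format (a list of (transition, correctness) pairs), the function names, and the example data are hypothetical. It also assumes that the relative percentage is taken over the turns that share the same transition, which the slide does not spell out.

```python
from collections import Counter


def transition_correctness_params(turns):
    """Compute per-student parameters from (transition, correctness) pairs.

    Returns a dict keyed by 'Transition-Correctness' holding the count,
    the percentage of all turns, and the percentage relative to the turns
    with the same transition (an assumed reading of "relative percentage").
    """
    pair_counts = Counter(turns)
    transition_counts = Counter(transition for transition, _ in turns)
    total = len(turns)
    params = {}
    for (transition, correctness), n in pair_counts.items():
        params[f"{transition}-{correctness}"] = {
            "count": n,
            "pct": 100.0 * n / total,
            "pct_rel": 100.0 * n / transition_counts[transition],
        }
    return params


def transition_bigram_counts(transitions):
    """Count trajectories of two consecutive transitions (e.g. Push-Push)."""
    return Counter(f"{a}-{b}" for a, b in zip(transitions, transitions[1:]))


# Made-up turns for one student: (transition, correctness)
example_turns = [("Push", "I"), ("Push", "C"), ("PopUp", "I"), ("Advance", "C")]
example_params = transition_correctness_params(example_turns)
example_bigrams = transition_bigram_counts([t for t, _ in example_turns])
```

With the made-up data, Push-I has a count of 1, accounts for 25% of all turns, and for 50% of the turns after a Push.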
28Experiment setup
- Methodology
- Correlations between parameters and learning
- Partial Pearson correlation with PostTest, controlling for PreTest
- Experiment 1
- Correctness parameters versus transition-correctness parameters
- Experiment 2
- Transition-transition parameters
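A partial Pearson correlation between a parameter and PostTest, controlling for PreTest, can be computed from the three pairwise correlations. The following is a minimal sketch using the standard first-order partial correlation formula; the function names are chosen for illustration and no real study data is used.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_pearson(x, y, z):
    """Correlation between x and y after removing the linear effect of z.

    In the setup above, x would be a parameter (one value per student),
    y the PostTest scores, and z the PreTest scores.
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

When the control variable is uncorrelated with both other variables, the partial correlation reduces to the ordinary Pearson correlation.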
29Results Experiment 1 (a)
- Correctness parameters
- No trend/significant correlations
- Correctness out of context not very informative
for modeling student performance
30Results Experiment 1 (b)
- Transition-correctness parameters
Correctness
- PopUp-Correct, PopUp-Incorrect
- Interpretation: capture successful learning events or failed learning opportunities
- ITSPOKE modification: engage in an additional remediation dialogue
31Results Experiment 1 (c)
- Transition-correctness parameters (continued)
Correctness
- NewTopLevel-Incorrect
- Interpretation: ITSPOKE discovers student knowledge gaps
- ITSPOKE modification
- Activate all tutoring topics for a problem
- Skip a tutoring topic if the first user answer is
correct
32Results Experiment 1 (d)
- 1st intuition verified
- Correctness overall < Correctness after discourse transitions
- Correctness after discourse transitions informative for performance modeling
- Intuitive interpretations of the trend/significant correlations
33Experiment 2
Student that learned more
Student that learned less
Push
Push
Push
Advance
Trajectories of length 2: transition-transition parameters
Different discourse structure
34Results Experiment 2
- Transition-transition parameters
[Figure: discourse segment hierarchy; labels: Q1, Q2, Q3, Q2.1, Q2.2, Q2.1.1, Q2.1.2]
- Push-Push
- Interpretation: ITSPOKE discovers major knowledge gaps
- More specific than Push-Incorrect
- Transition-transition parameters informative
- Overlap with transition-correctness but offer additional insights
- 2nd intuition verified
35Conclusions
- Parameters derived from discourse structure (transitions) are informative
- Transition-correctness
- Transition-transition
- Have intuitive interpretations
- ITSPOKE modifications
36Other experiments
- Experiments with certainty
- Similar results
- Preliminary model building experiments
- PARADISE framework includes only transition-based parameters
- Proposed work
- Validate generality to other corpora
- Transition-based parameters
- Models
38ITSPOKE modification
- Modifications
- PopUp-Incorrect: engage an additional remediation dialogue
- NewTopLevel-Incorrect: try all tutoring topics
- Proposed work
- Investigate feasibility of the 2 modifications and select one
- Implement modification
- Run a user study (2 conditions)
- Control condition: original ITSPOKE system
- Experimental condition: modified ITSPOKE system
- Hypothesis: original ITSPOKE < modified ITSPOKE
- Analyze difference between conditions
- Learning (population, subsets)
- Correctness
- Time on task
40Intuition 3: Dialogue phenomena
Dialogue phenomena: 1 0 0 1 0 1 1 1 0 0 1 1 0 1
- Dialogue phenomena not uniformly distributed across the dialogue
- Dependencies between
- Discourse transitions
- Dialogue phenomena
- User affect: uncertainty
- Speech Recognition Problems
[Figure: dependency between Transition and Phenomena, χ² test]
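A dependency test of this kind can be sketched as a Pearson χ² test on a contingency table whose rows are transitions and whose columns are the values of the phenomenon (e.g. uncertain vs. not uncertain). The function name and the table below are made up for illustration and are not results from the ITSPOKE corpus.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    table is a list of rows of observed counts, e.g. one row per transition
    and one column per value of the dialogue phenomenon.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if transition and phenomenon were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Made-up counts: rows = two transitions, columns = phenomenon present / absent
example_stat, example_dof = chi_square([[10, 20], [20, 10]])
```

The statistic is then compared against the χ² critical value for the given degrees of freedom (3.84 for one degree of freedom at the 0.05 level).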
41Results
- Significant dependencies
- Transition and uncertainty
- E.g. increased uncertainty after Push, PopUpAdv
- Transition and Speech Recognition Problems (SRP)
- E.g. increased SRP after Push, PopUp
- 3rd intuition validated
[Figure: dependencies between Transition and SRP, and between Transition and Uncertainty]
43Intuition 4: Graphical representation
- A graphical representation of the discourse structure
- Easier for users to follow the conversation
- Preferred / learn more
The Navigation Map
44The Navigation Map (NM)
- The Navigation Map (NM): dynamic graphical representation of
- Discourse segment purpose/intention
- Discourse segment hierarchy
- Additional features
- Information highlight
- Limited horizon
- Correct answers
- Auto-collapse
45- Manually
- Segment into discourse segments
- Annotate purpose/intention
- Annotate hierarchy
46Why the NM?
- Cognitive Load Theory (Sweller, 1988)
[Figure: information flow between System and User]
Visual channel (Mousavi et al., 1995)
What to communicate over the visual channel?
47What to communicate?
- Current ITSPOKE interface
- Dialogue history
- More important to communicate
- Purpose of the current topic
- How the topic relates to the overall discussion
- Previous tutoring studies
- Graphical organizers (Marzano et al., 2000)
- Guided instruction better than unguided instruction (Kirschner et al., 2006)
- Process worksheets (Nadolski et al., 2005)
- Our system-side applications of discourse structure
- E.g. PopUp-Incorrect correlations
Digested view; set up expectations; facilitates integration
Discourse segment intention; discourse segment hierarchy
49Experiment design
- Solve one problem with the NM and one without the NM (noNM)
- Rate tutor after each problem
- 16 questions, scale from 1 (Strongly Disagree) to 5 (Strongly Agree)
- Two conditions (to account for order and problem)
- F (First): 1st problem NM, 2nd problem noNM
- S (Second): 1st problem noNM, 2nd problem NM
- Differences due to NM
- decrease for F
- increase for S
50Results subjective metrics (1)
- Collected corpus
- 28 users: 13 F, 15 S
- Balanced for gender
- Significant difference between pretest and posttest
- Questionnaire analysis
- Repeated measures ANOVA with one between-subjects factor
- Within-subjects factor: NM Presence (NMPres)
- Between-subjects factor: Condition (Cond)
- Post-hoc tests
51Results subjective metrics (2)
- NM trend/significant effects on system perception
during the dialogue
52Results subjective metrics (3)
- NM trend/significant effects on overall system
perception
53Results subjective metrics (4)
- 24 out of 28 preferred NM over noNM
- 4 liked noNM (2 per condition)
- Divided attention problem
- NM changing too fast
- NM survey
- 75-86% of users agreed (4) or strongly agreed (5) that NM helped them
- Follow the dialogue
- Learn
- Concentrate
- Update essay
- Open question interview
- NM as a structured note taker
- Would use NM for additional instruction after the dialogue
54Results objective metrics
- Preliminary analysis on objective metrics (1st problem only)
- NM presence
- More correct turns
- Fewer speech recognition problems
56NM - Objective utility
- Is perceived utility reflected in objective utility?
- User study: 3 conditions
- Audio-only: ITSPOKE without NM or dialogue history
- Text: ITSPOKE with dialogue history
- NM: ITSPOKE with NM
- StrippedNM: ITSPOKE with NM (only discourse segment purpose and hierarchy)
- Hypothesis
- Audio-only < Text < StrippedNM < NM
57NM - Objective utility (2)
- Proposed work
- Run a user study (3 or 4 conditions)
- noVisual, Text, NM, (Stripped NM)
- Hypothesis: noVisual < Text < Stripped NM < NM
- Analyze difference between conditions
- Learning (population, subsets)
- Correctness
- Time metrics
- Transition-related metrics: transition-correctness parameters, transition-speech recognition problems parameters
58Proposal conclusions
- Applications of discourse structure for spoken dialogue systems
- Useful for system-side and user-side applications
- Performance analysis
- Characterization of user affect and of speech recognition problems
- Navigation Map
- Proposed work
- Validate findings from performance analysis
- Objective utility of the Navigation Map
- Tutoring domain, ITSPOKE
- Easy to replicate in other complex domains/systems
59Acknowledgements
- ITSPOKE group
- Hua Ai, Kate Forbes-Riley, Diane Litman, Greg Nicholas, Amruta Purandare, Scott Silliman, Joel Tetreault, Art Ward
- NLP Group @ U. Pitt
- Committee members
62Experiment design (2)
- ITSPOKE dialogue history was disabled
- Compare Audio-Only versus AudioVisual (NM)
NM
noNM
65Why user affect?
- People detect and respond to conversational partners' affective state
- Affective computing
- Detect
- React
- Exhibit
- Tutoring
- Human tutors respond to students' affective state (e.g. uncertainty, frustration)
66Detecting affect
- Context-dependent features
- Number of turns
- Number of corrections/repetitions
- Dialogue act of the system turn
- Discourse structure information not used
- Acoustic-prosodic features
- Pitch
- Amplitude
- Tempo, duration
- Lexical
- Identification
67Intuition 3: Dialogue phenomena
Certainty: Uncertain, Certain, Certain, Neutral, Neutral, Uncertain, Certain, Neutral, Neutral, Certain, Certain, Neutral, Certain, Uncertain
- Certainty is not uniformly distributed across the dialogue.
- Dependencies between
- discourse transitions
- user affect
[Figure: dependency between Transition and Affect]
68Experiment setup
- Same ITSPOKE corpus
- 20 students, 5 problems per student
- 100 dialogues, 2334 student turns
- Annotations
- Uncertainty (manual)
- Agreement 90%, Kappa 0.68
- Discourse structure transitions (automatic)
- Dependencies: χ² test
[Figure: dependency between Transition and Affect, χ² test]
69Results
- Significant dependency between transition and uncertainty
- Even if we discount for correctness
- Some findings
- For incorrect answers
- Decrease in uncertainty after Advance and PopUp transitions
- Hypothesis: users fail to make the connection between history and current questions
- Discourse structure transitions can be used to characterize user affect
- Used in prediction experiments (Ai et al., 2006)
[Figure labels: Transition, Affect, Correctness]
71Speech recognition problems (SRP)
- Significant dependency between transition and SRP
- Increase in recognition problems after specific
transitions (Push, PopUp)
Transition
SRP
72Results subjective metrics (2)
- System perception during the dialogue
- Structure
- With NM easier to identify tutoring structure
- With NM easier to follow the structure
73Visual channel previous work
75Results subjective metrics (3)
- System perception during the dialogue
- Integration
- With NM better forward looking integration
- With NM better backward looking integration
Effect of learning with NM ?
76Results subjective metrics (4)
- System perception during the dialogue
- Correct answers
- With NM easier to know the correct answer
- With NM easier to know if correct (not
significant)
77Results subjective metrics (5)
- System perception during the dialogue
- Level of concentration
- With NM easier to follow the tutor
78Results subjective metrics (6)
- Overall system perception
- With NM
- Easier to understand tutor's main point
- Easier to learn
- Can concentrate better
- Enjoyed working more (not significant)
- NM version preferred in terms of reuse