Title: Error detection in spoken dialogue systems
1. Error detection in spoken dialogue systems
- GSLT Dialogue Systems, 5p
- Gabriel Skantze
2. Grounding in conversation
- Communication: making something common
- Common ground: the mutual understanding of the participants in a joint action
- Grounding: establishing something as part of common ground well enough for current purposes
- The grounding acts will depend on:
  - Confidence of understanding/prior groundedness
  - The grounding criterion (current purposes)
  - Cost of task failure
  - Cost of grounding
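
The factors above can be read as an expected-cost comparison. The following is a minimal sketch under stated assumptions, not a model taken from the slides: the function name `should_ground`, the numeric cost scale, and the decision rule (ground when the expected cost of failure exceeds the cost of grounding) are all hypothetical.

```python
def should_ground(confidence: float,
                  cost_of_task_failure: float,
                  cost_of_grounding: float) -> bool:
    """Return True if a grounding act (e.g. a verification question) looks worthwhile.

    Assumed rule: ground when the expected cost of acting on a wrong
    interpretation exceeds the cost of the grounding act itself.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    expected_failure_cost = (1.0 - confidence) * cost_of_task_failure
    return expected_failure_cost > cost_of_grounding


# Hypothetical numbers: the same confidence leads to different decisions
# depending on how costly a task failure would be.
print(should_ground(0.8, cost_of_task_failure=10.0, cost_of_grounding=1.0))
print(should_ground(0.8, cost_of_task_failure=2.0, cost_of_grounding=1.0))
```

In this sketch the grounding criterion is implicit in the two cost parameters: raising the cost of task failure makes the system verify at higher confidence levels.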
3. Miscommunication
- Principle of least effort
  - All things being equal, agents try to minimize their effort in doing what they intend to do.
- All communication relies on the trade-off between efficiency and robustness
  - The cost of producing a perfectly interpretable utterance may be higher than the cost of producing a flawed utterance, which can be easily repaired.
  - People normally rely on the error detection and recovery capabilities of the other speaker. It would not be efficient to never be misunderstood.
4. Miscommunication errors in SDS
- Speech detection
  - Barge-in problems, truncated utterances, artifacts
- ASR
  - Deletions, substitutions, insertions
  - Out-of-vocabulary utterances
- Parsing/NLU
  - Concept failure
- Dialogue management
  - Reference resolution
  - Plan recognition
- Response generation
  - Ambiguous references
  - Too much information at once
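
The ASR error types listed above (deletions, substitutions, insertions) are conventionally counted by aligning the recognized word string against a reference transcription with a word-level edit distance. The sketch below illustrates that standard technique; the function name `count_asr_errors` and the tie-breaking order between equally cheap alignments are assumptions, and the example sentences are adapted from the dialogue examples in these slides.

```python
def count_asr_errors(reference: str, hypothesis: str) -> dict:
    """Align hypothesis to reference and count substitutions, deletions, insertions."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # cost[i][j] = (total, subs, dels, ins) for aligning ref[:i] with hyp[:j]
    cost = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        cost[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, len(hyp) + 1):
        cost[0][j] = (j, 0, 0, j)  # every hypothesis word inserted

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            t, s, d, n = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (t, s, d, n)          # match, no cost
            else:
                diagonal = (t + 1, s + 1, d, n)  # substitution
            t, s, d, n = cost[i - 1][j]
            deletion = (t + 1, s, d + 1, n)
            t, s, d, n = cost[i][j - 1]
            insertion = (t + 1, s, d, n + 1)
            # Keep the cheapest alignment; ties go to the first candidate.
            cost[i][j] = min(diagonal, deletion, insertion, key=lambda c: c[0])

    total, subs, dels, ins = cost[len(ref)][len(hyp)]
    return {"substitutions": subs, "deletions": dels, "insertions": ins,
            "wer": total / len(ref)}


print(count_asr_errors("I want to go to Milano", "I want to go to Merano"))
```

The total error count is the minimum edit distance; the split into the three error types reflects one optimal alignment, since several alignments can share the same total.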
5. Errors in human-computer dialogue
- Derriks & Willems (1998) compare:
- Human-human dialogue
  - Miscommunication occurs due to overlapping speech and missing elements (ellipsis), and in the perception of names and numbers.
- Human-computer dialogue (WOZ)
  - Less spontaneous, so less overlapping speech and ellipsis, so fewer problems
  - Still problems with recognition of numbers
  - New problem sources
    - Artificially imposed constraints
    - Complete and standardized responses to particular and partial requests
6. Types of miscommunication
- Non-understanding
  - A participant fails to obtain any interpretation at all, or is not able to choose among several possible interpretations.
- Misunderstanding
  - A participant obtains an interpretation which she believes is complete and correct, but which is not in line with the speaker's intentions.
- Misinterpretation (misconception)
  - A participant's interpretation of an utterance suggests that the speaker's beliefs about the world are out of alignment.
7. Error handling in spoken dialogue systems
- Prevention
- Prediction
- (Prevention)
- ERROR
- Detection
- Recovery
- (Prevention)
8. Grounding in human-computer dialogue
- The computer must display its understanding so that errors can be detected.
- Explicit verification
  - U: I want to go to Milano
  - S: Do you want to go to Merano?
- Implicit verification
  - U: I want to travel from Milano
  - S: At what time do you want to leave from Merano?
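
A common way to choose between explicit and implicit verification is to threshold a confidence score for the recognized value. The sketch below is illustrative only: the function names, the two thresholds (0.3 and 0.7), and the extra "reject" strategy for very low confidence are assumptions, not something specified in these slides.

```python
def choose_verification(confidence: float,
                        reject_below: float = 0.3,
                        explicit_below: float = 0.7) -> str:
    """Map a confidence score in [0, 1] to a verification strategy.

    The thresholds are hypothetical and would need tuning on real data.
    """
    if confidence < reject_below:
        return "reject"
    if confidence < explicit_below:
        return "explicit"
    return "implicit"


def system_turn(departure: str, confidence: float) -> str:
    """Produce the next system prompt for a recognized departure city."""
    strategy = choose_verification(confidence)
    if strategy == "reject":
        return "Sorry, I didn't understand. Please repeat."
    if strategy == "explicit":
        # Explicit verification: ask about the value directly.
        return f"Do you want to travel from {departure}?"
    # Implicit verification: embed the value in the next question.
    return f"At what time do you want to leave from {departure}?"


for score in (0.1, 0.5, 0.9):
    print(score, "->", system_turn("Merano", score))
```

Explicit verification costs an extra turn but makes a wrong value easy to reject; implicit verification is faster but leaves the correction to the user.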
9. First-turn repair
- U: I want to travel to Stockh.. [Detection: disfluency] Stocksund
10. Second-turn repair
- U: I want to travel to Stocksund.
- S: [Detection: non-understanding] Sorry, I didn't understand. Please repeat.
- U: I want to travel to Stocksund.
- S: [Detection: self-misunderstanding] Did you say Stockholm?
11. Third-turn repair
- U: I want to travel from Stockholm
- S: When do you want to leave from Stocksund?
- U: [Detection: other-misunderstanding] I said I want to go from Stockholm!
12. Fourth-turn repair
- U: I am on Kungstensgatan
- S: Turn left
- U: Ok, now I am on Upplandsgatan
- S: [Detection: self-misunderstanding] Weren't you on Kungsträdgårdgatan before you turned?
13. Error detection approaches
- Early detection
  - Decide on the basis of the current user utterance whether it will be recognized and interpreted correctly or not. (Error awareness)
- Late detection
  - Decide on the basis of the current user utterance whether the processing of a previous user utterance gave rise to communication problems.
- Error prediction
  - Decide on the basis of the current user utterance whether the dialogue will become problematic. (Prediction)
14. Using the approaches together
- Error prediction
  - Choosing a dialogue strategy to prevent errors.
- Early detection
  - Determining confidence of understanding. Choosing an appropriate grounding act. How should the system display its understanding?
- Late detection
  - Interpreting the user's response to the grounding act. Was the previous understanding correct?
15. Early and late detection in grounding
- U: I want to travel from Stockholm
- S: [Early detection] When do you want to leave from Stocksund?
- U: I said I want to go from Stockholm!
- S: [Late detection] Ok, when do you want to leave from Stockholm?
16. Error detection methods
- Early detection (error awareness)
  - Feature-based detection
    - Acoustic confidence score
    - Prosody
    - NLP, dialogue and discourse history
- Late detection
  - Detection of negative and positive cues
  - Dialogue expectations
  - Plan-based models
- Error prediction
17. ASR confidence and prosodic features
- Train schedules (Litman et al., 2000)
- Ripper classification (if-then-else rules)
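
Ripper learns an ordered list of if-then rules with a default class, so the resulting classifier can be read as an if-then-else chain. The sketch below shows only that shape. The rules, feature names (`asr_confidence`, `duration_s`, `rms_max`) and thresholds are invented for illustration and are not the rules learned by Litman et al.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = Dict[str, float]


@dataclass
class Rule:
    description: str
    condition: Callable[[Features], bool]
    label: str


# Hypothetical rule list: rules are tried in order, first match wins.
RULES: List[Rule] = [
    Rule("very low ASR confidence",
         lambda f: f["asr_confidence"] < 0.4,
         "misrecognized"),
    Rule("long, loud turn with only moderate confidence",
         lambda f: (f["duration_s"] > 3.0 and f["rms_max"] > 0.8
                    and f["asr_confidence"] < 0.7),
         "misrecognized"),
]
DEFAULT_LABEL = "recognized"  # the final "else" branch


def classify(features: Features) -> str:
    """Apply the ordered rules and fall back to the default class."""
    for rule in RULES:
        if rule.condition(features):
            return rule.label
    return DEFAULT_LABEL


print(classify({"asr_confidence": 0.6, "duration_s": 4.0, "rms_max": 0.9}))
```

One appeal of a rule list over a numeric model is that each decision can be traced to a single readable rule.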
18. Features from all dialogue components
- Automated call center (Walker et al., 2000)
- ASR
  - Num. words, asr-duration, tempo
  - 78.89%
- NLU
  - task, confidence, context-shift, salience
  - 84.80%
- Discourse (DM history)
  - Prompt, reprompt, subdialogue, confirmation
  - 71.97%
- All components
  - 86.16%
20. Verification: positive and negative cues
21. Verification: cue detection
- Detection of positive and negative cues (Krahmer et al., 2001)
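
Late detection through cues can be sketched as a check on the user's reply to a verification prompt. The example below is a simplified illustration: the three cues it looks for (disconfirmation words, correction phrases, long turns), the word lists, and the length threshold are assumptions chosen for the example, not the cue inventory from Krahmer et al.

```python
from typing import List

# Hypothetical cue lexicons for English replies.
DISCONFIRMATION_WORDS = {"no", "nope", "wrong"}
CORRECTION_PHRASES = ("i said", "i meant")


def negative_cues(user_reply: str, long_turn_words: int = 6) -> List[str]:
    """Return the negative cues found in the user's reply to a verification."""
    words = [w.strip(".,!?") for w in user_reply.lower().split()]
    text = " ".join(words)
    cues = []
    if any(w in DISCONFIRMATION_WORDS for w in words):
        cues.append("disconfirmation")
    if any(phrase in text for phrase in CORRECTION_PHRASES):
        cues.append("correction")
    if len(words) > long_turn_words:
        cues.append("long turn")
    return cues


def previous_turn_misunderstood(user_reply: str) -> bool:
    """Treat any negative cue as a sign that the previous turn went wrong."""
    return bool(negative_cues(user_reply))


# Replies to "When do you want to leave from Stocksund?"
print(negative_cues("I said I want to go from Stockholm!"))
print(negative_cues("At ten"))
```

A reply without any negative cue is treated here as a positive cue, meaning the system can consider the verified value grounded and go on.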
22. Dialogue expectations
- Error detection by expectations
  - Unexpected utterances can be signs of misunderstanding.
- Plan-based models
  - Detection and repair of misunderstandings are embedded in the goal-directed behaviour of maintaining intersubjectivity. These models cover third- and fourth-turn repairs (McRoy & Hirst, 1995).
- But
  - Broken expectations are not always signs of misunderstanding. Topic and focus shifts can also lead to unexpected utterances.
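
Expectation-based detection can be sketched as a lookup of which user dialogue acts are expected after each system act. The act names and the table below are hypothetical, and the separate set of topic-shift acts is one simple way to reflect the caveat that a broken expectation is not always a misunderstanding.

```python
from typing import Dict, Set

# Hypothetical dialogue-act inventory; none of these names come from the slides.
EXPECTED_USER_ACTS: Dict[str, Set[str]] = {
    "ask_departure_time": {"give_time"},
    "verify_departure": {"confirm", "disconfirm"},
}
# Acts assumed to legitimately open a new topic at any point.
TOPIC_SHIFT_ACTS: Set[str] = {"new_request", "ask_help"}


def check_expectation(system_act: str, user_act: str) -> str:
    """Classify the user's act relative to what the system expected."""
    if user_act in EXPECTED_USER_ACTS.get(system_act, set()):
        return "expected"
    if user_act in TOPIC_SHIFT_ACTS:
        return "topic shift"
    return "possible misunderstanding"


print(check_expectation("ask_departure_time", "give_time"))
print(check_expectation("ask_departure_time", "give_departure"))
print(check_expectation("ask_departure_time", "new_request"))
```

The label "possible misunderstanding" is deliberately weak: in this sketch an unexpected act is only a signal that should trigger further checking, not proof of an error.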
24. Error prediction
- Approach
  - Decide on the basis of the current user utterance(s) whether the dialogue will be problematic.
- Walker et al. (2000)
  - Dialogues were classified as problematic (36%) or task success (64%, baseline)
  - Trained on features from ASR, NLU and DM
  - First turn: 72%
  - Second turn: 80%
  - Whole dialogue: 87%
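
To put the figures above in perspective, it helps to compare each accuracy with the majority-class baseline. The sketch below reads the reported numbers as percentages (an assumption) and computes how much of the baseline's error each classifier removes; the helper name `relative_error_reduction` is hypothetical.

```python
# Figures from the slide, read as percentages.
BASELINE = 64.0  # always predicting "task success"
ACCURACY = {"first turn": 72.0, "second turn": 80.0, "whole dialogue": 87.0}


def relative_error_reduction(accuracy: float, baseline: float) -> float:
    """Fraction of the baseline's errors that the classifier eliminates."""
    if not 0.0 <= baseline < 100.0:
        raise ValueError("baseline must be in [0, 100)")
    return (accuracy - baseline) / (100.0 - baseline)


for name, acc in ACCURACY.items():
    reduction = relative_error_reduction(acc, BASELINE)
    print(f"{name}: {acc - BASELINE:.0f} points above baseline, "
          f"{reduction:.1%} of baseline errors removed")
```

On this reading, most of the gain is already available after the second turn, which is what makes prediction early in the dialogue useful for choosing a strategy.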
25. Important issues
- Mobile environments
  - Laboratory assessments often overestimate recognition rates in natural field settings (20-50% drop in accuracy)
  - Noise, social interchange, multi-tasking, stress
- Multimodal error handling
  - Error prevention and error recovery
  - Choice of less error-prone modality, simpler utterances, alternation of modality, mutual disambiguation