Title: Non-Native Users in the Let's Go!! Spoken Dialogue System: Dealing with Linguistic Mismatch
1 Non-Native Users in the Let's Go!! Spoken Dialogue System: Dealing with Linguistic Mismatch
- Antoine Raux, Maxine Eskenazi
- Language Technologies Institute
- Carnegie Mellon University
2 Background
- Speech-enabled systems use models of the user's language
- Such models are tailored for native speech
- Great loss of performance for non-native users who don't follow typical native patterns
3 Previous Work on Non-Native Speech Recognition
- Assumes knowledge about/data from a specific
non-native population
- Often based on read speech
- Focuses on acoustic mismatch
- Acoustic adaptation
- Multilingual acoustic models
4 Linguistic Particularities of Non-Native Speakers
- Non-native speakers might use different lexical
and syntactic constructs
- Non-native speakers are in a dynamic process of
L2 acquisition
5 Outline of the Talk
- Baseline system and data collection
- Study of non-native/native mismatch and effect of
additional non-native data
- Adaptive lexical entrainment
6 The CMU Let's Go!! System: Bus Schedule Information for the Pittsburgh Area
- ASR: Sphinx II
- Parsing: Phoenix
- Dialogue Management: RavenClaw
- Hub: Galaxy
- Speech Synthesis: Festival
- NLG: Rosetta
7 Data Collection
- Baseline system accessible since February 2003
- Experiments with scenarios
- Publicized the phone number inside CMU in Fall
2003
8 Data Collection Web Page
9 Data
- Directed experiments: 134 calls
- 17 non-native speakers (5 from India, 7 from Japan, 5 others)
- Spontaneous: 30 calls
- Total: 1768 utterances
- Evaluation Data
  - Non-Native: 449 utterances
  - Native: 452 utterances
10 Speech Recognition Baseline
- Acoustic Models
  - semi-continuous HMMs (codebook size 256)
  - 4000 tied states
  - trained on CMU Communicator data
- Language Model
  - class-based backoff 3-gram
  - trained on 3074 utterances from native calls
11 Speech Recognition Results
Word Error Rate
- Causes of discrepancy
- Acoustic mismatch (accent)
- Linguistic mismatch (word choice, syntax)
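
Word Error Rate is conventionally computed as the word-level edit distance between the reference transcript and the recognizer hypothesis, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring tool used for Let's Go!!; the function name `word_error_rate` and the lowercasing/whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumption: simple lowercasing and whitespace tokenization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of seven reference words.
    print(word_error_rate("I want to go to the airport",
                          "I want to go the airport"))
```

Because the denominator is the reference length, a hypothesis with many insertions can yield a WER above 1.0.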
12 Language Model Performance
Evaluation on transcripts. Initial model: 3074 native utterances
13 Language Model Performance
Adding non-native data: 3074 native + 1308 non-native utterances
Initial (native) model vs. mixed model
14 Natural Language Understanding
- Grammar manually written incrementally, as the
system was being developed
- Initially built with native speakers in mind
- Phoenix robust parser (less sensitive to
non-standard expressions)
15 Grammar Coverage
- Initial grammar
- Manually written for native utterances
16 Grammar Coverage
- Grammar designed to accept some non-native patterns
- "reach" for "arrive"
- "What is the next bus?" for "When is the next bus?"
17 Relative Improvement due to Additional Data
18 Effect of Additional Data on Speech Recognition
19 Adaptive Lexical Entrainment
- If you can't adapt the system, adapt the user
- System should use the same expressions it expects from the user
- But non-native speakers might not master all target expressions
- Use expressions that are close to the non-native speaker's language
- Use prosody to stress incorrect words
20 Adaptive Lexical Entrainment: Example
21 Adaptive Lexical Entrainment: Algorithm
I want to go the airport
ASR Hypothesis
Confirmation Prompt
DP-based Alignment
Prompt Selection
Emphasis
Target Prompts
22 Adaptive Lexical Entrainment: Algorithm
I'd like to go to the airport
I want to go the airport
ASR Hypothesis
Confirmation Prompt
DP-based Alignment
Prompt Selection
Emphasis
Target Prompts
23 Adaptive Lexical Entrainment: Algorithm
I'd like to go to the airport
I want to go the airport
I want to go to the airport
ASR Hypothesis
Confirmation Prompt
DP-based Alignment
Prompt Selection
Emphasis
Target Prompts
26 Adaptive Lexical Entrainment: Algorithm
I'd like to go to the airport
I want to go the airport
I want to go to the airport
Did you mean ... ?
ASR Hypothesis
Confirmation Prompt
DP-based Alignment
Prompt Selection
Emphasis
Target Prompts
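
The pipeline in the diagram (align the ASR hypothesis against each target prompt, select the closest prompt, then emphasize the words that differ) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the slides do not give the alignment costs, so unit-cost word-level edit distance is assumed. The function names (`align`, `select_prompt`, `confirmation_prompt`), the uppercase rendering of emphasis (standing in for prosodic stress), and tie-breaking by list order are all hypothetical.

```python
from typing import List, Optional, Tuple

# Each pair is (hypothesis word or None, target word or None).
Alignment = List[Tuple[Optional[str], Optional[str]]]


def same(a: str, b: str) -> bool:
    """Case-insensitive word comparison (an assumption of this sketch)."""
    return a.lower() == b.lower()


def align(hyp: List[str], target: List[str]) -> Tuple[int, Alignment]:
    """Unit-cost DP alignment; returns (edit distance, aligned word pairs)."""
    n, m = len(hyp), len(target)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if same(hyp[i - 1], target[j - 1]) else 1
            dist[i][j] = min(dist[i - 1][j - 1] + sub,  # match/substitution
                             dist[i - 1][j] + 1,        # extra hypothesis word
                             dist[i][j - 1] + 1)        # missing target word
    # Trace back from the bottom-right cell to recover the alignment.
    pairs: Alignment = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0 if same(hyp[i - 1], target[j - 1]) else 1
            if dist[i][j] == dist[i - 1][j - 1] + sub:
                pairs.append((hyp[i - 1], target[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            pairs.append((hyp[i - 1], None))
            i -= 1
        else:
            pairs.append((None, target[j - 1]))
            j -= 1
    pairs.reverse()
    return dist[n][m], pairs


def select_prompt(hypothesis: str, targets: List[str]) -> Tuple[str, int, Alignment]:
    """Pick the target prompt closest to the ASR hypothesis."""
    if not targets:
        raise ValueError("at least one target prompt is required")
    hyp = hypothesis.split()
    best: Optional[Tuple[str, int, Alignment]] = None
    for target in targets:
        distance, pairs = align(hyp, target.split())
        if best is None or distance < best[1]:  # ties keep the earlier prompt
            best = (target, distance, pairs)
    assert best is not None
    return best


def confirmation_prompt(hypothesis: str, targets: List[str]) -> Optional[str]:
    """Return an entrainment prompt, or None when the best target matches."""
    _, distance, pairs = select_prompt(hypothesis, targets)
    if distance == 0:
        return None  # perfect match: no entrainment prompt is triggered
    words = []
    for hyp_word, tgt_word in pairs:
        if tgt_word is None:
            continue  # word present only in the hypothesis; nothing to speak
        matched = hyp_word is not None and same(hyp_word, tgt_word)
        words.append(tgt_word if matched else tgt_word.upper())
    return "Did you mean " + " ".join(words) + "?"


if __name__ == "__main__":
    targets = ["I'd like to go to the airport", "I want to go to the airport"]
    print(confirmation_prompt("I want to go the airport", targets))
```

With the slide's example, the second target prompt is one edit away from the hypothesis, so it is selected and the missing word is the only one emphasized. Returning `None` on an exact match mirrors the trigger rule stated in the study slide (entrainment fires whenever the best target does not perfectly match the ASR output).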
27 Adaptive Lexical Entrainment: Study
- 24 dialogues, 12.4 turns/dialogue, 2.5 entrainment prompts/dialogue
- Trigger entrainment whenever best target does not perfectly match ASR
- Too simple? 40.7% false positives
- 64.4% of prompts are triggered by misrecognitions
28 Conclusion
- LM and Grammar perform significantly worse on
non-native speech
- Including non-native data reduces the gap in
linguistic metrics but not WER
- Future work
- Improve Speech Recognition
- More data for LM
- Acoustic/Lexical adaptation
- Integrate adaptive lexical entrainment in the
dialog management framework
29 Example Dialogue
S: Welcome to the Let's Go!! bus information system. What can I do for you?
U: Excuse me. I'm in the place which uh where the Forbes Avenue and Murray Avenue are crossing and I want to go to the place mm Forbes where the Forbes Avenue and Craig Street are crossing.
S: I heard SIX. Is that correct?
U: Which bus uh should I take? Uh sorry uh maybe 61C bus I have What time the next 61C bus will come?
S: The time is currently 5:37 PM. Where are you leaving from?
30 Example Dialogue
S: Welcome to the Let's Go!! bus information system. What can I do for you?
U: I want to go to downtown now I'm at Fifth and Bigelow.
S: Sorry, did you say "I want to go to downtown"?
U: Which number bus can I take and when will the next bus come here?
S: ... What can I do for you?