Title: Evoking and Assessing Cooperation in Dialogue (Systems)
1. Evoking and Assessing Cooperation in Dialogue (Systems)
- Talk in the EGK-colloquium
- Dec 3, 2001
- Bettina Braun
2. Content
- Introduction: what I did so far...
- What is cooperation?
  - in general terms
  - in phonetic terms
- Plans from now on: user behaviour in interaction with dialogue systems and human operators
- Experiment: cooperative operators, ...
3. So far: Error resolution in computer-directed speech
- Hyperarticulation to correct machines
- Increased error rates in ASR systems
- Find solutions for ASR to cover changes in
  - F0 movement
  - intensity/spectral tilt
  - duration
  - segmental realisation, e.g. formant values
4. Pilot study in speech-operated lift
- Evoke corrections in several focus conditions, triggered by the system's output
  - broad focus
  - narrow focus
  - contrastive focus
- Broad focus not possible in that domain!
- Concept of contrastive vs. narrow focus very vague
- Signal quality too poor for phonetic analysis (noise, reverberation, off-talk, ...)
5. Loophole -- possible ways to go...
- New recordings with a more elaborate system to evoke hyperarticulation in several contexts (and focus conditions)
  - already quite thoroughly analysed
- Apart from hyperarticulation: various forms of emotions and attitudes (uncertainty, anger), e.g. calling contour in addressing the lift
- Analysis of global behaviour of users, because...
6. Communicative situation
- Computer-directed speech differs from normal conversation (back-channels, interruptions, visual co-presence)
- Conceptual processing is partner-dependent (e.g. mutual beliefs, grounding)
- Error resolution demands cooperation!
- Emotional component (relationship to computer, with respect to greetings, politeness, etc.)
7. Now: Analysis of human behaviour
- Interest in human behaviour when interacting with machines (as opposed to human partners)
- How would human operators act?
- How can cooperation be evoked in dialogue?
- How can cooperation be assessed (in dialogue systems)?
8. Definition of Cooperation (Allwood 1976)
- Take partner into cognitive consideration
- Joint purpose (of understanding)
- Ethical consideration
  - not to hurt each other
  - not to force each other
  - facilitate rational behaviour
- Trust to act in accordance with above points
9. Phonetic Realisation of Cooperation
- Expressing degree of understanding, either by appropriate dialogue act or prosody
  - cinema info: "Sie möchten in N. den Film .. SEHEN?" (You would like to SEE the film .. in N.?)
  - train timetable: "Wann möchten Sie von SB nach HH fahren?" (When would you like to travel from SB to HH?)
- Expressing what part of information was not understood
  - shift focus to appropriate position
  - deaccent understood information
Importance?
10. Existing dialogue systems (I)
- Technique: finite-state automata; transitions and speech output (mainly canned speech) determined by key-word or key-phrase spotter
  - obligatory mapping to items in lexicon
- Difference in how information slots are filled
- Barge-in not always allowed
- Speech output not prosodically adapted with respect to
  - given/new information
  - certainty in checks/query-YN
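The finite-state technique described above can be sketched roughly as follows. This is a minimal illustration, not any particular system: the state names, the lexicon entries and the transition table are hypothetical, the prompts borrow the banking wording used elsewhere in this talk, and the "spotter" is plain substring matching on text rather than on a speech signal. The check step of a real system is left out.

```python
# Hypothetical finite-state dialogue manager driven by a key-word spotter.
# States, lexicon and transitions are illustrative assumptions.

# Canned prompt that is played on entering each state.
PROMPTS = {
    "ask_action": "Deposit or withdraw?",
    "ask_amount_withdraw": "How much withdraw?",
    "ask_amount_deposit": "How much deposit?",
    "hand_out": "Please take your money",
    "take_in": "Please put money in case",
}

# Lexicon: every spotted key-word or key-phrase maps to one lexicon item.
LEXICON = {
    "withdraw": "withdraw",
    "take out": "withdraw",
    "deposit": "deposit",
    "pay in": "deposit",
}

# Transition table: (current state, spotted item) -> next state.
TRANSITIONS = {
    ("ask_action", "withdraw"): "ask_amount_withdraw",
    ("ask_action", "deposit"): "ask_amount_deposit",
    ("ask_amount_withdraw", "AMOUNT"): "hand_out",
    ("ask_amount_deposit", "AMOUNT"): "take_in",
}


def spot(utterance):
    """Return the set of lexicon items (plus 'AMOUNT' for a number) found in the text."""
    text = utterance.lower()
    found = {item for keyword, item in LEXICON.items() if keyword in text}
    if any(token.isdigit() for token in text.split()):
        found.add("AMOUNT")
    return found


def step(state, utterance):
    """Return (next_state, system_prompt); stay in place with 'Pardon?' if nothing fits."""
    for item in sorted(spot(utterance)):
        next_state = TRANSITIONS.get((state, item))
        if next_state is not None:
            return next_state, PROMPTS[next_state]
    return state, "Pardon?"
```

In this sketch, `step("ask_action", "I want to withdraw 2000 DEM")` still moves to the amount question, because only one slot is filled per turn; a system that fills both slots at once would need a different transition table, which is one way the slot-filling differences mentioned above show up.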
11. Existing dialogue systems (II)
- Often (?) successful interaction, because
  - users are very adaptive
  - only limited-domain applications
- BUT user satisfaction also important
  - certainty of checks (which check wrong info)
  - information packaging mostly inappropriate
  - user frustration ...
12. How to deal with frustrated users?
- Automatic detection of anger to pass over to human operator
  - very difficult task (cf. Batliner et al. 2000)
- Different strategies of clarification initiations (e.g. apologies by the system, cf. Fischer 2000)
- Approach here: find out which system behaviour is judged positively
  - appropriate
  - cooperative
13. Overall experimental design
- Investigate user behaviour in 4 conditions
  - Wizard-of-Oz (WoZ) with synthesised speech output
  - WoZ with canned speech
  - human operator with restricted set of utterances
  - (human operator with free interaction)
- Ask subjects to judge appropriateness of each system's behaviour (set of subjects!!!)
14. Experimental design (overview): How do human bankers behave?
- Production study with 12 human operators
  - studio recordings of all possible utterances (16)
  - interactive condition with restricted set of utterances and customers believed to be real
- Evaluation of dialogue structure
  - especially check vs. query-YN
  - certainty/uncertainty
  - after previous correction by customer
- Perception test
15. Why this setting?
- Banking scenario very familiar
- Domain small, but extendable (transfers)
- Studio recordings as no-context condition
- Restricted set of utterances to simulate real systems and to exclude other factors that signal cooperation
  - adaptation to lexical and syntactic forms
  - adaptation to utterance length
- Different customers to exclude adaptation to voice characteristics, f0, ...
16. Collecting and preparing customer data
- Limited-domain WoZ system with bad performance
- System prompts (from the dialogue flow chart):
  - Deposit or withdraw?
  - Pardon?
  - How much withdraw?
  - How much deposit?
  - Check amount action (in both the withdraw and the deposit branch)
  - Please take your money
  - Please put money in case
- Data collection: 5 customers (4 tasks)
- Parts of these signals were distorted
  - amount of money ("I want to withdraw 2x DEM")
  - action, withdraw or deposit ("I want to xxx 5000 DEM")
19. Constructing possible dialogues...
- Dialogue interaction with 7 customers
- Things to consider
  - (re)action of operator must be predictable
  - customer sound files must be prepared for each possible reaction
  - customers' utterances must fit (prosodically) into the dialogue (e.g. hyperarticulation in corrections)
- Problem: customer reactions for free interaction cannot be simulated!
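One way to meet the first two considerations above (predictable operator reactions, one prepared customer file per possible reaction) is a lookup table from the current stimulus and the operator's utterance to the next customer recording. The sketch below is only an illustration under stated assumptions: the file names, the stimulus labels and the coverage check are hypothetical and not the setup actually used.

```python
# Hypothetical lookup of prepared customer recordings.
# All file names and stimulus labels are invented for illustration.

# (customer's current stimulus, operator utterance) -> next customer sound file
RESPONSES = {
    ("action_distorted", "Deposit or withdraw?"): "cust1_withdraw.wav",
    ("action_distorted", "Pardon?"): "cust1_repeat_hyper.wav",
    ("amount_distorted", "How much withdraw?"): "cust1_amount.wav",
    ("amount_distorted", "Pardon?"): "cust1_amount_hyper.wav",
}


def next_customer_file(stimulus, operator_utterance):
    """Return the prepared sound file, or None if this reaction was not anticipated."""
    return RESPONSES.get((stimulus, operator_utterance))


def unanticipated(stimuli, permitted_utterances):
    """List (stimulus, utterance) pairs for which no recording has been prepared."""
    return [
        (stimulus, utterance)
        for stimulus in stimuli
        for utterance in permitted_utterances
        if (stimulus, utterance) not in RESPONSES
    ]
```

Running `unanticipated` over all stimuli and the full set of permitted operator utterances before the experiment would show which reactions still lack a recording. With free interaction the set of permitted utterances is open-ended, so such a table cannot be completed.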
20. Instruction of operators as bank assistants
- Utterances restricted to a fixed set:
  - Deposit or withdraw?
  - Pardon?
  - How much withdraw?
  - How much deposit?
  - Check amount action (in both the withdraw and the deposit branch)
  - Please take your money
  - Please put money in case
21. Stimuli and (expected) reactions
- Amount not mentioned/distorted
  - Stimulus: "I want to withdraw money" / "I want to withdraw xxxx"
  - Expected reaction: "How much do you want to withdraw?" (but with differing intonation)
  - Stimulus: "I want to withdraw DEM 2xx"
  - Expected reaction: "You want to withdraw DEM 2000 (200), then?" (query-YN-like, because of uncertainty)
Inappropriate intonation
Certain check
22. Stimuli and (expected) reactions
- Action distorted
  - Stimulus: "I want to xxx money" / "I want to xxx DEM 5000"
  - Expected reactions:
    - "Do you want to deposit or withdraw money?" ("Do you want to deposit or withdraw DEM 5000?" is not a permitted utterance)
    - "How much do you want to deposit/withdraw?"
    - "Pardon?"
24. Stimuli and (expected) reactions
- Check after user correction
  - apologising impression
- Query-W after user correction
  - action distorted: shift from question particle to "deposit or withdraw"
- Repeated misunderstanding
  - change intonation of "Pardon?"
  - did not happen
  - add explanation to "Pardon?", react with a check
25. Reasons for not being cooperative
- Humans are not cooperative (v. map task)
- Situation too unnatural
  - no real customers
  - permitted utterances normally not actively used
- Task too complicated
  - too little time to get used to utterances, i.e. finding the fitting one poses problems (mapping problem)
  - signal quality of customers too poor
- Operators indeed understood every customer
26. Possible solutions
- Assessment of signal quality: ratings from 1 to 6 after each dialogue (-> very poor ratings)
- Increase familiarity with utterances
  - first, role plays with experimenter (maybe with changed roles)
  - first, ratings of appropriateness, cooperation, etc. of other operators
  - two runs, discussing problems of first run in between (customers then no longer believed to be real)
27. Which behaviour is appropriate for dialogue systems?
- Should systems behave like humans?
- Assessment of cooperation, appropriateness, etc. under 2 conditions
  - assuming that human-human interaction is judged
  - assuming that man-machine interaction is judged
28. Assessment of cooperation
- User satisfaction in interaction with dialogue systems (e.g. PARADISE, M. Walker): mixed-initiative vs. system-initiative, ...
- Assessing appropriateness, naturalness of synthetic speech output (cf. J. House, S. Hawkins)
- How to tease dialogue behaviour and prosodic cues to cooperation apart?
29. Perception test I
- Comparison between studio and interactive behaviour of same dialogue with respect to
  - cooperation, friendliness
  - naturalness, appropriateness of operator
  - only possible for selected cooperative utterances!
- Interactive conditions contain a lot of extralinguistic noise (breathing, smacking, long pauses) -> clean them beforehand
  - interactive ? studio
30. Perception test II
- Comparison of dialogues with same goal but different operators to evaluate different (prosodic and non-prosodic ???) strategies
  - too many variables
    - dialectal influence
    - male/female
    - voice characteristics
- Examples
  - ?
31. Remaining questions....
- How to build a natural, but controllable environment for collecting cooperative operator data
- First perform a map-task to investigate cooperation in human-human dialogue? (changed roles?)