Title: Back Channel Communication
1. Back Channel Communication
- Antoine Raux
- Dialogs on Dialogs 02/25/2005
2. Outline
- From Back Channel to backchannels
- Function of the Back Channel
- Characteristics of the Back Channel
- The Back Channel in Spoken Dialogue Systems
3. From back channel...
- 1970s: Conversation Analysts attempt to describe systematic rules for turn-taking management
- Goal: minimize gaps and overlaps between speakers
- BUT there are many overlaps in natural speech
- E.g. mm-hmm, okay, yeah
- Back channel (Yngve 1970): parallel channel for communication
- Duncan (1972):
- Back channel communication does not constitute a turn or a claim for a turn
- But it may participate in a variety of communication functions, including the regulation of speaking turns.
4. ...to backchannels
- Backchannel: listener-produced signal such as mm-hmm, yeah (to backchannel: to produce such signals)
- Does not imply the will to take the turn
- Implies some form of acknowledgment (in general)
5. Front vs Back Channel
                | Front Channel | Back Channel
Function        | Propositional, Transactional, Conversation management, Social | Conversation management, Social
Protocol        | Turn-taking, Floor sharing | ? (controlled by FC?), No floor to share
Lexical content | Anything | Vocalizations, short words, phrases (That's true)
6. Front-channel cues to back-channel signals
- Koiso et al (1998)
- Analyze the relationship between different
syntactic and prosodic features and the
occurrence of backchannels
7. Koiso et al. (Methodology)
- Data: 8 dialogs from the Japanese Map Task corpus
- Replica of the Edinburgh Map Task (MT)
- Face-to-face and speech only (no difference)
- Features
- Syntactic: POS
- Duration of last mora (normal/long/short)
- F0 pattern of last mora (flat-fall, rise)
- Peak F0 (low/high)
- Energy pattern (late-decr, decr, no-decr)
- Peak energy (low/high)
8. Koiso et al. (Results)
- Frequency of feature values
- BC > no-BC:
- POS: verb-phrase, post-position, conjunction
- F0 pattern: flat-fall or rise-fall
- Energy pattern: late-decr
- Peak energy: high
- no-BC > BC:
- POS: adv, conjunction, interjection, filler
- Duration: short
- F0 pattern: fall or flat
- Energy pattern: non-decr
- Peak energy: low
9. Koiso et al. (Results)
- Decision Tree analysis
- Compare the loss in performance by not using each feature
- POS is the single best feature
- Prosodic features taken all together are as good as POS
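The "loss in performance by not using each feature" comparison can be sketched as a leave-one-feature-out ablation. This is only an illustration under stated assumptions: it swaps the decision tree for a simple majority-vote lookup table, scores on the training rows themselves, and runs on four invented toy rows. The function names, feature names, and data are hypothetical and do not reproduce the setup or results of Koiso et al.

```python
from collections import Counter, defaultdict


def train_lookup(rows, labels, features):
    """Majority-vote lookup table keyed on the selected feature values.

    Stand-in for a decision tree (an assumption made to keep the sketch small).
    """
    votes = defaultdict(Counter)
    for row, label in zip(rows, labels):
        votes[tuple(row[f] for f in features)][label] += 1
    default = Counter(labels).most_common(1)[0][0]
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}
    return table, default


def accuracy(rows, labels, features):
    """Accuracy on the same rows used for training (no held-out set here)."""
    table, default = train_lookup(rows, labels, features)
    hits = sum(
        table.get(tuple(row[f] for f in features), default) == label
        for row, label in zip(rows, labels)
    )
    return hits / len(labels)


def ablation(rows, labels, features):
    """Accuracy lost when each feature is removed in turn."""
    full = accuracy(rows, labels, features)
    return {
        f: full - accuracy(rows, labels, [g for g in features if g != f])
        for f in features
    }


# Hypothetical toy data, invented for illustration only.
rows = [
    {"pos": "verb", "f0": "rise", "energy": "high"},
    {"pos": "verb", "f0": "fall", "energy": "low"},
    {"pos": "filler", "f0": "rise", "energy": "high"},
    {"pos": "filler", "f0": "fall", "energy": "low"},
]
labels = ["BC", "BC", "no-BC", "no-BC"]
losses = ablation(rows, labels, ["pos", "f0", "energy"])
print(losses)
```

The feature whose removal costs the most accuracy is the one the classifier depends on most; a real analysis would measure the loss on held-out data.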
10. Koiso et al. (Discussion)
- Some POS strongly inhibit BC
- Individual prosodic features are not good indicators of BC occurrence
- BC occurrence is conditioned by both POS and prosody (as a whole)
- What about other languages?
- What about BC overlapping with speech?
11. BC cues in English and Japanese
- Ward and Tsukahara (2000)
- Tests one hypothesis (BC are triggered by low
pitch cues) for two languages
12. The Low Pitch Cue
- Both in American English and Japanese, it appears that after a region of low pitch lasting 110 ms the listener tends to produce back-channel feedback.
- Goal of this paper: quantitatively test this on naturally occurring conversations
13. Ward and Tsukahara (Methodology)
- Data
- English: 8 conversations, 12 speakers (first author participates in 5 conversations!)
- Japanese: 18 conversations, 24 speakers
- Prediction
- Every 10 ms, decide BC/no-BC by applying a hand-coded rule with 5 parameters tuned to the data
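A frame-by-frame low-pitch rule of this kind might look like the sketch below. Only the 10 ms decision step and the 110 ms low-pitch duration come from the slides; the five tuned parameters of the actual rule are not listed here, so the absolute pitch threshold `low_threshold_hz`, the handling of unvoiced frames, and the function name are assumptions made for illustration.

```python
FRAME_MS = 10  # decision step taken from the slide


def predict_backchannels(f0_frames, low_threshold_hz, min_low_ms=110):
    """Return times (ms) at which a low-pitch region reaches min_low_ms.

    f0_frames: one F0 value in Hz per 10 ms frame, or None if unvoiced.
    low_threshold_hz: hypothetical cutoff below which pitch counts as low.
    Unvoiced frames reset the region (an assumption of this sketch).
    """
    needed = min_low_ms // FRAME_MS  # 11 consecutive low frames
    predictions = []
    run = 0  # length of the current run of low-pitch frames
    for i, f0 in enumerate(f0_frames):
        if f0 is not None and f0 < low_threshold_hz:
            run += 1
        else:
            run = 0
        # Fire once, at the end of the frame that completes the region.
        if run == needed:
            predictions.append((i + 1) * FRAME_MS)
    return predictions


# 50 ms of high pitch, 120 ms of low pitch, 30 ms of high pitch.
track = [200.0] * 5 + [90.0] * 12 + [200.0] * 3
print(predict_backchannels(track, low_threshold_hz=100.0))
```

Firing only when the run first reaches the required length keeps a long low-pitch stretch from producing a prediction on every subsequent frame.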
14. Ward and Tsukahara (Results)
- Each predicted BC was considered correct if it fell within 500 ms of an actual BC
- The low pitch region rule is better than chance in both English and Japanese
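The tolerance-window scoring can be sketched as follows. The 500 ms window is from the slide; treating the boundary as inclusive, allowing several predictions to match the same actual BC, and also reporting coverage of actual BCs are assumptions of this sketch, and the function names are hypothetical.

```python
def within(t, others, tol_ms):
    """True if time t lies within tol_ms of any time in others."""
    return any(abs(t - o) <= tol_ms for o in others)


def evaluate(predicted_ms, actual_ms, tol_ms=500):
    """Score predicted BC times against actual BC times.

    Returns (precision, coverage): the fraction of predictions that fall
    near an actual BC, and the fraction of actual BCs that have a
    prediction nearby.
    """
    correct = sum(within(p, actual_ms, tol_ms) for p in predicted_ms)
    covered = sum(within(a, predicted_ms, tol_ms) for a in actual_ms)
    precision = correct / len(predicted_ms) if predicted_ms else 0.0
    coverage = covered / len(actual_ms) if actual_ms else 0.0
    return precision, coverage


# Made-up times in ms, for illustration only.
precision, coverage = evaluate([1000, 3000, 9000], [1200, 3600, 5000])
print(precision, coverage)
```

Widening `tol_ms` raises both scores, so results scored this way depend heavily on the window size.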
15. Ward and Tsukahara (Results)
- Issues
- Evaluation (tolerance window size, speakers produce BCs with different frequencies)
- No actual comparison between languages
- Are low pitch regions and BCs simply correlated with other phenomena (syntactic completion, disfluencies), or is there a direct cause/consequence relationship?
16. Effects of Native Language and Gender on BC
- Feke (2003)
- Conversation Analysis study of BC in
native-English and native-Spanish, same- and
mixed-gender dialogs
17. Definition of BC
- BC: responses of the participant who is clearly not holding the floor
- Very loose compared to previous papers
- E.g. "How did you find Quechua?" is a BC
- Distinguishes In-Between BC and Overlap BC
18. Feke (Methodology)
- Recorded 8 non-scripted conversations between 8 different speakers (2 native languages x 2 genders x 2 subjects)
- Manually coded In-Between BCs and Overlap BCs
19. Feke (Results)
- No differences observed across cultures
- Participants of both genders tend to use more BCs when conversing with someone of the opposite gender
- Difference seems bigger for females than for males
20. Feke (Discussion)
- Interesting/surprising result from the ethnological/sociological point of view
- Very few data points, no significance analysis
- Only looked at number of BCs
- Consequences for SDS? (e.g. using gender information in BC prediction, selecting the gender of an agent)
21. BC in Practical Systems
- Takeuchi et al (2003)
- Method to determine the timing of turn
transitions and aizuchi (BC) on Japanese
Human-Human corpus
22. Takeuchi (Approach)
- Similar to Koiso et al., but only using automatically extracted features
- Every 100 ms, decide between:
- Take turn
- Aizuchi (BC)
- Leave turn (wait)
23. Takeuchi (Approach)
- Decision Tree using
- Syntax (POS, content/function words)
- Utterance duration
- Pause duration/pause since last content word
- Content word duration
- F0
- Power
24. Takeuchi (Results)
- Precision/Recall of frame classification
- Around 80% on the training set
- Less than 50% on a test set
- Subjective evaluation
- Artificially insert BC at predicted time
- Timing was judged good in 70-80% of cases
- On real utterances: 72% (!)
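Frame-level precision and recall for one of the three decisions can be computed as in the sketch below. The label strings, function name, and the five example frames are hypothetical and only illustrate the metric; they are not Takeuchi et al.'s data.

```python
def precision_recall(predicted, actual, label):
    """Precision and recall of one class over aligned per-frame labels."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(p == label and a == label for p, a in pairs)
    n_predicted = sum(p == label for p, _ in pairs)
    n_actual = sum(a == label for _, a in pairs)
    precision = true_pos / n_predicted if n_predicted else 0.0
    recall = true_pos / n_actual if n_actual else 0.0
    return precision, recall


# One label per 100 ms frame: "take", "aizuchi", or "wait" (made-up values).
predicted = ["wait", "aizuchi", "aizuchi", "take", "wait"]
actual = ["wait", "aizuchi", "wait", "take", "aizuchi"]
print(precision_recall(predicted, actual, "aizuchi"))
```

Scoring each frame independently penalizes a BC predicted one frame early or late as fully wrong, which is one way frame-level scores and listener judgments of timing can diverge.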
25. Takeuchi (Discussion)
- Found that syntactic information did not help (contradicts Koiso?)
- Underscores the difficulty of evaluating turn-taking/backchanneling systems
26. Conclusion
- Hard to account for simultaneous turns in conversation
- Back Channel framework offers one explanation
- But most work remains very specific
- Missing a good theory of conversation