Emotion in Meetings: Business and Personal

Slides: 33
Provided by: JuliaHir1

Transcript and Presenter's Notes

Title: Emotion in Meetings: Business and Personal


1
Emotion in Meetings: Business and Personal
  • Julia Hirschberg
  • CS 4995/6998

2
(No Transcript)
3
Spotting Hot Spots in Meetings: Human Judgments
and Prosodic Cues - Britta Wrede, Elizabeth
Shriberg
  • Can human listeners agree on utterance-level
    judgments of speaker involvement?
  • Do judgments of involvement correlate with
    automatically extractable prosodic cues?
  • Why might this be useful for meetings?

4
Corpus
  • ICSI Meeting Corpus
  • 75 unscripted, naturally occurring meetings on
    scientific topics
  • 71 hours of recording time
  • Each meeting contains between 3 and 9
    participants
  • Pool of 53 unique speakers (13 female, 40 male)
  • Speakers recorded by both far field and
    individual close-talking microphones
  • Recordings from the close-talking microphones
    were used here

5
Method
  • Subset of 13 meetings (4-8 spkrs) selected
  • Analyzed utterances for involvement
  • Amusement, disagreement, other
  • Hot Spots labeled 1 spkr had high involvement
  • Labeled as amused, disagreeing, other
  • Why didn't they allow context?
  • Why use (9) people who know the spkrs?
  • Why ask them to base their judgment as much as
    possible on the acoustics?
  • Inter-rater agreement measured using Fleiss'
    kappa for pair-wise and overall agreement

6
Inter-rater Agreement
  • Cohen's kappa: 2 raters, categorical data
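
As an illustration of the two-rater case, here is a minimal sketch of Cohen's kappa in plain Python. The function name and the toy labels are assumptions made for this example; they are not data from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each rater's own label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Toy ratings (hypothetical), one label per utterance.
rater_1 = ["involved", "involved", "not", "not", "involved", "not"]
rater_2 = ["involved", "not", "not", "not", "involved", "involved"]
kappa = cohens_kappa(rater_1, rater_2)
```

With these toy labels the raters agree on 4 of 6 items while chance agreement is 0.5, so kappa comes out well below the raw agreement rate.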

7
(No Transcript)
8
  • Fleiss' kappa generalizes Cohen's to multiple
    raters, categorical data
  • Krippendorff's alpha measures agreement of
    multiple raters, any type of data
  • Observed vs. expected disagreement
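
The multi-rater case can be sketched in the same way. The snippet below computes Fleiss' kappa from a table of per-item category counts. The category order and the counts are invented for illustration; only the choice of eight ratings per item echoes the setup described on the following slides.

```python
def fleiss_kappa(counts):
    """counts[i][j] = number of raters who put item i in category j."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")
    # Per-item agreement: share of rater pairs that agree on the item.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement from the pooled category proportions.
    n_cats = len(counts[0])
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    p_e = sum(p * p for p in p_cat)
    if p_e == 1.0:
        return 1.0  # only one category was ever used
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical counts: columns = not involved, amused, disagreeing, other.
toy_counts = [
    [8, 0, 0, 0],
    [0, 8, 0, 0],
    [4, 4, 0, 0],
    [8, 0, 0, 0],
]
toy_kappa = fleiss_kappa(toy_counts)
```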

9
Inter-rater agreement
  • Nine listeners, all of whom were familiar with
    the speakers, provided ratings for at least 45
    utterances, but only 8 ratings per utterance
    were used.

10
Inter-rater Agreement for Meetings
  • Agreement for high-level distinction between
    involved and non-involved yielded a kappa of .59
    (p < .01) -- reasonable
  • When computed over all four categories, reduced
    to .48 (p < .01)
  • More difficulty making distinctions among types
    of involvement (amused, disagreeing and other)

11
Pair-wise agreement
12
Native vs. nonnative raters
13
Acoustic cues to involvement
  • Why prosody?
  • Not enough data in the corpus to allow robust
    language modeling.
  • Prosody does not require ASR results, which might
    not be available for certain audio browsing
    applications or have poor performance on meeting
    data

14
Potential Acoustic Cues to Involvement
  • Certain prosodic features, such as F0, show good
    correlation with certain emotions
  • Studies have shown that acoustic features tend to
    be more dependent on dimensions such as
    activation and evaluation than on emotions
  • Pitch related measures, energy and duration can
    be useful indicators of emotion

15
Acoustic Features Examined
  • F0 and energy based features were computed for
    each word (mean, minimum and maximum considered)
  • Utterance scores obtained by computing average
    over all the words
  • Tried absolute or normalized values
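
To make the word-to-utterance averaging and the idea of normalized values concrete, here is a small sketch. It assumes a per-speaker z-score as the normalization, which is only one possible choice and not necessarily the one used by the authors. The function names and the F0 values are hypothetical.

```python
from statistics import mean, pstdev


def utterance_score(word_values):
    """Average a word-level feature (e.g. mean F0 in Hz) over an utterance."""
    return mean(word_values)


def z_normalize_by_speaker(utterances):
    """utterances: list of (speaker, [word-level values]).

    Returns one z-scored utterance score per input, computed against
    each speaker's own mean and standard deviation (an assumed scheme).
    """
    by_speaker = {}
    for speaker, words in utterances:
        by_speaker.setdefault(speaker, []).append(utterance_score(words))
    stats = {s: (mean(v), pstdev(v)) for s, v in by_speaker.items()}
    normalized = []
    for speaker, words in utterances:
        mu, sigma = stats[speaker]
        score = utterance_score(words)
        normalized.append((score - mu) / sigma if sigma > 0 else 0.0)
    return normalized


# Hypothetical word-level mean F0 values for two speakers.
data = [
    ("A", [110.0, 120.0, 130.0]),
    ("A", [150.0, 170.0]),
    ("B", [200.0, 220.0]),
    ("B", [240.0, 260.0]),
]
z_scores = z_normalize_by_speaker(data)
```

In this toy data the two speakers sit in very different absolute F0 ranges, yet their normalized scores line up, which is the effect normalization is meant to achieve.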

16
Correlations with Perceived Involvement
  • Class assigned to each utterance determined as a
    weighted version of its ratings
  • A soft decision, accounting for the different
    ratings in an adequate way
  • Difference between two classes significant for
    many features
  • Most predictive features all F0 based
  • Normalized features more useful than absolute
    features
  • Patterns remain similar: the most distinguishing
    features are roughly the same when within-speaker
    features are analyzed
  • Normalization removes a significant part of the
    variability across speakers

17
(No Transcript)
18
(No Transcript)
19
(No Transcript)
20
Conclusions
  • Despite subjective nature of task, raters show
    significant agreement in distinguishing involved
    from non-involved utterances
  • Native/non-native differences in
  • Prosodic features of rated utterances indicate
    involvement can be characterized by deviations in
    F0 and energy
  • Possibly general effect over speakers
  • If true, mean, variance, and baseline
    normalizations may be able to remove most
    variability between speakers

21
  • Analysis of the occurrence of laughter in
    meetings
  • - Kornel Laskowski, Susanne Burger

22
Analysis of the occurrence of laughter in
meetings - Kornel Laskowski, Susanne Burger
  • Questions
  • What is the quantity of laughter, relative to the
    quantity of speech?
  • How does the durational distribution of episodes
    of laughter differ from that of episodes of
    speech?
  • How do meeting participants affect each other in
    their use of laughter, relative to their use of
    speech?

23
Method
  • Analysis Framework
  • Bouts, calls and spurts
  • Laughed speech
  • Talk spurt segmentation
  • Using word-level forced alignments in ICSI Dialog
    Act (MRDA) Corpus
  • 300 ms threshold, based on value adopted by the
    NIST Rich Transcription Meeting Recognition
    evaluations
  • Selection of Annotated Laughter Instances
  • Vocal sound and comment instances
  • Laugh bout segmentation
  • Semi-automatic segmentation

24
(No Transcript)
25
Analysis
  • Quantity of laughter
  • Average participant vocalizes for 14.8% of time
    spent in meetings
  • Of this time, 8.6% spent on laughing and
    additional 0.8% spent on laughing while talking.
  • Participants differ in how much time spent
    vocalizing and on what proportion of that is
    laughter
  • Importantly, laughing time and speaking time do
    not appear to be correlated across participants.

26
(No Transcript)
27
Analysis
  • Laughter duration and separation
  • Duration of laugh bouts and temporal separation
    between bouts for a participant?
  • Duration and separation of islands of laughter,
    produced by merging overlapping bouts from all
    participants
  • Bout and bout island durations follow a
    lognormal distribution, while spurt and spurt
    island durations appear to be the sum of two
    lognormal distributions
  • Bout durations and bout island durations have
    an apparently identical distribution, suggesting
    that bouts are committed either in isolation or
    in synchrony, since bout island construction
    does not lead to longer phenomena.
  • In contrast, construction of speech islands
    does appear to affect the distribution, as
    expected.
  • Distribution of bout and bout island
    separations appears to be the sum of two
    lognormal distributions.
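
The island construction described above amounts to taking the union of overlapping bouts across participants. The sketch below does that and then summarizes log-durations, which is the quantity a lognormal fit would work with. All names and times are hypothetical, and the mean and standard deviation of log-durations are only a simple stand-in for a proper distribution fit.

```python
import math
from statistics import mean, pstdev


def merge_into_islands(bouts_by_participant):
    """Union of overlapping (start, end) bouts over all participants."""
    all_bouts = sorted(
        bout for bouts in bouts_by_participant.values() for bout in bouts
    )
    islands = []
    for start, end in all_bouts:
        if islands and start <= islands[-1][1]:
            islands[-1][1] = max(islands[-1][1], end)
        else:
            islands.append([start, end])
    return [tuple(i) for i in islands]


def log_duration_stats(intervals):
    """Mean and standard deviation of log-durations."""
    logs = [math.log(end - start) for start, end in intervals]
    return mean(logs), pstdev(logs)


# Hypothetical laugh bouts in seconds.
bouts = {
    "p1": [(10.0, 11.5), (30.0, 31.0)],
    "p2": [(10.5, 12.0)],
    "p3": [(50.0, 50.8)],
}
islands = merge_into_islands(bouts)
separations = [b[0] - a[1] for a, b in zip(islands, islands[1:])]
mu, sigma = log_duration_stats(islands)
```

Comparing the statistics of individual bouts with those of the merged islands is one way to check whether merging produces longer phenomena.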

28
(No Transcript)
29
Analysis
  • Interactive aspects (multi-participant behavior)
  • Laughter distribution computed over different
    degrees of overlap
  • Laughter has significantly more overlap than
    speech in relative terms: 8.1% of meeting
    speech time versus 39.7% of meeting laughter
    time is overlapped
  • Amount of time in which 4 or more participants
    are simultaneously vocalizing is 25 times
    higher when laughter is considered
  • Exclusion and inclusion of laughed speech
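
One way to compute time spent at each degree of overlap is a sweep over interval start and end points. The sketch below assumes that segments from all participants are pooled into one list and that no participant's own segments overlap each other. The names and times are invented for illustration.

```python
from collections import defaultdict


def time_by_overlap_degree(intervals):
    """Return {degree: seconds} for (start, end) vocalization segments.

    Degree is the number of segments active at the same time.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))
        events.append((end, -1))
    events.sort()  # at equal times, ends (-1) sort before starts (+1)
    totals = defaultdict(float)
    active, previous = 0, None
    for time, delta in events:
        if previous is not None and active > 0:
            totals[active] += time - previous
        active += delta
        previous = time
    return dict(totals)


# Hypothetical segments pooled over participants, in seconds.
segments = [(0, 4), (2, 6), (3, 5), (10, 12)]
totals = time_by_overlap_degree(segments)
overlap_share = (
    sum(t for degree, t in totals.items() if degree >= 2)
    / sum(totals.values())
)
```

Running this once on speech segments and once on laughter segments would give the two overlap shares that the slide compares.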

30
(No Transcript)
31
Interactive aspects (continued)
  • Probabilities of transition between various
    degrees of overlap
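
Transition probabilities between degrees of overlap can be estimated by counting. This sketch assumes the meeting has already been discretized into fixed-length frames, each tagged with its overlap degree. The frame sequence is hypothetical, and the authors' actual model may be set up differently.

```python
from collections import Counter, defaultdict


def transition_probabilities(degrees):
    """Estimate P(next degree | current degree) from a frame sequence.

    degrees: overlap degree per frame. Returns {src: {dst: prob}},
    using relative frequencies of consecutive frame pairs.
    """
    pair_counts = defaultdict(Counter)
    for current, following in zip(degrees, degrees[1:]):
        pair_counts[current][following] += 1
    return {
        src: {dst: n / sum(row.values()) for dst, n in row.items()}
        for src, row in pair_counts.items()
    }


# Hypothetical per-frame overlap degrees.
frames = [0, 0, 1, 1, 2, 2, 2, 1, 0, 0]
probs = transition_probabilities(frames)
```

A high self-transition probability for a state of overlap means that leaving that state is unlikely from one frame to the next.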

32
Conclusions
  • Laughter accounts for 9.5% of all vocalizing
    time, which varies extensively from participant
    to participant and appears not to be correlated
    with speaking time
  • Laugh bout durations have smaller variance than
    talk spurt durations
  • Laughter responsible for significant amount of
    vocal activity overlap in meetings, and
    transitioning out of laughter overlap is much
    less likely than out of speech overlap
  • Authors quantified these effects in meetings, for
    the first time, in terms of probabilistic
    transition constraints on the evolution of
    conversations involving arbitrary numbers of
    participants