1
Robust Recognition of Emotion from Speech
  • Mohammed E. Hoque
  • Mohammed Yeasin
  • Max M. Louwerse
  • {mhoque, myeasin, mlouwerse}@memphis.edu
  • Institute for Intelligent Systems
  • University of Memphis

2
Presentation Overview
  • Motivation
  • Methods
  • Database
  • Results
  • Conclusion

3
Motivations
  • Animated agents that recognize emotion in an e-Learning environment.
  • Agents need to be sensitive and adaptive to learners' emotions.

4
Methods
  • Our method is partially motivated by the work of Lee and Narayanan [1], who first introduced the notion of salient words.

5
Shortcomings of Lee and Narayanan's work
Lee and Narayanan argued that there is a one-to-one correspondence between a word and a positive or negative emotion. This is NOT true in every case.
6
Examples
Figure 1: Pictorial depiction of the word "okay" uttered with different intonations to express different emotions (confusion, flow, normal, delight).
7
More examples
"Scar!!" vs. "Scar??"
8
More examples
"Two months!!" vs. "Two months??"
9
Our Hypothesis
  • Lexical information, extracted from combined prosodic and acoustic features that correspond to the intonation patterns of salient words, will yield robust recognition of emotion from speech.
  • It also provides a framework for signal-level analysis of speech for emotion.

10
Creation of Database
11
Details on the Database
  • 15 utterances were selected for four emotion categories: confusion/uncertain, delight, flow (confident, encouragement), and frustration [2].
  • Utterances were stand-alone, ambiguous expressions in conversations, dependent on context.
  • Examples are "Great", "Yes", "Yeah", "No", "Ok", "Good", "Right", "Really", "What", "God".

12
Details on the Database
  • Three graduate students listened to the audio
    clips.
  • They successfully distinguished between the positive and negative emotions 65% of the time.
  • No specific instructions were given as to what
    intonation patterns to listen to.

13
High Level Diagram
Figure 2: High-level description of the overall emotion recognition process (word-level utterances → feature extraction → data projection → classifiers → positive/negative).
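
The pipeline in Figure 2 can be summarized in code. Below is a minimal sketch assuming scikit-learn, with PCA standing in for the unspecified data-projection step and logistic regression as a placeholder classifier; the feature matrix and labels are random placeholders.

```python
# A minimal sketch of the Figure 2 pipeline, assuming scikit-learn.
# PCA stands in for the unspecified "data projection" step and logistic
# regression is a placeholder classifier; X and y are random placeholders
# for per-utterance feature vectors and positive/negative labels.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))    # one row of features per word-level utterance
y = rng.integers(0, 2, size=60)  # 0 = negative, 1 = positive

pipeline = Pipeline([
    ("scale", StandardScaler()),        # normalize feature ranges
    ("project", PCA(n_components=10)),  # data projection (assumed PCA)
    ("classify", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X, y)
print(pipeline.score(X, y))
```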
14
Hierarchical Classifiers
Figure 3: The design of the hierarchical binary classifiers.
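
As a rough illustration of the hierarchical binary design, the sketch below assumes the tree first separates positive from negative emotion and then discriminates delight vs. flow and confusion vs. frustration within each branch; the exact tree in Figure 3 is not reproduced here, and decision trees are placeholders for the actual classifiers.

```python
# A minimal sketch of hierarchical binary classification over the four
# emotion categories in this deck. The exact tree in Figure 3 is not
# reproduced, so the split below (positive vs. negative first, then one
# binary classifier per branch) is an assumption; decision trees are
# placeholders for the actual classifiers.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

POSITIVE = {"delight", "flow"}

class HierarchicalEmotionClassifier:
    def __init__(self):
        self.root = DecisionTreeClassifier()  # positive vs. negative
        self.pos = DecisionTreeClassifier()   # delight vs. flow
        self.neg = DecisionTreeClassifier()   # confusion vs. frustration

    def fit(self, X, labels):
        X, labels = np.asarray(X), np.asarray(labels)
        is_pos = np.isin(labels, list(POSITIVE))
        self.root.fit(X, is_pos)
        self.pos.fit(X[is_pos], labels[is_pos])
        self.neg.fit(X[~is_pos], labels[~is_pos])
        return self

    def predict(self, X):
        X = np.asarray(X)
        is_pos = self.root.predict(X).astype(bool)
        out = np.empty(len(X), dtype=object)
        if is_pos.any():
            out[is_pos] = self.pos.predict(X[is_pos])
        if (~is_pos).any():
            out[~is_pos] = self.neg.predict(X[~is_pos])
        return out
```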
15
Emotion Models using Lexical Information
  • Pitch: minimum, maximum, mean, standard deviation, absolute value, quantile, ratio between voiced and unvoiced frames.
  • Duration: etime, eheight.
  • Intensity: minimum, maximum, mean, standard deviation, quantile.
  • Formant: first formant, second formant, third formant, fourth formant, fifth formant, second formant / first formant, third formant / first formant.
  • Rhythm: speaking rate.
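
The pitch and intensity statistics above can be computed directly from frame-level contours. Below is a minimal sketch, assuming F0 and intensity contours have already been extracted (e.g., with a tool such as Praat) and unvoiced frames carry F0 = 0; the function and feature names are illustrative.

```python
# A minimal sketch of the per-utterance pitch and intensity statistics
# listed above, assuming frame-level F0 and intensity contours were
# already extracted (e.g., with a tool such as Praat) and unvoiced frames
# carry F0 = 0. Names and the choice of median as the "quantile" feature
# are illustrative assumptions.
import numpy as np

def pitch_intensity_features(f0, intensity):
    """f0, intensity: 1-D arrays of frame-level values for one utterance."""
    f0 = np.asarray(f0, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    voiced = f0 > 0
    f0_v = f0[voiced]
    return {
        "pitch_min": f0_v.min(),
        "pitch_max": f0_v.max(),
        "pitch_mean": f0_v.mean(),
        "pitch_std": f0_v.std(),
        "pitch_median": np.quantile(f0_v, 0.5),
        "voiced_unvoiced_ratio": voiced.sum() / max((~voiced).sum(), 1),
        "intensity_min": intensity.min(),
        "intensity_max": intensity.max(),
        "intensity_mean": intensity.mean(),
        "intensity_std": intensity.std(),
        "intensity_median": np.quantile(intensity, 0.5),
    }
```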

16
Duration Features
Figure 4: Measures of F0 for computing the parameters (etime, eheight), which correspond to the rising and lowering of intonation.
Inclusion of height and time accounts for possible low or high pitch accents.
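
One hedged reading of Figure 4 is that etime spans the interval between the F0 extremes of a word and eheight is the F0 excursion across them, so that together they capture rising vs. falling intonation. The sketch below implements that reading; the paper's exact definitions may differ.

```python
# A minimal sketch of the duration parameters, under the assumed reading
# of Figure 4 that etime is the time between the F0 extremes of a word
# and eheight the F0 excursion across them; the paper's exact definitions
# may differ. A positive etime indicates a rise, a negative one a fall.
import numpy as np

def duration_features(f0, frame_period=0.01):
    """f0: frame-level F0 contour of one word (unvoiced frames = 0)."""
    f0 = np.asarray(f0, dtype=float)
    voiced = np.flatnonzero(f0 > 0)
    lo = voiced[np.argmin(f0[voiced])]   # frame of the F0 minimum
    hi = voiced[np.argmax(f0[voiced])]   # frame of the F0 maximum
    etime = (hi - lo) * frame_period     # signed duration of the excursion
    eheight = f0[hi] - f0[lo]            # size of the excursion in Hz
    return etime, eheight
```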
17
Types of Classifiers
  • Classifiers evaluated included Bagging, J48, NNge, Naïve Bayes, Logistic, AdaBoostM1, RandomForest, and PART.
18
Shortcomings of Lee and Narayanan's work (2004)
19
Results
20
Summary of Results
21
21 classifiers on positive and negative emotions.
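
A comparison across many classifiers can be set up as below: a minimal sketch using scikit-learn stand-ins for a few of the Weka models named in the deck, evaluated with cross-validation on placeholder data.

```python
# A minimal sketch of comparing several candidate classifiers on the
# positive/negative task with cross-validation. These are scikit-learn
# stand-ins for a few of the Weka models named in the deck, run on
# placeholder data; the original results table is not reproduced.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))    # placeholder utterance features
y = rng.integers(0, 2, size=60)  # 0 = negative, 1 = positive

models = {
    "Logistic": LogisticRegression(max_iter=1000),
    "NaiveBayes": GaussianNB(),
    "Bagging": BaggingClassifier(),
    "AdaBoostM1": AdaBoostClassifier(),
    "RandomForest": RandomForestClassifier(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.2f} +/- {scores.std():.2f}")
```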
22
Limitations and Future Work
  • Algorithm
    • Feature selection.
    • Discourse information.
    • Future efforts will include fusion of video and audio data in a signal-level framework.
  • Database
    • Clipping arbitrary words from a conversation may be ineffective in various cases.
    • May need to look at words in sequence.

23
More examples
24
  • M. E. Hoque, M. Yeasin, and M. M. Louwerse, "Robust Recognition of Emotion from Speech," 6th International Conference on Intelligent Virtual Agents, Marina del Rey, CA, August 2006.
  • M. E. Hoque, M. Yeasin, and M. M. Louwerse, "Robust Recognition of Emotion in e-Learning Environment," 18th Annual Student Research Forum, Memphis, TN, April 2006. (2nd Best Poster Award)

25
Acknowledgments
  • This research was partially supported by grant
    NSF-IIS-0416128 awarded to the third author. Any
    opinions, findings, and conclusions or
    recommendations expressed in this material are
    those of the authors and do not necessarily
    reflect the views of the funding institution.

26
Questions?
27
Robust Recognition of Emotion from Speech
  • Mohammed E. Hoque
  • Mohammed Yeasin
  • Max Louwerse
  • {mhoque, myeasin, mlouwerse}@memphis.edu
  • Institute for Intelligent Systems
  • University of Memphis

28
References
  • [1] C. Lee and S. Narayanan, "Toward detecting emotions in spoken dialogs," IEEE Transactions on Speech and Audio Processing, vol. 13, 2005.
  • [2] B. Kort, R. Reilly, and R. W. Picard, "An Affective Model of Interplay Between Emotions and Learning: Reengineering Educational Pedagogy - Building a Learning Companion," in Proceedings of the International Conference on Advanced Learning Technologies (ICALT 2001), Madison, Wisconsin, August 2001.