Title: Robust Recognition of Emotion from Speech
1. Robust Recognition of Emotion from Speech
- Mohammed E. Hoque
- Mohammed Yeasin
- Max M. Louwerse
- {mhoque, myeasin, mlouwerse}@memphis.edu
- Institute for Intelligent Systems
- University of Memphis
2. Presentation Overview
- Motivation
- Methods
- Database
- Results
- Conclusion
3. Motivation
- Animated agents to recognize emotion in an e-Learning environment.
- Agents need to be sensitive and adaptive to learners' emotions.
4. Methods
- Our method is partially motivated by the work of Lee and Narayanan [1], who first introduced the notion of salient words.
5. Shortcomings of Lee and Narayanan's Work
Lee et al. argued that there is a one-to-one correspondence between a word and a positive or negative emotion. This is NOT true in every case.
6. Examples
Figure 1: Pictorial depiction of the word "okay" uttered with different intonations (confusion, flow, normal, delight) to express different emotions.
7. More Examples
Scar!! Scar??
8. More Examples
Two months!!
Two months??
9. Our Hypothesis
- Lexical information extracted from combined prosodic and acoustic features that correspond to the intonation patterns of salient words will yield robust recognition of emotion from speech.
- It also provides a framework for signal-level analysis of speech for emotion.
10. Creation of the Database
11. Details on the Database
- 15 utterances were selected for four emotion categories: confusion/uncertain, delight, flow (confident, encouragement), and frustration [2].
- Utterances were stand-alone, ambiguous expressions in conversations, dependent on the context.
- Examples are "Great", "Yes", "Yeah", "No", "Ok", "Good", "Right", "Really", "What", "God".
12. Details on the Database
- Three graduate students listened to the audio clips.
- They successfully distinguished between the positive and negative emotions 65% of the time.
- No specific instructions were given as to what intonation patterns to listen to.
13. High-Level Diagram
Figure 2: High-level description of the overall emotion recognition process (word-level utterances → feature extraction → data projection → classifiers → positive/negative).
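For concreteness, below is a minimal sketch of this pipeline in Python. The slide does not name a toolkit, so scikit-learn, a PCA projection, and placeholder feature vectors and labels are all assumptions here, not the authors' implementation.

```python
# Minimal sketch of the Figure 2 pipeline. Assumed tooling: NumPy and
# scikit-learn; the actual projection and classifier used in the paper
# are not specified on this slide.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 24))    # one prosodic/acoustic feature vector per word-level utterance
y = rng.integers(0, 2, size=60)  # 1 = positive emotion, 0 = negative emotion (placeholder labels)

pipeline = make_pipeline(
    StandardScaler(),            # normalize features
    PCA(n_components=10),        # "data projection" step
    SVC(kernel="rbf"),           # binary positive/negative classifier
)
print("mean CV accuracy:", cross_val_score(pipeline, X, y, cv=5).mean())
```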
14. Hierarchical Classifiers
Figure 3: The design of the hierarchical binary classifiers.
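Figure 3 itself is not reproduced in this transcript. The sketch below shows one plausible reading of a hierarchy of binary classifiers (a root positive/negative classifier followed by delight-vs-flow and confusion-vs-frustration classifiers); the grouping of the four categories and the choice of scikit-learn SVMs are assumptions.

```python
# Sketch of hierarchical binary classification: a root classifier separates
# positive from negative emotions, then one leaf classifier per branch picks
# the final category. The exact hierarchy in Figure 3 may differ.
import numpy as np
from sklearn.base import clone
from sklearn.svm import SVC

POSITIVE = ("delight", "flow")   # assumed grouping of the four categories

class HierarchicalEmotionClassifier:
    def __init__(self, base=None):
        base = base if base is not None else SVC(kernel="rbf")
        self.root = clone(base)  # positive vs. negative
        self.pos = clone(base)   # delight vs. flow
        self.neg = clone(base)   # confusion vs. frustration

    def fit(self, X, labels):
        # X: NumPy array of feature vectors, labels: array of category names
        labels = np.asarray(labels)
        is_pos = np.isin(labels, POSITIVE)
        self.root.fit(X, is_pos)
        self.pos.fit(X[is_pos], labels[is_pos])
        self.neg.fit(X[~is_pos], labels[~is_pos])
        return self

    def predict(self, X):
        is_pos = self.root.predict(X).astype(bool)
        out = np.empty(len(X), dtype=object)
        if is_pos.any():
            out[is_pos] = self.pos.predict(X[is_pos])
        if (~is_pos).any():
            out[~is_pos] = self.neg.predict(X[~is_pos])
        return out
```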
15. Emotion Models Using Lexical Information
- Pitch: minimum, maximum, mean, standard deviation, absolute value, quantile, ratio between voiced and unvoiced frames.
- Duration: etime, eheight.
- Intensity: minimum, maximum, mean, standard deviation, quantile.
- Formant: first formant, second formant, third formant, fourth formant, fifth formant, second formant / first formant, third formant / first formant.
- Rhythm: speaking rate.
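As an illustration of how the per-utterance statistics above might be computed: the F0 and intensity contours are assumed to come from an external tool such as Praat, the specific quantile is not stated on the slide (the 75th percentile below is a placeholder), and the formant and rhythm features are omitted from the sketch.

```python
# Per-word statistics over precomputed contours. Unvoiced frames are assumed
# to be marked with NaN in f0. Formant ratios and speaking rate are omitted.
import numpy as np

def utterance_features(f0, intensity):
    """f0, intensity: 1-D frame-by-frame arrays for one word-level utterance."""
    voiced = f0[~np.isnan(f0)]
    n_unvoiced = int(np.isnan(f0).sum())
    return {
        # pitch statistics
        "pitch_min": float(voiced.min()),
        "pitch_max": float(voiced.max()),
        "pitch_mean": float(voiced.mean()),
        "pitch_std": float(voiced.std()),
        "pitch_quantile": float(np.percentile(voiced, 75)),  # placeholder quantile
        "voiced_unvoiced_ratio": len(voiced) / max(1, n_unvoiced),
        # intensity statistics
        "intensity_min": float(intensity.min()),
        "intensity_max": float(intensity.max()),
        "intensity_mean": float(intensity.mean()),
        "intensity_std": float(intensity.std()),
        "intensity_quantile": float(np.percentile(intensity, 75)),
    }
```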
16. Duration Features
Figure 4: Measures of F0 for computing the parameters (etime, eheight), which correspond to the rising and lowering of intonation.
Inclusion of height and time accounts for possible low or high pitch accents.
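Since Figure 4 is not reproduced here, the following is only a guess at how (etime, eheight) could be measured: the duration and magnitude of the main F0 rise within the word (a fall would use the mirrored logic).

```python
# Illustrative computation of (etime, eheight) as the duration and height of
# the main F0 rise; this is an assumption about Figure 4, not necessarily the
# authors' exact definition.
import numpy as np

def rise_parameters(f0, frame_period=0.01):
    """f0: 1-D array of voiced F0 values (Hz); frame_period: seconds per frame."""
    i_min = int(np.argmin(f0))                   # start of the rise
    i_max = i_min + int(np.argmax(f0[i_min:]))   # peak following the minimum
    etime = (i_max - i_min) * frame_period       # duration of the rise (s)
    eheight = float(f0[i_max] - f0[i_min])       # size of the rise (Hz)
    return etime, eheight
```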
17. Types of Classifiers
18. Shortcomings of Lee and Narayanan's Work (2004)
19. Results
20. Summary of Results
21. 21 Classifiers on Positive and Negative Emotions
22. Limitations and Future Work
- Algorithm
  - Feature selection
  - Discourse information
  - Future efforts will include fusion of video and audio data in a signal-level framework.
- Database
  - Clipping arbitrary words from a conversation may be ineffective in various cases.
  - May need to look at words in sequence.
23. More Examples
24.
- M. E. Hoque, M. Yeasin, and M. M. Louwerse, "Robust Recognition of Emotion from Speech," 6th International Conference on Intelligent Virtual Agents, Marina Del Rey, CA, August 2006.
- M. E. Hoque, M. Yeasin, and M. M. Louwerse, "Robust Recognition of Emotion in e-Learning Environment," 18th Annual Student Research Forum, Memphis, TN, April 2006. (2nd Best Poster Award)
25. Acknowledgments
- This research was partially supported by grant NSF-IIS-0416128 awarded to the third author. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding institution.
26. Questions?
27. Robust Recognition of Emotion from Speech
- Mohammed E. Hoque
- Mohammed Yeasin
- Max Louwerse
- {mhoque, myeasin, mlouwerse}@memphis.edu
- Institute for Intelligent Systems
- University of Memphis
28. References
[1] C. Lee and S. Narayanan, "Toward detecting emotions in spoken dialogs," IEEE Transactions on Speech and Audio Processing, vol. 13, 2005.
[2] B. Kort, R. Reilly, and R. W. Picard, "An Affective Model of Interplay Between Emotions and Learning: Reengineering Educational Pedagogy - Building a Learning Companion," in Proceedings of the International Conference on Advanced Learning Technologies (ICALT 2001), Madison, Wisconsin, August 2001.