Title: Empaceptor: The empathy interceptor
1. Empaceptor: The empathy interceptor
Developed by Elias Holman for MAS630 Final Project
5/11/04
2. Project Constraints
- Start from a small amount of built-in knowledge
- Make it work with tools that users already use,
with tasks that they already perform
- Don't require heavyweight processing or
non-standard hardware (anyone can use it)
3. What is Empaceptor?
- A pure Java text-processing tool that attempts to
build up an association between a word and a set
of configurable emotions.
- A service which can be invoked either via a Java
API or standard TCP/IP sockets.
4. What did you build?
[Figure: Empaceptor architecture showing 1) incoming mail path, 2) mbox file access, 3) outgoing mail path.
Components: POP Server, Empaceptor Email Service, Empaceptor Server, mbox file, Empaceptor XML Persistence.
Link labels: POP over TCP/IP, shell script invocation, return value of service, SMTP over TCP/IP, standard sockets over TCP/IP, local disk access, to incoming mail server (TCP/IP).]
5. The Server
- Processes incoming text
- Learns from text, building an association based
on co-occurrence
- Scores text, based on learned associations
- Persists associations in a simple XML file stored
in the user's home directory
[Figure: Empaceptor Server and Empaceptor XML Persistence.]
6. How learning works
- A list of emotion words is the starting point.
- Words occurring with emotion words are marked as
occurring in an emotional context.
- Words occurring without emotion words are simply
marked as occurring.
- The learned emotional score of a word is emotional
occurrences / all occurrences, for each emotion.
- WordNet gives broader coverage, so scores are
really for synsets, not words.
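The counting rule above can be sketched in a few lines of Java. This is a minimal illustration, not the Empaceptor source: the class name `CooccurrenceLearner` and its methods are hypothetical, it handles a single emotion, it assumes that "occurring with" means "in the same sentence", and it keys on plain lowercase words rather than WordNet synsets.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of co-occurrence learning for one emotion. */
class CooccurrenceLearner {
    private final Set<String> emotionWords;                          // seed words for this emotion
    private final Map<String, Integer> emotional = new HashMap<>();  // occurrences in emotional context
    private final Map<String, Integer> total = new HashMap<>();      // all occurrences

    CooccurrenceLearner(Set<String> emotionWords) {
        this.emotionWords = emotionWords;
    }

    /** Learns from a text, treating each sentence as one context (an assumption). */
    void learn(String text) {
        for (String sentence : text.split("[.!?]+")) {
            String[] words = sentence.trim().toLowerCase().split("[^a-z']+");
            // A sentence is "emotional" if it contains at least one seed word.
            boolean hasEmotion = false;
            for (String w : words) {
                if (emotionWords.contains(w)) hasEmotion = true;
            }
            for (String w : words) {
                if (w.isEmpty()) continue;
                total.merge(w, 1, Integer::sum);
                if (hasEmotion) emotional.merge(w, 1, Integer::sum);
            }
        }
    }

    /** Emotional occurrences divided by all occurrences; 0 for unseen words. */
    double score(String word) {
        String key = word.toLowerCase();
        int all = total.getOrDefault(key, 0);
        return all == 0 ? 0.0 : (double) emotional.getOrDefault(key, 0) / all;
    }
}
```

With the seed set `Set.of("happy")`, learning from "I am happy about my new car. I put it in my garage." gives "car" a score of 1/1 and "I" a score of 1/2, because "I" appears once in each sentence and only the first sentence contains a seed word.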
7. The Base Emotion Set
- pleased
- proud
- regretful
- relaxed
- relieved
- restrained
- sad
- satisfied
- scornful
- serene
- shameful
- sorrowful
- surprised
- suspicious
- sympathetic
- tender
- tense
- tranquil
- alarmed
- alert
- angry
- annoyed
- anxious
- astonished
- bold
- bored
- calm
- cautious
- compassionate
- concerned
- confident
- curious
- delighted
- depressed
- disappointed
- disgusted
- distressed
- fearful
- frustrated
- generous
- gloomy
- grateful
- guilty
- blissful
- happy
- haughty
- helpless
- hopeful
- humiliated
- indifferent
- inferior
- interested
- joyful
- lonely
- miserable
- mirthful
8. Scoring a piece of text
- Each individual word is scored as in learning
(although occurrences are not counted, as they are in
learning).
- The score of a text is the sum of the scores of its words divided
by the number of words.
9. Scoring a piece of text
Example text: "I am happy about my new car. I put it in my
garage."

Word     Happy weight    Word     Happy weight
I        1/2 (0.5)       car      1/1 (1.0)
am       1/1 (1.0)       put      0/1 (0.0)
happy    1/1 (1.0)       it       0/1 (0.0)
about    1/1 (1.0)       in       0/1 (0.0)
my       1/2 (0.5)       garage   0/1 (0.0)
new      1/1 (1.0)
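As a worked check of the scoring rule, the sketch below averages the "happy" weights from the table over the example text. The class `TextScorer` and its method are hypothetical names. It assumes that every token counts, including repeated words, and that unknown words score 0; under those assumptions the weights sum to 7 over 13 tokens, so the text scores 7/13, about 0.54.

```java
import java.util.Map;

/** Hypothetical sketch: text score = sum of word scores / number of words. */
class TextScorer {
    /** Averages per-word weights over all tokens; unknown words count as 0 (an assumption). */
    static double scoreText(String text, Map<String, Double> weights) {
        double sum = 0.0;
        int count = 0;
        for (String token : text.toLowerCase().split("[^a-z']+")) {
            if (token.isEmpty()) continue;
            sum += weights.getOrDefault(token, 0.0);
            count++;
        }
        return count == 0 ? 0.0 : sum / count;
    }

    public static void main(String[] args) {
        // "Happy" weights copied from the table above.
        Map<String, Double> happy = Map.ofEntries(
            Map.entry("i", 0.5), Map.entry("am", 1.0), Map.entry("happy", 1.0),
            Map.entry("about", 1.0), Map.entry("my", 0.5), Map.entry("new", 1.0),
            Map.entry("car", 1.0), Map.entry("put", 0.0), Map.entry("it", 0.0),
            Map.entry("in", 0.0), Map.entry("garage", 0.0));
        String text = "I am happy about my new car. I put it in my garage.";
        System.out.println(scoreText(text, happy));
    }
}
```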
10. Mbox import
[Figure: the Empaceptor Server reads the mbox file via local disk access.]
- Uses GNU open-source tools to read in the mbox file.
- Strips off the header and sends the content to
Empaceptor.
- Messages are used for learning, but are not
scored themselves.
- Easy way to load a large corpus of training data
(just point it at your Inbox)
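To illustrate the header-stripping step, here is a small hand-rolled mbox splitter. It is not the GNU tooling the slide refers to, only a simplified stand-in: the class `MboxBodies` is a hypothetical name, every line starting with "From " is treated as a message separator, the headers are taken to end at the first blank line, and MIME parts and ">From " escaping are ignored.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified mbox reader: returns message bodies without headers. */
class MboxBodies {
    static List<String> bodies(String mbox) {
        List<String> result = new ArrayList<>();
        StringBuilder body = null;   // null until the first "From " separator is seen
        boolean inHeaders = false;
        for (String line : mbox.split("\r?\n", -1)) {
            if (line.startsWith("From ")) {
                // A separator line closes the previous message and opens a new one.
                if (body != null) result.add(body.toString().trim());
                body = new StringBuilder();
                inHeaders = true;
            } else if (body == null) {
                continue;            // ignore anything before the first separator
            } else if (inHeaders) {
                if (line.isEmpty()) inHeaders = false;  // blank line ends the header block
            } else {
                body.append(line).append('\n');
            }
        }
        if (body != null) result.add(body.toString().trim());
        return result;
    }
}
```

Each returned body could then be handed to the learner for training only, matching the "used for learning, but not scored" rule above.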
11. Incoming Mail Path
- The email program must be set up to invoke an arbitrary
shell process as a filter (supported by Evolution
and others).
- The email program pipes the message and the desired scoring
emotion to the Java Empaceptor client process.
- The Java Empaceptor client sends them over TCP/IP to the
Empaceptor server for scoring only, not for
learning.
- The Empaceptor server scores the message and compares the score with a
configurable threshold. If it is above the threshold, the server
returns yes, otherwise no.
- The Empaceptor client process exits with the
corresponding integer value: 1 for yes, 0 for
no, 15 for error (couldn't connect to server,
etc.).
- The email program can use this information to take
arbitrary action (set color of message).
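The threshold check and the exit codes can be sketched as follows. The exit values 1, 0 and 15 come from the slide; everything else is an assumption for illustration, including the class name `FilterDecision`, the literal replies "yes" and "no", and the wire format used by `ask` (emotion on the first line, then the message, then a one-line reply).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the client/server decision on the incoming mail path. */
class FilterDecision {
    static final int EXIT_YES = 1;
    static final int EXIT_NO = 0;
    static final int EXIT_ERROR = 15;

    /** Server side: "yes" if the score is above the threshold, otherwise "no". */
    static String reply(double score, double threshold) {
        return score > threshold ? "yes" : "no";
    }

    /** Client side: maps the server's reply to a process exit code; null means failure. */
    static int exitCode(String reply) {
        if (reply == null) return EXIT_ERROR;
        switch (reply.trim().toLowerCase()) {
            case "yes": return EXIT_YES;
            case "no":  return EXIT_NO;
            default:    return EXIT_ERROR;
        }
    }

    /** Sends emotion and message to the server (assumed wire format); returns null on any I/O error. */
    static String ask(String host, int port, String emotion, String message) {
        try (Socket socket = new Socket(host, port);
             PrintWriter out = new PrintWriter(
                 new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8));
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8))) {
            out.println(emotion);
            out.print(message);
            out.flush();
            socket.shutdownOutput();  // signal end of message to the server
            return in.readLine();
        } catch (IOException e) {
            return null;              // e.g. could not connect to the server
        }
    }
}
```

A filter process built on this would end with something like `System.exit(FilterDecision.exitCode(FilterDecision.ask(host, port, emotion, message)))`, so the email program only has to inspect the exit status.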
12. Incoming Mail Path
[Figure: incoming mail path. Components: POP Server, Empaceptor Email Service, Empaceptor Server.
Link labels: POP over TCP/IP, shell script invocation, return value of service, standard sockets over TCP/IP.]
13. SMTP Server
- Modified version of jes (Java Email Server) SMTP
server.
- Sends outgoing messages to Empaceptor for
scoring and learning before delivery.
- Runs on a non-standard port (12223) so it can be run
alongside standard SMTP or sendmail.
[Figure: outgoing mail path. Labels: SMTP over TCP/IP, Empaceptor Server, to incoming mail server (TCP/IP).]
14. The EmpaceptorGUI
- Not much to look at, just a set of tasks to
perform:
- Open and learn from mbox file
- Start/Stop server to handle email client
requests
- Start/Stop SMTP server
- Alter set of emotions to process
- Set threshold for email client requests to
return yes value.
- Test box for entering messages to see their
resulting scores as a table.
Let's see a demo
15. Usage test
- Ran email through Empaceptor for about five days
(tweaking along the way)
- Set up email client (Evolution) to look for happy
emails, and tag as purple if found.
- Configured to use SMTP for outgoing mail
- Primed system with Sent mail folder's mbox file
(about 440 messages)
16. Usage test - Issues
- Email service was too slow. Made scoring faster,
and made learning asynchronous for quick response
time.
- Fiddled with threshold; 0.1 seems best.
- Bugs, as always
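One common way to make learning asynchronous is to answer with the score immediately and queue the learning step on a background thread. The sketch below shows that pattern with a single-threaded executor; it is an assumption about how this could be done, not a description of the actual Empaceptor code, and `AsyncLearningService` is a hypothetical name.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.ToDoubleFunction;

/** Hypothetical sketch: score synchronously, learn in the background. */
class AsyncLearningService {
    // One worker thread keeps learning updates in arrival order.
    private final ExecutorService learnerThread = Executors.newSingleThreadExecutor();
    private final ToDoubleFunction<String> scorer;
    private final Consumer<String> learner;

    AsyncLearningService(ToDoubleFunction<String> scorer, Consumer<String> learner) {
        this.scorer = scorer;
        this.learner = learner;
    }

    /** Returns the score right away; the learning step is queued and runs later. */
    double handle(String message) {
        double score = scorer.applyAsDouble(message);
        learnerThread.submit(() -> learner.accept(message));
        return score;
    }

    /** Stops accepting work and waits for queued learning to finish; false on timeout or interrupt. */
    boolean shutdown(long timeoutMillis) {
        learnerThread.shutdown();
        try {
            return learnerThread.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The trade-off is that a message scored immediately after another may not yet reflect what was learned from the earlier one.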
17. Usage test - Lessons Learned
- System is easily fooled by idiomatic usage of
emotional base words
- "Happy hour"
- "I'm happy to take care of that"
- 440 messages is not a big enough training set to
start.
- Performed reasonably well in my opinion, but lots
of false positives, and also what is a good
performance metric?
18. Related Work
- Liu et al.'s affective email processing work
- Uses common-sense reasoning
- Specialized email client to do processing
- Eudora mail Mood Watch
- Just keyword spotting, only for strong language
and offensive content
- This approach is somewhere in-between
19. Future Work
- The ability to look at the last 10, 100, 1000
messages to see trends.
- More sophisticated text processing.
- Take advantage of recency.
- Get it out into the world!
20. Thanks
The open-source/free-software community
Professor Picard for feedback and support