Empaceptor The empathy interceptor - PowerPoint PPT Presentation

About This Presentation
Title:

Empaceptor The empathy interceptor

Description:

Empaceptor The empathy interceptor Developed by Elias Holman for MAS630 Final Project 5/11/04 – PowerPoint PPT presentation

Slides: 21
Provided by: mitEdu69

Transcript and Presenter's Notes

1
Empaceptor: The empathy interceptor
Developed by Elias Holman for MAS630 Final Project
5/11/04
2
Project Constraints
  • Start from a small amount of built-in knowledge
  • Make it work with tools that users already use,
    with tasks that they already perform
  • Don't require heavyweight processing or
    non-standard hardware (anyone can use it)

3
What is Empaceptor?
  • A pure Java text-processing tool that attempts to
    build up an association between a word and a set
    of configurable emotions.
  • A service which can be invoked either via a Java
    API or standard TCP/IP sockets.

4
What did you build?
[Diagram: Empaceptor architecture showing 1) Incoming mail path
2) Mbox file access 3) Outgoing mail path]
Components: POP Server, Empaceptor Email Service, Empaceptor Server,
mbox file, Empaceptor XML Persistence
Connections: POP over TCP/IP, shell script invocation, return value of
service, SMTP over TCP/IP, standard sockets over TCP/IP, local disk
access, to incoming mail server (TCP/IP)
5
The Server
  • Processes incoming text
  • Learns from text, building an association based
    on co-occurrence
  • Scores text, based on learned associations
  • Persists associations in a simple XML file stored
    in the user's home directory

[Diagram: Empaceptor Server, Empaceptor XML Persistence]
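
A minimal sketch of what persisting associations to an XML file in the home directory could look like. The slides do not show the actual file format, so the element names (`associations`, `word`, `emotion`), the attribute names, and the file name `.empaceptor.xml` are all hypothetical. The original tool is Java; Python is used here only for brevity.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical location; the slides only say "home directory".
DEFAULT_PATH = Path.home() / ".empaceptor.xml"


def save_associations(counts, path=DEFAULT_PATH):
    """Write counts to XML.

    counts maps word -> {"total": int, "emotions": {emotion: int}}.
    """
    root = ET.Element("associations")
    for word, entry in sorted(counts.items()):
        node = ET.SubElement(root, "word", text=word, total=str(entry["total"]))
        for emotion, n in sorted(entry["emotions"].items()):
            ET.SubElement(node, "emotion", name=emotion, count=str(n))
    ET.ElementTree(root).write(str(path), encoding="utf-8", xml_declaration=True)


def load_associations(path=DEFAULT_PATH):
    """Read back the structure written by save_associations."""
    counts = {}
    for node in ET.parse(str(path)).getroot().findall("word"):
        counts[node.get("text")] = {
            "total": int(node.get("total")),
            "emotions": {
                e.get("name"): int(e.get("count")) for e in node.findall("emotion")
            },
        }
    return counts
```

Storing raw occurrence counts rather than precomputed scores keeps the file valid as more text is learned, since a score can always be recomputed from the counts.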
6
How learning works
  • List of emotion words is starting point.
  • Words occurring with emotion words are marked as
    occurring in emotional context.
  • Words occurring without emotion words are simply
    marked as occurring.
  • Learned emotional score of a word, for each emotion,
    is emotional occurrences / all occurrences.
  • WordNet gives broader coverage, so scores are
    really for synsets, not words.
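
The counting scheme above can be sketched as follows. This is an illustration in Python, not the tool's Java code. It assumes that "occurring with" means "in the same sentence" (which matches the worked scoring example later in the deck), uses a naive tokenizer, keeps only three of the base emotions, and counts plain words rather than WordNet synsets.

```python
import re
from collections import defaultdict

# Small subset of the base emotion set, for illustration only.
EMOTIONS = {"happy", "sad", "angry"}


def split_sentences(text):
    # Naive split on ., ! and ?; an assumption, not the tool's real splitter.
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize(sentence):
    return re.findall(r"[a-z']+", sentence.lower())


def learn(text, total, emotional):
    """Update occurrence counts in place.

    total[word] counts every occurrence of word.
    emotional[emotion][word] counts occurrences of word in a sentence
    that also contains that emotion word.
    """
    for sentence in split_sentences(text):
        words = tokenize(sentence)
        present = EMOTIONS.intersection(words)
        for word in words:
            total[word] += 1
            for emotion in present:
                emotional[emotion][word] += 1


def learned_score(word, emotion, total, emotional):
    # emotional occurrences / all occurrences; 0.0 for unseen words.
    seen = total.get(word, 0)
    if seen == 0:
        return 0.0
    return emotional.get(emotion, {}).get(word, 0) / seen
```

With `total = defaultdict(int)` and `emotional = defaultdict(lambda: defaultdict(int))`, calling `learn` on a text and then `learned_score("car", "happy", total, emotional)` gives the fraction of occurrences of "car" that shared a sentence with "happy".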

7
The Base Emotion Set
  • pleased
  • proud
  • regretful
  • relaxed
  • relieved
  • restrained
  • sad
  • satisfied
  • scornful
  • serene
  • shameful
  • sorrowful
  • surprised
  • suspicious
  • sympathetic
  • tender
  • tense
  • tranquil
  • alarmed
  • alert
  • angry
  • annoyed
  • anxious
  • astonished
  • bold
  • bored
  • calm
  • cautious
  • compassionate
  • concerned
  • confident
  • curious
  • delighted
  • depressed
  • disappointed
  • disgusted
  • distressed
  • fearful
  • frustrated
  • generous
  • gloomy
  • grateful
  • guilty
  • blissful
  • happy
  • haughty
  • helpless
  • hopeful
  • humiliated
  • indifferent
  • inferior
  • interested
  • joyful
  • lonely
  • miserable
  • mirthful

8
Scoring a piece of text
  • Each individual word is scored as in learning
    (although occurrences are not counted as in
    learning).
  • Score of text is sum of scores of words divided
    by number of words.

9
Scoring a piece of text
I am happy about my new car.  I put it in my
garage.
Word    Happy weight    Word     Happy weight
I       1/2 (0.5)       car      1/1 (1.0)
am      1/1 (1.0)       put      0/1 (0)
happy   1/1 (1.0)       it       0/1 (0)
about   1/1 (1.0)       in       0/1 (0)
my      1/2 (0.5)       garage   0/1 (0)
new     1/1 (1.0)
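
As a worked check of the averaging rule, the sketch below scores the example text using the "happy" weights from the table above. It assumes every token counts toward the word total, including repeated words such as "I" and "my", and that unknown words score 0. The slides do not state either detail, and the tokenizer is a simplification.

```python
import re

# "Happy" weights copied from the table above.
HAPPY_WEIGHTS = {
    "i": 0.5, "am": 1.0, "happy": 1.0, "about": 1.0, "my": 0.5,
    "new": 1.0, "car": 1.0, "put": 0.0, "it": 0.0, "in": 0.0, "garage": 0.0,
}


def score_text(text, weights):
    """Sum of per-word scores divided by the number of words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(weights.get(w, 0.0) for w in words) / len(words)


score = score_text("I am happy about my new car.  I put it in my garage.", HAPPY_WEIGHTS)
print(round(score, 3))  # 7.0 / 13 words -> 0.538
```

Under these assumptions the 13 tokens sum to 7.0, so the text's "happy" score is about 0.54.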
10
Mbox import
[Diagram: Empaceptor Server, local disk access, mbox file]
  • Uses GNU open-source tools to read in mbox file.
  • Strips off header and sends content to
    Empaceptor.
  • Messages are used for learning, but are not
    scored themselves.
  • Easy way to load large corpus of training data
    (just point it at your Inbox)
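
A rough equivalent of the import step, for illustration. The original uses GNU open-source tools from Java; this sketch swaps in Python's standard `mailbox` module instead. The function name `message_bodies` is hypothetical.

```python
import mailbox


def message_bodies(mbox_path):
    """Yield the text body of each message in an mbox file, headers stripped."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            if message.is_multipart():
                # Keep only the plain-text parts of multipart messages.
                parts = [
                    p for p in message.walk()
                    if p.get_content_type() == "text/plain"
                ]
            else:
                parts = [message]
            for part in parts:
                payload = part.get_payload(decode=True)  # bytes, or None
                if not payload:
                    continue
                charset = part.get_content_charset() or "utf-8"
                try:
                    text = payload.decode(charset, errors="replace")
                except LookupError:  # unknown charset name in the header
                    text = payload.decode("utf-8", errors="replace")
                yield text
    finally:
        box.close()
```

Each yielded body would be passed to the learning step only, since imported messages are used for training and are not scored.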

11
Incoming Mail Path
  • Email program must be set up to invoke an arbitrary
    shell process as a filter (supported by Evolution
    and others).
  • Email program pipes the message and the desired scoring
    emotion to the Java Empaceptor client process.
  • Java Empaceptor client sends it over TCP/IP to the
    Empaceptor server for scoring only, not for
    learning.
  • Empaceptor server scores the message and compares the
    score with a configurable threshold. If above threshold,
    returns yes, otherwise no.
  • Empaceptor client process exits with the
    corresponding integer value: 1 for yes, 0 for
    no, 15 for error (couldn't connect to server,
    etc.).
  • Email program can use this information to take
    arbitrary action (e.g. set color of message).
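
The client side of this path might look like the sketch below, written in Python rather than the original Java. Only the exit codes (1, 0, 15) come from the slide. The wire format (emotion on the first line, message after it, a reply of "yes" or "no"), the function name `classify`, and the default port are hypothetical.

```python
import socket

EXIT_NO, EXIT_YES, EXIT_ERROR = 0, 1, 15  # exit codes named on the slide


def classify(message, emotion, host="localhost", port=12224, timeout=5.0):
    """Send text to a scoring server and map its reply to an exit code.

    The request/reply format and the default port are assumptions made
    for this sketch; the slides do not document the real protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(f"{emotion}\n{message}".encode("utf-8"))
            conn.shutdown(socket.SHUT_WR)  # signal end of request
            reply = conn.recv(16).decode("utf-8", errors="replace").strip().lower()
    except OSError:
        # Covers connection refused, timeouts, and other socket failures.
        return EXIT_ERROR
    if reply == "yes":
        return EXIT_YES
    if reply == "no":
        return EXIT_NO
    return EXIT_ERROR
```

A filter script would read the message from standard input and finish with `sys.exit(classify(text, emotion))`, so the mail program only has to inspect the process exit status.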

12
Incoming Mail Path
[Diagram: POP Server, POP over TCP/IP, shell script invocation,
Empaceptor Email Service, return value of service, standard sockets
over TCP/IP, Empaceptor Server]
13
SMTP Server
  • Modified version of jes (Java Email Server) SMTP
    server.
  • Sends outgoing messages to Empaceptor for
    scoring and learning before delivery.
  • Runs on a non-standard port (12223) so it can be run
    alongside standard SMTP or sendmail.

[Diagram: SMTP over TCP/IP, Empaceptor Server, to incoming mail
server (TCP/IP)]
14
The EmpaceptorGUI
  • Not much to look at, just a set of tasks to
    perform
  • Open and learn from mbox file
  • Start/Stop server to handle email client
    requests
  • Start/Stop SMTP server
  • Alter set of emotions to process
  • Set threshold for email client requests to
    return yes value.
  • Test box for entering messages to see their
    resulting scores as a table.

Let's see a demo
15
Usage test
  • Ran email through Empaceptor for about five days
    (tweaking along the way)
  • Set up email client (Evolution) to look for happy
    emails, and tag as purple if found.
  • Configured to use SMTP for outgoing mail
  • Primed system with Sent mail folder's mbox file
    (about 440 messages)

16
Usage test - Issues
  • Email service was too slow. Made scoring faster,
    and made learning asynchronous for quick response
    time.
  • Fiddled with threshold; 0.1 seems best.
  • Bugs, as always

17
Usage test - Lessons Learned
  • System is easily fooled by idiomatic usage of
    emotional base words
      • "Happy hour"
      • "I'm happy to take care of that"
  • 440 messages is not a big enough training set to
    start.
  • Performed reasonably well in my opinion, but with lots
    of false positives. Open question: what is a good
    performance metric?

18
Related Work
  • Liu et al.'s affective email processing work
      • Uses common-sense reasoning
      • Specialized email client to do processing
  • Eudora mail Mood Watch
      • Just keyword spotting, only for strong language
        and offensive content
  • This approach is somewhere in between

19
Future Work
  • The ability to look at the last 10, 100, or 1000
    messages to see trends.
  • More sophisticated text processing.
  • Take advantage of recency.
  • Get it out into the world!

20
Thanks
The open-source/free-software community
Professor Picard for feedback and support