Title: Learning to Classify Email into Speech Acts

1. Learning to Classify Email into Speech Acts
- William W. Cohen, Vitor R. Carvalho and Tom M. Mitchell
- Presented by Vitor R. Carvalho
- IR Discussion Series, August 12th 2004, CMU
2. Imagine a hypothetical email assistant that can detect speech acts
(1) "Do you have any data with xml-tagged names? I need it ASAP!"
    Urgent Request detected - may take action - request pending
(2) "Sure. I'll put it together by Sunday."
    A Commitment is detected. Should I add this Commitment to your to-do list? Should I send Vitor a reminder on Sunday?
(3) "Here's the tar ball on afs: vitor/names.tar.gz"
    A Delivery of data is detected - pending request cancelled - Delivery is sent - to-do list updated
3. Outline
- Setting the base
- Email speech act Taxonomy
- Data
- Inter-annotator agreement
- Results
- Learnability of email acts
- Different learning algorithms, acts, etc
- Different representations
- Improvements
- Collective/Relational/Iterative classification
4. Related Work
- Email classification for:
  - topic/folder identification
  - spam/non-spam
- Speech-act classification in conversational speech
  - email is a new domain; multiple acts per message
- Winograd's Coordinator (1987): users manually annotated email with intent
  - Extra work for (lazy) users
- Murakoshi et al. (1999): hand-coded rules for identifying speech-act-like labels in Japanese emails
5. Email Acts Taxonomy

From: Benjamin Han
To: Vitor Carvalho
Subject: LTI Student Research Symposium

Hey Vitor,
When exactly is the LTI SRS submission deadline? Also, don't forget to ask Eric about the SRS webpage.
See you,
Ben
- A single email message may contain multiple acts
- An act is described as a verb-noun pair (e.g., propose meeting, request information); not all pairs make sense
- The goal is to describe commonly observed behaviors, rather than all possible speech acts in English
- Also includes non-linguistic usage of email (e.g., delivery of files)
Acts in this message: Request (Information), Remind (Action/Task)
6. A Taxonomy of Email Acts
Verb
Negotiate
Other
Remind
Greet
Conclude
Initiate
Amend
Propose
Request
Deliver
Refuse
Commit
7. A Taxonomy of Email Acts
Noun
Activity
Information
Opinion
Ongoing Activity
Data
Single Event
Meeting Logistics Data
Other Data
Committee
Meeting
Other Short Term Task
<Verb><Noun>
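The <Verb><Noun> pairing above can be sketched as a validated tuple; the label sets mirror the lists on slides 6-7, and the helper name is illustrative, not from the paper:

```python
# Sketch: an email act as a <Verb><Noun> pair from the taxonomy above.
VERBS = {"Remind", "Greet", "Initiate", "Conclude", "Amend", "Propose",
         "Request", "Deliver", "Refuse", "Commit", "Other"}
NOUNS = {"Activity", "Information", "Opinion", "Ongoing Activity", "Data",
         "Single Event", "Meeting Logistics Data", "Other Data",
         "Committee", "Meeting", "Other Short Term Task"}

def make_act(verb, noun):
    """Return a (verb, noun) act, rejecting labels outside the taxonomy."""
    if verb not in VERBS or noun not in NOUNS:
        raise ValueError(f"unknown act: {verb}/{noun}")
    return (verb, noun)
```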
8. A Taxonomy of Email Acts (Noun, continued)
- Future work: integration with task-oriented email clustering
- Will only consider predicting top-level tasks, not the recursive structure
9. Corpora
- 4 different datasets
- From CSpace (management game at GSIA):
  - N01F3 (351 email messages)
  - N02F2 (341 email messages)
  - N03F2 (443 email messages)
- From Project World CALO (simulation game at SRI):
  - Pw_calo (222 email messages)
- 4 to 6 participants in each group
- N03F2 was manually labeled by 2 different annotators (what's the agreement?)
10. Corpora
- Few large, natural email corpora are available
- CSPACE corpus (Kraut & Fussell)
  - Email associated with a semester-long project for GSIA MBA students in 1997
  - 15,000 messages from 277 students in 50 teams (4 to 6 per team)
  - Rich in task negotiation
  - N02F2, N01F3, N03F2: all messages from students in three teams (341, 351, 443 messages)
- SRI's Project World CALO corpus
  - 6 people in an artificial task scenario over four days
  - 222 messages (publicly available), double-labeled
11. Inter-Annotator Agreement
- Kappa Statistic: Kappa = (A - R) / (1 - R)
  - A: probability of agreement in a category
  - R: probability of agreement for 2 annotators labeling at random
  - Kappa ranges from -1 to 1
Email Act Kappa
Deliver 0.75
Commit 0.72
Request 0.81
Amend 0.83
Propose 0.72
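The agreement numbers above are kappa values; a minimal sketch of computing Kappa = (A - R) / (1 - R) from two annotators' label sequences (helper name illustrative):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Observed agreement A, corrected for the agreement R expected
    if both annotators assigned labels at random (by their marginals)."""
    n = len(labels_a)
    a = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    ca, cb = Counter(labels_a), Counter(labels_b)
    r = sum(ca[k] * cb[k] for k in ca) / (n * n)
    return (a - r) / (1 - r)
```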
12. Inter-Annotator Agreement for messages with a single verb
13. Learnability of Email Acts
- Features: unweighted word-frequency counts (BOW)
- 5-fold cross-validation
- (Directive = Request or Propose or Amend)
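The setup above (BOW features, 5-fold cross-validation) can be sketched with stdlib helpers; both function names are illustrative:

```python
from collections import Counter

def bow_features(message):
    """Unweighted word-frequency counts (bag of words)."""
    return Counter(message.lower().split())

def five_fold(items):
    """Yield (train, test) splits for 5-fold cross-validation."""
    k = 5
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, f in enumerate(folds) if j != i for x in f]
        yield train, test
```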
14. Using Different Learners
(Directive Act = Request or Propose or Amend)
15. Learning Requests Only
16. Learning Commissives
(Commissive Act = Delivery or Commitment)
17. Learning Deliveries Only
18. Learning to Recognize Commitments
19. Overview on Entire Corpus

Act (pos/neg)         Metric  Voted Perceptron  AdaBoost  SVM   Decision Trees
Request (450/907)     Error   0.25              0.22      0.23  0.20
                      F1      0.58              0.65      0.64  0.69
Propose (140/1217)    Error   0.11              0.12      0.12  0.10
                      F1      0.19              0.26      0.44  0.13
Deliver (873/484)     Error   0.26              0.28      0.27  0.30
                      F1      0.80              0.78      0.78  0.76
Commit (208/1149)     Error   0.15              0.14      0.17  0.15
                      F1      0.21              0.44      0.47  0.11
Directive (605/752)   Error   0.25              0.23      0.23  0.19
                      F1      0.72              0.73      0.73  0.78
Commissive (993/364)  Error   0.23              0.23      0.24  0.22
                      F1      0.84              0.84      0.83  0.85
20. Multi-class learning algorithm vs. annotator agreement (for messages with a single category)

Annotator 1 x Learner
          Req  Prop  Amd  Cmt  Dlv
Request    27     0    4    3   24
Propose     1     2    0    4    8
Amend       2     0    6    6    4
Commit      1     1    3   11   11
Deliver    17     2    5   12  104

Annotator 1 x Annotator 2
          Req  Prop  Amd  Cmt  Dlv
Request    55     1    0    1    1
Propose     0    11    1    3    0
Amend       0     0   15    1    2
Commit      0     0    0   24    3
Deliver     0     1    0    4  135
21. Most Informative Features (are common words)
[Slide shows the top features for each act: Request, Amend, Propose, Commit, Deliver]
22. Learning: Document Representation
- Variants explored:
  - TFIDF -> TF weighting (don't downweight common words)
  - Bigrams
    - For Commit: "i will", "i agree" in top 5 features
    - For Directive: "do you", "could you", "can you", "please advise" in top 25
  - Count of time expressions
  - Words near a time expression
  - Words near a proper noun or pronoun
  - POS counts
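A minimal sketch of the bigram variant, assuming simple whitespace tokenization (helper name illustrative); phrases like "i will" then surface as single features:

```python
from collections import Counter

def bigram_features(message):
    """Count adjacent word pairs so multi-word cues become features."""
    toks = message.lower().split()
    return Counter(zip(toks, toks[1:]))
```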
23. Baseline classifier: linear-kernel SVM with TFIDF weighting
24. Collective Classification (relational)
25. Collective Classification
- BOW classifier output used as features (7 binary features: req, dlv, amd, prop, etc.)
- MaxEnt learner; training set: N03f2, test set: N01f3
- Features: current msg + parent msg + child message (1st child only)
- Related msgs: messages with a parent and/or child message
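A sketch of assembling the current, parent, and first-child act predictions into one feature vector, as described above; the act list and names are illustrative, not the paper's exact feature set:

```python
def relational_features(msg, parent=None, child=None):
    """Each argument is a dict of act -> 0/1 binary predictions from the
    BOW classifier; missing relatives contribute all-zero features."""
    acts = ["req", "prop", "amd", "cmt", "dlv", "rmd", "grt"]
    feats = {}
    for prefix, preds in [("cur", msg), ("par", parent), ("chd", child)]:
        for a in acts:
            feats[f"{prefix}_{a}"] = (preds or {}).get(a, 0)
    return feats
```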
N01f3 dataset            Metric  Req    Dlv    Cmt    Prop   Amd    Req/Amd/Prop  Dlv/Cmt
Entire dataset (351)     F1      54.61  74.47  34.61  28.98  16.00  68.30         80.97
Entire dataset (351)     Kappa   28.21  34.88  23.94  21.76  13.02  35.00         22.84
Related msgs only (170)  F1      56.92  71.71  38.09  39.21  22.22  75.00         80.47
Related msgs only (170)  Kappa   33.08  32.74  24.02  28.72  17.93  43.70         27.14

-> Relational features are useful for related messages
26. Collective/Iterative Classification

[Figure: message thread over TIME, with per-message posterior probabilities 0.53, 0.65, 0.85, 0.85, 0.95, 0.93]

- Start with baseline (BOW)
- How to make updates?
  - Chronological order
  - Using family heuristics (child first, parent first, etc.)
  - Using posterior probability
- (Maximum Entropy learner)
- (Threshold, ranking, etc.)
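The update loop above can be sketched as follows; `base_predict` and `relational_predict` are hypothetical stand-ins for the baseline BOW and MaxEnt relational models, and only parent links are modeled for brevity:

```python
def iterative_classify(thread, base_predict, relational_predict, rounds=3):
    """Start from baseline predictions, then repeatedly re-predict each
    message using the current label of its parent, until labels stabilize.
    `thread` maps msg id -> (features, parent_id or None)."""
    labels = {m: base_predict(feat) for m, (feat, _) in thread.items()}
    for _ in range(rounds):
        new = {m: relational_predict(feat, labels.get(parent))
               for m, (feat, parent) in thread.items()}
        if new == labels:       # converged
            break
        labels = new
    return labels
```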
27. Iterative Classification: Commitment
28. Iterative Classification: Request
29. Iterative Classification: Dlv/Cmt
30. Conclusions/Summary
- Negotiating/managing shared tasks is a central use of email
- Proposed a taxonomy for email acts; could be useful for tracking commitments, delegations, and pending answers, and for integrating to-do lists and calendars with email
- Inter-annotator agreement: kappa in the 0.70-0.80s
- Learned classifiers can do this with reasonable accuracy (90% precision at 50-60% recall for the top level of the taxonomy)
- Fancy tricks with IE, bigrams, and POS offer modest improvement over the baseline TF-weighted systems
31. Conclusions/Future Work
- Teamwork (collective/iterative classification) seems to help a lot!
- Future work:
  - Integrate all features + best learners + tricks; tune the system
  - Social network analysis