Summarizing Threads of Email Conversations: Using QA Pairs Detection to Improve Extractive Summaries

1
Summarizing Threads of Email Conversations: Using
QA Pairs Detection to Improve Extractive Summaries
  • Lokesh Shrestha

2
Reasons for Summarizing Email
  • Email has become a primary means of business and
    personal communication.
  • Conversations take place and decisions are made
    entirely through email.
  • Given the high volume of email each individual
    accumulates, how can we efficiently retrieve
    information from our email archives?

3
Summarizing Email vs. Summarizing Newswire
  • Email has interactive structure
  • Email can have informal language
  • Email does not have different, independent
    documents about same topic (not multi-document
    summarization)

4
Contributions
  • Email-specific features can be used for machine
    learning based extractive summarization of email
    threads
  • A novel approach to question-answer pair
    detection
  • Integration of QA pair sentences with extractive
    sentences improves summaries.

5
Overview
  • Related Work
  • Corpus
  • Approach 1: Sentence Extraction
  • Approach 2: Question-Answer Pairs Detection
  • Approach 3: Integration
  • Outlook Email Client
  • Conclusion

6
Related Work
  • Summarizing individual emails
      • Derek Lam, Steven L. Rohall, Chris Schmandt, and
        Mia K. Stern. 2002: sentence extraction
      • Smaranda Muresan, Evelyne Tzoukermann, and Judith
        Klavans. 2001: key phrase extraction
  • Summarizing discussion lists
      • Ani Nenkova and Amit Bagga. 2003: sentence
        extraction
      • Paula Newman and John Blitzer. 2003: thread topic
        clustering and sentence extraction
  • Summarizing speech dialogues
      • Klaus Zechner. 2002: sentence extraction and QA
        pairs

7
Overview
  • Related Work
  • Corpus
  • Approach 1: Sentence Extraction
  • Approach 2: Question-Answer Pairs Detection
  • Approach 3: Integration
  • Outlook Email Client
  • Conclusion

8
Corpus
  • Columbia ACM chapter executive board mailing list
  • Approximately 10 regular participants
  • 300 Threads, 1000 Messages
  • Threads include scheduling and planning of
    meetings and events, question and answer, general
    discussion and chat.
  • Annotated by human annotators
      • Hand-written summary
      • Categorization of threads and messages
      • Highlighting important information (such as
        question-answer pairs)

9
Sample Hand-Written Summary for Thread
  • Annotator 1 Summary: Alexander McCaughly asks the
    group if he can reschedule his C-session for
    Wednesday night. Raju Gupta tells McCaughly that
    he is able to reschedule his C-session. Reema
    Ramachandran reminds McCaughly that he scheduled
    an MS Office Session for November 14, and she
    asks McCaughly to confirm that he can be at that
    session.

10
Overview
  • Related Work
  • Corpus
  • Approach 1: Sentence Extraction
  • Approach 2: Question-Answer Pairs Detection
  • Approach 3: Integration
  • Outlook Email Client
  • Conclusion

11
Sentence Extraction
  • Machine learning approach to extractive
    summarization of email threads
  • Creating Training Data
  • Learn extractive rules
  • Use rules to generate summary

12
Sentence Extraction: Creating Training Data
  • Using human generated summaries to create a model
    extractive summary
  • Compare thread sentences with human summary
    sentences using SimFinder
  • Given a summary size, select highly ranked
    sentences
  • Represent each sentence with a vector of features
    and the class
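
The labeling step above can be sketched roughly as follows. This is only an illustration: SimFinder is a separate tool and is not reproduced here, so a plain bag-of-words cosine similarity stands in for it. The function names and the `summary_fraction` parameter are hypothetical, and at least one summary sentence is assumed.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Lowercased word counts; a crude stand-in for SimFinder's features."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def label_sentences(thread_sentences, summary_sentences, summary_fraction):
    """Score each thread sentence by its best match in the human summary,
    then mark the top `summary_fraction` of sentences as "Y", the rest "N"."""
    summary_vecs = [bag_of_words(s) for s in summary_sentences]
    scores = [max(cosine(bag_of_words(t), v) for v in summary_vecs)
              for t in thread_sentences]
    keep = max(1, round(summary_fraction * len(thread_sentences)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    positives = set(ranked[:keep])
    return [("Y" if i in positives else "N", scores[i])
            for i in range(len(thread_sentences))]
```

Each (label, score) pair would then be combined with the sentence's feature vector to form one training example.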

13
SimFinder in Action
  • Thread sentences
      • Guys, I can't come tonight.
      • Can I reschedule my C session for Wednesday
        night, 11/8, at 8:00?
      • If that's cool with you guys, please reserve me a
        room.
      • Sure we can, but that's the day after Election
        Day.
      • Are you sure you want to do it then?
      • alex, a reminder that your scheduled to do an
        MSOffice session on Nov. 14, at 7pm in 252Mudd.
      • --please confirm that you can do that
        session/posters
      • Confirmed. Intro to MS Office, then there will be
        three more where we'll work on the individual
        programs for full sessions
  • Human summary sentences
      • Alexander McCaughly asks the group if he can
        reschedule his C-session for Wednesday night.
      • Raju Gupta tells McCaughly that he is able to
        reschedule his C-session.
      • Reema Ramachandran reminds McCaughly that he
        scheduled an MS Office Session for November 14,
        and she asks McCaughly to confirm that he can be
        at that session.

14-19
SimFinder in Action
  • The thread sentences and human summary sentences
    from slide 13 are repeated on each of these slides,
    with one SimFinder score shown per slide:
    0.0038, 0.0028, 0.0028, 0.0028, 0.983, 0.563

20
SimFinder in Action
  • Thread sentences with their SimFinder scores
      • 0.0038  Guys, I can't come tonight.
      • 0.983   Can I reschedule my C session for Wednesday
        night, 11/8, at 8:00?
      • 0.0038  If that's cool with you guys, please reserve
        me a room.
      • 0.0038  Sure we can, but that's the day after
        Election Day.
      • 0.0038  Are you sure you want to do it then?
      • 0.752   dan, a reminder that your scheduled to do an
        MSOffice session on Nov. 14, at 7pm in 252Mudd.
      • 0.221   --please confirm that you can do that
        session/posters
      • 0.368   Confirmed. Intro to MS Office, then there
        will be three more where we'll work on the
        individual programs for full sessions
  • Human summary sentences
      • Daniel Kestin asks the group if he can reschedule
        his C-session for Wednesday night.
      • Janak Parekh tells Medina that he is able to
        reschedule his C-session.
      • Christy Lauridsen reminds Medina that he
        scheduled an MS Office Session for November 14,
        and she asks Kestin to confirm that he can be at
        that session.

21
Determining Summary Size
  • Determine the summary size the human summarizers
    used
  • Create gold-standard data manually
      • Select about 10 of the ACM threads as
        gold-standard threads
      • Manually classify sentences in gold-standard
        threads
          • positive if content reflected in human summary
          • negative otherwise
  • Compare SimFinder-derived classifications at
    various summary sizes with gold-standard
    classifications

22
Determining Summary Size
  • Results
      • Use a summary size of 45
      • Verifies the use of SimFinder

Summary size  20     30     40     45     50     55     60
Recall        0.268  0.500  0.625  0.768  0.803  0.821  0.857
Precision     0.750  0.824  0.833  0.827  0.803  0.780  0.750
F-score       0.394  0.622  0.714  0.796  0.803  0.80   0.80
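
The F-score row in the table above appears to be the usual harmonic mean of precision and recall. A small sketch that recomputes it from the precision and recall rows (the function name is illustrative):

```python
def f_score(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) per summary size, copied from the table above.
results = {
    20: (0.750, 0.268), 30: (0.824, 0.500), 40: (0.833, 0.625),
    45: (0.827, 0.768), 50: (0.803, 0.803), 55: (0.780, 0.821),
    60: (0.750, 0.857),
}

for size, (p, r) in results.items():
    print(size, round(f_score(p, r), 3))
```
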
23
Result: Sentences Marked as in Summary (Y) / not in
Summary (N)
  • Thread sentences
      • N  Guys, I can't come tonight.
      • Y  Can I reschedule my C session for Wednesday
        night, 11/8, at 8:00?
      • N  If that's cool with you guys, please reserve me
        a room.
      • N  Sure we can, but that's the day after Election
        Day.
      • N  Are you sure you want to do it then?
      • Y  alex, a reminder that your scheduled to do an
        MSOffice session on Nov. 14, at 7pm in 252Mudd.
      • N  --please confirm that you can do that
        session/posters
      • Y  Confirmed. Intro to MS Office, then there will
        be three more where we'll work on the individual
        programs for full sessions
  • Human summary sentences
      • Alexander McCaughly asks the group if he can
        reschedule his C-session for Wednesday night.
      • Raju Gupta tells McCaughly that he is able to
        reschedule his C-session.
      • Reema Ramachandran reminds McCaughly that he
        scheduled an MS Office Session for November 14,
        and she asks McCaughly to confirm that he can be
        at that session.

24
Sentence Features: Thread as a Document
  • Length: number of words in sentence
  • TF-IDF scores: highest, sum, and mean
  • Centroid similarity
  • Subject similarity
  • Relative position in thread
  • Is question?
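
One possible way to compute the features listed above is sketched here. The names, the `idf` dictionary (assumed to be computed elsewhere over a corpus of threads), and the question-mark test for "Is question?" are all assumptions made for illustration; the slides do not give the exact definitions.

```python
import math
import re
from collections import Counter

def tokens(text):
    return re.findall(r"[a-z0-9']+", text.lower())

def tfidf_vector(text, idf):
    """Term frequency times IDF; unseen terms get a default IDF of 1.0."""
    return {w: c * idf.get(w, 1.0) for w, c in Counter(tokens(text)).items()}

def cosine(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def sentence_features(sentences, subject, idf):
    """One feature dict per sentence, treating the thread as one document."""
    vectors = [tfidf_vector(s, idf) for s in sentences]
    centroid = Counter()
    for vec in vectors:  # centroid = mean of the sentence vectors
        for w, v in vec.items():
            centroid[w] += v / len(vectors)
    subject_vec = tfidf_vector(subject, idf)
    rows = []
    for i, (sent, vec) in enumerate(zip(sentences, vectors)):
        weights = list(vec.values()) or [0.0]
        rows.append({
            "length": len(tokens(sent)),
            "tfidf_max": max(weights),
            "tfidf_sum": sum(weights),
            "tfidf_mean": sum(weights) / len(weights),
            "centroid_sim": cosine(vec, centroid),
            "subject_sim": cosine(vec, subject_vec),
            "rel_pos": i / (len(sentences) - 1) if len(sentences) > 1 else 0.0,
            "is_question": sent.rstrip().endswith("?"),  # crude proxy only
        })
    return rows
```
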

25
Sentence Features: Email-Specific Features
  • Number of responses to the email
  • Number of recipients of email
  • Has sender names: does the sentence contain the
    name of the senders of messages in the thread?
  • Email contains forwarded message?
  • Features derived from quoted material
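
A rough sketch of how the first four email-specific features might be derived from thread metadata. The `Message` class, its field names, and the "forwarded message" marker string are hypothetical; features derived from quoted material are not specified in the slides and are left out.

```python
from dataclasses import dataclass

@dataclass
class Message:
    msg_id: str
    sender: str             # display name, e.g. "Alex McCaughly"
    recipients: list
    body: str
    parent_id: str = None   # id of the message this one replies to, if any

def email_features(sentence, message, thread):
    """Features for one sentence of `message`; `thread` is a list of Message."""
    sender_names = {name.lower() for m in thread for name in m.sender.split()}
    words = {w.strip(".,:;!?").lower() for w in sentence.split()}
    return {
        "num_responses": sum(1 for m in thread if m.parent_id == message.msg_id),
        "num_recipients": len(message.recipients),
        "has_sender_name": bool(words & sender_names),
        # Assumed marker; real clients label forwarded text in various ways.
        "has_forwarded": "forwarded message" in message.body.lower(),
    }
```
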

26
Learn Extractive Rules: Results
  • Using full feature set, 5-fold cross-validation
    with Ripper
  • Baseline scores are obtained with random
    classification

Data Set     Precision  Recall  F1-score  Baseline F1-score
Annotator 1  0.550      0.516   0.532     0.422
Annotator 2  0.514      0.468   0.490     0.392
27
Sample Ruleset: Nice Rules
  1. IF centroid_sim_local ? 0.32 AND thread_line_num
    ? 4 AND isQuestion = 1 AND tfidfavg ? 0.21 AND
    tfidfavg ? 0.30 THEN Y.
  2. IF centroid_sim ? 0.72 AND numOfRecipients ? 8
    THEN Y.
  3. IF centroid_sim_local ? 0.31 AND thread_line_num
    ? 4 AND tfidfmax ? 0.61 AND m_rel_pos ? 0.36 AND
    t_rel_pos ? 0.18 THEN Y.
  4. IF centroid_sim_local ? 0.31 AND centroid_sim ?
    0.76 AND centroid_sim ? 0.79 AND tfidfavg ? 0.19
    THEN Y.
  5. IF subject_sim ? 0.33 AND tfidfsum ? 2.84 AND
    tfidfsum ? 2.64 AND tfidfmax ? 0.68 THEN Y.
  6. ELSE N
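
A ruleset of this shape is an ordered list of conjunctions with a default class. The sketch below shows how such a list could be applied to a sentence's feature vector. The comparison operators did not survive in the slide text (they appear as "?"), so the `>=` and `<=` choices here are illustrative guesses and not the learned rules.

```python
import operator

# Each rule is a conjunction of (feature, comparison, threshold) tests.
# Thresholds are taken from rules 2 and 5 above; the operators are ASSUMED.
RULES = [
    [("centroid_sim", operator.ge, 0.72), ("numOfRecipients", operator.ge, 8)],
    [("subject_sim", operator.ge, 0.33), ("tfidfsum", operator.le, 2.84),
     ("tfidfsum", operator.ge, 2.64), ("tfidfmax", operator.ge, 0.68)],
]

def classify(features, rules=RULES):
    """Return "Y" if every test of some rule holds, else the default "N"."""
    for rule in rules:
        if all(compare(features[name], threshold)
               for name, compare, threshold in rule):
            return "Y"
    return "N"
```
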

28
Automatically Generated Sample Summary
  • Regarding "meeting tonight...", on Oct 30, 2000,
    Alexander Max McCaughly wrote: Can I reschedule
    my C session for Wednesday night, 11/8, at 8:00?
  • Responding to this on Oct 30, 2000, Raju J Gupta
    wrote: Are you sure you want to do it then?
  • Responding to this on Oct 30, 2000, Reema
    Ramachandran wrote: alex, a reminder that your
    scheduled to do an MSOffice session on Nov. 14,
    at 7pm in 252Mudd.
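
The sample summary wraps each extracted sentence in a short attribution phrase. A minimal sketch of such a template follows; the lead-in phrases are taken from the sample summaries in these slides, while the function name, the tuple layout, and the rule for choosing between phrases are assumptions.

```python
def render_summary(subject, extracted):
    """`extracted` is a list of (date, sender, sentence, is_reply) tuples
    in thread order; `is_reply` says whether the message answers the
    previously listed one (assumed to be known from thread structure)."""
    lines = []
    for i, (date, sender, sentence, is_reply) in enumerate(extracted):
        if i == 0:
            lead = f'Regarding "{subject}", on {date}, {sender} wrote:'
        elif is_reply:
            lead = f"Responding to this on {date}, {sender} wrote:"
        else:
            lead = f"In another subthread, on {date}, {sender} wrote:"
        lines.append(f"{lead} {sentence}")
    return "\n".join(lines)
```
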

29
Overview
  • Summarizing Email
  • Corpus Development
  • Approach 1: Sentence Extraction
  • Approach 2: Question-Answer Pairs Detection
  • Approach 3: Integration
  • Outlook Email Client
  • Conclusion

30
The Problem
  • Question-answer exchanges are common in email
  • Multiple questions in one thread, or even in one
    message
  • Multiple, possibly contradictory, answers to a
    single question
  • If a summary contains a question and the answer is
    in the thread, the summary should contain the answer

31
Questions in Email Summaries
  • Complete summary from our rule-based sentence
    extractor
      • Regarding "acm home/bjarney", on Apr 9, 2001,
        Muriel Danslop wrote: Two things: Can someone be
        responsible for the press releases for Stroustrup?
      • Responding to this on Apr 10, 2001, Theresa Feng
        wrote: I think Phil, who is probably a better writer
        than most of us, is writing up something for dang
        and Dave to send out to various ACM chapters.
        Phil, we can just use that as our "press
        release", right?
      • In another subthread, on Apr 12, 2001, Kevin
        Danquoit wrote: Are you sending out upcoming
        events for this week?

32
Approach
  • Same machine learning approach as before:
    supervised rule induction
      • Ripper (Cohen, 96)
  • Same email corpus as before
      • ACM Corpus

33
Detection of Questions
  • Detecting questions is non-trivial
  • Informal use of question mark
      • Question marks are used in cases other than
        questions: to denote uncertainty or to make a
        suggestion.
          • I am on with Monday - perhaps some time in the
            afternoon or evening?
          • I suggest 7pm?
          • If it's better for ppl we could also have
            shorter lunch meetings (mon,tues,thurs)?
      • Question marks are omitted after posing a
        question
          • Who can we get in touch with at your
            organization regarding these services.
  • The work we present here is based on the
    detection of interrogative questions: inverted
    subject-verb order.

34
Detection of Questions
  • Training Corpus - Speech
      • Switchboard corpus annotated with DAMSL tags.
      • 5000 positive examples, 5000 negative examples
      • negative examples - "statement-opinion" and
        "statement-non-opinion".
      • positive examples - "yes-no-question",
        "Wh-question", and "rhetorical-question"
  • Test Corpus - Email
      • manually extracted from the ACM corpus
      • 300 positive examples, 300 negative examples.

35
Detection of Questions
  • Features
  • POS tags for the first five terms
  • POS tags for the last five terms
  • length of the utterance
  • most discriminating POS-bigrams
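
As an illustration of the feature list above, the sketch below builds a feature dictionary from an utterance that has already been POS-tagged. No tagger is invoked here; the padding value, the feature names, and the `bigrams` argument (the pre-selected discriminating POS bigrams) are assumptions.

```python
def question_features(tagged, bigrams, n=5, pad="NONE"):
    """`tagged` is a list of (word, pos_tag) pairs for one utterance;
    `bigrams` is a list of (tag, tag) pairs chosen beforehand."""
    tags = [tag for _, tag in tagged]
    first = (tags[:n] + [pad] * n)[:n]    # POS tags of the first n terms
    last = ([pad] * n + tags[-n:])[-n:]   # POS tags of the last n terms
    present = set(zip(tags, tags[1:]))    # POS bigrams in the utterance
    feats = {f"first_{i}": t for i, t in enumerate(first)}
    feats.update({f"last_{i}": t for i, t in enumerate(last)})
    feats["length"] = len(tags)
    for a, b in bigrams:
        feats[f"bigram_{a}_{b}"] = (a, b) in present
    return feats
```

A modal followed by a pronoun (for example `("MD", "PRP")`, as in "Can I ...") is the kind of bigram that would signal inverted subject-verb order.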

36
Detection of Questions
  • Results
      • Recall 0.56, Precision 0.96, F-measure 0.70
  • Recall low because
      • Questions in ACM corpus start with a declarative
        clause
          • So, if you're available, do you want to come?
          • if you don't mind, could you post this to the
            class bboard?
  • Results without declarative clause
      • Recall 0.72, Precision 0.96, F-measure 0.82

37
Detection of Answers
  • Detection difficult
      • Multiple topics discussed in parallel
      • Threads that begin with a single topic may spin
        off different ones
      • Use of the reply function to answer a question
        asked earlier in the thread
  • We show how various features derived from the
    structure of email threads can improve upon
    lexical similarity between message segments

38
Detection of Answers
  • ACM Corpus
  • Annotators were asked to highlight and link
    question and answer pairs.
      • Annotator 1: 200 threads, 81 QA threads
      • Annotator 2: 138 threads, 62 QA threads
  • Inter-Annotator Agreement (Kappa statistic)
      • Question detection: 0.68
      • Answer detection (given question): 0.81
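
For reference, a small sketch of a kappa computation for two annotators. The slides only say "Kappa statistic", so the use of Cohen's kappa here is an assumption, as are the function name and the label values.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Agreement between two annotators, corrected for chance agreement."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each annotator's label distribution.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)
```
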

39
Detection of Answers
  • Methods
      • Use human annotated data to generate training
        data
      • Textual unit: use message segments rather than
        individual sentences to reduce the lexical gap
        between questions and candidate answers
      • Learn a classifier that predicts whether a
        segment following a question segment answers it
      • Represent each question and candidate answer
        segment by a feature vector

40
Detection of Answers
  • Features Used
      • Standard: word counts, word overlap (cosine,
        Euclidean)
      • Based on thread structure
          • is candidate answer the first
          • number of emails between the question and the
            answer segments
          • the number of emails in the thread before the
            question segment
      • Based on other candidate answer segments
          • is candidate the most similar
          • relative position of the candidate among other
            candidates
          • number of other candidates
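
The feature groups above could be computed along these lines. This is a sketch under stated assumptions: segments are given as (email_index, text) pairs with zero-based email positions in the thread, similarity uses raw word counts, and all names are hypothetical.

```python
import math
import re
from collections import Counter

def vec(text):
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def euclidean(a, b):
    return math.sqrt(sum((a[w] - b[w]) ** 2 for w in set(a) | set(b)))

def answer_features(question, candidates):
    """question: (email_index, text); candidates: later (email_index, text)
    segments in thread order. Returns one feature dict per candidate."""
    q_index, q_text = question
    q_vec = vec(q_text)
    sims = [cosine(q_vec, vec(text)) for _, text in candidates]
    best = max(range(len(sims)), key=sims.__getitem__) if sims else None
    rows = []
    for pos, (c_index, c_text) in enumerate(candidates):
        rows.append({
            "cosine": sims[pos],
            "euclidean": euclidean(q_vec, vec(c_text)),
            "is_first_candidate": pos == 0,
            "emails_between": c_index - q_index - 1,
            "emails_before_question": q_index,
            "is_most_similar": pos == best,
            "rel_pos": pos / (len(candidates) - 1) if len(candidates) > 1 else 0.0,
            "num_other_candidates": len(candidates) - 1,
        })
    return rows
```
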

41
Detection of Answers
  • Experiments and Results
  • 5-fold cross-validation using Ripper (Cohen, 96)

Data Set   Precision  Recall  F1-score
Union      0.698      0.619   0.656
Union < 2  0.879      0.921   0.899
Union > 2  0.631      0.619   0.625
Composite  0.728      0.732   0.730

44
Overview
  • Summarizing Email
  • Corpus Development
  • Approach 1: Sentence Extraction
  • Approach 2: Question-Answer Pairs Detection
  • Approach 3: Integration
  • Outlook Email Client
  • Conclusion

45
Integrating extractive summaries with QA pairs
Approaches
  • Use QA pairs as features
  • Add corresponding answers to extracted questions
    and corresponding questions to extracted answers
  • Add extractive sentences to QA pairs
  • Use all QA pairs detected as basis for summary
  • Use machine learning technique to identify QA
    pairs to be included in summary

46
Integrating extractive summaries with QA pairs
First Approach
  • Use QA pairs as features
  • Each sentence in the thread is represented by a
    feature vector
  • Relative position of the sentence in email and
    thread
  • TFIDF weights
  • Is question?
  • ...
  • Is answer?

47
Integrating extractive summaries with QA pairs
First Approach
  • Use QA pairs as features
  • Number of rules learned with this augmented set
    of features: 1397
  • Number of rules that include the answer feature:
    54
  • Maximum number of rules that any feature is
    included in: 160

48
Integrating extractive summaries with QA pairs
Second Approach
  • Add corresponding answers to extracted questions
      • Alex -- since you're in OS, what do you think? Do
        you think students will be working on the 15th?
      • I'm in OS, and yeah, I'm pretty sure people will
        be working on the weekend of a week before.
  • Add corresponding questions to extracted answers
      • Sure we can, but that's the day after Election
        Day.
      • Can I reschedule my C session for Wednesday
        night, 11/8, at 8:00?
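
A minimal sketch of this second approach, assuming sentences are identified by their position in the thread and that QA pairs have already been detected. The function and parameter names are hypothetical.

```python
def add_qa_partners(extracted, qa_pairs):
    """extracted: sentence ids chosen by the sentence extractor.
    qa_pairs: (question_id, answer_id) pairs from the QA detector.
    Returns the extracted ids plus the missing partner of any extracted
    question or answer, in thread order."""
    chosen = set(extracted)
    augmented = set(chosen)
    for question_id, answer_id in qa_pairs:
        if question_id in chosen:
            augmented.add(answer_id)
        if answer_id in chosen:
            augmented.add(question_id)
    return sorted(augmented)
```
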

49
Integrating extractive summaries with QA pairs
Third Approach
  • Augment QA pair sentences with extractive
    sentences
  • Automatically detect QA segment pairs in a thread
  • Select the question sentence from each question
    segment
  • Select an answer sentence from each answer
    segment
  • Add extractive sentences if they are not in
    any automatically detected QA segment pair
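
The steps of this third approach can be sketched as follows. How the question sentence and answer sentence are selected within each segment is not detailed in the slides, so the sketch takes them as given; the dictionary keys and function name are hypothetical.

```python
def qa_based_summary(qa_pairs, extracted):
    """qa_pairs: list of dicts with keys "q_segment" and "a_segment"
    (lists of sentence ids) and "q_sentence" and "a_sentence" (the one
    sentence id selected from each segment).
    extracted: sentence ids from the sentence extractor."""
    summary = set()
    covered = set()
    for pair in qa_pairs:
        summary.update((pair["q_sentence"], pair["a_sentence"]))
        covered.update(pair["q_segment"], pair["a_segment"])
    # Extractive sentences are added only if no QA segment contains them.
    summary.update(s for s in extracted if s not in covered)
    return sorted(summary)
```
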

50
Integrating extractive summaries with QA pairs
Third Approach
  • Example Summary: Adding questions
      • Regarding "ACM / CUSFS Film Cosponsorship (fwd)",
        on Wed Aug 16 10:01:56 EDT 2000, Raju J Gupta
        wrote: Are you all around before September?
      • In a subsequent message in the same thread, on
        Thu Aug 17 14:22:11 EDT 2000, Raju J Gupta wrote:
        Well, shall we do this the weekend before classes?
        How about Monday, the labor day before class?
      • Responding to this on Thu Aug 17 20:55:24 EDT
        2000, Justin Liu wrote: I am on with Monday -
        perhaps some time in the afternoon or evening?

51
Integrating extractive summaries with QA pairs
Third Approach
  • Example Summary: Adding answers
      • Regarding "ACM / CUSFS Film Cosponsorship (fwd)",
        on Wed Aug 16 10:01:56 EDT 2000, Raju J Gupta
        wrote: Are you all around before September?
      • Responding to this on Wed Aug 16 12:05:41 EDT
        2000, Manij Ali wrote: however, i will be around
        the following week and i'll be able to make any
        meeting that does not conflict with any
        orientation event
      • In another subthread, on Thu Aug 17 14:22:11 EDT
        2000, Raju J Gupta wrote: Well, shall we do this
        the weekend before classes? How about Monday, the
        labor day before class?
      • Responding to this on Thu Aug 17 20:55:24 EDT
        2000, Justin Liu wrote: I am on with Monday -
        perhaps some time in the afternoon or evening?
      • Responding to this on Fri Aug 18 11:31:25 EDT
        2000, Manij Ali wrote: so only under the condition
        that the time does not conflict with anything that
        i might have been scheduled for will monday
        afternoon be okay.

52
Integrating extractive summaries with QA pairs
Third Approach
  • Example Summary: Adding extractive sentences
      • Regarding "ACM / CUSFS Film Cosponsorship (fwd)",
        on Wed Aug 16 10:01:56 EDT 2000, Raju J Gupta
        wrote: Are you all around before September? You
        guys realize that this means it's time for the 1st
        meeting.
      • Responding to this on Wed Aug 16 12:05:41 EDT
        2000, Manij Ali wrote: however, i will be around
        the following week and i'll be able to make any
        meeting that does not conflict with any
        orientation event. i won't be around next week.
      • In another subthread, on Thu Aug 17 04:01:49 EDT
        2000, Ritu Shetty wrote: I won't be back on campus
        till Sept. 3
      • In another subthread, on Thu Aug 17 09:30:40 EDT
        2000, Daniel Max Kestin wrote: I am back on campus
        on the 27th.
      • Responding to this on Thu Aug 17 14:22:11 EDT
        2000, Raju J Gupta wrote: Well, shall we do this
        the weekend before classes? How about Monday, the
        labor day before class? ... Alex (Markov), when
        you get back from wherever you are it should be
        your responsibility to organize these :)
      • Responding to this on Thu Aug 17 20:55:24 EDT
        2000, Justin Liu wrote: I am on with Monday -
        perhaps some time in the afternoon or evening?
      • Responding to this on Fri Aug 18 11:31:25 EDT
        2000, Manij Ali wrote: so only under the condition
        that the time does not conflict with anything that
        i might have been scheduled for will monday
        afternoon be okay.

53-57
Integrating extractive summaries with QA pairs
Results

Approach                                           Precision  Recall  F-score
Baseline                                           0.55       0.52    0.53
QA features                                        0.591      0.506   0.545
Add answers and questions to extractive sentences  0.561      0.571   0.566
Add extractive sentences to QA pair sentences      0.534      0.617   0.573

58
Overview
  • Summarizing Email
  • Corpus Development
  • Approach 1: Sentence Extraction
  • Approach 2: Question-Answer Pairs Detection
  • Approach 3: Integration
  • Outlook Email Client
  • Conclusion

59
What is SUMUI?
  • User Interface that exposes Natural Language
    Processing functionalities through an email
    client such as MS Outlook.
  • NLP functionalities
      • Summarization of email
      • Categorization of email
      • Summarization of email thread
      • Categorization of email thread
      • Email clustering and topic detection
      • Summarization of mailbox
  • Functionalities in italics are work in progress.

60
Components
61
MS Outlook Client Add-On
62
Conclusion
  • Email-specific features can be used for machine
    learning based extractive summarization of email
    threads.
  • We presented our novel approach to
    question-answer pair detection with high
    accuracy.
  • We showed how integration of QA pair sentences
    with extractive sentences improves summaries.

63
  • Questions?