Multidocument summarization by people and machines

1
Multi-document summarization by people and machines
  • Ani Nenkova
  • Department of Computer and Information Science
  • University of Pennsylvania

2
Why is summarization important?
3
Summarizing online news
[Figure: keywords Pan Am, bombing, Libya, suspects, Gadhafi, trial, UK and USA; example headline: "Libya refuses to surrender two Pan Am bombing suspects"]
4
(No Transcript)
5
People like summaries!
  • User study on a report writing task
  • 4 report topics
  • 12 subjects with each interface
  • Interface
  • Newsblaster
  • One line summary (Google news)
  • No summary
  • Main findings: multi-document summaries help
  • Higher user satisfaction
  • Better reports

6
Some references
  • Ani Nenkova, Becky Passonneau, Kathy McKeown
  • The pyramid method: incorporating human
    content selection variation in summarization
    evaluation
  • ACM Transactions on Speech and Language
    Processing, volume 4, issue 2, 2007
  • Surabhi Gupta, Ani Nenkova and Dan Jurafsky
  • Measuring Importance and Query Relevance in
    Topic-focused Multi-document Summarization
  • ACL 2007 (short paper)
  • Ani Nenkova, Lucy Vanderwende, Kathy McKeown
  • A compositional context-sensitive multi-document
    summarizer
  • ACM SIGIR 2006
  • Kathy McKeown, Rebecca Passonneau, David Elson,
    Ani Nenkova, Julia Hirschberg
  • Do Summaries Help? A Task-Based Evaluation of
    Multi-Document Summarization
  • ACM SIGIR 2005
  • McKeown, Barzilay, Evans, Hatzivassiloglou,
    Klavans, Nenkova, Sable, Schiffman, Sigelman
  • Tracking and Summarizing News on a Daily Basis
    with Columbia's Newsblaster
  • HLT 2002

7
A problem: human choice variation
  • S1: Pinochet arrested in London on Oct 16 at a
    Spanish judge's request for atrocities against
    Spaniards in Chile.
  • S2: Former Chilean dictator Augusto Pinochet has
    been arrested in London at the request of the
    Spanish government.
  • S3: Britain caused international controversy and
    Chilean turmoil by arresting former Chilean
    dictator Pinochet in London.

8
Why is variation a problem?
  • Makes a precise task definition impossible
  • Different people produce different summaries
  • The same person at different times produces a
    different summary
  • How can an automatic system perform well?
  • Evaluation of system output
  • Comparison with a human model
  • Switching the model leads to a different score

9
Human variation: content words
  • Summaries differ in vocabulary
  • Differences cannot be explained by paraphrase
  • Translation data: 7 translations, 20 documents
  • Summarization data: 7 summaries, 20 document sets
  • Faster vocabulary growth in summarization

10
Content units: better study of variation
  • Semantic units
  • Emerge from the analysis of several texts
  • Link different surface realizations with the same
    meaning

11
Content unit example
  • S1: Pinochet arrested in London on Oct 16 at a
    Spanish judge's request for atrocities against
    Spaniards in Chile.
  • S2: Former Chilean dictator Augusto Pinochet has
    been arrested in London at the request of the
    Spanish government.
  • S3: Britain caused international controversy and
    Chilean turmoil by arresting former Chilean
    dictator Pinochet in London.

12
SCU: label, weight, contributors
  • Label: London was where Pinochet was arrested
  • Weight = 3
  • S1: Pinochet arrested in London on Oct 16 at a
    Spanish judge's request for atrocities against
    Spaniards in Chile.
  • S2: Former Chilean dictator Augusto Pinochet has
    been arrested in London at the request of the
    Spanish government.
  • S3: Britain caused international controversy and
    Chilean turmoil by arresting former Chilean
    dictator Pinochet in London.
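
One way to picture the SCU above is as a small record holding a label and the contributing spans, with the weight derived as the number of summaries that express the unit. This is only an illustrative sketch: the class name `SCU`, its fields, and the exact contributor spans chosen from S1 to S3 are assumptions, not part of the original annotation format.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SCU:
    """A summary content unit (hypothetical representation)."""
    label: str
    # Maps a summary id (e.g. "S1") to the span in that summary
    # that expresses this unit.
    contributors: Dict[str, str] = field(default_factory=dict)

    @property
    def weight(self) -> int:
        # Weight = number of distinct summaries contributing to the unit.
        return len(self.contributors)


scu = SCU(
    label="London was where Pinochet was arrested",
    contributors={
        "S1": "Pinochet arrested in London",
        "S2": "Pinochet has been arrested in London",
        "S3": "arresting former Chilean dictator Pinochet in London",
    },
)
print(scu.label, scu.weight)
```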

13
Annotated corpora
  • Document Understanding Conference
  • Run by NIST
  • Main forum for summarization research
  • Annual evaluations on common datasets
  • DUC 2006, DUC 2007
  • DUC 2005: 20 sets of 7 human summaries
  • DUC 2004: 50 sets of 4 human summaries
  • DUC 2004: 20 sets, both input and 4 summaries annotated

14
Importance class distribution
  • Few units are expressed by everyone
  • Many units are expressed by only one person
  • The distributions of words and content units are
    very similar
15
Content pyramids
  • The most important content is in the top tier
  • Good content is somewhere in the pyramid
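
As a rough illustration of how tiers could be formed, the sketch below groups content units by weight so that the highest weight becomes the top tier. The function name `build_pyramid` is hypothetical, and the weights are toy values counted by hand from the three Pinochet sentences S1 to S3, not figures from an annotated pyramid.

```python
from collections import defaultdict


def build_pyramid(scu_weights):
    """Group SCU labels into tiers keyed by weight, top tier first."""
    tiers = defaultdict(list)
    for label, weight in scu_weights.items():
        tiers[weight].append(label)
    # Sort tiers so the highest weight (most agreed-upon content) comes first.
    return dict(sorted(tiers.items(), reverse=True))


# Toy weights: how many of S1-S3 express each unit (hand-counted assumption).
weights = {
    "Pinochet arrested": 3,
    "Arrest in London": 3,
    "On Spanish warrant": 2,
    "Pinochet is a former Chilean dictator": 2,
    "Chile protests": 1,
    "Accused of atrocities against Spaniards": 1,
}

pyramid = build_pyramid(weights)
for weight, labels in pyramid.items():
    print(weight, labels)
```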

16
Ideally informative summary
  • Does not include an SCU from a lower tier unless
    all SCUs from higher tiers are included as well

22
Different equally good summaries
  • Pinochet arrested
  • Arrest in London
  • Pinochet is a former Chilean dictator
  • Accused of atrocities against Spaniards

23
Different equally good summaries
  • Pinochet arrested
  • Arrest in London
  • On Spanish warrant
  • Chile protests

24
Diagnostic: why is a summary bad?
  • Good
  • Less relevant summary

25
Importance of content
  • Can observe distribution in human summaries
  • Assign relative importance
  • Empirical rather than subjective
  • The more people agree, the more important

26
Pyramid score for evaluation
  • New summary with n content units
  • Estimates the percentage of information that is
    maximally important
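
The formula itself did not survive in this transcript. A minimal sketch of the usual formulation, assuming the score is the total weight of the SCUs a summary expresses divided by the best total achievable with the same number of SCUs, might look like this. The function name and the toy weights are assumptions for illustration.

```python
def pyramid_score(summary_scu_weights, pyramid_weights):
    """Ratio of observed SCU weight to the ideal weight for n SCUs.

    summary_scu_weights: weights of the SCUs found in the new summary
        (0 for a unit that appears in no human summary).
    pyramid_weights: weights of all SCUs in the pyramid.
    """
    n = len(summary_scu_weights)
    if n == 0:
        return 0.0
    observed = sum(summary_scu_weights)
    # An ideally informative summary with n units takes the n heaviest SCUs.
    ideal = sum(sorted(pyramid_weights, reverse=True)[:n])
    return observed / ideal if ideal else 0.0


# Toy pyramid: two SCUs of weight 3, two of weight 2, two of weight 1.
toy_pyramid = [3, 3, 2, 2, 1, 1]
score_a = pyramid_score([3, 3, 2, 1], toy_pyramid)
score_b = pyramid_score([3, 1, 0], toy_pyramid)
print(score_a, score_b)
```

Under these toy weights, two summaries that pick different units of the same weights receive the same score, which matches the idea that different summaries can be equally good.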

27
Characteristics of human summarization
  • Zipfian distribution of content units
  • Non-deterministic process
  • Can this process of content selection be modeled?
  • Current automatic methods are completely
    deterministic
  • Would automatic summarizers become better if they
    were based on a cognitively plausible model?

28
How traditional summarizers work
  • Extract representative sentences

[Diagram: Input text 1, Input text 2, Input text 3 → Summary]
29
Frequency as feature
  • Suggested in earliest research
  • Never used alone in current systems
  • Large scale test collections not available till
    recently
  • Presentation outline
  • Data analysis
  • human summaries
  • Automatic summarizer
  • Considerations after selecting a feature

30
Do people include frequent content in their
summaries?
  • Yes: both for words and content units
  • 30 test sets (10 docs each)
  • 4 human summaries (100 words each)

31
Do people agree on including frequent content?
  • Yes
  • Very frequent words in the input tend to be those
    that many people include in a summary

33
Content unit frequency
  • Frequent content units appear in human summaries
  • People agree on the inclusion of content units
    frequent in the input
  • 0.64 correlation coefficient between weight in
    the input and weight in pyramid

34
Summarizer approach and features
  • Compositional
  • Content words are basic blocks of meaning
  • Assign importance to them
  • Choose a composition function
  • Assign weight to sentence
  • Context sensitive
  • Relative importance changes after each selection
  • Update weights
  • Each is validated

35
C2S2 algorithm
  • Step 1: Estimate word weights (probabilities)
  • Step 2: Estimate sentence weights
  • Step 3: Choose best sentence
  • Step 4: Update word weights
  • Step 5: Go to 2 if desired length not reached
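
The five steps can be sketched as a greedy loop. This is not the authors' implementation: it assumes whitespace tokenization with no stop-word filtering, averages word probabilities as the default composition function, and sets the weight of every selected word to 0 as the update (the option named on the context-sensitivity slide). The names `summarize`, `average`, and `max_words` are hypothetical.

```python
from collections import Counter


def average(values):
    return sum(values) / len(values)


def summarize(sentences, max_words, compose=average):
    """Greedy frequency-based sentence selection (illustrative sketch)."""
    tokenized = [s.lower().split() for s in sentences]
    counts = Counter(w for toks in tokenized for w in toks)
    total = sum(counts.values())
    # Step 1: word weights are unigram probabilities from the input.
    weight = {w: c / total for w, c in counts.items()}

    chosen, length = [], 0
    remaining = [i for i, toks in enumerate(tokenized) if toks]
    # Step 5: repeat until the desired length is reached.
    while remaining and length < max_words:
        # Steps 2 and 3: score every remaining sentence, take the best.
        best = max(
            remaining,
            key=lambda i: compose([weight[w] for w in tokenized[i]]),
        )
        chosen.append(best)
        remaining.remove(best)
        length += len(tokenized[best])
        # Step 4: update weights so already-covered words stop counting.
        for w in set(tokenized[best]):
            weight[w] = 0.0
    return [sentences[i] for i in chosen]


sentences = [
    "Pinochet was arrested in London",
    "Pinochet was arrested in London on Friday",
    "Chile protested the arrest",
]
summary = summarize(sentences, max_words=9)
print(summary)
```

In this toy input the second sentence nearly repeats the first, so after the update its score drops and the sentence about Chile is chosen instead.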

36
Using words as basic blocks
  • What humans do
  • Include frequently repeated content in their
    summaries
  • Agree on frequently repeated content
  • Summary log-likelihood H(umans) and S(ystems)
  • Parameters estimated from the input
  • Multinomial model

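[Figure: human (H) and system (S) summaries ordered by log-likelihood, with endpoints labeled HIGH (-198) and LOW (-227): HSSSSSSSSSSHSSSHSSHHSHHHHH]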
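
For readers who want to see what a summary log-likelihood under a multinomial model could look like, here is a small sketch. Word probabilities are estimated from the input, as the slide states. The additive smoothing (`alpha`) is an assumption added here so that summary words absent from the input do not get zero probability; the slides do not say how such words were handled.

```python
import math
from collections import Counter


def summary_log_likelihood(input_words, summary_words, alpha=1.0):
    """Log-likelihood of a summary under a multinomial over input words."""
    input_counts = Counter(input_words)
    summary_counts = Counter(summary_words)
    vocab = set(input_counts) | set(summary_counts)
    # Smoothed probability denominator (alpha is an illustrative assumption).
    denom = sum(input_counts.values()) + alpha * len(vocab)

    n = sum(summary_counts.values())
    # Log of the multinomial coefficient N! / (n_1! * ... * n_k!).
    log_l = math.lgamma(n + 1) - sum(
        math.lgamma(c + 1) for c in summary_counts.values()
    )
    for word, count in summary_counts.items():
        p = (input_counts[word] + alpha) / denom
        log_l += count * math.log(p)
    return log_l


toy_input = "pinochet arrested london pinochet arrested chile protested".split()
ll_frequent = summary_log_likelihood(toy_input, ["pinochet", "arrested"])
ll_rare = summary_log_likelihood(toy_input, ["chile", "protested"])
print(ll_frequent, ll_rare)
```

A summary built from words that are frequent in the input gets a higher log-likelihood than one of the same length built from rare words.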
37
Frequency in related work
  • Frequency is an often used feature
  • But claims that frequency used alone does not
    give good results
  • Why?
  • The composition function matters

38
Composition function CF
  • Choose a composition function CF
  • CF = Product
  • CF = Sum
  • CF = Average
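
A small sketch of the three composition functions, applied to the word probabilities of a sentence, shows why the choice matters: product penalizes every additional word, sum rewards length, and average normalizes for it. The function names and the toy probabilities are assumptions for illustration.

```python
import math


def cf_product(probs):
    # Product of word probabilities: shrinks with every extra word.
    return math.prod(probs)


def cf_sum(probs):
    # Sum of word probabilities: grows with every extra word.
    return sum(probs)


def cf_average(probs):
    # Average word probability: independent of sentence length.
    return sum(probs) / len(probs)


short_sentence = [0.2, 0.1]
long_sentence = [0.2, 0.1, 0.1, 0.05]

for cf in (cf_product, cf_sum, cf_average):
    print(cf.__name__, cf(short_sentence), cf(long_sentence))
```

With these toy values, product and average prefer the short sentence while sum prefers the long one, which is one way the number of selected sentences can vary across composition functions.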

39
Evaluation results: 50 summaries
  • The choice of composition function has a big
    impact
  • Good (sum, average) to very bad (product)
    content selection
  • The number of sentences varies considerably

40
Comparison with other systems
  • 2004 Document Understanding Conference
  • Only one system significantly better
  • Out of 16 participants
  • 2004 Multi-lingual summarization task
  • Only one system significantly better
  • Out of 10 participants
  • One of the best in avoiding repetition
  • 0.6 content units per summary vs. 3.4/1.4 for the
    (second) best system

41
Need to update: context sensitivity
  • Importance is not static
  • Pinochet was arrested in London. Chile protested
    the arrest, which was on a Spanish arrest warrant.
  • There are repetitive sentences in the input
  • Update the word weight by setting it to 0
  • Related to earlier work (MMR)
  • But we conclusively demonstrate the usefulness of
    the approach

  • Significant increase in repetition without
    update
42
What have we learned?
  • Frequency is a powerful feature
  • We now have lots of data to test feature utility
  • Composition function is very important
  • Some frequency-based summarizers can be close to
    the baseline
  • Normalizations are important to report
  • Context adjustment considerably changes
    performance
  • Details not always clear for many systems