1
Summarization
  • Ryan Davies
  • Laura Emond
  • CSI5386, NLP
  • March 16, 2005

2
Outline
  • Introduction
  • Summary of the articles
  • Demo
  • Microsoft AutoSummarize
  • MEAD

3
Introduction
  • Why summarize?
  • Thousands of articles, too much to read;
    summaries help selectivity
  • To be concise: get just the most crucial
    information
  • Types of summaries: extract, summary, abstract,
    abridgement, précis, digest, highlight, synopsis

4
Articles
  • Our articles are from this journal:
  • Computational Linguistics, Volume 28, Number 4,
    December 2002
  • Articles:
  • Dragomir R. Radev, Eduard Hovy & Kathleen
    McKeown: Introduction to the Special Issue on
    Summarization
  • Simone Teufel & Marc Moens: Summarizing
    Scientific Articles: Experiments with Relevance
    and Rhetorical Status
  • Klaus Zechner: Automatic Summarization of
    Open-Domain Multiparty Dialogues in Diverse
    Genres
  • Horacio Saggion & Guy Lapalme: Generating
    Indicative-Informative Summaries with SumUM

5
Introduction to the Special Issue on Summarization
  • Driver: increase of online information
  • Present the main ideas in less space
  • Could not easily summarize a document in which
    everything was equally important
  • Try to keep it as informative as possible
  • Information content appears in bursts
  • Indicative summary: basically keywords
  • Informative summary: reproduces the content

6
Definition
  • Summary:
  • Produced from 1 or more texts
  • < 50% of the original
  • Conveys the most important information

7
Processes
  • Extraction: identify important material
  • Abstraction: reformulate (novel)
  • Fusion: combine extractions
  • Compression: get rid of peripheral parts
  • Early approaches from IR, now NLP

8
Approaches
  • 1. Single document (extraction)
  • Take sentences from the original document
  • Surface level (signals)
  • Scoring: key phrases, frequency, position (see
    the sketch after this list)
  • Now: ML, NLP for key passages, relations between
    words instead of bags of words
  • Word relatedness, discourse structure
  • Connected words (anaphora, synonyms, shared
    words)
  • Topic / theme
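
A minimal sketch of the surface-level scoring above, assuming a
word-frequency signal plus a position bonus; the sentence splitter and
the equal weighting are illustrative choices, not taken from the papers:

```python
from collections import Counter
import re

def extractive_summary(text, top_n=3):
    """Toy surface-level extractor: score each sentence by its average
    word frequency plus a position bonus, keep the top-n in document order."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'\w+', text.lower()))
    scored = []
    for i, sent in enumerate(sentences):
        tokens = re.findall(r'\w+', sent.lower())
        if not tokens:
            continue
        tf = sum(freq[t] for t in tokens) / len(tokens)  # frequency signal
        pos = 1.0 / (i + 1)                              # earlier = better
        scored.append((tf + pos, i, sent))
    top = sorted(scored, reverse=True)[:top_n]
    return [sent for _, _, sent in sorted(top, key=lambda t: t[1])]
```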

9
Approaches
  • 2. Single document (abstraction)
  • Abstraction encompasses any method that does not
    rely strictly on extraction
  • Information Extraction
  • Compressive Summarization
  • Ontological Abstraction

10
Information Extraction
  • Designer specifies slots to be filled by
    certain information (see the sketch below)
  • e.g. an earthquake description must include date,
    location, severity
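
A minimal sketch of slot filling under the earthquake example; the slot
names and trigger patterns are hypothetical, not from the papers:

```python
import re

# Hypothetical slots and trigger patterns for an earthquake description;
# a real IE system would use far richer extraction rules.
PATTERNS = {
    "date": re.compile(r"\bon ([A-Z]\w+ \d{1,2}, \d{4})"),
    "location": re.compile(r"\bstruck ([A-Z][\w ]+?)[,.]"),
    "severity": re.compile(r"\bmagnitude[- ](\d+\.\d+)"),
}

def fill_slots(text):
    """Fill each slot with the first matching span, or None."""
    filled = {}
    for slot, pattern in PATTERNS.items():
        match = pattern.search(text)
        filled[slot] = match.group(1) if match else None
    return filled

print(fill_slots(
    "A magnitude-6.1 quake struck Northridge, on January 17, 1994."))
# -> {'date': 'January 17, 1994', 'location': 'Northridge', 'severity': '6.1'}
```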

11
Compressive Summarization
  • Borrows from language generation. Two
    approaches (see the sketch below):
  • Words are extracted from the document and
    re-formed using a bigram language model.
  • Sentences are selected, combined, and then reduced
    by dropping the least important fragments
    (similar to how humans cut and paste).
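
A toy illustration of the first approach, assuming a maximum-likelihood
bigram model; real systems search incrementally rather than brute-forcing
permutations as this sketch does:

```python
from collections import Counter
from itertools import permutations

def train_bigram(tokens):
    """Maximum-likelihood bigram probability P(b | a) from a token list."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    unigrams = Counter(tokens)
    return lambda a, b: bigrams[(a, b)] / unigrams[a] if unigrams[a] else 0.0

def reform(words, prob):
    """Order extracted words the way the bigram model likes best
    (brute force, so only sensible for a handful of words)."""
    def fluency(seq):
        score = 1.0
        for a, b in zip(seq, seq[1:]):
            score *= prob(a, b)
        return score
    return max(permutations(words), key=fluency)
```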

12
Ontological Abstraction
  • Leaves plenty of room for innovation.
  • Relies on an external knowledge base to recognize
    new information in the document.
  • Seems to be less NLP and more AI. Not feasible
    until a complete knowledge base is available.

13
Approaches
  • 3. Multiple document summarization
  • Similar to single-document summarization. Fills
    in pre-determined slot values to construct a
    briefing (big picture / whole story / information
    synthesis).
  • Uses: biographies, describing events (e.g. a
    hurricane)
  • Differences are the need to:
  • Avoid redundancy (measure similarity between
    sentence pairs; see the sketch after this list)
  • Identify differences (discourse rules)
  • Ensure summary coherence (time order of articles,
    text order within articles, time stamps)
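
A minimal sketch of the redundancy check, assuming bag-of-words cosine
similarity and an illustrative threshold:

```python
import math
from collections import Counter

def cosine(s1, s2):
    """Cosine similarity between two sentences as bags of words."""
    v1, v2 = Counter(s1.lower().split()), Counter(s2.lower().split())
    dot = sum(v1[w] * v2[w] for w in v1)
    norm = (math.sqrt(sum(c * c for c in v1.values()))
            * math.sqrt(sum(c * c for c in v2.values())))
    return dot / norm if norm else 0.0

def is_redundant(candidate, selected, threshold=0.7):
    """Skip a sentence that is too close to one already in the summary."""
    return any(cosine(candidate, s) > threshold for s in selected)
```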

14
Evaluation
  • Humans agree just 60% of the time on sentence
    content, even for straightforward news articles
  • Better results for short summaries (people
    probably agree on the objective keystone data but
    not the supporting evidence, which is more
    subjective)
  • Metrics:
  • Form (grammar, coherence, organization)
  • Content (compare to human abstracts)
  • Extraneous information → precision
  • Omitted information → recall
  • Tasks: categorization, question answering, ad hoc

15
State of the art
  • Simple sentence extraction is evolving into a
    process of extracting, merging and editing
    phrases
  • Analysis of simple news articles → analysis of
    longer documents (scientific articles, medical
    journals, patents): Teufel & Moens, Saggion &
    Lapalme
  • Analysis of written text → speech: Zechner
  • Novel: system development to automate the
    summarization process (Saggion and Lapalme)

16
(No Transcript)
17
Summarizing Scientific Articles: Experiments with
Relevance and Rhetorical Status
  • Simone Teufel & Marc Moens
  • Contribution: focus on scientific articles
  • Scientific articles are dissimilar from news
    articles, etc.
  • First paragraph not usually a good summary
  • A given sentence may be an important result, or a
    criticism of a previous result

18
Basic Method
  • Authors believe that sentence (or clause)
    extraction is currently still the best basic
    method in most cases
  • Need to distinguish scientifically significant
    information from supporting data

19
Identification Approach
  • To identify candidate sentences, authors suggest
    classification by rhetorical status
  • Rhetorical status: categorization of sentences
    based on their purpose in the document
  • AIM, TEXTUAL, OWN, BACKGROUND, CONTRAST, BASIS,
    OTHER

20
Classification Tools Used
  • Traditional text extraction characteristics
  • Metadiscourse & agentivity
  • Citations & relatedness

21
Traditional Characteristics
  • Feature-based attribution

22
Feature-Based Attribution (1)
  • Features mostly drawn from already-developed
    summarization techniques.
  • Particular features chosen (and weighted)
    informally by intuition and trial-and-error,
    i.e. "it just works".

23
Feature-Based Attribution (2)
  • Some features, evaluated individually, give poor
    results
  • i.e. some are even worse than chance,
  • or kappa < 0
  • (where -1 means always wrong,
  • 1 means always right,
  • and 0 means equivalent to random sentence
    classification; see the sketch below)
  • However, removing any one feature degrades the
    overall performance of the system, even if that
    feature performs poorly individually.
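
For reference, a minimal sketch of the kappa statistic the slide alludes
to (Cohen's kappa, agreement corrected for chance); the annotator labels
below are made up for illustration:

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement between two annotators,
    corrected for the agreement expected by chance."""
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    p_chance = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
                   for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

# Two annotators labelling six sentences with rhetorical categories:
a = ["AIM", "OWN", "OWN", "BASIS", "OWN", "CONTRAST"]
b = ["AIM", "OWN", "BASIS", "BASIS", "OWN", "OWN"]
print(cohens_kappa(a, b))  # 0.5 here; 1 = perfect, 0 = chance level
```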

24
Metadiscourse & Agentivity
  • Metadiscourse: explicit phrases which indicate
    the purpose of the statement ("they argue that",
    "we conclude that", "agree", "suggest")
  • List generated manually, again by intuition and
    by examining the corpus.
  • Agentivity: "we", "they", "our"
  • Superior anaphora resolution techniques would
    help here.

25
Citations & Relatedness
  • Need to recognize the context of citations
  • Negative: critical, contrastive
  • Positive: basis for current work
  • Neutral

26
More Rhetorical Status Indicators
  • Problem structure (problems, solutions, results):
    texts do not always make this clear, but if
    they do, this info can be useful
  • Intellectual attribution ("other researchers
    claim that", "we have discovered that"): uses
    metadiscourse heavily
  • Scientific argumentation (progression of
    rhetorically-coherent statements that convey the
    scientific contribution): related to problem
    structure
  • Attitude toward others' work (rival, flawed,
    contributing): often explicit

27
Relevance (1)
  • The authors also suggest ordering statements by
    relevance
  • Relevance: importance of the statement to the
    meaning of the document

28
Relevance (2)
  • Many methods used
  • Most follow the pattern of matching certain words
    or phrases to a manually-generated list of
    (dis)qualifiers (see the sketch below)
  • E.g. some action phrases disqualify ("argue",
    "intend", "lacks")
  • E.g. some agent phrases qualify or disqualify
    sentences ("we" qualifies, "they" usually
    disqualifies)
  • Negation is also considered
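
A toy filter in the spirit of these lists; the word lists here are
illustrative stand-ins, not the authors' actual (dis)qualifier lists,
and the negation check is deliberately blunt:

```python
# Illustrative stand-ins for the manually-generated lists.
DISQUALIFYING_ACTIONS = {"argue", "intend", "lacks"}
QUALIFYING_AGENTS = {"we", "our"}
DISQUALIFYING_AGENTS = {"they", "their"}

def passes_filter(sentence):
    """Crude (dis)qualifier matching with a blunt negation check."""
    tokens = set(sentence.lower().replace(",", "").split())
    negated = "not" in tokens or "n't" in sentence.lower()
    if tokens & DISQUALIFYING_ACTIONS and not negated:
        return False
    if tokens & DISQUALIFYING_AGENTS:
        return False
    return bool(tokens & QUALIFYING_AGENTS)
```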

29
Summarization Process (1)
  • The set of relevant sentences is considered
  • Humans followed a decision tree for categorizing
    sentences (for the learning stage and the gold
    standard)

30
Summarization Process (2)
  • Sentences in general were found to have the
    following category distribution (top)
  • Relevant sentences were more evenly distributed
    (bottom)

31
Summarization Process (3)
  • End-result summary focus is on AIM, CONTRAST, and
    BASIS sentences.
  • Sentences from other categories considered as
    well.
  • Decision to include sentences based on relevance
    score.

32
Evaluation
  • The authors evaluated the system's two main
    components individually:
  • Categorization
  • Relevance determination

33
Categorization Performance (1)
  • Used F-measure as an indicator of accuracy
    (performance); see the sketch below
  • F = 2PR / (P + R)
  • P = precision, R = recall
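
A one-line sketch of the balanced F-measure, using as an example the
relevance-determination averages quoted later in this talk:

```python
def f_measure(precision, recall):
    """Balanced F: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f_measure(0.79, 0.44))  # ~0.57
```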

34
Categorization Performance (2)
  • Human performance and stability were fairly high
    on this task
  • The performance of the system was below human
    performance, but far higher than the TF-IDF
    text-extraction baseline.

35
Categorization Performance (3)
  • Authors also examined which features had the most
    impact on performance.
  • Some features greatly impacted disambiguation of
    some categories, while performing worse than
    chance for others
  • Regardless, removing any one feature from the
    pool consistently decreased overall performance.

36
Relevance Det. Performance (1)
  • The system performed well when only the three
    categories AIM, CONTRAST, and BASIS are
    considered.
  • Recall: 0.44 average
  • Precision: 0.79 average
  • Once other categories, such as BACKGROUND, are
    considered, recall and precision plummet quickly,
    but remain much higher than the baseline
    performance.

37
Conclusion (1)
  • The authors of this paper made two major
    contributions to this area:
  • They applied rhetorical status information to
    scientific articles, and
  • They chose a set of classification features which
    produces higher performance on scientific
    articles than generic sets.

38
Conclusion (2)
  • Areas of future improvement identified in the
    article:
  • Automatic gathering of metadiscourse features
  • Using a more sophisticated statistical classifier
  • Improving anaphora resolution for the agent
    feature

39
(No Transcript)
40
Generating Indicative-Informative Summaries with
SumUM
  • Horacio Saggion & Guy Lapalme (Sheffield,
    U. de Montréal)
  • Software that produces indicative-informative
    summaries of technical documents
  • Indicative: identifies topics
  • Informative: elaborates on these topics
    (qualitative and quantitative)
  • Purpose: give the reader an exact and concise
    idea of what is in the source
  • Dynamic summarization
  • Shallow syntactic and semantic analysis
  • Concept identification
  • Text regeneration

41
Steps
  • Interpret the text
  • Extract relevant information (topics)
  • Condense and construct summary
  • Present summary in NL
  • Coherent selection and expression not solved yet

42
Selective Analysis
  • Selective analysis: process of conceptual
    identification and text re-generation
  • Imitates how humans write abstracts.
  • Indicative selection: indicative terms are
    used to find concepts, definitions, etc.
  • Informative selection: looks for an informative
    marker and matches a pattern.
  • Indicative generation: put in conceptual order,
    merge, put in one paragraph.
  • Informative generation: provides more information
    to the reader by filling in more templates.

43
Template
  • Shows what SumUM looks for to construct the
    indicative abstract

44
Pattern Matching
  • 174 patterns exist

45
Methodology
  • Corpus: 100 document-abstract pairs from
    computer and information science journals
  • Studied how professional abstracts are written
  • Mapped the location of abstract sentences to the
    document manually (using photocopies)
  • Interpreted sentences with 334 FSTs (finite-state
    transducers) representing linguistic and
    domain-specific patterns

46
Conceptual Information in Technical Documents
  • Some generic (author, research institution,
    date)
  • Some discipline-specific (algorithms in CS,
    treatment in medicine)
  • Identified and classified with thesauri:
  • 55 concepts (research activity, article)
  • 39 relations (studying, reporting, thinking)
  • 52 types of information (background,
    elaboration)
  • Extract the information → sort → edit (language
    understanding and production: deduction,
    generalization, paraphrase)
  • Remove peripheral linguistics ("it appears that" →
    "apparently"), concatenate, truncate, delete
    phrases, etc.
  • Impersonalize (syntactic verb transformation)

47
Conceptual Information
48
Transformations in human generated summaries
  • Domain verbs (40%), noun editing (38%),
    merge/split (38%), complex reformulation (23%),
    no changes (11%)
  • 70% from intro, conclusion, title, captions
  • 89% of sentences were edited

49
SumUM Architecture
  • Steps
  • Text segmentation
  • P.O.S. tagging
  • Partial syntactic and semantic analysis
  • Sentence classification
  • Template instantiation
  • Content selection
  • Text regeneration
  • Topic elaboration

Source: Saggion, Horacio.
50
Evaluation
  • Compared SumUM to n-STEIN and Microsoft
    AutoSummarize
  • Based on content and quality
  • Co-selection of sentences to include was low
    (37%) between humans, so it's very subjective; no
    ideal exists
  • Evaluation can be extrinsic or intrinsic
  • Extrinsic: how well it helps to perform a task
  • Intrinsic: comparison with the source document;
    how many main ideas does it capture?
  • Parsing:
  • Recall: the number of correct syntactic
    constructions identified by the algorithm
    compared to the number existing overall
  • Precision: the ratio of the number of correct
    syntactic constructions to the total number of
    constructions

51
Future
  • Anaphora resolution
  • Lexical cohesion (elaboration on topics)
  • Local discourse analysis (coherence)
  • SumUM does not demonstrate intelligent behavior
    (question answering, paraphrase, anaphora
    resolution)
  • Currently ignores enumerations
  • Currently overlooks paragraph structure

52
(No Transcript)
53
Automatic Summarization of Open-Domain Multiparty
Dialogues in Diverse Genres
54
Automatic Summarization of Open-Domain Multiparty
Dialogues in Diverse Genres
  • Additional challenges exist due to informality.
  • The following are addressed:
  • coping with speech disfluencies
  • identifying the units for extraction
  • maintaining cross-speaker coherence

55
Automatic Summarization of Open-Domain Multiparty
Dialogues in Diverse Genres
  • Some issues not addressed:
  • Topic segmentation
  • Anaphora resolution
  • Discourse structure detection
  • Speech recognition error compensation
  • Prosodic information integration

56
Process
  • Pre-process the speech into a transcript:
  • Disfluency detection
  • Sentence-boundary detection
  • Distributed-information tagging
  • Then text-based summarization approaches are
    applied

57
Disfluency Detection
  • Disfluencies decrease the readability and
    conciseness of summaries
  • Goal is to tag the disfluencies so that the final
    summarization step can ignore them (see the
    sketch below)
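
A toy tagger for the simplest cases (fillers and immediate repetitions);
the paper's actual pipeline uses a POS tagger, a repetition filter, and a
decision tree for false starts, and the filler list here is illustrative:

```python
FILLERS = {"um", "uh", "er", "like"}  # single-token fillers only; phrases
                                      # like "you know" would need
                                      # phrase matching

def tag_disfluencies(tokens):
    """Mark fillers and immediate repetitions so a later
    summarization step can skip them."""
    tagged, prev = [], None
    for tok in tokens:
        low = tok.lower()
        if low in FILLERS or low == prev:   # filler, or "the the"
            tagged.append((tok, "DISFLUENT"))
        else:
            tagged.append((tok, "OK"))
        prev = low
    return tagged

print(tag_disfluencies("um I I think that's uh right".split()))
```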

58
Sentence Boundary Detection
  • Most speech recognition processors split the
    transcript into segments based on pauses in the
    flow of sound
  • Some spoken sentences contain pauses
    internally,
  • while several sentences can run together
    without a pause between them
  • Goal is to recognize both cases

59
Distributed Information
  • Most common case: question-answer pairs.
  • Goal is to mark such groups as inseparable to the
    final-stage summarizer.

60
Methodology
  • Corpus:
  • 8 recorded dialogues (including news shows,
    which are formal dialogue, and recordings of
    phone calls and meetings, which are informal
    dialogue), transcribed and tagged
  • Used the Penn TreeBank for training
  • The corpus was then annotated

61
Annotations
  • Annotation done by humans: nucleus ideas and
    satellite ideas marked independently, then
    collaboratively
  • Gold standard: topic boundaries determined by
    the majority of annotators
  • Disfluencies were also annotated: non-lexical
    ("um", "uh") and lexical ("like", "you know"),
    repetitions (insertion, substitution,
    repetition), interruptions
  • Detected with 3 things: POS tagger, false-start
    detector (abandoned, incomplete clauses; 10% of
    speech) and repetition filter
  • Questions were annotated according to type: Wh-
    or yes/no
  • Back-channel ("is that right?", "Really?") and
    rhetorical questions not included

62
Tokenization
  • Steps (see the sketch below):
  • Removal of noises (human and nonhuman)
  • Expand contractions
  • Eliminate case and punctuation information
  • Removal of stop words (using SMART, and a list of
    closed-class words)
  • Truncation (stemming)
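
A minimal sketch of this pipeline; the stop list, contraction table, and
suffix stripper below are toy stand-ins for the SMART list and a real
stemmer:

```python
import re

STOP_WORDS = {"the", "a", "is", "to", "and", "of", "at"}  # stand-in list
CONTRACTIONS = {"don't": "do not", "it's": "it is", "we're": "we are"}

def stem(word, suffixes=("ing", "ed", "s")):
    """Crude suffix stripping standing in for a real stemmer."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def tokenize(utterance):
    text = utterance.lower()
    for contraction, full in CONTRACTIONS.items():  # expand contractions
        text = text.replace(contraction, full)
    text = re.sub(r"[^\w\s]", "", text)   # drop punctuation; case already gone
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [stem(t) for t in tokens]      # truncation (stemming)

print(tokenize("We're looking at the tagged transcripts."))
# -> ['we', 'are', 'look', 'tagg', 'transcript']
```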

63
Tagging speech
  • Components:
  • POS tagger (Brill, rule-based)
  • Repetition finder
  • Decision tree (false starts)
  • Sentence tagging:
  • Complete
  • Noncomplete
  • Turn tagging:
  • Performance very good for sentence boundaries at
    the ends of turns
  • Question-answer pairs:
  • Identifying these significantly improves the
    fluency of the summary
  • More informative
  • More coherent

64
Decision Making
  • Weighting:
  • Compute vectors for each sentence and for the
    topical segment, and compare; those with the most
    similarity get promoted.
  • Uses the Maximum Marginal Relevance algorithm to
    find highly weighted sentences while preventing
    redundancy: each pick is maximally similar to the
    segment and maximally dissimilar to the other
    sentences already included in the summary (see
    the sketch after this list).
  • Emphasis factors:
  • Lead emphasis (for material at the beginning)
  • Q-A emphasis
  • False-start de-emphasis
  • Speaker emphasis
  • Q-A linking:
  • Ensures that the answer immediately follows the
    question
  • Can produce clean or direct-from-transcript
    summaries
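
A minimal sketch of the greedy selection described above, assuming some
similarity function `sim` (e.g. the cosine measure sketched earlier) and
an illustrative trade-off weight `lam`:

```python
def mmr_select(candidates, segment, sim, k=3, lam=0.7):
    """Greedily pick k sentences: each pick is similar to the topical
    segment (weight lam) and dissimilar to what is already selected
    (weight 1 - lam)."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        def mmr(s):
            redundancy = max((sim(s, t) for t in selected), default=0.0)
            return lam * sim(s, segment) - (1 - lam) * redundancy
        best = max(pool, key=mmr)
        selected.append(best)
        pool.remove(best)
    return selected
```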

65
Evaluation
  • Components evaluated individually
  • Two baselines:
  • LEAD (first n sentences)
  • MMR (maximum marginal relevance)

66
Eval: Disfluencies & Boundaries
  • Each component individually increased the
    relevance vector (as compared to the gold
    standard) consistently
  • Both components combined also consistently had a
    better performance effect than each one
    individually
  • They had more effect on less-formal corpora
    (10-15% vs. 5%)

67
Eval: Distributed Info
  • This component only had a significant positive
    effect on two of the corpora, each of which
    contained many question-answer pairs.
  • Relevance vectors largely unaffected
  • Summary coherence is significantly improved
    whenever question-answer pairs are present
    (not quantitatively measurable)

68
Conclusion
  • System with all components applied consistently
    performs better than LEAD and MMR baselines
    (except in news corpus)
  • System makes most significant improvement in the
    informal conversation corpora

69
Related Work
  • Restricted-domain dialogue summarization
  • E.g. spoken news
  • Prosody-based emphasis detection
  • The authors' own previous work

70
(No Transcript)
71
Demo of TS Systems
  • Microsoft's AutoSummarize
  • MEAD
  • www.newsinessence.com

72
MEAD Demo
  • An example of statistically motivated text
    summarization. Based on sentence extraction.
  • Written in Perl, uses XML.
  • Uses centroid, position, and overlap with the
    first sentence (often this is the title); see the
    sketch below
  • Does multi-document summarization
  • Computational Linguistics And Information
    Retrieval (CLAIR) at U of Michigan
  • Led by one of the computer scientists who wrote
    the introduction to the Computational Linguistics
    journal issue we studied
  • http://tangra.si.umich.edu/clair/md/demo.cgi
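
A minimal sketch of MEAD's three-feature sentence score; the feature
definitions and equal weights below are simplified stand-ins for MEAD's
actual configuration:

```python
from collections import Counter

def mead_scores(sentences, w_c=1.0, w_p=1.0, w_f=1.0):
    """MEAD-style linear score per sentence from three features:
    centroid value (document-wide word frequency), position in the
    document, and word overlap with the first sentence (often the title)."""
    doc_freq = Counter(w for s in sentences for w in s.lower().split())
    first = set(sentences[0].lower().split())
    n = len(sentences)
    scores = []
    for i, sentence in enumerate(sentences):
        words = sentence.lower().split()
        centroid = sum(doc_freq[w] for w in words) / max(len(words), 1)
        position = (n - i) / n                        # earlier = higher
        overlap = len(first & set(words)) / max(len(first), 1)
        scores.append(w_c * centroid + w_p * position + w_f * overlap)
    return scores
```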

73
Screenshot
http://articles.health.msn.com/id/100101065?GT1=6305
74
NewsInEssence: a deployment of MEAD
  • Interactive multi-source news summarization
  • A system for finding and summarizing clusters of
    related news articles from multiple sources on
    the Web
  • 2001, Dragomir Radev
  • NIE can start from a URL and retrieve documents
    that are similar, or NIE can retrieve documents
    that match a given set of keywords
  • www.newsinessence.com

75
NIE
  • Creating a cluster: keywords or a seed article
    as input
  • Sources: Chicago Sun-Times, Globe and Mail,
    Guardian, International Herald Tribune, Newsday,
    Reuters, San Francisco Chronicle, Seattle
    Post-Intelligencer, The Boston Herald

76
NIE
It picks up a lot of page navigation features
since they are prominently located
77
NIE 10% summary
Maybe it'd do better if it ignored short phrases
("Science", "Travel"), on which it wastes its quota
78
Questions??
  • Thank you!

79
References
  • ACL Anthology: A Digital Archive of Research
    Papers in Computational Linguistics.
    Computational Linguistics, Volume 28, Number 4,
    December 2002. <http://acl.ldc.upenn.edu/J/J02/>
  • CLAIR. News in Essence.
    <http://www.newsinessence.com>
  • NewsInEssence. About.
    <http://lada.si.umich.edu:8080/clair/nie1/docs/about.html>
  • NewsInEssence. Help.
    <http://lada.si.umich.edu:8080/clair/nie1/docs/help.html>
  • Pullum, Geoff. Geoff Pullum's Six Golden Rules
    of giving an academic presentation.
    <http://people.ucsc.edu/~pullum/goldenrules.html>
  • Radev, Dragomir; Hovy, Eduard; McKeown, Kathleen.
    Introduction to the Special Issue on
    Summarization. Computational Linguistics, Volume
    28, Number 4, December 2002.
  • Radev, Dragomir; Blair-Goldensohn, Sasha; Zhang,
    Zhu. Experiments in Single and Multi-Document
    Summarization Using MEAD.
    <http://tangra.si.umich.edu/~radev/papers/duc01.pdf>
  • Radev, Dragomir et al. MEAD Documentation,
    v3.07. Nov. 2002.

80
References (2)
  • Teufel, Simone; Moens, Marc. Summarizing
    Scientific Articles: Experiments with Relevance
    and Rhetorical Status. Computational Linguistics,
    Volume 28, Number 4, December 2002. Pg. 409-445.
  • Text Summarization Project.
    <http://www.site.uottawa.ca/tanka/ts.html>
  • Saggion, Horacio; Lapalme, Guy. Generating
    Indicative-Informative Summaries with SumUM.
    Computational Linguistics, Volume 28, Number 4,
    December 2002. Pg. 497-526.
  • Saggion, Horacio. Génération automatique de
    résumés par analyse sélective.
    <http://www.dcs.shef.ac.uk/~saggion/TheThesis.ps>
  • Saggion, Horacio. SumUM: Summarization at the
    Université de Montréal.
    <http://www.dcs.shef.ac.uk/~saggion/sumumweb01.html>
  • Zechner, Klaus. Automatic Summarization of
    Open-Domain Multiparty Dialogues in Diverse
    Genres. Computational Linguistics, Volume 28,
    Number 4, December 2002. Pg. 447-485.