Title: From Extracting to Abstracting
1. From Extracting to Abstracting: Generating Quasi-abstractive Summaries
Zhuli Xie, Application Software Research Center, Motorola Labs
Barbara Di Eugenio, Peter C. Nelson, Department of Computer Science, University of Illinois at Chicago
2. Outline
- Introduction
- Quasi-abstractive Summaries
- Model & Approach
- Experimental Results
- Conclusion & Discussion
3. Introduction
- Types of text summaries
- Extractive: composed of whole sentences or clauses from the source text. The paradigm adopted by most automatic text summarization systems.
- Abstractive: obtained using various techniques like paraphrasing. Equivalent to human-written abstracts. Still well beyond the state of the art.
4. Quasi-abstractive Summaries
- Composed not of whole sentences from the source text, but of fragments that form new sentences [Jing 02]
- We will show they are more similar to human-written abstracts, as measured with cosine similarity and ROUGE-1,2 metrics
5. Quasi-abstractive Rationale
Two sentences from a human-written abstract:
A1: We introduce the bilingual dual-coding theory as a model for bilingual mental representation.
A2: Based on this model, lexical selection neural networks are implemented for a connectionist transfer project in machine translation.
Extractive summary (by ADAMS):
E1: We have explored an information theoretical neural network that can acquire the verbal associations in the dual-coding theory.
E2: The bilingual dual-coding theory partially answers the above questions.
Candidate sentence set for A1:
S1: The bilingual dual-coding theory partially answers the above questions.
S2: There is a well-known debate in psycholinguistics concerning the bilingual mental representation. ...
Candidate sentence set for A2:
S3: We have explored an information theoretical neural network that can acquire the verbal associations in the dual-coding theory.
S4: It provides a learnable lexical selection sub-system for a connectionist transfer project in machine translation.
6. Model & Approach
- Learn a model that can identify Candidate Sentence Sets (CSSs)
- Label: generate patterns of correspondence
- Train a classifier to identify the CSSs
- Generate a summary for a new document
- Generate CSSs
- Realize the summary
7. CSSs Discovery Diagram
8. Learn the CSS Model (1)
- Label
- Decomposition of abstract sentences based on string overlaps
- 70.8% of abstract sentences are composed of fragments of length > 2 which can be found in the text to be summarized, in our test data (CMP-LG corpus)
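The string-overlap decomposition can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: it assumes fragments are simply word n-grams of three or more words shared between an abstract sentence and a source sentence.

```python
def shared_fragments(abstract_sent, source_sent, min_len=3):
    """Find word n-grams of at least min_len words that an abstract
    sentence shares with a source sentence (longest first)."""
    # Collect every word n-gram of the source sentence.
    src = source_sent.lower().split()
    src_ngrams = set()
    for n in range(min_len, len(src) + 1):
        for i in range(len(src) - n + 1):
            src_ngrams.add(tuple(src[i:i + n]))
    # Scan the abstract sentence for shared fragments, longest first.
    abs_words = abstract_sent.lower().split()
    found = []
    for n in range(len(abs_words), min_len - 1, -1):
        for i in range(len(abs_words) - n + 1):
            frag = tuple(abs_words[i:i + n])
            if frag in src_ngrams:
                found.append(" ".join(frag))
    return found
```

A source sentence that contributes a long enough fragment to an abstract sentence becomes a candidate for that sentence's CSS.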
9. Learn the CSS Model (2)
- Train classifier: given documents where all CSSs have been labelled, transform each document into a set of sentence pairs. Each instance is represented by a feature vector, and the target feature is whether the pair belongs to the same CSS
- Used Decision Trees; also tried Support Vector Machines [Joachims, 2002] and Naïve Bayes classifiers [Borgelt, 1999]
- Sparse data problem [Japkowicz 2000; Chawla et al., 2003]
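The slide does not list the actual features, so the following sketch uses three illustrative ones (Jaccard word overlap, length ratio, positional distance); these are assumptions, not the paper's feature set.

```python
def pair_features(sent_i, sent_j, idx_i, idx_j):
    """Turn a sentence pair into a toy feature vector for the
    same-CSS classifier: [word overlap, length ratio, distance]."""
    wi = set(sent_i.lower().split())
    wj = set(sent_j.lower().split())
    # Jaccard overlap of word sets.
    overlap = len(wi & wj) / max(1, len(wi | wj))
    # Ratio of the shorter to the longer sentence (in word types).
    length_ratio = min(len(wi), len(wj)) / max(1, len(wi), len(wj))
    # How far apart the two sentences sit in the document.
    distance = abs(idx_i - idx_j)
    return [overlap, length_ratio, distance]
```

Vectors like these, labelled with whether the pair shares a CSS, would then be fed to a decision tree, SVM, or Naïve Bayes learner.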
10. Summary Generation
- Generate CSSs for unseen documents
- Use the classifier to identify sentence pairs belonging to the same CSS, and merge them
- CSS formation exhibits a natural order, since sentences and sentence pairs are labeled sequentially; i.e., the first CSS will contain at least one fragment which appears earlier in the source text than any fragment in the second CSS
- Summary realization
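Merging classifier-positive pairs into CSSs amounts to computing connected components. A minimal union-find sketch (the slide does not show the paper's exact merging procedure), ordering the resulting CSSs by earliest sentence index to reflect the natural order described above:

```python
def merge_pairs(pairs, n_sents):
    """Union-find merge of sentence-index pairs the classifier marked
    as same-CSS; returns CSSs ordered by earliest sentence index."""
    parent = list(range(n_sents))

    def find(x):
        # Path-halving find.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in pairs:
        parent[find(i)] = find(j)
    groups = {}
    for s in range(n_sents):
        groups.setdefault(find(s), []).append(s)
    # Order CSSs by the position of their earliest sentence.
    return sorted(groups.values(), key=min)
```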
11. Summary Realization
- Simple Quasi-abstractive (SQa)
- A new sentence is generated by appending a new word to the previously generated sequence, according to n-gram probabilities calculated from the CSS
- Each CSS is used only once
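A toy version of this realization step, assuming bigram counts (the slide only says "n-gram") and greedy most-likely-successor decoding:

```python
from collections import Counter, defaultdict

def generate_from_css(css_sentences, max_words=15):
    """Build bigram counts from the sentences in one CSS, then
    greedily append the most frequent successor word until an
    end marker or the word limit is reached."""
    bigrams = defaultdict(Counter)
    for sent in css_sentences:
        words = ["<s>"] + sent.lower().split() + ["</s>"]
        for a, b in zip(words, words[1:]):
            bigrams[a][b] += 1
    out, cur = [], "<s>"
    for _ in range(max_words):
        if not bigrams[cur]:
            break
        cur = bigrams[cur].most_common(1)[0][0]
        if cur == "</s>":
            break
        out.append(cur)
    return " ".join(out)
```

Because the fragments in one CSS overlap, the chained bigrams can splice fragments from different source sentences into one new sentence.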
12. Summary Realization
- Quasi-abstractive with Salient Topics (QaST)
- Salient NPs model based on social networks [Wasserman & Faust, 94; Xie 2005]
- Sort the predicted salient NPs according to their lengths
- Traverse the list of salient NPs and the CSS-based n-gram probabilities in parallel to generate a sentence: use the highest-ranked NP which has not been used yet, and the first n-gram probability model that contains this NP
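The NP/CSS pairing described above might look like the following sketch; sorting longest-first and substring matching against the CSS text are both assumptions standing in for details the slide omits.

```python
def pick_css_for_topics(salient_nps, css_texts):
    """Pair each salient NP with the first unused CSS whose text
    contains it (a stand-in for 'the first n-gram probability
    model that contains this NP')."""
    used_css = set()
    plan = []
    # Traverse NPs longest-first, as a proxy for the slide's
    # sort-by-length ranking.
    for np in sorted(salient_nps, key=len, reverse=True):
        for k, text in enumerate(css_texts):
            if k not in used_css and np.lower() in text.lower():
                plan.append((np, k))
                used_css.add(k)
                break
    return plan
```

Each (NP, CSS) pair would then seed one generated sentence, so every sentence of the summary is anchored in a distinct salient topic.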
13. Topic Prediction
- Salient NPs
- An abstract should contain the salient topics of the article
- Topics are often expressed by NPs
- We assume that the NPs in an abstract represent the most salient topics in the article
- NP Network & NP Centrality
- Collocated NPs can be connected, and hence a network can be formed
- Social network analysis techniques are used to analyze the network [Wasserman & Faust 94] and calculate centrality for nodes [Xie 05]
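Of the centrality measures social network analysis offers, degree centrality is the simplest; the sketch below builds the collocation network and scores NPs with it. The paper may well use a different centrality measure, so treat this as illustrative.

```python
from collections import defaultdict

def degree_centrality(collocations):
    """Build an undirected NP co-occurrence network from collocated
    pairs and score each NP by normalized degree centrality
    (degree divided by n - 1 possible neighbors)."""
    neighbors = defaultdict(set)
    for a, b in collocations:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    return {np: len(adj) / max(1, n - 1) for np, adj in neighbors.items()}
```

Ranking NPs by this score yields the salient-topic list QaST consumes.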
14. Experiments
- Data: 178 documents from the CMP-LG corpus, 3-fold cross validation
- Four models:
- Lead: the first sentence from each of the first m paragraphs.
- ADAMS: the top m sentences, ranked according to the sentence ranking function ADAMS learned.
- SQa: uses n-gram probabilities over the first m discovered CSSs to generate new sentences.
- QaST: anchors the choice of a specific set of n-gram probabilities in salient topics. Stops after m sentences have been generated.
15. Evaluation Metrics
- Cosine similarity: a bag-of-words method
- ROUGE-1,2 [Lin 2004]
- A recall measure comparing a machine-generated summary to its reference summaries
- Still a bag-of-words / n-gram method
- But has shown high correlation with human judgments
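Both metrics reduce to counting over bags of words. Minimal reference implementations of bag-of-words cosine similarity and ROUGE-1 recall follow; real evaluations use the official ROUGE toolkit, with stemming and stop-word options this sketch omits.

```python
import math
from collections import Counter

def cosine_sim(a, b):
    """Bag-of-words cosine similarity between two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: fraction of reference unigrams that the
    candidate summary recovers (counts clipped at the candidate's)."""
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum(min(c[w], n) for w, n in r.items())
    return overlap / sum(r.values())
```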
16. Experimental Results
- SQa's performance is even lower than Lead's
- ADAMS achieved 13.6%, 27.9%, and 37.8% improvements over Lead on the three metrics
- QaST achieved 29.4%, 31.5%, and 64.3% improvements over Lead, and 13.9%, 2.8%, and 19.3% over ADAMS
- All differences between QaST and the others are statistically significant (two-sample t-test), except for ADAMS on ROUGE-1
17. Generated Sentence Sample
Generated (QaST):
In collaborative expert-consultation dialogues, two participants ( executing agent and the consultant bring to the plan construction task different knowledge about the domain and the desirable characteristics of the resulting domain plan.
Source sentences:
In collaborative expert-consultation dialogues, two participants (executing agent and consultant) work together to construct a plan for achieving the executing agent's domain goal. The executing agent and the consultant bring to the plan construction task different knowledge about the domain and the desirable characteristics of the resulting domain plan.
18. Sample Summary
QaST: In this paper, we present a plan-based architecture for response generation in collaborative consultation dialogues, with emphasis on cases in which the user has indicated preferences. to an existing tripartite model might require inferring a chain of actions for addition to the shared plan, can appropriately respond to user queries that are motivated by ill-formed or suboptimal solutions, and handles in a unified manner the negotiation of proposed domain actions, proposed problem-solving actions, and beliefs proposed by discourse actions as well as the relationship amongst them. In collaborative expert-consultation dialogues, two participants ( executing agent and the consultant bring to the plan construction task different knowledge about the domain and the desirable characteristics of the resulting domain plan. In suggesting better alternatives, our system differs from van Beek's in a number of ways.
Abstract: This paper presents a plan-based architecture for response generation in collaborative consultation dialogues, with emphasis on cases in which the system (consultant) and user (executing agent) disagree. Our work contributes to an overall system for collaborative problem-solving by providing a plan-based framework that captures the Propose-Evaluate-Modify cycle of collaboration, and by allowing the system to initiate subdialogues to negotiate proposed additions to the shared plan and to provide support for its claims. In addition, our system handles in a unified manner the negotiation of proposed domain actions, proposed problem-solving actions, and beliefs proposed by discourse actions. Furthermore, it captures cooperative responses within the collaborative framework and accounts for why questions are sometimes never answered.
19. Conclusion & Discussion
- A new type of machine-generated summary: the quasi-abstractive summary
- An n-gram model anchored by salient NPs gives good results
- Further investigation needed in several aspects:
- CSS discovery with cost-sensitive classifiers [Domingos, 1999; Ting, 2002]
- Grammaticality and length of generated summaries [Wan et al., 2007]