Text Summarization - PowerPoint PPT Presentation

Provided by: yxie
Title: Text Summarization


1
Text Summarization
  • Dr. Ying Xie

2
  • This lecture note primarily describes the work of
    Gunes Erkan and Dragomir R. Radev:
  • LexRank: Graph-based Lexical Centrality as Salience
    in Text Summarization, Journal of Artificial
    Intelligence Research 22, 2004.

3
Text Summarization
  • Text summarization is the process of
    automatically creating a compressed version of a
    given text that provides useful information for
    the user

4
Applications of Text Summarization
  • ...put a book on the scanner, turn the dial to '2
    pages', and read the result...
  • ...download 1000 documents from the web, send
    them to the summarizer, and select the best ones
    by reading the summaries of the clusters...
  • ...forward the Japanese email to the summarizer,
    select '1 par', and skim the translated summary.

5
Applications of Text Summarization(2)
Cited from Dr. Radev's slides "NLP/IR tools
developed at the CLAIR Lab"
6
Two types of summarization
  • Topic-oriented summarization focuses on a user's
    topic of interest and extracts information
    related to the specified topic.
  • General summarization covers as much of the
    information content as possible and preserves the
    general topical organization of the original text.

7
Two types of summarization techniques
  • There are two types of summarization methods:
  • - Extractive summarization produces summaries
    by choosing a subset of the sentences in the
    original document(s).
  • - Abstractive summarization rephrases the
    information in the text rather than copying
    sentences verbatim.

8
Sentence Centrality
  • Assess the centrality of each sentence in a
    cluster and extract the most important ones to be
    included in the summary.
  • Different ways to evaluate sentence centrality
  • - Centroid-based
  • - Graph-based

9
Each sentence is represented as a vector of terms
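
As a minimal sketch of this representation, assuming a simple bag-of-words model with raw term counts (the tokenizer and the helper names `tokenize` and `term_vectors` are illustrative, not taken from the slides):

```python
from collections import Counter

PUNCT = ".,;:!?\"'()"

def tokenize(sentence):
    # Lowercase, split on whitespace, and strip surrounding punctuation.
    words = (w.strip(PUNCT).lower() for w in sentence.split())
    return [w for w in words if w]

def term_vectors(sentences):
    # The vocabulary is the sorted set of all distinct terms in the input.
    tokenized = [tokenize(s) for s in sentences]
    vocab = sorted({t for toks in tokenized for t in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # One dimension per vocabulary term, holding its count in the sentence.
        vectors.append([counts[t] for t in vocab])
    return vocab, vectors

sentences = ["The cat sat on the mat.", "The dog sat."]
vocab, vectors = term_vectors(sentences)
print(vocab)    # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(vectors)  # [[1, 0, 1, 1, 1, 2], [0, 1, 0, 0, 1, 1]]
```

In practice the raw counts are often reweighted (for example by idf) before sentences are compared.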
10
Centroid-based centrality
  • Conduct clustering analysis on all the sentences.
    The sentences closest to the center of each
    cluster are the most salient sentences.
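
A rough sketch of the idea, under the simplifying assumption that all sentences already belong to a single cluster (the clustering step itself is omitted). The function names and the example vectors are hypothetical:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors; 0.0 if either is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def centroid_ranking(vectors):
    # The centroid is the component-wise mean of the sentence vectors.
    n = len(vectors)
    centroid = [sum(col) / n for col in zip(*vectors)]
    scores = [cosine(v, centroid) for v in vectors]
    # Sentence indices, most central (closest to the centroid) first.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return order, scores

# Hypothetical term-count vectors for three sentences over a 4-term vocabulary.
vectors = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 0, 1]]
order, scores = centroid_ranking(vectors)
print(order)  # [1, 0, 2]
```

The second sentence ranks first because it shares the most terms with the cluster as a whole.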

11
Graph-based centrality
  • Applying social networking to summarization
  • Social networking is a mapping of relationships
    between interacting entities.
  • Social networking is widely applied in computer
    networks and information retrieval (recall
    PageRank?)
  • Here, a social network is used to represent the
    similarity relations between sentences.

12
Graph-based centrality (2)
  • The assumption is that sentences that are similar
    to many of the other sentences in a cluster are
    more central to the topic.
  • Two problems to solve:
  • - how to evaluate the similarity between
    sentences
  • - how to compute the overall centrality of a
    sentence given the similarity relationships.

13
Similarity Between Sentences
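
The formula on this slide did not survive in the transcript. The sketch below assumes the idf-modified cosine measure from the cited Erkan and Radev paper; treating each sentence as a "document" when computing idf is a simplification made here, and the function names and example sentences are illustrative:

```python
import math
from collections import Counter

def idf_table(tokenized):
    # idf(w) = log(N / n_w), where n_w is the number of sentences containing w.
    n = len(tokenized)
    df = Counter(t for toks in tokenized for t in set(toks))
    return {t: math.log(n / df[t]) for t in df}

def idf_modified_cosine(x, y, idf):
    # x and y are token lists; tf is the raw count of a term in the sentence.
    tx, ty = Counter(x), Counter(y)
    num = sum(tx[w] * ty[w] * idf[w] ** 2 for w in tx.keys() & ty.keys())
    norm_x = math.sqrt(sum((tx[w] * idf[w]) ** 2 for w in tx))
    norm_y = math.sqrt(sum((ty[w] * idf[w]) ** 2 for w in ty))
    return num / (norm_x * norm_y) if norm_x and norm_y else 0.0

# Hypothetical pre-tokenized sentences.
tokenized = [
    ["storm", "hits", "coast"],
    ["storm", "damages", "coast"],
    ["markets", "rise"],
]
idf = idf_table(tokenized)
sim01 = idf_modified_cosine(tokenized[0], tokenized[1], idf)
sim02 = idf_modified_cosine(tokenized[0], tokenized[2], idf)
print(round(sim01, 3), sim02)
```

Sentences with no shared terms get similarity 0, and terms that appear in every sentence contribute nothing because their idf is 0.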
14
LexRank
15
LexRank
16
Degree Centrality
  • Degree centrality: the degree of the node
    corresponding to the sentence in the similarity
    graph.
  • This method takes advantage of the graph in the
    simplest way.
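
A minimal sketch of degree centrality, assuming the graph is built by linking two sentences whenever their similarity exceeds a threshold. The threshold value and the similarity matrix below are made up for illustration, and self-similarity is ignored:

```python
def degree_centrality(sim, threshold=0.1):
    # sim is a square, symmetric matrix of pairwise sentence similarities.
    # The degree of sentence i is the number of other sentences whose
    # similarity to i is above the threshold.
    n = len(sim)
    return [
        sum(1 for j in range(n) if j != i and sim[i][j] > threshold)
        for i in range(n)
    ]

# Hypothetical similarity matrix for four sentences.
sim = [
    [1.0, 0.3, 0.2, 0.0],
    [0.3, 1.0, 0.05, 0.0],
    [0.2, 0.05, 1.0, 0.15],
    [0.0, 0.0, 0.15, 1.0],
]
degrees = degree_centrality(sim, threshold=0.1)
print(degrees)  # [2, 1, 2, 1]
```

The result is sensitive to the threshold: a low value links nearly every pair, while a high value leaves most sentences isolated.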

17
LexRank
  • Since we have a graph, why not use the idea of
    PageRank?
  • The idea is that each sentence has a centrality
    value and distributes its centrality to its
    neighbors.
  • So, a sentence linked to many sentences with high
    centrality will have high centrality as well.
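
The idea above can be sketched as a PageRank-style power iteration over a thresholded similarity graph. This is an illustrative sketch, not the exact formulation from the slides (which is not preserved in the transcript); the threshold, the damping value, and the similarity matrix are assumptions:

```python
def lexrank(sim, threshold=0.1, damping=0.85, tol=1e-8, max_iter=200):
    # sim is assumed to be a square, symmetric similarity matrix.
    n = len(sim)
    # Neighbors of i: other sentences with similarity above the threshold.
    adj = [
        [j for j in range(n) if j != i and sim[i][j] > threshold]
        for i in range(n)
    ]
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new_scores = []
        for i in range(n):
            # Each neighbor j splits its current score evenly among its neighbors.
            received = sum(scores[j] / len(adj[j]) for j in adj[i])
            new_scores.append((1.0 - damping) / n + damping * received)
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores

# Hypothetical similarity matrix for four sentences.
sim = [
    [1.0, 0.3, 0.2, 0.0],
    [0.3, 1.0, 0.05, 0.0],
    [0.2, 0.05, 1.0, 0.15],
    [0.0, 0.0, 0.15, 1.0],
]
scores = lexrank(sim)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print([round(s, 3) for s in scores], ranking)
```

With this matrix, sentences 0 and 2 each have two neighbors and end up with the highest scores. If some sentence has no neighbors, it only receives the constant term and the scores no longer sum to 1.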

18
LexRank (2)
19
LexRank (3)
  • LexRank takes into account information
    subsumption among sentences.
  • In other words, if the information content of a
    sentence subsumes that of other sentences in a
    cluster, that sentence is preferred for inclusion
    in the summary.

20
LexRank Demo