Graph-Based Methods for Automatic Text
Summarization
Lin Ziheng[1], Kan Min-Yen[2] and Lee Wee Sun[2]
School of Computing, National University of Singapore
3 Science Drive 2, Singapore 117543
A paper on this work has been accepted by the
HLT-NAACL 2007 TextGraphs-2 workshop. We also
participated in DUC 2007 and have submitted a
paper describing our system.
1. Abstract
Current graph-based approaches to text summarization
assume static graphs. A suitable evolutionary text
graph model may impart a better understanding of the
texts. We propose a timestamped graph (TSG) model
that is based on evolving networks and human writing
and reading processes.
  • 3. Timestamped Graph
  • Assumptions
  • Writers write articles from the first sentence to
    the last.
  • Readers read articles from the first sentence to
    the last.
  • Approach
  • Add sentences into the graph in chronological
    order.
  • Suitable for modeling the growth of single
    documents. For multi-document input, treat the
    set as multiple instances of single documents
    that evolve in parallel.
  • Definition
  • The example is just one instance of TSG with
    specific parameter settings. We generalize and
    formalize the TSG algorithm.
  • A timestamped graph algorithm tsg(M) is a 9-tuple
    (d, e, u, f, σ, t, i, s, τ) that specifies a
    resulting algorithm that takes as input the set
    of texts M and outputs a graph G.
  • e - number of edges to add per vertex per time step
  • u - unweighted or weighted edges
  • σ - vertex selection function σ(u, G)
  • s - skew degree
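The growth process described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `cosine` and `build_tsg` are hypothetical, the vertex selection function is assumed to pick the most similar sentence (bag-of-words cosine) that the vertex is not yet linked to, and the remaining tuple parameters (edge direction, skew degree and so on) are left out.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_tsg(docs, e=1, weighted=False):
    """Grow a graph one time step at a time (hypothetical helper).

    docs: list of documents, each a list of sentence strings.
    Returns (sentences, edges); edges maps (src, dst) index pairs to weights.
    """
    bags, sentences, edges = [], [], {}
    steps = max((len(d) for d in docs), default=0)
    for t in range(steps):
        # Documents evolve in parallel: each contributes its t-th sentence.
        for doc in docs:
            if t < len(doc):
                sentences.append(doc[t])
                bags.append(Counter(doc[t].lower().split()))
        # Every vertex currently in the graph adds up to e new outgoing edges.
        for u in range(len(sentences)):
            candidates = [v for v in range(len(sentences))
                          if v != u and (u, v) not in edges]
            # Assumed selection function: most similar unlinked vertex first.
            candidates.sort(key=lambda v: cosine(bags[u], bags[v]), reverse=True)
            for v in candidates[:e]:
                edges[(u, v)] = cosine(bags[u], bags[v]) if weighted else 1.0
    return sentences, edges
```

Calling `build_tsg([["a b c", "a b d"], ["a b c e"]], e=1)` runs two time steps: the first sentences of both documents enter together, and the second sentence of the first document enters one step later, so earlier sentences have had more chances to add edges.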
  • 2. Motivation
  • No existing text graph approach (e.g., LexRank,
    TextRank) models how texts emerge.
  • Natural evolving networks.
  • Human writing and reading processes.
  • The success of graph ranking algorithms, such as
    PageRank.

[Figures: an example of the growth of a TSG; citation network; the WWW; skew degree]
  • 4. Evaluation
  • Datasets: DUC 2005, 2006 and 2007. Evaluation
    tool: ROUGE.
  • Each dataset contains 50 clusters; each cluster
    contains a query and 25 documents.
  • Summarization system: (1) graph construction
    phase: TSG; (2) sentence ranking phase:
    PageRank; (3) sentence extraction phase: MMR
    re-ranker.
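The ranking and extraction phases can be illustrated with a small sketch. It assumes a graph given as a dictionary of weighted directed edges, a standard power-iteration PageRank, and a greedy MMR (maximal marginal relevance) selection. The names `pagerank` and `mmr_select`, the damping factor 0.85 and the trade-off `lam=0.7` are illustrative choices, not values taken from the poster.

```python
def pagerank(n, edges, damping=0.85, iters=50):
    """Power iteration over a weighted directed graph with n vertices.

    edges maps (src, dst) index pairs to non-negative weights.
    """
    if n == 0:
        return []
    out_weight = [0.0] * n
    for (u, _), w in edges.items():
        out_weight[u] += w
    rank = [1.0 / n] * n
    for _ in range(iters):
        # Mass held by vertices without outgoing weight is spread uniformly.
        dangling = sum(rank[u] for u in range(n) if out_weight[u] == 0)
        new = [(1 - damping) / n + damping * dangling / n] * n
        for (u, v), w in edges.items():
            if w > 0:
                new[v] += damping * rank[u] * w / out_weight[u]
        rank = new
    return rank


def mmr_select(scores, sim, k, lam=0.7):
    """Greedy MMR: balance rank score against redundancy with chosen items.

    scores: one rank score per sentence; sim(i, j): similarity in [0, 1].
    """
    chosen = []
    remaining = set(range(len(scores)))
    while remaining and len(chosen) < k:
        def mmr(i):
            redundancy = max((sim(i, j) for j in chosen), default=0.0)
            return lam * scores[i] - (1 - lam) * redundancy
        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=mmr)
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

In this sketch a highly ranked sentence that is nearly identical to one already selected is penalised, so a lower-ranked but novel sentence can be extracted instead.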
  • 5. Conclusion
  • Proposed a timestamped graph model for text
    understanding and summarization.
  • Applied TSG on DUC 05, 06 and 07, and achieved
    comparable results.
  • Best performance achieved with specific parameter
    settings.
  • TSG subsumes the graphs used by LexRank and
    TextRank.
  • 6. Future Work
  • Currently looking further into skewed timestamped
    graphs.
  • Analyzing in-degree distribution of timestamped
    graphs.

12th
Topic-sensitive  Weighted edges  ROUGE-1  ROUGE-2
No               No              0.39358  0.07690
Yes              No              0.39443  0.07838
No               Yes             0.39823  0.08072
Yes              Yes             0.39845  0.08282
3rd
Skew degree  ROUGE-1  ROUGE-2
0            0.36982  0.07580
1            0.37268  0.07682
2            0.36998  0.07489
Results of participation in DUC 2007
  • Optimal performance: e = 2, topic-sensitive
    PageRank, weighted edges and s = 1.
  • DUC results show that TSG is better suited to
    update summaries.
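One common way to make PageRank topic-sensitive is to bias the teleport distribution toward sentences that are relevant to the query, while edge weights control how rank flows between sentences. The sketch below follows that formulation as an assumption. The poster does not spell out the exact variant used, and the function name and the `relevance` input (for example, sentence-query similarity scores) are hypothetical.

```python
def topic_sensitive_pagerank(edges, relevance, damping=0.85, iters=50):
    """PageRank with a query-biased teleport vector over weighted edges.

    edges: maps (src, dst) index pairs to non-negative weights.
    relevance: one non-negative query-relevance score per vertex.
    """
    n = len(relevance)
    if n == 0:
        return []
    total = sum(relevance)
    # Teleport proportionally to relevance; fall back to uniform if all zero.
    teleport = [r / total for r in relevance] if total > 0 else [1.0 / n] * n
    out_weight = [0.0] * n
    for (u, _), w in edges.items():
        out_weight[u] += w
    rank = teleport[:]
    for _ in range(iters):
        # Rank held by vertices without outgoing weight also teleports.
        dangling = sum(rank[u] for u in range(n) if out_weight[u] == 0)
        new = [(1 - damping + damping * dangling) * teleport[v] for v in range(n)]
        for (u, v), w in edges.items():
            if w > 0:
                new[v] += damping * rank[u] * w / out_weight[u]
        rank = new
    return rank
```

With equal relevance scores this reduces to ordinary PageRank. Raising the relevance of one sentence shifts rank toward it and, through its outgoing edges, toward its neighbours.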

[1] Student  [2] Supervisor