Extracting the Discussion Structure in Comments on News-Articles

1
Extracting the Discussion Structure in Comments on News-Articles
  • Speaker ???
  • Anne Schuth, Maarten Marx, Maarten de Rijke
  • Universiteit van Amsterdam
  • Proceedings of the 9th annual ACM international
    workshop on Web information and data management
  • Year of publication: 2007

2
Outline
  • Introduction
  • Data description and acquisition
  • Data exploration
  • Thread structure
  • Conclusion

3
Introduction
  • In the past few years, the World Wide Web has
    seen some fundamental changes, among them the
    phenomenon known as user-generated content:
    users, as opposed to owners of websites, are now
    able to add their own content.

4
Data description and acquisition
  • Terminology
    - An article is a complete news-article
      published on a news-site.
    - A comment on an article consists of at least
      the name of its author and the comment itself.
    - A comment-thread, or just thread, is a flat
      list of comments in reverse chronological
      order.
    - A commenter is a person who comments on an
      article.
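
The terminology above can be written down as a small data model. This is only an illustrative sketch: the class and field names (`Comment`, `Article`, `thread`, `commenters`) are assumptions made for this example and do not come from the presentation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    author: str  # name of the commenter (hypothetical field name)
    text: str    # the comment itself


@dataclass
class Article:
    title: str
    body: str
    # Flat comment thread, stored in the order in which the site lists it.
    thread: List[Comment] = field(default_factory=list)


def commenters(article: Article) -> List[str]:
    """Return the distinct commenter names, in order of first appearance."""
    names: List[str] = []
    for comment in article.thread:
        if comment.author not in names:
            names.append(comment.author)
    return names


article = Article(
    title="Example headline",
    body="Example body text.",
    thread=[
        Comment("Piet", "Jan, I disagree."),
        Comment("Jan", "Good article."),
        Comment("Piet", "One more thing."),
    ],
)
print(commenters(article))
```

A thread is kept as a plain list because the slides describe it as flat: any reply structure has to be recovered from the comment texts.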

5
Data description and acquisition

6
Data exploration
  • Articles
  • Comment threads

7
Data exploration

8
Data exploration
  • Comments

9
Data exploration

10
Data exploration

11
Data exploration
  • Thread structure
    - Commenters not only comment on news articles,
      they also react to other comments, using the
      news article commenting facility.
    - We can define this reacts-on relation
      declaratively as follows:

12
Thread structure
  • Now we describe our baseline algorithm for
    identifying the reacts-on relation between
    comments.
  • It implements the clue "commenter's name appears
    in comment" by a case-insensitive string match
    between the name of each preceding commenter and
    the full text of the current comment.
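
The baseline described above might be sketched as follows. This is a minimal illustration, not the authors' code: the function name `baseline_reacts_on`, the `(author, text)` tuple layout, and the choice to return matched names per comment index are all assumptions. The thread is assumed to be passed oldest first, so a reverse-chronological thread would need to be reversed beforehand.

```python
def baseline_reacts_on(thread):
    """Map each comment index to the preceding commenters it mentions.

    thread: list of (author, text) pairs, oldest comment first (assumed).
    A match is a case-insensitive substring match of the name in the text.
    """
    matches = {}
    seen_names = []  # names of commenters who posted earlier in the thread
    for index, (author, text) in enumerate(thread):
        lowered = text.lower()
        found = [name for name in seen_names if name.lower() in lowered]
        if found:
            matches[index] = found
        if author and author not in seen_names:
            seen_names.append(author)
    return matches


thread = [
    ("Jan", "Good article."),
    ("Piet", "Jan, I disagree with you."),
    ("Anna", "I agree with PIET."),
]
print(baseline_reacts_on(thread))  # {1: ['Jan'], 2: ['Piet']}
```

Because this is a plain substring match, a short name can also match inside a longer word: with a preceding commenter named "Jan", the comment "In January it rained." counts as a match.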

13
Thread structure

14
Thread structure
  • We will describe three methods for mining the
    reacts-on relation.
    - Method B (word boundaries): we tokenize both
      the commenter's name and the comment, and
      check whether the first list of tokens occurs
      as a sublist of the second list of tokens.
    - Method C (POS-tagging plus loose match): do a
      part-of-speech tagging of all comments.
    - Method D (@-trigger plus loose match): a name
      and an n-gram match if the n-gram is preceded
      by the @-symbol and the name and the n-gram
      are similar.
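
Methods B and D can be illustrated with a short sketch. Several details here are assumptions, because the slides do not specify them: the tokenizer (a simple `\w+` regex), the similarity measure for the "loose match" (`difflib.SequenceMatcher` with a hypothetical threshold of 0.8), and all function names. Method C is left out because it needs a part-of-speech tagger.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def is_sublist(needle, haystack):
    """True if the token list `needle` occurs contiguously in `haystack`."""
    n = len(needle)
    if n == 0:
        return False
    return any(haystack[k:k + n] == needle
               for k in range(len(haystack) - n + 1))


def method_b_match(name, comment):
    """Method B: the name's tokens occur as a sublist of the comment's tokens."""
    return is_sublist(tokenize(name), tokenize(comment))


def method_d_match(name, comment, threshold=0.8):
    """Method D: an n-gram right after '@' is similar to the name.

    `threshold` and the use of SequenceMatcher are illustrative assumptions.
    """
    name_tokens = tokenize(name)
    n = len(name_tokens)
    if n == 0:
        return False
    target = " ".join(name_tokens)
    for trigger in re.finditer(r"@\s*", comment):
        # Take the n tokens that directly follow this '@'.
        candidate = " ".join(tokenize(comment[trigger.end():])[:n])
        if candidate and SequenceMatcher(None, target, candidate).ratio() >= threshold:
            return True
    return False


print(method_b_match("Jan", "In January it rained."))        # False
print(method_b_match("Jan de Vries", "I agree with jan de vries here"))  # True
print(method_d_match("Maarten", "@Marten you are wrong"))    # True
```

Matching on whole tokens avoids the substring false positives of the baseline, while the @-trigger allows a looser comparison that tolerates misspelled names.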

15
Thread structure

16
Conclusion