IR, IE and QA over Social Media

Provided by: lepeti

1
IR, IE and QA over Social Media
  • Social media (blogs, community QA, news
    aggregators)
  • Complementary to traditional news sources
    (Rathergate)
  • Growing faster than traditional web content; the gap
    is widening
      • Traditional/published: 4 GB/day; social media:
        10 GB/day [from Andrew Tomkins/Yahoo!, "Future of
        Web Search", May 2007]
  • Research challenges
      • Low(er) quality
      • Content more dynamic
      • User interactions crucial (ratings, comments,
        link structure), both to retrieve documents and
        to evaluate extracted information

2
Finding High Quality Content for IE/QA
E. Agichtein, C. Castillo, D. Donato, A. Gionis,
and G. Mishne, "Finding High Quality Content in
Social Media", in Proc. of WSDM 2008
  • Goal: find high-quality content (accurate,
    well-presented)
  • Setting: community QA (Yahoo! Answers)
  • Classifying social media (e.g., cQA) is
    substantially different from document
    classification
  • Sources of information
      • Content analysis
      • Usage data (page views, etc.)
      • Community ratings, link analysis
  • General framework for quality estimation in
    social media
      • Graph-based model of contributor relationships,
        combined with content and usage analysis
  • Can identify high-quality items with accuracy
    close to human agreement
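
The framework above can be illustrated with a rough sketch. This is not the authors' implementation: it assumes a hypothetical asker-to-answerer graph, uses a plain PageRank as a stand-in for the graph-based contributor model, and invents field names such as `page_views` and `thumbs_up` for illustration. It only shows how content, usage, and community signals might be gathered into one feature record per item before any classifier is trained.

```python
from collections import defaultdict

def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed graph given as (src, dst) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank

def quality_features(item, authority):
    """Combine content, usage, and community signals (all names are assumptions)."""
    words = item["text"].split()
    return {
        "num_words": len(words),                                      # content
        "avg_word_len": sum(map(len, words)) / max(len(words), 1),    # content
        "page_views": item["page_views"],                             # usage
        "net_votes": item["thumbs_up"] - item["thumbs_down"],         # ratings
        "author_authority": authority.get(item["author"], 0.0),       # graph
    }

# Hypothetical edges: asker -> answerer.
edges = [("ann", "bob"), ("cat", "bob"), ("dan", "bob"),
         ("bob", "eve"), ("ann", "eve")]
authority = pagerank(edges)

item = {"text": "Boil the pasta for nine minutes", "author": "bob",
        "page_views": 120, "thumbs_up": 7, "thumbs_down": 1}
features = quality_features(item, authority)
```

The resulting `features` dictionary would then be fed to a classifier of choice; which learner and which exact features the paper uses is not stated on the slide.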

3
Finding Relevant Content for IE/QA
J. Bian, Y. Liu, E. Agichtein, and H. Zha, "Finding
the Right Facts in the Crowd: Factoid Question
Answering over Social Media", to appear in Proc.
of WWW 2008
  • Goal: given a query, rank social content (cQA) by
    expected relevance and quality
  • Approach: learn ranking functions specifically
    for social media retrieval
  • Features
      • Textual: content relevance, stylistics, language
        models
      • User interactions: link structure, discussion
        threads
      • User ratings: incorporate user-provided content
        ratings
  • Method: gradient boosting (GBrank)
      • Developed a new objective function for learning
        a ranking function from (noisy) preference data
  • Results
      • Outperforms both the Yahoo! default ranking and
        naïve ranking by user votes
      • Can be made robust to ratings spam [same
        authors, to appear in AIRWeb 2008]
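
The idea of learning a ranking function from preference data can be sketched as follows. This is a simplified illustration, not the objective function developed in the paper: it assumes that vote counts within one question yield preference pairs, and it swaps the boosted regression trees of GBrank for a linear scorer trained by gradient descent on a squared hinge loss over those pairs. The feature values and names (`features`, `votes`) are hypothetical.

```python
def preference_pairs(answers):
    """Turn vote counts into (preferred, other) feature-vector pairs."""
    pairs = []
    for a in answers:
        for b in answers:
            if a["votes"] > b["votes"]:
                pairs.append((a["features"], b["features"]))
    return pairs

def score(weights, features):
    """Linear scoring function (a stand-in for a boosted-tree ensemble)."""
    return sum(w * f for w, f in zip(weights, features))

def train(pairs, num_features, margin=1.0, learning_rate=0.05, epochs=200):
    """Minimise 0.5 * sum(max(0, margin - (score(x) - score(y)))**2)
    over pairs (x preferred to y) by batch gradient descent."""
    weights = [0.0] * num_features
    for _ in range(epochs):
        grad = [0.0] * num_features
        for preferred, other in pairs:
            violation = margin - (score(weights, preferred) - score(weights, other))
            if violation > 0:  # only pairs ranked wrongly or too closely contribute
                for j in range(num_features):
                    grad[j] -= violation * (preferred[j] - other[j])
        weights = [w - learning_rate * g / len(pairs)
                   for w, g in zip(weights, grad)]
    return weights

# Hypothetical answers to one question: [text relevance, author authority].
answers = [
    {"features": [0.4, 0.7], "votes": 3},
    {"features": [0.9, 0.8], "votes": 12},
    {"features": [0.2, 0.1], "votes": 0},
]
pairs = preference_pairs(answers)
weights = train(pairs, num_features=2)
ranked = sorted(answers, key=lambda a: score(weights, a["features"]), reverse=True)
```

Because the loss only looks at pairs, noisy absolute vote counts matter less than their relative order, which is one reason pairwise objectives suit this kind of data.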