Clustering for Large Result Sets

Slides: 3
Provided by: larse

Transcript and Presenter's Notes

1
Clustering for Large Result Sets
  • Proposed problem: large result sets, most of
    which are irrelevant to the intended meaning
  • Our example: "cheap beautiful Jaguars for sale"
      • Google returns Jaguar car parts, perfume,
        Jacksonville Jaguars tickets, ... (animals? Mac OS
        10.2?)
  • Existing technology: teoma.com, vivisimo.com,
    others?
      • Many clusters appear identical ("Classified ads
        from Fresno", "Cars and trucks in San Francisco",
        "Cars and trucks in Los Angeles")
      • Others don't appear (animals, even though the top
        link was "Exotic animals for sale")
      • Some are just weird ("Shop Slashdot"? "Real Estate"?
        "Mexico"?)

2
Research Topics
  • Context
      • What makes these clustering engines work?
      • What knowledge-base techniques could apply here?
      • Natural language processing?
  • Research directions
      • On what basis do we create clusters?
          • Dictionary definitions → document classification
            using classic IR techniques
          • Fixed number (easier for user to visualize?) →
            minimize some sort of distance metric
          • Neural-net classifiers?
      • How do we refine the query?
          • Assuming we've narrowed it down to cars, we may
            want to cluster on "cheap" or "beautiful"
          • Unstructured, semi-structured → structured
      • How do we present the results?
          • How do we describe a cluster?
          • What document best represents a cluster?
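
One way to make the "fixed number, minimize a distance metric" and "what document best represents a cluster" questions concrete is the minimal sketch below. It is only an illustration under stated assumptions, not a method proposed in the slides: snippets are weighted with smoothed TF-IDF, grouped by a spherical k-means variant (cosine similarity, fixed k, deterministic farthest-first seeding), and each cluster is represented by the member closest to its centroid. The function names and the sample snippets are hypothetical, and the code assumes k is no larger than the number of snippets.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn each text into an L2-normalised TF-IDF dict (classic IR weighting)."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF, so a term found in every document keeps a small weight.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def centroid(members):
    """Normalised sum (direction of the mean) of a non-empty list of vectors."""
    total = Counter()
    for vec in members:
        for t, w in vec.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
    return {t: w / norm for t, w in total.items()}


def cluster(vectors, k, max_iter=20):
    """Spherical k-means with a fixed k (assumes k <= len(vectors))."""
    seeds = [0]
    while len(seeds) < k:
        rest = [i for i in range(len(vectors)) if i not in seeds]
        # Next seed: the document least similar to its closest existing seed.
        seeds.append(min(rest, key=lambda i: max(cosine(vectors[i], vectors[s])
                                                 for s in seeds)))
    centroids = [vectors[s] for s in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = centroid(members)
    return labels, centroids


def representatives(vectors, labels, centroids):
    """For each non-empty cluster, the index of the member nearest its centroid."""
    reps = {}
    for c, cen in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            reps[c] = max(members, key=lambda i: cosine(vectors[i], cen))
    return reps


if __name__ == "__main__":
    # Made-up result snippets for the "Jaguar" query, for illustration only.
    snippets = [
        "jaguar car parts dealer",
        "cheap jaguar car dealer",
        "jaguar football tickets stadium",
        "cheap jaguar football tickets",
        "jaguar cat rainforest animals",
        "exotic jaguar cat animals",
    ]
    vectors = tfidf_vectors(snippets)
    labels, centroids = cluster(vectors, k=3)
    for c, i in representatives(vectors, labels, centroids).items():
        print(f"cluster {c}: represented by {snippets[i]!r}")
```

Choosing the representative as the member nearest the centroid is one simple answer to the presentation question; describing a cluster with a readable label (rather than a sample document) would need a different approach, such as picking the highest-weighted centroid terms.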