TOPIC CENTRIC QUERY ROUTING - PowerPoint PPT Presentation

About This Presentation
Title:

TOPIC CENTRIC QUERY ROUTING

Slides: 18
Provided by: Rav960
Learn more at: http://www.cs.bsu.edu

Transcript and Presenter's Notes

Title: TOPIC CENTRIC QUERY ROUTING


1
TOPIC CENTRIC QUERY ROUTING
  • Research Methods (CS689)
  • 11/21/00

By Anupam Khanal
2
  • Introduction
  • What is query routing?
  • Searching online can be both rewarding and
    frustrating.
  • General search engines such as Yahoo and Lycos
    return much information that is irrelevant to
    the user's query.
  • In this context, query routing attempts to
    dynamically route each user's query to the
    appropriate specialized search engine.

3
Problem Description
  • There are many general search engines, such as
    Yahoo, Lycos, and Alta Vista.
  • There are also many topic-specific search
    engines, such as VacationSpot.com and
    KidsHealth.com.
  • However, many casual users are not familiar
    with all these topic-specific search engines.
  • In this context, topic-centric query expansion
    is important.

4
Why Topic Centric Query Routing
  • Before discussing the importance of topic-centric
    routing, it is useful to review other query
    routing systems.
  • Manual Query Routing Services
  • - provide a categorized list of specialized
    search engines; users have to choose the
    search engines themselves
  • - although a keyword search interface is
    provided, the terms accepted as keywords
    are limited.
  • Query Routing Based on Centroids
  • - rely on centroids, which are summaries
    of databases
  • - these summaries consist of a complete
    list of the terms in each database and
    their frequencies.

5
  • - a search engine is located by deciding
    which databases are relevant to a user query,
    comparing the query with each centroid.
  • - this technique cannot be applied to
    most of the topic-specific search engines
    on the Web because access to their internal
    databases is restricted.
  • Query Routing Without Centroids
  • - instead of centroids, these systems generate
    a short text describing each database.
  • - a search engine is located only if the
    search keywords are contained in that text.

6
Research Objective
  • In this context, Topic Centric Query Routing is
    appropriate because its routing model expands
    the query.
  • The general framework of the query routing model
    is as follows:
  • Getting relevant terms from the Web
  • - the routing model does not use any special
    dictionaries; it uses the Web as the
    source of relevant terms.
  • - it finds the Web documents relevant to
    the user query dynamically by submitting
    that query to a general search engine.
  • - the relevant terms are extracted from
    those documents.
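
As a rough illustration of extracting candidate terms from retrieved documents, the minimal sketch below tokenizes a few in-memory strings that stand in for pages returned by a general search engine. The function name `extract_terms`, the stop-word list, and the sample documents are assumptions made for this example; the slides do not specify a tokenizer or a stop-word policy.

```python
import re

# Hypothetical stop-word list; the slides do not specify one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "for", "on"}

def extract_terms(documents):
    """Return the set of distinct candidate terms found in the documents."""
    terms = set()
    for text in documents:
        # Lowercase and split on anything that is not a letter or digit.
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            if token not in STOP_WORDS:
                terms.add(token)
    return terms

# Stand-in for documents returned by a general search engine for "python".
docs = [
    "Monty Python is a comedy movie troupe",
    "Python is an object oriented programming language",
]
candidate_terms = extract_terms(docs)
```

No dictionary is consulted here: every candidate term comes from the retrieved text itself, which is the point the slide makes.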

7
  • Co-occurrence-based evaluation of term
    relevance
  • - the mutual relevance of terms is evaluated
    on the basis of their co-occurrences in the
    documents.
  • - the co-occurrences of the search keywords
    are counted in all the documents retrieved by
    the general search engine.
  • - the routing model lists all the distinct
    terms contained in all the documents and counts,
    for each term, the number of documents that
    contain both the search keyword and that term.
  • Using a pseudo-feedback technique
  • - it is difficult to determine term
    relevance from only the results of a single
    document search on the general search engine.
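
The co-occurrence count described on this slide can be sketched as follows. This is a minimal illustration that assumes a single-word search keyword and uses a small in-memory list in place of real search results; the names `tokenize` and `cooccurrence_counts` are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Return the set of distinct lowercase terms in a document."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def cooccurrence_counts(keyword, documents):
    """For each term, count the documents containing both the keyword and the term."""
    counts = Counter()
    keyword = keyword.lower()
    for text in documents:
        tokens = tokenize(text)
        if keyword in tokens:
            # Each document contributes at most 1 to a term's count.
            counts.update(tokens - {keyword})
    return counts

# Stand-in for documents retrieved by a general search engine.
docs = [
    "Monty Python movie reviews",
    "Python programming tutorial",
    "Monty Python comedy movie",
    "Java programming guide",
]
counts = cooccurrence_counts("python", docs)
```

With this sample data, "movie" co-occurs with "python" in two documents and "programming" in one, while "java" never co-occurs with it.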

8
  • - even relevant terms often have few
    co-occurrences in the selected documents of
    the first search.
  • - in this context, the query routing model
    re-evaluates such low co-occurrence terms by
    selecting the terms to be re-evaluated from the
    first search results, formulating new queries
    by adding the selected terms to the original
    query, and performing the co-occurrence-based
    evaluation for each formulated query.
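
A minimal sketch of this pseudo-feedback loop is given below. It is not the presenter's implementation: the toy corpus, the `search` stand-in (which returns only the first few matching documents, mimicking a search engine that shows a limited result page), the threshold of 2, and the rule for choosing which low co-occurrence terms to re-evaluate are all assumptions made for illustration.

```python
import re
from collections import Counter

# Toy corpus standing in for the Web (hypothetical data).
CORPUS = [
    "monty python movie comedy",
    "monty python movie grail",
    "python programming jpython",
    "python jpython java applet",
    "python jpython java servlet",
    "java coffee",
]

def tokenize(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def search(query_terms, limit=3):
    """Toy search engine: the first `limit` documents containing every query term."""
    wanted = set(query_terms)
    hits = [doc for doc in CORPUS if wanted <= tokenize(doc)]
    return hits[:limit]

def cooccurrences(query_terms, documents):
    """Count, per term, the documents in which it appears alongside the query."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc) - set(query_terms))
    return counts

def pseudo_feedback(query_terms, threshold=2, max_reeval=4):
    # First search: split terms into high and low co-occurrence groups.
    first = cooccurrences(query_terms, search(query_terms))
    high0 = {t for t, c in first.items() if c >= threshold}
    # Assumed selection rule: re-evaluate up to max_reeval low terms.
    low = sorted((t for t, c in first.items() if c < threshold),
                 key=lambda t: (-first[t], t))[:max_reeval]
    feedback = {}
    for term in low:
        # New query = original query plus one selected term.
        new_query = list(query_terms) + [term]
        counts = cooccurrences(new_query, search(new_query))
        feedback[term] = {t for t, c in counts.items() if c >= threshold}
    return high0, feedback

high0, feedback = pseudo_feedback(["python"])
```

In this toy run, "jpython" appears in only one document of the first search, so it falls below the threshold. Re-querying with "python jpython" brings back more documents on that topic, and "java" then emerges as a high co-occurrence term.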

9

Figure 4 Query expansion procedure.
10
6. Cluster all terms in D0 into at most three
   clusters W1 = {w11, ..., w1m}, W2 = {w21, ..., w2k},
   and W3 = {w31, ..., w3j}.
7. Formulate three queries Q1-Q3 by combining W1-W3
   with Q0 (for example, Q1 = "w01 ... w0n w11 ... w1m").
8. Get document sets DT1-DT4 and D1-D3 by sending
   QT1-QT4 and Q1-Q3 independently to a general
   search engine.
9. Count co-occurrences in DT1-DT4 and D1-D3. The sets
   of high co-occurrence terms WTH1-WTH4 and WH1-WH3,
   as well as WH0 from step 3, are the query expansion
   results.
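
Step 7 can be illustrated with a short sketch. The clustering method of step 6 is not described in this transcript, so the clusters are simply passed in as input here; the function name `formulate_queries` and the sample terms are hypothetical.

```python
def formulate_queries(q0_terms, clusters):
    """Combine the original query Q0 with each term cluster to form Q1, Q2, ..."""
    if len(clusters) > 3:
        raise ValueError("the procedure uses at most three clusters")
    # Each new query is the Q0 terms followed by one cluster's terms.
    return [" ".join(list(q0_terms) + list(cluster)) for cluster in clusters]

# Q0 = "python"; W1 and W2 are assumed to come from the clustering step.
queries = formulate_queries(
    ["python"],
    [["monty", "movie"], ["java", "jpython"]],
)
```

Each resulting string would then be sent independently to the general search engine, as in step 8.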
12
Query Routing Result (Query: python)
  • User query: python
  • Phrase to explain topic: "If you are looking
    for information about movie - monty python"
  • Recommended topic search engines
  • - 1600 Search/Go to Search the Internet Movie
    Database
  • - 1600 Search/Go to The Roger Ebert Movie Files
  • - 1600 Search/Go to Horror Search
13
Other Topics
  • Object oriented programming in python
  • - 7500 Search/Go to Index to Object Oriented
    Information Sources
  • - 3600 Search/Go to Unix Programming
  • jpython - python in java
  • - 6300 Search/Go to java.sun.com The Source for
    Java Technology
  • - 5641 Search/Go to Gamelan - The official Java
    Directory
  • - 4921 Search/Go to JCentral Search the web for
    Java
  • - 4266 Search/Go to Index to Object Oriented
    Information Sources
14
  • Importance of Topic Centric Query Routing
  • A Query Routing Model is used.
  • The Query Routing Model doesn't generate centroids.
  • It consists of an offline pre-processing
    component and an online interface.
  • The offline Query Routing Model takes as input a
    set of search engines and creates, for each
    engine, an approximate textual model of that
    engine's content or scope.

15
  • The online Query Routing Model takes a user query
    as input and applies a novel query expansion
    technique to the query.
  • It then clusters the output of the query
    expansion to suggest multiple topics that the user
    may be interested in.
  • Each topic is associated with a set of search
    engines, e.g., for the query python.
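
One way to picture how a topic could be associated with search engines is to compare the topic's expanded terms with each engine's textual model. The sketch below is an assumption-laden illustration: the scoring rule (simple term overlap), the function name `route`, and the term sets attached to each engine are invented for this example, and only the engine names come from the result slides.

```python
# Hypothetical textual models: a few terms describing each engine's scope.
ENGINE_MODELS = {
    "Internet Movie Database": {"movie", "film", "actor", "director"},
    "The Roger Ebert Movie Files": {"movie", "review", "critic"},
    "Gamelan": {"java", "applet", "jpython"},
}

def route(topic_terms, engine_models, top_k=2):
    """Rank engines by how many topic terms appear in their textual model."""
    scored = []
    for name, model in engine_models.items():
        score = len(set(topic_terms) & model)
        if score > 0:
            scored.append((score, name))
    # Highest overlap first; ties broken alphabetically by engine name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]

movie_engines = route({"monty", "movie", "film"}, ENGINE_MODELS)
java_engines = route({"java", "jpython"}, ENGINE_MODELS)
```

With these sample models, the movie topic is routed to the two movie engines and the Java topic to Gamelan only.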

16
  • The Query Expansion model can automatically
    obtain terms relevant to a query from the Web.
  • With the Query Expansion model, it is not
    necessary to maintain a massive dictionary of
    terms covering a wide range of fields.

17
Conclusion
  • Topic centric query routing uses a query
    expansion model.
  • The query expansion model obtains all the
    information necessary for query routing from the
    Web.
  • Thus the query routing model is an intelligent
    agent that uses the Web as its knowledge base and
    identifies the topics of given queries dynamically
    by query expansion.