1. Optimizing Search Engines using Clickthrough Data
Presentation by M. Sükrü Kuran
2. Outline
- Search Engines
- Clickthrough Data
- Learning of Retrieval Functions
- Support Vector Machine (SVM) for Learning of Ranking Functions
- Experiment Setup
- Offline Experiment
- Online Experiment
- Analysis of the Online Experiment
- Conclusion and Future Work
- References
- Questions
3. Search Engines
- Search engines use ranking functions to list results by their relevance to the query
- Current ranking functions are not optimized for relevance
- As an alternative, we can use clickthrough data to learn rankings that are better optimized for relevance
4. Clickthrough Data
- What is clickthrough data?
- Clickthrough data is the set of links that the user selects from the list of links retrieved by the search engine for a given query
- Why is clickthrough data important?
- The clicked links are the most relevant links among the query results, from the user's perspective
- It is easier to acquire than explicit user feedback, since the data is already in the logs of the search engine
5. Clickthrough Data (2)
- Users are less likely to click on a link that has a low ranking (independent of its actual relevance)
- Users typically scan only the first 10 links in the result set [24]
- Thus, clickthrough data does not give an absolute relevance value for the query, but it is a good relative relevance signal
6. Clickthrough Data (3)
- Example: results for a search for SVM
  1. Kernel Machines
  2. Support Vector Machine
  3. SVM-Light Support Vector Machine
  4. An Introduction to Support Vector Machines
  5. Support Vector Machine and Kernel Methods References
  6. Archives of Support Vector Machines
  7. SVM demo Applet
  8. Royal Holloway Support Vector Machine
  9. Support Vector Machine - The Software
  10. Lagrangian Support Vector Machine Home Page
Among the 10 results, only links 1, 3 and 7 are clicked (the clickthrough data).
7. Clickthrough Data (4)
The clicks in the example imply the following pairwise preferences (a binary relation expressing the ranking preferred by the user):
- link3 < link2
- link7 < link2
- link7 < link4
- link7 < link5
- link7 < link6
Here link_i < link_j means the user prefers link_i to be ranked above link_j. We can generalize this preference information: extract link_i < link_j for all pairs with 1 ≤ j < i, where link_i is clicked and link_j is not. (A code sketch of this rule follows below.)
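Below is a minimal Python sketch of this extraction rule; it is not from the paper, and the function name and interface are illustrative assumptions.

```python
# Minimal sketch of preference-pair extraction from clickthrough data
# (slide 7's rule). `ranking` is the list of links in presented order;
# `clicked` is the set of 1-based ranks the user clicked. Illustrative
# interface, not the paper's code.

def preference_pairs(ranking, clicked):
    """For each clicked link i, emit (link_i, link_j) meaning
    'link_i should rank above link_j' for every non-clicked
    link_j presented above it (j < i)."""
    pairs = []
    for i in sorted(clicked):
        for j in range(1, i):
            if j not in clicked:
                pairs.append((ranking[i - 1], ranking[j - 1]))
    return pairs

# The slide's example: links 1, 3 and 7 clicked out of 10 results.
links = [f"link{k}" for k in range(1, 11)]
print(preference_pairs(links, {1, 3, 7}))
# [('link3', 'link2'), ('link7', 'link2'), ('link7', 'link4'),
#  ('link7', 'link5'), ('link7', 'link6')]
```

Running this on the slide's example reproduces exactly the five preference pairs listed above.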
8. Learning of Retrieval Functions
- Goal: we have to find a retrieval function f whose ranking r_{f(q)} is close to the target (user-preferred) ranking r^*
- In order to calculate the similarity between any given r_{f(q)} and r^*, we have to use a performance metric
- Average Precision (binary relevance): very simple
- Kendall's τ: a good performance metric for comparing rankings
9. Learning of Retrieval Functions (2)
- Kendall's τ
- Between any two rankings r_a and r_b of the same result set, the distance is

  \tau(r_a, r_b) = \frac{P - Q}{P + Q} = 1 - \frac{2Q}{\binom{m}{2}}

- D: set of documents in a query result
- P: number of concordant pairs in D x D (pairs ordered the same way by both rankings)
- Q: number of discordant pairs in D x D
- m: number of documents/links in D
(A code sketch computing τ follows below.)
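A minimal sketch computing Kendall's τ directly from this definition; representing a ranking as a dict mapping each document to its position is an illustrative assumption.

```python
from itertools import combinations

# Kendall's tau from the definition on slide 9: count concordant (P)
# and discordant (Q) pairs. Assumes strict rankings (no ties).
# `rank_a` and `rank_b` map each document to its position (1 = best).

def kendall_tau(rank_a, rank_b):
    P = Q = 0
    for d1, d2 in combinations(list(rank_a), 2):
        concordant = (rank_a[d1] - rank_a[d2]) * (rank_b[d1] - rank_b[d2]) > 0
        if concordant:
            P += 1
        else:
            Q += 1
    return (P - Q) / (P + Q)

# Identical rankings give tau = 1; a fully reversed ranking gives -1.
a = {"d1": 1, "d2": 2, "d3": 3}
print(kendall_tau(a, a))                            # 1.0
print(kendall_tau(a, {"d1": 3, "d2": 2, "d3": 1}))  # -1.0
```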
10. Learning of Retrieval Functions (3)
- Problem definition of learning an appropriate retrieval function:
- For a fixed (but unknown) distribution \Pr(q, r^*) of queries and target (user-preferred) rankings, the goal is to maximize the expected Kendall's τ

  \tau_P(f) = \int \tau(r_{f(q)}, r^*) \, d\Pr(q, r^*)

- where \Pr(q, r^*) is the distribution of queries and target rankings
11. Support Vector Machine (SVM) for Learning of Ranking Functions
- Machine learning in information retrieval is usually based on binary classification (a document is either relevant to the query or not)
- Since the information gathered from clickthrough data is not absolute relevance information, we cannot use binary classification directly
12. Support Vector Machine (SVM) for Learning of Ranking Functions (2)
- Using a set of queries and user ranking sets (the training data), we will select a ranking function from a family F of ranking functions
- Selection is based on maximizing the empirical τ on the training set (equivalently, minimizing the number of discordant pairs):

  \tau_S(f) = \frac{1}{n} \sum_{i=1}^{n} \tau(r_{f(q_i)}, r_i^*)

- n: number of queries in the training set
13. Support Vector Machine (SVM) for Learning of Ranking Functions (3)
- Then, we need to find a sound family of ranking functions
- How do we find a family F that includes an efficient ranking function f?
14. Support Vector Machine (SVM) for Learning of Ranking Functions (4)
- Consider the set of linear ranking functions f_w, where for query q document d_i is ranked above document d_j whenever

  w \cdot \Phi(q, d_i) > w \cdot \Phi(q, d_j)

- \Phi(q, d) is a feature mapping describing the match between query q and document d, as in description-based retrieval functions [10, 11]
- w is a weight vector adjusted by learning (a code sketch follows below)
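A minimal sketch of ranking with such a linear function; the weight values and two-dimensional features are made-up illustrations, not the paper's feature set (see slide 17).

```python
import numpy as np

# Linear ranking function from slide 14: order documents by the score
# w . phi(q, d). The feature vectors here are illustrative stand-ins.

def rank_documents(w, features):
    """`features` maps doc id -> feature vector phi(q, d); returns
    doc ids sorted by descending score w . phi(q, d)."""
    return sorted(features, key=lambda d: -np.dot(w, features[d]))

w = np.array([0.7, 0.3])
features = {"d1": np.array([0.2, 0.9]),
            "d2": np.array([0.8, 0.1]),
            "d3": np.array([0.5, 0.5])}
print(rank_documents(w, features))  # ['d2', 'd3', 'd1']
```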
15. Support Vector Machine (SVM) for Learning of Ranking Functions (5)
- Instead of maximizing the goal function directly, we can minimize Q, the number of discordant pairs in our performance measure
- Using classification SVMs [7], this becomes the optimization problem

  minimize:    V(w, \xi) = \frac{1}{2}\, w \cdot w + C \sum \xi_{i,j,k}
  subject to:  w \cdot \Phi(q_k, d_i) \ge w \cdot \Phi(q_k, d_j) + 1 - \xi_{i,j,k}  for all (d_i, d_j) \in r_k^*
               \xi_{i,j,k} \ge 0

(A sketch of the reduction to a classification SVM follows below.)
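Each constraint can be rearranged as w \cdot (\Phi(q_k, d_i) - \Phi(q_k, d_j)) \ge 1 - \xi_{i,j,k}, i.e., a soft-margin classification SVM on pairwise difference vectors. The sketch below illustrates this reduction with scikit-learn's LinearSVC as a stand-in solver (the paper's experiments used SVM-Light [14]); the interface is an illustrative assumption.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Ranking SVM via its classification reduction: train a linear SVM on
# difference vectors phi(q, d_i) - phi(q, d_j), labeled +1 when d_i is
# preferred over d_j (and the mirrored pair labeled -1).

def train_ranking_svm(pref_pairs, C=1.0):
    """`pref_pairs`: list of (phi_winner, phi_loser) feature vectors.
    Returns the learned weight vector w."""
    X, y = [], []
    for phi_i, phi_j in pref_pairs:
        X.append(phi_i - phi_j)   # winner minus loser -> class +1
        y.append(+1)
        X.append(phi_j - phi_i)   # mirrored pair -> class -1
        y.append(-1)
    svm = LinearSVC(C=C, fit_intercept=False)
    svm.fit(np.array(X), np.array(y))
    return svm.coef_.ravel()

# Toy preferences where feature 0 should dominate the ranking.
pairs = [(np.array([0.9, 0.1]), np.array([0.2, 0.8])),
         (np.array([0.7, 0.3]), np.array([0.1, 0.4]))]
w = train_ranking_svm(pairs)
print(w)  # expect w[0] > w[1]
```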
16. Experiment Setup
- A baseline meta-search engine called Striver is used for testing purposes
- Striver forwards each query to MSNSearch, Google, Excite, Altavista and Hotbot
- It acquires the top 100 results from each search engine
- Based on the learned retrieval function, it selects the top 50 of the resulting pool of up to 500 documents (the pool may be smaller when more than one engine has found the same document); a sketch of this step follows below
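A minimal sketch of this merge-and-rank step; `engine_results`, `phi` and the learned weights `w` are illustrative assumptions, not Striver's actual interfaces.

```python
import numpy as np

# Striver-style meta-search step (slide 16): pool the per-engine result
# lists, de-duplicate documents returned by several engines (via a set),
# then keep the top k by the learned score w . phi(d).

def merge_and_rank(engine_results, phi, w, k=50):
    """`engine_results`: one result list (top 100) per engine.
    Returns up to k unique documents ranked by the learned function."""
    pool = {doc for results in engine_results for doc in results}
    return sorted(pool, key=lambda d: -np.dot(w, phi(d)))[:k]
```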
17. Offline Experiment
- Using Striver, 112 queries are recorded
- A large set of features is used to compute the description-based retrieval functions
- Testing is done with different numbers of training-set queries
- Results from Google and MSNSearch are used for benchmarking purposes
18. Offline Experiment (2)
[Results figure: performance of the learned retrieval function for different training-set sizes, benchmarked against Google and MSNSearch.]
19. Online Experiment
- Striver is used by a group of 20 people
- Based on these people's queries, the training set of Striver is composed of 260 queries
- The results are compared with results from Google, MSNSearch and Toprank (a simple meta-search engine)
20. Online Experiment (2)
In the comparison against Google, "more clicks" means users clicked more links in the learned engine's ranking than in Google's; this happened for 29 out of 88 queries. "Less clicks" means users clicked fewer links in the learned engine's ranking than in Google's; this happened for 13 out of 88 queries.
21. Analysis of the Online Experiment
- Since all of the users used the engine for academic searches, the learned function is good for searches on academic research topics
- But it may not give equally good results for different groups of users
- We can say that the learned engine is customizable to its user group, unlike traditional engines
22. Future Work and Conclusions
- What is the optimal group size for user customization?
- Features can be tuned for better performance
- Can clustering algorithms cluster WWW users into subgroups based on their clickthrough data?
- Can malicious users corrupt the learning process by clicking irrelevant links, and how can this be avoided?
23. References
[1] R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley-Longman, Harlow, UK, May 1999.
[2] B. Bartell, G. Cottrell, and R. Belew. Automatic combination of multiple ranked retrieval systems. In Annual ACM SIGIR Conf. on Research and Development in Information Retrieval (SIGIR), 1994.
[3] D. Beeferman and A. Berger. Agglomerative clustering of a search engine query log. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2000.
[4] B. E. Boser, I. M. Guyon, and V. N. Vapnik. A training algorithm for optimal margin classifiers. In D. Haussler, editor, Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, pages 144-152, 1992.
[5] J. Boyan, D. Freitag, and T. Joachims. A machine learning architecture for optimizing web search engines. In AAAI Workshop on Internet Based Information Systems, August 1996.
[6] W. Cohen, R. Schapire, and Y. Singer. Learning to order things. Journal of Artificial Intelligence Research, 10, 1999.
[7] C. Cortes and V. N. Vapnik. Support-vector networks. Machine Learning Journal, 20:273-297, 1995.
[8] K. Crammer and Y. Singer. Pranking with ranking. In Advances in Neural Information Processing Systems (NIPS), 2001.
[9] Y. Freund, R. Iyer, R. Schapire, and Y. Singer. An efficient boosting algorithm for combining preferences. In International Conference on Machine Learning (ICML), 1998.
[10] N. Fuhr. Optimum polynomial retrieval functions based on the probability ranking principle. ACM Transactions on Information Systems, 7(3):183-204, 1989.
[11] N. Fuhr, S. Hartmann, G. Lustig, M. Schwantner, K. Tzeras, and G. Knorz. AIR/X - a rule-based multistage indexing system for large subject fields. In RIAO, pages 606-623, 1991.
[12] R. Herbrich, T. Graepel, and K. Obermayer. Large margin rank boundaries for ordinal regression. In Advances in Large Margin Classifiers, pages 115-132. MIT Press, Cambridge, MA, 2000.
[13] K. Hoffgen, H. Simon, and K. van Horn. Robust trainability of single neurons. Journal of Computer and System Sciences, 50:114-125, 1995.
[14] T. Joachims. Making large-scale SVM learning practical. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning, chapter 11. MIT Press, Cambridge, MA, 1999.
[15] T. Joachims. Learning to Classify Text Using Support Vector Machines: Methods, Theory, and Algorithms. Kluwer, 2002.
[16] T. Joachims. Unbiased evaluation of retrieval quality using clickthrough data. Technical report, Cornell University, Department of Computer Science, 2002. http://www.joachims.org.
[17] T. Joachims, D. Freitag, and T. Mitchell. WebWatcher: a tour guide for the world wide web. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), volume 1, pages 770-777. Morgan Kaufmann, 1997.
[18] J. Kemeny and L. Snell. Mathematical Models in the Social Sciences. Ginn & Co., 1962.
[19] M. Kendall. Rank Correlation Methods. Hafner, 1955.
24. Questions
- Experiment results?
- Clickthrough data?
- Machine learning for retrieval functions?
- Retrieval functions?