Title: Learning to Rank -- A Brief Review
1. Learning to Rank -- A Brief Review
2. Ranking and sorting
- Ranking: there are only K structured categories
- Sorting: each sample has a distinct rank
- Generally, there is no need to differentiate them
3. Overview
- Rank aggregation
- Label ranking
- Query and rank by example
- Preference learning
- Open problems, and what we can do
4. Ranking aggregation
- The need to combine different ranking results
- Voting systems, welfare economics, decision making
- 1. Hillary Clinton > John Edwards > Barack Obama
- 2. Barack Obama > John Edwards > Hillary Clinton
- Aggregated: ?
5. Ranking aggregation (cont.)
- Arrow's impossibility theorem
- Kenneth Arrow, 1951
- If the decision-making body has at least two members and at least three options to decide among, then it is impossible to design a social welfare function that satisfies all of these conditions at once.
6. Ranking aggregation (cont.)
- Arrow's impossibility theorem
- 5 fairness assumptions: non-dictatorship; unrestricted domain (universality); independence of irrelevant alternatives; positive association of social and individual values (monotonicity); non-imposition (citizen sovereignty)
- These cannot all be satisfied simultaneously
7. Ranking aggregation (cont.)
- Borda's method (1770)
- Given k ranked lists t_1, ..., t_k, each with n items
- For each list t_i, define B_i(j) as the number of items ranked below item j in t_i
- Rank all items by the total score B(j) = B_1(j) + ... + B_k(j)
- Example: Hillary Clinton 2, John Edwards 2, Barack Obama 2 -- a three-way tie
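The Borda scoring above can be sketched in a few lines. This is a minimal illustration using the two candidate lists from slide 4 (`borda_scores` is a hypothetical helper name); the three-way tie matches the slide's example.

```python
def borda_scores(lists):
    """Total Borda score per item: in each list, an item earns one
    point for every item ranked below it."""
    scores = {}
    for ranking in lists:
        n = len(ranking)
        for pos, item in enumerate(ranking):
            # item at position `pos` has n - 1 - pos items below it
            scores[item] = scores.get(item, 0) + (n - 1 - pos)
    return scores

lists = [
    ["Clinton", "Edwards", "Obama"],   # voter 1
    ["Obama", "Edwards", "Clinton"],   # voter 2
]
scores = borda_scores(lists)           # every candidate scores 2: a tie
```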
8. Ranking aggregation (cont.) -- Borda
- Condorcet criterion
- If the majority prefers x to y, then x must be ranked above y
- Borda's method does not satisfy the Condorcet criterion, nor does any method that assigns fixed weights to rank positions
9. Ranking aggregation (cont.)
- Assumption relaxation
- Maximize a consensus criterion
- Equivalent to minimizing disagreement (Kemeny, social choice theory) -- NP-hard!
- Sub-optimal solutions using heuristics
10. Ranking aggregation (cont.)
- Basic idea
- Assign different weights to different experts
- Supervised aggregation
- Weighting according to a final judge (ground truth)
- Unsupervised aggregation
- Aims to minimize the disagreement measured by certain distances
11. Ranking aggregation (cont.)
- Distance measures
- Spearman footrule distance
- Kendall tau distance
- Kendall tau distance for multiple lists
- Scaled footrule distance
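The first two distances are easy to make concrete. A small sketch (`footrule` and `kendall_tau` are hypothetical helper names):

```python
from itertools import combinations

def footrule(sigma, tau):
    """Spearman footrule: sum over items of the absolute difference
    between their positions in the two lists."""
    ps = {x: i for i, x in enumerate(sigma)}
    pt = {x: i for i, x in enumerate(tau)}
    return sum(abs(ps[x] - pt[x]) for x in sigma)

def kendall_tau(sigma, tau):
    """Kendall tau distance: number of item pairs that the two
    lists order differently."""
    ps = {x: i for i, x in enumerate(sigma)}
    pt = {x: i for i, x in enumerate(tau)}
    return sum(1 for x, y in combinations(sigma, 2)
               if (ps[x] - ps[y]) * (pt[x] - pt[y]) < 0)

a = ["A", "B", "C", "D"]
b = ["B", "A", "D", "C"]
fd = footrule(a, b)      # 1 + 1 + 1 + 1 = 4
kt = kendall_tau(a, b)   # pairs (A,B) and (C,D) are swapped -> 2
```

The two are always within a factor of two of each other (K <= F <= 2K, the Diaconis-Graham inequality), which is why footrule-based aggregation is a tractable proxy for Kendall-optimal aggregation.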
12. Ranking aggregation (cont.) -- Distance measures
- Kemeny optimal ranking
- Minimizes the Kendall tau distance
- Still NP-hard to compute
- Local Kemenization (locally optimal aggregation)
- Can be computed in O(kn log n)
13. Ranking aggregation (cont.)
- Supervised Rank Aggregation (SRA, WWW '07)
- Ground-truth preference matrix H
- Example
- Goal: rank by the score
- It can be seen that ..., or with relaxation
14. Ranking aggregation (cont.) -- SRA
- Method
- Use Borda's score
- Objective
15. Ranking aggregation (cont.)
- Markov Chain Rank Aggregation (MCRA, WWW '05)
- Map the ranked lists to a Markov chain M
- Compute the stationary distribution pi of M
- Rank items based on pi
- Example
- B > C > D
- A > D > E
- A > B > E
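A hedged sketch of the idea on the slide's three example lists, in the spirit of the MC2 strategy below: from the current item, pick a list containing it, then move uniformly to an item ranked at least as high in that list. The teleport term (d = 0.85) is an assumption borrowed from PageRank to keep the chain ergodic; the slide does not specify the smoothing.

```python
# Three ranked lists from the slide's example.
lists = [["B", "C", "D"], ["A", "D", "E"], ["A", "B", "E"]]
items = sorted({x for l in lists for x in l})
idx = {x: i for i, x in enumerate(items)}
n = len(items)

# Build the row-stochastic transition matrix P (MC2-like strategy).
P = [[0.0] * n for _ in range(n)]
for x in items:
    containing = [l for l in lists if x in l]
    for l in containing:
        higher = l[: l.index(x) + 1]          # items ranked >= x in l
        for y in higher:
            P[idx[x]][idx[y]] += 1.0 / (len(containing) * len(higher))

# Power iteration with teleport to get the stationary distribution pi.
d = 0.85
pi = [1.0 / n] * n
for _ in range(200):
    pi = [(1 - d) / n + d * sum(pi[i] * P[i][j] for i in range(n))
          for j in range(n)]

ranking = sorted(items, key=lambda x: -pi[idx[x]])  # A should come first
```

Item A tops the aggregate here because it is ranked first by two of the three lists, so most of the chain's mass flows into it.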
16. Ranking aggregation (cont.) -- MCRA
- Different transition strategies
- MC1: all out-degree edges have uniform probabilities
- MC2: choose a list, then choose the next item on that list
- For a disconnected graph, define transition probabilities based on item similarity
17. Ranking aggregation (cont.)
- Unsupervised Learning Algorithm for Rank Aggregation (ULARA, Dan Roth, ECML '07)
- Goal
- Method: maximize agreement
18. Ranking aggregation (cont.) -- ULARA
- Method
- Algorithm: iterative gradient descent
- Initially, w is uniform, then updated iteratively
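The slide omits ULARA's exact objective, so the following is only a generic sketch of the idea it names: start from uniform expert weights and iteratively downweight experts whose lists disagree (in Kendall tau distance) with the current weighted-Borda consensus. The update rule and learning rate `eta` are assumptions, not the paper's.

```python
from itertools import combinations

def kendall(a, b):
    """Number of pairs the two lists order differently."""
    pa = {x: i for i, x in enumerate(a)}
    pb = {x: i for i, x in enumerate(b)}
    return sum(1 for x, y in combinations(a, 2)
               if (pa[x] - pa[y]) * (pb[x] - pb[y]) < 0)

def weighted_borda(lists, w):
    """Rank items by their weight-scaled Borda scores."""
    score = {}
    for wi, l in zip(w, lists):
        n = len(l)
        for pos, x in enumerate(l):
            score[x] = score.get(x, 0.0) + wi * (n - 1 - pos)
    return sorted(score, key=lambda x: -score[x])

lists = [["A", "B", "C", "D"],
         ["A", "B", "D", "C"],
         ["D", "C", "B", "A"]]   # one outlier expert
w = [1.0 / len(lists)] * len(lists)
eta = 0.1
for _ in range(50):
    consensus = weighted_borda(lists, w)
    # multiplicative downweighting by disagreement with the consensus
    w = [wi * (1 - eta) ** kendall(l, consensus) for wi, l in zip(w, lists)]
    s = sum(w)
    w = [wi / s for wi in w]     # renormalize the weights

final = weighted_borda(lists, w)  # the outlier's influence vanishes
```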
19. Overview
- Rank aggregation
- Label ranking
- Query and rank by example
- Preference learning
- Open problems, and what we can do
20. Label Ranking
- Goal: map from the input space to the set of total orders over a finite set of labels
- Related to multi-label and multi-class problems
- Example: Input: customer information; Output: Porsche > Toyota > Ford
21. Label Ranking (cont.)
- Pairwise ranking (ECML '03)
- Train a classifier for each pair of labels
- When judging an example: if the pairwise classifier prefers one label of its pair, count that as a vote for the preferred label
- Then rank all labels according to their votes
- k(k-1)/2 classifiers in total
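A minimal sketch of the voting step above. It assumes the k(k-1)/2 pairwise classifiers are already trained; here each one is stubbed as a function that always prefers a fixed label, and the label names come from the slide's car example.

```python
from itertools import combinations

labels = ["Porsche", "Toyota", "Ford"]

def make_stub(winner):
    # Hypothetical trained pairwise classifier: always picks `winner`.
    return lambda x: winner

classifiers = {
    ("Porsche", "Toyota"): make_stub("Porsche"),
    ("Porsche", "Ford"): make_stub("Porsche"),
    ("Toyota", "Ford"): make_stub("Toyota"),
}
assert len(classifiers) == len(labels) * (len(labels) - 1) // 2

def rank_labels(x):
    """Count one vote per pairwise duel, then sort labels by votes."""
    votes = {l: 0 for l in labels}
    for pair, clf in classifiers.items():
        votes[clf(x)] += 1
    return sorted(labels, key=lambda l: -votes[l])

ranking = rank_labels({"customer": "example"})  # Porsche > Toyota > Ford
```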
22. Label Ranking (cont.)
- Constraint Classification (NIPS '02)
- Consider a linear sorting function: each label i is scored by an inner product <w_i, x>
- Goal: learn the weight vectors w_1, ..., w_k
- Rank all labels by their scores
23. Label Ranking (cont.) -- CC
- Expand the feature vector into a k*d-dimensional space
- Generate positive/negative samples in the expanded space
24. Label Ranking (cont.) -- CC
- Learn a separating hyperplane
- Can be solved by an SVM
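A sketch of the Kesler-style expansion these slides refer to: for k labels and d features, the constraint "label i above label j" becomes a point in k*d dimensions with +x in block i and -x in block j; its negation is the negative sample. The helper `expand` and the example values are illustrative, not from the paper.

```python
def expand(x, i, j, k):
    """Embed the constraint 'label i ranked above label j' for sample x
    into a k*d-dimensional vector: +x in block i, -x in block j."""
    d = len(x)
    v = [0.0] * (k * d)
    for t, xt in enumerate(x):
        v[i * d + t] = xt
        v[j * d + t] = -xt
    return v

k, x = 3, [1.0, 2.0]
pos_sample = expand(x, 0, 2, k)      # constraint: label 0 above label 2
neg_sample = [-u for u in pos_sample]
```

A hyperplane w in this expanded space decomposes into per-label weight vectors w_1, ..., w_k, which is exactly what the linear sorting function on slide 22 needs.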
25. Overview
- Rank aggregation
- Label ranking
- Query and rank by example
- Preference learning
- Open problems, and what we can do
26. Query and rank by example
- Given one query, rank the retrieved items according to their relevance w.r.t. the query.
27. Query and rank by example (cont.)
- Ranking on a manifold
- Convergence form
- Essentially, this is a one-class semi-supervised method
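A small sketch of the manifold-ranking iteration f <- alpha * S f + (1 - alpha) * y on a toy three-node chain graph A-B-C, with A as the query. S is the symmetrically normalized affinity matrix D^{-1/2} W D^{-1/2}; the graph and alpha = 0.5 are made-up values for illustration.

```python
import math

W = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]                      # chain graph: A-B, B-C
deg = [sum(row) for row in W]
S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(3)]
     for i in range(3)]              # normalized affinity matrix

y = [1.0, 0.0, 0.0]                  # node A is the query
alpha = 0.5
f = y[:]
for _ in range(200):                 # iterate to (near) convergence
    f = [alpha * sum(S[i][j] * f[j] for j in range(3)) + (1 - alpha) * y[i]
         for i in range(3)]
```

The converged scores decay with distance from the query along the manifold (A > B > C here), which is the ranking behavior the slide describes.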
28. Preference learning
- Given a set of items and a set of user preferences over these items, rank all items according to those preferences.
- Motivated by the need for personalized search.
29. Preference learning
- Input: items X, and preferences given as a set of partial orders on X
- Output: a total order on X
- or, a map from X onto a structured label space Y
- Preference function
30. Existing methods
- Learning to Order Things, W. Cohen '98
- Large Margin Ordinal Regression, R. Herbrich '98
- PRanking with Ranking, K. Crammer '01
- Optimizing Search Engines using Clickthrough Data, T. Joachims '02
- An Efficient Boosting Algorithm for Combining Preferences, Y. Freund '03
- Classification Approach towards Ranking and Sorting Problems, S. Rajaram '03
31. Existing methods
- Learning to Rank using Gradient Descent, C. Burges '05
- Stability and Generalization of Bipartite Ranking, S. Agarwal '05
- Generalization Bounds for k-Partite Ranking, S. Rajaram '05
- Ranking with a p-norm push, C. Rudin '05
- Magnitude-Preserving Ranking Algorithms, C. Cortes '07
- From Pairwise Approach to Listwise Approach, Z. Cao '07
32. Large Margin Ordinal Regression
- Map each sample to an axis using the inner product <w, x>
33. Large Margin Ordinal Regression
- Consider thresholds b_1 <= ... <= b_{K-1} separating the K ranks on that axis
- Then the rank of x is determined by the interval into which <w, x> falls
- Introduce soft margins
- Solve using an SVM
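The decision rule (not the learning step) can be sketched directly: project x onto the axis via <w, x> and read off the rank from the thresholds. The values of `w` and `thresholds` below are made up for illustration; learning them under soft-margin constraints is the SVM problem the slide refers to.

```python
w = [1.0, -0.5]                  # hypothetical learned direction
thresholds = [0.0, 1.0]          # boundaries b_1 < b_2 between 3 ranks

def predict_rank(x):
    """Rank = number of thresholds the projected score exceeds."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    return sum(1 for b in thresholds if s > b)

r_low = predict_rank([-1.0, 0.0])   # score -1.0 -> lowest rank
r_mid = predict_rank([0.5, 0.0])    # score  0.5 -> middle rank
r_high = predict_rank([2.0, 0.0])   # score  2.0 -> highest rank
```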
34. Learn to order things
- A greedy ordering algorithm to order things: calculate a score for each item, then emit items greedily by score
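The greedy step can be sketched as follows: with a pairwise preference function PREF(u, v) in [0, 1], each remaining item gets the potential sum_v (PREF(u, v) - PREF(v, u)); the maximum-potential item is emitted, removed, and the potentials are recomputed. The toy preference function below is an assumption for illustration.

```python
def greedy_order(items, pref):
    """Greedily order items by their preference-function potential."""
    remaining = list(items)
    ordering = []
    while remaining:
        def potential(u):
            return sum(pref(u, v) - pref(v, u)
                       for v in remaining if v != u)
        best = max(remaining, key=potential)   # highest net preference
        ordering.append(best)
        remaining.remove(best)
    return ordering

# toy preference: prefer numerically larger items with full confidence
pref = lambda u, v: 1.0 if u > v else 0.0
order = greedy_order([2, 5, 1, 4], pref)       # descending order
```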
35. Learn to order things (cont.)
- Combine different ranking functions
- Learn the weights iteratively
36. Learn to order things
- Combine preference functions
- Do ranking aggregation
- Update weights based on feedback
37. Learn to order things (cont.)
- Initially, w is uniform
- At each step:
- Compute a combined ranking function
- Produce a ranking aggregation
- Measure the loss
38. RankBoost
- Bipartite ranking problems
- Combines weak rankers into H(x) = sum_t alpha_t h_t(x)
- Sort items based on the values of H(x)
39. RankBoost (cont.)
- Bipartite ranking problem
- Initialize the sampling distribution
- At each round: learn a weak ranker, update the sampling distribution, normalize
- Finally, combine the weak rankers
40. Stability and Generalization
- Bipartite ranking problems
- Expected rank error
- Empirical rank error
41. Stability and Generalization (cont.)
- Stability: remove one training sample -- how much does the output change?
- Generalization
- Generalizes to the k-partite ranking problem
42. Rank on graph data
43. P-norm push
- Focus on the topmost ranked items
- The top-left region is the most important
44. P-norm push (cont.)
- Height of k (k is a negative sample): the number of positive samples ranked below k
- Cost of sample k: g(height of k)
- g is convex and monotonically increasing
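A small numeric sketch of this objective on toy scores, taking g(h) = h**p as the convex, increasing cost (the scores and the helper name `pnorm_push_loss` are made up): larger p concentrates the penalty on negatives near the top of the list.

```python
def pnorm_push_loss(pos_scores, neg_scores, p):
    """Sum of g(height) over negatives, with g(h) = h ** p."""
    loss = 0.0
    for k in neg_scores:
        # height of k: number of positive samples scored below k
        height = sum(1 for s in pos_scores if s < k)
        loss += height ** p
    return loss

pos = [0.9, 0.8, 0.4, 0.3]
neg = [0.85, 0.2, 0.1]          # 0.85 is a high-ranked negative

l1 = pnorm_push_loss(pos, neg, 1)   # p = 1: plain misranked-pair count
l4 = pnorm_push_loss(pos, neg, 4)   # p = 4: hammers the top negative
```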
45. P-norm push (cont.)
- Run RankBoost to solve the problem
46. Thanks!