Bayesian Ranking using Expectation Propagation and Factor Graphs

Slides: 35

Transcript and Presenter's Notes

1
Bayesian Ranking using Expectation Propagation
and Factor Graphs

Dumitru Erhan
LISA/DIRO @ Université de Montréal

2
Preface

Not my work (at all)
EP: Tom Minka
TrueSkill™: Ralf Herbrich & Thore Graepel @ MSR Cambridge (UK)
TrueChess: Pierre Dangauthier @ INRIA Rhône-Alpes (France)
Slides, plots, and results taken with permission

3
Outline

Problem setting
Xbox Live
Factor Graphs
Exact inference in Factor Graphs
Approximate inference using EP
Loopy schedules and chess ratings
Results

4
The Ranking Problem

Vaguely speaking:
Input: ordered subsets of data
Output: a ranking function
For example:
Chess
Online games
Movie ratings
Internet search

5
Modelling Ranking

Ordinal regression
Order learning

[Figure: ordinal regression maps f(x) onto Rank 1 through Rank 5, giving rank(x); order learning places items a, b, c at f(a), f(b), f(c) along f(x)]

6
Xbox Live
7
Modelling the Bayesian Way I

Track belief distributions
Allow performance variations
Model game outcome

8
Modelling the Bayesian Way II

This leads to a probit-based likelihood
Posterior is not Gaussian!
Implications for inference, tracking, etc.
What if we could:
obtain a nice visualization of the model,
stay in the Gaussian/exponential family, and
perform the approximations efficiently?
Factor Graphs + Expectation Propagation!
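
To make the probit-based likelihood concrete, here is a minimal sketch. It assumes each player's performance is the skill plus independent Gaussian noise with standard deviation `beta`; the names `beta` and `win_probability` are illustrative and do not come from the slides.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def win_probability(s1: float, s2: float, beta: float) -> float:
    """P(player 1 beats player 2) for known skills s1 and s2.

    Assumption: performance_i = s_i + N(0, beta^2) noise, and the
    player with the higher performance wins (draws are ignored here).
    """
    # The performance difference is Gaussian with mean s1 - s2 and
    # variance 2 * beta^2, so the win event has the probit form below.
    return normal_cdf((s1 - s2) / (math.sqrt(2.0) * beta))


if __name__ == "__main__":
    print(win_probability(25.0, 25.0, 4.0))
    print(win_probability(30.0, 25.0, 4.0))
```

Multiplying a Gaussian prior over skills by this likelihood gives a posterior that is no longer Gaussian, which is the difficulty the slide points at.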

9
Factor Graphs mini intro

A bipartite graph that represents the factorization of a mathematical function
Nodes: factors and variables
Function: product of all factors
Edges: dependencies of factors on variables

[Figure: example factor graph over the variables x, y, z]

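
A factor graph can be sketched as plain data: variables, plus factors that each name the variables they touch. The example below is a toy under stated assumptions; the binary domains and the two factor tables are invented for illustration and are not the factors drawn on the slide.

```python
from itertools import product

# Hypothetical binary variables; the slide only names x, y, z.
variables = {"x": [0, 1], "y": [0, 1], "z": [0, 1]}

# Each factor lists the variables it depends on (its edges) and a function.
# Both tables are made up for this sketch.
factors = [
    (("x", "y"), lambda x, y: 2.0 if x == y else 1.0),
    (("y", "z"), lambda y, z: 3.0 if y != z else 1.0),
]


def joint(assignment):
    """Unnormalized function value: the product of all factors."""
    value = 1.0
    for scope, f in factors:
        value *= f(*(assignment[name] for name in scope))
    return value


def marginal(name):
    """Brute-force marginal of one variable, normalized to sum to 1."""
    totals = {v: 0.0 for v in variables[name]}
    names = list(variables)
    for values in product(*(variables[n] for n in names)):
        assignment = dict(zip(names, values))
        totals[assignment[name]] += joint(assignment)
    z = sum(totals.values())
    return {v: t / z for v, t in totals.items()}


if __name__ == "__main__":
    print(joint({"x": 0, "y": 0, "z": 1}))
    print(marginal("x"))
```

The brute-force marginal enumerates every joint assignment, so its cost grows exponentially with the number of variables; message passing exploits the graph structure to avoid that.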
10
Factor Graphs continued

Used for modelling joint PDFs
Interested in marginals of the type P(hidden | observed)
Use the sum-product algorithm/belief propagation to compute them

11
Sum-Product Algorithm I
[Figure: factor graph with variables v, w, x, y, z and factors f1(v,w), f2(w,x), f3(x,y), f4(x,z)]

Observation: the sum of products becomes a product of sums of all messages from neighboring factors to the variable!

12
Sum-Product Algorithm II
[Figure: the same factor graph, showing variables w, x, y, z and factors f2(w,x), f3(x,y), f4(x,z)]

Observation: factors only need to sum out all their local variables!

13
Sum-Product Algorithm III
[Figure: the same factor graph, showing variables x, y, z and factors f2(w,x), f3(x,y), f4(x,z)]

Observation: variables pass on the product of all incoming messages!
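
The three observations can be checked numerically on the graph from these slides, with factors f1(v,w), f2(w,x), f3(x,y), f4(x,z). This is a sketch only: the variables are assumed binary and the factor tables are invented, since the slides give no numbers.

```python
from itertools import product

K = 2  # assumption: every variable is binary in this toy example


# Hypothetical positive factor tables (not from the slides).
def f1(v, w):
    return 1.0 + v + 2 * w


def f2(w, x):
    return 2.0 if w == x else 0.5


def f3(x, y):
    return 1.0 + x * y


def f4(x, z):
    return 3.0 - x - z


def brute_force_marginal_x():
    """Sum the full product over v, w, y, z for each value of x."""
    out = [0.0] * K
    for v, w, x, y, z in product(range(K), repeat=5):
        out[x] += f1(v, w) * f2(w, x) * f3(x, y) * f4(x, z)
    return out


def sum_product_marginal_x():
    """Same unnormalized marginal, computed with local messages."""
    # Factor f1 sums out its local variable v: message f1 -> w.
    m_f1_w = [sum(f1(v, w) for v in range(K)) for w in range(K)]
    # Variable w passes on the product of its other incoming messages
    # (only one here), giving the message w -> f2.
    m_w_f2 = m_f1_w
    # Factor f2 sums out w: message f2 -> x.
    m_f2_x = [sum(f2(w, x) * m_w_f2[w] for w in range(K)) for x in range(K)]
    # Factors f3 and f4 sum out the leaves y and z: messages to x.
    m_f3_x = [sum(f3(x, y) for y in range(K)) for x in range(K)]
    m_f4_x = [sum(f4(x, z) for z in range(K)) for x in range(K)]
    # The marginal of x is the product of all incoming factor messages.
    return [m_f2_x[x] * m_f3_x[x] * m_f4_x[x] for x in range(K)]


if __name__ == "__main__":
    print(brute_force_marginal_x())
    print(sum_product_marginal_x())
```

Both functions return the same unnormalized marginal; the message-passing version only ever sums over one variable at a time.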

14
Belief Propagation

Concept of a message from node X to node Y: X tells Y what state Y should be in
First propagate observed data
Then nodes exchange messages (start with leaves)
Messages: priors and conditional probabilities → updates of beliefs
Belief(x) = product of incoming messages
Basically, unnormalized marginals
Pass messages until convergence
If the graph is a tree: convergence is guaranteed
If not

15
Approximate message passing

Problem: the exact messages from factors to variables may not be closed under products
TrueSkill™: Gaussian × step function ≠ Gaussian
Solution: approximate the marginal as well as possible in the sense of minimal KL divergence
Expectation Propagation: approximate the marginal by so-called moment matching
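
As a concrete case of moment matching, the sketch below takes a Gaussian N(mu, sigma^2) multiplied by the step function I(x > eps) and returns the mean and variance of the resulting truncated density, which define the closest Gaussian in the KL sense. The function names and the threshold name `eps` are illustrative choices, not taken from the slides.

```python
import math


def normal_pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def moment_match_truncated(mu: float, sigma: float, eps: float = 0.0):
    """Mean and variance of N(mu, sigma^2) restricted to x > eps.

    Note: this direct formula loses accuracy when (mu - eps) / sigma is
    very negative, because the CDF in the denominator underflows.
    """
    t = (mu - eps) / sigma
    v = normal_pdf(t) / normal_cdf(t)  # additive correction to the mean
    w = v * (v + t)                    # multiplicative shrinkage of the variance
    return mu + sigma * v, sigma * sigma * (1.0 - w)


if __name__ == "__main__":
    # Standard normal cut at zero gives the half-normal distribution.
    print(moment_match_truncated(0.0, 1.0, 0.0))
```

For a standard normal truncated at zero this reproduces the half-normal moments, mean sqrt(2/pi) and variance 1 - 2/pi.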

16
Expectation Propagation
[Figure: a message turns the old marginal into a new marginal; the exact result is compared with the approximate one]

17
Tom Minka's thesis in two lines

Approximate
By
Iterate
Pick a factor
Remove its influence
Project and refine
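
The formulas on this slide did not survive in the transcript. As a hedged reconstruction, the standard statement of Expectation Propagation matches the three steps listed above; the symbols (f_a for a factor, \tilde f_a for its approximation, q for the approximate posterior, \mathcal{F} for the chosen exponential family) are the usual textbook notation and may differ from what the slide showed.

```latex
\begin{align*}
\text{approximate}\quad & p(x) \propto \prod_{a} f_a(x)
  \quad\text{by}\quad q(x) \propto \prod_{a} \tilde f_a(x), \\
\text{remove:}\quad & q^{\setminus a}(x) \propto \frac{q(x)}{\tilde f_a(x)}, \\
\text{project:}\quad & q^{\text{new}} = \operatorname*{arg\,min}_{q' \in \mathcal{F}}
  \mathrm{KL}\!\left( \tfrac{1}{Z_a}\, f_a\, q^{\setminus a} \,\middle\|\, q' \right), \\
\text{refine:}\quad & \tilde f_a(x) \propto \frac{q^{\text{new}}(x)}{q^{\setminus a}(x)}.
\end{align*}
```

When \mathcal{F} is the Gaussian family, the projection step reduces to matching the mean and variance of f_a q^{\setminus a}.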

18
Formal Problem Setting

Problem setting:
k teams of n_1, ..., n_k players
The outcome is a ranking among the teams (including draws)
Questions:
Skill s_i of each player, such that the higher the skill, the more likely the win
Global ranking among all players
High-quality matches among k teams

19
TrueSkill™ Factor Graph

Player 1 wins over the team of Players 2 and 3, which draws with Player 4

[Figure: factor graph with individual skills s1, s2, s3, s4; team performances t1, t2, t3; performance differences d1, d2]

20
TrueSkill™ Model Details

Priors
Hidden variables
Performance
Team performance
Likelihood
Win
Draw
Skill evolution
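
The equations for these items were lost in the transcript. The block below restates them as they appear in the published TrueSkill model, as a sketch rather than a copy of the slide: the symbols \beta (performance noise), \varepsilon (draw margin), \tau (dynamics noise) and A_j (set of players in team j) are assumed notation.

```latex
\begin{align*}
\text{Prior:}\quad & p(s_i) = \mathcal{N}(s_i;\, \mu_i, \sigma_i^2) \\
\text{Performance:}\quad & p(p_i \mid s_i) = \mathcal{N}(p_i;\, s_i, \beta^2) \\
\text{Team performance:}\quad & t_j = \sum_{i \in A_j} p_i \\
\text{Win (team $j$ over team $j+1$):}\quad & t_j - t_{j+1} > \varepsilon \\
\text{Draw:}\quad & |t_j - t_{j+1}| \le \varepsilon \\
\text{Skill evolution:}\quad & p(s_i^{\text{new}} \mid s_i) = \mathcal{N}(s_i^{\text{new}};\, s_i, \tau^2)
\end{align*}
```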

21
More details and assumptions

Specifies an order on the real line
OK if we agree that 1-d is good enough
Draws: transitivity not good
Assume and
A mini-FG is generated each time!
EP updates can be done efficiently
Moments of a truncated Gaussian
Information flows forward only
No updates in the light of future data
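
For the smallest possible mini-FG, two players and one game with a winner, the EP update has a closed form built from the moments of a truncated Gaussian. The sketch below follows the two-player update from the published TrueSkill model. The default values (mu = 25, sigma = 25/3, beta = 25/6) are the commonly quoted TrueSkill defaults and are an assumption here, as are all function and parameter names.

```python
import math


def normal_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def update_two_players(winner, loser, beta=25.0 / 6.0, eps=0.0):
    """Update (mu, sigma) beliefs after `winner` beats `loser`.

    eps is the draw margin in performance units; eps = 0 means draws
    are impossible. Draw outcomes are not handled in this sketch.
    """
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    # Total standard deviation of the performance difference.
    c = math.sqrt(2.0 * beta ** 2 + sigma_w ** 2 + sigma_l ** 2)
    t = (mu_w - mu_l - eps) / c
    v = normal_pdf(t) / normal_cdf(t)  # mean correction (truncated Gaussian)
    w = v * (v + t)                    # variance correction
    new_winner = (
        mu_w + sigma_w ** 2 / c * v,
        sigma_w * math.sqrt(1.0 - sigma_w ** 2 / c ** 2 * w),
    )
    new_loser = (
        mu_l - sigma_l ** 2 / c * v,
        sigma_l * math.sqrt(1.0 - sigma_l ** 2 / c ** 2 * w),
    )
    return new_winner, new_loser


if __name__ == "__main__":
    prior = (25.0, 25.0 / 3.0)
    print(update_two_players(prior, prior))
```

Both uncertainties shrink after the game, and the size of the mean shift depends on how surprising the outcome was. Because only the two current beliefs are touched, information flows forward in time only, as the slide notes.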

22
The Alternative: ELO

Quite similar:
Performances distributed around fixed skills
Win probability
Skill updates
Linear update
Differences:
No uncertainty tracking
Linearized updates
No notion of teams, multiple players/teams, etc.
Not a generative model
TrueSkill™ is a generalization of ELO
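
For comparison, here is a minimal sketch of an ELO-style linear update. It uses the widely known logistic form with a 400-point scale and a K-factor of 32; the slide's own formulas are missing from the transcript and may have used a Gaussian win probability instead, so treat the constants and function names as assumptions.

```python
def elo_expected(r_a: float, r_b: float) -> float:
    """Expected score of player A against player B (logistic form)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Return updated ratings; score_a is 1 for a win, 0.5 draw, 0 loss."""
    # Linear update: move each rating in proportion to the surprise.
    delta = k * (score_a - elo_expected(r_a, r_b))
    return r_a + delta, r_b - delta


if __name__ == "__main__":
    print(elo_update(1500.0, 1500.0, 1.0))  # (1516.0, 1484.0)
```

Each player carries a single number, so there is no uncertainty to shrink: the step size is fixed by `k` no matter how many games a player has already played.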

23
Experimental setup

Types of experiments:
Team ranking
Match quality
Win probability
Convergence properties
Ultimate goals:
Provide reliable rankings
Better game experience

24
Data: Halo 2 Multiplayer Beta

Publicly available
The real dataset is much larger
Number of games: 60022
Number of players: 5943
Parameters in all experiments:
Performance variation factor: 60
Draw probability: 5
Dynamics variation factor: 2

25
Convergence properties
[Figure: level (0 to 40) plotted over a horizontal axis from 0 to 400 for Player 1 and Player 2, each under TrueSkill and under ELO]

26
Win probability
27
Other results

TrueSkill™ is better at predicting tight matches
The additive team performance assumption does not hold in some cases (Capture-the-Flag)
There are some feedback loop issues

28
TrueSkill™ conclusions

Every Xbox 360 Live game uses TrueSkill™
Service launched in November 2005
Distinguishing properties:
is a generalization of ELO
tracks a belief distribution
can deal with multiple teams/players/draws
First real-world implementation of EP
However:
Draws are handled somewhat strangely (hack)
Information flows only forward in time

29
What if...

... we created a schedule that passes messages back in time?
Effectively, this means that future information is used for updating the current beliefs!
However, the FG is not a tree now
Loopy message passing schedule
Too much data in case of Xbox Live
Let's do chess instead!
Makes sense: the game graph is not very connected in time
Hard to have a fair comparison between players

30
Chess Factor Graph
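[Figure: chess factor graph with skills S1, S2; performance noise leading to performances P1, P2; difference D = P1 - P2; win condition D > eps; example outcomes between Morphy and Paulsen (Morphy > Paulsen) for games in 1857]
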
31
Chess dataset