A Statistical Analysis of the Precision-Recall Graph - PowerPoint PPT Presentation

Description:

AMS-IMS-SIAM Joint Summer Research Conference on Machine Learning, Statistics, and Discovery.


Transcript and Presenter's Notes



1
A Statistical Analysis of the Precision-Recall Graph
  • Ralf Herbrich, Hugo Zaragoza, Simon Hill.
  • Microsoft Research,
  • Cambridge University,
  • UK.

2
Overview
  • 2-class ranking
  • Average-Precision
  • From points to curves
  • Generalisation bound
  • Discussion

3
Search cost-functions
  • Maximise the number of relevant documents found
    in the top 10.
  • Maximise the number of relevant documents at the
    top (e.g. weight hits inversely proportionally to rank).
  • Minimise the number of documents the user must see
    until they are satisfied.
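A minimal Python sketch of these three cost-functions, assuming relevance labels are listed in ranked order (the function names and example labels are illustrative, not from the slides):

```python
def precision_at_10(ranked_labels):
    """Relevant documents found in the top 10."""
    return sum(ranked_labels[:10])

def rank_weighted_gain(ranked_labels):
    """Credit each hit inversely proportionally to its rank."""
    return sum(y / rank for rank, y in enumerate(ranked_labels, start=1))

def search_length(ranked_labels, wanted=1):
    """Documents inspected until `wanted` relevant ones are seen."""
    seen = 0
    for rank, y in enumerate(ranked_labels, start=1):
        seen += y
        if seen == wanted:
            return rank
    return len(ranked_labels)  # the user is never satisfied

labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
print(precision_at_10(labels))   # 4
print(search_length(labels, 3))  # 4
```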

4
Motivation
  • Why should a learned scoring function work for
    document categorisation?
  • Why should any algorithm obtain good
    generalisation average-precision?
  • How can we devise algorithms that optimise
    rank-dependent loss-functions?

5
2-class ranking problem
  • Sample space X × Y, with labels y ∈ {0, 1}.
  • Scoring function: a mapping f : X → R.
  • Relevancy: P(y = 1 | x) is modelled through P(y = 1 | f(x)).
6
Collection samples
  • A collection is a sample
  • z = ((x1, y1), ..., (xm, ym)) ∈ (X × {0, 1})^m
  • where
  • yi = 1 if the document xi is relevant to a
    particular topic,
  • z is drawn from the (unknown) distribution P_XY.
  • Let k denote the number of positive examples.

7
Ranking the collection
  • We are given a scoring function f : X → R.
  • This function imposes an order on the collection:
  • (x(1), ..., x(m)) such that f(x(1)) ≥ ... ≥ f(x(m)).
  • Hits (i1, ..., ik) are the indices of the positive
    y(j).

  • Example: y(i) = (1, 1, 0, 1, 0, 0, 1, 0, 0, 0) gives
    hit indices (i1, i2, i3, i4) = (1, 2, 4, 7).
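The hit indices can be recovered mechanically from the ranked labels; a minimal sketch using the slide's own example (variable names are my own):

```python
# Recover the hit indices (i1, ..., ik) from the labels
# y(1), ..., y(m) after the collection has been sorted by f.
ranked_labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]  # the slide's example

# Ranks are 1-based; a hit is the rank of a positive example.
hits = [rank for rank, y in enumerate(ranked_labels, start=1) if y == 1]
k = len(hits)  # number of positive examples

print(hits, k)  # [1, 2, 4, 7] 4
```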
8
Classification setting
  • If we threshold the function f at some t, we obtain a
    classification.
  • Recall: the fraction of the k relevant documents
    scored at or above t.
  • Precision: the fraction of documents scored at or
    above t that are relevant.

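A hedged sketch of these two quantities at a threshold t, using the usual definitions (the scores and labels below are invented for illustration):

```python
def precision_recall_at(scores, labels, t):
    retrieved = [(s, y) for s, y in zip(scores, labels) if s >= t]
    tp = sum(y for _, y in retrieved)  # relevant documents retrieved
    k = sum(labels)                    # relevant documents overall
    # Convention: precision is 1 when nothing is retrieved.
    precision = tp / len(retrieved) if retrieved else 1.0
    return precision, tp / k

scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
print(precision_recall_at(scores, labels, 0.55))  # (0.75, 0.75)
```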
9
Precision vs. PGC
(Figure: precision plotted against PGC.)
10
The Precision-Recall Graph
(Figure: the precision-recall graph after reordering by f(x(i)).)
11
Graph Summarisations
Break-Even point
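A sketch that traces the precision-recall point at each hit and picks the break-even point as the point where precision and recall coincide (implemented here as the point minimising their absolute difference; the example data is invented):

```python
def pr_points(ranked_labels):
    """(recall, precision) after each hit in the ranking."""
    k = sum(ranked_labels)
    points, j = [], 0
    for rank, y in enumerate(ranked_labels, start=1):
        if y == 1:
            j += 1
            points.append((j / k, j / rank))
    return points

def break_even(points):
    """The point where precision and recall are (closest to) equal."""
    return min(points, key=lambda p: abs(p[1] - p[0]))

pts = pr_points([1, 1, 0, 1, 0, 0, 1, 0, 0, 0])
print(break_even(pts))  # (0.75, 0.75)
```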
12
Precision-Recall Example
13
Overfitting?
(Figure: average precision on the test set plotted against average precision on the train set.)
14
Overview
  • 2-class ranking
  • Average-Precision
  • From points to curves
  • Generalisation bound
  • Discussion

15
From point to curve bounds
  • There exist SVM margin-bounds [Joachims 2000] for
    precision and recall.
  • They only apply to a single (unknown a priori)
    point of the curve!

(Figure: precision-recall axes with a single point.)
16
Max-Min precision-recall
17
Max-Min precision-recall (2)
18
Features of Ranking Learning
  • We cannot take differences of ranks.
  • We cannot ignore the order of ranks.
  • Point-wise loss functions do not capture the
    ranking performance!
  • ROC or precision-recall curves do capture the
    ranking performance.
  • We need generalisation error bounds for ROC and
    precision-recall curves.
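A small illustration of the point-wise bullet above (my own construction, not from the slides): two scoring functions with identical point-wise 0/1 loss at an assumed threshold of 0.5, yet very different average precision.

```python
def zero_one_loss(scores, labels, t=0.5):
    """Point-wise classification errors at threshold t."""
    return sum((s >= t) != (y == 1) for s, y in zip(scores, labels))

def average_precision(scores, labels):
    """A(f, z) = (1/k) * sum_j j / i_j over the hit indices i_j."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = [r for r, i in enumerate(order, start=1) if labels[i] == 1]
    return sum(j / i_j for j, i_j in enumerate(hits, start=1)) / len(hits)

labels   = [1, 0, 1, 0]
scores_a = [0.9, 0.8, 0.7, 0.6]  # ranks the positives 1st and 3rd
scores_b = [0.8, 0.9, 0.6, 0.7]  # ranks the positives 2nd and 4th
print(zero_one_loss(scores_a, labels), zero_one_loss(scores_b, labels))  # 2 2
print(average_precision(scores_a, labels))  # 5/6
print(average_precision(scores_b, labels))  # 0.5
```

Both functions misclassify the same two documents, but only the first ranks a positive document on top.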

19
Generalisation and Avg.Prec.
  • How far can the observed Avg.Prec. A(f, z) be from
    the expected average precision A(f)?
  • How far apart can train and test Avg.Prec. be?
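One way to get a feel for the first question is a Monte-Carlo sketch: draw many samples from an assumed data model (here P(y = 1 | x) = x with scores f(x) = x, purely illustrative) and look at how the observed average precision A(f, z) scatters around its mean.

```python
import random

def average_precision(scores, labels):
    """A(f, z) = (1/k) * sum_j j / i_j over the hit indices i_j."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = [r for r, i in enumerate(order, start=1) if labels[i] == 1]
    return sum(j / i_j for j, i_j in enumerate(hits, start=1)) / len(hits)

def sample(m, rng):
    """Assumed model: x uniform on [0, 1], P(y = 1 | x) = x, f(x) = x."""
    xs = [rng.random() for _ in range(m)]
    ys = [1 if rng.random() < x else 0 for x in xs]
    if sum(ys) == 0:                   # keep k >= 1 so A(f, z) is defined
        ys[xs.index(max(xs))] = 1
    return xs, ys

rng = random.Random(0)
aps = [average_precision(*sample(200, rng)) for _ in range(500)]
mean_ap = sum(aps) / len(aps)
spread = max(aps) - min(aps)
print(round(mean_ap, 2), round(spread, 2))
```

The observed values cluster tightly; the bound below quantifies this concentration.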

20
Approach
  • McDiarmid's inequality: for any function g : Z^n → R
    with stability c (changing any one argument changes g
    by at most c), for all probability measures P, with
    probability at least 1 − δ over the IID draw of Z,
    g(Z) − E[g(Z)] ≤ c · sqrt((n/2) · ln(1/δ)).
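A standard form of McDiarmid's inequality gives a deviation of ε = c · sqrt((n/2) · ln(1/δ)); a numeric sketch (the constants c, n, and δ below are illustrative, not from the slides):

```python
import math

def mcdiarmid_epsilon(c, n, delta):
    """Deviation bound holding with probability at least 1 - delta."""
    return c * math.sqrt((n / 2) * math.log(1 / delta))

print(round(mcdiarmid_epsilon(c=0.01, n=2000, delta=0.05), 4))  # 0.5473
```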

21
Approach (cont.)
  • Set n = 2m and call the two m-halves Z1 and Z2.
    Define gi(Z) = A(f, Zi). Then, by the IID assumption,

22
Bounding A(f, z) − A(f, z_i)
  • How much does A(f, z) change if we alter one
    sample (xi, yi)?
  • We need to fix the number of positive examples in
    order to answer this question!
  • e.g. if k = 1, the change can be from 0 to 1.
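The k = 1 extreme can be checked directly: with a single positive example at rank i1, average precision is exactly 1/i1, so moving that one example between the ends of the ranking swings A(f, z) between 1 and 1/m (m = 10 below is illustrative):

```python
def average_precision(ranked_labels):
    """A(f, z) = (1/k) * sum_j j / i_j over the hit indices i_j."""
    k = sum(ranked_labels)
    hits = [r for r, y in enumerate(ranked_labels, start=1) if y == 1]
    return sum(j / i for j, i in enumerate(hits, start=1)) / k

m = 10
top = [1] + [0] * (m - 1)      # lone positive ranked first
bottom = [0] * (m - 1) + [1]   # lone positive ranked last
print(average_precision(top), average_precision(bottom))  # 1.0 0.1
```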

23
Stability Analysis
  • Case 1: yi = 0
  • Case 2: yi = 1

24
Main Result
  • Theorem: For all probability measures, for all
    f : X → R, with probability at least 1 − δ over the
    IID draw of a training and a test sample, both of
    size m, if both the training sample z and the test
    sample z′ contain at least a·m positive examples,
    for all a ∈ (0, 1), then

26
Positive results
  • First bound which shows that asymptotically
    training and test set performance (in terms of
    average precision) converge!
  • The effective sample size is only the number of
    positive examples.
  • The proof can be generalised to arbitrary test
    sample sizes.
  • The constants can be improved.

27
Open questions
  • How can we let k change, so as to investigate
  • What algorithms could be used to directly
    maximise A(f,z) ?

28
Conclusions
  • Many problems require ranking objects to some
    degree.
  • Ranking learning requires considering
    non-point-wise loss functions.
  • In order to study the complexity of algorithms we
    need large deviation inequalities for ranking
    performance measures.