Optimistic Active Learning using Mutual Information

Optimistic Active Learning using Mutual Information
Yuhong Guo and Russell Greiner
University of Alberta, Edmonton, Canada
Active Learning: the process of sequentially deciding which unlabeled instance to label next, with the goal of producing the best classifier from a limited number of labeled instances.
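The loop just described can be sketched concretely. Everything below (the soft nearest-neighbour stand-in classifier, the toy data, the function names) is a hypothetical illustration of pool-based active learning with most-uncertain selection, not the poster's actual implementation:

```python
import numpy as np

def predict_proba(x, X_lab, y_lab, tau=4.0):
    """Toy 2-class classifier (stand-in for whatever model the active
    learner wraps): weight each labeled point by a Gaussian kernel on
    squared distance, then normalise the per-class mass."""
    w = np.exp(-np.sum((X_lab - x) ** 2, axis=1) / tau)
    p1 = w[y_lab == 1].sum() / w.sum()
    return np.array([1.0 - p1, p1])

def label_entropy(p):
    """Shannon entropy (nats) of a predicted label distribution."""
    p = np.clip(p, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def most_uncertain(X_pool, X_lab, y_lab):
    """MU selection: index of the pool instance whose predicted label
    distribution has maximum entropy."""
    return int(np.argmax([label_entropy(predict_proba(x, X_lab, y_lab))
                          for x in X_pool]))

def active_learn(X_pool, oracle, X_lab, y_lab, n_queries):
    """Sequentially query the most uncertain instance, ask the oracle
    for its true label, and fold it into the labeled set."""
    X_pool = list(X_pool)
    for _ in range(n_queries):
        i = most_uncertain(X_pool, X_lab, y_lab)
        x = X_pool.pop(i)
        X_lab = np.vstack([X_lab, x])
        y_lab = np.append(y_lab, oracle(x))
    return X_lab, y_lab
```

With two well-separated labeled seeds, this learner queries the instance halfway between them first, since its predicted label distribution is closest to uniform.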
MmM is an optimistic active learner that exploits the discriminative partition information in the unlabeled instances, makes an optimistic assessment of each candidate instance, and temporarily switches to a different selection policy whenever the optimistic assessment turns out to be wrong.
Idea: combine Optimistic Query Selection with Online Adjustment to obtain the MmM Algorithm.

Optimistic Query Selection
[Figure: results on the pima dataset]
1. Most-uncertain query selection (MU). Shortcoming: it ignores the unlabeled data!
2. Select the query that maximizes its conditional mutual information about the unlabeled data (MCMI).

Optimistic query selection tries to identify the well-separated partition (see the example below).
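In symbols, these criteria read as follows. This is a hedged reconstruction from the poster's wording and its notation legend (L = labeled set, U = index set of unlabeled instances, H = Shannon entropy), not a verbatim copy of the paper's equations:

```latex
% Most-uncertain (MU) selection: query the instance whose predicted
% label distribution has maximal entropy.
i^{*}_{\mathrm{MU}} \;=\; \arg\max_{i \in U} \; H(Y_i \mid x_i, L)

% MCMI selection: maximizing the conditional mutual information between
% the query's label and the unlabeled data amounts to minimizing the
% entropy remaining over the pool once (x_i, y_i) is added to L.
% Averaged over the unknown label y_i (MCMIavg):
i^{*}_{\mathrm{avg}} \;=\; \arg\min_{i \in U} \;
  \mathbb{E}_{y_i \sim P(\cdot \mid x_i, L)}
  \Bigl[ \sum_{j \in U} H\bigl(Y_j \mid x_j,\; L \cup \{(x_i, y_i)\}\bigr) \Bigr]

% Optimistically, using only the best query label (MCMImin):
i^{*}_{\mathrm{min}} \;=\; \arg\min_{i \in U} \; \min_{y_i}
  \sum_{j \in U} H\bigl(Y_j \mid x_j,\; L \cup \{(x_i, y_i)\}\bigr)
```

MCMImin is the "optimistic" choice: it pretends the query will receive whichever label best separates the unlabeled pool.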
[Figures: results on the vehicle and breast datasets]
Example
Experimental Evaluation
  • Empirical results (over 143 databases) show MmM works better than
    MU, MCMImin, MCMIavg, MU (MU-SVM), and Random.
  • Future work: understand when MmM is appropriate; design further variants.

Question: how to determine yi?

Proposals:
(a) Take the expectation w.r.t. Yi (MCMIavg). Shortcoming: this aggravates the ambiguity caused by the limited labeled data.
(b) Take an optimistic strategy: use only the best query label (MCMImin).

Online Adjustment
  • MmM can easily detect this guessed-wrong situation in the immediately
    following step: simply compare the actual label of the query with its
    optimistically predicted label.
  • Whenever MmM guesses wrong, it switches to a different query-selection
    criterion (MU) for the next iteration only.
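The switching rule above can be expressed as a small control loop. The selector and oracle interfaces below are hypothetical scaffolding for illustration, not the poster's actual implementation (which uses MCMImin as the optimistic criterion and MU as the fallback):

```python
def mmm_loop(pool, oracle, select_optimistic, select_mu, n_queries):
    """Sketch of MmM's online adjustment: use the optimistic
    (MCMImin-style) criterion by default; after any query whose actual
    label differs from the optimistically predicted one, use MU for a
    single iteration, then return to optimistic selection."""
    labeled = []
    use_mu = False  # start with the optimistic criterion
    for _ in range(n_queries):
        if not pool:
            break
        selector = select_mu if use_mu else select_optimistic
        i, guessed_label = selector(pool, labeled)  # pool index + predicted label
        x = pool.pop(i)
        y = oracle(x)                               # ask for the true label
        labeled.append((x, y))
        # Switch to MU only right after an optimistic miss; an MU step
        # always hands control back to the optimistic criterion.
        use_mu = (not use_mu) and (y != guessed_label)
    return labeled
```

An MU step always returns control to the optimistic criterion, matching the poster's "next 1 iteration" rule.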
Comparing MmM with other Active Learners
  • Comparing MmM vs. MU on the pima dataset: over 100 sample sizes, MmM
    was statistically better 85 times, statistically worse 2 times, and
    tied 13 times. A signed-rank test shows MmM is better.
  • Comparing MmM vs. MCMIavg over 17 datasets: MmM was statistically
    better for >5 more sample sizes 13 times, and statistically worse for
    >5 more sample sizes 2 times. A signed-rank test shows MmM is better
    13 times and worse 1 time.

Notation: L = set of labeled instances; U = index set for the unlabeled instances; XU = set of all unlabeled instances.