VITALAS at the University of Sunderland

Slides: 6
Provided by: osirisSun

1
VITALAS at the University of Sunderland
  • Michael Oakes, Marco Palomino and Yan Xu

2
Example of an automatically-selected concept
vocabulary
  • soccer (soccer, soccers, soccere) 27278.65
  • pictur (picture, pictures, pictured, picturing,
    pictur, picturs, pictureds) 22565.06
  • minist (minister, ministers, ministe, minist,
    ministeer, ministes, ministered) 21035.98
  • team (team, teams, teamed, teaming) 18273.36
  • cup (cup, cups, cupping, cupped) 18187.04
  • citi (city, cities, citi, citys, citis, citie)
    17369.72
  • leagu (league, leagues, leagu, leagus) 16693.42
  • celebr (celebrates, celebrate, celebrations,
    celebration, celebrated, celebrating,
    celebrities, celebrity, celebrants, celebrer,
    celebre, celebrators, celebrateing, celebreated,
    celebreates, celebres, celebrational, celebral,
    celebratings, celebrant) 16545.52
  • won (won, wons) 16436.22
  • championship (championships, championship)
    16256.48

3
Statistical methods used
  • Derived from 10,000 Belga captions in each case.
  • Chi-squared test, with Europarl as the reference
    corpus.
  • Information Radius, with Europarl as the
    reference corpus
  • Raw frequency
  • PageRank: the most important words occur in the
    same captions as other important words.
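
The two reference-corpus measures listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a 2x2 chi-squared statistic per word, takes each word's contribution to the information radius IRad(p, q) = D(p || m) + D(q || m) with m = (p + q) / 2, and keeps only words that are relatively more frequent in the captions than in the reference corpus. The function names and the toy counts are hypothetical and are not Belga or Europarl data.

```python
import math
from collections import Counter


def chi_squared(word, target, reference):
    """2x2 chi-squared statistic for one word: target vs. reference corpus."""
    a = target[word]                     # occurrences of word in target
    b = reference[word]                  # occurrences of word in reference
    c = sum(target.values()) - a         # all other tokens in target
    d = sum(reference.values()) - b      # all other tokens in reference
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def information_radius_term(word, target, reference):
    """One word's contribution (in bits) to the information radius."""
    p = target[word] / sum(target.values())
    q = reference[word] / sum(reference.values())
    m = (p + q) / 2
    term = 0.0
    if p > 0:
        term += p * math.log2(p / m)
    if q > 0:
        term += q * math.log2(q / m)
    return term


def rank_keywords(target, reference, score):
    """Rank words over-represented in the target corpus (assumed filter)."""
    t_total = sum(target.values())
    r_total = sum(reference.values())
    over = [w for w in target
            if target[w] / t_total > reference[w] / r_total]
    return sorted(over, key=lambda w: score(w, target, reference),
                  reverse=True)


# Hypothetical toy counts standing in for captions and a reference corpus.
captions = Counter({"soccer": 30, "cup": 20, "the": 50})
reference_corpus = Counter({"soccer": 1, "cup": 4, "the": 95})

print(rank_keywords(captions, reference_corpus, chi_squared))
print(rank_keywords(captions, reference_corpus, information_radius_term))
```

The over-representation filter matters because both scores are symmetric: a common word such as "the" that is much rarer in the captions than in the reference corpus would otherwise score highly too.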

4
Using search logs to make recommendations to
future users
  • [Flow diagram with the labels: Text query, Search Engine,
    Associated keywords, Images chosen by past users, Match,
    Search logs, User, Ranked images]

5
Using TF.IDF to index the images in the search
logs
  • [TF.IDF formula shown on the slide; not preserved in
    this transcript]
  • w_kd is the weight reflecting the typicality of
    term k with respect to image d
  • f_kd is the number of times term k was used by
    searchers looking for image d
  • NDoc is the total number of images downloaded in
    the search logs
  • D_k is the number of images which were
    searched for using term k at least once.
  • The highest TF.IDF scores are given to those
    terms which were frequently used when searching
    for a given image, but were rarely used when
    searching for other images.
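
The indexing step described above can be sketched as follows. The sketch assumes the standard TF.IDF form w_kd = f_kd × log(NDoc / D_k), which is consistent with the definitions on this slide but is not confirmed by the transcript; the natural logarithm is also an assumption. The log format (a list of query terms paired with the image the searcher chose), the function names, and the ranking by summed weights are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tfidf_index(log_entries):
    """Build {image: {term: weight}} from (query_terms, image_id) pairs."""
    f = defaultdict(Counter)            # f[d][k]: times term k led to image d
    for terms, image in log_entries:
        f[image].update(terms)
    n_doc = len(f)                      # NDoc: distinct images in the logs
    # D_k: number of images searched for with term k at least once
    d_k = Counter(k for counts in f.values() for k in counts)
    return {
        d: {k: f_kd * math.log(n_doc / d_k[k]) for k, f_kd in counts.items()}
        for d, counts in f.items()
    }


def rank_images(query_terms, index):
    """Rank images by the summed weights of the query terms (assumed match)."""
    scores = {d: sum(w.get(k, 0.0) for k in query_terms)
              for d, w in index.items()}
    hits = [d for d in scores if scores[d] > 0]
    return sorted(hits, key=scores.get, reverse=True)


# Hypothetical search log: (terms typed by a past user, image they chose).
search_log = [
    (["soccer", "cup"], "img1"),
    (["soccer", "final"], "img1"),
    (["soccer", "minister"], "img2"),
    (["minister"], "img2"),
    (["city"], "img3"),
]

index = tfidf_index(search_log)
print(rank_images(["soccer"], index))
print(rank_images(["minister"], index))
```

In this toy log "minister" was used only for img2, so it receives a higher weight there than "soccer", which was used for two of the three images.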