Automatic Image Annotation and Retrieval using Cross-Media Relevance Models

Transcript and Presenter's Notes

1
Automatic Image Annotation and Retrieval using
Cross-Media Relevance Models
  • J. Jeon, V. Lavrenko and R. Manmatha
  • Computer Science Department
  • University of Massachusetts Amherst

Presenter: Carlos Diuk
2
Introduction
  • The Problem
  • Automatically annotate and retrieve images from
    large collections.
  • Retrieval example: answer the query "tigers in
    grass" with matching images.

3
Introduction
  • Manual annotation is currently done in
    libraries.
  • Different approaches to automatic image
    annotation:
  • Co-occurrence Model
  • Translation Model
  • Cross-media relevance model

4
Introduction: related work
  • Co-occurrence Model
  • Looks at the co-occurrence of words with image
    regions created using a regular grid.
  • Translation Model
  • Image annotation is viewed as the task of
    translating from a vocabulary of blobs to a
    vocabulary of words.

5
Introduction: CMRM
  • Cross-media relevance models (CMRM)
  • Assume that images can be described using a
    small vocabulary of blobs.
  • From a training set of annotated images, learn
    the joint distribution of blobs and words.

6
Introduction: CMRM
  • Cross-media relevance models (CMRM)
  • Allow query expansion
  • A standard technique for reducing ambiguity in
    information retrieval.
  • Perform an initial query and expand it using
    terms from the top-ranked documents (sketched
    below).
  • Example in the image context: "tigers" are more
    often associated with "grass", "water" and
    "trees" than with "cars" or "computers".
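A minimal Python sketch of this expansion loop (pseudo-relevance feedback over a toy in-memory collection; all names and data here are illustrative assumptions, not from the paper):

    from collections import Counter

    # Toy "collection": each document is a bag of annotation words.
    docs = [
        ["tiger", "grass", "water"],
        ["tiger", "trees", "grass"],
        ["car", "road", "city"],
    ]

    def expand(query, top_docs=2, extra_terms=2):
        # Rank documents by overlap with the query (a stand-in for a
        # real retrieval engine), then add the most frequent new terms
        # from the top-ranked documents.
        ranked = sorted(docs, key=lambda d: len(set(d) & set(query)),
                        reverse=True)
        counts = Counter(t for d in ranked[:top_docs] for t in d
                         if t not in query)
        return query + [t for t, _ in counts.most_common(extra_terms)]

    print(expand(["tiger"]))  # ['tiger', 'grass', 'water']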

7
Introduction: CMRM
  • Variations
  • Document-based expansion
  • PACMRM (probabilistic annotation-based CMRM)
  • The blobs corresponding to each test image are
    used to generate words and associated
    probabilities; each test image yields a vector
    of probabilities over every word in the
    vocabulary.
  • FACMRM (fixed annotation-based CMRM)
  • Use the top N words from PACMRM to annotate
    images (see the sketch after this list).
  • Query-based expansion
  • DRCMRM (direct-retrieval CMRM)
  • The query words are used to generate a set of
    blob probabilities; the vector of blob
    probabilities is compared with the vector from
    each test image using the Kullback-Leibler
    divergence, and images are ranked by the
    resulting KL distance.
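A minimal sketch of the PACMRM-to-FACMRM step, assuming a word-probability vector has already been computed for an image (the vocabulary and probabilities below are made up for illustration):

    import numpy as np

    vocab = ["tiger", "grass", "water", "car", "sky"]   # toy vocabulary
    p_word = np.array([0.40, 0.25, 0.20, 0.05, 0.10])   # PACMRM output

    # FACMRM: keep only the top-N most probable words as the annotation.
    N = 3
    top_n = np.argsort(p_word)[::-1][:N]
    annotation = [vocab[i] for i in top_n]
    print(annotation)  # ['tiger', 'grass', 'water']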

8
Discrete features in images
  • Segmenting images into regions yields fragile
    and erroneous results.
  • Normalized cuts are used instead (Duygulu et
    al.).
  • 33 features are extracted from each image
    region.
  • K-means clustering (K = 500) is used to cluster
    regions based on these features, yielding a
    vocabulary of 500 blobs (clustering sketched
    below).
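A minimal sketch of building the blob vocabulary, assuming region feature vectors are already extracted (scikit-learn's KMeans stands in for whichever clustering implementation the authors used; the random features are placeholders):

    import numpy as np
    from sklearn.cluster import KMeans

    # Placeholder for real segmented-region features: one
    # 33-dimensional vector per image region, as in the paper.
    rng = np.random.default_rng(0)
    region_features = rng.random((10_000, 33))

    # Cluster regions into K = 500 blobs; each region is then
    # represented by its discrete cluster id.
    kmeans = KMeans(n_clusters=500, n_init=10, random_state=0)
    blob_ids = kmeans.fit_predict(region_features)
    print(blob_ids[:10])  # blob ids in [0, 500)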

9
CMRM Algorithms
  • Image I = {b1 .. bm}, a set of blobs.
  • Training image J = {b1 .. bm, w1 .. wn}: blobs
    plus annotation words (data model sketched
    below).
  • Two problems:
  • Given an un-annotated image I, assign meaningful
    keywords.
  • Given a text query, retrieve images that contain
    the objects mentioned.
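A minimal Python data model matching this notation (hypothetical class and field names, for illustration only):

    from dataclasses import dataclass, field

    @dataclass
    class TestImage:
        blobs: list[int]            # b1 .. bm (discrete blob ids)

    @dataclass
    class TrainingImage(TestImage):
        words: list[str] = field(default_factory=list)  # w1 .. wn

    I = TestImage(blobs=[12, 7, 480])
    J = TrainingImage(blobs=[12, 99], words=["tiger", "grass"])
    print(I, J)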

10
CMRM Algorithms
  • Calculating probabilities (reconstruction
    below).
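The slide's equation images are not in the transcript; in LaTeX, the CMRM estimates from the paper are (T is the training set, \#(x, J) the count of x in image J, |J| and |T| the aggregate word-and-blob counts, and \alpha_J, \beta_J smoothing parameters):

    P(w, b_1, \ldots, b_m) = \sum_{J \in T} P(J)\, P(w \mid J) \prod_{i=1}^{m} P(b_i \mid J)

    P(w \mid J) = (1 - \alpha_J) \frac{\#(w, J)}{|J|} + \alpha_J \frac{\#(w, T)}{|T|}

    P(b \mid J) = (1 - \beta_J) \frac{\#(b, J)}{|J|} + \beta_J \frac{\#(b, T)}{|T|}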

11
CMRM Algorithms
  • Image retrieval
  • INPUT: query Q = w1 .. wn and a collection C of
    images.
  • OUTPUT: images described by the query words.
  • Annotation-based retrieval model
    (PACMRM / FACMRM)
  • Annotate the images as shown.
  • Perform text retrieval as usual (retrieval
    sketched below).
  • Fixed-length annotation vs probabilistic
    annotation.
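A minimal sketch of the annotate-then-retrieve pipeline, assuming PACMRM-style word-probability vectors have already been computed (the vocabulary and numbers are illustrative; images are ranked by the query likelihood, i.e. the product of the query words' probabilities):

    import numpy as np

    vocab = {"tiger": 0, "grass": 1, "water": 2, "car": 3}
    # One word-probability vector per annotated image (rows sum to 1).
    annotations = np.array([
        [0.5, 0.3, 0.1, 0.1],   # image 0
        [0.1, 0.1, 0.1, 0.7],   # image 1
    ])

    def retrieve(query_words):
        # Score each image by the product of its query-word
        # probabilities, then sort by descending score.
        idx = [vocab[w] for w in query_words]
        scores = annotations[:, idx].prod(axis=1)
        return np.argsort(scores)[::-1]

    print(retrieve(["tiger", "grass"]))  # image 0 ranked first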

12
CMRM Algorithms
  • Image retrieval
  • INPUT: query Q = w1 .. wn and a collection C of
    images.
  • OUTPUT: images described by the query words.
  • Direct retrieval model (DRCMRM)
  • Convert the query into the language of blobs,
    instead of converting images into words.
  • Estimation
  • Ranking (KL ranking sketched below)
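A minimal sketch of the DRCMRM ranking step, assuming the query has already been converted into a blob distribution (the 3-blob vocabulary and probabilities are made up; a smaller KL distance means a better match):

    import numpy as np

    def kl_divergence(p, q, eps=1e-12):
        # KL(p || q) with a small epsilon to avoid log(0).
        p, q = p + eps, q + eps
        p, q = p / p.sum(), q / q.sum()
        return float(np.sum(p * np.log(p / q)))

    query_blobs = np.array([0.6, 0.3, 0.1])       # P(b | Q) from the query
    image_blobs = [np.array([0.5, 0.4, 0.1]),     # image 0
                   np.array([0.1, 0.2, 0.7])]     # image 1

    ranking = sorted(range(len(image_blobs)),
                     key=lambda i: kl_divergence(query_blobs,
                                                 image_blobs[i]))
    print(ranking)  # [0, 1]: image 0 is the closer match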

13
Results
  • Dataset
  • Corel Stock Photo CDs (5,000 images: 4,000
    training, 500 evaluation, 500 testing). 371
    words and 500 blobs. Manual annotations.
  • Metrics (toy example after this slide)
  • Recall: the number of correctly retrieved images
    divided by the number of relevant images.
  • Precision: the number of correctly retrieved
    images divided by the number of retrieved
    images.
  • Comparisons
  • Co-occurrence vs Translation vs FACMRM
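A toy worked example of these two metrics (the image ids are invented): suppose a query has 4 relevant images and the system retrieves 5, of which 3 are relevant.

    retrieved = {"img1", "img2", "img3", "img7", "img9"}
    relevant  = {"img1", "img2", "img3", "img4"}

    correct = retrieved & relevant              # correctly retrieved
    recall = len(correct) / len(relevant)       # 3 / 4 = 0.75
    precision = len(correct) / len(retrieved)   # 3 / 5 = 0.60
    print(recall, precision)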

14
Results (continued)
  • Co-occurrence vs Translation vs FACMRM
15
Results
  • Precision and recall for 70 one-word queries.

16
Results
  • PACMRM vs DRCMRM

17
Some nice examples
Images automatically annotated as "sunset", but
not annotated manually.
18
Some nice examples
Response to the query "tiger"
Response to the query "pillar"
19
Some bad examples
20
Questions - Discussion
  • No semantic representation (just color, texture
    and shape).
  • How could we annotate a newspaper collection?
    (e.g. recognizing "Kennedy", not just "people")