Title: CSM06 Information Retrieval
CSM06 Information Retrieval
- LECTURE 7 Tuesday 16th November
- Dr Andrew Salway
- a.salway@surrey.ac.uk
Lecture 8 Image Retrieval
- Different kinds of metadata for visual information
- Manual Annotation of Images
- Similarity-based Image Retrieval using Perceptual Features, e.g. QBIC and Blobworld
- The Sensory Gap, and the Semantic Gap
- Automatic Image Annotation using Collateral Text, e.g. the WebSEEK system
Image Data
- Raw image data: a bitmap with a value for every picture element (pixel) (cf. vector graphics)
- Captured with a digital camera or scanner
- Different kinds of images:
  - Photographs: people, scenes, actions; holiday albums, criminal investigations
  - Fine art and museum artefacts
  - Medical images: x-rays, scans
  - Geographic Information Systems: images from satellites
  - Meteorological images
Querying Strategies
- Language-based query
  - Text-based query describing entities, actions, meanings, etc.
- By visual example
  - Sketch-based query: draw coloured regions
  - Image: choose an image and ask for more which are visually similar
Image Description Exercise
- Imagine you are the indexer of an image collection.
- 1) List all the words you can think of that describe the following image, so that it could be retrieved by as many users as possible who might be interested in it. Your words do NOT need to be factually correct, but they should show the range of things that could be said about the image.
- 2) Try and put your words into groups, so that each group of words says the same sort of thing about the image.
- 3) Which words (metadata) do you think a machine could extract from the image automatically?
Words to describe the image
Organising Image Metadata
- "A picture is worth a thousand words": it's a cliché, but then clichés are often true
- These words relate to different aspects of the image, so:
  - we need to have labels to talk about different kinds of metadata for images
  - and to structure how we store metadata
- Some kinds of metadata are more likely to require human input than others
Metadata for Visual Information
- Del Bimbo (1999): content-independent, content-dependent, content-descriptive
- Shatford (1986) in effect refines content-descriptive into pre-iconographic, iconographic and iconological, based on Panofsky's ideas
3 Kinds of Metadata for Visual Information (Del Bimbo 1999)
- Content-independent: data which is not directly concerned with image content, and could not necessarily be extracted from it, e.g. artist name, date, ownership
- Content-dependent: perceptual facts to do with colour, texture, shape; can be automatically (and therefore objectively) extracted from image data
- Content-descriptive: entities, actions, relationships between them, as well as meanings conveyed by the image; more subjective and much harder to extract automatically
Three Levels of Visual Content
- Pre-iconographic: generic who, what, where, when
- Iconographic: specific who, what, where, when
- Iconological: abstract "aboutness"
- Based on Panofsky (1939), adapted by Shatford (1986) for indexing visual information
Image Annotation: manual
- Keyword-based descriptions of image content can be manually annotated
- May use a controlled vocabulary and consensus decisions to minimise subjectivity and ambiguity
Systems
- For examples of manually annotated image libraries, see:
  - www.tate.org.uk
  - www.corbis.com
- Iconclass has been developed as an extensive classification scheme for the content of paintings, see:
  - www.iconclass.nl
Visual Similarity
- Remember: images can be indexed / queried at different levels of abstraction (cf. del Bimbo's metadata scheme)
- When dealing with content-dependent metadata (e.g. perceptual features like colour, texture and shape) it is possible to automate indexing
- To query:
  - draw coloured regions (sketch-based query)
  - or choose an example image (query by example)
- Images with similar perceptual features are retrieved (not necessarily similar semantic content)
Similarity-based Retrieval
- Perceptual features (for visual similarity):
  - Colour
  - Texture
  - Shape
  - Spatial relations
- These features can be computed directly from image data; they characterise the pixel distribution in different ways
- Different features may help retrieve different kinds of images
Perceptual Features: Colour
- Colour can be computed as a global metric, i.e. a feature of an entire image, or of a region
- Colour is considered a good metric because it is invariant to image translation and rotation, and changes only slowly under the effects of different viewpoints, scale and occlusion
- Colour values of pixels in an image are discretized and a colour histogram is made to represent the image / region (a minimal sketch follows)
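To make the idea concrete, here is a minimal sketch of colour-histogram indexing, assuming images arrive as lists of RGB tuples; the bin count and the use of histogram intersection are illustrative choices, not details from the lecture.

```python
from collections import Counter

BINS = 4  # quantization levels per channel -> 4*4*4 = 64 colour buckets

def colour_histogram(pixels):
    """pixels: iterable of (r, g, b) tuples with 0-255 values.
    Returns a normalised histogram over quantized colours."""
    counts = Counter(
        (r * BINS // 256, g * BINS // 256, b * BINS // 256)
        for r, g, b in pixels
    )
    n = sum(counts.values())
    return {bucket: c / n for bucket, c in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical colour distributions."""
    return sum(min(p, h2.get(bucket, 0.0)) for bucket, p in h1.items())

# Toy example: a mostly-red image vs. a blue one scores near 0.
red_image = [(200, 30, 30)] * 90 + [(180, 60, 40)] * 10
blue_image = [(20, 40, 210)] * 100
print(histogram_intersection(colour_histogram(red_image),
                             colour_histogram(blue_image)))
```

Because the histogram discards pixel positions entirely, it is unchanged by translation and rotation, which is exactly the invariance the slide credits colour with.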
Perceptual Features: Texture
- There is less agreement about what constitutes texture, and there are a variety of metrics
- Generally they capture patterns in the image data (or the lack of them), e.g. repetitiveness and granularity (a toy measure is sketched below)
- (Compare the texture of a brick wall, a stainless steel kettle, ripples in a puddle and a grassy field)
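As one toy illustration of a granularity-style measure (the lecture names no specific texture metric, so this choice is an assumption made for illustration), the mean local variance of grey levels scores smooth regions near zero and busy, granular regions high:

```python
def local_variance_texture(grey, window=3):
    """grey: 2D list of grey levels (0-255). Returns the mean variance
    over all window x window neighbourhoods, as a crude texture score."""
    h, w = len(grey), len(grey[0])
    variances = []
    for y in range(h - window + 1):
        for x in range(w - window + 1):
            patch = [grey[y + dy][x + dx]
                     for dy in range(window) for dx in range(window)]
            mean = sum(patch) / len(patch)
            variances.append(sum((v - mean) ** 2 for v in patch) / len(patch))
    return sum(variances) / len(variances)

flat = [[128] * 8 for _ in range(8)]  # uniform, like a plain wall: score 0
stripes = [[255 if x % 2 else 0 for x in range(8)] for _ in range(8)]
print(local_variance_texture(flat), local_variance_texture(stripes))
```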
Perceptual Features: Shape
- Unless it is a simple geometric form (e.g. rectangle, circle, triangle), an object's shape will be captured by a set of features relating to, e.g. (a sketch of the first three follows this list):
  - Area
  - Elongatedness
  - Major axis orientation
  - Shape outline
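A hedged sketch of the first three features, computed from a binary object mask with image moments; moments are a standard way to obtain these numbers, though the lecture does not commit to any particular method:

```python
import math

def shape_features(mask):
    """mask: 2D list of 0/1 values marking the object's pixels."""
    pts = [(x, y) for y, row in enumerate(mask)
           for x, v in enumerate(row) if v]
    area = len(pts)
    cx = sum(x for x, _ in pts) / area
    cy = sum(y for _, y in pts) / area
    # Second-order central moments of the pixel coordinates.
    mxx = sum((x - cx) ** 2 for x, _ in pts) / area
    myy = sum((y - cy) ** 2 for _, y in pts) / area
    mxy = sum((x - cx) * (y - cy) for x, y in pts) / area
    # Major axis orientation, in radians from the x-axis.
    orientation = 0.5 * math.atan2(2 * mxy, mxx - myy)
    # Eigenvalues of the coordinate covariance give the principal axes;
    # their ratio is a simple elongatedness measure (1 = not elongated).
    common = math.sqrt(((mxx - myy) / 2) ** 2 + mxy ** 2)
    lam1 = (mxx + myy) / 2 + common
    lam2 = (mxx + myy) / 2 - common
    elongatedness = math.sqrt(lam1 / lam2) if lam2 > 0 else float("inf")
    return {"area": area, "orientation": orientation,
            "elongatedness": elongatedness}

# A 2x6 horizontal bar: clearly elongated, orientation ~0 (along x-axis).
bar = [[1] * 6, [1] * 6]
print(shape_features(bar))
```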
Similarity-based Retrieval
- Based on a distance function: one metric, or a combination of metrics, e.g. colour, shape, texture, is chosen to measure similarity between images or regions
- Key features may be extracted for each image/region to reduce dimensionality
- Retrieval is a matter of finding the nearest neighbours to the query (sketch-based or example image), as sketched below
- Similarity-based retrieval is more appropriate for some kinds of image collections / users than others
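The sketch below puts the pieces together, assuming each image has already been reduced to a small feature vector; the feature names, weights and collection are invented for illustration:

```python
import math

WEIGHTS = {"colour": 0.5, "texture": 0.3, "shape": 0.2}  # assumed mix

def distance(a, b):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (a[f] - b[f]) ** 2 for f, w in WEIGHTS.items()))

def retrieve(query, collection, k=2):
    """Return the k images whose feature vectors lie nearest the query."""
    return sorted(collection, key=lambda item: distance(query, item[1]))[:k]

collection = [
    ("sunset.jpg",  {"colour": 0.90, "texture": 0.20, "shape": 0.10}),
    ("forest.jpg",  {"colour": 0.30, "texture": 0.80, "shape": 0.40}),
    ("sunrise.jpg", {"colour": 0.85, "texture": 0.25, "shape": 0.15}),
]
query = {"colour": 0.88, "texture": 0.20, "shape": 0.10}  # from example image
print([name for name, _ in retrieve(query, collection)])
# -> ['sunset.jpg', 'sunrise.jpg']
```

Note that the ranking reflects perceptual closeness only: the nearest neighbours need not share the query's semantic content, which is the point made earlier about visual similarity.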
Systems
- For examples of image retrieval systems using visual similarity, see:
  - QBIC (Query By Image Content), developed by IBM and used by, among others, the Hermitage Art Museum: http://wwwqbic.almaden.ibm.com/
  - Blobworld, developed by researchers at the University of California: http://elib.cs.berkeley.edu/photos/blobworld/start.html
The Sensory Gap
- "The sensory gap is the gap between the object in the world and the information in a (computational) description derived from a recording of that scene." (Smeulders et al. 2000)
The Semantic Gap
- "The semantic gap is the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation." (Smeulders et al. 2000)
Possible Solution?
- "One way to resolve the semantic gap comes from sources outside the image by integrating other sources of information about the image in the query. Information about an image can come from a number of different sources: the image content, labels attached to the image, images embedded in a text, and so on." (Smeulders et al. 2000)
Image Annotation using collateral text
- Images are often accompanied by, or associated with, collateral text, e.g. the caption of a photograph in a newspaper, or the caption of a painting in an art gallery
- Keywords, and possibly other information, can be extracted from the collateral text and used to index the image
Image Annotation using collateral text: WebSEEK System
- The WebSEEK system processes HTML tags linking to image data files (as well as processing the image data itself) in order to index visual information on the Web
- Here we concentrate on how WebSEEK exploits collateral text, e.g. HTML tags, for image indexing-retrieval
- NB: current web search engines, like Google and AltaVista, appear to be doing something similar
Image Annotation with collateral text: WebSEEK (Smith and Chang 1997)
- Keyword indexing and subject-based classification for WWW-based image retrieval; the user can query or browse a hierarchy
- The system trawls the Web to find HTML pages with links to images
- The HTML text in which the link to an image is embedded is used for indexing and classifying the image
- >500,000 images and videos indexed with 11,500 terms; 2,128 classes manually created
Image Annotation using collateral text
- The WebSEEK system processed HTML tags linking to image and video data files in order to index visual information on the Web
- The success of this kind of approach depends on how well the keywords in the collateral text relate to the image
- Keywords are mapped automatically to subject categories; the categories are created beforehand with human input
Image Annotation using collateral text: WebSEEK System
- Term Extraction: terms are extracted from URLs, alt tags and hyperlink text, e.g.
  - http://www.mynet.net/animals/domestic-beasts/dog37.jpg
  - -> animals, domestic, beasts, dog
- Terms are used to make an inverted index for keyword-based retrieval
- Directory names are also extracted, e.g. animals/domestic-beasts
- (A sketch of this kind of term extraction follows)
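A rough sketch of URL term extraction in the spirit of the example above; WebSEEK's actual extraction rules are more elaborate, so treat the tokenisation and the inverted-index layout here as illustrative assumptions:

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

def extract_terms(url):
    """Split the URL path into index terms, dropping the file extension
    and any trailing digits (so 'dog37' yields 'dog')."""
    path = urlparse(url).path                    # /animals/domestic-beasts/dog37.jpg
    path = re.sub(r"\.[A-Za-z0-9]+$", "", path)  # strip file extension
    tokens = re.split(r"[/\-_.]+", path)         # split on separators
    return [re.sub(r"\d+$", "", t).lower()
            for t in tokens if re.search(r"[A-Za-z]", t)]

# Building the inverted index: term -> set of image URLs.
index = defaultdict(set)
url = "http://www.mynet.net/animals/domestic-beasts/dog37.jpg"
for term in extract_terms(url):
    index[term].add(url)

print(extract_terms(url))    # -> ['animals', 'domestic', 'beasts', 'dog']
print(sorted(index["dog"]))  # a keyword query retrieves the image's URL
```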
Image Annotation using collateral text: WebSEEK System
- Subject Taxonomy: a manually created is-a hierarchy, with key-term mappings to map key-terms automatically to subject classes
- Facilitates browsing of the image collection (a toy illustration follows)
Image Annotation using collateral text: WebSEEK System
- The success of this kind of approach depends on how well the keywords in the collateral text relate to the image
- URLs, alt tags and hyperlink text may or may not be informative about the image content; even when informative, they tend to be brief; perhaps further kinds of collateral text could be exploited
Image Retrieval in Google
- Rather like WebSEEK, Google appears to match keywords in file names and in the alt caption, e.g. (a sketch of extracting such keywords follows):
- <img src="/images/020900.jpg" width=150 height=180 alt="David Beckham tussles with Emmanuel Petit">
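A small sketch of pulling indexable keywords out of an img tag's alt text and file name; the slide only says engines "appear to" do something like this, so the tokenisation below is an assumption, not Google's algorithm:

```python
import re
from html.parser import HTMLParser

class ImgAltIndexer(HTMLParser):
    """Collects lower-cased alphabetic tokens from <img> alt and src."""
    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        text = (a.get("alt") or "") + " " + (a.get("src") or "")
        self.keywords += [t.lower() for t in re.findall(r"[A-Za-z]+", text)]

parser = ImgAltIndexer()
parser.feed('<img src="/images/020900.jpg" width=150 height=180 '
            'alt="David Beckham tussles with Emmanuel Petit">')
print(parser.keywords)
# -> ['david', 'beckham', 'tussles', 'with', 'emmanuel', 'petit', 'images', 'jpg']
```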
Essential Exercise
- Image Retrieval Exercise: download this from the module webpage.
Further Reading
- A paper about the WebSEEK system:
  - Smith and Chang (1997), "Visually Searching the Web for Content", IEEE Multimedia, July-September 1997, pp. 12-20. Available via the library's eJournal service.
- Different kinds of metadata for images, and an overview of content-based image retrieval:
  - Excerpts from del Bimbo (1999), Visual Information Retrieval; available in the library's short-term loan articles.
- For a comprehensive review of CBIR, and discussions of the sensory gap and semantic gap:
  - Smeulders, A.W.M., Worring, M., Santini, S., Gupta, A. and Jain, R. (2000), "Content-based image retrieval at the end of the early years", IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 22, Number 12, pp. 1349-1380. Available online through the library's eJournals.
  - Eakins (2002), "Towards Intelligent Image Retrieval", Pattern Recognition 35, pp. 3-14.
  - Enser (2000), "Visual Image Retrieval: seeking the alliance of concept-based and content-based paradigms", Journal of Information Science 26(4), pp. 199-210.