Title: CS257 Modelling Multimedia Information LECTURE 4
1. CS257 Modelling Multimedia Information: LECTURE 4
- Dr Andrew Salway (a.salway@surrey.ac.uk)
- Tuesday 8th February
2. Describing Images
- Imagine you are the indexer of an image collection.
- 1) List all the words you can think of that describe the following image, so that it could be retrieved by as many users as possible who might be interested in it. Your words do NOT need to be factually correct, but they should show the range of things that could be said about the image.
- 2) Try to put your words into groups so that each group of words says the same sort of thing about the image.
- 3) Which words (metadata) do you think a machine could extract from the image automatically?
3. (Image slide; no transcript)
4. Words to describe the image
5. Overview of LECTURE 4
- PART 1: Different Kinds of Metadata for Visual Information; Thesauri and Controlled Vocabularies for Images
- PART 2: Standards for Indexing Visual Information
- PART 3: Current Image Retrieval Systems: how much indexing and retrieval can be automated
- LAB: Evaluating state-of-the-art image retrieval systems
6. PART 1: Different Kinds of Metadata for Visual Information
- "A picture is worth a thousand words": it's a cliché, but then clichés are often true.
- These words relate to different aspects of the image, so we:
- need to have labels to talk about different kinds of metadata for images (useful for specifying / designing / comparing image retrieval applications)
- may want to structure how metadata is stored (rather than one long list of keywords; this may help to reduce polysemy problems)
- NB. Some kinds of metadata for visual information (image and video data) are more likely to require human input than others.
7. Different Kinds of Metadata for Visual Information
- The kinds of image metadata used for a particular image retrieval application will depend in part on the types of images being stored and the types of user (Enser and Sandom 2003).
- Four types of image:
- Documentary, general purpose (e.g. news, wildlife)
- Documentary, special purpose (e.g. medical X-rays, fingerprints)
- Creative (artworks)
- Models (maps, plans, charts)
- Two types of user:
- Specialist users
- General public
8. Different Kinds of Metadata for Visual Information
- Frameworks to structure metadata for visual information into different kinds have been proposed by:
- Del Bimbo (1999)
- Shatford (1986)
- Jaimes and Chang (2000)
9. Three Kinds of Metadata for Visual Information (Del Bimbo 1999)
- Content-independent Metadata: data which is not directly concerned with image content, and could not necessarily be extracted from it, e.g. artist name, date, ownership.
- Content-dependent Metadata: perceptual facts to do with colour, texture and shape; can be automatically (and therefore objectively) extracted from image data.
- Content-descriptive Metadata: entities, actions and relationships between them, as well as meanings conveyed by the image; more subjective and much harder to extract automatically.
10. Three Levels of Visual Content
- Content-descriptive metadata can be broken down into three levels of visual content:
- Pre-iconographic: generic who, what, where, when
- Iconographic: specific who, what, where, when
- Iconological: abstract meanings that require interpretation
- We may equate the pre-iconographic and iconographic levels with the "of-ness" of an image ("it is an image of ..."), and the iconological level with the "about-ness" of an image ("it is an image about ...").
- This framework for indexing visual information, by Shatford (1986), predates del Bimbo's work and was based on the much earlier work of the art theorist Erwin Panofsky (1939).
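To make these distinctions concrete, here is a small sketch (added illustration, not part of the original slides): one way an image's metadata might be stored as a structured record, grouped by del Bimbo's three kinds and, within the content-descriptive kind, by Shatford's three levels. All field names and example values are illustrative assumptions, not a standard schema.

```python
# Illustrative sketch only: an assumed schema (not a standard) grouping image
# metadata by del Bimbo's three kinds, with content-descriptive metadata
# further split into Shatford's three levels. All values are made up.
image_record = {
    "content_independent": {        # not derivable from the pixels themselves
        "artist": "unknown",
        "date": "2005-02-08",
        "ownership": "example collection",
    },
    "content_dependent": {          # perceptual features, computable from pixels
        "dominant_colours": ["blue", "green"],
        "texture": "smooth",
    },
    "content_descriptive": {        # entities, actions, meanings (human-supplied)
        "pre_iconographic": ["woman", "horse", "street"],  # generic who/what/where/when
        "iconographic": ["Lady Godiva", "Coventry"],       # specific who/what/where/when
        "iconological": ["protest", "sacrifice"],          # abstract meanings
    },
}

# Grouping keywords by kind can reduce polysemy problems: a query for "street"
# as a depicted place can be restricted to the pre-iconographic field.
print("street" in image_record["content_descriptive"]["pre_iconographic"])  # True
```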
11. EXERCISE 4-1: Fill in for the following images
- Some possible metadata of each kind (doesn't need to be factually correct):
- Content-independent Metadata
- Content-dependent Metadata
- Content-descriptive Metadata
12. Fill in for the following images
- Some possible metadata of each kind (doesn't need to be factually correct); note that this should come from your content-descriptive metadata:
- Pre-iconographic
- Iconographic
- Iconological
13. (Image slide; no transcript)
14. (Image slide; no transcript)
15. (Image slide; no transcript)
16. Metadata for Visual Information: SUMMARY
- Del Bimbo (1999): content-independent, content-dependent, content-descriptive
- Shatford (1986): in effect refines content-descriptive into pre-iconographic, iconographic and iconological, based on Panofsky's ideas
- Jaimes and Chang (2000): 10 levels in their conceptual framework for visual information; more about this in the Optional Reading.
17. PART 2: Standards for Indexing Visual Information
- Exif: common standard for images taken by digital cameras
- AAT: Art and Architecture Thesaurus
- NASA Thesaurus
- ICONCLASS
18. Exif
- The Exif image file specification stipulates the method of recording image data in files, and specifies:
- Structure of image data files
- Tags used by this standard
- Definition and management of format versions
- Includes details about the camera that took the image, shutter speed, aperture, time, etc.
- www.exif.org
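As a small illustration of what such tags look like in practice (a sketch added here, not part of the original slides), the snippet below reads Exif metadata from a JPEG using the Pillow library; the file name is a placeholder and the exact tags present depend on the camera.

```python
from PIL import Image, ExifTags

# Minimal sketch: read the basic Exif tags recorded by a camera (e.g. make,
# model, date/time) and print them with readable names.
# "photo.jpg" is a placeholder file name.
img = Image.open("photo.jpg")
exif = img.getexif()  # dict-like mapping of numeric Exif tag IDs to values

for tag_id, value in exif.items():
    tag_name = ExifTags.TAGS.get(tag_id, tag_id)  # e.g. 272 -> "Model"
    print(f"{tag_name}: {value}")
```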
19. AAT: Art and Architecture Thesaurus
www.getty.edu/research/tools/vocabulary/aat
- Under development since the 1980s
- Now 120,000 terms of controlled vocabulary, organised by concepts
- Each concept is:
- linked to several terms (including a preferred term)
- positioned in a hierarchy
- linked to related concepts
- Terms cover objects, vocabulary to describe objects, and scholarly concepts of theory and criticism
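As an illustration of this concept-centred structure (a sketch under assumed field names, not the AAT's actual data model), a concept record and a simple term-matching check might look like this; the sample values anticipate the "potpourri vases" entry on the next slide.

```python
# Illustrative sketch only (assumed field names, NOT the AAT's own data model).
# The sample values follow the "potpourri vases" entry shown on the next slide.
concept = {
    "preferred_term": "potpourri vases",
    "terms": ["potpourri vases", "vases, potpourri", "potpourri vase",
              "pot-pourri vases", "potpourri (container)", "pots-pourris"],
    "hierarchy": ["Objects Facet", "Furnishings and Equipment", "Containers",
                  "<containers by function or context>"],
    "related_concepts": ["cassolettes"],
    "scope_note": "Covered containers in the shape of a vase or jar ...",
}

# Because indexing is done by concept, a query on ANY of the concept's terms
# (or on a broader term from its hierarchy) can retrieve the same images.
def concept_matches(query: str, c: dict) -> bool:
    return query in c["terms"] or query in c["hierarchy"]

print(concept_matches("vases, potpourri", concept))  # True
```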
20. Example AAT entry: potpourri vases
Note: Covered containers in the shape of a vase or jar and characterized by pierced decorations on the shoulder or cover, or both. They were intended primarily for liquid or dried potpourri, which is a mixture of flower petals, spices, fruit juices, or other aromatic substances. For similar containers, often vaselike in form, in which aromatic pastilles may be burned or liquid perfumes evaporated, use "cassolettes."
Terms: potpourri vases (preferred, C,U,D, English, British-P); vases, potpourri (C,U,UF); potpourri vase (C,U,AD); pot-pourri vases (C,U,UF, English, British); potpourri (container) (C,U,UF); pots-pourris (C,U,UF, French)
Hierarchical Position:
  Objects Facet
    Furnishings and Equipment
      Containers
        <containers by function or context>
          potpourri vases
21. NASA Thesaurus
- Contains the authorized subject terms by which the documents in the NASA STI Databases are indexed and retrieved
- 17,700 terms and 3,832 definitions
- http://www.sti.nasa.gov/thesfrm1.htm
22. ICONCLASS
- Iconclass is a subject-specific international classification system for iconographic research and the documentation of images.
- http://www.iconclass.nl/
23. PART 3: Approaches Taken by Image Retrieval Systems
- Systems that retrieve images based on visual similarity: QBIC (Query-by-Image-Content) and Blobworld
- Systems that use the text associated with images as a source of keywords, e.g. search engines like Google
- Manually indexed image collections, like online art galleries (e.g. Tate) and commercial picture collections (e.g. Corbis). You can start your own image collection with freeware image management software, like Picasa (picasa.google.com).
24. Visual Similarity
- Images can be indexed / queried at different levels of abstraction (cf. del Bimbo's metadata scheme)
- When dealing with content-dependent metadata (e.g. perceptual features like colour, texture and shape) it is possible to automate indexing
- To query:
- draw coloured regions (sketch-based query)
- or choose an example image (query by example)
- Images with similar perceptual features are retrieved (not necessarily images with similar semantic content)
25. (Image slide; no transcript)
26. (Image slide; no transcript)
27. (Image slide; no transcript)
28. Perceptual Features: Colour
- Colour can be computed as a global metric, i.e. a feature of an entire image or of a region
- Colour is considered a good metric because it is invariant to image translation and rotation, and changes only slowly under the effects of different viewpoints, scale and occlusion
- Colour values of pixels in an image are discretized and a colour histogram is made to represent the image / region (a sketch of this follows below)
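As an illustration of that last point (a sketch added here, not from the original slides; the choice of 4 bins per channel and the use of the Pillow library are assumptions), a coarse colour histogram can be computed like this:

```python
from PIL import Image

# Minimal sketch: discretize each RGB channel into 4 bins, giving a
# 4*4*4 = 64-bin colour histogram, normalised so that image size doesn't matter.
def colour_histogram(path: str, bins_per_channel: int = 4) -> list:
    img = Image.open(path).convert("RGB")
    step = 256 // bins_per_channel              # width of each colour bin
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in img.getdata():               # iterate over every pixel
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1
    total = sum(hist)
    return [count / total for count in hist]    # normalised histogram (sums to 1)
```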
29. Similarity-based Retrieval
- Perceptual Features (for visual similarity):
- Colour
- Texture
- Shape
- Spatial Relations
- These features can be computed directly from image data; they characterise the pixel distribution in different ways
- Different features may help retrieve different kinds of images; consider images of leaves, fabrics and adverts. A query-by-example loop over such features is sketched after this list.
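Continuing the illustration (this builds on the colour-histogram sketch above; the L1 distance and the dictionary layout of the collection are assumed choices, not any particular system's algorithm), query by example then reduces to ranking the collection by feature-vector distance:

```python
# Illustrative query-by-example over precomputed feature vectors (here, the
# colour histograms from the earlier sketch).
def l1_distance(h1, h2) -> float:
    return sum(abs(a - b) for a, b in zip(h1, h2))

def query_by_example(example_path: str, collection: dict, k: int = 5):
    query_hist = colour_histogram(example_path)
    ranked = sorted(collection.items(),
                    key=lambda item: l1_distance(query_hist, item[1]))
    # The k perceptually most similar images, which are not necessarily images
    # with similar semantic content (the "semantic gap" discussed later).
    return ranked[:k]

# Usage sketch: index a folder of images, then query with an example image.
# collection = {path: colour_histogram(path) for path in image_paths}
# results = query_by_example("query.jpg", collection)
```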
30. (Image slide; no transcript)
31. Problems for Automation
- How much metadata for images can be created automatically?
- Two sets of problems have been labelled:
- the Sensory Gap
- the Semantic Gap
32. The Sensory Gap
- "The sensory gap is the gap between the object in the world and the information in a (computational) description derived from a recording of that scene."
- (Smeulders et al. 2000)
33. The Semantic Gap
- "The semantic gap is the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation."
- (Smeulders et al. 2000)
34. EXERCISE 4-2
- Amy, Brian and Claire are developing the following image database systems:
- Amy is developing an image database of people's holiday photographs, to be used by families to store and retrieve their personal photographs.
- Brian is developing an image database of fabric samples, i.e. images of different materials, to be used by fashion designers to select fabrics for their clothes.
- Claire is developing an image database of paintings from an art gallery, to be used by the general public to choose prints to decorate their homes.
- Which kinds of metadata do you think Amy, Brian and Claire should each include in their systems, following del Bimbo's (1999) classification of content-independent, content-dependent and content-descriptive metadata for visual information? Note that each person may use more than one kind of metadata. In your answer, give examples of each kind of metadata that you think Amy, Brian and Claire should use.
35. EXERCISE 4-3
- What kinds of metadata can be indexed automatically more reliably?
- Which users' information needs are likely to be met by this kind of metadata?
36. Lab Exercise
- To use and evaluate current image retrieval systems:
- Systems that use visual similarity (QBIC and Blobworld)
- Systems that use keywords from HTML (Google, AltaVista)
- Systems in which images are manually indexed (Tate Gallery, Corbis)
37. LECTURE 4: LEARNING OUTCOMES
- After the lecture and lab exercise, you should be able to:
- Distinguish different kinds of metadata for images, and give examples for a given image:
- Del Bimbo: content-dependent, content-independent, content-descriptive
- Shatford: pre-iconographic, iconographic, iconological
- Decide what kinds of metadata are appropriate given user information needs
- Assess the cost of producing the metadata, e.g. how much can be automated?
- Compare and contrast the approaches taken to image retrieval by current systems, considering in particular the kinds of users / image data they are best suited for
- Discuss problems for image retrieval, like the sensory gap and the semantic gap
38. Optional Reading
- A framework with 10 levels of metadata for images:
- Jaimes and Chang (2000). Alejandro Jaimes and Shih-Fu Chang, "A Conceptual Framework for Indexing Visual Information at Multiple Levels", IS&T/SPIE Internet Imaging, Vol. 3964, San Jose, CA, Jan. 2000.
- www.ctr.columbia.edu/ajaimes/Pubs/spie00_internet.pdf (concentrate on the distinctions made in Section 2 and on the different levels described in Section 3.1)
- TASI (Technical Advisory Service for Images) paper on Metadata and Digital Images: www.tasi.ac.uk/advice/delivering/metadata.html
- A paper on creating an online art museum: http://www.research.ibm.com/visualtechnologies/pdf/cacm_herm.html
- For more on the kinds of metadata for visual information:
- del Bimbo (1999), Visual Information Retrieval. In Library Article Collection.