1
CS257 Modelling Multimedia Information LECTURE 4
  • Dr Andrew Salway (a.salway@surrey.ac.uk)
  • Tuesday 8th February

2
Describing Images
  • Imagine you are the indexer of an image
    collection.
  • 1) List all the words you can think of that describe the following image, so that it could be retrieved by as many users as possible who might be interested in it. Your words do NOT need to be factually correct, but they should show the range of things that could be said about the image.
  • 2) Try to put your words into groups, so that each group of words says the same sort of thing about the image.
  • 3) Which words (metadata) do you think a machine could extract from the image automatically?

3
(No Transcript)
4
Words to describe the image
5
Overview of LECTURE 4
  • PART 1: Different Kinds of Metadata for Visual Information; Thesauri and Controlled Vocabularies for Images
  • PART 2: Standards for Indexing Visual Information
  • PART 3: Current Image Retrieval Systems; how much indexing / retrieval can be automated
  • LAB: Evaluating state-of-the-art image retrieval systems

6
PART 1: Different Kinds of Metadata for Visual Information
  • "A picture is worth a thousand words": it's a cliché, but then clichés are often true
  • These words relate to different aspects of the image →
  • we need labels to talk about different kinds of metadata for images (useful for specifying / designing / comparing image retrieval applications)
  • we may want to structure how metadata is stored (rather than one long list of keywords; this may help to reduce polysemy problems)
  • NB. Some kinds of metadata for visual information (image and video data) are more likely to require human input than others

7
Different Kinds of Metadata for Visual Information
  • The kinds of image metadata used for a particular image retrieval application will depend in part on the types of images being stored and the types of user (Enser and Sandom 2003)
  • Four types of image:
  • Documentary, general purpose (e.g. news, wildlife)
  • Documentary, special purpose (e.g. medical X-rays, fingerprints)
  • Creative (artworks)
  • Models (maps, plans, charts)
  • Two types of user:
  • Specialist users
  • General public

8
Different Kinds of Metadata for Visual Information
  • Frameworks to structure metadata for visual information into different kinds have been proposed by:
  • Del Bimbo (1999)
  • Shatford (1986)
  • Jaimes and Chang (2000)

9
3 Kinds of Metadata for Visual Information (Del
Bimbo 1999)
  • Content-independent metadata: data which is not directly concerned with image content, and could not necessarily be extracted from it, e.g. artist name, date, ownership
  • Content-dependent metadata: perceptual facts to do with colour, texture, shape; can be automatically (and therefore objectively) extracted from image data
  • Content-descriptive metadata: entities, actions, relationships between them, as well as meanings conveyed by the image; more subjective and much harder to extract automatically (see the sketch below)
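
To make the three kinds concrete, here is a minimal Python sketch of a hypothetical metadata record for Turner's The Fighting Temeraire; the field names and feature values are illustrative assumptions, not from the lecture.

```python
# Hypothetical record illustrating del Bimbo's three kinds of metadata
image_record = {
    # Content-independent: facts about the image, not derivable from its pixels
    "content_independent": {
        "artist": "J. M. W. Turner",
        "date": 1839,
        "owner": "The National Gallery, London",
    },
    # Content-dependent: perceptual features, computable from the pixel data
    "content_dependent": {
        "dominant_colours": [(230, 180, 90), (60, 80, 110)],  # illustrative RGB triples
        "texture_energy": 0.42,  # made-up feature values
        "edge_density": 0.17,
    },
    # Content-descriptive: entities, actions and meanings; needs human judgement
    "content_descriptive": {
        "entities": ["warship", "steam tug", "river", "sunset"],
        "about": "the passing of the age of sail",
    },
}
```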

10
Three levels of visual content
  • Content-descriptive metadata can be broken down into three levels of visual content:
  • Pre-iconographic: generic who, what, where, when
  • Iconographic: specific who, what, where, when
  • Iconological: abstract meanings that require interpretation
  • We may equate the pre-iconographic and iconographic with the "of-ness" of an image ("it is an image of..."), and the iconological with the "about-ness" of an image ("it is an image about..."); see the sketch below
  • This framework for indexing visual information by Shatford (1986) predates del Bimbo's work, and was based on the much earlier work of the art theorist Erwin Panofsky (1939).
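
As a hedged illustration (the painting and the keyword choices are an example, not from the slides), the three levels for David's Napoleon Crossing the Alps might look like this:

```python
# Hypothetical three-level description of a single painting
levels = {
    "pre_iconographic": ["man", "horse", "mountains", "storm"],   # generic: of-ness
    "iconographic": ["Napoleon Bonaparte", "crossing the Alps"],  # specific: of-ness
    "iconological": ["heroism", "power", "propaganda"],           # abstract: about-ness
}
```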

11
EXERCISE 4-1: Fill in for the following images
  • Some possible metadata of each kind (doesn't need to be factually correct):
  • Content-independent metadata
  • Content-dependent metadata
  • Content-descriptive metadata

12
Fill in for the following images
  • Some possible metadata of each kind (doesn't need to be factually correct); note, this should come from your content-descriptive metadata:
  • Pre-iconographic
  • Iconographic
  • Iconological

13
(No Transcript)
14
(No Transcript)
15
(No Transcript)
16
Metadata for Visual Information SUMMARY
  • Del Bimbo (1999): content-independent, content-dependent, content-descriptive
  • Shatford (1986) in effect refines content-descriptive → pre-iconographic, iconographic, iconological, based on Panofsky's ideas
  • Jaimes and Chang (2000): 10 levels in their conceptual framework for visual information; more about this in the Optional Reading.

17
PART 2: Standards for Indexing Visual Information
  • Exif: common standard for images taken by digital cameras
  • AAT: Art and Architecture Thesaurus
  • NASA Thesaurus
  • ICONCLASS

18
Exif
  • The Exif image file specification stipulates the method of recording image data in files, and specifies:
  • Structure of image data files
  • Tags used by this standard
  • Definition and management of format versions
  • Includes details about the camera that took the image: shutter speed, aperture, time, etc. (see the sketch below)
  • www.exif.org
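
As an illustration of how these tags can be read programmatically, here is a minimal sketch using the Pillow library; Pillow and the file name photo.jpg are assumptions for the example, not part of the Exif standard itself.

```python
from PIL import Image
from PIL.ExifTags import TAGS  # maps numeric Exif tag IDs to readable names

img = Image.open("photo.jpg")        # hypothetical camera image
exif = img.getexif()                 # the Exif tags recorded by the camera
for tag_id, value in exif.items():
    name = TAGS.get(tag_id, tag_id)  # e.g. Model, DateTime, ExposureTime, FNumber
    print(f"{name}: {value}")
```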

19
AAT Art and Architecture Thesaurus
www.getty.edu/research/tools/vocabulary/aat
  • Under development since the 1980s
  • Now 120,000 terms of controlled vocabulary organised by concepts
  • Each concept is:
  • linked to several terms (including a preferred term)
  • positioned in a hierarchy
  • linked to related concepts
  • Terms cover objects, vocabulary to describe objects, and scholarly concepts of theory and criticism
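
A minimal sketch of the concept structure just described, as a Python data class; the class and field names are illustrative assumptions, not the AAT's own data model (a real AAT record follows on the next slide).

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Concept:
    preferred_term: str                  # the preferred term for the concept
    terms: List[str]                     # alternate (non-preferred) terms
    broader: Optional["Concept"] = None  # parent concept in the hierarchy
    related: List["Concept"] = field(default_factory=list)  # related concepts

containers = Concept("containers", [])
by_function = Concept("<containers by function or context>", [], broader=containers)
potpourri_vases = Concept(
    "potpourri vases",
    ["vases, potpourri", "potpourri vase", "pot-pourri vases"],
    broader=by_function,  # positioned in the hierarchy, as on the next slide
)
```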

20
potpourri vases
Note: Covered containers in the shape of a vase or jar and characterized by pierced decorations on the shoulder or cover, or both. They were intended primarily for liquid or dried potpourri, which is a mixture of flower petals, spices, fruit juices, or other aromatic substances. For similar containers, often vaselike in form, in which aromatic pastilles may be burned or liquid perfumes evaporated, use "cassolettes."
Terms:
  potpourri vases (preferred, C,U,D, English, British-P)
  vases, potpourri (C,U,UF)
  potpourri vase (C,U,AD)
  pot-pourri vases (C,U,UF, English, British)
  potpourri (container) (C,U,UF)
  pots-pourris (C,U,UF, French)
Hierarchical Position:
  Objects Facet
  .... Furnishings and Equipment
  ........ Containers
  ............ <containers by function or context>
  ................ potpourri vases
21
NASA Thesaurus
  • contains the authorized subject terms by which
    the documents in the NASA STI Databases are
    indexed and retrieved
  • 17,700 terms and 3,832 definitions
  • http://www.sti.nasa.gov/thesfrm1.htm

22
ICONCLASS
  • Iconclass is a subject-specific international classification system for iconographic research and the documentation of images.
  • http://www.iconclass.nl/

23
PART 3: Approaches taken by Image Retrieval Systems
  • Systems that retrieve images based on visual similarity, e.g. QBIC (Query-by-Image-Content) and Blobworld
  • Systems that use the text associated with images as a source of keywords, e.g. search engines like Google
  • Manually indexed image collections, like online art galleries (e.g. Tate) and commercial picture collections (e.g. Corbis). You can start your own image collection with freeware image management software like Picasa (www.picasa.com)

24
Visual Similarity
  • Images can be indexed / queried at different levels of abstraction (cf. del Bimbo's metadata scheme)
  • When dealing with content-dependent metadata (e.g. perceptual features like colour, texture and shape) it is possible to automate indexing
  • To query:
  • draw coloured regions (sketch-based query)
  • or choose an example image (query by example)
  • Images with similar perceptual features are retrieved (not necessarily similar semantic content)

25
(No Transcript)
26
(No Transcript)
27
(No Transcript)
28
Perceptual Features: Colour
  • Colour can be computed as a global metric, i.e. a feature of an entire image or of a region
  • Colour is considered a good metric because it is invariant to image translation and rotation, and changes only slowly under the effects of different viewpoints, scale and occlusion
  • Colour values of pixels in an image are discretized and a colour histogram is made to represent the image / region (see the sketch below)
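
A minimal sketch of this idea in Python with NumPy; the bin count and normalisation choices are assumptions for illustration, and real systems tune them.

```python
import numpy as np

def colour_histogram(image, bins_per_channel=8):
    """Discretize RGB pixel values and build a normalised colour histogram.

    image: H x W x 3 uint8 array (channel values 0..255).
    Returns a vector of bins_per_channel**3 bin frequencies.
    """
    # Quantize each channel into coarse bins (here 8 levels per channel)
    q = (image.astype(np.uint32) * bins_per_channel) // 256
    # Fold the three per-channel bin indices into one bin index per pixel
    idx = (q[..., 0] * bins_per_channel + q[..., 1]) * bins_per_channel + q[..., 2]
    hist = np.bincount(idx.ravel(), minlength=bins_per_channel ** 3)
    # Normalise so that images / regions of different sizes are comparable
    return hist / hist.sum()
```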

29
Similarity-based Retrieval
  • Perceptual features (for visual similarity):
  • Colour
  • Texture
  • Shape
  • Spatial Relations
  • These features can be computed directly from image data; they characterise the pixel distribution in different ways
  • Different features may help retrieve different kinds of images: consider images of leaves, fabrics, adverts (see the sketch below)
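
Query by example then reduces to comparing feature vectors. Here is a minimal sketch reusing colour_histogram from the previous slide; histogram intersection (Swain and Ballard) is one standard similarity measure, chosen here as an assumption rather than as what QBIC or Blobworld actually use.

```python
import numpy as np  # colour_histogram is defined in the sketch above

def histogram_intersection(h1, h2):
    # Sum of bin-wise minima: 1.0 means identical colour distributions
    return np.minimum(h1, h2).sum()

def rank_by_colour(query_image, collection):
    """Rank (name, image) pairs by colour similarity to the query image."""
    q = colour_histogram(query_image)
    scores = [(histogram_intersection(q, colour_histogram(img)), name)
              for name, img in collection]
    return sorted(scores, reverse=True)  # most similar first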

30
(No Transcript)
31
Problems for Automation
  • How much metadata for images can be created
    automatically?
  • Two sets of problems have been labelled:
  • Sensory Gap
  • Semantic Gap

32
The Sensory Gap
  • "The sensory gap is the gap between the object in the world and the information in a (computational) description derived from a recording of that scene."
  • (Smeulders et al. 2000)

33
The Semantic Gap
  • "The semantic gap is the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation."
  • (Smeulders et al. 2000)

34
EXERCISE 4-2
  • Amy, Brian and Claire are developing the following image database systems:
  • Amy is developing an image database of people's holiday photographs, to be used by families to store and retrieve their personal photographs.
  • Brian is developing an image database of fabric samples, i.e. images of different materials, to be used by fashion designers to select fabrics for their clothes.
  • Claire is developing an image database of paintings from an art gallery, to be used by the general public to choose prints to decorate their homes.
  • Which kinds of metadata do you think Amy, Brian and Claire should each include in their systems, following del Bimbo's (1999) classification of content-independent, content-dependent and content-descriptive metadata for visual information? Note, each person may use more than one kind of metadata. In your answer give examples of each kind of metadata that you think Amy, Brian and Claire should use.

35
EXERCISE 4-3
  • Which kinds of metadata can be indexed automatically more reliably than others?
  • Which users' information needs are likely to be met by this kind of metadata?

36
Lab Exercise
  • To use and evaluate current image retrieval systems:
  • Systems that use visual similarity (QBIC and
    Blobworld)
  • Systems that use keywords from HTML (Google,
    AltaVista)
  • Systems in which images are manually indexed
    (Tate Gallery, Corbis)

37
LECTURE 4 LEARNING OUTCOMES
  • After the lecture and lab exercise, you should be able to:
  • Distinguish different kinds of metadata for images, and give examples for a given image
  • Del Bimbo: content-dependent, content-independent, content-descriptive
  • Shatford: pre-iconographic, iconographic, iconological
  • Decide what kinds of metadata are appropriate given user information needs
  • Assess the cost of producing the metadata, e.g. how much can be automated?
  • Compare and contrast the approaches taken to image retrieval by current systems, considering in particular the kinds of users / image data they are best suited for
  • Discuss problems for image retrieval like the sensory gap and the semantic gap

38
Optional Reading
  • A Framework with 10 Levels of Metadata for Images:
  • Alejandro Jaimes and Shih-Fu Chang, "A Conceptual Framework for Indexing Visual Information at Multiple Levels", IS&T/SPIE Internet Imaging, Vol. 3964, San Jose, CA, Jan. 2000.
  • www.ctr.columbia.edu/ajaimes/Pubs/spie00_internet.pdf (concentrate on the distinctions made in Section 2 and on the different levels described in Section 3.1)
  • TASI (Technical Advisory Service for Images) paper on Metadata and Digital Images: www.tasi.ac.uk/advice/delivering/metadata.html
  • A paper on creating an online art museum: http://www.research.ibm.com/visualtechnologies/pdf/cacm_herm.html
  • For more on the kinds of metadata for visual information:
  • del Bimbo (1999), Visual Information Retrieval. In Library Article Collection.