ESAEUSC 2006

Combining Linguistic and Visual properties for Unsupervised Web Mining ... Visual feature extraction (LTI-lib) Merging (LSA based ad-hoc implementation) ...


Transcript and Presenter's Notes

1

University of Rome Tor Vergata
in collaboration with the
ESA-EUSC 2006: Image Information Mining for Security and Intelligence
EUSC, Torrejon air base, Madrid (Spain), 28/11/2006
Combining Linguistic and Visual Properties for Unsupervised Web Mining
R. Basili (University of Rome "Tor Vergata", Italy), R. Petitti (University of Rome "Tor Vergata", Italy), M. V. Marabello (Exprivia SpA, Italy), D. Saracino (Exprivia SpA, Italy)
2
Context
Collateral information is an important support to image understanding. The WWW is an important source for acquiring and inspecting collateral information, BUT continuous manual monitoring of the WWW is costly in terms of time and effort. THEREFORE there is a strong need for intelligent, domain-specific agents that support the inspection of the so-called open sources of information available on the web.
3
Searching the web: the textual dimension
<html>
<body>
<p style="font-size: 40pt; font-family: arial; font-weight: bold">Flower</p>
<img src="flower-picture-3.jpg" width="500"></img>
<!-- link to video here! -->
<!-- link to audio here! -->
</body>
</html>
4
Searching the web: the visual dimension
<html>
<body>
<p style="font-size: 40pt; font-family: arial; font-weight: bold">Flower</p>
<img src="flower-picture-3.jpg" width="500"></img>
<!-- link to video here! -->
<!-- link to audio here! -->
</body>
</html>
5
Searching the web: video and audio
<html>
<body>
<p style="font-size: 40pt; font-family: arial; font-weight: bold">Flower</p>
<img src="flower-picture-3.jpg" width="500"></img>
<!-- link to video here! -->
<!-- link to audio here! -->
</body>
</html>
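The three "Searching the web" slides show the same page read along different dimensions. As a minimal sketch of that idea (not the authors' crawler), the snippet below uses Python's standard `html.parser` to separate the text, the image references, and the comment placeholders of the example page. The class name `PageDimensions` and its attributes are hypothetical, introduced only for illustration.

```python
from html.parser import HTMLParser


class PageDimensions(HTMLParser):
    """Collects the textual, visual and placeholder parts of a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.text = []      # textual dimension: visible character data
        self.images = []    # visual dimension: image sources
        self.comments = []  # comments, here the video/audio placeholders

    def handle_data(self, data):
        # Keep only non-empty text fragments.
        if data.strip():
            self.text.append(data.strip())

    def handle_starttag(self, tag, attrs):
        # Record the src attribute of every <img> tag.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_comment(self, data):
        self.comments.append(data.strip())


# The example page from the slides, with its markup restored.
PAGE = """<html><body>
<p style="font-size: 40pt; font-family: arial; font-weight: bold">Flower</p>
<img src="flower-picture-3.jpg" width="500"></img>
<!-- link to video here! -->
<!-- link to audio here! -->
</body></html>"""

parser = PageDimensions()
parser.feed(PAGE)
parser.close()

print(parser.text)      # text fragments found on the page
print(parser.images)    # image sources found on the page
print(parser.comments)  # comment placeholders found on the page
```

In a real system the image sources would be fetched and passed to visual feature extraction, while the text would feed the linguistic side; both steps are outside this sketch.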
6
Combining linguistic and visual dimensions
7
An architecture for multimedia conceptual retrieval
8
Clustering images
  • Two possible strategies
  • One-stage clustering (monolithic): all the available features are considered at the same time
  • Two-stage clustering (hierarchical):
    • the first stage uses a subset of the available features
    • the second stage uses the remaining features to refine the results of the first-stage clustering
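The two strategies can be contrasted with a small sketch. The slides do not name the clustering algorithm or say which features go into the first stage, so this example assumes a tiny deterministic k-means and assumes that textual features come first and visual features refine them. The function names `kmeans`, `one_stage` and `two_stage` and the toy vectors are illustrative only.

```python
from collections import defaultdict


def kmeans(vectors, k, iters=20):
    """Tiny deterministic k-means: the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign every vector to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def one_stage(text, visual, k):
    """Monolithic: cluster on all features at once (text and visual concatenated)."""
    return kmeans([t + v for t, v in zip(text, visual)], k)


def two_stage(text, visual, k1, k2):
    """Hierarchical: cluster on text first, then refine each cluster on visual features."""
    coarse = kmeans(text, k1)
    groups = defaultdict(list)
    for idx, label in enumerate(coarse):
        groups[label].append(idx)
    final = [None] * len(text)
    for label, idxs in groups.items():
        sub = kmeans([visual[i] for i in idxs], min(k2, len(idxs)))
        for i, s in zip(idxs, sub):
            final[i] = (label, s)  # (first-stage cluster, second-stage sub-cluster)
    return final


# Toy data: four images with 2-d textual and 1-d visual feature vectors.
text = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
visual = [[0.0], [0.0], [1.0], [1.0]]

print(one_stage(text, visual, 2))     # [0, 1, 0, 1]
print(two_stage(text, visual, 2, 2))  # [(0, 0), (1, 0), (0, 1), (1, 1)]
```

On this toy data the monolithic run keeps two clusters, while the second stage splits each text-based cluster further according to the visual feature.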

9
Clustering images via textual features only
10
Clustering images via visual features only
11
Clustering images by combining features
12
Learning
  • Three possible types of training
  • Unsupervised: clusters (with a high level of cohesion) are used directly to train the SVD categorizer.
    • No human effort is required
  • Semi-supervised: clusters (with a high level of cohesion) are labelled after an overall check of the images grouped together. No image-by-image labelling is required. Labelled clusters are then used to train the SVD categorizer.
    • Annotation rate: 60 images per minute
  • Supervised: each image used to train the SVD categorizer is labelled with its relevant class (e.g. bridge)
    • Annotation rate: 4 images per minute
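To make the two annotation rates concrete, the sketch below estimates the labelling time for the 3,959 images reported in the experimental set-up. Applying the rates to the whole image collection is an assumption made here for illustration; the slides do not state how many images were actually labelled. The function name `annotation_minutes` is hypothetical.

```python
def annotation_minutes(n_images, images_per_minute):
    """Labelling time in minutes at a constant annotation rate."""
    return n_images / images_per_minute


N_IMAGES = 3959  # corpus size from the slides; labelling all of them is an assumption

semi_supervised = annotation_minutes(N_IMAGES, 60)  # cluster-level labelling
supervised = annotation_minutes(N_IMAGES, 4)        # image-by-image labelling

print(f"semi-supervised: {semi_supervised:.1f} min")  # about 66 minutes
print(f"supervised: {supervised:.1f} min")            # 989.8 minutes, about 16.5 hours
print(f"ratio: {supervised / semi_supervised:.0f}x")  # 15x
```

Under this assumption, cluster-level labelling takes roughly an hour where image-by-image labelling takes about two working days.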

13
Experimental set-up
  • Corpus properties
    • Number of target categories: 2 (bridge, aircraft) + 1 (others)
    • Number of HTML pages: 23,520
    • Number of sections derived: 1,527
    • Number of images: 3,959
    • Number of selected terms: 53,251
  • Experiments
    • Experiment 1: cluster reconstruction
      • Training evidence: text only, visual only, combined
      • Type of clustering: monolithic/hierarchical
      • Type of training: unsupervised
    • Experiment 2: image semantic classification
      • Training evidence: text only, visual only, combined
      • Type of clustering: monolithic/hierarchical
      • Type of training: supervised/semi-supervised
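The experimental design can be read as a grid of settings. The sketch below enumerates that grid, assuming every combination of training evidence, clustering type and training type was run; the slides list the options but do not confirm the full cross-product. The names `EVIDENCE`, `CLUSTERING`, `EXPERIMENTS` and `configurations` are illustrative.

```python
from itertools import product

EVIDENCE = ["text only", "visual only", "combined"]
CLUSTERING = ["monolithic", "hierarchical"]

# Training types listed on the slide for each experiment.
EXPERIMENTS = {
    "cluster reconstruction": ["unsupervised"],
    "image semantic classification": ["supervised", "semi-supervised"],
}


def configurations(training_types):
    """All (evidence, clustering, training) combinations, assuming a full grid."""
    return list(product(EVIDENCE, CLUSTERING, training_types))


for name, training_types in EXPERIMENTS.items():
    configs = configurations(training_types)
    print(f"{name}: {len(configs)} configurations")
    for evidence, clustering, training in configs:
        print(f"  {evidence} / {clustering} / {training}")
```

Under the full-grid assumption this gives 6 configurations for Experiment 1 and 12 for Experiment 2.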

14
Experiment 1 results
15
Experiment 2 results
16
Conclusions
  • A robust and accurate semi-supervised approach to image categorization has been proposed
  • Correlations among heterogeneous media information are effectively captured by LSA
  • Textual features representative of clusters enable traditional (i.e. keyword-based) image search modalities
  • Future research directions
    • Explore the adoption of co-training models on top of the LSA representation
    • Model the problem of the textual naming of image clusters
    • Superimpose models of Web layout on the training phase
    • Study and experiment with other corpora and different domains