ESAEUSC 2006: - PowerPoint PPT Presentation

About This Presentation

Title:

ESAEUSC 2006:

Description:

Combining Linguistic and Visual properties for Unsupervised Web Mining ... Visual feature extraction (LTI-lib) Merging (LSA based ad-hoc implementation) ...

Slides: 17

Provided by: Supp3

Transcript and Presenter's Notes

1

University of Rome Tor Vergata
in collaboration with the
ESA-EUSC 2006: Image Information Mining for Security and Intelligence
EUSC, Torrejon air base - Madrid (Spain), 28/11/2006
Combining Linguistic and Visual Properties for Unsupervised Web Mining
R. Basili (University of Rome "Tor Vergata" - Italy), R. Petitti (University of Rome "Tor Vergata" - Italy), M. V. Marabello (Exprivia SpA - Italy), D. Saracino (Exprivia SpA - Italy)
2
Context
Collateral information is an important support to image understanding.
The WWW is an important source for acquiring and inspecting collateral information, but continuous manual monitoring of the WWW is costly in time and effort.
There is therefore a strong need for intelligent, domain-specific agents that support the inspection of the so-called open sources of information available on the web.
3
Searching the web: the textual dimension
  <html>
  <body>
  <p style="font-size: 40pt; font-family: arial; font-weight: bold">Flower</p>
  <img src="flower-picture-3.jpg" width="500"></img>
  <!-- link to video here! -->
  <!-- link to audio here! -->
  </body>
  </html>
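The page fragment on this slide carries three kinds of evidence: visible text, an image reference, and placeholders for video and audio. A minimal sketch of how an agent might separate them, using Python's standard `html.parser` module, is given here. The class name `PageEvidence` and the variable `PAGE` are hypothetical and chosen only for illustration; this is not the extraction component used by the authors.

```python
from html.parser import HTMLParser

# The slide's example page, with tags and quotes restored.
PAGE = """<html>
<body>
<p style="font-size: 40pt; font-family: arial; font-weight: bold">Flower</p>
<img src="flower-picture-3.jpg" width="500"></img>
<!-- link to video here! -->
<!-- link to audio here! -->
</body>
</html>"""


class PageEvidence(HTMLParser):
    """Hypothetical helper: collects text, image sources and comments."""

    def __init__(self):
        super().__init__()
        self.text = []      # textual dimension
        self.images = []    # visual dimension (image file references)
        self.comments = []  # placeholders for other media

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_data(self, data):
        # Skip whitespace-only runs between tags.
        if data.strip():
            self.text.append(data.strip())

    def handle_comment(self, data):
        self.comments.append(data.strip())


evidence = PageEvidence()
evidence.feed(PAGE)
print(evidence.text, evidence.images, evidence.comments)
```

Keeping the three lists separate mirrors the slides that follow, which treat the textual, visual, and video/audio dimensions one at a time.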
4
Searching the web: the visual dimension
  <html>
  <body>
  <p style="font-size: 40pt; font-family: arial; font-weight: bold">Flower</p>
  <img src="flower-picture-3.jpg" width="500"></img>
  <!-- link to video here! -->
  <!-- link to audio here! -->
  </body>
  </html>
5
Searching the web: video and audio
  <html>
  <body>
  <p style="font-size: 40pt; font-family: arial; font-weight: bold">Flower</p>
  <img src="flower-picture-3.jpg" width="500"></img>
  <!-- link to video here! -->
  <!-- link to audio here! -->
  </body>
  </html>
6
Combining linguistic and visual dimensions
7
An architecture for multimedia conceptual
retrieval
8
Clustering images

Two possible strategies:
One-stage clustering (monolithic): all the available features are considered at the same time.
Two-stage clustering (hierarchical):
  the first stage uses a subset of the available features;
  the second stage uses the remaining features to refine the results of the first stage.
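The difference between the two strategies can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions, not the authors' implementation: the slides do not name the clustering algorithm, so a tiny k-means with farthest-point initialisation stands in for it, and the feature values, the function names (`kmeans`, `one_stage`, `two_stage`, `partition`) and the choice of text features for the first stage are all hypothetical.

```python
def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Tiny k-means (assumed stand-in) with deterministic farthest-point init."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sqdist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sqdist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def one_stage(text, visual, k):
    """Monolithic: cluster on all features at once (concatenated vectors)."""
    return kmeans([t + v for t, v in zip(text, visual)], k)


def two_stage(text, visual, k1, k2):
    """Hierarchical: cluster on text first, then refine each cluster visually."""
    first = kmeans(text, k1)
    final = [None] * len(text)
    for c in range(k1):
        idx = [i for i, l in enumerate(first) if l == c]
        # Clusters too small to split are left as they are.
        sub = kmeans([visual[i] for i in idx], k2) if len(idx) >= k2 else [0] * len(idx)
        for i, s in zip(idx, sub):
            final[i] = (c, s)
    return final


def partition(labels):
    """Grouping of item indices induced by a labelling, ignoring label names."""
    groups = {}
    for i, l in enumerate(labels):
        groups.setdefault(l, set()).add(i)
    return {frozenset(g) for g in groups.values()}


# Hypothetical toy data: two text weights and one visual value per image.
TEXT = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1], [0.9, 0.0],
        [0.0, 1.0], [0.1, 0.9], [0.0, 0.9], [0.1, 1.0]]
VISUAL = [[0.1], [0.2], [0.9], [0.8], [0.1], [0.15], [0.9], [0.85]]

mono = one_stage(TEXT, VISUAL, k=4)
hier = two_stage(TEXT, VISUAL, k1=2, k2=2)
print(partition(mono) == partition(hier))
```

On this well-separated toy data both strategies recover the same four groups. The two-stage variant additionally records which first-stage (textual) cluster each refined group came from, which is the information the second stage is meant to refine.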

9
Clustering images via textual features only
10
Clustering images via visual features only
11
Clustering images by combining features
12
Learning

Three possible types of training:
Unsupervised: clusters (with a high level of cohesion) are used directly to train the SVD categorizer.
  No human effort is required.
Semi-supervised: clusters (with a high level of cohesion) are labelled after an overall check of the images grouped together. No image-by-image labelling is required. The labelled clusters are then used to train the SVD categorizer.
  Annotation rate: 60 images per minute
Supervised: each image used to train the SVD categorizer is labelled with its relevant class (e.g. bridge).
  Annotation rate: 4 images per minute
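To put the two annotation rates in perspective, the short calculation below estimates the labelling time for a corpus of 3,959 images, the size reported in the experimental set-up. It assumes, purely for illustration, that every image would be labelled; the slides do not say how many images were actually annotated. The function name `annotation_minutes` is hypothetical.

```python
def annotation_minutes(n_images, images_per_minute):
    """Labelling time in minutes at a constant annotation rate."""
    return n_images / images_per_minute


N_IMAGES = 3959  # corpus size from the experimental set-up

semi_supervised = annotation_minutes(N_IMAGES, 60)  # cluster-level check
supervised = annotation_minutes(N_IMAGES, 4)        # image-by-image labelling

print(f"semi-supervised: {semi_supervised:.0f} min")
print(f"supervised: {supervised:.0f} min ({supervised / 60:.1f} h)")
print(f"ratio: {supervised / semi_supervised:.0f}x")
```

Under that assumption, cluster-level labelling takes about an hour, image-by-image labelling about sixteen and a half hours, a factor of 15 that follows directly from the two quoted rates.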

13
Experimental set-up

Corpus properties
  Number of target categories: 2 (bridge, aircraft) + 1 (others)
  Number of HTML pages: 23,520
  Number of sections derived: 1,527
  Number of images: 3,959
  Number of selected terms: 53,251
Experiments
  Experiment 1: cluster reconstruction
    Training evidence: text only, visual only, combined
    Type of clustering: monolithic/hierarchical
    Type of training: unsupervised
  Experiment 2: image semantic classification
    Training evidence: text only, visual only, combined
    Type of clustering: monolithic/hierarchical
    Type of training: supervised/semi-supervised
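The experimental design above can be read as a grid of settings. The sketch below enumerates that grid under the assumption that every combination of evidence, clustering type and training type is a separate run; the slides do not state whether all combinations were actually evaluated. The names `EVIDENCE`, `CLUSTERING`, `TRAINING` and `configurations` are hypothetical.

```python
from itertools import product

EVIDENCE = ("text only", "visual only", "combined")
CLUSTERING = ("monolithic", "hierarchical")
TRAINING = {
    1: ("unsupervised",),                  # Experiment 1: cluster reconstruction
    2: ("supervised", "semi-supervised"),  # Experiment 2: semantic classification
}


def configurations(experiment):
    """All (evidence, clustering, training) settings for one experiment."""
    return list(product(EVIDENCE, CLUSTERING, TRAINING[experiment]))


for exp in (1, 2):
    print(f"Experiment {exp}: {len(configurations(exp))} configurations")
```

Read this way, Experiment 1 spans 6 settings and Experiment 2 spans 12.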

14
Experiment 1: results
15
Experiment 2: results
16
Conclusions

A robust and accurate semi-supervised approach to image categorization has been proposed.
Correlations among heterogeneous media information are effectively captured by LSA.
Textual features representative of clusters enable traditional (i.e. keyword-based) image search modalities.
Future research directions:
  Explore the adoption of co-training models on top of the LSA representation
  Model the problem of the textual naming of image clusters
  Superimpose models of Web layout on the training phase
  Study and experiment over other corpora and different domains
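The claim that LSA captures correlations among heterogeneous media can be illustrated with a small sketch. It stacks textual and visual features in one feature-by-image matrix, extracts a rank-2 latent space, and projects an image that has visual features but no text. Everything specific here is an assumption made for illustration: the feature names, the toy values, and the use of power iteration with deflation as a dependency-free stand-in for a truncated SVD. It is not the authors' ad-hoc LSA implementation.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(m, v):
    return [dot(row, v) for row in m]


def top_singular_triplets(a, k, iters=500):
    """Leading k (sigma, u, v) triplets by power iteration with deflation."""
    a = [row[:] for row in a]
    n_cols = len(a[0])
    triplets = []
    for _ in range(k):
        at = [list(col) for col in zip(*a)]
        v = [1.0 + 0.1 * j for j in range(n_cols)]
        for _ in range(iters):
            w = matvec(at, matvec(a, v))  # one power step on A^T A
            n = math.sqrt(dot(w, w))
            v = [x / n for x in w]
        av = matvec(a, v)
        sigma = math.sqrt(dot(av, av))
        u = [x / sigma for x in av]
        triplets.append((sigma, u, v))
        # Deflate: remove the component just found.
        a = [[a[i][j] - sigma * u[i] * v[j] for j in range(n_cols)]
             for i in range(len(a))]
    return triplets


def embed(x, triplets):
    """Project a feature vector onto the latent dimensions."""
    return [dot(u, x) for _, u, _ in triplets]


def cosine(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))


# Rows: hypothetical features (four terms, two visual measures).
# Columns: two bridge images followed by two aircraft images.
FEATURES = ["bridge", "river", "aircraft", "runway", "edge_density", "sky_ratio"]
MATRIX = [
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.9, 0.8, 0.1, 0.2],
    [0.1, 0.2, 0.9, 0.8],
]

latent = top_singular_triplets(MATRIX, k=2)
columns = [list(c) for c in zip(*MATRIX)]
bridge_img = embed(columns[0], latent)
aircraft_img = embed(columns[2], latent)

# An image with visual features only: no surrounding text at all.
untexted = embed([0.0, 0.0, 0.0, 0.0, 0.85, 0.15], latent)
sim_bridge = cosine(untexted, bridge_img)
sim_aircraft = cosine(untexted, aircraft_img)
print(sim_bridge > sim_aircraft)
```

Because the visual measures co-occur with the bridge terms in the training columns, the text-free image ends up closer to the bridge image than to the aircraft image in the latent space. That cross-media link is what makes keyword-based search over image clusters plausible.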