Title: ESA-EUSC 2006:
1 University of Rome Tor Vergata
in collaboration with the
ESA-EUSC 2006: Image Information Mining for Security and Intelligence
EUSC, Torrejon air base - Madrid (Spain), 28/11/2006
Combining Linguistic and Visual Properties for Unsupervised Web Mining
R. Basili (University of Rome "Tor Vergata" - Italy), R. Petitti (University of Rome "Tor Vergata" - Italy), M. V. Marabello (Exprivia SpA - Italy), D. Saracino (Exprivia SpA - Italy)
2 Context
Collateral information is an important support for image understanding.
The WWW is an important source for acquiring and inspecting collateral information ...
BUT continuous manual monitoring of the WWW is costly in terms of time and effort.
THEREFORE there is a strong need for intelligent, domain-specific agents that support the inspection of the so-called open sources of information available on the web.
3 Searching the web: the textual dimension
<html>
<body>
<p style="font-size: 40pt; font-family: arial; font-weight: bold">Flower</p>
<img src="flower-picture-3.jpg" width="500">
<!-- link to video here! -->
<!-- link to audio here! -->
</body>
</html>
4 Searching the web: the visual dimension
<html>
<body>
<p style="font-size: 40pt; font-family: arial; font-weight: bold">Flower</p>
<img src="flower-picture-3.jpg" width="500">
<!-- link to video here! -->
<!-- link to audio here! -->
</body>
</html>
5 Searching the web: video and audio
<html>
<body>
<p style="font-size: 40pt; font-family: arial; font-weight: bold">Flower</p>
<img src="flower-picture-3.jpg" width="500">
<!-- link to video here! -->
<!-- link to audio here! -->
</body>
</html>
6 Combining linguistic and visual dimensions
7 An architecture for multimedia conceptual retrieval
8 Clustering images
- Two possible strategies
  - One-stage clustering (monolithic): all the available features are considered at the same time
  - Two-stage clustering (hierarchical)
    - the first stage uses a subset of the available features
    - the second stage is applied to the remaining features, with the objective of refining the results of the first stage
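The two strategies above can be sketched on toy data. This is a minimal illustration, not the authors' system: the slides do not say which clustering algorithm was used or which feature subset goes first, so the plain k-means below, the text-then-visual ordering, and the names `kmeans`, `one_stage` and `two_stage` are all assumptions made for the example.

```python
from math import dist

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point seeding (an assumed algorithm).
    Returns one cluster index per point."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Seed with the point farthest from its nearest existing centroid.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def one_stage(text, visual, k):
    """Monolithic: concatenate all features and cluster once."""
    combined = [t + v for t, v in zip(text, visual)]
    return kmeans(combined, k)

def two_stage(text, visual, k1, k2):
    """Hierarchical: cluster on text first (assumed order), then refine each
    first-stage cluster using the remaining (visual) features."""
    first = kmeans(text, k1)
    labels = [None] * len(text)
    for c in sorted(set(first)):
        idx = [i for i, lab in enumerate(first) if lab == c]
        second = kmeans([visual[i] for i in idx], min(k2, len(idx)))
        for i, s in zip(idx, second):
            labels[i] = (c, s)  # (first-stage cluster, refined sub-cluster)
    return labels

# Hypothetical toy features for six images: 2-d "textual" and 2-d "visual" vectors.
text = [(1.0, 0.0)] * 4 + [(0.0, 1.0)] * 2
visual = [(0.9, 0.1), (0.9, 0.1), (0.1, 0.9), (0.1, 0.9), (0.5, 0.5), (0.5, 0.5)]

one_stage_labels = one_stage(text, visual, k=3)
two_stage_labels = two_stage(text, visual, k1=2, k2=2)
print(one_stage_labels)
print(two_stage_labels)
```

In this toy case the first stage separates the images by their textual vectors, and the second stage splits the first textual group into two visually distinct sub-clusters while leaving the visually uniform group intact.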
9 Clustering images via textual features only
10 Clustering images via visual features only
11 Clustering images by combining features
12 Learning
- Three possible types of training
  - Unsupervised: clusters (with a high level of cohesion) are directly used to train the SVD categorizer.
    - No human effort is required
  - Semi-supervised: clusters (with a high level of cohesion) are labelled after an overall check of the images grouped together. No image-by-image labelling is required. Labelled clusters are then used to train the SVD categorizer.
    - Annotation rate: 60 images per minute
  - Supervised: each image used to train the SVD categorizer is labelled with its relevant class (e.g. bridge)
    - Annotation rate: 4 images per minute
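Both the unsupervised and the semi-supervised variants rely on keeping only clusters "with a high level of cohesion". The slides do not define the cohesion measure or any threshold, so the sketch below assumes one plausible choice: mean cosine similarity of the members to the cluster centroid, with a hypothetical cut-off of 0.9. The function names and the toy clusters are likewise illustrative only.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def cohesion(members):
    """Assumed cohesion measure: mean cosine similarity to the cluster centroid."""
    centroid = [sum(col) / len(members) for col in zip(*members)]
    return sum(cosine(m, centroid) for m in members) / len(members)

def select_training_clusters(clusters, threshold=0.9):
    """Keep only clusters whose cohesion reaches the (hypothetical) threshold."""
    return {cid: ms for cid, ms in clusters.items() if cohesion(ms) >= threshold}

def pseudo_label(selected):
    """Unsupervised variant: each image inherits its cluster id as training label.
    In the semi-supervised variant a person would map each cluster id to a class name."""
    return [(vec, cid) for cid, ms in selected.items() for vec in ms]

# Hypothetical clusters: "c0" is tight, "c1" mixes two orthogonal directions.
clusters = {
    "c0": [(1.0, 0.0), (0.9, 0.1)],
    "c1": [(1.0, 0.0), (0.0, 1.0)],
}
selected = select_training_clusters(clusters)
training_set = pseudo_label(selected)
print(sorted(selected), len(training_set))
```

Only the tight cluster survives the filter here, so the categorizer would be trained on its two images and the loose cluster would be discarded.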
13 Experimental set-up
- Corpus properties
  - Number of target categories: 2 (bridge, aircraft) + 1 (others)
  - Number of HTML pages: 23,520
  - Number of sections derived: 1,527
  - Number of images: 3,959
  - Number of selected terms: 53,251
- Experiments
  - Experiment 1: cluster reconstruction
    - Training evidence: text only, visual only, combined
    - Type of clustering: monolithic/hierarchical
    - Type of training: unsupervised
  - Experiment 2: image semantic classification
    - Training evidence: text only, visual only, combined
    - Type of clustering: monolithic/hierarchical
    - Type of training: supervised/semi-supervised
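To give a sense of scale, the annotation rates quoted on the Learning slide can be combined with the image count in the corpus properties. The calculation below is a rough upper bound under the assumption that every one of the 3,959 images is annotated. The slides do not state how many images were actually labelled, and the semi-supervised mode labels only cohesive clusters, so the real effort may be lower.

```python
N_IMAGES = 3959          # number of images in the corpus (from the slide)
RATE_SEMI = 60           # semi-supervised annotation rate, images per minute
RATE_SUPERVISED = 4      # supervised annotation rate, images per minute

def annotation_minutes(n_images, images_per_minute):
    """Minutes needed to annotate n_images at a constant rate."""
    return n_images / images_per_minute

semi_minutes = annotation_minutes(N_IMAGES, RATE_SEMI)
supervised_minutes = annotation_minutes(N_IMAGES, RATE_SUPERVISED)
speedup = RATE_SEMI / RATE_SUPERVISED

print(f"semi-supervised: {semi_minutes:.1f} min")
print(f"supervised:      {supervised_minutes:.1f} min ({supervised_minutes / 60:.1f} h)")
print(f"speed-up:        {speedup:.0f}x")
```

Under that assumption, cluster-level labelling takes about 66 minutes against roughly 16.5 hours for image-by-image labelling, a factor of 15.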
14 Experiment 1: results
15 Experiment 2: results
16 Conclusions
- A robust and accurate semi-supervised approach to image categorization has been proposed
- Correlations among heterogeneous media information are effectively captured by LSA
- Textual features representative of clusters enable traditional (i.e. keyword-based) image search modalities
- Future research directions
  - Explore the adoption of co-training models on top of the LSA representation
  - Model the problem of the textual naming of image clusters
  - Superimpose models of Web layout on the training phase
  - Study and experiment with other corpora and different domains