Exploiting Ontologies for Automatic Image Annotation

1
Exploiting Ontologies for Automatic Image Annotation
  • M. Srikanth, J. Varner, M. Bowden, D. Moldovan
  • Language Computer Corporation
  • www.languagecomputer.com
  • Richardson, Texas

2
Contents
  • Motivation
  • Automatic Image Annotation Problem
  • Ontologies for
      • Defining Visual Vocabularies
      • Hierarchical Models for image annotation
  • Related Work
  • Experiments and Results
  • Conclusion and Future Work

3
Motivation: Multimedia Question Answering
  • The majority of efforts in Q/A focus on textual
    corpora and processing
  • Large amounts of information are held within
    multimedia sources: images, audio, video
  • Extend the power of Q/A into the realm of
    multimedia
  • Exploit the commonality and union of text and
    multimedia information

4
Multimedia Question Answering
  • Some ways in which multimedia can be used in Q/A
      • Multimedia (video clip/image) as the answer
      • Multimedia combined with lexical information,
        providing enhanced understanding to answer questions

5
Approach
  • Feature extraction
      • High- and low-level features
  • Object recognition
      • Auto annotation of images
  • Object semantics extraction
      • Locative/temporal/etc.
  • Build Knowledge Representation from Image/Video
  • Merge with audio/text Knowledge Representation
      • Lexical information from ASR and VOCR
  • Provide Multimedia Q/A using Multimedia
    Ontologies

6
Automatic Image Annotation
  • The task of automatically assigning to an image
    words that describe its contents
  • Most models exploit the correlation between
    images and words
  • Exploit the correlation between the annotation
    words themselves to
      • Define visual vocabularies
      • Develop hierarchical models for automatic image
        annotation

Use ontological information about annotation
words to improve image annotation

7
Prior Work: Translation Models
  • Models for translating the visual representation of
    a concept to its textual representation (Duygulu et
    al., 2002)
  • Based on the Brown model for Machine Translation
    (Brown et al., 1993)
  • Image features translate to annotation words
  • K-means used to cluster image features to
    generate blobs
  • Dependencies between blobs and words are not
    explicitly captured
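
The translation idea can be sketched with EM training in the style of IBM Model 1, treating blobs as the source vocabulary and annotation words as the target. The function name `train_model1`, the toy data, and the omission of a NULL source token are assumptions made for illustration; this is not claimed to be the exact setup of Duygulu et al.

```python
from collections import defaultdict

def train_model1(pairs, iterations=20):
    """pairs: list of (blobs, words). Returns t[(word, blob)] ~ P(word | blob)."""
    all_blobs = {b for blobs, _ in pairs for b in blobs}
    all_words = {w for _, words in pairs for w in words}
    # Start from a uniform translation table.
    t = {(w, b): 1.0 / len(all_words) for w in all_words for b in all_blobs}
    for _ in range(iterations):
        count = defaultdict(float)   # expected (word, blob) alignments
        total = defaultdict(float)   # expected alignments per blob
        for blobs, words in pairs:
            for w in words:
                z = sum(t[(w, b)] for b in blobs)
                for b in blobs:
                    delta = t[(w, b)] / z   # E-step: soft alignment of w to b
                    count[(w, b)] += delta
                    total[b] += delta
        # M-step: renormalize expected counts per blob.
        t = {(w, b): (count[(w, b)] / total[b] if total[b] else 0.0)
             for (w, b) in t}
    return t

# Hypothetical toy corpus: blob ids with the words annotating the same image.
pairs = [
    ([0, 1], ["tiger", "grass"]),
    ([0, 2], ["tiger", "water"]),
    ([1, 2], ["grass", "water"]),
]
t = train_model1(pairs)
best_word_for_blob0 = max(["tiger", "grass", "water"], key=lambda w: t[(w, 0)])
```

On this toy corpus, blob 0 only ever appears in images annotated with "tiger", so its translation probability mass concentrates on that word as EM proceeds.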

Use the ontology to drive the definition of blobs

8
Prior Work: HACM Model
  • Hierarchical Aspect Cluster Model (T. Hofmann,
    1998)
  • Induces a hierarchical structure from the
    co-occurrence of image features
  • Topology is externally defined
  • Depth of the induced hierarchy is user selected
  • Levels define the generality of the concept
    expressed in regions and words

The hierarchies defined in ontologies have
well-defined semantics. The image feature hierarchy
is induced from a text ontology.

9
Prior Work: Classification Approaches
  • Estimate P(w|I) to classify an image I
    (represented by image features) into one of the
    classes (annotation word w)
  • Generative Models
  • Flat classification: learn one classifier per
    annotation word
  • SVM Classifier (Cusano et al., 2004)
  • Discriminative Models
  • Jeon and Manmatha (2004) showed improvements over
    translation models using Maximum Entropy Models
  • Unigram (blob, word) and bigram (horizontal blob
    pairs, word) features

Explore hierarchical classification using the ontology

10
Image Representation using Visual Vocabulary
[Diagram: Image → Image Segmentation → Feature Extraction → Image Representation]
  • Image Segmentation
      • Image regions corresponding to objects in the
        image
      • Grid-based image segmentation
  • Feature Extraction
      • Extract image features from image regions
      • Color, Shape, Texture
  • Image Representation
      • Real-valued feature vectors
      • Visual vocabulary derived by clustering
        feature vectors
      • Cluster centers (blobs) define the vocabulary
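
As a rough illustration of how cluster centers become a visual vocabulary, the following sketch runs a plain K-means over region feature vectors and then maps each region of a new image to its nearest center (its blob id). The 2-D toy features, the helper names, and the fixed iteration count are assumptions for illustration only.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centers, vector):
    """Index of the center closest to the given feature vector."""
    return min(range(len(centers)),
               key=lambda k: squared_distance(centers[k], vector))

def kmeans(vectors, k, iterations=20, seed=0):
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(centers, v)].append(v)
        for idx, members in enumerate(groups):
            if members:  # keep the old center if a cluster goes empty
                centers[idx] = [sum(col) / len(members) for col in zip(*members)]
    return centers

# Hypothetical toy region features (2-D here to keep the example readable).
regions = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0),
           (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
blobs = kmeans(regions, k=2)          # cluster centers = visual vocabulary

# Represent a new image by the blob ids of its regions.
image_regions = [(0.05, 0.1), (5.1, 5.0)]
blob_ids = [nearest(blobs, r) for r in image_regions]
```

After this step an image is no longer a set of real-valued vectors but a short sequence of discrete blob ids, which is what the translation and classification models operate on.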

11
Visual vocabulary from Ontologies
  • Image regions from images are organized in the
    hierarchy based on the image annotation
  • Image attributes of child nodes are related to
    the image attributes of their parent nodes

12
Using Ontologies in Translation Models for
Automatic Image Annotation
  • Ontology-induced visual vocabulary
      • Annotation word hierarchy used in selecting the
        initial set of blobs for K-means clustering
  • Ontology-weighted K-means clustering
      • Weight the cluster membership of image regions in
        the estimation of cluster centers (blobs)

n(w,c): number of image regions in cluster c associated with word w
n(c): number of image regions in cluster c
f(r): feature vector for region r
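
The weighting formula itself is not preserved in this transcript. The sketch below shows one plausible reading of the notation, assuming each region r carries a single word w and is weighted by n(w,c) / n(c) when the center of cluster c is re-estimated. That weight, the one-word-per-region simplification, and all names are assumptions, not the authors' definition.

```python
from collections import Counter

def weighted_center(cluster_regions, features, word_of):
    """Weighted mean of f(r) over the regions r assigned to one cluster.

    Assumed weight for region r: n(w, c) / n(c), where w is the word
    associated with r. Regions whose word dominates the cluster count more.
    """
    n_c = len(cluster_regions)
    n_wc = Counter(word_of[r] for r in cluster_regions)
    weights = {r: n_wc[word_of[r]] / n_c for r in cluster_regions}
    total = sum(weights.values())
    dim = len(features[cluster_regions[0]])
    return [sum(weights[r] * features[r][d] for r in cluster_regions) / total
            for d in range(dim)]

# Hypothetical toy cluster: three "tiger" regions and one stray "sky" region.
features = {"r1": (1.0, 1.0), "r2": (1.2, 0.8), "r3": (1.1, 0.9), "r4": (4.0, 4.0)}
word_of = {"r1": "tiger", "r2": "tiger", "r3": "tiger", "r4": "sky"}
cluster = ["r1", "r2", "r3", "r4"]

center = weighted_center(cluster, features, word_of)
plain_mean = [sum(features[r][d] for r in cluster) / len(cluster) for d in range(2)]
```

Under this assumed weighting, the stray "sky" region pulls the center less than it would in an unweighted mean, so the blob stays closer to the regions of the dominant word.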
13
Image Annotation by Hierarchical Classification
  • Based on hierarchical approach to text
    classification (McCallum et al., 1998)
  • Statistical, back-off model induced by the
    hierarchy derived from annotation word ontology
  • Given an image I represented by its sequence of
    blobs, the model estimates the probability P(w|I)
    of each annotation word w
  • Assuming a Bernoulli model for annotations, the
    blob likelihood given a word is estimated from the
    training set

V: visual vocabulary
T: training set of annotated images
W: set of annotation words
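
The slide's formulas are not preserved in this transcript. As a stand-in, the sketch below uses a standard naive Bayes text-classification form with Laplace smoothing, in the spirit of McCallum et al.: P(w|I) is scored as P(w) times the product of P(b|w) over the blobs b of I, with P(b|w) = (1 + N(b,w)) / (|V| + sum over b' of N(b',w)). This multinomial form, the function names, and the toy training set are assumptions and may differ from the model actually used.

```python
import math
from collections import Counter, defaultdict

def train(training_set, vocabulary_size):
    """training_set T: list of (blob_ids, annotation_words) pairs."""
    word_count = Counter()
    blob_count = defaultdict(Counter)   # blob_count[w][b] = N(b, w)
    for blobs, words in training_set:
        for w in words:
            word_count[w] += 1
            blob_count[w].update(blobs)
    n = sum(word_count.values())
    prior = {w: c / n for w, c in word_count.items()}

    def likelihood(b, w):
        # Laplace-smoothed estimate of P(b | w) over the visual vocabulary V.
        total = sum(blob_count[w].values())
        return (1 + blob_count[w][b]) / (vocabulary_size + total)

    return prior, likelihood

def rank_words(blobs, prior, likelihood):
    """Rank annotation words by log P(w) + sum of log P(b | w)."""
    scores = {w: math.log(p) + sum(math.log(likelihood(b, w)) for b in blobs)
              for w, p in prior.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data: |V| = 3 blobs, |W| = 3 words.
T = [([0, 1], ["tiger", "grass"]),
     ([0, 2], ["tiger", "water"]),
     ([1, 1], ["grass"]),
     ([2, 2], ["water"])]
prior, likelihood = train(T, vocabulary_size=3)
ranking = rank_words([0, 0], prior, likelihood)
```

Here an image made only of blob 0 is ranked "tiger" first, because blob 0 is most frequent among images annotated with that word.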
14
Image Annotation using Hierarchical Classification (contd.)
  • The IS-A hierarchy among annotation words is used
    to estimate the blob-likelihood probability
  • Feature weights learned using the EM algorithm

[Diagram: example IS-A hierarchy with the path ROOT, animal, feline, cat and the concepts tiger, cougar, leopard, lion, lynx]
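
One way to read this back-off scheme is as a mixture of blob-likelihood estimates along the path from a word up to ROOT, with the mixture weights fitted by EM on held-out blobs, similar to shrinkage in McCallum et al. The sketch below follows that reading. The toy estimates, the held-out data, and the function names are assumptions, and the authors' exact estimator may differ.

```python
def learn_path_weights(path_estimates, heldout_blobs, iterations=50):
    """Fit mixture weights over nodes on a leaf-to-root path with EM.

    path_estimates: list of dicts blob -> P(blob | node), leaf first.
    heldout_blobs: blobs observed with the leaf word, not used in the estimates.
    """
    k = len(path_estimates)
    lam = [1.0 / k] * k
    for _ in range(iterations):
        beta = [0.0] * k
        for b in heldout_blobs:
            parts = [lam[i] * path_estimates[i].get(b, 0.0) for i in range(k)]
            z = sum(parts)
            if z > 0:
                for i in range(k):
                    beta[i] += parts[i] / z   # E-step: responsibility of node i
        s = sum(beta)
        if s == 0:
            break
        lam = [x / s for x in beta]           # M-step: renormalize
    return lam

def backed_off_likelihood(blob, path_estimates, lam):
    return sum(l * est.get(blob, 0.0) for l, est in zip(lam, path_estimates))

# Hypothetical path tiger -> cat -> ROOT with toy blob distributions.
path = [
    {0: 1.0},                                 # tiger: sparse, only blob 0 seen
    {0: 0.5, 1: 0.5},                         # cat
    {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25},     # ROOT: uniform
]
lam = learn_path_weights(path, heldout_blobs=[0, 1, 0, 1])
p_blob1 = backed_off_likelihood(1, path, lam)
```

The leaf estimate alone assigns zero probability to blob 1. Mixing in the ancestors gives it a nonzero likelihood, which is the benefit of backing off along the IS-A hierarchy when training data for a word is sparse.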
15
Experiments
  • Corel Data Set
      • Annotated images using pre-processed data from
        (Duygulu et al., 2002)
      • 4500 images annotated using 374 words
      • 4000 for training, 500 for testing
  • Image Representation
      • Image segmentation using N-cuts (Duygulu et al.,
        2002)
      • 36 different image features represent each image
        region
  • Ontology: WordNet
      • Hierarchy with 714 unique concepts was induced
        from 374 annotation words
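
The induced hierarchy has more concepts than annotation words because each word brings its ancestors with it. The sketch below shows the idea on a hand-written toy hypernym map. The map, the function name `induce_hierarchy`, and the single-parent simplification are assumptions and do not reproduce the WordNet structure used in the experiments.

```python
def induce_hierarchy(words, hypernym_of):
    """Return child -> parent edges covering every word and all its ancestors."""
    edges = {}
    for word in words:
        node = word
        # Walk upward until the root or an already-visited node is reached.
        while node in hypernym_of and node not in edges:
            parent = hypernym_of[node]
            edges[node] = parent
            node = parent
    return edges

# Hypothetical single-parent hypernym map (not taken from WordNet).
hypernym_of = {
    "tiger": "big_cat", "lion": "big_cat", "big_cat": "feline",
    "cat": "feline", "feline": "animal", "animal": "ROOT",
}
annotation_words = ["tiger", "lion", "cat"]

edges = induce_hierarchy(annotation_words, hypernym_of)
concepts = set(edges) | set(edges.values())
```

In this toy case three annotation words induce seven concepts, the same kind of growth as the 374 words and 714 concepts reported above.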

16
Image Annotation Evaluation
  • Annotation systems predict P(w|I)
  • A cut-off or threshold is required to assign
    annotations
      • Unnormalized: take the top 5 words
      • Normalized: take the top m words, where m is the
        number of annotations for I
  • Metrics
      • Number of words with positive recall
      • Mean per-word precision and recall
  • All words in the dictionary
  • Selected set of words
      • Retrieved: words retrieved using the method
      • Common: words predicted by all annotation systems
      • Union: all words predicted by at least one
        annotation system
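
The evaluation steps above can be sketched as follows: cut the ranked word list at the top m, then compute precision and recall per word across test images. The function names and the toy predictions are assumptions for illustration.

```python
def top_words(scores, m=5):
    """Keep the m highest-scoring words (m=5 for the unnormalized setting)."""
    return sorted(scores, key=scores.get, reverse=True)[:m]

def per_word_precision_recall(predicted, truth):
    """predicted, truth: lists of word sets, one pair per test image."""
    vocab = set().union(*predicted, *truth)
    stats = {}
    for w in vocab:
        n_pred = sum(w in p for p in predicted)
        n_true = sum(w in t for t in truth)
        n_correct = sum(w in p and w in t for p, t in zip(predicted, truth))
        precision = n_correct / n_pred if n_pred else 0.0
        recall = n_correct / n_true if n_true else 0.0
        stats[w] = (precision, recall)
    return stats

# Hypothetical toy test set with three images.
truth = [{"tiger", "grass"}, {"water", "sky"}, {"tiger", "water"}]
predicted = [{"tiger", "sky"}, {"water", "grass"}, {"tiger", "grass"}]

stats = per_word_precision_recall(predicted, truth)
positive_recall_words = sum(1 for _, r in stats.values() if r > 0)
mean_precision = sum(p for p, _ in stats.values()) / len(stats)
mean_recall = sum(r for _, r in stats.values()) / len(stats)

# Normalized cut-off: m equals the number of ground-truth annotations.
ranked = top_words({"tiger": 0.5, "grass": 0.3, "sky": 0.1}, m=len(truth[0]))
```

Averaging over a restricted word set (retrieved, common, or union) only changes which keys of `stats` enter the two means.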

17
Results: Translation Models and Ontologies
Features   Description                                                            Precision   Recall   Predicted   Positive Recall
KM-500     Baseline K-means clustering                                            0.2204      0.2412   28          27
WKM-500    Weighted K-means clustering                                            0.2042      0.2524   27          26
ONT-714    Using 714 clusters with one cluster per word in the induced ontology   0.2634      0.2724   36          35
ONT-500    Reducing ONT-714 to 500 clusters by combining close clusters           0.2482      0.2499   33          32
  • Precision/Recall numbers are averaged over a
    pooled set of 42 words
  • Observations
      • Using ontologies increases the number of words
        predicted with positive recall
      • Hierarchy-based initial clusters attach better
        semantics to clusters
      • Results for ontology-induced clusters are based on
        one blob per concept

18
Results: Classification Approaches and Ontologies
  • Comparing flat classification versus hierarchical
    classification for image annotations

Features          Precision   Recall   Retrieved   Positive Recall
Flat KMeans-500   0.1627      0.2766   152         86
Hier KMeans-500   0.1805      0.3174   146         93
  • Precision/Recall numbers correspond to using the
    KM-500 visual vocabulary
  • Observations
      • Improved Precision (10%) and Recall (14%) values
      • Increase in number of annotations with positive
        recall
      • Hierarchy derived from annotation ontology
        results in improved performance

19
Results: Hierarchical Classification with
Ontology-induced Visual Vocabularies
Measures          KM-500   WKM-500   ONT-714   ONT-500
Baseline Flat Classification Method
Precision         0.1627   0.1867    0.1647    0.1643
Recall            0.2766   0.2831    0.2724    0.2697
Predicted         152      153       150       141
Positive Recall   86       90        84        80
Hierarchical Classification Method
Precision         0.1805   0.1882    0.1723    0.1754
Recall            0.3174   0.3135    0.2926    0.2903
Predicted         146      140       150       137
Positive Recall   93       91        91        81
  • Hierarchical approach improves precision/recall
    values on different visual vocabularies
  • ONT-714 has improved positive recall numbers
  • Ontologies defined on text annotations provide a
    good framework for developing hierarchical models
    for image features

20
Results: Comparing Translation and Classification
Approaches
Measures       KM-500   WKM-500   ONT-714   ONT-500
Common Words   27       26        35        32
Translation Method
Precision      0.3270   0.3134    0.3040    0.3124
Recall         0.3720   0.4043    0.3244    0.3253
Flat Classification Method
Precision      0.3243   0.3157    0.2924    0.3000
Recall         0.5666   0.5649    0.5591    0.5632
Hierarchical Classification Method
Precision      0.3223   0.3104    0.3018    0.3068
Recall         0.5652   0.5362    0.5453    0.5605
  • Comparison based on common annotation words
    predicted by different models
  • Significant improvement in recall using
    classification approaches

21
Ontologies in Automatic Image Annotation
  • Experimental Results
      • Ontology in translation model
          • 19.5% increase in average precision
          • 13% increase in average recall
      • Ontology in classification
          • 10% increase in average precision
          • 14% increase in average recall
  • Using word hierarchies improves annotation results
    when used
      • as a source for selecting initial blobs, and
      • as a framework for hierarchical classification
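
The percentage gains can be checked as relative improvements over the baseline figures in the results tables. The pairing used below (ONT-714 against KM-500 for translation, Hier KMeans-500 against Flat KMeans-500 for classification) is an assumption about how the slide's numbers were derived.

```python
def relative_gain(baseline, improved):
    """Relative improvement over the baseline, in percent."""
    return 100.0 * (improved - baseline) / baseline

# Translation model: KM-500 baseline vs. ONT-714 (values from the results table).
translation_precision_gain = relative_gain(0.2204, 0.2634)
translation_recall_gain = relative_gain(0.2412, 0.2724)

# Classification: Flat KMeans-500 vs. Hier KMeans-500.
classification_precision_gain = relative_gain(0.1627, 0.1805)
classification_recall_gain = relative_gain(0.2766, 0.3174)

for name, gain in [
    ("translation precision", translation_precision_gain),
    ("translation recall", translation_recall_gain),
    ("classification precision", classification_precision_gain),
    ("classification recall", classification_recall_gain),
]:
    print(f"{name}: {gain:.1f}%")
```

Under this pairing the translation gains come out at about 19.5% and 13%, and the classification gains fall between 10% and 11% and between 14% and 15%, which is consistent with the rounded figures quoted above.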

22
Summary and Future Work
  • Proposed methods for using ontologies in
    automatic image annotation
      • Translation models: defining the visual vocabulary
      • Hierarchical classification models: providing the
        hierarchy for models defined on image features
  • Explore the use of ontologies in other approaches
    to automatic image annotation
      • Discriminative models
  • Exploit the dependence between annotation words
    in automatic image annotation
      • Correlation between annotation words of an image
        can be exploited

23
Summary and Future Work (Contd.)
  • Utilize hierarchical organization of concepts and
    language models on image blobs to develop
    multi-modal ontologies
  • Use multi-modal ontologies in Q/A

24
Multimedia Ontology Example Node
  • Transportation WordNet hierarchy with Multimedia
    data

25
  • Thank You.