Image Annotation? Trends and Direction for Real-World Image Annotation
1
Image annotation? Trends and Direction for
real-world image Annotation
Yohan Jin, Latifur Khan, et al., "Image
Annotations by Combining Multiple Evidence &
WordNet," in Proc. of the 13th Annual ACM
International Conference on Multimedia (ACM MM'05),
Singapore, November 2005, pages 706-715

2
Image Annotation?
soldiers, clock, door, wall, mirror
yohan jin, military officer, military friends
3
Problem Statement
  • What is Automatic Image Annotation (AIA)?

Tiger Woods (people), palm (tree), building,
grass, golf, sky
Automatically annotate images then retrieve based
on the textual annotations.
4
Problem Statement contd.
  • Why do many multimedia researchers love to work on it?
  • Big Burst of Multimedia Content
  • E.g., Flickr, YouTube
  • Interdisciplinary Area
  • Computer Vision, Machine Learning,
    Information Retrieval, NLP and so on.

5
Overall Procedure
  • Problem Statement
  • Motivation for KBIAR
  • Approach
  • Semantic-Similarity (WordNet)
  • Several Similarity Measures
  • Combining Semantic Evidence Model
  • Results
  • Contributions
  • Future Work as an Ongoing Project

6
Motivation
  • How to retrieve images/videos?
  • CBIR is based on similarity search of visual
    features
  • Doesn't support textual queries
  • Doesn't capture semantics
  • Automatically annotate images then retrieve based
    on the textual annotations.

Example annotations: tiger, grass.
7
Motivation for KBIAR
  • There is a gap between perceptual issue and
    conceptual issue.
  • Semantic gap: hard to represent semantic meaning
    using low-level image features like color,
    texture, and shape.
  • It's possible to answer the query "red ball" with
    a red rose.

8
Motivation
  • Most current automatic image annotation and
    retrieval approaches consider
  • Keywords
  • Low-level image features for visual
    token/region/object
  • Correspondence between keywords and visual
    tokens
  • Our goal is to develop automated image annotation
    techniques with better accuracy

9
Annotation
10
Annotation
  • Major steps
  • Segmentation into regions
  • Clustering to construct blob-tokens
  • Analyze correspondence between keywords and
    blob-tokens
  • Auto Annotation

11
Annotation Segmentation Clustering
  • Images → Segments → Blob-tokens

12
Annotation Correspondence/Linking
  • Our purpose is to find correspondence between
    words and blob-tokens.
  • e.g., P(Tiger | v1), P(v2 | grass)

13
Auto Annotation

14
Co-Occurrence Models
  • Mori et al. 1999
  • Create co-occurrence table using a training set
    of annotated images
  • Tend to annotate with high frequency words
  • Context is ignored
  • Needs joint probability models

P(w1 | v1) = 12 / (12 + 2 + 0 + 1) = 0.8
P(v3 | w2) = 12 / (2 + 40 + 12 + 43) ≈ 0.12
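Below is a minimal Python sketch of how such a co-occurrence table yields these conditional probabilities. Only the first row and the second column mirror the slide's counts; every other entry is a made-up placeholder.

    import numpy as np

    # Co-occurrence counts n(v_i, w_j): rows are visual tokens v1..v4,
    # columns are words w1..w4 (only row v1 and column w2 follow the slide).
    C = np.array([[12,  2, 0, 1],
                  [ 5, 40, 7, 3],
                  [ 2, 12, 9, 6],
                  [ 1, 43, 4, 8]], dtype=float)

    P_w_given_v = C / C.sum(axis=1, keepdims=True)  # normalize each row v_i
    P_v_given_w = C / C.sum(axis=0, keepdims=True)  # normalize each column w_j

    print(P_w_given_v[0, 0])  # P(w1 | v1) = 12 / (12+2+0+1) = 0.8
    print(P_v_given_w[2, 1])  # P(v3 | w2) = 12 / (2+40+12+43) ≈ 0.12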
15
Correspondence Translation Model (TM)
16
Translation Models
  • Duygulu et al. 2002
  • Use classical IBM machine translation models to
    translate visterms into words
  • IBM machine translation models
  • Need a bi-lingual corpus to train the models

V2 V4 V6 ↔ Maui, People, Dance
Maria no daba una bofetada a la bruja verde
↔ Mary did not slap the green witch
V1 V34 V321 V21 ↔ Tiger, grass, sky






17
Correspondence (TM )
(Diagram: a word-occurrence matrix W and a
blob-occurrence matrix B, each with N rows, one per
training image)
18
Correspondence (TM )
(Diagram: correspondence between word Wi and
blob-token Bj across the N training images)
19
Correspondence (TM )
  • Cosine Method (CSM)
  • Apply the cosine to calculate a matrix with the
    dimension W x B, in which the element in the ith
    row and jth column is the cosine between the ith
    row of the word matrix and the jth column of the
    blob matrix.

20
Correspondence (TM )
  • EM algorithm
  • EM algorithm can be used to estimate a set of
    parameters θ that describe a hidden probability
    distribution.
  • IBM model-2 for Translation
  • Try to maximize likelihood

21
Correspondence (TM )
  • EM algorithm
  • Calculate correspondences based on an estimate of
    the probability table, and use the correspondences
    to update the estimate of the probability table
  • Two constraints
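A compact, runnable sketch of this EM loop in the IBM Model 1 style that the TM model builds on; the data layout and names here are illustrative assumptions, not the paper's code.

    from collections import defaultdict

    def train_translation_probs(corpus, n_iters=20):
        """corpus: list of (words, blobs) pairs, one per training image."""
        vocab = {w for words, _ in corpus for w in words}
        # t[w][b] ~ P(word w | blob-token b), initialized uniformly
        t = defaultdict(lambda: defaultdict(lambda: 1.0 / len(vocab)))
        for _ in range(n_iters):
            count = defaultdict(lambda: defaultdict(float))
            total = defaultdict(float)
            for words, blobs in corpus:
                for w in words:
                    # E-step: expected alignment of w to each blob
                    z = sum(t[w][b] for b in blobs)
                    for b in blobs:
                        c = t[w][b] / z
                        count[w][b] += c
                        total[b] += c
            # M-step: renormalize so sum_w t(w|b) = 1 for every blob b
            for w in count:
                for b in count[w]:
                    t[w][b] = count[w][b] / total[b]
        return t

    # e.g. t = train_translation_probs([(['tiger', 'grass'], ['v1', 'v34'])])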

22
Correspondence (TM )
23
Correspondence (TM )
24
Motivation Contd.
  • Annotation results in images

Fish, building, sea, coral, flower
Sky, mountain, city
Noisy keywords
25
Motivation Contd.
  • Semantic Grouping

fish, sea, coral → choose (relevant keywords)
building, flower → remove
26
Motivation Contd.
  • Humans can do semantic grouping
  • What about computers?
  • Semantic similarity
  • fish ↔ sea, fish ↔ building
  • How similar are they?
  • WordNet

27
Approach
  • WordNet
  • A lexical database for the English language
  • English nouns, verbs, adjectives and adverbs are
    organized into synonym sets
  • Each representing one underlying lexical concept

28
WordNet
  • Car: a motor vehicle with four wheels, usually
    propelled by an internal combustion engine
    → self-propelled vehicle
    → wheeled vehicle
    → vehicle
    → instrumentation
    → artifact
    → object
    → entity

29
Detect Concept
30
Semantic Similarity (4)
  • To remove noisy annotation words in each image.
  • Three different approaches
  • (Node-based, Distance-based, Gloss-based
    approaches)
  • We use WordNet 2.0 and SemCor 2.0
  • as the knowledge base
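For readers who want to try these measures today: NLTK exposes all of them over WordNet given an information-content file. This is a present-day convenience, not the paper's setup (which used WordNet 2.0 and SemCor 2.0 directly).

    from nltk.corpus import wordnet as wn, wordnet_ic

    ic = wordnet_ic.ic('ic-semcor.dat')  # SemCor-based information content
    fish, sea = wn.synset('fish.n.01'), wn.synset('sea.n.01')

    print(fish.res_similarity(sea, ic))  # Resnik (RIK): IC of the lcs
    print(fish.jcn_similarity(sea, ic))  # Jiang-Conrath (JNC)
    print(fish.lin_similarity(sea, ic))  # Lin (LIN)
    print(fish.lch_similarity(sea))      # Leacock-Chodorow (LNC), path-based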

31
Node-based Approaches
  • Resnik Measure (RIK) '95
  • Jiang and Conrath Measure (JNC) '97
  • Lin Measure (LIN) '98

32
Resnik Measure(1)
  • First introduces the notion of Information Content (IC)
  • Uses a corpus (SemCor 2.0)
  • A concept with a high IC value
  • → the concept carries more detailed information
  • IC(cable television) > IC(television)

33
Resnik Measure(2)
  • In the corpus,
  • calculate the frequency of each concept.
  • Get the probability by relative frequency

words(c): the set of words subsumed by c
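In LaTeX form, the standard Resnik definitions this slide describes, with N the total number of corpus word tokens:

$$p(c) = \frac{\sum_{w \in \mathrm{words}(c)} \mathrm{count}(w)}{N}, \qquad \mathrm{IC}(c) = -\log p(c)$$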
34
Resnik Measure(3)
  • Determine the lcs (lowest common subsumer)
    between two words (hotel, door)
  • The IC value of the lcs is the semantic similarity
    value

(Diagram: WordNet fragment artifact → structure, with
structure → building → hotel and structure → door; the
lcs of hotel and door is structure)
35
Resnik Measure(4)
  • After detecting the lcs between two words
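The resulting measure, in LaTeX form:

$$\mathrm{sim}_{RIK}(w_1, w_2) = \mathrm{IC}(\mathrm{lcs}(w_1, w_2)) = -\log p(\mathrm{lcs}(w_1, w_2))$$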

36
Resnik Measure(4) Weakness
If the lcs is the same: soil, rock → material (4.82);
sand, rock → material (4.82)
The semantic similarity value is the same!!
No way to discriminate!!
37
Jiang Conrath Measure
  • Uses the IC (Information Content) notion
  • Considers
  • - the IC value of the lcs of the two words
  • - and the IC values of the two words themselves!!
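In LaTeX form, the standard Jiang-Conrath distance:

$$\mathrm{dist}_{JNC}(w_1, w_2) = \mathrm{IC}(w_1) + \mathrm{IC}(w_2) - 2\,\mathrm{IC}(\mathrm{lcs}(w_1, w_2))$$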

38
Lin Measure
  • Uses the ratio between the commonality of the two
    words and the information amount of each word
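In LaTeX form, the standard Lin similarity:

$$\mathrm{sim}_{LIN}(w_1, w_2) = \frac{2\,\mathrm{IC}(\mathrm{lcs}(w_1, w_2))}{\mathrm{IC}(w_1) + \mathrm{IC}(w_2)}$$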

39
Distance-based Measure
  • Leacock and Chodorow Measure (LNC) 98

40
Leacock and Chodorow Measure (LNC)
  • Measures similarity by following the IS-A
    relations in WordNet.
  • Computes the shortest number of intermediate
    nodes.

41
Leacock and Chodorow Measure (LNC) 2
ShortestLen(professional golf, baseball game) = 5
D = 17 (overall depth)
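In LaTeX form, with the slide's numbers plugged in:

$$\mathrm{sim}_{LNC}(w_1, w_2) = -\log \frac{\mathrm{len}(w_1, w_2)}{2D} = -\log \frac{5}{2 \cdot 17}$$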
42
Gloss-based Measure
  • Banerjee and Pedersen Measure (BNP)

43
Banerjee and Pedersen Measure (BNP)
  • Uses gloss overlap for computing similarity
  • The more two glosses share, the more related the
    words are!!
  • All relations
  • - hypernym, hyponym, meronym, holonym, ...

44
Banerjee and Pedersen Measure (BNP)
  • By gathering all glosses between A and B through
    all relations in WordNet.

Related pairs: (gloss, gloss), (hype, hype),
(hypo, hypo), (hype, gloss), (gloss, hype)
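A sketch of the score in LaTeX form (in Banerjee and Pedersen's formulation, each maximal overlap of n consecutive words between two glosses contributes n^2):

$$\mathrm{rel}_{BNP}(A, B) = \sum_{(r_1, r_2) \in \mathrm{RelPairs}} \mathrm{overlap}\big(\mathrm{gloss}(r_1(A)),\, \mathrm{gloss}(r_2(B))\big)$$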
45
Limitations of Measures(1)
  • RIK measure
  • Cannot differentiate two words that have the
    same lcs.
  • JNC, LIN measures
  • Suggest a way to differentiate words that have the
    same lcs, but are sensitive to the corpus

46
Limitations of Measures(2)
  • LNC measure
  • The shortest distance in WordNet may not
  • reflect the true distance!
  • Ex) furniture vs. sky → 8
  • furniture vs. door → 8
  • However, furniture is definitely closer to
    door!

47
Limitations of Measures(3)
  • BNP measure
  • Relies heavily on the shared glosses
  • If there is no common word in the glosses
  • throughout every relation in WordNet
  • → no way to get the distance
  • Ex) jet, sky → no shared word → score 0!!

48
TMHD model
  • TMHD: applying semantic similarity measures on
    top of the TM model.
  • We choose the JNC, LIN, and BNP measures, which
    outperform the other measures
  • Combine these scores for each keyword using
    Dempster-Shafer theory.

49
Dempster-Shafer Basics
  • Frame of Discernment (T)
  • a set of mutually exclusive elements
  • Power set (2^T)
  • All subsets of T
  • Elements
  • Propositions
  • Hypotheses

50
Dempster-Shafer Basics
  • Basic probability assignment or mass function (m)
  • m(A): a measure of the portion of the total
    belief committed to A
  • Uncertainty of the evidence

51
Dempster-Shafer Example
  • What is the next page a user will visit?
  • Given a web site of three pages B, C, D
  • T = {B, C, D}
  • 2^T = {{B}, {C}, {D}, {B, C}, {B, D}, {C, D},
    {B, C, D}, ∅}
  • We have evidence that
  • m({B}) = 0.3
  • m({B, D}) = 0.1
  • m({D}) = 0.2
  • Uncertainty
  • m(T) = 1 - 0.3 - 0.1 - 0.2 = 0.4

52
Dempster's Rule
  • Belief
  • Plausibility
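The slide's two quantities are the standard Dempster-Shafer ones, in LaTeX form:

$$\mathrm{Bel}(A) = \sum_{B \subseteq A} m(B), \qquad \mathrm{Pl}(A) = \sum_{B \cap A \neq \emptyset} m(B)$$

A small Python check of these definitions against the masses from the previous web-page example (a sketch; only the mass values come from the slide):

    # T = {B, C, D}; masses from the web-page example
    m = {
        frozenset('B'): 0.3,
        frozenset('BD'): 0.1,
        frozenset('D'): 0.2,
        frozenset('BCD'): 0.4,  # uncertainty m(T)
    }

    def belief(A):
        """Bel(A): total mass of all subsets of A."""
        A = frozenset(A)
        return sum(v for s, v in m.items() if s <= A)

    def plausibility(A):
        """Pl(A): total mass of all sets that intersect A."""
        A = frozenset(A)
        return sum(v for s, v in m.items() if s & A)

    print(belief('B'), plausibility('B'))    # 0.3 0.8
    print(belief('BD'), plausibility('BD'))  # 0.6 1.0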

53
Dempster's Rule in TM Model
  • Why?
  • JNC, LIN, BNP give better results
  • Combine the three measures into one by giving
  • different weights.
  • The importance of each measure differs locally
    from image to image
  • The TMHD model can combine the three
    evidences dynamically using Dempster's rule.

54
Dempster's Rule in TM Model Cont.
  • H (Hypothesis): assignment of a similarity value
    between annotated keywords.
  • e.g., hypothesis: semantic dominance of "sky" in
    one image
  • → semantic similarity of "sky" with the other
    keywords in a particular image

55
Dempster's Rule in TM Model Cont.
  • Dempster's Rule (with 2 evidences)
  • H is a member of the power set of the
    frame of discernment
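In LaTeX form, the standard two-evidence combination rule:

$$m_{1,2}(H) = \frac{\sum_{A \cap B = H} m_1(A)\, m_2(B)}{1 - \sum_{A \cap B = \emptyset} m_1(A)\, m_2(B)}$$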
56
Dempster's Rule in TM Model Cont.
  • Enhanced Dempster's Rule (with 3 evidences)

m1(A) is the portion of belief assigned to A by m1
m_{1,2,3}(A) is Dempster's combined probability for a
hypothesis
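The three-evidence form being described, in LaTeX:

$$m_{1,2,3}(H) = \frac{\sum_{A \cap B \cap C = H} m_1(A)\, m_2(B)\, m_3(C)}{1 - \sum_{A \cap B \cap C = \emptyset} m_1(A)\, m_2(B)\, m_3(C)}$$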
57
Dempster's Rule in TM Model Cont.
  • Example 1)
  • The image has three annotation words (A, B, C)
  • Non-empty proper subsets of the frame: {A}, {B},
    {C}, {A, B}, {A, C}, {B, C}

58
Dempster's Rule in TM Model Cont.
  • Example 1)

In many cases, basic probabilities of every
proper subset of the frame may not be available!!
If there is no belief in one subset, its basic
probability is simply zero.

59
Dempster's Rule in TM Model Cont.
  • We expect the evidences (JNC, LIN, BNP) to
    evaluate the semantic dominance of only one keyword
    at a time.
  • So we have positive evidence for the singleton
    hypotheses only,

60
Dempster's Rule in TM Model Cont.
  • The uncertainty of the evidence in our case is the
    mass left for the whole frame, m(T)
  • The TMHD model predicts the semantic similarity by
    combining the evidences with Dempster's rule in
    this way
61
Dempster's Rule in TM Model Cont.
  • So, we can get

62
Dempster's Rule in TM Model Cont.
  • The simplified Dempster's Rule
  • After eliminating zero terms
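A runnable sketch of this simplified combination, assuming (as the preceding slides state) that every evidence puts mass only on singleton hypotheses {w} plus the whole frame T; the function and the example numbers are illustrative, not the paper's:

    def combine_evidences(words, masses, uncertainties):
        """masses[k][w]: mass evidence k gives singleton {w};
        uncertainties[k]: mass evidence k gives the whole frame T."""
        u_prod = 1.0
        for u in uncertainties:
            u_prod *= u  # the all-T intersection yields T itself
        combined = {}
        for w in words:
            # The intersection is {w} when each factor is {w} or T,
            # except the all-T case, which we subtract once.
            prod = 1.0
            for m_k, u_k in zip(masses, uncertainties):
                prod *= m_k.get(w, 0.0) + u_k
            combined[w] = prod - u_prod
        combined['T'] = u_prod
        norm = sum(combined.values())  # empty intersections drop out
        return {k: v / norm for k, v in combined.items()}

    words = ['sun', 'water', 'field', 'pillar']
    masses = [  # illustrative JNC / LIN / BNP basic probabilities
        {'sun': 0.3, 'water': 0.3,  'field': 0.2,  'pillar': 0.05},
        {'sun': 0.3, 'water': 0.25, 'field': 0.25, 'pillar': 0.05},
        {'sun': 0.2, 'water': 0.2,  'field': 0.2,  'pillar': 0.1},
    ]
    uncertainties = [0.15, 0.15, 0.3]
    print(combine_evidences(words, masses, uncertainties))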

63
Dempster's Rule in TM Model Cont.
Uncertainty in the bodies of evidence for the TMHD
model:

TSD (Total Semantic Distance): the summed
semantic distance between a word and every
other word in an image
64
Dempster's Rule in TM Model Cont.
  • Uncertainty based on the TSD
  • TSD(JNC): the summed JNC distance
  • over pairwise keywords
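As a formula (a sketch; the indexing over the image's keywords is assumed):

$$\mathrm{TSD}_{JNC} = \sum_{i < j} \mathrm{dist}_{JNC}(w_i, w_j)$$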

65
Dempster's Rule in TM Model Cont.
  • Example 2
  • Apply Dempster's rule to remove the
  • noisy word obtained from the TM model

Sun, water, field, pillar → TM's annotation
result
66
Dempster's Rule in TM Model Cont.
  • Within this image,
  • TSD(JNC) = 2.2087, TSD(LIN) = 2.2875
  • TSD(BNP) = 5.69211
  • Uncertainty
  • Basic probs.

67
Dempster's Rule in TM Model Cont.
  • Get the final combination result
  • Normalize the values for ranking

Remove!!
68
Results(1)
  • Dataset
  • Corel Stock Photo CDs.
  • 600 CDs, each consisting of 100 images
    under the same topic.
  • We select 5000 images (4500 training, 500
    testing). Each image has a manual annotation.
  • 374 words and 500 blobs.
  • Example annotations: sun, city, sky, mountain;
  • grizzly, bear, meadow, water

69
Results(2)
70
Results(3)
71
Overall Procedure
  • Problem Statement
  • Motivation for KBIAR
  • Approach
  • Semantic-Similarity (WordNet)
  • Several Similarity Measures
  • Combining Semantic Evidence Model
  • Results
  • Contributions
  • Future Work as an Ongoing Project

72
Contributions
  • Opens a new branch for the Content-Based Image
    Retrieval area → context
  • Refinement process → required for much multimedia
    data (image and video retrieval)
  • Web-image annotation

73
Major Following Works in Knowledge-based Image
Annotation Refinement
  • Image annotation refinement using random walk with
    restarts. C. Wang, F. Jing, L. Zhang, H.-J. Zhang.
    Proceedings of the 14th Annual ACM International
    Conference on Multimedia, 2006.
  • An adaptive graph model for automatic image
    annotation. J. Liu, M. Li, W.-Y. Ma, Q. Liu, H. Lu.
    Proceedings of the 8th ACM International Workshop
    on Multimedia Information Retrieval (MIR '06).
  • A search-based web image annotation method.
    X. Rui, N. Yu, T. Wang, M. Li. IEEE International
    Conference on Multimedia and Expo (ICME '07), 2007.
  • Automatic refinement of keyword annotations for
    web image search. B. Wang, Z. Li, M. Li. Springer
    (MMM '07).
  • Refining image annotation using contextual
    relations between words. Y. Wang, S. Gong.
    Proceedings of the 6th ACM International Conference
    on Image and Video Retrieval (CIVR '07).
  • A content-based image annotation refinement.
    C. Wang, F. Jing, L. Zhang, H.-J. Zhang. IEEE
    Conference on Computer Vision and Pattern
    Recognition (CVPR '07).
  • Image annotation refinement using NSC-based word
    correlation. J. Liu, M. Li, Q. Liu, H. Lu, S. Ma.
    IEEE International Conference on Multimedia and
    Expo (ICME '07), 2007.
  • Automatic image annotation by an iterative
    approach incorporating keyword correlations and
    region matching. CIVR '07.

74
Smoothing Effect
  • Low-level features → classification
  • Wait a minute, we have the time and resources for
    bridging this semantic gap!!

75
Smoothing Effect
  • Re-ranking Graph Algorithms
  • Graph random walks (Wang et al., ACM MM '06)
  • Adaptive graph model (Liu et al., MIR '06)
  • Iterative annotation graph model (Zhou et al.,
    ACM CIVR '07)
  • Web-image Annotation based on Search
  • Web-keywords refinement (Wang et al., MMM '07)
  • Bipartite graph model for web-image annotation
  • (Rui et al., ACM MM '07)

76
Refining Decision Table
(Table: candidate keywords scored against global
textual relations, web-search results, the knowledge
base, local textual co-occurrence, and visual
similarity, refined by graph-heuristic or
machine-learning algorithms)
The decision table is too heavy!
77
Problem Reduction
  • Image Annotation Refinement → weighted Maximum-Cut
    problem

(Diagram: graph G(V, E) where V = the candidate
keywords {woods, building, fish, sky, desktop,
politics} and E = edges weighted by the semantic
distance between keywords)
78
Complexity of Image Annotation Problem
(Diagram: the original problem is NP-complete, and the
image annotation problem inherits this; hence KBIAR
(Knowledge-based Image Annotation Refinement) is
NP-complete)
79
Optimal Solution of weighted Maximum-Cut
(Diagram: an optimal cut labels woods, building, and
sky with -1 and desktop, fish, and politics with +1)
We need to check another guess!!
80
Approximation Algorithm for WMC using
Randomization (Goemans et al. '95)
  • Relaxation Process 1
  • 1-dimensional variables of unit norm →
    2-dimensional vectors of unit norm
  • Let us define the WMC 2-relaxation problem
  • (WMC-2VQP)

81
Approximation Algorithm for WMC using
Randomization (Goemans et al. '95)
  • Relaxation Process 2
  • 2-dimensional variables of unit norm →
    n-dimensional vectors of unit norm
  • Construct a positive semi-definite matrix M,
  • Formulate the WMC-SDP (semi-definite program)
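For reference, the weighted Max-Cut objective and its vector relaxation in the standard Goemans-Williamson form that these two processes follow:

$$\max \ \tfrac{1}{2} \sum_{i<j} w_{ij}\,(1 - x_i x_j), \quad x_i \in \{-1, +1\}$$

$$\max \ \tfrac{1}{2} \sum_{i<j} w_{ij}\,(1 - v_i \cdot v_j), \quad v_i \in \mathbb{R}^n, \ \|v_i\| = 1$$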

82
Randomized Approximation Scheme (2-way)
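A minimal Python sketch of the random-hyperplane rounding step, assuming the SDP solution is given as an n x d array V of (approximately) unit row vectors; all names and the toy data are illustrative.

    import numpy as np

    def random_hyperplane_round(V, rng=None):
        """Assign +1/-1 to each vertex by the side of a random
        hyperplane through the origin its relaxed vector falls on."""
        rng = rng or np.random.default_rng()
        r = rng.standard_normal(V.shape[1])  # random normal vector
        return np.where(V @ r >= 0, 1, -1)

    def cut_weight(W, x):
        """Weight of edges crossing the cut (W symmetric, zero diagonal)."""
        return 0.25 * np.sum(W * (1 - np.outer(x, x)))

    # Toy instance: 3 keywords, pairwise semantic distances as edge weights
    W = np.array([[0.0, 0.9, 0.1],
                  [0.9, 0.0, 0.8],
                  [0.1, 0.8, 0.0]])
    V = np.array([[ 1.0, 0.0],         # pretend relaxed unit vectors
                  [-0.6, 0.8],
                  [ 0.9, 0.436]])
    x = max((random_hyperplane_round(V) for _ in range(100)),
            key=lambda x: cut_weight(W, x))
    print(x, cut_weight(W, x))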
83
Relaxation Effect on Image Annotation Refinement
(Diagram: candidate keywords (building, crystal,
anemone, palace, reef, people) as node variables, with
semantic distances as edge values, before and after
the relaxation)
84
RMCA for Image Annotation Refinement
85
RMCA for Image Annotation Refinement
86
RMCA for Image Annotation Refinement
87
RMCA for Image Annotation Refinement
88
2-dimensional Random Hyperplane for Decision
  • Deterministic Decision

89
2-dimensional Random Hyperplane for Decision
90
Another Semantic Distance (Normalized Google
Distance)
  • Dynamic
  • Diversity
  • Example
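The standard definition (Cilibrasi and Vitányi), where f(x) is the number of pages containing term x, f(x, y) the number containing both, and N the total number of indexed pages:

$$\mathrm{NGD}(x, y) = \frac{\max\{\log f(x), \log f(y)\} - \log f(x, y)}{\log N - \min\{\log f(x), \log f(y)\}}$$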

91
Result with Corel Image Set
92
Google Image Labeler
93
Context, rather than Concept
94
Web? There is an answer... but
Woods, hoping to extend winning streak, charges
to lead at Dubai Desert Classic Published
January 30, 2008 DUBAI, United Arab Emirates
(AP) Tiger Woods picked up right where he left
off last week - at the top of the
leaderboard. Woods, who won the Buick
Invitational on Sunday by eight strokes, shot a
7-under 65 Thursday to take a two-shot lead after
the first round of the Dubai Desert Classic. "I
played well today, just a bunch of good golf
shots,"" Woods said after his bogey-free round at
the Emirates Golf Club. Eleven players, including
Miguel Angel Jimenez and Abu Dhabi Golf
Championship winner Martin Kaymer, were tied for
second at 67. Ernie Els, Sergio Garcia and
defending champion Henrik Stenson were tied with
10 others another stroke back. Woods said he
played better in Dubai than he did last week at
Torrey Pines. "I had two good days of practice
the last couple days and started to hit the ball
a lot better than I did last week," said Woods,
who won the Dubai tournament in 2006.
(Extracted keywords: woods, Tiger Woods, golf shot,
stroke)
95
Result with Web-images
96
Decision: 2-d hyperplane

(Diagram (a): the keywords extracted from the Woods
article (today, winning streak, winner, Golf Club,
first round, golf shot, golf championship, Kaymer,
stroke, woods, Tiger Woods, Buick Invitational, Dubai,
Emirates, desert, Ernie Els, Sergio Garcia, Henrik
Stenson, and others) arranged around the random
decision hyperplane r)
97
2nd round decision
(Diagrams (b), (c): keywords from a second news story
about Israeli forces arresting Palestinian militants
in Bethlehem (Israeli civilian, Palestinian, militant,
Bethlehem, Jihad, West Bank, arrest, town, attacks,
night, soldiers, terror, forces, terrorist, and
others), split by the decision hyperplane r; panel (c)
shows the second-round decision over the keywords kept
after the first round)
98
Discussion & Future Work
  • Evaluation scheme
  • More refined web data (e.g., Wikipedia)
  • Vision vs. semantic measure
  • Video refinement