Image Annotation? Trends and Direction for Real-World Image Annotation
1
Image annotation? Trends and Direction for
real-world image Annotation
Yohan Jin, Latifur Khan, et al., "Image
Annotations by Combining Multiple Evidence &
WordNet," in Proc. of the 13th Annual ACM
International Conference on Multimedia (ACM MM'05),
Singapore, November 2005, pages 706-715

2
Image Annotation?
soldiers, clock, door, wall, mirror
yohan jin, military officer, military friends
3
Problem Statement
  • What is Automatic Image Annotation (AIA)?

Tiger Woods (people), palm (tree), building,
grass, golf, sky
Automatically annotate images then retrieve based
on the textual annotations.
4
Problem Statement contd.
  • Why do many multimedia researchers love to work on it?
  • Big Burst of Multimedia Content
  • E.g., Flickr, YouTube
  • Interdisciplinary Area
  • Computer Vision, Machine Learning,
    Information Retrieval, NLP and so on.

5
Overall Procedure
  • Problem Statement
  • Motivation for KBIAR
  • Approach
  • Semantic-Similarity (WordNet)
  • Several Similarity Measures
  • Combining Semantic Evidence Model
  • Results
  • Contributions
  • Future Work as an Ongoing Project

6
Motivation
  • How to retrieve images/videos?
  • CBIR is based on similarity search of visual
    features
  • Doesn't support textual queries
  • Doesn't capture semantics
  • Automatically annotate images then retrieve based
    on the textual annotations.

Example annotations: tiger, grass.
7
Motivation for KBIAR
  • There is a gap between perceptual issue and
    conceptual issue.
  • Semantic gap: hard to represent semantic meaning
    using low-level image features like color,
    texture, and shape.
  • It's possible to answer the query "red ball" with
    a red rose.

8
Motivation
  • Most current automatic image annotation and
    retrieval approaches consider
  • Keywords
  • Low-level image features for visual
    token/region/object
  • Correspondence between keywords and visual
    tokens
  • Our goal is to develop automated image annotation
    techniques with better accuracy

9
Annotation
10
Annotation
  • Major steps
  • Segmentation into regions
  • Clustering to construct blob-tokens
  • Analyze correspondence between keywords and
    blob-tokens
  • Auto Annotation

11
Annotation Segmentation Clustering
  • Images → Segments → Blob-tokens

12
Annotation Correspondence/Linking
  • Our purpose is to find correspondence between
    words and blob-tokens.
  • e.g., P(Tiger | v1), P(v2 | grass)

13
Auto Annotation

14
Co-Occurrence Models
  • Mori et al. 1999
  • Create co-occurrence table using a training set
    of annotated images
  • Tend to annotate with high frequency words
  • Context is ignored
  • Needs joint probability models

P(w1 | v1) = 12 / (12 + 2 + 0 + 1) = 0.8
P(v3 | w2) = 12 / (2 + 40 + 12 + 43) ≈ 0.12
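Below is a minimal Python sketch of how such a co-occurrence table yields these conditional probabilities. Only the first row and the second column mirror the slide's counts; every other entry is a made-up placeholder.

    import numpy as np

    # Co-occurrence counts n(v_i, w_j): rows are visual tokens v1..v4,
    # columns are words w1..w4 (only row v1 and column w2 follow the slide).
    C = np.array([[12,  2, 0, 1],
                  [ 5, 40, 7, 3],
                  [ 2, 12, 9, 6],
                  [ 1, 43, 4, 8]], dtype=float)

    P_w_given_v = C / C.sum(axis=1, keepdims=True)  # normalize each row v_i
    P_v_given_w = C / C.sum(axis=0, keepdims=True)  # normalize each column w_j

    print(P_w_given_v[0, 0])  # P(w1 | v1) = 12 / (12+2+0+1) = 0.8
    print(P_v_given_w[2, 1])  # P(v3 | w2) = 12 / (2+40+12+43) ≈ 0.12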
15
Correspondence Translation Model (TM)
16
Translation Models
  • Duygulu et al. 2002
  • Use classical IBM machine translation models to
    translate visterms into words
  • IBM machine translation models
  • Need a bi-lingual corpus to train the models

V2 V4 V6 ↔ Maui, People, Dance
Maria no daba una bofetada a la bruja verde
↔ Mary did not slap the green witch
V1 V34 V321 V21 ↔ Tiger, grass, sky






17
Correspondence (TM )
(Diagram: a word-occurrence matrix W and a
blob-occurrence matrix B, each with N rows, one per
training image)
18
Correspondence (TM )
(Diagram: correspondence between word Wi and
blob-token Bj across the N training images)
19
Correspondence (TM )
  • Cosine Method (CSM)
  • Apply the cosine to calculate a matrix with the
    dimension W x B, in which the element in the ith
    row and jth column is the cosine between the ith
    row of the word matrix and the jth column of the
    blob matrix.

20
Correspondence (TM )
  • EM algorithm
  • EM algorithm can be used to estimate a set of
    parameters θ that describe a hidden probability
    distribution.
  • IBM model-2 for Translation
  • Try to maximize likelihood

21
Correspondence (TM )
  • EM algorithm
  • Calculate correspondences based on an estimate of
    the probability table, and use the correspondences
    to update the estimate of the probability table
  • Two constraints
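A compact, runnable sketch of this EM loop in the IBM Model 1 style that the TM model builds on; the data layout and names here are illustrative assumptions, not the paper's code.

    from collections import defaultdict

    def train_translation_probs(corpus, n_iters=20):
        """corpus: list of (words, blobs) pairs, one per training image."""
        vocab = {w for words, _ in corpus for w in words}
        # t[w][b] ~ P(word w | blob-token b), initialized uniformly
        t = defaultdict(lambda: defaultdict(lambda: 1.0 / len(vocab)))
        for _ in range(n_iters):
            count = defaultdict(lambda: defaultdict(float))
            total = defaultdict(float)
            for words, blobs in corpus:
                for w in words:
                    # E-step: expected alignment of w to each blob
                    z = sum(t[w][b] for b in blobs)
                    for b in blobs:
                        c = t[w][b] / z
                        count[w][b] += c
                        total[b] += c
            # M-step: renormalize so sum_w t(w|b) = 1 for every blob b
            for w in count:
                for b in count[w]:
                    t[w][b] = count[w][b] / total[b]
        return t

    # e.g. t = train_translation_probs([(['tiger', 'grass'], ['v1', 'v34'])])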

22
Correspondence (TM )
23
Correspondence (TM )
24
Motivation Contd.
  • Annotation results in images

Fish, building, sea, coral, flower
Sky, mountain, city
Noisy keywords
25
Motivation Contd.
  • Semantic Grouping

fish, sea, coral → choose (relevant keywords)
building, flower → remove
26
Motivation Contd.
  • Humans can do semantic grouping
  • What about computers?
  • Semantic similarity
  • fish ↔ sea, fish ↔ building
  • How similar are they?
  • WordNet

27
Approach
  • WordNet
  • A lexical database for the English language
  • English nouns, verbs, adjectives and adverbs are
    organized into synonym sets
  • Each representing one underlying lexical concept

28
WordNet
  • Car: a motor vehicle with four wheels, usually
    propelled by an internal combustion engine
    → self-propelled vehicle
    → wheeled vehicle
    → vehicle
    → instrumentation
    → artifact
    → object
    → entity

29
Detect Concept
30
Semantic Similarity (4)
  • To remove noisy annotation words in each image.
  • Three different approaches
  • (Node-based, Distance-based, Gloss-based
    approaches)
  • We use WordNet 2.0 and SemCor 2.0
  • as the knowledge base
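For readers who want to try these measures today: NLTK exposes all of them over WordNet given an information-content file. This is a present-day convenience, not the paper's setup (which used WordNet 2.0 and SemCor 2.0 directly).

    from nltk.corpus import wordnet as wn, wordnet_ic

    ic = wordnet_ic.ic('ic-semcor.dat')  # SemCor-based information content
    fish, sea = wn.synset('fish.n.01'), wn.synset('sea.n.01')

    print(fish.res_similarity(sea, ic))  # Resnik (RIK): IC of the lcs
    print(fish.jcn_similarity(sea, ic))  # Jiang-Conrath (JNC)
    print(fish.lin_similarity(sea, ic))  # Lin (LIN)
    print(fish.lch_similarity(sea))      # Leacock-Chodorow (LNC), path-based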

31
Node-based Approaches
  • Resnik Measure (RIK) '95
  • Jiang and Conrath Measure (JNC) '97
  • Lin Measure (LIN) '98

32
Resnik Measure(1)
  • First introduces the notion of Information Content (IC)
  • Uses a corpus (SemCor 2.0)
  • A concept with a high IC value
  • → the concept carries more detailed information
  • IC(cable television) > IC(television)

33
Resnik Measure(2)
  • In the corpus,
  • calculate the frequency of each concept.
  • Get the probability by relative frequency

words(c): the set of words subsumed by c
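In LaTeX form, the standard Resnik definitions this slide describes, with N the total number of corpus word tokens:

$$p(c) = \frac{\sum_{w \in \mathrm{words}(c)} \mathrm{count}(w)}{N}, \qquad \mathrm{IC}(c) = -\log p(c)$$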
34
Resnik Measure(3)
  • Determine the lcs (lowest common subsumer)
    between two words (hotel, door)
  • The IC value of the lcs is the semantic similarity
    value

(Diagram: WordNet fragment artifact → structure, with
structure → building → hotel and structure → door; the
lcs of hotel and door is structure)
35
Resnik Measure(4)
  • After detecting the lcs between two words
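The resulting measure, in LaTeX form:

$$\mathrm{sim}_{RIK}(w_1, w_2) = \mathrm{IC}(\mathrm{lcs}(w_1, w_2)) = -\log p(\mathrm{lcs}(w_1, w_2))$$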

36
Resnik Measure(4) Weakness
If the lcs is the same: soil, rock → material (4.82);
sand, rock → material (4.82)
The semantic similarity value is the same!!
No way to discriminate!!
37
Jiang Conrath Measure
  • Uses the IC (Information Content) notion
  • Considers
  • - the IC value of the lcs of the two words
  • - and the IC values of the two words themselves!!
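In LaTeX form, the standard Jiang-Conrath distance:

$$\mathrm{dist}_{JNC}(w_1, w_2) = \mathrm{IC}(w_1) + \mathrm{IC}(w_2) - 2\,\mathrm{IC}(\mathrm{lcs}(w_1, w_2))$$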

38
Lin Measure
  • Uses the ratio between the commonality of the two
    words and the information amount of each word
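In LaTeX form, the standard Lin similarity:

$$\mathrm{sim}_{LIN}(w_1, w_2) = \frac{2\,\mathrm{IC}(\mathrm{lcs}(w_1, w_2))}{\mathrm{IC}(w_1) + \mathrm{IC}(w_2)}$$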

39
Distance-based Measure
  • Leacock and Chodorow Measure (LNC) 98

40
Leacock and Chodorow Measure (LNC)
  • Measures similarity by following the IS-A
    relations in WordNet.
  • Computes the shortest number of intermediate
    nodes.

41
Leacock and Chodorow Measure (LNC) 2
ShortestLen(professional golf, baseball game) = 5
D = 17 (overall depth)
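In LaTeX form, with the slide's numbers plugged in:

$$\mathrm{sim}_{LNC}(w_1, w_2) = -\log \frac{\mathrm{len}(w_1, w_2)}{2D} = -\log \frac{5}{2 \cdot 17}$$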
42
Gloss-based Measure
  • Banerjee and Pedersen Measure (BNP)

43
Banerjee and Pedersen Measure (BNP)
  • Uses gloss overlap for computing similarity
  • The more two glosses share, the more related the
    words are!!
  • All relations
  • - hypernym, hyponym, meronym, holonym, ...

44
Banerjee and Pedersen Measure (BNP)
  • By gathering all glosses between A and B through
    all relations in WordNet.

Related pairs: (gloss, gloss), (hype, hype),
(hypo, hypo), (hype, gloss), (gloss, hype)
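A sketch of the score in LaTeX form (in Banerjee and Pedersen's formulation, each maximal overlap of n consecutive words between two glosses contributes n^2):

$$\mathrm{rel}_{BNP}(A, B) = \sum_{(r_1, r_2) \in \mathrm{RelPairs}} \mathrm{overlap}\big(\mathrm{gloss}(r_1(A)),\, \mathrm{gloss}(r_2(B))\big)$$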
45
Limitations of Measures(1)
  • RIK measure
  • Cannot differentiate two words that have the
    same lcs.
  • JNC, LIN measures
  • Suggest a way to differentiate words that have the
    same lcs, but are sensitive to the corpus

46
Limitations of Measures(2)
  • LNC measure
  • The shortest distance in WordNet may not
  • reflect the true distance!
  • Ex) furniture vs. sky → 8
  • furniture vs. door → 8
  • However, furniture is definitely closer to
    door!

47
Limitations of Measures(3)
  • BNP measure
  • Relies heavily on the shared glosses
  • If there is no common word in the glosses
  • throughout every relation in WordNet
  • → no way to get the distance
  • Ex) jet, sky → no shared word → score 0!!

48
TMHD model
  • TMHD: applying semantic similarity measures on
    top of the TM model.
  • We choose the JNC, LIN, and BNP measures, which
    outperform the other measures
  • Combine these scores for each keyword using
    Dempster-Shafer theory.

49
Dempster-Shafer Basics
  • Frame of Discernment (T)
  • a set of mutually exclusive elements
  • Power set (2^T)
  • All subsets of T
  • Elements
  • Propositions
  • Hypotheses

50
Dempster-Shafer Basics
  • Basic probability assignment or mass function (m)
  • m(A): a measure of the portion of the total
    belief committed to A
  • Uncertainty of the evidence

51
Dempster-Shafer Example
  • What is the next page a user will visit?
  • Given a web site of three pages B, C, D
  • T = {B, C, D}
  • 2^T = {{B}, {C}, {D}, {B, C}, {B, D}, {C, D},
    {B, C, D}, ∅}
  • We have evidence that
  • m({B}) = 0.3
  • m({B, D}) = 0.1
  • m({D}) = 0.2
  • Uncertainty
  • m(T) = 1 - 0.3 - 0.1 - 0.2 = 0.4

52
Dempster's Rule
  • Belief
  • Plausibility
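The slide's two quantities are the standard Dempster-Shafer ones, in LaTeX form:

$$\mathrm{Bel}(A) = \sum_{B \subseteq A} m(B), \qquad \mathrm{Pl}(A) = \sum_{B \cap A \neq \emptyset} m(B)$$

A small Python check of these definitions against the masses from the previous web-page example (a sketch; only the mass values come from the slide):

    # T = {B, C, D}; masses from the web-page example
    m = {
        frozenset('B'): 0.3,
        frozenset('BD'): 0.1,
        frozenset('D'): 0.2,
        frozenset('BCD'): 0.4,  # uncertainty m(T)
    }

    def belief(A):
        """Bel(A): total mass of all subsets of A."""
        A = frozenset(A)
        return sum(v for s, v in m.items() if s <= A)

    def plausibility(A):
        """Pl(A): total mass of all sets that intersect A."""
        A = frozenset(A)
        return sum(v for s, v in m.items() if s & A)

    print(belief('B'), plausibility('B'))    # 0.3 0.8
    print(belief('BD'), plausibility('BD'))  # 0.6 1.0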

53
Dempster's Rule in TM Model
  • Why?
  • JNC, LIN, BNP give better results
  • Combine the three measures into one by giving
  • different weights.
  • The importance of each measure differs locally
    from image to image
  • The TMHD model can combine the three
    evidences dynamically using Dempster's rule.

54
Dempster's Rule in TM Model Cont.
  • H (Hypothesis): assignment of a similarity value
    between annotated keywords.
  • e.g., hypothesis: semantic dominance of "sky" in
    one image
  • → semantic similarity of "sky" with the other
    keywords in a particular image

55
Dempster's Rule in TM Model Cont.
  • Dempster's Rule (with 2 evidences)
  • H is a member of the power set of the
    frame of discernment
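In LaTeX form, the standard two-evidence combination rule:

$$m_{1,2}(H) = \frac{\sum_{A \cap B = H} m_1(A)\, m_2(B)}{1 - \sum_{A \cap B = \emptyset} m_1(A)\, m_2(B)}$$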
56
Dempster's Rule in TM Model Cont.
  • Enhanced Dempster's Rule (with 3 evidences)

m1(A) is the portion of belief assigned to A by m1
m_{1,2,3}(A) is Dempster's combined probability for a
hypothesis
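The three-evidence form being described, in LaTeX:

$$m_{1,2,3}(H) = \frac{\sum_{A \cap B \cap C = H} m_1(A)\, m_2(B)\, m_3(C)}{1 - \sum_{A \cap B \cap C = \emptyset} m_1(A)\, m_2(B)\, m_3(C)}$$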
57
Dempster's Rule in TM Model Cont.
  • Example 1)
  • The image has three annotation words (A, B, C)
  • Non-empty proper subsets of the frame: {A}, {B},
    {C}, {A, B}, {A, C}, {B, C}

58
Dempster's Rule in TM Model Cont.
  • Example 1)

In many cases, basic probabilities of every
proper subset of the frame may not be available!!
If there is no belief in one subset, its basic
probability is simply zero.

59
Dempster's Rule in TM Model Cont.
  • We expect the evidences (JNC, LIN, BNP) to
    evaluate the semantic dominance of only one keyword
    at a time.
  • So we have positive evidence for the singleton
    hypotheses only,

60
Dempster's Rule in TM Model Cont.
  • The uncertainty of the evidence in our case is the
    mass left for the whole frame, m(T)
  • The TMHD model predicts the semantic similarity by
    combining the evidences with Dempster's rule in
    this way
61
Dempster's Rule in TM Model Cont.
  • So, we can get

62
Dempster's Rule in TM Model Cont.
  • The simplified Dempster's Rule
  • After eliminating zero terms
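A runnable sketch of this simplified combination, assuming (as the preceding slides state) that every evidence puts mass only on singleton hypotheses {w} plus the whole frame T; the function and the example numbers are illustrative, not the paper's:

    def combine_evidences(words, masses, uncertainties):
        """masses[k][w]: mass evidence k gives singleton {w};
        uncertainties[k]: mass evidence k gives the whole frame T."""
        u_prod = 1.0
        for u in uncertainties:
            u_prod *= u  # the all-T intersection yields T itself
        combined = {}
        for w in words:
            # The intersection is {w} when each factor is {w} or T,
            # except the all-T case, which we subtract once.
            prod = 1.0
            for m_k, u_k in zip(masses, uncertainties):
                prod *= m_k.get(w, 0.0) + u_k
            combined[w] = prod - u_prod
        combined['T'] = u_prod
        norm = sum(combined.values())  # empty intersections drop out
        return {k: v / norm for k, v in combined.items()}

    words = ['sun', 'water', 'field', 'pillar']
    masses = [  # illustrative JNC / LIN / BNP basic probabilities
        {'sun': 0.3, 'water': 0.3,  'field': 0.2,  'pillar': 0.05},
        {'sun': 0.3, 'water': 0.25, 'field': 0.25, 'pillar': 0.05},
        {'sun': 0.2, 'water': 0.2,  'field': 0.2,  'pillar': 0.1},
    ]
    uncertainties = [0.15, 0.15, 0.3]
    print(combine_evidences(words, masses, uncertainties))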

63
Dempster's Rule in TM Model Cont.
Uncertainty in the bodies of evidence for the TMHD
model:

TSD (Total Semantic Distance): the summed
semantic distance between a word and every
other word in an image
64
Dempster's Rule in TM Model Cont.
  • Uncertainty based on the TSD
  • TSD(JNC): the summed JNC distance
  • over pairwise keywords
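As a formula (a sketch; the indexing over the image's keywords is assumed):

$$\mathrm{TSD}_{JNC} = \sum_{i < j} \mathrm{dist}_{JNC}(w_i, w_j)$$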

65
Dempster's Rule in TM Model Cont.
  • Example 2
  • Apply Dempster's rule to remove the
  • noisy word obtained from the TM model

Sun, water, field, pillar → TM's annotation
result
66
Dempster's Rule in TM Model Cont.
  • Within this image,
  • TSD(JNC) = 2.2087, TSD(LIN) = 2.2875
  • TSD(BNP) = 5.69211
  • Uncertainty
  • Basic probs.

67
Dempster's Rule in TM Model Cont.
  • Get the final combination result
  • Normalize the values for ranking

Remove!!
68
Results(1)
  • Dataset
  • Corel Stock Photo CDs.
  • 600 CDs, each consisting of 100 images
    under the same topic.
  • We select 5000 images (4500 training, 500
    testing). Each image has a manual annotation.
  • 374 words and 500 blobs.
  • Example annotations: sun, city, sky, mountain;
  • grizzly, bear, meadow, water

69
Results(2)
70
Results(3)
71
Overall Procedure
  • Problem Statement
  • Motivation for KBIAR
  • Approach
  • Semantic-Similarity (WordNet)
  • Several Similarity Measures
  • Combining Semantic Evidence Model
  • Results
  • Contributions
  • Future Work as an Ongoing Project

72
Contributions
  • Opens a new branch for the Content-Based Image
    Retrieval area → context
  • Refinement process → required for much multimedia
    data (image and video retrieval)
  • Web-image annotation

73
Major Following Works in Knowledge-based Image
Annotation Refinement
  • Image annotation refinement using random walk with
    restarts. C. Wang, F. Jing, L. Zhang, H.-J. Zhang.
    Proceedings of the 14th Annual ACM International
    Conference on Multimedia, 2006.
  • An adaptive graph model for automatic image
    annotation. J. Liu, M. Li, W.-Y. Ma, Q. Liu, H. Lu.
    Proceedings of the 8th ACM International Workshop
    on Multimedia Information Retrieval (MIR '06).
  • A search-based web image annotation method.
    X. Rui, N. Yu, T. Wang, M. Li. IEEE International
    Conference on Multimedia and Expo (ICME '07), 2007.
  • Automatic refinement of keyword annotations for
    web image search. B. Wang, Z. Li, M. Li. Springer
    (MMM '07).
  • Refining image annotation using contextual
    relations between words. Y. Wang, S. Gong.
    Proceedings of the 6th ACM International Conference
    on Image and Video Retrieval (CIVR '07).
  • A content-based image annotation refinement.
    C. Wang, F. Jing, L. Zhang, H.-J. Zhang. IEEE
    Conference on Computer Vision and Pattern
    Recognition (CVPR '07).
  • Image annotation refinement using NSC-based word
    correlation. J. Liu, M. Li, Q. Liu, H. Lu, S. Ma.
    IEEE International Conference on Multimedia and
    Expo (ICME '07), 2007.
  • Automatic image annotation by an iterative
    approach incorporating keyword correlations and
    region matching. CIVR '07.

74
Smoothing Effect
  • Low-level features → classification
  • Wait a minute, we have the time and resources for
    bridging this semantic gap!!

75
Smoothing Effect
  • Re-ranking Graph Algorithms
  • Graph random walks (Wang et al., ACM MM '06)
  • Adaptive graph model (Liu et al., MIR '06)
  • Iterative annotation graph model (Zhou et al.,
    ACM CIVR '07)
  • Web-image Annotation based on Search
  • Web-keywords refinement (Wang et al., MMM '07)
  • Bipartite graph model for web-image annotation
  • (Rui et al., ACM MM '07)

76
Refining Decision Table
(Table: candidate keywords scored against global
textual relations, web-search results, the knowledge
base, local textual co-occurrence, and visual
similarity, refined by graph-heuristic or
machine-learning algorithms)
The decision table is too heavy!
77
Problem Reduction
  • Image Annotation Refinement → weighted Maximum-Cut
    problem

(Diagram: graph G(V, E) where V = the candidate
keywords {woods, building, fish, sky, desktop,
politics} and E = edges weighted by the semantic
distance between keywords)
78
Complexity of Image Annotation Problem
(Diagram: the original problem is NP-complete, and the
image annotation problem inherits this; hence KBIAR
(Knowledge-based Image Annotation Refinement) is
NP-complete)
79
Optimal Solution of weighted Maximum-Cut
(Diagram: an optimal cut labels woods, building, and
sky with -1 and desktop, fish, and politics with +1)
We need to check another guess!!
80
Approximation Algorithm for WMC using
Randomization (Goemans et al. '95)
  • Relaxation Process 1
  • 1-dimensional variables of unit norm →
    2-dimensional vectors of unit norm
  • Let us define the WMC 2-relaxation problem
  • (WMC-2VQP)

81
Approximation Algorithm for WMC using
Randomization (Goemans et al. '95)
  • Relaxation Process 2
  • 2-dimensional variables of unit norm →
    n-dimensional vectors of unit norm
  • Construct a positive semi-definite matrix M,
  • Formulate the WMC-SDP (semi-definite program)
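For reference, the weighted Max-Cut objective and its vector relaxation in the standard Goemans-Williamson form that these two processes follow:

$$\max \ \tfrac{1}{2} \sum_{i<j} w_{ij}\,(1 - x_i x_j), \quad x_i \in \{-1, +1\}$$

$$\max \ \tfrac{1}{2} \sum_{i<j} w_{ij}\,(1 - v_i \cdot v_j), \quad v_i \in \mathbb{R}^n, \ \|v_i\| = 1$$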

82
Randomized Approximation Scheme (2-way)
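A minimal Python sketch of the random-hyperplane rounding step, assuming the SDP solution is given as an n x d array V of (approximately) unit row vectors; all names and the toy data are illustrative.

    import numpy as np

    def random_hyperplane_round(V, rng=None):
        """Assign +1/-1 to each vertex by the side of a random
        hyperplane through the origin its relaxed vector falls on."""
        rng = rng or np.random.default_rng()
        r = rng.standard_normal(V.shape[1])  # random normal vector
        return np.where(V @ r >= 0, 1, -1)

    def cut_weight(W, x):
        """Weight of edges crossing the cut (W symmetric, zero diagonal)."""
        return 0.25 * np.sum(W * (1 - np.outer(x, x)))

    # Toy instance: 3 keywords, pairwise semantic distances as edge weights
    W = np.array([[0.0, 0.9, 0.1],
                  [0.9, 0.0, 0.8],
                  [0.1, 0.8, 0.0]])
    V = np.array([[ 1.0, 0.0],         # pretend relaxed unit vectors
                  [-0.6, 0.8],
                  [ 0.9, 0.436]])
    x = max((random_hyperplane_round(V) for _ in range(100)),
            key=lambda x: cut_weight(W, x))
    print(x, cut_weight(W, x))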
83
Relaxation Effect on Image Annotation Refinement
(Diagram: candidate keywords (building, crystal,
anemone, palace, reef, people) as node variables, with
semantic distances as edge values, before and after
the relaxation)
84
RMCA for Image Annotation Refinement
85
RMCA for Image Annotation Refinement
86
RMCA for Image Annotation Refinement
87
RMCA for Image Annotation Refinement
88
2-dimensional Random Hyperplane for Decision
  • Deterministic Decision

89
2-dimensional Random Hyperplane for Decision
90
Another Semantic Distance (Normalized Google
Distance)
  • Dynamic
  • Diversity
  • Example
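The standard definition (Cilibrasi and Vitányi), where f(x) is the number of pages containing term x, f(x, y) the number containing both, and N the total number of indexed pages:

$$\mathrm{NGD}(x, y) = \frac{\max\{\log f(x), \log f(y)\} - \log f(x, y)}{\log N - \min\{\log f(x), \log f(y)\}}$$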

91
Result with Corel Image Set
92
Google Image Labeler
93
Context, rather than Concept
94
Web? There is an answer... but
Woods, hoping to extend winning streak, charges
to lead at Dubai Desert Classic Published
January 30, 2008 DUBAI, United Arab Emirates
(AP) Tiger Woods picked up right where he left
off last week - at the top of the
leaderboard. Woods, who won the Buick
Invitational on Sunday by eight strokes, shot a
7-under 65 Thursday to take a two-shot lead after
the first round of the Dubai Desert Classic. "I
played well today, just a bunch of good golf
shots,"" Woods said after his bogey-free round at
the Emirates Golf Club. Eleven players, including
Miguel Angel Jimenez and Abu Dhabi Golf
Championship winner Martin Kaymer, were tied for
second at 67. Ernie Els, Sergio Garcia and
defending champion Henrik Stenson were tied with
10 others another stroke back. Woods said he
played better in Dubai than he did last week at
Torrey Pines. "I had two good days of practice
the last couple days and started to hit the ball
a lot better than I did last week," said Woods,
who won the Dubai tournament in 2006.
(Extracted keywords: woods, Tiger Woods, golf shot,
stroke)
95
Result with Web-images
96
Decision: 2-d hyperplane

(Diagram (a): the keywords extracted from the Woods
article (today, winning streak, winner, Golf Club,
first round, golf shot, golf championship, Kaymer,
stroke, woods, Tiger Woods, Buick Invitational, Dubai,
Emirates, desert, Ernie Els, Sergio Garcia, Henrik
Stenson, and others) arranged around the random
decision hyperplane r)
97
2nd round decision
(Diagrams (b), (c): keywords from a second news story
about Israeli forces arresting Palestinian militants
in Bethlehem (Israeli civilian, Palestinian, militant,
Bethlehem, Jihad, West Bank, arrest, town, attacks,
night, soldiers, terror, forces, terrorist, and
others), split by the decision hyperplane r; panel (c)
shows the second-round decision over the keywords kept
after the first round)
98
Discussion & Future Work
  • Evaluation scheme
  • More refined web data (e.g., Wikipedia)
  • Vision vs. semantic measure
  • Video refinement