What's New in Content-Based Image Retrieval?

1
What's New in Content-Based Image Retrieval?
Xiang Sean Zhou
IFP, Beckman Institute for Advanced Science and Technology
University of Illinois at Urbana-Champaign
2
Background: MARS
  • MARS: Multimedia Analysis and Retrieval System

[Figure: system diagram with the MARS Content Analyzer, the MARS User Interface with relevance feedback, and metadata.]
3
Outline
  • Small sample learning algorithm for relevance
    feedback
  • Structural features for image content
    representation
  • Unification of keywords and visual contents

4
  • Small sample learning algorithm for relevance feedback
  • Image content representation: structure
  • Global structure representation
  • Local structure representation
  • Unification of keywords and visual contents

5
Relevance Feedback Scenario
  • The machine provides initial retrieval results through query by keyword, sketch, or example.
  • Then, iteratively:
  • The user judges the current results as to whether, and to what degree, they are relevant to his/her request.
  • The machine learns and tries again (a minimal sketch of the loop follows).
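The protocol above is generic; as a minimal sketch (all names here are placeholders, not the talk's code), assuming callables for retrieval, learning, and user judgment:

def relevance_feedback_loop(db, retrieve, learn, judge, rounds=3):
    """Generic relevance-feedback loop: retrieve, collect graded relevance
    judgments from the user, update the model, and retry."""
    model = None
    results = retrieve(db, model)          # initial query by keyword/example
    for _ in range(rounds):
        feedback = judge(results)          # {item: degree of relevance}
        model = learn(feedback, model)     # e.g., BiasMap on later slides
        results = retrieve(db, model)
    return results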

6
The current optimal schemes deal only with positive examples. However, without a doubt, negative examples can help.

7
Question
  • With both positive and negative feedback, how different is relevance-feedback learning from the age-old two-class classification problem?

8
2-class SVM under small samples

[Figure: 2-class SVM decision boundary under small samples, with the target cluster marked.]
9
Observation
  • The small sample problem:
  • The small number of negative examples cannot be representative of the negative class distribution, while for the positive class of interest the situation is usually better.
  • Positive examples are all alike in a way; each negative example is negative in its own fashion.
  • -- "Sean Tolstoy"

10
Intuition
  • "Happy families are all alike; every unhappy family is unhappy in its own way."
  • --Leo Tolstoy, Anna Karenina

11
BiasMap: Linear Form
Let $x_i$, $i = 1, \ldots, N_P$ be the positive examples, $y_i$, $i = 1, \ldots, N_N$ the negative examples, and $m_x$ the mean vector of the $x_i$. We want a linear transformation matrix $W$ such that

$$W^{*} = \arg\max_{W} \frac{|W^{T} S_{y} W|}{|W^{T} S_{x} W|},$$

where

$$S_{y} = \sum_{i=1}^{N_N} (y_i - m_x)(y_i - m_x)^{T}$$

and

$$S_{x} = \sum_{i=1}^{N_P} (x_i - m_x)(x_i - m_x)^{T}.$$
 
12
The Small Sample Issue: Statistical Bias
  • Sample-based plug-in estimates are biased under small samples (RDA; Friedman, 1989)
  • Regularized BDA
  • Discounting negative examples

13
Boosting BiasMap using RankBoost (Freund et al., 1998)
Given a positive set X1 and a negative set X0:
  Initialize the training example weights.
  For t = 1, ..., T:
    Train a weak BDT using weights Wt; use the weighted covariance matrices as the scatter matrix estimates (Equation (8)). Outputs are in rt(x).
    Get a weak hypothesis ht: x → (0, 1) from rt.
    Update the weights.
    Normalize the weights to sum to 1, separately among positives and negatives.
14
A Faster, Ad Hoc Variant: RankBoost.H
For t = 1, ..., T:
  Train a weak BDT using weights Wt. Outputs are in rt(x).
  (Here the notation [π] equals 1 when the predicate π holds, and 0 otherwise.)
  Update the weights.
  Normalize the weights to sum to 1, separately among positives and negatives.
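The exact hypothesis and weight-update formulas on these two slides were images and did not survive the transcript. As a hedged stand-in, the sketch below follows the standard bipartite RankBoost recipe, with weights kept and normalized separately over positives and negatives as on the slides; train_weak is a placeholder for the weighted weak BDT:

import numpy as np

def rankboost_bipartite(X1, X0, train_weak, T=20):
    """Bipartite RankBoost sketch. X1: positives, X0: negatives.
    train_weak(X1, X0, w1, w0) must return a scoring function h whose
    outputs lie in [0, 1] (a weighted weak BDT, in the talk's setting)."""
    w1 = np.full(len(X1), 1.0 / len(X1))   # weights over positives
    w0 = np.full(len(X0), 1.0 / len(X0))   # weights over negatives
    ensemble = []
    for t in range(T):
        h = train_weak(X1, X0, w1, w0)
        # Weighted ranking margin r in [-1, 1]
        r = np.clip(w1 @ h(X1) - w0 @ h(X0), -0.999, 0.999)
        alpha = 0.5 * np.log((1 + r) / (1 - r))
        # Emphasize positives ranked low and negatives ranked high
        w1 *= np.exp(-alpha * h(X1))
        w0 *= np.exp(alpha * h(X0))
        w1 /= w1.sum()   # normalize separately, as on the slides
        w0 /= w0.sum()
        ensemble.append((alpha, h))
    return lambda X: sum(a * h(X) for a, h in ensemble)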
15
Kernel Machine
The original linear algorithm is applied in a feature space F, which is related to the original space by a non-linear mapping Φ: C → F, x ↦ Φ(x). However, this mapping is not carried out explicitly, but through the evaluation of a kernel function k(xi, xj) = Φ(xi)ᵀΦ(xj).
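For concreteness, a common kernel choice is the Gaussian (RBF) kernel; the small sketch below (the kernel choice and sigma are illustrative assumptions, not specified by the slide) evaluates the kernel matrix that replaces explicit dot products in feature space:

import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """k(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2)): dot products in
    the implicit feature space F, without ever computing phi explicitly."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))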

16
BiasMap in Feature Space
The task is to rewrite the BiasMap formula in the feature space in dot-product form:

17
Solutions in Kernel Form
[Equations omitted in the transcript: the numerator, the denominator, and the projection of a new pattern z onto w, each in kernel form.]
18
The Kernel Matrices
19
Does Kernel Help? Image database testing
20
BiasMap vs. KDA and SVM on face vs. non-face classification
  • Examples of faces (1000)

Examples of non-faces (1000)
21
Precision in the top 1000. Each point is an average of 100 trials. SVM returns the points with larger margins first.
22
Boosting vs. Kernel: Face vs. Non-face
  • Comparable improvement to BDA.
  • RankBoost.H clearly outperforms RankBoost in terms of rank difference; in terms of hit rate in the top 1000, they are very close.

23
Boosting vs. Kernel: Image Database
Average hit rate in the top 100 over 500 rounds of testing.
24
  • Small sample learning algorithm for relevance feedback
  • Image content representation: structure
  • Global structure representation
  • Local structure representation
  • Unification of keywords and visual contents

25
Quest for Structural Features
  • Texture
  • Repetitive patterns
  • Effective only for uniform texture images or regions, or requires reliable segmentation
  • Shape
  • Object contour
  • Requires good segmentation and is effective only for simple, clean images

26
Defining Structural Features
  • Non-repetitive illuminance patterns in the image
  • Low-level (or generic)
  • Features in between texture and shape
  • Image/object structure (e.g., edge length), structural complexity, loops, etc., which may not be readily expressible by texture or shape features.

27
Edges contain structural information
28
Gathering Information from Edge Maps
  • Edge length?
  • Connectivity?
  • Complexity?
  • Line-likeness?
  • Loopy structure?
  • Edge Directions?
  • Etc.

29
Water-Filling Algorithm
  • Given an edge map, treat the edges as canals (water channels).
  • For each set of connected canals (edges), fill in water until all the water fronts stop.
  • Extract features during the water-filling process, as in the sketch below.
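A minimal sketch of the idea, assuming an 8-connected binary edge map; the feature definitions below are my approximations of the talk's primitives (loop counting via "splashes" is omitted here):

import numpy as np
from collections import deque

def water_fill(edge_map):
    """Flood each connected set of edge pixels from one seed (breadth-first,
    one pixel per step = water spreading at unit speed) and record simple
    features of the process for every connected component."""
    H, W = edge_map.shape
    visited = np.zeros((H, W), dtype=bool)
    features = []
    for sy, sx in zip(*np.nonzero(edge_map)):
        if visited[sy, sx]:
            continue
        visited[sy, sx] = True
        frontier = deque([(sy, sx)])
        time, forks, amount = 0, 0, 0
        while frontier:
            time += 1                        # FillingTime: BFS depth
            nxt = deque()
            for y, x in frontier:
                amount += 1                  # WaterAmount: pixels filled
                fresh = [(y + dy, x + dx)
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                         if (dy or dx)
                         and 0 <= y + dy < H and 0 <= x + dx < W
                         and edge_map[y + dy, x + dx]
                         and not visited[y + dy, x + dx]]
                if len(fresh) > 1:
                    forks += len(fresh) - 1  # ForkCount: front splits
                for ny, nx in fresh:
                    visited[ny, nx] = True
                    nxt.append((ny, nx))
            frontier = nxt
        features.append({"filling_time": time, "fork_count": forks,
                         "water_amount": amount})
    return features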

30
Water-Filling Edge Features: Feature Primitives
  • FillingTime
  • ForkCount
  • LoopCount
  • WaterAmount
  • Horizontal (vertical) Cover

LoopCount
  • Assume that when two water-heads collide, we see one splash, and when n water-heads collide at the same time, we see n-1 splashes (think of it as n-1 heads colliding with one sequentially).
  • Then: # of splashes = # of non-overlapping loops.
  • No overhead in computation.

31
City/Building vs. Landscape
32
Retrieval of Buildings
Scatter plot of the number of hits in the top 10 and top 20 returns.

17 out of the 92 images are labeled as buildings by a human subject.
33
Images with clear structure (Corel dataset, 17,000 images)
Water-filling retrieves images with clear structure.
34
Comparison to Texture Features
Water-filling (WF) versus wavelet variances (WV); 100 airplanes and 100 eagles as query images. Average number of hits:

Airplanes    Top 10   Top 20   Top 40   Top 80
WF           3.56     6.29     10.92    18.03
WV           3.32     5.75     9.94     17.07

Eagles       Top 10   Top 20   Top 40   Top 80
WF           2.65     3.33     4.91     6.79
WV           1.98     2.82     4.43     6.58

Note that although the averaged numbers are comparable between the two features, the underlying matching mechanisms are very different.
35
Feature Analysis under Relevance Feedback
20 randomly selected horse images are used as initial queries.
Feature performance with relevance feedback (C = color, T = texture, S = water-filling structure).
For the horses example, it is also observed that water-filling features are capable of pulling the system out of a convergence in which all returned horses are confined to a certain color, by adding horses of different colors but the same edge structure into the top 20 returns.
36
  • Small sample learning algorithm for relevance feedback
  • Image content representation: structure
  • Global structure representation
  • Local structure representation
  • Unification of keywords and visual contents

37
Histogram-based Structure Modeling
[Figure: pipeline from the image and its local k-tuples of feature vectors (x1, s1), (x2, s2), (x3, s3), through all possible tuples, to IDW/DW 3n-dimensional histograms, reduced via ICA to a product of m 3-D histograms, m < n.]
IDW/DW: inverse-distance-weighted and distance-weighted histogramming, i.e., the histogramming increments depend on the compactness of the tuple (see the toy sketch below).
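As a toy sketch of the IDW/DW idea (the binning scheme and weighting function are my assumptions, and the talk's pipeline additionally applies ICA before histogramming), each 2-tuple of quantized local features votes into a joint histogram with a weight tied to the pair's spatial compactness:

import numpy as np
from itertools import combinations

def two_tuple_histogram(feats, pos, n_bins=8, inverse=True):
    """feats: (N,) scalar local features; pos: (N, 2) keypoint positions.
    Each pair votes into a joint histogram; IDW weights close pairs more,
    DW (inverse=False) weights distant pairs more."""
    lo, hi = feats.min(), feats.max()
    q = np.minimum(((feats - lo) / (hi - lo + 1e-9) * n_bins).astype(int),
                   n_bins - 1)               # equal-width quantization
    H = np.zeros((n_bins, n_bins))
    for i, j in combinations(range(len(feats)), 2):
        d = np.linalg.norm(pos[i] - pos[j])
        w = 1.0 / (1.0 + d) if inverse else d
        H[q[i], q[j]] += w                   # symmetric vote
        H[q[j], q[i]] += w
    return H / H.sum()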
38
Harris Interest Points
Differential Invariant Jets
39
Differential Invariant Gaussian Jets (Koenderink and Van Doorn, 1987)
  • Based on a Taylor expansion at a point
  • Stacking partial derivatives up to the 3rd order
  • $L$ = image intensity, $L_1 = \partial L/\partial x$, $L_2 = \partial L/\partial y$
  • Einstein notation is used, i.e., summation over repeated indices is implied:
  • $J_3 = L_{ij} L_i L_j = L_{11} L_1 L_1 + 2 L_{12} L_1 L_2 + L_{22} L_2 L_2$
  • and $J_7$ has 8 terms, each a product of one 3rd-order and three 1st-order derivatives
  • $\varepsilon_{12} = -\varepsilon_{21} = 1$, $\varepsilon_{11} = \varepsilon_{22} = 0$
  • The derivatives are carried out by convolution with the derivatives of the Gaussian (see the sketch below)
  • Rotation and translation invariant; scale invariance needs some work (Schmid and Mohr, 1997).
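A small sketch of computing a few of the invariants via Gaussian-derivative convolutions (the selection of invariants and the sigma are illustrative; only J3 is written exactly as on the slide):

import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_jet_invariants(image, sigma=2.0):
    """Low-order differential invariants from Gaussian derivatives.
    Axis convention: axis 0 = y, axis 1 = x, so order=(0, 1) is d/dx."""
    L   = gaussian_filter(image, sigma)
    L1  = gaussian_filter(image, sigma, order=(0, 1))   # dL/dx
    L2  = gaussian_filter(image, sigma, order=(1, 0))   # dL/dy
    L11 = gaussian_filter(image, sigma, order=(0, 2))
    L12 = gaussian_filter(image, sigma, order=(1, 1))
    L22 = gaussian_filter(image, sigma, order=(2, 0))
    J0 = L                                    # intensity
    J1 = L1 * L1 + L2 * L2                    # L_i L_i: squared gradient
    J2 = L11 + L22                            # L_ii: Laplacian
    J3 = L11 * L1 * L1 + 2 * L12 * L1 * L2 + L22 * L2 * L2  # L_ij L_i L_j
    return np.stack([J0, J1, J2, J3])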

40
2-tuple histograms, 4 ICs
41
Image Retrieval/Classification
  • Test sets: COIL, COREL subsets
  • Compared against traditional texture and structure feature representations with a Euclidean metric
  • Preliminary results: comparable or better

              COIL                        COREL-1
Traditional   96                          91
PASM          97 (HI), 99 (L-0.9 norm)    96 (HI)

  • COIL: Columbia Object Image Library
  • COREL-1: Corel subset of 7 classes and 10 images per class
  • Distance metrics: HI (histogram intersection), K-L divergence, Chi-squared, Lp

42
[Plot: retrieval performance for no ICA (9 jets) vs. ICA with 9, 7, and 3 independent components.]
43
[Figure: COIL retrieval examples: object 20 at rank 3; object 16 (car) at rank 2; object 19 at rank 1; object 13 (phone) at rank 1; object 7 (piggy) at rank 1; object 6 (Marlboro) at rank 1.]
44
Where is the leopard?
45
Where is the tiger?
46
  • Small sample learning algorithm for relevance feedback
  • Image content representation: structure
  • Global structure representation
  • Local structure representation
  • Unification of keywords and visual contents

47
Desired Working Scenarios
"My name is Socks!"
  • A user calls his cat "Socks"? The system needs to learn that Socks is a cat, or a pet!
  • Keywords + example(s) + relevance feedback.

48
Soft Vector Representation of Annotations
            Beach   Cherokee   Flower   Indian   Tourism
Beach       1       0          0        0        0.7
Cherokee            1          0        0.6      0.3
Flower                         1        0        0.1
Indian                                  1        0.5
Tourism                                          1

Annotation "Cherokee, Indian" → soft vector [0, 1, 0, 1, 0.5]
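The soft vector appears to be the element-wise maximum of the similarity-matrix rows of the annotated terms; the sketch below reproduces the slide's example under that assumption:

import numpy as np

WORDS = ["Beach", "Cherokee", "Flower", "Indian", "Tourism"]
# Symmetric concept similarity matrix; upper triangle as on the slide
S = np.array([[1.0, 0.0, 0.0, 0.0, 0.7],
              [0.0, 1.0, 0.0, 0.6, 0.3],
              [0.0, 0.0, 1.0, 0.0, 0.1],
              [0.0, 0.6, 0.0, 1.0, 0.5],
              [0.7, 0.3, 0.1, 0.5, 1.0]])

def soft_vector(annotation):
    """Element-wise max over the similarity rows of the annotated terms."""
    return np.max([S[WORDS.index(w)] for w in annotation], axis=0)

print(soft_vector(["Cherokee", "Indian"]))   # [0.  1.  0.  1.  0.5]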
49
Pseudo-classification in the Image Domain: WARF
  • Relevant terms: Cherokee, Indian
  • Relevant term frequencies:
  • f_Cherokee = 3
  • f_Indian = 2
  • Co-occurrence frequency:
  • c_{Cherokee, Indian} = 1
  • So:
  • S_{Cherokee, Indian} ← S_{Cherokee, Indian} + 3 × (2 - 1)
  • S_{Cherokee, Indian} = 3

Relevant image annotations: {Cherokee, Indian}, {Shop, Cherokee}, {Cherokee, Ceremony}, {Ceremony, Beach}, {Indian, Artifacts}, {Shop, Artifacts}
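A hedged reconstruction of the update illustrated above: the increment f_i × (f_j − c_ij) is inferred from the worked example 3 × (2 − 1) = 3 and may differ in detail from the actual WARF rule.

from collections import Counter
from itertools import combinations

def warf_update(S, relevant_annotations):
    """For each pair of co-occurring relevant terms, bump their similarity
    by f_i * (f_j - c_ij), with f_i >= f_j (inferred from the slide)."""
    f = Counter(w for ann in relevant_annotations for w in ann)
    c = Counter(frozenset(p) for ann in relevant_annotations
                for p in combinations(sorted(ann), 2))
    for pair, cij in c.items():
        i, j = sorted(pair, key=lambda w: -f[w])   # higher frequency first
        S[(i, j)] = S.get((i, j), 0) + f[i] * (f[j] - cij)
    return S

images = [{"Cherokee", "Indian"}, {"Shop", "Cherokee"},
          {"Cherokee", "Ceremony"}, {"Ceremony", "Beach"},
          {"Indian", "Artifacts"}, {"Shop", "Artifacts"}]
print(warf_update({}, images)[("Cherokee", "Indian")])   # -> 3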
50
Estimated Concept Similarity Matrix
First simulation: 30 words, 5,000 images, up to 3 keywords per image. User model: searching for "car", "truck", and "motorcycle". (a) 5 rounds of training; (b) 20 rounds; (c) 80 rounds.
51
Scalability?
  • 1,000 words, 5,000 images.
  • (a) Up to 5 keywords per image, 30 rounds of training
  • (b) Up to 5 keywords per image, 100 rounds
  • (c) Up to 80 keywords per image, 30 rounds

52
Multiple Classes
The second simulation. User model: searching for three keyword classes.
53
Keyword Classification
  • Keyword classification: words → classes
  • To facilitate automatic query expansion and other inference tasks
  • Hopfield network activation: an iterative clustering approach based on mutual distances.


54
Conclusion
  • The relevance feedback process can be utilized to learn relationships among words.
  • Unifying keywords and contents in image retrieval can provide the user with more accuracy and more flexibility.