Title: Efficient Image Retrieval Methods For Large Scale Dynamic Image Databases
1Efficient Image Retrieval Methods For Large Scale
Dynamic Image Databases
- Suman Karthik
- 200407013
- Advisor: Dr. C.V. Jawahar
2Images
- Cheap Imaging Hardware
- Plummeting Storage costs
- User Generated Content
3Image Databases
- Large Scale
- Millions to billions of images
- Dynamic
- Highly dynamic in nature
Figure: Number of images on Flickr from December 2005 to November 2007, in millions
4CBIR
- Content Based IR
- Uses image content
- Pros
- Good Quality
- Annotation agnostic
- Cons
- Inefficient
- Not scalable
shape
color
texture
5Bag Of Words
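Pipeline: Feature Extraction, Vector Quantization, Semantic Indexing, Index
- Feature extraction: compute SIFT descriptors (Lowe, 1999)
- Vector quantization into words W (Sivic and Zisserman, 2003; Nistér and Stewénius, 2006; Philbin, Sivic, Zisserman et al., 2008)
- Semantic indexing: pLSA (Hofmann, 2001)
- Inverted index over documents D1, D2, D3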
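A minimal sketch of the bag-of-words pipeline on this slide, assuming a codebook is already available: descriptors are hard-assigned to their nearest codebook centre, and an inverted index maps each visual word to the images containing it. The names `quantize`, `build_inverted_index` and `search`, the toy 2-d descriptors and the histogram-intersection score are illustrative assumptions, not taken from the slides.

```python
from collections import Counter, defaultdict
from math import dist

def quantize(descriptor, codebook):
    """Return the index of the nearest codebook centre (hard assignment)."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))

def build_inverted_index(images, codebook):
    """Map each visual word to {image_id: term frequency}."""
    index = defaultdict(dict)
    for image_id, descriptors in images.items():
        counts = Counter(quantize(d, codebook) for d in descriptors)
        for word, tf in counts.items():
            index[word][image_id] = tf
    return index

def search(query_descriptors, index, codebook):
    """Rank images by how many query word occurrences they share."""
    scores = Counter()
    query_counts = Counter(quantize(d, codebook) for d in query_descriptors)
    for word, q_tf in query_counts.items():
        for image_id, tf in index.get(word, {}).items():
            scores[image_id] += min(q_tf, tf)
    return [image_id for image_id, _ in scores.most_common()]

# Toy 2-d "descriptors"; real SIFT descriptors are 128-dimensional.
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
images = {
    "D1": [(0.5, 0.2), (9.8, 10.1)],
    "D2": [(0.1, 9.7), (0.3, 10.2)],
    "D3": [(9.9, 9.9), (10.2, 10.4)],
}
index = build_inverted_index(images, codebook)
ranking = search([(10.1, 9.7), (9.6, 10.3)], index, codebook)
```

Only images that share at least one visual word with the query are touched, which is what makes the inverted index cheaper than a nearest-neighbour scan over all images.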
6Dynamic Databases
- Large scale
- New images added continuously
- High rate of change
- Nature of data not known a priori
Internet
Videos
Images
7 Text vs Images Dynamic databases
- Text
- Vocabulary known
- Rate of change of vocabulary low
- Stable vocabulary
- Images
- Vocabulary unknown
- Rate of change of vocabulary high
- Unstable vocabulary
8Quantization and Semantic Indexing in Dynamic Databases
- As DB changes vocabulary is outmoded
- Updating vocabulary is too costly
- Not incremental
- Cannot keep up with rate of change
- As DB changes semantic index is invalid
- Updating semantic index is resource intensive
- Not incremental
- Cannot keep up with rate of change or scale
9Dynamic Databases
Figure: Internet (videos, images) feeding a dynamic database, which passes through Feature Extraction, Vector Quantization and Semantic Indexing into the Index.
Quantization and semantic indexing methods are a bottleneck
10Objective 1
- A. Motivation
- CBIR is inefficient and not scalable
- B. Objective
- Develop methods to improve efficiency and scalability of CBIR
- C. Contributions
- C 1.1 Virtual Textual Representation
- C 1.2 A new efficient indexing structure
- C 1.3 Relevance feedback methods that improve performance
11Objective 2
- A. Motivation
- Quantization is a bottleneck for BoW when dealing with dynamic image databases
- B. Objective
- Develop an incremental quantization method for the BoW model to successfully deal with dynamic image databases
- C. Contributions
- C 2.1 Incremental Vector Quantization
- C 2.2 Comparison of retrieval performance with existing methods
- C 2.3 Comparison of incremental quantization with existing methods
12Objective 3
- A. Motivation
- Semantic indexing is not scalable for BoW when dealing with dynamic image databases
- B. Objective
- Develop an incremental semantic indexing method for the BoW model to successfully deal with dynamic image databases
- C. Contributions
- C 3.1 Bipartite Graph Model
- C 3.2 An algorithm for semantic indexing on BGM
- C 3.3 Search engines for images
13CBIR
14Literature
- Global image retrieval
- Region based image retrieval
- Region Based Relevance feedback
- Costly nearest neighbor based retrieval
- Spatial Indexing
- Relevance feedback heavily used
Image Retrieval: Past, Present, and Future, Yong Rui, Thomas S. Huang, Shih-Fu Chang, International Symposium on Multimedia Information Processing, 1997
Blobworld: A System for Region-Based Image Indexing and Retrieval, Chad Carson, Megan Thomas, Serge Belongie, Joseph M. Hellerstein, Jitendra Malik, Third International Conference on Visual Information Systems, 1999
Region-Based Relevance Feedback in Image Retrieval, Feng Jing, Mingjing Li, Hong-Jiang Zhang, Bo Zhang, Proc. IEEE International Symposium on Circuits and Systems, 2002
15Search
16Transformation
Feature Space
Bins represented by strings or words
Quantization
Compactness
Position
Color
17Virtual Textual Representation
- Quantization
- Uniform quantization (grid)
- Density based quantization (k-means)
- Each cell is a string
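The uniform (grid) variant of this idea can be sketched as follows: each feature vector is mapped to the grid cell it falls in, and the cell's coordinates are joined into a string that a text index can treat as a word. The function names, the cell size of 0.25 and the three-component segment features are hypothetical choices made for illustration.

```python
def cell_word(feature, cell_size=0.25):
    """Map a feature vector to the string naming its grid cell."""
    return "_".join(str(int(x // cell_size)) for x in feature)

def image_to_text(segment_features, cell_size=0.25):
    """Turn the segments of one image into a space-separated 'document'."""
    return " ".join(cell_word(f, cell_size) for f in segment_features)

# Hypothetical per-segment features, each component scaled to [0, 1).
segments = [(0.10, 0.80, 0.55), (0.12, 0.79, 0.60), (0.90, 0.05, 0.30)]
text = image_to_text(segments)
```

Here the first two segments fall in the same cell and therefore produce the same word, while the third produces a different one. No training pass is needed, which is the appeal of the grid scheme; the cost is that cell boundaries ignore the data distribution.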
18CBIR Indexing
- Spatial Databases
- Relevance feedback skews the feature space
rendering spatial databases inefficient.
Indexing for Relevance Feedback Image Retrieval, Jing Peng, Douglas R. Heisterkamp, Proceedings of the IEEE International Conference on Image Processing (ICIP03)
19Elastic Bucket Trie
Figure: Elastic Bucket Trie. Labels: Null; nodes A, B, C; strings BBC, CAB, CBA; operations Insert and Query; Buckets, Overflow, Split, Retrieved Bucket.
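The slide only names the parts of the structure (nodes, buckets, overflow, split, retrieved bucket), so the following is a rough sketch of one way such a trie could behave, not the published data structure. It assumes that a leaf holds a bucket of strings, that a bucket exceeding a fixed capacity is split on the next character, and that a query returns the bucket reached by following the query string. The class names and the `capacity` parameter are assumptions.

```python
class EBTNode:
    """A trie node that acts as a bucket until it overflows."""
    def __init__(self, depth=0):
        self.depth = depth
        self.bucket = []      # (word, image_id) pairs stored at this node
        self.children = None  # dict: character -> EBTNode, set once split

class ElasticBucketTrie:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.root = EBTNode()

    def _descend(self, word, create):
        """Follow the word through split nodes down to its bucket."""
        node = self.root
        while node.children is not None and node.depth < len(word):
            char = word[node.depth]
            if char not in node.children:
                if not create:
                    return None
                node.children[char] = EBTNode(node.depth + 1)
            node = node.children[char]
        return node

    def insert(self, word, image_id):
        node = self._descend(word, create=True)
        node.bucket.append((word, image_id))
        if node.children is None and len(node.bucket) > self.capacity:
            self._split(node)

    def _split(self, node):
        """Redistribute an overflowing bucket by the next character."""
        movable = [e for e in node.bucket if len(e[0]) > node.depth]
        if not movable:
            return  # every word ends here, so the bucket simply grows
        node.bucket = [e for e in node.bucket if len(e[0]) <= node.depth]
        node.children = {}
        for word, image_id in movable:
            child = node.children.setdefault(word[node.depth], EBTNode(node.depth + 1))
            child.bucket.append((word, image_id))
        for child in node.children.values():
            if len(child.bucket) > self.capacity:
                self._split(child)

    def query(self, word):
        """Return the bucket the query string leads to (empty if none)."""
        node = self._descend(word, create=False)
        return [] if node is None else list(node.bucket)

trie = ElasticBucketTrie(capacity=2)
for word, image_id in [("BBC", "img1"), ("CAB", "img2"), ("CBA", "img3")]:
    trie.insert(word, image_id)
```

With a capacity of 2, the third insert overflows the root bucket and splits it by first character, so a query for CAB retrieves the bucket that also holds CBA. The trie only grows deeper where the data is dense.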
20Relevance Feedback
Retrieved
Query
Relevance Feedback
21Region importance based relevance feedback
KEYWORDS
Pseudo Image for next iteration
Keyword Selection
Relevant Images
Extracted Words
Errors In Retrieval
22Discriminative Relevance Feedback
- Classification is given precedence over clustering.
- Discriminative segments become the keywords.
- Non-discriminative segments are ignored.
FLOWERS
ROSES
SURFERS
WAVES
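One way to picture keyword selection under discriminative relevance feedback is the sketch below. The scoring rule (fraction of relevant images containing a word minus the fraction of irrelevant images containing it) is an assumption chosen for illustration, as are the function name and the toy word labels; the slides do not specify the exact criterion.

```python
def discriminative_keywords(relevant, irrelevant, top_k=2):
    """Pick words frequent in relevant images and rare in irrelevant ones."""
    def doc_freq(word, images):
        return sum(word in img for img in images) / len(images) if images else 0.0

    vocabulary = set().union(*relevant)
    scores = {w: doc_freq(w, relevant) - doc_freq(w, irrelevant) for w in vocabulary}
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    # Words that do not separate the two sets (score <= 0) are ignored.
    return [w for w in ranked[:top_k] if scores[w] > 0]

# Each image is the set of words extracted from its segments (toy data).
relevant = [{"petal_red", "leaf_green", "sky_blue"},
            {"petal_red", "leaf_green", "soil_brown"}]
irrelevant = [{"wave_white", "sky_blue"},
              {"leaf_green", "sky_blue"}]
keywords = discriminative_keywords(relevant, irrelevant)
```

In this toy case `sky_blue` appears in relevant images but even more often in irrelevant ones, so it is dropped, while `petal_red` separates the two sets cleanly and is ranked first.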
23Discriminative Relevance Feedback
KEYWORDS
Pseudo Image for next iteration
Keyword Selection
Relevant Images
Extracted Words
Irrelevant Images
No Errors In Retrieval
24Performance
High Fscore
Low Fscore
- Discriminative Relevance Feedback consistently outperforms the Region Based Importance method.
25Region based Relevance feedback
Blobworld, (no indexing)
Our work
Non Spatial indexing
Simplicity (no indexing)
Global image retrieval
Local Image retrieval
Early CBIR
Spatial Indexing
Global relevance Feedback or No relevance feedback
26Analysis
- Relevance feedback algorithms need to be modified to work with text.
- Keywords emerge with relevance feedback, signifying association between key segments.
- EBT can be used without any modifications with discriminative relevance feedback.
- Advent of Bag of Words model for image retrieval
27Quantization
28Literature
- Kmeans
- Hierarchical Kmeans
- Kmeans, Soft assignment
- Time consuming offline quantization
- Representative data must be available a priori
- Quantization is not incremental
Video Google: A Text Retrieval Approach to Object Matching in Videos, Josef Sivic, Andrew Zisserman, ICCV 2003
Scalable Recognition with a Vocabulary Tree, D. Nistér and H. Stewénius, CVPR 2006
Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases, James Philbin, Ondrej Chum, Michael Isard, Josef Sivic, Andrew Zisserman, CVPR 2008
29Losses
Quantization
- Perceptual Loss
- Under quantization
- Synonymy
- Poor precision
- Binning Loss
- Over quantization
- Polysemy
- Poor recall
30Incremental Vector Quantization
- Control perceptual loss
- Minimize binning loss
- Create quality code books
- Data dependent
- Incremental in nature
31Algorithm
Puts an upper bound on perceptual loss
Soft bin assignment minimizes binning loss
Builds quality codebooks by ignoring noise
r
L 2 L minimum cardinality of a cell
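The slide lists the properties of the algorithm rather than its steps, so the following is only a sketch of an incremental quantizer with those properties, not the author's IVQ. It assumes that r is a cell radius (bounding how far a point can be from its cell centre), that a point is softly assigned to every cell within r, that a point matching no cell starts a new one, and that cells with fewer than L members are treated as noise. The class and method names are hypothetical, and the linear scan over centres stands in for whatever accelerated lookup a real system would use.

```python
from math import dist

class IncrementalQuantizer:
    def __init__(self, radius, min_cardinality=2):
        self.radius = radius                    # assumed meaning of r
        self.min_cardinality = min_cardinality  # assumed meaning of L
        self.centres = []
        self.counts = []

    def add(self, point):
        """Insert one feature point, changing only the cells near it."""
        hits = [i for i, c in enumerate(self.centres)
                if dist(point, c) <= self.radius]
        if not hits:
            # No existing cell is close enough: start a new cell here.
            self.centres.append(tuple(point))
            self.counts.append(0)
            hits = [len(self.centres) - 1]
        for i in hits:
            self.counts[i] += 1
        return hits

    def words(self, point):
        """Soft assignment to every sufficiently populated cell within r."""
        return [i for i, c in enumerate(self.centres)
                if dist(point, c) <= self.radius
                and self.counts[i] >= self.min_cardinality]

ivq = IncrementalQuantizer(radius=1.0, min_cardinality=2)
for p in [(0.0, 0.0), (0.5, 0.0), (1.2, 0.0), (9.0, 9.0)]:
    ivq.add(p)
```

Each insertion touches only nearby cells, so no global re-clustering is needed when new images arrive. In the toy run above, the isolated point at (9, 9) creates a cell that never reaches the minimum cardinality and is therefore never emitted as a word.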
33An experiment
- Given
- All possible feature points in a feature space that could be generated by natural processes.
- Quantize
- K-means with a priori knowledge of the entire data
- IVQ with no a priori information.
- Performance
- F-score
- Time taken for incremental quantization
34Fscore
IVQ outperforms Kmeans
IVQ: 1115 bins; K-means: 1000 bins
35Time
- IVQ quantizes in 0.1 seconds
- IVQ time complexity is linear
- Kmeans takes 1000 seconds
- Time complexity exponential
IVQ outperforms Kmeans
36Holiday Dataset
- Datasets
- Holiday dataset
- 1491 images
- 500 categories
- Pre-processing
- SIFT feature extraction.
- Quantization using k-means.
- Quantization using IVQ.
37Incremental Quantization
- Datasets
- ALOI dataset
- 100,000 images
- 1000 batches of 100 images each
- Pre-processing
- SIFT feature extraction.
- Quantization using k-means/online k-means.
- Quantization using IVQ.
S: seconds, D: days
Each batch: 100 images of the 100,000-image ALOI dataset, added sequentially
38Analysis
- IVQ produces more bins than K-means (constant perceptual loss)
- IVQ is efficient because changes are local
- LSH used to accelerate IVQ
- Semantic indexing can improve mAP
39Incremental
IVQ (local)
Adaptive Vocabulary Tree (global)
Density Based
Online Kmeans
Kmeans
Offline quantization
Online quantization
Non density based
Regular Lattice
Non incremental
40Semantic Indexing
41Semantic Indexing
Words clustered around latent topics
Visual words clustered around latent topics
Figure: documents d and words w related through P(w|d); Whippet, GSD and doberman cluster around the topic Animal, and daffodil, tulip and rose around the topic Flower.
LSI, pLSA, LDA
Hofmann, 1999; Blei, Ng and Jordan, 2004; R. Lienhart and M. Slaney, 2007
42Literature
- Visual pLSA
- Visual LDA
- Spatial semantic indexing
- High space complexity due to large matrix operations.
- Slow, resource intensive offline processing.
Discovering Objects and Their Location in Images, Josef Sivic, Bryan Russell, Alexei A. Efros, Andrew Zisserman, and Bill Freeman, ICCV 2005
Image Retrieval on Large-Scale Image Databases, Eva Horster, Rainer Lienhart, Malcolm Slaney, CIVR 2007
Spatial Latent Dirichlet Allocation, X. Wang and E. Grimson, Proceedings of the Neural Information Processing Systems Conference (NIPS), 2007
43 Bipartite Graph Model
Cash Flow Algorithm
- Vector space model is encoded as a bipartite graph of words and documents.
- TF values retained as edge weights.
- IDF values retained as term weights.
Figure: bipartite graph of words and documents, with TF on the edges and IDF on the word nodes.
Words: w1 subprime, w2 reforms, w3 war, w4 Iraq, w5 elections, w6 democrats
Documents: d1 Financial Crisis, d2 Bush Popularity, d3 Saddam Captured, d4 Iraq Pullout, d5 Obama Elected
Values shown in the figure: 11.7, 25, 8.3, 50, 5, 100, 50, 12.5, 25, 12.5
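The slides describe the graph but not the propagation rule, so the sketch below is one plausible reading rather than the published Cash Flow algorithm. It assumes that query words start with a fixed amount of cash, that each node passes its cash to neighbours in proportion to TF edge weight, that a document keeps half of what it receives as its score and forwards the rest, and that amounts below a cut-off stop propagating. IDF term weights are left out for brevity. The edge list is hypothetical; only the word and document labels come from the slide.

```python
from collections import defaultdict

def cash_flow(query_words, tf, cash=100.0, cutoff=10.0):
    """Rank documents by the cash that reaches them from the query words.

    tf maps (word, document) -> term frequency, used as the edge weight.
    """
    word_edges, doc_edges = defaultdict(dict), defaultdict(dict)
    for (word, doc), weight in tf.items():
        word_edges[word][doc] = weight
        doc_edges[doc][word] = weight

    scores = defaultdict(float)
    frontier = [(w, cash / len(query_words), True) for w in query_words]
    while frontier:
        node, amount, is_word = frontier.pop()
        edges = word_edges[node] if is_word else doc_edges[node]
        total = sum(edges.values())
        for neighbour, weight in edges.items():
            share = amount * weight / total
            if is_word:
                share /= 2                  # the document keeps half...
                scores[neighbour] += share  # ...as its score
            if share >= cutoff:
                frontier.append((neighbour, share, not is_word))
    ranking = sorted(scores, key=lambda d: (-scores[d], d))
    return ranking, dict(scores)

# Hypothetical edges, all with TF = 1.
tf = {
    ("subprime", "d1"): 1, ("reforms", "d1"): 1, ("reforms", "d2"): 1,
    ("war", "d2"): 1, ("war", "d3"): 1, ("Iraq", "d3"): 1,
    ("Iraq", "d4"): 1, ("elections", "d5"): 1, ("democrats", "d5"): 1,
}
ranking, scores = cash_flow(["war"], tf)
```

A plain inverted-index lookup for "war" would return only d2 and d3. Here a small amount of cash also reaches d1 and d4 through the shared words "reforms" and "Iraq", which is the kind of indirect association semantic indexing is meant to capture. Because documents retain half of what they receive, the amounts shrink on every round trip and the loop terminates for any positive cut-off.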
44BGM with BoW
- Feature extraction
- Local detectors, SIFT
- Vector quantization
- K-means
- BGM insertion
- Words, Documents
- TF
- IDF
45Why BGM is Superior?
Query image
w1
w2
Inverted Index
Cash Flow
w5
w1
w2
w3
w4
46Naïve vs BGM
- Datasets
- 9000 images from Flickr.
- 9 Sports Categories
- 5 Animal Categories
- Pre-processing
- SIFT feature extraction.
- Quantization using k-means.
- F-score
- 2pr/(p + r), where p is precision and r is recall
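The F-score used as the measure here is the harmonic mean of precision p and recall r. A small sketch of how it could be computed for one query follows; the function name and the toy image ids are illustrative.

```python
def f_score(retrieved, relevant):
    """Harmonic mean of precision and recall for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0
    p = hits / len(retrieved)  # precision: fraction of retrieved that is relevant
    r = hits / len(relevant)   # recall: fraction of relevant that is retrieved
    return 2 * p * r / (p + r)

# Four images retrieved, two of the three relevant ones among them.
score = f_score(retrieved=[1, 2, 3, 4], relevant=[2, 4, 6])
```

In this example p = 1/2 and r = 2/3, giving an F-score of 4/7.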
47BGM vs pLSA, IpLSA
Number of concepts known
Method  mAP    Time    Space
pLSA    0.649  5144 s  3267 MB
IpLSA   0.612  63 s    3356 MB
BGM     0.594  42 s    57 MB
- pLSA
- Cannot scale for large databases.
- Cannot update incrementally.
- Latent topic initialization difficult
- Space complexity high
- IpLSA
- Cannot scale for large databases.
- Cannot update new latent topics.
- Latent topic initialization difficult
- Space complexity high
- BGM with Cash Flow
- Efficient
- Low space complexity
- Datasets
- Holiday dataset
- 1491 images
- 500 categories
- Pre-processing
- SIFT feature extraction.
- Quantization using k-means.
Number of concepts unknown
Method  mAP    Time    Space
pLSA    0.553  5062 s  3267 MB
IpLSA   0.567  56 s    3356 MB
BGM     0.594  42 s    57 MB
48Near Duplicate Retrieval
- Dataset: 500,000 movie frames
- SIFT vectors
- K-means quantization
- Indexed using the text search library Ferret.
- Efficient indexing and retrieval
- Effectively scalable to large data.
- Query frame given as query to Ferret index.
- Cash propagated to every node until cut-off.
49Sample Retrieval
Query
Retrieval
Fastest Indian
Fight Club
Harry Potter
50Analysis
- Low index insert time for new images
- Less than 200 seconds to insert 1000 images in a million image index
- Marginally higher retrieval time
- Due to multiple levels of graph traversal
- Memory usage minimal
- Works without knowing the number of concepts a priori
- BGM is a hybrid model
- Generative
- Discriminative
51Incremental
BGM IDF
Discriminative
IPLSA
BGM (generative discriminative)
BGM TF
Offline Semantic indexing
Online Semantic indexing
LDA
Generative
PLSA
Non incremental
52Conclusion
- Efficient methods for retrieval in large scale dynamic image databases
- Scalability and adaptability have been addressed
- A step closer to real world image retrieval
- Features and their mixture, a long way to go
53Future Work
- Quality and quantity of features
- Automatic feature modeling
- Text search engines for image search
- GPU based quantization methods
- Multiple vocabularies for image retrieval
- Multimodal semantic indexing with BGM
54List of publications
- Suman Karthik, C.V. Jawahar, "Incremental On-line Semantic Indexing for Image Retrieval in Dynamic Databases", 4th International Workshop on Semantic Learning and Applications, CVPR, 2008, Florida.
- Suman Karthik, C.V. Jawahar, "Analysis of Relevance Feedback in Content Based Image Retrieval", Proceedings of the 9th International Conference on Control, Automation, Robotics and Vision (ICARCV), 2006, Singapore.
- Suman Karthik, C.V. Jawahar, "Virtual Textual Representation for Efficient Image Retrieval", Proceedings of the 3rd International Conference on Visual Information Engineering (VIE), 26-28 September 2006, Bangalore, India.
- Suman Karthik, C.V. Jawahar, "Efficient Region Based Indexing and Retrieval for Images with Elastic Bucket Tries", Proceedings of the International Conference on Pattern Recognition (ICPR), 2006.
55The End
56Intuitive way of learning content
- Over segmentation and subsequent deduction of
content through relevance feedback.
Document
Words
Transformation
Segmentation
Segments
Text
Image
Discriminative Relevance Feedback leverages this
advantage to achieve better performance than
standard techniques.
57Kmeans
- Pros
- Simple
- Efficient
- Cons
- Computationally expensive
- Representative Training Set
- Sensitive to parameter K
58A naive quantization scheme
Quantization
F1
F3
F2
Advantages
- High speed. No quantization overhead
- As dataset size grows precision increases
Disadvantages
- Not data dependent, no idea of visual concept
- Information loss due to hard assignment
Suman Karthik, C. V. Jawahar, Virtual Textual Representation for Efficient Image Retrieval, VIE 2006
Tuytelaars, T. and Schmid, C., Vector Quantizing Feature Space with a Regular Lattice, ICCV 2007
59C 2.1 Methodology
- Data
- 1000 random feature vectors each generated from 1000 normal distributions in a 2-d feature space. A total of 1 million feature points in the space.
- 100,000 virtual images falling into 100 categories, where each category image is generated by drawing random numbers from 10 normal distributions from the above data.
- Algorithms
- K-means (quantized with the entire data and ideal K = 1000)
- IVQ
- K-means with soft assignment
- Measures
- F-score for retrieval performance
- Time estimates for incremental quantization
60Performance
61Performance
62Image Retrieval
- Contemporary approach
- Uses textual cues
- Pros
- Simple
- Efficient
- Cons
- Images are Subjective
- Text cues unscalable
- Quality Suffers
Love
Rose
Flower
Petals
Gift
Red
Bud
Green
63Losses
- High Perceptual Loss
- High Binning Loss
- Optimal Quantization
64Image retrieval as Text retrieval
- Can an image be indexed, queried for and retrieved as a text document?
Can this become this?
65Relevance Feedback
- Statistical
- Delta mean algorithm
- Query Point Movement
- Inverse Variance
- Membership Criterion
- Kernel Based
- Parzen Windows
- SVM
- Kernel BDA
- Entropy Based
- KL divergence
66Semantic Indexing for Images
- Objects and their location in images.
- Large Scale Image Databases
- Web image selection
- Spatial Latent Dirichlet Allocation
- Image auto-annotation
- High space complexity due to large matrix operations.
- Slow, resource intensive offline processing.
Sivic, J., Russell, B.C., Efros, A.A., Zisserman, A., Freeman
Lienhart, R., Slaney, M.
Keiji Yanai
Xiaogang Wang, Eric Grimson
Monay, Florent and Gatica-Perez, Daniel