Content-Based Image Retrieval - PowerPoint PPT Presentation

Slides: 52
Provided by: rong7
Learn more at: http://www.cse.msu.edu

1
Content-Based Image Retrieval
  • Rong Jin

2
Content-based Image Retrieval
  • Retrieval by text
  • Label database images by text tags
  • Image retrieval as text retrieval
  • Find images for textual queries using standard
    text search engines

3
Example: Flickr.com
  • Con: requires manual labeling

4
Image Labeling by Human Computing
  • ESP game: http://www.gwap.com/gwap/gamesPreview/espgame
  • Collect annotations for web images via a game

5
Content-based Image Retrieval
  • Retrieval based on visual content
  • Represent images by their visual contents
  • Each query is an image
  • Search for images that have similar visual
    content as the query image

6
Content-based Image Retrieval
  • Given a query image, try to find visually
    similar images from an image database

[Figure: a query image (Query) and the retrieved images (Answer)]
7
Example: www.like.com
8
CBIR Challenges
  • How to represent the visual content of images
  • What are visual contents?
  • Colors, shapes, textures, objects, or meta-data
    (e.g., tags) derived from images
  • Which type of visual content should be used to
    represent an image?
  • It is difficult to understand the information needs of
    a user from a query image
  • How to retrieve images efficiently
  • Should avoid a linear scan of the entire database

9
Image Representation
  • Similar color distribution: histogram matching
  • Similar texture pattern: texture analysis
  • Similar shape/pattern: image segmentation, pattern recognition
  • Similar real content: life-time goal :-)

Degree of difficulty increases from top to bottom
10
Vector based Image Representation
  • Represent an image by a vector of fixed number of
    elements
  • Color histogram: discretize the color space, then count
    the pixels in each discretized color bin
  • Texture: Gabor filters → texture features
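
The color histogram can be sketched as follows. This is only an illustration: it assumes an image is given as a list of 8-bit (R, G, B) pixels, and the function name and the `bins_per_channel` parameter are hypothetical, not part of the course material.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Return a normalized histogram with bins_per_channel**3 color bins."""
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Discretize each 8-bit channel into bins_per_channel equal-width bins.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        # Combine the three per-channel bins into a single bin index.
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
    total = len(pixels)
    # Normalize so that images of different sizes are comparable.
    return [count / total for count in hist] if total else hist


# Toy image: two reddish pixels, one blue pixel, one green pixel.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 255, 0)]
hist = color_histogram(pixels)
```

Whatever the image size, the result is a vector with a fixed number of elements, which is what the vector-based representation requires.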

11
Vector based Image Representation
      R    G    B
V1    0.4  0.5  0.1
Vq    0.3  0.5  0.2
V2    0.5  0.1  0.4
‖V1 − Vq‖ < ‖V2 − Vq‖, so image 1 is a better match for the query than image 2
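
The comparison on this slide can be checked numerically. The slide does not name the distance measure, so Euclidean distance is an assumption here; the three (R, G, B) vectors are the ones shown on the slide.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


v1 = [0.4, 0.5, 0.1]  # database image 1
vq = [0.3, 0.5, 0.2]  # query image
v2 = [0.5, 0.1, 0.4]  # database image 2

d1 = euclidean(v1, vq)
d2 = euclidean(v2, vq)

# Rank the database images by increasing distance to the query.
ranking = sorted([("V1", d1), ("V2", d2)], key=lambda item: item[1])
```

Under this assumption d1 is about 0.14 and d2 is about 0.49, so V1 is ranked first.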
12
Images with Similar Colors
13
Images with Similar Shapes
14
Images with Similar Content
15
Challenges in CBIR
  • You get drunk,
  • REALLY drunk
  • Hit over the head
  • Kidnapped to another city
  • in a country on the other side of the world
  • When you wake up,
  • You try to figure out what city you are in, and
    what is going on
  • That's what it's like to be a CBIR system!

16
Near Duplicate Image Retrieval
  • Given a query image, identify gallery images with
    high visual similarity.

17
Appearance based Image Matching
  • Parts-based image representation
  • Parts (appearance) + shape (spatial relation)
  • Parts: local features found by an interesting point
    operator
  • Shape: graphical models or neighborhood
    relationships

18
Interesting Point Detection
  • Local features have been shown to be effective
    for representing images
  • They are image patterns which differ from their
    immediate neighborhood.
  • They can be points, edges, or small patches.
  • We call local features key points or interesting
    points of an image

19
Interesting Point Detection
  • An image example with key points detected by a
    corner detector.

20
Interesting Point Detection
  • The detection of interesting points needs to be
    robust to various geometric transformations

[Figure panels: Original; Scaling + Rotation + Translation; Projection]
21
Interesting Point Detection
  • The detection of interesting points needs to be
    robust to imaging conditions, e.g., lighting and
    blurring.

22
Descriptor
  • Represent each detected key point
  • Take measurements from a region centered on an
    interesting point
  • E.g., texture, shape, etc.
  • Each descriptor is a vector of fixed length
  • E.g., a SIFT descriptor is a 128-dimensional vector

23
Descriptor
  • The descriptor should also be robust under
    different image transformations.

They should have similar descriptors
24
Image Representation
  • Bag-of-features representation: an example

Each descriptor is 5-dimensional:
22 0 19 23 1
66 103 45 6 38
232 44 0 11 48
29 55 129 0 1
11 78 110 1 32
220 30 11 34 21
[Figure: original image → detected key points → descriptors of the key points]
25
Retrieval
22 0 19 23 1
66 103 45 6 38
232 44 0 11 48
29 55 129 0 1
...
How to measure similarity?
26
Retrieval
22 0 19 23 1
66 103 45 6 38
232 44 0 11 48
29 55 129 0 1
...
Count the number of matches!
27
Retrieval
If the distance between two vectors is smaller
than the threshold, we get one match
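
A minimal sketch of this match counting, assuming Euclidean distance and assuming each query key point is counted at most once. The threshold value and the second image's first descriptor are made up for illustration; the other descriptors are taken from the earlier slides.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def count_matches(query_descriptors, image_descriptors, threshold):
    """Count query key points that lie within threshold of some image key point."""
    matches = 0
    for q in query_descriptors:
        if any(euclidean(q, d) < threshold for d in image_descriptors):
            matches += 1
    return matches


query = [[22, 0, 19, 23, 1], [66, 103, 45, 6, 38]]
# Hypothetical database image: a slightly perturbed copy of the first query
# descriptor plus one unrelated descriptor.
image = [[23, 1, 18, 22, 2], [232, 44, 0, 11, 48]]

matched = count_matches(query, image, threshold=5.0)
```

Here only the first query key point finds a partner within the threshold, so `matched` is 1. Every query descriptor is compared with every image descriptor, which is why this approach gets expensive for large databases.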
28
Retrieval
Matched points: 1
Matched points: 5
29
Problems
  • Computationally expensive
  • Requires a linear scan of the entire database
  • Example: match a query image against a database of 1
    million images
  • 0.1 second to compute the match between two
    images
  • Takes more than one day to answer a single query
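
The estimate follows directly from the two numbers on the slide, as this short calculation shows (the variable names are illustrative only).

```python
num_images = 1_000_000    # database size from the slide
seconds_per_match = 0.1   # cost of matching one pair of images

total_seconds = num_images * seconds_per_match
total_hours = total_seconds / 3600
total_days = total_hours / 24
```

That is 100,000 seconds, or roughly 27.8 hours, for a single query.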

30
Bag-of-words Model
  • Compare to the bag-of-words representation in
    text retrieval

A document: a collection of the words in the document
An image: a collection of the key points of the image
What is the difference?
31
Bag-of-words
A document: a collection of the words in the document
An image: a collection of the key points of the image
What is the difference?
The same word appears in many documents
No two key points are the same, but similar key points
appear in many images that have similar visual
content
Group similar key points from different images
into visual words
32
Bag-of-words Model
Group key points into visual words
Represent images by histograms of visual words
33
Bag-of-words
  • The grouping is usually done by clustering.
  • Cluster the key points of all images into a
    number of clusters (e.g., 100,000 clusters).
  • Each cluster center is called a visual word
  • The collection of all cluster centers is called
    the visual vocabulary
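
To make the clustering step concrete, here is a small sketch of plain k-means (Lloyd's algorithm). The slides do not say which clustering algorithm is meant, so plain k-means is an assumption, and the naive initialization and toy 2-D points are purely illustrative. Real key points would be 128-dimensional.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_center(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: squared_distance(point, centers[i]))


def kmeans(points, k, iterations=10):
    """Plain k-means; returns k cluster centers (the visual words)."""
    centers = [list(p) for p in points[:k]]  # naive init: the first k points
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest_center(p, centers)].append(p)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                dim = len(group[0])
                centers[i] = [sum(p[d] for p in group) / len(group) for d in range(dim)]
    return centers


# Toy 2-D "key points" forming two obvious groups.
points = [[0, 0], [10, 10], [0, 1], [10, 11], [1, 0], [11, 10]]
vocabulary = kmeans(points, k=2)
```

Each entry of `vocabulary` plays the role of one visual word.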

34
Retrieval by Bag-of-words Model
  • Generate visual vocabulary
  • Represent each key point by its nearest visual
    word
  • Represent an image by a bag of visual words
  • Text retrieval techniques can then be applied directly.
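
One way to see why text retrieval applies: once images are bags of visual words, an inverted index lets a query touch only the images that share a word with it, instead of scanning the whole database. The sketch below uses a simplified tf-idf score and three toy images; it is an illustration, not the scoring formula of any particular search engine.

```python
import math
from collections import Counter, defaultdict


def build_inverted_index(bags):
    """Map each visual word to {image name: term frequency}."""
    index = defaultdict(dict)
    for name, words in bags.items():
        for word, tf in Counter(words).items():
            index[word][name] = tf
    return index


def search(index, num_images, query_words, top_k=10):
    """Rank images by a simplified tf-idf score against the query."""
    scores = defaultdict(float)
    for word, query_tf in Counter(query_words).items():
        postings = index.get(word, {})
        if not postings:
            continue
        idf = math.log(num_images / len(postings))
        for name, tf in postings.items():
            scores[name] += query_tf * tf * idf * idf
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_k]


# Toy database of three images, each a bag of visual words.
bags = {
    "imgA": ["w2", "w1"],
    "imgB": ["w3", "w3", "w2"],
    "imgC": ["w3", "w2"],
}
index = build_inverted_index(bags)
results = search(index, len(bags), ["w1", "w3"])
```

For this query imgA ranks first because w1 is rare, followed by imgB and imgC, which share only the more common word w3.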

35
Project
  • Build a system for near-duplicate image
    retrieval
  • A database with 10,000 images
  • Construct a bag-of-words model for each image
    (offline)
  • Construct a bag-of-words model for a query image
  • Retrieve the 10 most visually similar images
    from the database for the given query

36
Step 1: Dataset
  • 10,000 color images under the folder ./img
  • The key points of each image have already been
    extracted
  • The key points of all images are saved in a single
    file, ./feature/esp.feature
  • Each line corresponds to a key point with 128
    attributes
  • Attributes in each line are separated by tabs

37
Step 1: Dataset
  • To locate the key points of individual images, two
    other files are needed
  • ./imglist.txt: the order of the images used when
    saving their key points
  • ./feature/esp.size: the number of key points each
    image has

38
Step 1: Dataset
  • Example: three images, imgA, imgB, and imgC.
  • imgA: 2 key points; imgB: 3 key points; imgC: 2
    key points.
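
The bookkeeping in this example can be sketched as below. The helper names are hypothetical, and the sketch works on in-memory lists because the exact layout of ./feature/esp.size is not spelled out on the slides; it only assumes that the counts are in the same order as the names in ./imglist.txt.

```python
def parse_feature_line(line):
    """Parse one tab-separated line of the feature file into a list of floats."""
    return [float(value) for value in line.rstrip("\n").split("\t")]


def split_key_points(image_names, sizes, key_points):
    """Assign consecutive runs of key points to images, in list order."""
    if sum(sizes) != len(key_points):
        raise ValueError("sizes do not add up to the number of key points")
    per_image = {}
    start = 0
    for name, count in zip(image_names, sizes):
        per_image[name] = key_points[start:start + count]
        start += count
    return per_image


image_names = ["imgA", "imgB", "imgC"]
sizes = [2, 3, 2]
# Placeholders standing in for seven 128-dimensional key points.
key_points = ["kp1", "kp2", "kp3", "kp4", "kp5", "kp6", "kp7"]

per_image = split_key_points(image_names, sizes, key_points)
```

With these counts, the first two lines of the feature file belong to imgA, the next three to imgB, and the last two to imgC.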

39
Step 2: Key Point Quantization
  • Represent each image by a bag of visual words
  • Construct the visual vocabulary
  • Cluster all the key points into 10,000
    clusters
  • Each cluster center is a visual word
  • Map each key point to a visual word
  • Find the nearest cluster center for each key
    point (nearest neighbor search)

40
Step 2: Key Point Quantization
  • Example: clustering 7 key points into 3 clusters
  • The cluster centers are cnt1, cnt2, cnt3
  • Each center is a visual word: w1, w2, w3
  • Find the nearest center to each key point

41
Step 2: Key Point Quantization
  • imgA.jpg
  • 1st key point → w2
  • 2nd key point → w1
  • imgB.jpg
  • 1st key point → w3
  • 2nd key point → w3
  • 3rd key point → w2
  • imgC.jpg
  • 1st key point → w3
  • 2nd key point → w2

Bag-of-words representation:
imgA.jpg: w2 w1
imgB.jpg: w3 w3 w2
imgC.jpg: w3 w2
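
The mapping from key points to visual words can be sketched with a brute-force nearest-center search. The 2-D coordinates below are invented so that the result reproduces the assignments on this slide; the real key points are 128-dimensional, and the project uses FLANN rather than a linear scan.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quantize(key_points, centers):
    """Replace each key point by the name of its nearest cluster center."""
    words = []
    for point in key_points:
        best = min(centers, key=lambda name: squared_distance(point, centers[name]))
        words.append(best)
    return words


# Hypothetical cluster centers cnt1, cnt2, cnt3, named by their visual words.
centers = {"w1": (0, 0), "w2": (5, 5), "w3": (10, 0)}

# Hypothetical key points for the three example images.
images = {
    "imgA.jpg": [(5, 4), (1, 0)],
    "imgB.jpg": [(9, 1), (10, -1), (4, 5)],
    "imgC.jpg": [(11, 0), (5, 6)],
}

bags = {name: quantize(points, centers) for name, points in images.items()}
```

The resulting `bags` lists one visual word per key point, in key point order.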
42
Step 2: Key Point Quantization
  • We provide the FLANN library for clustering and
    nearest neighbor search.
  • For clustering, use flann_compute_cluster_centers(
  • float* dataset, // your key points
  • int rows, // number of key points
  • int cols, // 128, dim of a key point
  • int clusters, // number of clusters
  • float* result, // cluster centers
  • struct IndexParameters* index_params, struct
    FLANN

43
Step 2: Key Point Quantization
  • For nearest neighbor search
  • Build an index for the cluster centers
  • flann_build_index(
  • float* dataset, // your cluster centers
  • int rows, int cols, float* speedup, struct
    IndexParameters* index_params, struct
    FLANNParameters* flann_params)
  • For each key point, search for the nearest cluster center
  • flann_find_nearest_neighbors_index(
  • FLANN_INDEX index_id, // your index above
  • float* testset, // your key points
  • int trows, int* result, int nn, int checks,
    struct FLANNParameters* flann_params)

44
Step 2: Key Point Quantization
  • In this step, you need to save
  • the cluster centers to a file. You will use this
    later on for quantizing the key points of query
    images
  • the bag-of-words representation of each image in
    TREC format.

Bag-of-words representation:
imgA.jpg: w2 w1
imgB.jpg: w3 w3 w2
imgC.jpg: w3 w2

TREC format:
<DOC> <DOCNO>imgA</DOCNO> <TEXT> w2 w1 </TEXT> </DOC>
<DOC> <DOCNO>imgB</DOCNO> <TEXT> w3 w3 w2 </TEXT> </DOC>
<DOC> <DOCNO>imgC</DOCNO> <TEXT> w3 w2 </TEXT> </DOC>
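
Writing the bags out as TREC-style documents (DOC, DOCNO, and TEXT elements, as on this slide) could look like the sketch below. The function name is hypothetical, and the whitespace inside each document is a free choice made here for readability.

```python
def to_trec(bags):
    """Serialize {image name: list of visual words} as TREC-style documents."""
    documents = []
    for name, words in bags.items():
        documents.append(
            "<DOC>\n"
            "<DOCNO>{}</DOCNO>\n"
            "<TEXT>\n{}\n</TEXT>\n"
            "</DOC>".format(name, " ".join(words))
        )
    return "\n".join(documents) + "\n"


bags = {
    "imgA": ["w2", "w1"],
    "imgB": ["w3", "w3", "w2"],
    "imgC": ["w3", "w2"],
}
trec_text = to_trec(bags)
```

The returned string can then be written to a file and passed to the indexer in the next step.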
45
Step 3: Build index using Lemur
  • The same as in the previous homework
  • Use the KeyfileIncIndex index
  • No stemming
  • No stop words

46
Step 4: Extract key points for a query
  • Three sample query images under ./sample query/
  • The query images are in .pgm format
  • The extraction tool is under ./sift tool/
  • For Windows, use siftW32.exe
  • For Linux, use sift
  • Example: issue the command
  • sift < input.pgm > output.keypoints

47
Step 5: Generate a bag-of-words model for a query
  • Map each key point of a given query to a visual
    word.
  • Use the cluster center file generated in step 2
  • Build index for the cluster centers using
    flann_build_index()
  • For each key point, search nearest cluster center
    using flann_find_nearest_neighbors_index()

48
Step 5: Generate a bag-of-words model for a query
  • Write the bag-of-words model for a query image in
    the Lemur format.
  • <DOC 1>
  • The mapped cluster ID for the 1st key point
  • The mapped cluster ID for the 2nd key point
  • The mapped cluster ID for the 3rd key point
  • </DOC>

49
Step 6: Image Retrieval by Lemur
  • Use the Lemur command RetEval as
  • RetEval <parameter_file>
  • An example of a parameter file
  • <parameters>
  • <index>/home/user1/myindex/myindex.key</index>
  • <retModel>tfidf</retModel>
  • <textQuery>/home/user1/query/q1.query</textQuery>
  • <resultFile>/home/user1/result/ret.result</resultFile>
  • <TRECResultFormat>1</TRECResultFormat>
  • <resultCount>10</resultCount>
  • </parameters>

50
Step 7: Graphical User Interface
  • Build a GUI for the image retrieval system
  • Browse the image database
  • Select an image from the database to query the
    database and display the top 10 retrieved results
  • Extract the bag-of-words representation of the
    query
  • Write it into a file in the format specified
    in step 5
  • Run the RetEval command for retrieval
  • Load an external query image, search the
    images in the database, and display the top 10
    retrieved results

51
Step 8: Evaluation
  • Demo your system in class during the last week.
  • We will provide a number of test query images
  • Run your GUI, load each test query image, and
    display the ten most similar images from
    the database