SIFT (Lowe 99) - PowerPoint PPT Presentation

About This Presentation

Slides: 43

Provided by: M976

Learn more at: https://people.csail.mit.edu

Transcript and Presenter's Notes

Title: SIFT (Lowe 99)

1
SIFT (Lowe 99)
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural
Scene Categories (Lazebnik et al. 2006)
(various slides stolen from the web)
2
Scale-Invariant Feature Transform

Generates image features, keypoints
invariant to image scaling and rotation
partially invariant to change in illumination and
3D camera viewpoint
many can be extracted from typical images
highly distinctive

3
Algorithm Stages

Scale-space Extrema Detection
Uses difference-of-Gaussian function
Keypoint Localization
Sub-pixel location and scale fit to a model
Orientation assignment
1 or more for each keypoint
Keypoint descriptor
Created from local image gradients

4
Scale Space
5
Difference Of Gaussian Pyramid
Blur, Resample
[Figure: pyramid built by repeated blur and resample; adjacent blurred images labeled A and B]
6
Difference Of Gaussian Pyramid
A - B
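The subtraction of two adjacent blurred images can be sketched in plain Python as follows. This is a minimal illustration, not the presenter's code: the function names, the clamped border handling, the 3-sigma kernel truncation, and the scale step k = sqrt(2) are all assumptions. A is taken to be the image blurred with k * sigma and B the image blurred with sigma, which is also an assumption about the slide's labels.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel, truncated at about 3 sigma (assumed)."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the border."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(k * row[min(max(x + j - radius, 0), width - 1)]
             for j, k in enumerate(kernel))
         for x in range(width)]
        for row in image
    ]

def transpose(image):
    return [list(column) for column in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))

def difference_of_gaussians(image, sigma, k=math.sqrt(2)):
    """Return A - B, with A blurred by k * sigma and B blurred by sigma."""
    a = gaussian_blur(image, k * sigma)
    b = gaussian_blur(image, sigma)
    return [[pa - pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]
```

Repeating this for several values of sigma, and again after downsampling the image, would give one difference image per scale in each octave of the pyramid.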
7
Extrema Detection

Keypoint must be a minimum or maximum among its 8
neighbors at its own scale and the 9 neighbors in the scale above
and the 9 in the scale below (26 neighbors in total).
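The 26-neighbor test can be sketched as below. It assumes the difference-of-Gaussian images of one octave are stored as a list of equally sized 2D lists indexed as dog[scale][y][x]; that layout and the function names are illustrative choices, and the comparison is strict, so ties are rejected.

```python
def is_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly above or strictly below all 26 neighbors."""
    center = dog[s][y][x]
    neighbors = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return (all(center > n for n in neighbors)
            or all(center < n for n in neighbors))

def find_extrema(dog):
    """Scan every interior sample; border samples lack a full neighborhood."""
    found = []
    for s in range(1, len(dog) - 1):
        for y in range(1, len(dog[s]) - 1):
            for x in range(1, len(dog[s][y]) - 1):
                if is_extremum(dog, s, y, x):
                    found.append((s, y, x))
    return found
```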

8
Extrema Detection
9
Keypoint Localization and Refinement

Refine keypoint/extremum position by fitting a 3D
quadratic model to get subpixel accuracy of the x, y
position and scale.
Throw out points that have low contrast.
Remove points that lie along edges ("too edgy").
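The two rejection steps can be sketched as below; the 3D quadratic fit itself is not shown. The slide does not say how "low contrast" or "edgy" is measured, so the details here are assumptions borrowed from Lowe's later description of SIFT: a contrast threshold of 0.03 for pixel values in [0, 1], and a test on the ratio of principal curvatures, computed from the 2x2 Hessian of the difference-of-Gaussian image, with r = 10. The function names are hypothetical.

```python
def passes_contrast_test(value, threshold=0.03):
    """Keep a keypoint only if its (interpolated) response is large enough."""
    return abs(value) >= threshold

def passes_edge_test(d, y, x, r=10.0):
    """Keep a keypoint only if trace(H)^2 / det(H) < (r + 1)^2 / r.

    d is one difference-of-Gaussian image as a 2D list; H is estimated
    with finite differences around the interior sample (y, x).
    """
    dxx = d[y][x + 1] - 2 * d[y][x] + d[y][x - 1]
    dyy = d[y + 1][x] - 2 * d[y][x] + d[y - 1][x]
    dxy = (d[y + 1][x + 1] - d[y + 1][x - 1]
           - d[y - 1][x + 1] + d[y - 1][x - 1]) / 4.0
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Curvatures of opposite sign, or one of them zero: reject.
        return False
    return trace * trace / det < (r + 1) ** 2 / r
```

An isolated blob has similar curvature in both directions and passes, while a straight ridge has near-zero curvature along its length and is rejected.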

10
Keypoint Localization and Refinement
11
Keypoint Localization and Refinement
12
Orientation Assignment

Create histogram of local gradient directions
computed at selected scale
Assign canonical orientation at peak of smoothed
histogram
Each keypoint specifies stable 2D coordinates (x,
y, scale, orientation)
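A minimal sketch of the orientation histogram follows. The 36 bins, the [1, 2, 1] / 4 circular smoothing filter, and the function names are assumptions for illustration. The sketch also leaves out the Gaussian weighting of samples around the keypoint and the extra orientations created for secondary peaks. Angles are measured in image coordinates, with y increasing downward.

```python
import math

def orientation_histogram(patch, num_bins=36):
    """Accumulate gradient magnitude into bins of gradient direction."""
    hist = [0.0] * num_bins
    bin_width = 360.0 / num_bins
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(dx, dy)
            angle = math.degrees(math.atan2(dy, dx)) % 360.0
            hist[int(angle / bin_width) % num_bins] += magnitude
    return hist

def smooth(hist):
    """Circular [1, 2, 1] / 4 smoothing of the histogram."""
    n = len(hist)
    return [(hist[(i - 1) % n] + 2 * hist[i] + hist[(i + 1) % n]) / 4.0
            for i in range(n)]

def dominant_orientation(patch, num_bins=36):
    """Return the center angle, in degrees, of the highest smoothed bin."""
    hist = smooth(orientation_histogram(patch, num_bins))
    peak = max(range(num_bins), key=lambda i: hist[i])
    return (peak + 0.5) * 360.0 / num_bins
```

For a patch whose intensity rises equally in x and y, every gradient points at 45 degrees, so the peak falls in the bin covering 40 to 50 degrees.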

13
Example from paper
14
SIFT Descriptor

Try to mimic complex cells in the visual cortex
Selective to spatial frequency and orientation
but allows for shifts in position
Be robust to small affine transformations
Local affine transformations affect positions
more than orientation and spatial frequency.

15
SIFT Descriptor

Thresholded image gradients are sampled over a
16x16 array of locations at the keypoint scale
Create an array of orientation histograms rotated
relative to the orientation of the keypoint.
8 orientations x 4x4 histogram array = 128
dimensions
Distribute each sample to adjacent bins by
trilinear interpolation (avoids boundary effects)
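The 4x4x8 layout can be sketched as below. This is a simplified illustration under stated assumptions: each sample goes into a single bin (no trilinear interpolation), only the gradient angles are rotated relative to the keypoint (the sampling grid is not), and "thresholded" is read as clipping the normalized vector at 0.2 and renormalizing, a value taken from Lowe's later paper rather than from the slide. The function names are hypothetical.

```python
import math

def normalize(vector):
    """Scale to unit Euclidean length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

def sift_like_descriptor(magnitudes, angles, keypoint_angle=0.0, clip=0.2):
    """Build a 128-dimensional descriptor from 16x16 gradient samples.

    magnitudes and angles are 16x16 lists; angles are in degrees.
    """
    hist = [[[0.0] * 8 for _ in range(4)] for _ in range(4)]
    for y in range(16):
        for x in range(16):
            # Express the gradient direction relative to the keypoint.
            relative = (angles[y][x] - keypoint_angle) % 360.0
            orientation_bin = int(relative // 45.0) % 8
            hist[y // 4][x // 4][orientation_bin] += magnitudes[y][x]
    vector = [v for row in hist for cell in row for v in cell]
    vector = [min(v, clip) for v in normalize(vector)]
    return normalize(vector)
```

Because angles are stored relative to the keypoint orientation, rotating every gradient and the keypoint by the same amount leaves the vector unchanged in this simplified version.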

16
3D object recognition example from paper
17
SIFT Review

Generates image features, keypoints
invariant to image scaling and rotation
partially invariant to change in illumination and
3D camera viewpoint
many can be extracted from typical images
Each keypoint has an associated descriptor that
is
relative to the keypoint orientation and scale
robust to small affine transformations.

18
SIFT Review

Note
We can skip the keypoint detection.
Pick a grid over the image and compute a descriptor
for each grid point.
Fei-Fei and Perona (CVPR 2005) showed this works
better for scene classification.
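Picking grid points in place of detected keypoints could look like the sketch below. The 8-pixel spacing and 8-pixel margin (enough room for a 16x16 patch around each point) are illustrative assumptions, as is the function name.

```python
def dense_grid(width, height, step=8, margin=8):
    """Return (x, y) sample points on a regular grid, row by row."""
    return [
        (x, y)
        for y in range(margin, height - margin + 1, step)
        for x in range(margin, width - margin + 1, step)
    ]
```

A descriptor would then be computed at every returned point, typically at a fixed scale and orientation since no detector has estimated them.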

19
Beyond Bags of Features: Spatial Pyramid Matching
for Recognizing Natural Scene Categories
(Lazebnik et al. 2006)
Many slides borrowed from
http://www.ima.umn.edu/2005-2006/W5.22-26.06/activities/Lazebnik-Svetlana/ima_poster.pdf
and
http://people.csail.mit.edu/kgrauman/slides/pyr_match_iccv2005.ppt
20
Overview

Adds approximate global geometric
correspondence to bag of features techniques
for scene recognition
Spatial pyramid matching partitions the image
into multiscale subregions and computes feature
histograms.
Use weak features (oriented edges at multiple
scales) and strong features (vocabulary formed
from gridded SIFT descriptors)

21
Motivation

A pre-attentive approach: recognize the scene as a
whole without examining its constituent objects.

22
Images as collections of features

Image as unordered set of d-dimensional feature
vectors
Varying number of vectors per instance

23
Classifiers (hand wavy)

Training data: multiple images for each class
Image is represented by unordered set of features
We need some way to compare feature set X to
feature set Y.
Some similarity function K(X,Y).

24
Classifiers (hand wavy)

Nearest neighbor: given input X,
find the Y that maximizes K(X,Y) over all Y in the
training set.
Label X with the class label of Y.
SVM: use K(X,Y) as the kernel function
Inner product
Mercer kernel
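The nearest-neighbor rule can be sketched with any similarity function K. Histogram intersection stands in for K here purely as an assumption for the example; the names and the toy data are hypothetical.

```python
def histogram_intersection(a, b):
    """Similarity of two histograms: the sum of bin-wise minima."""
    return sum(min(p, q) for p, q in zip(a, b))

def nearest_neighbor_label(x, training_set, kernel):
    """Return the label of the training example Y that maximizes K(x, Y).

    training_set is a list of (features, label) pairs.
    """
    _, best_label = max(training_set, key=lambda item: kernel(x, item[0]))
    return best_label

# Toy usage with made-up three-bin histograms.
training = [([3, 0, 1], "coast"), ([0, 4, 0], "forest")]
predicted = nearest_neighbor_label([2, 1, 1], training, histogram_intersection)
```

The same function K could be handed to an SVM as a precomputed kernel, provided it satisfies Mercer's condition.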

25
Partial matching

Compare sets by computing a partial matching
between their features.

26
Computing the partial matching

Earth Mover's Distance
Rubner, Tomasi, Guibas 1998
Hungarian method
Kuhn, 1955
Greedy matching
Pyramid match

Grauman and Darrell, ICCV 2005
for sets with features of dimension
27
Pyramid match overview
Pyramid match measures similarity of a partial
matching between two sets

Place multi-dimensional, multi-resolution grid
over point sets
Consider points matched at finest resolution
where they fall into same grid cell
Approximate optimal similarity with worst case
similarity within pyramid cell

No explicit search for matches!
28
Pyramid match overview
29
Pyramid Match

d dimensional feature vectors
A sequence of grids at resolutions 0, ..., L
At level l

d = 2, L = 2
30
Pyramid match Kernel

Matches at level l include matches at level l + 1
New matches at level l (for l = 0, ..., L-1)
Penalize easy matches at larger scales with
weight
Match kernel
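The weight and kernel formulas did not survive in this transcript. The sketch below follows the form given in the Lazebnik et al. paper, stated here as an assumption about what the slide showed: with I_l the histogram intersection at level l, the kernel is I_L plus the sum over l = 0, ..., L-1 of (I_l - I_{l+1}) / 2^(L-l). Points are assumed to lie in the unit cube [0, 1)^d, and the function names are hypothetical.

```python
from collections import Counter

def grid_histogram(points, level):
    """Count points per cell of a grid with 2**level cells per dimension."""
    cells = 2 ** level
    return Counter(
        tuple(min(int(c * cells), cells - 1) for c in point)
        for point in points
    )

def intersection(h1, h2):
    """Number of matches at one level: the sum of bin-wise minima."""
    return sum(min(count, h2[cell]) for cell, count in h1.items())

def pyramid_match(x, y, levels):
    """Pyramid match kernel with finest level `levels` and coarsest level 0."""
    matches = [intersection(grid_histogram(x, l), grid_histogram(y, l))
               for l in range(levels + 1)]
    score = float(matches[levels])
    for l in range(levels):
        new_matches = matches[l] - matches[l + 1]
        score += new_matches / 2 ** (levels - l)
    return score

# Toy usage: one pair matches at the finest level, one only at level 1.
x = [(0.1, 0.1), (0.9, 0.9)]
y = [(0.15, 0.12), (0.6, 0.6)]
score = pyramid_match(x, y, 2)
```

In this toy case the first pair shares a cell at level 2 and counts fully, while the second pair first shares a cell at level 1 and counts with weight 1/2.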

31
Vocabulary of M features

Only features of the same type can be matched.
Each channel m treated separately

32
Vocabulary of M features
33
Spatial pyramid representation
d = 2 (x, y)
M classes of features
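One way to realize this representation is a single long weighted histogram per image: for every level, every grid cell, and every one of the M feature classes, count the features that fall there. The sketch below assumes image coordinates scaled to [0, 1) and the level weights from the Lazebnik et al. paper (1/2^L at level 0, 1/2^(L-l+1) at level l >= 1); the function name and the input layout are hypothetical.

```python
def spatial_pyramid_vector(features, vocabulary_size, levels):
    """Concatenate weighted per-cell, per-class histograms over all levels.

    features is a list of (x, y, word) with x, y in [0, 1) and
    word in range(vocabulary_size).
    """
    vector = []
    for level in range(levels + 1):
        cells = 2 ** level
        if level == 0:
            weight = 1 / 2 ** levels
        else:
            weight = 1 / 2 ** (levels - level + 1)
        hist = [0.0] * (cells * cells * vocabulary_size)
        for x, y, word in features:
            cx = min(int(x * cells), cells - 1)
            cy = min(int(y * cells), cells - 1)
            hist[(cy * cells + cx) * vocabulary_size + word] += weight
        vector.extend(hist)
    return vector
```

With M classes and L levels the vector has M * (4^(L+1) - 1) / 3 entries, for example 63 entries for M = 3 and L = 2. Since features of different classes land in different entries, each class is effectively matched as a separate channel when two such vectors are compared by histogram intersection.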
34
Feature Extraction
35
Experimental Results
36
Scene Category Dataset
37
Scene Category Retrieval
38
Scene Category Confusion
39
Caltech 101
40
Caltech 101 Comparison
41
Caltech 101 Challenges
42
Graz
