Automatic Matching of Multi-View Images

1
Automatic Matching of Multi-View Images

Ed Bremer
University of Rochester

2
References

[1] Mikolajczyk, K., Schmid, C., 2004. A performance evaluation of local descriptors. Submitted to PAMI, October 2004. http://lear.inrialpes.fr/pubs/2004/MS04a
[2] Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., Van Gool, L., 2004. A comparison of affine region detectors. Submitted to International Journal of Computer Vision, August 2004. http://lear.inrialpes.fr/pubs/2004/MTSZMSKG04
[3] Lowe, D., 2004. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60(2), pp. 91-110.
[4] Matas, J., Chum, O., Urban, M., Pajdla, T., 2002. Robust Wide Baseline Stereo From Maximally Stable Extremal Regions. Proc. British Machine Vision Conference (BMVC 2002), pp. 384-393.
[5] Schaffalitzky, F., Zisserman, A., 2002. Multi-view matching for unordered image sets, or "How do I organize my holiday snaps?". Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark, vol. 1, pp. 414-431.
[6] Baumberg, A., 2000. Reliable Feature Matching Across Widely Separated Views. In Proc. CVPR, pp. 774-781.
[7] Mikolajczyk, K., Schmid, C., 2001. Indexing based on scale invariant interest points. In Proc. 8th ICCV, pp. 525-531.

3
Outline

Motivation
Applications
Process Components
Region Detectors
Descriptors
Matching Criteria
Performance Evaluation
Conclusions and Next Steps

4
Motivation

Multi-view/Multi-image Matching
Multiple images of a scene taken by a single camera or by multiple cameras, with different rotation, scale, viewpoint and illumination

3D scene

5
Motivation

Applications
Detecting matching regions is used in all of the following:
Image registration
Super-resolution
Stereo vision
Object detection and recognition
Object and motion tracking
Indexing and retrieval of objects
3D scene reconstruction
Scene recognition

6
Examples of Multi-view Images [2]

[2] Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., Van Gool, L., 2004, A comparison of affine region detectors.

7
Process Components

Covariant region detection
Detect image regions covariant to class of
transformation between reference image and
transformed image
Invariant descriptor
Compute invariant descriptors from covariant
regions
Descriptor matching
Compute distance between descriptors in reference
image and transformed image
[1] Mikolajczyk, K., Schmid, C., 2004, A performance evaluation of local descriptors.

8
Region Detectors

Support regions for computation of descriptors
Determined independently in each image
Scale invariant or Affine invariant
Can be points (feature points) or regions
(covariant)
Provide dense (local) coverage robust to
occlusion
Need to be stable and repeatable
Five region detectors -
Harris points -> invariant to rotation
Harris-Laplace -> invariant to rotation and scale
Hessian-Laplace -> invariant to rotation and scale
Harris-Affine -> invariant to affine image transformations
Hessian-Affine -> invariant to affine image transformations
[1] Mikolajczyk, K., Schmid, C., 2004, A performance evaluation of local descriptors.

9
Region Detectors

Harris points -
Maxima of Harris function used to locate interest
point
Support region fixed in size, 41x41 neighborhood
centered at interest point
Harris-Laplace regions -
Scale adapted Harris function
Interest point is selected where the Laplacian-of-Gaussian attains a local minimum or maximum across scale-space
[1] Mikolajczyk, K., Schmid, C., 2004, A performance evaluation of local descriptors.
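
The slide does not spell out the Harris function. A minimal sketch, assuming the usual corner measure det(M) - k * trace(M)^2 on the 2x2 second-moment matrix M of image gradients; the function name and the value k = 0.04 are illustrative assumptions, not taken from the presentation:

```python
def harris_response(ixx, ixy, iyy, k=0.04):
    """Harris measure for the second-moment matrix [[ixx, ixy], [ixy, iyy]].

    k = 0.04 is a commonly used constant (an assumption here).
    """
    det = ixx * iyy - ixy * ixy
    trace = ixx + iyy
    return det - k * trace * trace


corner = harris_response(10.0, 0.0, 10.0)  # strong gradients in both directions
edge = harris_response(10.0, 0.0, 0.0)     # gradient in one direction only
flat = harris_response(0.0, 0.0, 0.0)      # no gradient at all
```

Interest points would then be taken at local maxima of this response, which is positive for corner-like structure and negative along edges.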

10
Region Detectors

Harris-Laplace Performance -
Approximately 10% better than Laplacian, Lowe or gradient methods.
Standard Harris detector performs very poorly under scale changes
[7] Mikolajczyk, K., Schmid, C., 2001, Indexing based on scale invariant interest points, In Proc. 8th ICCV, pages 525-531.

11
Region Detectors

Hessian-Laplace regions -
Interest point is at a local maximum of the Hessian determinant
Location in scale-space using maxima of Laplacian-of-Gaussian (can also use Difference-of-Gaussians)
[1] Mikolajczyk, K., Schmid, C., 2004, A performance evaluation of local descriptors.

12
Region Detectors

Harris-Affine regions -
Find regions using Harris-Laplace detector
Affine-adapted region based on the 2nd moment matrix
Hessian-Affine regions -
Find regions using Hessian-Laplace detector
Affine-adapted region based on the 2nd moment matrix
[2] Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., Van Gool, L., 2004, A comparison of affine region detectors.

13
Region Detectors

Regions produced by Harris-Affine and
Hessian-Affine detectors

14
Region Detectors

Affine normalization using 2nd moment matrix for
region L and R
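
The slide shows the normalization only as a figure. One common formulation, assumed here, treats a region as the ellipse x^T M x = 1 and maps it to the unit circle with the matrix square root of the second-moment matrix M. The helper names are hypothetical, and the closed-form 2x2 square root is an implementation choice:

```python
import math


def sqrt_spd_2x2(m):
    """Square root of a symmetric positive definite 2x2 matrix (closed form)."""
    (a, b), (_, d) = m
    s = math.sqrt(a * d - b * b)        # sqrt of the determinant
    t = math.sqrt(a + d + 2.0 * s)
    return [[(a + s) / t, b / t], [b / t, (d + s) / t]]


def apply(mat, p):
    """Multiply a 2x2 matrix by a 2D point."""
    return (mat[0][0] * p[0] + mat[0][1] * p[1],
            mat[1][0] * p[0] + mat[1][1] * p[1])


M = [[2.0, 1.0], [1.0, 2.0]]            # example second-moment matrix
A = sqrt_spd_2x2(M)
on_ellipse = (1.0 / math.sqrt(2.0), 0.0)  # satisfies x^T M x = 1
normalized = apply(A, on_ellipse)         # should land on the unit circle
```

After both regions L and R are normalized this way, they differ only by a rotation, which is removed separately using the dominant gradient orientation.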

15
Region Detectors

Region normalization
Detectors produce circular or elliptical regions
Size dependent on detection scale
Map regions to circular region with constant radius
Rotate regions in direction of dominant gradient orientation
Illumination normalization
Use affine transformation -> aI(x) + b
Normalize using mean and standard deviation of pixel intensities
[1] Mikolajczyk, K., Schmid, C., 2004, A performance evaluation of local descriptors.
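
A small sketch of the illumination normalization step, assuming a patch is given as a flat list of intensities and that the gain a is positive. Subtracting the mean and dividing by the standard deviation cancels the affine change aI(x) + b. The function name is hypothetical:

```python
import statistics


def normalize_patch(pixels):
    """Zero-mean, unit-variance version of a patch (flat list of intensities)."""
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)
    if std == 0:
        return [0.0 for _ in pixels]   # constant patch: nothing to normalize
    return [(p - mean) / std for p in pixels]


patch = [10.0, 20.0, 30.0, 40.0]
brighter = [2.5 * p + 7.0 for p in patch]   # same patch under a*I + b, a > 0

same = all(abs(u - v) < 1e-9
           for u, v in zip(normalize_patch(patch), normalize_patch(brighter)))
```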

16
Descriptors

Descriptors -> Feature vector
Invariant to changes in scale, rotation, affine transformation and affine illumination
Need to be distinct, stable and repeatable
Distribution (histogram) type or Covariance type
Ten Descriptor types
Scale-Invariant Feature Transform (SIFT)
Gradient Location and Orientation histogram
(GLOH)
Shape Context
Principal Component Analysis (PCA)-SIFT
Steerable Filters
Differential Invariants
Complex Filters
Moment Invariants
Cross-Correlation
Spin Image
[1] Mikolajczyk, K., Schmid, C., 2004, A performance evaluation of local descriptors.

17
Descriptors

SIFT and GLOH 3D Descriptors
SIFT -> 4 x 4 x 8 = 128 dimension descriptor
GLOH -> Log-polar (2 x 8 + 1) x 16 = 272 dimension descriptor
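
The dimension counts can be checked with a few lines of arithmetic. The reading of the GLOH term as two rings of 8 angular bins plus one undivided central bin, each with 16 orientation bins, is an interpretation of the slide's formula; the variable names are illustrative:

```python
# SIFT: 4 x 4 grid of location cells, 8 orientation bins per cell
sift_dim = 4 * 4 * 8

# GLOH: log-polar grid with 2 rings of 8 angular bins plus 1 central bin,
# each location bin holding a 16-bin orientation histogram
gloh_location_bins = 2 * 8 + 1
gloh_dim = gloh_location_bins * 16
```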

18
Matching Criteria

Distance measure
Find putative matches between images
Mahalanobis distance used for covariance-type descriptors
Euclidean distance used for distribution (histogram) descriptors
Direct distance comparison not suitable for indexing or database searching
Simple threshold
Descriptors match if the distance between them is below threshold t
Descriptor in reference image can have many
matches to descriptors in transformed image
Nearest Neighbor (NN)
Find closest match between descriptors in
reference and transformed image
Descriptor in reference image can have only 1
match to descriptor in transformed image
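
The two strategies can be sketched as follows, assuming descriptors are plain lists of floats compared with Euclidean distance and that the transformed image has at least one descriptor. Function names and the sample data are made up for illustration:

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def threshold_matches(ref, trans, t):
    """Every pair closer than t matches; one reference can match many."""
    return [(i, j) for i, r in enumerate(ref)
            for j, q in enumerate(trans) if euclidean(r, q) < t]


def nearest_neighbor_matches(ref, trans, t):
    """Each reference descriptor matches at most its single closest candidate."""
    matches = []
    for i, r in enumerate(ref):
        j = min(range(len(trans)), key=lambda idx: euclidean(r, trans[idx]))
        if euclidean(r, trans[j]) < t:
            matches.append((i, j))
    return matches


ref = [[0.0, 0.0], [5.0, 5.0]]
trans = [[0.1, 0.0], [0.3, 0.0], [5.0, 5.2]]
by_threshold = threshold_matches(ref, trans, 0.5)
by_nn = nearest_neighbor_matches(ref, trans, 0.5)
```

With this data the first reference descriptor gets two matches under the simple threshold but only one under nearest neighbor.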

19
Performance Evaluation

Criterion basis
Recall = # correct matches / # correspondences
1 - precision = # false matches / (# correct matches + # false matches)
Ideal descriptor -> recall = 1 for any precision, given no overlap error
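
As a worked example of the two measures, with made-up counts (60 correct matches, 20 false matches, 100 ground-truth correspondences) and hypothetical function names:

```python
def recall(correct_matches, correspondences):
    """Fraction of ground-truth correspondences that were correctly matched."""
    return correct_matches / correspondences


def one_minus_precision(correct_matches, false_matches):
    """Fraction of all returned matches that are false."""
    return false_matches / (correct_matches + false_matches)


r = recall(60, 100)
omp = one_minus_precision(60, 20)
```

Sweeping the matching threshold t and plotting recall against 1 - precision gives the curves used to compare descriptors.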

20
SIFT - Scale Invariant Feature Transform

Scale Invariant Feature Transform (SIFT), Lowe [3]
Features
Invariant to image scale, rotation
Invariant for small changes in illumination and
3D camera viewpoint
Extracts large number of highly distinctive
features
Enables detection of small objects
Improved performance in cluttered scenes
Algorithms are efficient: complex operations are applied to local regions or features rather than the whole image
Procedure
Scale-space extrema detection
Keypoint localization
Orientation assignment
Keypoint vector (descriptor)

21
SIFT - Scale Invariant Feature Transform [3]

Scale-Space Blob Detector -
Search for stable features over all scales and image locations
Scale-space kernel -> Gaussian function
Difference of Gaussian

22
SIFT - Scale Invariant Feature Transform [3]

Difference of Gaussian (DoG)
Simple subtraction of adjacent blurred images L
Approximation to scale-normalized Laplacian of Gaussian
Maxima or minima of the scale-normalized Laplacian produce the most stable image features compared to gradient, Hessian, or Harris corner function (Mikolajczyk 2002)
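
The approximation can be checked numerically. The sketch below uses a 1D Gaussian for brevity (an illustrative simplification; SIFT works on 2D images) and the relation G(x, k*sigma) - G(x, sigma) ≈ (k - 1) * sigma^2 * G''(x, sigma) given by Lowe [3]. Function names are hypothetical:

```python
import math


def gaussian(x, sigma):
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def gaussian_second_derivative(x, sigma):
    """Second derivative of the 1D Gaussian with respect to x."""
    return gaussian(x, sigma) * (x * x / sigma ** 4 - 1.0 / sigma ** 2)


def dog(x, sigma, k):
    """Difference of two Gaussians at nearby scales."""
    return gaussian(x, k * sigma) - gaussian(x, sigma)


def scaled_laplacian(x, sigma, k):
    """Scale-normalized second derivative, times the constant factor (k - 1)."""
    return (k - 1.0) * sigma ** 2 * gaussian_second_derivative(x, sigma)


# The two agree closely when k is near 1
ratio = dog(0.0, 1.6, 1.01) / scaled_laplacian(0.0, 1.6, 1.01)
```

The factor (k - 1) is constant over all scales, so it does not change where the extrema are located.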

23
SIFT - Scale Invariant Feature Transform [3]

Scale-Space Image Set -
Divide each octave into s intervals
Compute s + 3 filtered (increasingly blurred) images, k = 2^(1/s)
For s = 3, k = 1.26:
6th -> 3.17σ
5th -> 2.52σ
4th -> 2.00σ
3rd -> 1.59σ
2nd -> 1.26σ
1st -> 1.00σ
Subtract adjacent images to produce DoG images
Repeat for next octave using the image with twice the initial σ (2 images from the top of the stack), decimated by 2
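
The blur levels within one octave follow directly from k = 2^(1/s). A short sketch that lists the multipliers of the base σ for the s + 3 images (the function name is illustrative):

```python
def octave_sigma_multipliers(s):
    """Multipliers of the base sigma for the s + 3 blurred images of one octave."""
    k = 2.0 ** (1.0 / s)
    return [k ** i for i in range(s + 3)]


multipliers = octave_sigma_multipliers(3)
rounded = [round(m, 2) for m in multipliers]
```

The image at index s has exactly twice the base σ, which is why it seeds the next octave after downsampling.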

24
SIFT - Scale Invariant Feature Transform [3]

Scale-Space Pyramid -
(from Lowe)

25
SIFT - Scale Invariant Feature Transform [3]

Locating Scale-Space Extrema -
Detection of local maxima or minima of D(x, y, σ)
Compare each sample point to its 8 neighbors in the same scale image and the 9 neighbors in each of the scale images above and below (26 in total).
Mark the sample if it is greater than or less than all of its neighbors
Extrema are searched in s of the DoG images per octave
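
The neighbor comparison can be sketched as below, assuming the DoG images of one octave are stored as nested lists indexed `dog[scale][row][col]` and that the tested sample is not on a border. The function name is hypothetical:

```python
def is_scale_space_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly above or strictly below all 26 neighbors."""
    value = dog[s][y][x]
    neighbors = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return (all(value > n for n in neighbors)
            or all(value < n for n in neighbors))


# Three 3x3 DoG images with a single peak in the middle
stack = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
stack[1][1][1] = 1.0
peak_found = is_scale_space_extremum(stack, 1, 1, 1)
```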

26
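SIFT - Scale Invariant Feature Transform [3]

Improving Localization -
Reject points that have low contrast using $|D(\hat{\mathbf{x}})| <$ threshold
Where $D(\hat{\mathbf{x}}) = D + \frac{1}{2} \frac{\partial D}{\partial \mathbf{x}}^{T} \hat{\mathbf{x}}$
Taylor expansion $D(\mathbf{x}) = D + \frac{\partial D}{\partial \mathbf{x}}^{T} \mathbf{x} + \frac{1}{2} \mathbf{x}^{T} \frac{\partial^{2} D}{\partial \mathbf{x}^{2}} \mathbf{x}$
Gives offset extremum -> $\hat{\mathbf{x}} = -\left(\frac{\partial^{2} D}{\partial \mathbf{x}^{2}}\right)^{-1} \frac{\partial D}{\partial \mathbf{x}}$
Hessian and derivative of $D(x, y, \sigma)$ use differences of neighboring sample points. $\mathbf{x} = (x, y, \sigma)^{T}$ is the offset from the sample point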
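
A one-dimensional simplification of the offset computation, for illustration only: the full method solves a 3x3 system in (x, y, σ), while this sketch fits a parabola through three neighboring samples using the same finite differences. The function name is hypothetical and a non-zero second difference is assumed:

```python
def refine_extremum_1d(d_prev, d_curr, d_next):
    """Sub-sample offset and interpolated value from three adjacent samples."""
    first = (d_next - d_prev) / 2.0           # central-difference derivative
    second = d_next - 2.0 * d_curr + d_prev   # second difference (1D Hessian)
    offset = -first / second                  # 1D analogue of the offset extremum
    value = d_curr + 0.5 * first * offset     # interpolated D at the extremum
    return offset, value


# Samples of D(x) = 5 - (x - 0.3)^2 at x = -1, 0, 1
offset, value = refine_extremum_1d(3.31, 4.91, 4.51)
```

The interpolated value is what would be compared against the contrast threshold.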

27
SIFT - Scale Invariant Feature Transform [3]

Edge Rejection -
Eliminate poorly defined peaks (edges) using Hessian matrix
Verify ratio of principal curvatures is less than threshold, r < 10
Efficient to compute -> less than 20 floating point operations
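
The curvature-ratio check can be sketched using the trace and determinant of the 2x2 spatial Hessian, following the test Tr(H)^2 / Det(H) < (r + 1)^2 / r from Lowe [3]. The function name and sample values are illustrative:

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a keypoint only if its principal curvature ratio is below r."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        return False   # curvatures of opposite sign: not a well-defined peak
    return trace * trace / det < (r + 1.0) ** 2 / r


blob_like = passes_edge_test(-2.0, -2.0, 0.0)    # equal curvatures
edge_like = passes_edge_test(-20.0, -1.0, 0.0)   # curvature ratio of 20
```

Working with the trace and determinant avoids computing the eigenvalues explicitly, which keeps the test cheap.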

28
SIFT - Scale Invariant Feature Transform [3]

Results from Lowe [3]: 832 keypoints reduced to 536 (233x189 image)

29
SIFT - Scale Invariant Feature Transform

Results from Lowe [3]: performance measures

30
SIFT - Scale Invariant Feature Transform

Results from Lowe [3]: performance measures

31
SIFT - Scale Invariant Feature Transform [3]

Orientation: rotational invariance
Use scale of point to select image L(x, y, σ)
Compute the gradient magnitude m(x, y) and orientation θ(x, y) at each image sample using pixel differences.
Orientation histogram of sample points: entries weighted by gradient magnitude and a Gaussian window around the keypoint, bins cover the 360 degree range
Peaks in histogram correspond to dominant directions of local gradients
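
A simplified sketch of the orientation histogram, assuming a small grayscale patch stored as nested lists. It weights entries by gradient magnitude only and leaves out the Gaussian window; the 36-bin layout follows Lowe [3], and the function name is hypothetical:

```python
import math


def orientation_histogram(image, bins=36):
    """Magnitude-weighted histogram of gradient orientations over interior pixels."""
    hist = [0.0] * bins
    height, width = len(image), len(image[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            dx = image[y][x + 1] - image[y][x - 1]
            dy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(dx, dy)
            theta = math.degrees(math.atan2(dy, dx)) % 360.0
            hist[int(theta / (360.0 / bins)) % bins] += magnitude
    return hist


# Intensity ramp increasing along x: every gradient points at 0 degrees
ramp = [[float(x) for x in range(5)] for _ in range(5)]
hist = orientation_histogram(ramp)
dominant_bin = max(range(len(hist)), key=hist.__getitem__)
```

The keypoint orientation would be read from the highest peak, so that the descriptor can be computed relative to it.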

32
SIFT - Scale Invariant Feature Transform [3]

Descriptor: the feature vector
8x8 sub-region histograms allow shift in gradient positions
128 element feature vector -> 4x4 array of histograms with 8 orientation bins each
(2x2x8 from Lowe is shown below)
Feature vectors matched by nearest neighbor (Euclidean distance)
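
A stripped-down sketch of how a 4x4x8 vector could be assembled, assuming a 16x16 grid of gradient magnitudes and orientations (in degrees) already rotated relative to the keypoint orientation. It omits the Gaussian weighting and interpolation between bins used in the full method; the function name is hypothetical:

```python
import math


def sift_like_descriptor(magnitudes, orientations):
    """Build a unit-length 128-vector from 16x16 gradient samples."""
    descriptor = [0.0] * (4 * 4 * 8)
    for y in range(16):
        for x in range(16):
            cell = (y // 4) * 4 + (x // 4)                  # one of 4x4 cells
            bin_index = int((orientations[y][x] % 360.0) / 45.0) % 8
            descriptor[cell * 8 + bin_index] += magnitudes[y][x]
    norm = math.sqrt(sum(v * v for v in descriptor))
    return [v / norm for v in descriptor] if norm > 0 else descriptor


mags = [[1.0] * 16 for _ in range(16)]
oris = [[90.0] * 16 for _ in range(16)]
desc = sift_like_descriptor(mags, oris)
```

Normalizing to unit length is what makes the vector insensitive to a uniform change in image contrast.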

33
SIFT - Scale Invariant Feature Transform [3]

Results from Lowe [3]
Two training objects recognized in cluttered image
Small squares show point matches
Large rectangles show border of training image after affine transformation

34
Conclusions

Conclusions
Harris-Laplace region detector performs better than Laplacian, DoG and gradient scale-space operators
Scale-space detectors provide invariance to
rotation, scale and small changes to illumination
and viewpoint.
Affine adaptation provides invariance to affine
transformations
GLOH and SIFT descriptors provide the best
performance.
Dense, localized descriptors perform well under
occlusions
Next steps
Coding and testing of region detectors,
descriptors and matching
