Algebraic Functions of Views for 3D Object Recognition presentation

1
Algebraic Functions of Views for 3D Object
Recognition

CS773C Advanced Machine Intelligence Applications
Spring 2008 Object Recognition

2
Object Appearance

The appearance of an object can vary widely
due to:
Photometric effects
Scene clutter
Changes in shape (e.g., non-rigid objects)
Viewpoint changes

3
Algebraic Functions of Views (AFoVs)

A powerful mathematical foundation for
investigating variations in the geometrical
appearance of an object due to viewpoint changes.
The variety of 2D views depicting the
geometrical appearance of a 3D object can be
expressed as a combination of a small number of
2D views of the object

S. Ullman and R. Basri, "Recognition by Linear
Combinations of Models", IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol.
13, no. 10, pp. 992-1006, 1991.
4
Orthographic Projection

Case of 3D rigid transformations

(3 ref. views)

5
Orthographic Projection

Case of 3D linear transformations

(2 ref views)
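
The slide's equations did not survive in this transcript. As a minimal sketch of the linear-transformation case, the code below assumes the commonly stated form in which each coordinate of a novel view is a linear combination of (x1, y1, x2, 1), where (x1, y1) comes from the first reference view and x2 from the second. The random points, the helper names, and the use of plain Gaussian elimination are all illustrative assumptions.

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def project(points, A, t):
    """Orthographic view: apply the first two rows of a 3D linear map plus a shift."""
    return [(sum(A[0][k] * X[k] for k in range(3)) + t[0],
             sum(A[1][k] * X[k] for k in range(3)) + t[1]) for X in points]

rng = random.Random(0)

def rand_map():
    return [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]

def rand_shift():
    return [rng.uniform(-1, 1) for _ in range(2)]

points = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]
ref1 = project(points, rand_map(), rand_shift())
ref2 = project(points, rand_map(), rand_shift())
novel = project(points, rand_map(), rand_shift())

# Each row holds (x1, y1, x2, 1) for one point.
rows = [[ref1[i][0], ref1[i][1], ref2[i][0], 1.0] for i in range(len(points))]

# Four correspondences determine the four coefficients of each coordinate.
a = solve(rows[:4], [novel[i][0] for i in range(4)])
b = solve(rows[:4], [novel[i][1] for i in range(4)])

# Predict every point of the novel view, including the four not used for fitting.
pred = [(sum(r * c for r, c in zip(row, a)),
         sum(r * c for r, c in zip(row, b))) for row in rows]
max_err = max(abs(p[0] - q[0]) + abs(p[1] - q[1]) for p, q in zip(pred, novel))
```

With noise-free synthetic data the prediction error is at the level of floating-point round-off. With real features one would use more than four correspondences and a least-squares solve.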
6
More Results

Perspective projection
(2 ref. views, obtained under orthographic
projection)
Objects with smooth surfaces and non-rigid
objects
More reference views are required.

A. Shashua, "Algebraic Functions for
Recognition", IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 17, no.
8, pp. 779-789, 1995.
7
A Word of Caution!

Only common features in the reference views can
be predicted in a novel view.

reference view
reference view
novel view
8
Recognition Framework Using AFoVs

Novel 2D views of a 3D object can be recognized
by matching them to combinations of a small
number of known 2D views of the object.

9
Representation and Matching using AFoVs

Representation
Objects are represented by a small number of
views.
Each view is represented by some geometric
features (e.g., points)
Matching
Predict the geometric appearance of an object in
a novel view by combining a small number of
reference views of the object.

10
Advantages of the Method

No 3D models or camera calibration are required.
Only a small number of 2D views are required.
Novel views can be different from the stored
ones.
Simpler verification scheme.
More general framework (family of methods).
Evidence that the human visual system works
similarly.

11
Main Challenges

Which model views to combine to predict a novel
view?
How to establish the correspondences between
novel and reference views?
How to find the coefficients of the combination?
How to handle occlusions?
How to choose the reference views?

Integrate AFoVs with Indexing!
12
Method Overview
(G. Bebis, M. Georgiopoulos, M. Shah, and N. da
Vitoria Lobo, "Indexing Based on Algebraic
Functions of Views", Computer Vision and Image
Understanding (CVIU), vol. 72, no. 3,
pp. 360-378, 1998)

Preprocessing step
(1) Extract groups of points from each model.
(2) Sample the space of appearances of each
group.
(3) Store information about the groups in an
index table
Recognition step
(1) Extract groups of points from the scene.
(2) Predict their appearance.
(3) Verify the predictions.
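
The two steps above can be sketched with an ordinary hash table. This is only an illustration: the quantization step `BIN`, the key layout, and the model and group identifiers are hypothetical, and the sampled appearances are made-up numbers rather than output of the actual method.

```python
from collections import defaultdict

BIN = 0.1  # assumed quantization step for normalized coordinates

def key_of(xs, step=BIN):
    """Quantize the coordinates of a point group into a hashable index key."""
    return tuple(int(x // step) for x in xs)

index_table = defaultdict(set)

def store(model_id, group_id, sampled_appearances):
    """Preprocessing: file every sampled appearance of a model group under its key."""
    for xs in sampled_appearances:
        index_table[key_of(xs)].add((model_id, group_id))

def lookup(scene_xs):
    """Recognition: return the (model, group) hypotheses stored at the scene group's key."""
    return index_table.get(key_of(scene_xs), set())

store("modelA", 0, [[0.12, 0.48, 0.91], [0.33, 0.52, 0.77]])
store("modelB", 2, [[0.15, 0.41, 0.95]])

# A scene group falling in the same bins retrieves both candidates,
# which would then go to the verification step.
hypotheses = lookup([0.14, 0.43, 0.93])
```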

13
Overview of the Method (contd)
14
Which Model Groups to Choose?

Cluster geometric features into higher level
descriptions.
Consider properties that are unlikely to occur at
random.
Property used in our work: convexity

15
Which Model Groups to Choose? (contd)
16
How to generate the appearances of a group?

Estimate each parameter's range of values
Sample the space of parameter values
Generate a new appearance for each sample of
values
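
A minimal sketch of the sampling and generation steps, assuming each appearance is obtained by applying one sampled parameter vector to rows of the form (x1, y1, x2, 1) built from the reference views. The grid sampling, the group coordinates, and the parameter intervals are hypothetical values chosen for illustration.

```python
from itertools import product

def sample_range(lo, hi, n):
    """Return n evenly spaced samples covering [lo, hi]."""
    if n == 1:
        return [(lo + hi) / 2.0]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def generate_appearances(rows, ranges, n_per_param):
    """Yield (params, xs): one predicted set of x-coordinates per parameter sample.

    rows   : one (x1, y1, x2, 1) tuple per group point, from the reference views
    ranges : one (lo, hi) interval per parameter
    """
    grids = [sample_range(lo, hi, n_per_param) for lo, hi in ranges]
    for params in product(*grids):
        xs = [sum(r * a for r, a in zip(row, params)) for row in rows]
        yield params, xs

# Hypothetical 3-point group and parameter intervals.
rows = [(0.1, 0.2, 0.3, 1.0), (0.5, 0.1, 0.6, 1.0), (0.9, 0.7, 0.4, 1.0)]
ranges = [(-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (0.0, 1.0)]

appearances = list(generate_appearances(rows, ranges, n_per_param=3))
```

The number of generated appearances grows as the number of samples per parameter raised to the number of parameters, which is why the sampling step matters for the size of the index table.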

17
Estimate the Range of Values of the Parameters
[Equations not preserved in the transcript; the
parameters are solved for using SVD.]
18
Estimate the Range of Values of the Parameters
(contd)

Assume normalized coordinates
Use Interval Arithmetic (Moore, 1966)
(note that the solutions
will be identical)
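
The slide's formulas are missing from this transcript, so the following is a sketch under stated assumptions: the parameters are written as a = P^{-1} p, where each row of P is (x1, y1, x2, 1) for one group point and every entry of p (a normalized novel-view coordinate) lies in [0, 1]. Interval arithmetic then bounds each parameter by pairing positive and negative coefficients with the appropriate end of the interval. A square 4-point group and a plain matrix inverse stand in for the SVD-based solution, and the matrix values are hypothetical.

```python
def invert(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]

def parameter_intervals(P_inv, lo=0.0, hi=1.0):
    """Bound each a_i = sum_j P_inv[i][j] * p_j when every p_j lies in [lo, hi]."""
    bounds = []
    for row in P_inv:
        a_min = sum(c * (lo if c >= 0 else hi) for c in row)
        a_max = sum(c * (hi if c >= 0 else lo) for c in row)
        bounds.append((a_min, a_max))
    return bounds

# Hypothetical 4-point group: each row is (x1, y1, x2, 1) in normalized coordinates.
P = [[0.1, 0.2, 0.7, 1.0],
     [0.8, 0.1, 0.3, 1.0],
     [0.4, 0.9, 0.2, 1.0],
     [0.6, 0.5, 0.9, 1.0]]

P_inv = invert(P)
intervals = parameter_intervals(P_inv)
```

Because the bounds depend only on P and on the shared [0, 1] range, the same intervals apply to the parameters of the x- and the y-coordinates.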

19
Example
20
Preconditioning the Reference Views

Transform the original views to new views such
that the matrix P built from the new views has
the best possible condition.

(Figure: effect of the condition number of P on
the intervals)
21
Preconditioning the Reference Views (contd)

[Derivation not preserved in the transcript.]

22
Example (preconditioned views)
23
Decouple Image Coordinates

Same transformation generates the x- and
y-coordinates
Represent only the x-coordinates in the index
table.
For each group, store the following entry

24
Hypothesis Generation and Verification
1. Take the intersection of hypotheses.
2. Apply constraints to reject invalid hypotheses.
25
How to Choose the Scene Groups?

Use convex grouping to extract salient scene
groups.

26
Implementation Issues

Space requirements
select salient groups
reject groups giving rise to ill-conditioned
matrices
coarse sampling of parameters
Index computation and table size

27
Important Implementation Issues (contd)

Sampling step (i.e., parameters of AFoVs)
Noise tolerance

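Make additional entries in a neighborhood
around the indexed location, so that a predicted
appearance that differs slightly from the actual
one is still retrieved.
(Figure labels: actual, predicted)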
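
A small sketch of the neighborhood idea, assuming integer bin keys like those of a quantized index table. The function names, the radius, and the two-dimensional key are hypothetical.

```python
from collections import defaultdict
from itertools import product

def neighbor_keys(key, radius=1):
    """Return every key within `radius` bins of `key` along each dimension (key included)."""
    offsets = range(-radius, radius + 1)
    return [tuple(k + d for k, d in zip(key, delta))
            for delta in product(offsets, repeat=len(key))]

def store_with_tolerance(table, key, entry, radius=1):
    """File `entry` under `key` and under all neighboring bins."""
    for k in neighbor_keys(key, radius):
        table[k].add(entry)

table = defaultdict(set)
store_with_tolerance(table, (4, 7), ("modelA", 0))

# A noisy measurement that lands one bin away still retrieves the entry.
found = table.get((5, 6), set())
```

Each stored appearance now occupies (2 * radius + 1) ** d bins for a d-dimensional key, so noise tolerance is bought with table size.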
28
Experiments and Results
model objects and reference views used in our
experiments
29
Experiments and Results (contd)
novel view
novel view
reference views
reference views
30
Experiments and Results (contd)
novel view
novel view
reference views
31
Experiments and Results (contd)
novel view
novel view
reference views
reference views
32
Criticism of the Method

Relies heavily on feature extraction.
It has high memory requirements.
The index table might represent unrealistic model
appearances.
Indexing based on hashing is not very efficient.
No explicit ranking of hypotheses.

33
Improving AFoVs Recognition Framework

Reject unrealistic appearances
Reduce storage requirements and improve speed
Develop a probabilistic hypothesis generation
scheme
Learn shape appearance
Rank hypotheses
Represent object appearance more efficiently
using improved indexing schemes and probabilistic
models.

W. Li, G. Bebis, and N. Bourbakis, "Integrating
Algebraic Functions of Views with Indexing and
Learning for 3D Object Recognition", IEEE
Workshop on Learning in Computer Vision and
Pattern Recognition (in conjunction with CVPR04),
Washington DC, June 28, 2004.
34
Combine Indexing with Learning

Sample the space of appearances sparsely and
represent the samples in a K-d tree
Sample the space of views densely and represent
the samples using probabilistic models.
Given a novel view
(1) Use K-d tree to retrieve a small number of
candidate models
(2) For each candidate model, compute the
probability that it might have produced the novel
view
(3) Verify most likely hypotheses first

35
Combine Indexing with Learning (contd)

The first stage provides hypothetical matches
fast.
The second stage evaluates the feasibility of
hypothetical matches fast, without having to
apply verification explicitly.
Only highly likely hypotheses are verified
explicitly.

36
Improved Framework
TRAINING PHASE
Reference views
Extract model groups
Estimate the range of AFoVs parameters (using SVD and IA)
Sampling AFoVs parameter space (coarse and dense)
Validate views
Coarse samples: K-d Tree
Dense samples: Random Projection, low-dimensional
representation, manifold learning using EM
RECOGNITION PHASE
New image
Extract image groups
Access K-d Tree
Retrieve hypothetical matches
Rank hypotheses
Estimate AFoVs parameters
Verify hypotheses
Recognition results
37
Eliminate Unrealistic Model Appearances

Under the assumption of linear transformations,
many unrealistic views could be generated.
Impose rigidity constraints to eliminate them.
Storage requirements can be reduced
significantly.
Recognition becomes faster and more efficient.
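
The slide does not spell out the constraints in this transcript. As an illustration only, the check below is stated on the 3D transformation itself rather than on the AFoVs parameters: for a rigid motion (optionally with uniform scale) followed by orthographic projection, the two rows of the 2x3 projection matrix must be orthogonal and of equal length. The function name and tolerance are assumptions.

```python
import math

def is_rigid_view(A, tol=1e-6):
    """Check whether the 2x3 matrix A could be the top two rows of a scaled rotation."""
    r1, r2 = A
    dot = sum(u * v for u, v in zip(r1, r2))
    n1 = math.sqrt(sum(u * u for u in r1))
    n2 = math.sqrt(sum(v * v for v in r2))
    orthogonal = abs(dot) <= tol * max(n1 * n2, 1.0)
    equal_length = abs(n1 - n2) <= tol * max(n1, n2, 1.0)
    return orthogonal and equal_length

theta = math.radians(30)
rotation_rows = [[math.cos(theta), 0.0, math.sin(theta)],  # rotation about the y-axis
                 [0.0, 1.0, 0.0]]
sheared_rows = [[1.0, 0.5, 0.0],                           # a general linear map
                [0.0, 1.0, 0.0]]

checks = [is_rigid_view(rotation_rows), is_rigid_view(sheared_rows)]
```

Sampled views that fail such a test would not be stored, which is where the storage saving comes from.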

38
Eliminate Unrealistic Model Appearances
Unrealistic Views (without constraints)
Realistic Views (with constraints)
39
Indexing Appearances

Sample the space of views coarsely and
represent the samples in an index table.
Hashing might not work very well in this case.
Need an improved indexing scheme.

40
Range Search vs Nearest Neighbor Search

Range search is not appropriate when storing a
sparse set of views.
K-d trees perform a nearest-neighbor search.

Nearest Neighbor Search
Range Search
41
K-d Trees for Indexing

K-d trees perform a nearest-neighbor search.
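
A compact pure-Python K-d tree is sketched below to show how a nearest-neighbor query returns a candidate even when no stored view falls exactly in the query's bin. The stored coordinates and model labels are hypothetical, and a production system would more likely use a library implementation.

```python
def build_kdtree(points, depth=0):
    """Build a K-d tree as nested dicts; each point is (coords, payload)."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def nearest(node, query, best=None):
    """Return the stored (coords, payload) closest to `query`."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"][0], query) < sq_dist(best[0], query):
        best = node["point"]
    axis = node["axis"]
    diff = query[axis] - node["point"][0][axis]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, query, best)
    # Search the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < sq_dist(best[0], query):
        best = nearest(far, query, best)
    return best

stored = [((0.10, 0.40, 0.90), ("modelA", 0)),
          ((0.30, 0.50, 0.80), ("modelA", 1)),
          ((0.70, 0.20, 0.60), ("modelB", 0)),
          ((0.55, 0.85, 0.15), ("modelB", 1))]

tree = build_kdtree(stored)
coords, hypothesis = nearest(tree, (0.68, 0.25, 0.58))
```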

42
Learning Geometric Appearance

We can pre-compute the views that an object can
produce off-line.
These views form a manifold in lower dimensional
space.
Model object appearance using a pdf.
Sample the space of appearances.
Fit a parametric model (e.g., mixtures of
Gaussians using EM).
Use mutual information theory to choose the
number of components.
EM has problems when the dimensionality of the
data is high.
Apply Random Projection first, then run EM
algorithm.
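
The last two points can be sketched as follows. This is an assumption-laden toy: synthetic 10-dimensional samples stand in for sampled appearances, the random projection goes down to a single dimension, and the mixture has a fixed two components (choosing the number of components is not shown).

```python
import math
import random

def random_projection(data, k, rng):
    """Project rows of `data` to k dimensions with a random Gaussian matrix."""
    d = len(data[0])
    R = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(d)] for _ in range(k)]
    return [[sum(r * x for r, x in zip(row, point)) for row in R] for point in data]

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_1d(xs, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with EM."""
    n = len(xs)
    mean_all = sum(xs) / n
    var0 = sum((x - mean_all) ** 2 for x in xs) / n
    weights, means, variances = [0.5, 0.5], [min(xs), max(xs)], [var0, var0]
    history = []  # log-likelihood before each M-step
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp, loglik = [], 0.0
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            loglik += math.log(total)
            resp.append([pi / total for pi in p])
        history.append(loglik)
        # M-step: re-estimate weights, means, and variances.
        for j in range(2):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)  # floor guards against collapse
    return weights, means, variances, history

rng = random.Random(0)
d = 10
cluster_a = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(100)]
cluster_b = [[rng.gauss(3.0, 1.0) for _ in range(d)] for _ in range(100)]

projected = random_projection(cluster_a + cluster_b, k=1, rng=rng)
xs = [p[0] for p in projected]
weights, means, variances, history = em_1d(xs)
```

Only the mixture weights, means, and variances need to be kept per model once fitting is done.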

43
Manifolds of Real Objects: An Example

Only a small number of parameters needs to be
stored for each model.

44
Hypothesis Ranking

Each hypothesis generated by the K-d tree is
ranked by computing its probability using mixture
models.
For each test group, we compute two
probabilities, one from x coordinates, and the
other from y coordinates.
The overall probability for a particular
hypothesis is computed by combining the two
(equation not preserved in the transcript).
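
Since the combining equation is missing from this transcript, the sketch below simply assumes the two probabilities are multiplied, which treats the x and y coordinates as independent. That rule, the function name, and the candidate values are hypothetical and may differ from the presenter's formula.

```python
def rank_hypotheses(hypotheses):
    """Sort hypotheses by combined probability, most likely first.

    Each hypothesis is (label, p_x, p_y). The product p_x * p_y is an assumed
    combination rule, not one taken from the slides.
    """
    scored = [(label, p_x * p_y) for label, p_x, p_y in hypotheses]
    return sorted(scored, key=lambda item: item[1], reverse=True)

candidates = [("modelA/group0", 0.60, 0.50),
              ("modelB/group2", 0.90, 0.10),
              ("modelC/group1", 0.40, 0.80)]

ranking = rank_hypotheses(candidates)
ordered_labels = [label for label, _ in ranking]
```

Verification would then proceed down `ordered_labels`, so the most likely hypotheses are checked first.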
45
Reference Views
1st Reference view
2nd Reference view
46
Reference Views (contd)
1st Reference view
2nd Reference view
47
Test Views
(a) (b) (c) (d) (e) (f)
48
Test Views (contd)
Hypothesis rejected
Hypothesis rejected
49
Integrate Geometric Appearance with Intensity
Appearance

Geometric information alone does not
discriminate between objects that have similar
geometric appearance but possibly different
intensity appearance.
Integrate geometric and intensity appearance
during hypothesis verification to improve
discrimination power and robustness.

W. Li, G. Bebis, and N. Bourbakis, "3D Object
Recognition Using 2D Views", IEEE Transactions
on Image Processing (under revision).
50
Dense Correspondences

For each group of corresponding points, apply
triangulation recursively to get denser
correspondences.
Divide triangles into four sub-triangles by
considering the middle point of each side of each
triangle.
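
The midpoint subdivision described above can be sketched as a short recursion. The starting triangle and the depth are arbitrary illustration values. Applying the same subdivision to corresponding triangles in each view yields corresponding points, to the extent that the surface is locally planar.

```python
def midpoint(p, q):
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

def subdivide(triangle, depth):
    """Recursively split a triangle into four by joining the midpoints of its sides."""
    if depth == 0:
        return [triangle]
    a, b, c = triangle
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    children = [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    result = []
    for child in children:
        result.extend(subdivide(child, depth - 1))
    return result

def area(triangle):
    (x1, y1), (x2, y2), (x3, y3) = triangle
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2.0

triangles = subdivide(((0.0, 0.0), (4.0, 0.0), (0.0, 4.0)), depth=2)

# The distinct vertices form the denser set of points.
dense_points = sorted({vertex for tri in triangles for vertex in tri})
```

Each level multiplies the number of triangles by four while the total covered area stays the same.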

51
Refine AFoVs parameters
(before refinement)
(after refinement)
52
Predict Intensity Appearance - Example
Reference view 1
Reference view 2
Test view
Prediction
53
Predict Intensity Appearance - Example
Reference view 2
Reference view 1
Test view
Prediction
54
Predict Intensity Appearance - Example
(hypothesis accepted)
(hypothesis rejected)
