Robot Vision - PowerPoint PPT Presentation

About This Presentation

Title:

Robot Vision

Slides: 45

Provided by: zha94

Transcript and Presenter's Notes

1
Robot Vision

Chapter 6.

2
Introduction

Computer vision
Endowing machines with the means to see
Create an image of a scene and extract features
Very difficult problem for machines
Several different scenes can produce identical
images.
Images can be noisy.
Cannot directly invert the image to reconstruct
the scene.

3
Human Vision (1)
4
Human Vision (2)
5
Human Vision (3)
6
Steering an Automobile

ALVINN system [Pomerleau 1991, 1993]
Uses an artificial neural network
Takes a 30 × 32 TV image as input (960 input nodes)
5 hidden nodes
30 output nodes
Training regime modified on-the-fly
A human driver drives the car, and his actual
steering angles are taken as correct labels for
the corresponding inputs.
Shifted and rotated images were also used for
training.
ALVINN has driven for 120 consecutive kilometers
at speeds up to 100 km/h.
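
A minimal sketch of a feed-forward pass through a network with the layer sizes quoted above (30 × 32 = 960 inputs, 5 hidden units, 30 outputs). The sigmoid activation, the random weights, the bias handling, and reading the steering direction as the most active output unit are assumptions made for illustration; the slide does not specify them, and this is not ALVINN's trained network.

```python
import math
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 30 * 32, 5, 30  # layer sizes quoted on the slide


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_in, n_out, rng):
    # One weight row per unit; the last weight in each row acts as a bias.
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights, inputs):
    extended = inputs + [1.0]  # constant input that multiplies the bias weight
    return [sigmoid(sum(w * x for w, x in zip(row, extended))) for row in weights]


def steering_unit(image, w_hidden, w_output):
    # Assumption: the index of the most active output unit encodes the steering direction.
    hidden = layer_forward(w_hidden, image)
    output = layer_forward(w_output, hidden)
    return max(range(len(output)), key=output.__getitem__)


rng = random.Random(0)
w_hidden = make_layer(N_INPUT, N_HIDDEN, rng)
w_output = make_layer(N_HIDDEN, N_OUTPUT, rng)
image = [rng.random() for _ in range(N_INPUT)]  # stand-in for a flattened 30 x 32 image
unit = steering_unit(image, w_hidden, w_output)
```

In training, the human driver's steering angle would supply the target output for each input image, as the slide describes.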

7
Steering an Automobile-ALVINN

8
Two stages of Robot Vision (1)

Finding out objects in the scene
Looking for edges in the image
Edge: a part of the image across which the image
intensity or some other property of the image
changes abruptly.
Attempting to segment the image into regions.
Region: a part of the image in which the image
intensity or some other property of the image
changes only gradually.

9
Two stages of Robot Vision (2)

Image processing stage
Transform the original image into one that is
more amenable to the scene analysis stage.
Involves various filtering operations that help
reduce noise, accentuate edges, and find regions.
Scene analysis stage
Attempt to create an iconic or a feature-based
description of the original scene, providing the
task-specific information.

10
Two stages of Robot Vision (3)

Scene analysis stage produces task-specific
information.
If only the disposition of the blocks is
important, an appropriate iconic model can be (C B A
FLOOR).
If it is important to determine whether there is
another block on top of the block labeled C, an
adequate description will include the value of a
feature, CLEAR_C.

11
Averaging (1)

The original image can be represented as an m × n array
of numbers. The numbers represent the light
intensities at corresponding points in the image.
Certain irregularities in the image can be
smoothed by an averaging operation.
The averaging operation involves sliding an averaging
window all over the image array.

12
Averaging (2)

Smoothing operation thickens broad lines and
eliminates thin lines and small details.
The averaging window is centered at each pixel,
and the weighted sum of all the pixel numbers
within the averaging window is computed. This sum
then replaces the original value at that pixel.
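
The sliding-window step above can be sketched as follows. The uniform 3 × 3 window and the border handling (repeating edge pixels) are assumptions chosen for illustration; the slide fixes neither.

```python
def average_filter(image, weights):
    """Slide a weighted averaging window over a 2-D list of intensities."""
    rows, cols = len(image), len(image[0])
    half = len(weights) // 2  # the window is assumed square with an odd side length
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    # Assumption: clamp coordinates so border pixels are repeated.
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    total += weights[dr + half][dc + half] * image[rr][cc]
            result[r][c] = total  # the weighted sum replaces the original value
    return result


box = [[1 / 9] * 3 for _ in range(3)]  # uniform 3 x 3 window whose weights sum to 1
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
smoothed = average_filter(image, box)
```

The single bright pixel is spread over its 3 × 3 neighborhood at one ninth of its original value, which is how small details get washed out.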

13
Averaging (3)

Common function used for smoothing is a Gaussian
of two dimensions.
Convolving an image with a Gaussian is equivalent
to finding the solution to a diffusion equation
when the initial condition is given by the image
intensity field.
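
The two statements above can be written out as follows. The notation is assumed here and is not on the slide: σ is the width of the Gaussian, I is the image intensity field, u is the diffusing field, and the diffusion coefficient is taken to be 1.

```latex
G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} \exp\!\left(-\frac{x^2 + y^2}{2\sigma^2}\right)

\frac{\partial u}{\partial t}
  = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2},
\qquad u(x, y, 0) = I(x, y)
\quad\Longrightarrow\quad
u(x, y, t) = (G_\sigma * I)(x, y) \ \text{with}\ \sigma = \sqrt{2t}
```

Under these assumptions, letting the image diffuse for a longer time t corresponds to smoothing with a wider Gaussian.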

14
Averaging (4)
15
Edge enhancement (1)

Edge: any boundary between parts of the image
with markedly different values of some property.
Edges are often related to important object
properties.
Edges in the image occur at places where the
second derivative of the image intensity passes through zero.
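
A one-dimensional sketch of this idea, assuming a discrete second difference as the stand-in for the second derivative and a hand-made intensity profile (both are illustrative choices, not from the slide):

```python
def second_difference(signal):
    # Discrete second derivative s[i-1] - 2*s[i] + s[i+1] at interior samples.
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]


def zero_crossings(values):
    # Indices i where the sign flips between values[i] and values[i + 1].
    return [i for i in range(len(values) - 1) if values[i] * values[i + 1] < 0]


profile = [0, 0, 1, 4, 8, 9, 9]  # a smoothed step edge
d2 = second_difference(profile)  # [1, 2, 1, -3, -1]
crossings = zero_crossings(d2)   # [2]
```

Entry i of d2 belongs to profile sample i + 1, so the sign change at index 2 places the edge between samples 3 and 4, where the intensity rises fastest (from 4 to 8).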

16
Edge enhancement (2)
17
Combining Edge Enhancement with Averaging (1)

Edge enhancement alone would tend to emphasize
noise elements along with enhancing edges.
To be less sensitive to noise, both operations
are needed. (First averaging and then edge
enhancing)
We can convolve the one-dimensional image with
the second derivative of a Gaussian curve to
combine both operations.

18
Combining Edge Enhancement with Averaging (2)

The Laplacian is a second-derivative-type operation that
enhances edges of any orientation.
The Laplacian of the two-dimensional Gaussian
function looks like an upside-down hat, often
called a sombrero function.
The entire averaging/edge-finding operation can be
achieved by convolving the image with the
sombrero function (called Laplacian filtering).
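
A sketch of how a sampled sombrero kernel could be built, using the closed form of the Laplacian of a two-dimensional Gaussian. The parameter names sigma and half_width and the 9 × 9 kernel size are assumptions for illustration.

```python
import math


def log_kernel(sigma, half_width):
    """Sample the Laplacian of a 2-D Gaussian on a square integer grid."""
    kernel = []
    for y in range(-half_width, half_width + 1):
        row = []
        for x in range(-half_width, half_width + 1):
            r2 = x * x + y * y
            gauss = math.exp(-r2 / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
            # Laplacian of the Gaussian: (r^2 - 2*sigma^2) / sigma^4 * G(x, y)
            row.append((r2 - 2 * sigma ** 2) / sigma ** 4 * gauss)
        kernel.append(row)
    return kernel


kernel = log_kernel(sigma=1.0, half_width=4)
center = kernel[4][4]  # negative: the dip of the upside-down hat
brim = kernel[4][6]    # positive: the raised ring around the dip
```

The kernel entries sum to nearly zero, so convolving a region of constant intensity with it gives a response close to zero.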

19
6.4.4 Finding Region

Another method for processing the image
→ to find regions
Finding regions ⇔ finding outlines

20
A region of the image

A region is homogeneous.
The difference in intensity values of pixels in
the region is no more than some ε.
A polynomial surface of degree k can be fitted to
the intensity values of pixels in the region with
largest error less than ε.
For no two adjacent regions is it the case that
the union of all the pixels in these two regions
satisfies the homogeneity property.
Each region corresponds to a world object or a
meaningful part of one.

21
Split-and-merge method

The algorithm begins with just one candidate
region, the whole image.
Repeat until no more splits need to be made:
All candidate regions that do not satisfy the
homogeneity property are each split into four
equal-sized candidate regions.
Adjacent candidate regions are merged if their
pixels together satisfy the homogeneity property.
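
The steps above can be sketched as follows. This is one reading of the method, under stated assumptions: the image is square with a power-of-two side, homogeneity means the intensity spread is at most a threshold (called eps here), all splitting is done before merging, and merging is a simple greedy pass.

```python
def homogeneous(pixels, eps):
    # Assumed homogeneity test: intensity spread of at most eps.
    return max(pixels) - min(pixels) <= eps


def split(image, r, c, size, eps, out):
    block = [image[i][j] for i in range(r, r + size) for j in range(c, c + size)]
    if size == 1 or homogeneous(block, eps):
        out.append((r, c, size))
        return
    half = size // 2
    for dr in (0, half):  # split into four equal-sized candidate regions
        for dc in (0, half):
            split(image, r + dr, c + dc, half, eps, out)


def adjacent(a, b):
    # Two square blocks are adjacent when they share part of a side.
    (r1, c1, s1), (r2, c2, s2) = a, b
    rows_overlap = r1 < r2 + s2 and r2 < r1 + s1
    cols_overlap = c1 < c2 + s2 and c2 < c1 + s1
    touch_side = c1 + s1 == c2 or c2 + s2 == c1
    touch_top = r1 + s1 == r2 or r2 + s2 == r1
    return (rows_overlap and touch_side) or (cols_overlap and touch_top)


def split_and_merge(image, eps):
    blocks = []
    split(image, 0, 0, len(image), eps, blocks)
    regions = [[b] for b in blocks]  # each region is a list of square blocks
    merged = True
    while merged:
        merged = False
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                if not any(adjacent(a, b) for a in regions[i] for b in regions[j]):
                    continue
                pixels = [image[r + y][c + x]
                          for (r, c, s) in regions[i] + regions[j]
                          for y in range(s) for x in range(s)]
                if homogeneous(pixels, eps):  # the union must stay homogeneous
                    regions[i].extend(regions.pop(j))
                    merged = True
                    break
            if merged:
                break
    return regions


image = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 9, 9],
    [1, 1, 9, 9],
]
regions = split_and_merge(image, eps=1)
```

On this toy image the whole-image candidate is split once into four quadrants; the three dark quadrants merge into one L-shaped region and the bright quadrant stays separate.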

23
Regions Found by Split-and-Merge for a Grid-World
Scene (from Fig. 6.12)
24
Cleaning up the regions found by the split-and-merge
method

Eliminating very small regions (some of which are
transitions between larger regions).
Straightening bounding lines.
Taking into account the known shapes of objects
likely to be in the scene.

25
6.4.5 Using Image Attributes Other Than Intensity

Image attributes other than intensity homogeneity
→ Visual texture
Fine-grained variation of the surface
reflectivity of the objects
Ex) a field of grass, a section of carpet,
foliage in trees, the fur of animals
The reflectivity variations in objects cause
similar fine-grained structure in image intensity.

26
Methods for analyzing texture

Structural methods
Represent regions in the image by a tessellation
of primitive texels: small shapes
comprising black and white parts
Statistical methods
Based on the idea that image texture is best
described by a probability distribution for the
intensity values over regions of the image.
Ex) an image of a grassy field in which the
blades of grass are oriented vertically
→ a probability distribution that peaks for
thin, vertically oriented regions of high
intensity, separated by regions of low intensity

27
Other attributes

If we had a direct way to measure the range from
the camera to objects in the scene, we could
produce a range image and look for abrupt range
differences.
Range image: each pixel value represents the
distance from the corresponding point in the
scene to the camera.
Motion, color

28
6.5 Scene Analysis (1)

Scene Analysis
Extracting from the image the needed information
about the scene
Requires either additional images (for stereo
vision) or general information about the kinds of
scenes, since the scene-to-image transformation
is many-to-one.
The required knowledge
very general or quite specific
explicit or implicit

29
6.5 Scene Analysis (2)

Knowledge of surface reflectivity characteristics
and shading of intensity in the image
→ give information about the shape of smooth
objects in the scene.
Iconic scene analysis
Build a model of the scene or parts of the scene
Feature-based scene analysis
Extracts features of the scene needed by task
Task-oriented or purposive vision

30
6.5.1 Interpreting Lines and Curves in the Image

Interpreting the line drawing
Association between scene properties and the
components of a line drawing

Trihedral vertex polyhedra
The scene is assumed to contain only planar surfaces such
that no more than three surfaces intersect in a
point.

31
Three kinds of edges in Trihedral vertex
polyhedra (1/2)

There are only three kinds of ways in which two
planes can intersect in a scene edge.
Occlude
One kind of edge is formed by two planes, with
one of them occluding the other.
Labeled in Fig. 6.15 with arrows (→).
The arrowhead points along the edge such that
the surface doing the occluding is to the right of
the arrow.

32
Three kinds of edges in Trihedral vertex
polyhedra (2/2)

Blade
Two planes can intersect such that both planes
are visible in the scene.
Two surfaces form a convex edge.
Labeled with pluses (+).
Fold
Edge is concave.
Labeled with minuses (−).

33
Labels for Lines at Junctions
34
Line-labeling scene analysis (1/2)

Labeling all of the junctions in the image as V,
W, Y, or T junctions according to the shape of
the junctions in the image

35
Line-labeling scene analysis (2/2)

Assign +, −, or → labels to the lines in the
image.
An image line that connects two junctions must
have a consistent labeling.
If there is no consistent labeling, then either
→ there must have been some error in converting
the image into a line drawing, or
→ the scene must not have been one of trihedral
polyhedra.
Constraint satisfaction problem
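
A sketch of line labeling as constraint satisfaction, solved by plain backtracking. Each line is a variable, and each junction lists the label combinations it allows for its incident lines. The junction table below is a made-up toy, not the actual catalog of trihedral junction labelings, and ">" and "<" are assumed stand-ins for the two directions of an occluding arrow.

```python
def label_lines(lines, junctions, labels=("+", "-", ">", "<")):
    """Assign one label per line so that every junction's label tuple is allowed."""
    assignment = {}

    def consistent():
        for incident, allowed in junctions:
            # Only check junctions whose incident lines are all labeled so far.
            if all(line in assignment for line in incident):
                if tuple(assignment[line] for line in incident) not in allowed:
                    return False
        return True

    def search(index):
        if index == len(lines):
            return True
        for label in labels:
            assignment[lines[index]] = label
            if consistent() and search(index + 1):
                return True
        assignment.pop(lines[index], None)  # undo before backtracking
        return False

    return dict(assignment) if search(0) else None


lines = ["a", "b", "c"]
# Hypothetical junction constraints: (incident lines, allowed label tuples).
junctions = [
    (("a", "b"), {("+", "+"), ("-", ">")}),
    (("b", "c"), {("+", "-"), (">", ">")}),
    (("a", "c"), {("-", ">")}),
]
labeling = label_lines(lines, junctions)

# Adding a constraint that contradicts the others leaves no consistent labeling.
impossible = label_lines(lines, junctions + [(("a",), {("+",)})])
```

Because each line carries a single label shared by both of its junctions, the requirement that a line be labeled consistently at its two ends is enforced by construction. A result of None corresponds to the "no consistent labeling" case on the slide.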

36
6.5.2 Model-Based Vision (1/2)

If we knew that the scene contained a
parallelepiped (in Figure 6.15), we could attempt
to fit a projection of a parallelepiped to
components of an image of this scene.

Generalized cylinders can serve as building blocks for
model construction.
Each cylinder has 9 parameters.

37
Model-Based Vision (2/2)

An example rough scene reconstruction of a human
figure
Hierarchical representation
Each cylinder in the model can be articulated
into a set of smaller cylinders

38
6.6 Stereo Vision and Depth Information

Depth information can be obtained using stereo
vision, which is based on triangulation calculations
using two (or more) images.
Some depth information can be extracted from a
single image.
The analysis of texture in the image can indicate
that some elements in the scene are closer than
are others.
More precise depth information: if we know that a
perceived object is on the floor and we know the camera
height above the floor, we can calculate the
distance to the object.
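
A sketch of the floor-distance calculation under a simple assumed geometry: the floor is flat, the camera sits at a known height, and the object's contact point with the floor is seen at a known angle below the horizontal. The function and parameter names are hypothetical.

```python
import math


def floor_distance(camera_height, depression_angle_deg):
    """Horizontal distance to a floor point seen at an angle below the horizontal."""
    if not 0 < depression_angle_deg < 90:
        raise ValueError("angle must be strictly between 0 and 90 degrees")
    # Right triangle: tan(angle) = camera_height / distance
    return camera_height / math.tan(math.radians(depression_angle_deg))


distance = floor_distance(camera_height=1.0, depression_angle_deg=45.0)
```

With the camera 1 unit above the floor and the point seen 45 degrees below the horizontal, the distance comes out equal to the camera height, up to rounding.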

39
Depth Calculation from a Single Image
40
Stereo Vision

Stereo vision uses triangulation.
Two lenses whose centers are separated by a
baseline, b.
The image point of a scene point, at distance d,
created by these lenses.
The angles of these image points from the lens
centers, α and β.
The optical axes are parallel, the image planes
are coplanar, and the scene point is in the same
plane as that formed by two parallel optical axes.
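
A sketch of the triangulation under the geometry described above. The names alpha and beta for the two angles are chosen here, and the angle convention is an assumption: each angle is measured at a lens center between that lens's optical axis and the ray to the scene point, counted positive toward the other lens. With that convention, tan(alpha) + tan(beta) = b / d.

```python
import math


def stereo_depth(baseline, alpha_deg, beta_deg):
    """Distance d of a scene point from the baseline joining the two lens centers."""
    denom = math.tan(math.radians(alpha_deg)) + math.tan(math.radians(beta_deg))
    if denom <= 0:
        raise ValueError("rays do not converge in front of the lenses")
    return baseline / denom  # d = b / (tan(alpha) + tan(beta))


# A point 1.0 in front of the baseline, 0.05 to the right of the left lens,
# seen by two lenses 0.2 apart (so 0.15 to the left of the right lens).
alpha = math.degrees(math.atan(0.05 / 1.0))
beta = math.degrees(math.atan(0.15 / 1.0))
depth = stereo_depth(baseline=0.2, alpha_deg=alpha, beta_deg=beta)
```

The recovered depth matches the distance used to construct the example, up to rounding.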

41
Triangulation in Stereo Vision
42
The main complication

In scenes containing more than one point, it must
be established which pair of points in the two
images correspond to the same scene point.
We must be able to identify a corresponding pixel
in the other image. → the correspondence problem

43
Techniques for correspondence problem

Geometric analysis reveals that we need only
search along one dimension (epipolar line).
One-dimensional searches can be implemented by
cross-correlation of two image intensity profiles
along corresponding epipolar lines.
We do not have to find correspondences between
individual pairs of image points but can do so
between pairs of larger image components, such as
lines.
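
A sketch of the one-dimensional search. It assumes the images are rectified so that corresponding epipolar lines are the same image row, and it uses normalized cross-correlation over a small window as the matching score; the window size and the toy intensity rows are made up for illustration.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-length intensity windows."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    norm = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    # A flat window has no structure to correlate, so score it as 0.
    return sum(x * y for x, y in zip(da, db)) / norm if norm else 0.0


def best_match(left_row, right_row, center, half):
    """Column in right_row whose window best matches the left window at `center`."""
    template = left_row[center - half:center + half + 1]
    scores = {}
    for c in range(half, len(right_row) - half):
        scores[c] = ncc(template, right_row[c - half:c + half + 1])
    return max(scores, key=scores.get)


# The same intensity bump appears two columns further left in the right image.
left_row = [1, 1, 1, 2, 7, 9, 4, 1, 1, 1, 1, 1]
right_row = [1, 2, 7, 9, 4, 1, 1, 1, 1, 1, 1, 1]
match = best_match(left_row, right_row, center=5, half=2)
disparity = 5 - match
```

The search runs along a single row only, which is the saving the epipolar constraint provides; the resulting disparity is what a triangulation step would then turn into depth.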

44
Assignments