1
Robot Vision
  • Chapter 6.

2
Introduction
  • Computer vision
  • Endowing machines with the means to see
  • Create an image of a scene and extract features
  • Very difficult problem for machines
  • Several different scenes can produce identical
    images.
  • Images can be noisy.
  • Cannot directly invert the image to reconstruct
    the scene.

3
Human Vision (1)
4
Human Vision (2)
5
Human Vision (3)
6
Steering an Automobile
  • ALVINN system (Pomerleau 1991, 1993)
  • Uses an artificial neural network
  • Used a 30×32 TV image as input (960 input nodes;
    the network shape is sketched after this list)
  • 5 hidden nodes
  • 30 output nodes
  • Training regime modified on-the-fly
  • A human driver drives the car, and his actual
    steering angles are taken as correct labels for
    the corresponding inputs.
  • Shifted and rotated images were also used for
    training.
  • ALVINN has driven for 120 consecutive kilometers
    at speeds up to 100 km/h.
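
Below is a minimal sketch (in NumPy) of a forward pass through a network of ALVINN's shape: 960 inputs, 5 hidden units, 30 steering outputs. The random weights and the tanh squashing function are assumptions for illustration, not Pomerleau's published code.

    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.normal(scale=0.1, size=(5, 960))   # input-to-hidden weights (untrained)
    W2 = rng.normal(scale=0.1, size=(30, 5))    # hidden-to-output weights (untrained)

    def steering_response(image_30x32):
        """Forward pass: index of the most active of the 30 steering units."""
        x = image_30x32.reshape(960)            # flatten the 30x32 retina
        h = np.tanh(W1 @ x)                     # 5 hidden units (tanh assumed)
        y = np.tanh(W2 @ h)                     # 30 outputs spanning steering angles
        return int(np.argmax(y))

    print(steering_response(rng.random((30, 32))))  # exercise with a random image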

7
Steering an Automobile-ALVINN

8
Two stages of Robot Vision (1)
  • Finding objects in the scene
  • Looking for edges in the image
  • Edge: a part of the image across which the image
    intensity or some other property of the image
    changes abruptly.
  • Attempting to segment the image into regions.
  • Region: a part of the image in which the image
    intensity or some other property of the image
    changes only gradually.

9
Two stages of Robot Vision (2)
  • Image processing stage
  • Transform the original image into one that is
    more amenable to the scene analysis stage.
  • Involves various filtering operations that help
    reduce noise, accentuate edges, and find regions.
  • Scene analysis stage
  • Attempt to create an iconic or a feature-based
    description of the original scene, providing the
    task-specific information.

10
Two stages of Robot Vision (3)
  • Scene analysis stage produces task-specific
    information.
  • If only the disposition of the blocks is
    important, an appropriate iconic model can be
    (C B A FLOOR).
  • If it is important to determine whether there is
    another block on top of the block labeled C, an
    adequate description will include the value of a
    feature, CLEAR_C.

11
Averaging (1)
  • The original image can be represented as an m×n
    array of numbers. The numbers represent the light
    intensities at corresponding points in the image.
  • Certain irregularities in the image can be
    smoothed by an averaging operation.
  • The averaging operation involves sliding an
    averaging window all over the image array.

12
Averaging (2)
  • Smoothing operation thickens broad lines and
    eliminates thin lines and small details.
  • The averaging window is centered at each pixel,
    and the weighted sum of all the pixel numbers
    within the averaging window is computed. This sum
    then replaces the original value at that pixel.
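
A minimal sketch of this operation, assuming a 3×3 window with equal weights (a box filter; a Gaussian window would just change the weights):

    import numpy as np

    def box_average(image, k=3):
        """Replace each pixel by the mean of the k-by-k window centered on it."""
        pad = k // 2
        padded = np.pad(image, pad, mode="edge")     # replicate border pixels
        out = np.empty(image.shape, dtype=float)
        for i in range(image.shape[0]):
            for j in range(image.shape[1]):
                out[i, j] = padded[i:i + k, j:j + k].mean()  # equal-weight sum
        return out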

13
Averaging (3)
  • A common function used for smoothing is a
    two-dimensional Gaussian.
  • Convolving an image with a Gaussian is equivalent
    to finding the solution to a diffusion equation
    when the initial condition is given by the image
    intensity field.
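
For reference, the identity behind this claim (a standard result, not spelled out on the slide): diffusing the image intensity for time t equals convolving the original image I_0 with a Gaussian whose width grows as the square root of 2t:

    \frac{\partial I}{\partial t} = \nabla^2 I, \qquad
    I(\cdot, t) = G_{\sigma} * I_0 \quad \text{with} \quad \sigma = \sqrt{2t}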

14
Averaging (4)
15
Edge enhancement (1)
  • Edge: any boundary between parts of the image
    with markedly different values of some property.
  • Edges are often related to important object
    properties.
  • Edges in the image occur at places where the
    second derivative of the image intensity crosses
    zero.
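
A small 1-D illustration of this (the profile values are made up): the discrete second derivative changes sign where the intensity ramp is steepest.

    import numpy as np

    profile = np.array([0, 0, 0, 1, 4, 9, 10, 10, 10], dtype=float)  # smoothed step
    d2 = np.diff(profile, n=2)                  # discrete second derivative
    # sign changes of d2 mark candidate edge positions
    crossings = np.where(np.sign(d2[:-1]) * np.sign(d2[1:]) < 0)[0]
    print(crossings)                            # -> [3], near the steepest ramp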

16
Edge enhancement (2)
17
Combining Edge Enhancement with Averaging (1)
  • Edge enhancement alone would tend to emphasize
    noise along with enhancing edges.
  • To be less sensitive to noise, both operations
    are needed (first averaging, then edge
    enhancement).
  • We can convolve the one-dimensional image with
    the second derivative of a Gaussian curve to
    combine both operations.

18
Combining Edge Enhancement with Averaging (2)
  • The Laplacian is a second-derivative-type
    operation that enhances edges of any orientation.
  • The Laplacian of the two-dimensional Gaussian
    function looks like an upside-down hat, and is
    often called a sombrero function.
  • The entire averaging/edge-finding operation can
    be achieved by convolving the image with the
    sombrero function (called Laplacian filtering;
    sketched below).
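
A minimal sketch of building the sombrero (Laplacian-of-Gaussian) kernel; the kernel size and sigma are arbitrary choices, and any 2-D convolution routine can apply it:

    import numpy as np

    def log_kernel(size=9, sigma=1.4):
        """Laplacian of a 2-D Gaussian, up to a constant factor."""
        ax = np.arange(size) - size // 2
        xx, yy = np.meshgrid(ax, ax)
        r2 = xx**2 + yy**2
        k = (r2 - 2 * sigma**2) / sigma**4 * np.exp(-r2 / (2 * sigma**2))
        return k - k.mean()      # zero net response on constant regions

    # e.g. scipy.ndimage.convolve(image, log_kernel()) performs the combined
    # averaging/edge-finding in a single convolution.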

19
6.4.4 Finding Regions
  • Another method for processing the image
  • → to find regions
  • Finding regions ↔ finding outlines

20
A region of the image
  • A region is homogeneous.
  • The difference in intensity values of pixels in
    the region is no more than some ε.
  • A polynomial surface of degree k can be fitted to
    the intensity values of pixels in the region with
    largest error less than ε.
  • For no two adjacent regions is it the case that
    the union of all the pixels in these two regions
    satisfies the homogeneity property.
  • Each region corresponds to a world object or a
    meaningful part of one.

21
Split-and-merge method
  • The algorithm begins with just one candidate
    region: the whole image.
  • Until no more splits need to be made, every
    candidate region that does not satisfy the
    homogeneity property is split into four
    equal-sized candidate regions.
  • Adjacent candidate regions are then merged if the
    union of their pixels satisfies the homogeneity
    property.
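
A minimal sketch of the split phase, using the simple homogeneity test from the previous slide (intensity spread within the region at most ε); the merge phase is noted but not implemented:

    import numpy as np

    def homogeneous(region, eps):
        return region.max() - region.min() <= eps

    def split(image, eps, x=0, y=0):
        """Yield (row, col, height, width) rectangles that pass the test."""
        h, w = image.shape
        if homogeneous(image, eps) or min(h, w) < 2:
            yield (x, y, h, w)
            return
        h2, w2 = h // 2, w // 2
        for dr, dc in [(0, 0), (0, w2), (h2, 0), (h2, w2)]:
            sub = image[dr:dr + (h2 if dr == 0 else h - h2),
                        dc:dc + (w2 if dc == 0 else w - w2)]
            yield from split(sub, eps, x + dr, y + dc)

    # The merge phase would then join adjacent rectangles whenever the union
    # of their pixels still satisfies the homogeneity test.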

22
(No Transcript)
23
Regions Found by Split-and-Merge for a Grid-World
Scene (from Fig. 6.12)
24
Cleaning Up the Regions Found by the
Split-and-Merge Method
  • Eliminating very small regions (some of which are
    transitions between larger regions).
  • Straightening bounding lines.
  • Taking into account the known shapes of objects
    likely to be in the scene.

25
6.4.5 Using Image Attributes Other Than Intensity
  • Image attributes other than intensity homogeneity
  • → visual texture
  • Fine-grained variation of the surface
    reflectivity of the objects
  • Ex) a field of grass, a section of carpet,
    foliage in trees, the fur of animals
  • The reflectivity variations in objects cause
    similar fine-grained structure in image intensity.

26
Methods for analyzing texture
  • Structural methods
  • Represent regions in the image by a tessellation
    of primitive texels: small shapes comprising
    black and white parts
  • Statistical methods
  • Based on the idea that image texture is best
    described by a probability distribution for the
    intensity values over regions of the image.
  • Ex) an image of a grassy field in which the
    blades of grass are oriented vertically
  • → a probability distribution that peaks for
    thin, vertically oriented regions of high
    intensity, separated by regions of low intensity
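
As a tiny illustration of the statistical view (a common simplification, not a method from the slides), a region's texture can be summarized by the empirical distribution of its intensity values:

    import numpy as np

    def texture_histogram(region, bins=16):
        """Normalized histogram of intensities in [0, 1]: an empirical
        probability distribution that can be compared across regions."""
        hist, _ = np.histogram(region, bins=bins, range=(0.0, 1.0))
        return hist / max(hist.sum(), 1)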

27
Other attributes
  • If we had a direct way to measure the range from
    the camera to objects in the scene, we could
    produce a range image and look for abrupt range
    differences.
  • Range image: each pixel value represents the
    distance from the corresponding point in the
    scene to the camera.
  • Motion, color

28
6.5 Scene Analysis (1)
  • Scene Analysis
  • Extracting from the image the needed information
    about the scene
  • Requires either additional images (for stereo
    vision) or general information about the kinds of
    scenes, since the scene-to-image transformation
    is many-to-one.
  • The required knowledge may be
  • very general or quite specific
  • explicit or implicit

29
6.5 Scene Analysis (2)
  • Knowledge of surface reflectivity characteristics
    and shading of intensity in the image
  • → gives information about the shape of smooth
    objects in the scene.
  • Iconic scene analysis
  • Build a model of the scene or parts of the scene
  • Feature-based scene analysis
  • Extracts the features of the scene needed by the
    task
  • Task-oriented or purposive vision

30
6.5.1 Interpreting Lines and Curves in the Image
  • Interpreting the line drawing
  • Association between scene properties and the
    components of a line drawing
  • Trihedral vertex polyhedra
  • The scene is assumed to contain only planar
    surfaces, with no more than three surfaces
    intersecting at a point.

31
Three kinds of edges in Trihedral vertex
polyhedra (1/2)
  • There are only three ways in which two planes
    can intersect in a scene edge.
  • Occlude
  • One kind of edge is formed by two planes, with
    one of them occluding the other.
  • Labeled in Fig. 6.15 with arrows (→), with the
    arrowhead pointing along the edge such that the
    surface doing the occluding is to the right of
    the arrow.

32
Three kinds of edges in Trihedral vertex
polyhedra (2/2)
  • Blade
  • Two planes can intersect such that both planes
    are visible in the scene.
  • Two surfaces form a convex edge.
  • Labeled with pluses (+).
  • Fold
  • Edge is concave.
  • Labeled with minuses (−).

33
Labels for Lines at Junctions
34
Line-labeling scene analysis (1/2)
  • Labeling all of the junctions in the image as V,
    W, Y, or T junctions according to the shape of
    the junctions in the image

35
Line-labeling scene analysis (2/2)
  • Assign +, −, or → labels to the lines in the
    image.
  • An image line that connects two junctions must
    have a consistent labeling.
  • If there is no consistent labeling,
  • → there must have been some error in converting
    the image into a line drawing, or
  • → the scene must not have been one of trihedral
    polyhedra.
  • This is a constraint-satisfaction problem
    (sketched below).
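
A schematic sketch of that constraint-satisfaction step (not the full Huffman-Clowes machinery): assign one label per line by backtracking so that the labels on each junction's incident lines form a tuple its catalog allows. The catalog contents below are placeholders; the real trihedral catalog enumerates the legal label tuples for V, W, Y, and T junctions.

    LABELS = ("+", "-", "occlude_l", "occlude_r")  # convex, concave, two arrow senses

    def label_lines(line_ids, junctions, catalog):
        """junctions: junction id -> (type, ordered tuple of incident line ids).
        catalog:   junction type -> set of allowed label tuples, same order."""
        order = list(line_ids)

        def consistent(assign):
            for jtype, incident in junctions.values():
                labs = tuple(assign.get(l) for l in incident)
                if None in labs:
                    continue                 # junction not fully labeled yet
                if labs not in catalog[jtype]:
                    return False             # violates the junction catalog
            return True

        def search(i, assign):
            if i == len(order):
                return dict(assign)          # a consistent labeling of every line
            for lab in LABELS:
                assign[order[i]] = lab
                if consistent(assign):
                    result = search(i + 1, assign)
                    if result is not None:
                        return result
            del assign[order[i]]             # backtrack
            return None

        return search(0, {})                 # None => no consistent labeling exists

    # Toy example: one "V"-type junction with two incident lines.
    toy_catalog = {"V": {("+", "-"), ("occlude_l", "occlude_r")}}
    print(label_lines(["a", "b"], {"j1": ("V", ("a", "b"))}, toy_catalog))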

36
6.5.2 Model-Based Vision (1/2)
  • If we knew that the scene contained a
    parallelepiped (as in Figure 6.15), we could
    attempt to fit a projection of a parallelepiped
    to components of an image of this scene.
  • Generalized cylinders as building blocks for
    model construction
  • Each cylinder has 9 parameters.

37
Model-Based Vision (2/2)
  • An example: a rough scene reconstruction of a
    human figure
  • Hierarchical representation
  • Each cylinder in the model can be articulated
    into a set of smaller cylinders.

38
6.6 Stereo Vision and Depth Information
  • Depth information can be obtained using stereo
    vision, which is based on triangulation
    calculations using two (or more) images.
  • Some depth information can be extracted from a
    single image.
  • The analysis of texture in the image can indicate
    that some elements in the scene are closer than
    others.
  • More precise depth information: if we know that a
    perceived object is on the floor and we know the
    camera's height above the floor, we can calculate
    the distance to the object.
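
A worked version of the floor-plane case (the depression-angle setup is an assumed rendering of the geometry, not the book's exact figure): with the camera at height h and the ray to the object's base depressed by angle θ below the horizontal, the horizontal distance is h / tan θ.

    import math

    def floor_distance(camera_height_m, depression_angle_rad):
        """Distance along the floor to an object resting on it."""
        return camera_height_m / math.tan(depression_angle_rad)

    print(floor_distance(1.5, math.radians(30.0)))   # ~2.6 m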

39
Depth Calculation from a Single Image
40
Stereo Vision
  • Stereo vision uses triangulation.
  • Two lenses whose centers are separated by a
    baseline, b.
  • Each lens creates an image point of the scene
    point, which is at distance d.
  • The angles of these image points from the lens
    centers are α and β.
  • The optical axes are parallel, the image planes
    are coplanar, and the scene point lies in the
    plane formed by the two parallel optical axes.
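
A worked version of this triangulation (the angle convention is an assumption consistent with the description above): the lateral offsets d·tan α and d·tan β, measured from the two parallel optical axes, must add up to the baseline, so d = b / (tan α + tan β).

    import math

    def stereo_depth(baseline_m, alpha_rad, beta_rad):
        """Depth from the baseline and the two ray angles off the optical axes."""
        return baseline_m / (math.tan(alpha_rad) + math.tan(beta_rad))

    print(stereo_depth(0.12, math.radians(4.0), math.radians(3.0)))  # ~0.98 m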

41
Triangulation in Stereo Vision
42
The main complication
  • In scenes containing more than one point, it must
    be established which pair of points in the two
    images corresponds to the same scene point.
  • Given a pixel in one image, we must be able to
    identify the corresponding pixel in the other
    image. → the correspondence problem

43
Techniques for correspondence problem
  • Geometric analysis reveals that we need only
    search along one dimension (an epipolar line).
  • One-dimensional searches can be implemented by
    cross-correlation of two image intensity profiles
    along corresponding epipolar lines (sketched
    after this list).
  • We do not have to find correspondences between
    individual pairs of image points but can do so
    between pairs of larger image components, such as
    lines.
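
A small sketch of that one-dimensional search (the window size and disparity range are arbitrary choices): slide a window from the left image along the corresponding row of the right image and keep the offset with the highest normalized cross-correlation.

    import numpy as np

    def best_disparity(left_row, right_row, x, win=5, max_d=20):
        """Disparity at column x of the left row, by 1-D correlation search."""
        patch = left_row[x:x + win].astype(float)
        patch = (patch - patch.mean()) / (patch.std() + 1e-9)
        best_d, best_score = 0, -np.inf
        for d in range(max_d + 1):               # candidate disparities
            if x - d < 0:
                break
            cand = right_row[x - d:x - d + win].astype(float)
            if len(cand) < win:
                break
            cand = (cand - cand.mean()) / (cand.std() + 1e-9)
            score = float(patch @ cand)          # normalized cross-correlation
            if score > best_score:
                best_d, best_score = d, score
        return best_d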

44
Assignments
  • Pages 111–112
  • Ex.6.2, Ex. 6.4, Ex. 6.5