
Transcript and Presenter's Notes

Title: Evaluation Techniques in Computer Vision


1
Evaluation Techniques in Computer Vision
  • EE4H, M.Sc 0407191
  • Computer Vision
  • Dr. Mike Spann
  • m.spann@bham.ac.uk
  • http://www.eee.bham.ac.uk/spannm

2
Contents
  • Why evaluate?
  • Images: synthetic or natural?
  • Noise
  • Example 1. Evaluation of thresholding/segmentation
    methods
  • Example 2. Evaluation of optical flow methods

3
Why evaluate?
  • Computer vision algorithms are complex and
    difficult to analyse mathematically
  • Evaluation is usually through measurement of the
    algorithm's performance on test images
  • Use of a range of images to establish performance
    envelope
  • Comparison with existing algorithms
  • Performance on degraded (noise-added) images
    (robustness)
  • Sensitivity to algorithm parameter settings

4
Test images
  • Real images: ground truth difficult to establish
  • Pseudo-real images: could be synthetic objects moving against a
    real background; often a good compromise
  • Synthetic images: noise and illumination variation over object
    surfaces hard to model realistically

5
Simple synthetic images
  • Simple object/background synthetic images are used to evaluate
    thresholding and segmentation algorithms
  • They obey a very simple image model (piecewise constant plus
    Gaussian noise)
  • Unrealistic: in practice, images are not like this!

6
Simple synthetic images
[Figure: synthetic object/background images with zero, low, and medium noise]
7
Pseudo-real images
  • More realistic object/background images are better for evaluating
    segmentation algorithms
  • Images of natural objects under natural illumination
  • Ground truth can be established using hand segmentation tools (such
    as those built into many image processing packages)

8
Pseudo-real images
[Figure: pseudo-real test images of screws, keys, cars, and washers]
9
Simple synthetic edges
  • Again, a piecewise constant plus Gaussian noise image model
  • Ideal step edge: precise edge location, but not achievable by
    finite-aperture imaging systems

10
Simple synthetic edges
[Figure: synthetic step edges with low, medium, and high noise]
11
Pseudo-real edges
  • More realistic edge profiles can be created by smoothing an ideal
    step edge, as in the sketch below



[Figure: ideal step edge convolved with a Gaussian filter]
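A minimal sketch of this construction (function name and parameter values are illustrative assumptions, not from the slides): an ideal step edge is blurred with a 1D Gaussian filter and iid Gaussian noise is added.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def pseudo_real_edge(length=64, low=50.0, high=200.0, sigma=2.0, noise_std=5.0):
    """Ideal step edge -> Gaussian smoothing -> additive iid Gaussian noise."""
    edge = np.full(length, low)
    edge[length // 2:] = high                  # ideal step at the centre
    blurred = gaussian_filter1d(edge, sigma)   # models a finite-aperture system
    return blurred + np.random.normal(0.0, noise_std, size=length)

profile = pseudo_real_edge()
```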
12
Pseudo-real movies
  • The Yosemite sequence is a computer-generated movie, a rendering of
    a fly-through of the Yosemite valley
  • Background clouds are real
  • Enables true flow (ground truth) to be determined
  • Used extensively in the evaluation of optical
    flow algorithms
  • yosemite.avi
  • yosemite_flow.avi

13
Noise
  • Often used to evaluate the robustness of
    algorithms
  • Additive noise is usual in optical images, but multiplicative noise
    (noise level proportional to signal level) is more realistic in
    sonar/radar images
  • Usual noise model is independent random variables
    (usually Gaussian)
  • Correlated noise often more realistic

14
Noise
  • The standard noise model is zero-mean independent, identically
    distributed (iid) Gaussian (normal) random variables
  • Characterised by the variance σ²
  • Probability distribution of the rvs:
    p(n) = (1/√(2πσ²)) exp(−n²/(2σ²))

15
Noise
  • Noise level is characterised by the signal-to-noise ratio, usually
    expressed in dB
  • Defined as SNR = 10 log₁₀(S/σ²)
  • S is the mean-square grey level, defined (for an N×N pixel image)
    as S = (1/N²) Σ_x Σ_y g(x,y)² (see the sketch below)
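A minimal sketch (function name assumed) of degrading an image to a target SNR using these definitions:

```python
import numpy as np

def add_gaussian_noise(image, snr_db):
    """Add zero-mean iid Gaussian noise at a given SNR (in dB)."""
    s = np.mean(image.astype(float) ** 2)      # mean-square grey level S
    sigma2 = s / (10.0 ** (snr_db / 10.0))     # from SNR = 10*log10(S / sigma^2)
    return image + np.random.normal(0.0, np.sqrt(sigma2), image.shape)
```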

16
Noise
[Figure: images at 30 dB and 0 dB SNR]
17
Noise (mean-square error)
  • We can regard the mean-square error (difference) between two images
    as noise
  • Often used to evaluate image compression algorithms, by comparing
    the original and decompressed images
  • Image differences can also be expressed as the peak signal-to-noise
    ratio (PSNR) in dB, by taking the peak signal level as 255

18
Noise (mean-square error)
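A minimal sketch of both measures, using the standard definition PSNR = 10 log₁₀(255²/MSE):

```python
import numpy as np

def mse(a, b):
    """Mean-square error (difference) between two images."""
    return np.mean((a.astype(float) - b.astype(float)) ** 2)

def psnr(original, decompressed):
    """Peak signal-to-noise ratio in dB, taking the signal level as 255."""
    return 10.0 * np.log10(255.0 ** 2 / mse(original, decompressed))
```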
19
Other types of noise
  • The other main category of (additive) noise is impulse (sometimes
    called "salt and pepper") noise
  • Characterised by the impulse rate (spatial density of noise
    impulses) and the mean-square amplitude of the impulses
  • Can normally be filtered out easily using median filters, as in the
    sketch below
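A sketch under assumed parameters (5% impulse rate, pure black/white impulses, a 3x3 median filter, and a flat test image):

```python
import numpy as np
from scipy.ndimage import median_filter

def salt_and_pepper(image, rate=0.05):
    """Corrupt an image with black/white impulses at the given spatial density."""
    noisy = image.copy()
    mask = np.random.rand(*image.shape) < rate          # impulse locations
    noisy[mask] = np.random.choice([0, 255], size=int(mask.sum()))
    return noisy

img = np.full((64, 64), 128, dtype=np.uint8)            # flat test image
despeckled = median_filter(salt_and_pepper(img), size=3)
```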

20
Other types of noise
[Figure: original image, with salt and pepper noise, and de-speckled by a median filter]
21
Other types of noise
  • There are many other types of noise which can be considered in
    algorithm evaluation
  • Essentially, more sophisticated and realistic probability
    distributions of the noise rvs
  • For example, a generalised Gaussian model is often used to model
    heavy-tailed distributions
  • However, in my humble opinion, a more realistic source of noise is
    the deviation of the illumination variation across object surfaces
    away from the ideal

22
Other types of noise
23
Other types of noise
24
Evaluation of thresholding/segmentation methods
  • Segmentation and thresholding algorithms essentially group pixels
    into regions (or classes)
  • The simplest case is object/background
  • Simple evaluation metrics just quantify the number of misclassified
    pixels
  • For basic image models, such as constant grey level in
    object/background regions plus iid Gaussian noise, the probability
    of error can be computed analytically

25
Evaluation of thresholding/segmentation methods
  • For a simple object/background image

26
Evaluation of thresholding/segmentation methods
  • The misclassification probability is a function of the threshold T
  • For a simple constant region grey-level model plus additive iid
    Gaussian noise, we can easily derive an analytical expression for
    it (see the sketch after this list)
  • Not very useful in practice, as the image model is limited and we
    also require the ground truth
  • More useful to simply measure the misclassification error as a
    function of threshold
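For reference, the standard analytical form under this model (assuming background mean μb, object mean μo > μb, common variance σ², prior probabilities Pb and Po, and pixels above T classified as object):

```latex
\[
P_e(T) \;=\; P_o\,\Phi\!\left(\frac{T-\mu_o}{\sigma}\right)
       \;+\; P_b\left[1-\Phi\!\left(\frac{T-\mu_b}{\sigma}\right)\right],
\qquad
\Phi(z)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z} e^{-t^{2}/2}\,dt .
\]
```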

27
Evaluation of thresholding/segmentation methods
  • It is usual to represent correct classification probabilities and
    false alarm probabilities jointly as a receiver operating
    characteristic (ROC) curve
  • For example, the ROC shows how these vary as a function of
    threshold for an object/background classification

28
Evaluation of thresholding/segmentation methods
[Figure: ROC curve, probability of correct classification vs. probability of false alarm, traced as the threshold T varies from 0 to 255]
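A minimal sketch of tracing such an ROC by sweeping the threshold (names assumed; gt_object is a boolean ground truth object mask):

```python
import numpy as np

def roc_points(image, gt_object):
    """(false alarm prob., correct classification prob.) for T = 0..255."""
    points = []
    for t in range(256):
        detected = image > t                          # classified as object
        p_correct = np.mean(detected[gt_object])      # correct classification
        p_false = np.mean(detected[~gt_object])       # false alarm
        points.append((p_false, p_correct))
    return points
```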
29
Evaluation of thresholding/segmentation methods
  • More useful methods of evaluation can be found by taking account of
    the application of the segmentation
  • Segmentation is rarely an end in itself but a component in an
    overall machine vision system
  • Also, the level of under- or over-segmentation of an algorithm
    needs to be determined

30
Evaluation of thresholding/segmentation methods
[Figure: ground truth, under-segmentation, and over-segmentation examples]
31
Evaluation of thresholding/segmentation methods
  • Under-segmentation is bad, as distinct regions are merged
  • Over-segmentation can be acceptable, as sub-regions comprising a
    single ground truth region can be merged using high-level knowledge
  • Also, the level of over-segmentation can be controlled by the
    parameter settings of the algorithm

32
Evaluation of thresholding/segmentation methods
  • A possible segmentation metric is to quantify correctly detected
    regions, over-segmentation and under-segmentation
  • Depends upon some threshold setting T
  • Region- rather than pixel-based
  • Used in Koester and Spann's paper (IEEE Trans. PAMI, 2000) to
    evaluate range image segmentations

33
Evaluation of thresholding/segmentation methods
  • Correct detection: at least T% of the pixels in region k of the
    segmented image are marked as pixels in region j of the ground
    truth image, and vice versa

[Figure: corresponding regions in the segmented and ground truth (GT) images]
34
Evaluation of thresholding/segmentation methods
  • Over-segmentation: region j in the ground truth image corresponds
    to regions k1, k2, ..., km in the segmented image if
  • at least T% of the pixels in each region ki are marked as pixels of
    region j, and
  • at least T% of the pixels in region j are marked as pixels in the
    union of regions k1, k2, ..., km

35
Evaluation of thresholding/segmentation methods
[Figure: over-segmentation example (GT image vs. segmentation)]
36
Evaluation of thresholding/segmentation methods
  • Under-segmentation: regions j1, j2, ..., jm in the ground truth
    image correspond to region k in the segmented image if
  • at least T% of the pixels in region k are marked as pixels in the
    union of regions j1, j2, ..., jm, and
  • at least T% of the pixels in each region ji are marked as pixels in
    region k

37
Evaluation of thresholding/segmentation methods
[Figure: under-segmentation example (GT image vs. segmentation)]
38
Evaluation of thresholding/segmentation methods
  • The metric also allows us to quantify missed and noise regions
  • Missed regions: regions in the ground truth image not found in the
    segmented image
  • Noise regions: regions in the segmented image not found in the
    ground truth image
  • Overall, the average numbers of correct, over-segmented,
    under-segmented, missed and noise regions can be quantified over an
    image database and different algorithms compared (a sketch of the
    correct-detection test follows)
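A sketch of the correct-detection test from this metric, with T expressed as a fraction (0.8 for 80%, an assumed value); the over- and under-segmentation tests follow the same overlap-counting pattern:

```python
import numpy as np

def correctly_detected(seg, gt, k, j, T=0.8):
    """Region k of the segmentation matches region j of the ground truth
    if at least a fraction T of each region's pixels lies in the other."""
    in_k = seg == k                                   # pixels of region k
    in_j = gt == j                                    # pixels of region j
    overlap = np.logical_and(in_k, in_j).sum()        # shared pixels
    return overlap >= T * in_k.sum() and overlap >= T * in_j.sum()
```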

39
Evaluation of optical flow methods
  • Optical flow algorithms compute the 2D optical flow vector at each
    pixel using consecutive frames of a video sequence
  • Optical flow algorithms are notoriously lacking in robustness
  • It is crucial to evaluate the effectiveness of any method used (or
    any new method devised)
  • Ground truth is usually difficult to come by

40
Evaluation of optical flow methods
41
Evaluation of optical flow methods
  • This simple error measurement naturally amplifies errors when the
    flow vectors are large (for the same relative flow error)
  • We can normalize the error by the product of the magnitudes of the
    ground truth flow and the flow estimate, as sketched below
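A sketch of the two measures as described (notation assumed; vg is the ground truth flow and ve the estimate at each pixel):

```latex
\[
E \;=\; \left\lVert \mathbf{v}_g - \mathbf{v}_e \right\rVert ,
\qquad
E_{\mathrm{norm}} \;=\;
\frac{\left\lVert \mathbf{v}_g - \mathbf{v}_e \right\rVert}
     {\left\lVert \mathbf{v}_g \right\rVert \,
      \left\lVert \mathbf{v}_e \right\rVert } ,
\]
% both averaged over all pixels at which the ground truth is defined.
```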

42
Evaluation of optical flow methods
  • Often the ground truth is not available
  • A useful (but often crude) way of comparing the quality of two
    optical flow fields is to compute the displaced frame difference
    (DFD) statistic
  • It uses the two consecutive frames of the sequence from which the
    flows were computed

43
Evaluation of optical flow methods
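A minimal sketch, assuming the usual form DFD(x, y) = g(x+u, y+v, t+1) − g(x, y, t) with nearest-pixel displacement (bilinear interpolation omitted for brevity):

```python
import numpy as np

def mean_square_dfd(frame_t, frame_t1, u, v):
    """Mean-square displaced frame difference for a flow field (u, v)."""
    h, w = frame_t.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xd = np.clip(np.rint(xs + u).astype(int), 0, w - 1)  # displaced columns
    yd = np.clip(np.rint(ys + v).astype(int), 0, h - 1)  # displaced rows
    diff = frame_t1[yd, xd].astype(float) - frame_t.astype(float)
    return np.mean(diff ** 2)
```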
44
Evaluation of optical flow methods
  • The DFD is a crude measure because it says nothing directly about
    the accuracy of the motion field, just about the quality of the
    pixel mapping from one frame to the next
  • It also says nothing about the confidence attached to optical flow
    estimates
  • However, it is the basis of the motion compensation algorithms in
    most current video compression standards (MPEG, H.261, etc.)

45
Evaluation of optical flow methods
  • In optical flow estimation, as in other types of estimation
    algorithms, we are often interested in the quality of the estimates
  • In classical estimation theory, we often compute confidence limits
    on estimates
  • We can say with a certain degree of confidence (say 90%) that the
    parameter lies within certain bounds
  • We usually assume that the quantities we are estimating follow some
    known probability distribution (for example chi-squared)

46
Evaluation of optical flow methods
  • In the case of optical flow vectors, confidence regions are
    ellipses in two dimensions
  • They essentially characterise the distribution of the estimation
    error
  • Assuming a normal distribution of the flow error, confidence
    ellipses can be drawn for any confidence limit
  • The orientation and shape of the ellipses are determined by the
    covariance matrix defining the normal distribution
  • The eigenvalues of the covariance matrix set the axis lengths of
    the ellipse for a particular confidence limit (see the sketch
    below)
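A minimal sketch of this construction, assuming the standard result that for a 2D normal error with covariance C, the ellipse at confidence level p has semi-axes sqrt(chi2.ppf(p, 2) * eigenvalue) along the eigenvectors of C:

```python
import numpy as np
from scipy.stats import chi2

def confidence_ellipse(cov, confidence=0.90):
    """Semi-axis lengths and major-axis orientation of a confidence ellipse."""
    scale = chi2.ppf(confidence, df=2)           # chi-squared quantile, 2 dof
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    semi_axes = np.sqrt(scale * eigvals)         # ellipse semi-axis lengths
    angle = np.arctan2(eigvecs[1, 1], eigvecs[0, 1])  # major-axis orientation
    return semi_axes, angle
```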

47
Evaluation of optical flow methods
[Figure: 70%, 90%, and 99% confidence ellipses of the flow error]
48
Evaluation of optical flow methods
[Figure: Yosemite frame, true flow, estimated flow (LK), and estimated flow (LK) after confidence thresholding]
49
Conclusions
  • Evaluation in computer vision is a difficult and often
    controversial topic
  • I would suggest 3 rules of thumb to consider when evaluating your
    work for the purposes of assignments
  • Consider your test data carefully; make it as realistic as possible
  • Make your evaluations as application-driven as possible
  • Make your algorithms self-evaluating where possible, through the
    use of confidence statistics