Evaluation Techniques in Computer Vision - PowerPoint PPT Presentation


1
Evaluation Techniques in Computer Vision

EE4H, M.Sc 0407191
Computer Vision
Dr. Mike Spann
m.spann_at_bham.ac.uk
http://www.eee.bham.ac.uk/spannm

2
Contents

Why evaluate?
Images: synthetic/natural?
Noise
Example 1. Evaluation of thresholding/segmentation
methods
Example 2. Evaluation of optical flow methods

3
Why evaluate?

Computer vision algorithms are complex and
difficult to analyse mathematically
Evaluation is usually through measurement of the
algorithm's performance on test images
Use of a range of images to establish performance
envelope
Comparison with existing algorithms
Performance on degraded (noise-added) images
(robustness)
Sensitivity to algorithm parameter settings

4
Test images

Real images
Ground truth difficult to establish
Pseudo-real images
Could be synthetic objects moving against real
background
Often a good compromise
Synthetic images
Noise and illumination variation over object
surfaces hard to model realistically

5
Simple synthetic images

Simple object-background synthetic images used
to evaluate thresholding and segmentation
algorithms
They obey a very simple image model (piecewise
constant plus Gaussian noise)
Unrealistic: in practice, images are not like
this!

6
Simple synthetic images
[Figure panels: zero noise, low noise, medium noise]
7
Pseudo-real images

More realistic object-background images are
better used to evaluate segmentation algorithms
Images of natural objects in natural illumination
Ground truth can be established using hand
segmentation tools (such as built into many image
processing packages)

8
Pseudo-real images
[Figure panels: screws, keys, cars, washers]
9
Simple synthetic edges

Again, piecewise constant plus Gaussian noise image
model
Ideal step edge
Precise edge location, but not achievable by
finite-aperture imaging systems

10
Simple synthetic edges
[Figure panels: low noise, medium noise, high noise]
11
Pseudo-real edges

More realistic edge profiles can be created by
smoothing an ideal step edge
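
A minimal Python sketch of this idea, assuming a 1-D edge profile and a Gaussian smoothing kernel. The grey levels (50 and 200), the kernel width and the function names are illustrative choices, not values from the slides.

```python
import math

def gaussian_kernel(sigma, radius=None):
    """Return a normalised 1-D Gaussian kernel (weights sum to 1)."""
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, kernel):
    """Convolve a 1-D signal with a kernel, replicating the end samples."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out

# Ideal step edge between two (assumed) grey levels, then its blurred profile.
step = [50.0] * 16 + [200.0] * 16
blurred = smooth(step, gaussian_kernel(sigma=2.0))
```

The blurred profile rises gradually from the low to the high grey level instead of jumping in a single sample; a larger sigma gives a wider transition.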

[Figure: step edge passed through a Gaussian filter]
12
Pseudo-real movies

The Yosemite sequence is a computer-generated
movie rendering a fly-through of the
Yosemite valley
Background clouds are real
Enables true flow (ground truth) to be determined
Used extensively in the evaluation of optical
flow algorithms
yosemite.avi
yosemite_flow.avi

13
Noise

Often used to evaluate the robustness of
algorithms
Additive noise is usual in optical images, but
multiplicative noise is more realistic in sonar/radar
images
Noise level is then proportional to signal level
Usual noise model is independent random variables
(usually Gaussian)
Correlated noise often more realistic

14
Noise

Standard noise model is zero-mean independent,
identically distributed (iid) Gaussian (normal)
random variables
Characterised by variance
Probability distribution of rvs
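
For reference, the standard density of a zero-mean Gaussian random variable with variance sigma squared can be written as follows. The symbols n (noise value) and sigma are notation assumed here, not taken from the slides.

```latex
p(n) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{n^{2}}{2\sigma^{2}}\right)
```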

15
Noise

Noise level characterised by the signal-to-noise
ratio
Usually expressed in dBs
Defined as the ratio of signal power to noise variance, where the signal power
is the mean-square grey level of the image (the average of the squared
pixel values)
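
A minimal Python sketch of how a test image might be degraded to a chosen SNR. It assumes the common definition SNR = 10 log10(S / sigma^2) dB, with S the mean-square grey level and sigma^2 the noise variance. The function names and the list-of-rows image representation are illustrative assumptions.

```python
import math
import random

def mean_square(image):
    """Mean-square grey level of an image given as a list of rows."""
    values = [p for row in image for p in row]
    return sum(p * p for p in values) / len(values)

def noise_sigma_for_snr(image, target_db):
    """Noise standard deviation that gives the requested SNR in dB."""
    return math.sqrt(mean_square(image) / (10.0 ** (target_db / 10.0)))

def snr_in_db(image, sigma):
    """SNR in dB for a given image and noise standard deviation."""
    return 10.0 * math.log10(mean_square(image) / (sigma * sigma))

def add_gaussian_noise(image, target_db, rng=None):
    """Return a copy of the image with zero-mean iid Gaussian noise added."""
    rng = rng or random.Random(0)  # fixed seed so experiments are repeatable
    sigma = noise_sigma_for_snr(image, target_db)
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in image]
```

For a constant image of grey level 100, a 20dB SNR corresponds to a noise standard deviation of 10, and 0dB to a standard deviation of 100.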

16
Noise
[Figure panels: example images at different SNRs, including 30dB and 0dB]
17
Noise (mean-square error)

We can regard the mean-square error (difference)
between two images as noise
Often used to evaluate image compression
algorithms by comparing the original and
decompressed images
Image differences can also be expressed as the
peak signal-to-noise ratio (PSNR) in dB by taking
the signal level as 255
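
A short Python sketch of both measures, assuming the usual definitions: the MSE is the average squared pixel difference, and PSNR = 10 log10(255^2 / MSE) dB. The function names and the list-of-rows image representation are illustrative assumptions.

```python
import math

def mse(image_a, image_b):
    """Mean-square error between two images of identical size."""
    total, count = 0.0, 0
    for row_a, row_b in zip(image_a, image_b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count

def psnr_db(image_a, image_b, peak=255.0):
    """Peak signal-to-noise ratio in dB, taking the signal level as `peak`."""
    err = mse(image_a, image_b)
    if err == 0.0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(peak * peak / err)
```

In a compression experiment, `image_a` would be the original and `image_b` the decompressed image; a higher PSNR means the two are closer.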

18
Noise (mean-square error)
19
Other types of noise

The other main category of (additive) noise is
impulse (sometimes called salt-and-pepper)
noise
Characterised by the impulse rate (spatial
density of noise impulses) and the mean-square
amplitude of the impulses
Can normally be filtered out easily using median
filters
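
A minimal Python sketch of salt-and-pepper corruption and a 3x3 median filter. The impulse values (0 and 255), the equal salt/pepper split, the border handling and the function names are assumptions made for illustration.

```python
import random
import statistics

def add_impulse_noise(image, rate, rng=None, low=0, high=255):
    """Replace each pixel, with probability `rate`, by a low or high impulse."""
    rng = rng or random.Random(0)
    noisy = []
    for row in image:
        new_row = []
        for p in row:
            if rng.random() < rate:
                new_row.append(high if rng.random() < 0.5 else low)
            else:
                new_row.append(p)
        noisy.append(new_row)
    return noisy

def median_filter3(image):
    """3x3 median filter; border pixels are replicated outside the image."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            window = [image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out_row.append(statistics.median(window))
        out.append(out_row)
    return out
```

An isolated impulse never reaches the middle of a sorted 3x3 window, which is why the median removes it while leaving flat regions untouched.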

20
Other types of noise
[Figure panels: original, salt and pepper noise, de-speckled]
21
Other types of noise

There are many other types of noise which can be
considered in algorithm evaluation
Essentially more sophisticated and realistic
probability distributions of noise rvs
For example, a generalised Gaussian model is
often used to model heavy-tailed
distributions
However, in my humble opinion, a more realistic
source of noise is the deviation away from the
ideal of the illumination variation across
object surfaces

22
Other types of noise
23
Other types of noise
24
Evaluation of thresholding segmentation methods

Segmentation and thresholding algorithms
essentially group pixels into regions (or
classes)
Simplest case is object/background
Simple evaluation metrics just quantify the
number of misclassified pixels
For basic image models such as constant
grey level in object/background regions plus iid
Gaussian noise, the probability of error can be
computed analytically

25
Evaluation of thresholding segmentation methods

For a simple object/background image

26
Evaluation of thresholding segmentation methods

Misclassification probability is a function of
a threshold T
For a simple constant region grey level model plus
additive iid Gaussian noise we can easily derive
an analytical expression for this probability
Not very useful in practice, as the image model is
limited and we also require the ground truth
More useful simply to measure the
misclassification error as a function of
threshold
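
A hedged Python sketch of the analytical case. It assumes a bright object (mean `mu_o`) on a darker background (mean `mu_b`), a common noise standard deviation `sigma`, a prior object probability `p_o`, and a rule that labels a pixel as object when its grey level exceeds T. All of these names and conventions are illustrative, not taken from the slides.

```python
import math

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def misclassification_probability(t, mu_b, mu_o, sigma, p_o=0.5):
    """Probability that a pixel is mislabelled when thresholding at t."""
    missed_object = normal_cdf((t - mu_o) / sigma)        # object pixel falls below t
    false_object = 1.0 - normal_cdf((t - mu_b) / sigma)   # background pixel exceeds t
    return p_o * missed_object + (1.0 - p_o) * false_object
```

With equal priors the error is smallest at the midpoint between the two means; for means 100 and 140 and sigma 20 that is T = 120, where the error probability is about 0.159.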

27
Evaluation of thresholding segmentation methods

Usual to represent correct classification
probabilities and false alarm probabilities
jointly within a receiver operating characteristic (ROC) curve
For example, the ROC shows how these vary as a
function of threshold for an object/background
classification
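
A minimal Python sketch of how such a curve could be measured when ground truth is available. It assumes a pixel is labelled object when its grey level is at least T, that the ground truth is a binary mask (1 = object) containing both classes, and that images are lists of rows. The function name is illustrative.

```python
def roc_points(image, truth, thresholds=range(256)):
    """Return one (false alarm prob., correct classification prob.) pair per threshold."""
    pixels = [(p, t) for row_i, row_t in zip(image, truth)
              for p, t in zip(row_i, row_t)]
    n_obj = sum(1 for _, t in pixels if t)
    n_bg = len(pixels) - n_obj
    points = []
    for thr in thresholds:
        hits = sum(1 for p, t in pixels if t and p >= thr)
        false_alarms = sum(1 for p, t in pixels if not t and p >= thr)
        points.append((false_alarms / n_bg, hits / n_obj))
    return points
```

At T = 0 every pixel is labelled object, so both probabilities are 1.0; raising the threshold moves the point towards the origin.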

28
Evaluation of thresholding segmentation methods
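[Figure: ROC curve. Vertical axis: Prob. of correct classification (0.0 to 1.0). Horizontal axis: Prob. of false alarm (0.0 to 1.0). Points labelled T=0 and T=255 mark the ends of the curve.]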
29
Evaluation of thresholding segmentation methods

More useful methods of evaluation can be found by
taking account of the application of the
segmentation
Segmentation is rarely an end in itself but a
component in an overall machine vision system
Also, the level of under- or over- segmentation
of an algorithm needs to be determined

30
Evaluation of thresholding segmentation methods
[Figure panels: ground truth, under-segmentation, over-segmentation]
31
Evaluation of thresholding segmentation methods

Under-segmentation is bad as distinct regions are
merged
Over-segmentation can be acceptable as
sub-regions comprising a single ground truth
region can be merged using high level knowledge
Also, the level of over-segmentation can be
controlled by parameter settings of the algorithm

32
Evaluation of thresholding segmentation methods

A possible segmentation metric is to quantify
correctly detected regions, over-segmentation and
under-segmentation
Depends upon some threshold setting T
Region rather than pixel based
Used in Koester and Spann's paper (IEEE Trans.
PAMI, 2000) to evaluate range image segmentations

33
Evaluation of thresholding segmentation methods

Correct detection
At least T% of the pixels in region k of the
segmented image are marked as pixels in region j
of the ground truth image
And vice versa
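
A minimal Python sketch of the correct-detection test, assuming both images are label maps (lists of rows of integer region labels) of the same size and that T is given as a percentage. The function names are illustrative assumptions.

```python
from collections import Counter

def region_overlaps(segmented, truth):
    """Count pixels per segmented region, per ground truth region, and per pair."""
    seg_size, gt_size, overlap = Counter(), Counter(), Counter()
    for seg_row, gt_row in zip(segmented, truth):
        for k, j in zip(seg_row, gt_row):
            seg_size[k] += 1
            gt_size[j] += 1
            overlap[(k, j)] += 1
    return seg_size, gt_size, overlap

def correct_detections(segmented, truth, t_percent=80.0):
    """Return sorted (k, j) pairs that overlap by at least T% in both directions."""
    seg_size, gt_size, overlap = region_overlaps(segmented, truth)
    frac = t_percent / 100.0
    return sorted((k, j) for (k, j), n in overlap.items()
                  if n >= frac * seg_size[k] and n >= frac * gt_size[j])
```

Requiring the overlap in both directions is what stops a large merged region from counting as a correct detection of each small region it swallows.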

[Figure panels: segmentation, GT image]
34
Evaluation of thresholding segmentation methods

Over-segmentation
Region j in the ground truth image corresponds
to regions k1, k2, ..., km in the segmented image if
At least T% of the pixels in each region ki are
marked as pixels of region j
At least T% of the pixels in region j are marked
as pixels in the union of regions k1, k2, ..., km
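
A hedged Python sketch of the over-segmentation test, assuming label maps stored as lists of rows of integer labels and T given as a percentage. For simplicity it groups every segmented region that lies mostly inside region j, which is one possible reading of the definition. The function name is illustrative.

```python
from collections import Counter

def over_segmentations(segmented, truth, t_percent=80.0):
    """Map each over-segmented ground truth region j to its segmented regions."""
    seg_size, gt_size, overlap = Counter(), Counter(), Counter()
    for seg_row, gt_row in zip(segmented, truth):
        for k, j in zip(seg_row, gt_row):
            seg_size[k] += 1
            gt_size[j] += 1
            overlap[(k, j)] += 1
    frac = t_percent / 100.0
    candidates = {}
    for (k, j), n in overlap.items():
        if n >= frac * seg_size[k]:  # region k lies mostly inside region j
            candidates.setdefault(j, []).append(k)
    result = {}
    for j, ks in candidates.items():
        covered = sum(overlap[(k, j)] for k in ks)  # pixels of j inside the union
        if len(ks) >= 2 and covered >= frac * gt_size[j]:
            result[j] = sorted(ks)
    return result
```

The under-segmentation test is the mirror image: calling the same function with the two label maps swapped reports segmented regions that cover several ground truth regions.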

35
Evaluation of thresholding segmentation methods
[Figure panels: GT image, segmentation]
36
Evaluation of thresholding segmentation methods

Under-segmentation
Regions j1, j2, ..., jm in the ground truth image
correspond to region k in the segmented image if
At least T% of the pixels in region k are
marked as pixels in the union of regions j1, j2,
..., jm
At least T% of the pixels in each region ji are
marked as pixels in region k

37
Evaluation of thresholding segmentation methods
[Figure panels: GT image, segmentation]
38
Evaluation of thresholding segmentation methods

The metric also allows us to quantify missed and
noise regions
Missed regions: regions in the ground truth
image not found in the segmented image
Noise regions: regions in the segmented image
not found in the ground truth image
Overall, the average number of correct, over,
under, missed and noise regions can be quantified
over an image database and different algorithms
compared

39
Evaluation of optical flow methods

Optical flow algorithms compute the 2D optical
flow vector at each pixel using consecutive
frames in a video sequence
Optical flow algorithms are notoriously un-robust
Crucial to evaluate the effectiveness of any
method used (or any new method devised)
Usually ground truth difficult to come by

40
Evaluation of optical flow methods
41
Evaluation of optical flow methods

This simple error measurement naturally amplifies
errors when the flow vectors are large (for the
same relative flow error)
Can normalize the error by the product of the
magnitudes of the ground truth flow and flow
estimate
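
A hedged Python sketch of the two measures. It assumes the simple error is the squared difference between the estimated and true flow vectors at each pixel, and it takes the normalisation literally as division by the product of the two vector magnitudes. Flow fields are flat lists of (u, v) pairs; all names are illustrative.

```python
import math

def squared_flow_errors(flow, truth):
    """Squared difference between estimated and true flow vectors, per pixel."""
    return [(u - ut) ** 2 + (v - vt) ** 2
            for (u, v), (ut, vt) in zip(flow, truth)]

def normalised_flow_errors(flow, truth, eps=1e-12):
    """Squared error divided by the product of the two flow magnitudes."""
    out = []
    for (u, v), (ut, vt) in zip(flow, truth):
        sq = (u - ut) ** 2 + (v - vt) ** 2
        scale = math.hypot(u, v) * math.hypot(ut, vt)
        out.append(sq / (scale + eps))  # eps guards against zero-length vectors
    return out
```

For an estimate of (1, 0) against a true flow of (2, 0), and for (10, 0) against (20, 0), the relative error is the same; the squared error grows from 1 to 100 while the normalised error stays at 0.5.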

42
Evaluation of optical flow methods

Often the ground truth is not available
A useful (but often crude) way of comparing the
quality of two optical flow fields is to compute
the displaced frame difference (DFD) statistic
for each
Uses the two consecutive frames of a sequence
from which the flows were computed
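
A minimal Python sketch of a DFD statistic, assuming it is the mean squared difference between each pixel in the first frame and the pixel it is mapped to in the second. Displacements are rounded to whole pixels and pixels mapped outside the frame are skipped; both are simplifications, and the function name is illustrative.

```python
def displaced_frame_difference(frame1, frame2, flow):
    """Mean squared difference between frame1 and motion-compensated frame2.

    flow[y][x] is the (u, v) displacement of pixel (x, y) from frame1 to frame2.
    """
    h, w = len(frame1), len(frame1[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            u, v = flow[y][x]
            x2, y2 = x + int(round(u)), y + int(round(v))
            if 0 <= x2 < w and 0 <= y2 < h:  # skip pixels that leave the frame
                total += (frame2[y2][x2] - frame1[y][x]) ** 2
                count += 1
    return total / count if count else float("nan")
```

Computing this value for each of two candidate flow fields over the same pair of frames gives a ranking without ground truth: the field with the lower DFD maps pixels more consistently.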

43
Evaluation of optical flow methods
44
Evaluation of optical flow methods

DFD is a crude estimate because it says nothing
about the accuracy of the motion field directly,
just the quality of the pixel mapping from one
frame to the next
It also says nothing about the confidence
attached to optical flow estimates
However, it is the basis of motion compensation
algorithms for most of the current video
compression standards (MPEG, H.261 etc.)

45
Evaluation of optical flow methods

In optical flow estimation, as in other types of
estimation algorithms, we are often interested in
the quality of the estimates
In classic estimation theory, we often compute
confidence limits on estimates
We can say with a certain degree of confidence
(say 90%) that the parameter lies within certain
bounds
We usually assume that the quantities we are
estimating follow some known probability
distribution (for example chi-squared)

46
Evaluation of optical flow methods

In the case of optical flow vectors, confidence
regions are ellipses in 2 dimensions
They essentially characterise the distribution of
the estimation error
Assuming a normal distribution of the flow
error, confidence ellipses can be drawn for
any confidence limit
Orientation and shape of ellipses determined by
the covariance matrix defining the normal
distribution
The eigenvalues of the covariance matrix, scaled for a
particular confidence limit, set the lengths of the ellipse axes
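
A minimal Python sketch of how a confidence ellipse could be derived from a 2x2 flow-error covariance matrix. It assumes a zero-mean bivariate normal error, for which the squared Mahalanobis distance follows a chi-squared distribution with 2 degrees of freedom, so the scale factor for confidence p is -2 ln(1 - p). The function name and argument layout are illustrative.

```python
import math

def confidence_ellipse(cxx, cxy, cyy, confidence=0.9):
    """Return (semi_major, semi_minor, angle) for a 2x2 covariance matrix.

    angle is the direction of the major axis in radians, measured from the x axis.
    """
    scale = -2.0 * math.log(1.0 - confidence)  # chi-squared quantile, 2 dof
    mean = 0.5 * (cxx + cyy)
    root = math.hypot(0.5 * (cxx - cyy), cxy)
    lam_major, lam_minor = mean + root, mean - root  # eigenvalues
    angle = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    semi_major = math.sqrt(scale * lam_major)
    semi_minor = math.sqrt(scale * max(lam_minor, 0.0))
    return semi_major, semi_minor, angle
```

A long, thin ellipse indicates that the flow is well constrained in one direction only, which suggests one way of thresholding estimates by confidence.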

47
Evaluation of optical flow methods
[Figure: confidence ellipses at the 99%, 90% and 70% levels]
48
Evaluation of optical flow methods
[Figure panels: Yosemite, Yosemite true flow, Yosemite flow (LK), Yosemite flow (LK) confidence thresholded]
49
Conclusions

Evaluation in computer vision is a difficult and
often controversial topic
I would suggest 3 rules of thumb to consider when
evaluating your work for the purposes of
assignments
Consider carefully your test data. Make it as
realistic as possible
Make your evaluations as application-driven as
possible
Make your algorithms self-evaluating if
possible through the use of confidence statistics