Title: Ecological Statistics and Visual Grouping
1. Ecological Statistics and Visual Grouping
- Jitendra Malik
- U.C. Berkeley
2. Collaborators
- David Martin
- Charless Fowlkes
- Xiaofeng Ren
3. From Images to Objects
"I stand at the window and see a house, trees,
sky. Theoretically I might say there were 327
brightnesses and nuances of colour. Do I have
"327"? No. I have sky, house, and trees." --Max
Wertheimer
4. Grouping factors
5. Critique
- Predictive power
- Factors for complex, natural stimuli?
- How do they interact?
- Functional significance
- Why should these be useful or confer some evolutionary advantage to a visual organism?
- Brain mechanisms
- How are these factors implemented, given what we know about V1 and higher visual areas?
7. Our approach
- Creating a dataset of human-segmented images
- Measuring ecological statistics of various Gestalt grouping factors
- Using these measurements to calibrate and validate approaches to grouping
8. Natural Images aren't generic signals
- Edges/Filters/Coding: Ruderman 1994/1997, Olshausen/Field 1996, Bell/Sejnowski 1997, van Hateren/van der Schaaf 1998, Buccigrossi/Simoncelli 1999, Alvarez/Gousseau/Morel 1999, Huang/Mumford 1999
- Range Data: Huang/Lee/Mumford 2000
9. Brunswik & Kamiya 1953
- Unification of two important theories of perception:
- Statistical/Bayesian formulation, due to Helmholtz's Likelihood Principle
- Gestalt Psychology
- Attempted an empirical proof of the Gestalt grouping rule of proximity: 892 separations
- Ahead of his time.
- Now we have the tools to do this.
Egon Brunswik (1903-1955)
10. Outline
- Collect Data
- Learn Local Boundary Model (Low-Level Cues)
- Learn Pixel Affinity Model (Mid-Level Cues)
- Discussion and Conclusion
14. Protocol
- You will be presented a photographic image. Divide the image into some number of segments, where the segments represent things or parts of things in the scene. The number of segments is up to you, as it depends on the image. Something between 2 and 30 is likely to be appropriate. It is important that all of the segments have approximately equal importance.
- Custom segmentation tool
- Subjects obtained from work-study program (UC Berkeley undergraduates)
17. Segmentations are Consistent
Perceptual organization forms a tree:
[Figure: segmentation tree. Image → {BG, L-bird, R-bird}; BG → {bush, far, grass}; each bird → {beak, body, head, eye}]
Two segmentations are consistent when they can be explained by the same segmentation tree (i.e. they could be derived from a single perceptual organization).
- A, C are refinements of B
- A, C are mutual refinements
- A, B, C represent the same percept
- Attention accounts for differences
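The tree criterion can be made operational as a pairwise containment test: two label maps are consistent exactly when every overlapping pair of regions is nested. A minimal Python sketch (the function and this particular test are our illustration, not the authors' published tool):

```python
import numpy as np

def consistent(seg_a: np.ndarray, seg_b: np.ndarray) -> bool:
    """Check whether two label maps could come from one segmentation tree.

    Sketch: wherever a region of A and a region of B overlap, one must
    contain the other (local refinement in one direction or the other).
    """
    for a in np.unique(seg_a):
        mask_a = seg_a == a
        for b in np.unique(seg_b[mask_a]):   # regions of B overlapping region a
            mask_b = seg_b == b
            inter = np.logical_and(mask_a, mask_b).sum()
            # Consistent only if the overlap exhausts one of the two regions,
            # i.e. region a is inside b or region b is inside a.
            if inter != mask_a.sum() and inter != mask_b.sum():
                return False
    return True
```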
19. Dataset Summary
- 30 subjects, age 19-23
- 17 men, 13 women
- 9 with artistic training
- 8 months
- 1,458 person hours
- 1,020 Corel images
- 11,595 Segmentations
- 5,555 color, 5,554 gray, 486 inverted/negated
20. Gray, Color, InvNeg Datasets
- Explore how various high/low-level cues affect the task of image segmentation by subjects
- Color: full color image
- Gray: luminance image
- InvNeg: inverted negative luminance image
21. [Figure: the same image presented in the Color, Gray, and InvNeg conditions]
22. [Figure: InvNeg example]
23. [Figure: another image in the Color, Gray, and InvNeg conditions]
24. Outline
- Collect Data
- Learn Local Boundary Model (Low-Level Cues)
- The first step in human vision: finding edges
- Required for all segmentation algorithms
- Learn Pixel Affinity Model (Mid-Level Cues)
- Discussion and Conclusion
25. Dataflow
[Diagram: Image → boundary cues (Brightness, Color, Texture) → cue combination model → Pb]
Challenges: texture cue, cue combination.
Goal: learn the posterior probability of a boundary, Pb(x,y,θ), from local information only.
27. Brightness and Color Features
- 1976 CIE Lab colorspace
- Brightness Gradient BG(x,y,r,θ)
- χ² difference in L distribution
- Color Gradient CG(x,y,r,θ)
- χ² difference in a and b distributions
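All three gradient cues reduce to a χ² distance between histograms taken over the two halves of an oriented disc. A minimal sketch (the bin count and the toy data are illustrative assumptions):

```python
import numpy as np

def chi_squared(h, g, eps=1e-10):
    """0.5 * sum_i (h_i - g_i)^2 / (h_i + g_i) for two normalized histograms."""
    h, g = np.asarray(h, float), np.asarray(g, float)
    return 0.5 * np.sum((h - g) ** 2 / (h + g + eps))

# Toy usage: compare luminance histograms of the two halves of an oriented
# disc centered at (x, y); a large distance suggests a boundary there.
rng = np.random.default_rng(0)
left  = np.histogram(rng.random(200),      bins=16, range=(0, 1))[0] / 200.0
right = np.histogram(rng.random(200) ** 2, bins=16, range=(0, 1))[0] / 200.0
print(chi_squared(left, right))
```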
28. Texture Feature
- Texture Gradient TG(x,y,r,θ)
- χ² difference of texton histograms
- Textons are vector-quantized filter outputs
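A sketch of texton computation under stated assumptions: the filter bank below (Gaussian derivatives at a few scales) is a small stand-in for the paper's oriented bank, and k-means does the vector quantization:

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.cluster.vq import kmeans2

def texton_map(image, n_textons=32, sigmas=(1.0, 2.0, 4.0)):
    """Assign each pixel a texton id by clustering its filter-response vector."""
    responses = [gaussian_filter(image, s, order=o)
                 for s in sigmas
                 for o in ((0, 1), (1, 0), (0, 2), (2, 0))]  # dx, dy, dxx, dyy
    vecs = np.stack(responses, axis=-1).reshape(-1, len(responses))
    _, labels = kmeans2(vecs.astype(float), n_textons, minit='points')
    return labels.reshape(image.shape)  # texton id per pixel
```

The texture gradient TG then compares texton histograms of the two half-discs with the χ² distance sketched above.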
29. Cue Combination Models
- Classification Trees
- Top-down splits to maximize entropy, error bounded
- Density Estimation
- Adaptive bins using k-means
- Logistic Regression, 3 variants
- Linear and quadratic terms
- Confidence-rated generalization of AdaBoost (Schapire & Singer)
- Hierarchical Mixtures of Experts (Jordan & Jacobs)
- Up to 8 experts, initialized top-down, fit with EM
- Support Vector Machines (libsvm, Chang & Lin)
- Gaussian kernel, ν-parameterization
- Range over bias, complexity, parametric/non-parametric
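Of these, the plain logistic model is the one the later findings favor. A minimal sketch of fitting Pb by logistic regression over the three cue values (scikit-learn and the random placeholder arrays are our assumptions; the original work used its own fitting code):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder features: one row per (x, y, theta) location, with columns
# (BG, CG, TG); labels mark locations where a human drew a boundary nearby.
rng = np.random.default_rng(0)
X_train = rng.random((10_000, 3))
y_train = (rng.random(10_000) < 0.1).astype(int)

model = LogisticRegression().fit(X_train, y_train)
Pb = model.predict_proba(X_train[:5])[:, 1]  # posterior Pr(boundary | cues)
print(Pb)
```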
30. Computing Precision/Recall
- Recall = Pr(signal | truth): fraction of ground truth found by the signal
- Precision = Pr(truth | signal): fraction of signal that is correct
- Always a trade-off between the two
- Standard measures in information retrieval (van Rijsbergen XX)
- ROC from standard signal detection is the wrong approach
- Strategy (sketched below):
- Detector output (Pb) is a soft boundary map
- Compute precision/recall curve
- Threshold Pb at many points t in [0,1]
- Recall = Pr(Pb > t | seg = 1)
- Precision = Pr(seg = 1 | Pb > t)
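A sketch of that strategy (the pixelwise comparison is a simplifying assumption; the benchmark additionally matches detected boundary pixels to human boundaries within a small distance tolerance):

```python
import numpy as np

def pr_curve(pb, truth, n_thresh=30):
    """Precision/recall of a soft boundary map pb in [0,1] against a
    binary ground-truth boundary mask."""
    precision, recall = [], []
    for t in np.linspace(0.01, 0.99, n_thresh):
        detect = pb > t
        tp = np.logical_and(detect, truth).sum()
        precision.append(tp / max(detect.sum(), 1))  # Pr(seg=1 | Pb>t)
        recall.append(tp / max(truth.sum(), 1))      # Pr(Pb>t | seg=1)
    return np.array(precision), np.array(recall)
```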
31. Classifier Comparison
[Plot: precision-recall curves for the classifiers; annotations mark the goal (high precision and recall), more noise, and more signal]
32. ROC vs. Precision/Recall

                Truth
                P    N
    Signal  P   TP   FP
            N   FN   TN

ROC Curve: Hit Rate = TP / (TP+FN), False Alarm Rate = FP / (FP+TN)
PR Curve: Precision = TP / (TP+FP), Recall = TP / (TP+FN)
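A small worked example of why ROC misleads here: boundary pixels are rare, so true negatives dominate and the false-alarm rate is near zero for any detector, while precision still discriminates. The counts below are made up for illustration:

```python
def rates(tp, fp, fn, tn):
    hit_rate    = tp / (tp + fn)   # ROC y-axis (same as recall)
    false_alarm = fp / (fp + tn)   # ROC x-axis: depends on TN
    precision   = tp / (tp + fp)   # PR y-axis: no TN term
    return hit_rate, false_alarm, precision

# With boundaries rare, TN is huge: false alarm rate ~0.004 looks excellent
# on a ROC curve, yet precision 90/490 ~ 0.18 exposes the weak detector.
print(rates(tp=90, fp=400, fn=10, tn=99_500))
```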
33. Cue Calibration
- All free parameters optimized on training data
- All algorithmic alternatives evaluated by experiment
- Brightness Gradient
- Scale, bin/kernel sizes for KDE
- Color Gradient
- Scale, bin/kernel sizes for KDE, joint vs. marginals
- Texture Gradient
- Filter bank scale, multiscale?
- Histogram comparison: L1, L2, L∞, χ², EMD
- Number of textons, image-specific vs. universal textons
- Localization parameters for each cue
34. Calibration Example: Number of Textons for the Texture Gradient
35. Calibration Example 2: Image-Specific vs. Universal Textons
36. Boundary Localization
[Figure: raw TG signal near boundaries vs. non-boundaries]
(1) Fit cylindrical parabolas to the raw oriented signal to get local shape (Savitzky-Golay)
(2) Localize peaks
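Savitzky-Golay smoothing is equivalent to a sliding least-squares polynomial fit, so peak localization reduces to reading off the vertex of a fitted parabola. A 1-D sketch (the window size is an arbitrary choice; the paper fits cylindrical parabolas to the 2-D oriented signal):

```python
import numpy as np

def localize_peak(signal, x, half_window=2):
    """Fit a parabola to the cue signal in a small window around index x
    and return the sub-pixel peak location, or None if not a maximum.
    Assumes the window fits inside the signal."""
    xs = np.arange(x - half_window, x + half_window + 1)
    c2, c1, c0 = np.polyfit(xs, signal[xs], deg=2)
    if c2 >= 0:                       # concave-up: no local maximum here
        return None
    return -c1 / (2 * c2)             # vertex of the fitted parabola
```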
37. Dataflow
[Diagram: Image → optimized cues (Brightness, Color, Texture) → cue combination model → Pb]
38. Classifier Comparison
39. Cue Combinations
40. Alternate Approaches
- Canny Detector
- Canny 1986
- MATLAB implementation
- With and without hysteresis
- Second Moment Matrix
- Nitzberg/Mumford/Shiota 1993
- cf. Förstner and Harris corner detectors
- Used by Konishi et al. 1999 in a learning framework
- Logistic model trained on full eigenspectrum
41. Pb Images
[Figure: Image, Human, Canny, 2MM, and our Pb maps]
42. Pb Images II
[Figure: Image, Human, Canny, 2MM, and our Pb maps]
43. Pb Images III
[Figure: Image, Human, Canny, 2MM, and our Pb maps]
44. Two Decades of Boundary Detection
45. Findings
- A simple linear model is sufficient for cue combination
- All cues weighted approximately equally in the logistic
- A proper texture edge model is not optional for complex natural images
- Texture suppression is not sufficient!
- Significant improvement over state-of-the-art in boundary detection
- Pb(x,y,θ) useful for higher-level processing
- Empirical approach critical for both cue calibration and cue combination
46. Spatial priors on image regions and contours
47. Good Continuation
- Wertheimer '23
- Kanizsa '55
- von der Heydt, Peterhans & Baumgartner '84
- Kellman & Shipley '91
- Field, Hayes & Hess '93
- Kapadia, Westheimer & Gilbert '00
- ...
- Parent & Zucker '89
- Heitger & von der Heydt '93
- Mumford '94
- Williams & Jacobs '95
- ...
48. Outline of Experiments
- Prior model of contours in natural images
- First-order Markov model
- Test of Markov property
- Multi-scale Markov models
- Information-theoretic evaluation
- Contour synthesis
- Good continuation algorithm and results
49. Contour Geometry
- First-Order Markov Model (Mumford '94, Williams & Jacobs '95)
- Curvature: white noise (independent from position to position)
- Tangent t(s): random walk
- Markov property: the tangent at the next position, t(s+1), depends only on the previous tangent t(s)
[Figure: contour with tangent t(s) at arc position s and t(s+1) at s+1]
50. Test of Markov Property
Segment the contours at high-curvature positions
51. Prediction: Exponential Distribution
- If the first-order Markov property holds:
- At every step, there is a constant probability p that a high-curvature event will occur
- High-curvature events are independent from step to step
- Then the probability of finding a segment of length k with no high curvature is (1-p)^k
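A quick simulation makes the prediction concrete (p and the sample size below are arbitrary choices): under the Markov assumption, segment lengths decay geometrically, which is exactly what the next slides test against data:

```python
import numpy as np

# Under the first-order Markov model, a high-curvature event occurs
# independently with probability p at each step, so the length of an
# event-free segment is geometric: P(length >= k) = (1 - p)**k.
p, n = 0.1, 200_000
rng = np.random.default_rng(0)
events = rng.random(n) < p                   # high-curvature events
gaps = np.diff(np.flatnonzero(events)) - 1   # event-free run lengths
for k in (5, 10, 20):
    print(k, (gaps >= k).mean(), (1 - p) ** k)  # empirical vs. predicted
```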
52. Empirical Distribution
[Plot: empirical segment-length distribution. Exponential?]
53. Empirical Distribution: Power Law
[Plot: probability density vs. contour segment length]
54. Power Laws in Nature
- Power laws widely found in nature:
- Brightness of stars
- Magnitude of earthquakes
- Population of cities
- Word frequency in natural languages
- Revenue of commercial corporations
- Connectivity in Internet topology
- ...
- Usually characterized by self-similarity and multi-scale phenomena
55. Multi-scale Markov Models
- Assume knowledge of contour orientation at coarser scales
[Figure: contour with tangent t(s) at position s and t(s+1) at s+1]
2nd-Order Markov: P( t(s+1) | t(s), t^(1)(s+1) )
Higher-Order Models: P( t(s+1) | t(s), t^(1)(s+1), t^(2)(s+1), ... )
where t^(i)(s+1) denotes the tangent at the i-th coarser scale.
56. Information Gain in Multi-scale
H( t(s+1) | t(s), t^(1)(s+1), t^(2)(s+1), ... )
[Plot: conditional entropy vs. model order; the information gain reaches 14.6% of total entropy at order 5]
57. Contour Synthesis
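A minimal sketch of sampling from the first-order model, in the spirit of the synthesis shown here: curvature is i.i.d. noise, so the tangent angle is a random walk (the Gaussian noise model and the sigma value are our assumptions):

```python
import numpy as np

def synthesize_contour(n_steps=200, kappa_sigma=0.15, seed=0):
    """Sample a contour from the first-order Markov model: i.i.d. curvature
    noise makes the tangent angle a random walk along arc length."""
    rng = np.random.default_rng(seed)
    theta = np.cumsum(rng.normal(0.0, kappa_sigma, n_steps))  # tangent angles
    # Integrate the unit tangent to get the contour points.
    xy = np.cumsum(np.stack([np.cos(theta), np.sin(theta)], axis=1), axis=0)
    return xy  # (n_steps, 2) array of points
```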
58. Multi-scale in Natural Images
- Arbitrary viewing distance
- Multi-scale in object shape
59. Conditioned on Object Size
[Plot: probability density vs. contour segment length, conditioned on object size]
60. Distribution of Region Convexity
61. Multi-scale Contour Completion
- Coarse-to-fine
- Coarse scale completes large gaps
- Fine scale detects details
- Completed contours at coarser scales are used in the higher-order Markov models of the contour prior for finer scales: P( t(s+1) | t(s), t^(1)(s+1), ... )
62. Multi-scale Example
[Figure: input; coarse scale; fine scale without multi-scale; fine scale with multi-scale]
63. Comparison (same number of edge pixels)
[Figure: our result vs. Canny]
64. Comparison (same number of edge pixels)
[Figure: our result vs. Canny]
65. Outline
- Collect and Validate Data
- Learn Local Boundary Model (Low-Level Cues)
- Learn Pixel Affinity Model (Mid-Level Cues)
- Good representation for segmentation algorithms
- Keeps segmentation in a probabilistic framework
- Discussion and Conclusion
66. Dataflow
[Diagram: Image → region cues and edge cues → estimated affinity E → segmentation]
- Eij: affinity between pixels i and j
- Representation for graph-theoretic segmentation algorithms:
- Minimum Spanning Trees: Zahn 1971, Urquhart 1982
- Spectral Clustering: Scott/Longuet-Higgins 1990, Sarkar/Boyer 1996
- Graph Cuts: Wu/Leahy 1993, Shi/Malik 1997, Felzenszwalb/Huttenlocher 1998, Gdalyahu/Weinshall/Werman 1999
- Matrix Factorization: Perona/Freeman 1998
- Graph Cycles: Jermyn/Ishikawa 2001
67. Pixel Affinity Cues
- Patch similarity (×3)
- Edges: strength of intervening contour (×3)
- Image-plane distance
- All cues calibrated with respect to training data
- Goal: learn the affinity function from the dataset using these 7 cues (a sketch follows below)
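One way the edge and distance cues might fold into a single affinity Eij; the functional form, the sigma values, and the straight-line sampling are our assumptions for illustration (the paper learns the combination from the dataset):

```python
import numpy as np

def affinity(pb, i, j, sigma_ic=0.1, sigma_d=20.0):
    """Toy affinity between pixels i and j combining two of the cues:
    intervening contour (max Pb along the segment between them) and
    image-plane distance."""
    i, j = np.asarray(i), np.asarray(j)
    n = int(np.hypot(*(j - i))) + 2
    line = np.linspace(i, j, n).round().astype(int)   # pixels on the segment
    max_pb = pb[line[:, 0], line[:, 1]].max()         # strongest crossing edge
    dist2 = float(((i - j) ** 2).sum())
    # Strong intervening contour or large distance -> low affinity.
    return np.exp(-max_pb / sigma_ic) * np.exp(-dist2 / sigma_d**2)
```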
68. Dataflow
[Diagram: Image → region cues and edge cues → estimated affinity (E) → segmentation]
69. Two Evaluation Methods
[Figure: ground-truth same-segment indicator Gij and estimated affinity Eij]
- Precision-Recall of same-segment pairs
- Precision is Pr(Gij = 1 | Eij > t)
- Recall is Pr(Eij > t | Gij = 1)
- Mutual Information between E and G
70. Mutual information
I(X;Y) = Σ_{x,y} P(x,y) log [ P(x,y) / ( P(x) P(y) ) ]
where x is a cue and y is the indicator of being in the same segment
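A sketch of estimating this quantity by histogramming a continuous cue against the binary same-segment indicator (the bin count is an arbitrary choice; the estimate is in bits):

```python
import numpy as np

def mutual_information(x, y, n_bins=32):
    """Estimate I(X;Y) in bits between a continuous cue x and a binary
    same-segment indicator y via a 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=(n_bins, 2))
    p = joint / joint.sum()                    # joint distribution P(x, y)
    px = p.sum(axis=1, keepdims=True)          # marginal P(x)
    py = p.sum(axis=0, keepdims=True)          # marginal P(y)
    nz = p > 0                                 # avoid log(0)
    return float(np.sum(p[nz] * np.log2(p[nz] / (px @ py)[nz])))
```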
71. Individual Features
[Plot: performance of individual features, gradients vs. patches]
72. The Distance Cue
cf. Brunswik & Kamiya 1953
73. Feature Pruning
[Plot: top-down and bottom-up feature pruning]
4 good cues: texture edge/patch, color patch, brightness edge. 2 poor cues: color edge, brightness patch.
74. Affinity Model vs. Humans
75. Results
- Common wisdom: use patches only / use edges only. Finding: use both.
- Common wisdom: must use patches for texture. Finding: not true.
- Common wisdom: color is a powerful grouping cue. Finding: true, but texture is better.
- Common wisdom: brightness patches are a poor cue. Finding: true (shadows).
- Common wisdom: proximity is a (Gestalt) grouping cue. Finding: proximity is a result, not a cause, of grouping.
76. Outline
- Collect and Validate Data
- Learn Local Boundary Model (Low-Level Cues)
- Learn Pixel Affinity Model (Mid-Level Cues)
- Discussion and Conclusion
77. Contribution
- Provide a mathematical foundation for the grouping problem in terms of the ecological statistics of natural images.
- "When you can measure what you are speaking about and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of the meager and unsatisfactory kind." --Lord Kelvin