Ecological Statistics and Visual Grouping
Transcript and Presenter's Notes
1
Ecological Statistics and Visual Grouping
  • Jitendra Malik
  • U.C. Berkeley

2
Collaborators
  • David Martin
  • Charless Fowlkes
  • Xiaofeng Ren

3
From Images to Objects
"I stand at the window and see a house, trees,
sky. Theoretically I might say there were 327
brightnesses and nuances of colour. Do I have
"327"? No. I have sky, house, and trees." --Max
Wertheimer
4
Grouping factors
5
Critique
  • Predictive power
  • Factors for complex, natural stimuli ?
  • How do they interact ?
  • Functional significance
  • Why should these be useful or confer some
    evolutionary advantage to a visual organism?
  • Brain mechanisms
  • How are these factors implemented given what we
    know about V1 and higher visual areas?

6
(No Transcript)
7
Our approach
  • Creating a dataset of human segmented images
  • Measuring ecological statistics of various
    Gestalt grouping factors
  • Using these measurements to calibrate and
    validate approaches to grouping

8
Natural Images aren't generic signals
  • Edges/Filters/Coding: Ruderman 1994/1997,
    Olshausen/Field 1996, Bell/Sejnowski 1997,
    van Hateren/van der Schaaf 1998, Buccigrossi/Simoncelli 1999,
    Alvarez/Gousseau/Morel 1999, Huang/Mumford 1999
  • Range Data: Huang/Lee/Mumford 2000

9
Brunswik & Kamiya 1953
  • Unification of two important theories of
    perception:
  • Statistical/Bayesian formulation, due to
    Helmholtz's Likelihood Principle
  • Gestalt Psychology
  • Attempted an empirical proof of the Gestalt
    grouping rule of proximity
  • 892 separations
  • Ahead of his time.
  • Now we have the tools to do this.

Egon Brunswik (1903-1955)
10
Outline
  1. Collect Data
  2. Learn Local Boundary Model (Low-Level Cues)
  3. Learn Pixel Affinity Model (Mid-Level Cues)
  4. Discussion and Conclusion

11
(No Transcript)
12
(No Transcript)
13
(No Transcript)
14
Protocol
  • You will be presented a photographic image.
    Divide the image into some number of segments,
    where the segments represent things or parts
    of things in the scene. The number of segments
    is up to you, as it depends on the image.
    Something between 2 and 30 is likely to be
    appropriate. It is important that all of the
    segments have approximately equal importance.
  • Custom segmentation tool
  • Subjects obtained from work-study program (UC
    Berkeley undergraduates)

15
(No Transcript)
16
(No Transcript)
17
Segmentations are Consistent
Perceptual organization forms a tree
[Figure: example segmentation tree with root Image, children BG, L-bird, R-bird, and lower-level nodes bush, far, grass, beak, body, head, eye]
Two segmentations are consistent when they can
be explained by the same segmentation tree (i.e.
they could be derived from a single perceptual
organization).
  • A,C are refinements of B
  • A,C are mutual refinements
  • A,B,C represent the same percept
  • Attention accounts for differences

18
(No Transcript)
19
Dataset Summary
  • 30 subjects, age 19-23
  • 17 men, 13 women
  • 9 with artistic training
  • 8 months
  • 1,458 person hours
  • 1,020 Corel images
  • 11,595 Segmentations
  • 5,555 color, 5,554 gray, 486 inverted/negated

20
Gray, Color, InvNeg Datasets
  • Explore how various high/low-level cues affect
    the task of image segmentation by subjects
  • Color: full color image
  • Gray: luminance image
  • InvNeg: inverted negative luminance image

21
Color
Gray
InvNeg
22
InvNeg
23
Color
Gray
InvNeg
24
Outline
  • Collect Data
  • Learn Local Boundary Model (Low-Level Cues)
  • The first step in human vision: finding edges
  • Required for all segmentation algorithms
  • Learn Pixel Affinity Model (Mid-Level Cues)
  • Discussion and Conclusion

25
Dataflow
[Diagram: Image → Boundary Cues (Brightness, Color, Texture) → Cue Combination → Model → Pb]
Challenges: texture cue, cue combination.
Goal: learn the posterior probability of a boundary,
Pb(x,y,θ), from local information only.
26
(No Transcript)
27
Brightness and Color Features
  • 1976 CIE Lab colorspace
  • Brightness Gradient BG(x,y,r,θ)
  • χ² difference in L distribution
  • Color Gradient CG(x,y,r,θ)
  • χ² difference in a and b distributions

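As an illustration (not the authors' code), a minimal Python sketch of the χ² histogram comparison behind BG and CG, assuming two normalized channel histograms taken from the two halves of an oriented disc at (x, y, r, θ); names and values are hypothetical:

    import numpy as np

    def chi2_distance(h1, h2, eps=1e-10):
        """Chi-squared difference between two normalized histograms."""
        h1 = np.asarray(h1, dtype=float)
        h2 = np.asarray(h2, dtype=float)
        return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

    # Hypothetical L-channel histograms from the two half-discs.
    left = np.array([0.1, 0.4, 0.3, 0.2])
    right = np.array([0.3, 0.3, 0.2, 0.2])
    bg = chi2_distance(left, right)   # brightness gradient BG(x, y, r, theta)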
28
Texture Feature
  • Texture Gradient TG(x,y,r,θ)
  • χ² difference of texton histograms
  • Textons are vector-quantized filter outputs

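The texton step can be sketched as vector quantization of per-pixel filter-bank responses; a minimal version using scikit-learn's KMeans, where the `responses` array is a hypothetical stand-in for precomputed filter outputs:

    import numpy as np
    from sklearn.cluster import KMeans

    def compute_textons(responses, k=32, seed=0):
        """Assign each pixel to one of k textons by clustering filter responses."""
        h, w, f = responses.shape
        km = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = km.fit_predict(responses.reshape(-1, f))
        return labels.reshape(h, w)   # texton map

    responses = np.random.rand(64, 64, 8)   # stand-in for real filter outputs
    textons = compute_textons(responses)

Texton histograms pooled over the two half-discs would then be compared with the χ² distance above to give TG(x, y, r, θ).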
29
Cue Combination Models
  • Classification Trees
  • Top-down splits to maximize entropy, error
    bounded
  • Density Estimation
  • Adaptive bins using k-means
  • Logistic Regression, 3 variants
  • Linear and quadratic terms
  • Confidence-rated generalization of AdaBoost
    (Schapire & Singer)
  • Hierarchical Mixtures of Experts (Jordan & Jacobs)
  • Up to 8 experts, initialized top-down, fit with
    EM
  • Support Vector Machines (libsvm, Chang & Lin)
  • Gaussian kernel, ν-parameterization
  • Range over bias, complexity,
    parametric/non-parametric

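A minimal sketch of the simplest combiner above, logistic regression over local cues; X and y are hypothetical stand-ins for per-pixel cue vectors and human-derived boundary labels, not the authors' training data:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    X = np.random.rand(1000, 3)                   # e.g. columns [BG, CG, TG]
    y = (np.random.rand(1000) < 0.1).astype(int)  # 1 = human-marked boundary pixel

    model = LogisticRegression().fit(X, y)
    pb = model.predict_proba(X)[:, 1]             # posterior boundary probability Pb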
30
Computing Precision/Recall
  • Recall: Pr(signal | truth), the fraction of ground
    truth found by the signal
  • Precision: Pr(truth | signal), the fraction of the
    signal that is correct
  • Always a trade-off between the two
  • Standard measures in information retrieval (van
    Rijsbergen XX)
  • ROC from standard signal detection is the wrong
    approach
  • Strategy:
  • Detector output (Pb) is a soft boundary map
  • Compute the precision/recall curve
  • Threshold Pb at many points t in [0,1]
  • Recall = Pr(Pb > t | seg = 1)
  • Precision = Pr(seg = 1 | Pb > t)

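A pixelwise sketch of the thresholding strategy just described; the real benchmark also matches detections to human boundaries within a tolerance, which this simplification omits:

    import numpy as np

    def pr_curve(pb, truth, thresholds=np.linspace(0.0, 1.0, 21)):
        """Precision/recall of a soft boundary map pb against a binary ground truth."""
        precision, recall = [], []
        for t in thresholds:
            signal = pb > t
            tp = np.logical_and(signal, truth).sum()
            precision.append(tp / max(signal.sum(), 1))   # Pr(truth | signal)
            recall.append(tp / max(truth.sum(), 1))       # Pr(signal | truth)
        return np.array(precision), np.array(recall)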
31
Classifier Comparison
[Figure: precision-recall comparison of classifiers; plot annotations: Goal, More Noise, More Signal]
32
ROC vs. Precision/Recall
              Truth
              P     N
Signal   P    TP    FP
         N    FN    TN

ROC Curve:  Hit Rate = TP / (TP + FN),  False Alarm Rate = FP / (FP + TN)
PR Curve:   Precision = TP / (TP + FP),  Recall = TP / (TP + FN)
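A small made-up numeric example of why precision/recall is preferred here: with the heavy class imbalance of boundary pixels, the ROC false alarm rate can look excellent while precision reveals that most detections are wrong.

    tp, fp, fn, tn = 900, 4_000, 100, 995_000   # hypothetical pixel counts

    hit_rate = tp / (tp + fn)       # 0.90   (flattering on an ROC curve)
    false_alarm = fp / (fp + tn)    # ~0.004 (flattering on an ROC curve)
    precision = tp / (tp + fp)      # ~0.18  (most detections are false)
    recall = tp / (tp + fn)         # 0.90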
33
Cue Calibration
  • All free parameters optimized on training data
  • All algorithmic alternatives evaluated by
    experiment
  • Brightness Gradient
  • Scale, bin/kernel sizes for KDE
  • Color Gradient
  • Scale, bin/kernel sizes for KDE, joint vs.
    marginals
  • Texture Gradient
  • Filter bank scale, multiscale?
  • Histogram comparison: L1, L2, L∞, χ², EMD
  • Number of textons, Image-specific vs. universal
    textons
  • Localization parameters for each cue

34
Calibration Example: Number of Textons for the
Texture Gradient
35
Calibration Example 2: Image-Specific vs.
Universal Textons
36
Boundary Localization
[Figure: TG responses for boundaries vs. non-boundaries]
(1) Fit cylindrical parabolas to the raw oriented
signal to get local shape (Savitzky-Golay)
(2) Localize peaks
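A one-dimensional sketch of the localization step, assuming `signal` holds the raw oriented TG response along a line across the candidate boundary (illustrative only; the authors fit cylindrical parabolas to the 2-D signal):

    import numpy as np

    def localize_peak(signal, window=9):
        """Fit a quadratic around the maximum and return its sub-pixel vertex."""
        i = int(np.argmax(signal))
        lo, hi = max(0, i - window // 2), min(len(signal), i + window // 2 + 1)
        x = np.arange(lo, hi)
        a, b, c = np.polyfit(x, signal[lo:hi], 2)   # smoothed local quadratic
        if a >= 0:                                  # no concave peak found
            return float(i)
        return -b / (2 * a)                         # vertex of the parabola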
37
Dataflow
[Diagram: Image → Optimized Cues (Brightness, Color, Texture) → Cue Combination → Model → Pb]
38
Classifier Comparison
39
Cue Combinations
40
Alternate Approaches
  • Canny Detector
  • Canny 1986
  • MATLAB implementation
  • With and without hysteresis
  • Second Moment Matrix
  • Nitzberg/Mumford/Shiota 1993
  • cf. Förstner and Harris corner detectors
  • Used by Konishi et al. 1999 in learning framework
  • Logistic model trained on full eigenspectrum

41
Pb Images
[Figure: boundary maps for an example image: Canny, 2MM, us, human, and the original image]
42
Pb Images II
[Figure: boundary maps for a second example image: Canny, 2MM, us, human, and the original image]
43
Pb Images III
[Figure: boundary maps for a third example image: Canny, 2MM, us, human, and the original image]
44
Two Decades of Boundary Detection
45
Findings
  • A simple linear model is sufficient for cue
    combination
  • All cues weighted approximately equally in
    logistic
  • Proper texture edge model is not optional for
    complex natural images
  • Texture suppression is not sufficient!
  • Significant improvement over state-of-the-art in
    boundary detection
  • Pb(x,y,θ) useful for higher-level processing
  • Empirical approach critical for both cue
    calibration and cue combination

46
Spatial priors on image regions and contours
47
Good Continuation
  • Wertheimer '23
  • Kanizsa '55
  • von der Heydt, Peterhans & Baumgartner '84
  • Kellman & Shipley '91
  • Field, Hayes & Hess '93
  • Kapadia, Westheimer & Gilbert '00
  • Parent & Zucker '89
  • Heitger & von der Heydt '93
  • Mumford '94
  • Williams & Jacobs '95

48
Outline of Experiments
  • Prior model of contours in natural images
  • First-order Markov model
  • Test of Markov property
  • Multi-scale Markov models
  • Information-theoretic evaluation
  • Contour synthesis
  • Good continuation algorithm and results

49
Contour Geometry
  • First-Order Markov Model
    (Mumford '94, Williams & Jacobs '95)
  • Curvature: white noise (independent from
    position to position)
  • Tangent t(s): random walk
  • Markov property: the tangent at the next
    position, t(s+1), only depends on the previous
    tangent t(s)

[Figure: contour with tangent t(s) at position s and t(s+1) at position s+1]
50
Test of Markov Property
Segment the contours at high-curvature positions
51
Prediction Exponential Distribution
  • If the first-order Markov property holds:
  • At every step, there is a constant probability p
    that a high curvature event will occur
  • High curvature events are independent from step
    to step
  • ⇒ The probability of finding a segment of
    length k with no high curvature is (1-p)^k

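A quick simulation of that argument, assuming a constant (hypothetical) per-step probability p of a high-curvature event; the empirical survival curve should match (1-p)^k:

    import numpy as np

    rng = np.random.default_rng(0)
    p = 0.1                                      # assumed per-step probability
    runs = rng.geometric(p, size=100_000) - 1    # steps before the first event

    for k in (1, 5, 10):
        print(k, (runs >= k).mean(), (1 - p) ** k)   # empirical vs. predicted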
52
Empirical Distribution
Exponential?
53
Empirical Distribution: Power Law
[Figure: probability density vs. contour segment length]
54
Power Laws in Nature
  • Power Laws widely found in nature
  • Brightness of stars
  • Magnitude of earthquakes
  • Population of cities
  • Word frequency in natural languages
  • Revenue of commercial corporations
  • Connectivity in Internet topology
  • Usually characterized by self-similarity and
    multi-scale phenomena

55
Multi-scale Markov Models
  • Assume knowledge of contour orientation at
    coarser scales

2nd Order Markov: P( t(s+1) | t(s), t^(1)(s+1) )
Higher Order Models: P( t(s+1) | t(s), t^(1)(s+1), t^(2)(s+1), … )

[Figure: tangent t(s) at position s and t(s+1) at position s+1, with coarser-scale orientations]
56
Information Gain in Multi-scale
14.6% of total entropy (at order 5)
H( t(s+1) | t(s), t^(1)(s+1), t^(2)(s+1), … )
57
Contour Synthesis
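The synthesized contours exist only as figures in this transcript; a minimal sketch of first-order Markov contour synthesis, assuming Gaussian curvature noise (the random-walk tangent model above):

    import numpy as np

    def synthesize_contour(n_steps=200, sigma=0.15, step=1.0, seed=0):
        """Random-walk tangent model: t(s+1) = t(s) + Gaussian curvature noise."""
        rng = np.random.default_rng(seed)
        theta = np.cumsum(rng.normal(0.0, sigma, n_steps))     # tangent t(s)
        steps = np.stack([np.cos(theta), np.sin(theta)], axis=1) * step
        return np.cumsum(steps, axis=0)                        # (n_steps, 2) points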
58
Multi-scale in Natural Images
  • Arbitrary viewing distance
  • Multi-scale in object shape

59
Conditioned on Object Size
[Figure: probability density vs. contour segment length, conditioned on object size]
60
Distribution of Region Convexity
61
Multi-scale Contour Completion
  • Coarse-to-Fine
  • Coarse scale completes large gaps
  • Fine scale detects details
  • Completed contours at coarser scales are used in
    the higher-order Markov models of the contour prior
    for finer scales
  • P( t(s+1) | t(s), t^(1)(s+1), … )

62
Multi-scale Example
[Figure panels: input, coarse scale, fine scale w/o multi-scale, fine scale w/ multi-scale]
63
Comparison: same number of edge pixels
[Figure: our result vs. Canny]
64
Comparison: same number of edge pixels
[Figure: our result vs. Canny]
65
Outline
  • Collect and Validate Data
  • Learn Local Boundary Model (Low-Level Cues)
  • Learn Pixel Affinity Model (Mid-Level Cues)
  • Good representation for segmentation algorithms
  • Keeps segmentation in a probabilistic framework
  • Discussion and Conclusion

66
Dataflow
[Diagram: Image → Region Cues and Edge Cues → Estimated Affinity (E) → Segment]
  • Eij: affinity between pixels i and j
  • Representation for graph-theoretic segmentation
    algorithms:
  • Minimum Spanning Trees - Zahn 1971, Urquhart 1982
  • Spectral Clustering - Scott/Longuet-Higgins 1990,
    Sarkar/Boyer 1996
  • Graph Cuts - Wu/Leahy 1993, Shi/Malik 1997,
    Felzenszwalb/Huttenlocher 1998,
    Gdalyahu/Weinshall/Werman 1999
  • Matrix Factorization - Perona/Freeman 1998
  • Graph Cycles - Jermyn/Ishikawa 2001

67
Pixel Affinity Cues
  1. Patch similarity (×3)
  2. Edges: strength of intervening contour (×3)
  3. Image plane distance
  • All cues calibrated with respect to training data
  • Goal: learn the affinity function from the
    dataset using these 7 cues

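A rough sketch of the intervening-contour idea behind the edge cues, assuming a precomputed Pb map: the maximum boundary probability crossed on the straight line between two pixels is mapped to a low affinity (function name and decay constant are illustrative):

    import numpy as np

    def intervening_contour_affinity(pb, p1, p2, sigma=0.1, n_samples=20):
        """Affinity ~ exp(-max Pb along the segment from pixel p1 to pixel p2)."""
        r = np.linspace(0.0, 1.0, n_samples)
        ys = np.round(p1[0] + r * (p2[0] - p1[0])).astype(int)
        xs = np.round(p1[1] + r * (p2[1] - p1[1])).astype(int)
        max_pb = pb[ys, xs].max()                 # strongest intervening boundary
        return np.exp(-max_pb / sigma)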
68
Dataflow
[Diagram: Image → Region Cues and Edge Cues → Estimated Affinity (E) → Segment]
69
Two Evaluation Methods
[Figure: estimated affinity Eij vs. ground-truth same-segment indicator Gij]
  • Precision-Recall of same-segment pairs
  • Precision is Pr(Gij = 1 | Eij > t)
  • Recall is Pr(Eij > t | Gij = 1)
  • Mutual Information between E and G

70
Mutual information
where x is a cue and y is the indicator of being in
the same segment
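The slide's formula did not survive the transcript; the standard definition presumably intended, for a binned cue x and the binary same-segment indicator y, is

    I(X;Y) = \sum_{x}\sum_{y} P(x,y)\,\log \frac{P(x,y)}{P(x)\,P(y)}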
71
Individual Features
[Figure: precision-recall of individual features, gradients vs. patches]
72
The Distance Cue
cf. Brunswik & Kamiya 1953
73
Feature Pruning
[Figure: top-down vs. bottom-up feature pruning]
4 good cues: texture edge/patch, color patch,
brightness edge. 2 poor cues: color edge,
brightness patch.
74
Affinity Model vs. Humans
75
Results
  1. Common wisdom: use patches only / use edges only.
     Finding: use both.
  2. Common wisdom: must use patches for texture.
     Finding: not true.
  3. Common wisdom: color is a powerful grouping cue.
     Finding: true, but texture is better.
  4. Common wisdom: brightness patches are a poor
     cue. Finding: true (shadows).
  5. Common wisdom: proximity is a (Gestalt) grouping
     cue. Finding: proximity is a result, not a cause,
     of grouping.

76
Outline
  1. Collect and Validate Data
  2. Learn Local Boundary Model (Low-Level Cues)
  3. Learn Pixel Affinity Model (Mid-Level Cues)
  4. Discussion and Conclusion

77
Contribution
  • Provide a mathematical foundation for the
    grouping problem in terms of the ecological
    statistics of natural images.
  • "When you can measure what you are speaking about
    and express it in numbers, you know something
    about it but when you cannot measure it, when
    you cannot express it in numbers, your knowledge
    is of the meager and unsatisfactory kind." --Lord
    Kelvin