Title: Computational Perception
1Computational Perception
- Jim Rehg
- CS 7636 Computational Perception
- Lecture 1-5
- Mon Jan 6, 2003
2CPR Curriculum Overview
Pattern Recognition CS 4803
Computer Vision CS 4495/7495
Machine Learning CS 4640
Intelligent Robotics CS 4630
Spring 03
Spring 03
Multi-Robot Systems CS 8803L
Multi-view Geometry CS 8803
Computational Perception CS 7635
Autonomous Robotics CS 4803/7630
Spring 03
3Course Objectives
- Explore problems and techniques in audio-visual human sensing, with an emphasis on learning from data.
- Techniques
- Graphical models
- Time series modeling (HMM, Kalman filter, SLDS)
- PCA and Factor Analysis
- Problems
- Face detection and recognition
- Figure tracking and gesture recognition
- Modeling motion and actions
- Speech recognition
4Overview
Auditory Scene Analysis
Computer Vision
Vision-Based Human Sensing
Speech Recognition
CS 7635
5A Basic Plan for the Semester
Representations, Constraints, and Priors which
are tuned for audio-visual tasks
6Overview of Graphical Models
- A graphical model represents a factored probability distribution, where nodes are random variables and arcs denote conditional dependence.
- Originally developed in a variety of contexts
- Probabilistic inference in expert systems
- Path analysis in ecology
- Graphical models provide a unifying framework for the statistical techniques used in audio-visual sensing: PCA, HMM, Kalman filter.
7Examples of Graphical Models
- Factoring P(A,B)
- Naïve Bayes Classifier
- Mixture density
- Principal Components Analysis (PCA)
- Factor Analysis
- Hidden Markov Model
- Linear Dynamic System
- Switching Linear Dynamic System
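As a sketch of what "factored" means for several of these examples, the standard textbook factorizations can be written as follows. The symbols (class C, features x_i, mixture component k, hidden states s_t, observations y_t) are notation chosen here for illustration and do not come from the slides.

```latex
\begin{align*}
P(A,B) &= P(A)\,P(B \mid A) \\
\text{Naive Bayes:}\quad P(C, x_1,\dots,x_n) &= P(C)\prod_{i=1}^{n} P(x_i \mid C) \\
\text{Mixture density:}\quad p(x) &= \sum_{k=1}^{K} P(k)\,p(x \mid k) \\
\text{Hidden Markov Model:}\quad P(s_{1:T}, y_{1:T}) &= P(s_1)\prod_{t=2}^{T} P(s_t \mid s_{t-1})\prod_{t=1}^{T} P(y_t \mid s_t)
\end{align*}
```

In each case every factor corresponds to one node conditioned on its parents in the graph.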
8Skin Detection
- Skin can be detected in images based on its color.
- Example of an attentional mechanism.
- Quickly finding skin patches can speed the search for faces, limbs, etc.
- Skin color is an example of a human invariant (along with faces and skeletal motion).
- Images containing people should contain skin.
- We can hope to build a universal skin color model.
9Physics of Skin Color
- Skin color is due to melanin and hemoglobin.
- Hue (normalized color) of skin is largely invariant across the human population.
- Saturation of skin color varies with concentration of melanin and hemoglobin (e.g. lips).
- Detailed color models exist for melanoma identification using calibrated illumination.
- But in general, observed skin color will be affected by the illuminant (e.g. web images).
10A Statistical Skin Color Model
- Joint work with Michael Jones at Compaq CRL circa 1998.
- M. Jones and J. M. Rehg, Statistical Color Models with Application to Skin Detection, IJCV, 2001.
- Data set
- 12,000 example photos sampled from a 2 million image set obtained from an AltaVista web crawl.
- Plan
- Construct skin and non-skin histograms from labeled pixels.
- Study distribution of skin color in web images.
- Compare effectiveness of histogram and mixture density models.
11Some Example Photos
Example skin images
Example non-skin images
12Manually Segmenting Skin
Example skin images are segmented by hand
13Skin Color Histogram
Segmented skin regions produce a histogram in RGB
space showing the distribution of skin colors.
Three views of the same skin histogram are shown
14Non-Skin Color Histogram
Three views of the same non-skin histogram
showing the distribution of non-skin colors
15Histogram Skin Model
Skin histogram gives

$$P(rgb \mid \text{skin}) = \frac{\text{count for bin } rgb}{\text{total skin count}}$$

Non-skin histogram gives

$$P(rgb \mid \text{non-skin}) = \frac{\text{count for bin } rgb}{\text{total non-skin count}}$$

Bayes rule yields

$$P(\text{skin} \mid rgb) = \frac{P(rgb \mid \text{skin})\,P(\text{skin})}{P(rgb \mid \text{skin})\,P(\text{skin}) + P(rgb \mid \text{non-skin})\,P(\text{non-skin})}$$
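The histogram model on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the toy pixel lists, and the choice of 32 bins per channel (one reading of the "size 32" mentioned later in the deck) are all assumptions.

```python
from collections import Counter

BINS = 32            # bins per channel (assumption, see lead-in)
STEP = 256 // BINS   # width of each bin for 8-bit channels


def bin_of(rgb):
    """Map an 8-bit (r, g, b) triple to its histogram bin index."""
    r, g, b = rgb
    return (r // STEP, g // STEP, b // STEP)


def build_histogram(pixels):
    """Count labeled pixels per bin; returns (counts, total)."""
    counts = Counter(bin_of(p) for p in pixels)
    return counts, sum(counts.values())


def likelihood(rgb, hist):
    """P(rgb | class) = count for bin rgb / total count for the class."""
    counts, total = hist
    return counts[bin_of(rgb)] / total if total else 0.0


def posterior_skin(rgb, skin_hist, nonskin_hist, p_skin):
    """P(skin | rgb) via Bayes rule with prior P(skin) = p_skin."""
    num = likelihood(rgb, skin_hist) * p_skin
    den = num + likelihood(rgb, nonskin_hist) * (1.0 - p_skin)
    # For a bin never seen in either class, fall back to the prior.
    return num / den if den > 0 else p_skin


# Toy labeled pixels (made up for illustration only)
skin_pixels = [(220, 170, 140), (225, 175, 150), (200, 150, 120), (90, 90, 90)]
nonskin_pixels = [(90, 90, 90), (95, 92, 91), (30, 120, 200), (10, 200, 30)]

skin_hist = build_histogram(skin_pixels)
nonskin_hist = build_histogram(nonskin_pixels)
```

Calling `posterior_skin((90, 90, 90), skin_hist, nonskin_hist, 0.5)` on this toy data weighs one skin count against two non-skin counts in the same grey bin, so the posterior falls below the prior.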
16Likelihood Ratio Test
- But what choice for P(skin)?
- Define likelihood ratio test
- a is a parameter for tuning the ratio test
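The defining formula did not survive on this slide. A likelihood ratio test for a two-class problem is conventionally written as in the first line below; treating the slide's tuning parameter a as the threshold on the ratio is an assumption. The second line shows why the test sidesteps the choice of P(skin): thresholding the posterior at some value τ (a symbol introduced here, with 0 < τ < 1) is equivalent to thresholding the ratio at a value that absorbs the prior.

```latex
\begin{align*}
\text{classify as skin if}\quad
\frac{P(rgb \mid \text{skin})}{P(rgb \mid \text{non-skin})} &\ge a \\
P(\text{skin} \mid rgb) \ge \tau
\;\Longleftrightarrow\;
\frac{P(rgb \mid \text{skin})}{P(rgb \mid \text{non-skin})}
&\ge \frac{\tau}{1-\tau}\cdot\frac{P(\text{non-skin})}{P(\text{skin})}
\end{align*}
```

Any choice of prior and posterior threshold therefore maps to a single value of a, which can be tuned directly.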
17Summary of Histogram Classifier
[Figure: diagrams with nodes L and C; panels labeled Parameter Optimization and Bayesian Inference]
18ROC Curve
ROC: Receiver Operating Characteristic (where the receiver was a radar antenna!)
[Plot axes: Correct Detection Rate vs. False Detection Rate]
19ROC Curve Summary
- ROC curve gives an application-independent measure of classifier performance.
- Performance reports based on a single point on the ROC curve are generally meaningless.
- Several possible scalar summaries
- Area under the ROC curve
- Equal error rate
- Compute ROC by iterating over the values of a
- Compute the correct detection and false positive rates on the testing set for each value and plot.
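The procedure in the last two bullets can be sketched as follows. This is a minimal illustration under stated assumptions: each test pixel is represented by its likelihood-ratio score, the scores and threshold values are made up, and the function names are hypothetical. The area under the curve is computed with the trapezoidal rule.

```python
def roc_points(skin_ratios, nonskin_ratios, thresholds):
    """Return sorted (false_positive_rate, correct_detection_rate) pairs.

    skin_ratios / nonskin_ratios: likelihood-ratio scores of test pixels
    whose true labels are skin / non-skin.
    """
    points = []
    for a in thresholds:
        detect = sum(s >= a for s in skin_ratios) / len(skin_ratios)
        false_pos = sum(s >= a for s in nonskin_ratios) / len(nonskin_ratios)
        points.append((false_pos, detect))
    return sorted(points)


def area_under_curve(points):
    """Trapezoidal area under a ROC curve given as (fpr, tpr) points."""
    # Add the trivial end points so the curve always spans [0, 1].
    pts = sorted([(0.0, 0.0)] + list(points) + [(1.0, 1.0)])
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Made-up likelihood ratios for a tiny test set (illustration only)
skin_scores = [4.0, 3.0, 2.5, 0.8]
nonskin_scores = [2.0, 0.5, 0.4, 0.1]
thresholds = [0.0, 0.45, 0.6, 1.0, 2.2, 3.5, 5.0]

curve = roc_points(skin_scores, nonskin_scores, thresholds)
auc = area_under_curve(curve)
```

Each threshold contributes one operating point, which is why a single reported point says little about the classifier as a whole.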
20Example Results
21Skin Detector Performance
Extremely good results considering only the color of a single pixel is being used.
Best published results (at the time).
One of the largest datasets used in a vision model (nearly 1 billion labeled pixels).
[Plot axes: Correct Detection Rate vs. False Detection Rate]
22Analyzing the color distributions
Why does it work so well?
2D color histogram for photos on the
web projected onto a slice through the
3D histogram
Surface plot of the 2D histogram
23Contour Plots
Full color model (includes skin and non-skin)
24Contour Plots Continued
Non-skin model
Skin model
25Basic Measurement Models
- Discrete measurement: Conditional Probability Table (CPT), i.e. histogram
- Continuous measurement: (Gaussian) Mixture Density
[Figure labels: s, n, y]
26Comparison to Mixture Models
- Both histogram and mixture models are examples of graphical models.
- Bin size controls generalization of histogram
- Size 32 gave the best performance
- Mixture models have often been used for skin color modeling in small sample size cases.
- We found histograms to give better accuracy
- They are also much faster to evaluate
- <Show figures from CRL technical report>
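To make the evaluation-cost remark concrete, here is a minimal sketch of evaluating a Gaussian mixture density at one RGB value. The diagonal covariance, the two components, and all parameter values are assumptions made for illustration; they are not the models from the study. A histogram needs one table lookup per pixel, whereas this needs one exponential per mixture component.

```python
import math


def gaussian_diag(x, mean, var):
    """Density of a Gaussian with diagonal covariance at point x."""
    norm = 1.0
    expo = 0.0
    for xi, mi, vi in zip(x, mean, var):
        norm *= math.sqrt(2.0 * math.pi * vi)
        expo += (xi - mi) ** 2 / vi
    return math.exp(-0.5 * expo) / norm


def mixture_density(x, weights, means, variances):
    """p(x) = sum_k w_k * N(x; mean_k, diag(var_k))."""
    return sum(w * gaussian_diag(x, m, v)
               for w, m, v in zip(weights, means, variances))


# Hypothetical two-component mixture over RGB (parameters are made up)
weights = [0.6, 0.4]
means = [(200.0, 150.0, 120.0), (230.0, 180.0, 160.0)]
variances = [(400.0, 300.0, 300.0), (200.0, 200.0, 250.0)]

p = mixture_density((210.0, 160.0, 130.0), weights, means, variances)
```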
27Adult Image Detection
- Observation: Adult images usually contain large areas of skin.
- Output of skin detector can be used to create a feature vector for an image.
- Adult image classifier trained on feature vectors
- Exploring joint image/text analysis
[Block diagram labels: Image, Skin Detector, Skin Features, Neural net Classifier, Adult?; HTML, Text Features, Classifier]
28Adult Detection Examples
These images are all correctly classified as
adult images.
29More Examples
Classified as not adult
Incorrectly classified as adult: close-ups of faces are a failure mode for the image-based detector
Classified as not adult
Classified as not adult
30Performance of Adult Image Detector
31Adult Image Detection Results
Two sets of HTML pages collected.
Crawl A: Adult sites (2365 pages, 11323 images).
Crawl B: Non-adult sites (2692 pages, 13973 images).

                                                image-based   text-based   combined OR
                                                detector      detector     detector
                                                -----------   ----------   -----------
% of adult images rated correctly (set A)       85.8          84.9         93.9
% of non-adult images rated correctly (set B)   92.5          98.9         92.0
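The pattern in these numbers is what an OR combination would produce, assuming "combined OR" means an image is rated adult when either detector fires (an interpretation, not something the slide states). The sketch below checks the reported rates against the bounds any OR rule must satisfy; the function name is hypothetical.

```python
def or_rate_bounds(rate_a, rate_b):
    """Bounds on how often (A OR B) fires, given each detector's firing rate.

    The OR fires at least as often as the more active detector and at most
    as often as both combined (union bound), capped at 1.
    """
    return max(rate_a, rate_b), min(1.0, rate_a + rate_b)


# Rates from the results table as fractions: (image-based, text-based, combined)
adult_hit = (0.858, 0.849, 0.939)          # adult images rated adult (set A)
nonadult_correct = (0.925, 0.989, 0.920)   # non-adult images rated non-adult (set B)

# Set A: hits from either detector count, so the detection rate can only rise.
hit_lo, hit_hi = or_rate_bounds(adult_hit[0], adult_hit[1])

# Set B: false alarms accumulate too, so the correct rate can only fall.
fa_lo, fa_hi = or_rate_bounds(1 - nonadult_correct[0], 1 - nonadult_correct[1])
combined_false_alarm = 1 - nonadult_correct[2]
```

Both reported combined rates fall inside these bounds: detection on set A rises above either single detector, while the false-alarm rate on set B is at least that of the image-based detector.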
32Cost Analysis
- General image properties
- Average width 301 pixels
- Average height 269 pixels
- Time to read an image .078 sec
- Skin Color Based Adult Image Detector
- Time to classify .043 sec
- Implies 23 images/sec throughput
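The throughput figure follows from the classification time alone. The second line below is an added calculation, not a number from the slides: it assumes each image is read and then classified sequentially, using the read time listed above.

```latex
\begin{align*}
\frac{1}{0.043\ \text{s}} &\approx 23.3\ \text{images/s} \quad \text{(classification only)} \\
\frac{1}{0.078\ \text{s} + 0.043\ \text{s}} = \frac{1}{0.121\ \text{s}} &\approx 8.3\ \text{images/s} \quad \text{(read plus classify)}
\end{align*}
```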
33Application to Adult Image Filtering
- Adult photo detector based on skin features.
- Face analysis could prevent portraits from being blocked due to skin content.
- Could provide crude browser-side filtering.
- Complements page rating services
- Automatic ranking of crawled pages.
- Not a substitute for manual inspection.
- Focus attention on most likely offensive pages.
- Text analysis can be used to improve accuracy.
34Person Detection From Skin Detection
- Skin detector gives evidence for the presence of people, but has false positives and negatives.
- Use skin detector output for person detection
- Construct feature vector from detected skin pixels.
- Classify image into person/non-person
- Features
- Percent of pixels in image detected as skin
- Average probability of skin pixels
- Largest connected component of skin
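The three features listed above could be computed from a per-pixel skin probability map roughly as follows. This is a sketch under several assumptions: the 0.5 detection threshold, 4-connectivity, averaging only over detected skin pixels, and normalizing the component size by image area are illustrative choices, and `skin_features` is a hypothetical name.

```python
from collections import deque


def skin_features(prob_map, threshold=0.5):
    """Return (fraction_skin, mean_skin_prob, largest_component_fraction).

    prob_map: 2D list of per-pixel skin probabilities in [0, 1].
    threshold: assumed cutoff for calling a pixel "skin".
    """
    rows, cols = len(prob_map), len(prob_map[0])
    mask = [[p >= threshold for p in row] for row in prob_map]
    skin_probs = [p for row in prob_map for p in row if p >= threshold]
    n_pixels = rows * cols

    # Largest 4-connected component of skin pixels, via breadth-first search
    seen = [[False] * cols for _ in range(rows)]
    largest = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, size = deque([(r, c)]), 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            largest = max(largest, size)

    fraction_skin = len(skin_probs) / n_pixels
    mean_skin_prob = sum(skin_probs) / len(skin_probs) if skin_probs else 0.0
    return fraction_skin, mean_skin_prob, largest / n_pixels


# Toy 3x4 probability map (made up for illustration)
toy_map = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.2, 0.1, 0.6],
    [0.0, 0.1, 0.0, 0.0],
]
features = skin_features(toy_map)
```

The resulting triple is the kind of feature vector that a person/non-person classifier would then be trained on.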
35Person Detection Example Results
Person
Person
No Person
36Person Detection Results Continued
No Person
No Person
Person
37Person Detector Performance
Two classifiers were built using these measures on 1400 training images. A test set of 456 images was used to evaluate the classifier.

Classifier Performance
                  Training examples   Testing examples
Neural network    76.2                74.3
Decision tree     75.8                72.1
38Applications of Person Detection
- "Person Detected" tag for media search
- Skin and face analysis tag photos and video frames with people in them.
- Improved ranking of query returns: photos of people appear at top of list.
- Image similarity measure
- Photos with people in them are grouped together.
- Can be used during query refinement.
39Summary
- What are the factors that made skin detection successful?
- Problem which seemed hard a priori but turned out to be easy (classes surprisingly separable).
- Low dimensionality makes adequate data collection feasible and classifier design a non-issue.
- Intrinsic dimensions are clear a priori
- Concentration of nonskin model along grey line is completely predictable from the design of perceptual color spaces
40Summary
- Additional factors in success of skin detection
- Assignment of output class (skin vs. nonskin) is straightforward.
- Computational cost is low because it is unnecessary to consider spatial arrangement of pixels.
- Problems with richer output classes (adult image, happy faces, etc.) will be less favorable in all of these aspects.
- But our line of attack will be the same!