Recovering Human Body Configurations: Combining Segmentation and Recognition

Slides: 35
Provided by: matthe95
Learn more at: https://cseweb.ucsd.edu

Transcript and Presenter's Notes


1
Recovering Human Body Configurations: Combining
Segmentation and Recognition
  • Greg Mori, Xiaofeng Ren, and Jitendra Malik
    (UC Berkeley)
  • Alexei A. Efros (Oxford)

2
The goal
  • Given an image
  • Detect a human figure
  • Localize joints and limbs
  • Create a skeleton of their pose
  • Create a segmentation mask of the person

3
Other approaches: Simple features
  • Model people as generalized cylinders (1980s)
  • Easily implemented bottom up
  • Often use tree to express relations
  • Problems:
      • Cylinder-like shapes are common in images
      • Body parts often depend on one another
      • Really need context

4
Other approaches: Probable pose
  • Often use probable pose
  • Template matching
  • Top down constraints on pose
  • But even highly improbable poses are still
    possible

5
Other approaches: Frequent simplifications
  • Nude models
  • Limited poses
  • Background subtraction or limited clutter

6
Arguably the most difficult recognition problem
in computer vision
  • Variation in clothing
  • Variation in limbs
  • Variation in pose

7
Solution: Islands of Saliency
  • Use low-level features that are informative
    independent of context
  • Based on these islands, one is able to fill in
    gaps with context

8
Algorithm
9
Algorithm: Segmenting into regions and superpixels
10
Segmentation
  • Combine boundary finder (Martin et al., 2002)
    with Normalized Cuts (Malik, Belongie, et al.,
    2001)
  • Groups similar pixels into regions

11
Segmentation: Regions
  • 40 regions
  • Most salient parts of body become regions
  • Limbs usually split into two half-limbs

12
Segmentation: Superpixels
  • 200 regions (oversegmentation)
  • Retains virtually all structures in original image
  • Still reduces complexity from 400,000 pixels to
    200 superpixels

13
Algorithm: Finding salient limbs and torsos
14
Finding limbs
  • Candidates: all 40 regions
  • Four cues for half-limb detection
  • Contour: probability of the boundary
      • Average probability of the region's boundary, as
        measured by Martin's boundary finder
  • Shape: how close to a rectangle
      • Area of overlap with reconstructed rectangle

15
Finding limbs
  • Shading
      • Limbs are roughly cylindrical, so should have 3D
        pop-out due to shading
      • Compare Ix-, Ix+, Iy-, Iy+ for region to mean of
        Ix-, Ix+, Iy-, Iy+ for training set
  • Focus cue
      • Background is often not in focus
      • Cfocus = Ehigh / (a Elow + b)
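
The focus cue can be sketched as a small function, assuming the formula reads Cfocus = Ehigh / (a Elow + b), with Ehigh and Elow the high- and low-frequency energy of a region. How those energies are measured is not stated on the slide, so they are taken as inputs here, and the default values of a and b are placeholders rather than the authors' constants.

```python
def focus_cue(e_high: float, e_low: float, a: float = 1.0, b: float = 1.0) -> float:
    """Ratio of high-frequency energy to scaled, offset low-frequency energy.

    a and b are hypothetical constants; the slides do not give their values.
    """
    denom = a * e_low + b
    if denom <= 0:
        raise ValueError("a * e_low + b must be positive")
    return e_high / denom


# A sharp region (more high-frequency energy) scores higher than a blurry one.
sharp = focus_cue(e_high=4.0, e_low=1.0)
blurry = focus_cue(e_high=0.5, e_low=1.0)
```

The offset b keeps the ratio bounded for regions with almost no low-frequency energy.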

16
Finding limbs
  • Cues are combined by summing
  • Use logistic regression to learn weights
    (training set of hand-labeled half-limbs)
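
A minimal sketch of how a weighted sum of cues can be turned into a single half-limb score with a logistic function. The cue order, weights, and bias below are illustrative assumptions; in the talk the weights are learned by logistic regression from hand-labeled half-limbs, and that training step is not shown here.

```python
import math


def combine_cues(cues, weights, bias=0.0):
    """Logistic of the weighted sum of cue values; returns a score in [0, 1]."""
    if len(cues) != len(weights):
        raise ValueError("cues and weights must have the same length")
    z = bias + sum(w * c for w, c in zip(weights, cues))
    # Numerically stable logistic for both signs of z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical cue values in the order: contour, shape, shading, focus.
score = combine_cues([0.8, 0.6, 0.3, 0.5], weights=[2.0, 1.0, 0.5, 0.5], bias=-1.5)
```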

17
Evaluation: Cues
(Plot: number of hits vs. number of candidates generated)
18
Evaluation: Performance
19
Evaluation summary
  • Not very good detectors
  • Boundary strength is the best single cue
  • Combining cues yields better performance
  • On average 4.08 of top 8 candidates produced were
    hits
  • 89% have at least 3 hits among top 8
  • Motivates search for 3 half-limbs combined with
    head and torso

20
Finding torsos
  • Unlike half-limbs, a torso typically spans several regions
  • Consider all sets of adjacent regions within some
    range of total sizes
  • Set of cues
  • Contour
  • Shape
  • Focus
  • (No shading)

21
Finding torsos
  • Find orientation of torso
  • Find best matching head
  • Again uses contour, shape, and focus cues, with a
    disk as the shape model
  • Scores for torso, head, and relative position of
    head to torso are multiplied to give the score
    for the oriented torso
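
The multiplicative scoring above can be sketched as follows. Choosing the "best matching head" as the one that maximizes head score times relative-position score is an assumption, and the function and argument names are hypothetical.

```python
def oriented_torso_score(torso_score, head_candidates):
    """Multiply the torso score by the best (head score * position score).

    head_candidates: iterable of (head_score, relative_position_score) pairs.
    Returns 0.0 when no head candidate is available.
    """
    best_head = max((h * r for h, r in head_candidates), default=0.0)
    return torso_score * best_head


# Two hypothetical head candidates for one torso.
example = oriented_torso_score(0.5, [(0.8, 0.5), (0.6, 1.0)])
```

Because the terms are multiplied, a torso with no plausible head ends up with a low oriented-torso score even if its own cues are strong.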

22
(No Transcript)
23
Evaluation
  • Success if all four torso points within 60 pixels
    of ground truth

24
Algorithm: Pruning to form partial configurations
25
Body building
  • From 5-7 half-limbs and 50 candidate oriented
    torsos, form partial configurations consisting of:
      • A torso
      • Three half-limbs, each assigned to one of 8
        half-limb body parts and one of two polarities
  • 2-3 million partial configurations!
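
One plausible way to see where a count in the millions comes from is sketched below. The counting rule is an assumption (choose 3 of the candidate half-limbs, give them 3 distinct body-part labels out of 8, and pick one of two polarities for each); the slides do not spell out the exact enumeration.

```python
from math import comb, perm


def count_partial_configurations(n_half_limbs, n_torsos=50, n_parts=8,
                                 n_polarities=2, limbs_per_config=3):
    """Count torso + three-half-limb configurations under the stated assumptions."""
    limb_subsets = comb(n_half_limbs, limbs_per_config)   # which half-limbs are used
    part_labelings = perm(n_parts, limbs_per_config)      # distinct body-part labels
    polarity_choices = n_polarities ** limbs_per_config   # orientation of each half-limb
    return n_torsos * limb_subsets * part_labelings * polarity_choices


for n in (5, 6, 7):
    print(n, count_partial_configurations(n))
# 5 1344000
# 6 2688000
# 7 4704000
```

Under these assumptions the count lands in the low millions for 5-7 half-limbs, the same order of magnitude as the figure quoted on the slide.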

26
Enforce constraints
  • Relative widths
      • Foreshortening doesn't affect width of limbs much
      • Use anthropometric data to rule out limbs more
        than 4 standard deviations wider than expected
  • Length of limbs relative to torso
      • Assume torso not too foreshortened
      • No more than ±40° angle with image plane
      • Again, prune limbs more than 4 standard
        deviations away from mean length, relative to
        torso
  • Seems to be making some assumptions of probable
    pose
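
The two size constraints can be sketched as a single filter. Measuring both width and length as ratios to the torso is an assumption, and the means and standard deviations in the example are made-up placeholders, not real anthropometric statistics.

```python
def plausible_limb(width_ratio, length_ratio,
                   width_mean, width_std,
                   length_mean, length_std, k=4.0):
    """Return True if a candidate half-limb survives both size constraints.

    Width test is one-sided (reject only limbs that are too wide);
    length test is two-sided (reject limbs too long or too short).
    """
    too_wide = width_ratio > width_mean + k * width_std
    bad_length = abs(length_ratio - length_mean) > k * length_std
    return not (too_wide or bad_length)


# Hypothetical statistics: width 0.25 +/- 0.05, length 0.5 +/- 0.1 (torso units).
stats = dict(width_mean=0.25, width_std=0.05, length_mean=0.5, length_std=0.1)
keep = plausible_limb(0.30, 0.50, **stats)      # within both limits
too_wide = plausible_limb(0.50, 0.50, **stats)  # wider than mean + 4 sigma
too_long = plausible_limb(0.30, 1.00, **stats)  # length 5 sigma from mean
```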

27
Enforce constraints
  • Adjacency
      • Upper limbs must be adjacent to torso
      • Lower limbs must be adjacent to upper limbs
  • Symmetry in clothing: color histograms must not
    be overly dissimilar for corresponding segments
      • E.g. right and left upper arms should be similar
  • Makes some small assumptions about variations in
    clothing
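
A minimal sketch of the clothing-symmetry test. The slides do not say which histogram distance or threshold is used; the chi-squared distance and the cutoff of 0.5 below are stand-ins chosen for illustration.

```python
def chi_squared_distance(h1, h2):
    """Chi-squared distance between two histograms after normalization (0 to 1)."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    s1, s2 = sum(h1), sum(h2)
    if s1 <= 0 or s2 <= 0:
        raise ValueError("histograms must have positive total mass")
    d = 0.0
    for a, b in zip(h1, h2):
        p, q = a / s1, b / s2
        if p + q > 0:
            d += (p - q) ** 2 / (p + q)
    return 0.5 * d


def symmetric_enough(h_left, h_right, max_distance=0.5):
    """Keep a limb pair only if its color histograms are not too dissimilar."""
    return chi_squared_distance(h_left, h_right) <= max_distance


# Hypothetical 3-bin color histograms for left and right upper arms.
ok = symmetric_enough([10, 5, 1], [9, 6, 1])
```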

28
Body building: slimming down
  • Reduces to ~1000 partial configurations
  • Sorted by linear combination of the torso and the
    three half-limb scores
  • (This score can be used to improve torso
    detection)
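
Ranking the surviving configurations can be sketched as below. The equal weights and the tuple layout are assumptions; the slides only say the score is a linear combination of the torso score and the three half-limb scores.

```python
def configuration_score(torso_score, limb_scores, w_torso=1.0, w_limb=1.0):
    """Linear combination of one torso score and the half-limb scores."""
    return w_torso * torso_score + w_limb * sum(limb_scores)


def rank_configurations(configs):
    """Sort (torso_score, limb_scores) pairs from best to worst."""
    return sorted(configs,
                  key=lambda c: configuration_score(c[0], c[1]),
                  reverse=True)


# Three hypothetical partial configurations.
ranked = rank_configurations([
    (0.4, [0.5, 0.5, 0.5]),
    (0.9, [0.8, 0.7, 0.6]),
    (0.2, [0.1, 0.3, 0.2]),
])
```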

29
Algorithm
30
Extending to full limbs
  • Add rectangles, evaluated on adjacent superpixels,
    at limb joints that are still empty
  • Want high internal similarity and high
    dissimilarity to surroundings

31
Algorithm
32
(No Transcript)
33
(No Transcript)
34
Summary
  • Arguably the most difficult problem in computer
    vision
  • Not solved here
  • Method here is appealing
      • Don't need to store exemplars
      • Islands-of-saliency approach seems useful in many
        contexts
      • Use some configural knowledge to make reasonable
        guesses
  • Good illustration of integrating recognition and
    segmentation