Scenes and objects - PowerPoint PPT Presentation

Slides: 105
Provided by: antoniot

Transcript and Presenter's Notes

Title: Scenes and objects


1
6.870 Object Recognition and Scene Understanding
http://people.csail.mit.edu/torralba/courses/6.870/6.870.recognition.htm
  • Lecture 6
  • Scenes and objects

2
Class business
  • Next Wednesday

3
  • Week 2: Objects without scenes
  • Week 5: Scenes without objects
  • Week 6: Scenes and objects

4
Why is detection hard?
5
Standard approach to scene analysis
6
Is local information enough?
7
With hundreds of categories
If we have 1000 categories (detectors), and each
detector produces 1 false alarm (fa) every 10
images, we will have 100 false alarms per image:
pretty much garbage
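The arithmetic on this slide can be checked directly. A minimal sketch, in which the function name is illustrative and not from the lecture; the result follows from linearity of expectation, so no independence between detectors is assumed:

```python
def expected_false_alarms_per_image(num_detectors, false_alarm_rate_each):
    """Expected number of false alarms per image, summed over all detectors."""
    return num_detectors * false_alarm_rate_each


# 1000 detectors, each producing 1 false alarm every 10 images.
total = expected_false_alarms_per_image(1000, 1 / 10)
print(total)
```

Lowering the total to one false alarm per image would require each detector to fire falsely only once every 1000 images.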
8
Is local information even enough?
9
Is local information even enough?
(Figure: information as a function of distance,
with curves for local features and contextual
features)
10
The system does not care about the scene, but we
do
We know there is a keyboard present in this scene
even if we cannot see it clearly.
11
The multiple personalities of a blob
12
The multiple personalities of a blob
16
Look-Alikes by Joan Steiner
17
Look-Alikes by Joan Steiner
18
Look-Alikes by Joan Steiner
19
Why is context important?
  • Changes the interpretation of an object (or its
    function)
  • Context defines what an unexpected event is

20
The influence of an object extends beyond its
physical boundaries
21
The context challenge
How far can you go without using an object
detector?
22
What are the hidden objects?
1
2
23
What are the hidden objects?
Chance: 1/30000
25
The importance of context
  • Cognitive psychology
    • Palmer 1975
    • Biederman 1981
  • Computer vision
    • Noton and Stark (1971)
    • Hanson and Riseman (1978)
    • Barrow & Tenenbaum (1978)
    • Ohta, Kanade & Sakai (1978)
    • Haralick (1983)
    • Strat and Fischler (1991)
    • Bobick and Pinhanez (1995)
    • Campbell et al. (1997)

26
Biederman 1972
  • Arrow appeared before or after picture.
  • Selected object from 4 pictures.

29
Biederman 1972
  • Better accuracy with normal scene and with
    pre-cue.
  • Coherence of surroundings affected object
    perception.
  • But jumbled pictures had unnatural edge
    artifacts.

30
Palmer 1975
  • Scene preceded object to identify.
  • Better identification when preceded by a
    semantically consistent scene.

Objects seen for 20, 40, 60 or 120 ms.
31
Palmer
  • Scenes shown ahead of time for 2 s.
  • More accurate recognition of consistent objects
    than inconsistent objects.
  • Similar looking objects were misnamed, showing a
    bias effect.

32
Loftus & Mackworth
  • Inconsistent objects fixated earlier and longer.
  • Suggested additional processing of objects out of
    context.
  • Similar results found by Friedman (1979).

33
De Graef et al. 1990
  • Prior results due to memory task?
  • Measured eye movements during non-object search
    task.

34
De Graef et al.
  • Inconsistent objects fixated longer than
    consistent objects.
  • Consistency effect only occurred after several
    fixations, 2 s.
  • Consistency effect not initially present in scene
    processing.

35
Object Detection
  • Biederman et al. 1982, relational violations

37
Biederman 1982
  • Pictures shown for 150 ms.
  • Objects in appropriate context were detected more
    accurately than objects in an inappropriate
    context.
  • Scene consistency affects object detection.

38
Objects and Scenes
  • Biederman's violations (1981)

39
Support
Golconde (René Magritte)
40
Interposition
Blank Check (René Magritte)
41
Size
The Listening Room (René Magritte)
42
Position, Probability
Personal Values (René Magritte)
43
Object Consistencies
Biederman et al. (1982), De Graef et al. (1990).
44
Object Consistencies
Examples of inconsistencies
Biederman et al. (1982), De Graef et al. (1990).
45
Contextual cueing
Chun & Jiang, 1998
46
Object priming
Increasing contextual information
Torralba, Sinha, Oliva, VSS 2001
47
Object priming
Torralba, Sinha, Oliva, VSS 2001
48
Object priming
Car, pedestrian, mailbox,
?
p(object | scene)
Torralba, Sinha, Oliva, VSS 2001
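One simple way to obtain a priming term p(object | scene) is to count how often each object appears in annotated images of each scene type. A minimal sketch, assuming a tiny hand-made annotation list; the data, the scene names, and the function `object_priors` are hypothetical and are not taken from the cited work:

```python
from collections import Counter, defaultdict

# Hypothetical annotations: (scene label, set of objects present in the image).
annotations = [
    ("street", {"car", "pedestrian", "mailbox"}),
    ("street", {"car", "pedestrian"}),
    ("street", {"car"}),
    ("office", {"keyboard", "screen"}),
    ("office", {"keyboard", "screen", "mouse"}),
]


def object_priors(annotations):
    """Estimate p(object present | scene) as a relative frequency."""
    scene_counts = Counter()
    object_counts = defaultdict(Counter)
    for scene, objects in annotations:
        scene_counts[scene] += 1
        for obj in objects:
            object_counts[scene][obj] += 1
    return {
        scene: {obj: n / scene_counts[scene] for obj, n in counts.items()}
        for scene, counts in object_counts.items()
    }


priors = object_priors(annotations)
for obj, p in sorted(priors["street"].items()):
    print(f"p({obj} | street) = {p:.2f}")
```

Objects never seen in a scene are simply absent from the table here; a real system would smooth the counts so that unseen objects keep a small nonzero probability.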
49
Object priming
Torralba, Sinha, Oliva, VSS 2001
50
Examples of consistent scenes (a), inconsistent
scenes (b), and isolated objects and backgrounds
(c), from Davenport & Potter, 2004
51
But do we really need context?
52
Hollingworth & Henderson
  • Concerns with object detection studies
    • Object label could bias results.
    • Location cue selectively helpful for consistent
      objects.
  • Controlled for false alarm biases with post-cue
    and 2AFC (two-alternative forced choice).
  • Failed to find consistency effects.

53
Hollingworth & Henderson
  • Post-cue
  • 2AFC with object labels
    • Both consistent or inconsistent.
  • 2AFC with token discrimination.
    • E.g. sports car or sedan.
  • Proposed functional isolation model.

54
Who needs context anyway? We can recognize
objects even out of context
Banksy
55
Getting stuck
56
  • We need some signal to go up in order for
    top-down to work

57
Looking outside the bounding box
Outside the object (contextual features)
Inside the object (intrinsic features)
Object size
Pixels
Parts
Global appearance
Local context
Global context
Kruppa & Schiele (03); Fink & Perona (03);
Carbonetto, de Freitas & Barnard (03); Kumar &
Hebert (03); He, Zemel & Carreira-Perpinan (04);
Moore, Essa, Monson & Hayes (99); Strat & Fischler
(91); Torralba (03); Murphy, Torralba & Freeman
(03)
Agarwal & Roth (02); Moghaddam & Pentland (97);
Turk & Pentland (91); Vidal-Naquet & Ullman (03);
Heisele et al. (01); Krempp, Geman & Amit (02);
Dorko & Schmid (03); Fergus, Perona & Zisserman
(03); Fei-Fei, Fergus & Perona (03); Schneiderman
& Kanade (00); Lowe (99); etc.
58
CONDOR system
Strat and Fischler (1991)
  • Guzman (SEE), 1968
  • Noton and Stark 1971
  • Hanson & Riseman (VISIONS), 1978
  • Barrow & Tenenbaum, 1978
  • Brooks (ACRONYM), 1979
  • Marr, 1982
  • Ohta & Kanade, 1978
  • Yakimovsky & Feldman, 1973

59
An Age of Scene Understanding
Ohta & Kanade, 1978
  • Guzman (SEE), 1968
  • Noton and Stark 1971
  • Hanson & Riseman (VISIONS), 1978
  • Barrow & Tenenbaum, 1978
  • Brooks (ACRONYM), 1979
  • Marr, 1982
  • Ohta & Kanade, 1978
  • Yakimovsky & Feldman, 1973

60
Current approaches
  1. Scene to object dependencies
  2. Object to object dependencies

61
Levels of context
  • Context in low-level vision
  • Part-based models
  • Object relations

Fixed graph structures can be useful approximations
Long-range connections, weak constraints, multimodal

62
Current approaches
  1. Scene to object dependencies
  2. Object to object dependencies

63
Many object types co-occur
64
...but this co-occurrence has a hidden common
cause: the scene (e.g. streets, offices)
It is easier to first recognize the scene and
then predict object presence than to run local
object classifiers
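The "hidden common cause" idea can be made concrete with a two-scene toy model. A minimal sketch, assuming made-up probabilities and assuming objects are conditionally independent given the scene; all numbers and names below are hypothetical:

```python
# Hypothetical scene prior and per-scene object presence probabilities.
scene_prior = {"street": 0.5, "office": 0.5}
p_obj_given_scene = {
    "street": {"car": 0.9, "pedestrian": 0.8},
    "office": {"car": 0.05, "pedestrian": 0.1},
}


def marginal(obj):
    """p(obj), obtained by summing over the hidden scene."""
    return sum(scene_prior[s] * p_obj_given_scene[s][obj] for s in scene_prior)


def joint(obj_a, obj_b):
    """p(obj_a, obj_b), with objects independent once the scene is known."""
    return sum(
        scene_prior[s] * p_obj_given_scene[s][obj_a] * p_obj_given_scene[s][obj_b]
        for s in scene_prior
    )


p_car = marginal("car")
p_ped = marginal("pedestrian")
p_both = joint("car", "pedestrian")
p_ped_given_car = p_both / p_car

print(f"p(car) * p(pedestrian) = {p_car * p_ped:.4f}")
print(f"p(car, pedestrian)     = {p_both:.4f}")
print(f"p(pedestrian | car)    = {p_ped_given_car:.4f}")
```

Although the two objects never influence each other directly in this model, seeing a car raises the probability of a pedestrian, because both point to the same scene.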
65
The layered structure of scenes
Assuming a human observer standing on the ground
In a display with multiple targets present, the
location of one target constrains the y
coordinate of the remaining targets, but not the
x coordinate.
66
The layered structure of scenes
Assuming a human observer standing on the ground
p(x2 | x1)
p(x)
In a display with multiple targets present, the
location of one target constrains the y
coordinate of the remaining targets, but not the
x coordinate.
Torralba, Oliva, Castelhano, Henderson. In press.
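The claim about the y and x coordinates can be illustrated with a toy simulation. A minimal sketch, assuming a made-up generative model in which every target in an image sits near a shared ground-level row while its horizontal position is uniform; the ranges, noise level, and function names are hypothetical:

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def sample_two_targets(rng):
    """Two targets that share a vertical layer but have free x positions."""
    layer_y = rng.uniform(0.3, 0.7)  # assumed row of the shared layer
    targets = []
    for _ in range(2):
        x = rng.uniform(0.0, 1.0)            # x is unconstrained
        y = layer_y + rng.gauss(0.0, 0.02)   # y stays close to the layer
        targets.append((x, y))
    return targets


rng = random.Random(0)
scenes = [sample_two_targets(rng) for _ in range(2000)]
corr_x = pearson([s[0][0] for s in scenes], [s[1][0] for s in scenes])
corr_y = pearson([s[0][1] for s in scenes], [s[1][1] for s in scenes])
print(f"correlation of x1 and x2: {corr_x:.2f}")
print(f"correlation of y1 and y2: {corr_y:.2f}")
```

In this model, knowing y1 sharply narrows the distribution of y2, whereas knowing x1 leaves x2 essentially uniform.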
67
Detecting faces without a face detector
Torralba & Sinha, 01; Torralba, 03
68
Context-based vision system for place and object
recognition
We use 17 annotated sequences for training
  • Hidden states: location (63 values)
  • Observations: v^G_t (80 dimensions)
  • Transition matrix encodes topology of environment
  • Observation model is a mixture of Gaussians
    centered on prototypes (100 views per place)

Torralba, Murphy, Freeman and Rubin. ICCV 2003
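The bullets above describe a hidden Markov model over places. A minimal sketch of forward filtering for such a model, assuming three places, 2-D observations, and equal-weight isotropic Gaussians around each prototype; the sizes, numbers, and function names are hypothetical stand-ins for the 63 places and 80-dimensional features on the slide:

```python
import math


def mixture_likelihood(observation, prototypes, sigma):
    """p(v | place): equal-weight isotropic Gaussians centered on prototypes."""
    dim = len(observation)
    norm = (2.0 * math.pi * sigma ** 2) ** (-dim / 2.0)
    total = 0.0
    for proto in prototypes:
        sq_dist = sum((o - p) ** 2 for o, p in zip(observation, proto))
        total += norm * math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total / len(prototypes)


def forward_filter(observations, transition, prototypes_per_place, prior, sigma=0.5):
    """Return the filtered belief over places after each observation."""
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Predict: push the previous belief through the transition matrix.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update: weight each place by how well it explains the observation.
        weighted = [
            predicted[j] * mixture_likelihood(obs, prototypes_per_place[j], sigma)
            for j in range(n)
        ]
        z = sum(weighted)
        belief = [w / z for w in weighted]
        history.append(belief)
    return history


# Toy corridor 0 - 1 - 2: places 0 and 2 are not directly connected.
transition = [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]]
prototypes_per_place = [
    [(0.0, 0.0), (0.2, 0.1)],
    [(2.0, 2.0), (2.1, 1.8)],
    [(4.0, 0.0), (3.9, 0.2)],
]
observations = [(0.1, 0.0), (1.9, 2.0), (4.0, 0.1)]
history = forward_filter(observations, transition, prototypes_per_place, [1 / 3] * 3)
most_likely = [max(range(3), key=b.__getitem__) for b in history]
print(most_likely)
```

The zeros in the transition matrix encode which places are not adjacent, which is how topology enters the estimate.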
69
Our mobile rig
Torralba, Murphy, Freeman, Rubin. 2003
70
Place recognition demo
Shows the category and the identity of the place
when the system is confident. Runs at 4 fps in
Matlab.
Input image (120x160)
71
Identification and categorization of known places
(Figure: ground truth vs. system estimate plotted
against frame number, for specific location,
location category, and indoor/outdoor. Labels
include Building 400 and Outdoor AI-lab.)
72
(Diagram blocks: steerable pyramid, scene
features, previous place, place recognition,
object priming, expected object position)
73
Application of object detection for image
retrieval
Results using the keyboard detector alone
74
An integrated model of Scenes, Objects, and Parts
Scene (S)
N_car
P(N_car | S = street)
P(N_car | S = park)
Scene gist features
(Histograms over N, with axis ticks at 0, 1, 5)
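The scene-conditioned count distributions P(Ncar | S) can be combined with a scene estimate from the gist features by summing over scenes. A minimal sketch, assuming made-up histograms over 0 to 5 cars and a made-up scene posterior; none of these numbers come from the lecture:

```python
# Hypothetical P(N_car = n | S) for n = 0..5.
p_count_given_scene = {
    "street": [0.05, 0.15, 0.25, 0.25, 0.20, 0.10],
    "park": [0.70, 0.20, 0.05, 0.03, 0.01, 0.01],
}

# Hypothetical P(S | gist), e.g. the output of a scene classifier.
p_scene_given_gist = {"street": 0.8, "park": 0.2}


def count_distribution(p_scene, p_count):
    """P(N_car = n | gist) = sum over S of P(N_car = n | S) * P(S | gist)."""
    num_bins = len(next(iter(p_count.values())))
    return [
        sum(p_scene[s] * p_count[s][n] for s in p_scene)
        for n in range(num_bins)
    ]


mixture = count_distribution(p_scene_given_gist, p_count_given_scene)
expected_cars = sum(n * p for n, p in enumerate(mixture))
print([round(p, 3) for p in mixture])
print(round(expected_cars, 3))
```

A detector can then use this distribution as a prior on how many car detections to accept in the image.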
75
Application of object detection for image
retrieval
Results using the keyboard detector alone
Results using both the keyboard detector and the
global scene features
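Why adding global scene features changes the retrieval ranking can be shown with Bayes' rule in odds form. A minimal sketch, assuming each image comes with a detector likelihood ratio and a scene-based prior for keyboard presence; the two images, their numbers, and the field names are hypothetical:

```python
def posterior(prior, likelihood_ratio):
    """P(object | evidence) from a prior and a detector likelihood ratio."""
    weighted = prior * likelihood_ratio
    return weighted / (weighted + (1.0 - prior))


# Hypothetical images: a strong detector response in an unlikely scene,
# and a weaker response in a scene where keyboards are common.
images = [
    {"name": "street_scene", "detector_lr": 4.0, "scene_prior": 0.01},
    {"name": "office_scene", "detector_lr": 2.0, "scene_prior": 0.60},
]

by_detector = sorted(images, key=lambda im: im["detector_lr"], reverse=True)
by_both = sorted(
    images,
    key=lambda im: posterior(im["scene_prior"], im["detector_lr"]),
    reverse=True,
)

print([im["name"] for im in by_detector])
print([im["name"] for im in by_both])
```

With the detector alone the street image ranks first; once the scene prior is included, the office image moves to the top.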
76
Global to local
  • Use global context to predict objects but there
    is no modeling of spatial relationships between
    objects.

Keyboards
Murphy, Torralba & Freeman (03)
77
3D Scene Context
(Figure panels: image, world)
Hoiem, Efros, Hebert ICCV 2005
78
3D Scene Context
(Figure labels: pedestrian, pedestrian, car)
Hoiem, Efros, Hebert ICCV 2005
79
3D City Modeling using Cognitive Loops
N. Cornelis, B. Leibe, K. Cornelis, L. Van Gool.
CVPR'06
80
Current approaches
  1. Scene to object dependencies
  2. Object to object dependencies

81
Where should I put the silverware?
82
Sampling from the labels
83
Sampling from the labels
Cf. Hoiem et al.; Hays & Efros, SIGGRAPH 2007
84
Contextual object relationships
Carbonetto, de Freitas & Barnard (2004)
Kumar & Hebert (2005)
Torralba, Murphy & Freeman (2004)
E. Sudderth et al. (2005)
Fink & Perona (2003)
85
Object-Object Relationships
  • Fink & Perona (NIPS 03)
  • Use output of boosting from other objects at
    previous iterations as input into boosting for
    this iteration

86
Pixel labeling using MRFs
  • Enforce consistency between neighboring labels,
    and between labels and pixels

Carbonetto, de Freitas & Barnard, ECCV 04
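The two kinds of consistency on this slide correspond to the two terms of a grid MRF energy: a unary term tying each label to its pixel and a pairwise term tying each label to its neighbors. A minimal sketch using iterated conditional modes (ICM) on a binary grid; ICM is a simple stand-in chosen for illustration, not the inference method of the cited paper, and the weights and toy image are hypothetical:

```python
def icm_labeling(observed, unary_weight=1.0, pairwise_weight=1.0, sweeps=5):
    """Greedy per-pixel minimization of unary + pairwise (Potts) costs."""
    rows, cols = len(observed), len(observed[0])
    labels = [row[:] for row in observed]

    def local_cost(r, c, candidate):
        # Unary term: disagreement between the label and the observed pixel.
        cost = unary_weight * (candidate != observed[r][c])
        # Pairwise term: disagreement with the 4-connected neighbors.
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                cost += pairwise_weight * (candidate != labels[rr][cc])
        return cost

    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                current = labels[r][c]
                other = 1 - current
                # Switch only if the other label is strictly cheaper.
                if local_cost(r, c, other) < local_cost(r, c, current):
                    labels[r][c] = other
                    changed = True
        if not changed:
            break
    return labels


# Two vertical regions with a single corrupted pixel at row 1, column 0.
observed = [
    [0, 0, 1, 1],
    [1, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
result = icm_labeling(observed)
for row in result:
    print(row)
```

The isolated pixel is relabeled because three neighbors disagree with it, while the boundary between the two regions survives because each boundary pixel still agrees with most of its neighbors.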
87
Beyond nearest-neighbor grids
  • Most MRF/CRF models assume nearest-neighbor graph
    topology
  • This cannot capture long-distance correlations

88
Dynamically structured trees
  • Each node picks its parents (Storkey & Williams,
    PAMI 03)
  • 2D SCFGs (Pollak, Siskind, Harper & Bouman,
    ICASSP 03)

89
Object-Object Relationships
  • Use latent variables to induce long distance
    correlations between labels in a Conditional
    Random Field (CRF)

He, Zemel & Carreira-Perpinan (04)
90
Object-Object Relationships
Kumar & Hebert 2005
91
Hierarchical Sharing and Context
E. Sudderth, A. Torralba, W. T. Freeman, and A.
Willsky.
  • Scenes share objects
  • Objects share parts
  • Parts share features

92
3D Scene Context
(Figure: image labeled with geometric classes:
support, vertical, sky. Vertical subclasses:
V-Center, V-Left, V-Right, V-Porous, V-Solid.)
Hoiem, Efros, Hebert ICCV 2005
93
Detecting difficult objects
Maybe there is a mouse
Office
Start recognizing the scene
Torralba, Murphy, Freeman. NIPS 2004.
94
Detecting difficult objects
First detect simple objects (reliable detectors)
that provide strong contextual constraints on the
target (screen -> keyboard -> mouse)
Torralba, Murphy, Freeman. NIPS 2004.
95
Detecting difficult objects
First detect simple objects (reliable detectors)
that provide strong contextual constraints on the
target (screen -> keyboard -> mouse)
Torralba, Murphy, Freeman. NIPS 2004.
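How a reliable screen detection can raise the belief in a hard-to-detect mouse can be sketched with a three-variable chain. This is plain exact inference on a small Bayesian chain, not the boosted random field of the cited paper, and every probability below is a hypothetical value chosen for illustration:

```python
from itertools import product


def mouse_posterior(p_screen, p_kb_given_screen, p_mouse_given_kb, evidence):
    """P(mouse present | detector evidence) by enumerating the chain."""
    weights = {0: 0.0, 1: 0.0}
    for s, k, m in product((0, 1), repeat=3):
        prior = (
            (p_screen if s else 1.0 - p_screen)
            * (p_kb_given_screen[s] if k else 1.0 - p_kb_given_screen[s])
            * (p_mouse_given_kb[k] if m else 1.0 - p_mouse_given_kb[k])
        )
        # evidence[name][v] = p(detector output | object absent/present).
        likelihood = (
            evidence["screen"][s] * evidence["keyboard"][k] * evidence["mouse"][m]
        )
        weights[m] += prior * likelihood
    return weights[1] / (weights[0] + weights[1])


p_screen = 0.5
p_kb_given_screen = {0: 0.1, 1: 0.9}   # keyboards tend to accompany screens
p_mouse_given_kb = {0: 0.05, 1: 0.8}   # mice tend to accompany keyboards

uninformative = {"screen": (0.5, 0.5), "keyboard": (0.5, 0.5), "mouse": (0.5, 0.5)}
confident_screen = {"screen": (0.05, 0.95), "keyboard": (0.3, 0.7), "mouse": (0.5, 0.5)}

no_context = mouse_posterior(p_screen, p_kb_given_screen, p_mouse_given_kb, uninformative)
with_context = mouse_posterior(p_screen, p_kb_given_screen, p_mouse_given_kb, confident_screen)
print(round(no_context, 3), round(with_context, 3))
```

The mouse detector itself contributes nothing in either case; the belief rises only because evidence from the screen and keyboard detectors propagates along the chain.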
96
BRF for screen/keyboard/mouse
Iteration
101
BRF for car detection: topology
102
BRF for car detection: results
103
A car out of context is less of a car
(Figure: panels for road, car, and building, each
showing b, F, and G. Panel labels: from image,
from detectors, thresholded beliefs.)
104
Context
  • or no context