Transcript and Presenter's Notes

Title: Sensor, Motion


1
Sensor, Motion, Temporal Planning
  • PhD Defense for
  • Ser-Nam Lim
  • Department of Computer Science
  • University of Maryland, College Park

2
Outline
  • Two-camera background subtraction
  • Invariant to shadows, lighting changes.
  • Multi-camera background subtraction and tracking
  • Occlusions.
  • Active camera
  • Predictive tracking.
  • Motion, temporal planning.
  • Camera scheduling.
  • Abandoned package detection
  • Severe occlusions.
  • Temporal analysis in a statistical framework to
    minimize reliance on thresholding.

3
1. Two-camera Background Subtraction
  • Details given during proposal.
  • "Fast Illumination-invariant Background
    Subtraction using Two Views Error Analysis,
    Sensor Placement and Applications", IEEE CVPR
    2005.

4
Problem Description
  • Single-camera background subtraction
  • Shadows,
  • Illumination changes, and
  • Specularities.
  • Disparity-based background subtraction
  • Can overcome many of these problems, BUT
  • Slow and
  • Inaccurate online matches.

5
Two-Camera Algorithm
  • Real-time, two-camera background subtraction
  • Develop a fast two-camera background subtraction
    algorithm that doesn't require solving the
    correspondence problem online.
  • Analyze advantages of various camera
    configurations with respect to robustness of
    background subtraction.

6
Fast Illumination-invariant Two-camera Approach
  • Clever idea due to Ivanov et al.
  • Yuri A. Ivanov, Aaron F. Bobick, and John Liu,
    "Fast Lighting Independent Background
    Subtraction", IEEE Workshop on Visual
    Surveillance, ICCV'98, Bombay, India, January
    1998.
  • Intuition
  • Establish background conjugate pixels offline.
  • Detect foreground from color differences between
    conjugate pixels (sketched below).
  • What are the problems?
  • False and missed detections caused by homogeneous
    objects.
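
A minimal Python sketch of the conjugate-pixel test, assuming correspondences were established offline; the function name, data layout, and fixed threshold are illustrative, not the paper's exact formulation:

```python
import numpy as np

def two_camera_foreground(ref_img, aux_img, conj_map, tau=30.0):
    """Flag pixels whose colors disagree across the two views.

    ref_img, aux_img : HxWx3 float arrays from the two cameras.
    conj_map         : HxWx2 integer array giving, for each reference
                       pixel, its conjugate (row, col) in the other
                       view, established offline by stereo matching.
    tau              : illustrative color-difference threshold.
    """
    rows, cols = conj_map[..., 0], conj_map[..., 1]
    diff = np.linalg.norm(ref_img - aux_img[rows, cols], axis=-1)
    return diff > tau  # True = candidate foreground
```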

7
Intuition
Color difference still small with shadow.
Color differences of an image point in both
cameras are small when the background is built.
8
False Detections
Reference camera
Happens when object is close to background.
Big color difference even though it's background!!
9
Missed Detections
Reference camera
Background occluded!! Both cameras see color on
the truck, so the color difference is small if the
truck is homogeneous.
10
Eliminate False Detections
  • Place the two cameras vertically aligned with
    respect to the ground plane on which objects move.

11
Reference camera
Now, whenever the reference camera sees background,
the other camera does too - no more big color
difference when it's background.
12
Reducing Missed Detections
  • Initial detection is free of false detections,
  • and the missed detections form a component
    adjacent to the ground plane.
  • Utilize stereo matching of the initial detection
    to infer height and fill in the missed portion.

13
Reference camera
Infer height through selective stereo
14
Advantages
  • FAST!! No online stereo matching.
  • Invariant to shadows, lighting changes.
  • Invariant to specularities
  • Through a height-inferring process.
  • Detects near-background objects
  • Difficult problem for disparity-based background
    subtraction.
  • Accurate
  • Offline stereo matching can be computationally
    intensive.
  • Human intervention can be used.

15
Experiments - Lighting Changes
16
Experiments - Specularities
17
Experiments - Specularities
18
Experiments - Near Background
19
Experiments - Indoor
20
2. Multi-camera Detection and Tracking Under
Occlusions
  • Preparing for submission.

21
Problem
  • Severe occlusions make detection and tracking
    difficult.
  • We often need to observe highly occluded places!!
  • Partial and full occlusions.

22
Algorithm Outline
  • Silhouette detection on a per-camera basis.
  • Count people in a top view.
  • Constrained stereo.
  • Sensor fusion - particle filter.

23
Silhouette Detection - Background Subtraction
24
People Counting
  • Project the foreground silhouettes onto a common
    ground plane - do this for every available camera.
  • Intersect the projections of different cameras.
  • This obtains a set of polygons that possibly
    contain valid objects (sketched below).
  • The number of polygons is a rough estimate of the
    number of people in the scene.
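
A rough sketch of the projection-intersection count using shapely (the data layout is hypothetical; phantom polygons, shown on the next slide, would still need pruning):

```python
from shapely.geometry import Polygon
from shapely.ops import unary_union

def count_people(projections):
    """projections: one list of ground-plane Polygons per camera,
    each polygon the projection of a detected silhouette."""
    footprints = [unary_union(p) for p in projections]  # per-camera union
    common = footprints[0]
    for fp in footprints[1:]:                           # intersect cameras
        common = common.intersection(fp)
    parts = getattr(common, "geoms", [common])          # Multi- or single
    return sum(1 for p in parts if p.area > 1e-6)       # rough head count
```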

25
[Figure: a phantom polygon formed by intersecting
the projections from Camera 1 and Camera 2.]
26
Selective Stereo
[Figure: epipolar-line stereo matching. Good color
matching can occur along a wrong vertical line
through a phantom polygon's ground-plane pixel,
not just along the correct vertical line.]
27
Constrained Stereo
[Figure: constrained stereo across two camera
views. Candidate ground-plane pixels are mapped
along the epipolar line; color matching along the
vertical line above each candidate separates
correct matches (good matching) from wrong ones
(bad matching) at phantom polygons.]
28
Note that only the visible foreground pixels are
successfully segmented based on selective stereo
with one pair. Partial and full occlusions need to
be dealt with via multi-camera fusion. How??
29
Additional Consideration - Sensor Fusion
  • Choose the best stereo pairs for stereo matching,
    guided by a particle filter (sketched below).
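
A generic sequential importance resampling (SIR) step, with the observation likelihood standing in for a stereo color-matching score from the currently selected camera pair; this sketches the general mechanism, not the dissertation's exact filter:

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, likelihood, motion_std=0.05):
    """One predict/update/resample cycle over (N, d) state samples,
    e.g., ground-plane positions of a tracked person."""
    # Predict: diffuse particles with a simple random-walk motion model.
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    # Update: reweight by the observation likelihood (here, a stand-in
    # for the color-matching score of the chosen stereo pair).
    weights = weights * likelihood(particles)
    weights = weights / (weights.sum() + 1e-300)
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(weights):
        idx = rng.choice(len(weights), size=len(weights), p=weights)
        particles = particles[idx]
        weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights
```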

30
Count people
  • Use
  • Danny Yang, Héctor H. González-Baños, Leonidas
    J. Guibas, "Counting People in Crowds with a
    Real-Time Network of Simple Image Sensors", ICCV,
    2003.
  • Notice the errors!!

31
Final Results
32
3. Active Camera
  • Submitted to ACM Multimedia Systems Journal.
  • Submitted to ACM Multimedia 2006.
  • Constructing Task Visibility Intervals for
    Surveillance Systems, VSSN Workshop, ACM
    Multimedia 2005.
  • A Scalable Image-based Multi-camera Visual
    Surveillance System, AVSS 2003.

33
Problem Description
  • Given
  • Collection of calibrated PTZ cameras and
  • Large surveillance site.
  • How to control cameras to acquire surveillance
    videos?
  • Why collect surveillance videos?
  • Collect k secs of unobstructed video from as
    close to a side angle as possible for gait
    recognition.
  • Collect unobstructed video of person near any ROI.

34
Project Goals - Visual Requirements
  • Targets have to be unobstructed in the collected
    videos during the useful collection periods.
  • Involves predicting object trajectories in the
    field of regard based on tracking.
  • Targets have to be in the field of view in the
    collected videos.
  • Constrains PT parameters for cameras as a
    function of time during periods of visibility.
  • Targets have to satisfy some task-specific
    minimum resolutions in the collected videos.
  • Constrains the Z (zoom) parameter.

35
Project Goals - Performance Requirements
  • Scheduling cameras to maximize task coverage.
  • Determine future time intervals within which the
    visual requirements of tasks are satisfied.
  • We first do this for each (camera, task) pair.
  • We then combine these solutions across tasks and
    then cameras to schedule tasks.

36
System Timeline
  • For every (camera, task, object) tuple
  • Detection and tracking using existing methods.
  • Predict future locations of objects.
  • Visibility analysis, to predict the periods
    during which objects are visible - visibility
    intervals.
  • Determine allowable camera settings over time
    within these visibility intervals to form Task
    Visibility Intervals (TVIs).
  • Combine TVIs to form Multiple Task Visibility
    Intervals (MTVIs) - scalability.
  • Scheduling - scalability.

37
Predicting Future Location
  • Represent each object as a sphere.
  • For computational efficiency, each sphere is
    represented as a triplet of circular shadows on
    the projection planes for visibility analysis.
  • Extrapolate the motion of each shadow to predict
    its future locations.
  • Each shadow moves in a straight line along the
    predicted path, and its radius grows linearly to
    capture the positional uncertainty (sketched
    below).
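
A minimal sketch of this shadow model (the linear growth rate is an assumed parameter):

```python
import numpy as np

def predict_shadow(center, velocity, radius, growth, t):
    """Circular shadow extrapolated t seconds ahead: the center moves
    in a straight line while the radius grows linearly to absorb
    positional uncertainty."""
    center = np.asarray(center, float)
    velocity = np.asarray(velocity, float)
    return center + velocity * t, radius + growth * t
```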

38
Predictive Tracking Experiments
39
Visibility Analysis
  • With the predicted locations, we can represent
    the extremal angle trajectories over time of each
    shadow in closed form.
  • Extremal angles are the angles subtended by the
    pair of tangent points.

[Figure: a shadow on a straight-line trajectory
with its radius increasing over time; the extremal
angle of one tangent point is measured from the
camera center.]
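
In code, the extremal angles of one shadow at a single instant could be computed as follows (a sketch; assumes the camera center lies outside the circle):

```python
import numpy as np

def extremal_angles(center, radius, cam=(0.0, 0.0)):
    """Bearings of the two tangent points of a circular shadow seen
    from the camera center."""
    dx, dy = center[0] - cam[0], center[1] - cam[1]
    dist = np.hypot(dx, dy)          # camera-to-center distance
    bearing = np.arctan2(dy, dx)     # direction of the circle center
    half = np.arcsin(radius / dist)  # half-angle subtended by the circle
    return bearing - half, bearing + half
```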
40
  • The extremal angle trajectories of two different
    objects are equated to find the time intervals
    (intersections) when occlusion occurs - occlusion
    intervals.
  • Complements of the occlusion intervals are the
    visibility intervals.
  • This can be done for every object pair, but it is
    more efficient to use an optimal segment
    intersection algorithm (details given in the
    dissertation); a brute-force stand-in is sketched
    below.
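
The brute-force stand-in, sampling times rather than intersecting the closed-form trajectories (angle wraparound is ignored for simplicity):

```python
def occlusion_times(traj_a, traj_b, times):
    """Times at which the angular extents of two shadows overlap.
    traj_a, traj_b map a time t to that object's (low, high) extremal
    angles; the optimal segment-intersection algorithm avoids this
    pairwise sampling."""
    occluded = []
    for t in times:
        lo_a, hi_a = traj_a(t)
        lo_b, hi_b = traj_b(t)
        if lo_a <= hi_b and lo_b <= hi_a:  # angular intervals intersect
            occluded.append(t)
    return occluded
```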

41
Efficient Segment Intersection vs Brute Force
42
(No Transcript)
43
Task Visibility Intervals (TVIs)
  • Combine allowable camera settings over time with
    visibility intervals to form TVIs.
  • Allowable camera settings are determined at each
    future time step in the visibility interval.
  • Iterate through the range of pan, tilt and zoom
    settings, and determine the time intervals during
    which PTZ ranges exist that satisfy the
    task-specific resolution (a zoom sketch follows
    below).
  • For efficiency, use a piecewise approximation to
    the PTZ range.
  • These TVIs must also satisfy the required length
    of collected video.
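
An illustrative zoom-feasibility check under a plain pinhole model; focal_of_zoom, the sensor height, and the image size are assumed calibration inputs, not values from the dissertation:

```python
def allowable_zoom(distance, target_height, min_pixels, focal_of_zoom,
                   zoom_steps, image_rows=480, sensor_height_mm=4.8):
    """Zoom settings whose focal length images the target at or above
    the task's minimum resolution. focal_of_zoom maps a zoom setting
    to focal length in mm (from calibration)."""
    ok = []
    for z in zoom_steps:
        f_mm = focal_of_zoom(z)
        # Pinhole: image height in pixels of a target of the given
        # metric height at the given distance.
        pixels = f_mm * target_height / distance * image_rows / sensor_height_mm
        if pixels >= min_pixels:
            ok.append(z)
    return ok
```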

44
Multiple Task Visibility Intervals (MTVIs)
  • TVIs can be combined if
  • Common time intervals exist that are at least as
    long as the maximum required processing times
    among all the tasks involved.
  • Common camera settings exist in these common time
    intervals.
  • For efficiency, TVIs can be combined with a
    plane-sweep algorithm (a simpler pairwise sketch
    follows below).
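
The pairwise sketch of these two conditions, with one scalar range standing in for the joint PTZ range (the plane-sweep version avoids this quadratic pairing):

```python
def combine_tvis(tvis):
    """Pairwise TVI combination sketch. Each TVI is
    ((t0, t1), (s_lo, s_hi), proc_time)."""
    mtvis = []
    for i in range(len(tvis)):
        for j in range(i + 1, len(tvis)):
            (a0, a1), (sa0, sa1), pa = tvis[i]
            (b0, b1), (sb0, sb1), pb = tvis[j]
            t0, t1 = max(a0, b0), min(a1, b1)      # common time window
            s0, s1 = max(sa0, sb0), min(sa1, sb1)  # common settings
            # Window must cover the longer of the two processing times,
            # and a common camera setting must exist.
            if t1 - t0 >= max(pa, pb) and s0 <= s1:
                mtvis.append(((t0, t1), (s0, s1)))
    return mtvis
```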

45
Zoom
46
Camera Scheduling
  • Scheduling based on the constructed (M)TVIs.
  • Two methods are compared
  • Branch and bound.
  • Greedy.

47
  • Define the slack Φ as
  • Φ = [t⁻, t⁺] = [r, d − p],
  • where d is the deadline, r is the earliest
    release time, and p is the processing time
    (duration of the task).
  • Let φ = t⁺ − t⁻.
  • It can be shown that if φ_max < p_min, then in
    any feasible schedule, the (M)TVIs must be
    ordered by r.
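
A worked check of the definition and the ordering property, on made-up numbers:

```python
def slack_interval(r, d, p):
    """Slack interval [t-, t+] = [r, d - p] and its length phi."""
    t_lo, t_hi = r, d - p
    return t_lo, t_hi, t_hi - t_lo

# Two illustrative (M)TVIs: phi_max = 1 < p_min = 5, so any feasible
# schedule must order them by earliest release time r.
jobs = [dict(r=0.0, d=6.0, p=5.0), dict(r=2.0, d=9.0, p=6.0)]
phis = [slack_interval(j["r"], j["d"], j["p"])[2] for j in jobs]
assert max(phis) < min(j["p"] for j in jobs)
schedule = sorted(jobs, key=lambda j: j["r"])
```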

48
  • Each camera can then be modeled with an acyclic
    graph with a source and a sink, with the nodes
    being the (M)TVIs and the edges weighted by the
    number of tasks covered on moving from one node
    to another.
  • The sink of the graph of one camera is linked to
    the source of the graph of the next camera -
    cascading.

49
Example
[Figure: example cascaded scheduling graph with
per-camera sources s1-s3 and sinks t1-t3, (M)TVI
nodes, and edges weighted by the number of tasks
covered.]
50
  • Dynamic Programming (DP) is run on the
    multi-camera graph (sketched below).
  • Equivalent to the greedy algorithm, BUT
  • Branch and Bound looks at which tasks the other
    cameras in the graph can potentially cover while
    running DP - backtracking.
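
A sketch of such a DP as a longest-path pass over the cascaded DAG (the node and edge encodings are assumed for illustration):

```python
from collections import defaultdict

def max_tasks_covered(edges, source, sink, topo_order):
    """Max-weight path in the cascaded scheduling DAG.

    edges      : {u: [(v, tasks_covered), ...]}
    topo_order : all nodes in topological order, with each camera's
                 sink linked to the next camera's source (cascading).
    """
    best = defaultdict(lambda: float("-inf"))
    best[source] = 0
    for u in topo_order:
        if best[u] == float("-inf"):
            continue  # unreachable node
        for v, w in edges.get(u, []):
            if best[u] + w > best[v]:
                best[v] = best[u] + w
    return best[sink]
```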

51
Approximation Factors - Branch and Bound vs Greedy
  • Given k cameras, the approximation factor for
    multi-camera scheduling using the greedy
    algorithm depends on k as well as on two
    variables representing the distribution of tasks
    among the cameras.
  • Proof in dissertation.
  • Important - depends on the number of cameras,
    i.e., does not scale well to large camera
    networks!!

52
  • For k cameras, the approximation factor of the
    branch and bound algorithm depends only on task
    distribution factors.
  • Proof in dissertation.
  • Important - insensitive to the number of cameras!!

53
Performance Simulations
54
Experiments - Face Capture
55
Experiments - Full Body Video
56
Experiments - Lower Resolution
57
Experiments - Higher Resolution
58
4. Abandoned Package Detection under Severe
Occlusions
  • A short overview.
  • Refer to dissertation for details.
  • Preparing for submission.

59
Constraints
  • No background frame available.
  • Constant foreground motion.
  • Constant occlusion.
  • Single camera.

60
Algorithm
  • PDF for motion detection, Pd
  • Observe successive frame differences.
  • Assume the pdf is zero-mean - extract the
    zero-centered mode.
  • PDF for the background model, Pb
  • Histogram frequency computed based on the joint
    probability with Pd.
  • Intuition - true background pixels should observe
    no motion. (Both PDFs are sketched below.)
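
A per-pixel sketch of estimating Pd and Pb from one pixel's intensity series (bin counts and the no-motion test are illustrative):

```python
import numpy as np

def pixel_models(frames, motion_eps=2.0):
    """Pd: histogram of successive frame differences, whose
    zero-centered mode is the no-motion component.
    Pb: intensity histogram weighted by no-motion evidence, since true
    background pixels should observe no motion."""
    x = np.asarray(frames, float)
    diffs = np.diff(x)
    p_d, d_edges = np.histogram(diffs, bins=51, range=(-255, 255),
                                density=True)
    no_motion = (np.abs(diffs) < motion_eps).astype(float)
    p_b, b_edges = np.histogram(x[1:], bins=64, range=(0, 255),
                                weights=no_motion)
    total = no_motion.sum()
    p_b = p_b / total if total > 0 else p_b  # normalize when possible
    return (p_d, d_edges), (p_b, b_edges)
```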

61
  • PDF of static pixels that are foreground,
    conditioned on Pb and Pd
  • Intuition - pixels belonging to abandoned
    packages are static foreground pixels.
  • MRF to label these pixels - avoids thresholding.
  • Evaluate the clusters based on temporal
    persistency of shape (Hausdorff distance) and
    intensities (sketched below).
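
A sketch of the shape-persistence test using SciPy's directed Hausdorff distance (contour extraction and the MRF labeling are outside this snippet):

```python
from scipy.spatial.distance import directed_hausdorff

def shape_persistence(contour_a, contour_b):
    """Symmetric Hausdorff distance between a cluster's contour at two
    times; consistently small values indicate the temporally persistent
    shape expected of an abandoned package. Contours are (N, 2) arrays
    of boundary points."""
    return max(directed_hausdorff(contour_a, contour_b)[0],
               directed_hausdorff(contour_b, contour_a)[0])
```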

62
Experiments
63
(No Transcript)
64
(No Transcript)
65
Conclusions
  • The role of sensor placement in detections
  • Highlighted in two-camera background subtraction.
  • The role of sensor placement/selection in
    tracking under occlusions
  • Improve stereo matching by choosing different
    stereo pairs based on a particle filter.
  • Active camera system
  • A challenge to deploy in real-world applications.
  • Depends a lot on predictive tracking; how can we
    improve it?
  • Left-baggage detection
  • What if the baggage is invisible (e.g., a bomb
    left in a trash can)?!

66
Thanks!!
  • Prof. Larry Davis for his support and teachings.
  • The committee for their time.