1. Tracking Across Multiple Cameras With Disjoint Views
by Omar Javed, Zeeshan Rasheed, Khurram Shafique, and Mubarak Shah
- Shane Brennan
- 1/31/07
2. The Problem
- Techniques for tracking individuals from a single camera have improved, but tracking an individual across multiple cameras with no overlapping views remains extremely difficult
- The object is hidden from view for an indeterminate amount of time
- The appearance of an individual can change greatly due to changes in viewpoint, lighting, and other environmental conditions
3. An Example of a Camera Setup
4. Some Notation
- Assume the single-camera tracking problem is solved
- K cameras, C1, C2, ..., CK
- Oj = {Oj,1, Oj,2, ..., Oj,mj} is the set of tracks observed by camera Cj
- Observations are broken into two parts: appearance (app) and space-time (st) features (location, velocity, time)
5. Some Notation, continued...
- Let the ordered pair (Oa,b, Oc,d) denote the hypothesis that observations Oa,b and Oc,d are consecutive tracks of the same object
- Find a set of correspondences K such that each observation is preceded or succeeded by at most one other observation, and a hypothesis (Oa,b, Oc,d) is in K only if Oa,b and Oc,d are consecutive tracks of the same object
6. The Formulation
- Maximize the posterior!
- P(k | Oi,a, Oj,b) is the probability of the correspondence k given the observations Oi,a and Oj,b from two cameras Ci and Cj
7. The Formulation, continued...
- From Bayes' theorem, the posterior is proportional to the likelihood of the observations given the correspondence times the prior probability of the correspondence
- Since appearance and space-time information are considered independent, the likelihood factors into an appearance term and a space-time term (see the sketch below)
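A hedged sketch of the factorization just described, using k as shorthand for the correspondence hypothesis (notation assumed for illustration, not copied from the slides):

```latex
P(k \mid O_{i,a}, O_{j,b})
  \;\propto\; P(O_{i,a}, O_{j,b} \mid k)\, P(k)
  \;=\; P\big(O_{i,a}(\mathrm{app}), O_{j,b}(\mathrm{app}) \mid k\big)\,
        P\big(O_{i,a}(\mathrm{st}),  O_{j,b}(\mathrm{st})  \mid k\big)\, P(k)
```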
8. The Prior
- The prior P(k) is the probability that an object transitions from camera Ci to camera Cj
- By also assuming that observations are uniformly distributed, the term P(Oi,a, Oj,b) becomes a constant scale factor, so the posterior maximization reduces to maximizing the likelihood terms times the transition prior
9. Space-Time Conditional Probs
- Assume camera correspondences are known during training
- Let S be a sample of n d-dimensional data points x1, x2, ..., xn drawn from a multivariate distribution p(x); estimate p(x) using the Parzen window (kernel density estimation) technique
- K(x) is the multivariate kernel used in the density estimate
10. Space-Time Cond Probs, continued
- H is a symmetric d x d bandwidth matrix, assumed to be diagonal in order to simplify matters
- x is a 7-dimensional feature vector containing exit location/velocity, entry location/velocity, and time of travel (inter-arrival time)
- During training, as correspondences are found (or hand labeled), the feature vector is added to S (a sketch of the resulting density estimate follows below)
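A minimal sketch of the Parzen window estimate with a diagonal bandwidth matrix; the Gaussian kernel and all names are assumptions for illustration, not taken from the slides:

```python
import numpy as np

def parzen_density(x, samples, bandwidth_diag):
    """Parzen window estimate of p(x) from training samples.

    x              : (d,) query feature vector (exit/entry location,
                     velocity, inter-arrival time)
    samples        : (n, d) matrix S of training feature vectors
    bandwidth_diag : (d,) diagonal of the bandwidth matrix H
    """
    n, d = samples.shape
    h = np.asarray(bandwidth_diag, dtype=float)
    # Scaled differences H^{-1/2} (x - x_i) for every sample.
    u = (x - samples) / np.sqrt(h)
    # Gaussian kernel K(u) = (2*pi)^{-d/2} exp(-0.5 * u^T u).
    k = np.exp(-0.5 * np.sum(u * u, axis=1)) / (2.0 * np.pi) ** (d / 2.0)
    # p(x) = (1/n) * |H|^{-1/2} * sum_i K(H^{-1/2}(x - x_i)).
    return k.sum() / (n * np.sqrt(np.prod(h)))
```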
11. Inter-Arrival Times
- The inter-arrival time depends on the magnitude and direction of motion
- It also depends on the locations of exit and entry between the camera views
- The locations of exit and entry points between cameras are themselves correlated
- The prior probability of correspondence for an object moving from Ci to Cj is calculated as the ratio of the number of people who exit Ci and enter Cj to the total number of people who exit Ci during the learning phase (see the formula below)
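Written out as a formula (notation mine, for illustration):

```latex
P(C_i \rightarrow C_j) \;=\;
  \frac{\#\{\text{objects that exit } C_i \text{ and enter } C_j\}}
       {\#\{\text{objects that exit } C_i\}}
```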
12. Appearance Probs
- Need to model the change in appearance across cameras, i.e., learn the appearance change function
- Use color histograms; represent the distance between two histograms k and q using a modified Bhattacharyya coefficient (see the sketch below)
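A minimal sketch of a Bhattacharyya-based histogram distance. This uses the common D = sqrt(1 - BC) form; the paper's "modified" coefficient may differ, so treat this as an assumed stand-in:

```python
import numpy as np

def bhattacharyya_distance(k, q):
    """Distance between two color histograms k and q.

    Uses D = sqrt(1 - sum_u sqrt(k_u * q_u)); D = 0 for identical
    histograms and D = 1 for non-overlapping ones.
    """
    k = np.asarray(k, dtype=float)
    q = np.asarray(q, dtype=float)
    k = k / k.sum()
    q = q / q.sum()
    bc = np.sum(np.sqrt(k * q))          # Bhattacharyya coefficient
    return np.sqrt(max(0.0, 1.0 - bc))   # clamp for numerical safety
```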
13. Appearance Probs, continued
- Find the distance D for every object that travels between cameras i and j during training, and model these distances as a Gaussian
- The appearance term of a candidate correspondence can then be computed as the Gaussian density of its distance, where the mean and variance are those of the color distance data between cameras i and j (written out below)
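In symbols (a hedged reconstruction; the subscripts are my notation):

```latex
P\big(O_{i,a}(\mathrm{app}), O_{j,b}(\mathrm{app}) \mid k\big)
  \;\approx\; \mathcal{N}\big(D_{ab};\, \mu_{ij},\, \sigma_{ij}^{2}\big)
  \;=\; \frac{1}{\sqrt{2\pi}\,\sigma_{ij}}
        \exp\!\Big(-\frac{(D_{ab}-\mu_{ij})^{2}}{2\sigma_{ij}^{2}}\Big)
```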
14. Some Distance Histograms
15. Establishing Correspondences
- Finding K can be modelled as finding paths through a directed graph: each node is an observation Oi,a, a correspondence is an arc between two nodes, and the weight of an arc is the value of the log-likelihood function
- A solution K is a set of disjoint directed paths covering the entire graph (every vertex is in exactly one path); the solution to the MAP problem is the set whose sum of arc weights is maximum among all such sets
16. Establishing Corresp, continued...
- Can reduce this to finding a maximum matching of an undirected bipartite graph
- Can be solved in O(n^2.5) time using the method described by Hopcroft and Karp in "An n^(5/2) Algorithm for Maximum Matchings in Bipartite Graphs"
- Split each vertex into two vertices, v- and v+: v- receives the arcs coming into the vertex, and v+ carries the arcs leaving the vertex (a sketch of the matching step follows below)
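A minimal sketch of the weighted matching step. It swaps in scipy's Hungarian-style solver (linear_sum_assignment) as a stand-in for the Hopcroft-Karp routine the slides cite, and the exit/entry split plays the role of the v-/v+ vertex duplication; all names are assumptions:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_observations(log_likelihood, min_score=-np.inf):
    """Match 'exiting' observations (rows) to 'entering' observations (columns).

    log_likelihood : (n, m) matrix of arc weights; entry [a, b] is the
                     log-likelihood that observations a and b are
                     consecutive tracks of the same object.
    Returns the correspondence set K as a list of (a, b) index pairs.
    """
    rows, cols = linear_sum_assignment(log_likelihood, maximize=True)
    # Drop pairings whose score is too low to be a plausible correspondence.
    return [(a, b) for a, b in zip(rows, cols)
            if log_likelihood[a, b] > min_score]
```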
17. The bad part...
- The method of establishing correspondences assumes all observations are available, so it can't be used in real time!
- Fix: use a sliding window. This is a tradeoff between accuracy and timely availability of results
- The authors adjust the size of the sliding window online, but this is still a sub-optimal solution; it would be best not to need a sliding window at all, but the need is inherent in the method
18. Online Update
- Incorporate new observations and discard old ones
- Achieved by estimating the density of D from the most recent N samples: the Gaussian parameters are updated from the N recent distance samples using a learning-rate parameter (a sketch follows below)
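A minimal sketch of one plausible running update of the appearance Gaussian, assuming an exponential-forgetting scheme with learning rate alpha (the exact update rule is not recoverable from the slides):

```python
import numpy as np

def update_gaussian(mu, var, recent_distances, alpha=0.1):
    """Blend the current Gaussian (mu, var) with the statistics of the
    N most recent color-distance samples.

    alpha is the learning-rate parameter: alpha = 0 keeps the old model,
    alpha = 1 replaces it with the recent-sample statistics.
    """
    d = np.asarray(recent_distances, dtype=float)
    mu_new = (1.0 - alpha) * mu + alpha * d.mean()
    var_new = (1.0 - alpha) * var + alpha * d.var()
    return mu_new, var_new
```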
19. Results
20. Appearance Modeling for Tracking in Multiple Non-overlapping Cameras
by Omar Javed, Khurram Shafique, and Mubarak Shah
- The goal: a better representation for finding the brightness transfer function between the appearances of an individual as seen in two separate cameras
- The authors represent the change in appearance as a function they call a Brightness Transfer Function (BTF)
21. Brightness Transfer Functions
- The BTFs for a pair of cameras lie in a small subspace of the space of all possible BTFs
- For a one-to-one mapping of brightness values, objects must be planar and have only diffuse reflectance
22. Some Notation
- Li(p, t) is the scene radiance at a world point p of an object illuminated by white light, as observed by camera Ci at time t
- Assuming objects have no specular reflectance, Li(p, t) is the product of a material term Mi(p, t) = M(p) (the albedo) and an illumination/camera-geometry term Gi(p, t)
- So Li(p, t) = M(p) Gi(p, t)
23. BTF Formulation
- Assuming planarity, Gi(p, t) = Gi(q, t) = Gi(t) for all points p and q on an object, so Li(p, t) = M(p) Gi(t)
- Image irradiance is given as Ei(p, t) = Li(p, t) Yi(t) = M(p) Gi(t) Yi(t), where Yi(t) is a function of camera parameters: hi(t) and di(t) are the focal length and aperture of the lens, and the remaining factor involves the angle that the principal ray from p makes with the optical axis. The cos term is negligible and is replaced with a constant c (see the sketch below)
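A hedged reconstruction of the image-irradiance term just described, written in the usual thin-lens form (the exact expression on the slide is not recoverable, so treat this as a sketch):

```latex
E_i(p, t) \;=\; L_i(p, t)\, Y_i(t),
\qquad
Y_i(t) \;=\; \frac{\pi}{4}\left(\frac{d_i(t)}{h_i(t)}\right)^{2} \cos^{4}\alpha(p, t)
       \;\approx\; \frac{\pi}{4}\left(\frac{d_i(t)}{h_i(t)}\right)^{2} c
```

Here alpha(p, t) denotes the angle between the principal ray from p and the optical axis, the term approximated by the constant c above.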
24. BTF Formulation, continued...
- Denote by Xi(t) the time of exposure and by gi the radiometric response function of camera Ci; then the measured image brightness Bi(p, t) of world point p can be written as Bi(p, t) = gi(Ei(p, t) Xi(t)) = gi(M(p) Gi(t) Yi(t) Xi(t)), i.e., the radiometric response applied to the product of the material, geometric, camera-parameter, and exposure-time terms
25. Calculating the BTF
- Assume a point p is viewed by both cameras i and j; since the material properties M(p) remain constant across the two views, the two measured brightness values can be related
- The BTF maps the brightness of p as seen in camera i to its brightness as seen in camera j through a factor w(ti, tj), a function of the camera parameters and the illumination/scene geometry of cameras i and j at times ti and tj (see the sketch below)
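A hedged reconstruction of this relation, obtained by eliminating M(p) from the brightness model of slide 24 (treat it as a sketch rather than the paper's exact equation):

```latex
B_j(p, t_j) \;=\; g_j\!\big(M(p)\, G_j(t_j)\, Y_j(t_j)\, X_j(t_j)\big)
            \;=\; g_j\!\Big(w(t_i, t_j)\, g_i^{-1}\!\big(B_i(p, t_i)\big)\Big),
\qquad
w(t_i, t_j) \;=\; \frac{G_j(t_j)\, Y_j(t_j)\, X_j(t_j)}
                       {G_i(t_i)\, Y_i(t_i)\, X_i(t_i)}
```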
26. Calculating the BTF, continued...
- The previous relation is valid for any point p, so p can be dropped from the notation. It is implicit that the BTF is the same for any given pair of frames, so ti and tj can also be dropped for simplicity. Let fij denote a BTF from camera i to camera j
- Create a vector for fij by sampling it at a set of fixed, increasing brightness values Bi(1) < Bi(2) < ... < Bi(d): fij = (fij(Bi(1)), ..., fij(Bi(d)))
27. Calculating the BTF, continued...
- The space of all BTFs has dimension at most d, where d is the number of brightness levels (256 for typical cameras). It can be shown that the BTFs actually lie in a small subspace, using Theorem 1
- Theorem 1: the subspace of brightness transfer functions has dimension at most m if the radiometric response function gj of camera Cj can be expressed in terms of m arbitrary but fixed 1D functions (see the paper for the precise condition)
28. Calculating the BTF, continued...
- From Theorem 1, the upper bound on the dimension of the subspace depends on the radiometric response of camera j. Such functions are usually nonlinear and differ from one camera to another, but they do not have exotic forms and are well approximated by simple parametric models
- The response can be modeled by a gamma function, i.e., g(x) = x^gamma, so for all a and x in R, g(ax) = (ax)^gamma = a^gamma x^gamma
29. Calculating the BTF, continued...
- Since the response can be represented with a gamma function, the subspace of BTFs has dimension at most 2, as opposed to 256
- The radiometric response function can be represented more accurately with a polynomial if desired, though the dimension of the space of BTFs will then be the degree of the polynomial
30. Estimating BTFs
- View the same object in cameras i and j, and normalize the histograms by assuming that the percentage of pixels with brightness less than Bi is the same in both views
- If Hi and Hj are the normalized cumulative histograms of object observations Oi and Oj, then Hi(Bi) = Hj(Bj) = Hj(fij(Bi)), so fij(Bi) = Hj^-1(Hi(Bi))
- Use this relation to compute the BTF fij for every pair of corresponding observations in the training set, and define Fij as the collection of all the fij's (a sketch follows below)
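A minimal sketch of the fij(Bi) = Hj^-1(Hi(Bi)) estimate; inverting Hj by linear interpolation is an assumption, as are all names:

```python
import numpy as np

def estimate_btf(pixels_i, pixels_j, levels=256):
    """Estimate the BTF f_ij mapping brightness in camera i to camera j.

    pixels_i, pixels_j : 1-D arrays of brightness values (0..levels-1)
                         from corresponding observations of one object.
    Returns an array btf of length `levels` with btf[b] = f_ij(b).
    """
    # Normalized cumulative histograms H_i and H_j.
    hist_i, _ = np.histogram(pixels_i, bins=levels, range=(0, levels))
    hist_j, _ = np.histogram(pixels_j, bins=levels, range=(0, levels))
    H_i = np.cumsum(hist_i) / hist_i.sum()
    H_j = np.cumsum(hist_j) / hist_j.sum()
    # f_ij(B_i) = H_j^{-1}(H_i(B_i)); invert H_j by interpolation.
    brightness = np.arange(levels)
    return np.interp(H_i, H_j, brightness)
```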
31. Estimating BTFs, continued...
- Use probabilistic principal component analysis (PPCA) to learn the subspace of Fij
- Under this model, a d-dimensional BTF fij can be written as fij = W y + mean(Fij) + e, where y is a normally distributed q-dimensional subspace variable with q < d, W is a d x q projection matrix, mean(Fij) is the mean of Fij, and e is isotropic Gaussian noise, e ~ N(0, σ²I). Since y and e are normally distributed, fij is distributed as fij ~ N(mean(Fij), Z), where Z = W W^T + σ²I
32. Estimating BTFs, continued...
- W is estimated as W = Uq (Eq - σ²I)^(1/2) R, where the q column vectors of the d x q matrix Uq are the eigenvectors of the sample covariance matrix of Fij, Eq is the q x q diagonal matrix of corresponding eigenvalues, R is an arbitrary rotation matrix (set to the identity), and σ² is estimated from the remaining, discarded eigenvalues
- We can now compute the probability of a particular BTF belonging to the learned subspace of BTFs. This process can be carried out for each color channel separately (a sketch follows below)
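A minimal sketch of the PPCA fit and the resulting BTF likelihood, following the standard Tipping-Bishop estimates; the function and variable names are mine, and evaluating N(f; mean, Z) as a log-density is an assumption about how the probability is used:

```python
import numpy as np

def fit_ppca(F, q):
    """Fit a PPCA model to the BTF collection F (n x d matrix of BTF vectors)."""
    mean = F.mean(axis=0)
    cov = np.cov(F - mean, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)           # ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]   # reorder to descending
    sigma2 = evals[q:].mean()                    # noise from discarded eigenvalues
    # W = Uq (Eq - sigma2*I)^{1/2}, taking R = identity.
    W = evecs[:, :q] @ np.diag(np.sqrt(np.maximum(evals[:q] - sigma2, 0.0)))
    return mean, W, sigma2

def btf_log_likelihood(f, mean, W, sigma2):
    """Log of N(f; mean, Z) with Z = W W^T + sigma2 * I."""
    d = len(mean)
    Z = W @ W.T + sigma2 * np.eye(d)
    diff = f - mean
    _, logdet = np.linalg.slogdet(Z)
    return -0.5 * (d * np.log(2 * np.pi) + logdet
                   + diff @ np.linalg.solve(Z, diff))
```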
33. Incorporating BTFs into Tracking
- Use the BTF to obtain a better estimate of the distance between objects tracked in two separate cameras, as discussed for the first paper
- This provides a better comparison of object appearances, leading to better tracking overall
34. Results
35. Results, continued...
36. A Comparison
37. Histogram Comparison
38. Thank You