1
Object Tracking And SHOSLIF Tree Based
Classification Using Shape And Color
FeaturesAuthor Lucio Marcenaro, Franco Oberti
and Carlo S. Regazzoni
  • CIS 750
  • Advisor Longin Jan Latecki
  • Presented by Venugopal Rajagopal

2
Introduction
  • Main functionalities of a video-surveillance
    system:
  • Detection and tracking of objects acting within
    the guarded environment.
  • Higher-level modules of the system are responsible
    for object and event classification.
  • Shape, color, and motion are the features most
    frequently used to achieve these tasks.
  • Here, shape- and color-related features are used
    for tracking and recognizing objects.
  • Shape is used to classify among different
    postures; it provides a finer discriminant
    feature, allowing objects within the same general
    class to be distinguished.
  • Histograms form the basis for classification
    between different objects.

3
Introduction (contd.)
  • Novel approach for tracking and recognition
  • Corner groups and object histograms are used as
    the basis features for a multilevel shape
    representation.
  • Methods for representing models for the tracking
    and recognition phases are based on Generalized
    Hough Transform and on SHOSLIF trees.

4
System Architecture
[Block diagram of the surveillance-system modules: the sensor (camera)
feeds the low-level IP stage, which produces ROI blobs and corner and
histogram features; the high-level IP stage builds the corner-based
representation and histograms, supported by a short-term memory; the
classification stage uses the SHOSLIF tree.]
5
Low Level Image Processing
  • Performs the first stage of abstraction, from the
    sequence acquired by the sensor to the
    representation used for tracking and
    classification.
  • From the acquired frame, mobile areas of the image
    (blobs) are detected by a frame-background
    difference and analyzed by extracting numerical
    characteristics (e.g., geometrical and shape
    properties).
  • Blob analysis is performed by the following
    modules:
  • Change detection: by using statistical
    morphological operators, it identifies the mobile
    blobs in the image that exhibit a remarkable
    difference with respect to the background.
  • Focus of attention: the minimum bounding rectangle
    (MBR) of each blob in the image is detected using
    a fast image-segmentation algorithm.
  • The history of detected ROIs and blobs is
    maintained as a temporal graph, which is used for
    further processing by the higher-level modules.

6
Detecting BLOBS and MBR
7
Temporal Graph
  • Temporal graph provides information on the
    current bounding boxes and their relations to the
    boxes detected in the previous frames.
  • The nodes at each level are the blobs detected in
    the corresponding frame.
  • Relationships among the blobs belonging to
    different adjacent levels are represented as arcs
    between the nodes.
  • Arcs are inserted on the basis of superposition
    of the blob areas on the image plane. If a blob
    at step (k-1) overlaps a blob at step k, then a
    link between them is created, so that the blob at
    step (k-1) is called "father" of the blob at time
    step k (its "son").

8
Temporal Graph (contd.)
  • Different events can occur:
  • 1) If a blob has only one "father", its type
    is set to "one-overlapping" (type o), and the
    father's label is assigned to it.
  • 2) If a blob has more than one "father", its
    type is set to "merge" (type m), and a new label
    is assigned.
  • 3) If a blob is not the only "son" of its
    father, its type is set to "split" (type s), and
    a new label is assigned.
  • 4) If a blob has no "father", its type is set
    to "new" (type n), and a new label is assigned.

9
Temporal Graph (contd.)
A sequence of images showing critical cases of
blob splitting, merging and displacement. Each
image contains the detected blobs with their
numerical label and type.
10
Temporal Graphs (contd.)
  • Figure showing the bounding boxes and the
    temporal graph representing the correspondences
    between the bounding boxes

11
High Level Image Processing(Corner Extraction)
  • High level image processing extracts
    high-curvature points (corners) and histograms
    from each detected object.
  • General procedure to extract corners
  • Gradient of the input gray-level image is
    computed using the Sobel Operator.
  • Edges are extracted by using the gradient
    magnitude. A pixel of the image is considered to
    be a point of an edge if its gradient magnitude
    is greater than a fixed threshold.
  • If large variation in the direction of the
    gradient is found in a neighborhood of edge
    points, then a corner is detected.
  • Given an image, corners are extracted as follows:
    edges are first extracted using the Sobel filter;
    the maximum variation of the gradient direction of
    the edge points inside a square kernel is
    evaluated; if this maximum variation is greater
    than a threshold, the pixel at the center of the
    kernel is selected as a corner and its gradient
    direction is taken as the corner direction.
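A minimal sketch of this procedure (Sobel gradient, edge threshold,
direction-variation test) is given below; the thresholds and kernel
size are illustrative choices, not values from the paper:

```python
import numpy as np

def sobel_gradients(img):
    """Gradient magnitude and direction via 3x3 Sobel convolution."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    pad = np.pad(img.astype(float), 1, mode='edge')
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            win = pad[y:y + 3, x:x + 3]
            gx[y, x] = (win * kx).sum()
            gy[y, x] = (win * ky).sum()
    return np.hypot(gx, gy), np.arctan2(gy, gx)

def extract_corners(img, edge_thr=100.0, dir_thr=0.5, k=3):
    """Return (y, x, direction) for edge pixels whose square kernel
    shows a large variation of gradient direction among edge points."""
    mag, ang = sobel_gradients(img)
    edges = mag > edge_thr
    h, w = img.shape
    r = k // 2
    corners = []
    for y in range(r, h - r):
        for x in range(r, w - r):
            if not edges[y, x]:
                continue
            win_e = edges[y - r:y + r + 1, x - r:x + r + 1]
            dirs = ang[y - r:y + r + 1, x - r:x + r + 1][win_e]
            if dirs.max() - dirs.min() > dir_thr:  # max direction variation
                corners.append((y, x, ang[y, x]))
    return corners

# A bright square: its sides give uniform gradient directions, while
# its corners mix directions inside the kernel.
img = np.zeros((20, 20))
img[5:15, 5:15] = 255.0
corners = extract_corners(img)
```

On this toy image, pixels at the square's corners are flagged while
pixels in the middle of a straight side are not.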

12
Corner Extraction (contd.)
  • This Figure shows the corner extraction steps
  • Original Image
  • Edges image
  • Corners extracted

13
Tracking and Recognition Modules
  • The system uses a short-term memory, associated
    with the tracking process, and a long-term memory,
    associated with the recognition process.
  • This module performs tasks based on two working
    modalities: learning and matching.
  • The tracking module enters the learning modality
    whenever the object is not overlapped, in order to
    update the short-term object model.
  • The recognition module builds up a self-organizing
    tree during the learning modality.

14
Tracking and Recognition Modules (contd.)
  • Recognition module (learning phase): a set of
    human-classified samples is presented to the tree,
    which automatically organizes them so as to
    maximize the inter-class distances while
    minimizing the intra-class variances.
  • Recognition module (matching phase): the SHOSLIF
    tree is used for object classification; each
    object detected by the lower levels of the system
    is presented to the classification tree, which
    outputs the estimated class for that object and
    the nearest training sample.

15
Generalized Hough Transform (GHT)
  • A technique used to find arbitrary curves in an
    image without having a parametric equation for
    them.
  • A look-up table called the R-table is used to
    model the template shape of the object.
  • This R-table is used as a transform mechanism.
  • To build the R-table, first a reference point and
    several feature points of the shape are selected.

16
GHT (contd.)
Given a shape we wish to localize, the first stage
is to build up a look-up table, known as the
R-table, which replaces the need for a parametric
equation in the transform stage.
17
GHT (contd.)
For each feature point, the orientation omega of
the tangential line at that point, the length r,
and the orientation beta of the radial vector
joining the reference point to the feature point
can be calculated.
18
GHT (contd.)
  • If n is the number of feature points, an indexed
    table of size n x 2 can be created from the n
    pairs (r, beta), using omega as the index.
  • This table is the model of the shape, and it can
    be used with a transformation to find occurrences
    of the same object in other images.
  • The shape is localized using a voting technique.

19
GHT (contd.)
  • Given an unknown image, each edge point is
    extracted and its orientation omega is calculated.
    Using omega as an index into the R-table, every
    (r, beta) pair stored at that entry is retrieved.

20
GHT (contd.)
  • Using each pair (r, beta), the possible position
    of the reference point is computed and the
    accumulator for that position is incremented; the
    maximum accumulator value will, with high
    probability, occur at the actual reference point.
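The R-table construction and voting steps above can be sketched as
follows, with hand-made feature points standing in for real edge data;
the omega bin width is an illustrative assumption:

```python
import numpy as np
from collections import defaultdict

def build_rtable(points, ref, bin_w=0.1):
    """R-table: quantized omega -> list of (r, beta) to the reference
    point. points: iterable of (x, y, omega)."""
    table = defaultdict(list)
    for x, y, omega in points:
        dx, dy = ref[0] - x, ref[1] - y
        r = np.hypot(dx, dy)          # length of the radial vector
        beta = np.arctan2(dy, dx)     # orientation of the radial vector
        table[round(omega / bin_w)].append((r, beta))
    return table

def vote(points, table, shape, bin_w=0.1):
    """Accumulate candidate reference points; the peak marks the
    likely object position."""
    acc = np.zeros(shape, int)
    for x, y, omega in points:
        for r, beta in table.get(round(omega / bin_w), []):
            rx = int(round(x + r * np.cos(beta)))
            ry = int(round(y + r * np.sin(beta)))
            if 0 <= rx < shape[0] and 0 <= ry < shape[1]:
                acc[rx, ry] += 1
    return acc

# Model a toy "shape" of three feature points around reference (5, 5)...
model = [(3, 5, 0.0), (5, 3, 1.0), (7, 5, 2.0)]
table = build_rtable(model, ref=(5, 5))
# ...then search for the same shape translated by (+2, +1).
scene = [(5, 6, 0.0), (7, 4, 1.0), (9, 6, 2.0)]
acc = vote(scene, table, shape=(20, 20))
peak = np.unravel_index(acc.argmax(), acc.shape)
```

All three scene points vote for the same cell, which is the translated
reference point.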

21
Modified GHT
  • In our approach the GHT is modified both to
    automatically extract the model of the object
    (R-table) and to locate the position of the
    object (voting).
  • Corners extracted from the object are used as
    feature points, and a different parameterization
    is used: instead of pairs (r, beta), pairs
    (dx, dy) are used, where dx and dy are the
    differences in x and y with respect to the
    reference point.
  • Instead of an n x 2 indexed table, an n x 3
    table is used.

22
Modified GHT (contd.)
  • The first value of each triplet is the angle
    omega of the gradient vector at the corner
    position in the original image. The resulting
    triplet (omega, dx, dy) models the position and
    orientation of the corner with respect to the
    reference point. In this approach, for each
    observed corner, not all n table entries are
    voted, but only those whose omega is similar to
    the observed one, thus reducing computation time
    and memory requirements.
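A compact sketch of the triplet parameterization: the table is keyed
by quantized omega, so only omega-compatible entries vote. The bin
width and sample data are illustrative assumptions:

```python
from collections import defaultdict

def build_triplet_table(corners, ref, bin_w=0.2):
    """corners: (x, y, omega); returns a table keyed by quantized
    omega, whose values are (dx, dy) offsets to the reference point."""
    table = defaultdict(list)
    for x, y, omega in corners:
        table[round(omega / bin_w)].append((ref[0] - x, ref[1] - y))
    return table

def vote_triplets(corners, table, bin_w=0.2):
    """Vote reference-point candidates only for omega-compatible
    table entries, as in the modified GHT described above."""
    acc = defaultdict(int)
    for x, y, omega in corners:
        for dx, dy in table.get(round(omega / bin_w), []):
            acc[(x + dx, y + dy)] += 1
    return acc

model = [(0, 0, 0.0), (4, 0, 1.0), (0, 4, 2.0)]
table = build_triplet_table(model, ref=(2, 2))
scene = [(10, 5, 0.0), (14, 5, 1.0), (10, 9, 2.0)]  # model shifted by (10, 5)
acc = vote_triplets(scene, table)
best = max(acc, key=acc.get)
```

Because each corner only consults its own omega bin, the voting cost
per corner is far below the full n entries of the classical table.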

23
Corner Based Tracker
  • The output of the low-level image processing
    stage (MBRs and the correspondence graphs) is
    used as the input of the tracking stage, in order
    to detect the objects present in isolated boxes
    once they merge to form a group.
  • The model-learning phase is applied to the
    isolated rectangles in the 2 or 3 frames before
    the union takes place.
  • When the boxes are merged, the matching phase is
    used to find the position of the objects inside
    the merged rectangle.

24
Corner Based Tracker (Learning Phase)
  • The input is the gray-level image of the desired
    object. The center of the gray-level image is
    selected as the reference point.
  • The gradient operator (Sobel) is applied to
    extract edges, and for every edge point the
    direction of the gradient is calculated. Then the
    corners are extracted.
  • For each corner, dx and dy are calculated and
    stored in the R-table, which represents the
    obtained model of the object.
  • For robustness, the previous method is applied to
    different images of the object (frames of a
    sequence), and a unique R-table is constructed by
    selecting the corners that are present in most of
    the images at the same location and with the same
    orientation.
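The corner-persistence step can be sketched as a simple majority
filter over frames; representing "same location and orientation" as
exact tuple equality and the majority threshold are illustrative
assumptions:

```python
from collections import Counter

def persistent_corners(frames, min_frac=0.5):
    """frames: list of per-frame corner lists [(x, y, omega), ...].
    Returns the corners present in more than min_frac of the frames."""
    counts = Counter()
    for corners in frames:
        counts.update(set(corners))     # count each corner once per frame
    need = min_frac * len(frames)
    return [c for c, n in counts.items() if n > need]

frames = [
    [(3, 5, 0.0), (5, 3, 1.0), (9, 9, 2.0)],   # (9, 9, 2.0) is noise
    [(3, 5, 0.0), (5, 3, 1.0)],
    [(3, 5, 0.0), (5, 3, 1.0), (1, 1, 0.5)],   # (1, 1, 0.5) is noise
]
stable = persistent_corners(frames)
```

Only the two corners that recur across the sequence survive into the
unique R-table; spurious single-frame corners are discarded.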

25
Corner Based Tracker (Matching Phase)
  • The inputs are the R-table of the searched object
    and the gray-level image in which the object
    should be present.
  • As in the learning phase, the gradient operator
    is applied to the input image and corners are
    extracted.
  • For every extracted corner, omega is computed
    and, if it is present in the R-table, the
    possible position of the reference point is
    calculated using (dx, dy) and its accumulator is
    incremented. As in the GHT, the reference point
    is found at the maximum accumulator value.

26
Object Classification
  • The long-term recognition module uses the corner
    representation and histogram features extracted
    by the image processing modules (previous steps)
    as a basis for object classification.
  • SHOSLIF (Self-Organizing Hierarchical Optimal
    Subspace Learning and Inference Framework) is the
    tool used for object classification.
  • The input to SHOSLIF is a set of labeled patterns
    X = {(xn, wn)}, n = 1..N, i.e. the training set,
    where xn is a vector of dimensionality K
    representing the observed sample and wn is the
    class associated with xn, chosen from a set of C
    classes.
  • The SHOSLIF algorithm produces as output a tree
    whose nodes contain decreasing sets of samples,
    with the root node containing all samples in X.

27
SHOSLIF
  • Uses the theories of optimal linear projection to
    generate a space defined by the training images.
  • This space is generated using two projections
  • Karhunen-Loeve projection to produce a set
    of Most Expressive Features (MEFs)
  • Subsequent discriminant analysis projection
    to produce a set of Most Discriminating Features
    (MDFs)
  • System builds a network that tessellates these
    MEF/MDF spaces for recognizing objects from
    images.

28
SHOSLIF (contd.)
  • Figure: tree example
  • a) sample partitioning in the feature space
  • b) tree structure

29
SHOSLIF (contd.)Most Expressive Features (MEF)
  • Each input sub-image is treated as a
    high-dimensional feature vector by concatenating
    the rows of the sub-image.
  • Principal component analysis (PCA) is performed
    on the set of training images.
  • PCA uses the eigenvectors of the sample scatter
    matrix associated with the largest eigenvalues.
    These vectors lie in the directions of the major
    variations in the samples, and as such can be
    used as a basis set with which to describe the
    image samples. Using these eigenvectors, an image
    can be reconstructed close to the original.
  • Since the features produced by this projection
    give the minimum square error for approximating
    an image and show good performance in image
    reconstruction, they are called the Most
    Expressive Features.
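A minimal numpy sketch of the MEF computation on toy data; the sizes
and the synthetic samples are illustrative, not the paper's setup:

```python
import numpy as np

def mef_basis(samples, k):
    """samples: (N, K) matrix of flattened images.
    Returns the sample mean and the top-k scatter eigenvectors (MEFs)."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    scatter = centered.T @ centered          # sample scatter matrix
    vals, vecs = np.linalg.eigh(scatter)     # eigenvalues in ascending order
    return mean, vecs[:, ::-1][:, :k]        # keep top-k eigenvectors

rng = np.random.default_rng(0)
# Toy data varying mostly along one direction in a 4-D "image" space.
base = rng.normal(size=4)
samples = np.outer(rng.normal(size=50), base) + 0.01 * rng.normal(size=(50, 4))
mean, V = mef_basis(samples, k=1)
coords = (samples - mean) @ V                # MEF coordinates
recon = mean + coords @ V.T                  # reconstruction from 1 feature
err = np.abs(recon - samples).max()
```

With one MEF the reconstruction error stays at the noise level,
illustrating the minimum-square-error property noted above.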

30
SHOSLIF (contd.)Most Discriminating Features
(MDF)
  • The features produced by the MEF projection are
    not good for discriminating among the classes
    defined by the set of samples (e.g., they fail
    when the same image appears under different
    lighting intensities).
  • So, on the features obtained from the MEF
    projection, linear discriminant analysis (LDA) is
    performed.
  • In LDA, the between-class scatter is maximized
    while the within-class scatter is minimized.
  • The features obtained from LDA optimally
    discriminate among the classes represented in the
    training set; for this reason they are called the
    Most Discriminating Features.
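A minimal numpy sketch of the LDA step on toy two-class data; the
generalized eigenproblem is solved directly, and the data and
dimensions are illustrative assumptions:

```python
import numpy as np

def mdf_basis(X, y, k):
    """X: (N, d) features, y: class labels. Returns a (d, k) LDA basis
    maximizing between-class against within-class scatter."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))                    # within-class scatter
    Sb = np.zeros((d, d))                    # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * diff @ diff.T
    # Solve the generalized eigenproblem Sb v = lambda Sw v.
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(vals.real)[::-1]
    return vecs[:, order[:k]].real

rng = np.random.default_rng(1)
# Two classes separated along the first axis, very noisy along the second.
a = rng.normal([0, 0], [0.2, 2.0], size=(40, 2))
b = rng.normal([3, 0], [0.2, 2.0], size=(40, 2))
X = np.vstack([a, b])
y = np.array([0] * 40 + [1] * 40)
W = mdf_basis(X, y, k=1)
proj = X @ W
gap = proj[y == 1].mean() - proj[y == 0].mean()
```

The discriminating direction ignores the noisy axis, so the projected
class means are well separated.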

31
SHOSLIF (contd.)Tree Construction
  • Each level of the tree has an expected radius
    r(l) of the space it covers, where l is the
    level.
  • d(X, A) is the distance measure between a node N
    with center vector A and a sample vector X.
  • The root node contains all the images from the
    training set.
  • Every node that contains more than a single
    training image computes a projection matrix V,
    obtained by projecting its samples into the MEF
    space.
  • If the training samples contained in a node are
    drawn from multiple classes (as indicated by the
    labels), then the MEF vectors are used to compute
    a projection matrix W, obtained by projecting the
    MEF features into the MDF space.

32
SHOSLIF (contd.)Tree Construction (contd.)
  • If the training samples in a node are from a
    single class, the node is left as it is.
  • Each child node covers the feature vectors that
    fall within its radius.
  • To add a training sample X to a node N at level
    l, it is first checked whether the feature vector
    of X is within the radius covered by one of the
    children of N; if so, X is added as a descendant
    of that child; otherwise, X is added as a new
    child of N.
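The insertion rule can be sketched as follows; the radius schedule
r(l) and the two-dimensional samples are illustrative assumptions:

```python
import math

def radius(level):
    """Expected radius r(l), shrinking with depth (illustrative)."""
    return 4.0 / (2 ** level)

class Node:
    def __init__(self, center, level):
        self.center = center
        self.level = level
        self.children = []

    def insert(self, x):
        """Add sample x below this node, following the radius rule."""
        for child in self.children:
            if math.dist(child.center, x) <= radius(child.level):
                child.insert(x)      # within a child's radius: descend
                return
        # outside every child's radius: x becomes a new child of this node
        self.children.append(Node(x, self.level + 1))

root = Node(center=(0.0, 0.0), level=0)
for x in [(1.0, 0.0), (1.2, 0.1), (10.0, 10.0)]:
    root.insert(x)
```

The second sample falls inside the first child's radius and descends,
while the distant third sample becomes a new child of the root.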

33
SHOSLIF (contd.)Image Retrieval
  • General Flow of each SHOSLIF Processing
    Element

34
Object Classification (contd.)
  • The SHOSLIF setup is used to organize the corners
    extracted from the associated blobs during a
    learning phase.
  • The training set X is represented by a set of
    pairs (corners, class).
  • One problem is that the dimension of the input
    vectors is fixed in a SHOSLIF tree.
  • Feature selection is therefore performed by
    partitioning the corner set C(t) into M regions,
    where M is the desired cardinality of the pattern
    x to be given to the SHOSLIF.

35
Object Classification (contd.)
[Figure: corner partitioning process, showing the per-region corner
counts at each split step.]
  • Corner partitioning process example
  • a) first division along the x-axis
  • b) second division along the y-axis
  • c) third division along the x-axis
  • d) final areas with M = 16

36
Object Classification (contd.)
  • The corner set is reduced by iteratively
    partitioning the blob into two areas, each
    containing the same number of corners.
  • For each region, the vector-median corner of the
    surviving local population is chosen as the
    representative sample.
  • The next figure shows the surviving corners as
    two sets of connected points:
  • a) external closed lines connect the median
    corner points in the outer regions.
  • b) internal lines connect the corner points in
    the inner regions.
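The equal-population partitioning and median-representative selection
can be sketched as below, assuming M is a power of two; the tie
handling and the lexicographic median are simplifying assumptions:

```python
def partition(corners, m, axis=0):
    """Recursively split corners into m regions of (roughly) equal
    population, alternating the x and y split directions."""
    if m == 1:
        return [corners]
    s = sorted(corners, key=lambda c: c[axis])
    half = len(s) // 2
    nxt = 1 - axis                      # alternate x/y split direction
    return partition(s[:half], m // 2, nxt) + partition(s[half:], m // 2, nxt)

def representatives(corners, m):
    """One median corner per region -> fixed-size pattern for the tree."""
    reps = []
    for region in partition(corners, m):
        s = sorted(region)
        reps.append(s[len(s) // 2])     # median corner of the region
    return reps

corners = [(x, y) for x in range(4) for y in range(4)]   # 16 corners
pattern = representatives(corners, m=4)
```

Whatever the number of corners a blob produces, the pattern handed to
the SHOSLIF tree always has exactly M entries.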

37
Object Classification (contd.)
  • Examples of Survived Corners

38
Object Classification (contd.)
  • In this way the vector xn is computed for each
    sample and a class label is associated with it,
    which is given as input to the SHOSLIF tree.

39
Results
  • Training set: 328 samples distributed over the
    classes.
  • Test set: 30 samples.
  • The misdetection probability over the test set
    was 15%.
  • A second test was done by using histograms for
    object identification.
  • The misdetection probability over the test set
    was 8%.

40
Results (contd.)
  • The example figure shows the probe image on the
    left-hand side and the retrieved image on the
    right-hand side.

41
Conclusion
  • A method for tracking and classifying objects in
    a video-surveillance system has been presented.
  • A corner based shape model is used for tracking
    and for recognizing an object.
  • Classification is performed by using SHOSLIF
    trees.
  • Computed misdetection probabilities confirm the
    correctness of the proposed approach.

42
References
  1. A. Tesei, A. Teschioni, C.S. Regazzoni and G.
    Vernazza, "Long Memory Matching of Interacting
    Complex Objects from Real Image Sequences."
  2. F. Oberti and C.S. Regazzoni, "Real-Time Robust
    Detection of Moving Objects in Cluttered Scenes."
  3. D.L. Swets and J. Weng, "Hierarchical
    Discriminant Analysis for Image Retrieval."