1
Object Tracking And SHOSLIF Tree Based
Classification Using Shape And Color
FeaturesAuthor Lucio Marcenaro, Franco Oberti
and Carlo S. Regazzoni
  • CIS 750
  • Advisor Longin Jan Latecki
  • Presented by Venugopal Rajagopal

2
Introduction
  • Main functionalities of a video-surveillance
    system:
  • Detection and tracking of objects acting within
    the guarded environment.
  • Higher-level modules of the system are responsible
    for object and event classification.
  • Shape, color, and motion are the features most
    frequently used to achieve these tasks.
  • Here, shape- and color-related features are used
    for tracking and recognizing objects.
  • Shape is used to classify among different
    postures; it provides a finer discriminant
    feature, allowing objects within the same general
    class to be distinguished.
  • Histograms form the basis for classification
    between different objects.

3
Introduction (contd.)
  • Novel approach for tracking and recognition
  • Corner groups and object histograms are used as
    the basis features for a multilevel shape
    representation.
  • Methods for representing models for the tracking
    and recognition phases are based on Generalized
    Hough Transform and on SHOSLIF trees.

4
System Architecture
[Block diagram of the surveillance-system modules: the sensor (camera)
feeds the low-level IP stage, which produces ROI blobs and corner and
histogram features; the high-level IP stage builds the corner-based
representation and histograms, supported by a short-term memory; the
classification stage uses the SHOSLIF tree.]
5
Low Level Image Processing
  • Performs the first stage of abstraction, from the
    sequence acquired by the sensor to the
    representation used for tracking and
    classification.
  • From the acquired frame, mobile areas of the image
    (blobs) are detected by a frame-background
    difference and analyzed by extracting numerical
    characteristics (e.g., geometrical and shape
    properties).
  • Blob analysis is performed by the following
    modules:
  • Change detection: by using statistical
    morphological operators, it identifies the mobile
    blobs in the image that exhibit a remarkable
    difference with respect to the background.
  • Focus of attention: the minimum bounding rectangle
    (MBR) of each blob in the image is detected using
    a fast image-segmentation algorithm.
  • The history of detected ROIs and blobs is
    maintained as a temporal graph, which is used for
    further processing by the higher-level modules.

6
Detecting BLOBS and MBR
7
Temporal Graph
  • Temporal graph provides information on the
    current bounding boxes and their relations to the
    boxes detected in the previous frames.
  • The nodes at each level are the blobs detected in
    the corresponding frame.
  • Relationships among the blobs belonging to
    different adjacent levels are represented as arcs
    between the nodes.
  • Arcs are inserted on the basis of superposition
    of the blob areas on the image plane. If a blob
    at step (k-1) overlaps a blob at step k, then a
    link between them is created, so that the blob at
    step (k-1) is called "father" of the blob at time
    step k (its "son").

8
Temporal Graph (contd.)
  • Different events can occur:
  • 1) If a blob has only one "father", its type
    is set to "one-overlapping" (type o), and the
    father's label is assigned to it.
  • 2) If a blob has more than one "father", its
    type is set to "merge" (type m), and a new label
    is assigned.
  • 3) If a blob is not the only "son" of its
    father, its type is set to "split" (type s), and
    a new label is assigned.
  • 4) If a blob has no "father", its type is set
    to "new" (type n), and a new label is assigned.

9
Temporal Graph (contd.)
A sequence of images showing critical cases of
blob splitting, merging and displacement. Each
image contains the detected blobs with their
numerical label and type.
10
Temporal Graphs (contd.)
  • Figure showing the bounding boxes and the
    temporal graph representing the correspondences
    between the bounding boxes

11
High Level Image Processing(Corner Extraction)
  • High level image processing extracts
    high-curvature points (corners) and histograms
    from each detected object.
  • General procedure to extract corners
  • Gradient of the input gray-level image is
    computed using the Sobel Operator.
  • Edges are extracted by using the gradient
    magnitude. A pixel of the image is considered to
    be a point of an edge if its gradient magnitude
    is greater than a fixed threshold.
  • If large variation in the direction of the
    gradient is found in a neighborhood of edge
    points, then a corner is detected.
  • Given an image, corners are extracted as follows:
    edges are first extracted using the Sobel filter;
    the maximum variation of the gradient direction of
    the edge points inside a square kernel is
    evaluated; if this maximum variation is greater
    than a threshold, the pixel at the center of the
    kernel is selected as a corner and its gradient
    direction is taken as the corner direction.
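A minimal sketch of this procedure (Sobel gradient, edge threshold,
direction-variation test) is given below; the thresholds and kernel
size are illustrative choices, not values from the paper:

```python
import numpy as np

def sobel_gradients(img):
    """Gradient magnitude and direction via 3x3 Sobel convolution."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    pad = np.pad(img.astype(float), 1, mode='edge')
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            win = pad[y:y + 3, x:x + 3]
            gx[y, x] = (win * kx).sum()
            gy[y, x] = (win * ky).sum()
    return np.hypot(gx, gy), np.arctan2(gy, gx)

def extract_corners(img, edge_thr=100.0, dir_thr=0.5, k=3):
    """Return (y, x, direction) for edge pixels whose square kernel
    shows a large variation of gradient direction among edge points."""
    mag, ang = sobel_gradients(img)
    edges = mag > edge_thr
    h, w = img.shape
    r = k // 2
    corners = []
    for y in range(r, h - r):
        for x in range(r, w - r):
            if not edges[y, x]:
                continue
            win_e = edges[y - r:y + r + 1, x - r:x + r + 1]
            dirs = ang[y - r:y + r + 1, x - r:x + r + 1][win_e]
            if dirs.max() - dirs.min() > dir_thr:  # max direction variation
                corners.append((y, x, ang[y, x]))
    return corners

# A bright square: its sides give uniform gradient directions, while
# its corners mix directions inside the kernel.
img = np.zeros((20, 20))
img[5:15, 5:15] = 255.0
corners = extract_corners(img)
```

On this toy image, pixels at the square's corners are flagged while
pixels in the middle of a straight side are not.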

12
Corner Extraction (contd.)
  • This Figure shows the corner extraction steps
  • Original Image
  • Edges image
  • Corners extracted

13
Tracking and Recognition Modules
  • The system uses a short-term memory, associated
    with the tracking process, and a long-term memory,
    associated with the recognition process.
  • This module performs tasks based on two working
    modalities: learning and matching.
  • The tracking module enters the learning modality
    whenever the object is not overlapped, in order to
    update the short-term object model.
  • The recognition module builds up a self-organizing
    tree during the learning modality.

14
Tracking and Recognition Modules (contd.)
  • Recognition module (learning phase): a set of
    human-classified samples is presented to the tree,
    which automatically organizes them so as to
    maximize the inter-class distances while
    minimizing the intra-class variances.
  • Recognition module (matching phase): the SHOSLIF
    tree is used for object classification; each
    object detected by the lower levels of the system
    is presented to the classification tree, which
    outputs the estimated class for that object and
    the nearest training sample.

15
Generalized Hough Transform (GHT)
  • A technique used to find arbitrary curves in an
    image without having a parametric equation for
    them.
  • A look-up table called the R-table is used to
    model the template shape of the object.
  • This R-table is used as a transform mechanism.
  • To build the R-table, first a reference point and
    several feature points of the shape are selected.

16
GHT (contd.)
Given a shape we wish to localize, the first stage
is to build up a look-up table, known as the
R-table, which replaces the need for a parametric
equation in the transform stage.
17
GHT (contd.)
For each feature point, the orientation omega of
the tangential line at that point, the length r,
and the orientation beta of the radial vector
joining the reference point to the feature point
can be calculated.
18
GHT (contd.)
  • If n is the number of feature points, an indexed
    table of size n x 2 can be created from the n
    pairs (r, beta), using omega as the index.
  • This table is the model of the shape, and it can
    be used with a transformation to find occurrences
    of the same object in other images.
  • The shape is localized using a voting technique.

19
GHT (contd.)
  • Given an unknown image, each edge point is
    extracted and its orientation omega is calculated.
    Using omega as an index into the R-table, every
    (r, beta) pair stored at that entry is retrieved.

20
GHT (contd.)
  • Using each pair (r, beta), the possible position
    of the reference point is computed and the
    accumulator for that position is incremented; the
    maximum accumulator value will, with high
    probability, occur at the actual reference point.
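The R-table construction and voting steps above can be sketched as
follows, with hand-made feature points standing in for real edge data;
the omega bin width is an illustrative assumption:

```python
import numpy as np
from collections import defaultdict

def build_rtable(points, ref, bin_w=0.1):
    """R-table: quantized omega -> list of (r, beta) to the reference
    point. points: iterable of (x, y, omega)."""
    table = defaultdict(list)
    for x, y, omega in points:
        dx, dy = ref[0] - x, ref[1] - y
        r = np.hypot(dx, dy)          # length of the radial vector
        beta = np.arctan2(dy, dx)     # orientation of the radial vector
        table[round(omega / bin_w)].append((r, beta))
    return table

def vote(points, table, shape, bin_w=0.1):
    """Accumulate candidate reference points; the peak marks the
    likely object position."""
    acc = np.zeros(shape, int)
    for x, y, omega in points:
        for r, beta in table.get(round(omega / bin_w), []):
            rx = int(round(x + r * np.cos(beta)))
            ry = int(round(y + r * np.sin(beta)))
            if 0 <= rx < shape[0] and 0 <= ry < shape[1]:
                acc[rx, ry] += 1
    return acc

# Model a toy "shape" of three feature points around reference (5, 5)...
model = [(3, 5, 0.0), (5, 3, 1.0), (7, 5, 2.0)]
table = build_rtable(model, ref=(5, 5))
# ...then search for the same shape translated by (+2, +1).
scene = [(5, 6, 0.0), (7, 4, 1.0), (9, 6, 2.0)]
acc = vote(scene, table, shape=(20, 20))
peak = np.unravel_index(acc.argmax(), acc.shape)
```

All three scene points vote for the same cell, which is the translated
reference point.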

21
Modified GHT
  • In our approach the GHT is modified both to
    automatically extract the model of the object
    (R-table) and to locate the position of the
    object (voting).
  • Corners extracted from the object are used as
    feature points, and a different parameterization
    is used: instead of pairs (r, beta), pairs
    (dx, dy) are used, where dx and dy are the
    differences in x and y with respect to the
    reference point.
  • Instead of an n x 2 indexed table, an n x 3
    table is used.

22
Modified GHT (contd.)
  • The first value of each triplet is the angle
    omega of the gradient vector at the corner
    position in the original image. The resulting
    triplet (omega, dx, dy) models the position and
    orientation of the corner with respect to the
    reference point. In this approach, for each
    observed corner, not all n table entries are
    voted, but only those whose omega is similar to
    the observed one, thus reducing computation time
    and memory requirements.
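A compact sketch of the triplet parameterization: the table is keyed
by quantized omega, so only omega-compatible entries vote. The bin
width and sample data are illustrative assumptions:

```python
from collections import defaultdict

def build_triplet_table(corners, ref, bin_w=0.2):
    """corners: (x, y, omega); returns a table keyed by quantized
    omega, whose values are (dx, dy) offsets to the reference point."""
    table = defaultdict(list)
    for x, y, omega in corners:
        table[round(omega / bin_w)].append((ref[0] - x, ref[1] - y))
    return table

def vote_triplets(corners, table, bin_w=0.2):
    """Vote reference-point candidates only for omega-compatible
    table entries, as in the modified GHT described above."""
    acc = defaultdict(int)
    for x, y, omega in corners:
        for dx, dy in table.get(round(omega / bin_w), []):
            acc[(x + dx, y + dy)] += 1
    return acc

model = [(0, 0, 0.0), (4, 0, 1.0), (0, 4, 2.0)]
table = build_triplet_table(model, ref=(2, 2))
scene = [(10, 5, 0.0), (14, 5, 1.0), (10, 9, 2.0)]  # model shifted by (10, 5)
acc = vote_triplets(scene, table)
best = max(acc, key=acc.get)
```

Because each corner only consults its own omega bin, the voting cost
per corner is far below the full n entries of the classical table.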

23
Corner Based Tracker
  • The output of the low-level image processing
    stage (MBRs and the correspondence graphs) is
    used as the input of the tracking stage, in order
    to detect the objects present in isolated boxes
    once they merge to form a group.
  • The model-learning phase is applied to the
    isolated rectangles in the 2 or 3 frames before
    the union takes place.
  • When the boxes are merged, the matching phase is
    used to find the position of the objects inside
    the merged rectangle.

24
Corner Based Tracker (Learning Phase)
  • The input is the gray-level image of the desired
    object. The center of the gray-level image is
    selected as the reference point.
  • The gradient operator (Sobel) is applied to
    extract edges, and for every edge point the
    direction of the gradient is calculated. Then the
    corners are extracted.
  • For each corner, dx and dy are calculated and
    stored in the R-table, which represents the
    obtained model of the object.
  • For robustness, the previous method is applied to
    different images of the object (frames of a
    sequence), and a unique R-table is constructed by
    selecting the corners that are present in most of
    the images at the same location and with the same
    orientation.
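The corner-persistence step can be sketched as a simple majority
filter over frames; representing "same location and orientation" as
exact tuple equality and the majority threshold are illustrative
assumptions:

```python
from collections import Counter

def persistent_corners(frames, min_frac=0.5):
    """frames: list of per-frame corner lists [(x, y, omega), ...].
    Returns the corners present in more than min_frac of the frames."""
    counts = Counter()
    for corners in frames:
        counts.update(set(corners))     # count each corner once per frame
    need = min_frac * len(frames)
    return [c for c, n in counts.items() if n > need]

frames = [
    [(3, 5, 0.0), (5, 3, 1.0), (9, 9, 2.0)],   # (9, 9, 2.0) is noise
    [(3, 5, 0.0), (5, 3, 1.0)],
    [(3, 5, 0.0), (5, 3, 1.0), (1, 1, 0.5)],   # (1, 1, 0.5) is noise
]
stable = persistent_corners(frames)
```

Only the two corners that recur across the sequence survive into the
unique R-table; spurious single-frame corners are discarded.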

25
Corner Based Tracker (Matching Phase)
  • The inputs are the R-table of the searched object
    and the gray-level image in which the object
    should be present.
  • As in the learning phase, the gradient operator
    is applied to the input image and corners are
    extracted.
  • For every extracted corner, omega is computed
    and, if it is present in the R-table, the
    possible position of the reference point is
    calculated using (dx, dy) and its accumulator is
    incremented. As in the GHT, the reference point
    is found at the maximum accumulator value.

26
Object Classification
  • The long-term recognition module uses the corner
    representation and histogram features extracted
    by the image processing modules (previous steps)
    as a basis for object classification.
  • SHOSLIF (Self-Organizing Hierarchical Optimal
    Subspace Learning and Inference Framework) is the
    tool used for object classification.
  • The input to SHOSLIF is a set of labeled patterns
    X = {(xn, wn)}, n = 1..N, i.e. the training set,
    where xn is a vector of dimensionality K
    representing the observed sample and wn is the
    class associated with xn, chosen from a set of C
    classes.
  • The SHOSLIF algorithm produces as output a tree
    whose nodes contain decreasing sets of samples,
    with the root node containing all samples in X.

27
SHOSLIF
  • Uses the theories of optimal linear projection to
    generate a space defined by the training images.
  • This space is generated using two projections
  • Karhunen-Loeve projection to produce a set
    of Most Expressive Features (MEFs)
  • Subsequent discriminant analysis projection
    to produce a set of Most Discriminating Features
    (MDFs)
  • System builds a network that tessellates these
    MEF/MDF spaces for recognizing objects from
    images.

28
SHOSLIF (contd.)
  • Figure: tree example
  • a) sample partitioning in the feature space
  • b) tree structure

29
SHOSLIF (contd.)Most Expressive Features (MEF)
  • Each input sub-image is treated as a
    high-dimensional feature vector by concatenating
    the rows of the sub-image.
  • Principal component analysis (PCA) is performed
    on the set of training images.
  • PCA uses the eigenvectors of the sample scatter
    matrix associated with the largest eigenvalues.
    These vectors lie in the directions of the major
    variations in the samples, and as such can be
    used as a basis set with which to describe the
    image samples. Using these eigenvectors, an image
    can be reconstructed close to the original.
  • Since the features produced by this projection
    give the minimum square error for approximating
    an image and show good performance in image
    reconstruction, they are called the Most
    Expressive Features.
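A minimal numpy sketch of the MEF computation on toy data; the sizes
and the synthetic samples are illustrative, not the paper's setup:

```python
import numpy as np

def mef_basis(samples, k):
    """samples: (N, K) matrix of flattened images.
    Returns the sample mean and the top-k scatter eigenvectors (MEFs)."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    scatter = centered.T @ centered          # sample scatter matrix
    vals, vecs = np.linalg.eigh(scatter)     # eigenvalues in ascending order
    return mean, vecs[:, ::-1][:, :k]        # keep top-k eigenvectors

rng = np.random.default_rng(0)
# Toy data varying mostly along one direction in a 4-D "image" space.
base = rng.normal(size=4)
samples = np.outer(rng.normal(size=50), base) + 0.01 * rng.normal(size=(50, 4))
mean, V = mef_basis(samples, k=1)
coords = (samples - mean) @ V                # MEF coordinates
recon = mean + coords @ V.T                  # reconstruction from 1 feature
err = np.abs(recon - samples).max()
```

With one MEF the reconstruction error stays at the noise level,
illustrating the minimum-square-error property noted above.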

30
SHOSLIF (contd.)Most Discriminating Features
(MDF)
  • The features produced by the MEF projection are
    not good for discriminating among the classes
    defined by the set of samples (e.g., they fail
    when the same image appears under different
    lighting intensities).
  • So, on the features obtained from the MEF
    projection, linear discriminant analysis (LDA) is
    performed.
  • In LDA, the between-class scatter is maximized
    while the within-class scatter is minimized.
  • The features obtained from LDA optimally
    discriminate among the classes represented in the
    training set; for this reason they are called the
    Most Discriminating Features.
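A minimal numpy sketch of the LDA step on toy two-class data; the
generalized eigenproblem is solved directly, and the data and
dimensions are illustrative assumptions:

```python
import numpy as np

def mdf_basis(X, y, k):
    """X: (N, d) features, y: class labels. Returns a (d, k) LDA basis
    maximizing between-class against within-class scatter."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))                    # within-class scatter
    Sb = np.zeros((d, d))                    # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * diff @ diff.T
    # Solve the generalized eigenproblem Sb v = lambda Sw v.
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(vals.real)[::-1]
    return vecs[:, order[:k]].real

rng = np.random.default_rng(1)
# Two classes separated along the first axis, very noisy along the second.
a = rng.normal([0, 0], [0.2, 2.0], size=(40, 2))
b = rng.normal([3, 0], [0.2, 2.0], size=(40, 2))
X = np.vstack([a, b])
y = np.array([0] * 40 + [1] * 40)
W = mdf_basis(X, y, k=1)
proj = X @ W
gap = proj[y == 1].mean() - proj[y == 0].mean()
```

The discriminating direction ignores the noisy axis, so the projected
class means are well separated.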

31
SHOSLIF (contd.)Tree Construction
  • Each level of the tree has an expected radius
    r(l) of the space it covers, where l is the
    level.
  • d(X, A) is the distance measure between a node N
    with center vector A and a sample vector X.
  • The root node contains all the images from the
    training set.
  • Every node that contains more than a single
    training image computes a projection matrix V,
    obtained by projecting its samples into the MEF
    space.
  • If the training samples contained in a node are
    drawn from multiple classes (as indicated by the
    labels), then the MEF vectors are used to compute
    a projection matrix W, obtained by projecting the
    MEF features into the MDF space.

32
SHOSLIF (contd.)Tree Construction (contd.)
  • If the training samples in a node are from a
    single class, the node is left as it is.
  • Each child node covers the feature vectors that
    fall within its radius.
  • To add a training sample X to a node N at level
    l, it is first checked whether the feature vector
    of X is within the radius covered by one of the
    children of N; if so, X is added as a descendant
    of that child; otherwise, X is added as a new
    child of N.
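The insertion rule can be sketched as follows; the radius schedule
r(l) and the two-dimensional samples are illustrative assumptions:

```python
import math

def radius(level):
    """Expected radius r(l), shrinking with depth (illustrative)."""
    return 4.0 / (2 ** level)

class Node:
    def __init__(self, center, level):
        self.center = center
        self.level = level
        self.children = []

    def insert(self, x):
        """Add sample x below this node, following the radius rule."""
        for child in self.children:
            if math.dist(child.center, x) <= radius(child.level):
                child.insert(x)      # within a child's radius: descend
                return
        # outside every child's radius: x becomes a new child of this node
        self.children.append(Node(x, self.level + 1))

root = Node(center=(0.0, 0.0), level=0)
for x in [(1.0, 0.0), (1.2, 0.1), (10.0, 10.0)]:
    root.insert(x)
```

The second sample falls inside the first child's radius and descends,
while the distant third sample becomes a new child of the root.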

33
SHOSLIF (contd.)Image Retrieval
  • General Flow of each SHOSLIF Processing
    Element

34
Object Classification (contd.)
  • The SHOSLIF setup is used to organize the corners
    extracted from the associated blobs during a
    learning phase.
  • The training set X is represented by a set of
    pairs (corners, class).
  • One problem is that the dimension of the input
    vectors is fixed in a SHOSLIF tree.
  • Feature selection is therefore performed by
    partitioning the corner set C(t) into M regions,
    where M is the desired cardinality of the pattern
    x to be given to the SHOSLIF.

35
Object Classification (contd.)
[Figure: corner partitioning process, showing the per-region corner
counts at each split step.]
  • Corner partitioning process example
  • a) first division along the x-axis
  • b) second division along the y-axis
  • c) third division along the x-axis
  • d) final areas with M = 16

36
Object Classification (contd.)
  • The corner set is reduced by iteratively
    partitioning the blob into two areas, each
    containing the same number of corners.
  • For each region, the vector-median corner of the
    surviving local population is chosen as the
    representative sample.
  • The next figure shows the surviving corners as
    two sets of connected points:
  • a) external closed lines connect the median
    corner points in the outer regions.
  • b) internal lines connect the corner points in
    the inner regions.
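The equal-population partitioning and median-representative selection
can be sketched as below, assuming M is a power of two; the tie
handling and the lexicographic median are simplifying assumptions:

```python
def partition(corners, m, axis=0):
    """Recursively split corners into m regions of (roughly) equal
    population, alternating the x and y split directions."""
    if m == 1:
        return [corners]
    s = sorted(corners, key=lambda c: c[axis])
    half = len(s) // 2
    nxt = 1 - axis                      # alternate x/y split direction
    return partition(s[:half], m // 2, nxt) + partition(s[half:], m // 2, nxt)

def representatives(corners, m):
    """One median corner per region -> fixed-size pattern for the tree."""
    reps = []
    for region in partition(corners, m):
        s = sorted(region)
        reps.append(s[len(s) // 2])     # median corner of the region
    return reps

corners = [(x, y) for x in range(4) for y in range(4)]   # 16 corners
pattern = representatives(corners, m=4)
```

Whatever the number of corners a blob produces, the pattern handed to
the SHOSLIF tree always has exactly M entries.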

37
Object Classification (contd.)
  • Examples of Survived Corners

38
Object Classification (contd.)
  • In this way the vector xn is computed for each
    sample and a class label is associated with it,
    which is given as input to the SHOSLIF tree.

39
Results
  • Training set: 328 samples distributed over the
    classes.
  • Test set: 30 samples.
  • The misdetection probability over the test set
    was 15%.
  • A second test was done by using histograms for
    object identification.
  • The misdetection probability over the test set
    was 8%.

40
Results (contd.)
  • The example figure shows the probe image on the
    left-hand side and the retrieved image on the
    right-hand side.

41
Conclusion
  • A method for tracking and classifying objects in
    a video-surveillance system has been presented.
  • A corner based shape model is used for tracking
    and for recognizing an object.
  • Classification is performed by using SHOSLIF
    trees.
  • Computed misdetection probabilities confirm the
    correctness of the proposed approach.

42
References
  1. A. Tesei, A. Teschioni, C.S. Regazzoni and G.
    Vernazza, "Long Memory Matching of Interacting
    Complex Objects from Real Image Sequences."
  2. F. Oberti and C.S. Regazzoni, "Real-Time Robust
    Detection of Moving Objects in Cluttered Scenes."
  3. D.L. Swets and J. Weng, "Hierarchical
    Discriminant Analysis for Image Retrieval."