Object Tracking - PowerPoint PPT Presentation

Slides: 59

Provided by: Rajd

Transcript and Presenter's Notes

Title: Object Tracking

1
Object Tracking

By Rajdeep Bondade

2
Tracking

Tracking is the task of estimating the trajectory
of an object in the image plane as it moves
around a scene
It is important in the field of computer vision
and artificial intelligence
Interest is driven by high-powered computers,
high-quality and inexpensive video cameras, and
the need for automated video analysis

3
Applications of Tracking

Motion-based recognition
Automated surveillance
Video indexing
Human-computer interaction
Traffic monitoring
Vehicle navigation
Automatic target recognition in military domain
Visual inspection in industries

4
Classification

Tracking is broadly divided into Point Tracking,
Kernel Tracking and Silhouette Tracking

5
Difficulties Faced

Tracking is highly computationally intensive
Algorithms involve multiple correlations,
convolutions and other complex operations
These operations are slow to perform on
a microprocessor
Microprocessors are serial in nature, but most
of these operations are inherently parallel

6
Possible Solutions

Implement the parallel operations in hardware
using ASICs
Not flexible
Cannot change the template object
Difficult to modify parameters such as size and
shape of object being tracked
Implement using FPGAs and reconfigurable
computing
Provides a tradeoff between speed of hardware and
flexibility of software

7
Classical Image tracking system

Detection is a decision making process
Tracking involves associating discrete detections
over time to form a track path
Recognition uses the results of detection and
tracking to classify the object

8
Template Matching

It is an object detection technique used to find
an object in a search image
Correlate the template image with the scene image
and find the location where the resulting mismatch
measure is minimum
Software solutions are flexible but are too slow

9
Paper 1: Reconfigurable Shape Adaptive Template
Matching

Jörn Gause, Peter Y. K. Cheung, Wayne Luk
Imperial College, London
IEEE Symposium on Field Programmable Custom
Computing Machines
2002

10
Objective

Reconfigurable strategies for Shape-Adaptive
Template Matching to detect arbitrarily shaped
objects in images/video frames
Static Design
Template is stored on off-chip memory
Partially Dynamic Design
Template is stored in on-chip memory, allowing
reconfiguration
Dynamic Design
Configuration data is completely adapted to shape
and size of the template

11
Purpose

Algorithm is truly object oriented, i.e. it
depends only on the template used
Software solutions may provide the flexibility
but are too slow for real-time video processing
ASIC implementation is not practical because the
number of possible template sizes is unlimited
Thus, a reconfigurable architecture is proposed
to implement a fast and flexible SA-TM design

12
Prior Work and their Conclusions

Work has been performed using dynamic
reconfiguration leading to acceleration and
effective logic capacity usage
Computer vision algorithms
not shape adaptive
applied to small images (512 X 512)
Automatic Target Recognition
Binary templates (16 X 16) and small image (128 X
128)
Run-time reconfiguration decreases area-execution
time product if the search image is large enough
Dynamic Reconfiguration suitable to
shape-adaptive algorithms if reconfiguration
overhead is small

13
Shape Adaptive Template Matching

Aim
To find a template object of arbitrary shape and
size within a search image or video frame of any
size using a reconfigurable computing
architecture
Search image consists of $W \times H$ pixels
The template consists of p opaque pixels and can
have any shape. It is bounded by a box of size
$w \times h$
In the bounding box, each pixel has one mask bit:
1 if the pixel belongs to the object and 0
otherwise

14
Shape Adaptive Template Matching

Template is shifted over the image over
$(W - w + 1)(H - h + 1)$ locations
Sum of Absolute Distances over luminance pixel
values is chosen as the comparison metric
The match is found when SAD(y,x) is minimum and
smaller than a certain threshold
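
The matching rule on this slide can be sketched in software as follows. This is a minimal, unoptimized illustration and not the systolic design from the paper; the function name `sad_match` and the list-of-lists image representation are assumptions made for the example.

```python
def sad_match(image, template, mask, threshold):
    """Return (y, x, sad) for the best match, or None if it is not below threshold.

    image, template and mask are lists of rows; mask[i][j] is 1 for opaque
    template pixels and 0 for transparent ones (hypothetical representation).
    """
    H, W = len(image), len(image[0])
    h, w = len(template), len(template[0])
    best = None
    # Visit all (H - h + 1) * (W - w + 1) template positions.
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            # Sum absolute distances over opaque template pixels only.
            sad = sum(
                abs(image[y + i][x + j] - template[i][j])
                for i in range(h)
                for j in range(w)
                if mask[i][j]
            )
            if best is None or sad < best[2]:
                best = (y, x, sad)
    if best is not None and best[2] < threshold:
        return best
    return None
```

Transparent pixels are skipped entirely, so only the p opaque pixels contribute to each SAD value.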

15
Systolic Array for SA-TM

Data appears in a horizontal raster scan fashion
The AD computations for a position, along with the
clock cycle and the SAD, are shown

16
Systolic Array for SA-TM

A signal flow graph representing the previous
example is shown below
The node <i,j> represents $|I(y,x) - T(i,j)|$
The pixel values I(y,x) are broadcast
sequentially to all PEs and all computations are
performed in parallel
At the end of 42 clock cycles, all 20 SAD values
will be computed
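
The two counts quoted above can be checked with a short calculation. It assumes the example uses W = 7, H = 6 and w = h = 3 (the sizes quoted for the small example in the results), and that one pixel is broadcast per clock cycle; the helper name `raster_scan_counts` is hypothetical.

```python
def raster_scan_counts(W, H, w, h):
    """Return (clock cycles, number of SAD values) for one frame.

    Assumes one image pixel is broadcast per clock cycle in raster order.
    """
    cycles = W * H                          # one cycle per image pixel
    positions = (W - w + 1) * (H - h + 1)   # one SAD value per template position
    return cycles, positions


print(raster_scan_counts(7, 6, 3, 3))
```

Under these assumptions the result is 42 cycles and 20 SAD values, matching the figures on the slide.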

17
Structure of PE for SA-TM

The following general systolic array, adapted to
the shape of the template object is presented
Each pixel belonging to a template is represented
by a PE
The template pixel value is stored in the ROM
within the PE

18
Structure of PE for SA-TM

Size of Sum_in and Sum_out depend on the position
of PE in the SFG
N max(m,c)
Max. distance in one PE is $2^c - 1$
Max. intermediate sum of k of the ADs is
$(2^c - 1)k$
This requires $\lceil \log_2((2^c - 1)k + 1) \rceil$ bits
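
As a rough check of how the word length grows along the chain of PEs, the sketch below computes the number of bits needed to hold the largest possible intermediate sum of k absolute distances for c-bit pixels. The function name `sum_word_length` is an assumption for this example, and the paper's exact sizing rule may differ.

```python
def sum_word_length(k, c=8):
    """Bits needed for the largest possible sum of k absolute distances.

    Each absolute distance between two c-bit pixels is at most 2**c - 1.
    """
    max_sum = (2**c - 1) * k
    # int.bit_length() gives the minimum number of bits for an unsigned value.
    return max_sum.bit_length()


for k in (1, 2, 8, 257):
    print(k, sum_word_length(k))
```

For c = 8 the first PE needs 8 bits, the second 9 bits, and the eighth 11 bits, so adders later in the chain are wider than earlier ones.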

19
Area Calculations

Area of a PE contains a constant part (AD) and a
variable part which grows with n,
where a and b are constants
The total area to implement p PEs is given by

20
Area Calculations

The area then simplifies
to
where

21
Further Area Calculations

Registers are required to delay the intermediate
sums
PEs and registers are arranged according to the
mask of the template
After each line of PEs, W-w shift registers are
needed

22
Further Area Calculations

$w \times h - p$ pixels require shift registers
$W - w$ pixels require shift registers

23
Summary of Structure

p PEs are required for AD computations and
summation of intermediate results
PEs are arranged in the same way as the pixels
of the template
$w \times h - p$ gaps represent transparent pixels filled
with registers
$W - w$ shift registers are required in each but the
last row to store intermediate sums
The size of the kth adder, where $1 \le k \le p$, is given
by

24
Reconfigurable design strategies: DYNAMIC DESIGN

Reconfigured for every possible template size and
shape and search frame size
The template is a part of the configuration data
and word lengths can be optimized
Since one input to the AD module is constant, the
module can be replaced by a look-up table, which
stores the AD value for each I(y,x)
p PEs and $(w \times h - p) + (W - w)(h - 1)$ registers are
required
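
The look-up-table idea and the resource count for the dynamic design can be illustrated as follows. Both helper names are hypothetical, and the table is a plain Python list standing in for the on-chip LUT contents.

```python
def ad_lookup_table(template_pixel, c=8):
    """Precompute |v - template_pixel| for every possible c-bit image value v."""
    return [abs(v - template_pixel) for v in range(2**c)]


def dynamic_design_resources(W, w, h, p):
    """Return (number of PEs, number of registers) for the dynamic design."""
    gap_registers = w * h - p          # transparent pixels inside the bounding box
    row_registers = (W - w) * (h - 1)  # delay between template rows, except the last
    return p, gap_registers + row_registers


lut = ad_lookup_table(100)
print(lut[0], lut[100], lut[255])
print(dynamic_design_resources(W=7, w=3, h=3, p=8))
```

Because the template pixel is fixed at configuration time, the subtraction and absolute value collapse into a single table read indexed by the incoming image pixel.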

25
Reconfigurable design strategies: DYNAMIC DESIGN

The area required is
The total execution time TD consists of the
computation time, the reconfiguration time and
the compilation time
The execution time for N frames is

26
Reconfigurable design strategies: STATIC DESIGN

Dynamic design useful when the template is
searched for in a large number of video frames of
the same size
In static design, FPGA configuration is not
changed when a new template is used
Since the number of search frame sizes and
template shapes and sizes is unlimited, only a
subset of all solutions is implemented

27
Reconfigurable design strategies: STATIC DESIGN

If the search frame size is fixed, the following
PE structure is used
Template pixel values T(i,j) come from external
memory and a multiplexer is used to select
whether addition or delay is performed

28
Reconfigurable design strategies: STATIC DESIGN

The area for the static design is
The execution time is given by
Advantages: no recompilation of the design code
or reconfiguration of the device
Disadvantages:
Large external RAM, which stores template pixels
and mask bits, makes the design slower
For large frame sizes, the number of I/O pins
required is extremely large

29
Reconfigurable design strategies: PARTIALLY
DYNAMIC DESIGN

Combines the advantages of both static and
dynamic design
Template pixels and mask bits are stored in
on-chip memory
To change the template, only a reconfiguration of
memory parts is required

30
Reconfigurable design strategies: PARTIALLY
DYNAMIC DESIGN

Where tbit is the time needed to reconfigure 1 bit

31
FPGA Implementation and Results

The PEs for the three reconfigurable designs have
been implemented for different values of output
word length n and c = 8 on a Xilinx Virtex XCV1000E
Using these results, the constant values a and b
for each design are determined

32
Results for a small example

For a template where w = h = 3, p = 8, W = 7, H = 6, the
following results are obtained
From the first two rows, it can be seen that the
calculated and the measured values are almost
equal

33
Results for HDTV format

W = 1920, H = 1080, frame rate = 30 Hz
Area of dynamic design is 34% smaller than static
and 16% smaller than partially dynamic designs

34
Results for HDTV format

Area requirement for dynamic design (same p) but
different shapes, (w/h) is shown below

35
Results for HDTV format

Total execution times T required for different
number of frames and different techniques

36
Speed-up Achieved

Comparison with software (1.4 GHz Pentium 4 PC)
for one frame in HDTV format

37
Conclusion

Number of logic cells required for static and
partially dynamic design is constant for a frame
size
The dynamic design leads to significant savings
in area
Static design is suitable for an operation on one
or only a few frames
Partial and fully dynamic designs perform well if
matching is done on a large number of frames

38
Paper 2: FPGA-based Template Matching using
Distance Transforms

S. Hezel, A. Kugel, R. Männer, D. M. Gavrila
IEEE Symposium on Field Programmable Custom
Computing Machines
2002

39
Objective

A high performance FPGA solution for generic
shape-based object detection
To present a step by step implementation of
components of object detection systems
Template matching performed with distance
transforms
Method is robust to missing or partially
incorrect data
Employing highly parallel pipelines, high
speed-up can be achieved in comparison to
sequential machines
Matching is done for many binary templates
concurrently using several distance transformed
images

40
Method Followed

Target object represented by binary templates,
containing positional and edge information
Scene image is preprocessed by edge segmentation,
edge cleaning and distance transforms
Matching involves correlating the templates with
the distance-transformed scene image
Locations where the mismatch is below a
user-defined threshold give the object locations

41
Hardware Used

FPGA implementation targets PCI-based FPGA
co-processors
Final implementation was carried out on a RACE-1
coprocessor
XILINX Virtex-2 FPGA (XC2V3000)
Four 36-bit wide 133MHz SRAM banks
64 bit, 66MHz PCI

42
Matching Algorithm using Distance Transforms

The distance transform converts a binary image
consisting of feature and non-feature pixels into
an image where each pixel denotes the distance to
the nearest feature pixel

43
Matching with Distance Transforms

It involves 2 binary images
Segmented/Feature template T
Segmented/Feature image I
On and off pixels denote the presence and absence
of a feature
Actual features don't matter, and only edge
points are used
Feature template is given offline
Feature image is derived by feature extraction

44
Matching with Distance Transforms

The template T is translated and positioned over
the DT image of I
Measure D(T,I) determined by pixel values of the
DT image which lie under the on pixels of the
template
The lower the distance, the better the match
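One measure for distance is the chamfer distance
$D(T,I) = \frac{1}{|T|} \sum_{t \in T} d_I(t)$
where $|T|$ is the number of features in T and
$d_I(t)$ is the DT image value under feature t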
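
The matching measure described on this slide can be sketched as below. It assumes the chamfer distance is the mean DT value under the template's on pixels, and represents the template as a list of (row, column) offsets; both helper names are hypothetical.

```python
def chamfer_distance(dt, template_points, y, x):
    """Mean DT value under the template's on pixels at offset (y, x)."""
    total = sum(dt[y + i][x + j] for i, j in template_points)
    return total / len(template_points)


def match_locations(dt, template_points, theta):
    """All offsets where the chamfer distance is below the threshold theta."""
    H, W = len(dt), len(dt[0])
    # Bounding box of the template, derived from its largest offsets.
    h = 1 + max(i for i, _ in template_points)
    w = 1 + max(j for _, j in template_points)
    return [
        (y, x)
        for y in range(H - h + 1)
        for x in range(W - w + 1)
        if chamfer_distance(dt, template_points, y, x) < theta
    ]
```

A lower value means the template's features lie closer to image features, so thresholding from above selects candidate matches.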

45
Matching using Distance Transforms

A template is considered as matched at locations
where $D(T,I) < \theta$, with $\theta$ the user-defined threshold
The advantage of matching a template with the DT
image is that it provides a smoother similarity
measure

Figure: a) Original image b) Template c) Edge image d) DT image

46
Matching Algorithm Components

The matching algorithm contains the following
components
Edge detection
Edge noise removal
Computation of the distance transform
Correlation between the template and DT image
Calculation of the distance transform is a two
stage process

47
Preprocessing Architecture: EDGE DETECTION

Sobel Operators for edge detection
Mask is fixed and image is transformed under the
mask, line by line
Two lines of the original image are copied into
the FPGA RAM
Calculations are done in parallel using two
pipelined AUs
A new pixel is fed into the shift register every
clock cycle
If $|S_X| + |S_Y| >$ threshold, then the pixel is a feature
Discrete orientations are determined in parallel
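
A software sketch of the feature test is given below. It assumes the standard 3x3 Sobel masks and that the test compares |S_X| + |S_Y| against the threshold; border pixels are left as non-features, and the function name `sobel_edges` is hypothetical. The FPGA pipeline computes the same quantities one pixel per clock rather than with nested loops.

```python
def sobel_edges(image, threshold):
    """Binary edge image: 1 where |Sx| + |Sy| exceeds the threshold."""
    H, W = len(image), len(image[0])
    edges = [[0] * W for _ in range(H)]
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            # Horizontal gradient: right column minus left column.
            sx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            sy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            if abs(sx) + abs(sy) > threshold:
                edges[y][x] = 1
    return edges
```

Using the sum of absolute values avoids the square root of the true gradient magnitude, which is the usual reason this form is preferred in hardware.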

48
Morphological Cleaning

Aim is to remove noise in the binary edge image
Three or fewer connected pixels are considered as
noise
Cleaning module is built as a pipeline with a
logic unit that has parallel access to all
relevant pixels
The LU detects in parallel all possible
combinations of three or fewer connected pixels
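
The effect of the cleaning step can be reproduced in software with a connected-component search, sketched below. This is not how the logic unit works (it matches pixel patterns in parallel); 8-connectivity and the function name `remove_small_components` are assumptions for the example.

```python
def remove_small_components(edges, max_noise_size=3):
    """Clear every 8-connected group of at most max_noise_size edge pixels."""
    H, W = len(edges), len(edges[0])
    out = [row[:] for row in edges]
    seen = [[False] * W for _ in range(H)]
    for y0 in range(H):
        for x0 in range(W):
            if not edges[y0][x0] or seen[y0][x0]:
                continue
            # Depth-first search collecting one connected component.
            stack = [(y0, x0)]
            seen[y0][x0] = True
            component = []
            while stack:
                y, x = stack.pop()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < H and 0 <= nx < W
                                and edges[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            if len(component) <= max_noise_size:
                for y, x in component:
                    out[y][x] = 0
    return out
```

Components of four or more pixels are left untouched, so genuine edge segments survive while isolated specks are removed.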

49
Distance Transformation

The chamfer metric is used for distance
Two-step process
1st step: edge detection, morphological cleaning
and the forward distance transformation
A non-symmetric forward and backward mask is
present to calculate distance
Image translated under this mask, first in
forward and then in backward direction
All 8 directional images are processed in
parallel
Results clipped to 4 bits so all directions of a
pixel can be stored in a single word in memory
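
The forward and backward passes and the 4-bit clipping can be sketched as follows. The 3-4 chamfer weights are an assumption (the slides do not give the mask values), the sketch computes a single DT image rather than the eight directional ones, and both function names are hypothetical.

```python
INF = 10**9  # stands in for "no feature found yet"


def chamfer_dt(feature, a=3, b=4, clip=15):
    """Two-pass chamfer distance transform of a binary image, clipped to 4 bits."""
    H, W = len(feature), len(feature[0])
    d = [[0 if feature[y][x] else INF for x in range(W)] for y in range(H)]
    # Forward pass: top-left to bottom-right, using neighbours already visited.
    for y in range(H):
        for x in range(W):
            if x > 0:
                d[y][x] = min(d[y][x], d[y][x - 1] + a)
            if y > 0:
                d[y][x] = min(d[y][x], d[y - 1][x] + a)
                if x > 0:
                    d[y][x] = min(d[y][x], d[y - 1][x - 1] + b)
                if x < W - 1:
                    d[y][x] = min(d[y][x], d[y - 1][x + 1] + b)
    # Backward pass: bottom-right to top-left, using the mirrored mask.
    for y in range(H - 1, -1, -1):
        for x in range(W - 1, -1, -1):
            if x < W - 1:
                d[y][x] = min(d[y][x], d[y][x + 1] + a)
            if y < H - 1:
                d[y][x] = min(d[y][x], d[y + 1][x] + a)
                if x < W - 1:
                    d[y][x] = min(d[y][x], d[y + 1][x + 1] + b)
                if x > 0:
                    d[y][x] = min(d[y][x], d[y + 1][x - 1] + b)
    return [[min(v, clip) for v in row] for row in d]


def pack_directions(values):
    """Pack eight 4-bit distances (one per direction) into one 32-bit word."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xF) << (4 * i)
    return word
```

Eight 4-bit values occupy 32 bits, which is why clipping lets all directions of one pixel share a single memory word.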

50
Control and Resources

Pipeline with forward transformation
Pipeline with backward transformation

51
Resource Requirements

The resource utilization of the two pipelines for
images of size 512X512 and 8-bit input data is
given

52
Architecture of Template Matching

A pipelined, parallel approach is used
Relevant data for all templates is stored in
shift register arrays (SRAs) and the correlation
of all templates is carried out simultaneously
Depending on the number, size and shape of the
templates, varying FPGA resources are used

53
Parallel Pipelined Matching

To calculate the correlation of one template, the
following summations must be performed
The pixels of one DT image corresponding to the
template pixels have to be added
This is done 8 times, for each DT image
The intermediate sum of these 8 sums is
calculated
For N templates, this has to be done N times
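
The summation steps listed above can be written sequentially as in the sketch below. It assumes each template is given as a mapping from direction index to the pixel offsets that belong to that direction; this representation and both function names are assumptions, and the hardware performs these sums in parallel adder trees rather than in loops.

```python
def directional_correlation(dt_images, template, y, x):
    """Sum DT values under the template pixels, one partial sum per DT image.

    dt_images: list of directional DT images (8 in the slides).
    template:  dict mapping direction index -> list of (i, j) pixel offsets.
    """
    partial_sums = [
        sum(dt_images[d][y + i][x + j] for i, j in template.get(d, []))
        for d in range(len(dt_images))
    ]
    # Combine the per-direction sums into one correlation value.
    return sum(partial_sums)


def correlate_all(dt_images, templates, y, x):
    """Repeat the correlation for each of the N templates."""
    return [directional_correlation(dt_images, t, y, x) for t in templates]
```

In software the cost grows with the number of templates N, which is the sequential bottleneck that motivates correlating all templates simultaneously.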

54
Parallel Pipelined Matching

Correlations of all templates carried out
simultaneously
Each DT image has its own SRA (shift register array)
8 SRAs are required for 8 DT images
For each template, one adder tree which has
access to all SRA is assigned
The calculation strategy is similar to Sobel
calculations

55
Parallel Pipelined Matching

Each SRA differs in its extension depending on
the shape of the templates

56
Control

The SRAs can be filled with DT pixels such that
each SRA receives one input data every clock
cycle
The data is reordered before storing it in the SRA

57
Control

Filling the pipeline
Fill the SRA which has the biggest extension
Other SRAs are filled simultaneously
Each SRA is filled with the correct DT pixels
The pipeline is never stalled and all registers
can be clock enabled
After the pipeline is filled, the results of
possibly matched templates are stored
The verification of the results is conducted on
the PC

58
Results

Results for Placement and Routing for
Preprocessing (PP) and Template Matching (TM) are
shown above
A speed-up of up to 200 was achieved in comparison to
a software implementation on a 500 MHz Pentium III
processor
