1
Object Tracking
  • By Rajdeep Bondade

2
Tracking
  • Tracking is the task of estimating the trajectory
    of an object in the image plane as it moves
    around a scene
  • It is important in the field of computer vision
    and artificial intelligence
  • Interest generated due to high powered computers,
    high quality and inexpensive video cameras and
    need for automated video analysis

3
Applications of Tracking
  • Motion-based recognition
  • Automated surveillance
  • Video indexing
  • Human-computer interaction
  • Traffic monitoring
  • Vehicle navigation
  • Automatic target recognition in military domain
  • Visual inspection in industries

4
Classification
  • Tracking is broadly divided into Point Tracking,
    Kernel Tracking and Silhouette Tracking

5
Difficulties Faced
  • Tracking is highly computation intensive
  • Algorithms define multiple correlations,
    convolutions and other complex operations
  • These operations are very difficult to perform on
    a microprocessor
  • Microprocessors are serial in nature, but most
    operations are inherently parallel

6
Possible Solutions
  • Implement the parallel operations in hardware
    using ASICs
  • Not flexible
  • Cannot change the template object
  • Difficult to modify parameters such as size and
    shape of object being tracked
  • Implement using FPGAs and reconfigurable
    computing
  • Provides a tradeoff between speed of hardware and
    flexibility of software

7
Classical Image tracking system
  • Detection is a decision making process
  • Tracking involves associating discrete detections
    over time to form a track path
  • Recognition uses the results of detection and
    tracking to classify the object

8
Template Matching
  • It is an object detection technique used to find
    an object in a search image
  • Compare the template image with the scene image
    and find the location where the mismatch is minimum
  • Software solutions are flexible but too slow

9
Paper 1: Reconfigurable Shape Adaptive Template
Matching
  • Jörn Gause, Peter Y. K. Cheung, Wayne Luk
  • Imperial College, London
  • IEEE Symposium on Field Programmable Custom
    Computing Machines
  • 2002

10
Objective
  • Reconfigurable strategies for Shape-Adaptive
    Template Matching to detect arbitrarily shaped
    objects in images/video frames
  • Static Design
  • Template is stored on off-chip memory
  • Partially Dynamic Design
  • Template is stored in on-chip memory, allowing
    reconfiguration
  • Dynamic Design
  • Configuration data is completely adapted to shape
    and size of the template

11
Purpose
  • Algorithm is truly object oriented, i.e. it
    depends only on the template used
  • Software solutions may provide the flexibility
    but are too slow for real-time video processing
  • ASIC implementation is not practical due to
    infinite number of sizes of the template
  • Thus, a reconfigurable architecture is proposed
    to implement a fast and flexible SA-TM design

12
Prior Work and their Conclusions
  • Work has been performed using dynamic
    reconfiguration leading to acceleration and
    effective logic capacity usage
  • Computer vision algorithms
  • not shape adaptive
  • applied to small images (512 X 512)
  • Automatic Target Recognition
  • Binary templates (16 X 16) and small image (128 X
    128)
  • Run-time reconfiguration decreases area-execution
    time product if the search image is large enough
  • Dynamic Reconfiguration suitable to
    shape-adaptive algorithms if reconfiguration
    overhead is small

13
Shape Adaptive Template Matching
  • Aim
  • To find a template object of arbitrary shape and
    size within a search image or video frame of any
    size using a reconfigurable computing
    architecture
  • Search image consists of W × H pixels
  • The template consists of p opaque pixels and can
    have any shape. It is bounded by a box of size
    w × h
  • In the bounding box, each pixel has one mask bit:
    1 if the pixel belongs to the object and 0
    otherwise

14
Shape Adaptive Template Matching
  • Template is shifted over the image over
    (W − w + 1)(H − h + 1) locations
  • Sum of Absolute Distances over luminance pixel
    values is chosen as the comparison metric
  • The match is found when SAD(y,x) is minimum and
    smaller than a certain threshold
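
The matching rule above can be sketched in software as follows. This is a minimal, unoptimised Python model and not the hardware design from the paper; the function names sa_sad and best_match, the list-of-lists image layout and the example threshold are illustrative assumptions.

```python
def sa_sad(image, template, mask, y, x):
    """SAD at offset (y, x), summed over opaque template pixels only."""
    total = 0
    for i, row in enumerate(template):
        for j, t in enumerate(row):
            if mask[i][j]:  # mask bit 1: pixel belongs to the object
                total += abs(image[y + i][x + j] - t)
    return total


def best_match(image, template, mask, threshold):
    """Return (sad, y, x) of the minimum SAD, or None if it is not below threshold."""
    H, W = len(image), len(image[0])
    h, w = len(template), len(template[0])
    best = None
    for y in range(H - h + 1):          # (H - h + 1) vertical positions
        for x in range(W - w + 1):      # (W - w + 1) horizontal positions
            s = sa_sad(image, template, mask, y, x)
            if best is None or s < best[0]:
                best = (s, y, x)
    if best is not None and best[0] < threshold:
        return best
    return None
```

The mask makes the comparison shape adaptive: transparent pixels inside the bounding box never contribute to the sum, so the background behind the object does not affect the match.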

15
Systolic Array for SA-TM
  • Data appears in a horizontal raster scan fashion
  • The AD computations for a position along with the
    clock cycle and the SAD are shown

16
Systolic Array for SA-TM
  • A signal flow graph representing the previous
    example is shown below
  • The node <i,j> represents |I(y,x) − T(i,j)|
  • The pixel values I(y,x) are broadcast
    sequentially to all PEs and all computations
    run in parallel
  • At the end of 42 clock cycles, all 20 SAD values
    will be computed
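
The broadcast scheme can be imitated in software to check the cycle and result counts. The sketch below is a behavioural model only: it assumes one pixel arrives per clock cycle in raster order and keeps one accumulator per template position, whereas the real array passes partial sums through PEs and registers. The name stream_sad is hypothetical.

```python
def stream_sad(image, template, mask):
    """Accumulate all SAD values while pixels arrive in raster-scan order."""
    H, W = len(image), len(image[0])
    h, w = len(template), len(template[0])
    # One PE per opaque template pixel, holding its position and stored value.
    pes = [(i, j, template[i][j])
           for i in range(h) for j in range(w) if mask[i][j]]
    sad = [[0] * (W - w + 1) for _ in range(H - h + 1)]
    cycles = 0
    for y in range(H):                  # horizontal raster scan, row by row
        for x in range(W):
            pixel = image[y][x]         # broadcast to every PE in this cycle
            cycles += 1
            for i, j, t in pes:
                py, px = y - i, x - j   # template position this AD belongs to
                if 0 <= py <= H - h and 0 <= px <= W - w:
                    sad[py][px] += abs(pixel - t)
    return sad, cycles
```

For a 7 × 6 search image and a 3 × 3 bounding box, this model takes 7 × 6 = 42 cycles and produces 5 × 4 = 20 SAD values, matching the counts on the slide.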

17
Structure of PE for SA-TM
  • The following general systolic array, adapted to
    the shape of the template object, is presented
  • Each pixel belonging to a template is represented
    by a PE
  • The template pixel value is stored in the ROM
    within the PE

18
Structure of PE for SA-TM
  • The sizes of Sum_in and Sum_out depend on the
    position of the PE in the SFG
  • N = max(m, c)
  • Max. distance in one PE is 2^c − 1
  • Max. intermediate sum of k of the ADs is
    (2^c − 1)k
  • This requires ⌈log2((2^c − 1)k + 1)⌉ bits
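
As a worked example of the word-length argument, the sketch below counts the bits needed to hold the largest possible sum of k absolute distances. It assumes that c is the pixel word length in bits and that sums are stored as unsigned integers; the function name sum_bits is hypothetical.

```python
def sum_bits(k, c):
    """Bits needed for an unsigned sum of k absolute distances of c-bit pixels."""
    max_ad = 2 ** c - 1           # largest absolute distance in one PE
    max_sum = max_ad * k          # largest intermediate sum of k ADs
    return max_sum.bit_length()   # smallest n with max_sum <= 2**n - 1


print(sum_bits(1, 8))  # 8
print(sum_bits(2, 8))  # 9
print(sum_bits(8, 8))  # 11
```

Adders near the start of the chain can therefore be narrower than those near the end, which is why the word length depends on the position of the PE.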

19
Area Calculations
  • Area of PE contains a constant part (AD) and a
    variable part which grows with n
  • where a and b are constants
  • The total area to implement p PEs is given by

20
Area Calculations
  • The area then simplifies
    to
  • where

21
Further Area Calculations
  • Registers are required to delay the intermediate
    sums
  • PEs and registers are arranged according to the
    mask of the template
  • After each line of PEs, W-w shift registers are
    needed

22
Further Area Calculations
  • w × h − p pixels require shift registers
  • W − w pixels require shift registers

23
Summary of Structure
  • p PEs are required for AD computations and
    summation of intermediate results
  • Arrangement of PEs in the same way as the pixels
    of the template
  • w × h − p gaps represent transparent pixels filled
    with registers
  • W − w shift registers are required in each but the
    last row to store intermediate sums
  • The size of the kth adder where 1 ≤ k ≤ p is given
    by

24
Reconfigurable design strategies: DYNAMIC DESIGN
  • Reconfigured for every possible template size and
    shape and search frame size
  • The template is a part of the configuration data
    and word lengths can be optimized
  • Because one input to the AD module is constant,
    the module can be replaced by a look-up table
    that stores the AD value for each I(y,x)
  • p PEs and (w × h − p) + (W − w)(h − 1) registers
    are required
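
The resource count and the look-up-table idea can be illustrated with a short sketch. The function names dynamic_resources and ad_lut are hypothetical, and the default of c = 8 bits per pixel is an assumption taken from the implementation slide later in the deck.

```python
def dynamic_resources(W, w, h, p):
    """Return (number of PEs, number of registers) for the dynamic design."""
    gap_registers = w * h - p            # transparent pixels inside the bounding box
    line_registers = (W - w) * (h - 1)   # delay after every template row but the last
    return p, gap_registers + line_registers


def ad_lut(t, c=8):
    """Table of |v - t| for every c-bit input v, with the template pixel t fixed."""
    return [abs(v - t) for v in range(2 ** c)]


print(dynamic_resources(W=7, w=3, h=3, p=8))  # (8, 9)
print(ad_lut(100)[90])                        # 10
```

Because the template value t is baked into the configuration, each PE only needs a table indexed by the incoming pixel instead of a general subtract-and-absolute-value unit.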

25
Reconfigurable design strategies: DYNAMIC DESIGN
  • The area required is
  • The total execution time TD consists of the
    computation time, the reconfiguration time and
    the compilation time
  • The execution time for N frames is

26
Reconfigurable design strategies: STATIC DESIGN
  • Dynamic design is useful when the template is
    searched for in a large number of video frames of
    the same size
  • In static design, FPGA configuration is not
    changed when a new template is used
  • Because the number of search frame sizes and
    template shapes and sizes is unlimited, only a
    subset of all solutions is implemented

27
Reconfigurable design strategies: STATIC DESIGN
  • If the search frame size is fixed, the following
    PE structure is used
  • Template pixel values T(i,j) come from external
    memory, and a multiplexer selects whether addition
    or delay is performed

28
Reconfigurable design strategies: STATIC DESIGN
  • The area for the static design is
  • The execution time is given by
  • Advantages: no recompilation of the design code
    or reconfiguration of the device
  • Disadvantages:
  • Large external RAM, which stores template pixels
    and mask bits, makes the design slower
  • For large frame sizes, the number of I/O pins
    required is extremely large

29
Reconfigurable design strategies: PARTIALLY
DYNAMIC DESIGN
  • Combines the advantages of both static and
    dynamic design
  • Template pixels and mask bits are stored in
    on-chip memory
  • To change the template, only a reconfiguration of
    memory parts is required

30
Reconfigurable design strategies: PARTIALLY
DYNAMIC DESIGN
  • where t_bit is the time needed to reconfigure 1 bit

31
FPGA Implementation and Results
  • The PEs for the three reconfigurable designs have
    been implemented for different values of output
    word length n and c = 8 on a Xilinx Virtex XCV1000E
  • Using these results, the constant values a and b
    for each design are determined

32
Results for a small example
  • For a template where w = h = 3, p = 8, W = 7,
    H = 6, the following results are obtained
  • From the first two rows, it can be seen that the
    calculated and the measured values are almost
    equal

33
Results for HDTV format
  • W = 1920, H = 1080, frame rate 30 Hz
  • Area of dynamic design is 34% smaller than static
    and 16% smaller than partially dynamic designs

34
Results for HDTV format
  • Area requirement for dynamic design (same p) but
    different shapes, (w/h) is shown below

35
Results for HDTV format
  • Total execution times T required for different
    number of frames and different techniques

36
Speed-up Achieved
  • Comparison with software (1.4 GHz Pentium 4 PC)
    for HDTV frame format, 1 frame

37
Conclusion
  • Number of logic cells required for static and
    partially dynamic design is constant for a frame
    size
  • The dynamic design leads to significant savings
    in area
  • Static design is suitable for an operation on one
    or only a few frames
  • Partial and fully dynamic designs perform well if
    matching is done on a large number of frames

38
Paper 2: FPGA-based Template Matching using
Distance Transforms
  • S. Hezel, A. Kugel, R. Männer, D. M. Gavrila
  • IEEE Symposium on Field Programmable Custom
    Computing Machines
  • 2002

39
Objective
  • A high performance FPGA solution for generic
    shape-based object detection
  • To present a step by step implementation of
    components of object detection systems
  • Template matching performed with distance
    transforms
  • Method is robust to missing or partially
    incorrect data
  • Employing highly parallel pipelines, high
    speed-up can be achieved in comparison to
    sequential machines
  • Matching is done for many binary templates
    concurrently using several distance transformed
    images

40
Method Followed
  • Target object represented by binary templates,
    containing positional and edge information
  • Scene image is preprocessed by edge segmentation,
    edge cleaning and distance transforms
  • Matching involves correlating the templates with
    the distance-transformed scene image
  • Locations where the mismatch is below a
    user-defined threshold give the object location

41
Hardware Used
  • The FPGA implementation targets PCI-based FPGA
    co-processors
  • Final implementation was carried out on a RACE-1
    coprocessor
  • XILINX Virtex-2 FPGA (XC2V3000)
  • Four 36-bit wide 133MHz SRAM banks
  • 64 bit, 66MHz PCI

42
Matching Algorithm using Distance Transforms
  • The distance transform converts a binary image
    consisting of feature and non-feature pixels into
    an image where each pixel denotes the distance to
    the nearest feature pixel

43
Matching with Distance Transforms
  • It involves 2 binary images
  • Segmented/Feature template T
  • Segmented/Feature image I
  • On and off pixels denote the presence and absence
    of a feature
  • Actual features don't matter, and only edge
    points are used
  • Feature template is given offline
  • Feature image is derived by feature extraction

44
Matching with Distance Transforms
  • The template T is translated and positioned over
    the DT image of I
  • Measure D(T,I) determined by pixel values of the
    DT image which lie under the on pixels of the
    template
  • The lower the distance, the better the match
  • One measure for distance is the chamfer distance
  • where |T| is the number of features in T
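
The chamfer measure is commonly written as D(T,I) = (1/|T|) Σ d_I(t), summed over the features t of T, where d_I(t) is the DT value under feature t. A minimal sketch of matching with this measure follows. The brute-force Euclidean distance transform, the representation of a template as a list of (row, column) feature offsets, and all function names are assumptions made for illustration; they are not the FPGA implementation described in the paper.

```python
import math


def brute_force_dt(feature_points, H, W):
    """Distance from every pixel to its nearest feature pixel (Euclidean)."""
    return [[min(math.hypot(y - fy, x - fx) for fy, fx in feature_points)
             for x in range(W)] for y in range(H)]


def chamfer_distance(dt, template_points, y, x):
    """Average DT value under the 'on' pixels of the template placed at (y, x)."""
    total = sum(dt[y + i][x + j] for i, j in template_points)
    return total / len(template_points)


def match_locations(dt, template_points, theta):
    """All placements whose chamfer distance is below the threshold theta."""
    H, W = len(dt), len(dt[0])
    h = max(i for i, _ in template_points) + 1
    w = max(j for _, j in template_points) + 1
    return [(y, x)
            for y in range(H - h + 1) for x in range(W - w + 1)
            if chamfer_distance(dt, template_points, y, x) < theta]
```

Because the DT values grow gradually with distance from an edge, a slightly misplaced template still scores a small value, which is what makes the measure tolerant of missing or displaced edge pixels.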

45
Matching using Distance Transforms
  • A template is considered as matched at locations
    where D(T,I) < θ, with θ a user-defined threshold
  • The advantage of matching a template with the DT
    image is that it provides a smoother similarity
    measure

Figure: a) Original image, b) Template, c) Edge image, d) DT image

46
Matching Algorithm Components
  • The matching algorithm contains the following
    components
  • Edge detection
  • Edge noise removal
  • Computation of the distance transform
  • Correlation between the template and DT image
  • Calculation of the distance transform is a two
    stage process

47
Preprocessing Architecture: EDGE DETECTION
  • Sobel Operators for edge detection
  • Mask is fixed and image is transformed under the
    mask, line by line
  • Two lines of the original image are copied into
    the FPGA RAM
  • Calculations are done in parallel using two
    pipelined AUs
  • A new pixel is fed into the shift register every
    clock cycle
  • If |S_X| + |S_Y| > threshold, then the pixel is a
    feature
  • Discrete orientations are determined in parallel
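
A sequential sketch of the edge test is given below. It uses the standard 3 × 3 Sobel masks and takes the gradient strength as |S_X| + |S_Y|, which is an assumption about the exact form of the threshold test. Border pixels are left unmarked and the orientation output is omitted; the function name sobel_edges is hypothetical.

```python
def sobel_edges(image, threshold):
    """Binary edge image: 1 where |S_X| + |S_Y| exceeds the threshold."""
    H, W = len(image), len(image[0])
    edges = [[0] * W for _ in range(H)]
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            # Horizontal gradient: right column minus left column.
            sx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            sy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            if abs(sx) + abs(sy) > threshold:
                edges[y][x] = 1
    return edges
```

Each output pixel needs only three neighbouring image rows, which is why a hardware pipeline can buffer a couple of lines on chip and produce one result per clock cycle.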

48
Morphological Cleaning
  • Aim is to remove noise in the binary edge image
  • Three or fewer connected pixels are considered as
    noise
  • Cleaning module is built as a pipeline with a
    logic unit that has parallel access to all
    relevant pixels
  • The LU detects in parallel all possible
    combinations of three or fewer connected pixels
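
The cleaning rule can be expressed in software as removing every connected group of at most three edge pixels. The sketch below uses a sequential flood fill and assumes 8-connectivity; the hardware logic unit instead checks all small pixel combinations in parallel. The function name clean_edges and the parameter max_noise are hypothetical.

```python
def clean_edges(edges, max_noise=3):
    """Remove 8-connected groups of at most max_noise edge pixels."""
    H, W = len(edges), len(edges[0])
    out = [row[:] for row in edges]
    seen = [[False] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            if edges[y][x] and not seen[y][x]:
                # Collect the connected component that contains (y, x).
                seen[y][x] = True
                component, stack = [], [(y, x)]
                while stack:
                    cy, cx = stack.pop()
                    component.append((cy, cx))
                    for ny in range(max(cy - 1, 0), min(cy + 2, H)):
                        for nx in range(max(cx - 1, 0), min(cx + 2, W)):
                            if edges[ny][nx] and not seen[ny][nx]:
                                seen[ny][nx] = True
                                stack.append((ny, nx))
                if len(component) <= max_noise:
                    for cy, cx in component:
                        out[cy][cx] = 0   # too small: treat as noise
    return out
```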

49
Distance Transformation
  • The chamfer metric is used for distance
  • Two-step process
  • 1st step: edge detection, morphological cleaning
    and the forward distance transformation
  • A non-symmetric forward and backward mask is
    present to calculate distance
  • Image translated under this mask, first in
    forward and then in backward direction
  • All 8 directional images are processed in
    parallel
  • Results clipped to 4 bits so all directions of a
    pixel can be stored in a single word in memory
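
The two-pass scheme can be sketched for a single binary image as follows. The 3-4 chamfer weights (3 for horizontal or vertical steps, 4 for diagonal steps) are an assumption, since the slides do not give the mask values; the clip value of 15 corresponds to the 4-bit results mentioned above. The eight directional images are not modelled, and the function name chamfer_dt is hypothetical.

```python
def chamfer_dt(features, clip=15):
    """Two-pass 3-4 chamfer distance transform of a binary feature image."""
    H, W = len(features), len(features[0])
    INF = 10 ** 9
    d = [[0 if features[y][x] else INF for x in range(W)] for y in range(H)]
    # Forward pass: top-left to bottom-right, using already visited neighbours.
    for y in range(H):
        for x in range(W):
            if x > 0:
                d[y][x] = min(d[y][x], d[y][x - 1] + 3)
            if y > 0:
                d[y][x] = min(d[y][x], d[y - 1][x] + 3)
                if x > 0:
                    d[y][x] = min(d[y][x], d[y - 1][x - 1] + 4)
                if x < W - 1:
                    d[y][x] = min(d[y][x], d[y - 1][x + 1] + 4)
    # Backward pass: bottom-right to top-left, using the mirrored mask.
    for y in range(H - 1, -1, -1):
        for x in range(W - 1, -1, -1):
            if x < W - 1:
                d[y][x] = min(d[y][x], d[y][x + 1] + 3)
            if y < H - 1:
                d[y][x] = min(d[y][x], d[y + 1][x] + 3)
                if x < W - 1:
                    d[y][x] = min(d[y][x], d[y + 1][x + 1] + 4)
                if x > 0:
                    d[y][x] = min(d[y][x], d[y + 1][x - 1] + 4)
    # Clip so that every result fits in 4 bits when clip is 15.
    return [[min(v, clip) for v in row] for row in d]
```

Each pass only looks at neighbours that were already updated in the same scan direction, so the transform streams through the image twice without random access.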

50
Control and Resources
  • Pipeline with forward transformation
  • Pipeline with backward transformation

51
Resource Requirements
  • The resource utilization of the two pipelines for
    images of size 512 × 512 and 8-bit input data is
    given

52
Architecture of Template Matching
  • A pipelined, parallel approach is used
  • Relevant data of all templates is stored in shift
    register arrays (SRAs), and the correlation of all
    templates is carried out simultaneously
  • Depending on the number, size and shape of the
    templates, varying FPGA resources are used

53
Parallel Pipelined Matching
  • To calculate the correlation of one template, the
    following summations must be performed
  • The pixels of one DT image corresponding to the
    template pixels have to be added
  • This is done 8 times, for each DT image
  • These 8 intermediate sums are then added
    together
  • For N templates, this has to be done N times
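
The summation steps can be written out sequentially as below. The sketch assumes that each template is a list of (orientation, row, column) feature entries and that dt_images holds one DT image per orientation; this data layout and the function names are illustrative assumptions, not the format used in the paper.

```python
def template_score(dt_images, template, y, x):
    """Sum DT values under the template, one partial sum per DT image."""
    partial = [0] * len(dt_images)
    for o, i, j in template:
        partial[o] += dt_images[o][y + i][x + j]   # sum within DT image o
    return sum(partial)                            # add the partial sums


def score_all(dt_images, templates, y, x):
    """Repeat the calculation for every one of the N templates."""
    return [template_score(dt_images, t, y, x) for t in templates]
```

In software the N templates are processed one after another; the hardware described on the following slides evaluates them at the same time with one adder tree per template.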

54
Parallel Pipelined Matching
  • Correlations of all templates carried out
    simultaneously
  • Each DT image has its own SRA
  • 8 SRAs are required for 8 DT images
  • For each template, one adder tree which has
    access to all SRAs is assigned
  • The calculation strategy is similar to Sobel
    calculations

55
Parallel Pipelined Matching
  • Each SRA differs in its extension depending on
    the shape of the templates

56
Control
  • The SRAs can be filled with DT pixels such that
    each SRA receives one input data every clock
    cycle
  • The data is resorted before storing it in the SRA

57
Control
  • Filling the pipeline
  • Fill the SRA which has the biggest extension
  • Other SRAs are filled simultaneously
  • Each SRA is filled with the correct DT pixels
  • The pipeline is never stalled and all registers
    can be clock enabled
  • After the pipeline is filled, no results of
    possible matched templates are stored
  • The verification of the results is conducted on
    the PC

58
Results
  • Results for Placement and Routing for
    Preprocessing (PP) and Template Matching (TM) are
    shown above
  • A speed-up of up to 200 was achieved in comparison
    to a software implementation on a 500 MHz
    Pentium III processor