Human Visual Processing II: Extra-Striate Processing and Object Recognition Models


1
Human Visual Processing II: Extra-Striate
Processing and Object Recognition Models
  • Sanketh Shetty
  • Image Analysis Lab
  • NC State University

2
Updates From Retina to V1
  • Simple Cells
  • Sensitive to Orientation, Position AND Spatial
    Frequency
  • Gabor Filters or Differences of Gaussians can be
    used to model V1 Simple Cells
  • Hyper-columns <-> Gabor Jets
  • Metric Relationships are destroyed
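
The Gabor model of a simple cell mentioned above can be sketched in a few lines. This is a minimal illustration, not the presenter's code: the function names, the kernel size, and the parameter values (`wavelength`, `sigma`, `gamma`) are assumptions chosen for the example, and the cell is treated as purely linear.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, phase=0.0, gamma=1.0):
    """Build a size x size Gabor kernel (size assumed odd) as a list of rows."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope: localises the cell in position.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            # Cosine carrier: sets the preferred spatial frequency.
            carrier = math.cos(2 * math.pi * xr / wavelength + phase)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def response(kernel, patch):
    """Linear simple-cell response: dot product of kernel and image patch."""
    return sum(k * p for krow, prow in zip(kernel, patch) for k, p in zip(krow, prow))

def grating(size, wavelength):
    """Test stimulus: a grating whose luminance varies along x (vertical bars)."""
    half = size // 2
    return [[math.cos(2 * math.pi * x / wavelength) for x in range(-half, half + 1)]
            for _ in range(size)]

patch = grating(9, 4.0)
matched = response(gabor_kernel(9, 4.0, 0.0, 2.0), patch)
orthogonal = response(gabor_kernel(9, 4.0, math.pi / 2, 2.0), patch)
print(matched > orthogonal)  # the matched orientation drives the cell harder
```

A bank of such kernels at several orientations and wavelengths, all centred on one image location, is the usual reading of a "Gabor jet".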

3
The Four Visual Pathways
  • M -> LGN M -> 4Cα -> 4B -> MT -> MST
  • Motion Processing
  • M -> LGN M -> 4Cα -> 4B -> Thick Stripes (V2) -> MT
  • Binocular Vision and Depth
  • P -> LGN P -> 4Cβ -> 2-3 (blobs) -> Thin Stripes (V2)
    -> V4
  • Color Processing (no orientation selectivity)
  • And

4
And the Fourth
  • P -> LGN P -> 4Cβ -> 2-3 (inter-blobs) -> Pale
    Stripes (V2) -> V4 -> IT

5
V2 and V4
  • Higher along the Visual Pathway for Form
  • V2 shows sensitivity to low-order feature
    combinations (from V1)
  • V4 is sensitive to end-stopped lines and curves
    and more complex features (about 33% of the cells)

6
The Inferior Temporal Cortical Areas
  • Posterior IT (TEO)
  • IT (TEa, TEm, TE3)
  • Anterior IT (TE1 and TE2)
  • Higher order features

7
IT Properties -1
  • Require Moderately complex features for
    activation
  • Multi-cell activation (refutation of the
    Grandmother Cell Theory)
  • Evidence for Distributed Representation
  • Exponentially increasing coding capacity
  • Smaller population of neurons needed for decoding

8
Advantages of Distributed Representation
  • Graceful degradation
  • Lower sensitivity to noise
  • Exponential encoding capacity
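
The capacity and noise-tolerance claims above can be illustrated with a toy example. It assumes idealised binary (on/off) neurons, which real IT cells are not, and the codebook patterns and names are invented for illustration only.

```python
def local_capacity(n_neurons):
    """Grandmother-cell coding: one dedicated neuron per stimulus."""
    return n_neurons

def distributed_capacity(n_neurons):
    """Distributed coding: every on/off pattern can stand for a stimulus."""
    return 2 ** n_neurons

def hamming(a, b):
    """Number of neurons whose state differs between two patterns."""
    return sum(x != y for x, y in zip(a, b))

def decode(response, codebook):
    """Pick the stored pattern closest to the observed population response."""
    return min(codebook, key=lambda name: hamming(response, codebook[name]))

# Hypothetical population codes over 8 neurons.
codebook = {
    "face": (1, 1, 1, 0, 0, 0, 0, 0),
    "hand": (0, 0, 0, 1, 1, 1, 0, 0),
    "cup":  (0, 0, 0, 0, 0, 1, 1, 1),
}

# One neuron of the "face" pattern has failed to fire.
noisy_face = (1, 0, 1, 0, 0, 0, 0, 0)

print(local_capacity(8), distributed_capacity(8))  # 8 256
print(decode(noisy_face, codebook))                # face
```

Losing one neuron only moves the response slightly away from the stored pattern, so decoding still succeeds. That is the sense of "graceful degradation" here.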

9
IT Properties - 2
  • View-Invariant Neurons exist alongside
    View-Dependent Neurons
  • Object-based rather than view-based
    representation
  • Different Neurons have independent response
    profiles to different stimuli
  • Correlation between Firing rates and Information
    Content
  • Little information in the synchronization of
    firing

10
Other Properties
  • Invariance
  • Size and Spatial Frequency
  • No Fourier Analysis Involved
  • Information is biased towards what is present in
    fovea
  • Translational Invariance and Receptive Field
    Sizes vary with background
  • Better Response to 3D than 2D objects
  • Can be disputed

11
Models for Invariant Object Recognition
  • Feature Spaces
  • Rolls' Hypotheses (Feature Hierarchies)
  • Structural Representation (Surface Based)
  • Others
  • Template Matching
  • Invertible Networks (Hinton et al.)

12
Rolls' Hypotheses
  • V1 -> V2 -> V4 -> Posterior Inferior Temporal
    Cortex (TEO) -> Inferior Temporal Cortex (TE3,
    TEa and TEm) -> Anterior Inferior Temporal Cortex
    (TE2, TE1)
  • Hierarchy of competitive neural networks with
    overlapping inputs

13
Rolls' Hypotheses Contd.
  • Each layer is self-organizing
  • Non-linear, Feed Forward with Inhibitory Feedback
    connections
  • Local Modified Hebbian Learning (Trace Rule)
  • Exploits temporal aspects of firing to achieve
    invariance
  • Distributed Output Representation
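
The trace rule listed above can be sketched as a Hebbian update in which the postsynaptic term is a running average (a "trace") of recent activity. The slides do not give the exact form, so this follows one common formulation; the linear neuron, the weight normalisation, and the values of `alpha` and `eta` are assumptions.

```python
import math

def trace_rule_step(weights, x, prev_trace, alpha=0.1, eta=0.8):
    """One trace-rule update for a single linear neuron.

    trace = (1 - eta) * y + eta * prev_trace
    w_j  += alpha * trace * x_j
    """
    y = sum(w * xi for w, xi in zip(weights, x))      # current firing
    trace = (1.0 - eta) * y + eta * prev_trace        # memory of recent firing
    updated = [w + alpha * trace * xi for w, xi in zip(weights, x)]
    # Normalise the weight vector so it cannot grow without bound.
    norm = math.sqrt(sum(w * w for w in updated))
    if norm > 0:
        updated = [w / norm for w in updated]
    return updated, trace

# The same object seen at three successive positions (one-hot inputs).
sequence = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]

weights = [1.0, 0.0, 0.0, 0.0]   # initially responds only at position 0
trace = 0.0
for x in sequence:
    weights, trace = trace_rule_step(weights, x, trace)

print(weights)
```

Because the trace is still non-zero when the object moves to positions 1 and 2, those inputs gain weight even though the neuron did not fire to them directly. The unused fourth input stays at zero. This is how temporal continuity can be turned into translation invariance.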

14
More about Rolls' Hypotheses
  • Increasing Complexity of Features higher up in
    the hierarchy
  • Inputs from a Gaussian distributed area in the
    previous layer
  • Low order feature combinations
  • Consistent with Neurophysiological Data
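
The "Gaussian distributed area" wiring can be sketched by drawing each neuron's input locations from a 2D Gaussian centred on the corresponding point in the previous layer. The layer size, `sigma`, the number of inputs, and the choice to reject out-of-range samples are all assumptions made for this illustration.

```python
import random

def sample_inputs(cx, cy, n_inputs, sigma, layer_size, rng):
    """Pick n_inputs (x, y) locations in the previous layer around (cx, cy)."""
    coords = []
    while len(coords) < n_inputs:
        x = round(rng.gauss(cx, sigma))
        y = round(rng.gauss(cy, sigma))
        # Assumption: samples that fall off the layer are simply redrawn.
        if 0 <= x < layer_size and 0 <= y < layer_size:
            coords.append((x, y))
    return coords

rng = random.Random(0)            # fixed seed so the sketch is repeatable
inputs = sample_inputs(16, 16, 100, 3.0, 32, rng)
print(len(inputs), inputs[:5])
```

Neighbouring neurons drawn this way share many of their inputs, which gives the overlapping receptive fields that the hierarchy relies on.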

15
VisNet
  • Implemented Rolls' Hypotheses
  • Achieved Translational Invariance with Trace
    Learning Rule
  • 4 Layers
  • Input is from 2D Spatial Filters of different
    Spatial Frequencies and Orientations

16
More on VisNet
  • High Spatial Precision of Feature Combinations
  • Successful recognition of objects and faces
  • View independent also
  • Solution to the binding problem
  • a local processing problem

17
Structural Representation
  • General Agreement on what processing goes on in
    the Lower Visual Areas
  • Disagreement in Higher Level Processing
  • Marr's Classification of this paradigm
  • Boundary/Surface-based shape description
  • Axial/Volumetric (e.g. Medial axis)

18
What is the hypothesis?
  • Alphabet of Simple Shape primitives to describe
    objects
  • Infinity of Shapes
  • Information about relative spatial position and
    connectivity between primitives
  • Geometric Properties are explicitly encoded

19
Connor's Contour Fragment Hypotheses
  • 4D Space
  • Curvature
  • Orientation (surface normal)
  • Radial Position
  • Angular Position
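
The four dimensions above can be treated as the coordinates of a contour fragment, with a cell responding most strongly near its preferred point in that space. The sketch below assumes Gaussian tuning along each dimension. The slides do not state the tuning shape, and the class name, tuning widths, and example values are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class ContourFragment:
    curvature: float         # e.g. -1 (concave) .. +1 (sharp convex)
    orientation: float       # direction of the surface normal, radians
    radial_position: float   # distance from the object centre
    angular_position: float  # angle around the object centre, radians

def angle_diff(a, b):
    """Smallest absolute difference between two angles."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def tuning_response(preferred, fragment, widths=(0.3, 0.5, 0.3, 0.5)):
    """Gaussian tuning in the 4D space; 1.0 at the preferred fragment."""
    deltas = (
        (fragment.curvature - preferred.curvature) / widths[0],
        angle_diff(fragment.orientation, preferred.orientation) / widths[1],
        (fragment.radial_position - preferred.radial_position) / widths[2],
        angle_diff(fragment.angular_position, preferred.angular_position) / widths[3],
    )
    return math.exp(-0.5 * sum(d * d for d in deltas))

# A hypothetical cell preferring a sharp convexity at the upper right.
preferred = ContourFragment(0.8, math.pi / 2, 0.6, math.pi / 4)
same = tuning_response(preferred, preferred)
shifted = tuning_response(preferred, ContourFragment(0.8, math.pi / 2, 0.6, math.pi))
print(same, shifted)
```

Moving the same fragment to a different angular position lowers the response, so the position of a fragment relative to the object matters as well as its shape.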

20
Connor's Hypotheses Contd.
  • Hierarchical System
  • Simpler to more complex features
  • Not just features but also their spatial
    relationships
  • Object-based coordinates vs. view/viewer-based
    coordinates
  • Cells sensitive to local conjunctions

21
The Structural Model
  • Hierarchical model
  • First layers sensitive to oriented lines, edges
    and other simple features
  • More complex contour fragments and angles could
    come in the second level
  • This tuning is present in V4 and even more
    prominently in PIT and CIT

22
Evidence and Properties
  • 33% of V4 cells are tuned to complex features
  • Many cells in V4 are tuned for linear
    orientations
  • Many cells are tuned for contour fragment shape

23
Properties -2
  • Cells in V4 are tuned to both the curvature and
    the normal
  • Also tuned to contour fragment position and
    orientation
  • Orientation and Normal are not necessarily
    connected
  • Depends on shape primitive and also conjoined
    primitives eg.

24
Last Word on IT
  • Some cells respond to parts of objects and some
    to whole objects
  • Translational Invariance
  • Invariant to Size and Aspect Ratio
  • Cells with Similar sensitivity are clustered
    together in columns
  • Solution to Binding Problem

25
Last Word
  • There are some cells that are sensitive to
    arrangement of features rather than the features
    themselves
  • Interesting points on receptive fields
  • Translation Invariance
  • Size in complex backgrounds
  • There are 2 kinds of horizontal connections
  • Inhibitory up to 1 mm
  • Excitatory beyond that, up to 8 mm (why?)
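
The two ranges of horizontal connections can be written as a simple distance-dependent sign function. The 1 mm and 8 mm limits come from the slide. The hard cut-offs and the function name are simplifying assumptions, since real connection strengths fall off gradually.

```python
def horizontal_connection_sign(distance_mm, inhibitory_limit_mm=1.0,
                               excitatory_limit_mm=8.0):
    """Return -1 (inhibitory), +1 (excitatory) or 0 (no connection)."""
    if distance_mm < 0:
        raise ValueError("distance must be non-negative")
    if distance_mm <= inhibitory_limit_mm:
        return -1   # short range: inhibition
    if distance_mm <= excitatory_limit_mm:
        return 1    # longer range: excitation
    return 0        # beyond the reach of horizontal connections

for d in (0.5, 1.0, 3.0, 8.0, 10.0):
    print(d, horizontal_connection_sign(d))
```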

26
Conclusion
  • Definite Idea about the structure of the ventral
    pathway
  • One theory to support a Hough-like technique
  • What we need to look up
  • Feature alphabet
  • Possible mode of calculation of object center
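
One way a Hough-like technique could locate an object centre is to let every detected feature vote for where the centre should be, then take the location with the most votes. This is only a sketch of the general idea, not a model taken from the slides. The feature names, offsets, and detections are invented for illustration.

```python
from collections import Counter

def vote_for_center(detections, offsets):
    """Accumulate centre votes; return ((x, y), vote_count) for the peak."""
    votes = Counter()
    for name, (x, y) in detections:
        dx, dy = offsets[name]            # learned offset: feature -> centre
        votes[(x + dx, y + dy)] += 1
    return votes.most_common(1)[0]

# Hypothetical feature alphabet with offsets to the object centre.
offsets = {"eye_l": (2, 2), "eye_r": (-2, 2), "mouth": (0, -2)}

# Three consistent detections plus one spurious "eye_l".
detections = [
    ("eye_l", (4, 5)),
    ("eye_r", (8, 5)),
    ("mouth", (6, 9)),
    ("eye_l", (20, 3)),
]

print(vote_for_center(detections, offsets))  # ((6, 7), 3)
```

The spurious detection casts a single stray vote and is outvoted, which is the main attraction of voting schemes for this problem.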

27
The End