Title: Ultrarapid visual form analysis using feed forward processing
1. Ultra-rapid visual form analysis using feed-forward processing
- Timothée Masquelier, Rudy Guyonneau, Nicolas Guilbaud, Jong-Mo Allegraud, Simon J Thorpe
- CERCO, SpikeNet Technology
- ECVP 2005
- timothee.masquelier_at_cerco.ups-tsle.fr
2. Ultra-rapid visual categorization
E.g. choice saccade task: in which of the two scenes (left or right) is the animal?
7. Ultra-rapid visual categorization
- The eyes can move towards high-level objects in as little as 120-130 ms
- See Kirchner, Guyonneau and Thorpe 2005 (all at ECVP this year!)
8. Ultra-rapid visual categorization
- What neurobiologically plausible image
processing scheme could explain this performance?
9. SpikeNet presentation
- Integrate-and-fire neurons
- V1 Gabor-like filters, intensity-to-delay converters
- Asynchronous spike propagation
- Purely feed-forward architecture
- No more than 1 spike per neuron (only the first spike wave is modeled)
- Very sparse activation (1-2%)
- Learn by putting high weights on early-firing inputs
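The intensity-to-delay conversion and the one-spike-per-neuron integration listed above can be illustrated with a minimal sketch. It is not SpikeNet's actual code: the linear intensity-to-latency mapping, the parameter `t_max`, and the function names are all assumptions made for illustration.

```python
def intensity_to_latency(intensities, t_max=1.0):
    """Map non-negative intensities to spike latencies: stronger input fires earlier.

    Assumption: a linear mapping onto [0, t_max]; the slides do not
    specify the exact conversion function.
    """
    peak = max(intensities)
    if peak <= 0:
        # No drive at all: every input fires as late as possible.
        return [t_max for _ in intensities]
    return [t_max * (1.0 - x / peak) for x in intensities]


def first_spike_time(latencies, weights, threshold):
    """Integrate-and-fire with at most one spike per input.

    Inputs are processed in order of arrival (asynchronous propagation).
    Returns the arrival time at which the potential first reaches the
    threshold, or None if the neuron never fires.
    """
    order = sorted(range(len(latencies)), key=lambda i: latencies[i])
    potential = 0.0
    for i in order:
        potential += weights[i]
        if potential >= threshold:
            return latencies[i]
    return None


# Hypothetical example: four inputs, equal weights, threshold of 2.
latencies = intensity_to_latency([0.2, 1.0, 0.6, 0.0])
spike = first_spike_time(latencies, [1.0, 1.0, 1.0, 1.0], threshold=2.0)
```

In this example the neuron fires as soon as the two strongest inputs have arrived, so the weaker inputs never need to be processed.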
10. Learning in SpikeNet
- Consistent with STDP-based learning (Guyonneau et al. 2005)
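As a rough illustration of "putting high weights on early-firing inputs", the sketch below assigns a high weight to the earliest inputs of one learned pattern. It is a one-shot caricature of the outcome, not STDP itself; the binary weights and the names `learn_weights` and `n_selected` are assumptions.

```python
def learn_weights(latencies, n_selected, high=1.0, low=0.0):
    """Give a high weight to the n_selected earliest-firing inputs.

    Assumption: weights are binary (high/low) and learned in one shot
    from a single presentation of the target pattern.
    """
    order = sorted(range(len(latencies)), key=lambda i: latencies[i])
    weights = [low] * len(latencies)
    for i in order[:n_selected]:
        weights[i] = high
    return weights


# Hypothetical latencies for four inputs; the two earliest get the high weight.
weights = learn_weights([0.8, 0.0, 0.4, 1.0], n_selected=2)
# weights == [0.0, 1.0, 1.0, 0.0]
```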
11. Recognition in SpikeNet
A perfect match produces a potential (or signal) of 4
Limit activation with k-winners-take-all (kWTA)
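A k-winners-take-all step can be sketched as follows. This is a generic kWTA over a list of potentials, offered as an assumption about what "limit activation" means here; the slides do not say how SpikeNet implements it or which value of k it uses.

```python
def k_winners_take_all(potentials, k):
    """Keep the k largest potentials and silence (zero) all the others.

    Ties are broken in favour of the lower index, because sorted() is stable.
    """
    if k <= 0:
        return [0.0] * len(potentials)
    ranked = sorted(range(len(potentials)),
                    key=lambda i: potentials[i], reverse=True)
    winners = set(ranked[:k])
    return [p if i in winners else 0.0 for i, p in enumerate(potentials)]


# Hypothetical potentials; only the two strongest units stay active.
active = k_winners_take_all([0.5, 3.0, 1.0, 4.0], k=2)
# active == [0.0, 3.0, 0.0, 4.0]
```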
12. Performance measure
- Take one target image and learn it
- Measure maximum noise (i.e. signal for non-targets) on 800 varied distractors
- Measure signal for target image and measure how signal decreases with image transformations (zoom, rotation, blur etc.)
- Deduce robustness to transformation (from signal to noise ratio)
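The measurement procedure above can be sketched as follows, assuming that "robust" means the target signal stays above the maximum distractor noise. The function names and all numbers in the example are made up for illustration; they are not results from the presentation.

```python
def max_noise(distractor_signals):
    """Noise level: the largest signal produced by any non-target image."""
    return max(distractor_signals)


def tolerated_range(signal_by_magnitude, noise):
    """Contiguous range of transformation magnitudes, around 0, where the
    target signal stays strictly above the noise level.

    signal_by_magnitude maps a magnitude (0 = untransformed target,
    e.g. a rotation angle) to the measured target signal.
    Returns (lowest, highest) magnitude, or None if even the
    untransformed target does not exceed the noise.
    """
    if signal_by_magnitude.get(0, float("-inf")) <= noise:
        return None
    mags = sorted(signal_by_magnitude)
    lo = hi = mags.index(0)
    while lo > 0 and signal_by_magnitude[mags[lo - 1]] > noise:
        lo -= 1
    while hi < len(mags) - 1 and signal_by_magnitude[mags[hi + 1]] > noise:
        hi += 1
    return mags[lo], mags[hi]


# Made-up numbers: three distractor signals and a target signal per rotation angle.
noise = max_noise([0.7, 1.2, 1.0])
signals = {-20: 0.5, -10: 1.8, 0: 4.0, 10: 2.0, 20: 0.9}
robust = tolerated_range(signals, noise)
# robust == (-10, 10)
```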
13. Noise estimation
14. Rotation tolerance
15. All transformations
16. Performance
- Good resistance to
  - Rotation (+/- 12°)
  - Zoom (+/- 20%)
  - Aspect ratio (between x0.7 and x1.4)
  - Shear (30)
  - Noise, blur
- Invariant to contrast, global lightness
- Staying very selective: P(FalseAlarm) < 0.0001
17. Demo
18. Performance
- We have not found natural images that cannot be learned with this approach
- Surprisingly, this simple feed-forward algorithm can learn and detect visual shapes even when they are themselves low contrast and/or noisy
19. Conclusions
- Ultra-rapid visual categorization could be done with
  - A very large number of neurons in higher-order visual areas that are selective to image fragments (e.g. fragments diagnostic of an animal)
- Categorization might be possible with only the earliest responses of these neurons
- More complex and time-consuming processes (e.g. segmentation) could be done only after the initial feed-forward pass.
20. SpikeNet vs visual system
21. Shift invariance and localization
- Visual system
  - Higher-order neurons are (relatively) shift invariant
  - Far fewer neurons needed
  - What is learnt somewhere can be recognized (almost) everywhere
  - Localization needs a second, feedback process
- SpikeNet
  - Not naturally shift-invariant: recognition neurons are duplicated to cover the whole visual field (weight sharing), which is not very realistic
  - Huge number of neurons needed
  - Localization straightforward
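The trade-off described for SpikeNet can be illustrated with a small sketch: one set of weights is duplicated at every position (weight sharing), which costs one recognition neuron per position but makes localization a simple matter of reading out the most active position. The plain sliding dot product and the names `match_map` and `localize` are assumptions for illustration, not SpikeNet's implementation.

```python
def match_map(image, template):
    """Apply the same template weights at every position of a 2-D input.

    Each output cell plays the role of one duplicated recognition neuron,
    so the number of neurons grows with the number of positions covered.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    response = []
    for y in range(ih - th + 1):
        row = []
        for x in range(iw - tw + 1):
            row.append(sum(image[y + dy][x + dx] * template[dy][dx]
                           for dy in range(th) for dx in range(tw)))
        response.append(row)
    return response


def localize(response):
    """Return the (row, column) of the strongest response."""
    _, y, x = max((v, y, x)
                  for y, row in enumerate(response)
                  for x, v in enumerate(row))
    return y, x


# Toy input: a diagonal 2x2 pattern placed at row 1, column 2.
image = [[0, 0, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1],
         [0, 0, 0, 0]]
template = [[1, 0],
            [0, 1]]
position = localize(match_map(image, template))
# position == (1, 2)
```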
22. Next steps
- More realistic hierarchical architecture
- Introduce layers with neurons selective to intermediate-complexity features (Ullman 2002, Serre 2005)
- Increasing RF sizes, progressive loss of position information
- STDP-based learning (at every stage)
23. We believe we have captured a key mechanism, both robust and selective, that is probably extensively used in the human visual system.
?
- H Kirchner, SJ Thorpe. Ultra-rapid object detection with saccadic eye movements: Visual processing speed revisited. Vision Research, 2005.
- R Guyonneau, R VanRullen, SJ Thorpe. Neurons Tune to the Earliest Spikes Through STDP. Neural Computation, 2005.
- S Ullman, M Vidal-Naquet, E Sali. Visual features of intermediate complexity and their use in classification. Nature Neuroscience, 5(7):682-687, 2002.
- T Serre, L Wolf, T Poggio. A New Biologically Motivated Framework for Robust Object Recognition. CVPR 2005.