Title: Visual Attention: Selective tuning and saliency computation using game theory
1 Visual Attention: Selective tuning and saliency computation using game theory
- Presentation prepared by Alexandre Bernardino, VisLab-ISR-IST.
- Based on the papers:
  - Modeling visual attention via selective tuning. J. Tsotsos, S. Culhane, W. Wai, Y. Lai, N. Davies, F. Nuflo. Artificial Intelligence 78 (1995) 507-546.
  - Visual attention using game theory. O. Ramström, H. Christensen. BMCV 2002, 462-471.
2 The General Vision Problem (oversimplified)
- Diagram (top to bottom): Context, Recognition/Inference, Attention/Selection, Features/Saliency, Images.
- Can all vision problems be described by this diagram?
3 Visual Attention Components (from Tsotsos et al.)
- Selection of a region of interest in the visual field
- Selection of feature dimensions and values of interest
- Control of information flow through the network of neurons that constitute the visual system
- The shifting from one selected region to the next in time
- Transformation of task information into attentional instructions
- Integration of successive attentional fixations
- Interactions with memory
- Indexing into model bases
4 The Selective Tuning Model
- Localizes interesting regions in the visual field.
- Assumes interestingness values can be easily computed for each item, depending on the task definition.
- Reduces computation by utilizing a visual pyramid.
- Addresses some problems with pyramid representations.
5 Visual Pyramids
- Small receptive fields at the bottom and large receptive fields at the top may overlap.
- At each site and scale, the information is interpreted by interpretive units of different types.
- Each interpretive unit may receive feedback, feedforward, and lateral interactions (etc.) from other units.
- Solve part of the complexity problem but introduce others.
6 Benefits of Visual Pyramids
- Multi-scale analysis and data reduction.
- Each unit computes a weighted average of its lower-level units (see the sketch below).
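
A minimal sketch (not from the papers) of such an averaging pyramid, assuming uniform 2x2 block averaging at each level; the function name build_pyramid and the level count are hypothetical:

    import numpy as np

    def build_pyramid(image, levels=4):
        """Average pyramid: each unit is the mean of a 2x2 block of units
        in the level below (uniform weights -- an assumption of this sketch)."""
        pyramid = [np.asarray(image, dtype=float)]
        for _ in range(levels - 1):
            prev = pyramid[-1]
            h, w = (prev.shape[0] // 2) * 2, (prev.shape[1] // 2) * 2
            # Average non-overlapping 2x2 blocks -> half the resolution.
            coarse = prev[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
            pyramid.append(coarse)
        return pyramid

    # Example: a 16x16 "brightness" image reduced to 4 levels.
    print([lvl.shape for lvl in build_pyramid(np.random.rand(16, 16))])
    # [(16, 16), (8, 8), (4, 4), (2, 2)]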
7 Problems with information flow due to pyramidal processing (1)
- The Context Effect
- Units at the top of the pyramid receive input from a very large sub-pyramid and are confounded by the surroundings of the attended object.
8 Problems with information flow due to pyramidal processing (2)
- Blurring
- A single event at the input affects an inverted sub-pyramid of units and gets blurred as it flows upwards, so that a large portion of the output represents part of it.
9 Problems with information flow due to pyramidal processing (3)
- Cross-talk
- Two separate visual events activate two inverted sub-pyramids that may overlap, so one event interferes with the interpretation of the other.
10 Problems with information flow due to pyramidal processing (4)
- Boundary effect
- Central items appear stronger than peripheral items, since the number of upward connections is larger for central items than for items near the image boundary.
11 Tsotsos et al. Selective Tuning Architecture
12 WTA Units
- I_{l,k}: the interpretive unit in assembly k in layer l (see the illustration after this list)
- G_{l,k,j}: the j-th WTA gating unit in assembly k in layer l, linking I_{l,k} with I_{l-1,j}
- g_{l,k}: the gating control unit for the WTA over the inputs to I_{l,k}
- b_{l,k}: the bias unit for I_{l,k}
- q_{l,j,i}: weight applied to I_{l-1,i} in the computation of I_{l,j}
- n_{l,x}: scale normalization factor
- M_{l,k}: the set of gating units for I_{l,k}
- U_{l+1,k}: the set of gating units in layer l+1 making feedback connections to g_{l,k}
- B_{l+1,k}: the set of bias units in layer l+1 making feedback connections to b_{l,k}
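
The exact update rules are in the paper; purely as a rough illustration of the notation, a single interpretive unit can be read as a weighted sum of its inputs, each passed through its WTA gating unit. The binary 0/1 gate values and the function below are assumptions of this sketch, not the paper's equations:

    import numpy as np

    def interpretive_unit(inputs_below, weights, gates):
        """I_{l,k} read as a weighted sum of the units I_{l-1,j} in its
        receptive field, with each connection gated by G_{l,k,j}
        (1 = open, 0 = pruned by the WTA process)."""
        inputs_below = np.asarray(inputs_below, dtype=float)   # I_{l-1,j}
        weights = np.asarray(weights, dtype=float)             # q_{l,k,j}
        gates = np.asarray(gates, dtype=float)                 # G_{l,k,j}
        return float(np.sum(weights * gates * inputs_below))

    # Example: three inputs, the middle connection pruned.
    print(interpretive_unit([0.2, 0.9, 0.4], [1/3, 1/3, 1/3], [1, 0, 1]))  # ~0.2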
13 Selective Tuning Overview
- Build the pyramid.
- Compute a MAX (Winner-Take-All) at the top level to determine the globally most salient items. Top-down bias can be externally introduced.
- Inhibit units not in the winner's receptive field.
- The process continues to the bottom of the pyramid.
- As the pruning of connections proceeds downwards, interpretive units are recomputed and propagated upwards.
- BENEFITS: WTAs are computed on small regions.
- RESULT: selects (segments) a region that fulfils the saliency definition at all scales (see the sketch below).
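
A minimal coarse-to-fine sketch of this selection on a simple averaging pyramid (ignoring gating units, bias units and recomputation): take the WTA winner at the top level, then at each level below restrict the WTA to the winner's child block. The 2x2 child structure and all names are assumptions of this sketch:

    import numpy as np

    def selective_tuning_trace(pyramid):
        """Pick the WTA winner at the top level, then repeatedly restrict
        the WTA to the winner's 2x2 child block (its receptive field) in
        the level below. Returns the winning (row, col) per level."""
        r, c = np.unravel_index(np.argmax(pyramid[-1]), pyramid[-1].shape)
        path = [(int(r), int(c))]
        for level in reversed(pyramid[:-1]):
            block = level[2 * r:2 * r + 2, 2 * c:2 * c + 2]
            dr, dc = np.unravel_index(np.argmax(block), block.shape)
            r, c = 2 * r + int(dr), 2 * c + int(dc)
            path.append((r, c))
        return path

    # Example: one bright item in a 16x16 image, 4-level average pyramid.
    image = np.zeros((16, 16)); image[5, 11] = 1.0
    pyr = [image]
    for _ in range(3):
        h, w = pyr[-1].shape
        pyr.append(pyr[-1].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    print(selective_tuning_trace(pyr))   # [(0, 1), (1, 2), (2, 5), (5, 11)]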
14 Information Routing
15 Results: Brightness and Orientation
- Brightness
  - Saliency: the largest and brightest item
  - Features: average gray level on rectangles 6,50x6,50
  - Pyramid: local average of the previous level
- Orientation
  - Saliency: the longest and highest-contrast straight line
  - Features: edges with orientations 0, 45, 90, 135 and sizes 3,35x3,35
  - Pyramid: 128, 108, 80, 48, 28
16 Results: Motion
- Simulated optic flow
- Matching (correlation) against 16 templates of motion patterns
- Pyramid: computes the local average; 4 levels
17 What is missing?
- Salient features are predefined and very simple, e.g.:
  - The brightest and largest item.
  - The largest and highest-contrast line.
  - The best-matching item against a database.
- Conjunction of features is ad hoc (see the sketch below):
  - WTA within each feature dimension.
  - WTA across the winners of the previous step.
  - The overall winner selects the attended region.
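
A minimal sketch of this two-stage, ad-hoc conjunction, assuming each feature dimension is simply a saliency map of the same size (names hypothetical):

    import numpy as np

    def two_stage_wta(feature_maps):
        """Stage 1: WTA within each feature dimension (best location per map).
        Stage 2: WTA across those winners. Returns the overall winning
        location and the feature dimension it came from."""
        winners = []
        for name, fmap in feature_maps.items():
            loc = np.unravel_index(np.argmax(fmap), fmap.shape)
            winners.append((float(fmap[loc]), loc, name))
        _, loc, name = max(winners)      # WTA across the per-feature winners
        return loc, name

    maps = {"brightness": np.random.rand(8, 8), "orientation": np.random.rand(8, 8)}
    print(two_stage_wta(maps))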
18 Visual Attention Using Game Theory
- Computes salient locations in multi-feature spaces.
- Each point (x,y) is associated with a unit vector of multiple features, e.g. color and brightness: n_{x,y} = (R_{x,y}, G_{x,y}, B_{x,y}, I_{x,y})^T / N
- Incorporates task knowledge (top-down bias) as a desired feature unit vector: w = (R_d, G_d, B_d, I_d)^T / M
- Salient regions in an image are defined as being similar to the desired feature vector and distinct from their neighbors: w^T n_A > w^T n_{A'}, with A ⊂ A' and n_A = Σ_{(x,y) ∈ A} n_{x,y}
- The subregion matches the wanted feature better than its surrounding (see the sketch below).
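
A minimal sketch of this criterion, assuming (R, G, B, I) features with I taken as the channel mean, and with the summed region vectors renormalized before comparison so that regions of different sizes are comparable; these normalization choices, and all names, are assumptions of this sketch rather than the paper's exact definitions:

    import numpy as np

    def unit_feature_vectors(rgb):
        """Per-pixel feature vector (R, G, B, I), normalized to unit length."""
        feats = np.concatenate([rgb, rgb.mean(axis=2, keepdims=True)], axis=2)
        return feats / (np.linalg.norm(feats, axis=2, keepdims=True) + 1e-12)

    def region_match(feats, w, rows, cols):
        """w^T n_A for subregion A: sum the unit feature vectors over A,
        renormalize, and project onto the desired direction w."""
        n_A = feats[rows, cols].reshape(-1, feats.shape[2]).sum(axis=0)
        return float(w @ (n_A / (np.linalg.norm(n_A) + 1e-12)))

    rgb = np.random.rand(32, 32, 3)
    feats = unit_feature_vectors(rgb)
    w = np.array([1.0, 0.0, 0.0, 0.5]); w /= np.linalg.norm(w)        # "reddish" bias
    inner = region_match(feats, w, slice(12, 20), slice(12, 20))      # region A
    surround = region_match(feats, w, slice(8, 24), slice(8, 24))     # region A' ⊃ A
    print(inner, surround)   # A is salient when it matches w better than A'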
19 The Market
- N actors (points).
- K types of available goods.
- Actor i has an allocation of goods n_i ∈ R^K.
- Let f(n_i) be the utility of a certain allocation of goods.
- Each agent will trade to get the z_i that solves max_{z_i} ( f(z_i) - p^T (z_i - n_i) ), where p is the price vector.
20 The Feature Market
- If f(n_i) is a concave function, then the market reaches a competitive equilibrium with n_i = n_av, the average allocation in a neighborhood.
- f(n_i) = w^T n_i is a concave function.
- A fair price is defined as: p = ∇f(n_av) = w - n_av n_av^T w = (I - n_av n_av^T) w = A w
- A is the projection onto the orthogonal complement of n_av (see the derivation sketch below).
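
One way to see why the price takes this form (a sketch; the assumption that the utility acts on the normalized allocation, f(n) = w^T n / ||n||, is mine, though it is consistent with the unit feature vectors of the previous slides):

    % Assumed utility: projection of the normalized allocation onto w.
    \begin{align*}
      f(n) &= \frac{w^{T} n}{\lVert n \rVert} \\
      \nabla f(n) &= \frac{1}{\lVert n \rVert}\Bigl(I - \frac{n\,n^{T}}{\lVert n \rVert^{2}}\Bigr) w \\
      p = \nabla f(n_{av}) &= (I - n_{av}\,n_{av}^{T})\, w = A\, w
      \qquad (\text{taking } \lVert n_{av} \rVert = 1)
    \end{align*}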
21 Saliency: Wealth
- Capital of actor i: C_i = p^T (n_i - n_av) = w^T A (n_i - n_av) = w^T A n_i
- (Figure: geometric relation between the vectors n_i, w, A n_i, n_av, and the projection n_av n_av^T n_i.)
22 Interesting Things
- The normalization matrix A enhances directions with fewer items.
- Saliency can be split into two terms (see the sketch below):
  - Intrinsic saliency, independent of the task: S_i = A n_i
  - Extrinsic saliency, dependent on the top-down bias: S_e = w^T S_i
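
A minimal sketch putting the last three slides together: for each pixel, compute the local average n_av over a box neighborhood, the projection A = I - n_av n_av^T, and the intrinsic and extrinsic saliency. The neighborhood size, the renormalization of n_av, and all names are assumptions of this sketch:

    import numpy as np

    def saliency_maps(feats, w, radius=4):
        """Per pixel: n_av = locally averaged feature vector (renormalized),
        A = I - n_av n_av^T, S_i = A n_i (intrinsic), S_e = w^T S_i (extrinsic)."""
        H, W, K = feats.shape
        S_i = np.zeros((H, W, K))
        S_e = np.zeros((H, W))
        for y in range(H):
            for x in range(W):
                patch = feats[max(0, y - radius):y + radius + 1,
                              max(0, x - radius):x + radius + 1]
                n_av = patch.reshape(-1, K).mean(axis=0)
                n_av /= np.linalg.norm(n_av) + 1e-12
                A = np.eye(K) - np.outer(n_av, n_av)
                S_i[y, x] = A @ feats[y, x]        # intrinsic saliency
                S_e[y, x] = w @ S_i[y, x]          # extrinsic saliency
        return S_i, S_e

    # Example: unit (R, G, B, I) features for a random image, "reddish" bias w.
    rgb = np.random.rand(32, 32, 3)
    feats = np.concatenate([rgb, rgb.mean(axis=2, keepdims=True)], axis=2)
    feats /= np.linalg.norm(feats, axis=2, keepdims=True) + 1e-12
    w = np.array([1.0, 0.0, 0.0, 0.5]); w /= np.linalg.norm(w)
    S_i, S_e = saliency_maps(feats, w)
    print(S_e.shape, float(S_e.max()))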