1
Visual AttentionSelective tuning and saliency
computation using game theory
  • Presentation prepared by Alexandre Bernardino,
    VisLab-ISR-IST.
  • Based on the papers:
  • Modeling visual attention via selective tuning.
    J. Tsotsos, S. Culhane, W. Wai, Y. Lai, N.
    Davies, F. Nuflo. Artificial Intelligence 78
    (1995) 507-546.
  • Visual Attention using game theory. O.
    Ramström, H. Christensen. BMCV 2002, 462-471.

2
The General Vision Problem (oversimplified)
(Diagram, top layer to bottom: Context; Recognition/Inference; Attention/Selection; Features/Saliency; Images.)
Can all vision problems be described by this diagram?
3
Visual Attention Components (from Tsotsos et al)
  • Selection of a region of interest in the visual
    field
  • Selection of feature dimensions and values of
    interest
  • Control of information flow through the network
    of neurons that constitute the visual system
  • The shifting from one selected region to the next
    in time.
  • Transformation of task information into
    attentional instructions
  • Integration of successive attentional fixations
  • Interactions with memory
  • Indexing into model bases

4
The Selective Tuning Model
  • Localizes interesting regions in the visual
    field.
  • Assumes interestingness values can be easily
    computed for each item, depending on the task
    definition.
  • Reduces computation by utilizing a visual pyramid
  • Addresses some problems with pyramid
    representations

5
Visual Pyramids
  • Small receptive fields at the bottom and large
    receptive fields at the top may overlap.
  • At each site and scale, the information is
    interpreted by interpretive units of different
    types.
  • Each interpretive unit may receive feedback,
    feedforward, and lateral interactions (etc.) from
    other units.
  • Solve part of the complexity problem but
    introduce other problems.

6
Benefits of Visual Pyramids
  • Multi-scale analysis and data reduction
  • Each unit computes a weighted average of its
    lower-level units.
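The weighted-average step above can be sketched as follows; a minimal pure-Python example, assuming uniform weights over non-overlapping 2x2 blocks (the function names are illustrative, not from the paper):

```python
def pyramid_level(image):
    """Downsample a 2D grid by averaging non-overlapping 2x2 blocks."""
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h - 1, 2):
        row = []
        for j in range(0, w - 1, 2):
            block = [image[i][j], image[i][j + 1],
                     image[i + 1][j], image[i + 1][j + 1]]
            row.append(sum(block) / 4.0)  # uniform weights; Gaussian weights are also common
        out.append(row)
    return out

def build_pyramid(image, levels):
    """Repeatedly average-downsample: the data reduction behind multi-scale analysis."""
    pyr = [image]
    for _ in range(levels - 1):
        pyr.append(pyramid_level(pyr[-1]))
    return pyr

base = [[1, 1, 5, 5],
        [1, 1, 5, 5],
        [3, 3, 7, 7],
        [3, 3, 7, 7]]
pyr = build_pyramid(base, 3)
# pyr[1] == [[1.0, 5.0], [3.0, 7.0]]; pyr[2] == [[4.0]]
```

Each level is a quarter the size of the one below it, which is where the data reduction comes from.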

7
Problems with information flow due to pyramidal
processing (1)
  • The Context Effect
  • Units at the top of the pyramid receive input
    from a very large sub-pyramid and are confounded
    by the surroundings of the attended object.

8
Problems with information flow due to pyramidal
processing (2)
  • Blurring
  • A single event at the input activates an inverted
    sub-pyramid of units and gets blurred as it flows
    upwards, so a large portion of the output
    represents only part of it.

9
Problems with information flow due to pyramidal
processing (3)
  • Cross-talk
  • Two separate visual events activate two inverted
    subpyramids that may overlap. Thus one event
    interferes with the interpretation of the other.

10
Problems with information flow due to pyramidal
processing (4)
  • Boundary effect
  • Central items appear stronger than peripheral
    items, since central objects have more upward
    connections.

11
Tsotsos et al Selective Tuning Architecture
12
WTA Units
  • I_{l,k}: the interpretive unit in assembly k in
    layer l
  • G_{l,k,j}: the jth WTA gating unit in assembly k
    in layer l, linking I_{l,k} with I_{l-1,j}
  • g_{l,k}: the gating control unit for the WTA over
    the inputs to I_{l,k}
  • b_{l,k}: the bias unit for I_{l,k}
  • q_{l,j,i}: the weight applied to I_{l-1,i} in the
    computation of I_{l,j}
  • n_{l,x}: scale normalization factor
  • M_{l,k}: the set of gating units for I_{l,k}
  • U_{l+1,k}: the set of gating units in layer l+1
    making feedback connections to g_{l,k}
  • B_{l+1,k}: the set of bias units in layer l+1
    making feedback connections to b_{l,k}

13
Selective Tuning Overview
  • Build the pyramid.
  • Compute a MAX (winner-take-all) at the top level
    to determine the globally most salient items.
    Top-down bias can be externally introduced.
  • Inhibit units not in the winner's receptive field.
  • The process continues down to the bottom of the
    pyramid.
  • As the pruning of connections proceeds
    downwards, interpretive units are recomputed and
    propagated upwards.
  • BENEFITS
  • WTAs are computed on small regions.
  • RESULT
  • Selects (segments) a region that fulfils the
    saliency definition at all scales.
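The top-down pass above can be sketched as a winner trace: take the global maximum at the coarsest level, then restrict each lower-level WTA to the winner's receptive field. This is a minimal sketch assuming a simple averaging pyramid with 2x2 receptive fields; the real model's gating/bias units and the upward recomputation step are omitted.

```python
pyramid = [
    [[1, 1, 5, 5],
     [1, 1, 5, 5],
     [3, 3, 7, 9],
     [3, 3, 7, 7]],            # level 0 (finest)
    [[1.0, 5.0],
     [3.0, 7.5]],              # level 1: 2x2 block averages of level 0
    [[4.125]],                 # level 2: average of level 1 (coarsest)
]

def wta_trace(pyr):
    """Trace the global winner from the top of the pyramid down to the base."""
    top = pyr[-1]
    # global WTA at the coarsest level
    wi, wj = max(((i, j) for i in range(len(top)) for j in range(len(top[0]))),
                 key=lambda p: top[p[0]][p[1]])
    path = [(wi, wj)]
    for level in reversed(pyr[:-1]):
        # prune: only the winner's 2x2 receptive field competes at this level
        candidates = [(2 * wi + di, 2 * wj + dj) for di in (0, 1) for dj in (0, 1)]
        wi, wj = max(candidates, key=lambda p: level[p[0]][p[1]])
        path.append((wi, wj))
    return path  # coarse-to-fine coordinates of the attended region

winner_path = wta_trace(pyramid)   # [(0, 0), (1, 1), (2, 3)]
```

Because each WTA runs only inside the previous winner's receptive field, every competition involves a handful of units rather than the whole layer.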

14
Information Routing
15
Results Brightness and Orientation
  • Brightness
  • Saliency: the largest and brightest item
  • Features: average gray level on rectangles
    6,50x6,50
  • Pyramid: local average of previous level
  • Orientation
  • Saliency: the longest and highest-contrast
    straight line
  • Features: edges with orientations 0, 45, 90,
    135 and sizes 3,35x3,35
  • Pyramid: levels of size 128, 108, 80, 48, 28

16
Results Motion
  • Simulated optic flow
  • Matching (correlation) against 16 templates of
    motion patterns
  • Pyramid computes local average. 4 levels.
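The template-matching step can be sketched as follows: correlate a local flow patch against a small bank of motion-pattern templates and keep the best-scoring one. The two templates here (rightward translation, expansion) stand in for the 16 patterns mentioned above; all values are illustrative.

```python
def correlate(patch, template):
    """Sum of elementwise products of two equally-sized flow fields ((u, v) per cell)."""
    return sum(pu * tu + pv * tv
               for (pu, pv), (tu, tv) in zip(patch, template))

templates = {
    "right": [(1, 0), (1, 0), (1, 0), (1, 0)],       # uniform rightward flow
    "expand": [(-1, -1), (1, -1), (-1, 1), (1, 1)],  # flow away from the center
}

patch = [(1, 0), (1, 0), (1, 0), (1, 0)]             # observed flow: rightward

best = max(templates, key=lambda name: correlate(patch, templates[name]))
# best == "right"
```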

17
What is missing ?
  • Salient features are predefined and very simple
  • Examples:
  • The brightest and largest item.
  • The longest and highest-contrast line.
  • The best matching item in a database.
  • Conjunction of features is ad hoc:
  • 1. WTA within each feature dimension
  • 2. WTA across the winners of step 1
  • 3. The overall winner selects the attended region

18
Visual Attention Using Game Theory
  • Computes salient locations in multi-feature
    spaces
  • Each point (x,y) is associated with a unit vector
    of multiple features, e.g. color and brightness:
    n_{x,y} = (R_{x,y}, G_{x,y}, B_{x,y}, I_{x,y})^T / N
  • Incorporates task knowledge (top-down bias) as a
    desired feature unit vector:
  • w = (R_d, G_d, B_d, I_d)^T / M
  • Salient regions in an image are defined as being
    similar to the desired feature vector and
    distinct from their neighbors:
  • w^T n_A > w^T n_{A'}, for A ⊂ A', where
    n_A = sum over (x,y) in A of n_{x,y}
  • The subregion matches the wanted feature better
    than its surrounding.
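The saliency condition w^T n_A > w^T n_{A'} can be checked directly: a region is salient if its summed feature vector matches the desired direction w better than the surrounding region does. A minimal sketch with made-up (R, G, B, I) feature values:

```python
def region_sum(features, coords):
    """Sum the per-pixel feature vectors over a set of (x, y) coordinates (n_A)."""
    k = len(next(iter(features.values())))
    total = [0.0] * k
    for xy in coords:
        for d in range(k):
            total[d] += features[xy][d]
    return total

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# desired (top-down) feature direction w: red and bright
w = [0.8, 0.1, 0.1, 0.5]

features = {
    (0, 0): [0.9, 0.1, 0.0, 0.8],   # red, bright  -> inside A
    (0, 1): [0.1, 0.8, 0.1, 0.3],   # green, dim   -> surround
    (1, 0): [0.1, 0.1, 0.9, 0.3],   # blue, dim    -> surround
}
A = {(0, 0)}
surround = {(0, 1), (1, 0)}

# w^T n_A > w^T n_surround -> the red, bright region wins
salient = dot(w, region_sum(features, A)) > dot(w, region_sum(features, surround))
```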

19
The Market
  • N actors (points)
  • K types of available goods
  • Actor i has an allocation of goods n_i ∈ R^K
  • Let f(n_i) be the utility of a certain allocation
    of goods.
  • Each agent will trade to get the z_i that solves
  • max_{z_i} ( f(z_i) - p^T (z_i - n_i) ), where p is
    the price vector.

20
The Feature Market
  • If f(n_i) is a concave function, then the market
    reaches a competitive equilibrium
  • At equilibrium, n_i -> n_av, the average
    allocation in a neighborhood
  • f(n_i) = w^T n_i is a concave function
  • A fair price is defined as
  • p = f'(n_av) = w - n_av n_av^T w = (I - n_av n_av^T) w = A w
  • A w: the projection of w onto the orthogonal
    complement of n_av
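The fair-price formula p = (I - n_av n_av^T) w can be computed without materializing the matrix, since n_av n_av^T w = n_av (n_av^T w). A minimal pure-Python sketch, assuming n_av has unit length; the numbers are illustrative:

```python
def fair_price(w, n_av):
    """p = w - n_av (n_av^T w), i.e. (I - n_av n_av^T) w for unit-length n_av."""
    s = sum(na * wi for na, wi in zip(n_av, w))   # scalar n_av^T w
    return [wi - s * na for wi, na in zip(w, n_av)]

n_av = [1.0, 0.0]          # unit-length average feature direction
w = [0.6, 0.8]             # desired feature direction
p = fair_price(w, n_av)    # [0.0, 0.8]: the component of w orthogonal to n_av
```

The result is orthogonal to n_av by construction: the part of the desired feature that the neighborhood average already supplies carries no price.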

21
Saliency = Wealth
  • Capital of actor i:
  • C_i = p^T (n_i - n_av) = w^T A (n_i - n_av) = w^T A n_i
    (since A n_av = (I - n_av n_av^T) n_av = 0 when ||n_av|| = 1)

(Figure: the vectors n_i, w, A n_i, n_av, and the projection n_av n_av^T n_i.)
22
Interesting Things
  • The normalization matrix A enhances directions
    with fewer items.
  • Salience can be split in two terms:
  • Intrinsic salience, independent of the task:
  • S_i = A n_i
  • Extrinsic salience, depending on the top-down bias:
  • S_e = w^T S_i
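The split above can be sketched directly: S_i = A n_i = n_i - n_av (n_av^T n_i) is the part of an item's feature vector that deviates from the neighborhood average, and S_e = w^T S_i weights that deviation by the task bias. A minimal pure-Python example with illustrative 2-feature vectors (n_av assumed unit length):

```python
def intrinsic_salience(n_i, n_av):
    """S_i = n_i - n_av (n_av^T n_i): the deviation of n_i from the average direction."""
    s = sum(na * ni for na, ni in zip(n_av, n_i))
    return [ni - s * na for ni, na in zip(n_i, n_av)]

def extrinsic_salience(s_i, w):
    """S_e = w^T S_i: intrinsic salience weighted by the top-down bias w."""
    return sum(wi * si for wi, si in zip(w, s_i))

n_av = [1.0, 0.0]     # unit-length neighborhood average
n_odd = [0.0, 1.0]    # item orthogonal to the average: maximally distinct
n_typ = [1.0, 0.0]    # item equal to the average: not distinct at all
w = [0.0, 1.0]        # task bias aligned with the odd item's deviation

s_odd = intrinsic_salience(n_odd, n_av)   # [0.0, 1.0]
s_typ = intrinsic_salience(n_typ, n_av)   # [0.0, 0.0]
# extrinsic_salience(s_odd, w) > extrinsic_salience(s_typ, w): the odd item wins
```

An item identical to its surroundings has zero capital regardless of the task, while a distinct item is rewarded in proportion to how well its deviation matches w.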