Recognition presentation

Title: Recognition

1
Recognition

Chapter 5
Identifying and Classifying
patterns, objects, faces

2
Ch 5 Identifying and Classifying Concepts
COGNITION AND CONSCIOUSNESS: Subliminal Visual Priming
COGNITION AND INDIVIDUAL DIFFERENCES: Right Hemisphere and Self-Recognition
COGNITION AND NEUROSCIENCE: Causal Similarity, Perceptual Similarity, Children's Categorization

I. Bottom-up Views of Pattern Recognition
A. Structural-Description-Based (SDB) Approaches
Pandemonium Model, Marr, RBC
B. View-Based (VB) Approaches
Template Matching and Object Recognition
II. Top-down Processing in Pattern Recognition
A. The Word-Superiority Effect
B. A Scene-Superiority Effect
C. Two-System Views of Object Recognition
Structural Description and Episodic Systems
III. Recognizing Faces
A. The Face-Inversion Effect
B. Wholistic Processing of Faces
C. Dissociations/Associations in Face Recognition
IV. Concepts and Categories
Similarity, Prototypes, Exemplars
Explanation-based view on Concept Representation

3
Identifying and Classifying Objects

Pattern, object, and face recognition: how does
this remarkable feat happen?
processes whereby we match a stimulus with stored
representations for the purpose of identification
I. Bottom-up Views of Pattern Recognition
views of pattern recognition differ primarily in
their characterization of bottom-up processes
basic question: when engaged in object
recognition, what are we matching the information
to in memory?
A. Structural-Description-Based (SDB) Approaches
compare patterns to a structural description in
memory, which includes a list of the visual features
as well as the relationships between these
features (often called feature analysis)
representation in memory is not visually or
spatially analogous to the pattern being
recognized
compare features of the incoming stimulus to an
abstract description of those features in memory
advantage: the particular orientation or view of the
pattern is not important; no matter what the
perspective, patterns are broken down into
component parts and compared to a structural
description in memory (also called
viewpoint-independent)

4
Pandemonium Model

feature analysis is carried out by hierarchically
organized demons
-entities that each carry out a different job,
culminating in pattern recognition
-each set of demons passes its info on to the
next set for further analysis
image demons: responsible for the initial
encoding of the pattern
feature demons: analyze the now-encoded stimulus
in terms of its component elements (e.g., /, -,
and \); different feature demons look for each of
these elements
cognitive demons: monitor the work of the feature
demons, looking for evidence that supports a
specific pattern
-each cognitive demon represents a different
pattern (e.g., A, V, W, B)
-cognitive demons shout with each piece of
evidence (i.e., info from feature demons like /,
-, or \) that supports their designated pattern
-multiple cognitive demons will be shouting if
the to-be-identified pattern shares features with
several patterns
-e.g., the A cognitive demon and the V
cognitive demon will both be shouting
decision demons: assess the shouting of the
cognitive demons
-evaluate which one (e.g., A or V) is shouting
the loudest (i.e., has detected the most
evidence) and use that information to decide on
the pattern
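
The demon hierarchy above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stroke lists for A, V, W, and B are made up for the example, the image demons are assumed to have already encoded the pattern as a list of strokes, and the penalty for missing or unexpected strokes is an added assumption so that A and V do not tie (the slides only say that demons shout for supporting evidence).

```python
from collections import Counter

# Hypothetical feature inventory: each pattern is a multiset of strokes.
# These stroke lists are illustrative assumptions, not part of the model.
COGNITIVE_DEMONS = {
    "A": Counter({"/": 1, "\\": 1, "-": 1}),
    "V": Counter({"\\": 1, "/": 1}),
    "W": Counter({"\\": 2, "/": 2}),
    "B": Counter({"|": 1, ")": 2}),
}

def feature_demons(encoded_strokes):
    """Feature demons: count how often each stroke type occurs."""
    return Counter(encoded_strokes)

def cognitive_demons(found):
    """Each cognitive demon shouts louder for every supporting stroke.

    Assumption: missing or unexpected strokes lower the shout.
    """
    shouts = {}
    for pattern, wanted in COGNITIVE_DEMONS.items():
        matched = sum((found & wanted).values())
        mismatched = sum((found - wanted).values()) + sum((wanted - found).values())
        shouts[pattern] = matched - mismatched
    return shouts

def decision_demon(shouts):
    """Decision demon: pick the pattern whose demon shouts the loudest."""
    return max(shouts, key=shouts.get)

def recognize(encoded_strokes):
    return decision_demon(cognitive_demons(feature_demons(encoded_strokes)))

shouts = cognitive_demons(feature_demons(["/", "-", "\\"]))
print(shouts)                          # both A and V shout, A the loudest
print(recognize(["/", "-", "\\"]))
```

With these assumed stroke lists, the A and V demons both respond to the input, but the A demon has more supporting evidence, so the decision demon settles on A.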

5
Pandemonium model

model fits well with what we know about lower
level perceptual processes
there are cells in the visual cortex that
correspond to lines of certain lengths,
orientations, etc.
so it makes sense to characterize pattern
recognition as a gradual process of evidence
accumulation based on a feature-by-feature
analysis of incoming information
problems
-limited to explanations of how we recognize
simple patterns like letters
-representational economy--unlikely that we have
a near-infinite number of feature analyzers to
analyze a near-infinite number of features

6
Visual Pathways
7
Extracting Sensory Messages: Visual Processing in
the Brain (Pp. 130-131)

Neural messages travel to the brain via the optic nerve
The nerve splits at the optic chiasm so that information from
the left visual field goes to the right hemisphere and
vice versa
Within each hemisphere, information goes to the
lateral geniculate nucleus (90%, in thalamus) and
superior colliculus (10%, in midbrain)
The former deals with colour, texture, depth
The latter deals with movement
Demonstrates parallel processing
From the lateral geniculate nucleus,
information goes to primary visual cortex (in
occipital lobe)
Feature detectors: cells in cortex that react to
simple visual stimuli such as edges, lines,
angles (may be orientation specific)

8
Feature Detectors in Visual Cortex
9
Marr's View

goal of the visual system is to transform a
2-dimensional retinal image into a 3-dimensional
percept that is quickly and easily identified
processing stages
first stage: register the information contained
in the retinal image
(note: the specifics of the retinal image depend
on the particular view of the object)
information about light intensity, boundaries,
discontinuities, edges, and groupings
used to form a primal sketch: a rough rendition of
the most primitive elements of the object
information from the primal sketch is used to
construct a 2½-D sketch
includes information about the orientation and relative
depth of the visible surfaces, and also about
discontinuities in depth and orientation

10
Marr's View

information derived from the primal sketch and
2½-D sketch forms the basis for the construction
of a 3-D model of the object
the parts of the 3-D model are termed volumetric
primitives, but are basically generalized 3-D
cylinders (a pipe-cleaner version of the object
we're looking at), i.e., there is less detail
unlike the primal sketch and the 2½-D sketch
(which differ depending on the perspective of the
observer), the 3-D model is not view-specific
(viewpoint independent)

11
Recognition-by-Components (RBC) Theory

object recognition is a matter of parsing objects
into features
analyzed features are not simple line segments,
angles, and curves; the features are basic 3-D
shapes termed geons
a total of 36 geons serve as visual
primitives: simple shapes that can combine to
form most other, more complex shapes
similar to Marr's theory in that it proposes a
series of hierarchically arranged stages whereby
information about component features is used to
identify the object
-first, an edge extraction process looks for differences in
features like texture, luminance, and color, and
results in a simple line drawing of the object
-next, two processes occur simultaneously
-the detection of nonaccidental features: actual
features of the stimulus, rather than some
accident of a peculiar observer perspective
-parsing the object at areas where there appear
to be boundaries between the parts of the object

12
Recognition-by-Components (RBC) Theory

next, with the information gained to this point,
the components of the figure, the geons, are
determined
this set of components is matched with object
representations in memory
when a match is found, the object is identified
critical predictions
objects should become less identifiable the
harder it gets to recover their components
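
A minimal sketch of the matching step, assuming a structural description can be reduced to a set of geons plus one relation between them. The object names, geon names, and relation labels below are hypothetical and chosen only for illustration; note that nothing in the stored description refers to viewing angle, and that recognition fails once a geon cannot be recovered.

```python
# Hypothetical structural descriptions: a set of geons plus the relation
# between them. The geon and relation names are illustrative only.
MEMORY = {
    "cup": (frozenset({"cylinder", "curved handle"}), "handle on side"),
    "pail": (frozenset({"cylinder", "curved handle"}), "handle on top"),
}

def identify(recovered_geons, relation):
    """Return the stored object whose geons and relation match, else None."""
    description = (frozenset(recovered_geons), relation)
    for name, stored in MEMORY.items():
        if stored == description:
            return name
    return None

# Same geons, different relation: two different objects.
print(identify(["cylinder", "curved handle"], "handle on side"))
print(identify(["curved handle", "cylinder"], "handle on top"))
# One geon could not be recovered: no match is found.
print(identify(["cylinder"], "handle on side"))
```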

13
Biederman and Cooper (1991)

participants were presented with sketches of
objects in which 50% of the contours had been
deleted
for some figures, the contour deletion disrupted
the figure at the points of segmentation that
would be used to carve the object into its
component geons
-prediction: object recognition should suffer
for other figures, the contour deletion did not
prevent recovery of the geons (although the same
amount of the contour was deleted)
-prediction: object recognition should not be affected
findings were consistent with the RBC theory
a further prediction: rotation of objects should not hinder recognition,
because changes in orientation do not influence the basic
components of the object and their relationships
to one another

14
View-Based (VB) Approaches

objects are recognized wholistically through a
process of comparison with a stored analog
when a match is found, the pattern is recognized
termed viewpoint-dependent because identification
of an object depends critically on the particular
perspective the viewer has
to identify the object, an image matching this
particular view must be found, or the incoming
stimulus image must be manipulated in some way
(e.g., rotated) until a match is found with
images represented in memory
An Early Attempt: Template Matching
our store of general knowledge includes a set of
templates (copies) of every pattern that we might
encounter
when we encounter a pattern that needs to be
identified, the mind quickly rifles through its
set of templates; when a match is found, the
pattern is given the label stored with the
template (i.e., the pattern is recognized)
even the slightest change in the pattern will
lead to a recognition failure
problems
-too rigid
-lack of economy
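
The rigidity problem is easy to see in a toy version of template matching. The sketch below assumes patterns are tiny text grids and that recognition requires an exact copy in the template store; the two templates are invented for the example.

```python
# Hypothetical template store: each label maps to one exact grid.
TEMPLATES = {
    "T": ("###",
          ".#.",
          ".#."),
    "L": ("#..",
          "#..",
          "###"),
}

def match_template(pattern):
    """Return the label of the template identical to the pattern, else None."""
    for label, template in TEMPLATES.items():
        if tuple(pattern) == template:
            return label
    return None

def rotate_clockwise(pattern):
    """Rotate a grid 90 degrees clockwise."""
    width = len(pattern[0])
    return tuple("".join(row[c] for row in reversed(pattern)) for c in range(width))

upright_t = ("###", ".#.", ".#.")
print(match_template(upright_t))                    # exact copy is recognized
print(match_template(rotate_clockwise(upright_t)))  # rotated copy is not
```

To recognize the rotated T, this scheme would need a separate template for every orientation (and size, and font), which is the lack-of-economy problem.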

15
Modern Versions of the View-Based Approach
Tarr and Pinker (1989)

taught participants names for shapes; shapes were
always presented in the same orientation during
training
during a test phase, these shapes were presented
for recognition
participants responded quickly if the shapes were
presented at the same orientation as in training,
but were successively slower as the degree of
rotation from that original position increased,
indicating that perception is viewpoint-dependent
after training on the new orientations,
participants eventually became equally fast no
matter what the orientation
results parallel the everyday recognition of
objects
we start with one representation of objects
(termed the canonical representation), and
through experience with objects in many different
orientations and viewed from many different
perspectives, we develop multiple
representations, or views of the objects
these multiple views serve as the templates for
later recognition
orientation tends not to affect visual
recognition under most circumstances, since
everything we must recognize has received
extensive exposure from different views
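
One way to picture the multiple-views account is to let recognition time grow with the rotation needed to reach the nearest stored view. The sketch below assumes a linear cost, and the baseline and slope values are invented for illustration; they are not estimates from Tarr and Pinker's data.

```python
def angular_distance(a, b):
    """Smallest rotation, in degrees, between two orientations."""
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)

def recognition_time(test_angle, stored_views, base_ms=500.0, ms_per_degree=2.0):
    """Hypothetical response time: baseline plus a cost per degree of
    rotation to the nearest stored view (both parameters are assumptions)."""
    nearest = min(angular_distance(test_angle, view) for view in stored_views)
    return base_ms + ms_per_degree * nearest

after_training = [0]                # only the trained orientation is stored
after_practice = [0, 90, 180, 270]  # views added through later experience

for angle in (0, 90, 180):
    print(angle,
          recognition_time(angle, after_training),
          recognition_time(angle, after_practice))
```

With a single stored view, time rises with rotation; once the practiced orientations are stored as views of their own, the tested angles are all equally fast.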

16
Logothetis, Pauls, and Poggio (1995)

taught monkeys to recognize novel 3D objects from
a variety of different perspectives, and like
humans, they eventually became equally proficient
at recognizing these objects given any of the
rotations
neural activity associated with recognition:
different sets of cells responded most strongly
to certain objects, indicating that certain
networks were devoted to certain objects
a given set of cells responded most strongly when
that object appeared in the same orientation as
it had during training; the responses decreased
systematically with increases in the rotation
from that perspective
monkeys seemed to have what might be termed
physiological templates that were devoted to
recognizing a specific object in a specific
orientation

17
Object Recognition Views or Structural
Descriptions?

basic question: is object recognition truly
independent of the particular view? there is
support for both positions
Tarr and Bülthoff (1995) suggest that object
recognition should be conceived of as a continuum
at one end are heavily viewpoint-dependent
mechanisms that are used for making subtle
discriminations among similar exemplars
(distinguishing a finch from a sparrow)
-explains Tarr and colleagues' results: the stimuli
used required fairly subtle discriminations, so
object recognition appears to be
viewpoint-dependent
at the other end are heavily viewpoint-independent
mechanisms that are used when gross categorical
judgments are required (distinguishing a hammer
from a sparrow)
-explains Biederman and colleagues' results: the
stimuli used required fairly gross
discriminations, so the
object recognition process appears to be
viewpoint-independent

18
II. Top-down Processing in Pattern Recognition

A. The Word-Superiority Effect
Reicher (1969)
participants were briefly presented with letter
strings that either did or did not form a word
(e.g., WORK or OWRK, respectively)
following a rapid display of such a letter
string, participants were queried about the
component letters
two alternatives were presented and participants
were to pick the one they had just seen
identification accuracy was higher if the letter
had been presented in the context of a word,
relative to when it had been presented in the
context of a non-word
if letter identification had been based solely on
bottom-up processing, then letter identification
accuracy should have been equivalent in the two
conditions
the data that make up the letter D do not vary
depending on the context in which D appears: a
D is a D

19
An Interactive Approach to Word Recognition

word superiority effect is the result of an
interplay between the bottom-up processes and the
top-down processes
McClelland and Rumelhart (1981) model assumes
that words are represented in our mental
dictionaries at three different levels: features,
letters, and whole words

each type of information about a word is
analyzed simultaneously, and information about
the word's identity accumulates
-example: the letter A
activation of the A detector will excite nodes
representing words that have an A, but inhibit
nodes representing words that don't have an
A (bottom-up)
activation of the A detector will excite nodes
representing features that are part of the letter
A (e.g., slanted lines), but inhibit nodes representing
features that are not part of the letter A (e.g., a curved
line) (top-down)

20
An Interactive Approach to Word Recognition

when we read a word (e.g., work), info about the
component letters leads to the activation of
representations at the word level that include
these letters
this heightened activation at the word level then
feeds back to the letter level, enhancing
activation of the component letters of the words
activated
but only the letters actually in the word (e.g., work) are
also receiving the bottom-up (feature-based)
activation; therefore, evidence for these
particular letters will accumulate the fastest,
facilitating their identification
nonwords don't have representations at the word
level; the letters in the nonword (e.g., owrk)
don't receive this additional top-down
activation; therefore, identifying a
letter in a non-word will be slower
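
The feedback loop described above can be sketched as a single bottom-up/top-down cycle. This is a loose illustration, not McClelland and Rumelhart's implementation: the four-word lexicon, the activation values, and the rule that a word node becomes active only when more than half of its letters match are all assumptions, and inhibition is left out entirely.

```python
LEXICON = ["WORK", "WORD", "FORK", "WEAK"]  # hypothetical mini-lexicon

def letter_activation(display, position, letter,
                      bottom_up=0.5, feedback=0.1, threshold=0.5):
    """Activation of `letter` at `position` after one cycle.

    All numeric parameters are illustrative assumptions.
    """
    # Bottom-up: feature evidence supports the letter actually displayed.
    activation = bottom_up if display[position] == letter else 0.0
    for word in LEXICON:
        # Word level: share of positions where the display matches the word.
        overlap = sum(d == w for d, w in zip(display, word)) / len(word)
        # Top-down: sufficiently active words feed back to their own letters.
        if overlap > threshold and word[position] == letter:
            activation += feedback * overlap
    return activation

in_word = letter_activation("WORK", 3, "K")
in_nonword = letter_activation("OWRK", 3, "K")
print(in_word, in_nonword)
```

Under these assumptions the K in WORK ends up more active than the K in OWRK, because only the word display drives word nodes strongly enough to send activation back down to the letter level.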

21
A Scene-Superiority Effect

identifying an object within a scene is
facilitated when the object is consistent with
the scene (e.g., a refrigerator in a kitchen),
relative to when it is inconsistent (e.g., a
refrigerator on a farm)
Biederman, Mezzanotte, and Rabinowitz (1982)
participants saw the name of a common object,
followed by a real-world scene, and then a mask
that included a location cue
participants were to determine whether the common
object had appeared at that location in the scene
detection performance was better when the object
was consistent with the scene relative to when
the object was inconsistent

22
Hollingworth and Henderson (1998)

a subtle response bias may have been at play in
the Biederman procedure
participants were told what to look for in the
scene prior to the scene's presentation; this
expectation may have influenced the participants'
response
seeing "sheep" might lead to an expectation of a
farm scene
if that scene is indeed presented, then
participants will have a bias to respond "yes"
when an inconsistent scene is presented (e.g., an
office), the participants will demand a lot of
visual evidence (i.e., bottom-up information)
before they'll acknowledge that a sheep was in
the office; in other words, they will have a bias
to respond "no"
employed a procedure much like the one used by
Reicher (1969) to investigate word-superiority
after presentation of a scene (e.g., farm),
participants were given a forced-choice
recognition test in which two alternatives were
presented
either both scene-consistent (sheep, pig) or both
scene-inconsistent (coffee maker, mixer)
participants had to pick the one that was in the
scene

23
Hollingworth and Henderson (1998)

predictions
if scene context facilitates recognition, then in
the scene-consistent trials, participants should
successfully choose the correct option a majority
of the time
if scene superiority effect had been due to a
guessing bias, then this procedure should
eliminate the scene superiority effect
when faced with two alternatives that both fit
with the context, neither item will have an
advantage; they're both likely choices
no scene-superiority effect was found
participants were just as good at choosing which
of two scene inconsistent objects had occurred as
they were at guessing which of two scene
consistent objects had occurred
authors conclude that scene context does not
facilitate the recognition of scene components
the processes underlying object perception seem
to be isolated from information about objects
that might occur in any given scene
the lack of effect of contextual knowledge on
object recognition is probably a good thing
if identification of objects involved consulting
general knowledge about objects both relevant and
irrelevant to the scene, it would get bogged
down; top-down processing would be an unnecessary
drag on the system

24
Hollingworth and Henderson (1998), cont.

so why does top-down information facilitate
the recognition of letters?
our experience with any given natural scene is not as
extensive as our experience with a given word
the letters that can appear in a word are much
more constrained by context (i.e., context
provides more information about the component
letters of a word) than are the objects that can
appear in a scene
Two-System Views of Object Recognition
objects are represented in two separate
systems--a structural description system and an
episodic system
the structural description system includes
information about the global shape of an object,
as well as the relationships among the object's
parts
the episodic system encodes semantic (i.e.,
meaning) and visual information about the
objects, such as their identity, their function,
and specifics of their visual presentation

25
Cooper and Shepard (1992)

encoding phase: participants are asked to make a
judgment about possible and impossible objects
possible objects are ones whose surfaces and
edges are configured such that they could exist
in a 3-D world
impossible objects are those that could not exist
in three dimensions

two encoding tasks
global structure task--decide whether the object
faced right or left
induce participants to encode the global
properties of the objects, and the
interrelationships between the parts
meaningful properties task--name something that
each object resembled
required a meaningful elaboration of the object
later phase: participants are given two different
object recognition tests
in one test, they are asked whether they saw each
figure earlier
in the other test, they are asked to rapidly classify a
series of objects as possible or impossible;
some of these objects had been presented earlier

dependent variable is whether participants have
an easier time making the possible-impossible
judgment for figures that they had seen before;
this benefit from having seen an object before is
called priming
results
if participants had to recognize the object as
one they had seen before, the most effective
encoding was to think about it in meaningful
terms (i.e., meaningful properties task)
foreshadows the levels-of-processing effect: we remember
material better when it's processed in terms of
its meaning rather than its physical structure
if participants had to rapidly classify objects
as possible or impossible, the most effective
encoding was to notice the global structure of
the object (i.e., the global structure task)

27
Structural Description System

includes a stored representation of the overall
structure of objects and is used as the basis for
their rapid recognition
subserved performance on the speeded
possible-impossible classification test
the only encoding method that primed the
representation in this system was the global
structure task (which induced participants to
notice the overall structure of objects)
however, rapid recognition of impossible objects
was not aided by the global structure encoding task
suggestion: the structural description system
can't recognize impossible objects?
based on bottom-up processing: this system
contains the actual data that we analyze

28
Episodic System

includes a stored representation of the identity
of the object and its distinctive physical
characteristics; this system is the basis for our
memory of and knowledge about objects
subserved performance on the recognition test
because this system stores information about what objects
are, recognition was most affected by the
encoding task that required participants to say
what each object resembled
based on top-down processing: this system
includes our conceptual knowledge about the
objects that we need to identify
global judgments primed representations in the
structural description system, but only for
possible objects; the rapid recognition of
impossible objects showed no benefit from the
global structure judgment
the structural description system is incapable of
representing objects that could not exist in
three dimensions
indicates the importance of top-down processing
in object recognition--even the structural
description system, which provides the data for
bottom-up processing, is influenced by our
previous experience

29
III. Recognizing Faces

The Face-Inversion Effect
the deleterious effect of inversion is
disproportionately great for faces compared to
other objects
to recognize objects, we need first-order
relational information--information about the
parts of an object, and how those parts relate to
one another
first-order relational information is not enough
to recognize faces; noticing that two eyes are
above the nose, which is above the mouth, may be
enough to recognize that something is a face, but
doesn't allow for recognition of whose face it is
to recognize faces, we need second-order
relational information
second-order relational information involves
comparing the first order analysis to facial
features of a typical or average face
this typical face is built up through experience,
and serves as an implicit standard against which
we compare faces that we see
when a face is inverted, this disrupts the
encoding of second-order relational information
therefore inversion disproportionately harms face
recognition

30
Diamond and Carey (1986)

in addition to replicating the basic inversion
effect with human faces, they also investigated
recognition of dog faces
compared dog experts with dog non-experts
dog experts are so experienced with dogs that
they encode dog faces in terms of second-order
relational properties
inversion should have adverse effects on the dog
non-experts' recognition of human faces, but not
dog faces
inversion should have adverse effects on the dog
experts' recognition of both human faces and dog
faces
predicted results were obtained
Wholistic Processing of Faces
faces are encoded, stored, and retrieved from
memory as whole configurations, rather than as a
set of features or parts

31
Tanaka and Farah (1993)

to the degree that a given object is stored as a
set of features, then those features ought to be
useful cues in retrieving the remaining
information about the object
if an object is stored as a whole configuration
then presenting part of that whole will not be
particularly helpful in recognition
presented participants with sketches of faces and
sketches of houses, both decomposable in terms of
distinct features; each face and house was given
a label, such as "Larry's house" or "Larry's
face"
on a later recognition test, participants were
asked about the faces and houses
isolated-part condition: given a choice of two
object parts, and had to pick which one of them
had been part of an earlier-presented object
(e.g., "Which of these is Larry's nose?" or "Which
of these is Larry's door?")
whole-object condition: given a choice of two
whole objects, and had to pick out the one they
had seen earlier (e.g., "Which of these is Larry's
face?" or "Which of these is Larry's house?")

32
Tanaka and Farah (1993)

results
the type of question asked didn't matter for
recognition of houses; participants were just as
good at recognizing parts of houses as they were
at recognizing whole houses
for faces, the type of question did matter;
participants were not as good at recognizing face
parts as they were at recognizing whole faces
face recognition is similar to view based
approaches to pattern/object recognition

33
Dissociations and Associations in Face Recognition

some suggest at least two subsystems for
recognizing letters, objects, and faces rather
than a special mechanism for face recognition
alexia: deficits in the ability to recognize
printed words
object agnosia: deficits in the ability to
recognize everyday objects
prosopagnosia: an inability to recognize familiar
faces
There are associations and dissociations among
these three disorders
deficits in recognizing faces are often found in
association with at least some deficits in
recognizing objects
deficits in recognizing letters/words are often
found in association with at least some deficits
in recognizing objects
these two findings indicate that object
recognition does share some mechanisms with both
letter/word and face recognition

34
Dissociations and Associations in Face Recognition

deficits in object recognition with spared face
and letter/word processing or the opposite
pattern (deficits in face and letter/word
processing with spared object recognition) are
rarely found
the fact that face recognition and letter/word
recognition are almost never spared together or
impaired together indicates that face recognition
and letter/word recognition rely on quite
distinct mechanisms
the fact that problems in object recognition
quite often line up with problems in either face
or letter recognition indicates that object
recognition relies on some of the mechanisms
required for each
visual recognition involves two primary
mechanisms
one mechanism used for representation/combination
of parts
-important for letter/word recognition; not
important for face recognition
second mechanism used for representation/combination
of complex wholes
-important for face recognition; not important for
letter/word recognition
a combination of both is important for object
recognition
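
The logic of the two-mechanism account can be checked by enumerating what each possible pattern of damage would impair. The sketch assumes, as a simplification, that damage is all-or-none and that any damage to a mechanism an ability relies on impairs that ability; the labels "parts" and "wholes" are shorthand for the two mechanisms above.

```python
from itertools import product

# Which mechanism(s) each kind of recognition relies on, per the account above.
NEEDS = {
    "letters/words": {"parts"},
    "faces": {"wholes"},
    "objects": {"parts", "wholes"},
}

def impaired_abilities(damaged):
    """Abilities that rely on at least one damaged mechanism (assumption:
    damage is all-or-none)."""
    return {ability for ability, needs in NEEDS.items() if needs & damaged}

patterns = []
for parts_hit, wholes_hit in product([False, True], repeat=2):
    damaged = {name for name, hit in (("parts", parts_hit), ("wholes", wholes_hit)) if hit}
    patterns.append(impaired_abilities(damaged))
    print(sorted(damaged), "->", sorted(patterns[-1]))
```

No pattern of damage in this sketch yields an object deficit on its own, or a face-plus-letter deficit with spared object recognition, which matches the two dissociations described as rarely found.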
