Recognition - PowerPoint PPT Presentation

About This Presentation
Title:

Recognition

Slides: 35
Provided by: michael411
Transcript and Presenter's Notes

1
Recognition
  • Chapter 5
  • Identifying and Classifying
  • patterns, objects, faces

2
Ch 5 Identifying and Classifying Concepts
COGNITION AND CONSCIOUSNESS: Subliminal Visual Priming
COGNITION AND INDIVIDUAL DIFFERENCES: Right Hemisphere and Self-Recognition
COGNITION AND NEUROSCIENCE: Causal Similarity, Perceptual Similarity, Children's Categorization
  • I. Bottom-up Views of Pattern Recognition
  • A. Structural-Description-Based (SDB) Approaches
  • Pandemonium Model, Marr, RBC
  • B. View-Based (VB) Approaches
  • Template Matching and Object Recognition
  • II. Top-down Processing in Pattern Recognition
  • A. The Word-Superiority Effect
  • B. A Scene-Superiority Effect
  • C. Two-System Views of Object Recognition
  • Structural Description and Episodic Systems
  • III. Recognizing Faces
  • A. The Face-Inversion Effect
  • B. Wholistic Processing of Faces
  • C. Dissociations/Associations in Face Recognition
  • IV. Concepts and Categories
  • Similarity, Prototypes, Exemplars
  • Explanation-based view on Concept Representation

3
Identifying and Classifying Objects
  • Pattern, object, and face recognition: how does
    this remarkable feat happen?
  • processes whereby we match a stimulus with stored
    representations for the purpose of identification
  • I. Bottom-up Views of Pattern Recognition
  • views of pattern recognition differ primarily in
    their characterization of bottom-up processes
  • basic question--when engaged in object
    recognition, what are we matching the information
    to in memory?
  • A. Structural-Description-Based (SDB) Approaches
  • compare patterns to a structural description in
    memory, which includes a list of the visual features
    as well as the relationships between these
    features (often called feature analysis)
  • representation in memory is not visually or
    spatially analogous to the pattern being
    recognized
  • compare features of incoming stimulus to an
    abstract description of those features in memory
  • advantage: the particular orientation or view of the
    pattern is not important; no matter what the
    perspective, patterns are broken down into
    component parts and compared to a structural
    description in memory (also called
    viewpoint-independent)

4
Pandemonium Model
  • feature analysis is carried out by hierarchically
    organized demons
  • -entities each carry out different jobs that
    culminate in pattern recognition
  • -each set of demons passes their info on to the
    next set for further analysis
  • image demons: responsible for the initial
    encoding of the pattern
  • feature demons: analyze the now encoded stimulus
    in terms of its component elements (e.g., /, -,
    and \); different feature demons look for each of
    these elements
  • cognitive demons: monitor work of feature demons,
    looking for evidence that supports a specific
    pattern
  • -each cognitive demon represents a different
    pattern (e.g., A, V, W, B)
  • -cognitive demons shout with each piece of
    evidence (i.e., info from feature demons like /,
    -, or \) that supports their designated pattern
  • -multiple cognitive demons will be shouting if
    the to-be-identified pattern shares features with
    several cognitive demons
  • -e.g., the A cognitive demon, plus the V
    cognitive demon will be shouting
  • decision demons: assess shouting of cognitive
    demons
  • evaluate which one (e.g., A or V) is shouting
    the loudest (i.e., has detected the most
    evidence) and use that information to decide on
    the pattern
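
The demon hierarchy above can be sketched in a few lines of Python. This is a minimal illustration rather than the original formulation of the model: the feature inventory in `LETTER_FEATURES`, the function names, and the one-unit penalty for expected-but-missing features (added so that a bare V shape is not claimed equally loudly by the A demon) are all assumptions made for the example.

```python
# Hypothetical feature inventory (an assumption for illustration only).
LETTER_FEATURES = {
    "A": {"/", "\\", "-"},
    "V": {"/", "\\"},
    "T": {"|", "-"},
    "L": {"|", "_"},
}
ALL_FEATURES = set().union(*LETTER_FEATURES.values())


def image_demon(strokes):
    """Initial encoding: here the 'image' is simply the set of strokes shown."""
    return set(strokes)


def feature_demons(image):
    """Each feature demon looks for one stroke and reports it if present."""
    return {feature for feature in ALL_FEATURES if feature in image}


def cognitive_demons(found):
    """Each letter demon shouts louder for every supporting feature.

    Assumption: a demon also quiets down by one unit for each feature it
    expects but does not see, which breaks ties such as A versus V.
    """
    return {
        letter: len(needed & found) - len(needed - found)
        for letter, needed in LETTER_FEATURES.items()
    }


def decision_demon(shouts):
    """Pick the letter whose cognitive demon is shouting the loudest."""
    return max(shouts, key=shouts.get)


def recognize(strokes):
    found = feature_demons(image_demon(strokes))
    return decision_demon(cognitive_demons(found))
```

Given the strokes /, \ and -, both the A and V demons shout, but the A demon has accumulated more evidence, so `recognize(["/", "\\", "-"])` settles on A.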

5
Pandemonium model
  • model fits well with what we know about lower
    level perceptual processes
  • there are cells in the visual cortex that
    correspond to lines of certain lengths,
    orientations, etc.
  • so it makes sense to characterize pattern
    recognition as a gradual process of evidence
    accumulation based on a feature-by-feature
    analysis of incoming information
  • problems
  • -limited to explanations of how we recognize
    simple patterns like letters
  • -representational economy--unlikely that we have
    a near-infinite number of feature analyzers to
    analyze a near-infinite number of features

6
Visual Pathways
7
Extracting Sensory Messages: Visual Processing in
the Brain (Pp. 130-131)
  • Neural messages travel to brain via optic nerve
  • Splits at optic chiasm so that information from
    left visual field goes to right hemisphere and
    vice versa
  • Within each hemisphere, information goes to the
    lateral geniculate nucleus (90%, in thalamus) and
    superior colliculus (10%, in midbrain)
  • Former deals with colour, texture, depth
  • Latter deals with movement
  • Demonstrates parallel processing
  • From the lateral geniculate nucleus,
    information goes to primary visual cortex (in
    occipital lobe)
  • Feature detectors: cells in cortex that react to
    simple visual stimuli such as edges, lines,
    angles (may be orientation specific)

8
Feature Detectors in Visual Cortex
9
Marr's View
  • goal of the visual system is to transform a
    2-dimensional retinal image into a 3-dimensional
    percept that is quickly and easily identified
  • processing stages
  • first stage: register the information contained
    in the retinal image
  • (note: the specifics of the retinal image depend
    on the particular view of the object)
  • information about light intensity, boundaries,
    discontinuities, edges, and groupings
  • used to form a primal sketch--rough rendition of
    the most primitive elements of the object
  • information from the primal sketch is used to
    construct a 2½-D sketch
  • includes information about the orientation and relative
    depth of the visible surfaces, and also about
    discontinuities in depth and orientation

10
Marr's View
  • information derived from the primal sketch and
    2½-D sketch forms the basis for the construction
    of a 3-D model of the object
  • the parts of the 3-D model are termed volumetric
    primitives, but are basically generalized 3-D
    cylinders (a pipe-cleaner version of the object
    we're looking at), i.e., there is less detail
  • unlike the primal sketch and the 2½-D sketch
    (which differ depending on the perspective of the
    observer), the 3-D model is not view-specific
    (viewpoint-independent)

11
Recognition-by-Components (RBC) Theory
  • object recognition is a matter of parsing objects
    into features
  • analyzed features are not simple line segments,
    angles, and curves; the features are basic 3-D
    shapes termed geons
  • a total of 36 geons serve as visual
    primitives--simple shapes that can combine to
    form most other more complex shapes
  • similar to Marr's theory in that it proposes a
    series of hierarchically arranged stages whereby
    information about component features is used to
    identify the object
  • edge extraction process--looks for differences in
    features like texture, luminance, color, and
    results in a simple line drawing of the object
  • -next, two processes occur simultaneously
  • -the detection of nonaccidental features--actual
    features of the stimulus, rather than some
    accident of a peculiar observer perspective
  • -parsing the object at areas where there appear
    to be boundaries between the parts of the object

12
Recognition-by-Components (RBC) Theory
  • next, with the information gained to this point,
    the components of the figure, the geons, are
    determined
  • this set of components is matched with object
    representations in memory
  • when a match is found, the object is identified
  • critical predictions
  • objects should become less identifiable the
    harder it gets to recover their components
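
As a rough sketch of the final matching step, assume the geons and their relations have already been recovered from the line drawing. The two stored objects (a cup and a pail built from the same two geons in different arrangements), the geon names, and the relation labels are illustrative assumptions, not content from the slides.

```python
# Stored structural descriptions: a set of (geon, relation, geon) triples.
# All names here are hypothetical labels chosen for the example.
OBJECT_MEMORY = {
    "cup": frozenset({("curved_handle", "attached_to_side_of", "cylinder")}),
    "pail": frozenset({("curved_handle", "attached_to_top_of", "cylinder")}),
}


def identify(recovered):
    """Return the stored object whose description matches, else None.

    `recovered` is a set of triples, or None when the geons could not be
    recovered at all (e.g., contours deleted at the segmentation points).
    """
    if recovered is None:
        return None
    description = frozenset(recovered)
    for name, stored in OBJECT_MEMORY.items():
        if stored == description:
            return name
    return None
```

Nothing in the stored description records the viewing angle, which is why this kind of representation is viewpoint-independent; and when the components cannot be recovered, the match fails, in line with the critical prediction above.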

13
Biederman and Cooper (1991)
  • participants were presented with sketches of
    objects in which 50% of the contours had been
    deleted
  • for some figures, the contour deletion disrupted
    the figure at the points of segmentation that
    would be used to carve the object into its
    component geons
  • Predictions
  • object recognition should suffer
  • for other figures, the contour deletion did not
    prevent recovery of the geons (although the same
    amount of the contour was deleted)
  • object recognition should not be affected
  • findings were consistent with the RBC theory
  • rotation of objects should not hinder recognition
  • changes in orientation do not influence the basic
    components of the object and their relationships
    to one another

14
View-Based (VB) Approaches
  • objects are recognized wholistically through a
    process of comparison with a stored analog
  • when a match is found, the pattern is recognized
  • termed viewpoint-dependent because identification
    of an object depends critically on the particular
    perspective the viewer has
  • to identify the object, an image matching this
    particular view must be found, or the incoming
    stimulus image must be manipulated in some way
    (e.g., rotated) until a match is found with
    images represented in memory
  • An Early Attempt: Template Matching
  • our store of general knowledge includes a set of
    templates (copies) of every pattern that we might
    encounter
  • when we encounter a pattern that needs to be
    identified, the mind quickly rifles through its
    set of templates; when a match is found, the
    pattern is given the label stored with the
    template (i.e., the pattern is recognized)
  • even the slightest change in the pattern will
    lead to a recognition failure
  • problems
  • too rigid
  • lack of economy
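
A minimal sketch of strict template matching shows why the approach is too rigid. The 3x3 grids and the names `TEMPLATES` and `match_template` are assumptions made for illustration.

```python
# Hypothetical 3x3 templates; '#' is ink, '.' is blank.
TEMPLATES = {
    "T": ("###", ".#.", ".#."),
    "L": ("#..", "#..", "###"),
}


def match_template(pattern):
    """Return the label of the template identical to `pattern`, else None."""
    for label, template in TEMPLATES.items():
        if tuple(pattern) == template:
            return label
    return None


upright_t = ("###", ".#.", ".#.")
inverted_t = tuple(reversed(upright_t))  # same shape, flipped top to bottom
```

Here `match_template(upright_t)` finds the stored T, but `match_template(inverted_t)` fails even though the shape is the same. Recognizing it would require storing a separate template for the flipped version, which is the economy problem.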

15
Modern Versions of the View-Based Approach
Tarr and Pinker (1989)
  • taught participants names for shapes; shapes were
    always presented in the same orientation during
    training
  • during a test phase, these shapes were presented
    for recognition
  • participants responded quickly if the shapes were
    presented at the same orientation as in training,
    but were successively slower as the degree of
    rotation from that original position increased,
    indicating that perception is viewpoint-dependent
  • after training on the new orientations,
    participants eventually became equally fast no
    matter what the orientation
  • results parallel the everyday recognition of
    objects
  • we start with one representation of objects
    (termed the canonical representation), and
    through experience with objects in many different
    orientations and viewed from many different
    perspectives, we develop multiple
    representations, or views of the objects
  • these multiple views serve as the templates for
    later recognition
  • orientation tends not to affect visual
    recognition under most circumstances since
    everything we must recognize has received
    extensive exposure from different views
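
The multiple-views idea can be illustrated with a toy model in which recognition time grows with the angular distance between the test orientation and the nearest stored view. The linear form, the base time, and the slope are invented for the example and are not estimates from Tarr and Pinker's data.

```python
def angular_distance(a, b):
    """Smallest rotation, in degrees, that maps orientation a onto b."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def recognition_time(test_angle, stored_views, base_ms=500.0, ms_per_degree=2.0):
    """Toy prediction: time rises with distance to the nearest stored view.

    base_ms and ms_per_degree are hypothetical parameters.
    """
    nearest = min(angular_distance(test_angle, view) for view in stored_views)
    return base_ms + ms_per_degree * nearest
```

With only the trained view stored, `recognition_time(90, [0])` is slower than `recognition_time(0, [0])`. Once the 90-degree view has also been stored through practice, `recognition_time(90, [0, 90])` falls back to the base time.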

16
Logothetis, Pauls, and Poggio (1995)
  • taught monkeys to recognize novel 3D objects from
    a variety of different perspectives, and like
    humans, they eventually became equally proficient
    at recognizing these objects given any of the
    rotations
  • neural activity associated with recognition:
    different sets of cells responded most strongly
    to certain objects, indicating that certain
    networks were devoted to certain objects
  • a given set of cells responded most strongly when
    that object appeared in the same orientation as
    it had during training; the responses decreased
    systematically with increases in the rotation
    from that perspective
  • monkeys seemed to have what might be termed
    physiological templates that were devoted to
    recognizing a specific object in a specific
    orientation

17
Object Recognition: Views or Structural
Descriptions?
  • basic question--is object recognition truly
    independent of the particular view? there is
    support for both positions
  • Tarr and Bülthoff (1995) suggest that object
    recognition should be conceived of as a continuum
  • at one end are heavily viewpoint-dependent
    mechanisms that are used for making subtle
    discriminations among similar exemplars
    (distinguishing a finch from a sparrow)
  • explains Tarr and colleagues' results--stimuli
    used required fairly subtle discriminations
  • object recognition appears to be
    viewpoint-dependent
  • at the other end are heavily viewpoint-independent
    mechanisms that are used when gross categorical
    judgments are required (distinguishing a hammer
    from a sparrow)
  • explains Biederman and colleagues'
    results--stimuli used required fairly gross
    discriminations
  • object recognition process appears to be
    viewpoint-independent

18
II. Top-down Processing in Pattern Recognition
  • A. The Word-Superiority Effect
  • Reicher (1969)
  • participants were briefly presented with letter
    strings that either did or did not form a word
    (e.g., WORK or OWRK, respectively)
  • following a rapid display of such a letter
    string, participants were queried about the
    component letters
  • two alternatives were presented and participants
    were to pick the one they had just seen
  • identification accuracy was higher if the letter
    had been presented in the context of a word,
    relative to when it had been presented in the
    context of a non-word
  • if letter identification had been based solely on
    bottom-up processing, then letter identification
    accuracy should have been equivalent in the two
    conditions
  • the data that make up the letter D do not vary
    depending on the context in which D appears; a
    D is a D

19
An Interactive Approach to Word Recognition
  • word superiority effect is the result of an
    interplay between the bottom-up processes and the
    top-down processes
  • McClelland and Rumelhart (1981) model assumes
    that words are represented in our mental
    dictionaries at three different levels: features,
    letters, and whole words
  • each type of information about a word is
    analyzed simultaneously, and information about
    the word's identity accumulates; example: the
    letter A
  • activation of the A detector will excite nodes
    representing words that have an A, but inhibit
    nodes representing words that don't have an
    A--bottom-up
  • activation of the A detector will excite nodes
    representing features that are part of the letter
    A (e.g., slanted lines), but inhibit nodes
    representing features that are not part of the
    letter A (e.g., a curved line)--top-down

20
An Interactive Approach to Word Recognition
  • when we read a word (e.g., work), info about the
    component letters leads to the activation of
    representations at the word level that include
    these letters
  • this heightened activation at the word level then
    feeds back to the letter level, enhancing
    activation of component letters of the words
    activated
  • but only the letters actually in the word (e.g.,
    work) are also receiving the bottom-up
    (feature-based) activation; therefore, evidence
    for these particular letters will accumulate the
    fastest, facilitating their identification
  • nonwords don't have representations at the word
    level, so the letters in the nonword (e.g., owrk)
    don't receive this additional top-down
    activation; identification of a letter in a
    nonword will therefore be slower
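
The feedback loop described above can be sketched as a single bottom-up/top-down cycle. This is a heavily simplified illustration, not McClelland and Rumelhart's actual model: the three-word lexicon, the `feedback` strength, and the `word_threshold` a word node must reach before it sends feedback are all assumptions, and inhibition is omitted.

```python
# Hypothetical mini-lexicon; the real model uses a much larger vocabulary.
LEXICON = ["WORK", "WORD", "FORK"]


def word_activations(string):
    """Bottom-up: each word node is activated by the share of letters it matches."""
    return {
        word: sum(a == b for a, b in zip(word, string)) / len(word)
        for word in LEXICON
    }


def letter_activation(string, position, feedback=0.5, word_threshold=0.75):
    """Activation of the letter at `position` after one feedback cycle.

    Assumptions: feature evidence contributes 1.0, and only word nodes at or
    above `word_threshold` send feedback to the letters they contain.
    """
    target = string[position]
    activation = 1.0  # bottom-up (feature-based) evidence
    for word, strength in word_activations(string).items():
        if strength >= word_threshold and word[position] == target:
            activation += feedback * strength  # top-down support
    return activation
```

In this sketch the K in WORK ends up with more activation than the K in OWRK, because WORK and FORK are active enough to feed support back to K, whereas no word node is sufficiently activated by the nonword.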

21
A Scene-Superiority Effect
  • identifying an object within a scene is
    facilitated when the object is consistent with
    the scene (e.g., refrigerator in a kitchen),
    relative to when it is inconsistent (e.g.,
    refrigerator on a farm)
  • Biederman, Mezzanotte, and Rabinowitz (1982)
  • participants saw the name of a common object,
    followed by a real-world scene, and then a mask
    that included a location cue
  • participants were to determine whether the common
    object had appeared at that location in the scene
  • detection performance was better when the object
    was consistent with the scene relative to when
    the object was inconsistent

22
Hollingworth and Henderson (1998)
  • a subtle response bias may have been at play in
    the Biederman procedure
  • participants were told what to look for in the
    scene prior to the scene's presentation; this
    expectation may have influenced the participants'
    response
  • seeing the word "sheep" might lead to an
    expectation of a farm scene
  • if that scene is indeed presented, then
    participants will have a bias to respond yes
  • when an inconsistent scene is presented (e.g., an
    office) the participants will demand a lot of
    visual evidence (i.e., bottom-up information)
    before they'll acknowledge that a sheep was in
    the office--in other words, they will have a bias
    to respond no
  • employed a procedure much like the one used by
    Reicher (1969) to investigate word-superiority
  • after presentation of a scene (e.g., farm),
    participants were given a forced choice
    recognition test in which two alternatives were
    presented
  • either both scene consistent (sheep, pig) or both
    scene inconsistent (coffee maker, mixer)
  • participants had to pick the one that was in the
    scene

23
Hollingworth and Henderson (1998)
  • predictions
  • if scene context facilitates recognition, then in
    the scene-consistent trials, participants should
    successfully choose the correct option a majority
    of the time
  • if scene superiority effect had been due to a
    guessing bias, then this procedure should
    eliminate the scene superiority effect
  • when faced with two alternatives that both fit
    with the context, neither item will have an
    advantage--they're both likely choices
  • no scene-superiority effect was found
  • participants were just as good at choosing which
    of two scene inconsistent objects had occurred as
    they were at guessing which of two scene
    consistent objects had occurred
  • authors conclude that scene context does not
    facilitate the recognition of scene components
  • the processes underlying object perception seem
    to be isolated from information about objects
    that might occur in any given scene
  • the lack of effect of contextual knowledge on
    object recognition is probably a good thing
  • if identification of objects involved consulting
    general knowledge about objects both relevant and
    irrelevant to the scene, it would get bogged
    down; top-down processing would be an unnecessary
    drag on the system
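
The response-bias argument can be made concrete with a simple guessing model. This is an assumption made for illustration, not the analysis used in either study: the observer truly detects the object with probability `p_detect`, and otherwise guesses.

```python
def yes_no_hit_rate(p_detect, p_guess_yes):
    """Yes/no task: say 'yes' when the object is detected, otherwise guess."""
    return p_detect + (1 - p_detect) * p_guess_yes


def forced_choice_accuracy(p_detect):
    """Two-alternative task with equally plausible options: guesses are 50/50."""
    return p_detect + (1 - p_detect) * 0.5


# Hypothetical numbers: identical perception (p_detect = 0.4) in both scene
# types, but a stronger tendency to guess 'yes' when the scene is consistent.
consistent_hits = yes_no_hit_rate(0.4, p_guess_yes=0.8)
inconsistent_hits = yes_no_hit_rate(0.4, p_guess_yes=0.2)
```

Even with identical perceptual sensitivity, the yes/no hit rate is higher for consistent scenes purely because of guessing. In the forced-choice version both alternatives are equally plausible, so accuracy depends only on `p_detect` and the apparent scene effect disappears.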

24
Hollingworth and Henderson (1998), cont.
  • so why does top-down information facilitate
    the recognition of letters?
  • our experience with any given natural scene is
    not as extensive as our experience with a given
    word
  • the letters that can appear in a word are much
    more constrained by context (i.e., context
    provides more information about the component
    letters of a word) than are the objects that can
    appear in a scene
  • Two-System Views of Object Recognition
  • objects are represented in two separate
    systems--a structural description system and an
    episodic system
  • the structural description system includes
    information about the global shape of an object,
    as well as the relationships among the object's
    parts
  • the episodic system encodes semantic (i.e.,
    meaning) and visual information about the
    objects, such as their identity, their function,
    and specifics of their visual presentation

25
Cooper and Shepard (1992)
  • encoding phase: participants are asked to make a
    judgment about possible and impossible objects
  • possible objects are ones whose surfaces and
    edges are configured such that they could exist
    in a 3-D world
  • impossible objects are those that could not exist
    in three dimensions
  • two encoding tasks
  • global structure task--decide whether the object
    faced right or left
  • induce participants to encode the global
    properties of the objects, and the
    interrelationships between the parts
  • meaningful properties task--name something that
    each object resembled
  • required a meaningful elaboration of the object
  • later phase: participants are given two different
    object recognition tests
  • in one test, they are asked whether they saw each
    figure earlier
  • in the other test, they are asked to rapidly
    classify a series of objects as possible or impossible
  • some of these objects had been presented earlier

26
  • dependent variable is whether participants have
    an easier time making the possible-impossible
    judgment for figures that they had seen before
    (i.e., priming)
  • benefit from having seen an object before is
    called priming
  • results
  • if participants had to recognize the object as
    one they had seen before, the most effective
    encoding was to think about it in meaningful
    terms (i.e., meaningful properties task)
  • foreshadows the levels-of-processing effect: we
    remember material better when it's processed in
    terms of its meaning rather than its physical structure
  • if participants had to rapidly classify objects
    as possible or impossible, the most effective
    encoding was to notice the global structure of
    the object (i.e., the global structure task)

27
Structural Description System
  • includes a stored representation of the overall
    structure of objects and is used as the basis for
    their rapid recognition
  • subserved performance on the speeded
    possible-impossible classification test
  • the only encoding method that primed the
    representation in this system was the global
    structure task (which induced participants to
    notice the overall structure of objects)
  • However, rapid recognition of impossible objects
    was not aided by the global structure encoding task.
  • Suggestion: the structural description system
    can't recognize impossible objects?
  • based on bottom-up processing; this system
    contains the actual data that we analyze

28
Episodic System
  • includes a stored representation of the identity
    of the object and its distinctive physical
    characteristics; this system is the basis for our
    memory of and knowledge about objects
  • subserved performance on the recognition test;
    because this system stores information about what
    objects are, recognition was most affected by the
    encoding task that required participants to say
    what each object resembled
  • based on top-down processing; this system
    includes our conceptual knowledge about the
    objects that we need to identify
  • global judgments primed representations in the
    structural description system, but only for
    possible objects; the rapid recognition of
    impossible objects showed no benefit from the
    global structure judgment
  • the structural description system is incapable of
    representing objects that could not exist in
    three dimensions
  • indicates the importance of top-down processing
    in object recognition--even the structural
    description system, which provides the data for
    bottom-up processing, is influenced by our
    previous experience

29
III. Recognizing Faces
  • The Face-Inversion Effect
  • the deleterious effect of inversion is
    disproportionately great for faces compared to
    other objects
  • to recognize objects, we need first-order
    relational information--information about the
    parts of an object, and how those parts relate to
    one another
  • first-order relational information is not enough
    to recognize faces; noticing that two eyes are
    above the nose, which is above the mouth, may be
    enough to recognize that something is a face, but
    doesn't allow for recognition of whose face it is
  • to recognize faces, we need second-order
    relational information
  • second-order relational information involves
    comparing the first order analysis to facial
    features of a typical or average face
  • this typical face is built up through experience,
    and serves as an implicit standard against which
    we compare faces that we see
  • when a face is inverted, this disrupts the
    encoding of second-order relational information;
    therefore inversion disproportionately harms face
    recognition
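
One way to picture second-order relational information is as the deviation of a face's measurements from an average face built up over experience. The sketch below is only an illustration of that idea: the measurement names and values are hypothetical, and the slides do not specify how the comparison is computed.

```python
def average_face(faces):
    """The 'typical' face: the mean of each measurement over faces seen so far."""
    keys = faces[0].keys()
    return {k: sum(face[k] for face in faces) / len(faces) for k in keys}


def second_order_code(face, norm):
    """Describe a face by how far each measurement deviates from the norm."""
    return {k: face[k] - norm[k] for k in norm}


# Hypothetical measurements in arbitrary units.
seen = [
    {"eye_spacing": 60, "nose_length": 50},
    {"eye_spacing": 64, "nose_length": 54},
]
norm = average_face(seen)
code = second_order_code(seen[0], norm)
```

Both faces share the same first-order layout (eyes above nose above mouth); what distinguishes them in this sketch is the pattern of deviations from `norm`.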

30
Diamond and Carey (1986)
  • in addition to replicating the basic inversion
    effect with human faces, they also investigated
    recognition of dog faces
  • compared dog experts with dog non-experts
  • dog experts are so experienced with dogs that
    they encode dog faces in terms of second-order
    relational properties
  • inversion should have adverse effects on the dog
    non-experts' recognition of human faces, but not
    dog faces
  • inversion should have adverse effects on dog
    experts' recognition of both human faces and dog
    faces
  • predicted results were obtained
  • Wholistic Processing of Faces
  • faces are encoded, stored, and retrieved from
    memory as whole configurations, rather than as a
    set of features or parts

31
Tanaka and Farah (1993)
  • to the degree that a given object is stored as a
    set of features, then those features ought to be
    useful cues in retrieving the remaining
    information about the object
  • if an object is stored as a whole configuration
    then presenting part of that whole will not be
    particularly helpful in recognition
  • presented participants with sketches of faces and
    sketches of houses, both decomposable in terms of
    distinct features; each face and house was given
    a label, such as "Larry's house" or "Larry's
    face"
  • on a later recognition test, participants were
    asked about the faces and houses
  • isolated-part condition--given a choice of two
    object parts, and had to pick which one of them
    had been part of an earlier-presented object
    (e.g., "Which of these is Larry's nose?" or "Which
    of these is Larry's door?")
  • whole-object condition--given a choice of two
    whole objects, and had to pick out the one they
    had seen earlier (e.g., "Which of these is Larry's
    face?" or "Which of these is Larry's house?")

32
Tanaka and Farah (1993)
  • results
  • the type of question asked didn't matter for
    recognition of houses; participants were just as
    good at recognizing parts of houses as they were
    at recognizing whole houses
  • for faces, the type of question did matter;
    participants were not as good at recognizing face
    parts as they were at recognizing whole faces
  • face recognition is similar to view-based
    approaches to pattern/object recognition

33
Dissociations and Associations in Face Recognition
  • some suggest at least two subsystems for
    recognizing letters, objects, and faces rather
    than a special mechanism for face recognition
  • alexia: deficits in the ability to recognize
    printed words
  • object agnosia: deficits in the ability to
    recognize everyday objects
  • prosopagnosia: an inability to recognize familiar
    faces
  • There are associations and dissociations among
    these three disorders
  • deficits in recognizing faces are often found in
    association with at least some deficits in
    recognizing objects
  • deficits in recognizing letters/words are often
    found in association with at least some deficits
    in recognizing objects
  • these two findings indicate that object
    recognition does share some mechanisms with both
    letter/word and face recognition

34
Dissociations and Associations in Face Recognition
  • deficits in object recognition with spared face
    and letter/word processing or the opposite
    pattern (deficits in face and letter/word
    processing with spared object recognition) are
    rarely found
  • the fact that face recognition and letter/word
    recognition are almost never spared together or
    impaired together indicates that face recognition
    and letter/word recognition rely on quite
    distinct mechanisms
  • the fact that problems in object recognition
    quite often line up with problems in either face
    or letter recognition indicates that object
    recognition relies on some of the mechanisms
    required for each
  • visual recognition involves two primary
    mechanisms
  • one mechanism used for representation/combination
    of parts
  • important for letter/word recognition; not
    important for face recognition
  • second mechanism used for
    representation/combination of complex wholes
  • important for face recognition; not important for
    letter/word recognition
  • a combination of both is important for object
    recognition
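
The two-mechanism account can be sketched by giving each task a reliance on a parts mechanism and a wholes mechanism, then lowering one mechanism's capacity to mimic damage. The specific weights and capacities below are hypothetical, chosen only to show the predicted pattern of associations and dissociations.

```python
# Hypothetical reliance of each task on the two mechanisms (weights sum to 1).
RELIANCE = {
    "words": {"parts": 1.0, "wholes": 0.0},
    "objects": {"parts": 0.5, "wholes": 0.5},
    "faces": {"parts": 0.0, "wholes": 1.0},
}


def performance(capacity):
    """Expected performance per task given each mechanism's capacity (0 to 1)."""
    return {
        task: sum(weight * capacity[mech] for mech, weight in weights.items())
        for task, weights in RELIANCE.items()
    }


# Simulated damage to the parts mechanism only.
parts_damaged = performance({"parts": 0.2, "wholes": 1.0})
# Simulated damage to the wholes mechanism only.
wholes_damaged = performance({"parts": 1.0, "wholes": 0.2})
```

Damaging the parts mechanism impairs word recognition most, object recognition partly, and leaves faces intact; damaging the wholes mechanism gives the mirror image. Under these assumptions, no single lesion impairs objects while sparing both words and faces, which fits the rarity of that pattern.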