Title: NOTES FOR EXAM 3 BEGIN HERE Chapter 10: Conditioned Reinforcement
1. NOTES FOR EXAM 3 BEGIN HERE. Chapter 10: Conditioned Reinforcement
2. A scenario
- Imagine you are lost.
- You finally stumble upon a landmark that is familiar to you.
- You become happy because you know how to get home from this spot.
- This spot is both a CS that elicits happiness and an SD for the behavior of getting home.
- There is also a THIRD function of this stimulus:
- It has also served as a reinforcer for the stumbling-around behavior that led you to it.
- In fact, if we consider any series of linked behaviors (like following directions or recipes, etc.), the consequence of completing each step is both a reinforcer for completing that step and an SD for completing the NEXT step.
3. Conditioned Reinforcement
- Conditioned reinforcement occurs when behavior is strengthened by consequences that are effective because of a learning history.
- The critical aspect of this history is a pairing between an arbitrary event and an already established reinforcer.
- Once the arbitrary event increases the frequency of an operant behavior, it is called a conditioned reinforcer.
4. Chain Schedules and Conditioned Reinforcement
- One way to investigate conditioned reinforcement is to construct sequences of behavior.
- A chain schedule of reinforcement involves two or more simple schedules (CRF, FI, VI, FR, etc.), each of which is presented sequentially and is signaled by an arbitrary stimulus (each has its own SD).
- Only the final, or terminal, link in the chain results in primary reinforcement.
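A minimal sketch of the chain schedule just described, assuming every link is a fixed-ratio (FR) schedule to keep the logic short. The `Link` class, the `run_chain` function, and the light colors used as SDs are hypothetical names chosen for illustration; they do not come from the chapter.

```python
from dataclasses import dataclass


@dataclass
class Link:
    sd: str      # arbitrary stimulus that signals this link (hypothetical label)
    ratio: int   # fixed-ratio requirement for this link (assumption: FR links only)


def run_chain(links, total_responses):
    """Step through a chain of FR links and log what each completed link produces."""
    events = []
    index, count = 0, 0
    for _ in range(total_responses):
        count += 1
        if count < links[index].ratio:
            continue  # requirement for the current link not yet met
        count = 0
        if index == len(links) - 1:
            # Only the terminal link produces primary reinforcement.
            events.append("primary reinforcer")
            index = 0
        else:
            # Completing an earlier link only changes the SD to the next link's.
            index += 1
            events.append(f"SD change -> {links[index].sd}")
    return events


chain = [Link("red light", 3), Link("green light", 2)]
log = run_chain(chain, 10)
print(log)
```

In this sketch the change to the next SD is the only consequence of finishing the first link, which is the event that is a candidate conditioned reinforcer.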
5. Multiple Stimulus Functions
- An unsignaled chain (or tandem schedule) is a sequence of two schedules (such as FR 150 followed by FI 120 seconds) in which distinct SDs do not signal the different components.
- In equivalent tandem vs. chain schedules, performance will be BETTER on the chain than on the tandem.
- This shows that distinct signals serve as both SDs and conditioned reinforcers.
6. Homogeneous and Heterogeneous Chains
- Operant chains are classified as homogeneous when the topography or form of the response is similar in each component, i.e., a similar response requirement is in effect in all components.
- A heterogeneous chain requires a different response in each link.
7. Teaching a Backwards Chain
- For complex tasks with many steps, it is often better to teach the final step FIRST and reinforce its completion.
- After this final unit has been practiced many times and its completion reinforced many times, ACCESS to this unit of SD - R - SR will serve as an effective conditioned reinforcer for the second-to-last unit in the chain of behavior.
8. Teaching a Backwards Chain
- After the second-to-last and final units have been practiced many times, ACCESS to the SECOND-TO-LAST unit of SD - R - SR will serve as an effective conditioned reinforcer for the THIRD-to-last unit in the chain of behavior.
- And so on!
- Note that we are not doing the behavior in reverse! We are simply completing the final step first in our teaching procedure.
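The teaching order can be sketched as a short function. This is only an illustration under assumptions: the function name `backward_chaining_plan` and the tea-making steps are hypothetical. Each training stage adds one earlier step, but the learner always performs the steps in forward order.

```python
def backward_chaining_plan(steps):
    """Return the training stages for a backward chain.

    Each stage is the list of steps the learner performs, in forward order.
    Stage 1 is the final step alone; each later stage adds the preceding step.
    """
    stages = []
    for start in range(len(steps) - 1, -1, -1):
        stages.append(steps[start:])
    return stages


# Hypothetical three-step task used only for illustration.
steps = ["fill kettle", "boil water", "pour over tea bag"]
plan = backward_chaining_plan(steps)
for number, stage in enumerate(plan, start=1):
    print(number, stage)
```

No stage ever runs the steps in reverse; only the order in which steps are introduced into training is backward.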
9. Determinants of Conditioned Reinforcement Strength
- Frequency of primary reinforcement paired with the conditioned reinforcer
- Variability of primary reinforcement paired with the conditioned reinforcer
- Establishing operations
- Delay to primary reinforcement
10. Delay Reduction and Conditioned Reinforcement
- Delay-reduction hypothesis:
- Stimuli closer in time to positive reinforcement, or further in time from an aversive event, are more effective conditioned reinforcers.
- Stimuli that signal no reduction in time to reinforcement (SΔ) or no period of safety from an aversive event (S^ave) do not function as conditioned reinforcers.
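A small worked sketch of the delay-reduction idea for positive reinforcement. It assumes that delay reduction is computed as the average delay to reinforcement minus the delay remaining once the stimulus appears; that subtraction is a common way to express the idea, not a formula given on these slides. The function names and the numbers are hypothetical.

```python
def delay_reduction(average_delay, delay_after_stimulus):
    """Seconds by which stimulus onset reduces the expected wait for reinforcement.

    Assumption: reduction = average delay overall - delay remaining after onset.
    """
    return average_delay - delay_after_stimulus


def may_function_as_conditioned_reinforcer(average_delay, delay_after_stimulus):
    """A stimulus that signals no reduction in delay should not reinforce."""
    return delay_reduction(average_delay, delay_after_stimulus) > 0


# Hypothetical values: reinforcement arrives every 60 s on average.
print(delay_reduction(60, 20))                         # stimulus 20 s before food
print(may_function_as_conditioned_reinforcer(60, 60))  # stimulus signals no reduction
```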
11. Concurrent-Chain Schedules of Reinforcement
- Previously we talked about choice where the organism is free to switch back and forth between different response alternatives (called CONCURRENT SCHEDULES OF REINFORCEMENT).
- But often in the real world, once you choose one response alternative, you lock out the opportunity to do some other behavior for a period of time.
- That is, choosing one response COMMITS you to that particular response for at least some period of time.
12. Concurrent-Chain Schedules of Reinforcement
- How would we study such an idea in the lab?
- We could ask which a person prefers: working on an FR 10 or a VI 60 s, each for some set period of time.
- This is a CONCURRENT-CHAIN SCHEDULE.
- It involves two different components (an initial LINK, or menu, and a terminal LINK).
13. Concurrent-Chain Schedules of Reinforcement
- The subject is given a "menu" in which it must press a particular key to TURN ON a particular schedule of reinforcement.
- No reinforcer is given for making the initial-link choice itself, and the subject is given immediate access to whichever reinforcement schedule it chose.
- The subject must stay on that schedule for some specified time.
- Then it can make a choice again.
- What is our measure of choice in a concurrent-chain schedule?
- The proportion of times the subject chooses one schedule over another.
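The measure of choice above is a simple proportion of initial-link choices. A minimal sketch follows; the function name `choice_proportion` and the recorded choices are made up for illustration.

```python
def choice_proportion(initial_link_choices, schedule):
    """Proportion of initial-link choices that selected the given terminal schedule."""
    if not initial_link_choices:
        raise ValueError("no choices recorded")
    return initial_link_choices.count(schedule) / len(initial_link_choices)


# Hypothetical record of four initial-link choices.
choices = ["FR10", "VI60s", "FR10", "FR10"]
print(choice_proportion(choices, "FR10"))
```

A proportion above 0.5 for one schedule indicates a preference for that terminal link in this two-alternative case.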
14. Concurrent-Chain Schedules of Reinforcement
- IF we add a delay before access to a terminal link, however, the subject is LESS likely to choose the corresponding initial link because there is now an increased delay to reinforcement.
- An example is a two-key concurrent-chain procedure with equivalent initial links but different lengths of delay to reach the terminal links.
15. Generalized Conditioned Reinforcement
- A generalized conditioned reinforcer is any event or stimulus paired with, or exchangeable for, many sources of primary reinforcement.
- Generalized reinforcement does not depend on deprivation or satiation for any specific reinforcer.
- Generalized social reinforcement for human behavior: approval, attention, affection, praise.
16. Tokens, Money and Generalized Reinforcement
- Other conditioned reinforcers are economic, since they are exchangeable for goods and services. Probably the most important such reinforcer is money.
- A token economy is a set of contingencies based on token reinforcement; the contingencies specify when, and under what conditions, particular forms of behavior are reinforced with tokens. Tokens are exchangeable for a variety of backup reinforcers.
17. Chapter 11: Correspondence Relations: Imitation and Rule-Governed Behavior
18. Correspondence Relations
- People often do what others do. A child who observes an older sibling raid the cookie jar may engage in similar behavior.
- This is a correspondence between the modeled behavior and the replicated behavior.
- Technically, the behavior of one person sets the occasion for (is an SD for) an equivalent response by the other.
19. Correspondence Relations (Continued)
- There are other correspondence relations established by our culture. We often receive reinforcement if there is a correspondence between saying and doing.
- A large part of socialization involves reinforcement for correspondence between what is said and what is done.
20. Correspondence Relations (Continued)
- Other people reinforce our behavior if there is consistency (correspondence) between spoken words and later performance.
- A minister who preaches moral conduct and lives a moral life is valued. When moral words and moral deeds do not match, people become upset and act to correct the inconsistency. (They deliver punishment!)
21. Imitation
- Learning by observation involves doing what others do.
- The behavior of an observer or learner is regulated by the actions of a model.
- Imitation requires that the learner emit a response that could only occur by observing a model emit a similar response.
22. Spontaneous Imitation
- Innate or spontaneous imitation is based on evolution and natural selection rather than learning experiences.
- This implies that imitation of others may be an important adaptive behavior.
23. Immediate vs. Delayed Imitation
- Imitation may occur only when the model is present, or it may be delayed for some time after the model has been removed.
- Delayed imitation is more complex, since it involves remembering the modeled stimulus (SD) rather than direct stimulus control.
24. Operant and Generalized Imitation
- It is possible to teach imitation as an operant behavior:
- the discriminative stimulus is the behavior of the model (SDmodel),
- the operant is a response that matches the modeled stimulus (Rmatch), and the reinforcement is verbal praise (Srsocial).
- Matching the model is reinforced, while non-correspondent responses are extinguished.
25. Operant and Generalized Imitation
- If imitation is reinforced and nonimitation is extinguished, imitation of the model will increase.
- On the other hand, nonimitation will occur if imitation is extinguished and nonimitation is reinforced.
- The learner learns to do as the model does, regardless of the form of the model's behavior!
26. Operant and Generalized Imitation
- Donald Baer and his associates provided a behavior analysis of imitation called generalized imitation.
- It involves several modeled stimuli (SDs) and multiple operants (Rmatch).
- In each case, what the model does sets the occasion for reinforcement of a similar response by the child; all other responses are extinguished.
- This training results in a stimulus class of models and an imitative response class. The child now imitates whichever response the model performs.
27. Generalized Imitation
- The next step is to test for generalization of the stimulus and response classes.
- Baer and Sherman (1964) showed that a newly modeled stimulus would set the occasion for a novel imitative response, without any further reinforcement.
- Generalized imitation accounts for the appearance of novel imitative acts in children, even when these specific responses were never reinforced.
28. Rules, Observational Learning, and Self-Efficacy
- For Skinner, following rules is behavior under the control of verbal stimuli (SDs).
- That is, statements of rules, advice, maxims, or laws are discriminative stimuli that set the occasion for behavior.
- Rules, as verbal descriptions, may affect observational learning.
29. Rule-Governed Behavior
- A large part of human behavior is regulated by verbal stimuli.
- The common property of these kinds of stimuli is that they describe the operating contingencies of reinforcement.
- Formally, rules, instructions, advice, and laws are contingency-specifying stimuli (they describe the SD : R → Sr relations of everyday life).
- The term rule-governed behavior is used when the listener's (reader's) performance is regulated by contingency-specifying stimuli.
30. Rule-Governed and Contingency-Shaped Behavior
- People are said to solve problems either by discovery or by instruction.
- From a behavioral perspective, the difference is between the direct effects of contingencies (discovery) and the indirect effects of rules (instruction).
- When performance is attributed to direct exposure to reinforcement contingencies, behavior is said to be contingency-shaped.
- As previously noted, performance set up by constructing and following instructions (and other verbal stimuli) is termed rule-governed behavior.
31. Rule-Governed and Contingency-Shaped Behavior
- The importance of reinforcement contingencies in establishing and maintaining rule-following is clearly seen with ineffective rules and instructions.
- When rules describe delayed and improbable events, it is necessary to find other reasons to follow them.
32. Instructions and Contingencies
- In his discussion of rule-governed and contingency-shaped behavior, Skinner (1969) speculated that instructions may affect performance differently than the actual contingencies of reinforcement.
- One way to test this idea is to expose humans to reinforcement procedures that are accurately or inaccurately described by the experimenter's instructions.
- If behavior varies with the instructions while the actual contingencies remain the same, this would be evidence for Skinner's assertion.
33. Instructions and Contingencies
- Instructions are complex discriminative stimuli.
- Instructional control is a form of rule-governed
behavior.
34. Chapter 12: Verbal Behavior
35. Language and Verbal Behavior
- In contrast with the term language, verbal behavior deals with the performance of a speaker and the environmental conditions that establish and maintain such performance.
- Verbal behavior refers to the vocal, written and gestural performance of a speaker, writer or communicator. This behavior operates on the listener, reader or observer, who then arranges reinforcement of the verbal performance.
36. Speaking, Listening and the Verbal Community
- Verbal behavior refers to the behavior of the speaker, writer or gesturer.
- The verbal community: the practices and customary ways in which a given culture reinforces the behavior of a speaker.
37. Operant Functions of Verbal Behavior: Mands
- A mand is a response class of verbal operants whose form (what is said or written) is regulated by specific establishing operations (deprivation, satiation, etc.).
- In lay terms, mands involve asking for something you need to happen.
- It is commonly said that a mand specifies its own reinforcer, as in "Give me a cookie," but such commands are only a small part of mands.
38. Operant Functions of Verbal Behavior: Tacts
- A tact is a response class of verbal operants whose form (what is said or written) is regulated by specific nonverbal discriminative stimuli.
- "Tact" is derived from "contact," in that tacts are verbal operants that make contact with the environment.
- In lay terms, tacts involve pointing something out, commenting about something, or labeling or identifying something.
39. Does the form of the Verbal Behavior identify the type? NOPE
- Behavior: "Honey, you sure look sexy tonight!"
- Is this a tact or a mand?
- Identifying the type of verbal behavior depends on the FUNCTION of the behavior!
- What function does this statement have?
40. Training Verbal Operants: Mands
- To teach manding, the most direct procedure is to manipulate an establishing operation (remove the toy) and then reinforce the verbal response ("Can I have the toy?") with the specified consequence (guess what it is!).
- This is sometimes called teaching requesting.
41. Training Verbal Operants: Tacts
- To teach tacting, a speaker must emit a verbal operant whose form (what is said) is a function of a nonverbal discriminative stimulus; reinforcement is non-specific to that stimulus.
- A child comes home from preschool and, on seeing her mother, says, "Let me tell you what I learned today," and then names several parts of the body and points to where they are. These would be tacts that would likely be reinforced by praise and hugs from the proud parent. (Mother may need to PROMPT that tacting by asking the child, "What did you do in school today?")
42. Additional Verbal Relations: Intraverbals
- An intraverbal is a verbal operant (what the listener says) controlled by a verbal discriminative stimulus (what the speaker says), but there is no one-to-one relation between the intraverbal and its SD.
- If you overhear me saying "I'll be damned!" and you covertly reply "I sure hope so," your response is an intraverbal.
- Teaching a child the ABCs: you say "ABCDEFG" and the child says "HIJK-ellamennopee."
- Free association therapy demonstrates this when the therapist says "Mother" and you say "dominatrix" (haha!).
43. Additional Verbal Relations: Echoics
- An echoic is a verbal operant in response to a verbal SD, but with a point-to-point correspondence between the SD and the operant. If you swear after hitting your thumb with a hammer ("Damn!") and your four-year-old son subsequently repeats your expletive, his response is an echoic.
44. Additional Verbal Relations: Textuals
- A textual is a verbal operant in which the verbal SD (written or spoken words made by another) and the response the listener makes correspond to each other, but not with a formal PHYSICAL similarity.
- In lay terms, you are READING aloud (or to yourself) or TAKING NOTES.
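The five verbal operants in these notes can be told apart by what controls the response, not by its form. The sketch below encodes that decision as a deliberately simplified lookup. The function name, the string labels, and the two boolean flags are hypothetical. Treating echoics as having formal similarity to the SD comes from the standard taxonomy, not from these slides.

```python
def classify_verbal_operant(controlled_by, point_to_point=False, formal_similarity=False):
    """Classify a verbal operant by its controlling variable (simplified sketch).

    controlled_by: "establishing operation", "nonverbal SD", or "verbal SD".
    point_to_point: response corresponds part-by-part to the verbal SD.
    formal_similarity: response physically resembles the SD (e.g., spoken to spoken).
    """
    if controlled_by == "establishing operation":
        return "mand"
    if controlled_by == "nonverbal SD":
        return "tact"
    if controlled_by == "verbal SD":
        if not point_to_point:
            return "intraverbal"
        return "echoic" if formal_similarity else "textual"
    raise ValueError(f"unknown controlling variable: {controlled_by}")


print(classify_verbal_operant("establishing operation"))  # asking for the toy
print(classify_verbal_operant("verbal SD", point_to_point=True, formal_similarity=True))
```

The same words can land in different categories depending on the arguments, which is the point of asking about function, not form.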
45. Symbolic Behavior and Stimulus Equivalence
- Stimulus equivalence occurs when presentation of one class of stimuli occasions responses made to other stimulus classes.
- Example: Most Americans will have a specific response to the written or spoken word, or image, of Osama Bin Laden.
- The word in any recognizable form or medium, or the image of the person, whether in cartoon caricature, photograph or video footage, will occasion the same response.
- Stimulus equivalence is said to exist when reflexivity, symmetry and transitivity can be shown to be in effect between distinct stimuli.
46. Basic Equivalence Relations
- Reflexivity (also referred to as identity matching or matching to sample): a picture of Bin Laden is matched up with an identical picture of Bin Laden (A = A).
- Symmetry: stimulus A is interchangeable with stimulus B, or A = B and B = A; a picture of Bin Laden is matched up with the phrase "head of Al Qaeda," and vice versa.
- Transitivity: consists of showing that stimulus A = B and stimulus B = C; if the learner then responds to A as interchangeable with or equivalent to C, transitivity is in effect between A, B and C. If stimulus A (a picture of Bin Laden) is equivalent to stimulus B ("head of Al Qaeda") and B is equivalent to the written words OSAMA BIN LADEN (stimulus C), then transitivity is shown when the picture of Bin Laden (stimulus A) is matched up with the written words OSAMA BIN LADEN (stimulus C).
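The three properties can be checked mechanically if the learner's matching performance is recorded as a set of (sample, comparison) pairs. This is a sketch under that assumption; the function names and the use of plain letters for stimuli A, B and C are illustrative only.

```python
from itertools import product


def is_reflexive(stimuli, relation):
    """Every stimulus is matched to itself (A = A)."""
    return all((a, a) in relation for a in stimuli)


def is_symmetric(relation):
    """Whenever A is matched to B, B is also matched to A."""
    return all((b, a) in relation for (a, b) in relation)


def is_transitive(relation):
    """Whenever A is matched to B and B to C, A is also matched to C."""
    return all(
        (a, d) in relation
        for (a, b), (c, d) in product(relation, repeat=2)
        if b == c
    )


def is_equivalence(stimuli, relation):
    return is_reflexive(stimuli, relation) and is_symmetric(relation) and is_transitive(relation)


stimuli = {"A", "B", "C"}
trained = {("A", "B"), ("B", "C")}                   # only the directly trained matches
full = {(x, y) for x in stimuli for y in stimuli}    # all matches shown by the learner
print(is_equivalence(stimuli, trained))
print(is_equivalence(stimuli, full))
```

The trained pairs alone do not satisfy the three properties. Equivalence is demonstrated only when the untrained matches (such as B to A and A to C) also appear in the learner's performance.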