Title: Chapter 4 Reinforcement and Extinction of Operant Behavior
1Chapter 4 Reinforcement and Extinction of Operant
Behavior
Prepared by Brady J. Phelps, South Dakota State
University
2Two Types of Behavior
- Classical Conditioning
- Respondents: reflexive, elicited behavior
- CS controls behavior
- Operant Conditioning
- Operants: voluntary, emitted behavior
- Outcomes control behavior
3Operant Behavior
- Operant behavior that is followed by reinforcing
consequences is selected in the sense that it
increases in frequency.
- Behavior that is not followed by reinforcing
consequences decreases in frequency.
- This is known as operant conditioning.
- Operant conditioning, as a process, has evolved
over species history and is based on genetic
endowment.
- That is, operant (and respondent) conditioning as
a general behavior-change process is based on
phylogeny.
4Operant Behavior
- From a scientific perspective, operant behavior
is lawful and may be analyzed in terms of its
relationship to environmental events.
- Formally, responses that produce a change in the
environment are called operants.
- The term operant comes from the verb to operate
and refers to behavior that operates on the
environment to produce a consequence.
- A positive reinforcer is defined as any
consequence that increases the probability of the
operant that produced it.
- Topography
- Operant class
5Why rats and pigeons?
Rats and pigeons (and humans) have been common
research subjects for operant conditioning. Rats
and pigeons have been used for a number of
reasons. Consider the selection of a basic
response such as a lever press or a key peck:
while these responses are simple, they are
adequate to illustrate the generality of basic
operant processes. In other words, the specific
subject or response isn't as important as the
principles being demonstrated. Rats and pigeons
can be considered technophilic species, as they
readily adapt to human apparatus and habitations.
Just as these species readily move into our
garages, barns, attics, or basements, they live
well in cages. By the way, operant researchers
have used subjects ranging from insects to
cephalopods to primates.
6Old Faithful
8This slide and the prior slide show Dr. Robert
Allen and, here, his student Shannon Lieb, with
subject 28, at Lafayette College. The next slide
is a close-up of one of Dr. Allen's pigeon
subjects responding in an operant chamber.
11RoboRat, ready for search and rescue
12Discriminative Stimuli
- Operant behavior is said to be emitted in the
sense that it often occurs without an observable
stimulus preceding it. This is in contrast to
reflexive responses, which are elicited by a
preceding stimulus.
- Stimuli may also precede operant behavior.
However, these events do not force the occurrence
of the response that follows them.
- An event that precedes an operant and alters its
likelihood is said to set the occasion for
behavior and is called a discriminative stimulus,
or SD.
13Discriminative Stimuli
- The consequences that follow operant behavior
establish the control exerted by discriminative
stimuli. When an SD is followed by an operant
that produces positive reinforcement, the operant
is more likely to occur the next time the
stimulus is present.
- Differential reinforcement
- When an operant does not produce reinforcement,
the stimulus that precedes the response is called
an S-delta (SΔ). In the presence of an S-delta,
the probability of emitting an operant declines.
- An SD (or an SΔ) is not defined by its
physical parameters but rather by preceding and
altering the probability of responses.
14Emitted versus occasioned
Operants can and do occur in the absence of any
eliciting stimulus; they are said to be freely
emitted. However, when an SD comes to control
occurrences of an operant, altering its
probability of occurring, it is said that
the SD occasions the operant. The term occasion
indicates that the operant is under the control
of an antecedent stimulus. Occasion as a
verb can be defined as creating a situation in
which something (in this case, an operant) is
especially likely to occur.
15Contingencies of Reinforcement
- A contingency of reinforcement defines the
relationship between the events that set the
occasion for behavior, the operant class, and the
consequences that follow this behavior.
16Four Basic Contingencies
- Positive Reinforcement
- Positive reinforcement is one of the four basic
contingencies of operant behavior.
- Positive reinforcement occurs when a stimulus
follows behavior and, as a result, the rate of
that behavior increases.
- Positively reinforcing events usually include
consequences such as food, praise, and money.
These events, however, cannot be said to be
positive reinforcers until they have been shown
to increase behavior.
17Four Basic Contingencies
- Negative Reinforcement
- When a response results in the removal of an
event, and this procedure increases the rate of
that response, the contingency is called negative
reinforcement.
18Four Basic Contingencies
- Positive Punishment
- When responses produce an event and the rate of
behavior decreases, the contingency is called
positive punishment.
19Four Basic Contingencies
- Negative Punishment
- Punishment can also be arranged by removing
stimuli contingent on behavior. This contingency
is called negative punishment.
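The four contingencies above can be summarized as a two-by-two lookup: what the response does to the event (presents it or removes it) and what happens to the rate of behavior (increases or decreases). The sketch below is only an illustration of that table; the function name and the string labels are assumptions made for this example, not terms from the chapter.

```python
def classify_contingency(stimulus_change: str, rate_change: str) -> str:
    """Name the basic contingency (illustrative labels, not from the chapter).

    stimulus_change: "presented" if the response produces the event,
                     "removed" if the response removes it.
    rate_change:     "increases" or "decreases", the observed effect
                     on the rate of the response.
    """
    table = {
        ("presented", "increases"): "positive reinforcement",
        ("removed", "increases"): "negative reinforcement",
        ("presented", "decreases"): "positive punishment",
        ("removed", "decreases"): "negative punishment",
    }
    key = (stimulus_change, rate_change)
    if key not in table:
        raise ValueError(f"unknown combination: {key}")
    return table[key]


# Example: a response removes an event and the response rate goes up.
print(classify_contingency("removed", "increases"))
```

Note that the classification depends on the observed change in behavior, which matches the point that an event is not a reinforcer until it has been shown to increase behavior.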
20Reinforcement, Intrinsic Motivation, and
Creativity
- Many educators and social psychologists argue
that rewards/reinforcement reduce individual
self-determination, motivation, and creativity
- Rewards/reinforcement are interpreted as
controlling, which leads to a reduction in the
behaviors listed above
- Reinforcement reduces a person's intrinsic
motivation, the critics argue
21Rewards and Intrinsic Motivation
- Opponents of rewards often cite experimental data
showing rewards as having negative effects
- Results are not consistent: some studies find
negative effects, some find positive effects
- Tangible rewards given for meeting a certain
criterion level of performance or exceeding the
performance of others actually maintained or
enhanced intrinsic interest
- Eisenberger and Cameron (1996): no inherent
negative property of rewards
22- Eisenberger and Cameron's research suggests ways
to approach the use of rewards
- Verbal rewards increase people's performance and
interest on a task
- Tangible rewards produced a slight decrease in
intrinsic motivation when rewards were given
simply for doing an activity, regardless of
quality of performance
- In general, rewards increase intrinsic interest
and task enjoyment
- The view that rewards undermine people's
intrinsic motivation is an overgeneralization;
rewards tied to level of performance can increase
intrinsic motivation
- When used correctly, rewards have positive
effects on creativity and intrinsic motivation
- Rewards tied to level or quality of performance
increase intrinsic motivation or leave intrinsic
interest unaffected
23How does one identify a reinforcing stimulus?
- Test it.
- If given contingent upon a behavior, what effect
is observed upon behavior?
- Another way is to rely on the Premack principle
24Premack's Principle
- A higher frequency behavior will function as
reinforcement for a lower frequency behavior
- Premack suggests it is possible to describe
reinforcing events as actions of the organism
rather than as discrete stimuli
- Reinforcement relativity
25Premack and Punishment
- Less frequent behaviors can function as
punishment for more frequent behaviors
- Often termed a reciprocal contingency
- Use in applied behavior analysis
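The Premack relations described on the last two slides can be sketched as a comparison of baseline frequencies: access to a higher frequency behavior is predicted to reinforce a lower frequency one, and a forced lower frequency behavior is predicted to punish a higher frequency one. The function name, the behaviors, and the baseline numbers below are hypothetical and serve only to illustrate the comparison.

```python
def premack_prediction(contingent_freq: float, target_freq: float) -> str:
    """Predict how a contingent behavior affects a target behavior.

    contingent_freq: baseline frequency of the behavior made contingent
                     on the target behavior.
    target_freq:     baseline frequency of the target behavior.
    """
    if contingent_freq > target_freq:
        return "reinforcement"   # higher frequency follows lower frequency
    if contingent_freq < target_freq:
        return "punishment"      # lower frequency follows higher frequency
    return "no prediction"       # equal baselines: the principle is silent


# Hypothetical baseline frequencies (responses per session).
baseline = {"wheel running": 30, "drinking": 10}

# Running made contingent on drinking: predicted to reinforce drinking.
print(premack_prediction(baseline["wheel running"], baseline["drinking"]))
# Drinking made contingent on running: predicted to punish running.
print(premack_prediction(baseline["drinking"], baseline["wheel running"]))
```

Because the prediction rests only on relative frequencies, the same activity can be a reinforcer in one pairing and a punisher in another, which is the sense of reinforcement relativity.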
26Response Deprivation
- In a free choice setting, behaviors occur at
different frequencies, yielding a response
hierarchy. Higher frequency behaviors are placed
higher in such a hierarchy than lower frequency
behaviors
- Depriving an animal of the opportunity to engage
in a given behavior changes the response
frequencies and the hierarchy.
27Response Deprivation and Equilibrium
- Deprivation leads to a reordering of the
hierarchy and determines which behaviors will
function as reinforcement at any given time.
- Rats, humans and others will work to gain access
to deprived activities.
- Instrumental responses
- Contingent responses
28Operant Conditioning
- Operant conditioning refers to an increase or
decrease in operant behavior as a function of a
contingency of reinforcement.
- Latency is important in operant conditioning.
For example, in an experiment, the time from
closing a trap door until a cat manages to get it
open is a period known as latency.
- Thorndike's law of effect came out of measures of
latency. The law of effect says that operants
that produce positive reinforcers increase in
frequency.
29Thorndike: The Law of Effect
31E. L. Thorndike studied cats in puzzle boxes.
A hungry cat had to learn to press a lever to get
out of the box and get food.
32Cats gradually learned to make the correct
response (by accident at first). Lever pressing,
initially a low probability behavior, became a
high probability behavior, while initially high
probability behaviors became low probability.
33Their response latency also became shorter.
34Operant Conditioning
- Rate of Response as a Measure of Response
Strength
- Skinner suggested that rate of response
should be the basic datum (or measure) for
operant analysis. This is because rate of
response is an index of the probability that an
operant will occur in the future. Operant rate
provides a direct measure of the selection of
behavior by its consequences.
35Operant conditioning of the neuron
- Skinner's speculation in 1953
- In vitro reinforcement of individual neurons
- Dopamine as reinforcement contingent upon bursts
of firings
- Contingent delivery of dopamine versus
noncontingent
- Dopamine and cannabinoids versus glutamate
36Procedures in Operant Conditioning
- Operant rate: probability of response
- In the free operant method, an animal may respond
over an extensive period of time. It is free to
emit many responses or none at all. Bar pressing
as an operant can occur rapidly or slowly
- This allows the researcher to observe changes in
the rate of response, which is important as a
measure of response probability.
37The Operant Chamber
- Operant chamber, also called Skinner box
- Allows continuous behavioral measure
- Measured by cumulative recorder
- Main dependent variable is response rate
- Now can measure maintenance of response, not
just learning of it
38Understanding Cumulative Records
- Plots responses as they occur moment to moment
- A pen records time horizontally. Each response
moves the pen vertically
- Reinforcers are marked
- Slope of the record indicates response rate
- Steep: high
- Flat: low
39Cumulative record figure. Labels: time moves
(horizontal), response moves (vertical); no
response, low rate, high rate.
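A cumulative record can be sketched as a running count of responses plotted against time, with the slope over any interval giving the response rate. The example below is a minimal illustration, assuming responses are logged as timestamps in seconds; the function names and the sample session are hypothetical.

```python
from typing import List, Tuple


def cumulative_record(response_times: List[float]) -> List[Tuple[float, int]]:
    """Return (time, cumulative responses) points, one step per response."""
    return [(t, count)
            for count, t in enumerate(sorted(response_times), start=1)]


def response_rate(response_times: List[float],
                  start: float, end: float) -> float:
    """Slope of the record over [start, end): responses per unit time."""
    if end <= start:
        raise ValueError("end must be greater than start")
    n = sum(1 for t in response_times if start <= t < end)
    return n / (end - start)


# Hypothetical session: fast responding early, slow later, then none.
times = [1, 2, 3, 4, 5, 20, 40]
print(cumulative_record(times))
print(response_rate(times, 0, 10))   # steep segment: high rate
print(response_rate(times, 10, 50))  # shallow segment: low rate
print(response_rate(times, 50, 60))  # flat segment: no responding
```

A steep segment therefore corresponds to a high rate, a shallow segment to a low rate, and a flat segment to no responding.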
40Procedures in Operant Conditioning
- Deprivation
- Because the delivery of food is used as
reinforcement, an animal must be motivated to
obtain food. An objective and quantifiable
measure of motivation for food is percentage of
free-feeding body weight.
- The procedure of restricting access to food
(the potentially reinforcing stimulus) is called
a deprivation operation.
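Percentage of free-feeding body weight is simple arithmetic: current weight divided by the weight the animal held with unrestricted access to food, times 100. The sketch below assumes weights in grams; the function name and the example weights are hypothetical, and no particular target percentage is implied.

```python
def percent_free_feeding_weight(current_weight_g: float,
                                free_feeding_weight_g: float) -> float:
    """Current body weight as a percentage of free-feeding weight."""
    if free_feeding_weight_g <= 0:
        raise ValueError("free-feeding weight must be positive")
    return 100.0 * current_weight_g / free_feeding_weight_g


# Hypothetical rat: 400 g with free access to food, 340 g under restriction.
print(percent_free_feeding_weight(340, 400))
```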
41Magazine Training
- After deprivation for food is established,
magazine training begins.
- When the feeder releases a pellet of food, a
click sound is produced.
- The sound of the feeder is associated with food
pellets and becomes a conditioned reinforcer.
- After training, the animal is observed to move
toward the magazine when the feeder is activated,
as signaled by the click.
42Procedures in Operant Conditioning
- Operant Level
- Rats emit many exploratory and manipulative
responses and as a result may press the lever in
an operant chamber at some low frequency, even
when this behavior is not reinforced with food.
This baseline rate of response is called the
operant level. If food were to then be made
contingent on lever pressing, lever pressing
would be acquired as a high probability behavior
and increase in frequency above its operant level.
43Procedures in Operant Conditioning
- The Method of Successive Approximation, aka
Shaping
- The previous example described the rat's
behavioral repertoire and an alteration of the
repertoire. The animal's repertoire refers to
the behavior it is capable of naturally emitting
on the basis of species and environmental
history.
- Many novel forms of behavior may be shaped by
the method of successive approximation. Shaping
is a key part of acquisition and necessarily
involves differential reinforcement.
44Shaping the lever press response (responses are
shaped, not rats)
- Extinguish any UR to the chamber
- Watch what the rat does
- If you are too slow with a reinforcer, the
response is not strengthened
- If you do not use differential reinforcement,
behavioral variability will not be selected. The
shaping process will not select a new operant.
45A model experiment
- Deprivation and ad libitum weight
- Repeated exposure to the apparatus to extinguish
emotional responses
- Pair sound of pellet dispenser with pellet
delivery
- Shaping: reinforce successive approximations of
desired response. Responses are shaped, NOT rats.
- Differential reinforcement
- First definable response
- Satiation
46Shaping
- In some instances, a behavior does not occur, so
reinforcement of it is not possible
- Shaping: differential reinforcement of successive
approximations of a terminal behavior
- Example: shake
- Raise right paw ½ from ground
- On cue, shift weight onto left paw, such that
right paw is available
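Differential reinforcement of successive approximations can be sketched as a criterion that rises each time a response meets it: responses at or above the current criterion are reinforced, the rest are not. The example below is a deliberately simplified illustration, not a model of real behavior; the function name, the one-dimensional response measure, and the numbers are all hypothetical.

```python
from typing import List, Tuple


def shape(responses: List[float], start_criterion: float,
          step: float, target: float) -> List[Tuple[float, float, bool]]:
    """Reinforce responses that meet a criterion that moves toward target.

    Returns one (response, criterion in force, reinforced?) tuple
    per response, in order.
    """
    criterion = start_criterion
    log = []
    for response in responses:
        reinforced = response >= criterion
        log.append((response, criterion, reinforced))
        # Raise the criterion only after a reinforced approximation,
        # and never beyond the terminal behavior.
        if reinforced and criterion < target:
            criterion = min(criterion + step, target)
    return log


# Hypothetical response magnitudes (for example, paw height in arbitrary units).
for row in shape([1, 2, 2, 3, 4], start_criterion=1, step=1, target=3):
    print(row)
```

In this run the third response, which would have been reinforced earlier, goes unreinforced because the criterion has already moved on. That is the differential part of the procedure.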
47Shaping behavior in the real world
- Skinner was very good at shaping animals and
people, without their awareness (Rollo May)
- The behavior of professors has been shaped by
students in the classroom.
48Satiation
- Satiation: the rate of response declines because
repeated presentations of the reinforcer weaken
its effectiveness.
- A satiation operation decreases the
effectiveness of reinforcement. This effect is
opposite to deprivation, in which the
effectiveness of a reinforcer is increased by
withholding it.
49- Operant Variability
- Behavioral variability increases the chances that
the organism will either reinstate reinforcement
or contact other sources of reinforcement
- The effect of reinforcement on response
stereotypy is a controversial issue in the field
of behavior analysis
- Reinforcement contingencies may sometimes produce
novel and creative behavior patterns
50- Reinforcement, problem solving, and creativity
- Dr. Barry Schwartz argues that reinforcement
produces behavioral inflexibility and rigidity,
which interfere with finding solutions to
complex problems that require innovation and
creativity
- Conversely, Dr. Allen Neuringer argues that
response stereotypy is not an inevitable outcome
of reinforcement; rather, the effects of
reinforcement depend upon the contingencies
- If contingencies support response stereotypy,
then it will occur
- Contingencies may generate novel sequences of
behavior if these patterns result in reinforcement
51Extinction
- The process of withholding reinforcement for a
previously reinforced response is called
extinction.
52Behavioral Effects of Extinction
- Extinction produces several behavioral effects in
addition to a decline in the rate of response.
- Extinction Burst
- When extinction is started, operant behavior
tends to increase in frequency.
- An initial increase in rate of response, or
extinction burst, occurs when reinforcement is
first withdrawn.
- Response Topography
- In addition to extinction bursts, operants show
increases in operant variability, that is,
variations in form or topography, as extinction
proceeds.
- Antonitis (1951) and Pear (1985)
54Behavioral Effects of Extinction
- Force of Response
- Reinforcement may be made contingent on the
force of response, resulting in response
differentiation
- When extinction occurs, the force of lever
pressing becomes more variable. Interestingly,
some responses are more forceful than any
emitted during reinforcement or during operant
level. This increase in response force may be
due to emotional behavior generated by extinction
procedures.
55Behavioral Effects of Extinction
- Emotional Responses
- A variety of emotional responses occur under
conditions of extinction. Birds flap their
wings, rats bite the response lever, and humans
may swear and kick at a vending machine. One
important kind of emotional behavior that occurs
during extinction is aggression.
- Resistance to extinction
- The number of responses emitted by the bird, or
the rate of response during the last session, may
be used to index resistance to extinction.
56The Partial Reinforcement Effect (PRE)
- Partial Reinforcement Effect
- Resistance to extinction is substantially
increased when an intermittent schedule of
reinforcement has been used to maintain behavior.
- The higher the rate of reinforcement, the greater
the resistance to change.
- Extinction occurs more rapidly for behavior with a
history of continuous reinforcement (CRF)
relative to behavior with a history of
intermittent reinforcement, due to discrimination
between reinforcement and extinction
- An organism can discriminate between CRF and
extinction more easily than between a lean,
intermittent schedule and no reinforcement
57The Partial Reinforcement Effect
- Another interpretation of PRE involves stimulus
generalization
- Conditions of the low rate of reinforcement on
intermittent schedules are more similar to
extinction than conditions on a CRF
- Organisms generalize from intermittent
reinforcement to extinction, resulting in more
time or responses to extinction
- Contact with the contingencies: a rat reinforced
for every 100 responses must emit 100 responses in
order to encounter the transition into
extinction, while a rat on a CRF could contact
extinction immediately.
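The contact-with-the-contingencies point can be made concrete by counting responses until the first scheduled reinforcer fails to arrive. The sketch below assumes a fixed-ratio schedule, with extinction beginning at the start of the count; the function name is hypothetical, and CRF is treated as a ratio of 1.

```python
def responses_until_first_omission(ratio: int) -> int:
    """Count responses emitted before extinction can be contacted.

    ratio: responses required per reinforcer on the prior schedule
           (1 for CRF, 100 for one reinforcer per 100 responses).
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    count = 0
    while True:
        count += 1
        reinforcer_was_due = (count % ratio == 0)
        if reinforcer_was_due:
            # Under extinction the due reinforcer is withheld; this is
            # the first response on which the change can be encountered.
            return count


print(responses_until_first_omission(1))    # CRF
print(responses_until_first_omission(100))  # every 100th response
```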
58Discriminative Stimuli and Extinction
- Skinner (1950): "maximal responding during
extinction is obtained only when conditions
under which the response was reinforced are
precisely reproduced" (p. 204).
- If you want a response to extinguish more
rapidly
59Spontaneous Recovery and Extinction
- After a session of extinction, the response rate
may be close to operant level.
- When the animal is placed in the operant chamber
the next day (still on extinction), it will
respond above operant level: spontaneous recovery.
- After repeated sessions of extinction, the level
of recovery will decline.
60Forgetting vs. Extinction
- Forgetting is behavior change due to the passage
of time without opportunity to perform a
behavior.
- Extinction is behavior change due to response
performance without reinforcement.
- A very well learned response is quite resistant
to behavior change due to lack of opportunity to
perform, that is, forgetting. Time itself can have
little effect on a well learned response if
occurrences of the response are reinforced when
an opportunity for a response performance occurs.
- Extinction is a change in behavior due to
performance without consequences. The passage of
time will reduce the resistance to extinction of
a particular response.
61Remembering and Recalling
- In much of psychology, people and other organisms
are said to store information about events in
memory
- The use of the noun memory is an example of
reification, or treating an action as if it were a
thing
- In behavior analysis, the verb remembering (or
forgetting) is used to refer to the effect of
some event on behavior after the passage of time
62Remembering and Recalling
- For humans, we may say that a person recalls
his/her trip to Costa Rica when a picture of the
trip occasions a verbal description of the
vacation
- From a behavioral perspective, recalling the trip
is behavior (mostly verbal) emitted now with
respect to events that occurred in the past
- Remembering and recalling are treated as behavior
processes rather than some mysterious thing
(i.e., a memory) within us
- Behavior analysts assume that the event recalled
is one that was described when it first occurred
- Recalling refers to the reoccurrence of behavior
(mostly verbal) that has already occurred at
least once
63Remembering as Behavior
- Remembering involves behavior that occurred in
the past that now reoccurs after a period of time
has elapsed since the original performance.
- Do elephants really remember (or never forget)?
64Applying Extinction
- Extinction plays an important role in behavior
modification
- Williams (1959): When put to bed by his parents, a
20-month-old boy would throw a temper tantrum,
requiring the parents to stay up with him.
- The parental attention given to the boy served as
reinforcement for the tantrums.
- Extinction was implemented by the parents leaving
the room after the child was put to bed.
- In the first extinction session, the tantrum
lasted 45 min., and in the third session only
10 min. After ten days the boy had no tantrums.
65Applying Extinction
- The child's aunt reinforced his crying by staying
in the room, and his tantrums reoccurred.
- This intermittent reinforcement increased
resistance to extinction of the tantrums
- The second extinction procedure took longer due
to the intermittent reinforcement.
- The first procedure was more effective since the
child had been continuously reinforced.