Chapter 4 Reinforcement and Extinction of Operant Behavior

Slides: 66

Provided by: phel

Transcript and Presenter's Notes

1
Chapter 4: Reinforcement and Extinction of Operant
Behavior
Prepared by Brady J. Phelps, South Dakota State
University
2
Two Types of Behavior

Classical Conditioning
Respondents: reflexive, elicited behavior
CS controls behavior
Operant Conditioning
Operants: voluntary, emitted behavior
Outcomes control behavior

3
Operant Behavior

Operant behavior that is followed by reinforcing
consequences is selected in the sense that it
increases in frequency.
Behavior that is not followed by reinforcing
consequences decreases in frequency.
This is known as operant conditioning.
Operant conditioning, as a process, has evolved
over species history and is based on genetic
endowment.
That is, operant (and respondent) conditioning as
a general behavior-change process is based on
phylogeny.

4
Operant Behavior

From a scientific perspective, operant behavior
is lawful and may be analyzed in terms of its
relationship to environmental events.
Formally, responses that produce a change in the
environment are called operants.
The term operant comes from the verb to operate
and refers to behavior that operates on the
environment to produce a consequence.
A positive reinforcer is defined as any
consequence that increases the probability of the
operant that produced it.
Topography
Operant class

5
Why rats and pigeons?
Rats and pigeons (and humans) have been common
research subjects for operant conditioning. Rats
and pigeons have been used for a number of
reasons. Basic responses such as a lever press
or a key peck are simple, but they are adequate
to illustrate the generality of basic operant
processes. In other words, the specific subject
or response isn't as important as the principles
being demonstrated. Rats and pigeons can be
considered technophilic species, as they
readily adapt to human apparatus and habitations.
Just as these species readily move into our
garages, barns, attics, or basements, they live
well in cages. By the way, operant researchers
have used subjects as diverse as insects,
cephalopods, and primates.
6
Old Faithful
7
(No Transcript)
8
This slide and the prior slide show Dr. Robert
Allen and his student Shannon Lieb, of Lafayette
College, with subject 28. The next slide
is a close-up of one of Dr. Allen's pigeon
subjects responding in an operant chamber.
9
(No Transcript)
10
(No Transcript)
11
RoboRat, ready for search and rescue
12
Discriminative Stimuli

Operant behavior is said to be emitted in the
sense that it often occurs without an observable
stimulus preceding it. This is in contrast to
reflexive responses, which are elicited by a
preceding stimulus.
Stimuli may also precede operant behavior.
However, these events do not force the occurrence
of the response that follows them.
An event that precedes an operant and alters its
likelihood is said to set the occasion for
behavior and is called a discriminative stimulus,
or SD.

13
Discriminative Stimuli

The consequences that follow operant behavior
establish the control exerted by discriminative
stimuli. When an SD is followed by an operant
that produces positive reinforcement, the operant
is more likely to occur the next time the
stimulus is present.
Differential reinforcement
When an operant does not produce reinforcement,
the stimulus that precedes the response is called
an S-delta (SΔ). In the presence of an S-delta,
the probability of emitting an operant declines.
An SD (or an SΔ) is not defined by its
physical parameters but rather by preceding and
altering the probability of responses.

14
Emitted versus occasioned
Operants can and do occur in the absence of any
eliciting stimulus; they are said to be freely
emitted. However, when an SD comes to control
occurrences of an operant, altering its
probability of occurring, it is said that
the SD occasions the operant. The term occasion
indicates that the operant is under the stimulus
control of an antecedent stimulus. Occasion as a
verb can be defined as creating a situation in
which something (in this case, an operant) is
especially likely to occur.
15
Contingencies of Reinforcement

A contingency of reinforcement defines the
relationship between the events that set the
occasion for behavior, the operant class, and the
consequences that follow this behavior.

16
Four Basic Contingencies

Positive Reinforcement
Positive reinforcement is one of the four basic
contingencies of operant behavior.
Positive reinforcement is when a stimulus
follows behavior and, as a result, the rate of
that behavior increases.
Positively reinforcing events usually include
consequences such as food, praise, and money.
These events, however, cannot be said to be
positive reinforcers until they have been shown
to increase behavior.

17
Four Basic Contingencies

Negative Reinforcement
When a response results in the removal of an
event, and this procedure increases the rate of
that response, the contingency is called negative
reinforcement.

18
Four Basic Contingencies

Positive Punishment
When responses produce an event and the rate of
behavior decreases, the contingency is called
positive punishment.

19
Four Basic Contingencies

Negative Punishment
Punishment can also be arranged by removing
stimuli contingent on behavior. When the rate of
behavior decreases as a result, the contingency
is called negative punishment.
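
The four basic contingencies can be read as a two-by-two table: whether the response presents or removes a stimulus, and whether the rate of the response then increases or decreases. The short Python sketch below encodes that table. The function name and the string labels are illustrative choices, not terms from the slides.

```python
def classify_contingency(stimulus_change: str, rate_change: str) -> str:
    """Name the contingency from its two defining features.

    stimulus_change: "presented" or "removed" (what the response does)
    rate_change:     "increases" or "decreases" (what happens to the behavior)
    """
    sign = {"presented": "positive", "removed": "negative"}[stimulus_change]
    effect = {"increases": "reinforcement", "decreases": "punishment"}[rate_change]
    return f"{sign} {effect}"


# Print the full two-by-two table.
for stimulus_change in ("presented", "removed"):
    for rate_change in ("increases", "decreases"):
        label = classify_contingency(stimulus_change, rate_change)
        print(f"stimulus {stimulus_change}, rate {rate_change}: {label}")
```

Note that the label depends on the observed change in behavior, not on whether the event seems pleasant or unpleasant.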

20
Reinforcement, Intrinsic Motivation, and
Creativity

Many educators and social psychologists argue
that rewards/reinforcement reduce individual
self-determination, motivation, and creativity
Rewards/reinforcement are interpreted as
controlling, which leads to a reduction in the
behaviors listed above
Reinforcement reduces a person's intrinsic
motivation, the critics argue

21
Rewards and Intrinsic Motivation

Opponents of rewards often cite experimental data
showing rewards as having negative effects
Results are not consistent: some studies find
negative effects, some find positive effects
Tangible rewards given for meeting a certain
criterion level of performance or exceeding the
performance of others actually maintained or
enhanced intrinsic interest
Eisenberger and Cameron (1996): no inherent
negative property of rewards

Eisenberger and Cameron's research suggests ways
to approach the use of rewards
Verbal rewards increase people's performance and
interest on a task
Tangible rewards produced a slight decrease in
intrinsic motivation when rewards were given
simply for doing an activity, regardless of
quality of performance
In general, rewards increase intrinsic interest
and task enjoyment
The view that rewards undermine people's
intrinsic motivation is an overgeneralization;
rewards tied to level of performance can increase
intrinsic motivation
When used correctly, rewards have positive
effects on creativity and intrinsic motivation
Rewards tied to level or quality of performance
increase intrinsic motivation or leave intrinsic
interest unaffected

23
How does one identify a reinforcing stimulus?

Test it.
If given contingent upon a behavior, what effect
is observed upon behavior?
Another way is to rely on the Premack principle

24
Premack's Principle

A higher frequency behavior will function as
reinforcement for a lower frequency behavior
Premack suggests it is possible to describe
reinforcing events as actions of the organism
rather than as discrete stimuli
Reinforcement relativity

25
Premack and Punishment

Less frequent behaviors can function as
punishment for more frequent behaviors
Often termed a reciprocal contingency
Use in applied behavior analysis
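
As a rough illustration of the Premack principle and its punishment counterpart, the sketch below compares baseline (free-choice) frequencies of two behaviors and predicts how a contingency between them should work. The function name, the behaviors, and the baseline numbers are hypothetical and are not taken from the slides.

```python
def premack_prediction(baseline: dict, instrumental: str, contingent: str) -> str:
    """Predict the effect of making `contingent` behavior depend on `instrumental`.

    baseline maps each behavior to its free-choice frequency
    (for example, minutes spent per hour).
    """
    if baseline[contingent] > baseline[instrumental]:
        # Access to a higher frequency behavior follows a lower frequency one.
        return "reinforcement"
    if baseline[contingent] < baseline[instrumental]:
        # A lower frequency behavior is required after a higher frequency one.
        return "punishment"
    return "no prediction"


# Hypothetical baseline for a water-deprived rat.
baseline = {"drinking": 20.0, "wheel running": 5.0}

# Running (low) produces access to drinking (high): running should increase.
print(premack_prediction(baseline, "wheel running", "drinking"))
# Drinking (high) forces running (low): drinking should decrease.
print(premack_prediction(baseline, "drinking", "wheel running"))
```

The same two behaviors can serve as reinforcer or punisher depending on their relative frequencies, which is what is meant by reinforcement relativity.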

26
Response Deprivation

In a free choice setting, behaviors occur at
different frequencies, yielding a response
hierarchy. Higher frequency behaviors are placed
above lower frequency behaviors in such a
hierarchy.
Depriving an animal of the opportunity to engage
in a given behavior changes the response
frequencies and the hierarchy.

27
Response Deprivation and Equilibrium

Deprivation leads to a reordering of the
hierarchy and determines which behaviors will
function as reinforcement at any given time.
Rats, humans and others will work to gain access
to deprived activities.
Instrumental responses
Contingent responses
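
The slides do not give a formula for response deprivation. A common way of stating the idea, assumed here, is that a schedule deprives the contingent response when the ratio of instrumental to contingent responding it requires exceeds the ratio seen at free-choice baseline. The sketch below checks that condition. All names and numbers are hypothetical.

```python
def is_response_deprived(required_instrumental: float,
                         allowed_contingent: float,
                         baseline_instrumental: float,
                         baseline_contingent: float) -> bool:
    """Return True if the schedule restricts the contingent response below baseline.

    Assumed condition: required/allowed > baseline_instrumental/baseline_contingent.
    Cross-multiplied so that a zero baseline does not cause division by zero.
    """
    return (required_instrumental * baseline_contingent
            > baseline_instrumental * allowed_contingent)


# Hypothetical baseline: 20 min of drinking and 5 min of running per hour.
# Here drinking is the instrumental response and running the contingent one.
print(is_response_deprived(10, 1, 20, 5))  # 10 min drinking per 1 min running
print(is_response_deprived(2, 1, 20, 5))   # 2 min drinking per 1 min running
```

Under the first schedule running is restricted relative to baseline, so even this lower frequency behavior would be expected to function as reinforcement for drinking.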

28
Operant Conditioning

Operant conditioning refers to an increase or
decrease in operant behavior as a function of a
contingency of reinforcement.
Latency is important in operant conditioning.
For example, in an experiment, time from closing
a trap door until a cat manages to get it open is
a period known as latency.
Thorndike's law of effect came out of measures of
latency. The law of effect says that operants
that produce positive reinforcers increase in
frequency.

29
Thorndike: The Law of Effect
30
(No Transcript)
31
E. L. Thorndike studied cats in puzzle boxes.
A hungry cat had to learn to press a lever to get
out of the box and get food.
32
Cats gradually learned to make the correct
response (by accident at first). Lever pressing,
initially a low probability behavior, became a
high probability behavior, while initially high
probability behaviors became low probability
behaviors.
33
Their response latency also got shorter.
34
Operant Conditioning
Rate of Response as a Measure of Response
Strength
Skinner suggested that rate of response
should be the basic datum (or measure) for
operant analysis. This is because rate of
response is an index of the probability that an
operant will occur in the future. Operant rate
provides a direct measure of the selection of
behavior by its consequences.
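
To make rate of response concrete, the sketch below computes responses per minute for a whole session and for successive time bins. The function names and the response times are made up for illustration.

```python
def responses_per_minute(response_times, session_seconds):
    """Overall rate of response, in responses per minute."""
    if session_seconds <= 0:
        raise ValueError("session_seconds must be positive")
    return len(response_times) * 60 / session_seconds


def binned_rates(response_times, session_seconds, bin_seconds):
    """Rate of response (per minute) in each successive time bin."""
    n_bins = session_seconds // bin_seconds
    counts = [0] * n_bins
    for t in response_times:
        index = int(t // bin_seconds)
        if 0 <= index < n_bins:
            counts[index] += 1
    return [count * 60 / bin_seconds for count in counts]


# Hypothetical lever-press times (seconds from session start) in a 60 s session.
times = [2, 5, 9, 14, 20, 21, 22, 23, 24, 25]
print(responses_per_minute(times, 60))   # overall rate
print(binned_rates(times, 60, 20))       # rate in each 20 s bin
```

Binning shows how rate changes within a session, which a single overall rate would hide.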
35
Operant conditioning of the neuron

Skinner's speculation in 1953
In vitro reinforcement of individual neurons
Dopamine as reinforcement contingent upon bursts
of firings
Contingent delivery of dopamine versus
noncontingent
Dopamine and cannabinoids versus glutamate

36
Procedures in Operant Conditioning

Operant rate-probability of response
In the free operant method, an animal may respond
over an extended period of time. It is free to
emit many responses or none at all. Bar pressing
as an operant can occur rapidly or slowly.
This allows the researcher to observe changes in
the rate of response, which is important as
a measure of response probability.

37
The Operant Chamber

Operant chamber, also called Skinner box
- Allows continuous behavioral measure
Measured by cumulative recorder
- Main dependent variable is response
rate
- Now can measure maintenance of
response, not just learning of it

38
Understanding Cumulative Records

Plots responses as they occur moment to moment
A pen records time horizontally. Each response
moves the pen vertically
Reinforcers are marked
Slope of the record indicates response rate
Steep: high rate
Flat: low rate
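
A cumulative record can be approximated in software by sampling the running count of responses at regular times. In the sketch below, the slope between two sampled times is the response rate over that interval. The function names and the response times are hypothetical.

```python
def cumulative_record(response_times, session_seconds, step=10):
    """Return (time, cumulative responses) points sampled every `step` seconds."""
    times = sorted(response_times)
    points = []
    count = 0
    i = 0
    for t in range(0, session_seconds + 1, step):
        # Each response up to time t moves the "pen" up by one unit.
        while i < len(times) and times[i] <= t:
            count += 1
            i += 1
        points.append((t, count))
    return points


def slope(points, start, end):
    """Responses per second between two sampled times."""
    lookup = dict(points)
    return (lookup[end] - lookup[start]) / (end - start)


# Hypothetical session: a fast burst, a pause, then slow responding.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30, 50]
record = cumulative_record(times, 60)
print(record)
print("steep:", slope(record, 0, 10))
print("flat: ", slope(record, 10, 20))
print("low:  ", slope(record, 20, 60))
```

The record never goes down, since responses only accumulate. A flat segment means no responding during that interval.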

39
[Figure: cumulative record. Time moves the pen horizontally and each response moves it vertically. Labeled segments: low rate, no response, high rate.]
40
Procedures in Operant Conditioning

Deprivation
Because the delivery of food is used as
reinforcement, an animal must be motivated to
obtain food. An objective and quantifiable
measure of motivation for food is percentage of
free-feeding body weight.
The procedure of restricting access to food
(the potentially reinforcing stimulus) is called
a deprivation operation.
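
Percentage of free-feeding body weight is simple arithmetic. The sketch below uses a hypothetical function name and hypothetical weights. No particular target percentage is implied by the slides.

```python
def percent_free_feeding_weight(current_grams: float, free_feeding_grams: float) -> float:
    """Current body weight as a percentage of free-feeding (ad libitum) weight."""
    if free_feeding_grams <= 0:
        raise ValueError("free_feeding_grams must be positive")
    return 100.0 * current_grams / free_feeding_grams


# Hypothetical rat: 400 g with free access to food, 340 g under restricted feeding.
print(percent_free_feeding_weight(340, 400))
```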

41
Magazine Training

After deprivation for food is established,
magazine training begins.
When the feeder releases a pellet of food, a
click sound is produced.
The sound of the feeder is associated with food
pellets and becomes a conditioned reinforcer.
After training, the animal is observed to move
toward the magazine when the feeder is activated,
as signaled by the click.

42
Procedures in Operant Conditioning

Operant Level
Rats emit many exploratory and manipulative
responses and as a result may press the lever in
an operant chamber at some low frequency, even
when this behavior is not reinforced with food.
This baseline rate of response is called the
operant level. If food were to then be made
contingent on lever pressing, lever pressing
would be acquired as a high probability behavior
and increase in frequency above its operant level.

43
Procedures in Operant Conditioning

The Method of Successive Approximation, aka
Shaping
The previous example described the rat's
behavioral repertoire and an alteration of the
repertoire. The animal's repertoire refers to
the behavior it is capable of naturally emitting
on the basis of species and environmental
history.
Many novel forms of behavior may be shaped by
the method of successive approximation. Shaping
is a key part of acquisition and necessarily
involves differential reinforcement.

44
Shaping the lever press response (responses are
shaped, not rats)

Extinguish any UR to the chambers
Watch what the rat does
If you are too slow with a reinforcer, response
not strengthened
If you do not use differential reinforcement,
behavioral variability will not be selected. The
shaping process will not select a new operant.

45
A model experiment

Deprivation and ad libitum weight
Repeated exposure to the apparatus to extinguish
emotional responses
Pair sound of pellet dispenser with pellet
delivery
Shaping: reinforce successive approximations of
desired response. Responses are shaped, NOT rats.
Differential reinforcement
First definable response
Satiation

46
Shaping

In some instances, a behavior does not occur, so
reinforcement of it is not possible
Shaping: differential reinforcement of successive
approximations of a terminal behavior
Example: shake
Raise right paw ½ from ground
On cue, shift weight onto left paw, such that right
paw is available
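
The logic of shaping can be illustrated with a toy simulation. It is a sketch under strong assumptions, not a model from the slides: a response is a single number that varies randomly around a typical value, and each reinforced response pulls the typical value toward itself. With shaping, the criterion starts at what the animal already does and is raised as behavior changes. Without shaping, only the terminal behavior is reinforced. All names and parameter values are hypothetical.

```python
import random


def simulate(target, shaping, trials=2000, seed=0):
    """Return (final typical response, number of reinforced responses)."""
    rng = random.Random(seed)
    typical = 0.0                               # what the animal does at the start
    criterion = 0.0 if shaping else target      # requirement for reinforcement
    reinforced = 0
    for _ in range(trials):
        response = rng.gauss(typical, 1.0)      # behavioral variability
        if response >= criterion:               # differential reinforcement
            reinforced += 1
            # Assumed learning rule: the reinforced variant becomes more typical.
            typical += 0.2 * (response - typical)
            if shaping:
                # Require a closer approximation to the target next time.
                criterion = min(target, typical)
    return typical, reinforced


shaped, n_shaped = simulate(target=10.0, shaping=True)
unshaped, n_unshaped = simulate(target=10.0, shaping=False)
print("with shaping:   ", round(shaped, 2), "typical response,", n_shaped, "reinforcers")
print("without shaping:", round(unshaped, 2), "typical response,", n_unshaped, "reinforcers")
```

In this toy model the terminal behavior lies far outside the starting variability, so without shaping it never occurs and is never reinforced. With shaping, reinforcing successive approximations moves the typical response to the target.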
47
Shaping behavior in the real world

Skinner was very good at shaping animals and
people, without their awareness (Rollo May)
The behavior of professors has been shaped by
students in the classroom.

48
Satiation

Satiation: the rate of response declines because
repeated presentations of the reinforcer weaken
its effectiveness.
A satiation operation decreases the
effectiveness of reinforcement. This effect is
opposite to deprivation, in which the
effectiveness of a reinforcer is increased by
withholding it.

Operant Variability
Behavioral variability increases the chances that
the organism will either reinstate reinforcement
or contact other sources of reinforcement
The effect of reinforcement on response
stereotypy is a controversial issue in the field
of behavior analysis
Reinforcement contingencies may sometimes produce
novel and creative behavior patterns

Reinforcement, problem solving, and creativity
Dr. Barry Schwartz argues that reinforcement
produces behavioral inflexibility and rigidity,
which interfere with finding solutions to
complex problems that require innovation and
creativity
Conversely, Dr. Allen Neuringer argues that
response stereotypy is not an inevitable outcome
of reinforcement, but rather the effects of
reinforcement depend upon the contingencies
If contingencies support response stereotypy,
then it will occur
Contingencies may generate novel sequences of
behavior if these patterns result in reinforcement

51
Extinction

The process of withholding reinforcement for a
previously reinforced response is called
extinction.

52
Behavioral Effects of Extinction

Extinction produces several behavioral effects in
addition to a decline in the rate of response.
Extinction Burst
When extinction is started, operant behavior
tends to increase in frequency.
An initial increase in rate of response, or
extinction burst, occurs when reinforcement is
first withdrawn.
Response Topography
In addition to extinction bursts, operants show
increases in operant variability, variations in
form or topography as extinction proceeds.
Antonitis (1951) and Pear (1985)

53
(No Transcript)
54
Behavioral Effects of Extinction

Force of Response
Reinforcement may be made contingent on the
force of response, resulting in response
differentiation
When extinction occurs, the force of lever
pressing becomes more variable. Interestingly,
some responses are more forceful than any
emitted during reinforcement or during operant
level. This increase in response force may be
due to emotional behavior generated by extinction
procedures.

55
Behavioral Effects of Extinction

Emotional Responses
A variety of emotional responses occur under
conditions of extinction. Birds flap their
wings, rats bite the response lever, and humans
may swear and kick at a vending machine. One
important kind of emotional behavior that occurs
during extinction is aggression.
Resistance to extinction
The number of responses emitted by the bird, or
its rate of response, during the last session may
be used to index resistance to extinction.

56
The Partial Reinforcement Effect (PRE)

Partial Reinforcement Effect
Resistance to extinction is substantially
increased when an intermittent schedule of
reinforcement has been used to maintain behavior.
The higher the rate of reinforcement, the greater
the resistance to change.
Extinction occurs more rapidly on behavior with a
history of continuous reinforcement (CRF)
relative to behavior with a history of
intermittent reinforcement, due to discrimination
between reinforcement and extinction
An organism can discriminate between CRF and
extinction more easily than between a lean
intermittent schedule and no reinforcement

57
The Partial Reinforcement Effect

Another interpretation of PRE involves stimulus
generalization
Conditions of the low rate of reinforcement on
intermittent schedules are more similar to
extinction than conditions on a CRF
Organisms generalize from intermittent
reinforcement to extinction, resulting in more
time or responses to extinction
Contact with the contingencies: A rat reinforced
for every 100 responses must emit 100 responses in
order to encounter the transition into
extinction, while a rat on CRF could contact
extinction immediately.
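
The contact-with-the-contingencies argument can be checked with a small count. The sketch below assumes a fixed-ratio training schedule and asks how many responses are emitted in extinction before the first response that would have been reinforced in training goes unreinforced. The function name is hypothetical.

```python
def responses_to_contact_extinction(ratio, max_responses=10_000):
    """Responses emitted before extinction first differs from training.

    ratio: responses required per reinforcer in training (1 means CRF).
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    for n in range(1, max_responses + 1):
        reinforced_in_training = (n % ratio == 0)
        reinforced_in_extinction = False   # nothing is reinforced in extinction
        if reinforced_in_training != reinforced_in_extinction:
            return n
    return None


print(responses_to_contact_extinction(1))     # CRF
print(responses_to_contact_extinction(100))   # reinforcer every 100 responses
```

On CRF the very first unreinforced response differs from training. On the leaner schedule, the first 99 responses in extinction look exactly like training.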

58
Discriminative Stimuli and Extinction

Skinner (1950): maximal responding during
extinction is obtained only when conditions
under which the response was reinforced are
precisely reproduced (p. 204).
If you want a response to extinguish more
rapidly

59
Spontaneous Recovery and Extinction

After a session of extinction, the response rate
may be close to operant level.
When the animal is placed in the operant chamber
the next day (still on extinction), it will respond
above operant level: spontaneous recovery.
After repeated sessions of extinction, the level
of recovery will decline.

60
Forgetting vs. Extinction

Forgetting is behavior change due to the passage
of time without opportunity to perform a
behavior.
Extinction is behavior change due to response
performance without reinforcement.
A very well learned response is quite resistant
to behavior change due to lack of opportunity to
perform or forgetting. Time itself can have
little effect on a well learned response if
occurrences of the response are reinforced when
an opportunity for a response performance occurs.
Extinction is a change in behavior due to
performance without consequences. The passage of
time will reduce the resistance to extinction of
a particular response.

61
Remembering and Recalling

In much of psychology, people and other organisms
are said to store information about events in
memory
The use of the noun memory is an example of
reification or treating an action as if it were a
thing
In behavior analysis, the verb remembering (or
forgetting) is used to refer to the effect of
some event on behavior after the passage of time

62
Remembering and Recalling

For humans, we may say that a person recalls
his/her trip to Costa Rica, when a picture of the
trip occasions a verbal description of the
vacation
From a behavioral perspective, recalling the trip
is behavior (mostly verbal) emitted now with
respect to events that occurred in the past
Remembering and recalling are treated as behavior
processes rather than some mysterious thing
(i.e., a memory) within us
Behavior analysts assume that the event recalled
is one that was described when it first occurred
Recalling refers to the reoccurrence of behavior
(mostly verbal) that has already occurred at
least once

63
Remembering as Behavior

Remembering involves behavior that occurred in
the past that now reoccurs after a period of time
has elapsed since the original performance.
Do elephants really remember (or never forget)?

64
Applying Extinction

Extinction plays an important role in behavior
modification
Williams (1959): When put to bed by his parents, a
20-month-old boy would throw a temper tantrum,
requiring the parents to stay up with him.
The parental attention given to the boy served as
reinforcement for the tantrums.
Extinction was implemented by the parents leaving
the room after the child was put to bed.
In the first extinction session, the tantrum lasted
45 min., and in the third session only 10 min.
After ten days the boy had no tantrums.

65
Applying Extinction

The child's aunt reinforced his crying by staying
in the room, and his tantrums reoccurred.
This intermittent reinforcement increased
resistance to extinction of the tantrums
The second extinction procedure took longer due
to the intermittent reinforcement.
The first procedure was more effective since the
child had been continuously reinforced.
