Title: Chapter 5 Schedules of Reinforcement
Chapter 5: Schedules of Reinforcement
Prepared by Brady J. Phelps, South Dakota State
University
Schedules of Reinforcement
- A schedule of reinforcement is a prescription
that states how and when discriminative stimuli
and behavioral consequences will be presented.
- A rat pressing a lever for food and a person
turning on a light seem to have little in common.
Humans are very complex organisms: they build
cities, write books, go to college, go to war,
conduct experiments, and do many other things
that rats cannot do. Nonetheless, performance on
schedules of reinforcement has been found to be
remarkably similar across different organisms,
many types of behavior, and a variety of
reinforcers.
Importance of Schedules of Reinforcement
- A schedule is the rule for the presentation of
stimuli that precede operants and the
consequences that follow them.
- Schedules are often the fundamental determinants
of behavior:
- Rate and temporal patterns of responding
- Probability of responding
Schedules and Patterns of Response
- Patterns of response develop on schedules of
reinforcement. These patterns come about after
an animal has experience with the contingency of
reinforcement defined by a particular schedule.
- Subjects are exposed to a schedule of
reinforcement and, following an acquisition
period, behavior typically settles into a
consistent or steady-state performance.
- It may take many experimental sessions before a
particular pattern emerges, but once it does, the
orderliness of behavior is remarkable.
Schedules and Patterns of Response
- When organisms are reinforced for a fixed number
of responses, a pause-and-run pattern of behavior
develops.
- Responses required by the schedule are made
rapidly and result in reinforcement.
- Following each reinforcement, there is a pause in
responding, then another quick burst of
responses.
- This pattern repeats over and over and occurs
even when the size of the schedule is changed. A
pause-and-run pattern has been found for horses,
chickens, vultures, and children.
Schedules of Reinforcement
- Continuous reinforcement (CRF), equivalent to FR 1
- Fixed ratio (FR)
- Postreinforcement pause
- Run of responses
- Variable ratio (VR)
- Fixed interval (FI): scalloping
- Long term: break and run
- Humans
- Variable interval (VI)
Inner Causes of Behavior
- Are some people motivated to work while others
are not?
- Do personalities explain observed consistencies
in behavior?
Schedules of Positive Reinforcement
- Response-based schedules
- Fixed ratio (FR)
- Variable ratio (VR)
- Progressive ratio (PR)
- Random ratio (RR)
- Time-based schedules
- Fixed interval (FI)
- Variable interval (VI)
- Fixed and variable time (FT/VT)
Other Schedules
- Differential reinforcement of low rates (DRL)
- IRT > t
- Differential reinforcement of high rates (DRH)
- IRT < t
Continuous Reinforcement
- Continuous reinforcement, or CRF, is the
simplest schedule of reinforcement. On this
schedule, every operant required by the
contingency is reinforced.
- CRF and resistance to extinction:
- Continuous reinforcement generates little
resistance to extinction. Resistance to
extinction is a measure of persistence when
reinforcement is discontinued.
- CRF, response stereotypy, resurgence
Response Stereotypy on CRF
- On CRF, the topography of response becomes
very predictable, with very little variability.
- Antonitis (1951): rats were conditioned to poke
their noses anywhere along a 50-cm slot on a CRF
schedule. The rats responded at the same point on
the slot while on CRF.
- When placed on extinction, the variability of the
placement along the slot increased.
Response Stereotypy on CRF
- As continuous reinforcement persists, less and
less variation occurs in the operant class.
- The variability of response may be inversely
related to the rate of reinforcement.
- Responses were stereotyped on CRF and became more
variable on intermittent or extinction schedules.
- The general principle appears to be: when things
no longer work, try new ways of behaving.
- Resurgence, an increase in topographic
variability during extinction, can contribute to
the development of creative or original behavior.
Ratio and Interval Schedules of Reinforcement
- On intermittent schedules of reinforcement, some
rather than all responses are reinforced.
- Ratio schedules are response based; that is,
these schedules are set to deliver reinforcement
following a number of responses.
- Interval schedules pay off when one response is
made after some amount of time has passed.
- Ratio and interval schedules may be fixed or
variable.
- Fixed schedules set up reinforcement after a
fixed number of responses, or after a constant
amount of time has passed. On variable schedules,
response and time requirements vary from one
reinforcer to the next.
Ratio Schedules
- A fixed-ratio, or FR, schedule is programmed to
deliver reinforcement after a fixed number of
responses is made.
- Continuous reinforcement is FR 1.
- FR schedules produce a rapid run of responses,
followed by reinforcement, then a pause in
responding.
- A cumulative record of behavior on fixed ratio
looks somewhat like a set of stairs. There is a
steep period of responding (the run), followed by
reinforcement, and finally a flat portion. The
flat part of the cumulative record is called the
postreinforcement pause, or PRP.
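The FR contingency described above amounts to a simple response counter. The sketch below illustrates that logic; the class and method names are invented for the example, not taken from the chapter.

```python
# Minimal sketch of fixed-ratio (FR) schedule logic: reinforcement is
# delivered after every `ratio`-th response.

class FixedRatio:
    def __init__(self, ratio):
        self.ratio = ratio      # responses required per reinforcer
        self.count = 0          # responses since the last reinforcer

    def record_response(self):
        """Register one response; return True if it earns reinforcement."""
        self.count += 1
        if self.count >= self.ratio:
            self.count = 0      # ratio completed; reset for the next run
            return True
        return False

fr5 = FixedRatio(5)
outcomes = [fr5.record_response() for _ in range(10)]
# Only the 5th and 10th responses are reinforced.
print(outcomes)
# [False, False, False, False, True, False, False, False, False, True]
```

The pause-and-run pattern is the animal's behavior, not the schedule's rule: the schedule only counts responses, yet the stair-step cumulative record emerges reliably.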
Ratio Schedules
- Variable-ratio, or VR, schedules are similar to
FR schedules except that the number of responses
required for reinforcement changes after each
reinforcer is presented.
- The average number of responses is used to define
the schedule.
- Ratio schedules produce a high rate of response.
- When VR and FR schedules are compared, responding
is typically faster on variable ratio. One
reason for this is that pausing after
reinforcement (the PRP) is reduced or eliminated
when the ratio contingency is changed from fixed
to variable. This provides evidence that the PRP
does not occur because the animal is consuming
the reinforcer.
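Because only the average requirement defines a VR schedule, one way to sketch it is to draw each requirement at random around that mean. The sampling distribution below (uniform between 1 and 2 × mean − 1) is an assumption for illustration; real schedules use various distributions.

```python
import random

# Illustrative sketch of a variable-ratio (VR) schedule: each ratio
# requirement is drawn at random; only the mean names the schedule
# (e.g., VR 10).

def vr_requirements(mean, n, seed=0):
    """Draw n ratio requirements whose expected value is `mean`."""
    rng = random.Random(seed)
    # Uniform draw on 1 .. 2*mean - 1 keeps the expected value at `mean`.
    return [rng.randint(1, 2 * mean - 1) for _ in range(n)]

reqs = vr_requirements(10, 10_000)
print(round(sum(reqs) / len(reqs), 1))  # close to 10
```

Because the next reinforcer can arrive after very few responses, pausing is never "safe" on VR, which is one account of the reduced PRP.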
Ratio Schedules
- Many contingencies set by games of chance are
similar to variable-ratio schedules (more
precisely, random-ratio schedules).
- Gambling is often called addictive, but from a
behavioral perspective it may be understood as
persistent high-rate behavior generated by ratio
contingencies of reinforcement.
- A bird on a standard VR schedule may make
thousands of responses for a few brief
presentations of grain.
Ratio Schedules
- It is possible to set the average ratio
requirement so high that an animal will spend
all of its time working for a small amount of
food.
- The animal will show a net energy loss, with
effort expended exceeding caloric intake, similar
to the self-defeating responding sometimes seen
in gambling.
- The seemingly irrational behavior of gamblers is
generated by an unfavorable probabilistic
schedule of reinforcement, not by an addictive
personality.
Interval Schedules
- On fixed-interval (FI) schedules, an operant is
reinforced after a fixed amount of time has
passed.
- For example, on a fixed-interval 90-second
schedule, the first bar press after 90 seconds
results in reinforcement.
- When organisms are exposed to interval
contingencies, they typically produce many more
responses than the schedule requires.
- Fixed-interval schedules produce a characteristic
pattern of responding. There is a pause after
reinforcement (the PRP), then a few probe
responses, followed by more and more rapid
responding as the interval times out. This
pattern of response is called scalloping.
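The FI rule can be sketched as a clock check: only the first response after the interval has elapsed is reinforced, and reinforcement restarts the interval. The function name and the example times are illustrative, not from the chapter.

```python
# Minimal sketch of fixed-interval (FI) logic: responses before the
# interval elapses have no programmed effect; the first response
# afterward is reinforced and starts the next interval.

def fi_outcomes(response_times, interval):
    """Given response times in seconds, return which responses are
    reinforced under an FI `interval` schedule."""
    outcomes = []
    interval_start = 0.0
    for t in response_times:
        if t - interval_start >= interval:
            outcomes.append(True)   # first response after timeout
            interval_start = t      # reinforcement starts a new interval
        else:
            outcomes.append(False)  # probe response; interval not up yet
    return outcomes

# FI 90 s: responses at 30 s and 60 s are probes; 95 s is reinforced.
print(fi_outcomes([30, 60, 95, 100, 190], 90))
# [False, False, True, False, True]
```

Note that the scallop comes from the organism, not the rule: the schedule would pay off just as well with a single response per interval.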
Interval Schedules
- Following considerable experience with an FI
5-minute schedule, you may get very good at
judging the time period.
- In this case, you would wait out the interval and
then emit a burst of responses. Perhaps you
decide to pace back and forth during the session,
and you find out that after 250 steps the
interval has almost elapsed. This kind of
mediating behavior may develop after experience
with FI schedules.
- Other animals behave in a similar way and
occasionally produce a break-and-run pattern of
responding.
Generality of Schedule Effects
- Behavior analysts assume that research on
schedule effects with animals also applies to
humans.
- The assumption of generality implies that the
effects of reinforcement extend across species,
reinforcers, and behaviors.
- Humans, however, rarely show performances similar
to those of rats when placed on FI schedules.
Generality of Schedule Effects
- The influence of language may explain why humans
do not show the characteristic scalloping on FI
schedules.
- Humans produce either a high rate of response or
a low rate of response.
- People generate a verbal rule and behave
according to the rule rather than the
experimental contingencies.
- Humans who have not developed language skills
respond more like rats, showing the
characteristic effects of the schedule.
- A rat's performance on an FI schedule after a
history of ratio reinforcement
Interval Schedules
- On a variable-interval (VI) schedule, responses
are reinforced after a variable amount of time
has passed.
- For example, on a VI 30-second schedule, the time
to each reinforcement changes, but the average
time is 30 seconds.
- On this schedule, the rate of response is steady
and moderate. The pause after reinforcement that
occurs on FI usually does not appear in the
variable-interval record. Because the rate of
response is moderate, VI performance is often
used as a baseline for evaluating other
independent variables, such as drug effects.
- VI contingencies are common in everyday life.
Reinforcement and Behavioral Momentum
- The concept of momentum derives from a
combination of response rate, as generated by
schedules of reinforcement, and the behavioral
dynamic of resistance to change, both of which
are important dimensions of operant behavior and
are analogous to velocity and mass in physics.
- Behavioral momentum refers to behavior persisting
in the presence of a particular stimulus despite
disruptive factors.
Schedule Performance in Transition
- Early performance on a schedule is referred to as
transition-state performance.
- Most learning occurs during such transition
periods.
- Stable performance
- Ratio strain
- Interreinforcement intervals
- Transitions between reinforcement schedules and
life transitions
Schedule Performance in Transition
- After steady-state performance is established on
CRF, you are faced with the problem of how to
program the steps from CRF to a large schedule
such as FR 100.
- Notice that there is a large increase in the
amount of bar pressing required for
reinforcement.
- If you simply move from CRF to a large ratio
value in one step, the animal will show ratio
strain, in the sense that it produces longer and
longer pauses after reinforcement.
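One way to program the steps from CRF to FR 100 is to raise the ratio by a modest percentage at a time. The sketch below assumes a roughly 25% step size, which is an illustrative choice, not a value from the chapter.

```python
# Illustrative sketch of schedule thinning: stepping the ratio up
# gradually from CRF (FR 1) toward FR 100 instead of jumping in one
# step, to avoid ratio strain.

def thinning_steps(start, target, step_factor=1.25):
    """Return successive FR values, raising the ratio ~25% per step
    (always by at least 1 response), capped at the target."""
    ratio = start
    steps = [ratio]
    while ratio < target:
        ratio = min(target, max(ratio + 1, int(ratio * step_factor)))
        steps.append(ratio)
    return steps

print(thinning_steps(1, 100))
# [1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 15, 18, 22, 27,
#  33, 41, 51, 63, 78, 97, 100]
```

In practice the animal's performance, not a fixed formula, dictates when to advance: each step is held until responding is stable before the ratio is raised again.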
Schedule Performance in Transition
- Large and sudden increases in schedule
requirements may produce extinction; this is why
a slow progression to a higher schedule is
implemented.
- Transitions in schedules also occur in major life
events, such as divorce.
- Following a divorce, a shift in contingencies of
reinforcement takes place.
- Feelings of depression and loneliness may be
produced by ratio strain and extinction.
Schedules and Cigarettes
- The use of drugs is operant behavior maintained
by the reinforcing effects of the drug.
- The effectiveness of a contingency for
abstaining from drug use depends on the magnitude
and schedule of reinforcement for nondrug use.
- A population of smokers (N = 60) was assigned to
one of three groups: progressive reinforcement
(n = 20), fixed-rate reinforcement (n = 20), and
control (n = 20). Carbon monoxide (CO) testing
detected abstinence (or lack thereof) from
smoking.
- Money served as the reinforcer for abstaining
from smoking, in conjunction with a response-cost
contingency.
Applying Schedules to Smoking
- Smokers in both experimental groups passed 80%
of the CO tests, while the control group passed
40% of the tests.
- 22% of the progressive group resumed smoking,
while 60% of the fixed group and 82% of the
control group resumed smoking.
- A progressive reinforcement schedule appears
effective in the short run for abstinence from
smoking. Further research is needed to indicate
whether this schedule is effective for long-term
abstinence.
Rate of Response on Schedules
- Molecular accounts
- Molar accounts
- Interresponse times (IRTs): molecular
- Generally, ratio schedules produce shorter IRTs
and consequently higher rates of response than
interval schedules.
- IRT length as an operant
- VR and VI response rates
Schedules and IRTs
FR and VR schedules differentially reinforce
short IRTs: the shorter the time interval between
consecutive responses, the more probable
reinforcement becomes. Ratio schedules reinforce
bursts of responding, like a machine gun.
Interval schedules, whether fixed or variable,
tend to reinforce longer IRTs, such as
respond-wait-respond-wait, since responding
faster does not make reinforcement more probable.
On such schedules, only a single response is
required, as long as that single response occurs
after the interval has elapsed. As a result, a
paced response rate with delays between
consecutive responses is likely to produce
reinforcement. These are molecular accounts of
response rate.
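An IRT is nothing more than the gap between consecutive responses, so the molecular account can be illustrated with a few lines of code. The example sequences are invented to contrast the two patterns named above.

```python
# Sketch: interresponse times (IRTs) are the gaps between consecutive
# responses. Short IRTs mean bursts (ratio-like); long IRTs mean
# paced respond-wait-respond (interval-like).

def irts(response_times):
    """Return the list of IRTs for a sequence of response times."""
    return [b - a for a, b in zip(response_times, response_times[1:])]

burst = [0.0, 0.2, 0.4, 0.6]    # machine-gun responding
paced = [0.0, 5.0, 10.0, 15.0]  # respond-wait-respond-wait
print(irts(burst))  # gaps of about 0.2 s each
print(irts(paced))  # [5.0, 5.0, 5.0]
```

Selection then does the rest: whichever IRT lengths a schedule pays off become more frequent, which is what "IRT length as an operant" means.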
Molar Accounts of Rate
- The overall relationship between responses and
reinforcement
- The correlation between responses and
reinforcement produces the rate differences on
interval and ratio schedules.
VR and VI Response Rates
- VR yoked to VI
- Rates are higher on VR schedules even when the
rate of reinforcement is the same.
- Why?
Variable schedules produce more consistent
responding (no pauses or scallops) than fixed
schedules. Why? VR schedules produce faster
response rates than VI. Why?
Molar Accounts of Response Rate
Consider a subject responding on a VR 100
schedule for a 50-minute session. If this subject
responds leisurely at 0.8 responses per second
(48 responses per minute), each VR 100 takes
about 2 minutes, and about 24 reinforcers are
obtained in the 50 minutes. If, on the same VR
100, the subject responded at 2 responses per
second (120 responses per minute), each VR 100
would take about 50 seconds, and many more
reinforcers (about 60) would be obtained in the
same 50 minutes. Here, increases in response rate
are correlated with more frequent reinforcement.
On interval schedules, by contrast, increases in
response rate are not correlated with increases
in reinforcement rate. Ratio schedules produce
higher response rates not because of timing on
the order of IRT length but because of timing
over entire sessions.
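The arithmetic in this molar account is easy to check directly: on a ratio schedule, reinforcers earned scale with total responses emitted. The function name is invented for the example.

```python
# Worked check of the molar account: reinforcers earned on a VR 100
# schedule over a 50-minute session at two different response rates.

def vr_reinforcers(mean_ratio, resp_per_sec, session_min):
    """Average reinforcers earned when every `mean_ratio` responses
    (on average) produce one reinforcer."""
    total_responses = resp_per_sec * session_min * 60
    return total_responses / mean_ratio

print(vr_reinforcers(100, 0.8, 50))  # 24.0 at the leisurely rate
print(vr_reinforcers(100, 2.0, 50))  # 60.0 at the fast rate
```

On an interval schedule the same doubling of response rate would leave the reinforcers earned essentially unchanged, which is the molar explanation of the rate difference.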
Rates of Response
Ratio schedules produce higher response rates
than interval schedules. Why?
- Feedback function: ratio schedules selectively
reinforce high response rates because increased
response rate means more reinforcement. The
faster the subject responds, the sooner the next
reinforcement is obtained. Interval schedules
preferentially reinforce long interresponse
times (IRTs) because the longer you pause, the
more likely the first response after the pause
will be reinforced. Rate of responding is
irrelevant to obtaining the next reinforcement
any sooner.
Postreinforcement Pause
- Generally, it is well established that the
postreinforcement pause is a function of the
interreinforcement interval (IRI).
- PRP as a function of IRI:
- On FI, the PRP is approximately half the
interval value.
- On FR, PRP length increases as the size of the
FR is increased; but, problematically, rate of
response also increases.
Analysis of Reinforcement Schedules
- FR postreinforcement pause theories:
- Fatigue: a larger ratio produces a longer PRP as
the subject catches its breath. But why is there
no PRP on VRs with large ratios?
- Satiation: right after consuming reinforcement,
the subject is in a state of relative satiation.
But PRPs occur after non-consumable reinforcers.
- Remaining responses: on a multiple schedule, two
or more schedules alternate, each with its own SD
and its own reinforcement. If a large ratio
alternates with a shorter ratio, the PRP will be
longest after the shorter ratio. A PRP might
better be called a preratio pause.
PRP in a Multiple Schedule
On a multiple schedule, two or more basic
schedules alternate, each one with its own SD and
primary reinforcement, as in: blue key light,
FR 50 → SR; red key light, FR 5 → SR. On a
multiple schedule such as this, the PRP is
typically longer after the FR 5 than after the
FR 50. The PRP is more a function of the upcoming
ratio than of the ratio just completed.
Molar Accounts of Pausing
- PRPs are typically about half of the IRI, with a
roughly normal distribution of values over the
IRI.
- An animal that was sensitive to the overall rate
of reinforcement in experimental sessions would
be predicted to emit pauses that are on average
about half of the FI value, a maximization view.
Molecular Accounts of Pausing
- During the PRP, an animal engages in other
behavior, typically schedule-induced behavior
maintained by its own reinforcement.
- Work time, or run of responses: the greater the
effort expended to obtain the current reinforcer,
the more the next upcoming reinforcer is
devalued.
Why Research Such Questions?
- Behavioral dynamics: schedule performances are
analyzed as being due to a few basic processes,
in an analogy to physics.
Rate of Response on Schedules
- Dynamic interactions between:
- Molecular aspects: moment-to-moment
relationships
- Molar aspects: the length of the session
Schedules Used to Condition Response Rate
1) Differential reinforcement of low rates of
responding (DRL): reinforcers follow only
responses that occur after a minimum amount of
time has elapsed between two consecutive
responses.
Example: DRL 10 s
Response → 10-s pause → Response → Reinforcement
If a response occurs during the delay, a
reinforcer is not given. DRL schedules reinforce
very long IRTs.
2) Differential reinforcement of high rates of
responding (DRH): reinforcers follow only
responses that occur before a specified amount of
time has passed.
Example: DRH 10 s
Response → 10-s pause → Response → No reinforcement
Two consecutive responses must occur within the
10 seconds to be reinforced; the IRT between two
consecutive responses must be less than some
specified value. A DRH schedule reinforces very
short IRTs.
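Both DRL and DRH reduce to comparing an IRT against a criterion t, in opposite directions. The sketch below shows that decision rule; the function names and example values are illustrative.

```python
# Sketch of DRL/DRH decision logic: both schedules compare the IRT
# (time since the previous response) against a criterion t, but in
# opposite directions.

def drl_reinforced(irt, t):
    """DRL: reinforce only if the IRT is at least t (slow pacing)."""
    return irt >= t

def drh_reinforced(irt, t):
    """DRH: reinforce only if the IRT is shorter than t (fast bursts)."""
    return irt < t

# DRL 10 s: a 12-s wait earns reinforcement; a 4-s wait does not.
print(drl_reinforced(12, 10), drl_reinforced(4, 10))   # True False
# DRH 10 s: a 4-s IRT earns reinforcement; a 12-s IRT does not.
print(drh_reinforced(4, 10), drh_reinforced(12, 10))   # True False
```

The same contingency thus selects either very long or very short IRTs depending on which side of t is reinforced.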
Sanford (IQ 65), a prisoner in the Georgia prison
system: the incentive value of the reinforcer
(points) was higher if Sanford learned faster.
1 grade level in 90 days (DRH 90) → 120 points
(DRH 4) → 900 points
(DRH 1) → 4,700 points
With points exchangeable for tangible privileges
and goods, studying hard became so reinforcing
that Sanford started skipping recreation time to
study. Sanford completed 5 years of high school
in 5 months! He was being differentially
reinforced for learning fast (a high rate).
Intelligence and IQ
Some students may question whether the subject
Sanford actually had an IQ of 65. What is the IQ
test measuring: intelligence, or the subject's
ability to do well on such tests? Cognitive
psychologists would of course argue that some
partially innate intellectual or
information-processing capability is being
assessed; hence the notorious bell-curve data
purporting to show that different races have
different IQs. But what if the IQ test is just
measuring the ability to take such tests, which
could be affected by variables such as the
motivation to work hard? African American
students who took IQ tests and were given
affirmative feedback for each correct answer
produced IQ test scores 10 to 15 points higher
than African Americans tested without this
feedback. The use of feedback had no real effect
on white students taking IQ tests.