Operant Conditioning - PowerPoint PPT Presentation

Slides: 39
Provided by: kou95

1
Operant Conditioning
  • Chapter 13, Unit 4 Psychology

2
Introduction
  • While classical conditioning (CC) is useful for
    explaining some learned behaviour, there are many
    other learned behaviours that CC cannot explain,
    such as behaviours that are voluntary.
  • Much of our learning occurs by trial and error.
  • We all make adjustments to our behaviour
    according to the outcomes or consequences it
    produces.
  • Operant conditioning is the learning that takes
    place as a result of these consequences.

3
Trial and error learning
  • Describes an organism's attempts to learn, or to
    solve a problem, by trying alternative
    possibilities until a correct solution or desired
    outcome is achieved.
  • Involves a number of attempts (trials) and a
    number of incorrect choices (errors) before the
    correct behaviour is learned.
  • Also referred to as instrumental learning, as
    the individual is instrumental in learning the
    correct response.
  • More recently, however, this learning has been
    referred to as operant conditioning, because the
    individual operates on the environment to solve
    a problem.

4
Trial and error learning (cont.)
  • It involves:
    • Motivation
    • Exploration
    • Incorrect and correct responses
    • Reward
  • Receiving a reward of some kind leads to the
    repeated performance of the correct response,
    strengthening the association between the
    behaviour and its outcome.

5
Thorndike's experiments with cats
  • American psychologist Edward Lee Thorndike
    (1874-1949) undertook the first studies of trial
    and error learning.
  • He put a hungry cat in a puzzle box and placed
    a piece of fish outside the box where it could be
    seen (and smelt) but was just outside the
    cat's reach.
  • The cat had to learn to escape from the box by
    operating a latch to release a door on the side
    of the box.
  • Learning was measured by the time it took the
    cat to escape on consecutive trials.

6
The law of effect
  • The results of these experiments led Thorndike
    to develop the law of effect.
  • It states that a behaviour that is followed by
    satisfying consequences is strengthened and a
    behaviour that is followed by annoying
    consequences is weakened.
  • In the puzzle-box experiments, behaviour that
    enabled the cat to escape and get to the food
    (satisfying) was more likely to occur and
    behaviour that kept the cat in the box (annoying)
    was less likely to occur.
  • The cat became instrumental in obtaining its
    release to get the food.
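
The law of effect can be sketched as a simple update rule. This is only an illustration, not Thorndike's own formulation: the numeric "strength" scale (0 to 1), the step size of 0.1 and the function name are all assumptions introduced for the example.

```python
def apply_law_of_effect(strengths, behaviour, satisfying, step=0.1):
    """Return a copy of `strengths` with one behaviour strengthened or weakened.

    Hypothetical model: a satisfying consequence raises the behaviour's
    strength by `step`; an annoying consequence lowers it by `step`.
    Strengths are kept within the range 0.0 to 1.0.
    """
    change = step if satisfying else -step
    updated = dict(strengths)
    updated[behaviour] = min(1.0, max(0.0, updated[behaviour] + change))
    return updated


# Both behaviours start out equally likely (assumed starting values).
cat = {"operate latch": 0.5, "scratch at walls": 0.5}
cat = apply_law_of_effect(cat, "operate latch", satisfying=True)
cat = apply_law_of_effect(cat, "scratch at walls", satisfying=False)
```

After one trial of each kind, the latch response is stronger than the wall-scratching response, which mirrors the puzzle-box result described above.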

7
Operant conditioning
  • The term operant conditioning (OC) was not
    introduced until some years after Thorndike's
    experiments with cats escaping from puzzle boxes.
  • The term was coined by American psychologist
    Burrhus Frederic (B. F.) Skinner.
  • He referred to the responses observed in trial
    and error learning as operants.
  • An operant is a response that occurs and acts on
    the environment to produce some kind of effect.
  • OC is based on the principle that an organism
    will tend to repeat behaviours that have
    desirable consequences, or that will enable it to
    avoid undesirable consequences.
  • Organisms will tend not to repeat behaviours that
    have undesirable consequences.

8
Burrhus Skinner (1904-1990)
  • Began his own experiments in the 1930s, but used
    the term operant conditioning to emphasise that
    animals and people learn to operate on the
    environment to produce desired consequences.
  • He also contrasted operants with respondents in
    CC.
  • Respondents are behaviours elicited by known or
    recognised stimuli (e.g. the meat powder making
    the dog salivate in Pavlov's experiment).
  • He believed that all behaviour can be explained
    by the relationships between the behaviour, its
    antecedents (the events that precede or come
    before it), and its consequences.
  • He argued that any behaviour that is followed by
    a consequence will change in strength and
    frequency depending on the nature of that
    consequence.

9
The Skinner box
  • He created an apparatus called a Skinner box,
    which is a small chamber in which an experimental
    animal learns to make a particular response for
    which the consequences can be controlled by the
    researcher.
  • It is attached to a cumulative recorder which
    indicates how often each response is made
    (frequency) and the rate of response (speed).

10
Skinner's experiments with rats
  • 1938: Skinner uses the box to demonstrate OC.
  • 1. A hungry rat is placed in the box.
  • 2. It scurries around and randomly touches parts
    of the floor and walls.
  • 3. The rat accidentally presses the lever and rat
    food is released into the box.
  • After additional repetitions, the rat's random
    acts subsided and were replaced with more
    consistent lever pressing.
  • Eventually, the rat was pressing the lever as
    fast as it could eat each pellet.
  • The pellet is the reward for the correct response.
  • Skinner referred to the different types of
    rewards as reinforcers.

11
Skinner's experiments (cont.)
  • The hunger of the rats was their motivation for
    frantic activity.
  • Skinner believed that there was no need to search
    for internal agents to explain changes in
    behaviour.
  • This view was based on the notion that behaviour
    can be understood in terms of environmental or
    external influences.

12
Elements of operant conditioning
  • Reinforcement
  • Punishment

13
Elements of operant conditioning
  • Central to OC is reinforcement (reward).
  • A response that is rewarded is strengthened,
    whereas one that is punished is weakened.

14
Reinforcement
  • Reinforcement may involve receiving a pleasant
    stimulus or escaping an unpleasant stimulus.
  • Reinforcement is applying a positive stimulus or
    removing a negative stimulus to subsequently
    strengthen or increase the likelihood of a
    particular response that it follows.
  • A reinforcer is any object or event that
    increases the probability that an operant
    behaviour will occur again.
  • The term reinforcer is often used interchangeably
    with the term reward although they are not
    technically the same.
  • One difference is that a reward suggests an
    outcome that is positive, whereas a stimulus is a
    reinforcer only if it strengthens the preceding
    behaviour.
  • Also, a stimulus can be rewarding because it is
    pleasurable, but it cannot be said to reinforce
    unless it increases the likelihood of a response
    occurring.
  • E.g. a person might enjoy eating chocolate and
    find it pleasurable, but chocolate cannot be
    considered to be a reinforcer unless it promotes
    or strengthens a particular response.

15
Schedules of reinforcement
  • The schedule of reinforcement is the way in which
    the reinforcement is delivered in experimental
    settings.
  • It influences the speed of learning and the
    strength of the learned response.
  • Reinforcement may be provided on a continuous or
    partial reinforcement schedule.
  • Continuous reinforcement is when every correct
    response in the early stages of learning is
    reinforced (the reinforcer is typically provided
    immediately after every correct response).
  • Partial reinforcement is the process of
    reinforcing some correct responses but not all of
    them. It may be delivered in a number of ways, or
    by different schedules.

16
Schedules of reinforcement (cont.)
  • The term schedule of reinforcement refers to the
    frequency and manner in which a desired response
    is reinforced.
  • For instance, reinforcement can be given after a
    certain number of correct responses have been
    made (i.e. a ratio) or after a certain amount of
    time has passed since a correct response (i.e. an
    interval).
  • Furthermore, reinforcement may be given on a
    regular basis, such as after every 6th correct
    response, or every 30 seconds following a correct
    response (that is, fixed), or it may be
    unpredictable (that is, variable).
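
The difference between these schedules can be made concrete with a small sketch. It is an illustration only: the function names, the use of response times in seconds, and the way the variable schedule picks its unpredictable requirement (a random whole number averaging `mean_n`) are assumptions made for this example, not part of the original slides.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th correct response (e.g. every 6th)."""
    count = 0

    def schedule(time_s):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return schedule


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses averaging mean_n."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)

    def schedule(time_s):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return schedule


def fixed_interval(seconds):
    """Reinforce the first correct response made at least `seconds`
    after the previous reinforcer (or after the start of the session)."""
    last_reinforced = 0.0

    def schedule(time_s):
        nonlocal last_reinforced
        if time_s - last_reinforced >= seconds:
            last_reinforced = time_s
            return True
        return False

    return schedule


def run(schedule, response_times):
    """Return the times of the correct responses that were reinforced."""
    return [t for t in response_times if schedule(t)]


one_per_second = list(range(1, 13))               # 12 correct responses
every_sixth = run(fixed_ratio(6), one_per_second)  # responses 6 and 12
every_30_s = run(fixed_interval(30), [10, 29, 30, 45, 61, 90])
unpredictable = run(variable_ratio(6, random.Random(0)), range(1, 601))
```

With the fixed schedules the learner could, in principle, predict which response will be reinforced; with the variable schedule the gap between reinforcers changes each time.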

17
Positive reinforcement
  • The food pellet in the Skinner box is a positive
    reinforcer for the hungry rat pressing the lever.
  • A positive reinforcer is a stimulus that
    strengthens or increases the likelihood of a
    desired response by providing a satisfying
    consequence (reward).
  • Positive reinforcement occurs from giving or
    applying a positive reinforcer after the desired
    response has been made.

18
Negative reinforcement
  • A negative reinforcer is any unpleasant or
    aversive stimulus that, when removed or avoided,
    strengthens or increases the likelihood of a
    desired response.
  • Negative reinforcement is the removal or
    avoidance of an unpleasant stimulus. It has the
    effect of increasing the likelihood of a response
    being repeated.
  • E.g. a Skinner box has a grid on the floor
    through which a mild electrical current can be
    passed continuously. The rat can feel the
    unpleasant foot shock (stimulus). When the rat
    presses the lever, the electric current is
    switched off and the mild shock is taken away.
  • The removal of the shock (negative reinforcer) is
    referred to as negative reinforcement.

19
Distinction between positive and negative reinforcers
  • Positive reinforcers are given and negative
    reinforcers are removed or avoided.
  • Yet because both procedures lead to desirable
    consequences, each procedure strengthens
    (reinforces) the behaviour that produced the
    consequence.
  • Examples of negative reinforcement in everyday
    life
  • Turning off a scary video
  • Taking an aspirin to remove a headache
  • Not drink-driving for fear of losing your license
  • In these examples, the removal of the negative
    reinforcer is providing a satisfying or desirable
    consequence.

20
A quick calculation
  • Positive reinforcer (+): adding something
    pleasant
  • Negative reinforcer (-): subtracting something
    unpleasant

21
Punishment
  • Punishment is the delivery of an unpleasant
    stimulus following a response, or the removal of
    a pleasant stimulus following a response.
  • Although a punisher has the same unpleasant
    quality as a negative reinforcer, a punisher is
    given or applied, whereas a negative reinforcer
    is removed or avoided.
  • When closely associated with a response,
    punishment weakens the response, or decreases the
    probability of that response occurring again over
    time.
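
The definitions of reinforcement and punishment given so far can be summarised as a lookup on two questions: is the stimulus pleasant or unpleasant, and is it added or taken away? The sketch below is one way to write that down; the function name and the two yes/no parameters are assumptions introduced for the example.

```python
def classify_consequence(stimulus_is_pleasant, stimulus_is_added):
    """Return (type of consequence, effect on the preceding response)."""
    if stimulus_is_pleasant and stimulus_is_added:
        # e.g. a food pellet delivered after a lever press
        return ("positive reinforcement", "strengthens")
    if not stimulus_is_pleasant and not stimulus_is_added:
        # e.g. a mild foot shock switched off after a lever press
        return ("negative reinforcement", "strengthens")
    # An unpleasant stimulus delivered, or a pleasant stimulus removed.
    return ("punishment", "weakens")


pellet = classify_consequence(stimulus_is_pleasant=True, stimulus_is_added=True)
shock_off = classify_consequence(stimulus_is_pleasant=False, stimulus_is_added=False)
detention = classify_consequence(stimulus_is_pleasant=False, stimulus_is_added=True)
```

Both kinds of reinforcement strengthen the response; only the two punishment cases weaken it.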

22
Factors that influence the effectiveness of
reinforcement and punishment
  • Reinforcement is intended to increase the
    likelihood of a behaviour being repeated and
    punishment is intended to decrease the likelihood
    of behaviour being repeated.
  • In OC, what happens after the desired response is
    performed is very important in determining the
    strength of learning and the rate at which it
    occurs.
  • E.g. the point in the process at which the
    consequence is presented, the time lapse between
    the response and the consequence, and the
    appropriateness of the consequence used are all
    important in determining the effectiveness of
    reinforcement or punishment, and therefore
    learning.

23
Order of presentation
  • For reinforcement or punishment to be used
    effectively, it must be presented after the
    response, never before.
  • This ensures that an organism learns the
    consequences of a particular response.
  • E.g., presenting a child with a lolly each time
    they use the toilet instead of their nappy while
    they are being toilet trained.

24
Timing
  • Reinforcement and punishment are most effective
    when given immediately after the response has
    occurred.
  • This allows for association between the response
    and the reinforcer or punisher.
  • It also influences the strength of the response,
    e.g., if there is a delay, the learning will
    generally be very slow to progress and in some
    cases may not occur at all.
  • This is easily controlled in a lab, but not as
    easy in everyday life.
  • E.g. a delay between studying hard in Year 12 and
    receiving your desired ENTER.
  • Or receiving a detention for misbehaviour can
    occur more than one day after the misdemeanour.

25
Appropriateness
  • For any stimulus to be a reinforcer, it must
    provide a pleasing or satisfying consequence
    (reward) for its recipient.
  • Technically, it will not be known if something
    will act as reinforcer until after it has been
    used.
  • Also it cannot be assumed that a reinforcer that
    works in one situation will work in another.
  • Similarly, for any stimulus to be an appropriate
    punisher, it must provide a consequence that is
    unpleasant and therefore likely to decrease the
    likelihood of the undesirable behaviour.
  • An inappropriate punisher can have the opposite
    effect and produce the same consequence as a
    reinforcer.

26
Key processes in operant conditioning
  • Acquisition, extinction, spontaneous recovery,
    stimulus generalisation and stimulus
    discrimination are involved in both CC and OC.
    However, the way in which these processes occur
    is slightly different in operant conditioning.

27
Acquisition
  • Refers to the overall learning process during
    which a specific response is established.
  • Differs from acquisition in CC because the means
    by which the behaviour is acquired is different:
    the types of behaviours acquired through OC are
    usually more complex than the reflexive,
    involuntary responses that become learned
    responses in CC.
  • In OC, acquisition is the establishment of a
    response through reinforcement.
  • The speed at which the response is established
    depends on whether continuous or partial
    reinforcement is used.
  • Also, a gradual progression towards a more
    complex target behaviour can be achieved, by
    reinforcing successive approximations. This is
    known as shaping.

28
Acquisition (cont.)
  • Shaping is a procedure in which reinforcement is
    given for any response that successively
    approximates and ultimately leads to the final
    desired response, or target behaviour.
  • Consequently, shaping is also known as the method
    of successive approximations.
  • Skinner used shaping in one experiment in which
    the target behaviour was for a pigeon to turn a
    complete circle in an anticlockwise direction.
  • He initially continually reinforced the pigeon
    with a food pellet that was delivered through a
    mechanically operated door every time it turned
    slightly to the left.
  • He then waited until the pigeon turned left
    further before reinforcing it.
  • By limiting the reinforcement only to those
    responses that gradually edged towards the target
    behaviour, Skinner was able to condition the
    pigeon to turn complete circles regularly.
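
The bookkeeping behind shaping can be sketched in a few lines. This is not a model of how the pigeon learns; it only shows the rule the experimenter follows, that is, reinforce a response only if it meets the current criterion, then raise the criterion. The angles, the 45-degree step and the function name are assumptions made for the example.

```python
def shape(turns_deg, target_deg=360.0, first_criterion_deg=45.0, step_deg=45.0):
    """Apply the method of successive approximations to a list of turns.

    Each entry in `turns_deg` is how far the animal turned on one attempt.
    A turn is reinforced only if it meets the current criterion; after each
    reinforcer the criterion moves closer to the target behaviour.
    Returns the reinforced turns and the criterion reached.
    """
    criterion = first_criterion_deg
    reinforced = []
    for turn in turns_deg:
        if turn >= criterion:
            reinforced.append(turn)
            criterion = min(target_deg, criterion + step_deg)
    return reinforced, criterion


# Hypothetical sequence of left turns, in degrees.
attempts = [30, 50, 60, 100, 80, 140, 200, 360]
reinforced, criterion = shape(attempts)
```

Here the 60-degree and 80-degree turns go unreinforced even though they are larger than the first reinforced turn, because by then the criterion has already moved on.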

29
Extinction
  • In OC, extinction is the gradual decrease in the
    strength or rate of a conditioned (learned)
    response following consistent non-reinforcement
    of the response.
  • It is said to occur when a conditioned response
    is no longer present.
  • With OC, extinction occurs over time, but after
    reinforcement is no longer given.
  • E.g., when Skinner stopped reinforcing his rats
    or pigeons with food pellets, their conditioned
    response (e.g. of lever pressing or turning
    circles) was eventually extinguished.
  • Extinction is less likely to occur when partial
    reinforcement is used, i.e. when reinforcement
    does not regularly follow every correct response,
    as the uncertainty of the reinforcement leads to
    a greater tendency for the response to continue.

30
Spontaneous recovery
  • As in CC, extinction is often not permanent in
    OC.
  • After the apparent extinction of the conditioned
    response (CR), spontaneous recovery can occur and
    the organism will once again show the response in
    the absence of any reinforcement.

31
Stimulus generalisation
  • This occurs when the correct response is made
    (usually at a reduced level) to another stimulus
    that is similar (but not necessarily identical)
    to the stimulus that was present when the
    conditioned response was reinforced.
  • E.g. the sound of a car backfiring as it goes
    past an athletics carnival may cause the athletes
    to generalise this sound to that of the starter's
    pistol.

32
Stimulus discrimination
  • In OC, stimulus discrimination occurs when an
    organism makes the correct response to a stimulus
    and is reinforced, but does not respond to any
    other stimulus, even when stimuli are similar
    (but not identical).
  • Skinner taught lab animals to discriminate
    between similar stimuli by reinforcing some
    responses but not others.
  • E.g. a pigeon in a Skinner box could be taught to
    discriminate between a red and a green light, by
    reinforcing the pigeon when it pecked a target
    when the green light was illuminated, but not
    when the red one was.
  • Also, sniffer dogs are used in airports
    throughout the world to detect the smuggling of
    contraband items (e.g. drugs).
  • They have been taught this by OC.

33
Comparison of CC and OC
  • The role of the learner, the timing of the
    stimulus and response, and the nature of the
    response

34
Similarities of CC and OC
  • Acquisition
  • Extinction
  • Spontaneous recovery
  • Stimulus generalisation and discrimination
  • Both types of conditioning are achieved as a
    result of the repeated association of 2 events
    that follow each other closely in time.
  • These similarities have led some psychologists to
    believe that both OC and CC are variants of a
    single learning process.
  • E.g. when Little Albert learned to fear the rat,
    his response (trembling) was CC. But when he
    learned to avoid the rat by crawling away (a
    response that had the effect of reducing his
    fear), that was an example of OC.

35
Differences between CC and OC
  • OC
    • Emphasis on the consequences of a response.
    • Involves voluntary responses.
  • CC
    • The behaviour of the learner does not have any
      environmental consequences.
    • Response is involuntary.
36
The role of the learner
  • In CC the learner is relatively passive when
    either the conditioned stimulus (CS) or the
    unconditioned stimulus (UCS) is presented.
  • In OC the learner must actively operate on the
    environment so as to obtain the reinforcement or
    the punishment.

37
Timing of the stimulus and response
  • In CC the response depends on the presentation of
    the UCS occurring first.
  • In OC the presentation of the reinforcer depends
    on the response occurring first.
  • In CC, the timing of the 2 stimuli (CS, then
    UCS) produces an association between them that
    conditions the learner to anticipate the UCS and
    respond to it even if it is not presented.
  • In OC, the association that is conditioned is
    between the stimulus and the response.
  • In CC the timing of the 2 stimuli needs to be
    very close and the sequencing is vital: the CS
    must come before the UCS.
  • In OC, while learning generally occurs faster
    when the reinforcement or punishment occurs soon
    after the response, there can be a considerable
    time difference between them.

38
The nature of the response
  • In CC, the response by the learner is usually a
    reflexive, involuntary one.
  • In OC, the response by the learner is usually a
    voluntary one.
  • In CC, the response is often one involving the
    action of the autonomic nervous system, and the
    association of the 2 stimuli is often not a
    conscious or deliberate one.
  • In OC, the response is more likely to involve the
    central nervous system and to be conscious,
    intentional and often goal-directed.