Reinforcement Learning - PowerPoint PPT Presentation

Slides: 15
Provided by: harva79
Transcript and Presenter's Notes

1
Reinforcement Learning
  • Ruti Glick
  • Bar-Ilan university

2
Learning
  • Result of interaction between an agent and the
    world
  • Percepts received by an agent should be used not
    only for acting, but also for improving the
    agent's ability to behave optimally in the future
    to achieve the goal.

3
Supervised Learning
  • Learn from examples
  • E.g. decision trees
  • Environment provides input / output pairs
  • Learn functions and probability models
  • You can think of it as having a kind teacher

4
Supervised Learning
  • Problems
  • Difficulty of supplying a large number of
    examples
  • Example
  • Train robot to juggle
  • Board state in chess

5
Reinforcement learning
  • Agent receives some evaluation of its action
  • not told of which action is the correct one to
    achieve its goal

6
Example: playing chess
  • Supervised Learning
  • Gets examples of board states with the best move
    in each state
  • Reinforcement learning
  • Tries random moves
  • Learns about the environment
  • What the board will look like after performing
    an action
  • What the opponent will do
  • Must get rewards / reinforcement
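
The trial-and-error loop described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Corridor world, its reward of 1.0 at the goal cell, and the function run_random_episode are hypothetical stand-ins for chess, which is far too large for a short example. The agent is never told which move is correct; it only sees the next state and a reward.

```python
import random

class Corridor:
    """Hypothetical toy world: cells 0..4, start at cell 0, goal at cell 4."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the walls clamp the position
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0  # an evaluation, not "the correct move"
        return self.state, reward, done

def run_random_episode(env, rng, max_steps=1000):
    """Try random moves and record every transition that was experienced."""
    state = env.reset()
    history = []
    for _ in range(max_steps):
        action = rng.choice([-1, 1])
        next_state, reward, done = env.step(action)
        history.append((state, action, next_state, reward))
        state = next_state
        if done:
            break
    return history

history = run_random_episode(Corridor(), random.Random(0))
print(len(history), "steps, last reward:", history[-1][3])
```

The recorded (state, action, next state, reward) tuples are the raw material a learner would use to find out how the world responds to its actions.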

7
Types of Reinforcement 1
  • Positive Reinforcement
  • pleasurable consequence administered after a
    desired behavior
  • strengthens behavior
  • e.g. praising a dog after it performs a trick
  • Extinction
  • withholding positive reinforcement following an
    undesirable behavior
  • reduces behavior
  • e.g. imposing early curfew on a child who stayed
    out too late

8
Types of Reinforcement 2
  • Punishment
  • an unpleasant consequence administered following
    an undesirable behavior
  • reduces behavior
  • e.g. a choke chain for a dog
  • Negative Reinforcement
  • withholding an unpleasant consequence following a
    correct behavior
  • strengthens behavior
  • e.g. a boxer learning to block a jab

9
Reinforcement Schedules 1
  • Continuous Reinforcement
  • every behavior is reinforced
  • Partial Reinforcement
  • not every behavior reinforced
  • Fixed interval: a fixed time interval passes
    between reinforcements
  • Variable interval: the time interval varies
    between a min and a max
  • Final Reinforcement
  • At the terminal states
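
As a rough illustration, the schedules on this slide can be written as small predicates that decide at which time steps a reinforcement is delivered. All names here (continuous, fixed_interval, variable_interval, final_only, reinforced_steps) are hypothetical, and treating one agent step as one unit of time is an assumption made for the sketch.

```python
import random

def continuous(step, is_terminal):
    """Every behavior is reinforced."""
    return True

def fixed_interval(interval):
    """Reinforce once every `interval` steps."""
    def schedule(step, is_terminal):
        return step % interval == 0
    return schedule

def variable_interval(lo, hi, rng):
    """Reinforce after a gap drawn uniformly from lo..hi steps."""
    next_time = rng.randint(lo, hi)
    def schedule(step, is_terminal):
        nonlocal next_time
        if step >= next_time:
            next_time = step + rng.randint(lo, hi)
            return True
        return False
    return schedule

def final_only(step, is_terminal):
    """Reinforce only at the terminal state."""
    return is_terminal

def reinforced_steps(schedule, n_steps):
    """List the steps (1..n_steps) at which the schedule delivers a reward."""
    return [t for t in range(1, n_steps + 1) if schedule(t, t == n_steps)]

print(reinforced_steps(continuous, 6))         # [1, 2, 3, 4, 5, 6]
print(reinforced_steps(fixed_interval(3), 10)) # [3, 6, 9]
print(reinforced_steps(final_only, 10))        # [10]
print(reinforced_steps(variable_interval(2, 4, random.Random(1)), 20))
```

The last line depends on the random seed, so its output is not shown; the gaps between consecutive reinforced steps always fall between 2 and 4.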

10
Reinforcement Schedules 2
  • Continuous schedule
  • results in faster learning
  • but fastest extinction if a reinforcement is
    missed
  • Variable schedule
  • is most effective for developing more
    permanent behavior

11
Properties 1
  • Accessible / inaccessible environment
  • In an accessible environment, the state can be
    identified by percepts
  • Otherwise the agent must keep track of the
    environment
  • Existing knowledge
  • Does the agent have knowledge of the environment
    and the effects of its actions?
  • If not, it has to learn the model
  • Rewards - Schedules
  • When does the agent get them?

12
Properties 2
  • Reward types
  • Components of actual utility
  • Score in ping pong
  • Dollars in betting
  • Hints for utility
  • "bad dog"
  • "hot"
  • Learning type
  • Passive or active

13
Passive / active learner
  • Passive learner
  • Watches the world go by
  • Tries to learn the utility of being in various
    states
  • Active learner
  • Acts using its learned information
  • Must experience as much of the environment as
    possible
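
One simple way a passive learner could estimate utilities is to average the reward-to-go it observes from each state while watching episodes go by. The slides do not name a method; this sketch uses direct utility estimation as one possible choice. The function name estimate_utilities, the state labels, the reward values, and the convention that each reward is attached to the state in which it is received are all assumptions made for illustration.

```python
from collections import defaultdict

def estimate_utilities(episodes, gamma=1.0):
    """Average the observed reward-to-go for every visited state.

    Each episode is a list of (state, reward) pairs in the order observed.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for episode in episodes:
        reward_to_go = 0.0
        # Walk backwards so the running sum is the return from each state on.
        for state, reward in reversed(episode):
            reward_to_go = reward + gamma * reward_to_go
            totals[state] += reward_to_go
            counts[state] += 1
    return {state: totals[state] / counts[state] for state in totals}

# Two hypothetical observed episodes: one ends well, one ends badly.
episodes = [
    [("A", -0.1), ("B", -0.1), ("goal", 1.0)],
    [("A", -0.1), ("pit", -1.0)],
]
utilities = estimate_utilities(episodes)
for state in sorted(utilities):
    print(state, round(utilities[state], 2))
```

The learner here never chooses an action; it only watches. An active learner would additionally have to decide which actions to try in order to visit states it has not yet seen.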

14
Agent types
  • Utility-based agent
  • Learns the utility of states
  • Uses it to select the best action
  • Must know the model of the environment
  • E.g. in backgammon, must know the legal moves and
    their effects
  • Q-learning agent
  • Learns the expected utility of taking a given
    action in a given state
  • Doesn't need to know the effects of moves, only
    which moves are legal
  • Can't look ahead
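
A minimal sketch of a tabular Q-learning agent, assuming a hypothetical five-cell corridor with a reward of 1.0 at the goal. The learning rate alpha, discount gamma, exploration rate epsilon, and all function names are illustrative choices, not taken from the slides. Note that the learner only uses the list of legal moves (ACTIONS); it calls step to act in the world but never inspects how step works, so it cannot look ahead.

```python
import random
from collections import defaultdict

ACTIONS = (-1, +1)  # legal moves: one cell left or right
GOAL = 4            # hypothetical corridor with cells 0..4

def step(state, action):
    """Environment dynamics; the learner calls this but never inspects it."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def q_update(q, state, action, reward, next_state, done, alpha=0.5, gamma=0.9):
    """Move Q(s, a) toward reward + gamma * max over a' of Q(s', a')."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])

def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice over the legal moves, breaking ties at random."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])

def train(episodes=200, epsilon=0.2, seed=0, max_steps=10_000):
    rng = random.Random(seed)
    q = defaultdict(float)  # unseen (state, action) pairs start at 0
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            q_update(q, state, action, reward, next_state, done)
            state = next_state
            if done:
                break
    return q

q = train()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy)
```

A utility-based agent would instead store one value per state and would need the transition model (here, the body of step) to compare the outcomes of its moves.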