Neural Networks Chapter 7

Slides: 35

Provided by: Joo71

1
Neural Networks Chapter 7

2
Recurrent Networks

3
Recurrent Networks

4
Recurrent Networks

Drawbacks
The length must be chosen in advance, which leads to a large
number of input units, a large number of training
patterns, etc.
Replace fixed time delays by filters
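
A minimal sketch of the first drawback, assuming the slide refers to a fixed window of delayed inputs (a tapped delay line) feeding a feedforward network. The function name `delay_line_inputs` and the sample signal are hypothetical. Every extra delay adds one input unit, and the window length has to be fixed before training starts.

```python
def delay_line_inputs(sequence, length):
    """Build one input vector per time step from a fixed-length window.

    Each vector holds the current value and the (length - 1) values
    before it. Time steps without enough history are skipped.
    """
    return [
        sequence[t - length + 1 : t + 1]
        for t in range(length - 1, len(sequence))
    ]


# Hypothetical signal, for illustration only.
signal = [0.1, 0.4, 0.9, 0.3, 0.7, 0.2]

# length=3 means the network needs 3 input units.
windows = delay_line_inputs(signal, length=3)
for window in windows:
    print(window)
```

Doubling `length` doubles the number of input units, which is the growth the slide warns about.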

5
Recurrent Networks

6
Recurrent Networks

7
Recurrent Networks

8
Recurrent Networks

9
Recurrent Networks
10
(No Transcript)
11
Recurrent Networks

12
Reinforcement Learning

Supervised learning with some feedback
Reinforcement Learning Problems
Class I: the reinforcement signal is always the same
for a given input-output pair
Class II: stochastic environment, with a fixed
reinforcement probability for each input-output pair
Class III: reinforcement and input patterns
depend on the past history of the network output
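
The difference between Class I and Class II can be sketched as two reinforcement functions. This is an illustration under assumptions, not the presenter's formulation: the reinforcement is taken to be +1 or -1, and the names `class_one_reinforcement`, `class_two_reinforcement`, `correct`, and `reward_prob` are hypothetical.

```python
import random


def class_one_reinforcement(x, y, correct):
    """Class I: the same (input, output) pair always gets the same signal."""
    return 1 if correct[x] == y else -1


def class_two_reinforcement(x, y, reward_prob, rng):
    """Class II: each (input, output) pair has a fixed reward probability."""
    return 1 if rng.random() < reward_prob[(x, y)] else -1


# Hypothetical task with two inputs and two possible outputs.
correct = {"a": 0, "b": 1}
reward_prob = {
    ("a", 0): 0.9, ("a", 1): 0.2,
    ("b", 0): 0.1, ("b", 1): 0.8,
}

rng = random.Random(0)
print(class_one_reinforcement("a", 0, correct))  # always the same for this pair
print([class_two_reinforcement("a", 0, reward_prob, rng) for _ in range(5)])
```

Class III is not covered here, because the reinforcement would also have to depend on the history of earlier outputs.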

13
Associative Reward-Penalty

14
Associative Reward-Penalty

15
Models and Critics
Environment
16
Reinforcement Comparison
Environment
Critic
17
Reinforcement Learning

Reinforcement-Learning Model
The agent receives an input I, which is some indication
of the current state s of the environment
The agent then chooses an action a
The action changes the state of the environment,
and the value of this state transition is communicated
to the agent through a scalar reinforcement signal r
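
The interaction cycle described above can be sketched as a loop. The two-state `ToyEnvironment`, its reward rule, and the `run` helper are hypothetical and exist only to make the sketch runnable; they are not taken from the slides.

```python
class ToyEnvironment:
    """Hypothetical two-state environment, for illustration only."""

    def __init__(self):
        self.state = 0

    def observe(self):
        # Input I: here the agent simply sees the full state s.
        return self.state

    def step(self, action):
        # The action changes the state; the value of the transition
        # comes back as a scalar reinforcement r.
        reinforcement = 1.0 if action == self.state else -1.0
        self.state = 1 - self.state
        return reinforcement


def run(env, policy, steps):
    """Run the observe / act / reinforce cycle and sum the reinforcement."""
    total = 0.0
    for _ in range(steps):
        observation = env.observe()   # input I
        action = policy(observation)  # action a
        total += env.step(action)     # reinforcement r
    return total


# A policy that repeats the observed state is always rewarded here.
total = run(ToyEnvironment(), policy=lambda obs: obs, steps=10)
print(total)
```

A learning agent would also update its policy from each reinforcement inside the loop; that step is left out to keep the cycle itself visible.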

18
Reinforcement Learning

Environment: You are in state 65. You have four
possible actions.
Agent: I'll take action 2.
Environment: You received a reinforcement of 7
units. You are now in state 15. You have two
possible actions.
Agent: I'll take action 1.
Environment: You received a reinforcement of -4
units. You are now in state 12. You have two
possible actions.
Agent: I'll take action 2.

19
Reinforcement Learning

The environment is non-deterministic:
the same action in the same state may result in different
next states and different reinforcements
The environment is stationary:
the probabilities of making state transitions or
receiving specific reinforcement signals do not
change over time
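
Both properties can be shown with a fixed probability table. This is a sketch under assumptions: the `TRANSITIONS` table, its numbers, and the `step` function are hypothetical. The outcome of a step is random (non-deterministic), but the table itself never changes (stationary).

```python
import random

# Hypothetical table:
# (state, action) -> list of (probability, next_state, reinforcement)
TRANSITIONS = {
    (0, 0): [(0.8, 0, 1.0), (0.2, 1, -1.0)],
    (0, 1): [(0.5, 1, 0.0), (0.5, 0, 2.0)],
    (1, 0): [(1.0, 0, 0.0)],
    (1, 1): [(0.3, 1, 5.0), (0.7, 0, -2.0)],
}


def step(state, action, rng):
    """Sample a next state and reinforcement from the fixed table."""
    outcomes = TRANSITIONS[(state, action)]
    weights = [p for p, _, _ in outcomes]
    _, next_state, reinforcement = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reinforcement


rng = random.Random(0)
# Same state, same action, repeated: the results may differ between calls.
samples = [step(0, 0, rng) for _ in range(200)]
print(set(samples))
```

A non-stationary environment would correspond to the entries of `TRANSITIONS` changing while the agent is running.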

20
Reinforcement Learning