Title: Hierarchical Reinforcement Learning
1. Hierarchical Reinforcement Learning
- Gideon Maillette de Buy Wenniger
- Recent Advances in Hierarchical Reinforcement Learning - Andrew G. Barto, Sridhar Mahadevan
- Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition - Thomas G. Dietterich
2. Reinforcement Learning, Formulas
- Value function, discounted reward
- Future expected discounted reward
- Bellman equations
- Optimal values (the standard formulas are written out below)
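The formulas themselves did not survive extraction; they are the standard definitions, reconstructed here in LaTeX (r_t is the reward at step t, gamma the discount factor, P the transition model):

    % Future expected discounted reward: value of state s under policy \pi
    V^{\pi}(s) = E\Big[ \sum_{t=0}^{\infty} \gamma^{t} r_{t} \;\Big|\; s_0 = s, \pi \Big]

    % Bellman equation for a fixed policy \pi
    V^{\pi}(s) = \sum_{a} \pi(s, a) \sum_{s'} P(s' \mid s, a) \big[ R(s, a, s') + \gamma V^{\pi}(s') \big]

    % Bellman optimality equations
    V^{*}(s)    = \max_{a} \sum_{s'} P(s' \mid s, a) \big[ R(s, a, s') + \gamma V^{*}(s') \big]
    Q^{*}(s, a) = \sum_{s'} P(s' \mid s, a) \big[ R(s, a, s') + \gamma \max_{a'} Q^{*}(s', a') \big]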
3. Value Iteration
- Dynamic programming update rules
- Q-learning update rule (off-policy)
- Sarsa update rule (on-policy)
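A minimal tabular sketch of these update rules in Python; the array shapes, default hyperparameters, and function names are illustrative assumptions rather than anything from the slides:

    import numpy as np

    def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
        """Off-policy Q-learning: bootstrap with the greedy action in s_next."""
        target = r + gamma * np.max(Q[s_next])
        Q[s, a] += alpha * (target - Q[s, a])

    def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
        """On-policy Sarsa: bootstrap with the action actually taken in s_next."""
        target = r + gamma * Q[s_next, a_next]
        Q[s, a] += alpha * (target - Q[s, a])

    def value_iteration(P, R, gamma=0.99, tol=1e-6):
        """Dynamic-programming update: apply the Bellman optimality backup until convergence.
        P[s, a, s'] holds transition probabilities, R[s, a] expected rewards."""
        V = np.zeros(P.shape[0])
        while True:
            Q = R + gamma * (P @ V)   # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] * V[s']
            V_new = Q.max(axis=1)
            if np.max(np.abs(V_new - V)) < tol:
                return V_new
            V = V_new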
4. Extension: SMDPs
- Extension of MDPs
- Amount of time between decisions is a random variable (discrete or continuous)
- Necessary for operations that take multiple timesteps
- The random variable denotes the waiting time
- New formulas (reconstructed below)
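The "new formulas" are the SMDP analogues of the Bellman equations; a reconstruction in standard notation, where the random waiting time tau enters through the discount factor:

    % SMDP Bellman optimality equations; P(s', \tau \mid s, a) is the joint
    % distribution over the next state and the waiting time \tau
    V^{*}(s)    = \max_{a} \Big[ R(s, a) + \sum_{s', \tau} \gamma^{\tau} P(s', \tau \mid s, a)\, V^{*}(s') \Big]
    Q^{*}(s, a) = R(s, a) + \sum_{s', \tau} \gamma^{\tau} P(s', \tau \mid s, a) \max_{a'} Q^{*}(s', a')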
5. Approaches to Hierarchical Reinforcement Learning
- Idea of the macro-operator
  - A sequence of actions that can be invoked as a single action
  - Macros can call other macros
- Hierarchical policies extend the macro idea
  - Specify termination conditions
  - Partial policies / temporally extended actions
6. Options approach
- Simplest option: Markov options (a minimal data-structure sketch follows this list)
  - Stationary stochastic policy
  - Termination condition
  - Input set
- Semi-Markov options: option policies may depend on the history since the option was called
- Expansion of each option into primitive actions
  - Flat policy
  - Non-Markovian
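A Markov option is exactly the triple listed above: an input (initiation) set, a stationary policy over primitive actions, and a per-state termination condition. A minimal sketch as a Python data structure; the class and field names are illustrative, not from the slides:

    from dataclasses import dataclass
    from typing import Callable, Set

    @dataclass
    class MarkovOption:
        # Input set I: states in which the option may be invoked
        initiation_set: Set[int]
        # Stationary policy pi over primitive actions (stochastic in general;
        # modelled here as a callable returning one action, for brevity)
        policy: Callable[[int], int]
        # Termination condition beta: probability of stopping in each state
        termination: Callable[[int], float]

        def can_start(self, state: int) -> bool:
            return state in self.initiation_set

A primitive action is then just the degenerate option that can start anywhere and always terminates after one step.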
7. Adapted value functions, update rules
- Event of an option being initiated at time t in state s
- Semi-Markov policy that follows o until it terminates after some number of timesteps and then continues according to the overall policy
- Analogous value-iteration step
- Analogous Q-learning step (both steps are reconstructed below)
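A reconstruction of these definitions and updates in the usual options notation, where k is the number of timesteps option o ran before terminating in s' and r the discounted reward accumulated over those steps; the exact symbols used on the slide were lost in extraction:

    % Option value: reward while o runs, then the value of continuing with \mu,
    % conditioned on o being initiated at time t in state s
    Q^{\mu}(s, o) = E\big[ r_{t+1} + \gamma r_{t+2} + \dots + \gamma^{k-1} r_{t+k}
                           + \gamma^{k} V^{\mu}(s_{t+k}) \big]

    % Analogue of the value-iteration step, over the options O_s available in s
    V(s) \leftarrow \max_{o \in O_s} E\big[ r + \gamma^{k} V(s') \big]

    % Analogue of the Q-learning step (SMDP Q-learning)
    Q(s, o) \leftarrow Q(s, o) + \alpha \big[ r + \gamma^{k} \max_{o'} Q(s', o') - Q(s, o) \big]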
8. MAXQ motivation
9. MAXQ value decomposition
10. MAXQ
- Decompose the task into a set of closed hierarchical subtasks M0, M1, ..., Mn (the resulting value decomposition is written out below)
- Subtasks have to be solved to complete the root task M0
- Assign a local reward to completing a subtask
- When a subtask is called it runs until it, or a subtask higher in the hierarchy, completes
- Use deterministic completion states
- Assign reward depending on the completion state
- Recursive instead of hierarchical optimality
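The MAXQ value function decomposition (slide 9), written out as in Dietterich's paper: the value of invoking child task a in state s inside parent task i splits into the value of a itself plus a completion function C for finishing i afterwards:

    % MAXQ value function decomposition
    Q(i, s, a) = V(a, s) + C(i, s, a)

    % Projected value of task i in state s
    V(i, s) = Q(i, s, \pi_i(s))                           \quad \text{if } i \text{ is composite}
    V(i, s) = \sum_{s'} P(s' \mid s, i)\, R(s' \mid s, i)  \quad \text{if } i \text{ is primitive}

    % Completion function: expected discounted reward for completing i after a
    % finishes, where N is the number of steps a takes
    C(i, s, a) = \sum_{s', N} P(s', N \mid s, a)\, \gamma^{N}\, Q(i, s', \pi_i(s'))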
11. Optimalities
12. MAXQ - continued
- Lowest level of the hierarchy gives primitive actions and direct rewards
- Use these in combination with local rewards to implement learning (update rules below)
- Find a recursively optimal policy (vs. hierarchical optimality)
- Enable state abstractions
- Speed up learning by minimizing the number of states
- Proof of convergence to recursive optimality
- Possibility of executing the policy non-hierarchically
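The learning updates this refers to, following the MAXQ-0 algorithm in Dietterich's paper, where alpha_t is the learning rate and N the number of steps the child action took:

    % Primitive action a executed in s with observed reward r
    V_{t+1}(a, s) \leftarrow (1 - \alpha_t)\, V_t(a, s) + \alpha_t\, r

    % Composite task i, after child action a terminates in s' having taken N steps
    C_{t+1}(i, s, a) \leftarrow (1 - \alpha_t)\, C_t(i, s, a)
                     + \alpha_t\, \gamma^{N} \max_{a'} \big[ V_t(a', s') + C_t(i, s', a') \big]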
13. MAXQ graph
14. Results MAXQ
15. Topics for Future Research
- 1. Compact representations
- 2. Learning task Hierarchies
- 3. Dynamic Abstraction
- 4. Large Applications
16. Conclusions
- Two main approaches
  - Extend the action space with macros
  - Limit the state space using a hierarchy
- Macros: the problem is how to learn good macros autonomously. Either suboptimal performance (limited action space) or no real gain (extended action space), but they might speed up learning significantly.
- MAXQ: making use of a programmer-defined hierarchical decomposition makes state spaces smaller and learning faster. The problem is the effort required from the programmer.
17. The MAXQ algorithm
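A minimal Python sketch of the recursive MAXQ-0 learning algorithm this slide covers, assuming a hypothetical task object that exposes the subtask hierarchy (children, is_primitive, is_terminal) and a one-step environment interface (execute); the class and method names and the epsilon-greedy exploration are illustrative choices, only the value and completion-function updates follow the paper:

    import random
    from collections import defaultdict

    class MaxQ0:
        def __init__(self, task, alpha=0.1, gamma=0.99, epsilon=0.1):
            self.task = task
            self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
            self.V = defaultdict(float)   # V[(a, s)] for primitive actions a
            self.C = defaultdict(float)   # C[(i, s, a)]: completion function of parent task i

        def value(self, i, s):
            # Projected value V(i, s): stored for primitives, computed recursively for composites.
            if self.task.is_primitive(i):
                return self.V[(i, s)]
            return max(self.value(a, s) + self.C[(i, s, a)]
                       for a in self.task.children(i))

        def choose(self, i, s):
            # Epsilon-greedy exploration over the child actions of subtask i.
            actions = self.task.children(i)
            if random.random() < self.epsilon:
                return random.choice(actions)
            return max(actions, key=lambda a: self.value(a, s) + self.C[(i, s, a)])

        def run(self, i, s):
            """Execute subtask i starting in state s; return (steps taken, resulting state)."""
            if self.task.is_primitive(i):
                s_next, r = self.task.execute(i, s)                 # one primitive environment step
                self.V[(i, s)] += self.alpha * (r - self.V[(i, s)])
                return 1, s_next
            steps = 0
            while not self.task.is_terminal(i, s):
                a = self.choose(i, s)
                n, s_next = self.run(a, s)                          # recursively run the child subtask
                # Update the completion function toward the best value reachable from s_next.
                target = self.gamma ** n * self.value(i, s_next)
                self.C[(i, s, a)] += self.alpha * (target - self.C[(i, s, a)])
                steps += n
                s = s_next
            return steps, s

Training would repeatedly call run(root_task, start_state) over episodes; the greedy recursively optimal policy can then be read off the learned V and C tables.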