Value Function Approximation with Diffusion Wavelets and Laplacian Eigenfunctions

1
Value Function Approximation with Diffusion
Wavelets and Laplacian Eigenfunctions
  • by S. Mahadevan and M. Maggioni

Discussion led by Qi An, ECE, Duke University
2
Outline
  • Introduction
  • Approximate policy iteration
  • Value function approximation
  • Laplacian eigenfunctions approximation
  • Diffusion Wavelets approximation
  • Experimental results
  • Conclusions

3
Introduction
  • In MDP models, the value function must be
    approximated when the state space is large or
    in a reinforcement learning (RL) setting.
  • This paper explores two novel approaches to
    value function approximation on state space
    graphs.

4
Approximate policy iteration
  • When an RL problem is modeled as an MDP, value
    function approximation is one step of the
    approximate policy iteration process, which
    solves the RL problem iteratively.
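
The loop can be sketched in a simplified, model-based form. This is a minimal sketch and not the sample-based procedure used in the paper: it assumes the transition tensor `P`, the expected-reward table `R`, and a basis matrix `Phi` are all known, and every name and number here is hypothetical.

```python
import numpy as np

def approximate_policy_iteration(P, R, Phi, gamma=0.9, max_iters=50):
    """P: (A, S, S) transitions, R: (S, A) expected rewards, Phi: (S, k) basis."""
    n_actions, n_states, _ = P.shape
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    w = np.zeros(Phi.shape[1])
    for _ in range(max_iters):
        # Policy evaluation: solve the Bellman equation for the current policy,
        # then project the result onto the span of the basis functions.
        P_pi = P[policy, states]          # row s is P[policy[s], s, :]
        R_pi = R[states, policy]
        V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
        w, *_ = np.linalg.lstsq(Phi, V, rcond=None)
        V_hat = Phi @ w
        # Policy improvement: act greedily with respect to the approximation.
        Q = R + gamma * np.einsum("asn,n->sa", P, V_hat)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, w

# Hypothetical 6-state chain: action 0 moves left, action 1 moves right,
# and a reward of 1 is earned whenever the next state is the last one.
n = 6
P = np.zeros((2, n, n))
R = np.zeros((n, 2))
for s in range(n):
    left, right = max(s - 1, 0), min(s + 1, n - 1)
    P[0, s, left] = 1.0
    P[1, s, right] = 1.0
    R[s, 0] = float(left == n - 1)
    R[s, 1] = float(right == n - 1)

# With the identity basis the projection is exact (the tabular case).
policy, w = approximate_policy_iteration(P, R, np.eye(n))
```

Passing a `Phi` with fewer columns than states turns the evaluation step into a genuine approximation; the quality of the resulting policy then depends on how well the basis captures the value function.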

5
Approximate policy iteration
Sample
(s, a, r, s')
6
Value function approximation
  • A variety of linear and non-linear architectures
    have been widely studied as they offer many
    advantages in the context of value function
    approximation
  • However, many of them are hand-coded by a human
    designer in an ad hoc, trial-and-error process.

7
Value function approximation
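  • A finite MDP can be defined as
    $M = (S, A, P^a_{ss'}, R^a_{ss'})$
  • Any policy $\pi$ defines a unique value function
    $V^\pi$, which satisfies the Bellman equation
    $V^\pi(s) = \sum_{s'} P^{\pi(s)}_{ss'} \left( R^{\pi(s)}_{ss'} + \gamma V^\pi(s') \right)$
  • We want to project the value function onto
    a lower-dimensional space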
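
For a fixed policy the Bellman equation is linear in the value function, so on a small problem it can be solved directly. A minimal sketch, assuming a hypothetical 5-state chain under a random-walk policy, an expected one-step reward vector `R`, and a discount factor of 0.9 (none of these numbers come from the slides):

```python
import numpy as np

def policy_value(P, R, gamma):
    """Solve V = R + gamma * P @ V for a fixed policy.

    P is the state-to-state transition matrix under the policy and
    R is the expected one-step reward in each state.
    """
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P, R)

# Random walk on a chain: step left or right with probability 1/2,
# staying in place at the two ends.
n = 5
P = np.zeros((n, n))
for s in range(n):
    P[s, max(s - 1, 0)] += 0.5
    P[s, min(s + 1, n - 1)] += 0.5

R = np.zeros(n)
R[-1] = 1.0          # reward only in the last state
gamma = 0.9
V = policy_value(P, R, gamma)
```

The direct solve costs on the order of n^3 operations, which is what makes a lower-dimensional representation attractive once the state space grows.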

8
Value function approximation
  • In the approximation $Q^\pi \approx \Phi w$, $\Phi$ is
    an $|S||A| \times k$ matrix, each column of which is
    a basis function evaluated at the $(s, a)$ points,
    $k$ is the number of basis functions selected, and
    $w$ is a weight vector.
  • The problem is how to efficiently and effectively
    construct those basis functions
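
A linear architecture of this kind reduces fitting to a least-squares problem in the weight vector. A minimal sketch, assuming a hand-picked polynomial basis over states only (rather than state-action pairs) and a made-up target function; the names `fit_weights`, `states`, and `V` are hypothetical:

```python
import numpy as np

def fit_weights(Phi, V):
    """Least-squares weights w minimizing ||Phi @ w - V||_2."""
    w, *_ = np.linalg.lstsq(Phi, V, rcond=None)
    return w

n, k = 20, 4
states = np.linspace(0.0, 1.0, n)

# Each column of Phi is one basis function evaluated at every state:
# here the hand-coded monomials 1, x, x^2, x^3.
Phi = np.vander(states, k, increasing=True)

V = np.exp(3.0 * states)     # stand-in for a value function to approximate
w = fit_weights(Phi, V)
V_hat = Phi @ w              # approximation inside span(Phi)
```

The polynomial basis here is exactly the sort of hand-coded choice the slides criticize; it stands in only to show the shape of the computation.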

9
Laplacian eigenfunctions
  • We model the state space as a finite undirected
    weighted graph (G,E,W)
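  • The combinatorial Laplacian $L$ is defined as
    $L = D - W$, where $D$ is the diagonal matrix of
    row sums of $W$
  • The normalized Laplacian is
    $\mathcal{L} = D^{-1/2} (D - W) D^{-1/2}$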
  • We use the eigenfunctions of L as the
    orthonormal basis
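
The construction can be sketched on a small graph. This is a minimal sketch, assuming a hypothetical 10-state chain with unit edge weights and taking the basis from the normalized Laplacian (the slide does not say which of the two Laplacians is meant); `laplacian_basis` is an illustrative name:

```python
import numpy as np

def laplacian_basis(W, k):
    """Return both Laplacians and the k smoothest eigenpairs of the normalized one."""
    d = W.sum(axis=1)                       # degrees (row sums of W)
    L = np.diag(d) - W                      # combinatorial Laplacian
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_norm = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_norm)   # eigenvalues in ascending order
    return L, L_norm, eigvals[:k], eigvecs[:, :k]

# Chain graph: state i is connected to state i + 1 with weight 1.
n = 10
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

L, L_norm, lam, basis = laplacian_basis(W, k=4)
```

Eigenvectors with small eigenvalues vary slowly across neighboring states, which is why the first few are kept as basis functions.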

10
Diffusion wavelets
  • Diffusion wavelets generalize wavelet analysis
    and associated signal processing techniques to
    functions on manifolds and graphs.
  • They allow fast and accurate computation of high
    powers of a Markov chain $P$ on the graph,
    including direct computation of the Green's
    function of the Markov chain, $(I - P)^{-1}$, for
    solving Bellman's equation.
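
The link between high powers and the Green's function can be illustrated with dense matrices. The sketch below uses the identity $(I - T)^{-1} = \prod_{k \ge 0} (I + T^{2^k})$, which holds when the spectral radius of $T$ is below 1. Since $I - P$ is singular for a stochastic $P$, the sketch assumes the discounted operator $T = \gamma P$ with $\gamma = 0.9$; that choice, the chain example, and the name `green_dyadic` are assumptions, and no wavelet compression is performed here.

```python
import numpy as np

def green_dyadic(T, levels):
    """Approximate (I - T)^{-1} by the product of (I + T^(2^k)), k = 0..levels-1.

    The product equals the partial Neumann sum I + T + ... + T^(2^levels - 1),
    so only `levels` squarings of T are needed.
    """
    n = T.shape[0]
    G = np.eye(n)
    power = T.copy()                    # holds T^(2^k)
    for _ in range(levels):
        G = G @ (np.eye(n) + power)
        power = power @ power           # square to reach the next dyadic power
    return G

# Random walk on a 6-state chain, reflecting at the ends.
n = 6
P = np.zeros((n, n))
for s in range(n):
    P[s, max(s - 1, 0)] += 0.5
    P[s, min(s + 1, n - 1)] += 0.5

gamma = 0.9
G = green_dyadic(gamma * P, levels=10)
G_exact = np.linalg.inv(np.eye(n) - gamma * P)
```

Diffusion wavelets exploit the same dyadic structure but keep each power in a compressed basis, which is where the speedup on large graphs comes from.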

11
Diffusion wavelets
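  • Markov random walk: $P = D^{-1} W$
  • We symmetrize $P$ and take powers:
    $T = D^{1/2} P D^{-1/2} = D^{-1/2} W D^{-1/2} = I - \mathcal{L}$,
    $T^t = \sum_i (1 - \lambda_i)^t \, \xi_i \xi_i^\top$
  • where $\lambda_i$ and $\xi_i$ are the
    eigenvalues and eigenfunctions of the normalized
    Laplacian $\mathcal{L}$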
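
These relations are easy to check numerically. A minimal sketch, assuming a hypothetical weighted chain graph with arbitrary positive edge weights; the helper names are illustrative only:

```python
import numpy as np

def symmetrized_walk(W):
    """Return P = D^{-1} W, its symmetrization T, and the normalized Laplacian."""
    d = W.sum(axis=1)
    sqrt_d = np.sqrt(d)
    P = W / d[:, None]                              # row-stochastic random walk
    T = sqrt_d[:, None] * P / sqrt_d[None, :]       # D^{1/2} P D^{-1/2}
    L_norm = np.eye(len(d)) - W / np.outer(sqrt_d, sqrt_d)
    return P, T, L_norm

def power_from_spectrum(L_norm, t):
    """Compute T^t from the eigenpairs of the normalized Laplacian."""
    lam, xi = np.linalg.eigh(L_norm)
    return (xi * (1.0 - lam) ** t) @ xi.T

n = 8
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0 + i             # arbitrary positive weights

P, T, L_norm = symmetrized_walk(W)
T5 = power_from_spectrum(L_norm, 5)
```

Because $T$ is symmetric, its powers share one orthonormal eigenbasis and only the eigenvalues $(1 - \lambda_i)^t$ change with $t$.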

12
Diffusion wavelets
  • A diffusion wavelet tree consists of orthogonal
    diffusion scaling functions $\Phi_j$ and orthogonal
    wavelets $\Psi_j$.
  • The scaling functions $\Phi_j$ span a subspace $V_j$
    with the property $V_{j+1} \subseteq V_j$, and the
    span of the wavelets $\Psi_j$, $W_j$, is the
    orthogonal complement of $V_{j+1}$ in $V_j$.

13
Diffusion wavelets
14
  • The detail subspaces
  • Downsampling, orthogonalization, and operator
    compression

A: diffusion operator; G: Gram-Schmidt
orthonormalization; M = A ∘ G
  • Diffusion maps; X is the data set
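
A single level of the orthogonalize-and-compress step can be sketched as follows. This is a simplified illustration and not the diffusion wavelet algorithm itself: it uses an SVD truncated at a precision `eps` in place of the Gram-Schmidt orthonormalization named on the slide, works with dense matrices, and assumes a lazy random walk on a hypothetical 16-state chain so that the eigenvalues lie in [0, 1].

```python
import numpy as np

def one_level(T, eps):
    """Split the space into scaling and wavelet subspaces and compress T^2.

    Phi1 spans the numerical range of T (directions with singular value > eps);
    Psi0 spans its orthogonal complement, the detail subspace at this level.
    """
    U, s, _ = np.linalg.svd(T)
    keep = s > eps
    Phi1 = U[:, keep]
    Psi0 = U[:, ~keep]
    T2_compressed = Phi1.T @ (T @ T) @ Phi1     # T^2 represented on Phi1
    return Phi1, Psi0, T2_compressed

# Symmetrized random walk on a chain, made lazy so it is positive semidefinite.
n = 16
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
d = W.sum(axis=1)
T_sym = W / np.sqrt(np.outer(d, d))
T = 0.5 * (np.eye(n) + T_sym)

Phi1, Psi0, T1 = one_level(T, eps=0.1)
n1 = Phi1.shape[1]
basis = np.hstack([Phi1, Psi0])
```

Repeating the step on `T1` would give the next level: each squaring of the operator lowers its numerical rank, so the retained scaling basis keeps shrinking.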

15
Diffusion wavelets
16
(No Transcript)
17
Experimental results
18
Conclusions
  • Two novel value function approximation methods
    are explored
  • The underlying representation and policies are
    learned simultaneously
  • Diffusion wavelets are a powerful tool for signal
    processing of functions on manifolds and graphs