Learning Linear Predictive State Representations
1
Learning Linear Predictive State Representations
  • Britton Wolfe
  • EECS 592
  • April 19, 2004

2
High-Level Problem
  • Build a model of an environment
  • Discrete-time scenarios
  • Can be used for planning
  • Different model types
  • POMDPs posit hidden states of the system
  • Linear PSRs are fully grounded in observables
  • IO-OOMs are similar to PSRs, but less general

3
Linear PSRs
  • Test: a sequence of actions and observations;
    it succeeds if, given that the actions are
    taken, the observations are seen
  • Core tests: a set of tests whose success
    predictions are sufficient to compute the
    prediction of any test
  • These predictions form the state of the model
  • Update parameters: matrices that update the
    core-test predictions based on the most recent
    action/observation
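The state update described above can be sketched as a small numerical routine. The names `M_ao` and `m_ao` (a per-(action, observation) update matrix and a normalizing weight vector) follow common linear-PSR notation but are assumptions here, not taken from the slides:

```python
import numpy as np

def psr_update(p, M_ao, m_ao):
    """One linear-PSR state update after taking action a and seeing o.

    p    : current prediction vector, p[i] = Pr(core test q_i | history)
    M_ao : matrix whose row i holds the weights of the extended test a,o,q_i
    m_ao : weight vector of the one-step test a,o
    """
    denom = m_ao @ p            # predicted Pr(o | history, a)
    if denom <= 0:
        raise ValueError("observation has zero predicted probability")
    return (M_ao @ p) / denom   # new predictions Pr(q_i | history, a, o)
```

Because the update is a linear map followed by one normalization, the model state never needs to reference a hidden latent variable, only the core-test predictions themselves.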

4
Learning Linear PSRs
  • Problem: How can a set of core tests and the
    corresponding update parameters be found by
    interacting with an environment?
  • Current methods
  • Singh, Littman, Jong, Pardoe, and Stone (ICML
    2003): learning the parameters given the core
    tests
  • Singh and James (forthcoming): learning both the
    core tests and the parameters, using an
    artificial reset

5
Goal of Project
  • Two algorithms, Direct Sampling and Intermediate
    Model, based on Jaeger's algorithm for learning
    IO-OOMs by sampling a training sequence
  • Question: Can these algorithms generate models
    that predict the occurrence of a set of tests T
    with an MSE below 0.001, using a training
    sequence of length kp?
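One way to picture the sampling idea is frequency estimation over a long action/observation sequence: a test's prediction is estimated as the fraction of windows whose actions match the test in which the observations also match. This sketch is an illustrative assumption (function name, sliding-window scheme), not the exact algorithm from the slides:

```python
def estimate_test_prediction(seq, test):
    """Sampling-style estimate of a test's prediction.

    seq  : list of (action, observation) pairs from one long run
    test : list of (action, observation) pairs to predict

    Counts every length-len(test) window whose actions match the
    test's actions, and returns the fraction of those windows whose
    observations also match (None if the actions never occur).
    """
    k = len(test)
    attempts = successes = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if all(a == ta for (a, _), (ta, _) in zip(window, test)):
            attempts += 1
            if all(o == to for (_, o), (_, to) in zip(window, test)):
                successes += 1
    return successes / attempts if attempts else None
```

Estimates of this kind converge slowly for tests whose action sequences occur rarely under the sampling policy, which is why the training budget kp below depends on the problem's smallest probabilities.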

6
Amount of Training Data
  • Depends on the problem at hand
  • kp allows more data for more complex problems
  • Inversely proportional to
  • the lowest transition probability
  • the lowest observation probability
  • Proportional to
  • the square of the number of states
  • the number of actions
  • kp ranged from 320 to 600,000
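The scaling rules above can be collected into a toy formula. The constant `c` and the exact functional form are assumptions for illustration only; the slides give only the proportionalities and the resulting range of 320 to 600,000:

```python
def training_budget(n_states, n_actions, min_trans_p, min_obs_p, c=1.0):
    """Toy training-budget formula following the stated scaling:
    grows with n_states**2 and n_actions, shrinks with the smallest
    transition and observation probabilities. c is an assumed constant.
    """
    return c * n_states ** 2 * n_actions / (min_trans_p * min_obs_p)
```

Under this reading, a problem with many states or with rare transitions/observations demands a far longer training sequence, which matches the wide spread of kp across the benchmark problems.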

7
Evaluation Tests T
  • Test sequence used a random-walk policy
  • Timepoints at which to evaluate the PSR were
    chosen randomly
  • Measure predictions for observing each possible
    observation, given the last action taken
  • Compute the MSE against an accurate model

[Timeline figure: a sequence of action/observation
pairs a1 o1, a6 o2, a3 o1; at the randomly chosen
eval points, evaluate Pr(a1,o1) and Pr(a1,o2), and
later Pr(a3,o1) and Pr(a3,o2)]
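The evaluation metric can be sketched as pooling squared errors over all eval points and observations. The nested-list layout (one row of per-observation predictions per eval point) is an assumed representation:

```python
def eval_mse(model_preds, true_preds):
    """MSE between a learned model's predictions and an accurate
    model's predictions, pooled over eval points and observations.

    Each argument is a list of rows, one row per eval point, holding
    the predicted probability of each possible observation given the
    last action taken.
    """
    total, n = 0.0, 0
    for m_row, t_row in zip(model_preds, true_preds):
        for m, t in zip(m_row, t_row):
            total += (m - t) ** 2
            n += 1
    return total / n
```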
8
Methods
  • Test sequence length: 30,000
  • Expected number of eval points: 30,000/24 = 1,250
  • Results were averaged over all eval points
  • Training sequence lengths: 1,000 to 32 million
  • 20 epochs for each training length on each
    problem
  • Problems: 4x3 maze, cheese maze, shuttle,
    network, tiger, paint, bridge repair (from
    http://www.cs.brown.edu/research/ai/pomdp/examples/index.html)

9
Direct Sampling Results
  • No problem strictly met its kp deadline
  • On six problems, the median MSE eventually crept
    below 0.001
  • On four problems, the mean MSE never fell below
    0.001

10
Intermediate Model Results
  • Method took significantly longer to run than
    Direct Sampling
  • Did not demonstrate a significant improvement in
    prediction accuracy

11
Summary
  • Direct Sampling method can achieve decent
    accuracy given substantial training data
  • Intermediate Model method takes prohibitively
    long to run, does not appear to give better
    precision, and has severe trouble with small
    training sets
  • Future directions
  • Analyze why the methods achieve different
    performance on the different problems