Title: Predictive State Representations
1 Predictive State Representations
Hui Li, July 7, 2006
2 Outline
- What are the advantages of predictive state representations?
- What is a predictive state representation (PSR)?
- How to learn a PSR model
- Conclusions
3 What are the advantages of PSR
- PSRs are expressed entirely in terms of observable quantities
- PSRs avoid the problems of local minima and saddle points that arise in learning a POMDP model
- PSRs attain generality and compactness at least equal to those of POMDPs
4 What are predictive state representations (1/9)
Two notations in PSR
- History (h)
- A history is the sequence of action-observation (ao) pairs that the agent has already experienced, beginning at the first time step
- Test (t)
- A test is a sequence of ao pairs that begins immediately after a history
5 What are predictive state representations (2/9)
Prediction of a test p(t|h)
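For reference, the prediction of a test is usually defined in the PSR literature (e.g. Singh, James and Rudary, 2004) as the probability of seeing the test's observations given that its actions are taken after the history. The superscript indexing of the actions and observations below is a notational assumption, not taken from the slide:

```latex
p(t \mid h) = \Pr\left(o^1, \dots, o^n \mid h,\; a^1, \dots, a^n\right),
\qquad t = a^1 o^1 \cdots a^n o^n .
```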
6 What are predictive state representations (3/9)
System-dynamics matrix D
7 What are predictive state representations (4/9)
Order of all possible tests in D
Properties of the predictions in each row of D
8 What are predictive state representations (5/9)
Relation between PSR and POMDP
The belief state is updated according to Bayes' rule
Constructing D from a POMDP
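A minimal sketch of how entries of D could be computed from a POMDP, assuming the common convention that the observation depends on the action and the resulting state. The two-state model, its numbers, and all function names are hypothetical and serve only as illustration. The belief update is Bayes' rule, and p(t|h) is obtained by propagating the belief b(h) through the test without normalizing:

```python
from fractions import Fraction as F

# Hypothetical 2-state POMDP (illustrative numbers only).
# T[a][s][s2] = Pr(s2 | s, a); O[a][s2][o] = Pr(o | s2, a).
T = {
    "stay": [[F(9, 10), F(1, 10)], [F(1, 10), F(9, 10)]],
    "flip": [[F(1, 10), F(9, 10)], [F(9, 10), F(1, 10)]],
}
O = {
    "stay": [[F(4, 5), F(1, 5)], [F(1, 5), F(4, 5)]],
    "flip": [[F(4, 5), F(1, 5)], [F(1, 5), F(4, 5)]],
}

def propagate(vec, action, obs):
    """Unnormalized step: sum_s vec[s] * T[a][s][s2] * O[a][s2][o] for each s2."""
    k = len(vec)
    return [
        sum(vec[s] * T[action][s][s2] for s in range(k)) * O[action][s2][obs]
        for s2 in range(k)
    ]

def update_belief(belief, action, obs):
    """Bayes-rule belief update after taking `action` and observing `obs`."""
    joint = propagate(belief, action, obs)
    total = sum(joint)
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    return [x / total for x in joint]

def predict(belief, test):
    """p(t | h) for test = [(a1, o1), ...], where belief = b(h)."""
    vec = list(belief)
    for action, obs in test:
        vec = propagate(vec, action, obs)
    return sum(vec)

def dynamics_submatrix(b0, histories, tests):
    """Finite block of D: rows are histories, columns are tests."""
    rows = []
    for history in histories:
        belief = list(b0)
        for action, obs in history:
            belief = update_belief(belief, action, obs)
        rows.append([predict(belief, t) for t in tests])
    return rows

b0 = [F(1, 2), F(1, 2)]  # assumed initial belief
D_block = dynamics_submatrix(
    b0,
    histories=[[], [("stay", 0)]],
    tests=[[("stay", 0)], [("stay", 1)]],
)
print(D_block)
```

Exact fractions are used so that the rows of the block can be checked without rounding error. A history with zero probability raises an error here, since its row of D is undefined.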
9 What are predictive state representations (6/9)
10 What are predictive state representations (7/9)
Since the rank of D is at most k, there exist at most k linearly independent columns or rows in D.
- Core tests Q_T
- The tests corresponding to the k linearly independent columns are called core tests.
- Core histories Q_h
- The histories corresponding to the k linearly independent rows are called core histories.
11 What are predictive state representations (8/9)
12 What are predictive state representations (9/9)
Linear PSR model
Definition
D(Q) is a linear sufficient statistic of the
histories since all the columns of D are a linear
combination of the columns in D(Q).
PSR State update
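In the linear PSR literature (Littman, Sutton and Singh, 2002; Singh, James and Rudary, 2004) the model and its state update are usually written as below. Here p(Q_T | h) denotes the vector of core-test predictions after history h and m_t the weight vector of test t; this notation is an assumption chosen to match the parameters named on the learning slides:

```latex
% Linear PSR: every test prediction is linear in the core-test predictions.
p(t \mid h) = p(Q_T \mid h)^{\top} m_t
% State update after taking action a and observing o, for each core test q_i:
p(q_i \mid hao) = \frac{p(aoq_i \mid h)}{p(ao \mid h)}
               = \frac{p(Q_T \mid h)^{\top} m_{aoq_i}}{p(Q_T \mid h)^{\top} m_{ao}}
```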
13 How to learn PSR model (1/6)
Two subproblems in learning a PSR model
- Discovery: find the core tests Q_T whose predictions constitute the state (a sufficient statistic)
- Learning: learn the parameters m_aot that define the system dynamics.
14 How to learn PSR model (2/6)
The tests and histories corresponding to a set of linearly independent columns and rows of any submatrix of D are subsets of the core tests and core histories, respectively.
[Figure: the infinite matrix D and a finite, small submatrix]
15 How to learn PSR model (3/6)
Analytical Discovery and Learning Algorithm (ADL)
- Assumption: the exact D is obtained
- Analytical discovery algorithm (AD)
- Analytical learning algorithm (AL)
16 How to learn PSR model (4/6)
1. Analytical discovery algorithm (AD)
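The linear-algebra step that discovery relies on can be sketched as follows. This is not the full AD procedure of James and Singh (2004), which the slide names without reproducing; it only shows how linearly independent columns (candidate core tests) and rows (candidate core histories) could be picked from a finite submatrix of D. The function names are hypothetical, and exact entries are assumed, in line with the ADL assumption that the exact D is available:

```python
from fractions import Fraction

def independent_columns(matrix):
    """Indices of a maximal linearly independent set of columns.

    Uses forward Gaussian elimination with exact arithmetic; the pivot
    columns of the echelon form are linearly independent.
    Entries should be exact (Fraction or int), not rounded floats.
    """
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # column c depends on the pivot columns found so far
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate column c from every row below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return pivots

def independent_rows(matrix):
    """Indices of a maximal linearly independent set of rows."""
    transposed = [list(col) for col in zip(*matrix)]
    return independent_columns(transposed)

# Hypothetical 3x3 block of D: column 2 is half of column 0,
# and row 2 is the average of rows 0 and 1, so the rank is 2.
block = [
    [Fraction(1, 2), Fraction(1, 2), Fraction(1, 4)],
    [Fraction(1, 4), Fraction(3, 4), Fraction(1, 8)],
    [Fraction(3, 8), Fraction(5, 8), Fraction(3, 16)],
]
print(independent_columns(block), independent_rows(block))
```

With estimated rather than exact entries, an exact rank test like this one would be too strict, and a numerical rank with a tolerance would be needed instead.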
17 How to learn PSR model (5/6)
2. Analytical learning algorithm (AL)
Since
Then
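A hedged reconstruction of these two steps, following the analytical learning step described by James and Singh (2004). The notation p(Q_T | Q_h) for the submatrix of D on the core histories and core tests, and p(aot | Q_h) for the column of the test aot restricted to the core histories, is an assumption:

```latex
% Since every column of D is a linear combination of the core-test columns:
p(aot \mid Q_h) = p(Q_T \mid Q_h)\, m_{aot}
% Then, assuming p(Q_T | Q_h) is a square matrix of full rank:
m_{aot} = p(Q_T \mid Q_h)^{-1}\, p(aot \mid Q_h)
```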
18 How to learn PSR model (6/6)
Estimate the system-dynamics matrix D
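One simple way to estimate an entry of D from data is by counting, sketched below under two assumptions that are not stated on the slide: the system can be reset, so every trajectory starts from the same initial condition, and actions are chosen independently of past observations. The function name and the toy data are hypothetical:

```python
from fractions import Fraction

def estimate_prediction(trajectories, history, test):
    """Estimate p(test | history) by counting.

    Each trajectory, history and test is a list of (action, observation)
    pairs; every trajectory is assumed to start right after a reset.
    Returns None when the history followed by the test's actions never occurs.
    """
    n, m = len(history), len(test)
    test_actions = [a for a, _ in test]
    attempts = successes = 0
    for traj in trajectories:
        if len(traj) < n + m or traj[:n] != history:
            continue  # trajectory does not begin with the history
        window = traj[n:n + m]
        if [a for a, _ in window] != test_actions:
            continue  # the test's actions were not executed after the history
        attempts += 1
        if window == test:
            successes += 1  # the test's observations were all seen
    return Fraction(successes, attempts) if attempts else None

# Toy data: four short trajectories collected after resets.
trajectories = [
    [("a", 0), ("a", 1)],
    [("a", 0), ("a", 0)],
    [("a", 1), ("a", 1)],
    [("a", 0), ("b", 1)],
]
print(estimate_prediction(trajectories, [("a", 0)], [("a", 1)]))
print(estimate_prediction(trajectories, [], [("a", 0)]))
```

When actions depend on earlier observations, this plain ratio can be biased; Bowling et al. (2006) address learning under such non-blind policies.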
19 Conclusions
- A new model of dynamical systems, the predictive state representation (PSR), is introduced; it is grounded in actions and observations.
- An algorithm, analytical discovery and learning (ADL), is introduced to learn the PSR model.
20 References
- James, M. R., Singh, S. (2004). Learning and discovery of predictive state representations in dynamical systems with reset. Proceedings of the 21st International Conference on Machine Learning (ICML) (pp. 719-726).
- Littman, M., Sutton, R. S., Singh, S. (2002). Predictive representations of state. Advances in Neural Information Processing Systems 14 (NIPS) (pp. 1555-1561). MIT Press.
- McCracken, P., Bowling, M. (2006). Online learning of predictive state representations. Advances in Neural Information Processing Systems 18 (NIPS). MIT Press. To appear.
- Singh, S., James, M. R., Rudary, M. R. (2004). Predictive state representations: A new theory for modeling dynamical systems. Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (UAI) (pp. 512-519).
- Singh, S., Littman, M., Jong, N., Pardoe, D., Stone, P. (2003). Learning predictive state representations. Proceedings of the Twentieth International Conference on Machine Learning (ICML) (pp. 712-719).
- Wiewiora, E. (2005). Learning predictive representations from a history. Proceedings of the 22nd International Conference on Machine Learning (ICML) (pp. 969-976).
- Wolfe, B., James, M. R., Singh, S. (2005). Learning predictive state representations in dynamical systems without reset. Proceedings of the 22nd International Conference on Machine Learning (ICML) (pp. 985-992).
- Bowling, M., McCracken, P., James, M., Neufeld, J., Wilkinson, D. (2006). Learning predictive state representations using non-blind policies. ICML 2006.