Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding


Learn more at: http://www.ai.mit.edu


1
Exact Mode Estimation for POMDPs based on
Constraint Decomposition and Symbolic Encoding

Martin Sachenbacher
July 1, 2003

2
Exact vs. Approximate ME

Problems of mode estimation (ME) with incomplete belief state:
  Dead ends (no solutions)
  Incorrect leading solutions
  Incorrect probabilities of solutions
Usefulness of ME with complete belief state:
  As accuracy reference
  As performance reference
  As a starting point for approximations
Key: Compact representation of belief state
  Map to semiring-based CSP
  Decompose hypergraph into hypertree
  Encode tree nodes symbolically as ADDs

3
Outline

SCSPs (Semiring-based CSPs)
Mapping State Constraints to SCSPs
Mapping Transition Constraints to SCSPs
ADDs (Algebraic Decision Diagrams)
Hypertree Decompositions of SCSPs
Solving Tree-structured SCSPs
Exact Mode Estimation for POMDPs as Decomposition/ADD-based SCSP Solving
Demonstration: Two Switches Example

4
SCSPs (Semiring-based CSPs)

Generalization of CSPs [Bistarelli et al. 97]
Domain D, Variables V, Set S, Type T ⊆ V
Constraints are mappings D^k → S
Operations ⊗ (for join) and ⊕ (for projection) on S
(S, ⊕, ⊗, 0, 1) must form a c-semiring
Dynamic programming applicable to all SCSPs
Examples:
  ({0,1}, ∨, ∧, 0, 1): Classical CSPs
  (R⁺ ∪ {+∞}, min, +, +∞, 0): Weighted CSPs
  ([0,1], max, ×, 0, 1): Probabilistic CSPs
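
As a rough illustration of the three semiring examples on this slide, the following Python sketch models a c-semiring as a small record. The class name `Semiring`, its field names, and the helper `check_identities` are assumptions made for this sketch and do not come from the slides.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    plus: Callable[[Any, Any], Any]   # additive operation, used for projection
    times: Callable[[Any, Any], Any]  # multiplicative operation, used for join
    zero: Any                         # identity of plus, absorbing for times
    one: Any                          # identity of times


# The three instances listed on the slide (hypothetical constant names).
CLASSICAL = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)
WEIGHTED = Semiring(min, lambda a, b: a + b, math.inf, 0.0)
PROBABILISTIC = Semiring(max, lambda a, b: a * b, 0.0, 1.0)


def check_identities(s, samples):
    """Check the unit and absorption laws on a few sample values."""
    return all(
        s.plus(x, s.zero) == x
        and s.times(x, s.one) == x
        and s.times(x, s.zero) == s.zero
        for x in samples
    )
```

Swapping the `Semiring` instance is all that changes between satisfiability, cost minimization, and most-probable-solution queries, which is why one dynamic programming scheme can serve all SCSPs.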

5
Encoding States as SCSPs

Example: Or-Gate
P(Or=ok) = 99%, P(Or=fty) = 1%

x_t  in1  in2  out  f
ok   lo   lo   lo   0.99
ok   lo   hi   hi   0.99
ok   hi   lo   hi   0.99
ok   hi   hi   hi   0.99
fty  *    *    *    0.01
(* = any value)
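
A minimal sketch of the Or-gate constraint as an explicit table over the probabilistic semiring. It assumes that the faulty mode places no restriction on inputs and output, since the slide lists no values for the fty row. The function names are hypothetical.

```python
from itertools import product

VALUES = ("lo", "hi")


def or_gate_constraint(p_ok=0.99, p_fty=0.01):
    """Map every tuple (mode, in1, in2, out) to a semiring value."""
    table = {}
    for in1, in2, out in product(VALUES, repeat=3):
        expected = "hi" if "hi" in (in1, in2) else "lo"
        # Nominal mode: only tuples that obey the Or behavior get p_ok.
        table[("ok", in1, in2, out)] = p_ok if out == expected else 0.0
        # Assumption: the faulty mode allows any input/output combination.
        table[("fty", in1, in2, out)] = p_fty
    return table


def best_mode(table, in1, in2, out):
    """Most probable mode given fully observed inputs and output."""
    return max(("ok", "fty"), key=lambda mode: table[(mode, in1, in2, out)])


OR_GATE = or_gate_constraint()
```

For example, `best_mode(OR_GATE, "lo", "hi", "lo")` selects `"fty"`, because the nominal mode assigns that tuple the value 0.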
6
Encoding Observations as SCSPs

Example: (Probabilistic) Observation

Distribution over values for x_i

x_i  f
0    0.6
1    0.9
2    0.3
3    0.0

[Figure: plot of P over x_i = 0, 1, 2, 3]
7
Encoding Transitions as SCSPs

Example: (Probabilistic) CCA

Transition Function

x_t  cmd  x_{t+1}  f
0    off  0        0.9
0    on   0        0.1
0    off  1        0.1
0    on   1        0.9
1    off  0        0.9
1    on   0        0.1
1    off  1        0.1
1    on   1        0.9

[Figure: automaton with states 0 and 1; cmd=off leads to state 0 and cmd=on leads to state 1, each with probability 0.9]
8
Algebraic Decision Diagrams

ADDs: Symbolic (graph-based) representation of functions {0,1}^n → R
Generalization of BDDs (functions {0,1}^n → {0,1})
Canonicity of representation (as for BDDs)
Efficient package: CUDD

[Figure: example ADD with decision nodes for A, B, C and leaves 0, 1, 2, 3]
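
The sharing and canonicity ideas can be sketched in a few lines of Python. This is a toy reduction over truth tables, not how CUDD is implemented; the function name `build` and the tuple-based node encoding are assumptions of this sketch.

```python
def build(values, unique):
    """Build a reduced decision diagram from a truth table.

    `values` holds 2**n leaf values in truth-table order (first variable
    is the most significant bit). `unique` is a dict that collects the
    distinct internal nodes so that equal subgraphs are shared.
    """
    if len(values) == 1:
        return ("leaf", values[0])
    half = len(values) // 2
    low = build(values[:half], unique)    # first variable = 0
    high = build(values[half:], unique)   # first variable = 1
    if low == high:
        return low                        # skip a test whose branches agree
    # len(values) tags the level: 8 for A, 4 for B, 2 for C when n = 3.
    node = ("node", len(values), low, high)
    return unique.setdefault(node, node)  # share structurally equal nodes


f_nodes = {}
f_root = build((0, 1, 1, 2, 1, 2, 2, 3), f_nodes)
```

With a fixed variable order, two equal functions always reduce to the same structure, so equality of functions can be checked by comparing roots.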
9
ADD Join Operations

Multiplication, addition, maximum, ...
Generalization of BDD operations

ABC  f  g  f+g  f·g  5f  f>1  max(f,g)
000  0  3  3    0    0   0    3
001  1  2  3    2    5   0    2
010  1  0  1    0    5   0    1
011  2  1  3    2    10  1    2
100  1  0  1    0    5   0    1
101  2  0  2    0    10  1    2
110  2  0  2    0    10  1    2
111  3  1  4    3    15  1    3
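
The join operations are pointwise on the underlying functions, which the following sketch makes explicit. It uses flat dictionaries instead of shared graphs, so it only illustrates the semantics and not the ADD data structure; the helper names `pointwise` and `column` are hypothetical.

```python
import operator
from itertools import product

# Assignments to (A, B, C) in the order 000, 001, ..., 111.
ASSIGNMENTS = list(product((0, 1), repeat=3))
f = dict(zip(ASSIGNMENTS, [0, 1, 1, 2, 1, 2, 2, 3]))
g = dict(zip(ASSIGNMENTS, [3, 2, 0, 1, 0, 0, 0, 1]))


def pointwise(op, left, right):
    """Combine two functions over the same assignments value by value."""
    return {a: op(left[a], right[a]) for a in left}


def column(h):
    """Return the values of h in truth-table order."""
    return [h[a] for a in ASSIGNMENTS]


product_column = column(pointwise(operator.mul, f, g))
maximum_column = column(pointwise(max, f, g))
```

Here `product_column` is `[0, 2, 0, 2, 0, 0, 0, 3]` and `maximum_column` is `[3, 2, 1, 2, 1, 2, 2, 3]`.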
10
Example

Summation of ADD f, ADD g

[Figure: ADDs over A, B, C for f (leaves 0 to 3), g (leaves 0 to 3), and their sum (leaves 1 to 4)]
11
ADD Projection Operations

∃(f,X) (and ∀(f,X)) obtained by summing (multiplying) values of tuples that differ only w.r.t. X

ABC  f
000  0
001  1
010  1
011  2
100  1
101  2
110  2
111  3

AB  ∃(f,C)  ∀(f,C)
00  1       0
01  3       2
10  3       2
11  5       6
12
ADD Projection Operations

For optimization, we require operation ∃max(f,X) that yields maximum value of tuples differing only w.r.t. X

ABC  f
000  0
001  1
010  1
011  2
100  1
101  2
110  2
111  3

AB  ∃(f,C)  ∀(f,C)  ∃max(f,C)
00  1       0       1
01  3       2       2
10  3       2       2
11  5       6       3

Not part of CUDD, but easy to implement as variant of ∃/∀(f,X).
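
The three projection variants differ only in the operation used to merge tuples that agree on the remaining variables. A minimal sketch on flat tables follows; the function name `abstract` and its parameters are assumptions of this sketch, not CUDD calls.

```python
import operator
from functools import reduce
from itertools import product

VARS = ("A", "B", "C")
f = dict(zip(product((0, 1), repeat=3), [0, 1, 1, 2, 1, 2, 2, 3]))


def abstract(op, table, variables, drop):
    """Eliminate the variables in `drop`, merging values with `op`."""
    keep = [i for i, name in enumerate(variables) if name not in drop]
    groups = {}
    for assignment, value in table.items():
        key = tuple(assignment[i] for i in keep)
        groups.setdefault(key, []).append(value)
    return {key: reduce(op, values) for key, values in groups.items()}


by_sum = abstract(operator.add, f, VARS, {"C"})
by_product = abstract(operator.mul, f, VARS, {"C"})
by_max = abstract(max, f, VARS, {"C"})
```

Projecting out C gives values 1, 3, 3, 5 for the sum, 0, 2, 2, 6 for the product, and 1, 2, 2, 3 for the maximum, in the order AB = 00, 01, 10, 11.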
13
Solving SCSPs using Decomposition

Transform SCSPs into hypertree H = (T, χ, λ)
Compute constraint φ(v) for each node v
Bottom-up phase for computing values
Top-down phase for extracting solutions

14
Pseudocode for Bottom-Up Phase

Function solve(v)
  For Each child ∈ children(v)
    φ(v) ← φ(v) ⊗ ∃max(φ(child), χ(child) \ χ(v))
  Next child
  Return φ(v)

Generalization of (Semi-)Join Operation
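
The bottom-up phase can be sketched on explicit tables as follows. This is an illustration under stated assumptions rather than the presenter's implementation: constraints are flat dictionaries instead of ADDs, the semiring is max/product, children are solved recursively before they are folded into the parent, and the names `Node` and `max_project` are hypothetical.

```python
class Node:
    """One tree node: its variables, its constraint table, its children."""

    def __init__(self, variables, table, children=()):
        self.variables = tuple(variables)
        self.table = dict(table)        # assignment tuple -> value in [0, 1]
        self.children = list(children)


def max_project(variables, table, drop):
    """Eliminate the variables in `drop`, keeping the maximum value."""
    keep = [i for i, name in enumerate(variables) if name not in drop]
    out = {}
    for assignment, value in table.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = max(out.get(key, 0.0), value)
    return tuple(variables[i] for i in keep), out


def solve(v):
    """Fold each child's max-projected constraint into v's constraint."""
    for child in v.children:
        solve(child)  # assumption: children are solved first (post-order)
        private = set(child.variables) - set(v.variables)
        shared, message = max_project(child.variables, child.table, private)
        index = [v.variables.index(name) for name in shared]
        v.table = {
            a: value * message.get(tuple(a[i] for i in index), 0.0)
            for a, value in v.table.items()
        }
    return v.table


child = Node(("Y", "Z"), {(0, 0): 0.3, (0, 1): 0.6, (1, 0): 0.5, (1, 1): 0.1})
root = Node(("X", "Y"), {(0, 0): 0.5, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.9},
            [child])
result = solve(root)
```

After the call, the largest entry of `result` is the value of the best overall solution; in this toy tree it sits at X=1, Y=1.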
15
Example

Boolean Polycell

[Figure: Boolean polycell circuit with gates Or1, Or2, Or3, And1, And2; inputs A=1, B=1, C=1, D=1, E=0; internal variables X, Y, Z; outputs F=0, G=1]
16
Example

Hypertree Decomposition of Boolean Polycell

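[Figure: hypertree with root v0 and children v1, v2, v3; edges labeled with shared variables]

v0: O3 A1 C E F X Y Z
  ok ok 1 0 0 0 0 1
  ok ok 1 0 0 0 1 1
  ok ok 1 0 0 1 0 1    U = .98505
v1: A2 G Y Z (shares Y, Z with v0)
  ok 1 1 1    U = .995
  fty 1 1 1; fty 1 0 1; fty 1 1 0; fty 1 0 0    U = .005
v2: O2 B D Y (shares Y with v0)
  ok 1 1 1    U = .99
  fty 1 1 1; fty 1 1 0    U = .01
v3: O1 A C X (shares C, X with v0)
  ok 1 1 1    U = .99
  fty 1 1 1; fty 1 1 0    U = .01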
17
Example

Initial φ(v0)

ADD with 20 nodes, 5 leaves
Node v0, variables: O3 A1 C E F X Y Z
U = .98505: ok ok 1 0 0 0 0 1; ok ok 1 0 0 0 1 1; ok ok 1 0 0 1 0 1
U = .00995: fty ok 1 0 0 0 1 1; fty ok 1 0 0 0 1 0; fty ok 1 0 0 1 0 0; fty ok 1 0 0 1 0 1; fty ok 1 0 0 0 1 1; fty ok 1 0 0 1 0 1
U = .00495
U = .00005
18
Example

After multiplication with ∃max(φ(v1), {A2, G})

ADD with 28 nodes, 7 leaves
Node v0, variables: O3 A1 C E F X Y Z
U = .98012: ok ok 1 0 0 0 1 1
U = .00990: fty ok 1 0 0 0 1 1
U = .00492: ok ok 1 0 0 0 0 1; ok ok 1 0 0 1 0 1
U = 2.4E-5
U = 4.9E-5
U = 2.5E-7
19
Example

After multiplication with ∃max(φ(v2), {O2, B, D})

ADD with 30 nodes, 8 leaves
Node v0, variables: O3 A1 C E F X Y Z
U = .97032: ok ok 1 0 0 0 1 1
U = .00980: fty ok 1 0 0 0 1 1
U = .00487: ok fty 1 0 0 0 1 1
U = 4.9E-5
U = 4.9E-7
U = 2.4E-7
U = 2.5E-9
20
Example

After multiplication with ∃max(φ(v3), {O1, A})

ADD with 35 nodes, 10 leaves
Node v0, variables: O3 A1 C E F X Y Z
U = .00970: ok ok 1 0 0 0 1 1
U = .00482: ok fty 1 0 0 1 1 1
U = 9.8E-5: fty ok 1 0 0 0 1 1
U = 4.8E-5
U = 4.9E-7
U = 2.4E-7
U = 4.9E-9
U = 2.4E-9
U = 2.5E-11
Best solution: Umax = .0097
21
Pseudocode for Top-Down Phase
No search queue necessary

Function extractSolutions(vroot)
  E ← edges(vroot)
  σ ← φ(vroot)
  σ ← ∃max(σ, vars(σ) \ (decvars(σ) ∪ vars(E)))    (restrict to decision and shared variables)
  While E ≠ ∅ Do
    e ← choose(E)
    v ← son-node(e)
    E ← (E \ {e}) ∪ edges(v)
    σ_0-1 ← (σ ≠ 0)
    σ_div ← ∃max(σ_0-1 ⊗ φ(v), vars(σ))    (divisor)
    σ ← (σ ⊗ φ(v)) ⊗⁻¹ σ_div
    σ ← ∃max(σ, vars(σ) \ (decvars(σ) ∪ vars(E)))    (restrict to decision and shared variables)
  End While
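
For comparison, here is a deliberately simplified top-down pass in Python. Unlike the symbolic procedure on this slide, which carries the whole solution set, this sketch extracts only a single best assignment by fixing the root's best tuple and descending into the children. The flat dictionary tables, the max/product semiring, and all function names are assumptions of this sketch.

```python
def project_max(variables, table, keep):
    """Project a table onto `keep`, taking the maximum over the rest."""
    index = [variables.index(name) for name in keep]
    out = {}
    for assignment, value in table.items():
        key = tuple(assignment[i] for i in index)
        out[key] = max(out.get(key, 0.0), value)
    return out


def bottom_up(node):
    """Multiply each child's max-projection into its parent (post-order)."""
    for child in node["children"]:
        bottom_up(child)
        shared = [name for name in child["vars"] if name in node["vars"]]
        message = project_max(child["vars"], child["table"], shared)
        index = [node["vars"].index(name) for name in shared]
        node["table"] = {
            a: value * message.get(tuple(a[i] for i in index), 0.0)
            for a, value in node["table"].items()
        }


def top_down(node, fixed=None):
    """Pick the best tuple consistent with already fixed variables."""
    fixed = dict(fixed or {})
    candidates = {
        a: value for a, value in node["table"].items()
        if all(fixed.get(name, x) == x for name, x in zip(node["vars"], a))
    }
    best = max(candidates, key=candidates.get)
    fixed.update(zip(node["vars"], best))
    for child in node["children"]:
        fixed = top_down(child, fixed)
    return fixed


leaf = {"vars": ("Y", "Z"), "children": [],
        "table": {(0, 0): 0.3, (0, 1): 0.6, (1, 0): 0.5, (1, 1): 0.1}}
root = {"vars": ("X", "Y"), "children": [leaf],
        "table": {(0, 0): 0.5, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.9}}
bottom_up(root)
solution = top_down(root)
```

No search queue or backtracking is needed here either: after the bottom-up pass, every locally best choice is guaranteed to extend to a globally best solution.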
22
Example

Initial σ = ∃max(φ(vroot), {E, F})

ADD with 21 tuples, 33 nodes, 10 leaves
Variables: O3 A1 C X Y Z
U = .00970: ok ok 1 0 1 1
U = .00482: ok fty 1 1 1 1
U = 9.8E-5: fty ok 1 0 1 1
U = 4.8E-5
U = 4.9E-7
U = 2.4E-7
U = 4.9E-9
U = 2.4E-9
U = 2.5E-11
23
Example

After processing edge (v0, v3)

ADD with 21 tuples, 32 nodes, 10 leaves
Variables: O1 O3 A1 Y Z
U = .00970: fty ok ok 1 1
U = .00482: ok ok fty 1 1
U = 9.8E-5: fty fty ok 1 1
U = 4.8E-5
U = 4.9E-7
U = 2.4E-7
U = 4.9E-9
U = 2.4E-9
U = 2.5E-11
24
Example

After processing edge (v0, v2)

ADD with 30 tuples, 47 nodes, 11 leaves
Variables: O1 O2 O3 A1 Y Z
U = .00970: fty ok ok ok 1 1
U = .00482: ok ok ok fty 1 1
U = 9.8E-5: fty fty ok ok 1 1; fty ok fty ok 1 1
U = 4.8E-5
U = 9.9E-7
U = 4.9E-7
U = 2.5E-11
25
Example

After processing edge (v0, v1)

ADD with 26 tuples, 35 nodes, 12 leaves
Variables: O1 O2 O3 A1 A2
U = .00970: fty ok ok ok ok
U = .00482: ok ok ok fty ok
U = 9.8E-5: fty fty ok ok ok; fty ok fty ok ok
U = 4.8E-5
U = 2.4E-5
U = 9.9E-7
U = 2.5E-11
Solutions: 26
Easy to focus on leading solutions.
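
The outcome of the polycell example can be cross-checked by brute-force enumeration. The sketch below assumes that a faulty gate leaves its output unconstrained, that Or-gates are ok with probability 0.99 and And-gates with probability 0.995 (read off the utilities shown for the tree nodes), and that the gates are wired as the node scopes suggest: Or1(A,C)=X, Or2(B,D)=Y, Or3(C,E)=Z, And1(X,Y)=F, And2(Y,Z)=G. All identifiers are hypothetical.

```python
from itertools import product

P = {"or": {"ok": 0.99, "fty": 0.01}, "and": {"ok": 0.995, "fty": 0.005}}
OBS = dict(A=1, B=1, C=1, D=1, E=0, F=0, G=1)


def consistent(mode, kind, a, b, out):
    """A nominal gate computes its function; a faulty gate allows anything."""
    if mode == "fty":
        return True
    return out == ((a | b) if kind == "or" else (a & b))


def solutions():
    """Mode assignments (O1, O2, O3, A1, A2) that admit some X, Y, Z."""
    found = {}
    for o1, o2, o3, a1, a2 in product(("ok", "fty"), repeat=5):
        prob = (P["or"][o1] * P["or"][o2] * P["or"][o3]
                * P["and"][a1] * P["and"][a2])
        for x, y, z in product((0, 1), repeat=3):
            if (consistent(o1, "or", OBS["A"], OBS["C"], x)
                    and consistent(o2, "or", OBS["B"], OBS["D"], y)
                    and consistent(o3, "or", OBS["C"], OBS["E"], z)
                    and consistent(a1, "and", x, y, OBS["F"])
                    and consistent(a2, "and", y, z, OBS["G"])):
                found[(o1, o2, o3, a1, a2)] = prob
                break
    return sorted(found.items(), key=lambda item: -item[1])


ranked = solutions()
```

Under these assumptions the enumeration yields 26 consistent mode assignments, led by a single Or1 fault (about .0097) and then a single And1 fault (about .0048), in line with the values on the slides.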
26
Application: Exact ME for POMDPs

Given: POMDP (Feasible States, Observables, Control Actions, Transitions), Observations
Approach: Complete representation of belief state (through decomposition and symbolic encoding)
Benefit: Allows for exploiting Markov property

[Figure: states S0, S1, ..., Sn at time t and at time t+1]
27
Algorithm: Exact ME for POMDPs

Construct hypertree (offline)
Construct State-ADDs for each node (offline)
Construct Transition-ADDs for each node (offline)
Repeat for each time step:
  Multiply nodes with Obs-ADDs (condition on observations)
  Establish consistency in the tree (bottom-up)
  Extract leading solution(s) from the tree (top-down)
  Multiply nodes with Transition-ADDs, project onto x_{t+1}, set x_t = x_{t+1}, multiply with State-ADDs (transition expansion)
Complexity: Polynomial in width of hypertree
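
One iteration of the per-time-step loop can be sketched for a single component with a flat belief vector. This leaves out the hypertree and the ADD encoding entirely, and it assumes that the projection during transition expansion sums over the previous mode; the slides do not state which projection is used. The transition probabilities follow the switch model used in the demonstration, and the function names are hypothetical.

```python
MODES = ("on", "off", "fty")


def switch_transition(x, cmd, x_next):
    """P(x_next | x, cmd) for a switch; cmd is 'on', 'off', or 'idl'."""
    if x == "fty":
        return 1.0 if x_next == "fty" else 0.0
    target = {"on": "on", "off": "off", "idl": x}[cmd]
    if x_next == target:
        return 0.95
    return 0.05 if x_next == "fty" else 0.0


def step(belief, likelihood, cmd):
    """Condition on an observation, report the leading mode, then expand."""
    conditioned = {x: belief[x] * likelihood[x] for x in MODES}
    leading = max(conditioned, key=conditioned.get)
    expanded = {
        x_next: sum(conditioned[x] * switch_transition(x, cmd, x_next)
                    for x in MODES)
        for x_next in MODES
    }
    return leading, expanded


prior = {"on": 0.475, "off": 0.475, "fty": 0.05}
no_evidence = {x: 1.0 for x in MODES}
leading, belief_next = step(prior, no_evidence, "on")
```

Because the expanded belief depends only on the conditioned belief and the command, earlier observations never need to be revisited, which is the Markov property the approach exploits.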

28
Example

Adapted from Jim Kurien's thesis
t0: Sw1.cmd = on
t1: Or.out = lo, Sw1.cmd = idl, Sw2.cmd = on
t2: Or.out = lo

[Figure: circuit with switches Sw1 and Sw2 (inputs labeled hi) connected to the Or-gate]
Switches more likely to fail than Or-Gate
29
Example

Switch Model

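[Figure: switch automaton with modes on, off, and fty. Commanded transitions (cmd=on, cmd=off, idl) succeed with probability 0.95; transitions into fty have probability 0.05; fty persists with probability 1.0. Mode constraints over terminals t1, t2: on allows (lo, lo) and (hi, hi); off allows all four combinations; fty: true]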
30
Example

Switch Model

x_t  cmd  x_{t+1}  f
on   on   on       0.95
on   off  off      0.95
on   idl  on       0.95
on   *    fty      0.05
off  on   on       0.95
off  off  off      0.95
off  idl  off      0.95
off  *    fty      0.05
fty  *    fty      1.0

x_t  t1  t2  f
on   lo  lo  1.0
on   hi  hi  1.0
off  *   *   1.0
fty  *   *   1.0
(* = any value)
31
Example

Or-Gate Model

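x_t  in1  in2  out  f
ok   lo   lo   lo   1.0
ok   lo   hi   hi   1.0
ok   hi   lo   hi   1.0
ok   hi   hi   hi   1.0
fty  *    *    *    1.0
(* = any value)

x_t  x_{t+1}  f
ok   ok       0.99
ok   fty      0.01
fty  fty      1.0

[Figure: Or-gate automaton with modes ok and fty; ok persists with probability 0.99 and moves to fty with probability 0.01; fty persists with probability 1.0 and has constraint true]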
32
Example

Initial belief state (chosen):
  p(Sw=on) = p(Sw=off) = 0.475, p(Sw=fty) = 0.05
  p(Or=ok) = 0.99, p(Or=fty) = 0.01
Observations/Commands:
  t0: Sw1.cmd = on
  t1: Or.out = lo, Sw1.cmd = idl, Sw2.cmd = on
  t2: Or.out = lo
Leading Solutions:
  t0: Sw1 = on/off, Sw2 = on/off, Or = ok
  t1: Sw1 = fty, Sw2 = off, Or = ok
  t2: Sw1 = on, Sw2 = on, Or = fty

33
Conclusion

SCSPs: elegant and general representation
ADDs: encoding of SCSPs, efficient in average case, exponential in the number of variables in worst case
Decomposition: factors problem into a set of ADDs, each confined to a small number of variables
The two methods complement each other well
How far can we get with this combination?
