Title: Fast Propositional Algorithms for Planning
1. Fast Propositional Algorithms for Planning
- Fast stochastic algorithms for propositional
satisfiability: GSAT and WSAT (WalkSAT).
- Compile a planning problem into a satisfiability
problem (an example of a constraint satisfaction
problem, or CSP), and solve it with a fast
satisfiability algorithm.
2. Review of Satisfiability
- A problem instance is a Boolean conjunctive
normal form (CNF) formula, that is, a conjunction
of propositional clauses, over some set X1, ..., Xn
of propositions.
- Goal is to find an assignment to the propositions
(variables) that satisfies the CNF formula.
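As a minimal sketch of what "satisfies the CNF formula" means, the check below evaluates an assignment against a formula. The integer encoding (+k for X_k, -k for its negation) is an assumption made for illustration; the slides do not fix a representation.

```python
# Assumed representation (not from the slides): a literal is a nonzero int,
# +k for X_k and -k for its negation; a clause is a list of literals.
def satisfies(cnf, assignment):
    """True if every clause contains at least one literal the assignment makes true."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

# (X1 or not X2) and (X2 or X3)
cnf = [[1, -2], [2, 3]]
print(satisfies(cnf, {1: True, 2: True, 3: False}))   # True
print(satisfies(cnf, {1: False, 2: True, 3: False}))  # False
```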
3. Satisfiability Review (Continued)
- Satisfiability is important for several reasons,
including:
  - It is at the foundation of NP-completeness.
  - It's the canonical example of constraint
  satisfaction problems (CSPs).
  - Many interesting tasks, including planning tasks,
  can be encoded as satisfiability problems.
4. Satisfiability (Continued)
- Broadly speaking, CSPs grow easier with more
variables but harder with more constraints. In the
case of satisfiability, each clause is a constraint.
- Kautz, Levesque, Mitchell, and Selman showed that
the critical measure of hardness of satisfiability
is the ratio of the number of clauses to the number
of variables.
5. Satisfiability (Continued)
- For a large ratio, it's almost always easy to
answer "no" quickly, and for a small ratio it's
almost always easy to answer "yes" quickly. There's
a relatively slim phase-transition region between
these extremes where most of the hard problems are
located.
- GSAT and WSAT were created (by subsets of the
preceding authors) to address these hard problems.
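To experiment with the clause-to-variable ratio, one can generate random instances at a chosen ratio. The sketch below assumes random 3-literal clauses and the signed-integer literal encoding; both are illustrative choices, not something the slides specify.

```python
import random

def random_3sat(num_vars, ratio, seed=None):
    """Random 3-CNF with round(ratio * num_vars) clauses over num_vars variables."""
    rng = random.Random(seed)
    cnf = []
    for _ in range(round(ratio * num_vars)):
        chosen = rng.sample(range(1, num_vars + 1), 3)  # three distinct variables
        cnf.append([v if rng.random() < 0.5 else -v for v in chosen])
    return cnf

few_clauses = random_3sat(50, 2.0, seed=0)   # small ratio
many_clauses = random_3sat(50, 8.0, seed=0)  # large ratio
print(len(few_clauses), len(many_clauses))   # 100 400
```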
6. GSAT
- Input: CNF formula and integers Max_Flips (e.g.
100) and Max_Climbs (e.g. 20).
- Output: Yes (satisfiable) or No (couldn't find a
satisfying assignment). Might also output the
best assignment found.
- Assignments are scored by the number of clauses
they satisfy.
- GSAT performs a (greedy) hill-climbing search
with random restarts (next slide).
7. GSAT Algorithm
- For i from 1 to Max_Climbs:
  - Randomly draw a truth assignment over the
  variables in the CNF formula (e.g. flip a coin
  for each variable to decide whether to make it 0
  or 1; in practice, use a pseudo-random number
  generator). If the assignment satisfies the
  formula, return Yes.
  - For j from 1 to Max_Flips:
    - For each variable, calculate the score of the
    truth assignment that results when we flip the
    value of that variable.
8. GSAT Algorithm (Continued)
    - Make the flip that yields the highest score
    (it need not be greater than or equal to the
    score of the previous assignment). If the new
    assignment satisfies the formula, return Yes.
- Return No (no satisfying assignment found,
although one might still exist).
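The loop structure above can be sketched as follows. This is an illustrative reading, not a reference implementation: the signed-integer literal encoding, the random tie-breaking among equally good flips, and returning the assignment (or None) instead of Yes/No are all assumptions.

```python
import random

def num_satisfied(cnf, a):
    """Score of an assignment: the number of clauses it satisfies."""
    return sum(any(a[abs(lit)] == (lit > 0) for lit in clause) for clause in cnf)

def flipped_score(cnf, a, v):
    """Score the assignment would have if variable v were flipped."""
    a[v] = not a[v]
    score = num_satisfied(cnf, a)
    a[v] = not a[v]
    return score

def gsat(cnf, max_climbs=20, max_flips=100, seed=None):
    rng = random.Random(seed)
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    for _ in range(max_climbs):                      # random restarts
        a = {v: rng.random() < 0.5 for v in variables}
        if num_satisfied(cnf, a) == len(cnf):
            return a
        for _ in range(max_flips):                   # greedy flips
            scores = {v: flipped_score(cnf, a, v) for v in variables}
            best = max(scores.values())              # may be a sideways or downward move
            v = rng.choice([u for u in variables if scores[u] == best])
            a[v] = not a[v]
            if best == len(cnf):
                return a
    return None                                      # "No": nothing found, not a proof

result = gsat([[1, -2], [2, 3], [-1, -3]], seed=0)
print(result)
```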
9. Key Points about GSAT
- Cannot tell us a formula is unsatisfiable (but we
can just run propositional resolution in
parallel).
- Random restarts help us find multiple local
optima; the hope is that one will be global.
- Sideways (or even downward) moves help us get
off a plateau and can bounce us off a local
optimum. This is a significant practical advance
over the standard greedy approach.
10. WalkSAT (WSAT)
- To further get around the problems of local
optima, we can occasionally choose to make a
random flip rather than a GSAT flip (as in a
random walk). WSAT differs from GSAT as follows:
  - One additional input: a probability p of a
  random move at any step.
  - A random move involves randomly choosing an
  unsatisfied clause, randomly choosing a variable
  in that clause, and flipping that variable in the
  assignment (even if the net result of the flip is
  a decrease in score).
11. WSAT (Continued)
  - For each move, draw a pseudo-random number
  between 0 and 1. If it is less than p, make a
  random move; otherwise, make a GSAT move.
- WSAT outperforms GSAT, GAs, and Simulated
Annealing on random trials.
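A minimal sketch of the WSAT move rule described above, under the same assumed signed-integer literal encoding. The parameter names, the default p, and the deterministic tie-breaking in the greedy move are illustrative choices.

```python
import random

def clause_sat(clause, a):
    return any(a[abs(lit)] == (lit > 0) for lit in clause)

def flipped_score(cnf, a, v):
    """Number of satisfied clauses if variable v were flipped."""
    a[v] = not a[v]
    score = sum(clause_sat(c, a) for c in cnf)
    a[v] = not a[v]
    return score

def wsat(cnf, p=0.5, max_climbs=20, max_flips=100, seed=None):
    rng = random.Random(seed)
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    for _ in range(max_climbs):
        a = {v: rng.random() < 0.5 for v in variables}
        for _ in range(max_flips):
            unsat = [c for c in cnf if not clause_sat(c, a)]
            if not unsat:
                return a
            if rng.random() < p:
                # Random move: a variable from a random unsatisfied clause.
                v = abs(rng.choice(rng.choice(unsat)))
            else:
                # GSAT move: the flip with the highest resulting score.
                v = max(variables, key=lambda u: flipped_score(cnf, a, u))
            a[v] = not a[v]
        if all(clause_sat(c, a) for c in cnf):       # check after the final flip
            return a
    return None

print(wsat([[1, -2], [2, 3], [-1, -3]], p=0.5, seed=1))
```

Setting p to 0 reduces this sketch to plain greedy flips, which makes the two behaviours easy to compare on the same formula.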
12. Davis-Putnam with RRR
- For a while, GSAT and WSAT displaced the old
standard deterministic algorithm, Davis-Putnam.
- Actually, what's called Davis-Putnam is really
Davis-Putnam-Logemann-Loveland (DPLL).
- Recently, it's been seen that the key to
GSAT/WSAT success is the random restart idea.
13. DPLL with RRR (Continued)
- In the last few years,
Davis-Putnam-Logemann-Loveland has been fitted
with rapid random restarts (RRR). The result often
outperforms WSAT and GSAT.
- DPLL is a backtrack search algorithm that uses
some heuristics. Different restarts involve
different choices at backtrack points.
14. DPLL(CNF formula f)
- If f is empty then return yes.
- Else if there is an empty clause in f then return
no.
- Else if there is a pure literal l in f then
return DPLL(f(l)), where f(l) is f simplified by
setting l to true.
- Else if there is a unit clause l in f then
return DPLL(f(l)).
- Else choose a variable v mentioned in f. If
DPLL(f(v)) = yes then return yes. Else return
DPLL(f(¬v)).
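The recursion above translates almost line for line into code. The sketch below assumes the signed-integer literal encoding (+k for a variable, -k for its negation) and picks the branching variable from the first clause, which is an arbitrary choice.

```python
def simplify(cnf, lit):
    """f(lit): drop clauses satisfied by lit and remove its negation elsewhere."""
    return [[l for l in clause if l != -lit] for clause in cnf if lit not in clause]

def dpll(cnf):
    if not cnf:                                   # f is empty
        return True
    if any(len(clause) == 0 for clause in cnf):   # f contains an empty clause
        return False
    literals = {l for clause in cnf for l in clause}
    for l in literals:
        if -l not in literals:                    # pure literal
            return dpll(simplify(cnf, l))
    for clause in cnf:
        if len(clause) == 1:                      # unit clause
            return dpll(simplify(cnf, clause[0]))
    v = abs(cnf[0][0])                            # branching variable (arbitrary)
    return dpll(simplify(cnf, v)) or dpll(simplify(cnf, -v))

print(dpll([[1, -2], [2, 3], [-1, -3]]))              # True
print(dpll([[1, 2], [-1, 2], [1, -2], [-1, -2]]))     # False
```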
15. DPLL with RRR
- Randomly select the variable and variable setting
at the choice point.
- Restart after a short period of time if a
solution has not been found.
- Avoids "heavy-tail" directions in the search
that will lead to very long run times.
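One way to sketch the restart idea: randomize each branching decision, and abandon a run once it has used up a small budget of choice points. Counting choice points as a stand-in for "a short period of time", the budget values, and the use of unit propagation only are assumptions made for this illustration.

```python
import random

class Cutoff(Exception):
    """Raised when a single run exceeds its budget of choice points."""

def simplify(cnf, lit):
    return [[l for l in clause if l != -lit] for clause in cnf if lit not in clause]

def dpll_random(cnf, rng, budget):
    if not cnf:
        return True
    if any(len(clause) == 0 for clause in cnf):
        return False
    for clause in cnf:
        if len(clause) == 1:                      # unit clause
            return dpll_random(simplify(cnf, clause[0]), rng, budget)
    if budget[0] == 0:
        raise Cutoff
    budget[0] -= 1
    v = rng.choice(sorted({abs(l) for clause in cnf for l in clause}))
    lit = v if rng.random() < 0.5 else -v         # random variable and setting
    return (dpll_random(simplify(cnf, lit), rng, budget)
            or dpll_random(simplify(cnf, -lit), rng, budget))

def dpll_rrr(cnf, cutoff=10, max_restarts=50, seed=None):
    """True/False if some run finishes within its budget, None if every run is cut off."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        try:
            return dpll_random(cnf, rng, [cutoff])
        except Cutoff:
            continue                              # restart with fresh random choices
    return None

print(dpll_rrr([[1, -2], [2, 3], [-1, -3]], seed=0))  # True
```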
16. Classical Planning Problem
- Input: descriptions of the current world state
(initial conditions), the agent's goal, and the
possible actions that can be performed.
- Output: a sequence of actions that, when executed
from the initial state, will result in a state in
which the goal is true.
17. Formal Language and Vocabulary
- Must choose a formal language (e.g. propositional
or first-order logic) in which to represent
states, goals, and actions. Also need a
vocabulary (e.g. choice of propositions or
predicate symbols, function symbols, etc.).
- Examples include propositional and first-order
STRIPS representations, situation calculus
representations, etc.
18. A Simple Classical Framework
- Propositional STRIPS: each action, or operator,
is characterized by preconditions and
postconditions (add list and delete list).
- Atomic time: time proceeds in discrete steps.
- Omniscient agent: no probabilities on world
states; states are completely specified.
- Deterministic effects: no probabilities on
postconditions.
19. Classical Framework (Continued)
- Conjunctive goals.
- Conjunctive preconditions.
- Later we will discuss relaxing the constraints of
the propositional representation, conjunctive
goals, and conjunctive preconditions.
20. GRAPHPLAN at a High Level
- Graph-expansion phase: extend a planning graph
forward in time until a necessary (though not
sufficient) condition for plan existence has been
achieved.
- Solution-extraction phase: search the resulting
graph for a correct plan.
- If no plan is found, then repeat the two phases
through more time steps.
21. Planning Graph
- Two types of nodes: propositions and actions.
- Nodes are partitioned into levels labeled 0 to n
for some natural number n.
- Nodes at even-numbered levels are labeled by
propositions, and nodes at odd-numbered levels
are labeled by actions.
23. Planning Graph (Continued)
- An odd-numbered level contains one node for each
action whose preconditions are present at the
previous level, and that level contains no other
actions.
- An edge exists between a proposition p at level i
and an action a at level i+1 if and only if p is
a precondition for a.
24. Planning Graph (Continued)
- An action node at level i has an edge to a
proposition node at level i+1 if and only if the
action has the effect of making the proposition
true.
- The only other ordinary edges in the graph are as
follows: for any proposition p at level i, if p
remains true when no action is taken, then there
is an edge from p at level i to p at level i+2.
25. Planning Graph Represents Parallel Actions
- A planning graph with k action levels can
represent a plan with more than k actions.
- That two actions appear at the same level does
not imply that both can be executed at once.
- Whether two actions can be executed at once is
captured by a relation called mutually exclusive
(mutex), defined next.
26. The Mutex Relation
- A mutex relation may hold between two actions or
two propositions at some level.
- Two actions at level i are mutex if any of the
following holds:
  - the effect of one action is the negation of
  another action's effect (inconsistent effects);
27. Mutex (Continued)
  - one action deletes the precondition of another
  (interference);
  - the actions have preconditions that are mutually
  exclusive at level i-1 (competing needs).
28. Mutex Relation (Continued)
- Two propositions at level i are mutex if either:
  - one is the negation of the other, or
  - all ways of achieving the propositions (that is,
  actions at level i-1) are pairwise mutex
  (inconsistent support).
29. Mutex Relation (Continued)
- Maintenance of a proposition p from propositional
level i-1 to propositional level i+1 is also
considered an action at level i (although it is
not represented by a node at level i, but simply
by an edge from p at level i-1 to p at level i+1).
- An action a at level i is mutex with the
persistence of p from level i-1 to level i+1 if a
makes p false (inconsistent effects).
31. An Example
- Propositions:
  - garb: garbage is in the house
  - dinner: dinner is prepared
  - present: present is wrapped
  - cleanH: hands are clean
  - quiet: house is quiet
32. Example (Continued)
- Goal: dinner, present, ¬garb
- Initial State: garb, cleanH, quiet
- Actions:
  - cook: requires cleanH, achieves dinner
  - wrap: requires quiet, produces present
  - carry: achieves ¬garb, deletes cleanH
  - dolly: achieves ¬garb, deletes quiet
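The example can be written down as propositional STRIPS operators and executed directly. This sketch assumes that carry and dolly remove garb (the goal being a house without garbage) and represents a state as the set of propositions that are true; the `Action` type and `step` function are illustrative names.

```python
from typing import NamedTuple

class Action(NamedTuple):
    name: str
    pre: frozenset      # preconditions
    add: frozenset      # add list
    delete: frozenset   # delete list

def step(state, action):
    """Apply one action to a state (the set of true propositions)."""
    if not action.pre <= state:
        raise ValueError(f"{action.name}: preconditions not met")
    return (state - action.delete) | action.add

cook = Action("cook", frozenset({"cleanH"}), frozenset({"dinner"}), frozenset())
wrap = Action("wrap", frozenset({"quiet"}), frozenset({"present"}), frozenset())
carry = Action("carry", frozenset(), frozenset(), frozenset({"garb", "cleanH"}))
dolly = Action("dolly", frozenset(), frozenset(), frozenset({"garb", "quiet"}))

state = frozenset({"garb", "cleanH", "quiet"})   # initial state
for action in (cook, wrap, carry):               # one candidate plan
    state = step(state, action)

goal_reached = {"dinner", "present"} <= state and "garb" not in state
print(sorted(state), goal_reached)
```

Ordering matters here: carry deletes cleanH, so running it before cook would leave cook's precondition unmet.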
33. Example (Continued)
- Inferred mutex relations:
  - carry and the maintenance of garb are mutex
  because carry deletes garb.
  - dolly and wrap are mutex because dolly deletes
  quiet, which is a precondition for wrap.
  - At proposition level 2, ¬quiet is mutex with
  present because of inconsistent support.
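The action mutexes at level 1 can be checked mechanically from the add and delete lists. The sketch below covers inconsistent effects and interference only (competing needs is omitted, since no propositions are mutex at level 0), assumes carry and dolly delete garb, and models the maintenance of garb as a hypothetical `keep_garb` action.

```python
from itertools import combinations

# name: (preconditions, add list, delete list)
ACTIONS = {
    "cook": ({"cleanH"}, {"dinner"}, set()),
    "wrap": ({"quiet"}, {"present"}, set()),
    "carry": (set(), set(), {"garb", "cleanH"}),
    "dolly": (set(), set(), {"garb", "quiet"}),
    "keep_garb": ({"garb"}, {"garb"}, set()),   # maintenance action for garb
}

def mutex(a, b):
    pre_a, add_a, del_a = ACTIONS[a]
    pre_b, add_b, del_b = ACTIONS[b]
    inconsistent_effects = bool(add_a & del_b or add_b & del_a)
    interference = bool(del_a & pre_b or del_b & pre_a)
    return inconsistent_effects or interference

mutex_pairs = {frozenset(pair) for pair in combinations(ACTIONS, 2) if mutex(*pair)}
for pair in sorted(sorted(p) for p in mutex_pairs):
    print(pair)
```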
36. Solution Extraction
- Suppose the goal has n conjuncts.
- A plan might exist if GRAPHPLAN has proceeded to
some propositional level at which all the goal
propositions are present and no pair of these is
mutex. (This condition is necessary but not
sufficient.)
- Must attempt to extract a solution from the
graph, that is, test whether a solution is
embedded in the graph.
37. Solution Extraction (Continued)
- Original method is a backtracking search
(depth-first search where state transitions
consist of choosing a next action).
38. Backtrack Algorithm for Solution Extraction
- Suppose i is the last level in the planning graph
(we assume i is a propositional level). The goal
at level i is the goal for the plan.
- For each propositional level from i down to 0:
  - For each proposition (say, p) that appears as a
  conjunct of the goal:
    - Choose one of the actions a that makes p true
    (could be a maintenance action) and that is not
    mutex with any of the actions chosen so far at
    this level.
39. Backtracking Solution Extraction Algorithm
(Continued)
    - If no such action exists, backtrack (try another
    alternative for the previous choice). If no
    previous choices were made, FAIL.
  - If the current level i is greater than 0, then
  take the union of the preconditions for the
  actions chosen at this level i, and set these to
  be the conjuncts of the goal for level i-2.
  Otherwise, return the plan (reverse the order of
  the sequence of selected actions).
40. Putting it all Together
- The Backtracking Solution Extraction Algorithm
succeeds if and only if there exists a plan
within the planning graph.
- If no plan is found, then extend the planning
graph with additional levels.
41. Example (Continued from Earlier)
- There exists no plan in the planning graph to
level 2 for our example, because of the mutex
relations between the propositions of our goal.
- At level 4 several plans exist. Note that the
propositions at level 4 are the same as at level
2, but there are fewer mutex relations (because we
can use maintenance actions for propositions
achieved at level 2).
43. Using Fast Satisfiability Algorithms for Planning
- Fast stochastic algorithms for propositional
satisfiability: GSAT and WSAT (WalkSAT).
- Compile a planning problem into a satisfiability
problem (an example of a constraint satisfaction
problem, or CSP), and solve it with a fast
satisfiability algorithm.
44. SATPLAN
- Compile a planning problem into a satisfiability
problem.
- Use GSAT (or WSAT) to solve the satisfiability
problem. A satisfying assignment encodes a plan.
- We'll see later that we also can merge GRAPHPLAN
and SATPLAN.
45. SATPLAN (Continued)
- As we might expect, we need to encode the initial
state, the goal, and the available actions.
- Included among the actions are the maintenance
actions (must write frame axioms).
- At the end, we will discuss encoding
non-propositional planning tasks.
46. A Subtle Point
- We still will use the idea of proposition and
action levels, but for now we will assume only
one action occurs per level.
- For now we will consider using SAT-based planning
alone, without GRAPHPLAN.
- Afterward, we will discuss merging the two.
47. Compiling Planning to SAT
- INIT: the initial state is specified by a set of
single-literal (empty-body) clauses. For
example, the initial state from our earlier
example would be specified by the clauses garb-0,
cleanH-0, quiet-0, ¬dinner-0, and ¬present-0.
- GOAL: to test for a plan of length at most n,
each goal conjunct is asserted to be true at
level 2n.
48. Compilation (Continued)
- For the goal in our example, if we want to test
whether it is true at time 1, we would add the
following single-literal (empty-body) clauses:
¬garb-2, dinner-2, and present-2.
- ACTIONS: actions imply both their preconditions
and effects. Thus among the clauses we would add
for our preceding example would be (cook-1 →
cleanH-0) as well as (cook-1 → dinner-2).
49. Compiling (Continued)
- EXCLUSION: axioms saying at most one action
occurs at an action level (can relax). For all
pairs of distinct actions a and b, add
(¬a-i ∨ ¬b-i).
- FRAME: also must encode some type of frame axioms
(maintenance actions). We'll spend several
slides on this because it is more complicated and
two options exist.
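The ACTIONS and EXCLUSION axioms described above can be generated mechanically from the operator descriptions. In this sketch a literal is a hypothetical triple (name, level, polarity), carry and dolly are assumed to delete garb, and each implication a → q is written as the clause (¬a ∨ q).

```python
from itertools import combinations

# name: (preconditions, add list, delete list)
ACTIONS = {
    "cook": ({"cleanH"}, {"dinner"}, set()),
    "wrap": ({"quiet"}, {"present"}, set()),
    "carry": (set(), set(), {"garb", "cleanH"}),
    "dolly": (set(), set(), {"garb", "quiet"}),
}

def action_axioms(i):
    """Action at level i implies its preconditions at i-1 and its effects at i+1."""
    clauses = []
    for a, (pre, add, delete) in ACTIONS.items():
        not_a = (a, i, False)
        clauses += [[not_a, (p, i - 1, True)] for p in sorted(pre)]
        clauses += [[not_a, (p, i + 1, True)] for p in sorted(add)]
        clauses += [[not_a, (p, i + 1, False)] for p in sorted(delete)]
    return clauses

def exclusion_axioms(i):
    """At most one action at level i: (not a) or (not b) for every pair."""
    return [[(a, i, False), (b, i, False)] for a, b in combinations(ACTIONS, 2)]

for clause in action_axioms(1)[:2]:
    print(clause)   # the two cook-1 clauses
print(len(exclusion_axioms(1)))   # 6
```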
50. Two Types of Frame Encodings
- Classical frame axioms + at-least-one axioms:
classical frame axioms say which propositions are
left unchanged by a given action, and
at-least-one axioms enforce that some action
occurs at each action level.
- Explanatory frame axioms: enumerate the set of
actions that could have occurred to account for
some state change.
51. Classical Frame Axioms
- In our previous example, we would specify that if
the garbage was in the house at level 0, and our
action at level 1 was cook, then garbage is still
in the house at level 2:
(garb-0 ∧ cook-1 → garb-2).
- In general, for each action a and each
proposition p that a leaves unchanged, we have
(p-(i-1) ∧ a-i → p-(i+1)).
52. At-Least-One Axioms
- But if no action occurs at an action level, we
will lose all our propositions from the previous
level. Therefore, we add axioms that specify an
action must occur at each level.
- For each action level i, we have a disjunction of
all possible actions, e.g., (cook-i ∨ wrap-i ∨
dolly-i ∨ carry-i).
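A minimal sketch of generating the classical frame axioms and the at-least-one axiom for one action level. The (name, level, polarity) literal triples are a hypothetical encoding, carry and dolly are assumed to delete garb, and only the positive form stated on the slides is generated; a counterpart for propositions that stay false would be built the same way.

```python
PROPS = ["garb", "dinner", "present", "cleanH", "quiet"]

# name: (preconditions, add list, delete list)
ACTIONS = {
    "cook": ({"cleanH"}, {"dinner"}, set()),
    "wrap": ({"quiet"}, {"present"}, set()),
    "carry": (set(), set(), {"garb", "cleanH"}),
    "dolly": (set(), set(), {"garb", "quiet"}),
}

def classical_frame_axioms(i):
    """(p at i-1) and (a at i) imply (p at i+1), for each p that a leaves unchanged."""
    clauses = []
    for a, (_pre, add, delete) in ACTIONS.items():
        for p in PROPS:
            if p not in add and p not in delete:
                # Clause form: (not p-(i-1)) or (not a-i) or p-(i+1)
                clauses.append([(p, i - 1, False), (a, i, False), (p, i + 1, True)])
    return clauses

def at_least_one_axiom(i):
    """Single clause: the disjunction of all actions at level i."""
    return [(a, i, True) for a in ACTIONS]

print(len(classical_frame_axioms(1)))   # 14
print(at_least_one_axiom(1))
```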
53. Explanatory Frame Axioms
- If garbage was in the house at level 0 but is not
in the house at level 2, then one of the actions
that removes garbage must have occurred at level
1: (garb-0 ∧ ¬garb-2 → carry-1 ∨ dolly-1).
- We do not need at-least-one axioms, but we do
still need exclusion axioms.
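Explanatory frame axioms can be generated per proposition rather than per action. The sketch below assumes the same hypothetical (name, level, polarity) literal triples and that carry and dolly delete garb; each change in a proposition's truth value must be explained by some action that adds or deletes it.

```python
PROPS = ["garb", "dinner", "present", "cleanH", "quiet"]

# name: (preconditions, add list, delete list)
ACTIONS = {
    "cook": ({"cleanH"}, {"dinner"}, set()),
    "wrap": ({"quiet"}, {"present"}, set()),
    "carry": (set(), set(), {"garb", "cleanH"}),
    "dolly": (set(), set(), {"garb", "quiet"}),
}

def explanatory_frame_axioms(i):
    clauses = []
    for p in PROPS:
        deleters = [(a, i, True) for a, (_pre, _add, delete) in ACTIONS.items() if p in delete]
        adders = [(a, i, True) for a, (_pre, add, _delete) in ACTIONS.items() if p in add]
        # p-(i-1) and not p-(i+1) imply that some deleter of p occurred at level i.
        clauses.append([(p, i - 1, False), (p, i + 1, True)] + deleters)
        # not p-(i-1) and p-(i+1) imply that some adder of p occurred at level i.
        clauses.append([(p, i - 1, True), (p, i + 1, False)] + adders)
    return clauses

print(explanatory_frame_axioms(1)[0])   # the garb clause from the slide
```

When no action adds (or deletes) a proposition, the clause has no action literals and simply forces the proposition to persist.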
54. Linking GRAPHPLAN and SATPLAN
- Build the planning graph as usual, and then
convert the planning graph (a partially solved and
hence simpler task) into a CNF formula.
- INIT and GOAL axioms are as before.
- Actions imply their preconditions (we use our
ACTION axioms without the implication of effects).
55. Linking (Continued)
- (Almost) explanatory frame axioms: each fact at a
propositional level implies the disjunction of
all actions that could have caused it, including
explicit maintenance actions. For example, if
garbage is not in the house at level 4, then
either dolly or carry occurred at level 3, or the
absence of garbage was maintained from level 2.
56. Linking (Continued)
- Specialized exclusion axioms: instead of saying
no two actions can occur at the same action
level, we simply say that conflicting (mutex)
actions cannot occur at the same level.
- GSAT or WSAT can then be used to more efficiently
search the planning graph (so represented) for a
plan.
57. Relaxing the Restriction to Propositional Logic
- Suppose we have a first-order representation,
such as the standard STRIPS representation for
the blocks world.
- Neither GRAPHPLAN nor SATPLAN (nor their
combination) as described so far can be applied,
because they assume a propositional
representation.
- Solution: convert to propositional.
58. Methods of Conversion
- Convert each pair of a ground atom (member of the
Herbrand base) and a level to a distinct
proposition. For example, each of the following
becomes a proposition:
  - ontop(a,b) at level 0
  - ontop(b,a) at level 0
  - clear(a) at level 0
  - clear(b) at level 0
59. Conversion (Continued)
  - ontop(a,b) at level 2
  - unstack(a,b) at level 1
  - unstack(b,a) at level 1
  - stack(a,b) at level 1
  - pickup(a) at level 1
  - etc.
60. Conversion (Continued)
- The problem with the preceding method is that the
number of propositions grows exponentially with
the predicate arity.
- Alternative: break the representation of each
first-order ground atom into parts (e.g.,
arguments or bits), all of which have to be true
for the atom to be construed as true. Options:
61. Conversion (Continued)
  - One distinct proposition for each argument of
  each ground atom.
  - One distinct proposition for each argument of a
  given predicate (some ground atoms could share
  some propositions).
  - Number all the ground atoms, and have one
  proposition for each bit in the binary
  representation of the atom's number.
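The bit-based option can be sketched as follows. The three block names, the ontop predicate, and the `bit0`, `bit1`, ... proposition names are hypothetical; the point is that N numbered ground atoms need only about log2(N) bit propositions, with each atom identified by a conjunction of signed bits.

```python
from itertools import permutations

blocks = "abc"   # hypothetical block names
atoms = [f"ontop({x},{y})" for x, y in permutations(blocks, 2)]

# Number of bits needed to number the atoms 0 .. len(atoms)-1.
num_bits = max(1, (len(atoms) - 1).bit_length())

def bit_literals(index):
    """Signed bit propositions that must all hold for atom number `index`."""
    return [(f"bit{k}", bool((index >> k) & 1)) for k in range(num_bits)]

print(len(atoms), num_bits)        # 6 3
for index, atom in enumerate(atoms):
    print(atom, bit_literals(index))
```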
62. SATPLAN Example
63. Example Conversion
- INIT:
  - x1: ontable(a,0)
  - x2: ontable(b,0)
  - x3: ontop(c,a,0)
  - x4: clear(c,0)
  - x5: clear(b,0)
  - x6: handempty(0)
64. Example (Continued)
- GOAL:
  - x7: clear(a,2)
  - x8: clear(c,2)
65. Example (Continued)
- ACTIONS:
  - pickup(a,1) → ontable(a,0) ∧ clear(a,0) ∧
  handempty(0)
  - Must convert into clauses:
    - ¬pickup(a,1) ∨ ontable(a,0)
    - ¬pickup(a,1) ∨ clear(a,0)
    - ¬pickup(a,1) ∨ handempty(0)
  - Similarly for pickup(b,1) and pickup(c,1)
66. Example (Continued)
- ACTIONS (Continued):
  - putdown(a,1) → holding(a,0)
  - Must convert into clauses:
    - ¬putdown(a,1) ∨ holding(a,0)
  - Similarly for putdown(b,1) and putdown(c,1)
67. Example (Continued)
- ACTIONS (Continued):
  - stack(a,b,1) → clear(b,0) ∧ holding(a,0)
  - Must convert into clauses:
    - ¬stack(a,b,1) ∨ clear(b,0)
    - ¬stack(a,b,1) ∨ holding(a,0)
  - Similarly for other instantiations of the stack
  operator and for other action levels.
68. Example (Continued)
- ACTIONS (Continued):
  - unstack(a,b,1) → ontop(a,b,0) ∧ handempty(0) ∧
  clear(a,0)
  - Must convert into clauses:
    - ¬unstack(a,b,1) ∨ ontop(a,b,0)
    - ¬unstack(a,b,1) ∨ handempty(0)
    - ¬unstack(a,b,1) ∨ clear(a,0)
  - Similarly for other instantiations of the unstack
  operator and for other action levels.
69. Example (Continued)
- FRAME AXIOMS (Explanatory):
  - ¬clear(a,0) ∧ clear(a,2) →
    unstack(b,a,1)
    ∨ unstack(c,a,1)
    ∨ putdown(c,1).
  - Convert to clauses by generalization of the rule
  (a ∧ b) ∨ (c ∧ d) ≡ (a ∨ c) ∧ (a ∨ d) ∧ (b ∨ c) ∧
  (b ∨ d). Must build frame axioms for all action
  instances.
  - Also need clear(a,0) ∧ ¬clear(a,2) → ...
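The distribution rule generalizes to any disjunction of conjunctions: the resulting clauses are exactly the ways of picking one literal from each conjunct. A minimal sketch, with literals as plain strings (an illustrative choice):

```python
from itertools import product

def dnf_to_cnf(dnf):
    """Distribute OR over AND: one clause per choice of one literal from each conjunct."""
    return [list(choice) for choice in product(*dnf)]

# (a and b) or (c and d)
print(dnf_to_cnf([["a", "b"], ["c", "d"]]))
# [['a', 'c'], ['a', 'd'], ['b', 'c'], ['b', 'd']]
```

The number of clauses is the product of the conjunct sizes, so this conversion can grow quickly when many multi-literal conjuncts are involved.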
70. Example (Continued)
- FRAME AXIOMS (Explanatory):
  - clear(a,0) ∧ ¬clear(a,2) →
    stack(b,a,1)
    ∨ stack(c,a,1)
    ∨ pickup(c,1).
  - Must repeat these for all other time steps, and
  must also do explanatory frame axioms for all
  other propositions besides those based on clear.
71. Example (Continued)
- EXCLUSION AXIOMS: for each pair of action
instances a and b and each action level i, add
(¬a-i ∨ ¬b-i).
- For example, we would add (among others):
  - ¬stack(a,b,1) ∨ ¬pickup(c,1)
  - ¬unstack(a,b,1) ∨ ¬unstack(b,a,1)
  - etc.
72. Example of the Benefit of Action Splitting
- With just 10 blocks, we will require roughly
8,000 axioms of the form ¬stack(a,b,1) ∨
¬stack(c,d,1). To see this, note that 90 ground
stack literals can be built given 10 blocks, and
therefore 90 × 89 = 8,010 ordered pairs of stack
literals can be built.
- With splitting instead, we require only 180
literals (10 × 9 for the first argument, and 10 × 9
for the second).
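The counts on this slide can be reproduced by enumeration. The sketch assumes that stack(x, y) requires two distinct blocks and counts ordered pairs of stack atoms, mirroring the slide's own arithmetic.

```python
from itertools import permutations

blocks = range(10)
stack_atoms = list(permutations(blocks, 2))            # stack(x, y) with x != y
ordered_pairs = len(stack_atoms) * (len(stack_atoms) - 1)
split_literals = 2 * len(stack_atoms)                  # 10 * 9 per argument, as on the slide

print(len(stack_atoms), ordered_pairs, split_literals)  # 90 8010 180
```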
73. SATPLAN
- Compile a planning problem into a propositional
satisfiability problem.
- Use a fast satisfiability algorithm to solve the
satisfiability problem. A satisfying assignment
encodes a plan.
- We also can merge GRAPHPLAN and SATPLAN.