Title: Planning with Local Search
1Planning with Local Search
- MERS Seminar Lecture
- March 6, 2003
- Jonathan Kennell
2Presentation Outline
- Planning Overview
- What is planning? 5 mins.
- Taxonomy of planners 40 mins. (or everything you ever wanted to know about planning in approximately 40 minutes)
- 5 minute break
- LPG
- Background information (WalkSAT) 10 mins.
- Linear action graphs and precedence graphs 10 mins.
- WalkPlan planning algorithm 10 mins.
- Example 10 mins.
3What is Planning?
- Input
- Set of world-states
- Action operators (fn: world-state → world-state)
- Initial world-state
- Goal (possibly a partial state / set of world-states)
- Output
- Ordering of actions
From 6.834J POP lecture
4World State
- Set of facts and their degree of truth
- Examples
- (Student Jonathan) // true
- (Likes Jonathan Golf) // false
- (Graduating Jonathan June) // unknown
- Note: Lisp notation is used extensively in the planning community
- Most planners don't consider unknown facts
5Planning Operators
- Fn: world-state → world-state
- Generally use STRIPS format
- Preconditions: facts that must be true before the action can occur
- Effects: facts that become true (or false) after the action occurs
- Extra properties
- Separate start / invariant / end conditions and effects
- Durations
- Resource constraints
(action Move
  (params ((robot ?r) (location ?a) (location ?b)))
  (preconds (at ?r ?a))
  (effects (and (not (at ?r ?a)) (at ?r ?b))))
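The operator-as-function idea can be sketched in Python. This is only an illustration: the `Operator` tuple and the `move`, `applicable`, and `apply` helpers are hypothetical names, a world-state is modeled as a frozenset of true facts, and unknown facts are not represented.

```python
from typing import NamedTuple

class Operator(NamedTuple):
    name: str
    preconds: frozenset  # facts that must hold before the action
    add: frozenset       # facts that become true
    delete: frozenset    # facts that become false

def move(r: str, a: str, b: str) -> Operator:
    """Ground instance of the Move operator: robot r goes from a to b."""
    return Operator(f"Move({r},{a},{b})",
                    preconds=frozenset({("at", r, a)}),
                    add=frozenset({("at", r, b)}),
                    delete=frozenset({("at", r, a)}))

def applicable(state: frozenset, op: Operator) -> bool:
    return op.preconds <= state

def apply(state: frozenset, op: Operator) -> frozenset:
    """Return the successor world-state; the input state is not modified."""
    if not applicable(state, op):
        raise ValueError(f"{op.name}: preconditions not satisfied")
    return (state - op.delete) | op.add

state = frozenset({("at", "robot1", "roomA")})
next_state = apply(state, move("robot1", "roomA", "roomB"))
```

Here `next_state` contains `("at", "robot1", "roomB")` and no longer contains the roomA fact, mirroring the `(not (at ?r ?a))` and `(at ?r ?b)` effects.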
6Mutual Exclusion
- Sometimes planning operators conflict with each other; we call a pair of conflicting operators mutex
- Examples of mutex actions
- Interference: A deletes a precondition or effect of B
- Competing Needs: A and B have mutex preconditions
- Planner must ensure no mutex actions co-occur.
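The two mutex tests above can be sketched as set operations. This is a minimal sketch under assumptions: the `Operator` tuple and all action names are hypothetical, and the set of mutex fact pairs (`mutex_facts`) is assumed to be supplied by the caller rather than computed.

```python
from typing import NamedTuple

class Operator(NamedTuple):
    name: str
    preconds: frozenset
    add: frozenset
    delete: frozenset

def interferes(a: Operator, b: Operator) -> bool:
    """True if either action deletes a precondition or effect of the other."""
    return bool(a.delete & (b.preconds | b.add)) or \
           bool(b.delete & (a.preconds | a.add))

def competing_needs(a: Operator, b: Operator, mutex_facts) -> bool:
    """True if some precondition of a is mutex with some precondition of b."""
    return any(frozenset({p, q}) in mutex_facts
               for p in a.preconds for q in b.preconds)

def mutex(a: Operator, b: Operator, mutex_facts=frozenset()) -> bool:
    return interferes(a, b) or competing_needs(a, b, mutex_facts)

# Hypothetical actions for illustration only.
go_b = Operator("go-b", frozenset({"at-a"}), frozenset({"at-b"}), frozenset({"at-a"}))
go_c = Operator("go-c", frozenset({"at-a"}), frozenset({"at-c"}), frozenset({"at-a"}))
load = Operator("load", frozenset({"at-a"}), frozenset({"loaded"}), frozenset())
keep = Operator("noop-loaded", frozenset({"loaded"}), frozenset({"loaded"}), frozenset())
```

With these definitions, `go_b` and `go_c` interfere (each deletes the other's precondition), while `load` and `keep` are mutex only if the caller declares their preconditions mutex.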
7What is a plan?
- A plan is an ordering of actions that will
transition the system from the initial state to
the goal state.
8Completeness / Consistency / Minimality
- Complete Plan
- A plan is complete IFF every precondition of every activity is achieved.
- An activity's precondition is achieved IFF
- The precondition is the effect of a preceding activity (support), and
- No intervening step conflicts with the precondition (mutex).
- Consistent Plan
- The plan is consistent IFF the temporal constraints of its activities are consistent (the associated distance graph has no negative cycles), and
- no conflicting (mutex) activities can co-occur.
- Minimal Plan
- The plan is minimal IFF every constraint serves a purpose, i.e.,
- If we remove any temporal or symbolic constraint from a minimal plan, the new plan is not equivalent to the original plan
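The negative-cycle test for temporal consistency can be sketched with Bellman-Ford. This is a sketch under assumptions, not any planner's actual code: each constraint is a hypothetical tuple `(i, j, lower, upper)` meaning lower <= t_j - t_i <= upper, and events are numbered from 0.

```python
def consistent(n_events, constraints):
    """Return True iff the distance graph of the constraints has no negative cycle."""
    edges = []
    for i, j, lower, upper in constraints:
        edges.append((i, j, upper))    # t_j - t_i <= upper
        edges.append((j, i, -lower))   # t_i - t_j <= -lower
    # Starting every event at 0 acts like a virtual source connected to all events.
    dist = [0.0] * n_events
    for _ in range(n_events - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # If any edge can still be relaxed, a negative cycle exists.
    return all(dist[u] + w >= dist[v] for u, v, w in edges)

ok = consistent(3, [(0, 1, 5, 10), (1, 2, 5, 10), (0, 2, 0, 15)])
bad = consistent(3, [(0, 1, 5, 10), (1, 2, 5, 10), (0, 2, 0, 8)])
```

In the second call the two activities need at least 10 time units in total, but the overall bound allows only 8, so the distance graph contains a negative cycle and the plan is inconsistent.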
9Variations on Classical Planning
- Temporal planning
- Actions have durations
- Planning with resources
- Facts can be quantified
- Planning with uncertainty
- Effects / durations of actions not guaranteed
10Taxonomy of Planners
Planners
11Forward Chaining / Backward Propagation
- Searches through the entire plan-space by non-deterministically adding actions to plan candidates.
- Advantages
- generative (does not require strategies)
- expressive (can handle time, resources, easily)
- Disadvantages
- Inherently slow (plan-space is enormous)
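A minimal sketch of forward chaining, assuming a breadth-first traversal stands in for the non-deterministic choice of the next action. The function name, the tuple layout of operators, and the two-step move domain are all hypothetical.

```python
from collections import deque

def forward_search(init, goal, operators):
    """Breadth-first forward chaining over world-states.
    operators: list of (name, preconds, add, delete) with set-valued fields.
    Returns a shortest list of action names, or None if no plan exists."""
    start = frozenset(init)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:
            return plan
        for name, pre, add, delete in operators:
            if pre <= state:                       # action is applicable
                nxt = frozenset((state - delete) | add)
                if nxt not in seen:                # avoid revisiting states
                    seen.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None

ops = [("move-a-b", {"at-a"}, {"at-b"}, {"at-a"}),
       ("move-b-c", {"at-b"}, {"at-c"}, {"at-b"})]
plan = forward_search({"at-a"}, {"at-c"}, ops)
```

The `seen` set is the only pruning here, which is why uninformed forward chaining scales poorly as the number of facts and actions grows.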
12Forward Chaining Example
Familiar tradeoff: efficient pruning methods versus optimality.
13Case Study TLPlan
- TLPlan (Temporal Logic Planner) by Fahiem Bacchus and Froduald Kabanza
- TLPlan is based on a forward-chaining planner
- TLPlan uses domain-dependent temporal logic to prune the search space
14TLPlan First-order Temporal Logic
- Definition: first-order linear temporal logic
- standard first-order logic, plus
- U (until), □ (always), ◇ (eventually), ○ (next)
- Bounded quantifiers
- ∀[x:γ] φ ≡ ∀x . γ(x) → φ(x)
- ∃[x:γ] φ ≡ ∃x . γ(x) ∧ φ(x)
- Example
- □(on(B,C) → (on(B,C) U on(A,B)))
- Asserts that whenever we enter a state in which B is on C, it remains on C until A is on B
15TLPlan Formula Progression Algorithm
- The Progress algorithm is used to check control strategies as the system searches for a plan.
- Inputs: an LTL formula f and a world w (generated by forward-chaining)
- Output: a new formula f', also expressed as an LTL formula, representing the progression of f through the world w.
- Algorithm Progress(f,w)
- Case
- f is atomic: if w entails f, f' = TRUE, else f' = FALSE
- f = f1 ∧ f2: f' = Progress(f1,w) ∧ Progress(f2,w)
- f = ¬f1: f' = ¬Progress(f1,w)
- etc. (see paper for complete algorithm)
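A propositional sketch of formula progression, assuming formulas are nested tuples and a world is a set of true atoms. The atomic, conjunction, and negation cases follow the slide. The remaining cases use the usual LTL progression rules and are an assumption here, not taken from the slide; quantifiers are omitted entirely.

```python
def neg(a):
    return (not a) if isinstance(a, bool) else ("not", a)

def conj(a, b):
    if a is False or b is False:
        return False
    if a is True:
        return b
    return a if b is True else ("and", a, b)

def disj(a, b):
    if a is True or b is True:
        return True
    if a is False:
        return b
    return a if b is False else ("or", a, b)

def progress(f, w):
    """Progress formula f through world w and return the formula for the next state."""
    if isinstance(f, bool):
        return f
    op = f[0]
    if op == "atom":
        return f[1] in w
    if op == "not":
        return neg(progress(f[1], w))
    if op == "and":
        return conj(progress(f[1], w), progress(f[2], w))
    if op == "or":
        return disj(progress(f[1], w), progress(f[2], w))
    if op == "next":
        return f[1]
    if op == "always":
        return conj(progress(f[1], w), f)
    if op == "eventually":
        return disj(progress(f[1], w), f)
    if op == "until":
        return disj(progress(f[2], w), conj(progress(f[1], w), f))
    raise ValueError(f"unknown operator: {op}")

# always(on(B,C) -> (on(B,C) until on(A,B))), with the implication written as (not p) or q
bc, ab = ("atom", "on(B,C)"), ("atom", "on(A,B)")
rule = ("always", ("or", ("not", bc), ("until", bc, ab)))
after_first = progress(rule, {"on(B,C)"})
```

Progressing `after_first` through a world where B has left C and A is not on B yields `False`, which is the signal to prune that search thread; progressing it through a world containing both facts returns the original rule.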
16TLPlan Example
[Figure: forward-chaining search tree pruned by control rules; labels: "Forward chaining begins", "Rules", "(Any color)", "Etc."]
This thread is efficiently guided by the rules.
This thread is not guided well since no rules apply. This results in pure forward-chaining search.
17TLPlan Review
- TLPlan has been around in various implementations since 1995, although improvements have been made as recently as last year.
- TLPlan functions initially as a forward-chaining planner, but can use logical rules to guide its search and prune infeasible threads.
- TLPlan was the fastest domain-specific planner in the 2002 AIPS competition.
18Case Study TLPlan
- TLPlan is a temporal forward chaining planner that uses temporal logic to help prune the search space.
- Temporal logic is an extension of normal first-order logic that includes durative constructs such as "always", "eventually", etc.
- Goals are specified using temporal logic to require not only goal conditions, but goal sequences.
- Tells the planner not only what to achieve, but how to achieve it
- Any candidate plan that is inconsistent with the temporal logic constraints is pruned.
19Domain Knowledge
- Planning is hard: the most general planners are extremely slow
- To increase speed, some planners sacrifice generality by using domain-specific strategies.
- TLPlan encodes the strategy into the goal specification, while other planners decouple the goals and the strategies.
20Forward Chaining Speedup
- Many researchers have focused on discovering ways to help speed up domain-independent forward chaining planners.
- Ex. SAPA by Minh B. Do and Subbarao Kambhampati
- Methods focus on estimating plan cost using
- Relaxed plan-graphs
- Estimated remaining cost to goal
- Cost metrics
- Ex. number of actions, plan duration, etc.
21Taxonomy of Planners
Planners
22Plan Graph
- Plan-graph based planners first construct a compact representation of the plan-space (the plan-graph), and then search that space.
- Plan-graphs contain all possible plans up to a certain size, excluding incomplete plans with co-occurring binary mutex actions.
- Plan-graphs do not exclude all invalid plans, and depending on the domain may yield extremely efficient or inefficient results.
- Advantages
- generative
- much faster than most forward-chaining planners
- plan-graph can be generated in polynomial time and space
- Disadvantages
- plan-graphs are less expressive (resources and time difficult)
- in certain domains, search of plan-graph can be very inefficient
23Forward Chaining vs. Plan Graph
Forward Chaining
Plan Graph
24Case Study Graphplan
Note the compact structure in this graph: it's polynomial in size!
25Mutex Relationships
26Case Study LPGP
- Idea
- use Graphplan to identify a complete plan (action structure)
- then use Linear Programming to determine plan consistency and perform scheduling (assign durations to actions)
- Advantage
- Two-phase approach accomplishes temporal planning with the speed of a plan-graph based planner
- Disadvantages
- Cannot optimize over time (only optimizes over makespan)
- Two-phase approach is potentially very inefficient
- no temporal conflicts are used to guide Graphplan search
- search is not incremental: LP must be started from scratch each time
27Taxonomy of Planners
Planners
28Macro Decomposition
- Operates similarly to a context-free grammar
- planner non-deterministically expands macro-activities until all plan actions are primitive.
- rules ensure that the planner only explores the space of complete plans
- Planner still must ensure plan consistency.
- Advantages
- Fast
- Disadvantages
- all achieving strategies must be pre-encoded into macros
- non-optimal: explores a restricted plan-space, potentially excluding optimal solutions
29Case Study SHOP2
- SHOP2 by Dana Nau, Hector Munoz-Avila, Yue Cao, Amnon Lotem and Steven Mitchell
- SHOP2 works similarly to the task-decomposition mechanism in Kirk
- SHOP2 problems consist of
- Operators (with preconditions, add-effects and delete-effects)
- Methods (rules for how to progress the plan)
- Initial conditions and goals
- SHOP2 is fairly fast, but all plan happenings must be pre-designed (at some level) by a programmer.
- SHOP2 plans do not support concurrency
30SHOP2 Example
(defdomain basic-example (
  (operator (pickup ?a) () () ((have ?a)))
  (operator (drop ?a) ((have ?a)) ((have ?a)) ())
  (method (swap ?x ?y)
    ((have ?x)) ((drop ?x) (pickup ?y))
    ((have ?y)) ((drop ?y) (pickup ?x)))))
(defproblem problem1 basic-example
  ((have banjo)) ((swap banjo kiwi)))
- Operator fields, in order: Preconds, Delete-effects, Add-effects
- Method fields: pairs of Condition and Strategy
- Allows one method to decompose into multiple possible subplans, depending on the current state
- Problem fields: Initial Condition, Start Strategy
31SHOP2 In Action
With ?x = banjo and ?y = kiwi, the swap method is instantiated as:
(defdomain basic-example (
  (operator (pickup ?a) () () ((have ?a)))
  (operator (drop ?a) ((have ?a)) ((have ?a)) ())
  (method (swap banjo kiwi)
    ((have banjo)) ((drop banjo) (pickup kiwi))
    ((have kiwi)) ((drop kiwi) (pickup banjo)))))
(defproblem problem1 basic-example
  ((have banjo)) ((swap banjo kiwi)))
State
(have banjo)
↓ (drop banjo)
↓ (pickup kiwi)
(have kiwi)
DONE
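The decomposition of the banjo/kiwi problem can be traced with a small Python sketch. It is a simplification under stated assumptions: the function and table names are hypothetical, a method returns only the first strategy whose condition holds, and there is no backtracking, which a real HTN planner would need.

```python
def pickup(state, a):
    return state | {("have", a)}           # no preconditions; adds (have a)

def drop(state, a):
    if ("have", a) not in state:
        return None                         # precondition (have a) fails
    return state - {("have", a)}            # deletes (have a)

def swap(state, x, y):
    """Method: return the strategy of the first condition that holds."""
    if ("have", x) in state:
        return [("drop", x), ("pickup", y)]
    if ("have", y) in state:
        return [("drop", y), ("pickup", x)]
    return None

OPERATORS = {"pickup": pickup, "drop": drop}
METHODS = {"swap": swap}

def decompose(state, tasks):
    """Expand tasks left to right; return (plan, final_state) or None on failure."""
    plan, tasks = [], list(tasks)
    while tasks:
        name, *args = tasks.pop(0)
        if name in OPERATORS:               # primitive task: apply it to the state
            state = OPERATORS[name](state, *args)
            if state is None:
                return None
            plan.append((name, *args))
        else:                               # compound task: replace it by its subtasks
            subtasks = METHODS[name](state, *args)
            if subtasks is None:
                return None
            tasks = subtasks + tasks
    return plan, state

result = decompose(frozenset({("have", "banjo")}), [("swap", "banjo", "kiwi")])
```

Because the initial state satisfies `(have banjo)`, the first strategy is chosen and the plan is drop banjo, then pickup kiwi.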
32Case Study SHOP2
33Case Study Kirk TPN Planner
[Figure: Macro-Activity() with temporal bounds [l,u], expanding into Decomposition 1 or Decomposition 2]
345 Minute Break
36Taxonomy of Planners
Planners
37Local Search WalkSAT
- WalkSAT is a randomized algorithm for solving SAT (propositional satisfiability) problems.
- Unlike systematic solvers such as DPLL, it relies on local search and randomness.
38WalkSAT
- Problem
- Find a satisfying assignment to a logic formula
- (A ∨ !B) ∧ (B ∨ !C) ∧ (C ∨ !A) ∧ (A ∨ B ∨ C)
- WalkSAT
- Pick a random assignment to the variables
- Until the formula is satisfied (or up to some max number of iterations),
- Choose an unsatisfied clause and enumerate the ways of adjusting the variables in order to satisfy it
- With probability p
- Choose the best-utility adjustment
- Else
- Choose a random adjustment
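The loop above can be sketched in Python. This is a minimal sketch, not a reference implementation: the clause encoding (signed integers), the parameter names, and the choice of "fewest unsatisfied clauses after the flip" as the utility are assumptions. As on the slide, `p` is the probability of the best-utility move.

```python
import random

def walksat(clauses, n_vars, p=0.5, max_flips=10_000, seed=0):
    """clauses: list of lists of nonzero ints; literal k means variable |k|,
    negated when k < 0. Returns a satisfying {var: bool} dict, or None."""
    rng = random.Random(seed)
    assign = {v: rng.choice([True, False]) for v in range(1, n_vars + 1)}

    def satisfied(clause):
        return any(assign[abs(lit)] == (lit > 0) for lit in clause)

    def unsat_after_flip(var):
        assign[var] = not assign[var]
        count = sum(not satisfied(c) for c in clauses)
        assign[var] = not assign[var]          # undo the trial flip
        return count

    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return assign
        candidates = [abs(lit) for lit in rng.choice(unsat)]
        if rng.random() < p:
            var = min(candidates, key=unsat_after_flip)   # best-utility adjustment
        else:
            var = rng.choice(candidates)                  # random adjustment
        assign[var] = not assign[var]
    return None

# (A or !B) and (B or !C) and (C or !A) and (A or B or C), with A=1, B=2, C=3
model = walksat([[1, -2], [2, -3], [3, -1], [1, 2, 3]], n_vars=3)
```

The first three clauses force A, B, and C to take the same value, and the last clause rules out all-false, so the only satisfying assignment is all-true.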
39WalkSAT Example
- Formula: (A ∨ !B) ∧ (B ∨ !C) ∧ (C ∨ !A) ∧ (A ∨ B ∨ C)
- Pick !A, !B, !C
- (A ∨ B ∨ C) is unsatisfied; options are to switch A, B, or C
- Pick A, !B, !C
- (C ∨ !A) is unsatisfied; options are to switch A or C
- Pick A, !B, C
- (B ∨ !C) is unsatisfied; options are to switch B or C
- Pick A, B, C
- Formula satisfied!
40WalkSAT Discussion
- WalkSAT has proven to be very fast at solving complicated SAT problems
- WalkSAT can solve some problems that systematic algorithms simply can't handle
- Due to randomness, WalkSAT is incomplete
- WalkSAT may fail to discover a solution
41Introduction to LPG
- LPG (local search for plan-graphs) by Alfonso Gerevini and Ivan Serina
- Blackbox mapped the planning problem to a CSP and solved it using a SAT solver.
- LPG unifies the planning and WalkSAT algorithms to create the WalkPlan search algorithm.
42LPG Big Idea
- Big Idea
- Start with a random plan
- While plan is incorrect / inconsistent
- Identify and repair conflict
- Basically the same idea as WalkSAT, but applied to a special form of plan-graph
43Temporal Action Graphs
- Definitions
- Action-graph: the subset of a plan-graph containing the action layers
- Support: a fact is said to be supported if it is achieved by some action in the previous action layer
- Conflict
- a mutex between two actions
- an action with an unsupported precondition
44Linearization of Action Graphs
- An Action Graph can be made linear by allowing only one action per action layer.
- The layers no longer explicitly represent an ordering of time (temporal concurrency is still possible)
- The layer ordering simply presents an action sequence for the purposes of establishing fact support relationships.
45Example Linear Action Graph
[Figure: linear action graph with fact layers over the facts A, B, C]
A plan-graph consists of alternating fact layers and action layers.
The actions alone constitute an action graph.
LPG operates directly on the action graph structure, inserting and removing actions from various action layers as it repairs incomplete plans.
46Example Temporal Action Graph
47Conflicts and Repair
- An incomplete plan is manifested as an action graph with conflicts.
- Example conflicts with resolution (repair) strategies

| Conflict Description | Conflict Resolution Strategies |
| Permanent mutex between two actions in the same action layer | Remove one of the actions |
| Precondition mutex between two actions in the same action layer | Remove one of the actions |
| Precondition mutex between two actions in the same action layer | Add support for one of the mutex preconditions |
| Unsupported precondition for an action in an action layer | Add an action to the previous action layer that achieves the unsupported precondition |
| Unsupported precondition for an action in an action layer | Remove the action whose preconditions are not satisfied |
48LPG Algorithm
LPG's WalkPlan Planning Algorithm
- LPG
- (1) Generate Initial Plan: generate an initial dummy plan, P, either
- Randomly
- By adding actions to support all facts ignoring mutexes, or
- Via some front-end plan generator
- (2) Choose Conflict: randomly choose a conflict, C, in the action-graph
- (3) Resolve / Evaluate: identify all possible ways of resolving C and evaluate them using the action evaluation function
- Resolution techniques include removing one of two mutex actions, adding a supporting action for an unsupported fact, or removing an action that has an unsupported precondition
- If a conflict resolution has cost 0, the plan is complete
- Note: The action evaluation function uses Lagrange multipliers to dynamically weight the different factors in the action evaluation function
- (4) Resolution Selection: if a resolution introduces no new conflicts, apply it and go to step (2). Else,
- with probability p, randomly choose a resolution, apply it and go to step (2)
- with probability 1-p, choose the lowest cost conflict resolution, apply it and go to step (2)
- Note: The resolution step includes a mechanism for extending the plan-graph
49LPG Example
[Figure: action graph over the facts A, B, C, repaired step by step: initial dummy plan, identify conflict, resolve conflict, plan complete]
Initial Conditions: ( nil )
Goals: ( A, B, C )
Actions:
A0: preconds ( nil ), effects ( A )
A1: preconds ( A ), effects ( A, B )
A2: preconds ( A, B ), effects ( C )
Conflicts and repairs shown:
- Permanently mutex actions in the same action layer (resolved by removing one of the two actions)
- Unsupported precondition (resolved by adding achieving action at previous action layer)
- Unsupported precondition (resolved by removing the conflicting action)
- Unsupported precondition (resolved by adding achieving action at previous action layer)
Note: No-ops are propagated during conflict resolution
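A heavily simplified sketch of the repair loop on the example domain (actions A0, A1, A2; goals A, B, C). It is not LPG's actual algorithm. Assumptions: the plan is a linear action list, the only conflicts are unsupported preconditions or goals (the example has no delete effects, so mutexes never arise), the cost of a repair is simply the number of remaining conflicts (no Lagrange multipliers), and "introduces no new conflicts" is approximated by "lowers the conflict count". All names are hypothetical.

```python
import random
from typing import NamedTuple

class Action(NamedTuple):
    name: str
    pre: frozenset
    add: frozenset

def conflicts(plan, init, goals):
    """Unsupported preconditions and goals, as (layer, fact) pairs."""
    found, state = [], set(init)
    for i, act in enumerate(plan):
        found += [(i, f) for f in sorted(act.pre - state)]
        state |= act.add
    return found + [(len(plan), f) for f in sorted(goals - state)]

def repairs(plan, conflict, actions):
    """Candidate plans: insert an achiever before the layer, or drop the action."""
    i, fact = conflict
    options = [plan[:i] + [a] + plan[i:] for a in actions if fact in a.add]
    if i < len(plan):
        options.append(plan[:i] + plan[i + 1:])
    return options

def walkplan(actions, init, goals, p=0.2, max_steps=1000, seed=0):
    rng = random.Random(seed)
    plan = []                                    # empty initial dummy plan
    for _ in range(max_steps):
        current = conflicts(plan, init, goals)
        if not current:
            return plan                          # cost 0: plan is complete
        options = repairs(plan, rng.choice(current), actions)
        if not options:
            return None                          # fact has no achiever
        costs = [len(conflicts(o, init, goals)) for o in options]
        best = options[costs.index(min(costs))]
        if min(costs) < len(current) or rng.random() >= p:
            plan = best                          # improving or lowest-cost step
        else:
            plan = rng.choice(options)           # random step, probability p
    return None

ACTIONS = [Action("A0", frozenset(), frozenset("A")),
           Action("A1", frozenset("A"), frozenset("AB")),
           Action("A2", frozenset("AB"), frozenset("C"))]
plan = walkplan(ACTIONS, init=set(), goals=set("ABC"))
```

Depending on which conflict is drawn first, the returned plan may contain a redundant action, which matches the observation that local-search plans often include extraneous steps.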
50LPG Analysis
- Advantages
- LPG is fast: four orders of magnitude faster than the leading optimal planners
- LPG is domain-independent
- LPG can easily handle resources and durative actions
- Disadvantages
- LPG is randomized, so plans are not usually optimal and often contain extraneous actions
- LPG includes an option to continue searching for multiple solutions, in the hope of finding better plans
- While maintaining expressivity, LPG sacrifices optimality for speed.
51AIPS 2002 Results (subset)
| Planner | Problems Solved | Problems Attempted | Success Ratio | Capabilities |
| SHOP2, 2nd place (hand-coded) | 899 | 904 | 99% | Strips, Numeric, HardNumeric, SimpleTime, Time, Complex |
| TLPlan, 1st place (hand-coded) | 894 | 894 | 100% | Strips, Numeric, HardNumeric, SimpleTime, Time, Complex |
| LPG, 1st place (fully-automated) | 372 | 428 | 87% | Strips, Numeric, HardNumeric, SimpleTime, Time |
52Summary
- Planning is hard!
- We want planners that
- are fast
- are domain-independent
- are optimal
- handle durative actions / resources / uncertainty
- Want a speedup?
- Sacrificing expressivity helps
- Sacrificing optimality helps more
- Sacrificing generality helps the most
- LPG is today's best planner that is domain-independent, expressive, and fast; to achieve speed, it sacrifices optimality and uses local search.
53Planning References
- Planning in general
- Russell and Norvig, Artificial Intelligence: A Modern Approach, section IV, Prentice Hall, 2nd edition (December 20, 2002)
- AIPS International Planning Competition, 2002
- http://www.dur.ac.uk/d.p.long/competition.html
- Graphplan
- A. Blum and M. Furst, "Fast Planning Through Planning Graph Analysis", Artificial Intelligence, 90:281-300 (1997).
- www.cs.cmu.edu/avrim/graphplan.html
- LPG
- A. Gerevini and I. Serina, "Planning through Stochastic Local Search and Temporal Action Graphs", technical report from Universita degli Studi di Brescia, November, 2002.
- prometeo.ing.unibs.it/lpg/