Planning with Local Search

Slides: 53
Provided by: jonathan105
Learn more at: http://www.ai.mit.edu
1
Planning with Local Search
  • MERS Seminar Lecture
  • March 6, 2003
  • Jonathan Kennell

2
Presentation Outline
  • Planning Overview
  • What is planning? 5 mins.
  • Taxonomy of planners 40 mins. (or everything
    you ever wanted to know about planning in
    approximately 40 minutes)
  • 5 minute break
  • LPG
  • Background information (WalkSAT) 10 mins.
  • Linear action graphs and precedence graphs 10
    mins.
  • WalkPlan planning algorithm 10 mins.
  • Example 10 mins.

3
What is Planning?
  • Input
  • Set of world-states
  • Action operators (fn: world-state → world-state)
  • Initial world-state
  • Goal (possibly a partial state / set of
    world-states)
  • Output
  • Ordering of actions

From 6.834J POP lecture
4
World State
  • Set of facts and their degree of truth
  • Examples
  • (Student Jonathan) // true
  • (Likes Jonathan Golf) // false
  • (Graduating Jonathan June) // unknown
  • Note: Lisp notation is used extensively in the
    planning community
  • Most planners don't consider unknown facts

5
Planning Operators
  • Fn: world-state → world-state
  • Generally use STRIPS format
  • Preconditions: facts that must be true before
    action can occur
  • Effects: facts that become true (or false) after
    the action occurs
  • Extra properties
  • Separate start / invariant / end conditions and
    effects
  • Durations
  • Resource constraints

(action Move
  (params ((robot ?r) (location ?a) (location ?b)))
  (preconds (at ?r ?a))
  (effects (and (not (at ?r ?a)) (at ?r ?b))))
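As a rough illustration of how a STRIPS-style operator maps one world-state to another, the Move action can be sketched in Python. The names `Action`, `applicable`, `apply_action` and `move`, and the robot and room identifiers, are hypothetical and chosen only for this sketch; the negated effect is modeled as a delete-effect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    preconds: frozenset      # facts that must hold before the action
    add_effects: frozenset   # facts that become true
    del_effects: frozenset   # facts that become false


def applicable(state, action):
    """An action is applicable when all of its preconditions hold."""
    return action.preconds <= state


def apply_action(state, action):
    """Return the successor world-state (the input state is not modified)."""
    if not applicable(state, action):
        raise ValueError(f"{action.name} is not applicable")
    return (state - action.del_effects) | action.add_effects


def move(robot, a, b):
    """Ground instance of the Move operator for one robot and two locations."""
    return Action(
        name=f"Move({robot},{a},{b})",
        preconds=frozenset({("at", robot, a)}),
        add_effects=frozenset({("at", robot, b)}),
        del_effects=frozenset({("at", robot, a)}),
    )


state = frozenset({("at", "r1", "roomA")})
new_state = apply_action(state, move("r1", "roomA", "roomB"))
```

Representing a world-state as a set of true facts matches the remark that most planners do not consider unknown facts: anything absent from the set is treated as false.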
6
Mutual Exclusion
  • Sometimes planning operators conflict with each
    other; we call a pair of conflicting operators
    mutex
  • Examples of mutex actions
  • Interference: A deletes a precondition or effect
    of B
  • Competing Needs: A and B have mutex preconditions
  • Planner must ensure no mutex actions co-occur.
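The two mutex cases can be checked mechanically. The sketch below is one possible encoding, not the definition used by any particular planner: `Action`, `interferes`, `competing_needs` and `mutex` are hypothetical names, and the set of mutually exclusive fact pairs is assumed to be supplied by the caller.

```python
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    preconds: frozenset
    add_effects: frozenset
    del_effects: frozenset


def interferes(a, b):
    """Interference: a deletes a precondition or an add-effect of b."""
    return bool(a.del_effects & (b.preconds | b.add_effects))


def competing_needs(a, b, fact_mutexes):
    """Competing needs: some precondition of a is mutex with one of b."""
    return any(frozenset({p, q}) in fact_mutexes
               for p in a.preconds for q in b.preconds)


def mutex(a, b, fact_mutexes=frozenset()):
    return (interferes(a, b) or interferes(b, a)
            or competing_needs(a, b, fact_mutexes))


# Two moves of the same robot out of the same room interfere with each other.
to_b = Action("move-r1-a-b", frozenset({"at-r1-a"}),
              frozenset({"at-r1-b"}), frozenset({"at-r1-a"}))
to_c = Action("move-r1-a-c", frozenset({"at-r1-a"}),
              frozenset({"at-r1-c"}), frozenset({"at-r1-a"}))
# A move of a different robot does not touch any of those facts.
other = Action("move-r2-d-e", frozenset({"at-r2-d"}),
               frozenset({"at-r2-e"}), frozenset({"at-r2-d"}))
# Competing needs: one action needs the door open, the other needs it closed.
enter = Action("enter", frozenset({"door-open"}), frozenset({"inside"}), frozenset())
paint = Action("paint-door", frozenset({"door-closed"}), frozenset({"painted"}), frozenset())
door_mutex = frozenset({frozenset({"door-open", "door-closed"})})
```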

7
What is a plan?
  • A plan is an ordering of actions that will
    transition the system from the initial state to
    the goal state.

8
Completeness / Consistency / Minimality
  • Complete Plan
  • A plan is complete IFF every precondition of
    every activity is achieved.
  • An activity's precondition is achieved IFF
  • The precondition is the effect of a preceding
    activity (support), and
  • No intervening step conflicts with the
    precondition (mutex).
  • Consistent Plan
  • The plan is consistent IFF the temporal
    constraints of its activities are consistent (the
    associated distance graph has no negative
    cycles), and
  • no conflicting (mutex) activities can co-occur.
  • Minimal Plan
  • The plan is minimal IFF every constraint serves a
    purpose, i.e.,
  • If we remove any temporal or symbolic constraint
    from a minimal plan, the new plan is not
    equivalent to the original plan
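The temporal half of the consistency test (no negative cycles in the distance graph) can be sketched with Bellman-Ford relaxation. This assumes each temporal constraint has the form lower <= t_j - t_i <= upper between two numbered events; the function name `temporally_consistent` and the example durations are hypothetical.

```python
def temporally_consistent(num_events, constraints):
    """Return True iff the distance graph has no negative cycle.

    constraints: list of (i, j, lower, upper), meaning
    lower <= t_j - t_i <= upper for events i and j.
    """
    edges = []
    for i, j, lower, upper in constraints:
        edges.append((i, j, upper))    # t_j - t_i <= upper
        edges.append((j, i, -lower))   # t_i - t_j <= -lower

    # Starting every distance at 0 acts like a virtual source connected
    # to all events, so cycles anywhere in the graph are detected.
    dist = [0.0] * num_events
    for _ in range(num_events + 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return True   # relaxation converged: no negative cycle
    return False          # still relaxing after n+1 passes: negative cycle


# Activity X takes 5 to 10 time units (event 0 to 1), activity Y takes 3 to 4
# (event 1 to 2); the third constraint bounds the total elapsed time.
loose = [(0, 1, 5, 10), (1, 2, 3, 4), (0, 2, 0, 12)]
tight = [(0, 1, 5, 10), (1, 2, 3, 4), (0, 2, 0, 7)]
```

With the looser deadline the shortest possible schedule (5 + 3 = 8) fits; with a deadline of 7 the cycle 0 → 2 → 1 → 0 has weight 7 - 3 - 5 = -1, so the constraints are inconsistent.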

9
Variations on Classical Planning
  • Temporal planning
  • Actions have durations
  • Planning with resources
  • Facts can be quantified
  • Planning with uncertainty
  • Effects / durations of actions not guaranteed

10
Taxonomy of Planners
Planners
11
Forward Chaining / Backward Propagation
  • Searches through entire plan-space by
    non-deterministically adding actions to plan
    candidates.
  • Advantages
  • generative (does not require strategies)
  • expressive (can easily handle time and resources)
  • Disadvantages
  • Inherently slow (plan-space is enormous)
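A minimal sketch of forward chaining, assuming a breadth-first exploration of world-states in place of non-deterministic choice. The function name `forward_search` and the three-room robot domain are hypothetical, and no pruning is applied, which is exactly why this approach slows down as the space grows.

```python
from collections import deque


def forward_search(init, goals, actions):
    """Breadth-first forward chaining over world-states.

    actions: dict mapping a name to (preconds, add_effects, del_effects),
    each a frozenset of facts. Returns a list of action names or None.
    """
    start = frozenset(init)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goals <= state:
            return plan
        for name, (pre, add, dele) in actions.items():
            if pre <= state:                      # action is applicable
                successor = (state - dele) | add
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, plan + [name]))
    return None


actions = {
    "move-a-b": (frozenset({"at-a"}), frozenset({"at-b"}), frozenset({"at-a"})),
    "move-b-c": (frozenset({"at-b"}), frozenset({"at-c"}), frozenset({"at-b"})),
    "move-b-a": (frozenset({"at-b"}), frozenset({"at-a"}), frozenset({"at-b"})),
}
plan = forward_search({"at-a"}, frozenset({"at-c"}), actions)
```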

12
Forward Chaining Example
Familiar tradeoff: efficient pruning methods
versus optimality.
13
Case Study TLPlan
  • TLPlan (Temporal Logic Planner) by Fahiem Bacchus
    and Froduald Kabanza
  • TLPlan is based on a forward-chaining planner
  • TLPlan uses domain-dependent temporal logic to
    prune the search space

14
TLPlan: First-order Temporal Logic
  • Definition: First-order linear temporal logic
  • standard first-order logic, plus
  • U (until), □ (always), ◇ (eventually), ○ (next)
  • Bounded quantifiers
  • ∀[x:γ] φ ≡ ∀x . γ(x) → φ(x)
  • ∃[x:γ] φ ≡ ∃x . γ(x) ∧ φ(x)
  • Example
  • □(on(B,C) → (on(B,C) U on(A,B)))
  • Asserts that whenever we enter a state in which B
    is on C it remains on C until A is on B

15
TLPlan: Formula Progression Algorithm
  • The Progress algorithm is used to check control
    strategies as the system searches for a plan.
  • Inputs: An LTL formula f and a world w (generated
    by forward-chaining)
  • Output: A new formula f', also expressed as an
    LTL formula, representing the progression of f
    through the world w.
  • Algorithm: Progress(f,w)
  • Case
  • f is atomic: if w entails f, f' := TRUE,
    else f' := FALSE
  • f = f1 ∧ f2: f' := Progress(f1,w) ∧
    Progress(f2,w)
  • f = ¬f1: f' := ¬Progress(f1,w)
  • etc. (see paper for complete algorithm)
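A propositional sketch of progression may make the idea concrete. The atomic, conjunction and negation cases follow the slide; the cases for next, always, eventually and until are the standard LTL progression rules and are an assumption here, since the slide defers them to the paper. Formulas are encoded as nested tuples, a world is a set of true atoms, and all names (`progress`, `mk_and`, `atom`, and so on) are hypothetical. Quantifiers are omitted.

```python
TRUE, FALSE = ("true",), ("false",)


def atom(name):
    return ("atom", name)


def implies(a, b):
    return ("or", ("not", a), b)


def mk_and(a, b):
    if a == FALSE or b == FALSE:
        return FALSE
    if a == TRUE:
        return b
    return a if b == TRUE else ("and", a, b)


def mk_or(a, b):
    if a == TRUE or b == TRUE:
        return TRUE
    if a == FALSE:
        return b
    return a if b == FALSE else ("or", a, b)


def mk_not(a):
    if a == TRUE:
        return FALSE
    return TRUE if a == FALSE else ("not", a)


def progress(f, w):
    """Return the formula that must hold in the successor of world w."""
    op = f[0]
    if op in ("true", "false"):
        return f
    if op == "atom":
        return TRUE if f[1] in w else FALSE
    if op == "and":
        return mk_and(progress(f[1], w), progress(f[2], w))
    if op == "or":
        return mk_or(progress(f[1], w), progress(f[2], w))
    if op == "not":
        return mk_not(progress(f[1], w))
    if op == "next":                      # assumed standard rule
        return f[1]
    if op == "always":                    # assumed standard rule
        return mk_and(progress(f[1], w), f)
    if op == "eventually":                # assumed standard rule
        return mk_or(progress(f[1], w), f)
    if op == "until":                     # assumed standard rule
        return mk_or(progress(f[2], w), mk_and(progress(f[1], w), f))
    raise ValueError(f"unknown operator: {op}")


# The control rule from the example: always (on(B,C) -> (on(B,C) U on(A,B))).
on_bc, on_ab = atom("on(B,C)"), atom("on(A,B)")
until = ("until", on_bc, on_ab)
rule = ("always", implies(on_bc, until))
after_w1 = progress(rule, {"on(B,C)"})
```

After a world in which B is on C, the progressed formula demands that the until-obligation hold next, alongside the original rule. A successor world in which B has left C before A is on B progresses to FALSE, which is the signal to prune that thread.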

16
TLPlan Example
[Figure: search tree. Labels: Forward chaining begins; Rules; (Any color); Etc.]
  • One thread is efficiently guided by the rules.
  • Another thread is not guided well, since no rules
    apply. This results in pure forward-chaining search.
17
TLPlan Review
  • TLPlan has been around in various implementations
    since 1995, although improvements have been made
    as recently as last year.
  • TLPlan functions initially as a forward-chaining
    planner, but can use logical rules to guide its
    search and prune unfeasible threads.
  • TLPlan was the fastest domain-specific planner in
    the 2002 AIPS competition.

18
Case Study TLPlan
  • TLPlan is a temporal forward chaining planner
    that uses temporal logic to help prune the search
    space.
  • Temporal logic is an extension of normal
    first-order logic that includes durative
    constructs such as "always" and "eventually",
    etc.
  • Goals are specified using temporal logic to
    require not only goal conditions, but goal
    sequences.
  • Tells the planner not only what to achieve, but
    how to achieve it
  • Any candidate plan that is inconsistent with the
    temporal logic constraints is pruned.

19
Domain Knowledge
  • Planning is hard: the most general planners are
    extremely slow
  • To increase speed, some planners sacrifice
    generality by using domain-specific strategies.
  • TLPlan encodes the strategy into the goal
    specification, while other planners decouple the
    goals and the strategies.

20
Forward Chaining Speedup
  • Many researchers have focused on discovering ways
    to help speed up domain-independent forward
    chaining planners.
  • Ex. SAPA by Minh B. Do and Subbarao Kambhampati
  • Methods focus on estimating plan cost using
  • Relaxed plan-graphs
  • Estimated remaining cost to goal
  • Cost metrics
  • Ex. number of actions, plan duration, etc.

21
Taxonomy of Planners
Planners
22
Plan Graph
  • Plan-graph based planners first construct a
    compact representation of the plan-space (the
    plan-graph), and then search that space.
  • Plan-graphs contain all possible plans up to a
    certain size, excluding incomplete plans with
    co-occurring binary mutex actions.
  • Plan-graphs do not exclude all invalid plans, and
    depending on the domain may yield extremely
    efficient or inefficient results.
  • Advantages
  • generative
  • much faster than most forward-chaining planners
  • plan-graph can be generated in polynomial time
    and space
  • Disadvantages
  • plan-graphs are less expressive (resources and
    time difficult)
  • in certain domains, search of plan-graph can be
    very inefficient

23
Forward Chaining vs. Plan Graph
Forward Chaining
Plan Graph
24
Case Study: Graphplan
Note the compact structure in this graph: it's
polynomial in size!
25
Mutex Relationships
26
Case Study LPGP
  • Idea
  • use Graphplan to identify complete plan (action
    structure)
  • then use Linear Programming to determine plan
    consistency and perform scheduling (assign
    durations to actions)
  • Advantage
  • Two-phase approach accomplishes temporal planning
    with the speed of a plan-graph based planner
  • Disadvantages
  • Cannot optimize over time (only optimizes over
    makespan)
  • Two-phase approach is potentially very
    inefficient
  • no temporal conflicts are used to guide Graphplan
    search
  • search is not incremental: the LP must be started
    from scratch each time

27
Taxonomy of Planners
Planners
28
Macro Decomposition
  • Operates similarly to a context-free grammar
  • planner non-deterministically expands
    macro-activities until all plan actions are
    primitive.
  • rules ensure that planner only explores space of
    complete plans
  • Planner still must ensure plan consistency.
  • Advantages
  • Fast
  • Disadvantages
  • all achieving strategies must be pre-encoded into
    macros
  • non-optimal: explores a restricted plan-space,
    potentially excluding optimal solutions

29
Case Study SHOP2
  • SHOP2 by Dana Nau, Hector Munoz-Avila, Yue Cao,
    Amnon Lotem and Steven Mitchell
  • SHOP2 works similarly to the task-decomposition
    mechanism in Kirk
  • SHOP2 problems consist of
  • Operators (with preconditions, add-effects and
    delete-effects)
  • Methods (rules for how to progress the plan)
  • Initial conditions and goals
  • SHOP2 is fairly fast, but all plan happenings
    must be pre-designed (at some level) by a
    programmer.
  • SHOP2 plans do not support concurrency

30
SHOP2 Example
(defdomain basic-example (
  (operator (pickup ?a) () () ((have ?a)))
  (operator (drop ?a) ((have ?a)) ((have ?a)) ())
  (method (swap ?x ?y)
    ((have ?x)) ((drop ?x) (pickup ?y))
    ((have ?y)) ((drop ?y) (pickup ?x)))))
(defproblem problem1 basic-example
  ((have banjo)) ((swap banjo kiwi)))
Operator fields: Preconds, Delete-effects, Add-effects
Method fields: Condition, Strategy (repeated per alternative)
Allows one method to decompose into multiple
possible subplans, depending on the current state
Problem fields: Initial Condition, Start Strategy
31
SHOP2 In Action
With ?x = banjo and ?y = kiwi bound in the swap method:
(defdomain basic-example (
  (operator (pickup ?a) () () ((have ?a)))
  (operator (drop ?a) ((have ?a)) ((have ?a)) ())
  (method (swap banjo kiwi)
    ((have banjo)) ((drop banjo) (pickup kiwi))
    ((have kiwi)) ((drop kiwi) (pickup banjo)))))
(defproblem problem1 basic-example
  ((have banjo)) ((swap banjo kiwi)))
State: (have banjo) → () → (have kiwi)
DONE
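The decomposition in this example can be mimicked with a small ordered task-decomposition sketch in Python. This is not SHOP2's implementation or API; `decompose`, `OPERATORS` and `METHODS` are hypothetical names, and the operator triples follow the slide's order of preconds, delete-effects, add-effects.

```python
def pickup(a):
    # (preconds, delete-effects, add-effects)
    return set(), set(), {("have", a)}


def drop(a):
    return {("have", a)}, {("have", a)}, set()


def swap(x, y):
    # Each alternative pairs a condition with the subtasks to use when it holds.
    return [
        ({("have", x)}, [("drop", x), ("pickup", y)]),
        ({("have", y)}, [("drop", y), ("pickup", x)]),
    ]


OPERATORS = {"pickup": pickup, "drop": drop}
METHODS = {"swap": swap}


def decompose(state, tasks):
    """Expand tasks left to right; return (primitive plan, final state) or None."""
    if not tasks:
        return [], state
    (name, *args), rest = tasks[0], tasks[1:]
    if name in OPERATORS:
        pre, dele, add = OPERATORS[name](*args)
        if not pre <= state:
            return None
        result = decompose((state - dele) | add, rest)
        if result is None:
            return None
        steps, final = result
        return [tasks[0]] + steps, final
    for condition, subtasks in METHODS[name](*args):
        if condition <= state:
            result = decompose(state, subtasks + rest)
            if result is not None:
                return result
    return None


steps, final = decompose(frozenset({("have", "banjo")}), [("swap", "banjo", "kiwi")])
```

Starting from a state that holds the kiwi instead, the same `swap` task selects the second alternative, which is the point of attaching conditions to strategies.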
32
Case Study SHOP2
33
Case Study Kirk TPN Planner
[Figure: Macro-Activity() with temporal bounds [l,u] and its alternative expansions, Decomposition 1 and Decomposition 2]
34
5 Minute Break
35
Presentation Outline
  • Planning Overview
  • What is planning? 5 mins.
  • Taxonomy of planners 40 mins. (or everything
    you ever wanted to know about planning in
    approximately 40 minutes)
  • 5 minute break
  • LPG
  • Background information (WalkSAT) 10 mins.
  • Linear action graphs and precedence graphs 10
    mins.
  • WalkPlan planning algorithm 10 mins.
  • Example 10 mins.

36
Taxonomy of Planners
Planners
37
Local Search: WalkSAT
  • WalkSAT is a randomized algorithm for solving SAT
    (propositional satisfiability) problems.
  • It builds on the GSAT local-search algorithm,
    adding random-walk moves; unlike DPLL, it does
    not search systematically.

38
WalkSAT
  • Problem
  • Find a satisfying assignment to a logic formula
  • (A ∨ !B) ∧ (B ∨ !C) ∧ (C ∨ !A) ∧ (A ∨ B ∨ C)
  • WalkSAT
  • Pick a random assignment to the variables
  • Until formula satisfied (or up to some max number
    of iterations),
  • Choose an unsatisfied clause and enumerate the
    ways of adjusting the variables in order to
    satisfy it
  • With probability p
  • Choose the best-utility adjustment
  • Else
  • Choose a random adjustment

39
WalkSAT Example
  • (A ∨ !B) ∧ (B ∨ !C) ∧ (C ∨ !A) ∧ (A ∨ B ∨ C)
  • Pick !A, !B, !C
  • Unsatisfied clause: (A ∨ B ∨ C)
  • Options are to switch A, B, or C
  • Pick A, !B, !C
  • Unsatisfied clause: (C ∨ !A)
  • Options are to switch A or C
  • Pick A, !B, C
  • Unsatisfied clause: (B ∨ !C)
  • Options are to switch B or C
  • Pick A, B, C
  • (A ∨ !B) ∧ (B ∨ !C) ∧ (C ∨ !A) ∧ (A ∨ B ∨ C)
  • Formula Satisfied!
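The WalkSAT loop can be sketched in Python and run on the four-clause formula from the example. As on the slide, the best-utility flip is taken with probability p and a random flip otherwise; "best utility" is assumed here to mean the fewest unsatisfied clauses after the flip. The function name `walksat`, the clause encoding and the default parameters are hypothetical.

```python
import random


def walksat(clauses, variables, p=0.5, max_flips=10_000, rng=None):
    """Return a satisfying assignment (dict) or None if none was found.

    clauses: list of clauses, each a list of (variable, wanted_value) literals.
    """
    rng = rng or random.Random()
    assign = {v: rng.choice([True, False]) for v in variables}

    def satisfied(clause):
        return any(assign[v] == wanted for v, wanted in clause)

    def num_unsat():
        return sum(not satisfied(c) for c in clauses)

    def cost_after_flip(v):
        assign[v] = not assign[v]
        cost = num_unsat()
        assign[v] = not assign[v]   # undo the trial flip
        return cost

    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return assign
        # Flipping any variable of an unsatisfied clause satisfies that clause.
        candidates = [v for v, _ in rng.choice(unsat)]
        if rng.random() < p:
            var = min(candidates, key=cost_after_flip)   # best-utility flip
        else:
            var = rng.choice(candidates)                 # random flip
        assign[var] = not assign[var]
    return assign if num_unsat() == 0 else None


# (A or !B) and (B or !C) and (C or !A) and (A or B or C)
clauses = [
    [("A", True), ("B", False)],
    [("B", True), ("C", False)],
    [("C", True), ("A", False)],
    [("A", True), ("B", True), ("C", True)],
]
result = walksat(clauses, ["A", "B", "C"], rng=random.Random(0))
```

The first three clauses force A, B and C to share one value and the last clause rules out all-false, so the only satisfying assignment is all-true. A `None` result would mean only that the flip budget ran out, not that the formula is unsatisfiable.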

40
WalkSAT Discussion
  • WalkSAT has proven to be very fast at solving
    complicated SAT problems
  • WalkSAT can solve some problems that systematic
    algorithms simply can't handle
  • Due to randomness, WalkSAT is incomplete
  • WalkSAT may fail to discover a solution

41
Introduction to LPG
  • LPG (local search for plan-graphs) by Alfonso
    Gerevini and Ivan Serina
  • Blackbox mapped the planning problem to a CSP and
    solved it using a SAT solver.
  • LPG unifies the planning and WalkSAT algorithms
    to create the WalkPlan search algorithm.

42
LPG Big Idea
  • Big Idea
  • Start with a random plan
  • While plan is incorrect / inconsistent
  • Identify and repair conflict
  • Basically the same idea of WalkSAT, but applied
    to a special form of plan-graph

43
Temporal Action Graphs
  • Definitions
  • Action-graph: the subset of a plan-graph
    containing the action layers
  • Support: a fact is said to be supported if it
    is achieved by some action in the previous action
    layer
  • Conflict: either
  • a mutex between two actions, or
  • an action with an unsupported precondition

44
Linearization of Action Graphs
  • An Action Graph can be made linear by allowing
    only one action per action layer.
  • The layers no longer explicitly represent an
    ordering of time (temporal concurrency is still
    possible)
  • The layer ordering simply presents an action
    sequence for the purposes of establishing fact
    support relationships.

45
Example Linear Action Graph
[Figure: plan-graph whose fact layers each list the facts A, B, C]
A plan-graph consists of alternating fact layers
and action layers.
The actions alone constitute an action graph.
LPG operates directly on the action graph
structure, inserting and removing actions from
various action layers as it repairs incomplete
plans.
46
Example Temporal Action Graph
47
Conflicts and Repair
  • An incomplete plan is manifested as an action
    graph with conflicts.
  • Example conflicts with resolution (repair)
    strategies

Conflict Description | Conflict Resolution Strategies
Permanent mutex between two actions in the same action layer | Remove one of the actions
Precondition mutex between two actions in the same action layer | Remove one of the actions, or add support for one of the mutex preconditions
Unsupported precondition for an action in an action layer | Add an action to the previous action layer that achieves the unsupported precondition, or remove the action whose preconditions are not satisfied
48
LPG Algorithm
LPG's WalkPlan Planning Algorithm
  • LPG
  • (1) Generate an initial dummy plan, P, either
  • Randomly
  • By adding actions to support all facts ignoring
    mutexes, or
  • Via some front-end plan generator
  • (2) Randomly choose a conflict in the action-graph,
    C
  • (3) Identify all possible ways of resolving C and
    evaluate them using the action evaluation
    function
  • Resolution techniques include removing one of
    two mutex actions, adding a supporting action for
    an unsupported fact, or removing an action that
    has an unsupported precondition
  • If a conflict resolution has cost 0, the plan is
    complete
  • Note: The action evaluation function uses
    Lagrange multipliers to dynamically weight the
    different factors in the action evaluation
    function
  • (4) If a resolution introduces no new conflicts,
    apply it and go to step (2). Else,
  • with probability p, randomly choose a resolution,
    apply it and go to step (2)
  • with probability 1-p, choose the lowest cost
    conflict resolution, apply it and go to step (2)
  • Note: The resolution step includes a mechanism
    for extending the plan-graph

Generate Initial Plan → Choose Conflict → Resolve & Evaluate → Resolution Selection
49
LPG Example
[Figure: action graph over the facts A, B, C, shown at each repair step]
Initial Conditions: ( nil )
Goals: ( A, B, C )
Actions:
  A0: preconds ( nil ), effects ( A )
  A1: preconds ( A ), effects ( A, B )
  A2: preconds ( A, B ), effects ( C )
Conflicts shown and their repairs:
  • Permanently mutex actions in the same action layer
    (resolved by removing one of the two actions)
  • Unsupported precondition (resolved by adding an
    achieving action at the previous action layer)
  • Unsupported precondition (resolved by removing the
    conflicting action)
  • Unsupported precondition (resolved by adding an
    achieving action at the previous action layer)
Note: No-ops are propagated during conflict
resolution
Steps: Initial dummy plan → Identify conflict → Resolve conflict → Plan complete
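A heavily simplified sketch of the repair loop on this example domain follows. It is not LPG's actual WalkPlan implementation, and several simplifications are assumptions: the plan is a plain action sequence (one action per layer), facts persist once achieved (standing in for no-op propagation), mutexes are ignored because the example actions have no delete-effects, the cost of a repair is simply the number of conflicts it leaves, and the search starts from an empty plan. The names `conflicts`, `repairs` and `walkplan` are hypothetical.

```python
import random

# name -> (preconds, effects), taken from the example above
ACTIONS = {
    "A0": (frozenset(), frozenset({"A"})),
    "A1": (frozenset({"A"}), frozenset({"A", "B"})),
    "A2": (frozenset({"A", "B"}), frozenset({"C"})),
}
INIT = frozenset()
GOALS = frozenset({"A", "B", "C"})


def conflicts(plan):
    """List (position, fact) pairs; position len(plan) marks an unmet goal."""
    state, found = set(INIT), []
    for i, name in enumerate(plan):
        pre, eff = ACTIONS[name]
        found += [(i, fact) for fact in sorted(pre - state)]
        state |= eff
    found += [(len(plan), fact) for fact in sorted(GOALS - state)]
    return found


def repairs(plan, conflict):
    """Candidate plans: insert an achiever before the conflict, or drop the action."""
    i, fact = conflict
    options = [plan[:i] + [name] + plan[i:]
               for name, (_, eff) in ACTIONS.items() if fact in eff]
    if i < len(plan):
        options.append(plan[:i] + plan[i + 1:])
    return options


def walkplan(p=0.2, max_steps=1000, rng=None):
    rng = rng or random.Random()
    plan = []                                   # assumed initial dummy plan
    for _ in range(max_steps):
        found = conflicts(plan)
        if not found:
            return plan
        options = repairs(plan, rng.choice(found))
        if rng.random() < p:
            plan = rng.choice(options)          # random resolution
        else:
            plan = min(options, key=lambda q: len(conflicts(q)))  # lowest cost
    return plan if not conflicts(plan) else None


plan = walkplan(rng=random.Random(0))
```

Depending on the random choices, the returned plan may contain more actions than the three-step sequence A0, A1, A2.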
50
LPG Analysis
  • Advantages
  • LPG is fast: four orders of magnitude faster
    than the leading optimal planners
  • LPG is domain-independent
  • LPG can easily handle resources and durative
    actions
  • Disadvantages
  • LPG is randomized, so plans are not usually
    optimal and often contain extraneous actions
  • LPG includes option to continue searching for
    multiple solutions, in the hope of finding better
    plans
  • While maintaining expressivity, LPG sacrifices
    optimality for speed.

51
AIPS 2002 Results (subset)
Planner | Problems Solved | Problems Attempted | Success Ratio | Capabilities
SHOP2, 2nd place (hand-coded) | 899 | 904 | 99% | Strips, Numeric, HardNumeric, SimpleTime, Time, Complex
TLPlan, 1st place (hand-coded) | 894 | 894 | 100% | Strips, Numeric, HardNumeric, SimpleTime, Time, Complex
LPG, 1st place (fully-automated) | 372 | 428 | 87% | Strips, Numeric, HardNumeric, SimpleTime, Time
52
Summary
  • Planning is hard!
  • We want planners that
  • are fast
  • are domain-independent
  • are optimal
  • handle durative actions / resources / uncertainty
  • Want a speedup?
  • Sacrificing expressivity helps
  • Sacrificing optimality helps more
  • Sacrificing generality helps the most
  • LPG is today's best planner that is
    domain-independent, expressive, and fast; to
    achieve speed, it sacrifices optimality and uses
    local search.

53
Planning References
  • Planning in general
  • Russell and Norvig, Artificial Intelligence: A
    Modern Approach, section IV, Prentice Hall, 2nd
    edition (December 20, 2002)
  • AIPS International Planning Competition, 2002
  • http://www.dur.ac.uk/d.p.long/competition.html
  • Graphplan
  • A. Blum and M. Furst, "Fast Planning Through
    Planning Graph Analysis", Artificial
    Intelligence, 90:281-300 (1997).
  • www.cs.cmu.edu/avrim/graphplan.html
  • LPG
  • A. Gerevini and I. Serina, "Planning through
    Stochastic Local Search and Temporal Action
    Graphs", technical report from Universita degli
    Studi di Brescia, November, 2002.
  • prometeo.ing.unibs.it/lpg/