Title: Execution Monitoring
1. Execution Monitoring / Replanning
2. Replanning(?)
- Why would we need to replan?
  - You recognize, during execution, that things are not going according to the plan
    - Execution failure
    - Quality reduction
- How can this happen?
  - Simple answer: modeling failure (or intentional model simplification)
    - The world is not static (dynamic)
    - The actions are not instantaneous (durative)
    - The world is not deterministic (stochastic)
    - The world is not fully observable (partially observable)
    - The specific action model you assumed is faulty
      - There are additional preconditions for actions
      - There are additional effects for actions
    - The specific cost/reward model you assumed is faulty
      - Actions are more (or less) costly than you assumed
      - Goals have higher (or lower) reward than you assumed
  - The problem specification is not yet complete(!)
    - New goals are being added
    - Some goals are being retracted
3. Replanning (contd.)
- What should you do?
- First, you need to recognize that something is astray: execution monitoring
  - Could be non-trivial if what is going astray is plan quality
- Then, you need to fix the problem, at least for now (a sketch of this loop follows below)
  - Simplest: restart execution (somewhere)
  - Complex: modify the plan
    - Figure out where you are (both initial and goal states, and cost/reward metrics)
      - Init state? Sense
      - Goal state? Re-select objectives
      - Cost/reward? Modify costs/rewards to allow for new goals, impossible actions, and/or commitment/reservation penalties
    - Plan
      - This process can be different from normal planning (commitments caused by publication of the old plan)
- Finally, if this keeps happening, you need to fix the model
  - There is nothing wrong in going with the wrong model if it causes only very occasional failures (all models are wrong!)
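To make the recognize-and-fix loop concrete, here is a minimal Python sketch; the sense/plan/execute/expected_state hooks are hypothetical placeholders rather than any particular planner's API:

```python
# Minimal sketch of the monitor-and-replan loop described above.
# sense(), plan(), execute(), and expected_state() are hypothetical hooks.

def run_with_monitoring(sense, plan, execute, expected_state, goals):
    """Execute a plan, monitoring after each step and replanning on surprise."""
    steps = plan(sense(), goals)
    failures = 0
    while steps:
        action = steps.pop(0)
        execute(action)
        state = sense()                       # execution monitoring
        if state != expected_state(action):   # something is astray
            failures += 1                     # if this stays high, fix the model
            steps = plan(state, goals)        # simplest fix: replan from here
    return failures
```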
4. Simple Replanning Scenario
- Replanning necessitated only because of correctness considerations (no regard to optimality)
- Problem specification is complete (no new goals are being added)
5. Things are more complicated if the world is partially observable → need to insert sensing actions to sense fluents that can only be indirectly sensed
6. Cutset(s) = { P such that ⟨s1, P, s2⟩ is a causal link and s1 < s < s2 }
For sequential plans, this is also simply the regression of the goal state up to this action → the cutset
(cf. Triangle Tables)
Can be generalized to partially ordered plans
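To make the cutset/regression idea concrete, here is a small Python sketch that computes the regression states of a sequential STRIPS-style plan (actions as precondition/add/delete sets); the truck-and-package example is invented, and the consistency check that the action deletes no regressed goal is omitted for brevity:

```python
# Regression states (cutsets) of a sequential plan:
# regress(G, a) = (G - adds(a)) | preconds(a), applied backwards from the goal.

def regress(goal, preconds, adds):
    return (goal - adds) | preconds

def regression_states(plan, goal):
    """Return [G1..Gn, goal], where Gi must hold just before plan[i] runs."""
    states = [goal]
    for preconds, adds, _dels in reversed(plan):
        states.append(regress(states[-1], preconds, adds))
    states.reverse()
    return states

# Invented example: pick up a package at A, then drive from A to B.
pick  = ({"at(truck,A)", "pkg(A)"}, {"holding(pkg)"}, {"pkg(A)"})
drive = ({"at(truck,A)"}, {"at(truck,B)"}, {"at(truck,A)"})
print(regression_states([pick, drive], {"at(truck,B)", "holding(pkg)"}))
```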
7. This involves a disjunction of conjunctive goal sets!
The only reason to get back to the old plan is to reduce planning cost
8. (Simple) Replanning as Disjunctive Goal Sets
- Suppose you are executing a plan P which goes through regression states (or cut-sets) G1..Gn
- You find yourself in a state S
- If any of G1..Gn holds in S, then restart execution from the action after that state
- If not, you need to get from S to any one of G1..Gn
  - Use the relaxed plan heuristic to find out which of G1..Gn is closest to S; suppose it is Gi
  - Solve the problem ⟨S, Gi⟩
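A sketch of this restart-or-replan logic, continuing the set-based representation above; relaxed_plan_cost and solve stand in for the planner's heuristic and search routines, not a specific API:

```python
# Replanning over the disjunctive goal set G1..Gn (regression states).
# `relaxed_plan_cost` and `solve` are assumed planner hooks.

def replan_from(state, regr_states, plan, relaxed_plan_cost, solve):
    # If some Gi already holds in S, restart execution from that point.
    for i, g in enumerate(regr_states):
        if g <= state:                        # Gi is a subset of S
            return plan[i:]
    # Otherwise head for whichever Gi the relaxed plan heuristic says is
    # closest to S, and splice the old plan suffix back on afterwards.
    i = min(range(len(regr_states)),
            key=lambda j: relaxed_plan_cost(state, regr_states[j]))
    return solve(state, regr_states[i]) + plan[i:]
```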
9. Replanning as the universal antidote to domain-modeling laziness
- As long as the world is forgiving, you can always go with a possibly faulty domain model during planning, and replan as needed
- You learn to improve the domain model only when the failures are getting too frequent..
- (The alternative of going with the correct domain model up-front can be computationally intractable!)
10. Stochastic Planning with Replanning
- If the domain is observable and lenient to failures, and we are willing to do replanning, then we can always handle non-deterministic as well as stochastic actions with classical planning!
  - Solve the deterministic relaxation of the problem (sketched below)
  - Start executing it, while monitoring the world state
  - When an unexpected state is encountered, replan
- A planner that did this in the first International Planning Competition Probabilistic Track, called FF-Replan, won the competition.
- (Much to the chagrin of many planners which took the stochastic dynamics into account while doing planning..)
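The "deterministic relaxation" here is a determinization of the stochastic actions. Below is a toy Python sketch of the single-outcome (most likely outcome) variant; the gripper action is invented for illustration:

```python
# Single-outcome determinization in the spirit of FF-Replan: replace each
# stochastic action by its most probable outcome, then plan classically.

def determinize(stochastic_actions):
    det = {}
    for name, (preconds, outcomes) in stochastic_actions.items():
        # outcomes: list of (probability, adds, dels) triples
        prob, adds, dels = max(outcomes, key=lambda o: o[0])  # most likely
        det[name] = (preconds, adds, dels)
    return det

# A gripper that drops the ball 20% of the time becomes a reliable gripper.
actions = {
    "pickup": ({"at(ball,A)"},
               [(0.8, {"holding(ball)"}, {"at(ball,A)"}),
                (0.2, set(), set())]),
}
print(determinize(actions))
```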
11. 20 years of research into decision-theoretic planning, ..and FF-Replan is the result?
30 years of research into programming languages, ..and C is the result?
12. Quality-sensitive replanning?
13. MURI Rescue Scenario
- Human and robot collaborating on a rescue scenario
- The planner helps the robot in prioritizing its goals and selecting its actions
- The planning part has characteristics of:
  - Online planning (new goals may arrive as the current plan is being executed; relative rewards for existing goals may change because of affect)
  - Replanning (the current plan may hit execution snags)
  - Opportunistic planning (previously inactive goals may become active because of the knowledge gained during execution)
  - Commitment-sensitivity (the robot needs to be sensitive to the plans that it said it will be following)
14. Can the PSP model help?
- We argued that the PSP model helps in MURI
  - It does help in capturing the replanning, changing utilities, and commitment sensitivity
- Can we extend it to also handle opportunistic goals?
  - Simple answer: Yes; we just re-select objectives (goals) during each replanning epoch
15. Opportunistic Goals in PSP
Would these be related to conditional reward models? E.g., how to model the goal that if you see someone injured, help them (and not let the robot injure someone just so it can collect the reward)
- Opportunistic goals can be handled in the PSP model without much change (see the sketch below)
- Goals like "spot aliens" may be seen as always being present in the list of goals that the planner (robot) has
- Initially, these goals may not be picked because, despite having high reward, they also have high cost (i.e., no cheap plan to satisfy them, even as estimated by the relaxed plan analysis)
- As execution progresses, however, the robot may reach states from which these goals become reachable (even as estimated by the PSP goal selection heuristic)
- Note that this happens only because the world is not static
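A toy sketch of why a sleeper goal gets picked up later: PSP-style objective selection by net benefit (reward minus estimated cost), with goals scored independently for simplicity (real PSP heuristics account for goal interactions). The states, goals, and numbers are all invented:

```python
# Toy PSP-style objective selection: keep the goals whose estimated net
# benefit (reward minus relaxed-plan cost estimate) is positive.

def select_objectives(state, goals, est_cost):
    """goals: {goal: reward}. Return the goals worth pursuing from state."""
    return {g for g, reward in goals.items() if reward - est_cost(state, g) > 0}

goals = {"deliver_supplies": 50, "spot_aliens": 1000}

def est_cost(state, goal):
    # Invented numbers: "spot_aliens" is hopeless from base, cheap from crater.
    return {("base", "spot_aliens"): 10_000, ("crater", "spot_aliens"): 100,
            ("base", "deliver_supplies"): 20,
            ("crater", "deliver_supplies"): 20}[(state, goal)]

print(select_objectives("base", goals, est_cost))    # only deliver_supplies
print(select_objectives("crater", goals, est_cost))  # both goals selected
```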
16. Replanning: Respecting Commitments
- In the real world, where you make commitments based on your plan, you cannot just throw away the plan at the first sign of failure
- One heuristic is to reuse as much of the old plan as possible while doing replanning.
- A more systematic approach is to:
  - Capture the commitments made by the agent based on the current plan
  - Model these commitments in terms of penalties for certain (new) goals
  - Just as goals can have rewards, they can also have penalties that you accrue for not achieving them. Makes PSP objective selection a bit more interesting :-) (see the sketch below)
"The worst team member is not the one who doesn't do anything, but rather the one who promises but doesn't deliver."
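Extending the toy selection above with penalties: a commitment becomes a goal that costs you if it is not achieved, so objective selection weighs breaking it against whatever displaces it. All names and numbers are invented:

```python
# Toy net-benefit computation with commitment penalties: penalties are
# accrued for committed goals that are NOT in the achieved set.

def net_benefit(achieved, rewards, penalties, plan_cost):
    reward  = sum(rewards[g] for g in achieved)
    penalty = sum(p for g, p in penalties.items() if g not in achieved)
    return reward - penalty - plan_cost(achieved)

rewards   = {"meet_friend": 30, "new_rescue": 200}
penalties = {"meet_friend": 100}        # we told the friend we'd be there

def plan_cost(achieved):                # hypothetical plan-cost estimates
    return {(): 0, ("meet_friend",): 20, ("new_rescue",): 60,
            ("meet_friend", "new_rescue"): 90}[tuple(sorted(achieved))]

subsets = [(), ("meet_friend",), ("new_rescue",), ("meet_friend", "new_rescue")]
best = max(subsets, key=lambda a: net_benefit(a, rewards, penalties, plan_cost))
print(best)   # keeping the commitment AND doing the rescue wins here
```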
17. Interaction between Opportunistic Goals and Commitments
- Even if a high-reward "sleeper" goal becomes available because a plan for it is feasible, it may still not get selected, because of the commitments already made by the partial execution of the current plan
- The interesting point is that the objective selection phase used by PSP should be able to handle this automatically (as long as we did post the commitment-induced goals onto the stack).
18. Monitoring for Optimality
- Given the online nature of planning, we need to assume an epoch-based model of planning, where every so often you replan
- So as not to spend your whole life planning, you need to be good at monitoring
  - Not just potential execution failures
  - But also potential optimality reductions
    - (The plan being followed is no longer likely to be optimal.)
- Optimality monitoring has been considered by Koenig (in the Lifelong Planning work) and more recently by Fritz & McIlraith. Their approaches are similar and have several limitations
  - The annotations used are often of the size of the search space. E.g., the idea in Fritz seems to be mostly to keep track of all possible action sequences (including those that weren't applicable originally) and see if they become applicable and reduce the f-value. Secondly, Fritz doesn't consider optimality damage caused by, for example, sleeping goals becoming active
- A better idea may be to reduce the scope of monitoring and check necessary conditions and sufficient conditions for optimality separately (e.g., secondary search); a sketch follows below
- Monitoring, in general, may have to do some relaxed-plan-based objective (re)selection, to see whether or not the current plan's focus on the selected set of goals is still optimal
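One way to "reduce the scope of monitoring" is a cheap necessary-condition test: only trigger full replanning when an optimistic (e.g. relaxed plan) bound on what replanning could achieve beats the current plan's remaining net benefit. A minimal sketch, with optimistic_bound as a hypothetical heuristic hook:

```python
# Epoch-based optimality monitoring via a cheap necessary condition:
# if even an optimistic bound on replanning-from-here cannot beat the
# current plan's remaining benefit by more than `margin`, keep executing.

def should_replan(state, remaining_benefit, optimistic_bound, margin=0.0):
    return optimistic_bound(state) > remaining_benefit + margin

# Usage sketch at each monitoring epoch:
#   if should_replan(observed, benefit_of(plan_suffix), h_bound):
#       plan = replan(observed)
```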
19. Allegiance to the Old Plan in Replanning
- You never have any allegiance to the old plan in replanning from the execution-cost point of view
- You may try to salvage the old plan to reduce planning cost
- You can have allegiance to the commitments you made if you published your old plan (the allegiance is to the commitment, not to the specific plan)
  - These can be modeled in terms of goal penalties
- Of course, one way of ensuring commitments are not broken is to stick to the exact old plan. But this could be sub-optimal
  - E.g., I was going to go by a red Mercedes to Tucson, and when I published this, my friend in Casa Grande said he will meet me to say hello. Now the commitment is only to meeting the friend in Casa Grande, not to driving the red Mercedes.
I make a big deal about this only because, in the literature, replanning is made synonymous with sticking to as much of the old plan as possible
20. Effect of Planning Strategy on Allegiance to the Plan
- Once we agree that allegiance to the old plan can be useful to reduce planning cost, a related question is what planning strategies are best equipped to modify an existing plan to achieve the new objectives.
- Here, there is some argument in favor of partial-order planners (in as much as they allow insertion of actions into the existing plan)
- But I am not fully convinced
21. Epilogue
- As we go forward to look at how to do planning in the presence of more and more expressive domain models, remember that you can (and may) intentionally simplify the domain model, plan with it, and then replan..
- So you already know one way of handling dynamic, multi-agent, stochastic, non-instantaneous, etc. worlds
- And it may not even be all that sub-optimal, considering the success of FF-Replan
22. Related Work
- Ours
  - All the PSP work
  - The online planning work (Benton/Minh)
  - The commitment-sensitive replanning work (Will)
- Outsiders
  - Lifelong Planning A* work by Sven Koenig
  - Optimality monitoring work by Fritz & McIlraith
  - Online anticipatory algorithms?