Title: Multiagent Planning under Resource Constraints
1. Multiagent Planning under Resource Constraints
2. Presentation Plan
- Multiagent Systems
- Cooperative planning
- A specific study in a certain context (a paper from AAMAS-2003)
- Conclusion
3. Multiagent System (MAS)
- A loosely coupled network of problem solvers that interact to solve problems that are beyond the individual capabilities or knowledge of any single agent
4. Why MAS?
- Real problems are too large and complex for a single agent
- Individual agents are limited by their knowledge, computing resources, and perspective
- Provide efficient solutions where resources are spatially distributed
  - Distributed sensors, seismic monitoring, information gathering
- Provide solutions where expertise is distributed
  - Concurrent engineering, manufacturing, health care
5. MAS Characteristics
- Modular, distributed systems
- Decentralized data
- Each agent has incomplete information or capabilities
- No global system control
- Asynchronous computation
6. Cooperative Planning (1)
- Is an important topic in MAS
- Objectives
  - Coordinate plans
  - Share resources
  - Share goals
7. Cooperative Planning (2)
- Is used for two reasons
  - There exist problems that cannot be solved by a single agent in isolation
  - To improve efficiency and save cost even when the problems could be solved by agents on their own
8. Cooperative Planning (3)
- An example in Supply Chain Management
  - First, a chain manager assigns each agent a part of the task
  - Next, the agents create their plans to complete their parts of the task
  - Finally, the chain manager analyses these plans and may insist on cooperation between some agents
- In such cases, cooperation can be accomplished by plan revision: an agent tries to revise part of its plan by exchanging resources and goals with other agents.
9. Cooperative Planning (4)
- Most existing research
  - Negotiation
  - Plan merging
  - Multiagent MDPs (Markov Decision Processes)
  - (Generalized) Partial Global Planning
- Common points
  - To avoid conflicts
  - Assume sufficient resources
10. Cooperative Planning (5)
- An important consideration
  - How much does an agent know ahead of time about the other agents?
- Three possibilities
  - Knowing nothing
  - Knowing everything it needs to know
  - Knowing some of it
11. Paper Study
- Title
  - Multiagent Planning for Agents with Internal Execution Resource Constraints
- Objective
  - Study how agents can cooperate to revise their plans so as to ensure they do not over-use their local resources
12. Introduction (1)
- Concepts
  - Unconditional events
    - e.g., rockslides in the road
  - Conditional events
    - e.g., merging traffic
  - Execution resources
    - Include the perceptual, effectual, and reasoning capabilities used during execution
13. Introduction (2)
- An ideal agent could manage its resources
  - To respond rapidly and correctly to all events of both types
  - To guarantee hard real-time performance
14. Introduction (3)
- A realistic agent
  - Execution resources are constrained
  - Has to give up on guaranteeing timely responses to some events
  - Concentrates its resources on other, more important demands (e.g., the driver might focus on traffic ahead at the expense of missing signs for an exit)
  - Might modify its behavior to elongate reaction times for events (e.g., drive more slowly)
  - Adopts restrictions on its behavior to eliminate some dangerous controllable events (e.g., drive on the right)
  - Shares information to help other agents know what conditional events to be prepared for (e.g., use directional signals)
15. Strategy in General
- The agent prioritizes its use of resources by planning for events in order of their occurrence probabilities
- Unlikely events are ignored when resources are insufficient.
16. CIRCA
- Cooperative Intelligent Real-time Control Architecture
- Realizes the strategy in a MAS with execution resource constraints
- Models the interactions between actions and (conditional and unconditional) events
- Selects, schedules, and executes recognition-reactions
17. Two Components
- AIS (Artificial Intelligence Subsystem)
  - Probabilistic Planner
    - Searches through the state space to determine the appropriate reactions for hazardous states
    - Generates a set of recognition-reactions (TAPs)
    - Chooses the period for each TAP
  - Scheduler
    - Based on the resource constraints of the RTS
    - Schedules the set of TAPs according to their periods
- RTS (Real-Time Subsystem)
  - Executes the real-time control plans pre-computed by the AIS
18. Concepts (1)
- TAPs
  - Test-Action Pairs, i.e., recognition-reactions
  - The recognition test is done by actively collecting data or monitoring the relevant aspects of the world
  - A reaction is executed only if the world matches the state description in the corresponding recognition test
  - Are also referred to as actions later
19. Concepts (2)
- Control Plan
  - Is composed of a scheduled set of recognition-reaction pairs
  - Is a cyclic (periodic) real-time schedule of TAPs
- Processor utilization of each TAP
  - u = (worst-case testing time + worst-case execution time) / period
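The utilization formula above can be checked in a few lines (a sketch; the function and variable names are ours):

```python
# Per-TAP processor utilization, following the slide's formula:
# u = (worst-case testing time + worst-case execution time) / period.
def utilization(test_time: float, exec_time: float, period: float) -> float:
    return (test_time + exec_time) / period

# In this simple model, a TAP set can only be scheduled if the
# utilizations sum to at most 1.
taps = [(0.5, 1.5, 10.0), (1.0, 1.0, 5.0), (0.2, 0.8, 4.0)]
total = sum(utilization(t, e, p) for t, e, p in taps)
# total is approximately 0.85 (0.2 + 0.4 + 0.25), so this set fits
```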
20. Concepts (3)
- Unlikely-state (cutoff) heuristic
  - If the total u > 1 for a set of TAPs, no schedule is possible!
  - In this case,
    - CIRCA computes the probabilities, called state probabilities, of the agent reaching different states based on its local state diagram
    - It finds a subset of the TAPs by removing those planned for states with state probabilities below a threshold
    - It keeps increasing this threshold until a schedulable subset is found
  - Problem
    - The failure probability may increase when the heuristic is applied
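The threshold-raising loop can be sketched as follows. The data shapes are ours (CIRCA works over its state diagram, not bare tuples):

```python
# Unlikely-state (cutoff) heuristic: when the TAPs are unschedulable
# (total utilization > 1), raise a probability threshold and drop the
# TAPs planned for states below it, until a schedulable subset remains.
def cutoff(taps, step=0.05):
    """taps: list of (state_probability, utilization) pairs."""
    threshold, kept = 0.0, list(taps)
    while sum(u for _, u in kept) > 1.0:
        threshold += step
        kept = [(p, u) for p, u in taps if p >= threshold]
    return threshold, kept

taps = [(0.9, 0.5), (0.6, 0.4), (0.1, 0.3)]   # total u = 1.2: unschedulable
threshold, kept = cutoff(taps)
# The TAP for the 0.1-probability state is dropped; the rest fit.
assert kept == [(0.9, 0.5), (0.6, 0.4)]
```

Note how this realizes the "Problem" bullet: any dropped TAP was planned for a real, if unlikely, state, so failure probability can rise.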
21. Concepts (4)
- Necessary actions
  - Are those that an agent may have to perform during execution to preempt some hazards
  - Are planned for unconditional events and some conditional events
- Unnecessary actions
  - Are those that the agent includes in its plan due to its ignorance about the plans of other agents
  - Are planned for those conditional events that will not arise
- The goal is to identify and remove enough unnecessary actions to avoid resorting to the cutoff heuristic
22. Concepts (5)
- State-space representation
  - Is constructed from
    - A set of state propositions, called state features
    - Actions and events, called transitions
  - A state consists of a set of state features that describe the different aspects of the world
  - Two types of transitions
    - Action transitions, controlled by the plan executor in the RTS
    - Temporal transitions, events outside the system's control
23. Concepts (6)
- Temporal transitions
  - Two types
    - Innocuous temporal transitions (labeled tt), or
    - Deleterious temporal transitions leading to system failure (labeled ttf)
  - Any temporal transition is described by
    - A precondition
    - An effect
    - A probability function
      - Describes the probability of the transition happening as a function of the time since it was enabled, independently of other transitions
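For illustration only (the slide does not fix a particular distribution), such a probability function might look like an exponential model:

```python
import math

# P(transition has fired within t time units of being enabled), under an
# ASSUMED exponential model with rate lam; the real function is
# domain-specific and need not be exponential.
def fired_by(t: float, lam: float = 0.5) -> float:
    return 1.0 - math.exp(-lam * t)

assert fired_by(0.0) == 0.0          # cannot fire before being enabled
assert 0.99 < fired_by(10.0) < 1.0   # nearly certain after a long wait
```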
24. Concepts (7)
- Guaranteed actions
  - When there is a ttf in a state, CIRCA plans a TAP to preempt the hazard
  - Preempting actions are called guaranteed actions
- Reliable actions
  - Another type of action, also scheduled with real-time deadlines and thus utilizing resources
  - However, they do not preempt any explicit failures
25. Concepts (8)
- Private (local) features
  - Are those that no other agents are interested in, e.g., the agent's current fuel level
  - Do not appear in the state diagrams of other agents
- Public (shared) features
  - Are those features that more than one agent is interested in
  - An agent includes in its feature set only the public features that it cares about
  - It is through manipulating the public features that agents impact each other
26. Concepts (9)
- Furthermore, a CIRCA agent includes in its KB, from other agents,
  - Some public temporal transitions (labeled tts)
  - Some public temporal action transitions (labeled ttacs)
- tts and ttacs can affect the public features the agent cares about
27. A State Diagram
- The diagram shown on the next page is a partial state diagram for an agent named FIGHTER. It is also the reachability graph for FIGHTER.
- Action SHOOT-MISSILE-1 is a guaranteed action to preempt the ttf BEING-ATTACKED.
- Action HEAD-TO-LOC1 is a reliable action and private to FIGHTER.
- COMM and ENEMY are public features shared by both BOMBER and FIGHTER.
- HEADINGF and LOCF are private features that are accessible only to FIGHTER.
- BBOMB-1 and BBOMB-2 are public actions of BOMBER.
- The temporal transitions FLY-TO-LOC0, FLY-TO-LOC1, and FLY-TO-LOC2 are private to FIGHTER.
28. State Diagram for FIGHTER
[Figure: partial state diagram for FIGHTER. Nodes are states over the features COMM, ENEMY, HEADINGF, LOCF, and FAILURE. Edges are the actions HEAD-TO-LOC0, HEAD-TO-LOC1, HEAD-TO-LOC2, SHOOT-MISSILE-1, and SHOOT-MISSILE-2; the temporal transitions FLY-TO-LOC0, FLY-TO-LOC1, FLY-TO-LOC2, and BEING-ATTACKED; and BOMBER's public action transitions (ttacs) BBOMB-1 and BBOMB-2. Legend: public features/actions/temporals in italic, private ones in normal type; goal states drawn with dotted edges; failure states drawn with thick borders.]
29. Reachability Analysis
- A rational agent needs to foresee what actions other agents might take, and choose its own actions accordingly.
- To play it safest, the agent must consider and plan for all states that it foresees; such an analysis is a reachability analysis.
- However, some of those states might never arise: unreachable states.
- Unreachable states are included in a reachability graph only because of the agent's initial ignorance, and can be removed once they are recognized as such.
- In an ideal case, an agent need not know the other agents' plans. But due to resource constraints, it needs to know the intersecting parts of their plans.
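Reachability analysis itself is a plain graph search; a sketch over an illustrative adjacency-list encoding of the state diagram:

```python
from collections import deque

# Breadth-first reachability over the state graph: states never reached
# from the start state, and any actions planned only for them, can be
# dropped from the plan.
def reachable(graph: dict, start: str) -> set:
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        for nxt in graph.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Once another agent's answer removes the edges into s3, both s3 and s4
# become unreachable:
graph = {"s0": ["s1"], "s1": ["s2"], "s3": ["s4"]}
assert reachable(graph, "s0") == {"s0", "s1", "s2"}
```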
30. Convergence Protocol (1)
- Benefits
  - Agents can identify unreachable states in their state diagrams and
  - eliminate the associated actions from their tentative plans
  - e.g., see the earlier figure
- Assumptions before using it
  - The agents have locally formed their reachability graphs and
  - have selected all actions they would like to take (as if there were no resource constraints)
31. Convergence Protocol (2)
- Inquiring agent
  - Choose the uncertain point that gives the biggest estimated utilization reduction
  - Ask the corresponding agent which action(s) it will take
  - Upon receiving an answer, update the state diagram and drop unnecessary actions from the local plan
  - Loop until either the resource constraints are satisfied or all uncertain points are examined
- Answering agent
  - When asked by another agent about an uncertain point:
    - Identify the corresponding state(s) in the local graph
    - Reply with the action(s) (or none) planned for the state(s)
    - Record the asking agent's name with the state(s)
  - If an action is removed from its state diagram/plan:
    - Inform all agents whose names are recorded with the state that the action is no longer planned for that state
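The inquiring agent's loop can be sketched as follows. Here `ask` stands in for the message exchange with the answering agent, and the data shapes (a plan as a map from uncertain points to utilizations) are ours, not the paper's:

```python
# Inquiring-agent side of the Convergence Protocol (sketch): examine the
# uncertain points in order of estimated utilization reduction, ask the
# other agent, and drop actions its answer makes unnecessary.
def converge(points, plan, capacity, ask):
    """plan maps an uncertain point to the utilization of the action
    planned for it; ask(point) is True if the other agent's answer shows
    that the action is unnecessary."""
    for point in sorted(points, key=lambda p: plan.get(p, 0.0), reverse=True):
        if sum(plan.values()) <= capacity:
            break                 # resource constraints already satisfied
        if ask(point):
            plan.pop(point, None) # unnecessary action dropped
    return sum(plan.values()) <= capacity

plan = {"p1": 0.4, "p2": 0.3, "p3": 0.5}               # total u = 1.2
ok = converge(list(plan), plan, 1.0, ask=lambda p: p != "p1")
assert ok and set(plan) == {"p1", "p2"}                # p3's action dropped
```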
32. Convergence Protocol (3)
- Main points
  - An uncertain point is a combination of a state and a set of mutually exclusive ttacs.
  - If the agent starts with sufficient resources, then it is guaranteed to find a plan that schedules all the actions, and the agent's utility is not compromised.
  - If an agent fails to schedule all remaining actions, it resorts to the unlikely-state heuristic to remove the most unlikely (but possibly necessary) actions. The agent's utility decreases only when it drops some necessary actions by raising the probability threshold.
33. Demonstration
- Both FIGHTER and BOMBER have 5 actions to schedule if they do not know each other's plan.
- See the reachability graph for BOMBER.
- Suppose the resource constraints are simplified such that each agent can schedule only 4 TAPs.
- Running the Convergence Protocol, FIGHTER asks BOMBER what actions it plans for ((COMM F) (ENEMY F)).
- We can then see the results.
34. Evaluation (1)
- Experiment environment
  - A set of random domains. Each domain has a random number of agents, from 2 up to a maximum of 10.
  - Each agent has its own knowledge base. The knowledge base has 7 private and public binary features (T/F) in total.
  - The number of public features in a domain is random.
  - There are 15 private and public actions combined, and 7 private and public temporal transitions combined, for each agent.
  - 1126 agents (KBs) were generated for 402 domains, with which the experiments were performed.
35. Evaluation (3)
- Experiment results (following)
  - Action effectiveness is the percentage of unnecessary actions removed by the protocol.
    - The average effectiveness is 51.74% and the standard deviation is 35.84.
  - State effectiveness is the percentage of states included in an agent's reachability graph but removed by the protocol.
    - The average effectiveness is 53.74% and the standard deviation is 29.45.
  - The data suggest that more than half of the resources are often wasted when an agent is ignorant about the plans of other agents.
  - Very often more than 50% of the states that agents think they may encounter are in fact not reachable.
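Both metrics are simple ratios; a sketch with assumed counts (the averages above come from the paper's experiments, not from these numbers):

```python
# Effectiveness = percentage of candidates (unnecessary actions, or
# states in the reachability graph) that the protocol actually removed.
def effectiveness(removed: int, total: int) -> float:
    return 100.0 * removed / total

# e.g., 6 of 11 unnecessary actions removed by the protocol:
assert round(effectiveness(6, 11), 2) == 54.55
```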
36. Conclusions
- Strategy review
  - First the agents construct their reachability graphs,
  - then iteratively refine their plans using the Convergence Protocol.
  - The agents cooperatively determine the set of states for which they need to react by exchanging partial plans to generate more coherent views of each other's activities.
- Experiments conclusion
  - The experiments suggest that it is often worthwhile for agents to exchange partial details of their plans under resource constraints.
- One major drawback
  - The protocol requires the agents to construct their entire reachability graphs before they start to talk.
37. References
- Li, H., Durfee, E. H., and Shin, K. G. 2003. Multiagent Planning for Agents with Internal Execution Resource Constraints. AAMAS-2003.
- Boutilier, C. 1999. Sequential Optimality and Coordination in Multiagent Systems. IJCAI-99.
- Durfee, E. H. and Lesser, V. R. September 1991. Partial Global Planning: A Coordination Framework for Distributed Hypothesis Formation.
- Georgeff, M. 1983. Communication and Interaction in Multi-agent Planning.
- Shintani, T., Ito, T., and Sycara, K. 2000. Multiple Negotiations among Agents for a Distributed Meeting Scheduler.
38. Questions?
- Thank You!
- Merci beaucoup!