Title: Meta-Level Control in Multi-Agent Systems
1 Meta-Level Control in Multi-Agent Systems
- Anita Raja and Victor Lesser
- Department of Computer Science
- University of Massachusetts
- Amherst, MA 01002
2 Bounded Rationality
"A theory of rationality that does not give an
account of problem-solving in the face of
complexity is sadly incomplete. It is worse than
incomplete; it can be seriously misleading by
providing solutions that are without operational
significance." (Herb Simon, 1958)
Basic Insight: Computations are actions with
costs
3 Motivation
- Control actions like scheduling and coordination
can be expensive
- Current multi-agent systems do not explicitly
reason about these costs
- Need to account for costs at all levels of
reasoning to provide accurate solutions
- Build a meta-level control framework with minimum
cost that reasons about the cost of different
control actions
4 Assumptions
- Agent can pursue multiple tasks simultaneously
- Agent can partially fulfill or omit tasks
- Agent can coordinate with other agents to
complete tasks
- Tasks have varying arrival times, deadlines and
associated utilities
- Tasks have alternate ways of being achieved
- Objective function: maximize utility over a fixed
time horizon
5 Agent Architecture
6 Meta-level Decision Taxonomy
- Whether to accept, delay or reject an incoming
new task?
- How much effort to put into reasoning about a new
task?
- Whether to negotiate with another agent about
task transfer?
- Whether to renegotiate in case of failure of
previous negotiation?
- Whether to re-evaluate current plan when a task
completes?
7 Decision Tree for New Task Arrival Event
8 Some State Features
Name | Description                                                   | Value        | Complexity
F0   | Relative utility of new task                                  | High/Med/Low | Simple
F1   | Relative deadline of new task                                 |              | Simple
F2   | Relative utility of current schedule                          |              | Simple
F8   | Relation of slack fragments to current schedule               |              | Complex
F9   | Relation of other agents' slack fragments to non-local task   | High/Med/Low | Complex
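The High/Med/Low values in the table suggest that raw quantities are abstracted into coarse qualitative levels before the meta-level controller uses them. A minimal sketch of such an abstraction for F0 follows. It assumes that "relative utility" is the ratio of the new task's utility to the utility of the current schedule; that definition, the cut points 0.5 and 1.5, and all names are hypothetical and are not taken from the slides.

```python
from enum import Enum


class Level(Enum):
    LOW = "Low"
    MED = "Med"
    HIGH = "High"


def discretize(ratio: float, low_cut: float = 0.5, high_cut: float = 1.5) -> Level:
    """Map a non-negative ratio to a coarse level (cut points are illustrative)."""
    if ratio < low_cut:
        return Level.LOW
    if ratio < high_cut:
        return Level.MED
    return Level.HIGH


def relative_utility(new_task_utility: float, schedule_utility: float) -> Level:
    """F0 sketch: utility of the new task relative to the current schedule."""
    if schedule_utility <= 0:
        # Assumption: with an empty or worthless schedule, any task with
        # positive utility counts as High.
        return Level.HIGH if new_task_utility > 0 else Level.LOW
    return discretize(new_task_utility / schedule_utility)
```

A cheap lookup like this would fit the "Simple" complexity label; the "Complex" features (F8, F9) presumably need schedule analysis and are not sketched here.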
9 Some Heuristic Decisions
- If current schedule has low priority (expected
quality is low) and incoming task is of high
priority (high expected quality with tight
deadline), then drop current schedule and
schedule new task immediately.
- If current schedule has very high priority and
new task has low expected utility and a tight
deadline, drop the new task.
- If current task to be scheduled has high
execution uncertainty associated with it and a
deadline which is not tight, then introduce high
slack in the schedule and use medium scheduling
effort.
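The three heuristics above can be read as condition-action rules over qualitative state features. The sketch below encodes them directly. The field names, the `Rating` scale, the order in which rules are tried, and the fallback action are all assumptions made for illustration; the slides give only the rules themselves.

```python
from dataclasses import dataclass
from enum import IntEnum


class Rating(IntEnum):
    LOW = 0
    MED = 1
    HIGH = 2
    VERY_HIGH = 3


@dataclass
class MetaState:
    schedule_priority: Rating      # priority of the current schedule
    task_priority: Rating          # priority / expected utility of the task at hand
    deadline_tight: bool           # whether the task's deadline is tight
    execution_uncertainty: Rating  # uncertainty of the task to be scheduled


def decide(state: MetaState) -> dict:
    """Return the first matching heuristic decision (rule order is an assumption)."""
    # Rule 1: low-priority schedule, high-priority task with a tight deadline.
    if (state.schedule_priority == Rating.LOW
            and state.task_priority == Rating.HIGH
            and state.deadline_tight):
        return {"action": "drop_schedule_and_schedule_new_task"}
    # Rule 2: very-high-priority schedule, low-utility task with a tight deadline.
    if (state.schedule_priority == Rating.VERY_HIGH
            and state.task_priority == Rating.LOW
            and state.deadline_tight):
        return {"action": "drop_new_task"}
    # Rule 3: uncertain task with a loose deadline.
    if state.execution_uncertainty == Rating.HIGH and not state.deadline_tight:
        return {"action": "schedule", "slack": "high", "effort": "medium"}
    # Fallback is hypothetical; the slides do not say what happens otherwise.
    return {"action": "no_rule_applies"}
```

For example, `decide(MetaState(Rating.LOW, Rating.HIGH, True, Rating.LOW))` selects the first rule and drops the current schedule in favor of the new task.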
10 Related Work
- Monitoring Progress of Anytime Algorithms (Hansen &
Zilberstein)
  - Uses dynamic programming for computation of a
  non-myopic stopping rule
- Predictability versus Responsiveness (Durfee &
Lesser)
  - Controls the amount of coordination using a
  user-specified buffer
- Meta-level Control of Coordination Protocols
(Kuwabara)
  - Detects and handles exceptions by switching
  between protocols
  - Does not account for overhead of reasoning process
11 Evaluation
- Compare system using hand-generated meta-level
control (MLC) heuristics to:
  - Naïve multi-agent system with no explicit MLC
  - Deterministic choice MLC
  - Random choice MLC
  - MLC with knowledge of environment characteristics,
  including arrival model
- Environments are characterized by the following
parameters:
  - Type of tasks: Simple (S), Complex (C),
  Combination (A)
  - Frequency of arrivals: High (H), Medium (M), Low
  (L)
  - Deadline tightness: High (H), Medium (M), Low
  (L)
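The three parameters above span a small grid of environment types. As a rough illustration, the grid can be enumerated as below. The three-letter code format (task type, then arrival frequency, then deadline tightness) is an assumed convention, and the slides do not state that every combination was evaluated.

```python
from itertools import product

TASK_TYPES = {"S": "Simple", "C": "Complex", "A": "Combination"}
LEVELS = {"H": "High", "M": "Medium", "L": "Low"}


def environment_codes() -> list:
    """All codes of the form <task type><arrival frequency><deadline tightness>."""
    return ["".join(combo) for combo in product(TASK_TYPES, LEVELS, LEVELS)]


def describe(code: str) -> str:
    """Expand a three-letter environment code into a readable description."""
    task_type, frequency, tightness = code
    return (f"{TASK_TYPES[task_type]} tasks, "
            f"{LEVELS[frequency]} arrival frequency, "
            f"{LEVELS[tightness]} deadline tightness")
```

With three values per parameter this yields 27 environment types; `describe("AMM")`, for instance, expands to combination tasks with medium arrival frequency and medium deadline tightness.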
12 An Example
13 Evaluation, Continued
14 Evaluation, Continued
15 Contributions
- Meta-level control in a complex environment
- Designed agent architecture that reasons about
overhead at all levels of the decision process
- Parametric control algorithm which reasons about
effort and slack
- Identified state features for control using
reinforcement learning
16 Future Work
- Implement Reinforcement-Learning based control
algorithm
  - Function approximation (Sarsa(λ), linear
  tile-coding)
  - MDP states will be abstractions of actual system
  state
- Study effectiveness of RL algorithm on complex
domain
- Compare performance of heuristic approach to RL
approach
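For orientation, the planned learning rule can be sketched as textbook Sarsa(λ) with linear function approximation over binary features, such as those a tile coder produces. This is the generic algorithm, not the authors' implementation. The tile coder itself is omitted: `active` stands for the indices of the tiles that are on for the abstract state, and the class name, parameter defaults, and use of replacing traces are assumptions.

```python
class LinearSarsaLambda:
    """Sarsa(lambda) with one weight per (action, binary feature) pair."""

    def __init__(self, n_features, n_actions, alpha=0.1, gamma=0.95, lam=0.9):
        self.w = [[0.0] * n_features for _ in range(n_actions)]  # weights
        self.z = [[0.0] * n_features for _ in range(n_actions)]  # eligibility traces
        self.alpha, self.gamma, self.lam = alpha, gamma, lam

    def q(self, active, action):
        """Q(s, a) is the sum of the weights of the active features."""
        return sum(self.w[action][i] for i in active)

    def start_episode(self):
        """Clear all eligibility traces."""
        for row in self.z:
            for i in range(len(row)):
                row[i] = 0.0

    def update(self, active, action, reward, next_active=None, next_action=None):
        """One learning step; leave next_active as None at a terminal state."""
        delta = reward - self.q(active, action)
        if next_active is not None:
            delta += self.gamma * self.q(next_active, next_action)
        # Replacing traces: features of the current state-action pair are set to 1.
        for i in active:
            self.z[action][i] = 1.0
        decay = self.gamma * self.lam
        for a, row in enumerate(self.z):
            for i, trace in enumerate(row):
                self.w[a][i] += self.alpha * delta * trace
                row[i] = trace * decay  # decay after the weight update
        return delta
```

In practice the step size is usually divided by the number of tilings so that a single update does not overshoot the target.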
17 Research Questions
- What are the major obstacles to efficient
meta-level control?
- How can costs be accurately included at all
levels of reasoning?
- How to deal with the huge, complex state space?
- Is reinforcement learning a feasible approach to
learn good meta-level control policies?