Title: Incremental Negotiation and Coalition Formation for Resource-bounded Reasoners
1 Incremental Negotiation and Coalition Formation for Resource-bounded Reasoners
Charlie Ortiz (PI), Eric Hsu, Marie des Jardins, Regis Vincent (SRI International)
Barbara Grosz, Tim Rauenbusch (Harvard University)
Sarit Kraus and Osher Yadgar (Bar-Ilan University)
2 Accomplishments since last meeting
- New algorithm, experiments, analysis: "An anytime algorithm for distributed task allocation in combinatorial environments," Tim Rauenbusch.
- New algorithms, architecture, demo for load balancing and team organization in very large scale systems (hundreds of nodes)
- New tracking algorithm: a single mobile sensor can track a target
- Logic for analysis of emergent properties: "Provable emergent properties of agent societies," Ortiz.
- "Incremental negotiation and coalition formation for resource bounded reasoners: Preliminary report," SRI team.
- Improved tracking on 16-node RadSim
- Symposium program committee; Sarit Kraus invited talk
3 Accomplishment report
4 Incremental allocation and coalition formation
[Figure labels: Node that detects target initiates task auction; Perimeter agents search; Projected cone; (I) Initial coalition; (II) Anticipate coalition; (III) Adapt/refine coalition]
Auction-like mechanisms are quick, decentralized methods of task and resource allocation: agents need not send all their information to a central point for an allocation decision.
5 Time-bounded architecture: incremental solution refinement
[Figure labels: New Tasks; Existing allocations; Plan context; State; Solution refinement/adaptation; Projecting Solution Forward; Initial Seed Solution; Coalitions; Future tasks; Solution refinements; Low-cost initial assignments; Dynamic Team Structuring (pre-processing/online); Time/complexity]
6 Commitment Networks
- Hybrid negotiation mechanism
- Contract nets
- Combinatorial auctions (good area of collaboration with complexity folks)
- Flexible task announcement
- Multi-stage interactions for allocation refinement
- Mediation methods
- SharedPlans
- Richer set of commitment/contract types for fault tolerance and communication savings
- Anytime implementation with methods for handling task interaction dynamics
7 Commitment protocol
[Figure labels: Task ST1, ST2, ST3; Task (L1,T1), (L2,T2), (L3,T3), ...; nodes N1 N2 N3 N4 N5 N6 N7 N8 N9 N10 ...]
Stages along the time axis: Task/bidders list computation, Announce, Compute bid, Bidding, Winner determination, Award
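The announce, bid, winner determination, and award stages can be sketched as a single auction round. This is a minimal illustration, not the project's implementation: the `Node` class, the distance-based bid, and the node positions are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Node:
    """Hypothetical sensor node; name and position are illustrative only."""
    name: str
    x: float
    y: float

    def compute_bid(self, task):
        # Assumed bid: distance from the node to the task location
        # (lower is better). The slides do not specify the bid function.
        (tx, ty), _time = task
        return hypot(self.x - tx, self.y - ty)


def run_auction(task, bidders):
    """One round: announce the task, collect bids, pick and award a winner."""
    # Announce + bidding: every node on the bidders list returns a bid.
    bids = {node.name: node.compute_bid(task) for node in bidders}
    # Winner determination: the lowest bid wins.
    winner = min(bids, key=bids.get)
    # Award: report the winner together with all bids received.
    return winner, bids


# A task is a (location, time) pair, mirroring the (L1,T1) notation.
task = ((2.0, 1.0), 0)
nodes = [Node("N1", 0.0, 0.0), Node("N2", 2.0, 2.0), Node("N3", 5.0, 1.0)]
winner, bids = run_auction(task, nodes)
print(winner, bids)
```

Restricting `bidders` to a subset of nodes corresponds to the task/bidders list computation stage: only nodes on the list are asked to bid.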
8 Task allocation announcement methods
[Figure label: Object detected here]
- Standard auction: (L1,T1) -> set of nodes
- Combinatorial auction: (L1,T1), (L2,T2), ... -> set of nodes
- Incremental mediation: (L1,T1) -> Ni, (L2,T2) -> Nj, ...
Tasks announced by way of constraints -> scalability
9 Summary of methods for controlling complexity
- Anytime task auction with damping functions
- Restrict local and inter-agent search (self-scheduling)
- Time-bounded, multi-stage combinatorial task auctions via performance profiling (soon enough/good enough)
- Resource allocation anticipation step
- Communication savings via default commitments
10 Anytime allocation: task proximity
Two tasks/track segments: S, T
Three coalitions: {1,2}, {1,3}, {4,2}
[Figure: nodes 1 to 4 with candidate assignments of tasks S and T]
11 Summary of methods for controlling dynamics
- Issues not in standard auctions and contract nets
  - Tasks change over time (use flexible commitments, damping functions)
  - Tasks cannot be announced to every agent
[Figure labels: Auctioneer; Team 1; Team 2]
Nodes are aware of team members; re-allocation occurs as regions overlap
Q,0 vs Q/2,Q/2
12 Unfavorable dynamic behavior: multiple tasks with priorities
[Figure labels: S, T, U, R]
Solution: willingness to de-commit decreases as a function of distance
13 16-node RadSim Experiments
14 16-node configuration results
15 Average RMS Errors for Various Sensor Configurations
16 Combinatorial resource allocation: dynamics/complexity of task interaction
- The problem of bid generation in combinatorial auctions is non-trivial (tasks can interact)
- Definition: A relevant bid is one that cannot be inferred as an additive combination of an agent's other bids.
- In many domains, agents do not have additive cost functions (e.g., sensor warm-up time)
- Associate an interaction probability ip with a domain: the extent to which tasks interact
17 Relevant bid generation
- To achieve a minimum cost allocation, it is important to generate all relevant bids.
- Proposition: If not all relevant bids are generated, then the task allocation computed by combinatorial auction winner determination may be suboptimal.
- Proposition: If ip = 1, the number of relevant bids for each agent is 2^m - 1, where m is the number of tasks.
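The count of relevant bids can be checked on a toy cost function. The sketch below is an illustration under stated assumptions: the cost model (one unit per task plus a one-time sensor warm-up) is hypothetical, and "inferable as an additive combination" is simplified to "equal to the sum of the bids on some split into two disjoint parts".

```python
from itertools import combinations


def bundles(tasks):
    """All non-empty subsets of tasks, as frozensets."""
    return [frozenset(c)
            for r in range(1, len(tasks) + 1)
            for c in combinations(tasks, r)]


def is_relevant(bundle, cost):
    """A bid is relevant unless it equals the sum of bids on two disjoint parts."""
    items = sorted(bundle)
    for r in range(1, len(items)):
        for part in combinations(items, r):
            part = frozenset(part)
            if cost(bundle) == cost(part) + cost(bundle - part):
                return False  # inferable from two smaller bids
    return True  # single-task bids always end up here


def make_cost(warmup):
    # Hypothetical cost: one unit per task plus a one-time warm-up charge.
    return lambda bundle: warmup + len(bundle)


tasks = ["t1", "t2", "t3", "t4"]
m = len(tasks)
# With a warm-up cost every pair of tasks interacts (the ip = 1 case).
interacting = [b for b in bundles(tasks) if is_relevant(b, make_cost(warmup=5))]
# With no warm-up the cost is additive, so only single-task bids are relevant.
additive = [b for b in bundles(tasks) if is_relevant(b, make_cost(warmup=0))]
print(len(interacting), len(additive))
```

With fully interacting tasks every non-empty bundle needs its own bid, which is where the exponential growth in m comes from; with additive costs the bids on single tasks suffice.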
18 Incremental task allocation improvement algorithm
- Algorithm
  - 1. Generate an initial allocation (e.g., sequential auction)
  - 2. Initialize CG, an unconnected graph with m vertices, each corresponding to a task.
  - 3. Iteratively improve the allocation
    - add an edge that connects two unconnected subgraphs
    - optimally allocate the tasks that correspond to vertices in the newly connected subgraph
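The three steps can be sketched in a few lines. This is a centralized toy version under assumptions, not the distributed algorithm from the talk: the cost table, the warm-up charges, the brute-force reallocation, and the arbitrary choice of which subgraphs to connect are all introduced for illustration.

```python
from itertools import product


def total_cost(allocation, cost):
    """Sum each agent's cost for the bundle of tasks it currently holds."""
    held = {}
    for task, agent in allocation.items():
        held.setdefault(agent, set()).add(task)
    return sum(cost(agent, frozenset(b)) for agent, b in held.items())


def sequential_auction(tasks, agents, cost):
    """Step 1: award tasks one at a time to the cheapest agent."""
    allocation = {}
    for task in tasks:
        allocation[task] = min(
            agents, key=lambda a: total_cost({**allocation, task: a}, cost))
    return allocation


def reallocate(allocation, component, agents, cost):
    """Brute-force the best assignment of one component, other tasks fixed."""
    best = dict(allocation)
    comp = sorted(component)
    for choice in product(agents, repeat=len(comp)):
        cand = {**allocation, **dict(zip(comp, choice))}
        if total_cost(cand, cost) < total_cost(best, cost):
            best = cand
    return best


def improve(tasks, agents, cost):
    """Yield a usable allocation after every step (the anytime property)."""
    allocation = sequential_auction(tasks, agents, cost)
    components = [{t} for t in tasks]          # step 2: no edges yet
    yield dict(allocation)
    while len(components) > 1:                 # step 3: connect two subgraphs
        merged = components.pop() | components.pop()
        components.append(merged)
        allocation = reallocate(allocation, merged, agents, cost)
        yield dict(allocation)


# Hypothetical costs: per-task cost plus a one-time warm-up per agent.
BASE = {("a", "t1"): 4, ("a", "t2"): 4, ("a", "t3"): 4,
        ("b", "t1"): 3, ("b", "t2"): 3, ("b", "t3"): 1}
WARMUP = {"a": 2, "b": 4}


def cost(agent, bundle):
    return WARMUP[agent] + sum(BASE[(agent, t)] for t in bundle)


TASKS, AGENTS = ["t1", "t2", "t3"], ["a", "b"]
costs = [total_cost(alloc, cost) for alloc in improve(TASKS, AGENTS, cost)]
print(costs)
```

Because each reallocation starts from the current allocation, the total cost never increases, and the final step, which covers all tasks in one connected subgraph, is an exhaustive search over the whole problem.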
19 Complexity and convergence
- Proposition: The algorithm is anytime and guaranteed to find an optimal task allocation.
- Proposition: Assuming an iteration of the improvement phase that allocates i tasks takes O(n 2^i) time, the running time of the improvement phase is O(n 2^m).
20 ITAI experimental results
21 RadSim experiments
- Test uninformed mediation
- Assumptions
  - Each task is a 2-second tracking slice
  - Assigned to a single tracker agent (simplifying assumption)
  - An Auction/Mediation mechanism simultaneously assigns m of those slices
22 Assumptions
- Each agent's utility for a bundle of m tasks to track an object at locations xy_i (i = 1..m) is given by
  - sum_{i=1..m} 100 - dist(xy_i) - 10 * time_on(xy_1, xy_2, ..., xy_i)
  - where time_on(bundle) is the number of seconds that an agent is required to activate its sensor (introduces task interaction)
- An agent's utility is local information that must be requested and passed via RadSim messages
23 Combinatorial Bidding Time
Maximum of 2550 seconds to communicate bids for only 16 seconds of tracking
- Assumptions
  - 4 agents
  - 10 simulation seconds to evaluate and communicate one bid (approximate time obtained in simulation)
  - MRB: maximum number of relevant bids, assuming agents (1) are concerned with the assignment of all tasks, or (2) are concerned only with the assignment of tasks to which they are assigned
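The 2550-second figure is consistent with the following arithmetic. The derivation is an assumption, since the slide does not show it: it takes each task to be a 2-second tracking slice, counts 2^m - 1 relevant bids per agent, and treats the figure as a per-agent bound with agents bidding in parallel.

```python
SLICE_SECONDS = 2        # each task is a 2-second tracking slice
TRACKING_SECONDS = 16    # tracking window quoted on the slide
SECONDS_PER_BID = 10     # simulation seconds to evaluate and communicate one bid

m = TRACKING_SECONDS // SLICE_SECONDS      # number of slices in the bundle
max_relevant_bids = 2 ** m - 1             # every non-empty bundle of slices
bidding_seconds = max_relevant_bids * SECONDS_PER_BID
print(m, max_relevant_bids, bidding_seconds)
```

Under these assumptions, 8 slices give 255 bids per agent and 2550 simulation seconds of bidding.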
24 Mediation Provides Early, Rapid Increase in Total Agent Utility
- Assumptions
  - 4 agents using the RadSim system with Uninformed Mediation (51 sequential mediation steps take approximately 450 simulation seconds; parallel mediation steps will provide increased speed).
Results are averaged over 3 runs.
25 Conclusion
- Combinatorial auction bidding may be too costly in domains with high communication costs (e.g., RadSim), even for moderately sized bundles
- Mediation allows agents to reach approximate solutions quickly
- Ongoing work: develop bid strategies for combinatorial auction and compare
26 Scaling up to thousands of nodes
- Group formation and
- Dynamic re-organization
27 Distributed Dispatcher Model (DDM)
- Environment: a large number of targets; a large number of mobile Dopplers.
- Dynamic formation of coalitions
- Dynamic allocation of Dopplers to areas
- No need for close coordination/synchronization between agents; each acts autonomously: reduces messages, saves time, increases security.
- Fault tolerant
28 General Structure
- Hierarchical structure of coalitions based on geographical areas; the base level consists of the Dopplers.
- Each level controls the level below it, processes the information obtained from the level below it and from its coalition leader above it, and reports its estimation to its coalition leader.
29 Hierarchy Example
Goals: 1) generate a map of the targets as a function of time; 2) achieve good coverage of the area by the Dopplers
A sampler agent is attached to each Doppler
30 Samplers
- Every Doppler has an associated sampler.
- The sampler supplies sets of target states (t, x, y, vx, vy) according to the Doppler's measurements and directs the Doppler where to go and how fast, according to the instructions of its Sampler Coalition Leader.
- NEW ALGORITHM
  - take 4 consecutive measurements
  - determine a set of at most 4 possible states of a sensed target.
- No need to obtain information from other samplers.
31 Coalition Leader
- Controls a certain area. Each higher level coalition leader controls a larger area.
- Task: form a map of its area.
- Zone coalition leader: controls other coalition leaders; balances the number of Dopplers in its area.
- Sampler coalition leader: controls the behavior of a set of samplers (on top of Dopplers) in a given area. Directs its Dopplers' movement to cover its area.
32 Forming the Targets Map
- Obtain estimations about targets in its area from lower level coalition leaders.
- Obtain estimations of targets coming into its area from its higher level coalition leader.
- Update the map of its area. The algorithm is based on a graph of timed events and physics-based logical rules over the timed events.
- Send the map to its leader.
33 Coalition Leader Algorithm to Move Dopplers
- Every dT seconds, ask the superior coalition leader for a prediction of incoming targets to the controlled area at t + dT.
- Ask the superior coalition leader for instructions on how many Dopplers to send to neighboring zones.
- Form an estimation of targets and Dopplers at t + dT according to the current map.
- Decide on Doppler zone changes and send them to the subordinate coalition leaders: balance the ratio of the number of sensors over all controlled zones with the ratio of the targets, and satisfy the leader's instructions.
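The balancing step can be illustrated with a proportional split. This is a sketch under assumptions, not the DDM implementation: the largest-remainder rounding rule, the zone names, and the target counts are hypothetical, and the leader's instructions about sending Dopplers to neighboring zones are left out.

```python
def balance_dopplers(total_dopplers, predicted_targets):
    """Split Dopplers across zones in proportion to predicted targets."""
    total_targets = sum(predicted_targets.values())
    if total_targets == 0:
        # No targets predicted: fall back to an even spread.
        predicted_targets = {zone: 1 for zone in predicted_targets}
        total_targets = len(predicted_targets)
    exact = {zone: total_dopplers * n / total_targets
             for zone, n in predicted_targets.items()}
    assigned = {zone: int(share) for zone, share in exact.items()}
    leftover = total_dopplers - sum(assigned.values())
    # Hand the remaining Dopplers to the zones with the largest fractions.
    by_fraction = sorted(exact, key=lambda z: exact[z] - assigned[z],
                         reverse=True)
    for zone in by_fraction[:leftover]:
        assigned[zone] += 1
    return assigned


def zone_changes(current, desired):
    """Positive values mean Dopplers to receive, negative to give up."""
    return {zone: desired[zone] - current.get(zone, 0) for zone in desired}


# Hypothetical estimates for t + dT.
predicted = {"zone_a": 6, "zone_b": 3, "zone_c": 1}
current = {"zone_a": 4, "zone_b": 4, "zone_c": 4}
desired = balance_dopplers(sum(current.values()), predicted)
changes = zone_changes(current, desired)
print(desired, changes)
```

The `changes` dictionary is what a leader in this sketch would send down to its subordinate coalition leaders; the changes always sum to zero, so no Dopplers are created or lost.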
34 Fault Tolerance
- Each coalition leader knows the leaders in its zone and the ones in a lower level.
- When one of the leaders in its zone stops functioning, the coalition leader either divides the zone of the failed leader between the others or chooses one of the lower level leaders as a leader.
- A coalition leader below the top level knows its neighbors. If the top coalition leader stops functioning, the second level leaders choose, in a distributed manner, one of themselves as leader and divide its zone between themselves.
35 Provable emergent properties
- Develop a logical language with which to specify and prove emergent system properties
- Predictability, discrete event systems
- Modal temporal logic in which knowledge is externally ascribed to agents (Halpern et al)
- Possible worlds are possible system executions
- Add notion of closest possible execution through counterfactual semantics: capture regional dependencies
36 Formalizing the notion of an emergent MAS behavior
- Basic notions: system trajectory, stability, ...
- Axiom (B emerges from protocol P1 relative to protocol P2 and background condition f)
  - emerges(P1,B,P2,f) ?
    - cost(P1) > cost(P2)
    - ? (CKf ? CKf) ? occurs P1 generates occurs B
    - ? agents(P2,G) ? CKGf ? (occurs P2 generates occurs B)
- P1 requires awareness/acquisition of knowledge of some environmental condition, f
- p generates q ? (def)
  - (p > q) ? (p > q) ? time(p) time(q)
37 DRAPs
- Definition: A distributed resource allocation problem (DRAP), D, is a tuple <R,T,U> s.t.,
  - R: set of resources/agents
  - T ? 2^E, where E is a set of task elements
  - U: R x N x T -> ?
- Definition: A solution, S(d), to D is a mapping T x N -> ? (with possible constraints on group cost/quality also).
38 Example: ANTS
39 Beneficial emergent behavior
- Goal: reverse engineer to computationally exploit beneficial emergent properties; prove stability; regional dependencies.
- WD?1 on(ni,j), ?i?1,..,9, j?1,2,3
- ?1 B(n3, ?1 in_region(T1,(n3,3)),
- axioms describing protocol
  - B(n, ?j in_region(j,l)) ? send(n,j, ?j in_region(j,l)) ...
- Proposition: Behavior ? (T1 is tracked) emerges from handoff protocol P1 with respect to dispatch protocol P2.
40 Meeting the requirements
41 Accomplishments since last meeting
- New algorithm, experiments, analysis: "An anytime algorithm for distributed task allocation in combinatorial environments," Tim Rauenbusch.
- New algorithms, architecture, demo for load balancing and team organization in very large scale systems (hundreds of nodes)
- New tracking algorithm: a single mobile sensor can track a target
- Logic for analysis of emergent properties: "Provable emergent properties of agent societies," Ortiz.
- "Incremental negotiation and coalition formation for resource bounded reasoners: Preliminary report," SRI team.
- Improved tracking on 16-node RadSim
- AAAI symposium program committee participation
42 Remaining work
- Q3, Q4 of 2001
  - Complete systematic experimentation (tasks, deadlines, etc.)
  - Statistical acquisition of performance profiles (self-scheduling)
  - Demonstration of DDM scalability to hundreds of resources/tasks
- 2002
  - Self-stabilizing algorithms for fault tolerance
  - Convergence analysis
  - Catalog of predictable properties of task-based auctions in dynamic domains (task interaction and time-bounds)