Incremental Negotiation and Coalition Formation for Resource-bounded Reasoners

1
Incremental Negotiation and Coalition Formation
for Resource-bounded Reasoners
Charlie Ortiz (PI), Eric Hsu, Marie des Jardins, Regis Vincent (SRI International)
Barbara Grosz, Tim Rauenbusch (Harvard University)
Sarit Kraus and Osher Yadgar (Bar-Ilan University)
2
Accomplishments since last meeting

New algorithm, experiments, analysis: "An anytime algorithm for distributed task allocation in combinatorial environments," Tim Rauenbusch.
New algorithms, architecture, demo for load balancing and team organization in very large scale systems (hundreds of nodes).
New tracking algorithm: a single mobile sensor can track a target.
Logic for analysis of emergent properties: "Provable emergent properties of agent societies," Ortiz.
"Incremental negotiation and coalition formation for resource-bounded reasoners: Preliminary report," SRI team.
Improved tracking on 16-node RadSim.
Symposium program committee; Sarit Kraus invited talk.

3
Accomplishment report
4
Incremental allocation and coalition formation
Node that detects target initiates task auction
Perimeter agents search
(I) Initial coalition
Projected cone
(II) Anticipate coalition
(III) Adapt/refine coalition
Auction-like mechanisms are quick, decentralized methods of task and resource allocation: agents need not send all their information to a central point for a decision on allocation.
5
Time-bounded architecture: incremental solution refinement
New Tasks
Existing allocations
Plan context
Existing allocations
State
Solution refinement/ adaptation
Projecting Solution Forward
Initial Seed Solution
Coalitions
Future tasks
Solution refinements
Low-cost initial assignments
Dynamic Team Structuring (pre-processing/online)
Time/complexity
6
Commitment Networks

Hybrid negotiation mechanism
Contract nets
Combinatorial auctions (good area of collaboration with complexity folks)
Flexible task announcement
Multi-stage interactions for allocation refinement
Mediation methods
SharedPlans
Richer set of commitment/contract types for fault tolerance and communication savings
Anytime implementation with methods for handling task interaction dynamics

7
Commitment protocol
Task = {ST1, ST2, ST3}
Task = {(L1,T1), (L2,T2), (L3,T3), ...}
N1 N2 N3 N4 N5 N6 N7 N8 N9 N10 ...
Task/bidders list computation
Announce
Compute bid
Bidding
Winner determination
Award
time
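The stages listed on this slide can be sketched as a single auction round. This is only a minimal illustration, not the project's implementation: the `Node` class, the 1-D positions, the distance-based bid, and the `max_range` filter used to build the bidders list are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    position: float  # hypothetical 1-D location of the sensor node

    def compute_bid(self, task_location: float) -> float:
        # Hypothetical cost model: a node bids its distance to the task.
        return abs(self.position - task_location)


def run_auction(task_location, nodes, max_range):
    # Task/bidders list computation: only nodes near the task are addressed.
    bidders = [n for n in nodes if abs(n.position - task_location) <= max_range]
    # Announce, compute bid, bidding: each addressed node returns one bid.
    bids = {n.name: n.compute_bid(task_location) for n in bidders}
    if not bids:
        return None, bids
    # Winner determination: the lowest-cost bid wins.
    winner = min(bids, key=bids.get)
    # Award: the caller would now send the award message to `winner`.
    return winner, bids


nodes = [Node(f"N{i}", float(i)) for i in range(1, 11)]
winner, bids = run_auction(task_location=3.2, nodes=nodes, max_range=2.0)
print(winner, sorted(bids))
```

Restricting the announcement to a computed bidders list is what keeps the message count independent of the total number of nodes in this sketch.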
8
Task allocation announcement methods
Object detected here
- Standard auction: (L1,T1) -> set of nodes
- Combinatorial auction: (L1,T1), (L2,T2), ... -> set of nodes
- Incremental mediation: (L1,T1) -> Ni, (L2,T2) -> Nj, ...

Tasks announced by way of constraints -> scalability
9
Summary of methods for controlling complexity

Anytime task auction with damping functions
Restrict local and inter-agent search (self-scheduling)
Time-bounded, multi-stage combinatorial task auctions via performance profiling (soon enough/good enough)
Resource allocation anticipation step
Communication savings via default commitments

10
Anytime allocation: task proximity
Two tasks/track segments: S, T
Three coalitions: {1,2}, {1,3}, {4,2}
[Figure: nodes 1-4 and the candidate assignments of tasks S and T to the three coalitions]
11
Summary of methods for controlling dynamics

Issues not addressed by standard auctions and contract nets:
Tasks change over time (use flexible commitments, damping functions)
Tasks cannot be announced to every agent

Auctioneer
Team 1
Team 2
Nodes are aware of their team members; re-allocation occurs as regions overlap
(Q, 0) vs. (Q/2, Q/2)
12
Unfavorable dynamic behavior: multiple tasks with priorities
S
T
U
R
Solution: willingness to de-commit decreases as a function of distance
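One way to picture such a damping function is sketched below. The slide does not say which distance is meant or what shape the function has, so everything here is an assumption: `distance` is taken to be a hypothetical separation between the node and the newly announced task, the decay is exponential, and the priority comparison in `should_decommit` is invented for illustration.

```python
import math


def decommit_willingness(distance, scale=1.0):
    """Hypothetical damping function: 1.0 at distance 0, decaying toward 0."""
    return math.exp(-distance / scale)


def should_decommit(current_priority, new_priority, distance, scale=1.0):
    # Abandon the current task only if the new task's priority, discounted
    # by the damping function, still exceeds the current task's priority.
    damped = new_priority * decommit_willingness(distance, scale)
    return damped > current_priority


print(should_decommit(1.0, 2.0, distance=0.1))  # nearby higher-priority task
print(should_decommit(1.0, 2.0, distance=5.0))  # same task, far away
```

With this kind of rule a node does not oscillate between tasks every time a slightly higher-priority task appears far away.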
13
16-node RadSim Experiments
14
16 node configuration results
15
Average RMS Errors for Various Sensor
Configurations
16
Combinatorial resource allocation
dynamics/complexity of task interaction

The problem of bid generation in combinatorial auctions is non-trivial (tasks can interact).
Definition: A relevant bid is one that cannot be inferred as an additive combination of an agent's other bids.
In many domains, agents do not have additive cost functions (e.g., sensor warm-up time).
Associate an interaction probability, ip, with a domain: the extent to which tasks interact.
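A small sketch of the relevance test, using sensor warm-up as the source of task interaction. The cost model is hypothetical (one unit per time slot plus one warm-up charge per run of consecutive slots), and the test treats a bid as inferable when some two-way split of the bundle prices additively, which is one reading of the definition rather than the authors' exact formulation.

```python
from itertools import combinations

WARM_UP = 5.0  # hypothetical warm-up cost, paid once per run of consecutive slots


def cost(bundle):
    """Hypothetical cost: 1 per slot plus one warm-up per contiguous run."""
    slots = sorted(bundle)
    runs = sum(1 for i, s in enumerate(slots) if i == 0 or s != slots[i - 1] + 1)
    return len(slots) * 1.0 + runs * WARM_UP


def is_relevant(bundle):
    """False if some split into two non-empty parts prices additively."""
    items = sorted(bundle)
    for r in range(1, len(items) // 2 + 1):
        for part in combinations(items, r):
            rest = frozenset(items) - frozenset(part)
            if cost(part) + cost(rest) == cost(items):
                return False
    return True


def relevant_bids(tasks):
    tasks = sorted(tasks)
    bundles = [c for r in range(1, len(tasks) + 1) for c in combinations(tasks, r)]
    return [b for b in bundles if is_relevant(b)]


print(relevant_bids({1, 2, 3}))
```

Here slots 1 and 3 are not adjacent, so the bundle (1, 3) costs exactly the sum of its parts and need not be bid on; adjacent slots share a warm-up, so those bundles are relevant.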

17
Relevant bid generation

To achieve a minimum-cost allocation, it is important to generate all relevant bids.
Proposition: If not all relevant bids are generated, then the task allocation computed by combinatorial auction winner determination may be suboptimal.
Proposition: If ip = 1, the number of relevant bids for each agent is $2^m - 1$, where m is the number of tasks.

18
Incremental task allocation improvement algorithm

Algorithm
1. Generate an initial allocation (e.g., by sequential auction).
2. Initialize CG, an unconnected graph with m vertices, each corresponding to a task.
3. Iteratively improve the allocation:
   - add an edge that connects two unconnected subgraphs;
   - optimally allocate the tasks that correspond to the newly connected subgraph.
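The three steps can be sketched as follows. This is an illustrative reading, not the authors' code: the choice of which two subgraphs to connect (here simply the first two), the brute-force reallocation, and the warm-up style cost function for agents "A" and "B" are all assumptions made for the example.

```python
from itertools import product


def total_cost(alloc, agents, cost):
    return sum(cost(a, frozenset(t for t, o in alloc.items() if o == a))
               for a in agents)


def sequential_auction(tasks, agents, cost):
    # Step 1: award tasks one at a time to the cheapest agent.
    alloc = {}
    for t in tasks:
        alloc[t] = min(agents, key=lambda a: total_cost({**alloc, t: a}, agents, cost))
    return alloc


def reallocate(alloc, component, agents, cost):
    # Brute force over all assignments of the component's tasks; tasks
    # outside the component keep their current owners.
    comp = sorted(component)
    best = dict(alloc)
    for owners in product(agents, repeat=len(comp)):
        cand = {**alloc, **dict(zip(comp, owners))}
        if total_cost(cand, agents, cost) < total_cost(best, agents, cost):
            best = cand
    return best


def itai(tasks, agents, cost):
    alloc = sequential_auction(tasks, agents, cost)
    components = [{t} for t in tasks]      # Step 2: m isolated vertices
    yield dict(alloc)                      # anytime: a solution is always available
    while len(components) > 1:             # Step 3: connect two subgraphs
        merged = components.pop(0) | components.pop(0)
        components.insert(0, merged)
        alloc = reallocate(alloc, merged, agents, cost)
        yield dict(alloc)


def cost(agent, bundle):
    # Hypothetical non-additive costs: fixed warm-up plus a per-task charge.
    warm_up, per_task = {"A": (4.0, 1.0), "B": (1.0, 2.5)}[agent]
    return 0.0 if not bundle else warm_up + per_task * len(bundle)


agents, tasks = ["A", "B"], [1, 2, 3]
history = list(itai(tasks, agents, cost))
costs = [total_cost(a, agents, cost) for a in history]
print(costs)
```

In this example the sequential auction awards every task to B because B is cheaper for a single task; only once all three tasks are in one subgraph does the reallocation discover that A is cheaper for the whole bundle.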

19
Complexity and convergence

Proposition: The algorithm is anytime and guaranteed to find an optimal task allocation.
Proposition: Assuming an iteration of the improvement phase that allocates i tasks takes $O(n^i)$ time, the running time of the improvement phase is $O(n^m)$.

20
ITAI experimental results
21
RadSim experiments

Test: uninformed mediation
Assumptions:
Each task is a 2-second tracking slice, assigned to a single tracker agent (simplifying assumption).
An auction/mediation mechanism simultaneously assigns m of those slices.

22
Assumptions

Each agent's utility for a bundle of m tasks to track an object at locations xy_i (i = 1..m) is given by
$\sum_{i=1}^{m} 100 - \mathrm{dist}(xy_i) - 10 \cdot \mathrm{time\_on}(xy_1, xy_2, \ldots, xy_i)$
where time_on(bundle) is the number of seconds that an agent is required to activate its sensor (this introduces task interaction).
An agent's utility is local information that must be requested and passed via RadSim messages.

23
Combinatorial Bidding Time
Maximum of 2550 seconds to communicate bids for only 16 seconds of tracking

Assumptions:
4 agents
10 simulation seconds to evaluate and communicate one bid (approximate time obtained in simulation)
MRB: maximum number of relevant bids, assuming agents (1) are concerned with the assignment of all tasks, or (2) are concerned only with the assignment of tasks to which they are assigned
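The 2550-second figure is consistent with the numbers given on the slides, as the short calculation below shows. It assumes that the 16 seconds of tracking are split into 2-second slices (so m = 8 tasks), that each agent may have to send the full set of relevant bids for ip = 1, and that the time is counted per agent; these assumptions are an inference, not something the slides state.

```python
SLICE_SECONDS = 2     # from the slides: each task is a 2-second tracking slice
SECONDS_PER_BID = 10  # from the slides: time to evaluate and communicate one bid


def max_relevant_bids(m):
    # Every non-empty bundle of m tasks is a relevant bid when ip = 1.
    return 2 ** m - 1


def bidding_time(tracking_seconds):
    m = tracking_seconds // SLICE_SECONDS
    return max_relevant_bids(m) * SECONDS_PER_BID


print(bidding_time(16))  # 8 tasks, 255 bids, 10 seconds each
```

Each additional 2-second slice roughly doubles the bidding time under these assumptions.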

24
Mediation Provides Early, Rapid Increase in Total
Agent Utility

Assumptions:
4 agents using the RadSim system with uninformed mediation (51 sequential mediation steps take approximately 450 simulation seconds; parallel mediation steps will provide increased speed).
Results are averaged over 3 runs.

25
Conclusion

Combinatorial auction bidding may be too costly in domains with high communication costs (e.g., RadSim), even for moderately sized bundles.
Mediation allows agents to reach approximate solutions quickly.
Ongoing work: develop bid strategies for combinatorial auctions and compare.

26
Scaling up to thousands of nodes

Group formation and
Dynamic re-organization

27
Distributed Dispatcher Model (DDM)

Environment: a large number of targets and a large number of mobile Dopplers.
Dynamic formation of coalitions.
Dynamic allocation of Dopplers to areas.
No need for close coordination/synchronization between agents: each acts autonomously, which reduces messages, saves time, and increases security.
Fault tolerant.

28
General Structure

Hierarchical structure of coalitions based on geographical areas; the base level consists of the Dopplers.
Each level controls the level below it, processes the information obtained from the level below it and from its coalition leader above it, and reports its estimate to its coalition leader.

29
Hierarchy Example
Goals: 1) generate a map of the targets as a function of time; 2) achieve good coverage of the area by Dopplers.
A sampler agent is attached to each Doppler.
30
Samplers

Every Doppler has an associated sampler.
The sampler supplies sets of target states (t, x, y, vx, vy) according to the Doppler's measurements, and directs the Doppler where to go and how fast according to the instructions of its Sampler Coalition Leader.
New algorithm:
- take 4 consecutive measurements;
- determine a set of at most 4 possible states of a sensed target.
No need to obtain information from other samplers.

31
Coalition Leader

Controls a certain area. Each higher-level coalition leader controls a larger area.
Task: form a map of its area.
Zone coalition leader: controls other coalition leaders; balances the number of Dopplers in its area.
Sampler coalition leader: controls the behavior of a set of samplers (on top of Dopplers) in a given area. Directs its Dopplers' movement to cover its area.

32
Forming the Targets Map

Obtain estimates of targets in its area from lower-level coalition leaders.
Obtain estimates of targets coming into its area from its higher-level coalition leader.
Update the map of its area. The algorithm is based on a graph of timed events and on physics-based logical rules over those events.
Send the map to its leader.

33
Coalition Leader Algorithm to Move Dopplers

Every dT seconds, ask the superior coalition leader for a prediction of targets entering the controlled area at time t + dT.
Ask the superior coalition leader for instructions on how many Dopplers to send to neighboring zones.
Form an estimate of targets and Dopplers at time t + dT according to your current map.
Decide on Doppler zone changes and send them to your subordinate coalition leaders: balance the ratio of sensors across all controlled zones against the ratio of targets, and satisfy your leader's instructions.
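The balancing decision in the last step might look like the sketch below. The slides do not give the actual rule, so the proportional split with largest-remainder rounding, the zone names, and the helper `planned_moves` are all hypothetical; the superior leader's instructions about neighboring zones are left out.

```python
def balance_dopplers(total_dopplers, predicted_targets):
    """Split Dopplers across zones in proportion to predicted targets at t + dT."""
    zones = list(predicted_targets)
    weights = [predicted_targets[z] for z in zones]
    if sum(weights) == 0:
        weights = [1] * len(zones)  # no targets expected: spread evenly
    total = sum(weights)
    quotas = [total_dopplers * w / total for w in weights]
    counts = [int(q) for q in quotas]
    leftover = total_dopplers - sum(counts)
    # Hand the remaining Dopplers to the zones with the largest fractional part.
    order = sorted(range(len(zones)), key=lambda i: quotas[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return dict(zip(zones, counts))


def planned_moves(current, desired):
    # Positive: the zone should receive Dopplers; negative: it should release some.
    return {z: desired[z] - current.get(z, 0) for z in desired}


desired = balance_dopplers(8, {"NW": 5, "NE": 3, "SW": 2})
moves = planned_moves({"NW": 2, "NE": 3, "SW": 3}, desired)
print(desired, moves)
```

The resulting per-zone differences are what a leader would send down to its subordinate coalition leaders as zone-change instructions.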

34
Fault Tolerance

Each coalition leader knows the leaders in its zone and the ones at a lower level.
When one of the leaders in its zone stops functioning, the coalition leader either divides the zone of the failed leader among the others or chooses one of the lower-level leaders as a replacement leader.
Each coalition leader below the top level knows its neighbors. If the top coalition leader stops functioning, the second-level leaders choose one of themselves as leader in a distributed manner and divide its zone among themselves.
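The two recovery options for a failed leader can be sketched as below. The data layout (a dict from leader name to the list of area cells it controls), the round-robin split, and the choice of the first subordinate as replacement are hypothetical, and the sketch assumes at least one surviving peer when the zone is divided.

```python
def recover(zones, failed, subordinates, promote):
    """Return a new zone map after leader `failed` stops functioning."""
    zones = {leader: list(cells) for leader, cells in zones.items()}
    orphaned = zones.pop(failed)
    if promote and subordinates.get(failed):
        # Option 2: a lower-level leader takes over the whole zone.
        replacement = subordinates[failed][0]
        zones[replacement] = orphaned
    else:
        # Option 1: divide the failed leader's zone among the surviving peers.
        survivors = sorted(zones)
        for i, cell in enumerate(orphaned):
            zones[survivors[i % len(survivors)]].append(cell)
    return zones


zones = {"L1": ["a", "b"], "L2": ["c", "d"], "L3": ["e"]}
subordinates = {"L2": ["L2.1", "L2.2"]}
divided = recover(zones, "L2", subordinates, promote=False)
promoted = recover(zones, "L2", subordinates, promote=True)
print(divided, promoted)
```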

35
Provable emergent properties

Develop a logical language with which to specify and prove emergent system properties
Predictability, discrete event systems
Modal temporal logic in which knowledge is externally ascribed to agents (Halpern et al.)
Possible worlds are possible system executions
Add notion of closest possible execution through counterfactual semantics; capture regional dependencies

36
Formalizing the notion of an emergent MAS
behavior

Basic notions: system trajectory, stability, ...
Axiom (B emerges from protocol P1 relative to protocol P2 and background condition f):
emerges(P1,B,P2,f) ?
cost(P1) > cost(P2)
? (CKf ? CKf)?occurs P1 generates occurs B
? agents(P2,G) ? CKGf ? (occurs P2 generates
occurs B)
P1 requires awareness/acquisition of knowledge
of some environmental condition, f
p generates q ? (def)
(p > q) ? (p > q) ? time(p) time(q)

37
DRAPs

Definition: A distributed resource allocation problem (DRAP), D, is a tuple <R, T, U> such that
R: set of resources/agents
$T \subseteq 2^E$, where E is a set of task elements
U: R x N x T -> ?
Definition: A solution, S(d), to D is a mapping T x N -> ? (with possible constraints on group cost/quality also).

38
Example: ANTS
39
Beneficial emergent behavior

Goal: reverse engineer to computationally exploit beneficial emergent properties; prove stability and regional dependencies.
WD?1 on(ni,j), ?i?1,..,9,j ?1,2,3
?1 B(n3, ?1 in_region(T1,(n3,3)),
axioms describing protocol
B(n, ?j in_region(j,l)) ? send(n,j, ?j
in_region(j,l)) ...
Proposition: Behavior ? (T1 is tracked) emerges from handoff protocol P1 with respect to dispatch protocol P2.

40
Meeting the requirements
41
Accomplishments since last meeting

New algorithm, experiments, analysis: "An anytime algorithm for distributed task allocation in combinatorial environments," Tim Rauenbusch.
New algorithms, architecture, demo for load balancing and team organization in very large scale systems (hundreds of nodes).
New tracking algorithm: a single mobile sensor can track a target.
Logic for analysis of emergent properties: "Provable emergent properties of agent societies," Ortiz.
"Incremental negotiation and coalition formation for resource-bounded reasoners: Preliminary report," SRI team.
Improved tracking on 16-node RadSim.
AAAI symposium program committee participation.

42
Remaining work

Q3, Q4 of 2001:
Complete systematic experimentation (tasks, deadlines, etc.)
Statistical acquisition of performance profiles (self-scheduling)
Demonstration of DDM scalability to hundreds of resources/tasks
2002:
Self-stabilizing algorithms for fault tolerance
Convergence analysis
Catalog of predictable properties of task-based auctions in dynamic domains (task interaction and time bounds)
