Incremental Negotiation and Coalition Formation for Resource-bounded Reasoners


Title: Incremental Negotiation and Coalition Formation for Resource-bounded Reasoners


1
Incremental Negotiation and Coalition Formation
for Resource-bounded Reasoners
Charlie Ortiz (PI), Eric Hsu, Marie des Jardins,
Regis Vincent: SRI International
Barbara Grosz, Tim Rauenbusch: Harvard University
Sarit Kraus and Osher Yadgar: Bar-Ilan University
2
Accomplishments since last meeting
  • New algorithm, experiments, analysis: "An anytime
    algorithm for distributed task allocation in
    combinatorial environments," Tim Rauenbusch.
  • New algorithms, architecture, demo for load
    balancing and team organization in very large
    scale systems (hundreds of nodes)
  • New tracking algorithm: a single mobile sensor can
    track a target
  • Logic for analysis of emergent properties:
    "Provable emergent properties of agent
    societies," Ortiz.
  • "Incremental negotiation and coalition formation
    for resource-bounded reasoners: Preliminary
    report," SRI team.
  • Improved tracking on 16-node RadSim
  • Symposium program committee; Sarit Kraus invited
    talk

3
Accomplishment report
4
Incremental allocation and coalition formation
Node that detects target initiates task auction
Perimeter agents search
(I) Initial coalition
(II) Anticipate coalition (projected cone)
(III) Adapt/refine coalition
Auction-like mechanisms are quick, decentralized
methods of task and resource allocation: agents
need not send all their information to a central
point for a decision on allocation.
5
Time-bounded architecture: incremental solution
refinement
New Tasks
Existing allocations
Plan context
Existing allocations
State
Solution refinement/ adaptation
Projecting Solution Forward
Initial Seed Solution
Coalitions
Future tasks
Solution refinements
Low-cost initial assignments
Dynamic Team Structuring (pre-processing/online)
Time/complexity
6
Commitment Networks
  • Hybrid negotiation mechanism
  • Contract nets
  • Combinatorial auctions (good area of
    collaboration with complexity folks)
  • Flexible task announcement
  • Multi-stage interactions for allocation
    refinement
  • Mediation methods
  • SharedPlans
  • Richer set of commitment/contract types for fault
    tolerance and communication savings
  • Anytime implementation with methods for handling
    task interaction dynamics

7
Commitment protocol
Task: ST1, ST2, ST3
Task: (L1,T1), (L2,T2), (L3,T3), ...
N1 N2 N3 N4 N5 N6 N7 N8
N9 N10...
Task/bidders list computation
Announce
Compute bid
Bidding
Winner determination
Award
time
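The stages on this slide (bidders list computation, announce, compute bid, bidding, winner determination, award) can be sketched as one auction round. This is only an illustrative sketch: `Node`, `compute_bid`, `run_auction`, and `max_range` are hypothetical names, and bidding by distance to the task location is an assumption, not the bid function used in the project.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Node:
    name: str
    x: float
    y: float


def compute_bid(node, task_location):
    """Hypothetical bid: the node's distance to the task (lower is better)."""
    tx, ty = task_location
    return hypot(node.x - tx, node.y - ty)


def run_auction(task_location, nodes, max_range):
    # Task/bidders list computation: announce only to nodes within range.
    bidders = [n for n in nodes if compute_bid(n, task_location) <= max_range]
    # Announce, compute bid, bidding: every bidder answers with one bid.
    bids = {n.name: compute_bid(n, task_location) for n in bidders}
    # Winner determination: the lowest bid wins; no bids means no award.
    if not bids:
        return None, bids
    winner = min(bids, key=bids.get)
    # Award: report the winner together with the bids that were received.
    return winner, bids


nodes = [Node("N1", 0, 0), Node("N2", 3, 4), Node("N3", 10, 10)]
winner, bids = run_auction((3, 3), nodes, max_range=6)
```

Restricting the announcement to a bidders list, rather than broadcasting to every node, is what keeps the message count bounded as the network grows.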
8
Task allocation announcement methods
Object detected here
- Standard auction: (L1,T1) -> set of nodes
- Combinatorial auction: (L1,T1), (L2,T2), ... -> set of nodes
- Incremental mediation: (L1,T1) -> Ni, (L2,T2) -> Nj, ...

Tasks announced by way of constraints -> scalability
9
Summary of methods for controlling complexity
  • Anytime task auction with damping functions
  • Restrict local and inter-agent search
    (self-scheduling)
  • Time-bounded, multi-stage combinatorial task
    auctions via performance profiling (soon
    enough/good enough)
  • Resource allocation anticipation step
  • Communication savings via default commitments

10
Anytime allocation: task proximity
Two tasks/track segments: S, T
Three coalitions: {1,2}, {1,3}, {4,2}
[Figure: nodes 1 to 4, track segments S and T, and
the candidate assignments of S and T to the three
coalitions]
11
Summary of methods for controlling dynamics
  • Issues not in standard auctions and contract nets
  • Tasks change over time (use flexible commitments,
    damping functions)
  • Tasks cannot be announced to every agent

Auctioneer
Team 1
Team 2
Nodes aware of team members; re-allocation
occurs as regions overlap
(Q, 0) vs. (Q/2, Q/2)
12
Unfavorable dynamic behavior: multiple tasks with
priorities
S
T
U
R
Solution: willingness to de-commit
decreases as a function of distance
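One way to picture such a damping function is a sketch in which willingness decays exponentially with distance. The exponential form, the `scale` parameter, and the rule for comparing priorities are all assumptions made for illustration; the slide only states that willingness decreases with distance.

```python
import math


def decommit_willingness(distance, scale=1.0):
    """Hypothetical damping function: 1.0 at distance 0, decaying toward 0."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return math.exp(-distance / scale)


def should_decommit(current_priority, new_priority, distance, scale=1.0):
    """Switch tasks only if the damped priority of the new task is higher."""
    damped = new_priority * decommit_willingness(distance, scale)
    return damped > current_priority
```

With these assumed numbers, a task of priority 2.0 displaces a task of priority 1.0 when it is close (distance 0.1) but not when it is far away (distance 5.0), which stops an agent from oscillating between distant high-priority tasks.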
13
16-node RadSim Experiments
14
16 node configuration results
15
Average RMS Errors for Various Sensor
Configurations
16
Combinatorial resource allocation
dynamics/complexity of task interaction
  • The problem of bid generation in combinatorial
    auctions is non-trivial (tasks can interact)
  • Definition: A relevant bid is one that cannot be
    inferred as an additive combination of an agent's
    other bids.
  • In many domains, agents do not have additive cost
    functions (e.g., sensor warm-up time)
  • Associate an interaction probability ip with a
    domain: the extent to which tasks interact
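The definition of a relevant bid can be made concrete with a small sketch. It treats a bundle as irrelevant when its cost equals the summed cost of the blocks of some partition of the bundle into two or more parts. The helper names and the two cost functions (a purely additive one, and one with a fixed warm-up charge) are hypothetical illustrations, not the project's cost model.

```python
from itertools import combinations


def partitions(items):
    """Yield every partition of the tuple `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k in range(len(rest) + 1):
        for others in combinations(rest, k):
            block = (first,) + others
            remaining = tuple(x for x in rest if x not in others)
            for tail in partitions(remaining):
                yield [block] + tail


def is_relevant(bundle, cost):
    """A bid is relevant if no split of the bundle adds up to its cost."""
    bundle = tuple(sorted(bundle))
    for parts in partitions(bundle):
        if len(parts) < 2:
            continue  # the bundle itself is not one of its "other bids"
        if abs(sum(cost(p) for p in parts) - cost(bundle)) < 1e-9:
            return False
    return True


def relevant_bundles(tasks, cost):
    tasks = tuple(sorted(tasks))
    return [b for r in range(1, len(tasks) + 1)
            for b in combinations(tasks, r) if is_relevant(b, cost)]


def additive_cost(bundle):
    # Assumed: every task costs 2 and tasks do not interact.
    return 2.0 * len(bundle)


def warmup_cost(bundle):
    # Assumed: a one-off sensor warm-up of 5 plus 2 per task.
    return (5.0 if bundle else 0.0) + 2.0 * len(bundle)
```

For three tasks, the additive cost function leaves only the three single-task bids relevant, while the warm-up cost function makes all seven non-empty bundles relevant, because the warm-up is paid once per bundle rather than once per task.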

17
Relevant bid generation
  • To achieve a minimum cost allocation, it is
    important to generate all relevant bids.
  • Proposition: If not all relevant bids are
    generated, then the allocation computed by
    combinatorial auction winner determination may
    be suboptimal.
  • Proposition: If ip = 1, the number of relevant
    bids for each agent is $2^m - 1$.

18
Incremental task allocation improvement algorithm
  • Algorithm:
  • 1. Generate an initial allocation (e.g., by a
    sequential auction).
  • 2. Initialize CG, an unconnected graph with m
    vertices, each corresponding to a task.
  • 3. Iteratively improve the allocation:
  • add an edge that connects two unconnected
    subgraphs
  • optimally allocate the tasks in the newly
    connected subgraph
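The three steps above can be sketched as a Python generator that yields the current allocation after every iteration, which is what makes the procedure anytime. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the brute-force reallocation, the choice to merge the two smallest subgraphs first, and the example cost model (a warm-up charge of 5 per non-empty bundle) are all hypothetical.

```python
from itertools import product


def total_cost(alloc, agents, cost):
    """Sum each agent's cost for the bundle of tasks it currently owns."""
    return sum(
        cost(a, frozenset(t for t, owner in alloc.items() if owner == a))
        for a in agents
    )


def sequential_auction(tasks, agents, cost):
    """Step 1: give each task in turn to the agent that keeps total cost lowest."""
    alloc = {}
    for t in tasks:
        alloc[t] = min(
            agents, key=lambda a: total_cost({**alloc, t: a}, agents, cost)
        )
    return alloc


def improve(tasks, agents, cost):
    alloc = sequential_auction(tasks, agents, cost)
    components = [{t} for t in tasks]  # Step 2: CG starts with m isolated vertices.
    yield dict(alloc)
    while len(components) > 1:
        # Step 3a: connect two subgraphs (assumed policy: the two smallest).
        components.sort(key=len)
        merged = components[0] | components[1]
        components = components[2:] + [merged]
        # Step 3b: optimally reallocate the merged tasks, other tasks held fixed.
        group = sorted(merged)
        alloc = min(
            ({**alloc, **dict(zip(group, owners))}
             for owners in product(agents, repeat=len(group))),
            key=lambda cand: total_cost(cand, agents, cost),
        )
        yield dict(alloc)


# Hypothetical example: two agents, three tasks, warm-up of 5 per non-empty bundle.
per_task = {"A": {"s": 1, "t": 4, "u": 4}, "B": {"s": 4, "t": 1, "u": 1}}


def cost(agent, bundle):
    return (5 if bundle else 0) + sum(per_task[agent][t] for t in bundle)


tasks, agents = ["s", "t", "u"], ["A", "B"]
costs = [total_cost(a, agents, cost) for a in improve(tasks, agents, cost)]
```

In this example the sequential auction gives every task to agent A, and only the final merge, which reconsiders all three tasks together, finds the cheaper allocation that gives everything to agent B. Stopping early still returns a valid allocation that is never worse than the previous one.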

19
Complexity and convergence
  • Proposition: The algorithm is anytime and
    guaranteed to find an optimal task allocation.
  • Proposition: Assuming an iteration of the
    improvement phase that allocates i tasks takes
    $O(n 2^i)$ time, the running time of the
    improvement phase is $O(n 2^m)$.

20
ITAI experimental results
21
RadSim experiments
  • Test uninformed mediation
  • Assumptions
  • Each task is a 2-second tracking slice
  • Assigned to a single tracker agent (simplifying
    assumption)
  • An Auction/Mediation mechanism simultaneously
    assigns m of those slices

22
Assumptions
  • Each agent's utility for a bundle of m tasks to
    track an object at locations xy_i (i = 1..m) is
    given by
  • $\sum_{i=1}^{m} 100 - \mathrm{dist}(xy_i) - 10 \cdot \mathrm{time\_on}(xy_1, xy_2, \ldots, xy_i)$
  • where time_on(bundle) is the number of seconds
    that an agent is required to activate its sensor
    (introduces task interaction)
  • An agent's utility is local information that must
    be requested and passed via RadSim messages

23
Combinatorial Bidding Time
Maximum of 2550 seconds to communicate bids for
only 16 seconds of tracking
  • Assumptions:
  • 4 agents
  • 10 simulation seconds to evaluate and communicate
    one bid (approximate time obtained in simulation)
  • MRB: maximum number of relevant bids, assuming
    agents (1) are concerned with the assignment of
    all tasks, or (2) are concerned only with the
    tasks to which they are assigned
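The 2550-second figure can be reproduced from numbers given elsewhere in the talk: tasks are 2-second tracking slices, an agent has at most 2^m - 1 relevant bids for m tasks, and one bid takes about 10 simulation seconds. The sketch below assumes the figure is per agent, with the agents bidding in parallel; the constant and function names are illustrative.

```python
def max_relevant_bids(num_tasks):
    """Upper bound on relevant bids for one agent: every non-empty bundle."""
    return 2 ** num_tasks - 1


TRACKING_SECONDS = 16   # tracking horizon quoted on the slide
SLICE_SECONDS = 2       # each task is a 2-second tracking slice
SECONDS_PER_BID = 10    # approximate time to evaluate and communicate one bid

num_slices = TRACKING_SECONDS // SLICE_SECONDS      # 8 tasks
bids = max_relevant_bids(num_slices)                # 255 bids
bidding_seconds = bids * SECONDS_PER_BID            # 2550 seconds
```

Each additional 2-second slice roughly doubles the bidding time, which is why even moderately sized bundles become impractical at this communication cost.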

24
Mediation Provides Early, Rapid Increase in Total
Agent Utility
  • Assumptions:
  • 4 agents using the RadSim system with Uninformed
    Mediation (51 sequential mediation steps take
    approximately 450 simulation seconds; parallel
    mediation steps will provide increased speed).
    Results are averaged over 3 runs.

25
Conclusion
  • Combinatorial auction bidding may be too costly
    in domains with high communication costs (e.g.,
    RadSim), even for moderately sized bundles
  • Mediation allows agents to reach approximate
    solutions quickly
  • Ongoing work: develop bid strategies for
    combinatorial auctions and compare

26
Scaling up to thousands of nodes
  • Group formation and
  • Dynamic re-organization

27
Distributed Dispatcher Model (DDM)
  • Environment: a large number of targets and a
    large number of mobile Dopplers.
  • Dynamic formation of coalitions
  • Dynamic allocation of Dopplers to areas
  • No need for close coordination/synchronization
    between agents: each acts autonomously, which
    reduces messages, saves time, and increases
    security.
  • Fault tolerant

28
General Structure
  • Hierarchical structure of coalitions based on
    geographical areas; the base level consists of
    the Dopplers.
  • Each level controls the level below it, processes
    the information obtained from the level below it
    and from its coalition leader above it, and
    reports its estimate to its coalition leader.

29
Hierarchy Example
Goals: (1) generate a map of the targets as a
function of time; (2) achieve good coverage of
the area by the Dopplers
A sampler agent is attached to each Doppler
30
Samplers
  • Every Doppler has an associated sampler.
  • The sampler supplies sets of target states
    (t, x, y, vx, vy) according to the Doppler's
    measurements and directs the Doppler where to go
    and how fast according to the instructions of its
    Sampler Coalition Leader.
  • New algorithm:
  • take 4 consecutive measurements
  • determine a set of at most 4 possible states of a
    sensed target.
  • No need to obtain information from other samplers.

31
Coalition Leader
  • Controls a certain area. Each higher level
    coalition leader controls a larger area.
  • Task: form a map of its area.
  • Zone coalition leader: controls other coalition
    leaders; balances the number of Dopplers in its
    area.
  • Sampler coalition leader: controls the behavior
    of a set of samplers (on top of Dopplers) in a
    given area; directs its Dopplers' movement to
    cover its area.

32
Forming the Targets Map
  • Obtain estimates about targets in its area from
    lower level coalition leaders.
  • Obtain estimates of targets entering its area
    from its higher level coalition leader.
  • Update the map of your area. The algorithm is
    based on a graph of timed events and on logical,
    physics-based rules over the timed events.
  • Send your map to your leader.

33
Coalition Leader Algorithm to Move Dopplers
  • Every dT seconds, ask the superior coalition
    leader for a prediction about targets entering
    the controlled area at t + dT.
  • Ask the superior coalition leader for
    instructions on how many Dopplers to send to
    neighboring zones.
  • Form an estimate of targets and Dopplers at
    t + dT according to your current map.
  • Decide on Doppler zone changes and send them to
    your subordinate coalition leaders: balance the
    ratio of the number of sensors over all
    controlled zones with the ratio of the targets,
    and satisfy your leader's instructions.
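The balancing step in the last bullet can be sketched as a proportional split of the available Dopplers across zones according to predicted target counts. This is only one possible reading: the function names, the largest-remainder rounding, and the even split when no targets are predicted are assumptions, and the sketch ignores the superior leader's instructions about neighboring zones.

```python
def balance_dopplers(targets_per_zone, total_dopplers):
    """Assign Dopplers to zones in proportion to predicted targets."""
    zones = sorted(targets_per_zone)
    if not zones:
        return {}
    total_targets = sum(targets_per_zone.values())
    if total_targets == 0:
        # Assumed fallback: no predicted targets, so spread sensors evenly.
        shares = {z: total_dopplers / len(zones) for z in zones}
    else:
        shares = {
            z: total_dopplers * targets_per_zone[z] / total_targets
            for z in zones
        }
    plan = {z: int(shares[z]) for z in zones}
    # Hand leftover Dopplers to the zones with the largest fractional share.
    leftover = total_dopplers - sum(plan.values())
    by_remainder = sorted(zones, key=lambda z: shares[z] - plan[z], reverse=True)
    for z in by_remainder[:leftover]:
        plan[z] += 1
    return plan


def zone_changes(current, desired):
    """Positive values mean a zone should receive Dopplers, negative send."""
    return {z: desired[z] - current.get(z, 0) for z in desired}


plan = balance_dopplers({"NE": 6, "NW": 3, "S": 1}, total_dopplers=10)
changes = zone_changes({"NE": 4, "NW": 4, "S": 2}, plan)
```

The resulting `changes` mapping is what a leader would send down to its subordinate coalition leaders; here zone NE gains two Dopplers while NW and S each give one up.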

34
Fault Tolerance
  • Each coalition leader knows the leaders in its
    zone and the ones in a lower level.
  • When one of the leaders in its zone stops
    functioning, the coalition leader either divides
    the zone of the failed leader among the others
    or chooses one of the lower level leaders as a
    leader.
  • Each coalition leader below the top level knows
    its neighbors. If the top coalition leader stops
    functioning, the second level leaders either
    choose one of themselves as leader in a
    distributed way or divide its zone among
    themselves.

35
Provable emergent properties
  • Develop a logical language with which to specify
    and prove emergent system properties
  • Predictability, discrete event systems
  • Modal temporal logic in which knowledge is
    externally ascribed to agents (Halpern et al.)
  • Possible worlds are possible system executions
  • Add notion of closest possible execution
    through counterfactual semantics; capture
    regional dependencies

36
Formalizing the notion of an emergent MAS
behavior
  • Basic notions: system trajectory, stability, ...
  • Axiom (B emerges from protocol P1 relative to
    protocol P2 and background condition f):
  • emerges(P1,B,P2,f) ?
  • cost(P1) > cost(P2)
  • ? (CKf ? CKf) ? occurs P1 generates occurs B
  • ? agents(P2,G) ? CKGf ? (occurs P2 generates
    occurs B)
  • P1 requires awareness/acquisition of knowledge
    of some environmental condition, f
  • p generates q ? (def)
  • (p > q) ? (p > q) ? time(p) time(q)

37
DRAPs
  • Definition: A distributed resource allocation
    problem (DRAP), D, is a tuple <R, T, U> s.t.
  • R: set of resources/agents
  • T ? 2^E, where E is a set of task elements
  • U: R x N x T ? ?
  • Definition: A solution, S(d), to D is a mapping
    T x N ? ? (with possible constraints on group
    cost/quality also).

38
Example: ANTS
39
Beneficial emergent behavior
  • Goal: reverse engineer to computationally exploit
    beneficial emergent properties; prove stability
    and regional dependencies.
  • WD?1 on(ni,j), ?i?1,..,9,j ?1,2,3
  • ?1 B(n3, ?1 in_region(T1,(n3,3)),
  • axioms describing protocol
  • B(n, ?j in_region(j,l)) ? send(n,j, ?j
    in_region(j,l)) ...
  • Proposition: Behavior ? (T1 is tracked) emerges
    from handoff protocol P1 with respect to dispatch
    protocol P2.

40
Meeting the requirements
41
Accomplishments since last meeting
  • New algorithm, experiments, analysis: "An anytime
    algorithm for distributed task allocation in
    combinatorial environments," Tim Rauenbusch.
  • New algorithms, architecture, demo for load
    balancing and team organization in very large
    scale systems (hundreds of nodes)
  • New tracking algorithm: a single mobile sensor can
    track a target
  • Logic for analysis of emergent properties:
    "Provable emergent properties of agent
    societies," Ortiz.
  • "Incremental negotiation and coalition formation
    for resource-bounded reasoners: Preliminary
    report," SRI team.
  • Improved tracking on 16-node RadSim
  • AAAI symposium program committee participation

42
Remaining work
  • Q3, Q4 of 2001:
  • Complete systematic experimentation
    (tasks, deadlines, etc.)
  • Statistical acquisition of performance profiles
    (self-scheduling)
  • Demonstration of DDM scalability to hundreds of
    resources/tasks
  • 2002
  • Self-stabilizing algorithms for fault tolerance
  • Convergence analysis
  • Catalog of predictable properties of task-based
    auctions in dynamic domains (task interaction and
    time-bounds)