Title: Search Algorithms for Agents
Search Algorithms for Agents
- Problems that have been addressed by search algorithms can be divided into three classes:
  - path-finding problems
  - constraint satisfaction problems (CSPs)
  - two-player games
Two-player games
- Studies of two-player games are obviously related to DAI/multiagent systems in which agents are competitive.
CSPs and path-finding
- Most algorithms for these two classes were originally developed for a single agent.
- Among them, what kinds of algorithms would be useful for cooperative problem solving by multiple agents?
Search algorithms: graph representation
- A search problem can be represented by a graph.
- Some search problems can be solved by accumulating local computations for each node in the graph.
Asynchronous search algorithms: definition
- An asynchronous search algorithm solves a search problem by accumulating local computations.
- The execution order of these local computations can be arbitrary or highly flexible, and they can be executed asynchronously and concurrently.
CSP: a quick reminder
- A CSP consists of n variables x1, ..., xn, whose values are taken from finite, discrete domains D1, ..., Dn, respectively, and a set of constraints on their values.
- The constraint pk(xk1, ..., xkj) is a predicate defined on the Cartesian product Dk1 × ... × Dkj. This predicate is true iff the value assignment of these variables satisfies the constraint.
CSP
- Since constraint satisfaction is NP-complete in general, a trial-and-error exploration of alternatives is inevitable.
- For simplicity, we will focus our attention on binary CSPs, i.e., CSPs in which all constraints are between two variables.
Example: binary CSP graph
- The figure shows three variables x1, x2, x3 and the constraints x1 ≠ x3 and x1 ≠ x2.
- (Figure: constraint graph with nodes x1, x2, x3 and inequality constraints on the edges x1-x2 and x1-x3.)
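To make the example concrete, here is a minimal Python sketch of the same binary CSP. The domains {1, 2} are an assumption (the slide only specifies the constraints); the sketch simply enumerates the Cartesian product of the domains and keeps the consistent assignments.

    from itertools import product

    # Assumed domains; the slide only gives the constraints x1 != x2, x1 != x3.
    domains = {"x1": {1, 2}, "x2": {1, 2}, "x3": {1, 2}}

    # Each binary constraint is a predicate over an ordered pair of variables.
    constraints = {
        ("x1", "x2"): lambda a, b: a != b,
        ("x1", "x3"): lambda a, b: a != b,
    }

    def satisfies_all(assignment):
        """True iff the complete assignment satisfies every constraint."""
        return all(pred(assignment[u], assignment[v])
                   for (u, v), pred in constraints.items())

    # Brute-force enumeration of D1 x D2 x D3 (fine for a 3-variable example).
    names = list(domains)
    solutions = [dict(zip(names, values))
                 for values in product(*(domains[n] for n in names))
                 if satisfies_all(dict(zip(names, values)))]
    print(solutions)   # e.g. [{'x1': 1, 'x2': 2, 'x3': 2}, ...]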
Distributed CSP
- Assuming that the variables of a CSP are distributed among agents, solving the problem consists of achieving coherence among the agents.
- Problems such as multiagent truth maintenance tasks, interpretation problems, and assignment problems can be formalized as distributed CSPs.
CSPs and asynchronous algorithms
- Each process corresponds to a variable.
- We assume the following communication model:
  - Processes communicate by sending messages.
  - The delay in delivering a message is finite.
  - Between two processes, messages are received in the order in which they were sent.
- Processes that have links to xi are called neighbors of xi.
Filtering Algorithm
- A process xi performs the following procedure revise(xi, xj) for each neighboring process xj:

  procedure revise(xi, xj)
    for all vi in Di do
      if there is no value vj in Dj such that vj is consistent with vi
        then delete vi from Di end if
    end do

- When a value is deleted, the process sends its new domain to its neighboring processes.
- When xi receives a new domain from a neighbor xj, the procedure revise(xi, xj) is performed again.
- The execution order of these processes is arbitrary.
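Below is a sequential Python sketch of the filtering idea. The slides describe an asynchronous, message-passing version; here the revise calls are simply repeated until no domain changes, and the data structures and toy domains are assumptions for illustration.

    def revise(xi, xj, domains, constraints):
        """Remove from Di every value with no consistent value in Dj.
        Returns True if Di changed."""
        pred = constraints.get((xi, xj))
        if pred is None:
            return False
        removed = {vi for vi in domains[xi]
                   if not any(pred(vi, vj) for vj in domains[xj])}
        if removed:
            domains[xi] -= removed
        return bool(removed)

    def filtering(domains, constraints):
        """Repeat revise over all constrained pairs until no domain changes
        (a sequential stand-in for the asynchronous message-passing version)."""
        changed = True
        while changed:
            changed = False
            for (xi, xj) in constraints:
                if revise(xi, xj, domains, constraints):
                    changed = True
        return domains

    # Toy run with assumed domains: x2 is already fixed to 2.
    domains = {"x1": {1, 2}, "x2": {2}, "x3": {1, 2}}
    constraints = {("x1", "x2"): lambda a, b: a != b,
                   ("x2", "x1"): lambda a, b: a != b,
                   ("x1", "x3"): lambda a, b: a != b,
                   ("x3", "x1"): lambda a, b: a != b}
    print(filtering(domains, constraints))
    # -> {'x1': {1}, 'x2': {2}, 'x3': {2}}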
Filtering example: 3-Queens
- (Figure: the domains of x1, x2, x3 are filtered by applying revise(x1,x2), revise(x2,x3), and revise(x3,x2).)
3-Queens example, continued
- (Figure: filtering continues with revise(x1,x3); the domains of x1, x2, x3 are further reduced.)
Filtering Algorithm
- If the domain of some variable becomes an empty set, the problem is over-constrained and has no solution.
- If each domain has a unique value, the remaining values are a solution.
- If multiple values remain for some variables, we cannot tell whether the problem has a solution or not, and further search is required.
- Filtering should be considered a preprocessing procedure that is invoked before the application of other search methods.
K-Consistency
- A CSP is k-consistent iff, given any instantiation of any k-1 variables satisfying all the constraints among them, it is possible to find an instantiation of any kth variable such that these k variable values satisfy all the constraints among them.
- If the problem is k-consistent and j-consistent for all j < k, the problem is called strongly k-consistent.
- Next, we'll see an algorithm that transforms a given problem into an equivalent strongly k-consistent problem.
Hyper-Resolution-Based Consistency Algorithm
- The hyper-resolution rule is described as follows (Ai is a proposition such as x1 = 1, and each Si is a conjunction of such propositions):

  A1 ∨ A2 ∨ ... ∨ Am
  ¬(A1 ∧ S1)
  ¬(A2 ∧ S2)
  ...
  ¬(Am ∧ Sm)
  --------------------------
  ¬(S1 ∧ S2 ∧ ... ∧ Sm)

- In this algorithm, all constraints are represented as nogoods, where a nogood is a prohibited combination of variable values (example on the next slide).
Graph coloring example
- The constraints between x1 and x2 can be represented as two nogoods: {x1 = red, x2 = red} and {x1 = blue, x2 = blue}.
- By using the hyper-resolution rule, we can obtain from {x1 = red, x2 = red} and {x1 = blue, x3 = blue} a new nogood {x2 = red, x3 = blue}.
- (Figure: three variables x1, x2, x3, each with domain {red, blue}.)
Hyper-Resolution-Based Consistency Algorithm
- Each process represents its constraints as nogoods.
- Each process generates new nogoods by combining the information about its domain and the existing nogoods using the hyper-resolution rule.
- A newly obtained nogood is communicated to the related processes.
- When a new nogood is communicated, the process tries to generate further new nogoods using the communicated nogood.
Hyper-Resolution-Based Consistency Algorithm
- A nogood is a combination of variable values that is prohibited; therefore, a superset of a nogood cannot be a solution.
- If the empty set becomes a nogood, the problem is over-constrained and has no solution.
- The hyper-resolution rule can generate a very large number of nogoods. If we restrict the application of the rule so that only nogoods whose length is less than k are produced, the problem becomes strongly k-consistent.
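To make the rule concrete, here is a small Python sketch of a single hyper-resolution step. It assumes nogoods are stored as frozensets of (variable, value) pairs and keeps only nogoods shorter than k, as described above; it reproduces the graph-coloring derivation from the earlier slide. This is only the local rule, not the full distributed consistency algorithm.

    from itertools import product

    def hyper_resolve(variable, domain, nogoods, k):
        """Combine one nogood per domain value of `variable` (each must
        mention that value) and return the new nogoods of length < k."""
        # For every value, collect the nogoods that mention (variable, value).
        per_value = [[ng for ng in nogoods if (variable, v) in ng]
                     for v in domain]
        new_nogoods = set()
        for combo in product(*per_value):
            # Drop the resolved-upon literals and merge what remains.
            rest = frozenset().union(*(ng - {(variable, v)}
                                       for ng, v in zip(combo, domain)))
            if len(rest) < k:
                new_nogoods.add(rest)
        return new_nogoods

    # Graph-coloring example from the earlier slide:
    nogoods = {frozenset({("x1", "red"), ("x2", "red")}),
               frozenset({("x1", "blue"), ("x3", "blue")})}
    print(hyper_resolve("x1", ["red", "blue"], nogoods, k=3))
    # -> {frozenset({('x2', 'red'), ('x3', 'blue')})}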
Asynchronous Backtracking
- An asynchronous version of the backtracking algorithm, which is a standard method for solving CSPs.
- The completeness of the algorithm is guaranteed.
- The processes are ordered by the alphabetical order of the variable identifiers. Each process chooses an assignment.
- Each process maintains the current values of the other processes from its viewpoint (its local view). A process changes its assignment if its current value is not consistent with the assignments of the higher-priority processes.
- If there exists no value that is consistent with the higher-priority processes, the process generates a new nogood and communicates the nogood to a higher-priority process.
Asynchronous Backtracking
- The local view may contain obsolete information. Therefore, the receiver of a new nogood must check whether the nogood is actually violated in its own local view.
- The main message types communicated among processes are ok?, to communicate the current value, and nogood, to communicate a new nogood.
Asynchronous Backtracking example
- (Figure: x1 with domain {1, 2} and x2 with value 2 are connected to x3 with domain {1, 2} by inequality constraints. x1 and x2 send (ok?, (x1, 1)) and (ok?, (x2, 2)) to x3, whose local view becomes {(x1, 1), (x2, 2)}.)
Asynchronous Backtracking example, continued (1)
- (Figure: x3 has no value consistent with its local view, so it sends (nogood, {(x1, 1), (x2, 2)}) to x2. Since x1 is not a neighbor of x2, x2 sends an add-neighbor request and a new link from x1 to x2 is created; x2's local view becomes {(x1, 1)}.)
Asynchronous Backtracking example, continued (2)
- (Figure: x2 has no value consistent with the assignment (x1, 1), so it sends (nogood, {(x1, 1)}) to x1.)
Asynchronous Backtracking

  when received (ok?, (xj, dj)) do
    add (xj, dj) to local_view;
    check_local_view;
  end do;

  when received (nogood, nogood) do
    record nogood as a new constraint;
    when (xk, dk), where xk is not a neighbor, is contained in nogood do
      request xk to add xi to its neighbors;
      add xk to neighbors;
      add (xk, dk) to local_view;
    end do;
    check_local_view;
  end do;
Asynchronous Backtracking

  procedure check_local_view
    when local_view and current_value are not consistent do
      if no value in Di is consistent with local_view then
        resolve a new nogood using the hyper-resolution rule and
        send the nogood to the lowest-priority process in the nogood;
        when an empty nogood is found do
          broadcast to the other processes that there is no solution,
          and terminate this algorithm;
        end do;
      else
        select d in Di where local_view and d are consistent;
        current_value ← d;
        send (ok?, (xi, d)) to neighbors;
      end if;
    end do;
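As a rough, single-agent illustration of check_local_view, the Python sketch below replaces message sending with return values and uses the whole local view as the new nogood instead of a hyper-resolution step; all names are assumptions for illustration, not the chapter's code.

    def consistent(me, value, local_view, constraints):
        """True iff `value` for `me` violates no constraint with the
        higher-priority assignments recorded in local_view."""
        return all(pred(value, local_view[other])
                   for (a, other), pred in constraints.items()
                   if a == me and other in local_view)

    def check_local_view(me, current, domain, local_view, constraints):
        if consistent(me, current, local_view, constraints):
            return ("keep", current)
        candidates = [d for d in domain
                      if consistent(me, d, local_view, constraints)]
        if candidates:
            # In the real algorithm, (ok?, (me, d)) is then sent to the neighbors.
            return ("ok?", candidates[0])
        # No consistent value: here the whole local_view becomes the nogood,
        # to be sent to the lowest-priority process appearing in it.
        return ("nogood", dict(local_view))

    # The situation of x3 from the example slides above.
    constraints = {("x3", "x1"): lambda a, b: a != b,
                   ("x3", "x2"): lambda a, b: a != b}
    print(check_local_view("x3", 1, [1, 2], {"x1": 1, "x2": 2}, constraints))
    # -> ('nogood', {'x1': 1, 'x2': 2})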
Asynchronous Weak-Commitment Search
- This algorithm introduces a method for dynamically ordering processes so that a bad decision can be revised without an exhaustive search.
- For each process, the initial priority value is 0.
- If there exists no consistent value for xi, the priority value of xi is changed to k + 1, where k is the largest priority value of the related processes.
- The order is defined such that any process with a larger priority value has higher priority. If the priority values of processes are the same, the order is determined by the alphabetical order of the variable identifiers.
Asynchronous Weak-Commitment Search
- As in asynchronous backtracking, each process concurrently assigns a value to its variable and sends the variable value to the other processes.
- The priority value, as well as the current assignment, is communicated through the ok? message.
- If the current value is not consistent with the local view, the agent changes its value using the min-conflict heuristic, i.e., it selects a value that is consistent with the local view and minimizes the number of constraint violations with the variables of lower-priority processes (see the sketch below).
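A brief Python sketch of this min-conflict value choice. The violates(value, other_var, other_val) predicate is a hypothetical stand-in for the agent's constraint checks; this is only the heuristic, not the full agent loop.

    def min_conflict_value(domain, higher_view, lower_view, violates):
        """Pick a value consistent with the higher-priority assignments that
        minimizes conflicts with the lower-priority ones.

        violates(value, other_var, other_val) is a hypothetical predicate
        returning True when `value` conflicts with the other assignment."""
        consistent = [d for d in domain
                      if not any(violates(d, v, val)
                                 for v, val in higher_view.items())]
        if not consistent:
            return None   # no consistent value: triggers nogood generation
        return min(consistent,
                   key=lambda d: sum(violates(d, v, val)
                                     for v, val in lower_view.items()))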
Asynchronous Weak-Commitment Search
- Each process records the nogoods that have been resolved.
- When xi cannot find a value consistent with its local view, xi sends nogood messages to the other processes, and increments its priority value only if it has created a new nogood.
Asynchronous Weak-Commitment Search example
- (Figure: a 4-Queens example. In state (a), all agents x1-x4 have priority value 0; in state (b), the priority value of x4 has been raised to 1.)
Asynchronous Weak-Commitment Search example, continued
- (Figure: states (c) and (d) of the 4-Queens example; the priority values are x1: 0, x2: 0, x3: 2, x4: 1.)
Asynchronous Weak-Commitment Search: completeness
- The completeness of the algorithm is guaranteed by the fact that the processes record all the nogoods found so far.
- Handling a large number of nogoods is time/space consuming. We can restrict the number of recorded nogoods so that each process records only the most recently found nogoods. In this case, theoretical completeness is not guaranteed; yet, when the number of recorded nogoods is reasonably large, an infinite processing loop rarely occurs.
Path-Finding Problem
- A path-finding problem consists of the following components:
  - a set of nodes N, each representing a state;
  - a set of directed links L, each representing an operator available to the problem-solving agent;
  - a unique node s called the start node;
  - a set of nodes G, each representing a goal state.
Path-Finding Problem
- More definitions:
  - h*(i) is the shortest distance from node i to the goal nodes.
  - If j is a neighbor of i, the shortest distance via j is given by f*(j) = k(i,j) + h*(j), where k(i,j) is the cost of the link between i and j.
  - If i is not a goal node, then h*(i) = min_j f*(j) holds.
Asynchronous Dynamic Programming Algorithm
- Let us assume the following situation:
  - For each node i there exists a process corresponding to it.
  - Each process records h(i), which is the estimated value of h*(i). The initial value of h(i) is ∞, except for goal nodes.
  - For each goal node g, h(g) is 0.
  - Each process can refer to the h values of its neighboring nodes.
- The algorithm: each process updates h(i) by the following procedure. For each neighboring node j, compute f(j) = k(i,j) + h(j), and update h(i) as follows: h(i) ← min_j f(j).
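A sequential Python stand-in for this per-node update (real executions are asynchronous and concurrent); the small graph and link costs are made up for illustration, not taken from the figure on the next slide.

    import math

    def adp_sweeps(cost, goals, sweeps=10):
        """cost[i][j] is the link cost from i to j; returns the h estimates."""
        h = {i: (0.0 if i in goals else math.inf) for i in cost}
        for _ in range(sweeps):          # the real updates run asynchronously
            for i in cost:
                if i in goals:
                    continue
                # h(i) <- min_j ( k(i, j) + h(j) ) over the neighbors j of i
                h[i] = min(cost[i][j] + h[j] for j in cost[i])
        return h

    # Assumed graph: start s, goal g, intermediate nodes a, b, c, d.
    cost = {"s": {"a": 2, "b": 1}, "a": {"c": 3, "s": 2},
            "b": {"d": 1, "s": 1}, "c": {"g": 4, "a": 3},
            "d": {"g": 2, "b": 1}, "g": {}}
    print(adp_sweeps(cost, goals={"g"}))
    # With positive costs the estimates converge to the true distances,
    # e.g. h['s'] == 4.0 for this assumed graph.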
Asynchronous Dynamic Programming example
- (Figure: a small example graph with start node s, goal node g, and intermediate nodes a, b, c, d; the numbers on the links are costs, and the numbers at the nodes are the h values being updated.)
Asynchronous Dynamic Programming
- If the costs of all links are positive, it can be proved that, for each node i, h(i) converges to the true value h*(i).
- In reality, the number of nodes can be huge, and we cannot afford to have processes for all nodes.
Learning Real-Time A* Algorithm (LRTA*)
- As with asynchronous dynamic programming, each agent records the estimated distance h(i).
- Each agent repeats the following procedure:
  - Lookahead: calculate f(j) = k(i,j) + h(j) for each neighbor j of the current node i.
  - Update: h(i) ← min_j f(j).
  - Action selection: move to the neighbor j that has the minimum f(j) value.
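A compact Python sketch of this loop; the graph, costs, and zero-initialized heuristic are assumptions for illustration.

    def lrta_star(cost, h, start, goals, max_steps=100):
        """cost[i][j]: link cost from i to j; h: initial heuristic estimates."""
        path, current = [start], start
        for _ in range(max_steps):
            if current in goals:
                return path
            # Lookahead: f(j) = k(i, j) + h(j) for every neighbor j.
            f = {j: cost[current][j] + h[j] for j in cost[current]}
            best = min(f, key=f.get)
            h[current] = f[best]        # update: h(i) <- min_j f(j)
            current = best              # action selection: move to that neighbor
            path.append(current)
        return path

    # Assumed toy graph with a trivially admissible all-zero heuristic.
    cost = {"s": {"a": 1, "b": 2}, "a": {"g": 3, "s": 1},
            "b": {"g": 1, "s": 2}, "g": {}}
    h = {"s": 0, "a": 0, "b": 0, "g": 0}
    print(lrta_star(cost, h, "s", {"g"}))   # -> ['s', 'a', 's', 'b', 'g']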
LRTA*
- The initial values of h are determined using an admissible heuristic function.
- By using an admissible heuristic function on a problem with a finite number of nodes, in which all link costs are positive and there exists a path from every node to a goal node, completeness is guaranteed.
- Since LRTA* never overestimates, it learns the optimal solutions through repeated trials.
Real-Time A* Algorithm (RTA*)
- Similar to LRTA*, except that the update phase is different:
  - Instead of setting h(i) to the smallest value of f(j), the second smallest value is assigned to h(i).
  - As a result, RTA* learns more efficiently than LRTA*, but can overestimate heuristic costs.
- In a finite problem space with positive edge costs, in which there exists a path from every state to a goal, and with non-negative admissible initial heuristic values, RTA* is complete.
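Relative to the LRTA* sketch above, the only change is the value written back into h(i); a minimal helper, with the same assumed setup:

    def rta_update(f_values):
        """Value RTA* writes into h(i): the second smallest f(j), or the
        smallest when the current node has a single neighbor."""
        ordered = sorted(f_values)
        return ordered[1] if len(ordered) > 1 else ordered[0]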
Moving Target Search (MTS)
- The MTS algorithm is a generalization of LRTA* to the case where the target can move.
- We assume that the problem solver and the target move alternately, and each can traverse at most one edge in a single move.
- The task is accomplished when the problem solver and the target occupy the same node.
- MTS maintains a matrix of heuristic values, representing the function h(x, y) for all pairs of states x and y.
- The matrix is initialized to the values returned by the static evaluation function.
MTS
- To simplify the following discussion, we assume that all edges in the graph have unit cost.
- When the problem solver moves:
  1. Calculate h(xj, yi) for each neighbor xj of xi.
  2. Update the value of h(xi, yi) as follows:
     h(xi, yi) ← max{ h(xi, yi), min_xj [ h(xj, yi) + 1 ] }
  3. Move to the neighbor xj with the minimum h(xj, yi).
MTS
- When the target moves:
  1. Calculate h(xi, yj) for the target's new position yj.
  2. Update the value of h(xi, yi) as follows:
     h(xi, yi) ← max{ h(xi, yi), h(xi, yj) - 1 }
  3. Assign yj to yi (yj is the target's new position).
- MTS completeness: in a finite problem space with positive edge costs, in which there exists a path from every state to the goal state, starting with non-negative admissible initial heuristic values, and with the other assumptions we mentioned, the problem solver will eventually reach the target.
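Putting the two update rules together, here is a small Python sketch under the unit-edge-cost assumption, with h stored as a dictionary keyed by (problem-solver state, target state) pairs; the representation and helper names are assumptions for illustration.

    def solver_move(h, x, y, neighbors):
        """Problem solver's turn: update h(x, y) and return the node it moves to."""
        best = min(neighbors[x], key=lambda xj: h[(xj, y)])
        h[(x, y)] = max(h[(x, y)], h[(best, y)] + 1)
        return best

    def target_moved(h, x, y_old, y_new):
        """Target's turn: update h(x, y_old) after observing the target at y_new."""
        h[(x, y_old)] = max(h[(x, y_old)], h[(x, y_new)] - 1)
        return y_new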
Real-Time Bidirectional Search Algorithm (RTBS)
- Two problem solvers, starting from the initial and goal states, move toward each other.
- Each of them knows its current location and can communicate with the other.
- The following steps are executed until the solvers meet:
  - Control strategy: select a forward or backward move.
  - Forward move: the forward solver moves toward the other.
  - Backward move: the backward solver moves toward the other.
RTBS
- There are two categories of RTBS:
  - Centralized RTBS, where the best action is selected from among all possible moves of the two solvers.
  - Decoupled RTBS, where the two solvers independently make their own decisions.
- The evaluation results show that when the heuristic function returns accurate values, decoupled RTBS performs better than centralized RTBS; otherwise, centralized RTBS is better.
Is RTBS better than unidirectional search?
- The number of moves for centralized RTBS is around 1/2 (in 15-puzzles) to 1/6 (in 24-puzzles) of that for real-time unidirectional search.
- In mazes, the number of moves for RTBS is double that for unidirectional search.
- The key to understanding these results is to view the difference between RTBS and unidirectional search as a difference in their problem spaces.
RTBS
- We call a pair of locations (x, y) a p-state.
- We call the problem space consisting of p-states a combined problem space.
- A heuristic depression is a set of connected states whose heuristic values are less than or equal to those of the immediately surrounding states.
- The performance of real-time search is sensitive to the topography of the problem space, especially to heuristic depressions.
RTBS
- Heuristic depressions of the original problem space have been observed to become large and shallow in the combined problem space:
  - If the original heuristic depressions are deep, they become large, and that makes the problem harder to solve.
  - If the original depressions are shallow, they become very shallow, and this makes the problem easier to solve.