Title: Heuristic Search in Artificial Intelligence: Recent Enhancements and Applications
1 Heuristic Search in Artificial Intelligence
Recent Enhancements and Applications
- Ariel Felner
- ISE Department
- Ben-Gurion University.
- felner_at_bgu.ac.il
2 Optimal path search algorithms
- For small graphs provided explicitly: algorithms such as Dijkstra's shortest path, Bellman-Ford or Floyd-Warshall. Complexity is polynomial in the number of nodes n, e.g. O(n^2) for Dijkstra's algorithm.
- For very large graphs, which are implicitly defined: the A* algorithm, which is a best-first search algorithm.
3 Best-first search schema
- Sorts all generated nodes in an OPEN-LIST and chooses the node with the best cost value for expansion.
- generate(x): insert x into the OPEN-LIST.
- expand(x): delete x from the OPEN-LIST and generate its children.
- Best-first search (BFS) depends on its cost (heuristic) function. Different functions cause BFS to expand different nodes.
[Figure: search tree with node costs between 20 and 40 and the corresponding Open-List]
4 Best-first search: Cost functions
- g(x): Real distance from the initial state to x.
- h(x): The estimated remaining distance from x to the goal state.
- Examples: air distance, Manhattan distance.
- Different cost combinations of g and h:
- f(x) = level(x): Breadth-First Search.
- f(x) = g(x): Dijkstra's algorithm.
- f(x) = h(x): Pure Heuristic Search (PHS).
- f(x) = g(x) + h(x): The A* algorithm (1968).
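The schema and the cost combinations above can be sketched as one generic routine with a pluggable cost function. This is a minimal illustration, not the presenter's implementation: the function names, the `f(g, x)` signature, and the empty 20x20 grid are assumptions made only for the example.

```python
import heapq
from itertools import count

def best_first_search(start, goal, neighbors, f):
    """Generic best-first search; f(g, x) orders the OPEN-LIST.

    neighbors(x) yields (child, edge_cost) pairs.
    Returns (cost of the path found, number of expanded nodes).
    """
    tie = count()                                  # tie-breaker: states are never compared
    open_list = [(f(0, start), next(tie), 0, start)]
    best_g = {start: 0}
    expanded = 0
    while open_list:
        _, _, g, x = heapq.heappop(open_list)      # expand(x): delete x from the OPEN-LIST
        if g > best_g[x]:
            continue                               # stale entry, a cheaper copy exists
        if x == goal:
            return g, expanded
        expanded += 1
        for child, cost in neighbors(x):
            g2 = g + cost
            if g2 < best_g.get(child, float("inf")):
                best_g[child] = g2                 # generate(child): insert into the OPEN-LIST
                heapq.heappush(open_list, (f(g2, child), next(tie), g2, child))
    return None, expanded

SIZE, GOAL = 20, (9, 9)                            # hypothetical empty grid, unit edge costs

def grid_neighbors(x):
    r, c = x
    for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
        if 0 <= nr < SIZE and 0 <= nc < SIZE:
            yield (nr, nc), 1

def manhattan(x):
    return abs(x[0] - GOAL[0]) + abs(x[1] - GOAL[1])

dijkstra = best_first_search((0, 0), GOAL, grid_neighbors, lambda g, x: g)
astar = best_first_search((0, 0), GOAL, grid_neighbors, lambda g, x: g + manhattan(x))
phs = best_first_search((0, 0), GOAL, grid_neighbors, lambda g, x: manhattan(x))
```

Only the lambda passed as `f` changes between the three runs. On this grid both `dijkstra` and `astar` return the optimal cost, and `astar` expands no more nodes than `dijkstra`.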
5 A*
- A* is a best-first search algorithm that uses f(n) = g(n) + h(n) as its cost function.
- f(x) in A* is an estimation of the shortest path to the goal via x.
- With an admissible heuristic, A* is admissible, complete and optimally effective (Pearl 84).
- Result: any other optimal search algorithm will expand at least all the nodes expanded by A*.
[Figure: nodes expanded by Breadth-First Search compared with A*]
6 How to improve search?
- Enhanced algorithms: Perimeter search, RBFS, Frontier search, etc. They all try to explore the search tree better.
- Better heuristics: more parts of the search tree will be pruned.
- In the 3rd millennium we have very large memories, so we can build large tables.
- For enhanced algorithms: large open-lists or transposition tables. They store nodes explicitly.
- A more intelligent way is to store general knowledge. We can do this with heuristics.
7 Pattern databases
- Many problems can be decomposed into subproblems (patterns) that must also be solved.
- The cost of a solution to a subproblem is a lower bound on the cost of the complete solution.
- Instead of calculating the lower bounds on the fly, we expand the whole pattern space and store the solution to each pattern configuration in a pattern database.
[Figure: a mapping function projects the search space onto the pattern space]
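As a rough sketch of the idea on this slide, the code below builds a small pattern database by breadth-first search backwards from the abstract goal. The 8-puzzle, the choice of pattern (blank plus tiles 1-3), and all function names are assumptions picked to keep the table tiny; they are not taken from the talk.

```python
from collections import deque

N = 3                                   # 3x3 board (8-puzzle), an assumption for brevity
GOAL = (0, 1, 2, 3, 4, 5, 6, 7, 8)      # 0 is the blank
PATTERN = {0, 1, 2, 3}                  # hypothetical pattern: blank plus tiles 1-3

def abstract(state):
    """Mapping function: keep the pattern tiles, turn every other tile into -1."""
    return tuple(t if t in PATTERN else -1 for t in state)

def successors(state):
    """Slide a neighbouring tile into the blank; works on abstract states as well."""
    b = state.index(0)
    r, c = divmod(b, N)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < N and 0 <= nc < N:
            s = list(state)
            n = nr * N + nc
            s[b], s[n] = s[n], s[b]
            yield tuple(s)

def build_pdb():
    """Expand the whole pattern space once; moves are reversible, so BFS from the goal suffices."""
    start = abstract(GOAL)
    pdb = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for t in successors(s):
            if t not in pdb:
                pdb[t] = pdb[s] + 1
                queue.append(t)
    return pdb

PDB = build_pdb()

def h(state):
    """Lower bound for a full state: look up its pattern configuration."""
    return PDB[abstract(state)]
```

The table has 9*8*7*6 = 3024 entries, one per placement of the four pattern items, and `h` is a single lookup during search.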
8 Non-additive pattern databases
- 15 puzzle: 10^13 states.
- Fringe pattern database (Culberson & Schaeffer 1996): has only 259 million states.
- Improvement of a factor of 100 over Manhattan Distance.
9 Rubik's Cube (Korf 1997)
- Has 10^19 states.
- The PDB of the corner cubies has only 88 million states.
- Korf (AAAI-97) built 2 other pattern databases for this domain.
- The best way to combine different non-additive pattern databases is to take their maximum!
10 Disjoint Additive Databases
- 15 and 24 puzzles (Korf & Felner, AIJ-02; Felner, Korf & Hanan, JAIR-04).
- Values of disjoint databases can be added for the heuristic.
- Better than maxing heuristics.
[Figure: 7-8 partitioning of the 15 puzzle and 6-6-6-6 partitioning of the 24 puzzle]
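The two ways of combining databases can be written side by side. The notation is an assumption introduced here: \phi_i maps a state s to its i-th pattern, PDB_i is the corresponding table, and h^* is the true optimal cost.

```latex
\begin{align*}
h_{\max}(s) &= \max_{i} \; \mathrm{PDB}_i\bigl(\phi_i(s)\bigr) \\
h_{\mathrm{add}}(s) &= \sum_{i} \mathrm{PDB}_i\bigl(\phi_i(s)\bigr) \;\le\; h^{*}(s)
\end{align*}
```

The bound on the sum holds only when the patterns are disjoint and each database counts only the moves of its own tiles. For the same set of databases the sum is never smaller than the maximum, because every term is non-negative.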
11 Dynamically-partitioned additive databases
- Statically-partitioned databases do not capture conflicts of tiles from different patterns.
- We want to store as many pattern databases as possible and partition them into disjoint subproblems on the fly, such that the chosen partition yields the best heuristic.
- These are called dynamically partitioned PDBs.
12 Experimental Results: 15 puzzle
[Table: results for the Fives, Sixes and Seven-Eight partitionings]
13 Results: 24 puzzle
- For the 24 puzzle we compared the SDB of sixes with the DDB of pairs and triples on 10 random instances.
- The relative advantage of the SDB decreases when the problem scales up.
- What will happen for the 6x6 35 puzzle?
14 35 puzzle
We sampled 10 billion random states and calculated their heuristic. The table was created by the method presented by Korf, Reid and Edelkamp (AIJ 129, 2001).
15 Tile puzzles: Summary
- The relative advantage of the SDB over the DDB decreases as the puzzle size grows.
- For the 15 puzzle 1/2 of the domain is stored.
- For the 24 puzzle 1/4 of the domain is stored.
- For the 35 puzzle 1/7 of the domain is stored.
- The memory needed by the DDB was 100 times smaller than that of the SDB!
16 4-peg Towers of Hanoi (TOH4)
- There is a conjecture about the length of the optimal path, but it has not been proven.
- Systematic search is the only way to solve this problem or to verify the conjecture.
- There are too many cycles. IDA*, as a DFS, will not prune these cycles. Therefore, A* (actually frontier A*, Korf & Zhang 2000) was used.
17 Heuristics for the TOH
- Infinite peg heuristic (INP): each disk moves to its own temporary peg.
- Additive pattern databases (Felner, Korf & Hanan, JAIR-04).
18 Additive PDBs for TOH4
- Partition the disks into disjoint sets.
- Store the cost of the complete pattern space of each set in a pattern database.
- Add values from these PDBs for the heuristic value.
- The n-disk problem contains 4^n states.
- The largest database that we stored was of 14 disks, which needed 4^14 entries = 256MB.
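The steps listed on this slide can be sketched on a small instance. The code below is an illustration under assumptions: 5 disks split into groups of 2 and 3, a state encoded as a tuple giving the peg of each disk (disk 0 is the smallest), and helper names that are invented for the example.

```python
from collections import deque

PEGS = 4

def moves(state):
    """state[d] is the peg holding disk d; disk 0 is the smallest."""
    top = {}
    for disk, peg in enumerate(state):
        top.setdefault(peg, disk)              # first disk seen on a peg is its top disk
    for peg, disk in top.items():
        for target in range(PEGS):
            # legal if the target peg is empty or its top disk is larger
            if target != peg and top.get(target, len(state)) > disk:
                yield state[:disk] + (target,) + state[disk + 1:]

def build_pdb(k, goal_peg=PEGS - 1):
    """Complete pattern space of k disks: BFS from the goal over all 4^k states."""
    goal = (goal_peg,) * k
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        s = queue.popleft()
        for t in moves(s):
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return dist

def additive_h(state, groups, pdbs):
    """Sum the PDB values of the disjoint disk groups."""
    return sum(pdb[tuple(state[d] for d in group)] for group, pdb in zip(groups, pdbs))

N = 5                                          # hypothetical small instance
GROUPS = [(0, 1), (2, 3, 4)]                   # disjoint sets of disks, smallest first
PDBS = [build_pdb(len(group)) for group in GROUPS]
TRUE = build_pdb(N)                            # exact costs, used only to check the bound
```

Every move shifts exactly one disk, and that disk belongs to exactly one group, so the sum stays a lower bound on the true cost. The memory figure on the slide follows from 4^14 = 268,435,456 entries, which is 256MB at one byte per entry.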
19 TOH4 results
- The difference between static and dynamic is covered in Felner, Korf & Hanan (JAIR-04).
20 Vertex-Cover (VC) (Felner et al., JAIR-04)
- Given a graph, we want the minimal set of vertices such that they cover all the edges.
- VC was one of the first problems that was proved to be NP-complete.
- Search tree:
- At each level, either include or exclude a vertex.
- Improvements:
- If a vertex is excluded, all its neighbors must be included.
- Dealing with degree-0 and degree-1 vertices.
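[Figure: example graph with vertices 0-3 and its search tree from the root R; nodes are labeled with included (V) and excluded (X) vertices]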
21 Depth-first Branch and Bound
- DFBnB searches the tree from left to right.
- Expands only subtrees with costs smaller than the best solution found so far.
- We also used Iterative Deepening A* (IDA*) (Korf 85).
22 Heuristics for VC
- The included vertices form the g part of f = g + h.
- We want an admissible heuristic of the free vertices.
- Pairwise heuristic: a maximum matching of the free graph.
- For a triangle we can add two to the heuristic.
- In general, a clique of size k contributes k-1 to h.
- So partition the free graph into disjoint cliques and sum up their heuristics.
[Figure: VC example showing the free vertices 1-4]
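A minimal sketch of depth-first branch and bound for vertex cover with the pairwise heuristic is given below. It is an illustration only: it uses a greedy maximal matching (cliques of size 2) rather than a maximum matching or larger cliques, and the branching rule and names are assumptions, not the authors' solver.

```python
def matching_bound(edges):
    """Greedy maximal matching: every matched edge needs its own cover vertex."""
    used, size = set(), 0
    for u, v in edges:
        if u not in used and v not in used:
            used.update((u, v))
            size += 1
    return size

def min_vertex_cover(n, edges):
    """Size of a minimum vertex cover of a graph on vertices 0..n-1 (no self-loops)."""
    best = n                                   # taking every vertex is always a cover

    def dfbnb(g, free_edges):
        nonlocal best
        # prune: f = g + h cannot beat the best solution found so far
        if g + matching_bound(free_edges) >= best:
            return
        if not free_edges:                     # all edges covered: new best solution
            best = g
            return
        u = free_edges[0][0]
        # branch 1: include u, which covers all of its edges
        dfbnb(g + 1, [e for e in free_edges if u not in e])
        # branch 2: exclude u, so all of its neighbors must be included
        nbrs = {a if b == u else b for a, b in free_edges if u in (a, b)}
        dfbnb(g + len(nbrs), [e for e in free_edges if not nbrs.intersection(e)])

    dfbnb(0, list(edges))
    return best

cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
k4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
```

Any matching is admissible here because no vertex can cover two edges of a matching. Replacing `matching_bound` with a partition into larger disjoint cliques, each contributing k-1, would tighten the bound in the way the slide describes.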
23 Additive pattern databases
- Clique is NP-complete. However, in random graphs, cliques of size 5 and larger are rare. Thus, it is easy to find small cliques.
- Pattern databases: instead of finding the cliques on the fly, we identify them before the search and store them in a pattern database. We stored cliques of size 4 or smaller.
- During the search we need to retrieve disjoint cliques from the pattern database.
24 VC results
- The results are on random graphs of size 150 and an average degree of 16.
- When we added our dynamic database to the best proven tree search algorithm, we further improved the running time by a factor of more than 10.
25 Conclusions and Summary
- In general, additivity can be applied whenever a problem can be decomposed into disjoint subproblems such that the sum of the costs is a lower bound on the cost of the complete problem.
- Additive databases are a special case of additive heuristics where we save the heuristics in a table.
26 The Graph Partitioning Problem (Felner, AMAI-2004)
- Given a graph G = (V, E), the problem is to partition the graph into two equal-sized subsets of vertices.
- The number of edges that cross the partition should be minimized.
- The partition in the graph on the right is of cost 2.
27 GPP as a search problem
- A subproblem in GPP is to assign a vertex to one of the subsets of the partition.
- Each level of the search tree corresponds to a specific vertex of the graph.
- Each branch assigns the vertex to a different subset of the partition.
- Each node of the tree is a partial partition including some of the vertices.
- Size of the tree: 2^n.
- Leaves of the tree are the complete partitions. One of them is optimal.
[Figure: search tree whose nodes are partial partitions of vertices 1, 2 and 3]
28 Definitions
- A node of the search tree is denoted by k, while a vertex of the graph is denoted by x.
- A vertex that is already assigned to one of the subsets is called an assigned vertex.
- Each of the other vertices is a free vertex. Free vertices are unsolved subgoals.
- Given a node k of the search tree we define:
- g(k): the number of edges that already cross the partial partition due to assigned vertices.
- h(k): a lower bound on the number of edges that will cross the given partition due to free vertices.
29 A heuristic from the free vertices
- The free vertices have many edges connected to them.
- Can we have an estimation of the number of such edges that must cross the partition?
[Figure: free vertices 1-4 and their edges]
30 More definitions
- The subsets of the partial partition are A and B.
- Each of the following heuristics completes the partition with A' and B'.
- We can guess about A' and B'.
- Types of the edges:
- I: Edges inside A and A' (or inside B and B').
- II: Edges from A to B.
- III: Edges from A to B' (or from B to A').
- IV: Edges from A' to B'.
[Figure: example graph on vertices 1-8 with A' = {5,6} and B' = {7,8}; edges are labeled by types I-IV]
31 f0: Uniform Cost Search
- f0(k) = g(k).
- Edges that already cross the partition, i.e. edges of type II.
- Mainly for comparison reasons.
[Figure: example graph on vertices 1-8 split into assigned and free vertices, with a type II edge between A and B]
32 f1: Adding edges of type III
- For each free vertex x we define d(x,A) as the number of edges from x to A and d(x,B) as the number of edges from x to B.
- An admissible heuristic for a vertex x is h1(x) = min(d(x,A), d(x,B)).
- h1(k) = the sum of h1(x) over all free vertices x.
- f1(k) = g(k) + h1(k).
[Figure: a free vertex x connected to assigned vertices 1-4 in subsets A and B]
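The definitions of g(k) and h1(k) translate directly into a few lines of code. The sketch below is illustrative: the edge-list representation, the function name, and the 6-vertex example graph are assumptions made for the example.

```python
def g_and_h1(edges, A, B, free):
    """Return (g, h1) for a partial partition with assigned subsets A and B."""
    # g: edges that already cross the partial partition (type II)
    g = sum(1 for u, v in edges if (u in A and v in B) or (u in B and v in A))
    h1 = 0
    for x in free:
        # d(x, A) and d(x, B): edges from the free vertex x to each assigned subset
        d_a = sum(1 for u, v in edges if (u == x and v in A) or (v == x and u in A))
        d_b = sum(1 for u, v in edges if (u == x and v in B) or (v == x and u in B))
        # wherever x ends up, at least min(d_a, d_b) of these edges must cross
        h1 += min(d_a, d_b)
    return g, h1

# Hypothetical example: vertices 1 and 2 are assigned, 3-6 are free.
EDGES = [(1, 2), (1, 3), (2, 4), (3, 4), (1, 5), (2, 5), (3, 6), (5, 6)]
g, h1 = g_and_h1(EDGES, A={1}, B={2}, free={3, 4, 5, 6})
f1 = g + h1
```

In this example only the edge (1, 2) already crosses, so g = 1. Vertex 5 has one edge to each subset and contributes 1 to h1, while the other free vertices contribute 0, giving f1 = 2.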
33
- Results for other graphs, as well as with IDA*, were very similar.
- A better heuristic solves the problem faster.
34
- f3 is faster than f0 by a factor of almost 10,000 for graphs with density of 6.
- f3 is faster than f1 by a factor of 100 for a graph with density of 20.
35
- Graphs of size 100. Solved by f3 only.
- Once again, as the density of the graph increases, the optimal cut increases linearly and the time to solve the problem increases exponentially.
36 Other domains
- Traveling salesman problem (Korf 1996).
- Number partitioning (Korf 1998).
- Bin-packing (Korf 2002).
- Rectangle packing (Korf 2004).
- Multiple sequence alignment (Hansen & Zhou 2004).
37 Best Usage of Memory
- Given 1 gigabyte of memory, how do we best use it with pattern databases?
- Holte, Newton, Felner, Meshulam and Furcy (ICAPS-2004) showed that it is better to use many small databases and take their maximum instead of one large database.
- We will present a different (orthogonal) method (Felner, Meshulam & Holte, AAAI-04).
38 Compressing pattern databases (Felner et al., AAAI-04)
- Traditionally, each configuration of the pattern had a unique entry in the PDB.
- Our main claim: nearby entries in PDBs are highly correlated!
- We propose to compress nearby entries by storing their minimum in one entry.
- We show that most of the knowledge is preserved.
- Consequences: memory is saved and larger patterns can be used, so a speedup in search is obtained.
39 Cliques in the pattern space
- The values in a PDB for a clique are d or d+1.
- In permutation puzzles cliques exist when only one object moves to another location.
- Usually they have nearby entries in the PDB, e.g. in A[4][4][4][4][4].
[Figure: a clique in TOH4 whose nodes are at distance d or d+1 from the goal G]
40 Compressing cliques
- Assume a clique of size K with values d or d+1.
- Store only one entry (instead of K) for the clique, with the minimum d. We lose at most 1.
- A[4][4][4][4][4] becomes A[4][4][4][4][1].
- Instead of 4^p we need only 4^(p-1) entries.
- This can be generalized to a set of nodes with diameter D (for cliques D = 1).
- A[4][4][4][4][4] becomes A[4][4][4][1][1].
- In general, compressing by k disks reduces memory requirements from 4^p to 4^(p-k).
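The compression step can be tried on a small TOH4 database. This is a sketch under assumptions: a 6-disk pattern, a tuple encoding in which index 0 is the smallest disk (so the compressed index is the first element rather than the last array dimension), and helper names invented for the example.

```python
from collections import deque

PEGS = 4

def moves(state):
    """state[d] is the peg holding disk d; disk 0 is the smallest."""
    top = {}
    for disk, peg in enumerate(state):
        top.setdefault(peg, disk)              # first disk seen on a peg is its top disk
    for peg, disk in top.items():
        for target in range(PEGS):
            if target != peg and top.get(target, len(state)) > disk:
                yield state[:disk] + (target,) + state[disk + 1:]

def build_pdb(p, goal_peg=PEGS - 1):
    """Uncompressed PDB with 4^p entries, built by BFS from the goal."""
    goal = (goal_peg,) * p
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        s = queue.popleft()
        for t in moves(s):
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return dist

def compress(pdb, k):
    """Drop the k smallest disks from the index and keep the minimum of each group."""
    small = {}
    for state, d in pdb.items():
        key = state[k:]
        small[key] = min(d, small.get(key, d))
    return small

P = 6                                          # hypothetical pattern size
FULL = build_pdb(P)
BY_ONE = compress(FULL, 1)                     # cliques: smallest disk on any of 4 pegs
BY_TWO = compress(FULL, 2)
```

States that differ only in the position of the smallest disk form a clique, since that disk can always move to any other peg in one step. Taking the minimum keeps the heuristic admissible and, for `BY_ONE`, loses at most 1 per entry.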
41 TOH4 results: 16 disks (14+2)
- Memory was reduced by a factor of 1000 at a cost of only a factor of 2 in the search effort.
42 TOH4: larger versions
- Lossless compression is not efficient in this domain.
- For the 17-disk problem a speedup of 3 orders of magnitude is obtained!
- The 18-disk problem can be solved in 5 minutes!
43 Tile Puzzles
- Storing PDBs for the tile puzzle:
- (Simple mapping) A multi-dimensional array: A[16][16][16][16][16], size = 1.04MB.
- (Packed mapping) A one-dimensional array: A[16*15*14*13*12], size = 0.52MB.
- Time versus memory tradeoff!
[Figure: goal state and a clique in the tile puzzle]
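The two mappings can be compared with a short sketch. The function names and the mixed-radix ranking used for the packed mapping are assumptions chosen for illustration; the slide does not say how the packed index is computed.

```python
from itertools import permutations

def simple_index(locs, n=16):
    """Simple mapping: one base-n digit per tile location, n^k table entries."""
    idx = 0
    for loc in locs:
        idx = idx * n + loc
    return idx

def packed_index(locs, n=16):
    """Packed mapping: rank distinct locations into n*(n-1)*...*(n-k+1) entries."""
    idx = 0
    for i, loc in enumerate(locs):
        # locations taken by earlier tiles are unavailable, so shift the rank down
        rank = loc - sum(1 for prev in locs[:i] if prev < loc)
        idx = idx * (n - i) + rank
    return idx

SIMPLE_SIZE = 16 ** 5                          # 1,048,576 entries
PACKED_SIZE = 16 * 15 * 14 * 13 * 12           # 524,160 entries

# On a small case the packed mapping hits every index exactly once.
small_case = {packed_index(p, n=6) for p in permutations(range(6), 3)}
```

The packed table is half the size because it has no entries for two tiles sharing a location. It pays for this with the extra counting work in every lookup, which is the time versus memory tradeoff noted on the slide.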
44 15 puzzle results
- A clique in the tile puzzle is of size 2.
- We compressed the last index by two: A[16][16][16][16][8].
45 24 puzzle
- The same tendencies were obtained for the 24 puzzle.
- The 6-6-6-6 partitioning is so good that adding another set of 6-6-6-6 did not speed up the search.
- We have also tried a 7-7-5-5 partitioning, but it did not speed up the search.
46 Ongoing and future work
- An item for the PDB of tiles (a,b,c,d) is in the form <La, Lb, Lc, Ld> = d.
- Store the PDBs in a trie.
- A PDB of 5 tiles will have a level in the trie for each tile. The values will be in the leaves of the trie.
- This data structure will enable flexibility and will save memory, as subtrees of the trie can be pruned.
47 Trie pruning
- Simple (lossless) pruning: fold leaves with exactly the same values.
- No data will be lost.
[Figure: leaves with the identical values 2, 2, 2, 2, 2 folded into one]
48 Trie pruning
- Intelligent (lossy) pruning:
- Fold leaves/subtrees which are correlated to each other (many options for this!).
- Some data will be lost.
- Admissibility is still kept.
[Figure: leaves with the values 2, 2, 2, 2, 4 folded into one]
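Both kinds of pruning can be sketched on a trie stored as nested dictionaries. The representation, the `tolerance` parameter, and the rule "fold when the leaf values differ by at most the tolerance" are assumptions; the slides leave the lossy criterion open.

```python
def fold(node, tolerance=0):
    """Fold subtrees bottom-up.

    tolerance=0 folds only leaves with identical values (lossless).
    tolerance>0 also folds leaves that differ by at most `tolerance` and
    stores their minimum (lossy); loss can accumulate across levels.
    """
    if not isinstance(node, dict):
        return node                            # already a leaf value
    children = {key: fold(child, tolerance) for key, child in node.items()}
    values = list(children.values())
    all_leaves = all(not isinstance(v, dict) for v in values)
    if values and all_leaves and max(values) - min(values) <= tolerance:
        return min(values)                     # the minimum keeps the lookup admissible
    return children

def lookup(trie, key):
    """Walk down one level per tile location; stop early at a folded subtree."""
    node = trie
    for part in key:
        if not isinstance(node, dict):
            break
        node = node[part]
    return node

# Hypothetical two-level trie: the first subtree is uniform, the second is not.
TRIE = {0: {0: 2, 1: 2, 2: 2, 3: 2},
        1: {0: 2, 1: 2, 2: 2, 3: 4}}
LOSSLESS = fold(TRIE)
LOSSY = fold(TRIE, tolerance=2)
```

With lossless folding the first subtree collapses to the single value 2 and every lookup is unchanged. With a tolerance of 2 the whole trie collapses to 2, so the entry that was 4 now returns 2, which is still a lower bound.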
49 Trie: Initial Results
A 5-5-5 partitioning stored in a trie with simple folding.
50 Neural Networks (NN)
- We can feed a PDB into a neural network engine, in particular the addition above MD (Manhattan Distance).
- For each tile we focus on its dx and dy from its goal position (i.e. MD).
- Linear conflict:
- dx1 = dx2 = 0
- dy1 > dy2 + 1
- A NN can learn these rules.
[Figure: linear conflict between tiles 1 and 2 with dy1 = 2 and dy2 = 0]
51 Neural network
- We train the NN by feeding it the entire pattern space (or part of it).
- For example, for a pattern of 5 tiles we have 10 features, 2 for each tile.
- During the search, given the locations of the tiles, we look them up in the NN.
52 Neural network example
[Figure: network layout for the pattern of the tiles 4, 5 and 6, with input features dx4, dy4, dx5, dy5, dx6 and dy6]
53 Neural Network problems
- We face the problem of overestimating and will have to bias the results towards underestimating.
- We keep the overestimating values in a separate hash table.
- Results are encouraging!
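The hash table of overestimating values can be sketched independently of any network. In the code below `predict` is a stand-in for a trained NN and the tiny PDB is made up; both are assumptions for illustration only, and no training or biasing step is shown.

```python
def build_exceptions(pdb, predict):
    """Collect every pattern whose prediction overestimates the true PDB value."""
    return {state: d for state, d in pdb.items() if predict(state) > d}

def admissible_h(state, predict, exceptions):
    """Use the stored exact value where the predictor is known to overestimate."""
    if state in exceptions:
        return exceptions[state]
    return predict(state)

# Hypothetical toy data: keys are (dx, dy) features, values are exact costs.
PDB = {(0, 0): 0, (1, 0): 1, (0, 1): 1, (1, 1): 4, (2, 0): 2}

def predict(state):
    """Stand-in for the network: overestimates only on (2, 0)."""
    return state[0] + state[1] + (1 if state == (2, 0) else 0)

EXCEPTIONS = build_exceptions(PDB, predict)
```

Only the overestimated entries need to be stored, so the table stays small when the predictor rarely overestimates. Underestimates such as (1, 1) lose information but do not break admissibility.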
54 Ongoing and Future Work
- Dual pattern databases.
- VARIABLES versus VALUES.
- In the tile puzzles, locations are variables and tiles are the values.
- We ask: who is located in location X?
- Swap their roles and ask: where is tile X located?
55 Dual pattern databases
- In regular PDBs we asked:
- Where are tiles <1,2,3,4> located, and what does it take to move them to their goal positions?
- We now ask:
- Who is located in positions <1,2,3,4>, and what does it take to distribute them to their goals?
- The same table answers both questions!
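The two questions differ only in how the lookup index is formed. The sketch below computes both indices for a permutation state; the list encoding `tile_at[location]` and the function names are assumptions, and whether the dual value is admissible in a given domain is not addressed here.

```python
def regular_index(tile_at, pattern):
    """Where are the pattern tiles located?"""
    loc_of = {tile: loc for loc, tile in enumerate(tile_at)}
    return tuple(loc_of[t] for t in pattern)

def dual_index(tile_at, pattern):
    """Who is located in the pattern positions?"""
    return tuple(tile_at[p] for p in pattern)

def inverse(tile_at):
    """Swap the roles of locations (variables) and tiles (values)."""
    inv = [0] * len(tile_at)
    for loc, tile in enumerate(tile_at):
        inv[tile] = loc
    return inv

STATE = [2, 0, 3, 1, 4]                        # hypothetical state: tile_at[location]
PATTERN = (1, 2, 3)
regular = regular_index(STATE, PATTERN)
dual = dual_index(STATE, PATTERN)
```

The dual index of a state equals the regular index of its inverse permutation, which is why one table can serve both lookups.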
56 Search in Artificial Intelligence course
- Course number 37214506.
- Semester B.
- Monday at 17:00-18:30.
- At Ben-Gurion University.
- Open for CS students.