Heuristic Search in Artificial Intelligence: Recent Enhancements and Applications


1
Heuristic Search in Artificial Intelligence
Recent Enhancements and Applications
  • Ariel Felner
  • ISE Department
  • Ben-Gurion University.
  • felner@bgu.ac.il

2
Optimal path search algorithms
  • For small graphs provided explicitly, algorithms
    such as Dijkstra's shortest path, Bellman-Ford or
    Floyd-Warshall apply. Complexity: O(n²).
  • For very large graphs, which are implicitly
    defined, the A* algorithm, which is a best-first
    search algorithm, is used.

3
Best-first search schema
  • Sorts all generated nodes in an OPEN-LIST and
    chooses the node with the best cost value for
    expansion.
  • generate(x): insert x into the OPEN-LIST.
  • expand(x): delete x from the OPEN-LIST and generate
    its children.
  • BFS depends on its cost (heuristic) function.
    Different functions cause BFS to expand different
    nodes. A minimal sketch of the schema is given
    after the diagram below.

[Diagram: an OPEN-LIST holding nodes with cost values 20, 25, 30, 35, 30, 35, 35, 40]
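A minimal sketch of the schema in Python (an illustration, not the author's code). The cost function f(g, x) is pluggable, which also reproduces the variants on the next slide: f = lambda g, x: g gives Dijkstra's algorithm, f = lambda g, x: h(x) gives PHS, and f = lambda g, x: g + h(x) gives A*.

```python
import heapq
import itertools

def best_first_search(start, is_goal, successors, f):
    """successors(x) yields (child, edge_cost) pairs; f(g, x) is the cost function."""
    tie = itertools.count()                     # tie-breaker so states never compare
    open_list = [(f(0, start), 0, next(tie), start)]   # generate(start)
    best_g = {start: 0}
    while open_list:
        _, g, _, x = heapq.heappop(open_list)   # expand(x): best cost value first
        if is_goal(x):
            return g                            # cost of the path found
        if g > best_g[x]:
            continue                            # stale OPEN-LIST entry, skip it
        for child, cost in successors(x):       # generate the children of x
            g2 = g + cost
            if g2 < best_g.get(child, float("inf")):
                best_g[child] = g2
                heapq.heappush(open_list, (f(g2, child), g2, next(tie), child))
    return None                                 # no path to a goal exists
```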
4
Best-first search: Cost functions
  • g(x): the real distance from the initial state to x.
  • h(x): the estimated remaining distance from x to
    the goal state.
  • Examples: air distance, Manhattan Distance.
  • Different cost combinations of g and h:
  • f(x) = level(x): Breadth-First Search.
  • f(x) = g(x): Dijkstra's algorithm.
  • f(x) = h(x): Pure Heuristic Search (PHS).
  • f(x) = g(x) + h(x): The A* algorithm (1968).

5
A*
  • A* is a best-first search algorithm that uses
    f(n) = g(n) + h(n) as its cost function.
  • f(x) in A* is an estimation of the cost of the
    shortest path to the goal via x.
  • A* is admissible, complete and optimally
    effective [Pearl 84].
  • Result: any other optimal search algorithm will
    expand at least all the nodes expanded by A*.

[Diagram: nodes expanded by Breadth-First Search vs. A*]
6
How to improve search?
  • Enhanced algorithms: Perimeter-search, RBFS,
    Frontier-search, etc. They all try to better
    explore the search tree.
  • Better heuristics: more parts of the search tree
    will be pruned.
  • In the 3rd millennium we have very large
    memories.
  • We can build large tables.
  • For enhanced algorithms: large open-lists or
    transposition tables. They store nodes
    explicitly.
  • A more intelligent way is to store general
    knowledge. We can do this with heuristics.

7
Pattern databases
  • Many problems can be decomposed into subproblems
    (patterns) that must also be solved.
  • The cost of a solution to a subproblem is a
    lower bound on the cost of the complete solution.
  • Instead of calculating the lower bounds on the
    fly, we expand the whole pattern space and store
    the solution to each pattern configuration in a
    pattern database (a construction sketch follows).

[Diagram: a mapping function projects the search space onto the smaller pattern space]
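A sketch of how such a table can be built, assuming unit-cost moves: a breadth-first search backwards from the goal pattern enumerates the whole pattern space. The names goal_pattern and pattern_successors are illustrative; a real PDB would index a packed array rather than a dict.

```python
from collections import deque

def build_pdb(goal_pattern, pattern_successors):
    pdb = {goal_pattern: 0}              # solution cost of each pattern configuration
    queue = deque([goal_pattern])
    while queue:
        p = queue.popleft()
        for q in pattern_successors(p):  # moves in the abstract pattern space
            if q not in pdb:             # first visit = shortest distance
                pdb[q] = pdb[p] + 1
                queue.append(q)
    return pdb

# During the search: h(x) = pdb[mapping(x)], where mapping() projects a state
# of the original search space onto the pattern space.
```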
8
Non-additive pattern databases
  • 15 puzzle: ~10^13 states.
  • Fringe pattern database [Culberson & Schaeffer
    1996]: has only 259 million states.
  • Improvement of a factor of 100 over Manhattan
    Distance.


9
Rubik's Cube (Korf 1997)
  • Has ~10^19 states.
  • A PDB of the corner cubies has only 88 million
    states.
  • Korf [AAAI-97] built 2 other pattern databases
    for this domain.
  • The best way to combine different non-additive
    pattern databases is to take their maximum!

10
Disjoint Additive Databases
  • 15 and 24 puzzles: Korf & Felner [AIJ-02],
    Felner, Korf & Hanan [JAIR-04].

[Diagram: the 7-8 static partitioning of the 15 puzzle]
  • Values of disjoint databases can be added for the
    heuristic.
  • Better than maxing heuristics (both combinations
    are sketched below).

[Diagram: the 6-6-6-6 static partitioning of the 24 puzzle]
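The two ways of combining PDB lookups from the last two slides, as a sketch; pdbs and mappings are assumed to be parallel lists of tables and projection functions.

```python
def h_max(state, pdbs, mappings):
    # Admissible for any collection of PDBs (e.g. Rubik's Cube).
    return max(pdb[m(state)] for pdb, m in zip(pdbs, mappings))

def h_additive(state, pdbs, mappings):
    # Admissible only for disjoint patterns, where no move is counted twice
    # (e.g. the 7-8 or 6-6-6-6 tile partitionings); stronger than h_max.
    return sum(pdb[m(state)] for pdb, m in zip(pdbs, mappings))
```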
11
Dynamically-partitioned additive databases
  • Statically-partitioned databases do not capture
    conflicts of tiles from different patterns.
  • We want to store as many pattern databases as
    possible and partition them into disjoint
    subproblems on the fly, such that the chosen
    partition yields the best heuristic.
  • This is called dynamically partitioning PDBs
    (a sketch of the pairwise case follows).
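For pairwise databases, choosing the best disjoint partition on the fly amounts to a maximum weight matching problem. A sketch under that reading; pair_cost and the use of networkx are illustrative assumptions, not the paper's implementation.

```python
import networkx as nx

def dynamic_pair_heuristic(tiles, pair_cost):
    # pair_cost[(a, b)] with a < b: the pairwise PDB value for tiles a and b
    # in the current state.
    g = nx.Graph()
    for i, a in enumerate(tiles):
        for b in tiles[i + 1:]:
            g.add_edge(a, b, weight=pair_cost[(min(a, b), max(a, b))])
    matching = nx.max_weight_matching(g)   # disjoint pairs, maximal total value
    return sum(pair_cost[(min(a, b), max(a, b))] for a, b in matching)
```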

12
Experimental Results: 15 puzzle
[Table: results comparing the fives, sixes, and seven-eight partitionings]
13
Results: 24 puzzle
  • For the 24 puzzle we compared the SDB of sixes
    with the DDB of pairs and triples on 10 random
    instances.
  • The relative advantage of the SDB decreases when
    the problem scales up.
  • What will happen for the 6x6 35 puzzle?

14
35 puzzle
We sampled 10 billion random states and
calculated their heuristic. The table was created
by the method presented by Korf, Reid and
Edelkamp [AIJ 129, 2001].

15
Tile puzzles: Summary
  • The relative advantage of the SDB over the DDB
    decreases over time.
  • For the 15 puzzle, 1/2 of the domain is stored.
  • For the 24 puzzle, 1/4 of the domain is stored.
  • For the 35 puzzle, 1/7 of the domain is stored.
  • The memory needed by the DDB was 100 times
    smaller than that of the SDB!

16
4-peg Towers of Hanoi (TOH4)
  • There is a conjecture about the length of the
    optimal path, but it has not been proven.
  • Systematic search is the only way to solve this
    problem or to verify the conjecture.
  • There are too many cycles. IDA* as a DFS will not
    prune these cycles. Therefore, A* (actually
    frontier A* [Korf & Zhang 2000]) was used.

17
Heuristics for the TOH
  • Infinite peg heuristic (INP): each disk moves to
    its own temporary peg.
  • Additive pattern databases
    [Felner, Korf & Hanan, JAIR-04].

18
Additive PDBs for TOH4
  • Partition the disks into disjoint sets.
  • Store the cost of the complete pattern space of
    each set in a pattern database.
  • Add values from these PDBs for the heuristic
    value.
  • The n-disk problem contains 4^n states (see the
    indexing sketch below).
  • The largest database that we stored was of 14
    disks, which needed 4^14 entries = 256MB.

[Diagram: the disks partitioned into groups of 6 and 10]
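A sketch of the indexing this implies: a TOH4 configuration is one peg (0-3) per disk, so an n-disk pattern is simply a base-4 number with 4^n possible values.

```python
def toh4_index(pegs):
    """pegs[i] in {0,1,2,3} is the peg of disk i; returns an index in [0, 4**n)."""
    index = 0
    for p in pegs:
        index = index * 4 + p
    return index

# A 14-disk PDB therefore has 4**14 = 268,435,456 entries; at one byte per
# entry this is the 256MB table mentioned above.
```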
19
TOH4 results
  • The difference between static and dynamic
    partitioning is covered in Felner, Korf & Hanan
    [JAIR-04].

20
Vertex-Cover (VC) [Felner et al., JAIR-04]
  • Given a graph, we want the minimal set of vertices
    such that they cover all the edges.
  • VC was one of the first problems that was proved
    to be NP-complete.
  • Search tree:
  • At each level, either include or exclude a
    vertex.
  • Improvements:
  • If a vertex is excluded, all its neighbors must be
    included.
  • Dealing with degree-0 and degree-1 vertices.

[Diagram: VC search tree on vertices 0-3 — excluding vertex 0 forces 1, 2, 3 into the cover; including it branches further]
21
Depth-first Branch and Bound
  • DFBnB searches the tree from left to right.
  • Expands only subtrees with costs smaller than
    the best solution found so far (sketched below).
  • We also used Iterative Deepening A* (IDA*) [Korf
    85].

[Diagram: a search tree with leaf costs 6-8; subtrees costing no less than the best solution so far are pruned]
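A minimal DFBnB sketch; children, cost and is_leaf are illustrative names, and cost(node) is assumed to be an admissible f-value at internal nodes and the exact cost at leaves.

```python
def dfbnb(node, children, cost, is_leaf, best=float("inf")):
    if cost(node) >= best:
        return best                    # prune: this subtree cannot beat the best
    if is_leaf(node):
        return cost(node)              # a new best (complete) solution
    for child in children(node):       # left-to-right traversal of the tree
        best = dfbnb(child, children, cost, is_leaf, best)
    return best
```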
22
Heuristics for VC
  • The included vertices form the g part of f = g + h.
  • We want an admissible heuristic on the free
    vertices.
  • Pairwise heuristic:
  • A maximum matching of the free graph.
  • For a triangle we can add two to the heuristic.
  • In general, a clique of size k contributes k-1 to
    h.
  • So: partition the free graph into disjoint
    cliques and sum up their heuristics (a greedy
    sketch follows).

[Diagram: a VC example — a free graph on vertices 1, 2, 3, 4]
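A greedy sketch of this heuristic. The greedy clique construction is an illustrative assumption, not the paper's exact procedure; adj maps each vertex to its set of neighbors in the free graph.

```python
def clique_heuristic(free_vertices, adj, max_size=4):
    used, h = set(), 0
    for v in free_vertices:
        if v in used:
            continue
        clique = [v]
        for u in adj[v]:
            if u in used or len(clique) == max_size:
                continue
            if all(u in adj[w] for w in clique):  # u is adjacent to the whole clique
                clique.append(u)
        used.update(clique)
        h += len(clique) - 1                      # a clique of size k contributes k-1
    return h
```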
23
Additive pattern databases
  • Clique is NP-complete. However, in random
    graphs, cliques of size 5 and larger are rare.
    Thus, it is easy to find small cliques.
  • Pattern databases: instead of finding the cliques
    on the fly, we identify them before the search and
    store them in a pattern database. We stored
    cliques of size 4 or smaller.
  • During the search we need to retrieve disjoint
    cliques from the pattern database.

24
VC results
  • The results are on random graphs of size 150 and
    an average degree of 16.
  • When we added our dynamic database to the best
    proven tree-search algorithm, we further improved
    the running time by a factor of more than 10.

25
Conclusions and Summary
  • In general, additivity can be applied whenever a
    problem can be decomposed into disjoint
    subproblems such that the sum of the costs is a
    lower bound on the cost of the complete problem.
  • Additive databases are a special case of additive
    heuristics where we save the heuristics in a
    table.

26
The Graph Partitioning Problem [Felner, AMAI-2004]
  • Given a graph G(V,E), the problem is to partition
    the graph into two equal-sized subsets of
    vertices.
  • The number of edges that cross the
    partition should be minimized.
  • The partition in the graph on the right is of
    cost 2.

27
GPP as a search problem
  • A subproblem in GPP is to assign a vertex to one
    of the subsets of the partition.
  • Each level of the search tree corresponds to a
    specific vertex of the graph.
  • Each branch assigns the vertex to a different
    subset of the partition.

  • Each node of the tree is a partial partition
    including some of the vertices.
  • Size of the tree: 2^n.

[Diagram: the search tree of partial partitions over vertices 1, 2, 3]
  • Leaves of the tree are the complete partitions.
    One of them is optimal.

28
Definitions
  • A node of the search tree is denoted by k, while a
    vertex of the graph is denoted by x.
  • A vertex that is already assigned to one of the
    subsets is called an assigned vertex.
  • Each of the other vertices is a free vertex.
    Free vertices are unsolved subgoals.
  • Given a node k of the search tree we define:
  • g(k): the number of edges that already cross
    the partial partition due to assigned vertices.
  • h(k): a lower bound on the number of edges
    that will cross the given partition due to free
    vertices.

29
A heuristic from the free vertices
  • The free vertices have many edges connected to
    them.
  • Can we get an estimate of the number of such
    edges that must cross the partition?

[Diagram: a partial partition with free vertices 1, 2, 3, 4]
30
  • More definitions:
  • The subsets of the partial partition are A and B.
  • Each of the following heuristics completes the
    partition with A' and B'.
  • We can guess about A' and B'.
  • Types of the edges:
  • I: Edges inside A ∪ A' or inside B ∪ B'.
  • II: Edges from A to B.
  • III: Edges from A' to B or from B' to A.
  • IV: Edges from A' to B'.
[Diagram: assigned vertices A = {1,2}, B = {3,4}; free vertices A' = {5,6}, B' = {7,8}; the four edge types I-IV illustrated]
31
f0: Uniform Cost Search
  • f0(k) = g(k).
  • Edges that already cross the partition: edges of
    type II.
  • Mainly for comparison reasons.

[Diagram: assigned subsets A = {1,2} and B = {3,4}; only the type II edges between them are counted]
32
f1: Adding edges of type III
  • For each free vertex x we define d(x,A) as the
    number of edges from x to A, and d(x,B) as the
    number of edges from x to B.

  • An admissible heuristic for a vertex x is
    h1(x) = min{d(x,A), d(x,B)}.
  • h1(k) = the sum of h1(x) over all free vertices x.
  • f1(k) = g(k) + h1(k) (sketched below).

[Diagram: a free vertex x with edges to A = {1,2} and to B = {3,4}]
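A sketch of f1; adj, the assigned subsets A and B, and the free set are illustrative parameters.

```python
def f1(g, A, B, free, adj):
    h1 = 0
    for x in free:
        d_xA = sum(1 for y in adj[x] if y in A)   # d(x, A)
        d_xB = sum(1 for y in adj[x] if y in B)   # d(x, B)
        h1 += min(d_xA, d_xB)   # whichever side x joins, the other side's edges cross
    return g + h1               # g counts the type II edges that already cross
```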
33
  • Results for other graphs, as well as results using
    IDA*, were very similar.
  • A better heuristic solves the problem faster.
34
  • f3 is faster than f0 by a factor of almost 10,000
    for graphs with a density of 6.
  • f3 is faster than f1 by a factor of 100 for a
    graph with a density of 20.

35
  • Graphs of size 100. Solved by f3 only.
  • Once again, as the density of the graph increases,
    the optimal cut increases linearly and the time
    to solve the problem increases exponentially.

36
Other domains
  • Traveling salesman problem [Korf 1996]
  • Number partitioning [Korf 1998]
  • Bin packing [Korf 2002]
  • Rectangle packing [Korf 2004]
  • Multiple sequence alignment [Hansen & Zhou 2004]

37
Best Usage of Memory
  • Given 1 gigabyte of memory, how do we best use
    it with pattern databases?
  • Holte, Newton, Felner, Meshulam and Furcy
    [ICAPS-2004] showed that it is better to use many
    small databases and take their maximum instead of
    one large database.
  • We will present a different (orthogonal) method
    [Felner, Mushlam & Holte, AAAI-04].

38
Compressing pattern databases [Felner et al.,
AAAI-04]
  • Traditionally, each configuration of the pattern
    had a unique entry in the PDB.
  • Our main claim:
  • Nearby entries in PDBs are highly correlated!
  • We propose to compress nearby entries by storing
    their minimum in one entry.
  • We show that:
  • most of the knowledge is preserved.
  • Consequences: memory is saved, larger patterns
    can be used, and a speedup in search is obtained.

39
Cliques in the pattern space
  • The values in a PDB for a clique are d or d+1.
  • In permutation puzzles, cliques exist when only
    one object moves to another location.

[Diagram: a clique in TOH4 — states at distance d or d+1 from the goal G]
  • Usually they have nearby entries in the PDB,
    e.g. consecutive entries of the array A[4][4][4][4][4].

40
Compressing cliques
  • Assume a clique of size K with values d or d+1.
  • Store only one entry (instead of K) for the
    clique, with the minimum d. We lose at most 1.
  • A[4][4][4][4][4] → A[4][4][4][4][1]
  • Instead of 4^p entries we need only 4^(p-1).
  • This can be generalized to a set of nodes with
    diameter D (for cliques, D = 1).
  • A[4][4][4][4][4] → A[4][4][4][1][1]
  • In general, compressing by k disks reduces memory
    requirements from 4^p to 4^(p-k) (see the sketch
    below).
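A sketch of this compression, assuming a flat PDB indexed in the base-4 scheme above with the k smallest disks in the least significant digits.

```python
def compress(pdb, k):
    block = 4 ** k                        # entries folded into a single entry
    return [min(pdb[i:i + block]) for i in range(0, len(pdb), block)]

def lookup(compressed_pdb, index, k):
    # Drop the last k base-4 digits; the stored minimum keeps the heuristic
    # admissible, losing at most the diameter of the folded block.
    return compressed_pdb[index >> (2 * k)]
```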

41
TOH4 results: 16 disks (14+2)
  • Memory was reduced by a factor of 1000 at a
    cost of only a factor of 2 in the search effort.


42
TOH4: larger versions
Memory was reduced by a factor of 1000 at a
cost of only a factor of 2 in the search
effort. Lossless compression is not efficient in
this domain.
  • For the 17-disk problem, a speedup of 3 orders
    of magnitude is obtained!
  • The 18-disk problem can be solved in 5 minutes!

43
Tile Puzzles
[Diagram: the goal state of the 15 puzzle and a clique of two neighboring states]
  • Storing PDBs for the tile puzzle:
  • (Simple mapping) A multi-dimensional array:
    A[16][16][16][16][16], size 16^5 entries ≈ 1.04Mb.
  • (Packed mapping) A one-dimensional array of size
    16·15·14·13·12 entries ≈ 0.52Mb.
  • Time versus memory tradeoff (both mappings are
    sketched below)!
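Both mappings sketched for a 5-tile PDB; locs holds the locations (0-15) of the pattern tiles.

```python
def simple_index(locs):
    # Multi-dimensional mapping: array size 16**5 = 1,048,576 entries.
    index = 0
    for loc in locs:
        index = index * 16 + loc
    return index

def packed_index(locs):
    # Packed mapping: rank each location among the still-unused locations,
    # giving 16*15*14*13*12 = 524,160 entries, at the cost of extra work.
    remaining = list(range(16))
    index = 0
    for i, loc in enumerate(locs):
        r = remaining.index(loc)
        index = index * (16 - i) + r
        remaining.pop(r)
    return index
```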

44
15 puzzle results
  • A clique in the tile puzzle is of size 2.
  • We compressed the last index by two:
    A[16][16][16][16][8].


45
24 puzzle
  • The same tendencies were obtained for the 24
    puzzle.
  • The 6-6-6-6 partitioning is so good that adding
    another set of 6-6-6-6 did not speed up the
    search.
  • We also tried a 7-7-5-5 partitioning, but it
    did not speed up the search either.

46
Ongoing and future work
  • An item for the PDB of tiles (a,b,c,d) is of the
    form <La, Lb, Lc, Ld> → d.
  • Store the PDBs in a trie.
  • A PDB of 5 tiles will have a level in the trie
    for each tile. The values will be in the leaves
    of the trie.
  • This data structure will enable flexibility and
    will save memory, as subtrees of the trie can be
    pruned.

47
Trie pruning
Simple (lossless) pruning: fold leaves with
exactly the same values.
No data will be lost (a sketch follows).
[Diagram: five leaves, all holding the value 2, folded into one leaf]
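A sketch of lossless folding, representing an internal trie node as a list of children and a leaf as an integer heuristic value (this layout is an assumption).

```python
def fold(node):
    if isinstance(node, int):
        return node                       # a leaf holding a heuristic value
    children = [fold(c) for c in node]
    if all(isinstance(c, int) and c == children[0] for c in children):
        return children[0]                # identical leaf values: keep just one
    return children
```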
48
Trie pruning
  • Intelligent (lossy) pruning:
  • Fold leaves/subtrees which are correlated with
    each other (many options for this!).
  • Some data will be lost.
  • Admissibility is still kept.

[Diagram: leaves with values 2, 2, 2, 2, 4 folded into one leaf holding their minimum, 2]
49
Trie: Initial Results
A 5-5-5 partitioning stored in a trie with simple
folding.
50
Neural Networks (NN)
  • We can feed a PDB into a neural network engine,
    especially the additions above MD.
  • For each tile we focus on its dx and dy from its
    goal position (i.e., its MD components).
  • Linear conflict:
  • dx1 = dx2 = 0
  • dy1 > dy2
  • An NN can learn these rules.

[Diagram: tiles 1 and 2 in the same column, with dy1 = 2 and dy2 = 0]
51
Neural network
  • We train the NN by feeding it the entire (or part
    of the) pattern space.
  • For example, for a pattern of 5 tiles we have 10
    features, 2 for each tile (see the sketch below).
  • During the search, given the locations of the
    tiles, we look them up in the NN.
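A sketch of this feature encoding on the standard 4x4 board (the names and geometry are assumptions): two features per tile, its horizontal and vertical displacement from the goal.

```python
def features(pattern_tiles, location, goal_location, width=4):
    feats = []
    for t in pattern_tiles:
        x, y = location[t] % width, location[t] // width
        gx, gy = goal_location[t] % width, goal_location[t] // width
        feats.extend([abs(x - gx), abs(y - gy)])   # dx and dy: the MD components
    return feats   # 10 inputs for a 5-tile pattern
```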

52
Neural network example
[Diagram: network layout for the pattern of tiles 4, 5 and 6, with inputs dx4, dy4, dx5, dy5, dx6, dy6]
53
Neural Network problems
  • We face the problem of overestimating and have
    to bias the results towards underestimating.
  • We keep the overestimating values in a separate
    hash table.
  • Results are encouraging!

54
Ongoing and Future Work
  • Dual pattern databases:
  • VARIABLES versus VALUES.
  • In the tile puzzles, locations are the variables
    and tiles are the values.
  • We usually ask: who is located in location X?
  • Swap their roles and ask: where is tile X
    located?

55
Dual pattern databases
  • In regular PDBs we asked:
  • Where are tiles <1,2,3,4> located, and what
    does it take to move them to their goal positions?
  • We now ask:
  • Who is located in positions <1,2,3,4>, and
    what does it take to distribute them to their
    goals?
  • The same table answers both questions (a lookup
    sketch follows)!
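A sketch of the two lookups on the same table, assuming the state is a permutation with perm[location] = tile; the dual question is then answered by the inverse permutation. This is an illustration of the idea, not an established implementation.

```python
def inverse(perm):
    inv = [0] * len(perm)
    for loc, tile in enumerate(perm):
        inv[tile] = loc                 # where is tile located?
    return inv

def h_with_dual(perm, pdb_lookup):
    regular = pdb_lookup(perm)          # where are tiles <1,2,3,4> located?
    dual = pdb_lookup(inverse(perm))    # who is located in positions <1,2,3,4>?
    return max(regular, dual)           # take the larger admissible estimate
```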

56
Search in Artificial Intelligence course
  • Course number 37214506.
  • Semester B.
  • Monday at 17:00-18:30.
  • At Ben-Gurion University.
  • Open to CS students.