Compressing Pattern Databases - PowerPoint PPT Presentation

Title: Compressing Pattern Databases


1
Compressing Pattern Databases
  • Ariel Felner
  • Bar-Ilan University.
  • felner_at_cs.biu.ac.il
  • March 2004
  • Joint work with Ram Meshulam, Robert Holte and
    Richard E. Korf
  • Submitted to AAAI-04.
  • Available at http://www.cs.biu.ac.il/felner

2
A* and its variants
  • A* (and IDA*) is a best-first search algorithm
    that uses f(n) = g(n) + h(n) as its cost function.
    Nodes are sorted in an open list according to
    their f-value.
  • g(n) is the cost of the shortest known path
    between the initial node and the current node n.
  • h(n) is an admissible (lower-bound) heuristic
    estimate of the cost from n to the goal node.
  • Recently, the attention has shifted towards
    creating more accurate heuristic functions.
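
The cost function above can be illustrated with a minimal A* sketch. The names a_star, neighbors and h are illustrative and not taken from the slides; neighbors(node) is assumed to yield (successor, edge cost) pairs and h is assumed to be admissible.

```python
import heapq

def a_star(start, goal, neighbors, h):
    """Return the cost of a cheapest path from start to goal, or None."""
    open_list = [(h(start), 0, start)]   # entries are (f, g, node)
    best_g = {start: 0}                  # cheapest known g per node
    while open_list:
        f, g, node = heapq.heappop(open_list)   # smallest f = g + h first
        if node == goal:
            return g
        if g > best_g.get(node, float("inf")):
            continue                     # stale entry, a cheaper path was found
        for nxt, cost in neighbors(node):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(open_list, (new_g + h(nxt), new_g, nxt))
    return None
```

A more accurate h makes f closer to the true solution cost, so fewer nodes are popped before the goal is reached.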

3
Pattern databases
  • Many problems can be decomposed into subproblems
    (patterns) that must also be solved.
  • The pattern space is a domain abstraction of the
    original space
  • The cost of a solution to a subproblem is a
    lower-bound on the cost of the complete solution
  • Instead of calculating the lower bounds on the
    fly, we expand the whole pattern-space and store
    the solution to each pattern configuration in a
    pattern database

[Figure: a mapping function maps the search space onto the pattern space]
4
Non-additive pattern databases
  • Fringe database for the 15 puzzle (Culberson
    and Schaeffer 1996).
  • Stores the number of moves, including moves of
    tiles not in the pattern.
  • Rubik's Cube (Korf 1997).
  • The best way to combine different non-additive
    pattern databases is to take their maximum!


5
Additive pattern databases
  • We can add values from different pattern
    databases if they are disjoint (and count only
    their own moves).
  • There are two ways to build additive databases:
  • Statically-partitioned additive databases
    (also called disjoint pattern databases)
  • Dynamically-partitioned additive databases
  • Applications of additive pattern databases:
  • Tile puzzles
  • 4-peg Towers of Hanoi puzzle (TOH4)
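
The two ways of combining database values can be sketched as follows. This is only an illustration: h_max, h_additive and tile_distance are hypothetical names, and the one-tile "database" below is simply the Manhattan distance of a single tile, used as the smallest possible disjoint pattern. A state is assumed to be a dict from tile to location on a 4x4 board, with tile t belonging at location t.

```python
def h_max(state, lookups):
    """Non-additive databases: each value is a lower bound, keep the largest."""
    return max(lookup(state) for lookup in lookups)

def h_additive(state, lookups):
    """Disjoint databases that count only their own moves can be summed."""
    return sum(lookup(state) for lookup in lookups)

def tile_distance(tile, width=4):
    """Hypothetical one-tile database: Manhattan distance of `tile` to its goal."""
    def lookup(state):
        loc = state[tile]
        return abs(loc % width - tile % width) + abs(loc // width - tile // width)
    return lookup

state = {1: 2, 2: 1}                      # tiles 1 and 2 have swapped places
lookups = [tile_distance(1), tile_distance(2)]
print(h_max(state, lookups), h_additive(state, lookups))
```

The sum is never smaller than the maximum, which is why additive databases are preferred whenever the patterns are disjoint.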

6
Statically-partitioned additive databases
  • These were created for the 15 and 24 puzzles
    (Korf and Felner 2002).
  • We statically partition the tiles into disjoint
    patterns and compute the cost of moving only
    these tiles into their goal states.

[Figure: a 7-8 partitioning of the 15 puzzle and a 6-6-6-6 partitioning of the 24 puzzle]
  • For the 15 puzzle: 36,710 nodes, 0.027 seconds,
    575 MB
  • For the 24 puzzle: 360,892,479,671 nodes, 2 days,
    242 MB

7
4-peg Towers of Hanoi (TOH4)
  • There is a conjecture about the length of the
    optimal path, but it has not been proven.
  • Systematic search is the only way to solve this
    problem or to verify the conjecture.
  • There are too many cycles. IDA*, as a DFS, will
    not prune these cycles. Therefore, A* (actually
    frontier A*, Korf and Zhang 2000) was used.

8
Additive PDBs for TOH4
  • Partition the disks into disjoint sets
    (patterns), for example 10 and 6 for the 16-disk
    problem.
  • Store the cost of the complete pattern space of
    each set in a pattern database. (There are many
    enhancements.)
  • The n-disk problem contains 4^n states, and 2n
    bits suffice to store each state.
  • The largest database that we stored was of size
    14, which needed 4^14 bytes = 256 MB.
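
A minimal sketch of building such a database, assuming one byte per entry, the 2-bits-per-disk encoding described above, and a goal with all disks on peg 3 (the goal peg and the name build_toh4_pdb are assumptions, and none of the enhancements mentioned in the slides are included):

```python
from collections import deque

def build_toh4_pdb(n):
    """Breadth-first search from the goal over all 4**n states of n disks.

    A state is an integer with 2 bits per disk: bits 2*i and 2*i+1 hold the
    peg (0..3) of disk i, where disk 0 is the smallest.
    """
    goal = 0
    for i in range(n):
        goal |= 3 << (2 * i)              # assumption: goal is all disks on peg 3
    pdb = bytearray([255]) * (4 ** n)     # 255 marks "not reached yet"
    pdb[goal] = 0
    queue = deque([goal])
    while queue:
        state = queue.popleft()
        pegs = [(state >> (2 * i)) & 3 for i in range(n)]
        top = {}                          # peg -> smallest disk on that peg
        for disk in range(n - 1, -1, -1):
            top[pegs[disk]] = disk
        for src, disk in top.items():
            for dst in range(4):
                # a disk may move onto an empty peg or onto a larger disk
                if dst != src and top.get(dst, n) > disk:
                    nxt = state & ~(3 << (2 * disk)) | (dst << (2 * disk))
                    if pdb[nxt] == 255:
                        pdb[nxt] = pdb[state] + 1
                        queue.append(nxt)
    return pdb
```

With this layout a 14-disk database has 4**14 = 2**28 entries, which is the 256 MB quoted above.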

9
TOH4 results
10
How to best use the memory
  • The speed of the search is directly related to
    the size of the pattern database.
  • We usually omit the computation time of the PDBs
    but cannot ignore the memory requirements
  • Holte, Newton, Felner, Meshulam and Furcy (2004)
    showed that it is better to use many small
    databases and take their maximum instead of one
    large database.
  • We limit the discussion to 1 gigabyte.

11
Compressing pattern databases
  • Traditionally, each configuration of the pattern
    had a unique entry in the PDB.
  • Our main claim:
  • Nearby entries in PDBs are highly correlated!!
  • We propose to compress nearby entries by storing
    their minimum in one entry.
  • We show that:
  • Most of the knowledge is preserved.
  • Consequences: memory is saved and larger patterns
    can be used, so a speedup in search is obtained.

12
Cliques in the pattern space
  • The values in a PDB for a clique are d or d+1.
  • In permutation puzzles, cliques exist when only
    one object moves to another location.
  • Usually they have nearby entries in the PDB.

[Figure: a clique whose nodes are at distance d or d+1 from the goal G]
13
Storing cliques
  • Assume a clique of size K with values d or d+1.
  • Lossy compression: store only one entry for the
    clique, holding the minimum d. We lose at most 1.
  • Lossless compression: store the minimum d. Also
    store K additional bits, one per entry.

A clique in TOH4
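
The two storage schemes for a clique can be sketched as below. The function names are illustrative only; the input is assumed to be the list of the K original PDB values of one clique.

```python
def compress_lossy(values):
    """Keep a single entry for the whole clique: its minimum d."""
    return min(values)

def compress_lossless(values):
    """Keep the minimum d plus one bit per entry (1 means the value is d+1)."""
    d = min(values)
    bits = [v - d for v in values]
    assert all(b in (0, 1) for b in bits), "clique values must be d or d+1"
    return d, bits

def lookup_lossless(d, bits, i):
    """Recover the exact original value of entry i."""
    return d + bits[i]

clique = [7, 8, 7, 8]
print(compress_lossy(clique))             # one value shared by all 4 entries
d, bits = compress_lossless(clique)
print([lookup_lossless(d, bits, i) for i in range(len(clique))])
```

Because the lossy entry is the minimum, every lookup is still a lower bound on the original value and is too small by at most 1.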
14
Compressing PDBs in TOH4
  • If we compress the last index, that of the
    smallest disk, then a PDB with P disks can be
    stored in only 4^(P-1) entries instead of 4^P.
  • This can be generalized to a set of nodes with
    diameter D (for cliques, D = 1).
  • For TOH4, we fix the position of the largest P-2
    disks and compress all the 4^2 = 16 entries of the
    smallest 2 disks.
  • In general, compressing any block will work, not
    necessarily cliques.
15
TOH4 results 16 disks (14+2)
  • Memory was reduced by a factor of 1000!!! at a
    cost of only a factor of 2 in the search effort.
  • Lossless compression is not efficient in this
    domain.


16
TOH4 larger versions
  • For the 17-disk problem a speedup of 3 orders
    of magnitude is obtained!!!
  • The 18-disk problem can be solved in 5 minutes!!

17
Tile Puzzles
  • We can take advantage of simple heuristics:
    store only the addition above the Manhattan
    distance heuristic.
  • Storing PDBs for the tile puzzle:
  • (Simple mapping) A multi-dimensional array:
  • A[16][16][16][16][16], size = 1.04 MB
  • (Packed mapping) A one-dimensional array:
  • A[16·15·14·13·12], size = 0.52 MB
  • The time and memory tradeoff is straightforward!!
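
The two mappings can be sketched as index functions over the locations of the pattern tiles. This is an illustration under the assumption of one byte per entry; simple_index and packed_index are hypothetical names, and locs holds one distinct board location (0..n-1) per pattern tile.

```python
def simple_index(locs, n=16):
    """Simple mapping: every tile gets a full base-n digit (n**k entries)."""
    idx = 0
    for loc in locs:
        idx = idx * n + loc
    return idx

def packed_index(locs, n=16):
    """Packed mapping: tile k only chooses among the n-k locations still free."""
    idx = 0
    used = []
    for k, loc in enumerate(locs):
        # rank of loc among the free locations = loc minus used locations below it
        digit = loc - sum(1 for u in used if u < loc)
        idx = idx * (n - k) + digit
        used.append(loc)
    return idx

print(16 ** 5, 16 * 15 * 14 * 13 * 12)    # table sizes for a 5-tile pattern
```

The packed table is half the size for 5 tiles, but every lookup pays for the extra ranking work, which is the time and memory tradeoff mentioned above.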

18
15 puzzle results
  • A clique in the tile puzzle is of size 2.
  • We compressed the last index by two:
  • A[16][16][16][16][8]


19
24 puzzle
  • The same tendencies were obtained for the 24
    puzzle.
  • The 6-6-6-6 partitioning is so good that adding
    another set of 6-6-6-6 did not speed up the
    search.
  • We have also tried a 7-7-5-5 partitioning, but it
    did not speed up the search.

20
Ongoing and future work
  • An item for the PDB of tiles (a,b,c,d) maps the
    locations of the four tiles to a value d
  • Store the PDBs in a Trie
  • A PDB of 5 tiles will have a level in the trie
    for each tile. The values will be in the leaves
    of the trie.
  • This data-structure will enable flexibility and
    will save memory as subtrees of the trie can be
    pruned

21
Trie pruning
Simple (lossless) pruning: fold leaves with
exactly the same values.
No data will be lost.
[Figure: trie leaves that all hold the value 2]
22
Trie pruning
  • Intelligent (lossy) pruning
  • Fold leaves/subtrees that are correlated to each
    other (many options for this!!)
  • Some data will be lost.
  • Admissibility is still kept.

[Figure: trie leaves holding the values 2, 2, 2, 2 and 4]
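
Both kinds of pruning can be sketched on a trie built from nested dictionaries, with one level per tile and the values in the leaves. This is one possible reading of the slides, not the authors' implementation: build_trie, fold and lookup are illustrative names, and the tolerance rule is just one of the many options for lossy folding.

```python
def build_trie(entries):
    """entries maps a tuple of tile locations to its PDB value."""
    root = {}
    for locs, value in entries.items():
        node = root
        for loc in locs[:-1]:
            node = node.setdefault(loc, {})
        node[locs[-1]] = value
    return root

def fold(node, tolerance=0):
    """Fold sibling leaves whose values differ by at most `tolerance`.

    tolerance=0 is the lossless case. A larger tolerance is lossy; the
    minimum is kept, so the stored value never exceeds the original.
    The rule is applied at every level, so the total loss can add up.
    """
    if not isinstance(node, dict):
        return node
    children = {k: fold(v, tolerance) for k, v in node.items()}
    values = list(children.values())
    if all(not isinstance(v, dict) for v in values) and max(values) - min(values) <= tolerance:
        return min(values)
    return children

def lookup(node, locs):
    """Walk down the trie; a folded subtree answers for all keys below it."""
    for loc in locs:
        if not isinstance(node, dict):
            break
        node = node[loc]
    return node

entries = {(0, 1): 2, (0, 2): 2, (1, 0): 2, (1, 2): 3}
lossless = fold(build_trie(entries))
lossy = fold(build_trie(entries), tolerance=1)
```

Lookups are assumed to be made only for keys that were stored in the original database.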
23
Trie Initial Results
A 5-5-5 partitioning stored in a trie with simple
folding
24
Neural Networks (NN)
  • We can feed a PDB into a neural network engine,
    especially the addition above MD.
  • For each tile we focus on its dx and dy from its
    goal position. (i.e. MD)
  • Linear conflict
  • dx1 = dx2 = 0
  • dy1 dy21
  • A NN can learn these rules

[Figure: tiles 2 and 1 in a linear conflict, with dy1 = 2 and dy2 = 0]
25
Neural network
  • We train the NN by feeding it the entire pattern
    space (or part of it).
  • For example for a pattern of 5 tiles we have 10
    features, 2 for each tile.
  • During the search, given the locations of the
    tiles we look them up in the NN.
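
The input features can be sketched as below. The slides do not say whether dx and dy are signed, so signed offsets on a 4x4 board are an assumption here, as is the name features; the network itself is not shown.

```python
def features(locations, goals, width=4):
    """Return [dx, dy] per tile, the offsets of each tile from its goal."""
    feats = []
    for loc, goal in zip(locations, goals):
        feats.append(loc % width - goal % width)      # dx: column offset
        feats.append(loc // width - goal // width)    # dy: row offset
    return feats

# Two tiles whose goals are locations 4 and 5, currently at locations 5 and 0.
vector = features([5, 0], [4, 5])
print(vector, sum(abs(x) for x in vector))
```

A pattern of 5 tiles gives a vector of 10 numbers, and the sum of the absolute values is the Manhattan distance of the pattern tiles.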

26
Neural network example
[Figure: layout for the pattern of the tiles 4, 5 and 6, with network inputs dx4, dy4, dx5, dy5, dx6 and dy6]
27
Neural Network problems
  • We face the problem of overestimating and will
    have to bias the results towards underestimating.
  • We keep the overestimating values in a separate
    hash table
  • Results are encouraging!!
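
One way to read the hash table idea is sketched below: every state on which the network overestimates gets its true value stored in a table that is consulted first. This reading and the names build_exceptions and admissible_h are assumptions; predict stands in for the trained network.

```python
def build_exceptions(pdb, predict):
    """Collect the true values of all states that the predictor overestimates."""
    return {s: v for s, v in pdb.items() if predict(s) > v}

def admissible_h(state, predict, exceptions):
    """Use the stored true value if there is one, else the prediction."""
    if state in exceptions:
        return exceptions[state]
    return predict(state)

pdb = {"a": 3, "b": 5, "c": 4}                 # toy true values
guesses = {"a": 3, "b": 6, "c": 2}             # toy network outputs
predict = guesses.__getitem__
exceptions = build_exceptions(pdb, predict)
print(exceptions)
```

The fewer states the network overestimates, the smaller this table is, which is why the output is biased towards underestimating.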

28
Selective Pattern Database
  • Only part of the pattern space is queried for a
    single problem instance.
  • If we can identify that part, we can generate
    only that part.