Title: Searching with Structured Keys Objectives
1. Searching with Structured Keys: Objectives
- To look at search structures for variable-length and multidimensional keys.
- To understand several basic radix search techniques, their advantages and disadvantages.
- To look at range queries and partial-match searches in multidimensional structures.
2. Radix Techniques
You remember the card sorter.
Basic operation: examine a particular column (a field with finite range) and put the card into a bin based on it.
To sort a deck on k fields: make k passes, from the least to the most significant column. Each pass sorts on the current column, keeping ties in their previous order.
Cost: k passes, n cards per pass. Linear in the number of characters in the file.
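The card-sorter procedure can be sketched in a few lines of Python. This is a minimal illustration, assuming fixed-width string keys; the names `radix_sort` and `width` are made up for the example and do not come from the slides. The deck is the one used in the example on the next slide.

```python
def radix_sort(cards, width):
    """Sort fixed-width strings with one stable bin pass per column."""
    for col in range(width - 1, -1, -1):       # least to most significant column
        bins = {}
        for card in cards:                      # drop each card into its bin
            bins.setdefault(card[col], []).append(card)
        # Collect the bins in symbol order; cards within a bin keep their
        # previous relative order, which is what makes the pass stable.
        cards = [card for symbol in sorted(bins) for card in bins[symbol]]
    return cards


deck = ["Dog", "Boy", "Top", "Bop", "Bip", "Bog", "Dip", "Dap", "Map"]
print(radix_sort(deck, 3))
```

Each pass touches every card once, so the total work is k passes times n cards, as stated above. Sorting the bin labels is cheap because the column alphabet is small and fixed.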
3. Radix Sorting Technique
Input   Pass 1   Pass 2   Pass 3
Dog     Dog      Dap      Bip
Boy     Bog      Map      Bog
Top     Top      Bip      Bop
Bop     Bop      Dip      Boy
Bip     Bip      Dog      Dap
Bog     Dip      Bog      Dip
Dip     Dap      Top      Dog
Dap     Map      Bop      Map
Map     Boy      Boy      Top
4. Radix Searching Techniques
We will treat keys as consisting of a sequence of small fields or simple bit patterns to develop useful search structures.
Underlying assumptions about keys:
- They are long and/or of variable length.
- A comparison between keys (as opposed to parts of keys) is costly or of variable cost.
5. Digital Search Trees
Same as a binary search tree, except that the decision at each node is either
- A match (the key matches), or
- A branch on the next bit.
6. Digital Search Tree
S  1 0 0 1 1
E  0 0 1 0 1
A  0 0 0 0 1
R  1 0 0 1 0
C  0 0 0 1 1
H  0 1 0 0 0
I  0 1 0 0 1
N  0 1 1 1 0
G  0 0 1 1 1
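The codes in the key table match each letter's position in the alphabet (A = 1) written as five bits. Assuming that is the intended encoding, a small helper reproduces the table; the name `key_bits` is hypothetical.

```python
def key_bits(letter, width=5):
    """Return the letter's alphabet position (A=1) as a fixed-width bit string.

    Assumption: the slide's codes are alphabet positions in `width` bits.
    """
    return format(ord(letter) - ord("A") + 1, f"0{width}b")


for letter in "SEARCHING":
    print(letter, " ".join(key_bits(letter)))
```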
7. Digital Search Tree Updates
Insert: search, and hang the new key on where the search falls off the tree.
Delete: remove the key and pull up any leaf descendant.
8. Digital Search Tree Updates
Insert T 1 0 1 0 0
Delete N, E
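The search, insert, and delete rules for a digital search tree can be sketched as follows. This is an illustrative sketch, not the slides' own code: it assumes distinct, fixed-width keys given as bit strings, and the class and method names are invented for the example.

```python
class Node:
    def __init__(self, key):
        self.key = key                   # bit string such as "10011"
        self.child = [None, None]        # child[0] for bit 0, child[1] for bit 1


class DigitalSearchTree:
    def __init__(self):
        self.root = None

    def search(self, key):
        node, depth = self.root, 0
        while node is not None:
            if node.key == key:          # full key comparison at every node
                return True
            node = node.child[int(key[depth])]   # otherwise branch on next bit
            depth += 1
        return False

    def insert(self, key):
        if self.root is None:
            self.root = Node(key)
            return
        node, depth = self.root, 0
        while node.key != key:
            b = int(key[depth])
            if node.child[b] is None:    # fell off the tree: hang the key here
                node.child[b] = Node(key)
                return
            node, depth = node.child[b], depth + 1

    def delete(self, key):
        parent, node, depth = None, self.root, 0
        while node is not None and node.key != key:
            parent, node = node, node.child[int(key[depth])]
            depth += 1
        if node is None:
            return False
        # Walk down to any leaf below the node (possibly the node itself).
        leaf_parent, leaf = parent, node
        while leaf.child[0] or leaf.child[1]:
            leaf_parent, leaf = leaf, leaf.child[0] or leaf.child[1]
        node.key = leaf.key              # pull the leaf's key up
        if leaf_parent is None:          # the tree held a single node
            self.root = None
        elif leaf_parent.child[0] is leaf:
            leaf_parent.child[0] = None
        else:
            leaf_parent.child[1] = None
        return True


codes = {"S": "10011", "E": "00101", "A": "00001", "R": "10010", "C": "00011",
         "H": "01000", "I": "01001", "N": "01110", "G": "00111", "T": "10100"}
tree = DigitalSearchTree()
for letter in "SEARCHING":
    tree.insert(codes[letter])
tree.insert(codes["T"])
tree.delete(codes["N"])
tree.delete(codes["E"])
print([letter for letter in codes if tree.search(codes[letter])])
```

Any leaf descendant can replace a deleted key because every key below a node already agrees with that node's position on all the bits tested on the way down.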
9. Digital Search Trees - Costs
- The number of nodes inspected in a search is, at most, the number of leading bits needed to distinguish the key from the other keys.
- For randomly distributed keys this should be Θ(log n), but problems arise if keys agree on their leading bits.
- But visiting each node involves a key comparison, and this may be the bulk of the cost.
10. Radix Search Tries
"Trie", as in retrieval, is pronounced "try".
Keep keys at leaves only; that way only one full key comparison is needed per search.
11. Radix Search Trie Updates
Insert: search to a leaf, or fall out at a single-branch node.
- If you fall out, hang a new node on there.
- If you come to a leaf, continue the path in the tree with the two elements until the bits differ.
Insert Y 1 1 0 0 1, M 0 1 1 0 1
Deletions are similar.
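The two insertion cases for a radix search trie can be sketched as below. This is a rough sketch under stated assumptions: keys are distinct, fixed-width bit strings (variable-length keys would need an end marker), and the names `Leaf`, `Branch`, `split`, and `key_bits` are invented for the example.

```python
class Leaf:
    def __init__(self, key):
        self.key = key


class Branch:
    def __init__(self):
        self.child = [None, None]        # no key stored in internal nodes


def key_bits(letter, width=5):
    """Assumed encoding: alphabet position (A=1) as a fixed-width bit string."""
    return format(ord(letter) - ord("A") + 1, f"0{width}b")


def search(root, key):
    node, depth = root, 0
    while isinstance(node, Branch):      # only single-bit tests on the way down
        node = node.child[int(key[depth])]
        depth += 1
    return node is not None and node.key == key   # the one full key comparison


def split(old, new, depth):
    """Extend the path until the two keys differ, then hang both as leaves."""
    branch = Branch()
    if old[depth] == new[depth]:
        branch.child[int(old[depth])] = split(old, new, depth + 1)  # one-way branch
    else:
        branch.child[int(old[depth])] = Leaf(old)
        branch.child[int(new[depth])] = Leaf(new)
    return branch


def insert(root, key, depth=0):
    """Return the root of the subtree after inserting key."""
    if root is None:                     # fell out: hang a new leaf here
        return Leaf(key)
    if isinstance(root, Leaf):           # came to a leaf: continue the path
        return root if root.key == key else split(root.key, key, depth)
    b = int(key[depth])
    root.child[b] = insert(root.child[b], key, depth + 1)
    return root


root = None
for letter in "SEARCHING" + "YM":
    root = insert(root, key_bits(letter))
print([letter for letter in "SEARCHINGYMT" if search(root, key_bits(letter))])
```

The `split` helper is where one-way branch nodes appear: keys that share a long prefix force a chain of nodes with a single child.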
12. A Real Fix: Leave Out One-Way Branch Nodes
The search structure Patricia (Practical Algorithm To Retrieve Information Coded In Alphanumeric) permits searches for n arbitrarily long keys with
- Only n nodes of storage
- Just one full key compare
- No useless tests
13. Patricia Step 1: Leave Out One-Way Branch Nodes
- Add to each node the position of the bit to be tested (we will still move left to right).
Bit  4 3 2 1 0
S    1 0 0 1 1
E    0 0 1 0 1
A    0 0 0 0 1
R    1 0 0 1 0
C    0 0 0 1 1
14. Patricia Step 2
- Add the dummy value 00000 (which is either not stored or stored in the header).
- Store keys in internal nodes that are ancestors of their leaf positions in the tree.
Since we still examine bits from left to right, the bit number tested decreases along every downward path. We know we have reached a key test when the bit number of the node we arrive at is not less than that of the previous node.
15. Patricia Insertions
- Insertions can be made in the natural way.
- Follow the search path of the new leaf.
- Determine the highest-numbered bit where the new key and the leaf key differ.
- Retrace the search path to the point where a test on this bit could occur.
- Insert a new node there, with the node on the path to the leaf as one child and a new leaf containing the new key as the other.
16. Patricia Insertions
Insert H 0 1 0 0 0, I 0 1 0 0 1
Observe that the structure must contain one more leaf than internal nodes.
17. Patricia Insertions
If the new key differs from those along its search path in an untested bit, put in a new node with the new key and a pointer to itself.
18. Otherwise, the search ends at an upward branch. Insert a new node with the new key and pointers to itself and to the node of the upward branch. The left pointer refers to the one with 0 in the differing bit.
19. Insertion Example
Insert N 0 1 1 1 0, G 0 0 1 1 1
Deletions are similar.
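The Patricia search and insertion steps can be sketched with integer keys and a header node that holds the dummy value 00000. This follows the well-known textbook formulation with upward links rather than anything shown explicitly on the slides; `WIDTH`, `code`, and the class names are assumptions made for the example, and the key 0 itself cannot be stored because the header already holds it.

```python
WIDTH = 5                                # bits per key; bit WIDTH-1 is leftmost


def bit(key, i):
    """Bit i of an integer key, numbered 4..0 from left to right as on the slides."""
    return (key >> i) & 1


def code(letter):
    """Assumed encoding: alphabet position, A=1."""
    return ord(letter) - ord("A") + 1


class Node:
    def __init__(self, key, b):
        self.key = key                   # every node stores one key
        self.b = b                       # position of the bit tested here
        self.link = [self, self]         # link[0] and link[1]


class Patricia:
    def __init__(self):
        self.head = Node(0, WIDTH)       # header holds the dummy key 00000

    def _descend(self, key, stop_bit=-1):
        """Follow links until one leads upward or would skip past stop_bit."""
        p, x = self.head, self.head.link[0]
        while p.b > x.b and x.b > stop_bit:
            p, x = x, x.link[bit(key, x.b)]
        return p, x

    def search(self, key):
        _, x = self._descend(key)        # bit tests only on the way down
        return x.key == key              # the single full key comparison

    def insert(self, key):
        _, t = self._descend(key)
        if t.key == key:
            return
        i = WIDTH - 1                    # highest-numbered bit where keys differ
        while bit(key, i) == bit(t.key, i):
            i -= 1
        p, x = self._descend(key, stop_bit=i)    # retrace to where bit i fits
        node = Node(key, i)
        node.link[bit(key, i)] = node            # pointer to itself
        node.link[1 - bit(key, i)] = x           # rest of the old path
        p.link[bit(key, p.b)] = node


trie = Patricia()
for letter in "SEARCHING":
    trie.insert(code(letter))
print([letter for letter in "SEARCHINGTM" if trie.search(code(letter))])
```

The loop condition `p.b > x.b` is the test described on slide 14: a link leads upward exactly when the bit number does not decrease. Nine keys produce nine nodes plus the header.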
20. Radix Techniques
- Insensitive to the order of insertions
- Sensitive to the particular key values
- Very useful for long or variable-length keys