Title: K-Dimensional Trees
1. K-Dimensional Trees
- Orest Pilskalns
- Washington State University - Vancouver
2. Motivation
- Which data structure would you use?
- You have (x, y) locations on a map and need to
search with a bounding box (e.g. your data set
includes lat, lon locations for restaurants, and
you want all restaurants in downtown Missoula).
- Consider a personnel database: you need to query
by salary range, age range, etc.
3. Review Trees
- A tree is a collection of nodes
- The collection can be empty
- (recursive definition) If not empty, a tree
consists of a distinguished node r (the root),
and zero or more nonempty subtrees T1, T2, ...,
Tk, each of whose roots are connected by a
directed edge from r
4. Review Trees
- Child and parent
- Every node except the root has one parent
- A node can have an arbitrary number of children
- Leaves
- Nodes with no children
- Sibling
- Nodes with the same parent
5. Review Binary Search Trees
- A Binary Search Tree (BST) is a binary tree in
which the value in every node is
- greater than all values in the node's left subtree
- less than all values in the node's right subtree
6. Review BST Properties
- Basic operations take time proportional to the
height h of the tree
- Randomly built trees: h = O(lg n) (each level
roughly halves the data set)
- Worst case: h = O(n) (the tree degenerates into
a linked list)
7. Review Search BST
- Recursive algorithm
- If the root is null, throw an exception (record
not found)
- If the search key is equal to the root's search
key, return a reference to the corresponding
record
- If the search key is less than the root's search
key, search the left subtree
- If the search key is greater than the root's
search key, search the right subtree
[Figure: example search for key 13]
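The recursive search above can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the `Node` class, the `record` field, and the small hand-built tree are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    record: str                      # payload stored under the key (assumed)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def search(root: Optional[Node], key: int) -> str:
    """Return the record stored under key; raise KeyError if it is absent."""
    if root is None:
        raise KeyError(key)          # empty subtree: record not found
    if key == root.key:
        return root.record
    if key < root.key:
        return search(root.left, key)
    return search(root.right, key)


# Hand-built example tree (hypothetical):
#        15
#       /  \
#      6    18
#       \
#        13
tree = Node(15, "r15", Node(6, "r6", None, Node(13, "r13")), Node(18, "r18"))
print(search(tree, 13))  # prints r13
```

Each call descends one level, so the running time is proportional to the height of the tree.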
8. Review BST - Insert
- Begin at the root and trace a path down the tree
as if we were searching for the node that contains
the key of the new node z (14 in the figure)
- The new node must be a child of the node
where we stop the search
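A minimal sketch of this insertion, assuming a bare `Node` class with only a key and sending duplicate keys to the right (the slides do not say how ties are handled). The insertion order in the usage lines is made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Insert key and return the (possibly new) root of the tree."""
    new = Node(key)
    if root is None:
        return new
    cur = root
    while True:                      # walk down as if searching for key
        if key < cur.key:
            if cur.left is None:     # search stops here: attach as left child
                cur.left = new
                return root
            cur = cur.left
        else:                        # ties go right (an assumption)
            if cur.right is None:    # search stops here: attach as right child
                cur.right = new
                return root
            cur = cur.right


def inorder(node: Optional[Node]) -> List[int]:
    """Keys in sorted order, used here only to check the result."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


root = None
for k in [15, 6, 18, 3, 7, 13, 14]:
    root = insert(root, k)
print(inorder(root))  # prints [3, 6, 7, 13, 14, 15, 18]
```

With this insertion order, 14 ends up as the right child of 13, the node where the search for 14 stops.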
9. Review BST Delete (3 cases)
- Node to be deleted has no children (leaf node)
- Delete 9
- Node to be deleted has a single child
- Delete 7
- Node to be deleted has 2 children
- Delete 6 (need to find its successor, the
leftmost node of its right subtree, which has at
most one child)
[Figure: example BST; keys shown: 15, 18, 6, 30, 7, 3, 13, 2, 4, 14, 9]
10. Review BST Delete Case 1
[Figure: BST for case 1 (deleting a leaf node); keys shown: 15, 18, 6, 30, 7, 3, 13, 4, 9]
11. Review BST Delete Case 2
[Figure: the BST before and after 7 is deleted]
- Deleting 7: splice out the node by making a
link between its child and its parent
12. Review BST Delete Case 3
[Figure: the BST before and after 6 is deleted]
- Deleting 6: splice out 6's successor 7, which
has no left child, and replace the contents of 6
with the contents of the successor 7
- Note: instead of z's successor, we could have
spliced out z's predecessor
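The three delete cases can be sketched together in Python. This is an illustrative version, not the slides' code: `Node`, `insert`, `inorder`, and `delete` are hypothetical names, and the insertion order below is only chosen to produce a tree containing the keys from the slide 9 example, since the exact shape of the original figure cannot be recovered.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def inorder(node: Optional[Node]) -> List[int]:
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


def delete(root: Optional[Node], key: int) -> Optional[Node]:
    """Delete key from the subtree rooted at root; return the new subtree root."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    else:
        # Cases 1 and 2: zero or one child, so splice the node out by
        # linking its (possibly missing) child to its parent.
        if root.left is None:
            return root.right
        if root.right is None:
            return root.left
        # Case 3: two children. The successor is the leftmost node of the
        # right subtree; copy its key here, then delete it from that subtree.
        succ = root.right
        while succ.left is not None:
            succ = succ.left
        root.key = succ.key
        root.right = delete(root.right, succ.key)
    return root


root = None
for k in [15, 6, 18, 3, 7, 30, 2, 4, 13, 9, 14]:   # assumed insertion order
    root = insert(root, k)

root = delete(root, 9)   # case 1: leaf
root = delete(root, 7)   # case 2: single child (13)
root = delete(root, 6)   # case 3: two children
print(inorder(root))     # prints [2, 3, 4, 13, 14, 15, 18, 30]
```

Using the predecessor (the rightmost node of the left subtree) in case 3 would work equally well.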
13. Review BST Range Queries
- SearchRange(x1, x2)
- Find the split node (where the search paths for
x1 and x2 diverge)
- Continue searching for x1; report all right
subtrees hanging off that path
- Continue searching for x2; report all left
subtrees hanging off that path
- On reaching the leaves q1 and q2, check whether
they need to be reported
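As a rough illustration of a one-dimensional range query, here is a simplified Python sketch. It is not the split-node algorithm from the slide: it assumes keys are stored in every node (not only in leaves) and simply prunes subtrees that cannot overlap [x1, x2]. The names `Node`, `build_balanced`, and `search_range` are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_balanced(keys: List[int]) -> Optional[Node]:
    """Build a balanced BST from an already sorted list of keys."""
    if not keys:
        return None
    mid = len(keys) // 2
    return Node(keys[mid], build_balanced(keys[:mid]), build_balanced(keys[mid + 1:]))


def search_range(node: Optional[Node], x1: int, x2: int, out: List[int]) -> None:
    """Append every key k with x1 <= k <= x2 to out, in sorted order."""
    if node is None:
        return
    if x1 <= node.key:               # left subtree may still hold keys >= x1
        search_range(node.left, x1, x2, out)
    if x1 <= node.key <= x2:
        out.append(node.key)
    if node.key <= x2:               # right subtree may still hold keys <= x2
        search_range(node.right, x1, x2, out)


tree = build_balanced([2, 3, 4, 6, 7, 9, 13, 14, 15, 18, 30])
hits: List[int] = []
search_range(tree, 5, 14, hits)
print(hits)  # prints [6, 7, 9, 13, 14]
```

On a balanced tree this visits O(log n) nodes along the two boundary paths plus the nodes it reports.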
14. Review BST Range Query (Range Tree)
- Let P be a set of n points
- P is stored in a balanced binary tree (which
restricts the difference in path lengths)
- Uses O(n) storage and O(n log(n)) build time
- A range query is answered in O(log(n) + k) time,
where k is the number of reported points
15. K-Dimensional Trees (kd-tree)
- Space-partitioning data structure for organizing
points in a k-dimensional space.
- Levels of the tree are split along successive
dimensions, dividing the points.
- Applications include searches with a
multidimensional search key (e.g. range searches
and nearest neighbor searches).
- kd-trees are a special case of BSP trees (the
splits are axis-aligned).
16.
- Each node in the tree represents a cut of the key
space in a direction parallel to one of the
dimensional axes
- As with a BST, the tree and the division of the
key space depend upon the order in which the data
are inserted into the tree.
17. Building from list of points
- Create Function BuildKDTree(P, depth)
- if (P.length == 1) leaf node, set value
- else if (depth is even)
- split P into two subsets (P1, P2) with a
vertical line l through the median x coord of
the points
- else
- split P into two subsets (P1, P2) with a
horizontal line l through the median y coord of
the points
- vleft = BuildKDTree(P1, depth + 1)
- vright = BuildKDTree(P2, depth + 1)
- Return a node v storing l, and make vleft the
left child of v and vright the right child of v
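A possible Python rendering of BuildKDTree for the 2-d case, offered as a sketch under stated assumptions: points live only in leaves, internal nodes store the splitting line as an `axis` and a `split` coordinate, and the left subset receives the points at or below the median. The class `KDNode` and the helper `leaves` are hypothetical names. For brevity the code re-sorts at every level, which costs O(n log^2 n) overall; reaching the O(n log n) bound needs presorting or linear-time median selection.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class KDNode:
    point: Optional[Point] = None    # set only in leaves
    axis: Optional[int] = None       # 0: vertical line (split on x), 1: horizontal (split on y)
    split: Optional[float] = None    # coordinate of the splitting line
    left: Optional["KDNode"] = None
    right: Optional["KDNode"] = None


def build_kd_tree(points: List[Point], depth: int = 0) -> Optional[KDNode]:
    if not points:
        return None                  # guard for empty input (not in the slides)
    if len(points) == 1:
        return KDNode(point=points[0])
    axis = depth % 2                 # even depth: x, odd depth: y
    pts = sorted(points, key=lambda p: p[axis])
    mid = (len(pts) + 1) // 2        # left subset gets ceil(n/2) points
    return KDNode(
        axis=axis,
        split=pts[mid - 1][axis],    # median coordinate: left side is <= split
        left=build_kd_tree(pts[:mid], depth + 1),
        right=build_kd_tree(pts[mid:], depth + 1),
    )


def leaves(node: Optional[KDNode]) -> List[Point]:
    """Collect the points stored in the leaves, left to right."""
    if node is None:
        return []
    if node.point is not None:
        return [node.point]
    return leaves(node.left) + leaves(node.right)


pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]   # made-up sample data
root = build_kd_tree(pts)
print(root.axis, root.split)  # prints 0 5
```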
18. Cost of Building
- Recurrence
- T(n) = O(1) if n = 1
- T(n) = O(n) + 2T(n/2) if n > 1
- (Finding the median, splitting and sorting)
- Use the Master Method to find the closed form
- O(n log(n))
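The closed form can be checked numerically. The sketch below assumes the hidden constants are all 1 and that n is a power of two, so the recurrence becomes T(1) = 1, T(n) = n + 2T(n/2); under those assumptions it solves exactly to n log2(n) + n. The function name `build_cost` is made up for the example.

```python
def build_cost(n: int) -> int:
    """T(n) = n + 2*T(n/2) with T(1) = 1, for n a power of two."""
    return 1 if n == 1 else n + 2 * build_cost(n // 2)


# Compare against the closed form n*log2(n) + n for n = 2^m.
for m in range(11):
    n = 2 ** m
    assert build_cost(n) == n * m + n

print(build_cost(1024))  # prints 11264
```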
19. Insert Point
- Insert into a K-D tree is similar to BST
insertion
- First search until a NULL pointer is found
- Insert the new record into the proper child
pointer
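A minimal sketch of this insertion, assuming a variant in which every node stores a point (as in a BST) and the discriminator cycles through the dimensions with depth. Ties on the discriminator go right, which is a choice made here and not something the slides specify. `KDNode` and `kd_insert` are illustrative names, and the sample points are made up.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class KDNode:
    point: Tuple[float, ...]
    left: Optional["KDNode"] = None
    right: Optional["KDNode"] = None


def kd_insert(root: Optional[KDNode], point: Tuple[float, ...], depth: int = 0) -> KDNode:
    """Insert point and return the root of the (sub)tree."""
    if root is None:                       # NULL pointer found: new record goes here
        return KDNode(point)
    axis = depth % len(point)              # discriminator for this level
    if point[axis] < root.point[axis]:
        root.left = kd_insert(root.left, point, depth + 1)
    else:                                  # ties go right (an assumption)
        root.right = kd_insert(root.right, point, depth + 1)
    return root


root = None
for p in [(7, 2), (5, 4), (9, 6), (2, 3), (4, 7), (8, 1)]:
    root = kd_insert(root, p)

print(root.left.point, root.right.left.point)  # prints (5, 4) (8, 1)
```

As with a BST, the resulting shape depends on the insertion order.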
20. Delete Nodes
- K-D delete is more complicated than BST delete.
To delete a node N:
- If N has no children, replace it with a NULL
- If N has two children, we must find the smallest
value in the right subtree. However, we must find
the smallest value for the same discriminator
(e.g. x, y) as N
- That value is not necessarily leftmost, since
some branches are not based on this discriminator
- Then we replace N's record with the min node's
record and call delete recursively to remove the
min node.
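The minimum search that delete relies on might look like the following sketch. It assumes every node stores a point and that the discriminator cycles with depth; `KDNode` and `find_min` are hypothetical names, and the small tree is hand-built for illustration. At a level that splits on the requested dimension only the left subtree can hold a smaller value; at any other level both subtrees must be searched.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class KDNode:
    point: Tuple[float, ...]
    left: Optional["KDNode"] = None
    right: Optional["KDNode"] = None


def find_min(node: Optional[KDNode], dim: int, depth: int = 0) -> Optional[Tuple[float, ...]]:
    """Return the point with the smallest coordinate in dimension dim."""
    if node is None:
        return None
    axis = depth % len(node.point)
    candidates = [node.point]
    left_min = find_min(node.left, dim, depth + 1)
    if left_min is not None:
        candidates.append(left_min)
    if axis != dim:
        # This level does not split on dim, so the right subtree may
        # also contain the minimum.
        right_min = find_min(node.right, dim, depth + 1)
        if right_min is not None:
            candidates.append(right_min)
    return min(candidates, key=lambda p: p[dim])


# Root splits on x; its children split on y.
root = KDNode((7, 2),
              KDNode((5, 4)),
              KDNode((9, 6), KDNode((10, 1)), KDNode((8, 8))))

# Smallest x in the root's right subtree (which starts at depth 1):
print(find_min(root.right, 0, depth=1))  # prints (8, 8)
```

Here the smallest x in the right subtree sits in a right branch, which shows why following left pointers alone is not enough.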
21. Search
- SearchKDTree(v, R)
- if v is a leaf, report the point stored in v if
it lies within R
- else (check both children independently)
- if ( region(lc(v)) is fully contained in R )
- ReportSubTree(lc(v))
- else if ( region(lc(v)) intersects R )
- SearchKDTree(lc(v), R)
- if ( region(rc(v)) is fully contained in R )
- ReportSubTree(rc(v))
- else if ( region(rc(v)) intersects R )
- SearchKDTree(rc(v), R)
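A 2-d range search along these lines can be sketched as follows. Several things are assumptions made for the example: points are stored only in leaves, rectangles are closed and written as (xmin, ymin, xmax, ymax), each node's region is computed on the way down and not stored, and both children are examined independently. All names (`KDNode`, `build`, `search_kd_tree`, and the helpers) are illustrative.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]   # (xmin, ymin, xmax, ymax), closed


@dataclass
class KDNode:
    point: Optional[Point] = None          # set only in leaves
    axis: Optional[int] = None             # 0 splits on x, 1 splits on y
    split: Optional[float] = None
    left: Optional["KDNode"] = None
    right: Optional["KDNode"] = None


def build(points: List[Point], depth: int = 0) -> KDNode:
    """Median-split construction; expects a non-empty list of points."""
    if len(points) == 1:
        return KDNode(point=points[0])
    axis = depth % 2
    pts = sorted(points, key=lambda p: p[axis])
    mid = (len(pts) + 1) // 2
    return KDNode(axis=axis, split=pts[mid - 1][axis],
                  left=build(pts[:mid], depth + 1),
                  right=build(pts[mid:], depth + 1))


def contains(outer: Rect, inner: Rect) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def child_regions(v: KDNode, region: Rect) -> Tuple[Rect, Rect]:
    """Cut region along v's splitting line into the left and right child regions."""
    xmin, ymin, xmax, ymax = region
    if v.axis == 0:
        return (xmin, ymin, v.split, ymax), (v.split, ymin, xmax, ymax)
    return (xmin, ymin, xmax, v.split), (xmin, v.split, xmax, ymax)


def report_subtree(v: KDNode, out: List[Point]) -> None:
    if v.point is not None:
        out.append(v.point)
    else:
        report_subtree(v.left, out)
        report_subtree(v.right, out)


def search_kd_tree(v: KDNode, query: Rect, region: Rect, out: List[Point]) -> None:
    if v.point is not None:                # leaf: test the single stored point
        x, y = v.point
        if query[0] <= x <= query[2] and query[1] <= y <= query[3]:
            out.append(v.point)
        return
    for child, reg in zip((v.left, v.right), child_regions(v, region)):
        if contains(query, reg):           # region fully inside the query
            report_subtree(child, out)
        elif intersects(reg, query):       # partial overlap: recurse
            search_kd_tree(child, query, reg, out)


pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]   # made-up sample data
whole_plane = (-math.inf, -math.inf, math.inf, math.inf)
found: List[Point] = []
search_kd_tree(build(pts), (4, 1, 8, 5), whole_plane, found)
print(sorted(found))  # prints [(5, 4), (7, 2), (8, 1)]
```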
22. Cost of Searching
- Recurrence
- Let T(n) be the number of nodes we look at.
For the current node, we may need to look at one
of the two children and two of the four
grandchildren
- T(n) = 1 if n = 1
- T(n) = 2 + 2T(n/4) if n > 1
- Closed form: O(sqrt(n))
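The square-root bound can be checked numerically. Assuming n is a power of four, the recurrence T(1) = 1, T(n) = 2 + 2T(n/4) solves exactly to 3*sqrt(n) - 2; the function name `nodes_visited` is made up for this sketch.

```python
def nodes_visited(n: int) -> int:
    """T(n) = 2 + 2*T(n/4) with T(1) = 1, for n a power of four."""
    return 1 if n == 1 else 2 + 2 * nodes_visited(n // 4)


# For n = 4^m we have sqrt(n) = 2^m, so the closed form is 3 * 2^m - 2.
for m in range(10):
    n = 4 ** m
    assert nodes_visited(n) == 3 * (2 ** m) - 2

print(nodes_visited(1024))  # prints 94
```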
23. Variation (K-Trie)
- 2-d kd-trie
- 1. Uses internal nodes for splitting planes
- 2. All values stored in leaves
24. Similar Trees
- Quad Tree: splits exactly into fourths
- Octree: splits into eighths
- Binary Space Partition (BSP): splits do not have
to be aligned with an axis
25. KD-Trees