Title: Completely Balanced Trees
1Completely Balanced Trees
2Completely Balanced Trees
- So far, we've always grown our trees from the
root to the leaf nodes
- Problem
- Unequal path lengths
- Goals
- Some maximum number of level traversals
- Expand from binary to N-ary trees
- For a tree containing N nodes with up to M children
per node, a large M means that the
height of the tree will be small
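To make the effect of fan-out concrete, here is a minimal Python sketch. The function name `min_levels` is illustrative and not from the slides. It counts how many levels a completely filled M-ary tree needs in order to hold N nodes.

```python
def min_levels(n_nodes, m):
    """Fewest levels an m-ary tree needs to hold n_nodes nodes.

    Level 0 holds 1 node, level 1 holds m nodes, level 2 holds m*m, ...
    """
    levels = 0      # levels used so far
    capacity = 0    # nodes that fit in those levels
    width = 1       # nodes that fit in the next level
    while capacity < n_nodes:
        capacity += width
        width *= m
        levels += 1
    return levels


# One million nodes: binary tree versus a tree with 100 children per node.
print(min_levels(1_000_000, 2))    # 20
print(min_levels(1_000_000, 100))  # 4
```

With 100 children per node, a lookup touches at most 4 nodes instead of 20, which is the motivation for the wide nodes used in the rest of the slides.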
3B-Trees
- Balanced trees
- Used for database index creation
- An on-disk data structure
- Nodes consist of
- N key values
- N data pointers (pointing to entire data item
with that key, perhaps in a sequential file)
- N+1 node pointers (pointing to other B-tree
nodes)
- Note
- Can calculate N given the size of pointers and keys
and the block size
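The calculation mentioned in the note can be sketched as follows. This assumes a node holds exactly N keys, N data pointers, and N+1 node pointers with no header or padding. The function name and the byte sizes in the example are illustrative assumptions, not values from the slides.

```python
def max_keys_per_node(block_size, key_size, data_ptr_size, node_ptr_size):
    """Largest N such that
    N*key_size + N*data_ptr_size + (N + 1)*node_ptr_size <= block_size.
    """
    # Reserve the one extra node pointer, then divide by the per-key cost.
    per_key = key_size + data_ptr_size + node_ptr_size
    return (block_size - node_ptr_size) // per_key


# Assumed sizes: 4096-byte block, 8-byte keys, 8-byte pointers.
n = max_keys_per_node(4096, 8, 8, 8)
print(n)  # 170
```

A real on-disk format would also reserve space for per-node bookkeeping, which lowers N slightly.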
4B-Trees
- Each node may contain a large number of keys
- Number of subtrees of each node may be large
- An on-disk data structure
- Designed to branch out in a large number of directions
- Contain lots of keys in each node
- ⇒ the height of the tree is relatively small
- Small number of nodes must be read from disk to
retrieve an item
- Large node size (with lots of keys in the node),
but a disk drive can usually read a fair amount of
data at once.
5B-Tree Definition
- A multi-way tree of order m is an ordered tree
where each node has at most m children
- If k is the actual number of children in the
node, then k - 1 is the number of keys in the node.
- If the keys and subtrees are arranged in the
fashion of a search tree, then this is called a
multiway search tree of order m.
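A lookup in a multiway search tree can be sketched as below. This is a minimal in-memory illustration, not an on-disk implementation. The `Node` class and `search` function are hypothetical names, and each node is assumed to keep its keys sorted with one more child than keys.

```python
import bisect


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                 # sorted list of k - 1 keys
        self.children = children or []   # k children, or [] for a leaf


def search(node, key):
    """Return True if key is stored anywhere below node."""
    while True:
        # Position of the first key >= the one we are looking for.
        i = bisect.bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:            # reached a leaf without a match
            return False
        node = node.children[i]          # descend into the i-th subtree


# Small example tree: root keys G and M with three leaves.
root = Node(["G", "M"], [Node(["A", "C", "E"]),
                         Node(["H", "K"]),
                         Node(["N", "Q"])])
print(search(root, "K"), search(root, "B"))  # True False
```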
6Multi-way Search Tree of Order 4
(Figure: example tree with its keys and pointers labeled)
7B-Trees
8B-Tree Properties
- A B-tree of order m is a multiway search tree of
order m such that
- 1. All leaves are on the bottom level.
- 2. All internal nodes (except perhaps the root
node) have at least ⌈m / 2⌉ (nonempty) children ⇒ keep it bushy
and balanced.
- 3. The root node can have as few as 2 children if
it is an internal node, and can obviously have no children if the
root node is a leaf (that is, the whole tree consists only of
the root node).
9B-Tree Properties
- 4. Each leaf node (other than the root node if
it is a leaf) must contain at least ⌈m / 2⌉ - 1 keys.
- Note: ⌈x⌉ is the ceiling
function whose value is the smallest integer that
is greater than or equal to x. E.g., ⌈3⌉ = 3,
⌈3.34⌉ = 4, ⌈1.98⌉ = 2, ⌈5.001⌉ = 6
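The bounds in properties 2 and 4 can be computed directly from the order m. The helper name `btree_limits` below is illustrative. It is a small sketch of the arithmetic, not part of any library.

```python
import math


def btree_limits(m):
    """Child and key bounds for a non-root node of a B-tree of order m."""
    min_children = math.ceil(m / 2)
    return {
        "max_children": m,
        "max_keys": m - 1,
        "min_children": min_children,
        "min_keys": min_children - 1,
    }


print(btree_limits(5))
# {'max_children': 5, 'max_keys': 4, 'min_children': 3, 'min_keys': 2}
```

For order 5 this gives at most 4 keys and at least 2 keys per non-root node.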
10B-Tree Properties
A B-tree is a fairly well-balanced tree since all
leaf nodes must be at the bottom. Recall
condition 2: All internal nodes (except perhaps
the root node) have at least ⌈m / 2⌉
(nonempty) children ⇒ keep it bushy and
balanced. This causes the tree to fan out, i.e.,
gives it a shorter height.
11B-Tree Insertion
- Insert keys in order in a single block until it
fills
- When a value must be added where there is no room,
split the node into two nodes
- Smaller half (rounded down) of values goes into
first node
- Larger half (rounded down) of values goes into
second node
- Median value goes up into the parent node (a new root node if there is no parent)
- Node pointers around median value point to the two
nodes at lower level
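The split step described above can be sketched as a small function. The name `split_node` is hypothetical, and the sketch only handles the keys of a single node. Child pointers and the update of the parent are left out.

```python
def split_node(keys, new_key):
    """Split an over-full node.

    keys is the sorted key list of a full node; new_key is the key that
    did not fit. Returns (left_keys, median, right_keys).
    """
    all_keys = sorted(keys + [new_key])
    mid = len(all_keys) // 2          # index of the median
    left = all_keys[:mid]             # smaller half (rounded down)
    right = all_keys[mid + 1:]        # larger half (rounded down)
    return left, all_keys[mid], right


# Order 5: a node holding A C G N is full when H arrives.
print(split_node(["A", "C", "G", "N"], "H"))
# (['A', 'C'], 'G', ['H', 'N'])
```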
12Insertion Example
Insert the following letters into what is
originally an empty B-tree of order 5: C N G A
H E K Q M F W L T Z D P R X Y S. Order 5 ⇒ max
of 5 children and 4 keys. All nodes (except
root) must have a minimum of 2 keys. The first
4 letters go into the root node in alphabetical order: A C G N.
13Insertion Example
Insert H next. No room. Split into 2 nodes.
Move median item G up into new root node, leaving leaves A C and H N.
Insert E, K, Q next.
14Insertion Example
Inserting E, K, Q doesn't require splits.
But inserting M does: the leaf H K N Q is full,
so split it into 2 and move the median M up.
15Insertion Example
F, W, L, and T are then added without needing any
split.
16Insertion Example
Adding Z requires the rightmost leaf (N Q T W) to split.
Move the median (T) up into the parent.
17Insertion Example
18Insertion Example
Inserting D causes the leftmost leaf to split;
D (which is the median too) goes up into the parent.
Insert P, R, X, Y without any splitting.
19Insertion Example
- When S is inserted, the node must split ⇒ Q
(median) goes up.
- Q's parent is full, so split further ⇒ M
(median of Q's parent) goes up.
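The whole walkthrough can be replayed with a compact in-memory sketch. This is an illustration under assumptions, not the slides' own code. The names `Node`, `insert`, and `levels` are hypothetical. A node is allowed to overflow to 5 keys momentarily before being split around its median, and nothing is stored on disk.

```python
import bisect

MAX_KEYS = 4  # order 5 => at most 4 keys per node


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty list means leaf


def _insert(node, key):
    """Insert key below node; return (median, right_node) if node split."""
    i = bisect.bisect_left(node.keys, key)
    if not node.children:                # leaf: place the key here
        node.keys.insert(i, key)
    else:                                # internal: recurse into child i
        promoted = _insert(node.children[i], key)
        if promoted is not None:
            median, right = promoted
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
    if len(node.keys) <= MAX_KEYS:
        return None
    # Overflow: split around the median and hand it to the caller.
    mid = len(node.keys) // 2
    median = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    promoted = _insert(root, key)
    if promoted is None:
        return root
    median, right = promoted
    return Node([median], [root, right])  # tree grows at the top


def levels(root):
    """Keys of every node, level by level from the root down."""
    out, level = [], [root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level for c in n.children]
    return out


root = Node()
for letter in "CNGAHEKQMFWLTZDPRXYS":
    root = insert(root, letter)
for row in levels(root):
    print(row)
```

After the last insertion the root holds M, its two children hold D G and Q T, and the leaves are A C, E F, H K L, N P, R S, and W X Y Z.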
20B Tree Deletion
- Delete H
- H is in a leaf node. Easy.
- Move K and L over to the left
21B Tree Deletion
- Delete T (non-leaf)
- Find T's successor (i.e., W)
and move it up to replace T
22B Tree Deletion
- Delete T
- In all cases, the deletion is reduced to removing a key from a leaf, which is simple if the leaf has extra keys
23B Tree Deletion
- Delete R next
- R is in a leaf node that has no extra keys, but its right sibling does.
- Move S to R's spot, move W to S's spot
- Move X up to W's spot
24B Tree Deletion
25B Tree Deletion
- Delete E next
- Very problematic ⇒ E's leaf and its siblings
have no extra keys
- Combine the leaf
with one of its two siblings
- Move down the parent's key
that was between these two leaves
26B Tree (Delete E)
27B Tree Deletion
- But node G must have at least ⌈5 / 2⌉ - 1 = 2
keys
- G cannot borrow a key from its sibling (the Q X node), which has no
extra keys
- Combine the siblings and parent node into one (root)
node
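The leaf-level cases from the deletion slides (plain removal, borrowing through the parent, and combining with a sibling) can be sketched on a single parent with its leaves. This is a simplified illustration. The function name `delete_in_leaf` and the list-based layout are assumptions, and a complete implementation would go on to repair the parent when it underflows, as happens with node G above.

```python
MIN_KEYS = 2  # ceil(5 / 2) - 1 for order 5


def delete_in_leaf(parent, leaves, i, key):
    """Delete key from leaves[i].

    parent is the list of separator keys; leaves[j] and leaves[j + 1]
    are separated by parent[j]. Returns a short description of the case.
    """
    leaf = leaves[i]
    leaf.remove(key)
    if len(leaf) >= MIN_KEYS:
        return "removed"
    # Borrow from the right sibling through the parent if it has extras.
    if i + 1 < len(leaves) and len(leaves[i + 1]) > MIN_KEYS:
        leaf.append(parent[i])
        parent[i] = leaves[i + 1].pop(0)
        return "borrowed from right"
    # Otherwise try the left sibling.
    if i > 0 and len(leaves[i - 1]) > MIN_KEYS:
        leaf.insert(0, parent[i - 1])
        parent[i - 1] = leaves[i - 1].pop()
        return "borrowed from left"
    # No sibling can spare a key: combine two leaves and pull the
    # separator between them down from the parent.
    j = i - 1 if i > 0 else i
    leaves[j:j + 2] = [leaves[j] + [parent[j]] + leaves[j + 1]]
    del parent[j]
    return "merged"


# Delete R: right sibling X Y Z has an extra key.
parent, leaves = ["Q", "W"], [["N", "P"], ["R", "S"], ["X", "Y", "Z"]]
print(delete_in_leaf(parent, leaves, 1, "R"), parent, leaves)

# Delete E: neither sibling has an extra key, so leaves are combined.
parent, leaves = ["D", "G"], [["A", "C"], ["E", "F"], ["K", "L"]]
print(delete_in_leaf(parent, leaves, 1, "E"), parent, leaves)
```

In the second call the parent is left with the single key G, which is below the minimum of 2 and is the situation the last deletion slide resolves by combining nodes one level up.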
28B Tree Deletion
29B Trees
30B+-Tree
- Problem with B-Tree
- Somewhat unequal access times
- Difficult to traverse index sequentially
- B+-Tree
- All data stored at lowest level (leaf nodes)
- Some values duplicated in internal nodes for
indexing
- Cost: extra storage, duplication, two types of
nodes
- Benefit: sequential access across bottom level
31B+-Trees
- Variant of B-trees
- All data stored at lowest level (leaf nodes)
- All leaves are linked sequentially
- Used as a dynamic indexing method in relational
DBs
- Contains index pages and data pages.
- Root and non-leaf nodes are index pages; leaf nodes are data pages.
32B+-Tree Example
- With 4 keys
- Leaf nodes are linked to each other via doubly
linked lists (not shown)
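The benefit of linking the leaves is that a range of keys can be read by walking along the bottom level. The sketch below is a minimal illustration. The `Leaf` class, the helper names, and the leaf contents are assumptions. Only the forward link is used, and the scan starts at the first leaf rather than descending through the index pages.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys    # sorted keys stored in this data page
        self.next = None    # link to the next leaf to the right


def link(leaves):
    """Chain the leaves left to right and return the first one."""
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    return leaves[0]


def range_scan(first_leaf, lo, hi):
    """Collect every key k with lo <= k <= hi by following leaf links."""
    out = []
    leaf = first_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:          # keys are sorted, so nothing more can match
                return out
            if k >= lo:
                out.append(k)
        leaf = leaf.next
    return out


# Illustrative leaf contents with up to 4 keys per page.
head = link([Leaf([5, 10, 15, 20]), Leaf([25, 28, 30]),
             Leaf([50, 55, 60, 65]), Leaf([75, 80, 85, 90])])
print(range_scan(head, 28, 60))  # [28, 30, 50, 55, 60]
```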
33B+-Tree Insertions
- Key value determines placement
- Three cases for insertion
- Leaf page not full
- Leaf page full, index page not full
- Leaf page and index page both full
36Insertion Example
Adding record with Key 28
37Insertion Example
- Adding record with Key 70
- 1. Should go into the leaf containing 50 55 60 65 ->
too bad, it's full
- 2. Split the page as follows
- Left leaf: 50 55
- Right leaf: 60 65 70
- 3. Middle key (60) is placed in the corresponding
parent index page
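The leaf split in this example can be sketched as follows. The function name `split_leaf` is hypothetical. The point to notice is that the middle key stays in the right leaf and a copy of it goes up to the index page, because all data must remain at the leaf level.

```python
def split_leaf(keys, new_key):
    """Split a full B+-tree leaf page.

    Returns (left_keys, right_keys, key_copied_to_parent).
    """
    all_keys = sorted(keys + [new_key])
    mid = len(all_keys) // 2
    left, right = all_keys[:mid], all_keys[mid:]
    return left, right, right[0]   # middle key is kept in the right leaf


print(split_leaf([50, 55, 60, 65], 70))
# ([50, 55], [60, 65, 70], 60)
```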
38Insert (Leaf full, index not)
39Insert (Leaf and index pages are full)
Add 95 ⇒ belongs in the leaf 75 80 85 90, which is full.
Split the leaf page into 2: 75 80 and 85 90 95. Middle key (85)
goes up ⇒ parent is full ⇒ split the parent into 25 50
and 75 85. Middle key (60) is made the
new parent of these two index pages.
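The index-page split differs from a leaf split in one detail: the middle key moves up and does not remain in either half. A minimal sketch, with the hypothetical name `split_index` and child pointers omitted:

```python
def split_index(keys, new_key):
    """Split a full B+-tree index page.

    Returns (left_keys, key_moved_up, right_keys); the middle key is
    removed from this level rather than copied.
    """
    all_keys = sorted(keys + [new_key])
    mid = len(all_keys) // 2
    return all_keys[:mid], all_keys[mid], all_keys[mid + 1:]


# The full index page 25 50 60 75 receives 85 from the leaf split.
print(split_index([25, 50, 60, 75], 85))
# ([25, 50], 60, [75, 85])
```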
40Insert (Leaf and index pages are full)
41Rotation
Used when a leaf node is full and its sibling is
not. Reduces the number of page splits. E.g., add 70 ⇒
previously, we split the 50 55 60 65 node and
brought 60 up. Instead, move a record to its
sibling.
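One way to picture a rotation is sketched below. The slide text does not say which record moves or in which direction, so this sketch assumes the smallest record of the full leaf moves to a left sibling that has room, and that the separator key in the index page is updated to match. The function name, the sibling contents, and the index keys are illustrative assumptions.

```python
def rotate_to_left(parent_keys, sep_idx, left, full_leaf, new_key, capacity=4):
    """Insert new_key into full_leaf without splitting it.

    parent_keys[sep_idx] is the index key separating left from full_leaf.
    Assumes left still has room; raises ValueError otherwise.
    """
    if len(left) >= capacity:
        raise ValueError("left sibling is full; a split is needed instead")
    keys = sorted(full_leaf + [new_key])
    left.append(keys.pop(0))               # smallest record moves left
    full_leaf[:] = keys                    # the rest stays in this leaf
    parent_keys[sep_idx] = full_leaf[0]    # separator follows the new boundary


index = [25, 50, 75]                       # assumed index page
left, leaf = [25, 28, 30], [50, 55, 60, 65]
rotate_to_left(index, 1, left, leaf, 70)
print(index, left, leaf)
# [25, 55, 75] [25, 28, 30, 50] [55, 60, 65, 70]
```

No new page is allocated, which is the saving that rotation is meant to provide.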
43Deletion Example
Delete 70 ⇒ OK since the fill factor stays at or above 50% (min
records in a node).
44Deletion Example
Now delete 25. Leaf OK (fill factor
satisfied). Index not OK, since 25 is also an index key ⇒ replace it with 28.
45Deletion Example
Delete 60 ⇒ fill factor < 50% ⇒ combine leaf
pages and index pages.
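The fill-factor test used in these deletion examples comes down to a small calculation. The sketch below assumes a page capacity of 4 records, as in the example tree, and uses hypothetical helper names.

```python
import math


def min_records(capacity, fill_factor=0.5):
    """Fewest records a page may hold under the given fill factor."""
    return math.ceil(capacity * fill_factor)


def underflows(records_after_delete, capacity=4, fill_factor=0.5):
    """True if the page must borrow from or combine with a sibling."""
    return records_after_delete < min_records(capacity, fill_factor)


print(underflows(2))  # False: 2 of 4 records is exactly 50%
print(underflows(1))  # True: 1 of 4 records is below 50%
```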
46Deletion Example