Tirgul 6 - PowerPoint PPT Presentation

Provided by: Noam6

1
Tirgul 6
  • B-Trees: another kind of balanced tree

2
Motivation
  • Primary memory (RAM) is very fast, but costly.
    Secondary storage (disk) is very cheap, but slow.
  • Problem: a large database must reside partially
    on disk, but disk operations are very slow.
  • Solution: take advantage of an important disk
    property. The basic read/write unit is a page
    (2-4 KB); we can't read or write less.
  • Thus, when analyzing database performance, we
    consider two different measures: CPU time and
    the number of times we need to access the disk.
  • Besides, B-trees are an interesting type of
    balanced tree...

3
B-Trees
  • B-Tree: a balanced search tree whose nodes can
    have many children.
  • A node x contains $n[x]$ keys, and has $n[x]+1$
    children ($c_1[x], c_2[x], \ldots, c_{n[x]+1}[x]$).
  • The keys in each node are ordered, and relate to
    their left and right sub-trees as in regular
    search trees: if $k_i$ is any key stored in the
    sub-tree rooted at $c_i[x]$, then
    $k_1 \le key_1[x] \le k_2 \le key_2[x] \le \cdots \le key_{n[x]}[x] \le k_{n[x]+1}$
  • All leaves have the same depth h (the tree's
    height).
  • There is a fixed integer t (the minimum degree).
  • Every node (besides the root) has at least t-1
    keys (i.e. an internal node has at least t
    children).
  • Every node can contain at most 2t-1 keys (2t
    children).

4
Example
[Figure: an example B-tree with t = 3; the key 50 appears in it.]

5
B-Trees and disk access (last time...)
  • Each node contains as many keys as possible
    without being larger than a single page on disk.
  • Whenever we need to access a node, we load it
    from the disk (one read operation); after
    changing a node, we rewrite it to the disk.
  • For example, say each node contains 1000 keys.
    Then the root has 1001 children, each of which
    also has 1001 children. Thus with just 2 disk
    accesses (assuming the root is kept in main
    memory) we are able to access more than $1000^3$
    records.
  • Operations are designed to work in one pass from
    the root to the leaves: we do not need to
    backtrack. This further reduces the number of
    disk accesses we make.
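
The arithmetic in the 1000-keys example can be checked with a few lines of Python. This is only a sketch: the function name `keys_per_level` is made up for illustration, and it assumes every node is completely full.

```python
def keys_per_level(keys_per_node, depth):
    """Keys stored at each depth 0..depth when every node is full."""
    children = keys_per_node + 1          # a node with n keys has n+1 children
    return [keys_per_node * children ** d for d in range(depth + 1)]


levels = keys_per_level(1000, 2)
print(levels)        # [1000, 1001000, 1002001000]
print(sum(levels))   # 1003003000
```

The leaves alone (depth 2) already hold more than $1000^3$ keys, and reaching any of them takes two node reads below the root.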

6
The height of a B-Tree
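  • Theorem: if $n \ge 1$, then for any B-tree of
    height h with n keys and minimum degree $t \ge 2$:
    $h \le \log_t \frac{n+1}{2}$
  • Proof: each child of the root has at least t
    children, each of them also has at least t
    children, and so on. Thus in every sub-tree of
    the root (each of height h-1) there are at least
    $1 + t + t^2 + \cdots + t^{h-1} = \frac{t^h - 1}{t - 1}$
    nodes. Each of them contains at least t-1 keys.
    The root contains at least one key and has at
    least two children, so we have
    $n \ge 1 + 2(t-1)\frac{t^h - 1}{t - 1} = 2t^h - 1$,
    and therefore $t^h \le \frac{n+1}{2}$, which gives the bound.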
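
The counting argument of the proof can be sanity-checked numerically. The sketch below (function names `min_keys` and `max_height` are illustrative, not from the slides) counts the fewest keys a B-tree of height h can hold, level by level, and confirms that this equals $2t^h - 1$, which is what forces $h \le \log_t \frac{n+1}{2}$.

```python
def min_keys(t, h):
    """Fewest keys in a B-tree of minimum degree t and height h."""
    # Root: 1 key. Depth i >= 1: at least 2 * t**(i-1) nodes, t-1 keys each.
    return 1 + (t - 1) * sum(2 * t ** (i - 1) for i in range(1, h + 1))


def max_height(t, n):
    """Largest height a B-tree with n >= 1 keys can have."""
    h = 0
    while min_keys(t, h + 1) <= n:
        h += 1
    return h


print(min_keys(3, 2))         # 17
print(max_height(1001, 10**9))
```

Integer arithmetic is used on purpose, so the check $2t^h \le n + 1$ does not depend on floating-point logarithms.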

7
B-Tree Search
  • Search is done in the regular way: in each node,
    we find the sub-tree in which our value might be,
    and recursively search for it there.
  • Performance: $O(th) = O(t \log_t n)$ total
    run time, out of which
  • $O(h) = O(\log_t n)$ are disk access operations.
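
A minimal in-memory sketch of the search, assuming a hypothetical `Node` class with a sorted `keys` list and a `children` list that is empty for leaves. In a disk-based tree, following a child pointer would be the one disk read per level.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []          # sorted keys of this node
        self.children = children or []  # empty list for a leaf


def search(x, k):
    """Return (node, index) such that node.keys[index] == k, or None."""
    while True:
        i = 0
        while i < len(x.keys) and k > x.keys[i]:   # O(t) scan inside the node
            i += 1
        if i < len(x.keys) and x.keys[i] == k:
            return x, i
        if not x.children:                          # reached a leaf: not found
            return None
        x = x.children[i]                           # descend (one disk read)


root = Node([10, 34], [Node([3, 7]), Node([17, 20, 25]), Node([39, 40])])
node, idx = search(root, 20)
print(node.keys, idx)      # [17, 20, 25] 1
print(search(root, 8))     # None
```

The linear scan matches the $O(t)$ per-node cost quoted above; a binary search inside the node would also work and does not change the number of disk accesses.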

8
B-Tree Split
  • Used for insertion. This operation ensures that
    a node has fewer than 2t-1 keys.
  • What we do is split the full node into two nodes,
    each with t-1 keys. The extra key goes into the
    node's parent (we assume the parent is not full).
  • To split a node x (see the next slide for an
    illustration), take $key_t[x]$ (notice it is the
    median key). All smaller keys (exactly t-1 of
    them) form one new (legal) node, and likewise
    all larger keys. $key_t[x]$ goes into x's parent.
  • If the node we split is the root, then a new root
    is created. This new root contains only one key.
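
The split can be sketched as follows, again on a hypothetical in-memory `Node` with `keys` and `children` lists. The names `split_child` and `split_root` are illustrative; the left sibling `[10, 20]` in the usage lines is made-up data.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty list for a leaf


def split_child(parent, i, t):
    """Split the full child parent.children[i]; parent must not be full."""
    full = parent.children[i]
    assert len(full.keys) == 2 * t - 1
    median = full.keys[t - 1]                        # the t-th key
    right = Node(full.keys[t:], full.children[t:])   # the t-1 larger keys
    full.keys = full.keys[:t - 1]                    # the t-1 smaller keys
    full.children = full.children[:t]
    parent.keys.insert(i, median)                    # median moves up
    parent.children.insert(i + 1, right)


def split_root(root, t):
    """Split a full root; the new root holds only the median key."""
    new_root = Node([], [root])
    split_child(new_root, 0, t)
    return new_root


parent = Node([50], [Node([10, 20]), Node([65, 83, 89, 95, 96])])
split_child(parent, 1, t=3)
print(parent.keys)                        # [50, 89]
print([c.keys for c in parent.children])  # [[10, 20], [65, 83], [95, 96]]
```

Only three nodes are touched (the two halves and the parent), which is why a split costs a constant number of disk writes.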

9
B-tree split
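[Figure: splitting a full node. Before: the parent holds the keys x and y, and the full child between them holds t-1 keys, the median m, and another t-1 keys. After: m sits in the parent between x and y, and the child has become two nodes with t-1 keys each.]
Notice that the parent has many other sub-trees
that don't change.
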
10
Example
[Figure: splitting a full node (t = 3). Keys shown: 50, 65, 83, 89, 95, 96.]

11
B-Tree Insert
  • We insert a key only into a leaf. We start from
    the root and go down to the appropriate leaf.
  • On the way, before going down to the next node,
    we check if it is full. If so, we split it (its
    parent is not full, because we checked this
    before going down to the parent).
  • When we reach the right leaf, we know that the
    leaf is not full, so we can simply insert the new
    value into the leaf.
  • Notice that we may need to split the root, if it
    is full. In this case, the tree's height
    increases (but the tree remains completely
    balanced!). That's why we say that a B-tree grows
    from the root, in contrast to most trees, which
    grow from the leaves...
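
One way to sketch the single-pass insertion in Python is shown below. The `Node` and `BTree` classes and their method names are assumptions for illustration, everything lives in memory, and the key sequence in the usage lines is sample data with t = 3.

```python
from bisect import bisect_right, insort


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []          # sorted keys
        self.children = children or []  # empty list for a leaf


class BTree:
    def __init__(self, t):
        self.t = t                      # minimum degree
        self.root = Node()

    def _is_full(self, node):
        return len(node.keys) == 2 * self.t - 1

    def _split_child(self, parent, i):
        """Split the full child parent.children[i]; parent is not full."""
        t, full = self.t, parent.children[i]
        median = full.keys[t - 1]
        right = Node(full.keys[t:], full.children[t:])
        full.keys, full.children = full.keys[:t - 1], full.children[:t]
        parent.keys.insert(i, median)
        parent.children.insert(i + 1, right)

    def insert(self, k):
        if self._is_full(self.root):    # the tree grows from the root
            self.root = Node([], [self.root])
            self._split_child(self.root, 0)
        x = self.root
        while x.children:               # x itself is never full here
            i = bisect_right(x.keys, k)
            if self._is_full(x.children[i]):
                self._split_child(x, i)
                if k > x.keys[i]:       # median moved up: pick the right half
                    i += 1
            x = x.children[i]
        insort(x.keys, k)               # the leaf is guaranteed not full


tree = BTree(t=3)
for key in [3, 7, 34, 10, 39, 25, 40, 20, 17]:
    tree.insert(key)
print(tree.root.keys)                        # [10, 34]
print([c.keys for c in tree.root.children])  # [[3, 7], [17, 20, 25], [39, 40]]
```

Because full nodes are split on the way down, the loop never has to walk back up to the parent.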

12
Example
We start with an empty tree (t = 3).
(I) Inserting 3, 7, 34, 10, 39
(II) Inserting 25 splits the root
(III) Inserting 40 and 20
(IV) Inserting 17 splits the right leaf
[Figure: the tree after each step.]

13
B-Tree Insert (cont.)
  • Performance:
      • Split:
          • three disk accesses (to write the 2 new
            nodes, and the parent)
          • $O(t)$ total run time
      • Insert:
          • $O(h)$ disk accesses
          • $O(t \log_t n)$ total run time
  • Requires $O(1)$ pages in main memory.

14
B-Tree Delete
  • Delete is a bit more complicated...
  • The basic idea: when we go down the tree, we
    make sure the next node has at least t keys, so
    we will be able to delete a value from a leaf.
  • For this we use merge, the inverse of split. If
    we have two adjacent children with t-1 keys each
    and a parent with at least t keys, we take one
    key from the parent and merge it with the two
    children to form a single node.
  • We also use rotation: moving a value from one
    child that has at least t keys to an adjacent
    sibling (see next slides).

15
Merge
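[Figure: merge, for a parent with at least t keys. Before: the parent holds the keys x, m, y, and the two children on either side of m have t-1 keys each. After: m has moved down, and the two children together with m form a single node with 2t-1 keys between x and y.]
Notice that the parent has many other sub-trees
that don't change.
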
16
(left) rotation
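[Figure: left rotation. Before: the parent key m separates a child with t-1 keys (left) from a sibling with more than t-1 keys (right), whose first key is X and whose first sub-tree is T. After: the left child holds its t-1 keys followed by m, with T as its last sub-tree; X has replaced m in the parent.]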
  • Rotation is done when we have two consecutive
    siblings, one with exactly t-1 keys and one with
    at least t keys.
  • Similarly we can do right rotation.
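
Merge and the two rotations can be sketched as small helpers on a hypothetical in-memory `Node` (the function names are illustrative). Here "left rotation" moves a key from the right sibling into the left child, and "right rotation" is its mirror image.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty list for a leaf


def merge_children(parent, i):
    """Merge children i and i+1 around parent.keys[i]; return the merged node."""
    left, right = parent.children[i], parent.children[i + 1]
    left.keys += [parent.keys.pop(i)] + right.keys   # t-1 + 1 + t-1 = 2t-1 keys
    left.children += right.children
    del parent.children[i + 1]
    return left


def rotate_left(parent, i):
    """Move one key from children[i+1], via parent.keys[i], into children[i]."""
    left, right = parent.children[i], parent.children[i + 1]
    left.keys.append(parent.keys[i])
    parent.keys[i] = right.keys.pop(0)
    if right.children:                               # its first sub-tree follows
        left.children.append(right.children.pop(0))


def rotate_right(parent, i):
    """Move one key from children[i], via parent.keys[i], into children[i+1]."""
    left, right = parent.children[i], parent.children[i + 1]
    right.keys.insert(0, parent.keys[i])
    parent.keys[i] = left.keys.pop()
    if left.children:
        right.children.insert(0, left.children.pop())


root = Node([10, 34], [Node([3, 7]), Node([17, 20, 25]), Node([39, 40])])
rotate_right(root, 1)
print(root.keys)                         # [10, 25]
print([c.keys for c in root.children])   # [[3, 7], [17, 20], [34, 39, 40]]
```

Each helper touches only the parent and two adjacent children, so the parent's other sub-trees are left unchanged.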

17
B-Tree Delete (1)
  • For each node x on the way to k (x is internal,
    and doesn't contain k), determine the sub-tree
    that should contain k (whose root is $c_i[x]$), and:
      • If $c_i[x]$ has at least t keys, simply delete
        k from it by recursion.
      • If $c_i[x]$ has only t-1 keys but has an
        adjacent sibling with at least t keys, give
        $c_i[x]$ an extra key by left/right rotation.
        Now recurse into $c_i[x]$.
      • Otherwise, merge $c_i[x]$ with one of its
        adjacent siblings (note that x itself has at
        least t keys, unless it is the root). Continue
        the deletion from the merged node.
  • Similarly to insertion, the tree's height
    decreases (only) when we perform a merge
    operation whose parent is the root, and the root
    contains only one key.

18
B-Tree Delete (2)
  • If we want to delete a key k and we have found
    the node x that contains k:
      • If x is a leaf, simply remove k from x (we
        assume x has at least t keys, which we verify
        during the descent; see also below).
      • If x is an internal node:
          • Let y be the left child of k. If y has at
            least t keys, then recursively delete the
            largest key from y (call it k') and
            replace k with k' in x.
          • Otherwise, do the same for the right child,
            z, of k, using the smallest key in z.
          • If both y and z have only t-1 keys, merge y,
            z, and k. Recursively delete k from the
            merged node (notice that the new node has
            2t-1 keys).
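
Putting both delete cases together, a compact in-memory sketch might look like this. All names are hypothetical, and several choices are assumptions rather than something the slides fix: the left sibling is tried first for a rotation, a merge prefers the right sibling, and the predecessor or successor is located by a short walk before being deleted recursively. The starting tree in the usage lines is sample data with t = 3.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty list for a leaf


def _merge(x, i):
    left, right = x.children[i], x.children[i + 1]
    left.keys += [x.keys.pop(i)] + right.keys
    left.children += right.children
    del x.children[i + 1]
    return left


def _fill(x, i, t):
    """Make sure x.children[i] has at least t keys; return the node to enter."""
    child = x.children[i]
    if len(child.keys) >= t:
        return child
    if i > 0 and len(x.children[i - 1].keys) >= t:            # right rotation
        left = x.children[i - 1]
        child.keys.insert(0, x.keys[i - 1])
        x.keys[i - 1] = left.keys.pop()
        if left.children:
            child.children.insert(0, left.children.pop())
        return child
    if i + 1 < len(x.children) and len(x.children[i + 1].keys) >= t:
        right = x.children[i + 1]                             # left rotation
        child.keys.append(x.keys[i])
        x.keys[i] = right.keys.pop(0)
        if right.children:
            child.children.append(right.children.pop(0))
        return child
    return _merge(x, i if i + 1 < len(x.children) else i - 1)


def _delete(x, k, t):
    i = 0
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    if i < len(x.keys) and x.keys[i] == k:
        if not x.children:                  # k is in a leaf: just remove it
            x.keys.pop(i)
            return
        y, z = x.children[i], x.children[i + 1]
        if len(y.keys) >= t:                # replace k by its predecessor k'
            node = y
            while node.children:
                node = node.children[-1]
            x.keys[i] = node.keys[-1]
            _delete(y, x.keys[i], t)
        elif len(z.keys) >= t:              # replace k by its successor
            node = z
            while node.children:
                node = node.children[0]
            x.keys[i] = node.keys[0]
            _delete(z, x.keys[i], t)
        else:                               # merge y, k, z and delete k there
            _delete(_merge(x, i), k, t)
    elif x.children:                        # k, if present, is further down
        _delete(_fill(x, i, t), k, t)


def delete(root, k, t):
    """Delete k (if present) and return the possibly new root."""
    _delete(root, k, t)
    if not root.keys and root.children:     # root emptied by a merge: shrink
        return root.children[0]
    return root


root = Node([10, 34], [Node([3, 7]), Node([17, 20, 25]), Node([39, 40])])
for key in [39, 20, 10, 40, 34]:
    root = delete(root, key, t=3)
print(root.keys)       # [3, 7, 17, 25]
```

In this run, deleting 39 triggers a right rotation, deleting 20 triggers a merge, and deleting 34 merges the last two leaves into the root so the tree loses a level.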

19
Performance of delete operation
  • $O(h)$ disk accesses
  • $O(th) = O(t \log_t n)$ total run time
  • The advantage: only one pass down the tree!

20
Example
Start
(I) Deleting 39 causes a right rotation
(II) Deleting 20 causes a merge
(III) Deleting 10
(IV) Deleting 40
(V) Deleting 34 causes a merge and the tree shrinks
[Figure: the tree at the start and after each deletion.]