Lecture 11 : B-Tree - PowerPoint PPT Presentation

1
Lecture 11: B-Tree
  • Bong-Soo Sohn
  • Assistant Professor
  • School of Computer Science and Engineering
  • Chung-Ang University

Lecture notes courtesy of David Matuszek
2
Binary Search Tree (BST) Problem
  • Consider disk access for BST
  • Disk access is much slower than memory access
  • Disk access
  • Seek time >> rotational delay > transfer time
  • Reducing seek time significantly affects overall
    performance
  • If we adopt a trivial method for storing a BST on
    disk,
  • Each visit to a child node involves one disk
    access.
  • That is inefficient.
  • We want to reduce the height of the tree by using
    a multiway search tree
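
To get a feel for the gain, the sketch below compares the smallest possible number of levels in a binary tree and in an m-way tree holding the same number of keys. The function name min_height and the sample figures (one million keys, m = 100) are illustrative assumptions, not taken from the lecture.

```python
def min_height(num_keys: int, m: int) -> int:
    """Smallest number of levels an m-way search tree needs for num_keys keys.

    Each node holds at most m - 1 keys and has at most m children, so a
    completely full tree with h levels stores m**h - 1 keys.
    """
    height = 0
    capacity = 0            # keys that fit in `height` completely full levels
    nodes_on_level = 1
    while capacity < num_keys:
        capacity += nodes_on_level * (m - 1)
        nodes_on_level *= m
        height += 1
    return height


n = 1_000_000
print("binary (m=2):", min_height(n, 2), "levels")    # 20 levels
print("100-way     :", min_height(n, 100), "levels")  # 4 levels
```

If every level costs one disk access, the wide tree needs about a fifth of the accesses in this best case.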

3
m-way search tree
  • A non-empty node has M subtrees (2 <= M <= m)
  • Therefore, it has M-1 keys (elements)
  • The values in a node are stored in ascending
    order, V1 < V2 < ... < Vk (k <= M-1)
  • Subtrees are placed between adjacent values, with
    one additional subtree at each end.
  • We can thus associate with each value a 'left'
    and a 'right' subtree
  • The right subtree of V(i) is the same as the left
    subtree of V(i+1).
  • All the values in V1's left subtree are less than
    V1; all the values in Vk's right subtree are
    greater than Vk; and all the values in the subtree
    between V(i) and V(i+1) are greater than V(i) and
    less than V(i+1).
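
One consequence of this ordering rule is that visiting "subtree, value, subtree, value, ..., subtree" at every node produces all values in ascending order. The following is a minimal sketch of that traversal; the class name MWayNode, the use of None for an empty subtree, and the sample 3-way tree are assumptions made for illustration.

```python
from typing import Iterator, List, Optional


class MWayNode:
    def __init__(self, values: List[int],
                 subtrees: Optional[List[Optional["MWayNode"]]] = None):
        self.values = values
        # One subtree before each value plus one after the last value.
        if subtrees is None:
            subtrees = [None] * (len(values) + 1)
        self.subtrees = subtrees


def in_order(node: Optional[MWayNode]) -> Iterator[int]:
    """Yield every value stored under node in ascending order."""
    if node is None:
        return
    for i, value in enumerate(node.values):
        yield from in_order(node.subtrees[i])   # subtree to the left of this value
        yield value
    yield from in_order(node.subtrees[-1])      # subtree to the right of the last value


# A hypothetical 3-way search tree: each node has at most 2 values.
root = MWayNode([20, 40], [MWayNode([5, 10]), MWayNode([30]), MWayNode([50, 60])])
print(list(in_order(root)))  # [5, 10, 20, 30, 40, 50, 60]
```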

4
3-way search tree
5
B-Tree
  • A B-tree of order m has the following properties
  • It is an m-way search tree
  • Keys in a node are in increasing order.
  • The root node (if not a leaf node) has at least
    two children
  • All nodes other than the root node have at least
    ceil(m/2) - 1 keys (that is, at least ceil(m/2)
    children)
  • All external nodes are at the same level
  • Mostly used in database systems

6
B-Tree
  • A variation on binary search trees that allows
    quick searching in files on disk
  • Instead of storing one key and having two
    children, B-tree nodes have (up to) n keys and
    n+1 children, where n can be large
  • This shortens the tree (in terms of height) and
    requires far fewer disk accesses than a binary
    search tree would
  • The algorithm is more complex and requires more
    computation, but computation is much cheaper than
    disk access

7
Disk Access
  • Platter
  • Track
  • Sector (typical size: 512 B)
  • Block: the read/write unit, several consecutive
    sectors
  • Store related data in one block
  • Locality?
  • B-trees exploit (spatial) locality
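
To make the link between block size and node size concrete, here is a rough sketch that estimates how many keys fit in a node occupying exactly one block. The function name and all figures (a block of 8 sectors, 8-byte keys, 8-byte child pointers, no per-node header) are assumptions chosen for illustration, not values from the lecture.

```python
def max_keys_per_node(block_bytes: int, key_bytes: int, pointer_bytes: int) -> int:
    """Largest n such that n keys and n + 1 child pointers fit in one block."""
    # Constraint: n * key_bytes + (n + 1) * pointer_bytes <= block_bytes
    return (block_bytes - pointer_bytes) // (key_bytes + pointer_bytes)


# Assumed figures: a block of 8 sectors of 512 B, 8-byte keys, 8-byte pointers.
block = 8 * 512
n = max_keys_per_node(block, key_bytes=8, pointer_bytes=8)
print(n, "keys and", n + 1, "children per node")  # 255 keys and 256 children per node
```

Under these assumptions a single block read brings in hundreds of keys that are all relevant to the current search step, which is the spatial locality the slide refers to.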

8
B-Tree
  • B-tree nodes have a variable number of keys and
    children, subject to some constraints.
  • In many respects, they work just like binary
    search trees, but are considerably "fatter."

9
B-Tree
  • Every node has the following fields
  • x.n, the number of keys currently in node x. For
    example, in the example B-tree the node [40 50]
    has n = 2 and the node [70 80 90] has n = 3.
  • The x.n keys themselves, stored in nondecreasing
    order: x.key_1 <= x.key_2 <= ... <= x.key_{x.n}.
    For example, the keys in [70 80 90] are ordered.
  • x.leaf, a boolean value that is True if x is a
    leaf and False if x is an internal node.
  • If x is an internal node, it contains x.n + 1
    pointers x.c_1, x.c_2, ..., x.c_{x.n}, x.c_{x.n+1}
    to its children.
  • Leaf nodes have no children, so their c_i fields
    are undefined.
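
These fields map naturally onto a small record type. The sketch below is one possible Python rendering; the class name BTreeNode, the zero-based lists (so x.key_1 becomes keys[0]), and the parent node with key 60 are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BTreeNode:
    # x.key_1 .. x.key_n, kept in nondecreasing order
    keys: List[int] = field(default_factory=list)
    # x.c_1 .. x.c_{n+1}; left empty for a leaf
    children: List["BTreeNode"] = field(default_factory=list)

    @property
    def n(self) -> int:
        """x.n: number of keys currently stored in the node."""
        return len(self.keys)

    @property
    def leaf(self) -> bool:
        """x.leaf: True when the node has no children."""
        return not self.children


# The two nodes mentioned on the slide, built here as leaves.
a = BTreeNode(keys=[40, 50])
b = BTreeNode(keys=[70, 80, 90])
# A hypothetical internal node: one key, hence n + 1 = 2 children.
parent = BTreeNode(keys=[60], children=[a, b])
print(a.n, b.n, parent.leaf)  # 2 3 False
```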

10
B-Tree
  • The keys x.key_i separate the ranges of keys
    stored in each subtree: if k_i is any key stored
    in the subtree with root x.c_i, then k_1 <=
    x.key_1 <= k_2 <= x.key_2 <= ... <= x.key_{x.n}
    <= k_{x.n+1}.
  • Every leaf has the same depth, which is the
    tree's height h.
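
Both properties can be checked mechanically. The following is a sketch of such a checker, assuming a simple Node class with a keys list and a children list (both names are hypothetical); it verifies the key ranges and the equal leaf depth, and deliberately does not check minimum occupancy.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def check(node: Node, lo: Optional[int] = None, hi: Optional[int] = None) -> int:
    """Return the height of the subtree at node, or raise AssertionError.

    lo and hi are the separator keys inherited from the ancestors; every
    key in this subtree must satisfy lo <= key <= hi.
    """
    assert node.keys == sorted(node.keys), "keys out of order"
    for k in node.keys:
        assert lo is None or lo <= k, "key below its range"
        assert hi is None or k <= hi, "key above its range"
    if not node.children:
        return 0  # a leaf has height 0
    assert len(node.children) == len(node.keys) + 1, "need n + 1 children"
    bounds = [lo] + node.keys + [hi]
    heights = {
        check(child, bounds[i], bounds[i + 1])
        for i, child in enumerate(node.children)
    }
    assert len(heights) == 1, "leaves at different depths"
    return heights.pop() + 1


good = Node([60], [Node([40, 50]), Node([70, 80, 90])])
print(check(good))  # 1
```

A tree such as Node([60], [Node([40, 65]), Node([70])]) fails the check, because 65 lies in the subtree to the left of 60.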

11
B-Tree Search
  • Performed just like a binary search tree search,
    except that at each node we choose among up to m
    subtrees instead of two.
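
A minimal sketch of such a search is given below, assuming a Node class with a keys list and a children list (hypothetical names). It scans the keys of a node linearly; a binary search within the node would work equally well.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def search(x: Node, k: int) -> Optional[Tuple[Node, int]]:
    """Return (node, index) with node.keys[index] == k, or None if k is absent."""
    i = 0
    # Find the first key in this node that is >= k.
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    if i < len(x.keys) and x.keys[i] == k:
        return x, i
    if not x.children:                 # reached a leaf without finding k
        return None
    # Descend into the subtree between keys[i-1] and keys[i];
    # in a disk-based tree this is where one block would be read.
    return search(x.children[i], k)


root = Node([60], [Node([40, 50]), Node([70, 80, 90])])
hit = search(root, 80)
print(hit[0].keys, hit[1])   # [70, 80, 90] 1
print(search(root, 65))      # None
```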

12
Insert value X into a B-tree
  • 1. Using the SEARCH procedure for M-way trees
    (described above), find the leaf node to which X
    should be added
  • 2. Add X to this node in the appropriate place
    among the values already there
  • 3. If there are M-1 or fewer values in the node
    after adding X, then we are finished
  • 4. If there are M values after adding X, we say
    the node has overflowed

13
When overflowed during insertion
  • Split the overflowed node into three parts
  • Left: the first (M-1)/2 values
  • Middle: the middle value (position 1 + (M-1)/2)
  • Right: the last (M-1)/2 values
  • Notice that Left and Right have just enough
    values to be made into individual nodes. That's
    what we do... they become the left and right
    children of Middle, which we add in the
    appropriate place in this node's parent.
  • What if there is no room in the parent? If it
    overflows, we do the same thing again: split it
    into Left-Middle-Right, make Left and Right into
    new nodes, and add Middle (with Left and Right as
    its children) to the node above.
  • We continue doing this until no overflow occurs,
    or until the root itself overflows. If the root
    overflows, we split it, as usual, and create a
    new root node with Middle as its only value and
    Left and Right as its children (as usual).
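
The insertion and splitting steps can be sketched in code as follows. This is an illustrative implementation under stated assumptions, not the lecture's own: the class names Node and BTree are hypothetical, the order M is assumed to be at least 3, the overflowed node is reused as Left, and for even M the Right part simply receives one more value than Left.

```python
from bisect import bisect_left, insort
from typing import List, Optional, Tuple


class Node:
    def __init__(self, values: List[int], children: Optional[List["Node"]] = None):
        self.values = values
        self.children = children or []      # empty list for a leaf


class BTree:
    def __init__(self, order: int):
        self.M = order                      # a node may hold at most M - 1 values
        self.root = Node([])

    def insert(self, x: int) -> None:
        split = self._insert(self.root, x)
        if split is not None:               # the root itself overflowed
            middle, right = split
            self.root = Node([middle], [self.root, right])

    def _insert(self, node: Node, x: int) -> Optional[Tuple[int, Node]]:
        if not node.children:               # steps 1-2: add X to the leaf
            insort(node.values, x)
        else:
            i = bisect_left(node.values, x)
            split = self._insert(node.children[i], x)
            if split is None:
                return None
            middle, right = split           # child overflowed: absorb its Middle
            node.values.insert(i, middle)
            node.children.insert(i + 1, right)
        if len(node.values) < self.M:       # step 3: M - 1 or fewer values
            return None
        return self._split(node)            # step 4: overflow

    def _split(self, node: Node) -> Tuple[int, Node]:
        mid = (self.M - 1) // 2             # zero-based position of Middle
        middle = node.values[mid]
        right = Node(node.values[mid + 1:], node.children[mid + 1:])
        node.values = node.values[:mid]     # node is reused as Left
        node.children = node.children[:mid + 1]
        return middle, right


# Assumed setup: an initially empty tree of order 3.
t = BTree(3)
for x in (17, 6, 21, 67):
    t.insert(x)
print(t.root.values, [c.values for c in t.root.children])  # [17] [[6], [21, 67]]
```

In this run the third insertion overflows the root [6, 17, 21], which is split into Left [6], Middle 17 and Right [21]; the fourth value then fits into the right leaf without a further split.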

14
Example: Insert 17, 6, 21, 67
15
B-Tree Deletion
  • Not covered here.

16
B-Tree Summary
  • B-Tree
  • Perfectly balanced
  • Every leaf node is at the same depth
  • Every node except root node is at least half full
  • Rebalancing is not so frequent
  • Fewer disk accesses when the tree is stored on
    disk
  • Make the size of one node be one or more disk
    blocks to improve efficiency of disk accesses.
  • B-Tree height: O(log N)
  • Search/insert/delete: O(log N) amortized