BTree - PowerPoint PPT Presentation

Provided by: csSung
1
B-Tree
2
Definition
  • The order of a B-tree, m
    • the maximum number of descendants that a page can have
  • Every page except the root and the leaves has at least ⌈m/2⌉ descendants.
  • The root has at least two descendants.
  • All the leaves appear on the same level.
  • The leaf level forms a complete, ordered index.
  • Insert a new key: split (or redistribution)
  • Delete a key: merge or redistribution

3
Variations
  • Page (Node) structure
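  • max. m keys and max. m descendants
  • max. m keys and max. (m+1) descendants

Page layout (m keys, m descendants): m | K1 D1 | K2 D2 | ... | Km-1 Dm-1 | Km Dm
Subtree key ranges: k < K1, ..., Km-1 < k < Km
Page layout (m keys, m+1 descendants): m | D0 | K1 D1 | K2 D2 | ... | Km-1 Dm-1 | Km Dm
Subtree key ranges: k < K1, ..., Km-1 < k < Km, Km < k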
4
Variations
  • no Key duplication in B-tree

Page layout: m | D0 | K1 D1 | K2 D2 | ... | Km-1 Dm-1 | Km Dm
Subtree key ranges: k < K1, ..., Km-1 < k < Km, Km < k
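A minimal Python sketch of how a search might choose a descendant in this page layout. It assumes the keys K1..Km sit in a sorted list and that no key is duplicated across levels. The name `search_page` and its return convention are illustrative and do not come from the slides.

```python
import bisect

def search_page(keys, k):
    """Look for k in one page whose keys K1..Km are stored as keys[0..m-1].

    Returns ("found", i) if k equals keys[i], otherwise ("descend", i),
    where i is the index of the descendant pointer D_i to follow.
    """
    i = bisect.bisect_left(keys, k)      # number of keys strictly less than k
    if i < len(keys) and keys[i] == k:
        return ("found", i)
    return ("descend", i)                # D0 if k < K1, Dm if k > Km

print(search_page(["D", "M", "T"], "P"))   # ('descend', 2)
```

Binary search within the page keeps the per-page cost logarithmic in m, which matters when m is in the hundreds.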
5
Variations
  • <Key, Record-ref>
    • <key, ref> order?
  • Key-only
    • record ref. only in the leaf nodes

Page layout: m | D0 | <K1,R1> D1 | ... | <Km-1,Rm-1> Dm-1 | <Km,Rm> Dm
Subtree key ranges: k < K1, ..., Km-1 < k < Km, Km < k
Record references point into the Data File.
6
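Variations
Page layout (root and internal pages): m | D0 K0 | D1 K1 | D2 ... Km-2 | Dm-1 Km-1
Subtree key ranges: k < K0, ..., Km-2 < k < Km-1
Page layout (leaf pages): m | R0 K0 | R1 ... Km-2 | Rm-1 Km-1
Levels, top to bottom: root, internal pages, leaf pages, data file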
7
Worst-Case Analysis
  • In the worst case
    • the root has only 2 descendants
    • the minimal breadth of all the other nodes is ⌈m/2⌉
  • Let d be the height of a B-tree

Level   Minimum number of descendants
1       2
2       2 × ⌈m/2⌉
3       2 × ⌈m/2⌉^2
...     ...
d       2 × ⌈m/2⌉^(d-1)
8
Worst-case Analysis
  • The upper bound for the depth of a B-tree with N keys
    • N ≥ 2 × ⌈m/2⌉^(d-1)
    • d ≤ 1 + log_⌈m/2⌉(N/2)
  • 1,000,000 keys, order 512 B-tree
    • d ≤ 1 + log_⌈512/2⌉(1,000,000/2)
    • d ≤ 3.37
  • An order 512 B-tree with 1,000,000 keys has a depth of no more than 3 levels
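
The arithmetic above can be checked with a short Python sketch. The function names `min_descendants` and `depth_bound` are illustrative. The formulas are the ones on the slides: the worst-case number of descendants per level and the resulting bound on d.

```python
import math

def min_descendants(level, order):
    """Worst-case number of descendants leaving `level` (level 1 is the root)."""
    return 2 * math.ceil(order / 2) ** (level - 1)

def depth_bound(n_keys, order):
    """Upper bound on the depth d, from N >= 2 * ceil(m/2)**(d-1)."""
    return 1 + math.log(n_keys / 2, math.ceil(order / 2))

print(min_descendants(3, 512))                  # 131072
print(round(depth_bound(1_000_000, 512), 2))    # 3.37
```

Since d must be a whole number of levels, the bound of 3.37 means the tree is at most 3 levels deep.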

9
Insertion Rules
  • B-tree
    • max. (m-1) keys and max. m descendants
    • key-only; record ref. only in the leaf nodes
    • duplicate keys
  • Insert a key k into a node with n keys in an order m B-tree
    • If n < (m-1), simply insert k
      • update the parent node if k is the new largest key of the node
    • If n = (m-1) (overfull)
      • split the node (with m keys) into 2 nodes
      • update the parent node
        • insert a new key for the new node
        • update the old key (in the parent) for the old node if k is the new largest key of the node
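
The node-level part of these rules can be sketched in Python as follows. This is a sketch under stated assumptions, not the presenter's implementation. A node is modelled as a sorted list of keys holding at most order - 1 keys, and the parent stores the largest key of each child. The names `insert_key` and `parent_keys` are hypothetical.

```python
import bisect

def insert_key(keys, k, order):
    """Insert k into one node's sorted key list.

    Returns a list with one key list (no split) or two (the node was split).
    Assumes a node may hold at most order - 1 keys.
    """
    keys = keys[:]                  # leave the caller's list untouched
    bisect.insort(keys, k)
    if len(keys) <= order - 1:
        return [keys]
    mid = (len(keys) + 1) // 2      # the left node gets the extra key if odd
    return [keys[:mid], keys[mid:]]

def parent_keys(nodes):
    """Keys the parent would hold: the largest key of each child node."""
    return [node[-1] for node in nodes]

parts = insert_key(["C", "D", "S"], "T", order=4)
print(parts)                # [['C', 'D'], ['S', 'T']]
print(parent_keys(parts))   # ['D', 'T']
```

Recomputing the parent's keys from the children covers both parent updates in the rules: the new key for the new node and the changed largest key of the old node.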

10
Create a B-tree
  • Key sequence
    • C S D T A M P I B W N G U R K E H O L J Y Q Z F X V
  • order 4 B-tree
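
A hedged Python sketch of building a tree from this key sequence. It assumes the variant in which a node holds at most order - 1 keys and each internal key is the largest key under the matching child. `Node`, `insert`, `build`, `leaf_keys` and `leaf_depths` are illustrative names. The split point chosen here may differ from the one used in the original figures.

```python
import bisect

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys            # sorted; in an internal node keys[i] is
        self.children = children    # the largest key under children[i]

def insert(node, k, max_keys):
    """Insert k below node; return the 1 or 2 nodes that replace it."""
    if node.children is None:
        bisect.insort(node.keys, k)
    else:
        # First child whose largest key is >= k, else the last child.
        i = min(bisect.bisect_left(node.keys, k), len(node.keys) - 1)
        parts = insert(node.children[i], k, max_keys)
        node.children[i:i + 1] = parts
        node.keys[i:i + 1] = [p.keys[-1] for p in parts]
    if len(node.keys) <= max_keys:
        return [node]
    mid = (len(node.keys) + 1) // 2
    kids = node.children
    right = Node(node.keys[mid:], None if kids is None else kids[mid:])
    node.keys = node.keys[:mid]
    if kids is not None:
        node.children = kids[:mid]
    return [node, right]

def build(keys, order):
    root = Node([])
    for k in keys:
        parts = insert(root, k, order - 1)
        if len(parts) == 2:         # the root split, so the tree grows a level
            root = Node([p.keys[-1] for p in parts], parts)
    return root

def leaf_keys(node):
    if node.children is None:
        return list(node.keys)
    return [k for child in node.children for k in leaf_keys(child)]

def leaf_depths(node, depth=1):
    if node.children is None:
        return {depth}
    return set().union(*(leaf_depths(c, depth + 1) for c in node.children))

root = build("CSDTAMPIBWNGURKEHOLJYQZFXV", order=4)
print("".join(leaf_keys(root)))     # ABCDEFGHIJKLMNOPQRSTUVWXYZ
```

Reading the leaves left to right yields the keys in order, and `leaf_depths(root)` contains a single value because the tree only grows at the root.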

11
Deletion Rules
  • Delete a key k from a node with n keys in an order m B-tree
    • If n > ⌈m/2⌉, simply delete k
      • update the parent node if k was the largest key of the node
    • If n = ⌈m/2⌉ (underfull)
      • one of the siblings has no extra keys (only ⌈m/2⌉ keys), then
        • merge the node with the sibling ( s + (n-1) > ⌈m/2⌉ )
        • delete a key from the parent node
        • update the parent node if k was the largest key of the two nodes before the merge
      • one of the siblings has extra keys, then
        • redistribute all the keys of the node and the sibling
        • update the old keys (in the parent) for the two nodes
    • a sibling is a node at the same level with the same parent
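
The choice between plain deletion, merging and redistribution can be sketched in Python. The assumptions are: nodes are sorted key lists, the sibling is the right-hand neighbour under the same parent, and the minimum key count is passed in as `min_keys` (⌈m/2⌉ in the rules above). The function name `delete_key` and its return shape are illustrative.

```python
def delete_key(node, sibling, k, min_keys):
    """Delete k from node and repair underflow using the right sibling.

    Returns (action, node_keys, sibling_keys), where action is
    "delete", "redistribute" or "merge".
    """
    node = node[:]                  # work on a copy
    node.remove(k)                  # raises ValueError if k is absent
    if len(node) >= min_keys:
        return ("delete", node, sibling[:])
    combined = node + sibling       # still sorted: node keys < sibling keys
    if len(sibling) > min_keys:     # the sibling has extra keys
        half = (len(combined) + 1) // 2
        return ("redistribute", combined[:half], combined[half:])
    return ("merge", combined, [])  # the sibling's entry leaves the parent

print(delete_key(["A", "B"], ["C", "D"], "A", min_keys=2))
# ('merge', ['B', 'C', 'D'], [])
```

In each case the parent's keys for the affected nodes need refreshing, because the largest key of a node may have changed.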

12
Delete keys
  • C, P, H

13
Merge and Redistribute
Delete C - can only merge
Delete W - can only redistribute
Delete M - both are possible
14
Ways to redistribute
  • By moving only one key from the sibling
  • Dividing the keys as evenly as possible
  • Redistribution during insertion
    • place the overflowing keys into other nodes (siblings) if possible
  • Space utilization
    • only splitting: 67%
    • mixed: over 86%
  • B*-tree
    • Every page except for the root has at least ⌈(2m-1)/3⌉ descendants
    • How to manage (split) the initial root
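
The two ways to redistribute listed at the top of this slide can be compared with a small Python sketch. It assumes the underfull node is the left neighbour and the sibling is the right neighbour, both given as sorted key lists. `move_one` and `split_evenly` are hypothetical names.

```python
def move_one(node, sibling):
    """Move only the smallest key of the right sibling into the node."""
    return node + sibling[:1], sibling[1:]

def split_evenly(node, sibling):
    """Pool the keys and divide them as evenly as possible."""
    pooled = node + sibling
    half = (len(pooled) + 1) // 2
    return pooled[:half], pooled[half:]

print(move_one(["A"], ["B", "C", "D", "E", "F"]))
# (['A', 'B'], ['C', 'D', 'E', 'F'])
print(split_evenly(["A"], ["B", "C", "D", "E", "F"]))
# (['A', 'B', 'C'], ['D', 'E', 'F'])
```

Moving one key touches less data. An even split leaves both nodes further from the next underflow or overflow.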

15
Indexed Sequential File Access and B+-Tree
16
ISAM
  • Indexed Sequential Access Method
  • Applications that need both
  • sequential access of the file, and
  • indexed access

17
B+-Tree
  • Data File (Sequence Set)
    • A (doubly) linked list of blocks
    • Records in a block are sorted by key value
    • The order of the records is not necessarily physically sequential throughout the file
  • Maintaining the Sequence Set
    • split, merge, redistribute
    • overflow; underflow when a block becomes less than half full
  • B-Tree (Index Set)
    • The leaf node (block) contains the representative keys (separators) of the blocks in the sequence set and the block addresses
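
A minimal Python sketch of a sequence set as a doubly linked list of blocks that split on overflow. Keys stand in for whole records, and the linear walk in `find_block` stands in for the index set; both are simplifying assumptions. `Block`, `find_block`, `insert_record` and `scan` are illustrative names.

```python
import bisect

class Block:
    def __init__(self, records=None):
        self.records = records or []    # sorted keys stand in for records
        self.prev = None
        self.next = None

def find_block(first, key):
    """Walk the list to the block that should hold key (index set stand-in)."""
    block = first
    while block.next is not None and block.next.records[0] <= key:
        block = block.next
    return block

def insert_record(block, key, capacity):
    """Insert key into block; on overflow split it and link in the new block.

    Returns the new block, or None if no split was needed.
    """
    bisect.insort(block.records, key)
    if len(block.records) <= capacity:
        return None
    mid = len(block.records) // 2
    new = Block(block.records[mid:])
    block.records = block.records[:mid]
    new.prev, new.next = block, block.next
    if block.next is not None:
        block.next.prev = new
    block.next = new
    return new

def scan(first):
    """Sequential access: follow the next links from the first block."""
    block = first
    while block is not None:
        yield from block.records
        block = block.next

head = Block()
for key in "DACBFEG":
    insert_record(find_block(head, key), key, capacity=4)
print("".join(scan(head)))   # ABCDEFG
```

The blocks can live anywhere in the file; only the links define the logical order, which is why records need not be physically sequential.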

18
(No Transcript)
19
Shortest separator
20
Prefix B+-Tree
  • Index Set
    • contains shortest separators or prefixes of the keys rather than copies of the actual keys
    • variable-length
    • the maximum number of descendants can vary
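
One way to compute a shortest separator between two adjacent blocks, sketched in Python. It assumes the convention that a search key smaller than the separator goes left and any other key goes right, so the separator s must satisfy left < s ≤ right. The function name `shortest_separator` is hypothetical.

```python
def shortest_separator(left, right):
    """Shortest prefix of `right` that still compares greater than `left`.

    left  : largest key in the block on the left
    right : smallest key in the block on the right
    """
    if not left < right:
        raise ValueError("keys must satisfy left < right")
    for length in range(1, len(right) + 1):
        if right[:length] > left:
            return right[:length]
    return right    # not reached when left < right; kept for safety

print(shortest_separator("BOLEN", "CAGE"))      # C
print(shortest_separator("CAMP", "CAMPBELL"))   # CAMPB
```

Because separators differ in length, the number that fit in one index block varies, which is why the maximum number of descendants is not fixed.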