1. B-trees - Hashing
2. Review: B-trees and B+-trees
- Multilevel, disk-aware, balanced index methods
- primary or secondary
- dense or sparse
- supports selection and range queries
- B+-trees: the most common indexing structure in databases - all actual values stored on leaf nodes.
- Optimality:
  - space O(N/B), updates O(log_B(N/B)), queries O(log_B(N/B) + K/B) (B is the fan-out of a node, K the number of matching entries)
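For a concrete feel (illustrative numbers, not from the slides): with fan-out B = 100 and N = 10^8 entries, the index occupies about N/B = 10^6 blocks, a point lookup reads about log_100(10^6) = 3 index levels, and a range query returning K matches scans roughly K/B further leaf blocks.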
3. B+-tree Example
[Figure: B+-tree with root (100); internal nodes (30) and (120 150 180); leaves (3 5 11), (30 35), (100 101 110), (120 130), (150 156 179), (180 200).]
4. B+-tree rules: tree of order n
- (1) All leaves at same lowest level (balanced tree)
- (2) Pointers in leaves point to records except for sequence pointer
- (3) Number of pointers/keys for a B+-tree:

                        Max ptrs   Max keys   Min ptrs    Min keys
  Non-leaf (non-root)      n         n-1      ⌈n/2⌉       ⌈n/2⌉ - 1
  Leaf (non-root)          n         n-1      ⌈(n-1)/2⌉   ⌈(n-1)/2⌉
  Root                     n         n-1      2           1
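For instance, plugging n = 4 (the order used in the insertion examples below) into the table: a non-leaf node holds 2-4 pointers and 1-3 keys, a leaf holds ⌈3/2⌉ = 2 to 3 keys, and the root may shrink to 2 pointers and 1 key.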
5. Insert into B+-tree
- (a) simple case: space available in leaf
- (b) leaf overflow
- (c) non-leaf overflow
- (d) new root
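A minimal Python sketch of the leaf-level steps, cases (a) and (b); class and function names are illustrative, and the parent/root updates of cases (c) and (d) are omitted:

```python
import bisect

class Leaf:
    def __init__(self, order):
        self.order = order        # a leaf holds at most order - 1 keys
        self.keys = []
        self.next = None          # sequence pointer to the next leaf

def insert_into_leaf(leaf, key):
    """Insert key; on overflow, split and return (new_leaf, separator)."""
    bisect.insort(leaf.keys, key)
    if len(leaf.keys) < leaf.order:       # case (a): space available
        return None
    mid = len(leaf.keys) // 2             # case (b): leaf overflow -> split
    new = Leaf(leaf.order)
    new.keys, leaf.keys = leaf.keys[mid:], leaf.keys[:mid]
    new.next, leaf.next = leaf.next, new  # keep the leaf chain linked
    return new, new.keys[0]               # separator goes up to the parent
```

The separator returned by a leaf split is copied (not moved) into the parent, since in a B+-tree every actual value must remain on a leaf.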
6.-9. Insertion examples (n = 4)
[Figures: four worked insertions on small B+-trees illustrating cases (a)-(d); visible node contents include root (100) with subtrees (30 -> 3 5 11, 30 31) and (120 150 180 -> 120 130, 150 156 179, 180 200), and a tree (10 20 30 -> 1 2 3, 10 12, 20 25, 30 32 40).]
10. Deletion from B+-tree
- (a) Simple case - no example
- (b) Coalesce with neighbor (sibling)
- (c) Re-distribute keys
- (d) Cases (b) or (c) at non-leaf
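A sketch of the leaf-level decision between cases (b) and (c), assuming the minimum occupancy from the order-n table; `parent.update_separator` and `parent.remove_child` are hypothetical helpers, not the slides' API:

```python
def handle_underflow(leaf, left_sibling, parent, min_keys):
    """After a deletion leaves `leaf` below min_keys (names illustrative)."""
    if len(left_sibling.keys) > min_keys:
        # case (c): sibling can spare a key -- redistribute
        leaf.keys.insert(0, left_sibling.keys.pop())
        parent.update_separator(leaf, leaf.keys[0])   # hypothetical helper
    else:
        # case (b): both nodes fit in one page -- coalesce into the sibling,
        # then drop `leaf` and its separator from the parent
        left_sibling.keys.extend(leaf.keys)
        parent.remove_child(leaf)                     # hypothetical helper
```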
11. (b) Coalesce with sibling
- Delete 50
[Figure: n = 5 tree with parent (10 40 100) over leaves (10 20 30) and (40 50); after deleting 50, the underfull leaf coalesces with its sibling into (10 20 30 40).]
12. (c) Redistribute keys
- Delete 50
[Figure: n = 5 tree with parent (10 40 100) over leaves (10 20 30 35) and (40 50); after deleting 50, key 35 moves from the left sibling into the underfull leaf and the parent separator becomes 35.]
13. (d) Non-leaf coalesce
- Delete 37
[Figure: n = 5 tree with root (25), internal nodes (10 20) and (30 40), and leaves (1 3), (10 14), (20 22), (25 26), (30 37), (40 45); deleting 37 causes an underflow that propagates upward, forcing a coalesce at the non-leaf level.]
14. Bulk Loading of a B+ Tree
- If we have a large collection of records, and we want to create a B+ tree on some field, doing so by repeatedly inserting records is very slow.
- Bulk Loading can be done much more efficiently.
- Initialization: Sort all data entries, insert pointer to first (leaf) page in a new (root) page.
[Figure: new root page pointing to the first of the sorted pages of data entries not yet in the B+ tree.]
15. Bulk Loading (Contd.)
- Index entries for leaf pages always entered into right-most index page just above leaf level.
- Assume pages in the rightmost path to have double page size.
- Split when double plus one.
[Figure: two snapshots of the build over sorted data entry pages (3 4), (6 9), (10 11), (12 13), (20 22), (23 31), (35 36), (38 41), (44 ...), the rest not yet in the B+ tree. First the root holds (6 10 12 20); after the rightmost index page splits, the root holds (12) over index pages (6 10) and (20 23).]
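A simplified sketch of the idea, assuming the entries are already sorted; it packs leaves left to right and builds each index level bottom-up, rather than maintaining only the rightmost path incrementally as the slides do:

```python
def bulk_load(entries, fanout):
    """Pack sorted data entries into leaves, then build index levels
    bottom-up until a single root remains (simplified sketch)."""
    # leaves are filled sequentially -- this is why bulk loading yields
    # sequentially stored leaves and few I/Os
    nodes = [entries[i:i + fanout] for i in range(0, len(entries), fanout)]
    keys = [node[0] for node in nodes]      # smallest key under each node
    while len(nodes) > 1:
        parents, pkeys = [], []
        for i in range(0, len(nodes), fanout):
            # an index page pairs separator keys with child pointers
            parents.append(list(zip(keys[i:i + fanout], nodes[i:i + fanout])))
            pkeys.append(keys[i])
        nodes, keys = parents, pkeys
    return nodes[0]                          # the root

# e.g. bulk_load([3, 4, 6, 9, 10, 11, 12, 13, 20, 22, 23], 2)
```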
16. Summary of Bulk Loading
- Option 1: multiple inserts.
  - Slow.
  - Does not give sequential storage of leaves.
- Option 2: Bulk Loading
  - Has advantages for concurrency control.
  - Fewer I/Os during build.
  - Leaves will be stored sequentially (and linked, of course).
  - Can control fill factor on pages.
17. A Note on Order
- Order (d) concept replaced by physical space criterion in practice (at least half-full).
- Index pages can typically hold many more entries than leaf pages.
- Variable sized records and search keys mean different nodes will contain different numbers of entries.
- Even with fixed length fields, multiple records with the same search key value (duplicates) can lead to variable-sized data entries (if we use Alternative (3)).
- Many real systems are even sloppier than this: they only reclaim space when a page is completely empty.
18. Selection Queries
- The B+-tree is perfect, but... to answer a selection query (ssn = 10) it needs to traverse a full path. In practice, 3-4 block accesses (depending on the height of the tree, buffering). Any better approach?
- Yes! Hashing
  - static hashing
  - dynamic hashing
19. Hashing
- Hash-based indexes are best for equality selections. Cannot support range searches.
- Static and dynamic hashing techniques exist; trade-offs similar to ISAM vs. B+ trees.
20. Static Hashing
- # primary pages fixed, allocated sequentially, never de-allocated; overflow pages if needed.
- h(k) MOD N = bucket to which data entry with key k belongs. (N = # of buckets)
[Figure: key -> h -> h(key) mod N selects one of the primary bucket pages 0 .. N-1; full buckets chain to overflow pages.]
21. Static Hashing (Contd.)
- Buckets contain data entries.
- Hash fn works on search key field of record r. Use its value MOD N to distribute values over range 0 ... N-1.
- h(key) = (a * key + b) usually works well.
  - a and b are constants; lots is known about how to tune h.
- Long overflow chains can develop and degrade performance.
- Extendible and Linear Hashing: dynamic techniques to fix this problem.
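A minimal sketch of a static hash index using the slide's h(key) = (a * key + b) MOD N; the constants are arbitrary, and a growing Python list per bucket stands in for a fixed primary page plus its overflow chain:

```python
N, a, b = 8, 131, 17            # illustrative constants, not from the slides

def h(key):
    return (a * key + b) % N    # bucket number for key

buckets = [[] for _ in range(N)]    # N fixed buckets of (key, rid) entries

def insert(key, rid):
    buckets[h(key)].append((key, rid))   # a long list = a long overflow chain

def lookup(key):
    # equality search: hash once, scan only that bucket's chain
    return [rid for k, rid in buckets[h(key)] if k == key]
```

Note that a range query gains nothing here: hashing scatters adjacent keys across buckets, which is why hash indexes only help equality selections.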
22. Extendible Hashing
- Situation: Bucket (primary page) becomes full. Why not re-organize file by doubling # of buckets?
  - Reading and writing all pages is expensive!
- Idea: Use directory of pointers to buckets, double # of buckets by doubling the directory, splitting just the bucket that overflowed!
  - Directory much smaller than file, so doubling it is much cheaper. Only one page of data entries is split. No overflow page!
  - Trick lies in how hash function is adjusted!
23. Example
- Directory is array of size 4.
- Bucket for record r has entry with index = `global depth` least significant bits of h(r)
  - If h(r) = 5 = binary 101, it is in bucket pointed to by 01.
  - If h(r) = 7 = binary 111, it is in bucket pointed to by 11.
- (we denote r by h(r))
[Figure: GLOBAL DEPTH = 2; DIRECTORY entries 00, 01, 10, 11. Bucket A (LOCAL DEPTH 2): 4 12 32 16. Bucket B (LOCAL DEPTH 1): 1 5 7 13, shared by directory entries 01 and 11. Bucket C (LOCAL DEPTH 2): 10.]
24. Handling Inserts
- Find bucket where record belongs.
- If there's room, put it there.
- Else, if bucket is full, split it:
  - increment local depth of original page
  - allocate new page with new local depth
  - re-distribute records from original page
  - add entry for the new page to the directory
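A compact sketch of this insert procedure, including the directory-doubling case from the next slides; the bucket capacity, class names, and the use of the hash value itself as the record (as in the slides, which denote r by h(r)) are assumptions:

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth      # local depth
        self.items = []

class ExtendibleHash:
    CAP = 4                     # bucket capacity (assumed)

    def __init__(self):
        self.gd = 1             # global depth
        self.dir = [Bucket(1), Bucket(1)]

    def insert(self, h):        # h = hash value of the record
        b = self.dir[h & ((1 << self.gd) - 1)]   # gd least significant bits
        if len(b.items) < self.CAP:              # room: just put it there
            b.items.append(h)
            return
        if b.depth == self.gd:                   # split needs one more bit:
            self.dir += self.dir                 # double by copying (LSB trick)
            self.gd += 1
        b.depth += 1                             # split the full bucket
        image = Bucket(b.depth)                  # allocate its split image
        mask = 1 << (b.depth - 1)                # the newly used bit
        b.items, image.items = ([x for x in b.items if not x & mask],
                                [x for x in b.items if x & mask])
        for i in range(len(self.dir)):           # fix pointers to split image
            if self.dir[i] is b and i & mask:
                self.dir[i] = image
        self.insert(h)                           # retry; may split again
```

Feeding in the slides' values (4, 12, 32, 16, 1, 5, 7, 13, 10, then 21, 19, 15, then 20) exercises both a plain bucket split and a directory doubling; note that doubling is just `self.dir += self.dir`, after which only the pointers to the overflowed bucket's split image change.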
25. Example: Insert 21, then 19, 15
- 21 = 10101
- 19 = 10011
- 15 = 01111
[Figure: inserting 21 overflows Bucket B (local depth 1), which splits without doubling the directory. Afterwards: Bucket A (local depth 2): 4 12 32 16; Bucket B (2): 1 5 21 13; Bucket C (2): 10; new Bucket D (2): 7 19 15; DIRECTORY entries 00, 01, 10, 11 point to the data pages A, B, C, D.]
26. Insert h(r) = 20 (Causes Doubling)
[Figure: 20 lands in Bucket A, which is full and has local depth = global depth, so the directory doubles to global depth 3. Afterwards: Bucket A (local depth 3): 32 16; Bucket A2 (local depth 3): 4 12 20 (split image of Bucket A); Bucket B (2): 1 5 21 13; Bucket C (2): 10; Bucket D (2): 15 7 19; directory entries 000 .. 111.]
27. Points to Note
- 20 = binary 10100. Last 2 bits (00) tell us r belongs in either A or A2. Last 3 bits needed to tell which.
  - Global depth of directory: max # of bits needed to tell which bucket an entry belongs to.
  - Local depth of a bucket: # of bits used to determine if an entry belongs to this bucket.
- When does bucket split cause directory doubling?
  - Before insert, local depth of bucket = global depth. Insert causes local depth to become > global depth; directory is doubled by copying it over and fixing pointer to split image page.
28. Directory Doubling
- Why use least significant bits in directory?
  - Allows for doubling via copying!
[Figure: a 2-bit directory growing to 3 bits, traced for an entry with hash 6 = 110. With least significant bits the new directory is simply two copies of the old one; with most significant bits the entries must be interleaved. Least Significant vs. Most Significant.]
29. Comments on Extendible Hashing
- If directory fits in memory, equality search answered with one disk access; else two.
  - 100MB file, 100 bytes/rec, 4K pages contains 1,000,000 records (as data entries) and 25,000 directory elements; chances are high that directory will fit in memory.
- Directory grows in spurts, and, if the distribution of hash values is skewed, directory can grow large.
- Multiple entries with same hash value cause problems!
- Delete: If removal of data entry makes bucket empty, can be merged with split image. If each directory element points to same bucket as its split image, can halve directory.