B-Trees and Their Variants
Advanced Data Structure
Spring 2007
Zareen Alamgir
Motivation
Yet another tree!
Why do we need another tree structure?
[Figure: example search tree whose nodes each hold two keys, from (12 19) to (90 94)]
Multiway (m-way) Search Trees
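Node layout: $P_1 \mid K_1 \mid P_2 \mid \dots \mid K_{i-1} \mid P_i \mid K_i \mid \dots \mid K_{q-1} \mid P_q$
The subtree at $P_1$ holds keys $< K_1$, the subtree at $P_i$ holds keys with $K_{i-1} < \text{keys} < K_i$, and the subtree at $P_q$ holds keys $> K_{q-1}$.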
Benefits
Fast information retrieval.
Fast update.
Problems
M-way tree
B-Trees
Key size
Disk block size
Data organization (keys or entire data records are stored in nodes)
[Figure: example B-tree whose nodes each hold two keys, from (12 19) to (90 94)]
B-Tree: Definition
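Node layout: $P_1 \mid K_1 \mid P_2 \mid \dots \mid K_{i-1} \mid P_i \mid K_i \mid \dots \mid K_{q-1} \mid P_q$
The subtree at $P_1$ holds keys $< K_1$, the subtree at $P_i$ holds keys with $K_{i-1} < \text{keys} < K_i$, and the subtree at $P_q$ holds keys $> K_{q-1}$.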
B-Tree
A B-tree node can have a field KeyTally (KT) that records the number of keys currently stored in the node.
A B-tree node usually contains key and data pointer pairs. The data pointer points to the data record, which is not stored in the node. With this scheme we can pack more keys and pointers into a B-tree node.
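Node layout: $KT \mid P_1 \mid K_1\,D_1 \mid \dots \mid K_{i-1}\,D_{i-1} \mid P_i \mid K_i\,D_i \mid \dots \mid K_{q-1}\,D_{q-1} \mid P_q$
Each $D_i$ is a data pointer. The subtree at $P_1$ holds keys $< K_1$, the subtree at $P_i$ holds keys with $K_{i-1} < \text{keys} < K_i$, and the subtree at $P_q$ holds keys $> K_{q-1}$.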
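The node layout can be sketched in Python as follows. This is only an illustration: the class name `BTreeNode` and its field names are assumptions, and KeyTally is derived from the key list here rather than stored as a separate field.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class BTreeNode:
    """Hypothetical B-tree node: keys with data pointers, plus subtree pointers."""
    keys: List[int] = field(default_factory=list)              # K1 .. K(q-1), kept sorted
    data: List[Any] = field(default_factory=list)              # D1 .. D(q-1), one per key
    children: List["BTreeNode"] = field(default_factory=list)  # P1 .. Pq, empty in a leaf

    @property
    def key_tally(self) -> int:
        # KT: number of keys currently stored in the node
        return len(self.keys)

    @property
    def is_leaf(self) -> bool:
        return not self.children


# A leaf holding two keys; the strings stand in for data pointers.
leaf = BTreeNode(keys=[14, 19], data=["rec14", "rec19"])
```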
Height of B-Tree
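With $q = \lceil m/2 \rceil$, the height is maximum when the root holds one key and every other node holds the minimum of $q - 1$ keys:
$$1 + (q-1)\sum_{i=0}^{h-2} 2q^{i} = 1 + 2(q-1)\,\frac{q^{h-1}-1}{q-1} = -1 + 2q^{h-1}$$
Thus, the number of keys in a B-tree of height h is given as:
$$n \ge -1 + 2q^{h-1}$$
$$h \le \log_q \frac{n+1}{2} + 1$$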
The height of a B-tree is minimum if all nodes are full; thus we have $m-1$ keys in the root + $m(m-1)$ keys on the second level + ... + $m^{h-1}(m-1)$ keys in the leaf nodes:
$$(m-1) + m(m-1) + m^{2}(m-1) + \dots + m^{h-1}(m-1) = \sum_{i=0}^{h-1}(m-1)\,m^{i} = (m-1)\sum_{i=0}^{h-1} m^{i} = (m-1)\,\frac{m^{h}-1}{m-1} = m^{h}-1$$
Thus, the number of keys in a B-tree of height h is given as:
$$n \le m^{h} - 1$$
$$h \ge \log_m (n+1)$$
$$\log_m (n+1) \le h \le \log_q \frac{n+1}{2} + 1$$
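The two height bounds can be checked numerically. The sketch below uses integer arithmetic instead of logarithms to avoid rounding problems. The function names are illustrative, and it assumes n >= 1, m >= 3 and q = ceil(m/2).

```python
def min_height(n: int, m: int) -> int:
    """Smallest h such that a B-tree of order m and height h can hold n keys (n <= m**h - 1)."""
    h = 0
    while m ** h - 1 < n:
        h += 1
    return h


def max_height(n: int, m: int) -> int:
    """Largest h such that n keys suffice for height h (n >= 2 * q**(h - 1) - 1)."""
    q = (m + 1) // 2  # ceil(m / 2)
    h = 1
    # A tree of height h + 1 needs at least 2 * q**h - 1 keys.
    while 2 * q ** h - 1 <= n:
        h += 1
    return h


print(min_height(1000, 5), max_height(1000, 5))  # prints "5 6"
```

For n = 1000 and m = 5 the height therefore lies between 5 and 6, which agrees with $\log_5 1001 \approx 4.29$ and $\log_3 500.5 + 1 \approx 6.66$.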
Search in a B-Tree
Search key 71
[Figure: example B-tree with two keys per node; the search for key 71 ends at the node (69 71)]
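A minimal sketch of the search, assuming a hypothetical `Node` class with a sorted `keys` list and a `children` list that is empty in leaves. The small tree built here is a cut-down illustration, not the full tree from the figure.

```python
from bisect import bisect_left


class Node:
    """Hypothetical B-tree node used only for this illustration."""
    def __init__(self, keys, children=None):
        self.keys = keys                 # sorted keys
        self.children = children or []   # empty list in a leaf


def btree_search(node, key):
    """Return (node, index) where the key is stored, or None if it is absent."""
    while True:
        i = bisect_left(node.keys, key)  # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return node, i
        if not node.children:            # reached a leaf without finding the key
            return None
        node = node.children[i]          # descend into the subtree between keys[i-1] and keys[i]


root = Node([30, 62], [Node([12, 19]), Node([45, 51]), Node([69, 71])])
found = btree_search(root, 71)
```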
Basic Idea
Find the position for the key in the appropriate leaf node.
Is the node full? If not, insert the key in order.
Split node:
Create a new node.
Move half of the keys from the full node to the new node.
Promote the median key (before split) to the parent.
Split guarantees that each node has at least $\lceil m/2 \rceil - 1$ keys.
If the parent is full, split the parent in the same way.
Case 1: The leaf node has room for the new key.
Case 2: The leaf in which the key is to be placed is full.
This case can lead to an increase in tree height.
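Both cases can be sketched at the level of a single leaf. The function below is an illustration only: the name `insert_into_leaf` is an assumption, a leaf is a plain sorted list holding at most m - 1 keys, and propagating the promoted key into the parent is left out.

```python
from bisect import insort


def insert_into_leaf(keys, key, m):
    """Insert key into a sorted leaf of a B-tree of order m.

    Returns (keys, None, None) if the leaf had room (case 1), otherwise
    (left, median, right), where median is to be promoted to the parent (case 2).
    """
    keys = list(keys)
    insort(keys, key)                 # insert in order
    if len(keys) <= m - 1:            # case 1: the leaf had room
        return keys, None, None
    mid = len(keys) // 2              # case 2: overflow, split around the median
    return keys[:mid], keys[mid], keys[mid + 1:]


print(insert_into_leaf([14, 19, 20, 23], 16, 5))  # prints ([14, 16], 19, [20, 23])
```

With m = 5, both halves end up with 2 keys, which is the minimum of $\lceil m/2 \rceil - 1$.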
Case 1: The leaf node has room for the new key.
Insert 3: find the appropriate leaf node for key 3 and insert 3 in order.
[Figure: B-tree with root (10 25) and leaves including (14 19 20 23) and (32 38)]
Insert 16: no room for key 16 in the leaf node (14 19 20 23).
[Figure: the full leaf is split around the median key 19, which moves up to the parent (10 25)]
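Deletion cases:
After deletion the node still has at least $\lceil m/2 \rceil - 1$ keys.
After deletion the node has fewer than $\lceil m/2 \rceil - 1$ keys (underflow).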
Delete 14
[Figure: B-tree with root (10 25) and leaves including (14 19) and (32 38 40 45)]
Delete 25
Underflow occurs and the number of keys in the left and right siblings is $\lceil m/2 \rceil - 1$.
[Figure: B-tree with root (10 32) and leaves including (19 25) and (38 40)]
Delete 21
Underflow occurs, merge nodes.
[Figure: deleting 21 from the node (21 27) causes an underflow, so nodes are merged]
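The choice between redistribution and merging can be sketched for two adjacent leaves and the separator key between them in the parent. The function name `fix_underflow` and the single-key borrowing strategy are assumptions. Some textbooks redistribute the keys evenly instead, and removing the separator from the parent after a merge is not shown.

```python
def fix_underflow(left, separator, right, m):
    """Repair an underflow in one of two adjacent leaves of a B-tree of order m.

    Returns (left, separator, right). After a merge the separator is None and
    right is empty, because the separator has moved down into the merged node.
    """
    min_keys = (m + 1) // 2 - 1          # ceil(m / 2) - 1
    left, right = list(left), list(right)
    if len(left) < min_keys and len(right) > min_keys:
        left.append(separator)           # separator moves down into the deficient node
        separator = right.pop(0)         # smallest key of the right sibling moves up
        return left, separator, right
    if len(right) < min_keys and len(left) > min_keys:
        right.insert(0, separator)
        separator = left.pop()           # largest key of the left sibling moves up
        return left, separator, right
    if len(left) < min_keys or len(right) < min_keys:
        return left + [separator] + right, None, []   # sibling is at the minimum: merge
    return left, separator, right        # no underflow


print(fix_underflow([19], 25, [32, 38, 40, 45], 5))  # prints ([19, 25], 32, [38, 40, 45])
print(fix_underflow([19], 32, [38, 40], 5))          # prints ([19, 32, 38, 40], None, [])
```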
Issues in B-tree
[Figure: B-tree node layout $P_1 \mid K_1\,D_1 \mid \dots \mid P_i \mid K_i\,D_i \mid \dots \mid K_{q-1}\,D_{q-1} \mid P_q \mid$ free space, with a data pointer stored next to every key]
B+-Trees
In a B+-tree:
Pointers to data are stored only in leaf nodes.
Internal nodes contain only keys and subtree pointers.
B+-Tree Structure
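Index Set: the internal nodes, which hold only keys and subtree pointers.
Sequence Set: the leaf nodes, linked together for sequential search.
Index set node: $P_1 \mid K_1 \mid P_2 \mid \dots \mid K_{i-1} \mid P_i \mid K_i \mid \dots \mid K_{q-1} \mid P_q$
Sequence set node: $K_1\,D_1 \mid \dots \mid K_{i-1}\,D_{i-1} \mid K_i\,D_i \mid \dots \mid K_{q-1}\,D_{q-1} \mid$ pointer to next leaf node in tree, where each $D_i$ is a data pointer.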
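The two node kinds might be modelled as below. The class and field names (`IndexNode`, `SequenceNode`, `next`) are illustrative assumptions, and the strings stand in for data pointers.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Union


@dataclass
class SequenceNode:
    """Leaf: keys with data pointers and a link to the next leaf."""
    keys: List[int] = field(default_factory=list)
    data: List[Any] = field(default_factory=list)
    next: Optional["SequenceNode"] = None


@dataclass
class IndexNode:
    """Internal node: keys and subtree pointers only, no data pointers."""
    keys: List[int] = field(default_factory=list)
    children: List[Union["IndexNode", SequenceNode]] = field(default_factory=list)


second = SequenceNode(keys=[32, 38], data=["rec32", "rec38"])
first = SequenceNode(keys=[14, 19], data=["rec14", "rec19"], next=second)
root = IndexNode(keys=[32], children=[first, second])
```

Note that key 32 appears twice: once in the index set as a separator and once in the sequence set next to its data pointer.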
Search in a B+-Tree
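[Figure: example B+-tree with root (30 62), index set nodes (15 21), (34 45), (75 90), and sequence set leaves (15 17), (21 23), (30 31), (34 37), (45 51), (69 71), (75 79), (90 94)]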
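A minimal sketch of a B+-tree search, assuming hypothetical `Index` and `Leaf` classes and the convention that keys equal to a separator are found in the subtree to its right. Unlike a B-tree search, the descent never stops in the index set, because index keys are only separators.

```python
from bisect import bisect_right


class Leaf:
    def __init__(self, keys):
        self.keys = keys


class Index:
    def __init__(self, keys, children):
        self.keys = keys
        self.children = children


def bplus_search(node, key):
    """Return the leaf that stores key, or None if the key is absent."""
    while isinstance(node, Index):
        # Keys >= separator live to the right of it, hence bisect_right.
        node = node.children[bisect_right(node.keys, key)]
    return node if key in node.keys else None


root = Index([21, 30], [Leaf([15, 17]), Leaf([21, 23]), Leaf([30, 31])])
hit = bplus_search(root, 21)
```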
B+-Tree Insert
Case: The leaf node has room for the key to be inserted.
Insert 3: find the appropriate leaf node for key 3, and insert in order.
[Figure: B+-tree with root (14 32) and leaves including (14 19 20 23) and (32 38)]
Insert 16
[Figure: the full leaf (14 19 20 23) is split into (14 16) and (19 20 23), and key 19 is inserted into the parent (10 25)]
Modify the sequence set next-node links.
B+-Tree Insert Operation
Insert 18
No room for key 18. Split node: create a new sequence set node and move $\lceil m/2 \rceil$ keys to the new node.
[Figure: the full leaf (10 17 20 23) receives key 18 and is split]
Create a new index set node and make it the root node.
Insert the first key of the new sequence set node in the new root.
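The leaf split can be sketched on plain key lists. The function name `bplus_split_leaf` is an assumption, and the sketch takes the number of keys moved to the new node to be ceil(m/2), which matches the split of (14 19 20 23) into (14 16) and (19 20 23). Updating the next-node links and the index set is left out.

```python
from bisect import insort


def bplus_split_leaf(keys, key, m):
    """Insert key into a sorted B+-tree leaf of order m (at most m - 1 keys).

    Returns (keys, None, None) if the leaf had room, otherwise
    (left, separator, right), where separator is a copy of right[0].
    """
    keys = list(keys)
    insort(keys, key)
    if len(keys) <= m - 1:
        return keys, None, None
    cut = len(keys) - (m + 1) // 2    # move ceil(m / 2) keys to the new node
    left, right = keys[:cut], keys[cut:]
    return left, right[0], right      # the separator is copied up; it stays in the leaf


print(bplus_split_leaf([10, 17, 20, 23], 18, 5))  # prints ([10, 17], 18, [18, 20, 23])
```

In contrast to the B-tree split, the promoted key remains in the new leaf, because data pointers exist only in the sequence set.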
B+-tree deletion follows the same rules as B-tree deletion, but the separator in the index set node is not removed when a key is deleted.
Deletion cases:
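After deletion the node still has at least $\lceil m/2 \rceil - 1$ keys.
After deletion the node has fewer than $\lceil m/2 \rceil - 1$ keys (underflow).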
B+-Tree Delete
Search key 14
Delete 14
[Figure: B+-tree with root (14 32) and leaves (14 19 21) and (32 38 40)]
Delete 32
Underflow occurs and the number of keys in the left and right siblings is $\lceil m/2 \rceil - 1$.
[Figure: B+-tree with root (14 38) and leaves (19 32) and (38 40)]
Sequential search: locate the starting key in the sequence set and traverse the sequence set using the next pointers in the sequence set nodes.
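A traversal of the sequence set can be sketched as a range scan. The `Leaf` class and the function `range_scan` are illustrative assumptions. The starting leaf would normally be found by a search through the index set, but here it is passed in directly.

```python
class Leaf:
    """Hypothetical sequence set node: sorted keys and a link to the next leaf."""
    def __init__(self, keys, next=None):
        self.keys = keys
        self.next = next


def range_scan(start_leaf, low, high):
    """Collect all keys k with low <= k <= high by following the next pointers."""
    result = []
    leaf = start_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > high:          # keys are sorted, so nothing further can match
                return result
            if k >= low:
                result.append(k)
        leaf = leaf.next
    return result


third = Leaf([45, 51])
second = Leaf([32, 38, 40], next=third)
first = Leaf([14, 19], next=second)
print(range_scan(first, 19, 40))  # prints [19, 32, 38, 40]
```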
B*-Trees
In a B*-tree split operation, two nodes are split into three, instead of one into two as in a B-tree.
All three nodes participating in the split are guaranteed to be two-thirds full after the split.
Insert 72
[Figure: B*-tree in which key 32 separates two full leaves, (10 12 15 16 21 24 25 29) and (35 42 47 51 53 55 57 59), when key 72 is inserted]
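The two-into-three split can be sketched on key lists. The function name and the choice to distribute the keys as evenly as possible are assumptions. The example takes the two full leaves and the separator 32 from the figure, which corresponds to order m = 9 (at most 8 keys per node).

```python
def split_two_into_three(left, separator, right, new_key):
    """Split two full sibling leaves into three when new_key arrives.

    Returns (a, sep1, b, sep2, c): three nodes and the two separator keys
    that replace the old separator in the parent.
    """
    pool = sorted(left + [separator] + right + [new_key])
    in_nodes = len(pool) - 2                    # two keys move up as separators
    base, extra = divmod(in_nodes, 3)
    sizes = [base + (1 if i < extra else 0) for i in range(3)]
    a = pool[:sizes[0]]
    sep1 = pool[sizes[0]]
    b = pool[sizes[0] + 1:sizes[0] + 1 + sizes[1]]
    sep2 = pool[sizes[0] + 1 + sizes[1]]
    c = pool[sizes[0] + sizes[1] + 2:]
    return a, sep1, b, sep2, c


a, sep1, b, sep2, c = split_two_into_three(
    [10, 12, 15, 16, 21, 24, 25, 29], 32,
    [35, 42, 47, 51, 53, 55, 57, 59], 72)
```

Here the three nodes receive 6, 5 and 5 keys, so each is at least two-thirds full for m = 9.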
Deletion cases:
After deletion the node is still at least two-thirds full.
After deletion underflow occurs:
Redistribute if the number of keys in a sibling is greater than $\lfloor (2m-1)/3 \rfloor$.
Merge nodes if the number of keys in the siblings is not greater than $\lfloor (2m-1)/3 \rfloor$.
Questions?
The End