B-Tree
• The unit of I/O operations on a disk is known as a block.
• When information is read from a disk, the entire block containing this information is read into
memory, and when information is stored on a disk, an entire block is written to the disk.
• Access times differ greatly between disk, cache memory and core (main) memory.
• Goal: minimize expensive accesses (e.g., disk accesses).
• access time = seek time + rotational delay (latency) + transfer time
• B-tree: a dynamic-set data structure that is optimized for disks.
• The data is stored on secondary storage (SS), also called secondary memory (SM).
• Information is stored on disks or tapes, so accessing data from SM is a time-consuming process.
• A B-tree is an M-way search tree with the following properties:
1. It is perfectly balanced: every leaf node is at the same depth
2. The root has at least two subtrees unless it is a leaf.
3. Every internal node other than the root is at least half full, i.e. ceil(M/2)-1 ≤ #keys ≤ M-1
4. Each leaf node holds k-1 keys, where ceil(M/2) ≤ k ≤ M.
5. Every internal node with k keys has k+1 non-null children
For simplicity we assume M is even and set t = M/2, so property 3 becomes:
3*. Every internal node other than the root is at least half full, i.e. t-1 ≤ #keys ≤ 2t-1 and
t ≤ #children ≤ 2t.
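
The properties above can be checked mechanically. The following is a minimal sketch, assuming M = 2t, distinct integer keys, and a hypothetical `Node` class (neither the class nor the function name `check_btree` comes from these notes):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical node layout: sorted keys, and children (empty list for a leaf).
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def check_btree(root: Node, t: int) -> bool:
    """Return True if the tree satisfies the B-tree properties for M = 2t."""
    leaf_depths = set()

    def visit(node, depth, is_root, lo, hi):
        k = len(node.keys)
        # The root needs one key if it has subtrees; other nodes need t-1 keys.
        if is_root:
            min_keys = 1 if node.children else 0
        else:
            min_keys = t - 1
        if not (min_keys <= k <= 2 * t - 1):
            return False
        # Keys must be strictly increasing and lie inside the open range (lo, hi).
        bounds = [lo] + node.keys + [hi]
        if any(a >= b for a, b in zip(bounds, bounds[1:])):
            return False
        if not node.children:
            leaf_depths.add(depth)
            return True
        # An internal node with k keys has exactly k+1 children.
        if len(node.children) != k + 1:
            return False
        return all(
            visit(child, depth + 1, False, bounds[i], bounds[i + 1])
            for i, child in enumerate(node.children)
        )

    ok = visit(root, 0, True, float("-inf"), float("inf"))
    # Perfect balance: every leaf sits at the same depth.
    return ok and len(leaf_depths) == 1
```

For instance, with t = 2 a root holding [10] over the leaves [3, 5] and [12, 15, 20] passes the check, while the same tree fails with t = 3 because both leaves would need at least two keys and at most five, and the leaf count rule for the root still holds but [3, 5] is only at the minimum; a leaf with a single key would fail outright.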
B-tree Height
• Any B-tree with n keys, height h and minimum degree t satisfies:
$$h \le \log_t \frac{n+1}{2}$$
• The minimum number of keys for a tree of height h is obtained when:
o The root contains one key
o All other nodes contain t-1 keys
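
The bound follows from counting keys in this sparsest tree: depth 0 has one node, and depth i ≥ 1 has at least 2t^(i-1) nodes, so n ≥ 1 + (t-1)(2 + 2t + ... + 2t^(h-1)) = 2t^h - 1. A small sketch of that arithmetic, with hypothetical function names `min_keys` and `max_height`, and height counted as the depth of the leaves (root at depth 0):

```python
def min_keys(t: int, h: int) -> int:
    """Fewest keys in a B-tree of height h: 1 key in the root, t-1 elsewhere."""
    # At least 2 * t**(i-1) nodes at each depth i >= 1.
    nodes_below_root = sum(2 * t ** (i - 1) for i in range(1, h + 1))
    return 1 + (t - 1) * nodes_below_root  # closed form: 2 * t**h - 1


def max_height(n: int, t: int) -> int:
    """Largest height a B-tree with n >= 1 keys can have, i.e. floor(log_t((n+1)/2))."""
    h = 0
    # Integer loop instead of math.log to avoid floating-point rounding.
    while min_keys(t, h + 1) <= n:
        h += 1
    return h


print(min_keys(2, 3))           # 15
print(max_height(10**6, 100))   # 2
```

With t = 100, a million keys fit in a tree of height at most 2, which is why a large branching factor keeps the number of disk accesses small.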
B-Tree: Insert X
• As in an M-way tree, find the leaf node to which X should be added.
• Add X to this node in the appropriate place among the values already there (there are no
subtrees to worry about).
• Number of values in the node after adding the key:
o At most 2t-1: done
o Equal to 2t: overflowed
• Fix the overflowed node.
Fix an Overflowed Node
• Split the overflowed node (2t values, M=2t) into three parts:
o Left: the first t values, become a left child node
o Middle: the next value (the (t+1)-th), goes up to the parent
o Right: the last t-1 values, become a right child node
• Continue with the parent:
o Until no overflow occurs in the parent
o If the root overflows, split it too, and create a new root node
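
The insert-then-split procedure above can be sketched as follows. This is an illustration under assumptions, not a reference implementation: the `Node` class and the names `insert` and `_insert` are hypothetical, keys are assumed distinct, and everything lives in memory rather than in disk blocks.

```python
from bisect import bisect_left, insort


class Node:
    # Hypothetical node layout: sorted keys, and children (empty list for a leaf).
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []


def _insert(node, x, t):
    """Insert x below node. Return (middle, right) if node split, else None."""
    if not node.children:
        insort(node.keys, x)                  # leaf: place x among the existing values
    else:
        i = bisect_left(node.keys, x)         # subtree that must contain x
        split = _insert(node.children[i], x, t)
        if split:                             # child overflowed: absorb its middle value
            middle, right = split
            node.keys.insert(i, middle)
            node.children.insert(i + 1, right)
    if len(node.keys) < 2 * t:
        return None                           # at most 2t-1 values: done
    # Overflow (2t values): first t stay left, value at index t goes up, last t-1 go right.
    middle = node.keys[t]
    right = Node(node.keys[t + 1:], node.children[t + 1:])
    node.keys = node.keys[:t]
    node.children = node.children[:t + 1]
    return middle, right


def insert(root, x, t):
    """Insert x and return the (possibly new) root."""
    split = _insert(root, x, t)
    if split:                                 # the root overflowed: create a new root
        middle, right = split
        return Node([middle], [root, right])
    return root


root = Node()
for x in [1, 2, 3, 4]:
    root = insert(root, x, t=2)
print(root.keys, [c.keys for c in root.children])  # [3] [[1, 2], [4]]
```

The recursion goes down to the leaf and unwinds back up, so the splits propagate toward the root exactly along the search path.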
Insert Example
Complexity Insert
• Inserting a key into a B-tree of height h is done in a single pass down the tree and a single pass
up the tree.
• Complexity: $O(h) = O(\log_t n)$
B-Tree: Delete X
• Delete as in an M-way tree.
• Problem: deletion might cause underflow, i.e. the number of keys remaining in a node drops below t-1.
• Two cases for deletion:
o The key is in a leaf node
o The key is in an internal node
• If the order is m = 5:
o Min children = ceil(m/2) = 3
o Max children = m = 5
o Min keys = ceil(m/2) - 1 = 2
o Max keys = m - 1 = 4
• Deleting from a leaf node, two cases:
o The leaf contains more than the minimum number of keys
o The leaf contains exactly the minimum number of keys
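
The limits for a given order, and the test that separates the two leaf cases, can be written out as a short sketch. The function names `order_limits` and `leaf_delete_underflows` are hypothetical, and the limits apply to non-root nodes:

```python
def order_limits(m: int) -> dict:
    """Child and key limits for a non-root node of a B-tree of order m."""
    min_children = -(-m // 2)  # ceil(m/2) using integer arithmetic
    return {
        "min_children": min_children,
        "max_children": m,
        "min_keys": min_children - 1,
        "max_keys": m - 1,
    }


def leaf_delete_underflows(num_keys: int, m: int) -> bool:
    """True if removing one key from a leaf with num_keys keys causes underflow."""
    return num_keys - 1 < order_limits(m)["min_keys"]


print(order_limits(5))
# {'min_children': 3, 'max_children': 5, 'min_keys': 2, 'max_keys': 4}
print(leaf_delete_underflows(3, 5))  # False: the key can simply be removed
print(leaf_delete_underflows(2, 5))  # True: the leaf must borrow or merge
```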
Underflow Example on Delete
B-Tree: Delete X,k
• The root should have at least 1 value in it, and all other nodes should have at least t-1 (at most
2t-1) values in them.
• Problem: deletion might cause underflow, i.e. the number of keys remaining in a node drops below t-1.
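• Solution:
o Make sure every node visited on the way down has at least t keys instead of t-1.
o If the node does not contain k, and the child to descend into has only t-1 keys:
▪ either take a key from a sibling via a rotation, or
▪ merge the child with a sibling (the separating key moves down from the parent)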
o If it does have k then:
Complexity Delete
• Basically a single downward pass:
o Most of the keys are in the leaves, which needs one downward pass.
o When deleting a key in an internal node, we may have to go one step up to replace the key
with its predecessor or successor.
o Complexity: $O(h) = O(\log_t n)$
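
The two underflow repairs used during deletion (taking a key from a sibling via a rotation, and merging) can be sketched as below. Each repair touches only one parent and two of its children, which is why the downward pass stays within O(h) node visits. The `Node` class and all function names are hypothetical, and this is only the repair step, not a full delete.

```python
class Node:
    # Hypothetical node layout: sorted keys, and children (empty list for a leaf).
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []


def rotate_from_left(parent, i):
    """Child i takes one key from its left sibling, routed through the parent."""
    child, left = parent.children[i], parent.children[i - 1]
    child.keys.insert(0, parent.keys[i - 1])   # separator moves down
    parent.keys[i - 1] = left.keys.pop()       # left sibling's largest key moves up
    if left.children:
        child.children.insert(0, left.children.pop())


def rotate_from_right(parent, i):
    """Child i takes one key from its right sibling, routed through the parent."""
    child, right = parent.children[i], parent.children[i + 1]
    child.keys.append(parent.keys[i])          # separator moves down
    parent.keys[i] = right.keys.pop(0)         # right sibling's smallest key moves up
    if right.children:
        child.children.append(right.children.pop(0))


def merge_with_right(parent, i):
    """Merge child i, the separator parent.keys[i], and child i+1 into child i."""
    child = parent.children[i]
    right = parent.children.pop(i + 1)
    child.keys += [parent.keys.pop(i)] + right.keys   # (t-1) + 1 + (t-1) = 2t-1 keys
    child.children += right.children


def ensure_t_keys(parent, i, t):
    """Give parent.children[i] at least t keys; return the index to descend into."""
    last = len(parent.children) - 1
    if len(parent.children[i].keys) >= t:
        return i
    if i > 0 and len(parent.children[i - 1].keys) >= t:
        rotate_from_left(parent, i)
    elif i < last and len(parent.children[i + 1].keys) >= t:
        rotate_from_right(parent, i)
    elif i < last:
        merge_with_right(parent, i)
    else:
        merge_with_right(parent, i - 1)        # rightmost child merges into its left sibling
        i -= 1
    return i
```

A merge removes one key from the parent, which is safe because the parent was itself topped up to at least t keys before the descent reached it (the root is the exception and may become empty, in which case the merged child becomes the new root).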