index4
Search Trees
Binary Search Tree
[Figure: a binary search tree with levels 1 to 4 marked. Labels: root (level 1), internal nodes, leaf nodes, subtrees, and a parent/child pair (node 3 is the parent of node 1; node 1 is a child of node 3).]
Tree-Based Indexing
[Figure: Tree-Structure Index. Root node B holds the keys 33 and 44 and points to the leaf level (L1, L2, L3).]
Leaf-level entries:
L1: Daniels, 22, 6003; Ashby, 25, 3000; Bristow, 29, 2007
L2: Basu, 33, 4003; Jones, 40, 6003
L3: Smith, 44, 3000; Tracy, 44, 5004; Cass, 50, 5004
Tree-Based Indexing Cont.
The structure of leaf nodes differs from the structure of internal nodes.
B+ Tree
[Figure: B+ tree over car makes. The internal nodes hold the keys Mazda and Toyota; the leaf nodes below them point to data blocks holding Holden, Suzuki, Mazda, Toyota, Volvo, and BMW.]
Example B+ Tree
Search begins at the root, and key comparisons direct the search to a leaf.
Example: Search for 5, 14
B+ Tree Index: Internal Node
An internal node is of the form:
$[p_1, k_1, \ldots, k_{i-1}, p_i, k_i, \ldots, k_{q-1}, p_q]$
Each $p_i$ is a pointer and each $k_i$ is a field value (key).
1. Keys
$k_1 < k_2 < \cdots < k_i < \cdots < k_{q-1}$
Some search values from the leaf nodes are repeated in the internal nodes to guide the search.
For all search values $X$ in the subtree pointed at by $p_i$, we have:
$k_{i-1} \le X < k_i$
(for $i = 1$ only the upper bound applies; for $i = q$ only the lower bound applies)
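The key rule above decides which tree pointer a search follows inside one internal node. A minimal sketch, assuming the node's keys are kept in a sorted Python list; the function name `child_index` is hypothetical and positions are 0-based, so position j stands for pointer $p_{j+1}$:

```python
from bisect import bisect_right

def child_index(keys, x):
    """Return the 0-based position of the tree pointer to follow for search value x.

    keys is the sorted key list [k_1, ..., k_{q-1}] of one internal node.
    bisect_right returns the number of keys that are <= x, which is exactly
    the position of the pointer whose subtree satisfies k_{i-1} <= X < k_i.
    """
    return bisect_right(keys, x)

# Keys taken from the car-make example on the slides.
keys = ["Mazda", "Toyota"]
for make in ("Holden", "Mazda", "Suzuki", "Toyota", "Volvo"):
    print(make, "-> pointer position", child_index(keys, make))
```

A value equal to a key goes to the right of that key, which matches the $k_{i-1} \le X$ side of the rule.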
2. Pointers
Pointers are tree pointers to blocks that are tree nodes.
For a tree of order $q$:
Each internal node has at most $q$ tree pointers.
Each internal node has at least $\lceil q/2 \rceil$ tree pointers, except the root, which needs at least 2 tree pointers.
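The occupancy limits for internal nodes can be checked with a few lines of arithmetic. This is only a sketch: the function name `pointer_bounds` is hypothetical, and it assumes that q/2 is rounded up and that the root is an internal node (not a leaf):

```python
import math

def pointer_bounds(q, is_root=False):
    """Return (minimum, maximum) tree pointers for an internal node of order q."""
    if q < 3:
        raise ValueError("order must be at least 3")
    minimum = 2 if is_root else math.ceil(q / 2)
    return minimum, q

for q in (3, 5):
    print("order", q, "internal:", pointer_bounds(q), "root:", pointer_bounds(q, is_root=True))
```

A node with n tree pointers holds n - 1 keys, so the same bounds also fix how many keys a node may hold.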
B+ Tree Index: Leaf Node
A leaf node is of the form:
$[\langle k_1, pr_1 \rangle, \ldots, \langle k_{q-1}, pr_{q-1} \rangle, p_{next}]$
1. Pointers $pr_i$
Each $pr_i$ is a data pointer to the record(s) with value $k_i$:
If $k_i$ is a key field (i.e., unique), $pr_i$ points to the data block that contains the record.
If $k_i$ is a non-key field (i.e., values may repeat), $pr_i$ points to a block containing pointers to the data file records (as in a secondary index).
2. Pointer $p_{next}$
$p_{next}$ points to the next leaf node.
Leaf nodes are linked to provide ordered access on the indexing field.
Traversing the leaf nodes as a linked list is very useful for range searches (e.g., cars between BMW and Mazda).
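The range search over linked leaves can be sketched as follows. The `Leaf` class, the `range_scan` function, and the exact leaf layout are assumptions made for illustration; data pointers are left out, and the scan starts at the leftmost leaf rather than at the leaf a tree descent would find:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Leaf:
    keys: List[str]                    # sorted search values in this leaf
    next: Optional["Leaf"] = None      # plays the role of the next-leaf pointer

def range_scan(start_leaf, low, high):
    """Collect all keys with low <= key <= high by walking the leaf chain."""
    result = []
    leaf = start_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:             # keys are ordered, so the scan can stop
                return result
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result

# Hypothetical leaf layout for the car-make example.
l4 = Leaf(["Toyota", "Volvo"])
l3 = Leaf(["Suzuki"], l4)
l2 = Leaf(["Mazda"], l3)
l1 = Leaf(["BMW", "Holden"], l2)

print(range_scan(l1, "BMW", "Mazda"))
```

No internal node is revisited during the scan, which is why the linked leaves make range queries cheap.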
Example B+ Tree
Search for all car makes ≥ Mazda (alphabetically)
[Figure: B+ tree with root key Suzuki above the internal nodes Mazda and Toyota; the leaf nodes point to data blocks holding Holden, Suzuki, Mazda, Toyota, Volvo, and BMW.]
Dynamic Multi-Level Index (B+ trees)
Insertion:
Inserting into a node that is not full is quite efficient.
If a node is full, it is split into two nodes.
Splitting may propagate to other tree levels.
Deletion:
Deleting is efficient if the node remains at least half full.
Otherwise, the node is merged with neighboring nodes.
Example B+ Tree
Insert Nissan
[Figure: B+ tree with root key Suzuki and internal nodes Mazda and Toyota; the data blocks now hold Holden, Nissan, Suzuki, Mazda, Toyota, Volvo, and BMW.]
Example B+ Tree
Insert Skoda
[Figure: B+ tree with root key Suzuki and internal nodes Mazda and Toyota; Skoda is added to the data blocks (Holden, Nissan, Suzuki, Mazda, Toyota, Skoda, Volvo, BMW).]
Node overflow: the leaf holding Mazda and Nissan has no room for Skoda, so it must be split and its parent updated.
Example B+ Tree
Insert Skoda
Split leaf node L:
1. Redistribute entries: if L is of order p, then (p+1)/2 entries stay in L and the rest move to Lnew (e.g., (3+1)/2 = 2).
2. Copy up the first key in Lnew into L's parent (e.g., Skoda).
3. Insert a pointer to Lnew into L's parent.
[Figure: after the split, L holds Mazda and Nissan, Lnew holds Skoda, and L's parent holds Mazda and Skoda; the root key is Suzuki and the other internal node holds Toyota.]
Example B+ Tree
Insert Audi
[Figure: B+ tree with root key Suzuki; Audi is added to the data blocks (Holden, Nissan, Suzuki, Mazda, Toyota, Skoda, Volvo, BMW, Audi).]
The leaf holding BMW and Holden is split, and Holden is inserted into the parent.
Node overflow: the parent would then hold Holden, Mazda, and Skoda, so the internal node must be split as well.
Example B+ Tree
Insert Audi
[Figure: after the internal node split, the root holds Mazda and Suzuki; the data blocks hold Holden, Nissan, Suzuki, Mazda, Toyota, Skoda, Volvo, BMW, and Audi.]
"Mazda" is pushed up and appears only once in the index (compare this with a leaf node split).
Inserting a new entry into a B+ tree
Find the correct leaf L and put the new entry into L.
If L has enough space, done.
Else, L must be split (into L and a new node Lnew):
redistribute entries evenly: if L is of order p, then (p+1)/2 entries stay in L and the rest move to Lnew (or some other similar rule)
copy up the first key in Lnew into L's parent
insert a pointer to Lnew into L's parent
This can happen recursively.
To split an internal node:
redistribute entries evenly
push up the middle key (vs. copy up): the key moves to the parent and stays in neither half
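The insertion steps above can be sketched in code. This is an illustrative sketch, not a reference implementation: the `Node` and `BPlusTree` classes are hypothetical, data pointers and duplicate keys are ignored, a leaf of an order-p tree is assumed to hold at most p - 1 entries, and the starting leaf layout for the car-make example is an assumption.

```python
from bisect import bisect_right, insort

class Node:
    def __init__(self, leaf, keys=None, children=None):
        self.leaf = leaf
        self.keys = keys or []
        self.children = children or []   # used by internal nodes only
        self.next = None                 # used by leaf nodes only

class BPlusTree:
    def __init__(self, order):
        self.order = order
        self.root = Node(True)

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:            # the root was split: the tree grows one level
            sep, right = split
            self.root = Node(False, [sep], [self.root, right])

    def _insert(self, node, key):
        """Insert key below node; return (key_for_parent, new_node) if node was split."""
        if node.leaf:
            insort(node.keys, key)
            if len(node.keys) < self.order:          # still fits: done
                return None
            keep = (self.order + 1) // 2             # (p+1)/2 entries stay in L
            new = Node(True, node.keys[keep:])       # the rest move to Lnew
            node.keys = node.keys[:keep]
            new.next, node.next = node.next, new     # keep the leaf chain linked
            return new.keys[0], new                  # COPY up: key also stays in Lnew
        i = bisect_right(node.keys, key)
        split = self._insert(node.children[i], key)
        if split is None:
            return None
        sep, right = split
        node.keys.insert(i, sep)
        node.children.insert(i + 1, right)
        if len(node.children) <= self.order:
            return None
        mid = len(node.keys) // 2
        up = node.keys[mid]                          # PUSH up: key leaves this level
        new = Node(False, node.keys[mid + 1:], node.children[mid + 1:])
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return up, new

# Assumed starting tree (order 3) for the car-make example.
tree = BPlusTree(order=3)
l1, l2, l3, l4 = (Node(True, ks) for ks in
                  (["BMW", "Holden"], ["Mazda"], ["Suzuki"], ["Toyota", "Volvo"]))
l1.next, l2.next, l3.next = l2, l3, l4
tree.root = Node(False, ["Suzuki"],
                 [Node(False, ["Mazda"], [l1, l2]), Node(False, ["Toyota"], [l3, l4])])

for make in ("Nissan", "Skoda", "Audi"):
    tree.insert(make)
print(tree.root.keys)
```

In this run, Skoda is copied up (it remains in its leaf), while Mazda is pushed up from the overflowing internal node into the root and no longer appears at the level below.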
Example B+ Tree
Insert "8"
Step 3: Push up "17" into parent
Example B+ Tree After Inserting 8
Search in a B+ tree index
Searching for a record using a B+ tree:
1) Read one block at each level in the index
Number of accessed index blocks = tree height
2) Read the data block containing the searched record
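The cost rule can be illustrated by counting index blocks during a descent. A small sketch, assuming each node is a plain dictionary and that every visited node costs one block read; the `search` function and the tree layout are hypothetical:

```python
from bisect import bisect_right

def search(node, key):
    """Descend from node to a leaf; return (found, index_blocks_read)."""
    reads = 1                                   # reading the root block
    while "children" in node:                   # internal node: follow one pointer
        node = node["children"][bisect_right(node["keys"], key)]
        reads += 1                              # one more index block
    return key in node["keys"], reads

# Assumed three-level tree for the car-make example.
root = {
    "keys": ["Suzuki"],
    "children": [
        {"keys": ["Mazda"], "children": [{"keys": ["BMW", "Holden"]}, {"keys": ["Mazda"]}]},
        {"keys": ["Toyota"], "children": [{"keys": ["Suzuki"]}, {"keys": ["Toyota", "Volvo"]}]},
    ],
}

found, index_reads = search(root, "Mazda")
print(found, index_reads, "index blocks;", index_reads + 1, "blocks including the data block")
```

For a unique key field, the total is the tree height plus one data block, regardless of which key is searched.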
Impact of tree order
[Figure: example trees for Tree Order = 3 and Tree Order = 5.]
Search in a B+ tree index
Tree height depends on:
1. The number of entries in leaf nodes (e), and
2. Tree order (i.e., fan-out (fo))
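The dependence of height on e and fo can be made concrete with a rough estimate. This is a sketch under stated assumptions, not a formula from the slides: every node is assumed to be completely full, and `leaf_capacity` (entries per leaf node) is a hypothetical parameter introduced for illustration.

```python
import math

def estimate_height(e, fo, leaf_capacity):
    """Estimate the number of levels for e leaf entries and fan-out fo."""
    if fo < 2 or leaf_capacity < 1:
        raise ValueError("fan-out must be >= 2 and leaf capacity >= 1")
    nodes = math.ceil(e / leaf_capacity)   # number of leaf nodes
    height = 1                             # the leaf level itself
    while nodes > 1:                       # add levels until a single root remains
        nodes = math.ceil(nodes / fo)
        height += 1
    return height

print(estimate_height(1_000_000, fo=100, leaf_capacity=100))
print(estimate_height(1_000_000, fo=10, leaf_capacity=10))
```

A larger fan-out shrinks each level by a larger factor, so the same number of entries needs fewer levels and therefore fewer block reads per search.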