SINGLE AND
MULTI-LEVEL INDEXING,
B TREES & B+ TREES
Group:
• Vaibhav Gupta
• Ankit Kumar Meena
• Abhishek Kumar
• Varun Kumar Verma
• Krishna
CONTENTS
1 Introduction to Indexing
2 Types of Indexing
3 Single-Level Indexing & Structure
4 Limitations of Single-Level Indexing
5 Multi-Level Indexing
6 Structure of Multi-Level Indexing
7 Advantages and Disadvantages of Single and Multi-Level Indexing
8 Introduction to B-Tree
9 Structure of B-Tree
10 Operations in B-Tree
11 Introduction to B+ Tree
12 Structure of B+ Tree
13 Limitations and Advantages of B+ Tree and B-Tree
14 Conclusion and Key Points
INTRODUCTION
• Indexing is a technique used to optimize the performance of
database queries by reducing the time taken to locate specific
records. It involves creating an additional data structure, the index,
which acts as a roadmap for the database.
Purpose:
• Speeds up data retrieval by minimizing the need to scan the entire
dataset.
• For ordered indexes such as sorted tables and B-Trees, reduces search
complexity from linear time O(n) to logarithmic time O(log n).
• Helps in scenarios where the database size is large and frequent
queries are performed.
Key Concepts:
• Keys: The indexed attribute values (unique in a primary index) that
correspond to data entries.
• Pointers: References that connect keys to the actual location of the
data in the database.
Example: Consider a phone book. Instead of searching through every
name, you can directly jump to the section starting with a specific
letter, drastically reducing the search effort.
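The gap between O(n) and O(log n) can be made concrete with a small sketch. The data below is made up, and the two functions are illustrative helpers rather than part of any database engine; they only count how many keys each strategy inspects.

```python
def linear_search(keys, target):
    """Scan every key in turn; return (position, comparisons made)."""
    comparisons = 0
    for pos, key in enumerate(keys):
        comparisons += 1
        if key == target:
            return pos, comparisons
    return -1, comparisons

def binary_search(sorted_keys, target):
    """Halve the search range each step; return (position, probes made)."""
    lo, hi, probes = 0, len(sorted_keys) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] == target:
            return mid, probes
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes

keys = list(range(0, 2_000_000, 2))   # 1,000,000 sorted keys (made-up data)
target = 1_999_998                    # the last key: worst case for the scan
print(linear_search(keys, target))    # inspects all 1,000,000 keys
print(binary_search(keys, target))    # needs at most 20 probes
```

The scan touches every key, while the binary search over the same sorted keys needs no more than 20 probes, since 2^20 exceeds one million.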
TYPES OF INDEXING
Primary Indexing
• Built on the primary key of a database table.
• Ensures each key is unique, allowing one-to-one
mapping with records.
• Example: A student ID in a school database serves as a
primary index, providing a unique reference to each
student's record.
Secondary Indexing
• Applied to non-primary key attributes, allowing
multiple ways to access data.
• Example: Indexing student names in a database
lets users search for records based on names rather
than IDs.
Clustering Indexing
• Groups records based on a non-unique attribute
that defines the physical clustering of data on disk.
• Example: Indexing a department name where
multiple students belong to the same department.
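The three index types can be sketched with plain Python containers. The student records and the names `primary_index`, `name_index`, and `cluster_index` are hypothetical and chosen only for illustration; list positions stand in for disk pointers.

```python
from collections import defaultdict

# Hypothetical student records: (student_id, name, department).
students = [
    (101, "Asha", "Physics"),
    (102, "Ravi", "Maths"),
    (103, "Asha", "Maths"),
    (104, "Meera", "Physics"),
]

# Primary index: unique key -> position of exactly one record.
primary_index = {sid: pos for pos, (sid, _, _) in enumerate(students)}

# Secondary index: non-key attribute -> positions of all matching records.
name_index = defaultdict(list)
for pos, (_, name, _) in enumerate(students):
    name_index[name].append(pos)

# Clustering index: records are physically ordered by a non-unique attribute,
# so each department is one contiguous run; the index stores where it starts.
clustered = sorted(students, key=lambda rec: rec[2])
cluster_index = {}
for pos, (_, _, dept) in enumerate(clustered):
    cluster_index.setdefault(dept, pos)

print(students[primary_index[103]])
print([students[p] for p in name_index["Asha"]])
print(cluster_index)
```

The primary index returns one record per key, the secondary index returns a list because names repeat, and the clustering index only needs one entry per department because matching records sit next to each other.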
Why Single & Multi-Level
Indexing
• Multi-level indexing is a hierarchical approach that breaks the
index into multiple levels.
• It suits large datasets where a single-level
index becomes impractical.
SINGLE-LEVEL INDEXING &
STRUCTURE
Definition:
• Single-level indexing is the simplest form of indexing, where each
record in the dataset is indexed directly by its key.
Structure Details:
• Comprises a table with two columns: keys and
pointers.
• Each pointer directs to the corresponding data
in the database.
Performance:
• Works efficiently for small datasets but
becomes inefficient as the dataset grows.
• The index itself may require substantial
memory, and frequent I/O operations can slow
down performance.
Example: Imagine a library system where books
are indexed by their ISBN numbers. The index table
lists the ISBN along with a pointer to the shelf
location.
Visualization: Include a diagram showing a simple
[Key | Pointer] structure, with arrows pointing to
the corresponding data records.
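A minimal sketch of the [Key | Pointer] table for the library example follows. The ISBNs and shelf labels are invented, and a list position plays the role of the pointer; a real system would store a disk address instead.

```python
from bisect import bisect_left

# Hypothetical data file: each record lives at some position (the "pointer").
shelf_records = ["Shelf C4", "Shelf A1", "Shelf B7", "Shelf A9"]

# The index: (ISBN, pointer) rows kept sorted by ISBN. ISBNs are made up.
index = sorted([
    ("978-0-00-000003-1", 0),
    ("978-0-00-000001-7", 1),
    ("978-0-00-000004-8", 2),
    ("978-0-00-000002-4", 3),
])
index_keys = [isbn for isbn, _ in index]

def lookup(isbn):
    """Binary-search the key column, then follow the pointer to the record."""
    i = bisect_left(index_keys, isbn)
    if i < len(index_keys) and index_keys[i] == isbn:
        pointer = index[i][1]
        return shelf_records[pointer]
    return None

print(lookup("978-0-00-000001-7"))
```

The records themselves stay in whatever order they were stored; only the index is sorted, which is what makes the binary search possible.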
LIMITATIONS OF SINGLE-LEVEL
INDEXING
Size Constraints:
• As the dataset grows, the index table also increases in size, often
surpassing the memory limits of the system. This results in slower access
times and higher resource consumption.
Performance Degradation:
• For large datasets, the number of disk I/O operations required to search
through the index becomes significant, negating the speed benefits of
indexing.
Static Nature:
• Adding or removing records requires extensive updates to the index table,
making it unsuitable for dynamic datasets where data changes frequently.
Example Problem:
• Consider a retail database with millions of product records. Maintaining a
single-level index for all products would require substantial resources and
become inefficient.
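A rough estimate shows how quickly a single-level index grows. The figures used here (5 million records, 16-byte index entries, 4096-byte disk blocks, one index entry per record) are assumptions picked for illustration, not measurements from a real system.

```python
import math

def single_level_index_cost(num_records, entry_bytes, block_bytes):
    """Return (index blocks, block reads for a binary search over them).

    Assumes a dense index: one (key, pointer) entry per record.
    """
    entries_per_block = block_bytes // entry_bytes
    index_blocks = math.ceil(num_records / entries_per_block)
    search_reads = math.ceil(math.log2(index_blocks)) if index_blocks > 1 else 1
    return index_blocks, search_reads

blocks, reads = single_level_index_cost(
    num_records=5_000_000, entry_bytes=16, block_bytes=4096)
print(blocks, reads)
```

Under these assumptions the index occupies 19,532 blocks (about 80 MB) and a binary search over it costs around 15 block reads per lookup, before the data block itself is fetched.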
MULTI-LEVEL INDEXING
Definition:
• Multi-level indexing is a hierarchical approach to indexing that divides
a large index into smaller, more manageable segments, improving
search efficiency.
Concept:
• The index is split into multiple levels.
• The top-level index contains pointers to second-level indices,
and so on.
• The final level directly points to the data.
Benefits:
• Reduces the size of individual index blocks.
• Minimizes the number of disk accesses required to locate data.
How It Works:
1. Start at the top-level index.
2. Follow the pointer to the appropriate second-level index.
3. Continue until the desired record is located.
Use Case:
Ideal for very large databases with millions of records, such as
social media platforms or e-commerce sites.
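How many levels such an index needs can be estimated with a short calculation. The sketch assumes each index block holds `fanout` entries and keeps adding levels until the top level fits in one block; the numbers (5 million entries, 256 entries per block) are illustrative assumptions.

```python
import math

def multilevel_index_levels(num_entries, fanout):
    """Count index levels until the top level fits in a single block."""
    levels = 1
    blocks = math.ceil(num_entries / fanout)
    while blocks > 1:
        blocks = math.ceil(blocks / fanout)   # index the level below
        levels += 1
    return levels

levels = multilevel_index_levels(num_entries=5_000_000, fanout=256)
print(levels)
```

With these assumed figures the function returns 3, so a lookup reads one block per level, three index blocks in total, regardless of which record is requested.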
STRUCTURE OF MULTI-
LEVEL INDEXING
Top-Level Index:
Points to blocks containing the second-level index.
• Example: Contains pointers to index blocks for alphabetical
ranges, like A-M and N-Z.
Second-Level Index:
Points to specific data blocks or further lower-level indices.
• Example: Contains pointers to subsets of records within the
alphabetical range.
Search Procedure:
1. Start at the top-level index to locate the relevant block.
2. Traverse through each level until the data is found.
Efficiency:
By reducing the number of index blocks read per lookup, multi-level indexing
improves search performance for large datasets.
Diagram: Include a hierarchical tree-like structure to visually
represent the levels.
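The search procedure can be sketched as a two-level lookup over the A-M / N-Z example. All names and records below are invented, and nested Python lists stand in for disk blocks.

```python
from bisect import bisect_right

# Hypothetical data blocks, each a sorted list of (name, record) pairs.
data_blocks = [
    [("Aditi", "rec1"), ("Bhavna", "rec2")],
    [("Farhan", "rec3"), ("Kiran", "rec4")],
    [("Nikhil", "rec5"), ("Pooja", "rec6")],
    [("Suresh", "rec7"), ("Zoya", "rec8")],
]

# Second level: one index block per alphabetical range; each entry is
# (first key in a data block, data block number).
second_level = [
    [("Aditi", 0), ("Farhan", 1)],     # covers A-M
    [("Nikhil", 2), ("Suresh", 3)],    # covers N-Z
]

# Top level: (first key in a second-level block, second-level block number).
top_level = [("Aditi", 0), ("Nikhil", 1)]

def follow(entries, key):
    """Return the pointer of the last entry whose first key is <= key."""
    firsts = [first for first, _ in entries]
    i = bisect_right(firsts, key) - 1
    return entries[max(i, 0)][1]

def search(key):
    second_block = follow(top_level, key)                  # top-level index
    data_block = follow(second_level[second_block], key)   # second-level index
    for name, record in data_blocks[data_block]:           # scan one data block
        if name == key:
            return record
    return None

print(search("Kiran"))
```

Each call reads one top-level block, one second-level block, and one data block, which is the fixed cost that makes the scheme attractive for large files.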
ADVANTAGES AND DISADVANTAGES
OF SINGLE AND MULTI-LEVEL
INDEXING
Single-Level Indexing:
Advantages:
• Simple to implement and maintain.
• Suitable for small datasets.
Disadvantages:
• Does not scale well for large datasets.
• Performance degrades with frequent updates.
Multi-Level Indexing:
Advantages:
• Efficient for large datasets.
• Reduces disk I/O operations.
Disadvantages:
• More complex to maintain.
• Performance can be affected by frequent updates.
INTRODUCTION TO B-TREE
Definition:
A B-Tree is a self-balancing search tree data structure that
maintains sorted data and allows searches, sequential access,
insertions, and deletions in logarithmic time. It is widely used in
databases and file systems to organize and retrieve data
efficiently.
Key Properties:
• Ensures all leaf nodes are at the same depth, maintaining balance
across the tree.
• Each node can contain multiple keys and child pointers.
• The number of keys in a node is determined by the order of the
tree.
Applications:
• Database indexing to speed up query execution.
• File systems to manage file storage and access.
• Used in scenarios where data is too large to fit into main memory.
Comparison to Binary Trees:
Unlike binary trees, where each node has at most two children, a B-Tree node can have many children, depending on its order.
This characteristic reduces the height of the tree, leading to fewer disk I/O operations.
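The effect of fan-out on height can be checked numerically. The sketch below computes the best case only, assuming every node is completely full; the figures of one million keys and 128 children per node are illustrative assumptions.

```python
def min_levels(num_keys, max_children):
    """Fewest levels needed when every node has max_children children
    and therefore holds max_children - 1 keys (fully packed tree)."""
    levels, capacity = 0, 0
    while capacity < num_keys:
        # Level `levels` has max_children ** levels nodes.
        capacity += (max_children - 1) * max_children ** levels
        levels += 1
    return levels

print(min_levels(1_000_000, 2))     # binary tree
print(min_levels(1_000_000, 128))   # B-Tree node with 128 children
```

For one million keys a perfectly balanced binary tree needs 20 levels, while a tree whose nodes have 128 children needs 3. If each level costs one disk read, that difference dominates lookup time.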
STRUCTURE OF B TREE
Node Composition:
• A B-Tree node can store multiple keys and pointers
to child nodes.
• The number of keys and children is governed by
the order of the tree (m); the lower bounds apply to
every node except the root:
⚬ Keys in a node: ⌈m/2⌉ - 1 ≤ keys ≤ m - 1
⚬ Children of an internal node: ⌈m/2⌉ ≤ children ≤ m
Leaf Level:
• All leaf nodes are at the same depth, ensuring
balance.
• Leaf nodes store actual data or pointers to data
blocks.
Example:
For a B-Tree of order 3, a node can hold up to 2 keys
and 3 pointers. Keys are kept sorted within the node,
and pointers guide the search process.
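The bounds for any order m follow directly from the formulas above. The helper name `btree_bounds` is hypothetical; the lower bounds it returns apply to non-root nodes, since the root may hold as little as one key.

```python
import math

def btree_bounds(m):
    """Key and child limits for a non-root node of a B-Tree of order m."""
    min_children = math.ceil(m / 2)
    return {
        "min_keys": min_children - 1,
        "max_keys": m - 1,
        "min_children": min_children,
        "max_children": m,
    }

print(btree_bounds(3))
print(btree_bounds(5))
```

For order 3 this gives 1 to 2 keys and 2 to 3 children, matching the example; for order 5 it gives 2 to 4 keys and 3 to 5 children.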
OPERATIONS IN B-TREE
Insertion:
1. Locate the appropriate leaf node for the new key.
2. If the node has space, insert the key and rearrange in sorted
order.
3. If the node is full, split it into two nodes and promote the
middle key to the parent node, splitting the parent in turn if it
overflows.
Deletion:
1. Locate the key to delete.
2. Remove the key. If underflow occurs, rebalance by borrowing
a key from a sibling or merging nodes.
Search:
1. Start at the root node and compare the key.
2. Follow the child pointer that corresponds to the range of the
search key.
3. Repeat until the key is found or a leaf node is reached.
Balancing:
The tree remains balanced by ensuring no node other than the root
has fewer than ⌈m/2⌉ - 1 keys.
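Search and insertion can be sketched in a few dozen lines. This is a simplified in-memory illustration, not a production implementation: the class names are hypothetical, keys carry no data pointers, duplicates are ignored, deletion is omitted, and a node is allowed to hold one key too many for a moment before it is split (equivalent in effect to splitting a full node).

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty list means this is a leaf

class BTree:
    def __init__(self, order):
        self.order = order               # m: maximum children per node
        self.root = Node()

    def search(self, key):
        node = self.root
        while True:
            i = bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                return True
            if not node.children:        # reached a leaf without a match
                return False
            node = node.children[i]      # child covering the key's range

    def insert(self, key):
        split = self._insert(self.root, key)
        if split:                        # root overflowed: tree grows a level
            middle, right = split
            self.root = Node([middle], [self.root, right])

    def _insert(self, node, key):
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return None                  # ignore duplicates in this sketch
        if not node.children:
            node.keys.insert(i, key)     # leaf: place key in sorted order
        else:
            split = self._insert(node.children[i], key)
            if split:                    # child split: absorb promoted key
                middle, right = split
                node.keys.insert(i, middle)
                node.children.insert(i + 1, right)
        if len(node.keys) <= self.order - 1:
            return None
        # Overflow: split around the middle key and promote it.
        mid = len(node.keys) // 2
        middle = node.keys[mid]
        right = Node(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return middle, right

tree = BTree(order=3)
for k in [10, 20, 5, 6, 12, 30, 7, 17]:
    tree.insert(k)
print(tree.search(12), tree.search(99))
```

Because the tree only ever grows by splitting the root, every leaf stays at the same depth without any separate rebalancing pass.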
INTRODUCTION TO B+ TREE
Definition:
A B+ Tree is an advanced variation of the B-Tree,
designed specifically for database indexing. It
differs from a B-Tree by storing all data at the leaf
nodes and using internal nodes solely for
navigation.
Key Characteristics:
• All data records are stored in the leaf nodes, making
range queries faster and more efficient.
• Leaf nodes are linked sequentially, allowing direct
traversal in order.
• Internal nodes only contain keys for navigation, reducing
their size.
Applications:
• Frequently used in database systems where range queries
are common.
• Suitable for applications requiring ordered sequential access
to data.
Comparison to B-Tree:
• B-Trees store keys and data together at
all levels, while B+ Trees separate them
for better sequential access.
STRUCTURE OF B+ TREE
Internal Nodes:
• Contain only keys, which act as navigation points to direct
searches.
• No data pointers are stored in internal nodes.
Leaf Nodes:
• Hold all the data pointers, ensuring all data is at the same level.
• Linked together in a linked list (often doubly linked), enabling
efficient range queries and traversal.
Advantages Over B-Tree:
• Sequential access is faster due to linked leaf nodes.
• Internal nodes without data pointers fit more keys per node, which
raises the fan-out, lowers the tree height, and reduces disk I/O.
Example Diagram:
Illustrate a B+ Tree with internal nodes containing keys and leaf
nodes containing data pointers, linked sequentially.
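A range query over linked leaves can be sketched as follows. The tree here is hand-built with one internal node and three leaves, all values are invented, and only forward links are used, which is an assumption made to keep the example short.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, keys, values):
        self.keys = keys          # sorted keys
        self.values = values      # data pointers (here: plain strings)
        self.next = None          # link to the next leaf in key order

# Hypothetical leaf level: three leaves chained left to right.
leaves = [
    Leaf([5, 10], ["r5", "r10"]),
    Leaf([15, 20], ["r15", "r20"]),
    Leaf([25, 30], ["r25", "r30"]),
]
for left, right in zip(leaves, leaves[1:]):
    left.next = right

# A single internal node: keys only, used purely for navigation.
# separators[i] is the smallest key in leaves[i + 1].
separators = [15, 25]

def find_leaf(key):
    """Use the internal node to pick the leaf that could hold `key`."""
    return leaves[bisect_right(separators, key)]

def range_query(low, high):
    """Descend once to the leaf for `low`, then walk the leaf chain."""
    leaf = find_leaf(low)
    results = []
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > high:
                return results
            if k >= low:
                results.append(v)
        leaf = leaf.next
    return results

print(range_query(10, 25))
```

The internal node is consulted only once; after that the query follows `next` links, which is why range scans avoid repeated descents from the root.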
LIMITATIONS AND ADVANTAGES OF
B+ TREE AND B-TREE
B TREE
Advantages:
• Supports faster insertions and deletions as keys and data are
stored together.
• Suitable for scenarios where random access is prioritized.
Disadvantages:
• Sequential access is slower as data is spread across multiple
levels.
B+ TREE
Advantages:
• Optimized for range queries and sequential access due to linked
leaf nodes.
• Efficient use of internal nodes for navigation.
Disadvantages:
• Requires slightly more storage space, as keys held in internal
nodes are duplicated in the leaf nodes.
Key Differences:
• B+ Trees store data only in leaf nodes, while B-Trees store data
at all levels.
• B+ Trees are more efficient for range queries, while B-Trees are
better for random access.
CONCLUSION AND KEY
POINTS
Summary:
This presentation explored advanced indexing mechanisms, highlighting their importance in
database optimization:
1. Indexing reduces search time and improves query performance.
2. Multi-level indexing addresses scalability for large datasets.
3. B-Trees ensure balanced searches with logarithmic time complexity.
4. B+ Trees optimize sequential access and range queries.
Key Takeaways:
• Choose an indexing method based on data size and query patterns.
• B+ Trees are ideal for database systems due to their efficient range query handling.
• Effective indexing is critical for large-scale systems like databases, search engines,
and file storage.
Closing Note:
• Efficient indexing is not just a technical requirement but a strategic tool for enhancing
application performance.
Presented by GROUP 9