Cs3301 UNIT III - V Data Structure Notes
TREES
Tree ADT – Tree Traversals - Binary Tree ADT – Expression trees – Binary Search Tree ADT – AVL Trees
– Priority Queue (Heaps) – Binary Heap.
Various operations that can be performed on the tree data structure are -
1. Creation of a tree.
2. Insertion of a node in the tree as a child of desired node.
3. Deletion of any node(except root node) from the tree.
4. Modification of the node value of the tree.
5. Searching particular node from the tree.
3. Child nodes
The child nodes in above given tree are marked as shown below
4. Leaves
These are the terminal nodes of the tree.
For example -
5. Degree of the node
The total number of subtrees attached to that node is called the degree of a node.
For example.
6. Degree of tree
The maximum degree in the tree is degree of tree.
12. Sibling
The nodes with common parent are called siblings or brothers.
For example
In this chapter we will deal with special type of trees called binary trees. Let us understand it.
Algorithm:
1. If tree is not empty then
a. traverse the left subtree in inorder
b. visit the root node
c. traverse the right subtree in inorder
The recursive routine for inorder traversal is as given below.
void inorder(node *temp)
{
if(temp!= NULL)
{
inorder(temp->left);
printf("%d",temp->data);
inorder(temp->right);
}
}
2) Preorder Traversal :
In this type of traversal, the parent node or root node is visited first; then left node and finally right node will
be visited.
For example
Algorithm:
1. If tree is not empty then
a. visit the root node
b. traverse the left subtree in preorder
c. traverse the right subtree in preorder
The recursive routine for preorder traversal is as given below.
void preorder(node *temp)
{
if(temp!= NULL)
{
printf("%d",temp->data);
preorder(temp->left);
preorder(temp->right);
}
}
3) Postorder Traversal :
In this type of traversal, the left node is visited first, then right node and finally parent node is visited.
For example
Algorithm:
1. If tree is not empty then
a. traverse the left subtree in postorder
b. traverse the right subtree in postorder
c. visit the root node
The recursive routine for postorder traversal is as given below.
void postorder(node *temp)
{
if(temp!= NULL)
{
postorder(temp->left);
postorder(temp->right);
printf("%d",temp->data);
}
}
Ex. 4.6.1: Write inorder, preorder and postorder traversal for the following tree:
Sol. :
Inorder 8 10 11 30 20 25 40 42 45 60 50 55
Preorder 40 30 10 8 11 25 20 50 45 42 60 55
Postorder 8 11 10 20 25 30 42 60 45 55 50 40.
Ex. 4.6.2. Implementation of binary tree
/***************************************************************
Program for creation of a binary tree and display the tree using recursive inorder, preorder and post order
traversals
****************************************************************/
#include <stdio.h>
#include <alloc.h>
#include<conio.h>
typedef struct bin
{
int data;
struct bin *left;
struct bin *right;
}node;/*Binary tree structure*/
void insert(node *, node *);
void inorder(node *);
void preorder(node *);
void postorder(node *);
node *get_node();
void main()
{
int choice;
char ans='n';
node *New,*root;
root=NULL;
clrscr();
do
{
printf("\n Program For Implementing Simple Binary Tree");
printf("\n 1.Create");
printf("\n 2.Inorder");
printf("\n 3.Preorder");
printf("\n 4.Postorder");
printf("\n 5.Exit");
printf("\n\t Enter Your Choice: ");
scanf("%d", &choice);
switch(choice)
{
case 1:root = NULL;
do
{
New = get_node();
printf("\n Enter The Element: ");
scanf("%d", &New->data);
if(root == NULL)
root=New;
else
insert(root,New);
printf("\n Do You want To Enter More elements?(y/n):");
ans=getche();
} while (ans=='y' || ans == 'Y');
clrscr();
break;
case 2:if(root == NULL)
printf("Tree Is not Created!");
else
inorder(root);
break;
case 3:if(root == NULL)
printf("Tree Is Not Created!");
else
preorder(root);
break;
case 4:if(root == NULL)
printf("Tree Is Not Created!");
else
postorder(root);
break;
}
}while(choice!=5);
}
node *get_node()
{
node *temp;
temp=(node *)malloc(sizeof(node));
temp->left = NULL;
temp->right= NULL;
return temp;
}
void insert(node *root,node *New)
{
char ch;
printf("\n Where to insert left/right of %d: ", root->data);
ch=getche();
if ((ch=='r') || (ch=='R'))
{
if(root->right== NULL)
{
root->right=New;
}
else
insert(root->right,New);
}
else
{
if (root->left== NULL)
{
root->left=New;
}
else
insert(root->left, New);
}
}
void inorder(node *temp)
{
if(temp!= NULL)
{
inorder(temp->left);
printf("%d",temp->data);
inorder(temp->right);
}
}
void preorder(node *temp)
{
if(temp!= NULL)
{
printf("%d",temp->data);
preorder(temp->left);
preorder(temp->right);
}
}
void postorder(node *temp)
{
if(temp!= NULL)
{
postorder(temp->left);
postorder(temp->right);
printf("%d",temp->data);
}
}
Output
Program For Implementing Simple Binary Tree
1. Create
2. Inorder
3. Preorder
4. Postorder
5. Exit
Enter Your Choice: 1
Enter The Element: 10
Do You Want To Enter More Elements?(y/n): y
Enter The Element: 12
Where to insert left/right of 10: l
Do You Want To Enter More Elements?(y/n): y
Enter The Element: 17
Where to insert left/right of 10: r
Do You Want To Enter More Elements?(y/n): y
Enter The Element: 8
Where to insert left/right of 10: l
Where to insert left/right of 12: r
Do You Want To Enter More Elements?(y/n):
Program For Implementing Simple Binary Tree
1. Create
2. Inorder
3. Preorder
4. Postorder
5. Exit
Enter Your Choice: 2
12 8 10 17
Program For Implementing Simple Binary Tree
1. Create
2. Inorder
3. Preorder
4. Postorder
5. Exit
Enter Your Choice: 3
10 12 8 17
Program For Implementing Simple Binary Tree
1. Create
2. Inorder
3. Preorder
4. Postorder
5. Exit
Enter Your Choice: 4
8 12 17 10
Program For Implementing Simple Binary Tree
1. Create
2. Inorder
3. Preorder
4. Postorder
5. Exit
Enter Your Choice: 5
• In other words, the full binary tree is a binary tree in which all the nodes have two children except the leaf
node.
Complete Binary Tree
• The complete binary tree is a binary tree in which all the levels are completely filled except the last level, which
is filled from the left.
• There are two points to be remembered
1) The leftmost side of the leaf node must always be filled first.
2) It is not necessary for the last leaf node to have right sibling.
+++++++++++++++++++++++++++++++++++++++++++
• Note that in above representation, Tree 1 is a complete binary tree in which all the levels are completely
filled, whereas Tree 2 is also a complete binary tree in which the last level is filled from left to right but it is
incompletely filled.
Difference between complete binary tree and full binary tree
Now, observe Fig. 4.5.1 carefully. You will notice that a binary tree of depth n has at most 2^n - 1
nodes. In Fig. 4.5.1 the tree has depth 4 and the total number of nodes is 15 (2^4 - 1 = 15). Thus remember that in a
binary tree of depth n there will be a maximum of 2^n - 1 nodes. So if we know the maximum depth of the tree,
we can represent the binary tree using an array, because we can then predict the maximum size
of an array that can accommodate the tree.
Thus array size can be >=n. The root will be at index 0. Its left child will be at index 1, its right child will be
at index 2 and so on. Another way of placing the elements in the array is by applying the formula as shown
below-
• When n = 0 the root node will be placed at the 0th location
• Parent(n) = floor((n-1)/2)
• Left(n) = (2n+1)
• Right(n) = (2n+2).
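The index formulas above can be sketched in C as follows. This is only an illustrative sketch: the helper names left_child, right_child, parent_of and max_nodes are hypothetical and do not appear in the notes, and the array is assumed to be 0-based with the root at index 0.

```c
#include <stddef.h>

/* Index helpers for a binary tree stored in a 0-based array
   (root at index 0). All names here are illustrative assumptions. */
size_t left_child(size_t n)  { return 2 * n + 1; }
size_t right_child(size_t n) { return 2 * n + 2; }

/* The root (n == 0) has no parent, so callers must pass n >= 1. */
size_t parent_of(size_t n)   { return (n - 1) / 2; }

/* Maximum number of nodes in a binary tree of the given depth:
   2^depth - 1. Assumes depth is smaller than the bit width of size_t. */
size_t max_nodes(unsigned depth) { return ((size_t)1 << depth) - 1; }
```

For instance, left_child(7) gives 15 and right_child(7) gives 16, and max_nodes(4) gives 15, which is the array size needed for a tree of depth 4.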
As these drawbacks are there with this sequential type of representation, we will search for more flexible
representation. So instead of array we will make use of linked list to represent the tree.
Root = A = index 0
Left(0) = 2*0 + 1 = 1
B = Index 1
Left (1) = 2*1+1=3
C = Index 3
Left (3) = 2*3+1 = 7
D = Index 7
Left (7) = 2*7+1=15
E = Index 15
Right (7) = 2*7 +2 = 16
F = Index 16
Linked Representation
3.4 Expression Trees :
Definition: An expression tree is a binary tree in which the operands are attached as leaf nodes and operators
become the internal nodes.
For example -
Now we will read each symbol from left to right one character at a time. If we read an operand then we will
make a node of it and push it onto the stack. If we read operator then pop two nodes from stack, the first
popped node will be attached as right child to operator node and second popped node will be attached as a left
child to operator node. For the above given expression let us build a tree.
As now we read '\0', pop the content of stack. The node which we will get is the root node of an expression
tree.
Let us implement it
Ex. 4.7.1 Program for creating an expression tree and printing it using an inorder traversal.
Sol. :
#include <stdio.h>
#include <conio.h>
#include <alloc.h>
#include <ctype.h>
#define size 20
typedef struct node
{
char data;
struct node *left;
struct node *right;
}btree;
/*stack stores the operand nodes of the tree*/
btree *stack[size];
int top;
void main()
{
btree *root;
char exp[80]; /* exp stores postfix expression */
btree *create(char exp[80]);
void display(btree *root);
clrscr();
printf("Enter the postfix expression\n");
scanf("%s",exp);
top = -1; /* Initialise the stack */
root = create(exp);
printf("\n The Tree is created...\n");
printf("\n The inorder traversal of it is \n");
display(root);
getch();
}
btree* create(char exp[])
{
btree *temp;
int pos;
char ch;
void push(btree *);
btree *pop();
pos = 0;
ch = exp[pos];
while (ch!= '\0')
{
/* Create a new node */
temp = (btree *)malloc(sizeof(btree));
temp -> left = temp -> right = NULL;
temp -> data = ch;
if (isalpha(ch) ) /* is it a operand */
push (temp); /* push operand */
else if (ch=='+' || ch=='-'||ch=='*' || ch=='/')
{
/* it is operator, so pop two nodes from stack
set first node as right child and
set second as left child and push the
operator node on to the stack
*/
temp->right = pop();
temp ->left = pop();
push(temp);
}
else
printf("Invalid character in expression\n");
pos ++;
ch= exp[pos]; /* Read next character */
}
temp = pop();
return(temp);
}
void push(btree *Node)
{
if (top+1 >= size)
{
printf("Error: Stack is Full\n");
return; /* do not write past the end of the stack */
}
top++;
stack[top] = Node;
}
btree* pop()
{
btree *Node;
if (top == -1)
{
printf("Error: Stack is Empty\n");
return NULL; /* nothing to pop */
}
Node = stack[top];
top--;
return(Node);
}
void display(btree *root)
{
btree *temp;
temp = root;
if (temp!= NULL)
{
display(temp->left);
printf("%c", temp->data);
display(temp->right);
}
}
Output
Enter the postfix expression
ab+cd-*
The Tree is created...
The inorder traversal of it is
a+b*c-d
Ex. 4.7.2 Show the binary tree with arithmetic expression A/B * C * D + E. Give the algorithm for
inorder, preorder, postorder traversals and show the result of these traversals.
Sol. :
Algorithm for inorder, preorder and postorder Traversal - Refer section 4.6.
Inorder Traversal A/B*C*D+E
Preorder Traversal + ** / ABCDE
Postorder Traversal AB/C*D*E+
➢ Applications of Trees
Various applications of trees are -
1. Binary search tree
2. Expression tree
3. Threaded binary tree.
Review Question
1. What are expression trees. Write the procedure for constructing an expression tree.
________________________________________________________
Algorithm:
1. Read the value for the node which is to be created and store it in a node called New.
2. Initially if (root == NULL) then root = New.
3. Again read the next value of node created in New.
4. If (New->value < root->value) then attach New node as a left child of root otherwise attach New node
as a right child of root.
5. Repeat steps 3 and 4 for constructing required binary search tree completely.
In the Fig. 4.9.2 if we want to insert 23, then we will start comparing 23 with the value of the root node i.e. 10.
As 23 is greater than 10, we will move right onto the right subtree. Now we will compare 23
with 20 and move right, compare 23 with 22 and move right. Now compare 23 with 24 but it is less than
24. We will move on left branch of 24. But as there is NULL node as left child of 24, we can attach 23 as
left child of 24.
From the above tree, we want to delete the node having value 6 then we will set left pointer of its parent
node as NULL. That is left pointer of node having value 8 is set to NULL.
ii. Deletion of a node having one child
To explain this kind of deletion, consider a tree as shown in the Fig. 4.9.6.
If we want to delete the node 15, then we will simply copy node 18 at place of 15 and then set the node
free. The inorder successor is always copied at the position of a node being deleted.
In the above tree, if we want to search for value 9. Then we will compare 9 with root node 10. As 9 is
less than 10 we will search on left subbranch. Now compare 9 with 5, but 9 is greater than 5. So we will
move on right subbranch. Now compare 9 with 8 but as 9 is greater than 8 we will move on right
subbranch. Now we read the node value as 9. Thus the desired node can be searched. Let us see the 'C'
implementation of it. The routine is as given below -
Non-recursive search routine
node *search(node *root,int key,node **parent)
{
node *temp;
temp=root;
while(temp!= NULL)
{
if(temp->data==key)
{
printf("\n The %d Element is Present",temp->data);
return temp;
}
*parent=temp; /* Marking the parent node */
if(temp->data>key) /* if current node is greater than key */
temp=temp->left; /* Search for the left subtree */
else
temp=temp->right;
}
return NULL;
}
We can display a tree in inorder fashion. Hence the complete implementation is given below along with
appropriate output.
/*****************************************************************
Program for Implementation of Binary Search Tree and perform insertion deletion, searching, display
of tree.
*****************************************************************/
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
typedef struct bst
{
int data;
struct bst *left, *right;
}node;
void insert(node *,node *);
void inorder(node *);
node *search(node *,int,node **);
void del(node *,int);
void main()
{
int choice;
char ans='N';
int key;
node *New, *root, *tmp, *parent;
node *get_node();
root=NULL;
clrscr();
printf("\n\t Program For Binary Search Tree ");
do
{
printf("\n1.Create\n2.Search\n3.Delete\n4.Display");
printf("\n\n Enter your choice :");
scanf("%d", &choice);
switch(choice)
{
case 1:do
{
New=get_node();
printf("\n Enter The Element ");
scanf("%d", &New->data);
if(root == NULL) /* Tree is not Created */
root=New;
else
insert(root,New);
printf("\n Do u Want To enter More Elements?(y/n)");
ans=getch();
} while (ans=='y');
break;
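case 2:printf("\n Enter The Element Which You Want To Search");
scanf("%d", &key);
parent=NULL; /* stays NULL if the key is found at the root */
tmp=search(root,key,&parent);
if(tmp == NULL)
printf("\n Element %d is not present", key);
else if(parent == NULL)
printf("\n Node %d is the root", tmp->data);
else
printf("\n Parent of node %d is %d",
tmp->data,parent->data);
break;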
case 3:printf("\n Enter The Element U wish to Delete");
scanf("%d", &key);
del(root,key);
break;
case 4:if(root == NULL)
printf("Tree Is Not Created");
else
{
printf("\n The Tree is: ");
inorder(root);
}
break;
}
} while (choice!=5);
}
node *get_node()
{
node *temp;
temp=(node *)malloc(sizeof(node));
temp->left=NULL;
temp->right=NULL;
return temp;
}
/*This function is for creating a binary search tree */
void insert(node *root,node *New)
{
if(New->data<root->data)
{
if(root->left== NULL)
root->left=New;
else
insert(root->left,New);
}
if(New->data>root->data)
{
if(root->right == NULL)
root->right=New;
else
insert(root->right,New);
}
}
/*
This function is for searching the node from binary Search Tree
*/
node *search(node *root,int key,node **parent)
{
node *temp;
temp=root;
while(temp!= NULL)
{
if(temp->data==key)
{
printf("\n The %d Element is Present",temp->data);
return temp;
}
*parent=temp;
if(temp->data>key)
temp=temp->left;
else
temp=temp->right;
}
return NULL;
}
/*
This function is for deleting a node from binary search tree. There exists three possible cases for deletion
of a node
*/
void del(node *root,int key)
{
node *temp, *parent,*temp_succ;
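temp=search(root,key,&parent);
if(temp == NULL) /* key not found, nothing to delete */
{
printf("\n Element %d is not present", key);
return;
}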
/*deleting a node with two children*/
if(temp->left!= NULL&&temp->right!= NULL)
{
parent=temp;
temp_succ=temp->right;
while(temp_succ->left!= NULL)
{
parent=temp_succ;
temp_succ=temp_succ->left;
}
temp->data=temp_succ->data;
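/* the successor has no left child; relink its right subtree (may be NULL) */
if(temp_succ == parent->left)
parent->left = temp_succ->right;
else
parent->right = temp_succ->right;
free(temp_succ);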
printf(" Now Deleted it!");
return;
}
/*deleting a node having only one child*/
/*The node to be deleted has left child*/
if(temp->left!= NULL &&temp->right== NULL)
{
if(parent->left==temp)
parent->left=temp->left;
else
parent->right=temp->left;
free(temp);
printf(" Now Deleted it!");
return;
}
/*The node to be deleted has right child*/
if(temp->left==NULL &&temp->right!= NULL)
{
if(parent->left==temp)
parent->left=temp->right;
else
parent->right=temp->right;
free(temp);
printf(" Now Deleted it!");
return;
}
/*deleting a node which is having no child*/
if(temp->left==NULL &&temp->right== NULL)
{
if(parent->left==temp)
parent->left = NULL;
else
parent->right=NULL;
free(temp);
printf(" Now Deleted it!");
return;
}
}
/*
This function displays the tree in inorder fashion
*/
void inorder(node *temp)
{
if(temp!= NULL)
{
inorder(temp->left);
printf("%d",temp->data);
inorder(temp->right);
}
}
Output
Program For Binary Search Tree
1. Create
2. Search
3. Delete
4. Display
Enter your choice :1
Enter The Element 10
Do u Want To enter More Elements?(y/n)
Enter The Element 8
Do u Want To enter More Elements?(y/n)
Enter The Element 9
Do u Want To enter More Elements? (y/n)
Enter The Element 7
Do u Want To enter More Elements?(y/n)
Enter The Element 15
Do u Want To enter More Elements? (y/n)
Enter The Element 13
Do u Want To enter More Elements?(y/n)
Enter The Element 14
Do u Want To enter More Elements?(y/n)
Enter The Element 12
Do u Want To enter More Elements?(y/n)
Enter The Element 16
Do u Want To enter More Elements? (y/n)
1. Create
2. Search
3. Delete
4. Display
Enter your choice :4
The Tree is : 7 8 9 10 12 13 14 15 16
1. Create
2. Search
3. Delete
4. Display
Enter your choice :2
Enter The Element Which You Want To Search16
The 16 Element is Present
Parent of node 16 is 15
1. Create
2. Search
3. Delete
4. Display
Ex. 4.9.1 Define binary search tree. Draw the binary search tree for the following input. 14, 15, 4, 9, 7,
18, 3, 5, 16, 4, 20, 17, 9, 14, 5
Sol. Binary Search Tree (Refer section 4.9)
Ex. 4.9.2 Define a binary search tree and construct a binary search tree. With elements (22, 28, 20, 25,
22, 15, 18, 10, 14). Give recursive search algorithm to search an element in that tree.
Sol. Binary Search Tree (Refer section 4.9)
Example
Recursive Algorithm for search- Refer section 7.7.
Ex. 4.9.3 What is binary search tree? Draw the binary search tree for the following. input. 14, 5, 6, 2,
18, 20, 16, 18, -1, 21.
Sol. Binary Search Tree: Refer section 4.9.
Example
Ex. 4.9.4: What is binary search tree? Write a recursive search routine for binary search tree.
Sol. Binary Search Tree: Refer section 4.9.
Recursive Search Routine
node *search(node *temp, int key)
{
if (temp == NULL || key == temp->data)
return temp;
else
if (key < temp->data)
return search(temp->left, key);
else
return search(temp->right,key);
}
Ex. 4.9.5 Write the following routines to implement the basic binary search tree operations
(i) Perform search operation in binary search tree
(ii) Find_min and Find_max
Sol.: (i) Search operation - Refer section 4.9.1.
(ii) Find_min and Find_max
Consider following binary search tree
For finding the minimum value from the binary search tree, we need to traverse to the left most node.
Hence the left most node in above Fig. 4.9.11 is with value 7 which is the minimum value. Note that for
the leftmost node the left pointer is NULL.
The routine for finding the minimum value from the binary search tree is,
void Find_min(node *root)
{
node *current=root;
if(current == NULL) /* an empty tree has no minimum */
return;
while(current->left != NULL)
current=current->left;
printf("%d", current->data);
}
For finding the maximum value from the binary search tree, we need to traverse to the right most node.
Hence the right most node in above Fig. 4.9.11 is with value 13 which is the maximum value. Note that
for the rightmost node the right pointer is NULL.
The routine for finding the maximum value from the binary search tree is
void Find_max(node *root)
{
node *current=root;
if(current == NULL) /* an empty tree has no maximum */
return;
while(current->right != NULL)
current=current->right;
printf("%d", current->data);
}
Review Questions
1. Write an iterative search routine for a binary search tree.
2. Describe the binary search tree with an example. Write an iterative function to search for the key value
in binary search tree.
3. How to insert and delete an element into binary search tree and write down the code for the insertion
routine with an example.
The AVL tree is named after its two inventors, G.M. Adelson-Velsky and E.M.
Landis, who published it in their 1962 paper "An algorithm for the organization of
information."
An AVL tree is a self-balancing binary search tree. In an AVL tree, the heights of the two
child subtrees of any node differ by at most one; therefore, it is also said to be height-balanced.
The balance factor of a node is the height of its left subtree minus the height of its
right subtree, and a node with balance factor 1, 0, or -1 is considered balanced. A node with
any other balance factor is considered unbalanced and requires rebalancing the tree. This
can be done by AVL tree rotations.
The disadvantage of a binary search tree is that its height can be as large as N-1.
This means that the time needed to perform insertion and deletion and many other
operations can be O(N) in the worst case.
We want a tree with small height.
A binary tree with N nodes has height at least Θ(log N).
Thus, our goal is to keep the height of a binary search tree O(log N). Such trees are
called balanced binary search trees. Examples are the AVL tree and the red-black tree.
Thus we go for the AVL tree.
An AVL tree is a special type of binary tree that is always "partially" balanced. The criterion that is
used to determine the "level" of "balanced-ness" is the difference between the heights of the
subtrees of a root in the tree. The "height" of a tree is the "number of levels" in the tree. The height
of a tree is defined as follows:
The height of an internal node is the maximum height of its children plus 1.
AVL trees are identical to standard binary search trees except that for every node in an AVL tree,
the height of the left and right subtrees can differ by at most 1 . AVL trees are HB-k trees (height
balanced trees of order k) of order HB-1. The following is the height differential formula:
|Height (Tl)-Height(Tr)|<=k
When storing an AVL tree, a field must be added to each node with one of three values: 1, 0, or
-1. A value of 1 in this field means that the left subtree has a height one more than the right
subtree. A value of -1 denotes the opposite. A value of 0 indicates that the heights of both
subtrees are the same.
HEIGHT OF AVL TREE
An AVL tree is a binary search tree with a balanced condition.
Balance Factor (BF) = Hl - Hr
Hl => Height of the left subtree.
Hr => Height of the right subtree.
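The balance factor definition can be sketched in C as below. This is a minimal sketch under stated assumptions: the type bf_node and the functions tree_height, balance_factor and is_balanced_node are hypothetical names introduced only for illustration, and the height of an empty subtree is taken as -1 so that a single node has height 0.

```c
#include <stdlib.h>

/* Hypothetical node type used only for this illustration. */
typedef struct bf_node {
    int data;
    struct bf_node *left, *right;
} bf_node;

/* Height of an empty subtree is -1; a single node has height 0. */
int tree_height(const bf_node *t)
{
    int hl, hr;
    if (t == NULL)
        return -1;
    hl = tree_height(t->left);
    hr = tree_height(t->right);
    return (hl > hr ? hl : hr) + 1;
}

/* BF = Hl - Hr, the left height minus the right height. */
int balance_factor(const bf_node *t)
{
    if (t == NULL)
        return 0;
    return tree_height(t->left) - tree_height(t->right);
}

/* A node is balanced when its balance factor is -1, 0 or 1. */
int is_balanced_node(const bf_node *t)
{
    int bf = balance_factor(t);
    return bf >= -1 && bf <= 1;
}
```

Recomputing the height on every call is slow for large trees; a practical AVL implementation usually stores the height in each node instead.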
Rotation :
Modification to the tree, i.e., if the AVL tree is imbalanced, proper rotations have to be done.
A rotation is a process of switching children and parents among two or three adjacent nodes to
restore balance to a tree.
-> RL ( Right -- Left rotation) --- Do single Right, then single Left.
-> LR ( Left -- Right rotation) --- Do single Left, then single Right.
1. LL Rotation :
2. RR Rotation :
EXAMPLE:
LET US CONSIDER INSERTING THE NODES 20,10,40,50,90,30,60,70 INTO AN AVL TREE
struct avlnode;
typedef struct avlnode *position;
typedef struct avlnode *avltree;
typedef int elementtype;
struct avlnode
{
elementtype element;
avltree left;
avltree right;
int height;
};
static int height(position P)
{
if(P==NULL)
return -1;
else
return P->height;
}
avltree insert(elementtype X, avltree T)
{
if(T==NULL)
{
/* Create and return a one node tree */
T=malloc(sizeof(struct avlnode));
if(T==NULL)
fatalerror("Out of Space");
else
{
T->element=X;
T->height=0;
T->left=T->right=NULL;
}
}
else if(X<T->element)
{
T->left=insert(X,T->left);
if(height(T->left) - height(T->right)==2)
{
if(X<T->left->element)
T=singlerotatewithleft(T);
else
T=doublerotatewithleft(T);
}
}
else if(X>T->element)
{
T->right=insert(X,T->right);
/* the right subtree grew, so compare the right height against the left */
if(height(T->right) - height(T->left)==2)
{
if(X>T->right->element)
T=singlerotatewithright(T);
else
T=doublerotatewithright(T);
}
}
T->height=max(height(T->left),height(T->right)) + 1;
return T;
}
/* Single rotation with the left child (LL case); returns the new root */
static position singlerotatewithleft(position k2)
{
position k1;
k1=k2->left;
k2->left=k1->right;
k1->right=k2;
k2->height= max(height(k2->left),height(k2->right)) + 1;
k1->height= max(height(k1->left),height(k1->right)) + 1;
return k1; /* New Root */
}
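The notes give only the single rotation with the left child. A self-contained sketch of all four rotations together with insertion might look like the following. The names avl, rot_with_left, rot_with_right, double_with_left, double_with_right and avl_insert are hypothetical and are not the routines from the notes; duplicate keys are simply ignored in this sketch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical AVL node for this sketch; height of an empty tree is -1. */
typedef struct avl {
    int key, height;
    struct avl *left, *right;
} avl;

static int h(const avl *t) { return t ? t->height : -1; }
static int max2(int a, int b) { return a > b ? a : b; }
static void fix(avl *t) { t->height = max2(h(t->left), h(t->right)) + 1; }

/* LL case: the left child becomes the new root of this subtree. */
static avl *rot_with_left(avl *k2)
{
    avl *k1 = k2->left;
    k2->left = k1->right;
    k1->right = k2;
    fix(k2);
    fix(k1);
    return k1;
}

/* RR case: mirror image, the right child becomes the new root. */
static avl *rot_with_right(avl *k1)
{
    avl *k2 = k1->right;
    k1->right = k2->left;
    k2->left = k1;
    fix(k1);
    fix(k2);
    return k2;
}

/* LR case: rotate the left child leftwards, then rotate this node rightwards. */
static avl *double_with_left(avl *k3)
{
    k3->left = rot_with_right(k3->left);
    return rot_with_left(k3);
}

/* RL case: rotate the right child rightwards, then rotate this node leftwards. */
static avl *double_with_right(avl *k1)
{
    k1->right = rot_with_left(k1->right);
    return rot_with_right(k1);
}

/* Insert x and rebalance on the way back up; returns the new subtree root. */
avl *avl_insert(int x, avl *t)
{
    if (t == NULL) {
        t = malloc(sizeof *t);
        if (t == NULL) {
            fprintf(stderr, "Out of space\n");
            exit(EXIT_FAILURE);
        }
        t->key = x;
        t->height = 0;
        t->left = t->right = NULL;
        return t;
    }
    if (x < t->key) {
        t->left = avl_insert(x, t->left);
        if (h(t->left) - h(t->right) == 2)
            t = (x < t->left->key) ? rot_with_left(t) : double_with_left(t);
    } else if (x > t->key) {
        t->right = avl_insert(x, t->right);
        if (h(t->right) - h(t->left) == 2)
            t = (x > t->right->key) ? rot_with_right(t) : double_with_right(t);
    }
    fix(t);
    return t;
}
```

Calling avl_insert repeatedly with 20, 10, 40, 50, 90, 30, 60, 70 (starting from a NULL tree and keeping the returned root each time) exercises both the single and the double rotation cases.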
AVL trees play an important role in most computer-related applications. The need for and use of AVL
trees are increasing day by day. Their efficiency and low complexity add value to their reputation.
Some of the applications are
AVL trees guarantee that the difference in height of any two subtrees rooted at the same
node will be at most one. This guarantees an asymptotic running time of O(log(n)) as
opposed to O(n) in the case of a standard BST.
Height of an AVL tree with n nodes is always very close to the theoretical
minimum.
Since the AVL tree is height balanced, operations like insertion and deletion have low time
complexity.
Since the tree is always height balanced, recursive implementation is possible.
The height of the left and the right sub-trees should differ by at most 1. Rotations are possible.
Ex. 4.13.1: Show the result of inserting 2, 1, 4, 5, 9, 3, 6, 7 into an empty AVL tree.
Sol. :
Ex. 4.13.2 Draw the result of inserting 20, 10 and 24 one by one into the AVL tree given below. Draw
the tree after each insertion.
Sol. :
Step 1: Insertion of 20
Ex. 4.13.3: Show the results of inserting 43, 11, 69, 72 and 30 into an initially empty AVL tree. Show the
results of deleting the nodes 11 and 72 one after the other of constructed tree.
Sol. Insertion operation
Ex. 4.13.4: Construct an AVL tree with the values 3,1,4,5,9,2,8,7,0 into an initially empty tree. Write
code for inserting into AVL tree.
Sol.
Ex. 4.13.5 Define AVL Tree and, starting with an empty AVL tree, insert the following elements in the given
order 35, 45, 65, 55,75,15, 25.
Sol. AVL Tree Refer section 4.13.
Ex. 4.13.6: Write a routine for AVL tree insertion. Insert the following elements in the empty tree and
how do you balance the tree after each element insertion Elements: 2, 5, 4, 6, 7, 9, 8, 3, 1, 10
Sol. :
PRIORITY QUEUE
[Figure: priority queue model with the operations Insertion(h) and Deletion(h)]
Linked List :
A simple linked list implementation of priority queue requires O(1) time to perform the
insertion at the front and O(N) time to delete the minimum element.
1. Structure Property :
The Heap should be a complete binary tree, which is a
completely filled binary tree with the possible exception of the bottom level, which is filled from
left to right.
A complete binary tree of height h has between 2^h and (2^(h+1) - 1) nodes.
Sentinel Value :
The zeroth element is called the sentinel value. It is not a node of the tree. This value is
required because while adding a new node, certain operations are performed in a loop and to
terminate the loop, the sentinel value is used.
Index 0 holds the sentinel value. It stores an unrelated value in order to terminate the loop in case of complex
code.
Structure Property : Always index 1 should be starting position.
Mintree:
Parent should have lesser value than children.
Maxtree:
Parent should have greater value than children.
These two properties are known as heap properties:
Max-heap
Min-heap
Min-heap:
The smallest element is always in the root node. Each node must have a key that is less than or
equal to the key of each of its children.
Examples
Max-Heap:
The largest Element is always in the root node.
Each node must have a key that is greater than or equal to the key of each of its children.
Examples
HEAP OPERATIONS:
There are 2 operations of heap
Insertion
Deletion
Insert:
Adding a new key to the heap
Rules for the insertion:
To insert an element X, into the heap, do the following:
Example Problem :
2. Insert the keys 4, 6, 10, 20, and 8 in this order in an originally empty max-heap
EXAMPLE PROBLEMS :
1. DELETE MIN
2. Delete Min -- 13
elementtype *element;
};
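Insert Routine
void insert(elementtype X, priorityqueue H)
{
int i;
/* Percolate up: the sentinel in element[0] stops the loop at the root */
for(i=++H->size;H->element[i/2]>X;i=i/2)
H->element[i]=H->element[i/2];
H->element[i]=X;
}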
Delete Routine
elementtype deletemin(priorityqueue H)
{
int i,child;
elementtype minelement,lastelement;
if(isempty(H))
{
error("Priority queue is empty");
return H->element[0];
}
minelement=H->element[1];
lastelement=H->element[H->size--];
for(i=1;i*2<=H->size;i=child)
{
/* Find smaller child */
child=i*2;
if(child!=H->size && H->element[child+1]<H->element[child])
child++;
/* Percolate one level */
if(lastelement>H->element[child])
H->element[i]=H->element[child];
else
break;
}
H->element[i]=lastelement;
return minelement;
}
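The insert and delete routines above depend on declarations that are not shown in full in these notes. A self-contained sketch of the same idea, with a fixed-capacity array and a sentinel at index 0, could look like this. The names min_heap, HEAP_CAPACITY, heap_init, heap_insert and heap_delete_min are assumptions made for illustration only.

```c
#include <limits.h>

#define HEAP_CAPACITY 64   /* illustrative fixed capacity */

/* Hypothetical array-based min-heap; keys live in element[1..size]. */
typedef struct {
    int size;                        /* number of keys stored */
    int element[HEAP_CAPACITY + 1];  /* element[0] holds the sentinel */
} min_heap;

/* The sentinel is no larger than any key, so percolate up stops at the root. */
void heap_init(min_heap *h)
{
    h->size = 0;
    h->element[0] = INT_MIN;
}

/* Returns 0 on success, -1 if the heap is full. */
int heap_insert(min_heap *h, int x)
{
    int i;
    if (h->size >= HEAP_CAPACITY)
        return -1;
    /* Percolate up: move larger parents down until x fits. */
    for (i = ++h->size; h->element[i / 2] > x; i /= 2)
        h->element[i] = h->element[i / 2];
    h->element[i] = x;
    return 0;
}

/* Removes and returns the minimum; the caller must ensure size > 0. */
int heap_delete_min(min_heap *h)
{
    int i, child;
    int min = h->element[1];
    int last = h->element[h->size--];
    for (i = 1; i * 2 <= h->size; i = child) {
        child = i * 2;                 /* left child */
        if (child != h->size && h->element[child + 1] < h->element[child])
            child++;                   /* right child is smaller */
        if (last > h->element[child])
            h->element[i] = h->element[child];  /* percolate one level */
        else
            break;
    }
    h->element[i] = last;
    return min;
}
```

After heap_init, inserting a set of keys and calling heap_delete_min repeatedly returns the keys in ascending order.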
1. Decrease Key.
2. Increase Key.
3. Delete.
4. Build Heap.
1. Decrease Key :
The Decrease key(P,∆,H) operation decreases the value of the key at position P, by a
positive amount ∆. This may violate the heap order property, which can be fixed by percolate up.
Ex : decreasekey(2,7,H)
[Figure: min-heap 10, 15, 12, 20, 30 in level order; the key 15 at position 2 becomes 8 and percolates up, giving 8, 10, 12, 20, 30]
2. Increase Key :
The Increase Key(P,∆,H) operation increases the value of the key at position P, by a
positive amount ∆. This may violate heap order property, which can be fixed by percolate
down.
Ex : increase key(2,7,H)
[Figure: min-heap 10, 15, 12, 20, 30 in level order; the key 15 at position 2 becomes 22 and percolates down, giving 10, 20, 12, 22, 30]
3. Delete :
The delete(P,H) operation removes the node at the position P, from the heap
H. This can be done by,
[Figure: heap diagrams for the first step of the deletion]
Step 2 : Deletemin(H)
[Figure: heap diagrams for Deletemin(H)]
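The three operations described above (decrease key, increase key and delete) can be sketched on a 1-based array min-heap as follows. All names here (ops_heap, OPS_CAPACITY, percolate_up, percolate_down, decrease_key, increase_key, heap_delete) are hypothetical. The delete routine assumes the common approach of first moving the key at P up to the root, as if it had been decreased below every other key, and then performing a delete-min; this is an assumption consistent with the "Deletemin(H)" step mentioned in the notes, not a routine taken from them.

```c
#define OPS_CAPACITY 64   /* illustrative fixed capacity */

/* Hypothetical 1-based min-heap; index 0 is unused in this sketch. */
typedef struct {
    int size;
    int element[OPS_CAPACITY + 1];
} ops_heap;

/* Move the key at position p up while it is smaller than its parent. */
static void percolate_up(ops_heap *h, int p)
{
    int x = h->element[p];
    while (p > 1 && h->element[p / 2] > x) {
        h->element[p] = h->element[p / 2];
        p /= 2;
    }
    h->element[p] = x;
}

/* Move the key at position p down while a child is smaller. */
static void percolate_down(ops_heap *h, int p)
{
    int x = h->element[p], child;
    while (p * 2 <= h->size) {
        child = p * 2;
        if (child != h->size && h->element[child + 1] < h->element[child])
            child++;
        if (h->element[child] < x) {
            h->element[p] = h->element[child];
            p = child;
        } else
            break;
    }
    h->element[p] = x;
}

/* delta is assumed positive in both routines. */
void decrease_key(ops_heap *h, int p, int delta)
{
    h->element[p] -= delta;
    percolate_up(h, p);
}

void increase_key(ops_heap *h, int p, int delta)
{
    h->element[p] += delta;
    percolate_down(h, p);
}

/* Remove the key at position p (1 <= p <= size). */
void heap_delete(ops_heap *h, int p)
{
    /* Step 1: shift the ancestors of p down, as if the key at p had been
       decreased below every other key and percolated up to the root. */
    while (p > 1) {
        h->element[p] = h->element[p / 2];
        p /= 2;
    }
    /* Step 2: delete-min, i.e. put the last key at the root and percolate down. */
    h->element[1] = h->element[h->size--];
    if (h->size > 0)
        percolate_down(h, 1);
}
```

On the heap 10, 15, 12, 20, 30, calling decrease_key with position 2 and amount 7 turns 15 into 8, which then moves to the root; calling increase_key with the same arguments on the original heap turns 15 into 22, which then moves below 20.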
APPLICATIONS
The heap data structure has many applications
Heap sort
Selection algorithms
Graph algorithms
Heap sort :
One of the best sorting methods, being in-place and with no quadratic worst-case
scenarios.
Selection algorithms:
Finding the min, max, both the min and max, median, or even the k-th largest
element can be done in linear time using heaps.
Graph algorithms:
By using heaps as internal traversal data structures, run time will be reduced by
a polynomial order. Examples of such problems are Prim's minimal spanning tree
algorithm and Dijkstra's shortest path problem.
ADVANTAGE
The biggest advantage of heaps over trees in some applications is that construction of heaps
can be done in linear time.
It is used in
o Heap sort
o Selection algorithms
o Graph algorithms
DISADVANTAGE
➢ Heap is expensive
➢ Performance :
Allocating heap memory usually involves a long negotiation with the OS.
➢ Maintenance:
➢ Dynamic allocation may fail; extra code to handle such exception is
required.
➢ Safety :
Object may be deleted more than once or not deleted at all .
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
3.8 BINARY HEAPS :
In this section we will learn "What is heap?" and "How to construct heap?"
Definition: Heap is a complete binary tree or an almost complete binary tree in
which every parent node is either greater than or equal to, or less than or equal to, its child nodes.
Heap can be min heap or max heap.
Types of heap
i)Min heap ii) Max heap
A Max heap is a tree in which value of each node is greater than or equal to
the value of its children nodes.
For example
A Min heap is a tree in which value of each node is less than or equal to value
of its children nodes.
For example:
Parent being greater or lesser in heap is called parental property. Thus heap has two
important properties.
Heap:
i)It should be a complete binary tree
ii)It should satisfy parental property
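The parental property can be checked mechanically when the heap is stored in an array in level order. The following is a small sketch, assuming a 0-based array in which the parent of index i is at (i - 1) / 2; the function names is_max_heap and is_min_heap are hypothetical. Storing the tree in level order with no gaps takes care of the complete-binary-tree property, so only the parental property needs to be tested.

```c
#include <stddef.h>

/* Returns 1 if a[0..n-1], read in level order, satisfies the max-heap
   parental property (every parent >= each of its children). */
int is_max_heap(const int *a, size_t n)
{
    size_t i;
    for (i = 1; i < n; i++)
        if (a[(i - 1) / 2] < a[i])
            return 0;
    return 1;
}

/* Returns 1 if every parent <= each of its children (min-heap). */
int is_min_heap(const int *a, size_t n)
{
    size_t i;
    for (i = 1; i < n; i++)
        if (a[(i - 1) / 2] > a[i])
            return 0;
    return 1;
}
```

For example, the array 40, 30, 20, 10, 15 passes is_max_heap, while 10, 15, 12, 20, 30 passes is_min_heap but not is_max_heap.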
Before understanding the construction of heap let us learn/revise few basics
that are required while constructing heap.
• Level of binary tree: The root of the tree is always at level 0. Any node is
always at a level one more than its parent nodes level.
For example:
• Height of the tree: The maximum level is the height of the tree. The height of
the tree is also called depth of the tree.
For example:
• Complete binary tree: The complete binary tree is a binary tree in which all
the levels of the tree are filled completely except the lowest level nodes which are
filled from left if required.
For example:
• Almost complete binary tree: The almost complete binary tree is a tree in which
i) Each node has a left child whenever it has a right child. That means there is always
a left child, but for a left child there may not be a right child.
ii) The leaf in a tree must be present at height h or h-1. That means all the leaves are
on two adjacent levels.
Sol. i) For delete operation we will choose node 40 because this node is of highest priority.
ii) Insert 38: Consider the given heap. After inserting 38, we need to heapify the tree. That means we
have to maintain the heap property i.e. parent node must be greater than the child nodes.
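The insert and delete steps described above can be sketched for an array-based max heap as follows. This is only an illustrative sketch: the names MaxHeap, heap_insert, heap_delete_max and the capacity HEAP_MAX are assumptions made for this example and do not come from the notes.

```c
#define HEAP_MAX 64   /* assumed capacity for this sketch */

/* Array-based max heap: the children of index i are at 2*i+1 and 2*i+2. */
typedef struct {
    int data[HEAP_MAX];
    int size;
} MaxHeap;

/* Place key at the next free leaf, then move it up while it is larger
   than its parent. Returns 0 on success, -1 if the heap is full. */
int heap_insert(MaxHeap *h, int key)
{
    int i;
    if (h->size >= HEAP_MAX)
        return -1;
    i = h->size++;
    while (i > 0 && h->data[(i - 1) / 2] < key) {
        h->data[i] = h->data[(i - 1) / 2];   /* pull the parent down */
        i = (i - 1) / 2;
    }
    h->data[i] = key;
    return 0;
}

/* Remove and return the root (highest priority). The last leaf is moved
   down from the root until the parental property holds again.
   The caller must make sure that h->size > 0. */
int heap_delete_max(MaxHeap *h)
{
    int max = h->data[0];
    int last = h->data[--h->size];
    int i = 0, child;
    while ((child = 2 * i + 1) < h->size) {
        if (child + 1 < h->size && h->data[child + 1] > h->data[child])
            child++;                         /* pick the larger child */
        if (last >= h->data[child])
            break;
        h->data[i] = h->data[child];
        i = child;
    }
    h->data[i] = last;
    return max;
}
```

Starting from a heap with size set to 0, repeated calls to heap_insert build the heap one key at a time, and repeated calls to heap_delete_max return the keys in decreasing order.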
Ex. 4.15.4: Show the result of inserting 10, 12, 1, 14, 6, 5, 8, 15, 3, 9, 7, 4, 11, 13 and 2, one at a time,
into an initially empty binary heap. After creating such heap delete the element 8 from heap, how do
you repair the heap? Then insert the element in the heap and show the final result (insertion
should be at other than leaf node).
Sol. : We will create max heap for the set
10 12 1 14 6 5 8 15 3 9 7 4 11 13 2
Review Questions
1. Explain the binary heap operations with examples.
2. Explain the insert and delete operations of heap with examples.
UNIT IV
MULTIWAY SEARCH TREES AND GRAPHS
B-Tree – B+ Tree – Graph Definition – Representation of Graphs – Types of Graph - Breadth-first traversal –
Depth-first traversal –– Bi-connectivity – Euler circuits – Topological Sort – Dijkstra's algorithm – Minimum
Spanning Tree – Prim's algorithm – Kruskal's algorithm
++++++++++++++++++++++++++++++++++++++++++++++++++++++
4.1 B-Tree :
B-tree is a specialized multiway tree used to store records on a disk. Each node has a number of
subtrees, so the height of the tree is relatively small and only a small number of nodes must
be read from disk to retrieve an item. The goal of B-trees is to get fast access to the data.
Multiway search tree
B-tree is a multiway search tree of order m, that is, an ordered tree where each node has at the most m
children. If there are n children in a node then (n - 1) is the number of keys in the node.
For example:
Following is a tree of order 4.
Step 4 :
Now insert 13. But if we insert 13 then the leaf node will have 5 keys which is not allowed. Hence the
node 8, 11, 13, 14, 17 is split and the median key 13 is moved up.
Step 5:
Now insert 6, 23, 12, 20 without any split.
Step 6: The 26 is inserted to the rightmost leaf node. Hence the node 14, 17, 20, 23, 26 is split and 20
will be moved up.
Step 7: Insertion of node 4 causes left most node to split. The 1, 3, 4, 5, 6 causes key 4 to move up.
Then insert 16, 18, 24, 25.
Step 8: Finally insert 19. Then 4, 7, 13, 19, 20 needs to be split. The median 13 will be moved up to
form a root node.
The tree then will be -
Now we will delete 20, the 20 is not in a leaf node so we will find its successor which is 23. Hence 23
will be moved up to replace 20.
Next we will delete 18. Deletion of 18 from the corresponding node causes the node with only one
key, which is not desired (as per rule 4) in B-tree of order 5. The sibling node to immediate right has
an extra key. In such a case we can borrow a key from parent and move spare key of sibling to up. See
the following figure.
Now delete 5. But deletion of 5 is not easy. The first thing is 5 is from leaf node. Secondly this leaf
node has no extra keys nor siblings to immediate left or right. In such a situation we can combine this
node with one of the siblings. That means remove 5 and combine 6 with the node 1, 3. To make the
tree balanced we have to move parent's key down. Hence we will move 4 down as 4 is between 1, 3
and 6. The tree will be
But again the internal node of 7 contains only one key, which is not allowed in B-tree (as per rule 3). We
then will try to borrow a key from the sibling. But the sibling 17, 24 has no spare key. Hence what we can do
is combine 7 with 13 and 17, 24. Hence the B-tree will be
Ex. 5.1.2: Construct a B-tree with order m = 3 for the key values 2,3,7,9,5,6,4,8,1 and delete the
values 4 and 6. Show the tree in performing all operations.
Sol. The order m = 3 means at the most 2 keys are allowed. The insertion operation is as follows:
Step 6: Insert 4. This will make a sequence 4,5, 6. The 5 will go up. Then the sequence will become
3,5, 7. Again 5 will go up.
Step 8: Delete 4 and 6. As these keys are in leaf nodes without any adjacent key, their deletion is very
simple. Just make these nodes NULL. The remaining adjustment will occur. The resultant B-tree
will be,
Searching
The search operation on B-tree is similar to a search on binary search tree. Instead of choosing
between a left and right child as in binary tree, B-tree makes an m-way choice. Consider a B-tree as
given below
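Independently of the figure referred to above, the m-way choice made at each node can be sketched in C as follows. The node layout (BTreeNode, nkeys, keys, child), the order ORDER and the function name btree_search are assumptions made for this illustration only.

```c
#include <stddef.h>

#define ORDER 5   /* assumed order m: at most m children and m - 1 keys */

typedef struct BTreeNode {
    int nkeys;                        /* number of keys in use */
    int keys[ORDER - 1];              /* keys in ascending order */
    struct BTreeNode *child[ORDER];   /* child[i] holds keys smaller than keys[i];
                                         all entries are NULL in a leaf */
} BTreeNode;

/* Returns the node that contains key and stores its index in *pos,
   or returns NULL if the key is not in the tree. */
BTreeNode *btree_search(BTreeNode *node, int key, int *pos)
{
    while (node != NULL) {
        int i = 0;
        /* m-way choice: skip every key smaller than the search key */
        while (i < node->nkeys && key > node->keys[i])
            i++;
        if (i < node->nkeys && key == node->keys[i]) {
            *pos = i;
            return node;
        }
        node = node->child[i];        /* descend into the i-th subtree */
    }
    return NULL;
}
```

A binary search tree makes the same decision with a single key per node; here the inner loop simply scans up to m - 1 keys before descending.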
B+ Tree
• In B-trees the keys can be visited in sorted order only by an inorder traversal, which is time
consuming. We want a variant of the B-tree which will allow us to access data sequentially, instead of
by inorder traversing.
• Definition : In B+ trees all the keys also appear in the leaf nodes. The leaves in
B+tree form a linked list which is useful in scanning the nodes sequentially.
• The insertion and deletion operations are similar to B-trees.
• Example : Consider following B+tree.
• Any key of the entire tree can be accessed from the leaf nodes alone. There is no need to traverse the
tree in inorder fashion.
• Thus B+tree gives faster sequential access to the keys.
Ex. 5.2.1 Construct a B+tree for F, S, Q, K, C, L, H, T, V, W, M, R.
Sol. : The method for constructing B+tree is similar to the building of B tree but the only difference
here is that the keys of the parent nodes also appear in the leaf nodes. We will build B+tree for order 5.
The order 5 means at the most 4 keys are allowed.
++++++++++++++++++++++++++++++++++++++++++++
4.3 Graph Definition :
Graph Definition
A graph is a collection of two sets V and E. Where V is a finite non empty set of vertices and
E is a finite non empty set of edges.
Vertices are nothing but the nodes in the graph.
Two adjacent vertices are joined by edges.
• Any graph is denoted as G = {V, E}
For example
Applications of Graphs
The graph theory is used in the computer science very widely. There are many interesting applications of
graph. We will list out few applications -
1. In computer networking such as Local Area Network (LAN), Wide Area Network (WAN),
internetworking.
2. In telephone cabling graph theory is effectively used.
3. In job scheduling algorithms.
+++++++++++++++++++++++++++++++++++++++++
Basic Terminologies
Complete graph: If an undirected graph of n vertices consists of n(n-1)/2 number of edges then it is
called as complete graph.
Connected graph: An undirected graph is said to be connected if for every pair of distinct vertices Vi
and Vj in V(G) there is a path from Vi to Vj in G.
Weighted graph: A weighted graph is a graph which has weights associated with its edges.
Path: A path is denoted using sequence of vertices and there exists an edge from one vertex to next
vertex.
Cycle : A closed walk through the graph in which the starting and ending vertex are the same and no
other vertex is repeated is called a cycle.
Self Loop
Self loop is an edge that connects the same vertex to itself.
++++++++++++++++++++++++++++++++++++++++++++++++
4.4 Representation of graph:
Representation of Graph
There are various representations of graphs. The most commonly used representations are
1. Adjacency Matrix Representation
2. Adjacency List Representation
Adjacency Matrix
Consider a graph G of n vertices and the matrix M. If there is an edge present between vertices Vi and
Vj then M[i][j] = 1 else M[i][j] = 0. Note that for an undirected graph if M[i][j] = 1 then M[j][i] is
also 1. Here are some graphs shown by adjacency matrix.
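The rule M[i][j] = 1 for an edge between Vi and Vj can be sketched in C as follows. The names MAXV, init_graph, add_undirected_edge, add_directed_edge and is_symmetric are assumptions made for this illustration.

```c
#define MAXV 10   /* assumed maximum number of vertices */

/* Clear the n x n part of the matrix: no edges yet. */
void init_graph(int M[MAXV][MAXV], int n)
{
    int i, j;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            M[i][j] = 0;
}

/* Undirected edge between Vi and Vj: both M[i][j] and M[j][i] are set. */
void add_undirected_edge(int M[MAXV][MAXV], int i, int j)
{
    M[i][j] = 1;
    M[j][i] = 1;
}

/* Directed edge from Vi to Vj: only M[i][j] is set. */
void add_directed_edge(int M[MAXV][MAXV], int i, int j)
{
    M[i][j] = 1;
}

/* Returns 1 if the n x n matrix is symmetric, which always holds
   for the matrix of an undirected graph. */
int is_symmetric(int M[MAXV][MAXV], int n)
{
    int i, j;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            if (M[i][j] != M[j][i])
                return 0;
    return 1;
}
```

The matrix always needs n x n entries regardless of the number of edges, which is why the adjacency list is often preferred for graphs with few edges.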
Fig. 5.5.4 represents the adjacency list for the above graph. First let us see the 'C' structure.
typedef struct node1
{
char vertex;
struct node1 *next;
}node;
node *head[5];
Explanation: This is the graph which can be represented with the array and linked list data structures.
Array is used to store the head nodes. The node structure will be the same throughout.
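A minimal sketch of how such lists might be built is given below. It repeats the node structure so that it compiles on its own; the helper names add_adjacent, list_length and free_list are assumptions, and the sketch assumes that head[i] points directly to the first neighbour of vertex i.

```c
#include <stdlib.h>

typedef struct node1 {
    char vertex;
    struct node1 *next;
} node;

/* Append vertex v to the list whose head pointer is *head.
   Returns 0 on success, -1 if memory could not be allocated. */
int add_adjacent(node **head, char v)
{
    node *n = malloc(sizeof *n);
    node *p;
    if (n == NULL)
        return -1;
    n->vertex = v;
    n->next = NULL;
    if (*head == NULL) {          /* first neighbour of this vertex */
        *head = n;
        return 0;
    }
    for (p = *head; p->next != NULL; p = p->next)
        ;                         /* walk to the last node */
    p->next = n;
    return 0;
}

/* Number of neighbours stored in one list. */
int list_length(const node *head)
{
    int count = 0;
    while (head != NULL) {
        count++;
        head = head->next;
    }
    return count;
}

/* Release all nodes of one list. */
void free_list(node *head)
{
    while (head != NULL) {
        node *next = head->next;
        free(head);
        head = next;
    }
}
```

For an undirected edge A-B, B is appended to the list of A and A to the list of B; the length of a list is then the degree of that vertex.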
Ex. 5.5.1 What do you mean adjacency matrix and adjacency list ? Give the adjacency matrix
and adjacency list of the following graph:
Ex. 5.5.2 Represent following graph using adjacency matrix and adjacency list.
Ex. 5.5.3 Draw the adjacency list of the graph given in Fig. 5.5.6.
Ex. 5.5.4 Write an algorithm to find indegree and outdegree of a vertex in a given graph.
Sol
1. Initialize indegree and outdegree count for each node to zero.
2. Visit each node of a graph.
3. Count the number of incoming edges of each node and then increment indegree count by one for
each incoming edge.
4. Count the number of outgoing edges of each node and then increment outdegree count by one for
each outgoing edge.
Ex. 5.5.5 Write a function to print indegree, outdegree and total degree of given vertex.
Sol. :
/* The adjacency matrix a[][] and the number of vertices n are assumed to be declared globally */
int indegree(int v,int n);
int outdegree(int v,int n);
void print_degree()
{
int v,in,out,total;
for(v=0;v<n;v++)
{
in=indegree(v,n);
out=outdegree(v,n);
total=in+out;
printf("\n The indegree for vertex %d is %d",v,in);
printf("\n The outdegree for vertex %d is %d",v,out);
printf("\n The total degree for vertex %d is %d",v,total);
}
}
/* Counts the 1s in column v: edges coming into v */
int indegree(int v,int n)
{
int v1,count=0;
for(v1=0;v1<n;v1++)
if(a[v1][v]==1)
count++;
return count;
}
/* Counts the 1s in row v: edges going out of v */
int outdegree(int v,int n)
{
int v2,count=0;
for(v2=0;v2<n;v2++)
if(a[v][v2]==1)
count++;
return count;
}
Ex. 5.5.6 Explain with example inverse adjacency list representation of graph. Consider
following graph.
+++++++++++++++++++++++++++++++++++++++++++++
4.5 Types of Graph :
Basically graphs are of two types -
1. Directed graphs
2. Undirected graphs.
In the directed graph the directions are shown on the edges. As shown in the Fig. 5.6.1, the edges
between the vertices are ordered. In this type of graph, the edge E1 is in between the vertices V1 and
V2. The V1 is called head and the V2 is called the tail. Similarly for V1 head the tail is V3 and so on.
We can say E1 is the ordered pair (V1, V2) and not (V2, V1).
Similarly in an undirected graph, the edges are not ordered. Refer Fig. 5.6.2 for clear understanding of
undirected graph. In this type of graph the edge E1 is set of (V1, V2) or (V2, V1).
+++++++++++++++++++++++++++++++++++++++++++++++
Sol. :
DFS Refer section 5.8.2.
BFS Refer section 5.8.1.
BFS for given graph
Now delete each vertex from Queue and print it as BFS sequence:
V1 V2 V3 V4 V5 V6 V7 V8
DFS for given graph
iii) DFS :
1 2 5 4 6 3
iv) BFS :
1 2 3 5 6 4
Ex. 5.8.3 From the Fig. 5.8.3, in what order are the vertices visited using DFS and BFS starting
from vertex A? Where a choice exists, use alphabetical order.
Review Questions
1. Distinguish between breadth first search and depth first search with example.
2. Explain depth first and breadth first traversal.
++++++++++++++++++++++++++++++++++++++++++++
4.7 Bi-connectivity :
Bi-Connectivity
Biconnected graphs are the graphs which cannot be broken into two disconnected pieces (graphs) by
removing a single edge. For example :
In the given Fig. 5.10.1 (a) even if we remove any single edge the graph does not become
disconnected.
For example even if we remove an edge E1 the graph does not become disconnected. We do not get
two disconnected components of graph. Same is the case with any other edge in the given graph.
But the following graph does not possess the property of Biconnectivity.
In above graph if we remove an edge E-F then we will get two distinct graphs.
Properties of Biconnected Graph:
1. There are two disjoint paths between any two vertices.
2. There exists a simple cycle between any two vertices.
3. There should not be any cut vertex (Cut vertex is a vertex which if we remove then the graph
becomes disconnected.)
Strongly Connected Components
Strongly Connectivity Definition :
A directed graph is strongly connected if there is a directed path from any vertex to every other vertex.
The maximal strongly connected subgraphs of a directed graph are called its strongly connected
components. For example:
In above graph the strongly connected components are represented by dotted marking.
Review Questions
1. Write short notes on biconnectivity.
2. Explain in detail about strongly connected components and illustrate with an example.
Cut Vertex
• Definition: A vertex V in an undirected graph G is called a cut vertex iff removing it
disconnects the graph.
• The cut vertex is also called as articulation point.
• The following example represents the concept of cut vertex.
Euler Circuits
In graph theory there is a famous problem known as the Königsberg bridge problem. In this problem the
task was to walk through the city crossing each of its seven bridges exactly once. The Swiss
mathematician Leonhard Euler solved this problem in 1736. From this problem, the concept of Euler
circuit is developed. Let us define the terminologies Euler path and Euler circuit.
Euler path: A path in a graph G is called Euler path if it includes every edge exactly once and every
vertex gets visited.
Euler circuit: An Euler circuit on graph G is an Euler path that starts and ends at the same vertex. It
visits each vertex of graph G and uses every edge of G exactly once.
For example
The Euler circuit is A - B – E – A – D – B – C – E – D - C - A.
Ex. 5.12.1 Find an Euler path or an Euler circuit using DFS for the following graph.
Euler circuit is a circuit that uses every edge of graph exactly once. It starts and ends at the same
vertex.
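Whether an Euler circuit exists at all can be tested before searching for one. By Euler's theorem, a connected undirected graph has an Euler circuit exactly when every vertex has even degree. The sketch below checks this condition on a 0/1 adjacency matrix without self loops; the names MAXV, visit and has_euler_circuit are assumptions made for this illustration, and the function only tests for existence, it does not construct the circuit.

```c
#define MAXV 10   /* assumed maximum number of vertices */

/* Depth-first search that marks every vertex reachable from v. */
static void visit(int G[MAXV][MAXV], int n, int v, int seen[])
{
    int w;
    seen[v] = 1;
    for (w = 0; w < n; w++)
        if (G[v][w] && !seen[w])
            visit(G, n, w, seen);
}

/* Returns 1 if the undirected graph has an Euler circuit, else 0:
   every degree must be even and all vertices that have edges must
   lie in one connected component. */
int has_euler_circuit(int G[MAXV][MAXV], int n)
{
    int seen[MAXV] = {0};
    int degree[MAXV];
    int v, w, start = -1;

    for (v = 0; v < n; v++) {
        degree[v] = 0;
        for (w = 0; w < n; w++)
            degree[v] += G[v][w];
        if (degree[v] % 2 != 0)
            return 0;                 /* odd degree: no Euler circuit */
        if (degree[v] > 0 && start == -1)
            start = v;
    }
    if (start == -1)
        return 1;                     /* no edges at all: trivially true */

    visit(G, n, start, seen);
    for (v = 0; v < n; v++)
        if (degree[v] > 0 && !seen[v])
            return 0;                 /* edges in a second component */
    return 1;
}
```

For the five-vertex example whose circuit is listed above, every vertex has degree 4, so the test succeeds.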
Review Questions
1. Give a short note on euler circuits.
2. Explain euler circuit with an example.
++++++++++++++++++++++++++++++++++++++++++++++
4.10 Topological Sort :
Definition
Topological sorting for a directed acyclic graph (DAG) is a linear ordering of vertices such that for every
directed edge uv, vertex u comes before v in the ordering.
Algorithm:
Following are the steps to be followed in this algorithm -
1. From a given graph find a vertex with no incoming edges. Delete it along with all the edges
outgoing from it. If there are more than one such vertices then break the tie randomly.
2. Note the vertices that are deleted.
3. All these recorded vertices give topologically sorted list.
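The three steps above can be sketched in C on an adjacency matrix as follows. The names MAXV and topological_sort are assumptions made for this illustration, and ties are broken by taking the lowest-numbered vertex instead of a random one. Instead of physically deleting a vertex, the sketch marks it as removed and decreases the indegree of its successors.

```c
#define MAXV 10   /* assumed maximum number of vertices */

/* G[u][v] = 1 means a directed edge u -> v.
   Writes a topological order of the n vertices into order[] and returns
   the number of vertices placed. A result smaller than n means that the
   graph contains a cycle, so no topological order exists. */
int topological_sort(int G[MAXV][MAXV], int n, int order[])
{
    int indeg[MAXV] = {0};
    int removed[MAXV] = {0};
    int u, v, count = 0;

    /* indegree of v = number of incoming edges */
    for (u = 0; u < n; u++)
        for (v = 0; v < n; v++)
            if (G[u][v])
                indeg[v]++;

    while (count < n) {
        /* step 1: find a vertex with no incoming edges */
        for (u = 0; u < n; u++)
            if (!removed[u] && indeg[u] == 0)
                break;
        if (u == n)
            break;                    /* none left: the graph has a cycle */

        /* delete it together with its outgoing edges */
        removed[u] = 1;
        for (v = 0; v < n; v++)
            if (G[u][v])
                indeg[v]--;

        /* step 2: note the deleted vertex */
        order[count++] = u;
    }
    return count;                     /* step 3: order[] is the sorted list */
}
```

With edges 0 -> 1, 0 -> 2, 1 -> 3 and 2 -> 3 the function records the order 0, 1, 2, 3.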
Let us understand this algorithm with some examples -
Ex. 5.9.1 Sort the digraph for topological sort.
Review Questions
1. Explain the topological sorting algorithm.
2. State and explain topological sort with suitable example.
Sol. Now we will consider each vertex as a source and will find the shortest distance from this vertex
to every other remaining vertex. Let us start with vertex A.
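Finding the shortest distance from one source vertex to every other vertex can be sketched with Dijkstra's algorithm on a weighted adjacency matrix. This is an illustrative sketch only: the names MAXV, INF and dijkstra are assumptions, an entry of 0 in the matrix is taken to mean "no edge", and all weights are assumed to be positive.

```c
#define MAXV 10      /* assumed maximum number of vertices */
#define INF 32767    /* stands for "no path found" */

/* G[u][v] > 0 is the weight of edge u-v, 0 means no edge.
   On return dist[v] is the shortest distance from src to v,
   or INF if v cannot be reached. */
void dijkstra(int G[MAXV][MAXV], int n, int src, int dist[])
{
    int done[MAXV] = {0};
    int i, k, u;

    for (i = 0; i < n; i++)
        dist[i] = INF;
    dist[src] = 0;

    for (k = 0; k < n; k++) {
        /* pick the unfinished vertex with the smallest distance */
        u = -1;
        for (i = 0; i < n; i++)
            if (!done[i] && (u == -1 || dist[i] < dist[u]))
                u = i;
        if (dist[u] == INF)
            break;                    /* the rest is unreachable */
        done[u] = 1;

        /* try to shorten the distance of every neighbour of u */
        for (i = 0; i < n; i++)
            if (G[u][i] > 0 && !done[i] && dist[u] + G[u][i] < dist[i])
                dist[i] = dist[u] + G[u][i];
    }
}
```

Calling the function once per vertex, with that vertex as src, gives the distances for every source in turn.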
Ex. 5.14.4 Apply an appropriate algorithm to find the shortest path from 'A' to every other node
of the given graph.
Review Question
1. Explain the various applications of graphs.
+++++++++++++++++++++++++++++++++++++++++++++++++++++
Now, we will consider all the vertices first. Then we will select an edge with minimum weight. The
algorithm proceeds by selecting adjacent edges with minimum weight. Care should be taken not to
form a circuit.
C program
/****************************************************************
This program is to implement Prim's Algorithm using Greedy Method
****************************************************************/
#include<stdio.h>
#include<conio.h>
#define SIZE 20
#define INFINITY 32767
/*This function finds the minimal spanning tree by Prim's Algorithm */
void Prim(int G[][SIZE], int nodes)
{
int tree[SIZE], i, j, k;
int min_dist, v1, v2,total=0;
// Initialize the selected vertices list
for (i=0; i<nodes; i++)
tree[i] = 0;
printf("\n\n The Minimal Spanning Tree Is :\n");
tree[0] = 1;
for (k=1; k<=nodes-1; k++)
{
min_dist = INFINITY;
//initially assign minimum dist as infinity
for (i=0; i<=nodes-1; i++)
{
for (j=0; j<=nodes-1; j++)
{
if (G[i][j] && ((tree[i] && !tree[j]) || (!tree[i] && tree[j])))
{
if (G[i][j] <min_dist)
{
min_dist=G[i][j];
v1 = i;
v2 = j;
}
}
}
}
printf("\n Edge (%d %d ) and weight = %d",v1,v2,min_dist);
tree[v1] = tree [v2] = 1;
total = total+min_dist;
}
printf("\n\n\t Total Path Length Is = %d",total);
}
void main()
{
int G[SIZE][SIZE], nodes;
int v1, v2, length, i, j, n;
clrscr();
printf("\n\t Prim'S Algorithm\n");
printf("\n Enter Number of Nodes in The Graph ");
scanf("%d",&nodes);
printf("\n Enter Number of Edges in The Graph ");
scanf("%d",&n);
for (i=0; i<nodes; i++) // Initialize the graph
for (j=0; j<nodes; j++)
G[i][j] = 0;
//entering weighted graph
printf("\n Enter edges and weights \n");
for (i=0; i<n; i++)
{
printf("\n Enter Edge by V1 and V2 :");
printf("[Read the graph from starting node 0]");
scanf("%d %d", &v1,&v2);
printf("\n Enter corresponding weight :");
scanf("%d", &length);
G[v1][v2] = G[v2][v1] = length;
}
getch();
printf("\n\t");
clrscr();
Prim(G,nodes);
getch();
}
Output
Prim'S Algorithm
Enter Number of Nodes in The Graph 5
Enter Number of Edges in The Graph 7
Enter edges and weights
Enter Edge by V1 and V2 : [Read the graph from starting node 0] 0 1
Enter corresponding weight :10
Enter Edge by V1 and V2 : [Read the graph from starting node 0] 1 2
Enter corresponding weight :1
Enter Edge by V1 and V2 : [Read the graph from starting node 0] 2 3
Enter corresponding weight :2
Enter Edge by V1 and V2 : [Read the graph from starting node 0] 3 4
Enter corresponding weight :3
Enter Edge by V1 and V2 : [Read the graph from starting node 0] 4 0
Enter corresponding weight :5
Enter Edge by V1 and V2 : [Read the graph from starting node 0] 1 3
Enter corresponding weight :6
Enter Edge by V1 and V2 : [Read the graph from starting node 0] 4 2
Enter corresponding weight :7
The Minimal Spanning Tree Is :
Edge(0 4) and weight = 5
Edge(3 4) and weight = 3
Edge(2 3) and weight = 2
Edge(1 2) and weight = 1
Ex. 5.15.3 Give the Pseudo code for Prim's algorithm and apply the same to find the minimum
spanning tree of the graph shown below:
4.14 Kruskal's Algorithm :
Kruskal's algorithm is another algorithm for obtaining a minimum spanning tree. This algorithm was
discovered by Joseph Kruskal when he was a second year graduate student. In this algorithm the
minimum cost edge is always selected, but the selected edge need not be adjacent to the edges
selected before it.
Let us understand this algorithm with the help of some example.
Example :
Consider the graph given below:
First we will select all the vertices. Then an edge with minimum weight is selected from the heap, even
though it is not adjacent to the previously selected edge. Care should be taken not to form a circuit.
Ex. 5.15.4 Apply Kruskal's algorithm to find a minimum spanning tree of the following graph.
C program
/*****************************************************************
Implementation of Kruskal's Algorithm
*****************************************************************/
#include<stdio.h>
#define INFINITY 999
typedef struct Graph
{
int v1;
int v2;
int cost;
}GR;
GR G[20];
int tot_edges,tot_nodes;
void create();
void spanning_tree();
int Minimum(int);
void main()
{
printf("\n\t Graph Creation by adjacency matrix ");
create();
spanning_tree();
}
void create()
{
int k;
printf("\n Enter Total number of nodes: ");
scanf("%d", &tot_nodes);
printf("\n Enter Total number of edges: ");
scanf("%d", &tot_edges);
for(k=0;k<tot_edges;k++)
{
printf("\n Enter Edge in (V1 V2)form ");
scanf("%d%d",&G[k].v1,&G[k].v2);
printf("\n Enter Corresponding Cost ");
scanf("%d", &G[k].cost);
}
}
void spanning_tree()
{
int count,k,v1, v2,i,j,tree [10][10],pos,parent[10];
int sum;
int Find(int v2,int parent[]);
void Union(int i,int j,int parent[]);
count=0;
k=0;
sum=0;
for(i=0;i<=tot_nodes;i++)//vertices may be numbered from 0 or from 1
parent[i]=i;
while(count!=tot_nodes-1)
{
pos=Minimum(tot_edges);//finding the minimum cost edge
if(pos==-1)//Perhaps no node in the graph
break;
v1=G[pos].v1;
v2=G[pos].v2;
i=Find(v1,parent);
j=Find(v2,parent);
if(i!=j)
{
tree[k][0]=v1;//storing the minimum edge in array tree[]
tree[k][1]=v2;
k++;
count++;
sum+=G[pos].cost;//accumulating the total cost of MST
Union(i,j,parent);
}
G[pos].cost=INFINITY;
}
if(count==tot_nodes-1)
{
printf("\n Spanning tree is...");
printf("\n-----------------\n");
for(i=0;i<tot_nodes-1;i++)
{
printf("[%d", tree[i][0]);
printf("-");
printf("%d",tree[i][1]);
printf("]");
}
printf("\n----------------");
printf("\nCost of Spanning Tree is=%d",sum);
}
else
{
printf("There is no Spanning Tree");
}
}
int Minimum(int n)
{
int i,small,pos;
small=INFINITY;
pos=-1;
for(i=0;i<n;i++)
{
if(G[i].cost<small)
{
small=G[i].cost;
pos=i;
}
}
return pos;
}
int Find(int v2,int parent[])
{
while(parent[v2]!=v2)
{
v2=parent[v2];
}
return v2;
}
void Union(int i,int j,int parent[])
{
if(i<j)
parent[j]=i;
else
parent[i]=j;
}
Output
Graph Creation by adjacency matrix
Enter Total number of nodes: 4
Enter Total number of edges: 5
Enter Edge in (V1 V2)form 1 2
Enter Corresponding Cost 2
Enter Edge in (V1 V2)form 1 4
Enter Corresponding Cost 1
Enter Edge in (V1 V2)form 1 3
Enter Corresponding Cost 3
Enter Edge in (V1 V2)form 2 3
Enter Corresponding Cost 3
Enter Edge in (V1 V2)form 4 3
Enter Corresponding Cost 5
Spanning tree is...
-----------------------
[1-4][1-2][1-3]
-----------------------
Cost of Spanning Tree is = 6
Example of a graph
Step 1: Firstly, we select an arbitrary vertex that acts as the starting vertex of the Minimum Spanning
Tree. Here we have selected vertex 0 as the starting vertex.
0 is selected as starting vertex
Step 2: All the edges connecting the incomplete MST and other vertices are the edges {0, 1} and {0, 7}.
Between these two the edge with minimum weight is {0, 1}. So include the edge and vertex 1 in the MST.
Structure of the alternate MST if we had selected edge {1, 2} in the MST
How to implement Prim’s Algorithm?
Follow the given steps to utilize the Prim’s Algorithm mentioned above for finding MST of a graph:
• Create a set mstSet that keeps track of vertices already included in MST.
• Assign a key value to all vertices in the input graph. Initialize all key values as INFINITE. Assign the
key value as 0 for the first vertex so that it is picked first.
• While mstSet doesn’t include all vertices
o Pick a vertex u that is not there in mstSet and has a minimum key value.
o Include u in the mstSet.
o Update the key value of all adjacent vertices of u. To update the key values, iterate through all
adjacent vertices.
o For every adjacent vertex v, if the weight of edge u-v is less than the previous key value
of v, update the key value as the weight of u-v.
The idea of using key values is to pick the minimum weight edge from the cut. The key values are used
only for vertices that are not yet included in MST, the key value for these vertices indicates the minimum weight
edges connecting them to the set of vertices included in MST.
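The key-value steps listed above can be sketched in C as follows. The sketch is an illustration under stated assumptions, not the listing of the notes: the names NV, prim_mst, key, in_mst and parent are chosen for this example, an entry of 0 in the matrix means "no edge", and the graph is assumed to be connected.

```c
#include <limits.h>

#define NV 5   /* assumed number of vertices in the example graph */

/* Fills parent[] so that (parent[v], v) for v = 1 .. NV-1 are the edges
   of the MST, and returns the total weight of the tree. */
int prim_mst(int graph[NV][NV], int parent[NV])
{
    int key[NV];      /* cheapest known edge into each vertex */
    int in_mst[NV];   /* plays the role of mstSet */
    int i, count, u, v, total = 0;

    for (i = 0; i < NV; i++) {
        key[i] = INT_MAX;             /* INFINITE */
        in_mst[i] = 0;
    }
    key[0] = 0;                       /* vertex 0 is picked first */
    parent[0] = -1;                   /* the root has no parent */

    for (count = 0; count < NV; count++) {
        /* pick the vertex outside the MST with the minimum key value */
        u = -1;
        for (v = 0; v < NV; v++)
            if (!in_mst[v] && (u == -1 || key[v] < key[u]))
                u = v;
        in_mst[u] = 1;
        total += key[u];

        /* update the key values of the adjacent vertices of u */
        for (v = 0; v < NV; v++)
            if (graph[u][v] && !in_mst[v] && graph[u][v] < key[v]) {
                key[v] = graph[u][v];
                parent[v] = u;
            }
    }
    return total;
}
```

Because the minimum key is found by a linear scan, this version takes O(V^2) time, which is reasonable for an adjacency matrix.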
Below is the implementation of the approach:
// A C++ program for Prim's Minimum
// Spanning Tree (MST) algorithm. The program is
// for adjacency matrix representation of the graph
#include <bits/stdc++.h>
using namespace std;
return min_index;
}
// Driver's code
int main()
{
int graph[V][V] = { { 0, 2, 0, 6, 0 },
{ 2, 0, 3, 8, 5 },
{ 0, 3, 0, 0, 7 },
{ 6, 8, 0, 0, 9 },
{ 0, 5, 7, 9, 0 } };
return 0;
}
Output
Edge Weight
0-1 2
The graph contains 9 vertices and 14 edges. So, the minimum spanning tree formed will be having (9 –
1) = 8 edges.
After sorting:
Weight  Source  Destination
1       7       6
2       8       2
2       6       5
4       0       1
4       2       5
6       8       6
7       2       3
7       7       8
8       0       7
8       1       2
9       3       4
10      5       4
11      1       7
14      3       5
Now pick all edges one by one from the sorted list of edges
Step 1: Pick edge 7-6. No cycle is formed, include it.
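#include <bits/stdc++.h>
using namespace std;
// Disjoint set union (DSU) with path compression and union by rank
class DSU {
int* parent;
int* rank;
public:
DSU(int n)
{
parent = new int[n];
rank = new int[n];
for (int i = 0; i < n; i++) {
parent[i] = -1; // -1 marks the representative of a set
rank[i] = 1;
}
}
~DSU()
{
delete[] parent;
delete[] rank;
}
// Find function
int find(int i)
{
if (parent[i] == -1)
return i;
return parent[i] = find(parent[i]); // path compression
}
// Union function
void unite(int x, int y)
{
int s1 = find(x);
int s2 = find(y);
if (s1 != s2) {
if (rank[s1] < rank[s2]) {
parent[s1] = s2;
}
else if (rank[s1] > rank[s2]) {
parent[s2] = s1;
}
else {
parent[s2] = s1;
rank[s1] += 1;
}
}
}
};
class Graph {
vector<vector<int> > edgelist;
int V;
public:
Graph(int V) { this->V = V; }
// Store an edge x-y of weight w as { w, x, y }
void addEdge(int x, int y, int w)
{
edgelist.push_back({ w, x, y });
}
void kruskals_mst()
{
// Sort all edges by weight
sort(edgelist.begin(), edgelist.end());
DSU s(V);
int ans = 0;
cout << "Following are the edges in the "
"constructed MST"
<< endl;
for (auto edge : edgelist) {
int w = edge[0];
int x = edge[1];
int y = edge[2];
// Take the edge only if it does not form a cycle
if (s.find(x) != s.find(y)) {
s.unite(x, y);
ans += w;
cout << x << " -- " << y << " == " << w << endl;
}
}
cout << "Minimum Cost Spanning Tree: " << ans;
}
};
// Driver code
int main()
{
Graph g(4);
g.addEdge(0, 1, 10);
g.addEdge(1, 3, 15);
g.addEdge(2, 3, 4);
g.addEdge(2, 0, 6);
g.addEdge(0, 3, 5);
// Function call
g.kruskals_mst();
return 0;
}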
Output
Following are the edges in the constructed MST
2 -- 3 == 4
0 -- 3 == 5
0 -- 1 == 10
Minimum Cost Spanning Tree: 19
Time Complexity: O(E * logE) or O(E * logV)
• Sorting of edges takes O(E * logE) time.
• After sorting, we iterate through all edges and apply the find-union algorithm. The find and union
operations can take at most O(logV) time.
• So overall complexity is O(E * logE + E * logV) time.
• The value of E can be at most O(V^2), so O(logV) and O(logE) are the same. Therefore, the overall time
complexity is O(E * logE) or O(E * logV)
Auxiliary Space: O(V + E), where V is the number of vertices and E is the number of edges in the
graph.
++++++++++++++++++++++++++++++++++++++++++++++++++
UNIT V SEARCHING, SORTING AND HASHING TECHNIQUES 9
Searching – Linear Search – Binary Search. Sorting – Bubble sort – Selection sort – Insertion sort – Shell sort
– Merge Sort – Hashing – Hash Functions – Separate Chaining – Open Addressing – Rehashing – Extendible
Hashing.
5.1 SEARCHING :
Searching is an algorithm, to check whether a particular element is present in the list.
Types of searching:-
Linear search
Binary Search
5.2 Linear Search :
Linear search is used to search a data item in the given set in the sequential manner, starting from
the first element. It is also called as sequential search
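As a small sketch of the sequential scan just described, the two functions below return the position of the first match and the number of matches. The function names linear_search and count_occurrences are assumptions made for this illustration.

```c
/* Returns the index of the first occurrence of key in a[0..n-1],
   or -1 if key is not present. */
int linear_search(const int a[], int n, int key)
{
    int i;
    for (i = 0; i < n; i++)
        if (a[i] == key)
            return i;
    return -1;
}

/* Returns how many times key occurs in a[0..n-1]. */
int count_occurrences(const int a[], int n, int key)
{
    int i, count = 0;
    for (i = 0; i < n; i++)
        if (a[i] == key)
            count++;
    return count;
}
```

Both functions examine the elements one after another starting from the first, so in the worst case all n elements are compared.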
count++;
}
if ( count == 0 )
printf( " \n Element %d is not present in the array " , search ) ;
else
printf ( " \n Element %d is present %d times in the array \n " , search , count ) ;
}
OUTPUT:
Enter the number of elements 5
Enter the numbers
20 10 5 25 100
Array Elements
20 10 5 25 100
Enter the Element to be searched: 25
Element 25 is present 1 times in the array
Working principle:
Algorithm is quite simple. It can be done either recursively or iteratively:
1. Get the middle element;
2. If the middle element equals to the searched value, the algorithm stops;
3. Otherwise, two cases are possible:
o Search value is less than the middle element. In this case, go to the step 1 for the
part of the array, before middle element.
o Searched value is greater, than the middle element. In this case, go to the step 1
for the part of the array, after middle element.
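The steps above can be sketched iteratively in C as follows. The function name binary_search is an assumption made for this illustration, and the array must already be sorted in ascending order.

```c
/* Returns the index of key in the sorted array a[0..n-1],
   or -1 if key is not present. */
int binary_search(const int a[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   /* step 1: middle element */
        if (a[mid] == key)
            return mid;                     /* step 2: found, stop */
        else if (key < a[mid])
            high = mid - 1;                 /* continue before the middle */
        else
            low = mid + 1;                  /* continue after the middle */
    }
    return -1;                              /* nothing left to search */
}
```

Writing the middle index as low + (high - low) / 2 avoids an overflow that (low + high) / 2 could cause for very large arrays. The iterative form also needs no extra stack space.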
Example 1.
OUTPUT:
Enter the number of elements 5
Enter the numbers
20 25 50 75 100
Array Elements
20 25 50 75 100
Enter the Element to be searched: 75
Element is present in the list
Advantages of Binary search:
In Linear search, the search element is compared with all the elements in the array. Whereas
in Binary search, the search element is compared based on the middle element present in the
array.
A technique for searching an ordered list in which we first check the middle item and - based
on that comparison - "discard" half the data. The same procedure is then applied to the
remaining half until a match is found or there are no more items left.
Disadvantages of Binary search:
The recursive version of the binary search algorithm requires more stack space than the
iterative version.
It requires the data in the array to be stored in sorted order.
It involves additional complexity in computing the middle element of the array.
Analysis of Searching algorithms:
S.No  Algorithm      Best case  Average case  Worst case
1     Linear search  O(1)       O(N)          O(N)
2     Binary search  O(1)       O(log N)      O(log N)
5.4 SORTING:
Definition:
Sorting is a technique for arranging data in a particular order.
Order of sorting:
Order means the arrangement of data. The sorting order can be ascending or descending. The
ascending order means arranging the data in increasing order and descending order means
arranging the data in decreasing order.
Types of Sorting
Internal Sorting
External Sorting
Internal Sorting
Internal Sorting is a type of sorting technique in which data resides on main memory of
computer. It is applicable when the number of elements in the list is small.
E.g. Bubble Sort, Insertion Sort, Shell Sort, Quick Sort, Selection sort, Radix sort
External Sorting
External Sorting is a type of sorting technique in which there is a huge amount of data and it resides
on a secondary device (e.g. hard disk, magnetic tape and so on) while sorting.
E.g. Merge Sort, Multiway Merge Sort, Polyphase merge sort
Sorting can be classified based on
[Link] complexity
[Link] utilization
3. Stability
4. Number of comparisons.
ANALYSIS OF ALGORITHMS:
Efficiency of an algorithm can be measured in terms of:
Sorting algorithms:
Insertion sort
Selection sort
Shell sort
Bubble sort
Quick sort
Merge sort
}
a[j]=temp;
}}
Program for Insertion sort
#include<stdio.h>
int main( ){
int n, a[ 25 ], i, j, temp;
printf( "Enter number of elements \n" );
scanf( "%d", &n );
printf( "Enter %d integers \n", n );
for ( i = 0; i < n; i++ )
scanf( "%d", &a[i] );
/* insert a[i] into the already sorted part a[0..i-1] */
for ( i = 0 ; i < n; i++ ){
temp=a[i];
for (j=i;j > 0 && a[ j -1]>temp;j--)
{
a[ j ] = a[ j - 1 ];
}
a[j]=temp;}
printf( "Sorted list in ascending order: \n ");
for ( i = 0 ; i < n ; i++)
printf ( "%d \n ", a[ i ] );
return 0;}
OUTPUT:
Enter number of elements
6
Enter 6 integers
20 10 60 40 30 15
Sorted list in ascending order:
10
15
20
30
40
60
Advantage of Insertion sort
• Simple implementation.
Selection sort selects the smallest element in the list and places it in the first position, then selects
the second smallest element and places it in the second position, and proceeds in a similar way
until the entire list is sorted. For “n” elements, (n-1) passes are required. At the end of the ith
iteration, the ith smallest element will be placed in its correct position.
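A minimal sketch of this procedure in C is given below. The function name selection_sort is an assumption made for this illustration.

```c
/* Sorts a[0..n-1] in ascending order using (n-1) passes. */
void selection_sort(int a[], int n)
{
    int i, j, min, temp;
    for (i = 0; i < n - 1; i++) {
        /* find the smallest element of the unsorted part a[i..n-1] */
        min = i;
        for (j = i + 1; j < n; j++)
            if (a[j] < a[min])
                min = j;
        /* place it in position i */
        if (min != i) {
            temp = a[i];
            a[i] = a[min];
            a[min] = temp;
        }
    }
}
```

Each pass performs at most one swap, so selection sort moves fewer elements than bubble sort even though both make O(n^2) comparisons.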
OUTPUT:
After Sorting : 2 3 4 5 6
Advantages of Shell sort
• Efficient for medium-size lists.
Disadvantages of Shell sort
• Complex algorithm, not nearly as efficient as the merge, heap and quick sorts
5.4.1 Bubble Sort:
• Bubble sort is one of the simplest internal sorting algorithms.
• Bubble sort works by comparing two consecutive elements; the larger of the two
bubbles towards the right. At the end of the first pass the largest element is
placed at the end of the list.
• This comparison is repeated for all pairs of consecutive elements in each pass, which
moves the largest remaining element to the end of the unsorted part of the list.
• Bubble sort consists of (n-1) passes, where n is the number of elements to be sorted.
• In 1st pass the largest element will be placed in the nth position.
• In 2nd pass the second largest element will be placed in the (n-1)th position.
• In (n-1)th pass only the first two elements are compared.
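The passes described above can be sketched in C as follows. The function name bubble_sort is an assumption made for this illustration.

```c
/* Sorts a[0..n-1] in ascending order using (n-1) passes. */
void bubble_sort(int a[], int n)
{
    int pass, j, temp;
    for (pass = 1; pass <= n - 1; pass++) {
        /* after this pass the largest of a[0..n-pass] sits at a[n-pass] */
        for (j = 0; j < n - pass; j++) {
            if (a[j] > a[j + 1]) {    /* larger element bubbles right */
                temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
            }
        }
    }
}
```

In the last pass (pass = n - 1) the inner loop runs only for j = 0, so only the first two elements are compared.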
1. Assume A[0]=pivot which is the left. i.e pivot=left.
2. Set i=left+1; i.e A[1];
3. Set j=right. ie. A[6] if there are 7 elements in the array
4. If A[pivot]>A[i], increment i and if A[j]>A[pivot], then decrement j. Otherwise swap A[i]
and A[j] element.
5. If i=j, then swap A[pivot] and A[j].
#include<stdio.h>
void quicksort (int [], int, int ) ;
int main( )
{
int a[20], n, i ;
printf("Enter size of the array:" );
scanf("%d",&n);
printf( " Enter the numbers :");
for ( i = 0 ; i < n ; i ++ )
scanf ("%d",&a[i]);
quicksort ( a , 0 , n - 1 );
printf ( " Sorted elements: " );
for ( i = 0 ; i < n ; i ++ )
printf ("%d\t",a[ i]);
return 0;
}
void quicksort ( int a[], int left, int right )
{
int p, j, temp, i ;
if ( left < right )
{
p = left ;
i = left ;
j = right ;
while ( i < j )
{
while(a[i]<= a[p] && i<right )
i++ ;
while ( a [ j ] > a [ p ] )
j--;
if ( i < j )
{
temp = a [ i ] ;
a[i]=a[j];
a[ j ] = temp ;
}
}
temp = a [ p ] ;
a[p]=a[j];
a [ j ] =temp ;
quicksort ( a , left , j - 1 ) ;
quicksort ( a , j + 1 , right ) ;
}
}
OUTPUT:
Enter size of the array:8
++++++++++++++++++++++++++++++++++++++++++++++
5.4.5 Merge Sort :
Merge sort is a sorting algorithm that uses the divide, conquer, and combine algorithmic
paradigm.
Divide means partitioning the n-element array to be sorted into two sub-arrays of n/2 elements.
If the array A has more than one element, divide A into two sub-arrays, A1 and A2, each
containing about half of the elements of A.
Conquer means sorting the two sub-arrays recursively using merge sort.
Combine means merging the two sorted sub-arrays of size n/2 to produce the sorted array of n
elements.
The basic steps of a merge sort algorithm are as follows:
1. If the array is of length 0 or 1, then it is already sorted.
2. Otherwise, divide the unsorted array into two sub-arrays of about half the size.
3. Use the merge sort algorithm recursively to sort each sub-array.
4. Merge the two sub-arrays to form a single sorted list.
Merge Sort routine:
void msort ( int a[ ] , int temp[ ] , int left , int right ) ;
void merge ( int a[ ] , int temp[ ] , int left , int center , int right ) ;
/* Sort a[0..n-1]; temp must have room for at least n elements. */
void Merge_sort ( int a[ ] , int temp[ ] , int n )
{
msort ( a , temp , 0 , n - 1 ) ;
}
void msort ( int a[ ] , int temp[ ] , int left , int right )
{
int center ;
if ( left < right )
{
center = ( left + right ) / 2 ;
msort ( a , temp , left , center ) ;          /* sort the left half */
msort ( a , temp , center + 1 , right ) ;     /* sort the right half */
merge ( a , temp , left , center , right ) ;  /* merge the two sorted halves */
}
}
/* Merge the sorted runs a[left..center] and a[center+1..right]. */
void merge ( int a[ ] , int temp[ ] , int left , int center , int right )
{
int i = left , l = left , r = center + 1 ;
while ( ( l <= center ) && ( r <= right ) )
{
if ( a[ l ] <= a[ r ] )
temp[ i++ ] = a[ l++ ] ;
else
temp[ i++ ] = a[ r++ ] ;
}
while ( l <= center )   /* copy the rest of the left run */
temp[ i++ ] = a[ l++ ] ;
while ( r <= right )    /* copy the rest of the right run */
temp[ i++ ] = a[ r++ ] ;
for ( i = left ; i <= right ; i++ )   /* copy the merged run back into a */
a[ i ] = temp[ i ] ;
}
#include<stdio.h>
void mergesort(int a[],int i,int j);
void merge(int a[],int i1,int j1,int i2,int j2);
int main()
{
int a[30],n,i;
printf("Enter no of elements:");
scanf("%d",&n);
printf("Enter array elements:");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
mergesort(a,0,n-1);
printf("\nSorted array is :");
for(i=0;i<n;i++)
printf("%d ",a[i]);
return 0;
}
void mergesort(int a[],int i,int j)
{
int mid;
if(i<j)
{
mid=(i+j)/2;
mergesort(a,i,mid); //left recursion
mergesort(a,mid+1,j); //right recursion
merge(a,i,mid,mid+1,j); //merging of two sorted sub-arrays
}
}
void merge(int a[],int i1,int j1,int i2,int j2)
{
int temp[50]; //array used for merging
int i,j,k;
i=i1; //beginning of the first list
j=i2; //beginning of the second list
k=0;
while(i<=j1 && j<=j2) //while elements in both lists
{
if(a[i]<=a[j]) //<= takes equal keys from the first list first, which keeps the sort stable
temp[k++]=a[i++];
else
temp[k++]=a[j++];
}
while(i<=j1) //copy remaining elements of the first list
temp[k++]=a[i++];
while(j<=j2) //copy remaining elements of the second list
temp[k++]=a[j++];
//Transfer elements from temp[] back to a[]
for(i=i1,j=0;i<=j2;i++,j++)
a[i]=temp[j];
}
OUTPUT:
Enter no of elements:8
Enter array elements:24 13 26 1 2 27 38 15
Sorted array is :1 2 13 15 24 26 27 38
Advantages of Merge sort
• Merge sort is well suited for sorting very large amounts of data that do not fit into
memory.
• It is a fast and stable algorithm.
5.6 HASHING :
Hashing is a technique that is used to store, retrieve and find data in the data structure
called Hash Table. It is used to overcome the drawback of Linear Search (Comparison) &
Binary Search (Sorted order list). It involves two important concepts-
➢ Hash Table
➢ Hash Function
Hash table
A hash table is a data structure that is used to store and retrieve data (keys) very
quickly.
It is an array of some fixed size, containing the keys.
The hash table indices run from 0 to Tablesize – 1.
Each key is mapped into some number in the range 0 to Tablesize – 1.
This mapping is called the hash function.
Insertion of the data in the hash table is based on the key value obtained from the
hash function.
Using the same hash key value, the data can be retrieved from the hash table with one
or more key comparisons.
The load factor of a hash table is calculated using the formula:
(Number of data elements in the hash table) / (Size of the hash table)
Factors affecting Hash Table Design
Hash function
Table size.
Collision handling scheme
[Figure: Simple hash table with table size = 10, slots indexed 0 to 9]
E.g. consider the following keys (36, 18, 72, 43, 6) with table size = 8.
The folding method for constructing hash functions begins by dividing the item into
equal-size pieces (the last piece may not be of equal size). These pieces are then added together
to give the resulting hash key value. For example, if our item was the phone number 436-555-
4601, we would take the digits and divide them into groups of 2 (43, 65, 55, 46, 01). After the
addition, 43+65+55+46+01, we get 210. If we assume our hash table has 11 slots, then we need
to perform the extra step of dividing by 11 and keeping the remainder. In this case 210 % 11 is 1,
so the phone number 436-555-4601 hashes to slot 1.
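The folding steps above can be sketched in C as follows. This is only an illustration under stated assumptions: the function name fold_hash is hypothetical, the item is passed as a string, groups of two digits are used as in the example, and characters that are not digits (such as the hyphens) are skipped.

```c
#include <ctype.h>

/* Folding hash: split the digits of item into groups of two, add the
   groups together and keep the remainder modulo table_size. */
int fold_hash(const char *item, int table_size)
{
    int sum = 0, group = 0, count = 0;
    for (; *item != '\0'; item++) {
        if (!isdigit((unsigned char)*item))
            continue;                     /* ignore separators such as '-' */
        group = group * 10 + (*item - '0');
        if (++count == 2) {               /* a two-digit piece is complete */
            sum += group;
            group = 0;
            count = 0;
        }
    }
    sum += group;                         /* last piece may be a single digit */
    return sum % table_size;
}
```

For the phone number in the text, fold_hash("436-555-4601", 11) adds 43 + 65 + 55 + 46 + 01 = 210 and returns 210 % 11 = 1.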
Collision:
If two or more keys hash to the same index, the corresponding records cannot be stored in the
same location. This condition is known as collision.
Characteristics of Good Hashing Function:
Linear Probing
Quadratic Probing
Double Hashing
5.8 Separate chaining (Open Hashing):
It is an open hashing technique.
It is implemented using the singly linked list concept.
A pointer (ptr) field is added to each record.
When a collision occurs, a separate chain is maintained for the colliding data.
Each new element is inserted at the front of its list.
H (key) = key % table size
Two operations are there:-
▪ Insert
▪ Find
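/* Insert key at the front of its chain if it is not already present.
   The function header is reconstructed to match find() below. */
void insert( int key, Hashtable H)
{
Position P, newnode;
List L;
P = find ( key, H );
if(P == NULL)
{
newnode = malloc(sizeof(struct node));
L = H->TheLists[Hash(key,Tablesize)];   /* header node of the chain */
newnode->next = L->next;
newnode->data = key;
L->next = newnode;
}
}
/* Return the node holding key, or NULL if key is not in the table. */
Position find( int key, Hashtable H)
{
Position P;
List L;
L = H->TheLists[Hash(key,Tablesize)];
P = L->next;
while(P != NULL && P->data != key)
P = P->next;
return P;
}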
If two keys map to same value, the elements are chained together.
Initial configuration of the hash table with separate chaining. Here we use SLL(Singly Linked List)
concept to chain the elements.
Index   Chain
0       NULL
1       NULL
2       NULL
3       NULL
4       NULL
5       NULL
6       NULL
7       NULL
8       NULL
9       NULL
Insert the following four keys 22 84 35 62 into hash table of size 10 using separate chaining.
The hash function is
H(key) = key % 10
1. H(22) = 22 % 10 = 2
2. H(84) = 84 % 10 = 4
3. H(35) = 35 % 10 = 5
4. H(62) = 62 % 10 = 2 (collides with 22, so 62 is inserted at the front of the chain in slot 2)
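The example above can be tried with the following self-contained sketch of separate chaining. It is a simplified illustration, not the routine from these notes: the names chain_table, chain_hash, chain_find and chain_insert are assumed, the table is a plain array of chain heads without header nodes, and keys are assumed to be non-negative.

```c
#include <stdlib.h>

#define CHAIN_TABLE_SIZE 10

struct node {
    int data;
    struct node *next;
};

/* One chain head per slot; NULL means the chain is empty. */
struct node *chain_table[CHAIN_TABLE_SIZE];

/* H(key) = key % table size (keys assumed non-negative). */
int chain_hash(int key)
{
    return key % CHAIN_TABLE_SIZE;
}

/* Return the node holding key, or NULL if key is absent. */
struct node *chain_find(int key)
{
    struct node *p = chain_table[chain_hash(key)];
    while (p != NULL && p->data != key)
        p = p->next;
    return p;
}

/* Insert key at the front of its chain unless it is already present.
   Returns 1 on insertion, 0 if the key exists or memory runs out. */
int chain_insert(int key)
{
    struct node *n;
    if (chain_find(key) != NULL)
        return 0;
    n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->data = key;
    n->next = chain_table[chain_hash(key)];
    chain_table[chain_hash(key)] = n;
    return 1;
}
```

After calling chain_insert for 22, 84, 35 and 62 in that order, slot 2 holds the chain 62 -> 22, slot 4 holds 84 and slot 5 holds 35.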
Advantages
1. A larger number of elements can be inserted, since each slot holds a linked list.
Disadvantages
1. It requires more pointers, which occupy more memory space.
2. Searching takes time, since the hash function must be evaluated and the list must also
be traversed.
Index   Empty   After 89   After 18   After 49   After 58   After 69
0                                     49         49         49
1                                                58         58
2                                                           69
3
4
5
6
7
8                          18         18         18         18
9               89         89         89         89         89
Linear probing
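The table above can be reproduced with the following minimal sketch of linear probing. The names lp_init and lp_insert, the table size of 10, the hash function key % 10 and the use of -1 as the empty marker are assumptions made for this illustration; keys are assumed to be non-negative.

```c
#define LP_SIZE  10
#define LP_EMPTY (-1)   /* marks an unused slot */

/* Mark every slot of the table as unused. */
void lp_init(int table[])
{
    int i;
    for (i = 0; i < LP_SIZE; i++)
        table[i] = LP_EMPTY;
}

/* Insert key with linear probing: try H, H+1, H+2, ... with wrap-around.
   Returns the slot used, or -1 if the table is full. */
int lp_insert(int table[], int key)
{
    int i, pos;
    for (i = 0; i < LP_SIZE; i++) {
        pos = (key % LP_SIZE + i) % LP_SIZE;
        if (table[pos] == LP_EMPTY) {
            table[pos] = key;
            return pos;
        }
    }
    return -1;
}
```

Inserting 89, 18, 49, 58 and 69 in that order places them in slots 9, 8, 0, 1 and 2, which matches the last column of the table.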
Quadratic Probing
To resolve the primary clustering problem, quadratic probing can be used. With quadratic
probing, rather than always moving one spot, move i^2 spots from the point of collision, where
i is the number of attempts to resolve the collision.
It is another collision resolution method, which distributes items more evenly than linear
probing.
From the original index H, if the slot is filled, try cells H+1^2, H+2^2, H+3^2, ..., H+i^2 with
wrap-around.
Hi(X) = (Hash(X) + F(i)) mod Tablesize, where F(i) = i^2
Hi(X) = (Hash(X) + i^2) mod Tablesize
Limitation: at most half of the table can be used as alternative locations to resolve collisions.
This means that once the table is more than half full, it is difficult to find an empty spot.
Quadratic probing also introduces a new problem known as secondary clustering: elements
that hash to the same hash key will always probe the same alternative cells.
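The probe sequence Hi(X) = (Hash(X) + i^2) mod Tablesize can be sketched in C as follows. The names qp_init and qp_insert, the table size of 10, Hash(X) = X % 10 and the empty marker -1 are assumptions made for this illustration, and the keys used in the note after the code are borrowed from the linear probing table.

```c
#define QP_SIZE  10
#define QP_EMPTY (-1)   /* marks an unused slot */

/* Mark every slot of the table as unused. */
void qp_init(int table[])
{
    int i;
    for (i = 0; i < QP_SIZE; i++)
        table[i] = QP_EMPTY;
}

/* Insert a non-negative key with quadratic probing.
   The i-th attempt looks at (key % QP_SIZE + i*i) % QP_SIZE.
   Returns the slot used, or -1 if no free slot was found. */
int qp_insert(int table[], int key)
{
    int i, pos;
    for (i = 0; i < QP_SIZE; i++) {
        pos = (key % QP_SIZE + i * i) % QP_SIZE;
        if (table[pos] == QP_EMPTY) {
            table[pos] = key;
            return pos;
        }
    }
    return -1;   /* the probe sequence found no empty cell */
}
```

Inserting 89, 18, 49, 58 and 69 in that order places them in slots 9, 8, 0, 2 and 3. Note that a return value of -1 does not necessarily mean the table is full, because the quadratic probe sequence does not visit every cell.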
Double Hashing
Double hashing uses the idea of applying a second hash function to the key when a
collision occurs. The result of the second hash function is the number of positions to move
from the point of collision before trying to insert.
There are a couple of requirements for the second function:
• It must never evaluate to 0.
• It must make sure that all cells can be probed.
Hi(X) = (Hash(X) + i * Hash2(X)) mod Tablesize
A popular second hash function is:
Hash2(key) = R - (key % R), where R is a prime number that is smaller than the size of the
table.
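The formula Hi(X) = (Hash(X) + i * Hash2(X)) mod Tablesize can be sketched in C as follows. The names dh_init, dh_hash2 and dh_insert, the table size of 10, R = 7, Hash(X) = X % 10 and the empty marker -1 are assumptions made for this illustration; keys are assumed to be non-negative.

```c
#define DH_SIZE  10
#define DH_R     7      /* a prime smaller than the table size */
#define DH_EMPTY (-1)   /* marks an unused slot */

/* Mark every slot of the table as unused. */
void dh_init(int table[])
{
    int i;
    for (i = 0; i < DH_SIZE; i++)
        table[i] = DH_EMPTY;
}

/* Second hash function: always in the range 1..DH_R, never 0. */
int dh_hash2(int key)
{
    return DH_R - (key % DH_R);
}

/* Insert key with double hashing.
   Returns the slot used, or -1 if no free slot was found. */
int dh_insert(int table[], int key)
{
    int i, pos, step = dh_hash2(key);
    for (i = 0; i < DH_SIZE; i++) {
        pos = (key % DH_SIZE + i * step) % DH_SIZE;
        if (table[pos] == DH_EMPTY) {
            table[pos] = key;
            return pos;
        }
    }
    return -1;
}
```

Inserting 89, 18, 49, 58 and 69 in that order places them in slots 9, 8, 6, 3 and 0. Because the table size 10 used here is not prime, some step values cannot reach every cell; a prime table size avoids this.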
5.10 Rehashing :
Once the hash table gets too full, operations start to take too long and insertions may
fail. To solve this problem, a table at least twice the size of the original is built and the
elements are transferred to the new table.
Advantage:
The programmer does not have to worry about the table size.
Simple to implement
Can be used in other data structures as well
The new size of the hash table:
should also be prime
will be used to calculate the new insertion spot (hence the name rehashing)
This is a very expensive operation: it takes O(N) time, since there are N elements to rehash
and the table size is roughly 2N. This is acceptable, since it does not happen often.
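Rehashing can be sketched in C as follows. This is an illustration under stated assumptions: all names (rh_table, rh_init, rh_place, rh_rehash, rh_insert) are hypothetical, collisions are resolved with linear probing, keys are non-negative, and the table is rehashed whenever an insertion would make it more than half full. Other thresholds are equally possible.

```c
#include <stdlib.h>

#define RH_EMPTY (-1)   /* marks an unused cell */

struct rh_table {
    int *cells;   /* the slots */
    int size;     /* current table size */
    int count;    /* number of stored keys */
};

/* Trial-division primality test, adequate for small table sizes. */
int rh_is_prime(int n)
{
    int d;
    if (n < 2) return 0;
    for (d = 2; d * d <= n; d++)
        if (n % d == 0) return 0;
    return 1;
}

/* Allocate an empty table of the given size; returns 1 on success. */
int rh_init(struct rh_table *t, int size)
{
    int i;
    t->cells = malloc(size * sizeof *t->cells);
    if (t->cells == NULL) return 0;
    for (i = 0; i < size; i++) t->cells[i] = RH_EMPTY;
    t->size = size;
    t->count = 0;
    return 1;
}

/* Place key with linear probing; the caller guarantees a free cell exists. */
void rh_place(int cells[], int size, int key)
{
    int pos = key % size;
    while (cells[pos] != RH_EMPTY)
        pos = (pos + 1) % size;
    cells[pos] = key;
}

/* Build a table whose size is the first prime >= 2 * size and
   re-insert every key using the new size. Returns 1 on success. */
int rh_rehash(struct rh_table *t)
{
    struct rh_table fresh;
    int i, new_size = 2 * t->size;
    while (!rh_is_prime(new_size)) new_size++;
    if (!rh_init(&fresh, new_size)) return 0;
    for (i = 0; i < t->size; i++)
        if (t->cells[i] != RH_EMPTY)
            rh_place(fresh.cells, fresh.size, t->cells[i]);
    fresh.count = t->count;
    free(t->cells);
    *t = fresh;
    return 1;
}

/* Insert key, rehashing first if the table would become more than half full. */
int rh_insert(struct rh_table *t, int key)
{
    if (2 * (t->count + 1) > t->size && !rh_rehash(t)) return 0;
    rh_place(t->cells, t->size, key);
    t->count++;
    return 1;
}
```

Starting from a table of size 7, inserting 13, 15 and 24 needs no rehash. Inserting a fourth key, 6, would make the table more than half full, so the table grows to size 17 (the first prime not smaller than 14) and every key is placed again using key % 17.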
• Since the directory is much smaller than the file, doubling it is much cheaper. Only one
page of keys and pointers is split.
[Figure: extendible hashing example with a directory of four entries (00, 01, 10, 11) pointing
to buckets of 6-bit keys]
Extendible Hashing is a dynamic hashing method wherein directories, and buckets are
used to hash data. It is an aggressively flexible method in which the hash function also
experiences dynamic changes.
Main features of Extendible Hashing: The main features in this hashing technique are:
• Directories: These containers store pointers to buckets. Each directory is given a unique
id which may change each time when expansion takes place. The hash function returns this
directory id which is used to navigate to the appropriate bucket. Number of Directories =
2^Global Depth.
• Buckets: They store the hashed keys. Directories point to buckets. A bucket may have
more than one pointer to it if its local depth is less than the global depth.
• Global Depth: It is associated with the directories. It denotes the number of bits which
are used by the hash function to categorize the keys. Global Depth = Number of bits in
directory id.
• Local Depth: It is the same as Global Depth except for the fact that Local Depth is
associated with the buckets and not the directories. Local depth, together with the
global depth, is used to decide the action to be performed in case an overflow occurs.
Local Depth is always less than or equal to the Global Depth.
• Bucket Splitting: When the number of elements in a bucket exceeds a particular size, then
the bucket is split into two parts.
• Step 1 – Analyze Data Elements: Data elements may exist in various forms eg. Integer,
String, Float, etc.. Currently, let us consider data elements of type integer. eg: 49.
• Step 2 – Convert into binary format: Convert the data element in Binary form. For string
elements, consider the ASCII equivalent integer of the starting character and then convert
the integer into binary form. Since we have 49 as our data element, its binary form is
110001.
• Step 3 – Check Global Depth of the directory. Suppose the global depth of the Hash-
directory is 3.
• Step 4 – Identify the Directory: Consider the ‘Global-Depth’ number of LSBs in the
binary number and match it to the directory id.
Eg. The binary obtained is: 110001 and the global-depth is 3. So, the hash function will
return 3 LSBs of 110001 viz. 001.
• Step 5 – Navigation: Now, navigate to the bucket pointed by the directory with directory-
id 001.
• Step 6 – Insertion and Overflow Check: Insert the element and check if the bucket
overflows. If an overflow is encountered, go to step 7 followed by Step 8, otherwise, go
to step 9.
• Step 7 – Tackling Over Flow Condition during Data Insertion: Many times, while
inserting data in the buckets, it might happen that the Bucket overflows. In such cases, we
need to follow an appropriate procedure to avoid mishandling of data.
First, Check if the local depth is less than or equal to the global depth. Then choose one of
the cases below.
o Case1: If the local depth of the overflowing Bucket is equal to the global depth,
then Directory Expansion, as well as Bucket Split, needs to be performed. Then
increment the global depth and the local depth value by 1. And, assign appropriate
pointers.
Directory expansion will double the number of directories present in the hash
structure.
o Case2: In case the local depth is less than the global depth, then only Bucket Split
takes place. Then increment only the local depth value by 1. And, assign
appropriate pointers.
• Step 8 – Rehashing of Split Bucket Elements: The elements present in the overflowing
bucket that is split are rehashed w.r.t. the new local depth of that bucket (which equals
the new global depth in Case 1).
• Step 9 – The element is successfully hashed.
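Steps 2 to 4 above amount to keeping the lowest 'global depth' bits of the key. A minimal sketch in C is given below; the function name dir_id is assumed for this illustration, and the global depth is assumed to be smaller than the number of bits in an unsigned int.

```c
/* Return the directory id of key: its global_depth least significant bits. */
unsigned dir_id(unsigned key, unsigned global_depth)
{
    unsigned mask = (1u << global_depth) - 1u;   /* e.g. depth 3 gives binary 111 */
    return key & mask;
}
```

For the element 49 (binary 110001) and global depth 3, dir_id(49, 3) returns 1, i.e. the directory id 001.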
Example based on Extendible Hashing: Now, let us consider a prominent example of
hashing the following elements: 16,4,6,22,24,10,31,7,9,20,26.
Bucket Size: 3 (Assume)
Hash Function: Suppose the global depth is X. Then the Hash Function returns X LSBs.
• Solution: First, calculate the binary forms of each of the given numbers.
16- 10000
4- 00100
6- 00110
22- 10110
24- 11000
10- 01010
31- 11111
7- 00111
9- 01001
20- 10100
26- 11010
• Initially, the global-depth and local-depth are always 1. Thus, the hashing frame looks like
this:
• Inserting 16:
The binary format of 16 is 10000 and global-depth is 1. The hash function returns 1 LSB
of 10000 which is 0. Hence, 16 is mapped to the directory with id=0.
• Inserting 4 and 6:
Both 4(100) and 6(110)have 0 in their LSB. Hence, they are hashed as follows:
• Inserting 22: The binary form of 22 is 10110. Its LSB is 0. The bucket pointed by directory
0 is already full. Hence, Over Flow occurs.
• As directed by Step 7-Case 1, since Local Depth = Global Depth, the bucket splits and
directory expansion takes place. Also, rehashing of numbers present in the overflowing
bucket takes place after the split. And, since the global depth is incremented by 1, the
global depth is now 2. Hence, 16, 4, 6, 22 are now rehashed w.r.t. 2 LSBs:
[ 16(10000), 4(100), 6(110), 22(10110) ]
• Notice that the bucket which did not overflow has remained untouched. But, since the
number of directories has doubled, we now have 2 directories, 01 and 11, pointing to the
same bucket. This is because the local depth of the bucket has remained 1. And, any bucket
having a local depth less than the global depth is pointed to by more than one directory.
• Inserting 24 and 10: 24(11000) and 10 (1010) can be hashed based on directories with id
00 and 10. Here, we encounter no overflow condition.
• Inserting 31,7,9: All of these elements[ 31(11111), 7(111), 9(1001) ] have either 01 or 11
in their LSBs. Hence, they are mapped on the bucket pointed out by 01 and 11. We do not
encounter any overflow condition here.
• Inserting 20: Insertion of data element 20 (10100) will again cause the overflow problem.
• 20 is inserted in bucket pointed out by 00. As directed by Step 7-Case 1, since the local
depth of the bucket = global-depth, directory expansion (doubling) takes place along
with bucket splitting. Elements present in overflowing bucket are rehashed with the new
global depth. Now, the new Hash table looks like this:
• Inserting 26: Global depth is 3. Hence, 3 LSBs of 26(11010) are considered. Therefore 26
best fits in the bucket pointed out by directory 010.
• The bucket overflows, and, as directed by Step 7-Case 2, since the local depth of bucket
< Global depth (2<3), directories are not doubled but, only the bucket is split and elements
are rehashed.
Finally, the output of hashing the given list of numbers is obtained.
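The whole walk-through can be replayed with the following compact sketch of extendible hashing. It is an illustration, not a reference implementation: all names (eh_bucket, eh_table, eh_init, eh_insert) are hypothetical, the bucket size is 3 as in the example, the directory is a fixed array with an assumed maximum global depth of 8, and keys are assumed to be non-negative.

```c
#include <stdlib.h>

#define EH_BUCKET_SIZE 3   /* keys per bucket, as in the example */
#define EH_MAX_DEPTH   8   /* assumed cap on the global depth */

struct eh_bucket {
    int local_depth;
    int count;
    int keys[EH_BUCKET_SIZE];
};

struct eh_table {
    int global_depth;
    struct eh_bucket *dir[1 << EH_MAX_DEPTH];   /* first 2^global_depth entries are used */
};

/* Allocate an empty bucket with the given local depth. */
struct eh_bucket *eh_new_bucket(int depth)
{
    struct eh_bucket *b = calloc(1, sizeof *b);
    if (b != NULL) b->local_depth = depth;
    return b;
}

/* Start with global depth 1 and two empty buckets of local depth 1. */
int eh_init(struct eh_table *h)
{
    h->global_depth = 1;
    h->dir[0] = eh_new_bucket(1);
    h->dir[1] = eh_new_bucket(1);
    return h->dir[0] != NULL && h->dir[1] != NULL;
}

/* Insert key; returns 1 on success, 0 on failure. */
int eh_insert(struct eh_table *h, int key)
{
    for (;;) {
        int n = 1 << h->global_depth;                  /* number of directories */
        struct eh_bucket *b = h->dir[key & (n - 1)];   /* directory id = LSBs of key */
        struct eh_bucket *fresh;
        int i, bit, kept = 0;

        if (b->count < EH_BUCKET_SIZE) {               /* no overflow */
            b->keys[b->count++] = key;
            return 1;
        }
        if (b->local_depth == h->global_depth) {       /* Case 1: double the directory */
            if (h->global_depth == EH_MAX_DEPTH) return 0;
            for (i = 0; i < n; i++) h->dir[n + i] = h->dir[i];
            h->global_depth++;
            n *= 2;
        }
        /* Bucket split (Case 1 and Case 2). */
        fresh = eh_new_bucket(b->local_depth + 1);
        if (fresh == NULL) return 0;
        bit = 1 << b->local_depth;      /* the bit that now tells the two buckets apart */
        b->local_depth++;
        for (i = 0; i < n; i++)         /* redirect half of the pointers to the new bucket */
            if (h->dir[i] == b && (i & bit)) h->dir[i] = fresh;
        for (i = 0; i < b->count; i++) {               /* rehash the old keys */
            if (b->keys[i] & bit) fresh->keys[fresh->count++] = b->keys[i];
            else b->keys[kept++] = b->keys[i];
        }
        b->count = kept;
        /* Loop again to retry inserting key after the split. */
    }
}
```

Inserting 16, 4, 6, 22, 24, 10, 31, 7, 9, 20 and 26 in that order ends with global depth 3. Directory 000 points to the bucket {16, 24}, 010 to {10, 26}, 100 to {4, 20}, 110 to {6, 22}, and the directories 001, 011, 101 and 111 all share the bucket {31, 7, 9}, whose local depth is still 1.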
1. A bucket will have more than one pointer pointing to it if its local depth is less than the
global depth.
2. When overflow condition occurs in a bucket, all the entries in the bucket are rehashed with
a new local depth.
3. If Local Depth of the overflowing bucket
4. The size of a bucket cannot be changed after the data insertion process begins.
Limitations:
1. The directory size may increase significantly if several records are hashed on the same
directory while keeping the record distribution non-uniform.
2. Size of every bucket is fixed.
3. Memory is wasted in pointers when the global depth and local depth difference becomes
drastic.
4. This method is complicated to code.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++