UNIT III

TREES
Tree ADT – Tree Traversals - Binary Tree ADT – Expression trees – Binary Search Tree ADT – AVL Trees
– Priority Queue (Heaps) – Binary Heap.

3.1 Trees ADT :


A tree is a finite set of one or more nodes such that -
i) There is a specially designated node called the root.
ii) The remaining nodes are partitioned into n >= 0 disjoint sets T1, T2, T3, ..., Tn, where T1, T2, T3, ..., Tn are called the sub-trees of the root.

The concept of tree is represented by following Fig. 4.1.1.

Various operations that can be performed on the tree data structure are -
1. Creation of a tree.
2. Insertion of a node in the tree as a child of desired node.
3. Deletion of any node(except root node) from the tree.
4. Modification of the node value of the tree.
5. Searching particular node from the tree.

3.1.1 Basic Terminologies


Let us get introduced with some of the definitions or terms which are normally used.
1. Root
Root is a unique node in the tree to which further subtrees are attached. For the tree given above, node 10 is the root node.
2. Parent node
The node having further sub-branches is called a parent node. In Fig. 4.2.2, node 20 is the parent node of 40, 50 and 60.

3. Child nodes
The child nodes in above given tree are marked as shown below

4. Leaves
These are the terminal nodes of the tree.
For example -
5. Degree of the node
The total number of subtrees attached to that node is called the degree of a node.
For example.

6. Degree of tree
The maximum degree in the tree is degree of tree.

7. Level of the tree


The root node is always considered at level zero.
The adjacent nodes to root are supposed to be at level 1 and so on.
8. Height of the tree
The maximum level is the height of the tree. In Fig. 4.2.7 the height of tree is 3. Sometimes height of the tree
is also called depth of tree.
9. Predecessor
While displaying the tree, if some particular node occurs previous to some other node then that node is called
predecessor of the other node.
For example: While displaying the tree in Fig. 4.2.7 if we read node 20 first and then if we read node 40, then
20 is a predecessor of 40.
10. Successor
Successor is a node which occurs next to some node.
For example: While displaying tree in Fig. 4.2.7 if we read node 60 after-reading node 20 then 60 is called
successor of 20.
11. Internal and external nodes
Leaf node means a node having no child node. As leaf nodes are not having further links, we call leaf nodes
External nodes and non leaf nodes are called internal nodes.

12. Sibling
The nodes with common parent are called siblings or brothers.
For example
In this chapter we will deal with special type of trees called binary trees. Let us understand it.

3.2 Tree Traversal


Definition: Tree traversal means visiting each node exactly once.
•Basically there are six ways to traverse a tree. For these traversals we will use following notations :
L for moving to left child
R for moving to right child
D for parent node
• Thus with L, R, D we will have six combinations such as LDR, LRD, DLR, DRL, RLD, RDL.
• From computation point of view, we will consider only three combinations as LDR, DLR and LRD i.e.
inorder, preorder and postorder respectively.
1) Inorder Traversal :
In this type of traversal, the left node is visited, then parent node and then right node is visited.
For example

Algorithm:
1. If tree is not empty then
a. traverse the left subtree in inorder
b. visit the root node
c. traverse the right subtree in inorder
The recursive routine for inorder traversal is as given below.
void inorder(node *temp)
{
if(temp!= NULL)
{
inorder(temp->left);
printf("%d",temp->data);
inorder(temp->right);
}
}
2) Preorder Traversal :
In this type of traversal, the parent node or root node is visited first; then left node and finally right node will
be visited.
For example

Algorithm:
1. If tree is not empty then
a. visit the root node
b. traverse the left subtree in preorder
c. traverse the right subtree in preorder
The recursive routine for preorder traversal is as given below.
void preorder(node *temp)
{
if(temp!= NULL)
{
printf("%d",temp->data);
preorder(temp->left);
preorder(temp->right);
}
}
3) Postorder Traversal :
In this type of traversal, the left node is visited first, then right node and finally parent node is visited.
For example

Algorithm:
1. If tree is not empty then
a. traverse the left subtree in postorder
b. traverse the right subtree in postorder
c. visit the root node
The recursive routine for postorder traversal is as given below.
void postorder(node *temp)
{
if(temp!= NULL)
{
postorder(temp->left);
postorder(temp->right);
printf("%d",temp->data);
}
}
Ex. 4.6.1: Write inorder, preorder and postorder traversal for the following tree:

Sol. :
Inorder 8 10 11 30 20 25 40 42 45 60 50 55
Preorder 40 30 10 8 11 25 20 50 45 42 60 55
Postorder 8 11 10 20 25 30 42 60 45 55 50 40.
Ex. 4.6.2: Implementation of binary tree
/***************************************************************
Program for creation of a binary tree and display of the tree using recursive inorder, preorder and postorder traversals
****************************************************************/
#include <stdio.h>
#include <alloc.h>
#include<conio.h>
typedef struct bin
{
int data;
struct bin *left;
struct bin *right;
}node;/*Binary tree structure*/
void insert(node *, node *);
void inorder(node *);
void preorder(node *);
void postorder(node *);
node *get_node();
void main()
{
int choice;
char ans='n';
node *New,*root;
root=NULL;
clrscr();
do
{
printf("\n Program For Implementing Simple Binary Tree");
printf("\n [Link]");
printf("\n [Link]");
printf("\n [Link] ");
printf("\n [Link]");
printf("\n [Link]");
printf("\n\t Enter Your Choice: ");
scanf("%d", &choice);
switch(choice)
{
case 1:root = NULL;
do
{
New = get_node();
printf("\n Enter The Element: ");
scanf("%d", &New->data);
if(root == NULL)
root=New;
else
insert(root,New);
printf("\n Do You want To Enter More elements?(y/n):");
ans=getche();
} while (ans=='y' || ans == 'Y');
clrscr();
break;
case 2:if(root == NULL)
printf("Tree Is not Created!");
else
inorder(root);
break;
case 3:if(root == NULL)
printf("Tree Is Not Created!");
else
preorder(root);
break;
case 4:if(root == NULL)
printf("Tree Is Not Created!");
else
postorder(root);
break;
}
}while(choice!=5);
}
node *get_node()
{
node *temp;
temp=(node *)malloc(sizeof(node));
temp->left = NULL;
temp->right= NULL;
return temp;
}
void insert(node *root,node *New)
{
char ch;
printf("\n Where to insert left/right of %d: ", root->data);
ch=getche();
if ((ch=='r') || (ch=='R'))
{
if(root->right== NULL)
{
root->right=New;
}
else
insert(root->right,New);
}
else
{
if (root->left== NULL)
{
root->left=New;
}
else
insert(root->left, New);
}
}
void inorder(node *temp)
{
if(temp!= NULL)
{
inorder(temp->left);
printf("%d",temp->data);
inorder(temp->right);
}
}
void preorder(node *temp)
{
if(temp!= NULL)
{
printf("%d",temp->data);
preorder(temp->left);
preorder(temp->right);
}
}
void postorder(node *temp)
{
if(temp!= NULL)
{
postorder(temp->left);
postorder(temp->right);
printf("%d",temp->data);
}
}
Output
Program For Implementing Simple Binary Tree
1. Create
2. Inorder
3. Preorder
4. Postorder
5. Exit
Enter Your Choice: 1
Enter The Element: 10
Do You Want To Enter More Elements?(y/n): y
Enter The Element: 12
Where to insert left/right of 10: 1
Do You Want To Enter More Elements?(y/n): y
Enter The Element: 17
Where to insert left/right of 10: r
Do You Want To Enter More Elements?(y/n): y
Enter The Element: 8
Where to insert left/right of 10:1
Where to insert left/right of 12: r
Do You Want To Enter More Elements?(y/n):
Program For Implementing Simple Binary Tree
1. Create
2. Inorder
3. Preorder
4. Postorder
5. Exit
Enter Your Choice: 2
12 8 10 17
Program For Implementing Simple Binary Tree
1. Create
2. Inorder
3. Preorder
4. Postorder
5. Exit
Enter Your Choice: 3
10 12 8 17
Program For Implementing Simple Binary Tree
1. Create
2. Inorder
3. Preorder
4. Postorder
5. Exit
Enter Your Choice: 4
8 12 17 10
Program For Implementing Simple Binary Tree
1. Create
2. Inorder
3. Preorder
4. Postorder
5. Exit
Enter Your Choice: 5

3.3. Binary Tree ADT


Definition of a binary tree: A binary tree is a finite set of nodes which is either empty or consists of a root
and two disjoint binary trees called the left subtree and right subtree.
• The binary tree can be as shown below -

Abstract DataType BinT (node * root)


{
Instances
: Binary tree is a nonlinear data structure in which every node has at most two child nodes.
Operations:
1. Insertion :
This operation is used to insert the nodes in the binary tree. By inserting desired number of nodes, the binary
tree gets created.
2. Deletion :
This operation is used to remove any node from the tree. Note that if root node is removed the tree becomes
empty.
}

3.3.1 Types of Binary Tree :


Full Binary Tree
• A full binary tree is a tree in which every node has either zero or two children.

• In other words, a full binary tree is a binary tree in which all the nodes have two children except the leaf nodes.
Complete Binary Tree
• A complete binary tree is a binary tree in which all the levels are completely filled except possibly the last level, which is filled from the left.
• There are two points to be remembered
1) The leftmost side of the leaf node must always be filled first.
2) It is not necessary for the last leaf node to have right sibling.

• Note that in the above representation, Tree 1 is a complete binary tree in which all the levels are completely filled, whereas Tree 2 is also a complete binary tree: its last level is filled from left to right, although it is not completely filled.
Difference between complete binary tree and full binary tree

Left and Right Skewed Trees


A tree in which each node is attached as the left child of its parent node is called a left skewed tree. A tree in which each node is attached as the right child of its parent node is called a right skewed tree.
3.3.2 Representation of Tree
There are two ways of representing the binary tree.
1. Sequential representation
2. Linked representation.
Let us see these representations one by one.

1. Sequential representation of binary trees or array representation :


Each node is sequentially arranged from top to bottom and from left to right. Let us understand this matter by
numbering each node. The numbering will start from root node and then remaining nodes will give ever
increasing numbers in level wise direction. The nodes on the same level will be numbered from left to right.
The numbering will be as shown below.

Now, observe Fig. 4.5.1 carefully. You will notice that a binary tree of depth n has at most 2^n - 1 nodes. In Fig. 4.5.1 the tree has depth 4 and the total number of nodes is 15. Thus remember that in a binary tree of depth n there will be at most 2^n - 1 nodes. So if we know the maximum depth of the tree, then we can represent the binary tree using the array data structure, because we can then predict the maximum size of the array that can accommodate the tree.
Thus the array size must be >= 2^n - 1. The root will be at index 0, its left child will be at index 1, its right child will be at index 2, and so on. Another way of placing the elements in the array is by applying the formulas shown below (a small index-computation sketch in C follows them):
• When n = 0, the root node is placed at the 0th location
• Parent(n) = floor((n - 1)/2)
• Left(n) = 2n + 1
• Right(n) = 2n + 2
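As a small, hedged illustration of these formulas (the helper names and the sample array below are ours, not part of the notes' programs), the index computations can be written in C as follows:

#include <stdio.h>

/* 0-based array representation: the root is at index 0 */
int parent(int n) { return (n - 1) / 2; }   /* floor((n-1)/2), valid for n > 0 */
int left(int n)   { return 2 * n + 1; }
int right(int n)  { return 2 * n + 2; }

int main(void)
{
    /* level-order storage of a small tree: 10 is the root, 20 and 30 are its children */
    int tree[] = {10, 20, 30, 40, 50};
    int i = 1;                                                    /* the node holding 20 */
    printf("parent of %d is %d\n", tree[i], tree[parent(i)]);     /* prints 10 */
    printf("left child of %d is %d\n", tree[i], tree[left(i)]);   /* prints 40 */
    printf("right child of %d is %d\n", tree[i], tree[right(i)]); /* prints 50 */
    return 0;
}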

Advantages of sequential representation


The only advantage with this type of representation is that the direct access to any node can be possible and
finding the parent, left or right children of any particular node is fast because of the random access.
Disadvantages of sequential representation
1. The major disadvantage with this type of representation is wastage of memory.
For example: In the skewed tree half of the array is unutilized. You can easily understand this point simply by
seeing Fig. 4.5.3.
2. In this type of representation the maximum depth of the tree has to be fixed, because we have already decided the array size. If we choose an array size much larger than the depth of the tree, then memory is wasted, and if we choose an array size smaller than the depth of the tree, then we will be unable to represent some part of the tree.
3. Insertion and deletion of any node in the tree are costlier, as the other nodes have to be adjusted to appropriate positions so that the meaning of the binary tree is preserved.

As these drawbacks are there with this sequential type of representation, we will search for more flexible
representation. So instead of array we will make use of linked list to represent the tree.

2. Linked representation or node representation of binary trees


In binary tree each node will have left child, right child and data field.
The left child is nothing but the left link which points to some address of left subtree whereas right child is
also a right link which points to some address of right subtree. And the data field gives the information about
the node. Let us see the 'C' structure of the node in a binary tree.
typedef struct node
{
int data;
struct node *left;
struct node *right;
}bin;
The tree with Linked representation is as shown below -

Advantages of linked representation


1. This representation is superior to our array representation as there is no wastage of memory. And so there
is no need to have prior knowledge of depth of the tree. Using dynamic memory concept one can create as
many memory (nodes) as required. By chance if some nodes are unutilized one can delete the nodes by making
the address free.
2. Insertions and deletions which are the most common operations can be done without moving the other
nodes.
Disadvantages of linked representation
1. This representation does not provide direct access to a node; special algorithms are required to traverse to it.
2. This representation needs additional space in each node for storing the links to the left and right subtrees.
Ex. 4.5.1: For the given data draw a binary tree and show the array representation of the same: 100 80
45 55 110 20 70 65
Sol. The binary tree will be
Formulas used for placing the node values in the array are
1. Root will be at the 0th location
2. Parent(n) = floor((n - 1)/2)
3. Left(n) = 2n + 1
4. Right(n) = 2n + 2
where n > 0.
Ex. 4.5.2: What is binary tree? Show the array representation and linked representation for the
following binary tree.
Sol. Binary Tree: Binary Tree is a finite set of nodes which is either empty or consists of a root and two
disjoint binary trees called left subtree and right subtree.
Array Representation

Root = A = index 0
Left(0) = 2*0 + 1 = 1
B = Index 1
Left (1) = 2*1+1=3
C = Index 3
Left (3) = 2*3+1 = 7
D = Index 7
Left (7) = 2*7+1=15
E = Index 15
Right (7) = 2*7 +2 = 16
F = Index 16
Linked Representation
3.4 Expression Trees :
Definition: An expression tree is a binary tree in which the operands are attached as leaf nodes and operators
become the internal nodes.
For example -

From expression tree :


Inorder traversal: A+B*C(Left-Data-Right)
Preorder traversal: +A*BC(Data-Left-Right)
Postorder traversal: ABC*+ (Left-Right-Data)
If we traverse the above tree in inorder, preorder or postorder then we get infix, prefix or postfix expressions
respectively.
Key Point: The inorder traversal of an expression tree gives the infix expression, the preorder traversal of an expression tree gives the prefix expression, and the postorder traversal of an expression tree gives the postfix expression.

Creation of an Expression Trees


Consider a postfix expression stored in an array, say exp[] (for example, ab+cd-*).

Now we will read each symbol from left to right one character at a time. If we read an operand then we will
make a node of it and push it onto the stack. If we read operator then pop two nodes from stack, the first
popped node will be attached as right child to operator node and second popped node will be attached as a left
child to operator node. For the above given expression let us build a tree.
As now we read '\0', pop the content of stack. The node which we will get is the root node of an expression
tree.

Let us implement it
Ex. 4.7.1 Program for creating an expression tree and printing it using an inorder traversal.
Sol. :
#include <stdio.h>
#include <conio.h>
#include <alloc.h>
#include <ctype.h>
#define size 20
typedef struct node
{
char data;
struct node *left;
struct node *right;
}btree;
/*stack stores the operand nodes of the tree*/
btree *stack[size];
int top;
void main()
{
btree *root;
char exp[80]; /* exp stores postfix expression */
btree *create(char exp[80]);
void display(btree *root);
clrscr();
printf("Enter the postfix expression\n");
scanf("%s",exp);
top = -1; /* Initialise the stack */
root = create(exp);
printf("\n The Tree is created...\n");
printf("\n The inorder traversal of it is \n");
display(root);
getch();
}
btree* create(char exp[])
{
btree *temp;
int pos;
char ch;
void push(btree *);
btree *pop();
pos = 0;
ch = exp[pos];
while (ch!= '\0')
{
/* Create a new node */
temp = (btree *)malloc(sizeof(btree));
temp -> left = temp -> right = NULL;
temp -> data = ch;
if (isalpha(ch) ) /* is it a operand */
push (temp); /* push operand */
else if (ch=='+' || ch=='-'||ch=='*' || ch=='/')
{
/* it is operator, so pop two nodes from stack
set first node as right child and
set second as left child and push the
operator node on to the stack
*/
temp->right = pop();
temp ->left = pop();
push(temp);
}
else
printf("Invalid character in expression\n");
pos ++;
ch= exp[pos]; /* Read next character */
}
temp = pop();
return(temp);
}
void push(btree *Node)
{
if (top+1 >= size)
{
printf("Error: Stack is Full\n");
return;
}
top++;
stack[top] = Node;
}
btree* pop()
{
btree *Node;
if (top == -1)
{
printf("Error: Stack is Empty\n");
return NULL;
}
Node = stack[top];
top--;
return(Node);
}
void display(btree *root)
{
btree *temp;
temp = root;
if (temp!= NULL)
{
display(temp->left);
printf("%c", temp->data);
display(temp->right);
}
}
Output
Enter the postfix expression
ab+cd-*
The Tree is created...
The inorder traversal of it is
a + b * c-d
Ex. 4.7.2 Show the binary tree with arithmetic expression A/B * C * D + E. Give the algorithm for
inorder, preorder, postorder traversals and show the result of these traversals.
Sol. :
Algorithm for inorder, preorder and postorder Traversal - Refer section 4.6.
Inorder Traversal A/B*C*D+E
Preorder Traversal + ** / ABCDE
Postorder Traversal AB/C*D*E+

➢ Applications of Trees
Various applications of trees are -
1. Binary search tree
2. Expression tree
3. Threaded binary tree.

Review Question
1. What are expression trees. Write the procedure for constructing an expression tree.
________________________________________________________

3.5 Binary Search Tree ADT:


A binary search tree is a binary tree in which the nodes are arranged in a specific order: the values in the left subtree are less than the root node's value, and the values in the right subtree are greater than the root node's value. Fig. 4.9.1 represents a binary search tree.
The binary search tree is based on binary search algorithm.

Operations on Binary Search Tree


1. Insertion of a node in a binary tree

Algorithm:
1. Read the value for the node which is to be created and store it in a node called New.
2. Initially, if (root == NULL) then root = New.
3. Again read the next value and create the next node in New.
4. If (New->value < root->value) then attach the New node as a left child of root, otherwise attach it as a right child of root.
5. Repeat steps 3 and 4 for constructing the required binary search tree completely.

void insert(node *root,node *New)


{
if(New->data <root->data)
{
if(root->left== NULL)
root->left=New;
else
insert(root->left, New);
}
if(New->data>root->data)
{
if(root->right== NULL)
root->right=New;
else
insert(root->right,New);
}
}
While inserting any node in binary search tree, first of all we have to look for its appropriate position
in the binary search tree. We start comparing this new node with each node of the tree. If the value of
the node which is to be inserted is greater than the value of current node we move onto the right
subbranch otherwise we move onto the left subbranch. As soon as the appropriate position is found we
attach this new node as left or right child appropriately.

In Fig. 4.9.2, suppose we want to insert 23. We will start comparing 23 with the value of the root node, i.e. 10. As 23 is greater than 10, we will move to the right subtree. Now we will compare 23 with 20 and move right, then compare 23 with 22 and move right. Now we compare 23 with 24, but 23 is less than 24, so we move to the left branch of 24. As the left child of 24 is NULL, we can attach 23 as the left child of 24.

2. Deletion of an element from the binary tree


For deletion of any node from binary search tree there are three cases which are possible.
i. Deletion of leaf node.
ii. Deletion of a node having one child.
iii. Deletion of a node having two children.
Let us discuss the above cases one by one.
i. Deletion of leaf node
This is the simplest deletion, in which we set the left or right pointer of parent node as NULL.

From the above tree, we want to delete the node having value 6 then we will set left pointer of its parent
node as NULL. That is left pointer of node having value 8 is set to NULL.
ii. Deletion of a node having one child
To explain this kind of deletion, consider a tree as shown in the Fig. 4.9.6.

If we want to delete the node 15, then we will simply copy node 18 at place of 15 and then set the node
free. The inorder successor is always copied at the position of a node being deleted.

iii. The node having two children


Again, let us take some example for discussing this kind of deletion.
Let us consider that we want to delete node having value 7. We will then find out the inorder successor
of node 7. The inorder successor will be simply copied at location of node 7. Thats it !
That means copy 8 at the position where value of node is 7. Set left pointer of 9 as NULL. This completes
the deletion procedure.

void del(node *root,int key)


{
node *temp, *parent, *temp_succ;
temp=search(root,key,&parent);
/*deleting a node with two children*/
if(temp->left!= NULL&&temp->right!= NULL)
{
parent=temp;
temp_succ=temp->right;
while(temp_succ->left!= NULL)
{
parent=temp_succ;
temp_succ=temp_succ->left;
}
temp->data=temp_succ->data;
/* unlink the inorder successor; its right subtree takes its place */
if(temp_succ == parent->left)
parent->left = temp_succ->right;
else
parent->right = temp_succ->right;
free(temp_succ);
printf(" Now Deleted it!");
return;
}
/*deleting a node having only one child*/
/*The node to be deleted has left child*/
if(temp->left!= NULL &&temp->right== NULL)
{
if(parent->left==temp)
parent->left=temp->left;
else
parent->right=temp->left;
free(temp);
printf(" Now Deleted it!");
return;
}
/*The node to be deleted has right child*/
if(temp->left==NULL &&temp->right!= NULL)
{
if(parent->left == temp)
parent->left=temp->right;
else
parent->right=temp->right;
free(temp);
printf(" Now Deleted it!");
return;
}
/*deleting a node which is having no child*/
if(temp->left==NULL &&temp->right== NULL)
{
if(parent->left==temp)
parent->left=NULL;
else
parent->right=NULL;
printf(" Now Deleted it!");
return;
}
}
3. Searching a node from binary search tree
In searching, the node which we want to search is called a key node. The key node will be compared
with each node starting from root node if value of key node is greater than current node then we search
for it on right subbranch otherwise on left subbranch. If we reach to leaf node and still we do not get
the value of key node then we declare "node is not present in the tree".

In the above tree, if we want to search for value 9. Then we will compare 9 with root node 10. As 9 is
less than 10 we will search on left subbranch. Now compare 9 with 5, but 9 is greater than 5. So we will
move on right subbranch. Now compare 9 with 8 but as 9 is greater than 8 we will move on right
subbranch. Now we read the node value as 9. Thus the desired node can be searched. Let us see the 'C'
implementation of it. The routine is as given below -
Non-recursive search routine
node *search(node *root,int key,node **parent)
{
node *temp;
temp=root;
while(temp!= NULL)
{
if(temp->data==key)
{
printf("\n The %d Element is Present",temp->data);
return temp;
}
*parent=temp;          /* mark the parent node */
if(temp->data>key)     /* if current node is greater than key */
temp=temp->left;       /* search in the left subtree */
else
temp=temp->right;
}
return NULL;
}
We can display a tree in inorder fashion. Hence the complete implementation is given below along with appropriate output.
appropriate output.
/*****************************************************************
Program for implementation of binary search tree: perform insertion, deletion, searching and display of the tree.
*****************************************************************/
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
typedef struct bst
{
int data;
struct bst *left, *right;
}node;
void insert(node *,node *);
void inorder(node *);
node *search(node *,int,node **);
void del(node *,int);
void main()
{
int choice;
char ans='N';
int key;
node *New, *root, *tmp, *parent;
node *get_node();
root=NULL;
clrscr();
printf("\n\t Program For Binary Search Tree ");
do
{
printf("\[Link]\[Link]\[Link]\[Link]");
printf("\n\n Enter your choice :");
scanf("%d", &choice);
switch(choice)
{
case 1:do
{
New=get_node();
printf("\n Enter The Element ");
scanf("%d", &New->data);
if(root == NULL) /* Tree is not Created */
root=New;
else
insert(root,New);
printf("\n Do u Want To enter More Elements?(y/n)");
ans=getch();;
} while (ans=='y');
break;
case 2:printf("\n Enter The Element Which You Want To Search");
scanf("%d", &key);
tmp=search(root,key,&parent);
printf("\n Parent of node %d is %d",
tmp->data,parent->data);
break;
case 3:printf("\n Enter The Element U wish to Delete");
scanf("%d", &key);
del(root,key);
break;
case 4:if(root == NULL)
printf("Tree Is Not Created");
else
{
printf("\n The Tree is: ");
inorder(root);
}
break;
}
} while (choice!=5);
}
node *get_node()
{
node *temp;
temp=(node *)malloc(sizeof(node));
temp->left=NULL;
temp->right=NULL;
return temp;
}
/*This function is for creating a binary search tree */
void insert(node *root,node *New)
{
if(New->data<root->data)
{
if(root->left== NULL)
root->left=New;
else
insert(root->left,New);
}
if(New->data>root->data)
{
if(root->right == NULL)
root->right=New;
else
insert(root->right,New);
}
}
/*
This function is for searching the node from binary Search Tree
*/
node *search(node *root,int key,node **parent)
{
node *temp;
temp=root;
while(temp!= NULL)
{
if(temp->data==key)
{
printf("\n The %d Element is Present",temp->data);
return temp;
}
*parent=temp;
if(temp->data>key)
temp=temp->left;
else
temp=temp->right;
}
return NULL;
}
/*
This function is for deleting a node from binary search tree. There exists three possible cases for deletion
of a node
*/
void del(node *root,int key)
{
node *temp, *parent,*temp_succ;
temp=search(root,key,&parent);
/*deleting a node with two children*/
if(temp->left!= NULL&&temp->right!= NULL)
{
parent=temp;
temp_succ=temp->right;
while(temp_succ->left!= NULL)
{
parent=temp_succ;
temp_succ=temp_succ->left;
}
temp->data=temp_succ->data;
/* unlink the inorder successor; its right subtree takes its place */
if(temp_succ == parent->left)
parent->left = temp_succ->right;
else
parent->right = temp_succ->right;
free(temp_succ);
printf(" Now Deleted it!");
return;
}
/*deleting a node having only one child*/
/*The node to be deleted has left child*/
if(temp->left!= NULL &&temp->right== NULL)
{
if(parent->left==temp)
parent->left=temp->left;
else
parent->right=temp->left;
free(temp);
printf(" Now Deleted it!");
return;
}
/*The node to be deleted has right child*/
if(temp->left==NULL &&temp->right!= NULL)
{
if(parent->left==temp)
parent->left=temp->right;
else
parent->right=temp->right;
free(temp);
printf(" Now Deleted it!");
return;
}
/*deleting a node which is having no child*/
if(temp->left==NULL &&temp->right== NULL)
{
if(parent->left==temp)
parent->left = NULL;
else
parent->right=NULL;
printf(" Now Deleted it!");
return;
}
}
/*
This function displays the tree in inorder fashion
*/
void inorder(node *temp)
{
if(temp!= NULL)
{
inorder(temp->left);
printf("%d",temp->data);
inorder(temp->right);
}
}
Output
Program For Binary Search Tree
1. Create
2. Search
3. Delete
4. Display
Enter your choice :1
Enter The Element 10
Do u Want To enter More Elements?(y/n)
Enter The Element 8
Do u Want To enter More Elements?(y/n)
Enter The Element 9
Do u Want To enter More Elements? (y/n)
Enter The Element 7
Do u Want To enter More Elements?(y/n)
Enter The Element 15
Do u Want To enter More Elements? (y/n)
Enter The Element 13
Do u Want To enter More Elements?(y/n)
Enter The Element 14
Do u Want To enter More Elements?(y/n)
Enter The Element 12
Do u Want To enter More Elements?(y/n)
Enter The Element 16
Do u Want To enter More Elements? (y/n)
1. Create
2. Search
3. Delete
4. Display
Enter your choice :4
The Tree is : 7 8 9 10 12 13 14 15 16
1. Create
2. Search
3. Delete
4. Display
Enter your choice :2
Enter The Element Which You Want To Search16
The 16 Element is Present
Parent of node 16 is 15
1. Create
2. Search
3. Delete
4. Display

Ex. 4.9.1 Define binary search tree. Draw the binary search tree for the following input. 14, 15, 4, 9, 7,
18, 3, 5, 16, 4, 20, 17, 9, 14, 5
Sol. Binary Search Tree (Refer section 4.9)
Ex. 4.9.2 Define a binary search tree and construct a binary search tree. With elements (22, 28, 20, 25,
22, 15, 18, 10, 14). Give recursive search algorithm to search an element in that tree.
Sol. Binary Search Tree (Refer section 4.9)
Example
Recursive Algorithm for search- Refer section 7.7.
Ex. 4.9.3 What is binary search tree? Draw the binary search tree for the following. input. 14, 5, 6, 2,
18, 20, 16, 18, -1, 21.
Sol. Binary Search Tree: Refer section 4.9.
Example
Ex. 4.9.4: What is binary search tree? Write a recursive search routine for binary search tree.
Sol. Binary Search Tree: Refer section 4.9.
Recursive Search Routine
node *search(node *temp, int key)
{
if (temp == NULL || key == temp->data)
return temp;
else
if (key < temp->data)
return search(temp->left, key);
else
return search(temp->right, key);
}
Ex. 4.9.5 Write the following routines to implement the basic binary search tree operations
(i) Perform search operation in binary search tree
(ii) Find_min and Find_max
Sol.: (i) Search operation - Refer section 4.9.1.
(ii) Find_min and Find_max
Consider following binary search tree

For finding the minimum value from the binary search tree, we need to traverse to the left most node.
Hence the left most node in above Fig. 4.9.11 is with value 7 which is the minimum value. Note that for
the leftmost node the left pointer is NULL.
The routine for finding the minimum value from the binary search tree is,
void Find_min(node *root)
{
node *current = root;
while(current->left != NULL)
current = current->left;
printf("%d", current->data);
}
For finding the maximum value from the binary search tree, we need to traverse to the right most node.
Hence the right most node in above Fig. 4.9.11 is with value 13 which is the maximum value. Note that
for the rightmost node the right pointer is NULL.
The routine for finding the maximum value from the binary search tree is
void Find_max(node *root)
{
node *current = root;
while(current->right != NULL)
current = current->right;
printf("%d", current->data);
}

Review Questions
1. Write an iterative search routine for a binary search tree.
2. Describe the binary search tree with an example. Write an iterative function to search for a key value in a binary search tree.
3. How to insert and delete an element into binary search tree and write down the code for the insertion
routine with an example.

3.6 AVL Trees :


Adelson-Velskii and Landis in 1962 introduced a binary tree structure that is balanced with respect to the heights of its subtrees. The tree can be made balanced, and because of this, retrieval of any node can be done in O(log n) time, where n is the total number of nodes. From the names of these scientists the tree is called an AVL tree.
• Definition
An empty tree is height balanced. If T is a non-empty binary tree with TL and TR as its left and right subtrees, then T is height balanced if and only if,
i) TL and TR are height balanced, and
ii) |hL - hR| <= 1, where hL and hR are the heights of TL and TR.
The idea of balancing a tree is obtained by calculating the balance factor of a tree.
• Definition of Balance Factor
The balance factor BF(T) of a node in a binary tree is defined to be hL - hR, where hL and hR are the heights of the left and right subtrees of T.
For any node in an AVL tree the balance factor, i.e. BF(T), is -1, 0 or +1.
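To make these two definitions concrete, the following is a minimal sketch of computing the height and the balance factor of a node in C. It follows the convention used by the AVL routines later in this section (the height of an empty tree is taken as -1); the node type shown here is only illustrative, not the structure used by the notes' programs.

#include <stdlib.h>

/* illustrative node structure */
typedef struct tnode
{
    int data;
    struct tnode *left, *right;
} tnode;

/* Height of a subtree; an empty tree has height -1, so a single leaf has height 0. */
int height(tnode *t)
{
    int hl, hr;
    if (t == NULL)
        return -1;
    hl = height(t->left);
    hr = height(t->right);
    return (hl > hr ? hl : hr) + 1;
}

/* Balance factor BF(T) = hL - hR; in an AVL tree it must be -1, 0 or +1 at every node. */
int balance_factor(tnode *t)
{
    if (t == NULL)
        return 0;
    return height(t->left) - height(t->right);
}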

Difference between AVL tree and binary search tree


AVL TREE

The AVL tree is named after its two inventors, G.M. Adelson-Velsky and E.M. Landis, who published it in their 1962 paper "An algorithm for the organization of information."

An AVL tree is a self-balancing binary search tree. In an AVL tree, the heights of the two child subtrees of any node differ by at most one; therefore, it is also said to be height-balanced.

The balance factor of a node is the height of its right subtree minus the height of its left subtree, and a node with balance factor 1, 0, or -1 is considered balanced. A node with any other balance factor is considered unbalanced and requires rebalancing of the tree. This can be done by AVL tree rotations.

Need for AVL tree

The disadvantage of a binary search tree is that its height can be as large as N - 1.
This means that the time needed to perform insertion, deletion and many other operations can be O(N) in the worst case.
We want a tree with small height.
A binary tree with N nodes has height at least Ω(log N).
Thus, our goal is to keep the height of a binary search tree O(log N). Such trees are called balanced binary search trees; examples are the AVL tree and the red-black tree.
Thus we go for the AVL tree.

HEIGHTS OF AVL TREE

An AVL tree is a special type of binary tree that is always "partially" balanced. The criterion used to determine the "level" of "balanced-ness" is the difference between the heights of the subtrees of a root in the tree. The "height" of a tree is the "number of levels" in the tree. The height of a tree is defined as follows:

1. The height of a tree with no elements is 0


2. The height of a tree with 1 element is 1
3. The height of a tree with > 1 element is equal to 1 + the height of itstallest subtree.
4. The height of a leaf is 1. The height of a null pointer is zero.

The height of an internal node is the maximum height of its children plus 1.

FINDING THE HEIGHT OF AVL TREE

AVL trees are identical to standard binary search trees except that for every node in an AVL tree,
the height of the left and right subtrees can differ by at most 1 . AVL trees are HB-k trees (height
balanced trees of order k) of order HB-1. The following is the height differential formula:
|Height (Tl)-Height(Tr)|<=k

When storing an AVL tree, a field must be added to each node with one of three values: 1, 0, or -1. A value of 1 in this field means that the left subtree has a height one more than the right subtree. A value of -1 denotes the opposite. A value of 0 indicates that the heights of both subtrees are the same.

BALANCE FACTOR FOR HEIGHT OF AVL TREE
An AVL tree is a binary search tree with a balance condition.
Balance Factor (BF) = Hl - Hr, where
Hl => Height of the left subtree
Hr => Height of the right subtree.

If BF = {-1, 0, 1} is satisfied, only then is the tree balanced. Such a tree is a Height Balanced Tree.
If the calculated value of BF goes out of this range, then balancing has to be done.

Rotation :

Modification to the tree: if the AVL tree is imbalanced, proper rotations have to be done.
A rotation is a process of switching children and parents among two or three adjacent nodes to restore balance to the tree.

• There are two kinds of single rotation:


Right Rotation Left Rotation

An insertion or deletion may cause an imbalance in an AVL tree.


The deepest node, which is an ancestor of a deleted or an inserted node, and whose balance factor
has changed to -2 or +2 requires rotation to rebalance the tree.
Balanced Factor:

This Tree is an AVL Tree and a height balanced tree.

An AVL tree causes imbalance when any of following condition occurs:


i. An insertion into the right child's right subtree.
ii. An insertion into the left child's left subtree.
iii. An insertion into the right child's left subtree.
iv. An insertion into the left child's right subtree.
These imbalances can be overcome by,

1. Single Rotation - (if the insertion occurs on the outside, i.e., LL or RR)

-> LL (Left-Left) case --- do a single right rotation.

-> RR (Right-Right) case --- do a single left rotation.

2. Double Rotation - (if the insertion occurs on the inside, i.e., LR or RL)

-> RL (Right-Left) case --- do a single right rotation, then a single left rotation.
-> LR (Left-Right) case --- do a single left rotation, then a single right rotation.

General Representation of Single Rotation

1. LL Rotation (single right rotation):

• The left child x of a node y becomes y's parent.
• y becomes the right child of x.
• The right child T2 of x, if any, becomes the left child of y.

2. RR Rotation (single left rotation):

• The right child y of a node x becomes x's parent.
• x becomes the left child of y.
• The left child T2 of y, if any, becomes the right child of x.

General Representation of Double Rotation

1. LR( Left -- Right rotation):

2. RL( Right -- Left rotation) :



EXAMPLE:
LET US CONSIDER INSERTING THE NODES 20, 10, 40, 50, 90, 30, 60, 70 INTO AN AVL TREE

AVL TREE ROUTINES

Creation of AVL Tree and Insertion

struct avlnode;
typedef struct avlnode *position;
typedef struct avlnode *avltree;
typedef int elementtype;
struct avlnode
{
elementtype element;
avltree left;
avltree right;
int height;
};
static int height(position P)
{
if(P == NULL)
return -1;
else
return P->height;
}
avltree insert(elementtype X, avltree T)
{
if(T == NULL)
{
/* Create and return a one node tree */
T = malloc(sizeof(struct avlnode));
if(T == NULL)
fatalerror("Out of space");
else
{
T->element = X;
T->height = 0;
T->left = T->right = NULL;
}
}
else if(X < T->element)
{
T->left = insert(X, T->left);
if(height(T->left) - height(T->right) == 2)
{
if(X < T->left->element)
T = singlerotatewithleft(T);
else
T = doublerotatewithleft(T);
}
}
else if(X > T->element)
{
T->right = insert(X, T->right);
if(height(T->right) - height(T->left) == 2)
{
if(X > T->right->element)
T = singlerotatewithright(T);
else
T = doublerotatewithright(T);
}
}
T->height = max(height(T->left), height(T->right)) + 1;
return T;
}

Routine to perform Single Left :

. This function can be called only if k2 has a left child.


. Perform a rotate between a node k2 and its left child.
. Update height, then return the new root.

static position singlerotatewithleft(position k2)
{
position k1;
k1 = k2->left;
k2->left = k1->right;
k1->right = k2;
k2->height = max(height(k2->left), height(k2->right)) + 1;
k1->height = max(height(k1->left), height(k1->right)) + 1;
return k1;   /* New root */
}

Routine to perform Single Right :

static position singlerotatewithright(position k1)
{
position k2;
k2 = k1->right;
k1->right = k2->left;
k2->left = k1;
k1->height = max(height(k1->left), height(k1->right)) + 1;
k2->height = max(height(k2->left), height(k2->right)) + 1;
return k2;   /* New root */
}

Double rotation with Left :

static position doublerotatewithleft(position k3)
{
/* Rotate between k1 and k2 */
k3->left = singlerotatewithright(k3->left);
/* Rotate between k3 and k2 */
return singlerotatewithleft(k3);
}

Double rotation with Right :

static position doublerotatewithright(position k1)
{
/* Rotate between k2 and k3 */
k1->right = singlerotatewithleft(k1->right);
/* Rotate between k1 and k2 */
return singlerotatewithright(k1);
}
APPLICATIONS

AVL trees play an important role in most computer related applications. The need for and use of AVL trees are increasing day by day; their efficiency and low complexity add value to their reputation. Some of the applications are:

• Contour extraction algorithms
• Parallel dictionaries
• Compression of computer files
• Translation from source language to target language
• Spell checkers

ADVANTAGES OF AVL TREE

AVL trees guarantee that the difference in height of any two subtrees rooted at the same node will be at most one. This guarantees an asymptotic running time of O(log n) as opposed to O(n) in the case of a standard BST.
The height of an AVL tree with n nodes is always very close to the theoretical minimum.
Since the AVL tree is height balanced, operations like insertion and deletion have low time complexity.
Since the tree is always height balanced, recursive implementation is possible.
The heights of the left and the right sub-trees differ by at most one.

DISADVANTAGES OF AVL TREE

One limitation is that the tree might be spread across memory;
as you need to travel down the tree, you take a performance hit at every level down.
One solution: store more information on the path.
Difficult to program and debug; more space is needed for the balance factor. Asymptotically faster, but rebalancing costs time.
Most larger searches are done in database systems on disk and use other structures.

Ex. 4.13.1: Show the result of inserting 2, 1, 4, 5, 9, 3, 6, 7 into an empty AVL tree.
Sol. :

Ex. 4.13.2 Draw the result of inserting 20, 10 and 24 one by one into the AVL tree given below. Draw
the tree after each insertion.
Sol. :
Step 1: Insertion of 20
Ex. 4.13.3: Show the results of inserting 43, 11, 69, 72 and 30 into an initally empty AVL tree. Show the
results of deleting the nodes 11 and 72 one after the other of constructed tree.
Sol. Insertion operation

Ex. 4.13.4: Construct an AVL tree with the values 3,1,4,5,9,2,8,7,0 into an initially empty tree. Write
code for inserting into AVL tree.
Sol.

Code for Insertion into AVL Tree


//Tree node
typedef struct Node
{
int data;
int BF;
struct Node *left;
struct Node *right;
}node;
// Definition of insert function
node *root = NULL;   /* root of the AVL tree, shared by insert() and create() */
node *insert(int data, int *current)
{
root = create(root, data, current);
return root;
}
//definition of create function
node *create(struct Node *root, int data, int *current)
{
node *temp1, *temp2;
if (root == NULL)//initial node
{
root = (node *)malloc(sizeof(node));
root->data = data;
root->left = NULL;
root->right = NULL;
root->BF = 0;
*current = TRUE;
return(root);
}
if (data<root->data)
{
root->left = create(root->left, data, current);
// adjusting left subtree
if (*current)
{
switch (root->BF)
{
case 1: temp1 = root->left;
if (temp1->BF ==1)
{
printf("\n single rotation: LL rotation");
root->left = temp1->right;
temp1->right = root;
root->BF = 0;
root = temp1;
}
else
{
printf("\n Double rotation:LR rotation");
temp2=temp1->right;
temp1->right = temp2->left;
temp2->left=temp1;
root->left = temp2->right;
temp2->right = root;
if (temp2->BF == 1)
root->BF = -1;
else
root->BF = 0;
if (temp2->BF == -1)
temp1->BF = 1;
else
temp1->BF = 0;
root = temp2;
}
root->BF = 0;
* current = FALSE;
break;
case 0:
root->BF = 1;
break;
case -1:
root->BF = 0;
*current = FALSE;
}//end of switch
}//inner if
}//outer if
if (data > root -> data)
{
root->right=create(root->right, data, current);
//adjusting the right subtree
if (*current != NULL)
{
switch (root->BF)
{
case 1:
root->BF = 0;
*current = FALSE;
break;
case 0:
root->BF = -1;
break;
case -1:
temp1 = root->right;
if (temp1->BF == -1)
{
printf("\n single rotation:RR rotation");
root->right = temp1->left;
temp1->left = root;
root->BF = 0;
root = temp1;
}
else
{
printf("\n Double rotation:RL rotation");
temp2 = temp1->left;
temp1->left = temp2->right;
temp2->right = temp1;
root->right = temp2->left;
temp2->left = root;
if (temp2->BF == -1)
root->BF = 1;
else
root->BF = 0;
if (temp2->BF == 1)
temp1->BF = -1;
else
temp1->BF = 0;
root =temp2;
}
root->BF = 0;
* current = FALSE;
}
}
}
return(root);
}
void main()
{
node *root = NULL;
int current;
//calling insert function by passing values
root = insert(40, &current);
root = insert(50, &current);
root = insert(70, &current);
….
}

Ex. 4.13.5: Define AVL tree. Starting with an empty AVL tree, insert the following elements in the given order: 35, 45, 65, 55, 75, 15, 25.
Sol. AVL Tree Refer section 4.13.

Ex. 4.13.6: Write a routine for AVL tree insertion. Insert the following elements in the empty tree and
how do you balance the tree after each element insertion Elements: 2, 5, 4, 6, 7, 9, 8, 3, 1, 10
Sol. :

Ex. 4.13.7: Implementation of AVL tree.


Sol. :
/**************************************************************
This program performs the insertion and deletion operations on an AVL tree
**************************************************************/
#include<stdio.h>
#include<stdlib.h>
#include<conio.h>
#define FALSE 0
#define TRUE 1
//Tree node
typedef struct Node
{
int data;
int BF;
struct Node *left;
struct Node *right;
}node;
node *insert(int data, int *current)
{
node *create(node *root,int data, int *current);
static node *root = NULL;   /* root persists across insert calls */
root=create(root,data,current);
return root;
}
node *remove(node *root,int data, int *current);
node *find_succ(node *temp,node *root, int *current);
node *right_rotation(node *root, int *current);
node *left_rotation(node *root, int *current);
void display(node *root);
node *create(struct Node *root,int data, int *current)
{
node *temp1,*temp2;
if(root == NULL)
{
root = (node *)malloc(sizeof(node));
root->data=data;
root->left=NULL;
root->right=NULL;
root->BF=0;
*current=TRUE;
return(root);
}
if(data<root->data)
{
root->left=create(root->left,data,current);
// adjusting left subtree
if(*current)
{
switch(root->BF)
{
case 1:temp1= root->left;
if(temp1->BF==1)
{
printf("\n single rotation: LL rotation");
root->left=temp1->right;
temp1->right=root;
root->BF=0;
root=temp1;
}
else
{
printf("\n Double roation:LR rotation");
temp2=temp1->right;
temp1->right=temp2->left;
temp2->left=temp1;
root->left=temp2->right;
temp2->right=root;
if(temp2->BF==1)
root->BF=-1;
else
root->BF=0;
if(temp2->BF==-1)
temp1->BF=1;
else
temp1->BF=0;
root=temp2;
}
root->BF=0;
*current=FALSE;
break;
case 0:
root->BF=1;
break;
case -1:
root->BF=0;
*current=FALSE;
}
}
}
if(data> root->data)
{
root->right=create(root->right,data,current);
//adjusting the right subtree
if(*current!= NULL)
{
switch(root->BF)
{
case 1:
root->BF=0;
*current=FALSE;
break;
case 0:
root->BF=-1;
break;
case -1:
temp1=root->right;
if(temp1->BF==-1)
{
printf("\n single rotation:RR rotation")
root->right=temp1->left;
temp1->left=root;
root->BF=0;
root=temp1;
}
else
{
printf("\n Double rotation:RL rotation");
temp2=temp1->left;
temp1->left=temp2->right;
temp2->right=temp1;
root->right=temp2->left;
temp2->left=root;
if(temp2->BF==-1)
root->BF=1;
else
root->BF=0;
if(temp2->BF==1)
temp1->BF=-1;
else
temp1->BF=0;
root=temp2;
}
root->BF=0; *
current=FALSE;
}
}
}
return(root);
}
/*
Display of Tree in inorder fashion
*/
void display(node *root)
{
if(root!= NULL)
{
display(root->left);
printf("%d",root->data);
display(root->right);
}
}
/*
Deletion of desired node the tree
*/
node *remove(node *root,int data, int *current)
{
node *temp;
if(root == NULL)
{
printf("\n Empty Tree!!!");
return (root);
}
else
{
if(data<root->data)
{
root->left=remove(root->left,data,current);
if(*current)
root=right_rotation(root,current);
}
else
{
if(data>root->data)
{
root->right=remove(root->right,data,current);
if(*current)
root=left_rotation(root, current);
}
else
{
temp=root;
if(temp->right== NULL)
{
root=temp->left;
*current=TRUE;
free(temp);
}
else
{
if(temp->left== NULL)
{
root=temp->right;
* current=TRUE;
free(temp);
}
else
{
temp->right=find_succ(temp->right,temp, current);
if(*current)
root=left_rotation(root,current);
}
}
}
}
}
return (root);
}
node *find_succ(node *succ,node *temp,int *current)
{
node *temp1=succ;
if(succ->left!= NULL)
{
succ->left=find_succ(succ->left,temp,current);
if(*current)
succ=right_rotation(succ,current);
}
else
{
temp1=succ;
temp->data=succ->data;
succ=succ->right;
free(temp1);
*current=TRUE;
}
return (succ);
}
node *right_rotation(node *root, int *current)
{
node *temp1,*temp2;
switch(root->BF)
{
case 1:
root->BF=0;
break;
case 0:
root->BF=-1;
*current=FALSE;
break;
case -1:
temp1=root->right;
if(temp1->BF<=0)
{
printf("\n single rotation: RR rotation");
root->right=temp1->left;
temp1->left=root;
if(temp1->BF==0)
{
root->BF=-1;
temp1->BF=1;
*current=FALSE;
}
else
{
root->BF=temp1->BF=0;
}
root=temp1;
}
else
{
printf("\n Double Rotation:RL rotation");
temp2=temp1->left;
temp1->left=temp2->right;
temp2->right=temp1;
root->right=temp2->left;
temp2->left=root;
if(temp2->BF==-1)
root->BF=1;
else
root->BF=0;
if(temp2->BF==1)
temp1->BF=-1;
else
temp1->BF=0;
root=temp2;
temp2->BF=0;
}
}
return (root);
}
node *left_rotation(node *root,int *current)
{
node *temp1,*temp2;
switch(root->BF)
{
case -1:
root->BF=0;
break;
case 0:
root->BF=1;
*current=FALSE;
break;
case 1:
temp1=root->left;
if(temp1->BF>=0)
{
printf("\nsingle rotation LL rotation");
root->left=temp1->right;
temp1->right=root;
if(temp1->BF==0)
{
root->BF=1;
temp1->BF=-1;
*current=FALSE;
}
else
{
root->BF=temp1->BF=0;
}
root=temp1;
}
else
{
printf("\nDouble rotation:LR rotation");
temp2=temp1->right;
temp1->right=temp2->left;
temp2->left=temp1;
root->left=temp2->right;
temp2->right=root;
if(temp2->BF==1)
root->BF=-1;
else
root->BF=0;
if(temp2->BF==-1)
temp1->BF=1;
else
temp1->BF=0;
root=temp2;
temp2->BF=0;
}
}
return root;
}
void main()
{
node *root=NULL;
int current;
clrscr();
root=insert(40,&current);
root=insert(50,&current);
root=insert(70,&current);
printf("\n");
display(root);
printf("\n");
root=insert(30,&current);
printf("\n");
display(root);
root=insert(20,&current);
printf("\n");
display(root);
root=insert(45,&current);
printf("\n");
display(root);
root=insert(25,&current);
printf("\n");
display(root);
root=insert(10,&current);
printf("\n");
display(root);
root=insert(5,&current);
printf("\n");
display(root);
root=insert(22,&current);
printf("\n");
display(root);
root=insert(1, &current);
printf("\n");
display(root);
root=insert(35,&current);
printf("\n\nFinal AVL tree is: \n");
display(root);
printf("\n Removing node 20");
root=remove(root,20,&current);
printf("\n Removing node 45");
root=remove(root,45,&current);
printf("\n\n AVL tree after deletion of a node: \n");
display(root);
printf("\n");
}
Output
single rotation:RR rotation
40 50 70
30 40 50 70
single rotation: LL rotation
20 30 40 50 70
Double roation:LR rotation
20 30 40 45 50 70
Double roation:LR rotation
20 25 30 40 45 50 70
10 20 25 30 40 45 50 70
single rotation: LL rotation
5 10 20 25 30 40 45 50 70
Double roation:LR rotation
5 10 20 22 25 30 40 45 50 70
single rotation: LL rotation
1 5 10 20 22 25 30 40 45 50 70
Double roation:LR rotation
Final AVL tree is:
1 5 10 20 22 25 30 35 40 45 50 70
Removing node 20
single rotation LL rotation.
Removing node 45
AVL tree after deletion of a node:
1 5 10 22 25 30 35 40 50 70
Review Questions
1. Explain with illustration the various types of rotations needed to retain the properties of an AVL tree
during insertion of a node. State the time complexity of insertion of a node into an AVL
2. Explain the following. routines in AVL tree with example. i) Insertion, ii) Delection, iii) Single
rotation, iv) Double rotation. AU: May-14, Marks 16
3. Explain the AVL rotations with a suitable example.
3.7 Priority Queue (Heaps):

PRIORITY QUEUE

It is a data structure which determines the priority of jobs.


The smaller the value of the priority, the higher the priority of the element. The best way to implement a priority queue is a binary heap.
A priority queue is a special kind of queue data structure. It has a collection of zero or more elements, and each element has a priority value.
• Priority queues are often used in resource management, simulations, and in the implementation of some algorithms (e.g., some graph algorithms, some backtracking algorithms).
• Several data structures can be used to implement priority queues; some of them are compared below.

Basic Model of a Priority Queue

A priority queue H supports two basic operations: insertion(h) and deletion(h) (deletemin).

Implementation of Priority Queue


1. Linked List.
2. Binary Search Tree.
3. Binary Heap.

Linked List :
A simple linked list implementation of a priority queue requires O(1) time to perform insertion at the front and O(n) time to delete the minimum element.

Binary Search tree :


This gives an average running time of O(log n) for both insertion and deletion (deletemin).

The efficient way of implementing priority queue is Binary Heap (or)Heap.

Heap has two properties :


1. Structure Property.
2. Heap Order Property.

1. Structure Property :
The heap should be a complete binary tree, i.e. a completely filled binary tree with the possible exception of the bottom level, which is filled from left to right.
A complete binary tree of height h has between 2^h and 2^(h+1) - 1 nodes.

Sentinel Value :
The zeroth element is called the sentinel value. It is not a node of the tree. This value is required because, while adding a new node, certain operations are performed in a loop, and the sentinel value is used to terminate the loop.
Index 0 holds the sentinel value. It stores a value smaller than any key, in order to terminate the percolate-up loop.
Structure Property: index 1 should always be the starting position.
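Because the heap is a complete binary tree stored from index 1, the parent and children of a node can be located by simple index arithmetic. The following is a minimal sketch (the helper names and the sample array are only an illustration, not part of the notes' routines):

#include <stdio.h>

/* 1-based array storage of a complete binary tree; index 0 is reserved
   for the sentinel used by the insertion loop. */
int parent(int i) { return i / 2; }
int left(int i)   { return 2 * i; }
int right(int i)  { return 2 * i + 1; }

int main(void)
{
    /* elements[0] is the sentinel; 13 is the root of this min heap */
    int elements[] = {-1000, 13, 21, 16, 24, 31, 19, 68};
    int i = 5;                                          /* the node holding 31 */
    printf("parent of %d is %d\n", elements[i], elements[parent(i)]); /* prints 21 */
    printf("children of %d are %d and %d\n",
           elements[1], elements[left(1)], elements[right(1)]);       /* prints 21 and 16 */
    return 0;
}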

2. Heap Order Property :

The property that allows operations to be performed quickly is a heap orderproperty.



Mintree:
The parent should have a lesser value than its children.
Maxtree:
The parent should have a greater value than its children.
These two properties are known as the heap properties: max-heap and min-heap.

Min-heap:
The smallest element is always in the root node. Each node must have a key that is less than or equal to the key of each of its children.
Examples

Max-Heap:
The largest Element is always in the root node.

Each node must have a key that is greater than or equal to the key of each of its children.

Examples

HEAP OPERATIONS:
There are 2 operations of heap
Insertion
Deletion

Insert:
Adding a new key to the heap
Rules for the insertion:
To insert an element X, into the heap, do the following:

Step1: Create a hole in the next available location , since otherwise the treewill not be complete.
Step2: If X can be placed in the hole, without violating heap order, then do insertion,
otherwise slide the element that is in the hole’s parent node, intothe hole, thus, bubbling the
hole up towards the root.
Step3: Continue this process until X can be placed in the hole.


Example Problem :

1. Insert 18 in a Min Heap



2. Insert the keys 4, 6, 10, 20, and 8 in this order in an originally empty max-heap

2.12.2 Delete-max or Delete-min:



Removing the root node of a max- or min-heap, respectively

Procedure for Deletemin :


* The Deletemin operation deletes the minimum element from the heap.
* In a binary heap (min-heap) the minimum element is found at the root.
* When this minimum element is removed, a hole is created at the root.
* Since the heap becomes one element smaller, the last element X in the heap must move somewhere in the heap.
* If X can be placed in the hole without violating the heap order property, place it; otherwise slide the smaller of the hole's children into the hole, thus pushing the hole down one level.
* Repeat this process until X can be placed in the hole. This general strategy is known as Percolate Down.

EXAMPLE PROBLEMS :

1. DELETE MIN

2. Delete Min -- 13

BINARY HEAP ROUTINES [Priority Queue]


typedef struct heapstruct *PriorityQueue;
typedef int ElementType;

struct heapstruct
{
    int capacity;
    int size;
    ElementType *elements;
};

Declaration of Priority Queue


PriorityQueue Initialize(int maxElements)
{
    PriorityQueue H;

    if (maxElements < MinPQSize)            /* MinPQSize: minimum allowed capacity */
        Error("Priority queue size is too small");

    H = malloc(sizeof(struct heapstruct));
    if (H == NULL)
        FatalError("Out of space");

    /* Allocate the array plus one extra slot for the sentinel */
    H->elements = malloc((maxElements + 1) * sizeof(ElementType));
    if (H->elements == NULL)
        FatalError("Out of space");

    H->capacity = maxElements;
    H->size = 0;
    H->elements[0] = MinData;               /* H->elements[0] is the sentinel value */

    return H;
}

Insert Routine

void Insert(ElementType X, PriorityQueue H)
{
    int i;

    if (IsFull(H))
    {
        Error("Priority queue is full");
        return;
    }

    /* Percolate up: start the hole at the next free position */
    for (i = ++H->size; H->elements[i / 2] > X; i /= 2)
        H->elements[i] = H->elements[i / 2];
    H->elements[i] = X;
}

Delete Routine

ElementType DeleteMin(PriorityQueue H)
{
    int i, child;
    ElementType minElement, lastElement;

    if (IsEmpty(H))
    {
        Error("Priority queue is empty");
        return H->elements[0];
    }

    minElement = H->elements[1];
    lastElement = H->elements[H->size--];

    /* Percolate down from the root */
    for (i = 1; i * 2 <= H->size; i = child)
    {
        /* Find the smaller child */
        child = i * 2;
        if (child != H->size && H->elements[child + 1] < H->elements[child])
            child++;

        /* Percolate one level */
        if (lastElement > H->elements[child])
            H->elements[i] = H->elements[child];
        else
            break;
    }
    H->elements[i] = lastElement;

    return minElement;
}

Other Heap Operations

1. Decrease Key.
2. Increase Key.
3. Delete.
4. Build Heap.

1. Decrease Key :

The DecreaseKey(P, Δ, H) operation decreases the value of the key at position P by a positive amount Δ. This may violate the heap order property, which can be fixed by percolating up.
Example: DecreaseKey(2, 7, H) on the min-heap (10; 15, 12; 20, 30). The key at position 2 is 15; decreasing it by 7 gives 8, which is smaller than its parent 10, so it percolates up and becomes the new root, giving (8; 10, 12; 20, 30).

2. Increase Key :

The IncreaseKey(P, Δ, H) operation increases the value of the key at position P by a positive amount Δ. This may violate the heap order property, which can be fixed by percolating down.
Example: IncreaseKey(2, 7, H) on the min-heap (10; 15, 12; 20, 30). The key at position 2 is 15; increasing it by 7 gives 22, which is larger than its smaller child 20, so 22 percolates down one level, giving (10; 20, 12; 22, 30).
3. Delete :

The Delete(P, H) operation removes the node at position P from the heap H. This can be done by:

Step 1: Perform the decrease key operation DecreaseKey(P, ∞, H), so that the key at position P percolates up to the root.
Step 2: Perform the DeleteMin(H) operation.

Example: Delete(2, H) — first DecreaseKey(2, ∞, H) brings the node at position 2 to the root, then DeleteMin(H) removes it (figures not reproduced here).
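As an illustrative sketch (not code from these notes), the Delete(P, H) operation can be built directly on top of the DeleteMin routine given above; the sentinel stored at elements[0] plays the role of the "minus infinity" key used in Step 1.

/* Sketch only: delete the element at position P of a min-heap, assuming the
   PriorityQueue / ElementType declarations and DeleteMin routine shown above. */
void Delete(int P, PriorityQueue H)
{
    int i;

    /* Step 1: behave like DecreaseKey(P, infinity, H): percolate the hole at
       position P up to the root and put the sentinel ("minus infinity") there. */
    for (i = P; i > 1; i /= 2)
        H->elements[i] = H->elements[i / 2];
    H->elements[1] = H->elements[0];

    /* Step 2: the node being deleted is now the minimum; remove it. */
    DeleteMin(H);
}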

APPLICATIONS
The heap data structure has many applications
Heap sort
Selection algorithms
Graph algorithms

Heap sort :
One of the best sorting methods, being in-place and with no quadratic worst-case scenarios.
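As an illustrative sketch (not part of the original notes), an in-place heap sort can be written as follows: first build a max-heap inside the array, then repeatedly swap the root with the last unsorted element and restore the heap.

/* Sketch only: in-place heap sort of an int array using 0-based indexing. */
#include <stdio.h>

static void sift_down(int a[], int n, int i)
{
    /* Push a[i] down until the max-heap property holds in a[0..n-1]. */
    while (2 * i + 1 < n)
    {
        int child = 2 * i + 1;
        if (child + 1 < n && a[child + 1] > a[child])
            child++;                           /* take the larger child */
        if (a[i] >= a[child])
            break;
        int t = a[i]; a[i] = a[child]; a[child] = t;
        i = child;
    }
}

void heap_sort(int a[], int n)
{
    for (int i = n / 2 - 1; i >= 0; i--)       /* build the max-heap */
        sift_down(a, n, i);
    for (int end = n - 1; end > 0; end--)      /* move the maximum to the end */
    {
        int t = a[0]; a[0] = a[end]; a[end] = t;
        sift_down(a, end, 0);
    }
}

int main(void)
{
    int a[] = { 10, 12, 1, 14, 6, 5, 8 };
    heap_sort(a, 7);
    for (int i = 0; i < 7; i++)
        printf("%d ", a[i]);                   /* prints 1 5 6 8 10 12 14 */
    return 0;
}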

Selection algorithms:
Finding the min, max, both the min and max, median, or even the k-th largest element can be done in linear time using heaps.
Graph algorithms:
By using heaps as internal traversal data structures, the running time is reduced considerably. Examples of such problems are Prim's minimal spanning tree algorithm and Dijkstra's shortest path problem.

ADVANTAGE

The biggest advantage of heaps over trees in some applications is that construction of heaps can be done in linear time.

It is used in
o Heap sort
o Selection algorithms
o Graph algorithms

DISADVANTAGE

➢ Heap is expensive

➢ Performance :
Allocating heap memory usually involves a long negotiation with the OS.
➢ Maintenance :
Dynamic allocation may fail; extra code to handle such exceptions is required.

➢ Safety :
Object may be deleted more than once or not deleted at all .
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
3.8 BINARY HEAPS :

In this section we will learn "What is a heap?" and "How to construct a heap?"
Definition: A heap is a complete binary tree or an almost complete binary tree in
which every parent node is either greater than or lesser than its child nodes.
Heap can be min heap or max heap.

Types of heap
i)Min heap ii) Max heap
A Max heap is a tree in which value of each node is greater than or equal to
the value of its children nodes.
For example
A Min heap is a tree in which value of each node is less than or equal to value
of its children nodes.
For example:

Parent being greater or lesser in heap is called parental property. Thus heap has two
important properties.
Heap:
i)It should be a complete binary tree
ii)It should satisfy parental property
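As a small illustrative check (not from the notes), the parental property of a min-heap stored in an array from index 1 can be verified as follows:

/* Sketch only: returns 1 if a[1..n] satisfies the min-heap parental property. */
int is_min_heap(const int a[], int n)
{
    for (int i = 2; i <= n; i++)     /* every node except the root has a parent at i/2 */
        if (a[i / 2] > a[i])
            return 0;
    return 1;
}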
Before understanding the construction of heap let us learn/revise few basics
that are required while constructing heap.
• Level of binary tree: The root of the tree is always at level 0. Any node is
always at a level one more than its parent node's level.
For example:

• Height of the tree: The maximum level is the height of the tree. The height of
the tree is also called depth of the tree.
For example:

• Complete binary tree: The complete binary tree is a binary tree in which all
the levels of the tree are filled completely except the lowest level nodes which are
filled from left if required.
For example:

•. Almost complete binary tree: The almost complete binary tree is a tree in which
i) Each node has a left child whenever it has a right child. That means there is always
a left child, but for a left child there may not be a right child.
ii) The leaf in a tree must be present at height h or h-1. That means all the leaves are
on two adjacent levels.

Ex. 4.15.3 Consider the heap in the following Fig. 4.15.3.


i) Apply the delete operation to the heap. Show the heap after deletion.
ii) Insert 38 into the heap. Show the heap after insertion.

Sol. i) For delete operation we will choose node 40 because this node is of highest priority.

ii) Insert 38: Consider the given heap. After inserting 38, we need to heapify the tree. That means we
have to maintain the heap property i.e. parent node must be greater than the child nodes.

Ex. 4.15.4: Show the result of inserting 10, 12, 1, 14, 6, 5, 8, 15, 3, 9, 7, 4, 11, 13 and 2, one at a time, into an initially empty binary heap. After creating such a heap, delete the element 8 from the heap; how do you repair the heap? Then insert the element in the heap and show the final result (insertion should be at a node other than a leaf node).
Sol. : We will create max heap for the set
10 12 1 14 6 5 8 15 3 9 7 4 11 13 2

Review Questions
1. Explain the binary heap operations with examples.
2. Explain the insert and delete operations of heap with examples.

UNIT IV
MULTIWAY SEARCH TREES AND GRAPHS
B-Tree – B+ Tree – Graph Definition – Representation of Graphs – Types of Graph - Breadth-first traversal –
Depth-first traversal –– Bi-connectivity – Euler circuits – Topological Sort – Dijkstra's algorithm – Minimum
Spanning Tree – Prim's algorithm – Kruskal's algorithm

++++++++++++++++++++++++++++++++++++++++++++++++++++++

4.1 B-Tree :

B-tree is a specialized multiway tree used to store records on a disk. Each node can have a number of subtrees, so the height of the tree is relatively small and only a small number of nodes must be read from the disk to retrieve an item. The goal of B-trees is to get fast access to the data.
Multiway search tree
A B-tree is a multiway search tree of order m, i.e., an ordered tree where each node has at most m children. If a node has n children then it holds (n - 1) keys.
For example:
Following is a tree of order 4.

From above tree following observations can be made


1. The node which has n children posses (n - 1) keys.
2. The keys in each node are in ascending order.
3. For every node, the subtree child[0] contains only keys which are less than key[0]; similarly, the subtree child[1] contains only keys which are greater than key[0].
In other words, the node at level 1 has F, K and O as keys. At level 2 the node containing keys C and D is arranged as the child before key F. Similarly the node containing key G will be a child attached after F and before K, as G is between F and K. Here the alphabets F, K, O are called keys and the branches are called children. The B-tree of order m should be constructed using the following properties.
Rule 1:
All the leaf nodes are on the bottom level.
Rule 2:
The root node should have at least two children.
Rule 3:
All the internal nodes except the root node have at least ceil(m/2) nonempty children. The ceil is a function such that ceil(3.4) = 4, ceil(9.3) = 10, ceil(2.98) = 3, ceil(7) = 7.
Rule 4:
Each leaf node must contain at least ceil(m/2) - 1 keys.
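One possible C declaration for a node of a B-tree of order m is sketched below; this is an illustrative assumption, not the notes' own code, and the field names are hypothetical.

/* Sketch only: a possible node layout for a B-tree of order M. */
#define M 5                          /* order of the B-tree (assumed) */

typedef struct btnode
{
    int n;                           /* number of keys currently stored      */
    int key[M - 1];                  /* keys, kept in ascending order        */
    struct btnode *child[M];         /* child[i] subtree holds keys < key[i] */
    int leaf;                        /* 1 if this node is a leaf             */
} BTNode;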
Insertion
Thus the B-tree is fairly balanced tree which can be illustrated by following example.
Example of B-tree
We will construct a B-tree of order 5 using the following numbers:
3, 14, 7, 1, 8, 5, 11, 17, 13, 6, 23, 12, 20, 26, 4, 16, 18, 24, 25, 19.
The order 5 means at most 4 keys are allowed per node. Each internal node should have at least 3 nonempty
children and each leaf node must contain at least 2 keys.
Step 1: Insert 3, 14, 7, 1 as follows.

Step 2: If we insert 8 then we need to split the node 1, 3, 7, 8, 14 at the median.

Step 3: Insert 5, 11 and 17, which can be easily inserted into the B-tree.

Step 4 :
Now insert 13. But if we insert 13 then the leaf node will have 5 keys, which is not allowed. Hence 8, 11, 13, 14, 17 is split and the median node 13 is moved up.

Step 5:
Now insert 6, 23, 12, 20 without any split.

Step 6: The 26 is inserted into the rightmost leaf node. Hence the node 14, 17, 20, 23, 26 is split and 20 will be moved up.

Step 7: Insertion of node 4 causes left most node to split. The 1, 3, 4, 5, 6 causes key 4 to move up.
Then insert 16, 18, 24, 25.

Step 8: Finally insert 19. Then 4, 7, 13, 19, 20 needs to be split. The median 13 will be moved up to
form a root node.
The tree then will be -

Thus the B-tree is constructed.


Ex. 5.1.1: Distinguish between B-tree and B+ tree. Create a B-tree of order 5 by inserting the following elements: 3, 14, 7, 1, 8, 5, 11, 17, 13, 6, 23, 12, 20, 26, 4, 16, 18, 24, 25 and 19.
Sol. :

Example: Creation of B tree - Refer section 5.1.1.


Algorithm for insertion in B-tree
Algorithm Insert(root, key)
{
    // Problem Description: This algorithm is for
    // inserting a key into a B-tree.
    temp ← root
    if (n[temp] = 2t - 1) then        // the root node is full
    {
        s ← get_node()                // memory is allocated
        root ← s
        leaf[s] ← FALSE
        n[s] ← 0
        child1[s] ← temp
        split_child(s, 1, temp)
        Insert_In(s, key)
    }
    else
        Insert_In(temp, key)
}
Deletion
Consider a B-tree

If we want to delete 8 then it is very simple

Now we will delete 20, the 20 is not in a leaf node so we will find its successor which is 23. Hence 23
will be moved up to replace 20.

Next we will delete 18. Deletion of 18 from the corresponding node causes the node with only one
key, which is not desired (as per rule 4) in B-tree of order 5. The sibling node to immediate right has
an extra key. In such a case we can borrow a key from parent and move spare key of sibling to up. See
the following figure.
Now delete 5. But deletion of 5 is not easy. The first thing is 5 is from leaf node. Secondly this leaf
node has no extra keys nor siblings to immediate left or right. In such a situation we can combine this
node with one of the siblings. That means remove 5 and combine 6 with the node 1, 3. To make the
tree balanced we have to move parent's key down. Hence we will move 4 down as 4 is between 1, 3
and 6. The tree will be

But again the internal node of 7 contains only one key, which is not allowed in a B-tree (as per rule 3). We then will try to borrow a key from a sibling. But the sibling 17, 24 has no spare key. Hence what we can do is combine 7 with 13 and 17, 24. Hence the B-tree will be

Ex. 5.1.2: Construct a B-tree with order m = 3 for the key values 2,3,7,9,5,6,4,8,1 and delete the
values 4 and 6. Show the tree in performing all operations.
Sol. The order m = 3 means at the most 2 keys are allowed. The insertion operation is as follows:

Step 6: Insert 4. This will make a sequence 4,5, 6. The 5 will go up. Then the sequence will become
3,5, 7. Again 5 will go up.

Step 8: Delete 4 and 6. As these are leaf nodes without any adjacent key, their deletion is very simple. Just make these nodes NULL. The remaining adjustment will occur. The resultant B-tree will be,

Searching
The search operation on B-tree is similar to a search on binary search tree. Instead of choosing
between a left and right child as in binary tree, B-tree makes an m-way choice. Consider a B-tree as
given below

If we want to search node 11 then


i) 11 < 13 : hence search the left subtree
ii) 11 > 7 : hence move towards the rightmost child
iii) 11 > 8 : move into the second block
iv) node 11 is found.
The running time of search operation depends upon the height of the tree. It is O(log n).
Algorithm
The algorithm for searching a target node is as follows
Algorithm Search(temp, target)
{
    // Problem Description: This algorithm searches
    // for the 'target' key starting at node temp.
    i ← 1
    while ((i ≤ n[temp]) AND (target > key_i[temp]))
        i ← i + 1
    if ((i ≤ n[temp]) AND (target == key_i[temp]))
        return (temp, i)
    if (leaf[temp] == TRUE) then
        return NULL
    else
        READ(C_i[temp])
        return Search(C_i[temp], target)
}
The search operation on B-tree is similar to a search operation on a binary tree. The desired child is
chosen by performing linear search of values in the node. The running time of this algorithm is O(log
n).
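A C version of this search is sketched below, reusing the illustrative BTNode layout shown earlier (again an assumption of the sketch rather than the notes' own code):

/* Sketch only: B-tree search using the illustrative BTNode layout above. */
#include <stddef.h>

BTNode *btree_search(BTNode *node, int target, int *pos)
{
    int i = 0;
    while (i < node->n && target > node->key[i])
        i++;                                    /* linear scan inside the node */

    if (i < node->n && target == node->key[i])
    {
        *pos = i;                               /* found at position i         */
        return node;
    }
    if (node->leaf)
        return NULL;                            /* key is not in the tree      */

    return btree_search(node->child[i], target, pos);   /* descend to child i */
}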
Height of B-Tree
The maximum height of a B-tree gives an upper bound on the number of disk accesses. The minimum number of keys in a B-tree of order 2m and depth h is

1 + 2m + 2m(m + 1) + 2m(m + 1)^2 + ... + 2m(m + 1)^(h-1)

The maximum height of a B-tree with n keys is

log_(m+1)(n / 2m) = O(log n)
Review Questions
1. Explain the properties of B-trees.
2. Explain the insertion and deletion of nodes in the B-trees with suitable example.
3. What is a B-tree? Mention the properties that a B-tree holds.
++++++++++++++++++++++++++++++++++++++++++++++++++++
4.2 B+ Tree :

B+ Tree
• In B-trees, traversing the nodes is done in inorder manner, which is time consuming. We want a variation of the B-tree which allows us to access the data sequentially, instead of by inorder traversal.
• Definition: In a B+ tree, any other node can be reached from the leaf nodes. The leaves in a B+ tree form a linked list, which is useful in scanning the nodes sequentially.
• The insertion and deletion operations are similar to B-trees.
•Example : Consider following B+tree.

• From the leaf nodes, any key of the entire tree can be accessed. There is no need to traverse the tree in inorder fashion.
• Thus the B+ tree gives faster access to any key.
Ex. 5.2.1 Construct a B+tree for F, S, Q, K, C, L, H, T, V, W, M, R.
Sol. : The method for constructing a B+ tree is similar to the building of a B-tree, but the only difference here is that the parent keys also appear in the leaf nodes. We will build the B+ tree of order 3.
The order 3 means at most 2 keys are allowed per node.

++++++++++++++++++++++++++++++++++++++++++++
4.3 Graph Definition :
Graph Definition
A graph is a collection of two sets V and E, where V is a finite non-empty set of vertices and E is a finite set of edges.
Vertices are nothing but the nodes in the graph.
Two adjacent vertices are joined by edges.
• Any graph is denoted as G = {V, E}
For example

Comparison between Graph and Tree

Applications of Graphs
The graph theory is used in the computer science very widely. There are many interesting applications of
graph. We will list out few applications -
1. In computer networking such as Local Area Network (LAN), Wide Area Network (WAN),
internetworking.
2. In telephone cabling, graph theory is effectively used.
3. In job scheduling algorithms.

+++++++++++++++++++++++++++++++++++++++++

Basic Terminologies
Complete graph: If an undirected graph of n vertices consists of n(n-1)/2 number of edges then it is
called as complete graph.

The graph shown in the Fig. 5.4.1 is a complete graph.


Subgraph: A subgraph G' of graph G is a graph such that the set of vertices and the set of edges of G' are subsets of the set of vertices and the set of edges of G.

Connected graph: An undirected graph is said to be connected if for every pair of distinct vertices Vi and Vj in V(G) there is a path from Vi to Vj in G.

Weighted graph: A weighted graph is a graph which consists of weights along with its edges.

Path: A path is denoted using sequence of vertices and there exists an edge from one vertex to next
vertex.

Cycle : A closed walk through the graph in which the starting and ending vertex are the same is called a cycle.

Component: The maximal connected subgraph of a graph is called component of a graph.


For example: Following are 3 components of a graph.

Indegree and Outdegree


The degree of vertex is the number of edges associated with the vertex.
Indegree of a vertex is the number of edges coming into that vertex. Outdegree of a vertex is the total number of edges going away from the vertex.

Self Loop
Self loop is an edge that connects the same vertex to itself.

++++++++++++++++++++++++++++++++++++++++++++++++
4.4 Representation of graph:
Representation of Graph
There various representations of graphs. The most commonly used representations are
1. Adjacency Matrix Representation
2. Adjacency List Representation
Adjacency Matrix
Consider a graph G of n vertices and the matrix M. If there is an edge present between vertices Vi and Vj then M[i][j] = 1, else M[i][j] = 0. Note that for an undirected graph if M[i][j] = 1 then M[j][i] is also 1. Here are some graphs shown by adjacency matrix.

Creating a Graph using Adjacency Matrix


Creation of graph using adjacency matrix is quite simple task. The adjacency matrix is nothing but a
two dimensional array. The algorithm for creation of a graph using an adjacency matrix will be as follows:
1) Declare an array of M[size] [size] which will store the graph.
2) Enter how many nodes you want in a graph.
3) Enter the edges of the graph by two vertices each, say Vi, Vj indicates some edge.
4) If the graph is directed set M[i][j]=1. If graph is undirected set M[i][j] = 1 and M[j][i]= 1 as well.
5) When all the edges for the desired graph is entered print the graph M[i][j].
Adjacency List
The type of representation of a graph in which the linked list is used, is called Adjacency
List representation.
There are two methods of representing the adjacency list.
Method 1:
For graph G in Fig. 5.5.2, the nodes as a, b, c, d, e. So we will maintain the linked list of these head
nodes as well as the adjacent nodes. The 'C' structure will be

typedef struct head
{
char data;
struct head *down;
struct head *next;
} head;
typedef struct node
{
int ver;
struct node *link;
} node;
Explanation: This is purely the adjacency list graph. The down pointer helps us to go to each node in
the graph whereas the next node is for going to adjacent node of each of the head node.
Method 2: In this method of representing the adjacency list, we take mixed data structure. That
means instead of taking the head list as a linked list we will take an array of head nodes. So only one
'C' structure will be there representing the adjacent nodes. See the

Fig. 5.5.4 representing the adjacency list for the above graph. First let us see the 'C' structure.
typedef struct node1
{
char vertex;
struct node1 *next;
} node;
node *head[5];
Explanation: This is the graph which can be represented with the array and linked list data structures.
Array is used to store the head nodes. The node structure will be the same throughout.
Ex. 5.5.1: What do you mean by adjacency matrix and adjacency list? Give the adjacency matrix and adjacency list of the following graph:

Sol. Adjacency matrix and Adjacency list


Adjacency Matrix for given graph

Adjacency List for given graph

Ex. 5.5.2 Represent following graph using adjacency matrix and adjacency list.

Ex. 5.5.3 Draw the adjacency list of the graph given in Fig. 5.5.6.

Ex. 5.5.4 Write an algorithm to find indegree and outdegree of a vertex in a given graph.
Sol
1. Initialize indegree and outdegree count for each node to zero.
2. Visit each node of a graph.
3. Count the number of incoming edges of each node and then increment the indegree count by one for each incoming edge.
4. Count the number of outgoing edges of each node and then increment outdegree count by one on
each outgoing edge.
Ex. 5.5.5 Write a function to print indegree, outdegree and total degree of given vertex. Sol. :
void print_degree()
{
    int v, in, out, total;
    for (v = 0; v < n; v++)
    {
        in = indegree(v, n);
        out = outdegree(v, n);
        total = in + out;
        printf("\n The indegree for vertex %d is %d", v, in);
        printf("\n The outdegree for vertex %d is %d", v, out);
        printf("\n The total degree for vertex %d is %d", v, total);
    }
}
int indegree(int v, int n)
{
    int v1, count = 0;
    for (v1 = 0; v1 < n; v1++)
        if (a[v1][v] == 1)
            count++;
    return count;
}
int outdegree(int v, int n)
{
    int v2, count = 0;
    for (v2 = 0; v2 < n; v2++)
        if (a[v][v2] == 1)
            count++;
    return count;
}
Ex. 5.5.6 Explain with example inverse adjacency list representation of graph. Consider
following graph.

Ex. 5.5.7 Give the adjacency matrix and adjacency


list representation for the graph shown in following Fig. 5.5.7.

+++++++++++++++++++++++++++++++++++++++++++++
4.5 Types of Graph :
Basically graphs are of two types -
1. Directed graphs
2. Undirected graphs.
In the directed graph the directions are shown on the edges. As shown in the Fig. 5.6.1, the edges between the vertices are ordered. In this type of graph, the edge E1 is between the vertices V1 and V2; V1 is called the head and V2 is called the tail. Similarly for head V1 the tail is V3 and so on. We can say E1 is the ordered pair (V1, V2) and not (V2, V1).
Similarly in an undirected graph, the edges are not ordered. Refer Fig. 5.6.2 for clear understanding of
undirected graph. In this type of graph the edge E1 is set of (V1, V2) or (V2, V1).

+++++++++++++++++++++++++++++++++++++++++++++++

4.6 Breadth-first traversal and Depth first traversal

Graph Traversal Methods


The graph can be traversed using Breadth first search and depth first search method. Let us
discuss these methods with the help of examples -

Breadth First Traversal


To traverse the graph by BFS, a vertex V1 in the graph is visited first; then all the vertices adjacent to V1 are traversed. Suppose the vertices adjacent to V1 are (V2, V3, V4 ... Vn); then V2, V3 ... Vn will be printed first. Then again from V2 the adjacent vertices will be printed. This process continues until all the vertices are encountered. To keep track of all the vertices and their adjacent vertices we make use of the queue data structure. We also make use of an array for visited nodes; the nodes which are visited are set to 1, so we can keep track of visited nodes. In short, in BFS traversal we follow the path in breadthwise fashion. Let us see the algorithm for breadth first search.
Algorithm :
1. Create a graph. Depending on the type of graph i.e. directed or undirected set the value of the flag
as either 0 or 1 respectively.
2. Read the vertex from which you want to traverse the graph say Vi.
3. Initialize the visited array to 1 at the index of Vi.
4. Insert the visited vertex Vi in the queue.
5. Visit the vertex which is at the front of the queue. Delete it from the queue and place its adjacent
nodes in the queue.
6. Repeat the step 5, till the queue is not empty.
7. Stop.
1. BFS (non-recursive) For Adjacency Matrix
Ex. 5.1 'C' Program
/******************************************************************
Program to create a Graph. The graph is represented using Adjacency Matrix.
*****************************************************************/
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#define size 20
#define TRUE 1
#define FALSE 0
int g[size][size];
int visit[size];
int Q[size];
int front, rear;
int n;
void main()
{
int v1, v2;
char ans ='y';
void create(), bfs();
clrscr();
create();
clrscr();
printf("The Adjacency Matrix for the graph is \n");
for (v1 = 0; v1 < n; v1++)
{
for (v2 = 0; v2 < n; v2++)
printf("%d ", g[v1][v2]);
printf("\n");
}
getch();
do
{
for (v1 = 0; v1 < n; v1++)
visit [v1] = FALSE;
clrscr();
printf("Enter the Vertex from which you want to traverse ");
scanf("%d", &v1);
if (v1 >= n )
printf("Invalid Vertex\n");
else
{
printf("The Breadth First Search of the Graph is \n");
bfs(v1);
getch();
}
printf("\nDo you want to traverse from any other node?");
ans=getche();
} while(ans=='y');
exit(0);
}
void create()
{
int v1, v2;
char ans='y';
printf("\n\t\t This is a Program To Create a Graph");
printf("\n\t\t The Display Is In Breadth First Manner");
printf("\nEnter no. of nodes");
scanf("%d",&n);
for (v1= 0; v1 < n; v1++)
for (v2 = 0; v2 < n; v2++)
g[v1][v2] = FALSE;
printf("\nEnter the vertices no. starting from 0");
do
{
printf("\nEnter the vertices v1 & v2");
scanf("%d %d", &v1, &v2);
if (v1 >= n || v2 >= n)
printf("Invalid Vertex Value\n");
else
{
g[v1][v2]=TRUE;
g[v2][v1] = TRUE;
}
printf("\n\nAdd more edges??(y/n)");
ans=getche();
}while(ans =='y');
}
void bfs(int v1)
{
int v2;
visit [v1] = TRUE;
front = rear = -1;
Q[++rear] = v1;
while (front != rear)
{
v1 = Q[++front];
printf("%d\n", v1);
for ( v2 = 0; v2 < n; v2++)
{
if (g[v1][v2]==TRUE && visit [v2]==FALSE)
{
Q[++rear] = v2;
visit[v2] = TRUE;
}
}
}
}
Output
This is a Program To Create a Graph
The Display Is In Breadth First Manner
Enter no. of nodes 4
Enter the vertices no. starting from 0
Enter the vertices v1 & v2
01
Add more edges??(y/n)y
Enter the vertices v1 & v2
02
Add more edges??(y/n)y
Enter the vertices v1 & v2
13
Add more edges??(y/n)y
Enter the vertices v1 & v2
23
Add more edges??(y/n)
The Adjacency Matrix for the graph is
0110
1001
1001
0110
Enter the Vertex from which you want to traverse 0
The Breadth First Search of the Graph is
0
1
2
3
Do you want to traverse from any other node?
Enter the Vertex from which you want to traverse 1
The Breadth First Search of the Graph is
1
0
3
2
Do you want to traverse from any other node?
Explanation of logic of BFS
In BFS the queue is maintained for storing the adjacent nodes and an array 'visited' is maintained for
keeping the track of visited nodes i.e. once a particular node is visited it should not be revisited again.
Let us see how our program works.

So the output, the BFS for the above graph, is:
1 2 3 4
C Function for BFS using Adjacency List
/* Global declarations */
typedef struct node
{
int vertex;
struct node *next;
}node;
/* Declare an array of head nodes for the adjacency list representation of the graph */
node *head[10];
int visited[MAX];
int Queue[MAX];
int front, rear;
int n;
void bfs(int V1)
{
int i;
node *first;
front=-1;
rear=-1;
Queue[++rear] = V1;
while (front != rear)
{
i = Queue[++front];
if (visited[i] == FALSE)
{
printf("%d\n", i);
visited[i]= TRUE;
}
first=head[i];
while (first != NULL)
{
if (visited[first->vertex] == FALSE)
Queue[++rear] = first->vertex;
first=first->next;
}
}
}
Depth First Traversal
• In depth first search traversal, we start from one vertex, and traverse the path as deeply as we can go.
When there is no vertex further, we traverse back and search for unvisited vertex.
• An array is maintained for storing the visited vertex.
For example:

• The DFS will be (if the source vertex is 0) 0-1-2-3-4.


• The DFS will be (if we start from vertex 3) 3-4-0-1-2.
DFS by Adjacency Matrix (Recursive Program)
Ex. 5.2 'C' Program
/****************************************************************
Program to create a Graph. The graph is represented using Adjacency Matrix.
****************************************************************/
#include <stdio.h>
#include<conio.h>
#include<stdlib.h>
/* List of defined constants */
#define MAX 20
#define TRUE 1
#define FALSE 0
/* Declare an adjacency matrix for storing the graph */
int g[MAX][MAX];
int v[MAX];
int n;
void main()
{
/* Local declarations */
int v1, v2;
char ans;
void create();
void Dfs(int);
clrscr();
create();
clrscr();
printf("The Adjacency Matrix for the graph is \n");
for (v1 = 0; v1 < n; v1++)
{
for (v2 = 0; v2 < n; v2++)
printf("%d ", g[v1][v2]);
printf("\n");
}
getch();
do
{
for (v1 = 0; v1 < n; v1++)
v[v1] = FALSE;
clrscr();
printf("Enter the Vertex from which you want to traverse :");
scanf("%d", &v1);
if (v1 >= n)
printf("Invalid Vertex\n");
else
{
printf("The Depth First Search of the Graph is \n");
Dfs(v1);
}
printf("\n Do U want To Taverse By any Other Node?");
ans=getch();
} while(ans=='y');
}
void create()
{
int ch, v1, v2, flag;
char ans='y';
printf("\n\t\t This is a Progrm To Create a Graph");
printf("\n\t\t The Display Is In Depth First Manner");
getch();
clrscr();
flushall();
for (v1 = 0; v1 < n; v1++)
for (v2 = 0; v2 < n; v2++)
g[v1][v2]=FALSE;
printf("\nEnter no. of nodes");
scanf("%d", &n);
printf("\nEnter the vertices no. starting from 0");
do
{
printf("\nEnter the vertices v1 & v2");
scanf("%d %d", &v1, &v2);
if (v1 >= n || v2 >= n)
printf("Invalid Vertex Value\n" );
else
{
g[v1][v2] = TRUE;
g[v2][v1] = TRUE;
}
printf("\n\nAdd more edges??(y/n)");
ans=getche();
} while(ans=='y');
}
void Dfs(int v1)
{
int v2;
printf("%d\n", v1);
v[v1] = TRUE;          /* mark the vertex as visited */
for (v2 = 0; v2 < n; v2++)
if (g[v1][v2] == TRUE && v[v2] == FALSE)
Dfs(v2);
}
Output
This is a Program To Create a Graph
The Display Is In Depth First Manner
Enter no. of nodes 4
Enter the vertices no. starting from 0
Enter the vertices v1 & v2 0 1
Add more edges??(y/n)y
Enter the vertices v1 & v2 0 2
Add more edges??(y/n)y
Enter the vertices v1 & v2 1 3
Add more edges??(y/n)y
Enter the vertices v1 & v2 2 3
Add more edges??(y/n)
The Adjacency Matrix for the graph is
0110
1001
1001
0110
Enter the Vertex from which you want to traverse: 1
The Depth First Search of the Graph is
1
0
2
3
Do U want To Traverse By any Other Node?
Explanation of Logic for Depth First Traversal
In DFS the basic data structure for storing the adjacent nodes is stack. In our program we have used a
recursive call to DFS function. When a recursive call is invoked actually push operation gets
performed. When we exit from the loop pop operation will be performed. Let us see how our program
works.

Since all the nodes are covered stop the procedure.


So output of DFS is
1 2 4 3
Ex. 5.3 C Function (Non-recursive)
/****************************************************************
Program to create a Graph. The graph is represented using
Adjacency Matrix (Non recursive program)
***************************************************************/
#include<stdio.h>
void Dfs(int v1)
{
int v2;
void push(int item);
int pop();
push(v1);
while(top!= -1)
{
v1=pop();
if(v[v1]==FALSE)
{
printf("\n%d",v1);
v[v1]=TRUE;
}
for (v2 = 0; v2 < n; v2++)
if (g[v1][v2] == TRUE && v[v2]==FALSE)
push(v2);
}
}
void push(int item)
{
st[++top]=item;
}
int pop()
{
int item;
item=st[top];
top--;
return item;
}
Ex. 5.4 C Function - Adjacency List
typedef struct node
{
int vertex;
struct node *next;
}node;
/* Declare an array of head nodes for the adjacency list representation of the graph */
node *head[MAX];     /* array of head nodes */
int visited[MAX];    /* visited array for checking whether a vertex is visited or not */
void Dfs(int V1)
{
node *first;
printf("%d\n", V1);
visited[V1] = TRUE;
first = head[V1];
while (first != NULL)
if (visited[first->vertex] == FALSE)
Dfs(first->vertex);
else
first = first->next;
}
Ex. 5.8.1: Define DFS and BFS of a graph. Show DFS and BFS for the graph given below.

Sol. :
DFS Refer section 5.8.2.
BFS Refer section 5.8.1.
BFS for given graph

Now delete each vertex from Queue and print it as BFS sequence:
V1 V2 V3 V4 V5 V6 V7 V8
DFS for given graph

As all nodes are visited so.


DFS of graph = V1, V2
Ex. 5.8.2 For the following construct:
i) Adjacency matrix
ii) Adjacency list
iii) DFS search
iv) BFS search

iii) DFS :
1 2 5 4 6 3
iv) BFS :
1 2 3 5 6 4
Ex. 5.8.3 From the Fig. 5.8.3, in what order are the vertices visited using DFS and BFS starting
from vertex A? Where a choice exists, use alphabetical order.

Difference between DFS and BFS



Review Questions
1. Distinguish between breadth first search and depth first search with example.
2. Explain depth first and breadth first traversal.
++++++++++++++++++++++++++++++++++++++++++++
4.7 Bi-connectivity :
Bi-Connectivity
Biconnected graphs are graphs which cannot be broken into two disconnected pieces (subgraphs) by removing a single edge. For example :

In the given Fig. 5.10.1 (a) even if we remove any single edge the graph does not become
disconnected.

For example even if we remove an edge E1 the graph does not become disconnected. We do not get
two disconnected components of graph. Same is the case with any other edge in the given graph.
But the following graph does not possess the property of Biconnectivity.

In above graph if we remove an edge E-F then we will get two distinct graphs.
Properties of Biconnected Graph:
1. There are two disjoint paths between any two vertices.
2. There exists a simple cycle between any two vertices.
3. There should not be any cut vertex (Cut vertex is a vertex which if we remove then the graph
becomes disconnected.)
Strongly Connected Components
Strong Connectivity - Definition :
A directed graph is strongly connected if there is a directed path from any vertex to every other vertex.
The connected components are called strongly connected components for example:

In above graph the strongly connected components are represented by dotted marking.
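As an illustrative sketch (not from the notes), strong connectivity can be tested by running a DFS from one vertex on the graph and then on the reversed graph; the graph is strongly connected only if both traversals reach every vertex. The adjacency-matrix globals below mirror the ones used in the traversal programs earlier in this unit.

/* Sketch only: test strong connectivity of a directed graph. */
#define MAXV 20
int g[MAXV][MAXV];                   /* adjacency matrix (assumed filled) */
int n;                               /* number of vertices               */

static void dfs_mark(int u, int visited[], int reverse)
{
    visited[u] = 1;
    for (int v = 0; v < n; v++)
    {
        int edge = reverse ? g[v][u] : g[u][v];   /* follow reversed edges on pass 2 */
        if (edge && !visited[v])
            dfs_mark(v, visited, reverse);
    }
}

int is_strongly_connected(void)
{
    int visited[MAXV];
    for (int pass = 0; pass < 2; pass++)          /* pass 0: G, pass 1: G reversed */
    {
        for (int i = 0; i < n; i++)
            visited[i] = 0;
        dfs_mark(0, visited, pass);
        for (int i = 0; i < n; i++)
            if (!visited[i])
                return 0;                         /* some vertex is unreachable */
    }
    return 1;
}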
Review Questions
1. Write short notes on biconnectivity.
2. Explain in detail about strongly connected components and illustrate with an example.
Cut Vertex
• Definition: A vertex V in an undirected graph G is called a cut vertex iff removing it disconnects the graph.
• The cut vertex is also called as articulation point.
• The following example represents the concept of cut vertex.

• On removing C we get the disconnected components as.

• The concept of cut vertices is useful for designing reliable networks.
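A simple (brute-force) way to test whether a vertex is a cut vertex is to ignore it during a DFS and check whether all remaining vertices are still reachable. The sketch below is illustrative only and reuses the g, n and MAXV declarations from the sketch above; faster DFS-based articulation-point algorithms exist.

/* Sketch only: brute-force cut vertex test on an undirected graph. */
static void dfs_skip(int u, int skip, int visited[])
{
    visited[u] = 1;
    for (int v = 0; v < n; v++)
        if (v != skip && g[u][v] && !visited[v])
            dfs_skip(v, skip, visited);
}

int is_cut_vertex(int c)
{
    int visited[MAXV] = { 0 };
    int start = (c == 0) ? 1 : 0;        /* start the DFS at any vertex other than c */

    dfs_skip(start, c, visited);         /* traverse the graph as if c were removed  */
    for (int v = 0; v < n; v++)
        if (v != c && !visited[v])
            return 1;                    /* some vertex became unreachable           */
    return 0;
}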


+++++++++++++++++++++++++++++++++++++++++++++++++++
4.8 Euler circuits:

Euler Circuits
In graph theory there is a famous problem known as the Konigsberg bridge problem. In this problem the main theme was to cross the seven bridges exactly once while visiting the various land areas. The Swiss mathematician Leonhard Euler solved this problem in 1736, and from it the concept of the Euler circuit was developed. Let us define the terminologies Euler path and Euler circuit.
Euler path: A path in a graph G is called an Euler path if it includes every edge exactly once and every vertex gets visited. An Euler circuit on a graph G is an Euler path that starts and ends at the same vertex, visiting each vertex of G and using every edge of G.
For example
The Euler circuit is A - B – E – A – D – B – C – E – D - C - A.
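A connected undirected graph has an Euler circuit exactly when every vertex has even degree. The sketch below (not from the notes) checks this condition, assuming the adjacency-matrix globals g[][] and n used in the sketches and traversal programs above and that the graph is already known to be connected.

/* Sketch only: even-degree test for an Euler circuit in a connected
   undirected graph stored in the adjacency matrix g[n][n]. */
int has_euler_circuit(void)
{
    for (int u = 0; u < n; u++)
    {
        int degree = 0;
        for (int v = 0; v < n; v++)
            degree += g[u][v];
        if (degree % 2 != 0)
            return 0;        /* an odd-degree vertex rules out an Euler circuit */
    }
    return 1;                /* all degrees even; connectivity is assumed */
}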

Ex. 5.12.1 Find an Euler path or an Euler circuit using DFS for the following graph.

Euler circuit is a circuit that uses every edge of graph exactly once. It starts and ends at the same
vertex.
Review Questions
1. Give a short note on euler circuits.
2. Explain euler circuit with an example.
++++++++++++++++++++++++++++++++++++++++++++++
4.10 Topological Sort :
Definition
Topological sorting of a directed acyclic graph (DAG) is a linear ordering of its vertices such that, for every directed edge (u, v), vertex u comes before v in the ordering.
Algorithm:
Following are the steps to be followed in this algorithm -
1. From a given graph find a vertex with no incoming edges. Delete it along with all the edges
outgoing from it. If there are more than one such vertices then break the tie randomly.
2. Note the vertices that are deleted.
3. All these recorded vertices give topologically sorted list.
Let us understand this algorithm with some examples -
Ex. 5.9.1 Sort the digraph for topological sort.

Hence the list after topological sorting will be B, A, D, C, E.


Ex. 5.9.2 Sort the given digraph using topological sort.

Thus the topologically sorted list is B, A, C, D, E.


Ex. 5.9.3 Consider a directed acyclic graph 'D' given in following figure. Sort the nodes of 'D' by
applying topological sort on 'D'.

Thus the topological sorting will be A, B, C, G, D, E,F.


Ex. 5.9.4 Implementation of Topological Sort(Application of Graph)
Sol. :
#include<stdio.h>
#define SIZE 10
#define MAX 10
int G[SIZE][SIZE], i, j, k;
int front, rear;
int n, edges;
int b[SIZE], Q[SIZE], indegree[SIZE];
int create()
{
front = -1; rear = -1;
for (i=0; i<MAX; i++) //initialising the graph
{
for (j = 0; j<MAX; j++)
{
G[i][j] = 0;
}
}
for (i=0; i<MAX; i++)
{
indegree[i] = -99;
}
n = 5;
edges=7;
G[0][2]=1;
G[0][3]=1;
G[1][0]=1;
G[1][3]=1;
G[2][4]=1;
G[3][2]=1;
G[3][4]=1;
return n;
}
void Display(int n)
{
int V1, V2;
for (V1= 0; V1<n; V1++)
{
for (V2= 0; V2<n; V2++)
printf("%d", G[V1][V2]);
printf("\n");
}
}
void Insert_Q(int vertex, int n)
{
if (rear == n)
printf("Queue Overflow\n");
else
{
if (front == -1) /* Empty Queue condition */
front = 0;
rear = rear + 1;
Q[rear]=vertex;/* Inserting node into the Q*/
}
}
int Delete_Q()
{
int item;
if (front == -1 || front > rear)
{
printf("Queue Underflow\n");
return -1;
}
else
{
item=Q[front];
front = front + 1;
return item;
}
}
int Compute_Indeg(int node, int n)
{
int v1, indeg_count=0;
for (v1 = 0; v1<n; v1++)
if (G[v1][node] == 1)//checking for incoming edge
indeg_count++;
return indeg_count;
}
void Topo_ordering(int n)
{
j = 0;
for (i=0; i<n; i++)
{
indegree[i] = Compute_Indeg(i, n);
if (indegree[i]==0)
Insert_Q(i, n);
}
while (front <= rear)
{
k = Delete_Q();
b[j++] = k;
for (i=0; i<n; i++)
{
if (G[k][i]==1)
{
G[k][i] = 0;
indegree[i]=indegree[i] - 1;
if (indegree[i]==0)
Insert_Q(i, n);
}
}
}
printf("\nThe result of after topological sorting is ...");
for (i=0; i<n; i++)
printf("%d",b[i]);
printf("\n");
}
int main()
{
n = create();
printf("The adjacency matrix is : \n");
Display(n);
Topo_ordering(n);
return 0;
}
Output
The adjacency matrix is :
00110
10010
00001
00101
00000
The result of after topological sorting is... 1 0 3 2 4

Review Questions
1. Explain the topological sorting algorithm.
2. State and explain topological sort with suitable example.

4.11 Dijkstra's algorithm:


Finding Shortest Path
•Dijkstra's Algorithm is a popular algorithm for finding shortest path.
• This algorithm is called single source shortest path algorithm because in this algorithm, for a given
vertex called source the shortest path to all other vertices is obtained.
• This algorithm is applicable to graphs with non-negative weights only.
Ex. 5.14.1 Obtain the shortest path for the given graph.

Sol. Now we will consider each vertex as a source and will find the shortest distance from this vertex
to every other remaining vertex. Let us start with vertex A.

Thus the shortest distance from A to E is obtained.


that is A-B-C-E with path length = 4 +1+3 = 8 obtained by choosing appropriate source and
destination.
Ex. 5.14.2 Using Dijkstra's algorithm find the shortest path from the source node A

Sol. The source node S={A}, the target nodes


P =(B, E, H, C, D}
Step 1: S = {A}
P = {B, E, H, C, D}
d(A, B) = 2
d(A, H) = 4
d(A, D) = ∞
d(A, E) = 5
d(A, C) = ∞
The minimum distance is 2. Hence choose vertex B.
Step 2:
S = {A, B}
d(A, E) = 5
d(A, H)=4
P = {E, H, C, D}
d(A, C) = 2 + 5 = 7
d (A, D)=∞
The minimum distance is 4. It is obtained from vertex H. Choose vertex H.
Step 3:
S={A, B, H}
d(A, E) = 5
P = {E, C, D}
d (A, C) = 4+2 = 6
d (A, D) = ∞
Choose vertex E
Step 4:
S ={A, B, H, E}
P = {C, D}
d(A, C)=6
choose vertex C
d(A, D) = 5+3=8
Step 5:
S={A, B, H, E, C}
P = {D}
d (A, D) = 8
choose vertex D.
S={A, B, H, E, C, D}
Thus the shortest paths from single source S is as given below

Ex. 5.14.3 Implementation of Shortest Path Algorithm (Application of Graph)


Sol. :
#include<stdio.h>
#include <conio.h>
#define infinity 999
int path[10];
void main()
{
int tot_nodes,i,j,cost[10][10],dist[10],s[10];
void create(int tot_nodes,int cost[][10]);
void Dijkstra(int tot_nodes,int cost[][10], int i,int dist[10]);
void display(int i,int j,int dist[10]);
clrscr();
printf("\n\t\t Creation of graph ");
printf("\n Enter total number of nodes ");
scanf("%d", &tot_nodes);
create(tot_nodes,cost);
for(i=0;i<tot_nodes;i++)
{
printf("\n\t\t\t Press any key to continue...");
printf("\n\t\t When Source = %d\n",i);
for(j=0;j<tot_nodes;j++)
{
Dijkstra (tot_nodes, cost,i,dist);
if(dist[j]==infinity)
printf("\n There is no path to %d\n",j);
else
{
display(i,j,dist);
}
}
}
}
void create(int tot_nodes,int cost[][10])
{
int i,j,val,tot_edges,count=0;
for(i=0;i<tot_nodes;i++)
{
for(j=0;j<tot_nodes;j++)
{
if(i==j)
cost[i][j]=0;//diagonal elements are 0
else
cost[i][j]=infinity;
}
}
printf("\n Total number of edges");
scanf("%d", &tot_edges);
while(count<tot_edges)
{
printf("\n Enter Vi and Vj");
scanf("%d %d", &i,&j);
printf("\n Enter the cost along this edge");
scanf("%d",&val);
cost [j][i]=val;
cost[i][j]=val;
count++;
}
}
void Dijkstra (int tot_nodes,int cost [10][10], int source,int dist[])
{
int i,j,v1,v2,min_dist;
int s[10];
for(i=0;i<tot_nodes;i++)
{
dist[i]=cost [source][i];//initially put the
s[i]=0; //distance from source vertex to i
//i is varied for each vertex
path[i]=source;//all the sources are put in path
}
s[source]=1;
for(i=1;i<tot_nodes;i++)
{
min_dist=infinity;
v1=-1;//reset previous value of v1
for(j=0;j<tot_nodes;j++)
{
if(s[j]==0)
{
if(dist[j]<min_dist)
{
min_dist=dist[j];//finding minimum distance
v1=j;
}
}
}
s[v1]=1;
for(v2=0;v2<tot_nodes;v2++)
{
if(s[v2]==0)
{
if(dist[v1]+cost [v1][v2] <dist[v2])
{
dist[v2]=dist[v1]+cost[v1][v2];
path[v2]=v1;
}}}
}
}
void display(int source,int destination,int dist[])
{
int i;
getch();
printf("\n Step by Step shortest path is...\n");
for(i=destination;i!=source;i=path[i])
{
printf("%d <-",i);
}
printf("%d",i);
printf(" The length=%d",dist[destination]);
}
Output
Creation of graph
Enter total number of nodes 5
Total number of edges 7
Enter Vi and Vj 0 1
Enter the cost along this edge 4
Enter Vi and Vj 0 2
Enter the cost along this edge 8
Enter Vi and Vj 1 2
Enter the cost along this edge 1
Enter Vi and Vj 1 3
Enter the cost along this edge 3
Enter Vi and Vj 2 3
Enter the cost along this edge 7
Enter Vi and Vj 2 4
Enter the cost along this edge 3
Enter Vi and Vj 3 4
Enter the cost along this edge 8
Press any key to continue...
When Source =0
Step by Step shortest path is....
0 The length=0
Step by Step shortest path is...
1<-0 The length=4
Step by Step shortest path is...
2<-1-0 The length=5
Step by Step shortest path is...
3<-1-0 The length=7
[Link]
Step by Step shortest path is...
4<-2<-1-0 The length=8
Press any key to continue...
When Source =1
Step by Step shortest path is...
0<-1 The length=4
Step by Step shortest path is...
1 The length=0
Step by Step shortest path is...
2<-1 The length=1
Step by Step shortest path is...
3<-1 The length=3
Step by Step shortest path is...
4<-2<-1 The length=4
Press any key to continue...
When Source =2
Step by Step shortest path is...
0<-1<-2 The length=5
Step by Step shortest path is...
1<-2 The length=1
Step by Step shortest path is...
2 The length=0
Step by Step shortest path is...
3<-1<-2 The length=4
Step by Step shortest path is...
4<-2 The length=3
Press any key to continue...
When Source =3
Step by Step shortest path is...
0<-1<-3 The length=7
Step by Step shortest path is...
1<-3 The length=3
Step by Step shortest path is...
2<-1<-3 The length=4
Step by Step shortest path is...
3 The length=0
Step by Step shortest path is...
4<-2<-1<-3 The length=7
Press any key to continue...
When Source =4
Step by Step shortest path is...
0<-1<-2<-4 The length=8
Step by Step shortest path is...
1<-2<-4 The length=4
Step by Step shortest path is....
2<-4 The length=3
Step by Step shortest path is...
3<-1<-2<-4 The length=7
Step by Step shortest path is...
4 The length=0

Ex. 5.14.4: Apply an appropriate algorithm to find the shortest path from 'A' to every other node of the given graph.

Sol. : We will follow the steps below to obtain the shortest paths from 'A'.


Step 1: S = {A}, T = {B, C, D, E}
d(A,B) = 3 ← min distance, select B
d(A,C) = ∞
d(A,D) = ∞
d(A,E) = ∞
Step 2: S = {A, B}, T = {C, D, E}
d(A,C)=d(A,B)+d(B,C)=3+6=9
d(A,D) = d(A,B) + d(B,D) = 3+1 = 4← select D
d(A,E) = d(A,B) + d(B,E) = 3+5=8
Step 3: S ={A, B, D}, T = {C,E}
d(A,C) = 9
d(A,E) = 8 ← select this
Step 4: S ={A, B, D,E}, T = {C}
d(A,C) = 9
Step 5 Thus the shortest path from 'A' to every other node

Review Question
1. Explain the various applications of graphs.
+++++++++++++++++++++++++++++++++++++++++++++++++++++

4.12 Minimum Spanning Tree :


Minimum Spanning Tree
Spanning tree
A spanning tree of a graph G is a subgraph which is basically a tree and it contains all the vertices of
G containing no circuit.
Minimum spanning tree
A minimum spanning tree of a weighted connected graph G is a spanning tree with minimum or
smallest weight.
Weight of the tree
A weight of the tree is defined as the sum of weights of all its edges.
For example:
Consider a graph G as given below. This graph is called weighted connected graph because some
weights are given along every edge and the graph is a connected graph.

Applications of spanning trees :


1. Spanning trees are very important in designing efficient routing algorithms.
2. Spanning trees have wide applications in many areas such as network design.
4.13 Prim's Algorithm :
Let us understand the prim's algorithm with the help of example.
Example:
Consider the graph given below :

Now, we will consider all the vertices first. Then we will select an edge with minimum weight. The
algorithm proceeds by selecting adjacent edges with minimum weight. Care should be taken for not
forming circuit.

C program
/****************************************************************
This program is to implement Prim's Algorithm using Greedy Method
****************************************************************/
#include<stdio.h>
#include<conio.h>
#define SIZE 20
#define INFINITY 32767
/*This function finds the minimal spanning tree by Prim's Algorithm */
void Prim(int G[][SIZE], int nodes)
{
int tree[SIZE], i, j, k;
int min_dist, v1, v2,total=0;
// Initialize the selected vertices list
for (i=0; i<nodes; i++)
tree[i] = 0;
printf("\n\n The Minimal Spanning Tree Is :\n");
tree[0] = 1;
for (k=1; k<=nodes-1; k++)
{
min_dist = INFINITY;
//initially assign minimum dist as infinity
for (i=0; i<=nodes-1; i++)
{
for (j=0; j<=nodes-1; j++)
{
if (G[i][j] && ((tree[i] && !tree[j]) || (!tree[i] && tree[j])))
{
if (G[i][j] <min_dist)
{
min_dist=G[i][j];
v1 = i;
v2 = j;
}
}
}
}
printf("\n Edge (%d %d ) and weight = %d",v1,v2,min_dist);
tree[v1] = tree [v2] = 1;
total = total+min_dist;
}
printf("\n\n\t Total Path Length Is = %d",total);
}
void main()
{
int G[SIZE][SIZE], nodes;
int v1, v2, length, i, j, n;
clrscr();
printf("\n\t Prim'S Algorithm\n");
printf("\n Enter Number of Nodes in The Graph ");
scanf("%d",&nodes);
printf("\n Enter Number of Edges in The Graph ");
scanf("%d",&n);
for (i=0; i<nodes; i++) // Initialize the graph
for (j=0; j<nodes; j++)
G[i][j] = 0;
//entering weighted graph
printf("\n Enter edges and weights \n");
for (i=0; i<n; i++)
{
printf("\n Enter Edge by V1 and V2 :");
printf("[Read the graph from starting node 0]");
scanf("%d %d", &v1,&v2);
printf("\n Enter corresponding weight :");
scanf("%d", &length);
G[v1][v2] = G[v2][v1] = length;
}
getch();
printf("\n\t");
clrscr();
Prim(G,nodes);
getch();
}
Output

Prim'S Algorithm
Enter Number of Nodes in The Graph 5
Enter Number of Edges in The Graph 7
Enter edges and weights
Enter Edge by V1 and V2 : [Read the graph from starting node 0] 0 1
Enter corresponding weight :10
Enter Edge by V1 and V2 : [Read the graph from starting node 0] 1 2
Enter corresponding weight :1
Enter Edge by V1 and V2 : [Read the graph from starting node 0] 2 3
Enter corresponding weight :2
Enter Edge by V1 and V2 : [Read the graph from starting node 0] 3 4
Enter corresponding weight :3
Enter Edge by V1 and V2 [Read the graph from starting node 0] 4 0
Enter corresponding weight :5
Enter Edge by V1 and V2 : [Read the graph from starting node 0] 1 3
Enter corresponding weight :6
Enter Edge by V1 and V2 : [Read the graph from starting node 0] 4 2
Enter corresponding weight :7
The Minimal Spanning Tree Is :
Edge(0 4) and weight = 5
Edge(3 4) and weight = 3
Edge(2 3) and weight = 2
Edge(1 2) and weight = 1

Total Path Length Is = 11


Ex. 5.15.1 Consider the following weighted graph.
Give the list of edges in the MST in the order that Prim's algorithm inserts them. Start Prim's
algorithm from vertex A.
Ex. 5.15.2 Discuss about the algorithm and Pseudo-code to find the Minimum Spanning Tree
using Prim's Algorithm. Find the Minimum Spanning tree for the graph shown below.

Ex. 5.15.3 Give the Pseudo code for Prim's algorithm and apply the same to find the minimum
spanning tree of the graph shown below:

4.14Kruskal's Algorithm :
Kruskal's algorithm is another algorithm for obtaining the minimum spanning tree. This algorithm was discovered by Joseph Kruskal, then a second-year graduate student. In this algorithm the minimum cost edge is always selected, but the selected edge need not be adjacent to the previously selected edges.
Let us understand this algorithm with the help of some example.
Example :
Consider the graph given below:

First we will select all the vertices. Then an edge with optimum weight is selected from heap, even
though it is not adjacent to previously selected edge. Care should be taken for not forming circuit.

Ex. 5.15.4 Apply Kruskal's algorithm to find a minimum spanning tree of the following graph.

C program
/*****************************************************************
Implementation of Kruskal's Algorithm
*****************************************************************/
#include<stdio.h>
#define INFINITY 999
typedef struct Graph
{
int v1;
int v2;
int cost;
}GR;
GR G[20];
int tot_edges,tot_nodes;
void create();
void spanning_tree();
int Minimum(int);
void main()
{
printf("\n\t Graph Creation by adjacency matrix ");
create();
spanning_tree();
}
void create()
{
int k;
printf("\n Enter Total number of nodes: ");
scanf("%d", &tot_nodes);
printf("\n Enter Total number of edges: ");
scanf("%d", &tot_edges);
for(k=0;k<tot_edges;k++)
{
printf("\n Enter Edge in (V1 V2)form ");
scanf("%d%d",&G[k].v1,&G[k].v2);
printf("\n Enter Corresponding Cost ");
scanf("%d", &G[k].cost);
}
}
void spanning_tree()
{
int count,k,v1, v2,i,j,tree [10][10],pos,parent[10];
int sum;
int Find(int v2,int parent[]);
void Union(int i,int j,int parent[]);
count=0;
k=0;
sum=0;
for(i=0;i<tot_nodes;i++)
parent[i]=i;
while(count!=tot_nodes-1)
{
pos=Minimum(tot_edges);//finding the minimum cost edge
if(pos==-1)//Perhaps no node in the graph
break;
v1=G[pos].v1;
v2=G[pos].v2;
i=Find(v1,parent);
j=Find(v2,parent);
if(i!=j)
{
tree[k][0]=v1;//storing the minimum edge in array tree[]
tree[k][1]=v2;
k++;
count++;
sum+=G[pos].cost;//accumulating the total cost of MST
Union(i,j,parent);
}
G[pos].cost=INFINITY;
}
if(count==tot_nodes-1)
{
printf("\n Spanning tree is...");
printf("\n-----------------\n");
for(i=0;i<tot_nodes-1;i++)
{
printf("[%d", tree[i][0]);
printf("-");
printf("%d",tree[i][1]);
printf("1");
}
printf("\n----------------");
printf("\nCost of Spanning Tree is=%d",sum);
}
else
{
printf("There is no Spanning Tree");
}
}
int Minimum(int n)
{
int i,small,pos;
small=INFINITY;
pos=-1;
for(i=0;i<n;i++)
{
if(G[i].cost<small)
{
small=G[i].cost;
pos=i;
}
}
return pos;
}
int Find(int v2,int parent[])
{
while(parent[v2]!=v2)
{
v2=parent[v2];
}
return v2;
}
void Union(int i,int j,int parent[])
{
if(i<j)
parent[j]=i;
else
parent[i]=j;
}
Output
Graph Creation by adjacency matrix
Enter Total number of nodes: 4
Enter Total number of edges: 5
Enter Edge in (V1 V2)form 1 2
Enter Corresponding Cost 2
Enter Edge in (V1 V2)form 1 4
Enter Corresponding Cost 1
Enter Edge in (V1 V2)form 1 3
Enter Corresponding Cost 3
Enter Edge in (V1 V2)form 2 3
Enter Corresponding Cost 3
Enter Edge in (V1 V2)form 4 3
Enter Corresponding Cost 5
Spanning tree is...
-----------------------
[1-4][1-2][1-3]
-----------------------
Cost of Spanning Tree is = 6

Difference between Prim's And Kruskal's Algorithm

4.13 Prim's algorithm


How does Prim’s Algorithm Work?
The working of Prim’s algorithm can be described by using the following steps:
Step 1: Determine an arbitrary vertex as the starting vertex of the MST.
Step 2: Follow steps 3 to 5 till there are vertices that are not included in the MST (known as fringe vertex).
Step 3: Find edges connecting any tree vertex with the fringe vertices.
Step 4: Find the minimum among these edges.
Step 5: Add the chosen edge to the MST if it does not form any cycle.
Step 6: Return the MST and exit
Note: For determining a cycle, we can divide the vertices into two sets [one set contains the vertices included in
MST and the other contains the fringe vertices.]
Illustration of Prim’s Algorithm:
Consider the following graph as an example for which we need to find the Minimum Spanning Tree
(MST).

Example of a graph
Step 1: Firstly, we select an arbitrary vertex that acts as the starting vertex of the Minimum Spanning
Tree. Here we have selected vertex 0 as the starting vertex.
0 is selected as starting vertex
Step 2: All the edges connecting the incomplete MST and other vertices are the edges {0, 1} and {0, 7}.
Between these two the edge with minimum weight is {0, 1}. So include the edge and vertex 1 in the MST.

1 is added to the MST


Step 3: The edges connecting the incomplete MST to other vertices are {0, 7}, {1, 7} and {1, 2}. Among
these edges the minimum weight is 8 which is of the edges {0, 7} and {1, 2}. Let us here include the edge {0, 7}
and the vertex 7 in the MST. [We could have also included edge {1, 2} and vertex 2 in the MST].

7 is added in the MST


Step 4: The edges that connect the incomplete MST with the fringe vertices are {1, 2}, {7, 6} and {7, 8}.
Add the edge {7, 6} and the vertex 6 in the MST as it has the least weight (i.e., 1).

6 is added in the MST


Step 5: The connecting edges now are {7, 8}, {1, 2}, {6, 8} and {6, 5}. Include edge {6, 5} and vertex 5
in the MST as the edge has the minimum weight (i.e., 2) among them.

Include vertex 5 in the MST


Step 6: Among the current connecting edges, the edge {5, 2} has the minimum weight. So include that
edge and the vertex 2 in the MST.

Include vertex 2 in the MST


Step 7: The connecting edges between the incomplete MST and the other edges are {2, 8}, {2, 3}, {5, 3}
and {5, 4}. The edge with minimum weight is edge {2, 8} which has weight 2. So include this edge and the vertex
8 in the MST.

Add vertex 8 in the MST


Step 8: See here that the edges {7, 8} and {2, 3} both have same weight which are minimum. But 7 is
already part of MST. So we will consider the edge {2, 3} and include that edge and vertex 3 in the MST.

Include vertex 3 in MST


Step 9: Only the vertex 4 remains to be included. The minimum weighted edge from the incomplete
MST to 4 is {3, 4}.

Include vertex 4 in the MST


The final structure of the MST is as follows and the weight of the edges of the MST is (4 + 8 + 1 + 2 +
4 + 2 + 7 + 9) = 37.

The structure of the MST formed using the above method


Note: If we had selected the edge {1, 2} in the third step then the MST would look like the following.

Structure of the alternate MST if we had selected edge {1, 2} in the MST
How to implement Prim’s Algorithm?
Follow the given steps to utilize the Prim’s Algorithm mentioned above for finding MST of a graph:
• Create a set mstSet that keeps track of vertices already included in MST.
• Assign a key value to all vertices in the input graph. Initialize all key values as INFINITE. Assign the
key value as 0 for the first vertex so that it is picked first.
• While mstSet doesn’t include all vertices
o Pick a vertex u that is not there in mstSet and has a minimum key value.
o Include u in the mstSet.
o Update the key value of all adjacent vertices of u. To update the key values, iterate through all
adjacent vertices.
o For every adjacent vertex v, if the weight of edge u-v is less than the previous key value
of v, update the key value as the weight of u-v.
The idea of using key values is to pick the minimum weight edge from the cut. The key values are used
only for vertices that are not yet included in MST, the key value for these vertices indicates the minimum weight
edges connecting them to the set of vertices included in MST.
Below is the implementation of the approach:
// A C++ program for Prim's Minimum
// Spanning Tree (MST) algorithm. The program is
// for adjacency matrix representation of the graph

#include <bits/stdc++.h>
using namespace std;

// Number of vertices in the graph


#define V 5

// A utility function to find the vertex with


// minimum key value, from the set of vertices
// not yet included in MST
int minKey(int key[], bool mstSet[])
{
// Initialize min value
int min = INT_MAX, min_index;

for (int v = 0; v < V; v++)


if (mstSet[v] == false && key[v] < min)
min = key[v], min_index = v;

return min_index;
}

// A utility function to print the


// constructed MST stored in parent[]
void printMST(int parent[], int graph[V][V])
{
cout << "Edge \tWeight\n";
for (int i = 1; i < V; i++)
cout << parent[i] << " - " << i << " \t"
<< graph[i][parent[i]] << " \n";
}

// Function to construct and print MST for


// a graph represented using adjacency
// matrix representation
void primMST(int graph[V][V])
{
// Array to store constructed MST
int parent[V];

// Key values used to pick minimum weight edge in cut


int key[V];

// To represent set of vertices included in MST


bool mstSet[V];

// Initialize all keys as INFINITE


for (int i = 0; i < V; i++)
key[i] = INT_MAX, mstSet[i] = false;

// Always include the first vertex in MST.


// Make key 0 so that this vertex is picked as first
// vertex.
key[0] = 0;

// First node is always root of MST


parent[0] = -1;

// The MST will have V vertices


for (int count = 0; count < V - 1; count++) {

// Pick the minimum key vertex from the


// set of vertices not yet included in MST
int u = minKey(key, mstSet);

// Add the picked vertex to the MST Set


mstSet[u] = true;

// Update key value and parent index of


// the adjacent vertices of the picked vertex.
// Consider only those vertices which are not
// yet included in MST
for (int v = 0; v < V; v++)

// graph[u][v] is non zero only for adjacent


// vertices of u. mstSet[v] is false for vertices
// not yet included in MST Update the key only
// if graph[u][v] is smaller than key[v]
if (graph[u][v] && mstSet[v] == false
&& graph[u][v] < key[v])
parent[v] = u, key[v] = graph[u][v];
}

// Print the constructed MST


printMST(parent, graph);
}

// Driver's code
int main()
{
int graph[V][V] = { { 0, 2, 0, 6, 0 },
{ 2, 0, 3, 8, 5 },
{ 0, 3, 0, 0, 7 },
{ 6, 8, 0, 0, 9 },
{ 0, 5, 7, 9, 0 } };

// Print the solution


primMST(graph);

return 0;
}

// This code is contributed by rathbhupendra

Output
Edge Weight
0 - 1    2
1 - 2    3
0 - 3    6
1 - 4    5

4.15 Kruskal's algorithm

Introduction to Kruskal’s Algorithm:


Here we will discuss Kruskal’s algorithm to find the MST of a given weighted graph.
In Kruskal’s algorithm, all edges of the given graph are first sorted in increasing order of weight. The algorithm then
keeps adding new edges and nodes to the MST, provided the newly added edge does not form a cycle. It picks the minimum
weighted edge first and the maximum weighted edge last. Thus we can say that it makes a locally
optimal choice in each step in order to find the optimal solution. Hence this is a greedy algorithm.
How to find MST using Kruskal’s algorithm?
Below are the steps for finding MST using Kruskal’s algorithm:
1. Sort all the edges in non-decreasing order of their weight.
2. Pick the smallest edge. Check if it forms a cycle with the spanning tree formed so far. If the cycle is not
formed, include this edge. Else, discard it.
3. Repeat step#2 until there are (V-1) edges in the spanning tree.
Step 2 uses the Union-Find algorithm to detect cycles.
So we recommend reading the following post as a prerequisite.
• Union-Find Algorithm | Set 1 (Detect Cycle in a Graph)
• Union-Find Algorithm | Set 2 (Union By Rank and Path Compression)
Kruskal’s algorithm to find the minimum cost spanning tree uses the greedy approach. The Greedy
Choice is to pick the smallest weight edge that does not cause a cycle in the MST constructed so far.
Let us understand it with an example:
Illustration:
Below is the illustration of the above approach:
Input Graph:

The graph contains 9 vertices and 14 edges. So, the minimum spanning tree formed will be having (9 –
1) = 8 edges.
After sorting:

Weight Source Destination

1 7 6

2 8 2

2 6 5

4 0 1

4 2 5

6 8 6

7 2 3

7 7 8

8 0 7

8 1 2

9 3 4

10 5 4

11 1 7

14 3 5

Now pick all edges one by one from the sorted list of edges
Step 1: Pick edge 7-6. No cycle is formed, include it.

Add edge 7-6 in the MST


Step 2: Pick edge 8-2. No cycle is formed, include it.

Add edge 8-2 in the MST


Step 3: Pick edge 6-5. No cycle is formed, include it.

Add edge 6-5 in the MST


Step 4: Pick edge 0-1. No cycle is formed, include it.

Add edge 0-1 in the MST


Step 5: Pick edge 2-5. No cycle is formed, include it.

Add edge 2-5 in the MST


Step 6: Pick edge 8-6. Since including this edge results in the cycle, discard it. Pick edge 2-3: No cycle
is formed, include it.

Add edge 2-3 in the MST


Step 7: Pick edge 7-8. Since including this edge results in the cycle, discard it. Pick edge 0-7. No cycle
is formed, include it.

Add edge 0-7 in MST


Step 8: Pick edge 1-2. Since including this edge results in the cycle, discard it. Pick edge 3-4. No cycle
is formed, include it.
Add edge 3-4 in the MST
Note: Since the number of edges included in the MST equals to (V – 1), so the algorithm stops here
Below is the implementation of the above approach:

// C++ program for the above approach

#include <bits/stdc++.h>
using namespace std;

// DSU data structure


// path compression + rank by union
class DSU {
int* parent;
int* rank;

public:
DSU(int n)
{
parent = new int[n];
rank = new int[n];

for (int i = 0; i < n; i++) {


parent[i] = -1;
rank[i] = 1;
}
}

// Find function
int find(int i)
{
if (parent[i] == -1)
return i;

return parent[i] = find(parent[i]);


}

// Union function
void unite(int x, int y)
{
int s1 = find(x);
int s2 = find(y);

if (s1 != s2) {
if (rank[s1] < rank[s2]) {
parent[s1] = s2;
}
else if (rank[s1] > rank[s2]) {
parent[s2] = s1;
}
else {
parent[s2] = s1;
rank[s1] += 1;
}
}
}
};

class Graph {
vector<vector<int> > edgelist;
int V;

public:
Graph(int V) { this->V = V; }

// Function to add edge in a graph


void addEdge(int x, int y, int w)
{
edgelist.push_back({ w, x, y });
}

void kruskals_mst()
{
// Sort all edges
sort(edgelist.begin(), edgelist.end());

// Initialize the DSU


DSU s(V);

int ans = 0;
cout << "Following are the edges in the "
"constructed MST"
<< endl;
for (auto edge : edgelist) {
int w = edge[0];
int x = edge[1];
int y = edge[2];

// Take this edge in MST if it does


// not forms a cycle
if (s.find(x) != s.find(y)) {
s.unite(x, y);
ans += w;
cout << x << " -- " << y << " == " << w
<< endl;
}
}
cout << "Minimum Cost Spanning Tree: " << ans;
}
};

// Driver code
int main()
{
Graph g(4);
g.addEdge(0, 1, 10);
g.addEdge(1, 3, 15);
g.addEdge(2, 3, 4);
g.addEdge(2, 0, 6);
g.addEdge(0, 3, 5);

// Function call
g.kruskals_mst();

return 0;
}

Output
Following are the edges in the constructed MST
2 -- 3 == 4
0 -- 3 == 5
0 -- 1 == 10
Minimum Cost Spanning Tree: 19
Time Complexity: O(E * logE) or O(E * logV)
• Sorting of edges takes O(E * logE) time.
• After sorting, we iterate through all edges and apply the find-union algorithm. The find and union
operations can take at most O(logV) time.
• So overall complexity is O(E * logE + E * logV) time.
• The value of E can be at most O(V^2), so O(logV) and O(logE) are the same. Therefore, the overall time
complexity is O(E * logE) or O(E * logV).
Auxiliary Space: O(V + E), where V is the number of vertices and E is the number of edges in the
graph.
++++++++++++++++++++++++++++++++++++++++++++++++++
UNIT V SEARCHING, SORTING AND HASHING TECHNIQUES 9
Searching – Linear Search – Binary Search. Sorting – Bubble sort – Selection sort – Insertion sort – Shell sort
– Merge Sort – Hashing – Hash Functions – Separate Chaining – Open Addressing – Rehashing – Extendible
Hashing.

5.1 SEARCHING :
Searching is an operation used to check whether a particular element is present in the list.
Types of searching:-
Linear search
Binary Search
5.2 Linear Search :
Linear search is used to search a data item in the given set in a sequential manner, starting from
the first element. It is also called sequential search.

Linear Search routine:

void Linear_search ( int a[ ], int n, int search )
{
    int i, count = 0;
    for ( i = 0; i < n; i++ )
    {
        if ( a[i] == search )
        {
            count++;
        }
    }
    if ( count == 0 )
        print "Element not Present";
    else
        print "Element is Present in list";
}

Program for Linear search


#include <stdio.h>
void main( )
{
    int a[10], n, i, search, count = 0;
    printf( "Enter the number of elements \t" );
    scanf( "%d", &n );
    printf( "\nEnter %d numbers \n", n );
    for ( i = 0; i < n; i++ )
        scanf( "%d", &a[i] );
    printf( "\nArray Elements \n" );
    for ( i = 0; i < n; i++ )
        printf( "%d \t", a[i] );
    printf( "\n\nEnter the Element to be searched: \t" );
    scanf( "%d", &search );
    for ( i = 0; i < n; i++ )
    {
        if ( search == a[i] )
            count++;
    }
    if ( count == 0 )
        printf( "\nElement %d is not present in the array", search );
    else
        printf( "\nElement %d is present %d times in the array \n", search, count );
}
OUTPUT:
Enter the number of elements 5
Enter the numbers
20 10 5 25 100
Array Elements
20 10 5 25 100
Enter the Element to be searched: 25
Element 25 is present 1 times in the array

Advantages of Linear search:


• The linear search is simple - It is very easy to understand and implement;
• It does not require the data in the array to be stored in any particular order.
Disadvantages of Linear search:
• Slower than many other search algorithms.
• It has a very poor efficiency.

5.3 Binary Search :


Binary search is used to search an item in a sorted list. In this method, initialize the lower
limit and the upper limit.
The middle position is computed as (first + last) / 2, and the element in the middle
position is compared with the data item to be searched.
If the data item is greater than the middle value, then the lower limit is adjusted to one
greater than the middle position. Otherwise, the upper limit is adjusted to one less than the
middle position.

Working principle:
Algorithm is quite simple. It can be done either recursively or iteratively:
1. Get the middle element;
2. If the middle element equals the searched value, the algorithm stops;
3. Otherwise, two cases are possible:
o The searched value is less than the middle element. In this case, go to step 1 for the
part of the array before the middle element.
o The searched value is greater than the middle element. In this case, go to step 1
for the part of the array after the middle element.

Example 1.

Find 6 in {-1, 5, 6, 18, 19, 25, 46, 78, 102, 114}.

Step 1 (middle element is 19 > 6): -1 5 6 18 19 25 46 78 102 114

Step 2 (middle element is 5 < 6): -1 5 6 18 19 25 46 78 102 114

Step 3 (middle element is 6 == 6): -1 5 6 18 19 25 46 78 102 114

Binary Search routine:


void Binary_search ( int a[ ], int n, int search )
{
    int first, last, mid;
    first = 0;
    last = n - 1;
    mid = ( first + last ) / 2;
    while ( first <= last )
    {
        if ( search > a[mid] )
            first = mid + 1;
        else if ( search == a[mid] )
        {
            print "Element is present in the list";
            break;
        }
        else
            last = mid - 1;
        mid = ( first + last ) / 2;
    }
    if ( first > last )
        print "Element Not Found";
}
Program for Binary Search:
#include<stdio.h>
void main( )
{
int a [ 10 ] , n , i , search, count = 0 ;
void Binary_search ( int a[ ] , int n , int search );
printf ("Enter the number of elements \t") ;
scanf ("%d",&n);
printf("\nEnter the numbers\n") ;
for (i = 0; i<n;i++)
scanf("%d",&a[i]);
printf("\nArray Elements\n") ;
for (i = 0 ; i < n ; i ++ )
printf("%d\t",a[i]) ;
printf ("\n\nEnter the Element to be searched:\t");
scanf("%d",&search );
Binary_search(a,n,search);
}
void Binary_search ( int a[ ] , int n , int search )
{
int first, last, mid ;
first = 0 ;
last = n-1 ;
mid = (first + last ) / 2 ;
while (first<=last )
{
if(search>a[mid])
first = mid + 1 ;
else if (search==a[mid])
{
printf("Element is present in the list");
break ;
}
else
last = mid - 1 ;
mid = ( first + last ) / 2 ;
}
if( first > last )
printf("Element Not Found");
}

OUTPUT:
Enter the number of elements 5
Enter the numbers
20 25 50 75 100
Array Elements
20 25 50 75 100
Enter the Element to be searched: 75
Element is present in the list
Advantages of Binary search:
In Linear search, the search element is compared with all the elements in the array. Whereas
in Binary search, the search element is compared based on the middle element present in the
array.
A technique for searching an ordered list in which we first check the middle item and - based
on that comparison - "discard" half the data. The same procedure is then applied to the
remaining half until a match is found or there are no more items left.
Disadvantages of Binary search:
Binary search algorithm employs recursive approach and this approach requires more
stack space.
It requires the data in the array to be stored in sorted order.
It involves additional complexity in computing the middle element of the array.
Analysis of Searching algorithms:

S.No   Algorithm        Best Case    Average Case   Worst Case
1      Linear search    O(1)         O(N)           O(N)
2      Binary search    O(1)         O(log N)       O(log N)

5.4 SORTING:
Definition:
Sorting is a technique for arranging data in a particular order.
Order of sorting:
Order means the arrangement of data. The sorting order can be ascending or descending. The
ascending order means arranging the data in increasing order and descending order means
arranging the data in decreasing order.
Types of Sorting

Internal Sorting
External Sorting
Internal Sorting
Internal Sorting is a type of sorting technique in which data resides on main memory of
computer. It is applicable when the number of elements in the list is small.
E.g. Bubble Sort, Insertion Sort, Shell Sort, Quick Sort., Selection sort, Radix sort
External Sorting
External Sorting is a type of sorting technique in which there is a huge amount of data and it resides
on secondary devices (e.g. hard disk, magnetic tape and so on) while sorting.
E.g. Merge Sort, Multiway Merge Sort, Polyphase Merge Sort
Sorting can be classified based on
1. Time complexity
2. Memory utilization
3. Stability
4. Number of comparisons.

ANALYSIS OF ALGORITHMS:
Efficiency of an algorithm can be measured in terms of:

Space Complexity: Refers to the space required to execute the algorithm



Time Complexity: Refers to the time required to run the program.

Sorting algorithms:
Insertion sort
Selection sort
Shell sort
Bubble sort
Quick sort
Merge sort

5.4.3 INSERTION SORTING:


The insertion sort works by taking elements from the list one by one and inserting them
in the correct position into a new sorted list.
Insertion sort consists of N-1 passes, where N is the number of elements to be sorted.
The ith pass will insert the ith element A[i] into its rightful place among
A[1],A[2],…,A[i-1].
After doing this insertion the elements occupying A[1],…A[i] are in sorted order.

How Insertion sort algorithm works?

Insertion Sort routine:


void Insertion_sort ( int a[ ], int n )
{
    int i, j, temp;
    for ( i = 0; i < n; i++ )
    {
        temp = a[i];
        for ( j = i; j > 0 && a[j-1] > temp; j-- )
        {
            a[j] = a[j-1];
        }
        a[j] = temp;
    }
}
Program for Insertion sort
#include<stdio.h>
void main( ){
int n, a[ 25 ], i, j, temp;
printf( "Enter number of elements \n" );
scanf( "%d", &n );
printf( "Enter %d integers \n", n );
for ( i = 0; i < n; i++ )
scanf( "%d", &a[i] );
for ( i = 0 ; i < n; i++ ){
temp=a[i];
for (j=i;j > 0 && a[ j -1]>temp;j--)
{
a[ j ] = a[ j - 1 ];
}
a[j]=temp;}
printf( "Sorted list in ascending order: \n ");
for ( i = 0 ; i < n ; i++)
printf ( "%d \n ", a[ i ] );}
OUTPUT:
Enter number of elements
6
Enter 6 integers
20 10 60 40 30 15
Sorted list in ascending order:
10
15
20
30
40
60
Advantage of Insertion sort
• Simple implementation.

• Efficient for (quite) small data sets.

• Efficient for data sets that are already substantially sorted.

Disadvantages of Insertion sort


• It is less efficient on lists containing a large number of elements.
• As the number of elements increases, the performance of the program degrades,
since insertion sort needs a large number of element shifts.
5.4.2 Selection Sort :

Selection sort selects the smallest element in the list and place it in the first position then selects
the second smallest element and place it in the second position and it proceeds in the similar way
until the entire list is sorted. For “n” elements, (n-1) passes are required. At the end of the ith
iteration, the ith smallest element will be placed in its correct position.

Selection Sort routine:


void Selection_sort ( int a[ ], int n )
{
    int i, j, temp, position;
    for ( i = 0; i < n - 1; i++ )
    {
        position = i;
        for ( j = i + 1; j < n; j++ )
        {
            if ( a[position] > a[j] )
                position = j;
        }
        temp = a[i];
        a[i] = a[position];
        a[position] = temp;
    }
}

How Selection sort algorithm works?



Program for Selection sort


#include <stdio.h>
void main( )
{
int a [ 100 ] , n , i , j , position , temp ;
printf ( "Enter number of elements \n" ) ;
scanf ( "%d", &n ) ;
printf ( " Enter %d integers \n ", n ) ;
for ( i = 0 ; i < n ; i ++ )
scanf ( "%d", & a[ i ] ) ;
for ( i = 0 ; i < ( n - 1 ) ; i ++ )
{
position = i ;
for ( j = i + 1 ; j < n ; j ++ )
{
if ( a [ position ] > a [ j ] )
position = j ;
}
if ( position != i )
{
temp = a [ i ] ;
a [ i ] = a [ position ] ;
a [ position ] = temp ;
}
}
printf ( "Sorted list in ascending order: \n ") ;
for ( i = 0 ; i < n ; i ++ )
printf ( " %d \n ", a[ i ] ) ;
}
OUTPUT:
Enter number of elements
5
Enter 5 integers
8 3 9 5 1
Sorted list in ascending order:
1
3
5
8
9
Advantages of selection sort
• Memory required is small.
• Selection sort is useful when you have limited memory available.
• Relatively efficient for small arrays.

Disadvantage of selection sort


• Poor efficiency when dealing with a huge list of items.
• The selection sort requires n-squared number of steps for sorting n elements.
• The selection sort is only suitable for a list of few elements that are in random order.
5.4.4 Shell Sort:
• Invented by Donald shell.
• It improves upon bubble sort and insertion sort by moving out of order elements more
than one position at a time.
• In shell sort the whole array is first fragmented into K segments, where K is preferably a
prime number.
• After the first pass the whole array is partially sorted.
• In the next pass, the value of K is reduced which increases the size of each segment and
reduces the number of segments.
• The next value of K is chosen so that it is relatively prime to its previous value.
• The process is repeated until K=1 at which the array is sorted.
• The insertion sort is applied to each segment so each successive segment is partially
sorted.
• The shell sort is also called the Diminishing Increment sort, because the value of k
decreases continuously

A Shell Sort with Increments of Three

A Shell Sort after Sorting Each Sublist

Shell Sort: A Final Insertion Sort with Increment of 1


Shell Sort routine:

void Shell_sort ( int a[ ], int n )
{
    int i, j, k, temp;
    for ( k = n / 2; k > 0; k = k / 2 )
        for ( i = k; i < n; i++ )
        {
            temp = a[i];
            for ( j = i; j >= k && a[j-k] > temp; j = j - k )
            {
                a[j] = a[j-k];
            }
            a[j] = temp;
        }
}

Program for Shell sort


#include<stdio.h>
void main( )
{
int n, a[ 25 ], i, j,k,temp;
printf( "Enter number of elements \n" );
scanf( "%d", &n );
printf( "Enter %d integers \n", n );
for ( i = 0; i < n; i++ )
scanf( "%d", &a[i] );
for (k = n / 2 ; k>0 ; k=k/ 2){
for ( i = k ; i < n ; i ++ )
{
temp = a [ i ] ;
for (j = i ; j>= k && a [ j - k ]>temp ; j=j - k )
{
a[j]=a[j-k];
}
a [ j ] = temp ;
}
}
printf( "Sorted list in ascending order using shell sort: \n ");
for ( i = 0 ; i < n ; i++)
printf ( "%d\t ", a[ i ] );
}
OUTPUT:
Enter number of elements
10
Enter 10 integers
81 94 11 96 12 35 17 95 28 58
Sorted list in ascending order using shell sort:
11 12 17 28 35 58 81 94 95 96
//PROGRAM FOR SHELL SORT USING A FUNCTION
#include <stdio.h>
void ShellSort ( int a[ ], int n );
void main( )
{
    int a[5] = { 4, 5, 2, 3, 6 }, i = 0;
    ShellSort( a, 5 );
    printf( "Example using function" );
    printf( " After Sorting : " );
    for ( i = 0; i < 5; i++ )
        printf( " %d ", a[i] );
}
void ShellSort ( int a[ ], int n )
{
    int i, j, k, temp;
    for ( k = n / 2; k > 0; k /= 2 )
    {
        for ( i = k; i < n; i++ )
        {
            temp = a[i];
            for ( j = i; j >= k && a[j-k] > temp; j = j - k )
            {
                a[j] = a[j-k];
            }
            a[j] = temp;
        }
    }
}

OUTPUT:
After Sorting : 2 3 4 5 6
Advantages of Shell sort
• Efficient for medium-size lists.
Disadvantages of Shell sort
• Complex algorithm, not nearly as efficient as the merge, heap and quick sorts
5.4.1 Bubble Sort:
• Bubble sort is one of the simplest internal sorting algorithms.

• Bubble sort works by comparing two consecutive elements; the larger of the two
bubbles towards the right. At the end of the first pass the largest element gets
sorted and placed at the end of the list.
• This process is repeated for all pairs of elements until the largest element of
that iteration moves to the end of the list.
• Bubble sort consists of (n-1) passes, where n is the number of elements to be sorted.

• In the 1st pass the largest element will be placed in the nth position.

• In the 2nd pass the second largest element will be placed in the (n-1)th position.
• In the (n-1)th pass only the first two elements are compared.

Bubble sort routine:


void Bubble_sort ( int a[ ], int n )
{
    int i, j, temp;
    for ( i = 0; i < n - 1; i++ )
    {
        for ( j = 0; j < n - i - 1; j++ )
        {
            if ( a[j] > a[j+1] )
            {
                temp = a[j];
                a[j] = a[j+1];
                a[j+1] = temp;
            }
        }
    }
}

Program for Bubble sort


#include<stdio.h >
#include<conio.h >
void main( )
{
int a [ 20 ], i, j, temp, n ;
printf ("Enter the number of elements");
scanf ("%d",&n);
printf("Enter the numbers");
for(i=0;i < n ;i++)
scanf("%d",&a[i]);
for(i=0;i<n-1;i++)
{
for(j=0;j<n-i-1;j++)
{
if(a[j]>a[j + 1]){
temp = a[ j ] ;
a[ j ] = a[ j + 1 ] ;
a[ j+ 1 ] = temp;
}
}
}
printf("\nSorted array\t");
for(i=0;i<n;i++)
printf("%d\t",a[i]);
}
OUTPUT:
Enter the number of elements5
Enter the numbers8 3 9 5 1
Sorted array 1 3 5 8 9
Advantage of Bubble sort
• It is simple to write
• Easy to understand
• It only takes a few lines of code.

Disadvantage of Bubble sort


• The major drawback is the amount of time it takes to sort.
• The average time increases almost quadratically as the number of elements
increases.
++++++++++++++++++++++++++++++++++++++++++++++++++
Quick Sort
Quicksort is a divide and conquer algorithm.
The basic idea is to find a “pivot” item in the array and compare all other items with pivot
element.
Shift items such that all of the items before the pivot are less than the pivot value and all
the items after the pivot are greater than the pivot value.
After that, recursively perform the same operation on the items before and after the pivot.
Find a “pivot” item in the array. This item is the basis for comparison for a single round.
Start a pointer (the left pointer) at the first item in the array.
Start a pointer (the right pointer) at the last item in the array.

1. Assume A[0] = pivot, which is the leftmost element, i.e. pivot = left.
2. Set i = left + 1, i.e. A[1].
3. Set j = right, i.e. A[6] if there are 7 elements in the array.
4. If A[pivot] > A[i], increment i, and if A[j] > A[pivot], decrement j; otherwise swap the A[i]
and A[j] elements.
5. If i = j, then swap A[pivot] and A[j].

Quick Sort routine:


void Quicksort ( int a[ ], int left, int right )
{
    int i, j, p, temp;
    if ( left < right )
    {
        p = left;
        i = left + 1;
        j = right;
        while ( i < j )
        {
            while ( a[i] <= a[p] && i < right )
                i = i + 1;
            while ( a[j] > a[p] )
                j = j - 1;
            if ( i < j )
            {
                temp = a[i];
                a[i] = a[j];
                a[j] = temp;
            }
        }
        temp = a[p];
        a[p] = a[j];
        a[j] = temp;
        Quicksort( a, left, j - 1 );
        Quicksort( a, j + 1, right );
    }
}
Program for Quick sort

#include<stdio.h>
void quicksort (int [10], int, int ) ;
void main( )
{
int a[20], n, i ;
printf("Enter size of the array:" );
scanf("%d",&n);
printf( " Enter the numbers :");
for ( i = 0 ; i < n ; i ++ )
scanf ("%d",&a[i]);
quicksort ( a , 0 , n - 1 );
printf ( " Sorted elements: " );
for ( i = 0 ; i < n ; i ++ )
printf ("%d\t",a[ i]);
}
void quicksort ( int a[10], int left, int right )
{
int p, j, temp, i ;
if ( left < right )
{
p = left ;
i = left ;
j = right ;
while ( i < j )
{
while(a[i]<= a[p] && i<right )
i++ ;
while ( a [ j ] > a [ p ] )
j--;
if ( i < j )
{
temp = a [ i ] ;
a[i]=a[j];
a[ j ] = temp ;
}
}
temp = a [ p ] ;
a[p]=a[j];
a [ j ] =temp ;
quicksort ( a , left , j - 1 ) ;
quicksort ( a , j + 1 , right ) ;
}
}

OUTPUT:
Enter size of the array:8

Enter the numbers :40 20 70 14 60 61 97 30


Sorted elements: 14 20 30 40 60 61 70 97
Advantages of Quick sort
• Fast and efficient as it deals well with a huge list of items.
• No additional storage is required.
Disadvantages of Quick sort
• The difficulty of implementing the partitioning algorithm.

++++++++++++++++++++++++++++++++++++++++++++++++++
5.4.5 Merge Sort :
Merge sort is a sorting algorithm that uses the divide, conquer, and combine algorithmic
paradigm.
Divide means partitioning the n-element array to be sorted into two sub-arrays of n/2 elements.
If there are more elements in the array, divide A into two sub-arrays, A1 and A2, each containing
about half of the elements of A.
Conquer means sorting the two sub-arrays recursively using merge sort.
Combine means merging the two sorted sub-arrays of size n/2 to produce the sorted array of n
elements.
The basic steps of a merge sort algorithm are as follows:
If the array is of length 0 or 1, then it is already sorted.

Otherwise, divide the unsorted array into two sub-arrays of about half the size.
Use merge sort algorithm recursively to sort each sub-array.
Merge the two sub-arrays to form a single sorted list.
Merge Sort routine:
void Merge_sort ( int a[ ], int temp[ ], int n )
{
    msort( a, temp, 0, n - 1 );
}
void msort ( int a[ ], int temp[ ], int left, int right )
{
    int center;
    if ( left < right )
    {
        center = ( left + right ) / 2;
        msort( a, temp, left, center );
        msort( a, temp, center + 1, right );
        merge( a, temp, left, center, right );
    }
}
void merge ( int a[ ], int temp[ ], int left, int center, int right )
{
    int i = left, j = center + 1, k = 0;
    while ( i <= center && j <= right )
    {
        if ( a[i] <= a[j] )
            temp[k++] = a[i++];
        else
            temp[k++] = a[j++];
    }
    while ( i <= center )            /* copy remaining elements of the left half */
        temp[k++] = a[i++];
    while ( j <= right )             /* copy remaining elements of the right half */
        temp[k++] = a[j++];
    for ( i = left, k = 0; i <= right; i++, k++ )   /* copy the merged run back to a[] */
        a[i] = temp[k];
}

Program for merge sort


#include<stdio.h>
void mergesort(int a[],int i,int j);
void merge(int a[],int i1,int j1,int i2,int j2);

int main()
{
int a[30],n,i;
printf("Enter no of elements:");
scanf("%d",&n);
printf("Enter array elements:");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
mergesort(a,0,n-1);
printf("\nSorted array is :");
for(i=0;i<n;i++)
printf("%d ",a[i]);
return 0;
}
void mergesort(int a[],int i,int j)
{
int mid;
if(i<j)
{
mid=(i+j)/2;
mergesort(a,i,mid);          //left recursion
mergesort(a,mid+1,j);        //right recursion
merge(a,i,mid,mid+1,j);      //merging of two sorted sub-arrays
}}
void merge(int a[],int i1,int j1,int i2,int j2)
{
int temp[50]; //array used for merging
int i,j,k;
i=i1; //beginning of the first list
j=i2; //beginning of the second list
k=0;
while(i<=j1 && j<=j2) //while elements in both lists
{ if(a[i]<a[j])
temp[k++]=a[i++];
else
temp[k++]=a[j++];
}
while(i<=j1) //copy remaining elements of the first list
temp[k++]=a[i++];
while(j<=j2) //copy remaining elements of the second list
temp[k++]=a[j++];
//Transfer elements from temp[] back to a[]
for(i=i1,j=0;i<=j2;i++,j++)
a[i]=temp[j];
}

OUTPUT:

Enter no of elements:8
Enter array elements:24 13 26 1 2 27 38 15
Sorted array is :1 2 13 15 24 26 27 38
Advantages of Merge sort
• Mergesort is well-suited for sorting really huge amounts of data that does not fit into
memory.
• It is fast and stable algorithm

Disadvantages of Merge sort


• Merge sort uses a lot of memory.
• It uses extra space proportional to number of element n.
• This can slow it down when attempting to sort very large data.

Analysis of Sorting algorithms:

S.No   Algorithm                            Best Case      Average Case   Worst Case
1      Insertion sort                       O(N)           O(N^2)         O(N^2)
2      Selection sort                       O(N^2)         O(N^2)         O(N^2)
3      Shell sort                           O(N log N)     O(N^1.5)       O(N^2)
4      Bubble sort                          O(N^2)         O(N^2)         O(N^2)
5      Quick sort                           O(N log N)     O(N log N)     O(N^2)
6      Merge sort                           O(N log N)     O(N log N)     O(N log N)
7      Radix / bucket / bin sort (card sort) O(N log N)    O(N log N)     O(N log N)

5.6 HASHING :
Hashing is a technique that is used to store, retrieve and find data in the data structure
called Hash Table. It is used to overcome the drawback of Linear Search (Comparison) &
Binary Search (Sorted order list). It involves two important concepts-
➢ Hash Table
➢ Hash Function
Hash table
A hash table is a data structure that is used to store and retrieve data (keys) very
quickly.
It is an array of some fixed size, containing the keys.
Hash table run from 0 to Tablesize – 1.
Each key is mapped into some number in the range 0 to Tablesize – 1.
This mapping is called Hash function.
Insertion of the data in the hash table is based on the key value obtained from the
hash function.
Using same hash key value, the data can be retrieved from the hash table by few
or more Hash key comparison.
The load factor of a hash table is calculated using the formula:
(Number of data elements in the hash table) / (Size of the hash table)
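As a small illustration of this formula, the load factor can be computed with a one-line helper (the function name below is illustrative only, not from the syllabus):

/* Load factor = (number of data elements) / (table size).
   The cast to double avoids integer division. */
double load_factor( int n_elements, int table_size )
{
    return (double) n_elements / table_size;   /* e.g. 4 keys in 10 slots -> 0.4 */
}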
Factors affecting Hash Table Design

Hash function
Table size.
Collision handling scheme

(Figure: a simple hash table with table size = 10; the slots are numbered 0, 1, 2, ..., 9.)

5.7 Hash function:


It is a function, which distributes the keys evenly among the cells in the Hash
Table.
Using the same hash function we can retrieve data from the hash table.
Hash function is used to implement hash table.
The integer value returned by the hash function is called hash key.
If the input keys are integer, the commonly used hash function is

H ( key ) = key % Tablesize

typedef unsigned int index;

index Hash ( const char *key, int Tablesize )
{
    unsigned int Hashval = 0;
    while ( *key != '\0' )
        Hashval += *key++;
    return ( Hashval % Tablesize );
}

A simple hash function


Types of Hash Functions
1. Division Method
2. Mid Square Method
3. Multiplicative Hash Function
4. Digit Folding
1. Division Method:
It depends on remainder of division.
Divisor is Table Size.
Formula is ( H ( key ) = key % table size )

E.g. consider the following data or record or key (36, 18, 72, 43, 6) table size = 8
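The keys above can be hashed with the division method as a small sketch (the names hash_div and keys are illustrative):

#include <stdio.h>

/* Division-method hash: H(key) = key % table_size */
int hash_div( int key, int table_size )
{
    return key % table_size;
}

int main( void )
{
    int keys[] = { 36, 18, 72, 43, 6 };
    int i;
    for ( i = 0; i < 5; i++ )
        printf( "H(%d) = %d\n", keys[i], hash_div( keys[i], 8 ) );
    return 0;   /* prints indices 4, 2, 0, 3, 6 */
}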

2. Mid Square Method:


We first square the item, and then extract some portion of the resulting digits. For
example, if the item were 44, we would first compute 44^2 = 1,936 and extract the middle two digits,
93, from the answer. The key 44 is then stored at index 93.

(Figure: key 44 stored at index 93.)
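A minimal sketch of the mid-square idea for a two-digit key such as 44 (the function name is illustrative; a general version would pick the middle digits based on the length of the square):

/* Mid-square hash (sketch): square the key and keep the middle two
   digits of the four-digit square, e.g. 44 * 44 = 1936 -> 93. */
int hash_midsquare( int key )
{
    int square = key * key;        /* 1936 for key 44 */
    return ( square / 10 ) % 100;  /* middle two digits: 93 */
}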

3. Multiplicative Hash Function:


Key is multiplied by some constant value.
Hash function is given by,
H(key)=Floor (P * ( key * A ))
P = Integer constant [e.g. P = 50]
A = Constant real number [A = 0.61803398987]; Donald Knuth suggested the use of this constant.

E.g. Key 107


H(107)=Floor(50*(107*0.61803398987))
=Floor(3306.481845)
H(107)=3306
Consider a table of size 5000: key 107 is stored at index 3306 (slots run from 0 to 4999).
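The same computation can be written directly in C as a sketch (hash_mult is an illustrative name; in practice the result would still be reduced modulo the table size):

#include <math.h>

/* Multiplicative hash (sketch): H(key) = floor(P * (key * A)),
   with P = 50 and Knuth's constant A = 0.61803398987. */
int hash_mult( int key )
{
    const double A = 0.61803398987;
    const int P = 50;
    return (int) floor( P * ( key * A ) );   /* hash_mult(107) = 3306 */
}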

4. Digit Folding Method:

The folding method for constructing hash functions begins by dividing the item into
equal-size pieces (the last piece may not be of equal size). These pieces are then added together
to give the resulting hash key value. For example, if our item was the phone number 436-555-
4601, we would take the digits and divide them into groups of 2 (43, 65, 55, 46, 01). After the
addition, 43+65+55+46+01, we get 210. If we assume our hash table has 11 slots, then we need
to perform the extra step of dividing by 11 and keeping the remainder. In this case 210 % 11 is 1,
so the phone number 436-555-4601 hashes to slot 1.
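A small sketch of the folding method for a string of digits (the function name is illustrative); it reproduces the 436-555-4601 example above:

#include <stdio.h>
#include <string.h>

/* Digit-folding hash (sketch): split the digit string into pairs,
   add the pairs, then take the remainder modulo the table size. */
int hash_fold( const char *digits, int table_size )
{
    int sum = 0, i, len = strlen( digits );
    for ( i = 0; i < len; i += 2 )
    {
        int piece = digits[i] - '0';
        if ( i + 1 < len )
            piece = piece * 10 + ( digits[i + 1] - '0' );
        sum += piece;
    }
    return sum % table_size;
}

int main( void )
{
    /* 43 + 65 + 55 + 46 + 01 = 210, and 210 % 11 = 1 */
    printf( "%d\n", hash_fold( "4365554601", 11 ) );
    return 0;
}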


Collision:
If two or more keys hash to the same index, the corresponding records cannot be stored in the
same location. This condition is known as a collision.
Characteristics of Good Hashing Function:

▪ It should be Simple to compute.


▪ Number of Collision should be less while placing record in Hash Table.
▪ A hash function with no collisions is called a perfect hash function.
▪ Hash Function should produce keys which are distributed uniformly in hash table.
▪ The hash function should depend upon every bit of the key. Thus the hash
function that simply extracts the portion of a key is not suitable.
Collision Resolution Strategies / Techniques (CRT):
If collision occurs, it should be handled or overcome by applying some technique. Such
technique is called CRT.
There are a number of collision resolution techniques, but the most popular are:
▪ Separate chaining (Open Hashing)
▪ Open addressing. (Closed Hashing)

Linear Probing
Quadratic Probing
Double Hashing
5.8 Separate chaining (Open Hashing):
Open hashing technique.
Implemented using singly linked list concept.
Pointer (ptr) field is added to each record.
When collision occurs, a separate chaining is maintained for colliding data.
Element inserted in front of the list.
H (key) =key % table size
Two operations are there:-
▪ Insert
▪ Find

Structure Definition for Node


typedef struct node *Position;
struct node                 /* defines the nodes of the chain */
{
    int data;
    Position next;
};

Structure Definition for Hash Table


typedef Position List;
struct Hashtbl              /* defines the hash table, which contains an array of linked lists */
{
    int Tablesize;
    List *TheLists;
};
typedef struct Hashtbl *HashTable;

Initialization for Hash Table for Separate Chaining


HashTable initialize( int Tablesize )
{
    HashTable H;
    int i;
    H = malloc( sizeof( struct Hashtbl ) );                  /* allocates the table */
    H->Tablesize = NextPrime( Tablesize );
    H->TheLists = malloc( sizeof( List ) * H->Tablesize );   /* allocates the array of lists */
    for ( i = 0; i < H->Tablesize; i++ )
    {
        H->TheLists[i] = malloc( sizeof( struct node ) );    /* allocates the list headers */
        H->TheLists[i]->next = NULL;
    }
    return H;
}
Insert Routine for Separate Chaining
void insert ( int key, HashTable H )
{
    Position P, newnode;        /* the element is always inserted at the front of the list */
    List L;

    P = find( key, H );
    if ( P == NULL )
    {
        newnode = malloc( sizeof( struct node ) );
        L = H->TheLists[ Hash( key, H->Tablesize ) ];
        newnode->next = L->next;
        newnode->data = key;
        L->next = newnode;
    }
}
Position find ( int key, HashTable H )
{
    Position P;
    List L;
    L = H->TheLists[ Hash( key, H->Tablesize ) ];
    P = L->next;
    while ( P != NULL && P->data != key )
        P = P->next;
    return P;
}
If two keys map to same value, the elements are chained together.
Initial configuration of the hash table with separate chaining. Here we use SLL(Singly Linked List)
concept to chain the elements.

(Figure: slots 0 to 9 of the table, each list header pointing to NULL.)

Insert the following four keys 22 84 35 62 into hash table of size 10 using separate chaining.
The hash function is
H(key) = key % 10
1. H(22) = 22 % 10 = 2
2. H(84) = 84 % 10 = 4
3. H(35) = 35 % 10 = 5
4. H(62) = 62 % 10 = 2

Advantages
1. More elements can be inserted, since the table is an array of linked lists that can grow.
Disadvantages
1. It requires more pointers, which occupies more memory space.
2. Searching takes time, since it takes time to evaluate the hash function and also to traverse the
list.

5.9 Open Addressing :


Closed Hashing
Collision resolution technique
Uses Hi(X)=(Hash(X)+F(i))mod Tablesize
When collision occurs, alternative cells are tried until empty cells are found.
Types:-
▪ Linear Probing
▪ Quadratic Probing
▪ Double Hashing
Hash function
▪ H(key) = key % table size.
Insert Operation
▪ To insert a key; Use the hash function to identify the list to which the
element should be inserted.
▪ Then traverse the list to check whether the element is already present.
▪ If exists, increment the count.
▪ Else the new element is placed at the front of the list.
Linear Probing:
Easiest method to handle collision.
Apply the hash function H (key) = key % table size
Hi(X)=(Hash(X)+F(i))mod Tablesize,where F(i)=i.
How probing works:
first probe – given a key k, hash to H(key)
second probe – if H(key) is occupied, try H(key) + f(1)
third probe – if H(key) + f(1) is occupied, try H(key) + f(2)
and so forth.
Probing Properties:
We force f(0)=0
The ith probe is to (H (key) +f (i)) %table size.
If i reach size-1, the probe has failed.
Depending on f (i), the probe may fail sooner.
Long sequences of probe are costly.
Probe Sequence is:
(H(key)) % table size
(H(key) + 1) % table size
(H(key) + 2) % table size

1. H(Key)=Key mod Tablesize


This is the common formula that you should apply for any hashing.
If a collision occurs, use formula 2.
2. H(Key) = (H(key) + i) % Tablesize
Where i = 1, 2, 3, ... etc.
Example: - 89 18 49 58 69; Tablesize=10
1. H(89) =89%10
=9
2. H(18) =18%10
=8
3. H(49) =49%10
=9 (collides with 89, so try the next free cell using formula 2)
i=1 h1(49) = (H(49)+1)%10
= (9+1)%10
=10%10
=0
4. H(58) =58%10
=8 (collides with 18)
i=1 h1(58) = (H(58) +1)%10
= (8+1) %10
=9%10
=9 =>Again collision
i=2 h2(58) =(H(58)+2)%10
=(8+2)%10
=10%10
=0 =>Again collision (cell 0 is occupied by 49)
i=3 h3(58) = (H(58)+3)%10
= (8+3)%10
=1 => cell 1 is free, so 58 is placed in cell 1
5. H(69) =69%10
=9 => collision; probing cells 0 and 1 also collides, and cell 2 is free, so 69 is placed in cell 2

Cell   EMPTY   89   18   49   58   69
0       -      -    -    49   49   49
1       -      -    -    -    58   58
2       -      -    -    -    -    69
3       -      -    -    -    -    -
4       -      -    -    -    -    -
5       -      -    -    -    -    -
6       -      -    -    -    -    -
7       -      -    -    -    -    -
8       -      -    18   18   18   18
9       -      89   89   89   89   89

Linear probing
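The probing described above can be sketched as a simple insert routine (the names and the EMPTY marker are illustrative); inserting 89, 18, 49, 58 and 69 in this order reproduces the table shown:

#define TABLE_SIZE 10
#define EMPTY -1

/* Linear-probing insert (sketch): probe H(key), H(key)+1, H(key)+2, ...
   (mod table size) until an empty cell is found.  Assumes the table
   was initialised with EMPTY and is not completely full. */
void insert_linear( int table[], int key )
{
    int i, index;
    for ( i = 0; i < TABLE_SIZE; i++ )
    {
        index = ( key % TABLE_SIZE + i ) % TABLE_SIZE;
        if ( table[index] == EMPTY )
        {
            table[index] = key;
            return;
        }
    }
    /* probe failed: the table is full */
}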

Quadratic Probing
To resolve the primary clustering problem, quadratic probing can be used. With quadratic
probing, rather than always moving one spot, move i^2 spots from the point of collision, where
i is the number of attempts to resolve the collision.
Another collision resolution method which distributes items more evenly.

From the original index H, if the slot is filled, try cells H+1^2, H+2^2, H+3^2, ..., H+i^2 with
wrap-around.
Hi(X) = (Hash(X) + F(i)) mod Tablesize, where F(i) = i^2
Hi(X) = (Hash(X) + i^2) mod Tablesize

Limitation: at most half of the table can be used as alternative locations to resolve collisions.
This means that once the table is more than half full, it's difficult to find an empty spot. This
new problem is known as secondary clustering because elements that hash to the same hash
key will always probe the same alternative cells.
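A corresponding sketch for quadratic probing (names are illustrative, mirroring the linear-probing sketch above); the only change is the i*i step:

#define TABLE_SIZE 10
#define EMPTY -1

/* Quadratic-probing insert (sketch): on collision try
   (H(key) + i*i) % TABLE_SIZE for i = 1, 2, 3, ... */
void insert_quadratic( int table[], int key )
{
    int i, index;
    for ( i = 0; i < TABLE_SIZE; i++ )
    {
        index = ( key % TABLE_SIZE + i * i ) % TABLE_SIZE;
        if ( table[index] == EMPTY )
        {
            table[index] = key;
            return;
        }
    }
    /* probe failed: as noted above, quadratic probing may not
       find a free cell once the table is more than half full */
}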
Double Hashing
Double hashing uses the idea of applying a second hash function to the key when a
collision occurs. The result of the second hash function gives the number of positions, from
the point of collision, at which to insert.
There are a couple of requirements for the second function:
It must never evaluate to 0, and it must make sure that all cells can be probed.
Hi(X)=(Hash(X)+i*Hash2(X))mod Tablesize
A popular second hash function is:
Hash2 (key) = R - (key % R) where R is a prime number that is smaller than the size of the
table.
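A sketch of the probe sequence for double hashing with R = 7 and a table of size 10 (names are illustrative); for example, for key 49, Hash2(49) = 7 - (49 % 7) = 7, so the probes visit cells 9, 6, 3, 0, ...:

#define TABLE_SIZE 10
#define R 7

/* Double-hashing probe sequence (sketch):
   Hi(X) = (Hash(X) + i * Hash2(X)) mod Tablesize,
   with Hash2(key) = R - (key % R). */
int probe_double( int key, int i )
{
    int h1 = key % TABLE_SIZE;
    int h2 = R - ( key % R );        /* never evaluates to 0 */
    return ( h1 + i * h2 ) % TABLE_SIZE;
}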

5.10 Rehashing :
Once the hash table gets too full, the running time for operations will start to take too
long and may fail. To solve this problem, a table at least twice the size of the original will be
built and the elements will be transferred to the new table.
Advantage:
A programmer doesn't need to worry about the table size.
Simple to implement
Can be used in other data structure as well
The new size of the hash table:
should also be prime
will be used to calculate the new insertion spot (hence the name rehashing)
This is a very expensive operation! O(N) since there are N elements to rehash and the
table size is roughly 2N. This is ok though since it doesn't happen that often.

The question becomes when should the rehashing be applied?


Some possible answers:
once the table becomes half full
once an insertion fails
once a specific load factor has been reached, where load factor is the ratio of the
number of elements in the hash table to the table size
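A rehashing sketch for an open-addressing table of integers is shown below; next_prime() and insert_key() are assumed helper routines that are not defined in these notes, and EMPTY marks a free cell:

#include <stdlib.h>

#define EMPTY -1

/* assumed helpers for this sketch */
int next_prime( int n );                              /* first prime >= n */
void insert_key( int table[], int size, int key );    /* e.g. a probing insert */

/* Rehashing sketch: allocate a table at least twice the old size
   (rounded up to a prime) and re-insert every element - an O(N) operation. */
int *rehash( int *old_table, int old_size, int *new_size )
{
    int i, *new_table;
    *new_size = next_prime( 2 * old_size );      /* the new size should also be prime */
    new_table = malloc( *new_size * sizeof( int ) );
    for ( i = 0; i < *new_size; i++ )
        new_table[i] = EMPTY;                    /* mark every cell free */
    for ( i = 0; i < old_size; i++ )
        if ( old_table[i] != EMPTY )
            insert_key( new_table, *new_size, old_table[i] );
    free( old_table );
    return new_table;
}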

5.11 Extendible Hashing :


• Extendible Hashing is a mechanism for altering the size of the hash table to accommodate new
entries when buckets overflow.
• A common strategy in internal hashing is to double the hash table and rehash each entry. However,
this technique is slow, because writing all pages to disk is too expensive.
• Therefore, instead of doubling the whole hash table, we use a directory of pointers to
buckets, and double the number of buckets by doubling the directory, splitting just the
bucket that overflows.

• Since the directory is much smaller than the file, doubling it is much cheaper. Only one page of
keys and pointers is split.
(Figure: an extendible hash structure before and after a bucket overflow; the directory of pointers is doubled and only the overflowing bucket is split, its keys being rehashed between the two new buckets.)

Extendible Hashing is a dynamic hashing method wherein directories, and buckets are
used to hash data. It is an aggressively flexible method in which the hash function also
experiences dynamic changes.
Main features of Extendible Hashing: The main features in this hashing technique are:

• Directories: The directories store addresses of the buckets in pointers. An id is assigned to


each directory which may change each time when Directory Expansion takes place.
• Buckets: The buckets are used to hash the actual data.
Basic Structure of Extendible Hashing:

Frequently used terms in Extendible Hashing:

• Directories: These containers store pointers to buckets. Each directory is given a unique
id which may change each time when expansion takes place. The hash function returns this
directory id which is used to navigate to the appropriate bucket. Number of Directories =
2^Global Depth.
• Buckets: They store the hashed keys. Directories point to buckets. A bucket may contain
more than one pointers to it if its local depth is less than the global depth.
• Global Depth: It is associated with the Directories. They denote the number of bits which
are used by the hash function to categorize the keys. Global Depth = Number of bits in
directory id.
• Local Depth: It is the same as that of Global Depth except for the fact that Local Depth is
associated with the buckets and not the directories. Local depth in accordance with the
global depth is used to decide the action that to be performed in case an overflow occurs.
Local Depth is always less than or equal to the Global Depth.
• Bucket Splitting: When the number of elements in a bucket exceeds a particular size, then
the bucket is split into two parts.

• Directory Expansion: Directory Expansion Takes place when a bucket overflows.


Directory Expansion is performed when the local depth of the overflowing bucket is equal
to the global depth.
Basic Working of Extendible Hashing:

• Step 1 – Analyze Data Elements: Data elements may exist in various forms eg. Integer,
String, Float, etc.. Currently, let us consider data elements of type integer. eg: 49.
• Step 2 – Convert into binary format: Convert the data element in Binary form. For string
elements, consider the ASCII equivalent integer of the starting character and then convert
the integer into binary form. Since we have 49 as our data element, its binary form is
110001.
• Step 3 – Check Global Depth of the directory. Suppose the global depth of the Hash-
directory is 3.
• Step 4 – Identify the Directory: Consider the ‘Global-Depth’ number of LSBs in the
binary number and match it to the directory id.

Eg. The binary obtained is: 110001 and the global-depth is 3. So, the hash function will
return 3 LSBs of 110001 viz. 001.
• Step 5 – Navigation: Now, navigate to the bucket pointed by the directory with directory-
id 001.
• Step 6 – Insertion and Overflow Check: Insert the element and check if the bucket
overflows. If an overflow is encountered, go to step 7 followed by Step 8, otherwise, go
to step 9.
• Step 7 – Tackling Over Flow Condition during Data Insertion: Many times, while
inserting data in the buckets, it might happen that the Bucket overflows. In such cases, we
need to follow an appropriate procedure to avoid mishandling of data.
First, Check if the local depth is less than or equal to the global depth. Then choose one of
the cases below.
o Case1: If the local depth of the overflowing Bucket is equal to the global depth,
then Directory Expansion, as well as Bucket Split, needs to be performed. Then
increment the global depth and the local depth value by 1. And, assign appropriate
pointers.
Directory expansion will double the number of directories present in the hash
structure.
o Case2: In case the local depth is less than the global depth, then only Bucket Split
takes place. Then increment only the local depth value by 1. And, assign
appropriate pointers.

• Step 8 – Rehashing of Split Bucket Elements: The Elements present in the overflowing
bucket that is split are rehashed w.r.t the new global depth of the directory.
• Step 9 – The element is successfully hashed.
Example based on Extendible Hashing: Now, let us consider a prominent example of
hashing the following elements: 16,4,6,22,24,10,31,7,9,20,26.
Bucket Size: 3 (Assume)
Hash Function: Suppose the global depth is X. Then the Hash Function returns X LSBs.
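As a small sketch of this hash function (the function name is illustrative), the directory id is just the global-depth least significant bits of the key, which can be computed with a bit mask:

/* Directory id (sketch): keep the global-depth least significant bits.
   For key 49 (binary 110001) and global depth 3 this returns 001, i.e. 1. */
int directory_id( int key, int global_depth )
{
    return key & ( ( 1 << global_depth ) - 1 );
}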

• Solution: First, calculate the binary forms of each of the given numbers.
16- 10000
4- 00100
6- 00110
22- 10110
24- 11000
10- 01010
31- 11111

7- 00111
9- 01001
20- 10100
26- 11010
• Initially, the global-depth and local-depth is always 1. Thus, the hashing frame looks like
this:

• Inserting 16:
The binary format of 16 is 10000 and global-depth is 1. The hash function returns 1 LSB
of 10000 which is 0. Hence, 16 is mapped to the directory with id=0.

• Inserting 4 and 6:
Both 4(100) and 6(110)have 0 in their LSB. Hence, they are hashed as follows:

• Inserting 22: The binary form of 22 is 10110. Its LSB is 0. The bucket pointed by directory
0 is already full. Hence, Over Flow occurs.

• As directed by Step 7-Case 1, Since Local Depth = Global Depth, the bucket splits and
directory expansion takes place. Also, rehashing of numbers present in the overflowing
bucket takes place after the split. And, since the global depth is incremented by 1, now,the
global depth is 2. Hence, 16,4,6,22 are now rehashed w.r.t 2 LSBs.[
16(10000),4(100),6(110),22(10110) ]


*Notice that the bucket which did not overflow has remained untouched. But, since the
number of directories has doubled, we now have 2 directories, 01 and 11, pointing to the
same bucket. This is because the local depth of that bucket has remained 1, and any bucket
having a local depth less than the global depth is pointed to by more than one directory.
• Inserting 24 and 10: 24(11000) and 10 (1010) can be hashed based on directories with id
00 and 10. Here, we encounter no overflow condition.

• Inserting 31,7,9: All of these elements[ 31(11111), 7(111), 9(1001) ] have either 01 or 11
in their LSBs. Hence, they are mapped on the bucket pointed out by 01 and 11. We do not
encounter any overflow condition here.

• Inserting 20: Insertion of data element 20 (10100) will again cause the overflow problem.

• 20 is inserted in bucket pointed out by 00. As directed by Step 7-Case 1, since the local
depth of the bucket = global-depth, directory expansion (doubling) takes place along
with bucket splitting. Elements present in overflowing bucket are rehashed with the new
global depth. Now, the new Hash table looks like this:

• Inserting 26: Global depth is 3. Hence, 3 LSBs of 26(11010) are considered. Therefore 26
best fits in the bucket pointed out by directory 010.

• The bucket overflows, and, as directed by Step 7-Case 2, since the local depth of bucket
< Global depth (2<3), directories are not doubled but, only the bucket is split and elements
are rehashed.
Finally, the output of hashing the given list of numbers is obtained.

• Hashing of 11 Numbers is Thus Completed.


Key Observations:

1. A bucket will have more than one pointer pointing to it if its local depth is less than the
global depth.
2. When overflow condition occurs in a bucket, all the entries in the bucket are rehashed with
a new local depth.
3. If the local depth of the overflowing bucket equals the global depth, the directory is doubled and the bucket is split; otherwise only the bucket is split.
4. The size of a bucket cannot be changed after the data insertion process begins.
Advantages:

1. Data retrieval is less expensive (in terms of computing).


2. No problem of Data-loss since the storage capacity increases dynamically.
3. With dynamic changes in hashing function, associated old values are rehashed w.r.t the
new hash function.

Limitations Of Extendible Hashing:

1. The directory size may increase significantly if several records hash to the same
directory entry, i.e. when the record distribution is non-uniform.
2. Size of every bucket is fixed.
3. Memory is wasted in pointers when the global depth and local depth difference becomes
drastic.
4. This method is complicated to code.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
