
BSCSS 32 Data Structures

The document outlines the syllabus for the B.Sc. Computer Science course on Data Structures at Tamil Nadu Open University, detailing course objectives, outcomes, and content blocks covering linear and non-linear data structures, including arrays, linked lists, stacks, queues, trees, and graphs. It emphasizes the importance of understanding data structures for efficient algorithm design and provides a scheme of lessons along with reference books for further study. The course is designed to equip students with practical skills in implementing and analyzing various data structures and algorithms.


BACHELOR OF SCIENCE IN

COMPUTER SCIENCE

BSCSS-32: Data Structures


Semester-III

Department of Computer Science


School of Computer Science
Tamil Nadu Open University
577, Anna Salai, Saidapet, Chennai - 600 015.

www.tnou.ac.in
November 2022
Course Writer:

Dr. N.Sivashanmugam,
Assistant Professor,
Department of Computer Science,
School of Computer Sciences,
Tamil Nadu Open University,
Chennai - 600 015.

©Department of Computer Science, School of Computer Sciences,


Tamil Nadu Open University

All rights reserved. No part of this work may be reproduced in any form,
mimeograph or any other means, without permission in writing from the
Tamil Nadu Open University. Further information of the Tamil Nadu Open
University Programmes may be obtained from the University office at:

577, Anna Salai, Saidapet, Chennai - 600 015.

November 2022.

www.tnou.ac.in.
Syllabus
B.Sc Computer Science - Syllabus – III Semester (Distance Mode)
COURSE TITLE : Data structures

COURSE CODE : BSCSS-32


COURSE CREDIT : 03
COURSE OBJECTIVES
While studying Data Structures, the student shall be able to:
• Understand how to perform matrix operations using two-dimensional arrays.
• Gain knowledge of linked lists and of the linear data structures stacks and queues.
• Gain knowledge of the non-linear data structures trees and graphs.
• Understand how to perform search operations using linear search, binary search and
hashing.
• Become familiar with the sorting techniques selection sort, insertion sort, bubble sort,
quick sort, merge sort and bucket sort.
COURSE OUTCOMES
After completion of Data Structures, the student will be able to:
• Implement abstract data types for linear data structures.
• Apply the different linear and non-linear data structures to problem solutions.
• Critically analyze the various sorting algorithms.

Block – 1
Introduction to Data Structures–Linear and Non Linear Data Structures–Arrays–Types of
Arrays–Representation of One-Dimensional Array in Memory–Array Traversal–Insertion and
Deletion–Realizing Matrices using Two-Dimensional Arrays– Matrix Operations–Addition–
Subtraction–Multiplication–Transpose–Linked Lists–Representation of Linked Lists–
Advantages and Disadvantages of Linked List–Linked List Node Declaration–Linked List
Operations–Linked List Implementation–Circular Linked List Operations–Circular Linked List
Implementation–Doubly Linked List Node Declaration–Doubly Linked List Operations–Doubly
Linked List Implementation.

Block – 2
Stacks–Stack Representation in Memory–Arrays vs. Stacks–Stack Operations–Array
Implementation of Stacks–Linked Implementation of Stacks–Queues–Logical Representation of
Queues–Queue Operations–Array Implementation of Queues–Linked Implementation of
Queues–Circular Queues–Priority Queues–Double-Ended Queues.

Block – 3
Trees–Tree Terminology–Binary Tree–Array representation of Binary Tree–Linked
Representation of Binary Tree–Binary Tree Traversal–Binary Search Tree–Insert, Delete, and
Search Operations on a Binary Tree and Binary Search Tree–Expression Trees.
Block – 4
Graphs –Graph Terminology–Implementing Graphs Using Adjacency Matrix, Path Matrix and
Adjacency List–Shortest Path Algorithm–Breadth First Search and Depth First Search Traversal
of a Graph –Searching –Linear Search –Binary Search –Hashing.

Block – 5
Sorting–Selection Sort–Insertion Sort–Bubble Sort–Quick Sort–Merge Sort–Bucket Sort.

Reference books:
1. E Balagurusamy, “Data Structures”, McGraw Hill Education (India) Private Limited,
2019.
2. Mark Allen Weiss, “Data Structures and Algorithm Analysis in C++”, Third Edition,
Pearson Education, 2007.
3. Reema Thareja, “Data Structures Using C”, Second Edition, Oxford University Press,
2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed, “Fundamentals of Data Structures
in C”, Second Edition, University Press, 2008.
SCHEME OF LESSONS

Block 1: Introduction to Data Structures
Unit 1: Introduction
Unit 2: Arrays
Unit 3: Array Traversal
Unit 4: Linked Lists

Block 2: Stacks and Queues
Unit 5: Stacks
Unit 6: Queues

Block 3: Trees
Unit 7: Basics of Trees
Unit 8: Binary Tree
Unit 9: Binary Search Tree

Block 4: Graphs
Unit 10: Basics of Graphs
Unit 11: Shortest Path Algorithm
Unit 12: Graph Traversal
Unit 13: Searching Techniques

Block 5: Sorting
Unit 14: Sorting Techniques
Unit 15: Bubble Sort and Quick Sort
Unit 16: Merge Sort and Bucket Sort


Block-1: INTRODUCTION TO DATA STRUCTURES

Unit-1: Introduction
Unit-2: Arrays
Unit-3: Array Traversal
Unit-4: Linked Lists

Unit -1
Introduction

Structure

Overview
Learning objectives
1.0 Introduction to Data Structures

1.1 Types of Data Structures


1.1.1 Linear Data Structures
1.1.2 Non Linear Data Structures
Let us sum up
Check your progress
Glossary

Suggested readings
Answers to check your progress

Overview
In computer science, a data structure is defined as a group of data elements
used for organizing and storing data. To be effective, data has to be
organized in a manner that adds to the efficiency of the algorithm. This unit
deals with the different types of data structures.

Learning objectives
At the end of this unit, you will be able to
• Understand the basic definitions of algorithms and data
structures
• Understand the different types of data structures
• Understand the manipulation of linear and non-linear data
structures

1.0 Introduction to Data Structures
A data type is a well-defined collection of data with a well-defined set
of operations on it. A data structure is an actual implementation of a
particular abstract data type. In computer science, a data structure is a way
of storing data in a computer so that it can be used efficiently. Often a
carefully chosen data structure will allow a more efficient algorithm to be
used. The choice of the data structure often begins from the choice of an
abstract data structure. A well-designed data structure allows a variety of
critical operations to be performed, using as few resources, both execution
time and memory space, as possible. Data structures are implemented
using the data types, references and operations on them provided by a
programming language.
Different kinds of data structures are suited to different kinds of
applications, and some are highly specialized to certain tasks. For example,
B-trees are particularly well-suited for implementation of databases, while
routing tables rely on networks of machines to function. In the design of
many types of programs, the choice of data structures is a primary design
consideration, as experience in building large systems has shown that the
difficulty of implementation and the quality and performance of the final
result depend heavily on choosing the best data structure. After the data
structures are chosen, the algorithms to be used often become relatively
obvious. Sometimes things work in the opposite direction - data structures
are chosen because certain key tasks have algorithms that work best with
particular data structures. In either case, the choice of the appropriate data
structures is crucial.
The fundamental building blocks of most data structures are
arrays, records, discriminated unions, and references. For example, the
nullable reference, a reference which can be null, is a combination of
references and discriminated unions, and the simplest linked data structure,
the linked list, is built from the records and nullable references.

There is some debate about whether data structures represent
implementations or interfaces. How they are seen may be a matter of
perspective. A data structure can be viewed as an interface between two
functions or as an implementation of methods to access storage that is
organized according to the associated data type.
Definition of Data Structures
A data structure can be defined as a collection of data elements
whose organization is characterized by the accessing operations that are
used to store and retrieve the individual elements. A data structure can
thus be represented as follows:
Data Structures = data organization + operations
A data type describes representation, interpretation and structure of values
manipulated by algorithms or objects whereas data structures are
implemented using the data types, references and operations on them
provided by a programming language.
Some of the operations that can be performed on data structures
include the following:
Searching - Finding whether a particular element is present in the structure.
Traversal - Visiting every element in the data structure.
Insertion - Adding a new element to the structure.
Deletion - Removing a given element from the structure.
Sorting - Arranging all the elements in the structure in a particular order,
such as increasing order.
Copy - Copying the contents of one structure into another structure.
Merge - Merging the contents of two structures into one structure.
The choice of an appropriate data structure depends on the
requirement. There are usually several algorithms that solve a problem, and
the choice of an appropriate one depends on the time and space complexity
of the algorithm. The time an algorithm needs to solve a given problem,
usually expressed as a function of the input size, is known as its time
complexity, whereas its space complexity denotes its memory requirement.
1.1 Types of Data Structures
Data structures can be classified into linear and non-linear
structures, based on the scheme used to organize the related information. A
data structure is said to be linear if its elements form a sequence, that
is, the elements are arranged one after another. Examples: Arrays, Records,
Linked Lists, Stacks and Queues. A data structure is said to be non-linear
if its elements do not form a sequence. Examples: Trees and Graphs.
1.1.1 Linear Data Structures

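A data structure of this kind is called linear because its elements
are arranged one after another. Each element is connected to at most two
neighbouring elements, its previous and its next member; the first element
has no previous member and the last element has no next member. In an
array the neighbours are also adjacent in memory (contiguous), whereas in
a linked list they are connected through references. Examples of linear
data structures include:
• Array
• Stack
• Queue
• Linked List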
Arrays are the most frequently used data structure in programming.
Mathematical problems involving matrices, algebra and so on can be handled
easily with arrays. An array is a collection of homogeneous data elements
described by a single name. A linked list is an ordered set consisting of
a varying number of elements, to which insertions and deletions can be
made. A list represented by displaying the relationship between adjacent
elements is said to be a linear list. A stack is one of the most important
and useful non-primitive linear data structures in computer science. It is
an ordered collection of items in which new items are inserted and existing
items are deleted at only one end, called the top of the stack. As all
additions and deletions are made at the top of the stack, the last element
added is the first to be removed, so a stack is a last in first out (LIFO)
structure. A queue is logically a first in first out (FIFO, or first come
first served) linear data structure. It is a homogeneous collection of
elements in which new elements are added at one end, called the rear, and
existing elements are deleted from the other end, called the front.
1.1.2 Non-Linear Data Structures
In a non-linear data structure, also called a non-contiguous
structure, the data values are not arranged in a sequence. In other words,
if one element can be connected to more than two adjacent elements, the
structure is known as a non-linear data structure. Examples of non-linear
data structures include:
• Tree
• Graph
A tree is one of the important non-linear data structures in computer
science. Many real-life problems can be represented and solved using
trees. Trees are very flexible, versatile and powerful non-linear data
structures that can be used to represent data items possessing a
hierarchical relationship, such as that between a grandfather, his children,
his grandchildren and so on. A graph is another non-linear data structure
in which the data values are not arranged in a sequence. Here the elements
of the data structure are placed in such a way that each element may be
adjacent to any number of other elements.
Selection of Data Structure
There are many considerations to be taken into account when
choosing the best data structure for a specific program. They are:
• Size of the data.
• Speed and manner in which the data is used.
• Data dynamics, that is, how often the data is changed and edited.
• Size of the required storage.
• Time needed to fetch any item of information from the data structure.

Let us sum up
This unit introduced you to the essentials of data structures and
their classification into linear and non-linear structures. You were also
introduced to the various kinds of linear and non-linear data structures.
For further reading, refer to the suggested books.

Check your progress


1. The term Data means ____________ and structure means
__________
2. A step-by-step procedure is called as ___________
3. What are the five important features of an algorithm?

4. Name some of the operations performed in Data structures.


5. What are types of Data Structures?

Glossary

Linear data structures include:
• Array
• Stack
• Queue
• Linked List
Non-linear data structures include:
• Tree
• Graph

Suggested readings
1. E Balagurusamy, “Data Structures”, McGraw Hill Education
(India) Private Limited, 2019.
2. Mark Allen Weiss, “Data Structures and Algorithm Analysis in
C++”, Third Edition, Pearson Education, 2007.
3. Reema Thareja, “Data Structures Using C”, Second Edition,
Oxford University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed,
“Fundamentals of Data Structures in C”, Second Edition,
University Press, 2008.

Answers to check your progress


1. Values, the way it is organized and arranged into mathematical
and logical way

2. Algorithm
3. Finiteness, Definiteness, Input, Output, and Effectiveness
4. Searching, Traversal, Insertion, Deletion, Sorting, Copy, Merge
5. Linear Data Structures and Non-linear Data Structures

Unit -2
Arrays

Structure

Overview
Learning objectives
2.0 Introduction to Arrays

2.1 Types of Arrays


2.2 Representation of One-Dimensional Array in Memory
Let us sum up
Check your progress
Glossary
Suggested readings
Answers to check your progress

Overview
In this unit we discuss the concept of arrays and their types. This
unit also covers the representation of arrays in memory and their
implementation, with examples.

Learning objectives
At the end of this unit you will be able to
• Gain knowledge of arrays.
• Get a clear idea about the types of arrays.
• Understand the representation of a one-dimensional array in memory.

2.0 Arrays
Arrays are the simplest data structure possible, and are just
aggregates of homogeneous items. They are very similar to variables,
except that all array elements share the same variable name, but have

unique indices. Each array position acts just like a variable. It may be
assigned to, passed as an argument (by value), or accessed like a variable.
Arrays hold a series of data elements, usually of the same size and data
type. Individual elements are accessed by their position in the array. The
position is given by an index and the value is stored at the index. The index
usually uses a consecutive range of integers, but the index can have any
ordinal set of values.

2.1 Types of Arrays


Arrays can be classified as one-dimensional, two-dimensional or
multi-dimensional, depending on the number of indices needed to identify
an element. In the C language, a one-dimensional array can be declared as
follows:
int a[4] = { 3, 5, 9, 6 };
Here a is an integer array that consists of 4 values, all of which are
stored in consecutive memory locations. All values are homogeneous,
i.e. all values belong to the same type, integer. Assuming that an integer
occupies 2 bytes, 8 bytes of contiguous memory are allocated for the whole
array. To access element i of an array, the array index must be translated
into the address of the indexed element using the function defined below.

Address(i) = Base address + (i - Lb) * w

where
Address(i) = address to be computed for the indexed element
Base address = starting address of the array
i = index of the array element
Lb = lower bound, i.e. the initial or starting index of the array
w = data type length in bytes
n = number of elements in the array


For this example,
int a[4] = { 3, 5, 9, 6 };
n = 4, Base address = 1000, Lb = 0 (initial index of the array) and
w = 2 (data type length for integer).
If the index is 2, then the address of the indexed element is 1004:
Address(2) = 1000 + (2 - 0) * 2 = 1004
One-dimensional arrays of this type can be used wherever a list has
to be stored, for example a list of students in a class, a list of
employees in a company, a list of products sold, a list of customers or a
list of temperatures recorded in a particular month. The operations which
can be performed on arrays are: Traversal, Insertion, Deletion, Search,
Merging and Sorting.
Traversing an array means accessing each and every element of the
array for a specific purpose. Inserting an element into the array means
adding a new data element to an already existing array, whereas deleting
an element from the array means removing a data element from an already
existing array. Searching finds whether a particular value is present in
the array or not. If the value is present, the search is said to be
successful and the searching process gives the location of that value in
the array; otherwise it is reported as unsuccessful. Sorting arranges all
the elements of the array in a particular order, such as increasing order,
whereas the merging operation combines the contents of two arrays.
Sometimes, we need to store data in the form of matrices or tables.
Here the concept of the one-dimensional array is extended to two-dimensional
data structures. Rows and columns are the two dimensions of a
two-dimensional array. A row index and a column index are used together to
access an individual element stored in a two-dimensional array, as shown in
the figure.

Figure: Two Dimensional Array Representation

The row size of the above two-dimensional array is 3 and the column size
is also 3, so it is called a 3*3 matrix or two-dimensional array. A
two-dimensional array is declared as:

data_type array_name[row_size][column_size];

For example, in the C language, if we want to store the marks obtained by 3
students in 4 different subjects, we can declare a two-dimensional
array as:
int marks[3][4];   /* 3 rows and 4 columns */
Here, the sequential storage may follow either row major or column
major order, depending on the programming language (C uses row major
order). In row major order, all the elements of a row are stored
sequentially, which means that the elements of the first row are stored
before the elements of the second and third rows, i.e. the elements of the
array are stored row by row, and the n elements of the first row occupy
the first n locations. In column major order, the elements of the first
column are stored before the elements of the second and third columns.
A multi-dimensional array, in simple terms, is an array of arrays.
It is also called an n-dimensional array, since it has n indices. In a
multi-dimensional array, a particular element is specified by using n
subscripts as A[I1][I2][I3]..[In], where n denotes the number of
dimensions. We have thus seen that arrays can be used to implement lists
of items. The implementation of such lists can follow either static
allocation or dynamic allocation of memory.

2.2 REPRESENTATION OF MULTI-DIMENSIONAL ARRAY IN MEMORY

Let us recall that a multi-dimensional array is an array of arrays.
Unlike one-dimensional arrays, which have only one subscript, a
multi-dimensional array has multiple subscripts. For example, a
two-dimensional array, one of the most widely used instances of
multi-dimensional arrays, has two subscripts. It is used to
programmatically realize a matrix, with its first subscript representing
the row and the second subscript representing the column of the matrix.

The representation of a two-dimensional array in memory is not like
the grid-like structure of a matrix. Instead, it is the same as the
one-dimensional array representation in memory. The array elements are
stored either row by row (row major order) or column by column (column
major order). The figure illustrates these representations:

Figure: Representation of two-dimensional array in memory

As shown in the above figure, the elements of a two-dimensional
array are stored at consecutive memory locations. The only difference is in
the order in which these elements are stored in memory. In column-major
order, the elements are stored column-by-column while in row-major order
the elements are stored row-by-row. Both these memory representations
are intrinsic to a programming language and the programmer does not have
a choice of selecting a particular representation format for storing array
elements.

The formula for computing the address of a two-dimensional array
element in row major implementation is given below:

Address of A[i,j] = B + W (n (i - LBR) + (j - LBC))

Here,
1. A[ ][ ] is the two-dimensional array.
2. B is the base address.
3. W is the word size or the size of an array element.
4. n is the number of columns.
5. i, j are the row and column indices of the element.
6. LBR is the lower bound of the row index.
7. LBC is the lower bound of the column index.

Similarly, the formula for computing the address of a two-dimensional
array element in column major implementation is given below:

Address of A[i,j] = B + W (m (j - LBC) + (i - LBR))

Here, m represents the number of rows.
Example: A 10 × 12 matrix is implemented using array A[10][12]. If the
base address of the array is 200 and the word size is 2, then compute the
address of the element A[4,7] in:
a) Row major order
b) Column major order
Assume that the lower bound of both row and column indices is 1.

Solution:
(a) Row major order
Address of A[i,j] = B + W (n (i - LBR) + (j - LBC))
Address of A[4,7] = 200 + 2 (12 (4 - 1) + (7 - 1))
                  = 200 + 2 (42)
                  = 284
(b) Column major order
Address of A[i,j] = B + W (m (j - LBC) + (i - LBR))
Address of A[4,7] = 200 + 2 (10 (7 - 1) + (4 - 1))
                  = 200 + 2 (63)
                  = 326

a) Implementation of Arrays (lists) based on Static memory allocation

When the size of an array is fixed in its declaration, memory for
that array is allocated statically. Each static allocation reserves a
fixed amount of memory, specified by the size given in the declaration of
the array. The number of bytes reserved for the array cannot change during
the execution of the program. The use of a statically allocated array
implies that the list will have a fixed maximum size determined by the
variable declaration. This means that the maximum size must be determined
in advance, and the program is written to allow for this even though it
may frequently use only a small portion of the array.
For example, in the following declaration in the 'C' language, each
PERSON record holds exactly 21 marks and the array student holds exactly
100 such records; neither size can be changed during the execution of the
program, whatever the requirement.

struct person
{
int mark[21];
};
typedef struct person PERSON;
PERSON student[100];

In a list stored in this kind of array, the order of the items is
determined by the sequential positioning of the items in memory, i.e.
consecutive items in the list are stored in consecutive memory locations.
Problems arise when an ordered list is implemented using such arrays, for
instance when an item has to be added to the list while preserving the
order of the items.

An ordered list is a list in which the order of the items is significant.
However, the items in an ordered list are not necessarily sorted.
Consequently, it is possible to change the order of items and still have a
valid ordered list. For example, to add a new name at its alphabetical
position in the following list, it is necessary to copy every array element
from the insertion position onwards down one position in order to make
room for the new record, as shown in the figure. A similar problem exists
when deleting records from the list.

Before insertion          After insertion

Gopi      90              Student [0]     Gopi      90
Karthick  75              Student [1]     Karthick  75
Mani      70              Student [2]     Laxmanan  65   (new item)
Sriram    85              Student [3]     Mani      70
...                       ...
Vinoth    82              Student [98]    Tini      60
Vishnu    80              Student [99]    Vinoth    82
                          Student [100]   Vishnu    80

Figure: Inserting an element in an array


For example, if a new element has to be inserted at the ith location of an
array that holds n elements (and has room for at least one more), the
steps for inserting the element at the ith place are as follows:
1. Move the nth element to the (n+1)th place, the (n-1)th element to the
nth place, and so on, up to moving the ith element to the (i+1)th place.
2. This leaves an empty place at the ith location for the element to be
inserted.
3. Insert the element into the ith place.
Similarly, if the element at the ith location has to be deleted from an
array that holds n elements, the steps for deleting the element at the ith
place are as follows:
1. Remove the ith element and store it in value, where value is the
deleted element.
2. Move the (i+1)th element to the ith place, the (i+2)th element to the
(i+1)th place, and so on, up to moving the nth element to the (n-1)th
place.
A C program which demonstrates the implementation of arrays (lists) with
various operations is given below:
#include <stdio.h>

#define MAX 5

void insert ( int *arr, int pos, int num ) ;
void del ( int *arr, int pos ) ;
void reverse ( int *arr ) ;
void display ( int *arr ) ;
void search ( int *arr, int num ) ;

/* inserts an element num at the given position pos (1 to MAX);
   the element in the last position is pushed out of the array */
void insert ( int *arr, int pos, int num )
{
    int i ;
    if ( pos < 1 || pos > MAX )
        return ;
    /* shift elements to the right */
    for ( i = MAX - 1 ; i >= pos ; i-- )
        arr[i] = arr[i - 1] ;
    arr[i] = num ;
}

/* deletes the element at the given position pos (1 to MAX) */
void del ( int *arr, int pos )
{
    int i ;
    if ( pos < 1 || pos > MAX )
        return ;
    /* shift the elements after pos one place to the left */
    for ( i = pos ; i < MAX ; i++ )
        arr[i - 1] = arr[i] ;
    arr[i - 1] = 0 ;   /* the vacated last position is set to 0 */
}

/* reverses the entire array */
void reverse ( int *arr )
{
    int i ;
    for ( i = 0 ; i < MAX / 2 ; i++ )
    {
        int temp = arr[i] ;
        arr[i] = arr[MAX - 1 - i] ;
        arr[MAX - 1 - i] = temp ;
    }
}

/* searches the array for a given element num */
void search ( int *arr, int num )
{
    /* traverse the array */
    int i ;
    for ( i = 0 ; i < MAX ; i++ )
    {
        if ( arr[i] == num )
        {
            printf ( "\n\nThe element %d is present at %dth position.",
                     num, i + 1 ) ;
            return ;
        }
    }
    /* reached only when no element matched */
    printf ( "\n\nThe element %d is not present in the array.", num ) ;
}

/* displays the contents of the array */
void display ( int *arr )
{
    /* traverse the entire array */
    int i ;
    printf ( "\n" ) ;
    for ( i = 0 ; i < MAX ; i++ )
        printf ( "%d\t", arr[i] ) ;
}

int main ( void )
{
    int arr[MAX] = { 0 } ;   /* start with all positions set to 0 */

    insert ( arr, 1, 11 ) ;
    insert ( arr, 2, 12 ) ;
    insert ( arr, 3, 13 ) ;
    insert ( arr, 4, 14 ) ;
    insert ( arr, 5, 15 ) ;
    printf ( "\nElements of Array: " ) ;
    display ( arr ) ;

    del ( arr, 5 ) ;
    del ( arr, 2 ) ;
    printf ( "\n\nAfter deletion: " ) ;
    display ( arr ) ;

    insert ( arr, 2, 222 ) ;
    insert ( arr, 5, 555 ) ;
    printf ( "\n\nAfter insertion: " ) ;
    display ( arr ) ;

    reverse ( arr ) ;
    printf ( "\n\nAfter reversing: " ) ;
    display ( arr ) ;

    search ( arr, 222 ) ;
    search ( arr, 666 ) ;
    return 0 ;
}
The disadvantage of implementing lists with statically allocated
arrays is that the array has a fixed size, say,
int arr[100];
Whenever this declaration takes effect, consecutive space for 100
integers is allocated. It is common for a program to use only 10% or
20% of the allocated space, thereby wasting the rest. To overcome this
problem and to utilize memory efficiently, the C language provides a
mechanism for allocating memory dynamically, so that only the amount of
memory that is actually required is reserved. Space for the array is
reserved only at run time. Dynamic memory allocation is most useful in
situations in which the memory requirements are not known in advance.
b) Implementation of Arrays (lists) based on Dynamic Memory
Allocation

Dynamic memory allocation is a way to defer the decision of how
much memory is necessary until the program is actually running, or to give
back memory that the program no longer needs. The area from which the
application gets dynamic memory is called the heap. In the C language,
dynamic heap memory is allocated using the malloc() and calloc() functions,
and memory allocated through these functions is released using the free()
function.

As an alternative to static allocation, the size of a dynamic array
can be calculated at run time and the array allocated using the memory
allocation functions before the items are stored, provided the number of
records is known by then. An implementation of an array based on dynamic
memory allocation in the C language is given below:
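#include <stdio.h>
#include <stdlib.h>

int main()
{
    int N, *a, i, s = 0;
    printf("\n enter the number of elements of the array:");
    if (scanf("%d", &N) != 1 || N <= 0)
    {
        printf("\n invalid number of elements...");
        return EXIT_FAILURE;
    }
    /* reserve space for exactly N integers on the heap */
    a = (int *)malloc(N * sizeof(int));
    if (a == NULL)
    {
        printf("\n memory allocation unsuccessful...");
        exit(EXIT_FAILURE);
    }
    printf("\n enter the array elements one by one");
    for (i = 0; i < N; ++i)
    {
        if (scanf("%d", &a[i]) != 1)
            break;              /* stop at the first invalid input */
        s += a[i];
    }
    printf("\n sum is %d ", s);
    free(a);                    /* release the memory once it is not needed */
    return 0;
}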

Therefore, the advantages of arrays can be summarized as:
• Simple and easy to understand
• Follow contiguous allocation
• Fast retrieval because of their indexed nature
• No need for the user to worry about the allocation and de-allocation
of static arrays (dynamically allocated arrays, however, must be
allocated and released explicitly)
The disadvantages of arrays can be summarized as:
1. Static arrays have the problem of fixed size/allocation.
2. For dynamic arrays, the number of records must be known before the
array is allocated.
3. Inserting or deleting an item at a specified position in an ordered
list of fixed size requires other items to be moved.
Therefore, the major drawbacks of arrays, namely consecutive
memory allocation, a fixed size to store the elements and the constraint of
knowing the exact number of elements in advance, give rise to an
alternative approach called the linked list.

Let us sum up
At the end of this unit you will be able to understand the concept of
arrays and their types. Each concept is explained with an example, so that
you can understand it easily. For further reading, refer to the suggested
books listed.

Check your progress


1. An array is used to ____________________.
2. Arrays allow ………… access.
3. ………… character arrays are used to store null-terminated strings.
Glossary
Array: A structure that holds a fixed number of equally sized data elements
of the same data type.
Heap: The area from where the application gets dynamic memory is called
the heap.
Suggested readings
1. E. Balagurusamy, “Data Structures”, McGraw Hill Education (India)
Private Limited, 2019.
2. Mark Allen Weiss, “Data Structures and Algorithm Analysis in C++”,
Third Edition, Pearson Education, 2007.
3. Reema Thareja, “Data Structures Using C”, Second Edition,
Oxford University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed,
“Fundamentals of Data Structures in C”, Second Edition, University
Press, 2008.

Answers to check your progress


1. Hold a series of data elements, usually of the same size and data
type.
2. Random
3. One-dimensional

Unit -3
Array Traversal

Structure

Overview
Learning objectives
3.0 Array Traversal

3.1 Insertion and Deletion


3.2 Realizing Matrices using Two-Dimensional Arrays
3.3 Matrix Operations
Let us sum up
Check your progress
Glossary
Suggested readings
Answers to check your progress

Overview
In this unit we discuss array traversal and the insertion and deletion of
array elements. The unit also discusses realizing matrices using
two-dimensional arrays, and matrix operations. Basic terminology and
concepts are defined and relevant examples are provided.

Learning objectives
At the end of this unit, you will be able to
• Understand the concept of array traversal
• Perform insertion and deletion in arrays
• Realize matrices using two-dimensional arrays
• Explain the matrix operations

3.0 ARRAY TRAVERSAL

While working with arrays, it is often required to access the array
elements, that is, to read values from the array or store values into it.
This is achieved with the help of array traversal, which involves visiting
the array elements one after another and storing or retrieving their values.
Some of the typical situations where array traversal may be required are:

• Printing array elements,
• Searching an element in the array,
• Sorting an array, and so on

Algorithm to sequentially traverse an array:


traverse(arr[], size)
Step 1: Start
Step 2: Set i = 0
Step 3: Repeat Steps 4-5 while i < size
Step 4: Access arr[i]
Step 5: Set i = i + 1
Step 6: Stop

Example: C program to traverse each element of an array and print its value

#include <stdio.h>

void traverse(int *, int); /* Function prototype for array traversal */

int main(void)
{
    int arr[5] = {2, 6, 7, 3, 8};
    int N = 5;

    printf("Press Enter to perform array traversal and display its elements:\n\n");
    getchar(); /* Wait for the Enter key */

    traverse(arr, N); /* Calling traverse function */
    return 0;
}

void traverse(int *array, int size)
{
    int i;

    for (i = 0; i < size; i++)
        printf("arr[%d] = %d\n", i, array[i]); /* Accessing array element and printing it */
}

Output:

Press Enter to perform array traversal and display its elements:

arr[0] = 2
arr[1] = 6
arr[2] = 7
arr[3] = 3
arr[4] = 8

3.1 INSERTION AND DELETION


Insertion is the task of adding an element to an existing array, while
deletion is the task of removing an element from the array. The point of
insertion or deletion, that is, the position where an element is to be
inserted or deleted, is of vital importance, as we shall see in the
subsequent sections.

a) INSERTION

If an element is to be inserted at the end of an array, this is achieved
simply by storing the new element one position to the right of the last
element. However, the array must have a vacant position at the end for this
to be feasible. If, instead, an element is to be inserted in the middle, all
the subsequent elements must first be moved one place to the right.
Figure (a) depicts the insertion of an element into an array.

Figure: (a) Array insertion

Algorithm to perform array insertion.

The following algorithm inserts an element P at index location k in an array
A that holds N elements at indices 0 to N-1, where 0 <= k <= N. The array
must have room for at least N+1 elements.

insert(A[N], k, P)

Step 1: Start
Step 2: Set i = N - 1
Step 3: Repeat Steps 4-5 while i >= k
Step 4: Set A[i+1] = A[i]
Step 5: Set i = i - 1
Step 6: Set A[k] = P
Step 7: Set N = N + 1
Step 8: Stop
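
The insertion steps can be sketched in C as follows. This is a minimal
illustration using C's zero-based indices, not a definitive implementation.
The function name insert_at and the capacity parameter are assumptions
introduced here to guard against writing past the end of the array.

```c
/* Inserts value p at index k of array a, which currently holds *n elements
   (indices 0 .. *n - 1) and has room for `capacity` elements in total.
   Returns 0 on success, or -1 if the array is full or k is out of range. */
int insert_at(int a[], int *n, int capacity, int k, int p)
{
    int i;

    if (*n >= capacity || k < 0 || k > *n)
        return -1;

    /* Shift a[k .. *n - 1] one place to the right, starting from the end */
    for (i = *n - 1; i >= k; i--)
        a[i + 1] = a[i];

    a[k] = p; /* Store the new element in the vacated position */
    (*n)++;   /* The array now holds one more element */
    return 0;
}
```

Shifting starts from the last element so that no value is overwritten before
it has been copied one place to the right.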

b) DELETION

The deletion of elements follows a procedure similar to insertion. Deleting
an element from the end is quite simple and can be achieved by merely
decrementing the count of elements. However, to remove an element from the
middle, one must move all the elements to the right of the point of deletion
one position to the left. Figure (b) depicts the deletion of an element from
an array.

Figure (b): Array deletion

Algorithm to perform array deletion operation.

The following algorithm deletes the element at index location k in an array
A that holds N elements at indices 0 to N-1, where 0 <= k <= N-1. The deleted
element is saved in D.

delete(A[N], k)

Step 1: Start
Step 2: Set D = A[k]
Step 3: Set i = k
Step 4: Repeat Steps 5-6 while i < N-1
Step 5: Set A[i] = A[i+1]
Step 6: Set i = i + 1
Step 7: Set N = N - 1
Step 8: Stop
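
The deletion steps can be sketched in C along the same lines. This is a
minimal illustration using zero-based indices. The function name delete_at
and the output parameter deleted are assumptions introduced here.

```c
/* Deletes the element at index k of array a, which currently holds *n
   elements (indices 0 .. *n - 1). The removed value is stored in *deleted.
   Returns 0 on success, or -1 if k is out of range. */
int delete_at(int a[], int *n, int k, int *deleted)
{
    int i;

    if (k < 0 || k >= *n)
        return -1;

    *deleted = a[k]; /* Save the element being removed */

    /* Shift a[k + 1 .. *n - 1] one place to the left */
    for (i = k; i < *n - 1; i++)
        a[i] = a[i + 1];

    (*n)--; /* The array now holds one element fewer */
    return 0;
}
```

Deleting the last element (k equal to *n - 1) performs no shifting at all;
only the element count changes.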

3.2 REALIZING MATRICES USING TWO-DIMENSIONAL ARRAYS


Two-dimensional arrays are most commonly used for realizing matrices. The
first subscript signifies the row of a matrix while the second subscript
signifies the column. Operations on these array-represented matrices can be
performed through simple programming.

Figure (c): Matrix represented by two-dimensional array
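
As a small sketch of this representation, the following C code treats a
two-dimensional array as a matrix with 2 rows and 3 columns. The constants
ROWS and COLS and the function names print_matrix and row_sum are
illustrative assumptions, not part of the course text.

```c
#include <stdio.h>

#define ROWS 2
#define COLS 3

/* Prints the matrix one row per line. */
void print_matrix(int m[ROWS][COLS])
{
    int i, j;

    for (i = 0; i < ROWS; i++)     /* first subscript: row */
    {
        for (j = 0; j < COLS; j++) /* second subscript: column */
            printf("%d ", m[i][j]);
        printf("\n");
    }
}

/* Returns the sum of the elements in row r. */
int row_sum(int m[ROWS][COLS], int r)
{
    int j, sum = 0;

    for (j = 0; j < COLS; j++)
        sum += m[r][j];
    return sum;
}
```

For a matrix declared as int m[ROWS][COLS] = {{1, 2, 3}, {4, 5, 6}}, the
element m[1][2] is 6 and row_sum(m, 1) returns 15.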

3.3 MATRIX OPERATIONS


The various operations associated with matrices are:

1. Addition
2. Subtraction
3. Multiplication
4. Transpose

1. Matrix Addition

The sum of two matrices


If two matrices have the same dimensions, they may be added
together. The result is a new matrix with the same dimensions in which
each element is the sum of the corresponding elements of the two original
matrices. For example, consider the following tables:

portA:
Bond 1
Stock 2

portB:
Bond 5
Stock 2
To find the total amounts held in the two portfolios, simply add the
corresponding matrices:
portAll = portA + portB
In table form:
portAll:

Bond 6
Stock 4
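
A minimal C sketch of matrix addition is given below, using the portfolio
example as a matrix with 2 rows and 1 column. The names ROWS, COLS,
matrix_add and matrix_add_scalar are assumptions made for illustration.

```c
#define ROWS 2
#define COLS 1

/* Stores a + b in sum, adding corresponding elements. */
void matrix_add(int a[ROWS][COLS], int b[ROWS][COLS], int sum[ROWS][COLS])
{
    int i, j;

    for (i = 0; i < ROWS; i++)
        for (j = 0; j < COLS; j++)
            sum[i][j] = a[i][j] + b[i][j];
}

/* Stores a + s in out, adding the scalar s to every element. */
void matrix_add_scalar(int a[ROWS][COLS], int s, int out[ROWS][COLS])
{
    int i, j;

    for (i = 0; i < ROWS; i++)
        for (j = 0; j < COLS; j++)
            out[i][j] = a[i][j] + s;
}
```

With portA = {{1}, {2}} and portB = {{5}, {2}}, matrix_add produces
{{6}, {4}}, which matches the portAll table.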
The sum of a matrix and a scalar

It is also possible to add a constant to every element in a matrix. For
example:
portPlus = portAll + 5

Gives:
portPlus:
Bond 11
Stock 9

2. Matrix Subtraction

Matrix subtraction is like addition. Each element of one matrix is
subtracted from the corresponding element of the other. If a scalar is
subtracted from a matrix, the scalar is subtracted from every element of the
matrix. For example:
portA:

Bond 1
Stock 2

portB:
Bond 5
Stock 2

portB - portA:
Bond 4
Stock 0

portB - 1:

Bond 4
Stock 1

Other Element-by-element Operations

Addition and subtraction of matrices operate on an element-by-element
basis. In some cases it is desirable to perform multiplication, division or
exponentiation in the same manner. We follow the MATLAB convention of
preceding the relevant operator with a dot (period) to indicate that such an
element-by-element operation is desired.

Element-by-element operations with matrices


Here are examples involving vectors:
portA:
Bond 1
Stock 2

portB:
Bond 5
Stock 2

portA .* portB:
Bond 5
Stock 4

portA ./ portB:
Bond 0.2
Stock 1.0

portB .^ portA:
Bond 5
Stock 4

Element-by-element operations with a matrix and a scalar


Element-by-element operations can also be performed with a matrix
and a scalar. For example:
portA .* 5:
Bond 5
Stock 10

portA ./ 5:
Bond 0.2
Stock 0.4

portA .^ 3:
Bond 1
Stock 8
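
C has no dot operators, so element-by-element operations are written as
explicit loops. The sketch below covers multiplication and division for
vectors stored as one-dimensional arrays. The function names are
assumptions chosen for illustration.

```c
/* out[i] = a[i] * b[i] for every i (the .* operation on two vectors). */
void elementwise_multiply(const double a[], const double b[],
                          double out[], int n)
{
    int i;

    for (i = 0; i < n; i++)
        out[i] = a[i] * b[i];
}

/* out[i] = a[i] / b[i] for every i (the ./ operation on two vectors).
   The caller must make sure that no element of b is zero. */
void elementwise_divide(const double a[], const double b[],
                        double out[], int n)
{
    int i;

    for (i = 0; i < n; i++)
        out[i] = a[i] / b[i];
}

/* out[i] = a[i] * s for every i (the .* operation with a scalar). */
void scalar_multiply(const double a[], double s, double out[], int n)
{
    int i;

    for (i = 0; i < n; i++)
        out[i] = a[i] * s;
}
```

With portA = {1, 2} and portB = {5, 2}, elementwise_multiply gives {5, 4}
and elementwise_divide gives {0.2, 1.0}, as in the tables above.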

3. Matrix Multiplication

A key matrix operation is that of multiplication.


The product of two vectors
Consider the task of portfolio valuation. This requires the multiplication of
the number of shares of each security by the corresponding price per share,
then the summation of the results. A simple matrix operation can
accomplish this easily. Suppose that:
price {1*assets} =
54 21

quantity {assets*1} =
1

2
Let value be the product of price and quantity:
value = price*quantity

In this case:

value = 96

To compute the value, one multiplies matrix (here, vector) price by matrix
(here, vector) quantity.
To understand this process, it is useful to represent each number by a
symbol:
price =
p1 p2

quantity =
n1
n2

value = p1*n1 + p2*n2


The first number in price is multiplied by the first number in quantity,
then the second number in price is multiplied by the second number in
quantity. The process continues until the end is reached, at which time all
the products are summed.
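
The multiply-and-sum process just described can be sketched in C as follows.
The function name dot_product is an assumption, and both vectors are stored
as ordinary one-dimensional arrays of the same length n.

```c
/* Returns price[0]*quantity[0] + ... + price[n-1]*quantity[n-1]. */
int dot_product(const int price[], const int quantity[], int n)
{
    int i, value = 0;

    for (i = 0; i < n; i++)
        value += price[i] * quantity[i]; /* multiply pairwise, then sum */
    return value;
}
```

For price = {54, 21} and quantity = {1, 2}, the function returns
54*1 + 21*2 = 96.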
Rather clearly, this cannot be done unless the number of columns in
the first matrix equals the number of rows in the second. Put somewhat
differently, the inner dimensions of the two matrices must be the same. This
is always required in matrix multiplication and should be checked in
advance. Here:
price {1*assets} *quantity {assets*1} ===> value {1*1}
Note that the information in the curly brackets verifies that the
multiplication can take place, since the inner dimensions are the same
(assets). Such information also indicates the dimensions of the answer,
which is given by the outer dimensions (here: 1 by 1).
In general, the product obtained by multiplying two matrices will
have the same number of rows as the first matrix, and the same number of
columns as the second. For example:

{2*3} times {3*5} ==> {2*5}


{3*2} times {2*4} ==> {3*4}
{1*2} times {2*1} ==> {1*1}
The last case is the one in the example. More generally, multiplying
a row vector times a column vector always produces a scalar.
To repeat, it is good practice (and often necessary) to think about
the dimensions of an answer before performing any matrix multiplication.
When doing so, one can also check that the inner dimensions are the same,
as is required. The general scheme is:
{a*b} times {b*c} will produce {a*c}

The product of a matrix and a vector


When one or more of the matrices to be multiplied is a table, the
process is simply one of repeated vector multiplications. Consider, for
example, the determination of the value of a portfolio on three different
days (Monday, Tuesday, Wednesday). Here, there are three sets of prices.
The Price Table is:

      Bond  Stock
Mon   54    21
Tue   55    18
Wed   56    27
while the Price Matrix is:
54 21
55 18
56 27
The dimensions of Price are {days*assets} -- in this case, {3*2}.

Now, consider multiplication of Price times quantity, to obtain value:


Price {days*assets} *quantity {assets*1} ===> value {days*1}
Given the quantity vector q:

Bond 1
Stock 2
The result is the column vector:

96
91
110
The first number in the result value is obtained by multiplying the
vector in the top row of matrix Price by the column vector quantity, giving
the same result as before. The second number in the result is obtained by
multiplying the vector in the second row of matrix Price by the column
vector quantity, and so on. Using symbols:
Price =

p11 p12
p21 p22
p31 p32

quantity =
n1

n2

value =
p11*n1 + p12*n2
p21*n1 + p22*n2
p31*n1 + p32*n2

Recall that value is {days*1}. Hence, the associated table is:


Mon 96
Tue 91
Wed 110
The value of the portfolio was $96 on Monday, $91 on Tuesday, and $110
on Wednesday.
The product of two matrices

When two tables are multiplied, the process is simply expanded,
with each column of the result obtained by using the corresponding column
of the second matrix. For example, consider the task of finding the values of
two portfolios on each of three days. In this case, Quantity is itself a
matrix. In table form:
        Port A  Port B
Bond    1       5
Stock   2       2

In matrix form:
1 5
2 2

The product is a matrix showing the value of each portfolio on each of the
three days:
Price {days*assets} * Quantity {assets*portfolios}
===> Value {days*portfolios}

In table form:
      Port A  Port B
Mon   96      312
Tue   91      311
Wed   110     334
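
The product of two matrices can be sketched in C with three nested loops.
The constants DAYS, ASSETS and PORTFOLIOS mirror the dimensions of this
example, and the function name matrix_multiply is an assumption made for
illustration.

```c
#define DAYS 3
#define ASSETS 2
#define PORTFOLIOS 2

/* Computes value = price * quantity, where price is {DAYS*ASSETS} and
   quantity is {ASSETS*PORTFOLIOS}, so that value is {DAYS*PORTFOLIOS}. */
void matrix_multiply(int price[DAYS][ASSETS],
                     int quantity[ASSETS][PORTFOLIOS],
                     int value[DAYS][PORTFOLIOS])
{
    int i, j, k;

    for (i = 0; i < DAYS; i++)
    {
        for (j = 0; j < PORTFOLIOS; j++)
        {
            value[i][j] = 0;
            /* Row i of price times column j of quantity */
            for (k = 0; k < ASSETS; k++)
                value[i][j] += price[i][k] * quantity[k][j];
        }
    }
}
```

The inner loop runs over the shared inner dimension (ASSETS), which is why
the number of columns of the first matrix must equal the number of rows of
the second.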

Matrix Inversion
Thus far, we have not discussed matrix division; only array division.
There is a matrix construct similar to that of division, and it is central to
much of the work of the Analyst. The key ingredient is the use of
the inverse of a matrix, to which we now turn.

First, a few preliminaries.


A square matrix has the same number of rows and columns.
An identity matrix is a square matrix with ones on the diagonal from upper
left to lower right and zeros elsewhere. For example:
I =
1 0 0
0 1 0
0 0 1
Such a matrix is often denoted I.
The product of an identity matrix (of the right size) and a column
vector is the column vector, as can be seen by applying the rules for matrix
multiplication. Thus, if:
v =
3
4

I*v ==> v
(read: I times v gives v).
More generally, the product of an identity matrix and any matrix M
with the same number of rows as the identity matrix will be the original
matrix:
I*M ==> M
as can be seen by working through the operations involved in matrix
multiplication.
The inverse of a square matrix is a matrix of the same size that,
when multiplied by the matrix, gives an identity matrix of the same size. The
inverse of a matrix is sometimes written with a "-1" superscript. We use
instead the more computer-friendly MATLAB form:
inv(M)
where M is a square matrix.

By definition:
inv(M)*M = I
Note that only square matrices can have inverses (although not all do).
To see why matrix inversion is similar to division, consider a {1*1}
matrix -- i.e. a scalar -- with a value of 5. The identity matrix of the same
size will also be a scalar, and in this case the single value 1. From this it
follows that the inverse of the original matrix (scalar) will be the reciprocal of
its value. Thus:
(1/5)*5 = 1

Multiplication by the inverse of a matrix is like dividing by the matrix,
although this is strictly true only if the matrix is {1*1}.
Solving Simultaneous Linear Equations
Matrix inversion is often used to solve a set of simultaneous linear
equations. Consider a situation in which there are two states of the world
("weather is good", "weather is bad") and two securities (Bond, Stock).
Matrix Payoff {states*assets} shows the payments made by each security in
each state of the world. Vector quantity {assets*1} shows the composition of
a portfolio. Vector result {states*1} shows the payments that will be received
from the portfolio in each possible state of the world. Below, we show all
three in table form:
Payoff:

      Bond  Stock
good  60    40
bad   60    10

quantity:
Bond 1
Stock 2

result = Payoff*quantity:

good 140
bad 80
Thus the portfolio will provide $140 if the weather is good. If the
weather is bad it will only provide $80.
Now, assume that an investor would like to receive $240 if the
weather is good and $150 if the weather is bad. The problem is to
determine the portfolio (quantity) that will produce the desired payment
vector.
Consider the equation for the computation:
Payoff*quantity = result
Note that Payoff is square, so it is possible to compute its inverse,
barring complications to be discussed later. We multiply both sides of the
equation by this inverse (a "legal" matrix operation):
inv(Payoff)*Payoff*quantity = inv(Payoff)*result
But the product of the inverse and the original matrix is the identity
matrix, so:
I*quantity = inv(Payoff)*result
But the product of an identity matrix and a vector is the vector. Thus:
quantity = inv(Payoff)*result
This is precisely what we want: an equation for a portfolio (quantity) that
will provide the desired set of cash flows (result).
The three components are shown below, with quantity being the computed
result:

result:
good 240
bad 150
inv(Payoff):
-0.0056 0.0222
0.0333 -0.0333
quantity:
Bond 2
Stock 3

Thus the desired result can be achieved with a portfolio of 2 bonds
and 3 stocks.
Any set of simultaneous linear equations for which there is a unique
solution can be solved in this manner. It may seem that the requirement that
the matrix of coefficients be square is overly restrictive. However, solving
a set of such equations for a unique solution requires precisely as many
equations as there are unknowns, so the matrix of "left-hand sides" (here,
Payoff) must have as many rows (equations) as it does columns (variables).
Unfortunately, sometimes this won't work. It is impossible to take the
inverse of some matrices, even though they are square. In such cases the
matrix in question is said to be singular. In typical investment applications
this will occur when a strategy is not truly independent and can be provided
with some combination of other included strategies. When this occurs, the
programming system being used is likely to complain that it cannot take the
needed inverse because the matrix in question is singular (or very nearly
so). This is a signal that the economics of the original problem formulation
need to be re-examined.
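
For the {2*2} case used in this example, the inverse can be written down
directly from the determinant. The C sketch below is an illustration for
2-by-2 matrices only, not a general inversion routine. The function names
and the tolerance used to detect a (nearly) singular matrix are assumptions.

```c
/* Computes the inverse of the 2x2 matrix m into inv.
   Returns 0 on success, or -1 if m is singular (or very nearly so). */
int invert_2x2(double m[2][2], double inv[2][2])
{
    double det = m[0][0] * m[1][1] - m[0][1] * m[1][0];

    if (det > -1e-12 && det < 1e-12)
        return -1; /* no inverse exists */

    inv[0][0] =  m[1][1] / det;
    inv[0][1] = -m[0][1] / det;
    inv[1][0] = -m[1][0] / det;
    inv[1][1] =  m[0][0] / det;
    return 0;
}

/* Multiplies the 2x2 matrix m by the column vector v; result goes in out. */
void multiply_2x2_vector(double m[2][2], double v[2], double out[2])
{
    out[0] = m[0][0] * v[0] + m[0][1] * v[1];
    out[1] = m[1][0] * v[0] + m[1][1] * v[1];
}
```

For Payoff = {{60, 40}, {60, 10}} the determinant is 60*10 - 40*60 = -1800,
which gives the inverse shown earlier (approximately -0.0056, 0.0222,
0.0333, -0.0333). Multiplying that inverse by the result vector {240, 150}
gives the quantity vector {2, 3}.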
4. The Transpose of a Matrix

It is not unusual to find that a matrix is the "wrong way around" for a
needed calculation. More precisely, its rows should be columns and its
columns should be rows. Happily, there is a standard operation that "turns
around" a matrix (or vector).
The transpose of a matrix is, in effect, the matrix turned around in this
manner. For example, if M is:
1 2 3
4 5 6
then M' (read: M-prime or M-transpose) is:
1 4
2 5
3 6
This is sometimes denoted by appending a "T" as a superscript
after M, but we will use the MATLAB version M'.
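
In C, the transpose can be sketched by copying element (i, j) of the
original matrix to element (j, i) of the result. The constants ROWS and COLS
and the function name transpose are assumptions made for illustration.

```c
#define ROWS 2
#define COLS 3

/* Stores the transpose of m (ROWS x COLS) in t (COLS x ROWS). */
void transpose(int m[ROWS][COLS], int t[COLS][ROWS])
{
    int i, j;

    for (i = 0; i < ROWS; i++)
        for (j = 0; j < COLS; j++)
            t[j][i] = m[i][j]; /* rows become columns */
}
```

Applied to M = {{1, 2, 3}, {4, 5, 6}}, the result is
{{1, 4}, {2, 5}, {3, 6}}.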

Multiple Operations
To facilitate an exposition, we have generally restricted our
examples to one matrix or array operation. Sometimes we have put the
result on the left; and sometimes on the right. Moreover, we have used an
arrow when it appeared useful and an equality sign at other times. When
writing commands to be executed by a programming system, of course,
rather strict rules of syntax must be followed. Generally, the result must be
written first, followed by an equality sign, followed by
an expression indicating the desired computations. Such expressions can
include multiple matrix and/or array operations, if desired. For example:
D = inv(A)*(b*c)
This would be perfectly legal if the dimensions of A, b and c were
appropriate. The sense of the equality sign is that of assignment. Thus the
statement really says: "D should be assigned the result obtained by
multiplying the inverse of A times the product of b and c."

Statements such as this, which are designed to be operated on by a
programming system, are generally written without bold fonts, since such
subtleties would be lost on the processor, even if they could be presented to
it.
Let us sum up
In this unit you have learned the concepts of array traversal, insertion
and deletion in arrays, the realization of matrices using two-dimensional
arrays, and the matrix operations. Each concept has been explained with an
example so that it can be understood easily. For further reading, refer to
the suggested books listed below.
Check your progress
1. ________arrays are most commonly used for realizing matrices.
2. Addition and subtraction of matrices operate on an _______basis.
3. A ________ has the same number of rows and columns.
4. __________is often used to solve a set of simultaneous linear
equations.
Glossary
Insertion and deletion in array
Matrix operations in arrays
Suggested readings
1. E. Balagurusamy, “Data Structures”, McGraw Hill Education
(India) Private Limited, 2019.
2. Mark Allen Weiss, “Data Structures and Algorithm Analysis in
C++”, Third Edition, Pearson Education, 2007.
3. Reema Thareja, “Data Structures Using C”, Second Edition,
Oxford University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed,
“Fundamentals of Data Structures in C”, Second Edition,
University Press, 2008.
Answers to check your progress
1. Two-dimensional
2. element-by-element
3. square matrix
4. Matrix inversion

Unit - 4
Linked lists

Structure

Overview
Learning objectives
4.0 Linked Lists

4.0.1 Representation of Linked Lists


4.0.2 Advantages and Disadvantages of Linked List
4.1 Linked List Node Declaration
4.1.1 Linked List Operations
4.1.2 Linked List Implementation
4.2 Circular Linked List Operations
4.2.1 Circular Linked List Implementation
4.3 Doubly Linked List Node Declaration
4.3.1 Doubly Linked List Operations

4.3.2 Doubly Linked List Implementation


Let us sum up
Check your progress
Glossary
Suggested readings
Answer to check your progress

Overview
This unit explains the concepts of linked lists and their types. It also
presents the operations on and implementation of the various types of linked
lists. All the basic terminology and concepts are defined and relevant
examples are provided.

Learning objectives
After learning this unit, you will be able to
• Understand linked lists and their implementation and operations.
• Gain knowledge of singly linked lists and their implementation and
operations.
• Get a clear idea of circular linked lists and their implementation
and operations.
• Know the concepts of doubly linked lists and their implementation
and operations.

4.0 Linked list


In real-world systems, the number of elements is most often not
known in advance. A linked list allows data to be inserted into and
deleted from a list without physically moving any data and without
knowing the maximum size in advance. Additionally, if the linked list is
implemented using dynamic data structures (dynamic memory allocation),
the list size can vary as the program executes. A linked list can be
visualized as a train, or a sequence of nodes, in which each node contains
one or more data fields and a pointer (link) to the next node. In other
words, a linked list is a linear collection of data elements called nodes.
Each node consists of two parts: the first part contains the
information/data, whereas the second part contains the address/link of the
next node in the list. Unlike an array, a linked list does not store its
nodes in consecutive memory locations.
The nodes can be stored anywhere in memory, and each is located
through the address stored in the link part of the previous node. A linked
list can be accessed only sequentially, but once the point of insertion or
deletion has been reached, a node can be inserted or deleted there in
constant time. Another advantage of a linked list over a fixed-size array is
that elements can keep being added for as long as memory is available.
A linked list is efficient for the following reasons:
• Insert and delete operations take O(1) time once the position is known
• Searching takes O(n) time
• Any number of elements can be added to the list (variable size)
• It does not require consecutive memory locations like an array.
Here O denotes Big O notation (also called Landau notation or
asymptotic notation), a mathematical notation used to describe the
asymptotic behavior of functions. It gives an asymptotic upper bound, which
is useful in the analysis of the complexity of algorithms. A linked list
does incur a memory overhead for the link stored in each node, but memory is
allocated only for the entries that are actually present.
The different types of linked lists are singly linked lists, doubly
linked lists, circular linked lists and circular doubly linked lists. Of
these, the singly, doubly and circular linked lists are discussed in detail
in the following sections.

4.0.1 Representation of Linked Lists


In the figure below there are 4 nodes. The data of the nodes are A1,
A2, A3 and A4. The next part of A1 holds the address of A2. The next part
of A2 holds the address of A3. The next part of A3 holds the address of
A4. The next part of A4 contains the null value, which means that it
contains no address.

In the above figure, the address specifies where each node is
stored in memory. The node A1 is at address 550. The next part of A1
contains 670, which is the address of the next node A2. The node A2 is at
address 670. The next part of A2 contains 780, which is the address of the
next node A3. The node A3 is at address 780. The next part of A3 contains
900, which is the address of the next node A4. The next part of A4 contains
null because A4 is the last node of the linked list. Using this technique
the nodes are logically connected.
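
The chain A1, A2, A3, A4 can be sketched in C as follows. This is a minimal
illustration: the helper names make_node, list_length and free_list are
assumptions, and the actual addresses are chosen by malloc rather than being
550, 670, 780 and 900.

```c
#include <stdlib.h>

struct node
{
    const char *data;  /* information part, e.g. "A1" */
    struct node *next; /* address of the next node, or NULL */
};

/* Allocates a node holding data whose next part points to next.
   Returns NULL if memory could not be allocated. */
struct node *make_node(const char *data, struct node *next)
{
    struct node *n = (struct node *)malloc(sizeof(struct node));

    if (n != NULL)
    {
        n->data = data;
        n->next = next;
    }
    return n;
}

/* Counts the nodes reachable from first by following the next parts. */
int list_length(const struct node *first)
{
    int count = 0;

    while (first != NULL)
    {
        count++;
        first = first->next;
    }
    return count;
}

/* Releases every node of the list. */
void free_list(struct node *first)
{
    while (first != NULL)
    {
        struct node *next = first->next;
        free(first);
        first = next;
    }
}
```

Building the list from the back, a4 = make_node("A4", NULL), then
a3 = make_node("A3", a4), and so on up to a1, gives a list for which
list_length(a1) is 4 and the next part of a4 is NULL.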

4.0.2 Advantages & Disadvantages of Linked List


a) Advantages of Linked List

1. The linked list is a dynamic data structure.
2. A linked list can grow and shrink at run time; that is, memory can be
allocated and de-allocated while the program runs.
3. Insertion and deletion are easy: a node is inserted or deleted by
changing a few links.
4. Memory is well utilized in a linked list because it does not have to
be allocated in advance.
5. Once the position is known, insertion or deletion takes constant
time, because no elements have to be shifted.
6. Linear data structures such as stacks and queues can easily be
implemented using a linked list.
b) Disadvantages of Linked List
1. A linked list requires more memory to store its elements than an
array, because each node also stores a pointer to the next node.
2. Random access is not possible in a linked list. Unlike an array, a
node cannot be reached directly by its index. For example, to reach
the node at position n, we have to traverse all the nodes that come
before it, which takes a lot of time.
3. Reverse traversal is difficult in a singly linked list. Supporting it
requires a back pointer in every node, which needs more memory.

4.1 Singly Linked Lists


Declarations:
This is the simplest form of linked list. Every node consists of two
parts: the first part contains the information/data and the second part
contains the link/address of the next node in the list.
Memory is allocated dynamically on the heap for every node
whenever the requirement arises, and it is freed when the node is no longer
required. In this book, we assume that a header node is created for every
list. It is a dummy node that serves as the starting point and contains the
address of the initial node of the list. When a node is created for the
first time, the memory allocated for it becomes the initial node of the
linked list. Creating a subsequent node (node 2) does not require memory
adjacent to the previous node (node 1). Rather, the memory can be allocated
anywhere, and the address of node 2 is stored in the link/address part of
the previously allocated node (node 1). This mechanism links the new node to
the previous nodes in a linear fashion. Each node contains the address of
the node that follows it in the list. Thus the nodes can be stored in memory
in any order, which may improve the effective utilization of memory. The
last node in the list contains a special address, represented as ‘NULL’,
which indicates that there are no further nodes. Since every node in a
linked list contains a pointer to another node of the same type, a node is
also called a self-referential data type. A diagrammatic representation of
the linked list is shown below in figure 3.

Figure 3: Simple Representation of a singly linked list


In C, we will implement a linked list using the following code:
struct node
{
    int data;          /* information part */
    struct node *next; /* address of the next node */
};
typedef struct node *list;
In order to maintain a singly linked list, we need a structure called
node which has two fields named data and next. The data field stores the
information part, whereas the next field contains the address of the next
node in the sequence. It can be observed from figure 3 that the header
node contains the address of the first node in the list. We can traverse the
entire list using a pointer variable that starts from the header node and
follows the next fields until it reaches NULL.
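
The traversal just described can be sketched in C as follows. The node
structure is repeated so that the example is self-contained. The function
names print_list and count_nodes are assumptions, and L is taken to be the
address of the dummy header node.

```c
#include <stdio.h>

struct node
{
    int data;
    struct node *next;
};
typedef struct node *list;

/* Prints the data part of every node that follows the header node L. */
void print_list(list L)
{
    struct node *iter;

    for (iter = L->next; iter != NULL; iter = iter->next)
        printf("%d ", iter->data);
    printf("\n");
}

/* Returns the number of nodes in the list, not counting the header. */
int count_nodes(list L)
{
    int count = 0;
    struct node *iter = L->next; /* first real node, or NULL if empty */

    while (iter != NULL)
    {
        count++;
        iter = iter->next;
    }
    return count;
}
```

Because the header is a dummy node, traversal starts from L->next, and an
empty list is simply one whose header has a NULL next field.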

4.1.1 Linked List Operations


Now let us see the operations that can be performed on a singly
linked list.
a) Insertion into a singly linked list

A new node can be inserted into a linked list in the following three
cases:
Case 1: The new node is inserted at the beginning of the list
Case 2: The new node is inserted at the end of the list
Case 3: The new node is inserted at the middle of the list
Case 1: Insertion at the beginning of the list

If we add a new node to an empty list, memory is allocated for the new
node, the data is stored in its data part, and its next part is set to NULL.
The header (start) node is then made to point to the new node, which becomes
the first/initial node of the list.
If there is an existing list and we wish to add a new node at its
beginning, memory is allocated for the new node, the value is assigned to
its data part, and its next part is made to point to the current first node
of the list. The header node is then adjusted to point to the new node, thus
making the new node the initial node of the list. This process is depicted
in figure 4.

Figure 4: Insertion into the beginning of the list

In figure 4, we wish to add a new node with the value 9 at the beginning
of the list. The changes that take place in the list are shown in the
figure: the header node now points to the newly added node with value 9.
The corresponding C routine to insert a new node at the beginning of the
list is shown below:
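void insertbeg(int num, list L)
{
    struct node *temp;

    temp = (struct node *)malloc(sizeof(struct node));
    if (temp == NULL)
        return; /* memory allocation failed */
    temp->data = num;
    if (L->next == NULL) /* empty list */
    {
        L->next = temp;
        temp->next = NULL;
    }
    else
    {
        temp->next = L->next; /* new node points to the old first node */
        L->next = temp;       /* header points to the new node */
    }
}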
In the above code, num represents the number to be inserted and L
represents the address of the header node. Memory is allocated for the new
node using malloc. If the memory allocation is successful, the number is
assigned to the data part of the newly created node. The condition
(L->next == NULL) means that the list is empty; in that case, the header
node is made to point to the new node. Otherwise, the next part of the new
node is set to the address of the current first node, and the next part of
the header node is updated to the address of the new node, so that the new
node is added at the beginning of the list.
Case 2: Insertion at the end of the list

If we wish to add a new node at the end of an existing list, memory is
allocated for the new node, the value is assigned to its data part, and its
next part is set to NULL. The next part of the last node in the existing
linked list is then adjusted to point to the new node, thus making the new
node the last node of the list. This process is depicted in figure 5.

Figure 5. Insertion at the end of the list

In figure 5, we wish to add a new node with the value 7 at the end of the
list. The final node of the existing list is made to point to the newly
added node with value 7. The corresponding C routine to insert a new node at
the end of the list is shown below:
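void insertend(int num, list L)
{
    struct node *temp, *iter;

    temp = (struct node *)malloc(sizeof(struct node));
    if (temp == NULL)
        return; /* memory allocation failed */
    temp->data = num;
    temp->next = NULL; /* the new node will be the last node */
    if (L->next == NULL) /* empty list */
    {
        L->next = temp;
    }
    else
    {
        iter = L->next;
        while (iter->next != NULL) /* stop at the last node */
        {
            iter = iter->next;
        }
        iter->next = temp; /* link the last node to the new node */
    }
}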
In the above code, num represents the number to be inserted and l
represents the address of the header node. Memory is allocated for the
new node using malloc. If the allocation succeeds, the number is assigned
to the data part of the newly created node and its ‘next’ field is set to
NULL. The condition (l->next == NULL) means that the list is empty; in that
case, the header node is made to point to the new node. Otherwise, the list
is traversed from the first node until the last node is reached, and the
‘next’ field of that last node is updated to the address of the new node.
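
The traversal described above can be tried out with the following
self-contained sketch. It assumes the node structure and the list typedef
used in this unit; the names make_header and insert_end are hypothetical and
are introduced only for illustration. By starting the walk at the header
node, the empty list needs no separate branch.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};
typedef struct node *list;

/* Hypothetical helper: create an empty list made of a header node only. */
list make_header(void)
{
    list l = malloc(sizeof(struct node));
    if (l != NULL)
        l->next = NULL;
    return l;
}

/* Append num after the current last node.
   Returns 1 on success and 0 if memory allocation fails. */
int insert_end(int num, list l)
{
    struct node *temp, *iter;

    assert(l != NULL);
    temp = malloc(sizeof(struct node));
    if (temp == NULL)
        return 0;
    temp->data = num;
    temp->next = NULL;

    iter = l;                      /* begin at the header node */
    while (iter->next != NULL)     /* walk to the last node */
        iter = iter->next;
    iter->next = temp;             /* also correct when the list is empty */
    return 1;
}
```

Each call walks the whole list, so appending n values this way takes time
proportional to n squared in total; keeping a pointer to the last node
would avoid the walk.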
Case 3: Insertion at the middle of the list

Sometimes we may wish to add a new node in the middle of an existing list
by specifying a location as an argument along with the value to be
inserted. The new value is inserted after the node at the specified
location. We traverse the list until we reach that node and then adjust the
pointers so that the new node is linked immediately after it. This process
is depicted in figure 6.

Figure 6. Insertion at the middle of the list

In figure 6, we wish to add a new node with the value 5 after the node
containing the value 10. The pointers of the node with value 10 and of the
new node are adjusted in such a way that the value 5 is placed between 10
and 12. The corresponding C routine to insert a new node in the middle of
the list is shown below:
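void insertmiddle (int num, list l, int pos)
{
    struct node *temp, *iter;
    int count = 0;

    iter = l;                           /* start at the header node */
    while (iter != NULL && count < pos) /* advance to the pos-th node */
    {
        iter = iter->next;
        count++;
    }
    if (iter == NULL)                   /* pos is beyond the end of the list */
        return;

    temp = (struct node *) malloc (sizeof(struct node));
    if (temp == NULL)                   /* memory allocation failed */
        return;
    temp->data = num;
    temp->next = iter->next;            /* new node points to the successor */
    iter->next = temp;                  /* pos-th node points to the new node */
}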
In the above code, “num” represents the number to be inserted, “pos”
represents the position of the node after which the new node is to be
inserted, and l represents the address of the header node. Memory is
allocated for the new node using malloc and, if the allocation succeeds,
the number is assigned to the data part of the newly created node. Starting
from the header node, the list is traversed in a loop until the node at the
specified position is reached. The address of the succeeding node
(iter->next) is then stored in the ‘next’ field of the new node ‘temp’, and
the address of the new node is stored in the ‘next’ field of ‘iter’. Thus,
the new node is inserted between ‘iter’ and its succeeding node.
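
The pointer exchange described above can be exercised with the sketch
below. The function name insert_after and its return convention are
assumptions made for illustration; positions are counted from the header
node, so position 0 inserts directly after the header.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};
typedef struct node *list;

/* Insert num after the pos-th node; pos = 0 inserts directly after the
   header. Returns 1 on success, 0 if pos lies beyond the end of the list
   or if memory allocation fails. */
int insert_after(int num, list l, int pos)
{
    struct node *temp, *iter = l;
    int count = 0;

    assert(l != NULL && pos >= 0);
    while (iter != NULL && count < pos) {   /* advance to the pos-th node */
        iter = iter->next;
        count++;
    }
    if (iter == NULL)
        return 0;

    temp = malloc(sizeof(struct node));
    if (temp == NULL)
        return 0;
    temp->data = num;
    temp->next = iter->next;   /* link the successor first ... */
    iter->next = temp;         /* ... then hook the new node in */
    return 1;
}
```

The order of the last two assignments matters: if iter->next were
overwritten first, the address of the succeeding node would be lost.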
b) Deletion from a singly linked list

A node can be removed/deleted from a linked list under the following
three cases/situations:
Case 1: The node is deleted from the beginning of the list
Case 2: The node is deleted from the end of the list
Case 3: The node is deleted from the middle of the list
Note that when we delete a node from the linked list, the memory of that
node is deallocated and returned to the free pool so that it can be reused
to store other programs and data. When the list has only one node, the
list becomes empty after the deletion and the next field of the header
node is set to NULL. On certain occasions, a problem called UNDERFLOW may
arise. Underflow is a condition that occurs when we try to delete a node
from a linked list that is empty, that is, when the header node already
points to NULL and there are no more nodes to delete.
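
One way to make the underflow condition visible to the caller is to report
it through a return value. The sketch below is an illustration under that
assumption; the name delete_beg and the out parameter are hypothetical and
do not appear in the routines of this unit.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};
typedef struct node *list;

/* Remove the first node of the list.
   On success, stores the removed value in *out and returns 1.
   On underflow (empty list), leaves *out untouched and returns 0. */
int delete_beg(list l, int *out)
{
    struct node *temp;

    assert(l != NULL && out != NULL);
    temp = l->next;              /* first node, or NULL if the list is empty */
    if (temp == NULL)
        return 0;                /* underflow */
    *out = temp->data;
    l->next = temp->next;        /* header now points to the second node */
    free(temp);                  /* return the memory to the free pool */
    return 1;
}
```

A caller can then test the result, for example
if (!delete_beg(l, &value)) printf("Underflow\n");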

Case 1: Deletion from the beginning of the list

If we wish to remove a node from the beginning of an existing list, the
header node is made to point to the next node in the sequence and the
memory of the removed node is deallocated. This process is depicted in
figure 7.

Figure 7. Deletion from the beginning of the list


In figure 7, we wish to delete the node with the value 12 from the
beginning of the list. After the deletion, the header node points to the
next node, with the value 10. The corresponding C routine to delete a node
from the beginning of the list is shown below:
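void deletebeg (list l)
{
    struct node *temp;

    temp = l->next;            /* first node */
    if (temp == NULL)          /* underflow: the list is empty */
        return;
    l->next = temp->next;      /* header points to the second node */
    free(temp);
}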
In the above code, l represents the address of the header node. Here, temp
stores the address of the first node, and the header node (l->next) is made
to point to the address of the next (second) node in the sequence. Finally,
the memory of the first node is freed.
Case 2: Deletion from the end of the list
If we wish to remove a node from the end of the list, the list is
traversed until its last node is reached. The memory of the last node is
deallocated, and the last but one node becomes the last node by updating
its next field to NULL. These changes are depicted in figure 8.

Figure 8. Deletion from the end of the list


In figure 8, we wish to delete the node with value 5 from the end of the
list. The node previous to the last node becomes the new last node. The
corresponding C routine to delete a node from the end of the list is shown
below:

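void deleteend (list l)
{
    struct node *temp, *r;

    temp = l;                   /* node before r; initially the header */
    r = l->next;                /* current node; initially the first node */
    if (r == NULL)              /* underflow: the list is empty */
        return;
    while (r->next != NULL)
    {
        temp = r;
        r = r->next;
    }
    temp->next = NULL;          /* last but one node becomes the last node */
    free(r);
}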
In the above code, l represents the address of the header node. While
iterating through the list, temp stores the address of the previous node
and r stores the address of the current node. When the loop ends, r denotes
the last node and temp contains the address of the last but one node. The
‘next’ field of temp is then set to NULL and the memory of node ‘r’ is freed.
Case 3: Deletion from the middle of the list

Suppose we wish to delete a node from the middle of the list that comes
after a particular given value, as shown in figure 9.

Figure 9. Deletion from the middle of the list

In figure 9, we wish to delete the node containing the value 10, which
succeeds the node with the value 4 in the middle of the list. A pointer is
moved to the node that contains the value 4, and the next field of that
node is updated to point to the successor of the node with value 10. The
node with value 10 is then freed. The corresponding C routine to delete a
node from the middle of the list is shown below:
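void deletemiddle (int num, list l)
{
    struct node *temp, *curr, *p;

    temp = l->next;
    p = l;
    while (temp != NULL && temp->data != num)
    {
        p = temp;
        temp = temp->next;
    }
    curr = p->next;             /* node holding num, or NULL if absent */
    if (curr == NULL)           /* num is not in the list */
        return;
    p->next = curr->next;       /* bypass the node to be deleted */
    free(curr);
}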

In the above code, “num” represents the number to be deleted and l
represents the address of the header node. The while loop iterates through
the list and identifies the address of the node that precedes the node to
be deleted (p), and “curr” represents the address of the node to be
deleted. The next field of the previous node (p) is updated to the address
of the node that succeeds the current node (curr). Finally, the memory of
the current node (curr) is deallocated.
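
The sketch below illustrates the same idea in a self-contained form. It is
written under the assumption that the caller wants to know whether the
value was found; the names push_front and delete_value are hypothetical
and serve only this example.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};
typedef struct node *list;

/* Hypothetical helper used to build a test list: insert at the beginning. */
int push_front(int num, list l)
{
    struct node *temp = malloc(sizeof(struct node));
    if (temp == NULL)
        return 0;
    temp->data = num;
    temp->next = l->next;
    l->next = temp;
    return 1;
}

/* Delete the first node whose data equals num.
   Returns 1 if a node was deleted and 0 if num is not in the list. */
int delete_value(int num, list l)
{
    struct node *p = l, *curr;

    assert(l != NULL);
    while (p->next != NULL && p->next->data != num)
        p = p->next;             /* p stays one node behind the candidate */
    curr = p->next;
    if (curr == NULL)
        return 0;                /* reached the end without a match */
    p->next = curr->next;        /* bypass the node */
    free(curr);
    return 1;
}
```

Because p starts at the header node, deleting the first node, a middle
node, or the last node is handled by the same code path.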
c) Traversing a Singly linked list

Traversing a singly linked list means accessing the nodes of the list in
order to perform some processing on them. The start of the list is always
marked by the header node, and the end of the list is represented by the
node whose next field contains NULL. A C routine to traverse a singly
linked list is shown below:

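void traverse (list l)
{
    struct node *temp;

    temp = l->next;                 /* first node after the header */
    while (temp != NULL)
    {
        printf("%d ", temp->data);  /* process the current node */
        temp = temp->next;          /* then move to the next node */
    }
}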
Here, l contains the address of the header node and the “temp” pointer
traverses the nodes until it reaches the end of the list. In each
iteration, it prints the content of the current node.
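
Traversal is also the basis of other simple operations. As a small
illustration, the sketch below counts the nodes of a list instead of
printing them; the function name list_length is an assumption made for this
example.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int data;
    struct node *next;
};
typedef struct node *list;

/* Count the nodes that follow the header node. */
int list_length(list l)
{
    struct node *temp;
    int count = 0;

    assert(l != NULL);
    for (temp = l->next; temp != NULL; temp = temp->next)
        count++;                 /* one visit per node */
    return count;
}
```

The header node itself is not counted, so an empty list has length 0.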
d) Searching from a singly linked list

Searching a linked list means finding whether a given value is present in
the information/data part of any node of the list. If it is present, the
searching algorithm returns the address of the node containing the value.
The corresponding C routine for searching a node in a singly linked list is
listed below:
struct node *Find (int num, list l)
{
    struct node *temp;

    temp = l->next;
    while (temp != NULL)
    {
        if (temp->data == num)
            return temp;        /* found: return the address of the node */
        else
            temp = temp->next;
    }
    return NULL;                /* not found */
}
Here, l contains the address of the header node and ‘num’ is the element
searched for. The while loop iterates through the list node by node using
the pointer ‘temp’; if the data of a node matches, the address of that node
is returned. If no match occurs, NULL is returned.
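
A variation that is sometimes convenient is to report the position of the
value rather than the address of its node. The following sketch assumes
positions are counted from 1 and uses the hypothetical name find_position;
it is an illustration, not part of the routines above.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int data;
    struct node *next;
};
typedef struct node *list;

/* Return the 1-based position of the first node holding num,
   or 0 if num does not occur in the list. */
int find_position(int num, list l)
{
    struct node *temp;
    int pos = 1;

    assert(l != NULL);
    for (temp = l->next; temp != NULL; temp = temp->next) {
        if (temp->data == num)
            return pos;
        pos++;
    }
    return 0;
}
```

In the worst case every node is examined, so the search time grows
linearly with the length of the list.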

4.1.2 Linked List Implementation


Linked list implementation involves declaring its structure and
defining its operations. The following example shows how a linked list is
implemented using C language.

Example Program 3.1 implements a linked list in C.

#include <stdio.h>
#include <stdlib.h>   /* malloc(), free(), exit() */
#include <conio.h>    /* clrscr(), getch(): non-standard, Turbo C/DOS only */

/*Linked list declaration*/
struct node
{
    int INFO;
    struct node *NEXT;
};

/*Declaring pointers to the first and last node of the linked list*/
struct node *FIRST = NULL;
struct node *LAST = NULL;

/*Declaring function prototypes for linked list operations*/
void insert(int);
int delete(int);
void print(void);
struct node *search (int);

int main(void)
{
    int num1, num2, choice;
    struct node *location;

    /*Displaying a menu of choices for performing linked list operations*/
    while(1)
    {
        clrscr();
        printf("\n\nSelect an option\n");
        printf("\n1 - Insert");
        printf("\n2 - Delete");
        printf("\n3 - Search");
        printf("\n4 - Print");
        printf("\n5 - Exit");
        printf("\n\nEnter your choice: ");
        scanf("%d", &choice);
        switch(choice)
        {
        case 1:
        {
            printf("\nEnter the element to be inserted into the linked list: ");
            scanf("%d",&num1);
            insert(num1); /*Calling the insert() function*/
            printf("\n%d successfully inserted into the linked list!",num1);
            getch();
            break;
        }
        case 2:
        {
            printf("\nEnter the element to be deleted from the linked list: ");
            scanf("%d",&num1);
            num2=delete(num1); /*Calling the delete() function */
            if(num2==-9999)
                printf("\n\t%d is not present in the linked list\n\t",num1);
            else
                printf("\n\tElement %d successfully deleted from the linked list\n\t",num2);
            getch();
            break;
        }
        case 3:
        {
            printf("\nEnter the element to be searched: ");
            scanf("%d",&num1);
            location=search(num1); /*Calling the search() function*/
            if(location==NULL)
                printf("\n\t%d is not present in the linked list\n\t",num1);
            else
            {
                if(location==LAST)
                    printf("\n\tElement %d is the last element in the list",num1);
                else
                    printf("\n\tElement %d is present before element %d in the linked list\n\t",num1,(location->NEXT)->INFO);
            }
            getch();
            break;
        }
        case 4:
        {
            print(); /*Printing the linked list elements*/
            getch();
            break;
        }
        case 5:
        {
            exit(0); /*Normal termination*/
        }
        default:
        {
            printf("\nIncorrect choice. Please try again.");
            getch();
            break;
        }
        }
    }
}

/*Insert function*/
void insert(int value)
{
    /*Creating a new node*/
    struct node *PTR = (struct node*)malloc(sizeof(struct node));

    /*Storing the element to be inserted in the new node*/
    PTR->INFO = value;

    /*Linking the new node to the linked list*/
    if(FIRST==NULL)
    {
        FIRST = LAST = PTR;
        PTR->NEXT=NULL;
    }
    else
    {
        LAST->NEXT = PTR;
        PTR->NEXT = NULL;
        LAST = PTR;
    }
}

/*Delete function*/
int delete(int value)
{
    struct node *LOC,*TEMP;

    LOC=search(value); /*Calling the search() function*/
    if(LOC==NULL) /*Element not found*/
        return(-9999);
    if(LOC==FIRST)
    {
        if(FIRST==LAST)
            FIRST=LAST=NULL;
        else
            FIRST=FIRST->NEXT;
    }
    else
    {
        for(TEMP=FIRST;TEMP->NEXT!=LOC;TEMP=TEMP->NEXT)
            ; /*Finding the node that precedes LOC*/
        TEMP->NEXT=LOC->NEXT;
        if(LOC==LAST)
            LAST=TEMP;
    }
    free(LOC); /*Releasing the memory of the deleted node*/
    return(value);
}
/*Search function*/
struct node *search (int value)
{
    struct node *PTR;

    if(FIRST==NULL) /*Checking for empty list*/
        return(NULL);

    /*Searching the linked list*/
    for(PTR=FIRST;PTR!=LAST;PTR=PTR->NEXT)
        if(PTR->INFO==value)
            return(PTR); /*Returning the location of the searched element*/

    if(LAST->INFO==value)
        return(LAST);
    else
        return(NULL); /*Returning NULL value indicating unsuccessful search*/
}

/*print function*/
void print(void)
{
    struct node *PTR;

    if(FIRST==NULL) /*Checking whether the list is empty*/
    {
        printf("\n\tEmpty List!!");
        return;
    }

    printf("\nLinked list elements:\n");

    if(FIRST==LAST) /*Checking if there is only one element in the list*/
    {
        printf("\t%d",FIRST->INFO);
        return;
    }

    /*Printing the list elements*/
    for(PTR=FIRST;PTR!=LAST;PTR=PTR->NEXT)
        printf("\t%d",PTR->INFO);
    printf("\t%d",LAST->INFO);
}

Output

Select an option
1 - Insert
2 - Delete

3 - Search
4 - Print
5 - Exit

Enter your choice: 4

Empty List!!
Select an option

1 - Insert
2 - Delete
3 - Search
4 - Print
5 - Exit

Enter your choice: 1


Enter the element to be inserted into the linked list: 1
1 successfully inserted into the linked list!
Select an option

1 - Insert

2 - Delete
3 - Search
4 - Print
5 - Exit

Enter your choice: 1

Enter the element to be inserted into the linked list: 2


2 successfully inserted into the linked list!
Select an option
1 - Insert
2 - Delete
3 - Search
4 - Print

5 - Exit
Enter your choice: 1

Enter the element to be inserted into the linked list: 3


3 successfully inserted into the linked list!
Select an option
1 - Insert
2 - Delete
3 - Search
4 - Print
5 - Exit
Enter your choice: 3
Enter the element to be searched: 5
5 is not present in the linked list
Select an option

1 - Insert
2 - Delete
3 - Search
4 - Print
5 - Exit

Enter your choice: 3


Enter the element to be searched: 2
Element 2 is present before element 3 in the linked list
Select an option
1 - Insert
2 - Delete
3 - Search

4 - Print
5 - Exit

Enter your choice: 2


Enter the element to be deleted from the linked list: 2
Element 2 successfully deleted from the linked list
Select an option
1 - Insert
2 - Delete
3 - Search
4 - Print
5 - Exit
Enter your choice: 4
Linked list elements:

1 3

Select an option

1 - Insert

2 - Delete

3 - Search

4 - Print

5 - Exit

Enter your choice: 5

4.2 Circular linked list


In a circular linked list, the last node in the list points to the first
node in the list. In other words, all the nodes are connected in a circle,
so that when we traverse the list in the forward direction we eventually
return to the node where we started. Thus, a circular linked list does not
have a beginning or an end. It is also possible to declare a circular
doubly linked list, which can be traversed in a circular fashion in both
the forward and backward directions. A disadvantage of a circular linked
list is that a traversal finds no NULL at which to stop, so it must check
explicitly for the return to its starting node; otherwise it iterates
forever. In this book, we deal with the circular singly linked list
structure in detail. A diagrammatic representation of the circular linked
list is shown below in figure 17.

Figure 17. Simple Representation of a circular singly linked list


In C, we will implement the structure of a circular linked list using the
following code:
struct node
{
int data;
struct node *next;
};
typedef struct node *list;
In order to maintain a circular linked list, we need a structure called
node which has two fields named data and next. The data field stores the
information part, whereas the next field contains the address of the next
node in the sequence. It can be observed from figure 17 that the header
node contains the address of the first node in the list. We can traverse
the entire list using a pointer variable, starting from the header node,
until it reaches the last node. Here the ‘next’ field of the last node
contains the address of the first node of the list.
4.2.1 Circular Linked List Operations

Now let us see the operations that can be performed on a circular singly
linked list to manipulate it.
a) Insertion into a Circular Singly linked list

A new node can be added/inserted into a circular singly linked list under the
following two cases/situations:
Case 1: The new node is inserted at the beginning of the list

Case 2: The new node is inserted at the end of the list

Case 1: Insertion at the beginning of the list

If a new node is added to an empty list, memory is allocated for the new
node and its data part is assigned the given value. The next field of the
new node is set to its own address, since this is the only node in the
list. The header (start) node is then made to point to the new node,
indicating that it is the first node of the list.
If we add a new node at the beginning of an existing list, memory is
allocated for the new node, the value is assigned to its data part, and its
next part is made to point to the first node of the list. The header node
is then adjusted to point to the new node. In addition, the ‘next’ field of
the last node in the list is updated with the address of the newly created
node, since the list is circular. This process is depicted in figure 18.

Figure 18. Insertion into the beginning of the list

In figure 18, we wish to add a new node with the value 19 at the beginning
of the circular list. The resulting changes in the list are shown in the
figure. The corresponding C routine to insert a new node at the beginning
of the circular singly linked list is shown below:
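void insertbeg (int num, list l)
{
    struct node *temp, *first, *end;

    temp = (struct node *) malloc (sizeof(struct node));
    if (temp == NULL)              /* memory allocation failed */
        return;
    temp->data = num;
    first = l->next;

    if (first == NULL)             /* empty list */
    {
        temp->next = temp;         /* the only node points to itself */
        l->next = temp;
    }
    else
    {
        end = first;
        while (end->next != first) /* locate the last node */
            end = end->next;
        temp->next = first;
        l->next = temp;
        end->next = temp;          /* last node points to the new first node */
    }
}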

In the above code, num represents the number to be inserted and l
represents the address of the header node. Memory is allocated for the
new node using malloc. If the allocation succeeds, the number is assigned
to the data part of the newly created node. The condition (first == NULL)
means that the list is empty. In that case, the header node is made to
point to the new node, which is also the last node of the list, so its next
field holds its own address. Otherwise, the next field of the header node
is updated to the address of the new node, showing that the new node has
been added at the beginning of the list. Similarly, the next field of the
last node of the list is updated to the address of the new node to keep the
list connected in a circular fashion.
Case 2: Insertion at the end of the list

Consider the linked list shown in figure 19, where we want to add a new
node with the value 15 at the end of the list. The resulting changes are
shown in figure 19.

Figure 19. Insertion at the end of the circular singly linked list

The corresponding C routine to insert a new node at the end of the list is
shown below:
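void insertend (int num, list l)
{
    struct node *temp, *end;

    temp = (struct node *) malloc (sizeof(struct node));
    if (temp == NULL)                  /* memory allocation failed */
        return;
    temp->data = num;

    if (l->next == NULL)               /* empty list */
    {
        l->next = temp;
        temp->next = temp;             /* the only node points to itself */
    }
    else
    {
        end = l->next;
        while (end->next != l->next)   /* locate the current last node */
            end = end->next;
        temp->next = end->next;        /* new node points to the first node */
        end->next = temp;              /* old last node points to the new node */
    }
}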
In the above code, num represents the number to be inserted and l
represents the address of the header node. Once the new node has been
allocated and assigned its data, the condition (l->next == NULL) checks
whether the list is empty. If it is, the header node is made to point to
the new node, and the new node becomes the last node as well. Otherwise,
the address stored in the ‘next’ field of the last node (the address of the
first node) is copied to the ‘next’ field of the new node (temp), and the
‘next’ field of the last node is set to the new node, making the new node
the last one in the list.
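
Insertion at the end does not have to search for the last node if the
program keeps its address. The sketch below assumes a different
representation from the one above: the list is identified by a pointer to
its last node instead of a header node. The name append_circular is
hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Append num to a circular list identified by its last node.
   last is NULL for an empty list. Returns the new last node,
   or the unchanged last pointer if memory allocation fails. */
struct node *append_circular(struct node *last, int num)
{
    struct node *temp = malloc(sizeof(struct node));

    if (temp == NULL)
        return last;
    temp->data = num;
    if (last == NULL) {
        temp->next = temp;         /* single node points to itself */
    } else {
        temp->next = last->next;   /* new node points to the first node */
        last->next = temp;         /* old last node points to the new node */
    }
    return temp;                   /* the new node is now the last node */
}
```

With this representation the first node is always last->next, so both ends
of the list can be reached in constant time.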
b) Deletion from a Circular singly linked list

A node can be removed/deleted from a circular singly linked list under
the following two cases/situations:
Case 1: The node is deleted from the beginning of the list

Case 2: The node is deleted from the end of the list
Case 1: Deletion from the beginning of the list
If we wish to remove a node from the beginning of an existing circular
singly linked list, the header node is made to point to the next node in
the sequence and the memory of the removed node is deallocated. The ‘next’
field of the last node in the list is also updated to point to the second
node, which becomes the new first node. This process is depicted in
figure 20.

Figure 20. Deletion from the beginning of the circular singly linked list

A C routine to delete a node from the beginning of the circular singly linked
list has been shown below:
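void deletebeg (list l, list end)
{
    struct node *temp;

    temp = l->next;                /* first node */
    if (temp == NULL)              /* underflow: the list is empty */
        return;
    if (temp == end)               /* single node: the list becomes empty */
    {
        l->next = NULL;
    }
    else
    {
        l->next = temp->next;      /* second node becomes the first node */
        end->next = l->next;       /* last node points to the new first node */
    }
    free(temp);
}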
In the above code, l represents the address of the header node and ‘end’
represents the address of the last node in the list. Here, temp stores the
address of the first node, and the header node (l->next) is made to point
to the address of the next (second) node in the sequence. The second node
(temp->next) thus becomes the first node of the list, and its address is
also assigned to the ‘next’ field of the last node of the list. Finally,
the memory of the old first node is freed.

Case 2: Deletion from the end of the list


If we wish to remove a node from the end of the list, the list is
traversed until its last node is reached. The memory of the last node is
deallocated, and the node preceding it is made to point to the first node.
These changes are depicted in figure 21.

Figure 21. Deletion from the end of the list

The corresponding C routine to delete a node from the end of the circular
singly linked list is shown below:
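void deleteend (list l)
{
    struct node *temp, *r;

    r = l->next;
    if (r == NULL)                 /* underflow: the list is empty */
        return;
    if (r->next == r)              /* single node: the list becomes empty */
    {
        l->next = NULL;
        free(r);
        return;
    }
    temp = r;
    while (r->next != l->next)     /* the last node points back to the first */
    {
        temp = r;
        r = r->next;
    }
    temp->next = r->next;          /* last but one node points to the first */
    free(r);
}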
In the above code, l represents the address of the header node. Here, temp
and r store the addresses of the previous and current nodes while
iterating through the list. When the loop ends, r denotes the last node and
temp contains the address of the last but one node. The ‘next’ field of
temp is then updated to the address of the first node and the memory of
node ‘r’ is freed.
c) Traversing a Circular Singly linked list

The procedure for traversing a circular singly linked list is similar to
that of a singly linked list. The difference is that the nodes are accessed
in a circular fashion: the traversal ends when it comes back to the node
where it started, instead of when it reaches NULL.
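
No traversal routine is listed for the circular case, so the following is
a minimal sketch of one, assuming the header-node representation used in
this section. The function name circular_length is hypothetical. A
do-while loop is used so that the first node is visited before the
stopping test is made.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int data;
    struct node *next;
};
typedef struct node *list;

/* Count the nodes of a circular singly linked list with a header node. */
int circular_length(list l)
{
    struct node *first, *temp;
    int count = 0;

    assert(l != NULL);
    first = l->next;
    if (first == NULL)             /* empty list */
        return 0;
    temp = first;
    do {
        count++;                   /* process the current node */
        temp = temp->next;
    } while (temp != first);       /* stop on returning to the start */
    return count;
}
```

Replacing count++ with a printf call gives a routine that prints the
nodes once around the circle.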

4.2.2 Circular Linked List Implementation

The implementation of circular linked list involves declaring its structure and
defining its operations. The following example shows how a circular linked
list is implemented in C.

Example: C program to implement a circular linked list and perform its
common operations.

#include <stdio.h>
#include <stdlib.h>   /* malloc(), free(), exit() */
#include <conio.h>    /* clrscr(), getch(): non-standard, Turbo C/DOS only */

/*Circular linked list declaration*/
struct cl_node
{
    int INFO;
    struct cl_node *NEXT;
};

/*Declaring pointers to first and last node of the list*/
struct cl_node *FIRST = NULL;
struct cl_node *LAST = NULL;

/*Declaring function prototypes for list operations*/
void insert(int);
int delete(int);
void print(void);
struct cl_node *search (int);

int main(void)
{
    int num1, num2, choice;
    struct cl_node *location;

    /*Displaying a menu of choices for performing list operations*/
    while(1)
    {
        clrscr();
        printf("\n\nSelect an option\n");
        printf("\n1 - Insert");
        printf("\n2 - Delete");
        printf("\n3 - Search");
        printf("\n4 - Print");
        printf("\n5 - Exit");
        printf("\n\nEnter your choice: ");
        scanf("%d", &choice);
        switch(choice)
        {
        case 1:
        {
            printf("\nEnter the element to be inserted into the circular linked list: ");
            scanf("%d",&num1);
            insert(num1); /*Calling the insert() function*/
            printf("\n%d successfully inserted into the linked list!",num1);
            getch();
            break;
        }
        case 2:
        {
            printf("\nEnter the element to be deleted from the circular linked list: ");
            scanf("%d",&num1);
            num2=delete(num1); /*Calling the delete() function */
            if(num2==-9999)
                printf("\n\t%d is not present in the list\n\t",num1);
            else
                printf("\n\tElement %d successfully deleted from the list\n\t",num2);
            getch();
            break;
        }
        case 3:
        {
            printf("\nEnter the element to be searched: ");
            scanf("%d",&num1);
            location=search(num1); /*Calling the search() function*/
            if(location==NULL)
                printf("\n\t%d is not present in the list\n\t",num1);
            else
                printf("\n\tElement %d is present before element %d in the circular linked list\n\t",num1,(location->NEXT)->INFO);
            getch();
            break;
        }
        case 4:
        {
            print(); /*Printing the list elements*/
            getch();
            break;
        }
        case 5:
        {
            exit(0); /*Normal termination*/
        }
        default:
        {
            printf("\nIncorrect choice. Please try again.");
            getch();
            break;
        }
        }
    }
}

/*Insert function*/
void insert(int value)
{
    /*Creating a new node*/
    struct cl_node *PTR = (struct cl_node*)malloc(sizeof(struct cl_node));

    /*Storing the element to be inserted in the new node*/
    PTR->INFO = value;

    /*Linking the new node to the circular linked list*/
    if(FIRST==NULL)
    {
        FIRST = LAST = PTR;
        PTR->NEXT=FIRST;
    }
    else /*This branch is missing from the listing; reconstructed by analogy with the singly linked version*/
    {
        LAST->NEXT = PTR;
        PTR->NEXT = FIRST;
        LAST = PTR;
    }
}

/*Delete function*/
int delete(int value)
{
    struct cl_node *LOC,*TEMP;

    LOC=search(value); /*Calling the search() function*/
    if(LOC==NULL) /*Element not found*/
        return(-9999);
    if(LOC==FIRST)
    {
        if(FIRST==LAST)
            FIRST=LAST=NULL;
        else
        {
            FIRST=FIRST->NEXT;
            LAST->NEXT=FIRST;
        }
    }
    else
    {
        for(TEMP=FIRST;TEMP->NEXT!=LOC;TEMP=TEMP->NEXT)
            ; /*Finding the node that precedes LOC*/
        if(LOC==LAST)
        {
            LAST=TEMP;
            TEMP->NEXT=FIRST;
        }
        else
            TEMP->NEXT=LOC->NEXT;
    }
    free(LOC); /*Releasing the memory of the deleted node*/
    return(value);
}

/*Search function*/
struct cl_node *search (int value)
{
    struct cl_node *PTR;

    if(FIRST==NULL) /*Checking for empty list*/
        return(NULL);

    if(FIRST==LAST && FIRST->INFO==value) /*Checking if there is only one element in the list*/
        return(FIRST);

    /*Searching the linked list*/
    for(PTR=FIRST;PTR!=LAST;PTR=PTR->NEXT)
        if(PTR->INFO==value)
            return(PTR); /*Returning the location of the searched element*/

    if(LAST->INFO==value)
        return(LAST);
    else
        return(NULL); /*Returning NULL value indicating unsuccessful search*/
}
/*print function*/
void print(void)
{
    struct cl_node *PTR;

    if(FIRST==NULL) /*Checking whether the list is empty*/
    {
        printf("\n\tEmpty List!!");
        return;
    }
    printf("\nCircular linked list elements:\n");

    if(FIRST==LAST) /*Checking if there is only one element in the list*/
    {
        printf("\t%d",FIRST->INFO);
        return;
    }

    /*Printing the list elements*/
    for(PTR=FIRST;PTR!=LAST;PTR=PTR->NEXT)
        printf("\t%d",PTR->INFO);
    printf("\t%d",LAST->INFO);
}

Output

Select an option
1 - Insert

2 - Delete
3 - Search
4 - Print

5 - Exit

Enter your choice: 4


Empty List!!

Select an option
1 - Insert
2 - Delete
3 - Search
4 - Print
5 - Exit
Enter your choice: 1

Enter the element to be inserted into the circular linked list: 1


1 successfully inserted into the linked list!

Select an option
1 - Insert
2 - Delete

3 - Search
4 - Print
5 - Exit
Enter your choice: 1
Enter the element to be inserted into the circular linked list: 2
2 successfully inserted into the linked list!

Select an option
1 - Insert
2 - Delete

3 - Search
4 - Print
5 - Exit
Enter your choice: 1
Enter the element to be inserted into the circular linked list: 3
3 successfully inserted into the linked list!

Select an option
1 - Insert
2 - Delete
3 - Search
4 - Print

5 - Exit
Enter your choice: 3
Enter the element to be searched: 2
Element 2 is present before element 3 in the circular linked list
Select an option
1 - Insert

2 - Delete
3 - Search
4 - Print
5 - Exit
Enter your choice: 3
Enter the element to be searched: 3
Element 3 is present before element 1 in the circular linked list

1 3
Select an option

1 - Insert
2 - Delete
3 - Search
4 - Print
5 - Exit
Enter your choice: 5
Advantages of a Circular linked list
1. Any node can be a starting point. We can traverse the whole list by
starting from any point. We just need to stop when the first visited
node is visited again.
2. Useful for the implementation of a queue. Unlike a queue built on an
ordinary singly linked list, we don’t need to maintain two pointers for
the front and rear if we use a circular linked list. We can maintain a
pointer to the last inserted node, and the front can always be
obtained as the next of last.
3. Circular lists are useful in applications to repeatedly go around the
list. For example, when the multiple applications are running on a
PC, it is common for the operating system to put the running
applications on a list and then to cycle through them, giving each of
them a slice of time to execute, and then making them wait while the
CPU is given to another application. It is convenient for the operating
system to use a circular list so that when it reaches the end of the
list it can cycle around to the front of the list.
4. Circular Doubly Linked Lists are used for implementation of
advanced data structures like Fibonacci Heap.
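
The queue idea in point 2 can be sketched as follows. This is an
illustration under the stated assumption that only a pointer to the rear
node is stored; the type name queue and the functions enqueue and dequeue
are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Hypothetical queue type: only the rear (last inserted) node is stored.
   The front of the queue is always rear->next. */
struct queue {
    struct node *rear;
};

/* Add value at the rear. Returns 1 on success, 0 if allocation fails. */
int enqueue(struct queue *q, int value)
{
    struct node *temp;

    assert(q != NULL);
    temp = malloc(sizeof(struct node));
    if (temp == NULL)
        return 0;
    temp->data = value;
    if (q->rear == NULL) {
        temp->next = temp;             /* single node points to itself */
    } else {
        temp->next = q->rear->next;    /* new node points to the front */
        q->rear->next = temp;
    }
    q->rear = temp;
    return 1;
}

/* Remove the front value into *out. Returns 0 if the queue is empty. */
int dequeue(struct queue *q, int *out)
{
    struct node *front;

    assert(q != NULL && out != NULL);
    if (q->rear == NULL)
        return 0;
    front = q->rear->next;
    *out = front->data;
    if (front == q->rear)
        q->rear = NULL;                /* the queue becomes empty */
    else
        q->rear->next = front->next;   /* rear points to the new front */
    free(front);
    return 1;
}
```

Both operations touch only the rear node and its successor, so neither
needs to traverse the list.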

4.3 Doubly linked list


Declarations

A doubly linked list or a two-way linked list is a more complex type of
linked list in which each node contains the address of the next node as
well as the previous node in the sequence. Therefore, a node consists of
three parts: the data, a pointer to the previous node and a pointer to the
next node.
In a doubly linked list, we assume that a header node is created as a
dummy node/starting point, whose ‘next’ field contains the address of the
first node of the list and whose ‘prev’ field is NULL. The first node of
the list contains the address of the succeeding node in its next field and
NULL in its prev field. The final node in the list contains NULL in its
next field. Every other node in the list contains the addresses of both the
previous node and the succeeding node. Thus a doubly linked list makes it
easier to manipulate the elements of the list, as it maintains pointers to
nodes in both directions (forward and backward). The main advantage of
using a doubly linked list is that the predecessor of any node can be
reached directly through its prev field, so the list can be traversed
backward as well as forward without starting again from the header. A
diagrammatic representation of the doubly linked list is shown in the
figure below.

Figure: Simple Representation of a Doubly linked list

In C, we will implement the structure of the doubly linked list as follows:


struct node
{
struct node *prev;
int data;
struct node *next;

};
typedef struct node *list;
In the above structure, ‘prev’ holds the address of the previous node and
‘next’ holds the address of the next node. The prev field of the first node
and the next field of the last node contain NULL.
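
To illustrate what the ‘prev’ field makes possible, the sketch below walks
a doubly linked list backward from its last node. It assumes, as stated
above, that the prev field of the first node is NULL; the function name
copy_backward is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *prev;
    int data;
    struct node *next;
};

/* Copy up to cap values into out, starting at the last node and following
   the prev pointers toward the first node. Returns the number copied. */
int copy_backward(const struct node *last, int *out, int cap)
{
    int n = 0;

    assert(out != NULL && cap >= 0);
    while (last != NULL && n < cap) {
        out[n++] = last->data;
        last = last->prev;       /* step to the previous node */
    }
    return n;
}
```

In a singly linked list the same result would require either a second
pass or extra storage, because a node holds no reference to its
predecessor.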

4.3.1 Doubly Linked List Operations

a) Insertion into a Doubly linked list

A new node can be added/inserted into a linked list under the following
three cases/situations:
Case 1: The new node is inserted at the beginning of the list

Case 2: The new node is inserted at the end of the list


Case 3: The new node is inserted at the middle of the list
Case 1: Insertion at the beginning of the doubly linked list

If we wish to add a new node at the beginning of the list, memory is
allocated for the new node, the value is assigned to its data part, its
‘next’ field is made to point to the first node of the list and its ‘prev’
field is set to NULL. The header node is then adjusted to point to the new
node, making the new node the first node of the list. This process is
depicted in the figure.

Figure: Insertion into the beginning of the list


In the figure, we wish to add a new node with the value 6 at the beginning
of the list. After the insertion, the header node points to the newly added
node with value 6. The corresponding C routine to insert a new node at the
beginning of the list is shown below:
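void insertbeg (int num, list l)
{
    struct node *tmpcell;

    tmpcell = (struct node *) malloc (sizeof(struct node));
    if (tmpcell == NULL)               /* memory allocation failed */
        return;
    tmpcell->data = num;
    if (l->next == NULL)               /* empty list */
    {
        tmpcell->next = NULL;
        tmpcell->prev = l;
        l->next = tmpcell;
    }
    else
    {
        tmpcell->next = l->next;       /* new node points to the old first node */
        l->next->prev = tmpcell;       /* old first node points back to it */
        l->next = tmpcell;             /* header points to the new node */
        tmpcell->prev = l;
    }
}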
In the above code, num represents the number to be inserted and l
represents the address of the header node. Memory is allocated for the
new node using malloc. If the allocation succeeds, the number is assigned
to the data part of the newly created node. The condition (l->next == NULL)
means that the list is empty. In that case, the header node is made to
point to the new node, and the ‘next’ field of the new node is NULL.
Otherwise, the ‘next’ field of the new node receives the address of the old
first node, the ‘prev’ field of the old first node is set to the new node,
and the address of the new node is stored in the ‘next’ field of the header
node (l->next = tmpcell), showing that the new node has been added at the
beginning of the list. In both cases, the ‘prev’ field of the new node
points to the header node.
Case 2: Insertion at the end of the doubly linked list

If we wish to add a new node at the end of the doubly linked list,
then the following changes will happen as shown in the figure.

Figure: Insertion at the end of the doubly linked list

In the above figure, we wish to add a new node at the end of the list
with the value of 10. Now the final node of the existing list will point to the
newly added node with value 10. Correspondingly, the C routine to insert a
new node at the end of the list has been shown below:
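void insertend (int num, list l)
{
    struct node *temp, *ptr;
    temp = (struct node *) malloc(sizeof(struct node));
    if (temp == NULL)             /* memory allocation failed */
        return;
    temp->data = num;
    ptr = l;
    while (ptr->next != NULL)     /* move to the last node */
    {
        ptr = ptr->next;
    }
    temp->next = ptr->next;       /* NULL, since ptr is the last node */
    ptr->next = temp;
    temp->prev = ptr;
}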

In the above code, num represents the number to be inserted and l
represents the address of the header node. Memory is allocated for the new
node using malloc. If the allocation is successful, the number is assigned
to the data part of the new node. Starting from the header node, the list
is traversed using the pointer ptr until it reaches the end of the list.
Once the end of the list has been reached (ptr->next == NULL), the new node
is added there by setting its ‘next’ field to NULL and its ‘prev’ field to
the address of the node pointed to by ptr. The ‘next’ field of the node
pointed to by ptr is updated to the address of the new node.
Case 3: Insertion at the middle of the doubly linked list

Consider the doubly linked list shown in the figure and we wish to add a
new node at the middle of the list with the value 3 after the node containing
the value 5. This process is clearly depicted in the figure.

Figure: Insertion at the middle of the doubly linked list
In this figure, the pointers of the node with value 5 and its succeeding node
have been adjusted in such a way that the value 3 is inserted between
5 and 7. The following C routine shows how to insert a new node in the
middle of the list after a given position:
void insertmiddle (int num, list l, int pos)
{
    struct node *tmpcell, *temp;
    int count = 0;
    tmpcell = (struct node *) malloc(sizeof(struct node));
    if (tmpcell == NULL)          /* memory allocation failed */
        return;
    tmpcell->data = num;
    temp = l;
    while (temp->next != NULL && count < pos)
    {
        count++;
        temp = temp->next;
    }
    tmpcell->next = temp->next;
    if (temp->next != NULL)       /* temp is not the last node */
        temp->next->prev = tmpcell;
    tmpcell->prev = temp;
    temp->next = tmpcell;
}
In the above code, num represents the number to be inserted, pos
represents the position/location of the node after which the new node needs
to be inserted and l represents the address of the header node. Memory
is allocated for the new node using malloc. If the allocation is
successful, the number is assigned to the data part of the new node.
Starting from the header node, the list is traversed by the while loop
until it reaches the specified location (temp->next != NULL && count < pos).
Once the location is reached, the new node is inserted immediately after the
‘temp’ node. Correspondingly, the ‘next’ field of tmpcell is set to the node
that followed ‘temp’ and its ‘prev’ field is set to ‘temp’.
b) Deletion from a Doubly linked list
A node can be removed/deleted from a doubly linked list under the
following three cases/situations:
Case 1: The node is deleted from the beginning of the list
Case 2: The node is deleted from the end of the list
Case 3: The node is deleted from the middle of the list
Similar to a singly linked list, when we delete a node, the memory of that
node is deallocated and returned to the free pool so that it can be reused
to store other programs and data. When the list has only one node, the list
becomes empty after deletion and the header node is made to point to NULL.
Case 1: Deletion from the beginning of the doubly linked list

Consider the doubly linked list shown in figure. When we want to delete a
node from the beginning of the list, then the following changes will be done
in the linked list.

Figure: Deletion from the beginning of the doubly linked list


In the above figure, we wish to delete the node with the value 10 from the
beginning of the list. The memory occupied by the first node of the list
is freed and the header node points to the next node, with the value 15.
A C routine which illustrates the deletion of a node from the beginning of
the list is shown below:
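void deletebeg (list l)
{
    struct node *temp;
    temp = l->next;
    if (temp == NULL)             /* empty list: nothing to delete */
        return;
    l->next = temp->next;
    if (temp->next != NULL)       /* the list had more than one node */
        temp->next->prev = l;
    free(temp);
}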
In the above code, l represents the address of the header node. Here, temp
stores the address of the first node and the header node (l->next) is
made to point to the address of the next (second) node in the sequence.
Meanwhile, the ‘prev’ field of the second node in the sequence
(temp->next->prev) is made to point to the header node. Finally, the memory
of the first node is freed.
Case 2: Deletion from the end of the doubly linked list
If we wish to remove a node from the end of the doubly linked list, the list
will be iterated till it reaches the end of the list. The last node of the list gets
deleted as shown in figure 15.

Figure: Deletion from the end of the doubly linked list

In the above figure, we wish to delete a node with value 15 from the end of
the list. Here, the node which is preceding the last node becomes the last
node now. Correspondingly, the C routine to delete a node from the end of
the list has been shown below:
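void deleteend (list l)
{
    struct node *temp, *r;
    if (l->next == NULL)          /* empty list: nothing to delete */
        return;
    temp = l;                     /* node preceding r */
    r = l->next;                  /* current node */
    while (r->next != NULL)
    {
        temp = r;
        r = r->next;
    }
    temp->next = NULL;            /* temp becomes the last node */
    free(r);
}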
In the above code, l represents the address of the header node. Here, temp
stores the address of the previous node and r stores the address of the
current node while iterating through the list. Once the end of the list is
reached, temp holds the address of the last-but-one node (the node
preceding the last node) and r denotes the last node. The ‘next’ field of
temp is then set to NULL and the memory of the last node ‘r’ is freed.
Case 3: Deletion from the middle of the doubly linked list

Suppose we wish to delete a node from the middle of the list, after a
given value, as shown in the figure.

Figure: Deletion from the middle of the doubly linked list

In the above figure, we wish to delete the node containing the value 15,
which succeeds the node with the value 4, from the middle of the list. Here,
the pointer variable is moved until it reaches the node containing the
value 4, and the address fields are updated accordingly as shown in the
figure. The C routine which illustrates the deletion of a node from the
middle of the list is shown below:

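void deletemiddle (int num, list l)
{
    struct node *temp, *curr, *p;
    temp = l->next;
    p = l;
    while (temp != NULL && temp->data != num)
    {
        p = temp;
        temp = temp->next;
    }
    if (temp == NULL)             /* num is not present in the list */
        return;
    curr = p->next;               /* node to be deleted */
    p->next = curr->next;
    if (curr->next != NULL)       /* curr is not the last node */
        curr->next->prev = p;
    free(curr);
}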
In the above code, num represents the number to be deleted, and l
represents the address of the header node. The while loop iterates through
the list and identifies the address of the node preceding the one to be
deleted (p); curr represents the address of the node to be deleted. The
‘next’ field of the previous node (p) is updated to the address of the
node that follows curr. Also, the ‘prev’ field of ‘curr->next’ is updated
to the previous node ‘p’. Finally, the memory of the current node (curr)
is deallocated.
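Because every node of a doubly linked list stores the address of its predecessor, a node can also be removed when only its own address is known, without a search from the header. The sketch below illustrates this idea; the routine name unlinknode is an assumption made for illustration, and it assumes a list with a header node so that the prev field of a data node is never NULL.

```c
#include <stdlib.h>

struct node
{
    struct node *prev;
    int data;
    struct node *next;
};

/* Unlink and free the data node n.
   Assumes n->prev is valid (the header node or another data node). */
void unlinknode(struct node *n)
{
    n->prev->next = n->next;      /* predecessor skips over n */
    if (n->next != NULL)          /* n is not the last node */
        n->next->prev = n->prev;  /* successor points back past n */
    free(n);
}
```

This takes a constant number of steps, whereas the same operation on a singly linked list needs a traversal to find the preceding node.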
c) Traversing a Doubly linked list

Traversing a doubly linked list is very similar to traversing a singly linked list.
Traversal gets started from the beginning of the list and it iterates through
the list and prints the content until it reaches the end of the list. But traversal
is possible in both the directions unlike a singly linked list.
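The two directions of traversal can be sketched as follows. This is an illustrative example rather than a routine from the course text: the names printforward and printbackward are assumptions, and the list is assumed to have a header node whose address is stored in the prev field of the first data node, as in the insertion routines above.

```c
#include <stdio.h>

struct node
{
    struct node *prev;
    int data;
    struct node *next;
};

/* Print the data nodes from first to last and return how many were printed. */
int printforward(struct node *header)
{
    int count = 0;
    struct node *p;
    for (p = header->next; p != NULL; p = p->next)
    {
        printf("%d ", p->data);
        count++;
    }
    printf("\n");
    return count;
}

/* Walk to the last node, then follow the prev links back to the header. */
int printbackward(struct node *header)
{
    int count = 0;
    struct node *p = header;
    while (p->next != NULL)       /* find the last node */
        p = p->next;
    for (; p != header; p = p->prev)
    {
        printf("%d ", p->data);
        count++;
    }
    printf("\n");
    return count;
}
```

For a list holding 5, 3 and 7, printforward prints the values in that order and printbackward prints them in the order 7, 3, 5.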

4.2.2 Doubly Linked List Implementation

The implementation of doubly linked list involves declaring its structure and
defining its operations.

The following example shows how a doubly linked list is implemented in C.

Example Program for Implementation of a doubly linked list


#include<stdio.h>
#include<stdlib.h> /*Needed for malloc(), free() and exit()*/

/*Doubly linked list declaration*/
struct dl_node
{
    int INFO;
    struct dl_node *NEXT;
    struct dl_node *PREVIOUS;
};
/*Declaring pointers to first and last node of the doubly linked list*/
struct dl_node *FIRST = NULL;
struct dl_node *LAST = NULL;
/*Declaring function prototypes for list operations*/
void insert(int);
int delete(int);
void print(void);
struct dl_node *search (int);

int main(void)
{
    int num1, num2, choice;
    struct dl_node *location;
    /*Displaying a menu of choices for performing list operations*/
    while(1)
    {
        printf("\n\nSelect an option\n");
        printf("\n1 - Insert");
        printf("\n2 - Delete");
        printf("\n3 - Search");
        printf("\n4 - Print");
        printf("\n5 - Exit");
        printf("\n\nEnter your choice: ");
        if(scanf("%d", &choice)!=1) /*Stopping on invalid or missing input*/
            break;
        switch(choice)
        {
        case 1:
            printf("\nEnter the element to be inserted into the doubly linked list: ");
            scanf("%d",&num1);
            insert(num1); /*Calling the insert() function*/
            printf("\n%d successfully inserted into the linked list!",num1);
            break;
        case 2:
            printf("\nEnter the element to be deleted from the doubly linked list: ");
            scanf("%d",&num1);
            num2=delete(num1); /*Calling the delete() function*/
            if(num2==-9999)
                printf("\n\t%d is not present in the doubly linked list\n\t",num1);
            else
                printf("\n\tElement %d successfully deleted from the doubly linked list\n\t",num2);
            break;
        case 3:
            printf("\nEnter the element to be searched: ");
            scanf("%d",&num1);
            location=search(num1); /*Calling the search() function*/
            if(location==NULL)
                printf("\n\t%d is not present in the list\n\t",num1);
            else
            {
                if(location==LAST)
                    printf("\n\tElement %d is the last element in the list",num1);
                else
                    printf("\n\tElement %d is present before element %d in the doubly linked list\n\t",num1,(location->NEXT)->INFO);
            }
            break;
        case 4:
            print(); /*Printing the list elements*/
            break;
        case 5:
            exit(0);
        default:
            printf("\nIncorrect choice. Please try again.");
            break;
        }
    }
    return 0;
}

/*Insert function*/
void insert(int value)
{
    /*Creating a new node*/
    struct dl_node *PTR = (struct dl_node*)malloc(sizeof(struct dl_node));
    if(PTR==NULL) /*Memory allocation failed*/
        return;
    /*Storing the element to be inserted in the new node*/
    PTR->INFO = value;
    /*Linking the new node to the doubly linked list*/
    if(FIRST==NULL)
    {
        FIRST = LAST = PTR;
        PTR->NEXT=NULL;
        PTR->PREVIOUS=NULL;
    }
    else
    {
        LAST->NEXT = PTR;
        PTR->NEXT = NULL;
        PTR->PREVIOUS = LAST;
        LAST = PTR;
    }
}
/*Delete function*/
int delete(int value)
{
    struct dl_node *LOC,*TEMP;
    LOC=search(value); /*Calling the search() function*/
    if(LOC==NULL) /*Element not found*/
        return(-9999);
    if(LOC==FIRST)
    {
        if(FIRST==LAST)
            FIRST=LAST=NULL;
        else
        {
            FIRST->NEXT->PREVIOUS=NULL;
            FIRST=FIRST->NEXT;
        }
        free(LOC); /*Releasing the memory of the deleted node*/
        return(value);
    }
    TEMP=LOC->PREVIOUS; /*Node preceding the one to be deleted*/
    if(LOC==LAST)
    {
        LAST=TEMP;
        TEMP->NEXT=NULL;
    }
    else
    {
        TEMP->NEXT=LOC->NEXT;
        LOC->NEXT->PREVIOUS=TEMP;
    }
    free(LOC); /*Releasing the memory of the deleted node*/
    return(value);
}
/*Search function*/
struct dl_node *search (int value)
{
    struct dl_node *PTR;
    if(FIRST==NULL) /*Checking for empty list*/
        return(NULL);
    if(FIRST==LAST && FIRST->INFO==value) /*Checking if there is only one element in the list*/
        return(FIRST);
    /*Searching the linked list*/
    for(PTR=FIRST;PTR!=LAST;PTR=PTR->NEXT)
        if(PTR->INFO==value)
            return(PTR); /*Returning the location of the searched element*/
    if(LAST->INFO==value)
        return(LAST);
    else
        return(NULL); /*Returning NULL value indicating unsuccessful search*/
}
/*print function*/
void print()
{
    struct dl_node *PTR;
    if(FIRST==NULL) /*Checking whether the list is empty*/
    {
        printf("\n\tEmpty List!!");
        return;
    }
    printf("\nDoubly linked list elements:\n");
    if(FIRST==LAST) /*Checking if there is only one element in the list*/
    {
        printf("\t%d",FIRST->INFO);
        return;
    }
    /*Printing the list elements*/
    for(PTR=FIRST;PTR!=LAST;PTR=PTR->NEXT)
        printf("\t%d",PTR->INFO);
    printf("\t%d",LAST->INFO);
}

Output :-

The output of this program is similar to that of the singly linked list
implementation.

Applications of Linked list


Linked lists are used in the computer programming areas such as
Database Management systems, Process Management, Operating systems,
Text editors etc. An important application of linked list is to represent
polynomials and their manipulations. As an application of linked lists,
addition and subtraction of polynomials could be performed.
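As an illustration of the polynomial application, the sketch below stores each term of a polynomial as a node holding a coefficient and an exponent, and adds two polynomials by merging their lists. The structure term and the routines newterm and addpoly are names assumed for this example, and both input lists are assumed to be sorted by decreasing exponent.

```c
#include <stdlib.h>

struct term
{
    int coeff;
    int exp;
    struct term *next;
};

/* Create one term; returns NULL if memory cannot be allocated. */
struct term *newterm(int coeff, int exp, struct term *next)
{
    struct term *t = (struct term *) malloc(sizeof(struct term));
    if (t != NULL)
    {
        t->coeff = coeff;
        t->exp = exp;
        t->next = next;
    }
    return t;
}

/* Return a new list holding p + q. Both lists must be sorted by
   decreasing exponent; neither list is modified. */
struct term *addpoly(const struct term *p, const struct term *q)
{
    struct term head = {0, 0, NULL};  /* dummy node that simplifies appending */
    struct term *tail = &head;

    while (p != NULL || q != NULL)
    {
        int coeff, exp;
        if (q == NULL || (p != NULL && p->exp > q->exp))
        {
            coeff = p->coeff; exp = p->exp; p = p->next;
        }
        else if (p == NULL || q->exp > p->exp)
        {
            coeff = q->coeff; exp = q->exp; q = q->next;
        }
        else                          /* equal exponents: add coefficients */
        {
            coeff = p->coeff + q->coeff; exp = p->exp;
            p = p->next; q = q->next;
        }
        if (coeff != 0)               /* terms that cancel out are skipped */
        {
            tail->next = newterm(coeff, exp, NULL);
            if (tail->next == NULL)
                break;                /* out of memory: stop with a partial sum */
            tail = tail->next;
        }
    }
    return head.next;
}
```

For example, adding 3x^2 + 2x + 1 and 5x^3 - 2x + 4 gives the list 5x^3 + 3x^2 + 5, because the two x terms cancel.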
Let us sum up
In this unit, we discussed linked list concepts and their types.
Moreover, we gained knowledge of doubly and circular linked lists with
examples. For further reading, refer to the suggested books.
Check your progress
1. A _____________ allows data to be inserted into and deleted from a list.
2. The structure in which each node has a first part containing
information/data and a second part containing the link/address of the
next node in the list is called a _________.
3. A node of a doubly linked list is represented by _________, ________,
_________.
4. A list in which the last node points to the first node in a circular
fashion is called a __________.

Glossary
Singly linked list, Circular linked lists, Doubly Linked list

Suggested readings
1. E Balagurusamy, “Data Structures”, McGraw Hill Education
(India) Private Limited, 2019.
2. Mark Allen Weiss, “Data Structures and Algorithm Analysis in
C++”, Third Edition, Pearson Education, 2007.
3. Reema Thareja, “Data Structures Using C”, Second Edition,
Oxford University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed,
“Fundamentals of Data Structures in C”, Second Edition,
University Press, 2008.

Answers to check your progress


1. Linked list
2. Singly linked Lists
3. Data, a pointer to the previous node, and a pointer to the next
node
4. Circular linked list.

Block-2: STACKS AND QUEUES

Unit-5: Stacks

Unit-6: Queues

Unit -5
Stacks

Structure

Overview
Learning objectives
5.0 Stacks

5.1 Stack Representation in Memory


5.2 Arrays vs. Stacks
5.3 Stack Operations
5.4 Array Implementation of Stacks
5.5 Linked Implementation of Stacks
Let us sum up
Check your progress
Glossary
Suggested readings

Answers to check your progress

Overview
This unit deals with the stack, a data structure that provides its own
capabilities for organizing data. Each data structure has its own unique
properties and is constructed to suit various kinds of applications.

Learning objectives
At the end of this unit you will be able to
• Understand the concepts of Stacks and their Representation in Memory.
• Get knowledge about Arrays vs. Stacks and Stack Operations.
• Get a clear idea about the Array and Linked Implementations of Stacks.

5.0 Stack
A Stack is an important linear data structure which follows the Last
In First Out (LIFO) mechanism to store the elements in an ordered manner.
It is more restrictive than a linked list. A stack is a linear list in which
all insertions and deletions are made at one end, called the ‘top’. The last
element to be inserted into the stack will be the first one to be removed;
thus stacks are sometimes referred to as Last In First Out (LIFO) lists. A
stack can be visualized as a pile of plates. Plates are added to the top of
the pile, and the plate at the top of the pile, which was added last, will
be the first to be removed.

5.1 Stack Representation in Memory

Stacks appear as a group of elements stored at contiguous locations
in memory. Each successive insert or delete operation adds or removes an
item from the group. The top location of the stack, i.e. the point of
addition or deletion, is maintained by a pointer called top. The following
figure shows the logical representation of stacks in memory.

5.2 Arrays vs. Stacks


While both arrays and stacks may look to be similar in their logical
representation, they are different in several aspects, as explained in the
following Table.

Table: Arrays vs. Stacks

1. Arrays: Arrays provide the flexibility of adding or removing data
   elements anywhere in the list, i.e., at the beginning, end or anywhere
   in the middle. While this flexibility may seem to be a boon in certain
   situations, the same may not be true in situations where frequent
   insertions or deletions are required. This is because each insertion or
   deletion in arrays requires the adjoining elements to be shifted to new
   locations, which is an overhead.
   Stacks: Stacks restrict the insertion or deletion of elements to only
   one place in the list, i.e. the top of the stack. Thus, there are no
   associated overheads of shifting other elements to new locations.

2. Arrays: By using arrays, a programmer can realize common scenarios
   where grouping of records is required, for example inventory
   management, employee records management, etc.
   Stacks: Stacks are vital in solutions to advanced systems problems
   such as recursion control, expression evaluation, etc.

5.3 General Operations on Stack


In a Stack data structure, the entry/exit point to add/delete an element is
called the ‘Top’ of the stack. The operations that can be performed on
a Stack data structure, such as InitStack (creation), Push, Pop and Top, are
listed below:
InitStack(Stack): This operation creates an empty stack, which contains
NULL as its initial value as shown in the figure.

Null

Figure: Creating Empty stack


Push (Item): This operation pushes (inserts) an item/ new element at
the top of the stack as shown in figure below which adds/pushes the
element 15.

Before Push(15)        After Push(15)
Position  Value        Position  Value
   5                      5
   4                      4       15
   3       10             3       10
   2        2             2        2
   1        3             1        3

Figure: Push operation of a Stack

Pop(): This operation removes the most recently added element from the top
of the stack. In the figure below, the recently added element 15 has been
removed from the top of the stack.

Before Pop()           After X = Pop()
Position  Value        Position  Value
   5                      5
   4       15             4
   3       10             3       10
   2        2             2        2
   1        3             1        3

X = 15

Figure: Pop operation of a Stack
Top(Stack): This operation returns the value stored at the top of the stack
without removing it. In the figure below, the element at the top of the
stack is 10, which is returned as the result.
Top=10

5 10

4 12

3 13

2 9

1 8

Figure: Returning Top of stack


Stack is a data structure which is frequently used in situations where the
order of processing is very important. Since stack is a list, we can use two
popular implementations to stack. One is an array based implementation of
stack since it represents a linear array and the other one is a linked
implementation.
5.4 Array Implementation of Stacks
Stacks are a subclass of Linear Lists; all access to a stack is
restricted to one end of the list, called the ‘top’ of the stack. Visually, it
can be a stack of books, coins, plates, etc. Addition of books, plates, etc.
occurs at the top of the stack; removal also occurs at the same end of the
stack. A Stack is an ordered (by position, not by value) collection of
data (usually homogeneous), with the operations Push, Pop and Top defined
on it. The stack itself is a structure containing two elements: Data, the
array of data, and TOP, an index that keeps track of the current top of the
stack; that is, it tells us where data is added for a Push and removed for
a Pop. An array-based stack requires two values to be known in advance: the
type of data contained in the stack, and the size of the array, called MAX.
If TOP = -1, the stack is empty, and if TOP = MAX-1, the stack is full.
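The description of the stack as a structure with a data array and a TOP index can be sketched as below. This is an illustrative variant, not the listing used in the rest of this unit: the structure name arraystack and the routine names are assumptions, and push and pop report overflow and underflow through their return value instead of printing a message.

```c
#define MAX 10

struct arraystack
{
    int data[MAX];   /* the array of data */
    int top;         /* index of the top element, -1 when the stack is empty */
};

void initstack(struct arraystack *s)
{
    s->top = -1;
}

int isempty(const struct arraystack *s)
{
    return s->top == -1;
}

int isfull(const struct arraystack *s)
{
    return s->top == MAX - 1;
}

/* Returns 1 on success and 0 on overflow. */
int push(struct arraystack *s, int item)
{
    if (isfull(s))
        return 0;
    s->data[++s->top] = item;
    return 1;
}

/* Stores the removed value in *out. Returns 1 on success, 0 on underflow. */
int pop(struct arraystack *s, int *out)
{
    if (isempty(s))
        return 0;
    *out = s->data[s->top--];
    return 1;
}
```

Keeping the array and the index together in one structure allows a program to use several stacks at once, which is not possible when top is a single global variable.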

a) Stack Operations using Arrays

Operations that can be performed in an array implementation of a
stack are listed below with suitable C routines:
Initialize: This operation initializes the internal structure of the
stack and creates an empty stack. The C statements that represent the
initialization of a stack are listed below:

# define MAX 10
int st[MAX] , top = -1;
The above statement indicates that a stack of integer data type with
a maximum limit of 10 elements has been created. Initially, top variable has
been assigned to -1 which shows that the stack is empty or no elements in
the stack.
Push: This operation adds a new element to the top of the stack. However,
before adding the element, it checks whether the stack is full or has
empty space for new elements. The element is added to the stack only if
the array (stack) has enough room in it; otherwise an OVERFLOW message is
displayed to the user. The C routine which depicts the push operation of a
stack using arrays is shown below:

void push(int st[], int item)
{
    if (top == MAX-1)
        printf("Stack overflow\n");
    else
    {
        top++;
        st[top] = item;
    }
}
In the above routine, before adding an element, it is checked whether the
top variable has reached the maximum limit of the array (top == MAX-1,
since array positions start from 0). If the stack has reached the maximum
limit, an overflow message is printed to the user. Otherwise, the top
variable is incremented by one and the element is inserted at the
corresponding position.
Pop: This operation removes the top element from the stack. However,
before deleting the topmost element of the stack, we must first check
whether the stack is empty. If it is empty, an underflow message has to be
displayed to the user. Otherwise, the top element will be removed/deleted
from the stack. C routine which depicts the pop operation of a Stack using
arrays has been described below:
int pop(int st[])
{
    if (top == -1)
    {
        printf("Stack Underflow\n");
        return -1;   /* returned for an empty stack; assumes -1 is never stored */
    }
    else
    {
        int element = st[top];
        top--;
        return element;
    }
}
In the above routine, before deleting an element, the value of the top
variable has been checked. If top==-1, it represents that the stack is empty.
An UNDERFLOW message will be displayed to the user if we attempt to
pop an element from the empty stack. Otherwise, the element in the top
position of the stack will be retrieved and the top variable gets decremented
by one to show that the top element of the stack gets deleted.
Top: This operation returns/retrieves the value at the top of the stack
(without deleting that element). C routine which depicts the top operation of
a Stack using arrays has been described below:
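int stacktop(int st[])   /* the name top is already used by the index variable */
{
    if (top == -1)
    {
        printf("Stack Underflow\n");
        return -1;   /* returned for an empty stack; assumes -1 is never stored */
    }
    else
    {
        int element = st[top];
        return element;
    }
}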
In the above code, it checks whether the stack is empty. If not so, the
element at the top position of the stack gets returned to the user.
5.5 Linked Implementation of a Stack
One disadvantage of using an array to implement a stack is wasted
space: an array must be declared with a fixed size, and most of the time a
certain amount of the array space remains unused. It is also difficult to
predict the maximum size of the array in advance. A more elegant and
economical implementation of a stack uses a linked list, which is suitable
when the required size cannot be determined in advance.
A linked list behaves somewhat like a dynamic array that grows and
shrinks as values are added and removed. Rather than being stored in a
contiguous block of memory, the values are linked together with pointers.
In a linked stack, every node has two parts- one that stores the data and the
other one that contains the address of the next node. The initial node of the
linked stack is considered to be the ‘TOP’ node or ‘TOP’. Here, the header
node is always made to point to the node which is considered as ‘TOP’.
Figure below represents the simple representation of a stack using linked
implementation. Here, the header node points to the initial node of the stack
which is considered as ‘TOP’.

Header Data Data Data Data

Next Next Next Next

TOP
Figure: Simple Representation of a Linked stack

Linked Structure of the Stack with respect to C language has been
explained below:
struct stack

{
int data;
struct stack *next;

} *top = NULL;
typedef struct stack st;
Here st represents the self referential stack structure with a data part
and a next pointer field to point to the next node of the stack. Top has been
initialized to null to represent an empty stack.
a) Operations on Stack using linked lists

Operations such as push and pop could be performed on a stack


using linked list implementation. We perform a push by inserting at the front
of the list and at the same time, a pop could be done by deleting the
element from the front of the list. We merely create a header node and set
the next pointer of it to NULL if the stack is empty, otherwise the header
node would always point to the first node i.e. TOP node of the list and
insertion/deletion takes place in that position.
PUSH: This operation inserts the new element at the beginning of the stack
like inserting a new node at the beginning of the singly linked list. C routine
which depicts this PUSH operation has been shown below:
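void push(struct stack *st, int num)
{
    struct stack *node;
    node = (struct stack *) malloc(sizeof(struct stack));
    if (node == NULL)             /* memory allocation failed */
        return;
    node->data = num;
    node->next = st->next;        /* old top node, or NULL for an empty stack */
    st->next = node;              /* header node now points to the new node */
    top = node;
}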

In the above code, st represents the address of the header node of the
stack and num represents the number to be pushed inside the stack. During
the push operation, the next field of the header node has been made to
point to the newly inserted node and new node has been assigned as the
top node (top=node).
POP: This operation deletes the element from the beginning of the stack
like deleting a node from the beginning of the singly linked list. C routine
which depicts this POP operation has been shown below:
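int pop(struct stack *st)
{
    struct stack *firstcell;
    int num;
    firstcell = st->next;
    if (firstcell == NULL)        /* empty stack: nothing to pop */
        return -1;                /* assumes -1 is never stored in the stack */
    num = firstcell->data;
    st->next = firstcell->next;
    free(firstcell);
    top = st->next;
    return num;
}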

In the above code, st represents the address of the header node of the
stack. During the pop operation, the next field of the header node (st->next)
has been made to point to the second node in the sequence (firstcell->next),
which becomes the current top node of the list(top=st->next) . Subsequently
the node firstcell has been freed off by deallocating the memory.
Top () – This operation returns the value of the top element of the stack and
the corresponding C code has been given below:
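int stacktop(struct stack *st)   /* the name top is already used by the pointer */
{
    if (st->next == NULL)         /* empty stack */
        return -1;                /* assumes -1 is never stored in the stack */
    return st->next->data;
}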
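The linked stack operations described above can be collected into one self-contained sketch. It is an illustrative variant under stated assumptions: the helper createstack is not part of the original listing, the global top pointer is left out because the header node already identifies the top, and pop reports underflow through its return value.

```c
#include <stdlib.h>

struct stack
{
    int data;
    struct stack *next;
};

/* Create the header node of an empty stack (NULL if allocation fails). */
struct stack *createstack(void)
{
    struct stack *header = (struct stack *) malloc(sizeof(struct stack));
    if (header != NULL)
        header->next = NULL;
    return header;
}

int isempty(const struct stack *header)
{
    return header->next == NULL;
}

/* Insert at the front of the list. Returns 1 on success, 0 if out of memory. */
int push(struct stack *header, int num)
{
    struct stack *node = (struct stack *) malloc(sizeof(struct stack));
    if (node == NULL)
        return 0;
    node->data = num;
    node->next = header->next;
    header->next = node;
    return 1;
}

/* Remove the front node and store its value in *out.
   Returns 1 on success and 0 on underflow. */
int pop(struct stack *header, int *out)
{
    struct stack *first = header->next;
    if (first == NULL)
        return 0;
    *out = first->data;
    header->next = first->next;
    free(first);
    return 1;
}
```

Pushing 1, 2 and 3 and then popping three times returns 3, 2 and 1, which is the Last In First Out order.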

b) Application of Stacks

Applications of the stack include recursion, evaluation of arithmetic
expressions and control transfers to subprograms. Recursion is the
process of a function calling itself until a condition is satisfied. This
is an important feature in many programming languages. There are many
algorithmic descriptions for recursive functions. Evaluation of arithmetic
expressions can be performed using stacks. Also, stacks are suitable
data structures for backtracking the program flow or control transfers.
Stacks are widely used in:
• Evaluation of Postfix expressions
• Conversion of Infix to Postfix expressions, and Postfix to Infix
• Backtracking problems
• Recursive functions
Here, in this section, we explain how the stack has been used to
evaluate the postfix expressions in detail.
c) Evaluation of expressions

Computers solve arithmetic expressions by restructuring them so
that the order of each calculation is embedded in the expression. Once
converted to the required notation (either Prefix or Postfix), an expression
can be solved.
Types of Expression

Any expression can be represented in three ways, namely Infix, Prefix
and Postfix. Consider the expression 4 + 5 * 5. This can be represented as
follows:
Infix - 4 + 5 * 5.
Prefix - + 4 * 5 5 (The Operators precede the operands.)
Postfix - 4 5 5 * + (The Operators follow the operands.)

The default way of representing a mathematical expression is Infix
notation. Prefix notation is also called Polish Notation (introduced by
the Polish mathematician Jan Łukasiewicz), and Postfix notation is called
Reverse Polish Notation (RPN). The valuable aspects of RPN are:
• Parentheses are not necessary.
• It is easy for a compiler to evaluate an arithmetic expression.
Postfix (Reverse Polish Notation)

Postfix notation arises from the concept of post-order traversal of an
expression tree. For now, consider the postfix notation as a way of
redistributing operators in an expression so that their operation is delayed
until the correct time.
Consider a quadratic formula:
X = (-b + (b^2-4*a*c)^0.5)/(2*a)

In postfix form the formula becomes
X b @ b 2 ^ 4 a * c * - 0.5 ^ + 2 a * / =
Where ‘@’ represents the unary ‘-’ operator.
Notice that the order of the operands remains the same but the
operators are redistributed in an unconventional way. The reason for using
postfix notation is that a fairly simple algorithm exists to evaluate such
expressions using a stack:
Evaluation of Postfix expressions using Stack
Consider the following postfix expression and the algorithm to
evaluate the postfix expression using a stack:
6 5 2 3 + 8 * + 3 + *

Algorithm

initialize stack to empty;
while (not end of postfix expression) {
    get next postfix item;
    if (item is value)
        push it onto the stack;
    else if (item is binary operator) {
        pop the stack to x;
        pop the stack to y;
        perform y operator x;
        push the result onto the stack;
    }
    else if (item is unary operator) {
        pop the stack to x;
        perform operator (x);
        push the result onto the stack;
    }
}
Unary operators are unary minus, square root, sin, cos, exp, etc.
For the above expression, 6 5 2 3 + 8 * + 3 + *:
the first item to be pushed onto the stack is the value 6
the next item is the value 5
the next item is the value 2
the next item is the value 3

And the stack becomes

TOS=> 3
      2
      5
      6

The remaining items are now + 8 * + 3 + *. Next a ‘+’ is read (a binary
operator), so 3 and 2 are popped from the stack and their sum 5 is pushed
onto the stack:

TOS=> 5
      5
      6

Next 8 is pushed and the next item is the operator *:

TOS=> 8
      5
      5
      6

(8 and 5 popped, 5 * 8 = 40 pushed)

TOS=> 40
      5
      6

Next comes the operator +, followed by 3:

(40 and 5 popped, 5 + 40 = 45 pushed)

TOS=> 45
      6

(3 pushed)

TOS=> 3
      45
      6

Next is the operator +, so 3 and 45 are popped and 45 + 3 = 48 is pushed

TOS=> 48
      6

Next is the operator *, so 48 and 6 are popped, and 6 * 48 = 288 is pushed

TOS=> 288

Now there are no more items and there is a single value on the stack,
representing the final answer 288. The answer was found with a single
traversal of the postfix expression, with the stack being used as a kind of
memory storing values that are waiting for their operators.
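The evaluation algorithm can be sketched in C as follows. This is a minimal illustration under stated assumptions: the routine name evalpostfix is hypothetical, the items of the expression are separated by spaces, the operands are non-negative integers, and only the binary operators +, -, * and / are handled (unary operators are left out).

```c
#include <ctype.h>

#define MAX 50

/* Evaluate a postfix expression and store the value in *result.
   Returns 1 on success and 0 if the expression is malformed, needs a
   deeper stack than MAX, or divides by zero. */
int evalpostfix(const char *expr, long *result)
{
    long st[MAX];
    int top = -1;
    const char *p = expr;

    while (*p != '\0')
    {
        if (isspace((unsigned char) *p))
        {
            p++;                       /* skip the separator */
        }
        else if (isdigit((unsigned char) *p))
        {
            long value = 0;            /* read a whole number */
            while (isdigit((unsigned char) *p))
                value = value * 10 + (*p++ - '0');
            if (top == MAX - 1)
                return 0;              /* stack overflow */
            st[++top] = value;
        }
        else
        {
            long x, y;
            if (top < 1)
                return 0;              /* a binary operator needs two values */
            x = st[top--];             /* right operand */
            y = st[top--];             /* left operand */
            switch (*p++)
            {
            case '+': st[++top] = y + x; break;
            case '-': st[++top] = y - x; break;
            case '*': st[++top] = y * x; break;
            case '/':
                if (x == 0)
                    return 0;
                st[++top] = y / x;
                break;
            default:
                return 0;              /* unknown symbol */
            }
        }
    }
    if (top != 0)
        return 0;                      /* values left over, or none at all */
    *result = st[top];
    return 1;
}
```

Calling evalpostfix("6 5 2 3 + 8 * + 3 + *", &r) stores 288 in r, matching the trace above.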

d) Recursion
A recursive function is a function whose definition is based upon
itself, i.e., a function that contains either a call statement to itself or
a call statement to another function that may eventually result in a call
to the original function. For example, the factorial of a given number can
be computed recursively. The stack data structure is an important
mechanism for implementing recursion in programming languages.
In programming languages, each call to a subroutine requires that
the subprogram should have a storage area where it can keep its local
variables, its calling parameters and its return address. For a recursive
function the storage areas for subprogram calls are kept in a stack.
Therefore any recursive function may be rewritten in a non-recursive form
using stack. Stack data structure is used by many programming languages
for implementing function calls and recursive functions.
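To illustrate the point about rewriting recursion with a stack, here is a small sketch that computes a factorial twice: once recursively and once with an explicit array used as a stack. The function names and the limit of 20 (chosen so the result fits in 64 bits) are assumptions made for this example.

```c
#include <assert.h>

/* Recursive form: every call waits on the system stack until the
   call it made returns. */
unsigned long long fact_recursive(unsigned int n)
{
    if (n <= 1)
        return 1ULL;
    return n * fact_recursive(n - 1);
}

/* Non-recursive form: an explicit stack holds the pending multipliers
   that the recursive version would keep in its call frames. */
unsigned long long fact_with_stack(unsigned int n)
{
    unsigned int stack[32];
    int top = -1;
    unsigned long long result = 1ULL;

    assert(n <= 20);               /* 20! is the largest factorial that fits in 64 bits */
    while (n > 1)
        stack[++top] = n--;        /* "call" phase: push n, n-1, ..., 2 */
    while (top >= 0)
        result *= stack[top--];    /* "return" phase: pop and multiply */
    return result;
}
```

Both functions return the same value for the same argument; the second simply makes the stack visible.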

Let us sum up
In this unit we discussed the concept of stacks in data structures. We
also discussed the representation of a stack in memory, arrays vs. stacks,
and stack operations, and learned about the array and linked
implementations of stacks. For further reference, refer to the suggested
books listed.

Check your progress

1. LIFO stands for _______.
2. ________ operation creates an empty stack.
3. ______ operation adds a new element to the top of the stack.

Glossary
Initialize, Push, Pop, Top
Suggested readings
1. E Balagurusamy, "Data Structures", McGraw Hill Education (India)
Private Limited, 2019.
2. Mark Allen Weiss, "Data Structures and Algorithm Analysis in C++",
Third Edition, Pearson Education, 2007.
3. Reema Thareja, "Data Structures Using C", Second Edition, Oxford
University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed, "Fundamentals of
Data Structures in C", Second Edition, University Press, 2008.

Answers to check your progress


1. Last In First Out
2. InitStack(Stack)
3. Push

Unit -6
Queues

Structure

Overview
Learning objectives
6.0 Queues

6.1 Logical Representation of Queues


6.2 Queue Operations
6.3 Array Implementation of Queues
6.4 Linked Implementation of Queues
6.5 Circular Queues
6.6 Priority Queues
6.7 Double-Ended Queues.
Let us sum up
Check your progress

Glossary
Suggested readings
Answer to check your progress

Overview
This chapter deals with queues, a data structure that provides its own way
of organizing data. Each data structure has its own unique properties and
is constructed to suit particular kinds of applications.

Learning objectives
At the end of this unit you will be able to
• Get knowledge about queues and the logical representation of queues
• Understand the concepts of queue operations and the array and linked
implementations of queues
• Get a clear idea about circular queues and priority queues
• Know about double-ended queues

6.0 Queue
In everyday life, we encounter queues everywhere - a line of people
waiting to buy a ticket or waiting to be served in a bank; all such lines of
people are queues. The first person in line is the next one served, and when
another person arrives, he/she joins the queue at the rear.
A queue is a chronologically ordered list; the only difference
between a stack and a queue is that, in a stack, elements are added and
removed at the same end (the top), whereas in a queue, values are added
at one end (the rear) and removed from the other end (the front). The order
of elements in a queue, therefore, perfectly reflects the order in which they
were added: the first element added will be the first one removed (so
queues are called FIFO - ``first in first out''), and the last one added will be
the last one removed.
A queue is an ordered set of homogeneous elements in which the
items are added at one end (called the rear) and are removed from the
other end (called the front). A queue is a First In First Out (FIFO) data
structure: the first element inserted into the queue will be the first one
to be removed. All elements are inserted into the queue through an entry
point called ‘Rear’ (the last position of the queue), and elements are
deleted from the queue through the exit point ‘Front’, which is the first
position in the queue.

6.1 Logical Representation of Queues


The figure shows the logical representation of a queue.

6.2 General Operations of a Queue
The operations that can be performed on a queue are listed below:
Initialize(Queue): This operation creates a new empty queue. It must be
done in order to make the queue logically accessible. When a queue is
empty, both the front and rear pointers are -1, as shown in the figure
below. Here F and R represent the Front and Rear pointers, which lie
outside the empty queue.

F R
Fig. Empty Queue
Enqueue(Item): This operation inserts the new element at the rear end
of the queue when the queue is not full. When the queue is full, it reports
an OVERFLOW message to the user. The figure shows a queue with the
value 2 at the rear end, whereas the front pointer points to the value 5 at
the beginning of the queue. Subsequently, enqueueing (adding) the value 30
places 30 at the rear end of the queue, as shown in the figure.

5    10    15    2
Front            Rear

APPEND(30):
5    10    15    2    30
Front                 Rear

Figure: Enqueue operation in a Queue

Dequeue(Queue): This operation removes and returns an item from the
front end of the queue. The precondition for this operation is that the queue
must not be empty. When we try to dequeue an item from an empty queue,
an UNDERFLOW message is displayed to the user. The figure shows the
front end pointing to the element 5 and the rear end pointing to the value 2.
The dequeue operation removes the element 5 and increments the front
pointer so that it points to the second element in the sequence.

5    10    15    2
Front            Rear

X = remove():
10    15    2
Front       Rear
X = 5

Figure: Dequeue operation in a queue

6.3 Array Implementation of a Queue


In an array implementation, a queue may be represented by an array
that stores the queue elements and two integer variables named front and
rear that hold the present positions of the front and rear elements of the
queue. This structure requires two values to be known in advance: the type
of data contained in the queue, and the maximum size of the array, called
MAX_SIZE. If front = rear = -1, the queue is empty, and if
rear = MAX_SIZE-1, the queue is full.
a) Queue Operations using Arrays

The operations that can be performed in an array implementation of a
queue are listed below with suitable C routines:
Initialize: This operation initializes the internal structure of the
queue and creates an empty queue. The C language statements which
represent the initialization of the queue are listed below:
# define MAX_SIZE 10
int queue[MAX_SIZE] , rear, front;
rear=front=-1;
The above statements declare a queue of integer data type with a
maximum limit of 10 elements. Two variables of integer data type, rear
and front, are declared to hold the positions of the rear and front
elements respectively. Assigning -1 to rear and front shows that the queue
is empty, i.e. it does not have any elements.
Enqueue: This operation adds the new element to the rear end of the
queue. However, before adding the element, it checks whether the queue
is full or has enough space for a new element. The element is added at the
rear end only if the array (queue) has free space; otherwise an OVERFLOW
message is displayed to the user. The C routine which depicts the enqueue
operation of a queue using arrays is shown below:
void enqueue(int queue[], int item)
{
    if (rear == MAX_SIZE - 1)
        printf("Queue overflow");
    else
    {
        if (front == -1)    /* first element: front now refers to slot 0 */
            front = 0;
        rear++;
        queue[rear] = item;
    }
}
In the above routine, before adding an element, it is checked whether
the rear variable has reached the maximum limit of the array
(rear == MAX_SIZE-1). If the queue has reached the maximum limit, an
overflow message is printed to the user. Otherwise, the value of ‘rear’ is
incremented by one and the element is inserted at the rear position of the
queue, so the newly added element represents the rear end of the queue.
When the very first element is inserted, ‘front’ is also moved from -1 to 0
so that it points to that element.
Dequeue: This operation removes the element from the front end of the
queue. However, before deleting the front element of the queue, it is
checked for underflow condition. If so, an underflow message will be
displayed to the user. Otherwise, the element in the front position will be
removed/deleted from the queue. The C routine which depicts the dequeue
operation of a queue using arrays has been described below:

int dequeue(int queue[])
{
    if (front == -1 || front > rear)
    {
        printf("%s", "Queue Underflow");
        return -1;    /* sentinel value; assumes -1 is never stored in the queue */
    }
    else
    {
        int element = queue[front];
        front++;
        return element;
    }
}
In the above routine, before deleting an element, the value of the
front variable is checked. If front == -1, or if front has moved past rear,
the queue is empty and an underflow message is displayed. Otherwise, the
element at the front position of the queue is retrieved and the front
variable is incremented by one to show that ‘front’ points to the next
element in the sequence.
6.4 Linked Implementation of a Queue
The queue can be implemented as a linked list with one external
pointer to the front of the queue and a second external pointer to the rear
(back of the queue). Each element of the queue can be represented with
respect to the C language as follows:
struct node
{
    int data;
    struct node *next;
};
struct node *l;
struct queue
{
    struct node *front;
    struct node *rear;
};
struct queue *q;
Here, node represents the self-referential queue structure with a
data part and a next pointer field that points to the next node of the queue.
q maintains the two pointers ‘front’ and ‘rear’ of the queue, which are
initialized to NULL during the creation of the queue
(q->front = q->rear = NULL).

a) Operations on Queue using linked lists

Operations such as enqueue and dequeue can be performed on a queue
using the linked list implementation. We perform an enqueue by inserting at
the rear end of the list, and a dequeue by deleting the element at the front
of the list. A header structure holds the front and rear pointers: both are
NULL if the queue is empty; otherwise front always points to the first node
of the list and rear to the last.
ENQUEUE: This operation inserts the new element at the rear end of
the queue, like inserting a new node at the end of a singly linked list. The
C routine which depicts this ENQUEUE operation is shown below:
void enqueue(struct queue *q, int num)
{
    struct node *ptr;
    ptr = (struct node *) malloc(sizeof(struct node));
    if (ptr == NULL)
    {
        printf("Queue overflow");    /* no memory left for a new node */
        return;
    }
    ptr->data = num;
    ptr->next = NULL;
    if (q->front == NULL)
    {
        q->front = ptr;
        q->rear = ptr;
    }
    else
    {
        q->rear->next = ptr;
        q->rear = ptr;
    }
}
In the above code, q is the address of the header of the queue, which
contains the pointers for the rear and the front, and num is the number to
be inserted into the queue. During the enqueue operation, the rear pointer
and the next field of the old rear node are updated as shown above; when
the queue was empty, front is also set to the new node.
DEQUEUE: This operation deletes the element from the beginning of
the queue, like deleting a node from the beginning of a singly linked list.
The C routine which depicts this DEQUEUE operation is shown below:
void dequeue(struct queue *q)
{
    struct node *ptr;
    ptr = q->front;
    if (q->front == NULL)
        printf("Queue Underflow");
    else
    {
        q->front = q->front->next;
        if (q->front == NULL)    /* the queue has become empty */
            q->rear = NULL;
        printf("value deleted is %d", ptr->data);
        free(ptr);
    }
}
In the above code, q is the address of the header of the queue, with
pointers for the front and the rear. During the dequeue operation, front is
reassigned to the address of the node that follows the old front node
(q->front = q->front->next), which makes the second node in the sequence
the current front node of the list. If no node remains, rear is reset to
NULL as well. Subsequently, the node ptr is freed by deallocating its
memory.
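As a complement to the routines above, the following is a self-contained sketch of the same linked queue in which dequeue hands the removed value back to the caller instead of printing it. The lq_ prefix, the bool return values and the out parameter are assumptions made for this illustration, not part of the course routines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

struct queue {
    struct node *front;   /* first node, NULL when the queue is empty */
    struct node *rear;    /* last node, NULL when the queue is empty */
};

void lq_init(struct queue *q)
{
    assert(q != NULL);
    q->front = q->rear = NULL;
}

/* Appends num at the rear; returns false if memory could not be allocated. */
bool lq_enqueue(struct queue *q, int num)
{
    struct node *ptr = malloc(sizeof *ptr);
    if (ptr == NULL)
        return false;
    ptr->data = num;
    ptr->next = NULL;
    if (q->front == NULL)
        q->front = ptr;          /* first node is both front and rear */
    else
        q->rear->next = ptr;     /* link behind the old rear */
    q->rear = ptr;
    return true;
}

/* Removes the front node and stores its value in *out;
   returns false on underflow. */
bool lq_dequeue(struct queue *q, int *out)
{
    struct node *ptr = q->front;
    if (ptr == NULL)
        return false;
    *out = ptr->data;
    q->front = ptr->next;
    if (q->front == NULL)        /* queue became empty */
        q->rear = NULL;
    free(ptr);
    return true;
}
```

Returning a status flag keeps every integer value usable as queue data, which a sentinel return value would not allow.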
b) Deque (Double-Ended Queue)

A deque is a queue in which insertions and deletions can happen at
both ends of the queue. A deque, or double-ended queue, is a data
structure which combines the properties of a queue and a stack. Like a
stack, items can be pushed into the deque, and the last item pushed in may
be extracted from the same side (popped, as in a stack). A deque also
behaves like a queue, in that the first item pushed in may be pulled out
first from the other side. In an input-restricted deque, insertion of
elements is allowed at one end only, but deletion of elements can be done
at both ends. In an output-restricted deque, deletion of elements is done
at one end only, but insertion is allowed at both ends of the deque.
6.5 Circular Queue
Problems may arise with the array-based queue implementation.
Elements are always inserted (enqueued) at the rear, so the rear starts at
location 0 in the array and moves forward towards MAX_SIZE-1, the last
position in the array. On the other hand, elements are removed from the
‘front’ of the array. Consider a situation where a queue has been filled
with five elements and has reached the maximum size (rear = MAX_SIZE-1).
Suppose the first four elements are then deleted, so that the front pointer
has moved to the fifth position (front = rear = MAX_SIZE-1). When the user
wants to add a new element, the rear check reports that the queue is full,
even though the first four slots of the queue are free to hold elements.
To overcome this situation, we can implement the queue as a circular
queue. That is, during addition, if we reach the end of the array and the
slots at the beginning are empty, the new elements are added at the
beginning of the array. This can continue until the rear catches up with the
front, at which point the queue is really full.
a) Circular Queue Operations using Arrays

The operations that can be performed in an array implementation of a
circular queue are listed below with suitable C routines.
Initialize: The C language statements which represent the initialization
of the queue are listed below:
# define MAX_SIZE 10
int queue[MAX_SIZE] , rear, front;
rear=front=-1;

Enqueue: This operation adds a new element to the rear end of the queue.
Here, the queue is declared full only when there is no empty space in the
queue, i.e. when rear = MAX_SIZE-1 and front = 0, or when the position
after the rear is the front (front = rear+1). If the rear has reached the
last position of the array but room is still available at the other end of
the queue, the rear moves in a circular fashion to position 0 and the
element is added at the new rear end. Otherwise, the element is added at
the rear end after incrementing rear by one. The C routine which depicts
the enqueue operation of a circular queue using an array is shown below:
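void enqueue(int queue[], int item)
{
    if ((front == 0 && rear == MAX_SIZE - 1) || (front == rear + 1)) // full condition
    {
        printf("Queue is overflow\n");
        return; // stop here so that existing elements are not overwritten
    }
    if (front == -1)
    {
        front = rear = 0; // first element of an empty queue
    }
    else
    {
        if (rear == MAX_SIZE - 1)
        {
            rear = 0; // moving in a circular manner
        }
        else
        {
            rear++;
        }
    }
    queue[rear] = item;
}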
Dequeue: This operation removes the element from the front end of the
queue. The C routine which depicts the dequeue operation of a circular
queue using arrays is described below:
int dequeue(int queue[])
{
    if (front == -1)
    {
        printf("queue is underflow\n");
        return -1; // sentinel value; assumes -1 is never stored in the queue
    }
    int element = queue[front];
    if (front == rear)
    {
        front = rear = -1; // the last element was removed
    }
    else
    {
        if (front == MAX_SIZE - 1)
        {
            front = 0; // moving in a circular manner
        }
        else
        {
            front++;
        }
    }
    return element;
}
In the above routine, before deleting an element, the queue is checked
for an underflow condition (front == -1), which is reported to the user if the
queue is empty. Otherwise the element at the front position is retrieved and
deleted. After deleting that element, the variable ‘front’ is adjusted as
follows. If the deleted element was the only one left in the queue
(front == rear), the queue is marked empty by setting front = rear = -1. If
the deleted element resided in the last position of the array
(front == MAX_SIZE-1), the front pointer wraps around in a circular fashion
to the first position of the array (front = 0). Otherwise, the front variable
is incremented by one, to show that ‘front’ points to the next element in
the sequence.

6.6 PRIORITY QUEUES
A priority queue is a type of queue in which each element is assigned
a certain priority, such that the order of deletion of elements is decided by
their associated priorities. The order of processing or deletion of elements
in a priority queue is decided by the following rules:
1. An element with the highest priority is deleted before all other
elements of lower priority.
2. If two elements have the same priority, then they are deleted in
the order in which they were added to the queue (i.e., First-In-First-
Out).
The implementation of priority queues may follow different
approaches. For instance, the elements may be added arbitrarily into the
queue and deleted as per their priority values, or the elements may be
sorted as per their priorities at the time of insertion itself and deleted in
a sequential fashion. We will follow the latter approach for implementing
priority queues.
The structure of a priority queue needs to be defined in such a
manner that each queue node is able to store both its value as well as its
priority information. The following C structure defines the node of a priority
queue:

struct queue /*Node of a priority queue*/
{
    int element;
    int priority;
    struct queue *next; /*Pointer to the next queue node*/
};
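The sorted-insertion approach chosen above might look like the following sketch. It reuses the node structure from the text; the function names pq_insert and pq_delete are hypothetical, and the code assumes that a larger priority number means a higher priority, which the text does not specify.

```c
#include <assert.h>
#include <stdlib.h>

struct queue {              /* node of a priority queue, as defined in the text */
    int element;
    int priority;
    struct queue *next;
};

/* Inserts a node so that the list stays sorted by descending priority.
   Equal priorities keep their arrival order (FIFO). Returns the new head. */
struct queue *pq_insert(struct queue *head, int element, int priority)
{
    struct queue *node = malloc(sizeof *node);
    if (node == NULL)
        return head;                        /* allocation failed: list unchanged */
    node->element = element;
    node->priority = priority;

    if (head == NULL || priority > head->priority) {
        node->next = head;                  /* new highest priority: becomes the head */
        return node;
    }
    struct queue *cur = head;
    while (cur->next != NULL && cur->next->priority >= priority)
        cur = cur->next;                    /* skip nodes of higher or equal priority */
    node->next = cur->next;
    cur->next = node;
    return head;
}

/* Deletes the front (highest-priority) node, stores its element in *out,
   and returns the new head. The queue must not be empty. */
struct queue *pq_delete(struct queue *head, int *out)
{
    assert(head != NULL && out != NULL);
    struct queue *next = head->next;
    *out = head->element;
    free(head);
    return next;
}
```

Because the list is kept sorted on insertion, deletion always takes the first node, which matches the two rules listed above.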

6.7 DOUBLE-ENDED QUEUES


A double-ended queue is a special type of queue that allows
insertion and deletion of elements at both ends, i.e., the front and rear.
In simple words, a double-ended queue can be described as a linear list of
elements in which the insertion and deletion of elements take place at its
two ends but not in the middle. This is the reason why it is termed a
double-ended queue or deque.

Based on the type of restrictions imposed on insertion and deletion of
elements, a double-ended queue is categorized into two types:
1. Input-restricted deque: It allows deletion from both ends but restricts
insertion to only one end.
2. Output-restricted deque: It allows insertion at both ends but restricts
deletion to only one end.

Figure: Double-Ended Queues

The insertion and deletion of elements is possible at both the front and
rear ends of the queue. As a result, the following four operations are
possible for a double-ended queue:
1. i_front: Insertion at the front end of the queue.
2. d_front: Deletion from the front end of the queue.
3. i_rear: Insertion at the rear end of the queue.
4. d_rear: Deletion from the rear end of the queue.

Applications of Queue
• Round robin techniques for processor scheduling are implemented
using queues.
• Printer server routines (in drivers) are designed using queues.
• All types of customer service software (like Railway/Air ticket
reservation) are designed using queues to give proper service to the
customers.

Let us sum up
In this unit we discussed the concept of queues, the logical
representation of queues, queue operations, and the array and linked
implementations of queues. Moreover, we learned about circular queues,
priority queues and double-ended queues with suitable examples. For
further reading, refer to the suggested books listed.

Check your progress


1. FIFO stands for _______.
2. _________ operation removes the element from the front end of the
queue.
3. ________ is a special type of queue that allows insertion and deletion of
elements at both the ends.

Glossary
FIFO, Enqueue, Dequeue, circular and priority queues

Suggested Readings
1. E Balagurusamy, "Data Structures", McGraw Hill Education (India)
Private Limited, 2019.
2. Mark Allen Weiss, "Data Structures and Algorithm Analysis in C++",
Third Edition, Pearson Education, 2007.
3. Reema Thareja, "Data Structures Using C", Second Edition, Oxford
University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed, "Fundamentals of
Data Structures in C", Second Edition, University Press, 2008.

Answer to check your progress


1. first in first out
2. Dequeue
3. double-ended queue.

Block–3: TREES

Unit -7: Basics of Trees

Unit -8: Binary Tree

Unit -9: Binary Search Tree

Unit -7
Basics of Trees

Structure

Overview
Learning objectives
7.0 Introduction

7.1 Trees
7.2 Tree Terminology
Let us sum up
Check your progress
Glossary
Suggested readings
Answers to check your progress

Overview
We have discussed linear and nonlinear data structures in the
previous units. In this chapter, we will learn about trees. This chapter
gives an overview of trees with their notations, terminologies,
representations and the operations performed on them.
Learning objectives
At the end of this unit you will be able to
• Understand the basic concepts of Trees in Data Structures.
• Get a clear idea about Trees.
• Get knowledge about the Tree Terminologies.
7.0 Introduction
A data structure is said to be a non-linear data structure if its
elements do not form a sequence or a linear series but form a hierarchical
format.
The frequently used non-linear data structures are:
(a) Trees: A tree maintains a hierarchical relationship between the
elements present in it.
7.1 Trees
Data structures like stacks, linked lists etc. are linear data
structures, whereas in real life we come across nonlinear structures such
as hierarchical data. For example, a tree resembles the Grandfather – Son –
Grandson relationship in a family, or the CEO – Vice President – General
Manager – Manager relationship in a company. Consider the hierarchical
structure typically followed in a college, as shown in the following figure.
Hierarchical structures of this kind can best be represented with the help
of a data structure called a tree.
Looking at the figure, in case you want to know the details of student
number 10 of Section A of the CSE branch, you need to follow the path
Principal – Dean Academics – Head CSE – In Charge Students – Section A.
In this process you need not visit Dean Admin, thus saving access time by
ignoring one complete branch of the tree.
As a result, this feature makes the tree one of the most useful and
widely used data structures in Computer Science, in the areas of data
storage, parsing, evaluation of expressions, and compiler design.

Principal
Dean Academics        Dean Admin
Head CSE    Head IT    Transport    Admissions
Student IC    Student IC
Section A    Section B

Figure: A tree organization

Definition

A tree T is a finite set of one or more nodes such that there is a
specially designated node called the ‘root’, and the remaining nodes are
partitioned into n>=0 disjoint sets T1, …, Tn, where each of these sets is a
tree. T1, …, Tn are called the subtrees of the root. In other words, a tree T
is a set of nodes which stores elements in such a way that it maintains a
parent-child relationship between the elements and satisfies the following:
• If T is not empty, T has a special node called the root that has no
parent
• Each node v of T other than the root has a unique parent node
w; each node with parent w is called a child of w
Formally, a tree is defined recursively as follows. It consists of one or more
items called nodes: a distinguished node called the root, and zero or more
non-empty subsets of nodes, denoted T1, T2, …, Tk, each of which is itself
a tree. These are called the subtrees of the root.
7.2 Terminologies of a Tree
Now, let us look at the terminologies used for the tree data structure in
this section.
1. Root: The root is at the top of the hierarchy. In the figure of the
college organization, Principal is the root of the tree.
2. Parent Node: Each node except the root has a parent. In the same
figure, the Head CSE node is the parent node of In Charge Admin and
In Charge Students. Principal, being the root, does not have a parent.
3. Ancestor or Descendant: An ancestor of a given node is either the
parent, the parent of the parent, the parent of that, parent, etc. The
counterpart of ancestor is descendant.
4. Child Nodes and Siblings: A child is a node connected directly
below the starting node. Nodes with the same parent are called
siblings. Observe that a node may have two, one or no child nodes
directly under it. Dean Academics has two children, Head IT and
Head CSE, which are siblings of each other. Observe that Section A,
Section B and In Charge Admin have no child nodes.
5. Branch and Non-Leaves: A branch is a sequence of nodes such
that the first is the parent of the second, the second is the parent of
the third, etc. Nodes with children are called non-leaves (or
sometimes internal nodes). Note that Principal, Deans, HODs, and in
charges are all called nodes. The interconnecting lines are called
edges.
6. Leaf Nodes: Nodes with no child nodes are called leaf nodes or
terminal nodes or external nodes. For example, Section A, Section B
and In Charge Admin are all leaf nodes.
7. Trees: A tree is a collection of nodes and edges. One of the nodes is
the root, and the remaining nodes are partitioned into a collection of
subtrees, each of which is a tree by itself.
8. Edge or Branch: A line drawn from the root to a child, or from any
node to its child, is called an edge. Note that there are 12 nodes A, B,
C, D, E, F, G, H, I, J, K, L in the figure below, but the number of edges
present is 11. Therefore, number of edges = number of nodes – 1.
9. Path: This is a list of vertices connected by edges, for example from
the root node to a leaf node. A-C-F-I-J is a path shown in the figure.
Note that there is only one path between any two nodes.
10. Path Length: This is defined as the number of edges in a path. In
the figure, the path A-C-F-I-J has a path length of 4.
11. Depth of a Node: It is defined as the path length from the root to
the specified node. In the figure below, node J is at depth 4 (counting
the root as depth 0) and node D has a depth of 2.
12. Height of a Node: It is defined as the number of edges on the
longest downward path (path length) from the specified node to a
leaf. In the figure below, node F is at height 2 (path F-I-J), whereas
the height of node A is 4 (path A-C-F-I-J).
13. Height of a Tree: It is defined as the number of edges on the
longest downward path (path length) between the root and the leaf.
14. Degree of a Node: The number of edges incident on a node is
called as a degree of a node. Node C is of degree 4 whereas node D
has a degree of 3 in the figure below.
15. Subtree and Proper Subtree: A subtree is a portion of a tree data
structure that can be viewed as a complete tree in itself. Any node in a
tree T, together with all the nodes below it, comprises a subtree of T.
The subtree corresponding to the root node is the entire tree; the
subtree corresponding to any other node is called a proper subtree
(in analogy to the term proper subset).

A
B C
D E F
G H I
J K L

Figure: A tree (nodes shown level by level)
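The terms height, path length and the relation "number of edges = number of nodes – 1" can be checked with a short program. The sketch below assumes a first-child/next-sibling node layout, and the names tnode, count_nodes, count_edges and height are hypothetical; it does not reproduce the exact tree of the figure.

```c
#include <assert.h>
#include <stddef.h>

/* A general tree node: each node links to its first child and to its
   next sibling (an assumed representation for this example). */
struct tnode {
    char label;
    struct tnode *first_child;
    struct tnode *next_sibling;
};

/* Number of nodes in the subtree rooted at t. */
int count_nodes(const struct tnode *t)
{
    if (t == NULL)
        return 0;
    int n = 1;
    for (const struct tnode *c = t->first_child; c != NULL; c = c->next_sibling)
        n += count_nodes(c);
    return n;
}

/* Number of edges in the subtree rooted at t: one per parent-child link. */
int count_edges(const struct tnode *t)
{
    if (t == NULL)
        return 0;
    int e = 0;
    for (const struct tnode *c = t->first_child; c != NULL; c = c->next_sibling)
        e += 1 + count_edges(c);
    return e;
}

/* Height of node t: edges on the longest downward path to a leaf. */
int height(const struct tnode *t)
{
    assert(t != NULL);
    int h = 0;
    for (const struct tnode *c = t->first_child; c != NULL; c = c->next_sibling) {
        int ch = height(c) + 1;
        if (ch > h)
            h = ch;
    }
    return h;
}
```

For a small tree in which A has children B and C, and the chain C-F-I-J hangs below C, these functions report 6 nodes, 5 edges, a height of 4 for A and a height of 2 for F.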

Common operations that could be performed on trees are:
• Enumerating all the items.
• Searching for an item.
• Adding a new item at a certain position on the tree.
• Deleting an item.
• Finding the root for any node.

Advantages of Tree
• Tree reflects the structural relationships in the data.
• It is used to represent hierarchies.
• It provides efficient insertion and searching operations.
• Trees are flexible: subtrees can be moved around with minimum
effort.
Disadvantage of Tree
A small change in the data can cause a large change in the structure of the
decision tree, causing instability.

Applications of Tree:
1. One reason to use trees might be because you want to store the
information that naturally forms a hierarchy.
2. If we organize keys in the form of a tree (with some ordering, e.g., BST),
we can search for a given key in moderate time (quicker than a linked list
and slower than arrays). Self-balancing search trees like AVL and Red-
Black trees guarantee an upper bound of O(log n) for search.
3. We can insert/delete keys in moderate time (quicker than arrays and
slower than unordered linked lists). Self-balancing search
trees like AVL and Red-Black trees guarantee an upper bound of
O(log n) for insertion/deletion.
4. Like linked lists and unlike arrays, pointer implementations of trees
do not have an upper limit on the number of nodes, as the nodes are
linked using pointers.
Other Applications:
1. Heap is a tree data structure which is implemented using arrays and
used to implement priority queues.
2. B-Tree and B+ Tree : They are used to implement indexing in
databases.
3. Syntax Tree: Used in Compilers.
4. K-D Tree: A space partitioning tree used to organize points in K
dimensional space.
5. Trie : Used to implement dictionaries with prefix lookup.
6. Suffix Tree : For quick pattern searching in a fixed text.

Let us sum up
This unit introduced you to the basics of trees along with the
importance of trees. You were also exposed to the different types of trees
with their representations, properties and manipulation operations. Finally,
operations of every search structure have been discussed with suitable
examples and implementation using the C language. For further reading,
refer to the suggested books listed below.

Check your progress
a) Fill in the blanks:

1. ___________ are a collection of nodes and edges.
2. __________ is defined as the number of edges in a path.
3. The number of edges incident on a node is called the _________.
4. A ___________ is a portion of a tree data structure that can be
viewed as a complete tree in itself.

Glossary
• Root
• Child Nodes and Siblings
• Ancestor or Descendant
• Parent Node
Suggested Readings
1. E Balagurusamy, "Data Structures", McGraw Hill Education (India)
Private Limited, 2019.
2. Mark Allen Weiss, "Data Structures and Algorithm Analysis in C++",
Third Edition, Pearson Education, 2007.
3. Reema Thareja, "Data Structures Using C", Second Edition, Oxford
University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed, "Fundamentals
of Data Structures in C", Second Edition, University Press, 2008.

Answer to check your progress

a) Fill in the blanks:

1. Trees
2. Path Length
3. degree of a node
4. subtree

Unit -8
Binary Tree

Structure

Overview
Learning objectives
8.0 Binary Tree

8.1 Array representation of Binary Tree


8.2 Linked Representation of Binary Tree
8.3 Binary Tree Traversal
Let us sum up
Check your progress
Glossary
Suggested readings
Answers to check your progress

Overview
In this chapter, we will learn about the concepts of binary trees.
This chapter gives an overview of these trees with their notations,
terminologies, representations and the operations performed on them. We
will also discuss binary tree traversal.

Learning objectives
At the end of this unit you will be able to
• Get knowledge of the Binary Tree.
• Understand the Array representation of Binary Tree.
• Get a clear idea about the Linked Representation of Binary Tree.
• Know about the Binary Tree Traversal.

8.0 Binary Trees
A binary tree is a rooted tree in which every node has at most
two children. A binary tree is a finite set of nodes that is either empty or
consists of a root and two disjoint binary trees called the left subtree and
the right subtree. A binary tree is a data structure which is defined as a
collection of elements called nodes. Every node contains a left pointer, a
right pointer and a data element. Every binary tree has a root element
pointed to by a ‘root’ pointer. The root element is the topmost node in the
tree. If the root is NULL, then the tree is empty. The following figure shows
the pictorial representation of a binary tree. Some more examples of binary
trees are shown in a later figure.

2
5 7
2 6 9
5 11 4

Figure: Binary Tree (node values shown level by level)

a) Notations and Terminologies of Binary Tree


A full binary tree is a tree in which every node has zero or two
children. It is also known as a proper binary tree. A perfect binary tree is
a full binary tree in which all leaves (vertices with zero children) are at
the same depth (distance from the root). Some texts also call the perfect
binary tree a complete binary tree, but in this unit a complete binary tree
is a binary tree which satisfies two properties: first, every level, except
possibly the last, is completely filled; second, all the nodes in the last
level appear as far left as possible. In a complete binary tree Tn, there
are exactly n nodes and level r of Tn can have at most 2^r nodes. An almost
complete binary tree is a tree where for a right child, there is always a
left child, but for a left child there may not be a right child.

In the binary tree shown in the figure, every internal node has the same
number of children: all internal nodes have two children and external nodes
have no children. In the following figure, internal nodes are represented
using circles and external nodes are represented using squares.

Figure: Internal and External node representation (circles: internal nodes; squares: external nodes)

A binary tree T is said to be an extended binary tree if each node in the
tree has either no child or exactly two children. In an extended binary
tree, the nodes having two children are called internal nodes and the nodes
having no children are called external nodes. An extended binary tree with
n internal nodes has exactly n + 1 external nodes. The minimum number of
nodes in an extended binary tree of height h is 2h + 1, and the maximum
number of nodes is 2^(h+1) - 1. The figure shows an extended binary tree
with 5 levels (height 4) and 2^5 - 1 = 31 nodes.

Figure: A full binary tree with nodes = 31, levels = 5 (level 0 to level 4)

Some interesting properties of full binary trees, illustrated with the tree
in the figure:
Number of levels L = 5 (level 0 to level 4)
Number of nodes N = 2 ^ L - 1 = 32 - 1 = 31 (numbered 0 to 30)
Number of nodes in a level m = 2 ^ m
Number of nodes at level 3 = 2 ^ 3 = 8
Number of internal nodes = sum of nodes at levels (L0 + L1 + L2 + L3) = 1 + 2 + 4 + 8 = 15
Number of external nodes (leaf nodes) = number of internal nodes + 1 = 15 + 1 = 16
Height/depth of a binary tree with x internal nodes = log2 (x + 1) = log2 (15 + 1) = 4

Figure: Examples of binary trees

b) Representations of Binary Trees


In the computer’s memory, a binary tree can be maintained either by
using a sequential representation (using arrays) or by using a linked
representation (using linked list).
A full binary tree of depth k is a binary tree of depth k having 2^k - 1
nodes (here the levels are numbered starting from 1). This is the maximum
number of nodes such a binary tree can have. A very elegant sequential
representation for such binary trees results from sequentially numbering
the nodes, starting with nodes on level 1, then those on level 2 and so on.
Nodes on any level are numbered from left to right. This numbering scheme
gives us the definition of a complete binary tree. A binary tree with n
nodes and a depth k is complete if and only if its nodes correspond to the
nodes which are numbered one to n in the full binary tree of depth k. The
nodes may be represented either using an array or using a linked list.

8.1. Array Representation of Trees

Array representation of trees can be done using a one-dimensional array.
This method is easy to understand and implement. It is very useful for
certain kinds of tree applications, such as heaps, and poorly suited to
others. It is the simplest technique for memory representation, but for
trees that are not complete it wastes a lot of memory space, and it is
inefficient when there are many insertions and deletions. The following
figure shows an array representation of a tree with all of its nodes placed
in the array locations.
Steps to implement binary trees using array representation are as follows:
 Take a complete binary tree called Tree and number its nodes
from top to bottom, left to right.
 Number the tree in such a way that the root is numbered 0, its
left child 1, its right child 2, the left child of the left child 3, and
so on for all the nodes.
 The root of the tree is stored in the first location; Tree[0]
represents the root element.
 Put the i-th node of this tree in the i-th position of the array.
 The children of a node k are stored in locations 2*k+1 and 2*k+2.
 The maximum size of the array Tree is 2^(d+1) - 1, where d is the
depth of the tree (the root is at depth 0).
 An empty tree is specified using NULL. If Tree[0] = NULL, then the
tree is an empty tree.

Three simple formulas allow you to go from the index of the parent to the
index of its children and vice versa:
 if index(parent) = N, index(left child) = 2*N+1
 if index(parent) = N, index(right child) = 2*N+2
 if index(child) = N, index(parent) = (N-1)/2 (integer division with
truncation)

Figure: Array Representation of Trees

8.2. Linked Representation of Trees


To represent a tree in a linked representation, a node in the tree
should have the following as represented in the figure:
 a data field
 a left child field with a pointer to the left tree node
 a right child field with a pointer to the right tree node

Figure: Representation of a node

 Look at the Figure, wherein we have shown a node. Observe that


the Left and Right pointers are pointing to NULL. The node holds the
value 12 in its data field. The schematic diagram of the linked
representation of the binary tree is shown in the figure. Here the
left position of the node stores the address of the left child of the
node. The middle position is used to store the data. Finally, the
right position stores the address of the right child of the node.
Empty subtrees are represented using NULL.

Figure: Linked Representation of Trees (each node has three fields: leftchild, element, rightchild)

The C representation of the binary tree with a node type is given below:
struct node {
    struct node *left;   /* address of the left child */
    int data;            /* data part of the node */
    struct node *right;  /* address of the right child */
};
Here, data represents the data part, and the left and right pointers store
the addresses of the left and right children of the node respectively.

8.3 Binary Tree Traversals


Traversing a tree means visiting all the nodes of the tree exactly once in
a systematic way. Elements are traversed in a sequential order in case of a
linear data structure, whereas for a non-linear data structure, the elements
can be traversed in many different ways. These traversal algorithms differ
in the order in which the nodes are visited. There are three modes in which
a tree could be traversed. All three are depth-first traversals and are
recursive in nature (the recursion implicitly uses a stack); a breadth-first,
level-by-level traversal would use a queue instead. They are:

In Order Traversal
1. Traverse the left subtree (follow in order recursively)
2. Visit the root
3. Traverse the right subtree (follow in order recursively)

Pre Order Traversal
1. Visit the root
2. Traverse the left subtree (follow pre order recursively)
3. Traverse the right subtree (follow pre order recursively)

Post Order Traversal
1. Traverse the left subtree (follow post order recursively)
2. Traverse the right subtree (follow post order recursively)
3. Visit the root

a) Pre Order Traversal
a) Pre Order Traversal

To traverse a binary tree in pre-order, the following operations are
performed recursively at each node. The algorithm starts with the root node
of the tree and proceeds by
1. Visiting the root node
2. Traversing the left subtree and finally
3. Traversing the right subtree.

We may recursively define it as:
pre-order: root, left subtree, right subtree
The parent comes before its children; overall, the root comes first.
Consider the tree given in the figure.

Figure: Binary Tree Traversal

The nodes of the tree T would be visited in the order: D B A C F E G. An
implementation of the pre-order traversal of a binary tree in the C
language is shown below. Here, root represents the root of the binary tree
to be traversed, with left and right holding the addresses of the left and
right child of every node.
int preorder(struct node *root) //preorder function
{
    if (root == NULL)
    {
        printf("\n\tEMPTY TREE");
        return 0;
    } //end if
    printf("%d ", root->data);     //visit the root
    if (root->left != NULL)
        preorder(root->left);      //traverse the left subtree
    if (root->right != NULL)
        preorder(root->right);     //traverse the right subtree
    return 0;
} //end preorder

b) In Order Traversal

To traverse a binary tree in in-order, the following operations are
performed recursively at each node. The algorithm starts with the root node
of the tree and proceeds by
1. Traversing the left subtree
2. Visiting the root node and finally,
3. Traversing the right subtree.

We may recursively define it as:
In-order: left subtree, root, right subtree
The parent comes between its left and right children.
Consider the tree given in the figure; the nodes of the tree T would be
visited in the order: A B C D E F G. An implementation of the in-order
traversal of a binary tree in the C language is shown below. Here, root
represents the root of the binary tree to be traversed, with left and right
holding the addresses of the left and right child of every node.
int inorder(struct node *root) //inorder function
{
    if (root == NULL)
    {
        printf("\n\tEMPTY TREE");
        return 0;
    } //end if
    if (root->left != NULL)
        inorder(root->left);       //traverse the left subtree
    printf("%d ", root->data);     //visit the root
    if (root->right != NULL)
        inorder(root->right);      //traverse the right subtree
    return 0;
} //end inorder

c) Post Order Traversal

To traverse a binary tree in post-order, the following operations are
performed recursively at each node. The algorithm starts with the root node
of the tree and proceeds by
1. Traversing the left subtree
2. Traversing the right subtree.
3. Visiting the root node.

We may recursively define it as:
Post-order: left subtree, right subtree, root
The parent comes after its children; overall, the root comes last.
Consider the tree given in the figure; the nodes of the tree T would be
visited in the order: A C B E G F D. An implementation of the post-order
traversal of a binary tree in the C language is shown below. Here, root
represents the root of the binary tree to be traversed, with left and right
holding the addresses of the left and right child of every node.
int postorder(struct node *root) //postorder function
{
    if (root == NULL)
    {
        printf("\n\tEMPTY TREE");
        return 0;
    } //end if
    if (root->left != NULL)
        postorder(root->left);     //traverse the left subtree
    if (root->right != NULL)
        postorder(root->right);    //traverse the right subtree
    printf("%d ", root->data);     //visit the root
    return 0;
} //end postorder

d) Binary Expression Trees

Binary trees are widely used to store algebraic expressions. An algebraic
expression can be represented using a binary tree as shown in the figure:

Figure: Binary expression tree

The expression tree can be traversed using any of the traversal methods
(preorder, inorder and postorder), and the resulting expressions are called
the prefix, infix and postfix expressions respectively.
e) Tree Traversal Problems

1. Construct a tree for the expression given below and produce its
equivalent pre-order (prefix) and post-order (postfix) expressions:
( ( a + ( b / c ) ) ^ ( ( a + b ) * c ) )
Step 1: Include brackets as per the rules of algebra and the precedence of
operators, and check the correctness of the parentheses.
Step 2: Number the parentheses, starting from the outermost pair.
Step 3: Assign the governing operator ^ of the outermost bracket (no. 1) to
the root. Assign the expression to the left of ^ as the left subtree (LST)
and the expression to the right of ^ as the right subtree (RST).
The expression tree which has been constructed for the above expression
is as follows:

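Figure: Expression tree with root ^. The left subtree is + with children A
and /, where / has children B and C. The right subtree is * with children +
and C, where + has children A and B.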

Inorder traversal: ( ( a + ( b / c ) ) ^ ( ( a + b ) * c ) ). Note that this
is nothing but the infix notation that we have already studied. The
equivalent pre-order and post-order expressions for the above tree are:
Pre-order expression: ^+A/BC*+ABC

Post-order expression: ABC/+AB+C*^

f) Applications of Binary Trees

The main applications of binary trees include:
1. Manipulating hierarchical data.
2. Making information easy to search.
3. Manipulating sorted lists of data.
4. Providing a workflow for compositing digital images for visual
effects.
5. Implementing dictionaries and translators, and so on.

Let us sum up
This unit introduced you to the basics of binary trees and to their
importance. You were also exposed to their representations, properties and
manipulation operations. Finally, binary tree traversal was discussed with
suitable examples and implementations in the C language. For further
reading, refer to the suggested books listed below.
Check your progress
a) Fill in the blanks:

1. _____________ are widely used to store algebraic expressions.


2. A binary tree is a data structure which is defined as a collection of
elements called____________.
3. A ___________is a tree in which every node has zero or two
children.
4. ___________ means visiting all the nodes of the tree exactly once
in a systematic way.
Glossary
Preorder, inorder, postorder, prefix, infix, postfix
Suggested readings

1. E. Balagurusamy, "Data Structures", McGraw Hill Education (India)
Private Limited, 2019.
2. Mark Allen Weiss, "Data Structures and Algorithm Analysis in
C++", Third Edition, Pearson Education, 2007.
3. Reema Thareja, "Data Structures Using C", Second Edition,
Oxford University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed,
"Fundamentals of Data Structures in C", Second Edition,
University Press, 2008.
Answers to check your progress
a) Fill in the blanks:

1. Binary trees
2. Nodes
3. Full binary tree
4. Traversing a tree

Unit -9
Binary Search Tree

Structure

Overview
Learning objectives
9.0 Binary Search Tree

9.1 Operations on a Binary Tree


9.2 Expression Trees.
Let us sum up
Check your progress
Glossary
Suggested readings
Answers to check your progress

Overview
We have discussed linear and non-linear data structures in the previous
units. In this unit, we will learn about binary search trees. The unit
gives an overview of their notations, terminologies and representations,
and of the operations performed on them. At the end, we discuss expression
trees.
Learning objectives
At the end of this unit you will be able to
 Understand the concepts of the binary search tree.
 Perform the operations on a binary search tree.
 Understand expression trees.

9.0 Binary Search Trees

Definition: A Binary Search Tree (BST), also known as an ordered binary
tree, is a variant of the binary tree in which the nodes are arranged in an
order. In a binary search tree, all the nodes in the left subtree have a
value less than that of the root node. Correspondingly, all the nodes in
the right subtree have a value greater than that of the root node. (Some
definitions allow equal values on one side; the insert routine in this unit
rejects duplicates, so the strict form is used here.) The same rule is
applicable to every subtree in the tree. Let us look at an example of a BST
in the figure.

Figure: Binary Search Tree


In the above example you can see that at each node the value in the left
child is less than the value in the node and the value in the right child
is greater than the value in the node.
a) Properties

A Binary Search Tree(BST) is a binary tree with the following properties:


1. The left subtree of a node N contains values that are less than
N's value.
2. The right subtree of a node N contains values that are greater
than N's value.
3. The left and right subtrees of every node are themselves binary
search trees.

9.1 Operations on a Binary Search Tree

A binary tree is referred to as a binary search tree if for any node n in
the tree:
a. The node elements in the left subtree of n are lesser in value
than n.
b. The node elements in the right subtree of n are greater in value
than n.

Thus, a binary search tree arranges its node elements in a sorted manner.
As the name suggests, the most important application of a binary search
tree is searching. The average running time of searching for an element in
a binary search tree is O(log n), which is better than the O(n) linear
search needed in an unsorted array or a linked list. In the worst case,
when the tree degenerates into a chain (for example, when the keys are
inserted in sorted order), searching takes O(n) time.

Figure: Binary search tree

As we can see in the figure, all the nodes in the left subtree are less than
the nodes in the right subtree.

The various operations performed on a binary search tree are:

1. Insert
2. Search
3. Delete

1. Insert:

Inserting a new node in a binary search tree should not violate the
properties of the tree. Therefore, we first have to find the correct
position where the insertion has to be done and then add the node at that
position. The insertion function changes the structure of the tree. The
insertion operation is done as follows:
1. Initially, the new element is compared with the root of the tree. If it
is less than the root node, the new element has to be inserted in the
left subtree of the root (go to step 2); otherwise it has to be
inserted in the right subtree (go to step 3).

2. If it is less than the root, the search continues with the left child
of the root, 'n' (i.e. one level below the root), which is compared
with the new element. If the new element is smaller, it is next
compared within the left subtree of 'n', else within the right subtree
of 'n'. This is done recursively until the correct position is found.
3. If it is greater than the root, the search continues with the right
child of the root, 'n1', which is compared with the new element. If
the new element is greater, it is next compared within the right
subtree of 'n1', else within the left subtree of 'n1'. This is done
recursively until the correct position is found.
4. Once the correct position (an empty subtree) is found, the new element
is inserted there.

Now, let us see how one can build a BST and insert nodes into the tree.
The basic idea is that at each node we compare the node's value with the
new value being inserted. If the new value is smaller, we move to the left
subtree, and if it is greater, we move to the right subtree.

The insert operation involves adding an element into the binary tree. The
location of the new element is determined in such a manner that insertion
does not disturb the sort order of the tree.

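Example: The C function for inserting an element into a binary search tree.

#include <stdio.h>
#include <stdlib.h>

/* Node type assumed by the functions in this section; the field names
   LEFT, RIGHT and INFO are the ones used in the code. */
typedef struct node {
    int INFO;
    struct node *LEFT, *RIGHT;
} node;

node *insert(node *r, int n)
{
    if (r == NULL)
    {
        /* empty position found: create the new node here */
        r = (node *) malloc(sizeof(node));
        r->LEFT = r->RIGHT = NULL;
        r->INFO = n;
    }
    else if (n < r->INFO)
        r->LEFT = insert(r->LEFT, n);
    else if (n > r->INFO)
        r->RIGHT = insert(r->RIGHT, n);
    else /* n == r->INFO */
        printf("\nInsert Operation failed: Duplicate Entry!!");
    return r;
}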

2. Search
The search operation is performed to find whether a given element is
present in a tree or not. The searching process begins at the root node. If
the root node is empty, then the value is not present in the tree. If the
root is not empty, perform the following steps:

1. Compare the search element with the key of the current node. If they
are equal, the element is found.
2. Else, if the element is less than the key of the current node,
recursively search in the left subtree of the current node.
3. If the element is greater than the key of the current node, recursively
search in the right subtree of the current node.
4. If an empty subtree is reached, the element is not present in the tree.
To search for the element 6 in the following figure, it is first compared
with the root node 7. Since 6 is less than 7, the left subtree is searched.
Next, it is compared with the element 5; since 6 is greater than 5, the
search continues with the right subtree of the node 5. The right subtree of
5, consisting of 6, gives a match and the element is found.

Figure: Searching a BST

The search operation involves traversing the various nodes of the binary
tree to search for the desired element. The sorted nature of the tree
greatly benefits the search operation because, with each iteration, the
number of nodes to be searched is reduced. For example, if the value to be
searched is less than the root value, then the remainder of the search
operation will only be performed in the left subtree while the right
subtree will be completely ignored.

Example: The C function for searching an element in a binary search tree.

void search(node *r, int n)
{
    if (r == NULL)
    {
        /* reached an empty subtree: the element is absent */
        printf("\n%d not present in the tree!!", n);
        return;
    }
    else if (n == r->INFO)
        printf("\nElement %d is present in the tree!!", n);
    else if (n < r->INFO)
        search(r->LEFT, n);     /* continue in the left subtree */
    else
        search(r->RIGHT, n);    /* continue in the right subtree */
}

3. Delete:

The deletion operation deletes a node from the binary search tree in such
a way that the properties of the binary search tree are not violated and no
other nodes are lost in the process. Three cases need to be handled while
deleting a node from a binary search tree:
Case 1: Deleting a node that has no children

This is the simplest case in deletion. A leaf node can be removed from a
BST directly. Consider figure (a), in which the node to be deleted is the
leaf node 4. Make the pointer in its parent that refers to it point to NULL
and free the node 4. For example, to delete node 4, the corresponding child
pointer of node 5 is set to NULL and node 4 is freed, as shown in figure
(b).

Figure.(a): Deleting a leaf node Figure (b): After deleting a leaf node

Case 2: Deleting a node with one Child

In this case, the node's child is set to be the child of the node's parent.
In other words, the node to be deleted is replaced by its child. If the
node to be deleted, 'n', was the left child of its parent, n's child
becomes the left child of n's parent. Similarly, if 'n' was the right child
of its parent, n's child becomes the right child of n's parent. For
example, let us delete the node 9, which has only a right child, as shown
in figure (a). The right pointer of node 7 is made to point to node 11,
which is the child of 9. The new tree after deletion is shown in figure
(b).

Figure(a) BST before deletion Figure (b) BST after deletion

Case 3: Deleting a node with two Children

In this case, the node to be deleted can be replaced by its in-order
predecessor or in-order successor, i.e. either the largest element of the
left subtree or the smallest element of the right subtree. The predecessor
or successor node then needs to be deleted from its original position using
one of the above cases. In the example figure below, the node to be
deleted, 9, has two children. Therefore the value in node 9 gets replaced
with the value '6' taken from one of its subtrees, as shown in figure (b).

Figure (a) Node 9 to be deleted Figure (b) Node 9 gets replaced with 6

Figure (c) BST after deletion of 9
The empty place shown in figure (b) is then filled by its child 7, using
the earlier cases, as shown in figure (c).

The delete operation involves removing an element from the binary


search tree. It is important to ensure that after the element is removed from
the tree, the other elements are shuffled in such a manner that the sort
order of the tree is regained.
If the node to be deleted is a leaf node, then it is simply deleted
without requiring any shuffling of other nodes. However, if the node to be
deleted is an internal node then appropriate shuffling is required to ensure
that the tree regains its sort order.

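Example: The C function for deleting an element from a binary search tree.
The function receives the address of the pointer to the current node, so
that the link in the parent (or the root pointer) can be updated when a
node is removed. It returns 1 if the element was deleted and 0 if it was
not found.

int del(node **r, int n)
{
    node *ptr;
    if (*r == NULL)
    {
        return(0);                      /* element not found */
    }
    else if (n < (*r)->INFO)
        return(del(&(*r)->LEFT, n));
    else if (n > (*r)->INFO)
        return(del(&(*r)->RIGHT, n));
    else
    {
        if ((*r)->LEFT == NULL)
        {
            /* no left child: link the right child (or NULL) in its place */
            ptr = *r;
            *r = (*r)->RIGHT;
            free(ptr);
            return(1);
        }
        else if ((*r)->RIGHT == NULL)
        {
            /* only a left child: link it in place of the node */
            ptr = *r;
            *r = (*r)->LEFT;
            free(ptr);
            return(1);
        }
        else
        {
            /* two children: copy the in-order predecessor (largest key
               in the left subtree) and delete it from that subtree */
            ptr = (*r)->LEFT;
            while (ptr->RIGHT != NULL)
                ptr = ptr->RIGHT;
            (*r)->INFO = ptr->INFO;
            return(del(&(*r)->LEFT, ptr->INFO));
        }
    }
}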

Example program:

The C implementation of the various operations that can be performed on a
binary search tree is given below as a menu-driven program:
Binary Search Tree Insertion, Deletion and Traversal Operations

#include <stdio.h>
#include <stdlib.h>

struct treeNode {
int data;
struct treeNode *left, *right;
};
struct treeNode *root = NULL;

/* create a new node with the given data */


struct treeNode* createNode(int data) {
struct treeNode *newNode;
newNode = (struct treeNode *) malloc(sizeof (struct treeNode));
newNode->data = data;
newNode->left = NULL;
newNode->right = NULL;
return(newNode);
}
/* insertion in binary search tree */
void insertion(struct treeNode **node, int data) {
if (*node == NULL) {
*node = createNode(data);
} else if (data < (*node)->data) {
insertion(&(*node)->left, data);
} else if (data > (*node)->data) {
insertion(&(*node)->right, data);
}
}

/* deletion in binary search tree */
/*
 * node is the address of the link (the root pointer or a parent's
 * left/right field) that points to the current node, so assigning
 * to *node relinks the tree. The parent argument is kept only so
 * that the call deletion(&root, NULL, data) in main still compiles;
 * it is not needed and is not used.
 */
void deletion(struct treeNode **node, struct treeNode **parent, int data) {
    struct treeNode *tmpNode, **link;
    (void) parent;
    if (*node == NULL)
        return;
    if (data < (*node)->data) {
        /* traverse towards left subtree */
        deletion(&(*node)->left, node, data);
    } else if (data > (*node)->data) {
        /* traverse towards right subtree */
        deletion(&(*node)->right, node, data);
    } else if (!(*node)->left || !(*node)->right) {
        /* deleting a leaf node or a node with one child:
         * put the only child (or NULL) in place of the node */
        tmpNode = *node;
        *node = tmpNode->left ? tmpNode->left : tmpNode->right;
        free(tmpNode);
    } else {
        /*
         * Deleting a node with two children.
         * First, find the smallest node in the right subtree.
         * Copy its data into the node to be deleted, then
         * unlink the smallest node (it has no left child).
         */
        link = &(*node)->right;
        while ((*link)->left)
            link = &(*link)->left;
        tmpNode = *link;
        (*node)->data = tmpNode->data;
        *link = tmpNode->right;
        free(tmpNode);
    }
}
/* search the given element in binary search tree */

void findElement(struct treeNode *node, int data) {


if (!node)
return;
else if (data < node->data) {
findElement(node->left, data);
} else if (data > node->data) {
findElement(node->right, data);
} else
printf("data found: %d\n", node->data);
return;
}
void traverse(struct treeNode *node) {

if (node != NULL) {
traverse(node->left);
printf("%3d", node->data);
traverse(node->right);
}
return;

}
int main() {
int data, ch;
while (1) {
printf("1. Insertion in Binary Search Tree\n");
printf("2. Deletion in Binary Search Tree\n");
printf("3. Search Element in Binary Search Tree\n");

printf("4. Inorder traversal\n5. Exit\n");
printf("Enter your choice:");
scanf("%d", &ch);

switch (ch) {
case 1:
while (1) {
printf("Enter your data:");
scanf("%d", &data);
insertion(&root, data);
printf("Continue Insertion(0/1):");
scanf("%d", &ch);
if (!ch)
break;
}
break;

case 2:
printf("Enter your data:");
scanf("%d", &data);
deletion(&root, NULL, data);
break;
case 3:

printf("Enter value for data:");


scanf("%d", &data);
findElement(root, data);
break;
case 4:
printf("Inorder Traversal:\n");
traverse(root);

printf("\n");
break;
case 5:

exit(0);
default:
printf("You have entered a wrong option\n");
break;
}
}
return 0;
}
9.2 Expression Trees
An expression tree is nothing but a binary tree containing a mathematical
expression. The internal nodes of the tree are used to store operators,
while the leaf or terminal nodes are used to store operands. Various
compilers and parsers use expression trees for evaluating arithmetic and
logical expressions.
Consider the following expression:

(a+b)*(a–b/c)
The expression tree for the above expression is shown in Figure.

Figure: Expression tree

As shown in the above tree, the internal nodes store the operators while the
leaf nodes store the operands. While constructing a binary tree from a given
expression, the following precedence rules are followed:

1. Parentheses are evaluated first.
2. The exponential expressions are evaluated next.
3. Then, division and multiplication operations are evaluated.
4. Finally, addition and subtraction operations are evaluated.

Representing an expression using a binary tree has another key advantage.
By applying the various traversal methods we can deduce the other
representations of an expression. For example, the preorder traversal of an
expression tree derives its prefix notation.

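The table shows the various expression notations deduced by traversing the
expression tree shown in the figure. The inorder traversal visits the
symbols without brackets, so the grouping of the original expression
(a+b)*(a-b/c) is not visible in the infix column.
Table: Expression notations

Expression Notation    Traversal Method    Example (refer to the figure)
Prefix                 Preorder            *+ab-a/bc
Infix                  Inorder             a+b*a-b/c
Postfix                Postorder           ab+abc/-*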

Let us Sum up
In this unit, we discussed binary search trees. The unit gave an overview
of their notations, terminologies and representations, and of the
operations performed on them. At the end, we discussed expression trees.
For further reading, refer to the suggested books listed below.

Check your progress
a) Fill in the blanks:

1. BST stands for ____________


2. Binary search tree arranges its node elements in a ___________
manner.
3. The insert operation involves __________ into the binary tree.
4. Expression tree is nothing but a binary tree
containing____________.

Glossary
BST, Child, Node, Insert, Delete, Search and Traversal

Suggested readings
1. E. Balagurusamy, "Data Structures", McGraw Hill Education (India)
Private Limited, 2019.
2. Mark Allen Weiss, "Data Structures and Algorithm Analysis in
C++", Third Edition, Pearson Education, 2007.
3. Reema Thareja, "Data Structures Using C", Second Edition,
Oxford University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed,
"Fundamentals of Data Structures in C", Second Edition,
University Press, 2008.
Answers to check your progress
1. Binary Search Tree
2. Sorted
3. adding an element
4. mathematical expression

BLOCK – 4: GRAPHS

Unit -10: Basics of Graphs

Unit -11: Shortest Path Algorithm

Unit -12: Graph Traversal

Unit -13: Searching techniques

Unit -10
Basics of Graphs

Structure

Overview
Learning objectives
10.0 Graphs

10.1 Graph Terminology


10.2 Implementing Graphs Using Adjacency Matrix and Path Matrix
10.4 Adjacency List
Let us sum up
Check your progress
Glossary
Suggested readings
Answers to check your progress

Overview
We have discussed linear and non-linear data structures in the previous
units. In this unit, we will learn about graphs and graph terminology. The
unit gives an overview of graphs with their implementations,
representations and the operations performed using lists.

Learning objectives
At the end of this unit you will be able to
 Understand the concepts of graphs and graph terminology.
 Implement graphs using the adjacency matrix and the path matrix.
 Understand the adjacency list.

10.0 Graph
Definition
A graph is an abstract data structure that is used to implement the graph
concept from mathematics. It is basically a collection of vertices (also
called nodes) and edges that connect these vertices. A graph is often
viewed as a generalization of the tree structure: it represents a complex
relationship between the nodes instead of the parent-child relationship
that exists in trees.
A graph G consists of two things:

A set V of elements called nodes (or points or vertices).

A set E of edges such that each edge e in E is identified with a unique
(unordered) pair [u, v] of nodes in V, denoted by e = [u, v].
A graph can also be represented as G = (V, E) where V corresponds to the
set of vertices and E corresponds to the set of edges. Each edge is a pair
(v, w) where v, w ∈ V. Edges are sometimes referred to as arcs. If the
edges are ordered pairs, then the graph is directed. Directed graphs are
sometimes referred to as digraphs. Vertex w is adjacent to v if and only if
(v, w) ∈ E. In an undirected graph, edges do not have a direction.
Therefore the edge (v, w) can also be considered as (w, v), so w is
adjacent to v and v is adjacent to w. Sometimes, an edge has a third
component called the weight or cost of the edge. Figure a) shows a sample
undirected graph, whereas figure b) shows a directed graph with ordered
pairs of edges.

Figure a): Undirected Graph
Figure b): Directed Graph

Before getting into details of a Graph, let us see a few of the commonly
used notations and terminologies with respect to graphs and their
definitions.

10.1 Notations and Terminologies of a Graph:
 End points: If e = [u, v], then the nodes u and v are called the
endpoints of e.
 Adjacent nodes or neighbors: If e = [u, v], then the nodes u and v
are said to be adjacent nodes or neighbors.
 Degree: The degree of a node u, written deg (u), is the number of
edges containing u.
 Isolated node: If deg (u) = 0 – that is, if u does not belong to any
edge – then u is called an isolated node.
 Path: A path P of length n from a node u to a node v is defined as a
sequence of n + 1 nodes.
P = (v0, v1, v2. . . vn )
such that u = v0; vi – 1 is adjacent to vi for i = 1,2, . . . , n; and vn = v.
 Closed Path: The path P is said to be closed if v0 = vn.
 Simple Path : The path P is said to be simple if all the nodes are
distinct, with the exception that v0 may equal vn;
 Cycle: A cycle is a closed simple path with the length 3 or more.
 k – Cycle: A cycle of length k is called a k – cycle.
 Connected Graph: A graph G is said to be connected if and only if
there is a simple path between any two of its nodes in G.
 Complete Graph: A graph G is said to be complete if every node u
in G is adjacent to every other node v in G. A complete graph with n
nodes will have n(n – 1) / 2 edges.
 Labeled Graph: A graph G is said to be labeled, if its edges are
assigned data.
 Tree or Tree graph or free tree: A connected graph T without any
cycles is called a tree graph or free tree, or simply, a tree. If T is a
finite tree with m nodes, then T will have m – 1 edges.
 Weighted Graph: A graph G is said to be weighted if each edge e in
G is assigned a nonnegative numerical value w (e) called the weight
or length of e.
 Multiple Edges: Distinct edges e and e’ are called multiple edges if
they connect the same endpoints, that is, if e = [u, v] and e’ = [u, v].

 Loops: An edge e is called a loop if it has identical end points, that
is, if e = [u, u].
 Multigraph: A graph G with multiple edges or loops is called a
multigraph.
 Size of a Graph: The size of a graph is the total number of edges in
it.
 Degree: In a directed graph, the number of edges beginning at a node u
is the out degree of u, and the number of edges ending at u
is the in degree of u. A node is called a Source if it has a positive
out degree and zero in degree. A node is called a Sink if it has a
positive in degree and zero out degree.
 Directed Graph: A directed graph G, also called a digraph or graph,
is the same as a multigraph except that each edge e in G is
assigned a direction.
 If G is a directed graph with a directed edge e = (u, v), then:
 e is also called an arc.
 e begins at u and ends at v. u is the origin or initial
point of e.
 u is a predecessor of v, and v is a successor or
neighbor of u.
 u is adjacent to v, and v is adjacent to u.
 The outdegree of a node u in G, written outdeg(u), is
the number of edges beginning at u.
 The indegree of u, written indeg(u), is the number of
edges ending at u.
 Source: A node u is called a source if it has a positive outdegree but
zero indegree.
 Sink: A node u is called a sink if it has a zero outdegree but a
positive indegree.
 Reachable node: A node v is said to be reachable from a node u if
there is a directed path from u to v.
 Strongly connected: A directed graph G is said to be connected, or
strongly connected, if for each pair (u, v) of nodes in G there is a
path from u to v and there is also a path from v to u.

 Unilaterally connected: A digraph G is said to be unilaterally
connected if for any pair (u, v) of nodes in G, either there is a path
from u to v or a path from v to u.
 Parallel/Multiple Edges: Distinct edges which connect the same
endpoints in the same direction are called parallel or multiple edges.
That is, e = (u, v) and e’ = (u, v) are known as multiple edges of G.
 Simple graph: A directed graph is said to be simple if G has no
parallel edges.
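The indegree, outdegree, source and sink definitions above can be checked mechanically. The following is a minimal C++ sketch, assuming the nodes are numbered 0 to n-1 and the directed graph is given as a list of (u, v) pairs; the names `Degrees`, `computeDegrees`, `isSource` and `isSink` are illustrative and do not come from the text.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper type: edge counts for every node of a directed graph
// whose nodes are numbered 0 .. n-1.
struct Degrees {
    std::vector<int> out;  // out[u] = number of edges beginning at u
    std::vector<int> in;   // in[u]  = number of edges ending at u
};

Degrees computeDegrees(int n, const std::vector<std::pair<int, int>>& edges) {
    Degrees d{std::vector<int>(n, 0), std::vector<int>(n, 0)};
    for (const auto& e : edges) {
        d.out[e.first]++;   // the edge (u, v) begins at u
        d.in[e.second]++;   // the edge (u, v) ends at v
    }
    return d;
}

// A source has a positive outdegree but zero indegree.
bool isSource(const Degrees& d, int u) { return d.out[u] > 0 && d.in[u] == 0; }

// A sink has a zero outdegree but a positive indegree.
bool isSink(const Degrees& d, int u) { return d.out[u] == 0 && d.in[u] > 0; }
```

For example, with the edges (0,1), (0,2) and (1,2), node 0 is a source, node 2 is a sink and node 1 is neither.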
Following these definitions and terminologies, figure a) shows a graph,
whereas figure b) depicts a multigraph. A tree graph and a weighted graph
are depicted in figures c) and d) respectively.

Figure a) Graph

Figure b) Multigraph

Figure c) Tree Graph

Figure d) Weighted Graph


Representation of Graph

A graph is a mathematical structure which can be represented in the
computer memory in two ways:
 Sequential representation using an adjacency matrix
 Linked representation using an adjacency list, which stores the neighbors
of a node in a linked list.

10.2 Implementing Graphs Using Adjacency Matrix


a) Adjacency Matrix Representation

An adjacency matrix is used to represent which nodes are adjacent
to one another. Two nodes are said to be adjacent to each other if there is
an edge connecting them. If G is a simple directed graph with m nodes, and
the nodes of G have been ordered as v1, v2, . . . , vm, then the
adjacency matrix A = (aij) of the graph G is an
m x m matrix which is defined as follows:

aij = 1 if vi is adjacent to vj, that is, if there is an edge (vi, vj)
aij = 0 otherwise

Such a matrix A, which contains entries of only 0 and 1, is called a bit matrix
or Boolean matrix. In an adjacency matrix, the rows and columns are
labeled by graph vertices. An entry aij in the adjacency matrix contains 1
if the vertices vi and vj are adjacent to each other. If the nodes are not
adjacent, aij is set to zero.

Figure: Graph for Adjacency Matrix Representation


The adjacency matrix A for the graph G shown in the figure above is as follows:

      A  C  D  E
A     0  1  1  1
C     0  0  1  1
D     0  1  0  0
E     0  0  1  0

Note: For a directed graph, the number of 1’s in A is equal to the number of
edges in G. The adjacency matrix for an undirected graph is symmetric, as the
edge (i,j) will be in Edges(G) if and only if the edge (j,i) is also in
Edges(G). The adjacency matrix for a directed graph may not be symmetric.
The space needed to represent a graph with n nodes using its adjacency
matrix is n*n bits.
To find the number of paths of length k:

Let A be the adjacency matrix of a graph G. Then ak(i, j), the ij entry in the
matrix A^k, gives the number of paths of length k from vi to vj.
Example:

Consider the graph G given in the figure above, whose adjacency matrix A is
already given. The powers A^2, A^3 and A^4 of the matrix A (rows and
columns in the order A, C, D, E) are as follows:

A^2 =
0 1 2 1
0 1 1 0
0 0 1 1
0 1 0 0

A^3 =
0 2 2 1
0 1 1 1
0 1 1 0
0 0 1 1

A^4 =
0 2 3 2
0 1 2 1
0 1 1 1
0 1 1 0

The matrix Br gives the number of paths of length r or less from node vi
to node vj. The matrix Br is defined as follows:

Br = A + A^2 + A^3 + . . . . + A^r
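The path counts described above can be computed by repeated matrix multiplication. The sketch below is one possible C++ illustration, assuming the adjacency matrix is stored as a vector of rows; the names `Matrix`, `multiply`, `power` and `pathsUpTo` are illustrative and are not part of the text.

```cpp
#include <vector>

using Matrix = std::vector<std::vector<long long>>;

// Ordinary matrix product of two n x n matrices.
Matrix multiply(const Matrix& x, const Matrix& y) {
    int n = static_cast<int>(x.size());
    Matrix z(n, std::vector<long long>(n, 0));
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                z[i][j] += x[i][k] * y[k][j];
    return z;
}

// Returns the k-th power of a (k >= 1); entry [i][j] is the number of
// paths of length k from node i to node j.
Matrix power(const Matrix& a, int k) {
    Matrix result = a;
    for (int step = 1; step < k; step++) result = multiply(result, a);
    return result;
}

// Returns the sum of the first r powers of a (r >= 1); entry [i][j] is the
// number of paths of length r or less from node i to node j.
Matrix pathsUpTo(const Matrix& a, int r) {
    int n = static_cast<int>(a.size());
    Matrix sum = a, p = a;
    for (int step = 2; step <= r; step++) {
        p = multiply(p, a);
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) sum[i][j] += p[i][j];
    }
    return sum;
}
```

Calling `power(A, 2)` on the 4 x 4 adjacency matrix given earlier returns the matrix of two-edge path counts, and `pathsUpTo(A, r)` adds up the powers from 1 to r.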

b) Path Matrix representation:

The path matrix is a special matrix/structure that has been
used in answering the generalized forms of partially and fully
instantiated same generation queries in deductive databases and in
computing the transitive closure of a database relation. In this
matrix, the rows represent some paths in the graph starting from the
root/source vertices and ending at the leaves. Basically, depth-first
search is used to create the paths of the graph. Instead of storing every
vertex in all paths, the common parts of these paths are stored
only once to avoid duplication.
If two paths P1 = <a1, a2, ..., an, b1, b2, ..., bm> and P2 =
<a1, a2, ..., an, c1, c2, ..., cl> have the common part <a1, a2, ..., an>,
then P1 and P2 can be stored in two consecutive rows of the
matrix as <a1, a2, ..., an, b1, b2, ..., bm> and < -- n empty entries --,
c1, c2, ..., cl>, where the first n entries of the second row are
empty.

10.3 Adjacency List Representation


In an adjacency list representation, the graph is stored in the
form of linked lists. This structure consists of a list of all nodes in a graph
G. Furthermore, every node is in turn linked to the list of all other nodes that
are adjacent to it. Consider a graph with the vertices 1, 2,
3, 4, 5 shown in figure a). For each vertex, a linked list is
defined for its adjacent vertices, and the corresponding adjacency list is
shown alongside the graph. For a directed
graph, the sum of the lengths of all adjacency lists is equal to the number of
edges in G. However, for an undirected graph, the sum of the lengths of all
adjacency lists is equal to twice the number of edges in G, because an edge
(u,v) appears both in the list of u and in the list of v. It is easy to
follow the adjacent nodes with the help of this representation. Figure b)
shows an undirected graph with its corresponding adjacency list
representation.

Figure a) Directed graph and its adjacency list representation

Figure b) Undirected Graph with its adjacency list representation
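As a rough illustration of the adjacency list idea, the following C++ sketch stores one list of neighbors per node. It is only a sketch under stated assumptions: nodes are numbered 0 to n-1, `std::vector` stands in for the linked lists described in the text, and the names `Graph`, `addEdge` and `totalListLength` are hypothetical.

```cpp
#include <vector>

// Hypothetical adjacency-list graph: adj[u] lists the nodes adjacent to u.
// std::vector is used here in place of a hand-written linked list.
struct Graph {
    std::vector<std::vector<int>> adj;
    bool directed;

    Graph(int n, bool isDirected) : adj(n), directed(isDirected) {}

    void addEdge(int u, int v) {
        adj[u].push_back(v);
        if (!directed) adj[v].push_back(u);  // undirected: store both directions
    }

    // Sum of the lengths of all adjacency lists.
    int totalListLength() const {
        int total = 0;
        for (const auto& list : adj) total += static_cast<int>(list.size());
        return total;
    }
};
```

Adding the same three edges to a directed and to an undirected `Graph` gives total list lengths of 3 and 6 respectively, which matches the "number of edges" and "twice the number of edges" rule stated above.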

Let us sum up
In this unit we discussed the basics of graphs and their
terminologies. We also discussed the representation of graphs using the
adjacency matrix and the path matrix. Finally, the adjacency list
representation was discussed with suitable examples. For further reading,
refer to the suggested books listed below.

Check your progress


a) Fill in the blanks:

1. ______ is basically a collection of vertices
2. The graph with either multiple edges or loops is called_________.
3. ________ has a zero outdegree but a positive indegree.
4. The size of a graph is the total number of ________.

Glossary
Weighted graph, multigraph, size of graph, cycle, loops, directed graph.

Suggested readings

1. E Balagurusamy, “Data Structures”, McGraw Hill Education (India)
Private Limited, 2019.
2. Mark Allen Weiss, “Data Structures and Algorithm Analysis in C++”,
Third Edition, Pearson Education, 2007.
3. Reema Thareja, “Data Structures Using C”, Second Edition,
Oxford University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed,
“Fundamentals of Data Structures in C”, Second Edition, University
Press, 2008.

Answers to check your progress


a) Fill in the blanks:
1. Graph
2. Multigraph
3. sink
4. edges

Unit -11
Shortest Path Algorithms

Structure

Overview
Learning objectives
11.0 Shortest Path Algorithms

11.1 Types of Shortest Path Algorithms


11.2 Examples
11.3 Performance Analysis of Algorithms
Let us sum up
Check your progress
Glossary
Suggested readings
Answers to check your progress

Overview
This unit teaches you the definition and basics of shortest path
algorithms. We also discuss the different types of shortest path algorithms.
The unit gives three example algorithms and then covers the performance
analysis of algorithms.

Learning objectives
At the end of this unit you will be able to
 Understand the concepts of shortest Path algorithms
 Get knowledge of shortest Path algorithms applications with
example
 Get clear idea about the Performance Analysis of Algorithms.

11.0 Shortest Path Algorithms
Shortest path algorithms are a family of algorithms designed to solve
the shortest path problem. The shortest path problem is something most
people have some intuitive familiarity with: given two points, A and B, what
is the shortest path between them? In computer science, however, the
shortest path problem can take different forms and so different algorithms
are needed to be able to solve them all.
For simplicity and generality, shortest path algorithms typically
operate on some input graph, G. This graph is made up of a set of
vertices, V, and edges, E, that connect them. If the edges have weights,
the graph is called a weighted graph. Sometimes these edges are
bidirectional and the graph is called undirected. Sometimes there can
even be cycles in the graph. Each of these subtle differences is what
makes one algorithm work better than another for a certain graph type. An
example of a graph is shown below.

Figure: Undirected weighted graph


There are also different types of shortest path algorithms. Maybe
you need to find the shortest path between point A and B, but maybe you
need to know the shortest paths between point A and all other points in the
graph.
Shortest path algorithms have many applications. Mapping software like
Google Maps or Apple Maps makes use of shortest path
algorithms. They are also important for road network, operations, and
logistics research. Shortest path algorithms are also very important for
computer networks, like the Internet.

Any software that helps you choose a route uses some form of a
shortest path algorithm. Google Maps, for instance, has you put in a starting
point and an ending point and will solve the shortest path problem for you.

a) Types of Graphs
There are many variants of graphs. The first property is the
directionality of its edges. Edges can either be unidirectional or bidirectional.
If they are unidirectional, the graph is called a directed graph. If they are
bidirectional (meaning they go both ways), the graph is called
an undirected graph. In the case where some edges are directed and others
are not, each bidirectional edge should be swapped out for 2 directed edges
that fulfill the same functionality. That graph is now fully directed.

Figure: directed weighted graph


The second property of a graph has to do with the weights of the
edges. Edges can have no weight, and in that case the graph is
called unweighted. If edges do have weights, the graph is said to
be weighted. There is an extra caveat here: graphs can be allowed to have
negative weight edges. The inclusion of negative weight edges prohibits the
use of some shortest path algorithms.

The third property of graphs that affects what algorithms can be
used is the existence of cycles. A cycle is defined as any path p through a
graph, G, that visits the same vertex, v, more than once. So, if a graph
has any path that has a cycle in it, that graph is said to
be cyclic. Acyclic graphs, graphs that have no cycles, allow more freedom
in the use of algorithms.

Figure: cycle graph

11.1 Types of Shortest Path Algorithms


There are two main types of shortest path algorithms, single-source
and all-pairs. Both types have algorithms that perform best in their own way.
All-pairs algorithms take longer to run because of the added complexity. All
shortest path algorithms return values that can be used to find the shortest
path, even if those return values vary in type or form from algorithm to
algorithm.
Single-source

Single-source shortest path algorithms operate under the following
principle:
Given a graph G, with vertices V, edges E with weight function w(u, v),
and a single source vertex, s, return the shortest
paths from s to all other vertices in V.

If the goal of the algorithm is to find the shortest path between only
two given vertices, s and t, then the algorithm can simply be stopped
when that shortest path is found. Because there is no way to decide which
vertices to "finish" first, all algorithms that solve for the shortest path
between two given vertices have the same worst-case asymptotic
complexity as single-source shortest path algorithms.
All-pairs

All-pairs shortest path algorithms follow this definition:
Given a graph G, with vertices V, edges E with weight function w(u, v),
return the shortest path from u to v for all pairs (u, v) of vertices
in V.
The most common algorithm for the all-pairs problem is the Floyd-Warshall
algorithm. This algorithm returns a matrix of values M, where each
cell M(i, j) is the distance of the shortest path from vertex i to vertex j.
Path reconstruction is possible to find the actual path taken to achieve that
shortest path, but it is not part of the fundamental algorithm.

11.2 Examples
The shortest path problem is about finding a path between 2 vertices
in a graph such that the total sum of the edges weights is minimum.
This problem could be solved easily using (BFS) if all edge weights
were (1), but here weights can take any value. Three different algorithms
are discussed below depending on the use-case.
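The remark about BFS can be made concrete. The following C++ sketch computes shortest distances when every edge has weight 1, assuming nodes numbered 0 to n-1 and a graph stored as adjacency lists; the function name `bfsDistances` and the use of -1 for "unreachable" are illustrative choices, not taken from the text.

```cpp
#include <queue>
#include <vector>

// Returns dist[v] = the fewest edges on a path from source to v, or -1 if v
// cannot be reached. adj[u] lists the neighbours of u (nodes 0 .. n-1).
std::vector<int> bfsDistances(const std::vector<std::vector<int>>& adj, int source) {
    std::vector<int> dist(adj.size(), -1);
    std::queue<int> q;
    dist[source] = 0;
    q.push(source);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (int v : adj[u]) {
            if (dist[v] == -1) {          // the first visit gives the shortest distance
                dist[v] = dist[u] + 1;
                q.push(v);
            }
        }
    }
    return dist;
}
```

Because BFS visits nodes level by level, the first time a node is reached is along a path with the fewest edges. This no longer holds once edges carry different weights, which is why the algorithms below are needed.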
a) Bellman Ford's Algorithm:

Bellman Ford's algorithm is used to find the shortest paths from the
source vertex to all other vertices in a weighted graph. It depends on the
following concept: a shortest path contains at most n−1 edges, because a
shortest path cannot contain a cycle.
So why should a shortest path not contain a cycle?
There is no need to pass through a vertex again, because the shortest path
to all other vertices can be found without a second visit to
any vertex (assuming the graph has no negative cycle).
Algorithm Steps:
 The outer loop runs n−1 times.
 Loop over all edges and check if the next node distance > current node
distance + edge weight; in this case update the next node distance
to "current node distance + edge weight".
This algorithm depends on the relaxation principle where the
shortest distance for all vertices is gradually replaced by more accurate
values until eventually reaching the optimum solution. In the beginning all
vertices have a distance of "Infinity", but only the distance of the source
vertex = 0, then update all the connected vertices with the new distances
(source vertex distance + edge weights), then apply the same concept for
the new vertices with new distances and so on.
Implementation:
Assume the source node has a number (0):

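#include <cstdio>
#include <vector>
using namespace std;

const int INF = 1000000000;  // "Infinity"; INF + weight still fits in an int

int n, m;                    // number of vertices and number of edges
vector <int> v [2000 + 10];  // v[i] = {from, next, weight} for edge i
int dis [1000 + 10];         // dis[u] = best known distance from the source to u

void bellmanFord(){
    int from, next, weight;
    scanf("%d%d", &n, &m);   // assumed input format: n and m come first
    for(int i = 0; i < m; i++) v[i].clear();
    for(int i = 0; i < n; i++) dis[i] = INF;

    for(int i = 0; i < m; i++){
        scanf("%d%d%d", &from , &next , &weight);
        v[i].push_back(from);
        v[i].push_back(next);
        v[i].push_back(weight);
    }

    dis[0] = 0;
    for(int i = 0; i < n - 1; i++){              // n - 1 passes
        for(int j = 0; j < m; j++){              // relax every edge
            // skip edges whose start vertex has not been reached yet
            if(dis[ v[j][0] ] != INF && dis[ v[j][0] ] + v[j][2] < dis[ v[j][1] ] ){
                dis[ v[j][1] ] = dis[ v[j][0] ] + v[j][2];
            }
        }
    }
}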

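A very important application of Bellman Ford is to check if there is a
negative cycle in the graph: if any distance can still be reduced after the
n−1 passes, the graph contains a negative cycle reachable from the source.
The time complexity of the Bellman Ford algorithm is relatively high: O(V⋅E),
which becomes O(V^3) in case E = V^2.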
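The negative cycle check mentioned above can be sketched as one extra pass over the edges. This is a minimal C++ illustration, assuming nodes numbered 0 to n-1; the `Edge` struct, the `INF` constant and the function name `hasNegativeCycle` are assumptions made for the example.

```cpp
#include <vector>

struct Edge { int from, to, weight; };

const long long INF = 1000000000000000LL;  // stands in for "Infinity"

// Runs Bellman-Ford from `source` on nodes 0 .. n-1. Fills `dist` and returns
// true if a negative cycle is reachable from the source.
bool hasNegativeCycle(int n, const std::vector<Edge>& edges, int source,
                      std::vector<long long>& dist) {
    dist.assign(n, INF);
    dist[source] = 0;
    for (int pass = 0; pass < n - 1; pass++) {
        for (const Edge& e : edges) {
            if (dist[e.from] != INF && dist[e.from] + e.weight < dist[e.to])
                dist[e.to] = dist[e.from] + e.weight;
        }
    }
    // One extra pass: any further improvement means a negative cycle exists.
    for (const Edge& e : edges) {
        if (dist[e.from] != INF && dist[e.from] + e.weight < dist[e.to])
            return true;
    }
    return false;
}
```

When the function returns true, the values left in `dist` should not be treated as shortest distances, since a negative cycle allows some paths to be shortened without limit.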
Let's discuss an optimized algorithm.
b) Dijkstra's Algorithm

Dijkstra's algorithm has many variants but the most common one is
to find the shortest paths from the source vertex to all other vertices in the
graph.
Algorithm Steps:

 Set all vertices distances = infinity except for the source vertex, set
the source distance = 0.
 Push the source vertex in a min-priority queue in the form (distance,
vertex), as the comparison in the min-priority queue will be
according to vertices distances.

 Pop the vertex with the minimum distance from the priority queue (at
first the popped vertex = source).
 Update the distances of the connected vertices to the popped vertex
in case of "current vertex distance + edge weight < next vertex
distance", then push the vertex with the new distance to the priority
queue.
 If the popped vertex is visited before, just continue without using it.
 Apply the same algorithm again until the priority queue is empty.
Implementation:

Assume the source vertex = 1.

#include <cstring>
#include <set>
#include <utility>
#include <vector>
using namespace std;

const int SIZE = 100000 + 1;
const int INF = 1000000000;              // "Infinity"

// v[x] holds (connected vertex, edge weight) pairs for every edge leaving x
vector < pair < int , int > > v [SIZE];
int dist [SIZE];
bool vis [SIZE];

void dijkstra(){
    for(int i = 0; i < SIZE; i++) dist[i] = INF;  // set the vertices distances as infinity
    memset(vis, false , sizeof vis);              // set all vertices as unvisited
    dist[1] = 0;

    multiset < pair < int , int > > s;            // multiset does the job of a min-priority queue
    s.insert({0 , 1});                            // insert the source node with distance = 0

    while(!s.empty()){
        pair <int , int> p = *s.begin();          // pop the vertex with the minimum distance
        s.erase(s.begin());

        int x = p.second;                         // p = (distance, vertex)
        if( vis[x] ) continue;                    // check if the popped vertex is visited before
        vis[x] = true;

        for(int i = 0; i < (int)v[x].size(); i++){
            int e = v[x][i].first; int w = v[x][i].second;
            if(dist[x] + w < dist[e] ){           // check if the next vertex distance could be minimized
                dist[e] = dist[x] + w;
                s.insert({dist[e], e} );          // insert the next vertex with the updated distance
            }
        }
    }
}

The time complexity of Dijkstra's Algorithm is O(V^2), but with a min-priority
queue it drops down to O(V + E log V).
However, if we have to find the shortest path between all pairs of vertices,
both of the above methods would be expensive in terms of time. Discussed
below is another algorithm designed for this case.
c) Floyd-Warshall's Algorithm

Floyd-Warshall's Algorithm is used to find the shortest paths between all
pairs of vertices in a graph, where each edge in the graph has a weight
which is positive or negative (the graph must not contain a negative
cycle). The biggest advantage of using this algorithm
is that all the shortest distances between any 2 vertices can be calculated
in O(V^3), where V is the number of vertices in a graph.
The Algorithm Steps:
For a graph with N vertices:

 Initialize the shortest paths between any 2 vertices with Infinity (the
distance from a vertex to itself is 0, and the distance between two vertices
joined by an edge is the weight of that edge).
 Find all pair shortest paths that use 0 intermediate vertices, then find
the shortest paths that use 1 intermediate vertex and so on, until
using all N vertices as intermediate nodes.
 Minimize the shortest paths between any 2 pairs in the previous
operation.
 For any 2 vertices (i,j), one should actually minimize the distances
between this pair using the first K nodes, so the shortest path will
be: min(dist[i][k]+dist[k][j],dist[i][j]).

dist[i][k] represents the shortest path from i to k that only uses the
first K vertices, and dist[k][j] represents the shortest path between the
pair k,j, because the shortest path will be a concatenation of the shortest
path from i to k and then from k to j.

for(int k = 1; k <= n; k++){
    for(int i = 1; i <= n; i++){
        for(int j = 1; j <= n; j++){
            dist[i][j] = min( dist[i][j], dist[i][k] + dist[k][j] );
        }
    }
}

The time complexity of Floyd-Warshall's Algorithm is O(V^3), where V is
the number of vertices in a graph.
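The triple loop above leaves out the initialization of the distance matrix. A self-contained C++ sketch that includes it might look as follows; it numbers the vertices from 0 rather than 1, assumes the graph has no negative cycle, and uses illustrative names (`Edge`, `INF`, `floydWarshall`) that are not part of the text.

```cpp
#include <algorithm>
#include <vector>

const int INF = 1000000000;  // stands in for "Infinity"

struct Edge { int from, to, weight; };

// Returns the matrix of shortest distances between all pairs of the
// vertices 0 .. n-1 (INF where no path exists). Assumes no negative cycle.
std::vector<std::vector<int>> floydWarshall(int n, const std::vector<Edge>& edges) {
    std::vector<std::vector<int>> dist(n, std::vector<int>(n, INF));
    for (int i = 0; i < n; i++) dist[i][i] = 0;
    for (const Edge& e : edges)
        dist[e.from][e.to] = std::min(dist[e.from][e.to], e.weight);

    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                // Skip unreachable halves so INF is never added to a weight.
                if (dist[i][k] < INF && dist[k][j] < INF)
                    dist[i][j] = std::min(dist[i][j], dist[i][k] + dist[k][j]);
    return dist;
}
```

The guard inside the innermost loop is a design choice for this sketch: without it, adding a negative weight to INF could make an unreachable pair look reachable.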

11.3. Performance Analysis of Algorithms


The typical meaning of an algorithm is a formally defined procedure
for performing some process or calculation. Before analyzing the
performance of an algorithm, let us study the efficiency of an
algorithm. If the functions present in an algorithm are linear (without any
loops or recursions), the efficiency of the algorithm depends on the number
of instructions it contains. If an algorithm contains loops or recursive
functions, then its efficiency depends heavily on the
number of times those loops run. So the efficiency of an
algorithm can be stated in terms of the number of elements that have to be
processed.
To analyze the performance of an algorithm means determining
the amount of resources (such as time and storage) needed to execute it.
Algorithms are generally designed to work with an arbitrary number of
inputs, so the efficiency or complexity of an algorithm is stated in terms of
space and time complexity.
Let us take a scenario, there are several problems and many
algorithms exist to solve them. For example, take a problem of sorting a set
of numbers in ascending order with the problem instance (2, 3, 9, 5, 6, 1, 7)
and the algorithms available would be Bubble sort, Merge sort, Quick sort
and Selection sort etc.
• Which is the best algorithm for the problem? How do we judge?
Two criteria can be used to judge these algorithms. They are:
I. Time complexity
II. Space complexity.
The time complexity of an algorithm is basically the running time of a
program as a function of the input size. On the other hand, space
complexity of an algorithm is the amount of computer memory that is
required during the program execution as a function of the input size.
a) Space Complexity

The memory space S(P) needed by a program P consists of two
components:
 A fixed part: This includes the instruction space
(byte code), simple variable space, constants space etc., denoted by
c.
 A variable part: dependent on a particular instance of input and
output data, denoted by Sp(instance).
S(P) = c + Sp(instance)
Let us look at some examples with the computation of their space
complexities.
Space Complexity: Example 1

Algorithm abc (a, b, c)
{
return a+b+b*c+(a+b-c)/(a+b)+4.0;
}
For every instance, 3 computer words are required to store the variables a, b,
and c.

Therefore Sp(instance) = 3 and S(P) = 3 (where the fixed part is not counted).


Space Complexity: Example 2
Algorithm Sum(a[], n)

{
s:= 0.0;
for i = 1 to n do
s := s + a[i];
return s;
}

– Every instance needs to store array a[] and n.
– Space needed to store n = 1 word.
– Space needed to store a[ ] = n floating point words (or at least n words)
– Space needed to store i and s = 2 words
Therefore, Sp(n) = (n + 3). Hence S(P) = (n + 3).
b) Time Complexity

The number of machine instructions which a program executes during its
run is termed its time complexity. Time complexity can also be
defined as:
• The time T(P) required to run a program P, which has two components:
– A fixed part: the compile time, which is independent of the problem
instance, denoted by ‘c’.
– A variable part: the run time, which depends on the problem instance,
denoted by tp(instance).
• T(P) = c + tp(instance)
How to measure T(P)?
– Measure experimentally, using a “stop watch”, where T(P) is obtained
in secs or msecs.
– Count program steps, where T(P) is obtained as a step count.
Here, the fixed part is usually ignored and only the variable part tp() is
measured.
Let us look at some examples with the computation of their time
complexities.

Time Complexity: Example 1


Here S/E represents the number of steps per execution of a statement, whereas
freq gives the total number of times the statement (for example, the body of
a loop) is executed. Normally, the constants are ignored, and the time
complexity of the above problem is taken as ‘n’.

After ignoring the constants, the time complexity is taken as
‘nm’. However, the running time of an algorithm can also be analyzed in
terms of the worst, best and average possible case of the input instance.
c) Worst Case, Average Case and Best Case Complexity
Worst case running time denotes the behavior of the algorithm with
respect to the worst possible case of the input instance. This denotes the
upper bound on the running time for any input. This gives an assurance that
the algorithm will never go beyond this time limit.
Average case running time of an algorithm is an estimate of the running
time for average input. This specifies the expected behavior of the algorithm
when the input is randomly drawn from a given distribution.
Best case running time of an algorithm indicates the best performance
under optimal conditions. It is always recommended to improve the average
and the worst-case performance of an algorithm.
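To make the difference between the best and the worst case concrete, the C++ sketch below counts the comparisons made by a simple linear search. Linear search is chosen here only as an illustration and the function name `linearSearchComparisons` is hypothetical.

```cpp
#include <vector>

// Returns the number of comparisons a linear search makes while looking for
// `key` in `a` (the search stops at the first match).
int linearSearchComparisons(const std::vector<int>& a, int key) {
    int comparisons = 0;
    for (int value : a) {
        comparisons++;
        if (value == key) break;
    }
    return comparisons;
}
```

On the instance (2, 3, 9, 5, 6, 1, 7) used earlier, searching for 2 is the best case and needs a single comparison, while searching for 7, or for a value that is not present, is the worst case and needs all n = 7 comparisons.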
d) Time-Space Trade-off

The best algorithm to solve a particular problem is the one that
requires less memory space and takes less time to complete its execution.
In practice, however, designing such an ideal algorithm is not an easy task.
There is always a trade-off between these two complexities. So, if
space is a big constraint, one might choose a program that takes less space
at the cost of more CPU time, whereas if time is the major constraint, one
might choose a program that takes minimum time to execute at the cost of
more space.
Let us sum up
In this unit we discussed the shortest path algorithms and their
types. We also discussed the implementation of these algorithms with
suitable examples in C++, and the performance analysis of algorithms.
For further reading, refer to the suggested books
listed below.

Check your progress


a) TEST YOUR UNDERSTANDING

Shortest Path Problem


You are given 2 integers (N, M): N is the number of vertices and M is the
number of edges. You will also be given ai, bi, wi, where ai and bi represent
an edge from a vertex ai to a vertex bi and wi represents the weight of that
edge.
The task is to find the shortest path from the source vertex (vertex number 1)
to all other vertices (vi) where (2≤i≤N).
Input:

The first line contains two space separated integers (N, M). Then M lines
follow; each line has 3 space separated integers ai, bi, wi.
Output:
Print the shortest distances from the source vertex (vertex number 1) to all
other vertices (vi) where (2≤i≤N). Print "10^9" in case the vertex "vi" can't be
reached from the source vertex.

Leave a space between any 2 printed numbers.

Constraints:
(1≤N≤10^4)
(1≤M≤10^6)
(1≤ai,bi≤N)
(1≤wi≤1000)

Glossary
Time complexity, space complexity, vertices, edges, weight, cyclic, Acyclic

Suggested readings
1. E Balagurusamy, “Data Structures”, McGraw Hill Education (India)
Private Limited, 2019.
2. Mark Allen Weiss, “Data Structures and Algorithm Analysis in C++”,
Third Edition, Pearson Education, 2007.
3. Reema Thareja, “Data Structures Using C”, Second Edition,
Oxford University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed,
“Fundamentals of Data Structures in C”, Second Edition, University
Press, 2008.

Answers to check your progress


a) TEST YOUR UNDERSTANDING
SAMPLE INPUT

5 5
1 2 5
1 3 2
3 4 1
1 4 6
3 5 5
SAMPLE OUTPUT

5 2 3 7

Unit -12
Graph Traversal

Structure

Overview
Learning objectives

12.0 Graph Traversal


12.1 Breadth First Search
12.2 Depth First Search Traversal of a Graph
12.3 Applications
Let us sum up
Check your progress
Glossary
Suggested readings
Answer to check your progress

Overview
In this unit we discuss the basic concepts of
graph traversal. We also discuss the types of graph traversal, namely BFS
and DFS. Finally, we study their features and applications.

Learning objectives
At the end of this unit you will be able to
 Understand the concepts of Graph Traversal and their types.
 Get clear idea about the Breadth First Search.
 Get knowledge of the Depth First Search Traversal of a Graph.
 To know the Applications of graph Traversal.

12.0 GRAPH TRAVERSALS
Traversal means visiting each node of a graph at least once. There
are two different methods for graph traversals which are of interest to us.
They are
a) Depth First Search (DFS)
b) Breadth First Search (BFS)

Unlike trees, graphs do not have any root node; hence any node can
be considered as the start node. Let us explain the algorithms through an
example.

Consider the graph shown in Figure a) and its adjacency matrix
representation shown in Figure b).

Figure a): Graph for DFS and BFS algorithms


In graph traversal algorithms, we will mark all the nodes as unvisited to
start with. Using these search algorithms, we would try to store all the adjacent
nodes that have not been visited so far to the node we have just visited, in a
suitable data structure and the next node to be visited will be chosen
appropriately from the data structure. Stack data structure is suitable to achieve
depth first search as it’s a last in first out structure. On the other hand, Queue
data structure is suitable for breadth first search as it is a first in first out data
structure, indicative of the sequence of arrival. (Level by level visit). Also note
that we have to keep track of the nodes which we have visited. In either case,
we will continue the process; till all the nodes are visited i.e. stack or queue is
empty. Illustration of these graph traversals in the following sections would
make it clearer.

Figure b): Adjacency matrix representation

12.1 Breadth First Search Traversal


Like a depth-first traversal, a breadth-first traversal visits
each node in a connected graph. However, when a breadth-first traversal
arrives at a node v, it visits all neighbors of v before continuing
to more distant parts of the graph. A depth-first traversal can
be implemented recursively, whereas a breadth-first search is not naturally
recursive. Instead of a stack, a breadth-first traversal
uses a queue. A queue is much like a line of people; it
operates on a FIFO (first in, first out) or "first come, first served" principle.
The traversal begins by marking the starting node as ‘visited’ and enqueueing
it. Then the following process repeats until the queue is empty:
 Dequeue a node; this is the current node.
 Enqueue all non-visited nodes adjacent to the current node.
 Mark those newly enqueued nodes as ‘visited’.


In other words, breadth first traversal of a graph G proceeds as follows.
The starting node A is examined first and marked as ‘visited’. Next, all
the nodes adjacent to A (its neighbors) are examined and marked as ‘visited’.
Then, for each of those neighbors, the algorithm examines their unvisited
neighbors, and so on, until no unvisited node can be reached. In a connected
graph this guarantees that every node is visited and that no node is
processed more than once. Breadth first traversal is conveniently
implemented with a queue, as illustrated in the following procedure:

Step 1: Initialization
a) Initialize the queue to empty.
b) Mark all nodes of the graph G as ‘not visited’.
c) Enqueue the start node (node 0) and mark it as ‘visited’.
Step 2: While (queue is not empty)
{
a) Dequeue a node from the queue and make it the current node.
b) Enqueue all non-visited nodes adjacent to the current node.
c) Mark all the enqueued nodes as ‘visited’.
}
As a result of the above procedure, the order in which nodes are dequeued
gives the breadth first traversal of the graph.
The figure below gives a step by step illustration of the breadth first
traversal of the graph in Figure a) (with its adjacency matrix in
Figure b)) using a queue. This implementation returns
the order a b c d e f g h. The traversal order may differ when a node has
several adjacent nodes, because it depends on which adjacent node is
enqueued first.

Figure: Breadth First Search Procedure Implementation
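
The queue-based procedure above can be sketched in C++ as follows. This is only a minimal sketch, not the implementation shown in the figure: the adjacency-list representation, the function name bfsOrder and the numbering of nodes from 0 are assumptions made for illustration.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Returns the nodes reachable from `start` in breadth first order.
// adj[u] lists the neighbours of node u; nodes are numbered 0..n-1.
std::vector<int> bfsOrder(const std::vector<std::vector<int>>& adj, int start) {
    assert(start >= 0 && start < static_cast<int>(adj.size()));
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> order;
    std::queue<int> q;

    visited[start] = true;            // Step 1: mark the start node, enqueue it
    q.push(start);

    while (!q.empty()) {              // Step 2: repeat until the queue is empty
        int current = q.front();
        q.pop();
        order.push_back(current);     // dequeue order is the BFS order
        for (int next : adj[current]) {
            if (!visited[next]) {     // each node is enqueued at most once
                visited[next] = true;
                q.push(next);
            }
        }
    }
    return order;
}
```

For a hypothetical graph with edges 0-1, 0-2, 1-3, 2-3 and 3-4, calling bfsOrder with start node 0 visits the nodes level by level in the order 0 1 2 3 4.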

12.2 Depth First Search Traversal


The depth first search algorithm progresses by expanding the
starting node of G and going deeper and deeper until a goal node is
found, or until a node with no unvisited neighbors is encountered. When such a
dead end is reached, the algorithm backtracks, returning to the most recent node
that has not been completely explored.

In other words, depth first search begins at a starting node A, which
becomes the current node. It then examines each node n along a path
P that begins at A: first we visit a neighbor of A, say B,
then a neighbor of B, say C, then a neighbor of C, and so on, as long as
unvisited nodes can be reached, until the path reaches a
dead end. At a dead end the algorithm backtracks along the path it has
travelled and checks each node on the path for unvisited neighbors.
If an unvisited node is found, the depth first procedure is applied
recursively from that node. The
algorithm terminates when backtracking returns to the starting node A and
all alternative paths in the graph G have been exhausted. In this traversal, edges
that lead to a new vertex are called discovery edges and edges that lead to
an already visited vertex are called back edges. The step by step procedure
for the depth first traversal of a graph using a stack
is given below:
Step 1: Initialization
a) Initialize the stack to empty.
b) Mark all nodes of the graph as ‘not visited’.
c) Push the start node (node 0) onto the stack and mark it as ‘visited’.
Step 2: While (stack is not empty)
{
a) Pop a node from the stack and make it the current node.
b) Push onto the stack all nodes adjacent to the popped node that have not
yet been visited.
c) Mark all pushed nodes as ‘visited’.
}

As a result of the above procedure, the order in which nodes are popped from
the stack gives the depth first traversal of the graph. The figure below
gives a step by step illustration of the depth first traversal of the graph
in Figure a) (with its adjacency matrix in Figure b)) using a stack.
This implementation returns the order a c h g f b e d.
The traversal order may differ when a node has several adjacent nodes,
because it depends on which adjacent node is pushed, and therefore popped,
first.

Depth First Search Algorithm-implementation

DFS Traversal is: a c h g f b e d
Figure: Depth First Search Procedure Implementation
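
The stack-based procedure above can be sketched in C++ as follows. This is a minimal sketch under the same assumptions as before (adjacency lists, nodes numbered from 0, a hypothetical function name dfsOrder). It follows the steps of this unit literally, marking a node as ‘visited’ when it is pushed, so the order it produces can differ from the order produced by a recursive depth first search on the same graph.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Returns the nodes reachable from `start` in the order they are popped.
// adj[u] lists the neighbours of node u; nodes are numbered 0..n-1.
std::vector<int> dfsOrder(const std::vector<std::vector<int>>& adj, int start) {
    assert(start >= 0 && start < static_cast<int>(adj.size()));
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> order;
    std::stack<int> st;

    visited[start] = true;            // Step 1: mark the start node, push it
    st.push(start);

    while (!st.empty()) {             // Step 2: repeat until the stack is empty
        int current = st.top();
        st.pop();
        order.push_back(current);     // pop order is the traversal order
        for (int next : adj[current]) {
            if (!visited[next]) {     // push each unvisited neighbour once
                visited[next] = true;
                st.push(next);
            }
        }
    }
    return order;
}
```

For the hypothetical graph with edges 0-1, 0-2, 1-3, 2-3 and 3-4, starting at node 0 gives the order 0 2 3 4 1: the neighbor pushed last (node 2) is explored first, and node 1 is popped only after the path through 2, 3 and 4 is exhausted.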

12.3 Applications:
Graphs are widely used to model any situation where entities or things
are related to each other in pairs, e.g. family trees and transportation networks
in which the nodes are airports, intersections, ports, etc., and the edges are
airline flights, roads, routes, etc. Some more applications
are listed below:
 Social network graphs: Graphs that represent who knows whom, who
communicates with whom, who influences whom, or other relationships in social
structures. An example is the Twitter graph of who follows whom.
These can be used to determine how information flows, how topics
become hot, how communities develop, or even who might be a
good match for whom.
 Transportation networks: In road networks, vertices are
intersections and edges are the road segments between them, and
for public transportation networks vertices are stops and edges are
the links between them. Such networks are used by many map
programs, such as Google Maps, Bing Maps, and Apple Maps, to find the best
routes between locations. They are also used for studying traffic
patterns, traffic light timings, and many aspects of transportation.
 Utility graphs. The power grid, the Internet, and the water network are
all examples of graphs where the vertices represent the connection
points, and edges the wires or pipes between them. Analyzing
properties of these graphs is very important in understanding the
reliability of such utilities under failure or attack, or in minimizing the
costs to build infrastructure that matches required demands.
 Document link graphs. The best known example is the link graph
of the web, where each web page is a vertex, and each hyperlink a
directed edge. Link graphs are used, for example, to analyze
relevance of web pages, the best sources of information, and good
link sites.
 Protein-protein interactions graphs. Vertices represent proteins
and edges represent interactions between them that carry out some
biological function in the cell. These graphs can be used, for
example, to study molecular pathways—chains of molecular
interactions in a cellular process. Humans have over 120K proteins
with millions of interactions among them.

 Network packet traffic graphs. Vertices are IP (Internet protocol)
addresses and edges are the packets that flow between them. Such
graphs are used for analyzing the network security, studying the
spread of worms, and tracking criminal or non-criminal activity.
 Scene graphs. In graphics and computer games scene graphs
represent the logical or spatial relationships between objects in a
scene. Such graphs are very important in the computer games
industry.
 Finite element meshes. In engineering many simulations of
physical systems, such as the flow of air over a car or airplane wing,
the spread of earthquakes through the ground, or the structural
vibrations of a building, involve partitioning space into discrete
elements. The elements along with the connections between
adjacent elements form a graph that is called a finite element
mesh.
 Robot planning. Vertices represent states the robot can be in and the
edges the possible transitions between the states. This requires
approximating continuous motion as a sequence of discrete steps.
Such graph plans are used, for example, in planning paths for
autonomous vehicles.
 Neural networks. Vertices represent neurons and edges the
synapses between them. Neural networks are used to understand
how our brain works and how connections change when we learn.
The human brain has about 10^11 neurons and close to 10^15
synapses.
 Graphs in quantum field theory. Vertices represent states of a
quantum system and the edges the transitions between them. The
graphs can be used to analyze path integrals and summing these up
generates a quantum amplitude.
 Semantic networks. Vertices represent words or concepts and
edges represent the relationships among the words or concepts.
These have been used in various models of how humans organize
their knowledge, and how machines might simulate such an
organization.
 Graphs in epidemiology. Vertices represent individuals and
directed edges the transfer of an infectious disease from one
individual to another. Analyzing such graphs has become an
important component in understanding and controlling the spread of
diseases.
 Graphs in compilers. Graphs are used extensively in compilers.
They can be used for type inference, for so called data flow analysis,
register allocation and many other purposes. They are also used in
specialized compilers, such as query optimization in database
languages.
 Constraint graphs. Graphs are often used to represent constraints
among items. For example the GSM network for cell phones
consists of a collection of overlapping cells. Any pair of cells that
overlap must operate at different frequencies. These constraints can
be modeled as a graph where the cells are vertices and edges are
placed between cells that overlap.
 Dependence graphs. Graphs can be used to represent
dependences or precedences among items. Such graphs are often
used in large projects in laying out what components rely on other
components and used to minimize the total time or cost to
completion while abiding by the dependences.

Let us sum up
In this unit we discussed graph traversal and its types.
We also discussed the implementation of breadth first search and depth
first search. Finally, applications of graphs were discussed
with suitable examples. For further reading, refer to the suggested books
listed below.

Check your progress


a) Fill up the blanks:
1. _________ means visiting each node of a graph at least once.
2. BFS stands for ________.
3. Graphs are used extensively in ________.

Glossary
Enqueue, dequeue, push, pop, root node, neighbor node.

Suggested readings
1. E. Balagurusamy, "Data Structures", McGraw Hill Education (India)
Private Limited, 2019.
2. Mark Allen Weiss, "Data Structures and Algorithm Analysis in C++",
Third Edition, Pearson Education, 2007.
3. Reema Thareja, "Data Structures Using C", Second Edition,
Oxford University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed,
"Fundamentals of Data Structures in C", Second Edition, University
Press, 2008.

Answers to check your progress


a) Fill up the blanks:
1. Traversal
2. Breadth First Search
3. compilers

Unit -13
Searching techniques

Structure
Overview
Learning objectives

13.0 Searching
13.1 Linear Search
13.2 Binary Search
13.3 Hashing

Let us sum up
Check your progress
Glossary
Suggested readings

Answer to check your progress

Overview
In this unit we discuss the basic concepts of searching and the
types of searching techniques. Finally, we study
hashing techniques with their features and applications.

Learning objectives
At the end of this unit you will be able to
 Understand the concepts of Searching and their types.
 Get clear idea about the Linear Search
 Know the concepts of Binary Search
 Get knowledge about Hashing and double hashing with suitable
examples.

13.0 Searching
Searching in data structures refers to the process of finding the
required information in a collection of items stored as elements in
computer memory. These collections take different forms, such as an
array, linked list, graph, or tree. Put another way, searching means
locating an element with specific characteristics in a
collection of items.
Searching Methods
Searching in the data structure can be done by applying searching
algorithms to check for or extract an element from any form of stored data
structure.
These algorithms are classified according to the type of search operation
they perform, such as:
 Sequential search
The list or array of elements is traversed sequentially, checking every
element of the collection.
Example: Linear search.
 Interval search
Interval search covers algorithms that are designed for
searching in sorted data structures. They repeatedly narrow the range that
can contain the target, so on sorted data they are far more efficient than
linear search.
Example: Binary search, also called logarithmic search.

These methods are evaluated by the time an algorithm takes
to find an element matching the search item in the data collection,
expressed as:
 The best possible time
 The average time
 The worst-case time
The primary concern is the worst-case time, which provides a
guaranteed bound on the algorithm’s performance and is also easier
to calculate than the average time.

For the concepts and examples in this unit, we
assume n items in the data collection. To make
analysis and comparison of algorithms easier, we count the dominant
operation. For searching, the dominant operation is the comparison of an
element with the search item. The number of comparisons, as a function of n,
is expressed in O() notation, pronounced “big-Oh” or “Oh.”
There are numerous searching algorithms, such
as linear search, binary search, interpolation search, sublist search,
exponential search, jump search, Fibonacci search, and unbounded binary
search. This unit covers linear search, binary search, and interpolation
search and their working principles.
Let’s take a closer look at the linear and binary searches in the data
structure.
13.1 Linear Search
The linear search algorithm iteratively checks the elements of the
array one by one. In the best case it makes one comparison and in the worst
case n comparisons, where n is the total number of items in the array.
It is the simplest search algorithm: it compares each
item in the collection with the searched element until a match is found or
the end of the collection is reached. When the given data is unsorted, linear
search is preferred over other search algorithms.
Complexities in linear search are given below:
Space Complexity:
Since linear search needs only a constant amount of extra space, its space
complexity is O(1), regardless of the number of elements n in the array.
Time Complexity:
 Best-case complexity = O(1) occurs when the searched item is
the first element of the array.
 Worst-case complexity = O(n) occurs when the required element is
at the tail of the array or not present at all.
 Average-case complexity = O(n) occurs when the item to
be searched is somewhere in the middle of the array, so that about n/2
comparisons are needed.

Pseudocode for the linear search algorithm

procedure linear_search (list, value)
   for each item in the list
      if item == value
         return the item's location
      end if
   end for
   return not found
end procedure
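Example,
Let’s take the following array of elements:
45, 78, 15, 67, 08, 29, 39, 40, 12, 99
To find 29 in the array of 10 elements given above, the linear
search algorithm checks each element in turn, starting from the first
(45, 78, 15, 67, 08, 29). The match occurs at the 6th element, so the search
ends after 6 comparisons.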
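
The pseudocode above can be sketched in C++ as follows. This is a minimal sketch: the function name linearSearch, the use of std::vector<int>, 0-based positions and the return value -1 for "not found" are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Returns the 0-based index of the first element equal to `value`,
// or -1 if the value is not present.
int linearSearch(const std::vector<int>& list, int value) {
    for (std::size_t i = 0; i < list.size(); ++i) {
        if (list[i] == value) {          // one comparison per element
            return static_cast<int>(i);
        }
    }
    return -1;                           // reached the end without a match
}
```

With the array 45, 78, 15, 67, 08, 29, 39, 40, 12, 99 from the example, linearSearch returns index 5 for the value 29, which is the 6th position when counting from 1.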

13.2 Binary Search


This algorithm locates an item by comparing it with the middle
item of the data collection. When they match, it returns the index of
the item. When the middle item is greater than the search item, it continues
with the middle item of the left sub-array. If, on the other hand, the middle
item is smaller than the search item, it continues with the middle item of
the right sub-array. It keeps looking until it finds the item or the size of
the sub-array reaches zero.
Binary search needs the items of the array to be in sorted order. It works
faster than a linear search algorithm. Binary search uses the divide and
conquer principle.
Run-time complexity = O(log n)

Complexities in binary search are given below:
 The worst-case complexity in binary search is O(log n).
 The average-case complexity in binary search is O(log n).
 Best-case complexity = O(1)

Pseudocode for the Binary search algorithm

Procedure binary_search
   A ← sorted array
   n ← size of array
   x ← value to be searched

   Set lowerBound = 1
   Set upperBound = n

   while x not found
      if upperBound < lowerBound
         EXIT: x does not exist.

      set midPoint = lowerBound + ( upperBound - lowerBound ) / 2

      if A[midPoint] < x
         set lowerBound = midPoint + 1

      if A[midPoint] > x
         set upperBound = midPoint - 1

      if A[midPoint] = x
         EXIT: x found at location midPoint
   end while
end procedure

Example,
Let’s take a sorted array of 8 elements:
09, 12, 26, 39, 45, 61, 67, 78
 To find 61 in the above array,
 The algorithm divides the array into two halves, 09, 12, 26, 39 and
45, 61, 67, 78.
 As 61 is greater than 39, it continues searching in the
right half of the array.
 It further divides the right half into two, 45, 61 and 67, 78.
 As 61 is smaller than 67, it continues searching in the left of these
sub-arrays.
 That sub-array is again divided into two, 45 and 61.
 As 61 matches the search element, the algorithm returns the
index of that element in the array.
 It concludes that the search element 61 is located at the 6th
position in the array.
Binary search halves the search range at every step, so the number of
comparisons is reduced significantly as compared to the linear search
algorithm.
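
The binary search procedure can be sketched in C++ as follows. This is a minimal sketch: the function name binarySearch and the return value -1 for "not found" are assumptions, and the bounds are 0-based here, whereas the pseudocode above counts positions from 1.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the 0-based index of x in the sorted vector a, or -1 if absent.
int binarySearch(const std::vector<int>& a, int x) {
    assert(std::is_sorted(a.begin(), a.end()));  // precondition: sorted input
    int lower = 0;
    int upper = static_cast<int>(a.size()) - 1;
    while (lower <= upper) {
        int mid = lower + (upper - lower) / 2;   // avoids overflow of lower + upper
        if (a[mid] < x) {
            lower = mid + 1;                     // continue in the right half
        } else if (a[mid] > x) {
            upper = mid - 1;                     // continue in the left half
        } else {
            return mid;                          // match found
        }
    }
    return -1;                                   // range is empty: x is absent
}
```

For the array 09, 12, 26, 39, 45, 61, 67, 78, searching for 61 first probes index 3 (value 39), then index 5 (value 61), and returns 5, i.e. the 6th position. Only two probes are needed, whereas linear search would need six comparisons.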

a) Interpolation Search

It is a variant of the binary search algorithm that estimates
the probing position from the value of the search element instead of always
probing the middle. It only works on sorted data
collections, similar to binary search.
Complexities in interpolation search are given below:
When the first probe position (our approximation) holds the desired key,
interpolation search works best. As a result, the best-case time complexity
is O(1).
If the data set is sorted and uniformly distributed, the average time
complexity of interpolation search is O(log log n), where n denotes the
total number of elements in the array.
In the worst-case scenario, we have to traverse the entire array, which
takes O(n) time.
Interpolation search is useful when the approximate location of the target
element can be estimated from its value. If you want to find Rahul’s phone
number in a phone book, instead of using a linear or binary search, you
can directly probe the part of the book where names begin with ‘R’.

Pseudocode for the Interpolation search algorithm
A → Array list
N → Size of A
X → Target Value

Procedure Interpolation_Search()
   Set Lo → 0
   Set Mid → -1
   Set Hi → N-1
   While X does not match
      if Lo > Hi OR X < A[Lo] OR X > A[Hi]
         EXIT: Failure, Target not found
      end if
      if A[Lo] equals A[Hi]
         EXIT: Success, Target found at Lo
      end if

      Set Mid = Lo + ((X - A[Lo]) * (Hi - Lo)) / (A[Hi] - A[Lo])

      if A[Mid] = X
         EXIT: Success, Target found at Mid
      else
         if A[Mid] < X
            Set Lo to Mid+1
         else if A[Mid] > X
            Set Hi to Mid-1
         end if
      end if
   End While
End Procedure
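
The interpolation search procedure can be sketched in C++ as follows. This is a minimal sketch under stated assumptions: the function name interpolationSearch, integer keys in a sorted std::vector<int>, and the return value -1 for "not found" are chosen for illustration. The probe position is computed with the multiplication done before the division, so that integer arithmetic does not truncate the estimate to zero.

```cpp
#include <vector>

// Returns the 0-based index of x in the sorted vector a, or -1 if absent.
int interpolationSearch(const std::vector<int>& a, int x) {
    int lo = 0;
    int hi = static_cast<int>(a.size()) - 1;
    // x can only be present while it lies between a[lo] and a[hi].
    while (lo <= hi && x >= a[lo] && x <= a[hi]) {
        if (a[lo] == a[hi]) {
            return lo;    // all keys in range are equal, and equal to x
        }
        // Estimate the position of x by linear interpolation.
        long long offset = static_cast<long long>(hi - lo) * (x - a[lo])
                           / (a[hi] - a[lo]);
        int mid = lo + static_cast<int>(offset);
        if (a[mid] == x) {
            return mid;
        }
        if (a[mid] < x) {
            lo = mid + 1;   // target lies to the right of the probe
        } else {
            hi = mid - 1;   // target lies to the left of the probe
        }
    }
    return -1;
}
```

For the sorted array 09, 12, 26, 39, 45, 61, 67, 78 and the target 61, the first probe is computed as 0 + (7 * (61 - 9)) / (78 - 9) = 364 / 69 = 5 in integer arithmetic, and index 5 holds 61, so the search succeeds with a single probe.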

13.3 Hashing
Hashing is a technique that is used to uniquely identify a specific
object from a group of similar objects. Some examples of how hashing is
used in our lives include:
 In universities, each student is assigned a unique roll number that
can be used to retrieve information about them.
 In libraries, each book is assigned a unique number that can be
used to determine information about the book, such as its exact
position in the library or the users it has been issued to etc.

In both these examples the students and books were hashed to a unique
number.
Assume that you have an object and you want to assign a key to it to
make searching easy. To store the key/value pair, you can use a simple
array-like data structure where keys (integers) are used directly as the
index to store values. However, in cases where the keys are large and
cannot be used directly as an index, you should use hashing.
In hashing, large keys are converted into small keys by using hash
functions. The values are then stored in a data structure called a hash table.
The idea of hashing is to distribute entries (key/value pairs) uniformly across
an array. Each element is assigned a key (converted key). By using that key
you can access the element in O(1) time on average. Using the key, the
algorithm (hash function) computes an index that suggests where an entry can
be found or inserted.
Hashing is implemented in two steps:
1. An element is converted into an integer by using a hash function.
This integer can be used as an index to store the original element,
which falls into the hash table.
2. The element is stored in the hash table where it can be quickly
retrieved using the hashed key.
hash = hashfunc(key)
index = hash % array_size
In this method, the hash is independent of the array size and it is then
reduced to an index (a number between 0 and array_size − 1) by using the
modulo operator (%).
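
The two lines hash = hashfunc(key) and index = hash % array_size can be sketched in C++ as follows. This is only an illustration: the function name slotFor is hypothetical, and the standard library's std::hash is used here in place of the unspecified hashfunc.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Maps a key to a slot of a table with arraySize slots.
std::size_t slotFor(const std::string& key, std::size_t arraySize) {
    assert(arraySize > 0);                             // avoid division by zero
    std::size_t hash = std::hash<std::string>{}(key);  // independent of table size
    return hash % arraySize;                           // reduced to 0..arraySize-1
}
```

The same key always maps to the same slot, while two different keys may map to the same slot. That situation is a collision, and it has to be resolved by a technique such as the double hashing described later in this unit.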

a) Hash function

A hash function is any function that can be used to map a data set of
an arbitrary size to a data set of a fixed size, which falls into the hash table.
The values returned by a hash function are called hash values, hash codes,
hash sums, or simply hashes.
To achieve a good hashing mechanism, it is important to have a good hash
function with the following basic requirements:
1. Easy to compute: It should be easy to compute and must not
become an algorithm in itself.
2. Uniform distribution: It should provide a uniform distribution across
the hash table and should not result in clustering.
3. Less collisions: Collisions occur when pairs of elements are mapped
to the same hash value. These should be avoided.
b) Hash table
A hash table is a data structure that is used to store keys/value pairs.
It uses a hash function to compute an index into an array in which an
element will be inserted or searched. By using a good hash function,
hashing can work well. Under reasonable assumptions, the average time
required to search for an element in a hash table is O(1).
Let us consider string S. You are required to count the frequency of all the
characters in this string.

string S = “ababcd”

The simplest way to do this is to iterate over all the possible characters and
count their frequency one by one. The time complexity of this approach
is O(26*N) where N is the size of the string and there are 26 possible
characters.

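#include <iostream>
#include <string>
using namespace std;

// Counts each letter by scanning the whole string once per letter: O(26*N).
void countFre(string S)
{
    for(char c = 'a'; c <= 'z'; ++c)
    {
        int frequency = 0;
        for(int i = 0; i < S.length(); ++i)
            if(S[i] == c)
                frequency++;
        cout << c << ' ' << frequency << endl;
    }
}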

Output

a 2
b 2
c 1
d 1
e 0
f 0
...
z 0

Let us apply hashing to this problem. Take an array Frequency of size 26
and map the 26 characters to the indices of the array by using the hash
function. Then, iterate over the string and increase the value in
Frequency at the corresponding index for each character. The complexity of
this approach is O(N) where N is the size of the string.

int Frequency[26];   // global array, so all counts start at 0

// Maps 'a'..'z' to the indices 0..25.
int hashFunc(char c)
{
    return (c - 'a');
}

void countFre(string S)
{
    for(int i = 0; i < S.length(); ++i)
    {
        int index = hashFunc(S[i]);
        Frequency[index]++;
    }
    for(int i = 0; i < 26; ++i)
        cout << (char)(i + 'a') << ' ' << Frequency[i] << endl;
}

Output

a 2
b 2
c 1
d 1
e 0
f 0
...
z 0

c) Double hashing

Double hashing is similar to linear probing; the only difference is
the interval between successive probes. Here, the interval between probes
is computed by using a second hash function.
Let us say that the hashed index for a record is an index that
is computed by one hashing function, and the slot at that index is already
occupied. You must then traverse a specific probing sequence to look
for an unoccupied slot. With index as the originally hashed index, the
probing sequence is:
(index + 1 * indexH) % hashTableSize
(index + 2 * indexH) % hashTableSize
and so on…
Here, indexH is the hash value that is computed by the second hash function.
It must never be 0, otherwise the probe would not move to a new slot.
Implementation of hash table with double hashing
Assumptions
 There are no more than 20 elements in the data set.
 The hash functions return integers from 0 to 19; the second one (the
probe step) must not return 0.
 Data set must have unique elements.

string hashTable[21];
int hashTableSize = 21;

Insert

void insert(string s)
{
    // Compute the index using hash function 1 and the step using hash function 2
    int index = hashFunc1(s);
    int indexH = hashFunc2(s);
    // Search for an unused slot; the modulo wraps the index around
    // when it exceeds hashTableSize
    while(hashTable[index] != "")
        index = (index + indexH) % hashTableSize;
    hashTable[index] = s;
}

Search

void search(string s)
{
    // Compute the index using hash function 1 and the step using hash function 2
    int index = hashFunc1(s);
    int indexH = hashFunc2(s);
    // Probe until the element or an unused slot is found; the modulo wraps
    // the index around when it exceeds hashTableSize
    while(hashTable[index] != s and hashTable[index] != "")
        index = (index + indexH) % hashTableSize;
    // Is the element present in the hash table?
    if(hashTable[index] == s)
        cout << s << " is found!" << endl;
    else
        cout << s << " is not found!" << endl;
}
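
The insert and search routines above rely on hashFunc1 and hashFunc2, which this unit does not define. The following self-contained sketch fills that gap under stated assumptions: the class name DoubleHashTable, the two simple polynomial hash functions hash1 and hash2, and the table size 23 are all hypothetical choices for illustration. A prime table size is used, instead of the 21 slots assumed above, so that every non-zero step visits all slots before repeating.

```cpp
#include <cassert>
#include <string>
#include <vector>

class DoubleHashTable {
public:
    // Returns false if the key is already present or the table is full.
    bool insert(const std::string& s) {
        assert(!s.empty());              // "" marks an unused slot
        int slot = findSlot(s);
        if (slot < 0 || table_[slot] == s) return false;
        table_[slot] = s;
        return true;
    }

    bool contains(const std::string& s) const {
        if (s.empty()) return false;
        int slot = findSlot(s);
        return slot >= 0 && table_[slot] == s;
    }

private:
    static constexpr int kSize = 23;     // prime number of slots (assumption)
    std::vector<std::string> table_ = std::vector<std::string>(kSize);

    // First hash: the starting slot, in 0..kSize-1.
    static int hash1(const std::string& s) {
        unsigned h = 0;
        for (unsigned char c : s) h = h * 31u + c;
        return static_cast<int>(h % kSize);
    }

    // Second hash: the probe step, in 1..kSize-1, so it is never 0.
    static int hash2(const std::string& s) {
        unsigned h = 0;
        for (unsigned char c : s) h = h * 17u + c;
        return 1 + static_cast<int>(h % (kSize - 1));
    }

    // Returns the slot holding s, else the first unused slot on its probe
    // sequence, else -1 when all kSize slots were probed without success.
    int findSlot(const std::string& s) const {
        int index = hash1(s);
        int step = hash2(s);
        for (int i = 0; i < kSize; ++i) {
            if (table_[index].empty() || table_[index] == s) return index;
            index = (index + step) % kSize;
        }
        return -1;
    }
};
```

Bounding the probe loop to kSize iterations is a design choice of this sketch: unlike the while loops above, it terminates even when the table is full, in which case insert reports failure instead of looping forever.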

Applications
 Associative arrays: Hash tables are commonly used to implement
many types of in-memory tables. They are used to implement
associative arrays (arrays whose indices are arbitrary strings or
other complicated objects).
 Database indexing: Hash tables may also be used as disk-based
data structures and database indices (such as in dbm).
 Caches: Hash tables can be used to implement caches i.e. auxiliary
data tables that are used to speed up the access to data, which is
primarily stored in slower media.
 Object representation: Several dynamic languages, such as Perl,
Python, JavaScript, and Ruby use hash tables to implement objects.
 Hash Functions are used in various algorithms to make their
computation faster.

Let us sum up
In this unit we discussed searching techniques and their
types. We also discussed the implementation of linear and binary search.
Finally, hashing and applications of hashing techniques were
discussed with suitable examples. For further reading, refer to the suggested
books listed below.

Check your progress


a) Fill up the blanks:
1. The ________ algorithm iteratively searches all elements of the
array.
2. A hash table is a data structure that is used to store __________.
3. Double hashing is similar to ___________.

Glossary
Less collisions, Uniform distribution, Easy to compute, The best possible
time, The average time, The worst-case time.

Suggested Readings
1. E. Balagurusamy, "Data Structures", McGraw Hill Education
(India) Private Limited, 2019.
2. Mark Allen Weiss, "Data Structures and Algorithm Analysis in
C++", Third Edition, Pearson Education, 2007.
3. Reema Thareja, "Data Structures Using C", Second Edition,
Oxford University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed,
"Fundamentals of Data Structures in C", Second Edition,
University Press, 2008.

Answers to check your progress


a) Fill up the blanks:

1. linear search
2. keys/value pairs
3. linear probing

Block – 5: SORTING

Unit -14: Sorting techniques


Unit-15: Bubble Sort and Quick Sort
Unit-16: Merge Sort and Bucket Sort.

Unit -14
Sorting techniques

Structure

Overview
Learning objectives
14.0 Sorting

14.1 Selection Sort


14.2 Insertion Sort
Let us sum up
Check your progress
Glossary
Suggested readings
Answers to check your progress

Overview
We discussed linear and non-linear data structures
in the previous units. In this unit, we will learn about sorting in data
structures. This unit gives an overview of sorting techniques such
as selection sort and insertion sort, with their algorithms, implementations
and suitable examples.
Learning objectives
At the end of this unit you will be able to
 Understand the concepts of sorting and their types.
 Get knowledge about selection sort and their features and
implementation.
 Get clear idea about the insertion sort and its operations.

14.0 Sorting
Sorting refers to arranging data in a particular format. Sorting
algorithm specifies the way to arrange data in a particular order. Most
common orders are in numerical or lexicographical order.
The importance of sorting lies in the fact that data searching can be
optimized to a very high level, if data is stored in a sorted manner. Sorting
is also used to represent data in more readable formats. Following are
some of the examples of sorting in real-life scenarios −
 Telephone Directory − The telephone directory stores the
telephone numbers of people sorted by their names, so that the
names can be searched easily.
 Dictionary − The dictionary stores words in an alphabetical order
so that searching of any word becomes easy.

A file of size n is a sequence of n items r[0], r[1], . . . r[n – 1]. Each
item in the file is called a record. A key k[i] is associated with each record
r[i]. The key is usually a subfield of the entire record. The file is said to be
sorted on the key if i < j implies that k[i] precedes k[j] in some ordering on
the keys.
Depending on the data type of the key, records can be sorted
numerically or, more generally, alphanumerically. In numerical sorting, the
records are arranged in the ascending or descending order according to the
numerical value of the key.
In the example of the telephone book, the file consists of all the
entries in the book. Each entry is a record. The key upon which the file is
sorted is the name field of the record. Each record also contains fields for
an address and a telephone number.

A sort can be classified as being internal if the records that it is
sorting are in main memory or external if some of the records that it is
sorting are in auxiliary storage. The main difference between the two is that
an internal sort can access any item easily whereas an external sort must
access items sequentially or at least in large blocks.
It is possible for two records in a file to have the same key. A sorting
method is said to be stable if it preserves the relative order of duplicate keys
in the file.
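
Stability can be illustrated with a short C++ sketch. The Record structure, its field names and the function name sortByKeyStable are hypothetical. The sketch relies on std::stable_sort from the C++ standard library, which is specified to preserve the relative order of elements that compare equal.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// A record with a sort key and some additional data.
struct Record {
    int key;
    std::string name;
};

// Sorts the file on the key; records with equal keys keep their
// original relative order because std::stable_sort is stable.
void sortByKeyStable(std::vector<Record>& file) {
    std::stable_sort(file.begin(), file.end(),
                     [](const Record& a, const Record& b) {
                         return a.key < b.key;
                     });
}
```

For the records (2, x), (1, y), (2, z), (1, w), a stable sort on the key gives (1, y), (1, w), (2, x), (2, z): y stays ahead of w and x stays ahead of z, exactly as in the input.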

The general criteria for judging a sorting algorithm are
 How fast can it sort information in an average case?
 How fast are its best and worst cases?
 Does it exhibit natural or unnatural behavior?
 Does it rearrange elements with equal keys?

14.1 SELECTION SORT


Selection sort is a simple sorting algorithm. This sorting algorithm is
an in-place comparison-based algorithm in which the list is divided into two
parts, the sorted part at the left end and the unsorted part at the right end.
Initially, the sorted part is empty and the unsorted part is the entire list.
The smallest element is selected from the unsorted array and
swapped with the leftmost element, and that element becomes a part of the
sorted array. This process continues moving unsorted array boundary by
one element to the right.
This algorithm is not suitable for large data sets as its average and
worst case complexities are O(n^2), where n is the number of items.

A selection sort selects the element with the lowest value and
exchanges it with the first element. Then, from the remaining n – 1
elements, the element with the smallest key is found and exchanged with
the second element, and so forth. The exchanges continue to the last two
elements.

Algorithm:
Step 1 − Set MIN to location 0
Step 2 − Search the minimum element in the unsorted part of the list
Step 3 − Swap it with the value at location MIN
Step 4 − Increment MIN to point to the next element
Step 5 − Repeat until the list is sorted

The code that follows shows the basic selection sort.

procedure selection sort
   list : array of items
   n : size of list

   for i = 1 to n - 1
      /* set current element as minimum */
      min = i

      /* check the remaining elements for a smaller one */
      for j = i+1 to n
         if list[j] < list[min] then
            min = j
         end if
      end for

      /* swap the minimum element with the current element */
      if min != i then
         swap list[min] and list[i]
      end if
   end for

end procedure

The first pass makes n – 1 comparisons, the second pass makes n – 2,
and so on. Therefore, there is a total of

(n – 1) + (n – 2) + (n – 3) + . . . + 1 = n * (n – 1)/2

comparisons, which is O(n^2). At most one exchange is made on each pass, and
little additional storage is required.
Therefore, the sort may be categorized as O(n^2).
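The comparison count above can be checked with a small sketch. It is the same selection sort, with a counter added; the function name `selection_sort_count` and the counter are illustrative assumptions, not part of the course text.

```c
/* Selection sort that also counts key comparisons.
   Sorts list[0..n-1] in place and returns the number of comparisons. */
long selection_sort_count(int list[], int n)
{
    long comparisons = 0;

    for (int i = 0; i < n - 1; i++) {
        int min = i;                      /* current candidate for the minimum */
        for (int j = i + 1; j < n; j++) {
            comparisons++;                /* one comparison of list[j] with list[min] */
            if (list[j] < list[min])
                min = j;
        }
        if (min != i) {                   /* at most one exchange per pass */
            int temp = list[min];
            list[min] = list[i];
            list[i] = temp;
        }
    }
    return comparisons;
}
```

For an array of n = 8 items the function returns 8 * 7 / 2 = 28, whatever the initial order of the items, because selection sort always scans the whole unsorted part.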
For example, if the selection method is used on the array
25 57 48 37 12 92 86 33, the array after each pass looks like this:

Iteration 0 25 57 48 37 12 92 86 33

Iteration 1 12 57 48 37 25 92 86 33

Iteration 2 12 25 48 37 57 92 86 33

Iteration 3 12 25 33 37 57 92 86 48

Iteration 4 12 25 33 37 57 92 86 48

Iteration 5 12 25 33 37 48 92 86 57

Iteration 6 12 25 33 37 48 57 86 92

Iteration 7 12 25 33 37 48 57 86 92

Diagram 10.22

Example: program for selection sort in C

#include <stdio.h>
#include <stdbool.h>

#define MAX 7
int intArray[MAX] = {4,6,3,2,1,9,7};
void printline(int count) {

int i;
for(i = 0;i < count-1;i++) {
printf("=");

}
printf("=\n");
}
void display() {

int i;

printf("[");

// navigate through all items

for(i = 0;i < MAX;i++) {

printf("%d ", intArray[i]);


}
printf("]\n");

}
void selectionSort() {
int indexMin,i,j;

// loop through all numbers


for(i = 0; i < MAX-1; i++) {
// set current element as minimum

indexMin = i;
// check the element to be minimum
for(j = i+1;j < MAX;j++) {

if(intArray[j] < intArray[indexMin]) {


indexMin = j;
}

}
if(indexMin != i) {
printf("Items swapped: [ %d, %d ]\n" , intArray[i], intArray[indexMin]);

// swap the numbers


int temp = intArray[indexMin];

intArray[indexMin] = intArray[i];
intArray[i] = temp;
}

printf("Iteration %d#:",(i+1));

display();
}
}

int main(void) {
   printf("Input Array: ");
   display();
   printline(50);
   selectionSort();
   printf("Output Array: ");
   display();
   printline(50);
   return 0;
}

If we compile and run the above program, it will produce the following result

Output

Input Array: [4 6 3 2 1 9 7 ]
==================================================

Items swapped: [ 4, 1 ]
Iteration 1#:[1 6 3 2 4 9 7 ]
Items swapped: [ 6, 2 ]
Iteration 2#:[1 2 3 6 4 9 7 ]
Iteration 3#:[1 2 3 6 4 9 7 ]
Items swapped: [ 6, 4 ]

Iteration 4#:[1 2 3 4 6 9 7 ]
Iteration 5#:[1 2 3 4 6 9 7 ]
Items swapped: [ 9, 7 ]

Iteration 6#:[1 2 3 4 6 7 9 ]

Output Array: [1 2 3 4 6 7 9 ]
==================================================

14.2. INSERTION SORT


Insertion sort is a simple sorting algorithm in which the sorted array (or
list) is built one entry at a time.
This is an in-place comparison-based sorting algorithm. Here, a
sub-list is maintained which is always sorted. For example, the lower part of
an array is maintained to be sorted. An element which is to be inserted into
this sorted sub-list has to find its appropriate place and then be
inserted there. Hence the name, insertion sort.
The array is searched sequentially and unsorted items are moved
and inserted into the sorted sub-list (in the same array). This algorithm is
not suitable for large data sets, as its average and worst case complexities
are O(n^2), where n is the number of items.

The algorithm can be described as:


1. Initialize the sorted list using the first element of the list.
2. Loop over the input array until it is empty, "removing" the first
remaining (leftmost) element.
3. Compare the removed element against the current result, starting
from the highest (rightmost) element, and working left towards the
lowest element.
4. If the removed input element is lower than the current result
element, copy that value into the following element to make room for
the new element below, and repeat with the next lowest result
element.
5. Otherwise, the new element is in the correct location; save it in the
cell left by copying the last examined result up, and start again from
(2) with the next input element.
The insertion sort initially sorts the first two members of the array. Next, the
algorithm inserts the third member into its sorted position in relation to the
first two numbers. Then it inserts the fourth element into the list of three
elements. The process continues until all the elements have been sorted.

The code for a version of the insertion sort is shown next.
void insertionsort(int x[], int n)
{
    int i, k, y;

    for (k = 1; k < n; k++) {
        /* insert x[k] into the sorted file */
        y = x[k];
        /* move down 1 position all elements greater than y */
        for (i = k - 1; i >= 0 && y < x[i]; i--)
            x[i + 1] = x[i];
        /* insert y at proper position */
        x[i + 1] = y;
    }
}

The number of comparisons made during an insertion sort depends upon
the ordering of the file. If the file is already sorted, only one comparison is made on
each pass, so that the sort is O(n). If the file is initially sorted in the reverse
order, the sort is O(n^2), since the total number of comparisons is
(n – 1) + (n – 2) + . . . + 3 + 2 + 1 = n * (n – 1)/2
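The best and worst cases described above can be observed with the following sketch, which adds a comparison counter to the insertion sort. The function name `insertion_sort_count` and the counter are illustrative assumptions, not part of the course text.

```c
/* Insertion sort that also counts comparisons of y with x[i].
   Sorts x[0..n-1] in place and returns the number of comparisons. */
long insertion_sort_count(int x[], int n)
{
    long comparisons = 0;

    for (int k = 1; k < n; k++) {
        int y = x[k];                 /* element to insert */
        int i = k - 1;
        while (i >= 0) {
            comparisons++;            /* compare y with x[i] */
            if (y >= x[i])
                break;                /* y is already in place */
            x[i + 1] = x[i];          /* move larger element up */
            i--;
        }
        x[i + 1] = y;
    }
    return comparisons;
}
```

For n = 8, a file that is already sorted needs 7 comparisons (one per pass), while a file in reverse order needs 1 + 2 + . . . + 7 = 28 comparisons.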

The complete set of iterations for the insertion sort for the file with elements
25 57 48 37 12 92 86 33 is shown in Diagram.

Iteration 0 25 57 48 37 12 92 86 33

Iteration 1 25 57 48 37 12 92 86 33

Iteration 2 25 48 57 37 12 92 86 33

Iteration 3 25 37 48 57 12 92 86 33

Iteration 4 12 25 37 48 57 92 86 33

Iteration 5 12 25 37 48 57 92 86 33

Iteration 6 12 25 37 48 57 86 92 33

Iteration 7 12 25 33 37 48 57 86 92

Diagram

Example: program for insertion sort in C

#include <stdio.h>
#include <stdbool.h>
#define MAX 7

int intArray[MAX] = {4,6,3,2,1,9,7};

void printline(int count) {

int i;

for(i = 0;i < count-1;i++) {

printf("=");
}
printf("=\n");
}

void display() {
int i;

printf("[");

// navigate through all items


for(i = 0;i < MAX;i++) {

printf("%d ",intArray[i]);
}
printf("]\n");

}
void insertionSort() {
int valueToInsert;

int holePosition;
int i;

// loop through all numbers


for(i = 1; i < MAX; i++) {
// select a value to be inserted.

valueToInsert = intArray[i];

// select the hole position where number is to be inserted

holePosition = i;
// check if previous no. is larger than value to be inserted
while (holePosition > 0 && intArray[holePosition-1] > valueToInsert) {

intArray[holePosition] = intArray[holePosition-1];
holePosition--;
printf(" item moved : %d\n" , intArray[holePosition]);

}
if(holePosition != i) {

printf(" item inserted : %d, at position : %d\n" ,
valueToInsert,holePosition);
// insert the number at hole position
intArray[holePosition] = valueToInsert;

}
printf("Iteration %d#:",i);
display();

}
}

int main(void) {
   printf("Input Array: ");
   display();
   printline(50);
   insertionSort();
   printf("Output Array: ");
   display();
   printline(50);
   return 0;
}

If we compile and run the above program, it will produce the following result

Output

Input Array: [4 6 3 2 1 9 7 ]
==================================================

Iteration 1#:[4 6 3 2 1 9 7 ]
item moved : 6
item moved : 4

item inserted : 3, at position : 0

Iteration 2#:[3 4 6 2 1 9 7 ]
item moved : 6
item moved : 4

item moved : 3
item inserted : 2, at position : 0
Iteration 3#:[2 3 4 6 1 9 7 ]
item moved : 6
item moved : 4
item moved : 3
item moved : 2
item inserted : 1, at position : 0
Iteration 4#:[1 2 3 4 6 9 7 ]
Iteration 5#:[1 2 3 4 6 9 7 ]
item moved : 9
item inserted : 7, at position : 5

Iteration 6#:[1 2 3 4 6 7 9 ]
Output Array: [1 2 3 4 6 7 9 ]
==================================================

Insertion sort is much less efficient than more advanced algorithms such as
quicksort, heapsort, or merge sort, but it has the following advantages:
• Simple to implement
• Efficient on (quite) small data sets
• Efficient on data sets which are already substantially sorted
• Stable (i.e., does not change the relative order of elements with equal keys)
• Minimal memory requirements

Let us sum up
In this unit we discussed the basic concepts of sorting and
two sorting methods, selection sort and insertion sort, with suitable
algorithms and notations. For further reading, refer to the suggested books
listed below.
Check your progress
a) Answer the following:
1. What are the real-life scenarios of sorting?
2. Define selection sort.
3. Describe insertion sort.
Glossary

Leftmost, rightmost.
Suggested readings
1. E Balagurusamy, "Data Structures", McGraw Hill Education (India)
Private Limited, 2019.
2. Mark Allen Weiss, "Data Structures and Algorithm Analysis in C++",
Third Edition, Pearson Education, 2007.
3. Reema Thareja, "Data Structures Using C", Second Edition,
Oxford University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed,
"Fundamentals of Data Structures in C", Second Edition, University
Press, 2008.

Answers to check your progress


a) Answer the following:
1. Real-life scenarios.
• Telephone Directory − The telephone directory stores the
telephone numbers of people sorted by their names, so that
the names can be searched easily.
• Dictionary − The dictionary stores words in alphabetical
order so that searching for any word becomes easy.
2. A selection sort selects the element with the lowest value and
exchanges it with the first element.
3. Insertion sort is a simple sort algorithm in which the sorted array (or
list) is built by making one entry at a time.

Unit -15
Bubble Sort and Quick Sort

Structure

Overview
Learning objectives

15.0 Bubble Sort


15.1 Quick Sort
Let us sum up
Check your progress
Glossary
Suggested readings
Answers to check your progress

Overview
We have discussed linear and non-linear data structures and the
concepts of sorting in the previous units. In this unit, we will learn about
two sorting techniques, bubble sort and quick sort. This unit gives
an overview of both methods with their notations, representations
and operations, along with suitable examples.
Learning objectives
At the end of this unit you will be able to
• Understand the concepts of Bubble sort with suitable examples.
• Get knowledge about the Quick sort with the respective algorithm.

15.0 Bubble Sort


Bubble sort is the most straightforward and simplistic method of
sorting data. The algorithm starts at the beginning of the data set. It
compares the first two elements, and if the first is greater than the second, it
swaps them. It does this for each pair of adjacent elements until there are no
more pairs to compare, and then repeats the whole pass until a pass is
completed in which no swaps have occurred. This algorithm is inefficient for
large data sets, and is rarely used in practice.

The bubble sort algorithm works as follows:


1. Compare adjacent elements. If the first is greater than the second,
swap them.
2. Do this for each pair of adjacent elements, starting with the first two
and ending with the last two. At this point the last element should be
the greatest.
3. Repeat the steps for all elements except the last one.
4. Keep repeating for one fewer element each time, until you have no
more pairs to compare.
Example:

Let x be an array of integers of which the first n are to be sorted so that x[i] ≤
x[j] for 0 ≤ i < j < n. The basic idea underlying the bubble sort is to pass
through the file sequentially. In each pass an element is compared with its
predecessor (x[i] with x[i-1]) and the two elements are interchanged if they
are not in proper order. The elements are like bubbles in a tank of water –
each seeks its own level. A simple form of the bubble sort is:
void bubble (int x[ ], int n)
{
    int i, j, t;

    for (i = 1; i < n; i++)
        for (j = n-1; j >= i; j--) {
            if (x[j-1] > x[j]) {
                /* exchange elements */
                t = x[j-1];
                x[j-1] = x[j];
                x[j] = t;
            }
        }
}

Consider the file 25 57 48 37 12 92 86 33. The following
comparisons are made on the first pass.
x[7] with x[6] (33 with 86) Interchange
x[6] with x[5] (33 with 92) Interchange
x[5] with x[4] (33 with 12) No Interchange
x[4] with x[3] (12 with 37) Interchange
x[3] with x[2] (12 with 48) Interchange
x[2] with x[1] (12 with 57) Interchange
x[1] with x[0] (12 with 25) Interchange
Thus, after the first pass, the file is in the order 12 25 57 48
37 33 92 86. The complete set of iterations is the following:

Iteration 0 25 57 48 37 12 92 86 33

Iteration 1 12 25 57 48 37 33 92 86

Iteration 2 12 25 33 57 48 37 86 92

Iteration 3 12 25 33 37 57 48 86 92

Iteration 4 12 25 33 37 48 57 86 92

Iteration 5 12 25 33 37 48 57 86 92

Iteration 6 12 25 33 37 48 57 86 92

Iteration 7 12 25 33 37 48 57 86 92

Diagram 10.21

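There are n-1 passes and at most n-1 comparisons on each pass. Thus the total
number of comparisons is no more than (n – 1) * (n – 1), which is O(n^2). In the
routine above, pass i makes exactly n – i comparisons, for a total of
n * (n – 1)/2 comparisons.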
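A common refinement is to stop as soon as a pass makes no interchange, since the file must then be sorted. The following is a minimal sketch of that refinement applied to the bubble routine, with a comparison counter added; the function name `bubble_sort_count`, the `swapped` flag and the counter are illustrative assumptions.

```c
#include <stdbool.h>

/* Bubble sort with an early exit: stops after the first pass that
   makes no interchange. Sorts x[0..n-1] in place and returns the
   number of comparisons made. */
long bubble_sort_count(int x[], int n)
{
    long comparisons = 0;

    for (int i = 1; i < n; i++) {
        bool swapped = false;
        for (int j = n - 1; j >= i; j--) {
            comparisons++;
            if (x[j - 1] > x[j]) {
                int t = x[j - 1];     /* exchange elements */
                x[j - 1] = x[j];
                x[j] = t;
                swapped = true;
            }
        }
        if (!swapped)
            break;                    /* no interchange: file is sorted */
    }
    return comparisons;
}
```

With n = 8, a file that is already sorted is handled in a single pass of 7 comparisons, while a file in reverse order still needs all 7 + 6 + . . . + 1 = 28 comparisons. For the file 25 57 48 37 12 92 86 33 the sort stops after the fifth pass, having made 7 + 6 + 5 + 4 + 3 = 25 comparisons.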

We shall see the implementation of bubble sort in C programming
language here.

#include <stdio.h>
#include <stdbool.h>

#define MAX 10

int list[MAX] = {1,8,4,6,0,3,5,2,7,9};

void display() {
int i;
printf("[");

// navigate through all items


for(i = 0; i < MAX; i++) {

printf("%d ",list[i]);
}

printf("]\n");
}

void bubbleSort() {
int temp;
int i,j;

bool swapped = false;

// loop through all numbers

for(i = 0; i < MAX-1; i++) {


swapped = false;

// loop through numbers falling ahead


for(j = 0; j < MAX-1-i; j++) {
printf(" Items compared: [ %d, %d ] ", list[j],list[j+1]);

// check if next number is lesser than current no


// swap the numbers.

// (Bubble up the highest number)

if(list[j] > list[j+1]) {

temp = list[j];
list[j] = list[j+1];
list[j+1] = temp;

swapped = true;
printf(" => swapped [%d, %d]\n",list[j],list[j+1]);

} else {
printf(" => not swapped\n");
}

}

// if no number was swapped that means
// array is sorted now, break the loop.
if(!swapped) {
break;
}

printf("Iteration %d#: ",(i+1));
display();
}
}

int main(void) {
   printf("Input Array: ");
   display();
   printf("\n");

   bubbleSort();
   printf("\nOutput Array: ");
   display();
   return 0;
}

If we compile and run the above program, it will produce the following
result:−
Output

Input Array: [1 8 4 6 0 3 5 2 7 9 ]
Items compared: [ 1, 8 ] => not swapped
Items compared: [ 8, 4 ] => swapped [4, 8]
Items compared: [ 8, 6 ] => swapped [6, 8]

Items compared: [ 8, 0 ] => swapped [0, 8]


Items compared: [ 8, 3 ] => swapped [3, 8]
Items compared: [ 8, 5 ] => swapped [5, 8]

Items compared: [ 8, 2 ] => swapped [2, 8]

Items compared: [ 8, 7 ] => swapped [7, 8]
Items compared: [ 8, 9 ] => not swapped
Iteration 1#: [1 4 6 0 3 5 2 7 8 9 ]

Items compared: [ 1, 4 ] => not swapped


Items compared: [ 4, 6 ] => not swapped
Items compared: [ 6, 0 ] => swapped [0, 6]
Items compared: [ 6, 3 ] => swapped [3, 6]
Items compared: [ 6, 5 ] => swapped [5, 6]
Items compared: [ 6, 2 ] => swapped [2, 6]
Items compared: [ 6, 7 ] => not swapped
Items compared: [ 7, 8 ] => not swapped
Iteration 2#: [1 4 0 3 5 2 6 7 8 9 ]
Items compared: [ 1, 4 ] => not swapped
Items compared: [ 4, 0 ] => swapped [0, 4]
Items compared: [ 4, 3 ] => swapped [3, 4]

Items compared: [ 4, 5 ] => not swapped


Items compared: [ 5, 2 ] => swapped [2, 5]
Items compared: [ 5, 6 ] => not swapped
Items compared: [ 6, 7 ] => not swapped
Iteration 3#: [1 0 3 4 2 5 6 7 8 9 ]
Items compared: [ 1, 0 ] => swapped [0, 1]

Items compared: [ 1, 3 ] => not swapped


Items compared: [ 3, 4 ] => not swapped
Items compared: [ 4, 2 ] => swapped [2, 4]
Items compared: [ 4, 5 ] => not swapped
Items compared: [ 5, 6 ] => not swapped
Iteration 4#: [0 1 3 2 4 5 6 7 8 9 ]
Items compared: [ 0, 1 ] => not swapped

Items compared: [ 1, 3 ] => not swapped
Items compared: [ 3, 2 ] => swapped [2, 3]
Items compared: [ 3, 4 ] => not swapped

Items compared: [ 4, 5 ] => not swapped


Iteration 5#: [0 1 2 3 4 5 6 7 8 9 ]
Items compared: [ 0, 1 ] => not swapped
Items compared: [ 1, 2 ] => not swapped
Items compared: [ 2, 3 ] => not swapped
Items compared: [ 3, 4 ] => not swapped

Output Array: [0 1 2 3 4 5 6 7 8 9 ]

15.1 QUICK SORT


Quick sort is a divide-and-conquer method for sorting. This sort is
also called partition-exchange sort. Let x be an array, and n the number
of elements in the array to be sorted. An element a, called the pivot, is
selected from a specific position within the array (for example, a = x[0]).
Suppose that the elements of x are partitioned so that a is placed into
position j and the following conditions hold:
1. Each of the elements in positions 0 through j – 1 is less than or
equal to a.
2. Each of the elements in positions j + 1 through n – 1 is greater than
or equal to a.
If these two conditions hold for a particular a and j, a is the jth smallest
element of x, so that a remains in position j when the array is completely
sorted. If this process is repeated with the subarrays x[0] through x[j – 1]
and x[j + 1] through x[n – 1] and any subarrays created by the process in
successive iterations, the final result is a sorted file.
The outline of an algorithm quick(x, lb, ub) to sort all the elements in an array x
between positions lb and ub is as follows:
if (lb >= ub)
    return;
partition(x, lb, ub, &j);
quick(x, lb, j - 1);
quick(x, j + 1, ub);

The object of partition is to allow a specific element to find its proper


position with respect to the others in the subarray. One way to effect a
partition efficiently is the following: Let a = x[lb] be the element whose
final position is sought. Two pointers, up and down, are initialized to the
upper and lower bounds of the subarray, respectively.
At any point during execution, each element in a position above up is
greater than or equal to a, and each element in a position below down is
less than or equal to a. The two pointers up and down are moved towards
each other in the following fashion.
1. Repeatedly increase the pointer down by one position until x[down]
> a.
2. Repeatedly decrease the pointer up by one position until x[up] <= a.
3. If up > down, interchange x[down] with x[up].

The process is repeated until the condition in step 3 fails (up <= down), at
which point x[up] is interchanged with x[lb], whose final position was sought,
and j is set to up.
The algorithm can be implemented by the following procedure.
void partition (int x[], int lb, int ub, int *pj)
{
    int a, down, temp, up;

    a = x[lb];
    up = ub;
    down = lb;
    while (down < up) {
        while (x[down] <= a && down < ub)
            down++;            /* move up the array */
        while (x[up] > a)
            up--;              /* move down the array */
        if (down < up) {
            /* interchange x[down] and x[up] */
            temp = x[down];
            x[down] = x[up];
            x[up] = temp;
        }
    }
    x[lb] = x[up];
    x[up] = a;
    *pj = up;
}
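Putting the outline of quick together with the partition procedure gives a complete sort. The following is a minimal sketch under the assumption that the pivot is always the first element of the subarray, as in the text; the partition logic is repeated here only so that the sketch compiles on its own.

```c
/* Partition x[lb..ub] around a = x[lb] and store the final index of
   the pivot in *pj (same logic as the partition procedure in the text). */
static void partition(int x[], int lb, int ub, int *pj)
{
    int a = x[lb];
    int up = ub;
    int down = lb;

    while (down < up) {
        while (x[down] <= a && down < ub)
            down++;                   /* skip elements <= pivot */
        while (x[up] > a)
            up--;                     /* skip elements > pivot */
        if (down < up) {
            int temp = x[down];       /* interchange x[down] and x[up] */
            x[down] = x[up];
            x[up] = temp;
        }
    }
    x[lb] = x[up];
    x[up] = a;
    *pj = up;
}

/* Sort x[lb..ub] in place by recursive partitioning. */
void quick(int x[], int lb, int ub)
{
    int j;

    if (lb >= ub)
        return;                       /* zero or one element: already sorted */
    partition(x, lb, ub, &j);
    quick(x, lb, j - 1);              /* sort the left subarray */
    quick(x, j + 1, ub);              /* sort the right subarray */
}
```

An array x of n elements is sorted with the call quick(x, 0, n - 1).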

Assume that the file size n is a power of 2, say n = 2^m, so that m = log2 n. If the
pivot always ends up in the middle of the subarray, then there will be about n comparisons
on the first pass, after which the file is split into two subfiles each of size n/2,
approximately. For each of these two files there are about n/2 comparisons, and a
total of four files each of size n/4 are formed. Thus the total number of
comparisons for the entire sort is approximately

n + 2*(n/2) + 4*(n/4) + 8*(n/8) + . . . + n*(n/n)

or

n + n + n + n + . . . + n (m terms) = n * m

comparisons. There are m terms because the file is subdivided m times.
Thus the total number of comparisons is O(n*m) or O(n log n).

If the initial array is 25 57 48 37 12 92 86 33, the following steps are
performed to obtain the sorted list.

25 57 48 37 12 92 86 33

12 25 57 48 37 92 86 33

12 25 57 48 37 92 86 33

12 25 48 37 33 57 92 86

12 25 37 33 48 57 92 86

12 25 33 37 48 57 92 86

12 25 33 37 48 57 86 92

12 25 33 37 48 57 86 92

Diagram 10.24

We shall see the implementation of Quick sort in C programming language


here.

#include <stdio.h>

#include <stdbool.h>
#define MAX 7

int intArray[MAX] = {4,6,3,2,1,9,7};

void printline(int count) {


int i;

for(i = 0;i < count-1;i++) {


printf("=");

}
printf("=\n");
}

void display() {
int i;
printf("[");

// navigate through all items


for(i = 0;i < MAX;i++) {

printf("%d ",intArray[i]);
}
printf("]\n");

}
void swap(int num1, int num2) {
int temp = intArray[num1];

intArray[num1] = intArray[num2];
intArray[num2] = temp;
}

int partition(int left, int right, int pivot) {


int leftPointer = left -1;
int rightPointer = right;

while(true) {

while(intArray[++leftPointer] < pivot) {


//do nothing
}

while(rightPointer > 0 && intArray[--rightPointer] > pivot) {


//do nothing
}

if(leftPointer >= rightPointer) {


break;
} else {

printf(" item swapped :%d,%d\n",


intArray[leftPointer],intArray[rightPointer]);
swap(leftPointer,rightPointer);

}
}
printf(" pivot swapped :%d,%d\n", intArray[leftPointer],intArray[right]);

swap(leftPointer,right);
printf("Updated Array: ");
display();

return leftPointer;
}
void quickSort(int left, int right) {

if(right-left <= 0) {
return;
} else {

int pivot = intArray[right];


int partitionPoint = partition(left, right, pivot);
quickSort(left,partitionPoint-1);

quickSort(partitionPoint+1,right);

}
}
int main() {

printf("Input Array: ");


display();
printline(50);

quickSort(0,MAX-1);
printf("Output Array: ");
display();

printline(50);
}

If we compile and run the above program, it will produce the following
result:-
Output

Input Array: [4 6 3 2 1 9 7 ]
==================================================
pivot swapped :9,7

Updated Array: [4 6 3 2 1 7 9 ]
pivot swapped :4,1
Updated Array: [1 6 3 2 4 7 9 ]

item swapped :6,2


pivot swapped :6,4
Updated Array: [1 2 3 4 6 7 9 ]

pivot swapped :3,3


Updated Array: [1 2 3 4 6 7 9 ]
Output Array: [1 2 3 4 6 7 9 ]
==================================================

Let us sum up
In this unit we discussed two sorting techniques,
bubble sort and quick sort. We also discussed their algorithms and
sample implementations, with an example program for each. For further
reading, refer to the suggested books listed below.
Check your progress
a) Fill up the blanks:
1. The basic idea underlying the bubble sort is to pass through the
file__________.
2. The sorting method that starts at the beginning of the data set and
repeatedly compares and swaps adjacent elements is called______.
3. ________ is a divide-and-conquer method for sorting.
4. Quick sort is also called___________.

Glossary
Swap, interchange, subarray, Dataset.

Suggested readings
1. E Balagurusamy, "Data Structures", McGraw Hill Education (India)
Private Limited, 2019.
2. Mark Allen Weiss, "Data Structures and Algorithm Analysis in C++",
Third Edition, Pearson Education, 2007.
3. Reema Thareja, "Data Structures Using C", Second Edition,
Oxford University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed,
"Fundamentals of Data Structures in C", Second Edition, University
Press, 2008.

Answers to check your progress


a) Fill up the blanks:
1. sequentially
2. bubble sort
3. Quicksort
4. partition exchange sort

Unit -16
Merge Sort and Bucket Sort

Structure

Overview
Learning objectives

16.0 Merge Sort


16.1 Bucket Sort
16.2 Sorting With Disk
Let us sum up
Check your progress
Glossary
Suggested readings
Answers to check your progress

Overview
We have discussed linear and non-linear data structures in
the previous units. In this unit, we will learn about sorting techniques
that are also used to sort data held in disk storage. This unit
gives an overview of merge sort and bucket sort with their notations,
representations and operations, along with suitable
examples.

Learning objectives
At the end of this unit you will be able to
• Understand the Merge sort with suitable algorithms and program.
• Get knowledge about the Bucket sorting with example program.

16.0 Merge Sort
Merge sort is a sorting technique based on the divide and conquer
technique. With a worst-case time complexity of O(n log n), it is one of the
most respected algorithms.
Merge sort first divides the array into two halves of (nearly) equal size,
sorts each half, and then combines them in a sorted manner.

Merge sort works as follows:


1. Divide the unsorted list into two sub lists of about half the size

2. Sort each of the two sub lists


3. Merge the two sorted sub lists back into one sorted list.
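#include <stdlib.h>

void merge (int v[], int first, int mid, int last);

/* sort the elements array[first] .. array[last-1] */
void sort (int array[], int first, int last)
{
    int mid;

    if (last - first < 2) return;   /* zero or one element: already sorted */
    mid = (first + last)/2;
    sort (array, first, mid);
    sort (array, mid, last);
    merge (array, first, mid, last);
}

/* merge the sorted runs v[first..mid-1] and v[mid..last-1] */
void merge (int v[], int first, int mid, int last)
{
    int i, j, k;
    int *tmp = malloc(sizeof(int) * (last - first));

    if (tmp == NULL) return;        /* allocation failed: leave v unchanged */
    i = first;
    j = mid;
    k = 0;
    while ((i < mid) && (j < last))
    {
        if (v[i] <= v[j])
            tmp[k++] = v[i++];
        else
            tmp[k++] = v[j++];
    }
    while (i < mid)
        tmp[k++] = v[i++];
    while (j < last)
        tmp[k++] = v[j++];
    for (i = 0; i < (last - first); i++)
        v[first+i] = tmp[i];
    free(tmp);
}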

a) TWO-WAY MERGE SORT

Merging is the process of combining two or more sorted files into a
third sorted file.
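The merging step on its own can be sketched as follows for two sorted arrays held in memory. The function name `merge_arrays` and its parameter names are illustrative assumptions; the caller is assumed to provide an output array with room for all the elements.

```c
/* Merge the sorted arrays a[0..na-1] and b[0..nb-1] into c[0..na+nb-1].
   The caller must provide room for na + nb elements in c. */
void merge_arrays(const int a[], int na, const int b[], int nb, int c[])
{
    int i = 0, j = 0, k = 0;

    /* repeatedly copy the smaller of the two front elements */
    while (i < na && j < nb) {
        if (a[i] <= b[j])
            c[k++] = a[i++];
        else
            c[k++] = b[j++];
    }
    /* one input is exhausted: copy the rest of the other */
    while (i < na)
        c[k++] = a[i++];
    while (j < nb)
        c[k++] = b[j++];
}
```

Merging 25 37 48 57 with 12 33 86 92 in this way gives 12 25 33 37 48 57 86 92. Taking the element from a when the two front elements are equal keeps the merge stable.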
Divide the file into n subfiles of size 1 and merge adjacent pairs of
files. Then there will be n/2 files of size 2. Repeat this process until there is
only one file remaining of size n. Diagram illustrates this process on a
sample file. Each individual file is contained in boxes.
A routine to implement the merge sort is described as follows: an
auxiliary array aux of size n is required to hold the results of merging two
subarrays of x. The variable size contains the size of the subarrays being
merged. Since at any time the two files being merged are both subarrays of
x, lower and upper bounds are required to indicate the subfiles of x being
merged.

Original file [25] [57] [48] [37] [12] [92] [86] [33]

Pass 1 [25 57] [37 48] [12 92] [33 86]

Pass 2 [25 37 48 57] [12 33 86 92]

Pass 3 [12 25 33 37 48 57 86 92]

Diagram

l1 and u1 represent the lower and upper bounds of the first file, and l2 and
u2 represent the lower and upper bounds of the second file, respectively. i
and j are used to reference elements of the source files being merged, and
k indexes the destination file aux. The routine follows:
#define NUMELTS . . .

void mergesort (int x[ ], int n)


{

int aux[NUMELTS], i, j, k, l1, l2, size, u1, u2;

size = 1; /* merge files of size 1*/

while (size < n) {


l1 = 0;

k = 0;
while (l1+size < n) {
l2 = l1+size;

u1 = l2-1;
u2 = (l2+size-1 < n) ? l2+size-1 : n-1;
/* proceed through the two subfiles */
for (i = l1, j = l2; i <= u1 && j <=u2; k++)
/* enter smaller into the array aux */
if (x[i] <= x[j])
aux[k] = x[i++];
else
aux[k] = x[j++];
/* one of the files exhausted */
for (; i <= u1; k++)
aux[k] = x[i++];

for (; j <= u2; k++)


aux[k] = x[j++];
l1 = u2+1;
}
for (i = l1; k < n; i++)
aux[k++] = x[i];

for (i = 0; i < n; i++)


x[i] = aux[i];
size *= 2;
}
}
There are no more than log2 n passes in merge sort, each
involving n or fewer comparisons. Thus, merge sort requires no more than
n * log2 n comparisons. But merge sort requires O(n) additional space for
the auxiliary array.

Example program:

We shall see the implementation of merge sort in C programming language


here –

#include <stdio.h>

#define max 10

int a[11] = { 10, 14, 19, 26, 27, 31, 33, 35, 42, 44, 0 };
int b[11];   /* work array: must be as large as a[] */

void merging(int low, int mid, int high) {


int l1, l2, i;

for(l1 = low, l2 = mid + 1, i = low; l1 <= mid && l2 <= high; i++) {

if(a[l1] <= a[l2])


b[i] = a[l1++];
else
b[i] = a[l2++];
}
while(l1 <= mid)

b[i++] = a[l1++];

while(l2 <= high)

b[i++] = a[l2++];
for(i = low; i <= high; i++)
a[i] = b[i];

}

void sort(int low, int high) {


int mid;

if(low < high) {


mid = (low + high) / 2;
sort(low, mid);

sort(mid+1, high);
merging(low, mid, high);
} else {

return;
}
}

int main() {
int i;

printf("List before sorting\n");

for(i = 0; i <= max; i++)


printf("%d ", a[i]);

sort(0, max);

printf("\nList after sorting\n");

for(i = 0; i <= max; i++)


printf("%d ", a[i]);

}

If we compile and run the above program, it will produce the following result

Output

List before sorting
10 14 19 26 27 31 33 35 42 44 0
List after sorting
0 10 14 19 26 27 31 33 35 42 44

16.1 Bucket Sort

Bucket sort distributes the list of elements across different buckets in such a
way that any bucket m contains elements greater than the elements of
bucket m–1 but less than the elements of bucket m+1. The elements within
each bucket are sorted individually, either by using some other sorting
technique or by recursively applying the bucket sort technique. In the end,
the elements of all the buckets are joined in bucket order to generate the
sorted list. This technique is particularly effective when the values fall
within a small, known range.

Consider a list containing ten integers stored in a random fashion.

[Figure: List of integers]

In the above list, all the elements are in the range 0 to 49. So, let
us create five buckets, each covering a range of ten values (0 to 9, 10 to 19,
and so on) and able to hold up to ten elements. The following figure shows how
these buckets are used for sorting the list.

[Figure 8.4: Bucket sort]

As we can see in the above illustration, the list elements are first distributed
as per their values across different buckets. Then, each of the buckets is
individually sorted, and the buckets are later merged to generate the sorted list.
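When every bucket covers a range of the same width, the bucket for a value can be computed by integer division instead of a chain of range tests. The following is a minimal sketch of the distribution step under that assumption; the names `NUM_BUCKETS`, `BUCKET_WIDTH`, `bucket_index` and `bucket_counts` are illustrative and do not come from the course text.

```c
#define NUM_BUCKETS 5     /* assumed: five buckets, as in the example */
#define BUCKET_WIDTH 10   /* assumed: each bucket spans ten values */

/* Return the bucket number (0..4) for a value in the range 0..49,
   or -1 if the value lies outside that range. */
int bucket_index(int value)
{
    if (value < 0 || value >= NUM_BUCKETS * BUCKET_WIDTH)
        return -1;
    return value / BUCKET_WIDTH;
}

/* Count how many of arr[0..size-1] fall into each bucket.
   Values outside the assumed range are ignored. */
void bucket_counts(const int arr[], int size, int count[NUM_BUCKETS])
{
    for (int b = 0; b < NUM_BUCKETS; b++)
        count[b] = 0;
    for (int i = 0; i < size; i++) {
        int b = bucket_index(arr[i]);
        if (b >= 0)
            count[b]++;
    }
}
```

For example, the value 27 goes to bucket 2 and the value 4 goes to bucket 0. Checking the range first keeps an unexpected value from being written outside the bucket array.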

a) Example: Write an algorithm to perform bucket sort on a given
array of integers.
Assumption: The input list elements are within the range of 0 to 49.
bucket(arr[], size)

Step 1: Start
Step 2: Set i = 0, j = 0 and k = 0

Step 3: Initialize an array c[5] and set all its values to 0; it keeps track of
the number of elements in each of the five buckets
Step 4: Create five buckets by initializing a 2-D array b[5][10] and set all its
values to 0
Step 5: Now, distribute the input list elements across different buckets. To
do this, repeat Steps 6-16 while i < size

Step 6: if 0 <= arr[i] <=9 then goto Step 7 else goto Step 8
Step 7: Set b[0][c[0]] = arr[i] and c[0] = c[0] + 1
Step 8: if 10 <= arr[i] <=19 then goto Step 9 else goto Step 10

Step 9: Set b[1][c[1]] = arr[i] and c[1] = c[1] + 1


Step 10: if 20 <= arr[i] <=29 then goto Step 11 else goto Step 12
Step 11: Set b[2][c[2]] = arr[i] and c[2] = c[2] + 1
Step 12: if 30 <= arr[i] <=39 then goto Step 13 else goto Step 14
Step 13: Set b[3][c[3]] = arr[i] and c[3] = c[3] + 1
Step 14: if 40 <= arr[i] <=49 then goto Step 15 else goto Step 16
Step 15: Set b[4][c[4]] = arr[i] and c[4] = c[4] + 1
Step 16: Set i = i + 1
Step 17: Sort each of the buckets b[][] by calling insertion sort module
insertion(&b[][],c[])
Step 18: Merge all the buckets together into the main array by setting array
[] = b[][]

Step 19: Stop

b) Implementation of bucket sorting technique for N elements in C.

#include <stdio.h>
#include <stdlib.h>

void insertion(int*, int); /*Function prototype for performing insertion sort*/
void bucket(int*, int); /*Function prototype for performing bucket sort*/

int main(void)
{
    int *arr;
    int i, N;

    printf("Enter the number of elements in the array:\n");
    if (scanf("%d", &N) != 1 || N <= 0)
        return 1;

    arr = (int*) malloc(N * sizeof(int)); /*Dynamic allocation of memory for the array*/
    if (arr == NULL)
        return 1;

    printf("Enter the %d elements to sort:\n", N);
    for (i = 0; i < N; i++)
        scanf("%d", &arr[i]); /*Reading array elements*/

    bucket(arr, N); /*Calling bucket function*/

    printf("\nThe sorted elements are:\n");
    for (i = 0; i < N; i++)
        printf("%d\n", arr[i]); /*Printing sorted array*/

    free(arr);
    return 0;
}
/*Insertion sort function for sorting elements in a bucket*/
void insertion(int *array, int size)
{
    int i, j, temp;

    for (i = 1; i < size; i++)
    {
        temp = array[i];
        for (j = i-1; j >= 0; j--)
            if (array[j] > temp)
                array[j+1] = array[j];
            else
                break;
        array[j+1] = temp;
    }
}
void bucket(int *array, int size)
{
int i, j, k, b[5][10]; /* five buckets; each can hold at most 10 elements */
int c[5];
for(i=0;i<5;i++)
c[i]=0;

/*Distributing elements across different buckets*/

for(i=0;i<size;i++)
{

if(array[i]>=0 && array[i]<=9)


b[0][c[0]++]=array[i];
if(array[i]>=10 && array[i]<=19)

b[1][c[1]++]=array[i];
if(array[i]>=20 && array[i]<=29)

b[2][c[2]++]=array[i];
if(array[i]>=30 && array[i]<=39)

b[3][c[3]++]=array[i];
if(array[i]>=40 && array[i]<=49)
b[4][c[4]++]=array[i];

}
/*Sorting elements in each bucket using insertion sort*/
for(i=0;i<5;i++)

if(c[i]!=0)
insertion(&b[i][0], c[i]);
/*Calling insert function*/
/*Merging buckets to form the original list*/
i=0;
k=0;
while(i<5)
{
if(c[i]==0)
{
i=i+1;
continue;

}
for(j=0;j<c[i];j++)
array[k++]=b[i][j];
i=i+1;
}
}
Output

Enter the number of elements in the array:
10
Enter the 10 elements to sort:
2
27
13
18
21
43
42
39
31
4

The sorted elements are:
2
4
13
18
21
27
31
39
42
43

a) Efficiency of Bucket Sorting

Assume that an array containing n elements is sorted using the bucket
sort technique.
In the worst case, all the elements of the list are placed in a single
bucket.
Each bucket is sorted using insertion sort, whose worst-case efficiency
is O(n^2).
Thus, the worst-case efficiency of bucket sort is O(n^2).
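
To see where the quadratic bound comes from, the sketch below counts how many element shifts insertion sort performs on a single bucket. It is an illustrative variant of insertion sort, and the function name count_shifts is an assumption made for this example only. For a bucket of n elements in reverse order, every element is shifted past all the earlier ones, giving 1 + 2 + ... + (n - 1) = n(n - 1)/2 shifts.

```c
/* Illustrative variant of insertion sort that sorts array[0..size-1]
   and returns the number of element shifts it performed. */
long count_shifts(int *array, int size)
{
    long shifts = 0;
    int i, j, temp;

    for (i = 1; i < size; i++) {
        temp = array[i];
        /* Move larger elements one place to the right. */
        for (j = i - 1; j >= 0 && array[j] > temp; j--) {
            array[j + 1] = array[j];
            shifts++;
        }
        array[j + 1] = temp;
    }
    return shifts;
}
```

For the 10 elements 9, 8, ..., 0 the function performs 45 shifts, that is, 10 x 9 / 2, while an already sorted bucket needs none.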
b) Advantages and Disadvantages
Some of the key advantages of the bucket sorting technique are:
1. It is stable: elements with equal values keep their original relative
order, provided each bucket is sorted with a stable method such as
insertion sort.
2. It performs well for large lists whose elements fall within a small range.
The disadvantages of bucket sort are as follows:
1. It works only if the range of input values is known in advance.
2. It requires additional space to perform the sorting operation.
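
The first advantage can be illustrated with records that carry a key and a tag. This is a minimal sketch under stated assumptions: the struct item type and the function insertion_by_key are hypothetical names introduced for this example, and only the key takes part in comparisons.

```c
/* A record with a sort key and a tag that marks its input position. */
struct item {
    int key;
    char tag;
};

/* Insertion sort on records, comparing keys only. Because the test is
   the strict `>`, records with equal keys are never moved past each
   other, so their input order is preserved. */
void insertion_by_key(struct item *a, int size)
{
    int i, j;
    struct item temp;

    for (i = 1; i < size; i++) {
        temp = a[i];
        for (j = i - 1; j >= 0 && a[j].key > temp.key; j--)
            a[j + 1] = a[j];
        a[j + 1] = temp;
    }
}
```

Sorting the records (13,a), (4,b), (13,c), (4,d) in this way yields (4,b), (4,d), (13,a), (13,c): within each key, the tags stay in their input order.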

16.2 SORTING WITH DISK


General-purpose sorting algorithms assume high-speed random access
to all data values, so they are not suitable when the values to be sorted do
not fit into main memory. Any sorting algorithm that uses external memory,
such as tape or disk, during the sort is called external sorting. The main
concern in external sorting is to minimize the number of disk accesses,
since reading a disk block can take on the order of a million times longer
than accessing an item in RAM.
Most external sort routines are based on merge sort. They break a
large data file into a number of shorter, sorted runs. These runs can be
produced by repeatedly reading a section of the data file into RAM, sorting
it with an ordinary in-memory algorithm such as quick sort, and writing the
sorted data to disk. After the sorted runs have been generated, a merge
algorithm combines the sorted files into longer sorted files. The simplest
scheme is a 2-way merge: merge two sorted files into one sorted file, then
merge two more, and so on until there is just one large sorted file.
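
The two phases described above can be sketched in C as follows. This is a minimal illustration rather than a tuned external sort. The function names write_sorted_run and merge_runs are hypothetical, runs are assumed to be stored as text with one integer per line, and the standard library function qsort (whose underlying algorithm is implementation-defined) stands in for the in-memory sort.

```c
#include <stdio.h>
#include <stdlib.h>

/* Comparison callback for qsort: ascending order of int values. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Phase 1: sorts one chunk in memory and writes it to `run` as a sorted
   run, one integer per line. Returns 0 on success, -1 on a write error. */
int write_sorted_run(int *chunk, size_t count, FILE *run)
{
    size_t i;

    qsort(chunk, count, sizeof chunk[0], compare_ints);
    for (i = 0; i < count; i++)
        if (fprintf(run, "%d\n", chunk[i]) < 0)
            return -1;
    return 0;
}

/* Phase 2 (2-way merge): reads two sorted runs from their beginning and
   writes one longer sorted run to `out`. Only one value from each run
   is held in memory at a time. */
void merge_runs(FILE *a, FILE *b, FILE *out)
{
    int x, y;
    int have_x, have_y;

    rewind(a);
    rewind(b);
    have_x = (fscanf(a, "%d", &x) == 1);
    have_y = (fscanf(b, "%d", &y) == 1);

    while (have_x && have_y) {
        if (x <= y) {
            fprintf(out, "%d\n", x);
            have_x = (fscanf(a, "%d", &x) == 1);
        } else {
            fprintf(out, "%d\n", y);
            have_y = (fscanf(b, "%d", &y) == 1);
        }
    }
    /* Copy whatever is left in the run that is not yet exhausted. */
    while (have_x) {
        fprintf(out, "%d\n", x);
        have_x = (fscanf(a, "%d", &x) == 1);
    }
    while (have_y) {
        fprintf(out, "%d\n", y);
        have_y = (fscanf(b, "%d", &y) == 1);
    }
}
```

A caller would typically open each run with tmpfile() or fopen(), call write_sorted_run once per chunk read from the large file, and then call merge_runs repeatedly on pairs of runs until a single sorted file remains.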

a) BUFFERING

Explicit control of buffering is important in many applications,
including ones that need to deal with raw devices (such as disks), ones
which need instantaneous input from the user, or ones which are involved in
communication. Examples might be interactive multimedia applications, or
programs such as telnet. In the absence of such strict buffering semantics, it
can also be difficult to reason (even informally) about the contents of a file
following a series of interacting I/O operations.

Three kinds of buffering are supported: line-buffering, block-buffering and
no-buffering. These modes have the following effects. For output, items are
written out from the internal buffer according to the buffer mode:
• line-buffering: the entire buffer is written out whenever a new line is
  output, the buffer overflows, a flush is issued, or the handle is
  closed.
• block-buffering: the entire buffer is written out whenever it
  overflows, a flush is issued, or the handle is closed.
• no-buffering: output is written immediately, and never stored in the
  buffer.
The buffer is emptied as soon as it has been written out.
Similarly, input occurs according to the buffer mode for handle hdl:
• line-buffering: when the buffer for hdl is not empty, the next item is
  obtained from the buffer; otherwise, when the buffer is empty,
  characters up to and including the next newline character are read
  into the buffer. No characters are available until the newline
  character is available.
• block-buffering: when the buffer for hdl becomes empty, the next
  block of data is read into the buffer.
• no-buffering: the next input item is read and returned.
For most implementations, the physical files will normally be block-buffered
and terminals will normally be line-buffered.
The Haskell computation hSetBuffering hdl mode sets the mode of buffering
for handle hdl on subsequent reads and writes as follows:
• If mode is LineBuffering, then line-buffering is enabled if possible.
• If mode is BlockBuffering m, then block-buffering is enabled if
  possible. The size of the buffer is n items if m is Just n and is
  otherwise implementation-dependent.
• If mode is NoBuffering, then buffering is disabled if possible.
If the mode is changed from BlockBuffering or LineBuffering to
NoBuffering, then:
• if hdl is writable, the buffer is flushed as for hFlush;
• if hdl is not writable, the contents of the buffer are discarded.
The default buffering mode when a handle is opened is implementation-
dependent and may depend on the object which is attached to that handle.
The three buffer modes mirror those provided by ANSI C.
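
In C, the three modes are selected with the standard library function setvbuf and the constants _IOFBF (block), _IOLBF (line) and _IONBF (none). The sketch below wraps each choice in a small function. The wrapper names are hypothetical and introduced only for illustration, while setvbuf, BUFSIZ and the three constants come from <stdio.h>.

```c
#include <stdio.h>

/* Each wrapper returns 0 on success and nonzero on failure, exactly as
   setvbuf does. setvbuf should be called after the stream is opened
   and before any other operation is performed on it. */

/* Block-buffering: output is written when the buffer fills, on fflush,
   or when the stream is closed. Passing NULL lets the library allocate
   the buffer; `size` is a hint for how large it may be. */
int use_block_buffering(FILE *fp, size_t size)
{
    return setvbuf(fp, NULL, _IOFBF, size);
}

/* Line-buffering: output is also written whenever a newline is output. */
int use_line_buffering(FILE *fp)
{
    return setvbuf(fp, NULL, _IOLBF, BUFSIZ);
}

/* No-buffering: output is passed on as soon as possible; the buffer and
   size arguments are ignored in this mode. */
int use_no_buffering(FILE *fp)
{
    return setvbuf(fp, NULL, _IONBF, 0);
}
```

For example, a program that writes sorted runs to disk might call use_block_buffering(run, 65536) right after opening the run file, so that many small writes are grouped into fewer disk accesses. The buffer size 65536 is only an illustrative value.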

Let us sum up
In this unit, we discussed two sorting techniques, merge sort and
bucket sort. We also discussed their algorithms and sample implementations,
with an example program for each. For further reading, refer to the
suggested books listed below.

Check your progress


a) Fill in the blanks:
1. Merge sort is a sorting technique based on the __________.
2. Merging is the process of combining two or more sorted files into a
third sorted file called _________.
3. ________ distributes the list of elements across different buckets.
4. Any sorting algorithm that uses external memory, such as tape or
disk, during the sorting is called _______.
Glossary
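Line-buffering: the buffer is written out whenever a new line is output,
the buffer overflows, a flush is issued, or the handle is closed.
Block-buffering: the buffer is written out whenever it overflows, a flush
is issued, or the handle is closed.
No-buffering: output is written immediately and is never stored in the
buffer.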
Suggested readings
1. E Balagurusamy, “Data Structures”, McGraw Hill Education (India)
Private Limited, 2019.
2. Mark Allen Weiss, “Data Structures and Algorithm Analysis in C++”,
Third Edition, Pearson Education, 2007.
3. Reema Thareja, “Data Structures Using C”, Second Edition,
Oxford University Press, 2011.
4. Ellis Horowitz, Sartaj Sahni, Susan Anderson-Freed,
“Fundamentals of Data Structures in C”, Second Edition, University
Press, 2008.
Answers to check your progress

a) Fill in the blanks:
1. divide and conquer technique
2. two-way merge sort
3. Bucket sort
4. external sorting
