Unit-1
Sub.- Data Structure
Class- B.Sc.-IIIrd
Data Structures:
A data structure is a specialized format for organizing, processing, retrieving and storing
data.Data structures refer to data and representation of data objects within a program, that is,
the implementation of structured relationships. A data structure is a collection of atomic and
composite data types into a set with defined relationships.
The term Data Structure refers to the organization of data elements and the interrelationships
among them.
Data Structure is a way to store and organize data so that it can be used efficiently.
Introduction
Data Structure can be defined as the group of data elements which provides an efficient way of storing
and organizing data in the computer so that it can be used efficiently. Some examples of Data
Structures are arrays, Linked List, Stack, Queue, etc. Data Structures are widely used in almost every
aspect of Computer Science i.e. Operating System, Compiler Design, Artificial intelligence, Graphics
and many more.
Data Structures are the main part of many computer science algorithms as they enable the programmers
to handle the data in an efficient way. It plays a vital role in enhancing the performance of a software or
a program as the main function of the software is to store and retrieve the user's data as fast as possible.
Basic Terminology
Data structures are the building blocks of any program or the software. Choosing the appropriate data
structure for a program is the most difficult task for a programmer. Following terminology is used as
far as data structures are concerned.
Data: Data can be defined as an elementary value or the collection of values, for example,
student's name and its id are the data about the student.
Group Items: Data items which have subordinate data items are called Group item, for example,
name of a student can have first name and the last name.
Record: Record can be defined as the collection of various data items, for example, if we talk
about the student entity, then its name, address, course and marks can be grouped together to
form the record for the student.
File: A File is a collection of various records of one type of entity, for example, if there are 60
employees in the class, then there will be 20 records in the related file where each record
contains the data about each employee.
Attribute and Entity: An entity represents the class of certain objects. it contains various
attributes. Each attribute represents the particular property of that entity.
Field: Field is a single elementary unit of information representing the attribute of an entity.
Need of Data Structures
As applications are getting complex and amount of data is increasing day by day, there may arise
the following problems:
Processor speed: To handle very large amount of data, high speed processing is required, but as
the data is growing day by day to the billions of files per entity, processor may fail to deal with
that much amount of data.
Data Search: Consider an inventory size of 106 items in a store, If our application needs to
search for a particular item, it needs to traverse 106 items every time, results in slowing down the
search process.
Multiple requests: If thousands of users are searching the data simultaneously on a web server,
then there are the chances that a very large server can be failed during that process
in order to solve the above problems, data structures are used. Data is organized to form a data
structure in such a way that all items are not required to be searched and required data can be
searched instantly.
Advantages of Data Structures
Efficiency: Efficiency of a program depends upon the choice of data structures. For example:
suppose, we have some data and we need to perform the search for a particular record. In that
case, if we organize our data in an array, we will have to search sequentially element by element.
hence, using array may not be very efficient here. There are better data structures which can
make the search process efficient like ordered array, binary search tree or hash tables.
Reusability: Data structures are reusable, i.e. once we have implemented a particular data
structure, we can use it at any other place. Implementation of data structures can be compiled
into libraries which can be used by different clients.
Abstraction: Data structure is specified by the ADT which provides a level of abstraction. The
client program uses the data structure through interface only, without getting into the implementation
details.
Data Structure Classification
Linear Data Structures
If a data structure organizes the data in sequential order, then that data structure is called
a Linear Data Structure.
Example
1. Arrays
2. List (Linked List)
3. Stack
4. Queue
Types of Linear Data Structures are given below:
Arrays: An array is a collection of similar type of data items and each data item is called an
element of the array. The data type of the element may be any valid data type like char, int, float
or double.
The elements of array share the same variable name but each one carries a different index
number known as subscript. The array can be one dimensional, two dimensional or
multidimensional.
The individual elements of the array age are:
age[0], age[1], age[2], age[3],......... age[98], age[99].
Linked List: Linked list is a linear data structure which is used to maintain a list in the memory.
It can be seen as the collection of nodes stored at non-contiguous memory locations. Each node
of the list contains a pointer to its adjacent node.
Stack: Stack is a linear list in which insertion and deletions are allowed only at one end,
called top.
A stack is an abstract data type (ADT), can be implemented in most of the programming
languages. It is named as stack because it behaves like a real-world stack, for example: - piles of
plates or deck of cards etc.
Queue: Queue is a linear list in which elements can be inserted only at one end called rear and
deleted only at the other end called front.
It is an abstract data structure, similar to stack. Queue is opened at both end therefore it follows
First-In-First-Out (FIFO) methodology for storing the data items.
Non Linear Data Structures:
This data structure does not form a sequence i.e. each item or element is connected with two or
more other items in a non-linear arrangement. The data elements are not arranged in sequential
structure.
Non - Linear Data Structures
If a data structure organizes the data in random order, then that data structure is called as
Non-Linear Data Structure.
Example
1. Tree
2. Graph
3. Dictionaries
4. Heaps
5. Tries, Etc.,
Types of Non Linear Data Structures are given below:
Trees: Trees are multilevel data structures with a hierarchical relationship among its elements
known as nodes. The bottommost nodes in the herierchy are called leaf node while the topmost
node is called root node. Each node contains pointers to point adjacent nodes.
Tree data structure is based on the parent-child relationship among the nodes. Each node in the
tree can have more than one children except the leaf nodes whereas each node can have atmost
one parent except the root node. Trees can be classfied into many categories which will be
discussed later in this tutorial.
Graphs: Graphs can be defined as the pictorial representation of the set of elements (represented
by vertices) connected by the links known as edges. A graph is different from tree in the sense
that a graph can have cycle while the tree can not have the one.
Operations on data structure
1) Traversing: Every data structure contains the set of data elements. Traversing the data
structure means visiting each element of the data structure in order to perform some specific
operation like searching or sorting.
Example: If we need to calculate the average of the marks obtained by a student in 6 different
subject, we need to traverse the complete array of marks and calculate the total sum, then we will
devide that sum by the number of subjects i.e. 6, in order to find the average.
2) Insertion: Insertion can be defined as the process of adding the elements to the data structure
at any location.
If the size of data structure is n then we can only insert n-1 data elements into it.
3) Deletion:The process of removing an element from the data structure is called Deletion. We
can delete an element from the data structure at any random location.
If we try to delete an element from an empty data structure then underflow occurs.
4) Searching: The process of finding the location of an element within the data structure is
called Searching. There are two algorithms to perform searching, Linear Search and Binary
Search. We will discuss each one of them later in this tutorial.
5) Sorting: The process of arranging the data structure in a specific order is known as Sorting.
There are many algorithms that can be used to perform sorting, for example, insertion sort,
selection sort, bubble sort, etc.
6) Merging: When two lists List A and List B of size M and N respectively, of similar type of
elements, clubbed or joined to produce the third list, List C of size (M+N), then this process is
called merging.
Asymptotic Notations
Asymptotic notations are the mathematical notations used to describe the
running time of an algorithm when the input tends towards a particular
value or a limiting value.
There are mainly three asymptotic notations:
1. Big-O notation
2. Omega notation
3. Theta notation
Abstract Data Type:
An abstract data type, sometimes abbreviated ADT, is a logical description of how we view the data and
the operations that are allowed without regard to how they will be implemented. This means that we are
concerned only with what data is representing and not with how it will eventually be constructed. By
providing this level of abstraction, we are creating an encapsulation around the data. The idea is that by
encapsulating the details of the implementation, we are hiding them from the user’s view. This is called
information hiding. The implementation of an abstract data type, often referred to as a data structure, will
require that we provide a physical view of the data using some collection of programming constructs and
primitive data types.
Abstract Data Types (ADT)
In this, we will learn about ADT but before understanding what ADT is let us consider different inbuilt
data types that are provided to us. Data types such as int, float, double, long, etc. are considered to be
in-built data types and we can perform basic operations with them such as addition, subtraction,
division, multiplication, etc. Now there might be a situation when we need operations for our
userdefined data type which have to be defined. These operations can be defined only as and when we
require them. So, in order to simplify the process of solving problems, we can create data structures
along with their operations, and such data structures that are not in-built are known as Abstract Data
Type (ADT).
Abstract Data type (ADT) is a type (or class) for objects whose behavior is defined by a set of values
and a set of operations. The definition of ADT only mentions what operations are to be performed but
not how these operations will be implemented. It does not specify how data will be organized in
memory and what algorithms will be used for implementing the operations. It is called “abstract”
because it gives an implementation-independent view.
The process of providing only the essentials and hiding the details is known as abstraction.
The user of data type does not need to know how that data type is implemented, for example, we have
been using Primitive values like int, float, char data types only with the knowledge that these data type
can operate and be performed on without any idea of how they are implemented. So, a user only needs
to know what a data type can do, but not how it will be implemented. Think of ADT as a black box
which hides the inner structure and design of the data type. Now we’ll define three ADTs namely List
ADT, Stack ADT, Queue ADT.
1. List ADT
The data is generally stored in key sequence in a list which has a head structure consisting of count,
pointers and address of compare function needed to compare the data in the list.
The data node contains the pointer to a data structure and a self-referential pointer which points to
the next node in the list.
The List ADT Functions is given below:
o get() – Return an element from the list at any given position.
o insert() – Insert an element at any position of the list.
o remove() – Remove the first occurrence of any element from a non-empty list.
o removeAt() – Remove the element at a specified location from a non-empty list.
o replace() – Replace an element at any position by another element.
o size() – Return the number of elements in the list.
o isEmpty() – Return true if the list is empty, otherwise return false.
o isFull() – Return true if the list is full, otherwise return false.
2. Stack ADT
In Stack ADT Implementation instead of data being stored in each node, the pointer to data is
stored.
o The program allocates memory for the data and address is passed to the stack ADT.
o The head node and the data nodes are encapsulated in the ADT. The calling function can only
o see the pointer to the stack.
o The stack head structure also contains a pointer to top and count of number of entries
o currently in stack.
o push() – Insert an element at one end of the stack called top.
o pop() – Remove and return the element at the top of the stack, if it is not empty.
o peek() – Return the element at the top of the stack without removing it, if the stack is not
o empty.
o size() – Return the number of elements in the stack.
o isEmpty() – Return true if the stack is empty, otherwise return false.
o isFull() – Return true if the stack is full, otherwise return false
3. Queue ADT
The queue abstract data type (ADT) follows the basic design of the stack abstract data type.
Each node contains a void pointer to the data and the link pointer to the next element in the
queue. The program’s responsibility is to allocate memory for storing the data.
enqueue() – Insert an element at the end of the queue.
dequeue() – Remove and return the first element of the queue, if the queue is not empty.
peek() – Return the element of the queue without removing it, if the queue is not empty.
size() – Return the number of elements in the queue.
isEmpty() – Return true if the queue is empty, otherwise return false.
isFull() – Return true if the queue is full, otherwise return false.
Features of ADT:
Abstraction: The user does not need to know the implementation of the data structure.
Better Conceptualization: ADT gives us a better conceptualization of the real world.
Robust: The program is robust and has the ability to catch errors.
From these definitions, we can clearly see that the definitions do not specify how these ADTs will be
represented and how the operations will be carried out. There can be different ways to implement an
ADT, for example, the List ADT can be implemented using arrays, or singly linked list or doubly
linked list. Similarly, stack ADT and Queue ADT can be implemented using arrays or linked lists.
ARRAY: