S. Y. B. Tech (Computer Engineering)
Academic Year – 2024-2025 Semester –III
[CS2203T]: Data structures
Teaching Scheme: TH: 3 Hours/Week
Credit: TH: 3
Examination Scheme:
In Sem. Evaluation : 20 Marks
Mid Sem. Exam : 30 Marks
End Sem. Exam : 50 Marks
Total : 100 Marks
Course Prerequisites : Fundamentals of Computer Programming [CS1203T], Object
Oriented Programming [CS1205T]
Course Objective:
· To understand the memory requirements of various data structures and abstract data
representation methods.
· To assess how the choice of data structures and algorithm design methods
impacts the performance of programs.
· To choose the appropriate data structure and algorithm design method for a specified
application.
· To solve problems using data structures such as linear lists, stacks, queues, binary
trees, binary search trees, graphs, and hash tables, and to write programs for these
solutions.
Course Contents
UNIT-I Introduction to Algorithm and Analysis of Algorithms (6 Hours)
Concept of Problem Solving, Introduction to Algorithms,
Characteristics of Algorithms, Introduction to Data Structure, Data
Structure Classification (Linear and Non-linear, Static and Dynamic,
Persistent and Ephemeral data structures), Time complexity and Space
complexity, Asymptotic Notation - The Big-O, Omega and Theta
notation, Algorithmic upper bounds, lower bounds, Best, Worst and
Average case analysis of an Algorithm, Abstract Data Types (ADT).
Problem Solving
• Problem Solving is a step-by-step process of identifying an
issue, analyzing it, and finding the most effective solution. It is a
fundamental skill in computer science, mathematics,
engineering, and real life.
Why is Problem Solving Important?
• Enhances critical thinking
• Helps in decision making
• Encourages logical reasoning
• Used in coding, daily life, and technical fields
• Essential for interview preparation and competitive exams
Steps in Problem Solving
What is an Algorithm?
• An algorithm is a finite set of instructions designed to perform a
specific task or solve a particular problem.
Categories of Algorithm
• Sequence
• Selection
• Iteration
Sequence Algorithm
• The steps described in an algorithm are performed successively one by one without
skipping any step.
• The sequence of steps defined in an algorithm should be simple and easy to understand.
• Each instruction of such an algorithm is executed, because no selection procedure or
conditional branching exists in a sequence algorithm
Example: Adding two numbers
• Step 1: start
• Step 2: read a,b
• Step 3: Sum=a+b
• Step 4: write Sum
• Step 5: stop
Selection Algorithm
• Sequence algorithms are not sufficient for problems that involve decisions and
conditions. To solve problems that involve decision making or choosing between
options, we use a Selection algorithm.
Example:
if(condition) Statement-1; else Statement-2;
• The above syntax specifies that if the condition is true, Statement-1 is executed;
otherwise Statement-2 is executed. If the operation is unsuccessful, the sequence of
steps should be changed or corrected so that it is re-executed until the operation
succeeds.
Iteration
• Iteration algorithms are used to solve problems that involve repeating a group of
statements. In this type of algorithm, a particular set of statements is repeated a number
of times, either a fixed count 'n' or for as long as a condition holds.
Example : Sum of the digits of a number
• Step 1 : start
• Step 2 : read n
• Step 3 : set s=0
• Step 4 : repeat step 5 while n>0
• Step 5 : (a) r=n mod 10
(b) s=s+r
(c) n=n/10 (integer division)
• Step 6 : write s
• Step 7 : stop
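The digit-summing steps above can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: the function name `digit_sum` is hypothetical, the running total is assumed to start at 0, and the loop is assumed to run while digits remain (n > 0).

```python
def digit_sum(n):
    """Sum the decimal digits of a non-negative integer."""
    s = 0                # running total, assumed to start at zero
    while n > 0:         # repeat while there are digits left
        r = n % 10       # (a) take the last digit
        s = s + r        # (b) add it to the total
        n = n // 10      # (c) drop the last digit (integer division)
    return s


print(digit_sum(1234))   # 1 + 2 + 3 + 4 = 10
```

The loop runs once per digit, so the number of repetitions depends on the input rather than on a fixed count.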
Characteristics of Good Algorithm
No. Characteristic: Description
1 Input: The algorithm should accept 0 or more inputs. These are the values provided externally to produce the output.
2 Output: It must produce at least one result (output). The result should be the solution to the given problem.
3 Definiteness: All steps should be clear, precise, and unambiguous. There should be no confusion in interpreting the steps.
4 Finiteness: The algorithm must terminate after a finite number of steps. It should not go into an infinite loop.
5 Effectiveness: Each instruction must be simple enough to be executed with basic operations in a limited time.
6 Generality: The algorithm should be applicable to a set of similar problems, not just one specific case.
What is a Flowchart?
• A flowchart is a diagram that visually represents a process or
workflow, using symbols and arrows to illustrate the steps and their
order
Difference between Algorithm and Flowchart
Algorithm:
• It is a procedure for solving problems.
• The process is shown as step-by-step instructions.
• It is comparatively complex and harder to understand.
• It is convenient to debug errors.
• The solution is expressed in natural language.
• It is somewhat easier to solve complex problems.
• It takes more time to create an algorithm.
Flowchart:
• It is a graphic representation of a process.
• The process is shown as a block-by-block diagram.
• It is intuitive and easy to understand.
• It is hard to debug errors.
• The solution is expressed in pictorial format.
• It is hard to solve complex problems.
• It takes less time to create a flowchart.
Example : Print 1 to 20
Algorithm:
• Step 1: Initialize X as 0
• Step 2: Increment X by 1
• Step 3: Print X
• Step 4: If X is less than 20 then go back to step 2
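As a rough sketch, the four steps of this algorithm map onto a loop whose test comes at the end of each pass. The function name `count_up` and the `limit` parameter are illustrative assumptions; the returned list exists only so the result can be checked.

```python
def count_up(limit=20):
    """Print 1..limit, following the four steps of the algorithm."""
    x = 0                      # Step 1: initialize X as 0
    printed = []               # record of what was printed (for checking)
    while True:
        x += 1                 # Step 2: increment X by 1
        print(x)               # Step 3: print X
        printed.append(x)
        if not x < limit:      # Step 4: go back to step 2 only if X < limit
            break
    return printed


count_up()
```

Because the condition is tested after printing, the body always runs at least once, which matches the step order in the algorithm.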
Flowchart
What is Data Structure
• A data structure is a way of organizing and storing data so that it can
be accessed and modified efficiently.
Classification of Data Structures
• Linear
• Non-Linear
• Static
• Dynamic
• Persistent
• Ephemeral
Linear And Non-Linear
Linear Data Structures
• Elements are stored in a linear (sequential) order.
• Easy to traverse and manage.
Examples:
• Array – Fixed size, fast access.
• Linked List – Elements linked with pointers.
• Stack – LIFO (Last In First Out).
• Queue – FIFO (First In First Out).
Non-Linear Data Structures
• Data is organized in hierarchical or graph-like manner.
• Useful for complex relationships.
Examples:
• Tree – Hierarchical structure.
• Graph – Set of nodes (vertices) and connections (edges).
• Heap – Special tree used in sorting and priority queues.
• Trie – Tree used for searching strings (e.g., dictionaries).
Static and Dynamic
Static Data Structures
• Size is fixed at compile time.
• Example: Array
Dynamic Data Structures
• Size can change at runtime.
• Example: Linked List; a Stack or Queue is also dynamic when it is built on a linked list
Persistent and Ephemeral Data Structures
Ephemeral Data Structures
• These are the normal, traditional data structures.
• When you update them, the old version is lost.
• Example: If you change a value in a list, the original value is overwritten.
• They are destructive in nature.
• Example in daily life: Editing a Word document and saving it — the old content is gone unless you undo it.
Persistent Data Structures
• These preserve previous versions after updates.
• You can access past states at any time.
• Mostly used in functional programming and version control systems.
• Example in daily life: Google Docs with version history — you can go back to older versions even after making
changes.
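The difference can be illustrated with a small sketch. The in-place list update is ephemeral; the stack built from nested tuples is one simple way to get persistence, chosen here for illustration only. The names `push`, `to_list`, and `EMPTY` are hypothetical.

```python
# Ephemeral: the update overwrites the old value in place.
marks = [70, 80, 90]
marks[1] = 85                  # the original value 80 is no longer available

# Persistent: a stack stored as nested tuples (top, rest).
EMPTY = None


def push(stack, value):
    """Return a NEW stack version; the old version is left unchanged."""
    return (value, stack)      # the new version shares the old one as its tail


def to_list(stack):
    """List the elements of one stack version, top first."""
    out = []
    while stack is not None:
        value, stack = stack
        out.append(value)
    return out


v1 = push(EMPTY, 1)
v2 = push(v1, 2)
v3 = push(v2, 3)

print(to_list(v3))             # newest version: [3, 2, 1]
print(to_list(v2))             # older version is still intact: [2, 1]
```

Each version reuses the previous one instead of copying it, which is why keeping every version stays cheap in this sketch.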
Asymptotic Notation
• Asymptotic notations are the mathematical notations used to describe the running time of an algorithm
when the input tends towards a particular value or a limiting value.
• For example: In bubble sort, when the input array is already sorted, the time taken by the algorithm is linear
i.e. the best case.
• But, when the input array is in reverse condition, the algorithm takes the maximum time (quadratic) to sort
the elements i.e. the worst case.
• When the input array is neither sorted nor in reverse order, then it takes average time. These durations are
denoted using asymptotic notations.
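The bubble sort cases can be checked by counting comparisons. The sketch below assumes the common variant with an early-exit flag (stop when a full pass makes no swap); without that flag the best case would not be linear. The name `bubble_sort` and the returned comparison count are illustrative choices.

```python
def bubble_sort(values):
    """Return (sorted copy, number of comparisons made)."""
    a = list(values)
    comparisons = 0
    for end in range(len(a) - 1, 0, -1):
        swapped = False
        for i in range(end):
            comparisons += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:        # a pass with no swaps means the list is sorted
            break
    return a, comparisons


print(bubble_sort([1, 2, 3, 4, 5]))   # best case: n - 1 = 4 comparisons
print(bubble_sort([5, 4, 3, 2, 1]))   # worst case: n(n - 1)/2 = 10 comparisons
```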
Big-O Notation (O-notation)
O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0 }
Omega Notation (Ω-notation)
Ω(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0 }
Theta Notation (Θ-notation)
Θ(g(n)) = { f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0 }
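A short worked example may help connect the three definitions. The function f(n) = 3n + 2 is chosen here purely for illustration; the constants shown are one valid choice among many.

```latex
\begin{align*}
f(n) &= 3n + 2 \\
3n + 2 &\le 3n + n = 4n && \text{for all } n \ge 2
  \quad\Rightarrow\quad f(n) \in O(n),\ c = 4,\ n_0 = 2 \\
3n + 2 &\ge 3n && \text{for all } n \ge 1
  \quad\Rightarrow\quad f(n) \in \Omega(n),\ c = 3,\ n_0 = 1 \\
3n \le 3n + 2 &\le 4n && \text{for all } n \ge 2
  \quad\Rightarrow\quad f(n) \in \Theta(n),\ c_1 = 3,\ c_2 = 4,\ n_0 = 2
\end{align*}
```

Since f(n) is bounded both above and below by constant multiples of n, it is Θ(n).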
Worst Case Analysis
• In the worst-case analysis, we calculate the upper bound on the running time of an algorithm. We must know the
case that causes a maximum number of operations to be executed.
• For Linear Search, the worst case happens when the element to be searched (x) is not present in the array.
When x is not present, the search() function compares it with all the elements of arr[] one by one.
• This is the most commonly used analysis of algorithms, because a worst-case bound is a guarantee that holds
for every input. Most of the time we consider the case that causes maximum operations.
Best Case Analysis
• In the best-case analysis, we calculate the lower bound on the running time of an algorithm. We must know the
case that causes a minimum number of operations to be executed.
• For linear search, the best case occurs when x is present at the first location. The number of operations in the
best case is constant (not dependent on n). So the order of growth of time taken in terms of input size is
constant.
Average Case Analysis
• In average case analysis, we take all possible inputs and calculate the computing time for all of the
inputs. Sum all the calculated values and divide the sum by the total number of inputs.
• We must know (or predict) the distribution of cases. For the linear search problem, let us assume
that all cases are uniformly distributed. So we sum the costs of all the cases and divide the sum by
(n+1). We take (n+1) to include the case when the element is not present.
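The three analyses of linear search can be made concrete by counting comparisons. This is a sketch under the uniform-distribution assumption stated above; `linear_search` and `average_comparisons` are illustrative names, and the helper builds its own test array.

```python
def linear_search(arr, x):
    """Return (index of x or -1, number of comparisons made)."""
    comparisons = 0
    for i, item in enumerate(arr):
        comparisons += 1
        if item == x:
            return i, comparisons
    return -1, comparisons


def average_comparisons(n):
    """Average cost over n 'present' cases plus 1 'absent' case."""
    arr = list(range(n))                 # elements 0 .. n-1
    cases = arr + [-1]                   # -1 is never in arr
    total = sum(linear_search(arr, x)[1] for x in cases)
    return total / (n + 1)


data = [7, 3, 9, 4]
print(linear_search(data, 7))    # best case: found at the first position
print(linear_search(data, 5))    # worst case: absent, every element compared
print(average_comparisons(4))    # (1 + 2 + 3 + 4 + 4) / 5 = 2.8
```

The average grows in proportion to n, so the average case is linear just like the worst case, although with a smaller constant.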
Abstract Data Type
• An Abstract Data Type (ADT) is a model or logical description of how data is organized and
what operations are allowed on it — without specifying how it is implemented.
• It tells what operations can be done, not how they are done in memory.
Features Of ADT
• Abstraction: The user does not need to know the implementation of the data structure; only the essentials are provided.
• Better Conceptualization: ADT gives us a better conceptualization of the real world.
• Robust: The program is robust and has the ability to catch errors.
• Encapsulation: ADTs hide the internal details of the data and provide a public interface for users to interact with the data.
• Data Abstraction: ADTs provide a level of abstraction from the implementation details of the data. Users only need to know the
operations that can be performed on the data, not how those operations are implemented.
• Data Structure Independence: ADTs can be implemented using different data structures, such as arrays or linked lists, without
affecting the functionality of the ADT.
• Information Hiding: ADTs can protect the integrity of the data by allowing access only to authorized users and operations. This
helps prevent errors and misuse of the data.
• Modularity: ADTs can be combined with other ADTs to form larger, more complex data structures. This allows for greater
flexibility and modularity in programming.
Why use ADT
• Encapsulation: Hides complex implementation details behind a
clean interface.
• Reusability: Allows different internal implementations (e.g.,
array or linked list) without changing external usage.
• Modularity: Simplifies maintenance and updates by separating
logic.
• Security: Protects data by preventing direct access, minimizing
bugs and unintended changes.
List ADT
The List ADT (Abstract Data Type) is a
sequential collection of elements that
supports a set of operations without
specifying the internal implementation. It
provides an ordered way to store, access, and
modify data.
Operations on ListADT
• get(): Return an element from the list at any given position.
• insert(): Insert an element at any position in the list.
• remove(): Remove the first occurrence of any element from a non-empty list.
• removeAt(): Remove the element at a specified location from a non-empty list.
• replace(): Replace an element at any position with another element.
• size(): Return the number of elements in the list.
• isEmpty(): Return true if the list is empty; otherwise, return false.
• isFull(): Return true if the list is full, otherwise, return false. Only applicable in
fixed-size implementations (e.g., array-based lists).
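One possible fixed-size, array-based implementation of these operations is sketched below. It is an illustration, not the only way to realize the List ADT: the class name `ArrayList`, the default capacity, the snake_case method names (`remove_at`, `is_empty`, `is_full`), and the choice to raise exceptions on invalid use are all assumptions.

```python
class ArrayList:
    """Fixed-capacity list backed by a pre-allocated array."""

    def __init__(self, capacity=10):
        self._items = [None] * capacity
        self._count = 0

    def size(self):
        return self._count

    def is_empty(self):
        return self._count == 0

    def is_full(self):
        return self._count == len(self._items)

    def _check(self, index):
        if not 0 <= index < self._count:
            raise IndexError("index out of range")

    def get(self, index):
        self._check(index)
        return self._items[index]

    def insert(self, index, value):
        if self.is_full():
            raise OverflowError("list is full")
        if not 0 <= index <= self._count:
            raise IndexError("index out of range")
        for i in range(self._count, index, -1):   # shift right to open a gap
            self._items[i] = self._items[i - 1]
        self._items[index] = value
        self._count += 1

    def remove_at(self, index):
        self._check(index)
        value = self._items[index]
        for i in range(index, self._count - 1):   # shift left to close the gap
            self._items[i] = self._items[i + 1]
        self._count -= 1
        self._items[self._count] = None
        return value

    def remove(self, value):
        """Remove the first occurrence; return True if something was removed."""
        for i in range(self._count):
            if self._items[i] == value:
                self.remove_at(i)
                return True
        return False

    def replace(self, index, value):
        self._check(index)
        self._items[index] = value
```

Code that uses only these methods would keep working if the array were swapped for a linked list, which is the point of specifying the ADT separately from its implementation.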
Stack ADT
A stack is an ordered list, or container, in
which insertion and deletion are done at
one end only, known as the top of the stack.
The last inserted element is available first
and is the first one to be deleted. Hence, it
is known as a Last In, First Out (LIFO) or
First In, Last Out (FILO) structure.
Operations on StackADT
• push(): Insert an element at one end of the stack called the top.
• pop(): Remove and return the element at the top of the stack, if it is
not empty.
• peek(): Return the element at the top of the stack without removing
it, if the stack is not empty.
• size(): Return the number of elements in the stack.
• isEmpty(): Return true if the stack is empty; otherwise, return false.
• isFull(): Return true if the stack is full; otherwise, return false.
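A minimal sketch of these stack operations follows, using a Python list with its end treated as the top. The class name `ArrayStack`, the optional `capacity` argument, and raising an exception on an empty or full stack are assumptions made for illustration.

```python
class ArrayStack:
    """LIFO stack; the end of the internal list is the top."""

    def __init__(self, capacity=None):
        self._items = []
        self._capacity = capacity      # None means no fixed limit

    def push(self, value):
        if self.is_full():
            raise OverflowError("stack is full")
        self._items.append(value)

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def peek(self):
        if self.is_empty():
            raise IndexError("peek at empty stack")
        return self._items[-1]

    def size(self):
        return len(self._items)

    def is_empty(self):
        return len(self._items) == 0

    def is_full(self):
        return self._capacity is not None and len(self._items) >= self._capacity


s = ArrayStack(capacity=3)
for value in (1, 2, 3):
    s.push(value)
print(s.pop(), s.pop())   # 3 2 (last in, first out)
```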
Queue ADT
The Queue ADT is a linear data
structure that follows the FIFO (First
In, First Out) principle. It allows
elements to be inserted at one end
(rear) and removed from the other
end (front).
Operations on QueueADT
• enqueue(): Insert an element at the end of the queue.
• dequeue(): Remove and return the first element of the queue, if the
queue is not empty.
• peek(): Return the element at the front of the queue without removing
it, if the queue is not empty.
• size(): Return the number of elements in the queue.
• isEmpty(): Return true if the queue is empty; otherwise, return false.
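The queue operations can be sketched on top of `collections.deque` from the Python standard library, which supports efficient insertion and removal at both ends. The class name `Queue` and the choice to raise `IndexError` on an empty queue are assumptions.

```python
from collections import deque


class Queue:
    """FIFO queue: insert at the rear, remove from the front."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, value):
        self._items.append(value)          # rear of the queue

    def dequeue(self):
        if self.is_empty():
            raise IndexError("dequeue from empty queue")
        return self._items.popleft()       # front of the queue

    def peek(self):
        if self.is_empty():
            raise IndexError("peek at empty queue")
        return self._items[0]

    def size(self):
        return len(self._items)

    def is_empty(self):
        return len(self._items) == 0


q = Queue()
for value in ("a", "b", "c"):
    q.enqueue(value)
print(q.dequeue(), q.dequeue())   # a b (first in, first out)
```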
Advantages
• Encapsulation: ADTs provide a way to encapsulate data and operations into a single unit, making it easier to manage and
modify the data structure.
• Abstraction: ADTs allow users to work with data structures without having to know the implementation details, which can
simplify programming and reduce errors.
• Data Structure Independence: ADTs can be implemented using different data structures, which can make it easier to adapt to
changing needs and requirements.
• Information Hiding: ADTs can protect the integrity of data by controlling access and preventing unauthorized modifications.
• Modularity: ADTs can be combined with other ADTs to form more complex data structures, which can increase flexibility and
modularity in programming.
Disadvantages
• Overhead: Implementing ADTs can add overhead in terms of memory and processing, which can affect performance.
• Complexity: ADTs can be complex to implement, especially for large and complex data structures.
• Learning Curve: Using ADTs requires knowledge of their implementation and usage, which can take time and effort to learn.
• Limited Flexibility: Some ADTs may be limited in their functionality or may not be suitable for all types of data structures.
• Cost: Implementing ADTs may require additional resources and investment, which can increase the cost of development.
Programming Style
• Clarity and simplicity of Expression
• Naming
• Control Constructs
• Information hiding
• Nesting
• User-defined types
• Module size
• Module Interface
• Side-effects
Clarity and Simplicity of Expression
• Code should express its purpose clearly
• Avoid complex logic when simple alternatives exist
• Break complex expressions into smaller steps
Naming
• Use meaningful names for variables, functions, and classes
• Good naming makes the code self-documenting
• Example: int totalMarks instead of int x
Control Constructs
• Use appropriate control structures like if-else, loops, switch
• Avoid overusing nested structures
• Prefer switch over multiple if-else where applicable
Information Hiding
• Keep internal module details private
• Expose only necessary functionality
• Promotes encapsulation and modularity
Nesting
• Avoid deep nesting of loops and conditions
• Refactor or use early return to reduce nesting
• Improves code readability
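The effect of early returns on nesting can be seen by writing the same check two ways. Both functions below are hypothetical examples; the pass mark of 40 and the 0 to 100 range are assumptions made only for the illustration.

```python
def grade_nested(marks):
    """Deeply nested version: the main logic sits three levels in."""
    if marks is not None:
        if 0 <= marks <= 100:
            if marks >= 40:
                return "pass"
            else:
                return "fail"
        else:
            return "invalid"
    else:
        return "invalid"


def grade_flat(marks):
    """Early-return version: invalid cases are handled first, then the logic."""
    if marks is None:
        return "invalid"
    if not 0 <= marks <= 100:
        return "invalid"
    if marks >= 40:
        return "pass"
    return "fail"
```

Both functions return the same result for every input; the second simply keeps each decision at one level of indentation.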
User-defined Types
• Use classes/structures to represent real-world entities
• Group related data into objects
• Improves code organization and reusability
Module Size
• Each module should perform one specific task
• Split large functions into smaller helper methods
• Rule of thumb: keep functions to about 20 to 30 lines
Module Interface
• Keep inter-module communication simple
• Avoid passing too many arguments
• Use objects to group related data
Side Effects
• Avoid unexpected changes to global variables/files
• Pure functions are more reliable and testable
• Functions should only do what they are meant to do
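A small sketch of the difference between a function with a side effect and a pure function is given below. The names `total_marks`, `add_marks_impure`, and `add_marks_pure` are hypothetical.

```python
total_marks = 0   # global state


def add_marks_impure(marks):
    """Side effect: silently changes the global total_marks."""
    global total_marks
    total_marks += marks
    return total_marks


def add_marks_pure(total, marks):
    """Pure: the result depends only on the arguments; nothing else changes."""
    return total + marks


first = add_marks_impure(10)
second = add_marks_impure(10)          # same argument, different result
print(first, second)                   # 10 20

print(add_marks_pure(0, 10), add_marks_pure(0, 10))   # 10 10
```

The impure version gives different answers for the same argument because it depends on hidden state, which is what makes it harder to test.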
Refinement of Coding
• Stepwise refinement of coding: design a problem solution by
1. stating the solution at a high level
2. refining the steps of the solution into simpler steps
3. repeating step 2 until the steps are simple enough to execute
• Decompose based on the function of each step
• Makes heavy use of pseudocode
Testing
• Software testing can be divided into two steps:
1. Verification: it refers to the set of tasks that ensure that software
correctly implements a specific function.
2. Validation: it refers to a different set of tasks that ensure that the
software that has been built is traceable to customer requirements.
Types of software testing
1. Manual Testing: Manual testing means testing software by hand,
i.e., without using any automated tool or script. In this type, the tester
takes over the role of an end-user and tests the software to identify any
unexpected behaviour or bug. Testers use test plans, test cases, or test
scenarios to ensure the completeness of testing.
2. Automation Testing: Automation testing, which is also known as Test
Automation, is when the tester writes scripts and uses other software to
test the product. This process involves automating a manual process.
Automation testing is used to re-run, quickly and repeatedly, the test
scenarios that were previously performed manually.
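As a small illustration of automation testing, the sketch below uses Python's standard `unittest` module to check a function repeatedly without manual effort. The function `add` and the test class `AddTests` are hypothetical examples, not part of any particular product.

```python
import unittest


def add(a, b):
    """Function under test: return the sum of two numbers."""
    return a + b


class AddTests(unittest.TestCase):
    def test_positive_numbers(self):
        self.assertEqual(add(2, 3), 5)

    def test_negative_numbers(self):
        self.assertEqual(add(-2, -3), -5)


# Load and run the test cases programmatically; the same script can be
# re-run after every code change.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

Once written, the script replaces the manual step of entering inputs and inspecting outputs by hand.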