Data Structure-1 Bca 3rd
UNIT-I
Introduction: Elementary data organization, Data Structure definition, Data type vs. data
structure, Categories of data structures, Data structure operations, Applications of data
structures, Algorithms complexity and time-space tradeoff, Big-O notation.
UNIT-II
Arrays: Introduction, Linear arrays, Representation of linear array in memory, address
calculations, Traversal, Insertions, Deletion in an array, Multidimensional arrays, Parallel
arrays, Sparse arrays.
Linked List: Introduction, Array vs. linked list, Representation of linked lists in memory,
Traversal, Insertion, Deletion, Searching in a linked list, Header linked list, Circular linked
list, Two-way linked list, Threaded lists, Garbage collection, Applications of linked lists
UNIT-III
Stack: Introduction, Array and linked representation of stacks, Operations on stacks,
Applications of stacks: Polish notation, Recursion.
UNIT-IV
Tree: Introduction, Definition, Representing Binary tree in memory, Traversing binary trees,
Traversal algorithms using stacks.
Elementary data organization refers to the fundamental ways in which data can be structured
and represented in computer systems.
It involves the basic building blocks and techniques used to store and manipulate data
efficiently.
Key Concepts:
Data Representation:
Data is represented inside the computer in binary form (bits and bytes), using encodings for
numbers, characters, and other types.
Data Storage:
● Main Memory: Main memory (RAM) is used to store data that is currently being
accessed by the CPU.
● Secondary Storage: Devices like hard drives and SSDs are used to store data
persistently.
Data Manipulation:
Data manipulation covers the operations performed on stored data, such as insertion,
deletion, traversal, and searching, carried out through data structures.
1. Linear Data Structures
Examples:
● Arrays: A collection of elements of the same data type stored in contiguous memory
locations.
● Linked Lists: A sequence of nodes, each containing a data element and a pointer to
the next node.
○ Singly Linked Lists: Nodes point to the next node.
○ Doubly Linked Lists: Nodes point to both the previous and next nodes.
○ Circular Linked Lists: Last node points back to the first node.
● Stacks: A LIFO (Last-In-First-Out) data structure where elements are inserted and
deleted from the same end.
● Queues: A FIFO (First-In-First-Out) data structure where elements are inserted at one
end and deleted from the other end.
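The LIFO/FIFO contrast above can be sketched with the standard C++ container adaptors (a minimal illustration; the function names are ours, not from the text):

```cpp
#include <queue>
#include <stack>

// Push 1, 2, 3 and report which element would come out first.
int stackFirstOut() {
    std::stack<int> s;              // LIFO: insert and delete at the same end (the top)
    int vals[] = {1, 2, 3};
    for (int x : vals) s.push(x);
    return s.top();                 // the last element pushed comes out first
}

int queueFirstOut() {
    std::queue<int> q;              // FIFO: insert at the rear, delete from the front
    int vals[] = {1, 2, 3};
    for (int x : vals) q.push(x);
    return q.front();               // the first element enqueued comes out first
}
```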
2. Non-Linear Data Structures
Examples: Trees and Graphs, where elements are not arranged in a single sequence.
1. Homogeneous and Non-Homogeneous Data Structures:
Homogeneous Data Structure:
Definition:
A homogeneous data structure is one in which all the elements are of the same type or data
category.
Examples:
Array:
Stack:
Queue:
Linked List:
In programming languages, lists are often homogeneous, meaning all elements are of the
same type.
Non-Homogeneous Data Structure:
Definition:
A non-homogeneous data structure is one where elements can be of different types, or the
structure allows for various data types to be combined.
Examples:
A structure allows you to group different types of data under one name.
struct Student {
    char name[50];
    int age;
    float marks;
};
A class is an advanced form of structure that allows you to bundle data and functions
together.
class Employee {
public:
    string name;
    int id;
    float salary;
};
Linked List:
Linked lists can store elements of different types, especially when pointers to different types
of data are used in the nodes.
struct Node {
    int data;
    Node* next;
};
Tree:
A tree data structure often allows nodes to store elements of various types, particularly in
object-oriented implementations.
2. PRIMITIVE AND NON-PRIMITIVE DATA STRUCTURE:
Primitive Data Structures:
Primitive data structures are the simplest form of data structures and usually operate at a
low level. They directly store data values and have predefined data types.
1. Integer (int):
○ Stores numerical values without decimal points.
○ Example: int a = 10;
2. Float (float):
○ Stores decimal numbers.
○ Example: float pi = 3.14;
3. Character (char):
○ Stores a single character.
○ Example: char letter = 'A';
4. Boolean (bool):
○ Stores either true or false.
○ Example: bool isPassed = true;
5. Pointers (in C/C++):
○ Stores the memory address of another variable.
○ Example: int *ptr = &a;
Non-Primitive Data Structures:
They are more complex and store multiple values, often combining different data types.
Examples:
Array:
Linked List:
struct Node {
    int data;
    Node* next;
};
Stack:
Queue:
Tree:
struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;
};
Graph:
struct GraphNode {
    int vertex;
    GraphNode* next;
};
3. Static and Dynamic Data Structures:
Definition:
● Static data structures have a fixed size, meaning the amount of memory allocated for
storing data is determined at compile time. They do not grow or shrink during
program execution.
Key Features:
1. Fixed Memory Size: Memory allocation is done before the program runs.
2. Fast Access: Since memory is contiguous, accessing elements is quick.
3. Less Flexible: The size of the structure cannot be changed at runtime.
Examples:
Array:
Static Stack:
Static Queue:
Dynamic Data Structures:
Definition:
● Dynamic data structures can change size during runtime, meaning memory is
allocated and deallocated as needed.
● This allows for more efficient use of memory when dealing with varying amounts of
data.
Key Features:
1. Flexible Size: Memory is allocated and freed at runtime, so the structure can grow or
shrink as needed.
2. Efficient Memory Use: Only as much memory as is currently needed is occupied.
3. Slower Access: Elements are linked by pointers, so access is not as fast as indexed,
contiguous storage.
Examples:
1. Linked List:
struct Node {
    int data;
    Node* next;
};
2. Dynamic Stack:
struct Node {
    int data;
    Node* next;
};
3. Dynamic Queue:
struct Node {
    int data;
    Node* next;
};
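Using a Node structure like the ones above, a dynamic structure grows at runtime by allocating nodes with new (a minimal sketch; insertFront and countNodes are illustrative helper names, not from the text):

```cpp
struct Node {
    int data;
    Node* next;
};

// Allocate a new node at runtime and link it in at the front of the list.
Node* insertFront(Node* head, int value) {
    Node* n = new Node{value, head};  // memory is allocated as needed, not at compile time
    return n;
}

// Count the nodes by following the pointers until the end of the list.
int countNodes(const Node* head) {
    int count = 0;
    for (const Node* p = head; p != nullptr; p = p->next) count++;
    return count;
}
```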
Data Structure Operations
Basic operations on data structures include traversing, insertion, deletion, searching, and
sorting. Additional, structure-specific operations include the following:
● Arrays:
○ Accessing elements by index
○ Inserting and deleting elements
○ Sorting the array
● Linked Lists:
○ Traversing the list
○ Inserting and deleting elements at the beginning, end, or a specific position
● Stacks:
○ Pushing elements onto the stack (adding to the top)
○ Popping elements from the stack (removing from the top)
● Queues:
○ Enqueuing elements (adding to the rear)
○ Dequeuing elements (removing from the front)
● Trees:
○ Traversing the tree (pre-order, in-order, post-order)
○ Inserting and deleting nodes
○ Searching for a specific node
● Graphs:
○ Traversing the graph (breadth-first search, depth-first search)
○ Finding shortest paths between nodes
○ Detecting cycles
Applications of Data Structures
1. Web Development:
● HTML DOM: The Document Object Model represents the structure of an HTML
document as a tree of nodes.
● Database Indexing: Data structures like B-trees are used to efficiently index and
retrieve data in databases.
● Caching: Hash tables are used to implement caches for storing frequently accessed
data.
2. Operating Systems:
● Process Management: Stacks and queues are used to manage processes and threads.
● Memory Management: Heaps are used for memory allocation and deallocation.
● File Systems: Trees are used to represent the hierarchical structure of file systems.
3. Game Development:
● Pathfinding: Graphs are used to represent game maps and algorithms like Dijkstra's
algorithm are used to find the shortest path between locations.
4. Database Systems:
● Indexing: B-trees and other balanced trees are used to efficiently index data.
● Query Processing: Data structures like hash tables and inverted indexes are used for
query optimization.
Algorithm Complexity
Algorithm complexity refers to the efficiency of an algorithm in terms of its resource usage,
typically measured in terms of time and space.
Time Complexity
● Definition: A function that describes how an algorithm's running time grows with the
input size.
● Notation: Big O notation (e.g., O(n), O(n log n), O(n^2)).
● Factors: number of operations, loop iterations, and recursive calls.
Space Complexity
● Definition: A function that describes the amount of memory an algorithm requires to
run.
● Factors: size of input data, temporary variables, and recursive call stack.
Algorithmic Time-Space Tradeoff
In many cases, there is a trade-off between time complexity and space complexity.
An algorithm that is efficient in terms of time might require more space, and vice versa.
In computing, the time-space tradeoff refers to the balance between time complexity and
space complexity.
Generally, optimizing an algorithm to use less time (faster execution) can require more
memory (space), and vice versa.
● Time-efficient algorithms:
These may use more memory to precompute or store results to avoid repeated
computations (e.g., caching).
● Space-efficient algorithms:
These may use less memory but can be slower since they might need to recompute
results or process data multiple times.
● Problem Size: The choice of algorithm depends on the expected size of the input data.
● Hardware Constraints: The available memory and processing power can influence the
choice of algorithm.
● Specific Requirements: The application's requirements for speed, memory usage, and
other factors should be considered.
Asymptotic Analysis
Asymptotic analysis is the process of evaluating the efficiency of an algorithm by studying its behavior as the size of the
input grows towards infinity. This analysis helps in understanding the algorithm's
performance in terms of time and space complexity, particularly for large input sizes.
Purpose:
To compare algorithms independently of hardware, programming language, and constant
factors, focusing only on how resource usage grows with the input size.
Process:
1. Identify the Basic Operations: Determine which operations dominate the algorithm’s
runtime or space usage.
2. Express the Complexity Function: Write down the function that describes the number
of operations or space used as a function of input size n.
3. Simplify: Focus on the growth rate of the function and ignore lower-order terms and
constant factors.
Asymptotic Notations
Definition: Asymptotic notations are mathematical tools used to describe the growth rate of
functions in asymptotic analysis.
They provide a way to formally express the efficiency of algorithms by characterizing their
time or space complexity.
Big O Notation:
● Definition: Big O notation describes an upper bound on the growth rate of a function.
It represents the worst-case scenario, showing the maximum time an algorithm will
take as input size grows.
Significance of Big O notation:
Big O notation helps in measuring how well an algorithm performs when the input size
increases.
It allows you to compare the efficiency of different algorithms. For example, if one algorithm
has a time complexity of O(n) and another has O(n^2), you can tell that the first one will
generally perform better for larger inputs.
By using Big O notation, you can predict how an algorithm will behave with larger inputs,
which is crucial in real-world applications where the amount of data may increase
significantly.
It helps in identifying the most time-consuming part of an algorithm, so developers can focus
on optimizing those areas.
Understanding the Big O of algorithms is important for writing efficient code, which
conserves resources like time (CPU cycles) and memory (RAM).
Steps to Determine Big O Notation:
● Examine the function and identify the term with the highest order of growth as the
input size increases.
● The Big O notation is written as O(f(n)), where f(n) represents the dominant term.
● For example, if the dominant term is n^2, the Big O notation would be O(n^2).
● In some cases, the Big O notation can be simplified by removing constant factors
and lower-order terms.
Example:
For f(n) = 3n^2 + 5n + 2, the dominant term is 3n^2; dropping the constant factor 3 and
the lower-order terms 5n + 2 gives O(n^2).
● Efficient Algorithms: Those with O(1), O(log n), and O(n) time complexities are
considered efficient for large inputs.
● Moderate Efficiency: Algorithms with O(n log n) time complexity are usually efficient but
slower than linear algorithms, often used in sorting algorithms.
● Inefficient Algorithms: Algorithms with O(n^2), O(n^3), O(2^n) and O(n!) grow much
faster and can become impractical for large inputs.
Mathematical Examples of Runtime Analysis:
The table below illustrates the runtime analysis of different orders of algorithms as the input
size (n) increases (approximate operation counts):
n        O(log n)   O(n)     O(n log n)   O(n^2)      O(2^n)
10       ~3         10       ~33          100         1,024
100      ~7         100      ~664         10,000      ~1.3 × 10^30
1,000    ~10        1,000    ~9,966       1,000,000   astronomically large
Comparison of Big O Notation, Big Ω (Omega) Notation, and Big θ (Theta) Notation:
● Big O describes an upper bound on an algorithm's growth rate (it takes at most this long).
● Big Ω describes a lower bound (it takes at least this long).
● Big θ describes a tight bound (its growth rate is exactly of this order).
In the analysis of algorithms, "best case," "average case," and "worst case" refer to how an
algorithm performs under different conditions of input.
Each describes the time or space complexity based on specific input scenarios:
1. Best Case:
● Definition: The best case refers to the scenario where the algorithm performs
optimally, taking the least amount of time or space.
● Complexity: The best-case time complexity represents the minimum time the
algorithm can take for any input of size n.
● Usage: It shows the ideal situation but is usually not a reliable performance measure,
as best-case scenarios rarely happen in real-world conditions.
● Example: In a linear search, the best case occurs when the target element is found at
the first position. The time complexity in this case is O(1) (constant time).
2. Average Case:
● Definition: The average case represents the expected time or space complexity when
typical or random inputs are considered. It’s the average time the algorithm will take
over all possible inputs.
● Complexity: The average-case complexity gives a realistic performance measure as it
reflects typical behavior.
● Usage: It’s the most common performance measure used when analyzing algorithms.
● Example: For linear search, the average case occurs when the target element is found
roughly halfway through the list. The average time complexity would be O(n/2), which
simplifies to O(n).
3. Worst Case:
● Definition: The worst case refers to the scenario where the algorithm takes the
maximum time or space, typically due to the most challenging input of size n.
● Complexity: The worst-case time complexity represents the upper bound of the
algorithm’s performance.
● Usage: Worst-case analysis is critical, as it ensures the algorithm won’t perform
worse than this bound, which is useful for performance guarantees.
● Example: In linear search, the worst case occurs when the target element is at the
last position or not present in the list at all. The time complexity here would be O(n),
meaning the algorithm checks all elements.
Example of All Cases: Linear Search
● Best Case: The element is found at the first position. Time complexity is O(1).
● Average Case: The element is found somewhere in the middle of the array. Time
complexity is O(n/2), simplified to O(n).
● Worst Case: The element is found at the last position or is not present at all. Time
complexity is O(n).
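All three cases can be seen in a simple linear search sketch (linearSearch is an illustrative name):

```cpp
// Return the index of key in arr[0..n-1], or -1 if it is not present.
int linearSearch(const int arr[], int n, int key) {
    for (int i = 0; i < n; i++) {      // worst case: all n elements are checked
        if (arr[i] == key) return i;   // best case: found at i == 0 after one comparison
    }
    return -1;                         // key not present — also a worst case
}
```

Searching {10, 20, 30, 40, 50} for 10 is the best case (one comparison), while searching for 50 or for an absent value like 60 is the worst case (all five elements checked).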
1. Introduction to Strings
A string is a sequence of characters, typically used to represent and manipulate text.
Examples of Strings:
● "Hello, world!"
● "123456"
● "A quick brown fox jumps over the lazy dog."
3. String Operations
Single Program with all string operations
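A sketch of such a program using std::string; the particular operations collected here (length, concatenation, comparison, substring extraction, searching) and the helper names are illustrative:

```cpp
#include <string>

// A single sketch collecting common string operations.
struct StringOpsResult {
    std::size_t length;   // length of the first string
    std::string joined;   // the two strings concatenated
    bool equal;           // whether the two strings compare equal
    std::string sub;      // a substring of the joined string
    std::size_t foundAt;  // index where the second string occurs in joined
};

StringOpsResult stringOps(const std::string& s1, const std::string& s2) {
    StringOpsResult r;
    r.length  = s1.length();                      // length
    r.joined  = s1 + ", " + s2 + "!";             // concatenation
    r.equal   = (s1 == s2);                       // comparison
    r.sub     = r.joined.substr(0, s1.length());  // substring: the first word
    r.foundAt = r.joined.find(s2);                // searching
    return r;
}
```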
TOPIC: Applications of Strings
1. Text Processing:
○ Searching: Strings are used to search for patterns or specific sequences of
characters within larger texts. For example, in a search engine, a query is
matched with strings from the indexed web pages.
■ Example: Searching for a word in a document (strstr() in C, .find() in
C++).
○ Pattern Matching: This is commonly used in text editors, compilers, and other
tools to search for specific patterns using algorithms like Knuth-Morris-Pratt
(KMP) or regular expressions (regex).
■ Example: Finding all email addresses in a document using a regex
pattern.
○ Replacing: Strings are used to find and replace specific patterns of text within
larger bodies of text.
■ Example: Replacing all occurrences of "hello" with "hi" in a document.
2. Data Validation and Parsing:
○ Input Validation: Strings are used to validate user input in applications. For
instance, checking if an entered email address or phone number is in the
correct format.
■ Example: Validating a user’s email format in an application using
regular expressions.
○ Parsing Data: Strings are used in parsing tasks, such as breaking down and
analyzing a structured format (e.g., JSON, XML, or CSV files) for further
processing.
■ Example: Parsing a CSV file where data is stored as a string and split
into components using delimiters like commas.
3. Communication Systems:
○ Data Transmission: Strings are essential in network communication. Data, such
as messages or requests, are often encoded as strings before being
transmitted between systems or devices.
■ Example: HTTP requests and responses use strings for communication
between a client and server.
4. Bioinformatics:
○ DNA Sequence Analysis: Strings are used to represent DNA sequences, which
consist of strings of the letters A, C, G, and T. Various algorithms are applied to
compare and analyze these sequences in biological research.
■ Example: Matching DNA sequences to identify genetic similarities.
5. Cryptography and Security:
○ Encryption and Decryption: Strings are crucial in cryptography, where plain
text is transformed into encrypted text (cipher) and vice versa using string
manipulation techniques.
■ Example: Encrypting sensitive data like passwords using algorithms
such as AES or RSA, where text is treated as a string of characters.
6. Natural Language Processing (NLP):
○ Sentiment Analysis: Strings are processed in NLP applications to determine
the sentiment of a text, whether it is positive, negative, or neutral.
■ Example: Analyzing customer feedback on social media posts using
string data to classify the sentiment.
○ Speech Recognition: Speech is often converted into text (strings) for further
processing in applications like voice assistants (Siri, Google Assistant).
■ Example: Converting spoken words into a string and interpreting them
to provide answers or actions.
7. User Interfaces:
○ Textual Content Representation: Most graphical user interfaces (GUIs) display
strings for textual content, such as labels, buttons, or menus.
■ Example: Displaying "Submit" on a button in a web form or mobile
application.
8. File Handling and Manipulation:
○ Reading and Writing Files: Strings are used extensively for reading from and
writing to files. For example, reading a text file line by line involves treating
each line as a string.
■ Example: Reading a .txt file where each line of text is handled as a
string.
○ File Naming: File names, extensions, and paths are all strings in computer
systems.
■ Example: document.txt is a string representing a file name.
9. Database Queries:
○ SQL Queries: In database management systems, SQL queries are constructed
and processed as strings. For instance, to retrieve or manipulate data from a
database, strings are used in queries.
■ Example: SELECT * FROM users WHERE name = 'John';
10. Game Development:
○ Character and Dialogue Management: Strings are used to store and display
in-game dialogue, names of characters, and other text-based content.
○ Example: Displaying dialogues during character interactions or storing player
names.
TOPICS: Pattern Matching Algorithms:
Pattern matching refers to the process of searching a text or a sequence for occurrences of a
specified pattern.
The goal is to determine whether the pattern exists in the text and, if so, to identify its
position(s) within the text.
● Text Searching: Pattern matching is essential for search engines, editors, and
database systems.
● Biological Sequences: Used for matching DNA, RNA sequences.
● Spam Filtering: Matching email content with spam filters.
Types of Pattern Matching Algorithms
A. Naive (Brute-Force) Algorithm
The simplest approach, where we check for the pattern at every possible position in the text.
This algorithm compares the pattern to every substring of the text, one character at a time.
Algorithm Steps
● Input:
○ A text string T of length n.
○ A pattern string P of length m.
● Output:
○ All occurrences of the pattern P in the text T.
● Procedure:
○ For each position i in the text from 0 to n - m:
■ For each character j in the pattern from 0 to m - 1:
● If T[i + j] is not equal to P[j], break the inner loop.
■ If all characters matched (i.e., j reached m), print the starting index i.
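A C++ implementation of the procedure above might look as follows (this reconstructs the naiveStringMatch function the accompanying explanation refers to; returning the match count is our addition for convenience):

```cpp
#include <iostream>
#include <string>

// Check the pattern at every possible starting position in the text,
// print the index of each occurrence, and return how many were found.
int naiveStringMatch(const std::string& text, const std::string& pattern) {
    int n = text.length();
    int m = pattern.length();
    int found = 0;
    for (int i = 0; i + m <= n; i++) {                   // every possible starting position
        int j = 0;
        while (j < m && text[i + j] == pattern[j]) j++;  // compare character by character
        if (j == m) {                                    // all m characters matched
            std::cout << "Pattern found at index " << i << "\n";
            found++;
        }
    }
    return found;
}
```

The main function would simply read the text and the pattern from the user (e.g. with std::getline) and call naiveStringMatch.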
Explanation of the Program
1. Function Definition:
○ The naiveStringMatch function takes two strings: text and pattern.
2. Loop Through Text:
○ The outer loop iterates through each possible starting position of the pattern
within the text.
3. Inner Loop for Comparison:
○ The inner loop checks character by character to see if the pattern matches the
substring of the text starting at position i.
4. Match Found:
○ If the inner loop completes without a mismatch (i.e., j equals m), it indicates
that the pattern has been found, and the starting index is printed.
5. Main Function:
○ The main function takes user input for both the text and the pattern, then calls
the naiveStringMatch function to perform the search.
Time Complexity:
O(n * m), where n is the length of the text and m is the length of the pattern.
B. Knuth-Morris-Pratt (KMP) Algorithm
The KMP algorithm improves on the naive approach by using information gathered during the
matching process to avoid redundant comparisons. It preprocesses the pattern to create a
partial match table (also called the "prefix table"), which tells the algorithm how much to
shift the pattern when a mismatch occurs.
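The prefix table records, for every position in the pattern, the length of the longest proper prefix that is also a suffix ending at that position. A minimal sketch of its construction (computePrefixTable is an illustrative name):

```cpp
#include <string>
#include <vector>

// lps[i] = length of the longest proper prefix of pattern[0..i]
// that is also a suffix of pattern[0..i].
std::vector<int> computePrefixTable(const std::string& pattern) {
    std::vector<int> lps(pattern.size(), 0);
    int len = 0;                        // length of the previous longest prefix-suffix
    for (std::size_t i = 1; i < pattern.size(); ) {
        if (pattern[i] == pattern[len]) {
            lps[i++] = ++len;           // extend the current prefix-suffix
        } else if (len != 0) {
            len = lps[len - 1];         // fall back without advancing i
        } else {
            lps[i++] = 0;               // no prefix-suffix ends here
        }
    }
    return lps;
}
```

On a mismatch, KMP consults this table to decide how far to shift the pattern instead of restarting the comparison from scratch.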
C. Rabin-Karp Algorithm
This algorithm uses hashing to find any one of a set of pattern strings in a text. The idea is to
hash the pattern and then compare it with the hash of each substring of the text of the same
length as the pattern.
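This rolling-hash idea can be sketched as follows (the base and modulus values are illustrative choices; a character-by-character check guards against hash collisions):

```cpp
#include <string>
#include <vector>

// Return the starting indices where pattern occurs in text,
// using a rolling polynomial hash (base d, modulus q).
std::vector<int> rabinKarp(const std::string& text, const std::string& pattern) {
    const long long d = 256, q = 1000000007;  // illustrative base and modulus
    int n = text.size(), m = pattern.size();
    std::vector<int> matches;
    if (m == 0 || m > n) return matches;

    long long h = 1;                          // d^(m-1) mod q, used to drop the leading char
    for (int i = 0; i < m - 1; i++) h = (h * d) % q;

    long long pHash = 0, tHash = 0;           // hash of pattern and of the current window
    for (int i = 0; i < m; i++) {
        pHash = (pHash * d + pattern[i]) % q;
        tHash = (tHash * d + text[i]) % q;
    }

    for (int i = 0; i + m <= n; i++) {
        if (pHash == tHash && text.compare(i, m, pattern) == 0)
            matches.push_back(i);             // verify to rule out hash collisions
        if (i + m < n)                        // slide the window: drop text[i], add text[i+m]
            tHash = ((tHash - h * text[i] % q + q) % q * d + text[i + m]) % q;
    }
    return matches;
}
```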
D. Boyer-Moore Algorithm
Boyer-Moore is one of the most efficient algorithms for pattern matching. It processes the
pattern and text from right to left and uses two heuristics—Bad Character and Good
Suffix—to skip sections of the text, making it very efficient in practical use.
E. Aho-Corasick Algorithm
This algorithm is used for matching multiple patterns at once. It constructs a trie of all the
patterns and then processes the text using this trie, enabling simultaneous pattern matching.
● Prefix Function: In the KMP algorithm, the prefix function helps reduce unnecessary
comparisons.
● Heuristics: Used in Boyer-Moore to skip sections of the text, reducing the number of
comparisons.
● Hashing: In Rabin-Karp, a hash function allows comparison of the hash of the pattern
to the hash of substrings of the text.
6. Comparative Analysis
TOPICS:
Array: Introduction
An Array is a data structure that stores a collection of elements, all of the same data type, in
contiguous memory locations.
Arrays are used to store multiple values efficiently and are accessible by indexing.
Characteristics of Arrays:
1. Fixed Size
2. Homogeneous Elements
3. Contiguous Memory Locations
4. Indexing starting from 0
5. Efficient Access Using Indexes
Types of Arrays:
One-dimensional (linear) arrays and multidimensional arrays (such as 2D arrays).
Advantages of Arrays
● Efficient Access: Arrays provide O(1) time complexity for accessing elements using an
index.
● Easy to Traverse: Arrays allow easy iteration through elements using loops.
Disadvantages of Arrays
● Fixed Size: The size of an array is fixed, which can lead to wasted space or insufficient
space.
● Expensive Insertion/Deletion: Adding or removing elements in arrays, especially in the
middle, requires shifting elements, which can be time-consuming (O(n)).
Operations on Arrays:
● Traversing
● Insertion
● Deletion
● Searching
● Sorting
Topic: Linear Arrays
A Linear Array is a type of array where elements are arranged in a linear sequence, meaning one after
the other in contiguous memory locations.
1. Single Dimension
2. Fixed Size
3. Homogeneous Elements
4. Indexing
Accessing Elements in a Linear Array:
Elements are accessed directly by their index, e.g. arr[i], in O(1) time.
Limitations of Linear Arrays:
1. Fixed Size
2. Insertion and Deletion Overhead: requires shifting elements, which can be
time-consuming for large arrays.
Applications of Linear Arrays:
1. Storing Lists: Arrays are used to store lists of items like names, numbers, or any other
data type.
2. Matrix Representation: Linear arrays can be extended to form multi-dimensional
arrays, such as 2D arrays for matrices.
3. Searching and Sorting: Arrays are commonly used in algorithms like linear search,
binary search, and sorting techniques (bubble sort, selection sort, etc.).
A linear array is stored in contiguous memory locations in the computer's memory. This
means that once an array is declared, a block of memory is allocated for storing the elements,
and each element is placed one after another in consecutive memory addresses.
Key Points:
Each integer typically takes 4 bytes of memory (this may vary based on the system
architecture).
If the base address of arr[0] is 1000, the memory locations for each element would be as
follows (assuming 4-byte integers):
arr[0] → 1000
arr[1] → 1004
arr[2] → 1008
arr[3] → 1012
arr[4] → 1016
In general, Address(arr[i]) = Base Address + i × (size of one element).
Traversal in an Array
Traversal in an array refers to the process of accessing and visiting each element of the array,
one by one, starting from the first element to the last.
Traversal is one of the most basic operations performed on arrays, and it is often used for
operations like searching, updating, or displaying elements.
1. Linear Access: During traversal, each element is accessed sequentially using its index.
2. Fixed Iteration: The number of iterations is equal to the size of the array.
3. No Modifications: Traversal generally does not change the contents of the array
unless additional operations are performed during the traversal (like insertion or
deletion).
Traversal Algorithm:
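A sketch of the traversal algorithm (traverse is an illustrative name that both displays and sums the elements it visits):

```cpp
#include <iostream>

// Visit each element of arr[0..n-1] exactly once, first to last,
// and return the sum of the visited elements.
int traverse(const int arr[], int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {    // linear access by index: O(n) time, O(1) extra space
        std::cout << arr[i] << " ";  // "visit" the element (here: display it)
        sum += arr[i];
    }
    std::cout << "\n";
    return sum;
}
```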
Complexity of Traversal:
● Time Complexity: O(n), where n is the number of elements in the array. Since each
element must be visited exactly once, the time taken is proportional to the size of the
array.
● Space Complexity: O(1), as only a constant amount of additional memory is required,
regardless of the array size.
Applications of Traversal:
1. Searching: During traversal, we can check if a particular element exists in the array.
2. Updating Elements: Traversal can be used to update or modify the values of array
elements.
3. Displaying Elements: It is often used to print or display the contents of the array.
Traversal can also be done in reverse order, starting from the last element to the first
element.
Insertions in an Array
Insertion in an array refers to the process of adding a new element to an array at a specific
position.
Since arrays have a fixed size, inserting an element may involve shifting the existing elements
to make space for the new element.
The insertion can be done at different positions like the beginning, middle, or end of the
array.
Types of Insertion:
1. Insertion at the Beginning: The new element is inserted at the start of the array.
2. Insertion at the Middle (at a specific index): The element is inserted at any valid index
within the array.
3. Insertion at the End: The new element is added after the last element of the array.
Steps for Insertion:
1. Ensure space: Make sure there is enough space to accommodate the new element. If
the array is full, no new element can be added without resizing (in static arrays).
2. Shift elements: If the insertion is not at the end, shift the elements to the right to
create space for the new element.
3. Insert the new element: Place the new element in the desired position.
4. Update the size: Increment the array size after inserting the element.
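The four steps can be sketched as (insertAt is an illustrative name; capacity is the total allocated size of the array):

```cpp
// Insert value at index pos in arr[0..n-1]; return the new size,
// or -1 if the array is full or pos is invalid.
int insertAt(int arr[], int n, int capacity, int pos, int value) {
    if (n >= capacity || pos < 0 || pos > n) return -1;  // step 1: ensure space, valid index
    for (int i = n; i > pos; i--)                        // step 2: shift elements right
        arr[i] = arr[i - 1];
    arr[pos] = value;                                    // step 3: place the new element
    return n + 1;                                        // step 4: the updated size
}
```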
Time Complexity of Insertion:
● At the end: O(1), since no shifting is needed.
● At the beginning or at a specific position: O(n), since up to n elements must be
shifted right.
Deletion in an Array
Deletion in an array refers to the process of removing an element from a specific position in
the array.
Similar to insertion, deletion involves shifting elements to fill the gap left by the removed
element.
The operation can be performed at various positions: at the beginning, in the middle, or at the
end of the array.
Types of Deletion:
1. Deletion at the Beginning: The first element is removed, and subsequent elements
are shifted left.
2. Deletion at a Specific Position: Any element can be removed from a specified index.
3. Deletion at the End: The last element is removed, which is straightforward as no
shifting is needed.
When deleting the first element, you also need to shift all subsequent elements to the left.
Example:
Delete the first element from arr[5] = {10, 20, 30, 40, 50}: after shifting, the array
becomes {20, 30, 40, 50} and the size decreases to 4.
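Deletion with left-shifting can be sketched as (deleteAt is an illustrative name):

```cpp
// Delete the element at index pos from arr[0..n-1] by shifting the
// following elements left; return the new size, or -1 if pos is invalid.
int deleteAt(int arr[], int n, int pos) {
    if (pos < 0 || pos >= n) return -1;
    for (int i = pos; i < n - 1; i++)  // fill the gap left by the removed element
        arr[i] = arr[i + 1];
    return n - 1;                      // the array size decreases by one
}
```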
Time Complexity of Deletion:
● At the end: O(1), since no shifting is needed.
● At the beginning or at a specific position: O(n), since up to n elements must be
shifted left.
Multidimensional arrays are arrays that have more than one dimension.
The most common type is the two-dimensional array, which can be thought of as a table
with rows and columns.
Multidimensional arrays can be used to represent complex data structures such as matrices,
grids, or even higher-dimensional data.
Key Points:
Two-Dimensional Arrays:
A two-dimensional array consists of rows and columns. The number of rows and columns
can vary based on the requirements.
Memory Representation:
A two-dimensional array can be stored in row-major order (all elements of the first row
first, then the second row, and so on) or in column-major order. In column-major order, all
elements of the first column are stored first, followed by the elements of the second
column, and so on.
Applications of Multidimensional Arrays:
Matrices, grids and tables, and image data are commonly represented using
multidimensional arrays.
Three-Dimensional Arrays:
The elements are typically accessed using three indices: arr[x][y][z], where each index
selects the position along one of the three dimensions.
To calculate the address of an element in a 3D array, you can use the following formulas
(B is the base address and w is the size of one element in bytes):
Row-Major Order
For a 3D array arr[x][y][z], the address of the element arr[i][j][k] can be calculated
using:
Address(arr[i][j][k]) = B + w × ((i × y + j) × z + k)
Where:
● B = base address of the array
● w = size of one element in bytes
● x, y, z = the dimensions of the array; i, j, k = the indices of the element
Column-Major Order
Here the first index varies fastest, so:
Address(arr[i][j][k]) = B + w × ((k × y + j) × x + i)
Example:
Consider a 3D array arr[3][4][5], and we want to find the address of the element
arr[1][2][3] (assuming base address 1000 and 4-byte elements):
● Row-major: offset = (1 × 4 + 2) × 5 + 3 = 33, so the address is 1000 + 33 × 4 = 1132.
● Column-major: offset = (3 × 4 + 2) × 3 + 1 = 43, so the address is 1000 + 43 × 4 = 1172.
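The offset arithmetic can be checked with a small sketch (the base address 1000 and 4-byte element size are assumed values; for arr[x][y][z], the row-major offset is (i·y + j)·z + k and the column-major offset is (k·y + j)·x + i):

```cpp
// Offset (in elements) of arr[i][j][k] within a 3D array arr[x][y][z].
int rowMajorOffset(int i, int j, int k, int y, int z) {
    return (i * y + j) * z + k;  // last index k varies fastest
}

int colMajorOffset(int i, int j, int k, int x, int y) {
    return (k * y + j) * x + i;  // first index i varies fastest
}
```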
Topic: Parallel Arrays
Parallel arrays are two or more arrays of the same length in which the elements at the same
index are related, together describing one record.
Characteristics:
● Fixed Length: All arrays in parallel must have the same length
to ensure that corresponding elements relate correctly.
● Data Integrity: Care must be taken to ensure that data
across the arrays is synchronized, as modifying one array
without updating the others can lead to inconsistencies.
Accessing Elements:
To access the data, you can use a common index for all the
parallel arrays. For example, to get the grade of "Bob", you
would access grades[1], since "Bob" is at index 1 in the
names array.
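The names/grades example can be sketched as (the student names and grade values are assumed sample data):

```cpp
#include <string>

// Two parallel arrays: element i of each array describes the same student.
const std::string names[3] = {"Alice", "Bob", "Carol"};
const int grades[3]        = {85, 72, 91};

// Look up a student's grade via the shared index.
int gradeOf(const std::string& name) {
    for (int i = 0; i < 3; i++) {
        if (names[i] == name) return grades[i];  // the same index i in both arrays
    }
    return -1;                                   // student not found
}
```

Here "Bob" is at index 1 of names, so his grade is grades[1].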
Topic: Sparse Arrays
A sparse array (or sparse matrix) is one in which most of the elements are zero. Instead of
storing every element, only the non-zero elements are stored, along with their positions,
which reduces memory usage.
Disadvantages:
● Complexity in Operations: Operations like addition, subtraction,
and multiplication can become more complex due to the
indirect storage of data.
● Access Time: While memory usage is reduced, accessing
individual elements can take more time compared to a
dense matrix due to the need for additional indexing.
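A common sparse representation is the triplet (coordinate) form, which stores only the non-zero entries; the sketch below (with illustrative names) also shows why element access is slower than in a dense matrix:

```cpp
#include <vector>

// Coordinate (triplet) form: store only the non-zero entries.
struct Triplet {
    int row, col, value;
};

// Convert a dense matrix to triplet form.
std::vector<Triplet> toSparse(const std::vector<std::vector<int>>& dense) {
    std::vector<Triplet> sparse;
    for (int r = 0; r < (int)dense.size(); r++)
        for (int c = 0; c < (int)dense[r].size(); c++)
            if (dense[r][c] != 0)
                sparse.push_back({r, c, dense[r][c]});  // keep only non-zero values
    return sparse;
}

// Reading one element back requires a search — the extra access time
// mentioned above, traded for the reduced memory usage.
int at(const std::vector<Triplet>& sparse, int r, int c) {
    for (const Triplet& t : sparse)
        if (t.row == r && t.col == c) return t.value;
    return 0;                                           // absent entries are zero
}
```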