
FACULTY NOTES

BCA 3RD SEMESTER


SYLLABUS
DATA STRUCTURE-1

UNIT-I
Introduction: Elementary data organization, Data Structure definition, Data type vs. data
structure, Categories of data structures, Data structure operations, Applications of data
structures, Algorithms complexity and time-space tradeoff, Big-O notation.

Strings: Introduction, Storing strings, String operations, Pattern matching algorithms.

UNIT-II
Arrays: Introduction, Linear arrays, Representation of linear array in memory, address
calculations, Traversal, Insertions, Deletion in an array, Multidimensional arrays, Parallel
arrays, Sparse arrays.

Linked List: Introduction, Array vs. linked list, Representation of linked lists in memory,
Traversal, Insertion, Deletion, Searching in a linked list, Header linked list, Circular linked
list, Two-way linked list, Threaded lists, Garbage collection, Applications of linked lists

UNIT-III
Stack: Introduction, Array and linked representation of stacks, Operations on stacks,
Applications of stacks: Polish notation, Recursion.

Queues: Introduction, Array and linked representation of queues, Operations on queues, Deques, Priority Queues, Applications of queues.

UNIT-IV
Tree: Introduction, Definition, Representing Binary tree in memory, Traversing binary trees,
Traversal algorithms using stacks.

Graph: Introduction, Graph theory terminology, Sequential and linked representation of graphs.
Topic 1: Elementary Data Organization

Elementary data organization refers to the fundamental ways in which data can be structured
and represented in computer systems.

It involves the basic building blocks and techniques used to store and manipulate data
efficiently.

Key Concepts:

● Data Types: The fundamental categories of data, such as integers, floating-point numbers, characters, and Boolean values.
● Variables: Named storage locations used to hold data values.
● Operators: Symbols used to perform operations on data, such as arithmetic, logical,
and relational operations.
● Expressions: Combinations of variables, constants, and operators that evaluate to a
value.
● Control Flow: The order in which statements are executed, including conditional
statements (if-else) and loops (for, while).

Basic Data Structures:

● Arrays: Ordered collections of elements of the same data type.


● Linked Lists: Sequences of nodes, each containing a data element and a pointer to the
next node.
● Stacks: LIFO (Last-In-First-Out) data structures.
● Queues: FIFO (First-In-First-Out) data structures.

Data Representation:

● Binary Representation: Numbers and characters are typically represented using binary code (0s and 1s).
● Character Encoding: ASCII, Unicode, and other character encoding schemes are used
to represent textual data.

Data Storage:

● Main Memory: Main memory (RAM) is used to store data that is currently being
accessed by the CPU.
● Secondary Storage: Devices like hard drives and SSDs are used to store data
persistently.
Data Manipulation:

● Algorithms: Procedures or sets of instructions used to perform operations on data.


● Programming Languages: High-level languages (like C++, Java, Python) provide tools and syntax for working with data.
Topic 2: Data Type vs Data Structure

● A data type is a classification that tells what kind of value a variable can hold and which operations are allowed on it (e.g., int, float, char, bool).
● A data structure is a way of organizing a collection of values in memory so that they can be used efficiently (e.g., array, linked list, stack, queue, tree).
● In short: a data type describes the nature of a single value, while a data structure describes the arrangement of many values and the operations (traversal, insertion, deletion, searching) used to manage them.
Topic 3: Categories of Data Structures:

1. Linear Data Structures:

Elements are arranged sequentially.

Examples:

● Arrays: A collection of elements of the same data type stored in contiguous memory
locations.
● Linked Lists: A sequence of nodes, each containing a data element and a pointer to
the next node.
○ Singly Linked Lists: Nodes point to the next node.
○ Doubly Linked Lists: Nodes point to both the previous and next nodes.
○ Circular Linked Lists: Last node points back to the first node.
● Stacks: A LIFO (Last-In-First-Out) data structure where elements are inserted and
deleted from the same end.
● Queues: A FIFO (First-In-First-Out) data structure where elements are inserted at one
end and deleted from the other end.
2. Non-Linear Data Structures

Elements have complex relationships, not necessarily sequential.

Examples:

● Trees: Hierarchical data structures with nodes connected by edges.


○ Binary Trees: Each node has at most two children.
○ Binary Search Trees: A binary tree where the left child's key is less than the
parent's key, and the right child's key is greater.
● Graphs: A collection of nodes (vertices) connected by edges.
○ Directed Graphs: Edges have a direction.
○ Undirected Graphs: Edges do not have a direction.
Other Categories of Data Structures:

1. HOMOGENEOUS & NON-HOMOGENEOUS DATA STRUCTURES

Homogeneous Data Structures:

Definition:

A homogeneous data structure is one in which all the elements are of the same type or data
category.

Examples:

Array:

int arr[5] = {1, 2, 3, 4, 5}; // All elements are integers.

Stack:

It can be implemented using arrays or linked lists.

int stack[5]; // Stack storing integer elements.

Queue:

int queue[5]; // Queue storing integer elements.

Linked List:

In programming languages, lists are often homogeneous, meaning all elements are of the
same type.

int list[5]; // A list storing integers.


Non-Homogeneous Data Structures:

Definition:

A non-homogeneous data structure is one where elements can be of different types, or the
structure allows for various data types to be combined.

Examples:

Structure (struct in C/C++):

A structure allows you to group different types of data under one name.

struct Student {
    char name[50];
    int age;
    float marks;
};

Class (in C++/Object-Oriented Languages):

A class is an advanced form of structure that allows you to bundle data and functions
together.

class Employee {
public:
    string name;
    int id;
    float salary;
};
Linked List:

Linked lists can store elements of different types, especially when pointers to different types
of data are used in the nodes.

struct Node {
    int data;
    Node* next;
};

Tree:

A tree data structure often allows nodes to store elements of various types, particularly in
object-oriented implementations.
2. PRIMITIVE AND NON-PRIMITIVE DATA STRUCTURES:

Primitive Data Structures

These are the simplest form of data structures and usually operate at a low level. They directly store data values and have predefined data types.

Types of Primitive Data Structures:

1. Integer (int):
○ Stores numerical values without decimal points.
○ Example: int a = 10;
2. Float (float):
○ Stores decimal numbers.
○ Example: float pi = 3.14;
3. Character (char):
○ Stores a single character.
○ Example: char letter = 'A';
4. Boolean (bool):
○ Stores either true or false.
○ Example: bool isPassed = true;
5. Pointers (in C/C++):
○ Stores the memory address of another variable.
○ Example: int *ptr = &a;
Non-Primitive Data Structures:

These are derived from primitive data structures.

They are more complex and store multiple values, often combining different data types.

Non-primitive structures can be either linear or non-linear.

Examples:

Array:

Example: int arr[5] = {1, 2, 3, 4, 5};

Linked List:

Example:
struct Node {
    int data;
    Node* next;
};

Stack:

Example: int stack[5];

Queue:

Example: int queue[5];

Tree:

Example:
struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;
};
Graph:

Example:
struct GraphNode {
    int vertex;
    GraphNode* next;
};
3. Static and Dynamic Data Structures:

Definition:

● Static data structures have a fixed size, meaning the amount of memory allocated for
storing data is determined at compile time. They do not grow or shrink during
program execution.

Key Features:

1. Fixed Memory Size: Memory allocation is done before the program runs.
2. Fast Access: Since memory is contiguous, accessing elements is quick.
3. Less Flexible: The size of the structure cannot be changed at runtime.

Examples:

Array:

int arr[5] = {1, 2, 3, 4, 5}; // Array of size 5.

Static Stack:

int stack[10]; // Static stack of size 10.

Static Queue:

int queue[10]; // Static queue of size 10.


Dynamic Data Structures:

Definition:

● Dynamic data structures can change size during runtime, meaning memory is
allocated and deallocated as needed.
● This allows for more efficient use of memory when dealing with varying amounts of
data.

Key Features:

1. Variable Memory Size: Memory is allocated during runtime based on need.
2. Flexible: Can grow or shrink as data is added or removed.
3. Memory Efficiency: Little memory is wasted, as allocation is done on demand.

Examples:

1. Linked List:

struct Node {
    int data;
    Node* next;
};

Node* head = nullptr; // Dynamically created linked list.

2. Dynamic Stack:

struct Node {
    int data;
    Node* next;
};

Node* top = nullptr; // Dynamically allocated stack.

3. Dynamic Queue:
struct Node {
    int data;
    Node* next;
};

Node* front = nullptr; // Dynamically allocated queue.
Node* rear = nullptr;

Choosing the Right Data Structure:

● Search efficiency: How quickly elements need to be found.


● Insertion and deletion efficiency: How frequently elements need to be added or
removed.
● Memory usage: The amount of memory required to store the data.
Data Structure Operations

Basic Operations

● Traversing: Visiting each element in the data structure exactly once.


● Searching: Finding the location of a specific element based on its value.
● Insertion: Adding a new element to the data structure.
● Deletion: Removing an element from the data structure.

Additional Operations

● Updating: Modifying the value of an existing element.


● Sorting: Arranging elements in a specific order (e.g., ascending, descending).
● Merging: Combining two or more data structures into a single one.
● Splitting: Dividing a data structure into smaller substructures.
Specific Operations for Different Data Structures:

● Arrays:
○ Accessing elements by index
○ Inserting and deleting elements
○ Sorting the array
● Linked Lists:
○ Traversing the list
○ Inserting and deleting elements at the beginning, end, or a specific position
● Stacks:
○ Pushing elements onto the stack (adding to the top)
○ Popping elements from the stack (removing from the top)
● Queues:
○ Enqueuing elements (adding to the rear)
○ Dequeuing elements (removing from the front)
● Trees:
○ Traversing the tree (pre-order, in-order, post-order)
○ Inserting and deleting nodes
○ Searching for a specific node
● Graphs:
○ Traversing the graph (breadth-first search, depth-first search)
○ Finding shortest paths between nodes
○ Detecting cycles
Applications of Data Structures

1. Web Development:

● HTML DOM: The Document Object Model represents the structure of an HTML
document as a tree of nodes.
● Database Indexing: Data structures like B-trees are used to efficiently index and
retrieve data in databases.
● Caching: Hash tables are used to implement caches for storing frequently accessed
data.

2. Operating Systems:

● Process Management: Stacks and queues are used to manage processes and threads.
● Memory Management: Heaps are used for memory allocation and deallocation.
● File Systems: Trees are used to represent the hierarchical structure of file systems.

3. Game Development:

● Pathfinding: Graphs are used to represent game maps and algorithms like Dijkstra's
algorithm are used to find the shortest path between locations.

4. Database Systems:

● Indexing: B-trees and other balanced trees are used to efficiently index data.
● Query Processing: Data structures like hash tables and inverted indexes are used for
query optimization.
Algorithm Complexity

Algorithm complexity refers to the efficiency of an algorithm in terms of its resource usage,
typically measured in terms of time and space.

Time Complexity

● Definition: a function that describes how the algorithm's running time grows with the
input size.
● Notation: Big O notation (e.g., O(n), O(n log n), O(n^2)).
● Factors: number of operations, loop iterations, and recursive calls.

Space Complexity

● Definition: a function that describes the amount of memory the algorithm requires to
run.
● Factors: size of input data, temporary variables, and recursive call stack.
Algorithmic Time-Space Tradeoff

In many cases, there is a trade-off between time complexity and space complexity.

An algorithm that is efficient in terms of time might require more space, and vice versa.

In computing, the time-space tradeoff refers to the balance between time complexity and
space complexity.

Generally, optimizing an algorithm to use less time (faster execution) can require more
memory (space), and vice versa.

● Time-efficient algorithms:

These may use more memory to precompute or store results to avoid repeated
computations (e.g., caching).

● Space-efficient algorithms:

These may use less memory but can be slower since they might need to recompute
results or process data multiple times.
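To make this tradeoff concrete, the sketch below (an illustrative example, not from the original notes) computes Fibonacci numbers two ways: a plain recursive version that recomputes subproblems (less space, exponential time) and a memoized version that spends O(n) extra space on a cache to run in O(n) time.

#include <iostream>
#include <vector>
using namespace std;

// Time-heavy, space-light: recomputes subproblems, roughly O(2^n) time.
long long fibSlow(int n) {
    if (n <= 1) return n;
    return fibSlow(n - 1) + fibSlow(n - 2);
}

// Space-heavy, time-light: caches results, O(n) time, O(n) extra space.
long long fibMemo(int n, vector<long long>& cache) {
    if (n <= 1) return n;
    if (cache[n] != -1) return cache[n];   // Reuse a stored result.
    cache[n] = fibMemo(n - 1, cache) + fibMemo(n - 2, cache);
    return cache[n];
}

int main() {
    int n = 40;
    vector<long long> cache(n + 1, -1);
    cout << fibMemo(n, cache) << endl; // Fast: each subproblem solved once.
    cout << fibSlow(n) << endl;        // Noticeably slower: exponential recomputation.
    return 0;
}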

A classic example is sorting algorithms:

● Merge Sort: Requires O(n log n) time and O(n) extra space.
● Quick Sort: Requires O(n log n) time on average (O(n^2) in the worst case) but only O(log n) extra space for recursion.

Key Considerations to choose efficient algorithm:

● Problem Size: The choice of algorithm depends on the expected size of the input data.
● Hardware Constraints: The available memory and processing power can influence the
choice of algorithm.
● Specific Requirements: The application's requirements for speed, memory usage, and
other factors should be considered.
Asymptotic Analysis

Asymptotic analysis is the process of evaluating the efficiency of an algorithm by studying its behavior as the size of the input grows towards infinity. This analysis helps in understanding the algorithm's performance in terms of time and space complexity, particularly for large input sizes.

Purpose:

● To estimate the running time or space requirements of an algorithm.


● To understand how the algorithm scales with larger input sizes.
● To compare the relative efficiency of different algorithms.

Process:

1. Identify the Basic Operations: Determine which operations dominate the algorithm’s
runtime or space usage.
2. Express the Complexity Function: Write down the function that describes the number
of operations or space used as a function of input size n.
3. Simplify: Focus on the growth rate of the function and ignore lower-order terms and
constant factors.
Asymptotic Notations

Definition: Asymptotic notations are mathematical tools used to describe the growth rate of
functions in asymptotic analysis.

They provide a way to formally express the efficiency of algorithms by characterizing their
time or space complexity.

Commonly used asymptotic notations:

1. Big-O Notation (O)

● Definition:

Describes the upper bound of the time or space complexity of an algorithm.

It represents the worst-case scenario, showing the maximum time an algorithm will
take as input size grows.

● Purpose: It gives the worst-case time complexity.


● Formal Definition: A function f(n) = O(g(n)) if there exist positive constants c and n₀
such that f(n) ≤ c * g(n) for all n ≥ n₀.


Significance of Big O notation:

1. Performance Measurement of an Algorithm:

Big O notation helps in measuring how well an algorithm performs when the input size
increases.

2. Comparing Algorithms Time or Space Complexity:

It allows you to compare the efficiency of different algorithms. For example, if one algorithm
has a time complexity of O(n) and another has O(n^2), you can tell that the first one will
generally perform better for larger inputs.

3. Predicting Scalability for large inputs:

By using Big O notation, you can predict how an algorithm will behave with larger inputs,
which is crucial in real-world applications where the amount of data may increase
significantly.

4. Identifying Bottlenecks for Improvement and Optimizing the Algorithm:

It helps in identifying the most time-consuming part of an algorithm, so developers can focus
on optimizing those areas.

5. Conservation of Resources (like memory and processor time):

Understanding the Big O of algorithms is important for writing efficient code, which conserves resources like time (CPU cycles) and memory (RAM).
Steps to Determine Big O Notation:

1. Identify the Dominant Term:

● Examine the function and identify the term with the highest order of growth as the input size increases.
● Ignore any constant factors or lower-order terms.

2. Write the Big O Notation:

● The Big O notation is written as O(f(n)), where f(n) represents the dominant term.
● For example, if the dominant term is n^2, the Big O notation would be O(n^2).

3. Simplify the Notation (Optional):

● In some cases, the Big O notation can be simplified by removing constant factors or by using a more concise notation.
● For instance, O(2n) can be simplified to O(n).

Example:

Function: f(n) = 3n^3 + 2n^2 + 5n + 1

1. Dominant Term: 3n^3

2. Order of Growth: Cubic (n^3)

3. Big O Notation: O(n^3)

4. Simplified Notation: O(n^3)
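The same steps apply to code: count how often the dominant operation runs. A minimal sketch (illustrative fragments, not from the original notes):

#include <iostream>
using namespace std;

int main() {
    int n = 100;

    int sum = 0;
    for (int i = 0; i < n; i++)       // Runs n times -> O(n).
        sum += i;

    int pairs = 0;
    for (int i = 0; i < n; i++)       // Outer loop: n iterations.
        for (int j = 0; j < n; j++)   // Inner loop: n iterations each.
            pairs++;                  // Total n * n -> O(n^2).

    cout << sum << " " << pairs << endl; // Overall: O(n) + O(n^2) = O(n^2).
    return 0;
}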


Graphical Comparison of Big-O Functions
Key Points:

● Efficient Algorithms: Those with O(1), O(log n), and O(n) time complexities are considered efficient for large inputs.

● Moderate Efficiency: Algorithms with O(n log n) time complexity are usually efficient but slower than linear algorithms, often used in sorting algorithms.

● Inefficient Algorithms: Algorithms with O(n^2), O(n^3), O(2^n) and O(n!) grow much faster and can become impractical for large inputs.
Mathematical Examples of Runtime Analysis:

The table below illustrates the runtime analysis of different orders of algorithms as the input size (n) increases (values are approximate operation counts):

n     | O(1) | O(log n) | O(n)  | O(n log n) | O(n^2)    | O(2^n)
10    | 1    | ~3       | 10    | ~33        | 100       | 1,024
100   | 1    | ~7       | 100   | ~664       | 10,000    | ~1.3 x 10^30
1000  | 1    | ~10      | 1,000 | ~9,966     | 1,000,000 | ~1.1 x 10^301
Comparison of Big O Notation, Big Ω (Omega) Notation, and Big θ (Theta) Notation:

● Big O (O): Upper bound. f(n) = O(g(n)) if there exist positive constants c and n₀ such that f(n) ≤ c * g(n) for all n ≥ n₀. It characterizes the worst-case growth rate.
● Big Ω (Omega): Lower bound. f(n) = Ω(g(n)) if f(n) ≥ c * g(n) for all n ≥ n₀. It characterizes the best-case growth rate.
● Big θ (Theta): Tight bound. f(n) = θ(g(n)) if c₁ * g(n) ≤ f(n) ≤ c₂ * g(n) for all n ≥ n₀, i.e., f(n) is bounded both above and below by g(n).

In the analysis of algorithms, "best case," "average case," and "worst case" refer to how an
algorithm performs under different conditions of input.

Each describes the time or space complexity based on specific input scenarios:

1. Best Case:

● Definition: The best case refers to the scenario where the algorithm performs
optimally, taking the least amount of time or space.
● Complexity: The best-case time complexity represents the minimum time the
algorithm can take for any input of size n.
● Usage: It shows the ideal situation but is usually not a reliable performance measure,
as best-case scenarios rarely happen in real-world conditions.
● Example: In a linear search, the best case occurs when the target element is found at
the first position. The time complexity in this case is O(1) (constant time).

2. Average Case:

● Definition: The average case represents the expected time or space complexity when
typical or random inputs are considered. It’s the average time the algorithm will take
over all possible inputs.
● Complexity: The average-case complexity gives a realistic performance measure as it
reflects typical behavior.
● Usage: It’s the most common performance measure used when analyzing algorithms.
● Example: For linear search, the average case occurs when the target element is found
roughly halfway through the list. The average time complexity would be O(n/2), which
simplifies to O(n).

3. Worst Case:

● Definition: The worst case refers to the scenario where the algorithm takes the
maximum time or space, typically due to the most challenging input of size n.
● Complexity: The worst-case time complexity represents the upper bound of the
algorithm’s performance.
● Usage: Worst-case analysis is critical, as it ensures the algorithm won’t perform
worse than this bound, which is useful for performance guarantees.
● Example: In linear search, the worst case occurs when the target element is at the
last position or not present in the list at all. The time complexity here would be O(n),
meaning the algorithm checks all elements.
Example of All Cases: Linear Search

Suppose you are searching for an element in an unsorted array:

● Best Case: The element is found at the first position. Time complexity is O(1).
● Average Case: The element is found somewhere in the middle of the array. Time
complexity is O(n/2), simplified to O(n).
● Worst Case: The element is found at the last position or is not present at all. Time
complexity is O(n).
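A minimal linear search sketch exhibiting all three cases (array values are illustrative assumptions):

#include <iostream>
using namespace std;

// Returns the index of key in arr[0..n-1], or -1 if absent.
int linearSearch(const int arr[], int n, int key) {
    for (int i = 0; i < n; i++)   // Worst case: all n elements checked -> O(n).
        if (arr[i] == key)
            return i;             // Best case: found at i == 0 -> O(1).
    return -1;                    // Not present: also the worst case.
}

int main() {
    int arr[] = {7, 3, 9, 4, 1};
    int n = 5;
    cout << linearSearch(arr, n, 7) << endl; // 0  (best case)
    cout << linearSearch(arr, n, 9) << endl; // 2  (middle: average-like case)
    cout << linearSearch(arr, n, 8) << endl; // -1 (worst case)
    return 0;
}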

Key Points:

● Best Case: Ideal scenario (fastest).


● Average Case: Typical scenario (realistic).
● Worst Case: Worst scenario (guaranteed performance).
Strings: Comprehensive Notes

1. Introduction to Strings

● A string is a sequence of characters, typically used to represent text.

Examples of Strings:

● "Hello, world!"
● "123456"
● "A quick brown fox jumps over the lazy dog."

2. Storing Strings Methods

2.1 Fixed-length Character Arrays

2.2 Null-terminated Strings

2.3 Dynamic Memory Allocation

2.4 Pointer to a String Literal

2.5 Using scanf() for input strings


2.6 Using fgets()
All types of storage methods in C programming:
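A minimal sketch demonstrating each storage method listed above (inputs are illustrative assumptions; the C-style string handling below also compiles as C++):

#include <cstdio>
#include <cstring>
#include <cstdlib>

int main() {
    // 2.1 Fixed-length character array (unused slots stay as padding).
    char fixed[10] = "Hi";

    // 2.2 Null-terminated string: '\0' marks the end.
    char name[] = {'B', 'C', 'A', '\0'};

    // 2.3 Dynamic memory allocation: size chosen at runtime.
    char* dyn = (char*)malloc(20);
    strcpy(dyn, "Dynamic");

    // 2.4 Pointer to a string literal (read-only; do not modify).
    const char* lit = "Literal";

    // 2.5 scanf() reads one whitespace-delimited word.
    char word[20];
    scanf("%19s", word);

    // 2.6 fgets() reads a whole line, safely bounded.
    // Note: here it reads the remainder of the line left by scanf().
    char line[50];
    fgets(line, sizeof(line), stdin);

    printf("%s %s %s %s %s %s", fixed, name, dyn, lit, word, line);
    free(dyn);
    return 0;
}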
3. String Operations

Single program with all common string operations:
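A minimal sketch using the standard <cstring> functions for length, copy, concatenation, comparison, and substring search (the sample strings are illustrative assumptions):

#include <cstdio>
#include <cstring>

int main() {
    char a[50] = "Hello";
    char b[] = "World";
    char copy[50];

    printf("Length: %zu\n", strlen(a));        // 5
    strcpy(copy, a);                           // copy <- "Hello"
    strcat(a, b);                              // a <- "HelloWorld"
    printf("Concatenated: %s\n", a);
    printf("Compare: %d\n", strcmp(copy, b));  // Negative: "Hello" < "World"
    char* pos = strstr(a, "World");            // Substring search
    if (pos != NULL)
        printf("Found at index %ld\n", (long)(pos - a)); // 5
    return 0;
}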
TOPIC: Applications of Strings

1. Text Processing:
○ Searching: Strings are used to search for patterns or specific sequences of
characters within larger texts. For example, in a search engine, a query is
matched with strings from the indexed web pages.
■ Example: Searching for a word in a document (strstr() in C, .find() in
C++).
○ Pattern Matching: This is commonly used in text editors, compilers, and other
tools to search for specific patterns using algorithms like Knuth-Morris-Pratt
(KMP) or regular expressions (regex).
■ Example: Finding all email addresses in a document using a regex
pattern.
○ Replacing: Strings are used to find and replace specific patterns of text within
larger bodies of text.
■ Example: Replacing all occurrences of "hello" with "hi" in a document.
2. Data Validation and Parsing:
○ Input Validation: Strings are used to validate user input in applications. For
instance, checking if an entered email address or phone number is in the
correct format.
■ Example: Validating a user’s email format in an application using
regular expressions.
○ Parsing Data: Strings are used in parsing tasks, such as breaking down and
analyzing a structured format (e.g., JSON, XML, or CSV files) for further
processing.
■ Example: Parsing a CSV file where data is stored as a string and split
into components using delimiters like commas.
3. Communication Systems:
○ Data Transmission: Strings are essential in network communication. Data, such
as messages or requests, are often encoded as strings before being
transmitted between systems or devices.
■ Example: HTTP requests and responses use strings for communication
between a client and server.
4. Bioinformatics:
○ DNA Sequence Analysis: Strings are used to represent DNA sequences, which
consist of strings of the letters A, C, G, and T. Various algorithms are applied to
compare and analyze these sequences in biological research.
■ Example: Matching DNA sequences to identify genetic similarities.
5. Cryptography and Security:
○ Encryption and Decryption: Strings are crucial in cryptography, where plain
text is transformed into encrypted text (cipher) and vice versa using string
manipulation techniques.
■ Example: Encrypting sensitive data like passwords using algorithms
such as AES or RSA, where text is treated as a string of characters.
6. Natural Language Processing (NLP):
○ Sentiment Analysis: Strings are processed in NLP applications to determine
the sentiment of a text, whether it is positive, negative, or neutral.
■ Example: Analyzing customer feedback on social media posts using
string data to classify the sentiment.
○ Speech Recognition: Speech is often converted into text (strings) for further
processing in applications like voice assistants (Siri, Google Assistant).
■ Example: Converting spoken words into a string and interpreting them
to provide answers or actions.
7. User Interfaces:
○ Textual Content Representation: Most graphical user interfaces (GUIs) display
strings for textual content, such as labels, buttons, or menus.
■ Example: Displaying "Submit" on a button in a web form or mobile
application.
8. File Handling and Manipulation:
○ Reading and Writing Files: Strings are used extensively for reading from and
writing to files. For example, reading a text file line by line involves treating
each line as a string.
■ Example: Reading a .txt file where each line of text is handled as a
string.
○ File Naming: File names, extensions, and paths are all strings in computer
systems.
■ Example: document.txt is a string representing a file name.
9. Database Queries:
○ SQL Queries: In database management systems, SQL queries are constructed
and processed as strings. For instance, to retrieve or manipulate data from a
database, strings are used in queries.
■ Example: SELECT * FROM users WHERE name = 'John';
10. Game Development:
○ Character and Dialogue Management: Strings are used to store and display
in-game dialogue, names of characters, and other text-based content.
○ Example: Displaying dialogues during character interactions or storing player
names.
TOPICS: Pattern Matching Algorithms:

Pattern matching refers to the process of searching a text or a sequence for occurrences of a
specified pattern.

The goal is to determine whether the pattern exists in the text and, if so, to identify its
position(s) within the text.

Importance of Pattern Matching Algorithms

● Text Searching: Pattern matching is essential for search engines, editors, and
database systems.
● Biological Sequences: Used for matching DNA, RNA sequences.
● Spam Filtering: Matching email content with spam filters.
Types of Pattern Matching Algorithms

A. Naive String Matching Algorithm

The simplest approach, where we check for the pattern at every possible position in the text.
This algorithm compares the pattern to every substring of the text, one character at a time.

Algorithm Steps

● Input:
○ A text string T of length n.
○ A pattern string P of length m.
● Output:
○ All occurrences of the pattern P in the text T.
● Procedure:
○ For each position i in the text from 0 to n - m:
■ For each character j in the pattern from 0 to m - 1:
● If T[i + j] is not equal to P[j], break the inner loop.
■ If all characters matched (i.e., j reached m), print the starting index i.
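A sketch of this procedure in C++ (the function name naiveStringMatch matches the explanation that follows; the input handling in main is an illustrative assumption):

#include <iostream>
#include <string>
using namespace std;

// Prints every starting index where pattern occurs in text.
void naiveStringMatch(const string& text, const string& pattern) {
    int n = text.length();
    int m = pattern.length();
    for (int i = 0; i <= n - m; i++) {   // Try every starting position.
        int j;
        for (j = 0; j < m; j++)          // Compare character by character.
            if (text[i + j] != pattern[j])
                break;                   // Mismatch: stop the inner loop.
        if (j == m)                      // All m characters matched.
            cout << "Pattern found at index " << i << endl;
    }
}

int main() {
    string text, pattern;
    cout << "Enter text: ";
    getline(cin, text);
    cout << "Enter pattern: ";
    getline(cin, pattern);
    naiveStringMatch(text, pattern);
    return 0;
}

For example, with text "AABAACAADAABAABA" and pattern "AABA", the program reports matches at indices 0, 9, and 12.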
Explanation of the Program

1. Function Definition:
○ The naiveStringMatch function takes two strings: text and pattern.
2. Loop Through Text:
○ The outer loop iterates through each possible starting position of the pattern
within the text.
3. Inner Loop for Comparison:
○ The inner loop checks character by character to see if the pattern matches the
substring of the text starting at position i.
4. Match Found:
○ If the inner loop completes without a mismatch (i.e., j equals m), it indicates
that the pattern has been found, and the starting index is printed.
5. Main Function:
○ The main function takes user input for both the text and the pattern, then calls
the naiveStringMatch function to perform the search.

Time Complexity:

O(n * m), where n is the length of the text and m is the length of the pattern.
B. Knuth-Morris-Pratt (KMP) Algorithm

The KMP algorithm improves on the naive approach by using information gathered during the
matching process to avoid redundant comparisons. It preprocesses the pattern to create a
partial match table (also called the "prefix table"), which tells the algorithm how much to
shift the pattern when a mismatch occurs.

● Advantages: Efficient in cases with repetitive patterns.


● Time Complexity: O(n + m), where n is the length of the text and m is the length of
the pattern.
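A hedged sketch of KMP using the standard prefix-table formulation (variable names are illustrative):

#include <iostream>
#include <string>
#include <vector>
using namespace std;

// lps[i] = length of the longest proper prefix of pattern[0..i]
// that is also a suffix of pattern[0..i].
vector<int> buildPrefixTable(const string& p) {
    int m = p.length();
    vector<int> lps(m, 0);
    int len = 0;
    for (int i = 1; i < m; ) {
        if (p[i] == p[len]) {
            lps[i++] = ++len;
        } else if (len > 0) {
            len = lps[len - 1];   // Fall back; do not advance i.
        } else {
            lps[i++] = 0;
        }
    }
    return lps;
}

void kmpSearch(const string& text, const string& p) {
    int n = text.length(), m = p.length();
    if (m == 0 || m > n) return;
    vector<int> lps = buildPrefixTable(p);
    for (int i = 0, j = 0; i < n; ) {
        if (text[i] == p[j]) {
            i++; j++;
            if (j == m) {                      // Full match found.
                cout << "Pattern found at index " << i - m << endl;
                j = lps[j - 1];                // Continue searching.
            }
        } else if (j > 0) {
            j = lps[j - 1];                    // Shift pattern using the table.
        } else {
            i++;
        }
    }
}

int main() {
    kmpSearch("AABAACAADAABAABA", "AABA");     // Indices 0, 9, 12.
    return 0;
}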
C. Rabin-Karp Algorithm

This algorithm uses hashing to find any one of a set of pattern strings in a text. The idea is to
hash the pattern and then compare it with the hash of each substring of the text of the same
length as the pattern.

● Advantages: Suitable for multiple pattern matching.


● Time Complexity: O(n * m) in the worst case, but O(n + m) on average with a good
hash function.
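A minimal Rabin-Karp sketch using a polynomial rolling hash (the base and modulus are illustrative choices):

#include <iostream>
#include <string>
using namespace std;

void rabinKarp(const string& text, const string& p) {
    const long long BASE = 256, MOD = 1000000007;
    int n = text.length(), m = p.length();
    if (m == 0 || m > n) return;

    long long high = 1;                        // BASE^(m-1) % MOD
    for (int i = 0; i < m - 1; i++) high = (high * BASE) % MOD;

    long long ph = 0, th = 0;                  // Pattern hash, window hash.
    for (int i = 0; i < m; i++) {
        ph = (ph * BASE + p[i]) % MOD;
        th = (th * BASE + text[i]) % MOD;
    }

    for (int i = 0; i + m <= n; i++) {
        // Verify characters on a hash match to rule out collisions.
        if (ph == th && text.compare(i, m, p) == 0)
            cout << "Pattern found at index " << i << endl;
        if (i + m < n)                         // Roll the hash to the next window.
            th = ((th - text[i] * high % MOD + MOD) * BASE + text[i + m]) % MOD;
    }
}

int main() {
    rabinKarp("AABAACAADAABAABA", "AABA");     // Indices 0, 9, 12.
    return 0;
}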

D. Boyer-Moore Algorithm

Boyer-Moore is one of the most efficient algorithms for pattern matching. It processes the
pattern and text from right to left and uses two heuristics—Bad Character and Good
Suffix—to skip sections of the text, making it very efficient in practical use.

● Advantages: Efficient for large text and pattern sizes.


● Time Complexity: Best case O(n/m), worst case O(n * m).
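A simplified Boyer-Moore sketch using only the bad-character heuristic (the full algorithm also applies the good-suffix rule; this reduced form is an illustrative assumption):

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

void boyerMooreBadChar(const string& text, const string& p) {
    int n = text.length(), m = p.length();
    if (m == 0 || m > n) return;
    vector<int> last(256, -1);                 // Last index of each character in p.
    for (int i = 0; i < m; i++)
        last[(unsigned char)p[i]] = i;

    int s = 0;                                 // Current shift of the pattern.
    while (s <= n - m) {
        int j = m - 1;
        while (j >= 0 && p[j] == text[s + j])  // Compare right to left.
            j--;
        if (j < 0) {
            cout << "Pattern found at index " << s << endl;
            s++;                               // Simple shift after a match.
        } else {
            // Align the mismatched text character with its last occurrence in p.
            s += max(1, j - last[(unsigned char)text[s + j]]);
        }
    }
}

int main() {
    boyerMooreBadChar("ABAAABCD", "ABC");      // Index 4.
    return 0;
}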

E. Aho-Corasick Algorithm

This algorithm is used for matching multiple patterns at once. It constructs a trie of all the
patterns and then processes the text using this trie, enabling simultaneous pattern matching.

● Advantages: Useful for applications like searching in dictionaries.


● Time Complexity: O(n + m + z), where z is the number of matches found.

4. Key Concepts in Pattern Matching

● Prefix Function: In the KMP algorithm, the prefix function helps reduce unnecessary
comparisons.
● Heuristics: Used in Boyer-Moore to skip sections of the text, reducing the number of
comparisons.
● Hashing: In Rabin-Karp, a hash function allows comparison of the hash of the pattern
to the hash of substrings of the text.

5. Applications of Pattern Matching Algorithms

● Text Editors: Search and replace functionalities.


● Data Compression: Tools like gzip use pattern matching to identify repeated
substrings.
● Security: Intrusion detection systems use pattern matching to detect malicious
signatures.
● Bioinformatics: Pattern matching is used in DNA sequence alignment.

6. Comparative Analysis

The table below summarizes the algorithms covered above (complexities as stated in the preceding sections):

Algorithm    | Preprocessing   | Matching Time                 | Best Suited For
Naive        | None            | O(n * m)                      | Short texts, simple cases
KMP          | O(m)            | O(n + m)                      | Repetitive patterns
Rabin-Karp   | O(m)            | O(n + m) avg, O(n * m) worst  | Multiple pattern matching
Boyer-Moore  | O(m + alphabet) | O(n/m) best, O(n * m) worst   | Large texts and patterns
Aho-Corasick | O(total m)      | O(n + m + z)                  | Many patterns at once
TOPICS:
Array: Introduction

An Array is a data structure that stores a collection of elements, all of the same data type, in
contiguous memory locations.

Arrays are used to store multiple values efficiently and are accessible by indexing.

This makes arrays a fundamental concept for organizing data in programming.

Characteristics of Arrays:

1. Fixed Size
2. Homogeneous Elements
3. Contiguous Memory Locations
4. Indexing starting from 0
5. Efficient Access Using Indexes

Types of Arrays:

1. One-dimensional arrays (Linear arrays): A single row or list of elements.


2. Multidimensional arrays: Arrays with more than one dimension (e.g., 2D arrays, 3D
arrays).
Advantages of Arrays

● Efficient Access: Arrays provide O(1) time complexity for accessing elements using an
index.
● Easy to Traverse: Arrays allow easy iteration through elements using loops.

Disadvantages of Arrays

● Fixed Size: The size of an array is fixed, which can lead to wasted space or insufficient
space.
● Expensive Insertion/Deletion: Adding or removing elements in arrays, especially in the
middle, requires shifting elements, which can be time-consuming (O(n)).

Common Operations on Arrays:

● Traversing
● Insertion
● Deletion
● Searching
● Sorting
Topic: Linear Arrays

A Linear Array is a type of array where elements are arranged in a linear sequence, meaning one after
the other in contiguous memory locations.

It is also known as a one-dimensional array.

Key Characteristics of Linear Arrays:

1. Single Dimension
2. Fixed Size
3. Homogeneous Elements
4. Indexing
Accessing Elements in a Linear Array:
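A minimal sketch of index-based access (values are illustrative):

#include <iostream>
using namespace std;

int main() {
    int arr[5] = {10, 20, 30, 40, 50};
    cout << arr[0] << endl;  // First element: 10 (indexing starts at 0).
    cout << arr[3] << endl;  // Fourth element: 40, fetched in O(1) time.
    arr[2] = 35;             // Elements can also be updated through the index.
    cout << arr[2] << endl;  // 35
    return 0;
}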

Advantages of Linear Arrays:

1. Random Access using index.


2. Easy to Implement
3. Memory Efficiency

Disadvantages of Linear Arrays:

1. Fixed Size
2. Insertion and Deletion Overhead: requires shifting elements, which can be time-consuming for large arrays.
Applications of Linear Arrays:

1. Storing Lists: Arrays are used to store lists of items like names, numbers, or any other
data type.
2. Matrix Representation: Linear arrays can be extended to form multi-dimensional
arrays, such as 2D arrays for matrices.
3. Searching and Sorting: Arrays are commonly used in algorithms like linear search,
binary search, and sorting techniques (bubble sort, selection sort, etc.).

Representation of Linear Arrays in Memory

A linear array is stored in contiguous memory locations in the computer's memory. This
means that once an array is declared, a block of memory is allocated for storing the elements,
and each element is placed one after another in consecutive memory addresses.

Key Points:

1. Contiguous Memory Allocation
2. Indexing and Addresses: each element's address is calculated from its index
3. First element's address is the Base Address
4. Element Size depends on the data type of the element

Each integer typically takes 4 bytes of memory (this may vary based on the system
architecture).

If the base address of arr[0] is 1000, the memory locations for each element of a five-element integer array would be as follows (using Address of arr[i] = Base Address + i * element size):

arr[0] -> 1000
arr[1] -> 1004
arr[2] -> 1008
arr[3] -> 1012
arr[4] -> 1016
Traversal in an Array

Traversal in an array refers to the process of accessing and visiting each element of the array,
one by one, starting from the first element to the last.

Traversal is one of the most basic operations performed on arrays, and it is often used for
operations like searching, updating, or displaying elements.

Key Points about Traversal:

1. Linear Access: During traversal, each element is accessed sequentially using its index.
2. Fixed Iteration: The number of iterations is equal to the size of the array.
3. No Modifications: Traversal generally does not change the contents of the array
unless additional operations are performed during the traversal (like insertion or
deletion).

Traversal Algorithm:

For an array arr of size n, the traversal algorithm is as follows:

1. Start from the first element (index 0).


2. Access each element one by one by incrementing the index.
3. Stop when the last element (index n-1) is reached.
Traversal is complete once the last element has been accessed; a minimal sketch follows.
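The sketch below (illustrative values) traverses the array forward and, as described later in this section, in reverse order:

#include <iostream>
using namespace std;

int main() {
    int arr[5] = {10, 20, 30, 40, 50};
    int n = 5;
    for (int i = 0; i < n; i++)        // Visit indices 0 .. n-1 in order.
        cout << arr[i] << " ";         // Process (here: print) each element.
    cout << endl;

    for (int i = n - 1; i >= 0; i--)   // Reverse traversal: n-1 down to 0.
        cout << arr[i] << " ";
    cout << endl;
    return 0;
}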

Complexity of Traversal:

● Time Complexity: O(n), where n is the number of elements in the array. Since each
element must be visited exactly once, the time taken is proportional to the size of the
array.
● Space Complexity: O(1), as only a constant amount of additional memory is required,
regardless of the array size.
Applications of Traversal:

1. Searching: During traversal, we can check if a particular element exists in the array.
2. Updating Elements: Traversal can be used to update or modify the values of array
elements.
3. Displaying Elements: It is often used to print or display the contents of the array.

Traversal in Reverse Order:

Traversal can also be done in reverse order, starting from the last element to the first
element.

Insertions in an Array

Insertion in an array refers to the process of adding a new element to an array at a specific
position.

Since arrays have a fixed size, inserting an element may involve shifting the existing elements
to make space for the new element.

The insertion can be done at different positions like the beginning, middle, or end of the
array.

Types of Insertion:

1. Insertion at the Beginning: The new element is inserted at the start of the array.
2. Insertion at the Middle (at a specific index): The element is inserted at any valid index within the array.
3. Insertion at the End: The new element is added after the last element of the array.

Steps for Insertion:

1. Ensure space: Make sure there is enough space to accommodate the new element. If
the array is full, no new element can be added without resizing (in static arrays).
2. Shift elements: If the insertion is not at the end, shift the elements to the right to
create space for the new element.
3. Insert the new element: Place the new element in the desired position.
4. Update the size: Increment the array size after inserting the element.
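A hedged sketch of insertion at a specific index, assuming the array has spare capacity (the helper name insertAt is illustrative):

#include <iostream>
using namespace std;

// Inserts value at index pos in arr[0..n-1]; returns the new logical size.
int insertAt(int arr[], int n, int pos, int value) {
    for (int i = n; i > pos; i--)   // Shift elements right to open a gap.
        arr[i] = arr[i - 1];
    arr[pos] = value;               // Place the new element.
    return n + 1;                   // Logical size grows by one.
}

int main() {
    int arr[6] = {10, 20, 40, 50};  // Capacity 6, logical size 4.
    int n = 4;
    n = insertAt(arr, n, 2, 30);    // Insert 30 at index 2.
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";      // 10 20 30 40 50
    cout << endl;
    return 0;
}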
Time Complexity of Insertion:

● At the End: O(1) – Constant time, as no shifting is required.


● At the Beginning: O(n) – Linear time, as all elements need to be shifted.
● At a Specific Position: O(n) – Linear time, depending on the position of insertion and
the number of elements that need to be shifted.

Deletion in an Array

Deletion in an array refers to the process of removing an element from a specific position in
the array.

Similar to insertion, deletion involves shifting elements to fill the gap left by the removed
element.

The operation can be performed at various positions: at the beginning, in the middle, or at the
end of the array.

Types of Deletion:

1. Deletion at the Beginning: The first element is removed, and subsequent elements
are shifted left.
2. Deletion at a Specific Position: Any element can be removed from a specified index.
3. Deletion at the End: The last element is removed, which is straightforward as no
shifting is needed.

Steps for Deletion:

1. Identify the position: Determine the index of the element to be deleted.


2. Shift elements: Move all subsequent elements one position to the left to fill the gap.
3. Update the size: Decrease the size of the array (logical size) after deletion.
Deletion at the Beginning:

When deleting the first element, you also need to shift all subsequent elements to the left.

Example:

Delete the first element from arr[5] = {10, 20, 30, 40, 50}: after shifting the remaining elements one position left, the array becomes {20, 30, 40, 50} with a logical size of 4. A sketch follows.
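A minimal deletion sketch matching the example above (the helper name deleteAt is illustrative):

#include <iostream>
using namespace std;

// Deletes the element at index pos from arr[0..n-1]; returns the new size.
int deleteAt(int arr[], int n, int pos) {
    for (int i = pos; i < n - 1; i++)  // Shift elements left to fill the gap.
        arr[i] = arr[i + 1];
    return n - 1;                      // Logical size shrinks by one.
}

int main() {
    int arr[5] = {10, 20, 30, 40, 50};
    int n = 5;
    n = deleteAt(arr, n, 0);           // Delete the first element.
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";         // 20 30 40 50
    cout << endl;
    return 0;
}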
Time Complexity of Deletion:

● At the Beginning: O(n) – Linear time, as all elements need to be shifted.


● At a Specific Position: O(n) – Linear time, depending on the position of deletion and
the number of elements that need to be shifted.
● At the End: O(1) – Constant time, as no shifting is needed.
Multidimensional Arrays

Multidimensional arrays are arrays that have more than one dimension.

The most common type is the two-dimensional array, which can be thought of as a table
with rows and columns.

Multidimensional arrays can be used to represent complex data structures such as matrices,
grids, or even higher-dimensional data.

Key Points:

1. Definition: A multidimensional array is an array of arrays. Each element can itself be an array, allowing for multiple dimensions.
2. Declaration: In programming languages like C/C++, a two-dimensional array can be declared as:

int matrix[3][4]; // 3 rows and 4 columns (an illustrative declaration)

Two-Dimensional Arrays:

A two-dimensional array consists of rows and columns. The number of rows and columns
can vary based on the requirements.
Memory Representation:

● Row-Major Order: In languages like C/C++, multidimensional arrays are stored in memory in row-major order, meaning that the entire first row is stored in contiguous memory locations, followed by the entire second row, and so on.
Column-Major Order in a 2D array:

In Column-Major Order, elements of a 2D array are stored column by column in a contiguous block of memory. This means that all elements of the first column are stored first, followed by the elements of the second column, and so on.
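For completeness (these are the standard 2D counterparts of the 3D formulas given below), for a 2D array arr[rows][cols] with base address B and element size s:

Row-Major: Address of arr[i][j] = B + ((i * cols) + j) * s

Column-Major: Address of arr[i][j] = B + ((j * rows) + i) * s

For example, with B = 1000, s = 4, rows = 3, and cols = 4, element arr[1][2] is stored at 1000 + ((1 * 4) + 2) * 4 = 1024 in row-major order, and at 1000 + ((2 * 3) + 1) * 4 = 1028 in column-major order.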


Applications of Multidimensional Arrays:

1. Matrices and Mathematical Operations: Useful in mathematical computations, graphics, and simulations.
2. Image Processing: Representing images as 2D arrays of pixels.
3. Game Development: Representing game boards, grids, or levels.
Address Calculation in a 3D Array

A 3D array can be thought of as an array of 2D arrays.

The elements are typically accessed using three indices: arr[x][y][z], where:

● x is the index of the first dimension (depth or layer).


● y is the index of the second dimension (row).
● z is the index of the third dimension (column).

Address Calculation Formula

To calculate the address of an element in a 3D array, you can use the following formula:

Row-Major Order

For a 3D array arr[x][y][z], the address of the element arr[i][j][k] can be calculated
using:

Address of arr[i][j][k] = BaseAddress + ((i * (rows * cols)) + (j * cols) + k) * size_of_element

Where:

● BaseAddress is the starting address of the array in memory.


● i is the index of the first dimension (depth).
● j is the index of the second dimension (row).
● k is the index of the third dimension (column).
● rows is the number of rows in the 2D slices.
● cols is the number of columns in the 2D slices.
● size_of_element is the size of one element in bytes.

Column-Major Order

For a 3D array stored in column-major order, the formula changes to:

Address of arr[i][j][k] = BaseAddress + ((k * (depth * rows)) + (j * depth) + i) * size_of_element
Example Calculation

Consider a 3D array arr[3][4][5], and we want to find the address of the element
arr[1][2][3]:

● Dimensions: depth = 3, rows = 4, cols = 5


● BaseAddress = 1000
● size_of_element = 4 bytes

1. Row-Major Order Calculation

Using the row-major order formula:

Address = BaseAddress + ((i * (rows * cols)) + (j * cols) + k) * size_of_element
        = 1000 + ((1 * (4 * 5)) + (2 * 5) + 3) * 4
        = 1000 + (20 + 10 + 3) * 4
        = 1000 + 132 = 1132

2. Column-Major Order Calculation

Using the column-major order formula:

Address = BaseAddress + ((k * (depth * rows)) + (j * depth) + i) * size_of_element
        = 1000 + ((3 * (3 * 4)) + (2 * 3) + 1) * 4
        = 1000 + (36 + 6 + 1) * 4
        = 1000 + 172 = 1172


Parallel Arrays

Parallel arrays are a programming technique used to store related data in separate arrays, where each array corresponds to a specific attribute of the data. This approach is useful when you want to keep related pieces of information organized, but don’t necessarily want to use a complex data structure like a class or a struct.

Key Points:

1. Definition: Parallel arrays consist of multiple arrays that are aligned in such a way that the elements at corresponding indices represent related information.
2. Example: If you have students and their corresponding
grades, you might have one array for student names and
another for their grades:
○ names[]: {"Alice", "Bob", "Charlie"}
○ grades[]: {85, 90, 78}

Characteristics:

● Fixed Length: All arrays in parallel must have the same length
to ensure that corresponding elements relate correctly.
● Data Integrity: Care must be taken to ensure that data
across the arrays is synchronized, as modifying one array
without updating the others can lead to inconsistencies.
Accessing Elements:

To access the data, you can use a common index for all the
parallel arrays. For example, to get the grade of "Bob", you
would access grades[1], since "Bob" is at index 1 in the
names array.
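A minimal sketch of the names/grades example (values as listed above):

#include <iostream>
#include <string>
using namespace std;

int main() {
    // Parallel arrays: index i in both arrays refers to the same student.
    string names[] = {"Alice", "Bob", "Charlie"};
    int grades[] = {85, 90, 78};

    cout << names[1] << " scored " << grades[1] << endl; // Bob scored 90
    return 0;
}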

Advantages of Parallel Arrays:


1. Simplicity: Easy to implement and understand, especially for
small datasets.
2. Performance: Can lead to better performance in some
cases, as data access is straightforward.
3. Flexibility: Individual arrays can be easily modified without
needing to change the overall structure.

Disadvantages of Parallel Arrays:

1. Data Integrity Risks: More prone to errors due to synchronization issues between arrays.
2. Maintenance Difficulty: As the number of related attributes
increases, managing multiple arrays can become
cumbersome.
3. Limited Functionality: Lacks the capabilities of more
complex data structures, like classes or structs, which can
encapsulate data and behaviors.

When to Use Parallel Arrays:

● Use parallel arrays for simple applications where data is related but does not require the complexity of a structured data type.
● Ideal for small datasets or when performance is a critical factor and simplicity is preferred.
Sparse Arrays

A Sparse Array is an array in which the majority of elements are zero (or a default value). In such arrays, the number of non-zero elements is much smaller compared to the number of zero elements, which makes it inefficient to store all elements in a traditional array format. Therefore, special storage techniques are used to save memory and improve efficiency.

Key Points:

1. Definition: A sparse array has most of its elements as zero or empty, and only a few elements contain meaningful data (non-zero values).
2. Purpose: Storing large arrays with mostly zero values is
inefficient in terms of memory usage. Sparse arrays aim to
optimize this by storing only the non-zero elements along
with their corresponding indices.
3. Representation: Sparse arrays are typically represented
using data structures like lists of tuples, dictionaries, or
specialized sparse matrix formats.

Representing a Sparse Array:


One efficient way to represent a sparse array is by using a list of
tuples, where each tuple contains:

1. The row index
2. The column index
3. The value (non-zero)

For example, consider the following 4 x 5 array (an illustrative matrix, since the representation is easiest to see concretely):

0 0 0 0 0
0 5 0 0 0
0 0 0 8 0
0 0 2 0 0

For this array, the sparse representation as (row, column, value) tuples would be:

(1, 1, 5), (2, 3, 8), (3, 2, 2)

Advantages of Sparse Arrays:

1. Memory Efficiency: By storing only the non-zero elements, sparse arrays reduce memory usage, especially when dealing with large arrays where most elements are zero.
2. Faster Processing: Operations on sparse arrays can be
faster, as only non-zero elements are processed, skipping
over the zeros.
3. Storage Flexibility: Sparse arrays can be easily stored in a
compressed form or in data structures like dictionaries or
linked lists.
Sparse Matrix Representation:

A sparse matrix is a matrix with a majority of its elements as zero. Sparse matrices can be represented using the following techniques:

1. Triplet Form (Coordinate List or COO Format):
○ Store non-zero elements as a list of tuples (row, column, value).
○ This is the simplest way to represent sparse matrices.

Compressed Sparse Row (CSR) Format:

● Store the non-zero values, their column indices, and row pointers in separate arrays.
● This format is often used in scientific computing libraries like SciPy.

Structure:

● Values[]: Stores the non-zero values.
● Column_Index[]: Stores the column index for each non-zero value.
● Row_Pointer[]: Stores the starting position of each row's data in the Values[] and Column_Index[] arrays.
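A small worked example, using the illustrative 4 x 5 matrix from the triplet-form discussion above:

Values[]       = {5, 8, 2}
Column_Index[] = {1, 3, 2}
Row_Pointer[]  = {0, 0, 1, 2, 3}

Row_Pointer[] has one entry per row plus one extra: the non-zero data of row i occupies positions Row_Pointer[i] up to (but not including) Row_Pointer[i+1] in Values[] and Column_Index[]. Row 0 has no non-zero elements, so its range is empty (0 to 0).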

Compressed Sparse Column (CSC) Format:

● Similar to CSR, but stores data column-wise instead of row-wise.
Applications of Sparse Arrays:

1. Storing Sparse Data: Used in applications where most of the dataset contains zero or default values, such as image processing, network connectivity graphs, or scientific computing.
2. Matrix Operations: Sparse arrays are commonly used in
operations involving large matrices in machine learning,
numerical simulations, or optimization problems.
3. Graph Representation: Adjacency matrices for graphs are
often sparse since not all nodes are connected to each
other.

Advantages of Sparse Matrix Representations:

● Reduced Space Complexity: Only non-zero elements are stored, reducing the space needed for storage.
● Efficiency: Operations like matrix multiplication or solving
linear systems can be optimized by skipping zero elements.

Disadvantages:
● Complexity in Operations: Operations like addition, subtraction,
and multiplication can become more complex due to the
indirect storage of data.
● Access Time: While memory usage is reduced, accessing
individual elements can take more time compared to a
dense matrix due to the need for additional indexing.
