Job Sequencing
The Job Sequencing Problem is a classic greedy-algorithm problem in computer science and operations
research. The goal is to schedule jobs so that the maximum total profit is earned while respecting
job deadlines.
📌 Problem Statement
You are given n jobs where each job has:
A deadline (by when it must be completed).
A profit (earned only if the job is completed before or on its deadline).
Each job takes 1 unit of time.
Objective: Schedule jobs to maximize total profit. Only one job can be scheduled at a time.
🧠 Greedy Approach
1. Sort all jobs in decreasing order of profit.
2. Use a time slot array to track free time slots up to the maximum deadline.
3. For each job, find the latest available slot at or before its deadline.
4. If a slot is found, assign the job and update the slot as occupied.
🧮 Example
Jobs = [{id: 'A', deadline: 2, profit: 100}, {id: 'B', deadline: 1, profit: 19}, {id: 'C', deadline: 2, profit: 27},
{id: 'D', deadline: 1, profit: 25}, {id: 'E', deadline: 3, profit: 15}]
Sorted by profit:
A (100), C (27), D (25), B (19), E (15)
Max deadline = 3 → Time slots = [None, None, None]
Step-by-step Scheduling:
A → slot 2 → [None, A, None]
C → slot 1 → [C, A, None]
D → slot 1 already taken, no free slot at or before deadline 1 → skipped
B → slot 1 already taken → skipped
E → slot 3 → [C, A, E]
Selected Jobs: C, A, E
Total Profit: 27 + 100 + 15 = 142
✅ Time Complexity
Sorting: O(n log n)
Scheduling: O(n * d), where d is the max deadline
(can be optimized to O(n log n) using Disjoint Set Union; see the sketch after the code below)
🛠 Python Code
```python
class Job:
    def __init__(self, job_id, deadline, profit):
        self.id = job_id
        self.deadline = deadline
        self.profit = profit

def job_sequencing(jobs):
    # Sort jobs by decreasing profit
    jobs.sort(key=lambda x: x.profit, reverse=True)

    max_deadline = max(job.deadline for job in jobs)
    slots = [False] * max_deadline        # slots[i] = True if slot i+1 is occupied
    job_sequence = [None] * max_deadline
    total_profit = 0

    for job in jobs:
        # Scan from the latest slot allowed by the deadline back toward slot 1
        for slot in range(min(max_deadline, job.deadline) - 1, -1, -1):
            if not slots[slot]:
                slots[slot] = True
                job_sequence[slot] = job.id
                total_profit += job.profit
                break

    return job_sequence, total_profit

# Example
jobs = [Job('A', 2, 100), Job('B', 1, 19), Job('C', 2, 27), Job('D', 1, 25), Job('E', 3, 15)]
sequence, profit = job_sequencing(jobs)
print("Job Sequence:", sequence)   # ['C', 'A', 'E']
print("Total Profit:", profit)     # 142
```
Prim’s Algorithm
Prim’s Algorithm is a greedy algorithm used to find the Minimum Spanning Tree (MST) of a
connected, undirected, weighted graph.
A Minimum Spanning Tree connects all the vertices in a graph with the minimum possible total edge
weight, without forming any cycles.
✅ Key Concepts
Input: A graph with V vertices and E edges.
Output: A tree that connects all vertices with the least total weight.
🧠 How Prim’s Algorithm Works
1. Start with any vertex (typically vertex 0).
2. Maintain a set of vertices included in the MST.
3. Repeatedly add the cheapest edge that connects a vertex in the MST to a vertex outside it.
4. Continue until all vertices are included.
📌 Data Structures
Visited array: Marks vertices already included in MST.
Priority Queue / Min-Heap: Selects the next minimum weight edge efficiently.
Adjacency List or Matrix: Stores the graph.
🧮 Example
Graph:
```
    2       3
A ----- B ------- C
|       |         |
6       8         7
|       |         |
D ----- E ------- F
    9       5
```
Vertices: A, B, C, D, E, F
Edges: A–B (2), B–C (3), A–D (6), B–E (8), C–F (7), D–E (9), E–F (5)
MST edges chosen by Prim's algorithm (starting from A):
A–B (2)
B–C (3)
A–D (6)
C–F (7)
F–E (5)
Total weight = 2 + 3 + 6 + 7 + 5 = 23
🛠 Python Implementation (Using Min-Heap)
```python
import heapq

def prims_algorithm(graph, start):
    visited = set()
    min_heap = [(0, start)]   # (weight, vertex)
    total_cost = 0

    while min_heap:
        weight, u = heapq.heappop(min_heap)
        if u not in visited:
            visited.add(u)
            total_cost += weight
            # Push every edge leaving the tree; stale heap entries are
            # skipped later by the visited check (lazy deletion)
            for v, w in graph[u]:
                if v not in visited:
                    heapq.heappush(min_heap, (w, v))

    return total_cost

# Example graph as an adjacency list
graph = {
    'A': [('B', 2), ('D', 6)],
    'B': [('A', 2), ('C', 3), ('E', 8)],
    'C': [('B', 3), ('F', 7)],
    'D': [('A', 6), ('E', 9)],
    'E': [('B', 8), ('D', 9), ('F', 5)],
    'F': [('C', 7), ('E', 5)]
}

cost = prims_algorithm(graph, 'A')
print("Total cost of MST:", cost)   # 23
```
⏱ Time Complexity
Using a min-heap + adjacency list: O(E log V).
With an adjacency matrix and a plain array instead of a heap, Prim's runs in O(V²), which makes it a good fit for dense graphs, whereas Kruskal's algorithm is usually preferred for sparse graphs.
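To make the dense-graph point concrete, here is a minimal sketch of the O(V²) array-based variant (my addition, not from the original notes; `prim_dense` is a hypothetical helper name) that scans an adjacency matrix instead of using a heap:

```python
import math

def prim_dense(matrix):
    """matrix[u][v] = edge weight, or math.inf if no edge. Returns MST cost."""
    n = len(matrix)
    in_mst = [False] * n
    best = [math.inf] * n        # cheapest known edge linking each vertex to the tree
    best[0] = 0                  # start from vertex 0
    total = 0
    for _ in range(n):
        # Linear scan for the cheapest fringe vertex: O(V) per step, O(V^2) overall
        u = min((v for v in range(n) if not in_mst[v]), key=lambda v: best[v])
        in_mst[u] = True
        total += best[u]
        for v in range(n):       # relax the edges leaving u
            if not in_mst[v] and matrix[u][v] < best[v]:
                best[v] = matrix[u][v]
    return total

INF = math.inf
#        A    B    C    D    E    F
g = [[INF,   2, INF,   6, INF, INF],   # A
     [  2, INF,   3, INF,   8, INF],   # B
     [INF,   3, INF, INF, INF,   7],   # C
     [  6, INF, INF, INF,   9, INF],   # D
     [INF,   8, INF,   9, INF,   5],   # E
     [INF, INF,   7, INF,   5, INF]]   # F
print(prim_dense(g))   # 23, matching the heap-based version
```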
Huffman Coding
Huffman Coding is a popular lossless data compression algorithm. It is used to compress data
efficiently by assigning shorter codes to more frequent characters and longer codes to less frequent
characters, thereby reducing the overall size of the data.
🔧 How It Works
1. Frequency count:
   - Count the frequency of each character in the input data.
2. Build a priority queue (min-heap):
   - Create a leaf node for each character and insert it into the min-heap keyed by frequency.
3. Build the Huffman Tree:
   - While the heap has more than one node:
     - Remove the two nodes with the lowest frequency.
     - Create a new internal node with these two nodes as children; its frequency is the sum of the two.
     - Insert the new node back into the heap.
   - The remaining node is the root of the Huffman Tree.
4. Assign codes:
   - Traverse the tree, assigning '0' for left edges and '1' for right edges.
   - The characters at the leaves receive the Huffman codes.
5. Encode data:
   - Replace each character in the input with its corresponding Huffman code.
6. Decode data:
   - Use the Huffman Tree to convert the binary code back to characters. (A runnable sketch follows the example below.)
📦 Example
Let’s say the input is:
"ABBCAB"
Frequencies:
A: 2
B: 3
C: 1
Huffman Tree:
```
      (*,6)
      /    \
  (*,3)    B:3
  /    \
C:1    A:2
```
Codes:
B: 1
A: 01
C: 00
Encoded Output:
A B B C A B → 01 1 1 00 01 1 → 011100011 (9 bits instead of 48 with 8-bit ASCII)
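For concreteness, here is a short runnable sketch (my addition, assuming nothing beyond the steps above) that builds the tree with `heapq` and derives the codes. The exact 0/1 assignment depends on how frequency ties are broken, so the codes may be a mirrored variant of the example, but the code lengths are identical.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    # Heap entries: (frequency, tiebreaker, node); a node is either a
    # character (leaf) or a (left, right) pair (internal node).
    heap = [(freq, i, ch) for i, (ch, freq) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two lowest-frequency nodes
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (left, right)))
        next_id += 1

    codes = {}
    def assign(node, code):
        if isinstance(node, str):            # leaf: record its code
            codes[node] = code or "0"        # single-symbol edge case
        else:                                # internal: 0 = left, 1 = right
            assign(node[0], code + "0")
            assign(node[1], code + "1")
    assign(heap[0][2], "")
    return codes

codes = huffman_codes("ABBCAB")
print(codes)                                  # e.g. {'B': '0', 'C': '10', 'A': '11'}
print("".join(codes[c] for c in "ABBCAB"))    # 9 bits, same length as the example
```

Decoding simply walks the tree from the root, going left on '0' and right on '1', emitting a character at each leaf.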
✅ Advantages
Efficient compression (especially when some symbols are more frequent).
Always produces a prefix code (no code is a prefix of another).
Widely used in applications like ZIP files, JPEG, and MP3 formats.
🚫 Limitations
Needs to store the Huffman Tree or code map for decoding.
Not ideal for small files or uniform frequency distributions.
Master Theorem
The Master Theorem provides a straightforward way to analyze the time complexity of divide-and-
conquer algorithms, especially those that follow a recurrence of the form:
T(n) = a · T(n/b) + f(n)
Where:
a ≥ 1: number of subproblems in the recursion
b > 1: factor by which the problem size is reduced
f(n): the cost of dividing and combining the subproblems (non-recursive work)
✅ Master Theorem Cases
To apply the Master Theorem, compare f(n) with n^(log_b a):
Case 1: Polynomially smaller
f(n) = O(n^(log_b a − ε)) for some ε > 0
Then:
T(n) = Θ(n^(log_b a))
✅ Subproblem work dominates
Case 2: Same order
f(n) = Θ(n^(log_b a) · log^k n) for some k ≥ 0
Then:
T(n) = Θ(n^(log_b a) · log^(k+1) n)
✅ Balanced work between recursion and combination
Case 3: Polynomially larger
f(n) = Ω(n^(log_b a + ε)) for some ε > 0
and
a · f(n/b) ≤ c · f(n) for some constant c < 1 and all sufficiently large n
Then:
T(n) = Θ(f(n))
✅ Combination work dominates (requires regularity condition)
🧮 Examples
Example 1
T(n) = 2T(n/2) + n
a = 2, b = 2, f(n) = n
n^(log_b a) = n^(log_2 2) = n
→ Case 2 ⇒
T(n) = Θ(n log n)
Example 2
T(n) = 8T(n/2) + n²
a = 8, b = 2, f(n) = n²
n^(log_b a) = n^(log_2 8) = n³
→ f(n) = n² = O(n^(3 − ε)) (take ε = 1), Case 1 ⇒
T(n) = Θ(n³)
Example 3
T(n) = 2T(n/2) + n²
a = 2, b = 2, f(n) = n²
n^(log_b a) = n^(log_2 2) = n
→ f(n) = n² = Ω(n^(1 + ε)) (take ε = 1), and the regularity condition holds: 2 · (n/2)² = n²/2 ≤ c · n² with c = 1/2 → Case 3 ⇒
T(n) = Θ(n²)
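As a quick numerical sanity check (my own addition, not part of the notes), each of the three recurrences can be evaluated exactly for powers of 2 and divided by the growth rate the theorem predicts; each ratio should settle toward a constant as n grows.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T1(n):   # T(n) = 2T(n/2) + n    -> predicted Θ(n log n)
    return n if n <= 1 else 2 * T1(n // 2) + n

@lru_cache(maxsize=None)
def T2(n):   # T(n) = 8T(n/2) + n^2  -> predicted Θ(n^3)
    return 1 if n <= 1 else 8 * T2(n // 2) + n * n

@lru_cache(maxsize=None)
def T3(n):   # T(n) = 2T(n/2) + n^2  -> predicted Θ(n^2)
    return 1 if n <= 1 else 2 * T3(n // 2) + n * n

for k in (10, 15, 20):
    n = 2 ** k
    # Each ratio approaches a constant, confirming the asymptotic class
    print(n, T1(n) / (n * math.log2(n)), T2(n) / n**3, T3(n) / n**2)
```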
❌ Master Theorem Limitations
Doesn’t apply if a or b are not constants (e.g., they vary with n), or if a < 1 or b ≤ 1
Doesn’t handle f(n) that falls into the gaps between the cases (e.g., f(n) = n / log n in T(n) = 2T(n/2) + n/log n)
Doesn’t handle recurrences with multiple different subproblem sizes (e.g., T(n) = T(n/2) + T(n/3) + n)
Strassen’s Matrix Multiplication
Strassen’s Matrix Multiplication Algorithm is a divide-and-conquer algorithm that multiplies two
matrices faster than the conventional O(n³) approach.
✅ Motivation
Standard multiplication of two n × n matrices requires O(n³) time.
Strassen reduced the number of recursive multiplications needed, improving the time complexity to O(n^(log_2 7)) ≈ O(n^2.81).
📌 Key Idea
Instead of performing 8 multiplications (as in the traditional method), Strassen uses only 7 recursive
multiplications with extra additions and subtractions.
🧠 Algorithm Overview
Given two n × n matrices A and B, divide each into 4 submatrices:
```
A = | A11  A12 |      B = | B11  B12 |
    | A21  A22 |          | B21  B22 |
```
Compute 7 products:
M1 = (A11 + A22) · (B11 + B22)
M2 = (A21 + A22) · B11
M3 = A11 · (B12 − B22)
M4 = A22 · (B21 − B11)
M5 = (A11 + A12) · B22
M6 = (A21 − A11) · (B11 + B12)
M7 = (A12 − A22) · (B21 + B22)
Use these to compute the final quadrants of the result matrix C:
C11 = M1 + M4 − M5 + M7
C12 = M3 + M5
C21 = M2 + M4
C22 = M1 − M2 + M3 + M6
🕒 Time Complexity
T(n) = 7T(n/2) + O(n²) ⇒ T(n) = O(n^(log_2 7)) ≈ O(n^2.81)
Faster than the conventional O(n³) algorithm for sufficiently large matrices.
🛠 Python (Conceptual Implementation)
```python
import numpy as np

def strassen(A, B):
    n = A.shape[0]
    if n == 1:                       # base case: 1x1 matrices
        return A * B

    mid = n // 2                     # split each matrix into 4 quadrants
    A11, A12 = A[:mid, :mid], A[:mid, mid:]
    A21, A22 = A[mid:, :mid], A[mid:, mid:]
    B11, B12 = B[:mid, :mid], B[:mid, mid:]
    B21, B22 = B[mid:, :mid], B[mid:, mid:]

    # The 7 recursive products
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)

    # Combine into the quadrants of the result C
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6

    top = np.hstack((C11, C12))
    bottom = np.hstack((C21, C22))
    return np.vstack((top, bottom))

# Example usage (n must be a power of 2 for this simple version):
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(strassen(A, B))   # [[19 22], [43 50]]
```
❗ Notes
Works best for large matrices whose size is a power of 2.
For small matrices or odd dimensions, padding to the next power of 2 and switching to naive multiplication below a size cutoff usually works better (a padding sketch follows).
Uses more memory than the naive method because of the extra intermediate sums and differences.
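To illustrate the padding remark, here is a minimal wrapper (my addition; `strassen_padded` is a hypothetical helper, not part of the original notes) that zero-pads both inputs to the next power of 2, reuses the strassen function above, and crops the result back:

```python
def strassen_padded(A, B):
    # Next power of 2 that fits every dimension of A and B
    n = max(A.shape[0], A.shape[1], B.shape[0], B.shape[1])
    m = 1 << (n - 1).bit_length()
    Ap = np.zeros((m, m), dtype=A.dtype)
    Bp = np.zeros((m, m), dtype=B.dtype)
    Ap[:A.shape[0], :A.shape[1]] = A
    Bp[:B.shape[0], :B.shape[1]] = B
    # Zero padding leaves the true product in the top-left block
    return strassen(Ap, Bp)[:A.shape[0], :B.shape[1]]

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # 3x3: not a power of 2
B = np.eye(3, dtype=int)
print(strassen_padded(A, B))   # multiplying by the identity recovers A
```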