Union-Find Data Structure
Kruskal's Algorithm for Minimum Cost Spanning Tree (MCST)
Process the edges in ascending order of cost
If edge (u, v) does not create a cycle, add it
(u, v) can be added if u and v are in different components
Adding edge (u, v) merges these components
How can we keep track of components and merge them efficiently?
Components partition vertices
Collection of disjoint sets
Need data structure to maintain collection of disjoint sets
find(v) - return set containing v
union(u, v) - merge sets of u, v
Union-Find Data Structure
A set S partitioned into components {C1, C2, ..., Ck}
Each s ∈ S belongs to exactly one Cj
Support the following operations
MakeUnionFind(S) - set up initial singleton components {s}, for each s ∈ S
Find(s) - return the component containing s
Union(s, s') - merge the components containing s and s'
Naive Implementation
Assume S = {0, 1, . . . , n − 1}
Set up an array/dictionary Component
MakeUnionFind(S)
    Set Component[i] = i for each i
Find(i)
    Return Component[i]
Union(i, j)
    c_old = Component[i]
    c_new = Component[j]
    for k in range(n):
        if Component[k] == c_old:
            Component[k] = c_new
Complexity
MakeUnionFind(S) - O(n)
Find(i) - O(1)
Union(i, j) - O(n)
Sequence of m Union() operations takes time O(mn)
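As a concrete sketch, a hypothetical Python rendering of this naive scheme (assuming S = {0, 1, ..., n−1}; the function names follow the slides):

def make_union_find(n):
    # Component[i] = label of the set containing i; initially i itself
    return {i: i for i in range(n)}

def find(component, i):
    # O(1): a single lookup
    return component[i]

def union(component, i, j):
    # O(n): relabel every element of i's set with j's label
    c_old, c_new = component[i], component[j]
    for k in component:
        if component[k] == c_old:
            component[k] = c_new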
Improved Implementation
Another array/dictionary Members
For each component c, Members[c] is a list of its members
Size[c] = length(Members[c]) is the number of members
MakeUnionFind(S)
    Set Component[i] = i for all i
    Set Members[i] = [i], Size[i] = 1 for all i
Find(i)
    Return Component[i]
Union(i, j)
    c_old = Component[i]
    c_new = Component[j]
    for k in Members[c_old]:
        Component[k] = c_new
        Members[c_new].append(k)
        Size[c_new] += 1
Why does this help?
Members[c_old] allows us to merge Component[i] into Component[j] in time O(Size[c_old]) rather than O(n)
How can we make use of Size[c]?
Always merge smaller component into the larger one
If Size[c] < Size[c'], re-label c as c'; else re-label c' as c
Individual merge operations can still take time O(n)
Both Size[c], Size[c'] could be about n/2
More careful accounting
Always merge smaller component into the larger one
For each i, the size of Component[i] at least doubles each time i is re-labelled
After m Union() operations, at most 2m elements have been "touched"
Size of Component[i] is at most 2m
Size of Component[i] grows as 1, 2, 4, ..., so i changes component at most log m times
Over m updates
At most 2m elements are re-labelled
Each one at most O(log m) times
Overall, m Union() operations take time O(m log m)
Works out to time O(log m) per Union() operation
Amortized complexity of Union() is O(log m)
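A sketch of the full scheme with Members, Size, and the merge-smaller-into-larger rule (a hypothetical rendering that replaces the naive union above):

def make_union_find(n):
    component = {i: i for i in range(n)}
    members = {i: [i] for i in range(n)}
    size = {i: 1 for i in range(n)}
    return (component, members, size)

def union(component, members, size, i, j):
    ci, cj = component[i], component[j]
    if ci == cj:
        return
    if size[ci] > size[cj]:
        ci, cj = cj, ci          # make ci the smaller component
    for k in members[ci]:        # relabel only the smaller side
        component[k] = cj
    members[cj].extend(members[ci])
    size[cj] += size[ci]
    del members[ci], size[ci]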
Back to Kruskal's Algorithm
Sort E = {e0, e1, ..., em−1} in ascending order
MakeUnionFind(V) - each vertex j is in component j
Adding an edge ek = (u, v) to the tree
Check that Find(u) != Find(v)
Merge components: Union(u, v)
Tree has n − 1 edges, so O(n) Union() operations
O(n log n) amortized cost overall
Sorting E takes O(m log m)
Equivalently, O(m log n), since m ≤ n²
Overall time is O((m + n) log n)
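Putting the pieces together, a sketch of Kruskal's algorithm that reuses the union-by-size sketch above (the edge-list input format (cost, u, v) is an assumption):

def kruskal(n, edges):
    # edges: list of (cost, u, v) triples; vertices are 0, ..., n-1
    (component, members, size) = make_union_find(n)
    tree = []
    for (cost, u, v) in sorted(edges):      # ascending order of cost
        if component[u] != component[v]:    # Find(u) != Find(v): no cycle
            tree.append((u, v, cost))
            union(component, members, size, u, v)
    return tree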
Summary
Implement Union-Find using arrays/dictionaries Component, Members, Size
MakeUnionFind(S) is O(n)
Find(i) is O(1)
Across m operations, amortized complexity of each Union() operation is O(log m)
Can also maintain Members[k] as a tree rather than as a list
Union() becomes O(1)
Priority Queues
Dealing with Priorities
Job Scheduler
A job scheduler maintains a list of pending jobs with their priorities
When the processor is free, the scheduler picks out the job with maximum priority in the list and schedules it
New jobs may join the list at any time
How should the scheduler maintain the list of pending jobs and their priorities?
Priority Queue
Need to maintain a collection of items with priorities to optimize the following operations
delete_max()
Identify and remove item with the highest priority
Need not be unique
insert()
Add a new item to the collection
Implementing Priority Queues with one-dimensional structures
delete_max()
Identify and remove item with highest priority
Need not be unique
insert()
Add a new item to the list
Unsorted list
insert() is $O(1)$
delete_max() is $O(n)$
Sorted list
delete_max() is $O(1)$
insert() is $O(n)$
Processing $N$ items requires $O(N^2)$
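Two minimal illustrations with plain Python lists (variable names are arbitrary):

# Unsorted list: insert() is O(1), delete_max() is O(n)
PQ = []
PQ.append(37)            # insert(): just append
m = max(PQ)              # delete_max(): scan for the maximum ...
PQ.remove(m)             # ... then remove it

# Sorted list: delete_max() is O(1), insert() is O(n)
import bisect
SQ = [3, 9, 14]
bisect.insort(SQ, 7)     # insert(): shift elements to keep order
m = SQ.pop()             # delete_max(): largest value is at the end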
Moving to 2 dimensions
First Attempt
Assume $N$ processes enter/leave the queue
Maintain a $\sqrt{N} \times \sqrt{N}$ array
Each row is in sorted order
Example with N = 25: a 5 × 5 array, each row sorted, rows partially filled
 3 19 23 35 58
12 17 25 43 67
10 13 20
11 16 28 49
 6 14
Summary
2D $\sqrt{N} \times \sqrt{N}$ array with sorted rows
insert() is $O(\sqrt{N})$
delete_max() is $O(\sqrt{N})$
Processing $N$ items is $O(N \sqrt{N})$
Can we do better than this?
Maintain a special binary tree - heap
Height $O(\log N)$
insert() is $O(\log N)$
delete_max() is $O(\log N)$
Processing $N$ items is $O(N \log N)$
Flexible - need not fix $N$ in advance
Heaps
Priority Queue
Need to maintain a collection of items with priorities to optimize the following operations
delete_max()
Identify and remove item with highest priority
Need not be unique
insert()
Add a new item to the list
Maintaining the collection as a list incurs cost O(N²) across N inserts and deletions
Using a √N × √N array reduces the cost to O(√N) per operation
O(N√N) across N inserts and deletions
Binary Trees
Values are stored as nodes in a rooted tree
Each node has up to two children
Left child and Right child
Order is important
Other than the root, each node has a unique parent
Leaf node - no children
Size - number of nodes
Height - number of levels
Heap
Binary tree filled level-by-level, left-to-right
The value at each node is at least as big as the values of its children
max-heap
Root always has the largest value
By induction, because of the max-heap property
Complexity of insert()
Need to walk up from the leaf to the root
Height of the tree
Number of nodes at level $0$ is $2^0 = 1$
If we fill $k$ levels, $2^0 + 2^1 + ... + 2^{k - 1} = 2^k - 1$ nodes
If we have $N$ nodes, at most $1 + \log N$ levels
insert() is $O(\log N)$
delete_max()
Maximum value is always at the root
After we delete one value, tree shrinks
Node to delete is right-most at lowest level
Move "homeless" value to the root
Restore the heap property downwards
Only need to follow a single path down
Again $O(\log N)$
Implementation
Number the nodes top to bottom, left to right
Store as a list H = [h0, h1, h2, ..., h9]
Children of H[i] are at H[2*i + 1], H[2*i + 2]
Parent of H[i] is at H[(i - 1)//2], for i > 0
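A sketch of the two heap operations on a Python list, using this index arithmetic (the names heap_insert and heap_delete_max are illustrative):

def heap_insert(H, v):
    # Add v at the end, then bubble it up towards the root
    H.append(v)
    i = len(H) - 1
    while i > 0 and H[i] > H[(i - 1)//2]:
        H[i], H[(i - 1)//2] = H[(i - 1)//2], H[i]
        i = (i - 1)//2

def heap_delete_max(H):
    # Assumes H is a non-empty max-heap
    maxval = H[0]
    H[0] = H[-1]         # move the "homeless" last value to the root
    H.pop()
    i = 0
    while True:          # push the new root down along one path
        left, right = 2*i + 1, 2*i + 2
        largest = i
        if left < len(H) and H[left] > H[largest]:
            largest = left
        if right < len(H) and H[right] > H[largest]:
            largest = right
        if largest == i:
            return maxval
        H[i], H[largest] = H[largest], H[i]
        i = largest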
Building a heap - heapify()
Convert a list [v0, v1, ..., vN] into a heap
Simple strategy
Start with an empty heap
Repeatedly apply insert(vj)
Total time is $O(N \log N)$
Better heapify()
List L = [v0, v1, ..., vN]
mid = len(L)//2; the slice L[mid:] has only leaf nodes
Already satisfy the heap condition
Fix heap property downwards for second last level
Fix heap property downwards for third last level
...
Fix heap property at level 1
Fix heap property at the root
Each time we go up one level, one extra step per node to fix the heap property
However, number of nodes to fix halves
Second last level, $n/4 \times 1$ steps
Third last level, $n/8 \times 2$ steps
Fourth last level, $n/16 \times 3$ steps
...
Cost turns out to be $O(n)$
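A sketch of this bottom-up strategy (same list conventions as above; this heapify is a standalone function, not the library one):

def heapify(L):
    # Fix the heap property downwards at each non-leaf node,
    # working upwards from the second-last level to the root
    n = len(L)
    for i in range(n//2 - 1, -1, -1):
        j = i
        while True:
            left, right = 2*j + 1, 2*j + 2
            largest = j
            if left < n and L[left] > L[largest]:
                largest = left
            if right < n and L[right] > L[largest]:
                largest = right
            if largest == j:
                break
            L[j], L[largest] = L[largest], L[j]
            j = largest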
Summary
Heaps are a tree implementation of priority queues
insert() is $O(\log N)$
delete_max() is $O(\log N)$
heapify() builds a heap in $O(N)$
Can invert the heap condition
Each node is smaller than its children
min-heap
delete_min() rather than delete_max()
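For min-heaps in particular, Python's standard library module heapq implements these operations directly on a list; a small example:

import heapq

H = [12, 3, 7, 25, 9]
heapq.heapify(H)          # O(n): rearrange the list into a min-heap
heapq.heappush(H, 1)      # O(log n) insert
print(heapq.heappop(H))   # O(log n) delete_min; prints 1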
Using Heaps in Algorithms
Priority Queues and Heaps
Priority Queues support the following operations
insert()
delete_max() or delete_min()
Heaps are a tree-based implementation of priority queues
insert(), delete_max()/delete_min() are both O(log n)
heapify() builds a heap from a list/array in time O(n)
Heap can be represented as a list/array
Simple index arithmetic to find parent and children of a node
What more do we need to use a heap in an algorithm?
Dijkstra's Algorithm
Maintain 2 dictionaries with vertices as keys
visited, initially False for all v
distance, initially infinity for all v
Set distance[s] to 0 for the source vertex s
Repeat, until all the reachable vertices are visited
Find unvisited vertex nextv with minimum distance
Set visited[nextv] to True
Re-compute distance[v] for every neighbour v of nextv
import numpy as np

def dijkstra(WMat, s):
    # WMat[i,j,0] == 1 if edge (i,j) exists; WMat[i,j,1] is its weight
    (rows, cols, x) = WMat.shape
    infinity = np.max(WMat)*rows + 1
    (visited, distance) = ({}, {})
    for v in range(rows):
        (visited[v], distance[v]) = (False, infinity)
    distance[s] = 0
    for u in range(rows):
        nextd = min([distance[v] for v in range(rows)
                     if not visited[v]])
        nextvlist = [v for v in range(rows)
                     if (not visited[v]) and distance[v] == nextd]
        if nextvlist == []:
            break
        nextv = min(nextvlist)
        visited[nextv] = True
        for v in range(cols):
            if WMat[nextv, v, 0] == 1 and (not visited[v]):
                distance[v] = min(distance[v],
                                  distance[nextv] + WMat[nextv, v, 1])
    return distance
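A quick check on a small made-up graph, using the adjacency-matrix encoding the function expects:

import numpy as np

# Directed graph on 4 vertices with edges (0,1,10), (0,2,80), (1,2,6), (2,3,5)
edges = [(0, 1, 10), (0, 2, 80), (1, 2, 6), (2, 3, 5)]
W = np.zeros((4, 4, 2), dtype=int)
for (u, v, w) in edges:
    W[u, v, 0], W[u, v, 1] = 1, w

print(dijkstra(W, 0))   # expect {0: 0, 1: 10, 2: 16, 3: 21}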
Bottleneck
Find unvisited vertex j with minimum distance
Naive implementation requires an O(n) scan
Maintain unvisited vertices as a min-heap
delete_min() in O(log n) time
But, also need to update distances of the neighbours
Unvisited neighbour's distances are inside the min-heap
Updating a value is not a basic heap operation
Heap sort
Start with an un-ordered list
Build a heap - O(n)
Call delete_max() n times to extract elements in descending order - O(n log n)
After each delete_max() , heap shrinks by 1
Store maximum value at the end of current heap
In-place O(n log n) sort
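A sketch of in-place heap sort, reusing the heapify sketch above plus the same push-down step after each extraction:

def heap_sort(L):
    heapify(L)                       # O(n) build, as above
    for end in range(len(L) - 1, 0, -1):
        # Move the current maximum to the end of the shrinking heap
        L[0], L[end] = L[end], L[0]
        # Push the new root down within L[0:end]
        j = 0
        while True:
            left, right = 2*j + 1, 2*j + 2
            largest = j
            if left < end and L[left] > L[largest]:
                largest = left
            if right < end and L[right] > L[largest]:
                largest = right
            if largest == j:
                break
            L[j], L[largest] = L[largest], L[j]
            j = largest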
Summary
Updating a value in a heap takes O(log n)
Need to maintain additional pointers to map values to heap positions and vice versa
With this extended notion of heap, Dijkstra's algorithm complexity improves from O(n²) to O((m + n) log n)
Heaps can also be used to sort a list in place in O(n log n)
Search Trees
Dynamic Sorted Data
Sorting is useful for efficient searching
What if the data is changing dynamically?
Items are periodically inserted and deleted
Insert/delete in a sorted list takes time O(n)
Move to a tree structure, like heaps for priority queues
Binary Search Tree
For each node with the value v
All values in the left sub-tree are < v
All values in the right sub-tree are > v
No duplicate values
Implementing a Binary Search Tree
Each node has a value and pointers to its children
Add a frontier of empty nodes, with all fields None
Empty tree is single empty node
Leaf node points to empty nodes
Easier to implement operations recursively
The class Tree
Three local fields: value, left, right
Value is None for an empty node
Empty tree has all fields None
Leaf has a non-empty value and empty left and right sub-trees
class Tree:
    # Constructor
    def __init__(self, init_val = None):
        self.value = init_val
        if self.value is not None:
            self.left = Tree()
            self.right = Tree()
        else:
            self.left = None
            self.right = None
        return

    # Only the empty node has value None
    def is_empty(self):
        return self.value is None

    # Leaf nodes have a value and two empty children
    def is_leaf(self):
        return self.value is not None and self.left.is_empty() and self.right.is_empty()
In-order traversal
List the left sub-tree, then the current node, then the right sub-tree
Lists values in sorted order
Use to print the tree
class Tree:
    ...
    # In-order traversal
    def in_order(self):
        if self.is_empty():
            return []
        else:
            return self.left.in_order() + [self.value] + self.right.in_order()

    # Display the tree as a string
    def __str__(self):
        return str(self.in_order())
Find a value v
Check value at current node
If v is smaller than the current node, go left
If v is greater than the current node, go right
Natural generalization of binary search
class Tree:
    ...
    # Check if the value v occurs in the tree
    def find(self, v):
        if self.is_empty():
            return False
        if self.value == v:
            return True
        if v < self.value:
            return self.left.find(v)
        if v > self.value:
            return self.right.find(v)
Minimum and Maximum
Minimum is the left-most node in the tree
Maximum is the right-most node in the tree
class Tree:
    ...
    def min_val(self):
        if self.left.is_empty():
            return self.value
        else:
            return self.left.min_val()

    def max_val(self):
        if self.right.is_empty():
            return self.value
        else:
            return self.right.max_val()
Insert a value v
Try to find v
Insert at the position where find fails
class Tree:
    ...
    def insert(self, v):
        if self.is_empty():
            # Turn this empty node into a leaf holding v
            self.value = v
            self.left = Tree()
            self.right = Tree()
            return
        if self.value == v:
            return
        if v < self.value:
            self.left.insert(v)
            return
        if v > self.value:
            self.right.insert(v)
            return
Delete a value v
If v is present, delete
Leaf node? No problem
If only one child, promote that sub-tree
Otherwise, replace v with self.left.max_val() and delete self.left.max_val() from the left sub-tree
self.left.max_val() has no right child, so it is easy to delete
class Tree:
    ...
    def delete(self, v):
        if self.is_empty():
            return
        if v < self.value:
            self.left.delete(v)
            return
        if v > self.value:
            self.right.delete(v)
            return
        if v == self.value:
            if self.is_leaf():
                self.make_empty()
            elif self.left.is_empty():
                self.copy_right()
            elif self.right.is_empty():
                self.copy_left()
            else:
                self.value = self.left.max_val()
                self.left.delete(self.value)
            return

    # Convert a leaf node into an empty node
    def make_empty(self):
        self.value = None
        self.left = None
        self.right = None
        return

    # Promote the left child
    def copy_left(self):
        self.value = self.left.value
        self.right = self.left.right
        self.left = self.left.left
        return

    # Promote the right child
    def copy_right(self):
        self.value = self.right.value
        self.left = self.right.left
        self.right = self.right.right
        return
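A short, hypothetical exercise of the finished class:

T = Tree()
for v in [4, 2, 6, 1, 3, 5, 7]:
    T.insert(v)
print(T)           # in-order: [1, 2, 3, 4, 5, 6, 7]
print(T.find(5))   # True
T.delete(4)        # both children non-empty: 4 is replaced by 3
print(T)           # [1, 2, 3, 5, 6, 7]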
Summary
find(), insert() and delete() all walk down a single path
Worst-case: height of the tree
An unbalanced tree with n nodes may have height O(n)
Balanced trees have height O(log n)
We will see how to keep a tree balanced to ensure all operations remain O(log n)