UNIT-5 21CSC201J
UNIT-5
GRAPH
SYLLABUS
Introduction to Graph, Graph Traversal, Topological sorting, Minimum spanning tree – Prim’s
Algorithm, Kruskal’s Algorithm, Shortest Path Algorithm – Dijkstra’s Algorithm
1. Introduction to Graph:
• A graph is a non-linear data structure which is a collection of vertices (also called nodes) and
edges that connect these vertices.
• The set of edges describes relationships among the vertices.
• A graph is often viewed as a generalization of the tree structure, where instead of a purely
parent-to-child relationship between tree nodes, any kind of complex relationships between
the nodes can exist.
• A node in the graph may represent a city and the edges connecting the nodes can represent
roads.
• A graph can also be used to represent a computer network where the nodes are workstations
and the edges are the network connections.
• Graphs have so many applications in computer science and mathematics that several
algorithms have been written to perform the standard graph operations, such as searching the
graph and finding the shortest path between the nodes of a graph.
Types of a graph
Directed vs Undirected
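• When the edges in a graph have a direction, the graph is called directed. Each edge is an ordered
pair, e.g. (V1,V2), (V2,V3) and (V3,V1), so (V1,V2) leads from V1 to V2 only. Example: route network
Figure 2: Directed and Undirected Graph
• When the edges in a graph have no direction, the graph is called undirected. Each edge is an
unordered pair, e.g. (V1,V2), (V2,V3) and (V3,V1), so (V1,V2) and (V2,V1) denote the same edge.
Example: flight network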
Bipartite graph
• A bipartite graph is a graph whose vertices can be divided into two sets such that all
edges connect a vertex in one set with a vertex in the other set.
Figure: Bipartite Graph
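The definition above can be checked mechanically by trying to colour the vertices with two colours so that no edge joins two vertices of the same colour. The following is a minimal sketch, assuming the graph is given as a Python dictionary mapping each vertex to a list of its neighbours; the function name `is_bipartite` and the two sample graphs are illustrative and not taken from the figure. It uses a breadth-first search, which is covered in the Graph Traversal section.

```python
from collections import deque

def is_bipartite(adj):
    """Return True if the undirected graph `adj` can be split into two sets
    such that every edge joins a vertex of one set to a vertex of the other."""
    colour = {}                      # vertex -> 0 or 1 (the two sets)
    for start in adj:                # loop handles disconnected graphs
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]   # put v in the opposite set
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return False                # edge inside one set
    return True

# Hypothetical examples: a 4-cycle is bipartite, a triangle is not.
square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
square_ok = is_bipartite(square)      # True
triangle_ok = is_bipartite(triangle)  # False
```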
• Graphs with all edges present are called complete graphs. Graphs with relatively few edges
are called sparse graphs.
Figure 7: DAG
Path
• A path in a graph is a sequence of adjacent vertices. A simple path is a path with no repeated
vertices. In the graph below, the dotted lines represent a path from G to E.
Figure 8: Path
Degree of a node
• The number of edges incident on a vertex determines its degree. The degree of a vertex v is
written as degree(v). In a directed graph, the in-degree of a vertex v is the number of edges
entering v. Similarly, the out-degree of v is the number of edges leaving v.
Figure 9: Degree of a node – in-degree of node v1 is 1 and out-degree of node v1 is 2
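In-degree and out-degree can be counted in one pass over the edge list. This is a small sketch, assuming a directed graph stored as a list of (from, to) pairs; the helper name `degrees` and the three sample edges are hypothetical, chosen only so that v1 ends up with in-degree 1 and out-degree 2 as in the caption.

```python
def degrees(vertices, edges):
    """Return (in_degree, out_degree) dictionaries for a directed graph."""
    in_degree = {v: 0 for v in vertices}
    out_degree = {v: 0 for v in vertices}
    for u, v in edges:        # edge u -> v
        out_degree[u] += 1    # the edge leaves u
        in_degree[v] += 1     # the edge enters v
    return in_degree, out_degree

# Hypothetical directed graph: v1 -> v2, v1 -> v3, v3 -> v1
vertices = ["v1", "v2", "v3"]
edges = [("v1", "v2"), ("v1", "v3"), ("v3", "v1")]
in_degree, out_degree = degrees(vertices, edges)
# in_degree["v1"] is 1 and out_degree["v1"] is 2
```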
Graph Representation
There are two common ways of storing graphs in the computer’s memory: the adjacency matrix and the
adjacency list.
• An adjacency matrix is used to represent which nodes are adjacent to one another. By definition,
two nodes are said to be adjacent if there is an edge connecting them.
• The adjacency matrix of an undirected graph is symmetric. The memory use of an adjacency matrix
is O(n²), where n is the number of nodes in the graph.
Figure 11: Graphs and their corresponding adjacency matrices
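As an illustration of the adjacency matrix, here is a minimal sketch assuming vertices are numbered 0 to n-1 and edges are given as pairs; the function name `adjacency_matrix` and the triangle graph are illustrative, not the graphs of Figure 11. It also shows why the undirected matrix is symmetric: each edge is recorded in both directions.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n matrix where entry [u][v] is 1 if there is an edge u-v."""
    matrix = [[0] * n for _ in range(n)]   # n * n cells, hence O(n^2) memory
    for u, v in edges:
        matrix[u][v] = 1
        if not directed:
            matrix[v][u] = 1               # mirror entry makes it symmetric
    return matrix

# Hypothetical triangle on vertices 0, 1, 2
triangle = [(0, 1), (1, 2), (2, 0)]
undirected = adjacency_matrix(3, triangle)
directed = adjacency_matrix(3, triangle, directed=True)
# undirected == [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
# directed   == [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
```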
In an adjacency list, every node is linked to its own list, which contains the names of all other
nodes that are adjacent to it.
▪ It is easy to follow and clearly shows the adjacent nodes of a particular node.
▪ It is often used for storing graphs that have a small-to-moderate number of edges. That is, an
adjacency list is preferred for representing sparse graphs in the computer’s memory
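An adjacency list can be sketched with a dictionary of lists. The code below is one possible layout, assuming named vertices and an edge list; `adjacency_list` and the four-vertex graph are hypothetical. Memory grows with the number of vertices plus the number of edges, which is why this form suits sparse graphs.

```python
def adjacency_list(vertices, edges, directed=False):
    """Map every vertex to the list of vertices adjacent to it."""
    adj = {v: [] for v in vertices}   # isolated vertices keep an empty list
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)          # undirected: record both directions
    return adj

# Hypothetical undirected graph with edges A-B, A-C and C-D
graph = adjacency_list("ABCD", [("A", "B"), ("A", "C"), ("C", "D")])
# graph == {"A": ["B", "C"], "B": ["A"], "C": ["A", "D"], "D": ["C"]}
```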
2. Graph Traversal
• To solve problems on graphs, we need a mechanism for traversing the graphs. Graph traversal
algorithms are also called graph search algorithms.
• Two such algorithms for traversing graphs are Depth First Search [DFS] and Breadth First
Search [BFS].
• The DFS algorithm is a recursive algorithm that uses the idea of backtracking.
• The depth-first search algorithm progresses by expanding the starting node of G and then
going deeper and deeper until the goal node is found, or until a node that has no children is
encountered.
• DFS is usually implemented using a Stack
• When a dead-end is reached, the algorithm backtracks, returning to the most recent node
that has not been completely explored.
• In other words, depth-first search begins at a starting node A which becomes the current node.
Then, it examines each node N along a path P which begins at A. That is, we process a
neighbour of A, then a neighbour of neighbour of A, and so on.
• During the execution of the algorithm, if we reach a path that has a node N that has already
been processed, then we backtrack to the current node. Otherwise, the unvisited
(unprocessed) node becomes the current node.
• The algorithm proceeds like this until we reach a dead-end (end of path P). On reaching the
dead end, we backtrack to find another path P′.
• The algorithm terminates when backtracking leads back to the starting node A. In this
algorithm, edges that lead to a new vertex are called discovery edges and edges that lead to
an already visited vertex are called back edges.
DFS Example:
Let us start from vertex 0. The DFS algorithm starts by putting it in the Visited list and putting all
its adjacent vertices in the stack.
Next, we visit the element at the top of the stack, i.e. 1, and go to its adjacent nodes. Since 0 has
already been visited, we visit 2 instead.
Vertex 2 has an unvisited adjacent vertex, 4, so we add that to the top of the stack and visit it.
Next, we visit the element at the top of the stack, i.e. 4. Its only adjacent node (i.e. 2) is already
visited, so we simply mark node 4 as visited.
The last element, 3, has no unvisited adjacent nodes, so once we have visited it we have completed
the Depth First Traversal of the graph.
Pseudocode:
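A minimal Python sketch of the stack-based DFS described above, offered as one possible rendering rather than the course's own pseudocode. The adjacency list `adj` is an assumption reconstructed from the walkthrough (edges 0-1, 0-2, 0-3, 1-2 and 2-4), and the name `dfs` is illustrative.

```python
def dfs(adj, start):
    """Return the vertices reachable from `start` in depth-first order."""
    visited = []          # order in which vertices are visited
    seen = set()          # fast membership test for visited vertices
    stack = [start]
    while stack:
        node = stack.pop()            # take the most recently added vertex
        if node in seen:
            continue                  # already processed via another path
        seen.add(node)
        visited.append(node)
        # Push neighbours in reverse so the first-listed one is popped first.
        for neighbour in reversed(adj[node]):
            if neighbour not in seen:
                stack.append(neighbour)
    return visited

# Graph assumed from the DFS example: 0-1, 0-2, 0-3, 1-2, 2-4
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 4], 3: [0], 4: [2]}
dfs_order = dfs(adj, 0)   # [0, 1, 2, 4, 3]
```

Each vertex is marked once and each adjacency list is scanned once, which matches the O(V + E) bound for an adjacency-list implementation.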
Time complexity
• 𝑂(𝑉+𝐸) when implemented using an adjacency list. The algorithm visits every vertex at least
once, which accounts for the V term, and scans the adjacency list of each vertex, which accounts
for the E term.
Applications of DFS
• Topological sorting
• Finding connected components
• Finding articulation points (cut vertices) of the graph
• Finding strongly connected components
• Solving puzzles such as mazes
• Breadth-first search (BFS) is a graph search algorithm that begins at the root node and explores
all the neighbouring nodes. Then for each of those nearest nodes, the algorithm explores their
unexplored neighbour nodes, and so on, until it finds the goal.
• That is, we start examining the node A and then all the neighbours of A are examined. In the
next step, we examine the neighbours of neighbours of A, so on and so forth. This means that
we need to track the neighbours of the node and guarantee that every node in the graph is
processed and no node is processed more than once.
• This is accomplished by using a queue that will hold the nodes that are waiting for further
processing.
BFS Example:
We start from vertex 0. The BFS algorithm starts by putting it in the Visited list and putting all its
adjacent vertices in the queue.
Next, we visit the element at the front of the queue, i.e. 1, and go to its adjacent nodes. Since 0
has already been visited, we visit 2 instead.
Vertex 2 has an unvisited adjacent vertex, 4, so we add that to the back of the queue and visit 3,
which is at the front of the queue.
Only 4 remains in the queue, since the only adjacent node of 3, i.e. 0, is already visited.
Finally, we visit 4, the last remaining item in the queue, and check whether it has any unvisited
neighbours.
Pseudocode:
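A minimal Python sketch of the queue-based BFS described above, offered as one possible rendering rather than the course's own pseudocode. The adjacency list `adj` is an assumption reconstructed from the walkthrough (edges 0-1, 0-2, 0-3, 1-2 and 2-4), and the name `bfs` is illustrative.

```python
from collections import deque

def bfs(adj, start):
    """Return the vertices reachable from `start` in breadth-first order."""
    visited = [start]         # order in which vertices are discovered
    seen = {start}            # ensures no vertex is processed twice
    queue = deque([start])    # vertices waiting for further processing
    while queue:
        node = queue.popleft()            # take from the front of the queue
        for neighbour in adj[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                visited.append(neighbour)
                queue.append(neighbour)   # add to the back of the queue
    return visited

# Graph assumed from the BFS example: 0-1, 0-2, 0-3, 1-2, 2-4
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 4], 3: [0], 4: [2]}
bfs_order = bfs(adj, 0)   # [0, 1, 2, 3, 4]
```

Compared with a stack-based DFS, the only structural change is the container: a queue makes the search finish all neighbours of a vertex before moving one level deeper.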
Time complexity
The time complexity of BFS is O(V + E), where V is the number of nodes and E is the number of edges.
Applications of BFS
3. Topological Sorting
Topological sort is an ordering of vertices in a directed acyclic graph [DAG] in which each node
comes before all nodes to which it has outgoing edges. As an example, consider the course
prerequisite structure at universities.
A directed edge (v,w) indicates that course v must be completed before course w. Topological
ordering for this example is the sequence which does not violate the prerequisite requirement.
Every DAG has at least one topological ordering, and may have several. Topological sort is not
possible if the graph has a cycle, since for two vertices v and w on the cycle, v must precede w and
w must precede v.
Once a node is added to the topological ordering, we can take the node, and its outgoing edges,
out of the graph.
Then, we can repeat our earlier approach: look for any node with an indegree of zero and add it
to the ordering.
E is a node with an indegree of 0, add it to our topological ordering and remove it from the graph:
and repeat
and repeat
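The "find a node with indegree zero, output it, remove its outgoing edges, repeat" procedure is commonly known as Kahn's algorithm. Below is a minimal sketch, assuming the DAG is a dictionary mapping every vertex to the list of vertices it points to; the function name and the course names are hypothetical and are not the graph from the figures.

```python
from collections import deque

def topological_sort(adj):
    """Return one topological ordering of the DAG `adj` (Kahn's algorithm).
    Every vertex must appear as a key, even if it has no outgoing edges."""
    indegree = {v: 0 for v in adj}
    for v in adj:
        for w in adj[v]:
            indegree[w] += 1
    ready = deque(v for v in adj if indegree[v] == 0)   # indegree-zero nodes
    order = []
    while ready:
        v = ready.popleft()
        order.append(v)
        for w in adj[v]:          # "remove" v's outgoing edges
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)
    if len(order) != len(adj):    # some nodes never reached indegree zero
        raise ValueError("graph has a cycle; no topological ordering exists")
    return order

# Hypothetical prerequisite structure: an edge (v, w) means v comes before w
courses = {
    "Programming": ["Data Structures"],
    "Maths": ["Algorithms"],
    "Data Structures": ["Algorithms"],
    "Algorithms": [],
}
course_order = topological_sort(courses)
# ["Programming", "Maths", "Data Structures", "Algorithms"]
```

Computing the indegrees and processing each edge once gives a running time of O(V + E) for this sketch.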
Time complexity
Spanning Tree:
A minimum spanning tree (MST) is defined as a spanning tree with weight less than or equal
to the weight of every other spanning tree. In other words, a minimum spanning tree is a
spanning tree that has weights associated with its edges, and the total weight of the tree (the
sum of the weights of its edges) is at a minimum.
Example 1:
Consider an unweighted graph G given below. From G, we can draw many distinct spanning
trees. Eight of them are given here. For an unweighted graph, every spanning tree is a
minimum spanning tree.
Example 2:
Consider a weighted graph G given below. From G, we can draw three distinct spanning trees.
But only a single minimum spanning tree can be obtained, that is, the one that has the
minimum weight (cost) associated with it.
Of all the spanning trees given in Figure, the one that is highlighted is called the minimum
spanning tree, as it has the lowest cost associated with it.
Prim’s Algorithm
Algorithm
Example
Generate a minimum spanning tree structure for the Graph G(V, E) given below containing 9 vertices
and 12 edges. We are supposed to create a minimum spanning tree T(V’, E’) for G(V, E) such that the
number of vertices in T will be 9 and edges will be 8.
Choosing A as starting vertex,
After the inclusion of node A, look into the connected edges going outward from node A and pick the
one with a minimum edge weight to include it in your T(V’, E’) structure.
Now, node B is reached. From node B, there are two possible edges out of which edge BD has the
least edge weight value. So, you will include it in your MST.
From node D, there is only one edge. So, you will include it in your MST. Further, node H has
two incident edges. Out of those two edges, edge HI has the least cost, so you will include it in the
MST structure.
• Similarly, nodes G and E are included in the MST.
After that, nodes E and C will get included. Now, from node C, there are two incident edges. Edge CA
has the smallest edge weight, but its inclusion would create a cycle in the tree structure, which
cannot be allowed. Thus, we discard edge CA as shown in the image below.
• The sum of all the edge weights in the MST T(V’, E’) is equal to 30, which is the least
possible total weight of any spanning tree for this particular graph.
Time complexity
The running time of Prim’s algorithm can be given as O(E log V) where E is the number of edges and
V is the number of vertices in the graph.
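The O(E log V) bound corresponds to keeping the candidate edges in a binary heap. The following is a minimal sketch of that variant, assuming an undirected graph stored as a dictionary from each vertex to a list of (weight, neighbour) pairs. The function name `prim_mst` and the four-vertex graph are hypothetical; the edge weights of the nine-vertex example are not listed in the text, so that graph is not reproduced here.

```python
import heapq

def prim_mst(adj, start):
    """Grow an MST from `start`; return (list of (u, v, weight), total weight)."""
    in_tree = {start}
    mst_edges = []
    total = 0
    heap = [(w, start, v) for w, v in adj[start]]   # edges leaving the tree
    heapq.heapify(heap)
    while heap and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(heap)    # cheapest edge leaving the tree
        if v in in_tree:
            continue                     # both ends in the tree: would form a cycle
        in_tree.add(v)
        mst_edges.append((u, v, w))
        total += w
        for w2, x in adj[v]:             # new candidate edges from v
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return mst_edges, total

# Hypothetical weighted graph: A-B 1, A-C 4, B-C 2, B-D 5, C-D 3
adj = {
    "A": [(1, "B"), (4, "C")],
    "B": [(1, "A"), (2, "C"), (5, "D")],
    "C": [(4, "A"), (2, "B"), (3, "D")],
    "D": [(5, "B"), (3, "C")],
}
mst_edges, total = prim_mst(adj, "A")
# mst_edges == [("A", "B", 1), ("B", "C", 2), ("C", "D", 3)], total == 6
```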
Kruskal’s Algorithm
▪ The algorithm starts with V different trees (V is the number of vertices in the graph). While
constructing the minimum spanning tree, Kruskal’s algorithm repeatedly selects the remaining edge
of minimum weight and adds that edge if it doesn’t create a cycle.
▪ So, initially, there are | V | single-node trees in the forest. Adding an edge merges two trees
into one. When the algorithm is completed, there will be only one tree, and that is the
minimum spanning tree.
Example:
The graph contains 9 vertices and 14 edges. So, the minimum spanning tree formed will be having (9
– 1) = 8 edges.
After sorting:
Weight  Source  Destination
1       7       6
2       8       2
2       6       5
4       0       1
4       2       5
6       8       6
7       2       3
7       7       8
8       0       7
8       1       2
9       3       4
10      5       4
11      1       7
14      3       5
Now pick all edges one by one from the sorted list of edges
Step 6: Pick edge 8-6. Since including this edge results in a cycle, discard it. Pick edge 2-3. No
cycle is formed, include it.
Step 7: Pick edge 7-8. Since including this edge results in a cycle, discard it. Pick edge 0-7. No
cycle is formed, include it.
Step 8: Pick edge 1-2. Since including this edge results in a cycle, discard it. Pick edge 3-4. No
cycle is formed, include it.
Since the number of edges included in the MST now equals (V – 1), the algorithm stops here.
Algorithm
Time complexity
The time complexity of Kruskal's Algorithm is O(E log E), where E is the number of edges in the graph.
The cost is dominated by ordering the edges by weight, whether they are sorted up front or extracted
one at a time from a priority queue at O(log E) per extraction.
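Kruskal's algorithm can be sketched with a sort plus a union-find (disjoint set) structure for cycle detection. The edge list below is copied from the sorted table in the example; the function name `kruskal` and the particular union-find variant (path halving, no rank) are choices made for this sketch, not something the notes prescribe.

```python
def kruskal(num_vertices, edges):
    """edges: list of (weight, source, destination) with vertices 0..n-1.
    Return (list of accepted (source, destination, weight), total weight)."""
    parent = list(range(num_vertices))   # each vertex starts as its own tree

    def find(x):
        # Follow parent links to the root of x's tree (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges, key=lambda e: e[0]):   # stable sort by weight
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            continue                     # same tree: edge would create a cycle
        parent[root_u] = root_v          # merge the two trees
        mst.append((u, v, w))
        total += w
        if len(mst) == num_vertices - 1:
            break                        # V - 1 edges: spanning tree complete
    return mst, total

# Edges (weight, source, destination) from the sorted table above
edges = [
    (1, 7, 6), (2, 8, 2), (2, 6, 5), (4, 0, 1), (4, 2, 5), (6, 8, 6), (7, 2, 3),
    (7, 7, 8), (8, 0, 7), (8, 1, 2), (9, 3, 4), (10, 5, 4), (11, 1, 7), (14, 3, 5),
]
mst, total = kruskal(9, edges)
# Accepts 7-6, 8-2, 6-5, 0-1, 2-5, 2-3, 0-7, 3-4 and rejects 8-6, 7-8, 1-2;
# total == 37
```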
Algorithm:
• Declare two arrays − distance [] to store the distances from the source vertex to the other
vertices in graph and visited [] to store the visited vertices.
• Set distance[S] to ‘0’ and distance[v] = ∞, where v represents all the other vertices in the
graph.
• Add S to the visited [] array and find the adjacent vertices of S with the minimum distance.
• The adjacent vertex to S, say A, has the minimum distance and is not in the visited array yet.
A is picked and added to the visited array and the distance of A is changed from ∞ to the
assigned distance of A, say d1, where d1 < ∞.
• Repeat the process for the adjacent vertices of the visited vertices until the shortest path
spanning tree is formed.
Examples:
Step 1:
Initialize the distances of all the vertices as ∞, except the source node S.
Now that the source vertex S is visited, add it into the visited array.
visited = {S}
Step 2:
The vertex S has three adjacent vertices with various distances and the vertex with minimum distance
among them all is A. Hence, A is visited and the dist[A] is changed from ∞ to 6.
S → A = 6
S → D = 8
S → E = 7
Visited = {S, A}
Step 3:
▪ There are two vertices visited in the visited array, therefore, the adjacent vertices must be
checked for both the visited vertices.
▪ Vertex S has two more adjacent vertices to be visited yet: D and E. Vertex A has one adjacent
vertex B.
▪ Calculate the distances from S to D, E, B and select the minimum distance −
Visited = {S, A, E}
Step 4:
▪ Calculate the distances to the vertices adjacent to the visited vertices S, A and E, and select
the vertex with the minimum distance.
S → D = 8
S → B = 15
S → C = S → E + E → C = 7 + 5 = 12
Visited = {S, A, E, D}
Step 5:
Recalculate the distances of the unvisited vertices. If a distance smaller than the existing distance
is found, replace the value in the distance array.
S → C = S → E + E → C = 7 + 5 = 12
S → C = S → D + D → C = 8 + 3 = 11
S → B = S → A + A → B = 6 + 9 = 15
S → B = S → D + D → C + C → B = 8 + 3 + 12 = 23
Visited = { S, A, E, D, C}
Step 6:
The only remaining unvisited vertex in the graph is B, with a minimum distance of 15. It is added to
the output spanning tree.
Visited = {S, A, E, D, C, B}
The shortest path spanning tree is obtained as an output using the Dijkstra’s algorithm.
Time complexity
Time Complexity of Dijkstra's Algorithm is O(V + E log V), where V represents the number of vertices
and E represents the number of edges in the graph.
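The worked example can be reproduced with a heap-based sketch of Dijkstra's algorithm. The edge weights below are an assumption reconstructed from the distances quoted in the steps (S-A 6, S-D 8, S-E 7, A-B 9, E-C 5, D-C 3, C-B 12); the original figure may contain further edges. The function name `dijkstra` and the use of Python's `heapq` in place of the visited[] array scan are choices made for this sketch.

```python
import heapq

def dijkstra(adj, source):
    """Return (shortest distances from source, order in which vertices are finalised).
    adj maps each vertex to a list of (neighbour, weight); weights are non-negative."""
    dist = {v: float("inf") for v in adj}   # distance[v] = infinity initially
    dist[source] = 0
    visited = []
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)          # closest vertex not yet finalised
        if d > dist[u]:
            continue                        # stale entry: a shorter path was found
        visited.append(u)
        for v, w in adj[u]:
            if d + w < dist[v]:             # shorter path to v through u
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist, visited

# Undirected graph assumed from the example's distances
adj = {
    "S": [("A", 6), ("D", 8), ("E", 7)],
    "A": [("S", 6), ("B", 9)],
    "B": [("A", 9), ("C", 12)],
    "C": [("E", 5), ("D", 3), ("B", 12)],
    "D": [("S", 8), ("C", 3)],
    "E": [("S", 7), ("C", 5)],
}
dist, visited = dijkstra(adj, "S")
# visited == ["S", "A", "E", "D", "C", "B"]
# dist == {"S": 0, "A": 6, "B": 15, "C": 11, "D": 8, "E": 7}
```

The distance to C is first set to 12 via E and then lowered to 11 via D, which mirrors the recalculation in Step 5.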