Graph Traversals and Minimum Spanning Trees
15-211: Fundamental Data Structures and Algorithms
Rose Hoberman
April 8, 2003
Announcements
Readings:
• Chapter 14
HW5:
• Due in less than one week!
• Monday, April 14, 2003, 11:59pm
15-211: Fundamental Data 3 Rose Hoberman
Structures and Algorithms April 8, 2003
Today
• More Graph Terminology (some review)
• Topological sort
• Graph Traversals (BFS and DFS)
• Minimum Spanning Trees
• After Class... Before Recitation
Graph Terminology
Paths and cycles
• A path is a sequence of nodes
v_1, v_2, …, v_N such that (v_i, v_{i+1}) ∈ E for 0 < i < N
– The length of the path is N-1.
– Simple path: all v_i are distinct, 0 < i < N
• A cycle is a path such that v_1 = v_N
– An acyclic graph has no cycles
Cycles
[Figure: graph whose nodes are the airports BOS, DTW, SFO, PIT, JFK, LAX]
More useful definitions
• In a directed graph:
• The indegree of a node v is the number of
distinct edges (w,v)∈E.
• The outdegree of a node v is the number of
distinct edges (v,w)∈E.
• A node with indegree 0 is a root.
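The definitions above can be checked mechanically. The sketch below assumes a directed graph given as a list of (v, w) pairs; the function names `degrees` and `roots` are made up for illustration.

```python
def degrees(nodes, edges):
    """Return (indegree, outdegree) dicts for a directed graph.

    `edges` is an iterable of (v, w) pairs meaning an edge from v to w.
    Duplicate pairs are counted once, matching "distinct edges".
    """
    indegree = {v: 0 for v in nodes}
    outdegree = {v: 0 for v in nodes}
    for v, w in set(edges):
        outdegree[v] += 1
        indegree[w] += 1
    return indegree, outdegree


def roots(nodes, edges):
    """Return the nodes with indegree 0, in the order given by `nodes`."""
    indegree, _ = degrees(nodes, edges)
    return [v for v in nodes if indegree[v] == 0]


# Hypothetical example graph: 1 -> 3, 2 -> 3, 3 -> 4
print(roots([1, 2, 3, 4], [(1, 3), (2, 3), (3, 4)]))
```

In this small example, nodes 1 and 2 have no incoming edges, so both are roots.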
Trees are graphs
• A dag is a directed acyclic graph.
• A tree is a connected acyclic undirected
graph.
• A forest is an acyclic undirected graph (not
necessarily connected), i.e., each connected
component is a tree.
Example DAG
[Figure: DAG of getting-dressed steps: Undershorts, Socks, Watch, Pants, Shoes, Shirt, Belt, Tie, Jacket]
A DAG implies an ordering on events.
Example DAG
[Figure: DAG of getting-dressed steps: Undershorts, Socks, Watch, Pants, Shoes, Shirt, Belt, Tie, Jacket]
In a complex DAG, it can be hard to find a schedule that obeys all the constraints.
Topological Sort
• For a directed acyclic graph G = (V,E)
• A topological sort is an ordering of all of G’s
vertices v_1, v_2, …, v_n such that...
Formally: for every edge (v_i, v_k) in E, i < k.
Visually: all arrows are pointing to the right
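The formal condition translates directly into a checker. This is a minimal sketch; the name `is_topological_order` and the small test graph are assumptions for illustration, not part of the lecture.

```python
def is_topological_order(order, nodes, edges):
    """Return True if `order` lists every node exactly once and every
    edge (v, w) goes from an earlier position to a later one."""
    if len(order) != len(set(order)) or set(order) != set(nodes):
        return False
    position = {v: i for i, v in enumerate(order)}
    return all(position[v] < position[w] for v, w in edges)


# Hypothetical DAG: a -> b, a -> c, b -> d, c -> d
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(is_topological_order(["a", "c", "b", "d"], nodes, edges))
print(is_topological_order(["b", "a", "c", "d"], nodes, edges))
```

The second ordering fails because the edge a -> b would point to the left.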
Topological sort
• There are often many possible topological
sorts of a given DAG
[Figure: DAG on vertices 1 through 7]
• Topological orders for this DAG:
  • 1,2,5,4,3,6,7
  • 2,1,5,4,7,3,6
  • 2,5,1,4,7,3,6
  • Etc.
• Each topological order is a feasible schedule.
Topological Sorts for Cyclic Graphs?
[Figure: cyclic graph on vertices 1, 2, 3]
Impossible!
• If v and w are two vertices on a cycle, there
exist paths from v to w and from w to v.
• Any ordering will contradict one of these paths
Topological sort algorithm
• Algorithm
– Assume indegree is stored with each node.
– Repeat until no nodes remain:
• Choose a root and output it.
• Remove the root and all its edges.
• Performance
– O(V² + E), if linear search is used to find a root.
Better topological sort
• Algorithm:
– Scan all nodes, pushing roots onto a stack.
– Repeat until stack is empty:
• Pop a root r from the stack and output it.
• For all nodes n such that (r,n) is an edge, decrement n’s
indegree. If it reaches 0, push n onto the stack.
• O(V + E), so still O(V²) in the worst case, but better
for sparse graphs.
• Q: Why is this algorithm correct?
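The stack-based algorithm above can be sketched as follows. The adjacency-list layout, the function name `topological_sort`, and the cycle check at the end are assumptions added for illustration; the slide itself assumes the input is a DAG.

```python
def topological_sort(nodes, edges):
    """Stack-based topological sort of a directed graph.

    `edges` is an iterable of (r, n) pairs meaning an edge from r to n.
    Raises ValueError if some nodes are never output (the graph has a cycle).
    """
    successors = {v: [] for v in nodes}
    indegree = {v: 0 for v in nodes}
    for r, n in edges:
        successors[r].append(n)
        indegree[n] += 1

    # Scan all nodes, pushing roots (indegree 0) onto a stack.
    stack = [v for v in nodes if indegree[v] == 0]
    order = []
    while stack:
        r = stack.pop()
        order.append(r)
        # "Removing" r is done implicitly by decrementing indegrees.
        for n in successors[r]:
            indegree[n] -= 1
            if indegree[n] == 0:
                stack.append(n)

    if len(order) != len(indegree):
        raise ValueError("graph has a cycle; no topological order exists")
    return order


# Hypothetical DAG: a -> b, a -> c, b -> d, c -> d
print(topological_sort("abcd", [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]))
```

Each node is pushed and popped once and each edge is examined once, which is where the O(V + E) bound comes from.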
Correctness
• Clearly any ordering produced by this algorithm is
a topological order
But...
• Does every DAG have a topological order, and if
so, is this algorithm guaranteed to find one?
Quiz Break
Quiz
• Prove:
– This algorithm never gets stuck, i.e. if there
are unvisited nodes then at least one of
them has an indegree of zero.
• Hint:
– Prove that if at any point there are unseen
vertices but none of them have an indegree
of 0, a cycle must exist, contradicting our
assumption of a DAG.
Proof
• See Weiss page 476.
Graph Traversals
• BFS and DFS both take O(V+E) time
Use of a stack
• It is very common to use a stack to keep track
of:
– nodes to be visited next, or
– nodes that we have already visited.
• Typically, use of a stack leads to a depth-first
visit order.
• Depth-first visit order is “aggressive” in the
sense that it examines complete paths.
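As a rough illustration of how a stack produces a depth-first visit order, here is a minimal sketch. The adjacency-list dictionary and the name `dfs_order` are assumptions, not the course's code.

```python
def dfs_order(graph, start):
    """Return nodes reachable from `start` in depth-first visit order.

    `graph` maps each node to a list of its neighbours.
    """
    visited = set()
    order = []
    stack = [start]            # nodes to be visited next
    while stack:
        v = stack.pop()
        if v in visited:
            continue           # already reached by another path
        visited.add(v)
        order.append(v)
        # Push in reverse so the first listed neighbour is explored first.
        for w in reversed(graph[v]):
            if w not in visited:
                stack.append(w)
    return order


# Hypothetical graph: a -> b, a -> c, b -> d, c -> d
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(dfs_order(graph, "a"))
```

Here the path a, b, d is followed to its end before c is looked at.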
Topological Sort as DFS
• do a DFS of graph G
• as each vertex v is “finished” (all of its
children processed), insert it onto the front of
a linked list
• return the linked list of vertices
• why is this correct?
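One way to sketch the DFS-based version: a recursive DFS that prepends each vertex when it finishes. A `collections.deque` stands in for the linked list, the name `dfs_topological_sort` is made up, and the sketch assumes the input is a DAG (it does no cycle detection).

```python
from collections import deque


def dfs_topological_sort(graph):
    """Topological order of a DAG via DFS finishing times.

    `graph` maps each vertex to a list of its children.
    """
    finished = deque()   # plays the role of the linked list
    visited = set()

    def visit(v):
        visited.add(v)
        for w in graph[v]:
            if w not in visited:
                visit(w)
        # All children of v are processed: v is "finished".
        finished.appendleft(v)

    for v in graph:
        if v not in visited:
            visit(v)
    return list(finished)


# Hypothetical DAG: a -> b, a -> c, b -> d, c -> d
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(dfs_topological_sort(graph))
```

A vertex is placed in front of everything that finished before it, and all of its descendants finish before it does, so every edge ends up pointing to the right.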
Use of a queue
• It is very common to use a queue to keep
track of:
– nodes to be visited next, or
– nodes that we have already visited.
• Typically, use of a queue leads to a breadth-
first visit order.
• Breadth-first visit order is “cautious” in the
sense that it examines every path of length i
before going on to paths of length i+1.
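A minimal sketch of the queue-based traversal, which also records how many edges each node is from the start. The adjacency-list dictionary and the name `bfs_distances` are assumptions for illustration.

```python
from collections import deque


def bfs_distances(graph, start):
    """Return {node: number of edges on a shortest path from `start`}.

    `graph` maps each node to a list of its neighbours. Nodes are
    discovered in breadth-first order, so the dict's insertion order
    is also the visit order.
    """
    distance = {start: 0}
    queue = deque([start])     # nodes to be visited next
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in distance:
                distance[w] = distance[v] + 1
                queue.append(w)
    return distance


# Hypothetical graph: a -> b, a -> c, b -> d, c -> d
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(bfs_distances(graph, "a"))
```

All nodes at distance i leave the queue before any node at distance i+1, which is the "cautious" behaviour described above.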
Graph Searching ???
• Graph as state space (node = state, edge = action)
• For example, game trees, mazes, ...
• BFS and DFS each search the state space for a best
move. If the search is exhaustive they will find the same
solution, but if there is a time limit and the search space is
large...
• DFS explores a few possible moves, looking at the effects
far in the future
• BFS explores many solutions but only sees effects in the
near future (often finds shorter solutions)
Minimum Spanning Trees
Problem: Laying Telephone Wire
[Figure: central office and customers]
Wiring: Naïve Approach
[Figure: central office and customers, naïve wiring]
Expensive!
Wiring: Better Approach
[Figure: central office and customers, improved wiring]
Minimize the total length of wire connecting the customers
Minimum Spanning Tree (MST)
(see Weiss, Section 24.2.2)
A minimum spanning tree is a subgraph of an
undirected weighted graph G, such that
• it is a tree (i.e., it is connected and acyclic)
• it covers all the vertices V
– contains |V| - 1 edges
• the total cost associated with tree edges is the
minimum among all possible spanning trees
• not necessarily unique
Applications of MST
• Any time you want to connect all vertices in a graph at
minimum cost (e.g., wire routing on printed circuit boards,
sewer pipe layout, road planning…)
• Internet content distribution
– $$$, also a hot research topic
– Idea: publisher produces web pages, content distribution network
replicates web pages to many locations so consumers can access
at higher speed
– MST may not be good enough!
• content distribution on minimum cost tree may take a long time!
• Provides a heuristic for traveling salesman problems. When
edge weights satisfy the triangle inequality, the optimum
traveling salesman tour is at most twice the length of the
minimum spanning tree (why??)
How Can We Generate a MST?
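[Figure: weighted undirected graph on vertices a, b, c, d, e, shown twice.
Edge weights: (a,b)=9, (a,d)=2, (b,d)=6, (a,c)=5, (c,d)=4, (d,e)=4, (b,e)=5, (c,e)=5]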
Prim’s Algorithm
Initialization
a. Pick a vertex r to be the root
b. Set D(r) = 0, parent(r) = null
c. For all vertices v ∈ V, v ≠ r, set D(v) = ∞
d. Insert all vertices into priority queue P,
using distances as the keys
[Figure: the example graph on vertices a, b, c, d, e, with root r = e]
Priority queue P (vertex: D):  e: 0,  a: ∞,  b: ∞,  c: ∞,  d: ∞
Vertex  Parent
e       -
Prim’s Algorithm
While P is not empty:
1. Select the next vertex u to add to the tree
u = P.deleteMin()
2. Update the weight of each vertex w adjacent to
u which is not in the tree (i.e., w ∈ P)
If weight(u,w) < D(w),
a. parent(w) = u
b. D(w) = weight(u,w)
c. Update the priority queue to reflect
new distance for w
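The two slides above can be sketched in Python roughly as follows. Python's `heapq` has no decrease-key operation, so step (c) is replaced here by pushing a fresh entry and skipping stale ones when they are popped; that is a substitution, not the slide's exact queue. `D` and `parent` follow the slide; the names `prim` and `EXAMPLE_GRAPH` are made up, and the edge weights are my reading of the lecture's example, so treat them as an assumption.

```python
import heapq
import math

# Assumed weights for the lecture's five-vertex example graph.
EXAMPLE_GRAPH = {
    "a": {"b": 9, "c": 5, "d": 2},
    "b": {"a": 9, "d": 6, "e": 5},
    "c": {"a": 5, "d": 4, "e": 5},
    "d": {"a": 2, "b": 6, "c": 4, "e": 4},
    "e": {"b": 5, "c": 5, "d": 4},
}


def prim(graph, root):
    """Return the parent map of a minimum spanning tree grown from `root`.

    `graph` maps each vertex to {neighbour: weight} and is undirected.
    Assumes the graph is connected.
    """
    D = {v: math.inf for v in graph}
    parent = {v: None for v in graph}
    D[root] = 0
    in_tree = set()
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)          # u = P.deleteMin()
        if u in in_tree or d > D[u]:
            continue                        # stale entry, skip it
        in_tree.add(u)
        for w, weight in graph[u].items():
            if w not in in_tree and weight < D[w]:
                parent[w] = u
                D[w] = weight
                heapq.heappush(heap, (weight, w))  # stands in for decrease-key
    return parent


print(prim(EXAMPLE_GRAPH, "e"))
```

Starting from e, the vertices join the tree in the order e, d, a, c, b.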
Prim’s algorithm
The MST initially consists of the vertex e, and we update
the distances and parent for its adjacent vertices.
[Figure: the example graph on vertices a, b, c, d, e]

Priority queue P (vertex: D) and parent changes after each deleteMin:

Step  u = deleteMin  P after update             Parent changes
0     (start)        e:0  d:∞  b:∞  c:∞  a:∞    none
1     e              d:4  b:5  c:5  a:∞         b ← e, c ← e, d ← e
2     d              a:2  c:4  b:5              c ← d, a ← d
3     a              c:4  b:5                   none
4     c              b:5                        none
5     b              (empty)                    none

The final minimum spanning tree:
Vertex  Parent
e       -
b       e
c       d
d       e
a       d
Running time of Prim’s algorithm
(without heaps)
Initialization of priority queue (array): O(|V|)
Update loop: |V| calls
• Choosing vertex with minimum cost edge: O(|V|)
• Updating distance values of unconnected
vertices: each edge is considered only once
during entire execution, for a total of O(|E|)
updates
Overall cost without heaps: O(|E| + |V|²)
When heaps are used, apply same analysis as for
Dijkstra’s algorithm (p.469) (good exercise)
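For comparison, here is a sketch of the version analysed above, where the "priority queue" is just the table of D values and the minimum is found by a linear scan. The name `prim_without_heap` is made up and the example weights are my reading of the lecture's graph, so both are assumptions.

```python
import math

# Assumed weights for the lecture's five-vertex example graph.
EXAMPLE_GRAPH = {
    "a": {"b": 9, "c": 5, "d": 2},
    "b": {"a": 9, "d": 6, "e": 5},
    "c": {"a": 5, "d": 4, "e": 5},
    "d": {"a": 2, "b": 6, "c": 4, "e": 4},
    "e": {"b": 5, "c": 5, "d": 4},
}


def prim_without_heap(graph, root):
    """Prim's algorithm with a linear scan instead of a heap.

    `graph` maps each vertex to {neighbour: weight}; assumed connected.
    Returns the parent map of the spanning tree.
    """
    D = {v: math.inf for v in graph}          # O(|V|) initialization
    parent = {v: None for v in graph}
    D[root] = 0
    remaining = set(graph)
    while remaining:                           # |V| iterations
        u = min(remaining, key=lambda v: D[v])  # O(|V|) scan for the minimum
        remaining.remove(u)
        for w, weight in graph[u].items():     # each edge seen once per endpoint
            if w in remaining and weight < D[w]:
                D[w] = weight
                parent[w] = u
    return parent


print(prim_without_heap(EXAMPLE_GRAPH, "e"))
```

The |V| scans of cost O(|V|) each give the |V|² term, and the inner loop touches each edge a constant number of times, giving the |E| term.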
Prim’s Algorithm Invariant
• At each step, we add the edge (u,v) s.t. the
weight of (u,v) is minimum among all edges
where u is in the tree and v is not in the tree
• Each step maintains a minimum spanning tree of
the vertices that have been included thus far
• When all vertices have been included, we have a
MST for the graph!
Correctness of Prim’s
• This algorithm adds n-1 edges without creating a cycle,
so clearly it creates a spanning tree of any connected
graph (you should be able to prove this).
But is this a minimum spanning tree?
Suppose it wasn't.
• There must be a point at which it fails, and in particular
there must be a single edge whose insertion first
prevented the spanning tree from being a minimum
spanning tree.
Correctness of Prim’s
• Let G be a connected, undirected graph
• Let S be the set of edges chosen by Prim’s algorithm before
choosing an errorful edge (x,y)
• Let V' be the vertices incident with edges in S
• Let T be a MST of G containing all edges in S, but not (x,y).
[Figure: graph showing the edges in S and the edge (x,y)]
Correctness of Prim’s
• Edge (x,y) is not in T, so there must be a path in
T from x to y since T is connected.
• Inserting edge (x,y) into T will create a cycle
• Besides (x,y), this cycle contains at least one other edge with
exactly one vertex in V’; call this edge (v,w)
[Figure: the cycle created by adding (x,y) to T, with the edge (v,w) marked]
Correctness of Prim’s
• Since Prim’s chose (x,y) over (v,w), w(v,w) >= w(x,y).
• We could form a new spanning tree T’ by swapping (x,y)
for (v,w) in T (prove this is a spanning tree).
• w(T’) is clearly no greater than w(T)
• But that means T’ is a MST
• And yet it contains all the edges in S, and also (x,y)
...Contradiction
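The swap step above amounts to one line of arithmetic. In the block below, w(·) is the weight function from the slide (not the vertex w), and w(T) is the total weight of the tree T.

```latex
w(T') \;=\; w(T) - w(v,w) + w(x,y) \;\le\; w(T)
\qquad\text{because}\qquad w(x,y) \le w(v,w).
```

Since T is a minimum spanning tree, w(T') cannot be smaller than w(T), so the two weights are equal and T' is also minimum.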
Another Approach
• Create a forest of trees from the vertices
• Repeatedly merge trees by adding “safe edges”
until only one tree remains
• A “safe edge” is an edge of minimum weight which
does not create a cycle
[Figure: the example graph on vertices a, b, c, d, e]
forest: {a}, {b}, {c}, {d}, {e}
Kruskal’s algorithm
Initialization
a. Create a set for each vertex v ∈ V
b. Initialize the set of “safe edges” A
comprising the MST to the empty set
c. Sort edges by increasing weight
[Figure: the example graph on vertices a, b, c, d, e]
F = {a}, {b}, {c}, {d}, {e}
A = ∅
E = {(a,d), (c,d), (d,e), (a,c), (b,e), (c,e), (b,d), (a,b)}
Kruskal’s algorithm
For each edge (u,v) ∈ E in increasing order of weight,
while more than one set remains:
    If u and v belong to different sets U and V
        a. add edge (u,v) to the safe edge set
           A = A ∪ {(u,v)}
        b. merge the sets U and V
           F = F - U - V + (U ∪ V)
Return A
• Running time bounded by sorting (or findMin)
• O(|E|log|E|), or equivalently, O(|E|log|V|) (why???)
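A minimal sketch of the loop above, using a simple union-find structure (with path halving, no union by rank) to represent the forest F. The union-find representation, the names `kruskal` and `EXAMPLE_EDGES`, and the edge weights (my reading of the lecture's example) are assumptions for illustration.

```python
# Assumed (weight, u, v) triples for the lecture's example graph.
EXAMPLE_EDGES = [
    (9, "a", "b"), (2, "a", "d"), (6, "b", "d"), (5, "a", "c"),
    (4, "c", "d"), (4, "d", "e"), (5, "b", "e"), (5, "c", "e"),
]


def kruskal(vertices, edges):
    """Return the list A of safe edges (u, v) chosen by Kruskal's algorithm.

    `edges` is an iterable of (weight, u, v) triples of an undirected graph.
    """
    leader = {v: v for v in vertices}   # one set per vertex

    def find(v):
        """Return the representative of the set containing v."""
        while leader[v] != v:
            leader[v] = leader[leader[v]]   # path halving
            v = leader[v]
        return v

    A = []
    sets_remaining = len(leader)
    for weight, u, v in sorted(edges):      # increasing order of weight
        if sets_remaining == 1:
            break                           # only one tree remains
        ru, rv = find(u), find(v)
        if ru != rv:                        # u and v are in different sets
            A.append((u, v))
            leader[ru] = rv                 # merge the two sets
            sets_remaining -= 1
    return A


print(kruskal("abcde", EXAMPLE_EDGES))
```

The sort dominates the running time; the union-find operations are cheap by comparison.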
Kruskal’s algorithm
[Figure: the example graph on vertices a, b, c, d, e]
E = {(a,d), (c,d), (d,e), (a,c), (b,e), (c,e), (b,d), (a,b)}

Forest                      A
{a}, {b}, {c}, {d}, {e}     ∅
{a,d}, {b}, {c}, {e}        {(a,d)}
{a,d,c}, {b}, {e}           {(a,d), (c,d)}
{a,d,c,e}, {b}              {(a,d), (c,d), (d,e)}
{a,d,c,e,b}                 {(a,d), (c,d), (d,e), (b,e)}
Kruskal’s Algorithm Invariant
• After each iteration, every tree in the forest is a MST
of the vertices it connects
• Algorithm terminates when all vertices are connected
into one tree
Correctness of Kruskal’s
• This algorithm adds n-1 edges without creating a
cycle, so clearly it creates a spanning tree of any
connected graph (you should be able to prove this).
But is this a minimum spanning tree?
Suppose it wasn't.
• There must be a point at which it fails, and in particular
there must be a single edge whose insertion first
prevented the spanning tree from being a minimum
spanning tree.
Correctness of Kruskal’s
• Let e be this first errorful edge.
• Let K be the Kruskal spanning tree
• Let S be the set of edges chosen by Kruskal’s algorithm before choosing e
• Let T be a MST containing all edges in S, but not e.
[Figure: diagram of the edge sets K, T, and S, and the edge e]
Correctness of Kruskal’s
Lemma: w(e’) >= w(e) for all edges e’ in T - S
Proof (by contradiction):
• Assume there exists some edge e’ in T - S with w(e’) < w(e)
• Kruskal’s must have considered e’ before e
• However, since e’ is not in K (why??), it must have
been discarded because it caused a cycle with some of
the other edges in S.
• But e’ + S is a subgraph of T, which means it cannot
form a cycle ...Contradiction
[Figure: diagram of the edge sets K, T, and S, and the edge e]
Correctness of Kruskal’s
• Inserting edge e into T will create a cycle
• There must be an edge on this cycle which is not in K (why??).
Call this edge e’
• e’ must be in T - S, so (by our lemma) w(e’) >= w(e)
• We could form a new spanning tree T’ by swapping e for e’ in T
(prove this is a spanning tree).
• w(T’) is clearly no greater than w(T)
• But that means T’ is a MST
• And yet it contains all the edges in S, and also e
...Contradiction
Greedy Approach
• Like Dijkstra’s algorithm, both Prim’s and Kruskal’s
algorithms are greedy algorithms
• The greedy approach works for the MST problem;
however, it does not work for many other
problems!
That’s All!