COMP2007 + COMP2907 Notes
Polynomial time:
Graphs
-
Adjacency matrix: A_uv = 1 if (u, v) is an edge.
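Running time (BFS): O(V + E) with adjacency lists. Each vertex is
queued at most once, O(V), and scanning all adjacent vertices takes
O(E) in total.
o Works with either an adjacency matrix or adjacency lists. The space
complexity differs, however: O(V^2) for the matrix, O(V + E) for lists.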
o Uses a queue.
Connected component: All nodes reachable from s.
o Theorem: Upon termination, R is the connected component
containing s.
Shortest Paths: Compute the shortest path from a given node to all
other nodes.
o BFS: Computes the hop distance from s to u.
Initialise all dist[u] as infinity in BFS, except dist[s] = 0.
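The BFS distance computation above can be sketched as follows. This is a minimal illustration, assuming the graph is given as a dictionary of adjacency lists; the name bfs_distances is hypothetical and not from the notes.

```python
from collections import deque

def bfs_distances(adj, s):
    """Return the hop distance from s to every vertex of adj.

    adj maps each vertex to a list of its neighbours (an assumed input
    format). Unreachable vertices keep the distance infinity.
    """
    dist = {u: float("inf") for u in adj}
    dist[s] = 0
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == float("inf"):   # v has not been discovered yet
                dist[v] = dist[u] + 1
                queue.append(v)           # each vertex is queued at most once
    return dist
```

The vertices with a finite distance are exactly the connected component containing s.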
Transitive closure: Given a graph G, compute a graph G' such that:
o G' has the same vertices as G,
with an edge between all pairs of nodes that are connected
by a path in G.
o Use BFS from each vertex s, but modify it by adding an edge (s,v) after
we have seen v.
Depth first search: Pick a starting vertex, s, and follow outgoing
edges that lead to undiscovered vertices, backtracking (popping the
stack) whenever stuck.
o Assume that G is connected.
o Uses a stack.
o Running time: O(V + E), i.e. O(m + n).
o
L0 , , Lk
be the
Algorithm:
Run DFS on graph
For each edge in the DFS tree
Remove that edge from graph G.
Check if G is now disconnected (using DFS).
Running time: O(mn).
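The edge-removal algorithm above can be sketched as follows. This is an illustration under assumptions: vertices are numbered 0..n-1, G is connected and undirected, and the helper names search and cut_edges are hypothetical. The stack-based search marks vertices when they are pushed, so its tree is a spanning tree rather than a strict DFS tree; that is enough here, because every spanning tree contains every edge whose removal disconnects G.

```python
def search(n, edges, skip=None):
    """Stack-based graph search from vertex 0, ignoring edge index `skip`.

    Returns (number of vertices reached, indices of the tree edges used).
    """
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        if idx != skip:
            adj[u].append((v, idx))
            adj[v].append((u, idx))
    seen = [False] * n
    seen[0] = True
    stack = [0]
    tree = []
    while stack:
        u = stack.pop()
        for v, idx in adj[u]:
            if not seen[v]:
                seen[v] = True
                tree.append(idx)
                stack.append(v)
    return sum(seen), tree

def cut_edges(n, edges):
    """Edges whose removal disconnects the (assumed connected) graph."""
    _, tree = search(n, edges)
    # Only the n-1 tree edges need testing; each test is one O(m + n) search.
    return [edges[i] for i in tree if search(n, edges, skip=i)[0] < n]
```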
O ( m+n )
Running time:
Greversed
both executions.
To find the disjoint SCCs use DFS twice: once to compute finish[u],
and then again on the reversed graph, calling DFS from the main loop in
decreasing order of finish[u]. Output the two forests and compare.
Topological order: an ordering of the nodes v1, ..., vn such that for
every edge (vi, vj), i < j.
o Can be computed in O(m + n) time.
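One way to compute a topological order in O(m + n) is to repeatedly remove a node with no incoming edges. The sketch below assumes vertices numbered 0..n-1 and a list of directed edges; the function name topological_order is hypothetical, and this is one standard approach rather than necessarily the one used in the course.

```python
from collections import deque

def topological_order(n, edges):
    """Return a topological order of a DAG, or None if there is a cycle."""
    adj = [[] for _ in range(n)]
    indegree = [0] * n
    for u, v in edges:
        adj[u].append(v)
        indegree[v] += 1
    queue = deque(u for u in range(n) if indegree[u] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            indegree[v] -= 1          # "delete" the edge (u, v)
            if indegree[v] == 0:
                queue.append(v)
    return order if len(order) == n else None
```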
Interval scheduling:
o Running time: O(n log n)
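A minimal sketch of a greedy rule for interval scheduling, assuming the earliest-finish-time rule (the usual rule that matches the O(n log n) bound and the exchange-style proof in these notes). Intervals are assumed to be (start, finish) pairs, and two intervals are treated as compatible when one starts no earlier than the other finishes. The name max_compatible is hypothetical.

```python
def max_compatible(intervals):
    """Greedily pick a largest set of pairwise compatible intervals."""
    chosen = []
    last_finish = float("-inf")
    # Sorting by finish time dominates the running time: O(n log n).
    for start, finish in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_finish:      # compatible with everything chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen
```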
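Let i1, i2, ..., ik be the intervals selected by greedy, where si and fi
denote the start and finish time of interval i.
Let j1, j2, ..., jm be the intervals in an optimal solution.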
For this we only need to prove |A| = |O|; proving A = O is too
strong, since OPT could have chosen a different interval in the
same time frame while the overall number of intervals remains
the same. Use induction.
Since f(i_(r-1)) <= f(j_(r-1)) <= s(j_r), job j_r is still available
when greedy picks its r-th interval, so f(i_r) <= f(j_r).
Given intervals (si, fi), find the minimum number of bins to schedule
them such that all the intervals in each bin are compatible.
depth.
si
s i +e , where e
d classrooms.
f id i ( i<k ) l i
T , such that
o
-
Invariant: For each node, d(u) is the length of the shortest s-u
path.
classify into
coherent groups.
Distance function: Numeric value specifying closeness of two objects.
o Algorithm: Form a graph on the vertex set V, corresponding to n
clusters.
Find the closest pair of objects such that each item is in a
different cluster, and add an edge between them.
Repeat n-k times until there are exactly k clusters.
Equivalent to finding an MST and deleting the k-1 most expensive edges
(Kruskal's algorithm).
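The clustering procedure above can be sketched as Kruskal's algorithm stopped early. This is an illustration under assumptions: the objects are given as a list, dist is any distance function supplied by the caller, and the name k_clusters and the simple union-find are hypothetical choices.

```python
def k_clusters(points, k, dist):
    """Merge the closest pair of clusters until exactly k clusters remain."""
    n = len(points)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # All pairs, cheapest first, exactly as Kruskal's algorithm scans edges.
    pairs = sorted((dist(points[i], points[j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    clusters = n
    for _, i, j in pairs:
        if clusters == k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:                  # the two items are in different clusters
            parent[ri] = rj
            clusters -= 1
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(points[i])
    return list(groups.values())
```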
p1 , , pn
Binary search
Mergesort
Closest pair of points:
o Given n points in the plane, find a pair with smallest Euclidean
distance between them.
o Algorithm: (Assume no two points have same x coordinate).
Divide: draw a vertical line so that there are roughly n/2 points on
each side.
Conquer: find the closest pair in each side recursively.
Combine: find the closest pair with one point in each side.
Return best of 3 solutions.
We only need to consider the points that lie within distance d of
the line, where d is the smaller of the two closest-pair distances
found recursively.
Sort the points in the 2d-wide strip by their y-coordinate, and check
distances only between points that are close together in the sorted
list.
Proof: No two points lie in the same d/2 by d/2 box,
o Two points at least 2 rows apart have distance
>= 2 * (d/2) = d.
Running time: O(n log^2 n).
Can achieve O(n log n).
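The divide, conquer and combine steps can be sketched as below. This is the simpler O(n log^2 n) variant that re-sorts the strip by y-coordinate at every level; the function name closest_pair_distance is hypothetical, and points are assumed to be (x, y) tuples. math.dist requires Python 3.8 or later.

```python
from math import dist, inf

def closest_pair_distance(points):
    """Smallest Euclidean distance between any two of the given points."""
    pts = sorted(points)              # sort once by x-coordinate

    def solve(lo, hi):
        if hi - lo < 2:               # fewer than two points: no pair
            return inf
        mid = (lo + hi) // 2
        mid_x = pts[mid][0]           # the vertical dividing line
        d = min(solve(lo, mid), solve(mid, hi))
        # Combine: only points within distance d of the line matter.
        strip = sorted((p for p in pts[lo:hi] if abs(p[0] - mid_x) < d),
                       key=lambda p: p[1])
        for i in range(len(strip)):
            for j in range(i + 1, len(strip)):
                if strip[j][1] - strip[i][1] >= d:
                    break             # everything further up is too far away
                d = min(d, dist(strip[i], strip[j]))
        return d

    return solve(0, len(pts))
```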
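Master theorem: T(n) = a T(n/b) + f(n), where a >= 1 and b > 1.
Case 1: If f(n) = O(n^(log_b a - e)) for some constant e > 0, then
T(n) = O(n^(log_b a)).
Case 2: If f(n) = Theta(n^(log_b a) log^k n) for some k >= 0, then
T(n) = O(n^(log_b a) log^(k+1) n).
Case 3: If f(n) = Omega(n^(log_b a + e)) for some constant e > 0, and
a f(n/b) <= c f(n) for some constant c < 1, then
T(n) = O(f(n)).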
Quick and dirty: compare n^(log_b a) with f(n). If they are equal,
multiply by log(n); if n^(log_b a) is smaller, take f(n); if it is
greater, take n^(log_b a).
Sweepline technique (and computational geometry): study of
algorithms to solve problems stated in terms of geometry.
Depth of intervals: Given a set S of n intervals, compute the depth of S,
which is the maximum number of intervals passing over a single point.
o Algorithm: Sweepline sweeping from left to right while maintaining
the current depth. Data structure: Binary search tree with event
points.
o Event points: Endpoints of the intervals.
o Current depth is stored in the sweepline.
o Running time: O(n log n).
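A minimal sketch of the depth sweep. It sorts the event points up front instead of keeping them in a binary search tree, which gives the same O(n log n) bound for this static problem. Two assumptions are made here: intervals are (start, finish) pairs, and they are closed, so intervals that only touch at an endpoint count as overlapping. The name interval_depth is hypothetical.

```python
def interval_depth(intervals):
    """Maximum number of intervals covering a single point."""
    events = []
    for start, finish in intervals:
        events.append((start, 1))     # sweepline enters an interval
        events.append((finish, -1))   # sweepline leaves an interval
    # At equal coordinates handle starts before finishes (closed intervals).
    events.sort(key=lambda e: (e[0], -e[1]))
    depth = best = 0
    for _, delta in events:
        depth += delta                # the current depth lives in the sweepline
        best = max(best, depth)
    return best
```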
Convex hulls: a subset S of the plane is convex if for every pair of points
(p,q) in S the straight line segment pq is completely contained in S. The
convex hull of a point set S is the smallest convex set containing S.
o Divide and conquer approach:
If S not empty then
Find farthest point C in S from AB
Add C to convex hull between A and B
S0 = {points inside ABC}
S1 = {points to the right of AC}
S2 = {points to right of CB}
FindHull(S1,A,C)
FindHull(S2, C,B)
o Sweepline approach:
Maintain the hull while adding the points one by one, sweeping from
left to right.
Running time: O(n log n)
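A left-to-right sweep for the convex hull can be sketched with the monotone chain method, which builds the lower and upper hull in two passes over the sorted points. This is one common way to realise the sweepline idea and may differ in detail from the version presented in the course; the names cross and convex_hull are hypothetical, and points are assumed to be (x, y) tuples.

```python
def cross(o, a, b):
    """Positive if o -> a -> b is a left (counterclockwise) turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Hull vertices in counterclockwise order, starting at the leftmost."""
    pts = sorted(set(points))         # sorting dominates: O(n log n)
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:                     # sweep left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()               # last point is not on the hull
        lower.append(p)
    upper = []
    for p in reversed(pts):           # sweep back right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]
```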
Closest pair: Use two parallel vertical sweep-lines: the front Lf and
the back Lb.
Invariant: We know the closest pair among the points to the left of Lf
and the distance d between this pair.
Data structure: BST to store all points in S between the two
sweeplines from top to bottom.
Once you reach an event point for both lines, calculate the
distances.
Find the point s' closest to s in between Lb and Lf, within vertical
distance d from s.
If |ss'| < d then
set d = |ss'|
CP = (s,s')
Sweep Lb and update T.
o Insert s into T.
o O(nlogn).
Visibility: Let S be a set of n disjoint line segments in the plane, and let p
be a point not on any line segment of S, determine all the line segments
that p can see.
o Event points: Endpoints of segments.
o Keep track of: the segment that p sees in the current direction, and
the order of the segments along the ray.
o Invariant: the segments seen so far, and the order of the segments
intersecting the ray.
o Event handling: Two cases: first endpoint, last endpoint.
First endpoint: insert new segment into D, if s is the first
segment hit by ray, report s.
Last endpoint: Remove s from D, if s was the first
segment in D, report new first segment in D.
o Complexity: there are 2n endpoints, and each event is handled in
O(log n) time (insert/delete from BST), so the total is O(n log n).
o Start sweep by sorting all segments intersecting starting
ray.
Dynamic programming
1) Define sub-problems: define what OPT(i) is and what it represents, and
state the invariant.
2) Find recurrences: name all the cases and write the formula.
3) Solve the base cases
4) Transform recurrence into an efficient algorithm.
o
o
Running time:
Case 1: Base
1).
Case 2: Base bj pairs with bt for some i <= t < j - 4.
Base case:
If
Running time:
o
o
M[t] = 0 (and M[v] = infinity for every other node v)
For i = 1 to n-1
For each node w in V, if M[w] was updated in the
previous iteration, then for each node v such that (v,w)
is in E: if M[v] > M[w] + c_vw then set M[v] = M[w] + c_vw,
with successor[v] being w.
If no M[w] value has changed in iteration i, then stop.
Improvements:
Maintain only one array M[v] = length of the shortest path we
have found so far.
No need to check edges of the form (v,w) unless M[w]
changed in the previous iteration.
Theorem: Throughout the algorithm, M[v] is the length of some
v-t path, and after i rounds of updates, the value M[v] is no
larger than the length of the shortest v-t path using <= i edges.
Memory: O(m+n)
Running time: O(mn) in the worst case, but often faster in practice.
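The single-array version with early stopping can be sketched as follows. Assumptions: edges are (v, w, cost) triples meaning a directed edge v -> w, there are no negative cycles, and the function name shortest_paths_to is hypothetical.

```python
def shortest_paths_to(nodes, edges, t):
    """Shortest path lengths from every node to t, plus successor pointers."""
    INF = float("inf")
    M = {v: INF for v in nodes}
    successor = {v: None for v in nodes}
    M[t] = 0
    incoming = {w: [] for w in nodes}         # incoming[w] = edges (v, w)
    for v, w, cost in edges:
        incoming[w].append((v, cost))
    changed = {t}                             # nodes whose M value just changed
    for _ in range(len(nodes) - 1):
        next_changed = set()
        for w in changed:                     # skip nodes that did not change
            for v, cost in incoming[w]:
                if M[v] > M[w] + cost:
                    M[v] = M[w] + cost
                    successor[v] = w
                    next_changed.add(v)
        if not next_changed:                  # nothing changed: stop early
            break
        changed = next_changed
    return M, successor
```

Following successor pointers from any node with a finite M value traces a shortest path to t.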
Least squares
squared error.
Sub-problems: OPT(j) = minimum cost for points p1, p2, ..., pj, where
each segment is a line y = ax + b.
Recurrences: OPT(j) = min over 1 <= i <= j of { e(i, j) + c + OPT(i-1) },
where e(i, j) = minimum sum of squared errors for points pi, ..., pj.
Running time: O(n^3), space O(n^2).
Flow Networks
-
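For each e in E: 0 <= f(e) <= c(e). (Capacity restriction.)
For each v in V - {s, t}: sum of f(e) over e into v = sum of f(e) over
e out of v. (Flow conservation.)
An s-t cut is a partition (A, B) of the nodes with s in A and t in B.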
The net flow sent across the cut is equal to the amount leaving s.
This is for flow across a cut not the capacity as described
above.
v(f) = f_out(A) - f_in(A)
Weak duality
The value of the flow is at most the capacity of the cut.
v(f) <= cap(A, B)
Optimality
cap(A, B) = sum of c(e) over e out of A.
If
Greedy algorithm
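o Start with f(e) = 0 for each e in E.
o Residual graph Gf: if f(e) < c(e), Gf has a forward edge with residual
capacity c(e) - f(e); if f(e) > 0, Gf has a backward edge with
residual capacity f(e).
o While there is an s-t path P in Gf:
f = Augment(f,P) [Bottleneck]
update Gf
return f.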
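An augmenting-path loop of the Ford-Fulkerson kind can be sketched as below. Assumptions: capacities are non-negative integers given as a dictionary from (u, v) pairs to capacity, paths are found with a plain stack-based search of the residual graph, and the function name max_flow is hypothetical. Only the value of the flow is returned.

```python
def max_flow(capacity, s, t):
    """Value of a maximum s-t flow for integer capacities."""
    residual = {}                     # residual capacities in Gf
    adj = {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)        # backward edge starts at 0
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def find_path():
        """Return an s-t path in Gf as a list of edges, or None."""
        parent = {s: None}
        stack = [s]
        while stack:
            u = stack.pop()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    stack.append(v)
        if t not in parent:
            return None
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        return path

    value = 0
    while True:
        path = find_path()
        if path is None:
            return value
        bottleneck = min(residual[e] for e in path)
        for u, v in path:             # Augment(f, P)
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        value += bottleneck
```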
Proof
Assume initial capacities are integers.
At every intermediate stage of the Ford-Fulkerson algorithm
the flow values and the residual graph capacities in Gf
are integers.
Induction
o Base case: initially the statements are correct
o Induction hyp: true after j iterations
o Induction step: Since all residual capacities in Gf
are integers, the bottleneck must also be an
integer. Thus the flow will have integer values
O(m^2 log C)
Bipartite matching
A subset M of E is a matching if each node appears in at most
one edge in M.
Max matching: find a max cardinality matching.
Algorithm
Use max flow
o Create digraph G'
o Direct all edges from L to R, and assign unit
capacity
o Attach a source s, and unit capacity edges from
s to each node in L
o Likewise R to t with unit capacities.
Proof
o You can only have 1 unit of flow on each of the k
paths defined by M,
o f is a flow and it has value k.
o Cardinality is therefore at most k.
o Integrality theorem: k is integral, so f(e) is 0 or 1.
o Each node in L and R participate in at most one
edge in M.
Perfect matching: All nodes appear in the matching.
Marriage theorem:
Running time:
Survey design: n1 consumers and n2 products.
Ask consumer i about a product j only if they own it.
Ask consumer i between ci and ci' questions.
Ask between pj and pj' consumers about product j.
Circulation problem
Include edge (i,j) if consumer i owns product j,
If the circulation problem is feasible then the survey problem
is feasible and vice versa.
Image segmentation.
Baseball elimination
o
Team is eliminated iff max flow is less than the total number of
games left.
Polynomial-Time reduction
X <=p Y
if
Class NP-hard
-
Approximation ratio=
Approximation ratio
Approximation solution
3-Sat
o SAT where each clause contains exactly 3 literals: is there a satisfying
truth assignment?
Clique
o A clique of a graph G is a complete subgraph of G: is there a clique of size k?
o Reduction from 3-SAT: the constructed graph G has a k-clique iff the formula E is satisfiable.
Careful with direction of reduction:
o In order to prove a problem NP-hard, we reduce a known NP-hard problem
to it. E.g. reduce 3-SAT to independent set: the constructed graph has an
independent set of the required size iff the instance is satisfiable.
Hamiltonian cycle
o Does there exist a simple cycle C that visits every node?
Longest path
TSP: Given a set of n cities and a pairwise distance function d(u,v), is
there a tour of length <= D?
COMP2907
-
Assume
H opt
HA
is the tour
X = union of all sets S in F.
Find a subset C of minimal size which covers X.
That is find the minimum number of sets that covers the universe,
X.
Algorithm:
C <- empty set.
U <- X
While U is not empty do
Sets U =2
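The loop body of the algorithm is not recoverable from these notes. The sketch below assumes the standard greedy rule for set cover: in each round pick the set that covers the most still-uncovered elements. The name greedy_set_cover is hypothetical, and the family F is assumed to be a list of Python sets.

```python
def greedy_set_cover(universe, family):
    """Greedy cover: repeatedly take the set covering most uncovered items."""
    uncovered = set(universe)         # U <- X
    cover = []                        # C <- empty set
    while uncovered:
        best = max(family, key=lambda s: len(uncovered & s))
        if not uncovered & best:
            raise ValueError("the family does not cover the universe")
        cover.append(best)
        uncovered -= best
    return cover
```

The result is a valid cover but is not guaranteed to be a minimum one.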
Karger's algorithm
o Global minimum cut: find a cut (S,S') of minimum cardinality
o While |V| > 2: contract an edge (u,v) of G chosen uniformly at random. Return the cut S.
o This algorithm is a randomised algorithm: a single run may fail to
return a minimum cut.
To amplify the probability of success, run the contraction
algorithm many times
Repeat the contraction algorithm O(n^2 log n) times with independent
random choices; the probability that all runs fail is at most
n^(-c).
o Running time:
n-2 iterations
Each iteration O(n)
O(n^2) per run
Hence repeat the contraction algorithm O(n^2 log n) times.
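A minimal sketch of the contraction algorithm with repetition. Assumptions: the graph is connected, vertices are numbered 0..n-1, and edges is a list of pairs. Contracted super-nodes are tracked with a union-find, and a random edge is drawn from the original edge list and rejected if it has become a self-loop, which is equivalent to choosing uniformly among the remaining edges. The names contract_once and min_cut are hypothetical, and this simple version does not achieve the O(n) per-iteration bound quoted above.

```python
import random

def contract_once(n, edges, rng):
    """One run of the contraction algorithm; returns the size of its cut."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    remaining = n
    while remaining > 2:
        u, v = rng.choice(edges)      # uniformly random edge
        ru, rv = find(u), find(v)
        if ru != rv:                  # skip self-loops of contracted nodes
            parent[ru] = rv           # contract (u, v)
            remaining -= 1
    return sum(1 for u, v in edges if find(u) != find(v))

def min_cut(n, edges, runs, seed=0):
    """Best cut size found over `runs` independent repetitions."""
    rng = random.Random(seed)
    return min(contract_once(n, edges, rng) for _ in range(runs))
```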
k-path
-
Parameterised problem:
o Given an instance of the problem and a parameter k, can we decide
whether it is a yes or a no instance?
Tutorials
1) If we have a blackbox, we can only make assumptions about the overall
lower bound. We can't make any assumptions about the upper bound.
2) Greedy algorithm, 2-approximation: pick both endpoints of each of the
k chosen edges. If we pick k edges, we have 2k vertices.
3) If we add a constant to every edge weight, the shortest path may no
longer be the shortest path; the same holds when squaring (weight < 1).
4) The optimal spanning tree does not change, because the order of the
edge weights doesn't change.
5) Greedy algorithm for sorting by time/weight; exchange argument.
6) S-t path with vertex weights; change vertex into two vertices with the
vertex weight being the edge between them.
13) If I can see another interval; ray intersection; sweep r
counterclockwise from a position, initialise a BST that contains all
segments intersecting r, ordered using a distance function and the
intersection point. Consider the events.
14) To merge two hulls into one hull: compute the upper and lower tangents
and discard all points lying between the two tangents. Find the rightmost
point and leftmost point; if it's not a tangent, go clockwise/anticlockwise.
15) Reverse-MST works. Sort by highest weights, and if the graph remains
connected after removing the edge, keep going.
16) Another variation: for every cycle in T, remove the heaviest edge.
Proof for MSTs is via contradiction: we assume it's an MST first, and show
that this contradicts the MST definition.
17) The running time of recursive Fibonacci grows like the Fibonacci number
itself, i.e. exponentially. An alternative is to use dynamic programming:
- Base cases: M[0] = 0, M[1] = 1; then M[i] = M[i-1] + M[i-2].
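The table-filling version of the Fibonacci recurrence can be sketched as follows; the function name fib is a hypothetical choice.

```python
def fib(n):
    """n-th Fibonacci number using the table M, in O(n) additions."""
    M = [0] * (max(n, 1) + 1)
    M[1] = 1                          # base cases: M[0] = 0, M[1] = 1
    for i in range(2, n + 1):
        M[i] = M[i - 1] + M[i - 2]
    return M[n]
```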
18) For some cases, like with coins (denominations 1, 7 and 10), you must
initialise your base cases for the first 10 amounts, and then use
M[j] = 1 + min(M[j-1], M[j-7], M[j-10]).
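A sketch of the coin recurrence, assuming the denominations are 1, 7 and 10 as suggested by the j-1, j-7 and j-10 terms. Instead of hard-coding the first ten base cases, this version only looks at coins that fit into the current amount; the name min_coins is hypothetical.

```python
def min_coins(amount, coins=(1, 7, 10)):
    """Fewest coins needed to make `amount` (infinity if impossible)."""
    INF = float("inf")
    M = [0] + [INF] * amount          # M[0] = 0 is the base case
    for j in range(1, amount + 1):
        # Try every coin that fits and keep the cheapest option.
        M[j] = 1 + min((M[j - c] for c in coins if c <= j), default=INF)
    return M[amount]
```

For amount 14 this finds 7 + 7, whereas always taking the largest coin first would use five coins.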
19) Dynamic programming, make the table.
20)
21)
22) In the residual graph, any node reachable from s is on the s side of
the min cut.
23) Vertex capacities can be accounted for like before.
24) Effective radius problem; precompute edge capacities if the pair is
in range.
25) Flight problem: assign rigid unit capacities; add an extra edge, if the
flight is reachable (pre-compute).
26) For an undirected graph, create two antiparallel edges; max flow is
k for capture flag problem. Ford-Fulkerson.
27) If all edge capacities are even, the max flow must be even (the
bottleneck must be even; use induction).
28) Every d-regular bipartite graph has a perfect matching because the
condition of the marriage theorem holds.
29) To reduce vertex cover to set cover: U is the edge set of G, for each
vertex u we make a set containing the edges incident on u, set t = k. Vertex
cover is a special case of set cover.
30) D-interval scheduling; use independent set.
31) Degree at least delta, use base case n = 2, then input dummy
nodes to increase degree.
32) The value of the max flow is always equal to the capacity of the min cut.
33) Augment is called at most C times; in every iteration, the flow value
must increase by at least 1.