10.23
Idea
The problem is the same as finding the cheapest way to parenthesize a sequence (or to triangulate a polygon).
Insert two fictitious “breaks”, one at the beginning (position 0) and one at the end (position L).
Let
pos[0] = 0 < pos[1] < … < pos[m] < pos[m+1] = L
(m is the number of real cuts).
Define dp[i][j] = minimum cost of completely breaking the substring that starts just after pos[i] and ends at pos[j] (i < j).
The length of that substring is len = pos[j] – pos[i].
Recurrence
If no real cut lies between i and j (j = i+1) the cost is 0.
Otherwise, choose the first cut inside this interval at some k, i < k < j.
That cut costs len units, plus the optimal costs of the two sub-intervals it creates:
dp[i][j] = mini<k<j ( dp[i][k] + dp[k][j] + len )
Algorithm (O(n³)) (n = m+2 is the size of pos)
Correctness sketch
The outermost cut inside any interval
(i,j) must be at some k.
Removing that cut isolates two smaller sub-problems (i,k) and (k,j), whose optimal solutions are, by definition, dp[i][k] and dp[k][j].
Adding the unavoidable cost pos[j]–pos[i] for this outermost cut yields the recurrence; choosing the minimum over all feasible k gives the optimal value.
Induction on the interval width proves optimality.Complexities
There are O(n²) intervals (
i,j).
For each, the inner loop scans at most O(n) choices of k.
Hence total time O(n³) and space O(n²).Result
Running the algorithm on the example (cuts at 3, 8, 10 of a 20-character string) returns the optimal cost 38, matching the right-to-left strategy.
Back to
Chapter 10