
Find Longest Common Subpath

Last Updated : 17 Jun, 2024

Given n cities numbered 0 to n - 1 and m friends, along with a 2D integer array paths[][] where paths[i] lists the cities visited, in order, by the ith friend, return the length of the longest common subpath that is shared by every friend's path, or 0 if there is no common subpath at all.

A subpath of a path is a contiguous sequence of cities within that path.

Examples:

Input: n = 5, paths = {{0, 1, 2, 3, 4}, {2, 3, 4}, {4, 0, 1, 2, 3}}
Output: 2
Explanation: The longest common subpath is {2, 3}.

Input: n = 3, paths = {{0}, {1}, {2}}
Output: 0
Explanation: There is no common subpath shared by the three paths.
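
For small inputs, the two examples above can be checked with a brute-force reference before moving to the faster approach. The following is only a sketch: the function name longestCommonSubpathBrute is hypothetical (it is not part of the article's solution), and it compares whole windows directly, so it is far too slow for large inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical brute-force reference (illustration only, not the article's method).
// Tries every length from the shortest path length down to 1 and returns the first
// length for which some window of paths[0] also appears in every other path.
int longestCommonSubpathBrute(const std::vector<std::vector<int>>& paths)
{
    assert(!paths.empty());
    std::size_t shortest = paths[0].size();
    for (const auto& p : paths) {
        shortest = std::min(shortest, p.size());
    }

    for (std::size_t k = shortest; k >= 1; --k) {
        // Candidates: every window of length k in the first path.
        std::set<std::vector<int>> common;
        for (std::size_t i = 0; i + k <= paths[0].size(); ++i) {
            common.emplace(paths[0].begin() + i, paths[0].begin() + i + k);
        }
        // Keep only the candidates that also occur in each later path.
        for (std::size_t p = 1; p < paths.size() && !common.empty(); ++p) {
            std::set<std::vector<int>> next;
            for (std::size_t i = 0; i + k <= paths[p].size(); ++i) {
                std::vector<int> w(paths[p].begin() + i, paths[p].begin() + i + k);
                if (common.count(w)) {
                    next.insert(std::move(w));
                }
            }
            common.swap(next);
        }
        if (!common.empty()) {
            return static_cast<int>(k);
        }
    }
    return 0;
}
```

Because it stores the windows themselves instead of hashes, this version cannot suffer from collisions, which makes it a convenient cross-check for the hashed solution on small test cases.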

Approach:

We binary search on the length of the common subpath. The search works because the property is monotone: if all paths share a subpath of length k, they also share one of every shorter length.

If a common subpath of length k exists, we search for longer ones; otherwise we discard the upper half of the search range. The answer cannot exceed the length L of the shortest path, so at most O(log L) lengths are checked.
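
The binary search itself can be sketched independently of the hashing. The helper below is an assumption made for illustration: largestFeasible and its feasible callback are hypothetical names, and the sketch presumes that feasibility is monotone (true for all lengths up to the answer, false beyond it) and that length 0 is always feasible.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: returns the largest k in [0, hi] for which feasible(k) holds,
// assuming feasible(0) is true and feasible is monotone (true, ..., true, false, ..., false).
int largestFeasible(int hi, const std::function<bool(int)>& feasible)
{
    assert(hi >= 0);
    int lo = 0;
    while (lo < hi) {
        // Round the midpoint up; otherwise lo = mid could repeat forever when hi == lo + 1.
        int mid = lo + (hi - lo + 1) / 2;
        if (feasible(mid)) {
            lo = mid;       // a subpath of length mid exists, try longer ones
        } else {
            hi = mid - 1;   // no subpath of length mid, so none longer either
        }
    }
    return lo;
}
```

In the problem at hand, hi would be the length of the shortest path and feasible(k) would answer whether all paths share a subpath of length k.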

To check whether a common subpath of length k exists, we keep two unordered maps. hs maps the hash of each length-k subpath that is common to all paths processed so far to its starting indices in the first path, and hs1 collects the same information while scanning the current path. Before inserting a subpath of the current path into hs1, we check that its hash is present in hs and then compare the subpath element by element against the stored candidate, since two different subpaths can collide on the same hash. Each comparison takes O(k) time. The complexity estimate below leaves this cost out, which is reasonable when few windows of the current path hit a hash in hs; note that the code runs the comparison on every hash hit, not only on collisions.
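
To make the hashing step concrete, here is a minimal sketch of a Rabin-Karp rolling hash over all windows of length k in a single path. The function name windowHashes is hypothetical, the constants mirror the MOD and BASE used in the implementation further down, and the sketch assumes city ids are non-negative and smaller than MOD. It removes the outgoing element before appending the incoming one, which is a slightly different ordering from the article's code but yields the same polynomial hash.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: hash of every length-k window of path, in order of start index.
// Window hash = sum of path[s + t] * BASE^(k - 1 - t) for t in [0, k), taken mod MOD.
std::vector<long long> windowHashes(const std::vector<int>& path, std::size_t k)
{
    const long long MOD = 1000000007LL;
    const long long BASE = 100001LL;
    assert(k >= 1);

    std::vector<long long> hashes;
    if (path.size() < k) {
        return hashes; // no window of length k fits
    }

    // top = BASE^(k - 1) mod MOD, the weight of the oldest element in a full window.
    long long top = 1;
    for (std::size_t i = 1; i < k; ++i) {
        top = top * BASE % MOD;
    }

    long long h = 0;
    for (std::size_t j = 0; j < path.size(); ++j) {
        if (j >= k) {
            // Remove the element that slides out of the window.
            h = (h - top * path[j - k] % MOD + MOD) % MOD;
        }
        // Shift the window and append the new element.
        h = (h * BASE + path[j]) % MOD;
        if (j + 1 >= k) {
            hashes.push_back(h);
        }
    }
    return hashes;
}
```

Each window costs O(1) to hash after the first one, so one pass over a path of length P takes O(P) time regardless of k. Equal windows always get equal hashes; unequal windows usually get different hashes, which is why the element-by-element comparison is still needed.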

Step-by-step algorithm:

  • Find the smallest path length among all given paths and call it hi; set lo to 0.
  • While lo is less than hi, binary search on the subpath length:
    • Calculate the midpoint mid = (lo + hi + 1) / 2, rounding up so that setting lo = mid always makes progress.
    • Initialize an empty hash map hs to hold the hashes that survived the previous path.
  • Hashing Subpaths:
    • For each path, compute the hash of every subpath of length mid using Rabin-Karp's rolling hash method.
    • Maintain a secondary hash map hs1 for the current path's hashes.
    • If this is the first path, simply store each hash with its starting index in hs1.
    • For subsequent paths, keep a hash only if it exists in hs and the subpaths themselves match, which rules out collisions.
  • Updating Hash Maps:
    • Swap hs and hs1 before moving to the next path.
    • If hs is empty after processing the paths, no common subpath of length mid exists, so reduce the search space (hi = mid - 1).
    • Otherwise a common subpath of length mid exists, so try longer lengths (lo = mid).
  • Return lo.

Below is the implementation of the approach:

C++
#include <bits/stdc++.h>
using namespace std;

const int MOD = 1e9 + 7;
// Base for Rabin-Karp algorithm
const int BASE = 100001;

bool check(int mid, vector<vector<int> >& paths, int n)
{
    // Maps the hash of each subpath common to all paths seen so far
    // to its starting positions in paths[0]
    unordered_map<int, vector<int> > hs;
    for (int i = 0;
         i < paths.size() && (i == 0 || !hs.empty()); ++i) {
        // d stores the required power of base
        long long hash = 0, d = 1;
        // Map to store the found hashes in current path
        unordered_map<int, vector<int> > hs1;
        for (int j = 0; j < paths[i].size(); ++j) {
            // Update hash for current element
            hash = (hash * BASE + paths[i][j]) % MOD;
            // Remove the extra element in the front
            if (j >= mid) {
                hash = (MOD + hash
                        - d * paths[i][j - mid] % MOD)
                       % MOD;
            }
            else {
                // Increase the power of base
                d = d * BASE % MOD;
            }
            // A full window of length mid ends at every index
            // from j = mid - 1 onward
            if (j >= mid - 1) {
                // If this is the first path
                if (i == 0) {
                    hs1[hash].push_back(j + 1 - mid);
                }
                // For subsequent paths, check if hash
                // exists in previous path
                else {
                    if (hs.count(hash)) {
                        for (auto pos : hs[hash]) {
                            if (equal(begin(paths[0]) + pos,
                                      begin(paths[0]) + pos
                                          + mid,
                                      begin(paths[i]) + j
                                          + 1 - mid)) {
                                hs1[hash].push_back(pos);
                                break;
                            }
                        }
                    }
                }
            }
        }
        // Current map becomes previous map for next path
        swap(hs, hs1);
    }
    return !hs.empty();
}

int longestCommonSubpath(int n, vector<vector<int> >& paths)
{
    // Find the smallest length path of the given paths
    int lo = 0, hi = 1e9;
    for (const auto& path : paths) {
        hi = min(hi, (int)path.size());
    }

    // Binary Search on answer
    while (lo < hi) {
        // Check if common path of length mid exists
        int mid = (lo + hi + 1) / 2;

        if (check(mid, paths, n)) {
            lo = mid;
        }
        else {
            hi = mid - 1;
        }
    }
    return lo;
}

int main()
{
    // Sample Input
    vector<vector<int> > paths = { { 0, 1, 2, 3, 4 },
                                   { 2, 3, 4 },
                                   { 4, 0, 1, 2, 3 } };

    cout << longestCommonSubpath(5, paths) << endl;
    return 0;
}

Output
2

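Time Complexity: O(m * P * log L) expected, where m is the number of paths, P is the average length of a path, and L is the length of the shortest path. This ignores the O(k) element-wise comparison performed on each hash hit; when most windows hit, a single check for length k can cost up to O(m * P * k).
Auxiliary Space: O(P_max), where P_max is the length of the longest path, since only the hash maps for the previous and the current path are alive at any time.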

