CSES Solution - Repeating Substring
A repeating substring is a substring that occurs in two (or more) locations in the string. Your task is to find the longest repeating substring in a given string.
Example:
Input: s = "cabababc"
Output: abab
Input: s = "babababb"
Output: babab
Approach:
The solution is based on two main concepts: Suffix Arrays and Longest Common Prefix (LCP) Arrays.
Suffix Arrays: A suffix array lists the starting indices of all suffixes of a string, arranged so that the suffixes appear in lexicographical order. Sorting the suffixes places those that begin with the same characters next to each other, which is what makes repeated substrings easy to find.
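To make the definition concrete, here is a minimal sketch that builds a suffix array by sorting the suffixes directly. The function name naive_suffix_array is only illustrative and is not part of the solution; this approach is far too slow for large inputs, but it shows what the array contains.

```python
def naive_suffix_array(s):
    """Return the starting indices of all suffixes of s in lexicographic order."""
    # Sorting by the suffix text itself is simple but slow (each comparison
    # may scan O(n) characters), so this is for illustration only.
    return sorted(range(len(s)), key=lambda i: s[i:])


if __name__ == "__main__":
    s = "cabababc"
    for rank, start in enumerate(naive_suffix_array(s)):
        # Each row shows: rank in sorted order, starting index, suffix text
        print(rank, start, s[start:])
```

For "cabababc", the suffixes starting with "ab" (indices 1, 3 and 5) end up in neighbouring rows, which is the property the rest of the approach relies on.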
Longest Common Prefix (LCP) Array: The LCP array stores, for each pair of consecutive suffixes in the sorted suffix array, the length of their longest common prefix. Because sorting places suffixes that share a long prefix next to each other, the largest entry in the LCP array is the length of the longest repeating substring.
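As a rough illustration, the LCP values can be computed by comparing each pair of neighbouring suffixes directly. The helper names below are hypothetical, and this sketch returns one value per adjacent pair (n - 1 values), whereas the implementations in this article keep an array of length n whose last entry stays 0.

```python
def common_prefix_len(a, b):
    """Length of the longest common prefix of strings a and b."""
    k = 0
    while k < len(a) and k < len(b) and a[k] == b[k]:
        k += 1
    return k


def naive_lcp_array(s):
    """LCP of every pair of neighbouring suffixes in sorted order."""
    # Naive suffix array: sort starting indices by suffix text.
    sa = sorted(range(len(s)), key=lambda i: s[i:])
    return [
        common_prefix_len(s[sa[r]:], s[sa[r + 1]:])
        for r in range(len(sa) - 1)
    ]


if __name__ == "__main__":
    print(naive_lcp_array("cabababc"))
```

For "cabababc" the largest value is 4, coming from the neighbouring suffixes "abababc" and "ababc", which share the prefix "abab".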
The core logic of the solution is as follows:
- Step 1: Build the suffix array of the string. This is done with the prefix-doubling idea of the Manber-Myers algorithm: first sort the suffixes by their first character, then by their first 2, 4, 8, ... characters, until every suffix has a distinct rank. Each round here uses a comparison sort, so the construction takes O(n log² n) time; the original Manber-Myers algorithm reaches O(n log n) by using a radix sort in each round.
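- Step 2: Build the LCP array from the suffix array. For each suffix, taken in order of starting index, compare it character by character with the suffix that follows it in sorted order and count the matches. The count k found for the suffix starting at i is carried over as k - 1 for the suffix starting at i + 1 (this is Kasai's algorithm), so the whole array is built in O(n) time.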
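- Step 3: The maximum value in the LCP array is the length of the longest repeating substring, because every repeated substring is a common prefix of two suffixes, and the longest such prefix always occurs between two neighbours in sorted order. If the maximum is at index maxIndex, the substring starts at suffixArray[maxIndex].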
- Step 4: If the maximum LCP is 0, it means there are no repeating substrings, so the program outputs -1. Otherwise, it prints the longest repeating substring.
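The four steps can be sketched end to end with a deliberately naive construction, which is handy as a reference when testing a faster implementation on small inputs. The function name longest_repeat_naive is an assumption for this sketch; it sorts suffixes directly instead of using prefix doubling, and it returns the string "-1" when nothing repeats.

```python
def longest_repeat_naive(s):
    """Longest substring occurring at least twice in s, or "-1" if none."""
    n = len(s)
    # Step 1 (naive): suffix array by sorting suffix texts.
    sa = sorted(range(n), key=lambda i: s[i:])

    best_len, best_start = 0, 0
    # Steps 2 and 3: scan neighbouring suffixes and track the largest LCP.
    for r in range(n - 1):
        a, b = sa[r], sa[r + 1]
        k = 0
        while a + k < n and b + k < n and s[a + k] == s[b + k]:
            k += 1
        if k > best_len:
            best_len, best_start = k, a

    # Step 4: an LCP of 0 everywhere means no substring repeats.
    return s[best_start:best_start + best_len] if best_len else "-1"


if __name__ == "__main__":
    print(longest_repeat_naive("cabababc"))
    print(longest_repeat_naive("babababb"))
```

When several repeating substrings share the maximum length, this sketch reports the one that comes first in suffix-array order.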
Step-by-step approach:
- Build the Suffix Array
  - Iterate over gap values (powers of 2).
  - Sort suffixArray[] using the compareSuffixes function.
  - Fill temporary[] with the new rank of each suffix in the sorted order.
  - Copy the values of temporary[] into position[].
  - Stop when all ranks are distinct, that is, when every suffix is at its final position.
- Build the Longest Common Prefix Array
  - Iterate over the starting indices i of the string s.
  - Get the suffix j that follows suffix i in sorted order.
  - Compare characters and extend the LCP value k.
  - Store k as the LCP value and decrement k if it is non-zero.
- Find the Longest Repeating Substring
  - Find the index maxIndex of the maximum element in the LCP array.
  - If the maximum LCP value is 0, no repeating substring exists.
  - Otherwise, output the longest repeating substring using the suffix array and the LCP value.
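The "decrement k" step in the LCP construction relies on a property worth checking by hand: moving from the suffix starting at i to the one starting at i + 1, the LCP with the next suffix in sorted order can drop by at most 1. The sketch below computes those values naively and tests the property; the names successor_lcps and drops_by_at_most_one are hypothetical helpers, not part of the solution.

```python
def successor_lcps(s):
    """h[i] = LCP of suffix i with the suffix after it in sorted order.

    h[i] is 0 for the suffix that is last in sorted order.
    """
    n = len(s)
    sa = sorted(range(n), key=lambda i: s[i:])  # naive suffix array
    rank = [0] * n
    for r, start in enumerate(sa):
        rank[start] = r

    h = [0] * n
    for i in range(n):
        if rank[i] == n - 1:
            continue  # no successor in sorted order
        j = sa[rank[i] + 1]
        k = 0
        while i + k < n and j + k < n and s[i + k] == s[j + k]:
            k += 1
        h[i] = k
    return h


def drops_by_at_most_one(h):
    """True if h[i + 1] >= h[i] - 1 for every consecutive pair."""
    return all(h[i + 1] >= h[i] - 1 for i in range(len(h) - 1))


if __name__ == "__main__":
    h = successor_lcps("cabababc")
    print(h, drops_by_at_most_one(h))
```

Because of this property, starting each comparison from k - 1 never skips a mismatch, and the total number of character comparisons stays linear in n.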
Below is the implementation of the above approach:
C++:
#include <bits/stdc++.h>
using namespace std;

// Maximum supported string length
const int maxSize = 1e5 + 5;

// Suffix array, rank of each suffix, scratch ranks, and
// longest common prefix array
int suffixArray[maxSize], position[maxSize],
    temporary[maxSize], LCP[maxSize];

// Current comparison gap and length of the string
int gap, n;

// The input string
string s;

// Returns true if suffix x sorts before suffix y when only
// their first 2 * gap characters are considered
bool compareSuffixes(int x, int y)
{
    // Ranks on the first gap characters decide if they differ
    if (position[x] != position[y])
        return position[x] < position[y];
    x += gap;
    y += gap;
    // Otherwise compare the next gap characters; a suffix
    // that ends early (larger start index) sorts first
    return (x < n && y < n) ? position[x] < position[y]
                            : x > y;
}

// Function to build the suffix array by prefix doubling
void buildSuffixArray()
{
    // Start with each suffix ranked by its first character
    for (int i = 0; i < n; i++)
        suffixArray[i] = i, position[i] = s[i];

    // Double the compared length on every round
    for (gap = 1;; gap <<= 1) {
        // Sort suffix array using the current gap
        sort(suffixArray, suffixArray + n, compareSuffixes);

        // temporary[i] is the new rank of the i-th suffix in
        // sorted order; equal neighbours share a rank
        for (int i = 0; i < n - 1; i++)
            temporary[i + 1]
                = temporary[i]
                  + compareSuffixes(suffixArray[i],
                                    suffixArray[i + 1]);

        // Update position array with temporary values
        for (int i = 0; i < n; i++)
            position[suffixArray[i]] = temporary[i];

        // Stop once all ranks are distinct
        if (temporary[n - 1] == n - 1)
            break;
    }
}

// Function to build the longest common prefix array
// (Kasai's algorithm)
void buildLCP()
{
    // Visit suffixes in order of starting index so that k
    // can be reused between iterations
    for (int i = 0, k = 0; i < n; i++)
        if (position[i] != n - 1) {
            // Get the next suffix in the sorted order
            int j = suffixArray[position[i] + 1];

            // Extend the match; s[n] is '\0' for
            // std::string, so the loop stops at the end
            while (s[i + k] == s[j + k])
                k++;

            // Set LCP value and decrement if non-zero
            LCP[position[i]] = k;
            if (k)
                k--;
        }
}

// Main function
int main()
{
    // Sample input; a judge submission would read s from
    // standard input instead
    s = "cabababc";
    n = s.size();

    // Build the suffix array and longest common prefix
    // array
    buildSuffixArray();
    buildLCP();

    // Find the index of the maximum element in the LCP
    // array
    int maxIndex = max_element(LCP, LCP + n) - LCP;

    // If the maximum LCP value is 0, no repeating substring
    // exists
    if (LCP[maxIndex] == 0) {
        cout << -1 << '\n';
        return 0;
    }

    // Output the longest repeating substring
    cout << s.substr(suffixArray[maxIndex], LCP[maxIndex])
         << '\n';
    return 0;
}
Java:
import java.util.Arrays;

public class Main {
    // Current comparison gap and length of the string
    static int gap = 0;
    static int n = 0;

    // The input string
    static String s = "";

    // Returns true if suffix x sorts before suffix y when only
    // their first 2 * gap characters are considered
    static boolean compareSuffixes(int x, int y, int[] position) {
        if (position[x] != position[y]) {
            return position[x] < position[y];
        }
        x += gap;
        y += gap;
        // A suffix that ends early (larger start index) sorts first
        return (x < n && y < n) ? position[x] < position[y] : x > y;
    }

    // Function to build the suffix array; returns {suffixArray, position}
    static int[][] buildSuffixArray() {
        // Start with each suffix ranked by its first character
        Integer[] suffixArray = new Integer[n];
        int[] position = new int[n];
        for (int i = 0; i < n; i++) {
            suffixArray[i] = i;
            position[i] = s.charAt(i);
        }

        // Iterate over gaps (powers of 2) for sorting
        for (gap = 1; ; gap <<= 1) {
            // Sort suffix array using the current gap; return 0 for
            // equal keys so the comparator contract is respected
            Arrays.sort(suffixArray, (x, y) ->
                compareSuffixes(x, y, position) ? -1
                    : (compareSuffixes(y, x, position) ? 1 : 0));

            // temporary[i] is the new rank of the i-th suffix in sorted order
            int[] temporary = new int[n];
            for (int i = 0; i < n - 1; i++) {
                temporary[i + 1] = temporary[i]
                    + (compareSuffixes(suffixArray[i], suffixArray[i + 1], position) ? 1 : 0);
            }

            // Update position array with temporary values
            for (int i = 0; i < n; i++) {
                position[suffixArray[i]] = temporary[i];
            }

            // Stop once all ranks are distinct
            if (temporary[n - 1] == n - 1) {
                break;
            }
        }
        return new int[][] {
            Arrays.stream(suffixArray).mapToInt(Integer::intValue).toArray(),
            position
        };
    }

    // Function to build the longest common prefix array (Kasai's algorithm)
    static int[] buildLCP(int[] suffixArray, int[] position) {
        int[] LCP = new int[n];
        int k = 0;
        // Visit suffixes in order of starting index so that k can be reused
        for (int i = 0; i < n; i++) {
            if (position[i] != n - 1) {
                // Get the next suffix in the sorted order
                int j = suffixArray[position[i] + 1];
                // Compare characters and update LCP
                while (i + k < n && j + k < n && s.charAt(i + k) == s.charAt(j + k)) {
                    k++;
                }
                // Set LCP value and decrement if non-zero
                LCP[position[i]] = k;
                if (k > 0) {
                    k--;
                }
            }
        }
        return LCP;
    }

    public static void main(String[] args) {
        // Sample input; a judge submission would read s from standard input
        s = "cabababc";
        n = s.length();

        // Build the suffix array
        int[][] result = buildSuffixArray();
        int[] suffixArray = result[0];
        int[] position = result[1];

        // Build the longest common prefix array
        int[] LCP = buildLCP(suffixArray, position);

        // Find the index of the maximum element in the LCP array
        int maxIndex = 0;
        for (int i = 1; i < LCP.length; i++) {
            if (LCP[i] > LCP[maxIndex]) {
                maxIndex = i;
            }
        }

        // If the maximum LCP value is 0, no repeating substring exists
        if (LCP[maxIndex] == 0) {
            System.out.println(-1);
        } else {
            // Output the longest repeating substring
            System.out.println(s.substring(suffixArray[maxIndex], suffixArray[maxIndex] + LCP[maxIndex]));
        }
    }
}
Python:
# Current comparison gap and length of the string
gap = 0
n = 0

# The input string
s = ""


def compareSuffixes(x, y, position):
    """Return True if suffix x sorts before suffix y when only
    their first 2 * gap characters are considered."""
    if position[x] != position[y]:
        return position[x] < position[y]
    x += gap
    y += gap
    # A suffix that ends early (larger start index) sorts first
    return position[x] < position[y] if (x < n and y < n) else x > y


def buildSuffixArray():
    """Build the suffix array by prefix doubling."""
    global gap
    # Start with each suffix ranked by its first character
    suffixArray = list(range(n))
    position = [ord(c) for c in s]

    # Iterate over gaps (powers of 2) for sorting
    gap = 1
    while True:
        # Sort by (rank of first gap chars, rank of next gap chars);
        # -1 makes a suffix that ends early sort first
        suffixArray.sort(
            key=lambda x: (position[x], position[x + gap] if x + gap < n else -1)
        )

        # temporary[i] is the new rank of the i-th suffix in sorted order
        temporary = [0] * n
        for i in range(n - 1):
            temporary[i + 1] = temporary[i] + int(
                compareSuffixes(suffixArray[i], suffixArray[i + 1], position)
            )

        # Update position array with temporary values
        for i in range(n):
            position[suffixArray[i]] = temporary[i]

        # Stop once all ranks are distinct
        if temporary[n - 1] == n - 1:
            break
        gap <<= 1
    return suffixArray, position


def buildLCP(suffixArray, position):
    """Build the longest common prefix array (Kasai's algorithm)."""
    LCP = [0] * n
    k = 0
    # Visit suffixes in order of starting index so that k can be reused
    for i in range(n):
        if position[i] != n - 1:
            # Get the next suffix in the sorted order
            j = suffixArray[position[i] + 1]
            # Compare characters and update LCP
            while i + k < n and j + k < n and s[i + k] == s[j + k]:
                k += 1
            # Set LCP value and decrement if non-zero
            LCP[position[i]] = k
            if k:
                k -= 1
    return LCP


# Sample input; a judge submission would read s from standard input
s = "cabababc"
n = len(s)

# Build the suffix array
suffixArray, position = buildSuffixArray()

# Build the longest common prefix array
LCP = buildLCP(suffixArray, position)

# Find the index of the maximum element in the LCP array
maxIndex = LCP.index(max(LCP))

# If the maximum LCP value is 0, no repeating substring exists
if LCP[maxIndex] == 0:
    print(-1)
else:
    # Output the longest repeating substring
    print(s[suffixArray[maxIndex]: suffixArray[maxIndex] + LCP[maxIndex]])
JavaScript:
// Current comparison gap and length of the string
let gap = 0;
let n = 0;

// The input string
let s = "";

// Returns true if suffix x sorts before suffix y when only
// their first 2 * gap characters are considered
function compareSuffixes(x, y, position) {
    if (position[x] !== position[y]) {
        return position[x] < position[y];
    }
    x += gap;
    y += gap;
    // A suffix that ends early (larger start index) sorts first
    return (x < n && y < n) ? position[x] < position[y] : x > y;
}

// Function to build the suffix array by prefix doubling
function buildSuffixArray() {
    // Start with each suffix ranked by its first character
    const suffixArray = [...Array(n).keys()];
    const position = [...s].map(c => c.charCodeAt(0));

    // Iterate over gaps (powers of 2) for sorting
    for (gap = 1; ; gap <<= 1) {
        // Sort suffix array using the current gap
        suffixArray.sort((x, y) => {
            if (compareSuffixes(x, y, position)) return -1;
            if (compareSuffixes(y, x, position)) return 1;
            return 0;
        });

        // temporary[i] is the new rank of the i-th suffix in sorted order
        const temporary = Array(n).fill(0);
        for (let i = 0; i < n - 1; i++) {
            temporary[i + 1] = temporary[i]
                + (compareSuffixes(suffixArray[i], suffixArray[i + 1], position) ? 1 : 0);
        }

        // Update position array with temporary values
        for (let i = 0; i < n; i++) {
            position[suffixArray[i]] = temporary[i];
        }

        // Stop once all ranks are distinct
        if (temporary[n - 1] === n - 1) {
            break;
        }
    }
    return [suffixArray, position];
}

// Function to build the longest common prefix array (Kasai's algorithm)
function buildLCP(suffixArray, position) {
    const LCP = Array(n).fill(0);
    let k = 0;
    // Visit suffixes in order of starting index so that k can be reused
    for (let i = 0; i < n; i++) {
        if (position[i] !== n - 1) {
            // Get the next suffix in the sorted order
            const j = suffixArray[position[i] + 1];
            // Compare characters and update LCP
            while (i + k < n && j + k < n && s[i + k] === s[j + k]) {
                k++;
            }
            // Set LCP value and decrement if non-zero
            LCP[position[i]] = k;
            if (k !== 0) {
                k--;
            }
        }
    }
    return LCP;
}

// Sample input; a judge submission would read s from standard input
s = "cabababc";
n = s.length;

// Build the suffix array
const [suffixArray, position] = buildSuffixArray();

// Build the longest common prefix array
const LCP = buildLCP(suffixArray, position);

// Find the index of the maximum element in the LCP array
let maxIndex = 0;
for (let i = 1; i < LCP.length; i++) {
    if (LCP[i] > LCP[maxIndex]) {
        maxIndex = i;
    }
}

// If the maximum LCP value is 0, no repeating substring exists
if (LCP[maxIndex] === 0) {
    console.log(-1);
} else {
    // Output the longest repeating substring
    console.log(s.substring(suffixArray[maxIndex], suffixArray[maxIndex] + LCP[maxIndex]));
}
Output
abab
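Time Complexity: O(n log² n), since each of the O(log n) doubling rounds runs an O(n log n) comparison sort; building the LCP array adds only O(n).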
Auxiliary Space: O(n)