4.28 Data structures using C
If p[i] is not equal to p[j] and if i is not zero, then assign f[i-1] to i. The code for
this case can be written as shown below:
i = f[i-1];
Now, the complete code for the failure function can be written as shown below:
Example 4.13: Failure function for the pattern string
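void failure(int f[], char p[])
{
    int i = 0, j = 1;
    f[i] = 0;                       // A single character has no proper prefix
    while (j < strlen(p))
    {
        if (p[i] == p[j])
            f[j] = i+1, i++, j++;   // Prefix of length i+1 matches the suffix ending at j
        else if (i == 0)
            f[j++] = i;             // No prefix matches: failure value is 0
        else
            i = f[i-1];             // Fall back to the next shorter prefix
    }
}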
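The values stored in f can be cross-checked against a direct definition. The sketch below assumes that f[j] is meant to be the length of the longest proper prefix of p[0..j] that is also a suffix of p[0..j], which is what the code above computes. The function name failure_naive is hypothetical and is not part of the text; it is slower than Example 4.13 and is intended only for checking small patterns by hand.

```c
#include <string.h>

/* Hypothetical helper, not from the text: fills f[] directly from the
   definition by trying every candidate length, longest first. */
void failure_naive(int f[], const char p[])
{
    int m = (int)strlen(p);
    for (int j = 0; j < m; j++)
    {
        f[j] = 0;
        /* len starts at j, so the prefix is always shorter than p[0..j] */
        for (int len = j; len > 0; len--)
        {
            /* compare prefix p[0..len-1] with suffix p[j-len+1..j] */
            if (strncmp(p, p + j - len + 1, (size_t)len) == 0)
            {
                f[j] = len;
                break;
            }
        }
    }
}
```

For the pattern ABABD used in this section, both functions should fill f with 0 0 1 2 0.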
Search for the pattern using failure function: To search for the pattern string in the
text string, first we align the pattern with the beginning of the text string as shown below:
    i
    0  1  2  3  4  5  6  7  8  9 10 11 12 13
t   A  B  A  D  A  B  A  B  E  A  B  A  B  D
p   A  B  A  B  D
    0  1  2  3  4
    j
Initialization: Since we have to compare the first character of text string and first
character of pattern string we have to initialize i to 0 and j to 0 as shown below:
i=0
j=0
Case 1: When corresponding characters are equal: If t[i] is same as p[j] then we
increment i by 1 and j by 1 so as to compare the next characters. The code for this
case can be written as shown below:
Strings 4.29
if (t[i] == p[j])
{
    i++;
    j++;
}
Case 2: When corresponding characters are not equal and j is zero: Consider the
following scenario:
    i
    0  1  2  3  4  5  6  7  8  9 10 11 12 13
t   A  B  A  D  A  B  A  B  E  A  B  A  B  D
p   B  B  A  B  D
    0  1  2  3  4
    j
Note that t[i] is not equal to p[j] in the above scenario. So, we have to slide the pattern
string towards right as shown below:
       i
    0  1  2  3  4  5  6  7  8  9 10 11 12 13
t   A  B  A  D  A  B  A  B  E  A  B  A  B  D
p      B  B  A  B  D
       0  1  2  3  4
       j
Observe that the value of j has not changed, but the value of i is incremented by
one. Incrementing i by 1 while j remains 0 is equivalent to moving the pattern string
towards right by one position. The code corresponding to this can be written
as shown below:
if (j == 0) i++;
Case 3: When corresponding characters are not equal and j is not zero: Consider the
following scenario:
             i
    0  1  2  3  4  5  6  7  8  9 10 11 12 13
t   A  B  A  D  A  B  A  B  E  A  B  A  B  D
p   A  B  A  B  D
    0  1  2  3  4
             j
Observe that the mismatch occurs at t[3] and p[3], after the characters in positions 0
to 2 have matched. Within the matched part ABA of the pattern, A in position 0 is a
proper prefix and A in position 2 is a proper suffix. It means that the proper prefix A
in the pattern and A in position 2 of the text string are one and the same. So, shift the
pattern string towards right so that A in position 2 of the text string and the proper
prefix A in position 0 of the pattern string are aligned as shown below:
             i
    0  1  2  3  4  5  6  7  8  9 10 11 12 13
t   A  B  A  D  A  B  A  B  E  A  B  A  B  D
p         A  B  A  B  D
          0  1  2  3  4
             j
Now, there is no need to compare A in position 2 of the text string with A in position 0
of the pattern string again. So, the comparison resumes with j equal to 1 while i
remains 3. This value of j can be obtained from the failure function f. In the first figure
of case 3, the mismatch occurs when j is 3. Take the previous position, i.e., position 2
(character A), and look up its failure value f[2]. Its value, 1, has to be copied into j.
The code for this case can be written as shown below:
if (j != 0) j = f[j-1];
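The effect of cases 2 and 3 on the position of the pattern can also be expressed as a shift distance. The following is a minimal sketch; the helper name shift_after_mismatch is hypothetical and does not appear in the text. It assumes f already holds the failure values of the pattern.

```c
/* Hypothetical helper, not from the text: number of positions the pattern
   slides to the right after a mismatch at pattern index j. */
int shift_after_mismatch(const int f[], int j)
{
    if (j == 0)
        return 1;           /* Case 2: i is incremented, pattern moves by one */
    return j - f[j - 1];    /* Case 3: j drops to f[j-1] while i stays the same */
}
```

For the pattern ABABD the failure values are 0 0 1 2 0, so a mismatch at j = 3 gives a shift of 3 - f[2] = 2. This agrees with the figures of case 3, where the pattern moves from position 0 to position 2 of the text string.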
The statements of all three cases should be executed repeatedly as long as i is less than
the string length of the text string and j is less than the string length of the pattern
string. Now, the code can be written as shown below:
while (i < strlen(t) && j < strlen(p))
{
    if (t[i] == p[j])
    {
        i++;
        j++;
    }
    else if (j == 0)
        i++;
    else
        j = f[j-1];
}
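To see how little work this loop does, the comparisons can be counted. Every iteration either increments i or decreases j, and j can decrease no more often than it was incremented, so for a text string of length n the loop makes at most 2n comparisons. The sketch below is an instrumented copy of the loop; the name count_comparisons is hypothetical, and f is assumed to hold the failure values of p.

```c
#include <string.h>

/* Hypothetical instrumented copy of the search loop: returns the number
   of times t[i] is compared with p[j]. */
int count_comparisons(const char p[], const char t[], const int f[])
{
    int n = (int)strlen(t), m = (int)strlen(p);
    int i = 0, j = 0, count = 0;
    while (i < n && j < m)
    {
        count++;                /* one comparison of t[i] with p[j] */
        if (t[i] == p[j])
            i++, j++;
        else if (j == 0)
            i++;
        else
            j = f[j - 1];
    }
    return count;
}
```

For the text string ABADABABEABABD and the pattern ABABD with failure values 0 0 1 2 0, the function returns 18, which is within the bound of 2 x 14 = 28.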
When control comes out of the loop, if j is equal to the string length of the pattern
string, the pattern has been found and its starting position in the text string,
i - strlen(p), is returned. Otherwise, -1 is returned, indicating that the pattern was not
found. The code for this case can be written as shown below:
if (j == strlen(p))
    return i - strlen(p);   // Pattern string found. Return its position
else
    return -1;              // Pattern string not found
Now, the complete function to search for the pattern string in a text string using KMP
method can be written as shown below:
Example 4.14: Pattern match using Knuth, Morris and Pratt method
int pattern_match(char p[], char t[], int f[])
{
    int i = 0, j = 0;
    while (i < strlen(t) && j < strlen(p))
    {
        if (t[i] == p[j])
            i++, j++;           // Characters match: compare the next characters
        else if (j == 0)
            i++;                // Move the pattern string towards right by 1
        else
            j = f[j-1];         // Shift the pattern, keeping f[j-1] characters matched
    }
    if (j == strlen(p))         // Pattern string found
        return i - strlen(p);
    else                        // Pattern string not found
        return -1;
}
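The function above expects the caller to supply a failure table f that is large enough for the pattern. As a possible variation, the table can be allocated inside the search function. The sketch below is one way to do this; the name kmp_search, the return value -2 for an allocation failure, and the treatment of an empty pattern are assumptions and are not taken from the text. The two loops follow the logic of Example 4.13 and Example 4.14.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper, not from the text. Returns the position of p in t,
   -1 if p is not found, or -2 if memory cannot be allocated. */
int kmp_search(const char p[], const char t[])
{
    int n = (int)strlen(t), m = (int)strlen(p);
    if (m == 0)
        return 0;               /* assumption: an empty pattern matches at 0 */

    int *f = malloc((size_t)m * sizeof *f);
    if (f == NULL)
        return -2;

    /* Build the failure table (same logic as Example 4.13) */
    int i = 0, j = 1;
    f[0] = 0;
    while (j < m)
    {
        if (p[i] == p[j])
        {
            f[j] = i + 1;
            i++;
            j++;
        }
        else if (i == 0)
            f[j++] = 0;
        else
            i = f[i - 1];
    }

    /* Search the text (same logic as Example 4.14) */
    i = 0;
    j = 0;
    while (i < n && j < m)
    {
        if (t[i] == p[j])
            i++, j++;
        else if (j == 0)
            i++;
        else
            j = f[j - 1];
    }

    free(f);
    return (j == m) ? i - m : -1;
}
```

The string lengths are computed once before the loops, so strlen is not called again on every iteration. For the text string ABADABABEABABD and the pattern ABABD, the call kmp_search("ABABD", "ABADABABEABABD") returns 9.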
Now, the complete program to search for the pattern string in a text string using KMP
algorithm is shown below:
Example 4.15: C Program to search for the pattern in a given text using KMP method
#include <stdio.h>
#include <string.h>
// Include: Example 4.13: Failure function for the pattern string
// Include: Example 4.14: Pattern matching using KMP method
int main(void)
{
    int pos;
    char t[40], p[20];
    int f[20];
    printf("Enter the text string:");
    scanf("%39s", t);       // Read at most 39 characters so that t does not overflow
    printf("Enter the pattern string:");
    scanf("%19s", p);       // Read at most 19 characters so that p and f do not overflow
    failure(f, p);
    pos = pattern_match(p, t, f);
    if (pos == -1)
        printf("Pattern string not found\n");
    else
        printf("Pattern string found at pos: %d\n", pos);
    return 0;
}
Exercises
1) What is a string? How are strings stored in memory?
2) What are the operations that can be performed on strings?
3) What is pattern matching? Design an algorithm to search for pattern string p in
text string t from position i using the straightforward method (brute force method).
4) Design a function to search for the string in a pattern string using the pattern
matching table.
5) Design a function to search for a pattern string p in the text string t and replace it
with the replace string r.
6) Design a KMP algorithm for pattern matching.
7) Design a brute force pattern matching algorithm by checking end indices first.
8) Design functions to implement the following C string functions:
a) strlen b) strcpy c) strcat d) strrev e) strcmp