The document describes the Knuth-Morris-Pratt (KMP) string matching algorithm. The KMP algorithm matches a pattern string to a main string in linear time O(n) by utilizing a prefix function to avoid backtracking. It consists of two parts: (1) a prefix function that calculates shift amounts to avoid re-matching already seen prefixes; and (2) a KMP matcher that uses the prefix function to efficiently find matches by shifting the pattern instead of re-matching characters. Pseudocode is provided to calculate the prefix function in O(m) time and to perform the matching in O(n) time, where m and n are the lengths of the pattern and main string.
Lecture 39 Knuth Morris Pratt
DR. APJ ABDUL KALAM TECHNICAL UNIVERSITY
Branch - CSE Design and Analysis of Algorithms
Lecture – 39
String Matching Algorithm: Knuth-Morris-Pratt
By
Mr. Prabhat Singh
Assistant Professor, Department of Computer Science & Engineering
ABES Engineering College, Ghaziabad

String Matching: Knuth-Morris-Pratt Algorithm

Knuth-Morris-Pratt Algorithm: Knuth, Morris and Pratt introduced a linear-time algorithm for the string matching problem. A matching time of O(n) is achieved by never re-comparing a character of the string 'S' that has already been compared with some character of the pattern 'p' to be matched, i.e., backtracking on the string 'S' never occurs.

Components of KMP Algorithm:

1. The Prefix Function (Π): The prefix function Π for a pattern encapsulates knowledge about how the pattern matches against shifts of itself. Π[q] is the length of the longest proper prefix of the pattern that is also a suffix of its first q characters. This information can be used to avoid useless shifts of the pattern 'p'. In other words, it enables avoiding backtracking on the string 'S'.

2. The KMP Matcher: With string 'S', pattern 'p' and prefix function 'Π' as inputs, it finds the occurrences of 'p' in 'S' and reports the shift of 'p' at which each occurrence is found. KMP checks the characters from left to right.

For Example:
Input: Main String: "AAAABAAAAABBBAAAAB", Pattern: "AAAB"
Output:
Pattern found at location: 1
Pattern found at location: 7
Pattern found at location: 14
(Locations are shifts counted from 0.)

Algorithm for Computing the Prefix Function (Π):

COMPUTE-PREFIX-FUNCTION (P)
1. m ← length [P]                      // 'P' is the pattern to be matched
2. Π [1] ← 0
3. k ← 0
4. for q ← 2 to m
5.     do while k > 0 and P [k + 1] ≠ P [q]
6.            do k ← Π [k]
7.        if P [k + 1] = P [q]
8.            then k ← k + 1
9.        Π [q] ← k
10. return Π

Running Time Analysis for Computing the Prefix Function: In the above pseudocode, the for loop from step 4 to step 9 runs m - 1 times. In each iteration k increases by at most 1 (step 8), and every pass of the while loop (step 6) strictly decreases k, which never drops below 0, so the while loop executes at most m - 1 times in total over the whole run. Steps 1 to 3 take constant time. Hence the running time of computing the prefix function is O(m).
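As a rough illustration, COMPUTE-PREFIX-FUNCTION can be sketched in Python as below. This is a minimal sketch, not the lecture's own code: the name `compute_prefix_function` is chosen here for illustration, and indices are 0-based, so Π[k] in the pseudocode becomes `pi[k - 1]`. The pattern "AAAB" comes from the example above; "ABABACA" is an extra pattern assumed for illustration.

```python
def compute_prefix_function(pattern: str) -> list[int]:
    """Return pi, where pi[q] is the length of the longest proper prefix of
    pattern that is also a suffix of pattern[:q + 1] (0-based indices)."""
    m = len(pattern)
    pi = [0] * m
    k = 0  # length of the prefix matched so far
    for q in range(1, m):
        # Fall back to shorter and shorter borders until one can be
        # extended by pattern[q], or no border is left (k == 0).
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


print(compute_prefix_function("AAAB"))     # [0, 1, 2, 0]
print(compute_prefix_function("ABABACA"))  # [0, 0, 1, 2, 3, 0, 1]
```

The variable `k` only grows by one per loop iteration and each fallback shrinks it, which is the reason the total work stays linear in the pattern length.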
Example: Compute Π for the pattern 'p' below:
[Figures: worked example of computing Π]

KMP Matcher: The KMP matcher, with the pattern 'p', the string 'S' and the prefix function 'Π' as input, finds the matches of p in S. In the pseudocode below, T denotes the string 'S' and P denotes the pattern 'p'.

KMP-MATCHER (T, P)
1. n ← length [T]
2. m ← length [P]
3. Π ← COMPUTE-PREFIX-FUNCTION (P)
4. q ← 0                               // number of characters matched
5. for i ← 1 to n                      // scan S from left to right
6.     do while q > 0 and P [q + 1] ≠ T [i]
7.            do q ← Π [q]             // next character does not match
8.        if P [q + 1] = T [i]
9.            then q ← q + 1           // next character matches
10.       if q = m                     // is all of p matched?
11.           then print "Pattern occurs with shift" i - m
12.                q ← Π [q]           // look for the next match

Running Time Analysis: The for loop beginning in step 5 runs n times, once for each character of the string 'S'. In each iteration q increases by at most 1 (step 9), and every pass of the while loop (step 7) strictly decreases q, so the while loop executes at most n times in total. Steps 1, 2 and 4 take constant time, and step 3 takes O(m) for computing the prefix function. Thus the running time of the matching loop is O(n), and the whole algorithm runs in O(m + n).
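The matcher can be sketched in Python along the same lines. This is a sketch under stated assumptions rather than a definitive implementation: the names `kmp_matcher` and `compute_prefix_function` are chosen here for illustration, indices are 0-based (so the reported shift is `i - m + 1` rather than `i - m`), and returning no matches for an empty pattern is an assumption the pseudocode does not cover. The block carries its own prefix-function helper so that it runs on its own.

```python
def compute_prefix_function(pattern: str) -> list[int]:
    """pi[q] = length of the longest proper prefix of pattern that is
    also a suffix of pattern[:q + 1]."""
    pi = [0] * len(pattern)
    k = 0
    for q in range(1, len(pattern)):
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


def kmp_matcher(text: str, pattern: str) -> list[int]:
    """Return every 0-based shift at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []  # assumption: an empty pattern reports no matches
    pi = compute_prefix_function(pattern)
    shifts = []
    q = 0  # number of pattern characters matched so far
    for i in range(n):  # scan the text left to right, never moving back
        while q > 0 and pattern[q] != text[i]:
            q = pi[q - 1]  # next character does not match: fall back
        if pattern[q] == text[i]:
            q += 1  # next character matches
        if q == m:  # the whole pattern is matched
            shifts.append(i - m + 1)
            q = pi[q - 1]  # look for the next (possibly overlapping) match
    return shifts


print(kmp_matcher("AAAABAAAAABBBAAAAB", "AAAB"))  # [1, 7, 14]
```

The printed shifts agree with the locations 1, 7 and 14 given in the earlier example. The index `i` into the text only ever moves forward, which is what "no backtracking on S" means in practice.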