Title: Prefix
1 Prefix and Suffix
Example: W = ab is a prefix of X = abefac, where Y = efac (X = WY).
The empty string ε is a prefix of any string.
Example: W = cdaa is a suffix of X = acbecdaa, where Y = acbe (X = YW).
ε is a suffix of any string.
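The two examples above can be checked mechanically. This is a minimal sketch using Python's built-in string methods; the helper names is_prefix and is_suffix are illustrative and do not come from the slides.

```python
def is_prefix(w, x):
    """True if x = w + y for some (possibly empty) string y."""
    return x.startswith(w)


def is_suffix(w, x):
    """True if x = y + w for some (possibly empty) string y."""
    return x.endswith(w)


print(is_prefix("ab", "abefac"))      # True  (y = "efac")
print(is_suffix("cdaa", "acbecdaa"))  # True  (y = "acbe")
# The empty string is both a prefix and a suffix of any string.
print(is_prefix("", "abefac"), is_suffix("", "abefac"))  # True True
```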
2 Overlapping Suffix
Lemma: Suppose X and Y are both suffixes of Z.
a) If |X| ≤ |Y|, then X is a suffix of Y.
b) If |X| ≥ |Y|, then Y is a suffix of X.
c) If |X| = |Y|, then X = Y.
[Figure: X and Y drawn as suffixes of Z, illustrating cases a), b), and c).]
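As a sanity check rather than a proof, the lemma can be tested exhaustively on every pair of suffixes of one string. The function names suffixes and check_lemma are hypothetical helpers introduced only for this sketch.

```python
def suffixes(z):
    """All suffixes of z, including z itself and the empty string."""
    return [z[i:] for i in range(len(z) + 1)]


def check_lemma(z):
    """Check cases a), b), c) for every pair of suffixes X, Y of z."""
    for x in suffixes(z):
        for y in suffixes(z):
            if len(x) <= len(y) and not y.endswith(x):   # case a)
                return False
            if len(x) >= len(y) and not x.endswith(y):   # case b)
                return False
            if len(x) == len(y) and x != y:              # case c)
                return False
    return True


print(check_lemma("acbecdaa"))  # True
```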
3 The Knuth-Morris-Pratt Algorithm
Key idea for the improvement:
Use one auxiliary function (the prefix function).
Achieve running time O(n + m)!
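4 Minimum Shifting
[Figure: pattern P aligned with text T at shift s, with q characters matched at text positions s+1 to s+q, and a candidate new shift s'.]
Question: What is the least shift s' > s?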
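One way to make the question concrete is a brute-force search: try each s' > s in turn and keep the first one for which the pattern still agrees with the text characters already matched. The sketch below assumes 0-based shifts and that the first q pattern characters match at shift s; the function name least_consistent_shift is illustrative. It consults the text, which the prefix function later avoids.

```python
def least_consistent_shift(text, pattern, s, q):
    """Smallest s2 > s such that pattern[:k] == text[s2:s + q],
    where k = s + q - s2 is the overlap with the matched region.
    Assumes pattern[:q] == text[s:s + q] (0-based shift s)."""
    assert text[s:s + q] == pattern[:q]
    for s2 in range(s + 1, s + q + 1):
        k = s + q - s2
        if pattern[:k] == text[s2:s + q]:
            return s2
    return s + 1  # q == 0: nothing matched, so just move one position


# Text and pattern taken from the final example slide.
t = "abababbababbaababbababaa"
p = "ababbababaa"
print(least_consistent_shift(t, p, 0, 4))  # 2
```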
5 The Prefix Function
How much to shift depends on the pattern, not the text.
Example: π[5] = 3, so after 5 matched characters shift by 5 − 3 = 2.
The prefix function π[q] measures the length of the longest prefix of P[1..m] that is also a proper suffix of P[1..q]:
π[q] = max { k : k < q and P[1..k] is a suffix of P[1..q] }
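The definition can be evaluated directly, without any cleverness, by trying every k < q. The following is a brute-force sketch meant only to illustrate the definition (it is much slower than the algorithm developed later); the function name and the 1-based list layout with an unused slot 0 are choices made for this example.

```python
def prefix_function_bruteforce(p):
    """pi[q] = max{k : k < q and P[1..k] is a suffix of P[1..q]}.
    Returns a list indexed 1..m like the slides; pi[0] is unused."""
    m = len(p)
    pi = [0] * (m + 1)
    for q in range(1, m + 1):
        # k = 0 always qualifies, so max() never sees an empty sequence.
        pi[q] = max(k for k in range(q) if p[:q].endswith(p[:k]))
    return pi


pi = prefix_function_bruteforce("ababababca")
print(pi[1:])     # [0, 0, 1, 2, 3, 4, 5, 6, 0, 1]
print(5 - pi[5])  # 2, the shift after 5 matched characters
```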
6 Example
π measures how well the pattern matches against a shift of itself.
i       1  2  3  4  5  6  7  8  9  10
P[i]    a  b  a  b  a  b  a  b  c  a
π[i]    0  0  1  2  3  4  5  6  0  1
Ex. [Figure: the pattern a b a b a b a b c a aligned against shifted copies of itself.]
7 Computing the Prefix Function
Compute-Prefix-Function(P)
    m ← length[P]
    π[1] ← 0
    k ← 0
    for q ← 2 to m                       // invariant: k = π[q − 1]
        do while k > 0 and P[k+1] ≠ P[q]
               do k ← π[k]
           if P[k+1] = P[q]
               then k ← k + 1
           π[q] ← k
    return π
Ex. q = 9 and k = 6: P[k+1] = a ≠ c = P[q]
    a b a b a b a b c a      k = π[6] = 4
    a b a b a b a b c a      k = π[4] = 2
    a b a b a b a b c a      k = π[2] = 0
    π[9] = π[π[π[6]]] = 0
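The pseudocode translates almost line for line into Python. This sketch keeps the slides' 1-based π by leaving slot 0 of the list unused, so P[k+1] becomes p[k] and P[q] becomes p[q - 1] in Python's 0-based string indexing; that layout is an assumption of the example, not something the slides prescribe.

```python
def compute_prefix_function(p):
    """Return pi as a list indexed 1..m (pi[0] is unused)."""
    m = len(p)
    pi = [0] * (m + 1)
    k = 0
    for q in range(2, m + 1):
        # Invariant at this point: k == pi[q - 1].
        while k > 0 and p[k] != p[q - 1]:
            k = pi[k]          # fall back to the next shorter border
        if p[k] == p[q - 1]:
            k += 1
        pi[q] = k
    return pi


pi = compute_prefix_function("ababababca")
print(pi[1:])                     # [0, 0, 1, 2, 3, 4, 5, 6, 0, 1]
print(pi[6], pi[pi[6]], pi[pi[pi[6]]])  # 4 2 0, the chain followed at q = 9
```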
8 Running-time Analysis
1  Compute-Prefix-Function(P)
2      m ← length[P]
3      π[1] ← 0
4      k ← 0
5      for q ← 2 to m
6          do while k > 0 and P[k+1] ≠ P[q]
7                 do k ← π[k]           // decreases k by at least 1
8             if P[k+1] = P[q]
9                 then k ← k + 1        // ≤ m − 1 increments, each by 1
10            π[q] ← k
11     return π
Since k starts at 0 and never becomes negative, #decrements ≤ #increments ≤ m − 1, thus line 7 is executed at most m − 1 times in total.
Total time Θ(m).
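The counting argument can be observed by instrumenting the loop. The sketch below is the same procedure with two counters added (the name count_line7 and the counters are additions for illustration); it reports how often the body of the while loop (line 7) and the increment (line 9) run.

```python
def count_line7(p):
    """Run the prefix-function loop and return (decrements, increments)."""
    m = len(p)
    pi = [0] * (m + 1)   # 1-based; pi[0] unused
    k = 0
    decrements = increments = 0
    for q in range(2, m + 1):
        while k > 0 and p[k] != p[q - 1]:
            k = pi[k]            # line 7: k strictly decreases
            decrements += 1
        if p[k] == p[q - 1]:
            k += 1               # line 9: at most once per iteration of q
            increments += 1
        pi[q] = k
    return decrements, increments


dec, inc = count_line7("ababababca")
print(dec, inc)                       # 3 7
print(dec <= inc <= len("ababababca") - 1)  # True
```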
9 KMP Algorithm
KMP-Matcher(T, P)                        // n = |T| and m = |P|
    π ← Compute-Prefix-Function(P)       // Θ(m) time
    q ← 0
    for i ← 1 to n
        do while q > 0 and P[q+1] ≠ T[i]
               do q ← π[q]
           if P[q+1] = T[i]
               then q ← q + 1            // ≤ n total increments
           if q = m
               then print "Pattern occurs with shift" i − m
                    q ← π[q]
    // the loop takes Θ(n) time
Total time Θ(m + n).
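A runnable version of the matcher might look like the following sketch. It returns the list of shifts instead of printing them, and it treats an empty pattern as having no occurrences; both choices are assumptions made for this example. The prefix function is repeated so the block is self-contained.

```python
def compute_prefix_function(p):
    """pi indexed 1..m (pi[0] unused), as in the pseudocode."""
    m = len(p)
    pi = [0] * (m + 1)
    k = 0
    for q in range(2, m + 1):
        while k > 0 and p[k] != p[q - 1]:
            k = pi[k]
        if p[k] == p[q - 1]:
            k += 1
        pi[q] = k
    return pi


def kmp_matcher(t, p):
    """Return every shift s (0-based) with t[s:s + len(p)] == p."""
    n, m = len(t), len(p)
    if m == 0:
        return []                 # assumption: empty pattern is not searched
    pi = compute_prefix_function(p)
    shifts = []
    q = 0                         # number of pattern characters matched
    for i in range(1, n + 1):     # i is the 1-based text position T[i]
        while q > 0 and p[q] != t[i - 1]:
            q = pi[q]
        if p[q] == t[i - 1]:
            q += 1
        if q == m:
            shifts.append(i - m)
            q = pi[q]             # keep going to find overlapping matches
    return shifts


print(kmp_matcher("abababbababbaababbababaa", "ababbababaa"))  # [13]
print(kmp_matcher("aaaa", "aa"))                               # [0, 1, 2]
```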
10 A KMP Example
i       1  2  3  4  5  6  7  8  9  10 11
P[i]    a  b  a  b  b  a  b  a  b  a  a
π[i]    0  0  1  2  0  1  2  3  4  3  1
abababbababbaababbababaa
ababbababaa                    q = 4: shift by q − π[q] = 4 − 2 = 2
  ababbababaa                  q = 9: shift by 9 − 4 = 5
       ababbababaa             q = 6: shift by 6 − 1 = 5
            ababbababaa        q = 1: shift by 1 − 0 = 1
             ababbababaa       match with shift 13
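The sequence of shifts in this example can be reproduced by recording q and π[q] each time the matcher falls back. The sketch below is the matcher with a trace list added; the function name kmp_trace and the tuple layout (shift before the jump, q, π[q], jump length) are illustrative choices.

```python
def compute_prefix_function(p):
    """pi indexed 1..m (pi[0] unused)."""
    m = len(p)
    pi = [0] * (m + 1)
    k = 0
    for q in range(2, m + 1):
        while k > 0 and p[k] != p[q - 1]:
            k = pi[k]
        if p[k] == p[q - 1]:
            k += 1
        pi[q] = k
    return pi


def kmp_trace(t, p):
    """Return (trace, matches); trace holds one tuple
    (shift, q, pi[q], q - pi[q]) per mismatch with q > 0."""
    pi = compute_prefix_function(p)
    trace, matches = [], []
    q = 0
    for i in range(1, len(t) + 1):
        while q > 0 and p[q] != t[i - 1]:
            # The pattern currently starts at 0-based shift (i - 1) - q.
            trace.append((i - 1 - q, q, pi[q], q - pi[q]))
            q = pi[q]
        if p[q] == t[i - 1]:
            q += 1
        if q == len(p):
            matches.append(i - len(p))
            q = pi[q]
    return trace, matches


trace, matches = kmp_trace("abababbababbaababbababaa", "ababbababaa")
for shift, q, piq, jump in trace:
    print(f"shift {shift}: q = {q}, shift by {q} - {piq} = {jump}")
print(matches)  # [13]
```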