Title: Two Way Algorithm
Two-way string-matching, Journal of the ACM 38(3):651-675, 1991. Crochemore M., Perrin D.
- Advisor Prof. R. C. T. Lee
- Speaker C. C. Yen
- In 2003, Rytter proposed a constant-space, linear-time string matching algorithm.
- To achieve constant space, this algorithm avoids the preprocessing table of the KMP algorithm.
- Before introducing this algorithm, we define some properties of strings.
The Property of Maximal Suffix
- Consider a string P. Let P = uv, where v = MaxSuf(P) is the lexicographically maximal suffix of P. The maximal suffix has the following property: if u is non-empty, no non-empty suffix of u is equal to a prefix of v.
- Example
- Consider the pattern ababadada.
- Let P = uv = ababa.dada
- No suffix of u is equal to a prefix of v.
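The property can be checked directly on the example. The following is a minimal sketch, not part of the original slides: `max_suffix` and `suffix_prefix_overlaps` are hypothetical helper names, and the maximal suffix is found by a naive quadratic scan rather than by the constant-space method the algorithm itself relies on.

```python
def max_suffix(p):
    """Lexicographically largest suffix of a non-empty string p (naive scan)."""
    return max(p[k:] for k in range(len(p)))


def suffix_prefix_overlaps(u, v):
    """Non-empty strings that are both a suffix of u and a prefix of v."""
    limit = min(len(u), len(v))
    return [v[:k] for k in range(1, limit + 1) if u.endswith(v[:k])]


p = "ababadada"
v = max_suffix(p)
u = p[: len(p) - len(v)]
print(u, v)                          # ababa dada
print(suffix_prefix_overlaps(u, v))  # []
```

An empty list means that no suffix of u equals a prefix of v, as the property states.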
Short Maximal Suffix
- If the maximal suffix of a string x satisfies |MaxSuf(x)| ≤ |x|/2, we say that it is a short maximal suffix of x.
- Example
- Consider the string x = abcdda. dda is the maximal suffix of x and |dda| = 3 ≤ |x|/2 = 3.
- Hence we say that dda is a short maximal suffix of x.
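A small check of the definition, as a sketch only. It assumes the condition is that the maximal suffix is at most half as long as the whole string, which is what the example (|dda| = 3, |abcdda| = 6) suggests. The function name `is_short_max_suffix` is hypothetical.

```python
def is_short_max_suffix(x):
    """True if the maximal suffix of x is at most half as long as x.

    Assumption: 'short' means len(MaxSuf(x)) <= len(x) / 2.
    """
    v = max(x[k:] for k in range(len(x)))  # naive maximal suffix
    return 2 * len(v) <= len(x)


print(is_short_max_suffix("abcdda"))     # True  (dda, 3 <= 6/2)
print(is_short_max_suffix("ababadada"))  # True  (dada, 4 <= 9/2)
print(is_short_max_suffix("ababaa"))     # False (babaa, 5 > 6/2)
```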
Short Prefixes Lemma
- Let P = uv be the decomposition of P, where v is the maximal suffix of P and v is also a short maximal suffix. Suppose that we start to match v with T at position i, that the first j characters of v are matched, and that a mismatch occurs at the (j + 1)-th position of v. Then we can shift P safely by j + 1 positions without missing any occurrence of P in T.
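[Figure: v aligned with T at position i, with a mismatch at the (j + 1)-th character of v; below it, P = uv shifted by j + 1 positions so that v starts at position i + j + 1.]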
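The shift in the lemma depends only on how many characters of v matched. A minimal sketch, using 0-based indices and a hypothetical function name `shift_after_scanning_v`; the short text below is chosen only for illustration.

```python
def shift_after_scanning_v(v, t, i):
    """Compare v with t starting at index i, left to right.

    Returns None if all of v matches, otherwise j + 1, where j is the
    number of characters of v that matched before the mismatch.
    Running past the end of t is treated as a mismatch.
    """
    for j, ch in enumerate(v):
        if i + j >= len(t) or t[i + j] != ch:
            return j + 1
    return None


# v = dada against "daddx": d, a, d match, the fourth character does not.
print(shift_after_scanning_v("dada", "daddx", 0))  # 4
# First character already differs: shift by 1.
print(shift_after_scanning_v("dada", "adada", 0))  # 1
# Full match of v: the lemma does not apply.
print(shift_after_scanning_v("dada", "adada", 1))  # None
```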
- Why do we have to use a short maximal suffix?
- Suppose v is very long. Then moving the pattern by j + 1 positions can be incorrect, because an occurrence of P may be skipped.
[Figure: a long v aligned with T at position i, with j characters matched; below it, P = uv shifted by j + 1 positions.]
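A concrete case where the shift goes wrong when v is long. This example is constructed for illustration and does not come from the slides: the pattern ababaa has maximal suffix babaa, which is longer than half of the pattern. The helper names are hypothetical, indices are 0-based, and the text is assumed to be at least as long as the pattern.

```python
def max_suffix(p):
    """Lexicographically largest suffix of a non-empty string p (naive scan)."""
    return max(p[k:] for k in range(len(p)))


def naive_occurrences(p, t):
    """All start indices of p in t, found by brute force."""
    return [s for s in range(len(t) - len(p) + 1) if t.startswith(p, s)]


def first_shift_by_rule(p, t):
    """With P aligned at index 0, scan v left to right.

    Returns j + 1 on a mismatch after j matched characters, None if v matches.
    """
    v = max_suffix(p)
    cut = len(p) - len(v)
    j = 0
    while j < len(v) and t[cut + j] == v[j]:
        j += 1
    return j + 1 if j < len(v) else None


p, t = "ababaa", "abababaa"
print(max_suffix(p))              # babaa
print(first_shift_by_rule(p, t))  # 5
print(naive_occurrences(p, t))    # [2]
```

The rule moves the pattern from index 0 to index 5, but the only occurrence starts at index 2, so it is skipped. With a short maximal suffix this cannot happen, because every skipped shift would force a suffix of u to equal a prefix of v.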
- In the following, we introduce the basic rules of the Two Way matching algorithm for patterns with a short maximal suffix.
- The basic rules are given in the next slides.
Basic rules of the Two-Way algorithm with a short maximal suffix
- 1. Let P = uv be the decomposition of P, where v is the maximal suffix of P and v is also a short maximal suffix.
- 2. We then look for v in T from left to right. Assume the comparison starts at position i. When a mismatch occurs at v[j + 1], we shift P by j + 1 characters and start the next comparison at v[1] against T[i + j + 1].
- 3. When v has been found in T, we scan u from right to left. If a mismatch occurs when scanning u, we shift P by Period(P).
- 4. If we find both v and u in T, we report an occurrence of P in T. We then shift P by Period(P).
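The four rules can be sketched as a short search routine. This is an illustrative sketch under stated assumptions, not the constant-space implementation: the maximal suffix and the period are computed naively, "short" is taken to mean len(v) <= len(P) / 2, indices are 0-based, and the names `search_short_maxsuf`, `max_suffix` and `period` are hypothetical.

```python
def max_suffix(p):
    """Lexicographically largest suffix of a non-empty string p (naive scan)."""
    return max(p[k:] for k in range(len(p)))


def period(p):
    """Smallest q >= 1 such that p[q:] is a prefix of p."""
    m = len(p)
    return next(q for q in range(1, m + 1) if p[q:] == p[: m - q])


def search_short_maxsuf(p, t):
    """Return (occurrences, shifts) following rules 1-4 for a short maximal suffix."""
    v = max_suffix(p)                      # rule 1: P = uv, v = MaxSuf(P)
    cut = len(p) - len(v)                  # length of u
    if 2 * len(v) > len(p):
        raise ValueError("maximal suffix is not short; these rules do not apply")
    per = period(p)
    occurrences, shifts = [], []
    pos = 0                                # current start of P in t
    while pos + len(p) <= len(t):
        j = 0                              # rule 2: scan v left to right
        while j < len(v) and t[pos + cut + j] == v[j]:
            j += 1
        if j < len(v):
            step = j + 1                   # mismatch at v[j + 1]
        else:
            k = cut - 1                    # rule 3: scan u right to left
            while k >= 0 and t[pos + k] == p[k]:
                k -= 1
            if k < 0:
                occurrences.append(pos)    # rule 4: both parts found
            step = per
        shifts.append(step)
        pos += step
    return occurrences, shifts


print(search_short_maxsuf("abcdda", "xxabcddaabcddayy")[0])  # [2, 8]
```

Patterns whose maximal suffix is not short (including single-character patterns) are rejected here rather than handled, since the rules above only cover the short case.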
- Full Example
- T = adadadaddadadababadada
- P = u.v = ababa.dada, Period(P) = 8

P is first aligned with position 1 of T.
pos  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
T    a  d  a  d  a  d  a  d  d  a  d  a  d  a  b  a  b  a  d  a  d  a
P    a  b  a  b  a  d  a  d  a

v = dada is compared with T[6..9] from left to right. The first three characters match and the fourth does not (T[9] = d), so j = 3.
Shift 4 steps
pos  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
T    a  d  a  d  a  d  a  d  d  a  d  a  d  a  b  a  b  a  d  a  d  a
P                a  b  a  b  a  d  a  d  a

v is compared with T[10..13]. The first character already mismatches (T[10] = a), so j = 0.
Shift 1 step
pos  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
T    a  d  a  d  a  d  a  d  d  a  d  a  d  a  b  a  b  a  d  a  d  a
P                   a  b  a  b  a  d  a  d  a

v matches T[11..14]. u = ababa is then scanned from right to left against T[6..10]; T[10] = a matches, but T[9] = d does not match b.
Shift Period(P) = 8 steps
pos  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
T    a  d  a  d  a  d  a  d  d  a  d  a  d  a  b  a  b  a  d  a  d  a
P                                           a  b  a  b  a  d  a  d  a

The procedure starts over with the left-to-right scan of v. v matches T[19..22] and u matches T[14..18].
Match!! P occurs in T at position 14.
Shift Period(P) = 8 steps
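The value Period(P) = 8 used in the example can be verified from the definition of a period. A minimal sketch; the function name `period` is hypothetical and the check is a direct quadratic scan.

```python
def period(p):
    """Smallest q >= 1 such that p[k] == p[k + q] for every valid index k."""
    m = len(p)
    for q in range(1, m + 1):
        if all(p[k] == p[k + q] for k in range(m - q)):
            return q
    return m  # only reached for the empty string


print(period("ababadada"))  # 8
print(period("abab"))       # 2
```

For ababadada every q from 1 to 7 fails on some pair of characters, and q = 8 only has to satisfy p[0] == p[8], which holds because both are a.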
- References
- Breslauer, D., 1996. Saving comparisons in the Crochemore-Perrin string matching algorithm. Theoretical Computer Science 158(1-2):177-192.
- Crochemore, M., 1997. Off-line serial exact string searching. In Pattern Matching Algorithms, ed. A. Apostolico and Z. Galil, Chapter 1, pp. 1-53, Oxford University Press.
- Crochemore, M., Perrin, D., 1991. Two-way string-matching. Journal of the ACM 38(3):651-675.
- Crochemore, M., Rytter, W., 1994. Text Algorithms. Oxford University Press.
- Thanks for your attention