
Title: Exact String Search


1
Exact String Search
  • Lecture 6, September 20, 2005
  • Algorithms in Biosequence Analysis
  • Nathan Edwards - Fall, 2005

2
K-M-P preprocessing
  • Definition
  • For pattern P, position i, let sp_i(P) be the
    length of the longest proper suffix of P[1..i]
    that matches a prefix of P.
  • Definition
  • For pattern P, position i, let sp'_i(P) be the
    length of the longest proper suffix of P[1..i]
    that matches a prefix of P, such that P(i+1) ≠ P(sp'_i+1).
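
The two definitions can be checked directly on small patterns. The following is a minimal brute-force sketch, not the lecture's preprocessing method: it simply tries every proper suffix length. The function name sp_values is hypothetical, the returned lists are indexed 1..n to follow the slides' notation, and treating the extra condition as vacuously true at i = n (where P(i+1) does not exist) is an assumption.

```python
def sp_values(P):
    """Return (sp, sp_strong), each indexed 1..n (index 0 is unused).

    sp[i]        : longest proper suffix of P[1..i] that matches a prefix of P
    sp_strong[i] : same, with the extra condition P(i+1) != P(sp'_i + 1)
    Brute force, quadratic or worse; intended only to illustrate the definitions.
    """
    n = len(P)
    sp = [0] * (n + 1)
    sp_strong = [0] * (n + 1)
    for i in range(1, n + 1):
        prefix = P[:i]  # P[1..i] in the slides' 1-based notation
        # Try proper suffix lengths from longest to shortest.
        for length in range(i - 1, 0, -1):
            if prefix[i - length:] != P[:length]:
                continue
            if sp[i] == 0:
                sp[i] = length  # first hit is the longest one
            # 0-based: P(i+1) is P[i], P(length+1) is P[length].
            # Assumption: at i == n the condition holds vacuously.
            if i == n or P[i] != P[length]:
                sp_strong[i] = length
                break
    return sp, sp_strong


if __name__ == "__main__":
    sp, sp_strong = sp_values("abxyabxz")
    print("sp :", sp[1:])
    print("sp':", sp_strong[1:])
```

For the pattern abxyabxz, position 7 is the interesting one: the suffix abx of P[1..7] matches the prefix abx, and because P(8) = z differs from P(4) = y, it also satisfies the stronger condition.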

3
K-M-P shift rule
  • For any alignment of P with T
  • If the first mismatch (comparing from left to right) is at
    position (i+1) of P and position k of T, then
    shift P to the right so that P[1..sp_i] aligns
    with T[k-sp_i..k-1]
  • i.e. shift P by (i+1) - (sp_i+1) = i - sp_i places to
    the right, so that position sp_i+1 of P aligns with position k of T
  • If an occurrence of P is found, shift P by n -
    sp_n places, where n is the length of P
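
A search loop built on this shift rule might look like the sketch below. It is one common way to write the method, not necessarily the pseudo-code used in the lecture. The names failure and kmp_search are hypothetical, the sp table is computed with the classic linear-time recurrence rather than from Z values, and reported positions are 1-based to match the slides.

```python
def failure(P):
    """sp[i] for i = 1..n (sp[0] unused), via the classic linear-time recurrence."""
    n = len(P)
    sp = [0] * (n + 1)
    k = 0  # length of the current matched prefix
    for i in range(2, n + 1):
        while k > 0 and P[i - 1] != P[k]:
            k = sp[k]
        if P[i - 1] == P[k]:
            k += 1
        sp[i] = k
    return sp


def kmp_search(P, T):
    """Return the 1-based start positions of all occurrences of P in T."""
    n = len(P)
    if n == 0:
        return []
    sp = failure(P)
    hits = []
    matched = 0  # characters of P matched so far (the slides' i)
    for k, c in enumerate(T):  # the position in T never moves backwards
        while matched > 0 and c != P[matched]:
            # Mismatch at position matched+1 of P:
            # shifting P right by matched - sp[matched] keeps sp[matched] matches.
            matched = sp[matched]
        if c == P[matched]:
            matched += 1
        if matched == n:
            hits.append(k - n + 2)  # convert 0-based end to 1-based start
            matched = sp[n]         # occurrence found: shift by n - sp_n
    return hits


if __name__ == "__main__":
    print(kmp_search("abxyabxz", "xabxyabxyabxz"))
```

The shift is never performed explicitly here. Replacing matched by sp[matched] is equivalent to sliding P to the right by matched - sp[matched] places while keeping the text pointer where it is.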

4
Speeding up the naïve algorithm
  •   0        1
  •   1234567890123
  • T xabxyabxyabxz
  • P abxyabxz

5
Speeding up the naïve algorithm
  •   0        1
  •   1234567890123
  • T xabxyabxyabxz
  • P abxyabxz

6
Speeding up the naïve algorithm
  •   0        1
  •   1234567890123
  • T xabxyabxyabxz
  • P abxyabxz
  • abxyabxz

7
Speeding up the naïve algorithm
  •   0        1
  •   1234567890123
  • T xabxyabxyabxz
  • P abxyabxy

8
Speeding up the naïve algorithm
  •   0        1
  •   1234567890123
  • T xabxyabxyabxz
  • P abxyabxy
  • abxyabxz

9
Speeding up the naïve algorithm
  •   0        1
  •   1234567890123
  • T xabxyabxyabxz
  • P abxyabxz

10
Speeding up the naïve algorithm
  •   0        1
  •   1234567890123
  • T xabxyabxyabxz
  • P abxyabxz
  • abxyabxz
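
The alignment offsets of these slides did not survive in the transcript, so the sketch below replays the example from scratch rather than reproducing any particular slide. It counts character comparisons for T = xabxyabxyabxz and P = abxyabxz, once shifting by one place after every alignment (the naive method) and once shifting by i - sp_i and skipping the characters already known to match. The names sp_table and trace_alignments are hypothetical.

```python
def sp_table(P):
    """sp[i] for i = 1..n (sp[0] unused): longest proper suffix of P[1..i] matching a prefix of P."""
    n = len(P)
    sp = [0] * (n + 1)
    k = 0
    for i in range(2, n + 1):
        while k > 0 and P[i - 1] != P[k]:
            k = sp[k]
        if P[i - 1] == P[k]:
            k += 1
        sp[i] = k
    return sp


def trace_alignments(P, T, use_shift_rule):
    """Return (1-based alignment starts tried, total character comparisons)."""
    n, m = len(P), len(T)
    sp = sp_table(P)
    starts, comparisons = [], 0
    s = 0  # 0-based start of the current alignment of P in T
    i = 0  # characters of P already known to match at this alignment
    while s + n <= m:
        starts.append(s + 1)
        while i < n:
            comparisons += 1
            if T[s + i] != P[i]:
                break
            i += 1
        # Now i characters matched (mismatch at position i+1 of P, or i == n).
        if use_shift_rule and i > 0:
            s += i - sp[i]  # shift so that P[1..sp_i] stays aligned
            i = sp[i]       # those sp_i characters are not compared again
        else:
            s += 1          # naive: shift by one and start over
            i = 0
    return starts, comparisons


if __name__ == "__main__":
    T, P = "xabxyabxyabxz", "abxyabxz"
    print("naive     :", trace_alignments(P, T, use_shift_rule=False))
    print("shift rule:", trace_alignments(P, T, use_shift_rule=True))
```

On this example the naive scan tries alignments 1 through 6 and makes 20 comparisons, while the shift-rule scan tries only alignments 1, 2 and 6 and makes 14.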

11
K-M-P outline
  • Show that no occurrence of P is missed
  • Show linear search time, given the sp_i values
  • Show how to compute the sp_i values from the Z_i
    values
  • Knuth-Morris-Pratt pseudo-code
  • Using failure functions
  • Extensions / strong vs weak shift rule
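
One well-known way to obtain the sp values from Z values is sketched below, under the assumption that Z_j is the length of the longest substring of P starting at position j that matches a prefix of P. The transcript does not include the lecture's own derivation, so this is an illustration rather than a reproduction. The function names are hypothetical; the Z array is 0-based, while the returned sp tables are indexed 1..n.

```python
def z_values(P):
    """z[j] = length of the longest substring starting at 0-based j that matches a prefix of P."""
    n = len(P)
    z = [0] * n
    l = r = 0  # [l, r) is the rightmost prefix-matching window found so far
    for j in range(1, n):
        if j < r:
            z[j] = min(r - j, z[j - l])
        while j + z[j] < n and P[z[j]] == P[j + z[j]]:
            z[j] += 1
        if j + z[j] > r:
            l, r = j, j + z[j]
    return z


def sp_from_z(P):
    """Return (sp, sp_strong), each indexed 1..n, derived from the Z values."""
    n = len(P)
    z = z_values(P)
    sp_strong = [0] * (n + 1)
    # A match of length Z starting at 1-based position j+1 ends at 1-based
    # position j + Z. Scanning j downwards lets the smallest j (the longest
    # match ending there) write last.
    for j in range(n - 1, 0, -1):
        if z[j] > 0:
            sp_strong[j + z[j]] = z[j]
    # Weak values: sp_n = sp'_n, then sp_i = max(sp_{i+1} - 1, sp'_i).
    sp = [0] * (n + 1)
    if n > 0:
        sp[n] = sp_strong[n]
    for i in range(n - 1, 1, -1):
        sp[i] = max(sp[i + 1] - 1, sp_strong[i])
    return sp, sp_strong


if __name__ == "__main__":
    sp, sp_strong = sp_from_z("abxyabxz")
    print("sp :", sp[1:])
    print("sp':", sp_strong[1:])
```

The strong values fall out first because a Z-match ends exactly where the next character differs from the prefix. The weak values are then recovered in a single right-to-left pass.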

12
K-M-P outline
  • Knuth-Morris-Pratt pseudo-code
  • Using failure functions
  • Extensions / strong vs weak shift rule
  • K-M-P as a DFA
  • Original K-M-P pattern preprocessing
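
As a rough illustration of the DFA view, the sketch below builds a transition table in which state q means "the last q text characters match P[1..q]", then scans T with exactly one table lookup per character. This is a textbook-style construction and may differ from the one presented in the lecture. The names build_dfa and dfa_search are hypothetical, and taking the alphabet to be the set of characters seen in P and T is an assumption made to keep the example self-contained.

```python
def build_dfa(P, alphabet):
    """Return dfa, where dfa[q][c] is the next state from state q on character c.

    States are 0..n; state n is accepting. P must be non-empty and
    alphabet must contain every character of P.
    """
    n = len(P)
    dfa = [dict.fromkeys(alphabet, 0) for _ in range(n + 1)]
    dfa[0][P[0]] = 1
    x = 0  # restart state: where the automaton would be after reading P[2..q]
    for q in range(1, n + 1):
        for c in alphabet:
            dfa[q][c] = dfa[x][c]  # on a mismatch, behave like the restart state
        if q < n:
            dfa[q][P[q]] = q + 1   # on a match, advance
            x = dfa[x][P[q]]
    return dfa


def dfa_search(P, T):
    """Return the 1-based start positions of all occurrences of P in T."""
    if not P:
        return []
    alphabet = set(P) | set(T)  # assumption: alphabet inferred from the inputs
    dfa = build_dfa(P, alphabet)
    n = len(P)
    state, hits = 0, []
    for k, c in enumerate(T):
        state = dfa[state][c]
        if state == n:
            hits.append(k - n + 2)  # convert 0-based end to 1-based start
    return hits


if __name__ == "__main__":
    print(dfa_search("abxyabxz", "xabxyabxyabxz"))
```

The table costs space proportional to the pattern length times the alphabet size, in exchange for a scan with no inner loop on mismatches.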