String Matching of Regular Expression - PowerPoint PPT Presentation

Slides: 16
Provided by: four98
Transcript and Presenter's Notes

Title: String Matching of Regular Expression


1
String Matching of Regular Expression
2
Introduction
  • Regular Expression (RE)
  • A generalized string description built from
  • Basic strings
  • Kleene star (*)
  • Concatenation
  • Union (|)
  • Nondeterministic Finite Automaton (NFA)
  • May have more than one next transition
  • RE to NFA requires O(m) states, where m is the number of characters in the RE (operators excluded)
  • Deterministic Finite Automaton (DFA)
  • Only one next transition
  • RE to DFA may require O(2^m) states
  • Using (m+1)(2^(m+1) + |Σ|) bits

3
RE to NFA Construction
  • Thompson's construction
  • Produces up to 2m states
  • The NFA is not null-free (it has ε-transitions)
  • Using m(2^(m+1) + |Σ|) bits
  • Glushkov's construction
  • Produces exactly m+1 states
  • The NFA is null-free (no ε-transitions)
  • Using (m+1)(2^(m+1) + |Σ|) bits

4
Thompson's Construction
5
Thompson's Construction
Example
6
Glushkov Construction
  • RE: (AT|GA)((AG|AAA)*)
  • Marked RE: (A1T2|G3A4)((A5G6|A7A8A9)*)
  • Each character is numbered by its position; the marked RE is the
    input to the Glushkov construction
  • First(RE)
  • The set of positions at which reading can start.
  • Example: First((A1T2|G3A4)((A5G6|A7A8A9)*)) = {1, 3}.
  • Last(RE)
  • The set of positions at which a string read can
    be recognized.
  • Example: Last((A1T2|G3A4)((A5G6|A7A8A9)*)) = {2, 4, 6, 9}.
  • Follow(RE, x)
  • All the positions in RE accessible from position x by reading one
    character.
  • Example: Follow((A1T2|G3A4)((A5G6|A7A8A9)*), 6) = {7, 5}.
  • Empty(RE) is {ε} if ε belongs to L(RE), and Ø
    otherwise.
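
The four sets above can be computed by one recursive pass over the syntax tree of the marked RE. The following is a minimal sketch, not the presenter's code: the tuple-based tree and the helper names (sym, cat, alt, star, nullable, follow_table) are assumptions made for illustration, and nullable(n) returns True exactly when Empty(n) = {ε}.

```python
# Hypothetical syntax tree: ("sym", position, char), ("cat", a, b),
# ("alt", a, b), ("star", a). None of these names come from the slides.

def sym(pos, ch):
    return ("sym", pos, ch)

def cat(head, *rest):
    node = head
    for part in rest:
        node = ("cat", node, part)
    return node

def alt(a, b):
    return ("alt", a, b)

def star(a):
    return ("star", a)

def nullable(n):
    """True if the empty string belongs to L(n), i.e. Empty(n) = {ε}."""
    kind = n[0]
    if kind == "sym":
        return False
    if kind == "star":
        return True
    if kind == "cat":
        return nullable(n[1]) and nullable(n[2])
    return nullable(n[1]) or nullable(n[2])  # alt

def first(n):
    kind = n[0]
    if kind == "sym":
        return {n[1]}
    if kind == "star":
        return first(n[1])
    if kind == "alt":
        return first(n[1]) | first(n[2])
    # cat: the right part can start a match only if the left part may be empty
    return (first(n[1]) | first(n[2])) if nullable(n[1]) else first(n[1])

def last(n):
    kind = n[0]
    if kind == "sym":
        return {n[1]}
    if kind == "star":
        return last(n[1])
    if kind == "alt":
        return last(n[1]) | last(n[2])
    # cat: the left part can end a match only if the right part may be empty
    return (last(n[1]) | last(n[2])) if nullable(n[2]) else last(n[2])

def follow_table(n, table=None):
    """Map every position x to Follow(RE, x)."""
    table = {} if table is None else table
    kind = n[0]
    if kind == "sym":
        table.setdefault(n[1], set())
    elif kind == "star":
        follow_table(n[1], table)
        for p in last(n[1]):          # the loop goes back to the start
            table[p] |= first(n[1])
    else:
        follow_table(n[1], table)
        follow_table(n[2], table)
        if kind == "cat":             # end of left part leads into right part
            for p in last(n[1]):
                table[p] |= first(n[2])
    return table

# Marked RE from the slide: (A1T2|G3A4)((A5G6|A7A8A9)*)
RE = cat(
    alt(cat(sym(1, "A"), sym(2, "T")), cat(sym(3, "G"), sym(4, "A"))),
    star(alt(cat(sym(5, "A"), sym(6, "G")),
             cat(sym(7, "A"), sym(8, "A"), sym(9, "A")))),
)

print(first(RE), last(RE), follow_table(RE)[6], nullable(RE))
```

On the slide's example this reproduces First = {1, 3}, Last = {2, 4, 6, 9} and Follow(RE, 6) = {5, 7}, and reports that ε is not in L(RE).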

7
Glushkov Construction
  • Create the initial set of m+1 states
  • Mark the final states using Last(RE)
  • Create the transition links using Follow(RE, x)

RE: (A1T2|G3A4)((A5G6|A7A8A9)*)
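
The three steps can be sketched as follows for the example RE. This is an illustrative sketch under stated assumptions: the First and Last sets are taken from the slides, the full Follow table is worked out by hand from the marked RE, and the function names (build_glushkov, accepts) are hypothetical. State 0 is the initial state and state i corresponds to position i, which gives m+1 = 10 states.

```python
# Position -> character for the marked RE (A1T2|G3A4)((A5G6|A7A8A9)*)
LETTER = {1: "A", 2: "T", 3: "G", 4: "A", 5: "A",
          6: "G", 7: "A", 8: "A", 9: "A"}
FIRST = {1, 3}
LAST = {2, 4, 6, 9}
# Follow sets derived by hand from the marked RE (an assumption of this sketch)
FOLLOW = {1: {2}, 2: {5, 7}, 3: {4}, 4: {5, 7}, 5: {6},
          6: {5, 7}, 7: {8}, 8: {9}, 9: {5, 7}}
EMPTY = False  # ε is not in L(RE)

def build_glushkov(letter, first, last, follow, empty):
    """Return (delta, finals); delta maps (state, char) to a set of states."""
    delta = {}

    def add(src, dst):
        # Every arrow entering position dst is labelled with dst's character.
        delta.setdefault((src, letter[dst]), set()).add(dst)

    for y in first:                  # initial state 0 reaches First(RE)
        add(0, y)
    for x, targets in follow.items():
        for y in targets:            # position x reaches Follow(RE, x)
            add(x, y)
    finals = set(last) | ({0} if empty else set())
    return delta, finals

def accepts(delta, finals, text):
    """Simulate the null-free NFA on the whole string."""
    active = {0}
    for ch in text:
        active = set().union(*(delta.get((s, ch), set()) for s in active))
    return bool(active & finals)

DELTA, FINALS = build_glushkov(LETTER, FIRST, LAST, FOLLOW, EMPTY)
print(accepts(DELTA, FINALS, "ATAG"), accepts(DELTA, FINALS, "ATA"))
```

Because every transition into a position carries that position's character, no ε-transitions are needed, which is the null-free property noted earlier.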
8
Bit Parallel Automata
  • Example: Shift-And
  • Automaton
  • Update function

[Figure: Shift-And automaton with its state mask and occurrence table]
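
The update formula itself is not preserved in this transcript. As a reminder, the standard Shift-And step is D = ((D << 1) | 1) & B[c], where D is the state mask and B is the occurrence table. A minimal sketch for a plain string pattern follows; the function name shift_and and the choice to return match end positions are assumptions for illustration.

```python
def shift_and(pattern, text):
    """Return the end indices of all occurrences of pattern in text."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must not be empty")

    # Occurrence table: bit i of B[c] is set when pattern[i] == c.
    B = {}
    for i, ch in enumerate(pattern):
        B[ch] = B.get(ch, 0) | (1 << i)

    accept = 1 << (m - 1)   # bit of the last pattern position
    D = 0                   # state mask: bit i set when pattern[:i+1] just matched
    ends = []
    for j, ch in enumerate(text):
        # Shift every active state forward, restart at position 0,
        # then keep only the states whose character equals ch.
        D = ((D << 1) | 1) & B.get(ch, 0)
        if D & accept:
            ends.append(j)
    return ends

print(shift_and("ATGA", "GATGATGA"))
```

Each text character costs one shift, one OR and one AND, as long as the m-bit mask fits in a machine word; Python integers are unbounded, so the sketch does not enforce that limit.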
9
Thompson BPA
Notation
  • D: state mask
  • E: null-closure of D
  • B: precomputed table
  • S: string length
  • Tj: current character
Null-closure: the states reachable from D on null (ε) input
B table: bit mask of the states reachable by each letter
[Figure: B table, |Σ| (alphabet) by m+1 (pattern)]
10
Glushkov BPA
Notation
  • D: state mask
  • T[D]: Follow of D
  • B: built by the Glushkov construction
  • Tj: current character
B table: bit mask of the states reachable by each letter
T table: which states can be reached from a set of active states
[Figure: B table, |Σ| (alphabet) by m+1 (pattern); T table, 2^(m+1) active-state sets D by m+1 (states)]
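
With these two tables, one text character can be processed as D = T[D] & B[Tj]. The sketch below applies that step to the example RE (A1T2|G3A4)((A5G6|A7A8A9)*). It rests on assumptions that are not in the transcript: bit i of a mask stands for state i, the Follow sets are derived by hand from the marked RE, state 0 gets a self-loop so that matches may start anywhere in the text, and the names build_tables and search are hypothetical. T is filled in the most direct way here, not necessarily the way the presentation does it.

```python
LETTER = {1: "A", 2: "T", 3: "G", 4: "A", 5: "A",
          6: "G", 7: "A", 8: "A", 9: "A"}
FIRST, LAST = {1, 3}, {2, 4, 6, 9}
FOLLOW = {1: {2}, 2: {5, 7}, 3: {4}, 4: {5, 7}, 5: {6},
          6: {5, 7}, 7: {8}, 8: {9}, 9: {5, 7}}

def mask(states):
    """Bit mask with bit s set for every state s."""
    out = 0
    for s in states:
        out |= 1 << s
    return out

def build_tables(letter, first, last, follow):
    m = len(letter)
    # B[c]: states entered by reading c. Bit 0 stays set for every
    # character (assumed self-loop on the initial state for searching).
    B = {}
    for pos, ch in letter.items():
        B[ch] = B.get(ch, 1) | (1 << pos)
    # step[i]: states reachable from state i, ignoring the character read.
    step = [mask(first) | 1] + [mask(follow[x]) for x in range(1, m + 1)]
    # T[D]: union of step[i] over all states i active in D.
    T = [0] * (1 << (m + 1))
    for D in range(len(T)):
        for i in range(m + 1):
            if (D >> i) & 1:
                T[D] |= step[i]
    # Final-state mask; a nullable RE would also need state 0 here.
    return B, T, mask(last)

def search(text, B, T, F):
    """Return the end indices of all substrings of text that match the RE."""
    D = 1                              # only the initial state is active
    ends = []
    for j, ch in enumerate(text):
        D = T[D] & B.get(ch, 1)        # characters outside the RE keep state 0 only
        if D & F:
            ends.append(j)
    return ends

B, T, F = build_tables(LETTER, FIRST, LAST, FOLLOW)
print(len(T), search("CCATAGC", B, T, F))
```

The T table has 2^(m+1) = 1024 entries of m+1 bits each and B has one (m+1)-bit mask per letter, which is where the (m+1)(2^(m+1) + |Σ|) bit count comes from.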
11
Glushkov Search Algorithm
Build B Table
12
Glushkov Search Algorithm
Build T Table
Initialize to zero
[Figure: T table, 2^(m+1) active-state sets D by m+1 (states)]
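
The slide's own procedure is not preserved in the transcript. One common way to fill T after zero-initialization is a doubling scheme: every entry whose highest active state is i is the union of the follow mask of i and an entry that was already computed. The sketch below assumes this scheme; the names build_t and FOLLOW_SETS are hypothetical, and the follow sets are those of the example RE (A1T2|G3A4)((A5G6|A7A8A9)*), with index 0 holding First(RE).

```python
# Index i holds the states reachable from state i (index 0 holds First(RE)).
FOLLOW_SETS = [{1, 3}, {2}, {5, 7}, {4}, {5, 7},
               {6}, {5, 7}, {8}, {9}, {5, 7}]
STEP = [sum(1 << s for s in targets) for targets in FOLLOW_SETS]

def build_t(step):
    """T[D] = union of step[i] over every state i whose bit is set in D."""
    n = len(step)
    T = [0] * (1 << n)                 # "initialize to zero"
    for i in range(n):
        high = 1 << i
        for j in range(high):
            # high + j is the set j plus state i; T[j] is already known
            # because j only uses states below i.
            T[high + j] = step[i] | T[j]
    return T

T = build_t(STEP)
print(len(T), bin(T[0b101]))           # entry for the active set {0, 2}
```

Each of the 2^(m+1) entries is produced by a single OR, so the table is built in time proportional to its size.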
13
Glushkov Search Algorithm
Compute First, Last, Follow and Empty
14
Performance Comparison
  • Methods compared: forward algorithm (DFA), Glushkov with BuildT,
    Thompson's construction, Glushkov with BuildTree
  • Measured for each test pattern: preprocessing time and searching time
15
Reference
  • G. Navarro and M. Raffinot. Compact DFA
    representation for fast regular expression search.
    In Proceedings of the 5th Workshop on Algorithm
    Engineering, number 2141 in Lecture Notes in
    Computer Science, pages 1-12, 2001.