Two issues in lexical analysis

Slides: 12
Provided by: xyu
Learn more at: http://www.cs.fsu.edu

1
  • Two issues in lexical analysis
  • Specifying tokens (regular expression)
  • Identifying tokens specified by regular
    expression.

2
  • How to recognize tokens specified by regular
    expressions?
  • A recognizer for a language is a program that
    takes a string x as input and answers yes if x
    is a sentence of the language and no otherwise.
  • In the context of lexical analysis, given a
    string and a regular expression, a recognizer of
    the language specified by the regular expression
    answers yes if the string is in the language and
    no otherwise.
  • A regular expression can be compiled into a
    recognizer (automatically) by constructing a
    finite automaton, which can be deterministic or
    non-deterministic.
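
To make the yes/no contract of a recognizer concrete, here is a minimal sketch in Python. The helper name make_recognizer and the sample pattern are assumptions chosen for illustration. Python's re module is a backtracking matcher, so it does not build the finite automaton these slides describe; it only shows the interface.

```python
import re

def make_recognizer(pattern):
    """Hypothetical helper: turn a regular expression into a yes/no recognizer."""
    compiled = re.compile(pattern)

    def recognize(x):
        # fullmatch succeeds only if the whole string x is in the language.
        return compiled.fullmatch(x) is not None

    return recognize

# Assumed sample language for illustration.
is_token = make_recognizer(r"(a|b)*abb")
print(is_token("ababb"))  # True
print(is_token("abab"))   # False
```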

3
  • Non-deterministic finite automata (NFA)
  • A non-deterministic finite automaton (NFA) is a
    mathematical model that consists of (a 5-tuple):
  • a set of states Q
  • a set of input symbols Σ (the input alphabet)
  • a transition function that maps state-symbol
    pairs to sets of states
  • a state q0 that is distinguished as the start
    (initial) state
  • a set of states F distinguished as accepting
    (final) states
  • An NFA accepts an input string x if and only if
    there is some path in the transition graph from
    the start state to some accepting state whose
    edge labels spell out x.
  • Show an NFA example (page 116, Figure 3.21).

4
  • An NFA is non-deterministic in that (1) the same
    character can label two or more transitions out
    of one state and (2) the empty string can label
    transitions.
  • For example, here is an NFA that recognizes the
    language (a|b)*abb.
  • An NFA can easily be implemented using a
    transition table.
  • State    a         b
    0        {0, 1}    {0}
    1        -         {2}
    2        -         {3}
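
One way to hold this transition table in code is a dictionary keyed by (state, symbol). The sketch below is an illustration only: the names NFA_TABLE and move are chosen here, and treating state 3 as the accepting state is an assumption, since the table has no row for it.

```python
# Transition table of the NFA above: (state, symbol) -> set of next states.
# A missing key plays the role of "-" in the table.
NFA_TABLE = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START = 0
ACCEPTING = {3}  # assumption: state 3 is the accepting state

def move(states, symbol):
    """Return every state reachable from `states` on one `symbol` transition."""
    result = set()
    for s in states:
        result |= NFA_TABLE.get((s, symbol), set())
    return result

print(move({0}, "a"))     # {0, 1}
print(move({0, 1}, "b"))  # {0, 2}
```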

  • [Figure: transition graph of the NFA with states 0, 1, 2, 3 and edges labeled a and b]

5
  • The algorithm that recognizes the language
    accepted by an NFA.
  • Input: an NFA (transition table) and a string x
    (terminated by eof).
  • Output: yes if accepted, no otherwise.
  • S := e-closure(s0)
  • a := nextchar
  • while a != eof do begin
  •     S := e-closure(move(S, a))
  •     a := nextchar
  • end
  • if (intersect(S, F) != empty) then return yes
  • else return no
  • Note: e-closure(S) is the set of states that can
    be reached from states in S through zero or more
    transitions labeled by the empty string.
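
The algorithm above can be sketched in Python as follows. This is a minimal illustration, not the textbook's code: the dictionary layout, the use of the empty string as the label for empty-string transitions, and the choice of state 3 as accepting are all assumptions. The end of the Python string stands in for eof.

```python
EPS = ""  # assumption: the empty string labels empty-string transitions

def e_closure(states, table):
    """States reachable from `states` through zero or more EPS transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in table.get((s, EPS), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def move(states, a, table):
    """States reachable from `states` through one transition labeled a."""
    result = set()
    for s in states:
        result |= table.get((s, a), set())
    return result

def nfa_accepts(table, s0, final, x):
    S = e_closure({s0}, table)
    for a in x:  # reaching the end of x corresponds to eof
        S = e_closure(move(S, a, table), table)
    return bool(S & final)  # yes iff S intersects F

# The four-state NFA from the transition table on the earlier slide.
TABLE = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}}
print(nfa_accepts(TABLE, 0, {3}, "ababb"))  # True
print(nfa_accepts(TABLE, 0, {3}, "abab"))   # False
```

On ababb the state set S evolves as {0}, {0, 1}, {0, 2}, {0, 1}, {0, 2}, {0, 3}, and the last set contains state 3.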

6
  • Example: recognizing ababb with the previous NFA
  • Example 2: use the example in Fig. 3.27 for
    recognizing ababb
  • Space complexity O(|S|), time complexity
    O(|S|^2 |x|), where |S| is the number of NFA
    states and |x| is the length of the input

7
  • Construct an NFA from a regular expression
  • Input: a regular expression r over an alphabet Σ
  • Output: an NFA N accepting L(r)
  • Algorithm (3.3, page 122)
  • For the empty string ε, construct the NFA
  • For a in Σ, construct the NFA
  • Let N(s) and N(t) be NFAs for regular
    expressions s and t
  • For s|t, construct the NFA N(s|t)
  • For st, construct the NFA N(st)
  • For s*, construct the NFA N(s*)

  • [Figures: NFA diagrams for each case above, built from the sub-automata N(s) and N(t)]

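
The construction cases can be sketched in Python as below. This is a hedged illustration rather than the textbook's Algorithm 3.3: the Frag class and every function name are hypothetical, the empty string is assumed to label empty-string transitions, and concatenation links the two fragments with an empty-string edge instead of merging states, which accepts the same language.

```python
from collections import defaultdict
from itertools import count

EPS = ""         # assumption: the empty string labels empty-string transitions
_ids = count()   # source of fresh state numbers

class Frag:
    """Hypothetical NFA fragment with one start and one accepting state."""
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges

def _merge(*frags):
    # Copy all edges of the given fragments into one fresh table.
    edges = defaultdict(set)
    for f in frags:
        for key, targets in f.edges.items():
            edges[key] |= targets
    return edges

def symbol(a):
    # Base cases: a single edge labeled a (pass EPS for the empty string).
    s, f = next(_ids), next(_ids)
    edges = defaultdict(set)
    edges[(s, a)].add(f)
    return Frag(s, f, edges)

def union(x, y):
    # New start branches into both fragments; both accepts lead to a new accept.
    s, f = next(_ids), next(_ids)
    edges = _merge(x, y)
    edges[(s, EPS)] |= {x.start, y.start}
    edges[(x.accept, EPS)].add(f)
    edges[(y.accept, EPS)].add(f)
    return Frag(s, f, edges)

def concat(x, y):
    # Simplification: join with an EPS edge rather than merging two states.
    edges = _merge(x, y)
    edges[(x.accept, EPS)].add(y.start)
    return Frag(x.start, y.accept, edges)

def star(x):
    # Allow skipping x entirely or repeating it any number of times.
    s, f = next(_ids), next(_ids)
    edges = _merge(x)
    edges[(s, EPS)] |= {x.start, f}
    edges[(x.accept, EPS)] |= {x.start, f}
    return Frag(s, f, edges)

def accepts(frag, text):
    # Set-of-states simulation, used here only to check the construction.
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            for t in frag.edges.get((stack.pop(), EPS), ()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen
    current = closure({frag.start})
    for ch in text:
        nxt = set()
        for s in current:
            nxt |= frag.edges.get((s, ch), set())
        current = closure(nxt)
    return frag.accept in current

# Build an NFA for (a|b)*abb from the pieces.
r = concat(concat(concat(star(union(symbol("a"), symbol("b"))),
                         symbol("a")), symbol("b")), symbol("b"))
print(accepts(r, "ababb"))  # True
print(accepts(r, "abab"))   # False
```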
8
  • Example: r = (a|b)*abb.
  • Example: use Algorithm 3.3 to construct N(r)
    for r (ab a)b b.

9
  • Using an NFA, we can recognize a token in
    O(|S|^2 |x|) time. We can improve the time
    complexity by using a deterministic finite
    automaton (DFA) instead of an NFA.
  • An NFA is deterministic (a DFA) if
  • there are no transitions on the empty string
  • for each state s and input symbol a, there is
    at most one edge labeled a leaving s.
  • What is the time complexity to recognize a token
    when a DFA is used?
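
A short sketch of DFA-based recognition helps with the question above: the loop does one table lookup per input character, so the running time is proportional to the length of the input. The DFA below is a hand-built assumption for the language (a|b)*abb and does not come from the slides.

```python
# Assumed hand-built DFA for (a|b)*abb: (state, symbol) -> single next state.
DFA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}

def dfa_accepts(table, start, final, x):
    state = start
    for a in x:
        # At most one edge per (state, symbol): a single dictionary lookup.
        state = table.get((state, a))
        if state is None:  # no edge labeled a leaves this state
            return False
    return state in final

print(dfa_accepts(DFA, 0, {3}, "ababb"))  # True
print(dfa_accepts(DFA, 0, {3}, "abab"))   # False
```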

10
  • Algorithm to convert an NFA to a DFA that accepts
    the same language (Algorithm 3.2, page 118)
  • initially e-closure(s0) is the only state in
    Dstates and it is unmarked
  • while there is an unmarked state T in Dstates do
    begin
  •     mark T
  •     for each input symbol a do begin
  •         U := e-closure(move(T, a))
  •         if (U is not in Dstates) then
  •             add U as an unmarked state to
                Dstates
  •         Dtran[T, a] := U
  •     end
  • end
  • Initial state: e-closure(s0). Final states: ?
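
The conversion loop can be sketched in Python as follows. The dictionary layout, the function names, and the sample NFA (with state 3 assumed accepting) are illustrative assumptions. Each DFA state is a frozenset of NFA states, and the sketch marks a DFA state as final when it contains at least one accepting NFA state, which is the standard rule for this construction.

```python
from collections import deque

EPS = ""  # assumption: the empty string labels empty-string transitions

def e_closure(states, nfa):
    closure, stack = set(states), list(states)
    while stack:
        for t in nfa.get((stack.pop(), EPS), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def move(states, a, nfa):
    result = set()
    for s in states:
        result |= nfa.get((s, a), set())
    return result

def nfa_to_dfa(nfa, s0, final, alphabet):
    start = frozenset(e_closure({s0}, nfa))
    dstates = {start}
    unmarked = deque([start])
    dtran = {}
    while unmarked:
        T = unmarked.popleft()  # removing T from the queue marks it
        for a in alphabet:
            U = frozenset(e_closure(move(T, a, nfa), nfa))
            if U not in dstates:
                dstates.add(U)
                unmarked.append(U)
            dtran[(T, a)] = U
    # A DFA state is final if it contains an accepting NFA state.
    dfinal = {T for T in dstates if T & final}
    return start, dtran, dfinal

# Sample input: a four-state NFA for (a|b)*abb.
NFA = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}}
start, dtran, dfinal = nfa_to_dfa(NFA, 0, {3}, "ab")
print(len({T for (T, _) in dtran}))  # 4
```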

11
  • Example: page 120, Fig. 3.27.
  • Question:
  • For an NFA with |S| states, at most how many
    states can its corresponding DFA have?