Title: Finite Automata and Regular Languages
Slide 1: Finite Automata and Regular Languages
- Jianhua Yang
- Department of Math and Computer Science
- Bennett College
Slide 2: Goals
- To understand the concepts of automata
- To know how to construct an automaton
- To know how to recognize regular languages by using finite automata
Slide 3: 1.1 Lexical Analysis
To detect whether or not a given string within a
source program represents an acceptable variable
name.
Slide 4: 1. Transition Diagrams
A transition diagram is a finite collection of circles, which
represent states, connected by arcs.
[Figure: transition diagram with its initial state and accept state labeled]
Slide 5: Instruction sequence
- State := 1
- Read the next symbol from the input
- While not end-of-string do
  - case State of
    - 1: if the symbol is a letter, then State := 3,
      else if it is a digit, then State := 2,
      else exit to error routine
    - 2: exit to error routine
    - 3: if the symbol is a letter, then State := 3,
      else if it is a digit, then State := 3,
      else exit to error routine
  - Read the next symbol from the input
- End while
- If State is not 3, then exit to error routine
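The instruction sequence can be sketched directly in Python. This is only a minimal sketch: the function name is hypothetical, and "letter" and "digit" are taken to mean whatever `str.isalpha()` and `str.isdigit()` accept, which is an assumption rather than something the slides specify.

```python
def is_variable_name(text: str) -> bool:
    """Follow the instruction sequence: state 1 is initial, state 3 accepts."""
    state = 1
    for symbol in text:
        if state == 1:
            if symbol.isalpha():
                state = 3
            elif symbol.isdigit():
                state = 2
            else:
                return False  # exit to error routine
        elif state == 2:
            return False      # state 2 always leads to the error routine
        else:                 # state == 3
            if symbol.isalpha() or symbol.isdigit():
                state = 3
            else:
                return False  # exit to error routine
    # After end-of-string, only state 3 is acceptable.
    return state == 3
```

For example, `is_variable_name("x1")` holds, while `"1x"` and the empty string are rejected.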
Slide 6: 2. Transition table
Slide 7: A lexical analyzer
- State := 1
- Repeat
  - Read the next symbol from the input stream
  - case symbol of
    - letter: input := letter
    - digit: input := digit
    - end-of-string marker: input := EOS
    - none of the above: exit to error routine
  - State := Table[State, input]
  - if State = error then exit to error routine
- Until State = accept
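A table-driven analyzer along these lines might look as follows in Python. The table contents are an assumption: they are filled in from the variable-name instruction sequence (state 1 initial, state 3 accepting), and the names `TABLE`, `classify`, and `analyze` are hypothetical.

```python
ERROR, ACCEPT = "error", "accept"

# Assumed table for the variable-name diagram: rows are states,
# columns are input classes; EOS is the end-of-string marker.
TABLE = {
    1: {"letter": 3, "digit": 2, "EOS": ERROR},
    2: {"letter": ERROR, "digit": ERROR, "EOS": ERROR},
    3: {"letter": 3, "digit": 3, "EOS": ACCEPT},
}

def classify(symbol):
    """Map a raw symbol (None marks end of string) to a table column."""
    if symbol is None:
        return "EOS"
    if symbol.isalpha():
        return "letter"
    if symbol.isdigit():
        return "digit"
    return None  # none of the above

def analyze(text: str) -> bool:
    stream = iter(list(text) + [None])  # append the end-of-string marker
    state = 1
    while True:
        kind = classify(next(stream))
        if kind is None:
            return False                # exit to error routine
        state = TABLE[state][kind]
        if state == ERROR:
            return False
        if state == ACCEPT:
            return True
```

The loop body never changes when the language changes; only `TABLE` and `classify` do, which is the point of the table-driven design.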
Slide 8: One more example
- Design a state diagram to recognize a real number
- 1.27E04, 0.345E11
- 1.29, 1239.
- 8E01
- 1.274E-10
- 2.871E10
Slide 9: The transition diagram
[Figure: transition diagram for real numbers; surviving state labels are 2, 3, 5, and 6]
Slide 10: The transition table
Homework 1
- Write its transition table
- Write its lexical analyzer
Slide 11: Homework 2
- Draw the state diagram
- Write the transition table
- Write the lexical analyzer
Do all three while taking abnormal (error) processing into account.
Slide 12: 1.2 Deterministic finite automata
- We want to ask one question
- Do transition diagrams provide a tool powerful
enough to develop programs for recognizing
syntactic structures of arbitrary complexity?
Slide 13: 1. Basic Definitions
A nonempty, finite set of symbols from which the
strings to be analyzed are constructed is called
an alphabet.
Each string to be analyzed is received as a
sequence of symbols, one symbol at a time. We
refer to the source of this sequence as the input
stream.
Slide 14: Deterministic finite automaton
- Device: it can be in any one of a finite number of
states, of which one is the initial state and at
least one is an accept state.
- Input stream: symbols from a given alphabet
arrive sequentially on the input stream.
- Control mechanism: it computes the next state of the
device from the current state and each received input symbol.
Slide 15: Making more sense of the name
- Deterministic: not ambiguous
- Finite: the machine has only a finite number of states
Slide 16: DFA model
[Figure: DFA model. A tape head reads an input tape and moves in one direction; a control mechanism with a dial of states 1 to 8 records the current state]
Slide 17: DFA model
- A DFA is a quintuple (S, Σ, δ, i, F), where
- S is a finite set of states
- Σ is the machine's alphabet
- δ is a function from S × Σ into S
- i (an element of S) is the initial state
- F (a subset of S) is the set of accept states
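One way to hold the five components in code is a small Python class. This is a sketch under assumptions: the class and field names are hypothetical, the transition function is stored as a dictionary keyed by (state, symbol), and the example machine (strings over {a, b} that end in b) is made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class DFA:
    states: set
    alphabet: set
    delta: dict        # maps (state, symbol) -> state
    initial: object
    accepting: set

    def __post_init__(self):
        # delta must be a total function on states x alphabet.
        expected = {(s, a) for s in self.states for a in self.alphabet}
        if set(self.delta) != expected:
            raise ValueError("delta must be defined for every (state, symbol)")

    def accepts(self, string) -> bool:
        state = self.initial
        for symbol in string:
            if symbol not in self.alphabet:
                return False  # symbol outside the machine's alphabet
            state = self.delta[(state, symbol)]
        return state in self.accepting

# Hypothetical example: strings over {a, b} that end in b.
ends_in_b = DFA(
    states={0, 1},
    alphabet={"a", "b"},
    delta={(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1},
    initial=0,
    accepting={1},
)
```

Checking totality in `__post_init__` mirrors the requirement that the transition function is defined for every state and every symbol.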
Slide 18: A DFA accepting the empty string
The empty string, a string containing no symbols, is represented by λ.
A DFA accepts λ if and only if its initial state is also an
accept state.
Slide 19: 2. Deterministic Transition Diagrams
1. Each state in such a diagram must have only
one arc leaving it for each symbol in the
alphabet.
2. The diagram must be fully defined: every state has an arc
for every symbol in the alphabet.
Slide 20: Example of DFA
- Consider a typical vending machine that dispenses
a person's choice of candy after it has received
a total of 30 cents in nickels, dimes, and
quarters
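Slide 21: Example of DFA
[Figure: transition diagram for the vending machine; states are labeled with the amount received so far (0, 5, 10, 15, 20, 25), and one surviving arc label is Q]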
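The vending machine can be sketched as a DFA whose states are the amounts received so far. Several details below are assumptions not stated on the slides: coins are written N, D, and Q; any overpayment is treated the same as reaching 30 cents; and coins inserted after 30 cents leave the machine in the accept state.

```python
COINS = {"N": 5, "D": 10, "Q": 25}   # nickel, dime, quarter (assumed symbols)
STATES = range(0, 35, 5)             # amounts 0, 5, ..., 30

# Transition function: add the coin's value, capped at 30 (assumption).
DELTA = {
    (total, coin): min(total + value, 30)
    for total in STATES
    for coin, value in COINS.items()
}

def dispenses(coins: str) -> bool:
    """Return True if the coin sequence leaves the machine in state 30."""
    total = 0                        # initial state: nothing received
    for coin in coins:
        total = DELTA[(total, coin)]
    return total == 30               # the only accept state
```

For instance, `dispenses("QN")` and `dispenses("DDD")` hold, while `dispenses("DD")` does not.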
Slide 22: 1.3 The limits of DFA
- Question
- Does the use of a transition table provide enough
flexibility for general string processing?
Slide 23: 1. DFA as Language Accepters
- The length of a string w is denoted by |w|.
- Full language: Σ*, the set of all strings over the alphabet Σ.
- A language is defined as a subset of Σ*.
- Regular language: for a DFA M = (S, Σ, δ, i, F), the
set of strings accepted by M is called a
regular language, denoted by L(M).
Slide 24: Questions?
- What is the alphabet of the English language?
- Does that alphabet give rise to only one language, English?
- How can we prove that a given language is regular?
Slide 25: Examples
- Prove that the following language is a regular language.
- The language consists of the strings of x's and
y's that contain an even number of x's and any
number of y's. Σ = {x, y}
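To show that a language is regular it is enough to exhibit a DFA that accepts it. A minimal sketch for this example uses two states that record the parity of the x's seen so far; the state names `"even"` and `"odd"` are hypothetical.

```python
# Transition function: x flips the parity, y leaves it unchanged.
DELTA = {
    ("even", "x"): "odd",
    ("even", "y"): "even",
    ("odd", "x"): "even",
    ("odd", "y"): "odd",
}

def accepts(string: str) -> bool:
    state = "even"              # initial state: zero x's read so far
    for symbol in string:
        state = DELTA[(state, symbol)]
    return state == "even"      # accept when the number of x's is even
```

Because the initial state is also the accept state, this machine accepts the empty string, which contains zero x's.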
Slide 26: Test 1
- Prove that the following language is a regular language.
- The language consists of the strings of x's and
y's that contain an even number of x's and an even
number of y's. Σ = {x, y}
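Slide 27: Test 2
- Prove that the following language is a regular language.
- The language consists of the strings of x's and
y's that contain an even number of x's and an odd
number of y's. Σ = {x, y}
Answer: this language is regular. A DFA with four states, one for each
combination of the parity of the x's and the parity of the y's, accepts it;
its accept state is the one for (even x's, odd y's).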
Slide 28: Example
- A language over Σ: the empty language ∅,
containing no strings
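Slide 29: Theory about regular languages
- Theorem 1.1
- For any alphabet Σ, there is a language that is
not equal to L(M) for any deterministic finite
automaton M
- The set of languages of the form L(M) is countable, because each DFA
has finitely many states and so there are only countably many DFAs
- The number of subsets of Σ* (that is, of languages over Σ) is uncountable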
Slide 30: 2. A nonregular language
- Think about an expression
- (ab)/c, or ((ab)/c-d)2
Slide 31: Theorem 1.2
- If a regular language contains strings of the form x^n y^n
for arbitrarily large integers n, then it must
contain strings of the form x^m y^n where m and n
are not equal
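Slide 32: Proof
- Assume L(M) contains x^n y^n for arbitrarily large n
- Let u be the number of states of M
- There must exist a positive integer k > u
such that x^k y^k is in L(M)
- While reading the x's, M must visit some state twice, because there are
more x's than states; so there exists j > 0 such that x^(k+j) y^k
is accepted by M
- Taking m = k + j and n = k, x^m y^n is in L(M) with m ≠ n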
Slide 33: Consequence of Theorem 1.2
- 1. {x^n y^n : n ∈ N} is not a regular language
- 2. DFAs are not powerful enough to analyze
nonregular languages
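The pigeonhole step of the proof can be tried out on a concrete machine. The sketch below assumes a hypothetical three-state DFA for the language x*y* (which contains every x^n y^n), finds a loop of length j among the states visited while reading the x's, and confirms that x^(k+j) y^k is accepted as well. All names are illustrative.

```python
# Hypothetical DFA for x*y*: state 0 reads x's, state 1 reads y's, 2 is dead.
DELTA = {
    (0, "x"): 0, (0, "y"): 1,
    (1, "x"): 2, (1, "y"): 1,
    (2, "x"): 2, (2, "y"): 2,
}
INITIAL, ACCEPTING, NUM_STATES = 0, {0, 1}, 3

def run(string: str) -> int:
    state = INITIAL
    for symbol in string:
        state = DELTA[(state, symbol)]
    return state

def loop_length(k: int):
    """Return j > 0 such that x^(k+j) and x^k end in the same state.

    A repeat must occur once k is at least the number of states."""
    seen = {INITIAL: 0}            # state -> number of x's read when first seen
    state = INITIAL
    for count in range(1, k + 1):
        state = DELTA[(state, "x")]
        if state in seen:
            return count - seen[state]
        seen[state] = count
    return None

k = NUM_STATES + 1                 # k > u, as in the proof
j = loop_length(k)
pumped = "x" * (k + j) + "y" * k   # x^m y^n with m = k + j, n = k
```

Here both `run("x" * k + "y" * k)` and `run(pumped)` end in an accept state, so the machine accepts a string with unequal numbers of x's and y's.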
Slide 34: 1.4 Nondeterministic Finite Automata (NFA)
- How to increase the power of DFA?
Slide 35: An Example
[Figure: transition diagram of an NFA with states 1, 2, and 3]
Slide 36: NFA
- An NFA is a quintuple (S, Σ, ρ, i, F), where
- a. S is a finite set of states
- b. Σ is the machine's alphabet
- c. ρ is a subset of S × Σ × S
- d. i (an element of S) is the initial state
- e. F (a subset of S) is the collection of
accept states.
Slide 37: Accepted by an NFA
- We say that a string is accepted by an NFA if it
is possible for its analysis to leave the machine
in an accept state
Slide 38: L(M)
- The set of all the strings accepted by an NFA M is
a language that we denote by L(M) and refer to as
the language accepted by M.
Slide 39: And More
- An NFA (S, Σ, ρ, i, F) accepts the nonempty
string x1 x2 ... xn ∈ Σ* if and only if there is a
sequence of states s0, s1, ..., sn such that s0 = i,
sn ∈ F, and for each integer j from 1 to n, (s(j-1),
xj, sj) is in ρ.
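Rather than searching for one accepting sequence of states, a program can track every state the NFA could be in after each symbol. The sketch below stores the transition relation as a set of triples; the function name and the example machine (strings over {x, y} that end in xy) are hypothetical.

```python
def nfa_accepts(rho, initial, accepting, string) -> bool:
    """Return True if some path labeled by `string` ends in an accept state."""
    current = {initial}                      # all states reachable so far
    for symbol in string:
        current = {q for (p, x, q) in rho if p in current and x == symbol}
    return bool(current & accepting)

# Hypothetical NFA: in state 0 it may guess that the final "xy" has begun.
RHO = {(0, "x", 0), (0, "y", 0), (0, "x", 1), (1, "y", 2)}
INITIAL, ACCEPTING = 0, {2}
```

With this machine, `nfa_accepts(RHO, INITIAL, ACCEPTING, "yxxy")` holds and `"xyx"` is rejected.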
Slide 40: The problem of NFA
- Deciding whether an NFA accepts a string can cost too much time,
because many possible paths may have to be explored
Slide 41: Converting an NFA to a DFA
[Figure: transition diagram of an NFA; surviving labels are state S1 and arcs labeled x and y]
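Slide 42: The DFA
[Figure: the equivalent DFA; its states are sets of NFA states, including {s0}, {s1}, {s0, s1}, {s0, s1, s2}, and the empty set; one surviving arc label is x]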
Slide 43: Theorem 1.3
- For each NFA, there is a DFA that accepts exactly
the same language.
Slide 44: Proof
- Suppose that M is an NFA (S, Σ, ρ, i, F). Our
purpose is to demonstrate that there exists a DFA that
accepts exactly the same strings as M.
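Slide 45: Proof (Cont.)
- We define M' = (S', Σ, δ, i', F') as follows
- S' = P(S), the collection of all subsets of S
- i' = {i}
- F' is the collection of subsets of S that contain
at least one state in F.
- δ: S' × Σ → S', where δ(s', x) is the set of all states q in S
such that (p, x, q) is in ρ for some p in s'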
Slide 46: Proof (Cont.)
M' is a DFA, because δ is a function defined on all of S' × Σ.
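The construction in the proof can be sketched in Python. One deliberate difference, stated as an assumption: instead of building all of P(S), the sketch only builds the subsets reachable from the initial one, which accepts the same strings. Function names and the example NFA (strings over {x, y} that end in xy) are hypothetical.

```python
def subset_construction(alphabet, rho, initial, accepting):
    """Build the reachable part of the DFA whose states are sets of NFA states."""
    start = frozenset({initial})
    delta, todo, seen = {}, [start], {start}
    while todo:
        subset = todo.pop()
        for symbol in alphabet:
            # All NFA states reachable from `subset` by one arc labeled `symbol`.
            target = frozenset(
                q for (p, x, q) in rho if p in subset and x == symbol
            )
            delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    final = {s for s in seen if s & accepting}   # contain an NFA accept state
    return seen, delta, start, final

def dfa_accepts(delta, start, final, string) -> bool:
    state = start
    for symbol in string:
        state = delta[(state, symbol)]
    return state in final

# Hypothetical NFA accepting the strings that end in "xy".
RHO = {(0, "x", 0), (0, "y", 0), (0, "x", 1), (1, "y", 2)}
STATES, DELTA, START, FINAL = subset_construction("xy", RHO, 0, {2})
```

For this NFA only three subsets are reachable, {0}, {0, 1}, and {0, 2}, far fewer than the eight subsets of a three-state set.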
Slide 47: Proof (Cont.)
- We need to prove:
- For each path in M from state i to state sn
traversing arcs labeled w1, w2, ..., wn, there is a
path in M' from state i' to a state s'n
traversing arcs labeled w1, w2, ..., wn, such that
sn ∈ s'n, and vice versa.
Slide 48: Proof (Cont.) Induction
- For n = 0, the statement is correct
- We assume that the statement holds for some n
- Then we need to prove that for n + 1, the statement
still holds.
Slide 49: Theorem 1.4
- For any alphabet Σ, {L(M) : M is a DFA with alphabet Σ} =
{L(M) : M is an NFA with alphabet Σ}.
Slide 50: 1.5 Regular Grammars
- A grammar describing a small subset of the
English language
<sentence> → <subject><predicate><period>
<subject> → <noun>
<noun> → John
<noun> → Mary
<predicate> → <intransitive verb>
<predicate> → <transitive verb><object>
<intransitive verb> → skates
<transitive verb> → hit
<transitive verb> → likes
<object> → <noun>
<period> → .
Slide 51: Some definitions
- Terminals: terms that don't appear in brackets.
- Nonterminals: terms enclosed in brackets.
- Start symbol: the first nonterminal, from which every derivation begins.
- Rewrite rule: each line consisting of a left and right side
connected by an arrow.
- Grammar: a finite collection of nonterminals and
terminals together with a start symbol and a
finite set of rewrite rules; also called a
phrase-structure grammar.
- λ-rule: a rewrite rule whose right side is the empty string.
Slide 52: For convenience
- We denote nonterminals by upper case letters and
terminals by lower case letters.
Slide 53: Derivation
- A grammar is said to generate a particular string of terminals if,
by starting with the start symbol, one can
produce the string by successively replacing
patterns found on the left of the grammar's
rewrite rules with the corresponding expressions
on the right, until only terminals remain. The sequence of
steps is called a derivation of the string.
Slide 54: A derivation example
- <sentence> ⇒ <subject><predicate><period>
- ⇒ <noun><predicate><period>
- ⇒ Mary <predicate><period>
- ⇒ Mary <transitive verb><object><period>
- ⇒ Mary likes <object><period>
- ⇒ Mary likes <noun><period>
- ⇒ Mary likes John <period>
- ⇒ Mary likes John.
Slide 55: Grammar example
- S → XSZ
- S → Y
- Y → yY
- Y → λ
- X → x
- Z → z
Please determine whether the string xyz can be generated
by this grammar.
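One way to explore the question is a breadth-first search over leftmost derivations. This is a rough sketch, not a general parser: the length bound `2 * len(target) + 2` is an assumption chosen to be generous enough for this particular grammar, upper-case letters are taken to be nonterminals, and the names `RULES` and `derive` are hypothetical.

```python
from collections import deque

# The grammar above; "" stands for the empty string on the right of Y's rule.
RULES = {"S": ["XSZ", "Y"], "Y": ["yY", ""], "X": ["x"], "Z": ["z"]}

def derive(target: str, start: str = "S"):
    """Return a leftmost derivation of `target` as a list of forms, or None."""
    limit = 2 * len(target) + 2            # assumed bound on form length
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        form = path[-1]
        # Position of the leftmost nonterminal, if any.
        index = next((i for i, c in enumerate(form) if c.isupper()), None)
        if index is None:
            if form == target:
                return path
            continue
        if not target.startswith(form[:index]):
            continue                       # terminal prefix already mismatches
        for right in RULES[form[index]]:
            new = form[:index] + right + form[index + 1:]
            terminals = sum(c.islower() for c in new)
            if new not in seen and len(new) <= limit and terminals <= len(target):
                seen.add(new)
                queue.append(path + [new])
    return None
```

Calling `derive("xyz")` returns the derivation S, XSZ, xSZ, xYZ, xyYZ, xyZ, xyz, so the string can be generated; `derive("xyzz")` returns `None`.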
Slide 56: L(G)
- If the terminals of a grammar G are symbols in an
alphabet Σ, G is a grammar over the alphabet Σ.
- The strings generated by G form a subset of Σ*.
- We define L(G) as the set of strings generated by G over
Σ.
Slide 57: Regular grammar
- It is a grammar whose rewrite rules conform to
the following restrictions.
- The left side of any rewrite rule must consist of
a single nonterminal
- The right side must be a terminal followed by a
nonterminal, or a single terminal, or the empty
string.
Slide 58: Examples
A regular grammar
[Figure: example of a regular grammar]
Not a regular grammar
yW → X
X → xZy
YX → WvZ
Slide 59: From regular grammar G to DFA
- S → xX
- S → yY
- X → yY
- X → λ
- Y → xX
- Y → λ
[Figure: the corresponding transition diagram, starting from state S]
Slide 60: Theorem 1.5
- A language generated by a regular grammar G is a
regular language. This language can be accepted
by a DFA.
Slide 61: Proof
- Converting: if G is a regular grammar, we can
convert it to a regular grammar G', which
generates the same language but does not contain
rewrite rules whose right sides consist of a
single terminal. For example, a rule N → x can be replaced
by N → xX and X → λ, where X is a new nonterminal.
Slide 62: Proof (cont.)
- Construct an NFA M = (S, Σ, ρ, i, F) as follows
- S is the collection of nonterminals in G'.
- i is the start symbol in G'.
- F is the set of nonterminals that appear on the left of λ-rules.
- And ρ consists of the triples (P, x, Q) that
correspond to a rewrite rule P → xQ in G'.
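The construction can be sketched in Python using the six-rule grammar from the "From regular grammar G to DFA" slide, which already has no rules whose right side is a single terminal. The rule encoding (pairs of strings, with "" for the empty string) and the function names are assumptions made for illustration.

```python
# Rules as (left side, right side); "" marks a rule with an empty right side.
GRAMMAR = [("S", "xX"), ("S", "yY"), ("X", "yY"), ("X", ""),
           ("Y", "xX"), ("Y", "")]

def grammar_to_nfa(rules, start="S"):
    """Turn rules of the form P -> xQ or P -> empty into an NFA."""
    states = {left for left, _ in rules}
    states |= {right[1] for _, right in rules if len(right) == 2}
    # Each rule P -> xQ becomes the triple (P, x, Q).
    rho = {(left, right[0], right[1]) for left, right in rules if len(right) == 2}
    # Nonterminals with an empty right side become accept states.
    accepting = {left for left, right in rules if right == ""}
    return states, rho, start, accepting

def nfa_accepts(rho, initial, accepting, string) -> bool:
    current = {initial}
    for symbol in string:
        current = {q for (p, x, q) in rho if p in current and x == symbol}
    return bool(current & accepting)

STATES, RHO, INITIAL, ACCEPTING = grammar_to_nfa(GRAMMAR)
```

The resulting machine accepts strings such as x, xy, and yxy, and rejects xx and the empty string, matching what the grammar can derive.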
Slide 63: Proof (cont.)
- Generate a regular grammar from an NFA M = (S, Σ, ρ,
i, F) as follows
- The start symbol is i.
- Nonterminals are the states in S.
- If Q is in F, then we have a rule Q → λ.
- If (P, x, Q) is in ρ, then we have a rule P → xQ.
Slide 64: Proof (cont.)
- In either case, we have L(M) = L(G').
Slide 65: 1.6 Regular Expressions
- Another characterization of regular languages
- Providing more insight into the composition of
regular languages.
Slide 66: Definition of regular expressions
- A regular expression (over an alphabet Σ) is
defined as follows
- ∅ is a regular expression
- Each member of Σ is a regular expression
- If p and q are regular expressions, then so are
(p ∪ q), (p ∘ q), and (p*).
Slide 67: 1. Operation Union ∪
- The union of any two regular languages is regular.
L1 = {x, xy} and L2 = {yx, yy}; L = L1 ∪ L2 = {x, xy,
yx, yy}
Slide 68: Construct an NFA for Union
- Draw a new initial state, and declare it an
accept state if and only if one of the original
initial states was also an accept state.
- To each state that is the destination of an arc from
one of the original initial states, draw an arc
with the same label from the new initial state.
- Cancel the initial status of the original initial
states.
Slide 69: Example about Union operation
[Figure: transition diagrams for L1 and L2 combined into an NFA for their union]
Slide 70: 2. Concatenation
- The concatenation of any two regular languages is
regular.
L1 = {x, xy} and L2 = {yx, yy}; L = L1 ∘ L2 = {xyx, xyy,
xyyx, xyyy}
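For finite languages such as these, union and concatenation can be checked directly with Python sets. This only illustrates the two operations on the example languages; it is not the NFA construction, and the variable names are hypothetical.

```python
L1 = {"x", "xy"}
L2 = {"yx", "yy"}

# Union: every string that belongs to L1 or to L2.
union = L1 | L2

# Concatenation: a string of L1 followed by a string of L2.
concatenation = {u + v for u in L1 for v in L2}
```

Here `union` is {x, xy, yx, yy} and `concatenation` is {xyx, xyy, xyyx, xyyy}, as on the slides.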
Slide 71: Construct an NFA for concatenation
- From each accept state of T1, draw an arc to each
state of T2 that is the destination of an arc
from the initial state of T2
- Label each arc with the label of the
corresponding arc in T2
- Allow the accept states in T1 to remain accept
states if and only if the initial state of T2 is
also an accept state
- Remove the initial status from the initial state of T2.
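Slide 72: Example
Slide 73: Example (Cont.)
[Figure: NFA for the concatenation; surviving labels are states 1 and 2 and arcs labeled x and y]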
Slide 74: 3. Kleene star operation
- The Kleene star of any regular language is
regular.
Slide 75: The rules to construct an NFA
- Draw a new initial state. To each state that was the destination of an arc
from the original initial state, draw an arc from the
new initial state with the same label.
- Add an arc from each accept state to each
state that is the destination of an arc from the
initial state, with the same label.
- Designate the new initial state as an accept
state.
Slide 76: Example
Slide 77: 4. Theorem 1.6
- Given an alphabet Σ, the regular languages over
Σ are exactly the languages that are represented
by regular expressions over Σ.
Slide 78: Proof
Slide 79: 1.7 Summary
- DFA, NFA
- Regular languages
- Regular grammar
- Regular expressions
Slide 80: Example 1
- For the expression (ab), construct its NFA, and convert it to a DFA.
Slide 81: NFA
[Figure: NFA for Example 1, with arcs labeled a, b, and e]
Slide 82: DFA
Slide 83: Example 2
- Expression (b(ab)a(ab))(ab)
Slide 86: NFA
Slide 87: DFA