Title: Regular Languages
1 Regular Languages
Giorgi Japaridze, Theory of Computability
Chapter 1
2 How a finite automaton works
1.1.a
[Figure: state diagram of an automaton with states q0, q1, q2 and transitions labeled 0 and 1, shown processing the input 0 1 1 0 0]
3 The language of a machine
1.1.b
[Figure: state diagram of automaton A with states q0, q1, q2 and transitions labeled 0 and 1]
L(M), the language of M, or the language recognized by M, is the set of all strings that the machine M accepts.
What is the language recognized by our automaton A?
4 Formal definition of a finite automaton
1.1.c
A (deterministic) finite automaton (DFA) is a 5-tuple (Q, Σ, δ, s, F), where
- Q is a finite set whose elements are called the states
- Σ is a finite set called the alphabet
- δ is a function of the type Q × Σ → Q called the transition function
- s is an element of Q called the start state
- F is a subset of Q called the set of accept states
5 Our automaton formalized
1.1.d
[Figure: state diagram of automaton A with states q0, q1, q2, next to the labels Q, Σ, δ, s, F and a transition table for δ with rows q0, q1, q2 and columns 0, 1]
A = (Q, Σ, δ, s, F)
6 Formal definition of computation
1.1.e
M = (Q, Σ, δ, s, F)
[Figure: state diagram with states q0, q1, q2 and transitions labeled 0 and 1]
M accepts the string u₁u₂…uₙ iff there is a sequence r₁, r₂, …, rₙ, rₙ₊₁ of states such that
- r₁ = s
- rᵢ₊₁ = δ(rᵢ, uᵢ), for each i with 1 ≤ i ≤ n
- rₙ₊₁ ∈ F
Example: for the input u₁u₂…uₙ = 0 1 1 0 0, the sequence r₁, r₂, …, rₙ, rₙ₊₁ is q0, q2, q0, q0, q2, q1.
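The definition of computation can be sketched in Python as follows. This is only an illustration: the names `run_dfa` and `accepts` and the dictionary encoding of δ are choices made here. Four transitions are read off the trace q0, q2, q0, q0, q2, q1 on input 01100. The two transitions out of q1 are placeholders, and the accept set {q1} is assumed on the grounds that the run shown is meant to be accepting.

```python
from typing import Dict, List, Set, Tuple

State = str
Symbol = str


def run_dfa(delta: Dict[Tuple[State, Symbol], State], start: State, word: str) -> List[State]:
    """Return the state sequence r1, ..., r(n+1) for the input word."""
    states = [start]                                  # r1 = s
    for symbol in word:
        states.append(delta[(states[-1], symbol)])    # r(i+1) = delta(ri, ui)
    return states


def accepts(delta: Dict[Tuple[State, Symbol], State], start: State,
            accept: Set[State], word: str) -> bool:
    """M accepts the word iff the last state of its run is in F."""
    return run_dfa(delta, start, word)[-1] in accept


delta = {
    # Read off the trace q0 q2 q0 q0 q2 q1 on input 0 1 1 0 0.
    ("q0", "0"): "q2",
    ("q0", "1"): "q0",
    ("q2", "0"): "q1",
    ("q2", "1"): "q0",
    # Placeholders (assumption): the transitions out of q1 are not given in the text.
    ("q1", "0"): "q1",
    ("q1", "1"): "q1",
}
accept_states = {"q1"}  # assumption: the run shown ends in an accept state

print(run_dfa(delta, "q0", "01100"))  # ['q0', 'q2', 'q0', 'q0', 'q2', 'q1']
print(accepts(delta, "q0", accept_states, "01100"))  # True
```

Because δ is a function, the sequence of states is uniquely determined by the input, so checking acceptance amounts to following one path.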
7 Designing finite automata
1.1.f
Task: Design an automaton that accepts a bit string iff it contains an even number of 1s.
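One possible solution to this task, sketched in Python, uses two states that remember only the parity of the 1s read so far. The state names `even` and `odd` and the function name `has_even_ones` are illustrative choices, not taken from the slides.

```python
# States: "even" = an even number of 1s read so far, "odd" = an odd number.
# Reading 0 keeps the state; reading 1 flips it.
EVEN_ONES_DELTA = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}


def has_even_ones(bits: str) -> bool:
    """Simulate the two-state DFA; "even" is both the start and the accept state."""
    state = "even"
    for bit in bits:
        state = EVEN_ONES_DELTA[(state, bit)]
    return state == "even"


for w in ["", "0", "1", "0110", "10101"]:
    print(repr(w), has_even_ones(w))
```

The design idea is the usual one: decide what finite piece of information about the input seen so far is enough to answer the question, and make one state per possible value of that information.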
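8 NFAs (Nondeterministic Finite Automata)
1.2.a
[Figure: NFA with states q1, q2, q3 and edge labels 1; 0,1; 0,1]
On the input 0 1 0 1 0, the set of states the NFA can be in evolves as follows:
- at the start: q1
- after 0: q1
- after 1: q1, q2
- after 0: q1, q3
- after 1: q1, q2
- after 0: q1, q3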
9 NFAs (Nondeterministic Finite Automata)
1.2.a
[Figure: NFA with states q1, q2, q3 and edge labels 1; 0,1; 0,1]
What language does this NFA recognize?
10 Formal definition of a nondeterministic finite automaton
1.2.b
An NFA is a 5-tuple (Q, Σ, δ, s, F), where
- Q is a finite set whose elements are called the states
- Σ is a finite set called the alphabet
- δ is a function of the type Q × Σ → P(Q) called the transition function
- s is an element of Q called the start state
- F is a subset of Q called the set of accept states
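11 Example
1.2.c
[Figure: NFA with states 1, 2, 3 over the alphabet {a, b}, with edge labels b; a; b; a; a,b, next to the labels Q, Σ, δ, s, F and a transition table for δ with rows 1, 2, 3 and columns a, b]
A = (Q, Σ, δ, s, F)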
12 Formal definition of accepting
1.2.d
M = (Q, Σ, δ, s, F)
When M is a DFA:
M accepts the string u₁u₂…uₙ iff there is a sequence r₁, r₂, …, rₙ, rₙ₊₁ of states such that
- r₁ = s
- rᵢ₊₁ = δ(rᵢ, uᵢ), for each i with 1 ≤ i ≤ n
- rₙ₊₁ ∈ F
When M is an NFA:
M accepts the string u₁u₂…uₙ iff there is a sequence r₁, r₂, …, rₙ, rₙ₊₁ of states such that
- r₁ = s
- rᵢ₊₁ ∈ δ(rᵢ, uᵢ), for each i with 1 ≤ i ≤ n
- rₙ₊₁ ∈ F
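The NFA condition asks whether some sequence of states exists. One way to check this without trying every sequence is to track the set of all states the machine could be in after each symbol. The Python sketch below does this for the three-state NFA (q1, q2, q3) traced earlier on the input 01010. Its transitions are inferred from that trace, and the accept set {q3} is an assumption made here. The names `reachable_sets` and `nfa_accepts` are illustrative.

```python
from typing import Dict, List, Set, Tuple

# delta maps (state, symbol) to a set of states; a missing key means the empty set.
DELTA: Dict[Tuple[str, str], Set[str]] = {
    ("q1", "0"): {"q1"},
    ("q1", "1"): {"q1", "q2"},
    ("q2", "0"): {"q3"},
    ("q2", "1"): {"q3"},
}
START = "q1"
ACCEPT = {"q3"}  # assumption: q3 is the only accept state


def reachable_sets(word: str) -> List[Set[str]]:
    """Sets of states the NFA could be in before and after each input symbol."""
    current = {START}
    history = [set(current)]
    for symbol in word:
        # Follow every possible transition from every currently possible state.
        current = {q for p in current for q in DELTA.get((p, symbol), set())}
        history.append(set(current))
    return history


def nfa_accepts(word: str) -> bool:
    """Some accepting sequence r1..r(n+1) exists iff the final set meets F."""
    return bool(reachable_sets(word)[-1] & ACCEPT)


for step in reachable_sets("01010"):
    print(sorted(step))
print(nfa_accepts("01010"))
```

The printed sets reproduce the trace q1; q1; q1, q2; q1, q3; q1, q2; q1, q3.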
13 What language does this NFA recognize?
1.2.e
[Figure: state diagram of an NFA with seven edge labels, each 0]
14 What language does this DFA recognize?
1.2.f
[Figure: state diagram of a DFA with states numbered 1 to 5 and transitions labeled 0]
15 Equivalence of NFAs and DFAs
1.2.g
Two machines are said to be equivalent if they recognize the same language.
Theorem 1.39: Every NFA has an equivalent DFA.
Proof. Consider an NFA N = (Q, Σ, δ, s, F). We need to construct an equivalent DFA D = (Q', Σ, δ', s', F'), using a procedure called the subset construction, described on the next slide.
16 The subset construction
1.2.h
Constructing DFA D = (Q', Σ, δ', s', F') from NFA N = (Q, Σ, δ, s, F):
- Q' = P(Q)
- δ'(R, a) = {q | q ∈ δ(p, a) for some p ∈ R}
- s' = {s}
- F' = {R | R is a subset of Q containing an accept state of N}
D obviously works correctly: at every step in the computation, it clearly enters a state that corresponds to the subset of states that N could be in at that point.
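The four clauses of the construction translate almost line by line into Python. The sketch below is illustrative only: the function names are chosen here, and the input is the three-state NFA over {0, 1} from the earlier NFA slide, with q3 assumed to be its only accept state.

```python
from itertools import combinations, product


def power_set(states):
    """All subsets of `states`, as frozensets (these become the DFA's states)."""
    items = sorted(states)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def subset_construction(Q, sigma, delta, s, F):
    """Build D = (Q', sigma, delta', s', F') from the NFA N = (Q, sigma, delta, s, F)."""
    Q2 = power_set(Q)                                        # Q' = P(Q)
    delta2 = {(R, a): frozenset(q for p in R for q in delta.get((p, a), ()))
              for R in Q2 for a in sigma}                    # delta'(R, a)
    s2 = frozenset({s})                                      # s' = {s}
    F2 = {R for R in Q2 if R & F}                            # subsets containing an accept state
    return Q2, sigma, delta2, s2, F2


def dfa_accepts(dfa, word):
    """Run the constructed DFA on `word`."""
    _, _, delta2, state, F2 = dfa
    for a in word:
        state = delta2[(state, a)]
    return state in F2


# Example NFA; missing (state, symbol) keys mean the empty set.
N_DELTA = {("q1", "0"): {"q1"}, ("q1", "1"): {"q1", "q2"},
           ("q2", "0"): {"q3"}, ("q2", "1"): {"q3"}}
D = subset_construction({"q1", "q2", "q3"}, {"0", "1"}, N_DELTA, "q1", {"q3"})

print(len(D[0]))  # 8 subsets of a 3-element set
# With F = {q3}, the NFA accepts exactly the strings whose second-to-last symbol is 1.
words = ("".join(t) for n in range(7) for t in product("01", repeat=n))
print(all(dfa_accepts(D, w) == (len(w) >= 2 and w[-2] == "1") for w in words))
```

The exhaustive check over short strings is not a proof of correctness, but it is a cheap sanity test of the claim that D tracks the set of states N could be in.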
17 Example of applying the subset construction
1.2.i
N = (Q, Σ, δ, s, F)
[Figure: NFA N with states 1, 2, 3 over the alphabet {a, b}, next to a transition table for the constructed DFA with columns a, b and rows ∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}]
- Q' = P(Q)
- δ'(R, a) = {q | q ∈ δ(p, a) for some p ∈ R}
- s' = {s}
- F' = {R | R is a subset of Q containing an accept state of N}
18 The resulting DFA
1.2.j
[Figure: state diagram of the resulting DFA D with states ∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3} and transitions labeled a and b]
19 Removing unreachable states
1.2.k
[Figure: state diagram of D after removing unreachable states; the remaining states are ∅, {1}, {3}, {2,3}, {1,2,3}]
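Instead of building all of P(Q) and then deleting unreachable states, one can generate only the subsets reachable from {s} with a worklist. The Python sketch below illustrates this variant. It is a different procedure from the one on the slides, though it yields the same reachable part of D. For input it uses the three-state NFA over {0, 1} from the earlier NFA slide (an assumption made for illustration; the slide's own example is over {a, b}). The function name `reachable_subset_dfa` is chosen here.

```python
from collections import deque


def reachable_subset_dfa(sigma, delta, s):
    """Explore only those subsets of NFA states that are reachable from {s}."""
    start = frozenset({s})
    seen = {start}
    queue = deque([start])
    delta2 = {}
    while queue:
        R = queue.popleft()
        for a in sorted(sigma):
            # delta'(R, a) = {q | q in delta(p, a) for some p in R}
            T = frozenset(q for p in R for q in delta.get((p, a), ()))
            delta2[(R, a)] = T
            if T not in seen:        # first time this subset shows up
                seen.add(T)
                queue.append(T)
    return seen, delta2, start


# Example NFA; missing (state, symbol) keys mean the empty set.
N_DELTA = {("q1", "0"): {"q1"}, ("q1", "1"): {"q1", "q2"},
           ("q2", "0"): {"q3"}, ("q2", "1"): {"q3"}}

states, delta2, start = reachable_subset_dfa({"0", "1"}, N_DELTA, "q1")
for R in sorted(states, key=lambda R: (len(R), sorted(R))):
    print(sorted(R))
```

For this NFA only 4 of the 8 subsets are reachable: {q1}, {q1, q2}, {q1, q3} and {q1, q2, q3}.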
20 Testing in work
1.2.l
[Figure: the NFA N (states 1, 2, 3) and the DFA D (states ∅, {1}, {3}, {2,3}, {1,2,3}) side by side, both run on the input b a a]
21 Regular operations
1.3.a
- Union: L₁ ∪ L₂ = {x | x ∈ L₁ or x ∈ L₂}
  - {Good, Bad} ∪ {Boy, Girl}
  - {0, 00, 000, …} ∪ {1, 11, 111, …}
  - L ∪ ∅
- Concatenation: L₁ ∘ L₂ = {xy | x ∈ L₁ and y ∈ L₂}
  - {Good, Bad} ∘ {Boy, Girl}
  - {0, 00, 000, …} ∘ {1, 11, 111, …}
  - L ∘ ∅
- Star: L* = {x₁…xₖ | k ≥ 0 and each xᵢ is in L}
  - {Boy, Girl}*
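For finite languages, the three operations can be tried out directly with Python sets of strings. This is a small sketch with function names chosen here. Since L* is infinite whenever L contains a nonempty string, `star_up_to` is a hypothetical helper that only builds the part of L* obtained with k ≤ k_max.

```python
def union(l1, l2):
    """L1 ∪ L2 = {x | x in L1 or x in L2}."""
    return l1 | l2


def concat(l1, l2):
    """L1 ∘ L2 = {xy | x in L1 and y in L2}."""
    return {x + y for x in l1 for y in l2}


def star_up_to(lang, k_max):
    """The strings x1...xk of L* with 0 <= k <= k_max (a finite slice of L*)."""
    result = {""}          # k = 0 contributes the empty string
    layer = {""}
    for _ in range(k_max):
        layer = concat(layer, lang)   # strings built from exactly one more factor
        result |= layer
    return result


A = {"Good", "Bad"}
B = {"Boy", "Girl"}

print(sorted(union(A, B)))        # ['Bad', 'Boy', 'Girl', 'Good']
print(sorted(concat(A, B)))       # ['BadBoy', 'BadGirl', 'GoodBoy', 'GoodGirl']
print(concat(A, set()))           # set(): concatenating with ∅ gives ∅
print(sorted(star_up_to(B, 2)))
```

Note the asymmetry between the two operations on ∅: L ∪ ∅ is L itself, while L ∘ ∅ is empty because there is no y to pick from ∅.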
22 Regular expressions
1.3.b
We say that R is a regular expression (RE) iff R is one of the following:
1. a, where a is a symbol of the alphabet
2. ε
3. ∅
4. (R₁) ∪ (R₂), where R₁ and R₂ are RE
5. (R₁) ∘ (R₂), where R₁ and R₂ are RE
6. (R₁)*, where R₁ is a RE
What language is represented by the expression:
1. {a}
2. {ε}
3. ∅
4. The union of the languages represented by R₁ and R₂
5. The concatenation of the languages represented by R₁ and R₂
6. The star of the language represented by R₁
Conventions
- The symbol ∘ is often omitted in RE.
- Some parentheses can be omitted.
- The precedence order for the operators is: * (highest), ∘ (medium), ∪ (lowest).
23 Regular languages
1.3.c
A language is said to be regular iff it can be represented by a regular expression.
Language | Expression
{11} |
{Boy, Girl, Good, Bad} |
{ε, 0, 00, 000, 0000, …} |
{0, 00, 000, 0000, …} |
{ε, 01, 0101, 010101, 01010101, …} |
{x | x = 0ᵏ where k is a multiple of 2 or 3} |
{x | x is divisible by 8} |
{x | x MOD 4 = 3} |
24 Exercising reading regular expressions
1.3.d
Expression | Language
(Good ∪ Bad)(Boy ∪ Girl) |
(Tom ∪ Bob)_is_(good ∪ bad) |
 | Name_is_adjective, where Name is an uppercase letter followed by zero or more lowercase letters, and adjective is a lowercase letter followed by zero or more lowercase letters
010 |
(0 ∪ 1)101(0 ∪ 1) |
((0 ∪ 1)(0 ∪ 1)) |
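The first three rows can be explored with Python's `re` module, as a rough sketch. Python writes union as `|` and concatenation as juxtaposition. The character classes `[A-Z]` and `[a-z]` are shorthand for a union of single symbols, and `re.fullmatch` tests whether the whole string belongs to the language. The names `PATTERNS` and `in_language`, and the reading of `_` as a literal underscore symbol, are assumptions made here.

```python
import re

# Textbook-style expression -> equivalent Python pattern.
PATTERNS = {
    "(Good ∪ Bad)(Boy ∪ Girl)": r"(Good|Bad)(Boy|Girl)",
    "(Tom ∪ Bob)_is_(good ∪ bad)": r"(Tom|Bob)_is_(good|bad)",
    # Name: uppercase letter then zero or more lowercase letters;
    # adjective: lowercase letter then zero or more lowercase letters.
    "Name_is_adjective": r"[A-Z][a-z]*_is_[a-z][a-z]*",
}


def in_language(pattern: str, text: str) -> bool:
    """True iff the entire string `text` is matched by `pattern`."""
    return re.fullmatch(pattern, text) is not None


print(in_language(PATTERNS["(Good ∪ Bad)(Boy ∪ Girl)"], "GoodGirl"))   # True
print(in_language(PATTERNS["(Good ∪ Bad)(Boy ∪ Girl)"], "GoodBad"))    # False
print(in_language(PATTERNS["(Tom ∪ Bob)_is_(good ∪ bad)"], "Bob_is_good"))  # True
print(in_language(PATTERNS["Name_is_adjective"], "Alice_is_clever"))   # True
print(in_language(PATTERNS["Name_is_adjective"], "alice_is_clever"))   # False
```

Python's pattern syntax has many features beyond the three regular operations; the patterns above deliberately use only union, concatenation and star.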
25 Regular languages and DFA-recognizable languages are the same
1.3.e
Theorem 1.54: A language is regular if and only if some NFA (DFA) recognizes it.
Proof omitted (but given in the textbook).
The textbook describes an algorithm for
converting any given regular expression to an
equivalent NFA, and an algorithm for converting
any given NFA to an equivalent regular
expression.
26 The limitations of the power of DFAs
1.4.a
The computing power of finite automata is severely limited by the fact that their memory (the set of states) is small (of a fixed size), while inputs can be arbitrarily large.
While the memories of real computers are also finite, they are not fixed, in the sense that we assume one can always supply additional memory if needed.
To summarize, DFAs are not as powerful as computers can generally be.
The next slide gives several examples of non-regular languages, i.e. languages that no DFA can recognize. The non-regularity of those languages can be rigorously proven using a tool called the pumping lemma. We omit the pumping lemma in this course (but it is in the textbook). Instead, we will simply rely on intuitive arguments.
Warning: Generally, one cannot safely rely on intuition when drawing important conclusions, because intuition can sometimes be deceptive. Only rigorous mathematical proofs can be trusted.
27 Non-regular languages
1.4.b
Do the following languages look regular to you?
A = {ww | w ∈ {0,1}*}
Is not regular.
Intuitively, this is so because a DFA processing a long input will have forgotten much of the previously seen part of the input when it gets to the middle of the string. But without fully remembering the first half of the string, it is impossible to tell whether the second half coincides with it or not.
B = {0ⁿ1ⁿ | n ≥ 0}
Is not regular.
Intuitively, this is so because a DFA processing a long input 0ⁿ1ⁿ will be unable to remember exactly how many 0s it has seen by the time the 1s start. But without that information it is impossible to tell whether the remaining 1-part of the input has the same length as the already seen 0-part.
C = {w | w contains the same number of 0s as 1s}
Is not regular.
An intuitive reason here is similar to the one for language B.
D = {w | w contains the same number of 01s as 10s}
Is regular.
Intuitively, it may appear to you that if C is not regular, then D should be even less so. But you've been warned about the deceptiveness of intuition. The following slide shows a DFA that recognizes D, so D is regular!
28 A DFA recognizing D
1.4.c
D = {w | w contains the same number of 01s as 10s}
[Figure: state diagram of a DFA recognizing D, with transitions labeled 0 and 1]
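A five-state DFA for D can be sketched and checked by brute force in Python. The machine below is a reconstruction written for this purpose, not necessarily the one drawn on the slide, and its state names are invented here. It relies on the observation that in a bit string the occurrences of 01 and 10 alternate, so the two counts are equal exactly when the string is empty or its first and last symbols coincide. Each state therefore remembers only the first symbol and the most recent one.

```python
from itertools import product


def count_sub(w: str, pat: str) -> int:
    """Number of positions at which the two-symbol pattern `pat` occurs in w."""
    return sum(1 for i in range(len(w) - 1) if w[i:i + 2] == pat)


def in_D_by_counting(w: str) -> bool:
    """Membership in D straight from the definition."""
    return count_sub(w, "01") == count_sub(w, "10")


# State "xy" means: the first symbol read was x and the latest symbol read is y.
D_DELTA = {
    ("start", "0"): "00", ("start", "1"): "11",
    ("00", "0"): "00", ("00", "1"): "01",
    ("01", "0"): "00", ("01", "1"): "01",
    ("11", "1"): "11", ("11", "0"): "10",
    ("10", "1"): "11", ("10", "0"): "10",
}
D_ACCEPT = {"start", "00", "11"}   # empty input, or first symbol == last symbol


def in_D_by_dfa(w: str) -> bool:
    """Run the five-state DFA on w."""
    state = "start"
    for symbol in w:
        state = D_DELTA[(state, symbol)]
    return state in D_ACCEPT


# Compare the DFA with the counting definition on every string of length <= 10.
all_words = ("".join(t) for n in range(11) for t in product("01", repeat=n))
print(all(in_D_by_dfa(w) == in_D_by_counting(w) for w in all_words))
```

The contrast with language C is that a DFA for D never needs to count: two symbols' worth of memory is enough, which is why a fixed number of states suffices.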