Title: Operations on Languages
1Operations on Languages
- Let L, L1, L2 be subsets of Σ*
- Concatenation: L1L2 = {xy | x is in L1 and y is in L2}
- Concatenating a language with itself: L^0 = {ε}
- L^i = LL^{i-1}, for all i ≥ 1
- Kleene Closure: L* = ∪_{i≥0} L^i = L^0 ∪ L^1 ∪ L^2 ∪ …
- Positive Closure: L+ = ∪_{i≥1} L^i = L^1 ∪ L^2 ∪ …
- Question: Does L+ contain ε?
2Kleene closure
Say, L^1 = L = {a, abc, ba}, on Σ = {a, b, c}. Then,
L^2 = {aa, aabc, aba, abca, abcabc, abcba, baa, baabc, baba},
L^3 = {a, abc, ba}L^2, and
L* = {ε} ∪ L^1 ∪ L^2 ∪ L^3 ∪ …
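The operations above can be tried out on finite languages. The following is a minimal sketch, assuming languages are modelled as Python sets of strings with the empty string standing for ε. The helper names (`concat`, `power`, `kleene_up_to`, `positive_up_to`) are illustrative and not part of the slides. Since L* is infinite, only the union up to a chosen power is computed.

```python
from itertools import product

def concat(l1, l2):
    """Concatenation L1L2 = {xy | x in L1 and y in L2}."""
    return {x + y for x, y in product(l1, l2)}

def power(lang, i):
    """L^i, where L^0 = {ε}; the empty Python string stands for ε."""
    result = {""}
    for _ in range(i):
        result = concat(lang, result)
    return result

def kleene_up_to(lang, max_power):
    """Finite part of L*: L^0 ∪ L^1 ∪ ... ∪ L^max_power."""
    out = set()
    for i in range(max_power + 1):
        out |= power(lang, i)
    return out

def positive_up_to(lang, max_power):
    """Finite part of L+: L^1 ∪ ... ∪ L^max_power."""
    out = set()
    for i in range(1, max_power + 1):
        out |= power(lang, i)
    return out

L = {"a", "abc", "ba"}
L2 = power(L, 2)
print(sorted(L2))
```

In this model the positive closure contains ε only when L itself does, which can be checked by calling `positive_up_to` on a language with and without the empty string.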
3Regular Expressions
- Highlights:
- A regular expression is used to specify a language, and it does so precisely.
- Regular expressions are very intuitive.
- Regular expressions are very useful in a variety of contexts.
- Given a regular expression, an NFA-ε can be constructed from it automatically.
- Thus, so can an NFA, a DFA, and a corresponding program, all automatically!
4Definition of a Regular Expression
- Let Σ be an alphabet. The regular expressions over Σ are:
- Ø Represents the empty set { }
- ε Represents the set {ε}
- a Represents the set {a}, for any symbol a in Σ
- Let r and s be regular expressions that represent the sets R and S, respectively.
- r + s Represents the set R ∪ S (precedence 3)
- rs Represents the set RS (precedence 2)
- r* Represents the set R* (highest precedence)
- (r) Represents the set R (not an op, provides precedence)
- If r is a regular expression, then L(r) is used to denote the corresponding language.
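The inductive definition above maps directly onto a small syntax tree. The sketch below is one possible encoding, not the slides' own: the class names (`Empty`, `Eps`, `Sym`, `Union`, `Cat`, `Star`) and the function `lang` are assumptions made for illustration. Because L(r) may be infinite, `lang(r, n)` returns only the strings of L(r) of length at most n, and symbols are assumed to be single characters.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Empty:      # Ø
    pass

@dataclass(frozen=True)
class Eps:        # ε
    pass

@dataclass(frozen=True)
class Sym:        # a single symbol a
    a: str

@dataclass(frozen=True)
class Union:      # r + s
    r: object
    s: object

@dataclass(frozen=True)
class Cat:        # rs
    r: object
    s: object

@dataclass(frozen=True)
class Star:       # r*
    r: object

def lang(r, n):
    """Return the strings of L(r) whose length is at most n."""
    if isinstance(r, Empty):
        return set()
    if isinstance(r, Eps):
        return {""}
    if isinstance(r, Sym):
        return {r.a} if n >= 1 else set()
    if isinstance(r, Union):
        return lang(r.r, n) | lang(r.s, n)
    if isinstance(r, Cat):
        return {x + y for x in lang(r.r, n) for y in lang(r.s, n)
                if len(x + y) <= n}
    if isinstance(r, Star):
        base = lang(r.r, n) - {""}      # ε adds nothing under *
        result, frontier = {""}, {""}
        while frontier:                  # extend by one factor at a time
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= n} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {r!r}")

# 0(0 + 1)*, restricted to strings of length at most 2
r = Cat(Sym("0"), Star(Union(Sym("0"), Sym("1"))))
print(sorted(lang(r, 2)))
```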
5- Examples: Let Σ = {0, 1}
- (0 + 1)* All strings of 0s and 1s
- 0(0 + 1)* All strings of 0s and 1s, beginning with a 0
- (0 + 1)*1 All strings of 0s and 1s, ending with a 1
- (0 + 1)*0(0 + 1)* All strings of 0s and 1s containing at least one 0
- (0 + 1)*0(0 + 1)*0(0 + 1)* All strings of 0s and 1s containing at least two 0s
- (0 + 1)*01*01* All strings of 0s and 1s containing at least two 0s
- (1 + 01*0)* All strings of 0s and 1s containing an even number of 0s
- 1*(01*01*)* All strings of 0s and 1s containing an even number of 0s
- (1*01*0)*1* All strings of 0s and 1s containing an even number of 0s
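Claims such as "these three expressions all describe an even number of 0s" can be spot-checked by brute force. The sketch below uses Python's `re` module, which writes union as `|` instead of `+`. The helper names `all_strings` and `agrees` are illustrative assumptions. Agreement on all strings up to a fixed length is evidence, not a proof.

```python
import re
from itertools import product

def all_strings(alphabet, max_len):
    """Yield every string over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for tup in product(alphabet, repeat=n):
            yield "".join(tup)

def agrees(pattern, predicate, max_len=8):
    """True if the pattern matches exactly the strings satisfying predicate."""
    rx = re.compile(pattern)
    return all((rx.fullmatch(w) is not None) == predicate(w)
               for w in all_strings("01", max_len))

# The slide expressions, with + rewritten as | for Python's re syntax
even_zeros = [r"(1|01*0)*", r"1*(01*01*)*", r"(1*01*0)*1*"]
two_zeros = [r"(0|1)*0(0|1)*0(0|1)*", r"(0|1)*01*01*"]

even_ok = all(agrees(p, lambda w: w.count("0") % 2 == 0) for p in even_zeros)
two_ok = all(agrees(p, lambda w: w.count("0") >= 2) for p in two_zeros)
print(even_ok, two_ok)
```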
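6- Identities
- Øu = uØ = Ø (multiply by 0)
- εu = uε = u (multiply by 1)
- Ø* = ε (recall L* = ∪_{i≥0} L^i = L^0 ∪ L^1 ∪ L^2 ∪ …, and L^0 = {ε})
- ε* = ε
- u + v = v + u
- u + Ø = u
- u + u = u
- u* = (u*)*
- u(v + w) = uv + uw
- (u + v)w = uw + vw
- (uv)*u = u(vu)*
- (u + v)* = (u* + v)*
- = u*(u + v)*
- = (u + vu*)*
- = (u*v*)*
- = u*(vu*)*
- = (u*v)*u*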
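Identities of this kind can be sanity-checked on concrete languages. The sketch below assumes u and v denote the small finite languages `U` and `V` (chosen arbitrarily for illustration) and compares both sides of each identity on all strings of length at most `N`. The helpers `cat` and `star` are hypothetical names; they compute concatenation and Kleene closure truncated to that length.

```python
def cat(A, B, n):
    """Concatenation AB, keeping only strings of length at most n."""
    return {x + y for x in A for y in B if len(x + y) <= n}

def star(A, n):
    """Kleene closure A*, keeping only strings of length at most n."""
    nonempty = A - {""}
    result, frontier = {""}, {""}
    while frontier:
        frontier = cat(frontier, nonempty, n) - result
        result |= frontier
    return result

N = 6
U = {"a", "ab"}   # language denoted by u (illustrative choice)
V = {"b"}         # language denoted by v (illustrative choice)
Us, Vs = star(U, N), star(V, N)

variants = [
    star(U | V, N),                       # (u + v)*
    star(Us | V, N),                      # (u* + v)*
    cat(Us, star(U | V, N), N),           # u*(u + v)*
    star(U | cat(V, Us, N), N),           # (u + vu*)*
    star(cat(Us, Vs, N), N),              # (u*v*)*
    cat(Us, star(cat(V, Us, N), N), N),   # u*(vu*)*
    cat(star(cat(Us, V, N), N), Us, N),   # (u*v)*u*
]
all_equal = all(v == variants[0] for v in variants)

# (uv)*u = u(vu)*
shift_ok = cat(star(cat(U, V, N), N), U, N) == cat(U, star(cat(V, U, N), N), N)
print(all_equal, shift_ok)
```

Truncating at each step is safe here because every string of length at most N in a concatenation or closure is built from factors that are themselves no longer than N.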
7Equivalence of Regular Expressions and NFA-εs
- Note:
- Throughout the following, keep in mind that a string is accepted by an NFA-ε if there exists a path from the start state to a final state.
- Lemma 1: Let r be a regular expression. Then there exists an NFA-ε M such that L(M) = L(r). Furthermore, M has exactly one final state with no transitions out of it.
- Proof: (by induction on the number of operators, denoted by OP(r), in r).
8- Basis: OP(r) = 0
- Then r is either Ø, ε, or a, for some symbol a in Σ
- For Ø: [figure not recovered]
- For ε: [figure not recovered]
- For a: [figure not recovered]
9- Inductive Hypothesis: Suppose there exists a k ≥ 0 such that for any regular expression r where 0 ≤ OP(r) ≤ k, there exists an NFA-ε M such that L(M) = L(r). Furthermore, suppose that M has exactly one final state with no transitions out of it.
- Inductive Step: Let r be a regular expression with k + 1 operators (OP(r) = k + 1), where k + 1 ≥ 1.
- Case 1) r = r1 + r2
- Since OP(r) = k + 1, it follows that 0 ≤ OP(r1), OP(r2) ≤ k. By the inductive hypothesis there exist NFA-ε machines M1 and M2 such that L(M1) = L(r1) and L(M2) = L(r2). Furthermore, both M1 and M2 have exactly one final state.
- Construct M as:
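[Figure: a new start state q0 with ε-transitions to q1 and q2, the start states of M1 and M2, and ε-transitions from f1 and f2, the final states of M1 and M2, to a new final state.]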
10- Case 2) r = r1r2
- Since OP(r) = k + 1, it follows that 0 ≤ OP(r1), OP(r2) ≤ k. By the inductive hypothesis there exist NFA-ε machines M1 and M2 such that L(M1) = L(r1) and L(M2) = L(r2). Furthermore, both M1 and M2 have exactly one final state.
- Construct M as: [figure not recovered]
- Case 3) r = r1*
- Since OP(r) = k + 1, it follows that 0 ≤ OP(r1) ≤ k. By the inductive hypothesis there exists an NFA-ε machine M1 such that L(M1) = L(r1). Furthermore, M1 has exactly one final state.
- Construct M as:
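[Figure: a new start state q0 and a new final state, with ε-transitions from q0 to q1 (the start state of M1), from f1 (the final state of M1) to the new final state, from q0 directly to the new final state, and from f1 back to q1.]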
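The three inductive cases translate almost line for line into a recursive builder. The sketch below is one way to do it, under stated assumptions: regular expressions are nested tuples such as `("union", r1, r2)`, states are integers, and `None` as a symbol stands for ε. The concatenation case links f1 to q2 with an ε-transition, which is the usual textbook choice and is assumed here because the slide's figure for that case did not survive. The names `NFA`, `build`, `eps_closure`, and `accepts` are illustrative.

```python
from collections import defaultdict
from itertools import count

class NFA:
    def __init__(self):
        # (state, symbol) -> set of states; symbol None means ε
        self.delta = defaultdict(set)
        self._ids = count()

    def new_state(self):
        return next(self._ids)

    def add(self, p, a, q):
        self.delta[(p, a)].add(q)

def build(r, nfa):
    """Return (start, final) of a machine for r with a single final state."""
    kind = r[0]
    if kind == "cat":                       # Case 2: r = r1r2
        q1, f1 = build(r[1], nfa)
        q2, f2 = build(r[2], nfa)
        nfa.add(f1, None, q2)
        return q1, f2
    q0, f0 = nfa.new_state(), nfa.new_state()
    if kind == "eps":
        nfa.add(q0, None, f0)
    elif kind == "sym":
        nfa.add(q0, r[1], f0)
    elif kind == "union":                   # Case 1: r = r1 + r2
        q1, f1 = build(r[1], nfa)
        q2, f2 = build(r[2], nfa)
        for q in (q1, q2):
            nfa.add(q0, None, q)
        for f in (f1, f2):
            nfa.add(f, None, f0)
    elif kind == "star":                    # Case 3: r = r1*
        q1, f1 = build(r[1], nfa)
        nfa.add(q0, None, q1)
        nfa.add(f1, None, f0)
        nfa.add(q0, None, f0)
        nfa.add(f1, None, q1)
    elif kind != "empty":                   # Ø: no transitions at all
        raise ValueError(kind)
    return q0, f0

def eps_closure(nfa, states):
    stack, seen = list(states), set(states)
    while stack:
        p = stack.pop()
        for q in nfa.delta.get((p, None), ()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen

def accepts(nfa, start, final, w):
    current = eps_closure(nfa, {start})
    for a in w:
        moved = set()
        for p in current:
            moved |= nfa.delta.get((p, a), set())
        current = eps_closure(nfa, moved)
    return final in current

# r = 0(0 + 1)*
r = ("cat", ("sym", "0"), ("star", ("union", ("sym", "0"), ("sym", "1"))))
nfa = NFA()
start, final = build(r, nfa)
print(accepts(nfa, start, final, "0110"), accepts(nfa, start, final, "10"))
```

Each case returns a machine whose final state has no outgoing transitions, which is the extra property Lemma 1 carries through the induction.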
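11- Example
- r = 0(0 + 1)*
- r = r1r2
- r1 = 0
- r2 = (0 + 1)*
- r2 = r3*
- r3 = 0 + 1
- r3 = r4 + r5
- r4 = 0
- r5 = 1
[Figure: the NFA-ε for r, built up step by step starting from the machine for r5 = 1.]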
17Definitions Required to Convert a DFA to a Regular Expression
- Let M = (Q, Σ, δ, q1, F) be a DFA with state set Q = {q1, q2, …, qn}, and define:
- R_{i,j} = { x | x is in Σ* and δ(qi, x) = qj }
- R_{i,j} is the set of all strings that define a path in M from qi to qj.
- Note that states have been numbered starting at 1!
18- Example
- R_{2,3} = {0, 001, 00101, 011, …}
- R_{1,4} = {01, 00101, …}
- R_{3,3} = {11, 100, …}
[Figure: the DFA for this example, not recovered.]
19- Another definition:
- R^k_{i,j} = { x | x is in Σ* and δ(qi, x) = qj, and for no u where 1 ≤ |u| < |x| and x = uv is it the case that δ(qi, u) = qp where p > k }
- In words: R^k_{i,j} is the set of all strings that define a path in M from qi to qj but that pass through no state numbered greater than k. Note that it may be that i > k or j > k.
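Membership in R^k_{i,j} can be tested by simulating the DFA and watching the intermediate states. The sketch below assumes a small 3-state DFA chosen purely for illustration (the `delta` table is not taken from the slides' figure), with states numbered from 1 and the hypothetical helper name `in_R`. Only states reached by proper, nonempty prefixes count as "passed through", so the endpoints qi and qj themselves may be numbered above k.

```python
def in_R(delta, i, j, k, x):
    """True if x is in R^k_{i,j}: x leads from qi to qj and no state
    entered strictly between the first and last step is numbered above k."""
    state = i
    for pos, a in enumerate(x):
        state = delta[(state, a)]
        is_last = pos == len(x) - 1
        if not is_last and state > k:
            return False
    return state == j

# Illustrative DFA (an assumption): states 1..3 over the alphabet {0, 1}
delta = {
    (1, "0"): 2, (1, "1"): 3,
    (2, "0"): 1, (2, "1"): 3,
    (3, "0"): 2, (3, "1"): 2,
}

print(in_R(delta, 1, 3, 2, "001"))  # path q1 q2 q1 q3 stays within states <= 2
print(in_R(delta, 1, 3, 1, "01"))   # path q1 q2 q3 passes through q2 > 1
```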
20- Example
- R^4_{2,3} = {0, 1000, 011, …}, R^1_{2,3} = {0}
- 111 is not in R^4_{2,3}; 111 is not in R^1_{2,3}
- 101 is not in R^1_{2,3}
- R^5_{2,3} = R_{2,3}
21- Observations
- 1) R^n_{i,j} = R_{i,j}
- 2) R^{k-1}_{i,j} is a subset of R^k_{i,j}
- 3) L(M) = ∪_{q in F} R^n_{1,q} = ∪_{q in F} R_{1,q}
- 4) R^0_{i,j}: easily computed from the DFA!
- 5) R^k_{i,j} = R^{k-1}_{i,k} (R^{k-1}_{k,k})* R^{k-1}_{k,j} ∪ R^{k-1}_{i,j}
22- Notes on 5:
- 5) R^k_{i,j} = R^{k-1}_{i,k} (R^{k-1}_{k,k})* R^{k-1}_{k,j} ∪ R^{k-1}_{i,j}
- Consider paths represented by the strings in R^k_{i,j}:
- If x is a string in R^k_{i,j} then no state numbered > k is passed through when processing x, and either:
- qk is not passed through, i.e., x is in R^{k-1}_{i,j}
- qk is passed through one or more times, i.e., x is in R^{k-1}_{i,k} (R^{k-1}_{k,k})* R^{k-1}_{k,j}
23- Lemma 2: Let M = (Q, Σ, δ, q1, F) be a DFA. Then there exists a regular expression r such that L(M) = L(r).
- Proof:
- First we will show (by induction on k) that for all i, j, and k, where 1 ≤ i,j ≤ n and 0 ≤ k ≤ n, there exists a regular expression r such that L(r) = R^k_{i,j}.
- Basis: k = 0
- R^0_{i,j} contains single symbols, one for each transition from qi to qj, and possibly ε if i = j.
- case 1) No transitions from qi to qj and i ≠ j
- r^0_{i,j} = Ø
- case 2) At least one (m ≥ 1) transition from qi to qj and i ≠ j
- r^0_{i,j} = a1 + a2 + a3 + … + am where δ(qi, ap) = qj, for all 1 ≤ p ≤ m
24- case 3) No transitions from qi to qj and i = j
- r^0_{i,j} = ε
- case 4) At least one (m ≥ 1) transition from qi to qj and i = j
- r^0_{i,j} = a1 + a2 + a3 + … + am + ε where δ(qi, ap) = qj, for all 1 ≤ p ≤ m
- Inductive Hypothesis:
- Suppose that R^{k-1}_{i,j} can be represented by the regular expression r^{k-1}_{i,j} for all 1 ≤ i,j ≤ n, and some k ≥ 1.
- Inductive Step:
- Consider R^k_{i,j} = R^{k-1}_{i,k} (R^{k-1}_{k,k})* R^{k-1}_{k,j} ∪ R^{k-1}_{i,j}. By the inductive hypothesis there exist regular expressions r^{k-1}_{i,k}, r^{k-1}_{k,k}, r^{k-1}_{k,j}, and r^{k-1}_{i,j} generating R^{k-1}_{i,k}, R^{k-1}_{k,k}, R^{k-1}_{k,j}, and R^{k-1}_{i,j}, respectively. Thus, if we let
- r^k_{i,j} = r^{k-1}_{i,k} (r^{k-1}_{k,k})* r^{k-1}_{k,j} + r^{k-1}_{i,j}
- then r^k_{i,j} is a regular expression generating R^k_{i,j}, i.e., L(r^k_{i,j}) = R^k_{i,j}.
25- Finally, if F = {q_{j1}, q_{j2}, …, q_{jr}}, then
- r^n_{1,j1} + r^n_{1,j2} + … + r^n_{1,jr}
- is a regular expression generating L(M).
- Note: not only does this prove that every language accepted by a DFA is generated by some regular expression, but it also provides an algorithm for computing that expression!
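The proof's recurrence can be run directly as a dynamic program over k. The sketch below is a minimal version under several assumptions: expressions are built as strings in Python `re` syntax (so union is written `|`), `None` stands for Ø and the empty string for ε, and only a few trivial simplifications are applied, so the output is correct but far from minimal. The DFA in `delta` is an illustrative 3-state machine, and the helper names are hypothetical.

```python
import re
from itertools import product

def union(r, s):
    if r is None:
        return s
    if s is None:
        return r
    if r == s:
        return r
    if r == "":
        return f"(?:{s})?"        # ε + s
    if s == "":
        return f"(?:{r})?"        # r + ε
    return f"(?:{r}|{s})"

def cat(r, s):
    if r is None or s is None:    # Øu = uØ = Ø
        return None
    return r + s                  # εu = uε = u falls out of "" + s

def star(r):
    if r is None or r == "":      # Ø* = ε* = ε
        return ""
    return f"(?:{r})*"

def dfa_to_regex(n, delta, alphabet, finals):
    """States are 1..n with start state 1; returns None for the empty language."""
    # Basis: r^0_{i,j}
    r = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            expr = "" if i == j else None
            for a in alphabet:
                if delta[(i, a)] == j:
                    expr = union(expr, re.escape(a))
            r[i][j] = expr
    # Inductive step: r^k_{i,j} from the r^{k-1} table
    for k in range(1, n + 1):
        prev = r
        r = [[None] * (n + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                through_k = cat(cat(prev[i][k], star(prev[k][k])), prev[k][j])
                r[i][j] = union(through_k, prev[i][j])
    result = None
    for j in finals:
        result = union(result, r[1][j])
    return result

def dfa_accepts(delta, finals, w):
    q = 1
    for a in w:
        q = delta[(q, a)]
    return q in finals

def check(regex, delta, finals, alphabet, max_len):
    """Compare the regex with the DFA on every string up to max_len."""
    rx = re.compile(regex)
    for n in range(max_len + 1):
        for tup in product(alphabet, repeat=n):
            w = "".join(tup)
            if (rx.fullmatch(w) is not None) != dfa_accepts(delta, finals, w):
                return False
    return True

# Illustrative DFA (an assumption): 3 states, final states q2 and q3
delta = {(1, "0"): 2, (1, "1"): 3, (2, "0"): 1,
         (2, "1"): 3, (3, "0"): 2, (3, "1"): 2}
finals = {2, 3}
regex = dfa_to_regex(3, delta, "01", finals)
print(check(regex, delta, finals, "01", 6))
```

Keeping only the previous table at each step mirrors the proof: column k depends on column k - 1 and nothing earlier.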
26- Example
- First table column is computed from the DFA.
entry     | k = 0 | k = 1 | k = 2
r^k_{1,1} | ε     |       |
r^k_{1,2} | 0     |       |
r^k_{1,3} | 1     |       |
r^k_{2,1} | 0     |       |
r^k_{2,2} | ε     |       |
r^k_{2,3} | 1     |       |
r^k_{3,1} | Ø     |       |
r^k_{3,2} | 0 + 1 |       |
r^k_{3,3} | ε     |       |
27- All remaining columns are computed from the previous column using the formula.
- r^1_{2,3} = r^0_{2,1} (r^0_{1,1})* r^0_{1,3} + r^0_{2,3}
- = 0 (ε)* 1 + 1
- = 01 + 1
entry     | k = 0 | k = 1  | k = 2
r^k_{1,1} | ε     | ε      |
r^k_{1,2} | 0     | 0      |
r^k_{1,3} | 1     | 1      |
r^k_{2,1} | 0     | 0      |
r^k_{2,2} | ε     | ε + 00 |
r^k_{2,3} | 1     | 1 + 01 |
r^k_{3,1} | Ø     | Ø      |
r^k_{3,2} | 0 + 1 | 0 + 1  |
r^k_{3,3} | ε     | ε      |
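28- r^2_{1,3} = r^1_{1,2} (r^1_{2,2})* r^1_{2,3} + r^1_{1,3}
- = 0 (ε + 00)* (1 + 01) + 1
- = 0*1
entry     | k = 0 | k = 1  | k = 2
r^k_{1,1} | ε     | ε      | (00)*
r^k_{1,2} | 0     | 0      | 0(00)*
r^k_{1,3} | 1     | 1      | 0*1
r^k_{2,1} | 0     | 0      | 0(00)*
r^k_{2,2} | ε     | ε + 00 | (00)*
r^k_{2,3} | 1     | 1 + 01 | 0*1
r^k_{3,1} | Ø     | Ø      | (0 + 1)(00)*0
r^k_{3,2} | 0 + 1 | 0 + 1  | (0 + 1)(00)*
r^k_{3,3} | ε     | ε      | ε + (0 + 1)0*1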
29- To complete the regular expression, we compute:
- r^3_{1,2} + r^3_{1,3}
30- Theorem: Let L be a language. Then there exists a regular expression r such that L = L(r) if and only if there exists a DFA M such that L = L(M).
- Proof:
- (if) Suppose there exists a DFA M such that L = L(M). Then by Lemma 2 there exists a regular expression r such that L = L(r).
- (only if) Suppose there exists a regular expression r such that L = L(r). Then by Lemma 1 there exists an NFA-ε M' such that L = L(M'), and since every NFA-ε can be converted to an equivalent DFA, there exists a DFA M such that L = L(M).
- Corollary: The regular expressions define the regular languages.
- Note: The conversion from a regular expression to a DFA and a program accepting L(r) is now complete, and fully automated!