Title: 91.304 Foundations of Theoretical Computer Science
191.304 Foundations of (Theoretical) Computer
Science
- David Martin
- dm_at_cs.uml.edu
This work is licensed under the Creative Commons
Attribution-ShareAlike License. To view a copy of
this license, visit http://creativecommons.org/licenses/by-sa/2.0/
or send a letter to Creative
Commons, 559 Nathan Abbott Way, Stanford,
California 94305, USA.
2Note on these slides
- The numbering only partially corresponds to textbook chapters
- 1-c.ppt is just the third part of slides having to do with chapter 1
3Closure properties
- The presence or absence of closure properties says something about how robust a set is with respect to an operation
- Definition. Let S ⊆ U be a set in some universe U and let ∘ be an operation on elements of U. We say that S is closed under ∘ if applying ∘ to element(s) of S produces another element of S.
- For example, if ∘ is a binary operation U×U→U, then we're saying that (∀ x∈S and y∈S) x ∘ y ∈ S
4Closure properties illustrated
[Figure: a set S drawn inside the universe U]
Applying the operation to elements of S never takes you outside of S, so S is closed with respect to the operation. This example shows unary operations.
5Closure properties
- Having a closure property usually means there is some type of "natural fit" between the operation and the set
- Examples
- N is closed under + and × but not − and ÷
- Z is closed under + and − and × and unary − (negation) but not ÷
- Q − {0} is closed under × and ÷ but not + or −
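A closure claim can be refuted by a single counterexample, so a bounded search is a quick sanity check. The sketch below is illustrative only: the helper name `find_counterexample`, the finite samples, and the choice to treat an undefined result (division by zero) as leaving the set are all assumptions, and finding no counterexample in a sample is evidence, not a proof.

```python
import operator
from fractions import Fraction
from itertools import product

def find_counterexample(sample, in_set, op):
    """Return a pair (x, y) from `sample` whose result falls outside the set, else None."""
    for x, y in product(sample, repeat=2):
        try:
            result = op(x, y)
        except ZeroDivisionError:
            return (x, y)          # undefined result: treated here as leaving the set
        if not in_set(result):
            return (x, y)
    return None

def divide(x, y):
    return Fraction(x) / Fraction(y)   # exact rational division

def is_natural(v):
    return v == int(v) and v >= 0

def is_nonzero(v):
    return v != 0

# Hypothetical finite samples standing in for N and Q - {0}.
naturals = list(range(10))
nonzero_rationals = [Fraction(n, d) for n in range(-3, 4) if n != 0 for d in range(1, 4)]

print(find_counterexample(naturals, is_natural, operator.add))           # no counterexample in the sample
print(find_counterexample(naturals, is_natural, operator.sub))           # a pair such as 0 - 1
print(find_counterexample(nonzero_rationals, is_nonzero, operator.mul))  # no counterexample in the sample
print(find_counterexample(nonzero_rationals, is_nonzero, operator.add))  # a pair summing to 0
```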
6More examples
- L1 = { x ∈ {0,1}* | |x| is a multiple of 3 }
- is closed under string reversal and concatenation
- L3 = { x ∈ {0,1}* | the binary number x is a multiple of 3 }
- is not closed under string reversal or concatenation (oops, will take that under advisement)
- L4 = { x ∈ {a,b}* | x contains an odd # of b's and an even # of a's }
- is closed under string reversal
- is not closed under string concatenation
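Closure claims about small languages like these can be probed by brute force over all short strings. The sketch below assumes the membership tests `in_L3` and `in_L4` as written (with the empty string counted as the number 0), and the helper names are hypothetical. A bounded search that finds nothing is only evidence of closure, while any pair it returns is a genuine counterexample.

```python
from itertools import product

def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

def in_L3(x):
    # Binary value of x is a multiple of 3; the empty string counts as 0 (an assumption).
    return int(x or "0", 2) % 3 == 0

def in_L4(x):
    # Odd number of b's and even number of a's.
    return x.count("b") % 2 == 1 and x.count("a") % 2 == 0

def reversal_counterexample(member, alphabet, max_len):
    """First member whose reversal is not a member, or None."""
    for x in strings_up_to(alphabet, max_len):
        if member(x) and not member(x[::-1]):
            return x
    return None

def concat_counterexample(member, alphabet, max_len):
    """First pair of members whose concatenation is not a member, or None."""
    members = [x for x in strings_up_to(alphabet, max_len) if member(x)]
    for x in members:
        for y in members:
            if not member(x + y):
                return (x, y)
    return None

print(reversal_counterexample(in_L3, "01", 8))
print(concat_counterexample(in_L3, "01", 6))
print(reversal_counterexample(in_L4, "ab", 8))
print(concat_counterexample(in_L4, "ab", 4))
```

For L4 the search returns the pair ("b", "b"): each string has an odd number of b's, but "bb" has an even number. For L3 the bounded search finds no counterexample for either operation.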
7Closure: higher abstraction
- We will usually be concerned with closure of language classes under language operations
- Previous examples were closure of sets containing non-set elements under various familiar operations
- We consider DFAs and NFAs to be programs and we want assurance that their outputs can be combined in desired ways just by manipulating their programs (like using one as a subroutine for the other)
- Representative question: is REG closed under (language) concatenation?
8The regular operations
- The regular operations on languages are
- ∪ (union)
- · (concatenation)
- * (Kleene star)
- The name "regular operations" is not that important
- Too bad we use the word "regular" for so much
- REG is closed under these regular operations
- That's why they're called "regular" operations
- This does not mean that each regular language is
closed under each of these operations!
9The regular operations
- REG is closed under union: Theorem 1.12 (using DFAs), Theorem 1.22 (using NFAs)
- REG is closed under concatenation: Theorem 1.23 (NFAs)
- REG is closed under *: Theorem 1.24 (NFAs)
- REG is also closed under complement and reversal (not in book)
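The DFA-based closure argument for union is usually the product construction: run both machines in lockstep and accept if either one accepts. The sketch below assumes that this is the construction behind the DFA version of the theorem. The tuple representation of a DFA and the two example machines are hypothetical choices made for illustration.

```python
from itertools import product

def union_dfa(dfa1, dfa2):
    """Product construction: a DFA accepting L(dfa1) ∪ L(dfa2).

    Each DFA is a tuple (states, alphabet, delta, start, accepting),
    where delta is a dict mapping (state, symbol) -> state.
    Both machines are assumed to share the same alphabet.
    """
    states1, alphabet, delta1, start1, accept1 = dfa1
    states2, _, delta2, start2, accept2 = dfa2
    states = set(product(states1, states2))
    delta = {((p, q), a): (delta1[p, a], delta2[q, a])
             for (p, q) in states for a in alphabet}
    accepting = {(p, q) for (p, q) in states if p in accept1 or q in accept2}
    return states, alphabet, delta, (start1, start2), accepting

def accepts(dfa, w):
    """Run the DFA on the string w."""
    _, _, delta, state, accepting = dfa
    for a in w:
        state = delta[state, a]
    return state in accepting

# Hypothetical example machines over {0,1}.
even_zeros = ({0, 1}, "01",
              {(0, "0"): 1, (0, "1"): 0, (1, "0"): 0, (1, "1"): 1}, 0, {0})
ends_in_1 = ({"n", "y"}, "01",
             {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"}, "n", {"y"})

either = union_dfa(even_zeros, ends_in_1)
print(accepts(either, "01"), accepts(either, "0"))
```

Changing the acceptance condition from `or` to `and` gives intersection with the same state set, which is one reason the DFA version is worth knowing even though the NFA proof is shorter.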
10Regular expressions
- You are probably familiar with these
- Example: "int .*\(.*\)" is a (flex format) regular expression that appears to match C function prototypes that return ints
- In our treatment, a regular expression is a program that generates a language of matching strings when you "run it"
- We will use a very compact definition that simplifies things later
11Regular expressions
- Definition. Let Σ be an alphabet not containing any of the special characters in this list: ε ∅ ) ( ∪ · *. We define the syntax of the (programming) language REX(Σ), abbreviated as REX, inductively
- Base cases
- For all a∈Σ, a∈REX. In other words, each single character from Σ is a regular expression all by itself.
- ε∈REX. In other words, the literal symbol ε is a regular expression. In this context it is not the empty string but rather the single-character name for the empty string.
- ∅∈REX. Similarly, the literal symbol ∅ is a regular expression.
12Regular expressions
- Definition continued
- Induction cases
- For all r1, r2 ∈ REX, ( r1 ∪ r2 ) ∈ REX also
- For all r1, r2 ∈ REX, ( r1 · r2 ) ∈ REX also
- (Slide annotation: the parentheses and the operators ∪ and · are literal symbols; r1 and r2 are variables)
13Regular expressions
- Definition continued
- Induction cases continued
- For all r ∈ REX, ( r* ) ∈ REX also
- Examples over Σ = {0,1}
- ε and 0 and 1 and ∅
- (((1·0)·(ε∪∅))*)
- εε is not a regular expression
- Remember, in the context of regular expressions, ε and ∅ are ordinary characters
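Because the syntax is defined inductively, membership in REX(Σ) can be checked by a small recursive-descent parser with one branch per case of the definition. The sketch below is one way to do it, not the course's own code: the function name `parse_rex`, the tuple-based tree format, and the decision to ignore whitespace are assumptions.

```python
def parse_rex(s, alphabet):
    """Parse a fully parenthesized regular expression over `alphabet`.

    Returns a nested-tuple syntax tree, or raises ValueError if s is not in REX.
    """
    tokens = [c for c in s if not c.isspace()]
    pos = 0

    def parse():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        c = tokens[pos]
        pos += 1
        # Base cases: a single alphabet character, or the literal symbols ε and ∅.
        if c in alphabet:
            return ("sym", c)
        if c == "ε":
            return ("eps",)
        if c == "∅":
            return ("empty",)
        # Induction cases all start with a literal "(".
        if c != "(":
            raise ValueError(f"unexpected {c!r}")
        left = parse()
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        op = tokens[pos]
        pos += 1
        if op == "*":
            node = ("star", left)
        elif op == "∪":
            node = ("union", left, parse())
        elif op == "·":
            node = ("concat", left, parse())
        else:
            raise ValueError(f"expected an operator, got {op!r}")
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("expected ')'")
        pos += 1
        return node

    tree = parse()
    if pos != len(tokens):
        raise ValueError("trailing characters")
    return tree

print(parse_rex("(((1·0)·(ε∪∅))*)", "01"))
```

With this parser, `parse_rex("εε", "01")` raises `ValueError`, because nothing in the definition lets two expressions sit side by side without parentheses and an explicit operator.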
14Semantics of regular expressions
- Definition. We define the meaning of the language REX(Σ) inductively using the L() operator so that L(r) denotes the language generated by r as follows
- Base cases
- For all a∈Σ, L(a) = {a}. A single-character regular expression generates the corresponding single-character string.
- L(ε) = {ε}. The symbol for the empty string actually generates the empty string.
- L(∅) = ∅. The symbol for the empty language actually generates the empty language.
15Regular expressions
- Definition continued
- Induction cases
- For all r1, r2 ∈ REX, L( (r1 ∪ r2) ) = L(r1) ∪ L(r2)
- For all r1, r2 ∈ REX, L( (r1 · r2) ) = L(r1) · L(r2)
- For all r ∈ REX, L( (r*) ) = (L(r))*
- No other string is in REX(Σ)
- Example
- L( (((1·0)·(ε∪∅))*) ) includes
- ε, 10, 1010, 101010, ...
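The semantic definition translates almost line for line into a recursive function. Since L(r) may be infinite, the sketch below only computes the strings of length at most `max_len`. The nested-tuple tree format and the function name `language` are assumptions made for this illustration.

```python
def language(node, max_len):
    """Strings of length <= max_len in L(r), following the inductive definition."""
    kind = node[0]
    if kind == "sym":                      # L(a) = {a}
        return {node[1]} if max_len >= 1 else set()
    if kind == "eps":                      # L(ε) = {ε}
        return {""}
    if kind == "empty":                    # L(∅) = ∅
        return set()
    if kind == "union":                    # L(r1) ∪ L(r2)
        return language(node[1], max_len) | language(node[2], max_len)
    if kind == "concat":                   # L(r1) · L(r2)
        left, right = language(node[1], max_len), language(node[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if kind == "star":                     # (L(r))*
        base = language(node[1], max_len)
        result = {""}
        frontier = {""}
        while frontier:                    # keep appending until nothing new fits
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node {kind!r}")

# Tree for (((1·0)·(ε∪∅))*), written out by hand.
tree = ("star", ("concat", ("concat", ("sym", "1"), ("sym", "0")),
                 ("union", ("eps",), ("empty",))))
print(sorted(language(tree, 6), key=len))
```

The truncation is safe because every operation only makes strings longer: a string of length at most `max_len` in the result can only be built from pieces that are themselves within the bound.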
16Orientation
- We used highly flexible mathematical notation and state-transition diagrams to specify DFAs and NFAs
- Now we have a precise programming language REX that generates languages
- REX is designed to close the simplest languages under ∪, ·, *
17Abbreviations
- Instead of parentheses, we use precedence to indicate grouping when possible.
- * (highest)
- ·
- ∪ (lowest)
- Instead of ·, we just write elements next to each other
- Example: (((1·0)·(ε∪∅))*) can be written as (10(ε∪∅))* but there is no further abbreviation
- (Not in text) If r ∈ REX(Σ), instead of writing rr*, we write r+
18Abbreviations
- Instead of writing a union of all characters from Σ together to mean "any character", we just write Σ
- In a flex/grep regular expression this would be called "."
- Instead of writing L(r) when r is a regular expression, we consider r alone to simultaneously mean both the expression r and the language it generates, relying on context to disambiguate
19Abbreviations
- Caution: regular expressions are strings (programs). They are equal only when they contain exactly the same sequence of characters.
- (((1·0)·(ε∪∅))*) can be abbreviated (10(ε∪∅))*
- however (((1·0)·(ε∪∅))*) ≠ (10(ε∪∅))* as strings
- but (((1·0)·(ε∪∅))*) = (10(ε∪∅))* when they are considered to be the generated languages
- more accurately then, L( (((1·0)·(ε∪∅))*) ) = L( (10(ε∪∅))* ) = L( (10)* )
20Facts
- REX(Σ) is itself a language over the alphabet Σ ∪ { ), (, ∪, ·, *, ε, ∅ }
- For every Σ, |REX(Σ)| = ∞
- ∅, (∅*), ((∅*)*), ...
- even without knowing Σ there are infinitely many elements in REX(Σ)
- Question: Can we find a DFA or NFA M with L(M) = REX(Σ)?
21Examples
- Find a regular expression for { w ∈ {0,1}* | w ≠ 10 }
- Find a regular expression for { x ∈ {0,1}* | the 6th digit counting from the rightmost character of x is 1 }
- Find a regular expression for L3 = { x ∈ {0,1}* | the binary number x is a multiple of 3 }
22The DFA for L3
[Figure: state diagram of the DFA for L3, with states 0, 1, 2 and transitions labeled 0 and 1]
(0 1* 0)
Regular expression: (0 ∪ 1 _____________ 1)*
23Regular expression for L3
- (0 ∪ 1 (0 1* 0)* 1)*
- L3 is closed under concatenation, because of the overall form ( )*
- Now suppose x∈L3. Is x^R ∈ L3?
- Yes, so L3 is also closed under reversal
- Another way to see this is by reversing the regular expression and observing that the same regular expression results
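The regular expression for L3 can be cross-checked against both a DFA and plain arithmetic. The sketch below makes several assumptions: the expression is transcribed into Python `re` syntax (`|` for ∪), the DFA is taken to be the usual remainder-tracking machine with transition (2q + bit) mod 3, and the empty string is counted as the number 0. The function names are hypothetical.

```python
import re
from itertools import product

# Python spelling of (0 ∪ 1(01*0)*1)*
L3_REGEX = re.compile(r"(0|1(01*0)*1)*")

def dfa_accepts(x):
    # Assumed remainder DFA: state q is the value read so far, mod 3.
    q = 0
    for bit in x:
        q = (2 * q + int(bit)) % 3
    return q == 0

def is_multiple_of_3(x):
    return int(x or "0", 2) % 3 == 0      # empty string treated as 0

def first_disagreement(max_len):
    """Return the first string on which the three tests differ, or None."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            answers = {bool(L3_REGEX.fullmatch(x)), dfa_accepts(x), is_multiple_of_3(x)}
            if len(answers) != 1:
                return x
    return None

print(first_disagreement(10))
```

The same loop can be pointed at `x[::-1]` to spot-check the reversal claim on short strings.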
24Regular expressions generate regular languages
- Lemma 1.29: For every regular expression r, L(r) is a regular language.
- Proof by induction on regular expressions.
- We used induction to create all of the regular expressions and then to define their languages, so we can use induction to visit each one and prove a property about it
25L(REX) ⊆ REG
- Base cases
- For every a∈Σ, L(a) = {a} is obviously regular
- L(ε) = {ε} ∈ REG also
- L(∅) = ∅ ∈ REG
[Figure: automaton accepting {a}, with a transition labeled a]
26L(REX) ⊆ REG
- Induction cases
- Suppose the induction hypothesis holds for r1 and r2. Namely, L(r1) ∈ REG and L(r2) ∈ REG. We want to show that L( (r1 ∪ r2) ) ∈ REG also. But look: by definition, L( (r1 ∪ r2) ) = L(r1) ∪ L(r2)
- Since both of these languages are regular, we can apply Theorem 1.22 (closure of REG under ∪) to conclude that their union is regular.
27L(REX) ⊆ REG
- Induction cases
- Now suppose L(r1) ∈ REG and L(r2) ∈ REG. By definition, L( (r1 · r2) ) = L(r1) · L(r2)
- By Theorem 1.23, this concatenation is regular too.
- Finally, suppose L(r) ∈ REG. Then by definition, L( (r*) ) = (L(r))*
- By Theorem 1.24, this language is also regular.
QED
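The induction proof is constructive: each case says how to build an automaton from the automata for the sub-expressions. The sketch below turns that into code using a Thompson-style ε-NFA construction, which is an assumption here and may differ in detail from the constructions in Theorems 1.22 to 1.24. The nested-tuple tree format and the names `build` and `accepts` are likewise hypothetical.

```python
from itertools import count

_fresh = count()   # supplies new state names

def build(node):
    """Return (start, accept, edges) for an ε-NFA with a single accept state.

    edges is a list of (source, label, target); label None means an ε-move.
    """
    kind = node[0]
    s, f = next(_fresh), next(_fresh)
    if kind == "sym":                       # base case: one transition on a
        return s, f, [(s, node[1], f)]
    if kind == "eps":                       # base case: one ε-move
        return s, f, [(s, None, f)]
    if kind == "empty":                     # base case: accept state unreachable
        return s, f, []
    if kind == "star":
        s1, f1, e1 = build(node[1])
        return s, f, e1 + [(s, None, s1), (f1, None, s1), (f1, None, f), (s, None, f)]
    if kind in ("union", "concat"):
        s1, f1, e1 = build(node[1])
        s2, f2, e2 = build(node[2])
        if kind == "union":
            glue = [(s, None, s1), (s, None, s2), (f1, None, f), (f2, None, f)]
        else:
            glue = [(s, None, s1), (f1, None, s2), (f2, None, f)]
        return s, f, e1 + e2 + glue
    raise ValueError(f"unknown node {kind!r}")

def accepts(node, w):
    """Simulate the ε-NFA built for `node` on the string w."""
    start, accept, edges = build(node)

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in edges:
                if src == q and label is None and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for a in w:
        moved = {dst for src, label, dst in edges if src in current and label == a}
        current = closure(moved)
    return accept in current

# Tree for (((1·0)·(ε∪∅))*), written out by hand.
tree = ("star", ("concat", ("concat", ("sym", "1"), ("sym", "0")),
                 ("union", ("eps",), ("empty",))))
print([w for w in ["", "1", "10", "100", "1010"] if accepts(tree, w)])
```

The recursion in `build` mirrors the proof: three base cases that need no hypothesis, and three induction cases that only use the machines returned for the sub-expressions.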
28On to REG ⊆ L(REX)
- Now we'll show that each regular language (one accepted by an automaton) also can be described by a regular expression
- Hence REG = L(REX)
- In other words, regular expressions are equivalent in power to finite automata
- This equivalence is called Kleene's Theorem (1.28 in book)
29Converting DFAs to REX
- Lemma 1.32 in textbook
- This approach uses yet another form of finite automaton called a GNFA (generalized NFA)
- The technique is easier to understand by working an example than by studying the proof
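As a worked example, the sketch below runs a state-elimination procedure in the GNFA style on a three-state remainder DFA for binary multiples of 3. It is a minimal illustration under stated assumptions, not the textbook's exact procedure: edge labels are Python `re` patterns rather than REX strings, `None` stands for ∅, the empty pattern stands for ε, and the names `dfa_to_regex`, `gnfa_start`, and `gnfa_accept` are hypothetical.

```python
import re
from itertools import product

def union(r, s):
    if r is None:
        return s
    if s is None:
        return r
    return f"(?:{r}|{s})"

def concat(*parts):
    if any(p is None for p in parts):
        return None                      # ∅ annihilates concatenation
    return "".join(parts)

def star(r):
    if r is None or r == "":
        return ""                        # ∅* = ε* = {ε}
    return f"(?:{r})*"

def dfa_to_regex(states, delta, start, accepting):
    """Eliminate the DFA's states one at a time; return a pattern for its language."""
    S, F = "gnfa_start", "gnfa_accept"   # the two states added to form the GNFA
    R = {}
    for (q, a), p in delta.items():
        R[q, p] = union(R.get((q, p)), re.escape(a))
    R[S, start] = ""                     # ε-edge into the old start state
    for q in accepting:
        R[q, F] = ""                     # ε-edges out of the old accepting states
    remaining = list(states)
    while remaining:
        k = remaining.pop()              # state to rip out
        loop = star(R.get((k, k)))
        for i in remaining + [S]:
            for j in remaining + [F]:
                detour = concat(R.get((i, k)), loop, R.get((k, j)))
                R[i, j] = union(R.get((i, j)), detour)
    return R.get((S, F))

# Assumed remainder DFA: reading bit b in state q moves to (2q + b) mod 3.
states = [0, 1, 2]
delta = {(q, b): (2 * q + int(b)) % 3 for q in states for b in "01"}
pattern = dfa_to_regex(states, delta, start=0, accepting={0})
print(pattern)
```

Each pass replaces every path i → k → j through the eliminated state k with a direct edge labeled R(i,k) R(k,k)* R(k,j), unioned with whatever label i → j already had. Stripping the non-capturing group markers from the printed pattern leaves an expression of the shape (0 ∪ 1(01*0)*1)*.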