91.304 Foundations of Theoretical Computer Science

1
91.304 Foundations of (Theoretical) Computer
Science

David Martin
dm_at_cs.uml.edu

This work is licensed under the Creative Commons
Attribution-ShareAlike License. To view a copy of
this license, visit
http://creativecommons.org/licenses/by-sa/2.0/
or send a letter to Creative
Commons, 559 Nathan Abbott Way, Stanford,
California 94305, USA.
2
Note on these slides

The numbering only partially corresponds to
textbook chapters
1-c.ppt is just the third part of slides having
to do with chapter 1

3
Closure properties

The presence or absence of closure properties
says something about how robust a set is with
respect to an operation
Definition. Let S ⊆ U be a set in some universe
U and let ⊙ be an operation on elements of U. We
say that S is closed under ⊙ if applying ⊙ to
element(s) of S produces another element of S.
For example, if ⊙ is a binary operation U×U→U,
then we're saying that (∀ x∈S and y∈S) x ⊙ y ∈ S

4
Closure properties illustrated
[Figure: the set S drawn inside the universe U]
Applying the operation to elements of S never
takes you outside of S. S is closed with respect
to the operation. This example shows unary
operations
5
Closure properties

Having a closure property usually means there is
some type of "natural fit" between the operation
and the set
Examples
N is closed under + and × and … but not − and ÷
Z is closed under + and − and × and unary −
(negation) but not ÷ or …
Q−{0} is closed under × and ÷ but not + or −

6
More examples

L1 = { x ∈ {0,1}* | |x| is a multiple of 3 }
is closed under string reversal and concatenation
L3 = { x ∈ {0,1}* | the binary number x is a
multiple of 3 }
is not closed under string reversal or
concatenation (oops, will take that under
advisement: L3 is in fact closed under both, as
shown later with its regular expression)
L4 = { x ∈ {a,b}* | x contains an odd number of
bs and an even number of as }
is closed under string reversal
is not closed under string concatenation
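
Closure claims like these can be sanity-checked by brute force on short strings. The sketch below is only that: a bounded search can refute closure (by finding a counterexample) but can never prove it. The function names are made up for illustration, and reading the empty string as the number 0 is an assumption.

```python
from itertools import product

def strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

def in_L3(x):
    # The binary number x is a multiple of 3.
    # Assumption: the empty string is read as the number 0.
    return int(x or "0", 2) % 3 == 0

def in_L4(x):
    # An odd number of b's and an even number of a's.
    return x.count("b") % 2 == 1 and x.count("a") % 2 == 0

def closed_under_reversal(member, alphabet, max_len):
    """True if no member of length <= max_len has a non-member reversal."""
    return all(member(x[::-1]) for x in strings(alphabet, max_len) if member(x))

def closed_under_concatenation(member, alphabet, max_len):
    """True if no two members of length <= max_len concatenate to a non-member."""
    sample = [x for x in strings(alphabet, max_len) if member(x)]
    return all(member(x + y) for x in sample for y in sample)

print("L4 reversal:     ", closed_under_reversal(in_L4, "ab", 6))
print("L4 concatenation:", closed_under_concatenation(in_L4, "ab", 4))
print("L3 reversal:     ", closed_under_reversal(in_L3, "01", 8))
print("L3 concatenation:", closed_under_concatenation(in_L3, "01", 5))
```

For L4 the search finds a counterexample to concatenation right away: b is in L4, but bb contains an even number of bs.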

7
Closure higher abstraction

We will usually be concerned with closure of
language classes under language operations
Previous examples were closure of sets containing
non-set elements under various familiar
operations
We consider DFAs and NFAs to be programs and we
want assurance that their outputs can be combined
in desired ways just by manipulating their
programs (like using one as a subroutine for the
other)
Representative question: is REG closed under
(language) concatenation?

8
The regular operations

The regular operations on languages are
∪ (union)
· (concatenation)
* (Kleene star)
The name "regular operations" is not that
important
Too bad we use the word "regular" for so much
REG is closed under these regular operations
That's why they're called "regular" operations
This does not mean that each regular language is
closed under each of these operations!

9
The regular operations

REG is closed under union: Theorem 1.12 (using
DFAs), Theorem 1.22 (using NFAs)
REG is closed under concatenation: Theorem 1.23
(NFAs)
REG is closed under *: Theorem 1.24 (NFAs)
REG is also closed under complement and reversal
(not in book)
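
One standard way to see closure under union with DFAs is the product construction: run both machines in lockstep and accept if either one accepts. The sketch below assumes a DFA is a tuple (states, alphabet, delta, start, accept) with delta stored as a dictionary; that encoding and the helper length_multiple_of are invented here for illustration and are not taken from the textbook.

```python
from itertools import product

def union_dfa(d1, d2):
    """Product construction: simulate both DFAs at once; accept if either accepts."""
    states1, alphabet, delta1, start1, accept1 = d1
    states2, _, delta2, start2, accept2 = d2
    states = set(product(states1, states2))
    delta = {((p, q), a): (delta1[p, a], delta2[q, a])
             for (p, q) in states for a in alphabet}
    accept = {(p, q) for (p, q) in states if p in accept1 or q in accept2}
    return states, alphabet, delta, (start1, start2), accept

def accepts(dfa, w):
    """Run the DFA on the string w and report whether it ends in an accept state."""
    _, _, delta, state, accept = dfa
    for ch in w:
        state = delta[state, ch]
    return state in accept

def length_multiple_of(k):
    """Hypothetical helper: DFA over {0,1} accepting strings whose length is a multiple of k."""
    states = set(range(k))
    delta = {(s, a): (s + 1) % k for s in states for a in "01"}
    return states, "01", delta, 0, {0}

m = union_dfa(length_multiple_of(2), length_multiple_of(3))
# Lengths below 8 that the union machine accepts.
print([n for n in range(8) if accepts(m, "0" * n)])
```

Changing the accept condition from "or" to "and" in the same construction gives a machine for the intersection.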

10
Regular expressions

You are probably familiar with these
Example: "int .*\(.*\)" is a (flex format)
regular expression that appears to match C
function prototypes that return ints
In our treatment, a regular expression is a
program that generates a language of matching
strings when you "run it"
We will use a very compact definition that
simplifies things later

11
Regular expressions

Definition. Let Σ be an alphabet not containing
any of the special characters in this list:
ε ) ( ∪ · * ∅
We define the syntax of the
(programming) language REX(Σ), abbreviated as
REX, inductively
Base cases
For all a∈Σ, a∈REX. In other words, each single
character from Σ is a regular expression all by
itself.
ε∈REX. In other words, the literal symbol ε is a
regular expression. In this context it is not
the empty string but rather the single-character
name for the empty string.
∅∈REX. Similarly, the literal symbol ∅ is a
regular expression.

12
Regular expressions

Definition continued
Induction cases
For all r1, r2 ∈ REX, ( r1 ∪ r2 ) ∈ REX also
For all r1, r2 ∈ REX, ( r1 · r2 ) ∈ REX also
Here the parentheses and the operators are
literal symbols, while r1 and r2 are variables
13
Regular expressions

Definition continued
Induction cases continued
For all r ∈ REX, ( r* ) ∈ REX also
Examples over Σ = {0,1}
ε and 0 and 1 and ∅
(((1·0)·(ε∪∅))*)
εε is not a regular expression
Remember, in the context of regular expressions,
ε and ∅ are ordinary characters

14
Semantics of regular expressions

Definition. We define the meaning of the
language REX(Σ) inductively using the L()
operator so that L(r) denotes the language
generated by r as follows
Base cases
For all a∈Σ, L(a) = { a }. A single-character
regular expression generates the corresponding
single-character string.
L(ε) = { ε }. The symbol for the empty string
actually generates the empty string.
L(∅) = ∅. The symbol for the empty language
actually generates the empty language.

15
Regular expressions

Definition continued
Induction cases
For all r1, r2 ∈ REX, L( (r1 ∪ r2) ) = L(r1) ∪
L(r2)
For all r1, r2 ∈ REX, L( (r1 · r2) ) = L(r1) ·
L(r2)
For all r ∈ REX, L( ( r* ) ) = (L(r))*
No other string is in REX(Σ)
Example
L( (((1·0)·(ε∪∅))*) ) includes
ε, 10, 1010, 101010, 10101010, ...
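
Since REX is described as a programming language, its syntax and semantics can be sketched as a tiny interpreter. The code below is a minimal sketch under several assumptions: the literal symbols are written ε, ∅, ∪, · and *, expressions are fully parenthesized as in the definition, and because L(r) may be infinite the evaluator only lists strings up to a chosen length. The names parse and lang are invented for this illustration.

```python
def parse(s, sigma):
    """Parse a fully parenthesized regular expression over `sigma` into a nested tuple."""
    pos = 0

    def expr():
        nonlocal pos
        if pos >= len(s):
            raise ValueError("unexpected end of input")
        c = s[pos]
        if c in sigma or c in ("ε", "∅"):      # base cases of the definition
            pos += 1
            return ("sym", c)
        if c != "(":
            raise ValueError(f"unexpected {c!r} at position {pos}")
        pos += 1
        left = expr()
        op = s[pos] if pos < len(s) else ""
        pos += 1
        if op == "*":                           # ( r* )
            node = ("star", left)
        elif op in ("∪", "·"):                  # ( r1 ∪ r2 ) or ( r1 · r2 )
            node = ("union" if op == "∪" else "concat", left, expr())
        else:
            raise ValueError(f"expected an operator, got {op!r}")
        if pos >= len(s) or s[pos] != ")":
            raise ValueError("missing )")
        pos += 1
        return node

    tree = expr()
    if pos != len(s):
        raise ValueError("trailing characters")
    return tree

def lang(tree, max_len):
    """Return the strings of L(r) that have length <= max_len."""
    kind = tree[0]
    if kind == "sym":
        c = tree[1]
        if c == "∅":
            return set()
        if c == "ε":
            return {""}
        return {c} if max_len >= 1 else set()
    if kind == "union":
        return lang(tree[1], max_len) | lang(tree[2], max_len)
    if kind == "concat":
        a, b = lang(tree[1], max_len), lang(tree[2], max_len)
        return {x + y for x in a for y in b if len(x + y) <= max_len}
    # Kleene star: keep appending pieces until nothing new (and short enough) appears.
    base = lang(tree[1], max_len)
    result, frontier = {""}, {""}
    while frontier:
        frontier = {x + y for x in frontier for y in base
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result

tree = parse("(((1·0)·(ε∪∅))*)", "01")
print(sorted(lang(tree, 6), key=len))
```

The parser rejects any string that the inductive definition does not produce, which is the code version of "no other string is in REX".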

16
Orientation

We used highly flexible mathematical notation and
state-transition diagrams to specify DFAs and
NFAs
Now we have a precise programming language REX
that generates languages
REX is designed to close the simplest languages
under ∪, ·, *

17
Abbreviations

Instead of parentheses, we use precedence to
indicate grouping when possible.
* (highest)
·
∪ (lowest)
Instead of ·, we just write elements next to each
other
Example: (((1·0)·(ε∪∅))*) can be written as
(10(ε∪∅))* but there is no further abbreviation
(Not in text) If r ∈ REX(Σ), instead of writing
rr*, we write r⁺

18
Abbreviations

Instead of writing a union of all characters from
Σ together to mean "any character", we just write
Σ
In a flex/grep regular expression this would be
called "."
Instead of writing L(r) when r is a regular
expression, we consider r alone to simultaneously
mean both the expression r and the language it
generates, relying on context to disambiguate

19
Abbreviations

Caution: regular expressions are strings
(programs). They are equal only when they
contain exactly the same sequence of characters.
(((1·0)·(ε∪∅))*) can be abbreviated (10(ε∪∅))*
however (((1·0)·(ε∪∅))*) ≠ (10(ε∪∅))* as strings
but (((1·0)·(ε∪∅))*) = (10(ε∪∅))* when they are
considered to be the generated languages
more accurately then, L( (((1·0)·(ε∪∅))*) )
= L( (10(ε∪∅))* )
= L( (10)* )

20
Facts

REX(Σ) is itself a language over the alphabet
Σ ∪ { ), (, ∪, ·, *, ε, ∅ }
For every Σ, |REX(Σ)| = ∞
∅, (∅*), ((∅*)*), ...
even without knowing Σ there are infinitely many
elements in REX(Σ)
Question: Can we find a DFA or NFA M with L(M)
= REX(Σ)?

21
Examples

Find a regular expression for { w ∈ {0,1}* |
w ? 10 }
Find a regular expression for { x ∈ {0,1}* | the
6th digit counting from the rightmost
character of x is 1 }
Find a regular expression for L3 = { x ∈ {0,1}* |
the binary number x is a multiple of 3 }
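
A candidate answer to an exercise like the second one can be tested against the definition by enumerating short strings. The sketch below tries the candidate Σ*1ΣΣΣΣΣ, written in Python's re syntax, where [01] plays the role of Σ and {5} is Python shorthand for five repetitions (not part of REX). The names spec and agree are invented here, and agreement up to a bounded length is evidence, not a proof.

```python
import re
from itertools import product

# Candidate: any prefix, then a 1, then exactly five more characters.
SIXTH_FROM_RIGHT = re.compile(r"[01]*1[01]{5}")

def spec(x):
    """Direct reading of the definition: the 6th character from the right is 1."""
    return len(x) >= 6 and x[-6] == "1"

def agree(max_len):
    """Compare the regular expression with the definition on all strings up to max_len."""
    return all(
        bool(SIXTH_FROM_RIGHT.fullmatch("".join(t))) == spec("".join(t))
        for n in range(max_len + 1)
        for t in product("01", repeat=n)
    )

print(agree(10))
```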

22
The DFA for L3
[Figure: DFA for L3 with states 0, 1, 2 and
transitions labeled 0 and 1]
(0 1* 0)
Regular expression: (0 ∪ 1 _____________ 1 )*
23
Regular expression for L3

(0 ∪ 1 (0 1* 0)* 1 )*
L3 is closed under concatenation, because of the
overall form ( )*
Now suppose x ∈ L3. Is x^R ∈ L3?
Yes, so L3 is also closed under reversal
Another way to see this is by reversing the
regular expression and observing that the same
regular expression results
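
The regular expression for L3 and the reversal claim can both be checked mechanically on short strings. This sketch writes the expression in Python's re syntax as (0|1(01*0)*1)*, with | standing for union, and compares it with plain arithmetic. Treating the empty string as the number 0 is an assumption, and the bounded check is a sanity test rather than a proof.

```python
import re
from itertools import product

L3_REGEX = re.compile(r"(0|1(01*0)*1)*")   # "|" is Python's spelling of union

def multiple_of_3(x):
    # Assumption: the empty string is read as the number 0.
    return int(x or "0", 2) % 3 == 0

def all_strings(max_len):
    """Yield every binary string of length 0..max_len."""
    for n in range(max_len + 1):
        for t in product("01", repeat=n):
            yield "".join(t)

# Strings on which the regular expression and the arithmetic definition disagree.
mismatches = [x for x in all_strings(10)
              if bool(L3_REGEX.fullmatch(x)) != multiple_of_3(x)]

# Members of L3 whose reversal is not in L3.
not_reversible = [x for x in all_strings(10)
                  if multiple_of_3(x) and not multiple_of_3(x[::-1])]

print(mismatches, not_reversible)
```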

24
Regular expressions generate regular languages

Lemma 1.29. For every regular expression r, L(r)
is a regular language.
Proof by induction on regular expressions.
We used induction to create all of the regular
expressions and then to define their languages,
so we can use induction to visit each one and
prove a property about it

25
L(REX) ⊆ REG

Base cases
For every a ∈ Σ, L(a) = { a } is obviously
regular
L(ε) = { ε } ∈ REG also
L(∅) = ∅ ∈ REG

[Figure: state diagram with a transition labeled a]
26
L(REX) ⊆ REG

Induction cases
Suppose the induction hypothesis holds for r1 and
r2. Namely, L(r1) ∈ REG and L(r2) ∈ REG. We
want to show that L( (r1 ∪ r2) ) ∈ REG also. But
look: by definition, L( (r1 ∪ r2) ) = L(r1) ∪
L(r2)
Since both of these languages are regular, we
can apply Theorem 1.22 (closure of REG under ∪)
to conclude that their union is regular.

27
L(REX) ⊆ REG

Induction cases
Now suppose L(r1) ∈ REG and L(r2) ∈ REG. By
definition, L( (r1 · r2) ) = L(r1) · L(r2)
By Theorem 1.23, this concatenation is regular
too.
Finally, suppose L(r) ∈ REG. Then by
definition, L( (r*) ) = (L(r))*
By Theorem 1.24, this language is also regular.
QED
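
The induction can also be read as a recipe: each case of the proof says how to build an automaton for the bigger expression from automata for its parts. The sketch below follows that shape in the spirit of Thompson's construction, which may differ in detail from the textbook's diagrams. The tuple encoding of expressions, the edge-list encoding of NFAs, and all function names are assumptions made for this illustration.

```python
from itertools import count

_fresh = count()   # supplies unused state names

def build(tree):
    """Return (start, accept, edges) for an NFA with ε-moves recognizing L(tree).

    edges is a list of (source, label, target); label None stands for an ε-move.
    """
    kind = tree[0]
    s, f = next(_fresh), next(_fresh)
    if kind == "empty":                  # ∅: no path from s to f
        return s, f, []
    if kind == "eps":                    # ε: one ε-move
        return s, f, [(s, None, f)]
    if kind == "char":                   # a single character a ∈ Σ
        return s, f, [(s, tree[1], f)]
    if kind in ("union", "concat"):
        s1, f1, e1 = build(tree[1])
        s2, f2, e2 = build(tree[2])
        if kind == "union":              # branch into either machine
            glue = [(s, None, s1), (s, None, s2), (f1, None, f), (f2, None, f)]
        else:                            # run the first machine, then the second
            glue = [(s, None, s1), (f1, None, s2), (f2, None, f)]
        return s, f, e1 + e2 + glue
    if kind == "star":                   # skip, or loop through the inner machine
        s1, f1, e1 = build(tree[1])
        glue = [(s, None, f), (s, None, s1), (f1, None, s1), (f1, None, f)]
        return s, f, e1 + glue
    raise ValueError(kind)

def closure(states, edges):
    """All states reachable from `states` using ε-moves only."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, w):
    """Simulate the NFA on w by tracking the set of states it could be in."""
    start, accept, edges = nfa
    current = closure({start}, edges)
    for ch in w:
        step = {dst for src, label, dst in edges if src in current and label == ch}
        current = closure(step, edges)
    return accept in current

# The example expression (((1·0)·(ε∪∅))*) as a hand-built tree.
tree = ("star", ("concat", ("concat", ("char", "1"), ("char", "0")),
                 ("union", ("eps",), ("empty",))))
nfa = build(tree)
print([w for w in ["", "10", "1010", "1", "01", "100"] if accepts(nfa, w)])
```

Each branch of build corresponds to one case of the proof: three base cases, then union, concatenation, and star.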

28
On to REG ⊆ L(REX)

Now we'll show that each regular language (one
accepted by an automaton) also can be described
by a regular expression
Hence REG = L(REX)
In other words, regular expressions are
equivalent in power to finite automata
This equivalence is called Kleene's Theorem (1.28
in book)

29
Converting DFAs to REX

Lemma 1.32 in textbook
This approach uses yet another form of finite
automaton called a GNFA (generalized NFA)
The technique is easier to understand by working
an example than by studying the proof
