Title: Regular Languages, Regular Operations
1 Regular Languages, Regular Operations
2 Agenda
- Today
- Regular languages
- Finite languages are regular
- Regular operations on languages
- Union (∪)
- Concatenation (∘)
- Kleene star (*)
- For next time
- Read 1.3 and handout on minimization
- Thursday, 9/20 (revised): HW1 collected
3 Definition of Regular Language
- Recall the definition of a regular language
- DEF: The language accepted by an FA M is the set
of all strings which are accepted by M and is
denoted by L(M). A language is regular if it is
accepted by some FA.
- Would like to understand what types of languages
are regular. Languages of this type are amenable
to super-fast recognition of their elements.
- Would be nice to know, for example, which of the
following are regular:
4 Language Examples
- Unary prime numbers
- { 11, 111, 11111, 1111111, 11111111111, … }
- = { 1^2, 1^3, 1^5, 1^7, 1^11, 1^13, … }
- = { 1^p | p is a prime number }
- Unary squares
- { ε, 1, 1^4, 1^9, 1^16, 1^25, 1^36, … }
- = { 1^n | n is a perfect square }
- Palindromic bit strings
- { ε, 0, 1, 00, 11, 000, 010, 101, 111, … }
- = { x ∈ {0,1}* | x = x^R }
- Will explore whether or not these are regular in
the future.
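The three set definitions above can be read as membership tests. The following is a minimal sketch, not part of the original slides; the function names are made up for illustration. It only decides membership in Python and says nothing yet about whether a finite automaton could do the same.

```python
import math

def is_prime(n):
    """Trial division; fine for the small lengths used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def in_unary_primes(s):
    # s must consist only of 1s and have prime length
    return set(s) <= {"1"} and is_prime(len(s))

def in_unary_squares(s):
    # s must consist only of 1s and have perfect-square length (0 counts)
    n = len(s)
    return set(s) <= {"1"} and math.isqrt(n) ** 2 == n

def in_palindromes(s):
    # s must be a bit string that equals its own reversal
    return set(s) <= {"0", "1"} and s == s[::-1]
```

Note that the empty string is a unary square and a palindrome, but not a unary prime.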
5 Finite Languages
- All the previous examples had the following
property in common: infinite cardinality.
- NOTE: The strings which made up the language
were finite (as they always will be in this
course); however, the collection of such strings
was infinite.
- Before looking at infinite languages, should
definitely look at finite languages.
6 Languages of Cardinality 1
- Q: Is the singleton language containing one
string regular? For example, is
- { banana }
- regular?
7 Languages of Cardinality 1
- A: Yes.
- Q: What's wrong with this example?
8 Languages of Cardinality 1
- A: Nothing, really. This is an example of a
nondeterministic FA. This turns out to be the
most concise way to encapsulate the language
{ banana }.
- But we will deal with nondeterminism in coming
lectures. So:
- Q: Is there a way of fixing this and making it
deterministic?
9 Languages of Cardinality 1
- A: Yes, just add a fail state q7. I.e., put in a
state that sucks in all strings different from
banana for all eternity, unless they happen to
be the proper prefixes of banana: ε, b, ba, ban,
bana, banan.
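A minimal sketch of the deterministic machine just described. The slide names only the fail state q7; numbering the other states q0 through q6 by how many letters of banana have been read is an assumption made here for illustration.

```python
WORD = "banana"
FAIL = len(WORD) + 1      # q7, the fail (sink) state
ACCEPT = {len(WORD)}      # q6, reached after reading all of "banana"

def delta(state, symbol):
    """Advance along the word if the next letter matches, otherwise fail."""
    if state < len(WORD) and WORD[state] == symbol:
        return state + 1
    return FAIL           # also covers q6 and q7: any further input fails

def accepts(s):
    state = 0             # q0 is the start state
    for ch in s:
        state = delta(state, ch)
    return state in ACCEPT
```

Prefixes such as "ban" end in a non-accepting state on the main path, while anything else ends in q7 and stays there.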
10 ABCEZ Goes Bananas
- Show how ABCEZ works on this example.
11 Two Strings
- Q: How about two strings? For example
- { banana, nab } ?
12 Two Strings
13 Arbitrary Finite Number of Strings
- Q1: How about more? For example
- { banana, nab, ban, babba } ?
- Q2: Or less (the empty set):
- Ø ?
14 Arbitrary Finite Number of Strings
15 Arbitrary Finite Number of Strings: Empty Language
- A2: Build a 1-state automaton whose set of
accept states F is empty!
16 Arbitrary Finite Number of Strings
- THM: All finite languages are regular.
- Proof: Can always construct a tree whose leaves
are word endings. In our example the tree is: [tree diagram]
- Now make word endings into accept states, add a
fail sink-state and add links to the fail state
to finish the construction.
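The construction in the proof can be sketched as follows. This is an illustrative sketch, not the slides' own code: the names build_dfa and accepts are hypothetical, and the fail sink-state is represented implicitly by the value FAIL instead of an explicit table row.

```python
FAIL = -1  # implicit fail sink-state: every missing link leads here

def build_dfa(words):
    """Build a prefix tree. Returns (transitions, accept_states).

    transitions[state] maps a symbol to the next state; state 0 is the root.
    """
    transitions = [{}]
    accept = set()
    for w in words:
        state = 0
        for ch in w:
            if ch not in transitions[state]:
                transitions.append({})                     # new tree node
                transitions[state][ch] = len(transitions) - 1
            state = transitions[state][ch]
        accept.add(state)                                  # word ending
    return transitions, accept

def accepts(dfa, s):
    transitions, accept = dfa
    state = 0
    for ch in s:
        if state == FAIL:
            break                                          # sink: stay failed
        state = transitions[state].get(ch, FAIL)
    return state in accept

dfa = build_dfa(["banana", "nab", "ban", "babba"])
empty = build_dfa([])   # one state, no accept states: the empty language
```

Word endings need not be leaves: "ban" ends at an inner node on the path to "banana", and that node is simply marked accepting.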
17 Infinite Cardinality
- Q: Are all regular languages finite?
18 Infinite Cardinality
- A: No! Many infinite languages are regular.
- Common Mistake 1: The strings of regular
languages are finite, therefore the regular
languages must be finite.
- Common Mistake 2: Regular languages are by
definition accepted by finite automata,
therefore regular languages are finite.
- Q: Give an example of an infinite but regular
language.
19 Infinite Cardinality
- Strings with an even number of b's
- Simplest example is Σ*
- Many, many more
- Home exercise: think of a criterion for
non-finiteness
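To see why a finite machine can accept an infinite language, here is a two-state sketch for the first example. The alphabet {a, b} and the state names "even" and "odd" are assumptions made for illustration; the slide does not fix them.

```python
def accepts_even_bs(s):
    """Two-state DFA: track only the parity of the number of b's seen."""
    state = "even"                     # start state, also the accept state
    for ch in s:
        if ch == "b":
            state = "odd" if state == "even" else "even"
        # any other symbol (e.g. "a") leaves the state unchanged
    return state == "even"
```

The machine has two states, yet it accepts "bb", "bbbb", "bbbbbb", and so on without end: the number of states bounds the memory, not the number of accepted strings.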
20 Regular Operations
- You may have come across the regular operations
when doing advanced searches utilizing programs
such as emacs, egrep, perl, python, etc. There
are three basic operations we will work with:
- Union
- Concatenation
- Kleene-star
- And a fourth definable in terms of the previous:
- Kleene-plus
21 Regular Operations: Summarizing Table
22 Regular operations - Union
- UNIX: to search for all lines containing vowels
in a text one could use the command
- egrep -i 'a|e|i|o|u'
- Here the pattern "vowel" is matched by any line
containing one of a, e, i, o or u.
- Q: What is a string pattern?
23 String Patterns
- A: A good way to define a pattern is as a set of
strings, i.e. a language. The language for a
given pattern is the set of all strings
satisfying the predicate of the pattern.
- EG: vowel-pattern =
- { the set of strings which contain at least
one of: a e i o u }
24 UNIX patterns vs. Computability patterns
- In UNIX, a pattern is implicitly assumed to occur
as a substring of the matched strings.
- In our course, however, a pattern needs to
specify the whole string, and not just a
substring.
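The two conventions can be compared side by side using Python's re module, which is standing in for egrep here as an assumption for illustration (the two tools differ in details). re.search looks for the pattern anywhere in the string, while re.fullmatch requires the pattern to describe the whole string.

```python
import re

# The vowel pattern, case-insensitive like egrep -i
VOWEL = re.compile(r"a|e|i|o|u", re.IGNORECASE)

def unix_style_match(line):
    """Substring convention: the pattern may occur anywhere in the line."""
    return VOWEL.search(line) is not None

def whole_string_match(s):
    """Course convention: the pattern must account for the entire string."""
    return VOWEL.fullmatch(s) is not None
```

Under the substring convention "banana" matches the vowel pattern; under the whole-string convention only the single-letter strings a, e, i, o, u (in either case) do.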
25 Regular operations - Union
- Computability: union is exactly what we expect.
If you have patterns
- A = {aardvark}, B = {bobcat},
- C = {chimpanzee}
- union the patterns together to get
- A ∪ B ∪ C = {aardvark, bobcat, chimpanzee}
26 Regular operations - Concatenation
- UNIX: to search for all consecutive double
occurrences of vowels, use
- egrep -i '(a|e|i|o|u)(a|e|i|o|u)'
- Here the pattern "vowel" has been repeated.
Parentheses have been introduced to specify where
exactly in the pattern the concatenation is
occurring.
27 Regular operations - Concatenation
- Computability: Consider the previous result
- L = {aardvark, bobcat, chimpanzee}
- Q: What language results when we concatenate L
with itself, obtaining
- L∘L ?
28 Regular operations - Concatenation
- A: L∘L =
- {aardvark, bobcat, chimpanzee}∘{aardvark, bobcat,
chimpanzee} =
- { aardvarkaardvark, aardvarkbobcat,
aardvarkchimpanzee,
- bobcataardvark, bobcatbobcat, bobcatchimpanzee,
- chimpanzeeaardvark, chimpanzeebobcat,
chimpanzeechimpanzee }
- Q1: What is L∘{ε} ?
- Q2: What is L∘Ø ?
29 Algebra of Languages
- A1: L∘{ε} = L. In general, {ε} is the identity
in the algebra of languages. I.e., if we think
of concatenation as being like multiplication,
{ε} acts like the number 1.
- A2: L∘Ø = Ø. Opposite to {ε}, Ø acts like the
number zero, obliterating everything it is
concatenated with.
- Note: We can carry on the analogy between
numbers and languages. Addition becomes union,
multiplication becomes concatenation. This forms
a so-called algebra.
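For finite languages these identities can be checked directly with Python sets. This is a small sketch; concat is a helper name introduced here, with the empty string "" playing the role of ε and set() the role of Ø.

```python
def concat(X, Y):
    """Concatenation of languages: every x from X followed by every y from Y."""
    return {x + y for x in X for y in Y}

L = {"aardvark", "bobcat", "chimpanzee"}
EPSILON = {""}     # the language containing only the empty string
EMPTY = set()      # the empty language

LL = concat(L, L)  # the nine strings listed for L concatenated with L
```

concat(L, EPSILON) returns L unchanged, like multiplying by 1, while concat(L, EMPTY) has no pairs to combine and returns the empty set, like multiplying by 0. Union is just the built-in set operator |.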
30 Regular operations - Kleene-*
- UNIX: search for lines consisting purely of
vowels (including the empty line)
- egrep -i '^(a|e|i|o|u)*$'
- NOTE: ^ and $ are special symbols in UNIX
regular expressions which respectively anchor the
pattern at the beginning and end of a line. The
trick above can be used to convert any
Computability regular expression into an
equivalent UNIX form.
31 Regular operations - Kleene-*
- Computability: Suppose we have a language
- B = {ba, na}
- Q: What is the language B* ?
32 Regular operations - Kleene-*
- A:
- B* = {ba, na}* =
- { ε,
- ba, na,
- baba, bana, naba, nana,
- bababa, babana, banaba, banana,
- nababa, nabana, nanaba, nanana,
- babababa, bababana, … }
- Kleene- is just like Kleene- except that the
pattern is forced to occur at least once. - UNIX search for lines consisting purely of
vowels (not including the empty line) - egrep -i (aeiou)
- Computability B ba, na
- ba, na,
- baba, bana, naba, nana,
- bababa, babana, banaba, banana,
- nababa, nabana, nanaba, nanana,
- babababa, bababana,
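Kleene-plus is definable from the other operations: one copy of B concatenated with the star of B. A bounded sketch of that definition follows; the cutoff max_reps and the function names are assumptions introduced for illustration.

```python
def concat(X, Y):
    """Concatenation of languages: every x from X followed by every y from Y."""
    return {x + y for x in X for y in Y}

def kleene_star(B, max_reps):
    """Finite slice of the Kleene star: 0 up to max_reps pieces taken from B."""
    result, layer = {""}, {""}
    for _ in range(max_reps):
        layer = concat(layer, B)
        result |= layer
    return result

def kleene_plus(B, max_reps):
    """Finite slice of Kleene-plus: 1 up to max_reps pieces (needs max_reps >= 1)."""
    return concat(B, kleene_star(B, max_reps - 1))

B = {"ba", "na"}
plus3 = kleene_plus(B, 3)
```

For this B the result is the star slice with the empty string removed. That would not hold if B itself contained the empty string.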
34 Generating the Regular Languages
- The real reason that regular languages are called
regular is the following:
- THM: The regular languages are all those
languages which can be generated starting from
the finite languages by applying the regular
operations.
- This will be proved in the coming lectures.
- Q: Can we start with even more basic languages
than arbitrary finite languages?
35 Generating the Regular Languages
- A: Yes. We can start with languages consisting
of single strings which are themselves just a
single character. These are the atomic regular
languages.
- EG: To generate the finite language
- L = {banana, nab}
- we can start with the atomic languages
- A = {a}, B = {b}, N = {n}.
- Then we can express L as
- L = (B∘A∘N∘A∘N∘A) ∪ (N∘A∘B)
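The expression for L can be evaluated mechanically from the atomic languages. This is a minimal sketch with Python sets; concat and concat_all are helper names introduced here for illustration.

```python
from functools import reduce

def concat(X, Y):
    """Concatenation of languages: every x from X followed by every y from Y."""
    return {x + y for x in X for y in Y}

def concat_all(*langs):
    """Concatenate several languages left to right, starting from {ε}."""
    return reduce(concat, langs, {""})

# Atomic languages: one string each, one character each
A, B, N = {"a"}, {"b"}, {"n"}

# Two concatenation chains joined by union
L = concat_all(B, A, N, A, N, A) | concat_all(N, A, B)
```

Only concatenation and union are needed here. The star comes into play once the target language is infinite.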
36 Blackboard Exercises
- Express the DFA patterns from the previous
board-exercises using regular operations in both
UNIX-style and Computability-style.