Languages and Machines - PowerPoint PPT Presentation

Provided by: richardb84

1
Languages and Machines
  • Unit two: Regular languages and Finite State
    Automata

2
Review of week one
  • A language is a set of strings (the set of
    different things you can say). May be infinite.
  • A string is a sequence of symbols. Minimum length
    zero, maximum length some finite number.
  • A symbol is just some mark on the page or screen.
    A language has a finite alphabet of symbols.

3
Review of week one
  • In a context-dependent language, the meaning of a
    phrase depends on the context
  • In a context-sensitive language, the structure of
    a phrase depends on the context
  • Most natural languages are context-dependent but
    not context-sensitive
  • A context-free language is one where the
    structure of a phrase is always the same,
    independent of context
  • A regular language is a context-free language
    which has simple rules for forming valid strings
    (e.g. "94", "getWidth()")

4
Classes of formal language
  • phrase structure ⊃ context-sensitive ⊃
    context-free ⊃ regular
    (each class contains the classes to its right)
5
Regular languages
  • Here are examples of strings from a regular
    language with alphabet {a, b}
  • a
  • b
  • ab
  • aaaaa
  • ababab

6
Regular languages
  1. the empty set is a regular language
  2. the set consisting of the empty string (ε) is a
    regular language
  3. the set consisting of a one-symbol string is a
    regular language
  4. a new regular language can be made by taking
    each string from one regular language and
    concatenating it with each string from another
    regular language
  5. a new regular language can be made by taking the
    union of two regular languages
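
Rules 4 and 5 can be tried out on small finite languages. The following is a minimal Python sketch, assuming a language is represented as a Python set of strings; the helper names concat and union are illustrative and do not come from the slides.

```python
# Illustrative helpers; the names are assumptions, not from the slides.

def concat(l1, l2):
    """Every string of l1 followed by every string of l2 (rule 4)."""
    return {x + y for x in l1 for y in l2}


def union(l1, l2):
    """Strings that belong to either language (rule 5)."""
    return l1 | l2


EMPTY_SET = set()      # rule 1: the language with no strings
EMPTY_STRING = {""}    # rule 2: the language holding only the empty string
A, B = {"a"}, {"b"}    # rule 3: one-symbol languages

a_or_b = union(A, B)
two_symbols = concat(a_or_b, a_or_b)
print(sorted(two_symbols))  # ['aa', 'ab', 'ba', 'bb']
```

Note that concatenating with the empty-string language leaves a language unchanged, while concatenating with the empty set gives the empty set.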

7
Recognizing regular languages
  • regular languages can be recognized and
    interpreted by a finite-state machine
  • for example, here is a machine to recognize a
    two-bit string

[Figure: state diagram with transitions labelled 0 and 1 and marked acceptor states]
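
A two-bit recognizer can be written as a transition table. This is a minimal Python sketch of one possible machine, not necessarily the one drawn on the slide; the state names s0, s1, s2 and dead are assumptions made for illustration.

```python
# States: "s0" (start), "s1" (one bit read), "s2" (two bits read, accepting),
# "dead" (more than two bits read). State names are illustrative.
TRANSITIONS = {
    ("s0", "0"): "s1", ("s0", "1"): "s1",
    ("s1", "0"): "s2", ("s1", "1"): "s2",
    ("s2", "0"): "dead", ("s2", "1"): "dead",
    ("dead", "0"): "dead", ("dead", "1"): "dead",
}
ACCEPTING = {"s2"}


def accepts(string):
    """Return True if the machine ends in an accepting state."""
    state = "s0"
    for symbol in string:
        if (state, symbol) not in TRANSITIONS:
            return False  # symbol outside the alphabet {0, 1}
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING


print([s for s in ["", "0", "01", "11", "010"] if accepts(s)])  # ['01', '11']
```
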
8
Regular expressions
  • Wouldn't it be nice if we had a compact way of
    specifying a regular language?
  • we have!
  • it's a special notation called a
    regular expression

9
Examples of regular languages
  • the set of all two-symbol strings containing the
    letters a and b
  • (a | b)²
  • the set of all two-bit strings
  • (0 | 1)²
  • the set of all possible words
  • (a..z)+
  • the set of all decimal integers
  • (0 | (1..9)(0..9)*)
  • the set of Java identifiers
  • JavaLetter JavaLetterOrDigit*
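
Several of these languages can be checked with Python's re module, whose syntax differs from the slide notation. The following is a sketch under stated assumptions: JavaLetter is approximated by ASCII letters, underscore and dollar sign, and the helper name in_language is illustrative.

```python
import re

# Python's re syntax: | is alternation, {2} is a power, [a-z] is a range.
TWO_AB = re.compile(r"(a|b){2}")
TWO_BITS = re.compile(r"(0|1){2}")
DECIMAL_INTEGER = re.compile(r"0|[1-9][0-9]*")
# Assumption: JavaLetter is approximated by ASCII letters, _ and $.
JAVA_IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def in_language(pattern, string):
    """True if the whole string belongs to the pattern's language."""
    return pattern.fullmatch(string) is not None


print(in_language(DECIMAL_INTEGER, "120"))      # True
print(in_language(DECIMAL_INTEGER, "012"))      # False: leading zero
print(in_language(JAVA_IDENTIFIER, "getWidth")) # True
```

fullmatch is used rather than search so that the whole string, not just a part of it, must belong to the language.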

10
More examples of regular languages
  • all the possible three-bit strings
  • (0 | 1)³
  • all the single-digit decimal numbers
  • (0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9) = (0..9)
  • all the possible repetitions of the traffic-light
    sequence (red, amber, green, amber)
  • (red amber green amber)*

11
Activity
  • Write down the regular expressions denoting the
    following regular languages
  • The language with two strings, "the cat" and
    "the mat"
  • Arithmetic expressions with two operands,
  • e.g. 1 + 2, 3 * 4
  • The allowed operators are +, -, *, /
  • The allowed operands are single-digit decimal
    numbers
  • The language consisting of all possible binary
    strings
  • The language of HTML tags such as <HEAD>

12
Suggested Answers
  • The language with two strings, "the cat" and
    "the mat"
  • the (cat | mat) or (the (c | m)at)
  • Arithmetic expressions with two operands,
  • e.g. 1 + 2, 3 * 4.
  • (0..9) (+ | - | * | /) (0..9)
  • The language consisting of all possible binary
    strings
  • (0 | 1)*
  • The language of HTML tags such as <HEAD>
  • < (A..Z)+ >

13
A cautionary note
  • You have been using a metalanguage!
  • The regular expression strings form a language
    having terminal symbols ( ) | * + plus literal
    symbols, e.g. a stands for the letter a
  • this can cause problems when the metalanguage and
    the language get confused, e.g. the language
    consisting of strings of one to three vertical
    bars
  • | | || | |||

14
A cautionary note
  • we can fix this by some ghastly escape
    convention, e.g. convert the above to
  • "|" | "||" | "|||"
  • now we have problems with the quote symbol!
  • the best idea is to choose metalanguage symbols
    which are rarely encountered in the language
    being described, and use bold-face or color to
    distinguish
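
Python's re module is one real-world instance of such an escape convention: a backslash turns a metasymbol into a literal. A minimal sketch follows; the pattern names are illustrative.

```python
import re

# One to three vertical bars. "|" is a metasymbol in Python's re syntax,
# so the literal bar is escaped with a backslash.
BARS = re.compile(r"\|{1,3}")

# re.escape produces the escaped form of an arbitrary literal string.
LITERAL_STAR_PLUS = re.compile(re.escape("*+"))

print(BARS.fullmatch("||") is not None)               # True
print(BARS.fullmatch("||||") is not None)             # False
print(LITERAL_STAR_PLUS.fullmatch("*+") is not None)  # True
```

The same problem then reappears one level up: to match a literal backslash, the backslash itself has to be escaped.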

15
Regular languages and regular expressions
  • Regular Expression : Regular Language
  • ∅ : the empty set
  • ε : the set consisting of the empty string
  • a : the set consisting of a one-symbol string
    (e.g. "a")
  • a b : a new regular language made by taking a
    string from a regular language and concatenating
    it with a string from a regular language
  • a | b : a new regular language made by taking the
    union of two regular languages

16
Regular languages and regular expressions
  • The other ways of forming regular expressions
    are just shorthand
  • a⁰ = ε
  • a¹ = a
  • a² = aa
  • a* = ε | a | aa | aaa | ...
  • a+ = a | aa | aaa | ...
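
The shorthand can be expanded mechanically. This is a small Python sketch, assuming a power means repeating a symbol n times and that star starts from power 0 while plus starts from power 1; the function names are illustrative.

```python
import re


def power(symbol, n):
    """The n-th power of a symbol: n copies, so power 0 is the empty string."""
    return symbol * n


def star_prefix(symbol, limit):
    """The first members of symbol*: powers 0 up to limit."""
    return [power(symbol, n) for n in range(limit + 1)]


def plus_prefix(symbol, limit):
    """The first members of symbol+: powers 1 up to limit."""
    return [power(symbol, n) for n in range(1, limit + 1)]


print(star_prefix("a", 3))  # ['', 'a', 'aa', 'aaa']
print(plus_prefix("a", 3))  # ['a', 'aa', 'aaa']

# Cross-check against Python's own star and plus operators.
print(all(re.fullmatch("a*", s) for s in star_prefix("a", 3)))  # True
print(re.fullmatch("a+", "") is None)                           # True
```
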

17
Regular languages and regular expressions
  • Brackets are used to show precedence of the
    operations
  • (a | b)* ≠ a | b*
  • default precedence, highest first, is
  • * or + or ⁿ
  • concatenation
  • |
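
Python's re module follows the same precedence (repetition binds tighter than concatenation, which binds tighter than alternation), so it can be used to see the effect of the brackets. A minimal sketch; the pattern names are illustrative.

```python
import re

GROUPED = re.compile(r"(a|b)*")   # any mix of a's and b's
UNGROUPED = re.compile(r"a|b*")   # a single a, or any number of b's

for s in ["", "a", "bbb", "ab", "aa"]:
    grouped = GROUPED.fullmatch(s) is not None
    ungrouped = UNGROUPED.fullmatch(s) is not None
    print(repr(s), grouped, ungrouped)
```

The strings "ab" and "aa" belong to the grouped language only, which shows that the two expressions denote different languages.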

18
Activity
  • Give examples of the following languages
  • (x | y | z)³
  • x y z
  • a b²
  • (a b)²

19
Suggested Answers
  • Give examples of the following languages
  • (x | y | z)³ : xzy
  • x y z
  • a b² : abb
  • (a b)² : abab

20
From Regular Expressions to Finite State Automata
  • It is an amazing fact that any regular expression
    has an equivalent finite state automaton which
    recognizes the language it denotes
  • and every finite state automaton recognizes the
    language of some regular expression
  • we will prove these propositions later

21
Finite State Machines
[Figure: state diagram with states A to F; callouts mark the start state, the end state, a transition, and the input and output symbols on a transition]
  • an FSM to add two binary numbers
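
A machine with output symbols can be sketched as a table from (state, input symbol) to (next state, output symbol). The following Python sketch is a two-state serial adder whose state is the current carry; it is an assumption made for illustration and is not the six-state machine drawn on the slide. The function name add_binary is hypothetical.

```python
def add_binary(x, y):
    """Add two binary strings using a two-state machine (carry 0 or carry 1).

    Input symbols are pairs of bits such as "01", fed least-significant
    pair first; each transition emits one output bit.
    """
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)

    # (state, input symbol) -> (next state, output symbol)
    table = {}
    for carry in (0, 1):
        for a in (0, 1):
            for b in (0, 1):
                total = carry + a + b
                table[(carry, f"{a}{b}")] = (total // 2, str(total % 2))

    state = 0
    output = []
    for a, b in zip(reversed(x), reversed(y)):
        state, symbol = table[(state, a + b)]
        output.append(symbol)
    if state == 1:
        # Flushing the final carry is done outside the machine itself.
        output.append("1")
    return "".join(reversed(output))


print(add_binary("101", "11"))  # 1000
```
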

22
Finite state automata
  • These are simple machines with no output symbols
  • they can only recognize strings of input symbols
  • acceptance is shown by a special state

23
NFAs
  • The kind of finite state automata we shall be
    using are called nondeterministic finite automata
  • "nondeterministic" means we can do naughty things
    like
  • have a transition without a symbol
  • label two exit transitions with the same symbol
  • not show the paths which lead to failure
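
All three liberties can be handled by tracking a set of possible states instead of a single state. This is a minimal Python sketch of a hypothetical NFA for the language (a | b)* a b; the state numbers, the table layout and the use of an empty-string label for a symbol-free transition are assumptions made for illustration.

```python
EPSILON = ""  # label used here for a transition without a symbol

# Hypothetical NFA for (a | b)* a b.
NFA = {
    (0, "a"): {0, 1},     # two exit transitions with the same symbol
    (0, "b"): {0},
    (1, "b"): {2},
    (2, EPSILON): {3},    # a transition without a symbol
}                         # failure paths are simply left out of the table
START, ACCEPTING = 0, {3}


def epsilon_closure(states):
    """All states reachable from `states` using only symbol-free moves."""
    seen, stack = set(states), list(states)
    while stack:
        state = stack.pop()
        for target in NFA.get((state, EPSILON), ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def accepts(string):
    """Follow every possible path at once; accept if any path accepts."""
    current = epsilon_closure({START})
    for symbol in string:
        moved = set()
        for state in current:
            moved |= NFA.get((state, symbol), set())
        current = epsilon_closure(moved)
    return bool(current & ACCEPTING)


print(accepts("aab"), accepts("aba"))  # True False
```
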

24
Example of an NFA
[Figure: NFA state diagram with transitions labelled a, b and c]
  • what regular language does this NFA represent?
  • a b a b c a

25
Examples of conversion from REs to NFAs
  • (a | b)²
  • a b²
  • (a b)²
  • (a | b)*

[Figure: one NFA state diagram per expression, with transitions labelled a and b]
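
A conversion of this kind can be automated. The sketch below builds an NFA from an expression tree in the style of Thompson's construction (a standard technique, swapped in here because the slides do not name a method) and then simulates it. It assumes the four expressions are read as (a | b)², a b², (a b)² and (a | b)*; the tuple encoding and all function names are illustrative.

```python
import itertools

EPS = None                    # label for a transition without a symbol
_state_ids = itertools.count()


def build(expr, edges):
    """Add an NFA fragment for expr to edges; return its (start, end) states.

    expr is ("sym", c), ("cat", e1, e2), ("alt", e1, e2) or ("star", e).
    """
    kind = expr[0]
    start, end = next(_state_ids), next(_state_ids)
    if kind == "sym":
        edges.append((start, expr[1], end))
    elif kind == "cat":
        s1, e1 = build(expr[1], edges)
        s2, e2 = build(expr[2], edges)
        edges += [(start, EPS, s1), (e1, EPS, s2), (e2, EPS, end)]
    elif kind == "alt":
        s1, e1 = build(expr[1], edges)
        s2, e2 = build(expr[2], edges)
        edges += [(start, EPS, s1), (start, EPS, s2),
                  (e1, EPS, end), (e2, EPS, end)]
    elif kind == "star":
        s1, e1 = build(expr[1], edges)
        edges += [(start, EPS, s1), (e1, EPS, s1),
                  (start, EPS, end), (e1, EPS, end)]
    else:
        raise ValueError(f"unknown expression kind: {kind}")
    return start, end


def closure(states, edges):
    """States reachable from `states` through symbol-free transitions."""
    seen, stack = set(states), list(states)
    while stack:
        state = stack.pop()
        for source, label, target in edges:
            if source == state and label is EPS and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def accepts(expr, string):
    """Build the NFA for expr and run it on string."""
    edges = []
    start, end = build(expr, edges)
    current = closure({start}, edges)
    for symbol in string:
        moved = {t for s, label, t in edges if s in current and label == symbol}
        current = closure(moved, edges)
    return end in current


A, B = ("sym", "a"), ("sym", "b")
A_OR_B = ("alt", A, B)
A_OR_B_SQUARED = ("cat", A_OR_B, A_OR_B)             # (a | b) squared
A_B_SQUARED = ("cat", A, ("cat", B, B))              # a followed by b squared
AB_SQUARED = ("cat", ("cat", A, B), ("cat", A, B))   # (a b) squared
A_OR_B_STAR = ("star", A_OR_B)                       # (a | b)*

print(accepts(A_B_SQUARED, "abb"), accepts(AB_SQUARED, "abb"))  # True False
```

The generated machines contain more symbol-free transitions than a hand-drawn NFA would, which is the price of a purely mechanical construction.
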
26
Activity
  • Convert the following regular expressions to
    NFAs
  • JavaLetter JavaLetterOrDigit*
  • (red amber green amber)*
  • Convert the following NFAs to REs

[Figure: two NFA state diagrams with transitions labelled a, b, c and d]
27
Suggested answer
  1. (ab)
  2. (acbd)

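[Figure: two NFA state diagrams, one with transitions labelled javaLetter and javaLetterOrDigit, the other with transitions labelled red, amber, green and amber]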
28
Summary
  • regular expressions give us a neat notation for
    describing regular languages
  • nondeterministic finite automata (NFAs) provide a
    diagrammatic version of regular expressions
  • these notations are equivalent
  • finite automata theory is crucial in generating
    lexical analyzers from regular expressions
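
To illustrate the last point, here is a minimal Python sketch of a lexical analyzer driven by regular expressions. Python's re module is used as a stand-in for a generated automaton; the token names, the patterns and the function name tokenize are assumptions made for illustration.

```python
import re

# Each token class is described by one regular expression.
TOKEN_SPEC = [
    ("INTEGER", r"0|[1-9][0-9]*"),
    ("IDENTIFIER", r"[A-Za-z_$][A-Za-z0-9_$]*"),
    ("OPERATOR", r"[+*/-]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Split text into (token class, lexeme) pairs, dropping whitespace."""
    tokens = []
    position = 0
    while position < len(text):
        match = MASTER.match(text, position)
        if match is None:
            raise ValueError(
                f"unexpected character {text[position]!r} at {position}"
            )
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens


print(tokenize("width + 94"))
# [('IDENTIFIER', 'width'), ('OPERATOR', '+'), ('INTEGER', '94')]
```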