Title: Context-free languages and Pushdown Automata
1Context-free languages and Pushdown Automata
A more powerful model for computing
2Models of computing
- Each level of languages in the Chomsky hierarchy has a computing model associated with it. As the languages grow more complex, the computing models become more powerful.
- DFA and NFA - Regular languages
- Pushdown automata - Context-free
- Bounded Turing machines - Context sensitive
- Turing machines - Unrestricted
3Context-free grammar
- A context-free grammar is a grammar whose productions are of the form
- S → α
- where S is a non-terminal and α is any string over the alphabet of terminals and non-terminals.
- Context-free languages contain within them the family of regular languages.
4Chomsky normal form
- When working with CFLs, it's often useful to have their grammars in a simplified form, and one of the simplest is called the Chomsky normal form
- All productions are of the form
- A → BC
- A → a
- where a is any terminal, and A, B, C are non-terminals (B, C not the start symbol)
- If ε is needed, it is produced via S → ε
- Every CF grammar is expressible in Chomsky normal form!
5Context-free examples
- Classic example: L = { a^n b^n | n ≥ 0 }
- However, M = { a^n b^n c^n | n ≥ 0 } is not context-free! Context-free languages also have a pumping lemma.
- Palindromes read the same forward and backward: "Madam, I'm Adam."
- Even length: w w^R (w^R is the reverse of w)
- Odd length: w x w^R (x is a single symbol)
- S → aSa | bSb | a | b | ε
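The palindrome grammar can be read directly as a recursive membership test. The following is a minimal sketch, assuming the productions are S → aSa | bSb | a | b | ε over the alphabet {a, b}; the function name `derives` is illustrative only.

```python
def derives(s: str) -> bool:
    """Return True if s can be derived from S -> aSa | bSb | a | b | (empty)."""
    # Base cases: the productions S -> a, S -> b and S -> (empty string).
    if s in ("", "a", "b"):
        return True
    # Recursive cases: S -> aSa or S -> bSb strip one matching pair of ends.
    if len(s) >= 2 and s[0] == s[-1] and s[0] in "ab":
        return derives(s[1:-1])
    return False


print(derives("abba"))  # even-length palindrome
print(derives("aba"))   # odd-length palindrome
print(derives("ab"))    # not a palindrome
```

Each recursive call undoes one application of S → aSa or S → bSb, so the depth of the recursion matches the length of the derivation.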
6English as a context free language
- American linguist Noam Chomsky first proposed context free languages in 1956.
- He hoped that it would allow him to define the grammars of ordinary written and spoken languages, but this was not realized.
- However, a few languages, such as Sanskrit and a form of Tamil poetry, may be expressible as context free
7English fragment
- ⟨Sentence⟩ → ⟨Noun Phrase⟩ ⟨Verb Phrase⟩
- ⟨Noun Phrase⟩ → ⟨Cmplx-Noun⟩ | ⟨Cmplx-Noun⟩ ⟨Prep Phrase⟩
- ⟨Verb Phrase⟩ → ⟨Cmplx-Verb⟩ | ⟨Cmplx-Verb⟩ ⟨Prep Phrase⟩
- ⟨Prep Phrase⟩ → ⟨Prep⟩ ⟨Cmplx-Noun⟩
- ⟨Cmplx-Noun⟩ → ⟨Article⟩ ⟨Noun⟩
- ⟨Cmplx-Verb⟩ → ⟨Verb⟩ | ⟨Verb⟩ ⟨Noun Phrase⟩
- ⟨Article⟩ → a | the
- ⟨Noun⟩ → boy | girl | flower
- ⟨Verb⟩ → likes | sees | touches
- ⟨Prep⟩ → with
8Example: a boy sees
- ⟨Sentence⟩ ⇒ ⟨Noun Phrase⟩ ⟨Verb Phrase⟩
- ⇒ ⟨Cmplx-Noun⟩ ⟨Verb Phrase⟩
- ⇒ ⟨Article⟩ ⟨Noun⟩ ⟨Verb Phrase⟩
- ⇒ a ⟨Noun⟩ ⟨Verb Phrase⟩
- ⇒ a boy ⟨Verb Phrase⟩
- ⇒ a boy ⟨Cmplx-Verb⟩
- ⇒ a boy ⟨Verb⟩
- ⇒ a boy sees
- Does this grammar produce an infinite language?
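One way to explore this question is to enumerate what the fragment generates. The sketch below assumes the ten productions of the English fragment, encoded in a hypothetical `GRAMMAR` dictionary; `expand` is an illustrative helper, not part of any library.

```python
from itertools import product

# Hypothetical encoding: each non-terminal maps to its list of alternatives.
GRAMMAR = {
    "SENTENCE": [["NOUN-PHRASE", "VERB-PHRASE"]],
    "NOUN-PHRASE": [["CMPLX-NOUN"], ["CMPLX-NOUN", "PREP-PHRASE"]],
    "VERB-PHRASE": [["CMPLX-VERB"], ["CMPLX-VERB", "PREP-PHRASE"]],
    "PREP-PHRASE": [["PREP", "CMPLX-NOUN"]],
    "CMPLX-NOUN": [["ARTICLE", "NOUN"]],
    "CMPLX-VERB": [["VERB"], ["VERB", "NOUN-PHRASE"]],
    "ARTICLE": [["a"], ["the"]],
    "NOUN": [["boy"], ["girl"], ["flower"]],
    "VERB": [["likes"], ["sees"], ["touches"]],
    "PREP": [["with"]],
}


def expand(symbol):
    """Return the set of all terminal strings derivable from symbol."""
    if symbol not in GRAMMAR:  # terminals expand to themselves
        return {symbol}
    results = set()
    for alternative in GRAMMAR[symbol]:
        # Combine every choice for each symbol on the right-hand side.
        for words in product(*(expand(s) for s in alternative)):
            results.add(" ".join(words))
    return results


sentences = expand("SENTENCE")
print("a boy sees" in sentences)
print(len(sentences))
```

This naive enumeration only finishes because, as encoded here, no non-terminal can be rewritten into a string that contains itself. A recursive rule would make the generated set infinite, and `expand` would then recurse without end.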
9Computer languages
- While not as useful as hoped for human language, context-free languages are extensively used for computer languages.
- E.g., Fortran, Pascal, HTML, etc. all have context-free form.
- In the early 1960s, Chomsky and Evey showed that they were equivalent to a computing model called a pushdown automaton.
10Pushdown Automata
- We need a more powerful model than the finite automata to recognize the context-free languages.
- Context-free languages can be recognised by the so-called Pushdown Automata (PDAs)
- PDAs are like the finite automata, except they also have a stack memory where they can store an arbitrary amount of information.
11PDAs versus finite automata
- Finite automaton: input and control
- Pushdown automaton: input, control and a read/write stack memory (last in, first out: LIFO)
12Stack operations
- We can only interact with the top of the stack
- Basic stack commands
- Push - Put another character onto the stack, pushing the rest down by one
- Pop - Pull off the top element from the stack
- Empty? - Check to see if any symbols are left in the stack
- Read - Look at the top of the stack, but don't change it.
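The four stack commands can be sketched on top of a Python list. This is a minimal illustration; the class and method names are assumptions chosen to mirror the commands, not part of the slides.

```python
class Stack:
    """A LIFO stack exposing only the four basic commands."""

    def __init__(self):
        self._items = []  # the end of the list is the top of the stack

    def push(self, symbol):
        self._items.append(symbol)

    def pop(self):
        return self._items.pop()  # raises IndexError if the stack is empty

    def is_empty(self):
        return not self._items

    def read(self):
        return self._items[-1]  # look at the top without removing it


stack = Stack()
stack.push("X")
stack.push("Y")
print(stack.read())      # top symbol, stack unchanged
print(stack.pop())       # removes the top symbol
print(stack.is_empty())  # one symbol is still left
```

Hiding the list behind these methods enforces the restriction that only the top of the stack can be touched.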
13PDA basics
- Begin with an empty stack, with some marker (e.g. ⊥) to show there are no entries in stack.
- Begin in start state of control automaton.
- At each step, the state, input element and top element in the stack determine what to do next.
- This could include changing states, pushing an element onto the stack or popping an element off the stack.
14Describing a PDA
- In order to build a PDA, you have to answer a number of questions
- What are the states?
- Which of these is the start state?
- Which are the final states?
- What are the input and stack alphabets? They might well be different!
- What is the empty stack symbol?
- Most importantly, what are the allowed transitions?
15Transition functions
- Three inputs: state i, input character a and stack character C
- Decisions
- What is the next state, j?
- What do we do to the stack?
- pop, push(A), or nop (no operation)
- Shorthand: ⟨i, a, C, push(A), j⟩
- or perhaps T(i, a, C) = (push(A), j)
- In a diagram: an arrow from state i to state j labelled a, C / push(A)
16Instantaneous description
- As the PDA reads in the input, it changes states and modifies the stack.
- To describe the process at any instant, we need to keep track of three things
- 1) the present state
- 2) what input characters are left
- 3) what is on the stack
- This is the instantaneous description
- ⟨current state, unconsumed input, stack contents⟩
17Sample transition
- Suppose your PDA has the instantaneous description (stack written with its top symbol first)
- ⟨0, abba, YZ⟩
- And say that you transition to
- ⟨1, bba, XYZ⟩
- This implies the PDA must include a transition like the following
- ⟨0, a, Y, push(X), 1⟩
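A single move can be sketched as a function from one instantaneous description to the next. The code below assumes a transition is the 5-tuple (state, input character, stack top, action, next state) with the action written as the string "pop", "nop" or "push(A)", and that the stack is a string with its top symbol first; `ID` and `step` are illustrative names.

```python
from typing import NamedTuple


class ID(NamedTuple):
    state: int
    remaining: str  # unconsumed input
    stack: str      # stack contents, top symbol first


def step(desc, transition):
    """Apply one transition to desc, or return None if it does not apply."""
    state, char, top, action, next_state = transition
    # The transition applies only if state, next input and stack top all match.
    if desc.state != state or not desc.remaining.startswith(char):
        return None
    if not desc.stack.startswith(top):
        return None
    remaining = desc.remaining[len(char):]
    if action == "pop":
        stack = desc.stack[1:]
    elif action.startswith("push("):
        stack = action[5:-1] + desc.stack  # push(A) puts A on top
    else:  # "nop" leaves the stack alone
        stack = desc.stack
    return ID(next_state, remaining, stack)


print(step(ID(0, "abba", "YZ"), (0, "a", "Y", "push(X)", 1)))
```

An empty string in the input position models a move that consumes no input, since every string starts with the empty string.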
18Example: emptying a stack
- Ignoring any input, take a stack with symbols on it and clear it.
- Plan
- If there is an X or a Y on the stack, remove it.
- Repeat if necessary
- Stop if the stack shows ⊥, the empty stack symbol.
19Two state PDA
- Two states: 0 (start), 1 (final). Input: a, b. Stack: X, Y, ⊥
- State 0 loops on ε, X / pop and ε, Y / pop
- Move 0 → 1 on ε, ⊥ / nop
- Three transitions
- T1: ⟨0, ε, X, pop, 0⟩
- T2: ⟨0, ε, Y, pop, 0⟩
- T3: ⟨0, ε, ⊥, nop, 1⟩
20Consider the stack XXYY
- Step by step instantaneous description
- ⟨0, ε, XXYY⊥⟩ - Start
- ⟨0, ε, XYY⊥⟩ - T1
- ⟨0, ε, YY⊥⟩ - T1
- ⟨0, ε, Y⊥⟩ - T2
- ⟨0, ε, ⊥⟩ - T2
- ⟨1, ε, ⊥⟩ - T3
- Final state, nothing left on the input, so halt.
21Example: a^n b^n
- Build a PDA which recognises the context free language L = { a^n b^n | n ≥ 0 }
- Plan
- 1) Begin reading in the string, and for each a read, push a Y onto the stack
- 2) On the first b change states, and begin removing one Y from the stack for each b
- 3) If you reach the end of the input and have just cleared the stack, accept the string.
- 4) Otherwise reject (e.g. if the stack runs out before the input: more b's than a's.)
22Example: a^n b^n
- Sample pushdown automaton
- Three states: 0 (start), 1, 2 (final). Input alphabet: a, b. Stack alphabet: Y, ⊥
- State 0 loops on a, ⊥ / push(Y) and a, Y / push(Y)
- Move 0 → 1 on b, Y / pop
- State 1 loops on b, Y / pop
- Moves 0 → 2 and 1 → 2 on ε, ⊥ / nop
23Transition function
- There are 6 allowed transitions for this PDA
- T1: ⟨0, a, ⊥, push(Y), 0⟩
- T2: ⟨0, a, Y, push(Y), 0⟩
- T3: ⟨0, ε, ⊥, nop, 2⟩
- T4: ⟨0, b, Y, pop, 1⟩
- T5: ⟨1, b, Y, pop, 1⟩
- T6: ⟨1, ε, ⊥, nop, 2⟩
- Here, reading ⊥ shows the stack is empty.
24Consider the string aabb
- Step by step instantaneous description
- ⟨0, aabb, ⊥⟩ - Start
- ⟨0, abb, Y⊥⟩ - T1
- ⟨0, bb, YY⊥⟩ - T2
- ⟨1, b, Y⊥⟩ - T4
- ⟨1, ε, ⊥⟩ - T5
- ⟨2, ε, ⊥⟩ - T6
- Final state, nothing left on the input, so accept the string.
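The trace can be reproduced mechanically by searching over instantaneous descriptions. The sketch below is one possible simulator, under stated assumptions: the six transitions are encoded by hand, the character `"$"` stands in for the empty-stack marker, an empty string marks a move that reads no input, and the names `TRANSITIONS`, `FINAL` and `accepts` are illustrative.

```python
from collections import deque

BOTTOM = "$"  # stands in for the empty-stack marker (an assumption)

# (state, input char or "" for an empty move, stack top)
#   -> list of (operation, symbol to push, next state)
TRANSITIONS = {
    (0, "a", BOTTOM): [("push", "Y", 0)],  # T1
    (0, "a", "Y"): [("push", "Y", 0)],     # T2
    (0, "", BOTTOM): [("nop", None, 2)],   # T3
    (0, "b", "Y"): [("pop", None, 1)],     # T4
    (1, "b", "Y"): [("pop", None, 1)],     # T5
    (1, "", BOTTOM): [("nop", None, 2)],   # T6
}
FINAL = {2}


def accepts(word):
    """Breadth-first search over (state, unread input, stack) triples."""
    start = (0, word, BOTTOM)  # stack is a string, top symbol first
    queue, seen = deque([start]), {start}
    while queue:
        state, rest, stack = queue.popleft()
        if state in FINAL and rest == "":
            return True
        # Try an empty move and, if input remains, a move on the next character.
        for char in {"", rest[:1]}:
            for op, symbol, nxt in TRANSITIONS.get((state, char, stack[:1]), []):
                if op == "push":
                    new_stack = symbol + stack
                elif op == "pop":
                    new_stack = stack[1:]
                else:
                    new_stack = stack
                desc = (nxt, rest[len(char):], new_stack)
                if desc not in seen:
                    seen.add(desc)
                    queue.append(desc)
    return False


for w in ["aabb", "", "aab", "aba", "abb"]:
    print(repr(w), accepts(w))
```

Exploring every reachable description handles the non-deterministic choice between reading a character and taking an empty move; the search is finite here because this particular machine has no loop of empty moves.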
25Rejecting strings
- Here we have used a similar convention as we did previously for NFAs, in that not all possible paths are shown.
- A string is rejected if
- It goes through the entire input without reaching a final state - aab
- It reaches an instantaneous description for which there are no transitions - aba
- It attempts to pop the empty stack - abb
26Determinism vs non-determinism
- As was the case for finite automata, PDAs can be either deterministic or non-deterministic
- A deterministic PDA has at most one possible move for every combination of state, input character and stack character
- The example we looked at was non-deterministic because you have two options from the starting configuration
- ⟨0, a, ⊥, push(Y), 0⟩
- ⟨0, ε, ⊥, nop, 2⟩
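The determinism condition can be checked mechanically on a list of transitions. The following is a minimal sketch, assuming each transition is a 5-tuple (state, input character, stack top, action, next state), that an empty string marks a move reading no input, and that `"$"` stands in for the empty-stack marker; `is_deterministic` is a hypothetical helper.

```python
def is_deterministic(transitions):
    """Return True if no configuration ever offers two different moves."""
    seen = set()
    for state, char, top, _action, _next_state in transitions:
        if (state, char, top) in seen:
            return False  # two moves for the same (state, input, stack top)
        seen.add((state, char, top))
    for state, char, top in seen:
        # An empty move competes with every input-reading move that shares
        # its state and stack top.
        if char == "" and any(
            s == state and t == top and c != "" for s, c, t in seen
        ):
            return False
    return True


anbn = [
    (0, "a", "$", "push(Y)", 0),
    (0, "a", "Y", "push(Y)", 0),
    (0, "", "$", "nop", 2),
    (0, "b", "Y", "pop", 1),
    (1, "b", "Y", "pop", 1),
    (1, "", "$", "nop", 2),
]
print(is_deterministic(anbn))
print(is_deterministic([t for t in anbn if t != (0, "", "$", "nop", 2)]))
```

The first call reports a conflict because state 0 with the marker on top allows both reading an a and taking the empty move. Dropping that empty move removes the conflict, at the cost of no longer accepting the empty string.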
27Example
- Consider another example, to find a PDA to recognise any string over {a, b} which has exactly the same number of a's and b's.
- Plan
- Keep track of the difference between the number of a's and b's you've seen by changing the symbols in the stack.
- Use one symbol (X) if you've seen more a's and another (Y) if you've seen more b's
28Two state PDA
- Two states: 0 (start), 1 (final). Input: a, b. Stack: X, Y, ⊥
- State 0 loops on a, ⊥ / push(X); a, X / push(X); b, X / pop
- State 0 also loops on b, ⊥ / push(Y); b, Y / push(Y); a, Y / pop
- Move 0 → 1 on ε, ⊥ / nop
29Transition function
- There are 7 allowed transitions for this PDA
- T1: ⟨0, a, ⊥, push(X), 0⟩
- T2: ⟨0, a, X, push(X), 0⟩
- T3: ⟨0, a, Y, pop, 0⟩
- T4: ⟨0, b, ⊥, push(Y), 0⟩
- T5: ⟨0, b, Y, push(Y), 0⟩
- T6: ⟨0, b, X, pop, 0⟩
- T7: ⟨0, ε, ⊥, nop, 1⟩
- This is non-deterministic in the same way as the previous example. Whenever you've seen as many a's as b's, you can accept it or look for more symbols
30Consider the string abbbaa
- ⟨0, abbbaa, ⊥⟩ - Start
- ⟨0, bbbaa, X⊥⟩ - T1
- ⟨0, bbaa, ⊥⟩ - T6
- ⟨0, baa, Y⊥⟩ - T4
- ⟨0, aa, YY⊥⟩ - T5
- ⟨0, a, Y⊥⟩ - T3
- ⟨0, ε, ⊥⟩ - T3
- ⟨1, ε, ⊥⟩ - T7
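The stack discipline of this machine can be sketched without a general simulator. The code below assumes X records an unmatched a and Y an unmatched b, and uses an empty Python list in place of the empty-stack marker; it decides acceptance once at the end of the input instead of guessing when to stop, so it is a deterministic rendering of the same idea and not the PDA itself. The function name is illustrative.

```python
def equal_as_and_bs(word):
    """Return True if word over {a, b} has as many a's as b's."""
    stack = []  # an empty list plays the role of the empty-stack marker
    for char in word:
        if char not in "ab":
            return False  # symbol outside the input alphabet
        surplus, opposite = ("X", "Y") if char == "a" else ("Y", "X")
        if stack and stack[-1] == opposite:
            stack.pop()            # cancels one earlier unmatched letter
        else:
            stack.append(surplus)  # records one more unmatched letter
    return not stack  # accept only if nothing is left unmatched


for w in ["abbbaa", "ba", "aab", ""]:
    print(repr(w), equal_as_and_bs(w))
```

At any moment the stack holds only X's or only Y's, and its height is the current difference between the two counts.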
31PDAs and context free languages
- The context free languages are exactly the languages that are accepted by non-deterministic pushdown automata!
- This can be shown by construction
- A context-free grammar can be found which generates the language accepted by any PDA.
- A PDA can be generated which accepts any given context-free language
32Does determinism matter?
- Recall that for finite automata, DFAs and NFAs accepted the same (regular) languages
- Does the same hold true for the deterministic and non-deterministic PDAs?
- NO! In fact, deterministic PDAs cannot recognise the whole family of context-free languages.
33Example: Palindromes
- Design a PDA which recognises even length palindromes L = { w w^R | w ∈ {a, b}* }
- Plan
- Read in a string and save it to the stack.
- At each step, consider the possibility you might have reached the middle.
- Once reaching the midpoint, start working back, removing things from stack if they match what was saved.
34Palindrome PDA
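- Three states: 0 (start), 1, 2 (final)
- State 0 (load stack): loops on a, ? / push(X) and b, ? / push(Y), where ? is any of X, Y, ⊥
- Each step, test if you are in the middle of the string: move 0 → 1 on ε, X / nop or ε, Y / nop
- State 1 (reverse out): loops on a, X / pop and b, Y / pop
- Move 1 → 2 on ε, ⊥ / nop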
35Consider the string aabbaa
- ⟨0, aabbaa, ⊥⟩ - Start
- ⟨0, abbaa, X⊥⟩ - Load stack
- ⟨0, bbaa, XX⊥⟩ - Load stack
- ⟨0, baa, YXX⊥⟩ - Load stack
- ⟨1, baa, YXX⊥⟩ - Is this middle?
- ⟨1, aa, XX⊥⟩ - Pop stack
- ⟨1, a, X⊥⟩ - Pop stack
- ⟨1, ε, ⊥⟩ - Pop stack
- ⟨2, ε, ⊥⟩ - Done
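The guess of the midpoint can be imitated in ordinary code by trying every possible position in turn. This is a sketch of the idea only, assuming X is pushed for a and Y for b; it replaces the machine's non-deterministic branch with a loop over candidate middles, and the function name is illustrative.

```python
def accepts_even_palindrome(word):
    """Return True if word = w + reverse(w) for some w over {a, b}."""
    symbol_for = {"a": "X", "b": "Y"}
    if any(c not in symbol_for for c in word):
        return False
    for middle in range(len(word) + 1):
        # Load phase: push one stack symbol per letter before the guess.
        stack = [symbol_for[c] for c in word[:middle]]
        # Reverse-out phase: each remaining letter must match the stack top.
        matched = True
        for c in word[middle:]:
            if stack and stack[-1] == symbol_for[c]:
                stack.pop()
            else:
                matched = False
                break
        if matched and not stack:
            return True  # this guess of the middle worked
    return False


for w in ["aabbaa", "abba", "abab", "aba"]:
    print(repr(w), accepts_even_palindrome(w))
```

The loop needs the whole input in hand so that it can restart from a different guess, which is exactly what a single left-to-right deterministic pass over the input cannot do.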
36DPDAs versus NPDAs
- This was an example of a non-deterministic PDA, because from state 0, it branches, either loading another letter or trying to take letters off.
- This could only be done non-deterministically. A deterministic PDA would need to know when to start removing letters from the stack, but nothing in the input marks the middle, which can be arbitrarily far into the string.
- Thus an NPDA can recognise this language, but a DPDA cannot.
37DPDA language
- DPDAs recognise regular languages and also some which are not regular, but not all of the context free languages
- Regular languages ⊂ DPDA languages ⊂ context-free languages = NPDA languages
38Still to come
- Computing applications
- Compilers, Parsing, Parse trees
- YACC
- Look ahead grammars
- Ambiguity
39Applications of pushdown automata
- One of the most important applications of PDAs is in the construction of compilers
- Compilers must take an ASCII program and translate it into computer commands
- To do this it first must determine
- 1) Does the syntax of the program make sense?
- 2) What is the meaning of the program? What operations does it want done and in what order?
- These are performed by a PDA called a parser.
40Parsing
- To parse is to explain the meaning of a sentence (or string) through its grammatical derivation
- We usually attach a meaning or value to strings in our lives
- e.g., 3 + 4 means 7
- Sometimes the meaning is ambiguous
- 3 - 4 - 2
- Does this mean -3? (3 - 4) - 2
- Or does it mean 1? 3 - (4 - 2)
41Syntax and meaning
- We would know which if we knew how the string was derived, so a string's meaning is determined by its derivation
- In some languages, each string has only one possible parse tree, so the meaning is unambiguous
- A grammar is ambiguous if some string has two different parse trees (equivalently, two different leftmost derivations)
42Example: arithmetic expressions
- Consider a grammar fragment for simple arithmetic expressions
- E → E - E | a | b
- This grammar is ambiguous. For example, the string a - b - a has two distinct leftmost derivations
- E ⇒ E - E ⇒ a - E ⇒ a - E - E ⇒ a - b - E ⇒ a - b - a
- E ⇒ E - E ⇒ E - E - E ⇒ a - E - E ⇒ a - b - E ⇒ a - b - a
43Example: numerical expressions
These correspond to different parse trees and different meanings
- (a - b) - a
- a - (b - a)
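The different parse trees can be enumerated by choosing which operator sits at the root. The sketch below assumes a grammar of the shape E → E op E | operand with subtraction as the binary operator (the operator is an assumption), and a well-formed token list that alternates operands and operators; `parse_trees` is an illustrative name.

```python
def parse_trees(tokens):
    """Return every parse of tokens as a fully bracketed string."""
    if len(tokens) == 1:
        return [tokens[0]]  # a single operand has exactly one tree
    trees = []
    # Operators sit at the odd positions; pick the one applied last (the root).
    for i in range(1, len(tokens), 2):
        for left in parse_trees(tokens[:i]):
            for right in parse_trees(tokens[i + 1:]):
                trees.append(f"({left} {tokens[i]} {right})")
    return trees


print(parse_trees(["a", "-", "b", "-", "a"]))

# eval is used only because the strings are built by this sketch itself.
for tree in parse_trees(["3", "-", "4", "-", "2"]):
    print(tree, "=", eval(tree))
```

With numbers in place of the letters, the two bracketings evaluate to different values, which is why the choice of parse tree matters for meaning.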
44Implementation of Parsers
- Parsers construct a derivation tree from a program in a similar way
- They are implemented by deterministic PDAs which accept a subset of the CFLs
- The grammars they use are often those called look-ahead grammars, LR(k), which can parse from the bottom up by looking ahead at most k symbols of the input
- In Unix, there are tools which convert grammars into parsing programs
- YACC - Yet Another Compiler Compiler
- BISON