Title: CS 3240: Languages and Computation
1CS 3240 Languages and Computation
- Chomsky Normal Form and Pushdown Automata
2Review
- A context-free grammar (V, Σ, R, S) is a grammar where all rules are of the form A → x, with A ∈ V and x ∈ (V ∪ Σ)*
- A string is accepted if there is a derivation from S to the string
- Representation of derivation by ⇒ or parse trees
- Left-most and right-most derivations
3Review Ambiguity
- A string w ∈ L(G) is derived ambiguously if it has more than one derivation tree (or, equivalently, more than one leftmost derivation or more than one rightmost derivation).
- A grammar is ambiguous if some string is derived ambiguously.
- Some languages are inherently ambiguous: every grammar that generates them is ambiguous.
4Resolution of Ambiguities
- Some ambiguities are inessential, but others must be resolved
- The following grammar is ambiguous
- exp → exp op exp | ( exp ) | number
- op → + | - | *
- Sample ambiguous strings: 1+2*3 and 1-2-3
- Resolution of ambiguity
- Precedence: * has higher precedence than + and -
- Left-association: perform ops from left to right
- Full parenthesization
- Can we revise the grammar rules to incorporate these techniques?
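To make the ambiguity concrete, the sketch below counts the parse trees that the grammar exp → exp op exp | ( exp ) | number gives to a token list. It assumes the operators are +, - and * and that numbers are digit strings; the function name count_parse_trees is only illustrative.

```python
from functools import lru_cache

OPS = ("+", "-", "*")  # assumed operator set for the ambiguous grammar

def count_parse_trees(tokens):
    """Count parse trees of exp -> exp op exp | ( exp ) | number."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways exp can derive tokens[i:j].
        if j <= i:
            return 0
        if j - i == 1:
            return 1 if tokens[i].isdigit() else 0
        total = 0
        # exp -> ( exp )
        if tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)
        # exp -> exp op exp, trying every operator position as the root
        for k in range(i + 1, j - 1):
            if tokens[k] in OPS:
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

print(count_parse_trees(["1", "-", "2", "-", "3"]))
```

A count above 1 means the string is derived ambiguously; a single number has exactly one tree.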
5Resolution of Ambiguities
- Precedence: group operators into different groups; make operations with lower precedence closer to the root
- exp → exp addop exp | term
- addop → + | -
- term → term mulop term | factor
- mulop → *
- factor → ( exp ) | number
- Associativity: allow recursion only on the left
- exp → exp addop term | term
- term → term mulop factor | factor
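As a rough illustration of what the revised grammar buys us, here is a small evaluator that follows the rules exp → exp addop term | term and term → term mulop factor | factor. It assumes addop is + or - and mulop is *, and it writes each left-recursive rule as a loop, which is a common implementation trick rather than something the slides prescribe.

```python
import re

def tokenize(text):
    # Numbers, parentheses, and the assumed operators +, -, *.
    return re.findall(r"\d+|[()+\-*]", text)

def parse_exp(tokens, pos):
    # exp -> exp addop term | term (left recursion written as a loop)
    value, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        value = value + right if op == "+" else value - right
    return value, pos

def parse_term(tokens, pos):
    # term -> term mulop factor | factor
    value, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "*":
        right, pos = parse_factor(tokens, pos + 1)
        value = value * right
    return value, pos

def parse_factor(tokens, pos):
    # factor -> ( exp ) | number
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "(":
        value, pos = parse_exp(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return value, pos + 1
    return int(tokens[pos]), pos + 1

def evaluate(text):
    tokens = tokenize(text)
    value, pos = parse_exp(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return value

print(evaluate("1-2-3"), evaluate("1+2*3"))
```

Because multiplication sits lower in the tree than addition and each loop folds from the left, 1-2-3 is grouped as (1-2)-3 and 1+2*3 as 1+(2*3).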
6Dangling Else Statement
- Ambiguous grammar
- statement → if-stmt | other
- if-stmt → if ( exp ) statement | if ( exp ) statement else statement
- exp → ...
- Resolution
- Bracketing with endif (e.g., shell script)
- Revise the grammar
- statement → matched-stmt | unmatched-stmt
- matched-stmt → if ( exp ) matched-stmt else matched-stmt | other
- unmatched-stmt → if ( exp ) statement | if ( exp ) matched-stmt else unmatched-stmt
- In practice, compilers typically use disambiguating rules instead of changing grammar rules
7Chomsky normal form
- Method of simplifying a CFG
- Definition: A context-free grammar is in Chomsky normal form if every rule is of one of the following forms
- A → BC
- A → a
- where a is any terminal, A is any variable, and B and C are any variables other than the start variable
- If S is the start variable, then the rule S → ε is the only permitted ε rule
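The definition can be turned into a short check. The sketch below assumes a grammar is stored as a dictionary from each variable to a list of right-hand sides (tuples of symbols), with any symbol that is not a key treated as a terminal; the name is_cnf and this representation are illustrative choices.

```python
def is_cnf(rules, start):
    """Return True if every rule has one of the Chomsky normal form shapes."""
    variables = set(rules)
    for head, bodies in rules.items():
        for body in bodies:
            if len(body) == 0:
                # Only the start variable may have an epsilon rule.
                if head != start:
                    return False
            elif len(body) == 1:
                # A -> a: the single symbol must be a terminal.
                if body[0] in variables:
                    return False
            elif len(body) == 2:
                # A -> BC: both are variables and neither is the start variable.
                if any(s not in variables or s == start for s in body):
                    return False
            else:
                return False
    return True

grammar = {"S": [("A", "B"), ()], "A": [("a",)], "B": [("b",)]}
print(is_cnf(grammar, "S"))
```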
8CFGs and Chomsky normal form
- Theorem: Any context-free language is generated by a context-free grammar in Chomsky normal form.
- Proof idea: Convert any CFG to one in Chomsky normal form by removing or replacing rules in the wrong form
- Add a new start symbol
- Eliminate ε rules of the form A → ε
- Eliminate unit rules of the form A → B
- Convert remaining rules into proper form
- Example (work out on whiteboard)
- S → ASA | aB
- A → B | S
- B → b | ε
9Convert a CFG to Chomsky normal form
- Add a new start symbol
- Create the following new rule
- S0 → S
- where S is the start symbol and S0 is not used
in the CFG
10Convert a CFG to Chomsky normal form
- Eliminate all ε rules A → ε, where A is not the start variable
- For each rule with an occurrence of A on the right-hand side, add a new rule with that occurrence of A deleted
- R → uAv becomes R → uAv | uv
- R → uAvAw becomes R → uAvAw | uvAw | uAvw | uvw
- If we have R → A, add R → ε unless we had already removed R → ε
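The step above can be sketched in code. This version assumes rules are stored as a set of (head, body) pairs, where body is a tuple of symbols and the empty tuple stands for ε; the function name and representation are illustrative, not part of the lecture.

```python
from itertools import product

def eliminate_epsilon(rules, start):
    """Remove every rule A -> epsilon with A different from the start variable."""
    rules = set(rules)
    removed = set()  # variables whose epsilon rule has already been removed
    while True:
        nullable = next((h for h, b in sorted(rules) if b == () and h != start), None)
        if nullable is None:
            return rules
        rules.discard((nullable, ()))
        removed.add(nullable)
        for head, body in list(rules):
            spots = [i for i, s in enumerate(body) if s == nullable]
            # Try every way of keeping or deleting each occurrence of the variable.
            for keep in product([True, False], repeat=len(spots)):
                dropped = {i for i, k in zip(spots, keep) if not k}
                new_body = tuple(s for i, s in enumerate(body) if i not in dropped)
                if new_body == () and head in removed:
                    continue  # do not bring back an epsilon rule removed earlier
                rules.add((head, new_body))

example = {
    ("S0", ("S",)),
    ("S", ("A", "S", "A")), ("S", ("a", "B")),
    ("A", ("B",)), ("A", ("S",)),
    ("B", ("b",)), ("B", ()),
}
for head, body in sorted(eliminate_epsilon(example, "S0")):
    print(head, "->", " ".join(body))
```

The example is the whiteboard grammar with the new start symbol S0 already added. Removing B → ε introduces A → ε, which is then removed in the next round.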
11Convert a CFG to Chomsky normal form
- Eliminate all unit rules of the form A → B
- For each rule B → u, add a new rule A → u, where u is a string of terminals and variables, unless A → u is a unit rule that had already been removed
- Repeat until all unit rules have been replaced
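A minimal sketch of unit-rule elimination follows. It assumes the same hypothetical representation as a set of (head, body) pairs with tuple bodies, plus an explicit set of variables so that a one-symbol body can be recognised as a unit rule.

```python
def eliminate_unit_rules(rules, variables):
    """Remove every rule A -> B where B is a single variable."""
    rules = set(rules)
    removed = set()  # unit rules already eliminated; never add them back
    while True:
        unit = next(((h, b) for h, b in sorted(rules)
                     if len(b) == 1 and b[0] in variables), None)
        if unit is None:
            return rules
        rules.discard(unit)
        removed.add(unit)
        head, (target,) = unit
        # For each rule target -> u, add head -> u.
        for h, body in list(rules):
            if h == target and (head, body) not in removed:
                rules.add((head, body))

example = {
    ("S0", ("S",)),
    ("S", ("A", "S", "A")), ("S", ("a", "B")), ("S", ("a",)),
    ("S", ("S", "A")), ("S", ("A", "S")), ("S", ("S",)),
    ("A", ("B",)), ("A", ("S",)),
    ("B", ("b",)),
}
for head, body in sorted(eliminate_unit_rules(example, {"S0", "S", "A", "B"})):
    print(head, "->", " ".join(body))
```

The input here is the whiteboard grammar after the new start symbol has been added and the ε rules removed.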
12Convert a CFG to Chomsky normal form
- Convert remaining rules into proper form
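- What's left?
- Replace each rule A → u_1 u_2 ... u_k, where k ≥ 3 and each u_i is a variable or a terminal, with k-1 rules
- A → u_1 A_1, A_1 → u_2 A_2, ..., A_{k-2} → u_{k-1} u_k
- Replace any terminal u_i in these rules (and in rules with k = 2) with a new variable U_i and add the rule U_i → u_i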
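The final step can be sketched as follows, again over a hypothetical set of (head, body) pairs. The fresh variable names (U_a for a terminal a, and head_1, head_2, ... for the chain) are assumptions made for this example and are assumed not to clash with existing variables.

```python
def to_binary_rules(rules, variables):
    """Split bodies of length >= 3 and lift terminals out of bodies of length >= 2."""
    result = set()
    counter = 0
    terminal_var = {}  # terminal -> hypothetical fresh variable name
    for head, body in sorted(rules):
        if len(body) < 2:
            result.add((head, body))
            continue
        # Replace each terminal a by a fresh variable U_a with the rule U_a -> a.
        symbols = []
        for s in body:
            if s in variables:
                symbols.append(s)
            else:
                terminal_var.setdefault(s, "U_" + s)
                result.add((terminal_var[s], (s,)))
                symbols.append(terminal_var[s])
        # A -> u_1 A_1, A_1 -> u_2 A_2, ..., A_{k-2} -> u_{k-1} u_k
        current = head
        while len(symbols) > 2:
            counter += 1
            fresh = f"{head}_{counter}"
            result.add((current, (symbols[0], fresh)))
            current, symbols = fresh, symbols[1:]
        result.add((current, tuple(symbols)))
    return result

example = {("S", ("A", "S", "A")), ("S", ("a", "B")), ("B", ("b",))}
for head, body in sorted(to_binary_rules(example, {"S", "A", "B"})):
    print(head, "->", " ".join(body))
```

A body of length k with only variables yields exactly k-1 rules, matching the count on the slide.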
13Chomsky Hierarchy
- Chomsky Hierarchy of grammars
- Type 0: unrestricted grammars
- Type 1: context-sensitive grammars
- Type 2: context-free grammars
- Type 3: regular grammars
- Named after Noam Chomsky, who made seminal contributions to the field of theoretical linguistics in the 60s (cf. Chomsky hierarchy of languages).
- Noam Chomsky (1928- ): linguist / political theorist
14Pushdown automata
- Similar to finite automata, but for CFGs
- Finite automata are not adequate for CFGs because we cannot keep track of what we've done
- At any point, we only know the current state, not previous states
- Need memory
- PDAs are finite automata with a stack
15Finite automata and PDA schematics
[Figure: schematic of an FA (state control) next to a PDA (state control plus a stack)]
- Stack: infinite LIFO (last in, first out) device
16Example
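[Figure: state diagram of a PDA whose transitions are labelled as follows]
- read ε, push $ on stack
- read 0, push 0 on stack
- read ε, push ε on stack
- read 1, pop 0 off stack
- read ε, pop $ off stack
Language accepted: {0^n 1^n | n ≥ 0}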
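The behaviour of this PDA can be imitated with an explicit stack. The sketch below is a deterministic shortcut rather than a general nondeterministic PDA simulator, and it assumes the bottom-of-stack marker is written $.

```python
def accepts_0n1n(text):
    """Mimic the PDA for 0^n 1^n with a Python list as the stack."""
    stack = ["$"]              # read nothing, push the bottom marker $
    i = 0
    while i < len(text) and text[i] == "0":
        stack.append("0")      # read 0, push 0
        i += 1
    while i < len(text) and text[i] == "1" and stack[-1] == "0":
        stack.pop()            # read 1, pop 0
        i += 1
    # Pop $ and accept only if all input was read and just the marker remains.
    return i == len(text) and stack == ["$"]

for word in ["", "01", "0011", "001", "011", "10"]:
    print(repr(word), accepts_0n1n(word))
```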
17Differences between PDAs and NFAs
- Transitions read a symbol of the string and push a symbol onto or pop a symbol off of the stack
- Stack alphabet is not necessarily the same as the alphabet for the language
- e.g., $ marks bottom of stack in previous (0^n 1^n) example
18Equivalence of PDAs and CFGs
- Theorem: A language is context free if and only if some pushdown automaton recognizes it
- Proved in two lemmas: one for the "if" direction and one for the "only if" direction
- We will only do the "only if" step, i.e., show that every context-free language has an associated PDA
19CFGs are recognized by PDAs
- Lemma: If a language is context free, then some pushdown automaton recognizes it
- Proof idea: Construct a PDA following CFG rules
20Constructing the PDA
- You can read any symbol in Σ when that symbol is at the top of the stack
- Transitions of the form a, a → ε
- The rules will be pushed onto the stack: when a variable A is on top of the stack and there is a rule A → w, you pop A and push w
- You can go to the accept state only if the stack is empty
21Idea of PDA construction for A → xBz
22Actual construction for A → xBz
In an abuse of notation, we say δ(q, ε, A) = (q, xBz)
23Constructing the PDA
- Q = {qstart, qloop, qaccept} ∪ E, where E is the set of states used for pushing replacement rules onto the stack
- Σ (the PDA alphabet) is the set of terminals in the CFG
- Γ (the stack alphabet) is the union of the terminals and the variables and $ (or some suitable placeholder)
24Constructing the PDA
- δ consists of several rules
- δ(qstart, ε, ε) = (qloop, S$)
- Start with placeholder on the stack and with the start variable
- δ(qloop, a, a) = (qloop, ε) for every a ∈ Σ
- Terminals may be read off the top of the stack
- δ(qloop, ε, A) = (qloop, w) for every rule A → w
- Implement replacement rules
- δ(qloop, ε, $) = (qaccept, ε)
- Accept when the stack is empty
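The construction can be tried out with a breadth-first search over PDA configurations. This is only a sketch under stated assumptions: every symbol is a single character, $ is the placeholder, and no rule has an empty body, so a stack holding more symbols than the unread input can safely be pruned. The function name is hypothetical.

```python
from collections import deque

def cfg_to_pda_accepts(rules, start, text):
    """Explore the qloop configurations of the PDA built from a CFG.

    rules maps each variable to a list of bodies (strings of one-character symbols).
    """
    # Configuration: (number of input symbols read, stack as a string, top at left).
    first = (0, start + "$")  # effect of the move from qstart to qloop
    seen = {first}
    queue = deque([first])
    while queue:
        pos, stack = queue.popleft()
        top, rest = stack[0], stack[1:]
        if top == "$":
            if pos == len(text):
                return True   # pop $ and move to qaccept with all input read
            continue
        if top in rules:
            # Replacement move: pop variable A, push w for each rule A -> w.
            successors = [(pos, body + rest) for body in rules[top]]
        elif pos < len(text) and text[pos] == top:
            # Matching move: read terminal a while a is on top of the stack.
            successors = [(pos + 1, rest)]
        else:
            successors = []
        for config in successors:
            # Prune stacks that need more input than is left (not counting $).
            if len(config[1]) - 1 <= len(text) - config[0] and config not in seen:
                seen.add(config)
                queue.append(config)
    return False

parens = {"S": ["SS", "(S)", "()"]}
print(cfg_to_pda_accepts(parens, "S", "(()())"))
```

The search tries all nondeterministic choices of replacement rule, which is what the single PDA does implicitly. The pruning step is what keeps the search finite for grammars such as S → SS.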
25CFGs are recognized by PDAs
26Example
27Example (slides 27-38 step through one computation)
- Grammar: S → SS | (S) | ()
- PDA transitions
- qstart to qloop: ε, ε → S$
- qloop to qloop: ε, S → SS; ε, S → (S); ε, S → (); (, ( → ε; ), ) → ε
- qloop to qaccept: ε, $ → ε
- Computation on input (()()), with the stack written top at left, above the $ marker
- Slide 27 | input read: (none) | stack: S
- Slide 28 | input read: (none) | stack: ( S )
- Slide 29 | input read: ( | stack: S )
- Slide 30 | input read: ( | stack: S S )
- Slide 31 | input read: ( | stack: ( ) S )
- Slide 32 | input read: (( | stack: ) S )
- Slide 33 | input read: (() | stack: S )
- Slide 34 | input read: (() | stack: ( ) )
- Slide 35 | input read: (()( | stack: ) )
- Slide 36 | input read: (()() | stack: )
- Slide 37 | input read: (()()) | stack: (empty)
- Slide 38 | input read: (()()) | accept in qaccept
39Test
- Test 1 will be 3-3:55pm on Tuesday, June 6th (second half of the class will be lecture)
- Test will cover regular languages and context-free languages, including everything we discussed in class except for the proof of DFA → RE
- Test will be closed-book and closed-note, but you can bring a one-page cheat sheet.
- Prepare the cheat sheet yourself! Reorganizing the material by yourself is a very effective way of learning!