Title: Pushdown Automata
Chapter 7
Context Free Languages
- A context-free grammar is a simple recursive way of specifying grammar rules by which strings of a language can be generated.
- All regular languages, and some non-regular languages, can be generated by context-free grammars.
Context Free Languages
- Regular Languages are represented by regular expressions
- Context Free Languages are represented by a context-free grammar
Context Free Languages
- Regular Languages are accepted by deterministic finite automata (DFAs).
- Context Free Languages are accepted by pushdown automata, which are non-deterministic finite state automata with a stack as auxiliary memory.
- Note that deterministic pushdown automata can accept some, but not all, of the context-free languages.
Definition
- A context-free grammar (CFG) is a 4-tuple G = (V, T, S, P), where V and T are disjoint sets, S ∈ V, and P is a finite set of rules of the form A → x, where A ∈ V and x ∈ (V ∪ T)*.
- V = non-terminals or variables
- T = terminals
- S = Start symbol
- P = Productions or grammar rules
Example
- Let G be the CFG having productions
- S → aSa | bSb | c
- Then G will generate the language
- L = {xcx^R | x ∈ {a, b}*}
- This is the language of odd palindromes: palindromes with a single isolated character in the middle.
Memory
- What kind of memory do we need to be able to recognize strings in L, using a single left-to-right pass?
- Example: aaabbcbbaaa
- We need to remember what we saw before the c.
- We could push the first part of the string onto a stack and, when the c is encountered, start popping characters off the stack and matching them against each character from the center of the string to the end.
- If everything matches, this string is an odd palindrome.
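The push-then-pop idea above can be sketched in a few lines of Python. This is only an illustration, not a PDA definition: the function name is_marked_palindrome is hypothetical, and a Python list plays the role of the stack.

```python
def is_marked_palindrome(s: str) -> bool:
    """Recognize x c x^R, with x over {a, b}, in one left-to-right pass."""
    stack = []
    chars = iter(s)
    # Phase 1: push everything that comes before the center marker c.
    for ch in chars:
        if ch == "c":
            break
        if ch not in "ab":
            return False
        stack.append(ch)
    else:
        return False  # the input ended without a center marker
    # Phase 2: pop one stack character for each remaining input character.
    for ch in chars:
        if not stack or stack.pop() != ch:
            return False
    return not stack  # everything pushed must have been matched


print(is_marked_palindrome("aaabbcbbaaa"))
print(is_marked_palindrome("abcab"))
```

The first call matches every character after the c against the stack, so it succeeds; the second fails on the first comparison after the c.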
Counting
- We can use a stack for counting out equal numbers of a's and b's on different sides of a center marker.
- Example: L = {a^n c b^n}, for instance aaaacbbbb
- Push the a's onto the stack until you see a c, then pop an a off and match it with a b whenever you see a b.
- If we finish processing the string successfully (and there are no more a's on our stack), then the string belongs to L.
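A minimal Python sketch of this counting scheme follows. The function name is_an_c_bn is hypothetical, and the sketch assumes that n = 0 is allowed, so the string c by itself is accepted.

```python
def is_an_c_bn(s: str) -> bool:
    """Check that s has the form a^n c b^n, using a stack as the counter."""
    stack = []
    i = 0
    # Push one a per leading a.
    while i < len(s) and s[i] == "a":
        stack.append("a")
        i += 1
    # The center marker must come next.
    if i == len(s) or s[i] != "c":
        return False
    i += 1
    # Pop one a for every b that follows.
    while i < len(s) and s[i] == "b" and stack:
        stack.pop()
        i += 1
    # Accept only if the input and the stack run out together.
    return i == len(s) and not stack


print(is_an_c_bn("aaaacbbbb"))
print(is_an_c_bn("aacb"))
```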
Definition 7.1 Pushdown Automaton
- A nondeterministic pushdown automaton (NPDA) is a 7-tuple M = (Q, Σ, Γ, δ, q0, z, F), where
- Q is a finite set of states
- Σ is the input alphabet (a finite set)
- Γ is the stack alphabet (a finite set)
- δ: Q × (Σ ∪ {λ}) × Γ → (finite subsets of Q × Γ*) is the transition function
- q0 ∈ Q is the start state
- z ∈ Γ is the initial stack symbol
- F ⊆ Q is the set of accepting states
Production rules
- So we can fully specify any NPDA like this
- Q = {q0, q1, q2, q3}
- Σ = {a, b}
- Γ = {0, 1}
- q0 is the start state
- z is the initial stack symbol (the empty stack marker)
- F = {q3}
- δ is the transition function
Production rules
- δ(q0, a, z) = {(q1, 1z), (q3, λ)}
- δ(q0, λ, z) = {(q3, λ)}
- δ(q1, a, 1) = {(q1, 11)}
- δ(q1, b, 1) = {(q2, λ)}
- δ(q2, b, 1) = {(q2, λ)}
- δ(q2, λ, z) = {(q3, λ)}
- This PDA is nondeterministic. Why?
Production rules
- Note that in an FSA, each rule told us that when we were in a given state and saw a specific character, we moved to a specified state.
- In a PDA, we also need to know what is on top of the stack before we can decide what new state to move to. After moving to the new state, we also need to decide what to do with the stack.
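To make the role of the stack in each rule concrete, here is a small Python sketch that simulates the six-rule NPDA listed earlier by searching over all configurations. It is an illustration under stated assumptions: the empty-stack marker is written "z", the first rule is assumed to push 1 on top of the marker (written "1z"), "" stands for λ, and the names DELTA, FINAL and accepts are hypothetical. The search may not terminate for PDAs whose λ-moves can grow the stack without bound.

```python
from collections import deque

LAMBDA = ""  # stands for the empty string λ

# (state, input symbol or λ, top of stack) -> set of (new state, pushed string).
# The pushed string replaces the popped top symbol; its leftmost symbol ends on top.
DELTA = {
    ("q0", "a", "z"): {("q1", "1z"), ("q3", LAMBDA)},
    ("q0", LAMBDA, "z"): {("q3", LAMBDA)},
    ("q1", "a", "1"): {("q1", "11")},
    ("q1", "b", "1"): {("q2", LAMBDA)},
    ("q2", "b", "1"): {("q2", LAMBDA)},
    ("q2", LAMBDA, "z"): {("q3", LAMBDA)},
}
FINAL = {"q3"}


def accepts(w, delta=DELTA, start="q0", bottom="z", final=FINAL):
    """Breadth-first search over configurations (state, unread input, stack)."""
    seen = set()
    queue = deque([(start, w, bottom)])
    while queue:
        config = queue.popleft()
        if config in seen:
            continue
        seen.add(config)
        state, unread, stack = config
        if not unread and state in final:
            return True  # acceptance by final state
        if not stack:
            continue  # nothing to pop, so no move is possible
        top, rest = stack[0], stack[1:]
        # λ-moves consume no input.
        for new_state, push in delta.get((state, LAMBDA, top), ()):
            queue.append((new_state, unread, push + rest))
        # Ordinary moves consume one input character.
        if unread:
            for new_state, push in delta.get((state, unread[0], top), ()):
                queue.append((new_state, unread[1:], push + rest))
    return False


for word in ["", "a", "ab", "aabb", "aab", "b"]:
    print(repr(word), accepts(word))
```

Trying every available move in turn is one way to model the "guess" a nondeterministic machine makes: the input is accepted if at least one sequence of choices ends in an accepting state with the input used up.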
Working with a stack
- You can only access the top element of the stack.
- To access the top element of the stack, you have to POP it off the stack.
- Once the top element of the stack has been POPped, if you want to save it, you need to PUSH it back onto the stack immediately.
- Characters from the input string must be read one character at a time. You cannot back up.
- The current configuration of the machine includes the current state, the remaining characters left in the input string, and the entire contents of the stack.
L = {a^n b^n | n ≥ 0} ∪ {a}
- In the previous example we had two key transitions:
- δ(q1, a, 1) = {(q1, 11)}, which adds a 1 to the stack when an a is read
- δ(q1, b, 1) = {(q2, λ)}, which removes a 1 when a b is encountered
- We also have the rule δ(q0, a, z) = {(q1, 1z), (q3, λ)}, which allows us to transition directly to the accepting state, q3, if we initially see an a
Instantaneous description
- Given the transition function
- δ: Q × (Σ ∪ {λ}) × Γ → (finite subsets of Q × Γ*)
- a configuration, or instantaneous description, of M is a snapshot of the current status of the PDA. It consists of a triple
- (q, w, u)
- where
- q ∈ Q (q is the current state of the control unit)
- w ∈ Σ* (w is the remaining unread part of the input string), and
- u ∈ Γ* (u is the current stack contents, with the leftmost symbol indicating the top of the stack)
Instantaneous description
- To indicate that the application of a transition rule has caused our PDA to move from one configuration to another, we use the following notation
- (q1, aw, bx) ⊢ (q2, w, yx)
- To indicate that we have moved from one configuration to another via the application of several rules, we use
- (q1, aw, bx) ⊢* (q2, w, yx)
- or
- (q1, aw, bx) ⊢*_M (q2, w, yx) to indicate a specific PDA
Definition 7.2 Acceptance
- If M = (Q, Σ, Γ, δ, q0, z, F) is a push-down automaton and w ∈ Σ*, the string w is accepted by M if
- (q0, w, z) ⊢*_M (qf, λ, u)
- for some u ∈ Γ* and some qf ∈ F.
- This means that we start at the start state, with only the initial stack symbol on the stack, and after processing the string w, we end up in an accepting state, with no more characters left to process in the original string. We don't care what is left on the stack.
- This is called acceptance by final state.
2 types of acceptance
- An alternate type of acceptance is acceptance by empty stack.
- This means that we start at the start state, with the stack empty, and after processing the string w, we end up with no more characters left to process in the original string, and no more characters (except the empty-stack character) left on the stack.
2 types of acceptance
- The two types of acceptance are equivalent: if we can build a PDA to accept language L via acceptance by final state, we can also build a PDA to accept L via acceptance by empty stack, and vice versa.
Definition 7.2 Acceptance
- A language L ⊆ Σ* is said to be accepted by M if L is precisely the set of strings accepted by M. In this case, we say that L = L(M).
Determinism/non-determinism
- A deterministic PDA must have at most one transition for any given combination of state, input symbol, and stack symbol.
- A non-deterministic PDA (NPDA) may have no transition or several transitions defined for a particular combination of state, input symbol, and stack symbol.
- In an NPDA, there may be several paths to follow to process a given input string. Some of the paths may result in accepting the string. Other paths may end in a non-accepting state. An NPDA can guess which path to follow through the machine in order to accept a string.
Example a^n b c^n
L = {a^n b c^n | n > 0}
[State diagram. q0: loops on a, z / az and a, a / aa. q0 to q1: b, a / a. q1: loop on c, a / λ. q1 to q2: λ, z / z.]
Production rules for a^n b c^n
Rule  State  Input  Top of Stack  Move(s)
1     q0     a      z             (q0, az)
2     q0     a      a             (q0, aa)
3     q0     b      a             (q1, a)
4     q1     c      a             (q1, λ)
5     q1     λ      z             (q2, z)
Example aabcc
[Animation frames of the state diagram for a^n b c^n, showing the unread input and the stack after each move.]
Rule         Resulting state  Unread input  Stack
(initially)  q0               aabcc         z
1            q0               abcc          az
2            q0               bcc           aaz
3            q1               cc            aaz
4            q1               c             az
4            q1               -             z
5            q2               -             z
accept
Example Odd palindrome
L = {xcx^R | x ∈ {a, b}*}
[State diagram. q0: loops on a, z / az; b, z / bz; a, a / aa; b, a / ba; a, b / ab; b, b / bb. q0 to q1: c, z / z; c, a / a; c, b / b. q1: loops on a, a / λ and b, b / λ. q1 to q2: λ, z / z.]
Production rules for Odd palindromes
Rule  State  Input  Top of Stack  Move(s)
1     q0     a      z             (q0, az)
2     q0     b      z             (q0, bz)
3     q0     a      a             (q0, aa)
4     q0     b      a             (q0, ba)
5     q0     a      b             (q0, ab)
6     q0     b      b             (q0, bb)
7     q0     c      z             (q1, z)
8     q0     c      a             (q1, a)
9     q0     c      b             (q1, b)
10    q1     a      a             (q1, λ)
11    q1     b      b             (q1, λ)
12    q1     λ      z             (q2, z)
Processing abcba
Rule         Resulting state  Unread input  Stack
(initially)  q0               abcba         z
1            q0               bcba          az
4            q0               cba           baz
9            q1               ba            baz
11           q1               a             az
10           q1               -             z
12           q2               -             z
accept
Processing ab
Rule         Resulting state  Unread input  Stack
(initially)  q0               ab            z
1            q0               b             az
4            q0               -             baz
crash
Processing acaa
Rule         Resulting state  Unread input  Stack
(initially)  q0               acaa          z
1            q0               caa           az
8            q1               aa            az
10           q1               a             z
12           q2               a             z
crash
Crashing
What is happening in the last example? We process the first 3 letters of acaa and are in state q1. We have an a left to process in our input string. We have the empty-stack marker z as the top character in our stack. Rule 12 says that if we are in state q1 and have z on the stack, then we can make a free move (a λ-move) to q2, pushing z back onto the stack. So this is legal. So far, the automaton is saying that it would accept aca. But note that we are in state q2 and we still have the last a in our input string left to process. There are no rules that apply in state q2. On the next move, when we try to process the a, the automaton will crash, rejecting acaa.
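The three traces above can be reproduced with a short Python sketch of the twelve odd-palindrome rules. The names RULES and run are hypothetical, the empty-stack marker is assumed to be written "z", and "" stands for λ. Since this PDA never has more than one applicable rule, no search is needed.

```python
# (state, input or "", top of stack) -> (rule number, new state, pushed string)
RULES = {
    ("q0", "a", "z"): (1, "q0", "az"),
    ("q0", "b", "z"): (2, "q0", "bz"),
    ("q0", "a", "a"): (3, "q0", "aa"),
    ("q0", "b", "a"): (4, "q0", "ba"),
    ("q0", "a", "b"): (5, "q0", "ab"),
    ("q0", "b", "b"): (6, "q0", "bb"),
    ("q0", "c", "z"): (7, "q1", "z"),
    ("q0", "c", "a"): (8, "q1", "a"),
    ("q0", "c", "b"): (9, "q1", "b"),
    ("q1", "a", "a"): (10, "q1", ""),
    ("q1", "b", "b"): (11, "q1", ""),
    ("q1", "", "z"): (12, "q2", "z"),
}


def run(w):
    """Return (accepted, list of rule numbers applied) for input w."""
    state, unread, stack = "q0", w, "z"
    applied = []
    while stack:
        top = stack[0]
        if unread and (state, unread[0], top) in RULES:
            key, unread = (state, unread[0], top), unread[1:]
        elif (state, "", top) in RULES:
            key = (state, "", top)  # free move: reads no input
        else:
            break  # no rule applies, so the machine halts here
        number, state, push = RULES[key]
        applied.append(number)
        stack = push + stack[1:]
    # Accept only in q2 with the whole input consumed.
    return state == "q2" and not unread, applied


for word in ["abcba", "ab", "acaa"]:
    print(word, run(word))
```

For acaa the run ends in q2 with an a still unread, which is exactly the crash described above.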
Example Even palindromes
Consider the following context-free language: L = {ww^R | w ∈ {a, b}*}. This is the language of all even-length palindromes over {a, b}.
Production rules for Even palindromes
Rule  State  Input  Top of Stack  Move(s)
1     q0     a      z             (q0, az)
2     q0     b      z             (q0, bz)
3     q0     a      a             (q0, aa)
4     q0     b      a             (q0, ba)
5     q0     a      b             (q0, ab)
6     q0     b      b             (q0, bb)
7     q0     λ      z             (q1, z)
8     q0     λ      a             (q1, a)
9     q0     λ      b             (q1, b)
10    q1     a      a             (q1, λ)
11    q1     b      b             (q1, λ)
12    q1     λ      z             (q2, z)
Example Even palindromes
This PDA is non-deterministic. Note moves 7, 8, and 9. Here the PDA is guessing where the middle of the string occurs. As long as some guess leads to acceptance for every string in the language (and the PDA doesn't accept any strings that aren't actually in the language), this is OK.
Example Even palindromes
(q0, baab, z) ⊢ (q0, aab, bz) ⊢ (q0, ab, abz) ⊢ (q1, ab, abz) ⊢ (q1, b, bz) ⊢ (q1, λ, z) ⊢ (q2, λ, z) (accept)
Example All palindromes
Consider the following context-free language: L = pal = {x ∈ {a, b}* | x = x^R}. This is the language of all palindromes, both odd and even, over {a, b}.
Production rules for All palindromes
Rule  State  Input  Top of Stack  Move(s)
1     q0     a      z             (q0, az), (q1, z)
2     q0     b      z             (q0, bz), (q1, z)
3     q0     a      a             (q0, aa), (q1, a)
4     q0     b      a             (q0, ba), (q1, a)
5     q0     a      b             (q0, ab), (q1, b)
6     q0     b      b             (q0, bb), (q1, b)
7     q0     λ      z             (q1, z)
8     q0     λ      a             (q1, a)
9     q0     λ      b             (q1, b)
10    q1     a      a             (q1, λ)
11    q1     b      b             (q1, λ)
12    q1     λ      z             (q2, z)
Production rules for All palindromes
At each point before we start processing the second half of the string, there are three possibilities:
1. The next input character is still in the first half of the string and needs to be pushed onto the stack to save it.
2. The next input character is the middle symbol of an odd-length string and should be read and thrown away (because we don't need to save it to match it up with anything).
3. The next input character is the first character of the second half of an even-length string.
Production rules for All palindromes
Why is this PDA non-deterministic? Note the first 6 rules of this NPDA. In each of these rules, there are two moves that may be chosen, so this PDA is non-deterministic.
Production rules for All palindromes
Each move in a PDA has three pre-conditions: the current state you are in, the next character to be processed from the input string, and the top character on the stack. In rule 1, our current state is q0, the next character in the input string is a, and the top character on the stack is the empty-stack marker z. But there are two possible moves for this one set of preconditions:
1) move back to state q0 and push a onto the stack, or
2) move to state q1 and push z back onto the stack.
Whenever we have multiple moves possible from a given set of preconditions, we have nondeterminism.
Definition 7.3
- Let M = (Q, Σ, Γ, δ, q0, z, F) be a pushdown automaton. M is deterministic if there is no configuration for which M has a choice of more than one move. In other words, M is deterministic if it satisfies both of the following:
- For any q ∈ Q, a ∈ Σ ∪ {λ}, and X ∈ Γ, the set δ(q, a, X) has at most one element.
- For any q ∈ Q and X ∈ Γ, if δ(q, λ, X) ≠ ∅, then δ(q, a, X) = ∅ for every a ∈ Σ.
- A language L is a deterministic context-free language if there is a deterministic PDA (DPDA) accepting L.
Definition 7.3
- If M is deterministic, then multiple moves for a single combination of state, input, and stack symbol are not allowed. That is:
- Given stack symbol Y and input X, there cannot exist another move with the same stack symbol and the same input from the same state.
- There may be λ-moves, BUT if there is a move for input λ and stack symbol X, there cannot exist another move with stack symbol X from the same state.
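The two conditions of Definition 7.3 can be checked mechanically. The sketch below assumes a transition function stored as a Python dict from (state, input or "", stack symbol) to a set of moves, with "" standing for λ and "z" for the empty-stack marker; the function name is_deterministic and the sample dicts are hypothetical.

```python
def is_deterministic(delta, alphabet):
    """Check both conditions of the definition on a dict-based transition function."""
    for (state, symbol, top), moves in delta.items():
        # Condition 1: at most one move per (state, input, stack symbol).
        if len(moves) > 1:
            return False
        # Condition 2: a λ-move rules out every input move on the same
        # state and stack symbol.
        if symbol == "" and moves:
            if any(delta.get((state, a, top)) for a in alphabet):
                return False
    return True


# Rule 1 of the all-palindromes PDA offers two moves, so condition 1 fails.
pal_rule_1 = {("q0", "a", "z"): {("q0", "az"), ("q1", "z")}}

# A λ-move next to an input move on the same state and stack symbol
# fails condition 2.
lambda_clash = {
    ("q0", "", "a"): {("q1", "a")},
    ("q0", "b", "a"): {("q0", "")},
}

# One move per precondition and no clash with the λ-move.
simple = {
    ("q0", "a", "z"): {("q1", "z")},
    ("q1", "", "z"): {("q2", "z")},
}

print(is_deterministic(pal_rule_1, "ab"))
print(is_deterministic(lambda_clash, "ab"))
print(is_deterministic(simple, "ab"))
```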
Non-determinism
- Some PDAs which are initially described in a non-deterministic way can also be described as deterministic PDAs.
- However, some CFLs are inherently non-deterministic, e.g.
- L = pal = {x ∈ {a, b}* | x = x^R} cannot be accepted by any DPDA.
Example
- L = {w ∈ {a, b}* | n_a(w) > n_b(w)}
- This is the set of all strings over the alphabet {a, b} in which the number of a's is greater than the number of b's. This language can be accepted by either an NPDA or a DPDA.
Example (NPDA)
- L = {w ∈ {a, b}* | n_a(w) > n_b(w)}
Rule  State  Input  Top of Stack  Move(s)
1     q0     a      z             (q0, az)
2     q0     b      z             (q0, bz)
3     q0     a      a             (q0, aa)
4     q0     b      b             (q0, bb)
5     q0     a      b             (q0, λ)
6     q0     b      a             (q0, λ)
7     q0     λ      a             (q1, a)
Example (NPDA)
- What is happening in this PDA?
- We start, as usual, in state q0. If the stack is empty, we read the first character and push it onto the stack. Thereafter, if the stack character matches the input character, we push both characters onto the stack. If the input character differs from the stack character, we throw both away. When we run out of characters in the input string, then if the stack still has an a on top, we make a free move to q1 and halt; q1 is the accepting state.
Example (NPDA)
- Why is it non-deterministic?
- Rules 6 and 7 both have the preconditions that the current state is q0 and the stack character is a. But we have two possible moves from here, one of them if the input is a b, and one of them any time we want (a λ-move), including if the input is a b. So we have two different moves allowed under the same preconditions, which means this PDA is non-deterministic.
Example (DPDA)
- L = {w ∈ {a, b}* | n_a(w) > n_b(w)}
Rule  State  Input  Top of Stack  Move(s)
1     q0     a      z             (q1, z)
2     q0     b      z             (q0, bz)
3     q0     a      b             (q0, λ)
4     q0     b      b             (q0, bb)
5     q1     a      z             (q1, az)
6     q1     b      z             (q0, z)
7     q1     a      a             (q1, aa)
8     q1     b      a             (q1, λ)
Example (DPDA)
- What is happening in this PDA?
- Here being in state q1 means we have seen more a's than b's. Being in state q0 means we have not seen more a's than b's. We start in state q0.
- If we are in state q0 and read a b, we push it onto the stack. If we are in state q1 and read an a, we push it onto the stack. Otherwise we don't push a's or b's onto the stack. Any time we read an a from the input string and pop a b from the stack, or vice versa, we throw the pair away and stay in the same state.
- When we run out of characters in the input string, then we halt; q1 is the accepting state.
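A DPDA needs no search, so it can be run with a single loop. The sketch below assumes that rules 5 through 8 apply in state q1 (as the description of q1 suggests), that the empty-stack marker is written "z", and that "" stands for λ; the names DPDA and more_as_than_bs are hypothetical.

```python
# (state, input, top of stack) -> (new state, pushed string)
DPDA = {
    ("q0", "a", "z"): ("q1", "z"),
    ("q0", "b", "z"): ("q0", "bz"),
    ("q0", "a", "b"): ("q0", ""),
    ("q0", "b", "b"): ("q0", "bb"),
    ("q1", "a", "z"): ("q1", "az"),
    ("q1", "b", "z"): ("q0", "z"),
    ("q1", "a", "a"): ("q1", "aa"),
    ("q1", "b", "a"): ("q1", ""),
}


def more_as_than_bs(w):
    """Run the DPDA on w and accept if it halts in q1."""
    state, stack = "q0", "z"
    for ch in w:
        move = DPDA.get((state, ch, stack[0]))
        if move is None:
            return False  # no rule applies (for example, a symbol outside {a, b})
        state, push = move
        stack = push + stack[1:]
    return state == "q1"


print(more_as_than_bs("aab"))
print(more_as_than_bs("abab"))
```

In q0 the stack holds the surplus of b's; in q1 it holds one a fewer than the surplus of a's, which is why the first a read on an empty stack changes the state without pushing anything.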
7.2 PDAs and CFLs
- Theorem 7.1: For any context-free language L, there exists an NPDA M such that L = L(M).
7.2 PDAs and CFLs
- Proof:
- If L is a context-free language (without λ), there exists a context-free grammar G that generates it.
- We can always convert a context-free grammar into Greibach Normal Form.
- We can always construct an NPDA which simulates leftmost derivations in the GNF grammar.
- QED
Greibach Normal Form
- Greibach Normal Form (GNF) for Context-Free Grammars requires the Context-Free Grammar to have only productions of the following form:
- A → ax
- where a ∈ T and x ∈ V*. That is,
- Nonterminal → one Terminal concatenated with a string of 0 or more Nonterminals
- Convert the following Context-Free Grammar to GNF:
- S → abSb | aa
Greibach Normal Form
- S → abSb | aa
- Let's fix S → aa. Get rid of the terminal at the end by changing this to S → aA and creating a new rule, A → a.
- Now let's fix S → abSb. Get rid of bSb by replacing the original rule with S → aX and creating a new rule, X → bSb.
- Unfortunately, this rule itself needs fixing. Replace it with X → bSB and create a new rule, B → b.
Greibach Normal Form
- So, starting with this set of production rules
- S → abSb | aa
- we now have
- S → aA | aX
- X → bSB
- A → a
- B → b
- (other solutions are possible)
7.2 CFG to PDA
To convert a context-free grammar to an equivalent pushdown automaton:
1. Convert the grammar to Greibach Normal Form (GNF).
2. Write a transition rule for the PDA that pushes S (the Start symbol in the grammar) onto the stack.
3. For each production rule in the grammar, write an equivalent transition rule.
4. Write a transition rule that takes the automaton to the accepting state when you run out of characters in the input string and the stack is empty.
5. If the empty string is a legitimate string in the language described by the grammar, write a transition rule that takes the automaton to the accepting state directly from the start state.
7.2 CFG to PDA
How do you write the transition rules? It's really simple:
1. Every rule in the GNF grammar has the following form: one variable → one terminal + 0 or more variables. Example: A → bB
7.2 CFG to PDA
2. The left side of each transition rule is the precondition, a triple that specifies what conditions must be true before you can move to the next state. The precondition consists of the current state, the character just read in from the input string, and the symbol just popped off the top of the stack. So write a transition rule that has as its precondition the current state, the terminal from the grammar rule, and the left-hand variable from the grammar rule.
Our grammar rule: A → bB
The left side of the transition rule: δ(q1, b, A)
(What about the B? See the next slide.)
7.2 CFG to PDA
3. The right side of the transition rule is the post-condition. The post-condition consists of the state to move to, and the symbol(s) to push onto the stack. So the post-condition for this transition rule will be the state to move to, and the variable (or variables) on the right-hand side of the grammar rule.
Example: δ(q1, b, A) = {(q1, B)}
If there are no variables on the right-hand side of the grammar rule, don't push anything onto the stack. In the transition rule, put a λ where you would show what symbol to push onto the stack.
Example: A → a (no variable here) would be represented in transition rule form as δ(q1, a, A) = {(q1, λ)}
7.2 CFG to PDA
How do you know which state to move to? It's really simple:
1. We always start off with this special transition rule: δ(q0, λ, z) = {(q1, Sz)}
This rule says:
a. begin in state q0
b. pop the top of the stack. If it is z (the empty stack symbol), then
c. take a free move to q1 without reading anything from the input string, push z back onto the stack, and then push S (the Start symbol in the grammar) onto the stack.
7.2 CFG to PDA
2. We always end up with this special transition rule: δ(q1, λ, z) = {(q2, z)}
This rule says:
a. begin in state q1
b. pop the top of the stack. If it is z (the empty stack symbol), then
c. take a free move to q2 without reading anything from the input string, and push z back onto the stack.
In order to be in state q1 we previously must have pushed something onto the stack. If we now pop the stack and find the empty stack symbol, it tells us that we have finished processing the string, so we can move on to state q2.
7.2 CFG to PDA
3. Every other transition rule leaves us in state q1.
7.2 CFG to PDA
Here is a grammar in GNF: G = (V, T, S, P), where V = {S, A, B, C}, T = {a, b, c}, S = S, and P =
S → aA
A → aABC | bB | a
B → b
C → c
Let's convert this grammar to a PDA.
7.2 CFG to PDA
Grammar rule  PDA transition rule
(none)        δ(q0, λ, z) ∋ (q1, Sz)
S → aA        δ(q1, a, S) ∋ (q1, A)
A → aABC      δ(q1, a, A) ∋ (q1, ABC)
A → bB        δ(q1, b, A) ∋ (q1, B)
A → a         δ(q1, a, A) ∋ (q1, λ)
B → b         δ(q1, b, B) ∋ (q1, λ)
C → c         δ(q1, c, C) ∋ (q1, λ)
(none)        δ(q1, λ, z) ∋ (q2, z)
So the equivalent PDA can be defined as M = ({q0, q1, q2}, T, V ∪ {z}, δ, q0, z, {q2}), where δ is the set of transition rules given above.
7.2 CFG to PDA
Is this PDA deterministic? Let's group the transition rules together so that all rules with the same precondition are described in a single rule.
1. δ(q0, λ, z) = {(q1, Sz)}
2. δ(q1, a, S) = {(q1, A)}
3. δ(q1, a, A) = {(q1, ABC), (q1, λ)}
4. δ(q1, b, A) = {(q1, B)}
5. δ(q1, b, B) = {(q1, λ)}
6. δ(q1, c, C) = {(q1, λ)}
7. δ(q1, λ, z) = {(q2, z)}
Here we see that rule three has a single precondition but two different possible post-conditions. Thus this PDA is nondeterministic.
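The grouping step can be done mechanically. The following Python sketch builds the grouped transition rules from a GNF grammar; the function name gnf_to_pda, the tuple encoding of productions, and the use of "z" for the empty-stack marker and "" for λ are all assumptions made for illustration.

```python
def gnf_to_pda(productions, start="S", bottom="z"):
    """productions: list of (variable, terminal, string of variables) in GNF."""
    delta = {
        # Special first rule: push the start symbol on top of the marker.
        ("q0", "", bottom): {("q1", start + bottom)},
        # Special last rule: only the marker is left, so move to q2.
        ("q1", "", bottom): {("q2", bottom)},
    }
    for variable, terminal, variables in productions:
        # A -> a B1...Bk becomes: in q1, read a, pop A, push B1...Bk.
        delta.setdefault(("q1", terminal, variable), set()).add(("q1", variables))
    return delta


GNF = [
    ("S", "a", "A"),
    ("A", "a", "ABC"),
    ("A", "b", "B"),
    ("A", "a", ""),
    ("B", "b", ""),
    ("C", "c", ""),
]

delta = gnf_to_pda(GNF)
for precondition in sorted(delta):
    print(precondition, sorted(delta[precondition]))
```

Any precondition that ends up with more than one move, here ("q1", "a", "A"), marks a point of nondeterminism.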
7.2 CFG to PDA
Let's follow the steps that the PDA would go through to process the string aaabc, starting with the initial configuration:
(q0, aaabc, z)
⊢ (q1, aaabc, Sz)    rule 1
⊢ (q1, aabc, Az)     rule 2
⊢ (q1, abc, ABCz)    rule 3, first alternative
⊢ (q1, bc, BCz)      rule 3, second alternative
⊢ (q1, c, Cz)        rule 5
⊢ (q1, λ, z)         rule 6
⊢ (q2, λ, z)         rule 7
Notice that this corresponds to the following leftmost derivation in the grammar:
S ⇒ aA ⇒ aaABC ⇒ aaaBC ⇒ aaabC ⇒ aaabc
7.2 CFG to PDA
In fact, this is exactly what our set of PDA transition rules does. It carries out a leftmost derivation of any string in the language described by the CFG. After each step, the remaining unprocessed part of the sentential form (the as-yet unprocessed variables) is on the stack, as can be seen by looking down the stack column of the trace above. This corresponds precisely to the left-to-right sequence of unprocessed variables in each step of the leftmost derivation given above.
7.2 Alternative Approach to Constructing a PDA from a CFG
Let G = (V, T, S, P) be a context-free grammar. Then there is a push-down automaton M so that L(M) = L(G). Can we generate an NPDA from a CFG without converting to GNF first? Yes.
7.2 Alternative Approach to Constructing a PDA from a CFG
In this approach, the plan is to let the production rules directly reflect the manipulation of the stack implied by the grammar rules. With this method, you do not need to convert to GNF first, but the technique is harder to understand. The beginning and ending production rules are the same as in the GNF method.
7.2 Alternative Approach to Constructing a PDA from a CFG
So we will always need to have the following 2 production rules in our PDA:
δ(q0, λ, z) = {(q1, Sz)} and
δ(q1, λ, z) = {(q2, z)}
7.2 Alternative Approach to Constructing a PDA from a CFG
- The other production rules are derived from the grammar rules:
- If you pop the top of the stack and it is a variable, don't read anything from the input string. Push the right side of the grammar rule involving this variable onto the stack.
- If you pop the top of the stack and it is a terminal, read the next character in the input string. Don't push anything onto the stack.
7.2 Constructing a PDA from a CFG
Given G = (V, T, S, P), construct M = (Q, Σ, Γ, δ, q0, z, F), with
Q = {q0, q1, q2}
Σ = T
Γ = V ∪ Σ ∪ {z}, where z ∉ V ∪ Σ
F = {q2}
δ(q0, λ, z) = {(q1, Sz)}
For A ∈ V, δ(q1, λ, A) = {(q1, α) | A → α is a production in P}
For a ∈ Σ, δ(q1, a, a) = {(q1, λ)}
δ(q1, λ, z) = {(q2, z)}
7.2 Constructing a PDA from a CFG
Language: L = {x ∈ {a, b}* | n_a(x) > n_b(x)}
Context-free grammar: S → a | aS | bSS | SSb | SbS
7.2 Constructing a PDA from a CFG
S → a | aS | bSS | SSb | SbS
Let M = (Q, Σ, Γ, δ, q0, z, F) be a pushdown automaton as previously described. The production rules will be
Rule  State  Input  Top of Stack  Move(s)
1     q0     λ      z             (q1, Sz)
2     q1     λ      S             (q1, a), (q1, aS), (q1, bSS), (q1, SSb), (q1, SbS)
3     q1     a      a             (q1, λ)
4     q1     b      b             (q1, λ)
5     q1     λ      z             (q2, z)
7.2 CFG to PDA
Let's follow the steps that the PDA would go through to process the string baaba, starting with the initial configuration:
(q0, baaba, z)
⊢ (q1, baaba, Sz)      rule 1
⊢ (q1, baaba, bSSz)    rule 2, 3rd alternative
⊢ (q1, aaba, SSz)      rule 4
⊢ (q1, aaba, aSz)      rule 2, 1st alternative
⊢ (q1, aba, Sz)        rule 3
⊢ (q1, aba, SbSz)      rule 2, 5th alternative
⊢ (q1, aba, abSz)      rule 2, 1st alternative
⊢ (q1, ba, bSz)        rule 3
⊢ (q1, a, Sz)          rule 4
⊢ (q1, a, az)          rule 2, 1st alternative
⊢ (q1, λ, z)           rule 3
⊢ (q2, λ, z)           rule 5
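The construction and the search for an accepting sequence of moves can be sketched together in Python. Everything named here (cfg_to_pda, accepts, GRAMMAR, PDA) is hypothetical, "z" stands for the empty-stack marker and "" for λ. Because the grammar is left-recursive, a plain search could push forever, so the sketch prunes any configuration whose stack holds more symbols than there is input left; that shortcut assumes the grammar has no λ-productions.

```python
from collections import deque


def cfg_to_pda(productions, terminals, start="S", bottom="z"):
    """productions: list of (variable, right-hand side), one character per symbol."""
    delta = {
        ("q0", "", bottom): {("q1", start + bottom)},
        ("q1", "", bottom): {("q2", bottom)},
    }
    for variable, rhs in productions:
        # Variable on top: replace it by a right-hand side, reading nothing.
        delta.setdefault(("q1", "", variable), set()).add(("q1", rhs))
    for t in terminals:
        # Terminal on top: it must match the next input character.
        delta[("q1", t, t)] = {("q1", "")}
    return delta


def accepts(delta, w, final=("q2",)):
    """Breadth-first search over configurations (state, unread input, stack)."""
    seen, queue = set(), deque([("q0", w, "z")])
    while queue:
        config = queue.popleft()
        state, unread, stack = config
        # Pruning (assumes no λ-productions): every stack symbol above the
        # marker must still account for at least one input character.
        if config in seen or not stack or len(stack) - 1 > len(unread):
            continue
        seen.add(config)
        if state in final and not unread:
            return True
        top, rest = stack[0], stack[1:]
        for new_state, push in delta.get((state, "", top), ()):
            queue.append((new_state, unread, push + rest))
        if unread:
            for new_state, push in delta.get((state, unread[0], top), ()):
                queue.append((new_state, unread[1:], push + rest))
    return False


GRAMMAR = [("S", "a"), ("S", "aS"), ("S", "bSS"), ("S", "SSb"), ("S", "SbS")]
PDA = cfg_to_pda(GRAMMAR, terminals="ab")

print(accepts(PDA, "baaba"))
print(accepts(PDA, "abab"))
```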
7.2 PDA to CFG
- Theorem 7.2: If L = L(M) for some NPDA M, then L is a context-free language.
- Proof:
- Convert the NPDA into a particular form (if needed).
- From the NPDA, generate a corresponding context-free grammar, G, where the language generated by G is L(M).
- Since any language generated by a CFG is a context-free language, L must be a CFL.
7.2 PDA to CFG
It is possible to convert any PDA into a CFG. In order to do this, we need to convert our PDA into a form in which
1. there is just one final state, which is entered iff the stack is empty, and
2. each transition rule must either increase or decrease the stack content by one.
This means that all transition rules must be of the form
a. δ(qi, a, A) ∋ (qj, λ), or
b. δ(qi, a, A) ∋ (qj, BC)
7.2 PDA to CFG
- For transition rules that delete a variable from the stack, δ(qi, a, A) ∋ (qj, λ), we will have production rules in the grammar that correspond to
- (qi, A, qj) → a
- For transition rules that add a variable to the stack, δ(qi, a, A) ∋ (qj, BC), we will have production rules in the grammar that correspond to
- (qi, A, qk) → a(qj, B, ql)(ql, C, qk), for all states qk and ql
- The start variable in the grammar will correspond to
- (q0, z, qf)
7.2 PDA to CFG
We will not go into the details of this process, as it is tedious, and the grammar rules derived are often somewhat complicated and don't look much like the rules we are used to seeing. Just remember that it can be done.
7.4 Parsing
- Starting with a CFG G and a string x in L(G), we would like to be able to parse x, or find a derivation for x.
- There are two basic approaches to parsing: top-down parsing and bottom-up parsing.
7.4 Parsing
- Remember that Chomsky Normal Form (CNF) requires that every production be one of these two types:
- A → BC
- A → a
- If G is in Chomsky Normal Form, we can bound the length of a derivation. Every rule in a CNF grammar replaces a variable with either two variables or a single terminal. We always start off with a single variable, S. Therefore, every CNF derivation of a string of n characters must have exactly 2n - 1 rule applications: n - 1 applications of rules of the form A → BC to produce n variables, and n applications of rules of the form A → a to turn them into terminals.
Parsing
- Example
- S → SA | AA
- A → AA | a
- Starting with the S symbol, to derive the string aaa we would need 5 rule applications:
- S ⇒ SA ⇒ AAA ⇒ aAA ⇒ aaA ⇒ aaa
- If we want to automate this process, using a nondeterministic PDA may require following many alternatives; a deterministic PDA (if available for this grammar) is more efficient.
LL(k) grammars
A grammar is an LL(k) grammar if, while trying to generate a specific string, we can determine the (unique) correct production rule to apply, given the current character from the input string, plus a look-ahead of the next k-1 characters. A simple example is the following:
S → aSb | ab
LL(k) grammars
S → aSb | ab
Assume that we want to generate the string ab. We look at the first character, which is an a, plus a look-ahead of one more (a b), for a total of 2 characters. We MUST use the second rule to produce this string.
LL(k) grammars
S → aSb | ab
Now assume that we want to generate the string aabb. We look at our current character, the first symbol (an a), plus one more (another a), and we immediately know that we must use the first rule. But we still have more letters to produce, so we make the second character our current character, and look ahead one more character (the first b), and now we have ab, so we know we must use the second rule.
LL(k) grammars
S → aSb | ab
This is an LL(2) grammar. All LL(k) grammars are deterministic context-free grammars, but not all deterministic context-free grammars are LL(k) grammars. LL(k) grammars are often used to define programming languages. If you take Compilers, you will study this in more depth.
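The two-character look-ahead decision for S → aSb | ab can be written as a small table-free predictive parser. This is a sketch only; the function name parse_ll2 is hypothetical, and a Python list serves as the parsing stack.

```python
def parse_ll2(w):
    """Return the productions used to derive w, or None if w is not in the language."""
    stack = ["S"]  # the top of the stack is the end of the list
    pos = 0
    used = []
    while stack:
        top = stack.pop()
        if top == "S":
            # Current character plus one character of look-ahead.
            lookahead = w[pos:pos + 2]
            if lookahead == "aa":
                rhs = "aSb"  # first rule
            elif lookahead == "ab":
                rhs = "ab"   # second rule
            else:
                return None
            used.append("S -> " + rhs)
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        else:
            # A terminal on the stack must match the next input character.
            if pos >= len(w) or w[pos] != top:
                return None
            pos += 1
    return used if pos == len(w) else None


print(parse_ll2("ab"))
print(parse_ll2("aabb"))
print(parse_ll2("aab"))
```

Each time S is on top of the stack, the two visible characters pick exactly one production, so the parser never has to back up.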
Top Down Parsing
S → T$
T → [T]T | λ
This is the language of balanced strings of square brackets. The $ is a special end-marker added to the end of each string. This CFG is non-deterministic since there are two rules for T. Its grammar is not in CNF. We can convert this to a DCFG by using look-ahead.
Top Down Parsing
Here is the derivation of (q0, , ) -
(q1, , S) S - (q1, ,
T) ? T - (q, , TT) ? TT -
(q1, , TT) - (q, , T) ?
T - (q1, , T) - (q, l, ) ?
- (q1, l, ) - (q2, l, )
Top Down Parsing
Top-down parsing involves finding the left-hand (precondition) part of a production rule on the stack and replacing it with the right-hand (post-condition) side. In a way, the PDA is saving information so that it can backtrack if, during parsing, it finds that it has made the wrong choice of how to process a string.
Left recursion
Example: T → TT
Here the first symbol on the right side is the same as the variable on the left side. With left recursion, the PDA will never crash, and can never backtrack. There is an easy method for eliminating left-recursion.
Recursive Descent
- LL(x) parsers perform a left-to-right scan, generating a leftmost derivation with a look-ahead of x characters.
- Recursive descent means that the parser contains a collection of mutually recursive procedures corresponding to the variables in the grammar. LL(1) grammars can be parsed by recursive descent.
- Recursive descent is deterministic
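As a small illustration of one procedure per variable, here is a recursive-descent recognizer for balanced strings of square brackets. It assumes the grammar T → [T]T | λ (the bracket placement is reconstructed from the description of the language, so treat it as an assumption), checks for the end of the input instead of using an end-marker, and uses the hypothetical names is_balanced and parse_T.

```python
def is_balanced(w):
    """Recursive descent for T -> [T]T | λ with one character of look-ahead."""
    pos = 0

    def parse_T():
        nonlocal pos
        # Look-ahead "[" selects T -> [T]T; anything else selects T -> λ.
        if pos < len(w) and w[pos] == "[":
            pos += 1  # match "["
            if not parse_T():
                return False
            if pos >= len(w) or w[pos] != "]":
                return False
            pos += 1  # match "]"
            return parse_T()
        return True  # T -> λ consumes nothing

    # The whole input must be consumed by the single start variable.
    return parse_T() and pos == len(w)


print(is_balanced("[[]][]"))
print(is_balanced("[]]"))
```

The Python call stack plays the role of the PDA stack here: each pending call to parse_T corresponds to a T that is still waiting on the stack.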
Bottom Up Parsers
- Input symbols are read in and pushed (shifted) onto the stack until the top of the stack matches the right-hand side of a production rule; then that string is popped off the stack (reduced) and replaced by the variable on the left side of that production rule.
- Bottom-up parsers perform a rightmost derivation in reverse.
- Bottom-up parsers can be deterministic under some conditions