Title: Why NL is not CF
1 Why NL is not CF
- As proven in Stuart Shieber's 1985 paper, "Evidence Against the Context-Freeness of Natural Language", and explained informally for people whose eyes glaze over formal proofs
2 Motivation
- Important to theoretical linguists and philosophers of language. Central to Chomsky's innateness hypothesis as well as to critics of transformational grammars and their derivatives.
- Should be of at least theoretical interest to computational linguists, because the computational processing difficulty of languages is directly linked to their formal complexity.
- Personal motivation: if I label myself a linguist, I should have more than just a vague idea of what this question is about.
3 Tiny subset of the question
- A. Why are the Swiss German constructions a proof that at least some NLs are not CFLs?
- B. How are the Swiss German examples crucially different from similar Dutch examples that are not a proof of the non-CFL-ness of NLs?
- C. Are Polish and other free word order languages, which contain sentences essentially identical to the Swiss German ones, additional examples of non-CFLs?
4 The Chomsky Hierarchy (adapted from www.wikipedia.com)
- [Table of grammar types not preserved]
- Legend: α, β, γ = strings of terminals and nonterminals; A, B = nonterminals; x = string of terminals
5 Preliminaries: Regular Languages
- Productions have the form A -> xB or A -> x, where A and B are nonterminals and x is any string of terminals.
- Here is an example of a regular grammar:
- Vocabulary of terminals: a, cat, dog, mouse, chased, scared, squeaked
- Vocabulary of nonterminals: S, VP, NP, N
- S is the only initial symbol.
- S -> a mouse VP
- VP -> squeaked
- VP -> chased NP
- VP -> scared NP
- NP -> a N
- N -> cat
- N -> dog
- N -> mouse
6 Preliminaries: Regular Languages
- S -> a mouse VP
- VP -> squeaked
- VP -> chased NP
- VP -> scared NP
- NP -> a N
- N -> cat
- N -> dog
- N -> mouse
- What sentences can this grammar recognize/generate?
- A mouse squeaked.
- A mouse chased a cat.
- A mouse chased a dog.
- A mouse chased a mouse.
- A mouse scared a cat.
- A mouse scared a dog.
- A mouse scared a mouse.
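The list above can be checked mechanically. The following is only a minimal sketch: it encodes the slide's productions as a Python dictionary (the names `RULES` and `generate` are illustrative, not part of the original slides) and enumerates every sentence the grammar derives.

```python
from collections import deque

# Right-linear productions from the slide.
# Each entry: nonterminal -> list of (terminal words, next nonterminal or None).
RULES = {
    "S": [(["a", "mouse"], "VP")],
    "VP": [(["squeaked"], None), (["chased"], "NP"), (["scared"], "NP")],
    "NP": [(["a"], "N")],
    "N": [(["cat"], None), (["dog"], None), (["mouse"], None)],
}

def generate(start="S"):
    """Enumerate all sentences derivable from `start` (the grammar is finite)."""
    sentences = []
    queue = deque([([], start)])  # (words produced so far, pending nonterminal)
    while queue:
        words, nonterminal = queue.popleft()
        for terminals, next_nonterminal in RULES[nonterminal]:
            produced = words + terminals
            if next_nonterminal is None:
                sentences.append(" ".join(produced))
            else:
                queue.append((produced, next_nonterminal))
    return sentences

SENTENCES = generate()
for sentence in SENTENCES:
    print(sentence)
```

Because every production has at most one nonterminal, and it sits at the right edge, the derivation never needs to remember more than the single pending nonterminal.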
7 Preliminaries: Regular Languages
- If we want this grammar to generate sentences with a subject other than "a mouse", we have to add the following productions:
- S -> a cat VP
- S -> a dog VP
- This is inefficient.
- What's worse, it doesn't capture the NP generalization (i.e., it doesn't give us the right structure).
- We need S -> NP VP, but this is not a legitimate production in a regular grammar.
8 Preliminaries: Regular Languages
- Consider a CFG that accepts the same sentences (slightly different nonterminal vocabulary):
- S -> NP VP
- NP -> DT N
- DT -> a
- N -> cat
- N -> dog
- N -> mouse
- VP -> VI
- VP -> VT NP
- VI -> squeaked
- VT -> chased
- VT -> scared
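To see what structure this CFG assigns, here is a small hand-written recursive-descent sketch of the productions above. It is illustrative only: the function names and the nested-tuple tree format are assumptions made for this example, not something taken from the slides.

```python
# Lexical productions of the CFG on the slide: word -> preterminal.
LEXICON = {
    "a": "DT",
    "cat": "N", "dog": "N", "mouse": "N",
    "squeaked": "VI",
    "chased": "VT", "scared": "VT",
}

def parse_np(tokens, i):
    """NP -> DT N. Returns (tree, next position) or None."""
    if (i + 1 < len(tokens)
            and LEXICON.get(tokens[i]) == "DT"
            and LEXICON.get(tokens[i + 1]) == "N"):
        return ("NP", ("DT", tokens[i]), ("N", tokens[i + 1])), i + 2
    return None

def parse_vp(tokens, i):
    """VP -> VI | VT NP. Returns (tree, next position) or None."""
    if i >= len(tokens):
        return None
    tag = LEXICON.get(tokens[i])
    if tag == "VI":
        return ("VP", ("VI", tokens[i])), i + 1
    if tag == "VT":
        obj = parse_np(tokens, i + 1)
        if obj is not None:
            np_tree, j = obj
            return ("VP", ("VT", tokens[i]), np_tree), j
    return None

def parse_sentence(sentence):
    """S -> NP VP. Returns a nested-tuple tree, or None if the string is rejected."""
    tokens = sentence.lower().rstrip(".").split()
    subj = parse_np(tokens, 0)
    if subj is None:
        return None
    np_tree, i = subj
    pred = parse_vp(tokens, i)
    if pred is None:
        return None
    vp_tree, j = pred
    return ("S", np_tree, vp_tree) if j == len(tokens) else None

TREE = parse_sentence("A cat chased a mouse.")
print(TREE)
```

The subject and the object both come out as NP subtrees, which is the generalization that S -> NP VP is meant to capture.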
9 Preliminaries: Regular Languages
- Crucial point: the regular grammar shown here recognizes the strings we want it to recognize, but it doesn't assign to them the structure we want. That is, this grammar weakly generates the language in question but doesn't strongly generate it. Why is this important?
10 Preliminaries: Regular Languages
- An example of why it is important:
- I saw the man with a telescope.
- Recognizing the string is not sufficient for expressing its syntactic ambiguity.
- We need some way of expressing the ambiguity to get at the two meanings.
- The reason I am drawing attention to the weak/strong distinction is that there seems to be some conceptual confusion. When people say "language x is CF", they don't always say whether they mean that it is strongly CF or just weakly CF. In the context of formal languages and automata, people talk about weak generative capacity, but to us linguists strong generative capacity is of greater interest.
11 Preliminaries: Regular Languages
- In any case, for some constructions of English, one cannot write a regular grammar at all. That is, the sentences in question cannot even be recognized by a regular grammar, let alone assigned the correct structure: they are not even weakly regular.
- A mouse a cat chased squeaked.
- A mouse a cat a dog scared chased squeaked.
- Example of center embedding:
- NP1 NP2 NP3 V3 V2 V1
- There is no way to write a regular grammar for an arbitrary number of such embeddings.
- How do we know this for sure?
12 Preliminaries: Regular Languages
- Pumping Theorem for finite state languages
- If $L$ is an infinite finite state language over some alphabet $\Sigma$, then there are strings $x$, $y$, $z$ made out of the characters of $\Sigma$, such that $y$ is not the empty string and $xy^nz$ is in $L$ for all $n \ge 0$.
- What does this mean?
13 Preliminaries: Regular Languages
- Example: $L = \{ab^n : n \ge 0\}$
- Some strings in this language are:
- a
- ab
- abb
- abbb
- There are strings $x$, $y$, $z$ such that $y$ is not empty and $xy^nz$ is in the language for all $n \ge 0$. For example, the string abb gives such a factoring: $x = a$, $y = b$, and $z = b$. The following are all in the language:
- $n = 0$: $x = a$, $y^0$ empty, $z = b$, giving ab
- $n = 1$: $x = a$, $y^1 = b$, $z = b$, giving abb
- $n = 2$: $x = a$, $y^2 = bb$, $z = b$, giving abbb
- Why does this have to be true?
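Before turning to that question, the pumped strings can be checked directly. This is a minimal sketch that tests membership with the regular expression `ab*`; the helper names `in_language` and `pump` are made up for illustration.

```python
import re

def in_language(s):
    """Membership test for the language 'a' followed by zero or more 'b's."""
    return re.fullmatch(r"ab*", s) is not None

def pump(x, y, z, n):
    """Build x + y repeated n times + z."""
    return x + y * n + z

# The factoring used on the slide: x = a, y = b, z = b.
PUMPED = [pump("a", "b", "b", n) for n in range(5)]
for n, s in enumerate(PUMPED):
    print(n, s, in_language(s))
```

Every pumped string stays in the language, however large n gets.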
14 Preliminaries: Regular Languages
- If a language is regular, then by definition there is some regular grammar that generates it.
- By definition a grammar has a finite set of productions. In our example, one grammar for this language could be S -> aB, B -> bB, B -> ε (the empty string).
- But if the language is to consist of an infinite number of strings, then there are strings in this language that have more symbols in them than there are productions, so some production must be applied more than once to generate the string.
15 Preliminaries: Regular Languages
- Let's call x the substring generated up to the point at which the production that ends up being used more than once is used for the first time (in our example, S -> aB is applied first, so x = a).
- Now let's call y the substring generated when the production that is eventually used more than once is used for the first time (in our example, that production is B -> bB, so y = b).
- Let's call z the substring generated from the point just after that first use of B -> bB to the end of the string (in this case z can result from B -> b or even be empty and correspond to no production).
- But since the middle substring, y, is the result of applying a recursive production (in this example, B -> bB), we know that we can apply this production arbitrarily many times and thus make $n$ in $y^n$ arbitrarily large.
16 Preliminaries: Regular Languages
- So how would we use the pumping theorem to prove that the language consisting of the center-embedding sentences shown above cannot be regular?
- Similar language: $L = \{a^n b^n : n > 0\}$
- ab
- aabb
- aaabbb
- aaaabbbb
- ...
- This language does not contain any strings in which the number of b's does not equal the number of preceding a's, or which include any a's after b's. For example, it excludes:
- a
- aab
- abb
- abab
17 Preliminaries: Regular Languages
- Imagine that $L = \{a^n b^n : n > 0\}$ were a regular language.
- Then there would be strings $x$, $y$, $z$ such that $y$ is not the empty string and $xy^nz$ is in the language for all $n \ge 0$.
- The substring $y$ is the substring created by the recursive rule, and it therefore cannot contain both a's and b's, because we'd end up with a's following b's when we pump the string.
- So the substring $y$ must consist entirely of a's or entirely of b's.
18 Preliminaries: Regular Languages
- If $y$ consists entirely of a's, then all the b's are in $z$. But every time we apply the recursive rule that created $y$ we get at least one more a, and since $z$ is fixed we cannot add the same number of b's, so pumping always produces strings with more a's than b's.
- If $y$ consists entirely of b's, then all the a's are in $x$. But every time we apply the recursive rule that created $y$ we get at least one more b, and since $x$ is fixed we cannot add the same number of a's, so pumping always produces strings with more b's than a's.
- So there are no strings $x$, $y$, $z$ that satisfy the pumping condition for this language, and therefore $L = \{a^n b^n : n > 0\}$ is not a regular language.
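The case analysis above can be illustrated by brute force. The sketch below, with invented helper names, tries every way of cutting a string of the form a...ab...b into x, y, z with y non-empty and checks whether pumping y keeps the result in the language. It only tests small cases, so it illustrates the proof rather than replacing it.

```python
def in_anbn(s):
    """True if s is n a's followed by n b's, for some n > 0."""
    half = len(s) // 2
    return half > 0 and s == "a" * half + "b" * half

def pumpable_splits(w, max_n=3):
    """All (x, y, z) with w == x + y + z, y non-empty, and
    x + y*n + z in the language for every n in 0..max_n."""
    splits = []
    for i in range(len(w) + 1):
        for j in range(i + 1, len(w) + 1):
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_anbn(x + y * n + z) for n in range(max_n + 1)):
                splits.append((x, y, z))
    return splits

# No factoring of any of these strings survives pumping.
RESULTS = {k: pumpable_splits("a" * k + "b" * k) for k in range(1, 6)}
print(RESULTS)
```

Each entry of `RESULTS` is an empty list: whichever substring is chosen as y, pumping it either unbalances the counts or puts an a after a b.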
19 Preliminaries: Regular Languages
- Now how does this relate to the center embedding examples?
- The set of sentences that exhibit center embedding as described above may be viewed as a special case of the language $\{a^n b^n : n > 0\}$, with nouns being a's and verbs being b's, or
- $L = \{(\text{cat} \mid \text{dog} \mid \text{mouse})^n (\text{chased} \mid \text{scared} \mid \text{squeaked})^n : n > 0\}$.
- There is no way of writing a regular grammar that accepts strings with an arbitrarily large number of nouns followed by exactly the same number of verbs.
20 Preliminaries: Regular Languages
- So we have shown that the subset of English which consists of these types of sentences is not regular.
- But this doesn't in itself prove that English itself is not regular (a regular language can have non-regular subsets).
- In order to show that English is not regular we need a few more steps in our proof.
- Regular languages are closed under intersection. This means that intersecting a regular language with a regular language produces a regular language.
21 Preliminaries: Regular Languages
- If English were a regular language, then intersecting it with some other regular language would result in a regular language. We will try to find some regular language and show that intersecting it with English results in $L = \{(\text{cat} \mid \text{dog} \mid \text{mouse})^n (\text{chased} \mid \text{scared} \mid \text{squeaked})^n : n > 0\}$, which we have already shown is not a regular language.
- What language, when intersected with English, would produce $L$?
- $(\text{cat} \mid \text{dog} \mid \text{mouse})^* (\text{chased} \mid \text{scared} \mid \text{squeaked})^*$ is clearly regular, and intersecting it with English results in $L$.
- So English is not regular.
22 Brief History
- Chomsky (1963): NLs are not regular or CF. Proposed the transformational grammar model as an alternative.
- The notion that all NL phenomena are regular was put to rest. But the inadequacy of context free grammars for handling NL proved more controversial. Chomsky's proofs for the non-CF-ness of NL were not generally accepted.
- Peters & Ritchie (1973) showed that Chomsky's transformational grammar framework was powerful enough to describe any recursively enumerable set, which is perhaps too powerful.
- Until 1985, all the arguments for the claim that NLs are not CF were shown to be flawed (see the alleged counterexamples debunked in Pullum & Gazdar (1982)).
- Shieber (1985) provides the first syntactic counterexample to the claim that CFGs are powerful enough to generate NL. Shieber's argument has survived to this day.
23 Dutch
- Dutch was initially introduced as a counterexample but later dismissed. The example and the explanation of why it is not a counterexample are presented in Bresnan, Kaplan, Peters & Zaenen (1982).
- Dutch has the following structures:
- dat Jan Marie Piet de kinderen zag helpen laten zwemmen
- that Jan Marie Piet the children see-past help-inf make-inf swim-inf
- "...that Jan saw Marie help Piet make the children swim"
- The structure is:
- that NP1 NP2 NP3 NP4 V1 V2 V3 V4
24 Dutch
- Arbitrarily many of these NP V pairs may be inserted to form longer sentences.
- The number of verbs and NPs must be the same.
- The first verb has to be tensed and it must agree with the first NP.
- All the other verbs have to be infinitives.
- The subcategorization constraints between the final NP and final verb must be satisfied.
- This is an example of cross-serial dependencies.
- A language that has strings with arbitrarily long cross-serial dependencies is not a CFL.
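25 Center Embedding vs. Cross-Serial Dependencies
- [Figure: dependency arcs between numbered positions]
- Center embedding: 1 2 3 3 2 1
- Cross-serial dependencies: 1 2 3 1 2 3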
26 Context Free Languages
- Example: $L = \{a^m b^n c^m d^n : m, n > 0\}$ is not a CFL.
- We can use the pumping theorem for CFLs to show this.
- If $L$ is an infinite CFL, then there is some constant $K$ such that any string $w$ in $L$ longer than $K$ can be factored into substrings $w = uvxyz$ such that $v$ and $y$ are not both empty, $vxy$ is no longer than $K$, and $uv^nxy^nz$ is in $L$ for all $n \ge 0$.
- What does this mean?
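27 Context Free Languages
- Let's look first at $L = \{a^n b^n : n \ge 1\}$
- One CFG for this $L$ would be:
- S -> aSb
- S -> ab
- Take $u$ empty, $v = a$, $x = ab$, $y = b$, $z$ empty. Then:
- $n = 0$: $u$ empty, $v^0$ empty, $x = ab$, $y^0$ empty, $z$ empty, giving ab
- $n = 1$: $u$ empty, $v^1 = a$, $x = ab$, $y^1 = b$, $z$ empty, giving aabb
- $n = 2$: $u$ empty, $v^2 = aa$, $x = ab$, $y^2 = bb$, $z$ empty, giving aaabbb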
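The same pumping can be written out in a few lines. This is only a sketch with illustrative helper names; it builds the five-part string for several values of n and confirms that each result consists of equally many a's and b's.

```python
def pump(u, v, x, y, z, n):
    """Build u + v^n + x + y^n + z, pumping v and y together."""
    return u + v * n + x + y * n + z

def in_anbn(s):
    """True if s is n a's followed by n b's, for some n >= 1."""
    half = len(s) // 2
    return half >= 1 and s == "a" * half + "b" * half

# Factoring from the slide: u empty, v = a, x = ab, y = b, z empty.
PUMPED = [pump("", "a", "ab", "b", "", n) for n in range(4)]
for n, s in enumerate(PUMPED):
    print(n, s, in_anbn(s))
```

The two pumped pieces sit on either side of x, which mirrors the recursive production S -> aSb: each extra application adds one a on the left and one b on the right.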
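28 Context Free Languages
- We can show that $a^m b^n c^m d^n$ is not CF by using the pumping theorem.
- If $a^m b^n c^m d^n$ were a CFL, then there would be some constant $K$ such that any string in $L$ longer than $K$, say $a^K b^K c^K d^K$, could be written as $uvxyz$ such that $v$ and $y$ are not both empty and $v$ and $y$ are pumpable. In the standard form of the theorem, the middle part $vxy$ is also no longer than $K$.
29 Context Free Languages
- $v$ can't consist of both a's and b's, because when pumped it would produce strings with a's after b's. Similarly, it cannot consist of both b's and c's or both c's and d's. The same goes for the other pumpable term, $y$.
- So $v$ must consist entirely of a's or entirely of b's or entirely of c's or entirely of d's, and likewise $y$. Because $vxy$ is no longer than $K$, $v$ and $y$ can reach at most two neighboring blocks (a's and b's, b's and c's, or c's and d's). Pumping therefore changes the number of a's or c's without changing the other, or the number of b's or d's without changing the other, so the resulting strings are not in $L$: we can pump at most two neighboring symbols at a time, never the a's together with the c's or the b's together with the d's.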
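A brute-force check makes the role of the pumped pieces concrete. The sketch below assumes, purely for illustration, a pumping constant of K = 3 and uses the string with three of each letter. It enumerates every factoring into five parts and keeps those that stay in the language when pumped. The length bound on the middle part (a condition of the standard CFL pumping lemma) is passed as `max_window`; the function names are invented for this example.

```python
import re

def in_language(s):
    """True if s = a^m b^n c^m d^n with m, n > 0."""
    m = re.fullmatch(r"(a+)(b+)(c+)(d+)", s)
    return (m is not None
            and len(m.group(1)) == len(m.group(3))
            and len(m.group(2)) == len(m.group(4)))

def surviving_splits(w, max_window=None):
    """Factorings w = u v x y z (v, y not both empty, optionally
    len(v x y) <= max_window) that stay in the language for n = 0, 2, 3."""
    found = []
    size = len(w)
    for i in range(size + 1):
        for j in range(i, size + 1):
            for k in range(j, size + 1):
                for l in range(k, size + 1):
                    u, v, x, y, z = w[:i], w[i:j], w[j:k], w[k:l], w[l:]
                    if not (v or y):
                        continue
                    if max_window is not None and len(v + x + y) > max_window:
                        continue
                    if all(in_language(u + v * n + x + y * n + z) for n in (0, 2, 3)):
                        found.append((u, v, x, y, z))
    return found

K = 3  # assumed pumping constant, for illustration only
W = "a" * K + "b" * K + "c" * K + "d" * K
BOUNDED = surviving_splits(W, max_window=K)
UNBOUNDED = surviving_splits(W)
print(len(BOUNDED), len(UNBOUNDED))
```

With the length bound no factoring survives, which is what the proof needs. Without it, factorings such as v = a and y = c do survive, because pumping a's and c's together keeps the string in the language; this is why the bound on the middle part matters for this particular language.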
30 Dutch
- The Dutch example seems to exhibit the same cross-serial dependencies we just showed could not be handled by a CFG.
- However, it is possible to write a CFG that would accept these Dutch strings.
31 Dutch
- We can divide the verbs as follows:
- 1. V-index: Form infinitive, Subcats for a subject (swim)
- 2. V-tensed: Form tensed, Subcats for a subject it agrees with and an S or S-bar complement without complementizer (saw)
- 3. V-infinitive: Form infinitive, Subcats for a subject and an S or S-bar complement without complementizer (help, make)
- Productions:
- 1. S -> NP-agr S-agr-index V-index
- 2. S-agr-index -> NP S-agr-index V-infinitive
- 3. S-agr-index -> NP S-agr-index V-infinitive
- 4. S-agr-index -> NP-index V-tensed
- 5. NP-index -> Jan | Piet | Marie | the children
- 6. V-tensed -> saw
- 7. V-index -> swim
- 8. V-infinitive -> help | make
- These productions would accept the example sentence as well as the following sentences:
- that Jan Marie Piet the children see-past make-inf help-inf swim-inf
- that Jan Marie the children Piet see-past help-inf make-inf swim-inf
- that Jan Marie the children Piet see-past make-inf help-inf swim-inf
- These are perfectly grammatical.
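A rough sketch of what these productions generate is given below. It rests on two assumptions that the slide does not spell out: that NP-agr, NP, and NP-index all rewrite to the same four noun phrases, and that the English glosses can stand in for the Dutch words. The function `sentences` and its argument are hypothetical names.

```python
from itertools import product

# Assumption: every NP category rewrites to the same four noun phrases.
NPS = ["Jan", "Piet", "Marie", "the children"]
V_TENSED = ["saw"]
V_INDEX = ["swim"]
V_INFINITIVE = ["help", "make"]

def inner(depth):
    """Yield (NPs, verbs) for S-agr-index after `depth` recursive steps.
    Base case:      S-agr-index -> NP-index V-tensed
    Recursive case: S-agr-index -> NP S-agr-index V-infinitive"""
    if depth == 0:
        for np, verb in product(NPS, V_TENSED):
            yield [np], [verb]
    else:
        for np in NPS:
            for inner_nps, inner_verbs in inner(depth - 1):
                for verb in V_INFINITIVE:
                    yield [np] + inner_nps, inner_verbs + [verb]

def sentences(num_infinitives):
    """Yield strings from S -> NP-agr S-agr-index V-index."""
    for np in NPS:
        for nps, verbs in inner(num_infinitives):
            for verb in V_INDEX:
                yield " ".join([np] + nps + verbs + [verb])

SENTENCES = list(sentences(2))
print(len(SENTENCES))
print("Jan Marie Piet the children saw help make swim" in SENTENCES)
```

The grammar nests each noun phrase with a verb on the opposite side, so it guarantees equal numbers of noun phrases and verbs without linking any particular noun phrase to any particular verb.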
32 Dutch
- How come this works? Note that the only items we need to cross-reference are the first NP with the tensed verb (agreement) and the final NP with the final verb (subcategorization).
- that Jan Marie Piet the children see-past help-inf make-inf swim-inf
- that Jan saw Marie help Piet make the children swim
- that Jan Marie Piet the children see-past make-inf help-inf swim-inf
- that Jan saw Marie make Piet help the children swim
- that Jan Marie the children Piet see-past help-inf make-inf swim-inf
- that Jan saw Marie help the children make Piet swim
- that Jan Marie the children Piet see-past make-inf help-inf swim-inf
- that Jan saw Marie make the children help Piet swim
33 Dutch
- We can recognize and generate all the grammatical strings with this grammar because the number of items we need to cross-reference is finite and all the other items are interchangeable syntactically.
- Note that the sentences above are all grammatical but each has a different interpretation in Dutch.
- The final tree structure of each sentence will only reflect the order of the words in the sentence and not all the cross-serial dependencies.
- So we can write a grammar to recognize and generate all these strings but not to assign a structure to them that will preserve the cross-serial dependencies.
- So the grammar above weakly generates the cross-serial examples but doesn't strongly generate them.
- This is sufficient if we are interested in classifying sentences as grammatical or ungrammatical, but is it sufficient for semantic interpretation? Probably not.
34 Swiss German
- How is the Swiss German example set presented by Shieber (1985) crucially different from the Dutch example set?
- mer em Hans es huus hälfed aastriiche
- we Hans-DAT the house-ACC helped paint
- "we helped Hans paint the house."
- mer d'chind em Hans es huus lönd hälfe aastriiche
- we the-children-ACC Hans-DAT the house-ACC let help paint
- "we let the children help Hans paint the house."
35 Swiss German
- we the-children-ACC Hans-DAT the house-ACC let help paint
- In Swiss German, verbs subcategorize for NPs with specific cases.
- Some verbs subcategorize for accusative NPs and some verbs subcategorize for dative NPs.
- The number of verbs subcategorizing for accusative case NPs must be the same as the number of accusative NPs in the sentence, and the number of verbs subcategorizing for dative case NPs must be the same as the number of dative NPs in the sentence.
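The counting constraint can be sketched on the glossed example. The case each verb takes is read off the glosses above (let and paint take accusative objects, help takes a dative one); the table and function names are illustrative assumptions, and the nominative subject "we" is left out.

```python
from collections import Counter

# Case each verb assigns to its object, read off the slide's glosses.
VERB_CASE = {"let": "ACC", "help": "DAT", "helped": "DAT", "paint": "ACC"}

def case_counts_match(noun_phrases, verbs):
    """noun_phrases: list of (gloss, case) pairs; verbs: list of verb glosses.
    True if each case has as many NPs as verbs that select it."""
    np_counts = Counter(case for _, case in noun_phrases)
    verb_counts = Counter(VERB_CASE[verb] for verb in verbs)
    return np_counts == verb_counts

EXAMPLE_NPS = [("the children", "ACC"), ("Hans", "DAT"), ("the house", "ACC")]
EXAMPLE_VERBS = ["let", "help", "paint"]
print(case_counts_match(EXAMPLE_NPS, EXAMPLE_VERBS))
```

A checker like this is easy to write with counters, and that is the point: matching two independent counts across the sentence is exactly what a context free grammar cannot do for unboundedly long strings.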
36 Swiss German
- Why can't we produce arbitrarily long sentences of this type with a CFG?
- Simplest case: all the accusatives precede all the datives
- $NP_a^m \, NP_d^n \, V_a^m \, V_d^n$
- This is the same as $a^m b^n c^m d^n$, which is non-CF, as shown earlier.
37 Swiss German
- The Swiss German strings with all the accusatives preceding all the datives may be represented as
- $NP_{ACC}^m \, NP_{DAT}^n \, V_{ACC}^m \, V_{DAT}^n$
- which is the same as
- $a^m b^n c^m d^n$, which has been shown not to be CF.
- CFLs are closed under intersection with regular languages.
- If Swiss German were CF, then intersecting it with the regular language $w a^* b^* x c^* d^* y$ (where $w$, $x$, and $y$ are the fixed remaining parts of the sentence) would yield a CFL.
- But the intersection, $w a^m b^n x c^m d^n y$, is not CF, so Swiss German cannot be CF.
38 Swiss German
- Shieber's argument rests on the impossibility of writing a CFG that would ensure that the total numbers of accusatives and datives match, and not on the order they appear in.
- This argument is SUFFICIENT for Swiss German's status as a non-CFL.
- The argument does NOT DEPEND on the ORDER of the inner NPs and inner verbs.
- Surprising.
- Orders other than NP1 NP2 NP3 ... NPn V1 V2 V3 ... Vn are acceptable.
- So Polish and other similar languages provide exactly the same kind of counterexample that Swiss German does.
39 Conclusions
- If I am right, then one important point to take away from this is that the often cited Swiss German example is by no means unique. It just happens to be the first published counterexample to the claim that NL syntax is CF. Once the inadequacy of CFGs has been shown for one language, there is no need to show it for other languages.
40 Conclusions
- Another important point is that the proof shows that Swiss German, and thus NL, is not even weakly CF.
- In CL we are not interested in merely recognizing strings, so weak generative capacity without strong generative capacity is of limited use to us. Plenty of examples supporting the claim that NLs are not strongly CF had been presented before Shieber's paper, so its importance is perhaps more theoretical than practical from a CL point of view.
41 References
- Bresnan, Kaplan, Peters & Zaenen (1982): Joan Bresnan, Ron Kaplan, Stanley Peters, and Annie Zaenen. 1982. Cross-serial dependencies in Dutch. Linguistic Inquiry, 13(4): 613-635.
- Chomsky (1963): Chomsky, Noam. 1963. Formal properties of grammar. In Luce, R.D., R.R. Bush and E. Galanter (eds), Handbook of Mathematical Psychology, vol. II. New York: Wiley, pp. 323-418.
- Partee (1993): Barbara H. Partee, Alice ter Meulen, and Robert Wall. Mathematical Methods in Linguistics. Corrected second printing of the first edition (Studies in Linguistics and Philosophy).
- Peters & Ritchie (1973): Peters, Stanley, and R. Ritchie. 1973. On the generative power of transformational grammars. Information Sciences 6: 49-83.
- Pullum & Gazdar (1982): Pullum, Geoffrey K., and Gerald Gazdar. 1982. Natural languages and context-free languages. Linguistics and Philosophy 4: 471-504.
- Savitch (1987): Walter J. Savitch, Emmon Bach, William Marsh, and Gila Safran-Naveh. The Formal Complexity of Natural Language. D. Reidel, 1987.
- Shieber (1985): Stuart M. Shieber. 1985. Evidence against the context-freeness of natural language. Linguistics and Philosophy, 8: 333-343. http://www.eecs.harvard.edu/shieber/Biblio/Papers/shieber85.pdf