How do linguists study grammar? - PowerPoint PPT Presentation

Title: Preliminary Concepts Author: lsl Last modified by: lsl Created Date: 8/27/2003 1:17:00 AM Document presentation format: On-screen Show Company – PowerPoint PPT presentation

Slides: 47
Provided by: lsl
Learn more at: http://www.cs.cmu.edu

1
How do linguists study grammar?
  • Lori Levin
  • 11-721 Grammars and Lexicons
  • August 29, 2007

2
Outline
  • Views of language
  • Prescriptive
  • Artistic
  • Descriptive
  • Claims about knowledge of a language
  • Unconscious
  • Complex
  • Systematic
  • Can be studied scientifically
  • A research tool: grammaticality judgments
  • What is grammaticality?
  • Problems with grammaticality
  • Rationalism vs empiricism
  • Why should language technologists care about
    grammaticality?

3
Prescriptive and Descriptive Linguistics
  • Natural phenomena cannot be legislated, just
    described.
  • You can't declare the value of π to be 3.
  • Sag, Wasow, and Bender, page 1
  • Social phenomena can be legislated.
  • Language use can be legislated as a social
    phenomenon, but it can also be studied as a
    natural phenomenon.

4
Prescriptive view of language
  • Rules about how language should be used
  • Don't say "Me and him went to the movies."
  • It doesn't make sense because you can't say "Me
    went to the movies."
  • Focus on isolated phenomena that are thought to
    be corruptions of the language.
  • Everybody should do their homework.
  • Some people speak correctly and others don't.
  • Rules are something that you are aware of.

5
Artistic View of Language
  • Language can be used creatively to make
    literature and poetry.
  • Some people are better at it than others.
  • Language is not systematic and rule governed.

6
Descriptive view of language
  • Study language as a natural phenomenon
  • People say "Me and him went to the movies."
  • That's interesting because they don't say "Me went
    to the movies."
  • Focus on all aspects of language, even very
    normal sentences.
  • Every native speaker of a language speaks equally
    well.
  • Unless there is an injury or an illness that
    affects certain parts of the brain or speech
    producing organs.
  • Language consists of systematic knowledge that
    the speakers are not aware of.

7
Outline
  • Views of language
  • Prescriptive
  • Artistic
  • Descriptive
  • Claims about knowledge of a language
  • Unconscious
  • Complex
  • Systematic
  • Can be studied scientifically
  • A research tool: grammaticality judgments
  • What is grammaticality?
  • Problems with grammaticality
  • Rationalism vs empiricism
  • Why should language technologists care about
    grammaticality?

8
Knowledge of Language
  • Every normal speaker of any natural language has
    acquired an immensely rich and systematic body of
    unconscious knowledge, which can be investigated
    by consulting speakers' intuitive judgments.
  • Languages are objects of considerable
    complexity, which can be studied scientifically.
    That is, we can formulate hypotheses about
    linguistic structure and test them against the
    facts of particular languages.
  • Sag et al., page 2

Claim 1, Claim 2, Claim 3, Claim 4

9
Chomsky, 1957 on testable hypotheses
  • The search for rigorous formulation in
    linguistics has a much more serious motivation
    than mere concern for logical niceties or the
    desire to purify well-established methods of
    linguistic analysis. Precisely constructed
    models for linguistic structure can play an
    important role, both negative and positive, in
    the process of discovery itself. By pushing a
    precise but inadequate formulation to an
    unacceptable conclusion, we can often expose the
    exact source of the inadequacy and, consequently,
    gain a deeper understanding of the linguistic
    data. More positively, a formalized theory may
    automatically provide solutions for many problems
    other than those for which it was explicitly
    designed. Obscure and intuition-bound notions
    can neither lead to absurd conclusions nor
    provide new and correct ones, and hence they fail
    to be useful in two important respects.
  • (Noam Chomsky has been the most influential
    linguist in many parts of the world since 1957.
    You may have also heard his name associated with
    politics.)

10
Immensely rich and systematic body of
unconscious knowledge
  • They saw Pat and Chris.
  • They saw Pat with Chris.
  • Who did they see Pat with?
  • *Who did they see Pat and?
  • Has anyone ever had to tell you not to say this?

11
Testable hypotheses about linguistic knowledge
  • *We like us.
  • We like ourselves.
  • She likes her. (She ≠ her)
  • She likes herself.
  • Nobody likes us.
  • *Leslie likes ourselves.
  • *Ourselves like us.
  • *Ourselves like ourselves.

12
Testable hypotheses
  • Use a reflexive pronoun only when
  • Use a regular pronoun only when

13
Counter-examples
  • We think that Leslie likes us.
  • *We think that Leslie likes ourselves.
  • *We think that ourselves like Leslie.
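
The cycle of stating a hypothesis and then hunting for counter-examples can be sketched in code. The slides leave the wording of the hypothesis blank, so the rule below is only an assumed first attempt: a reflexive pronoun needs a preceding expression with the same referent, and a regular pronoun must not have one. The referent indices (`/1`, `/2`) and the True/False judgments are also assumptions, encoding the usual judgments for these sentences. `annotate`, `predict`, and `counterexamples` are hypothetical helper names.

```python
REFLEXIVES = {"ourselves", "herself"}
PRONOUNS = {"we", "us", "she", "her"}


def annotate(text):
    """Turn 'we/1 like us/1' into [(word, referent_or_None), ...]."""
    tokens = []
    for item in text.lower().split():
        word, _, ref = item.partition("/")
        tokens.append((word, int(ref) if ref else None))
    return tokens


def predict(tokens):
    """Assumed hypothesis: reflexives need a preceding coreferential
    expression; regular pronouns must not have one."""
    for i, (word, ref) in enumerate(tokens):
        has_antecedent = ref is not None and any(r == ref for _, r in tokens[:i])
        if word in REFLEXIVES and not has_antecedent:
            return False
        if word in PRONOUNS and has_antecedent:
            return False
    return True


# (annotated sentence, judged grammatical?) -- judgments are assumptions.
SLIDE_11 = [
    ("we/1 like us/1", False),
    ("we/1 like ourselves/1", True),
    ("she/1 likes her/2", True),
    ("she/1 likes herself/1", True),
    ("nobody/1 likes us/2", True),
    ("leslie/1 likes ourselves/2", False),
    ("ourselves/1 like us/1", False),
    ("ourselves/1 like ourselves/1", False),
]
SLIDE_13 = [
    ("we/1 think that leslie/2 likes us/1", True),
    ("we/1 think that leslie/2 likes ourselves/1", False),
    ("we/1 think that ourselves/1 like leslie/2", False),
]


def counterexamples(data):
    """Sentences where the hypothesis disagrees with the judgment."""
    return [text for text, ok in data if predict(annotate(text)) != ok]


print(counterexamples(SLIDE_11))
print(counterexamples(SLIDE_13))
```

Under these assumptions the rule fits the simple sentences and fails on all three embedded-clause sentences, which is what motivates revising the hypothesis.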

14
New Hypothesis
  • Use a reflexive pronoun only when
  • Use a regular pronoun only when
  • (This is an English rule. Many languages do not
    follow it.)

15
Support for the new hypothesis
  • We think that she voted for her. (she ≠ her)
  • We think that she voted for herself.
  • *We think that herself voted for her.
  • *We think that herself voted for herself.

16
Counter-examples
  • Our friends like us.
  • *Our friends like ourselves.
  • Those pictures of us offended us.
  • *Those pictures of us offended ourselves.

17
New Hypothesis
  • Use a reflexive pronoun only when
  • Use a regular pronoun only when

18
Counter-examples
  • Vote for us.
  • *Vote for ourselves.
  • *Vote for you.
  • Vote for yourselves.

19
Counter-examples
  • We appealed to them to vote for themselves.
  • We appealed to them to vote for them.
  • Them ≠ them
  • We appealed to them to vote for us.
  • *We appealed to them to vote for ourselves.
  • *We appeared to them to vote for themselves.
  • We appeared to them to vote for them.
  • Them = them
  • *We appeared to them to vote for us.
  • We appeared to them to vote for ourselves.

  • The theoretical machinery required for a viable
    grammatical analysis could be quite abstract.
  • Sag et al., page 6

20
Knowledge of Language
  • Every normal speaker of any natural language has
    acquired an immensely rich and systematic body of
    unconscious knowledge, which can be investigated
    by consulting speakers' intuitive judgments.
  • Languages are objects of considerable
    complexity, which can be studied scientifically.
    That is, we can formulate hypotheses about
    linguistic structure and test them against the
    facts of particular languages.
  • Sag et al., page 2

Claim 1, Claim 2, Claim 3, Claim 4

21
Grammaticality Judgments as a scientific tool for
collecting data
  • What is grammaticality?
  • What are some problems in using it as a tool for
    collecting data?
  • Grammaticality vs corpus analysis

22
One more claim
  • It is also possible to make testable hypotheses
    about how languages differ and what they have in
    common.

23
Outline
  • Views of language
  • Prescriptive
  • Artistic
  • Descriptive
  • Claims about knowledge of a language
  • Unconscious
  • Complex
  • Systematic
  • Can be studied scientifically
  • A research tool: grammaticality judgments
  • What is grammaticality?
  • Problems with grammaticality
  • Rationalism vs empiricism
  • Why should language technologists care about
    grammaticality?

24
Investigate hypotheses by consulting native
speakers' intuitions
  • Many linguists (probably a majority) assume that
    people can distinguish strings of words that are
    sentences of their language from strings of words
    that are not sentences of their language.
  • So imagine that you are a machine or a classifier
    that takes a sentence as input, and returns
    accept or reject as output.

25
Native speakers as automata that accept and
reject strings of words.
  • The student read a book.
  • *Student the a read book.
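
To make the accept/reject picture concrete, here is a minimal sketch of a recognizer for a toy fragment of English. The lexicon, the three rewrite rules, and the function names `parse` and `judge` are all hypothetical, chosen only to cover the two strings on this slide. This is not a claim about how speakers actually arrive at their judgments.

```python
# Toy lexicon: maps each word to a single category (an assumption).
LEXICON = {
    "the": "Det", "a": "Det",
    "student": "N", "book": "N",
    "read": "V",
}

# Toy grammar: each category rewrites as a sequence of categories.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"]],
}


def parse(cat, tags, i):
    """Return every position where a phrase of type cat starting at i can end."""
    if cat not in RULES:  # lexical category: must match the tag at position i
        return {i + 1} if i < len(tags) and tags[i] == cat else set()
    ends = set()
    for rhs in RULES[cat]:
        positions = {i}
        for sub in rhs:
            positions = {e for p in positions for e in parse(sub, tags, p)}
        ends |= positions
    return ends


def judge(sentence):
    """Return 'accept' if the whole string is an S, else 'reject'."""
    words = sentence.lower().rstrip(".").split()
    tags = [LEXICON.get(w) for w in words]
    if None in tags:  # unknown word: outside the toy fragment
        return "reject"
    return "accept" if len(tags) in parse("S", tags, 0) else "reject"


print(judge("The student read a book."))
print(judge("Student the a read book."))
```

The recognizer only answers yes or no; it says nothing about how natural or how frequent an accepted string is.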

26
Grammaticality
  • A string of words that you recognize as a
    sentence in your native language is grammatical.
  • A string of words that you do not recognize as a
    sentence in your native language is
    ungrammatical.
  • When you decide whether a sentence is grammatical
    or ungrammatical, this is called giving a
    grammaticality judgment.
  • Ungrammatical sentences are preceded by an
    asterisk or star (*). Sometimes they are called
    starred sentences.
  • If native speakers can't decide whether the
    sentence is grammatical or ungrammatical, it is
    preceded by a combination of stars and question
    marks.

27
Grammaticality: Descriptive
  • When you give a grammaticality judgment, you are
    not supposed to judge whether the sentence is the
    most elegant or appropriate, just whether it
    is a sentence of your language or not.
  • You may have a stylistic preference for one of
    these, but they are all grammatical.
  • These are things you never want to hear.
  • These are things you want never to hear.
  • These are things you want to never hear.

28
Grammatical ≠ meaningful
  • It is unlikely that Pat will succeed.
  • It is improbable that Pat will succeed.
  • Pat is unlikely to succeed.
  • *Pat is improbable to succeed.
  • This could be meaningful, but most people
    consider it to be ungrammatical.
  • They saw Pat with Chris.
  • They saw Pat and Chris.
  • Who did they see Pat with?
  • *Who did they see Pat and?
  • Again, this could be meaningful, but it is
    ungrammatical.

29
Syntactically well-formed vs semantically
well-formed
  • Colorless green ideas sleep furiously.
  • Syntactically well-formed
  • Chomsky, 1957
  • *Colorless sleep green furiously ideas.
  • Not syntactically well-formed

30
Grammaticality: Where to draw the line?
  • Sentences that are understandable but sound like
    mistakes are probably not grammatical.
  • *These are things that I don't know anyone who
    says.

31
Where to draw the line?
  • Sentences of bad poetry are not grammatical.
  • Strange word order in order to make lines rhyme.
  • Fame to our alma mater
  • Thousands of voices ring
  • Telling of love we bear her
  • To her we laurels bring.
  • From my high school song. Don't ask how I could
    remember something like that.
  • indirect-object subject direct-object
    verb
  • We bring laurels to her.
  • subject verb direct-object
    indirect-object

32
Grammaticality
  • More bad poetry: not grammatical
  • Shout on high the ringing praises, loyal strong
    and true
  • Bring we to our alma mater trust and
    honor due.
  • verb subject indirect-object
    direct-object
  • We bring trust and honor (that are) due
    to our alma mater.
  • subject verb direct-object indirect-object

33
Where to draw the line?
  • However, many types of sentences that are found
    in writing, or are restricted to special contexts,
    are considered to be grammatical and even have
    names:
  • Locative Inversion: In this village live many
    people.
  • Topicalization: Sam, I like.
  • Heavy NP Shift: I presented to the students many
    examples of strange and unusual constructions.
    (The indirect object comes before the direct object
    because the direct object is too long.)
  • These are grammatical.

34
Grammaticality
  • Grammatical
  • In this village live many people.
  • I presented to the students many examples of
    strange and unusual constructions.
  • Sam, I like.
  • Not grammatical
  • To her we laurels bring.
  • Bring we to our alma mater trust and honor due.
  • These are things that I don't know anyone who
    says.
  • Who did they see Pat and?
  • We told them to vote for ourselves.

35
Problems with Grammaticality
  • Dialect differences
  • The car needs washed.
  • (The car needs to be washed.)
  • We go to the movies a lot anymore.
  • (We go to the movies a lot these days.)
  • I gave it her.
  • (I gave it to her.)
  • It were me what told her.
  • (It was me that told her.)
  • Mine is bigger than what yours is.
  • (Mine is bigger than yours is.)
  • Ain't no chicken can't get into no coop.
  • (No chicken can get into a coop.)
  • (There isn't a chicken that can get into a coop.)

36
Problems with grammaticality
  • Changes over time
  • (From Kroeger, Chapter 1)
  • With two things hath God men's soul
    endowed.
  • Normal word order in English before 1100 AD
  • I know not what course others may take,
  • Patrick Henry, 1775

37
Grammaticality: Discrete or Continuous?
  • Manning (2003), Probabilistic Syntax
  • *We regard Kim to be an acceptable candidate.
  • Judged ungrammatical when consulting native
    speakers' judgments.
  • Conservatives argue that the Bible regards
    homosexuality to be a sin.
  • Attested example.
  • *Kim turned out doing all the work.
  • Judged ungrammatical when consulting native
    speakers' judgments.
  • But it turned out having a greater impact than
    any of us dreamed.
  • Attested example.
  • Better to ask "How likely?" than to ask
    "Possible or not?"

38
Philosophy Lesson: Rationalism and Empiricism
  • Rationalism: the source of knowledge is reason
  • Empiricism: the source of knowledge is data

39
Rationalist view of linguistic data
  • Language is something in people's minds: a set
    of rules and principles that allows them to make
    grammaticality judgments and produce and
    understand sentences that they have never heard
    before.
  • This is i-language, or internal language.
  • We study i-language by asking people to give
    grammaticality judgments.
  • A corpus (a collection of texts or speech) is
    e-language, or external language. It is not the
    object of study.

40
Empiricist view of linguistic data
  • Corpora are the objects of study.
  • We study language by examining patterns in
    corpora (collections of texts or speech).

41
Why do we need the philosophy lesson?
  • In the second half of the 20th century,
    linguistics was heavily dominated by rationalism.
  • Computational linguistics was also initially
    dominated by rationalism.
  • Rationalism/empiricism was heavily debated in
    computational linguistics in the 1990s.
  • Rationalism: people writing grammar rules for a
    parser
  • Empiricism: statistical, corpus-based models
  • In current Language Technologies research,
    rationalism and empiricism are often combined.
  • Combination: a person choosing linguistic
    features as input to a machine learning
    algorithm, which then learns from the
    distribution of the features in a corpus.
  • Combination: syntax-based statistical machine
    translation.
  • Empiricism is gaining ground in linguistics
    (Manning 2003)
  • Linguistics textbooks are still mainly
    rationalist.
  • Empiricism is mentioned only in one footnote in
    Chapter 1 of the Sag et al. book.
  • But a few years earlier, it would not have been
    mentioned at all!

42
Strong points of rationalism
  • Infinite, creative capacity: people can produce
    and understand sentences that have never been
    uttered before. They are not repeating memorized
    patterns, but applying productive rules.
  • Leads people to wonder about things that don't
    exist in a corpus: *Who did you see Pat and?
  • Probability is not grammaticality: grammatical
    sentences may have very low probability.
  • Probability reflects facts about the world, but
    grammaticality is independent of context.
  • Clyde is an African elephant.
  • Clyde is a pink elephant.
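
The gap between probability and grammaticality can be illustrated with a small sketch: an add-one-smoothed bigram model trained on a made-up four-sentence corpus. The corpus, the smoothing choice, and the names `bigrams` and `probability` are assumptions for illustration only. Both elephant sentences are grammatical, yet the model assigns them different probabilities simply because African elephants are mentioned more often in the data.

```python
from collections import Counter

# Hypothetical toy corpus; real corpora are vastly larger.
corpus = [
    "clyde is an african elephant",
    "jumbo is an african elephant",
    "babar is an african elephant",
    "clyde is a pink elephant",
]


def bigrams(tokens):
    """Pair each token with its predecessor, padding with boundary markers."""
    padded = ["<s>"] + tokens + ["</s>"]
    return list(zip(padded, padded[1:]))


bigram_counts = Counter(b for line in corpus for b in bigrams(line.split()))
context_counts = Counter(b[0] for line in corpus for b in bigrams(line.split()))
vocab = {w for line in corpus for w in line.split()} | {"</s>"}


def probability(sentence):
    """Add-one-smoothed bigram probability of a sentence."""
    p = 1.0
    for prev, word in bigrams(sentence.lower().rstrip(".").split()):
        p *= (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))
    return p


print(probability("Clyde is an African elephant."))
print(probability("Clyde is a pink elephant."))
```

With smoothing, even a scrambled string receives a nonzero probability, so a probability score on its own does not separate grammatical from ungrammatical strings.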

43
Strong points of empiricism
  • Frequency of occurrence in a corpus is easier to
    measure reliably than a grammaticality judgment.
  • Many ungrammatical sentences turn out to be
    acceptable in the right context.
  • Identifying the right context turns out to be an
    interesting question that does not arise in the
    rationalist approach.
  • Bresnan et al., 2005, 2007
  • I gave her the book.
  • I gave the book to her.

44
Grammaticality in language technologies
  • Real input (especially spoken input) is not
    always well-formed, so you should not build a
    program that accepts only grammatical sentences.
  • Can we do away with grammar in language
    technologies?

45
Grammaticality in Language Technologies
  • You cannot extract the meaning of a sentence
    without processing the grammar
  • Sue interviewed Sam.
  • Sam interviewed Sue.
  • LT output has to be comprehensible, and
    therefore, mostly grammatical
  • Synthesized speech
  • An automatically produced translation
  • An automatically produced summary
  • Error detection programs for computer-assisted
    language instruction or for word processing must
    distinguish grammatical from ungrammatical
    sentences.

46
In favor of grammaticality
  • Probability is not grammaticality: grammatical
    sentences may have very low probability.
  • Probability reflects facts about the world, but
    grammaticality is independent of context.
  • Clyde is an African elephant.
  • Clyde is a pink elephant.