Title: CL vs NLP
1. CL vs NLP
- Why Computational Linguistics (CL) rather than
Natural Language Processing (NLP)?
- Computational Linguistics
- Computers dealing with language
- Modeling what people do
- Natural Language Processing
- Applications on the computer side
2. Relation of CL to Other Disciplines
Electrical Engineering (EE) (Optical Character
Recognition)
Artificial Intelligence (AI) (notions of
representation, search, etc.)
Linguistics (Syntax, Semantics, etc.)
Machine Learning (particularly, probabilistic or
statistical ML techniques)
Psychology
CL
Philosophy of Language, Formal Logic
Human Computer Interaction (HCI)
Information Retrieval
Theory of Computation
3. A Sampling of Other Disciplines
- Linguistics: formal grammars, abstract
characterization of what is to be learned.
- Computer Science: algorithms for efficient
learning or online deployment of these systems in
automata.
- Engineering: stochastic techniques for
characterizing regular patterns for learning and
ambiguity resolution.
- Psychology: insights into what linguistic
constructions are easy or difficult for people to
learn or to use.
4. History: 1940s-1950s
- Development of formal language theory (Chomsky,
Kleene, Backus).
- Formal characterization of classes of grammar
(context-free, regular)
- Association with relevant automata
- Probability theory: language understanding as
decoding through a noisy channel (Shannon)
- Use of information-theoretic concepts like
entropy to measure success of language models.
5. 1957-1983: Symbolic vs. Stochastic
- Symbolic
- Use of formal grammars as basis for natural
language processing and learning systems
(Chomsky, Harris)
- Use of logic and logic-based programming for
characterizing syntactic or semantic inference
(Kaplan, Kay, Pereira)
- First toy natural language understanding and
generation systems (Woods, Minsky, Schank,
Winograd, Colmerauer)
- Discourse Processing: Role of Intention, Focus
(Grosz, Sidner, Hobbs)
- Stochastic Modeling
- Probabilistic methods for early speech
recognition, OCR (Bledsoe and Browning, Jelinek,
Black, Mercer)
6. 1983-1993: Return of Empiricism
- Use of stochastic techniques for part of speech
tagging, parsing, word sense disambiguation, etc.
- Comparison of stochastic, symbolic, more or less
powerful models for language understanding and
learning tasks.
7. 1993-Present
- Advances in software and hardware create NLP
needs for information retrieval (web), machine
translation, spelling and grammar checking,
speech recognition and synthesis.
- Stochastic and symbolic methods combine for
real-world applications.
8. Language and Intelligence: Turing Test
- Turing test
- machine, human, and human judge
- Judge asks questions of computer and human.
- Machine's job is to act like a human; human's job
is to convince judge that he's not the machine.
- Machine judged intelligent if it can fool
judge.
- Judgement of intelligence linked to appropriate
answers to questions from the system.
9. ELIZA
- Remarkably simple Rogerian Psychologist
- Uses Pattern Matching to carry on limited form of
conversation.
- Seems to Pass the Turing Test! (McCorduck,
1979, pp. 225-226)
- Eliza Demo
http://www.lpa.co.uk/pws_dem4.htm
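A minimal sketch of the kind of pattern matching the ELIZA slide describes. The rules, reply templates, and the `respond` function are hypothetical and made up for illustration; they are not taken from the original ELIZA script.

```python
import re

# Hypothetical rules for illustration only: (pattern, reply template).
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]
DEFAULT_REPLY = "Please go on."


def respond(utterance: str) -> str:
    """Return the reply of the first rule whose pattern matches the utterance."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Echo the matched fragment back inside the canned template.
            return template.format(*match.groups())
    return DEFAULT_REPLY


print(respond("I am sad."))
print(respond("Hello"))
```

No syntactic or semantic analysis takes place here: the reply is built only from surface patterns, which is why the conversation stays limited.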
10. What's involved in an intelligent Answer?
Analysis: Decomposition of the signal (spoken
or written) eventually into meaningful units.
This involves:
11. Speech/Character Recognition
- Decomposition into words, segmentation of words
into appropriate phones or letters
- Requires knowledge of phonological patterns
- I'm enormously proud.
- I mean to make you proud.
12. Morphological Analysis
- Inflectional
- duck + s: N duck + plural s
- duck + s: V duck + 3rd person s
- Derivational
- kind, kindness
- Spelling changes
- drop, dropping
- hide, hiding
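The examples above can be sketched as a toy analyzer. This is only an illustration under stated assumptions: the lexicon, the three suffixes, and the function names `candidate_stems` and `analyze` are hypothetical, and the spelling rules cover just consonant doubling and e-deletion.

```python
# Hypothetical toy lexicon: stem -> set of categories (N, V, A).
LEXICON = {"duck": {"N", "V"}, "drop": {"V"}, "hide": {"V"}, "kind": {"A"}}


def candidate_stems(word, suffix):
    """Strip the suffix and propose stems that undo common spelling changes."""
    if not word.endswith(suffix):
        return []
    base = word[: -len(suffix)]
    stems = [base]                      # ducks -> duck
    if len(base) >= 2 and base[-1] == base[-2]:
        stems.append(base[:-1])         # dropping -> dropp -> drop
    stems.append(base + "e")            # hiding -> hid -> hide
    return stems


def analyze(word):
    """Return every (stem, category, feature) analysis licensed by the lexicon."""
    analyses = []
    for stem in candidate_stems(word, "s"):          # inflectional -s
        categories = LEXICON.get(stem, set())
        if "N" in categories:
            analyses.append((stem, "N", "plural"))
        if "V" in categories:
            analyses.append((stem, "V", "3rd person"))
    for stem in candidate_stems(word, "ing"):        # inflectional -ing
        if "V" in LEXICON.get(stem, set()):
            analyses.append((stem, "V", "progressive"))
    for stem in candidate_stems(word, "ness"):       # derivational -ness
        if "A" in LEXICON.get(stem, set()):
            analyses.append((stem, "A", "-ness noun"))
    return analyses


for w in ("ducks", "kindness", "dropping", "hiding"):
    print(w, analyze(w))
```

Note that "ducks" receives two analyses; choosing between them is left to later stages such as part of speech tagging.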
13. Syntactic Analysis
- Associate constituent structure with string
- Prepare for semantic interpretation
14. Semantics
- A way of representing meaning
- Abstracts away from syntactic structure
- Example
- First-Order Logic: watch(I, terrapin)
- Can be "I watched the terrapin" or "The terrapin
was watched by me"
- Real language is complex
- Who did I watch?
15. Lexical Semantics
- The Terrapin is who I watched.
- Watch the Terrapin is what I do best.
- Terrapin is what I watched the
- I: experiencer
- Watch the Terrapin: predicate
- The Terrapin: patient
16. Compositional Semantics
- Association of parts of a proposition with
semantic roles
- Scoping
17. Word-Governed Semantics
- Any verb can add -able to form an adjective.
- I taught the class. The class is teachable.
- I rejected the idea. The idea is rejectable.
- Association of particular words with specific
semantic forms.
- John (masculine)
- The boys (masculine, plural, human)
18. Pragmatics
- Real world knowledge, speaker intention, goal of
utterance.
- Related to sociology.
- Example 1
- Could you turn in your assignments now? (command)
- Could you finish the homework? (question,
command)
- Example 2
- I couldn't decide how to catch the crook. Then I
decided to spy on the crook with binoculars.
- To my surprise, I found out he had them too.
Then I knew to just follow the crook with
binoculars.
- [the crook] [with binoculars] (binoculars as the
instrument of spying)
- [the crook with binoculars] (the crook who has
binoculars)
19. Discourse Analysis
- Discourse: How propositions fit together in a
conversation; multi-sentence processing.
- Pronoun reference: The professor told the
student to finish the assignment. He was pretty
aggravated at how long it was taking to pass it
in.
- Multiple reference to same entity: George W.
Bush, president of the U.S.
- Relation between sentences: John hit the man. He
had stolen his bicycle.
20. NLP Pipeline
speech
text
Phonetic Analysis
OCR/Tokenization
Morphological analysis
Syntactic analysis
Semantic Interpretation
Discourse Processing
21. Relation to Machine Translation
input
analysis
generation
output
Morphological analysis
Morphological synthesis
Syntactic analysis
Syntactic realization
Semantic Interpretation
Lexical selection
Interlingua
22. Ambiguity
I made her duck
- I made duckling for her
- I made the duckling belonging to her
- I created the duck she owns
- I forced her to lower her head
- By magic, I changed her into a duck
23. Syntactic Disambiguation
[S [NP I] [VP [V made] [NP her] [VP [V duck]]]]
[S [NP I] [VP [V made] [NP [det her] [N duck]]]]
24. Part of Speech Tagging and Word Sense
Disambiguation
- verb: Duck!
- noun: Duck is delicious for dinner
- I went to the bank to deposit my check.
- I went to the bank to look out at the river.
- I went to the bank of windows and chose the
one dealing with last names beginning with d.
25. Resources for NLP Systems
- Dictionary
- Morphology and Spelling Rules
- Grammar Rules
- Semantic Interpretation Rules
- Discourse Interpretation
- Natural language processing involves (1) learning
or fashioning the rules for each component, (2)
embedding the rules in the relevant automaton,
and (3) using the automaton to efficiently
process the input.
26. Some NLP Applications
- Machine Translation: Babelfish (Alta Vista)
- Question Answering: Ask Jeeves (Ask Jeeves)
- Language Summarization: MEAD (U. Michigan)
- Spoken Language Recognition: EduSpeak (SRI)
- Automatic Essay Evaluation: E-Rater (ETS)
- Information Retrieval and Extraction: NetOwl
(SRA)
http://babelfish.altavista.com/translate.dyn
http://www.ask.com/
http://www.summarization.com/mead
http://www.eduspeak.com/
http://www.ets.org/research/erater.html
http://www.netowl.com/extractor_summary.html
27. What is MT?
- Definition: Translation from one natural language
to another by means of a computerized system
- Early failures
- Later: varying degrees of success
28. An Old Example
- The spirit is willing but the flesh is weak
- The vodka is good but the meat is rotten
29. Machine Translation History
- 1950s: Intensive research activity in MT
- 1960s: Direct word-for-word replacement
- 1966 (ALPAC): NRC Report on MT
- Conclusion: MT no longer worthy of serious
scientific investigation.
- 1966-1975: Recovery period
- 1975-1985: Resurgence (Europe, Japan)
- 1985-present: Resurgence (US)
http://ourworld.compuserve.com/homepages/WJHutchins/MTS-93.htm
30. What happened between ALPAC and Now?
- Need for MT and other NLP applications confirmed
- Change in expectations
- Computers have become faster, more powerful
- WWW
- Political state of the world
- Maturation of Linguistics
- Development of hybrid statistical/symbolic
approaches
31. Three MT Approaches: Direct, Transfer,
Interlingual
[Figure: pyramid of MT approaches]
Analysis: Source Text, Morphological Analysis,
Word Structure, Syntactic Analysis, Syntactic
Structure, Semantic Analysis, Semantic Structure,
Semantic Composition, Interlingua
Generation: Interlingua, Semantic Decomposition,
Semantic Structure, Semantic Generation,
Syntactic Structure, Syntactic Generation, Word
Structure, Morphological Generation, Target Text
Links between the two sides: Direct (Word
Structure), Syntactic Transfer (Syntactic
Structure), Semantic Transfer (Semantic Structure)
32. Examples of Three Approaches
- Direct
- I checked his answers against those of the
teacher → Yo comparé sus respuestas a las de la
profesora
- Rule: check X against Y → comparar X a Y
- Transfer
- Ich habe ihn gesehen → I have seen him
- Rule: clause agt aux obj pred → clause agt aux
pred obj
- Interlingual
- I like Mary → Mary me gusta a mí
- Rep: BeIdent (I ATIdent (I, Mary) Likeingly)
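The transfer rule above (clause agt aux obj pred → clause agt aux pred obj) can be sketched as a reordering over role-labelled words. This is a toy illustration, not how any of the systems named in these slides work: the four-word lexicon, the fixed-position role assignment, and the function names are all hypothetical.

```python
# Hypothetical toy German-English lexicon covering only the slide's example.
LEXICON = {"ich": "I", "habe": "have", "ihn": "him", "gesehen": "seen"}

SOURCE_ORDER = ("agt", "aux", "obj", "pred")   # source-side clause order
TARGET_ORDER = ("agt", "aux", "pred", "obj")   # target-side clause order


def analyze(words):
    """Label each word with a role, assuming the fixed source clause order."""
    if len(words) != len(SOURCE_ORDER):
        raise ValueError("this sketch only handles four-word clauses")
    return dict(zip(SOURCE_ORDER, words))


def transfer(clause):
    """Apply the transfer rule: reorder the roles, then substitute words."""
    return [LEXICON[clause[role].lower()] for role in TARGET_ORDER]


def translate(sentence):
    """Analysis, transfer, and (trivial) generation for one clause."""
    return " ".join(transfer(analyze(sentence.split())))


print(translate("Ich habe ihn gesehen"))
```

The rule is stated over roles rather than words, so one rule covers every clause with this shape; a direct system would instead need the reordering encoded word by word.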
33. MT Systems: 1964-1990
- Direct: GAT (Georgetown, 1964), TAUM-METEO
(Colmerauer et al. 1971)
- Transfer: GETA/ARIANE (Boitet, 1978), LMT (McCord,
1989), METAL (Thurmair, 1990), MiMo (Arnold and
Sadler, 1990)
- Interlingual: MOPTRANS (Schank, 1974), KBMT
(Nirenburg et al, 1992), UNITRAN (Dorr, 1990)
34. Statistical MT and Hybrid Symbolic/Stats MT:
1990-Present
- Candide (Brown, 1990, 1992), Halo/Nitrogen
(Langkilde and Knight, 1998), Yamada and Knight
(2002), GHMT (Dorr and Habash, 2002), DUSTer (Dorr
et al. 2002)
35. Direct MT: Pros and Cons
- Pros
- Fast
- Simple
- Inexpensive
- Cons
- Unreliable
- Not powerful
- Rule proliferation
- Requires too much context
- Major restructuring after lexical substitution
36. Transfer MT: Pros and Cons
- Pros
- Don't need to find language-neutral rep
- No translation rules hidden in lexicon
- Relatively fast
- Cons
- N^2 sets of transfer rules: Difficult to extend
- Proliferation of language-specific rules in
lexicon and syntax
- Cross-language generalizations lost
37. Interlingual MT: Pros and Cons
- Pros
- Portable (avoids N^2 problem)
- Lexical rules and structural transformations
stated more simply on normalized representation
- Explanatory Adequacy
- Cons
- Difficult to deal with terms on primitive level:
universals?
- Must decompose and reassemble concepts
- Useful information lost (paraphrase)
38. Approximate IL Approach
- Tap into richness of TL resources
- Use some, but not all, components of IL
representation
- Generate multiple sentences that are
statistically pared down
39. Approximating IL: Handling Divergences
- Primitives
- Semantic Relations
- Lexical Information
40. Interlingual vs. Approximate IL
- Interlingual MT
- primitives and relations
- bi-directional lexicons
- analysis: compose IL
- generation: decompose IL
- Approximate IL
- hybrid symbolic/statistical design
- overgeneration with statistical ranking
- uses dependency rep input and structural
expansion for deeper overgeneration
41. Mapping from Input Dependency to English
Dependency Tree
Mary le dio patadas a John → Mary kicked John
Knowledge Resources in English only (LVD; Dorr,
2001).
42. Statistical Extraction
Mary kicked John . 0.670270
Mary gave a kick at John . -2.175831
Mary gave the kick at John . -3.969686
Mary gave an kick at John . -4.489933
Mary gave a kick by John . -4.803054
Mary gave a kick to John . -5.045810
Mary gave a kick into John . -5.810673
Mary gave a kick through John . -5.836419
Mary gave a foot wound by John . -6.041891
Mary gave John a foot wound . -6.212851
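A minimal sketch of the ranking step: overgenerated candidates are sorted by score and the top one is kept. The sentences and scores are a subset copied from the slide; the assumption that a higher score means a better candidate is inferred from the slide's ordering, and the function names are hypothetical.

```python
# A subset of the slide's scored candidates, deliberately out of order.
RAW = """
Mary gave a kick to John . -5.045810
Mary kicked John . 0.670270
Mary gave John a foot wound . -6.212851
Mary gave a kick at John . -2.175831
"""


def parse(raw):
    """Split each line into (sentence, score); the score is the last field."""
    pairs = []
    for line in raw.strip().splitlines():
        sentence, score = line.rsplit(maxsplit=1)
        pairs.append((sentence, float(score)))
    return pairs


def rank(pairs):
    """Sort candidates from best to worst, assuming higher scores are better."""
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


ranked = rank(parse(RAW))
best_sentence, best_score = ranked[0]
print(best_sentence, best_score)
```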
43. Benefits of Approximate IL Approach
- Explaining behaviors that appear to be
statistical in nature
- Re-sourceability: Re-use of already existing
components for MT from new languages.
- Application to monolingual alternations
44. What Resources are Required?
- Deep TL resources
- Requires SL parser and tralex
- TL resources are richer: LVD representations,
CatVar database
- Constrained overgeneration
45. MT Challenges: Ambiguity
- Syntactic Ambiguity: I saw the man on the hill
with the telescope
- Lexical Ambiguity
- E: book
- S: libro, reservar
- Semantic Ambiguity
- Homography: ball (E) = pelota, baile (S)
- Polysemy: kill (E) = matar, acabar (S)
- Semantic granularity: esperar (S) = wait, expect,
hope (E); be (E) = ser, estar (S); fish (E) = pez,
pescado (S)
46. How do we evaluate MT?
- Human-based Metrics
- Semantic Invariance
- Pragmatic Invariance
- Lexical Invariance
- Structural Invariance
- Spatial Invariance
- Fluency
- Accuracy
- Do you get it?
- Automatic Metrics: Bleu
47. BiLingual Evaluation Understudy (BLEU; Papineni,
2001)
http://www.research.ibm.com/people/k/kishore/RC22176.pdf
- Automatic Technique, but ...
- Requires the pre-existence of Human (Reference)
Translations
- Approach
- Produce corpus of high-quality human translations
- Judge closeness numerically (word-error rate)
- Compare n-gram matches between candidate
translation and 1 or more reference translations
48. Bleu Comparison
Chinese-English Translation Example
Candidate 1: It is a guide to action which ensures
that the military always obeys the commands of the
party.
Candidate 2: It is to insure the troops forever
hearing the activity guidebook that party direct.
Reference 1: It is a guide to action that ensures
that the military will forever heed Party
commands.
Reference 2: It is the guiding principle which
guarantees the military forces always being under
the command of the Party.
Reference 3: It is the practical guide for the
army always to heed the directions of the party.
49. How Do We Compute Bleu Scores?
- Key Idea: A reference word should be considered
exhausted after a matching candidate word is
identified.
- For each word compute
- (1) candidate word count
- (2) maximum ref count
- Add counts for each candidate word using the
lower of the two numbers.
- Divide by number of candidate words.
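The steps above can be sketched as follows. This is a minimal illustration of modified (clipped) n-gram precision only, not a full Bleu implementation: the tokenization (lowercasing, dropping punctuation) is an assumption the slides do not specify, and the function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase and keep word characters only (an assumed tokenization)."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n=1):
    """Return (clipped match count, number of candidate n-grams)."""
    # (1) candidate count for each n-gram
    cand_counts = Counter(ngrams(tokenize(candidate), n))
    # (2) maximum count of each n-gram in any single reference
    max_ref_counts = Counter()
    for reference in references:
        for gram, count in Counter(ngrams(tokenize(reference), n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    # Add the lower of the two numbers for each candidate n-gram ...
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    # ... and divide by the number of candidate n-grams (left to the caller).
    return clipped, sum(cand_counts.values())


refs = ["The cat is on the mat", "There is a cat on the mat"]
print(modified_precision("the the the the the the the", refs, n=1))  # (2, 7)
```

Clipping is what stops a candidate from scoring well by repeating one common reference word: the seven occurrences of "the" are credited only twice, the maximum count in any single reference.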
50. Modified Unigram Precision: Candidate 1
It(1) is(1) a(1) guide(1) to(1) action(1)
which(1) ensures(1) that(2) the(4) military(1)
always(1) obeys(0) the commands(1) of(1) the
party(1)
Reference 1: It is a guide to action that ensures
that the military will forever heed Party
commands.
Reference 2: It is the guiding principle which
guarantees the military forces always being under
the command of the Party.
Reference 3: It is the practical guide for the
army always to heed the directions of the party.
What's the answer?
17/18
51. Modified Unigram Precision: Candidate 2
It(1) is(1) to(1) insure(0) the(4) troops(0)
forever(1) hearing(0) the activity(0)
guidebook(0) that(2) party(1) direct(0)
Reference 1: It is a guide to action that ensures
that the military will forever heed Party
commands.
Reference 2: It is the guiding principle which
guarantees the military forces always being under
the command of the Party.
Reference 3: It is the practical guide for the
army always to heed the directions of the party.
What's the answer?
8/14
52. Modified Bigram Precision: Candidate 1
It is(1) is a(1) a guide(1) guide to(1) to
action(1) action which(0) which ensures(0)
ensures that(1) that the(1) the military(1)
military always(0) always obeys(0) obeys the(0)
the commands(0) commands of(0) of the(1) the
party(1)
Reference 1: It is a guide to action that ensures
that the military will forever heed Party
commands.
Reference 2: It is the guiding principle which
guarantees the military forces always being under
the command of the Party.
Reference 3: It is the practical guide for the
army always to heed the directions of the party.
What's the answer?
10/17
53. Modified Bigram Precision: Candidate 2
It is(1) is to(0) to insure(0) insure the(0) the
troops(0) troops forever(0) forever hearing(0)
hearing the(0) the activity(0) activity
guidebook(0) guidebook that(0) that party(0)
party direct(0)
Reference 1: It is a guide to action that ensures
that the military will forever heed Party
commands.
Reference 2: It is the guiding principle which
guarantees the military forces always being under
the command of the Party.
Reference 3: It is the practical guide for the
army always to heed the directions of the party.
What's the answer?
1/13
54. Catching Cheaters
the(2) the the the(0) the(0) the(0) the(0)
Reference 1: The cat is on the mat
Reference 2: There is a cat on the mat
What's the unigram answer?
2/7
What's the bigram answer?
0/6