Title: Fall 2004
1EECS 595 / LING 541 / SI 661
Natural Language Processing
- Fall 2004
- Lecture Notes 1
2Introduction
3Course logistics
- Instructor: Prof. Dragomir Radev (radev_at_umich.edu);
Ph.D., Computer Science, Columbia University;
formerly at IBM TJ Watson Research Center
- Times: Tuesdays 1:10-3:55 PM, in 412 West Hall
- Office hours: TBA, 3080 West Hall Connector
- Course home page: http://www.si.umich.edu/radev/NLP-fall2004
4Example (from a famous movie)
Dave Bowman: Open the pod bay doors, HAL.
HAL: I'm sorry, Dave. I'm afraid I can't do that.
5Example
I saw her fall
- How many different interpretations does the above
sentence have?
6What is Natural Language Processing?
- Natural Language Processing (NLP) is the study of
the computational treatment of natural language.
- NLP draws on research in Linguistics, Theoretical
Computer Science, Mathematics and Statistics,
Artificial Intelligence, Psychology, etc.
7Linguistics
- Knowledge about language
- Phonetics and phonology - the study of sounds
- Morphology - the study of word components
- Syntax - the study of sentence and phrase structure
- Lexical semantics - the study of the meanings of words
- Compositional semantics - how to combine words
- Pragmatics - how to accomplish goals
- Discourse conventions - how to deal with units
larger than utterances
8Theoretical Computer Science
- Automata
- Deterministic and non-deterministic finite-state automata
- Push-down automata
- Grammars
- Regular grammars
- Context-free grammars
- Context-sensitive grammars
- Complexity
- Algorithms
- Dynamic programming
9Mathematics and Statistics
- Probabilities
- Statistical models
- Hypothesis testing
- Linear algebra
- Optimization
- Numerical methods
10Artificial Intelligence
- Logic
- First-order logic
- Predicate calculus
- Agents
- Speech acts
- Planning
- Constraint satisfaction
- Machine learning
11Ambiguity
I saw her fall.
- The categories of knowledge of language can be
thought of as ambiguity-resolving components
- How many different interpretations does the above
sentence have?
- How can each ambiguous piece be resolved?
- Does speech input make the sentence even more
ambiguous?
Time flies like an arrow.
12http://edition.cnn.com/2004/WEATHER/09/03/hurricane.frances/index.html
Frances churns toward Florida
Hurricane center: Storm 'relentlessly lashing Bahamas'
Friday, September 3, 2004 Posted 2024 GMT (0424 HKT)
MIAMI, Florida (CNN) -- Hurricane Frances moved slowly toward
Florida on Friday, and the National Hurricane
Center said it could gain intensity before making
landfall, possibly late Saturday. At 2 p.m. ET,
the Category 3 storm was centered near the
southern tip of Great Abaco in the Bahamas, 200
miles (321 kilometers) east-southeast of
Florida's lower east coast, according to the
National Hurricane Center. The storm was moving
toward the west-northwest at about 9 mph (15
kph). Its maximum sustained winds had dropped to
115 mph (185 kph), but forecasters said it still
is "a dangerous hurricane." Hurricanes are
classified as categories 1 to 5 on the
Saffir-Simpson hurricane scale. A Category 3
storm has sustained winds between 111 and 130 mph
(178 and 209 kph). The advisory said Frances was
likely to make landfall in Florida in about 36
hours. Hurricane-force winds extend 85 miles (140
kilometers) from the center of the storm, and
winds of tropical storm strength (39-73 mph)
extend outward up to 185 miles (295
kilometers). Because Frances is the size of Texas
-- more than twice as large as Hurricane Charley
three weeks ago -- its major winds and heavy rain
are expected to batter a large part of Florida
well before landfall. By Friday afternoon, parts
of Florida were experiencing wind gusts as high
as 39 mph -- the lower end of tropical-storm
intensity. Hurricane warnings are in effect for
much of Florida's eastern coastline. A hurricane
warning means hurricane conditions are expected
in the warning area within 24 hours. Storm surge
flooding of six to 14 feet above normal has been
reported in the storm's path, and the hurricane
center warned "rainfall amounts of seven to 12
inches -- locally as high as 20 inches -- are
possible in association with Frances." The
hurricane center bulletin said Frances was
"relentlessly lashing the central and western
Bahamas." A hurricane center official told CNN
the storm could spend two days moving across the
Florida Peninsula. Frances has weakened slightly
in the past few days, but the hurricane center
advisory warned that as it moves across the warm
waters of the Gulf Stream, "this could easily
lead to re-intensification." However, current
forecasts predict "a 100-knot hurricane at
landfall" -- meaning wind speeds of about 115
mph. Because steering currents are expected to
weaken further, Frances "will likely slow down on
its way to Florida. This could delay the landfall
a few more hours," the advisory said. "Numerical
guidance continues to bring the hurricane over
Florida during the next two to three
days." Florida Gov. Jeb Bush said Friday that the
state was taking all necessary steps to prepare
for the storm.
13"We are staging across -- some outside the
state and some inside the state -- a massive
response for this storm, and we're going to need
it," Bush said in a news conference. "There's
going to be a lot of work necessary to make sure
that the response is massive and immediate to
help people once this storm comes." He said he
has asked the governors of 17 states to waive
size and weight restrictions on trucks carrying
relief supplies. His brother, President Bush,
also offered support at a campaign rally Friday
morning in Pennsylvania. "Before I begin, I do
know you'll join me in offering our prayers and
best wishes to those in the path of Hurricane
Frances," the president said.
A hurricane the size of Texas
Florida ordered mandatory
evacuations in parts of 16 counties and voluntary
evacuations in five other counties. "If you are
on a barrier island or a low-lying area, and you
haven't left, now is the time to do so," Governor
Bush said. Florida officials said the evacuation
order covers 2.5 million people. Most of them
"are staying in their own community, which is
exactly what they should be doing," said Bush,
noting that low-lying areas were most at risk.
"They've made plans to be with a loved one or a
friend and they're not on the roads." People
looking to flee the region clogged highways
Thursday, but officials said Friday that traffic
had died down. "Overall we're very, very pleased
with evacuation procedures yesterday and
continuing through today," said Col. Chris
Knight, director of the Florida Highway Patrol.
"We have no problems this morning." The Red Cross
opened 82 shelters in Florida on Thursday and
about 21,000 people were in them by nightfall,
spokeswoman Carol Miller told CNN. The group also
set up eight reception centers along the highway
to help people who needed information,
directions, water and maps, she said. Miller said
the Red Cross was launching its largest-ever
response effort to a domestic natural
disaster. Airlines have canceled flights in and
out of some of the major airports in Florida and
the Caribbean, and are expected to adjust
schedules as weather patterns change throughout
the weekend.
Military preparations
Military officials are preparing to evacuate three commands as
Frances approaches. At MacDill Air Force Base in
Tampa, on Florida's Gulf Coast, a military team
is preparing to set up alternative headquarters
facilities for the U.S. Central Command and
Special Operations Command at the stadium used by
the Tampa Bay Buccaneers football team. Central
Command is responsible for running the wars in
Afghanistan and Iraq, while Special Operations
Command oversees 50,000 special operations
forces. Patrick Air Force Base, on the eastern
coast of Florida near Melbourne, was evacuated
Thursday, and the commander of a fighter wing
near Miami ordered aircraft moved out of the
hurricane's path. The naval air station at
Jacksonville also moved aircraft out of the
area. In Miami, the headquarters of the Southern
Command has closed. Command-and-control
operations are being performed, but they could be
moved to Davis-Monthan Air Force Base in Arizona.
14The alphabet soup (NLP vs. CL vs. SP vs. HLT vs. NLE)
- NLP (Natural Language Processing)
- CL (Computational Linguistics)
- SP (Speech Processing)
- HLT (Human Language Technology)
- NLE (Natural Language Engineering)
- Other areas of research: Speech and Text
Generation, Speech and Text Understanding,
Information Extraction, Information Retrieval,
Dialogue Processing, Inference
- Related areas: Spelling Correction, Grammar
Correction, Text Summarization
15Sample applications
- Speech Understanding
- Question Answering
- Machine Translation
- Text-to-speech Generation
- Text Summarization
- Dialogue Systems
16Some demos
- AT&T Labs Text-To-Speech (http://www.research.att.com/projects/tts/demo.html)
- Babelfish (babelfish.altavista.com)
- OneAcross (www.oneacross.com)
- AskJeeves (www.ask.com)
- IONaut (http://www.ionaut.com:8400)
- NSIR (http://tangra.si.umich.edu/clair/NSIR/html/nsir.cgi)
- AnswerBus (www.answerbus.com)
- NewsInEssence (www.newsinessence.com)
21The Turing Test
- Alan Turing: the Turing test (language as test
for intelligence)
- Three participants: a computer and two humans
(one is an interrogator)
- Interrogator's goal: to tell the machine and
human apart
- Machine's goal: to fool the interrogator into
believing that a person is responding
- Other human's goal: to help the interrogator
reach his goal
Q: Please write me a sonnet on the topic of the Forth Bridge.
A: Count me out on this one. I never could write poetry.
Q: Add 34957 to 70764.
A: 105621 (after a pause)
22Some brief history
- Foundational insights (40s and 50s): automaton
(Turing), probabilities, information theory
(Shannon), formal languages (Backus and Naur),
noisy channel and decoding (Shannon), first
systems (Davis et al., Bell Labs)
- Two camps (57-70): symbolic and stochastic
- Transformation grammar (Harris, Chomsky),
artificial intelligence (Minsky, McCarthy,
Shannon, Rochester), automated theorem proving
and problem solving (Newell and Simon)
- Bayesian reasoning (Mosteller and Wallace)
- Corpus work (Kucera and Francis)
23Some brief history
- Four paradigms (70-83): stochastic (IBM),
logic-based (Colmerauer, Pereira and Warren, Kay,
Bresnan), NLU (Winograd, Schank, Fillmore),
discourse modelling (Grosz and Sidner)
- Empiricism and finite-state models redux (83-93):
Kaplan and Kay (phonology and morphology), Church
(syntax)
- Late years (94-03): strong integration of
different techniques, different areas (including
speech and IR), probabilistic models, machine
learning
24The state of the art and the near-term future
- World-Wide Web (WWW)
- Sample scenarios
- generate weather reports in two languages
- teaching deaf people to speak
- translate Web pages into different languages
- speak to your appliances
- find restaurants
- answer questions
- grade essays (?)
- closed-captioning in many languages
- automatic description of a soccer game
25Structure of the course
- Three major parts
- Linguistic, mathematical, and computational
background
- Computational models of morphology, syntax,
semantics, discourse, pragmatics
- Applications: text generation, machine
translation, information extraction, etc.
- Three major goals
- Learn the basic principles and theoretical issues
underlying natural language processing
- Learn techniques and tools used to develop
practical, robust systems that can communicate
with users in one or more languages
- Gain insight into many open research problems in
natural language
26Readings
- Speech and Language Processing (Daniel Jurafsky
and James Martin), Prentice-Hall, 2000, ISBN
0-13-095069-6
- Handouts given in class
- 1-2 chapters per week
Optional readings: Natural Language
Understanding by Allen; Foundations of
Statistical Natural Language Processing by
Manning and Schütze.
27Grading
- Four homework assignments (40%)
- Midterm (15%)
- Final project (20%)
- Final exam (25%)
- Additional requirements for SI761
28Assignments
- (subject to change)
- Finite-state modeling, part of speech tagging,
and information extraction
- Fsmtools/lextools/JMX (Bell Labs, Penn)
- Tagging and parsing
- Brill tagger/Charniak parser (JHU, Brown)
- Machine translation
- GIZA/Rewrite decoder (Aachen, JHU, ISI)
- Text generation
- FUF/Surge (Columbia)
29Syllabus
Wk  Date    Topic                                            HW  HW due
1   9/7     Introduction (JM1)
            Linguistic Fundamentals
2   9/14    Regular Expressions and Automata (JM2)           1
3   9/21    Morphology and Finite-State Transducers (JM3)
            Word Classes and Part of Speech Tagging (JM8)
4   9/28    Context-Free Grammars for English (JM9)
            Parsing with Context-Free Grammars (JM10)        2   1
5   10/5    Features and Unification (JM11)
            Lexicalized and Probabilistic Parsing (JM12)
6   10/12   Natural Language Generation (JM20)
            Machine Translation (JM 21 handout)              3   2
    10/19   NO CLASS
30Syllabus
Wk  Date    Topic                                            HW  HW due
7   10/26   Midterm
8   11/2    Natural Language Generation (JM20) (Contd)
            The Functional Unification Formalism (Handout)   4   3
9   11/9    Language and Complexity (JM13)
10  11/16   Representing Meaning (JM14)                          4
11  11/23   Semantic Analysis (JM15)
            Discourse (JM18)
12  11/30   Rhetorical Analysis (Handout)
            Dialogue and Conversational Agents (JM19)            Project due
13  12/7, 12/14  Project Presentations
31Other meetings
- CLAIR meeting
- (TBA)
- Artificial Intelligence Seminar
- (Tuesdays 4-5:30)
- STIET
- (Thursdays 4-5:30)
32Projects
Each student will be responsible for designing
and completing a research project that
demonstrates the ability to use concepts from the
class in addressing a practical problem. A
significant part of the final grade will depend
on the project assignment. Students can elect to
do a project on an assigned topic, or to select a
topic of their own. The final version of the
project will be put on the World Wide Web, and
will be defended in front of the class at the end
of the semester (procedure TBA). In some cases
(and only with the instructor's approval), students
may be allowed to work in pairs when the
project's scope is significant.
33Sample projects
- Noun phrase parser
- Paraphrase identification
- Question answering
- NL access to databases
- Named entity tagging
- Rhetorical parsing
- Anaphora resolution, entity crossreference
- Document and sentence alignment
- Using bioinformatics methods
- Encyclopedia
- Information extraction
- Speech processing
- Sentence normalization
- Text summarization
- Sentence compression
- Definition extraction
- Crossword puzzle generation
- Prepositional phrase attachment
- Machine translation
- Generation
- Semi-structured document parsing
- Semantic analysis of short queries
- User-friendly summarization
- Number classification
- Domain-specific PP attachment
- Time-dependent fact extraction
34Main research forums and other pointers
- Conferences: ACL/NAACL, SIGIR, AAAI/IJCAI, ANLP,
Coling, HLT, EACL/NAACL, AMTA/MT Summit,
ICSLP/Eurospeech
- Journals: Computational Linguistics, Natural
Language Engineering, Information Retrieval,
Information Processing and Management, ACM
Transactions on Information Systems, ACM TALIP,
ACM TSLP
- University centers: Columbia, CMU, JHU, Brown,
UMass, MIT, UPenn, USC/ISI, NMSU, Michigan,
Maryland, Edinburgh, Cambridge, Saarland,
Sheffield, and many others
- Industrial research sites: IBM, SRI, BBN, MITRE,
MSR, (AT&T, Bell Labs, PARC)
- Startups: Language Weaver, Ask.com, LCC
- The Anthology: http://www.aclweb.org/anthology
36What this course is NOT
- EECS 597 / LING 792 / SI 661 Language and
Information, last taught in Fall of 2002,
essentially an introduction to corpus-based and
statistical NLP.
- Topics covered: introduction to computational
linguistics, information theory, data compression
and coding, N-gram models, clustering,
lexicography, collocations, text summarization,
information extraction, question answering, word
sense disambiguation, analysis of style, and
other topics.
- SI 760 Information Retrieval, last taught
Winter 2003.
- Topics covered: information need, IR models,
documents, queries, query languages, relevance,
retrieval evaluation, reference collections,
query expansion and relevance feedback, indexing
and searching, XML retrieval, language modeling
approaches, crawling the Web, hyperlink analysis,
measuring the Web, similarity and clustering,
social network analysis for IR, hubs and
authorities, PageRank and HITS, focused crawling,
relevance transfer, question answering
- An undergraduate Linguistics course such as Ling
212 Intro to the Symbolic Analysis of Language
or Ling 320 Programming for Linguistics and
Language Studies
37Linguistic Fundamentals
38Syntactic categories
Nathalie likes {black, Persian, tabby, small} cats.
- Open (lexical) and closed (functional) categories
- Open: No-fly-zone, yadda yadda yadda
- Closed: the, in
39Morphology
The dog chased the yellow bird.
- Parts of speech: eight (or so) general types
- Inflection (number, person, tense)
- Derivation (adjective-adverb, noun-verb)
- Compounding (separate words or single word)
- Part-of-speech tagging
- Morphological analysis (prefix, root, suffix, ending)
40Part of speech tags
From Church (1991) - 79 tags
NN    singular noun
IN    preposition
AT    article
NP    proper noun
JJ    adjective
,     comma
NNS   plural noun
CC    conjunction
RB    adverb
VB    un-inflected verb
VBN   verb +en (taken, looked (passive, perfect))
VBD   verb +ed (took, looked (past tense))
CS    subordinating conjunction
41Jabberwocky (Lewis Carroll)
'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe
All mimsy were the borogoves,
And the mome raths outgrabe.
"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"
42Nouns
- Nouns: dog, tree, computer, idea
- Nouns vary in number (singular, plural), gender
(masculine, feminine, neuter), case (nominative,
genitive, accusative, dative)
- Latin: filius (m), filia (f), filium (object);
German: Mädchen
- Clitics ('s)
43Pronouns
- Pronouns: she, ourselves, mine
- Pronouns vary in person, gender, number, case (in
English: nominative, accusative, possessive, 2nd
possessive, reflexive)
Mary saw her in the mirror.
Mary saw herself in the mirror.
- Anaphors: herself, each other
44Determiners and adjectives
- Articles: the, a
- Demonstratives: this, that
- Adjectives: describe properties
- Attributive and predicative adjectives
- Agreement: in gender, number
- Comparative and superlative (derivative and
periphrastic)
- Positive form
45Verbs
- Actions, activities, and states (throw, walk,
have)
- English: four verb forms
- tenses: present, past, future
- other inflection: number, person
- gerunds and infinitive
- aspect: progressive, perfective
- voice: active, passive
- participles, auxiliaries
- irregular verbs
- French and Finnish: many more inflections than
English
46Other parts of speech
- Adverbs, prepositions, particles
- phrasal verbs (the plane took off, take it off)
- particles vs. prepositions (she ran up a
bill/hill)
- Coordinating conjunctions: and, or, but
- Subordinating conjunctions: if, because, that,
although
- Interjections: Ouch!
47Phrase structure
- Constraints on word order
- Constituents: NP, PP, VP, AP
- Phrase structure grammars
[S [NP [PN Spot]] [VP [V chased] [NP [Det a] [N bird]]]]
48Phrase structure
- Paradigmatic relationships (e.g., constituency)
- Syntagmatic relationships (e.g., collocations)
[S [NP That man] [VP [VBD caught] [NP the butterfly] [PP [IN with] [NP a net]]]]
49Phrase-structure grammars
Peter gave Mary a book. Mary gave Peter a book.
- Constituent order (SVO, SOV)
- imperative forms
- sentences with auxiliary verbs
- interrogative sentences
- declarative sentences
- start symbol and rewrite rules
- context-free view of language
50Sample phrase-structure grammar
S → NP VP
NP → AT NNS
NP → AT NN
NP → NP PP
VP → VP PP
VP → VBD
VP → VBD NP
PP → IN NP
AT → the
NNS → children
NNS → students
NNS → mountains
VBD → slept
VBD → ate
VBD → saw
IN → in
IN → of
NN → cake
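The sample grammar on this slide is small enough to try out directly. The following is a minimal sketch, not part of the original slides, that counts how many distinct parse trees the grammar assigns to a sentence. The names `RULES`, `LEXICON`, and `count_parses` are illustrative, and the rule the slide lists as "P" is read here as PP, since PP is the symbol the other rules use.

```python
from functools import lru_cache

# Phrase-structure rules from the slide (right-hand sides as tuples).
RULES = {
    "S": [("NP", "VP")],
    "NP": [("AT", "NNS"), ("AT", "NN"), ("NP", "PP")],
    "VP": [("VP", "PP"), ("VBD",), ("VBD", "NP")],
    "PP": [("IN", "NP")],  # assumption: the slide's "P" is read as PP
}

# Lexical rules from the slide: part of speech -> words.
LEXICON = {
    "AT": {"the"},
    "NNS": {"children", "students", "mountains"},
    "VBD": {"slept", "ate", "saw"},
    "IN": {"in", "of"},
    "NN": {"cake"},
}


def count_parses(sentence):
    """Count the parse trees rooted in S that cover the whole sentence."""
    words = tuple(sentence.lower().split())

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # Number of ways `symbol` can derive words[i:j].
        if symbol in LEXICON:
            return int(j - i == 1 and words[i] in LEXICON[symbol])
        total = 0
        for rhs in RULES[symbol]:
            if len(rhs) == 1:
                total += count(rhs[0], i, j)
            else:
                left, right = rhs
                # Every split point gives both children a non-empty span,
                # so left-recursive rules such as NP -> NP PP terminate.
                for k in range(i + 1, j):
                    total += count(left, i, k) * count(right, k, j)
        return total

    return count("S", 0, len(words)) if words else 0


print(count_parses("the children slept"))
print(count_parses("the children ate the cake in the mountains"))
```

A sentence with more than one parse, such as the second one, is ambiguous under this grammar: the PP "in the mountains" can attach to the verb phrase or to the noun phrase "the cake".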
51Phrase structure grammars
- Local dependencies
- Non-local dependencies
- Subject-verb agreement
The women who found the wallet were given a
reward.
Should Peter buy a book? Which book should Peter
buy?
52Dependency: arguments and adjuncts
Sue watched the man at the next table.
- Event dependents (verb arguments are usually
NPs)
- agent, patient, instrument, goal - semantic roles
- subject, direct object, indirect object
- transitive, intransitive, and ditransitive verbs
- active and passive voice
53Subcategorization
- Arguments: subject, complements
- adjuncts vs. complements
- adjuncts are optional and describe time, place,
manner
- subordinate clauses
- subcategorization frames
54Subcategorization
- Subject: The children eat candy.
- Object: The children eat candy.
- Prepositional phrase: She put the book on the table.
- Predicative adjective: We made the man angry.
- Bare infinitive: She helped me walk.
- To-infinitive: She likes to walk.
- Participial phrase: She stopped singing that tune at the end.
- That-clause: She thinks that it will rain tomorrow.
- Question-form clauses: She asked me what book I was reading.
55Subcategorization frames
- Intransitive verbs: The woman walked
- Transitive verbs: John loves Mary
- Ditransitive verbs: Mary gave Peter flowers
- Intransitive with PP: I rent in Paddington
- Transitive with PP: She put the book on the table
- Sentential complement: I know that she likes you
- Transitive with sentential complement: She told
me that Gary is coming on Tuesday
56Selectional restrictions and preferences
- Subcategorization frames capture syntactic
regularities about complements
- Selectional restrictions and preferences capture
semantic regularities: bark, eat
57Phrase structure ambiguity
- Grammars are used for generating and parsing
sentences
- Parses
- Syntactic ambiguity
- Attachment ambiguity: Our company is training
workers.
- The children ate the cake with a spoon.
- High vs. low attachment
- Garden path sentences: The horse raced past the
barn fell. Is the book on the table red?
58Ungrammaticality vs. semantic abnormality
Slept children the.
Colorless green ideas sleep furiously.
The cat barked.
59Semantics and pragmatics
- Lexical semantics and compositional semantics
- Hypernyms, hyponyms, antonyms, meronyms and
holonyms (part-whole relationship, tire is a
meronym of car), synonyms, homonyms
- Senses of words, polysemous words
- Homophony (bass).
- Collocations: white hair, white wine
- Idioms: to kick the bucket
60Discourse analysis
1. Mary helped Peter get out of the car. He
thanked her.
2. Mary helped the other passenger
out of the car. The man had asked her for
help because of his foot injury.
- Information extraction problems (entity
crossreferencing)
Hurricane Hugo destroyed 20,000 Florida homes. At
an estimated cost of one billion dollars, the
disaster has been the most costly in the state's
history.
61Pragmatics
- The study of how knowledge about the world and
language conventions interact with literal
meaning.
- Speech acts
- Research issues: resolution of anaphoric
relations, modeling of speech acts in dialogues
62Other areas of NLP
- Linguistics is traditionally divided into
phonetics, phonology, morphology, syntax,
semantics, and pragmatics.
- Sociolinguistics: interactions of social
organization and language.
- Historical linguistics: change over time.
- Linguistic typology
- Language acquisition
- Psycholinguistics: real-time production and
perception of language
63Other sites
- Johns Hopkins University (Jason Eisner) http://www.cs.jhu.edu/jason/465/
- Cornell University (Lillian Lee) http://courses.cs.cornell.edu/cs674/2002SP/
- Simon Fraser University (Anoop Sarkar) http://www.sfu.ca/anoop/courses/CMPT-825-Fall-2003/index.html
- Stanford University (Chris Manning) http://www.stanford.edu/class/cs224n/
- JHU Summer workshop http://www.clsp.jhu.edu/ws2003/calendar/preliminary.shtml
64Word classes and part-of-speech tagging
65Part of speech tagging
- Problems: transport, object, discount, address
- More problems: content
- French: est, président, fils
- "Book that flight": what is the part of speech
associated with "book"?
- POS tagging: assigning parts of speech to words
in a text.
- Three main techniques: rule-based tagging,
stochastic tagging, transformation-based tagging
66Rule-based POS tagging
- Use dictionary or FST to find all possible parts
of speech
- Use disambiguation rules (e.g., ART V: an article
cannot be followed by a verb)
- Typically hundreds of constraints can be designed
manually
67Example in French
Word      Possible tags          Reading in this sentence
<S>                              beginning of sentence
La        rf b nms u             article
teneur    nfs nms                noun feminine singular
Moyenne   jfs nfs v1s v2s v3s    adjective feminine singular
en        p a b                  preposition
uranium   nms                    noun masculine singular
des       p r                    preposition
rivieres  nfp                    noun feminine plural
,         x                      punctuation
bien_que  cs                     subordinating conjunction
délicate  jfs                    adjective feminine singular
À         p                      preposition
calculer  v                      verb
68Sample rules
- BS3 BI1: A BS3 (3rd person subject personal
pronoun) cannot be followed by a BI1 (1st person
indirect personal pronoun). In the example "il
nous faut" (we need), "il" has the tag
BS3MS and "nous" has the tags BD1P BI1P BJ1P
BR1P BS1P. The negative constraint "BS3 BI1"
rules out "BI1P", and thus leaves only 4
alternatives for the word "nous".
- N K: The tag N (noun) cannot be followed by a tag
K (interrogative pronoun); an example in the test
corpus would be "... fleuve qui ..."
(...river, that...). Since "qui" can be tagged
both as an "E" (relative pronoun) and a "K"
(interrogative pronoun), the "E" will be chosen
by the tagger since an interrogative pronoun
cannot follow a noun ("N").
- R V: A word tagged with R (article) cannot be
followed by a word tagged with V (verb); for
example "l' appelle" (calls him/her). The word
"appelle" can only be a verb, but "l'" can be
either an article or a personal pronoun. Thus,
the rule will eliminate the article tag, giving
preference to the pronoun.
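The first two negative constraints above can be sketched in a few lines. This is an illustrative sketch, not the tagger the slides describe: the function name `apply_constraints`, the tiny lexicon, and the choice to match tags by prefix (so that BS3 covers BS3MS) are assumptions. It also prunes only the right-hand word of each pair, so it does not cover the "R V" rule, which removes a tag from the left-hand word.

```python
# Candidate tags per word, taken from the examples on the slide.
LEXICON = {
    "il": {"BS3MS"},
    "nous": {"BD1P", "BI1P", "BJ1P", "BR1P", "BS1P"},
    "fleuve": {"N"},
    "qui": {"E", "K"},
}

# Negative constraints: a tag starting with the first prefix may not be
# followed by a tag starting with the second prefix.
FORBIDDEN = [("BS3", "BI1"), ("N", "K")]


def apply_constraints(words, lexicon=LEXICON, forbidden=FORBIDDEN):
    """Return the candidate tag sets left after applying the constraints."""
    candidates = [set(lexicon[w]) for w in words]
    for i in range(1, len(words)):
        left = candidates[i - 1]
        if len(left) != 1:
            continue  # only prune when the left context is unambiguous
        (left_tag,) = left
        for first, second in forbidden:
            if left_tag.startswith(first):
                pruned = {t for t in candidates[i] if not t.startswith(second)}
                if pruned:  # never remove every reading of a word
                    candidates[i] = pruned
    return candidates


print(apply_constraints(["il", "nous"]))
print(apply_constraints(["fleuve", "qui"]))
```

For "il nous" the sketch leaves four tags for "nous", and for "fleuve qui" it keeps only the relative-pronoun tag E, matching the two cases worked through in the text.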
69Stochastic POS tagging
- HMM tagger
- Pick the most likely tag for this word
- $P(\text{word} \mid \text{tag}) \cdot P(\text{tag} \mid \text{previous } n \text{ tags})$: find the tag
sequence that maximizes this probability formula
- A bigram-based HMM tagger chooses the tag $t_i$ for
word $w_i$ that is most probable given the previous
tag $t_{i-1}$ and the current word $w_i$
- $t_i = \arg\max_j P(t_j \mid t_{i-1}, w_i)$
- $t_i = \arg\max_j P(t_j \mid t_{i-1}) \, P(w_i \mid t_j)$ (HMM equation
for a single tag)
70Example
- Secretariat/NNP is/VBZ expected/VBN to/TO race/VB
tomorrow/ADV
- People/NNS continue/VBP to/TO inquire/VB the/DT
reason/NN for/IN the/DT race/NN for/IN outer/JJ
space/NN
- $P(\text{VB} \mid \text{TO}) \, P(\text{race} \mid \text{VB})$
- $P(\text{NN} \mid \text{TO}) \, P(\text{race} \mid \text{NN})$
- TO: to+VB (to sleep), to+NN (to school)
71Example (contd)
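- $P(\text{NN} \mid \text{TO}) = .021$
- $P(\text{VB} \mid \text{TO}) = .34$
- $P(\text{race} \mid \text{NN}) = .00041$
- $P(\text{race} \mid \text{VB}) = .00003$
- $P(\text{VB} \mid \text{TO}) \, P(\text{race} \mid \text{VB}) \approx .00001$
- $P(\text{NN} \mid \text{TO}) \, P(\text{race} \mid \text{NN}) \approx .0000086$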
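The comparison on this slide is a two-way argmax over the product of a transition and an emission probability. A minimal sketch using the four probabilities listed above (the names `transition`, `emission`, and `score` are illustrative):

```python
# Probabilities as listed on the slide.
transition = {"VB": 0.34, "NN": 0.021}      # P(tag | TO)
emission = {"VB": 0.00003, "NN": 0.00041}   # P(race | tag)


def score(tag):
    """P(tag | TO) * P(race | tag) for one candidate tag of 'race'."""
    return transition[tag] * emission[tag]


best_tag = max(transition, key=score)
for tag in transition:
    print(tag, score(tag))
print("best tag for 'race' after TO:", best_tag)
```

Although "race" is far more likely to be emitted by NN than by VB, the much higher probability of a verb following TO outweighs it, so VB wins.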
72HMM Tagging
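- $\hat{T} = \arg\max_T P(T \mid W)$, where $T = t_1, t_2, \ldots, t_n$
- By Bayes' rule: $P(T \mid W) = P(T) \, P(W \mid T) / P(W)$
- Thus we are attempting to choose the sequence of
tags that maximizes the rhs of the equation
- $P(W)$ can be ignored
- $P(T) \, P(W \mid T) = \prod_{i=1}^{n} P(w_i \mid w_1 t_1 \ldots w_{i-1} t_{i-1} t_i) \, P(t_i \mid w_1 t_1 \ldots w_{i-1} t_{i-1})$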
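The full factorization above conditions on the entire history, which is impractical to estimate. Under the bigram simplification from the stochastic tagging slide (each tag depends only on the previous tag, each word only on its own tag), the best tag sequence can be found with the Viterbi algorithm. The sketch below is one possible implementation, not the course's code. The transition and emission values for "race" come from the example slide; the start distribution and P(to | TO) = 1 are assumptions made only so the toy model is complete.

```python
import math


def log_prob(p):
    """Log of a probability, with log(0) mapped to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(words, tags, start_p, trans_p, emit_p):
    """Most probable tag sequence for `words` under a bigram HMM."""
    # best[i][t] = (log-probability of the best path ending in t, previous tag)
    best = [{
        t: (log_prob(start_p.get(t, 0)) + log_prob(emit_p[t].get(words[0], 0)), None)
        for t in tags
    }]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            # Pick the previous tag that maximizes path score * transition.
            prev, path_score = max(
                ((p, best[i - 1][p][0] + log_prob(trans_p.get(p, {}).get(t, 0)))
                 for p in tags),
                key=lambda pair: pair[1],
            )
            column[t] = (path_score + log_prob(emit_p[t].get(words[i], 0)), prev)
        best.append(column)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return path[::-1]


TAGS = ["TO", "VB", "NN"]
START = {"TO": 1.0}                               # assumption: sequence starts at TO
TRANS = {"TO": {"VB": 0.34, "NN": 0.021}}         # P(tag | TO), from the slides
EMIT = {
    "TO": {"to": 1.0},                            # assumption
    "VB": {"race": 0.00003},                      # P(race | VB), from the slides
    "NN": {"race": 0.00041},                      # P(race | NN), from the slides
}

print(viterbi(["to", "race"], TAGS, START, TRANS, EMIT))
```

Working in log space avoids numeric underflow on longer sentences, where the product of many small probabilities would otherwise round to zero.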
73Transformation-based learning
- $P(\text{NN} \mid \text{race}) = .98$
- $P(\text{VB} \mid \text{race}) = .02$
- Change NN to VB when the previous tag is TO
- Types of rules
- The preceding (following) word is tagged z
- The word two before (after) is tagged z
- One of the two preceding (following) words is
tagged z
- One of the three preceding (following) words is
tagged z
- The preceding word is tagged z and the following
word is tagged w
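The two steps on this slide (tag each word with its most likely tag, then apply a transformation) can be sketched as follows. This is a simplified illustration, not a full transformation-based learner: it applies one hand-written rule of the first type ("the preceding word is tagged z") and does not learn rules from a corpus. The function names and the small lexicon are hypothetical.

```python
# Hypothetical most-likely-tag lexicon; "race" is NN because P(NN | race) = .98.
MOST_LIKELY = {"expected": "VBN", "to": "TO", "race": "NN", "the": "DT"}


def initial_tags(words, lexicon=MOST_LIKELY, default="NN"):
    """Step 1: give every word its most likely tag, ignoring context."""
    return [lexicon.get(w, default) for w in words]


def apply_rule(tags, from_tag, to_tag, prev_tag):
    """Step 2: change from_tag to to_tag when the previous tag is prev_tag."""
    out = list(tags)
    for i in range(1, len(tags)):
        # Context is read from the tags as they were before this rule ran.
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out


verb_use = initial_tags(["expected", "to", "race"])
noun_use = initial_tags(["the", "race"])
print(apply_rule(verb_use, "NN", "VB", "TO"))
print(apply_rule(noun_use, "NN", "VB", "TO"))
```

The rule fixes "to race" while leaving "the race" untouched, which is the behaviour the slide's transformation is meant to capture.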
74Confusion matrix
IN JJ NN NNP RB VBD VBN
IN - .2 .7
JJ .2 - 3.3 2.1 1.7 .2 2.7
NN 8.7 - .2
NNP .2 3.3 4.1 - .2
RB 2.2 2.0 .5 -
VBD .3 .5 - 4.4
VBN 2.8 2.6 -
Most confusing: NN vs. NNP vs. JJ; VBD vs. VBN
vs. JJ
75Readings
- JM Chapters 1, 2, 3, 8
- What is Computational Linguistics by Hans
Uszkoreit http://www.coli.uni-sb.de/hansu/what_is_cl.html
- Lecture notes 1