Linguistically Rich Statistical Models of Language
Transcript and Presenter's Notes


1
Linguistically Rich Statistical Models of
Language
  • Joseph Smarr
  • M.S. Candidate
  • Symbolic Systems Program
  • Advisor: Christopher D. Manning
  • December 5th, 2002

2
Grand Vision
  • Talk to your computer like another human
  • HAL, Star Trek, etc.
  • Ask your computer a question, it finds the answer
  • Who's speaking at this week's SymSys Forum?
  • Computer can read and summarize text for you
  • What's the cutting edge in NLP these days?

3
We're Not There (Yet)
  • Turns out behaving intelligently is difficult
  • What does it take to achieve the grand vision?
  • General Artificial Intelligence problems
  • Knowledge representation, common sense reasoning,
    etc.
  • Language-specific problems
  • Complexity, ambiguity, and flexibility of
    language
  • Always underestimated because language is so
    easy for us!

4
Are There Useful Sub-Goals?
  • Grand vision is still too hard, but we can solve
    simpler problems that are still valuable
  • Filter news for stories about new tech gadgets
  • Take the SSP talk email and add it to my calendar
  • Dial my cell phone by speaking my friend's name
  • Automatically reply to customer service e-mails
  • Find out which episode of The Simpsons is tonight
  • Two approaches to understanding language
  • Theory-driven: Theoretical Linguistics
  • Task-driven: Natural Language Processing

5
Theoretical Linguistics vs. NLP
  • Theoretical Linguistics
  • Goal
  • Understand people's Knowledge of language
  • Method
  • Rich logical representations of language's hidden
    structure and meaning
  • Guiding principles
  • Separation of (hidden) knowledge of language and
    (observable) performance
  • Grammaticality is categorical (all or none)
  • Describe what are possible and impossible
    utterances
  • Natural Language Processing
  • Goal
  • Develop practical tools for analyzing speech /
    text
  • Method
  • Simple, robust models of everyday language use
    that are sufficient to perform tasks
  • Guiding principles
  • Exploit (empirical) regularities and patterns in
    examples of language in text collections
  • Sentence goodness is gradient (better or worse)
  • Deal with the utterances you're given, good or bad

6
Theoretical Linguistics vs. NLP
(Side-by-side comparison table of Linguistics and NLP; details not preserved in the transcript)
7
Linguistic Puzzle
  • When dropping an argument, why do some verbs keep
    the subject and some keep the object?
  • John sang the song → John sang
  • John broke the vase → The vase broke
  • Not just quirkiness of language
  • Similar patterns show up in other languages
  • Seems to involve deep aspects of verb meaning
  • Rules to account for this phenomenon
  • Two classes of verbs (unergative vs. unaccusative)
  • Remaining argument must be realized as subject

8
Exception: Imperatives
  • "Open the pod bay doors, HAL"
  • Different goals lead to study of different
    problems. In NLP...
  • Need to recognize this as a command
  • Need to figure out what specific action to take
  • Irrelevant how you'd say it in French
  • Describing language vs. working with language
  • But both tasks clearly share many sub-problems

9
Theoretical Linguistics vs. NLP
  • Potential for much synergy between linguistics
    and NLP
  • However, historically they have remained quite
    distinct
  • Chomsky (founder of generative grammar)
  • "It must be recognized that the notion
    'probability of a sentence' is an entirely
    useless one, under any known interpretation of
    this term."
  • Karttunen (founder of finite state technologies
    at Xerox)
  • Linguists' reaction to NLP: "Not interested. You
    do not understand Theory. Go away, you geek."
  • Jelinek (former head of IBM speech project)
  • "Every time I fire a linguist, the performance of
    our speech recognition system goes up."

10
Potential Synergies
  • Lexical acquisition (unknown words)
  • Statistically infer new lexical entries from
    context
  • Modeling naturalness and conventionality
  • Use corpus data to weight constructions
  • Dealing with ungrammatical utterances
  • Find most similar / most likely correction
  • Richer patterns for finding information in text
  • Use argument structure / semantic dependencies
  • More powerful models for speech recognition
  • Progressively build parse tree while listening

11
Finding Information in Text
  • US Government has sponsored lots of research in
    information extraction from news articles
  • Find mentions of terrorists and which locations
    they're targeting
  • Find which companies are being acquired by which
    others and for how much
  • Progress driven by simplifying the models used
  • Early work used rich linguistic parsers
  • Unable to robustly handle natural text
  • Modern work is mainly finite-state patterns
  • Regular expressions are very practical and
    successful (see the sketch below)
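As a rough illustration of the finite-state approach, here is a minimal, hand-written acquisition pattern in Python. The pattern, the group names (purchaser, acquired, amount), and the use of this example sentence are purely illustrative and not taken from any actual extraction system.

  import re

  # Toy finite-state pattern for acquisition events, in the spirit of the
  # regex-based IE systems described above (illustrative only).
  ACQ_PATTERN = re.compile(
      r"(?P<purchaser>(?:[A-Z]\w+\s)+(?:Corp|Inc|Co|Ltd))\.?\s+"
      r"(?:will\s+)?acquires?\s+"
      r"(?P<acquired>(?:[A-Z]\w+\s)+(?:Corp|Inc|Co|Ltd))\.?"
      r"(?:\s+for\s+(?P<amount>[\w\s]+?(?:dollars|euros)))?"
  )

  sentence = "First Union Corp will acquire Sheland Bank Inc for three million dollars"
  match = ACQ_PATTERN.search(sentence)
  if match:
      print(match.groupdict())
      # {'purchaser': 'First Union Corp', 'acquired': 'Sheland Bank Inc',
      #  'amount': 'three million dollars'}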

12
Web Information Extraction
  • How much does that textbook cost on Amazon?
  • Learn patterns for finding relevant fields (see
    the sketch after the example below)

Concept: Book
Title: Foundations of Statistical Natural Language Processing
Author(s): Christopher D. Manning, Hinrich Schütze
Price: $58.45
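A minimal sketch of such field-finding patterns, assuming a hypothetical, simplified product-page snippet (real Amazon markup differs and changes; the class names used here are invented):

  import re

  # Hypothetical, simplified product-page markup (not Amazon's real HTML).
  html = """
  <span class="title">Foundations of Statistical Natural Language Processing</span>
  <span class="author">Christopher D. Manning</span>
  <span class="author">Hinrich Schütze</span>
  <span class="price">$58.45</span>
  """

  # Wrapper-style patterns keyed on the surrounding markup rather than on grammar.
  fields = {
      "Title":  re.findall(r'<span class="title">(.*?)</span>', html),
      "Author": re.findall(r'<span class="author">(.*?)</span>', html),
      "Price":  re.findall(r'<span class="price">\$([\d.]+)</span>', html),
  }
  print(fields)
  # {'Title': ['Foundations of Statistical Natural Language Processing'],
  #  'Author': ['Christopher D. Manning', 'Hinrich Schütze'],
  #  'Price': ['58.45']}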
13
Improving IE Performance on Natural Text
Documents
  • How can we scale IE back up for natural text?
  • Need to look elsewhere for regularities to
    exploit
  • Idea: Consider grammatical structure
  • Run a shallow parser on each sentence
  • Flatten the output into a sequence of typed chunks
    (see the sketch below)
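A minimal sketch of that pipeline, using NLTK's RegexpParser as a stand-in shallow parser. The sentence is hand-POS-tagged to keep the example self-contained, and the chunk grammar is a toy, not the one used in the actual work.

  import nltk

  # Hand-POS-tagged sentence (in practice the tags come from an automatic tagger).
  tagged = [("First", "NNP"), ("Union", "NNP"), ("Corp", "NNP"),
            ("will", "MD"), ("acquire", "VB"),
            ("Sheland", "NNP"), ("Bank", "NNP"), ("Inc", "NNP"),
            ("for", "IN"),
            ("three", "CD"), ("million", "CD"), ("dollars", "NNS")]

  # Toy chunk grammar: proper-noun groups and number phrases become NP chunks.
  grammar = r"""
    NP: {<NNP>+}        # proper-noun chunks
        {<CD>+<NNS>?}   # number phrases like "three million dollars"
  """
  tree = nltk.RegexpParser(grammar).parse(tagged)

  # Flatten the shallow parse into a flat sequence of typed chunks.
  chunks = []
  for node in tree:
      if isinstance(node, nltk.Tree):
          chunks.append((node.label(), " ".join(w for w, t in node.leaves())))
      else:
          word, tag = node
          chunks.append((tag, word))
  print(chunks)
  # [('NP', 'First Union Corp'), ('MD', 'will'), ('VB', 'acquire'),
  #  ('NP', 'Sheland Bank Inc'), ('IN', 'for'), ('NP', 'three million dollars')]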

14
Power of Linguistic Features
(Chart: performance increases of 21%, 65%, and 45% from adding linguistic features)
15
Linguistically Rich(er) IE
  • Exploit more grammatical structure for patterns
  • e.g. Tim Grow's work on IE with PCFGs

(Parse tree for "First Union Corp will acquire Sheland Bank Inc for three million dollars", with constituents annotated with semantic roles: pur (purchaser) = First Union Corp, acq (acquired) = Sheland Bank Inc, amt (amount) = three million dollars)
16
Classifying Unknown Words
  • Which of the following is the name of a city?
  • Most linguistic grammars assume a fixed lexicon
  • How do humans learn to deal with new words?
  • Context ("I spent a summer living in
    Wethersfield")
  • Makeup of the word itself (phonesthetics)
  • Idea: Learn distinguishing letter sequences (see
    the sketch below)
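A minimal sketch of the letter-sequence idea: score an unknown word under per-class character n-gram counts and pick the best class. The tiny training lists and class names below are invented purely for illustration; the actual model is richer (see the next slides).

  import math
  from collections import Counter

  def char_ngrams(word, n=3):
      """Character n-grams with boundary markers, e.g. '^^w', ..., 'd$$'."""
      padded = "^" * (n - 1) + word.lower() + "$" * (n - 1)
      return [padded[i:i + n] for i in range(len(padded) - n + 1)]

  def train(examples):
      """Count character n-grams per class from (word, class) pairs."""
      counts = {}
      for word, label in examples:
          counts.setdefault(label, Counter()).update(char_ngrams(word))
      return counts

  def classify(word, counts, alpha=0.5):
      """Pick the class whose letter sequences best explain the word (add-alpha smoothing)."""
      best, best_score = None, float("-inf")
      for label, ngram_counts in counts.items():
          total = sum(ngram_counts.values())
          vocab = len(ngram_counts) + 1
          score = sum(math.log((ngram_counts[g] + alpha) / (total + alpha * vocab))
                      for g in char_ngrams(word))
          if score > best_score:
              best, best_score = label, score
      return best

  # Toy training data, invented for illustration.
  examples = [("Springfield", "city"), ("Hartford", "city"), ("Wethersfield", "city"),
              ("ibuprofen", "drug"), ("naproxen", "drug"), ("acetaminophen", "drug")]
  print(classify("Brookfield", train(examples)))   # "city": the "-field" n-grams are diagnostic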

17
What's in a Name?
18
Generative Model of PNPs
Length n-gram model and word model:
  P(pnp | c) = P_n-gram(word-lengths(pnp)) × Π_{w_i in pnp} P(w_i | word-length(w_i))
Word model: a mixture of a character n-gram model and a common-word model:
  P(w_i | len) = λ_len · P_n-gram(w_i | len)^(k/len) + (1 - λ_len) · P_word(w_i | len)
N-gram models use deleted interpolation, bottoming out in a uniform distribution:
  P_0-gram(symbol | history) = uniform distribution
  P_n-gram(s | h) = λ_C(h) · P_empirical(s | h) + (1 - λ_C(h)) · P_(n-1)-gram(s | h)
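A small sketch of the deleted-interpolation recurrence above, applied to character n-grams over words. The fixed interpolation weight and alphabet size are simplifications for illustration; in the model above the weight λ_C(h) depends on how often the history h was observed.

  from collections import Counter, defaultdict

  def train_counts(words, max_n=3):
      """Count (history, next-character) pairs for all history lengths up to max_n - 1."""
      counts = defaultdict(Counter)
      for w in words:
          padded = "^" * (max_n - 1) + w.lower() + "$"
          for i in range(max_n - 1, len(padded)):
              for k in range(max_n):                 # history lengths 0 .. max_n - 1
                  counts[padded[i - k:i]][padded[i]] += 1
      return counts

  def interp_prob(symbol, history, counts, n, lam=0.8, alphabet_size=28):
      """P_n(s | h) = lam * P_empirical(s | h) + (1 - lam) * P_(n-1)(s | h),
      bottoming out in a uniform 0-gram distribution."""
      if n == 0:
          return 1.0 / alphabet_size
      h = history[-(n - 1):] if n > 1 else ""
      ctx = counts.get(h, Counter())
      total = sum(ctx.values())
      empirical = ctx[symbol] / total if total else 0.0
      # The model above conditions the weight on C(h); a fixed lam is used here
      # purely for illustration.
      weight = lam if total else 0.0
      return weight * empirical + (1 - weight) * interp_prob(
          symbol, history, counts, n - 1, lam, alphabet_size)

  counts = train_counts(["springfield", "hartford", "wethersfield", "brookfield"])
  print(interp_prob("d", "el", counts, n=3))   # P(next char = 'd' | history 'el')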
19
Experimental Results
20
Knowledge of Frequencies
  • Linguistics traditionally assumes Knowledge of
    Language doesn't involve counting
  • Letter frequencies are clearly an important
    source of knowledge for unknown words
  • Similarly, we saw before that there are regular
    patterns to exploit in grammatical information
  • Take-home point:
  • Combining Statistical NLP methods with richer
    linguistic representations is a big win!

21
Language is Ambiguous!
  • "Ban on Nude Dancing on Governor's Desk" (from a
    Georgia newspaper column discussing current
    legislation)
  • "Lebanese chief limits access to private parts"
    (talking about an Army General's initiative)
  • "Death may ease tension" (an article about the
    death of Colonel Jean-Claude Paul in Haiti)
  • Iraqi Head Seeks Arms
  • Juvenile Court to Try Shooting Defendant
  • Teacher Strikes Idle Kids
  • Stolen Painting Found By Tree

22
Language is Ambiguous!
  • Local HS Dropouts Cut in Half
  • Obesity Study Looks for Larger Test Group
  • British Left Waffles on Falkland Islands
  • Red Tape Holds Up New Bridges
  • Man Struck by Lightning Faces Battery Charge
  • Clinton Wins on Budget, but More Lies Ahead
  • Hospitals Are Sued by 7 Foot Doctors
  • Kids Make Nutritious Snacks

23
Coping With Ambiguity
  • Categorical grammars like HPSG provide many
    possible analyses for sentences
  • 455 parses for "List the sales of the products
    produced in 1973 with the products produced in
    1972." (Martin et al., 1987)
  • In most cases, only one interpretation is
    intended
  • Initial solution was hand-coded preferences among
    rules
  • Hard to manage as the number of rules increases
  • Need to capture interactions among rules

24
Statistical HPSG Parse Selection
  • HPSG provides deep analyses of sentence structure
    and meaning
  • Useful for NLP tasks like question answering
  • Need to solve disambiguation problem to make
    using these richer representations practical
  • Idea: Learn statistical preferences among
    constructions from a hand-disambiguated collection
    of sentences (see the sketch below)
  • Result: Correct analysis chosen >80% of the time
  • StatNLP methods + Linguistic representations = Win
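To make "learn statistical preferences from hand-disambiguated sentences" concrete, here is a minimal sketch of a parse-selection ranker. It uses a simple perceptron-style update over invented parse features as a stand-in; the actual work uses richer statistical models over HPSG analyses, so treat this only as an illustration of the training signal.

  from collections import defaultdict

  def score(weights, features):
      """Dot product of feature counts with learned weights."""
      return sum(weights[f] * v for f, v in features.items())

  def train_ranker(treebank, epochs=10):
      """Each training example is (gold_features, candidate_feature_dicts),
      where the gold analysis was chosen by a human annotator."""
      weights = defaultdict(float)
      for _ in range(epochs):
          for gold, candidates in treebank:
              best = max(candidates, key=lambda c: score(weights, c))
              if best is not gold:              # reward gold features, penalize the mistake
                  for f, v in gold.items():
                      weights[f] += v
                  for f, v in best.items():
                      weights[f] -= v
      return weights

  # Toy example with invented features: two analyses differing in PP attachment.
  wrong = {"PP-attach=VP": 1, "rule=VP->V NP PP": 1}
  gold = {"PP-attach=NP": 1, "rule=VP->V NP": 1}
  weights = train_ranker([(gold, [wrong, gold])])
  print(max([wrong, gold], key=lambda c: score(weights, c)) is gold)   # True after training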

25
Towards Semantic Extraction
  • HPSG provides representation of meaning
  • Who did what to whom?
  • Computers need meaning to do inference
  • Can we extend information extraction methods to
    extract meaning representations from pages?
  • Current project: IE for the Semantic Web
  • Large project to build rich ontologies to
    describe the content of web pages for intelligent
    agents
  • Use IE to extract new instances of concepts from
    web pages (as opposed to manual labeling)
  • student(Joseph), univ(Stanford), at(Joseph,
    Stanford)

26
Towards the Grand Vision?
  • Collaboration between Theoretical Linguistics and
    NLP is an important step forward
  • Practical tools with sophisticated language power
  • How can we ever teach computers enough about
    language and the world?
  • Hawking: Moore's Law is sufficient
  • Moravec: mobile robots must learn like children
  • Kurzweil: reverse-engineer the human brain
  • The experts agree: Symbolic Systems is the
    future!

27
Upcoming Convergence Courses
  • Ling 139M: Machine Translation (Winter)
  • Ling 239E: Grammar Engineering (Winter)
  • CS 276B: Text Information Retrieval (Winter)
  • Ling 239A: Parsing and Generation (Spring)
  • CS 224N: Natural Language Processing (Spring)

Get Involved!!