Title: CPSC 503 Computational Linguistics
1 CPSC 503 Computational Linguistics
- Lecture 7
- Giuseppe Carenini
2 Knowledge-Formalisms Map (next three lectures)
State Machines (and prob. versions) (Finite State Automata, Finite State Transducers, Markov Models)
Morphology
Syntax
Rule systems (and prob. versions) (e.g., (Prob.) Context-Free Grammars)
Semantics
- Logical formalisms
- (First-Order Logics)
Pragmatics, Discourse and Dialogue
AI planners
3 Today 2/10
- English Syntax
- Context-Free Grammar for English
- Rules
- Trees
- Recursion
- Problems
- Start Parsing
4 Syntactic Notions so far...
- N-grams: prob. distr. for next word can be effectively approximated knowing previous n words
- POS categories are based on
- distributional properties (what other words can occur nearby)
- morphological properties (affixes they take)
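The n-gram idea can be sketched with bigram counts, i.e., conditioning on a single previous word. The toy corpus and the name `next_word_distribution` are assumptions made for illustration; they are not part of the lecture.

```python
from collections import Counter, defaultdict

# Toy corpus (an assumption for illustration only)
corpus = "i prefer a morning flight . i prefer a cheap flight .".split()

# Count how often each word follows each previous word
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_distribution(prev):
    """Maximum-likelihood estimate of P(next word | previous word)."""
    counts = bigram_counts[prev]
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

print(next_word_distribution("a"))
print(next_word_distribution("prefer"))
```

An unseen previous word simply yields an empty distribution here; a real model would need smoothing.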
5 Syntax
- Def.: The study of how sentences are formed by grouping and ordering words
Example: Ming and Sue prefer morning flights
* Ming Sue flights morning and prefer
Groups behave as a single unit with respect to substitution, movement, and coordination
6 Syntax: Useful tasks
- Why should you care?
- Grammar checkers
- Basis for semantic interpretation
- Question answering
- Information extraction
- Summarization
- Machine translation
7 Key Constituents with heads (English)
General pattern: (Specifier) X (Complement)
- Noun phrases: (Det) N (PP)
- Verb phrases: (Qual) V (NP)
- Prepositional phrases: (Deg) P (NP)
- Adjective phrases: (Deg) A (PP)
- Sentences: (NP) (I) (VP)
Some simple specifiers
Category       Typical function       Examples
Determiner     specifier of N         the, a, this, no..
Qualifier      specifier of V         never, often..
Degree word    specifier of A or P    very, almost..
Complements?
8 Key Constituents: Examples
- Noun phrases: (Det) N (PP), e.g., the cat on the table
- Verb phrases: (Qual) V (NP), e.g., never eat a cat
- Prepositional phrases: (Deg) P (NP), e.g., almost in the net
- Adjective phrases: (Deg) A (PP), e.g., very happy about it
- Sentences: (NP) (I) (VP), e.g., a mouse -- ate it
9 Context Free Grammar (Example)
- S -> NP VP
- NP -> Det NOMINAL
- NOMINAL -> Noun
- VP -> Verb
- Det -> a
- Noun -> flight
- Verb -> left
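As a minimal sketch, the example grammar can be written as a Python dictionary and used generatively. The dictionary layout and the function name `generate` are assumptions made for illustration.

```python
# The example grammar, as a dict from non-terminal to its right-hand sides
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "NOMINAL"]],
    "NOMINAL": [["Noun"]],
    "VP": [["Verb"]],
    "Det": [["a"]],
    "Noun": [["flight"]],
    "Verb": [["left"]],
}

def generate(symbols):
    """Yield every terminal string derivable from a list of symbols.

    Only safe for non-recursive grammars like this one; a recursive
    grammar would make the enumeration infinite.
    """
    if not symbols:
        yield []
        return
    first, rest = symbols[0], symbols[1:]
    if first not in GRAMMAR:          # terminal: keep it and continue
        for tail in generate(rest):
            yield [first] + tail
    else:                             # non-terminal: try each production
        for rhs in GRAMMAR[first]:
            yield from generate(rhs + rest)

sentences = [" ".join(words) for words in generate(["S"])]
print(sentences)
```

With this grammar the language contains a single sentence, "a flight left".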
10 CFG: more complex Example
- Grammar with example phrases
11 CFGs
- Define a Formal Language (un/grammatical sentences)
- Generative Formalism
- Generate strings in the language
- Reject strings not in the language
- Impose structures (trees) on strings in the language
12 CFG: Formal Definitions
- 4-tuple (non-term., term., productions, start)
- $(N, \Sigma, P, S)$
- $P$ is a set of rules $A \to \alpha$, with $A \in N$ and $\alpha \in (\Sigma \cup N)^*$
- A derivation is the process of rewriting $\alpha_1$ into $\alpha_m$ (both strings in $(\Sigma \cup N)^*$) by applying a sequence of rules: $\alpha_1 \Rightarrow^* \alpha_m$
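A derivation can be sketched as repeated rewriting of the leftmost non-terminal. The toy rules below and the helper name `leftmost_derivation` are assumptions made for illustration; only the 4-tuple structure comes from the definition.

```python
# Grammar as a 4-tuple (N, Sigma, P, S); the concrete rules are a toy example
N = {"S", "NP", "VP", "Det", "Noun", "Verb"}
SIGMA = {"a", "flight", "left"}
P = [
    ("S", ("NP", "VP")),      # rule 0
    ("NP", ("Det", "Noun")),  # rule 1
    ("VP", ("Verb",)),        # rule 2
    ("Det", ("a",)),          # rule 3
    ("Noun", ("flight",)),    # rule 4
    ("Verb", ("left",)),      # rule 5
]
START = "S"

def leftmost_derivation(rule_indices):
    """Apply the chosen rules to the leftmost non-terminal, recording each string."""
    form = (START,)
    steps = [form]
    for i in rule_indices:
        lhs, rhs = P[i]
        pos = next(k for k, sym in enumerate(form) if sym in N)
        if form[pos] != lhs:
            raise ValueError(f"rule {i} does not rewrite {form[pos]}")
        form = form[:pos] + rhs + form[pos + 1:]
        steps.append(form)
    return steps

steps = leftmost_derivation([0, 1, 3, 4, 2, 5])
for form in steps:
    print(" ".join(form))
```

Each printed line is one string in the derivation, from the start symbol down to a string of terminals only.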
13 Derivations as Trees
Context Free?
14 CFG Parsing
- It is completely analogous to running a finite-state transducer with a tape
- It's just more powerful
- Chpt. 13
15 Other Options
- Regular languages (FSA): $A \to xB$ or $A \to x$
- Too weak (e.g., cannot deal with recursion in a general way: no center-embedding)
- CFGs: $A \to \gamma$ (also produce more understandable and useful structure)
- Context-sensitive: $\alpha A \beta \to \alpha \gamma \beta$, $\gamma \neq \epsilon$
- Can be computationally intractable
- Turing equiv.: $\alpha \to \beta$, $\alpha \neq \epsilon$
- Too powerful / Computationally intractable
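Center-embedding is often abstracted as the language a^n b^n, which the context-free rules S -> a S b | a b generate but no regular grammar can. The sketch below mirrors those two rules directly; the abstraction to the symbols a and b and the name `matches_S` are assumptions made for illustration.

```python
def matches_S(tokens):
    """Recognize the language of S -> a S b | a b (i.e., a^n b^n, n >= 1)."""
    if len(tokens) < 2 or tokens[0] != "a" or tokens[-1] != "b":
        return False
    if len(tokens) == 2:             # S -> a b
        return True
    return matches_S(tokens[1:-1])   # S -> a S b

print(matches_S(list("aaabbb")))
print(matches_S(list("aabbb")))
```

The recursion depth grows with n, which is exactly the unbounded memory a finite-state automaton lacks.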
16 Common Sentence-Types
- Declaratives: A plane left
- S -> NP VP
- Imperatives: Leave!
- S -> VP
- Yes-No Questions: Did the plane leave?
- S -> Aux NP VP
- WH Questions
- Which flights serve breakfast?
- S -> WH NP VP
- When did the plane leave?
- S -> WH Aux NP VP
17 NP: more details
- NP -> Specifiers N Complements
- NP -> (Predet) (Det) (Card) (Ord) (Quant) (AP) Nom
- e.g., all the other cheap cars
- Nom -> Nom PP (PP) (PP)
- e.g., reservation on BA456 from NY to YVR
- Nom -> Nom GerundVP
- e.g., flight arriving on Monday
- Nom -> Nom RelClause
- RelClause -> (who | that) VP
- e.g., flight that arrives in the evening
18 Conjunctive Constructions
- S -> S and S
- John went to NY and Mary followed him
- NP -> NP and NP
- John went to NY and Boston
- VP -> VP and VP
- John went to NY and visited MOMA
- In fact the right rule for English is
- X -> X and X
19 Problems with CFGs
- Agreement
- Subcategorization
20 Agreement
- In English,
- Determiners and nouns have to agree in number
- Subjects and verbs have to agree in person and
number
- Many languages have agreement systems that are
far more complex than this (e.g., gender).
21 Agreement
Grammatical:
- This dog
- Those dogs
- This dog eats
- You have it
- Those dogs eat
Ungrammatical:
- *This dogs
- *Those dog
- *This dog eat
- *You has it
- *Those dogs eats
22 Possible CFG Solution
OLD Grammar
- S -> NP VP
- NP -> Det Nominal
- VP -> V NP
NEW Grammar
- SgS -> SgNP SgVP
- PlS -> PlNP PlVP
- SgNP -> SgDet SgNom
- PlNP -> PlDet PlNom
- PlVP -> PlV NP
- SgVP -> SgV NP
Sg = singular, Pl = plural
23 CFG Solution for Agreement
- It works and stays within the power of CFGs
- But it doesn't scale all that well (explosion in the number of rules)
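The rule explosion can be made concrete with a small sketch that copies a rule once per feature value. The helper name `specialize` and the person values are assumptions made for illustration; only the Sg/Pl prefixes come from the lecture.

```python
from itertools import product

NUMBER = ["Sg", "Pl"]      # values used in the lecture
PERSON = ["1", "2", "3"]   # person values: an added assumption

def specialize(lhs, rhs, agreeing, values):
    """Copy one rule per feature value, prefixing the categories that must agree."""
    def mark(cat, value):
        return value + cat if cat in agreeing else cat
    return [(mark(lhs, v), [mark(c, v) for c in rhs]) for v in values]

# S -> NP VP: subject and verb phrase must agree
s_rules = specialize("S", ["NP", "VP"], {"S", "NP", "VP"}, NUMBER)
# VP -> V NP: the object NP is not constrained by the verb's number
vp_rules = specialize("VP", ["V", "NP"], {"VP", "V"}, NUMBER)

# Tracking person as well multiplies the copies again (2 x 3 = 6 per rule)
number_person = [n + p for n, p in product(NUMBER, PERSON)]
s_rules_np = specialize("S", ["NP", "VP"], {"S", "NP", "VP"}, number_person)

for lhs, rhs in s_rules + vp_rules:
    print(lhs, "->", " ".join(rhs))
print(len(s_rules_np), "copies of S -> NP VP with number and person")
```

Every additional agreement feature multiplies the number of copies of each affected rule, which is why the approach does not scale.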
24 Subcategorization
- Def.: It expresses constraints that a predicate (verb here) places on the number and type of its arguments (see first table)
- *John sneezed the book
- *I prefer United has a flight
- *Give with a flight
25 Subcategorization
- Sneeze: John sneezed
- Find: Please find [a flight to NY]NP
- Give: Give [me]NP [a cheaper fare]NP
- Help: Can you help [me]NP [with a flight]PP
- Prefer: I prefer [to leave earlier]TO-VP
- Told: I was told [United has a flight]S
26 So?
- So the various rules for VPs overgenerate.
- They allow strings containing verbs and arguments that don't go together
- For example
- VP -> V NP, therefore: *Sneezed the book
- VP -> V S, therefore: *go she will go there
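Overgeneration can be illustrated by contrasting a plain VP -> V NP rule with a lexicon of subcategorization frames. The frames follow the examples from the lecture; the dictionary layout and the name `licensed` are assumptions made for this sketch.

```python
# Subcategorization frames: each verb lists the complement sequences it allows.
# Frames follow the lecture's examples; the data layout is an assumption.
SUBCAT = {
    "sneeze": [[]],
    "find": [["NP"]],
    "give": [["NP", "NP"]],
    "help": [["NP", "PP"]],
    "prefer": [["TO-VP"]],
    "told": [["S"]],
}

def licensed(verb, complements):
    """True if the verb allows this exact sequence of complement categories."""
    return list(complements) in SUBCAT.get(verb, [])

# A plain rule VP -> V NP accepts any verb followed by an NP;
# the frames accept "find NP" but reject "sneeze NP".
print(licensed("find", ["NP"]))
print(licensed("sneeze", ["NP"]))
```

A grammar that consults such frames rejects "sneezed the book", whereas the unconstrained VP rules accept it.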
27 Possible CFG Solution
OLD Grammar
- VP -> V
- VP -> V NP
- VP -> V NP PP
NEW Grammar
- VP -> IntransV
- VP -> TransV NP
- VP -> TransPPto NP PPto
- TransPPto -> hand, give, ..
This solution has the same problem as the one for agreement
28 CFG for NLP: summary
- CFGs cover most syntactic structure in English.
- But there are problems (overgeneration)
- That can be dealt with adequately, although not elegantly, by staying within the CFG framework.
- There are simpler, more elegant, solutions that take us out of the CFG framework
- Chpt 16 Features and Unification
29 Dependency Grammars
- Syntactic structure: binary relations between words
- Links: grammatical function or very general semantic relation
- Abstract away from word-order variations (simpler grammars)
- Useful features in many NLP applications (for classification, summarization and NLG)
30 Today 2/10
- English Syntax
- Context-Free Grammar for English
- Rules
- Trees
- Recursion
- Problems
- Start Parsing
31 Parsing with CFGs
Input sentence (e.g., I prefer a morning flight) and CFG -> Parser
- Assign valid trees: a valid tree covers all and only the elements of the input and has an S at the top
32 Parsing as Search
- Search space of possible parse trees
- S -> NP VP
- S -> Aux NP VP
- NP -> Det Noun
- VP -> Verb
- Det -> a
- Noun -> flight
- Verb -> left, arrive
- Aux -> do, does
- Parsing: find all trees that cover all and only the words in the input
33 Constraints on Search
Input sentence (e.g., I prefer a morning flight) and CFG (search space) -> Parser
- Search Strategies
- Top-down or goal-directed
- Bottom-up or data-directed
34 Top-Down Parsing
- Since we're trying to find trees rooted with an S (Sentences), start with the rules that give us an S.
- Then work your way down from there to the words.
35 Next step: Top-Down Space
- When POS categories are reached, reject trees whose leaves fail to match all words in the input
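Top-down search can be sketched as a small recognizer over the grammar from the Parsing as Search slide. The split into `GRAMMAR` and `LEXICON` and the function names are assumptions made for illustration; the sketch would loop forever on a left-recursive rule.

```python
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb"]],
}
LEXICON = {  # POS category -> words
    "Det": {"a"},
    "Noun": {"flight"},
    "Verb": {"left", "arrive"},
    "Aux": {"do", "does"},
}

def expand(symbols, words):
    """Top-down, depth-first, left-to-right: yield True for each successful parse."""
    if not symbols:
        if not words:          # all categories expanded and all words consumed
            yield True
        return
    first, rest = symbols[0], symbols[1:]
    if first in LEXICON:       # POS category reached: must match the next word
        if words and words[0] in LEXICON[first]:
            yield from expand(rest, words[1:])
    else:                      # non-terminal: try its rules in order
        for rhs in GRAMMAR[first]:
            yield from expand(rhs + rest, words)

def recognize(sentence):
    return any(expand(["S"], sentence.split()))

print(recognize("a flight left"))
print(recognize("does a flight arrive"))
print(recognize("flight a left"))
```

Hypotheses are only checked against the words once a POS category is reached, so rules such as S -> Aux NP VP are tried even when the input starts with a determiner.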
36 Bottom-Up Parsing
- Of course, we also want trees that cover the input words. So start with trees that link up with the words in the right way.
- Then work your way up from there.
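Bottom-up search can be sketched as repeatedly replacing the right-hand side of a rule by its left-hand side until only S remains. The rules are those from the Parsing as Search slide; the breadth-first strategy and the name `bottom_up_recognize` are assumptions made for illustration.

```python
from collections import deque

RULES = [  # (left-hand side, right-hand side)
    ("S", ("NP", "VP")),
    ("S", ("Aux", "NP", "VP")),
    ("NP", ("Det", "Noun")),
    ("VP", ("Verb",)),
    ("Det", ("a",)),
    ("Noun", ("flight",)),
    ("Verb", ("left",)),
    ("Verb", ("arrive",)),
    ("Aux", ("do",)),
    ("Aux", ("does",)),
]

def bottom_up_recognize(sentence):
    """Breadth-first search over reductions, starting from the input words."""
    start = tuple(sentence.split())
    queue, seen = deque([start]), {start}
    while queue:
        form = queue.popleft()
        if form == ("S",):               # the whole input reduced to S
            return True
        for lhs, rhs in RULES:
            n = len(rhs)
            for i in range(len(form) - n + 1):
                if form[i:i + n] == rhs:
                    reduced = form[:i] + (lhs,) + form[i + n:]
                    if reduced not in seen:
                        seen.add(reduced)
                        queue.append(reduced)
    return False

print(bottom_up_recognize("a flight left"))
print(bottom_up_recognize("left"))
```

The second call builds Verb and VP from the word "left" even though no S can ever be completed from them, which is the kind of wasted work a purely data-directed search does.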
37 Two more steps: Bottom-Up Space
38 Top-Down vs. Bottom-Up
- Top-down
- Only searches for trees that can be answers
- But suggests trees that are not consistent with the words
- Bottom-up
- Only forms trees consistent with the words
- But suggests trees that make no sense globally
39 So Combine Them
- Top-down control strategy to generate trees
- Bottom-up to filter out inappropriate parses
- Top-down Control strategy
- Depth vs. Breadth first
- Which node to try to expand next
- Which grammar rule to use to expand a node
40 Top-Down, Depth-First, Left-to-Right Search
Sample sentence: Does this flight include a meal?
41 Example
Does this flight include a meal?
42 Example
Does this flight include a meal?
43 Example
Does this flight include a meal?
44 Adding Bottom-up Filtering
- The following sequence was a waste of time because an NP cannot generate a parse tree starting with an Aux
45 Bottom-Up Filtering
Category    Left Corners
S           Det, Proper-Noun, Aux, Verb
NP          Det, Proper-Noun
Nominal     Noun
VP          Verb
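A left-corner table like the one above can be computed from a grammar by iterating to a fixed point. The grammar below is an assumption: it is consistent with the table, but the slides do not list the rules that produced it. The function names are also illustrative.

```python
# Assumed grammar, chosen to be consistent with the left-corner table
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"], ["Proper-Noun"]],
    "Nominal": [["Noun"], ["Noun", "Nominal"]],
    "VP": [["Verb"], ["Verb", "NP"]],
}

def left_corners(grammar):
    """Map each non-terminal to the POS categories that can start its expansions."""
    table = {cat: set() for cat in grammar}
    changed = True
    while changed:                      # iterate until nothing new is added
        changed = False
        for cat, expansions in grammar.items():
            for rhs in expansions:
                first = rhs[0]
                new = table[first] if first in grammar else {first}
                if not new <= table[cat]:
                    table[cat] |= new
                    changed = True
    return table

TABLE = left_corners(GRAMMAR)

def can_start_with(category, pos):
    """Bottom-up filter: is it worth expanding `category` when the next word has this POS?"""
    return pos in TABLE[category]

print(sorted(TABLE["S"]))
print(can_start_with("NP", "Aux"))
```

A top-down parser can consult `can_start_with` before expanding a node and skip, for instance, any attempt to build an NP when the next word is an Aux.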
46 Problems with TD-BU-filtering
- Left recursion
- Ambiguity
- Repeated Parsing
- SOLUTION: Earley Algorithm
- (once again, dynamic programming!)
47 For Next Time
- Read Chapter 13 (Parsing)
- Optional: Read Chapter 16 (Features and Unification); skip algorithms and implementation