CPSC 503 Computational Linguistics - PowerPoint PPT Presentation

Slides: 48
Provided by: giuseppec

1
CPSC 503 Computational Linguistics
  • Lecture 7
  • Giuseppe Carenini

2
Knowledge-Formalisms Map (next three lectures)
  • Formalisms
  • State Machines (and prob. versions): Finite State
    Automata, Finite State Transducers, Markov Models
  • Rule systems (and prob. versions), e.g., (Prob.)
    Context-Free Grammars
  • Logical formalisms (First-Order Logics)
  • AI planners
  • Knowledge
  • Morphology
  • Syntax
  • Semantics
  • Pragmatics, Discourse and Dialogue
3
Today 2/10
  • English Syntax
  • Context-Free Grammar for English
  • Rules
  • Trees
  • Recursion
  • Problems
  • Start Parsing

4
Syntactic Notions so far...
  • N-grams: the probability distribution for the next word
    can be effectively approximated knowing the previous n words
  • POS categories are based on:
  • distributional properties (what other words can
    occur nearby)
  • morphological properties (affixes they take)

5
Syntax
  • Def.: The study of how sentences are formed by
    grouping and ordering words

Example: Ming and Sue prefer morning flights
*Ming Sue flights morning and prefer
Groups of words behave as a single unit with respect to
substitution, movement, and coordination
6
Syntax: Useful tasks
  • Why should you care?
  • Grammar checkers
  • Basis for semantic interpretation
  • Question answering
  • Information extraction
  • Summarization
  • Machine translation

7
Key Constituents - heads (English)
General pattern: (Specifier) X (Complement)
  • Noun phrases: (Det) N (PP)
  • Verb phrases: (Qual) V (NP)
  • Prepositional phrases: (Deg) P (NP)
  • Adjective phrases: (Deg) A (PP)
  • Sentences: (NP) (I) (VP)

Some simple specifiers (category: typical function; examples)
  • Determiner: specifier of N; the, a, this, no, ...
  • Qualifier: specifier of V; never, often, ...
  • Degree word: specifier of A or P; very, almost, ...
Complements?
8
Key Constituents: Examples
  • Noun phrases: (Det) N (PP)
  • the cat on the table
  • Verb phrases: (Qual) V (NP)
  • never eat a cat
  • Prepositional phrases: (Deg) P (NP)
  • almost in the net
  • Adjective phrases: (Deg) A (PP)
  • very happy about it
  • Sentences: (NP) (I) (VP)
  • a mouse -- ate it

9
Context Free Grammar (Example)
  • S -> NP VP
  • NP -> Det NOMINAL
  • NOMINAL -> Noun
  • VP -> Verb
  • Det -> a
  • Noun -> flight
  • Verb -> left
  • Non-terminals: S, NP, NOMINAL, VP, Det, Noun, Verb
  • Terminals: a, flight, left
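
The example grammar on this slide is small enough to run by hand, but a short sketch may make the generative reading concrete. It stores the seven rules in a dictionary and enumerates every terminal string derivable from S. The names GRAMMAR and expand are illustrative choices, not from the lecture, and the enumeration only terminates because this grammar has no recursion.

```python
import itertools

# Toy grammar from the slide: each non-terminal maps to a list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "NOMINAL"]],
    "NOMINAL": [["Noun"]],
    "VP": [["Verb"]],
    "Det": [["a"]],
    "Noun": [["flight"]],
    "Verb": [["left"]],
}


def expand(symbol):
    """Return every terminal string (as a list of words) derivable from symbol."""
    if symbol not in GRAMMAR:  # terminal: it yields itself
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Expand each right-hand-side symbol, then combine the pieces in order.
        parts = [expand(s) for s in rhs]
        for combo in itertools.product(*parts):
            results.append([word for part in combo for word in part])
    return results


sentences = [" ".join(words) for words in expand("S")]
print(sentences)  # ['a flight left']
```

With only one rule per non-terminal, the language of this grammar contains exactly one sentence.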

10
CFG: More Complex Example
  • Lexicon
  • Grammar with example phrases

11
CFGs
  • Define a Formal Language (un/grammatical
    sentences)
  • Generative Formalism
  • Generate strings in the language
  • Reject strings not in the language
  • Impose structures (trees) on strings in the
    language

12
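CFG Formal Definitions
  • 4-tuple (non-term., term., productions, start)
  • $(N, \Sigma, P, S)$
  • P is a set of rules $A \to \alpha$, with $A \in N$ and
    $\alpha \in (\Sigma \cup N)^*$
  • A derivation is the process of rewriting $\alpha_1$ into
    $\alpha_m$ (both strings in $(\Sigma \cup N)^*$) by applying a
    sequence of rules: $\alpha_1 \Rightarrow^* \alpha_m$
  • $L_G = \{ w \mid w \in \Sigma^* \text{ and } S \Rightarrow^* w \}$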
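
As a worked instance of these definitions, here is one leftmost derivation using the toy grammar from the Context Free Grammar (Example) slide. Each step rewrites the leftmost non-terminal with a single rule from P; the choice of a leftmost order is only for illustration.

```latex
\begin{align*}
S &\Rightarrow NP\ VP \\
  &\Rightarrow Det\ NOMINAL\ VP \\
  &\Rightarrow \text{a}\ NOMINAL\ VP \\
  &\Rightarrow \text{a}\ Noun\ VP \\
  &\Rightarrow \text{a flight}\ VP \\
  &\Rightarrow \text{a flight}\ Verb \\
  &\Rightarrow \text{a flight left}
\end{align*}
```

Since the final string contains only terminals and is reachable from the start symbol, "a flight left" is in the language of that grammar.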

13
Derivations as Trees
Context Free?
14
CFG Parsing
  • It is completely analogous to running a
    finite-state transducer with a tape
  • It's just more powerful
  • Chapter 13

15
Other Options
  • Regular languages (FSA): $A \to xB$ or $A \to x$
  • Too weak (e.g., cannot deal with recursion in a
    general way: no center-embedding)
  • CFGs: $A \to \alpha$ (also produce more understandable and
    useful structure)
  • Context-sensitive: $\alpha A \beta \to \alpha \gamma \beta$, with $\gamma \neq \epsilon$
  • Can be computationally intractable
  • Turing equiv.: $\alpha \to \beta$, with $\alpha \neq \epsilon$
  • Too powerful / Computationally intractable
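
The center-embedding point can be illustrated with the textbook language of n a's followed by n b's. No finite-state automaton recognizes it for unbounded n, but the two-rule CFG S -> a S b | (empty) does. The sketch below follows those two rules directly. The language and the function name matches_anbn are standard illustrations chosen here, not taken from the slides.

```python
def matches_anbn(symbols):
    """Recognize n a's followed by n b's (n >= 0) via the CFG  S -> a S b | empty."""
    if not symbols:  # S -> empty
        return True
    if len(symbols) >= 2 and symbols[0] == "a" and symbols[-1] == "b":
        return matches_anbn(symbols[1:-1])  # S -> a S b
    return False


print(matches_anbn("aabb"))  # True
print(matches_anbn("aab"))   # False
```

Each recursive call strips one matched outer pair, which is exactly the nested dependency that rules of the form A -> xB cannot track.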

16
Common Sentence-Types
  • Declaratives: A plane left
  • S -> NP VP
  • Imperatives: Leave!
  • S -> VP
  • Yes-No Questions: Did the plane leave?
  • S -> Aux NP VP
  • WH Questions
  • Which flights serve breakfast?
  • S -> WH NP VP
  • When did the plane leave?
  • S -> WH Aux NP VP

17
NP: more details
  • NP -> Specifiers N Complements
  • NP -> (Predet) (Det) (Card) (Ord) (Quant) (AP) Nom
  • e.g., all the other cheap cars
  • Nom -> Nom PP (PP) (PP)
  • e.g., reservation on BA456 from NY to YVR
  • Nom -> Nom GerundVP
  • e.g., flight arriving on Monday
  • Nom -> Nom RelClause
  • RelClause -> (who | that) VP
  • e.g., flight that arrives in the evening
18
Conjunctive Constructions
  • S -> S and S
  • John went to NY and Mary followed him
  • NP -> NP and NP
  • John went to NY and Boston
  • VP -> VP and VP
  • John went to NY and visited MOMA
  • In fact the right rule for English is
  • X -> X and X
19
Problems with CFGs
  • Agreement
  • Subcategorization

20
Agreement
  • In English,
  • Determiners and nouns have to agree in number
  • Subjects and verbs have to agree in person and
    number
  • Many languages have agreement systems that are
    far more complex than this (e.g., gender).

21
Agreement
  • This dog
  • Those dogs
  • This dog eats
  • You have it
  • Those dogs eat
  • *This dogs
  • *Those dog
  • *This dog eat
  • *You has it
  • *Those dogs eats
22
Possible CFG Solution
OLD Grammar
  • S -> NP VP
  • NP -> Det Nominal
  • VP -> V NP
NEW Grammar
  • SgS -> SgNP SgVP
  • PlS -> PlNP PlVP
  • SgNP -> SgDet SgNom
  • PlNP -> PlDet PlNom
  • PlVP -> PlV NP
  • SgVP -> SgV NP

Sg = singular, Pl = plural
23
CFG Solution for Agreement
  • It works and stays within the power of CFGs
  • But it doesn't scale all that well (explosion in
    the number of rules)
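
The rule explosion can be made concrete with a small sketch that copies each base rule once per combination of agreement feature values. To keep it simple, it assumes the intransitive rule VP -> V, so every symbol in a rule carries the same agreement tag. The names BASE_RULES and specialize, and the person values "1", "2", "3", are illustrative assumptions.

```python
from itertools import product

# Simplified base grammar (assumption: intransitive VP, so all symbols agree).
BASE_RULES = [("S", ["NP", "VP"]), ("NP", ["Det", "Nom"]), ("VP", ["V"])]


def specialize(rules, features):
    """Copy every rule once per combination of agreement feature values."""
    combos = list(product(*features.values()))
    specialized = []
    for lhs, rhs in rules:
        for combo in combos:
            tag = "".join(combo)  # e.g. "Sg" or "Sg3"
            specialized.append((tag + lhs, [tag + sym for sym in rhs]))
    return specialized


number_only = specialize(BASE_RULES, {"number": ["Sg", "Pl"]})
number_person = specialize(
    BASE_RULES, {"number": ["Sg", "Pl"], "person": ["1", "2", "3"]}
)
print(len(BASE_RULES), len(number_only), len(number_person))  # 3 6 18
```

Every extra feature multiplies the rule count by its number of values, which is the scaling problem this slide points at.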

24
Subcategorization
  • Def.: It expresses constraints that a predicate
    (verb here) places on the number and type of its
    arguments (see first table)
  • *John sneezed the book
  • *I prefer United has a flight
  • *Give with a flight

25
Subcategorization
  • Sneeze: John sneezed
  • Find: Please find [a flight to NY]NP
  • Give: Give [me]NP [a cheaper fare]NP
  • Help: Can you help [me]NP [with a flight]PP
  • Prefer: I prefer [to leave earlier]TO-VP
  • Told: I was told [United has a flight]S

26
So?
  • So the various rules for VPs overgenerate.
  • They allow strings containing verbs and arguments
    that don't go together
  • For example
  • VP -> V NP, therefore: *Sneezed the book
  • VP -> V S, therefore: *go she will go there
27
Possible CFG Solution
OLD Grammar
  • VP -> V
  • VP -> V NP
  • VP -> V NP PP
NEW Grammar
  • VP -> IntransV
  • VP -> TransV NP
  • VP -> TransPPto NP PPto
  • TransPPto -> hand, give, ...

This solution has the same problem as the one for
agreement
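
The frames on the earlier Subcategorization slide amount to a lookup from verbs to the complement sequences they license, which is what the specialized verb categories in the NEW grammar encode. A minimal sketch follows. The table SUBCAT and the function licensed are hypothetical names, and the table holds only the one frame per verb shown on the slide, although real verbs usually accept several.

```python
# Hypothetical mini-lexicon: one complement frame per verb, as listed on the slide.
SUBCAT = {
    "sneeze": [()],
    "find": [("NP",)],
    "give": [("NP", "NP")],
    "help": [("NP", "PP")],
    "prefer": [("TO-VP",)],
    "told": [("S",)],
}


def licensed(verb, complements):
    """True if the verb's lexical entry allows exactly this complement sequence."""
    return tuple(complements) in SUBCAT.get(verb, [])


print(licensed("sneeze", []))       # True:  John sneezed
print(licensed("sneeze", ["NP"]))   # False: *John sneezed the book
print(licensed("prefer", ["S"]))    # False: *I prefer United has a flight
print(licensed("give", ["NP", "NP"]))  # True: Give me a cheaper fare
```

Pushing this table into the grammar means one verb category per frame, which is why the rule set grows in the same way as it does for agreement.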
28
CFG for NLP summary
  • CFGs cover most syntactic structure in English.
  • But there are problems (overgeneration)
  • That can be dealt with adequately, although not
    elegantly, by staying within the CFG framework.
  • There are simpler, more elegant, solutions that
    take us out of the CFG framework
  • Chapter 16: Features and Unification

29
Dependency Grammars
  • Syntactic structure: binary relations between
    words
  • Links: grammatical function or very general
    semantic relation
  • Abstract away from word-order variations (simpler
    grammars)
  • Useful features in many NLP applications (for
    classification, summarization and NLG)

30
Today 2/10
  • English Syntax
  • Context-Free Grammar for English
  • Rules
  • Trees
  • Recursion
  • Problems
  • Start Parsing

31
Parsing with CFGs
  • Input: sequence of words, e.g., I prefer a morning flight
  • Parser, using a CFG
  • Output: valid parse trees
  • Assign valid trees: each covers all and only the
    elements of the input and has an S at the top

32
Parsing as Search
  • The CFG defines the search space of possible parse trees
  • S -> NP VP
  • S -> Aux NP VP
  • NP -> Det Noun
  • VP -> Verb
  • Det -> a
  • Noun -> flight
  • Verb -> left | arrive
  • Aux -> do | does
  • Parsing: find all trees that cover all and only
    the words in the input

33
Constraints on Search
  • Input: sequence of words, e.g., I prefer a morning flight
  • Parser, using the CFG (search space)
  • Output: valid parse trees
  • Search Strategies
  • Top-down or goal-directed
  • Bottom-up or data-directed

34
Top-Down Parsing
  • Since we're trying to find trees rooted with an S
    (Sentences), start with the rules that give us an
    S.
  • Then work your way down from there to the words.
35
Next step: Top-Down Space
  • When POS categories are reached, reject trees
    whose leaves fail to match all words in the input
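
A minimal sketch of top-down, depth-first parsing over the small grammar from the Parsing as Search slide may help here. It starts from S, expands rules in textual order, and rejects a candidate as soon as a leaf fails to match the next input word. The function names parse, expand_rhs and top_down_parses are illustrative, and the code would loop forever on a left-recursive rule, which this grammar does not contain.

```python
# Grammar from the Parsing as Search slide.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb"]],
    "Det": [["a"]],
    "Noun": [["flight"]],
    "Verb": [["left"], ["arrive"]],
    "Aux": [["do"], ["does"]],
}


def parse(symbol, words, start):
    """Yield (tree, end) for every way symbol can cover words[start:end]."""
    if symbol not in GRAMMAR:  # terminal: must match the next input word
        if start < len(words) and words[start] == symbol:
            yield symbol, start + 1
        return
    for rhs in GRAMMAR[symbol]:  # try rules in textual order
        for children, end in expand_rhs(rhs, words, start):
            yield (symbol, *children), end


def expand_rhs(rhs, words, start):
    """Yield (children, end) for every way the sequence rhs can cover words[start:end]."""
    if not rhs:
        yield [], start
        return
    for first_tree, mid in parse(rhs[0], words, start):  # left-most symbol first
        for rest_trees, end in expand_rhs(rhs[1:], words, mid):
            yield [first_tree] + rest_trees, end


def top_down_parses(sentence):
    """Return all S-rooted trees that cover all and only the words of the sentence."""
    words = sentence.split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]


print(top_down_parses("a flight left"))
print(len(top_down_parses("does a flight arrive")))  # 1
print(top_down_parses("flight a left"))              # []
```

For "does a flight arrive" the parser first tries S -> NP VP and fails at the very first word, which is the kind of wasted work that bottom-up filtering is meant to avoid.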

36
Bottom-Up Parsing
  • Of course, we also want trees that cover the
    input words. So start with trees that link up
    with the words in the right way.
  • Then work your way up from there.
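
The data-directed strategy can be sketched as a search over reductions: start from the words, replace any substring that matches a rule's right-hand side with its left-hand side, and succeed if the whole input reduces to S. The rules are those from the Parsing as Search slide. The exhaustive agenda and the name bottom_up_recognize are illustrative choices, not the lecture's algorithm, and the sketch only recognizes a sentence rather than building trees.

```python
# Rules from the Parsing as Search slide, as (left-hand side, right-hand side) pairs.
RULES = [
    ("S", ("NP", "VP")),
    ("S", ("Aux", "NP", "VP")),
    ("NP", ("Det", "Noun")),
    ("VP", ("Verb",)),
    ("Det", ("a",)),
    ("Noun", ("flight",)),
    ("Verb", ("left",)),
    ("Verb", ("arrive",)),
    ("Aux", ("do",)),
    ("Aux", ("does",)),
]


def bottom_up_recognize(sentence):
    """True if some sequence of reductions turns the words into a single S."""
    start = tuple(sentence.split())
    agenda, seen = [start], {start}
    while agenda:
        form = agenda.pop()
        if form == ("S",):
            return True
        for lhs, rhs in RULES:
            n = len(rhs)
            for i in range(len(form) - n + 1):
                if form[i:i + n] == rhs:  # found a reducible substring
                    reduced = form[:i] + (lhs,) + form[i + n:]
                    if reduced not in seen:  # avoid re-exploring the same form
                        seen.add(reduced)
                        agenda.append(reduced)
    return False


print(bottom_up_recognize("a flight left"))         # True
print(bottom_up_recognize("does a flight arrive"))  # True
print(bottom_up_recognize("flight a left"))         # False
```

On "flight a left" the search still builds Noun, Det and VP constituents before giving up, which shows how bottom-up search forms locally valid pieces that can never join into an S.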

37
Two more steps: Bottom-Up Space
38
Top-Down vs. Bottom-Up
  • Top-down
  • Only searches for trees that can be answers
  • But suggests trees that are not consistent with
    the words
  • Bottom-up
  • Only forms trees consistent with the words
  • But suggests trees that make no sense globally
39
So Combine Them
  • Top-down control strategy to generate trees
  • Bottom-up to filter out inappropriate parses
  • Top-down control strategy
  • Depth vs. Breadth first
  • Which node to try to expand next (left-most)
  • Which grammar rule to use to expand a node
    (textual order)
40
Top-Down, Depth-First, Left-to-Right Search
Sample sentence: Does this flight include a meal?
41
Example
Does this flight include a meal?
42
Example
Does this flight include a meal?
43
Example
Does this flight include a meal?
44
Adding Bottom-up Filtering
  • The following sequence was a waste of time
    because an NP cannot generate a parse tree
    starting with an AUX

45
Bottom-Up Filtering
Category   Left Corners
S          Det, Proper-Noun, Aux, Verb
NP         Det, Proper-Noun
Nominal    Noun
VP         Verb
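
A left-corner table like this one can be computed from the grammar by following the first symbol of each rule until a POS category is reached. The transcript does not include the grammar behind the table, so the rules below are an assumed miniature grammar chosen to be consistent with it. The names GRAMMAR, left_corners and can_start are likewise illustrative.

```python
# Assumed grammar (not in the transcript), chosen to be consistent with the table.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"], ["Proper-Noun"]],
    "Nominal": [["Noun"], ["Noun", "Nominal"]],
    "VP": [["Verb"], ["Verb", "NP"]],
}


def left_corners(category, visiting=frozenset()):
    """Return the POS categories that can begin a phrase of the given category."""
    if category not in GRAMMAR:  # a POS category is its own left corner
        return {category}
    if category in visiting:     # guard against left-recursive rules
        return set()
    corners = set()
    for rhs in GRAMMAR[category]:
        corners |= left_corners(rhs[0], visiting | {category})
    return corners


def can_start(category, pos):
    """Bottom-up filter: only expand category if the next word's POS is a left corner."""
    return pos in left_corners(category)


for cat in GRAMMAR:
    print(cat, sorted(left_corners(cat)))
print(can_start("NP", "Aux"))  # False
```

The last line is the filter at work: an NP is never expanded when the next word is an Aux, so the wasted sequence described on the previous slide is pruned before it starts.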
46
Problems with TD-BU-filtering
  • Left recursion
  • Ambiguity
  • Repeated Parsing
  • Solution: Earley Algorithm
  • (once again dynamic programming!)

47
For Next Time
  • Read Chapter 13 (Parsing)
  • Optional: read Chapter 16 (Features and
    Unification); skip algorithms and implementation