Title: Guarded Constraints in Natural Language Processing
1Guarded Constraints in Natural Language Processing
- Kathryn L. Baker
- May 3, 2002
- Committee
- Bob Carpenter, Chair
- Teruko Mitamura, Co-Chair
- Chris Manning
- Carl Pollard
- Rich Thomason
2Overview
- Introduction and Thesis Statement
- Linguistic Theories
- Guarded Descriptions and Guarded Rules
- Implementation and Evaluation
- Conclusions
3Feature Structures
- Feature-value pairs encode information
- Framework for linguistic theories (FUG, LFG, HPSG)
- Widely used in NLP (frame grammars, interlinguas)
- Can use unification to combine them.
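As a rough illustration of how unification combines feature structures, the sketch below models untyped feature structures as nested Python dicts with atomic string values. The `unify` function and the dict encoding are assumptions made for illustration only; they are not the typed system of this talk, and structure sharing is not modelled.

```python
def unify(a, b):
    """Unify two feature structures; return the result, or None on a clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feat, b_val in b.items():
            if feat in result:
                sub = unify(result[feat], b_val)
                if sub is None:          # clash somewhere below this feature
                    return None
                result[feat] = sub
            else:
                result[feat] = b_val     # information only present in b
        return result
    # Atomic values (or atom vs. structure) unify only if they are equal.
    return a if a == b else None


# Hypothetical entries: a noun and what a verb might require of its subject.
dog = {"CAT": "noun", "AGR": {"PERSON": "third"}}
subject_of_runs = {"AGR": {"PERSON": "third", "NUMBER": "singular"}}
combined = unify(dog, subject_of_runs)
clash = unify({"AGR": {"NUMBER": "singular"}}, {"AGR": {"NUMBER": "plural"}})
```

Here `combined` carries the information of both inputs, while `clash` is `None` because the two NUMBER values conflict.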
4Example: The dog runs.
5German determiners have features
      nom  acc  dat  gen
masc  der  den  dem  des
fem   die  die  der  der
neut  das  das  dem  des
plur  die  die  den  der
- den is masculine accusative or dative plural.
- den Mann is masculine accusative
- den Männern is dative plural
6Constraints on German Determiners
- Constraints narrow down the set of answers, and hence the search space, for a problem, e.g. if X < 5 and X > 2 (X an integer), then X is 3 or 4.
- Computational linguists can use constraints to narrow down an analysis for some piece of language.
- The noun constrains a determiner in German.
- Specifically, we can describe the information we get, such as case:accusative.
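The narrowing effect of the noun on the determiner can be sketched as a set intersection. The paradigm below is the determiner table from the earlier slide; the `NOUNS` entries and the function names are simplified, hypothetical stand-ins for real lexical entries.

```python
# Determiner paradigm: row (gender or plural) -> case -> form.
PARADIGM = {
    "masc": {"nom": "der", "acc": "den", "dat": "dem", "gen": "des"},
    "fem":  {"nom": "die", "acc": "die", "dat": "der", "gen": "der"},
    "neut": {"nom": "das", "acc": "das", "dat": "dem", "gen": "des"},
    "plur": {"nom": "die", "acc": "die", "dat": "den", "gen": "der"},
}

# Hypothetical noun entries: the (row, case) combinations each form allows.
NOUNS = {
    "Mann": {("masc", "nom"), ("masc", "acc"), ("masc", "dat")},
    "Männern": {("plur", "dat")},
}


def readings(det):
    """All (row, case) analyses compatible with a determiner form."""
    return {(row, case)
            for row, cases in PARADIGM.items()
            for case, form in cases.items()
            if form == det}


def analyse(det, noun):
    """Intersect the determiner's readings with those the noun allows."""
    return readings(det) & NOUNS[noun]


den_alone = readings("den")
den_mann = analyse("den", "Mann")
den_maennern = analyse("den", "Männern")
```

On its own, `den` keeps two readings; combining it with either noun leaves exactly one.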
7Underspecification
- When more than one answer is still possible, a feature is considered underspecified.
- The verb constrains sheep in these cases.
- The sheep bleats.
- The sheep graze.
- The number feature is considered underspecified.
- Types can also be underspecified, in a typed
system.
8Typed Feature Structures
- This feature structure is underspecified for number.
- This feature structure has number specified.
9Implementing Feature-based Linguistic Theories
- Linguists can use underspecification to great advantage. The benefit is generality.
- Sometimes, an underspecified portion of a feature structure is shared.
- Our goal in implementation is to maintain the elegance of the original theories, and parse with them.
10Specific Problems for NLP
- Increased local ambiguity if the lexicon contains a priori an entry for every possibility.
- e.g. extracted complements for a sentence that is not gapped; every combination of binder and bound arguments.
- Process may not terminate if arguments are underspecified.
- e.g. applying a lexical rule over an uninstantiated list of complements
- Constraints may be prematurely satisfied.
- Vacuously true in absence of contradiction
11Solution: Guarded Constraints
- Wait until we have more information, then process the underspecified part of the feature structure.
- For example, describe what we want to know from the verb, in the case of "The sheep bleats", before processing sheep.
- Waiting is also called delaying. Constraints that are delayed are called guarded constraints.
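The idea of delaying can be sketched with a variable that holds suspended goals until it is bound. The `Var` class and `when_bound` helper are hypothetical names for this illustration and are not part of any system named in the talk.

```python
class Var:
    """A logic-style variable that stores suspended goals until it is bound."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.waiting = []          # goals delayed on this variable

    def bound(self):
        return self.value is not None

    def bind(self, value):
        self.value = value
        goals, self.waiting = self.waiting, []
        for goal in goals:         # wake every goal that was waiting
            goal()


def when_bound(var, goal):
    """Run goal now if var is bound; otherwise delay it on var."""
    if var.bound():
        goal()
    else:
        var.waiting.append(goal)


log = []
number = Var("NUMBER")             # number of "sheep", still underspecified
when_bound(number, lambda: log.append(f"sheep is {number.value}"))
delayed_before = list(log)         # nothing has run yet
number.bind("singular")            # the verb "bleats" supplies the value
```

The goal that processes "sheep" does nothing until the verb has supplied the number, at which point it runs once.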
12Thesis Statement
- Delays for processing a linguistic theory can
be generalized with a specification for guarded
constraints as descriptions of typed feature
structures. My work enables the successful
parsing of modern grammars as written, even with
highly lexicalized constraints.
13Contributions of the Thesis
- Guarded descriptions added to the theory of typed feature structures.
- Extension to the theory of Carpenter (1992)
- Demonstrate the theory in a logic programming system.
- Extended the ALE system of Carpenter & Penn (1994)
- Identification of various linguistic phenomena which can be implemented using guarding.
14Linguistic Theories
15Cross-language Phenomena as a Basis for Guarding
- Argument Raising in German, French and Italian (Johnson 1986, Hinrichs & Nakazawa 1989, Monachesi 1993, Abeillé & Godard 1994, Baker 1999)
- Romance Clitics (Miller 1992, Monachesi 1993, Sag & Godard 1993, Abeillé et al. 1995)
- Morphology
- German Passive (Kathol 1994, Pollard 1994)
- Causative
- Chichewa (Alsina 1992)
- Japanese (Manning et al. 1999)
16Cross-linguistic Phenomena
- Complement Extraction (Pollard & Sag 1994)
- Quantifier Raising (Pollard & Yoo 1997)
- Ordering Constraints (Erbach et al. 1995, Penn 1999)
17A Common Thread: Argument Sharing
- Two heads have a common argument or arguments.
- Causative: causee linked to both causative and embedded predicate
- Verb and auxiliary share syntactic arguments in argument raising
- Quantifier store of verb phrase derived from quantifiers of its arguments
- During processing, one head might need to gain information from the other.
18Example: Argument Raising by Auxiliary
- Wird Sandy Kim sehen?
- Will Sandy Kim see
- Will Sandy see Kim?
- Arguments of the verb sehen are the subject Sandy and the object Kim.
- Arguments of the auxiliary verb wird are Sandy, Kim, and the verb sehen.
19An auxiliary's arguments are underspecified
20Lexicalization of Constraints
- Concurrent trend towards lexicalization of constraints, such as binding theory, quantifier scoping, and complement extraction.
- e.g. for a verb, "Apply a binding constraint to my argument structure, however that is filled in."
21Argument Raising + Complement Extraction
- Kim wird Sandy sehen.
- Kim will Sandy see
- Sandy will see Kim.
- The object Kim has been fronted (extracted).
- A lexical rule for complement extraction says, "If I have a lexical entry with an object on the complements list of a verb, then I also have an entry with that object in the fronted position."
22Lexically specified Complement Extraction (topicalization, or fronting) (Pollard & Sag 1994)
23Raising by Auxiliary + Complement Extraction: Processing Problems
- We know nothing about the verb. Thus, it is difficult to extract arguments from the auxiliary's complements list.
- If we know the verb is sehen, then we know there is a direct object.
- Solution: Wait until we know what the verb's arguments are to process this lexically specified constraint.
24Guarded Descriptions and Guarded Rules
25Feature Structure Descriptions
- third person singular
- The description is
- I used feature structure descriptions to guard a
portion of a feature structure.
26Typed Feature Structure Descriptions
- Descriptions are a shorthand for picking out a particular feature structure (Rounds & Kasper 1986). Descriptions are relative to a particular type.
27Guarded Descriptions: Extending the Description Language
- If the feature structure satisfies φ, then add the information in the description ψ. Else, if it doesn't, add the description ξ.
- Example: verb "are"
- NUMBER:plural → true ; PERSON:second
28Satisfaction Conditions for Guarded Descriptions
- If a feature structure satisfies a description φ, then it also satisfies ψ, else it satisfies ξ.
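A minimal sketch of how such a guarded description might be evaluated over a partial feature structure follows. It assumes flat dicts with atomic values, and it assumes that an undecided guard suspends rather than choosing a branch (in line with the delaying idea of this talk); the function names and the `"suspended"` marker are hypothetical.

```python
def check(fs, desc):
    """Three-way test of a (feature, value) description against fs."""
    feat, value = desc
    if feat not in fs:
        return "unknown"               # underspecified: cannot decide yet
    return "satisfied" if fs[feat] == value else "refuted"


def add(fs, desc):
    """Add desc to fs. None as desc means 'true'; a None result is a clash."""
    if desc is None:
        return dict(fs)
    feat, value = desc
    if fs.get(feat, value) != value:
        return None
    return {**fs, feat: value}


def guarded(fs, guard, then_desc, else_desc):
    """Evaluate 'guard -> then_desc ; else_desc', suspending if undecided."""
    status = check(fs, guard)
    if status == "unknown":
        return "suspended"
    return add(fs, then_desc if status == "satisfied" else else_desc)


# The verb "are": NUMBER:plural -> true ; PERSON:second
ARE = (("NUMBER", "plural"), None, ("PERSON", "second"))

plural_subject = guarded({"NUMBER": "plural"}, *ARE)
singular_subject = guarded({"NUMBER": "singular"}, *ARE)
third_singular = guarded({"NUMBER": "singular", "PERSON": "third"}, *ARE)
no_information = guarded({}, *ARE)
```

A plural subject passes unchanged, a singular subject is forced to second person, a third person singular subject clashes, and a subject with no number information leaves the constraint suspended.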
29Negation
- Negation is defined as failure if every
extension of a feature structure satisfies a
description, else true. Definition follows
Moshier (1988).
30Inequations
- An instance of negation.
- Compile the description that contains an inequation as an instance of guarding (transparent to the user).
31Example: Binding Theory
- For non-reflexives, if the nominal objects are co-indexed, then fail, else true.
- John_i likes himself_i.
- John_i likes John_i.
- John_i likes John_j.
- (Subj, loc:cont:((nprop;pro), index:(SubjInd, =\= Ind)))
- bind(Subj, (Arg, loc:cont:index:Ind))
- (=\= Ind) compiled as SubjInd == Ind → fail ; true
32Guarded Rules (Constraint Logic Programming --
CLP)
- A constraint can be associated with a rule. This affects the order of goal resolution (the constraint must be satisfied first).
- I will use descriptions of feature structures as the guards on rules which have feature structure arguments.
33Guarded Rules
- Equivalent to using guarded descriptions.
- With negation defined, we can guard rules using
the regular description language.
34Argument Raising with Guards
- Kim wird Sandy sehen.
- Kim will Sandy see.
- Kim will see Sandy.
- Put a guard on argument raising. Each member of the verb's complements list must satisfy at least the type subst (substantive) before the arguments can be raised by the auxiliary.
- A substantive is a noun, verb, etc.
35Argument Raising (cont.)
- Guard on the COMPS list of the subcategorized verb.
- Comps:loc:cat:head:subst | aux_raising(Aux, Comps, Subj)
36Add Complement Extraction
- Sandy sehen wird Kim.
- Sandy see will Kim
- Kim will see Sandy.
- Know the semantic relation of the embedded verb before arguments can be extracted from an auxiliary.
- Prop:nucleus:relation
- lexical_rule(aux(comps:[PVP|Comps]) →
- aux(slash:(PVP, cont:Prop), comps:[PVP|Comps]))
37Suspending Goals
38Implementation and Evaluation
39Logic Grammar Systems
- Early systems
- FUG (Kay 1983)
- PATR-II (Shieber et al. 1985)
- LOGIN (Aït-Kaci & Nasr 1986)
- ALE (Carpenter & Penn 1994)
- ALEP (Erbach et al. 1995)
- CUF (Dörre & Dorna 1993)
- ConTroll (Götz et al. 1997)
- FUF (Elhadad & Robin 1992)
40Implementation
- Extension to ALE (Carpenter & Penn 1994)
- Allow guards on the type of a typed feature structure.
- Grammars
- Japanese causative and binding theory (Manning et al. 1999). Inequations are compiled as waits on disproving isomorphism.
- German partial verb phrase fronting (Baker 1999)
- Lexical quantifier scoping (Pollard & Yoo 1997)
41Other Possible Strategies
- Brute force or naïve approach (multiply out possibilities, esp. in the lexicon)
- Accommodation (setting limits not in the original theory; edit program or grammar)
- e.g. limit the number of a verb's complements.
- Hand-threading the grammar
- e.g. put lexical constraints in the phrase-structure rules.
42Results of Grammar Comparison
- Guarding may reduce lexicon size substantially, especially with respect to the brute force approach.
- Accommodation strategy may prune solutions.
- Hand-threaded grammar performs fastest, but is knowledge-intensive.
- Other: Slowness of guarding is due to blocking and unblocking of guards (up to 10x slower than the ideal case). Implementation could be improved.
43Raising by Auxiliary and PVP Fronting (parse times, in msec)
Sentence                      Naïve  Unguarded  Guarded  Thread
Sandy sieht Kim.                 70         60       70      70
Sandy wird Kim sehen können.   5590       4990     3620     840
Sandy sehen wird Kim.          1070        470      350     200
Sandy wird Kim sehen.          1240        810      710     280
Sandy sehen können wird Kim.   3450       2010     1110     390
Wird Kim gehen können?         2620       2120     2210     330
Wird Kim sehen können?         4260       3990     3270     610
Wird Kim gehen?                 540       1460      430     120
wird Kim können (VP)           1860        940      580     160
Wird Kim Sandy sehen?          1220       1230     1570     320
Wird Kim Sandy sehen können?   5850       8240     8900     880
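The relative cost of guarding can be read off the table by dividing the guarded parse time by the hand-threaded one for each sentence. The short sketch below does that arithmetic; treating the hand-threaded column as the reference point is an assumption made here, and the variable names are illustrative.

```python
# Parse times in msec from the table: (naive, unguarded, guarded, thread).
TIMES = {
    "Sandy sieht Kim.":             (70, 60, 70, 70),
    "Sandy wird Kim sehen können.": (5590, 4990, 3620, 840),
    "Sandy sehen wird Kim.":        (1070, 470, 350, 200),
    "Sandy wird Kim sehen.":        (1240, 810, 710, 280),
    "Sandy sehen können wird Kim.": (3450, 2010, 1110, 390),
    "Wird Kim gehen können?":       (2620, 2120, 2210, 330),
    "Wird Kim sehen können?":       (4260, 3990, 3270, 610),
    "Wird Kim gehen?":              (540, 1460, 430, 120),
    "wird Kim können (VP)":         (1860, 940, 580, 160),
    "Wird Kim Sandy sehen?":        (1220, 1230, 1570, 320),
    "Wird Kim Sandy sehen können?": (5850, 8240, 8900, 880),
}

# Slowdown of the guarded grammar relative to the hand-threaded grammar.
slowdown = {s: guarded / thread
            for s, (_naive, _unguarded, guarded, thread) in TIMES.items()}

worst = max(slowdown, key=slowdown.get)
best = min(slowdown, key=slowdown.get)

for sentence, ratio in slowdown.items():
    print(f"{sentence:<30}{ratio:5.1f}x")
```

The largest ratio is for the longest verb-initial sentence, at roughly ten times the hand-threaded time, and the smallest is for the simple sentence with no auxiliary, where the two times are equal.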
44Advantages of the Approach
- Theoretical relevance
- Faithful rendering of both the linguistic theory and the theory I develop for guarded feature structures. Enables complex theories to be implemented as written.
- Algorithm is parser-independent
- In a study of quantifier raising, the guards are unblocked in exactly the same order with three different parsers.
46Conclusions
- First generalization of delays over feature structures.
- Flexible guards can be applied to lexical rules, grammar rules, type constraints, etc.
- Guarding is relevant for a wide range of linguistic phenomena and also across languages.
47Future Work
- Speedier implementation so that guards are not unblocked every time the type variable changes.
- Transfer the approach to non-typed formalisms (e.g. LFG).
48Extra slides begin here
49Example (CLP)
- A program to append two lists together will go into an unbounded search if the first and third arguments are both variables. Wait until either of these is nonvariable.
- (nonvar(Xs) ; nonvar(Zs)) | append([X|Xs], Ys, [X|Zs]) ← append(Xs, Ys, Zs)
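The effect of this guard can be sketched in Python by letting `None` stand for an unbound variable. This is an illustrative simplification, not a CLP implementation: lists are either fully known or fully unknown, and `guarded_append` and `SUSPENDED` are hypothetical names.

```python
SUSPENDED = "suspended"


def guarded_append(xs, ys, zs):
    """Solve append(xs, ys, zs); None marks an unbound argument.

    Returns a list of (xs, ys, zs) solutions, or SUSPENDED when the guard
    (first or third argument known) is not yet satisfied.
    """
    if xs is None and zs is None:
        return SUSPENDED               # the search would be unbounded: wait
    if zs is not None:
        # Third argument known: only finitely many ways to split it.
        return [(zs[:i], zs[i:], zs)
                for i in range(len(zs) + 1)
                if (xs is None or xs == zs[:i])
                and (ys is None or ys == zs[i:])]
    if ys is None:
        # Only the first argument is known. The search is finite, but this
        # flat encoding cannot show the partial result, so both stay None.
        return [(xs, None, None)]
    return [(xs, ys, xs + ys)]


waiting = guarded_append(None, [3], None)
forward = guarded_append([1, 2], [3], None)
backward = guarded_append(None, [3], [1, 2, 3])
splits = guarded_append(None, None, [1])
```

With neither the first nor the third list known the goal is suspended; once either is known, the set of solutions is finite and can be enumerated.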
50Guarding on Typed Feature Structures
- In implementing delays, we put a guard on the type of the information we are looking for.
- We never delay any goals that could have been solved with a less specific type.
- In asking whether a feature structure F satisfies a feature-value pair Feat:Desc, a wait is placed on the type of F if it does not already satisfy a type for which Feat is appropriate. Recursively, a wait is placed on the feature structure rooted at Feat with respect to Desc.
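The waiting behaviour described above can be sketched over a toy type hierarchy. Everything named here is an assumption for illustration: the tree-shaped hierarchy, the types at which CASE and VFORM are taken to be appropriate, the atomic feature values, and the result strings.

```python
# Hypothetical type hierarchy (child -> parent) and appropriateness table.
PARENT = {"head": None, "subst": "head", "func": "head",
          "noun": "subst", "verb": "subst"}
INTRODUCED_AT = {"CASE": "noun", "VFORM": "verb"}


def subsumes(general, specific):
    """True if specific is general itself or one of its subtypes."""
    while specific is not None:
        if specific == general:
            return True
        specific = PARENT[specific]
    return False


def check_feature(fs_type, feats, feat, value):
    """Test Feat:value against a feature structure of the given type."""
    needed = INTRODUCED_AT[feat]
    if subsumes(needed, fs_type):
        # The type is specific enough for Feat; now look at the value.
        if feat not in feats:
            return "wait on value"
        return "satisfied" if feats[feat] == value else "refuted"
    if subsumes(fs_type, needed):
        # The type could still be refined to one where Feat is appropriate.
        return "wait on type"
    return "refuted"                   # incompatible types: no need to wait


underspecified = check_feature("subst", {}, "CASE", "acc")
noun_no_case = check_feature("noun", {}, "CASE", "acc")
noun_acc = check_feature("noun", {"CASE": "acc"}, "CASE", "acc")
verb = check_feature("verb", {}, "CASE", "acc")
```

A structure typed only as subst leaves the check waiting on its type, a noun without a CASE value waits on the value, and a verb refutes the description outright since its type can never make CASE appropriate.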