Title: Whose presentation is this? SUBJ(present, Violeta Seretan)
1. Whose presentation is this?
SUBJ(present, Violeta Seretan)
(Decoding the predicate-argument structure of nominalizations)
OBL(collaborate, Lorenzo Thione), PP-OBJ(with, Lorenzo Thione)
SUBJ(supervise, Martin van den Berg)
2. Overview
- nominalization problem
- NOMLEX resource
- Denominalizer service based on NOMLEX
- additional resources (CSLI)
- APIs for NOMLEX, CSLI
- related and future work
- demo
3. Text normalization for QA
- Mark Twain published Adventures of Huckleberry Finn in 1885 in America.
  - Who published H.F.?
  - Where was H.F. published?
  - When was H.F. published?
- QA/NLU needs to deal with a large spectrum of variation in text
  - morphological: published, publishes
  - syntactic: H.F. was published
  - lexical: novel, book, masterpiece, work; publish, write, author, appear
  - nominalization: the publication
- Normalization (via parsing)
  - base word form: publishes -> publish, published -> publish
  - canonical word order: SUBJ(publish, Mark Twain), OBJ(publish, H.F.)
- Lexical semantic resources
  - synonyms, hyponyms, hypernyms, ...
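
The normalization idea on this slide can be illustrated with a minimal sketch. It assumes a hand-written lemma table and hand-built triples in place of a real parser; the names LEMMAS, lemma, facts and answer are invented for illustration only.

```python
# Toy illustration of normalization for QA. The lemma table and the triples
# are hand-written assumptions, not the output of a real parser.
LEMMAS = {"published": "publish", "publishes": "publish"}

def lemma(word):
    """Map an inflected form to its base form (identity if unknown)."""
    return LEMMAS.get(word, word)

# Canonical triples for "Mark Twain published Adventures of Huckleberry Finn".
facts = {
    ("SUBJ", lemma("published"), "Mark Twain"),
    ("OBJ", lemma("published"), "H.F."),
}

def answer(role, verb_form):
    """Return the fillers of `role` for the normalized verb."""
    verb = lemma(verb_form)
    return [filler for (r, v, filler) in facts if r == role and v == verb]

# "Who published H.F.?" asks for the SUBJ of publish.
print(answer("SUBJ", "publishes"))
```

Because both the text and the question are reduced to the same base form and the same canonical relations, surface variation such as tense or passive voice no longer blocks the match.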
4. Nominalization
- Since the publication of Huckleberry Finn in 1885, there have been many reactions to the novel, some of them quite extreme.
- When was H.F. published?
Nominalization: NP having a systematic correspondence with a clause structure (Quirk et al. 1985)
Goal: decoding the clause structure
5. Mapping nominal arguments into verbal roles
- Mark Twain's publication of his book
  - possessive determiner, PP adjunct (nominal arguments)
- the book publication by Mark Twain
  - modifier, PP adjunct (nominal arguments)
- Mark Twain - publish - book
  - SUBJECT, OBJECT (verbal roles)
6. Role ambiguity
- Rome's destruction: SUBJ or OBJ?
  - OBJ(destroy, Rome)
  - SUBJ(destroy, Rome)
- Rome's destruction by barbarians: OBJ
- Rome's destruction of Carthage: SUBJ
- Rome's destruction: OBJ (by default)
- John's admiration: SUBJ (by default)
7. NOMLEX: NOMinalization LEXicon
- Macleod et al., New York University
- 1025 deverbal nouns
- detailed mapping from nominal arguments to verb roles

    ORTH "destruction"
    VERB "destroy"
    VERB-SUBC ((NOM-NP SUBJECT ((N-N-MOD)
                                (DET-POSS)
                                (PP PVAL ("by")))
                       OBJECT ((DET-POSS)
                               (N-N-MOD)
                               (PP PVAL ("of")))
                       REQUIRED ((OBJECT DET-POSS-ONLY T
                                         N-N-MOD-ONLY T))))

Callouts on the entry: role to assign, default role
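
Since NOMLEX entries are written as Lisp-style lists, reading one into a program mostly amounts to parsing an S-expression. The sketch below is one possible way to do that in Python. It is not the Java API described later; the function names are invented, and the outer `(NOM ...)` wrapper is added by assumption so that the fragment forms a single expression.

```python
import re

# The "destruction" entry from the slide, wrapped in (NOM ...) by assumption.
ENTRY = '''
(NOM ORTH "destruction"
     VERB "destroy"
     VERB-SUBC ((NOM-NP SUBJECT ((N-N-MOD) (DET-POSS) (PP PVAL ("by")))
                        OBJECT ((DET-POSS) (N-N-MOD) (PP PVAL ("of")))
                        REQUIRED ((OBJECT DET-POSS-ONLY T N-N-MOD-ONLY T)))))
'''

# Tokens: parentheses, double-quoted strings, or bare symbols.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')

def parse(text):
    """Parse one S-expression into nested Python lists of strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing parenthesis
            return items
        return tok.strip('"')

    return read()

def plist(items):
    """Turn a flat [KEY, value, KEY, value, ...] list into a dict."""
    return dict(zip(items[0::2], items[1::2]))

entry = plist(parse(ENTRY)[1:])   # drop the leading NOM tag
frame = entry["VERB-SUBC"][0]     # the NOM-NP frame
roles = plist(frame[1:])          # SUBJECT / OBJECT / REQUIRED
print(entry["ORTH"], "->", entry["VERB"])
print(roles["SUBJECT"])
```

Each role then maps to the list of nominal positions that can realize it, which is the information the matching step needs.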
8. NOMLEXML
(NOM ORTH "accusation"
     PLURAL "accusations"
     PLURAL-FREQ "not rare"
     VERB "accuse"
     NOUN-SUBC ((NOUN-PP PVAL ("about")))
     NOM-TYPE ((VERB-NOM))
     VERB-SUBJ ((DET-POSS)
                (N-N-MOD)
                (PP PVAL ("by")))
     SUBJ-ATTRIBUTE ((COMMUNICATOR))
     OBJ-ATTRIBUTE ((COMMUNICATOR))
     VERB-SUBC ((NOM-NP-PP SUBJECT ((DET-POSS)
                                    (N-N-MOD)
                                    (PP PVAL ("by")))
                           OBJECT ((PP PVAL ("against")))
                           PVAL ("of"))
                (NOM-NP SUBJECT ((DET-POSS)
Perl
9. NOMLEX API in Java
- com.fxpal.sake.test (NomLexInterface)
- com.fxpal.ltng.services.normalization.noun.nomlex (NomLex, NomLexEntry, NomLexClassConstants, Subcat)
10. How useful?
- Oracle acquired PeopleSoft at the end of last year.
- Oracle's acquisition of PeopleSoft at the end of last year
- Google hits, 10/25/2005

Query                                 Hits
"Oracle acquisition of PeopleSoft"    14500
"Oracle acquired PeopleSoft"          587
"Oracle's PeopleSoft acquisition"     693

More hits
"Oracle acquires PeopleSoft"          1020
"Oracle has acquired PeopleSoft"      248
"Oracle will acquire PeopleSoft"      424
11. Argument-role mapping
- Oracle's acquisition of PeopleSoft
  - possessive, PP (of)

    ORTH "acquisition"
    VERB "acquire"
    VERB-SUBC ((NOM-NP SUBJECT ((DET-POSS)
                                (N-N-MOD)
                                (PP PVAL ("by")))
                       OBJECT ((N-N-MOD)
                               (PP PVAL ("of")))))
12. Denominalizer
- Input: sentence
- Output: pairs nominal argument - verb role
  - for each nominalization: (noun, (argument - role))
- Examples
  - Oracle's acquisition of PeopleSoft finally materialized after an 18-month struggle between the two companies.
    - (acquisition, (Oracle - SUBJECT) (PeopleSoft - OBJECT))
  - Oracle acquisition finally materialized.
    - (acquisition, (Oracle - SUBJECT) (Oracle - OBJECT))
13. Algorithm
- parse sentence
- for each deverbal noun
  - get noun arguments
  - for each NOMLEX entry for noun
    - for each subcat of the entry
      - 1. match arguments against subcat
      - 2. filter assignment results
    - select a subcat
  - output assignments for selected subcat
- Note: overlapping nominalizations ok
  - an increase in product sales
com.fxpal.ltng.services.normalization.noun.
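
The loop structure of the algorithm can be sketched as follows. This is an illustrative Python outline, not the Java implementation: the data layout, the function names, the toy lexicon entries and the rule used to select a subcat (the one covering the most arguments) are all assumptions, since the slide does not specify them.

```python
def match(arguments, subcat):
    """Step 1: for each filler, collect the roles whose positions license it."""
    found = {}
    for position, filler in arguments:
        roles = {role for role, allowed in subcat.items() if position in allowed}
        if roles:
            found[filler] = roles
    return found

def denominalize(nominals, lexicon, keep=lambda alternatives: alternatives):
    """nominals: {noun: [(position, filler), ...]} taken from a parse.
    lexicon:  {noun: [entry, ...]}; an entry is a list of subcat dicts.
    keep:     step 2, a filter over the matched alternatives (identity here)."""
    output = {}
    for noun, arguments in nominals.items():      # for each deverbal noun
        candidates = []
        for entry in lexicon.get(noun, []):       # for each NOMLEX entry
            for subcat in entry:                  # for each subcat
                candidates.append(keep(match(arguments, subcat)))
        if candidates:
            # Assumed selection rule: the subcat that covers most arguments.
            output[noun] = max(candidates, key=len)
    return output

# "an increase in product sales": two overlapping nominalizations.
nominals = {
    "increase": [("PP:in", "sales")],
    "sales": [("N-N-MOD", "product")],
}
lexicon = {  # hypothetical entries, not copied from NOMLEX
    "increase": [[{"SUBJECT": {"DET-POSS", "PP:in", "PP:of"}}]],
    "sales": [[{"SUBJECT": {"DET-POSS", "PP:by"}, "OBJECT": {"N-N-MOD", "PP:of"}}]],
}
result = denominalize(nominals, lexicon)
print(result)
```

Each nominalization is processed independently, which is why overlapping cases such as "increase" and "sales" pose no problem: "sales" is an argument of "increase" and at the same time a head with its own argument.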
14. Step 1: Matching
- Oracle's acquisition of PeopleSoft finally materialized.
- Arguments (acquisition)
  - POSS(acquisition, Oracle)
  - ADJUNCT(acquisition, of)
  - PP-OBJ(of, PeopleSoft)
- NOM-NP
    SUBJECT ((DET-POSS)
             (N-N-MOD)
             (PP PVAL ("by")))
    OBJECT ((N-N-MOD)
            (PP PVAL ("of")))
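
A minimal sketch of the matching step, assuming parser output is available as (relation, head, dependent) triples. The translation from parser relations to NOMLEX positions (POSS to DET-POSS, MOD to N-N-MOD, ADJUNCT plus PP-OBJ to PP) follows the examples in this deck; the function names and the "PP:of" string encoding are invented for illustration.

```python
TRIPLES = [
    ("POSS", "acquisition", "Oracle"),
    ("ADJUNCT", "acquisition", "of"),
    ("PP-OBJ", "of", "PeopleSoft"),
]

# The NOM-NP frame of "acquisition", with PPs encoded as "PP:<preposition>".
NOM_NP = {
    "SUBJECT": {"DET-POSS", "N-N-MOD", "PP:by"},
    "OBJECT": {"N-N-MOD", "PP:of"},
}

def nominal_arguments(noun, triples):
    """Translate the parser dependencies of `noun` into (position, filler) pairs."""
    args = []
    for rel, head, dep in triples:
        if head != noun:
            continue
        if rel == "POSS":
            args.append(("DET-POSS", dep))
        elif rel == "MOD":
            args.append(("N-N-MOD", dep))
        elif rel == "ADJUNCT":
            # A PP adjunct: the filler is the object of the preposition.
            for rel2, head2, dep2 in triples:
                if rel2 == "PP-OBJ" and head2 == dep:
                    args.append(("PP:" + dep, dep2))
    return args

def match(args, subcat):
    """For each filler, list the verbal roles whose positions license it."""
    return {
        filler: sorted(role for role, allowed in subcat.items() if position in allowed)
        for position, filler in args
    }

args = nominal_arguments("acquisition", TRIPLES)
print(match(args, NOM_NP))
```

Here the possessive can only be the subject and the of-PP can only be the object, so matching alone already yields an unambiguous assignment.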
15. Step 2: Filtering
- Oracle's PeopleSoft acquisition finally materialized.
- Arguments (acquisition)
  - POSS(acquisition, Oracle)
  - MOD(acquisition, PeopleSoft)
- NOM-NP
    SUBJECT ((DET-POSS)
             (N-N-MOD)
             (PP PVAL ("by")))
    OBJECT ((N-N-MOD)
            (PP PVAL ("of")))
Alternatives: Oracle - SUBJECT; PeopleSoft - SUBJECT, OBJECT
16. NOMLEX constraints (1)
- Uniqueness Constraint
  - A verbal role may be filled only once.
- Oracle's PeopleSoft acquisition
- Matching alternatives
  - Oracle: SUBJECT
  - PeopleSoft: SUBJECT, OBJECT
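
One way to apply the uniqueness constraint is to enumerate every combination of the matching alternatives and discard those that use a role twice. The sketch below assumes the alternatives are given as a dict from filler to candidate roles; the function name is invented.

```python
from itertools import product

def unique_assignments(alternatives):
    """Enumerate role assignments in which no verbal role is filled twice."""
    fillers = list(alternatives)
    kept = []
    for roles in product(*(alternatives[f] for f in fillers)):
        if len(set(roles)) == len(roles):
            kept.append(dict(zip(fillers, roles)))
    return kept

alternatives = {"Oracle": ["SUBJECT"], "PeopleSoft": ["SUBJECT", "OBJECT"]}
print(unique_assignments(alternatives))
```

Since Oracle can only be the subject, the only surviving assignment makes PeopleSoft the object.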
17. NOMLEX constraints (2)
- Ordering Constraint
  - If there are multiple pre-nominal arguments, they must appear in the order SUBJECT, INDIRECT OBJECT, DIRECT OBJECT, OBLIQUE.
- FX's printer sales grew by 50%.
- Matching alternatives
  - FX: SUBJECT, OBJECT
  - printer: SUBJECT, OBJECT
- order: FX, printer
- verbal roles: SUBJECT, OBJECT
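
The ordering constraint can be checked by ranking roles and requiring the ranks of the pre-nominal arguments to increase from left to right. This is a sketch under two assumptions: the plain OBJECT label of the alternatives is treated as the direct object, and the names RANK and ordered_assignments are invented.

```python
from itertools import product

# Rank of each role in the required pre-nominal order.
# Assumption: "OBJECT" stands for the direct object.
RANK = {"SUBJECT": 0, "INDIRECT OBJECT": 1, "OBJECT": 2, "OBLIQUE": 3}

def ordered_assignments(prenominal, alternatives):
    """prenominal: fillers in surface order.
    Keep assignments whose role ranks strictly increase left to right."""
    kept = []
    for roles in product(*(alternatives[f] for f in prenominal)):
        ranks = [RANK[r] for r in roles]
        if all(a < b for a, b in zip(ranks, ranks[1:])):
            kept.append(dict(zip(prenominal, roles)))
    return kept

alternatives = {"FX": ["SUBJECT", "OBJECT"], "printer": ["SUBJECT", "OBJECT"]}
print(ordered_assignments(["FX", "printer"], alternatives))
```

Requiring a strict increase also rules out filling the same role twice, so for pre-nominal arguments this check subsumes the uniqueness constraint.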
18. NOMLEX constraints (3)
- Obligatoriness Constraint
  - By default, the subject and object are optional.
  - A NOMLEX entry can specify obligatory roles to be filled.
- circulation - REQUIRED (SUBJECT)
  - blood circulation
  - SUBJ(circulate, blood)
- destruction - REQUIRED ((OBJECT DET-POSS-ONLY T N-N-MOD-ONLY T))
  - Rome's destruction
  - OBJ(destroy, Rome)
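
The two examples can be reproduced with a small filter over candidate assignments. The sketch rests on an assumed reading of the REQUIRED clause: a role listed without conditions must always be filled, while DET-POSS-ONLY and N-N-MOD-ONLY mean that a lone argument in that position must take the role. The data layout and function name are invented.

```python
def satisfies_required(assignment, positions, required):
    """assignment: {filler: role}; positions: {filler: NOMLEX position};
    required: {role: set of "only" positions}; an empty set means the role
    must always be filled (assumed reading of REQUIRED)."""
    filled = set(assignment.values())
    for role, only_positions in required.items():
        if not only_positions:
            if role not in filled:
                return False
        elif len(positions) == 1:
            filler, position = next(iter(positions.items()))
            if position in only_positions and assignment.get(filler) != role:
                return False
    return True

# Rome's destruction: a lone possessive must be the object.
destruction = {"OBJECT": {"DET-POSS", "N-N-MOD"}}
candidates = [{"Rome": "SUBJECT"}, {"Rome": "OBJECT"}]
kept = [a for a in candidates
        if satisfies_required(a, {"Rome": "DET-POSS"}, destruction)]
print(kept)

# blood circulation: the subject is always required.
circulation = {"SUBJECT": set()}
candidates2 = [{"blood": "SUBJECT"}, {"blood": "OBJECT"}]
kept2 = [a for a in candidates2
         if satisfies_required(a, {"blood": "N-N-MOD"}, circulation)]
print(kept2)
```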
19. Selectional Restrictions
com.fxpal.ltng.services.normalization.noun.csli (Nouns, Verbs, NounsVerbs)
20. Applying selectional restrictions
- room reservation
- Alternatives
  - room - SUBJECT, OBJECT
- reserve - selectional restrictions: SUBJECT sentient, OBJECT
- room - location, physobj
- semantic types for about 5000 N
- selectional restrictions for about 5000 V
- 459/941 verbs from NOMLEX (48.77%)
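
A minimal sketch of how selectional restrictions could prune the alternatives for "room reservation". The SUBJECT restriction and the semantic types of "room" come from the slide; the OBJECT restriction of "reserve" is not given there, so the value used below is a placeholder assumption, as are the table and function names.

```python
NOUN_TYPES = {"room": {"location", "physobj"}}

# SUBJECT restriction from the slide; the OBJECT restriction is an
# assumed placeholder, since the slide does not state it.
VERB_RESTRICTIONS = {
    "reserve": {"SUBJECT": {"sentient"}, "OBJECT": {"location", "physobj"}},
}

def filter_roles(verb, filler, roles):
    """Keep roles whose restriction shares a semantic type with the filler.
    Unknown nouns and unrestricted roles are left untouched."""
    types = NOUN_TYPES.get(filler)
    restrictions = VERB_RESTRICTIONS.get(verb, {})
    if types is None:
        return list(roles)
    return [r for r in roles if r not in restrictions or restrictions[r] & types]

print(filter_roles("reserve", "room", ["SUBJECT", "OBJECT"]))
```

Since a room is not sentient, the SUBJECT reading is dropped and only the OBJECT reading remains. Nouns or verbs missing from the resource are passed through unchanged, which matters because the resource covers only about half of the NOMLEX verbs.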
21. Coverage extension
- What if a noun is not in NOMLEX?
- additional deverbal nouns in the CSLI data
  - 4087 event nouns
  - 3348 new, 739 already in NOMLEX
  - 3348/1025: about 326% more data
- NOMLEX template
    NOM-NP
      SUBJECT ((DET-POSS)
               (N-N-MOD)
               (PP PVAL ("by")))
      OBJECT ((DET-POSS)
              (N-N-MOD)
              (PP PVAL ("of")))
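
The coverage figures and the fallback to the template can be checked with a few lines of Python. The lookup function and the dict encoding of the template are illustrative assumptions; only the numbers and the template content come from the slide.

```python
NOMLEX_NOUNS = 1025
CSLI_EVENT_NOUNS = 4087
ALREADY_IN_NOMLEX = 739

new_nouns = CSLI_EVENT_NOUNS - ALREADY_IN_NOMLEX
gain_percent = round(100 * new_nouns / NOMLEX_NOUNS, 1)
print(new_nouns, gain_percent)  # 3348 326.6

# The generic NOM-NP template, with PPs encoded as "PP:<preposition>".
DEFAULT_TEMPLATE = {
    "SUBJECT": {"DET-POSS", "N-N-MOD", "PP:by"},
    "OBJECT": {"DET-POSS", "N-N-MOD", "PP:of"},
}

def subcats_for(noun, lexicon):
    """Use the NOMLEX subcats when the noun is listed, else the template."""
    return lexicon.get(noun, [DEFAULT_TEMPLATE])

toy_lexicon = {"acquisition": [{"SUBJECT": {"DET-POSS"}, "OBJECT": {"PP:of"}}]}
print(subcats_for("merger", toy_lexicon) == [DEFAULT_TEMPLATE])
```

The template is deliberately permissive: it lets every pre-nominal position fill either role, so the constraints and selectional restrictions have to do more of the disambiguation for nouns outside NOMLEX.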
22. Future work
- extensive test and evaluation
- other nominalization data
- deverbal noun recognition
- mapping information (FrameNet)
- other lexical resources
- PropBank semantic roles
- VerbLex selectional restrictions
- role assignment in context
- word sense disambiguation, anaphora, discourse
- collocations
- the author will make no accusation
- SUBJ(make, author) -> SUBJ(accuse, author)
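
The collocation case could be handled by transferring the subject of a support verb to the verb underlying its nominalized object. The sketch below is one possible approach, not part of the described system: the support-verb list, the noun-to-verb table and the OBJ(make, accusation) triple are assumptions added for illustration.

```python
SUPPORT_VERBS = {"make"}                    # assumed list of support verbs
NOMINALIZATIONS = {"accusation": "accuse"}  # noun -> underlying verb

def transfer_subject(triples):
    """Derive SUBJ(verb, x) when x is the subject of a support verb whose
    object is a nominalization of `verb`."""
    subjects = {head: dep for rel, head, dep in triples if rel == "SUBJ"}
    derived = []
    for rel, head, dep in triples:
        if (rel == "OBJ" and head in SUPPORT_VERBS
                and dep in NOMINALIZATIONS and head in subjects):
            derived.append(("SUBJ", NOMINALIZATIONS[dep], subjects[head]))
    return derived

# "the author will make no accusation"
triples = [("SUBJ", "make", "author"), ("OBJ", "make", "accusation")]
print(transfer_subject(triples))
```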
23. Related work
- PUNDIT system (Dahl et al., 1987)
- SNOWY QA system (Hull and Gomez 1996)
- NOMLEX for IE (Meyers et al., 1998)
- N-N interpretation (Lapata 2002, Girju et al. 2004)
24. References
- Dahl, Deborah A., Martha S. Palmer, and Rebecca J. Passonneau. 1987. Nominalizations in PUNDIT. In Proceedings of the 25th Annual Meeting of the Association for Computational Linguistics, Stanford, CA.
- Girju, Roxana, Ana-Maria Giuglea, Marian Olteanu, Ovidiu Fortu, Orest Bolohan, and Dan Moldovan. 2004. Support vector machines applied to the classification of semantic relations in nominalized noun phrases. In Proceedings of the HLT-NAACL Workshop on Computational Lexical Semantics.
- Hull, Richard and Fernando Gomez. 1996. Semantic Interpretation of Nominalizations. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, Portland, Oregon, August 1996, pp. 1062-1068.
- Lapata, Maria. 2002. The Disambiguation of Nominalisations. Computational Linguistics 28(3), 357-388.
- Macleod, Catherine, Ralph Grishman, Adam Meyers, Leslie Barrett, and Ruth Reeves. 1998. Nomlex: A lexicon of nominalizations. In Proceedings of the 8th International Congress of the European Association for Lexicography, pages 187-193, Liège, Belgium.
- Meyers, A., et al. 1998. Using NOMLEX to produce nominalization patterns for information extraction. In Proceedings of the COLING-ACL Workshop on Computational Treatment of Nominals.
- Quirk, R., S. Greenbaum, G. Leech, and J. Svartvik. 1985. A Comprehensive Grammar of the English Language. Longman, Harlow.
- Terada, Akira and Takenobu Tokunaga. 2003. Corpus based method of transforming nominalized phrases into clauses for text mining application. IEICE Transactions on Information and Systems, Vol. E86-D, No. 9, pp. 1736-1744.
26. Selectional restrictions data
- CSLI resource
  - nouns: 4447
    - semantic types (ontology)
  - verbs: 4858
    - subcategorizations
    - selectional restrictions
  - noun-verb: 5700 V (9415 N)
    - noun-verb pairs
27. Grammatical Transfer
NOMLEX        XLE                                        Example
DET-POSS      POSS                                       Rome's destruction
PP            ADJUNCT, PP-OBJ (POSNOUN)                  destruction of Carthage
TO-INF        XCOMP                                      the desire to leave
AS-NP-PHRASE  ADJUNCT, PP-OBJ (as, POSNOUN)              his resignation as chairman
N-N-MOD       MOD                                        the room reservation
P-ING         ADJUNCT, PP-OBJ (POSVERB)                  the accusation against launching
ING           ADJUNCT, QA_PROG()                         my appreciation being there
FOR-TO-INF    ADJUNCT, SUBJ                              the wish for him to go
ADVP          ADJUNCT (POSADV)                           his departure abroad
AS-ING        ADJUNCT, PP-OBJ (as, POSVERB), QA_PROG()   characterization as being
AS-ADJP       ADJUNCT, PP-OBJ (as, POSADJ)               the characterization as useful
P-POSSING     ADJUNCT, PP-OBJ (POSVERB), POSS            the acceptance of his talking
28. FrameNet
- aim: word semantico-syntactic mapping
- semantic roles: frame elements (frame-specific)
- BNC corpus (100M words); American English: LDC, ANC
- more than 600 frames, about 9,000 words
Example: accusation, frame Judgment_communication
FE (for this word) and their realization
communicator evaluee reason
not expressed (27/48) possessive determiner (6/48) PP (from) (2/48) not expressed (40/48) PP (against) (5/48) PP (about) (3/48) PP (of) (9/48) S (that) (9/48) not expressed (8/48) PP (about) (3/48)
29. NOMLEX constraints (4)
- restrictions on possible combinations
  - specified in NOMLEX entry
- adaptation
  - NOT ((AND SUBJECT ((DET-POSS) (N-N-MOD)) OBJECT ((N-N-MOD))))
  - plants' weather adaptation
  - plants' adaptation to weather
- Note: Not implemented (cannot decide which assignment to remove).
30. Denominalizer UI
com.fxpal.sake.test.DenominalizerTest
parse triples
output