Title: Semantic Parsing
1. Semantic Parsing
- Pushpak Bhattacharyya,
- Computer Science and Engineering Department,
- IIT Bombay
- pb@cse.iitb.ac.in
- with contributions from Rajat Mohanty, S. Krishna, Sandeep Limaye
2. Motivation
- Semantics extraction has many applications
- MT
- IR
- IE
- Does not come free
- Resource intensive
- Properties of words
- Conditions of relation establishment between words
- Disambiguation at many levels
- Current computational parsing is less than satisfactory for deep semantic analysis
3. Roadmap
- Current important parsers
- Experimental observations
- Handling of difficult language phenomena
- Brief introduction to the adopted semantic representation: Universal Networking Language (UNL)
- Approach 1: a two-stage process for UNL generation
- Approach 2: use of a better parser
- Consolidating statement of resources
- Observations on the treatment of verbs
- Conclusions and future work
4. Current parsers
5. Categorization of parsers
- Rule-based, constituency output: Earley chart (1970), CYK (1965-70), LFG (1970), HPSG (1985)
- Rule-based, dependency output: Link (1991), Minipar (1993)
- Probabilistic, constituency output: Charniak (2000), Collins (1999), Stanford (2006), RASP (2007)
- Probabilistic, dependency output: Stanford (2006), MST (2005), MALT (2007)
6. Observations on some well-known probabilistic constituency parsers
7. Parsers investigated
- Charniak: probabilistic lexicalized bottom-up chart parser
- Collins: head-driven statistical beam-search parser
- Stanford: probabilistic A* parser
- RASP: probabilistic GLR parser
8. Investigations based on
- Robustness to ungrammaticality
- Ranking in case of multiple parses
- Handling of embeddings
- Handling of multiple POS
- Words repeated with multiple POS
- Complexity
9. Handling ungrammatical sentences
10. Charniak
- Parse of the ungrammatical sentence "Joe has reading the book":
- (S (NP (NNP Joe)) (VP (AUX has) (VP (VBG reading) (NP (DT the) (NN book)))))
11. Collins
12. Stanford
- has is treated as VBZ and not AUX
13. RASP
- Confuses it with a case of sentence embedding
14. Ranking in case of multiple parses
15. Charniak
- The semantically correct one is chosen from among the possible multiple parse trees
- Sentence: John said Mary sang the song with Max
- (S (NP (NNP John)) (VP (VBD said) (SBAR (S (NP (NNP Mary)) (VP (VBD sang) (NP (DT the) (NN song)) (PP (IN with) (NP (NNP Max))))))))
16. Collins
17. Stanford
18. RASP
- Different POS tags, but the parse trees are comparable
19. Time complexity
20. Time taken
- 54 copies of the sentence "This is just to check the time" were used to measure parsing time
- Time taken
- Collins: 40s
- Stanford: 14s
- Charniak: 8s
- RASP: 5s
- Reported complexity
- Charniak: O(n^5)
- Collins: O(n^5)
- Stanford: O(n^3)
- RASP: not known
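The O(n^3) bound for chart parsing comes from iterating over O(n^2) spans with O(n) split points each. A minimal CKY recognizer sketch in Python over a toy CNF grammar (the grammar is a made-up illustration, unrelated to the parsers above):

from itertools import product

# Toy CNF grammar (hypothetical): maps a pair of child labels to parent labels
binary = {("NP", "VP"): {"S"}, ("DT", "NN"): {"NP"}, ("VBZ", "NP"): {"VP"}}
lexical = {"the": {"DT"}, "dog": {"NN"}, "bites": {"VBZ"}, "man": {"NN"}}

def cky_recognize(words):
    n = len(words)
    # chart[i][j] = set of nonterminals deriving words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(lexical.get(w, ()))
    for length in range(2, n + 1):           # O(n) span lengths
        for i in range(n - length + 1):      # O(n) start positions
            j = i + length
            for k in range(i + 1, j):        # O(n) split points: O(n^3) overall
                for b, c in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= binary.get((b, c), set())
    return "S" in chart[0][n]

print(cky_recognize("the dog bites the man".split()))  # True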
21. Embedding handling
22. Charniak
- Sentence: The cat that killed the rat that stole the milk that spilled on the floor that was slippery escaped.
- [Charniak parse tree: the four relative clauses come out as nested SBAR/WHNP constituents under the successive NPs.]
23. Collins
24. Stanford
25. RASP
26. Handling words with multiple POS tags
27. Charniak
- Parse of "Time flies like an arrow":
- (S (NP (NNP Time)) (VP (VBZ flies) (PP (IN like) (NP (DT an) (NN arrow)))))
28. Collins
29. Stanford
30. RASP
31. Repeated word handling
32. Charniak
- Sentence: Buffalo buffaloes Buffalo buffaloes buffalo buffalo Buffalo buffaloes
- [Charniak parse trees: the repeated word is variously tagged NNP, NN and VBZ, with SBAR embeddings introduced to make the sequence parse.]
33. Collins
34. Stanford
35. RASP
36. Sentence length
37. Sentence with 394 words
- One day, Sam left his small, yellow home to head
towards the meat-packing plant where he worked, a
task which was never completed, as on his way, he
tripped, fell, and went careening off of a cliff,
landing on and destroying Max, who, incidentally,
was also heading to his job at the meat-packing
plant, though not the same plant at which Sam
worked, which he would be heading to, if he had
been aware that that the plant he was currently
heading towards had been destroyed just this
morning by a mysterious figure clad in black, who
hailed from the small, remote country of France,
and who took every opportunity he could to
destroy small meat-packing plants, due to the
fact that as a child, he was tormented, and
frightened, and beaten savagely by a family of
meat-packing plants who lived next door, and
scarred his little mind to the point where he
became a twisted and sadistic creature, capable
of anything, but specifically capable of
destroying meat-packing plants, which he did, and
did quite often, much to the chagrin of the
people who worked there, such as Max, who was not
feeling quite so much chagrin as most others
would feel at this point, because he was dead as
a result of an individual named Sam, who worked
at a competing meat-packing plant, which was no
longer a competing plant, because the plant that
it would be competing against was, as has already
been mentioned, destroyed in, as has not quite
yet been mentioned, a massive, mushroom cloud of
an explosion, resulting from a heretofore
unmentioned horse manure bomb manufactured from
manure harvested from the farm of one farmer J.
P. Harvenkirk, and more specifically harvested
from a large, ungainly, incontinent horse named
Seabiscuit, who really wasn't named Seabiscuit,
but was actually named Harold, and it completely
baffled him why anyone, particularly the author
of a very long sentence, would call him
Seabiscuit actually, it didn't baffle him, as he
was just a stupid, manure-making horse, who was
incapable of cognitive thought for a variety of
reasons, one of which was that he was a horse,
and the other of which was that he was just
knocked unconscious by a flying chunk of a
meat-packing plant, which had been blown to
pieces just a few moments ago by a shifty
character from France.
38. Partial RASP parse
- (One_MC1 day_NNT1 ,_, Sam_NP1
leaveed_VVD his_APP small_JJ ,_,
yellow_JJ home_NN1 to_TO head_VV0
towards_II the_AT meat-packing_JJ
plant_NN1 where_RRQ he_PPHS1 worked_VVD
,_, a_AT1 task_NN1 which_DDQ beed_VBDZ
never_RR completeed_VVN ,_, as_CSA
on_II his_APP way_NN1 ,_, he_PPHS1
triped_VVD ,_, falled_VVD ,_, and_CC
goed_VVD careening_VVG off_RP of_IO
a_AT1 cliff_NN1 ,_, landing_VVG on_RP
and_CC destroying_VVG Max_NP1 ,_,
who_PNQS ,_, incidentally_RR ,_,
beed_VBDZ also_RR heading_VVG to_II
his_APP job_NN1 at_II the_AT
meat-packing_JB plant_NN1 ,_, though_CS
not_XX the_AT same_DA plant_NN1 at_II
which_DDQ Sam_NP1 worked_VVD ,_,
which_DDQ he_PPHS1 would_VM be_VB0
heading_VVG to_II ,_, if_CS he_PPHS1
haveed_VHD been_VBN aware_JJ that_CST
that_CST the_AT plant_NN1 he_PPHS1
beed_VBDZ currently_RR heading_VVG
towards_II haveed_VHD been_VBN
destroyed_VVN just_RR this_DD1
morning_NNT1 by_II a_AT1 mysterious_JJ
figure_NN1 clotheed_VVN in_II black_JJ
,_, who_PNQS hailed_VVD from_II the_AT
small_JJ ,_, remote_JJ country_NN1
of_IO France_NP1 ,_, and_CC who_PNQS
takeed_VVD every_AT1 opportunity_NN1
he_PPHS1 could_VM to_TO destroy_VV0
small_JJ meat-packing_NN1 plants_NN2 ,_,
due_JJ to_II the_AT fact_NN1 that_CST
as_CSA a_AT1 child_NN1 ,_, he_PPHS1
beed_VBDZ tormented_VVN ,_, and_CC
frightened_VVD ,_, and_CC beaten_VVN
savagely_RR by_II a_AT1 family_NN1
of_IO meat-packing_JJ plants_NN2
who_PNQS liveed_VVD next_MD door_NN1
,_, and_CC scared_VVD his_APP
little_DD1 mind_NN1 to_II the_AT
point_NNL1 where_RRQ he_PPHS1
becomeed_VVD a_AT1 twisted_VVN and_CC
sadistic_JJ creature_NN1 ,_, capable_JJ
of_IO anything_PN1 ,_, but_CCB
specifically_RR capable_JJ of_IO
destroying_VVG meat-packing_JJ plants_NN2
,_, which_DDQ he_PPHS1 doed_VDD ,_,
and_CC doed_VDD quite_RG often_RR ,_,
much_DA1 to_II the_AT chagrin_NN1 of_IO
the_AT people_NN who_PNQS worked_VVD
there_RL ,_, such_DA as_CSA Max_NP1
,_, who_PNQS beed_VBDZ not_XX
feeling_VVG quite_RG so_RG much_DA1
chagrin_NN1 as_CSA most_DAT others_NN2
would_VM feel_VV0 at_II this_DD1
point_NNL1 ,_, because_CS he_PPHS1
beed_VBDZ dead_JJ as_CSA a_AT1
result_NN1 of_IO an_AT1 individual_NN1
nameed_VVN Sam_NP1 ,_, who_PNQS
worked_VVD at_II a_AT1 competeing_VVG
meat-packing_JJ plant_NN1 ,_, which_DDQ
beed_VBDZ no_AT longer_RRR a_AT1
competeing_VVG plant_NN1 ,_, because_CS
the_AT plant_NN1 that_CST it_PPH1
would_VM be_VB0 competeing_VVG
against_II beed_VBDZ ,_, as_CSA
haves_VHZ already_RR been_VBN
mentioned_VVN ,_, destroyed_VVN in_RP
,_, as_CSA haves_VHZ not_XX quite_RG
yet_RR been_VBN mentioned_VVN ,_,
a_AT1 massive_JJ ,_, mushroom_NN1
cloud_NN1 of_IO an_AT1 explosion_NN1
,_, resulting_VVG from_II a_AT1
heretofore_RR unmentioned_JJ horse_NN1
manure_NN1 bomb_NN1 manufactureed_VVN
from_II manure_NN1 harvested_VVN from_II
the_AT farm_NN1 of_IO one_MC1
farmer_NN1 J._NP1 P._NP1 Harvenkirk_NP1 ,_,
and_CC more_DAR specifically_RR
harvested_VVN from_II a_AT1 large_JJ
,_, ungainly_JJ ,_, incontinent_NN1
horse_NN1 nameed_VVN Seabiscuit_NP1 ,_,
who_PNQS really_RR beed_VBDZ not_XX
nameed_VVN Seabiscuit_NP1 ,_, but_CCB
beed_VBDZ actually_RR nameed_VVN
Harold_NP1 ,_, and_CC it_PPH1
completely_RR baffleed_VVD he_PPHO1
why_RRQ anyone_PN1 ,_, particularly_RR
the_AT author_NN1 of_IO a_AT1 very_RG
long_JJ sentence_NN1 ,_, would_VM
call_VV0 he_PPHO1 Seabiscuit_NP1 _
actually_RR ,_, it_PPH1 doed_VDD
not_XX baffle_VV0 he_PPHO1 ,_, as_CSA
he_PPHS1 beed_VBDZ just_RR a_AT1
stupid_JJ ,_, manure-making_NN1 horse_NN1
,_, who_PNQS beed_VBDZ incapable_JJ
of_IO cognitive_JJ thought_NN1 for_IF
a_AT1 variety_NN1 of_IO reasons_NN2
,_, one_MC1 of_IO which_DDQ beed_VBDZ
that_CST he_PPHS1 beed_VBDZ a_AT1
horse_NN1 ,_, and_CC the_AT other_JB
of_IO which_DDQ beed_VBDZ that_CST
he_PPHS1 beed_VBDZ just_RR knocked_VVN
unconscious_JJ by_II a_AT1 flying_NN1
chunk_NN1 of_IO a_AT1 meat-packing_JJ
plant_NN1 ,_, which_DDQ haveed_VHD
been_VBN blowen_VVN to_II pieces_NN2
just_RR a_AT1 few_DA2 moments_NNT2
ago_RA by_II a_AT1 shifty_JJ
character_NN1 from_II France_NP1 ._.) -1
()
39. What do we learn?
- All parsers have problems dealing with long sentences
- Complex language phenomena cause them to falter
- Good as starting points for structure detection
- But their output needs correction very often
40. Needs of high-accuracy parsing (difficult language phenomena)
41. Context of our work: Universal Networking Language (UNL)
42. A vehicle for machine translation
- Much more demanding than the transfer approach or the direct approach
- [Diagram: English, Hindi, Chinese and French connected through the interlingua (UNL); each language is analysed into UNL and generated from UNL.]
43. A United Nations project
- Started in 1996
- 10-year programme
- 15 research groups across continents
- First goal: generators
- Next goal: analysers (needs solving various ambiguity problems)
- Currently active groups: UNL-Spanish, UNL-Russian, UNL-French, UNL-Hindi
- IIT Bombay concentrating on UNL-Hindi and UNL-English
- Dave, Parikh and Bhattacharyya, Journal of Machine Translation, 2002
44. UNL represents knowledge: "John eats rice with a spoon"
- A UNL graph is built from universal words, semantic relations and attributes
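As an illustration, a plausible UNL encoding of this sentence (the restriction labels such as icl>food are assumptions made here for concreteness; the original slide shows only the graph):

agt(eat(icl>do).@entry.@present, John(iof>person))
obj(eat(icl>do).@entry.@present, rice(icl>food))
ins(eat(icl>do).@entry.@present, spoon(icl>artifact))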
45. Sentence embeddings
- Mary claimed that she had composed a poem
- UNL graph, with the embedded clause wrapped in scope :01:
- agt(claim(icl>do).@entry.@past, Mary(iof>person))
- obj(claim(icl>do).@entry.@past, :01)
- agt:01(compose(icl>do).@entry.@past.@complete, she)
- obj:01(compose(icl>do).@entry.@past.@complete, poem(icl>art))
46. Relation repository
- Number: 39
- Groups
- Agent-object-instrument: agt, obj, ins, met
- Time: tim, tmf, tmt
- Place: plc, plf, plt
- Restriction: mod, aoj
- Prepositions taking objects: gol, frm
- Ontological: icl, iof, equ
- Etc.
47. Semantically Relatable Sequences (SRS)
- Mohanty, Dutta and Bhattacharyya, Machine Translation Summit, 2005
48. Semantically Relatable Sequences (SRS)
- Definition: a semantically relatable sequence (SRS) of a sentence is a group of unordered words in the sentence (not necessarily consecutive) that appear in the semantic graph of the sentence as linked nodes or nodes with speech-act labels
49. Example to illustrate SRS
- The man bought a new car in June
50. SRSs from "the man bought a new car in June"
- man, bought
- bought, car
- bought, in, June
- new, car
- the, man
- a, car
51. Basic questions
- What are the SRSs of a given sentence?
- What semantic relations can link the words in an SRS?
52. Postulate
- A sentence needs to be broken into word groups of at most three of the following forms
- CW, CW
- CW, FW, CW
- FW, CW
- where CW refers to a content word or a clause and FW to a function word
53. Language phenomena and SRS
54. Clausal constructs
- Sentence: The boy said that he was reading a novel
- the, boy
- boy, said
- said, that, SCOPE
- SCOPE: he, reading
- SCOPE: reading, novel
- SCOPE: a, novel
- SCOPE: was, reading
- SCOPE is an umbrella for clauses or compounds
55. Prepositional phrase (PP) attachment
- John published the article in June
- John, published (CW, CW)
- published, article (CW, CW)
- published, in, June (CW, FW, CW)
- the, article (FW, CW)
- Contrast with: The article in June was published by John
- The, article (FW, CW)
- article, in, June (CW, FW, CW)
- article, was, published (CW, FW, CW)
- published, by, John (CW, FW, CW)
56. To-infinitival
- PRO element co-indexed with the object: I forced John_i PRO_i to throw a party
- PRO element co-indexed with the subject: I_i promised John PRO_i to throw a party
- SRSs are
- I, forced (CW, CW)
- forced, John (CW, CW)
- forced, SCOPE (CW, CW)
- SCOPE: John, to, throw (CW, FW, CW)
- SCOPE: throw, party (CW, CW)
- SCOPE: a, party (FW, CW)
- John is replaced with I in the second sentence
- SRSs thus go deeper than surface phenomena
57. Complexities of that
- Embedded clausal constructs, as opposed to relative clauses, need to be resolved
- Mary claimed that she had composed a poem
- The poem that Mary composed was beautiful
- Dangling that
- I told the child that I know that he played well
58. Two possibilities
- told (I, the child, [that I know that he played well])
- told (I, the child that I know, [that he played well])
59. SRS implementation
60. Syntactic constituents to semantic constituents
- Used a probabilistic parser (Charniak, 2004)
- The tags in the Charniak parser's output give indications of CW and FW
- NP, VP, ADJP and ADVP → CW
- PP (prepositional phrase), IN (preposition) and DT (determiner) → FW
61. Observation: headwords of sibling nodes form SRSs
- John has bought a car.
- SRSs
- has, bought
- a, car
- bought, car
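A minimal sketch of this observation in Python, using nltk's Tree; the head rules here are deliberately crude assumptions (verbs head VPs, the rightmost noun heads NPs), standing in for the proper head-percolation tables a real system would use:

from itertools import combinations
from nltk import Tree

def head(t):
    """Crude head rules (an assumption): verbs head VPs, rightmost noun heads NPs."""
    if isinstance(t, str):
        return t
    if t.label().startswith("VP"):
        for c in t:
            if not isinstance(c, str) and c.label().startswith(("VB", "AUX", "MD")):
                return head(c)
    if t.label().startswith("NP"):
        for c in reversed(t):
            if not isinstance(c, str) and c.label().startswith("NN"):
                return head(c)
    return head(t[-1])

def srs_pairs(t, out=None):
    """Headwords of sibling nodes form candidate SRSs."""
    if out is None:
        out = []
    if not isinstance(t, str):
        if len(t) > 1:
            out.extend(combinations([head(c) for c in t], 2))
        for c in t:
            srs_pairs(c, out)
    return out

tree = Tree.fromstring(
    "(S (NP (NNP John)) (VP (AUX has) (VP (VBN bought) (NP (DT a) (NN car)))))")
print(srs_pairs(tree))
# [('John', 'has'), ('has', 'bought'), ('bought', 'car'), ('a', 'car')]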
62. Work needed on the parse tree
63. Correction of wrong PP attachment
- John has published an article on linguistics
- Use PP-attachment heuristics (one common heuristic is sketched below)
- Get: article, on, linguistics
- [Parse-tree fragment: the PP ((F)IN on, (C)NP linguistics) is reattached from the (C)VP headed by published to the (C)NN headed by article.]
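One widely used PP-attachment heuristic is lexical association in the style of Hindle and Rooth: attach the PP to whichever candidate head co-occurs more strongly with the preposition. A sketch with made-up counts (an illustration, not necessarily the heuristic this system uses):

from collections import Counter

# Hypothetical corpus co-occurrence counts
noun_prep = Counter({("article", "on"): 120, ("article", "in"): 15})
verb_prep = Counter({("publish", "on"): 8, ("publish", "in"): 95})
noun_total = Counter({"article": 400})
verb_total = Counter({"publish": 300})

def attach(verb, noun, prep):
    """Attach the PP to whichever head associates more strongly with the preposition."""
    p_noun = noun_prep[(noun, prep)] / noun_total[noun]
    p_verb = verb_prep[(verb, prep)] / verb_total[verb]
    return "noun" if p_noun >= p_verb else "verb"

print(attach("publish", "article", "on"))  # noun -> SRS (article, on, linguistics)
print(attach("publish", "article", "in"))  # verb -> SRS (publish, in, June)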
64. To-infinitival
- The clause boundary is the VP node, labeled with SCOPE
- Its tag is modified to TO, an FW tag, indicating that it heads a to-infinitival clause
- The NP node with head him is duplicated and inserted (depicted by shaded nodes in the original figure) as a sibling of the VBD node with head forced, to bring out the existence of a semantic relation between force and him
65. Linking of clauses: John said that he was reading a novel
- The head of the S node is marked as SCOPE; SRS: said, that, SCOPE
- Adverbial clauses have similar parse-tree structures, except that the subordinating conjunctions are different from that
66. Implementation
- Block diagram of the system
67. Evaluation
- Used the Penn Treebank (LDC, 1995) as the test bed
- The un-annotated sentences, actually from the WSJ corpus (Charniak et al., 1987), were passed through the SRS generator
- Results were compared with the Treebank's annotated sentences
68. Results on SRS generation
69. Results on sentence constructs
70. SRS to UNL
71. Features of the system
- High-accuracy resolution of different kinds of attachment
- Precise and fine-grained semantic relations between sentence constituents
- Empty-pronominal detection and resolution
- Exhaustive knowledge bases of subcategorization frames, verb knowledge bases and rule templates for establishing semantic relations and speech-act-like attributes, built using
- Oxford Advanced Learner's Dictionary (Hornby, 2001)
- VerbNet (Schuler, 2005)
- WordNet 2.1 (Miller, 2005)
- Penn Treebank (LDC, 1995), and
- XTAG lexicon (XTAG, 2001)
72. Side effect: high-accuracy parsing (comparison with other parsers)
73. Rules for generating semantic relations
- e.g., finish within a week
- e.g., turn water into steam
74. Rules for generating attributes
75. System architecture
76. Evaluation scheme
77. Evaluation example
- Input: He worded the statement carefully.
- Generated UNL:
- agt(word.@entry, he)
- obj(word.@entry, statement.@def)
- man(word.@entry, carefully)
- Gold UNL:
- agt(word.@entry.@past, he)
- obj(word.@entry.@past, statement.@def)
- man(word.@entry.@past, carefully)
- F1-score: 0.945
- Not heavily punished, since attributes are not crucial to the meaning
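One way such a score can be computed is to treat the generated and gold UNL as sets of assertions, scoring each relation and each attribute separately; a sketch (the exact weighting behind the reported 0.945 is not given on the slide, so this simple scheme is an assumption):

def unl_f1(generated, gold):
    """Precision/recall/F1 over UNL assertions (relations plus per-UW attributes)."""
    base = lambda uw: uw.split(".")[0]                # strip @-attributes from a UW
    def assertions(exprs):
        out = set()
        for rel, head, dep in exprs:
            out.add((rel, base(head), base(dep)))     # the relation itself
            for uw in (head, dep):                    # each attribute as its own assertion
                for attr in uw.split(".")[1:]:
                    out.add((base(uw), attr))
        return out
    g, r = assertions(generated), assertions(gold)
    p, rc = len(g & r) / len(g), len(g & r) / len(r)
    return 2 * p * rc / (p + rc)

generated = [("agt", "word.@entry", "he"),
             ("obj", "word.@entry", "statement.@def"),
             ("man", "word.@entry", "carefully")]
gold = [("agt", "word.@entry.@past", "he"),
        ("obj", "word.@entry.@past", "statement.@def"),
        ("man", "word.@entry.@past", "carefully")]
print(round(unl_f1(generated, gold), 3))  # 0.909 under this simple scheme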
78. Approach 2: switch to rule-based parsing (LFG)
79. Using the functional structure from an LFG parser
- Pipeline: sentence → functional structure (transfer facts) → UNL
- John eats a pastry → SUBJ(eat, John), OBJ(eat, pastry), VTYPE(eat, main) → agt(eat, John), obj(eat, pastry)
80. Lexical Functional Grammar
- Considers two aspects
- Lexical: considers lexical structures and relations
- Functional: considers grammatical functions of different constituents, like SUBJECT and OBJECT
- Two structures
- C-structure (constituent structure)
- F-structure (functional structure)
- Languages vary in C-structure (word order, phrasal structure) but have the same functional structure (SUBJECT, OBJECT, etc.)
81. LFG structures: example
- Sentence: He gave her a kiss.
- [Figure: the C-structure (phrase-structure tree) and the F-structure (attribute-value matrix) of the sentence.]
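For readers without the figure: an F-structure is an attribute-value matrix. A rough rendering of the F-structure for this sentence as a nested Python dict (the exact feature inventory is an assumption):

# Approximate F-structure for "He gave her a kiss." (illustrative feature set)
f_structure = {
    "PRED": "give<SUBJ, OBJ, OBJ2>",  # predicate with its governed arguments
    "TENSE": "past",
    "SUBJ": {"PRED": "he", "NUM": "sg", "PERS": 3, "CASE": "nom"},
    "OBJ":  {"PRED": "she", "NUM": "sg", "PERS": 3, "CASE": "acc"},
    "OBJ2": {"PRED": "kiss", "NUM": "sg", "SPEC": {"PRED": "a"}},
}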
82. XLE parser
- Developed by Xerox Corporation
- Gives C-structures, F-structures and the morphology of the sentence constituents
- Supports a packed rewriting system converting F-structures to transfer facts, which our system uses
- Works on Solaris, Linux and Mac OS X
83. Notion of transfer facts
- A serialized representation of the functional structure
- Particularly useful for transfer-based MT systems
- We use it as the starting point for UNL generation
- Example transfer facts are shown on the next slide
84. Transfer facts: example
- Sentence: The boy ate the apples hastily.
- Transfer facts (selected):
- ADJUNCT(eat2, hastily6)
- ADV-TYPE(hastily6, vpadv)
- DET(apple5, the4)
- DET(boy1, the0)
- DET-TYPE(the0, def)
- DET-TYPE(the4, def)
- NUM(apple5, pl)
- NUM(boy1, sg)
- OBJ(eat2, apple5)
- PASSIVE(eat2, -)
- PERF(eat2, -_)
- PROG(eat2, -_)
- SUBJ(eat2, boy1)
- TENSE(eat2, past)
- VTYPE(eat2, main)
85. Workflow in detail
86. Phase 1: sentence to transfer facts
- Input sentence: The boy ate the apples hastily.
- Output: transfer facts (selected are shown here)
- ADJUNCT(eat2, hastily6)
- ADV-TYPE(hastily6, vpadv)
- DET(apple5, the4)
- DET(boy1, the0)
- DET-TYPE(the0, def)
- DET-TYPE(the4, def)
- NUM(apple5, pl)
- NUM(boy1, sg)
- OBJ(eat2, apple5)
- PASSIVE(eat2, -)
- PERF(eat2, -_)
- PROG(eat2, -_)
- SUBJ(eat2, boy1)
- TENSE(eat2, past)
- VTYPE(eat2, main)
87. Phase 2: transfer facts to word-entry collection
- Input: transfer facts as in the previous example
- Output: word-entry collection
- Word entry eat2, lex item eat: (PERF=-_ PASSIVE=- _SUBCAT-FRAME=V-SUBJ-OBJ VTYPE=main SUBJ=boy1 OBJ=apple5 ADJUNCT=hastily6 CLAUSE-TYPE=decl TENSE=past PROG=-_ MOOD=indicative)
- Word entry boy1, lex item boy: (CASE=nom _LEX-SOURCE=countnoun-lex COMMON=count DET=the0 NSYN=common PERS=3 NUM=sg)
- Word entry apple5, lex item apple: (CASE=obl _LEX-SOURCE=morphology COMMON=count DET=the4 NSYN=common PERS=3 NUM=pl)
- Word entry hastily6, lex item hastily: (DEGREE=positive _LEX-SOURCE=morphology ADV-TYPE=vpadv)
- Word entry the0, lex item the: (DET-TYPE=def)
- Word entry the4, lex item the: (DET-TYPE=def)
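A minimal sketch of this grouping step, assuming transfer facts arrive as (name, arg1, arg2) triples and that the indexed word in the first argument owns the fact (a simplification of the real system):

from collections import defaultdict

# A subset of the Phase 1 facts from the example above
facts = [("SUBJ", "eat2", "boy1"), ("OBJ", "eat2", "apple5"),
         ("TENSE", "eat2", "past"), ("NUM", "boy1", "sg"),
         ("NUM", "apple5", "pl"), ("DET", "boy1", "the0"),
         ("DET-TYPE", "the0", "def"), ("ADJUNCT", "eat2", "hastily6")]

def word_entries(facts):
    """Group transfer facts by the word entry in their first argument."""
    entries = defaultdict(dict)
    for name, arg1, arg2 in facts:
        entries[arg1][name] = arg2   # arg1 is the owning word entry
    return dict(entries)

for word, feats in word_entries(facts).items():
    print(word, feats)
# eat2 {'SUBJ': 'boy1', 'OBJ': 'apple5', 'TENSE': 'past', 'ADJUNCT': 'hastily6'} ...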
88. Phase 3 (1): UW and attribute generation
- Input: word-entry collection
- Output: universal words with (some) attributes generated
- In our example:
- UW(eat2.@entry.@past), UW(hastily6)
- UW(boy1), UW(the0)
- UW(apple5.@pl), UW(the4)
- Example transfer facts and their mapping to UNL attributes (sketched below)
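The mapping table itself is not reproduced on the slide; a sketch of what it plausibly looks like, with entries chosen to be consistent with the output above (the full table is an assumption):

# Map (fact name, value) pairs on a word entry to UNL attributes
ATTR_MAP = {("TENSE", "past"): "@past",
            ("NUM", "pl"): "@pl",
            ("DET-TYPE", "def"): "@def",
            ("PROG", "+"): "@progress",
            ("PERF", "+"): "@complete"}

def uw(word, feats, entry=False):
    """Build a universal word with attributes derived from its transfer facts."""
    attrs = ["@entry"] if entry else []
    attrs += [a for (name, value), a in ATTR_MAP.items() if feats.get(name) == value]
    return word + "".join("." + a for a in attrs)

print(uw("eat2", {"TENSE": "past"}, entry=True))  # eat2.@entry.@past
print(uw("apple5", {"NUM": "pl"}))                # apple5.@pl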
89. Digression: subcat frames, arguments and adjuncts
- Subcat frames and arguments
- A predicate subcategorizes for its arguments; equivalently, arguments are governed by the predicate
- Example: the predicate eat subcategorizes for a SUBJECT argument and an OBJECT argument
- The corresponding subcat frame is V-SUBJ-OBJ
- Arguments are mandatory for a predicate
- Adjuncts
- Give additional information about the predicate
- Not mandatory
- Example: hastily in "The boy ate the apples hastily."
90. Phase 3 (1): handling of subcat frames
- Input
- Word-entry collection
- Mapping of subcat frames to transfer facts
- Mapping of transfer facts to relations or attributes
- Output: relations and/or attributes
- Example: for our sentence, the agt(eat, boy) and obj(eat, apple) relations are generated in this phase
91. Rule bases for subcat handling, examples (1): mapping subcat frames to transfer facts
92. Rule bases for subcat handling, examples (2): mapping subcat frames and transfer facts to relations/attributes; some simplified rules (a sketch follows below)
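The rule tables are not reproduced here; a sketch of what such simplified rules might look like, with the animacy condition mirroring the relation-generation design on slide 97 (the rule contents are illustrative assumptions):

# (subcat frame, grammatical function) -> rule choosing a UNL relation
SUBCAT_RULES = {
    ("V-SUBJ-OBJ", "SUBJ"): lambda w: "agt" if w.get("ANIMATE") else "aoj",
    ("V-SUBJ-OBJ", "OBJ"): lambda w: "obj",
    ("V-SUBJ", "SUBJ"): lambda w: "agt" if w.get("ANIMATE") else "obj",
}

def subcat_relations(verb, entry, lexicon):
    """Generate relations for the arguments governed by a verb's subcat frame."""
    frame = entry["_SUBCAT-FRAME"]
    rels = []
    for gf in ("SUBJ", "OBJ"):
        if gf in entry and (frame, gf) in SUBCAT_RULES:
            arg = entry[gf]
            rels.append((SUBCAT_RULES[(frame, gf)](lexicon.get(arg, {})), verb, arg))
    return rels

entry = {"_SUBCAT-FRAME": "V-SUBJ-OBJ", "SUBJ": "boy1", "OBJ": "apple5"}
lexicon = {"boy1": {"ANIMATE": True}, "apple5": {"ANIMATE": False}}
print(subcat_relations("eat2", entry, lexicon))
# [('agt', 'eat2', 'boy1'), ('obj', 'eat2', 'apple5')]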
93. Phase 3 (2): handling of adjuncts
- Input
- Word-entry collection
- List of transfer facts to be considered for adjunct handling
- Rules for relation generation based on transfer facts and word properties
- Output: relations and/or attributes
- Example: for our sentence, the man(eat, hastily) relation and the @def attributes for boy and apple are generated in this phase
94. Rule bases for adjunct handling, examples (1): mapping adjunct transfer facts to relations/attributes; some simplified rules
95. Rule bases for adjunct handling, examples (2): mapping adjuncts to relations/attributes based on prepositions; some example rules (a sketch follows below)
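Again the tables are only named on the slide; a sketch of plausible rules, with the preposition mappings drawn from the relation repository (slide 46) and the examples on slide 73 (the specific entries are assumptions):

# Adjunct rules keyed on how the adjunct is realized
ADV_RULES = {"vpadv": "man"}        # VP adverb -> manner, e.g. man(eat, hastily)
PREP_RULES = {"within": "tim",      # finish within a week
              "into": "gol",        # turn water into steam
              "with": "ins",        # eat with a spoon
              "in": "plc"}          # time vs place really needs word properties

def adjunct_relation(verb, adjunct, feats):
    """Pick a relation for an adjunct from its adverb type or its preposition."""
    if feats.get("ADV-TYPE") in ADV_RULES:
        return (ADV_RULES[feats["ADV-TYPE"]], verb, adjunct)
    if feats.get("PREP") in PREP_RULES:
        return (PREP_RULES[feats["PREP"]], verb, adjunct)
    return ("mod", verb, adjunct)   # fallback

print(adjunct_relation("eat2", "hastily6", {"ADV-TYPE": "vpadv"}))
# ('man', 'eat2', 'hastily6')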
96. Final UNL expression
- Sentence: The boy ate the apples hastily.
- UNL expression:
- {unl}
- agt(eat2.@entry.@past, boy1.@def)
- man(eat2.@entry.@past, hastily6)
- obj(eat2.@entry.@past, apple5.@pl.@def)
- {/unl}
97. Design of relation-generation rules: an example
- [Decision tree: the relation for the subject is chosen from the subject's animacy (ANIMATE vs INANIMATE) and the verb type (do, occur, be); the leaves are agt, aoj and obj, e.g. an ANIMATE subject of a do-verb is linked by agt.]
98. Summary of resources
- Mohanty and Bhattacharyya, LREC 2008
99. Lexical resources
- [Architecture diagram: functional elements with grammatical attributes (auxiliary verbs, determiners, tense-aspect morphemes), a lexical knowledge base with semantic attributes (N, V, A, Adv), and a verb knowledge base holding verb senses, a syntactic-argument database (PPs and clauses as syntactic arguments), semantic argument frames and the syntactic-to-semantic argument mapping; these feed SRS generation and UNL expression generation.]
100. Use of a number of lexical resources
- We have created these resources over a long period of time from
- Oxford Advanced Learner's Dictionary (OALD) (Hornby, 2001)
- VerbNet (Schuler, 2005)
- Princeton WordNet 2.1 (Miller, 2005)
- LCS database (Dorr, 1993)
- Penn Treebank (LDC, 1995), and
- XTAG lexicon (XTAG Research Group, 2001)
101. Verb Knowledge Base (VKB): structure
102. VKB statistics
- 4115 unique verbs
- 22000 rows (different senses)
- 189 verb groups
103. Verb categorization in UNL and its relationship to traditional verb categorization
- Traditional (syntactic) categories: transitive (has a direct object) vs intransitive
- UNL (semantic) categories, with transitive/intransitive examples:
- do (action): Ram pulls the rope / Ram goes home (ergative languages)
- be (state): Ram knows mathematics / Ram sleeps
- occur (event): Ram forgot mathematics / Earth cracks
- Unergative: syntactic subject = semantic agent; unaccusative: syntactic subject ≠ semantic agent
104. Accuracy on various phenomena and corpora
105. Applications
106. MT and IR
- Smriti Singh, Mrugank Dalal, Vishal Vachani, Pushpak Bhattacharyya and Om Damani, Hindi Generation from Interlingua, Machine Translation Summit (MTS 07), Copenhagen, September 2007.
- Sanjeet Khaitan, Kamaljeet Verma and Pushpak Bhattacharyya, Exploiting Semantic Proximity for Information Retrieval, IJCAI 2007 Workshop on Cross Lingual Information Access, Hyderabad, India, January 2007.
- Kamaljeet Verma and Pushpak Bhattacharyya, Context-Sensitive Semantic Smoothing using Semantically Relatable Sequences, submitted.
107. Conclusions and future work
- Presented two approaches to UNL generation
- Demonstrated the need for resources
- Working on handling difficult language phenomena
- WSD for choosing the correct universal word (UW)
108. URLs
- For resources: www.cfilt.iitb.ac.in
- For publications: www.cse.iitb.ac.in/pb