Title: Extracting Information from Participial Structures
1Extracting Information from Participial Structures
- Kata Gábor, Enikő Héja, Ágnes Mészáros
- Research Institute for Linguistics, HAS
- 8th INTEX WORKSHOP, 2005
2STRUCTURE
- IE system and its shortcoming: the problem of participles
- NPs and participles in Hungarian
- a possible enhancement of the IE system
- implementation in INTEX
3IE system
- input text (1-2 sentences of short business news)
- shallow syntactic analysis
- pre-defined semantic patterns (event frames)
- output: event frames with slots filled by the elements of the input text
- the event, its participants and circumstances are identified
4Event frames
- Az ABN Amro Bank egyesül a Kereskedelmi és Hitelbankkal.
- ABN Amro Bank merges with Commercial and Credit Bank.
- <event schema="owner_changed.fusion.6" roles_matched="3/3">
- <rv role="member_company_1" pos="N" case="NOM" sem="companyinstitute">
- <NP id="88" sem="company countable human institute">
- <w id="0" class="DET" at="1-1" lex="az" case="NOM">Az</w>
- <w id="2" class="UNKNOWN" at="2-2" lex="ABN">ABN</w>
- <w id="4" class="UNKNOWN" at="3-3" lex="Amro">Amro</w>
- <w id="6" class="N" at="4-4" lex="bank" case="NOM">Bank</w>
- </NP>
- </rv>
- <rv role="_1" pos="V" lemma="egyesül">
- <w id="8" class="V" at="5-5" lex="egyesül">egyesül</w>
- </rv>
- <rv role="member_company_2" pos="N" case="INS" sem="companyinstitute">
- <NP id="118" sem="company countable institute">
- <w id="13" class="DET" at="6-6" lex="a" case="NOM">a</w>
5Mapping syntax to event frames
- SYNTAX → EVENT FRAMES
- verb → main event
- arguments → participants
- free modifiers → circumstances (time, location, manner...)
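The correspondence between syntax and event frames can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the data model of the IE system: `Clause`, `EventFrame` and `to_event_frame` are hypothetical names, and the role labels are borrowed from the event-frame example shown earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Clause:
    """Result of a shallow syntactic analysis (hypothetical structure)."""
    verb: str
    arguments: dict       # role label -> phrase
    free_modifiers: dict  # kind (time, location, manner...) -> phrase


@dataclass
class EventFrame:
    """Event frame with its slots (hypothetical structure)."""
    main_event: str
    participants: dict = field(default_factory=dict)
    circumstances: dict = field(default_factory=dict)


def to_event_frame(clause: Clause) -> EventFrame:
    # verb -> main event, arguments -> participants,
    # free modifiers -> circumstances
    return EventFrame(
        main_event=clause.verb,
        participants=dict(clause.arguments),
        circumstances=dict(clause.free_modifiers),
    )


clause = Clause(
    verb="egyesül",
    arguments={
        "member_company_1": "Az ABN Amro Bank",
        "member_company_2": "a Kereskedelmi és Hitelbankkal",
    },
    free_modifiers={},
)
frame = to_event_frame(clause)
```

A participle inside an NP has no place in this mapping, since only the finite verb supplies the main event.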
6Mapping syntax to event frames
- Problem: secondary information (the cause or antecedent of the main event) is hidden in participial structures
- A befektetők által tegnap eladott részvények megnövelték a tőzsde forgalmát.
- The shares sold yesterday by the investors increased the turnover on the stock exchange.
7Mapping syntax to event frames
- A befektetők által tegnap eladott részvények megnövelték a tőzsde forgalmát.
- The shares sold yesterday by the investors increased the turnover on the stock exchange.
- részvények /shares/
- a befektetők /the investors/
- eladott /sold/
- tegnap /yesterday/
8Mapping syntax to event frames
- A befektetők által tegnap eladott részvények megnövelték a tőzsde forgalmát.
- The shares sold yesterday by the investors increased the turnover on the stock exchange.
- A befektetők tegnap eladtak részvényeket.
- The investors sold shares yesterday.
9A solution
- a preprocessing module within the IE system which transforms participial structures into sentences with a finite predicate
- semantic frame matching may operate on the transformed sentences
- 1st step: past participles within NPs
- the participle preserves the meaning of its base verb
- its arguments can be derived from the internal structure of the NP
10NPs in Hungarian 1.
11NPs in Hungarian 2.
- Participles (past, present)
- modifiers: ADV, NPcase, APcase, NPostp, V.INF, ...
12Participles in Hungarian
- ADJ / participle homonymy is a problem
- mérsékelt PC-chip kereslet
- modest /moderated/ demand for PC-chips
- Valaki mérsékelte a PC-chip keresletet.
- Somebody moderated the demand for PC-chips.
- ragozott szóalakok
- inflected word forms
- Valaki ragozott szóalakokat.
- Somebody inflected word forms.
- only participles can be transformed
13Participle or Adjective?
- syntactic tests
- comparative
- ADV formation
- predicative use
- impossibility of preverb detachment
- we need to decide in the context whether the given word form is an ADJ or a PART
- 1. If at least one of the base verb's complements is present, then it is a participle.
14Participle or Adjective?
- syntactic tests
- comparative
- ADV formation
- predicative use
- preverb detachment
- we need to decide in the context whether the given word form is an ADJ or a PART
- 2. If at least one of the base verb's complements or adjuncts, or a preverb, is present, then it is a participle.
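The contextual rule can be written as a small decision function. What follows is a sketch under assumptions, not the grammar used in INTEX: the function name, the relation labels and the list-of-pairs input are hypothetical, and falling back to ADJ when no evidence is found is an assumption, since the rule only states when a form counts as a participle.

```python
# Relations that count as evidence for the participle reading
# (the label names are hypothetical).
PARTICIPLE_EVIDENCE = {"complement", "adjunct", "preverb"}


def classify_word_form(dependents):
    """Return "PART" or "ADJ" for an ambiguous word form.

    dependents: list of (text, relation) pairs found next to the
    word form inside the NP.
    """
    for _text, relation in dependents:
        if relation in PARTICIPLE_EVIDENCE:
            return "PART"
    # Assumption: without any evidence the form is treated as an adjective.
    return "ADJ"


# "tegnap eladott részvények": the adjunct "tegnap" is present
sold = classify_word_form([("tegnap", "adjunct")])
# "fel nem újított házak": the preverb "fel" is present
restored = classify_word_form([("fel", "preverb")])
# "mérsékelt PC-chip kereslet": nothing belonging to the base verb
moderate = classify_word_form([])
```

Only the forms classified as PART would be passed on to the transformation step.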
15Participle or Adjective?
- TESTS
- comparative: mérsékeltebb kereslet
- more moderate demand
- predicative: Ez a szóalak ragozott.
- This word form is inflected.
- ADV formation: mérsékelt → mérsékelten
- moderate → moderately
- preverb detachment
- a fel nem újított házak
- the re- not stored houses (not restored)
- → Ezek a házak fel nem újítottak.
- These houses are re- not stored.
16THE GRAMMAR
- the correctness and informativeness of the resulting sentence depend on the correct identification of verbal arguments and modifiers within the NP
- then these elements are transformed according to their grammatical function
- past participles may be formed from both transitive and intransitive verbs
- if the base verb is intransitive, the head noun of the NP represents the subject of the base verb
- az összedőlt épület /the collapsed building/
- if the base verb is transitive, the head noun represents the direct object of the base verb
- a bejelentett változások /the changes announced/
- → transitivity needs to be coded
17THE GRAMMAR
- transformation rules are (enhanced) FSTs
- they store relevant elements of the input NP in variables
- the output is made up of the content of these variables, but in an altered order, plus the function words needed in the sentence
- our DELAF dictionary codes:
- transitivity properties of verbs (on the basis of a lexicon-grammar of verbal argument structures)
- preverb feature: shows whether the base verb has a preverb
18Transformation Graphs 1. Transitive Verbs
- transitive verbs without expressed subject (somebody insertion)
- Det (V_compl) VMIB N
- → Valaki V_vmib Det N t (V_compl) .
- transitive verbs with a subject with the PostP által
- Det Nsubj által (V_compl) VMIB N
- → Nsubj V_vmib Det N t (V_compl) .
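The two transitive patterns can be imitated outside INTEX with a short function. This is a sketch under assumptions rather than the actual transducer: `transform_transitive` and its parameters are hypothetical, `N t` is read as the head noun carrying the accusative suffix -t, and the finite verb form and the accusative noun form are supplied by the caller, because no morphological generation is attempted here.

```python
from typing import Optional


def transform_transitive(det: str, verb_finite: str, head_acc: str,
                         subj: Optional[str] = None,
                         compl: Optional[str] = None) -> str:
    """Build a finite sentence from the parts of a participial NP.

    Output order follows the pattern: subject, finite verb, determiner,
    accusative head noun, optional verbal complement.
    """
    # Without an által-phrase the generic subject "valaki" (somebody) is inserted.
    subject = subj if subj is not None else "valaki"
    parts = [subject, verb_finite, det, head_acc]
    if compl:
        parts.append(compl)
    sentence = " ".join(parts) + "."
    # Capitalise only the first character, so that "PC-chip" keeps its case.
    return sentence[0].upper() + sentence[1:]


no_subject = transform_transitive("a", "mérsékelte", "PC-chip keresletet")
with_subject = transform_transitive("a", "eladták", "részvényeket",
                                    subj="befektetők", compl="tegnap")
```

The first call reproduces the sentence "Valaki mérsékelte a PC-chip keresletet." from the homonymy slide. The second follows the word order of the pattern, which differs from the hand-written paraphrase shown on the earlier slide.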
19Transformation Graphs 2. Intransitive Verbs
- head N becomes subject (patient)
- Det (V_compl) VMIB N
- → Det N V_vmib (V_compl) .
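The intransitive pattern can be illustrated on a list of tagged tokens, with the matched elements kept in variables and emitted in a new order. This is a sketch under assumptions, not the INTEX graph itself: the function name and the (word, tag) input are hypothetical, and the finite verb form is passed in by the caller instead of being generated.

```python
def transform_intransitive(tagged, finite_form):
    """Rewrite Det (V_compl) VMIB N as Det N V (V_compl).

    tagged: list of (word, tag) pairs covering one NP.
    Returns None when the NP does not match the pattern.
    """
    if len(tagged) < 3:
        return None
    tags = [tag for _word, tag in tagged]
    if tags[0] != "Det" or tags[-1] != "N" or "VMIB" not in tags:
        return None
    participle_index = tags.index("VMIB")
    # The participle must directly precede the head noun.
    if participle_index != len(tagged) - 2:
        return None
    # Store the relevant elements in variables.
    det = tagged[0][0]   # copied unchanged (no a/az adjustment)
    head = tagged[-1][0]
    compl = [word for word, _tag in tagged[1:participle_index]]
    # Emit them in the altered order.
    sentence = " ".join([det, head, finite_form] + compl) + "."
    return sentence[0].upper() + sentence[1:]


np_tokens = [("az", "Det"), ("összedőlt", "VMIB"), ("épület", "N")]
result = transform_intransitive(np_tokens, "összedőlt")
```

For the NP az összedőlt épület the head noun becomes the subject, giving "Az épület összedőlt." (The building collapsed.)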
20Structure of the graphs
- 1 graph
- 3 subgraphs according to complement types: possessor / verbal complement or adjunct / nothing
- each subgraph divided into two paths
- transitive / intransitive verbs
23Evaluation
- central aspect: to what extent does it augment the efficiency of the IE system?
- lack of information (recall) is considered less important than incorrect information (precision)
- evaluated on the 231,000-word corpus of short business news
- 1259 hits → 898 qualified as informative
- precision: 64%
- further task: recall
- (requires a corpus with manually annotated participial structures)
24THANK YOU FOR YOUR ATTENTION!
gkata, eheja, magnes_at_corpus.nytud.hu