Machine Learning Approach to Automatic Functor Assignment in Prague Dependency Treebank

1
Machine Learning Approach to Automatic Functor
Assignment in Prague Dependency Treebank
  • Sašo Džeroski
  • Institut Jožef Stefan, Ljubljana
  • Department of Knowledge Technologies
  • Joint work with Zdeněk Žabokrtský, Petr Sgall
  • Charles University, Prague
  • Institute of Formal and Applied Linguistics

2
Outline
  • Materials
  • The Prague Dependency Treebank
  • Analytical and Tectogrammatical Tree Structures
  • Training and Testing Sets / Representation
  • Methods
  • Data flow
  • Machine Learning-based
  • Rule-based
  • Dictionary-based
  • Results
  • Conclusions and further work

3
Prague Dependency Treebank (PDT)
  • Long-term project aimed at a complex annotation
    of a part of the Czech National Corpus
    with a rich annotation scheme
  • Institute of Formal and Applied Linguistics
  • Established in 1990 at the Faculty of Mathematics
    and Physics, Charles University, Prague
  • Jan Hajič, Eva Hajičová, Jarmila Panevová, Petr
    Sgall
  • http://ufal.mff.cuni.cz

4
Prague Dependency Treebank
  • Inspiration
  • The Penn Treebank (the most widely used
    syntactically annotated corpus of English)
  • Motivation
  • The treebank can be used for further linguistic
    research
  • More accurate results can be obtained (on a
    number of tasks) when using annotated corpora
    than when using raw texts
  • PDT reaches representations suitable as input for
    semantic interpretation, unlike most other
    annotations

5
Layered structure of PDT
  • Morphological level
  • Full morphological tagging (word forms, lemmas,
    morphological tags)
  • Analytical level
  • Surface syntax
  • Syntactic annotation using dependency syntax
    (captures analytical functions such as subject,
    object,...)
  • Tectogrammatical level
  • Level of linguistic meaning (tectogrammatical
    functions such as actor, patient,...)

Annotation pipeline: raw text -> morphologically tagged text -> analytical tree structures (ATS) -> tectogrammatical tree structures (TGTS)
6
The Analytical Level
  • The dependency structure chosen to represent the
    syntactic relations within the sentence
  • Output of the analytical level: analytical tree
    structure (ATS)
  • Oriented, acyclic graph with one entry node
  • Every word form and punctuation mark is a node
  • The nodes are annotated by attribute-value pairs
  • New attribute: analytical function
  • Determines the relation between the dependent
    node and its governing node
  • Values: Sb, Obj, Adv, Atr, ...

7
The Tectogrammatical Level
  • Based on the framework of the Functional
    Generative Description as developed by Petr Sgall
  • In comparison to the ATSs, the tectogrammatical
    tree structures (TGTSs) have the following
    characteristics
  • Only autosemantic words have their own node;
    function words (conjunctions, prepositions) are
    attached as indices to the autosemantic words to
    which they belong
  • Nodes are added in case of clearly specified
    deletions on the surface level
  • Analytical functions are substituted by
    tectogrammatical functions (functors), such as
    Actor, Patient, Addressee,...

8
Functors
  • Tectogrammatical counterparts of analytical
    functions
  • About 60 functors
  • Arguments (or theta roles) and adjuncts
  • Actants (Actor, Patient, Addressee, Origin,
    Effect)
  • Free modifiers (LOC, RSTR, TWHEN, THL,...)
  • Provide more detailed information about the
    relation to the governing node than the
    analytical function

9
AN EXAMPLE ATS: Michálková upozornila, že zatím
je zbytečné podávat na správu žádosti či
žádat ji o podrobnější informace.
Literal translation: Michalkova pointed-out that
meanwhile is superfluous to-submit to
administration requests or to-ask it for
more-detailed information.
10
AN EXAMPLE TGTS FOR THE SENTENCE: M. pointed out
that for the time being it was superfluous to
submit requests to the administration, or to ask
it for more detailed information.
Literal translation: Michalkova pointed-out
that meanwhile is superfluous to-submit to
administration requests or to-ask it for
more-detailed information.
11
AN EXAMPLE TGTS FOR THE SENTENCE: The valuable
and fascinating cultural event documents that
the long-term high-quality strategy of the
Painted House exhibitions, established by L. K.,
attracts further activities in the domains of
art and culture.
12
Some TG Functors
  • ACMP (accompaniment): mothers with children
  • ACT (actor): Peter read a letter.
  • ADDR (addressee): Peter gave Mary a book.
  • ADVS (adversative): He came there, but didn't
    stay long.
  • AIM (aim): He came there to look for Jane.
  • APP (appurtenance, i.e., possession in a broader
    sense): John's desk
  • APPS (apposition): Charles the Fourth, (i.e.) the
    Emperor
  • ATT (attitude): They were here willingly.
  • BEN (benefactive): She made this for her
    children.
  • CAUS (cause): She did so since they wanted it.
  • COMPL (complement): They painted the wall blue.
  • COND (condition): If they come here, we'll be
    glad.
  • CONJ (conjunction): Jim and Jack
  • CPR (comparison): taller than Jack
  • CRIT (criterion): According to Jim, it was raining
    there.

13
Some more TG Functors
  • ID (entity): the river Thames
  • LOC (locative): in Italy
  • MANN (manner): They did it quickly.
  • MAT (material): a bottle of milk
  • MEANS (means): He wrote it by hand.
  • MOD (mod): He certainly has done it.
  • PAR (parenthesis): He has, as we know, done it
    yesterday.
  • PAT (patient): I saw him.
  • PHR (phraseme): in no way, grammar school
  • PREC (preceding, particle referring to context):
    therefore, however
  • PRED (predicate): I saw him.
  • REG (regard): with regard to George
  • RHEM (rhematizer, focus-sensitive particle):
    only, even, also
  • RSTR (restrictive adjunct): a rich family
  • THL (temporal-how-long): We were there for three
    weeks.
  • THO (temporal-how-often): We were there very
    often.
  • TWHEN (temporal-when): We were there at noon.

14
Automatic Functor Assignment
  • Motivation: Annotation is currently done by humans
    and consumes huge amounts of linguistic experts'
    time
  • Overall goal: Given an ATS, generate a TGTS
  • Specific task: Given a node in an ATS,
    assign a tectogrammatical functor
  • Approach: Use sentences with existing manually
    derived ATSs and TGTSs to learn how to assign
    tectogrammatical functors
  • More specifically, use machine learning to learn
    rules for assigning tectogrammatical functors

15
What context of a node to take into account for
AFA purposes?
a) only node U
b) whole tree
c) node U and its parent
d) node U and its siblings
16
The attributes
  • Lexical attributes: lemmas of both the governing (G)
    and dependent (D) nodes, and the lemma of a
    preposition / subordinating conjunction that
    binds both nodes
  • Morphological attributes: POS, subPOS,
    morphological voice, morphological case
  • Analytical attributes: the analytical functions of
    G/D
  • Topological attributes: number of children
    (directly depending nodes) of both nodes in the
    TGTS
  • Ontological attributes: semantic position of the
    node lemma within the EuroWordNet Top Ontology

17
Take 1 (2000): The attributes and the class
Given:
  • Governing node
  • Word form
  • Lemma
  • Full morphological tag
  • Part of speech (POS) (extracted from above)
  • Analytical function from ATS
  • Dependent node
  • Word form
  • Lemma
  • Full morphological tag
  • POS and case (extracted from above)
  • Analytical function
  • Conj. or preposition between G and D node

Predict: Functor of the dependent node
18
Training examples
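  • Field order (as in the attribute list above): G word
    form, G lemma, G tag, G POS, G analytical function,
    D word form, D lemma, D tag, D POS, D case,
    conj./prep. (if any), D analytical function, functor
  • zastavme zastavit1 vmp1a v pred
    okamz_ik okamz_ik nis4a n 4 na adv tfhl
  • zastavme zastavit1 vmp1a v pred
    ustanoveni_ ustanoveni_ nns2a n 2 u adv loc
  • normy norma nfs2a n atr
    nove_ novy_ afs21a a 0 atr rstr
  • normy norma nfs2a n atr
    pra_vni_ pra_vni_ afs21a a 0 atr rstr
  • ustanoveni_ ustanoveni_ nns2a n adv
    normy norma nfs2a n 2 atr pat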
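Read as attribute vectors, the training examples above can be loaded with a few lines of Python. This is only a sketch: it assumes whitespace-separated fields in the order of the "Given" list on the previous slide, and it assumes the conjunction/preposition field is simply absent when no function word links the two nodes. The names `Example` and `parse_example` are illustrative and do not come from the original work.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Example:
    # Governing (G) node attributes
    g_form: str
    g_lemma: str
    g_tag: str
    g_pos: str
    g_afun: str
    # Dependent (D) node attributes
    d_form: str
    d_lemma: str
    d_tag: str
    d_pos: str
    d_case: str
    conj_prep: Optional[str]  # None when no function word is present (assumed)
    d_afun: str
    functor: str  # the class to be predicted


def parse_example(line: str) -> Example:
    fields = line.split()
    if len(fields) == 12:
        # Assumption: a 12-field row has no conjunction/preposition.
        fields.insert(10, None)
    if len(fields) != 13:
        raise ValueError(f"expected 12 or 13 fields, got {len(fields)}")
    return Example(*fields)


with_prep = parse_example(
    "zastavme zastavit1 vmp1a v pred okamz_ik okamz_ik nis4a n 4 na adv tfhl"
)
without_prep = parse_example(
    "normy norma nfs2a n atr nove_ novy_ afs21a a 0 atr rstr"
)
```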

19
Take 1 (2000): The methods used
  • Machine learning: induction of decision trees
  • Hand-crafted rules
  • Dictionaries of unambiguous assignments

20
Machine Learning - Decision Trees
  • Decision trees learned using C4.5
  • Only leaves with accuracy over 80% kept
  • Semiautomatic transformation into Perl:
      if ($dep_afun eq "atr") {
          if ($conj_prep eq "o") { $functor = "pat"; }
          if ($conj_prep eq "v") { $functor = "loc"; }
          if ($conj_prep eq "z") { $functor = "dir1"; }
          if ($conj_prep eq "null") {
              if ($dep_case eq "0") {
                  if ($dep_morph eq "a") { $functor = "rstr"; }
              }
          }
      }

dep_afun = atr:
|   conj_prep = aby: aim (4.0/2.2)
|   conj_prep = bez: acmp (2.0/1.0)
|   conj_prep = do: dir3 (11.0/3.6)
|   conj_prep = o: pat (25.0/4.9)
|   conj_prep = v: loc (35.0/6.0)
|   conj_prep = z: dir1 (35.0/3.8)
|   conj_prep = null:
|   |   dep_case = 0:
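The 80% cut-off can be checked against the leaf statistics shown here. The sketch below assumes that each pair in parentheses reads as (cases reaching the leaf / estimated errors), so that the expected accuracy of a leaf is 1 - E/N; that reading, and the helper name `kept_leaves`, are assumptions rather than something stated on the slide. Under this reading only the leaves for "o", "v" and "z" survive, which agrees with the branches that appear in the Perl fragment.

```python
import re

# Leaf statistics copied from the slide, written one leaf per line.
LEAVES = """\
conj_prep = aby: aim (4.0/2.2)
conj_prep = bez: acmp (2.0/1.0)
conj_prep = do: dir3 (11.0/3.6)
conj_prep = o: pat (25.0/4.9)
conj_prep = v: loc (35.0/6.0)
conj_prep = z: dir1 (35.0/3.8)
"""

LEAF_RE = re.compile(r"conj_prep = (\w+): (\w+) \(([\d.]+)/([\d.]+)\)")


def kept_leaves(text: str, threshold: float = 0.80) -> dict:
    """Return {preposition: functor} for leaves above the accuracy threshold."""
    kept = {}
    for prep, functor, n, e in LEAF_RE.findall(text):
        accuracy = 1.0 - float(e) / float(n)  # assumed reading of (N/E)
        if accuracy > threshold:
            kept[prep] = functor
    return kept


rules = kept_leaves(LEAVES)
```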

21
Hand-crafted rules
  • Verbs_active: if the governing node is a verb
  • If the analytical function is subject, then ACT
  • Object in dative, then ADDR
  • Object in accusative, then PAT
  • Similar rules for verbs_passive, adjectives,
    pronounsposs, numerals, pnom, pred
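The Verbs_active rules above might look as follows in code. This is a minimal sketch under stated assumptions: attribute values are lower-cased as in the training examples ("v", "sb", "obj"), case is the digit from the morphological tag (3 for dative, 4 for accusative, following the usual numbering of Czech cases), and the active-voice check is passed in as a flag because the slides do not say how it is derived. The function name is hypothetical.

```python
from typing import Optional


def verbs_active_rule(
    gov_pos: str, gov_is_active: bool, dep_afun: str, dep_case: str
) -> Optional[str]:
    """Return a functor if one of the active-verb rules fires, else None."""
    if gov_pos != "v" or not gov_is_active:
        return None  # rule set does not apply
    if dep_afun == "sb":
        return "act"   # subject -> actor
    if dep_afun == "obj" and dep_case == "3":
        return "addr"  # object in dative -> addressee
    if dep_afun == "obj" and dep_case == "4":
        return "pat"   # object in accusative -> patient
    return None        # leave the node unassigned
```

Returning `None` instead of guessing keeps precision high at the price of cover, which is the trade-off the Take 1 evaluation reports.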

22
Dictionaries generated from data
  • Adverbs: adverb-functor pairs extracted from
    the training set; unambiguous adverbs
    saved in a dictionary
  • Prepnoun: all preposition-noun pairs extracted;
    unambiguous pairs that occur at least twice saved
    in a dictionary
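Both dictionaries follow the same recipe: count which functors each key occurs with, then keep the keys seen with exactly one functor (and, for Prepnoun, seen at least twice). A sketch, assuming the training set is available as (key, functor) pairs; the function name and the toy data are made up for illustration.

```python
from collections import Counter, defaultdict


def build_dictionary(pairs, min_count=1):
    """Keep keys that always occur with one functor, at least min_count times."""
    counts = defaultdict(Counter)
    for key, functor in pairs:
        counts[key][functor] += 1
    return {
        key: next(iter(functors))
        for key, functors in counts.items()
        if len(functors) == 1 and sum(functors.values()) >= min_count
    }


# Toy data (hypothetical, not taken from the treebank).
adverb_pairs = [("rychle", "mann"), ("tam", "loc"), ("tam", "dir3")]
prepnoun_pairs = [
    (("v", "praha"), "loc"),
    (("v", "praha"), "loc"),
    (("na", "jaro"), "twhen"),  # occurs only once, so it is dropped
]

adverbs = build_dictionary(adverb_pairs)                 # ambiguous "tam" dropped
prepnoun = build_dictionary(prepnoun_pairs, min_count=2)
```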

23
AFA Evaluation (Take 1)
  • Divide existing sentences into a training set (6049
    nodes) and a testing set (1089 nodes) to be able to
    evaluate performance
  • 1) Only ML
  • a) without pruning:
    Cover = 100%, Precision = Recall = 76%
  • b) ML80 (after pruning of the rules with expected
    precision worse than 80%):
    Cover = 37.3%, Recall = 35.3%, Precision = 94.5%
  • 2) Only hand-crafted rules:
    Cover = 51.2%, Recall = 48.1%, Precision = 93.9%
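The slides do not define the three measures, but the figures are consistent with cover = assigned / all nodes, precision = correct / assigned, and recall = correct / all nodes, so that recall = cover x precision (for ML80, 0.373 x 0.945 is about 0.353; for the hand-crafted rules, 0.512 x 0.939 is about 0.481). The sketch below computes them under that inferred reading; `evaluate` is a hypothetical helper, and `None` marks a node left unassigned.

```python
def evaluate(gold, predicted):
    """Return (cover, precision, recall) for a list of predictions."""
    total = len(gold)
    assigned = sum(p is not None for p in predicted)
    correct = sum(p == g for g, p in zip(gold, predicted) if p is not None)
    cover = assigned / total
    precision = correct / assigned if assigned else 0.0
    recall = correct / total
    return cover, precision, recall


# Toy example: four nodes, one left unassigned, one assigned wrongly.
gold = ["act", "pat", "loc", "rstr"]
predicted = ["act", None, "loc", "pat"]
cover, precision, recall = evaluate(gold, predicted)
```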

24
AFA Evaluation (Take 1)
  • 3) ML80 + hand-crafted rules + dictionaries
    (adverbs + prepnoun):
    Cover = 62.8%, Recall = 58.7%, Precision = 93.5%
  • When trying to assign everything, with the
    available training set it is probably not
    possible to reach AFA accuracy of 90% (rather 75%
    to 80%)
  • ... but using a subset of the available methods,
    it is possible to reach sufficient precision at
    about 60% cover
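One simple way to combine several partial assigners is a cascade: try each method in turn and keep the first answer. The slides do not say how the three methods were actually combined or in what order, so the sketch below is purely illustrative; the toy assigners and the dictionary-based node representation are assumptions.

```python
def combine(assigners, node):
    """Return the first non-None functor proposed by any assigner."""
    for assign in assigners:
        functor = assign(node)
        if functor is not None:
            return functor
    return None  # no method fired; the node stays unassigned


# Toy assigners standing in for ML80, hand-crafted rules and dictionaries.
def ml80(node):
    return "loc" if node.get("conj_prep") == "v" else None


def rules(node):
    return "act" if node.get("afun") == "sb" else None


def dictionary(node):
    return {"rychle": "mann"}.get(node.get("lemma"))


cascade = [ml80, rules, dictionary]
```

Each added method can only raise cover, since it is consulted only for nodes the earlier ones left unassigned.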

25
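One automatically annotated TGTS (after Take 1)
  • Proto je dobré seznámit se s jejich praktikami a
    tak vlastně preventivně předcházet možným metodám
    konkurenčních firem.
  • English: It is therefore good to get acquainted
    with their practices and thus, in effect, to
    forestall possible methods of competing firms.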

26
Take 2 (2002)
  • Lesson from Take 1: Annotators want high recall,
    even at the cost of lower precision
  • Consequence: Use machine learning only
  • More training data/annotated sentences (1536
    sentences, 27463 nodes in total)
  • Use a larger set of attributes
  • Topological (number of children of G/D nodes)
  • Ontological (WordNet)
  • Newer version of the ML software (C5.0)

27
Ontological attributes
  • Semantic concepts (63) of the Top Ontology in EWN
    (e.g., Place, Time, Human, Group, Living, ...)
  • For each English synset, a subset of these is
    linked
  • Inter-Lingual Index: Czech lemma -> English
    synset -> subset of semantic concepts
  • 63 binary attributes: positive/negative relation
    of the Czech lemma to the respective concept of
    the EWN Top Ontology
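The lemma-to-concept chain above amounts to two lookups followed by a binary encoding. A minimal sketch: only five of the 63 concepts are listed (the ones named on the slide), and both mapping tables are tiny hypothetical stand-ins for the Inter-Lingual Index and the Top Ontology links, not real EuroWordNet data.

```python
# Five of the 63 Top Ontology concepts, as named on the slide.
CONCEPTS = ["Place", "Time", "Human", "Group", "Living"]

# Hypothetical toy tables standing in for the real resources.
LEMMA_TO_SYNSET = {"mesto": "city", "ucitel": "teacher"}
SYNSET_TO_CONCEPTS = {"city": {"Place"}, "teacher": {"Human", "Living"}}


def ontological_attributes(lemma):
    """Map a Czech lemma to one 0/1 attribute per concept."""
    synset = LEMMA_TO_SYNSET.get(lemma)
    linked = SYNSET_TO_CONCEPTS.get(synset, set())
    return [int(concept in linked) for concept in CONCEPTS]


teacher_vector = ontological_attributes("ucitel")
unknown_vector = ontological_attributes("xyz")  # lemma without a link
```

A lemma with no link simply gets all-negative attributes here; how the original system treated unlinked lemmas is not stated.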

28
Methodology
29
Methodology
  • Evaluation of accuracy by 10-fold
    cross-validation
  • Rules to illustrate the learned concepts
  • Trees translated to Perl code included in TrEd,
    a tool that annotators use
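For readers unfamiliar with 10-fold cross-validation, the procedure can be sketched in plain Python. The learner here is a majority-class baseline standing in for C5.0, which is not reproduced; the interleaved fold assignment, the function names and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def train_majority(training):
    """Baseline learner: always predict the most frequent training label."""
    majority = Counter(label for _, label in training).most_common(1)[0][0]
    return lambda features: majority


def cross_validate(examples, train, k=10):
    """Mean accuracy over k folds; fold i holds examples i, i+k, i+2k, ..."""
    accuracies = []
    for i in range(k):
        test = examples[i::k]
        training = [ex for j, ex in enumerate(examples) if j % k != i]
        classify = train(training)
        correct = sum(classify(x) == y for x, y in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k


# Toy data: 100 examples, 70% labelled "rstr" and 30% labelled "act".
data = [((i,), "rstr" if i % 10 < 7 else "act") for i in range(100)]
mean_accuracy = cross_validate(data, train_majority)
```

Every example is used for testing exactly once, so the mean is an estimate of accuracy on unseen nodes rather than on the training data.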

30
Different sets of attributes
  • E0 (empty)
  • E1: only POS; E2: only analytical function
  • E3: all morphological attributes + E2
  • E4: E3 + attributes of governing node
  • E5: E4 + function words (preps./conjs.)
  • E6: E5 + lemmas; E7: E5 + EWN
  • E8: E6 + E7
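The nesting of the attribute sets is easiest to see as set unions. The sketch assumes each later experiment adds attributes to an earlier one (reading the slide as "E4 = E3 plus ...", and so on); the attribute names are hypothetical placeholders, with E1 to E3 taken to describe the dependent node.

```python
E0 = frozenset()
E1 = frozenset({"pos"})
E2 = frozenset({"afun"})
E3 = E2 | {"pos", "subpos", "voice", "case"}       # morphology + E2
E4 = E3 | {"gov_" + name for name in E3}           # + governing node
E5 = E4 | {"conj_prep"}                            # + function words
E6 = E5 | {"gov_lemma", "dep_lemma"}               # + lemmas
E7 = E5 | {"ewn_%d" % i for i in range(63)}        # + 63 EWN concepts
E8 = E6 | E7                                       # everything
```

E6 and E7 branch from the same base E5, so comparing them isolates the effect of lemmas versus EuroWordNet concepts.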

31
AFA performance
32
Example rules (1)
33
Example rules (2)
34
Example rules (3)
35
Example rules (4)
36
Example rules (5)
37
Example rules (6)
38
Example rules ()
39
Example rules (E8)
40
Learning curve (for E8)
41
Using the learned AFA trees
  • PDT Annotators use TrEd editor
  • Learned trees transformed into Perl
  • A keyboard shortcut defined in TrEd which
    executes the decision tree for each node of the
    TGTS and assigns functors
  • Color coding of functors based on confidence
  • Black: over 90%
  • Red: less than 60%
  • Blue: otherwise
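The color scheme is a three-way threshold on the confidence of the leaf that assigned the functor. A sketch, assuming confidence is given as a fraction between 0 and 1; the function name is illustrative, and the actual TrEd code is Perl and is not reproduced here.

```python
def confidence_color(confidence: float) -> str:
    """Map the confidence of an assigned functor to a display color."""
    if confidence > 0.90:
        return "black"  # over 90%
    if confidence < 0.60:
        return "red"    # less than 60%
    return "blue"       # otherwise, including exactly 60% and 90%
```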

42
Using the learned AFA trees in TrEd
43
Annotators' response
  • Six annotators
  • All agree: the use of AFA significantly increases
    the speed of annotation (twice as long without
    it)
  • All annotators prefer to have as many assigned
    functors as possible
  • They do not use the colors (even though red nodes
    are corrected in 75% of cases on unseen data)
  • Found some systematic errors made by AFA;
    suggested the use of topological attributes

44
Conclusions
  • ML very helpful for annotating PDT, even though
    PDT is very close to the semantics of natural
    language
  • Faster
  • Very accurate
  • Automatically assigned functors corrected in 20%
    of the cases
  • Human annotators disagree in more than 10% of the
    cases
  • Very close to what is possible to achieve through
    learning

45
Further work
  • Slovene Dependency Treebank
  • Morphological analysis (done)
  • Part-Of-Speech tagging (done)
  • Parsing/grammar (only a rough draft)
  • Annotation of sentences
    from Orwell's 1984 (in progress)