Title: Results


1
Results (see proceedings for the full table)
Ar Ch Cz Da Du Ge Ja Po Sl Sp Sw Tu Tot SD Bu
McD 66.9 85.9 80.2 84.8 79.2 87.3 90.7 86.8 73.4 82.3 82.6 63.2 80.3 8.4 87.6
Niv 66.7 86.9 78.4 84.8 78.6 85.8 91.7 87.6 70.3 81.3 84.6 65.7 80.2 8.5 87.4
ON 66.7 86.7 76.6 82.8 77.5 85.4 90.6 84.7 71.1 79.8 81.8 57.5 78.4 9.4 85.2
Rie 66.7 90.0 67.4 83.6 78.6 86.2 90.5 84.4 71.2 77.4 80.7 58.6 77.9 10.1 0.0
Sag 62.7 84.7 75.2 81.6 76.6 84.9 90.4 86.0 69.1 77.7 82.0 63.2 77.8 9.0 0.0
Che 65.2 84.3 76.2 81.7 71.8 84.1 89.9 85.1 71.4 80.5 81.1 61.2 77.7 8.7 86.3
Cor 63.5 79.9 74.5 81.7 71.4 83.5 90.0 84.6 72.4 80.4 79.7 61.7 76.9 8.5 83.4

Av 59.9 78.3 67.2 78.3 70.7 78.6 85.9 80.6 65.2 73.5 76.4 56.0 80.0
SD 6.5 8.8 8.9 5.5 6.7 7.5 7.1 5.8 6.8 8.4 6.5 7.7 6.3
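The per-language Av/SD rows above summarize all participating systems. As a sketch of how such figures are computed, the snippet below recomputes mean and standard deviation over just the seven systems shown in this table (Turkish column), so its numbers will not reproduce the full-table row:

```python
from statistics import mean, stdev

# LAS (%) on Turkish for the systems listed in the table above.
# The slide's Av/SD row covers all participating systems, so this
# subset will not reproduce those exact numbers.
turkish_las = {
    "McD": 63.2, "Niv": 65.7, "ON": 57.5, "Rie": 58.6,
    "Sag": 63.2, "Che": 61.2, "Cor": 61.7,
}
scores = list(turkish_las.values())
print(f"mean={mean(scores):.1f} sd={stdev(scores):.1f}")
```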
2
Results (continued)
  • Good parsers are good on all languages
  • Best overall scores achieved by two very
    different approaches
  • Little difference in ranking (mostly just ±1)
    when using the UAS or label accuracy metric
    instead of LAS
  • Exceptions: two groups with special emphasis on
    DEPREL values gain 2-3 ranks for label accuracy;
    one group with a bug in HEAD assignment drops
    4 ranks for UAS
  • Very little difference in scores as well as
    rankings when scoring all tokens (i.e. including
    punctuation)
  • But some outliers in ranking for individual
    languages
  • Turkish: Johansson and Nugues +7, Yuret +7,
    Riedel et al. -5
  • Dutch: Canisius et al. +6, Schiehlen and Spranger
    +8
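The three metrics compared above can be sketched as follows (toy gold and predicted annotations, invented for illustration, not shared-task data):

```python
# LAS  = % of tokens with both HEAD and DEPREL correct
# UAS  = % of tokens with HEAD correct
# LAcc = % of tokens with DEPREL correct
# Toy gold/predicted annotations, invented for illustration.
gold = [(2, "SBJ"), (0, "ROOT"), (2, "OBJ")]   # (HEAD, DEPREL) per token
pred = [(2, "SBJ"), (0, "ROOT"), (1, "OBJ")]   # last HEAD is wrong

n = len(gold)
las = sum(g == p for g, p in zip(gold, pred)) / n
uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n
lacc = sum(g[1] == p[1] for g, p in zip(gold, pred)) / n
print(las, uas, lacc)  # 2/3, 2/3, 1.0
```

A single wrong HEAD lowers LAS and UAS but leaves label accuracy untouched, which is why a bug in HEAD assignment shifts UAS rankings while DEPREL-focused systems shift label-accuracy rankings.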

3
Analysis: easy data sets
Ar Ch Cz Da Du Ge Ja Po Sl Sp Sw Tu Bu
Top score 66.9 90.0 80.2 84.8 79.2 87.3 91.7 87.6 73.4 82.3 84.6 65.7 87.6
Av. score 59.9 78.3 67.2 78.3 70.7 78.6 85.9 80.6 65.2 73.5 76.4 56.0 80.0
Tokens (k) 54 337 1249 94 195 700 151 207 29 89 191 58 190
Tok./tree 37.2 5.9 17.2 18.2 14.6 17.8 8.9 22.8 18.7 27.0 17.3 11.5 14.8
DEP./CPOS 1.9 6.3 6.5 5.2 2 .88 .35 3.7 2.3 1.4 1.5 1.8 1.6
DEP./POS 1.4 .28 1.2 2.2 .09 .88 .09 2.6 .89 .55 1.5 .83 .34
H. prec. 82.9 24.8 50.9 75.0 46.5 50.9 8.9 60.3 47.2 60.8 52.8 6.2 62.9
H. foll. 11.6 58.2 42.4 18.6 44.6 42.7 72.5 34.6 46.9 35.1 40.7 80.4 29.2
np trees 11.2 0.0 23.2 15.6 36.4 27.8 5.3 18.9 22.2 1.7 9.8 11.6 5.4
new FOR. 17.3 9.3 5.2 18.1 20.7 6.5 0.96 11.6 22.0 14.7 18.0 41.4 14.5
new LEM. 4.3 n/a 1.8 n/a 15.9 n/a n/a 7.8 9.9 9.7 n/a 13.2 n/a
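Statistics like "Tok./tree" and "DEP./CPOS" in the table above can be derived directly from the tab-separated CoNLL-X data format (one 10-column line per token, blank line between sentences). A minimal sketch on two invented toy sentences:

```python
# Sketch of deriving table statistics like "Tok./tree" and "DEP./CPOS"
# from CoNLL-X data (10 tab-separated columns per token, blank line
# between sentences). The two toy sentences below are invented.
conllx = (
    "1\tJohn\tjohn\tN\tNNP\t_\t2\tSBJ\t_\t_\n"
    "2\tsleeps\tsleep\tV\tVBZ\t_\t0\tROOT\t_\t_\n"
    "\n"
    "1\tMary\tmary\tN\tNNP\t_\t2\tSBJ\t_\t_\n"
    "2\tleft\tleave\tV\tVBD\t_\t0\tROOT\t_\t_\n"
)
sentences = [s.splitlines() for s in conllx.strip().split("\n\n")]
tokens = [line.split("\t") for sent in sentences for line in sent]
deprel_types = {tok[7] for tok in tokens}    # column 8: DEPREL
cpostag_types = {tok[3] for tok in tokens}   # column 4: CPOSTAG
tok_per_tree = len(tokens) / len(sentences)
dep_per_cpos = len(deprel_types) / len(cpostag_types)
print(tok_per_tree, dep_per_cpos)  # 2.0 1.0
```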
4
Analysis: difficult data sets
Ar Ch Cz Da Du Ge Ja Po Sl Sp Sw Tu Bu
Top score 66.9 90.0 80.2 84.8 79.2 87.3 91.7 87.6 73.4 82.3 84.6 65.7 87.6
Av. score 59.9 78.3 67.2 78.3 70.7 78.6 85.9 80.6 65.2 73.5 76.4 56.0 80.0
Tokens(k) 54 337 1249 94 195 700 151 207 29 89 191 58 190
Tok./tree 37.2 5.9 17.2 18.2 14.6 17.8 8.9 22.8 18.7 27.0 17.3 11.5 14.8
DEP./CPOS 1.9 6.3 6.5 5.2 2 .88 .35 3.7 2.3 1.4 1.5 1.8 1.6
DEP./POS 1.4 .28 1.2 2.2 .09 .88 .09 2.6 .89 .55 1.5 .83 .34
H. prec. 82.9 24.8 50.9 75.0 46.5 50.9 8.9 60.3 47.2 60.8 52.8 6.2 62.9
H. foll. 11.6 58.2 42.4 18.6 44.6 42.7 72.5 34.6 46.9 35.1 40.7 80.4 29.2
np trees 11.2 0.0 23.2 15.6 36.4 27.8 5.3 18.9 22.2 1.7 9.8 11.6 5.4
new FOR. 17.3 9.3 5.2 18.1 20.7 6.5 0.96 11.6 22.0 14.7 18.0 41.4 14.5
new LEM. 4.3 n/a 1.8 n/a 15.9 n/a n/a 7.8 9.9 9.7 n/a 13.2 n/a
5
The future: using the resources (parsed test
sets)
  • Parser combination
  • Check test cases where the parser majority
    disagrees with the treebank
  • Possible reasons:
  • Challenge for current parser technology
  • Treebank annotation wrong
  • Conversion wrong
  • (Non-sentence)
  • For the German test data, we checked cases where
    17 out of 18 parsers said DEPREL=X but the gold
    standard had DEPREL=Y (11 cases):
  • 4: challenge of distinguishing PP complements
    from adjuncts
  • 1: treebank PoS tag wrong
  • 1: non-sentence
  • 5: either treebank or conversion wrong (to be
    investigated)
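The majority-disagreement check described above can be sketched as follows. The function name, data shapes, and threshold default are assumptions for illustration, not the organizers' actual code:

```python
from collections import Counter

# Hedged sketch: flag tokens where a near-unanimous parser majority
# (here >= 17 of 18) agrees on a DEPREL that differs from the gold
# standard. Shapes and names are assumptions for illustration.
def suspicious_tokens(gold_deprels, parser_outputs, threshold=17):
    """gold_deprels: gold label per token; parser_outputs: one label
    list per parser, aligned with gold_deprels."""
    flagged = []
    for i, gold in enumerate(gold_deprels):
        votes = Counter(out[i] for out in parser_outputs)
        label, count = votes.most_common(1)[0]
        if label != gold and count >= threshold:
            flagged.append((i, gold, label))  # (token, gold, majority)
    return flagged

# 18 toy parsers: all agree with gold on token 0; on token 1,
# 17 of 18 say "OBJ" while the gold standard says "ADV".
gold = ["SBJ", "ADV"]
parsers = [["SBJ", "OBJ"]] * 17 + [["SBJ", "PMOD"]]
print(suspicious_tokens(gold, parsers))  # [(1, 'ADV', 'OBJ')]
```

Flagged tokens are then inspected by hand to decide between the possible reasons listed above (parser challenge, annotation error, conversion error, non-sentence).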

6
The future: using the resources (parsers)
  • Collaborate with treebank providers or other
    treebank experts
  • To semi-automatically enlarge or improve the
    treebank
  • To use automatically parsed texts as input for
    other NLP projects
  • Please let us know if you are willing to help
  • Arabic: Otakar Smrž, Jan Hajič (also general
    feedback)
  • Bulgarian: Petya Osenova, Kiril Simov
  • Czech: Jan Hajič
  • German: Martin Forst, Michael Schiehlen, Kristina
    Spranger
  • Portuguese: Eckhard Bick
  • Slovene: Tomaž Erjavec
  • Spanish: Toni Martí, Roser Morante
  • Swedish: Joakim Nivre
  • Turkish: Gülşen Eryiğit
  • Gertjan van Noord (Dutch Alpino treebank) is
    interested in general feedback

7
The future: improving the resources
  • http://nextens.uvt.nl/conll/ is a static web
    page
  • But hopefully many other people will continue
    this line of research
  • They need a platform to exchange (information
    about)
  • Experience with/bug reports about/patches for
    treebanks
  • Treebank conversion and validation scripts (esp.
    head tables!)
  • Other new/improved tools (e.g. visualization,
    analysis)
  • Details of experiments on new treebanks (e.g.
    training/test split)
  • Predictions on test sets by new/improved parsers
  • ...
  • SIGNLL agreed to provide such a platform (hosted
    at Tilburg University)
  • http://depparse.uvt.nl/ is a Wiki site where
    everybody is welcome to contribute!

8
Acknowledgements: many thanks to
  • Jan Hajič for granting the PDT/PADT temporary
    licenses for CoNLL-X and talking to LDC about it
  • Christopher Cieri for arranging distribution
    through LDC and Tony Castelletto for handling the
    distribution
  • Otakar Smrž for valuable help during the
    conversion of PADT
  • the SDT people for granting the special license
    for CoNLL-X and Tomaž Erjavec for converting the
    SDT for us
  • Matthias Trautner Kromann and assistants for
    creating the DDT and releasing it under the GNU
    General Public License
  • Joakim Nivre, Johan Hall and Jens Nilsson for the
    conversion of DDT to Malt-XML, for the conversion
    of the original Talbanken to Talbanken05 and for
    making it freely available for research purposes
  • Joakim Nivre again for prompt and proper response
    to all our questions
  • Bilge Say and Kemal Oflazer for granting the
    Metu-Sabanci license for CoNLL-X and answering
    questions
  • Gülşen Eryiğit for making many corrections to
    Metu-Sabanci and discussing some aspects of the
    conversion

9
Acknowledgements (continued)
  • the TIGER team (esp. Martin Forst) for allowing
    us to use the treebank for the shared task
  • Yasuhiro Kawata, Julia Bartels and colleagues
    from Tübingen University for the construction of
    the Japanese Verbmobil treebank
  • Sandra Kübler for providing the Japanese
    Verbmobil data and granting the special license
    for CoNLL-X
  • Diana Santos, Eckhard Bick and other Floresta
    sintá(c)tica project members for creating the
    treebank and making it publicly available, for
    answering many questions about the treebank
    (Diana and Eckhard), for correcting problems and
    making new releases (Diana), and for sharing
    scripts and explaining the head rules implemented
    in them (Eckhard)
  • Jason Baldridge for useful discussions and to Ben
    Wing for independently reporting problems which
    Diana then fixed
  • Gertjan van Noord and the other people at
    Groningen University for creating the Alpino
    Treebank and releasing it for free
  • Gertjan van Noord for answering all our questions
    and for providing extra test material
  • Antal van den Bosch for help with the
    memory-based tagger

10
Acknowledgements (continued)
  • Academia Sinica for granting the Sinica treebank
    temporary license for CoNLL-X
  • Keh-Jiann Chen for answering our questions about
    the Sinica treebank
  • Montserrat Civit and Toni Martí for allowing us
    to use Cast3LB for CoNLL-X and supplying the head
    table and function mapping
  • Kiril Simov and Petya Osenova for allowing us to
    use the BulTreeBank for CoNLL-X
  • Svetoslav Marinov, Atanas Chanev, Kiril Simov and
    Petya Osenova for converting the BulTreeBank
  • Dan Bikel for making the Randomized Parsing
    Evaluation Comparator
  • SIGNLL for having a shared task
  • Erik Tjong Kim Sang for posting the Call for
    Papers
  • Lluís Màrquez and Xavier Carreras for sharing
    their experience from organizing the two previous
    shared tasks
  • Lluís Màrquez for being a very helpful CoNLL
    organizer
  • All participants, including those who could not
    submit results or cannot be here today
  • It has been a pleasure working with you!

11
Future research
  • More result analysis
  • Baseline
  • Correlation between parsing approach and types of
    errors
  • Importance of individual features and algorithm
    details
  • Repeat experiments on improved data
  • POSTAG and FEATS for Talbanken05, LEMMA for
    Talbanken05 (DDT)
  • LEMMA and FEATS for new version of TIGER, POSTAG
    for TIGER
  • better DEPREL for Cast3LB
  • larger treebanks
  • having several good parsers facilitates
    annotation of more text!
  • better quality
  • check cases where parsers and treebank disagree!

12
Future research (continued)
  • Repeat experiments with other parsers
  • http://nextens.uvt.nl/conll/post_task_data.html
  • Repeat experiments with additional external
    information
  • large-scale distributional data harvested from
    the internet
  • Similar experiments but including secondary
    relations
  • Similar experiments on other data
  • Hebrew treebank
  • SUSANNE treebank (English)
  • Kyoto University Corpus, ATR corpus (Japanese)
  • Szeged Corpus (Hungarian)
  • see list in Wikipedia article on treebank
    (please contribute!)
  • Integrate automatic tokenization, morphological
    analysis and POS tagging