Learn more at: http://www.lrec-conf.org
Transcript and Presenter's Notes

1
Exploring and Enriching a LR Archive via the Web
Marc Kemps-Snijders, Alex Klassmann, Claus Zinn, Peter Berck, Albert Russel, Peter Wittenburg
MPI for Psycholinguistics
DOBES Endangered Languages Project
2
What is a digital archive?
  • Two essential dimensions
    • Long-term preservation of all resources and relations
    • Accessibility and interpretability
  • Why preserve?
    • We face the loss of our cultural memory on electronic media
    • UNESCO: 80% of the recordings about languages and cultures are highly endangered
  • There are no guarantees for preservation, but we can increase the chances of survival
    • store everything in a well-organized repository (browsable/searchable)
    • take care of redundancy, migration and curation on various dimensions
    • establish organizations that take responsibility
  • Digital archives are living entities!
    • Live Archives concept: allow enrichments (standoff), relations, etc.

3
What is in the MPI's archive?
  • Endangered language documentation resources
    • Representative record of a language in its cultural context
    • Crucial is the active involvement of the community
    • May help in maintaining and revitalizing languages
    • Therefore a trend towards complementing linguistic information with ontological information in collaborative spaces
  • Child language, bilingualism, gesture, sign language, Spoken Dutch Corpus, sound corpora, second language learner corpora, etc.

Mostly annotated audio/video recordings: 30 terabytes, 53,000 AV resources, 24,000 annotation files, 60 million annotations, lexicons, sketch grammars, etc.
All from a large number of depositors.
4
DOBES Languages
40 language teams from the DOBES program
documenting about 60 languages and working
independently
5
Language Archiving Technology (LAT)
LAT supports operations throughout a resource's lifetime
and supports standards where possible
6
LAT Dimensions: Management and Upload
  • take care of consistency
  • check uploaded formats
  • convert where possible
  • create presentation formats
  • create indexes
  • allow access rights definition
  • add unique persistent IDs
  • take care of distribution
  • basis is a robust repository system with reliable mechanisms

[Diagram labels: resources, metadata, repository system, metadata editing]
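
The upload steps listed on this slide (check formats, convert where possible, add unique persistent IDs) can be pictured with a small sketch. It is not LAT's actual implementation: the format lists, the `ingest` function, the `IngestResult` record and the `hypothetical-pid:` prefix are all assumptions made for illustration.

```python
import hashlib
import uuid
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical format policy (assumption): which formats are accepted as-is
# and which would be converted on upload.
ACCEPTED = {".wav", ".mpg", ".eaf", ".xml"}
CONVERTIBLE = {".mp3": ".wav", ".avi": ".mpg"}


@dataclass
class IngestResult:
    name: str
    status: str                          # "accepted", "convert" or "rejected"
    target_format: Optional[str] = None
    persistent_id: Optional[str] = None  # placeholder ID, not a real PID scheme
    checksum: Optional[str] = None       # supports later consistency checks


def ingest(name: str, payload: bytes) -> IngestResult:
    """Check the uploaded format and, if usable, assign an ID and checksum."""
    suffix = PurePosixPath(name).suffix.lower()
    if suffix in ACCEPTED:
        status, target = "accepted", suffix
    elif suffix in CONVERTIBLE:
        status, target = "convert", CONVERTIBLE[suffix]
    else:
        return IngestResult(name, "rejected")
    return IngestResult(
        name=name,
        status=status,
        target_format=target,
        persistent_id=f"hypothetical-pid:{uuid.uuid4()}",
        checksum=hashlib.sha256(payload).hexdigest(),
    )
```

Rejecting unknown formats before any ID is issued keeps the repository consistent: only resources that can be stored or converted ever receive a persistent identifier.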
7
LAT Dimensions: Complex Access
  • access to annotated media or multimedia lexica
  • callable via any other web application

8
LAT Dimensions: Customized Views
  • fostering the creation of special websites through
    REST interfaces and templates
  • fostering GIS presentations through special
    converters
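
As an illustration of what a GIS converter might do, the sketch below turns simple metadata records into a GeoJSON FeatureCollection that a map viewer could load. The record fields (`name`, `lat`, `lon`) and the function name `to_geojson` are assumptions; the slides do not say which formats the LAT converters actually read or write.

```python
import json


def to_geojson(records):
    """Convert records with coordinates into a GeoJSON FeatureCollection.

    Records without coordinates are skipped. GeoJSON expects positions
    in [longitude, latitude] order.
    """
    features = []
    for rec in records:
        lat, lon = rec.get("lat"), rec.get("lon")
        if lat is None or lon is None:
            continue
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": rec["name"]},
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"name": "Collection A", "lat": 1.0, "lon": 2.0},
        {"name": "Collection B"},  # no coordinates: skipped
    ]
    print(json.dumps(to_geojson(sample), indent=2))
```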

9
Who are our users?
Stakeholder    Interest
archivists     easy management, easy discovery, consistency, statistics, versioning, ...
researchers    easy visualization, easy discovery, virtual collections, extensions, permissions, ...
communities    semantic exploration, extensions, permissions, ...
journalists    appetizers, easy inspection, ...
students       curiosity, navigation, inspection, ...
Still in a "download first" paradigm, not
cyberinfrastructure usage (result of an ESF/NSF
workshop)
10
"Download first": problems and disadvantages
  • Tool and format updates are propagated to users
    at a slow rate
  • Legacy formats offered to archives pose an
    increasing burden on archives or tool builders
    (conversion/migration)
  • New techniques spread slowly through the community
  • Orchestration between tools becomes much more
    difficult, if not impossible
  • Users need to install tools locally

Can we provide more incentives on the tools side?

11
How to extend LAT?
  • Paper dictionaries: limited usefulness in
    language maintenance and language revival
    (Manning et al., 2000)
  • Linear lexicons: not at all interesting except
    for linguists
  • Speech community may prefer explicit semantic
    access and links, possibly of a wide variety of
    types (i.e. beyond formal systems)
  • Semantic view not limited to lexicons, but
    should include all fragments

Therefore, introduction of conceptual spaces,
where concepts are related to others, anchored
in language, and illustrated with multimedia
12
ADDIT: Commentary and Relations
  • allow authorized people to make arbitrary
    comments on and relations between object
    fragments
  • visualize them in tools and via VICOS
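
One way to picture standoff commentary is a store that keeps comments and relations separate from the archived objects and refers to them only by fragment. The sketch below is an assumption-laden illustration, not ADDIT's data model: the class names, the `start`/`end` fragment addressing and the simple author whitelist are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    resource_id: str   # hypothetical identifier of an archived object
    start: int         # e.g. milliseconds in a recording, or a character offset
    end: int


@dataclass(frozen=True)
class Comment:
    author: str
    target: Fragment
    text: str


@dataclass(frozen=True)
class Relation:
    author: str
    source: Fragment
    target: Fragment
    label: str         # free-form relation type chosen by the author


class StandoffStore:
    """Keeps comments and relations apart from the archived objects."""

    def __init__(self, authorized):
        self.authorized = set(authorized)
        self.comments = []
        self.relations = []

    def _require_authorized(self, author):
        if author not in self.authorized:
            raise PermissionError(f"{author} may not annotate")

    def comment(self, author, target, text):
        self._require_authorized(author)
        note = Comment(author, target, text)
        self.comments.append(note)
        return note

    def relate(self, author, source, target, label):
        self._require_authorized(author)
        link = Relation(author, source, target, label)
        self.relations.append(link)
        return link

    def touching(self, resource_id):
        """Return every comment and relation that mentions the resource."""
        hits = [c for c in self.comments if c.target.resource_id == resource_id]
        hits += [r for r in self.relations
                 if resource_id in (r.source.resource_id, r.target.resource_id)]
        return hits
```

Because the annotations only point at fragments, the archived recordings and lexicons stay untouched, which is the point of a standoff approach.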

13
VICOS: Lexical relations navigation
  • Allow users to create relations within and across
    lexicons, across cognate sets, etc.
  • Visualize and allow easy navigation in conceptual
    spaces
  • Empower community members to actively describe
    their LC and to learn from such resources
  • Decide which words offer key access to cultural
    concepts
  • Technology needed to link words (and the
    associations they evoke) to other words and to
    all sorts of relevant fragments
  • Conceptual spaces: an informal ontology of
    fuzzily defined concepts and relationships,
    but where concepts are anchored in
    corresponding formal lexicon entries
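
A conceptual space as described here can be modelled as a graph: concepts are nodes, free-form relations are labelled edges, and each concept may be anchored in a lexicon entry. The sketch below is a minimal illustration under that assumption. `ConceptualSpace`, its methods and the entry IDs are hypothetical and do not reflect how VICOS is implemented.

```python
from collections import defaultdict, deque


class ConceptualSpace:
    def __init__(self):
        self.edges = defaultdict(list)  # concept -> [(relation label, concept)]
        self.anchors = {}               # concept -> lexicon entry ID

    def anchor(self, concept, entry_id):
        """Tie an informal concept to a formal lexicon entry."""
        self.anchors[concept] = entry_id

    def relate(self, a, label, b):
        """Add a relation; stored in both directions so users can navigate back."""
        self.edges[a].append((label, b))
        self.edges[b].append((label, a))

    def within(self, start, max_steps):
        """Breadth-first search: concepts reachable in at most max_steps hops."""
        seen = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if seen[node] == max_steps:
                continue
            for _, nxt in self.edges.get(node, []):
                if nxt not in seen:
                    seen[nxt] = seen[node] + 1
                    queue.append(nxt)
        del seen[start]
        return seen


space = ConceptualSpace()
space.anchor("canoe", "lexicon-1#entry-12")   # made-up entry ID
space.relate("canoe", "used on", "river")
space.relate("river", "place of", "fishing")
print(space.within("canoe", 2))
```

Relation labels are plain strings on purpose: the slide stresses fuzzily defined relationships, so the sketch does not restrict them to a fixed vocabulary.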

14
(No Transcript)
15
Team and Acknowledgements
  • LAT Team
  • System Managers
  • Archive Managers and Digitization
  • Software Developers

Acknowledgements: The work was funded by the
Volkswagen Foundation, the European Commission,
the Dutch Science Organization, the Dutch
Institute for Lexicology, the Max Planck Society
and the Max Planck Institute for Psycholinguistics.