Title: Developments in Metadata Interoperability: Museums and Localisation Industry paradigms
1 Developments in Metadata Interoperability
Museums and Localisation Industry paradigms
2 Overview
- Metadata interoperability issue
- Museums metadata paradigm
- RDA, working on the global metadata standard
- Localisation paradigm (as a business metadata example)
- The XLIFF standard and its potential application as a localisation metadata standard to all other formats
- Summary
3 Metadata interoperability issue (I): The problem space
- Various metadata schemes and element sets
  - Some are well known and well documented
  - Others are less known and used in special cases
- Similar or the same content is described by different metadata standards
- No canonical metadata record for an object
- Varied syntaxes for encoding metadata
- This situation leads to:
  - A very rich and diverse metadata ecology!
  - Problems in a networked environment
4 Metadata interoperability issue (II): The problem space
- In a networked environment
  - Interaction between systems during harvesting and searching
  - Integrating different types of metadata even for local information management (i.e. inside a library's or a museum's LAN)
- Interoperability
  - "Interoperability is the ability of multiple systems with different hardware and software platforms, data structures, and interfaces to exchange data with minimal loss of content and functionality" (NISO, 2004).
  - "Interoperability is the ability of two or more systems or components to exchange information and use the exchanged information without special effort on either system" (CCDA, 2000).
5 Metadata interoperability issue (III): Solutions
- Schema level: Efforts are focused on the elements of the schemata, being independent of any application.
  - Derivation (e.g. MARC → MARCXML, MARCLite, MODS)
  - Application Profiles
  - Crosswalks (absolute and relative)
  - Switching-across
  - Metadata Registry
- Record level: Efforts are intended to integrate populated metadata records through the mapping of the elements according to the semantic meanings of these elements.
  - Conversion of Metadata Records (e.g. MARC ↔ MODS)
  - Data Reuse and Integration (e.g. Resource Description Framework, RDF)
- Repository level: With harvested or integrated records from varying sources, efforts at this level focus on mapping value strings associated with particular elements (e.g., terms associated with subject or format elements). The results enable cross-collection searching.
  - Metadata Repository Based on the Open Archives Initiative (OAI) Protocol
  - Metadata Repository Supporting Multiple Formats Without Record Conversion
  - Aggregation
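A crosswalk defined at the schema level and then applied to populated records can be sketched as a plain element-to-element mapping. The Python below is a minimal illustration only: the source element names (`object_name`, `maker`, and so on) are hypothetical and not taken from any published data dictionary, while the target names are Dublin Core elements. It follows the "absolute" style as that term is usually described, where a source element with no exact equivalent is simply not mapped; a "relative" crosswalk would instead assign it to the nearest available target element.

```python
from typing import Dict, List, Tuple

# Hypothetical source element names mapped to Dublin Core elements.
# The source names are illustrative assumptions, not a published crosswalk.
CROSSWALK: Dict[str, str] = {
    "object_name": "title",
    "maker": "creator",
    "production_date": "date",
    "material": "format",
    "accession_number": "identifier",
}


def apply_crosswalk(record: Dict[str, str]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Convert one source record; return (target record, unmapped source elements)."""
    target: Dict[str, List[str]] = {}
    unmapped: List[str] = []
    for element, value in record.items():
        dc_element = CROSSWALK.get(element)
        if dc_element is None:
            # Absolute style: no exact equivalent, so the element is not mapped.
            unmapped.append(element)
            continue
        # Dublin Core elements are repeatable, so values are collected in lists.
        target.setdefault(dc_element, []).append(value)
    return target, unmapped


museum_record = {
    "object_name": "Amphora",
    "maker": "Unknown",
    "storage_location": "Room 4, shelf B",
}
dc_record, lost = apply_crosswalk(museum_record)
print(dc_record)
print(lost)
```

Returning the unmapped elements alongside the converted record makes the loss of content visible, which is the cost the interoperability definitions on the previous slide ask to minimise.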
6 Museums metadata paradigm (I)
- Descriptive or Content metadata in Museums
  - Museum Collections Management/Documentation Standards
    - CHIN Data Dictionaries
    - SPECTRUM (Standard ProcEdures for CollecTions Recording Used in Museums)
    - CIDOC Guidelines for Museum Object Information: The CIDOC Information Categories
  - Collections Description Standards
    - Collection-level description
    - RSLP Standard for Collection-level description (based on DC)
  - Description of Art Collections and/or Visual Resources
    - Categories for the Description of Works of Art (CDWA)
    - VRA Core Categories
    - Méthode d'inventaire informatique des objets beaux-arts et arts décoratifs
    - RLG REACH Element Set
    - Le catalogage des estampes
  - Description of Architecture, Archaeological Sites/Monuments
    - A Guide to the Description of Architectural Drawings
    - CIDOC International Core Data Standard for Archaeological Sites and Monuments
    - CIDOC International Core Data Standard for Archaeological Objects
    - MIDAS (Monument Inventory Data Standard)
7 Museums metadata paradigm (II)
- General Metadata Standards for Resource Discovery
  - Dublin Core, The Dublin Core Metadata Element Set
    - Museums use a discipline-specific standard (CHIN Data Dictionaries or SPECTRUM) to document and manage their collections, and extract a subset of their collections records which map to the Dublin Core elements.
  - Darwin Core
    - Darwin Core (DwC) is a "profile describing the minimum set of standards for search and retrieval of natural history collections and observation databases".
- Multimedia Metadata Standards (NISO Z39.87-2002 Technical Metadata for Digital Still Images, DIG35 Specification, MPEG-7, Video Development Initiative (ViDe) User's Guide: Dublin Core Application Profile for Digital Video)
- Metadata Standards for Digital Preservation (RLG Preservation Metadata Elements, Metadata for Long-Term Preservation, Metadata Encoding and Transmission Standard (METS))
- Intellectual Property Rights and Electronic Commerce Standards (INDECS (Interoperability of Data in E-Commerce Systems), MPEG-21, Digital Object Identifier (DOI))
The metadata interoperability problem exists in the museum environment.
8 Museums metadata crosswalks
- Museums and the Network: Why?
  - Museums that want to convert their data from one format to another (for example, moving data into a new collections management system)
  - Museums that want to exchange data with another organization using a different metadata standard
  - Several museums that wish to collaborate to create a collective or distributed resource that allows seamless searching by users
  - Museums using internally more than one standard to meet their various needs for documentation, management, security, and access
- Implemented Solutions: Crosswalk of Metadata Element Sets
  - The Getty Research Institute "Crosswalk of Metadata Element Sets for Art, Architecture, and Cultural Heritage Information and Online Resources".
  - CHIN Humanities Data Dictionary
9 RDA: Resource Description and Access
- The Joint Steering Committee for Revision of AACR (JSC) is working towards a new standard, RDA: Resource Description and Access, scheduled for release in early 2009.
- RDA is a new standard for resource description and access, designed for the digital environment
  - A flexible framework for describing all resources, analog and digital
  - Data readily adaptable to new and emerging database structures
  - Data compatible with existing records in online library catalogues
  - Globalizable and Localizable content standard covering all media
  - Independent of technical communication formats
  - Aimed at everybody who needs to find, identify, select, obtain, use, manage and organize information
- RDA and other standards
  - RDA/ONIX framework for resource categorization
  - RDA/MARC21 mapping
  - RDA/Dublin Core mapping
http://www.collectionscanada.ca/jsc/rda.html
10 Localisation paradigm (as a business metadata example): XML Localization Interchange File Format
- OASIS-XLIFF (Organization for the Advancement of Structured Information Standards - XML Localization Interchange File Format) has emerged as a standard interchange file format for localization-related data and metadata.
- Localization challenges
  - Insufficient interoperability between tools.
  - Lack of support for overall localization workflow.
  - Localization tools developers and users need to deal with various formats.
  - Large number of proprietary intermediate formats.
11 Localisation paradigm (as a business metadata example): XML Localization Interchange File Format
- XLIFF Advantages: Localization Customer, Tools Vendor, Service Provider
  - Single format for adjunct processing (e.g. quality control in terms of spell checking).
  - Less dependency on vendors which are able to work with special formats.
  - Tighter control on what goes to localization (pre-filtering of what to translate or not).
  - Controlled information flow (author/developer notes, item properties, etc.).
  - All advantages of XML-based processing (e.g. ID-based leveraging).
  - Focus on development of core functionality rather than treatment of source format.
  - Open and standard solution for proprietary formats.
  - Global implementation of utilities (e.g. one spell checker for both RTF and HTML).
12 The High Level View
- An XLIFF document can capture anything needed for a localization project
  - Localizable objects (e.g. text strings) in source and target languages.
  - Supplementary information (e.g. glossaries, or material to recreate the original format).
  - Administrative information (e.g. workflow data).
  - Custom data (e.g. initialization information for tools).
13 The XLIFF Document
- An XLIFF document is designed to store the extracted data related to localization.
- Each given source container (e.g. a file, a database table, and so forth) corresponds to a <file> element in XLIFF.
- Each XLIFF document can include several <file> elements.
- A whole localization project can possibly be stored in a single XLIFF document.
14 Bilingual Model
- Each <file> element is designed to store one source language and one target language.
- The rationale is that the translation for every target language is done by different people most of the time.
- However, languages in <alt-trans> elements can be different. For example, proposed matches in national Portuguese when translating into Brazilian Portuguese.
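The bilingual model can be illustrated with a small XLIFF 1.2 document read through Python's standard `xml.etree.ElementTree` module. This is a sketch under stated assumptions: the file name `menu.properties`, the unit id `file_menu`, and the strings are invented for illustration. The `<file>` element carries one language pair (English to Brazilian Portuguese), while the `<alt-trans>` proposal is tagged with a different language (European Portuguese) through `xml:lang`.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 namespace and the fully qualified name of the xml:lang attribute.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document: file name, ids, and strings are illustrative only.
DOC = """\
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="menu.properties" datatype="plaintext"
        source-language="en" target-language="pt-BR">
    <body>
      <trans-unit id="file_menu">
        <source>File</source>
        <target>Arquivo</target>
        <alt-trans>
          <target xml:lang="pt-PT">Ficheiro</target>
        </alt-trans>
      </trans-unit>
    </body>
  </file>
</xliff>
"""

root = ET.fromstring(DOC)

file_pairs = []   # one (source language, target language) pair per <file>
alt_matches = []  # (unit id, language, text) for every <alt-trans> proposal

for file_el in root.findall("x:file", NS):
    file_pairs.append(
        (file_el.get("source-language"), file_el.get("target-language"))
    )
    for unit in file_el.iterfind(".//x:trans-unit", NS):
        for alt_target in unit.findall("x:alt-trans/x:target", NS):
            alt_matches.append(
                (unit.get("id"), alt_target.get(XML_LANG), alt_target.text)
            )

print(file_pairs)
print(alt_matches)
```

A second target language for the same content would go into a separate `<file>` element (or a separate document) rather than into additional `<target>` elements of the same unit.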
15 Localizable Objects
- XLIFF allows not only text string localization but also localization of other object types such as graphics.
- Supplementary information can be represented in a generic way through inline codes (e.g. formatting of text).
- Relationship between objects can be captured (e.g. all items in a menu).
16 An XLIFF Snippet
- A simple menu represented as XLIFF
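A menu of this kind might look like the following in XLIFF 1.2. This is an illustrative sketch, not the snippet shown on the original slide: the file name `app.rc`, the ids, and the French strings are assumptions. The `<group>` element captures the relationship between the items of one menu, and the Python code reads the group back with the standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical menu: file name, ids, and translations are illustrative only.
MENU_XLIFF = """\
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="app.rc" datatype="winres"
        source-language="en" target-language="fr">
    <body>
      <group id="menu_file" restype="menu">
        <trans-unit id="item_open" restype="menuitem">
          <source>Open</source>
          <target>Ouvrir</target>
        </trans-unit>
        <trans-unit id="item_save" restype="menuitem">
          <source>Save</source>
          <target>Enregistrer</target>
        </trans-unit>
        <trans-unit id="item_exit" restype="menuitem">
          <source>Exit</source>
          <target>Quitter</target>
        </trans-unit>
      </group>
    </body>
  </file>
</xliff>
"""

root = ET.fromstring(MENU_XLIFF)

# Map each menu group id to its (source, target) item pairs, in document order.
menus = {}
for group in root.iterfind(".//x:group[@restype='menu']", NS):
    items = []
    for unit in group.findall("x:trans-unit", NS):
        source = unit.findtext("x:source", default="", namespaces=NS)
        target = unit.findtext("x:target", default="", namespaces=NS)
        items.append((source, target))
    menus[group.get("id")] = items

print(menus)
```

Keeping the items inside one group lets a tool present them together to the translator, for example to check that menu entries stay consistent with each other.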
17 [Diagram: Content Creators; Localisation Domain; Publisher/Customer Domain]
18 [Diagram, alternative ("OR") arrangement: Localisation Domain; Publisher/Customer Domain]
19 Summary
- The needs of local communities often conflict with the goal of effective cross-searching and retrieval of information.
- Metadata interoperability becomes a top priority in a networked information environment.
- Many methods and techniques have already been developed by various organizations to overcome the problem.
- A common global standard, such as RDA, can offer solutions.
- The localisation industry has already realized the advantages of a common localisation metadata standard, through the development of XLIFF.
20 References
- RDA (Resource Description and Access) is available at www.collectionscanada.ca/jsc/news.html
- The XLIFF (XML Localisation Interchange File Format) TC Web Site: http://www.xliff.org
- Metadata Interoperability and Standardization: A Study of Methodology, Part I and II, Lois Mai Chan, Marcia Lei Zeng, D-Lib Magazine, June 2006, Volume 12 Number 6 (available at http://www.dlib.org/dlib/june06/chan/06chan.html)
- NISO, 2004: NISO (National Information Standards Organization). (2004). Understanding metadata. Bethesda, MD: NISO Press. Available: <http://www.niso.org/standards/resources/UnderstandingMetadata.pdf>.
- CCDA, 2000: CCDA (ALCTS/CCS/Committee on Cataloging: Description and Access). (2000). Task Force on Metadata: Final report, June 16, 2000. Available: <http://www.libraries.psu.edu/tas/jca/ccda/tf-meta6.html>.
- Taylor 2004, p. 369: Taylor, A. (2004). The Organization of Information. 2nd ed. Westport, CT: Libraries Unlimited.
- MARC (MAchine-Readable Cataloging) Formats, http://lcweb.loc.gov/marc/. Developed and maintained by the Library of Congress Network Development and MARC Standards Office.
- MODS (Metadata Object Description Schema), http://www.loc.gov/standards/mods/. Developed and maintained by the Library of Congress Network Development and MARC Standards Office.
- RDF: DuCharme, B. (2003). Building Metadata Applications with RDF. XML.com. Available: <http://www.xml.com/pub/a/2003/02/12/rdflib.html>.
- METS (Metadata Encoding and Transmission Standard), http://www.loc.gov/mets. Developed as an initiative of the Digital Library Federation; maintained in the Network Development and MARC Standards Office of the Library of Congress.
21 References
- MARCLite, http://www.loc.gov/marc/bibliographic/lite/. Developed and maintained by the Library of Congress Network Development and MARC Standards Office. A subset of the data elements in the complete MARC 21 Format for Bibliographic Data.
- MARCXML, http://www.loc.gov/marcxml. Developed and maintained by the Library of Congress Network Development and MARC Standards Office. A framework for working with MARC data in an XML environment.
- GEM (The Gateway to Educational Materials) Element Set, http://www.thegateway.org/about/documentation/schemas. Maintained by the GEM Consortium. A set of metadata elements based on the Dublin Core used by GEM members to organize and improve access to their own educational materials.
- ETD-MS: an Interoperability Metadata Standard for Electronic Theses and Dissertations, http://www.ndltd.org/standards/metadata/current.html. Developed by the Networked Digital Library of Theses and Dissertations (NDLTD). A standard set of metadata elements based on the Dublin Core used to describe electronic theses and dissertations.
- CHIN: Canadian Heritage Information Network, http://www.chin.gc.ca/English/index.html
- SPECTRUM: Standard ProcEdures for CollecTions Recording Used in Museums, The UK Museum Documentation Standard. 1997-2002, www.mda.org.uk/spectrum.htm
- ICOM-CIDOC: The International Committee for Documentation of the International Council of Museums, http://www.willpowerinfo.myby.co.uk/cidoc/
- RSLP (Research Support Libraries Programme in UK), Standard for Collection-level description, http://www.ukoln.ac.uk/metadata/rslp/
22 References
- CDWA, Categories for the Description of Works of Art, http://www.getty.edu/research/conducting_research/standards/cdwa/introduction.html
- VRA (Visual Resources Association) Core Categories, http://www.vraweb.org/
- RLG (Research Libraries Group) REACH Element Set, http://www.rlg.org/reach.elements.html
- MIDAS (Monument Inventory Data Standard), http://www.jiscmail.ac.uk/cgi-bin/filearea.cgi?LMGT1FISHagetf/web_midasintro.htm
- Darwin Core, http://speciesanalyst.net/docs/dwc/index.html
- Getty Research Institute "Crosswalk of Metadata Element Sets for Art, Architecture, and Cultural Heritage Information and Online Resources", http://www.getty.edu/research/conducting_research/standards/index.html