Title: SemanTic Interoperability To access Cultural Heritage
1 SemanTic Interoperability To access Cultural Heritage
- Frank van Harmelen
- Henk Matthezing
- Peter Wittenburg
- Marjolein van Gendt
- Antoine Isaac
- Lourens van der Meij
- Stefan Schlobach
- Paul Doorenbosch
2 CH Interoperability Problems
- Current CH trend: portals that build on heterogeneous collections
- Different databases/vocabularies/MD schemes
4 CH Interoperability Problems
- Current CH trend: portals that build on heterogeneous collections
- Different databases/vocabularies/MD schemes
- Syntactic interoperability: problem being solved?
- Access can be granted
- Semantic interoperability: still to be addressed
- Links with original vocabularies/MD structures are lost
6 STITCH General Goals
- Allow heterogeneous CH collections to be accessed
- In an integrated way
- Still benefiting from specific collection commitments
- Keeping original metadata schemes and vocabularies
- Using Semantic Web means for
- Representation of different points of view in one system
- Creation and use of alignment knowledge
8 STITCH General Goals (2)
- Research objective: develop theory, methods and tools for allowing metadata interoperability through semantic links between vocabularies
- Formalization of schemes (and collections)
- Applying ontology mapping techniques to those schemes
- Using the results of the mappings in formal reasoning mechanisms (and dedicated interfaces)
9 Applying SW research to concrete objectives
- Specificity of resources (thesauri, metadata schemes)
- Formalization in a context of natural semantics
- What can ontology mapping techniques bring to solve the interoperability problem in CH?
- Quantitative and qualitative evaluation
- Integration into realistic scenarios
- Are these techniques really applicable to the CH case?
- Uses that have to be further specified
- What does accessing collections in an integrated way mean?
- Interfaces, services?
- Anticipating needs that are not yet stabilized
10 Pilot Project
- Experiment on a reduced scale
- Choose and formalize 2 collections and their associated subject vocabularies
- Rijksmuseum ARIA Masterpieces and its catalogue
- KB Illustrated Manuscripts and Iconclass
- Use existing mapping tools to align vocabularies
- Adapt/develop a browsing interface providing integrated access using
- Original vocabularies and their structure
- Alignment information
11 1st Collection: KB Illustrated Manuscripts
12 2nd Collection: Rijksmuseum ARIA collection
13 PP Modules
14 PP Modules
15 Collection Formalization Goals
- Analysis of the vocabularies and MD structures
- Representation using SW languages
- Testing standard means (SKOS/RDF)
- Conversion for vocabularies, but also for metadata structures
- Ontologies providing proper collection-related relations
- Conversion for interface and reasoning engine (application-specific) but also for formal ontology mapping tools
16 Vocabulary Formalisation: ARIA in SKOS
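The figure for this slide is not in the transcript. As a rough illustration of what a SKOS rendering of a vocabulary entry can look like, here is a minimal sketch that writes two concepts as Turtle. The concept identifiers, labels and the `http://example.org/aria/` base URI are assumptions made up for the example, not actual ARIA data; only the SKOS namespace and the `skos:Concept`, `skos:prefLabel` and `skos:broader` terms come from the SKOS standard.

```python
from dataclasses import dataclass

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


@dataclass
class Concept:
    """One vocabulary entry; identifiers and labels are illustrative only."""
    local_id: str
    pref_label: str
    broader: tuple = ()  # local ids of the broader concepts


def to_turtle(concepts, base="http://example.org/aria/"):
    """Serialise concepts as SKOS in Turtle syntax (hypothetical base URI)."""
    lines = [SKOS_PREFIX, f"@prefix ex: <{base}> .", ""]
    for c in concepts:
        lines.append(f"ex:{c.local_id} a skos:Concept ;")
        props = [f'    skos:prefLabel "{c.pref_label}"@en']
        props += [f"    skos:broader ex:{b}" for b in c.broader]
        lines.append(" ;\n".join(props) + " .")
        lines.append("")
    return "\n".join(lines)


# Hypothetical two-level fragment of a catalogue hierarchy.
concepts = [
    Concept("paintings", "Paintings"),
    Concept("landscapes", "Landscapes", broader=("paintings",)),
]
turtle = to_turtle(concepts)
print(turtle)
```

A real conversion would more likely go through an RDF library and would also have to decide how to represent the parts of the source vocabulary that do not fit SKOS.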
17 Collection Formalization Problems
- Interpreting and representing vocabularies using formal standards is hindered by expressivity variation
- Complex models
- Fuzzy structures, weakly structured
- Implies some loss of data during standardisation?
- Part of the formalization is system-specific
- Depending on application environment
- Standard RDFS expressivity and implemented tools
- Depending on the mapping tools, which might make different hypotheses on the nature of knowledge to align
- OWL classes vs. nodes in trees
- Changes the role of the standard representation in the system?
18 PP Modules
19 Automatic Ontology Matching Techniques
- Generally aiming at recognizing equivalence or subsumption links between ontology elements
- Lexical
- Labels of entities, textual definitions
- Structural
- Structure of the formal definitions of entities, position in the hierarchy
- Statistical
- Objects, instantiation of the concepts
- Shared background knowledge (oracles)
- Using conceptual references to deduce correspondences
- Most mapping tools use a mix of such approaches
- E.g. lexical string matching can trigger a structural alignment process
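The lexical family of techniques can be sketched in a few lines. This is only an illustration of label comparison, not the algorithm of any particular mapping tool: it compares normalised labels with the standard-library `difflib.SequenceMatcher` and keeps the pairs above a threshold. The concept identifiers, labels and the 0.85 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def normalise(label):
    """Lower-case and collapse whitespace before comparing labels."""
    return " ".join(label.lower().split())


def lexical_matches(source_labels, target_labels, threshold=0.85):
    """Return (source_id, target_id, score) for label pairs above threshold."""
    matches = []
    for s_id, s_label in source_labels.items():
        for t_id, t_label in target_labels.items():
            score = SequenceMatcher(
                None, normalise(s_label), normalise(t_label)
            ).ratio()
            if score >= threshold:
                matches.append((s_id, t_id, round(score, 2)))
    # Highest-scoring candidates first.
    return sorted(matches, key=lambda m: -m[2])


# Hypothetical labels standing in for two vocabularies.
scheme_a = {"a1": "Landscapes", "a2": "Portraits"}
scheme_b = {"b1": "landscape", "b2": "portrait", "b3": "animals"}
matches = lexical_matches(scheme_a, scheme_b)
print(matches)
```

Candidates found this way could then serve as anchors for a structural step that compares the neighbourhoods of the matched concepts.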
20 Collection Integration Goals
- Provide mappers with proper resources
- Pre-processing done in previous step
- Use them in the most efficient way
- Setting taking into account the specificities of CH vocabularies
- Evaluation/selection of their results
- Taking into account the use of CH vocabularies in their collection
- Use their result in the application system
- Post-processing
- Do it for vocabularies but also for metadata schemes
- Not in pilot
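One possible form of the selection and post-processing step is sketched below, assuming the mapping tool returns candidate correspondences with a confidence value (not every tool does). The tuple layout, the relation names and the 0.7 cut-off are assumptions for illustration.

```python
def select_mappings(candidates, min_confidence=0.7):
    """Keep, per source concept, the best candidate at or above the cut-off.

    candidates: iterable of (source, target, relation, confidence) tuples.
    Returns {source: (target, relation, confidence)}.
    """
    best = {}
    for source, target, relation, confidence in candidates:
        if confidence < min_confidence:
            continue  # discard weak candidates
        if source not in best or confidence > best[source][2]:
            best[source] = (target, relation, confidence)
    return best


# Hypothetical mapper output.
candidates = [
    ("aria:landscapes", "ic:landscape", "exactMatch", 0.95),
    ("aria:landscapes", "ic:mountains", "narrowMatch", 0.75),
    ("aria:portraits", "ic:portrait", "exactMatch", 0.60),
]
selected = select_mappings(candidates)
print(selected)
```

Keeping a single target per source concept is itself a design choice; an application that wants to show several related concepts would keep all candidates above the cut-off instead.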
21 Mappings
22 Mappings
23 Collection Integration Problems
- Input needs pre-processing, possibly division
- Output needs re-interpretation of mapping relations
- Can confidence measures be used?
- Alignment process
- Usually turning to resources that may be absent from thesauri
- Rich formal/structural information
- Dually indexed documents
- Not (properly) using all information found in thesauri
- E.g. rich lexical information
- Leading to low-quality thesaurus mapping
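To make concrete why dually indexed documents matter to statistical matchers, here is a minimal sketch of an instance-based score: the Jaccard overlap between the sets of objects annotated with each concept. It only produces useful values when the same objects carry annotations from both vocabularies, which is exactly the resource that may be missing. All identifiers and data are hypothetical.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def instance_based_scores(index_a, index_b):
    """Score every concept pair by the overlap of their annotated objects.

    index_a, index_b: {concept_id: set_of_object_ids}.
    """
    return {
        (ca, cb): jaccard(objs_a, objs_b)
        for ca, objs_a in index_a.items()
        for cb, objs_b in index_b.items()
    }


# Hypothetical objects o1..o5, some annotated in both vocabularies.
index_a = {"aria:landscapes": {"o1", "o2", "o3"}}
index_b = {"ic:landscape": {"o2", "o3", "o4"}, "ic:portrait": {"o5"}}
scores = instance_based_scores(index_a, index_b)
print(scores)
```

With two collections that share no objects, every score is zero, so the measure says nothing about the vocabularies.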
24 PP Modules
25 User Interface: Access to Collections
- Adapted faceted browsing paradigm (Flamenco)
- Search by navigating through several facets
- STITCH PP facet adaptation
- From orthogonal facets (material, location) to facets describing different conceptual schemes (ARIA, Iconclass)
- 3 views on integrated collections
- Single view
- Combined view
- Merged view
- http://stitch.cs.vu.nl
26 Collections Access: Single View
- Facets based on 1 point of view and its associated concept scheme(s)
- Access to objects indexed against concepts from other schemes
- If a mapping exists between their index and the concepts from the single view
- A single point of view on the integrated data set
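The retrieval logic behind the single view can be sketched as follows, under the assumption that the mapping is available as a simple lookup from a concept of the browsing scheme to concepts of the other scheme. Function name, data layout and identifiers are hypothetical and do not describe the prototype's actual code.

```python
def objects_for_concept(concept, own_index, other_index, mapping):
    """Objects indexed with `concept`, plus objects reached through the mapping.

    own_index:   {concept_id: set_of_object_ids} for the browsing scheme.
    other_index: same structure for the other scheme.
    mapping:     {concept_id: set_of_concept_ids_in_the_other_scheme}.
    """
    results = set(own_index.get(concept, ()))
    for other_concept in mapping.get(concept, ()):
        results |= other_index.get(other_concept, set())
    return results


# Hypothetical data: one museum object, two manuscript illustrations.
own_index = {"aria:landscapes": {"rm1"}}
other_index = {"ic:landscape": {"kb7", "kb9"}}
mapping = {"aria:landscapes": {"ic:landscape"}}
found = objects_for_concept("aria:landscapes", own_index, other_index, mapping)
print(sorted(found))
```

Objects whose index terms have no mapping to the browsing scheme stay invisible in this view, which is why mapping coverage matters.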
27 Collections Access: Combined View
- Search based on 2 (or more) points of view
- One facet uses 1 vocabulary from 1 point of view
- Facets attached to the different points of view are presented
- Simultaneous access to different points of view of the same data
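A combined search can be read as an intersection over the facets selected in each point of view. The sketch below assumes that each scheme's index already lists every object reachable from a concept (directly or through mappings); the names and data are hypothetical.

```python
def combined_view(all_objects, selections, indexes):
    """Intersect the objects reachable from each selected facet value.

    selections: {scheme_name: concept_id}, one choice per point of view.
    indexes:    {scheme_name: {concept_id: set_of_object_ids}}.
    """
    result = set(all_objects)
    for scheme, concept in selections.items():
        result &= indexes[scheme].get(concept, set())
    return result


# Hypothetical objects and per-scheme indexes.
all_objects = {"o1", "o2", "o3"}
indexes = {
    "aria": {"aria:landscapes": {"o1", "o2"}},
    "iconclass": {"ic:landscape": {"o2", "o3"}},
}
hits = combined_view(
    all_objects,
    {"aria": "aria:landscapes", "iconclass": "ic:landscape"},
    indexes,
)
print(hits)
```

With no selection the function returns the whole collection, which matches the usual starting state of a faceted browser.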
28 Collections Access: Merged View
- Facets using a merged concept scheme
- Mapping leads to hierarchical links between schemes
- Making the links between vocabularies more visible during search
- A way to enrich weakly structured vocabularies
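One way to picture the merged scheme is to treat mapping results as extra broader links that connect the two hierarchies. The sketch below merges the links and lists everything that ends up below a given concept; the data layout and identifiers are assumptions for illustration.

```python
def merge_schemes(broader_a, broader_b, mapping_links):
    """Union of the broader links of both schemes and the mapping links.

    Each argument maps a concept id to the set of its broader concepts.
    """
    merged = {}
    for source in (broader_a, broader_b, mapping_links):
        for child, parents in source.items():
            merged.setdefault(child, set()).update(parents)
    return merged


def narrower_closure(merged, root):
    """All concepts that sit below `root` in the merged hierarchy."""
    children = {}
    for child, parents in merged.items():
        for parent in parents:
            children.setdefault(parent, set()).add(child)
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:  # also guards against cycles
                seen.add(child)
                stack.append(child)
    return seen


# Hypothetical fragments of two schemes and one mapping link between them.
broader_a = {"aria:landscapes": {"aria:paintings"}}
broader_b = {"ic:mountains": {"ic:landscape"}}
mapping_links = {"ic:landscape": {"aria:landscapes"}}
merged = merge_schemes(broader_a, broader_b, mapping_links)
below = narrower_closure(merged, "aria:paintings")
print(sorted(below))
```

In this toy case the flat concept `aria:landscapes` gains a sub-hierarchy from the other scheme, which is the enrichment effect the slide refers to.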
29 Collection Access: Conclusion
- Prototype is a thin layer on top of SW/RDF technology (using Sesame)
- All data is stored in and retrieved from RDF repositories
- Easily adaptable for experimentation with different views (without programming)
- For convincing results you need good-quality mappings
- E.g., to assess the value of the Merged view
- Towards application-specific evaluation criteria?