Converting an Existing Taxonomic Data Resource to Employ an Ontology and LSIDs

1
Converting an Existing Taxonomic Data Resource to
Employ an Ontology and LSIDs
  • Jessie Kennedy
  • Rob Gales, Robert Kukla

2
Introduction
  • Data sharing is fundamental to biodiversity and
    taxonomic data applications
  • Previous attempts to facilitate sharing have had
    limited success
  • lack of take up of data exchange standards
  • now slowly happening due to the TDWG standards
    initiative
  • the absence of a common terminology or vocabulary
    for use in taxonomic data
  • the lack of reference database systems for
    serving authoritative data
  • Proposed new technologies
  • a Core Ontology for taxonomic data to model the
    biodiversity domain.
  • Adoption of Life Science Identifiers (LSIDs) by
    the TDWG GUID group
  • for uniquely identifying taxonomic data objects,
    e.g. specimens, names, concepts, etc.
  • LSIDs can make use of an Ontology to define the
    data to be returned
  • Need a mechanism for migrating existing data to
    the new technologies
  • explore the issues in using LSIDs and RDF
    according to an Ontology.
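As background for the LSID bullets above, an LSID is a URN of the form urn:lsid:authority:namespace:object, with an optional trailing revision. The following is a minimal sketch of splitting one into its parts. The `LSID` tuple, the `parse_lsid` helper and the example.org identifier are illustrative assumptions, not part of the presentation.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Parts of an LSID (hypothetical helper type for this sketch)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split urn:lsid:authority:namespace:object[:revision] into its parts."""
    parts = text.split(":")
    # The "urn:lsid" prefix is compared case-insensitively.
    if len(parts) not in (5, 6) or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not a well-formed LSID: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"LSID has an empty component: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


# Hypothetical identifier, used only to show the shape of the result.
example = parse_lsid("urn:lsid:example.org:names:12345")
```

Here `example.authority` names the organisation responsible for resolving the identifier, which is the part a client would use to locate the resolution service.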

3
Re-using LSIDs
  • Using LSIDs per se will not address the issue of
    data sharing
  • Repositories must reuse LSIDs to cross reference
    data within and outwith their own repository.
  • It is important that we use the same LSID to
    refer to the same entity
  • If multiple LSIDs exist for the same entity we
    would be required to decide whether or not two
    LSIDs were really the same thing.
  • We would be in a similar situation as we are
    today,
  • for example, trying to decide if two taxonomic
    names are really the same.
  • Generating LSIDs for any self-contained data set
    is a fairly trivial task
  • Assigning LSIDs from an authoritative repository
    to existing data, so that they are re-used, is
    more challenging.
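The point about re-use can be illustrated with a small sketch: records from different repositories collapse into one entity only when they carry the same LSID. The repository names and all LSIDs below are hypothetical; the third record shows the situation the slide warns about, where a locally minted LSID for the same name leaves the question of equivalence open.

```python
from collections import defaultdict


def group_by_lsid(records):
    """Group (lsid, source, label) records so that each LSID maps to one entity."""
    entities = defaultdict(list)
    for lsid, source, label in records:
        entities[lsid].append((source, label))
    return dict(entities)


# Hypothetical records: two repositories re-use an authority's LSID,
# a third mints its own LSID for what may be the same name.
records = [
    ("urn:lsid:names.example.org:names:101", "repository-a", "Actinia equina"),
    ("urn:lsid:names.example.org:names:101", "repository-b", "Actinia equina (Linnaeus, 1758)"),
    ("urn:lsid:local.example.net:names:7", "repository-c", "Actinia equina"),
]

entities = group_by_lsid(records)
for lsid, sources in entities.items():
    print(lsid, [source for source, _ in sources])
```

The first two records are linked by simple string equality on the identifier, while the third still needs a human or a matching tool to decide whether it is the same entity.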

4
Project Overview
  • Imagining the future
  • Assume we have authority providers for certain data
  • Publications, names, etc., e.g. IPNI, ZOObank, IF,
    Pubbank
  • Want to convert an existing data repository
  • Relational database
  • the Hexacorallians of the World
  • Represent existing data as RDF triples
  • Use LSIDs to uniquely identify entities in data
  • according to a domain ontology which extends the
    TDWG core ontology
  • Use LSIDs to cross reference between the data in
    the repository
  • Some LSIDs re-used from external sources
  • Some LSIDs generated locally
  • Owned data
  • Development of a tool to aid the process of
    converting internal database keys to LSIDs
  • aid users in assigning the appropriate LSID from
    some external LSID authority.
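The key-to-LSID conversion outlined above might look roughly like this: a confirmed match from an external authority is re-used, and anything else (owned data) gets a locally generated LSID. The authority name hexacorals.example.org, the function names and the sample keys are assumptions made for this sketch.

```python
def mint_local_lsid(namespace, key, authority="hexacorals.example.org"):
    """Build a local LSID from an internal database key (hypothetical authority)."""
    return f"urn:lsid:{authority}:{namespace}:{key}"


def assign_lsids(keys, namespace, external_matches):
    """Map each internal key to an LSID.

    external_matches holds LSIDs a user has confirmed from an external
    authority; keys without a confirmed match get a locally minted LSID.
    """
    assigned = {}
    for key in keys:
        assigned[key] = external_matches.get(key) or mint_local_lsid(namespace, key)
    return assigned


# Hypothetical data: name 17 was matched to an external authority, name 18 was not.
confirmed = {17: "urn:lsid:names.example.org:names:101"}
name_lsids = assign_lsids([17, 18], "names", confirmed)
```

Keeping the mapping as a separate table means the original relational keys stay intact while cross references are rewritten.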

5
Creating Domain Ontology
  • Draft Core Ontology
  • Core and BDI ontology
  • Classes and optional relationships between
    classes
  • Extend to Domain Ontology
  • Domain classes inherit from the core classes
  • Extended with additional classes
  • Re-use existing ontologies where possible
  • Specify additional literal properties
  • Where necessary
  • Straightforward for developer
  • For Hexacorallia data
  • Creating RDF triples
  • Manual mapping of relational data to RDF triples
    according to OWL specification
  • Used WASABI mapping extensions and custom code
    for generation
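A manual mapping from a relational row to RDF triples can be sketched as below, with the result serialised as N-Triples. This is not the WASABI mapping or the project's own generation code. The ontology namespace, the class and property names (`TaxonName`, `nameComplete`, `publishedIn`) and the column names are all hypothetical stand-ins for terms a domain ontology would define.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ONT = "http://example.org/hexacorallia-ontology#"  # hypothetical namespace


def row_to_triples(row):
    """Map one row of a (hypothetical) names table to RDF triples."""
    subject = f"urn:lsid:hexacorals.example.org:names:{row['name_id']}"
    return [
        (subject, RDF_TYPE, ("uri", ONT + "TaxonName")),
        (subject, ONT + "nameComplete", ("literal", row["name"])),
        (subject, ONT + "publishedIn", ("uri", row["publication_lsid"])),
    ]


def to_ntriples(triples):
    """Serialise triples as N-Triples, escaping plain string literals."""
    lines = []
    for subject, predicate, (kind, value) in triples:
        if kind == "uri":
            obj = f"<{value}>"
        else:
            escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                            .replace("\n", "\\n").replace("\r", "\\r"))
            obj = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)


row = {
    "name_id": 42,
    "name": "Actinia equina",
    "publication_lsid": "urn:lsid:pubs.example.org:publications:9",
}
output = to_ntriples(row_to_triples(row))
print(output)
```

The publication is referenced by LSID rather than copied into the name record, which is the cross-referencing pattern the project overview describes.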

6
Simulate Authority Providers
7
Convert Existing Provider
  • Convert Existing Thematic Data Provider to use
    existing LSIDs and ontology
  • [Diagram labels: Original data repository;
    Hexacorallia Thematic Provider; Map to ontology;
    Hexacorallia Thematic Triple Store (LSID
    Observation subset); RDF data to be updated with
    LSIDs from authority providers; LSID Match with
    linking tool (Match -> LSID); Authority
    (simulated) Triple Stores for Person, Publication,
    Name, Specimen and Concept; Observation Triple
    Store; LSID Resolution Services]
8
Linking.
9
Configure Provider for Update
10
Linking.
11
Configure the linker
12
Linking.
13
Request Annotations
14
Linking Service
15
Linking Service
  • Determines properties for matching
  • Weights possible matches
  • Returns suggestions to the client
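The three steps of the linking service can be sketched as a weighted comparison over a chosen set of properties. This is only an illustration under stated assumptions: the property names, the weights, the use of `difflib.SequenceMatcher` for string similarity and all sample records are hypothetical, and the presentation does not say how the actual service scores matches.

```python
from difflib import SequenceMatcher

# Hypothetical properties used for matching persons, with hypothetical weights.
WEIGHTS = {"surname": 0.6, "forenames": 0.3, "year_of_birth": 0.1}


def similarity(a, b):
    """Return a 0..1 string similarity; missing values score 0."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()


def suggest_matches(local, candidates, weights=WEIGHTS, limit=3):
    """Score each authority record against the local record, best first."""
    scored = []
    for lsid, record in candidates.items():
        score = sum(w * similarity(local.get(prop), record.get(prop))
                    for prop, w in weights.items())
        scored.append((round(score, 3), lsid))
    scored.sort(reverse=True)
    return scored[:limit]


# Hypothetical local record and authority records.
local_person = {"surname": "Smith", "forenames": "Jane", "year_of_birth": 1950}
authority_people = {
    "urn:lsid:people.example.org:persons:1": {"surname": "Smith", "forenames": "Jane", "year_of_birth": 1950},
    "urn:lsid:people.example.org:persons:2": {"surname": "Smyth", "forenames": "J.", "year_of_birth": 1950},
    "urn:lsid:people.example.org:persons:3": {"surname": "Jones", "forenames": "Robert", "year_of_birth": 1921},
}
suggestions = suggest_matches(local_person, authority_people)
```

The client receives `(score, lsid)` pairs and leaves the final decision to the user, matching the confirm/skip step that follows.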
16
Confirm/Skip Annotations
Suggested match
17
Confirm/Skip Annotations
18
Confirm/Skip Annotations
19
Research Questions
  • How effective is the draft ontology for
    representing existing data sources?
  • Can suitable extensions be easily defined?
  • Straightforward for developer
  • Need independent verification
  • What are the issues for an existing data provider
    to convert their data to using the ontology and
    LSIDs?
  • Replace or annotate existing data
  • If, for example, I replace an author with a
    person LSID, what I get when I resolve the person
    is unlikely to be what I had when I held the
    data for an author.
  • Dependencies between LSIDable objects
  • If you link via a taxon name LSID, the resolved
    name should have an embedded LSID for a
    publication, so there shouldn't be any need (in
    principle) to match publications for names
  • What about authorities that issue LSIDs but
    don't map to other authorities?
  • e.g. name providers not mapping to either
    publication or specimen providers
  • and don't want to!
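The "replace or annotate" choice above can be made concrete with triples held as plain tuples. In this sketch, annotating keeps the original author literal and adds a link to the person LSID, while replacing discards the literal. The predicate names and identifiers are hypothetical.

```python
ONT = "http://example.org/hexacorallia-ontology#"  # hypothetical namespace


def annotate(triples, subject, link_predicate, lsid):
    """Keep the existing data and add a link to the externally resolved object."""
    return triples + [(subject, link_predicate, lsid)]


def replace(triples, subject, predicate, lsid):
    """Drop the existing values for the predicate and point at the LSID instead."""
    kept = [t for t in triples if not (t[0] == subject and t[1] == predicate)]
    return kept + [(subject, predicate, lsid)]


name = "urn:lsid:hexacorals.example.org:names:42"
person = "urn:lsid:people.example.org:persons:1"
original = [(name, ONT + "author", "J. Smith")]

annotated = annotate(original, name, ONT + "authorPerson", person)
replaced = replace(original, name, ONT + "author", person)
```

After `replace`, the string "J. Smith" is only recoverable if the person authority returns it on resolution, which is exactly the concern the slide raises.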

20
Research Questions
  • What support would a linking tool need to provide
    end users?
  • How would users want to process this data
  • How much automation?
  • E.g. above a certain confidence level
  • Would this be trusted?
  • Order of matching
  • E.g. match all instances of persons at once
  • Match of persons by publication?
  • Other Issues
  • Performance of existing linking tool approach
  • Lots of data passing going on
  • Need better batch or one-at-a-time processing
  • Finding authorities that provide linking services
  • How do you find out about authorities with
    linking services?
  • How do you know which ones to use?
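The automation question ("above a certain confidence level") could be explored with a simple triage step: suggestions with exactly one candidate at or above a threshold are accepted automatically, and everything else is queued for the user. The threshold of 0.9, the function name and the sample scores are assumptions for this sketch, not values from the project.

```python
def triage(suggestions, threshold=0.9):
    """Split suggestions into auto-accepted links and items needing review.

    suggestions maps an internal key to a list of (score, lsid) candidates.
    A key is auto-accepted only if exactly one candidate reaches the threshold.
    """
    accepted, review = {}, {}
    for key, candidates in suggestions.items():
        confident = [c for c in candidates if c[0] >= threshold]
        if len(confident) == 1:
            accepted[key] = confident[0][1]
        else:
            review[key] = candidates
    return accepted, review


# Hypothetical suggestion lists for three internal person keys.
pending = {
    "person-1": [(0.97, "urn:lsid:people.example.org:persons:1"),
                 (0.55, "urn:lsid:people.example.org:persons:2")],
    "person-2": [(0.93, "urn:lsid:people.example.org:persons:3"),
                 (0.92, "urn:lsid:people.example.org:persons:4")],
    "person-3": [],
}
accepted, review = triage(pending)
```

Sending ambiguous cases (two strong candidates) to review rather than picking the top score is one way to keep automated links trustworthy, at the cost of more manual work.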

21
Acknowledgements
  • TDWG/Gordon and Betty Moore Foundation