OWL and Existing Thesaurus Migration

1
OWL and Existing Thesaurus - Migration
  • Marjorie M.K. Hlava
  • Standards Implementer
  • Access Innovations, Inc.
  • Data Harmony Software

2
The Main Players
  • Semantic Web standards
  • XML, RDF, and OWL
  • NISO, ISO, BSI
  • Provide the framework
  • Share across the web
  • Platform independent
  • Software independent

3
Thesaurus Standards
  • Z39.19 Monolingual Thesauri
  • ISO 2788 Monolingual
  • ISO 5964 Multilingual
  • A taxonomy is the hierarchical view of a
    thesaurus with information objects attached at
    the appropriate node

4
Data Base Standards
  • Output
  • MARC
  • Format b
  • RDF
  • Markup
  • XML
  • SGML
  • Metadata
  • Dublin Core
  • NFAIS /DOI Minimum data set
  • Dialog default and additional fields

5
Fitting the Pieces Together
  • XML - Rules, Syntax for Structured Documents
  • RDF - Data Framework
  • OWL concept records for the Web

6
eXtensible Markup Language
  • XML syntax
  • Markup
  • Add content to the markup
  • Attributes
  • Elements
  • Tag formats
  • Well formed
  • Transferable data
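
As a minimal sketch of the elements, attributes, and well-formedness points above, the following Python snippet builds a one-term record and checks that a parser accepts it. The element names `term` and `label` and the `lang` attribute are illustrative assumptions, not part of any thesaurus standard.

```python
import xml.etree.ElementTree as ET


def build_term_record(term: str, language: str) -> str:
    """Return a small XML record with one attribute and one child element."""
    record = ET.Element("term", attrib={"lang": language})  # attribute on the element
    ET.SubElement(record, "label").text = term              # content inside the markup
    return ET.tostring(record, encoding="unicode")


def is_well_formed(xml_text: str) -> bool:
    """Treat a document as well formed if the XML parser accepts it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


record_xml = build_term_record("Abdomen", "en")
```

A mismatched closing tag, such as `<term><label>Abdomen</term>`, is rejected by `is_well_formed`, which is what makes the data safely transferable between systems.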

7
Resource Description Framework
  • RDF
  • standard method - simple descriptions
  • RDF semantics - a clear set of rules for
    providing simple descriptive information
  • RDF Schema - a way for those descriptions to be
    combined into a single vocabulary.
  • Self-documenting XML
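
The "simple descriptions" idea can be sketched as subject, predicate, object triples. The snippet below is an illustration only: the `http://example.org/thesaurus#` namespace, the `describe` helper, and the sample comment text are assumptions made for this example, while `rdfs:label` and `rdfs:comment` are standard RDF Schema properties.

```python
from typing import List, NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
BASE = "http://example.org/thesaurus#"  # hypothetical namespace for this sketch


class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI of the property
    obj: str        # literal value (plain string in this sketch)


def describe(term_id: str, label: str, comment: str) -> List[Triple]:
    """Return two simple descriptions (a label and a comment) of one term."""
    subject = BASE + term_id
    return [
        Triple(subject, RDFS + "label", label),
        Triple(subject, RDFS + "comment", comment),
    ]


def to_ntriples(triple: Triple) -> str:
    """Serialize one triple with a plain-literal object as an N-Triples line."""
    escaped = (
        triple.obj.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'<{triple.subject}> <{triple.predicate}> "{escaped}" .'


triples = describe("Abdomen", "Abdomen", "Illustrative scope text.")
lines = [to_ntriples(t) for t in triples]
```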

8
What's an Ontology?
  • Subject- or domain-specific vocabularies.
  • An ontology defines the terms used to describe
    and represent an area of knowledge.
  • Ontologies include computer-usable definitions of
    basic concepts in the domain and the
    relationships among them.
  • They encode knowledge in a domain and also
    knowledge that spans domains.

9
OWL
  • Facilitates greater machine interpretability of
    Web content by providing additional vocabulary
    along with formal semantics.
  • Tags in fielded format
  • The fields add information about how to use that
    term
  • Coordinates terms (pre-coordinate)

10
How OWL
  • OWL uses
  • URIs for naming
  • RDF for description framework
  • Adds capabilities to ontologies
  • Ability to be distributed across many systems
  • Scalability
  • Compatibility with Web standards for
    accessibility and internationalization
  • Openness
  • Extensibility
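
To illustrate "URIs for naming", here is a small sketch that turns a thesaurus term label into a URI. The base address and the underscore-for-space convention are assumptions chosen for the example; a real project would pick its own naming policy.

```python
from urllib.parse import quote


def term_uri(base: str, label: str) -> str:
    """Build a URI for a term by percent-encoding its label as a fragment."""
    fragment = quote(label.strip().replace(" ", "_"), safe="")
    return f"{base}#{fragment}"


# Hypothetical base address, used only for illustration.
uri = term_uri("http://example.org/thesaurus", "Abdominal Neoplasms")
```

Characters that are not safe in a URI, such as an apostrophe, are percent-encoded rather than dropped, so two different labels do not collapse into the same name.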

11
OWL - Web Ontology Language
  • Provides a language
  • OWL Lite
  • 10 properties
  • 43 classes
  • OWL DL and OWL Full
  • 4 additional properties
  • 11 more classes
  • Defined for WWW architecture
  • Semantic Web in particular.

12
Exactly what?
  • OWL properties and classes
  • relations between classes (e.g. disjointness),
  • cardinality (e.g. "exactly one"),
  • equality,
  • richer typing of properties,
  • characteristics of properties (e.g. symmetry),
  • enumerated classes
  • OWL Web Ontology Language Overview
  • http://www.w3.org/TR/2004/REC-owl-features-20040210/

13
So how are they different?
  • Might not be
  • Thesaurus Knowledge domain
  • Different standards
  • Different historical applications
  • Ontologies in web
  • Thesaurus in Data Bases
  • OWL uses qualifiers to disambiguate terms

14
Tell me that in Thesaurus Terms
  • Relations between classes (e.g. disjointness): facets
  • Cardinality (e.g. "exactly one"): instance
  • Equality: USE and USED FOR / synonyms
  • Richer typing of properties: more fields for the user
    to define the terms
  • Characteristics of properties (e.g. symmetry):
    roles and treatment codes
  • Enumerated classes: notation

15
Terms and concepts
  • The stated differences
  • Terms represent a concept
  • Concepts are interrelated
  • Clarity and maintenance challenges
  • Ontologies seek to make it easier.

16
Migrating a thesaurus
  • Digital form (not paper)
  • Understand the construct of the thesaurus
  • Get permission to use it
  • Make it XML with RDF headers
  • Keep the original thesaurus structure
  • BT, RT, PT, notes, etc.
  • Convert the labels to RDF format
  • rdfs:scopeNote
  • Do not keep the node labels; you don't need them
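
The "keep the original thesaurus structure" step can be sketched as reading each record into a structure that preserves its field tags. The flat text layout below (term on the first line, then one `TAG value` pair per line) and the sample values are assumptions for illustration; an actual export format will differ.

```python
from collections import defaultdict


def parse_record(text: str) -> dict:
    """Parse one flat thesaurus record, keeping every field under its tag."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    fields = defaultdict(list)
    for line in lines[1:]:
        tag, _, value = line.partition(" ")  # split at the first space only
        fields[tag].append(value.strip())
    return {"term": lines[0], "fields": dict(fields)}


# Made-up sample record in the assumed layout.
SAMPLE = """
Abdomen
BT Body Regions
RT Abdominal Neoplasms
SN Illustrative scope note.
"""

record = parse_record(SAMPLE)
```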

17
Migrating a thesaurus
  • Try to use the WebOnt DTD for the XML
  • http://www.w3c.org/2001/sw/WebOnt/web-ont-issues.html#I4.3-Structured-Datatypes
  • Keep the original meanings

18
Semantic conversion (?)
  • Augment term to make it linkable and more concise
  • Broader term: owl:TransitiveProperty
  • Related term: owl:SymmetricProperty
  • Preferred term: prefLabel
  • Non-preferred term: altLabels
  • TopConcept is not top terms but special classes
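
What the transitive and symmetric property types buy you can be sketched in plain Python: a reasoner may follow broader-term links upward indefinitely, and may read related-term links in both directions. The function names and sample terms are illustrative assumptions, not an OWL reasoner.

```python
from typing import Dict, Set, Tuple


def transitive_closure(broader: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each term, collect every ancestor reachable through BT links."""
    closure = {}
    for term in broader:
        seen: Set[str] = set()
        stack = list(broader[term])
        while stack:
            parent = stack.pop()
            if parent not in seen:  # guard against cycles and repeats
                seen.add(parent)
                stack.extend(broader.get(parent, ()))
        closure[term] = seen
    return closure


def symmetric_closure(related: Set[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Add the reverse of every RT pair, so A RT B implies B RT A."""
    return related | {(b, a) for a, b in related}


bt = {"Stomach": {"Abdomen"}, "Abdomen": {"Body Regions"}}
ancestors = transitive_closure(bt)
rt = symmetric_closure({("Abdomen", "Abdominal Neoplasms")})
```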

19
In Ontologies
  • Descriptors contain concepts
  • Concepts are a set of terms
  • One concept is a preferred concept of a
    descriptor
  • Each descriptor can have qualifiers indicating
    aspects of a descriptor
  • ABDOMEN -
  • Anatomy
  • Diseases
  • Concepts in one descriptor are hierarchically
    related
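
The descriptor, concept, and term layers described above can be sketched as a small data model. The class and field names are assumptions made for this illustration; only the ABDOMEN example and its two qualifiers come from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Concept:
    terms: List[str]  # a concept is a set of terms


@dataclass
class Descriptor:
    name: str
    preferred: Concept  # the one preferred concept of the descriptor
    other_concepts: List[Concept] = field(default_factory=list)
    qualifiers: List[str] = field(default_factory=list)

    def qualified_headings(self) -> List[str]:
        """Combine the descriptor with each qualifier, e.g. 'ABDOMEN - Anatomy'."""
        return [f"{self.name} - {q}" for q in self.qualifiers]


abdomen = Descriptor(
    name="ABDOMEN",
    preferred=Concept(terms=["Abdomen"]),
    qualifiers=["Anatomy", "Diseases"],
)
headings = abdomen.qualified_headings()
```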

20
NISO versus OWL
  • Concepts versus terms
  • Descriptors as a set of terms vs. a preferred
    term
  • Qualifiers are not used to disambiguate terms;
    use additional properties instead
  • Relations as nodes
  • Relations as arcs

21
Thesaurus fields
  • Hierarchical
  • BT / NT
  • Associative
  • Related Terms (RT)
  • Equivalence
  • Preferred terms (descriptors)
  • Use / Used For
  • Notes
  • Qualifiers and facets

22
How do I make my thesaurus into an Ontology?
  • Map the field labels
  • Add additional data
  • Convert to XML
  • Output the file in RDF instead of Native XML
  • Then it's an ontology
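
The steps above can be sketched end to end with the Python standard library. This is one possible mapping, not the presenter's: treating each term as an `owl:Class`, mapping BT to `rdfs:subClassOf`, and mapping scope notes (SN) to `rdfs:comment` are modelling assumptions, and the `http://example.org/thesaurus#` namespace and record layout are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/thesaurus#"  # hypothetical namespace

# Use readable prefixes in the serialized output.
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def local_name(label: str) -> str:
    """Turn a term label into a simple local name (assumed convention)."""
    return label.strip().replace(" ", "_")


def thesaurus_to_rdf(records: List[dict]) -> str:
    """Map thesaurus fields to RDF/XML: term, BT, and SN only."""
    root = ET.Element(f"{{{RDF}}}RDF")
    for record in records:
        about = {f"{{{RDF}}}about": BASE + local_name(record["term"])}
        cls = ET.SubElement(root, f"{{{OWL}}}Class", about)
        ET.SubElement(cls, f"{{{RDFS}}}label").text = record["term"]
        for parent in record.get("BT", []):
            resource = {f"{{{RDF}}}resource": BASE + local_name(parent)}
            ET.SubElement(cls, f"{{{RDFS}}}subClassOf", resource)
        for note in record.get("SN", []):
            ET.SubElement(cls, f"{{{RDFS}}}comment").text = note
    return ET.tostring(root, encoding="unicode")


rdf_xml = thesaurus_to_rdf([
    {"term": "Abdomen", "BT": ["Body Regions"], "SN": ["Illustrative note."]},
    {"term": "Body Regions"},
])
```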

23
National Cancer Institute

24
NCI Ontology Record

25
NCI RDF DTD

26
Object Property DTD Sample
  • Biochemical_Class_or_Structure" C74 74 and_Drugs_Kind"/ cals_and_Drugs_Kind"/
  • art_of_Chemical_or_Drug" C75 75 and_Drugs_Kind"/ cals_and_Drugs_Kind"/
  • Target_Organism" C73 73 nd"/
  • rdf:ID="rChemical_or_Drug_Has_Target_Anatomy" C71 71 rdf:resource="Chemicals_and_Drugs_Kind"/
  • rdf:ID="rChemical_or_Drug_FDA_Approved_for_Disease" C70 70 rdf:resource="Chemicals_and_Drugs_Kind"/ Kind"/

27
Semantic links
  • se" C79 79 rdf:resource="Organism_Kind"/ rdf:resource="Findings_and_Disorders_Kind"/

28
Implied Attributes
  • rdf:ID="rEO_Models_Human_Disease"
  • C79
  • 79
  • Kind"/

29
Scope notes
  • Concept scope note
  • scopeNote is an rdfs:subPropertyOf rdfs:comment
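
The practical effect of declaring scopeNote a subproperty of rdfs:comment is that any tool that understands rdfs:comment can also read scope notes. A minimal sketch of that inference follows; the `scopeNote` URI under `http://example.org/thesaurus#` is a hypothetical name for this example, and the function is an illustration rather than a full RDFS reasoner.

```python
from typing import Dict, Set, Tuple

RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"
SCOPE_NOTE = "http://example.org/thesaurus#scopeNote"  # hypothetical property URI

Statement = Tuple[str, str, str]


def infer_superproperties(
    triples: Set[Statement], sub_property_of: Dict[str, str]
) -> Set[Statement]:
    """If p is a subproperty of q, every (s, p, o) also implies (s, q, o)."""
    inferred = set(triples)
    for s, p, o in triples:
        visited = {p}
        parent = sub_property_of.get(p)
        while parent is not None and parent not in visited:  # stop on cycles
            inferred.add((s, parent, o))
            visited.add(parent)
            parent = sub_property_of.get(parent)
    return inferred


facts = {("ex:Abdomen", SCOPE_NOTE, "Illustrative scope note.")}
result = infer_superproperties(facts, {SCOPE_NOTE: RDFS_COMMENT})
```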

30
Conclusion
  • OWL is a standard to watch
  • Implementation requires knowledge of UNIX / Linux
  • Conversion from a current thesaurus is not
    difficult
  • The base knowledge for applying it requires a new
    way of looking at concepts

31
Stay tuned for updates
  • Marjorie M.K. Hlava
  • President
  • Access Innovations Inc
  • Data Harmony Software
  • 505-998-0800
  • www.accessinn.com
  • www.dataharmony.com
  • mhlava_at_accessinn.com