Title: XML technologies
1 The ISIS Facilities Ontology and OntoMaintainer
Louisa Casely-Hayford and Shoaib Sufi
2 ISIS: a CCLRC Neutron and Muon Facility
- ISIS is the world's leading pulsed neutron and muon source, situated at the CCLRC Rutherford Appleton Laboratory. It supports an international community of around 1600 scientists in a range of scientific disciplines.
- ISIS currently produces about 700GB of combined neutron and muon data each year, and this figure is set to rise with the addition of a new target station.
- The ISIS Metadata Catalogue (ICAT) is a twenty-year back catalogue of experiments conducted at ISIS. It contains approximately 3GB of metadata, which references 3TB of data.
- To maximise the value of the data produced at the ISIS facility, the data must be fully searchable.
- To address this problem, e-Science is developing numerous software solutions, and ontologies are seen as one useful approach.
3 Why Are Ontologies a Useful Solution?
- Ontologies offer a powerful means to formally express the nature of a domain.
- They share a common understanding of the structure of information among people.
- They enable reuse of domain knowledge.
- They make domain assumptions explicit.
- They provide central controlled vocabularies that can be integrated into catalogues, databases, web publications and knowledge management applications.
4 Why Ontologies?
- ICAT currently contains over 10,000 keywords, describing experiments, that are used to index experimental studies.
- However, keyword indexing is seen as a limited method, because these free-text keywords:
- Have no context.
- Carry a great deal of meaning that is implicit to ISIS users.
- Are hard for non-experts to map to terms used by facilities in the same domain, and harder still to map to those outside it.
- E.g. the keyword HRPD denotes a powder diffractometer to ISIS users; other collaborating neutron facilities, such as SNS at ORNL, would understand "powder diffractometer" but not the cryptic "HRPD".
5 Why Are Ontologies a Useful Solution?
- Creating ontologies at ISIS will help map the concrete manifestations of familiar terms within one domain, as well as related concepts in different domains.
- This will make it easier to search data by category and to group data under keywords across studies.
- This could aid cross-facility searching of related scientific data from the various scientific facilities housed at CCLRC, e.g. CLF and DL.
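The term mapping described on this slide can be illustrated with a minimal Python sketch. It assumes a lookup table from facility-local instrument keywords to a shared concept name. Only the HRPD entry reflects the slides; "FacilityB", "PD1", "REFL1", the concept names and the placeholder studies are hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical mapping from (facility, local keyword) to a shared concept.
# Only HRPD -> powder diffractometer comes from the slides; every other
# entry is an invented placeholder.
LOCAL_TO_CONCEPT = {
    ("ISIS", "HRPD"): "PowderDiffractometer",
    ("FacilityB", "PD1"): "PowderDiffractometer",  # placeholder
    ("FacilityB", "REFL1"): "Reflectometer",       # placeholder
}


@dataclass(frozen=True)
class Study:
    facility: str
    instrument: str  # local keyword as stored in a catalogue
    title: str


def search_by_concept(studies, concept):
    """Return the studies whose local instrument keyword maps to `concept`."""
    return [
        s for s in studies
        if LOCAL_TO_CONCEPT.get((s.facility, s.instrument)) == concept
    ]


STUDIES = [
    Study("ISIS", "HRPD", "Hydrazinium"),
    Study("FacilityB", "PD1", "placeholder study A"),
    Study("FacilityB", "REFL1", "placeholder study B"),
]

# A user who only knows the generic term still finds both studies.
hits = search_by_concept(STUDIES, "PowderDiffractometer")
for study in hits:
    print(study.facility, study.instrument, study.title)
```

The search is keyed on the shared concept rather than the local keyword, so a user does not need to know that one facility calls its powder diffractometer HRPD.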
6 Building an Ontology
- Defining terms in a domain and the relations between them:
- Defining concepts in the domain (classes)
- Arranging the concepts in a hierarchy (subclass-superclass hierarchy)
- Defining which attributes and properties (slots) classes can have, and the constraints on their values
- Defining individuals
- Involves collaboration between domain experts and ontology builders.
- Ontologies are expressed in a formal language and developed within an editing environment.
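The modelling steps listed on this slide (classes, a subclass-superclass hierarchy, slots with value constraints, individuals) can be sketched in plain Python. This is not how Protégé or OWL represent an ontology; it is a toy model under stated assumptions. The class names `Instrument`, `Diffractometer` and `PowderDiffractometer` and the slot `locatedAt` are hypothetical; only the fact that HRPD is a powder diffractometer at ISIS comes from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class OntClass:
    """A concept in the domain, optionally placed under a superclass."""
    name: str
    parent: "OntClass | None" = None
    slots: dict = field(default_factory=dict)  # slot name -> required type

    def ancestors(self):
        """Names of all superclasses, nearest first."""
        chain, cls = [], self.parent
        while cls is not None:
            chain.append(cls.name)
            cls = cls.parent
        return chain

    def all_slots(self):
        """Slots declared here plus those inherited from superclasses."""
        merged = dict(self.parent.all_slots()) if self.parent else {}
        merged.update(self.slots)
        return merged


@dataclass
class Individual:
    """An instance of a class; slot values are checked on creation."""
    name: str
    cls: OntClass
    values: dict = field(default_factory=dict)

    def __post_init__(self):
        allowed = self.cls.all_slots()
        for slot, value in self.values.items():
            if slot not in allowed:
                raise ValueError(f"{self.cls.name} has no slot {slot!r}")
            if not isinstance(value, allowed[slot]):
                raise TypeError(f"{slot!r} expects {allowed[slot].__name__}")


# Steps 1-3: classes, hierarchy, slots (all names are illustrative).
instrument = OntClass("Instrument", slots={"locatedAt": str})
diffractometer = OntClass("Diffractometer", parent=instrument)
powder = OntClass("PowderDiffractometer", parent=diffractometer)

# Step 4: an individual.
hrpd = Individual("HRPD", powder, {"locatedAt": "ISIS"})
print(hrpd.name, "is a", [powder.name] + powder.ancestors())
```

Checking slot values when the individual is created is one simple way to express "constraints on their values". A formal ontology language states such constraints declaratively instead.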
7 A Protégé-OWL Ontology
- Classes
- Individuals
- Properties
A class is a concept in the domain, e.g. a class of People, a class of Pets or a class of Countries. A class is a collection of elements with similar properties. Individuals are instances of classes: America can be an instance of the class Country.
[Diagram: Class Country (Italy, America, England), Class Person (Gemma, Mathew) and Class Pet (Fluffy, Fido), with individuals linked by the properties livesIn, hasSibling and hasPet]
8 Building of the ISIS Facilities Ontology
- The ISIS Facilities Ontology is based on keywords in the ISIS Metadata Catalogue (ICAT).
- ICAT houses over 10,000 keywords, many of which are synonyms.
- Keywords in ICAT were grouped into 5 main categories:
- Datafile name
- Instrument
- Investigation title
- Investigator
- Year
- Examples of keywords in these five categories are:
- HRP00145.RAW - a datafile name.
- HRPD - a High Resolution Powder Diffractometer, one of the many instruments used in experiments at the ISIS facility.
- Hydrazinium - an investigation title; chemical names and compounds were used as investigation titles of experiments in ICAT.
- JINR (Joint Institute for Nuclear Research) - the name of an investigator.
- 1986 - the year in which a particular experiment was conducted.
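The slides do not say how the keywords were sorted into the five categories. As a purely illustrative sketch, the Python below assigns the five example keywords using simple heuristics. The rules, the lookup sets `KNOWN_INSTRUMENTS` and `KNOWN_INVESTIGATORS`, and the fallback to "Investigation title" are all assumptions, not the authors' method.

```python
import re

# Assumed lookup lists; a real catalogue would need far larger ones.
KNOWN_INSTRUMENTS = {"HRPD"}
KNOWN_INVESTIGATORS = {"JINR"}


def categorise(keyword):
    """Assign a keyword to one of the five categories (heuristic sketch)."""
    kw = keyword.strip()
    if re.fullmatch(r"(19|20)\d{2}", kw):
        return "Year"
    if kw.upper().endswith(".RAW"):
        return "Datafile name"
    if kw.upper() in KNOWN_INSTRUMENTS:
        return "Instrument"
    if kw.upper() in KNOWN_INVESTIGATORS:
        return "Investigator"
    # Assumption: anything unrecognised is treated as an investigation title.
    return "Investigation title"


for kw in ["HRP00145.RAW", "HRPD", "Hydrazinium", "1986", "JINR"]:
    print(f"{kw} -> {categorise(kw)}")
```

Heuristics like these break down for synonyms and ambiguous terms, which is part of why the grouping benefits from input by domain experts.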
9 ISIS Facilities Ontology Hierarchy
10 ISIS Facilities Ontology
[Diagram: the individual "Protein Crystallography GroupExperiment" (Class CrystallographyGroupExperiment, Class ISISExperiment) linked by properties to individuals of other classes]
- hasTitle: Hydrazinium (Class InvestigationTitle)
- wasConductedIn: 1986 (Class Year)
- hasInvestigator: Pete Jones (Class Investigator)
- hasUsedInstrument: HRPD (Class Instrument)
- hasDataFileName: HRP00145.RAW (Class DataFile)
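The relationships in this diagram can be written as subject-property-object triples and queried. The Python sketch below assumes that each property links the experiment individual to the similarly named value (for example, hasUsedInstrument to HRPD) and that CrystallographyGroupExperiment is a subclass of ISISExperiment. Both pairings are inferred from the diagram labels rather than stated in the slides, and the `type`/`subClassOf` names and helper functions are illustrative.

```python
EXPERIMENT = "Protein Crystallography GroupExperiment"

# Triples inferred from the diagram labels (an assumption, see lead-in).
TRIPLES = [
    (EXPERIMENT, "type", "CrystallographyGroupExperiment"),
    ("CrystallographyGroupExperiment", "subClassOf", "ISISExperiment"),
    (EXPERIMENT, "hasTitle", "Hydrazinium"),
    (EXPERIMENT, "wasConductedIn", "1986"),
    (EXPERIMENT, "hasInvestigator", "Pete Jones"),
    (EXPERIMENT, "hasUsedInstrument", "HRPD"),
    (EXPERIMENT, "hasDataFileName", "HRP00145.RAW"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the given subject/property/object pattern.

    A value of None acts as a wildcard for that position.
    """
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def experiments_using(triples, instrument):
    """Subjects that are linked to `instrument` via hasUsedInstrument."""
    return [s for s, _, _ in match(triples, p="hasUsedInstrument", o=instrument)]


print(experiments_using(TRIPLES, "HRPD"))
print(match(TRIPLES, s=EXPERIMENT, p="wasConductedIn"))
```

Unlike a flat keyword list, each value here carries its context: "1986" is reachable only through wasConductedIn, so it cannot be confused with, say, part of a datafile name.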
12 Further Application of Ontologies to ICAT - ISIS Online Proposal System
- Scientists can submit applications for beamtime at ISIS through an online application form, known as the ISIS Online Proposal System.
- ICAT (the metadata catalogue) not only holds the 20-year back catalogue of data, but will also hold data from approved proposals and data generated from experiments conducted at ISIS.
- Three separate modular ontologies, for Sample, Investigator and Experiment, are being developed to mark up the Proposal System.
- These ontologies are partly based on the Proposal System database schema.
13 Sample, Investigator and Experiment Ontologies
[Diagram: the Sample, Experiment and Investigator ontologies]
14 OntoMaintainer
- Consensus on the concepts modelled in the ISIS Facilities Ontology was achieved through a series of interviews with domain experts.
- During the design and creation process, it was difficult to share current versions of the ontology with our collaborators at ISIS.
- This is because, to view the hierarchical structure of the ontology, scientists would have to download and install Protégé locally.
- The Ontology Maintainer was developed so that the community can view current versions of the ontology remotely.
15 Screenshot of OntoMaintainer
16 Benefits of OntoMaintainer
- It is easily accessible because it is available over the web.
- It allows domain experts to contribute towards the maintenance of the ontologies.
- It encourages collaboration between domain experts (scientists) and ontology builders by allowing members of the community to be involved in the development and maintenance of ontologies.
- It makes collaboration between domain experts and ontology builders more efficient.
17 Future Work
- Completion of the Sample, Investigator, Experiment and ISIS Facilities ontologies.
- The ontologies will be used to mark up the ICAT back catalogue and newly approved studies submitted through the Online Proposal System.
- The Ontology Maintainer will be improved by adding properties, so that relationships between individuals in classes can be shown.
- A graphical view of the total hierarchy of the ontology will be added to the user interface of the Ontology Maintainer.
- The tree hierarchy will be made more dynamic through automatic updating of classes.
- A mapping tool is currently being created to map between the newly created ontologies and existing databases.
18 Conclusion
- Ontologies will help maximise the value of data collected at ISIS and other CCLRC facilities by improving access to, navigation of and reuse of data.
- Ontologies will facilitate the mapping of terms across CCLRC facilities, which will allow cross-facility searching. E.g. external users will be able to search for all experiments carried out across CCLRC using a powder diffractometer (instrument), even if they do not know the local names of the specific instruments.
- The OntoMaintainer will facilitate creating and maintaining ontologies by providing a means of getting feedback directly from domain experts. It will improve the social aspect of building ontologies by allowing all members of the community to provide input.
- Major challenges: scope, modularity and integration of ontologies.
19 References
- Protégé web site: http://protege.stanford.edu (documentation, user's guide, tutorial, protege-discussion mailing list, ontology library)
- CO-ODE: http://www.co-ode.org/
20 Questions