Title: A Multi-Discipline Metadata Registry for Science Interoperability
1. A Multi-Discipline Metadata Registry for Science Interoperability
J. Steven Hughes/JPL - steve.hughes_at_jpl.nasa.gov
Daniel J. Crichton/JPL - daniel.crichton_at_jpl.nasa.gov
Jason J. Hyon/JPL - jason.hyon_at_jpl.nasa.gov
Sean C. Kelly/UTA - sean.kelly_at_jpl.nasa.gov
Open Forum on Metadata Registries, January 17-21, 2000, Santa Fe, New Mexico
2. A Multi-Discipline Metadata Registry for Science Interoperability
- Background
- Problem Statement
- System Overview
- Profile Development
- Conclusion and Issues
3. Background
- NASA's Office of Space Science
- Planetary Science
- Planetary Data System (PDS)
- 5 science discipline nodes
- 2 support nodes
- 1 central node
- Heterogeneous domains - short term missions
- Astrophysics
- Astrophysics Data System
- 100s to 1000s of nodes
- Homogeneous domains - long term missions
- Space Physics
- Space Physics Data System
- Several identified nodes
4. Background
- Planetary Data System (PDS)
- Archives essentially all science data from solar system exploration missions
- Prototype - 1986, Operational - 1990
- Publishes archive quality products
- Well-defined standards architecture
5. Background: Planetary Science Standards Architecture
6. Background
- Planetary Science Data Dictionary
- 1000 Data Elements spanning Planetary Science disciplines
- Nomenclature Standard
- Meaning, type, ranges, enumerated values
- Planetary Science Data Model
- Developed as Planetary Science enterprise E/R model
- Planetary Science Entities - Spacecraft, Instruments
- Science Data Entities - Data Products, Projections, ...
- Data Organization Entities - Volumes
- Management Entities - Nodes, Personnel
- Implemented as the PDS Data Set Catalog in an RDBMS
- Distributed in Object Description Language
7. Background
- Challenge
- Develop a single interface for locating space science data.
- Provide data system interoperability.
- Support correlative science.
8. Problem Statement
Space scientists cannot easily locate or use
data across the hundreds if not thousands of
autonomous, heterogeneous, and distributed data
systems currently in the Space Science community.
- Heterogeneous Systems
- Data Management - RDBMS, ODBMS, home-grown DBMS, binary files
- Platforms - UNIX, Linux, Win3.x/9x/NT, Mac, VMS, ...
- Interfaces - Web, Windows, Command Line
- Data Formats - HDF, CDF, NetCDF, PDS, FITS, VICR, ASCII, ...
- Data Volume - KiloBytes to TeraBytes
- Heterogeneous Disciplines
- Moving targets and stationary targets
- Multiple coordinate systems
- Multiple data object types (images, cubes, time series, spectra, tables, binary, document)
- Multiple interpretations of single object types
- Multiple software solutions to the same problem.
- Incompatible and/or missing metadata
9. Proposed Solution
- Encapsulate individual data systems. (Hide uniqueness.)
- Communicate using metadata that describe resources
- Data (e.g. data sets, images)
- non-Data (e.g. catalogs, services)
- Enable interoperability based on metadata compatibility.
- Refocus problem on metadata development.
10. Proposed Solution (cont.)
- Object Oriented Data Technology Task (OODT)
- Domain independent data management infrastructure
- Domain independent data structures
- XML - Standard interchange language
- Metadata management
- Resource profile
- Message passing
- Domain independent system infrastructure
- CORBA for interoperability between computer systems and languages
- Message passing to simplify interface design
- Standardized reusable server components
11. System Overview: Object Oriented Data Technology Framework
[Figure: OODT framework diagram; surviving label: PDS Systems]
12. System Overview: Profile Service
- Profile describes a resource
- Available datasets and products
- Types of resources and where they're located
- Optionally references other profile servers
[Figure labels: Data system 1, Data system 2]
13. System Overview: Query Service
- Knows how to crawl through servers to produce a result
- Crawls through profiles to discover other profiles and product servers
- Crawls through product servers to display available products
- Accessible through CORBA API or through web browser
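The crawl described on this slide can be pictured as a graph traversal. The following is only a minimal sketch, assuming a hypothetical in-memory table of profile servers (`PROFILE_SERVERS`) in place of the CORBA-accessible servers; the server names, the `kind` and `refers_to` keys, and the `crawl` function are all illustrative and not part of OODT.

```python
from collections import deque

# Hypothetical stand-in for a network of profile servers. Each profile names
# the resource it describes and, optionally, other servers worth visiting.
PROFILE_SERVERS = {
    "central": [
        {"resource": "PDS_DIS", "kind": "profile", "refers_to": ["imaging", "rings"]},
    ],
    "imaging": [
        {"resource": "viking_orbiter_images", "kind": "product", "refers_to": []},
    ],
    "rings": [
        {"resource": "ring_occultations", "kind": "product", "refers_to": ["central"]},
    ],
}


def crawl(start, servers):
    """Breadth-first walk over profile servers; returns product resources found."""
    seen = {start}
    queue = deque([start])
    products = []
    while queue:
        server = queue.popleft()
        for profile in servers.get(server, []):
            if profile["kind"] == "product":
                products.append(profile["resource"])
            for other in profile["refers_to"]:
                if other not in seen:  # guard against reference cycles
                    seen.add(other)
                    queue.append(other)
    return products


print(crawl("central", PROFILE_SERVERS))
# ['viking_orbiter_images', 'ring_occultations']
```

The `seen` set matters because profile servers may reference each other, as "rings" does with "central" here.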
14. Profile Development: Objective
- Objective
- Design and develop a domain generic structure that will capture the metadata necessary for identifying and locating science data resources across distributed heterogeneous data systems.
- Result
- Profile - A resource description (subset of
meta-model) sufficient to determine if the
resource might resolve a query.
15. Profile Development: Approach
- Choose a common interchange format.
- Develop a domain generic language.
- Implement domain specific instances.
- Model the domain.
- Capture the metadata.
- Develop system to manage the results.
16. Profile Development: Choose a common interchange format
- XML
- eXtensible Markup Language
- More expressive than HTML
- Simpler than SGML
- A meta-language used to define domain languages.
- XSIL - eXtensible Scientific Interchange Language.
- XIL - Instrument control language.
- Wide acceptance as an interchange format.
- Electronic data interchange (EDI) standard.
17. Profile Development: Develop a domain generic language
- Define a generic structure (XML DTD) that can describe heterogeneous domain-specific resources.
- Profile - A resource description with sufficient information to determine if the resource satisfies a query.
- Profile elements
- name, syntax, unit, value_instance, meaning, alias, ...
- encodes selected domain attributes and their values specific to this resource
- Resource attributes - id, title, discipline, location_id, ...
- Profile attributes - id, title, desc, type, data_dictionary_id, ...
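To make the idea of "sufficient information to determine if the resource satisfies a query" concrete, here is a minimal sketch of how a profile might be held in memory and tested against a query. The class names `ProfileElement` and `Profile`, the method names, and the all-elements-must-match rule are assumptions made for illustration; the slides do not specify the matching logic.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProfileElement:
    name: str
    syntax: str = "ENUMERATION"
    unit: str = "N/A"
    values: set = field(default_factory=set)  # enumerated value instances
    minimum: Optional[float] = None           # used for range-valued elements
    maximum: Optional[float] = None

    def may_match(self, wanted):
        """True if `wanted` is an enumerated value or falls inside the range."""
        if self.values:
            return wanted in self.values
        if self.minimum is not None and self.maximum is not None:
            return self.minimum <= wanted <= self.maximum
        return False


@dataclass
class Profile:
    resource_id: str
    location_id: str
    elements: dict = field(default_factory=dict)  # element name -> ProfileElement

    def might_resolve(self, query):
        """Assumed rule: every queried element must be present and may match."""
        return all(
            name in self.elements and self.elements[name].may_match(value)
            for name, value in query.items()
        )


# Values taken from the PDS Distributed Inventory System example in this deck.
atlas = Profile(
    resource_id="PDS_DIS_V1.3.n",
    location_id="http://pds.jpl.nasa.gov/pdsbrows.htm",
    elements={
        "TARGET_NAME": ProfileElement("TARGET_NAME", values={"IDA", "JUPITER"}),
        "DATA_OBJECT_TYPE": ProfileElement("DATA_OBJECT_TYPE", values={"IMAGE"}),
    },
)

print(atlas.might_resolve({"TARGET_NAME": "JUPITER", "DATA_OBJECT_TYPE": "IMAGE"}))  # True
print(atlas.might_resolve({"TARGET_NAME": "VENUS"}))  # False
```

A real query service might instead treat an element missing from the profile as "unknown" rather than as a failed match; that choice is left open here.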
18. Profile Development: Develop a domain generic language (prof.dtd)
<!ELEMENT PROFILES (PROFILE)>
<!ELEMENT PROFILE (PROFILE_ATTRIBUTES, RESOURCE)>
<!ATTLIST PROFILE PROFILE_ID CDATA #REQUIRED>
<!ELEMENT PROFILE_ATTRIBUTES (ID, TITLE, DESC, TYPE, STATUS_ID,
    SECURITY_TYPE, PARENT_ID, CHILD_ID,
    REVISION_NOTE, DATA_DICTIONARY_ID)>
<!ELEMENT RESOURCE (RESOURCE_ATTRIBUTES, PROFILE_ELEMENT)>
<!ELEMENT RESOURCE_ATTRIBUTES (RESOURCE_ID, RESOURCE_TITLE,
    RESOURCE_DISCIPLINE, RESOURCE_AGGREGATION,
    RESOURCE_CLASS, RESOURCE_LOCATION_ID,
    RESULT_MIME_TYPE)>
<!ELEMENT PROFILE_ELEMENT (ELEMENT_NAME, ELEMENT_MEANING, ELEMENT_ALIAS,
    VALUE_SYNTAX, VALUE_UNIT,
    (VALUE_INSTANCE (MINIMUM_VALUE, MAXIMUM_VALUE)))>
19. Profile Development: Profile Example - PDS Distributed Inventory System
<PROFILE PROFILE_ID="PROFILE_PDS_DIS_V1.3.n">
  <PROFILE_ATTRIBUTES>
    <ID> PROFILE_PDS_DIS_V1.3.n </ID>
    <TITLE> Planetary Data System - Distributed Inventory System - Profile V1.0 </TITLE>
    <DESC> This profile describes the Planetary Data System (PDS) Distributed Inventory System (DIS) ...
    <TYPE> PROFILE </TYPE>
    <DATA_DICTIONARY_ID> OODT_PDS_DATA_SET_DD_V1.0 </DATA_DICTIONARY_ID>
  </PROFILE_ATTRIBUTES>
  <RESOURCE>
    <RESOURCE_ATTRIBUTES>
      <RESOURCE_ID> PDS_DIS_V1.3.n </RESOURCE_ID>
      <RESOURCE_TITLE> Planetary Data System - Distributed Inventory System </RESOURCE_TITLE>
      <RESOURCE_DISCIPLINE> PDS </RESOURCE_DISCIPLINE>
      <RESOURCE_AGGREGATION> GRANULE </RESOURCE_AGGREGATION>
      <RESOURCE_CLASS> INVENTORY </RESOURCE_CLASS>
      <RESOURCE_LOCATION_ID> http://pds.jpl.nasa.gov/pdsbrows.htm </RESOURCE_LOCATION_ID>
      <RESULT_MIME_TYPE> text/html </RESULT_MIME_TYPE>
    </RESOURCE_ATTRIBUTES>
    ...
20. Profile Development: Profile Example (cont.) - PDS Distributed Inventory System
    <PROFILE_ELEMENT>
      <ELEMENT_NAME> DATA_OBJECT_TYPE </ELEMENT_NAME>
      <ELEMENT_MEANING> The data_object_type element provides the type ...
      <VALUE_SYNTAX> ENUMERATION </VALUE_SYNTAX>
      <VALUE_UNIT> N/A </VALUE_UNIT>
      <VALUE_INSTANCE> IMAGE </VALUE_INSTANCE>
      ...
    </PROFILE_ELEMENT>
    <PROFILE_ELEMENT>
      <ELEMENT_NAME> DATA_SET_NAME </ELEMENT_NAME>
      <ELEMENT_MEANING> The data_set_name element identifies a PDS data set. -- example ...
      <VALUE_SYNTAX> ENUMERATION </VALUE_SYNTAX>
      <VALUE_UNIT> N/A </VALUE_UNIT>
      <VALUE_INSTANCE> VO1/VO2 MARS VISUAL IMAGING SUBSYSTEM DIGITAL ...
      <VALUE_INSTANCE> VO2 MARS RADIO SCIENCE SUBSYSTEM RESAMPLED LOS ...
    </PROFILE_ELEMENT>
    <PROFILE_ELEMENT>
      <ELEMENT_NAME> TARGET_NAME </ELEMENT_NAME>
      <ELEMENT_MEANING> The target_name element provides the names of the targets ...
      <ELEMENT_ALIAS> ADS.OBJECT_ID </ELEMENT_ALIAS>
      <VALUE_SYNTAX> ENUMERATION </VALUE_SYNTAX>
      <VALUE_UNIT> N/A </VALUE_UNIT>
      <VALUE_INSTANCE> IDA </VALUE_INSTANCE>
      <VALUE_INSTANCE> JUPITER </VALUE_INSTANCE>
      ...
    </PROFILE_ELEMENT>
  </RESOURCE>
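As an illustration of how a client might read such a profile, the sketch below parses a trimmed, well-formed version of the TARGET_NAME element using Python's standard `xml.etree.ElementTree`. The sample document is abbreviated from the slide (elided parts are simply omitted so that it parses), and the helper name `element_values` is an assumption, not an OODT function.

```python
import xml.etree.ElementTree as ET

# Abbreviated, well-formed sample based on the profile example in this deck.
PROFILE_XML = """
<PROFILE PROFILE_ID="PROFILE_PDS_DIS_V1.3.n">
  <RESOURCE>
    <RESOURCE_ATTRIBUTES>
      <RESOURCE_ID> PDS_DIS_V1.3.n </RESOURCE_ID>
      <RESOURCE_CLASS> INVENTORY </RESOURCE_CLASS>
    </RESOURCE_ATTRIBUTES>
    <PROFILE_ELEMENT>
      <ELEMENT_NAME> TARGET_NAME </ELEMENT_NAME>
      <ELEMENT_ALIAS> ADS.OBJECT_ID </ELEMENT_ALIAS>
      <VALUE_SYNTAX> ENUMERATION </VALUE_SYNTAX>
      <VALUE_INSTANCE> IDA </VALUE_INSTANCE>
      <VALUE_INSTANCE> JUPITER </VALUE_INSTANCE>
    </PROFILE_ELEMENT>
  </RESOURCE>
</PROFILE>
"""


def element_values(xml_text):
    """Map each ELEMENT_NAME to the list of its VALUE_INSTANCE strings."""
    root = ET.fromstring(xml_text)
    result = {}
    for elem in root.iter("PROFILE_ELEMENT"):
        name = elem.findtext("ELEMENT_NAME", default="").strip()
        # Values on the slide are padded with spaces, so strip them.
        result[name] = [(v.text or "").strip() for v in elem.findall("VALUE_INSTANCE")]
    return result


print(element_values(PROFILE_XML))
# {'TARGET_NAME': ['IDA', 'JUPITER']}
```

Note that `ElementTree` checks only well-formedness; validating against prof.dtd would need a validating parser.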
21. Profile Development: Develop a domain generic language
- Specialize the profile class
- Profile - One profile to one resource (e.g. inventory)
- Inventory - One profile to many resources (e.g. data set, image)
- Minimized profile element attributes
- no meanings
- subsets of preferred values
- Dictionary - One profile to one discipline
- Maximized profile element attributes
- aliases, meanings
- union of all preferred values
22. Profile Development: Develop a domain generic language
- Profile element hierarchy
- Dictionary - Planetary Science Data Dictionary
- data elements - union of all data elements in all profiles
- preferred values - union of all data element values
- e.g. TARGET_NAME = ADRASTEA, ..., VENUS
- Profile - Planetary Image Atlas - Viking, Galileo, MPF, ...
- data elements - union of all data elements for all entities managed by resource
- preferred values - union of data element values
- e.g. TARGET_NAME = MARS, DEIMOS, PHOBOS, JUPITER, ...
- Inventory - Viking Orbiter Image Catalog
- data elements - data elements associated with inventory item.
- preferred values - data element values for inventory item.
- e.g. TARGET_NAME = MARS, DEIMOS, PHOBOS
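The hierarchy above amounts to taking set unions as one moves from inventory to profile to dictionary. A minimal sketch follows, assuming each level is represented as a mapping from element name to a set of preferred values; the second inventory entry and the `merge_values` helper are hypothetical and only illustrate the union.

```python
# Preferred values per inventory. The Viking values come from the slide;
# the second inventory is a hypothetical example for illustration.
INVENTORY_VALUES = {
    "Viking Orbiter Image Catalog": {"TARGET_NAME": {"MARS", "DEIMOS", "PHOBOS"}},
    "Galileo image catalog (hypothetical)": {"TARGET_NAME": {"JUPITER", "IDA"}},
}


def merge_values(children):
    """Union the preferred values of each data element across child levels."""
    merged = {}
    for child in children:
        for name, values in child.items():
            merged.setdefault(name, set()).update(values)
    return merged


# Profile level: union over all inventories managed by the resource.
atlas_profile = merge_values(INVENTORY_VALUES.values())
print(sorted(atlas_profile["TARGET_NAME"]))
# ['DEIMOS', 'IDA', 'JUPITER', 'MARS', 'PHOBOS']
```

Applying the same function to a list of profiles would give the dictionary level, so every inventory's values remain a subset of the level above it.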
23. Profile Development: Implement domain specific instances
- Apply domain generic language to specific domain.
- E.g. Space/Earth Science data and other resources.
- Model the domain
- Data Dictionary
- Data Model
- Capture the metadata
- Extracted from domain metadata repository
24. Profile Development: Implement domain specific instances - Inventory Example - PDS Data Set
<RESOURCE>
  <RESOURCE_ATTRIBUTES>
    <RESOURCE_ID> VO1/VO2-M-VIS-5-DIM-V1.0 </RESOURCE_ID>
    <RESOURCE_TITLE> VO1/VO2 MARS VISUAL IMAGING SUBSYSTEM DIGITAL IMAGING MODEL ...
    <RESOURCE_DISCIPLINE> PDS </RESOURCE_DISCIPLINE>
    <RESOURCE_AGGREGATION> GRANULE </RESOURCE_AGGREGATION>
    <RESOURCE_CLASS> DATA </RESOURCE_CLASS>
    <RESOURCE_LOCATION_ID> http://pds.jpl.nasa.gov/cgi-bin/pdsserv.pl?OBJECT_ID=PDS100676 ...
    <RESULT_MIME_TYPE> text/html </RESULT_MIME_TYPE>
  </RESOURCE_ATTRIBUTES>
  <PROFILE_ELEMENT>
    <ELEMENT_NAME> DATA_SET_NAME </ELEMENT_NAME>
    <VALUE_INSTANCE> VO1/VO2 MARS VISUAL IMAGING SUBSYSTEM DIGITAL IMAGING MODEL ...
  </PROFILE_ELEMENT>
  <PROFILE_ELEMENT>
    <ELEMENT_NAME> DATA_OBJECT_TYPE </ELEMENT_NAME>
    <VALUE_INSTANCE> IMAGE </VALUE_INSTANCE>
  </PROFILE_ELEMENT>
  <PROFILE_ELEMENT>
    <ELEMENT_NAME> TARGET_NAME </ELEMENT_NAME>
    <VALUE_INSTANCE> MARS </VALUE_INSTANCE>
  </PROFILE_ELEMENT>
  <PROFILE_ELEMENT>
    <ELEMENT_NAME> VOLUME_ID </ELEMENT_NAME>
    <VALUE_INSTANCE> VO_2001 </VALUE_INSTANCE>
    ...
    <VALUE_INSTANCE> VO_2014 </VALUE_INSTANCE>
  </PROFILE_ELEMENT>
</RESOURCE>
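Since inventory entries are said to be extracted from a domain metadata repository, a generator along the following lines could produce them. This is a sketch under assumptions: the `record` dictionary stands in for a row pulled from the PDS Data Set Catalog, and `resource_from_record` is a hypothetical helper, not the tool the authors used.

```python
import xml.etree.ElementTree as ET


def resource_from_record(record):
    """Build a RESOURCE element from a catalog record (hypothetical layout)."""
    resource = ET.Element("RESOURCE")
    attrs = ET.SubElement(resource, "RESOURCE_ATTRIBUTES")
    for tag in ("RESOURCE_ID", "RESOURCE_DISCIPLINE",
                "RESOURCE_AGGREGATION", "RESOURCE_CLASS"):
        ET.SubElement(attrs, tag).text = record[tag]
    # Inventory entries carry only element names and values (no meanings).
    for name, values in record["elements"].items():
        element = ET.SubElement(resource, "PROFILE_ELEMENT")
        ET.SubElement(element, "ELEMENT_NAME").text = name
        for value in values:
            ET.SubElement(element, "VALUE_INSTANCE").text = value
    return resource


# Stand-in for a row extracted from the metadata repository.
record = {
    "RESOURCE_ID": "VO1/VO2-M-VIS-5-DIM-V1.0",
    "RESOURCE_DISCIPLINE": "PDS",
    "RESOURCE_AGGREGATION": "GRANULE",
    "RESOURCE_CLASS": "DATA",
    "elements": {"DATA_OBJECT_TYPE": ["IMAGE"], "TARGET_NAME": ["MARS"]},
}

xml_text = ET.tostring(resource_from_record(record), encoding="unicode")
print(xml_text)
```

Building the tree with `ElementTree` rather than string concatenation keeps the output well-formed and escapes any special characters in the values.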
25. Conclusion: Profile Development - Review
- Choose a common interchange format. (XML)
- Develop a domain generic language. (X2PL)
- (XML eXtensible Profile Language)
- Implement domain specific instances. (Resource Profiles)
- Develop system to manage the profiles. (Profile Servers)
26. Conclusion: Issues
- Develop space science metadata registry
- 10 high level concepts - Anchor Points
- Complete development of discipline registries
- Determine management policy
- Design meta-model and mandate conformance
- Evolve meta-model through voluntary conformance
- Determine space science metadata standards
- NASA Data Entity Dictionary Specification Language (DEDSL - XML syntax) currently being used