Title: Publishing and Resource Discovery with Registries
1. Publishing and Resource Discovery with Registries
THE US NATIONAL VIRTUAL OBSERVATORY
- Ray Plante
- Gretchen Greene
2. All about Registries
- Overview of the Registry Framework
- Publishing to the NVO
- VOResource: Resource Metadata in XML
- IVOA Standard Registry Interface
- Exercise: query registry using standard interface
- Exercise: register resources in a registry
3. The role of Resource Registries
- Used to discover and locate resources (data and services) that can be used in a VO application
- Resource: anything that is describable and identifiable
  - Besides data and services: organizations, projects, software, ...
  - Presently concerned with a simple set of resource types
- Registry: a list of resource descriptions
  - Expressed as structured metadata to enable automated processing and searching
4. An Overview of Data Discovery
- You can search the main NVO registry to find resources based on descriptive criteria
- NVO Registries are coarse-grained
  - You can find organizations, archives, catalogs
  - Won't find images, celestial objects, table records
- Registry framework contains multiple registries
  - searchable registries
  - publishing registries
5-10. Registry Framework
[Figure: registry framework diagram, built up step by step over slides 5-10. Components: VO Projects, Data Centers, Specialized Portals Services, Full Searchable Registry, Local Searchable Registry, Client Applications. Connections added in turn: harvest (pull), cross-harvest, selective harvesting, search queries.]
11. NVO Public Registries
Registry                   URL                                            Searchable?  Publishing?
STScI/JHU NVO Registry     http://nvo.stsci.edu/voregistry/               Yes          Yes
Caltech Carnivore          http://nvo.caltech.edu:8080/carnivore/         Yes          Yes
NCSA Registration Portal   http://nvo.ncsa.uiuc.edu/nvoregistration.html  No           Yes
Private Publishing Registries
- HEASARC
- CDS
- Only support harvesting protocol
12. Overview of Publishing
- Resources are published if one can use NVO facilities to find them.
- How to Publish to the NVO
  - http://us-vo.org/pub/files/PublishHowto.html
- Multiple layers of publishing
  - Starts with registry description of resource
  - Data Access Services
  - Incremental exposure for incremental effort
- Who are you? How you publish depends on what you want to publish.
  - An individual with a small data collection
  - An archive center
  - Someone with a cool service
13. Small collections: VO-ready Repositories
- Repositories that allow users to deposit data to share with community
  - Guarantee long-term storage, availability
  - Automatic support for VO publishing mechanisms
    - Entries into NVO Registry
    - Support for standard services: Cone Search, SIA, SSA, SkyNode
- Currently available Repositories
  - Images: NCSA Astronomy Digital Image Library http://adil.ncsa.uiuc.edu/
  - Spectra: Spectrum Service for the VO http://voservices.net/spectrum/
- Part of an emerging data-preservation effort
  - Focusing on processed products associated with published results
  - Collaboration between NVO, journal publishers, and the library community
  - Goals
    - data publishing integrated into the journal publishing process
    - data stored in distributed repositories run by academic libraries
    - fully VO compliant
14. Persistent Archives: Tools for Federation
- Registering your resources with a VO publishing registry
  - Enter description into registration form at one of the available NVO registries
    - STScI/JHU Registry: http://nvo.stsci.edu/voregistry/
    - NCSA Registration Portal: http://nvo.ncsa.uiuc.edu/nvoregistration.html
    - Caltech Carnivore: http://nvo.caltech.edu:8080/carnivore/
  - If you have a large number of resources to register, you can run your own registry on your own site
    - Caltech Carnivore: http://nvo.caltech.edu:8080/carnivore/
15. Caution: Construction Ahead
- IVOA Standard Registry Interface
  - Will come on-line this fall world-wide
  - As part of this upgrade, NVO will unify publishing interfaces
    - It won't matter which NVO registry you register with
    - Improved support for all types of resources
  - Will affect how users express advanced, constraint-based searches.
- In general, this presentation describes the registries and publishing in terms of the new standards
- Your feedback is valuable!
  - Publishing GUI
  - The publishing process
  - Client interfaces
16. Persistent Archives: Tools for Federation
- What can/should you register?
  - Should: your Organization
    - Declares you as a publisher with an ID
  - Should: your Data Collection
    - Users will at least know how to access it via a browser
  - Can: your existing services
    - Browser-based services, e.g. search page
    - Traditional CGI services
    - Web Services
- The next level
  - Implement and register one or more standard services
    - Cone Search
    - Simple Image Access
    - Simple Spectral Access
    - SkyNode (newest service standard)
17. Cool Services: Integrating with the VO
- Register your service at a registry
  - Currently
    - Can register a generic Web Service
    - If service doesn't fall into supported categories, register it as a generic Service
  - Improved support for non-standard services coming
    - Feel free to let us know where the forms are inadequate
- Integrate support for standard VO formats, schemas
  - FITS and VOTable
  - Standard Data Model schemas (emerging)
    - VOResource, Space-time Coordinates, Spectra
- Implement Standard Support Interface
  - a standard in development for self-description, tracking health and usage
18. A word about Identifiers
- IVOA Identifier: a globally-unique URI identifying a resource
  - Ex: ivo://adil.ncsa/targeted/SIA
  - Required as part of a registered resource description
  - As publisher, you control what it looks like
- Two components
  - Authority ID, e.g. adil.ncsa
    - Defines a namespace for identifiers
    - Owned by a single publishing organization
  - Resource Key, e.g. targeted/SIA
    - Name for the resource unique within the namespace
    - Encourage re-use of local identifiers
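The two-component structure above can be sketched in a few lines of Python. This is a minimal illustration, not part of any NVO toolkit: the names `IvoaId` and `parse_ivoa_id` are hypothetical, and the sketch assumes the identifier follows the `ivo://authority/key` shape of the example on the slide.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class IvoaId(NamedTuple):
    authority: str      # namespace owned by a single publishing organization
    resource_key: str   # name unique within that namespace (may be empty)


def parse_ivoa_id(identifier: str) -> IvoaId:
    """Split an ivo:// identifier into its authority ID and resource key."""
    parts = urlsplit(identifier)
    if parts.scheme != "ivo" or not parts.netloc:
        raise ValueError(f"not an IVOA identifier: {identifier!r}")
    # The path starts with "/"; the resource key is everything after it.
    return IvoaId(parts.netloc, parts.path.lstrip("/"))


ident = parse_ivoa_id("ivo://adil.ncsa/targeted/SIA")
print(ident.authority)     # adil.ncsa
print(ident.resource_key)  # targeted/SIA
```

Because the publisher controls the resource key, an existing local identifier can be appended to the authority ID unchanged.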
19. Resource Metadata XML Schema
- Classes of Resources
  - Generic Resource
  - Extensions, e.g.
    - Organisation, DataCollection, Service, DataService, CatalogService, Registry, ...
- Organized into separate schemas
  - Core resource metadata: VOResource
  - Various extension schemas containing specific types
- Capable of describing
  - Data centers, research organizations, missions, observatories
  - Data collections, archives
  - VO standard services: Cone Search, Simple Image Access, Simple Spectral Access, SkyNode
  - Existing Browser/CGI-based services
  - Web Services
20. Resource Metadata: Services
- Service resource records extend the core by adding capability metadata
  - capability: the interfaces/protocols and behavior supported by the service
  - Each standard protocol is considered a different capability
  - A service can support several capabilities (e.g. ConeSearch and SkyNode)
- There are associated standard capability metadata extensions for standard protocols
  - For Simple Image Access, can state
    - Maximum number of records returned
    - Maximum query region
    - Whether returned images are cutouts or static images
- Capability metadata includes a description of the service interface
  - All interface descriptions include a service or access URL
  - For Web Services, access URL is usually sufficient
  - For REST-like interfaces, the inputs can be described in more detail
- Capability model allows description of support for different versions of protocol standards
21-25. Sample Resource Description: adilsia.xml
    <Resource xsi:type="vs:CatalogService"
              created="2003-06-23T19:02:32" updated="2004-04-05T17:07:22"
              namespace definitions >
      <validationLevel validatedBy="ivo://nvo.ncsa/registry">2</validationLevel>
      <title>NCSA Astronomy Digital Image Library Simple Image Access</title>
      <shortName>ADIL</shortName>
      <identifier>ivo://adil.ncsa/targeted/SIA</identifier>
      <curation>
        <publisher ivo-id="ivo://rai.ncsa/RAI">NCSA Radio Astronomy Imaging</publisher>
        <creator>
          <name>contributing authors</name>
          <logo>http://adil.ncsa.uiuc.edu/images/adilfooter.gif</logo>
        </creator>
        <date>2002-01-01</date>
        <contact>
          <name>Dr. Raymond Plante</name>
          <email>adil@ncsa.uiuc.edu</email>
        </contact>
      </curation>
      <content>
        <description>
          This allows searching for ADIL images via the SIA protocol.
        </description>
        <referenceURL>http://adil.ncsa.uiuc.edu/</referenceURL>
        <type>Archive</type>
        <contentLevel>University</contentLevel>
        <contentLevel>Research</contentLevel>
      </content>
Annotations (added one per slide):
- The specific class of resource (xsi:type)
- Metadata quality rating (validationLevel)
- Identity Metadata: what we call it (title, shortName, identifier)
- Curation Metadata: who is responsible (curation)
- Content Metadata: what it contains (content)
26-28. Sample Resource Description: adilsia.xml
      <capability xsi:type="sia:SimpleImageAccess"
                  standardID="ivo://ivoa.net/std/SIA">
        <validationLevel validatedBy="ivo://">2</validationLevel>
        <interface xsi:type="vs:ParamHTTP" role="std">
          <accessURL use="base">
            http://adil.ncsa.uiuc.edu/cgi-bin/voimquery?survey=f&amp;
          </accessURL>
          <queryType>GET</queryType>
          <resultType>application/xml+votable</resultType>
        </interface>
        <imageServiceType>Pointed</imageServiceType>
        <maxQueryRegionSize>
          <long>360.0</long>
          <lat>180.0</lat>
        </maxQueryRegionSize>
        ...
      </capability>
      <coverage>
        <stc:STCResourceProfile>
          <stc:AstroCoordSystem id="UTC-ICRS-TOPO"
                                xlink:href="ivo://STClib/CoordSys#UTC-ICRS-TOPO"
                                xlink:type="simple"/>
          <stc:AstroCoordArea coord_system_id="UTC-ICRS-TOPO">
            <stc:AllSky/>
          </stc:AstroCoordArea>
        </stc:STCResourceProfile>
        <waveband>Radio</waveband>
        <waveband>Millimeter</waveband>
        <waveband>Infrared</waveband>
        <waveband>Optical</waveband>
        <waveband>UV</waveband>
      </coverage>
    </Resource>
Annotations (added one per slide):
- The specific class of capability (xsi:type)
- Capability Metadata: what it can do (capability)
- Coverage Metadata: how the data covers the sky, time, frequency (coverage)
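To show how a client might read the identity, curation, content, and capability metadata from such a record, here is a minimal Python sketch using only the standard library. It runs on a trimmed stand-in for adilsia.xml: the element namespaces are omitted and only the `xsi` prefix is declared, which is a simplification for illustration, and the function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Simplified stand-in for adilsia.xml (assumption: no element namespaces).
RECORD = """\
<Resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:type="vs:CatalogService">
  <title>NCSA Astronomy Digital Image Library Simple Image Access</title>
  <shortName>ADIL</shortName>
  <identifier>ivo://adil.ncsa/targeted/SIA</identifier>
  <curation>
    <publisher ivo-id="ivo://rai.ncsa/RAI">NCSA Radio Astronomy Imaging</publisher>
  </curation>
  <content>
    <contentLevel>University</contentLevel>
    <contentLevel>Research</contentLevel>
  </content>
  <capability xsi:type="sia:SimpleImageAccess">
    <interface xsi:type="vs:ParamHTTP" role="std">
      <accessURL use="base">http://adil.ncsa.uiuc.edu/cgi-bin/voimquery?survey=f&amp;</accessURL>
    </interface>
  </capability>
</Resource>
"""


def summarize(xml_text: str) -> dict:
    """Pull the main metadata groups out of a simplified resource record."""
    root = ET.fromstring(xml_text)
    xsi_type = f"{{{XSI}}}type"  # ElementTree spells xsi:type as {namespace}type
    return {
        "resource_class": root.get(xsi_type),
        "identifier": root.findtext("identifier"),
        "short_name": root.findtext("shortName"),
        "publisher": root.findtext("curation/publisher").strip(),
        "content_levels": [e.text for e in root.findall("content/contentLevel")],
        # One (capability class, access URL) pair per capability element.
        "capabilities": [
            (cap.get(xsi_type), cap.findtext("interface/accessURL").strip())
            for cap in root.findall("capability")
        ],
    }


for key, value in summarize(RECORD).items():
    print(f"{key}: {value}")
```

A real record carries namespace-qualified elements in places, so production code would need to pass a namespace map to `find` and `findall`.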
29. IVOA Standard Registry Interface
- Harvesting Interface
  - Used by registries to exchange resource descriptions.
  - Defined as a profile on the Open Archives Initiative (OAI) harvesting standard
- Search Interface
  - How client applications discover resources
  - 5 operations
    - getIdentity: returns VOResource description of registry
    - getResource: returns the VOResource description for a given identifier
    - keywordSearch: returns all descriptions that contain words from a given set
    - search: returns all descriptions that match a set of specific constraints
    - xquerySearch (optional): XQuery-based searching
  - For end users, many of the details of these operations may be hidden behind user-oriented tools and libraries
    - Possible exception: expressing constraint-based searches
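Since the harvesting interface is a profile on OAI, a harvest request is an ordinary OAI-PMH URL. The Python sketch below only builds such a URL and makes no network call. The base URL is a placeholder, the function name is hypothetical, and the `ivo_vor` metadata prefix and `ivo_managed` set name are taken from the IVOA Registry Interfaces specification rather than from these slides, so they are worth checking against the version a given registry implements.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str, since: Optional[str] = None,
                     managed_only: bool = True) -> str:
    """Build an OAI-PMH ListRecords request for VOResource records."""
    params = {
        "verb": "ListRecords",          # standard OAI-PMH verb
        "metadataPrefix": "ivo_vor",    # assumed prefix for VOResource records
    }
    if managed_only:
        # Assumed set name: records the registry itself publishes.
        params["set"] = "ivo_managed"
    if since:
        # OAI-PMH "from" argument: only records changed since this date.
        params["from"] = since
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint; a real registry advertises its own harvesting URL.
print(list_records_url("http://registry.example.org/oai", since="2006-01-01"))
```

Passing a `since` date is what makes repeated harvesting incremental: the harvester asks only for records updated after its last pull.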
30. Advanced, constraint-based searching
- Placing constraints on values of specific metadata
- Expressed as an ADQL where clause
  - e.g. title like '%Deep Field%' or shortName='HDF'
- A field or column name is expressed as a simple XPath to the element being constrained
  - Relative to the root Resource element
  - Composed of /s and element names only
    - predicates and special characters (., .., //) are not allowed
  - Must point to a primitive value, e.g. contains a string
  - Can point to an attribute by preceding name with @
- (curation/publisher/@ivo-id='ivo://ned.ipac' or curation/publisher like '%IPAC%') and content/contentLevel='Research' and capability/validationLevel > 3
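The rules for column names can be checked mechanically. The following Python sketch is one possible reading of them, not the registry's own parser: the function name `is_simple_path` is hypothetical, and two details are assumptions, namely that a leading `/` is rejected (paths are relative to the root Resource element) and that an attribute may appear only as the last step.

```python
import re

# An XML name without a colon, e.g. "publisher" or "ivo-id".
_NC = r"[A-Za-z_][\w.\-]*"
# A name with an optional prefix, e.g. "xsi:type".
_QNAME = rf"{_NC}(?::{_NC})?"
# Zero or more "name/" steps, then a final name that may be an attribute (@).
_PATH = re.compile(rf"(?:{_QNAME}/)*@?{_QNAME}")


def is_simple_path(path: str) -> bool:
    """True if path uses only '/' separators and names, with an optional final @attribute."""
    return _PATH.fullmatch(path) is not None


for candidate in [
    "title",
    "curation/publisher/@ivo-id",
    "capability/validationLevel",
    "curation//publisher",            # '//' is not allowed
    "../title",                       # '..' is not allowed
    "capability[1]/validationLevel",  # predicates are not allowed
]:
    print(f"{candidate!r}: {is_simple_path(candidate)}")
```

The first three candidates pass and the last three are rejected, matching the restrictions listed on the slide.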
31. Advanced, constraint-based searching
- Searching on xsi:type
  - Avoid including prefix label in xsi:type constraint
    - @xsi:type like '%CatalogService'
    - capability/@xsi:type like '%SimpleImageAccess'
  - @xsi:type like '%Service'
    - Matches Service, DataService, CatalogService
- Selecting based on coverage
  - It is generally not useful to apply ADQL constraints to Space-Time Coordinate (STC) metadata
    - e.g. anything under stc:STCResourceProfile
  - STC descriptions are complex and not sufficiently unique
  - Emerging footprint services will facilitate selection based on coverage
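The advice to leave the prefix label out of an xsi:type constraint is easier to see with a small emulation of `like`. This Python sketch assumes SQL-style wildcards (`%` for any run of characters, `_` for exactly one); the helper name `like` and the prefix labels in the sample values are illustrative only, since a record may use any prefix for the same schema.

```python
import re


def like(value: str, pattern: str) -> bool:
    """Emulate SQL-style LIKE: % matches any run of characters, _ exactly one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


# Illustrative xsi:type values; the prefixes are not guaranteed to be the same everywhere.
types = ["vr:Service", "vs:DataService", "vs:CatalogService", "vr:Organisation"]

# A trailing match ignores whatever prefix label the record happens to use.
print([t for t in types if like(t, "%Service")])

# Without a wildcard, the comparison is exact and the prefix gets in the way.
print(like("vs:CatalogService", "CatalogService"))
```

The first print lists the three service types and omits the organisation; the second prints False because the stored value includes a prefix.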