The SDMX Registry Model

1
The SDMX Registry Model
  • April 2, 2009
  • Arofan Gregory
  • Open Data Foundation

2
Background
  • SDMX provides a number of standards and
    guidelines which support the standard exchange of
    statistics
  • Standard models/XML/EDIFACT formats for data
  • Standard models/XML formats for metadata
  • Standard architecture based on a set of registry
    services
  • Guidelines for the use of standard statistical
    concepts across domain boundaries
  • Framework for establishing domain standards
    within each statistical domain

3
SDMX Registries
  • This talk focuses on the SDMX Registry Services
  • These are key to fully automating statistical
    discovery and exchange
  • They are the primary means of enhancing
    visibility and discovery of data and metadata
    within statistical communities
  • They are designed to provide a connection point
    between SDMX and other related standards

4
Existing Problems
  • Duplication of effort
  • There is a lot of duplicative work within
    statistics, because there is little awareness of
    other data collection within specific areas
  • This is wasteful
  • Even with a large amount of public statistical
    data available on the Internet, it is difficult
    to find good data with good metadata
  • This impacts end-users (researchers, students,
    journalists) more than policy makers with
    dedicated access to the data
  • Using existing data can be difficult
  • Too many formats; too much emphasis on Web-site
    presentation (as opposed to download)
  • Too little metadata for existing data sets
  • Difficult or impossible to combine data from
    different sources
  • Access to data sources is difficult or impossible
    (not even the documentation is accessible)
  • Understanding concepts and definitions can be
    challenging; this impacts comparability of data

5
The Case for Infrastructure Support
  • New standards allow for broader visibility and
    re-use of data and metadata
  • Produces greater transparency
  • Produces higher quality and efficiency in data
    access through automation
  • Domains cannot be governed by individual
    organizations
  • The mission of most organizations is too narrow
    (even international ones)
  • This is the role of governments, supra-national
    initiatives, and public-private consortia
  • Most public data is paid for by the taxpayers
  • But they are the least well served for their
    investment

6
Emerging Solutions
  • Web-services technology can deal with many of the
    generic problems inherent in distributing data
    sources and applications around the Internet
  • Standards such as DDI, SDMX, and ISO/IEC 11179
    provide specific models and formats for use
    within the domains of statistics and research
  • SDMX provides a powerful registry model for
    establishing a research infrastructure
  • Designed to integrate with/support use of many
    other related standards (DDI, ISO 11179, METS,
    XBRL, etc.)
  • SDMX registry tools are available free and as
    open source today

7
How do the SDMX Registry Services Work?
  • An SDMX Registry (that is, an implementation of
    the standard registry services) provides a number
    of things to applications:
  • A repository of metadata about the structures and
    concepts of data and metadata sets
  • A repository of information about who provides
    what data and metadata to whom
  • Helps to manage data across a broad network
  • A registry of available data and metadata sets in
    standard formats
  • Lists all information to find and use standard
    data and metadata throughout a community network
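The three roles listed above can be pictured with a small in-memory model. This is only an illustrative sketch, not the SDMX registry interface itself: the class names, method names, identifiers and URL are all hypothetical, and the rule that a registration is rejected unless its structure and provider are already known is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Registration:
    """One registered data or metadata set (hypothetical shape)."""
    provider: str       # who provides the set
    structure_id: str   # which structure definition the set follows
    url: str            # where the set can be retrieved


@dataclass
class ToyRegistry:
    """Hypothetical in-memory stand-in for the three stores described above."""
    structures: dict = field(default_factory=dict)      # structural metadata
    provisioning: dict = field(default_factory=dict)    # provider -> structure ids
    registrations: list = field(default_factory=list)   # index of available sets

    def submit_structure(self, structure_id, concepts):
        self.structures[structure_id] = list(concepts)

    def submit_provisioning(self, provider, structure_id):
        self.provisioning.setdefault(provider, set()).add(structure_id)

    def register(self, reg):
        # Assumption for this sketch: reject sets with an unknown structure
        # or from a provider that is not provisioned for that structure.
        if reg.structure_id not in self.structures:
            raise ValueError(f"unknown structure: {reg.structure_id}")
        if reg.structure_id not in self.provisioning.get(reg.provider, set()):
            raise ValueError(f"{reg.provider} is not provisioned for {reg.structure_id}")
        self.registrations.append(reg)

    def query(self, structure_id):
        # Return the URLs of all registered sets that follow the structure.
        return [r.url for r in self.registrations if r.structure_id == structure_id]


toy = ToyRegistry()
toy.submit_structure("EXT_DEBT", ["COUNTRY", "PERIOD", "VALUE"])
toy.submit_provisioning("BIS", "EXT_DEBT")
toy.register(Registration("BIS", "EXT_DEBT", "https://example.org/bis/debt.xml"))
print(toy.query("EXT_DEBT"))  # ['https://example.org/bis/debt.xml']
```

The point of the sketch is that the registry stores pointers (URLs) and descriptions, while the data sets themselves stay at the provider's site.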

8
SDMX Registry/Repository
  • SDMX Registry Interfaces (diagram)
  • REGISTRY, Data Set/Metadata Set (interfaces:
    Register, Query): indexes data and metadata
  • REPOSITORY, Provisioning Metadata (interfaces:
    Submit, Query): describes data and metadata
    sources and reporting processes
  • REPOSITORY, Structural Metadata (interfaces:
    Submit, Query): describes data and metadata
    structures

9
SDMX Registry/Repository
  • SDMX Registry Interfaces (diagram)
  • REGISTRY, Data Set/Metadata Set (interfaces:
    Register, Query): indexes data and metadata
  • Subscription/Notification: applications can
    subscribe to notification of new or changed
    objects
  • REPOSITORY, Provisioning Metadata (interfaces:
    Submit, Query)
  • REPOSITORY, Structural Metadata (interfaces:
    Submit, Query): describes data and metadata
    structures

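Subscription/notification can be sketched as a plain callback mechanism. The class and method names below are hypothetical and do not come from the SDMX specification; the example only shows the idea that subscribers hear about new or changed objects and nothing else.

```python
from collections import defaultdict


class NotifyingRegistry:
    """Hypothetical registry that announces new or changed objects."""

    def __init__(self):
        self._objects = {}                      # (kind, object id) -> content
        self._subscribers = defaultdict(list)   # object kind -> callbacks

    def subscribe(self, kind, callback):
        self._subscribers[kind].append(callback)

    def submit(self, kind, object_id, content):
        key = (kind, object_id)
        if key not in self._objects:
            event = "new"
        elif self._objects[key] != content:
            event = "changed"
        else:
            return None  # identical resubmission: nothing to announce
        self._objects[key] = content
        for callback in self._subscribers[kind]:
            callback(event, object_id)
        return event


received = []
notifier = NotifyingRegistry()
notifier.subscribe("registration", lambda event, oid: received.append((event, oid)))

notifier.submit("registration", "debt-2009Q1", "https://example.org/debt-q1.xml")
notifier.submit("registration", "debt-2009Q1", "https://example.org/debt-q1-v2.xml")
notifier.submit("structure", "EXT_DEBT", "COUNTRY,PERIOD,VALUE")  # no subscriber for this kind
print(received)  # [('new', 'debt-2009Q1'), ('changed', 'debt-2009Q1')]
```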
10
Deploying SDMX Registry Services Within Domains
  • It is anticipated that each organization leading
    a statistical domain will deploy a set of
    registry services to support exchanges within
    that domain
  • This is also possible within national statistical
    systems and individual organizations
  • It is possible to have generic, public
    registries as well
  • This model has not been widely explored
  • SDMX-type registries within research domains also
    make sense
  • To supplement existing data archives and RDCs
  • Lowers the cost of development of research
    infrastructure significantly
  • Huge increase in visibility of and access to data
    and sourcing information

11
The Old JEDH (Joint External Debt Hub) Site
  • Diagram: BIS, IMF, OECD and World Bank supply
    data (various formats) to the website
  • 3-month production cycle

12
JEDH with SDMX
  • Diagram: BIS, IMF, OECD and World Bank each
    provide data as SDMX-ML
  • Info about data is registered in the SDMX
    Registry
  • The SDMX Agent discovers data and URLs through
    the registry and retrieves data from the sites
  • SDMX-ML is loaded into the JEDH DB
  • Data is provided in real time to the JEDH Site
  • The diagram also labels a debtor database
    (SDMX-ML)

13
SDMX in Action: Prototype System
  • Flow of FAO CountrySTAT-RegionSTAT Implementation
  • Diagram: CountrySTAT National Publication
    Server(s), FAO SDMX Registry and RegionSTAT
    Regional Publication Server, connected by steps
    1, 2, 3a, 3b and 4
Slide courtesy of the FAO

14
Prototype System Explanation
  • CountrySTAT National Publication Server
      • The web site is published from the files in
        CountrySTAT
  • SDMX Publication
      • The new CountrySTAT files are converted to
        SDMX-ML data sets and made web accessible on
        the CountrySTAT web site
      • These files are registered in the FAO SDMX
        Registry
  • RegionSTAT Regional Publication Server
      • Queries the registry for new registrations;
        the registry responds with registration
        details, including the URL of the new data
        sets
      • Retrieves the new data sets from the
        CountrySTAT web site
      • Converts the SDMX-ML files to an internal
        format and integrates the new data sets with
        existing RegionSTAT data sets
      • Re-publishes the RegionSTAT web site
  • The step labels 1, 2, 3a, 3b and 4 on this slide
    refer to the flow diagram on the previous slide

Slide courtesy of the FAO
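The regional server's side of this flow (ask the registry what is new, then fetch each data set from the URL it returns) can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, the dictionary shape of a registration and the example URLs are made up, and a dictionary lookup stands in for the HTTP download of an SDMX-ML file.

```python
def harvest(registrations, fetch, already_seen):
    """Return {url: payload} for registrations not processed before.

    registrations: iterable of dicts with a "url" key, as a registry query
                   might return them (hypothetical shape)
    fetch:         callable taking a URL and returning the data set content
    already_seen:  set of URLs handled in earlier runs (updated in place)
    """
    new_data = {}
    for reg in registrations:
        url = reg["url"]
        if url in already_seen:
            continue  # data set was integrated in an earlier run
        new_data[url] = fetch(url)
        already_seen.add(url)
    return new_data


# Stand-ins for the registry response and the national web site (both made up).
fake_site = {
    "https://example.org/countrystat/crops.xml": "<crops/>",
    "https://example.org/countrystat/prices.xml": "<prices/>",
}
registry_response = [{"url": u} for u in fake_site]
seen = {"https://example.org/countrystat/crops.xml"}

result = harvest(registry_response, fake_site.__getitem__, seen)
print(sorted(result))  # ['https://example.org/countrystat/prices.xml']
```

Converting the retrieved files to an internal format and re-publishing the site would follow this step; they are left out because the slides do not describe those formats.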
15
Federation of SDMX Registries
  • SDMX uses a selective approach to replication of
    resources found inside domain SDMX registries
  • Each domain registry can become a recognized user
    in other domain registries
  • Subscription/notification can drive real-time
    replication of registry metadata around the
    network
  • With a coordinated hub registry, a more formal
    registry network could be established
  • This would require no extension to existing
    technologies
  • This would require a major feat of organization
    (!)
  • This is a very light federation mechanism
  • Other, more intensive schemes have failed in
    other technology domains (UDDI, etc.)
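Selective replication driven by subscription can be illustrated with two toy registries. Everything here is hypothetical (class, method and category names); the sketch only shows one registry acting as a subscriber of another and copying just the objects it has chosen to replicate.

```python
class DomainRegistry:
    """Hypothetical domain registry holding metadata objects by id."""

    def __init__(self, name):
        self.name = name
        self.metadata = {}     # object id -> (category, content)
        self._listeners = []   # callbacks registered by recognized users

    def subscribe(self, listener):
        self._listeners.append(listener)

    def submit(self, object_id, category, content):
        self.metadata[object_id] = (category, content)
        for listener in self._listeners:
            listener(object_id, category, content)

    def replicate_from(self, other, wanted_categories):
        """Subscribe to `other`, copying only objects in wanted_categories."""
        def on_change(object_id, category, content):
            if category in wanted_categories:
                # Store directly rather than via submit() so that replicated
                # objects are not re-announced around the network.
                self.metadata[object_id] = (category, content)
        other.subscribe(on_change)


agriculture = DomainRegistry("agriculture")
trade = DomainRegistry("trade")
trade.replicate_from(agriculture, wanted_categories={"prices"})

agriculture.submit("CROP_PRICES", "prices", "structure v1")
agriculture.submit("LIVESTOCK", "herds", "structure v1")
print(sorted(trade.metadata))  # ['CROP_PRICES']
```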

16
SDMX Registries and Other Standards
  • The SDMX Registry Services are designed to
    support related standards
  • SDMX reference metadata reports can provide
    links to metadata and data in other standard
    formats
  • Allows for indexing of needed metadata fields
    from other standards within the SDMX registry
    natively
  • Can provide access to native non-SDMX formatted
    XML resources (DDI, Dublin Core, METS, XBRL,
    etc.)
  • Benefits include:
  • Clarifying data and metadata ownership issues
  • Making sourcing transparent by linking aggregates
    to source data/metadata
  • Providing capabilities which are typically not
    available today to support comparison
    (integration with ISO/IEC 11179 metadata
    registries for dealing with terminology issues,
    etc.)

17
Clarification
  • Not all registries are the same
  • UDDI and ebXML registries are much more generic
    in purpose, and compatible with SDMX
  • ISO/IEC 11179 Metadata Registries are not
    mechanistic web-services registries
  • They are specialized repositories of metadata
    around semantics, concepts and terminology
  • These are compatible with, not duplicative of,
    SDMX registry technology
  • ISO/IEC 11179 could be implemented as an SDMX
    registry (!)

18
ODaF Vision - Standards
  • Diagram components:
  • Federated Registries (based on SDMX, ebXML, web
    services)
  • Aggregated Data/Metadata (SDMX)
  • ISO 11179 Semantic definitions
  • Standard classifications
  • ISO 19115 Geographies
  • DDI Microdata Sets
  • XBRL Business Reports
  • METS Packaging
  • Dublin Core Citations
  • Relationship labels in the diagram: "registered",
    "Organized using", "Used in", "References to
    source data"