1
SDMX Information Model
  • Pedagogical Explanation
  • Arofan Gregory and Chris Nelson, Metadata
    Technology Ltd.
  • OECD SDMX Expert Group Meeting, Geneva, April 6-7,
    2006

2
Data Set
3
We have a dataset, what do we need to know?
  • Its structure
  • Who reports it
  • How a specific data set fits into the overall
    collection framework and which organisation is
    responsible for reporting which parts
  • The reporting schedule
  • That it has been reported

4
Data Set Structure
5
Data Set Structure
  • Computers need the structure of the data
  • Concepts and terms
  • Code lists
  • Data values
  • How these fit together

6
Structural Definitions
7
Data Makes Sense
SA,B,1,1999-06-30 16547
8
Data Set Structure
  • Comprises
  • Concepts that identify the observation value
  • Concepts that add additional metadata about the
    observation value
  • Concept that is the observation value
  • Any of these may be
  • coded
  • text
  • date/time
  • number
  • etc.

9
Data Set Structure
  • Dimensions
  • Attributes
  • Measure
  • Representation
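
As a rough illustration of how dimensions, attributes and the measure fit together, the sketch below reads the slide 7 example as the key SA,B,1,1999-06-30 with the observation value 16547. Where the key ends and the value begins is an assumption, and so are the dimension names DIM_1 to DIM_3 and the OBS_STATUS attribute: the slides do not say what SA, B and 1 stand for.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical dimension identifiers; only the time dimension is evident
# from the slide 7 example.
DIMENSIONS = ["DIM_1", "DIM_2", "DIM_3", "TIME_PERIOD"]


@dataclass
class Observation:
    key: dict         # dimensions: concepts that identify the observation value
    value: float      # measure: the concept that is the observation value
    attributes: dict  # attributes: additional metadata about the value


def parse_observation(key_text, value_text, **attributes):
    """Split a comma-separated key and apply each concept's representation."""
    parts = key_text.split(",")
    if len(parts) != len(DIMENSIONS):
        raise ValueError(f"expected {len(DIMENSIONS)} key values, got {len(parts)}")
    key = dict(zip(DIMENSIONS, parts))
    # Representation: the time dimension is a date, the measure is a number.
    key["TIME_PERIOD"] = date.fromisoformat(key["TIME_PERIOD"])
    return Observation(key=key, value=float(value_text), attributes=attributes)


obs = parse_observation("SA,B,1,1999-06-30", "16547", OBS_STATUS="A")
print(obs.key["TIME_PERIOD"], obs.value, obs.attributes)
```

Without the structure definition the string is just characters; with it, each position has a concept and a representation.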

10
Data Set Structure
dimension
dimension
attribute
attribute
dimension
dimension
dimension
measure
11
Data Structure Definition
Key
Group Key
Dimensions
Attributes
Measures
Representation
Concept
12
Data Set Publishing/Reporting
  • Publishing data sets and collecting data sets is
    a process
  • As a process it must have metadata that enables
    organisations to control it
  • what data is it
  • who publishes it
  • who collects it
  • when is it published/reported

13
[Diagram: Structure Definition, Data Flow, Data Set, Provision Agreement, Data Provider. A Data Flow can get data from multiple data providers; a Data Provider can provide data for many data flows using an agreed data structure.]
  • The data flow is the artefact that contains
    metadata about the provision of data
  • In a data reporting scenario the data flow is
    defined by the data collector, and there can be
    many data providers reporting data for the data
    flow
  • A data provider may report data for many data
    flows (perhaps for many organisations)
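
The many-to-many relationship between data flows and data providers can be sketched with a provision agreement as the linking object. This is a minimal illustration, not the SDMX class model itself; all identifiers (BOP, TRADE, FR_NSO, DE_NSO, DSD_BOP, DSD_TRADE) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    id: str
    structure_id: str  # the agreed data structure definition


@dataclass(frozen=True)
class DataProvider:
    id: str


@dataclass(frozen=True)
class ProvisionAgreement:
    """Links one data provider to one data flow."""
    flow: DataFlow
    provider: DataProvider


def providers_for(flow, agreements):
    """A data flow can get data from multiple data providers."""
    return sorted(a.provider.id for a in agreements if a.flow == flow)


def flows_for(provider, agreements):
    """A data provider may report data for many data flows."""
    return sorted(a.flow.id for a in agreements if a.provider == provider)


# Hypothetical example data.
bop = DataFlow("BOP", "DSD_BOP")
trade = DataFlow("TRADE", "DSD_TRADE")
fr, de = DataProvider("FR_NSO"), DataProvider("DE_NSO")
agreements = [
    ProvisionAgreement(bop, fr),
    ProvisionAgreement(bop, de),
    ProvisionAgreement(trade, fr),
]

print(providers_for(bop, agreements))
print(flows_for(fr, agreements))
```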

14
Organising Data Flows
  • Organisations may wish to categorise the data
    flows
  • For convenience
  • To facilitate control
  • who reports what/when (release calendar)
  • who has reported
  • more about these later
  • To facilitate search for data (more about this
    later)

15
Data Reporting
Data Structure Definition
CategoryScheme
comprises subject or reporting categories
uses specific data/metadata structure
can be linked to categories in multiple category
schemes
Data Flow
Category
Data Set
conforms to business rules of the data/metadata
flow
can have child categories
publishes/reports data sets
can get data from multiple data providers
can provide data for many data flows using agreed
data structure
Provision Agreement
Data Provider
Metadata
16
We have metadata, what do we need to know?
  • What is the metadata for (what does it describe)?
  • Who reports it
  • How a specific metadata set fits into the overall
    collection framework and which organisation is
    responsible for reporting which parts
  • The reporting schedule
  • That it has been reported

17
Metadata Controlling It
  • What can be done for data can also be done for
    metadata
  • Metadata has a structure
  • Metadata is reported/published
  • Metadata needs to be controlled
  • Metadata needs to be found
  • Metadata may need to be linked to data

18
What Sort of Metadata?
  • Data values are limited in where they belong
  • Series key (usually qualified by time)
  • Data attribute values are limited in where they
    belong
  • Observation value
  • Series key
  • Group key
  • Data set
  • Metadata is not limited in this way
  • Metadata is everywhere
  • Can we learn from the data side how to describe
    metadata structure definitions?

19
Metadata Structure Definition
  • Concepts
  • Hierarchies
  • Representation (e.g. code list)

Provision Agreement
20
Metadata Structure Definition
uses defined concepts
concept defined in
Metadata Report
Concept Scheme
Concept
takes semantics and context from
can have hierarchy
specifies to which object types the concept can
be attached
Partial Target Identifier
identifies the code list from which the value of
the (key) component must be taken when metadata
is reported
specifies the identifier components (key) of
the target object
identifies target object type of the component
Target Object Type
21
Metadata Target
Data Flow
Provision Agreement
Data Provider
22
ARC
Metadata_Concepts
Metadata Structure Definition
MetadataReport
Concept Scheme
Concept
Release Date
Release Status
Format and Permitted Value List
Id Provision_Agreement
Can be used to identify just the Data Provider or
just the Data Flow
Partial Target Identifier
Data Flow
Data Provider
Target Object Type
23
Metadata Structure Definition Identifiers
Metadata Structure Definition ARC_DATA
Full Target Identifier
Provision_Agreement
Identifier Component
Target Object Type
Data Flow
Item Scheme
Identifier Component
Data Provider
Target Object Type
Item Scheme
24
Metadata Structure Definition Metadata Report
ARC
Metadata Report
Attachment Provision_Agreement
Metadata Attribute
Release Date
Concept
Representation DateTime
Metadata Attribute
Release Status
Concept
Representation
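
The example on slides 22 to 24 can be sketched as follows: a metadata report carries a Release Date and a Release Status, and is attached to a target identified by a data flow, a data provider, or both (the full provision agreement). This is a simplified illustration only. The permitted release status values are not given in the slides, so RELEASE_STATUS_CODES is a hypothetical code list, as are the identifiers BOP and FR_NSO.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical permitted value list for the Release Status concept.
RELEASE_STATUS_CODES = {"PLANNED", "RELEASED"}


@dataclass(frozen=True)
class TargetIdentifier:
    data_flow: Optional[str] = None
    data_provider: Optional[str] = None

    def is_full(self):
        """Full identifier: both components, i.e. a provision agreement."""
        return self.data_flow is not None and self.data_provider is not None


@dataclass
class MetadataReport:
    target: TargetIdentifier
    release_date: datetime  # representation: DateTime
    release_status: str     # representation: coded

    def __post_init__(self):
        if self.target.data_flow is None and self.target.data_provider is None:
            raise ValueError("target needs a data flow, a data provider, or both")
        if self.release_status not in RELEASE_STATUS_CODES:
            raise ValueError(f"unknown release status: {self.release_status}")


full = MetadataReport(
    TargetIdentifier(data_flow="BOP", data_provider="FR_NSO"),
    datetime(2006, 4, 6, 9, 0),
    "PLANNED",
)
partial = MetadataReport(
    TargetIdentifier(data_provider="FR_NSO"),
    datetime(2006, 4, 7, 9, 0),
    "RELEASED",
)
print(full.target.is_full(), partial.target.is_full())
```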
25
Metadata Reporting
Metadata Structure Definition
CategoryScheme
comprises subject or reporting categories
uses specific metadata structure
can be linked to categories in multiple category
schemes
Metadata Flow
Category
Metadata Set
conforms to business rules of the metadata flow
can have child categories
can get metadata from multiple metadata providers
publishes/reports metadata sets
Constraint
can have constraints: a subset of the possibilities
defined in the Structure Definition
Provision Agreement
can provide metadata for many metadata flows
using agreed metadata structure
Data Provider
26
Information Model Summary So Far
  • Supports data and metadata reporting and exchange
  • Data and metadata structure definitions
  • Data and metadata sets
  • Supports the process of reporting and exchange
  • Data/metadata providers
  • Data/metadata flows
  • Provision agreements

27
Data/Metadata Reporting/Exchange
CategoryScheme
Structure Definition
comprises subject or reporting categories
uses specific data/metadata structure
can be linked to categories in multiple category
schemes
Data Set or Metadata Set
Data or Metadata Flow
Category
conforms to business rules of the data/metadata
flow
publishes/reports data sets or metadata sets
can have child categories
can get data/metadata from multiple data/metadata
providers
Constraint
can have constraints: a subset of the possibilities
defined in the Structure Definition
can provide data/metadata for many data/metadata
flows using agreed data/metadata structure
Provision Agreement
Data Provider
28
Controlling Data and Metadata
  • How do we control data and metadata reporting?
  • How do we find data and metadata?
  • How do we share data and metadata?

29
SDMX Registry
  • The Registry supports many of the artefacts in
    the Information Model
  • Holds indexes for data and metadata and where
    these can be found on the web
  • Data and metadata set indexes
  • Stores structure definitions
  • Data and metadata structures
  • Code lists
  • Category schemes
  • Data flows
  • Stores provisioning metadata
  • Data providers
  • Provision agreements
  • The Registry is used to store structural and
    provisioning definitions, to register data sets
    and metadata sets, and links between them
  • The Registry is a resource that can be queried by
    applications to find data, metadata, and the
    structural definitions supporting these
  • The Registry specification defines the behaviour
    of an SDMX Registry and the Registry interfaces,
    which are specified as an XML schema
  • The Registry functions are modelled in the
    Information Model, but its functionality is best
    explained in the context of the schematic already
    used for data and metadata (Data/Metadata
    Reporting and Exchange)

30
SDMX Registry/Repository
SDMX Registry Interfaces
Register
Indexes data and metadata
REGISTRY Data Set/Metadata Set
Query
Subscription/Notification
Submit
Describes data and metadata sources and reporting
processes
REPOSITORY Provisioning Metadata
Query
Submit
REPOSITORY Structural Metadata
Describes data and metadata structures
Query
31
Data Set Registration
Structure Definition
  • The data is registered against the provision
    agreement
  • The Constraint holds the indexes such as the
    series keys, or the list of dimension values
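
A toy in-memory version of data set registration might look like the sketch below: each registration is made against a provision agreement and carries a URL, a registration date, and a constraint that indexes the series keys available at that URL. The class and method names, the provision agreement identifiers and the example.org URLs are hypothetical; the real registry interfaces are XML messages, which this does not attempt to reproduce.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Registration:
    provision_agreement: str  # the agreement the data is registered against
    url: str                  # where the data set can be found
    registration_date: date
    series_keys: set = field(default_factory=set)  # index held by the constraint


class Registry:
    def __init__(self):
        self._registrations = []

    def register(self, registration):
        self._registrations.append(registration)

    def find_by_series_key(self, key):
        """Return the URLs of all registered data sets containing the key."""
        return [r.url for r in self._registrations if key in r.series_keys]


registry = Registry()
registry.register(Registration(
    provision_agreement="BOP/FR_NSO",
    url="https://example.org/data/bop_fr.xml",
    registration_date=date(2006, 4, 6),
    series_keys={("SA", "B", "1"), ("SA", "B", "2")},
))
registry.register(Registration(
    provision_agreement="BOP/DE_NSO",
    url="https://example.org/data/bop_de.xml",
    registration_date=date(2006, 4, 7),
    series_keys={("SA", "B", "1")},
))

print(registry.find_by_series_key(("SA", "B", "2")))
```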

Data Flow
Constraint
Keys
Data Set
Provision Agreement
Data Provider
URL, registration date etc.
32
Data Query
CategoryScheme
Structure Definition
  • The query can start anywhere and navigate to the
    data
  • In the registry all navigation is bi-directional.
  • Category drill-down searches start at the
    Category and go via Data Flows.
  • Fine-grained queries can be built using
    structural metadata (e.g. dimension names and
    possible values)
  • Fine-grained searches are possible on the
    Constraints
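
Bi-directional navigation and a category drill-down can be sketched by indexing every link in both directions. The node naming scheme (cat:, flow:, pa: prefixes), the Navigator class and the example URL are assumptions made for this illustration.

```python
from collections import defaultdict


class Navigator:
    """Stores each link once and indexes it in both directions."""

    def __init__(self):
        self._forward = defaultdict(set)
        self._backward = defaultdict(set)

    def link(self, parent, child):
        self._forward[parent].add(child)
        self._backward[child].add(parent)

    def down(self, node):
        return sorted(self._forward.get(node, ()))

    def up(self, node):
        return sorted(self._backward.get(node, ()))


def drill_down(nav, category):
    """Category -> data flows -> provision agreements -> registered data sets."""
    urls = []
    for flow in nav.down(category):
        for agreement in nav.down(flow):
            urls.extend(nav.down(agreement))
    return urls


nav = Navigator()
nav.link("cat:ECONOMY", "flow:BOP")
nav.link("flow:BOP", "pa:BOP/FR_NSO")
nav.link("pa:BOP/FR_NSO", "https://example.org/data/bop_fr.xml")

print(drill_down(nav, "cat:ECONOMY"))
print(nav.up("flow:BOP"))  # the same link followed in the other direction
```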

Data Flow
Category
Constraint
Data Set
Provision Agreement
Data Provider
33
Metadata Set Registration
  • Metadata that is reported regularly is registered
    against the (Metadata) Provision Agreement
  • The metadata content (the metadata set) is linked
    to the object to which it relates
  • This link can be stored in the registry
  • e.g. a link to data set to which it relates
  • a link to the data provider to which it relates
  • Registry/Repository operators could use the
    repository to store the metadata itself
  • This is not a part of the Information Model nor
    of the SDMX standards

34
Metadata Query
  • The indexed metadata set itself can be searched
  • Links to data can be discovered and followed
  • e.g. is there any metadata for a specific data
    set, or part of the data set?
  • If so what sort of metadata?
  • Where is the metadata (URL)?
  • More on this later

35
Information Model Summary So Far
  • Supports data and metadata reporting and exchange
  • Data and metadata structure definitions
  • Data and metadata sets
  • Supports the process of reporting and exchange
  • Data/metadata providers
  • Data/metadata flows
  • Provision agreements
  • Supports registration
  • Data and metadata sets
  • Supports query
  • Categories linked to data and metadata
  • Constraints for finer grained queries

36
Summary Data/Metadata Reporting, Query
CategoryScheme
Structure Definition
comprises subject or reporting categories
uses specific data/metadata structure
can be linked to categories in multiple category
schemes
Data Set or Metadata Set
Data or Metadata Flow
Category
conforms to business rules of the data/metadata
flow
publishes/reports data sets or metadata sets
can have child categories
can get data/metadata from multiple data/metadata
providers
Constraint
can have constraints: a subset of the possibilities
defined in the Structure Definition
can provide data/metadata for many data/metadata
flows using agreed data/metadata structure
Provision Agreement
Data Provider
37
Registry: what else?
  • Link metadata to parts of a data set or database
    contents
  • Query for metadata linked to data

38
Registry: link metadata to data
These can be described in terms of key sets,
combined into an Attachment Constraint, linked to
a specific data set, and a metadata set
39
Constraints Structure
  • Supports the specification of subsets of data or
    metadata structure definitions, or of data and
    metadata sets
  • In terms of allowable key values
  • In terms of allowable dimension, attribute, or
    measure values
  • Constraints can apply to
  • Data sets (so-called cubes or cube regions)
  • Entire databases
  • Data flows
  • Metadata sets
  • Entire metadata repositories
  • Metadata flows
  • Data providers
  • Provision agreements
  • Two kinds of Constraint
  • Content: used to define the actual or
    allowable content
  • Attachment: used to define a subset of a
    data or metadata set for the purpose of attaching
    metadata to it
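
One way to picture the two ways of specifying a subset (key sets listing whole keys, and cube regions listing allowed values per concept) is the sketch below. The dimension names are hypothetical, and the rule used to combine members (a key is in scope if at least one included member matches it and no excluded member does) is an assumption of this example rather than something stated in the slides.

```python
from dataclasses import dataclass

# Hypothetical dimension order for the keys used below.
DIMENSIONS = ("COUNTRY", "SERIES", "YEAR")


@dataclass
class KeySet:
    keys: set              # full keys as tuples in DIMENSIONS order
    included: bool = True  # False means the keys are excluded from the scope

    def matches(self, key):
        return key in self.keys


@dataclass
class CubeRegion:
    values: dict           # concept id -> set of values
    included: bool = True

    def matches(self, key):
        named = dict(zip(DIMENSIONS, key))
        return all(named[c] in allowed for c, allowed in self.values.items())


def in_scope(members, key):
    """Assumed rule: matched by an included member and by no excluded member."""
    hit = any(m.matches(key) for m in members if m.included)
    blocked = any(m.matches(key) for m in members if not m.included)
    return hit and not blocked


constraint = [
    CubeRegion({"COUNTRY": {"FR", "DE"}}),
    KeySet({("FR", "B", "1999")}, included=False),
]

print(in_scope(constraint, ("FR", "A", "1999")))
print(in_scope(constraint, ("FR", "B", "1999")))
print(in_scope(constraint, ("US", "A", "1999")))
```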

40
Constraints Structure Schematic
Sets of keys to be included in or excluded from
the scope
Constraint
AttachmentConstraint
ContentConstraint
Key Set
Sets of values to be included in or excluded from
the scope
Specification of a key
Cube Region
Key
Set of values for a concept
Identity of the Concept (e.g. Country)
Specification of a key value
Concept Values
Key Value
Concept
List of values
Values
41
Constraints usage
  • Data source registration
  • Data source can be a data set or a database
  • Content Constraint is used to define the content
    of a data set or database
  • This supports fine grained queries
  • Attaching metadata to parts of a data set or
    other data source
  • Target object of a metadata set is an Attachment
    Constraint linked to a registered data set or
    database content

42
Attachment Constraint
Metadata is linked to the Constraint
Constraint is linked to the Data Set
Attachment Constraint
Registered Data Set
Registered Metadata Set
Key Sets define the subset of the Data Set
Key Set
SA,B,1,1999-03-31
SA,B,1,1999-06-30
SA,B,1,1999-09-30
SA,B,2,1999-03-31
etc.
Key(s)
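
The chain described on this slide (metadata linked to the constraint, constraint linked to the data set, key set defining the subset) can be sketched as a lookup from a data set key to the metadata attached to it. The class name, function name and example.org URLs are hypothetical; the keys are taken from the slide's example.

```python
from dataclasses import dataclass


@dataclass
class AttachmentConstraint:
    data_set_url: str    # the registered data set the constraint is linked to
    key_set: frozenset   # the keys that define the subset of the data set
    metadata_urls: tuple # registered metadata sets linked to the constraint


def metadata_for(constraints, data_set_url, key):
    """Find metadata attached to one key of one registered data set."""
    found = []
    for c in constraints:
        if c.data_set_url == data_set_url and key in c.key_set:
            found.extend(c.metadata_urls)
    return found


constraints = [
    AttachmentConstraint(
        data_set_url="https://example.org/data/bop_fr.xml",
        key_set=frozenset({"SA,B,1,1999-03-31", "SA,B,1,1999-09-30"}),
        metadata_urls=("https://example.org/metadata/footnote_1.xml",),
    ),
]

print(metadata_for(constraints, "https://example.org/data/bop_fr.xml", "SA,B,1,1999-03-31"))
print(metadata_for(constraints, "https://example.org/data/bop_fr.xml", "SA,B,2,1999-03-31"))
```
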
43
Information Model Support for Data Analysis
  • Viewing, comparing and analysing data in
    different groupings
  • Hierarchical Code Lists
  • Converting data and metadata from one coding and
    structure scheme to another scheme
  • Structure and Code Mapping

44
Hierarchical Code Lists - Example
  • France is a country
  • France is part of the continent of Europe
  • France is a member of NATO
  • France is a member of the EU
  • France is a member of the G10
  • When I analyse statistics I might want to see
    totals by
  • continent
  • trading bloc
  • military alliance
  • financial grouping
  • France will be grouped with different sets of
    countries depending on the view required
  • How do we express these groupings?

45
Code List
Code Composition
Reference Area
6B NATO
B0 EU
B1 NAFTA
BE Belgium
BG Bulgaria
CA Canada
CH Switzerland
CZ Czech Republic
DE Germany
DK Denmark
E1 Europe
E8 North America
EE Estonia
ES Spain
FI Finland
FR France
GB United Kingdom
GR Greece
HU Hungary
JP Japan
I2 Euro 12
IT Italy
NE Netherlands
US United States
Code
G10 countries
Europe
EU countries
NATO countries
NAFTA countries
Code Association
North America
46
Hierarchical Code Scheme
comprises code groups
comprises hierarchies
Code List
relates a code to a parent code
belongs to
code
Code Association
Code
parent code
Properties of the association
groups codes with the same parent
Property
Code Composition
value based hierarchy has code groups
comprises code groups
Hierarchy
level based hierarchy has formal levels
Level
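
A small sketch of the idea: code associations relate a code to a parent code, so the same country can roll up into several groupings. The codes (FR, DE, CH, US, CA, E1, E8, 6B, B0, B1) come from the Reference Area code list on the previous slide; only a few memberships are listed, and the observation values are invented purely for illustration.

```python
from collections import defaultdict

# (child code, parent code) associations; a code may have several parents.
ASSOCIATIONS = [
    ("FR", "E1"), ("FR", "6B"), ("FR", "B0"),
    ("DE", "E1"), ("DE", "6B"), ("DE", "B0"),
    ("CH", "E1"),
    ("US", "E8"), ("US", "6B"), ("US", "B1"),
    ("CA", "E8"), ("CA", "6B"), ("CA", "B1"),
]


def totals_by_parent(values, parents, associations=ASSOCIATIONS):
    """Sum country values under each requested parent code."""
    children = defaultdict(list)
    for child, parent in associations:
        children[parent].append(child)
    return {p: sum(values.get(c, 0) for c in children[p]) for p in parents}


# Invented values, for illustration only.
values = {"FR": 10, "DE": 20, "CH": 5, "US": 40, "CA": 15}

print(totals_by_parent(values, ["E1", "E8"]))  # view by continent
print(totals_by_parent(values, ["B0", "B1"]))  # view by trading bloc
print(totals_by_parent(values, ["6B"]))        # view by military alliance
```

France contributes to E1, B0 and 6B at once; which total it appears in depends only on the view requested.
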
47
Item Scheme Maps
  • Many types of item scheme use the same
    fundamental structure
  • Code list
  • Category scheme
  • Concept scheme
  • Two Item Schemes can be mapped
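
As a minimal sketch of a code list map, each item association pairs a source item with a target item. Here the source codes are the two-letter codes used in this presentation and the target codes are ISO 3166-1 alpha-3 codes; choosing that target scheme is an assumption made for the example, and the function name is hypothetical.

```python
# Item associations: source code -> target code.
CODE_LIST_MAP = {
    "FR": "FRA",
    "DE": "DEU",
    "GB": "GBR",
}


def map_codes(codes, item_map):
    """Translate codes to the target scheme, reporting any that are unmapped."""
    mapped, unmapped = [], []
    for code in codes:
        if code in item_map:
            mapped.append(item_map[code])
        else:
            unmapped.append(code)
    return mapped, unmapped


mapped, unmapped = map_codes(["FR", "GB", "E1"], CODE_LIST_MAP)
print(mapped)
print(unmapped)  # E1 (Europe) has no association in this map
```

Because code lists, category schemes and concept schemes share the same item scheme structure, the same shape of map would serve for any of them.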

48
Item Scheme Association
target item scheme
source item scheme
Category Scheme Map
Concept Scheme Map
Code List Map
Association Role
Item Scheme
Item Scheme
has item associations
Concept Scheme
Category Scheme
Category Scheme
Concept Scheme
Code List
Code List
Item Association
target item
source item
Item
Item
Concept
Category
Category
Concept
Code
Code
Additional metadata
Property
49
Structure Maps
  • Structures can also be mapped
  • Data structures
  • Metadata structures

50
Information Model Summary
  • Supports data and metadata reporting and exchange
  • Data and metadata structure definitions
  • Data and metadata sets
  • Supports the process of reporting and exchange
  • Data/metadata providers
  • Data/metadata flows
  • Provision agreements
  • Supports registration
  • Data and metadata sets
  • Data and metadata can be linked
  • Supports query
  • Categories linked to data and metadata
  • Constraints for finer grained queries
  • Retrieval of metadata linked to data
  • Supports data analysis, comparison and conversion
  • Hierarchical code schemes
  • Structure, Concept, Code, Category maps

51
Data/Metadata Reporting, Query, Analysis, Mapping
CategoryScheme
Structure and Item Scheme Maps
Structure Definition
Data Set or Metadata Set
Data or Metadata Flow
Category
Attachment Constraint
Content Constraint
Provision Agreement
Data Provider
Registered Data Set or Metadata Set
52
Thank You