Title: The importance of Metadata
1The importance of Metadata
Marta Melgar García mmelgar_at_ine.es
2Presentation Index
- Introduction
- Statistical Metadata
- Standards and Terminologies
- Languages for Statistical Metadata
- Statistical Metadata in Spain
- Metadata in European Websites
- References
3Introduction
Metadata definition
In general: data about data.
Functionally: structured data about data.
Metadata includes data associated with either an information system or an information object for purposes of description, administration, legal requirements, technical functionality, use and usage, and preservation.
Source: Dublin Core Metadata Initiative
4Introduction
Statistical Metadata is any information that is
needed by people or systems to make proper and
correct use of the real statistical data, in
terms of capturing, reading, processing,
interpreting, analysing and presenting the
information (or any other use). In other words,
statistical metadata is anything that might
influence or control the way in which the core
information is used by people or software.
5Introduction
- Why are metadata important?
- To get a complete picture of the subject matter.
- To provide information that makes data understandable and shareable.
- To be a repository of knowledge and expertise.
- To structure the information and store expert knowledge from subject area specialists (which is sometimes not recorded).
- Source: WTO
6Introduction
- Why are metadata important?
- For assessing the quality and reliability of data.
- To determine the effectiveness of any cross-country analysis.
- To highlight differences between countries and deviations from international standards.
- They are very important for users in selecting and interpreting data.
- Source: WTO
7Introduction
- What are the objectives of metadata?
- Greater customer satisfaction.
- Greater productivity.
- Better public perception and cooperation.
8Introduction
- Detailed list of metadata
- Definition
- Description of dimensions
- Coverage (geographical, reference period, exclusions)
- Sources
- Classification
- Methodology (brief description)
- Quality assessment
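The items listed above can be read as the fields of a single metadata record. A minimal Python sketch, assuming hypothetical field names and an invented example series (neither comes from an actual statistical office schema), shows how such a record could be checked for completeness:

```python
from dataclasses import dataclass, field, fields


@dataclass
class MetadataRecord:
    """One metadata record; field names mirror the list above (hypothetical schema)."""
    definition: str = ""
    dimensions: list[str] = field(default_factory=list)
    coverage: str = ""
    sources: str = ""
    classification: str = ""
    methodology: str = ""
    quality_assessment: str = ""


def missing_items(record: MetadataRecord) -> list[str]:
    """Return the names of the items that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Invented example: a partly documented series.
record = MetadataRecord(
    definition="Monthly index of retail sales",
    dimensions=["region", "month"],
    coverage="National; excludes online-only retailers",
)
print(missing_items(record))
```

A check of this kind is one possible way to flag incomplete metadata before dissemination.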
9Introduction
- Problems related to Metadata
- Knowledge of the main users is essential.
- Metadata are effective when they meet the needs and expectations of users.
- Elaborate and very detailed metadata are difficult to keep up to date, so the amount of information should be kept to a minimum.
- This requires judgement from the area specialist on which statistical and methodological aspects are important and will have a considerable impact on how data may be used.
10Introduction
- Problems related to Metadata
- On the other hand, it is crucial that metadata are complete.
- The effectiveness of metadata also depends on how easy the information is to obtain.
11Statistical Metadata
- Purpose
- Statistical metadata, or metadata for statistical data and processes, are used to enhance users' search for and understanding of statistical data, improve and automate survey processing within each office, and facilitate statistical data harmonization, among many other uses.
- Many offices are using metadata-driven systems to automate parts of the survey process.
- Source: Statistics Canada
12Statistical Metadata
- What is Statistical Metadata?
- Any information that is needed by people or systems to make proper and correct use of the real statistical data when:
- Capturing
- Reading
- Processing
- Presenting
- Analysing
- Interpreting
- Exchanging
- Searching
- Browsing
- Source: Andrew Westlake
13Statistical Metadata
- What does Statistical metadata include?
- File description
- Codebooks
- Processing details
- Sample designs
- Fieldwork reports
- Terminology
14Statistical Metadata
- Statistical Metadata can be used
- informally by people who read it.
- formally by software to guide the way information
is processed.
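As an illustration of formal use by software, the sketch below lets a codebook (metadata) drive how coded survey values are decoded. The variable names, codes and labels are invented for this example and do not come from any real survey.

```python
# Hypothetical codebook: metadata that maps stored codes to readable labels.
CODEBOOK = {
    "sex": {"1": "Male", "2": "Female"},
    "employed": {"0": "No", "1": "Yes"},
}


def decode(row: dict[str, str], codebook: dict[str, dict[str, str]]) -> dict[str, str]:
    """Replace coded values with labels; variables without a codebook entry pass through."""
    decoded = {}
    for variable, value in row.items():
        labels = codebook.get(variable, {})
        decoded[variable] = labels.get(value, value)
    return decoded


raw = {"sex": "2", "employed": "1", "age": "34"}
print(decode(raw, CODEBOOK))
```

The processing code never mentions a specific variable: changing the codebook changes the behaviour, which is the sense in which a system can be called metadata-driven.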
15Statistical Metadata
- What is Statistical Metadata important for?
- Sharing data
- Archiving (Secondary users need good information)
- Discovery (data can help me to solve a problem)
- Automation (parametrisation of standardised processes)
- Quality
16Statistical Metadata
- Metadata is not an absolute concept.
- Data become metadata when they are put into a
descriptive relationship with something else
(Farance and Gillman, 2005).
17Statistical Metadata
- What stage does the metadata apply to?
- Design
- Data collection
- Data processing
- Transformation and analysis
- Dissemination
- Exchange
- Source: Andrew Westlake
18Statistical Metadata
- Statistical production process
- Archiving
- Secondary use of data
19Statistical Metadata
- A statistical metadata system is a data processing system that uses, stores and produces statistical metadata (UNECE 2000).
20Statistical Metadata
- Quality and metadata
- Product quality for statistics is often described according to the Eurostat criteria (Eurostat 1998):
- Relevance and completeness.
- Accuracy.
- Timeliness and punctuality.
- Comparability and coherence.
- Accessibility and clarity.
21Statistical Metadata
- Systematic information about statistics, or statistical metadata, is necessary to:
- Satisfy users' needs.
- Make statistics clear.
- Improve accessibility.
- Information about production processes is essential for users to understand the statistics.
22Statistical Metadata
- Further developments
- Develop a system where metadata are directly linked with the data.
- Also develop metadata by country or region, when required.
- Disseminating metadata makes the information available to users outside the division.
23Standards and Terminologies
- Dublin Core (DCMI)
- SDMX
- ISO 11179
- Neuchâtel Terminological Model
24Standards (Dublin Core)
- What is the Dublin Core?
- The Dublin Core metadata standard is a simple yet effective element set for describing a wide range of networked resources.
- The Dublin Core standard includes two levels: Simple and Qualified.
25Standards (Dublin Core)
- The semantics of Dublin Core have been established by an international, cross-disciplinary group of professionals from fields such as librarianship and computer science.
- Dublin Core has two classes of terms: elements (nouns) and qualifiers (adjectives).
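To make the idea of elements concrete, here is a minimal sketch that writes four of the fifteen Simple Dublin Core elements (title, creator, date, language) as XML using Python's standard library. The described resource and the `record` wrapper element are invented for illustration; qualifiers are not shown.

```python
import xml.etree.ElementTree as ET

# Namespace of the Simple Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Invented description of a statistical publication.
description = {
    "title": "Labour Force Survey: methodology",
    "creator": "Example Statistical Office",
    "date": "2005",
    "language": "en",
}

# "record" is a hypothetical wrapper element, not something Dublin Core defines.
record = ET.Element("record")
for element, value in description.items():
    child = ET.SubElement(record, f"{{{DC_NS}}}{element}")
    child.text = value

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Each element is a noun naming one property of the resource; a qualified record would refine such elements further.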
26Standards (Dublin Core)
- DCMI goals
- Simplicity of creation and maintenance
- Commonly understood semantics
- International scope
- Extensibility
27Standards (SDMX)
- SDMX: Statistical Data and Metadata eXchange.
- The name Statistical Data and Metadata eXchange refers to an international initiative aimed at developing and employing more efficient processes for the exchange and sharing of statistical data and metadata among international organisations and their member countries.
- SDMX is an initiative to foster standards for the exchange of statistical information.
- The initiative, started in 2001, is sponsored by 7 international organisations: Bank for International Settlements (BIS), European Central Bank (ECB), Eurostat, International Monetary Fund (IMF), Organisation for Economic Co-operation and Development (OECD), United Nations (UN) and the World Bank (WB).
28Standards (SDMX)
- The SDMX metamodel is concerned with the structure of data and metadata and with the semantics required to understand the meaning of the data and metadata.
- The SDMX message formats have two basic expressions: SDMX-ML (using XML syntax) and SDMX-EDI (using EDIFACT syntax and based on the GESMES/TS statistical message).
- SDMX specifies registry interfaces based on the SDMX model.
- Source: http://www.sdmx.org
29Standards (SDMX)
- What are the goals of SDMX?
- Standardisation for statistical data and metadata access and exchange.
- The objective is to establish a set of commonly recognised standards to have easy access to statistical data, wherever these data may be, but also access to metadata that makes the data more meaningful and usable.
30Standards (SDMX)
- What kinds of metadata can be exchanged with SDMX?
- SDMX metadata standards build on the distinction between structural and reference metadata.
- Structural metadata are those metadata acting as identifiers and descriptors of the data, such as names of variables or dimensions of statistical cubes. Structural metadata must be associated with the data; otherwise it becomes impossible to identify, retrieve and browse the data.
- Reference metadata are metadata that describe the contents and the quality of the statistical data: conceptual metadata, describing the concepts used and their practical implementation; methodological metadata, describing methods used for the generation of the data; and quality metadata, describing the different quality dimensions of the resulting statistics, e.g. timeliness and accuracy.
- Source: http://www.sdmx.org
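The distinction between structural and reference metadata can be sketched in plain Python. This is not the SDMX information model or any SDMX library: the class, the dimension names and all values are invented for illustration, and the figures are not real statistics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeriesKey:
    """Structural metadata: the dimension values that identify a series."""
    indicator: str
    country: str
    frequency: str


# Data keyed by structural metadata (invented values).
observations = {
    SeriesKey("unemployment_rate", "ES", "Q"): {"2005-Q1": 10.2, "2005-Q2": 9.3},
    SeriesKey("unemployment_rate", "RO", "Q"): {"2005-Q1": 8.1},
}

# Reference metadata: describes content and quality, attached to an indicator.
reference_metadata = {
    "unemployment_rate": {
        "concept": "Unemployed persons as a percentage of the labour force",
        "method": "Household sample survey",
        "timeliness": "Published about 30 days after the quarter ends",
    }
}


def find_series(country: str) -> list[SeriesKey]:
    """Retrieve series by a dimension value; possible only because keys carry structure."""
    return [key for key in observations if key.country == country]


for key in find_series("ES"):
    print(key, observations[key], reference_metadata[key.indicator]["method"])
```

Without the structural key the numbers could not be identified or retrieved; without the reference metadata they could be retrieved but not properly interpreted.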
31Standards (ISO 11179)
- ISO 11179: Information Technology - Metadata Registries (MDR).
- ISO 11179 has an explicit registry metamodel as part of its model.
- Standardized data design procedures for supporting electronic data interchange.
- It develops a set of principles, methods and procedures for specifying what is needed to document the association between the various types of administered items and one or more classification schemes.
- It does not establish a particular classification scheme as preeminent.
32Terminologies (Neuchâtel Terminological Model)
- It defines the key concepts that are relevant for the structuring of metadata and provides the conceptual framework for the development of a database organising that metadata.
- A Terminology lists statistical concepts.
- A Model is a set of related concepts which is used for producing a structured specification of some area of interest.
33Terminologies (Neuchâtel Terminological Model)
- Purpose: to arrive at a common language and a common perception of the structure of classifications.
- It is both a terminology and a conceptual model.
- It has a two-level structure:
- First level: the object types.
- Second level: the attributes associated with each object type.
34Languages for Statistical Metadata (XBRL)
- XBRL is a language for the electronic communication of business and financial data which is revolutionising business reporting around the world.
35Languages for Statistical Metadata (XML)
- SDMX makes use of the schema definition language known as W3C XML Schema (XSD).
- The combination of statistical metadata and XML (Extensible Markup Language) leads to the creation of a framework for organizing and retrieving statistical information.
- Statistical information takes heterogeneous forms, ranging from text and numbers to graphs, tables and other multimedia. This means different types of data.
36Languages for Statistical Metadata (XML)
- Such heterogeneity creates barriers to organising statistical data and making it accessible from a Web page.
- An ideal solution for such heterogeneous data is to use an object-oriented database.
- Another solution is to use statistical metadata and XML to construct a framework for organising and searching statistical data.
- Source: Bi and Murtagh
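A small sketch of the second solution: heterogeneous items (a table, a graph) are described by the same XML metadata, so one search function covers them all. The catalogue layout, element names and attribute names are invented here and are not taken from Bi and Murtagh.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue: element and attribute names are invented for illustration.
CATALOGUE = """
<catalogue>
  <dataset id="lfs" type="table">
    <title>Labour Force Survey</title>
    <keywords>employment unemployment labour</keywords>
  </dataset>
  <dataset id="cpi" type="graph">
    <title>Consumer Price Index</title>
    <keywords>prices inflation</keywords>
  </dataset>
</catalogue>
"""


def search(xml_text: str, keyword: str) -> list[str]:
    """Return the ids of datasets whose keyword list contains the given word."""
    root = ET.fromstring(xml_text)
    hits = []
    for dataset in root.iter("dataset"):
        words = (dataset.findtext("keywords") or "").split()
        if keyword.lower() in words:
            hits.append(dataset.get("id"))
    return hits


print(search(CATALOGUE, "inflation"))
```

The search works on the metadata only, so it does not need to know whether the underlying item is text, a table or a graph.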
37Statistical metadata in Spain
- We already have metadata in different fields (methodologies).
- The objective is to build, in the medium term, a metadata tool that facilitates the integration and co-ordination of all the information that INE requests from data providers.
- Our aim is to produce more harmonised and more comparable information, and to give data users a tool covering every statistical operation performed by INE.
- Source: Blanco and Sánchez-Luengo
38Statistical metadata in Spain
Metadata: scope, source, frequency, IOE code
39Statistical metadata in Spain
Survey Methodology
40Statistical metadata in Spain
Survey design
41Metadata in European Websites (Eurostat)
Metadata icon
42Metadata in European Websites (Eurostat)
43Metadata in European Websites (Romania)
44Metadata in European Websites (Romania)
Metadata icon
45Metadata in European Websites (Romania)
46References
- OECD, Metadata for short-term indicators: International comparisons and best practices, working paper.
- OECD, The role of metadata in promoting international comparisons and adherence to international statistical standards (http://www.oecd.org/std/metarole.htm).
- Bureau of Census, United States, Transition plan for unified approach to metadata management at the Bureau of the Census, working paper.
- UN/ECE Secretariat, Standards for Statistical Metadata on Internet, working paper.
- Statistics Canada, The evolution of metadata at Statistics Canada: an integrative approach, working paper.
- Statistics New Zealand, examples of metadata in the Survey and Output Information Database and INFOS database at http://www.stats.govt.nz/statsweb.nsf.
- Statistics Canada, examples of metadata in Information on Products and Services Catalogue at http://www.statcan.ca/english/search/ips.htm.
- http://www.intracen.org/countries/metadata.htm
47- Thank you very much for your attention