The importance of Metadata

Slides: 48
Provided by: INE100
1
The importance of Metadata
Marta Melgar García mmelgar_at_ine.es
2
Presentation Index
  • Introduction
  • Statistical Metadata
  • Standards and Terminologies
  • Languages for Statistical Metadata
  • Statistical Metadata in Spain
  • Metadata in European Websites
  • References

3
Introduction
Metadata definition:
In general, data about data. Functionally, structured
data about data. Metadata includes data associated with
either an information system or an information object
for purposes of description, administration, legal
requirements, technical functionality, use and usage,
and preservation.
Source: Dublin Core Metadata Initiative
4
Introduction
Statistical Metadata is any information that is
needed by people or systems to make proper and
correct use of the real statistical data, in
terms of capturing, reading, processing,
interpreting, analysing and presenting the
information (or any other use). In other words,
statistical metadata is anything that might
influence or control the way in which the core
information is used by people or software.
5
Introduction
  • Why are metadata important?
  • To get a complete picture of the subject matter.
  • To provide information that makes data
    understandable and shareable.
  • To be a repository of knowledge and expertise.
  • To structure the information and store expert
    knowledge from subject area specialists (which
    sometimes goes unrecorded).
  • Source: WTO

6
Introduction
  • Why are metadata important?
  • For assessing the quality and reliability of
    data.
  • To determine the effectiveness of any
    cross-country analysis.
  • To highlight differences between countries and
    deviations from international standards.
  • They are very important for users in selecting
    and interpreting data.
  • Source: WTO

7
Introduction
  • What are the objectives of metadata?
  • Greater customer satisfaction.
  • Greater productivity.
  • Better public perception and cooperation.

8
Introduction
  • Detailed list of metadata
  • Definition
  • Description of dimensions
  • Coverage (geographical, reference period,
    exclusions)
  • Sources
  • Classification
  • Methodology (brief description)
  • Quality assessment
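
The list above can be read as a checklist that each dataset's metadata should fill in. The following is a minimal sketch of that idea, assuming one free-text field per item; the class name `DatasetMetadata` and the helper `missing_items` are hypothetical and were invented for this illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetMetadata:
    """One free-text field per item in the list; an empty string means 'not provided'."""
    definition: str = ""
    dimensions: str = ""
    coverage: str = ""
    sources: str = ""
    classification: str = ""
    methodology: str = ""
    quality_assessment: str = ""


def missing_items(metadata):
    """Return the names of the items that are still blank, in list order."""
    return [
        f.name
        for f in fields(metadata)
        if not getattr(metadata, f.name).strip()
    ]


if __name__ == "__main__":
    # Placeholder values only; they do not describe a real dataset.
    draft = DatasetMetadata(definition="placeholder", sources="placeholder")
    print(missing_items(draft))
```

A check of this kind only shows whether an item was filled in. Whether the text is useful to users still needs the judgement of a subject area specialist.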

9
Introduction
  • Problems related to Metadata
  • Knowledge of the main users is essential.
  • Metadata are effective when they meet the needs
    and expectations of users.
  • Elaborate and very detailed metadata are
    difficult to keep updated. It is important that
    the amount of information is kept to a minimum.
  • This requires judgement from the area specialist
    on what statistical and methodological aspects
    are important and which will have considerable
    impact on how data may be used.

10
Introduction
  • Problems related to Metadata
  • On the other hand, it is crucial that metadata are
    complete.
  • The effectiveness of metadata also depends on
    how easily the information can be obtained.

11
Statistical Metadata
  • Purpose
  • Statistical metadata, or metadata for statistical
    data and processes, are used to enhance users'
    search and understanding of statistical data,
    improve and automate survey processing within
    each office, and facilitate statistical data
    harmonization, among many other purposes.
  • Many offices are using metadata-driven systems to
    automate parts of the survey process.
  • Source: Statistics Canada

12
Statistical Metadata
  • What is Statistical Metadata?
  • Any information that is needed by people or
    systems to make proper and correct use of the
    real statistical data when:
  • Capturing
  • Reading
  • Processing
  • Presenting
  • Analysing
  • Interpreting
  • Exchanging
  • Searching
  • Browsing
  • Source: Andrew Westlake

13
Statistical Metadata
  • What does Statistical metadata include?
  • File description
  • Codebooks
  • Processing details
  • Sample designs
  • Fieldwork reports
  • Terminology

14
Statistical Metadata
  • Statistical Metadata can be used:
  • informally, by people who read it.
  • formally, by software, to guide the way information
    is processed.
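
As a minimal sketch of the formal use, the snippet below lets a small metadata record decide how raw values are presented. The field names (`label`, `unit`, `decimals`, `missing_code`) and the sample values are assumptions made for illustration; they are not taken from any particular standard.

```python
# Hypothetical metadata record for one variable. Every field name here is
# an illustrative assumption, not part of a published standard.
VARIABLE_METADATA = {
    "label": "Unemployment rate",
    "unit": "%",
    "decimals": 1,
    "missing_code": -999,
}


def present(value, metadata):
    """Format a raw value using only what the metadata says about it."""
    if value == metadata["missing_code"]:
        return f"{metadata['label']}: not available"
    return f"{metadata['label']}: {value:.{metadata['decimals']}f} {metadata['unit']}"


if __name__ == "__main__":
    # Made-up raw values: one ordinary figure and one missing-value code.
    for raw in (8.537, -999):
        print(present(raw, VARIABLE_METADATA))
```

The same record could also be read informally by a person: the label, unit and missing-value code are exactly what someone needs to interpret the raw figures.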

15
Statistical Metadata
  • What is Statistical Metadata important for?
  • Sharing data
  • Archiving (secondary users need good information)
  • Discovery (which data can help me to solve a problem)
  • Automation (parametrisation of standardised
    processes)
  • Quality

16
Statistical Metadata
  • Metadata is not an absolute concept.
  • Data become metadata when they are put into a
    descriptive relationship with something else
    (Farance and Gillman, 2005).

17
Statistical Metadata
  • What stage does the metadata apply to?
  • Design
  • Data collection
  • Data processing
  • Transformation and analysis
  • Dissemination
  • Exchange
  • Source: Andrew Westlake

18
Statistical Metadata
  • Statistical production process
  • Archiving
  • Secondary use of data

19
Statistical Metadata
  • A statistical metadata system is a data
    processing system that uses, stores and produces
    statistical metadata (UNECE 2000).

20
Statistical Metadata
  • Quality and metadata
  • Product quality for statistics is often
    described according to the Eurostat criteria
    (Eurostat 1998):
  • Relevance and completeness.
  • Accuracy.
  • Timeliness and punctuality.
  • Comparability and coherence.
  • Accessibility and clarity.

21
Statistical Metadata
  • Systematic information about statistics, or
    statistical metadata, is necessary to:
  • Satisfy users' needs.
  • Make statistics clear.
  • Improve accessibility.
  • Information about production processes is
    essential for users to understand
    the statistics.

22
Statistical Metadata
  • Further developments
  • Develop a system where metadata are directly
    linked with the data.
  • Develop also metadata by country or region, when
    required.
  • Disseminating metadata makes the information
    available to users outside the division.

23
Standards and Terminologies
  • Dublin Core (DCMI)
  • SDMX
  • ISO 11179
  • Neuchâtel Terminological Model

24
Standards (Dublin Core)
  • What is the Dublin Core?
  • The Dublin Core metadata standard is a simple yet
    effective element set for describing a wide range
    of networked resources.
  • The Dublin Core standard includes two levels:
    Simple and Qualified.

25
Standards (Dublin Core)
  • The semantics of Dublin Core have been
    established by an international,
    cross-disciplinary group of professionals from
    librarianship, computer science
  • Dublin Core has two classes of terms -- elements
    (nouns) and qualifiers (adjectives)
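
The relationship between the two levels can be sketched in a few lines: a qualified record is reduced to a simple one by folding each qualifier into the element it refines. The set below lists the fifteen Simple Dublin Core elements; the qualifier table is only a small subset, and the function name `dumb_down` and the sample record are assumptions made for this illustration.

```python
# The fifteen elements of Simple Dublin Core.
SIMPLE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# A few qualifiers and the element each one refines (a small subset,
# chosen for illustration).
REFINES = {
    "created": "date",
    "issued": "date",
    "modified": "date",
    "abstract": "description",
    "spatial": "coverage",
    "temporal": "coverage",
}


def dumb_down(record):
    """Reduce a qualified record to Simple Dublin Core.

    Each qualified term is folded into the element it refines. Terms that
    are neither elements nor known qualifiers are dropped.
    """
    simple = {}
    for term, value in record.items():
        element = REFINES.get(term, term)
        if element in SIMPLE_ELEMENTS:
            simple.setdefault(element, []).append(value)
    return simple


if __name__ == "__main__":
    # Hypothetical record; the values are placeholders, not a real resource.
    qualified = {
        "title": "Example survey results",
        "created": "2005-01-01",
        "modified": "2005-02-01",
        "spatial": "Spain",
    }
    print(dumb_down(qualified))
```

A reader that only understands the simple level can still use the reduced record, at the cost of losing the distinction the qualifiers carried (for example, created versus modified).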

26
Standards (Dublin Core)
  • DCMI goals
  • Simplicity of creation and maintenance
  • Commonly understood semantics
  • International scope
  • Extensibility

27
Standards (SDMX)
  • SDMX: Statistical Data and Metadata eXchange.
  • The name Statistical Data and Metadata eXchange
    refers to an international initiative aimed at
    developing and employing more efficient processes
    for exchange and sharing of statistical data and
    metadata among international organisations and
    their member countries.
  • SDMX is an initiative to foster standards for the
    exchange of statistical information.
  • The initiative, started in 2001, is sponsored by
    7 international organisations: Bank for
    International Settlements (BIS), European Central
    Bank (ECB), Eurostat, International Monetary Fund
    (IMF), Organisation for Economic Co-operation and
    Development (OECD), United Nations (UN) and the
    World Bank (WB).

28
Standards (SDMX)
  • The SDMX metamodel is concerned with the
    structure of data and metadata and with semantics
    required to understand the meaning of the data
    and metadata.
  • The SDMX message formats have two basic
    expressions: SDMX-ML (using XML syntax) and
    SDMX-EDI (using EDIFACT syntax and based on the
    GESMES/TS statistical message).
  • SDMX specifies registry interfaces based on the
    SDMX model.
  • Source: http://www.sdmx.org

29
Standards (SDMX)
  • What are the goals of SDMX?
  • Standardisation for statistical data and metadata
    access and exchange.
  • The objective is to establish a set of commonly
    recognised standards to have easy access to
    statistical data, wherever these data may be, but
    also access to metadata that makes the data more
    meaningful and usable.

30

Standards (SDMX)
  • What kinds of metadata can be exchanged with
    SDMX?
  • SDMX metadata standards build on the distinction
    between structural and reference metadata
  • Structural metadata are those metadata acting as
    identifiers and descriptors of the data, such as
    names of variables or dimensions of statistical
    cubes. Structural metadata must be associated
    with the data, otherwise it becomes impossible to
    identify, retrieve and browse the data.
  • Reference metadata are metadata that describe the
    contents and the quality of the statistical data
    (conceptual metadata, describing the concepts
    used and their practical implementation,
    methodological metadata, describing methods used
    for the generation of the data, and quality
    metadata, describing the different quality
    dimensions of the resulting statistics, e.g.
    timeliness, accuracy).
  • Source: http://www.sdmx.org
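
The distinction can be illustrated with a small sketch in which structural metadata are what make observations retrievable, while reference metadata only describe them. This is not SDMX syntax: the dictionary layout, the dimension names, the codes and the figures are all made up for the example.

```python
# Structural metadata: names and code lists of the cube's dimensions.
# The dimension names and codes are hypothetical examples.
STRUCTURE = {
    "dimensions": ["country", "period", "sex"],
    "codes": {
        "country": {"ES": "Spain", "RO": "Romania"},
        "sex": {"F": "Female", "M": "Male"},
    },
}

# Reference metadata: descriptive notes on concepts, methods and quality.
REFERENCE = {
    "concept": "Placeholder text describing the concept measured.",
    "methodology": "Placeholder text describing how the data were produced.",
    "quality": {"timeliness": "placeholder", "accuracy": "placeholder"},
}

# Observations keyed by one code per dimension, in STRUCTURE order.
# The figures are made up.
OBSERVATIONS = {
    ("ES", "2005", "F"): 10.0,
    ("ES", "2005", "M"): 7.0,
    ("RO", "2005", "F"): 6.0,
}


def select(observations, structure, **criteria):
    """Return the observations whose dimension codes match the criteria."""
    positions = {name: i for i, name in enumerate(structure["dimensions"])}
    unknown = set(criteria) - set(positions)
    if unknown:
        raise KeyError(f"not a dimension: {sorted(unknown)}")
    return {
        key: value
        for key, value in observations.items()
        if all(key[positions[name]] == code for name, code in criteria.items())
    }


def describe(key, structure):
    """Turn a tuple of codes into readable labels using the code lists."""
    labels = []
    for name, code in zip(structure["dimensions"], key):
        labels.append(structure["codes"].get(name, {}).get(code, code))
    return ", ".join(labels)


if __name__ == "__main__":
    for key, value in select(OBSERVATIONS, STRUCTURE, country="ES").items():
        print(describe(key, STRUCTURE), value)
    print(REFERENCE["quality"])
```

Without `STRUCTURE`, the tuples in `OBSERVATIONS` could not be searched or labelled. Without `REFERENCE`, they could be retrieved, but a user would have no basis for judging what they mean or how reliable they are.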

31
Standards (ISO 11179)
  • ISO 11179: Information Technology - Metadata
    Registries (MDR).
  • ISO 11179 has an explicit registry metamodel as
    part of its model.
  • Standardized data design procedures for
    supporting electronic data interchange.
  • It develops a set of principles, methods and
    procedures for specifying what is needed to
    document the association between the various
    types of administered items and one or more
    classification schemes.
  • It does not establish a particular classification
    scheme as preeminent.

32
Terminologies (Neuchâtel Terminological Model)
  • It defines the key concepts that are relevant for
    the structuring of metadata and provides the
    conceptual framework for the development of a
    database organising that metadata.
  • A Terminology lists statistical concepts.
  • A Model is a set of related concepts which is
    used for producing a structured specification of
    some area of interest.

33
Terminologies (Neuchâtel Terminological Model)
  • Purpose: to arrive at a common language and a
    common perception of the structure of
    classifications.
  • It is both a terminology and a conceptual model.
  • It has a two-level structure:
  • First, the level of the object types.
  • Second, the attributes associated with each
    object type.
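
The two-level structure can be pictured as a list of object types, each carrying its own list of attributes. The sketch below is only an illustration of that shape: the object types and attribute names are assumptions chosen for the example, not a transcription of the model.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectType:
    """First level: an object type. Second level: its attributes."""
    name: str
    attributes: list = field(default_factory=list)


# Illustrative selection only. The object types and attributes below are
# assumptions for this sketch, not quoted from the model.
MODEL = [
    ObjectType("Classification", ["name", "description", "owner"]),
    ObjectType("Classification level", ["level number", "level name"]),
    ObjectType("Classification item", ["code", "official title", "parent item"]),
]


def attributes_of(model, object_type_name):
    """Look up the attributes recorded for one object type."""
    for object_type in model:
        if object_type.name == object_type_name:
            return list(object_type.attributes)
    raise KeyError(object_type_name)


if __name__ == "__main__":
    for object_type in MODEL:
        print(object_type.name, "->", ", ".join(object_type.attributes))
```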

34
Languages for Statistical Metadata (XBRL)
  • XBRL is a language for the electronic
    communication of business and financial data
    which is revolutionising business reporting
    around the world.

35
Languages for Statistical Metadata (XML)
  • SDMX makes use of the schema definition language
    known as W3C XML Schema (XSD).
  • The combination of statistical metadata and XML
    (Extensible Markup Language) leads to the
    creation of a framework for organizing and
    retrieving statistical information.
  • Statistical information takes heterogeneous forms,
    ranging from text and numbers to graphs,
    tables and even multimedia. This means
    different types of data.

36
Languages for Statistical Metadata (XML)
  • Such heterogeneity creates barriers to organising
    statistical data and making them accessible from
    a Web page.
  • An ideal solution to such heterogeneous data is
    to use an object-oriented database.
  • Another solution is to use statistical metadata
    and XML to construct a framework for organising
    and searching statistical data.
  • Source: Bi and Murtagh
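
A minimal sketch of the second solution, assuming a small XML catalogue in which every item, whatever its form, is described by the same metadata elements. The element and attribute names (`catalogue`, `item`, `kind`, `title`, `keyword`) are invented for this illustration and do not follow any published schema; this is not the framework described by the source.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue. The element and attribute names are assumptions
# for this sketch, not a real schema.
CATALOGUE_XML = """
<catalogue>
  <item id="t1" kind="table">
    <title>Population by region</title>
    <keyword>population</keyword>
    <keyword>region</keyword>
  </item>
  <item id="g1" kind="graph">
    <title>Population pyramid</title>
    <keyword>population</keyword>
    <keyword>age</keyword>
  </item>
  <item id="d1" kind="text">
    <title>Survey methodology</title>
    <keyword>methodology</keyword>
  </item>
</catalogue>
"""


def search(root, keyword, kind=None):
    """Return ids of items carrying the keyword, optionally of one kind."""
    hits = []
    for item in root.iter("item"):
        keywords = {k.text for k in item.findall("keyword")}
        if keyword in keywords and kind in (None, item.get("kind")):
            hits.append(item.get("id"))
    return hits


if __name__ == "__main__":
    root = ET.fromstring(CATALOGUE_XML.strip())
    print(search(root, "population"))
    print(search(root, "population", kind="graph"))
```

Because the search runs over the metadata rather than the items themselves, tables, graphs and text documents can be found through one uniform query.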

37
Statistical metadata in Spain
  • We already have metadata in different fields
    (methodologies).
  • The objective of metadata is to build a tool in
    the medium term to facilitate the integration
    and co-ordination of all the information
    requested by INE from data providers.
  • Our aim is to produce more harmonised and more
    comparable information, and to give data users a
    tool covering every statistical operation
    performed by INE.
  • Source: Blanco and Sánchez-Luengo

38
Statistical metadata in Spain
Metadata: scope, source, frequency, IOE code
39
Statistical metadata in Spain
Survey Methodology
40
Statistical metadata in Spain
Survey design
41
Metadata in European Websites: Eurostat
Metadata icon
42
Metadata in European Websites: Eurostat
43
Metadata in European Websites: Romania
44
Metadata in European Websites: Romania
Metadata icon
45
Metadata in European Websites: Romania
46
References
  • OECD, Metadata for short-term indicators:
    International comparisons and best practices,
    working paper.
  • OECD, The role of metadata in promoting
    international comparisons and adherence to
    international statistical standards
    (http://www.oecd.org/std/metarole.htm).
  • Bureau of Census, United States, Transition plan
    for unified approach to metadata management at
    the Bureau of the Census, working paper.
  • UN/ECE Secretariat, Standards for Statistical
    Metadata on Internet, working paper.
  • Statistics Canada, The evolution of metadata at
    Statistics Canada: an integrative approach,
    working paper.
  • Statistics New Zealand, examples of metadata in
    the Survey and Output Information Database and
    INFOS database at
    http://www.stats.govt.nz/statsweb.nsf.
  • Statistics Canada, examples of metadata in
    Information on Products and Services Catalogue at
    http://www.statcan.ca/english/search/ips.htm.
  • http://www.intracen.org/countries/metadata.htm

47
  • Thank you very much for your attention