Case Studies: Statistics Canada (WP 11)

About This Presentation
Title:

Case Studies: Statistics Canada (WP 11)

Description:

Case Studies: Statistics Canada (WP 11) Alice Born alice.born_at_statcan.ca Statistics Canada UNECE Workshop on Statistical Metadata

Slides: 27
Provided by: born4
Learn more at: https://unece.org

Transcript and Presenter's Notes

Title: Case Studies: Statistics Canada (WP 11)


1
Case Studies: Statistics Canada (WP 11)
  • Alice Born alice.born_at_statcan.ca
    Statistics Canada
  • UNECE Workshop on Statistical Metadata
  • July 4 to 6, 2007

2
Outline
  1. Overview
  2. Statistical metadata systems and the statistical
    cycle
  3. Statistical metadata in each phase of the
    statistical cycle
  4. Systems and design issues
  5. Organizational and cultural issues

3
Overview of Integrated Metadatabase (IMDB)
  • To support interpretation of the data
    (dissemination phase)
  • Responsibility of Standards Division (metadata,
    classifications and standard definitions)
  • Adherence to Policy on Informing Users on Data
    Quality and Methodology, Policy on Standards and
    Quality Assurance Framework
  • In general, metadata goes back to November 2001

4
Overview of Integrated Metadatabase (IMDB)
  • Contains metadata on 350 active and 250 inactive
    surveys and statistical programs
  • Purpose
  • Methodology used to produce the data
  • Measures of data accuracy
  • Variables, classifications for the data
  • Location of clean master datafile
  • Contacts
  • Survey managers cannot release data without the
    prescribed metadata (mandatory)
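
The release rule above can be pictured as a completeness check. The following is a minimal sketch, assuming a survey's metadata is held as a dictionary. The field names and both functions are hypothetical stand-ins for the items listed on this slide and are not taken from the IMDB itself.

```python
# Hypothetical field names standing in for the metadata items listed above.
REQUIRED_FIELDS = (
    "purpose",
    "methodology",
    "data_accuracy",
    "variables_and_classifications",
    "master_datafile_location",
    "contacts",
)


def missing_metadata(record: dict) -> list:
    """Return the required fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def can_release(record: dict) -> bool:
    """A survey may be released only when no required field is missing."""
    return not missing_metadata(record)


# Illustrative record with invented values (not real IMDB content).
example = {
    "purpose": "Measure household spending",
    "methodology": "Stratified sample",
    "contacts": ["survey.manager@example.org"],
}
print(missing_metadata(example))
print(can_release(example))
```

Returning the list of missing fields, rather than only a yes or no, lets a survey manager see what still has to be documented before release.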

5
Overview of Integrated Metadatabase (IMDB)
  • Next priorities
  • Complete documentation of variables
  • Complete questionnaire model
  • Determine metadata for archived datafiles (may
    require additional metadata)
  • Lessons learned
  • Opportunities in collecting metadata in the first
    phase of the statistical cycle, not at the time
    of dissemination

6
Statistical metadata systems and the statistical
cycle
  • Relationship with survey planning and design
    phase
  • IMDB expanded its role as part of the Household
    Survey Content Harmonization
  • Standardize concepts, questions, question blocks
    across household surveys
  • Variables follow ISO/IEC 11179
  • Questions, question blocks and associated
    response choices, linked to variables and
    classifications, are stored in the IMDB at the
    beginning
  • Survey Specification Manager pulls metadata from
    the IMDB but contains specifications and code
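
The links described above (questions and response choices tied to variables and classifications) can be sketched as a small data model. This is an illustration only: the class names, the `unmapped_choices` helper and the example categories are assumptions, not the IMDB schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Classification:
    name: str
    categories: Dict[str, str]  # code -> label


@dataclass
class Variable:
    name: str
    classification: Classification


@dataclass
class Question:
    text: str
    variable: Variable
    # Response choice text -> classification code it is meant to map to.
    response_choices: Dict[str, str] = field(default_factory=dict)

    def unmapped_choices(self) -> List[str]:
        """Response choices whose code is not in the linked classification."""
        codes = self.variable.classification.categories
        return [c for c, code in self.response_choices.items() if code not in codes]


# Invented example content, for illustration only.
tenure = Classification("Tenure (invented)", {"1": "Owned", "2": "Rented"})
question = Question(
    text="Is this dwelling owned or rented?",
    variable=Variable("Tenure of dwelling", tenure),
    response_choices={"Owned": "1", "Rented": "2", "Other": "9"},
)
print(question.unmapped_choices())
```

A check of this kind is one way harmonized questions could be kept consistent with the classifications they feed.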

7
Statistical metadata systems and the statistical
cycle
  • Relationship to dissemination systems
  • Metadata for information modules on the STC
    website (mandatory)
  • Information for survey respondents requires
    metadata prior to release of data
  • Data Liberation Initiative: public-use microdata
    files documented in DDI
  • Metadata to support data exchange: SDMX, DDI,
    XBRL, Wiki, HTML, etc.

8
Statistical metadata systems and the statistical
cycle
  • Relationship to aggregation - analysis phase
  • Analytical datawarehouses use IMDB to organize
    their tables (variables and classifications)
  • Relationship to archive phase
  • IMDB contains location of master datafile, record
    layout, contact information
  • Currently developing business rules for archived
    datafiles

9
Statistical metadata systems and the statistical
cycle
  • Relationship with management systems
  • Software Register: registry of the Agency's
    software and applications, organized by survey
    and statistical program (IMDB is the inventory)
  • Quality management: assessment and questionnaire
    based on the inventory of surveys in the IMDB
    (reuse of existing metadata)

10
IMDB in the survey life cycle
  • Diagram (labels only)
  • IMDB (Metadata)
  • Collect, Edit, Estimate, Tabulate, Publish,
    Design, Archive
  • Data Warehouses, Operations Management, Quality
    Assurance, Analysis, Dissemination
  • Operational Data, Registers, Survey Data,
    Administrative Data, Operational Data Stores
11
Statistical metadata for phases in the
statistical cycle
  • Metadata describing statistical business
    processes
  • Data dissemination for interpretation of data
  • IMDB serves as the corporate inventory of all
    surveys and statistical programs, questionnaires,
    master datafiles
  • Metadata or paradata reside in other
    metainformation systems (SSM, IQMS)

12
Statistical metadata for phases in the
statistical cycle
  • Metadata for data elements
  • Supports survey planning and design, analysis,
    dissemination and archiving
  • Metadata objects tracked over time for changes
    (versioning) and validity (registration)
  • Output to online data tables and STC products
  • For discovery: inventory of DE on STC website
    and STCWiki (internal review before going public)
  • Links to questions, question blocks, datafiles

13
STCWiki: Type of marital status of person
14
Statistical metadata for phases in the
statistical cycle
  • Metadata for survey planning and design
  • Questions, standard question blocks and standard
    response choices in IMDB
  • Mapped to value domains, data elements and
    surveys in the IMDB
  • These metadata are assembled into collection
    instruments in other metainformation systems
    outside the IMDB

15
Systems and design issues
  • IMDB started in 1998
  • Phase 1: Consolidation of existing metadata stores
  • Phase 2: Metadata describing statistical business
    processes
  • Phase 3: Metadata for data elements, etc.
  • MetaStat system: statistical activity, survey,
    instance, frame, universe, instrument, datafiles,
    survey methodology, documentation, data accuracy
  • MetaWeb system: object class, property, data
    element, value domain, question, response
    choices, question block, value meaning manager
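
Several of the MetaWeb objects named above correspond to ISO/IEC 11179 concepts: an object class and a property together form a data element concept, and a data element concept paired with a value domain forms a data element. The sketch below illustrates that composition. The class layout, the `representation` field, the invented codes and the way the STCWiki title "Type of marital status of person" is split into parts are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataElementConcept:
    object_class: str  # e.g. "person"
    prop: str          # the property, e.g. "marital status"


@dataclass(frozen=True)
class ValueDomain:
    name: str
    permitted_values: Tuple[Tuple[str, str], ...]  # (code, value meaning) pairs


@dataclass(frozen=True)
class DataElement:
    representation: str  # assumed naming term, e.g. "Type"
    concept: DataElementConcept
    value_domain: ValueDomain

    def label(self) -> str:
        """Build a readable name from the representation, property and object class."""
        return f"{self.representation} of {self.concept.prop} of {self.concept.object_class}"


marital_status = DataElement(
    representation="Type",
    concept=DataElementConcept(object_class="person", prop="marital status"),
    value_domain=ValueDomain(
        name="Marital status codes (invented)",
        permitted_values=(("1", "Married"), ("2", "Never married")),
    ),
)
print(marital_status.label())
```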

16
Phase 2 Input Screens
  • Text strings related to data components
  • Directives Resource Bundle: Key SurveySDDS, Value
    Statistical Data Doc ...
  • Labels Resource Bundle: Key SurveySDDS, Value
    SDDS ...
  • IMDB database
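
The slide appears to show two resource bundles (labels and directives) that share a key per data component. A minimal sketch of that lookup follows, assuming plain dictionaries in place of the real bundles. The function `screen_text` is hypothetical, and the values are copied as truncated on the slide, with the "..." left in place.

```python
# Text copied as truncated on the slide; the "..." is in the source.
LABELS = {"SurveySDDS": "SDDS ..."}
DIRECTIVES = {"SurveySDDS": "Statistical Data Doc ..."}


def screen_text(key: str) -> dict:
    """Look up the label and directive for one data component key.

    Falls back to the key itself so a missing string stays visible on screen.
    """
    return {
        "label": LABELS.get(key, key),
        "directive": DIRECTIVES.get(key, key),
    }


print(screen_text("SurveySDDS"))
```

Keeping display strings in bundles keyed by component, rather than in the screen code, is what allows the same input screen to show different text without changes to the application.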
17
Phase 2 Input Screen: Administered Item
18
Phase 2 - Identification Tab
19
Systems and design issues
  • Dissemination and information discovery systems
  • Web publication from IMDB is through HTML,
    dynamically generated with Perl scripts
  • Conforms to government standards (CLF)
  • Survey-centric view and developing DE-centric
    view
  • Discovery from Wiki solution: non-linear view of
    Phase 2 and 3 metadata
  • Allows users to view links among administered
    items in the IMDB
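
Viewing links among administered items amounts to walking a graph of related objects. The sketch below shows one way to do that with a breadth-first walk. It is written in Python purely for illustration (the slide mentions Perl for web publication), and the item names and links are invented.

```python
from collections import deque

# Hypothetical links between administered items (names invented).
LINKS = {
    "survey:A": ["question:Q1", "datafile:F1"],
    "question:Q1": ["data_element:DE1"],
    "data_element:DE1": ["value_domain:VD1"],
    "datafile:F1": ["data_element:DE1"],
    "value_domain:VD1": [],
}


def reachable(start: str, links: dict) -> list:
    """Breadth-first walk returning every item linked from `start`, in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        item = queue.popleft()
        for neighbour in links.get(item, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


print(reachable("survey:A", LINKS))
```

Starting the walk from a data element instead of a survey gives the DE-centric view with the same data and the same function.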

20
Organizational and cultural issues
  • Information management
  • Assist in harmonization / usage of standards
  • Knowledge sharing
  • Corporate memory
  • Reuse of our metainformation assets

21
Knowledge Sharing/Corporate Memory: Survey Life Cycle
Collect, Edit, Estimate, Tabulate, Publish, Design
Survey, Universe, Frame, Instance, Collection Instrument, Methodology,
Data Files
Enterprise Architecture
Concepts (Object Class, Property, Data Element Concept), Data Elements,
Questions, Question Blocks, Classifications (Conceptual Domain, Value
Domain)
22
Corporate Memory: Data Files
Operational Data
Registers
Survey Data
Administrative Data
Operational Data Stores
Public Use Master File
Archival information
Clean Master File
Archived Data
23
Reuse of Information Assets: Information
Discovery/Dissemination
  • One metadata source
  • many uses for the information
  • many output formats
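
The idea of one source feeding many output formats can be sketched with a single record rendered three ways. This is an illustration under stated assumptions: the record, its field names and the three renderer functions are invented, and the formats shown (HTML, JSON, plain text) are only examples of output formats.

```python
import html
import json

# Invented record; the field names are illustrative only.
RECORD = {
    "name": "Type of marital status of person",
    "status": "Draft & under review",
}


def to_html(record: dict) -> str:
    """Render the record as an HTML definition list, escaping special characters."""
    rows = "".join(
        f"<dt>{html.escape(key)}</dt><dd>{html.escape(str(value))}</dd>"
        for key, value in record.items()
    )
    return f"<dl>{rows}</dl>"


def to_json(record: dict) -> str:
    """Render the record as JSON with a stable key order."""
    return json.dumps(record, sort_keys=True)


def to_plain_text(record: dict) -> str:
    """Render the record as 'key: value' lines."""
    return "\n".join(f"{key}: {value}" for key, value in record.items())


print(to_html(RECORD))
print(to_json(RECORD))
print(to_plain_text(RECORD))
```

Because every renderer reads the same record, a correction made once in the source shows up in every output.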



24
Reuse of Information Assets: Applications
Development
Classification coding
Collection instrument development
Publishing
Other applications
25
Reuse of Information Assets: Integration with Data
Data Warehouses
CANSIM
26
Organizational and cultural issues
  • STC is one of the most integrated statistical
    systems in the world
  • As part of its Enterprise Architecture strategy,
    STC is moving towards centralized and generalized
    systems, including the IMDB
  • IMDB was built initially to support
    interpretation of disseminated data
  • Pressure is to provide metadata up (and down) the
    statistical value chain and into management
    systems
  • Opportunities at the survey planning and design
    phase: reuse of existing metadata (variables,
    classifications, questions, etc.) registered in
    the IMDB, for coherence