Microdata Management Toolkit: Tools to facilitate archive and dissemination of surveys

1
Microdata Management Toolkit: Tools to facilitate
archive and dissemination of surveys
Session E2 - Thursday, 26 May: Tools for
Preservation, Integration and Assessment
Preserving and improving the access to large and complex
household surveys
  • A PDF for Data?
  • Metadata Editor / Nesstar Publisher 3.5
  • CD builder
  • Guidelines for Archiving and Dissemination

2
Background
  • Sponsored by World Bank / International Household
    Survey Network
  • Presented earlier this week
  • Created in September 2004
  • International organizations actively sponsoring
    household surveys
  • Marrakech Action Plan for Statistics
  • http://www.surveynetwork.org
  • Surveys are often under-used: limited access for
    users leads to poor return on investment, limited
    impact on the ground, and difficulties in policy
    making
  • Common obstacles: quality, technical capacity,
    legal/political issues
  • Common problems
  • Accessibility, Timeliness, Coherence
  • Lack of metadata / documentation / data
  • Poorly organized archives
  • To address technical issues: need for new tools
    and guidelines → Microdata Management Toolkit

3
Toolkit Requirements
  • User friendly software suite and guidelines to
    archive and disseminate microdata
  • Facilitate metadata exchange: compliant with
    common XML specifications (DDI, Dublin Core)
  • Facilitate archiving: put together metadata and
    data, address common quality control issues
  • Facilitate dissemination: simple to redistribute
    on CD/DVD and the web, answer producer/depositor
    needs (subset, anonymization, quality control)
  • Works with common data formats (SPSS, SAS, Stata,
    Statistica, CSPro/IMPS/ISSA)
  • Multilingual support
  • Free or inexpensive
  • Availability of technical support and training
  • Accompanied by guidelines and a training program
  • Supported by national, international and research
    communities
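
As an illustration of what "metadata exchange compliant with common XML specifications" can look like in practice, here is a minimal sketch that writes a few Dublin Core elements with Python's standard library. The wrapper element name `metadata`, the helper `dublin_core_record`, and the sample values are assumptions made for this example; they are not taken from the Toolkit.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a small XML record from a dict of Dublin Core element -> value."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical description of a survey document.
record = dublin_core_record({
    "title": "Household Survey 2004 questionnaire",
    "creator": "National Statistics Office",
    "language": "en",
})
print(record)
```

Because the record uses the standard Dublin Core namespace, any XML-aware tool can read the same fields back without knowing which program wrote them.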

4
Core file format - A PDF for Data?
  • How can we carry around the information?
  • Looking at documents → PDF
  • Can we do the same for data?
  • Yes, a Nesstar file holds data and metadata!
  • Partner with Nesstar Ltd to develop new tools
  • Why: strong tool for metadata management,
    available today, community acceptance, technical
    support, past experience
  • Development agreement
  • Enhance existing Publisher software and make it
    available as a stand-alone product
  • Open binary file format (not a black box) and
    availability of API
  • Free data reader (like PDF) that allows users to
    access the data and metadata and convert them to
    their favorite format
  • Special licensing agreement for developing
    countries
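
The "PDF for data" idea above, one file that carries both the data and its metadata, can be sketched with a plain zip container. This is only an analogy under stated assumptions: it is not the Nesstar file format, and the member names `data.csv` and `metadata.json` and both helper functions are invented for the example.

```python
import io
import json
import zipfile


def pack_study(data_csv, metadata):
    """Bundle data and metadata into a single in-memory container."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("data.csv", data_csv)
        zf.writestr("metadata.json", json.dumps(metadata))
    return buf.getvalue()


def unpack_study(blob):
    """Read the data and metadata back from the container."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        data_csv = zf.read("data.csv").decode("utf-8")
        metadata = json.loads(zf.read("metadata.json"))
    return data_csv, metadata


# Hypothetical two-row dataset with variable labels as metadata.
blob = pack_study(
    "hhid,hhsize\n101,4\n102,6\n",
    {"title": "Household Survey 2004", "variables": {"hhsize": "Household size"}},
)
data_csv, metadata = unpack_study(blob)
print(metadata["title"], len(data_csv.splitlines()) - 1, "records")
```

The design point is the same as on the slide: whoever receives the file gets the documentation together with the data, so the two cannot drift apart.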

5
Toolkit Components
  • Archiving: Metadata Editor (World Bank / Nesstar
    Ltd.)
  • To compile survey data, documentation and
    metadata in a standard format (Nesstar/DDI). Free
    data reader for users.
  • Built on Nesstar Publisher
  • Dissemination: CD Builder (World Bank / Mark
    Diggory)
  • To facilitate the publication of survey data,
    documentation and metadata on CD-ROM and on the
    web (transforms DDI into HTML based navigation)
  • Based on Eclipse Platform, open source
  • Guidelines: Handbook (World Bank / ICPSR)
  • To provide data producers with information on
    policies and legal aspects of data dissemination,
    guidelines to document datasets and
    recommendations for setting up a data archive

6
The Toolkit Process
1. Import data and compile metadata
2. Import metadata and prepare CD-ROM
3. Generate HTML based CD-ROM
7
What is the Nesstar Publisher?
  • Advanced data management program
  • DDI/DC metadata authoring tool
  • Import/Export to common data formats
  • Standalone or with Nesstar Server
  • http://www.nesstar.com
  • Easy editing/creation of DDI documented datasets.
    No need to know XML.
  • Full DDI import and export for single
    file/language studies.
  • Templates that let your organization
    standardize the use of the DDI.
  • Default texts in templates.
  • Local controlled vocabularies.
  • Possible to share the documentation work between
    different persons.
  • A Category Repository which lets you share
    categories within a dataset and between datasets.
  • Variable groups.
  • Easy setting of weights.
  • Frequency and summary statistics output, with
    options for each variable.
  • Import and export to the most common statistical
    formats.
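
To make the "weights" and "frequency and summary statistics" items above concrete, the following is a minimal sketch of weighted frequencies and a weighted mean. It shows the general technique only; the function names and sample values are assumptions and do not describe how Nesstar Publisher computes its output.

```python
from collections import defaultdict


def weighted_frequencies(values, weights):
    """Return {category: (weighted count, weighted percent)}."""
    totals = defaultdict(float)
    for value, weight in zip(values, weights):
        totals[value] += weight
    grand_total = sum(totals.values())
    return {v: (t, 100.0 * t / grand_total) for v, t in totals.items()}


def weighted_mean(values, weights):
    """Weighted mean of a numeric variable."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical sample: three households with sampling weights.
area = ["urban", "rural", "urban"]
hhsize = [4, 6, 2]
weight = [2.0, 1.0, 1.0]

print(weighted_frequencies(area, weight))
print(weighted_mean(hhsize, weight))
```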

8
What is the Metadata Editor?
  • Nesstar Publisher 3.0
  • A tool to prepare and publish surveys to a
    Nesstar Server
  • Sold as a component of the Nesstar Software
    Suite
  • Multiple components (editor, hierarchy, cube,
    resources)
  • → New model for version 3.5
  • All components integrated under one interface
  • A study is stored in a single Nesstar file
  • Enhanced and new functionalities
  • Quality control, computed variables, recodes,
    anonymize, subset
  • Availability of a free Nesstar Data Reader
  • Produce DDI / Dublin Core (DC) XML documents
  • Available as a stand-alone software package

9
Editor key features (1)
Template-driven metadata editor allows users
to decide which DDI/DC elements to use.
All surveys stored as projects in a single tree
hierarchy
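
A template-driven approach like the one described on this slide can be sketched as a table of allowed elements, each marked mandatory or optional, against which a metadata record is checked. The element names and the `check_against_template` helper are hypothetical; the actual template format of the Metadata Editor is not described in the presentation.

```python
# Hypothetical template: element name -> is the element mandatory?
TEMPLATE = {
    "title": True,
    "producer": True,
    "abstract": False,
}


def check_against_template(metadata, template):
    """Return (missing mandatory elements, elements not in the template)."""
    missing = [name for name, mandatory in template.items()
               if mandatory and not metadata.get(name)]
    unknown = [name for name in metadata if name not in template]
    return missing, unknown


# A record that lacks a producer and uses an element the template omits.
record = {"title": "Household Survey 2004", "funding": "World Bank"}
missing, unknown = check_against_template(record, TEMPLATE)
print("missing:", missing, "unknown:", unknown)
```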
10
Editor key features (2)
Easy to use interface for document, survey, file
and variable metadata editing
11
Editor key features (3)
Data import preserves existing dictionary and
generates summary statistics
DDI and Dublin Core Metadata import/export
Manage variable groups
12
Editor key features (4)
Description of a dataset's primary keys and
hierarchy, and validation of dataset relationships
Support for survey documentation as Dublin Core
resources
Automatic randomization of primary key variables
AND MORE
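
The key-related features on this slide (validating relationships between files and randomizing primary keys) can be illustrated with a small sketch. It assumes each file is a list of row dictionaries sharing a key variable; the function names and the `hhid` variable are invented for the example and do not reflect the Editor's internals.

```python
import random


def orphan_keys(parent_rows, child_rows, key):
    """Key values used in the child file that have no match in the parent."""
    parent_ids = {row[key] for row in parent_rows}
    return sorted({row[key] for row in child_rows if row[key] not in parent_ids})


def randomize_key(files, key, seed=None):
    """Replace key values with shuffled IDs, consistently across all files."""
    rng = random.Random(seed)
    old_ids = sorted({row[key] for rows in files for row in rows})
    new_ids = list(range(1, len(old_ids) + 1))
    rng.shuffle(new_ids)
    mapping = dict(zip(old_ids, new_ids))
    return [[{**row, key: mapping[row[key]]} for row in rows] for rows in files]


# Hypothetical household and person files linked by hhid.
households = [{"hhid": 101}, {"hhid": 102}]
persons = [{"hhid": 101, "pid": 1}, {"hhid": 101, "pid": 2},
           {"hhid": 103, "pid": 1}]

print(orphan_keys(households, persons, "hhid"))  # persons without a household
```

Applying one shared mapping to every file is what keeps the household/person hierarchy intact after the original identifiers are removed.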
13
Data Reader
  • Free software
  • PDF philosophy
  • Access to survey metadata
  • Access to data (no need for specialized software)
  • Export to common formats
  • Single file holds data and metadata

14
What is the CD Builder?
  • Purpose is to publish survey metadata, documents
    and data on a CD-ROM (or web site)
  • Transforms DDI into an HTML based interface
  • User can customize the layout (branding) and
    content of the CD (single or multi-surveys)
  • Open source application
  • Built on the Eclipse Framework
  • Based on DDI / Dublin Core
  • Integrates with Metadata Editor
  • Easy to use

15
CD Builder Process
1. Create new CD-ROM project
2. Add a survey to the project and select its type
   and branding
  • Selecting a survey consists in opening the
    DDI-XML or Nesstar file
  • The survey branding determines the overall
    look and feel of the CD
  • The survey type determines the default
    metadata content
3. Click the Save button to generate the HTML
   interface
4. After a few minutes, your CD project is ready for
   publishing!
16
Key Features
Content of CD pages is fully customizable
A CD-ROM project can hold several surveys
  • Branding customization
  • Can be published to web
  • Multilingual support
  • Automatic updates
  • and more

17
Sample output
18
Handbook
  • Handbook on the Documentation, Dissemination,
    and Preservation of Microdata
  • Part I: Policy, legal and ethical issues and
    recommendations. Benefits and costs of microdata
    dissemination
  • Part II: Technical guidelines for documenting,
    disseminating and preserving a dataset
  • Part III: Setting up a central data archive

19
Benefits and Users (1)
  • What will the toolkit improve?
  • Documentation (based on standards, guidelines and
    validation)
  • Preservation: data and metadata stay together, CD
    archiving
  • Cataloguing: facilitate metadata exchange
  • Dissemination: CD, DVD, Web
  • Quality: validation procedures, use of common
    language, adoption of best practices

20
Benefits and Users (2)
  • Potential users?
  • Survey producers at national level: preservation,
    dissemination, harmonized framework
  • International survey sponsors
  • Data archives
  • Who will benefit?
  • Data producers
  • National and international survey sponsors
  • Survey data repositories
  • Data analysts
  • Policy makers and population
  • DDI Community

21
Status and Availability
  • Publisher 3.5
  • Beta version available
  • Nesstar commercial release during the summer
  • CD Builder
  • Beta version available
  • Public release expected in September (Open
    Source)
  • Guidelines
  • Draft completed
  • Review over the summer

22
Next?
  • Distribution, training and adoption of the
    toolkit
  • User acceptance tests and pilot sites
  • Release of open source components (Sourceforge,
    DDI)
  • Future developments
  • Translations into other languages
  • Plug-ins for Publisher and/or Reader (open
    source)
  • Availability of API library
  • Basic analytical functionalities (tabulation,
    graphs, etc.)
  • Evaluation of disclosure risks / anonymization
    procedures
  • Embed document in archive file (?)
  • Plan for DDI 3.0 support
  • Bug fixes / enhancements / new features (based on
    user feedback)
  • And more, based on feedback from users and the
    DDI open source community
  • Integration of other tools
  • Argus: confidentiality
  • CSPro: production
  • Virtual Data Center (VDC): web based
    dissemination
  • Strong collaboration and participation of the
    community
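
For the "evaluation of disclosure risks" item listed above, one common starting point is to count how many records share each combination of identifying variables and flag combinations that occur fewer than k times. The sketch below shows that general idea only; the variable names, the threshold and the `risky_records` helper are assumptions, not a planned Toolkit feature.

```python
from collections import Counter


def risky_records(rows, quasi_identifiers, k=2):
    """Rows whose combination of quasi-identifiers occurs fewer than k times."""
    def combo(row):
        return tuple(row[q] for q in quasi_identifiers)

    counts = Counter(combo(row) for row in rows)
    return [row for row in rows if counts[combo(row)] < k]


# Hypothetical respondents: the only 71-year-old in region S stands out.
rows = [
    {"region": "N", "age": 30},
    {"region": "N", "age": 30},
    {"region": "S", "age": 71},
]
print(risky_records(rows, ["region", "age"]))
```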

23
Thank you!
  • QUESTION / ANSWER