The Data Documentation Initiative (DDI): Metadata Standards to Support Access, Sharing and Preservation of Data

1
The Data Documentation Initiative (DDI): Metadata
Standards to Support Access, Sharing and
Preservation of Data
  • Wendy Thomas
  • Minnesota Population Center
  • TCRG Presentation
  • 8 April 2009

2
Metadata provides support for
  • Survey and data collection preparation
  • Data collection
  • Data processing
  • Analysis
  • Data discovery and access
  • Replication
  • Repurposing (secondary data use)

3
Metadata
  • Metadata is essential information for research
    and reuse of data
  • The further data gets from its source, the
    greater the importance of the metadata
  • Content is critical
  • Structure is becoming increasingly important in a
    networked world

4
Why Standards?
  • Standards provide structure for
    • Accurate transfer of content between systems
    • Increased automation of ingest, reducing costs
    • Interoperability between systems and software
    • Structural base for discovery and comparison

5
Example: Dublin Core
  • Print card catalogs
    • Static
    • Stationary
  • Standalone databases
    • Proprietary structure
    • Little cross-site searching
  • WorldCat and Google
    • Standardized content
    • Cross-site searching

6
Interacting Standards for Data
  • Dublin Core
    • Citation structure
    • Coverage
      • Temporal
      • Topical
      • Spatial
    • Location specific information
  • ISO/IEC 11179
  • ISO 19118 Geography
  • Statistical Packages
  • METS
  • PREMIS
  • SDMX
  • DDI
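
As a rough illustration of the Dublin Core role on this slide (citation plus temporal, topical and spatial coverage), the sketch below builds a small record from standard Dublin Core element names and serializes it to XML. The wrapper element name, the helper function and every value in the record are assumptions made up for the example; none of them come from the presentation.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set (title, creator, date, subject, coverage, ...).
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Return an XML element with one Dublin Core child per (name, value) pair."""
    record = ET.Element("record")  # wrapper element name is an assumption
    for name, value in fields:
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record


# Hypothetical study description; all values are illustrative only.
study_fields = [
    ("title", "Example Household Survey"),       # citation
    ("creator", "Example Research Center"),      # citation
    ("date", "2009"),                            # citation
    ("subject", "household composition"),        # topical coverage
    ("coverage", "2005/2008"),                   # temporal coverage
    ("coverage", "Minnesota, United States"),    # spatial coverage
]

record = build_dc_record(study_fields)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Dublin Core uses the single `coverage` element for both spatial and temporal extent, which is why it appears twice here.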

7
Interacting Standards for Data
  • Dublin Core
  • ISO/IEC 11179
    • Structure and content of a data element as the
      building block of information
    • Supports registry functions
    • Provides
      • Object
      • Property
      • Representation
  • ISO 19118 Geography
  • Statistical Packages
  • METS
  • PREMIS
  • SDMX
  • DDI
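
One way to picture the object / property / representation split is as a small record type held in a registry. The sketch below is only a loose illustration of that idea: the class, the field names, the registry dictionary and the sample element are all hypothetical and are not taken from the ISO/IEC 11179 text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """A data element described by object, property and representation."""
    object_class: str     # what the data is about, e.g. "Person"
    property_name: str    # the characteristic measured, e.g. "Age"
    representation: str   # how values are expressed, e.g. "integer, completed years"

    @property
    def concept(self) -> str:
        # The concept is independent of any particular representation.
        return f"{self.object_class}.{self.property_name}"


# A toy registry: data elements looked up by their concept name.
registry = {}


def register(element: DataElement) -> None:
    """Add an element to the registry, refusing silent overwrites."""
    if element.concept in registry:
        raise ValueError(f"{element.concept} is already registered")
    registry[element.concept] = element


age = DataElement("Person", "Age", "integer, completed years")
register(age)
print(registry["Person.Age"].representation)
```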

8
Interacting Standards for Data
  • Dublin Core
  • ISO/IEC 11179
  • ISO 19118 Geography
    • US FGDC and MN standard
    • Focus is on describing spatial objects and their
      attributes
  • Statistical Packages
  • METS
  • PREMIS
  • SDMX
  • DDI

9
Interacting Standards for Data
  • Dublin Core
  • ISO/IEC 11179
  • ISO 19118 Geography
  • Statistical Packages
    • Proprietary standards
    • Content is generally limited to
      • Variable name
      • Variable label
      • Data type and structure
      • Category labels
    • Translation tools used to transport content
  • METS
  • PREMIS
  • SDMX
  • DDI
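
The variable-level content listed on this slide (name, label, data type, category labels) is small enough to sketch as a record type, together with a neutral text form that a translation tool might use to move it between systems. Everything here is an assumption for illustration: the class, the JSON layout and the sample variable do not correspond to any particular statistical package or tool.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Variable:
    """The metadata a statistical package typically stores for one variable."""
    name: str
    label: str
    data_type: str
    # Category code -> category label; codes are kept as strings so they
    # survive a JSON round trip unchanged.
    categories: dict = field(default_factory=dict)


def export_variables(variables):
    """Serialize variable metadata to a package-neutral JSON string."""
    return json.dumps([asdict(v) for v in variables], indent=2)


def import_variables(text):
    """Rebuild Variable objects from the neutral JSON string."""
    return [Variable(**item) for item in json.loads(text)]


# Hypothetical variable used only to exercise the round trip.
sex = Variable(
    name="SEX",
    label="Sex of respondent",
    data_type="numeric",
    categories={"1": "Male", "2": "Female"},
)

transported = import_variables(export_variables([sex]))
print(transported[0].label)
```

Anything beyond these four items, such as question text or methodology, has no place to go in this structure, which is the limitation the slide points to.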

10
Interacting Standards for Data
  • Dublin Core
  • ISO/IEC 11179
  • ISO 19118 Geography
  • Statistical Packages
  • METS
    • Digital Library Federation
    • Consistent outer wrapper for digital objects of
      all types
    • Contains a profile providing the structural
      information for the contained object
  • PREMIS
  • SDMX
  • DDI

11
Interacting Standards for Data
  • Dublin Core
  • ISO/IEC 11179
  • ISO 19118 Geography
  • Statistical Packages
  • METS
  • PREMIS
    • Preservation information for digital objects
  • SDMX
  • DDI

12
Interacting Standards for Data
  • Dublin Core
  • ISO/IEC 11179
  • ISO 19118 Geography
  • Statistical Packages
  • METS
  • PREMIS
  • SDMX
    • Developed for statistical tables
    • Supports well structured, well defined data,
      particularly time-series data
    • Contains both metadata and data
    • Supports transfer of data between systems
  • DDI

13
Interacting Standards for Data
  • Dublin Core
  • ISO/IEC 11179
  • ISO 19118 Geography
  • Statistical Packages
  • METS
  • PREMIS
  • SDMX
  • DDI
    • Version 3.0 covers life-cycle of data and
      metadata
      • Data collection
      • Processing
      • Management
      • Reuse or repurposing
    • Support for registries
    • Grouping and comparison

14
Metadata Coverage
  • Packaging
  • Citation
  • Geographic Coverage
  • Temporal Coverage
  • Topical Coverage
  • Structure information
  • Physical storage description
  • Variable (name, label, categories, format)
  • Source information
  • Methodology
  • Detailed description of data
  • Processing
  • Relationships
  • Life-cycle events
  • Management information
  • Dublin Core
  • ISO/IEC 11179
  • ISO 19118
  • Statistical Packages
  • METS
  • PREMIS
  • SDMX
  • DDI

22
DDI: Full content coverage for survey and
administrative data
  • Conceptual coverage
  • Methodology
  • Data Collection
  • Processing: cleaning, paradata
  • Recoding and derivations
  • Variable and tabular content
  • Internal relationships
  • Physical storage
  • Data management

23
Plus: Relationships between studies
  • Comparison by design
    • Study series can inherit from earlier metadata
    • Capture changes only
  • Data integration
    • Mapping of codes between source and target
    • Capture comparison information
  • Comparison of abstract content models
  • Publication of reusable materials (code schemes,
    concept schemes, geographic structure, etc.)
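
Two of the ideas on this slide, inheriting metadata within a study series while capturing only the changes, and mapping codes between a source and a target, can be sketched in a few lines. This is a toy model under stated assumptions, not DDI syntax: the function names, the dictionary layout, the survey waves and the marital-status code schemes are all hypothetical.

```python
def inherit(earlier, changes):
    """Return metadata for a later study: the earlier study's metadata
    with only the recorded changes applied on top."""
    merged = dict(earlier)   # copy, so the earlier study is left untouched
    merged.update(changes)
    return merged


def recode(values, code_map):
    """Translate source codes to target codes.

    An unmapped source code raises KeyError, so gaps in the mapping are
    noticed instead of being passed through silently.
    """
    return [code_map[value] for value in values]


# Hypothetical study series: wave 2 records only what differs from wave 1.
wave1 = {
    "title": "Household Survey, Wave 1",
    "universe": "Adults aged 18 and over",
    "mode": "face-to-face",
}
wave2_changes = {"title": "Household Survey, Wave 2", "mode": "telephone"}
wave2 = inherit(wave1, wave2_changes)

# Hypothetical code schemes:
#   source: 1 single, 2 married, 3 widowed, 4 divorced
#   target: 1 never married, 2 married, 3 formerly married
MARITAL_MAP = {1: 1, 2: 2, 3: 3, 4: 3}
harmonized = recode([1, 4, 2, 3], MARITAL_MAP)

print(wave2["universe"], harmonized)
```

Storing `MARITAL_MAP` alongside the data is what makes the comparison explicit: a later user can see exactly how the source categories were collapsed.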

24
Why bother?
  • Researcher perspective
    • Improved data mining between and across systems
    • Increased explicit and implicit comparison
    • Interoperability
    • Improved access to detailed metadata
    • Ability to reuse rather than repeat metadata
      content

25
Why bother?
  • Data Collector
    • Support for internal consistency
    • Early capture of a broad range of metadata
    • Interoperability
    • Reuse of metadata through inheritance
    • Retention of explicit relationships between data
      collection and the resulting data files

26
Why bother?
  • Knowledge-based organization
    • Interoperability
    • Supports consistent use of concepts, questions,
      variables, etc. throughout the organization
    • Supports implicit comparison through reuse of
      content
    • Supports explicit comparison by mapping content
      between studies and to standard content
    • Captures metadata/knowledge at point of creation

27
Why bother?
  • Data Manager
    • Interoperability
    • Flexibility in data storage
    • Reuse of content
    • Strong data typing

28
DDI does not replace good content
  • DDI structures metadata to leverage content
    • Collection and processing
    • Discovery and access
    • Analysis and repurposing
    • Registries
    • Comparison
  • DDI is not a software application
    • Supports and informs software applications
  • DDI is a neutral archival structure
    • Preserving content and relationships

29
INDEPTH/DSS Example
  • 38 Demographic Surveillance Sites in 19 countries
    spanning Africa, South Asia, Central America and
    Oceania
  • Diverse yet similar health research portfolios
  • Data management goals
    • Standardize and harmonize data collection tools
    • Cross-site comparability of information
    • Sharing data effectively and efficiently

30
Reasons for choosing DDI
  • It will be ideal to describe our data for the
    purposes of the Data Repository
  • It has really powerful features that will enable
    us to standardise several facets of our work.
  • I originally underestimated the usefulness DDI
    will have as a means to harmonise data
    collection between sites.
  • Ability to expand comparison and harmonization
    with additional groups such as AIDS research team

31
Future DDI Developments
  • Controlled vocabularies to improve machine
    actionability
  • Data collection methodology and process expansion
    for more depth and detail
  • Qualitative data
  • Increased comparison coverage
  • Tools

32
Contacts
  • DDI Alliance
    • http://www.ddialliance.org
  • Link to DDI Technical Specification
    • http://www.ddialliance.org/ddi3/index.html
  • DDI Users Group Sign-up
    • http://www.ddialliance.org/DDI/codebook/listserv.html
  • Wendy Thomas, Chair, DDI Technical Implementation
    Committee
    • wlt_at_pop.umn.edu