1
Putting DDI 3.0 to Work for You!
  • Sanda Ionescu,
  • Documentation Specialist, ICPSR
  • Mary Vardigan,
  • DDI Alliance Director
  • IASSIST Conference, Stanford University, May 27,
    2008

2
Today's Schedule
  • 9:00-9:15 Brief DDI History and Intro
  • 9:15-9:30 Life Cycle: Early Stages
  • 9:30-10:45 Life Cycle Exercise
  • 10:45-11:00 Break
  • 11:00-11:50 Life Cycle: Archive and Beyond
  • 11:50-12:00 Questions and Answers

3
First Half of Morning
  • We will be moving through the data life cycle of
    a real study and will document it as we go.
  • We will use a tool to produce markup for seven
    life cycle stages.
  • Sanda will guide us through the exercise and Mary
    will go step by step onscreen.
  • End result is DDI documentation deposited into an
    archive.

4
Second Half
  • Once our sample data and documentation are
    deposited, we review the changes made by the
    archive.
  • Then we discuss DDI 3.0 in the archival context
    and why it makes sense to use it.
  • Finally, assuming we have convinced you, we
    discuss how to move to DDI 3.0!

5
DDI History
  • Effort began in 1995 when ICPSR convened a small
    international group at IASSIST in Quebec City.
  • Standard began as SGML, then converted to
    Web-friendly XML.
  • 2000: DDI Version 1.0 published as a DTD, mainly
    document- and codebook-centric.

6
DDI History
  • 2003: DDI Version 2.0 published with extended
    scope including aggregate data coverage and
    geography.
  • Versions 1.0 through 2.1 (latest published) are
    backwards compatible, and based on the same
    structure.

7
DDI History
  • February 2003: Formation of the DDI Alliance, a
    self-sustaining membership organization whose
    members have a voice in the development of the
    DDI specification.
  • http://www.ddialliance.org/

8
DDI History
  • Version 3.0
  • 2004-2006: Planning and Development
  • November 2006: Internal Review
  • February 2007: Public Review
  • July 2007: Candidate Draft Release
  • April 2008: Proof of Concept and Vote
  • April 28, 2008: Official Publication of DDI 3.0
  • http://www.ddialliance.org/ddi3/index.html

9
DDI 3.0 Features
  • Full implementation of XML Schemas
  • Emphasis on metadata reuse
  • Modular structure
  • Use of schemes

10
DDI 3.0 Features: Modular Structure
  • Allows increased flexibility in using the
    specification.
  • Main modules:
  • Instance
  • Study Unit
  • Resource Package
  • Group
  • Conceptual Components
  • Data Collection
  • Logical Product
  • Physical Instance
  • Physical Data Product
  • Comparative
  • Archive
11
DDI 3.0 Features: Use of Schemes
  • Facilitates reuse of information
  • Categories
  • Codes
  • NCubes
  • Physical Structures
  • Record Layouts
  • Organizations
  • Concepts
  • Universes
  • Geographic Locations
  • Geographic Structures
  • Questions
  • Interviewer Instructions
  • Variables
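
The reuse idea behind schemes can be sketched in a few lines of Python. This is only an illustration, not DDI 3.0 markup: the class names, the field names, and the example variables are all assumptions made up for the sketch. The point is that a category scheme is defined once and each variable refers to it by identifier instead of copying it.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CategoryScheme:
    """A reusable set of code -> label pairs, defined once."""
    scheme_id: str
    categories: Dict[str, str]


@dataclass
class Variable:
    """A variable that points at a scheme by id instead of copying it."""
    name: str
    label: str
    category_scheme_ref: str


# Hypothetical scheme registry and variables, invented for illustration.
schemes = {
    "yes_no": CategoryScheme("yes_no", {"1": "Yes", "2": "No"}),
}

variables = [
    Variable("V10", "Example question A", "yes_no"),
    Variable("V11", "Example question B", "yes_no"),
]


def resolve_categories(variable: Variable) -> Dict[str, str]:
    """Look up the shared scheme that a variable refers to."""
    return schemes[variable.category_scheme_ref].categories


for var in variables:
    print(var.name, resolve_categories(var))
```

Because both variables resolve to the same object, a correction to the scheme is picked up everywhere it is referenced.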

12
DDI 3.0 Features
  • Machine-actionable
  • Grouping and comparison features
  • Registries now possible
  • Versioning clarified
  • Multi-lingual support

13
DDI 3.0 Features
  • Compatibility with other metadata standards
  • MARC, DC, but also
  • SDMX (Statistical Data and Metadata Exchange)
  • ISO 11179 (Metadata Registries)
  • FGDC (Digital Geospatial Metadata)
  • ISO 19115 (Geographic Information Metadata)
  • PREMIS, METS forthcoming
  • Life cycle orientation

14
Life Cycle Orientation
  • DDI 3.0 documents all stages in the life cycle of
    a data collection
  • Stages: pre-production, production,
    post-production, secondary use, new research
    effort
15
DDI 3.0 Use Cases
  • Documenting an on-going, original research
    project.
  • Documenting secondary use of data.
  • Creating concept/question/variable banks.
  • Generating multiple delivery formats for data
    dissemination/discovery.
  • Metadata mining for comparison, etc.

16
DDI 3.0 to Document an On-going Research Project
  • DDI 3.0 can be used to document a research
    project in real time, from its inception (study
    proposal, design) through data collection,
    processing, and initial data production.

17
[Diagram: DDI 3.0 content captured during an ongoing research project]
  • People: Principal Investigator, Collaborators,
    Research Staff
  • <DDI 3.0> Purpose, Concepts, Universe, Geography,
    People/Orgs
  • <DDI 3.0> Funding, Revisions
  • <DDI 3.0> Questions, Instrument
  • <DDI 3.0> Data Collection, Data Processing
  • <DDI 3.0> Variables, Physical Stores
  • Other labels: Submitted Proposal, Data,
    Publication, Archive/Repository
18
DDI 3.0 to Document an On-going Research Project
  • Advantages
  • Richer, contextual information made available and
    preserved.
  • Increased accuracy, as life cycle stages are
    documented at the source.
  • No loss of information as study progresses
    through its life cycle.
  • Changes in documentation preserved through
    versioning.
  • Ultimately gives data analysts more information
    to understand and assess data quality.

19
DDI 3.0 to Document an On-going Research Project
  • Use case exercise
  • Academic environment.
  • Faculty member/researcher initiates an original,
    independent research project.
  • Small-scale effort.
  • No use of computer-assisted interviewing
    software.
  • Resulting data and documentation to be deposited
    to a data center/archive.
  • Archive provides incentives and support for
    documenting all activities in DDI as they happen.

20
DDI 3.0 to Document an On-going Research Project
  • Incentives for entering documentation at the
    source
  • Information is easy to enter: use of a data entry
    tool hides the complexities of XML code.
  • Underlying DDI structure provides prompts and
    pre-organizes information.
  • DDI may also serve as a management/diagnostic
    tool to assist in data processing and cleaning
    operations, or revising the documentation.
  • Real-time entries and standardized content ensure
    high-quality documentation that facilitates
    primary data analysis and preparing reports.

21
DDI 3.0 to Document an On-going Research Project
  • Use case exercise
  • Based on a real study in the ICPSR archive
    (ICPSR study No. 9413, Survey of Three
    Generations of Mexican Americans, 1981-1982)
  • Study documentation is laid out sequentially
    according to the life cycle.
  • http://www.icpsr.umich.edu/DDI/ddi3/workshop
  • Data entry tool provides a user-friendly
    interface and is projected to produce DDI 3.0
    output. It follows the life cycle, but may also
    be used retrospectively.

22
Life Cycle Stages: Study Proposal
  • WHO? (Principal Investigator)
  • WHO? (Co-authors)
  • WHEN? (November 1st, 1979)
  • Research Question(s), Hypotheses, Population,
    Geographic Area, Provisional Title
23
Life Cycle Stages: Study Proposal Input
http://www.icpsr.umich.edu/DDI/ddi3/workshop/DDI3_TransformerTOOL/DDIv2dot2.html
24
Life Cycle Stages: Study Proposal DDI 3.0 Output
  • Inputs shown: WHO? (Principal Investigator),
    WHO? (Co-authors), WHEN?, (Provisional Title),
    Research Question(s), Hypotheses, Population,
    Geographic Area
  • Archive: Individual
  • Life Cycle Event: Responsibility, Date
  • Study Unit: Creator(s), Title, Purpose, Universe
    Ref., Spatial Coverage
  • Conceptual Component: Universe, Geographic
    Structure
  • http://www.icpsr.umich.edu/DDI/ddi3/workshop/files/iassist_stdyprop.pdf
25
Life Cycle Stages: Study Funding
  • WHO? Funding Agency
  • WHEN? (June 1st, 1980)
  • Proposal
  • Grant 5-R01-AG-01573
26
Life Cycle Stages: Study Funding Input
http://www.icpsr.umich.edu/DDI/ddi3/workshop/DDI3_TransformerTOOL/DDIv2dot2.html
27
Life Cycle Stages: Study Funding DDI 3.0 Output
  • Inputs shown: WHO? Funding Agency, Proposal
  • Archive: Organization
  • Study Unit: Funding Agency, Grant Number
  • Life Cycle Event: Responsibility, Date
  • http://www.icpsr.umich.edu/DDI/ddi3/workshop/files/iassist_stdyfunding.pdf
28
Life Cycle Stages: Defining Concepts
  • WHO?
  • WHEN? (July 1st, 1980)
  • Question/Concept Bank
  • Research Questions
  • Study Concepts

29
Life Cycle Stages: Defining Concepts Input
http://www.icpsr.umich.edu/DDI/ddi3/workshop/DDI3_TransformerTOOL/DDIv2dot2.html
30
Life Cycle Stages: Defining Concepts DDI 3.0 Output
  • Inputs shown: Question/Concept Bank, Research
    Questions, Study Concepts
  • Life Cycle Event: Responsibility, Date
  • DDI Concept Scheme (Ref.)
  • http://www.icpsr.umich.edu/DDI/ddi3/workshop/files/iassist_concepts.pdf
31
Life Cycle Stages: Questionnaire Design
  • WHO?
  • WHEN? (July 25, 1980)
  • Question/Concept Bank
  • Study Concepts
  • Questions, Responses

32
Life Cycle Stages: Questionnaire Design Input
http://www.icpsr.umich.edu/DDI/ddi3/workshop/DDI3_TransformerTOOL/DDIv2dot2.html
33
Life Cycle Stages: Questionnaire Design DDI 3.0 Output
  • Inputs shown: Question/Concept Bank, Study
    Concepts, Questions, Responses
  • Life Cycle Event: Responsibility, Date
  • DDI Question Scheme (Ref.)
  • Logical Product: Category Scheme(s), Code Schemes
  • http://www.icpsr.umich.edu/DDI/ddi3/workshop/files/iasssist_questions.pdf
34
Life Cycle Stages: Questionnaire Translation
  • WHO?
  • WHEN? (September 1st, 1980)
  • Original Language Questions, Responses
  • Translated Questions, Responses
35
Life Cycle Stages: Questionnaire Translation Input
http://www.icpsr.umich.edu/DDI/ddi3/workshop/DDI3_TransformerTOOL/DDIv2dot2.html
36
Life Cycle Stages: Questionnaire Translation DDI 3.0 Output
  • Inputs shown: Original Language Questions,
    Responses; Translated Questions, Responses
  • Life Cycle Event: Responsibility, Date
  • DDI Question Scheme (Bilingual Version)
  • Logical Product: Category Scheme(s) (Bilingual
    Version)
  • http://www.icpsr.umich.edu/DDI/ddi3/workshop/files/iassist_transl_qstns.pdf
37
Life Cycle Stages: Data Collection
  • WHO?
  • WHO?
  • (1981-1982)
  • REPORT
  • SAMPLE
  • (October 15, 1980 - April 1st, 1981)
38
Life Cycle Stages: Data Collection Input
http://www.icpsr.umich.edu/DDI/ddi3/workshop/DDI3_TransformerTOOL/DDIv2dot2.html
39
Life Cycle Stages: Data Collection DDI 3.0 Output
  • Life Cycle Events: Responsibility, Dates
  • Data Collection: Responsibility, Date, Sampling,
    Mode of Collection, Note
  • http://www.icpsr.umich.edu/DDI/ddi3/workshop/files/iassist_datacoll.pdf
40
Life Cycle Stages: Data Production
  • WHO?
  • WHEN? (1983)
  • QA
  • DATA
41
Life Cycle Stages: Data Production Input
http://www.icpsr.umich.edu/DDI/ddi3/workshop/DDI3_TransformerTOOL/DDIv2dot2.html
42
Life Cycle Stages: Data Production DDI 3.0 Output
  • Inputs shown: QA, DATA
  • Life Cycle Event: Responsibility, Date
  • Data Collection (Processing Operations)
  • Logical Product: Variable Scheme, Additional
    Code/Category Schemes, Missing Data
  • Physical Data Product: Record Structure,
    Variables Locations
  • Physical Instance (Processing Checks): Number of
    Cases, Number of Records
  • http://www.icpsr.umich.edu/DDI/ddi3/workshop/files/iassist_dataprod.pdf
43
  • BREAK

44
Life Cycle Stages: Data Cleaning and Processing
DDI as diagnostic/management tool
  • The presence of standardized documentation
    facilitates data processing.
  • DDI documentation can be used as a project
    dashboard to identify problems and keep track
    of operations.
  • Queries can address:
  • Data errors: missing values, out-of-range values
    (incorrect computation or recode logic),
    inconsistent or undocumented codes
  • Missing documentation: question text, description
  • Editing errors: missing labels, misspelled
    variable names
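
A minimal sketch of what such queries might look like, assuming the variable-level documentation has already been loaded into plain Python dictionaries. The field names (`label`, `question_text`, `codes`) and the sample data are invented for illustration and are not DDI element names.

```python
def find_problems(variables, records):
    """Return a list of (variable name, problem) pairs."""
    problems = []
    for var in variables:
        name = var["name"]
        # Editing errors and missing documentation.
        if not var.get("label"):
            problems.append((name, "missing label"))
        if not var.get("question_text"):
            problems.append((name, "missing question text"))
        # Data errors: values in the data that the codes do not document.
        documented = set(var.get("codes", {}))
        observed = {rec[name] for rec in records if name in rec}
        for value in sorted(observed - documented):
            problems.append((name, f"undocumented code {value}"))
    return problems


# Invented metadata and data records, for illustration only.
variables = [
    {"name": "SEX", "label": "Sex of respondent",
     "question_text": "Respondent's sex",
     "codes": {1: "Male", 2: "Female", 9: "Missing"}},
    {"name": "AGEGRP", "label": "", "question_text": "",
     "codes": {1: "Under 40", 2: "40 and over"}},
]
records = [
    {"SEX": 1, "AGEGRP": 1},
    {"SEX": 2, "AGEGRP": 3},
    {"SEX": 9, "AGEGRP": 2},
]

for name, problem in find_problems(variables, records):
    print(f"{name}: {problem}")
```

Running the same checks after each processing step is one way the documentation could serve as the project dashboard described above.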

45
Life Cycle Stages: Deposit to Archive
  • At the time of deposit, both the research process
    and the data are already documented in DDI.
  • Advantages:
  • The presence of standardized information
    facilitates archival processing, enabling
    procedure streamlining and automation.
  • Richer, more accurate information made available
    for preservation, archival processing and
    dissemination enhances data discovery and
    secondary analysis.

46
Life Cycle Stages: Deposit to Archive
  • Richer, more accurate information. Examples:
  • Original / working title preserved (may be found
    in early reports, published prior to any title
    changes).
  • Authors' affiliation and position at the time of
    research.
  • Responsible agencies and dates made available for
    all life cycle events.
  • Parallel / associated research efforts and
    publications accurately documented.

47
Life Cycle Stages: Deposit to Archive
  • Richer, more accurate information. Examples:
  • Presence of concepts represents an important
    added value for data discovery, appraisal, and
    further analysis.
  • Documented source of concepts and questions
    (original or re-used) is relevant for secondary,
    and particularly comparative analysis efforts.
  • For bi- or multilingual studies, multiple
    language versions of descriptive elements are
    made available side-by-side, facilitating
    comparison, analysis, and/or filtered retrieval
    of specific language(s).
  • http://www.icpsr.umich.edu/cocoon/DDI3/workshop/9413_CR3_2_DataProd.xml?displayvarshighlight-tokenno
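
Filtered retrieval by language can be sketched as follows, assuming each descriptive element stores its text per language code. The structure and the question wording are made up for the example and are not taken from the study or from the DDI schema.

```python
# Invented questions; each carries text keyed by language code.
questions = [
    {"id": "Q1", "text": {"en": "How old are you?",
                          "es": "¿Cuántos años tiene usted?"}},
    {"id": "Q2", "text": {"en": "Where were you born?"}},
]


def text_in(language, items):
    """Return {id: text} for the items that have text in the given language."""
    return {
        item["id"]: item["text"][language]
        for item in items
        if language in item["text"]
    }


print(text_in("en", questions))
print(text_in("es", questions))
```

Items without a version in the requested language are simply left out, which also makes untranslated elements easy to spot.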

48
Life Cycle Stages: Deposit to Archive
  • Use of DDI throughout the study life cycle
    prevents loss of information.
  • Preservation of successive versions allows
    retrieval of information as it was recorded at
    earlier stages.
  • To meet specific goals and needs, the archive may
    create its own version(s) of the documentation,
    but will also preserve the originally deposited
    version.
  • The DDI format enables easy, automated navigation
    among all existing versions.

49
Life Cycle Stages: Archival Processing of Data and
Documentation
  • The archive becomes the maintaining agency and
    creates its own instance:
  • The archive is described as organization, as
    owner/maintainer of collection, and specified as
    (new) publisher and/or distributor, with
    appropriate date(s).
  • Original archive (depositor to present archive)
    referenced in the archive module.
  • Reference may also be included to originally
    deposited DDI that is preserved and also made
    accessible.

50
Life Cycle Stages: Archival Processing of Data and
Documentation
  • The archive edits or adds information and
    populates new DDI fields to support archival
    operations:
  • Edits title to conform to the archive's standards
    (ICPSR adds study date)
  • Updates authors' affiliation according to current
    position, and adds/updates contact information
    (telephone, e-mail, current address, etc.)
  • Adds subject headings and keywords to assist data
    discovery (searches at study level)

51
Life Cycle Stages: Archival Processing of Data and
Documentation
  • The archive edits or adds information:
  • Adds study abstract, integrating purpose with
    description of data collection and the final data
    product.
  • Adds structured methodological information,
    enabling more granular, targeted searches (e.g.,
    temporal coverage, analysis unit(s) covered, kind
    of data, data source).

http://www.icpsr.umich.edu/cocoon/DDI3/workshop/9413_CR3_2_ARCHIVE.xml?highlight-tokenyes
52
Life Cycle Stages: Archival Processing of Data and
Documentation
  • The archive documents any in-house,
    post-production processing as well as resulting
    changes in the data:
  • New data file identification, to reflect archive
    location.
  • Description of processing checks performed by
    archive.
  • Description of added variables (archive-specific,
    indexes, recodes, etc.) if appropriate.
  • Variable- and category-level statistics may be
    calculated and added to the DDI documentation to
    enhance variable descriptions.
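
As an illustration of category-level statistics, the sketch below computes a count and percentage for each documented code of one variable. The codes, labels, and data values are invented, and the output layout is an assumption, not the form in which DDI stores statistics.

```python
from collections import Counter


def category_frequencies(values, codes):
    """Return one row per documented code with its count and percentage."""
    counts = Counter(values)
    total = len(values)
    rows = []
    for code, label in codes.items():
        count = counts.get(code, 0)
        percent = round(100 * count / total, 1) if total else 0.0
        rows.append({"code": code, "label": label,
                     "count": count, "percent": percent})
    return rows


# Invented codes and data values, for illustration only.
codes = {1: "Yes", 2: "No", 9: "Missing"}
values = [1, 1, 2, 9, 1, 2, 2, 1]

stats = category_frequencies(values, codes)
for row in stats:
    print(row)
```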

53
Life Cycle Stages: Archival Processing of Data and
Documentation
  • The archive adds an itemized description of the
    entire distribution package associated with a
    study, including archival-specific information
    like availability, access conditions/restrictions,
    and collection completeness, as well as
    item-level identification, URI, format, medium,
    etc.

http://www.icpsr.umich.edu/cocoon/DDI3/workshop/9413_CR3_2_ARCHIVE.xml?highlight-tokenyes
54
Integrating DDI 3 into Archives
  • What is in it for us?
  • Standardized study descriptions provide for
    integration and consistency between collection
    catalog and documentation products.
  • Standardized documentation supports automated
    generation of multiple delivery formats,
    including PDF and HTML.

55
Integrating DDI 3 into Archives
  • What is in it for us?
  • DDI 3 enables the creation of an expanded
    scientific record covering the full life cycle,
    including instrument documentation.
  • DDI 3 supports streamlining and increased
    automation of archival operations.
  • DDI 3 instances can carry data inline.
  • DDI 3 has improved functionality for
    complex/hierarchical files.

56
Integrating DDI 3 into Archives
Improved functionality for complex/hierarchical
files. Example:
https://www.icpsr.umich.edu/DDI/ddi3/workshop/
57
Integrating DDI 3 into Archives
  • What is in it for us?
  • DDI 3 facilitates grouping and comparison from
    the highest level to the lowest
  • Mechanism to organize series information, showing
    only what changes over time.
  • Variable harmonization and comparison.

58
Integrating DDI 3 into Archives
  • What is in it for us?
  • Modular structure and use of schemes allow
    creation of meta-resources, offering additional
    functionality
  • Question/concept/variable banks
  • Geography databases
  • Organizations/Individuals registries

59
Integrating DDI 3 into Archives
  • What is in it for us?
  • Concept/question/variable banks
  • Metadata reuse
  • Cross-study variable/question/concept searches
    and analyses
  • Cross-study comparisons
  • Track questions/variables over time
  • Register an organization's official measures
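
A question bank that supports cross-study searches and tracking over time could be indexed roughly as below. This is a sketch under simple assumptions: questions are matched on normalized wording only, and the studies, years, and question texts are invented.

```python
from collections import defaultdict


def build_question_index(studies):
    """Map normalized question text to a sorted list of (year, study id)."""
    index = defaultdict(list)
    for study in studies:
        for text in study["questions"]:
            # Normalize case and whitespace so trivial differences still match.
            key = " ".join(text.lower().split())
            index[key].append((study["year"], study["id"]))
    return {key: sorted(uses) for key, uses in index.items()}


# Invented studies and question wording, for illustration only.
studies = [
    {"id": "study-B", "year": 1990,
     "questions": ["How many people live in your household?"]},
    {"id": "study-A", "year": 1981,
     "questions": ["How many people live in your  household?",
                   "In what year were you born?"]},
]

index = build_question_index(studies)

# Questions asked in more than one study, with their uses in date order.
shared = {q: uses for q, uses in index.items() if len(uses) > 1}
print(shared)
```

A real bank would more likely match on identifiers or references than on wording, but the lookup pattern is the same.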

60
Integrating DDI 3 into Archives
Concept/question/variable banks
61
Integrating DDI 3 into Archives
Concept/question/variable banks
62
Integrating DDI 3 into Archives
Concept/question/variable banks
63
Integrating DDI 3 into Archives
  • Geography databases/registries
  • Automatically match locations with appropriate
    geographic level
  • Keep track of historical changes
  • Information always accurate and up-to-date
  • Facilitate data entry

64
Integrating DDI 3 into Archives
  • Organizations/Individuals registries
  • Keep track of historical changes (names,
    affiliations, contact information, etc.)
  • Information always accurate and up-to-date
  • Facilitate data entry

65
Integrating DDI 3 into Archives
  • What is in it for us?
  • Preservation
  • Life cycle orientation of documentation means
    that a chain of custody is provided to meet
    preservation requirements.
  • Archives can use the life cycle events to track
    data processing activities (data transformation).
  • The structure of DDI 3.0 integrates well with
    FEDORA (Flexible Extensible Digital Object
    Repository Architecture), a digital repository
    management system used by many archives.
  • Separate instances can be created to follow the
    OAIS model: SIP, AIP, DIP.

66
Integrating DDI 3 into Archives
  • Information sharing
  • Use of DDI 3 facilitates information sharing and
    collaborative projects among archives
  • Example: the SRO-ICPSR Data Documentation and
    Dissemination project implements a common, DDI
    3.0 compliant database model to allow a smooth
    data transfer between the two organizations.

67
Integrating DDI 3 into Archives: SRO-ICPSR
collaboration project
[Diagram]
  • Organizations: ICPSR, SRO
  • Formats shown: SAS/SPSS/Stata files, DDI 3.0,
    Blaise output, DDI 2.x, Other
  • Common relational database model for data
    documentation, compliant with DDI 3.0
  • Applications shown: Client Applications, Web
    Applications, ICPSR Variable-level Search
  • ICPSR projects will be able to use documentation
    generated by SRO projects
68
Archives: Moving the Collection to DDI 3.0
  • Catalog records
  • Archive standard -> map to DDI 3.0
  • Dublin Core -> map to DDI 3.0
  • DDI 2.x -> map to DDI 3.0
  • Conversion by simple programming script or XSLT.
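
A "simple programming script" for such a conversion might follow the pattern below: read a Dublin Core record and copy mapped fields into a target structure. The Dublin Core namespace is the standard one, but the target field names are placeholders chosen for this sketch, not element names from the DDI 3.0 schema, and the sample record is invented. A real conversion would take its mapping from documents like the workshop mapping files.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Illustrative mapping only: the target names are placeholders,
# not element names taken from the DDI 3.0 schema.
DC_TO_TARGET = {
    "title": "study_title",
    "creator": "study_creator",
    "publisher": "study_publisher",
    "description": "study_abstract",
    "subject": "study_keyword",
}


def convert_record(dc_xml):
    """Copy mapped Dublin Core elements into a dict of lists."""
    root = ET.fromstring(dc_xml)
    result = {}
    for child in root:
        if not child.tag.startswith("{" + DC_NS + "}"):
            continue  # skip elements outside the Dublin Core namespace
        local_name = child.tag.split("}", 1)[1]
        target = DC_TO_TARGET.get(local_name)
        if target and child.text:
            result.setdefault(target, []).append(child.text.strip())
    return result


# Invented sample record, for illustration only.
SAMPLE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example Study</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:subject>aging</dc:subject>
  <dc:subject>families</dc:subject>
  <dc:format>text/plain</dc:format>
</record>"""

print(convert_record(SAMPLE))
```

Unmapped elements (here `dc:format`) are dropped silently; a production script would probably log them so that no catalog information is lost unnoticed.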

69
Archives: Moving the Collection to DDI 3.0
  • Catalog record conversions
  • Examples:
  • ICPSR -> DDI 2.1 -> DDI 3.0
  • http://www.icpsr.umich.edu/DDI/ddi3/workshop/files/Template_DDI2_toDDI3_Mapping_S.pdf
  • Dublin Core -> DDI 2.1 -> DDI 3.0
  • http://www.icpsr.umich.edu/DDI/ddi3/workshop/files/Dublin_Core_DDI2_toDDI3_20Mapping.pdf
  • ICPSR Stylesheet DDI 2.1 -> DDI 3.0
  • http://www.icpsr.umich.edu/DDI/ddi3/workshop/

70
Archives: Moving the Collection to DDI 3.0
  • Legacy studies
  • Tools
  • Stats to DDI 3.0
  • DDI 3.0 editor
  • XML editor
  • DDI 2.x codebooks
  • Tools
  • DDI 2.x to DDI 3.0 converter
  • (may be stylesheet, or simple script, based on
    DDI 2.x to 3.0 mapping)

71
Resources
  • DDI 3.0 Proof of Concept: Use Cases and
    Implementations
  • http://www.ddialliance.org/DDI/ddi3/use-cases.html
  • DDI Tools
  • http://tools.ddialliance.org/
  • Workshop materials
  • http://www.icpsr.umich.edu/DDI/ddi3/workshop

72
Contact Information
  • Sanda Ionescu sandai_at_umich.edu
  • Mary Vardigan vardigan_at_umich.edu
  • Matthew Richardson matvey_at_umich.edu
  • DDI users list
  • http://www.ddialliance.org/codebook/listserv.html

73
Questions?
74
The End.