Looking into the future - PowerPoint PPT Presentation

About This Presentation
Title:

Looking into the future

Description:

Looking into the future Providing Social Science Data Services Jim Jacobs * But we will make use of ddi at every stage in the life cycle First principles Metadata ... – PowerPoint PPT presentation

Slides: 20
Provided by: jj81
Learn more at: http://3stages.org
Transcript and Presenter's Notes

Title: Looking into the future


1
Looking into the future
  • Providing Social Science Data Services
  • Jim Jacobs

2
First principles
  • Metadata are data about data -- information about
    information.
  • It's all about having complete, accurate,
    re-usable metadata.
  • Software to process the metadata is secondary. We
    should be able to have metadata today that we
    know will be usable in unforeseeable computing
    environments (operating systems, software,
    hardware).

3
First principles
  • Metadata should be
  • Comprehensive
  • Complete
  • Uncompromised
  • Consistent
  • Flexible
  • Sharable
  • Usable and re-usable
  • Preservable
  • Parseable by computer
  • Documented
  • Non-proprietary

4
How XML fits in
  • XML is designed to be parseable with generic
    tools.
  • XML can encode meaning and can be
    self-documenting
  • XML is non-proprietary, open, flexible.

5
How XML fits in
XML is designed to make it easy to find and
use just the elements you need from a large
document.
Cherry picking
6
How XML fits in
<stdyDscr>
  <citation>
    <titlStmt>
      <titl>Great Power Wars, 1495-1815</titl>
      <IDNo>9955</IDNo>
    </titlStmt>
    <rspStmt>
      <AuthEnty>Levy, Jack S.</AuthEnty>
    </rspStmt>
    <prodStmt>
      <fundAg>National Science Foundation.</fundAg>
      <grantNo>SES86-10567</grantNo>
    </prodStmt>
    <distStmt>
      <distrbtr abbr="ICPSR" affiliation="Institute for Social Research, University of Michigan" URI="http://www.icpsr.umich.edu">Inter-university Consortium for Political and Social Research</distrbtr>
      <distDate date="1994-05-20">1994-05-20</distDate>
    </distStmt>
    <serStmt> </serStmt>
    <verStmt>
      <dateAdded>1994-05-20</dateAdded>
      <dateUpdated>1994-05-20</dateUpdated>
    </verStmt>
    <biblCit>Levy, Jack S. GREAT POWER WARS, 1495-1815 Computer file. New Brunswick, NJ and Houston, TX Jack S. Levy and T. Clifton Morgan

<titl>Great Power Wars, 1495-1815</titl>
You can cherry-pick just what you need from a
large XML document
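A minimal sketch of cherry-picking, assuming Python's standard `xml.etree.ElementTree` parser as the "generic tool". The fragment is an abbreviated, well-formed version of the study description on the slide; the helper name `cherry_pick` is an assumption made for illustration and is not part of DDI or of any DDI software.

```python
import xml.etree.ElementTree as ET

# Abbreviated, well-formed version of the study description from the slide.
DDI_FRAGMENT = """
<stdyDscr>
  <citation>
    <titlStmt>
      <titl>Great Power Wars, 1495-1815</titl>
      <IDNo>9955</IDNo>
    </titlStmt>
    <rspStmt>
      <AuthEnty>Levy, Jack S.</AuthEnty>
    </rspStmt>
    <prodStmt>
      <fundAg>National Science Foundation.</fundAg>
      <grantNo>SES86-10567</grantNo>
    </prodStmt>
  </citation>
</stdyDscr>
"""


def cherry_pick(xml_text, tag):
    """Return the text of the first descendant element named `tag`, or None.

    `cherry_pick` is a hypothetical helper, not a DDI API.
    """
    root = ET.fromstring(xml_text)
    element = root.find(f".//{tag}")  # search at any depth below the root
    return element.text if element is not None else None


# Pull out only the title and the study number; ignore everything else.
print(cherry_pick(DDI_FRAGMENT, "titl"))
print(cherry_pick(DDI_FRAGMENT, "IDNo"))
```

The caller names only the element it wants; the rest of the document can grow or change without affecting the lookup.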
7
From legacies to the future
  • HTML
  • PDF
  • Any stat package
  • Nesstar, SDA, Dataverse
  • Library OPAC
  • Google
  • OAI, METS, etc.
  • RSS, RDF
  • GIS
  • DDI 3, 4
  • SAS
  • SPSS
  • OSIRIS
  • PDF
  • Paper
  • Data dictionary
  • Etc.

DDI
8
From many contributors to many uses
  • The web
  • Live documents
  • Databases
  • Publications
  • Data archives
  • Data libraries
  • Institutional repositories
  • Secondary analysis
  • New research
  • New knowledge
  • Researcher
  • Data collector
  • Analyst
  • Data producer, distributor
  • Data archivist
  • Data librarian
  • Users of statistics
  • Government agency

DDI
9
OAIS Functional Model
[Diagram: Ingest, Archival Storage, Access]
10
Information Packages
OAIS Information Model
[Diagram: Submission (SIP), Archival (AIP), and Dissemination (DIP) Information Packages]
11
Data stewardship life cycle
12
DDI Production
13
DDI Use
14
DDI will enable transformation
  • New kinds of data discovery (beyond indexing)
  • Metadata as a primary resource (metadata as data)

15
Metadata for data discovery
  • ICPSR already uses DDI metadata to create its
    Variables database.
  • Nesstar and Dataverse software use metadata to
    produce searchable indexes of data repositories
  • In the future we should see the harvesting of DDI
    from many repositories to create indexes across
    collections. (oclc.org/oaister/)
  • In the future we'll see data discovery by concept
    and methodology and geography and time period,
    not just keyword.
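One way to picture an index built across collections is sketched below. It assumes codebooks have already been harvested as XML strings and uses the DDI Codebook element names `var` and `labl`. The repository names, study identifiers, variable names, and labels are all invented for illustration, and `build_index` is a hypothetical helper. It indexes variable labels by word only; discovery by concept, method, geography, or time period would need richer metadata than this sketch reads.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical harvested codebooks from two repositories (invented content).
HARVESTED = {
    "repo-a/study-1": """<codeBook><dataDscr>
        <var name="V1"><labl>Age of respondent</labl></var>
        <var name="V2"><labl>Household income</labl></var>
    </dataDscr></codeBook>""",
    "repo-b/study-7": """<codeBook><dataDscr>
        <var name="INC"><labl>Total family income</labl></var>
    </dataDscr></codeBook>""",
}


def build_index(codebooks):
    """Map each lowercase word in a variable label to (study, variable) pairs."""
    index = defaultdict(set)
    for study, xml_text in codebooks.items():
        root = ET.fromstring(xml_text)
        for var in root.iter("var"):
            label = var.findtext("labl", default="")
            for word in re.findall(r"[a-z]+", label.lower()):
                index[word].add((study, var.get("name")))
    return index


index = build_index(HARVESTED)
# One search now reaches variables held in both repositories.
print(sorted(index["income"]))
# → [('repo-a/study-1', 'V2'), ('repo-b/study-7', 'INC')]
```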

16
Metadata as data
  • By structuring metadata according to a
    methodology (the lifecycle-of-data approach), we
    create metadata that we can treat as data.
  • We can analyze metadata the way we would analyze
    any data file.
  • As more metadata of this kind are created, we are
    accumulating a body of information that makes it
    possible to study trends across time and
    geography.

17
Metadata as data
  • The technical documentation for the Army's Korean
    conflict casualty electronic records file has
    casualty codes that were never used in the data
    files.
  • The presence of codes in the metadata for injury
    by lethal gas and by radiation exposure suggests
    that Army personnel who designed this
    record-keeping system expected the possible use
    of those as weapons. Examination of the data
    alone would have missed this suggestion.
  • The codes for 'place of casualty' included, in
    addition to South Korea Sector and North Korea
    Sector, the Indo-China Sector, Tibet Sector,
    Mongolia Sector, Honan Sector (sic), Manchuria
    Sector, North Japan Sector, South Japan Sector,
    South China Sector, and Formosa Sector.
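The comparison behind this example, codes that the documentation defines versus codes that actually occur in the records, can be sketched as a set difference. The codes, labels, and records below are placeholders invented for illustration, not values from the casualty file, and both function names are hypothetical.

```python
# Hypothetical codebook: every code the documentation defines (placeholders).
DOCUMENTED_CODES = {
    "A": "Category A",
    "B": "Category B",
    "C": "Category C",
    "D": "Category D",
}

# Hypothetical data file: the code recorded on each record (placeholders).
RECORDS = ["A", "B", "A", "A", "B"]


def unused_codes(documented, observed):
    """Codes defined in the metadata that never appear in the data."""
    used = set(observed)
    return {code: label for code, label in documented.items() if code not in used}


def undocumented_values(documented, observed):
    """Values that appear in the data but are missing from the metadata."""
    return sorted(set(observed) - set(documented))


print(unused_codes(DOCUMENTED_CODES, RECORDS))
# → {'C': 'Category C', 'D': 'Category D'}
print(undocumented_values(DOCUMENTED_CODES, RECORDS))
# → []
```

Only the first function needs the metadata to say anything new: the data file alone can never reveal a code that was planned for but never used.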

18
Metadata as data
  • A researcher at the Danish Data Archive is doing
    a qualitative analysis of the questionnaires used
    in seven surveys about ethnic minorities in
    Danish society, "with the purpose of showing how
    surveys ... mirror and project societal
    understandings of the subjects under
    investigation."

19
Metadata as data
  • Wendy Thomas of the Minnesota Population Center
    examined U.S. Census metadata from 1790 through
    2000 and traced how the concepts of race and
    ethnicity changed, as embodied in the categories
    used in Census Bureau questions over time. Those
    concepts are documented only in the metadata, not
    in the Census data files themselves.
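A comparison of this kind can be sketched by treating each period's response categories as a set and diffing consecutive periods. The period numbers and category labels below are placeholders invented for illustration; they are not Census categories, and `category_changes` is a hypothetical helper rather than the method used in the study.

```python
# Hypothetical metadata: response categories documented for each period.
CATEGORIES_BY_PERIOD = {
    1: {"Category A", "Category B"},
    2: {"Category A", "Category B", "Category C"},
    3: {"Category A", "Category C", "Category D"},
}


def category_changes(categories_by_period):
    """List the categories added and dropped between consecutive periods."""
    periods = sorted(categories_by_period)
    changes = []
    for earlier, later in zip(periods, periods[1:]):
        before = categories_by_period[earlier]
        after = categories_by_period[later]
        changes.append({
            "from": earlier,
            "to": later,
            "added": sorted(after - before),
            "dropped": sorted(before - after),
        })
    return changes


for change in category_changes(CATEGORIES_BY_PERIOD):
    print(change)
```

The input here is metadata only; no data file is read at any point.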