Title: Data Preservation Imperatives: The Role of the US National Science Foundation


1
Data Preservation Imperatives: The Role of the US
National Science Foundation
  • Lucy Nowell, Ph.D.
  • Office of Cyberinfrastructure
  • Conference on Permanent Access to the Records of
    Science
  • Brussels, Belgium
  • 15 November 2007

2
Outline
  • NSF Office of Cyberinfrastructure
  • Motivation for Data Preservation
  • Role of Universities and Academic Libraries
  • Characteristics of the Digital Age
  • NSF OCI Data Strategic Vision and Goals

3
(No Transcript)
4
NSF Act of 1950
  • To promote the progress of science
  • Encourage and develop a national policy for the
    promotion of basic research and education in the
    math, physical, medical, biological, engineering
    and other sciences
  • Initiate and support basic scientific research in
    the sciences

5
U.S. President
  • Science Advisor
  • Office of Management and Budget
  • Office of Science and Technology Policy
  • Other boards, councils, etc.
Major Departments
  • Health and Human Services
  • Commerce
  • Agriculture
  • Interior
  • Homeland Security
  • Defense
  • Energy
Independent Agencies
  • National Aeronautics and Space Administration
  • Environmental Protection Agency
  • Nuclear Regulatory Commission
  • Smithsonian Institution
  • Other agencies
6
  • Research Directorates
  • Biological Sciences
  • Computer and Information Science and Engineering
  • Education and Human Resources
  • Engineering
  • Geosciences
  • Mathematical and Physical Sciences
  • Social, Behavioral and Economic Sciences

7
New Modes of Investigation
The conduct of science and engineering is changing and evolving.
This is due, in large part, to the expansion of
networked cyberinfrastructure.
NSF Strategic Plan 2006-2011
8
Office of Cyberinfrastructure (OCI)
Dan Atkins, Office Director
José Muñoz, Deputy Office Director
Judy Hayden, Mary Daley, Irene Lombardo, Deborah White
Terry Langendoen
Lucy Nowell
Diana Rhoten
Kevin Thompson
Steve Meacham, Abani Patra
Learning and Workforce
Virtual Organizations
Software/Middleware
High Performance Computing
Data
9
Cyberinfrastructure
is the organized aggregate of technologies that
enable us to access and integrate today's
information technology resources (data and
storage, computation, communication,
visualization, networking, scientific
instruments, expertise) to facilitate science and
engineering goals.
- Fran Berman, Director, SDSC
10
CI Vision: 4 Interrelated Perspectives
Collaboratories, Observatories and Virtual
Organizations
Learning and Workforce Development
11
The Fragility of Memory in a Digital Age
In 1964, the first electronic mail message was
sent from either MIT, the Carnegie Institute, or
Cambridge University. The message does not
survive, however, and so there is no documentary
record to determine which group sent the
pathbreaking message.
Report of the Task Force on Archiving of Digital
Information, Commission on Preservation and Access
and the Research Libraries Group
12
NASA plans new search for missing moon tapes
  • Aug. 15, 2006, 5:13 PM
  • Seth Borenstein, Associated Press
  • WASHINGTON - NASA said today it was launching an
    official search for more than 13,000 original
    tapes of the historic Apollo moon missions.

13
Study                       Resource type                           Resource half-life
Koehler (1999 and 2002)     Random Web pages                        2.0 years
Nelson and Allen (2002)     Digital Library Object                  24.5 years
Harter and Kim (1996)       Scholarly Article Citations             1.5 years
Rumsey (2002)               Legal Citations                         1.4 years
Markwell and Brooks (2002)  Biological Science Education Resources  4.6 years
Spinellis (2003)            Computer Science Citations              4.0 years
Source: Koehler, W. (2004). Information Research, 9 (2), 174
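To make the half-life figures concrete, the following minimal Python sketch estimates what fraction of each resource type would still be reachable after a given number of years. It assumes simple exponential decay, which the slide itself does not state; the function name surviving_fraction and the ten-year horizon are illustrative choices, not part of the presentation.

```python
# Half-lives in years, copied from the table above (Koehler 2004).
HALF_LIVES = {
    "Random Web pages": 2.0,
    "Digital Library Object": 24.5,
    "Scholarly Article Citations": 1.5,
    "Legal Citations": 1.4,
    "Biological Science Education Resources": 4.6,
    "Computer Science Citations": 4.0,
}


def surviving_fraction(half_life_years: float, elapsed_years: float) -> float:
    """Fraction of resources still reachable after elapsed_years.

    ASSUMPTION: loss follows exponential decay, so half of the remaining
    resources disappear during every half-life period.
    """
    if half_life_years <= 0:
        raise ValueError("half_life_years must be positive")
    if elapsed_years < 0:
        raise ValueError("elapsed_years must not be negative")
    return 0.5 ** (elapsed_years / half_life_years)


if __name__ == "__main__":
    horizon = 10  # years; an arbitrary illustrative horizon
    for resource, half_life in HALF_LIVES.items():
        share = surviving_fraction(half_life, horizon)
        print(f"{resource}: {share:.1%} expected to remain after {horizon} years")
```

Under this assumption, a resource type with a 2.0-year half-life retains one quarter of its items after 4 years, which is why short half-lives motivate deliberate preservation.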
14
Replication of Results: A Cornerstone of Science
  • the results of one scientist's experiment are
    not considered reliable until another scientist
    has replicated them. The reproducibility of
    results plays several different, crucial roles in
    science ... but in many circumstances,
    considerations of time and money often make
    reproducibility impractical.
  • The Key Role of Replication in Science, Nancy S.
    Hall, The Chronicle of Higher Education, 10
    November 2000

15
Replication of Results
  • First and foremost, scientists attempt to
    reproduce someone else's experiment if they doubt
    that the results are accurate, or if the results
    contradict a view that is widely accepted in the
    field.
  • An experiment is so reproducible that replicating
    it becomes a test of the student: if the student
    cannot replicate the experiment, it is the
    student who is at fault.
  • As a training exercise, a new person in a group
    might be asked to repeat experiments that others
    have already performed, both to familiarize the
    newcomer with the work of the group and to give
    the older members a sense of the newcomer's
    expertise.
  • The Key Role of Replication in Science, Nancy S.
    Hall, The Chronicle of Higher Education, 10
    November 2000

16
Replication of Data Collection: Not Always Feasible
  • Medical experiments carried out over years or
    decades, involving hundreds or even thousands of
    human subjects.
  • Events that are singular and beyond the
    experimenter's control, like comets, earthquakes,
    and volcanic eruptions.
  • The Key Role of Replication in Science, Nancy S.
    Hall, The Chronicle of Higher Education, 10
    November 2000

17
A Global Response
  • Ensuring research data are easily accessible,
    so that they can be used as often and as widely
    as possible, is a matter of sound stewardship of
    public resources.

Organization for Economic Cooperation and
Development (OECD), Promoting Access to Public
Research Data for Scientific, Economic, and
Social Development
18
A Challenge for Society
  • If we are effectively to preserve for future
    generations the ... corpus of information in
    digital form that represents our cultural record,
    we need to commit ourselves technically,
    legally, economically, and organizationally to
    the full dimensions of the task.

Report of the Task Force on Archiving of Digital
Information, 1996, Commission on Preservation and
Access and the Research Libraries Group
19
The Universities
  • Ever since their inception, universities have
    been occupied with the fundamental elements of
    what we now call 'knowledge management', i.e. the
    creation, collection, preservation and
    dissemination of knowledge.

André Oosterlinck, Knowledge Management in
Post-Secondary Education: Universities
20
  • The distinctive mission of the University is to
    serve society as a center of higher learning,
    providing long-term societal benefits through
    transmitting advanced knowledge, discovering new
    knowledge, and functioning as an active working
    repository of organized knowledge.

Mission Statement of the University of California
21
The Academic Libraries
  • It is to the research library community that
    others will look for the preservation of
    digital assets, as they have looked to us in the
    past for reliable, long-term access to the
    traditional resources and products of research
    and scholarship.

Association of Research Libraries (ARL) Strategic
Plan 2005-2009
22
  • Information is the currency of the digital age
    and information integration is the means for
    mobilizing that currency for discovery,
    innovation, learning, and progress.

23
(No Transcript)
24
(No Transcript)
25
(No Transcript)
26
(No Transcript)
27
Before the Digital Age: A World Constrained to 4
Dimensions
28
5th Dimension
CI
29
Opening a 5th dimension through
cyberinfrastructure is the revolutionary force of
the digital age
30
Characteristics of a 5D World (in priority order)
  1. Time and place are no longer barriers to
    participation and interaction
  2. Access is open to specialists and non-specialists
    alike
  3. Information is the primary driver for progress
  4. The realm of the possible is expanded through new
    capabilities, resources, and mechanisms

31
Individuals, groups, organizations, and nations
that don't embrace the 5th dimension will fall
behind in the digital age
32
The World Is Flat - Thomas Friedman
The flat world is expanding - Anonymous OCI
program director
  • More room for innovation
  • New spaces for learning and discovery
  • Expanded opportunities for collaboration and
    interaction
  • Greater capabilities for research and
    education

33
NSF Draft Strategic Plan for Data, Data
Analysis, and Visualization
Chapter 3
http://www.nsf.gov/pubs/2007/nsf0728/index.jsp
34
Vision
  • Science and engineering digital data are
    routinely deposited in a well-documented form,
    are regularly and easily consulted and analyzed
    by specialists and non-specialists alike, are
    openly accessible while suitably protected, and
    are reliably preserved.
  • NSF Cyberinfrastructure Vision for 21st Century
    Discovery, Chapter 3

35
Goals
  • To catalyze the development of a system of
    science and engineering data collections that is
    open, extensible and evolvable.
  • To support development of a new generation of
    tools and services facilitating data acquisition,
    mining, integration, analysis, and visualization.

36
Principles
  • Data generated with NSF funding will be
    accessible and reliably preserved
  • Research/education opportunities determine
    investment priorities
  • Broad community engagement is necessary in
    reviewing and prioritizing data activities

37
Principles (cont'd)
  • Data is only useful if it can be found,
    understood, and analyzed
  • Legitimate privacy, confidentiality, and
    intellectual property rights must be protected
  • International, interagency, and public-private
    partnerships are essential

38
Digital Data Preservation and Access Framework
  • User-centric
  • Multi-Sector
  • Sustainable
  • Reliable
  • Nimble

39
DataNet
  • A robust and resilient national and global
    digital data framework for preservation and
    access to the resources and products of the
    digital age
  • Provide reliable digital preservation, access,
    integration and analysis capabilities for science
    and/or engineering over a decades-long timeline
    (sustainability)
  • Continuously anticipate and adapt to changes in
    technologies and in user needs and expectations
  • Engage at the frontiers of science and engineering
    research and education, with research and development
    to drive the leading edge forward
  • Serve as component elements of an interoperable
    data preservation and access network, spanning
    national and international boundaries (shared
    governance and standards)
  • Creation of new types of organizations that fully
    integrate all of these capabilities

40
DataNet Partners
  • Combine expertise in library and archival
    sciences; computer, computational and information
    sciences; cyberinfrastructure; and domain
    sciences and engineering
  • Develop models for economic and technological
    sustainability over multiple decades
  • Engage at the frontiers of science and
    engineering research and education
  • Work cooperatively and in coordination to
    create a functional data network with
    revolutionary new capabilities for information
    access, use, and integration without regard to
    conventional barriers such as data type and
    format, discipline or subject area, and time and
    place/institution.

41
DataNet Partner Responsibilities
  • Provide for full data management life cycle
    • Data deposition/acquisition/ingest
    • Data curation and metadata management
    • Data protection, including privacy
    • Data discovery, access, use, dissemination
    • Data interoperability, standards, integration
    • Data evaluation, analysis, visualization
  • Engage in research central to DataNet
    responsibilities
  • Education and training
  • Community and user input and assessment
  • International engagement: collaborate and
    coordinate closely with preservation and access
    organizations to catalyze formation of a global
    data network
    • Foreign collaborators are expected to secure
      support from their own national sources.

42
Summary: Strategic Plan
  • Promote a change in culture
  • Catalyze development of a national digital data
    framework
  • Support new generations of tools, services, and
    capabilities

43
NSFNet Traffic September 1991
44
The World Wide DataNet @ TT0
Data point-of-presence
45
The World Wide DataNet @ TTN
46
The Whole Is Greater Than the Sum of Its Parts
  • Climate Change
  • Pandemic
  • Drought and Starvation
  • Sustainable Energy
  • Aging Populations
  • Human Behavior under Stress
  • Etc.

47
Thank you!