Title: Preserving Research Data: The Canadian Experience
1Preserving Research Data: The Canadian Experience
- Charles Humphrey
- University of Alberta
- February 2005
2Outline
- Two national consultations in Canada
- National Data Archive Consultation (NDAC),
October 2000 to June 2002
- National Consultation on Access to Scientific
Research Data (NCASRD), June 2004 to spring 2005
- Findings from these consultations
- Future directions
3Support Received
- This presentation draws upon research that was
supported by the Social Sciences and Humanities
Research Council of Canada (Grant Nos.
421-2000-0011 and 421-2000-0017) and upon the work
of the National Data Archive Consultation and the
National Consultation on Access to Scientific
Research Data.
4National Data Archive Consultation
- Investigation on behalf of the National Archives
and the Social Sciences and Humanities Research
Council
- Two phases
- Year 1: Demonstrate the need for national data
archiving services
- Year 2: Recommend one or more models to provide
these services
5The NDAC Phase I
- The definition of research data employed by the
NDAC consists of three parts:
- outputs of the research process that exist
between raw research materials and published
results;
- digital information structured through
methodology for the purpose of producing new
knowledge; and
- digital information produced by researchers and
of interest to researchers.
6National Data Archive Consultation. Phase One
Needs Assessment Report. May 2001.
7The Risk Level for Research Data in Canada
- Three studies were conducted in conjunction with
the NDAC that provided evidence about the level
to which Canadian research data are at risk.
8The Risk Level for Research Data in Canada
- A gap-analysis of existing mandates and practices
of national institutions;
- A follow-up study to an investigation first
conducted twenty years ago by the now defunct
Machine-Readable Archives; and
- A survey of researchers receiving a standard
research grant from the SSHRC between 1998 and
2000.
9Gap Analysis
- The mandates and practices of Canadian
institutions with responsibilities for the
preservation of heritage were examined to
determine the types of digital objects that are
currently protected.
10Gap Analysis
- Findings
- The vast majority of academic and non-academic
research data fall outside the current
interpretation and execution of the mandates of
the National Library and National Archives (now
the Library and Archives of Canada).
- No other Canadian institution has a national
mandate or the resources to address the current
level of need for preserving research data.
11Revisiting the MRA Study
- An administrative investigation twenty years ago
identified a population of 150 SSHRC-funded
studies utilizing research data.
- Twenty years later, can the data from any of
these research projects be located?
12Revisiting the MRA Study
- Findings
- Data from 3 out of 110 studies could be found
without contacting the original principal
investigators directly for further details.
- The 3 studies for which data were found were all
deposited in the United States with the
Inter-university Consortium for Political and
Social Research (ICPSR).
13Revisiting the MRA Study
- Conclusion
- The risk of data loss is very high without an
institution with the specific mandate to preserve
research data.
14A Survey of SSHRC-funded Researchers
- Researchers who received a standard research
grant from the SSHRC between 1998 and 2000 were
asked about their plans to preserve the data from
their projects.
- Only 7 percent said that they had deposited
the data from their funded project, while another
18 percent said they intended to deposit the data.
15A Survey of SSHRC-funded Researchers
- When asked to identify where they had or would
deposit data, almost all named a source that is
not an archive.
- They mistakenly listed university library data
services, the Web, and a Statistics Canada
Research Data Centre.
- A couple of researchers indicated that they would
deposit the data from their projects if they only
knew where and how to do this.
16A Survey of SSHRC-funded Researchers
- Conclusion
- Without a recognized institution responsible for
preserving research data, researchers do not know
where or how to archive the data from their
research, even if they would like to see the data
preserved.
17A Survey of SSHRC-funded Researchers
- Conclusion
- For the vast majority of researchers in this
study, archiving data is an unknown activity in
conducting research.
18Size of the Problem
- We know that research data are at risk, but how
big a problem is this?
- The survey of researchers who received a standard
research grant from the SSHRC provides evidence
that around 550 out of every 1,000 projects
result in the creation or use of data files
and/or databases.
19Size of the Problem
- This is just the tip of the iceberg!
- There is no estimate for other SSHRC-funded
projects, other granting agencies, or other
agencies and departments creating research data.
20Does It Matter?
- This question has been asked in other countries
with answers that apply equally in Canada:
- Protecting the financial investment in data;
- Stewardship and custodial responsibilities;
- Legal and ethical obligations; and
- Knowledge-generation opportunities.
21The NDAC Phase II
- A primary objective of the second phase was to
recommend the institutional form that national
data archiving services should take in Canada.
- First, research was conducted to identify the
types of existing institutional models for data
archives.
22The Results
- A typology of organizational models was developed
from the results of a survey of 36 international
organizations in data archiving and data services
in the social sciences and humanities.
- Three generalized models were identified that
summarized groupings of the characteristics from
the survey.
23The Results
- While no single existing institution is
necessarily described completely by one of these
three models, the typology offers a fair summary
of the current mix of organizations.
24The Three Models
- The Topical Data Archive
- The Agency-based Data Archive
- The Comprehensive Research Data Archive
25A Proposed Canadian Model
- Establish by legislative mandate an agency
reporting to Parliament through the Ministry of
Industry or Heritage or a combination of both
- Fund centrally through Parliament
- Grant authority to act on behalf of the
Government of Canada in international
negotiations related to research data and its
management standards and practices
- Structure as a network of distributed service
points with a central service facility
26A Proposed Canadian Model
- The central facility would be responsible for
data management, standards development, and data
preservation
- The service points would be responsible for
assisting with the deposit of data, accessing
data, and training and user consultation
- Service points would be located in universities
and other institutions interested in providing
access to preserved research data (a model
similar to the Depository Services Program between
government publishing and Canadian libraries)
27A Proposed Canadian Model
- A management board would oversee the operation of
this National Data Archive Network and consist of
representatives from the regions in Canada as
well as various stakeholders that manage, use,
and produce research data
- Furthermore, this agency would enter into formal
co-operative working relationships with other
national institutions, such as the Library and
Archives of Canada and Statistics Canada.
28National Consultation on Access to Scientific
Research Data
- A Task Force of experts was assembled to organize
a two-day National Forum to investigate issues
regarding access to research data in Canada and
to formulate recommendations.
- Experts from the natural and medical sciences
were engaged to complement the work of the NDAC.
- The Task Force developed a mind map of ideal
achievements reached in 2010 as a result of
improved data access.
30The NCASRD National Forum
- A document structured around the main entries of
this mind map was prepared and distributed to
an assembly of 70 researchers who attended a
National Forum held on November 22-23, 2004.
- This body generated its own mind map of
achievements for the year 2010.
33The NCASRD National Forum
- Working backwards from the ideal achievements,
sub-groups prepared vignettes of the steps needed
to reach the end-states documented in the mind
map. These steps were subsequently organized into
recommendations.
- A draft report with recommendations arising from
the discussions at the National Forum has been
written and circulated to members of the Task
Force. Look for the final report to be available
before the summer of 2005.
34Future Directions
- A field of funded research is needed to study
issues about the preservation of and access to
research data.
- Areas for research
- The Data Economy: who gets what data, when, and
how
- The Life Cycle of Data: the course of data and
its corresponding metadata from the earliest
stages of planning through to the secondary uses
of data
35Future Directions
- Areas for research
- Metadata Standards: the development of standards
that detail the life course of data
- Preservation Standards: the best practices in
long-term preservation of data
- Data Stewardship: the ethics and best practices
of sharing research data
- Inhibitors to Access: the legal and cultural
barriers to access.
36References
- Hackett, Yvette. A national research data
management strategy for Canada: the work of the
National Data Archive Consultation Working
Group. IASSIST Quarterly, vol. 25, no. 3 (2001),
13-16.
- Humphrey, Charles. On the advantages of freely
accessible data (comment letter). Epidemiology,
vol. 14, no. 3 (2003), 381.
- Humphrey, Charles. Research for building a
better data community. IASSIST Quarterly, vol.
25, no. 1 (2001), 21-24.
- Jacobs, James A. and Charles Humphrey.
Preserving research data. Communications of the
ACM, vol. 47, no. 9 (September 2004), 27-29.
37References
- National Data Archive Consultation. Phase One
Needs Assessment Report. May 2001.
http://www.sshrc.ca/web/whatsnew/initiatives/da_phase1_e.pdf
- National Data Archive Consultation. Final Report:
Building Infrastructure for Access to and
Preservation of Research Data. June 2002.
http://www.sshrc.ca/web/whatsnew/initiatives/da_finalreport_e.pdf
- National Consultation on Access to Scientific
Research Data website.
http://ncasrd-cnadrs.scitech.gc.ca/about_e.shtml