Title: Canada, the IPY, and Data Management
1. Canada, the IPY, and Data Management
World Data Center for Glaciology, Boulder
Facilitating the international exchange of snow
and ice data
- Mark A. Parsons, IPY Data Policy and Management Sub-committee; IPY Data and Information Service; Electronic Geophysical Year
Cryospheric System Annual Science Meeting, Toronto, Canada, 24 February 2006
3. What will IPY4 bring? The Challenge!
- Will you be able to find all the data relevant to your research and see relationships between data sets? (Access)
- Will you be able to merge and integrate different data sets across experiments and disciplines? (Interoperability)
- Will you be able to subset, visualize, and transform the data? (Usability)
- Will you be able to retrieve and understand IPY4 data in 2050? (Preservation)
4. Organization of IPY Data Management
- Data Policy and Management Subcommittee
- scientists
- data managers
- funding agencies
IPY Joint Committee
eGY
Programme Office
Data and Information Service
Users
Projects
Data Centers, Virtual Observatories, etc.
6. Alternate Views of the DIS
DIS?
7. Alternate Views of the DIS
Some Canadian pieces to the puzzle
GSC
ELOKA
CNADC
DIS
CCIN
8. A specific project: DADDI
- Interoperable data from CRYSYS, SEARCH, and elsewhere to better understand Arctic coastal processes.
- Involves CCIN, three DAACs, and possibly AWI.
- A Web Services approach to data description, exchange, discovery, visualization, and access.
- A possible prototype for IPY.
- Stay tuned for a scoping workshop this summer in New York.
9. But that's not enough! You must:
- Require rigorous data management plans
- Determine an archive and identify a data management point of contact within the project
- Document well and often
- Negotiate roles, responsibilities, and milestones with the archive and the DIS
- Make data freely available
- Ensure appropriate data attribution and ownership
- Ensure long-term preservation and access, including for non-digital data
10. Better yet, you should:
- Identify relevant historical data and data from other projects and make appropriate arrangements
- Make data interoperable through standard formats, transfer mechanisms, and descriptions; build coalitions
- Facilitate model assimilation
- Develop high-level outreach products
11. Open Questions and Issues
- How interoperable do you want to be? What does "portal" mean to you?
- How does IPY data fit into current operational systems?
- What about GEOSS? Can IPY be a prototype?
- Standards are essential, but which ones? (ISO19115, OAIS, OGC)
- Tech trends that can help us (XML (GML), ontologies, portals, etc.)
- What do you think about the data policy?
- Need a solid business model, especially for the long term
- Data Committee and DIS partners meet next week.
12. We welcome your feedback!
Ellsworth LeDrew: ells_at_watleo.uwaterloo.ca
Mark A. Parsons: parsonsm_at_nsidc.org
13. Milestones
- Early 2006
  - Identify a data-management point of contact, preferably with some data management expertise
  - Identify and arrange funding for an archive
- Mid 2006
  - Define specific roles and responsibilities and standards for data description, quality control, formats, etc.
  - Identify other relevant data not directly part of the project and make similar arrangements
- Early 2007
  - Submit catalogue metadata to the IPY Data and Information Service (DIS) and the appropriate archive.
- During IPY, as data are collected
  - Submit full ISO19115 metadata to the DIS and archive as data are collected.
  - Submit the full, verified (QCed) data collection and comprehensive documentation to the archive.
- 2009
  - Archives make data fully and publicly available through the DIS and other mechanisms.
14. Systems and Innovation
Succeeded
Challenged
Failed
The Standish Group's CHAOS report: an assessment of 40,000 IT application projects
15. Data Management Considerations or Themes
- Manage technical innovation
- Systems need people
- Scientists and data managers working together
- Preservation and Access: Two peas in a pod
- The nature of the documentation
- The nature of the data
16. The People Part
"A striking proportion of project difficulties stem from people in both customer and supplier organisations failing to implement known best practice."
Oxford University/Computer Weekly survey of public and private sector IT projects (emphasis added)
However, people are much more able to adapt to change, uncertainty, and messy systems.
17. The People Part: Science and Data Management
- Many have stated the need to involve scientists in data management, but
- It is also important to involve data managers in conducting science.
- Field Experiments
  - 20% increase in data quality (Parsons, et al. 2004)
  - 70% of experiment cost is data collection (Longley, et al. 2001)
- Observing systems
18. Preservation and Access: Two Peas in a Pod
- Scientific Data Stewardship
  - "preservation and responsive supply of reliable and comprehensive data, products, and information for use in building new knowledge to ..." (USGCRP, 1998)
  - "the long-term preservation of the scientific integrity, monitoring and improving the quality, and the extraction of further knowledge from the data" (H. Diamond et al., NOAA/NESDIS, 2003)
19. Access. What is it?
- Preservation requirements are well defined in the Open Archival Information System (OAIS) Reference Model, but
- No similar model exists for access requirements; eGY could help
- Not even a common definition of access and what restricts it
- Unique access requirements for social science data and non-digital collections (physical samples, photographs, audio, etc.)
20. Documentation
- Use existing standards, e.g.
  - ISO19115 metadata standard
  - OAIS Reference Model
- Describe uncertainty
- Challenge your assumptions
"We must not start from any and every accepted opinion, but only from those we have defined: those accepted by our judges or by those whose authority they recognize." Aristotle, c. 350 BC
21. The Data Itself
[Slide graphic: a block of raw binary digits (0s and 1s)]
- Formats
  - Archives and users may have different needs
- Consider four themes (Raymond, 2004)
  - Transparency
  - Interoperability
  - Extensibility
  - Storage or transaction economy
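As an illustration of why formats and their documentation matter, here is a minimal Python sketch (not part of the original presentation) showing that the same raw bytes decode to very different values depending on the encoding a reader assumes. The byte values and the function name `interpretations` are hypothetical and chosen only for this example.

```python
import struct


def interpretations(raw: bytes) -> dict:
    """Decode the same 4 bytes under three different assumed encodings."""
    if len(raw) != 4:
        raise ValueError("this sketch expects exactly 4 bytes")
    return {
        # IEEE 754 single-precision float, big-endian byte order
        "float_big_endian": struct.unpack(">f", raw)[0],
        # Unsigned 32-bit integer, big-endian byte order
        "uint_big_endian": struct.unpack(">I", raw)[0],
        # Unsigned 32-bit integer, little-endian byte order
        "uint_little_endian": struct.unpack("<I", raw)[0],
    }


# Hypothetical record: four bytes with no accompanying documentation.
record = bytes([0x42, 0x28, 0x00, 0x00])

for name, value in interpretations(record).items():
    print(f"{name}: {value}")
```

Without a format description, nothing in the bytes themselves tells a future user which of these readings is the intended one.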
22. Data Management Principles (bumper stickers)
Preservation without access is pointless; access without preservation is impossible.
It's about DATA, not systems
Involve scientists in data management and data managers in science
Think about long-term archiving NOW!
Document uncertainty!
Keep things simple and flexible
Consider the needs of current, future, and unknown users
23. What's Next?
- The Data and Information Service should be created soon.
- The Data Sub-Committee needs to consider these themes and principles when developing the IPY data policy.