Title: CLADDIER project fundamentals
1 CLADDIER project fundamentals
- Citation, Location and Deposition in Discipline
and Institutional Repositories
- Sam Pepler
- Project Manager
- BADC
- CLADDIER workshop, Chilworth, Southampton, UK
- 15th May 2007
2 People involved
- Bryan Lawrence (PI, BADC)
- Sam Pepler (Project Manager, BADC)
- Sue Latham (BADC)
- Pauline Simpson (NOCS)
- Jessie Hey (University of Southampton)
- Brian Matthews (CCLRC)
- Catherine Jones (CCLRC)
- Katie Portwin (Contractor, CCLRC)
- Shoaib Sufi (CCLRC)
- Kevin O'Neil (CCLRC)
- Katherine Bouton (Reading, NCAS)
3 Outline
- This talk is to set the background to today's
presentations
- Project aims and background
- Starting point and approach
- How the project relates to this workshop
5 Funding
- This work is primarily funded via a JISC grant,
through the Digital Repositories programme
- It is also funded by
  - CCLRC library funding
  - NCAS/BADC funding
6 CLADDIER Aims
- To provide a mechanism for citing data that is
compatible with the way papers are cited
- To provide a mechanism for inter-repository
communication of citation and reference
information
- To report on the issues these mechanisms raise
A practical look at citation of data and how
repositories could communicate citation
information.
7 Partners
- NCAS/BADC: the NERC-designated data centre for
atmospheric science
- NOCS/e-Prints Soton: an institutional repository
for the University of Southampton
- CCLRC/ePubs: an institutional repository for
CCLRC. BADC is hosted by CCLRC.
- NCAS/Reading: NCAS is a distributed centre.
Reading will test CLADDIER products with
active researchers.
8 Timing
- Started July 2005
- Original finish date June 2007
- Extended to early Autumn 2007
- To do
  - Citation ping mechanism
  - User testing
  - Writing up
11 Overall approach
- Emphasis is on generating a light-weight solution
for sharing citation information. Develop
protocols. No monoliths.
- In scope
  - Tailor IRs and BADC systems to include and
  provide citation information
  - Development of data/document discovery service
  - Investigation of interchange formats.
  Configuration of OAI servers to harvest these
  formats
- Out of scope
  - Complete population of citation databases in
  repositories
  - Installation of software outside of CLADDIER
  project partner institutes
  - Quality and quantity of records harvested
  - IPR issues
12 Practical approach. How do I get there from here?
- Joanna is not going to mention the ePubs
repository in her citation.
- Can you cite data and papers in the same list? At
what level are they equivalent?
- How is citation info stored in repositories?
- Does Joanna know how to cite data in her paper?
- Does Joanna want to cite data in her paper?
- What are Data Y and Data X as they exist at BADC?
They may be bits of datasets.
- Does Joanna think the BADC is going to store Data
Y long enough to make it citable?
- Is the information needed to cite a dataset there?
13 Citation and linking in repositories
- Inclusion of structures for references in
repositories
- Add some test references to papers and datasets
- Test systems developed by active scientists
- Construct a citation ping protocol
- A simple mechanism to push citation information
between repositories
- Recommendations for data/publication linkage
(based on lessons learned, and a review of the
literature)
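The citation ping idea above, pushing citation information from one repository to another, can be illustrated with a minimal Python sketch. It is not the CLADDIER protocol: the message fields (`citing_id`, `cited_id`, `citing_title`, `citing_repository`), the JSON encoding, the function names and the identifiers are all assumptions made for illustration, and the network transport is left out.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class CitationPing:
    """Hypothetical ping message; field names are illustrative assumptions."""
    citing_id: str          # identifier of the paper that makes the citation
    cited_id: str           # identifier of the dataset (or paper) being cited
    citing_title: str       # human-readable title of the citing item
    citing_repository: str  # name of the repository sending the ping


def build_ping(ping: CitationPing) -> bytes:
    """Serialise a ping as UTF-8 JSON, ready to be sent to the cited repository."""
    return json.dumps(asdict(ping), sort_keys=True).encode("utf-8")


def receive_ping(body: bytes, store: dict) -> bool:
    """Record the citation against the cited item.

    `store` maps a cited identifier to the list of identifiers citing it.
    Returns False if this citation was already recorded, True otherwise.
    """
    data = json.loads(body.decode("utf-8"))
    citations = store.setdefault(data["cited_id"], [])
    if data["citing_id"] in citations:
        return False
    citations.append(data["citing_id"])
    return True


# Example: an institutional repository tells a data centre about a citation.
store = {}
ping = CitationPing(
    citing_id="hypothetical:paper/123",
    cited_id="hypothetical:dataset/Y",
    citing_title="An example paper",
    citing_repository="Example institutional repository",
)
receive_ping(build_ping(ping), store)
```

In this sketch the receiving repository keeps only a list of citing identifiers per cited item, and a repeated ping is ignored, so the sender can safely retry.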
14 Examine data publication issues
- Discussion of data citation, with real dataset
examples. Asking scientists and data providers
what and how they would like to cite data.
- Analysis of the conceptual models from academic
publishing and data centres. Is a dataset
equivalent to a paper?
- Examine data publication methods and issues. Peer
review of data.
- Test systems developed by active scientists.
- Discuss issues with the community.
- A workshop disseminating information about the
project will be held for the environmental
science community
- User Experience of the CLADDIER System (written
by active environmental scientists based on their
experiences)
- Methodologies and Practices for Data Publication
15 Discovery Service
- Produce a data/document discovery portal. This
enables examination of data and papers as peers.
- This is an example of repository communication.
- Analysis of the conceptual models from academic
publishing and data centres. Is a dataset
equivalent to a paper?
- OAI-PMH interfaces for the three repositories
- Light-weight portal software: CLADDIER discovery
service
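As a rough illustration of how a discovery portal might read from the repositories' OAI-PMH interfaces, the Python sketch below builds a `ListRecords` request and pulls identifiers and titles out of a Dublin Core (`oai_dc`) response. The `verb` and `metadataPrefix` parameters and the XML namespaces come from the OAI-PMH specification; the base URL, function names and the choice to extract only titles are assumptions, not a description of the CLADDIER discovery service.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def extract_titles(xml_text: str) -> list:
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iter(OAI_NS + "record"):
        identifier = record.findtext(f"{OAI_NS}header/{OAI_NS}identifier")
        title = record.findtext(f".//{DC_NS}title")
        results.append((identifier, title))
    return results


# Hypothetical endpoint; a real portal would fetch this URL from each repository.
url = list_records_url("http://repository.example.org/oai")
```

Because data centre records and paper records come back in the same shape, a portal built this way can list datasets and papers side by side, which is the "data and papers as peers" idea on this slide.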
16 Deliverables
- OAI-PMH interfaces for the three repositories
- Light-weight portal software: CLADDIER discovery
service
- A simple mechanism to push citation information
between repositories
- A workshop disseminating information about the
project will be held for the environmental
science community
- User Experience of the CLADDIER System (written
by active environmental scientists based on their
experiences)
- Identifier Migration Issues for Repositories
- Recommendations for data/publication linkage
(based on lessons learned, and a review of the
literature)
- Methodologies and Practices for Data Publication
18 This workshop
- Discussion of data publication and citation.
- Feedback from the workshop will help us write our
data publication report.
- An opportunity to introduce data publication
ideas and engage the community.
19 OJIMS
- A project outcome
- OJIMS is a project between the RMetS and the BADC
to try to introduce peer review of data in the
form of an overlay journal.
20 Big issues
- What information do the repositories lack?
- What data would researchers like to cite?
- How are datasets defined in data centres?
- Data citation and citation standards in scholarly
works.
- How is reference information stored in
repositories?
- How do you see institutional repositories working
with data centres in the future?
- What does peer review of data mean, and is it
needed?