eCrystals Federation

1
eCrystals Federation: Open Repositories for Data-driven Science
Dr Liz Lyon, UKOLN, University of Bath, UK
Dr Simon Coles, University of Southampton, UK
Chemical Informatics Workshop, Manchester, March 2008
2
Themes
  • Context: Institutional data repositories,
    crystallography exemplar
  • Scale: repository federations
  • Longevity: Digital curation and preservation
  • Integration: Semantic challenges

3
eBank Project: building the eCrystals Data Repository
  • Started Sept 2003
  • Scholarly knowledge cycle context
  • UKOLN-led interdisciplinary team
  • ePrints platform @ Southampton
  • Institutional Repository exemplar
  • Embedded in workflow
  • http://ecrystals.chem.soton.ac.uk
4
Scaling Up Report: Phase 3 findings
  • Data policy should reflect lab practice and
    institutional model
  • Diverse lab practice: LIMS, proprietary formats
  • Data quality criteria/validation
  • Prior publication problem
  • We need automated assignment of terms for data
    discovery
  • No discipline preservation model
5
The
6
eCrystals Repository: ePrints.org v3.0
7
Repository Foundations
Learned society subject repository support
  • Using simple Dublin Core
  • Crystal structure
  • Title (Systematic IUPAC Name)
  • Authors
  • Affiliation
  • Creation Date
  • Additional chemical information through
    Qualified Dublin Core
  • Empirical formula
  • International Chemical Identifier (InChI)
  • Compound Class Keywords
  • Specifies which datasets are present in an
    entry
  • Application Profile:
    http://www.ukoln.ac.uk/projects/ebank-uk/schemas/
  • DOI links:
    http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145
  • Rights and Citation:
    http://ecrystals.chem.soton.ac.uk/rights.html
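
The field list above can be pictured as a small metadata record. The following is a minimal sketch, not the eBank-UK application profile itself: the Dublin Core namespace is the standard one, but the `EBANK_NS` namespace, the qualified element names, the placement of affiliation, and all sample values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"    # standard Dublin Core element set
EBANK_NS = "http://example.org/ebank-terms/"  # hypothetical namespace, not the real profile


def build_record(title, authors, affiliation, created, formula, inchi, keywords):
    """Return an XML element describing one crystal structure entry."""
    record = ET.Element("record")
    # Simple Dublin Core fields named on the slide.
    ET.SubElement(record, f"{{{DC_NS}}}type").text = "Crystal structure"
    ET.SubElement(record, f"{{{DC_NS}}}title").text = title
    for author in authors:
        ET.SubElement(record, f"{{{DC_NS}}}creator").text = author
    ET.SubElement(record, f"{{{DC_NS}}}date").text = created
    # Domain-specific fields; these element names are illustrative assumptions.
    ET.SubElement(record, f"{{{EBANK_NS}}}affiliation").text = affiliation
    ET.SubElement(record, f"{{{EBANK_NS}}}empiricalFormula").text = formula
    ET.SubElement(record, f"{{{EBANK_NS}}}inchi").text = inchi
    for keyword in keywords:
        ET.SubElement(record, f"{{{EBANK_NS}}}keyword").text = keyword
    return record


if __name__ == "__main__":
    # Illustrative values only; not taken from a real repository entry.
    example = build_record(
        title="benzene",
        authors=["A. Depositor", "B. Crystallographer"],
        affiliation="Example University",
        created="2008-03-01",
        formula="C6 H6",
        inchi="InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
        keywords=["aromatic hydrocarbon"],
    )
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("ebank", EBANK_NS)
    print(ET.tostring(example, encoding="unicode"))
```

Keeping the generic fields in plain Dublin Core lets a general-purpose harvester read title, creator and date, while the chemistry-specific fields sit in a separate namespace that domain services can interpret.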

8
Federation: interoperability and linking services
  • Roll-out in 2 phases led by University of
    Southampton
  • Establish Federation policies, application
    profile, mappings
  • Bi-directional links with derived articles in
    publisher repositories: IUCr, Royal Society of
    Chemistry (RSC), Chemistry Central: scholarly
    knowledge cycle
  • StOReLink project - Test linking options: StORe
    middleware and CLADDIER
  • OAI-ORE Testbed

eChemistry project
9
Laboratory practice and workflow
  • Community standard: CIF
  • Mixed lab practice: central service facility
    versus single staff crystallographer in
    department
  • Achieve end-to-end workflow
  • Challenge of instrument manufacturers with
    proprietary formats
  • Repository Lite for smaller lab operations?

X-ray diffractometers
10
eBank-UK Phase 3: Curation and Preservation Study
Sustainability issues
  • http://www.ukoln.ac.uk/projects/ebank-uk/curation/
  • Examined four main areas:
  • Audit and certification (TRAC, DRAMBORA, NESTOR,
    ISO International repository audit and
    certification BOF Group)
  • The Open Archival Information System (OAIS) and
    Representation Information (RI)
  • eBank-UK application profile and preservation
    metadata
  • ePrints.org repository platform

Recommendations:
  • Self-assessment using DRAMBORA
  • Consider Representation Information in wider
    context
  • Develop preservation strategy
  • Capture preservation metadata - PREMIS
11
Semantic issues
  • Crystallographic schema underpins CIF
    (Crystallographic Information Framework), but is
    limited to data parameters
  • e.g. cell_length_a
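
To illustrate what "limited to data parameters" means in practice, here is a minimal sketch that pulls simple name/value items out of CIF-like text. In CIF files data names begin with an underscore (so the slide's example appears as `_cell_length_a`), and numeric values may carry a standard uncertainty in parentheses. The function names are hypothetical, and the parser deliberately ignores loops and multi-line values, so it is not a full CIF reader.

```python
import re

# A CIF number may carry a standard uncertainty in parentheses, e.g. 10.123(4).
_NUMBER_WITH_SU = re.compile(r"^([-+]?\d*\.?\d+)(?:\((\d+)\))?$")


def parse_cif_items(text):
    """Collect simple `_name value` pairs; loops and multi-line values are skipped."""
    items = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("_"):
            continue  # not a data name (e.g. a data_ block header or a blank line)
        parts = line.split(None, 1)
        if len(parts) == 2:  # a bare data name with no value on the line is ignored
            items[parts[0]] = parts[1].strip()
    return items


def numeric_value(raw):
    """Return the numeric part as a float, dropping any uncertainty; None if not numeric."""
    match = _NUMBER_WITH_SU.match(raw)
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    # Illustrative cell parameters, not taken from a real structure.
    sample = "data_example\n_cell_length_a 10.123(4)\n_cell_angle_alpha 90.00\n"
    for name, raw in parse_cif_items(sample).items():
        print(name, numeric_value(raw))
```

Everything recovered this way is a measured or refined parameter. Nothing in these items says what kind of chemistry the structure represents, which is the gap the slide points to.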

12
  • IUCr Acta Cryst 1992
  • Limited set of keywords describing methods,
    properties applications, compounds, attributes
  • No established crystallography dictionary or
    controlled vocabulary to give chemistry context

13
What do we want to do?
  • Support depositors: keyword/term assignment
  • Facilitate and improve automated indexing
  • Support advanced search / browse
  • Allow metadata validation and enhancement
  • Apply across a heterogeneous Federation
  • Cross search, cross browse functionality
  • Link data to all associated digital objects
  • Develop domain semantics / vocabulary
  • Use domain-specific authority files
  • Mine to discover rather than find
  • Achieve full inter-disciplinary integration
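
One way to picture keyword/term assignment with validation is a lookup against a controlled vocabulary. The sketch below assumes a tiny, invented vocabulary (the slides note that no established one exists for crystallography), and the names `VOCABULARY`, `normalise` and `assign_terms` are hypothetical.

```python
# Hypothetical controlled vocabulary: preferred term -> accepted variants.
VOCABULARY = {
    "single-crystal X-ray diffraction": {"scxrd", "single crystal x-ray diffraction"},
    "organometallic compound": {"organometallic", "organometallics"},
}


def normalise(term):
    """Lower-case, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(term.lower().replace("-", " ").split())


def assign_terms(free_keywords, vocabulary=VOCABULARY):
    """Map depositor keywords to preferred terms; return (assigned, unmatched)."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        lookup[normalise(preferred)] = preferred
        for variant in variants:
            lookup[normalise(variant)] = preferred
    assigned, unmatched = [], []
    for keyword in free_keywords:
        preferred = lookup.get(normalise(keyword))
        if preferred is None:
            unmatched.append(keyword)  # candidate for review or a new vocabulary entry
        elif preferred not in assigned:
            assigned.append(preferred)
    return assigned, unmatched


if __name__ == "__main__":
    print(assign_terms(["SCXRD", "Organometallics", "blue crystals"]))
```

Returning the unmatched keywords separately supports both goals on the slide: matched terms feed consistent cross-search across the Federation, while unmatched ones can be flagged for validation or used to grow the vocabulary.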

14
Some (semantic) issues..
  • How are terms assigned?
  • Informal tags and/or structured KOS?
  • How is a vocabulary curated and maintained?
  • Can a vocabulary be transformed into a (Semantic
    Web related understanding) ontology?
  • Disambiguation, acronyms, IUPAC names
  • Persistent identification for data citation
  • Granularity of data citation
  • Data (and metadata) quality, provenance,
    validation
  • Embedding within complex workflows
  • Use collaborative social approaches?
  • Community adoption becomes part of the culture
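
For the persistent identification point, a short sketch of turning a dataset DOI into a resolvable link and a citation string may help. The resolver prefix and the example DOI are the ones shown earlier in the slides. The DOI pattern is only a common heuristic, and the citation layout and function names are assumptions, not an agreed community format.

```python
import re

RESOLVER = "http://dx.doi.org/"  # resolver prefix as it appears in the slides
# Heuristic check only: "10.", a numeric registrant code, "/", then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi):
    """Return a resolver URL for a DOI, rejecting strings that do not look like one."""
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a plausible DOI: {doi!r}")
    return RESOLVER + doi


def format_data_citation(authors, year, title, doi):
    """Build a citation string; the layout here is purely illustrative."""
    return f"{', '.join(authors)} ({year}). {title}. {doi_to_url(doi)}"


if __name__ == "__main__":
    print(doi_to_url("10.1594/ecrystals.chem.soton.ac.uk/145"))
```

The granularity question on the slide shows up directly here: the identifier above points at a whole repository entry, and citing an individual file within that entry would need either a finer-grained identifier or an agreed way of qualifying the entry-level one.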

15
Questions?
Slides will be available at:
http://wiki.ecrystals.chem.soton.ac.uk/index.php
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/presentations.html