eBank CombeDay - PowerPoint PPT Presentation

About This Presentation
Title:

eBank CombeDay

Description:

Diffractometer. Grid Middleware. Structures. Database. CombeDay 2005. 4 ... Initialisation: mount new sample on diffractometer & set up data collection ... – PowerPoint PPT presentation

Slides: 19
Provided by: simon175

Transcript and Presenter's Notes

Title: eBank CombeDay


1
Making Data Openly Available (Simon Coles)
2
Data Overload!
3
CombeChem eScience testbed
Properties
4
Chemistry Publications
Ideas and interpretations
Hooks into the literature
Raw data!
Results, derived data
5
(No Transcript)
6
(No Transcript)
7
Establishing common ground
  • Understand the data creation process
  • Terminology and definitions: data, metadata,
    datafile, dataset, data holding
  • Different views: digital library researchers,
    computer scientists, chemists; generic vs
    specific; modeller vs practitioner
  • Aim for a common ontology: modelling the domain,
    creating a metadata schema

8
Crystallography workflow
  • Initialisation: mount new sample on
    diffractometer & set up data collection
  • Collection: collect data
  • Processing: process and correct images
  • Solution: solve structures
  • Refinement: refine structure
  • CIF: produce CIF (Crystallographic Information
    File format)
  • Report: generate Crystal Structure Report
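
The workflow stages above can be sketched as an ordered enumeration. This is a minimal illustration only: the names `Stage` and `missing_stages` are assumptions made for this sketch and do not come from the eBank software. It shows one way to record which stages of a dataset have produced output.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Workflow stages, numbered in the order listed on the slide (hypothetical names)."""
    INITIALISATION = 1  # mount new sample, set up data collection
    COLLECTION = 2      # collect data
    PROCESSING = 3      # process and correct images
    SOLUTION = 4        # solve structure
    REFINEMENT = 5      # refine structure
    CIF = 6             # produce CIF
    REPORT = 7          # generate Crystal Structure Report


def missing_stages(present):
    """Return the stages with no recorded output, in workflow order."""
    have = set(present)
    return [stage for stage in Stage if stage not in have]


def is_complete(present):
    """True when every stage of the workflow has recorded output."""
    return not missing_stages(present)
```

For example, `missing_stages([Stage.INITIALISATION, Stage.COLLECTION])` lists the five stages from processing onwards.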

9
Deposition into the archive
10
An Archive entry
ecrystals.chem.soton.ac.uk
11
Access to the underlying data
12
Some metadata issues
  • Using simple and qualified Dublin Core
  • Additional chemical information in schema for
    harvesting, e.g. empirical formula
  • Schema contains International Chemical Identifier
    (InChI)
  • Specifies which parts of a dataset are present
  • Links to eprints (and other published literature)
    derived from the data
  • Using vocabularies specific to crystallography
  • Engaging the broader scientific community to
    ensure different schemas are compliant and
    standards can emerge
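
A record of the kind described above might be assembled as follows. This is a sketch under stated assumptions, not the eBank schema itself: the `ebank` namespace URI and the element names `empiricalFormula` and `inchi` are hypothetical, and the direction of the `isReferencedBy` link is assumed. Only the Dublin Core namespace URIs are the published ones. The InChI shown is that of benzene, used purely as sample data.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"   # simple Dublin Core
DCTERMS = "http://purl.org/dc/terms/"     # qualified Dublin Core
EBANK = "http://example.org/ebank_dc/"    # hypothetical namespace, for illustration only

for prefix, uri in (("dc", DC), ("dcterms", DCTERMS), ("ebank", EBANK)):
    ET.register_namespace(prefix, uri)


def build_record(identifier, title, formula, inchi, eprint_url):
    """Build a Dublin Core style record with extra chemical fields (assumed names)."""
    record = ET.Element(f"{{{EBANK}}}record")
    ET.SubElement(record, f"{{{DC}}}identifier").text = identifier
    ET.SubElement(record, f"{{{DC}}}title").text = title
    ET.SubElement(record, f"{{{DC}}}type").text = "CrystalStructure"
    # Chemical information added to support harvesting (element names assumed).
    ET.SubElement(record, f"{{{EBANK}}}empiricalFormula").text = formula
    ET.SubElement(record, f"{{{EBANK}}}inchi").text = inchi
    # Link to an eprint derived from the data (direction assumed).
    ET.SubElement(record, f"{{{DCTERMS}}}isReferencedBy").text = eprint_url
    return record


record = build_record(
    "http://example.org/structure/1",          # hypothetical identifier
    "Example crystal structure",
    "C6 H6",
    "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",      # benzene
    "http://example.org/eprint/1",             # hypothetical eprint URL
)
xml_text = ET.tostring(record, encoding="unicode")
```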

13
Data flow in eBank
Dataset
Dataset
Dataset
dcterms:references
Harvesting: OAI-PMH (oai_dc)
Crystal structure (data holding)
ePrint UK aggregator service
Linking
Harvesting: OAI-PMH (ebank_dc)
ebank_dc record (XML)
Deposit
dc:type = CrystalStructure and/or Collection
eBank UK aggregator service
Institutional repository
dc:identifier
Crystal structure report (HTML)
dcterms:isReferencedBy
Harvesting: OAI-PMH (oai_dc)
Eprint jump-off page (HTML)
dc:identifier
Eprint manifestation (e.g. PDF)
Eprint oai_dc record (XML)
Subject service
dc:type = Eprint and/or Text
Linking
Model input: Andy Powell, UKOLN.
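
The harvesting step in this data flow can be sketched as an OAI-PMH `ListRecords` request followed by extraction of the `dc:identifier` values used for linking. This is an illustrative sketch, not the aggregator's code: the base URL, the sample response, and the function names are assumptions. The `verb` and `metadataPrefix` request parameters and the namespace URIs are those defined by OAI-PMH and Dublin Core.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for one metadata format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Hand-written sample response, standing in for what a repository might return.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example crystal structure</dc:title>
          <dc:identifier>http://example.org/structure/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def harvested_identifiers(xml_text):
    """Return every dc:identifier value in a harvested response."""
    root = ET.fromstring(xml_text)
    # Matching on the Dublin Core namespace skips the OAI header identifier.
    return [element.text for element in root.iter(f"{{{DC}}}identifier")]


# Hypothetical repository endpoint, requesting the ebank_dc format.
url = list_records_url("http://example.org/oai", "ebank_dc")
identifiers = harvested_identifiers(SAMPLE_RESPONSE)
```

No network request is made here; a real harvester would fetch `url` and also follow any resumption tokens the repository returns.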
14
Harvesting OAIster
15
Linking and aggregating
16
Embedded in a science portal
17
Current situation
  • Version 2.0 eBank metadata schema
  • Pilot institutional e-data repository for
    harvesting (raw, derived, results data) using
    EPrints software
  • Exports records as ebank_dc and oai_dc
  • Validation of schema: discussion with the
    International Union of Crystallography for final
    developments and wider deployment
  • Pilot eBank UK aggregator service
  • Developing search interface: Version 1.0
  • Testing with PSIgate physical sciences portal:
    embedding eBank UK

18
What's next?
  • Progress towards generic metadata schemas
  • Validation against other schemas (CCLRC Model)
  • Eprints.org software: allow for more generic
    scientific data and schemas?
  • Metadata enhancement: keywords based on knowledge
    of keywords in related publications?
  • Investigate identifiers: International Chemical
    Identifier
  • Explore context sensitive linking
  • Full embedding into chemical and crystallographic
    research and publishing
  • e-Learning embedding and pedagogic evaluation
  • Feasibility study in related domains