Title: eBank CombeDay
1 The Central Role of Data Capturing and Sharing Chemistry Research Data
Simon Coles
School of Chemistry, University of Southampton, U.K.
s.j.coles_at_soton.ac.uk
2 Current Situation - Data Generation
Synthesis
Characterisation
3 Current Situation - Data Management
- Data from experiments conducted as recently as six months ago might be suddenly deemed important, but those researchers may never find those numbers or, if they did, might not know what those numbers meant.
- Lost in some research assistant's computer, the data are often irretrievable or an undecipherable string of digits.
- To vet experiments, correct errors, or find new breakthroughs, scientists desperately need better ways to store and retrieve research data.
- Data from Big Science is easier to handle, understand and archive. Small Science is horribly heterogeneous and far more vast. In time Small Science will generate 2-3 times more data than Big Science.
Source: "Lost in a Sea of Science Data", S. Carlson, The Chronicle of Higher Education (23/06/2006)
4 Current Situation - Data and Publishing
5 Separating Data from Interpretations
Intellect & Interpretation (journal article, report, etc.)
Underlying data (institutional data repository)
6 Smart Labs
7 Laboratory IRs and Information Management
8 The R4L Repository
Create new compound
Add experiment data and metadata
Deposit
Search / Browse
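The four steps on this slide can be sketched as a small in-memory workflow. This is only an illustration of the create / add / deposit / search sequence, not the R4L software itself: the class names, method names, identifiers and file names below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Compound:
    """A compound record and its experiments (hypothetical structure)."""
    compound_id: str
    name: str
    experiments: list = field(default_factory=list)
    deposited: bool = False


class LabRepository:
    """In-memory stand-in for a laboratory repository (illustration only)."""

    def __init__(self):
        self._records = {}

    def create_compound(self, compound_id, name):
        # Step 1: create new compound
        compound = Compound(compound_id, name)
        self._records[compound_id] = compound
        return compound

    def add_experiment(self, compound_id, data_file, metadata):
        # Step 2: add experiment data and metadata
        self._records[compound_id].experiments.append(
            {"data_file": data_file, "metadata": dict(metadata)}
        )

    def deposit(self, compound_id):
        # Step 3: deposit, which makes the record visible to search and browse
        self._records[compound_id].deposited = True

    def search(self, text):
        # Step 4a: case-insensitive name search over deposited records only
        text = text.lower()
        return [c for c in self._records.values()
                if c.deposited and text in c.name.lower()]

    def browse(self):
        # Step 4b: list identifiers of all deposited records
        return sorted(c.compound_id for c in self._records.values() if c.deposited)


repo = LabRepository()
repo.create_compound("cmpd-001", "Example benzoate ester")
repo.add_experiment("cmpd-001", "nmr_1h.jdx",
                    {"technique": "1H NMR", "solvent": "CDCl3"})
repo.deposit("cmpd-001")
print(repo.browse())
```

Records that have been created but not yet deposited stay invisible to search and browse in this sketch, which is one possible reading of the deposit step.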
9 Blogging Experiments
- A repository can
- Allow one to put, store and get digital objects
- Provide minimal search and browse functions
- NOT provide the presentation and discussion functions essential to a scientific study
- Social networking tools and approaches can provide a way
10 Facilitating Research
- Facilitates geographically distributed collaborative research
- Useful approach for sharing failed experiments?
11 Machines Blogging Experiments
- Automatic upload by scientific instrument
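As a rough illustration of automatic upload, an instrument-side script could wrap each raw output in a blog-post-like entry before sending it to the blog. The sketch below only builds the entry. The field names, instrument name, sample identifier, data and timestamp are all hypothetical, and the actual upload step is left out because the blog system's interface is not described here.

```python
import hashlib
from datetime import datetime, timezone


def build_instrument_post(instrument, sample_id, data, recorded_at):
    """Turn raw instrument output (bytes) into a blog-post-like dict."""
    return {
        "title": f"{instrument}: run for sample {sample_id}",
        "author": instrument,                  # the machine is the author
        "published": recorded_at.isoformat(),
        "tags": ["automatic-upload", sample_id],
        "size_bytes": len(data),
        # Checksum lets readers verify the attached data file later
        "sha256": hashlib.sha256(data).hexdigest(),
    }


raw_output = b"h k l intensity\n1 0 0 1523.4\n"   # made-up data
post = build_instrument_post(
    "diffractometer-1",
    "sample-42",
    raw_output,
    datetime(2007, 1, 15, 9, 30, tzinfo=timezone.utc),
)
print(post["title"])
```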
12 Comments and Annotation
- A picture says a thousand words!
- Chemists like to sketch!
- Need for more advanced Blog tools / technology
13 Current Situation - Data Deluge
1.5,000,000
30,000,000
450,000
14 Laboratory Data Management and Archive
15 The eCrystals Public Data Archive
http://ecrystals.chem.soton.ac.uk
16 NCS Data Publication Policy
- Joint publication: timed release of data tied to conventional journal article
- Separate publication: independent release of data so that it can be cited, e.g. from a journal article, grant report, poster
- Accidental or undesired results: immediate release after agreement with concerned parties
- Never to be formally published results: automatic release after three years
- Embargo feature: default 3 years, but timescale can be defined by depositor
- Record can be made public at any time (following agreement from all concerned parties)
- Roles of all concerned parties defined (originator, etc.)
- Data citation, DOI, Rights
http://www.ncs.chem.soton.ac.uk/pub_pol.htm
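The embargo rule in the policy above (default of three years, adjustable by the depositor, with early release possible by agreement) can be expressed as a small date calculation. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, and the handling of a 29 February deposit is a choice made here, not something the policy specifies.

```python
from datetime import date


def release_date(deposited_on, embargo_years=3):
    """Date on which an embargoed record becomes public automatically."""
    target_year = deposited_on.year + embargo_years
    try:
        return deposited_on.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; fall back to 28 February
        return deposited_on.replace(year=target_year, day=28)


def is_public(deposited_on, today, embargo_years=3, released_early=False):
    """Public once released by agreement, or once the embargo has ended."""
    return released_early or today >= release_date(deposited_on, embargo_years)


print(release_date(date(2007, 5, 1)))
print(is_public(date(2007, 5, 1), today=date(2009, 1, 1)))
```

Passing `released_early=True` models the case where all concerned parties agree to make the record public before the embargo ends.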
17 Linking and aggregating
- Link data and associated publications
- Dataset annotated with metadata
- Semantic publishing on WWW and in journals
http://www.rsc.org/Publishing/Journals/ProjectProspect/index.asp
http://www.ukoln.ac.uk/projects/ebank-uk/pilot/
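One way to picture the first two bullets of this slide is a dataset record that carries its own metadata plus a list of the publications that cite it, so the link can be followed in either direction. The structure below is a hypothetical sketch. The field names and all identifiers are placeholders, not the schema used by eBank or eCrystals.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    identifier: str                                # e.g. a DOI; placeholder values below
    title: str
    metadata: dict = field(default_factory=dict)   # descriptive annotations
    cited_by: list = field(default_factory=list)   # identifiers of linked publications


def link(dataset, publication_id):
    """Record that a publication is associated with a dataset (idempotent)."""
    if publication_id not in dataset.cited_by:
        dataset.cited_by.append(publication_id)


def datasets_for(publication_id, datasets):
    """Follow the link from a publication back to its underlying datasets."""
    return [d.identifier for d in datasets if publication_id in d.cited_by]


structure_a = Dataset("dataset:0001", "Crystal structure A",
                      {"technique": "single-crystal X-ray diffraction"})
structure_b = Dataset("dataset:0002", "Crystal structure B")

link(structure_a, "article:xyz")
link(structure_b, "article:xyz")
link(structure_a, "article:xyz")    # repeated link is ignored

print(datasets_for("article:xyz", [structure_a, structure_b]))
```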
18 eCrystals Global Federation Model
[Diagram: federation model. Components: data creation & capture in Smart lab; laboratory repository; institutional data repositories; subject repository; aggregator services; presentation services / portals; institution library & information services. Labelled flows and functions: deposit; validation; search, harvest; publication; data analysis; data discovery, linking, citation; curation & preservation.]
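The "search, harvest" flow between repositories and aggregator services can be sketched as incremental harvesting: the aggregator asks each repository only for records changed since its last visit, then offers a combined search. This is a simplified illustration, not the protocol used by the federation. The class names, record fields, repository names and dates are all hypothetical.

```python
from datetime import date


class Repository:
    """A repository exposing its records for harvesting (illustration only)."""

    def __init__(self, name, records):
        self.name = name
        self.records = records      # dicts with "id", "title", "modified"

    def list_records(self, since=None):
        # Return everything, or only records modified after `since`
        return [r for r in self.records
                if since is None or r["modified"] > since]


class Aggregator:
    """Harvests metadata from several repositories into one searchable index."""

    def __init__(self):
        self.index = {}
        self._last_seen = {}        # newest modification date seen per repository

    def harvest(self, repo):
        new = repo.list_records(self._last_seen.get(repo.name))
        for record in new:
            self.index[(repo.name, record["id"])] = record
        if new:
            self._last_seen[repo.name] = max(r["modified"] for r in new)
        return len(new)

    def search(self, text):
        text = text.lower()
        return sorted(key for key, r in self.index.items()
                      if text in r["title"].lower())


lab = Repository("institution-a", [
    {"id": "1", "title": "Copper complex structure", "modified": date(2007, 1, 10)},
])
subject = Repository("subject-repo", [
    {"id": "7", "title": "Zinc complex structure", "modified": date(2007, 2, 1)},
])

aggregator = Aggregator()
first_pass = [aggregator.harvest(lab), aggregator.harvest(subject)]
second_pass = aggregator.harvest(lab)    # nothing new since the first pass
print(first_pass, second_pass, aggregator.search("complex"))
```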
19 Changing Times!
Information Consumers
Information Providers
All I am saying is that now is the time to
develop the technology to deflect an asteroid