Title: Metadata For CARMEN
1Metadata For CARMEN
- Phillip Lord and Frank Gibson
2Problems
- In the standard model, one collects data, publishes a paper or papers and then gradually loses the original dataset.
- THE NEW KNOWLEDGE ECONOMY AND SCIENCE AND TECHNOLOGY POLICY, Geoffrey Bowker, University of California, San Diego
3The need for clear metadata
- Most neuroscience data is relatively simple in structure
- But often contextually complex
- Sometimes associated with behavioural features
4Neuroscience spike data
- The raw data is just a waveform
- But what is the experiment for?
- What stimulus is the organism/tissue receiving?
- Even, which channel is which?
- The data sets being produced are (reasonably) large (tens of GB, or 1 TB in three months)
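The questions above amount to annotations that have to travel with the waveform. A minimal sketch, assuming a hypothetical two-channel recording held as plain Python lists; the class `AnnotatedRecording`, its field names and the sample values are illustrative only and are not part of any CARMEN format:

```python
from dataclasses import dataclass


@dataclass
class AnnotatedRecording:
    """Raw waveforms plus the context needed to interpret them (hypothetical layout)."""
    waveforms: list[list[float]]      # one list of samples per channel
    channel_labels: dict[int, str]    # channel index -> electrode name
    stimulus: str                     # what the organism/tissue was receiving
    purpose: str                      # what the experiment is for

    def channel(self, label: str) -> list[float]:
        """Return the samples recorded on the electrode with the given label."""
        for index, name in self.channel_labels.items():
            if name == label:
                return self.waveforms[index]
        raise KeyError(f"no channel labelled {label!r}")


# Made-up values, purely for illustration.
recording = AnnotatedRecording(
    waveforms=[[0.0, 0.1, -0.2], [0.3, 0.0, -0.1]],
    channel_labels={0: "electrode_A", 1: "electrode_B"},
    stimulus="illustrative stimulus description",
    purpose="illustrative experiment description",
)
```

With the labels stored alongside the samples, `recording.channel("electrode_B")` answers "which channel is which?" without consulting a lab-book.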
5Information Extraction
- How do we extract the information?
http://en.wikipedia.org/wiki/Image:ATTtelephone-large.jpg
6Multi-Author data
   Author            PMID      Type                 Size
1  Davierwala et al  16155567  Synthetic_Lethality   627
2  Krogan et al      14759368  Affinity_Capture-MS   164
3  Hazbun et al      14690591  Affinity_Capture-MS  3210
4  Gavin et al       11805826  Affinity_Capture-MS  3596
5  Ho et al          11805837  Affinity_Capture-MS   733
6  Ito et al         11283351  Two-hybrid            275
From Katherine James, NCL
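Because each row carries its author, PMID and experiment type, the datasets can be combined or compared by type. A minimal sketch using the rows from the table above; the function name `size_by_type` is hypothetical, and the Size column is summed as given without assuming what it counts:

```python
from collections import defaultdict

# Rows copied from the table above: (author, PMID, experiment type, size).
DATASETS = [
    ("Davierwala et al", 16155567, "Synthetic_Lethality", 627),
    ("Krogan et al", 14759368, "Affinity_Capture-MS", 164),
    ("Hazbun et al", 14690591, "Affinity_Capture-MS", 3210),
    ("Gavin et al", 11805826, "Affinity_Capture-MS", 3596),
    ("Ho et al", 11805837, "Affinity_Capture-MS", 733),
    ("Ito et al", 11283351, "Two-hybrid", 275),
]


def size_by_type(datasets):
    """Sum the Size column for each experiment type."""
    totals = defaultdict(int)
    for _author, _pmid, experiment_type, size in datasets:
        totals[experiment_type] += size
    return dict(totals)


totals = size_by_type(DATASETS)
for experiment_type, size in sorted(totals.items()):
    print(f"{experiment_type}: {size}")
```

Without the Type column recorded for every dataset, this grouping would have to be reconstructed from the papers themselves.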
8How do we represent
In silico Analysis
Derived data
Laboratory Experiments
11Joseph Whitworth
12Metadata
- Description of results
- Sample
- How it was generated
- Equipment
- Processing steps
- Expensive to capture
- Important to validate result
Lab-book
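The items listed on this slide can be captured as a structured record rather than free text in a lab-book. A minimal sketch, assuming a hypothetical `ResultMetadata` class whose field names mirror the bullets above; the values are placeholders and the JSON layout is not a CARMEN format:

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ResultMetadata:
    """Description of a result: sample, how it was generated, equipment, processing."""
    sample: str
    generation_method: str
    equipment: list[str] = field(default_factory=list)
    processing_steps: list[str] = field(default_factory=list)

    def add_step(self, description: str) -> None:
        """Append a processing step, preserving the order in which steps were applied."""
        self.processing_steps.append(description)

    def to_json(self) -> str:
        """Serialise the record so it can be stored next to the data file."""
        return json.dumps(asdict(self), indent=2)


# Placeholder values, purely for illustration.
record = ResultMetadata(
    sample="example sample",
    generation_method="example protocol",
    equipment=["example amplifier"],
)
record.add_step("band-pass filter")
record.add_step("spike detection")
print(record.to_json())
```

Keeping the processing steps as an ordered list is one way to make a result checkable later, since the order of steps matters when validating it.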
13The need for standards!
- established by consensus and approved by a recognized body, that provides, rules, for the optimum degree of order in a given context
- BSI
- http://www.bsi-global.com/en/Standards-and-Publications/About-standards/Glossary/
14View from microarrays
- Content Standard Minimal Information
MO -- Terminology
MAGE -- Structure
From the MGED society
15Life science communities
Society                                             Domain          Website
The Genomic Standards Consortium (GSC)              Genomics        http://darwin.nox.ac.uk/gsc/
Microarray and Gene Expression Data Society (MGED)  Genomics        www.mged.org
Proteomics Standards Initiative (PSI)               Proteomics      http://psidev.info
Metabolomics Standards Initiative (MSI)             Metabolomics    www.metabolomicssociety.org
Flow Cytometry experiment Community                 Flow Cytometry  www.flowcyt.org
17MINI electrophysiology
- General Features
- Study Subject
- Recording Location
- Task
- Stimulus
- Recording
- Time Series Data
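A minimal-information checklist like this lends itself to an automated completeness check. A sketch, assuming a metadata record is held as a dictionary keyed by the section names above; the function `missing_sections` and the record layout are hypothetical and do not reproduce the MINI specification itself:

```python
# Section names taken from the slide; the dict-based record layout is an assumption.
MINI_SECTIONS = [
    "General Features",
    "Study Subject",
    "Recording Location",
    "Task",
    "Stimulus",
    "Recording",
    "Time Series Data",
]


def missing_sections(record: dict) -> list[str]:
    """Return the checklist sections that are absent or empty in the record."""
    return [name for name in MINI_SECTIONS if not record.get(name)]


# Placeholder draft: two sections filled in, one present but empty.
draft = {
    "General Features": {"contact": "example researcher"},
    "Study Subject": {"species": "example species"},
    "Stimulus": {},
}
print(missing_sections(draft))
```

An empty section is treated the same as a missing one here, on the assumption that a heading with no content does not satisfy a minimal-information requirement.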
18Recording Location
- Recording Location Structure
- Brain Area
- Slice Thickness
- Slice Orientation
- Cell Type
- Cell Type co-ordinates
- Location confirmation
20View from microarrays
- Content Standard Minimal Information
MO -- Terminology
MAGE -- Structure
From the MGED society
21Functional Genomics Experiment (FuGE)
- Model of common components in science investigations, such as materials, data, protocols, equipment and software.
- Provides a framework for capturing complete laboratory workflows, enabling the integration of pre-existing data formats.
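The idea of modelling a workflow from common components can be illustrated with a heavily simplified sketch. The classes below are hypothetical and only loosely inspired by the components named above (materials, data, protocols, equipment, software); they are not the FuGE object model. The sketch assumes steps are listed in the order they were carried out:

```python
from dataclasses import dataclass, field


@dataclass
class Protocol:
    """A reusable procedure, with the equipment and software it relies on."""
    name: str
    equipment: list[str] = field(default_factory=list)
    software: list[str] = field(default_factory=list)


@dataclass
class ProtocolApplication:
    """One run of a protocol, turning input materials/data into outputs."""
    protocol: Protocol
    inputs: list[str]
    outputs: list[str]


def trace_back(item: str, steps: list[ProtocolApplication]) -> list[str]:
    """List the protocols that led to `item`, most recent first."""
    history = []
    current = {item}
    for step in reversed(steps):
        if current & set(step.outputs):
            history.append(step.protocol.name)
            # Continue the trace from whatever this step consumed.
            current = (current - set(step.outputs)) | set(step.inputs)
    return history


# Illustrative two-step workflow: a laboratory step, then an in silico step.
recording = Protocol("record", equipment=["electrode array"])
sorting = Protocol("spike sorting", software=["example sorter"])
workflow = [
    ProtocolApplication(recording, inputs=["tissue slice"], outputs=["raw waveform"]),
    ProtocolApplication(sorting, inputs=["raw waveform"], outputs=["spike times"]),
]
print(trace_back("spike times", workflow))
```

Because laboratory and in silico steps share one structure, derived data can be traced back through analysis to the original experiment.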
22Part of CISBAN in a nutshell
Screen mutants for sensitivity to damage/nutrition
Robot
- Data curation.
- Functional analysis.
- Interactions with in silico programme.
Reference set of 5,000 mutant strains
23CISBAN dataflow
Neil Wipat, Newcastle University
24Data Entry with SyMBA
http://symba.sourceforge.net/
Allyson Lister, Newcastle University
25Data Entry with SyMBA
26Summary
- We are generating metadata standards for neurosciences
- We are following a well-trodden path from bioinformatics
- We adopted FuGE and have built MINI
27Future Work
- More neurosciences experimental datatypes.
- Minimal Information about a Service
- Describe analysis software as well as lab experiments.
- Outreach!
28Acknowledgements
- MINI: Frank Gibson, Paul G Overton, Tom V Smulders, Simon R Schultz, Stephen J Eglen, Colin D Ingram, Stefano Panzeri, Phil Bream, Evelyne Sernagor, Mark Cunningham, Christopher Adams, Christoph Echtermeyer, Jennifer Simonotto, Marcus Kaiser, Daniel C Swan, Martyn Fletcher, Phillip Lord
- CISBAN: Anil Wipat (PI), Allyson Lister (Research Associate),