Grid Data Requirements Scoping Metadata

1
Grid Data Requirements Scoping
Metadata & Provenance
  • Dave Pearson
  • Oracle Corporation UK

2
GRID Data Requirements Scoping
  • Requirements Gathering
    • Establish need for database interoperability in
      the Grid
    • 3-month exercise
    • Interviews and questionnaire
    • Deliverable: scoping report
  • Participants
    • UK e-Science communities and Grid projects
    • Astrophysics, Bioinformatics, Combinatorial
      Chemistry, Ecology, Engineering, Environmental
      Sciences, HEP, Neuroscience
  • Findings
    • Widespread requirement for provenance information
      for:
      • Establishing reliability and quality of data
      • Traceability from raw data to publication
      • Automated lab book
      • Reproducing and recreating results
      • Impact analysis
    • Few existing solutions

3
Data Processing
  • Processing Characteristics
    • Well-defined workflow
    • Correction, calibration, transformation, filtering,
      merging
    • Relatively static reference data
    • Stable processing functions (audited changes)
    • Periodic reprocessing from archive
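
The processing characteristics above (a fixed sequence of audited steps, rerun periodically from the archive) could be sketched as follows. This is only an illustrative sketch: the names `Step`, `Result` and `run_workflow`, the version strings and the sample values are all hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    name: str      # e.g. "calibration"
    version: str   # audited version of the processing function (hypothetical)
    func: Callable[[List[float]], List[float]]


@dataclass
class Result:
    data: List[float]
    history: List[str] = field(default_factory=list)


def run_workflow(raw: List[float], steps: List[Step]) -> Result:
    """Apply each step in order and record which step/version was used."""
    result = Result(data=list(raw))
    for step in steps:
        result.data = step.func(result.data)
        result.history.append(f"{step.name}@{step.version}")
    return result


# Hypothetical well-defined workflow: correction, calibration, filtering.
steps = [
    Step("correction", "1.0", lambda xs: [x - 0.5 for x in xs]),
    Step("calibration", "2.1", lambda xs: [x * 2.0 for x in xs]),
    Step("filtering", "1.3", lambda xs: [x for x in xs if x >= 0.0]),
]

archive = [0.25, 1.5, 3.0]           # stand-in for archived raw data
out = run_workflow(archive, steps)   # reprocessing reruns the same steps
print(out.data)
print(out.history)
```

Because the steps are stable and versioned, rerunning `run_workflow` on the same archive with the same step versions yields the same data and the same history, which is what makes periodic reprocessing traceable.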

4
Analysis and Interpretation
  • Analysis Characteristics
    • Variable workflow
    • Standard functions
    • Standard and personal filtering and summarisation
    • Retain drill-down capability

5
Analysis and Interpretation
  • Conclusions/Inferences
    • Descriptions
    • Trends
    • Correlations
    • Relationships
  • Analysis and Interpretation Characteristics
    • Highly dynamic workflow
    • Multiple data types
    • Volatile data
    • Annotations, inferences, conclusions
    • Evidential reasoning
    • Shared multiple versions of truth
    • Periodic version consolidation

6
Metadata Requirements
  • Technical Metadata
    • Direct referencing - physical location and data
      schema/structure
    • Data currency/status - version, time stamping
    • Accreditation/access permissions - ownership
      (Dublin Core)
    • Query time/governance - data volume, no. of
      records, access paths
  • Contextual Metadata
    • Logical referencing - physical data,
      semantic/syntactic ontologies
    • Lexical translation - thesaurus, ontological
      mapping
    • Named derivations (summarisations)
  • Scope of Requirements
    • All science communities
    • Related to provenance
7
Metadata Requirements
  • Data Versioning
    • Distinguish latest/agreed version of data
    • Maintain history record of change
    • Synchronise and mirror replicated data
    • Distinguish shared and personal interpretations
      and/or annotations
  • Provenance
    • Record of data processing - calibration,
      filtering, transformation
    • Record of workflow - methods, standards and
      protocols
    • Reasoning - evidential justification for
      inferences and conclusions
  • Scope of Requirements
    • All science communities
    • Includes Technical and Contextual Metadata
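
The versioning requirements above (latest versus agreed version, a history of change, shared versus personal annotations) can be illustrated with a minimal in-memory sketch. All names here (`Version`, `Annotation`, `VersionedItem`) and the sample data are hypothetical; replication and mirroring are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Version:
    number: int
    content: str
    author: str
    note: str              # reason for the change, kept as history
    agreed: bool = False   # True for the community-agreed version


@dataclass
class Annotation:
    author: str
    text: str
    shared: bool           # False means a personal interpretation


@dataclass
class VersionedItem:
    versions: List[Version] = field(default_factory=list)
    annotations: List[Annotation] = field(default_factory=list)

    def commit(self, content: str, author: str, note: str) -> Version:
        """Append a new version; earlier versions are never overwritten."""
        version = Version(len(self.versions) + 1, content, author, note)
        self.versions.append(version)
        return version

    def agree(self, number: int) -> None:
        """Mark exactly one version as the agreed one."""
        for version in self.versions:
            version.agreed = version.number == number

    def latest(self) -> Version:
        return self.versions[-1]

    def agreed_version(self) -> Optional[Version]:
        return next((v for v in self.versions if v.agreed), None)

    def visible_to(self, user: str) -> List[str]:
        """Shared annotations plus the user's own personal ones."""
        return [a.text for a in self.annotations
                if a.shared or a.author == user]


item = VersionedItem()
item.commit("flux = 1.0", "alice", "initial load")
item.commit("flux = 1.1", "bob", "recalibrated")
item.agree(1)
item.annotations.append(Annotation("alice", "possible outlier", shared=False))
item.annotations.append(Annotation("bob", "checked against archive", shared=True))

print(item.latest().content, "|", item.agreed_version().content)
print(item.visible_to("alice"))
```

Here the latest version and the agreed version differ, which is the situation the slide asks metadata to make visible rather than hide.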

8
Provenance Issues
  • Schema evolution
  • Granularity of record
  • Processed vs. derived
  • Inheritance
  • Lack of structured annotations, ontologies
  • Interactive analysis - dynamic workflow
  • Multiple derived data sources
  • Context of usage
  • Best practice can change
  • Multiple versions of the truth
  • Evidential reasoning
  • Existing data applications
  • Where is the provenance record stored?
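
Several of the issues above (multiple derived data sources, inheritance, granularity) come down to how derivation links are recorded and followed. As a rough sketch, assuming provenance is kept as a simple "derived from" mapping at dataset granularity, tracing sources and estimating impact could look like this. The dataset names and the functions `ancestors` and `impacted_by` are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical provenance record: each item lists what it was derived from.
derived_from: Dict[str, List[str]] = {
    "calibrated": ["raw_a"],
    "merged": ["calibrated", "raw_b"],
    "summary": ["merged"],
    "figure_1": ["summary"],
}


def ancestors(item: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Everything the item was derived from, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(graph.get(item, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(graph.get(parent, []))
    return seen


def impacted_by(source: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Impact analysis: every recorded item that depends on the source."""
    return {item for item in graph if source in ancestors(item, graph)}


print(sorted(ancestors("figure_1", derived_from)))
print(sorted(impacted_by("raw_b", derived_from)))
```

The same traversal works at a finer granularity (per record or per field), but the size of the mapping grows accordingly, which is one reason the granularity of the record is listed as an open issue.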