Database access and data retrieval (a user's view)

1
Database access and data retrieval (a user's view)
R. Coelho Associação EURATOM/IST, Instituto de
Plasmas e Fusão Nuclear
  • Outline
  • 1 General overview of fusion databases
  • 2 Data storage/retrieval methods and
    datastructures
  • 3 SDAS at ISTTOK

2
  • I - General overview of fusion databases
  • Databases play a fundamental role in fusion
    plasma research
  • Essential for storage of seminal/standard
    benchmarking discharges.
  • Assist the construction/deduction of elementary
    scaling laws and the design phase of fusion devices
    (what to expect on confinement, MHD, transport, etc.)
  • Assist the modeling effort by providing a
    validated set of input experimental data (cross
    sections, machine-dependent data, etc.) and
    experimental plasma data against which to validate
    the codes.
  • Databases offer a clear display of community
    achievements

3
  • Fusion databases: 3 notable examples
  • International Multi-Tokamak Profile Database
    (ITPA)
  • Atomic Data and Analysis Structure (ADAS)
  • Experimental Nuclear Reaction Data (EXFOR)

4
  • International Multi-Tokamak Profile Database
    (ITPA)
  • Objectives
  • To provide all the information required for
    transport codes to simulate discharges from a
    variety of tokamaks.
  • Provide data to be compared against the predicted
    outputs from the codes.
  • Provide data and the modelling results to be used
    as part of the ITER physics basis.
  • Coverage
  • Released publicly in 1998.
  • Built from 201 shots from 21 devices. Recent data
    has been added to the secondary database but remains
    accessible to the working group only.

5
  • International Multi-Tokamak Profile Database
    (ITPA)
  • Storage/accessing
  • MDS server, data stored as MDS trees.
  • A relational database with comments and 0D and
    1D/2D metadata assists the database queries.
  • http://tokamak-profiledb.ukaea.org.uk/
  • C M Roach, M Walters, R V Budny, F Imbeaux,
    T W Fredian et al, Nucl. Fusion 48, 125001 (2008)

6
  • Atomic Data and Analysis Structure (ADAS)
  • Objectives
  • Provide interconnected set of computer codes and
    data collections for modelling the radiating
    properties of ions and atoms in plasmas.
  • Assist in the analysis and interpretation of
    spectral emission and support detailed plasma
    models (crucial in plasma edge).
  • Coverage
  • Plasmas ranging from the interstellar medium
    through the solar atmosphere and laboratory
    thermonuclear fusion devices to technological
    plasmas.

7
  • Atomic Data and Analysis Structure (ADAS)
  • Accessing
  • A key range of routines for accessing the
    database and delivering data to user codes is
    included. FORTRAN, C, C++, IDL and MATLAB are
    supported.

http://open.adas.ac.uk/index.php
Assisting fusion since JET was born (1983)
8
  • Experimental Nuclear Reaction Data (EXFOR)
  • Objectives
  • Provide an extensive compilation of experimental
    nuclear reaction data.
  • Coverage
  • Neutron induced reactions have been compiled
    systematically since the discovery of the
    neutron.
  • Charged particle and photon reactions have been
    covered less extensively
  • Data from 17700 experiments, their bibliographic
    information, as well as experimental information
    about the data. The status (e.g., the source of
    the data) and history (e.g., date of last
    update) of each data set are also included.

9
  • Experimental Nuclear Reaction Data (EXFOR)
  • Repository
  • Stored at the International Network of Nuclear
    Reaction Data Centres (NRDC).
    http://www-nds.iaea.org/exfor/exfor.htm

10
  • II - Data storage/retrieval methods
  • MDS
  • HDF5
  • Universal Access Layer method
  • Paradigm for data retrieval methodologies

11
  • MDSplus (MDS) http://www.mdsplus.org
  • What is it?
  • A set of software tools for data acquisition and
    storage and a methodology for management of
    complex scientific data.
  • What does it allow?
  • All data from an experiment or simulation code to
    be stored in a single, self-descriptive,
    hierarchical structure.
  • How difficult is it?
  • The programming interface contains only a few basic
    commands, simplifying access even to
    complicated data structures.

12
  • MDSplus (MDS)
  • SOME CONCEPTS
  • The Data Hierarchy - Trees, Nodes, and Models. A
    self-descriptive hierarchy called a TREE,
    consisting of large numbers of named NODES which
    make up the branches (structure) and leaves
    (data) of each tree.
  • MDSplus SHOTS are trees created from a special
    type of tree called a MODEL, a template which
    contains all of the structure and setup data for
    an experiment or code.
  • Node Characteristics - Self Description
    metadata including the data type, array
    dimensions, data length, units, independent axes,
    the parents and children of the node, tag names,
    the date when the data was stored, the name of
    the user who wrote data, and so forth.

13
  • MDSplus (MDS)
  • TREE EXAMPLE
  • The node on the far right "Ip" is an example of a
    MEMBER, a type of node used to contain data
  • Child and member nodes are analogous to the
    directories and files on a typical operating
    system.
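
The tree, child and member vocabulary can be illustrated with a minimal Python sketch. This is a toy in-memory model, not the MDSplus library: the class `Node`, its fields and the node names `magnetics` and `ip` are assumptions chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Node:
    """Toy stand-in for a tree node (not the MDSplus implementation)."""
    name: str
    data: Any = None                  # only member (leaf) nodes hold data
    units: Optional[str] = None       # one example of self-describing metadata
    children: Dict[str, "Node"] = field(default_factory=dict)

    def add(self, node: "Node") -> "Node":
        """Attach a node below this one and return it."""
        self.children[node.name] = node
        return node

    def find(self, path: str) -> "Node":
        """Resolve a dot-separated path, like walking directories to a file."""
        node = self
        for part in path.split("."):
            node = node.children[part]
        return node


top = Node("top")
magnetics = top.add(Node("magnetics"))                          # child: structure only
magnetics.add(Node("ip", data=[0.0, 1.2e6, 1.1e6], units="A"))  # member: holds data

ip = top.find("magnetics.ip")
print(ip.name, ip.units, len(ip.data))
```

The child node plays the role of a directory and the member node the role of a file, which is the analogy drawn on the slide.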

14
  • MDSplus (MDS)
  • DETAILS ON THE API
  • The basic calls, in the order they would appear in
    an application, are (in generic syntax):
  • mdsconnect,'server_name'
  • mdsopen,'tree_name',shot_number
  • result = mdsvalue('expression')
  • mdsput,'node_name','expression'
  • mdsclose,'tree_name',shot_number
  • mdsdisconnect
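
The same open / value / put / close ordering can be sketched in Python. The helper names `read_and_store` and `connect` are hypothetical; the sketch assumes the thin-client `Connection` class of the MDSplus Python package, whose `openTree`, `get`, `put` and `closeTree` methods correspond to the generic calls listed above. The import is deferred so that the helper can be exercised with any object offering those four methods.

```python
def read_and_store(conn, tree, shot, expression, node=None, put_expression=None):
    """Run the open / value / put / close sequence on an existing connection.

    `conn` is expected to behave like MDSplus.Connection; any object with
    openTree, get, put and closeTree methods will do.
    """
    conn.openTree(tree, shot)                  # counterpart of mdsopen
    try:
        result = conn.get(expression).data()   # counterpart of mdsvalue
        if node is not None:
            conn.put(node, put_expression)     # counterpart of mdsput
        return result
    finally:
        conn.closeTree(tree, shot)             # counterpart of mdsclose


def connect(server):
    """Open a thin-client connection; needs the MDSplus Python package."""
    from MDSplus import Connection  # deferred so the sketch loads without it
    return Connection(server)
```

A typical use would be `values = read_and_store(connect('server_name'), 'tree_name', shot_number, 'expression')`, with the placeholders replaced by site-specific values.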

15
  • MDSplus (MDS)
  • ACCESSING JET DATA
  • (a workaround, since JET data are not stored
    natively on an MDS server)
  • MATLAB
  • >> mdsconnect('mdsplus.jet.efda.org')
  • >> [y,status] = mdsvalue('_sig=jet("ppf/magn/ipla",40573)')
  • >> [x,status] = mdsvalue('dim_of(_sig)')
  • >> mdsdisconnect
  • IDL
  • IDL> mdsconnect,'mdsplus.jet.efda.org'
  • IDL> y = mdsvalue('_sig=jet("ppf/magn/ipla",40573)')
  • IDL> x = mdsvalue('dim_of(_sig)')
  • IDL> plot,x,y
  • IDL> mdsdisconnect

16
  • HDF5 http://www.hdfgroup.org/index.html
  • HDF5 is a self-describing file format and library
    for storing scientific data.
  • A versatile data model that can represent very
    complex data objects and a wide variety of
    metadata (different datatypes on the same tree)
    with direct access to parts of the file without
    parsing the entire file.
  • A completely portable file format with no limit
    on the number or size of data objects in the
    collection.
  • A software library that runs on a range of
    computational platforms, from laptops to
    massively parallel systems, and implements a
    high-level API with C, C++, Fortran 90, and Java
    interfaces.

17
(No Transcript)
18
  • Universal Access Layer (UAL)
  • MOTIVATION
  • HDF5 and MDSplus represent successful tools for a
    common data format and organization, thus
    allowing effective data sharing among different
    applications.
  • But will these standards survive the lifespan of
    ITER? A more generic approach is envisaged and
    has been implemented in the ITM-TF.
  • Consistent Physical Objects (CPO) - a generic
    view in trees and sub-trees of the data
    organization, transparent to the actual method
    used for data storage.
  • G. Manduchi et al, Fusion Engineering and Design
    83, 462-466 (2008)

19
  • Universal Access Layer (UAL)
  • DATA STRUCTURE

20
  • Universal Access Layer (UAL)
  • DATA STRUCTURE
  • MSE CPO

21
  • PARSING THE DATA STRUCTURE
  • The tree-like hierarchical structure of a CPO is
    defined through language-independent XML schemas.
    These can easily be translated into type
    definitions for each programming language.
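
As a rough illustration of how a language-independent XML description can be turned into a language-specific structure, the sketch below parses a small XML document into nested Python dictionaries. The XML shown is a hypothetical, heavily simplified description invented for this example; it is not an actual ITM-TF schema, and the element and attribute names (`structure`, `field`, `name`, `type`) are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified structure description (not an ITM-TF schema).
SCHEMA = """
<structure name="example_cpo">
  <field name="time" type="float"/>
  <structure name="profile">
    <field name="rho" type="float_array"/>
    <field name="value" type="float_array"/>
  </structure>
</structure>
"""


def build(element):
    """Recursively turn a <structure> element into nested dictionaries."""
    out = {}
    for child in element:
        if child.tag == "structure":
            out[child.get("name")] = build(child)       # sub-tree
        elif child.tag == "field":
            out[child.get("name")] = child.get("type")  # leaf with its type
    return out


root = ET.fromstring(SCHEMA)
layout = {root.get("name"): build(root)}
print(layout)
```

The same walk over the XML could emit a Fortran derived type or a C struct instead of a dictionary, which is the sense in which a single description serves every language.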

22
  • Universal Access Layer (UAL)
  • DATA FLOW (D.COSTER)
  • The multi-level UAL manages the CPO I/O between
    codes as a common data bus and the data retrieval
    (MDS or HDF5 stored data)

23
  • Universal Access Layer (UAL)
  • CPO I/O
  • euitm_open(name,shot,run)
  •  
  • euitm_get(path, output_structure)
  • the location of the CPO is specified by the
    string argument path
  • output_structure is language dependent and will
    hold the output data.
  •  
  • euitm_put(path, input_structure)
  • the location of the CPO is specified by the
    string argument path
  • input_structure is language dependent and will
    hold the input data.
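
The open / put / get sequence can be mimicked with a toy in-memory store. This is not the UAL: the class `ToyStore`, the entry name `demo` and the field names inside the structure are assumptions made for illustration, and only the argument order of `euitm_open(name,shot,run)`, `euitm_get(path, ...)` and `euitm_put(path, ...)` is taken from the slide.

```python
import copy


class ToyStore:
    """In-memory stand-in for a CPO store, keyed by (name, shot, run)."""

    def __init__(self):
        self._entries = {}
        self._current = None

    def open(self, name, shot, run):
        # Mirrors euitm_open(name, shot, run): select the entry to work on.
        self._current = self._entries.setdefault((name, shot, run), {})

    def put(self, path, input_structure):
        # Mirrors euitm_put(path, input_structure): store a copy under `path`.
        self._current[path] = copy.deepcopy(input_structure)

    def get(self, path):
        # Mirrors euitm_get(path, output_structure); here the structure is
        # returned instead of being filled in place, which is a simplification.
        return copy.deepcopy(self._current[path])


store = ToyStore()
store.open("demo", 1234, 1)
store.put("mse", {"time": 0.5, "angle": [0.10, 0.12, 0.15]})
cpo = store.get("mse")
print(cpo["time"], len(cpo["angle"]))
```

Because the caller only names a path, the storage behind `put` and `get` could be MDSplus, HDF5 or anything else without the calling code changing.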

24
  • Universal Access Layer (UAL)
  • ACCESSING EXPERIMENTAL DATA
  • Courtesy of J. Signoret and F. Imbeaux

25
  • Methodologies for data retrieval
  •  WHAT IS A SIGNAL?
  • Any kind of data that describes a particular
    measurement during a discharge and contains some
    information about plasma properties, e.g. 2/3D
    data, time-series data, contour maps, images
  •  OUTPUT PER SHOT?
  • Diagnostics at JET top 10 Gbytes/shot, much
    smaller than the values expected for ITER!
  •  WHAT IS MEASURED?
  • Physical properties manifest as patterns, with a
    direct parallel between the physical behaviour
    and the structural shapes that are generated
    (spikes in Dα emission during edge localised
    modes (ELMs), soft X-ray and ECE emission during
    a sawtooth crash (ST)).
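
To make the idea of a structural shape concrete, the sketch below flags spikes in a one-dimensional trace with a simple threshold rule. It is only an illustration: the function name `find_spikes`, the mean-plus-three-sigma threshold and the synthetic numbers are assumptions, and real ELM detection on Dα signals is considerably more involved.

```python
from statistics import mean, pstdev


def find_spikes(signal, n_sigma=3.0):
    """Return indices of local maxima exceeding mean + n_sigma * std."""
    threshold = mean(signal) + n_sigma * pstdev(signal)
    return [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold
        and signal[i] >= signal[i - 1]
        and signal[i] > signal[i + 1]
    ]


# Synthetic trace: flat baseline with three bursts (made-up numbers).
trace = [1.0] * 60
for i in (10, 30, 50):
    trace[i] = 8.0

print(find_spikes(trace))
```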

26
  • Methodologies for data retrieval
  • Traditional approach
  • Query based on shot/signal
  • Manual inspection of structural shapes/features
  • A very tedious and long process

27
  • Methodologies for data retrieval
  • Pattern recognition approach
  • Data search guided by technical and scientific
    criteria.
  • Pattern oriented, in line with how people
    behave when they analyze data.
  • Relies on the following techniques for data
    retrieval:
  • Feature extraction
  • single entity (a temporal segment inside a
    waveform or a set of pixels within an image)
  • compound entity (more than one segment/signal)
  • Classification system (supervised/unsupervised)
  • Similarity measure (metric/proximity measure)
  • J. Vega et al, Fusion Eng. and Design 83, 382
    (2008)
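
Feature extraction followed by a similarity measure can be sketched as follows. This is not the method of the cited paper: the three features (mean, standard deviation, peak-to-peak), the Euclidean distance and the small waveform library are assumptions chosen to keep the example short, and the ranking is a plain nearest-neighbour search rather than a trained classifier.

```python
import math
from statistics import mean, pstdev


def features(segment):
    """Describe a waveform segment by (mean, std, peak-to-peak)."""
    return (mean(segment), pstdev(segment), max(segment) - min(segment))


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(query, library):
    """Rank library entries (name -> segment) by closeness to the query."""
    q = features(query)
    return sorted(library, key=lambda name: distance(q, features(library[name])))


library = {
    "flat": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    "ramp": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "sawtooth": [0.0, 1.0, 2.0, 0.0, 1.0, 2.0],
}
query = [0.1, 1.1, 2.1, 0.1, 1.1, 2.1]   # a sawtooth shifted by a small offset

print(most_similar(query, library))
```

The query is matched by shape rather than by shot or signal name, which is the shift in emphasis the pattern recognition approach introduces.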

28
III - Shared Data Access System (SDAS)
  • Why another Data Retrieval Software?
  • The problem
  • Scientists need to access data from different
    laboratories
  • Each laboratory has its own way of retrieving
    data
  • Scientists have to spend time and effort learning
    how the different data access schemes work,
    change their analysis code for each experiment
    and manage updated versions for each different
    program and library required
  • This does not mean that every association must
    store and retrieve data in the same way.
  • The main data index is changing from shot number
    to time and events, where the pulse number is
    just one of the most relevant events against
    which data is catalogued.

29
Shared Data Access System (SDAS)
  • Why another Data Retrieval Software?
  • The solution
  • Hide all complexity from end-users
  • Scientists only have to learn once how to access
    data
  • Users do not request data directly from the
    association's database but from a software
    layer
  • The software layer provides the same data access
    functions in all associations
  • Data blocks are tagged against specific events
    which happen during the life cycle of a discharge
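
Indexing by time and events instead of shot number can be sketched with a small data structure. The classes `Event` and `Block`, the function `blocks_between` and all names and numbers below are hypothetical; they illustrate the idea only and do not describe how SDAS stores its index.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    name: str      # e.g. a pulse number or a named trigger
    time: float    # absolute time of the event, in seconds


@dataclass(frozen=True)
class Block:
    signal: str
    event: Event
    offset: float  # start of the block relative to its event, in seconds


def blocks_between(blocks: List[Block], t0: float, t1: float) -> List[Block]:
    """Return blocks whose absolute start time lies in [t0, t1]."""
    ordered = sorted(blocks, key=lambda b: b.event.time + b.offset)
    starts = [b.event.time + b.offset for b in ordered]
    return ordered[bisect_left(starts, t0):bisect_right(starts, t1)]


pulse = Event("pulse_100", time=1000.0)
blocks = [
    Block("density", pulse, offset=-0.5),
    Block("current", pulse, offset=0.0),
    Block("current", Event("pulse_101", time=2000.0), offset=0.0),
]

print([b.signal for b in blocks_between(blocks, 999.0, 1001.0)])
```

A query by pulse number becomes a special case: it is a query for the blocks tagged against one particular event.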

30
Without SDAS
With SDAS
31
  • SDAS is based on Remote Procedure Calls (RPC)
  • The SDAS server is formed by an XML-RPC server
    and by a connector to the storage mechanism
  • Data is indexed by time and events
  • SDAS server and libraries available in Python,
    Java and C
  • Read and Write support (for post processed data)?
  • Supported in several data analysis programs:
    Matlab, IDL, Octave, Mathematica
  • Documentation in wiki
    http://cdaq.cfn.ist.utl.pt:8085/
  • Currently being used in ISTTOK/PT, Compass/CZ
    and TJ-II/ES

SDAS Technology
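
The pairing of an XML-RPC server with a connector to the storage can be sketched using Python's standard library alone. This is a generic XML-RPC illustration, not the SDAS server or client library: the remote function `get_data`, its arguments and the dictionary standing in for the storage connector are all hypothetical, and the real SDAS function names may differ.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Stand-in for the storage connector: a dict keyed by (signal, event).
STORAGE = {("current", "pulse_100"): [0.0, 1.0, 2.0]}


def get_data(signal, event):
    """Hypothetical remote function; real SDAS function names may differ."""
    return STORAGE[(signal, event)]


# Port 0 lets the operating system pick a free port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(get_data)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the caller sees only a function call, not the storage behind it.
host, port = server.server_address
client = ServerProxy(f"http://{host}:{port}/")
values = client.get_data("current", "pulse_100")
print(values)

server.shutdown()
server.server_close()
```

Because XML-RPC is language neutral, the same server could be called from any environment that has an XML-RPC client, which fits the list of supported analysis programs given above.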
32
Data access
  • SDAS libraries are easily integrated into programs
    such as MatLab, Mathematica and IDL
  • SDAS provides over 20 functions, which allow
    users to:
  • Search parameters and events
  • Retrieve single and multiple data
