Integrating HDF5 with SRB

1
Integrating HDF5 with SRB
  • The HDF5-SRB Architecture
  • Peter Cao, HDF, NCSA
  • February 24, 2005

2
Project Description
  • Object-level access to HDF5 stored in the SRB
  • Use SRB as middleware to transfer data between
    the server and client
  • Interactive and efficient access
  • Previous work
  • Extracting entire HDF5 files
  • Extracting byte-streams through the SRB's POSIX
    interface

3
The SRB Architecture
  • Diagram components: SRB Client, SRB Server, MCAT
  • Storage systems shown: HPSS, Unitree, DB2, FTP,
    HDF5, ObjStore
  • Distributed storage resources: database system,
    archival storage system, file system, FTP

4
The HDF5-SRB Architecture
  • Diagram components: HDF Application, HDF5 Object
    (File, Group, Dataset, Attribute), HDF5-SRB Module
    (unpackMsg/packMsg), SRB Server, MCAT, HDF5 Library,
    HDF5 file
  • The HDF5 Object and HDF5-SRB Module boxes each
    appear twice in the diagram

5
The HDF5-SRB Module
  • Client API: srbObjRequest(void *obj, int objID)
  • Server API: srbObjProcess(void *obj, int objID)
  • Message flow, in the order numbered on the diagram:
    1. packMsg()
    2. unpackMsg()
    3. H5Objop()
    4. Access file
    5. H5Object
    6. packMsg()
    7. unpackMsg()
  • Other diagram components: HDF5 Library, HDF5 file,
    SRB Server

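The numbered flow on this slide can be sketched as one request/response round trip, here for an "open dataset" request. This is a minimal sketch under stated assumptions, not the module's code: the DemoObject layout, the OP_OPEN_DATASET opcode, the fake_h5_op() stand-in for the HDF5 library call, and the in-process "server" are all hypothetical. Only the names packMsg, unpackMsg, srbObjRequest and srbObjProcess come from the slides, and their real signatures may differ.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical object exchanged between client and server. */
typedef struct {
    int32_t op;        /* requested operation (assumed opcode) */
    int32_t status;    /* filled in by the server */
    char    path[64];  /* object path inside the HDF5 file */
    int32_t rank;      /* result field: dataset rank */
} DemoObject;

enum { OP_OPEN_DATASET = 1 };

/* Steps 1 and 6: serialize the object into a flat buffer.
 * A raw struct copy only works between identical machines;
 * a real protocol would encode each field portably. */
static size_t pack_msg(const DemoObject *obj, unsigned char *buf, size_t cap)
{
    if (cap < sizeof *obj) return 0;
    memcpy(buf, obj, sizeof *obj);
    return sizeof *obj;
}

/* Steps 2 and 7: rebuild the object from the buffer. */
static int unpack_msg(const unsigned char *buf, size_t len, DemoObject *obj)
{
    if (len != sizeof *obj) return -1;
    memcpy(obj, buf, sizeof *obj);
    return 0;
}

/* Steps 3 to 5: stand-in for the HDF5 call; no file is touched. */
static void fake_h5_op(DemoObject *obj)
{
    if (obj->op == OP_OPEN_DATASET && obj->path[0] == '/') {
        obj->rank = 2;     /* pretend the dataset is two-dimensional */
        obj->status = 0;
    } else {
        obj->status = -1;
    }
}

/* Server side: unpack the request, run the operation, pack the reply. */
static size_t demo_obj_process(const unsigned char *req, size_t req_len,
                               unsigned char *rep, size_t rep_cap)
{
    DemoObject obj;
    if (unpack_msg(req, req_len, &obj) != 0) return 0;
    fake_h5_op(&obj);
    return pack_msg(&obj, rep, rep_cap);
}

/* Client side: pack the request, hand it over, unpack the reply in place. */
static int demo_obj_request(DemoObject *obj)
{
    unsigned char req[sizeof(DemoObject)], rep[sizeof(DemoObject)];
    size_t n = pack_msg(obj, req, sizeof req);
    /* In the real system the SRB transfer would happen here. */
    size_t m = demo_obj_process(req, n, rep, sizeof rep);
    if (m == 0) return -1;
    return unpack_msg(rep, m, obj);
}
```

A caller fills in op and path, calls demo_obj_request(), and then reads status and rank from the same object, which mirrors the "one set of objects for both server and client" idea.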
6
Implementation Requirement
  • Object fashion
  • Interactive access
  • Data information encapsulated in structure
  • Easy mapping to objects in client application
  • Simple and efficient
  • No complicated packMsg()/unpackMsg()
  • Use one set of objects for both server and client
  • Minimum data to transfer between client and
    server
  • Pack only required data
  • No redundant member object within an object
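One way to meet the "pack only required data" requirement is a field mask that says which members of an object travel in a given message. The sketch below is an assumption for illustration only: the DemoDataset layout, the F_* flags, and the one-byte mask header are hypothetical and are not taken from the HDF5-SRB module.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flags: which members are present in a message. */
enum { F_NAME = 1, F_DIMS = 2, F_DATA = 4 };

typedef struct {
    char     name[32];
    uint32_t dims[2];
    uint8_t  data[16];
} DemoDataset;

/* Size of a message carrying the fields selected by mask. */
static size_t packed_size(uint8_t mask)
{
    size_t need = 1;                      /* the mask byte itself */
    if (mask & F_NAME) need += sizeof(((DemoDataset *)0)->name);
    if (mask & F_DIMS) need += sizeof(((DemoDataset *)0)->dims);
    if (mask & F_DATA) need += sizeof(((DemoDataset *)0)->data);
    return need;
}

/* Write the mask, then only the selected fields.
 * Returns bytes written, or 0 if the buffer is too small. */
static size_t pack_fields(const DemoDataset *d, uint8_t mask,
                          uint8_t *buf, size_t cap)
{
    size_t off = 0;
    if (packed_size(mask) > cap) return 0;
    buf[off++] = mask;
    if (mask & F_NAME) { memcpy(buf + off, d->name, sizeof d->name); off += sizeof d->name; }
    if (mask & F_DIMS) { memcpy(buf + off, d->dims, sizeof d->dims); off += sizeof d->dims; }
    if (mask & F_DATA) { memcpy(buf + off, d->data, sizeof d->data); off += sizeof d->data; }
    return off;
}

/* Copy the fields present in the message into d and leave the rest
 * untouched. Returns the mask, or -1 if the length does not match. */
static int unpack_fields(const uint8_t *buf, size_t len, DemoDataset *d)
{
    size_t off = 1;
    uint8_t mask;
    if (len < 1) return -1;
    mask = buf[0];
    if (packed_size(mask) != len) return -1;
    if (mask & F_NAME) { memcpy(d->name, buf + off, sizeof d->name); off += sizeof d->name; }
    if (mask & F_DIMS) { memcpy(d->dims, buf + off, sizeof d->dims); off += sizeof d->dims; }
    if (mask & F_DATA) { memcpy(d->data, buf + off, sizeof d->data); off += sizeof d->data; }
    return mask;
}
```

With this layout a request for dimensions alone costs 9 bytes instead of the 57 needed for the full object.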

7
HDF5 Objects
  • Objects shown: H5File, H5Group, H5Dataset,
    H5Dataspace, H5Attribute, H5Datatype
  • Data operations implemented on the server side
  • Diagram labels: Client Side, Server Side

8
H5File
9
H5Group
typedef struct H5Object_t {
    enum { H5GROUP, H5DATASET } t;
    union { struct H5Group; struct H5dataset; } u;
} H5Object;
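The H5Object declaration on this slide reads as a tagged union: t says which member of u is valid. The transcript does not preserve the union member names or the contents of the two structs, so the sketch below invents them (group, dataset, and the fields inside each struct are assumptions) purely to show how such a type is used.

```c
#include <stddef.h>

/* Field lists are assumptions; the slides do not show them. */
struct H5Group   { const char *name; size_t nmembers; };
struct H5dataset { const char *name; int rank; };

typedef struct H5Object_t {
    enum { H5GROUP, H5DATASET } t;   /* tag: which union member is valid */
    union {
        struct H5Group   group;      /* member names are hypothetical */
        struct H5dataset dataset;
    } u;
} H5Object;

/* Dispatch on the tag before touching the union. */
static const char *h5object_name(const H5Object *o)
{
    switch (o->t) {
    case H5GROUP:   return o->u.group.name;
    case H5DATASET: return o->u.dataset.name;
    }
    return NULL;                     /* unknown tag */
}
```

Code that receives an H5Object should always check t first; reading the wrong union member gives meaningless values.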
10
H5Dataset
11
H5Datatype
12
H5Dataspace
13
H5Attribute
14
Implementation Challenge
  • Efficiency of the packMsg/unpackMsg
  • Datatype conversion
  • The client needs to know the datatype from the server
  • The server has to use the client datatype to load
    data
  • Life cycle of object
  • When to close object (dataset, group, file)
  • When to clean memory space
  • Byte stream to transfer large raw data
  • How to pack/unpack VL/compound data
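Two of these challenges, datatype conversion between machines and packing variable-length (VL) data, are commonly handled with a fixed wire byte order and length prefixes. The sketch below shows that general technique for VL strings; it is not the module's protocol. The wire layout (a big-endian count followed by length/bytes pairs) and all function names are assumptions.

```c
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in big-endian order, whatever the host order. */
static void put_u32_be(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

static uint32_t get_u32_be(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Pack n strings as: count, then (length, bytes) for each string.
 * Returns bytes written, or 0 if the buffer is too small. */
static size_t pack_vl_strings(const char *const *strs, uint32_t n,
                              uint8_t *buf, size_t cap)
{
    size_t off = 4;
    uint32_t i;
    if (cap < 4) return 0;
    put_u32_be(buf, n);
    for (i = 0; i < n; i++) {
        size_t len = strlen(strs[i]);
        if (len > UINT32_MAX || cap - off < 4 || cap - off - 4 < len)
            return 0;
        put_u32_be(buf + off, (uint32_t)len);
        off += 4;
        memcpy(buf + off, strs[i], len);
        off += len;
    }
    return off;
}

/* Copy string number index into out, NUL-terminated.
 * Returns its length, or -1 on a bad index, short buffer or bad message. */
static long unpack_vl_string(const uint8_t *buf, size_t len, uint32_t index,
                             char *out, size_t out_cap)
{
    size_t off = 4;
    uint32_t i;
    if (len < 4 || index >= get_u32_be(buf)) return -1;
    for (i = 0; ; i++) {
        uint32_t slen;
        if (len - off < 4) return -1;
        slen = get_u32_be(buf + off);
        off += 4;
        if (len - off < slen) return -1;
        if (i == index) {
            if (out_cap <= slen) return -1;
            memcpy(out, buf + off, slen);
            out[slen] = '\0';
            return (long)slen;
        }
        off += slen;
    }
}
```

Compound data could follow the same pattern, with each member written in turn using a fixed-order encoder such as put_u32_be().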

15
Milestone
  • Module specifications
  • RFC 02/11/05
  • Tech. seminar 02/24/05
  • final publication 03/04/05
  • Implementation
  • Compile and install test SRB server 03/18/05
  • Client-side module 03/31/05
  • Server-side module 04/22/05
  • Client application 05/20/05
  • Testing and merge source with SDSC 07/15/05
  • Documentation and release 08/31/05

16
Further Work
  • Metadata Ingest
  • srbObjPut() calls the HDF5 ingest program to put
    file information into MCAT
  • Datacutter
  • searching and filtering HDF5 data
  • HDF5 Indexing
  • store HDF5 indexing table into MCAT

17
Questions/Comments?
  • [Ruth Aydt] What objects can be packed in the new
    srbObjRequest() and srbObjProcess() APIs? What are
    the objIDs, and how are they managed?
  • [Jae Alameda] What kind of message is transferred
    through SRB: objects or string messages?
  • [Mike Folk and others] How to transfer a large raw
    dataset: byte stream or the openDAP-g way?
  • [Albert Cheng] How to accomplish a complex HDF5
    request: a number of messages vs. one complex message?
  • [Elena Pourmal] Are packMsg()/unpackMsg() part of
    the current SRB, or are they new functions?
  • [Quincey Koziol] When passing objects between
    client and server, how to ensure that only the
    fields needed for the operation are passed?
  • [Bob McGrath] How to manage the life cycle of an
    object on the server side? When the client dies, how
    to close the object on the server (timeout?)
  • [Stuart Levy] Synchronization and locking issues:
    concurrent access to a file and operations on a file;
    file cache and physical file location.
  • [Quincey Koziol and others] In general, there were
    a lot of questions about the message protocol,
    what parts of the structure are optional, etc.
    I would say we need to document the protocol as
    completely as we can.
  • [Quincey Koziol] How will datatypes of attributes
    be handled? How will selection of fields from a
    compound datatype be done?
  • [Joe Futrelle] How does MCAT handle complex queries
    from HDF5 or other data?
  • [Ruth Aydt] How is file access control handled, in
    HDF5 or in SRB?
  • Elena had some idea about precomputing some of
    the messages. Not clear if this is really viable.
  • It would be good to add an example that shows the
    steps of a simple operation, e.g., open dataset.
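The questions about objIDs and the server-side life cycle ("timeout?") suggest one possible design: a server-side table that hands out objIDs and closes objects that have been idle too long. The slides do not say how objIDs are managed, so everything below (ObjTable, the obj_* functions, the fixed table size, caller-supplied timestamps) is a hypothetical sketch of that idea rather than the planned implementation.

```c
#include <stddef.h>

#define MAX_OBJS 8

typedef struct {
    int  in_use;
    long last_access;   /* seconds, supplied by the caller */
} Slot;

typedef struct {
    Slot slots[MAX_OBJS];
} ObjTable;

/* Register an opened object; returns its objID, or -1 if the table is full. */
static int obj_open(ObjTable *t, long now)
{
    int i;
    for (i = 0; i < MAX_OBJS; i++) {
        if (!t->slots[i].in_use) {
            t->slots[i].in_use = 1;
            t->slots[i].last_access = now;
            return i;
        }
    }
    return -1;
}

/* Record activity on an object; returns 0, or -1 for a closed or unknown ID. */
static int obj_touch(ObjTable *t, int id, long now)
{
    if (id < 0 || id >= MAX_OBJS || !t->slots[id].in_use) return -1;
    t->slots[id].last_access = now;
    return 0;
}

/* Close every object idle for longer than timeout seconds.
 * Returns how many objects were closed. */
static int obj_reap(ObjTable *t, long now, long timeout)
{
    int i, closed = 0;
    for (i = 0; i < MAX_OBJS; i++) {
        if (t->slots[i].in_use && now - t->slots[i].last_access > timeout) {
            /* A real server would release the HDF5 handle here,
             * e.g. with H5Dclose, H5Gclose or H5Fclose. */
            t->slots[i].in_use = 0;
            closed++;
        }
    }
    return closed;
}
```

A server loop could call obj_reap() periodically, so that objects left open by a client that died are eventually closed and their IDs reused.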