BinX - PowerPoint PPT Presentation

e-Science Data Information and Knowledge Transformation. BinX A ... Ted Wen tedwen_at_edikt.org. Robert Carroll robert.carroll_at_edikt.org. www.edikt.org. Agenda ...

1
BinX: A Tool for Binary File Access
  • eDIKT project team
  • Ted Wen tedwen_at_edikt.org
  • Robert Carroll robert.carroll_at_edikt.org

2
Agenda
  • About the BinX project
  • Introduction to the BinX language
  • Introduction to the BinX library
  • Example application
  • Overview of the BinX API
  • Discussion

3
The problem
  • Most scientific data are in binary files
  • Binary data files are not all standardized
  • Binary data files are platform-dependent
  • XML is useful to represent metadata
  • Scientific datasets can be too large in XML

4
What is BinX?
  • Binary in XML
  • Annotation language
    • Using XML
    • Descriptive
    • Low-level
  • Software components
    • BinX library
    • Generic utilities
    • API

5
How and Why BinX is used
[Diagram labels: Special Application Program; Application Program (three instances); BinX Library; <dataset> </dataset>]
6
The BinX Language
  • Annotating a binary data stream
  • Mark up data types
  • Mark up sequences
  • Mark up arrays
  • Complex structures

7
Data elements
  • Primitive data elements
    • Byte, character, integer, real
  • Complex data elements
    • Arrays, struct, union
  • User-defined data elements

8
Primitive Data Types
  • Character
    • <character-8>
    • <string> (fixed length, variable length and delimited)
  • Integer
    • <byte-8>
    • <short-16>, <unsignedShort-16>
    • <integer-32>, <unsignedInteger-32>
    • <long-64>, <unsignedLong-64>
  • Real
    • <float-32>
    • <double-64>
    • <quadruple-128>

9
Primitive Data Types
  • Mark up data types

FF 7F | 7F FF FF FF | 00 00 C8 42 | 42 C8 00 00
  1   |      2      |      3      |      4
  1. <short-16 byteOrder="littleEndian">32767</short-16>
  2. <integer-32 byteOrder="bigEndian">2147483647</integer-32>
  3. <float-32 byteOrder="littleEndian">100.0</float-32>
  4. <float-32 byteOrder="bigEndian">100.0</float-32>
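
The four annotations can be checked by hand. The following is a minimal Python sketch, not part of the BinX library, that decodes the slide's 14 bytes with the standard `struct` module; the names `RAW` and `decode_fields` are made up for this illustration.

```python
import struct

# The 14 bytes from the slide, in file order (spaces only separate the four fields).
RAW = bytes.fromhex("FF7F 7FFFFFFF 0000C842 42C80000")


def decode_fields(raw):
    """Decode the four annotated fields; '<' is little-endian, '>' is big-endian."""
    (short_le,) = struct.unpack_from("<h", raw, 0)   # short-16, littleEndian
    (int_be,) = struct.unpack_from(">i", raw, 2)     # integer-32, bigEndian
    (float_le,) = struct.unpack_from("<f", raw, 6)   # float-32, littleEndian
    (float_be,) = struct.unpack_from(">f", raw, 10)  # float-32, bigEndian
    return short_le, int_be, float_le, float_be


FIELDS = decode_fields(RAW)
print(FIELDS)  # (32767, 2147483647, 100.0, 100.0)
```

Reading the first field with the wrong byte order gives -129 instead of 32767, which is why the byteOrder attribute has to travel with the data description.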

10
Abstract struct types
  • Mark up a sequence

<struct>
  <unsignedShort-16 />
  <unsignedShort-16 />
  <byte-8 />
  <byte-8 />
  <byte-8 />
</struct>

Screen descriptor in GIF:
  Screen width: unsigned short
  Screen height: unsigned short
  Packed field: a byte
  Background colour index: byte
  Pixel aspect ratio: byte
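
As a rough illustration of what this struct describes, the sketch below reads the same five fields in plain Python. It assumes the descriptor sits right after the 6-byte GIF signature and that multi-byte values are little-endian, as in the GIF format; `ScreenDescriptor`, `read_screen_descriptor` and the sample bytes are invented for the example.

```python
import struct
from collections import namedtuple

# Field order mirrors the <struct> on the slide: two unsigned shorts, three bytes.
ScreenDescriptor = namedtuple(
    "ScreenDescriptor",
    "width height packed background_index pixel_aspect_ratio",
)


def read_screen_descriptor(data, offset=6):
    """Unpack the 7-byte screen descriptor that follows the 6-byte signature."""
    return ScreenDescriptor._make(struct.unpack_from("<HHBBB", data, offset))


# Hypothetical sample: signature plus a descriptor for a 320x200 screen.
SAMPLE = b"GIF89a" + struct.pack("<HHBBB", 320, 200, 0xF7, 0, 0)
DESCRIPTOR = read_screen_descriptor(SAMPLE)
print(DESCRIPTOR)
```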
11
Abstract array types
  • Mark up an array

<arrayFixed>
  <integer-32 />
  <dim indexTo="99">
    <dim indexTo="9" />
  </dim>
</arrayFixed>

A 2-dimensional array containing 10-by-100 32-bit integers
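
A small Python sketch of how such a description maps onto bytes. Two assumptions are made here that the slide does not state: indexTo is an inclusive upper index counted from 0 (so indexTo 99 means 100 entries), and the outer dim varies slowest in the file. The function names are hypothetical.

```python
import struct


def dim_lengths(index_tos):
    """Turn inclusive indexTo values into lengths (assumed to count from 0)."""
    return [index_to + 1 for index_to in index_tos]


def read_int32_array(raw, index_tos, byte_order="<"):
    """Read a 2-D array of 32-bit integers, outer dimension assumed slowest."""
    outer, inner = dim_lengths(index_tos)
    flat = struct.unpack(f"{byte_order}{outer * inner}i", raw)
    return [list(flat[r * inner:(r + 1) * inner]) for r in range(outer)]


# Hypothetical data: the integers 0..999 packed little-endian (4000 bytes).
RAW = struct.pack("<1000i", *range(1000))
ROWS = read_int32_array(RAW, [99, 9])
print(len(RAW), len(ROWS), len(ROWS[0]))  # 4000 100 10
```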
12
Embedded abstract types
  • Complex structures

<struct>
  <short-16 />
  <arrayFixed>
    <byte-8 />
    <dim indexTo="7" />
  </arrayFixed>
  <struct>
    <integer-32 />
    <float-32 />
    <double-64 />
  </struct>
</struct>
13
User-defined metadata
  • Label the data types and structures

<struct varName="Data Sample">
  <short-16 varName="ID" />
  <arrayFixed varName="List of 10 complex numbers">
    <struct varName="Complex">
      <float-32 varName="Real" />
      <float-32 varName="Imaginary" />
    </struct>
    <dim indexTo="9" />
  </arrayFixed>
</struct>
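
The labels make it possible to address values by name. The sketch below is plain Python rather than the BinX API: it reads one "Data Sample" record into a dictionary keyed by the varName labels, assuming little-endian data with no padding between fields. `RECORD` and `read_data_sample` are names invented here.

```python
import struct

# One short-16 followed by 10 (Real, Imaginary) pairs of float-32: 2 + 80 bytes.
RECORD = struct.Struct("<h20f")


def read_data_sample(raw):
    """Return the record as a dict keyed by the varName labels on the slide."""
    fields = RECORD.unpack(raw)
    ident, parts = fields[0], fields[1:]
    numbers = [complex(parts[i], parts[i + 1]) for i in range(0, 20, 2)]
    return {"ID": ident, "List of 10 complex numbers": numbers}


# Hypothetical record: ID 7 and the complex numbers k - kj for k = 0..9.
RAW = RECORD.pack(7, *[x for k in range(10) for x in (float(k), -float(k))])
SAMPLE = read_data_sample(RAW)
print(SAMPLE["ID"], SAMPLE["List of 10 complex numbers"][3])
```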
14
Reusable type definitions
  • Define macros for reuse

<definitions>
  <defineType typeName="FourCC">
    <arrayFixed>
      <character-8 />
      <dim count="4" />
    </arrayFixed>
  </defineType>
</definitions>
<struct varName="Wave_Header">
  <useType typeName="FourCC" varName="Keyword" />
  <integer-32 varName="Chunk_Size" />
</struct>
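
The same reuse idea can be sketched in ordinary code: define the four-character reader once and call it wherever a FourCC appears. This Python example is only an analogy to defineType and useType, assuming little-endian RIFF/WAVE data; the helper names and the sample bytes are hypothetical.

```python
import struct


def read_fourcc(raw, offset):
    """Reusable 'type': four 8-bit characters. Returns (value, next offset)."""
    return raw[offset:offset + 4].decode("ascii"), offset + 4


def read_int32(raw, offset, byte_order="<"):
    """Read one signed 32-bit integer. Returns (value, next offset)."""
    (value,) = struct.unpack_from(byte_order + "i", raw, offset)
    return value, offset + 4


def read_wave_header(raw, offset=0):
    """Mirror of the Wave_Header struct: a FourCC keyword, then a chunk size."""
    keyword, offset = read_fourcc(raw, offset)
    chunk_size, offset = read_int32(raw, offset)
    return {"Keyword": keyword, "Chunk_Size": chunk_size}, offset


# Hypothetical start of a WAV file: "RIFF", a chunk size of 36, then "WAVE".
RAW = b"RIFF" + struct.pack("<i", 36) + b"WAVE"
HEADER, NEXT_OFFSET = read_wave_header(RAW)
print(HEADER, read_fourcc(RAW, NEXT_OFFSET)[0])
```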
15
Linking to binary data
  • Reference the binary data file

<definitions>
  <defineType typeName="Header"> </defineType>
  <defineType typeName="Format_Chunk"> </defineType>
  <defineType typeName="Data_Chunk"> </defineType>
</definitions>
<dataset src="myfile.wav">
  <useType typeName="Header" />
  <useType typeName="Format_Chunk" />
  <useType typeName="Data_Chunk" />
</dataset>
16
The BinX document
<?xml version="1.0"?>
<binx xmlns="http://www.edikt.org/binx">
  <dataset src="binary.bin" byteOrder="littleEndian">
    <short-16/>
    <integer-32/>
    <double-64/>
  </dataset>
</binx>
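
To make the idea concrete, here is a minimal Python sketch of a reader driven by such a document. It is not the BinX library: it handles only the three primitive elements used on this slide, assumes no padding between fields, and the names `FORMATS`, `dataset_format` and `read_dataset` are invented for the illustration.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical mapping from the three element names used here to struct codes.
FORMATS = {"short-16": "h", "integer-32": "i", "double-64": "d"}

DOC = """<?xml version="1.0"?>
<binx xmlns="http://www.edikt.org/binx">
  <dataset src="binary.bin" byteOrder="littleEndian">
    <short-16/>
    <integer-32/>
    <double-64/>
  </dataset>
</binx>"""


def dataset_format(binx_text):
    """Translate the <dataset> children into a struct format string."""
    root = ET.fromstring(binx_text)
    dataset = next(el for el in root if el.tag.endswith("}dataset"))
    prefix = "<" if dataset.get("byteOrder") == "littleEndian" else ">"
    # Element tags look like "{namespace}short-16"; keep the local name only.
    codes = [FORMATS[child.tag.split("}", 1)[1]] for child in dataset]
    return prefix + "".join(codes)


def read_dataset(binx_text, raw):
    """Decode raw bytes according to the description."""
    return struct.unpack(dataset_format(binx_text), raw)


SAMPLE = struct.pack("<hid", 1, 2, 3.5)  # stand-in for the contents of binary.bin
print(dataset_format(DOC), read_dataset(DOC, SAMPLE))
```

Changing byteOrder to bigEndian in the document changes the format prefix to ">" without touching the reading code, which is the point of keeping the description outside the program.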

17
A BinX document
<binx byteOrder="bigEndian">
  <definitions>
    <defineType typeName="myTyp">
      <arrayFixed>
        <character-8/>
        <dim indexTo="9"/>
      </arrayFixed>
    </defineType>
  </definitions>
  <dataset src="myfile.bin">
    <useType typeName="myTyp"/>
    <integer-32 varName="X" />
  </dataset>
</binx>

Callouts: Root element (<binx>), Data class section (<definitions>), Abstract data type (<defineType>), Data instance section (<dataset>)
18
DataBinX
  • DataBinX = BinX with data

<dataset src="myfile.bin">
  <struct>
    <short-16 />
    <long-64 />
    <double-64 />
  </struct>
  <arrayFixed>
    <integer-32 />
    <dim count="2" />
  </arrayFixed>
</dataset>

<dataset>
  <struct>
    <short-16>100</short-16>
    <long-64>1000</long-64>
    <double-64>5.257</double-64>
  </struct>
  <arrayFixed>
    <dim>
      <integer-32>1</integer-32>
    </dim>
    <dim>
      <integer-32>2</integer-32>
    </dim>
  </arrayFixed>
</dataset>
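
A hedged sketch of the step from binary to DataBinX, written in plain Python with a hard-coded layout instead of a parsed BinX description. It assumes little-endian bytes with no padding; `to_databinx` and the sample bytes are made up for the example.

```python
import struct
import xml.etree.ElementTree as ET


def to_databinx(raw):
    """Build a DataBinX-like tree for: struct(short, long, double) + 2 integers."""
    s16, l64, d64, first, second = struct.unpack("<hqd2i", raw)
    dataset = ET.Element("dataset")
    record = ET.SubElement(dataset, "struct")
    ET.SubElement(record, "short-16").text = str(s16)
    ET.SubElement(record, "long-64").text = str(l64)
    ET.SubElement(record, "double-64").text = str(d64)
    array = ET.SubElement(dataset, "arrayFixed")
    for value in (first, second):
        dim = ET.SubElement(array, "dim")
        ET.SubElement(dim, "integer-32").text = str(value)
    return dataset


# Hypothetical contents of myfile.bin, using the values shown on the slide.
RAW = struct.pack("<hqd2i", 100, 1000, 5.257, 1, 2)
TREE = to_databinx(RAW)
print(ET.tostring(TREE, encoding="unicode"))
```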
19
The BinX Library
  • Core library
  • Utilities
  • Applications

20
Output from the library
  • DataBinX
    • combined data and BinX document
  • SchemaBinX
  • Binary data stream
  • DataBinX = SchemaBinX + Binary data

21
BinX Components
  • The library has core functionality to support
    generic utilities and applications

[Diagram: three layers. Applications: domain-specific. Utilities: generic tools (DataBinX pack/unpack, Extractor). BinX Library Core: BinX core functionality (parse/generate BinX documents, read/write binary data, parse/generate DataBinX).]
22
BinX application models
  • Data manipulation model
  • Data transportation model
  • Data service model
  • Data query model
  • Data catalogue model

23
Data manipulation model
  • Extraction
    • Subset of a dataset
  • Combination
    • Merge several datasets
  • Transformation
    • Conversion of data types
    • Change of sequence order
    • Transposition of array dimensions
  • Transparency
    • Automatic change of byte order

24
Data transportation model
  • DataBinX as interlingua

[Diagram: Send and Receive sides, each with XSLT, BinX Util and ZIP tool; the data passes through the forms XML document, DataBinX and ZIP (MIME).]
25
Data service model
  • Publishing logical datasets in BinX

[Diagram: a Client reaches BinX-described datasets over the Grid: a dataset from one binary file, a dataset from several binary files, and a dataset from multiple data sources (binary files and a DB).]
26
Data query model
  • Create DataBinX
  • From Binary and BinX
  • Query DataBinX
  • Use XPath
  • Create New DataBinX
  • Results from query
  • Parse DataBinX
  • Create new Binary and BinX

[Diagram: DataBinX → XPath → New DataBinX]
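
The query step can be tried with any XPath-capable XML library. The sketch below uses Python's `xml.etree.ElementTree`, which supports only a subset of XPath, on a small DataBinX-like document; the `query` helper and the choice of wrapping results in a new <dataset> element are assumptions made for the illustration, not the BinX utilities.

```python
import copy
import xml.etree.ElementTree as ET

DATABINX = (
    "<dataset>"
    "<struct><short-16>100</short-16><long-64>1000</long-64>"
    "<double-64>5.257</double-64></struct>"
    "<arrayFixed>"
    "<dim><integer-32>1</integer-32></dim>"
    "<dim><integer-32>2</integer-32></dim>"
    "</arrayFixed>"
    "</dataset>"
)


def query(databinx_text, path):
    """Collect the elements matching path into a new <dataset> element."""
    source = ET.fromstring(databinx_text)
    result = ET.Element("dataset")
    for match in source.findall(path):
        result.append(copy.deepcopy(match))  # copy so the source tree is untouched
    return result


NEW_DATABINX = query(DATABINX, "./arrayFixed/dim/integer-32")
print(ET.tostring(NEW_DATABINX, encoding="unicode"))
```

The resulting document could then be parsed again to produce a new binary file and BinX description, as the last bullet describes.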
27
Data catalogue model
  • Primary storage
    • Binary data files
  • Metadata
    • Syntactic annotation
    • Semantic annotation
  • Classification
    • Domain specific
  • Cross-reference
    • XLink

[Diagram: a METADATA layer of BinX documents arranged from Abstract to Detailed (BinX 1; BinX 1.1, BinX 1.2; BinX 1.2.1, BinX 1.2.2, BinX 1.2.3) above a BINARY layer of data files (0101010101).]
28
Application in Astronomy
  • Case Study
  • Data Conversion
  • Between FITS and VOTable

29
Application in astronomy
  • FITS and VOTable conversion

[Diagram: a FITS file (SIMPLE T ... END 01010101) and a VOTable document (<?xml version. <VOTABLE> </VOTABLE>) connected through the DataBinX Utility, which sits on the BinX library Core.]
30
FITS file
Primary HDU
  Header (80-character cards, columns 0 to 79)
    SIMPLE  = T / file does conform to FITS standard
    BITPIX  = 8 / number of bits per data pixel
    NAXIS   = 1 / number of data axes

    END
  Data
    3D 4A 14 0F 1C FE 25 04
Extension
  Header
    XTENSION= 'BINTABLE' / binary table extension
    BITPIX  = 8 / 8-bit bytes
    NAXIS   = 2 / 2-dimensional binary table

    END
  Data
    7B 3E 40 2C 16 70 E7 6F
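
For readers unfamiliar with the layout, a FITS header is a run of 80-character cards with the keyword in the first 8 columns and "= " in columns 9 and 10. The Python sketch below parses a hand-built header of that shape; it is a simplification (a quoted string value containing "/" would be split wrongly), and `make_card` and `parse_header` are hypothetical helpers, not part of BinX or any FITS library.

```python
def make_card(text):
    """Pad a header line to the fixed 80-character card width."""
    return text.ljust(80)


def parse_header(block):
    """Parse 80-character cards up to END into {keyword: (value, comment)}."""
    cards = {}
    for start in range(0, len(block), 80):
        card = block[start:start + 80]
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] == "= ":
            # Naive split: assumes the value itself contains no "/" character.
            value, _, comment = card[10:].partition("/")
            cards[keyword] = (value.strip(), comment.strip())
    return cards


HEADER = "".join(make_card(line) for line in [
    "SIMPLE  =                    T / file does conform to FITS standard",
    "BITPIX  =                    8 / number of bits per data pixel",
    "NAXIS   =                    1 / number of data axes",
    "END",
])
CARDS = parse_header(HEADER)
print(CARDS)
```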
31
VOTable
<VOTABLE>
  <RESOURCE>
    <PARAM name="Obs" value="Bob"/>
    <TABLE name="Stars">
      <FIELD name="Star-name" datatype="char" arraysize="10" />
      <FIELD name="RA" datatype="float" />
      <FIELD name="Dec" datatype="float" />
      <FIELD name="Counts" datatype="int" arraysize="2x3x*" />
      <DATA>
        <TABLEDATA>
          <TR>
            <TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD>
            <TD>4 5 3 4 3 2 1 2 3 3 5 6</TD>
          </TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
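
Because a VOTable is ordinary XML, its rows can be read with a generic parser before any conversion. A minimal Python sketch, assuming a document without an XML namespace like the one on this slide (real VOTable files usually declare one); `read_rows` is a name invented here.

```python
import xml.etree.ElementTree as ET

VOTABLE = """<VOTABLE>
  <RESOURCE>
    <PARAM name="Obs" value="Bob"/>
    <TABLE name="Stars">
      <FIELD name="Star-name" datatype="char" arraysize="10" />
      <FIELD name="RA" datatype="float" />
      <FIELD name="Dec" datatype="float" />
      <FIELD name="Counts" datatype="int" arraysize="2x3x*" />
      <DATA>
        <TABLEDATA>
          <TR>
            <TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD>
            <TD>4 5 3 4 3 2 1 2 3 3 5 6</TD>
          </TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def read_rows(votable_text):
    """Return each table row as a dict mapping FIELD names to cell text."""
    table = ET.fromstring(votable_text).find("./RESOURCE/TABLE")
    names = [field.get("name") for field in table.findall("FIELD")]
    rows = []
    for tr in table.findall("./DATA/TABLEDATA/TR"):
        cells = [td.text for td in tr.findall("TD")]
        rows.append(dict(zip(names, cells)))
    return rows


ROWS = read_rows(VOTABLE)
print(ROWS[0]["Star-name"], float(ROWS[0]["RA"]), float(ROWS[0]["Dec"]))
```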

32
FITS → DataBinX → VOTable
  • FITS to VOTable conversion

[Diagram labels: FITS; Preprocessor; Schema BinX; DataBinX Utility; DataBinX; XSLT transformer (with XSLT); VOTable]
33
VOTable → DataBinX → FITS
  • VOTable to FITS conversion

[Diagram labels: VOTable; XSLT transformer (with XSLT); Schema BinX; DataBinX; DataBinX Utility; Binary Data; FITS Header; Post processor; FITS]
34
Support
  • Information and software download
  • http://www.edikt.org/binx
  • Questions
  • support_at_edikt.org
  • Requirements and suggestions
  • tedwen_at_edikt.org
  • robertc_at_edikt.org

35
BinX API
36
Parsing a BinX document
  • BxBinxFile* pReader = new BxBinxFile();
  • if (pReader->parse("mybinx.xml"))
  • BxDataset* pDataset = pReader->getDataset();

37
Reading a BinX document
  • BxArrayFixed* pArray = pDataset->getArray(0);
  • BxArrayFixed* pArray = pDataset->getArray("fixed");
  • Get an array object
  • BxDataset* pStruct = pArray->get(0, 0);
  • Get a struct from the array

38
Reading a BinX document
  • BxFloat32* pReal = pStruct->getFloat("Real");
  • float real = pReal->getFloat();
  • Get the data value

39
Creating BinX document
  • BxBinxFileWriter* pWriter = new BxBinxFileWriter();
  • Create an object to write out the document
  • BxDataset* pData = new BxDataset();
  • Create a new dataset (in-memory BinX document)
  • BxShort16* i16 = new BxShort16(100);
  • pData->addDataObject(i16);

40
Creating BinX document
  • BxBinaryFile* pbf = new BxBinaryFile();
  • Create a new binary file
  • pbf->setDatasetPointer(pData);
  • Create a link to the BinX document
  • pWriter->setBinaryFilePtr(pbf);
  • pWriter->save("TestDataset.xml");
  • Save the BinX document

41
Merge binary data
  • BxBinxFileReader* pFile1 = new BxBinxFileReader("file1.xml");
  • BxBinxFileReader* pFile2 = new BxBinxFileReader("file2.xml");
  • BxDataset* pDataset1 = pFile1->getDataset();
  • BxDataset* pDataset2 = pFile2->getDataset();
  • BxArray* pArray1 = pDataset1->getArray(0);
  • BxArray* pArray2 = pDataset2->getArray(0);
  • BxDataObject* pData1 = pArray1->getNext();
  • BxDataObject* pData2 = pArray2->getNext();
  • FILE* fo = fopen("output.dat", "wb");
  • pData1->toStreamBinary(fo);
  • pData2->toStreamBinary(fo);

42
Summary
  • One BinX document can describe many binary files
  • Generate BinX document from code
  • Easy to use interfaces
  • Flexible