Title: BinX
1BinX: A Tool for Binary File Access
- eDIKT project team
- Ted Wen tedwen_at_edikt.org
- Robert Carroll robert.carroll_at_edikt.org
2Agenda
- About the BinX project
- Introduction to the BinX language
- Introduction to the BinX library
- Example application
- Overview of the BinX API
- Discussion
3The problem
- Most scientific data are in binary files
- Binary data files are not all standardized
- Binary data files are platform-dependent
- XML is useful to represent metadata
- Scientific datasets can be too large in XML
4What is BinX?
- Binary in XML
- Annotation language
  - Using XML
  - Descriptive
  - Low-level
- Software components
  - BinX library
  - Generic utilities
  - API
5How and Why BinX is used
[Diagram: a Special Application Program and several Application Programs; the Application Programs reach the binary data through the BinX Library, which is driven by a BinX document (<dataset> ... </dataset>)]
6The BinX Language
- Annotating a binary data stream
- Mark up data types
- Mark up sequences
- Mark up arrays
- Complex structures
7Data elements
- Primitive data elements
  - Byte, character, integer, real
- Complex data elements
  - Array, struct, union
- User-defined data elements
8Primitive Data Types
- Character
  - <character-8>
  - <string> (fixed length, variable length and delimited)
- Integer
  - <byte-8>
  - <short-16>, <unsignedShort-16>
  - <integer-32>, <unsignedInteger-32>
  - <long-64>, <unsignedLong-64>
- Real
  - <float-32>
  - <double-64>
  - <quadruple-128>
9Primitive Data Types
Binary stream, fields 1 to 4: FF 7F | 7F FF FF FF | 00 00 C8 42 | 42 C8 00 00
- Field 1 (FF 7F): <short-16 byteOrder="littleEndian">32767</short-16>
- Field 2 (7F FF FF FF): <integer-32 byteOrder="bigEndian">2147483647</integer-32>
- Field 3 (00 00 C8 42): <float-32 byteOrder="littleEndian">100.0</float-32>
- Field 4 (42 C8 00 00): <float-32 byteOrder="bigEndian">100.0</float-32>
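The four annotated fields on this slide can be checked by hand. The following is a minimal C++ sketch of byte-order-aware decoding; it is not the BinX library's implementation, and the names `ByteOrder`, `load_unsigned`, `load_short16`, `load_integer32` and `load_float32` are invented for illustration. It assumes the host stores `float` as a 32-bit IEEE 754 value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical names for illustration; none of this is the BinX API.
enum class ByteOrder { LittleEndian, BigEndian };

// Assemble an n-byte unsigned value from raw bytes in the stated byte order.
// The result does not depend on the byte order of the host machine.
inline std::uint64_t load_unsigned(const unsigned char* p, std::size_t n, ByteOrder order) {
    assert(n >= 1 && n <= 8);
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Visit bytes from most significant to least significant.
        const std::size_t idx = (order == ByteOrder::BigEndian) ? i : n - 1 - i;
        v = (v << 8) | p[idx];
    }
    return v;
}

inline std::int16_t load_short16(const unsigned char* p, ByteOrder order) {
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(load_unsigned(p, 2, order)));
}

inline std::int32_t load_integer32(const unsigned char* p, ByteOrder order) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(load_unsigned(p, 4, order)));
}

// Assumption: the host float is a 32-bit IEEE 754 value.
inline float load_float32(const unsigned char* p, ByteOrder order) {
    const std::uint32_t bits = static_cast<std::uint32_t>(load_unsigned(p, 4, order));
    float f;
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Applied to the 14 bytes shown above at offsets 0, 2, 6 and 10, these helpers yield 32767, 2147483647, 100.0 and 100.0 respectively.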
10Abstract struct types
<struct>
  <unsignedShort-16 />
  <unsignedShort-16 />
  <byte-8 />
  <byte-8 />
  <byte-8 />
</struct>
Screen descriptor in GIF:
- Screen width: unsigned short
- Screen height: unsigned short
- Packed field: a byte
- Background colour index: byte
- Pixel aspect ratio: byte
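To show what this struct description corresponds to in application code, here is a small C++ sketch that decodes the seven bytes of a GIF screen descriptor field by field. The type and function names are illustrative only and are not part of BinX. GIF stores its 16-bit fields least-significant byte first, so the sketch reads them as little-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative in-memory form of the five fields listed on the slide.
struct ScreenDescriptor {
    std::uint16_t width;
    std::uint16_t height;
    std::uint8_t packed;
    std::uint8_t background_colour_index;
    std::uint8_t pixel_aspect_ratio;
};

// GIF stores 16-bit integers least-significant byte first.
inline std::uint16_t read_u16_le(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Decode field by field rather than casting the buffer to the struct,
// so the result does not depend on compiler padding or host byte order.
inline ScreenDescriptor parse_screen_descriptor(const unsigned char* p, std::size_t len) {
    assert(len >= 7);
    ScreenDescriptor d;
    d.width = read_u16_le(p);
    d.height = read_u16_le(p + 2);
    d.packed = p[4];
    d.background_colour_index = p[5];
    d.pixel_aspect_ratio = p[6];
    return d;
}
```

For example, the bytes 80 02 E0 01 F7 00 00 decode to a 640 by 480 screen with packed field 0xF7.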
11Abstract array types
<arrayFixed>
  <integer-32 />
  <dim indexTo="99">
    <dim indexTo="9" />
  </dim>
</arrayFixed>
A 2-dimensional array containing 10-by-100 32-bit integers
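The size of this array follows directly from the description: 100 × 10 elements of 4 bytes each. The C++ sketch below works that out and computes the byte offset of one element. The slide does not say which dimension varies fastest in the file, so the offset function rests on a loudly labelled assumption: the nested (inner) `dim` is taken to vary fastest. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Extents derived from the slide: indexTo="99" gives 100 entries,
// indexTo="9" gives 10 entries; integer-32 occupies 4 bytes.
constexpr std::size_t kOuterExtent = 100;
constexpr std::size_t kInnerExtent = 10;
constexpr std::size_t kElementSize = 4;

// Total number of bytes the array occupies in the binary file.
constexpr std::size_t total_bytes() {
    return kOuterExtent * kInnerExtent * kElementSize;
}

// ASSUMPTION (not stated in the source): the nested dim varies fastest,
// i.e. the elements for one outer index are stored contiguously.
inline std::size_t byte_offset(std::size_t outer, std::size_t inner) {
    assert(outer < kOuterExtent && inner < kInnerExtent);
    return (outer * kInnerExtent + inner) * kElementSize;
}
```

Under that assumption the array occupies 4000 bytes and the last element starts at byte 3996.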
12Embedded abstract types
<struct>
  <short-16 />
  <arrayFixed>
    <byte-8 />
    <dim indexTo="7" />
  </arrayFixed>
  <struct>
    <integer-32 />
    <float-32 />
    <double-64 />
  </struct>
</struct>
13User-defined metadata
- Label the data types and structures
<struct varName="Data Sample">
  <short-16 varName="ID" />
  <arrayFixed varName="List of 10 complex numbers">
    <struct varName="Complex">
      <float-32 varName="Real" />
      <float-32 varName="Imaginary" />
    </struct>
    <dim indexTo="9" />
  </arrayFixed>
</struct>
14Reusable type definitions
<definitions>
  <defineType typeName="FourCC">
    <arrayFixed>
      <character-8 />
      <dim count="4" />
    </arrayFixed>
  </defineType>
</definitions>
<struct varName="Wave_Header">
  <useType typeName="FourCC" varName="Keyword" />
  <integer-32 varName="Chunk_Size" />
</struct>
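As a rough illustration of what the `Wave_Header` description covers, the C++ sketch below reads the same eight bytes by hand: a four-character code followed by a 32-bit chunk size. The struct and function names are made up for this example and are not BinX API. The snippet on the slide does not state a byte order; the sketch assumes little-endian, which is what RIFF/WAVE files use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Illustrative counterpart of the Wave_Header struct described above.
struct WaveHeader {
    std::string keyword;      // FourCC: four 8-bit characters
    std::int32_t chunk_size;  // integer-32
};

// Assumption: the chunk size is little-endian, as in RIFF/WAVE files.
inline WaveHeader parse_wave_header(const unsigned char* p, std::size_t len) {
    assert(len >= 8);
    WaveHeader h;
    h.keyword.assign(reinterpret_cast<const char*>(p), 4);
    const std::uint32_t u = static_cast<std::uint32_t>(p[4]) |
                            (static_cast<std::uint32_t>(p[5]) << 8) |
                            (static_cast<std::uint32_t>(p[6]) << 16) |
                            (static_cast<std::uint32_t>(p[7]) << 24);
    h.chunk_size = static_cast<std::int32_t>(u);
    return h;
}
```

Given the bytes 52 49 46 46 24 08 00 00, this returns the keyword "RIFF" and a chunk size of 2084.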
15Linking to binary data
- Reference the binary data file
<definitions>
  <defineType typeName="Header"> ... </defineType>
  <defineType typeName="Format_Chunk"> ... </defineType>
  <defineType typeName="Data_Chunk"> ... </defineType>
</definitions>
<dataset src="myfile.wav">
  <useType typeName="Header" />
  <useType typeName="Format_Chunk" />
  <useType typeName="Data_Chunk" />
</dataset>
16The BinX document
<?xml version="1.0"?>
<binx xmlns="http://www.edikt.org/binx">
  <dataset src="binary.bin" byteOrder="littleEndian">
    <short-16/>
    <integer-32/>
    <double-64/>
  </dataset>
</binx>
17A BinX document
<binx byteOrder="bigEndian">
  <definitions>
    <defineType typeName="myTyp">
      <arrayFixed>
        <character-8/>
        <dim indexTo="9"/>
      </arrayFixed>
    </defineType>
  </definitions>
  <dataset src="myfile.bin">
    <useType typeName="myTyp"/>
    <integer-32 varName="X" />
  </dataset>
</binx>
Callouts:
- Root element: <binx>
- Data class section: <definitions>
- Abstract data type: <defineType>
- Data instance section: <dataset>
18DataBinX
BinX description of the binary file:
<dataset src="myfile.bin">
  <struct>
    <short-16 />
    <long-64 />
    <double-64 />
  </struct>
  <arrayFixed>
    <integer-32 />
    <dim count="2" />
  </arrayFixed>
</dataset>
Corresponding DataBinX, with the values inline:
<dataset>
  <struct>
    <short-16>100</short-16>
    <long-64>1000</long-64>
    <double-64>5.257</double-64>
  </struct>
  <arrayFixed>
    <dim>
      <integer-32>1</integer-32>
    </dim>
    <dim>
      <integer-32>2</integer-32>
    </dim>
  </arrayFixed>
</dataset>
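The relationship between decoded values and DataBinX can be sketched in a few lines of C++. This is not how the BinX library generates DataBinX; `to_databinx` is a hypothetical helper that simply writes already-decoded values into the element layout shown on this slide.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: write decoded values into the DataBinX layout
// from the slide (one struct, then a fixed array of integer-32).
inline std::string to_databinx(std::int16_t s, std::int64_t l, double d,
                               const std::vector<std::int32_t>& values) {
    assert(!values.empty());
    std::ostringstream out;
    out << "<dataset>\n"
        << "  <struct>\n"
        << "    <short-16>" << s << "</short-16>\n"
        << "    <long-64>" << l << "</long-64>\n"
        << "    <double-64>" << d << "</double-64>\n"
        << "  </struct>\n"
        << "  <arrayFixed>\n";
    // Each array element is wrapped in its own <dim> entry.
    for (std::int32_t v : values) {
        out << "    <dim><integer-32>" << v << "</integer-32></dim>\n";
    }
    out << "  </arrayFixed>\n"
        << "</dataset>\n";
    return out.str();
}
```

Calling `to_databinx(100, 1000, 5.257, {1, 2})` produces a document with the same elements and values as the DataBinX shown above, which makes clear why large datasets grow considerably when expressed this way.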
19The BinX Library
- Core library
- Utilities
- Applications
20Output from the library
- DataBinX
  - Combined data and BinX document
- SchemaBinX
- Binary data stream
- DataBinX = SchemaBinX + Binary data
21BinX Components
- The library has core functionality to support generic utilities and applications
[Diagram: three layers, top to bottom]
- Applications: domain-specific
- Utilities: generic tools (DataBinX pack/unpack, extractor)
- BinX Library Core: BinX core functionality (parse/generate BinX documents, read/write binary data, parse/generate DataBinX)
22BinX application models
- Data manipulation model
- Data transportation model
- Data service model
- Data query model
- Data catalogue model
23Data manipulation model
- Extraction
  - Subset of a dataset
- Combination
  - Merge several datasets
- Transformation
  - Conversion of data types
  - Change of sequence order
  - Transposition of array dimensions
- Transparency
  - Automatic change of byte order
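Two of the manipulations listed here, changing byte order and transposing array dimensions, are easy to show in isolation. The C++ sketch below is a generic illustration under its own assumptions (32-bit values, a row-major 2-dimensional array) and does not reflect how the BinX library performs these operations; both function names are invented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reverse the four bytes of a 32-bit value (big-endian <-> little-endian).
inline std::uint32_t swap_byte_order32(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}

// Transpose a rows-by-cols array stored row by row into a
// cols-by-rows array, also stored row by row.
inline std::vector<std::int32_t> transpose(const std::vector<std::int32_t>& in,
                                           std::size_t rows, std::size_t cols) {
    assert(in.size() == rows * cols);
    std::vector<std::int32_t> out(in.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            out[c * rows + r] = in[r * cols + c];
        }
    }
    return out;
}
```

For instance, swapping 0x42C80000 gives 0x0000C842, and transposing the 2-by-3 array 1 2 3 / 4 5 6 gives the 3-by-2 array 1 4 / 2 5 / 3 6.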
24Data transportation model
[Diagram: Send and Receive sides, each with XSLT, the BinX utility and a ZIP tool; the data takes the forms XML document, DataBinX and ZIP (MIME) along the way]
25Data service model
- Publishing logical datasets in BinX
[Diagram: a client on the Grid accesses logical datasets published through BinX documents: a dataset from one binary file, a dataset from several binary files, and a dataset from multiple data sources, including a DB]
26Data query model
- Create DataBinX
  - From Binary and BinX
- Query DataBinX
  - Use XPath
- Create New DataBinX
  - Results from query
- Parse DataBinX
  - Create new Binary and BinX
[Diagram: DataBinX → XPath → New DataBinX]
27Data catalogue model
- Primary storage
  - Binary data files
- Metadata
  - Syntactic annotation
  - Semantic annotation
- Classification
  - Domain specific
- Cross-reference
  - XLink
[Diagram: a METADATA layer of BinX documents arranged from abstract (BinX 1) through BinX 1.1 and BinX 1.2 to detailed (BinX 1.2.1, BinX 1.2.2, BinX 1.2.3), above a BINARY layer of data files]
28Application in Astronomy
- Case Study
- Data Conversion
- Between FITS and VOTable
29Application in astronomy
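- FITS and VOTable conversion
[Diagram: a FITS file (SIMPLE = T ... END, followed by binary data) and a VOTable document (<?xml version ...> <VOTABLE> ... </VOTABLE>), converted through the DataBinX Utility on top of the BinX library Core]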
30FITS file
Header cards span columns 0 to 79 (80 characters each).
Primary HDU
- Header
  SIMPLE  = T / file does conform to FITS standard
  BITPIX  = 8 / number of bits per data pixel
  NAXIS   = 1 / number of data axes
  END
- Data
  3D 4A 14 0F 1C FE 25 04
Extension
- Header
  XTENSION= 'BINTABLE' / binary table extension
  BITPIX  = 8 / 8-bit bytes
  NAXIS   = 2 / 2-dimensional binary table
  END
- Data
  7B 3E 40 2C 16 70 E7 6F
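A FITS header is a sequence of fixed-width 80-character cards, which is what makes it straightforward to describe and to parse. The C++ sketch below splits one card into keyword, value and comment. It is a simplified illustration, not a complete FITS reader: the names are invented, and a `/` inside a quoted string value is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative result type; not part of any FITS or BinX library.
struct FitsCard {
    std::string keyword;
    std::string value;
    std::string comment;
};

inline std::string trim(const std::string& s) {
    const char* ws = " \t";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Pad a short line with spaces to the fixed card width of 80 characters.
inline std::string pad_card(std::string s) {
    assert(s.size() <= 80);
    s.resize(80, ' ');
    return s;
}

// Columns 1-8 hold the keyword; "= " in columns 9-10 marks a value;
// an optional comment follows the first '/'.
inline FitsCard parse_card(const std::string& card) {
    assert(card.size() == 80);
    FitsCard c;
    c.keyword = trim(card.substr(0, 8));
    if (card.compare(8, 2, "= ") != 0) return c;  // no value, e.g. END
    const std::string rest = card.substr(10);
    const std::size_t slash = rest.find('/');
    c.value = trim(rest.substr(0, slash));
    if (slash != std::string::npos) c.comment = trim(rest.substr(slash + 1));
    return c;
}
```

Parsing the `BITPIX` card of the primary header yields the keyword `BITPIX`, the value `8` and the comment text; the `END` card yields a keyword with no value.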
31VOTable
<VOTABLE>
  <RESOURCE>
    <PARAM name="Obs" value="Bob"/>
    <TABLE name="Stars">
      <FIELD name="Star-name" datatype="char" arraysize="10" />
      <FIELD name="RA" datatype="float" />
      <FIELD name="Dec" datatype="float" />
      <FIELD name="Counts" datatype="int" arraysize="2x3x*" />
      <DATA>
        <TABLEDATA>
          <TR>
            <TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD>
            <TD>4 5 3 4 3 2 1 2 3 3 5 6</TD>
          </TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
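The `Counts` field is the interesting case for a conversion to binary, because its shape has to be recovered from the data. The C++ sketch below assumes the last dimension of the `arraysize` is the variable-length marker `*` used by VOTable, and resolves it from the number of values in the cell. The function names are hypothetical and only this one form of `arraysize` is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Count whitespace-separated values in a table cell.
inline std::size_t count_values(const std::string& cell) {
    std::istringstream in(cell);
    std::string token;
    std::size_t n = 0;
    while (in >> token) ++n;
    return n;
}

// Resolve an arraysize such as "2x3x*" against the number of values.
// Fixed dimensions are multiplied together; a trailing "*" becomes the
// value count divided by that product.
inline std::vector<std::size_t> resolve_arraysize(const std::string& arraysize,
                                                  std::size_t value_count) {
    std::vector<std::size_t> dims;
    std::size_t fixed_product = 1;
    bool variable_last = false;
    std::istringstream in(arraysize);
    std::string token;
    while (std::getline(in, token, 'x')) {
        if (token == "*") {
            variable_last = true;
            break;
        }
        const std::size_t n = std::stoul(token);
        dims.push_back(n);
        fixed_product *= n;
    }
    assert(fixed_product > 0 && value_count % fixed_product == 0);
    if (variable_last) dims.push_back(value_count / fixed_product);
    return dims;
}
```

The cell for Procyon holds 12 values, so under this assumption the shape resolves to 2 × 3 × 2.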
32FITS → DataBinX → VOTable
- FITS to VOTable conversion
[Diagram: pipeline components are FITS, Preprocessor, Schema BinX, DataBinX Utility, DataBinX, XSLT, XSLT transformer and VOTable]
33VOTable → DataBinX → FITS
- VOTable to FITS conversion
[Diagram: pipeline components are VOTable, XSLT, XSLT transformer, DataBinX, Schema BinX, DataBinX Utility, Binary Data, FITS Header, Post processor and FITS]
34Support
- Information and software download
  - http://www.edikt.org/binx
- Questions
  - support_at_edikt.org
- Requirements and suggestions
  - tedwen_at_edikt.org
  - robertc_at_edikt.org
35BinX API
36Parsing a BinX document
- BxBinxFile *pReader = new BxBinxFile();
- if (pReader->parse("mybinx.xml")) {
-     BxDataset *pDataset = pReader->getDataset();
- }
37Reading a BinX document
- BxArrayFixed *pArray = pDataset->getArray(0);
- BxArrayFixed *pArray = pDataset->getArray("fixed");
  - Get an array object (either form)
- BxDataset *pStruct = pArray->get(0, 0);
  - Get a struct from the array
38Reading a BinX document
- BxFloat32 *pReal = pStruct->getFloat("Real");
- float real = pReal->getFloat();
  - Get the data value
39Creating BinX document
- BxBinxFileWriter *pWriter = new BxBinxFileWriter();
  - Create an object to write out the document
- BxDataset *pData = new BxDataset();
  - Create a new dataset (in-memory BinX document)
- BxShort16 *i16 = new BxShort16(100);
- pData->addDataObject(i16);
40Creating BinX document
- BxBinaryFile *pbf = new BxBinaryFile();
  - Create a new binary file
- pbf->setDatasetPointer(pData);
  - Create a link to the BinX document
- pWriter->setBinaryFilePtr(pbf);
- pWriter->save("TestDataset.xml");
  - Save the BinX document
41Merge binary data
- BxBinxFileReader *pFile1 = new BxBinxFileReader("file1.xml");
- BxBinxFileReader *pFile2 = new BxBinxFileReader("file2.xml");
- BxDataset *pDataset1 = pFile1->getDataset();
- BxDataset *pDataset2 = pFile2->getDataset();
- BxArray *pArray1 = pDataset1->getArray(0);
- BxArray *pArray2 = pDataset2->getArray(0);
- BxDataObject *pData1 = pArray1->getNext();
- BxDataObject *pData2 = pArray2->getNext();
- FILE *fo = fopen("output.dat", "wb");
- pData1->toStreamBinary(fo);
- pData2->toStreamBinary(fo);
42Summary
- One BinX document can describe many binary files
- Generate BinX document from code
- Easy to use interfaces
- Flexible