Title: ROOT IO for SQL databases
1ROOT I/O for SQL databases
- Sergey Linev, GSI, Germany
2What is SQL?
- SQL Structured Query Language
- SQL is language to create, modify and retrieve
data from Relational Database Management System
(RDBMS) - Standard since 1986, several revisions in 1992
SQL2, 1999 SQL3 - Supported in most commercial and open source RDBMS
3SQL support in ROOT
- Abstract interface via three classes
- TSQLServer query execution
- TSQLResult result of single SELECT query
- TSQLRow content of single row
- Implemented for MySQL, Oracle, PostgreSQL
- TTreeSQL class provides TTree interface to
database tables. Allow tree drawing and table
modifications
4Objects store in SQL database
- Write code yourself
- pro probably, best performance
- contra development, maintenance
- Take existing SQL I/O framework
- pro fast to implement
- contra second I/O scheme for your classes
- Convert to binary buffer and write as BLOB
- pro uses ROOT I/O facility
- contra no access to data without ROOT
5TFile as interface to database
- Main features of TFile
- Subdirectories structure and keys list
- Schema evolution
- Pro of TFile interface to SQL database
- learning you do not need to learn
- using same I/O code can be used
- developing code can be tested with standard or
xml file format - Contra
- fixed table design
- performance
6How it works
- New TBufferSQL2 class implemented
- inherited from TBuffer class
- redefines all virtual methods for basic data
types and objects I/O - When writing object
- splits object data on parts, belonging to
different class members and produces SQL INSERT
queries - When reading object
- generates SELECT queries and extract class member
values from obtained tables subset
7Class table structure
- For each class version separate table
- TBox_ver2, TGraph_ver4 and so on
- Class members stored in separate columns
- basic data types
- fixed-size array of basic types
- object, pointer on object (object id)
- parent class (version)
- special treatment for TObject and TString
- Anything else will be stored in raw format in
special table like TGraph_streamer_ver4
8TSQLFile class
- Inherited from TFile
- Full support of schema evolution
- Beside TTree all other ROOT classes are supported
- Additional configuration possible for
- usage of table indexes
- transactions
- MySQL table types selection
- suffixes for column names
9Example with simple class
class TBox public TObject, public TAttLine,
public TAttFill Double_t fX1
Double_t fY1 Double_t fX2 Double_t
fY2 ClassDef(TBox, 2)
TSQLFile f(mysql//host.domain/test,
create,user,pass) for (int
n1nlt10n) TBox b new
TBox(n,n2,n3,n4) b-gtWrite(Form(boxd,n
))
SELECT FROM TBox_ver2
TObject_ver1
- Created tables
- TBox_ver2
- TObject_ver1
- TAttLine_ver1
- TAttFill_ver2
- KeysTable
- ObjectsTable
- Configurations
10Common tables
- Three common tables
- KeysTable list of all keys in the file
- ObjectsTable list of all objects id
- Configurations TSQLFile config
KeysTable
ObjectsTable
Configurations
11More complicated example
TGraph_streamer_ver4
TSQLFile f(mysql//host.domain/test,
update,user,pass) TGraph
gr new TGraph(10) for (int n0nlt10n)
gr-gtSetPoint(n, n1, (n1)(n1))
gr-gtWrite(gr)
TGraph_ver4
parent classes
TList_streamer_ver5
NULL pointer
ObjectsTable
12Support of custom streamers
- Important, while lot of ROOT and some user
classes has custom streamers - Data, produced by custom streamer, directly
written to _streamer_ tables like
TList_streamer_ver5 - Special case when custom streamer reads data,
written by standard I/O - Even in custom streamer user can split data into
different columns of normal class table
13External access to SQL tables
- Easy navigation with simple SELECT statements
- One raw in table corresponds to one object
- Class name and version for each object can be
found in ObjectsTable - TSQLFileMakeSelectQuery() produce SELECT
statement, which aggregates data of object from
different tables in one
14Example with TBox class
- Query, produced by f-gtMakeSelectQuery(TBoxClass(
)) - SELECT t1.objid, t2.UniqueId, t2.Bits,
t2.ProcessId, t3.fLineColor, t3.fLineStyle,
t3.fLineWidth, t4.fFillColor, t4.fFillStyle,
t1.fX1, t1.fY1, t1.fX2, t1.fY2 FROM TBox_ver2 AS
t1 LEFT JOIN TObject_ver1 AS t2 USING(objid)
LEFT JOIN TAttLine_ver1 AS t3 USING(objid)
LEFT JOIN TAttFill_ver1 AS t4 USING(objid)
TObject
TAttLine
TAttFill
TBox
15SQL I/O performance
- Two aspects
- Time and CPU usage on user host
- Quality and number of SQL statements
// TFile f(test.root,recreate) // TXMLFile
f(test.xml,recreate) TSQLFile
f(mysql//host.domain/test,
recreate,user,pass) TClonesArray
clones("TBox", 10000) for(int n0nlt10000n)
new (clonesn) TBox(n1,n2,n3,n4) clones.W
rite("clones0", TObjectkSingleKey) gBenchmark-
gtStart(Write") clones.Write("clones",TObjectkS
ingleKey) gBenchmark-gtShow(Write")
// TFile f(test.root,recreate) // TXMLFile
f(test.xml,recreate) TSQLFile
f(mysql//host.domain/test,
read,user,pass) TClonesArray clon 0,
clon0 0 f.GetObject("clones0",
clon0) gBenchmark-gtStart(Read") f.GetObject("c
lones", clon) gBenchmark-gtShow(Read")
16Performance measurement
MySQL 4.1 on Fedora Core 4. Pure text queries
are used. With ODBC or native MySQL client
factor 2 to 3 can be gained Query text length
was limited to 50 KB and long queries were
split Oracle 10g on SuSE 8. New TSQLStatement
class (not yet in ROOT) was used to gain factor
25 in writing compare to standard ROOT
17Conclusion
- new TSQLFile provides new possibilities for usage
of SQL database in ROOT - Tested with MySQL 4 and Oracle 10, can be adopted
for other RDBMS - To be done
- enhancement of ROOT TSQLServer classes
- new TSQLStatement class is required
- investigation of ODBC usage in ROOT
- performance optimization