ROOT I/O TTree Queries

1
ROOT I/O, TTree Queries
  • CHEP 2004
  • René Brun / CERN, Philippe Canal / Fermilab, Fons
    Rademakers / CERN

http://root.cern.ch
2
Contents
  • Status
  • Overview
  • List of other presentations
  • ROOT I/O
  • Large Files
  • Double32_t
  • Foreign objects
  • New interfaces
  • XML back-end
  • Historical recap.
  • Containers Support
  • Mainly for STL containers
  • Splitting
  • TTree Query
  • TTree
  • Auto load of TRefed branches
  • UserInfo
  • CloneTree
  • TTree Query
  • Calling free standing functions
  • Rebinning
  • Support for Indexed Friends
  • Arbitrary C++ in queries (TTree::MakeProxy)
  • Support for SQL back-end
  • Future Plans

3
Presentations and Posters
  • 328 The Next Generation Root File Server, by
    Andrew HANUSHEVSKY (Theatersaal, Sept 27, 16:30 -
    16:50)
  • 412 XML I/O in ROOT, by Sergey LINEV (Brunig
    1+2, Sept 29, 15:20 - 15:40)
  • 430 Global Distributed Parallel Analysis using
    PROOF and AliEn, by Fons RADEMAKERS (Theatersaal,
    Sept 29, 15:20 - 15:40)
  • 104 Authentication/Security services in the
    ROOT framework, by Gerardo GANIS (Brunig 3, Sept
    29, 16:50 - 17:10)
  • 169 Guidelines for Developing a Good GUI, by
    Ilka ANTCHEVA (Brunig 1+2, Sept 30, 14:00 -
    14:20)
  • 287 Super scaling PROOF to very large
    clusters, by Maarten BALLINTIJN (Ballsaal, Sept
    30, 15:00 - 15:20)
  • Poster on September 29
  • 128 XTNetFile, a fault tolerant extension of
    ROOT TNetFile client
  • Poster on September 30
  • 298 The ROOT 3-D graphics and geometry classes
  • 170 The User Interface Design in ROOT
  • 303 The ROOT Linear Algebra Package
  • 98 RDBC: ROOT DataBase Connectivity
  • 99 Interactive Data Analysis with Carrot (ROOT
    Apache Module)

4
Status
  • ROOT 4.01/02 just released
  • Production Release of 4.01 planned for December
    2004
  • Many improvements since CHEP2003
  • This talk
  • I/O and TTree queries
  • For other developments, see the other ROOT
    related talks
  • XROOTD
  • A new generation ROOT file server
  • Authentication Overhaul
  • Object Property Editor
  • e.g. TH1Editor, TH2Editor, TGraphEditor
  • New classes for GUI
  • GUI builder
  • Brand new GL viewer
  • Math and Stats
  • New Matrix package Implementation
  • New functions in TMath (Now a namespace)
  • Quadratic programming

5
TFile and TDirectory
  • Very Large Files
  • Support on all platforms for 64-bit integers via
    the portable typedefs Long64_t and ULong64_t.
  • long long on Unix, __int64 with VC++
  • Support for files larger than 2 GB added in ROOT
    4.00
  • Files smaller than 2 GB are still readable by older
    versions of ROOT
  • Support for TTrees with more than 2^31 entries
  • Double32_t
  • Same as Double_t in memory
  • Same as Float_t on disk
  • Support automatic schema evolution to and from
    float and double
  • Warning: too many read/write cycles could result
    in some loss of precision
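
The Double32_t idea (double in memory, float on disk) can be illustrated without ROOT. The following is a minimal sketch, not ROOT's streaming code: writeAsFloat, readAsDouble and roundTrip are hypothetical helper names, and the "disk" is just a byte vector. It shows that the precision loss happens at the narrowing step of each write.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical stand-ins for the on-disk step: a double is narrowed to a
// float when written and widened back to double when read.
std::vector<unsigned char> writeAsFloat(double inMemory) {
    float onDisk = static_cast<float>(inMemory);   // narrowing: precision is lost here
    std::vector<unsigned char> bytes(sizeof onDisk);
    std::memcpy(bytes.data(), &onDisk, sizeof onDisk);
    return bytes;
}

double readAsDouble(const std::vector<unsigned char>& bytes) {
    assert(bytes.size() == sizeof(float));
    float onDisk;
    std::memcpy(&onDisk, bytes.data(), sizeof onDisk);
    return static_cast<double>(onDisk);            // widening: exact
}

// One full write/read cycle.
double roundTrip(double value) { return readAsDouble(writeAsFloat(value)); }
```

A value that is exactly representable as a float (such as 0.5) survives a cycle unchanged, while a value such as 0.1 comes back slightly different. A value that is recomputed in double precision between cycles is rounded again on the next write.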

6
XML output format
  • Update to the I/O classes to allow the
    customization of the backend.
  • Implemented for XML
  • Will be used for SQL support.
  • XML files allow the interchange of data with
    applications unable to read ROOT file directly
  • Example
  • Refer to Sergey Linev's presentation for more
    details
  • Extract from c.xml

<XmlKey name="c1" cycle="1">
  <Object class="TCanvas">
    <Version v="5"/>
    <TPad version="8">
      <TVirtualPad version="2">
        <TObject fUniqueID="0" fBits="3000008"/>
        <TAttLine version="1">
          <fLineColor v="1"/>
          <fLineStyle v="1"/>
          <fLineWidth v="1"/>
        </TAttLine>
        <TAttFill version="1">
          <fFillColor v="19"/>
          <fFillStyle v="1001"/>
        </TAttFill>

TCanvas c;
h.Draw();
c.SaveAs("c.xml");
c.SaveAs("c.root");
7
ROOT I/O History
  • Version 2.25 and older
  • Only hand coded and generated streamer function,
    Schema evolution done by hand
  • I/O requires ClassDef, ClassImp and CINT
    Dictionary
  • Version 2.26
  • Automatic schema evolution
  • Use TStreamerInfo (with info from dictionary) to
    drive a general I/O routine.
  • Version 3.03/05
  • Lift need for ClassDef and ClassImp for classes
    not inheriting from TObject
  • Any non TObject class can be saved inside a TTree
    or as part of a TObject-class
  • Version 4.00/00
  • Automatic versioning of Foreign classes
  • Version 4.00/08
  • Non TObject classes can be saved directly in
    TDirectory

[Timeline on the slide: 2000, 2001, 2002, 2004]
8
Foreign Objects
TBuffer layout: ... | Bytecount (4 bytes) | 0 (2 bytes) | checksum (4 bytes) | Object N | Bytecount | 0 | checksum | Object N+1 | ...
  • To save non instrumented classes
  • Need just the data dictionary
  • Default versioning provided by a Checksum based
    on the type and name of the persistent data
    members
  • Checksum stored as an additional 4 bytes
  • ClassDef advantages
  • The IsA function generated by ClassDef speeds up
    considerably the access to the TClass for a
    given object.
  • The version number (2 bytes maximum) consumes
    less space on disk than the 0 + checksum
  • New interface to store and retrieve object with
    Type Safety

    ptrclass *ptr;  directory->WriteObject(ptr, "name");
    ptrclass *ptr;  directory->GetObject("name", ptr);
    // ptr is 0 if the object is absent or of the wrong type
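
The default versioning described above (a checksum based on the type and name of the persistent data members) can be sketched as follows. This is an illustration only: the FNV-1a hash, MemberInfo, mix and classChecksum are assumptions made for the example and are not ROOT's actual checksum algorithm.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one persistent data member.
struct MemberInfo {
    std::string type;
    std::string name;
};

// Fold a string into a running 32-bit FNV-1a hash. A separator byte is mixed
// in after each string so that, for ordinary identifiers, {"int","x"} and
// {"in","tx"} do not feed the same byte sequence to the hash.
std::uint32_t mix(std::uint32_t hash, const std::string& text) {
    for (unsigned char c : text) {
        hash ^= c;
        hash *= 16777619u;
    }
    hash ^= 0xFFu;   // separator
    hash *= 16777619u;
    return hash;
}

// Illustrative class checksum: depends on the class name and on the type and
// name of every persistent member, in declaration order.
std::uint32_t classChecksum(const std::string& className,
                            const std::vector<MemberInfo>& members) {
    std::uint32_t hash = mix(2166136261u, className);
    for (const MemberInfo& m : members) {
        hash = mix(hash, m.type);
        hash = mix(hash, m.name);
    }
    return hash;
}
```

With such a scheme, renaming a member or changing its type yields a different checksum (barring hash collisions), which is what lets a class without ClassDef be versioned automatically.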
9
TClonesArray
  • Optimization of the number of calls to new and
    delete
  • Ability to split the collection of objects in a
    TTree
  • Improve compression and run-time
  • Ability to save object member-wise
  • Store the same data member of all the elements of
    the collections consecutively
  • Improve compression (buffer data more
    homogeneous)
  • Improve run-time (avoid n-1 tests of the data
    type)
  • Ability to use in TTree::Draw as a collection
  • Ability to read back without the original
    compiled code
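
The difference between object-wise and member-wise storage can be made concrete with a small sketch. Hit, Field and the three functions below are hypothetical names chosen for the example; a real buffer holds raw bytes, whereas here each value is tagged with its member so the ordering can be inspected.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type with two persistent members.
struct Hit {
    int   id;
    float energy;
};

// A flat record of what would go into the buffer, tagged by member:
// 'i' for id, 'e' for energy.
struct Field {
    char   member;
    double value;
};

// Object-wise: all members of element 0, then all members of element 1, ...
std::vector<Field> streamObjectWise(const std::vector<Hit>& hits) {
    std::vector<Field> out;
    for (const Hit& h : hits) {
        out.push_back({'i', static_cast<double>(h.id)});
        out.push_back({'e', static_cast<double>(h.energy)});
    }
    return out;
}

// Member-wise: the same member of every element is stored consecutively.
std::vector<Field> streamMemberWise(const std::vector<Hit>& hits) {
    std::vector<Field> out;
    for (const Hit& h : hits) out.push_back({'i', static_cast<double>(h.id)});
    for (const Hit& h : hits) out.push_back({'e', static_cast<double>(h.energy)});
    return out;
}

// Number of places where the member tag changes between neighbours; a lower
// count means longer homogeneous runs in the buffer.
int typeSwitches(const std::vector<Field>& fields) {
    int switches = 0;
    for (std::size_t i = 1; i < fields.size(); ++i)
        if (fields[i].member != fields[i - 1].member) ++switches;
    return switches;
}
```

For three hits, the object-wise layout alternates members at every step, while the member-wise layout changes member only once. Longer homogeneous runs are what the slide credits for better compression and fewer data-type tests.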

10
Old STL Container Support
  • For versions older than 4.00/00
  • Collection always stored object wise
  • Nesting of STL collections was extremely limited
  • No splitting was possible
  • STL containers stored using a generated function
  • One generated function per actual data member.
  • Compiled version of these functions required for
    writing and also for reading

void R__User_fList1(TBuffer &R__b, void *R__p, int)
{
   if (R__b.IsReading()) {
      // Reading: rebuild the vector element by element
      vector<THit> &fList1 = *(vector<THit> *)R__p;
      int R__n;
      fList1.clear();
      R__b >> R__n;
      fList1.reserve(R__n);
      for (int R__i = 0; R__i < R__n; R__i++) {
         THit R__t;
         R__t.Streamer(R__b);
         fList1.push_back(R__t);
      }
   } else {
      // writing
   }
}
11
New Container Support
  • New Abstract Interface
  • TVirtualCollectionProxy
  • Can be implemented for almost any collections
  • Allows
  • Splitting (for collections of homogeneous objects)
  • Use in Tree Query (with automatic looping)
  • Will allow
  • Member-wise streaming (as opposed to Object wise
    streaming)
  • Also
  • Arbitrary nesting of STL containers
  • Reading of STL containers without original code
    (Emulated mode)
  • Note: as of 4.00/08 only std::vector has proxies.
  • Early Prototype and fundamental Concepts by
    Victor Perevoztchikov

12
STL Support
  • Each STL container instance now has an associated
    TClass object
  • Several co-existing streaming implementations
  • Generated Streamer
  • For object-wise streaming
  • Fully respect custom allocators and comparators
  • Easier to implement and similar run-time cost to
    a templated solution
  • Templated Proxy (e.g. TVectorProxy)
  • For splitting and member-wise streaming
  • Fully respect custom allocators and comparators
  • Emulation Proxy (e.g. TEmulatedVectorProxy)
  • For reading without a compiled version
  • Allow easy sharing of ALL ROOT files that have no
    custom streamers.
  • Why not rely only on the Emulation Proxy?
  • Implementation difficulties
  • An emulation proxy acting on live STL object
    requires a few tricks and assumptions
  • memory footprint of the STL container object is
    (usually?) independent from the template
    parameter
  • List proxy would need a series of lists of
    increasing fixed size content (e.g.
    list<char[1024]>, list<char[2048]>)
  • Does not respect allocators and comparator
  • Templated proxy can be faster and more memory
    efficient.
  • The emulation layer might actually be implemented
    using alternative collections (if we assume it
    does not have to deal with real objects)

13
Container I/O Implementation
  • Any container can be summarized by the sequence
    of its contents addresses
  • Use TVirtualCollectionProxy::At via
    TVirtualCollectionProxy::operator[]
  • Pros
  • I/O Code completely independent of the collection
  • Reduced code duplication in TStreamerInfo
  • No run-time cost for TClonesArray
  • Cons
  • Implementation for containers with no random
    access iterator needs to cache the iterator.
  • Member-wise implementation
  • Member-wise/object-wise choice will be encoded in
    the version number of the STL collections
  • API will be provided to select member-wise or
    object-wise for data member that are STL
    collections
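
The idea that any container can be reduced to the sequence of its elements' addresses, and that containers without random access need a cached iterator, can be sketched as below. CollectionView, VectorView, ListView and sumInts are hypothetical names for this illustration; they are loosely inspired by the proxy concept and do not reproduce the ROOT classes.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical abstract interface: a container is seen as a sequence of
// element addresses.
class CollectionView {
public:
    virtual ~CollectionView() = default;
    virtual std::size_t Size() const = 0;
    virtual void* At(std::size_t index) = 0;
    void* operator[](std::size_t index) { return At(index); }
};

// Random-access container: At() is a direct lookup.
template <typename T>
class VectorView : public CollectionView {
public:
    explicit VectorView(std::vector<T>& v) : fVec(v) {}
    std::size_t Size() const override { return fVec.size(); }
    void* At(std::size_t index) override { return &fVec[index]; }
private:
    std::vector<T>& fVec;
};

// No random access: cache the last iterator so that sequential calls
// At(0), At(1), ... advance one step instead of restarting from begin().
template <typename T>
class ListView : public CollectionView {
public:
    explicit ListView(std::list<T>& l) : fList(l), fIter(l.begin()), fPos(0) {}
    std::size_t Size() const override { return fList.size(); }
    void* At(std::size_t index) override {
        if (index < fPos) { fIter = fList.begin(); fPos = 0; }  // rewind
        while (fPos < index) { ++fIter; ++fPos; }
        return &*fIter;
    }
private:
    std::list<T>& fList;
    typename std::list<T>::iterator fIter;
    std::size_t fPos;
};

// Container-independent routine: sums ints through the interface only.
int sumInts(CollectionView& view) {
    int total = 0;
    for (std::size_t i = 0; i < view.Size(); ++i)
        total += *static_cast<int*>(view[i]);
    return total;
}
```

The routine sumInts never names the concrete container, which mirrors the "I/O code completely independent of the collection" point. The cached iterator in ListView is only valid while the underlying list is not modified, which is one reason this approach is listed under the cons.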

14
TTree
  • TRef autoload
  • Added (optional) support for the auto-loading of
    branches referenced by a TRef object.
  • Generate one table of references to branches per
    entry
  • TRef::GetObject uses this table to find and load
    the branch containing the referenced object
  • To enable it, call tree->BranchRef()

    class Event { TClonesArray fTracks; TRef fLastTrack; };
    branch = tree.GetBranch("fLastTrack");
    branch->GetEntry(7);
    tlast = event->GetLastTrack();
  • TTree::GetUserInfo
  • Used to store with the TTree any user-defined
    object(s) that do not depend on the entry
    number
  • Examples
  • Luminosity, calibrations, etc.

    tree.GetUserInfo()->Add(myobject);
15
Copying a TTree
  • Very flexible simple copying tools allowing cut
    on
  • Number of entries
  • Number of branches
  • Selection of entries based on a formula
  • Usable for both TTree and TChain
  • Important simplification of the interface
  • Removed the requirement of explicitly setting the
    addresses for ALL the branches.

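[Diagram: a tree with 3 branches cloned into a tree with 2 branches]
    tree->SetBranchStatus(br, kFALSE);
    newtree = tree->CloneTree();
[Diagram: a tree with 3 branches copied with an entry selection]
    tree->CopyTree("fTracks.fPx<1.2");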
16
TTree Queries
  • Implemented Boolean expression optimization (&&
    and ||)
  • Rebinning now possible from the TTree data (via
    new histogram editor)
  • Improved TTree::Scan output (customization and
    array display)
  • Call to external functions
  • Free standing function or class static member
    function
  • Compiled or interpreted with Numerical arguments
    and Numerical return type
  • Example

    tree->Draw("TMath::Prob(var,5)");
17
TTree Queries
  • Support for Collections
  • TTreeFormula now treats any collection class
    which has a TVirtualCollectionProxy in exactly
    the same way as a TClonesArray
  • Automatically loops over the elements
  • Can access a specific element
  • Synchronized with other collections and arrays in
    the formulas
  • Connecting several TTrees
  • TChain adds more entries
  • TTree Friends adds (virtually) more branches
  • Prior to ROOT 4.00/08 correlation between Friends
    made only by entry number
  • This is a problem if Trees have semantically a
    different sequence of entries
  • Can now connect the Friend using an Index
  • For example Run Number/Event Number
  • Use abstract interface TVirtualIndex
  • Concrete implementation TTreeIndex

[Figure: a Main Tree with its User Tree friend, and an Indexed Main Tree with its User Tree friend; each tree is shown as columns of the values 1 and 2]
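
Matching friend entries by an index such as run number / event number, instead of by entry position, can be sketched as follows. Entry, TreeIndexSketch and friendValue are hypothetical names for this illustration; a std::map stands in for whatever structure the concrete index uses.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical entry of a tree: the (run, event) pair identifies the entry,
// value stands for whatever the branches hold.
struct Entry {
    int    run;
    int    event;
    double value;
};

// Illustrative index: maps (run, event) to the entry number in a tree.
class TreeIndexSketch {
public:
    explicit TreeIndexSketch(const std::vector<Entry>& tree) {
        for (std::size_t i = 0; i < tree.size(); ++i)
            fIndex[std::make_pair(tree[i].run, tree[i].event)] = static_cast<long>(i);
    }
    // Entry number holding (run, event), or -1 if the tree has no such entry.
    long GetEntryNumber(int run, int event) const {
        auto it = fIndex.find(std::make_pair(run, event));
        return it == fIndex.end() ? -1 : it->second;
    }
private:
    std::map<std::pair<int, int>, long> fIndex;
};

// For entry mainEntry of the main tree, fetch the friend's value by index
// rather than by position. Returns false when the friend has no match.
bool friendValue(const std::vector<Entry>& mainTree, std::size_t mainEntry,
                 const std::vector<Entry>& friendTree,
                 const TreeIndexSketch& friendIndex, double& out) {
    const Entry& m = mainTree[mainEntry];
    long n = friendIndex.GetEntryNumber(m.run, m.event);
    if (n < 0) return false;
    out = friendTree[static_cast<std::size_t>(n)].value;
    return true;
}
```

With this lookup the two trees may store their entries in different orders, or the friend may lack some entries, and the correlation still follows the run and event values.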
18
The MakeClass Revolution
  • Current Fast Analysis Frameworks
  • TTree::Draw
  • Fast histogramming
  • Load branch on Demand
  • Only simple expressions
  • MakeCode
  • C-Style
  • Obsolete
  • MakeClass
  • Flat representation of the tree
  • Difficulties with variable size arrays
  • Branch loaded explicitly
  • MakeSelector
  • Proof Ready
  • Flat representation of the tree
  • Difficulties with variable size arrays
  • Branch loaded explicitly
  • Elegant Replacement for MakeClass/MakeSelector
  • Currently named MakeProxy
  • Creates a C++ context where branch names
    (including periods) can be used as variables
  • On demand loading of branches
  • Respect/recreate the original class structure
  • Array bound check
  • Use the user's shared libraries (when available)

19
MakeProxy Examples
  • TTree::Draw of a script
  • Implemented using MakeProxy
  • Enables complex looping
  • Allows calls to any C++ functions or member
    functions!
  • Still provides on-demand loading of the branches
  • And allows any arbitrary C++

    tree->Draw("hsimple.C");

    Double_t hsimple() {
       int last = fTracks.GetLast();
       for (int i = 1; i < last-1; i++)
          htemp->Fill(fTracks.fPt[i] - fTracks.fPt[i-1]);
       return fTracks.fPt[last] - fTracks.fPt[last-1];
    }
20
File types Access in 4.01/xx
[Diagram: the user reaches data through two paths]
  • TFile, TKey/TTree, TStreamerInfo: Local File X.root,
    Local File X.xml, http, rootd/xrootd, Castor,
    Dcache, RFIO, Chirp
  • TTreeSQL, TSQLServer, TSQLRow, TSQLResult: Oracle,
    MySQL, PgSQL, SapDb
21
New RDBMS interface Goals
  • Access any RDBMS tables from TTree::Draw
  • Create a Tree in split mode -> creating an RDBMS
    table and filling it.
  • The table can be processed by SQL directly.
  • The interface uses the normal I/O engine,
    including support for Automatic Schema Evolution.

22
New RDBMS Interface
  • Current prototype
  • Simple TTree (branch with leaf list)
  • Implemented via TSQLxxx for reading and writing
  • Implemented via RDBC for reading
  • See http://carrot.cern.ch/onuchin/RDBC/
  • Should be released in December 2004.
  • Should be expanded to support branch of objects
  • Need to implement a way to store and retrieve
    TStreamerInfo(s) and TProcessID(s) in the
    database
  • Will probably use SQL binary blob to store
    non-split objects.

23
RDBMS Examples
Connect to an existing db (TTree query style converted to SQL):
    TTreeSQL tree(const char *db, const char *uid, ...);
    tree.Print(); tree.Browse(); tree.Scan(); etc.
    tree.Draw("var1:var2", "varx<0");

Create the database on the server:
    TTreeSQL tree("mysql://localhost/test", "nobody", "new");
    Event *event = new Event;
    tree.Branch("top", "Event", &event);
    tree.Fill();
    tree.AutoSave();

Columns are created using the normal split algorithm. Blobs are created below the split level.
A TSQLRow is filled and sent to the server.
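
For the simple case named in the prototype (a branch with a leaf list), the step from leaves to table columns can be sketched as below. This is an assumption-laden illustration, not the TTreeSQL implementation: sqlTypeFor and createTableFor are hypothetical helpers, the SQL type names are chosen for the example, and only the I, F and D leaf type codes are handled.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical mapping from a leaf type code to an SQL column type.
// Only three codes are handled; anything else falls back to BLOB.
std::string sqlTypeFor(const std::string& code) {
    if (code == "I") return "INT";
    if (code == "F") return "FLOAT";
    if (code == "D") return "DOUBLE";
    return "BLOB";
}

// Turn a leaf list such as "run/I:pt/F" into a CREATE TABLE statement with
// one column per leaf. Simplification: a leaf without "/code" is treated as
// a float here.
std::string createTableFor(const std::string& table, const std::string& leafList) {
    std::ostringstream sql;
    sql << "CREATE TABLE " << table << " (";
    std::istringstream leaves(leafList);
    std::string leaf;
    bool first = true;
    while (std::getline(leaves, leaf, ':')) {
        if (leaf.empty()) continue;
        std::string::size_type slash = leaf.find('/');
        std::string name = leaf.substr(0, slash);
        std::string code = (slash == std::string::npos) ? "F" : leaf.substr(slash + 1);
        if (!first) sql << ", ";
        sql << name << " " << sqlTypeFor(code);
        first = false;
    }
    sql << ")";
    return sql.str();
}
```

For example, createTableFor("events", "run/I:pt/F:mass/D") produces a statement with three typed columns. Once the data sit in such a table, they can be processed by SQL directly, as stated in the goals.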
24
Future Plans for I/O and TTree
  • Implement member-wise storing for std::vector
    (late 2004)
  • Implement TVirtualCollectionProxy for each of the
    STL containers (late 2004, early 2005)
  • Add support for auto loading of TRef branches
    across trees
  • TChain, TTree Friends and Indexing
  • Add support for befriending TChain objects
    using an Indexed relation
  • TTree Queries
  • Allow following (transparently) TRef and
    TRefArray

25
Summary
  • TFile improvement
  • Large files and trees, Double32_t, XML output
    format.
  • Support for non-instrumented classes
  • Enhancement in I/O and Tree Query for collection
  • Split Collections
  • Fast histogramming of (potentially) any
    collection
  • Lift restrictions on STL I/O
  • Nested containers
  • Reading without compiled code
  • TTree
  • Remove stringent requirements on CloneTree
  • Add support for auto loading of referenced
    objects
  • Support for RDBMS databases back-end coming soon.
  • TTree Queries
  • Can call any functions taking numerical arguments
  • Can use arbitrary C++ and still use the branch
    names as variables
  • TTree Friend linked by Index