Title: CHEP 2000
1CHEP 2000
Data Handling in KLOE I.Sfiligoi INFN LNF,
Frascati, Italy
2The KLOE experiment
KS → π+π-, KL → π+π- (CP not conserved)
- at the DAFNE φ-factory
- main goal
- CP violation study
- other interesting fields
- kaon form factors
- kaon rare decays
- radiative φ decays
KS → π+π-, KL → 3π0 → 6γ
3KLOE Requirements
- Data acquisition (at full DAFNE luminosity)
- 10^11 events per year acquired
- 50 MB/s sustained throughput
- Computing power
- ALL the events need to be reconstructed
- Storage requirements
- one petabyte of raw and reconstructed events
- hundreds of megabytes of related data (configurations, slow control data, calibration parameters, etc.)
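A back-of-the-envelope check of these figures, as a sketch only. It assumes decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes) and, purely for illustration, that the one-petabyte total corresponds to about 10^11 events; neither assumption is stated on the slide.

```python
# Rough consistency check of the requirement figures quoted above.
# Assumptions (not from the slides): decimal units, and that the
# one-petabyte total corresponds to about 10^11 events.
EVENTS = 10**11            # events per year at full luminosity
TOTAL_BYTES = 10**15       # one petabyte of raw + reconstructed events
THROUGHPUT = 50 * 10**6    # 50 MB/s sustained, in bytes per second

bytes_per_event = TOTAL_BYTES / EVENTS          # average size per event
seconds_to_write = TOTAL_BYTES / THROUGHPUT     # time to move 1 PB at 50 MB/s
days_to_write = seconds_to_write / 86400

print(f"average event size: {bytes_per_event / 1000:.0f} kB")
print(f"time to move 1 PB at 50 MB/s: {days_to_write:.0f} days")
```

Under these assumptions the average event (raw plus reconstructed) is of the order of 10 kB, and moving a petabyte at the sustained rate takes most of a year.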
4KLOE computing environment
- Based on a set of medium-sized servers
- Connected using commercial switched networks (Fast Ethernet and Gigabit Ethernet)
- Heterogeneous environment, several platforms
- IBM AIX on PowerPC
- Sun Solaris on Sparc
- Compaq Tru64 Unix on Alpha
- HP-UX on PA-RISC
5KLOE storage pool
- Different policies for different types of data
- raw and reconstructed events on tape libraries, with big disk pools for data caching
- related data managed by a disk-based database system
- analysis output on disk pools
6Disk pools
- Four categories of disk pools are present
- each data acquisition node in the farm has its own small disk pool
- computing nodes write their output to centralized, NFS mounted disk pools
- separate disk pools are used as a cache for the events on tape
- analysis output is written to its own, central AFS mounted disk pool
7Tape library
- Several automated tape libraries supported (at the moment the 5500 slot tape library is partitioned between two tape servers)
- Accessed using commercial software
- IBM ADSM with the current tape library
8KLOE software
- Three distinct categories
- DAQ (or online): ANSI C
- reconstruction and analysis (or offline): FORTRAN inside A_C
- Monte Carlo: FORTRAN
The interface to the Data Handling System must be
compatible with all of them
9KLOE Data Handling System
- Composed of four elements
- Database System
- Archiving System
- Spy System
- KLOE Integrated Dataflow (KID)
10KLOE Data Handling System
- A mix of commercial and custom software
- the dependency on commercial software is
minimized by the layers of custom software
- commercial software carries out all the vital functions
- custom software mostly extends and coordinates
the functionality of the commercial software
11KLOE Data Handling System
- Based on a set of multi-threaded non-privileged daemons and related libraries
- Distributed across several nodes
- Communication by means of TCP/IP sockets on high ports
- bypasses TCP/IP filtering
- flexible, programming language and operating system independent
- no configuration needed on the client side
13Database System
- Two distinct database systems are used
- offline database: based on HepDB, data stored as ZEBRA banks
- online database: based on a Relational DBMS, data are structured in fields
- extended for distributed environments
14Online Database System
- data stored in a Relational DBMS
- IBM DB2 Universal Database at the moment
- communication between the clients (user
applications) and the RDBMS through a database
daemon
15Database Daemon
- The database daemon is the only link between the applications and the RDBMS
- if the RDBMS is changed in the future, only the database daemon will need to be changed
- Different kinds of commands are managed by the daemon
- general SQL commands
- KLOE specific commands
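The two kinds of commands can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the command names (`NEW_RUN`, `RUN_CONFIG`), the text request format and the `runs` table are hypothetical, `sqlite3` merely stands in for the RDBMS, and the socket layer is left out.

```python
import sqlite3

class DatabaseDaemon:
    """Toy stand-in for the database daemon: the only code that talks to the RDBMS."""

    def __init__(self):
        # sqlite3 stands in for the real RDBMS (an assumption for this sketch).
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE runs (run_nr INTEGER PRIMARY KEY, config TEXT)")
        # Hypothetical KLOE-specific commands mapped to handler methods.
        self._specific = {"NEW_RUN": self._new_run, "RUN_CONFIG": self._run_config}

    def handle(self, request):
        """Dispatch one text request: 'SQL <statement>' or '<COMMAND> <args...>'."""
        verb, _, rest = request.partition(" ")
        if verb == "SQL":
            return self._db.execute(rest).fetchall()    # general SQL command
        if verb in self._specific:
            return self._specific[verb](*rest.split())  # specific command
        raise ValueError(f"unknown command: {verb}")

    def _new_run(self, run_nr, config):
        # Extra check that a raw SQL INSERT would not enforce by itself.
        if int(run_nr) <= 0:
            raise ValueError("run number must be positive")
        self._db.execute("INSERT INTO runs VALUES (?, ?)", (int(run_nr), config))
        return "OK"

    def _run_config(self, run_nr):
        row = self._db.execute(
            "SELECT config FROM runs WHERE run_nr = ?", (int(run_nr),)
        ).fetchone()
        return row[0] if row else None


daemon = DatabaseDaemon()
daemon.handle("NEW_RUN 1001 cosmic")
print(daemon.handle("RUN_CONFIG 1001"))
print(daemon.handle("SQL SELECT COUNT(*) FROM runs"))
```

Because clients only ever send requests to `handle`, replacing the backend would mean rewriting this one class, which is the property the slide describes.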
17Database Daemon
- The use of KLOE specific commands has several advantages
- additional checks and restrictions are possible
- data consistency management is centralized
- fast central caches can be implemented
- for example, the DAQ configuration cache reduces the typical access time from 4 to 0.1 s
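A central cache of this kind might look like the sketch below. It is not the KLOE implementation: the class, the expiry policy and the `slow_fetch` stand-in are hypothetical, and the sketch makes no attempt to reproduce the quoted timings.

```python
import time

class ConfigCache:
    """Keep recently fetched configurations in memory inside the daemon."""

    def __init__(self, fetch, max_age_s=60.0):
        self._fetch = fetch          # slow function that queries the RDBMS
        self._max_age_s = max_age_s  # how long a cached entry stays valid
        self._entries = {}           # key -> (timestamp, value)
        self.misses = 0

    def get(self, key):
        now = time.monotonic()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._max_age_s:
            return hit[1]                     # served from memory
        self.misses += 1
        value = self._fetch(key)              # fall back to the slow path
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key):
        """Called by the daemon when the underlying data changes."""
        self._entries.pop(key, None)


def slow_fetch(name):
    # Hypothetical stand-in for a multi-table RDBMS query.
    return {"name": name, "trigger": "default"}

cache = ConfigCache(slow_fetch)
first = cache.get("daq_config")
second = cache.get("daq_config")
print(cache.misses)
```

Since every update also goes through the daemon, it can call `invalidate` at the right moment, which is why such a cache is easier to keep consistent centrally than in each client.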
18A light version
- The RDBMS is used to ensure flexibility, reliability and performance
- Demanding in terms of computing resources and management effort
- stand-alone environments often cannot afford it
- An RDBMS-independent version of the database daemon is under development
19A light version
- An RDBMS-independent version of the database daemon is under development
- limited to KLOE specific commands and the most frequently used SQL commands
- based on the use of flat files containing a small portion of the data
- not suitable for a production environment, but enough for home use
21KLOE Archiving System
- Expected event data managed by KLOE
- 1 PB
- Tape libraries needed
- data storage and retrieval non-trivial
- random access to data very inefficient
- Disk-based intermediate buffers used
22KLOE Archiving System
- Two types of intermediate buffers
- DAQ, offline and Monte Carlo output are structured as YBOS files and written on their disk output areas
- event data needed by offline as input are read from the archiving system disk-cache
23KLOE Archiving System
- Data needs to be migrated
- from output areas to the tape library
- as soon as possible (taking into account also efficiency concerns)
- from the tape library to the disk cache
- when an application needs it (or even better, a bit earlier)
- Migration is totally automated and transparent to the applications
24KLOE Archiving System
- The Archiving System is made of four components
- storage managers (archADSM)
- disk space managers
  - output areas (spacekeeper)
  - cache areas (filekeeper)
- archival director (archiver)
- cache manager (retrieve)
- Communication by means of TCP/IP sockets
- Coordinated by the online database
25Storage Managers
- One for each logical tape library
- Allows
- queries about tape library content
- file archival
- file retrieval
- Transaction oriented (if the underlying tape library software supports it)
26Storage Managers
- The only link between the tape library and the rest of the system
- interface independent of the underlying archiving software
- IBM ADSM is used with the current tape library
- if another product is used in the future, only a specific storage manager will need to be developed
27Disk Space Managers
- One for each disk pool
- Create and delete files
- unused files get deleted to make space for new
ones
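The cleanup rule can be sketched as simple bookkeeping. The slide does not say which unused files are deleted first; deleting the least recently used ones is an assumption of this sketch, and the class and method names are hypothetical.

```python
from collections import OrderedDict

class DiskPool:
    """Toy bookkeeping for one disk pool (no real files are touched)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()   # name -> size, least recently used first
        self.in_use = set()          # files currently opened by applications

    def used(self):
        return sum(self.files.values())

    def touch(self, name):
        self.files.move_to_end(name)  # mark as most recently used

    def create(self, name, size):
        """Reserve space for a new file, deleting unused files if necessary."""
        deleted = []
        for victim in list(self.files):
            if self.used() + size <= self.capacity:
                break
            if victim not in self.in_use:
                del self.files[victim]
                deleted.append(victim)
        if self.used() + size > self.capacity:
            raise OSError("not enough space even after cleanup")
        self.files[name] = size
        return deleted


pool = DiskPool(capacity=100)
pool.create("a", 40)
pool.create("b", 40)
pool.touch("a")                 # "a" was read again, so "b" is now the oldest
pool.in_use.add("a")
print(pool.create("c", 40))
```

Here file "b" is removed to make room for "c", while "a" survives because it is both recently used and still open.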
28Archival Director
- Fully automated
- Works in polling mode
- from time to time looks for files ready to be archived
- starts archiving only when enough data is available
- Files are ordered and grouped to minimize the expected retrieve time
- Several groups of files can be archived in parallel
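One polling cycle of such a director could be sketched as follows. The threshold parameter, the file tuple layout and the choice of grouping by run number are all assumptions made for illustration; the slide does not state the actual ordering criterion.

```python
from collections import defaultdict

def plan_archival(ready_files, min_bytes):
    """Return groups of file names to archive, or [] if too little data is waiting.

    ready_files: list of (name, run_nr, size_bytes) tuples found by one poll.
    Grouping by run number is an assumption of this sketch: files that are
    likely to be read back together are written next to each other.
    """
    if sum(size for _, _, size in ready_files) < min_bytes:
        return []                      # wait for the next polling cycle
    groups = defaultdict(list)
    for name, run_nr, _ in sorted(ready_files, key=lambda f: (f[1], f[0])):
        groups[run_nr].append(name)
    return [groups[run] for run in sorted(groups)]


poll_result = [
    ("raw_1002_a", 1002, 600),
    ("raw_1001_b", 1001, 500),
    ("raw_1001_a", 1001, 700),
]
print(plan_archival(poll_result, min_bytes=2000))
print(plan_archival(poll_result, min_bytes=1000))
```

Each returned group is independent of the others, so the groups could be handed to separate workers and archived in parallel.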
29Cache Manager
- User driven
- when a file is needed, the application asks the cache manager where it is located
- a retrieve is performed by the manager if needed
- Several requests can be issued at the same time
- the manager reorders them internally to minimize the tape mounts
- Communication by means of TCP/IP sockets
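The reordering idea can be illustrated with a small sketch. The catalog layout (tape label and position per file) and the function names are hypothetical; the real manager's internal policy is not described on the slide.

```python
from itertools import groupby

def order_requests(requests, catalog, cached):
    """Split requests into cache hits and a tape-ordered retrieve list.

    catalog: file name -> (tape label, position on tape); hypothetical layout.
    cached:  set of file names already present in the disk cache.
    """
    hits = [f for f in requests if f in cached]
    misses = [f for f in requests if f not in cached]
    # Sort by tape first, then by position, so each tape is mounted once
    # and read front to back.
    misses.sort(key=lambda f: catalog[f])
    return hits, misses

def count_mounts(files, catalog):
    """Number of tape mounts needed to read files in the given order."""
    return sum(1 for _ in groupby(files, key=lambda f: catalog[f][0]))


catalog = {"f1": ("T2", 5), "f2": ("T1", 9), "f3": ("T2", 1), "f4": ("T1", 3)}
requests = ["f1", "f2", "f3", "f4"]
hits, retrieves = order_requests(requests, catalog, cached={"f2"})
print(hits, retrieves)
print(count_mounts(["f1", "f4", "f3"], catalog), count_mounts(retrieves, catalog))
```

In this example serving the uncached files in arrival order would need three mounts, while the reordered list needs two.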
30KLOE Archival System
[Figure: block diagram of the archiving system. Components shown: archiver, retrieve, DB, several archADSM instances with their Tape Libraries, and several spacekeeper and filekeeper instances with their Disk Pools. Link types in the legend: NFS mount, local file system, TCP/IP socket.]
32Spy System
- KLOE data acquisition software allows the event data to be read out before they get written to disk
- The mechanism that reads those data is called Spy
- Based on use of shared memory buffers
- DAQ processes are piped using this mechanism
- the spy system reads data from the buffers without interfering with the DAQ
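The non-interfering read can be illustrated in a few lines. This is a single-process sketch with hypothetical names; the real buffers live in shared memory between separate DAQ processes, and the locking that would require is omitted here.

```python
class EventBuffer:
    """Fixed-size circular buffer standing in for one shared memory buffer."""

    def __init__(self, slots):
        self._slots = [None] * slots
        self._written = 0    # total events written by the upstream DAQ process
        self._consumed = 0   # total events taken by the downstream DAQ process

    def put(self, event):
        if self._written - self._consumed == len(self._slots):
            raise BufferError("buffer full")
        self._slots[self._written % len(self._slots)] = event
        self._written += 1

    def get(self):
        """Downstream DAQ process: consume the oldest event."""
        if self._consumed == self._written:
            raise BufferError("buffer empty")
        event = self._slots[self._consumed % len(self._slots)]
        self._consumed += 1
        return event

    def spy(self):
        """Spy: copy the most recent event without touching either cursor."""
        if self._written == 0:
            return None
        return self._slots[(self._written - 1) % len(self._slots)]


buf = EventBuffer(slots=4)
buf.put("event-1")
buf.put("event-2")
print(buf.spy())   # the spy sees the latest event
print(buf.get())   # the DAQ chain still receives every event, in order
print(buf.get())
```

The point of the design is that `spy` never modifies the write or read cursor, so the DAQ chain behaves the same whether or not anyone is watching.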
34KLOE Integrated Dataflow (KID)
- Integration library
- database accesses and retrieve operations hidden
- Offers a single point of access to all the services
- URI-based selection
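URI-based selection behind a single entry point could be sketched as below. The scheme names (`file`, `spy`, `archive`), the query parameter and the returned tuples are hypothetical and chosen only for illustration; they are not the actual KID syntax.

```python
from urllib.parse import urlsplit, parse_qs

def open_file(parts):
    return ("local file", parts.path)

def open_spy(parts):
    return ("live events from DAQ node", parts.netloc)

def open_archive(parts):
    # In the real system this step would involve the database and the cache manager.
    run = parse_qs(parts.query)["run"][0]
    return ("archived events for run", int(run))

# Hypothetical scheme names, chosen only for this sketch.
OPENERS = {"file": open_file, "spy": open_spy, "archive": open_archive}

def open_source(uri):
    """Single entry point: pick the right service from the URI scheme."""
    parts = urlsplit(uri)
    try:
        opener = OPENERS[parts.scheme]
    except KeyError:
        raise ValueError(f"unsupported scheme in {uri!r}") from None
    return opener(parts)


print(open_source("file:///data/run1001.ybos"))
print(open_source("spy://daqnode1/"))
print(open_source("archive:///raw?run=1001"))
```

The application only changes the string it passes in; which daemons are contacted behind the scenes stays hidden, as the slide describes.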
35Management effort
- The entire system is managed by only a few people
- 3 people (2 full time) are engaged in KLOE computing system management (including storage)
- 1 person is engaged in the development and management of the online database and the archiving system
- 2 people spend a few percent of their time on the maintenance of the offline database