Title: XDAQ - Real Scale Application Scenarios
1 XDAQ - Real Scale Application Scenarios
Johannes Gutleber and Luciano Orsini for the CMS TriDAS collaboration
CERN, European Organization for Nuclear Research, Division EP/CMD
http://cern.ch/xdaq
CHEP - March, 2003
2 Outline
- XDAQ
  - A software product line for DAQ
  - Description of the asset base
  - Re-usable application components (distributed event builder)
- Real scale application scenarios
  - Tracker test setup
  - Muon chamber production and validation
- Further application scenarios
- Summary and conclusions
3 Software Product Lines
"A product line is a set of software-intensive systems sharing a common, managed set of features that satisfy the specific needs of a particular market segment or mission and that are developed from a common set of core assets in a prescribed way."
P. Clements and L. Northrop, Software Product Lines, Addison-Wesley, 2001
XDAQ is a software product line for data acquisition, designed for the CMS experiment at CERN.
4 Scope of XDAQ
- Environment for data acquisition applications
  - communication over multiple network technologies concurrently
    - e.g. input on Myrinet, output on TCP/IP over Ethernet
  - configuration (parametrization) and control
    - protocol and bookkeeping of information
  - cross-platform deployment
    - write once, use on every supported platform (Unix, RTOS)
  - high-level provision of system services
    - memory management, synchronized queues, tasks
  - built-in efficiency enablers
    - zero-copy and buffer loaning schemes usable by everyone (see the sketch below)
- Aim at creating interoperable DAQ systems
  - ECAL, HCAL, Tracker, Muon local DAQs commonly managed
- Gain a common understanding of the problem domain
  - terms, use-cases, priorities laid down in common documentation
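The buffer-loaning idea can be made concrete with a small sketch: a producer hands a buffer to consumers without copying, and the buffer returns to its owner when the last user releases it. All class and function names below are invented for illustration; they are not the actual XDAQ interfaces.

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <memory>
#include <vector>

// Owner of reusable buffers. loan() hands a buffer out with a custom deleter
// that recycles the buffer instead of freeing it, so payloads travel between
// components by reference (zero-copy) and are returned rather than leaked.
class BufferOwner {
public:
    ~BufferOwner() { for (auto* b : idle_) delete b; }

    std::shared_ptr<std::vector<uint8_t>> loan(std::size_t size) {
        std::vector<uint8_t>* buf;
        if (idle_.empty()) {
            buf = new std::vector<uint8_t>(size);
        } else {
            buf = idle_.back();
            idle_.pop_back();
            buf->resize(size);
        }
        return {buf, [this](std::vector<uint8_t>* b) { idle_.push_back(b); }};
    }

private:
    std::vector<std::vector<uint8_t>*> idle_;  // buffers awaiting the next loan
};

int main() {
    BufferOwner owner;
    {
        auto frame = owner.loan(2048);  // producer fills a 2 kB buffer
        (*frame)[0] = 0xCA;
        auto view = frame;              // consumer shares it, no copy is made
        std::cout << "references: " << view.use_count() << "\n";
    }                                   // last release loans the buffer back
}
```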
5 XDAQ Based DAQ Systems
[Diagram: product line requirements feed a manufacturing (prescription) step that draws on the XDAQ asset base - core architecture, application components, configuration management infrastructure - to produce a customized DAQ system. Newly created artifacts (components, documents) that can be generalized may enter the asset base.]
6 XDAQ Asset Base (1)
- Asset (provided): Requirements common to all DAQ systems (see "Towards a Homogeneous Architecture for DAQ Systems", Comput. Phys. Commun., 2003).
  Product (to be done): additional requirements for specific application scenarios (e.g. interface to readout systems, monitoring requirements, collaboration with commercial control systems).
- Asset: Configuration Management Plan (CMP) - software organization, build, installation and release procedures (including cross-platform support).
  Product: creation of document and software repositories (e.g. in CVS); creation of a technical support and tracking environment as outlined in the CMP (e.g. through Sourceforge.net).
- Asset: Core architecture (distributed data acquisition; see also "Software Architecture for Processing Clusters Based on I2O", Cluster Computing: J. Netw. Softw. Tools Appl., 5(1):55-65, 2002).
  Product: adopted unchanged for new products (new features introduced for products may eventually end up in the architecture core).
7 XDAQ Asset Base (2)
- Asset (provided): Application components (C++).
  Product (to be done): used and augmented -
  - distributed event builder: interface to readout library, trigger and event data store
  - database access (Oracle, MySQL): extended for non-supported DBMSs
  - control and monitoring: extension with additional commands
  - transport components (Myrinet, Ethernet, HTTP, non-blocking TCP): deployed unchanged, configured at run time according to needs
- Asset: Services (logging, configuration from database).
  Product: to be used unchanged as they become available.
- Asset: Test components.
  Product: used unchanged.
- Asset: Manuals (component description and product creation companion); operational procedures (used unchanged).
  Product: manuals for the created DAQ system may use existing manuals as templates; procedures need adaptation.
8 Distributed Event Builder
- Readout Units (RU): buffer event fragments
- Event fragments: event data fragments are stored in separate physical memory systems
- Event Manager (EVM): interfaces between RU, BU and trigger
- Full events: full event data are stored in a single physical memory system associated to a processing unit
- Builder Units (BU): assemble event fragments
Requirements: L1 trigger 100 kHz (@ 2 kB), event size 1 MB, 200 MB/s in AND out per RU, 200 MB/s in AND 66 MB/s out per BU.
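As a consistency check on these requirements: 100 kHz x 2 kB = 200 MB/s, which is exactly the sustained input and output bandwidth demanded of each readout unit; and at 2 kB per fragment, a 1 MB event is assembled from on the order of 500 fragments, one per readout unit.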
10 XDAQ Cluster Based Systems
A cluster consists of a collection of interconnected whole computers. Each cluster node runs an XDAQ executive process that, at run time, is extended with application components. These components use the interfaces of the XDAQ executive for communication, configuration and memory management purposes.
[Diagram: a cluster of XDAQ nodes managed by controllers]
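The executive/component split can be illustrated with a minimal sketch: one executive per node exposes communication, configuration and memory services, and components are plugged into it at run time. All interfaces below are invented for illustration and are far simpler than the real XDAQ executive.

```cpp
#include <iostream>
#include <memory>
#include <string>
#include <vector>

struct ExecutiveServices {                    // what the executive offers
    virtual void send(const std::string& dest, const std::string& msg) = 0;
    virtual std::string parameter(const std::string& name) const = 0;
    virtual ~ExecutiveServices() = default;
};

struct ApplicationComponent {                 // what a plug-in must implement
    virtual void configure(ExecutiveServices& services) = 0;
    virtual ~ApplicationComponent() = default;
};

class Executive : public ExecutiveServices {
public:
    void plug(std::unique_ptr<ApplicationComponent> c) {
        c->configure(*this);                  // extend the executive at run time
        components_.push_back(std::move(c));
    }
    void send(const std::string& dest, const std::string& msg) override {
        std::cout << "-> " << dest << ": " << msg << "\n";  // stand-in transport
    }
    std::string parameter(const std::string& name) const override {
        return name == "runNumber" ? "42" : "";  // stand-in configuration store
    }
private:
    std::vector<std::unique_ptr<ApplicationComponent>> components_;
};

struct ReadoutUnitStub : ApplicationComponent {
    void configure(ExecutiveServices& s) override {
        s.send("EVM", "ready, run " + s.parameter("runNumber"));
    }
};

int main() {
    Executive exec;                           // one executive per host computer
    exec.plug(std::make_unique<ReadoutUnitStub>());
}
```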
11 Cluster Configuration
A cluster configuration is organized hierarchically (a sketch of such a file follows below):
- Cluster
  - Partition
    - Definitions: application class names, peer transports
    - Scripts (automated execution of commands in Tcl)
    - Host computer: URL, address
      - Application (default parameters, paths to other applications)
      - Transport (default parameters)
    - Module URL (URLs to locations of applications, transports)
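To make the hierarchy concrete, here is a hypothetical partition file, embedded as a C++ raw string for illustration. Element and attribute names are invented to mirror the items above; they are not the real XDAQ schema.

```cpp
// Hypothetical cluster configuration; names mirror the slide, not XDAQ's schema.
const char* kPartition = R"xml(
<Cluster>
  <Partition>
    <Definitions application="ReadoutUnit" peerTransport="PeerTransportTCP"/>
    <Script language="Tcl" url="file:///daq/scripts/startRun.tcl"/>
    <Host url="http://node01.cern.ch:1972" address="10.0.0.1">
      <Application class="ReadoutUnit" instance="0">
        <Parameter name="bufferCount" value="1024"/>
      </Application>
      <Transport class="PeerTransportTCP" port="40000"/>
    </Host>
    <Module url="file:///daq/lib/libReadoutUnit.so"/>
  </Partition>
</Cluster>
)xml";
```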
12Readout Unit
- Readout Unit (RU)Application component buffers
data fragments belonging to an event from
readout electronics - interface to detector front-end
- internal buffers for event fragments
- interface to event manager for tagging
read-out data - interface to builder unit to serve event data
upon request
Peer transport (PT)TCP, Myrinet,
XDAQ executive(one per host computer)
13 Builder Unit
- Builder Unit (BU): application component that requests data fragments belonging to the same event from readout units, combines them and serves them to further processing components
  - interface to event manager for signalling availability of resources to build events
  - interface to readout units to request event data
  - interface to data storage/filter unit services to serve requests for further processing
  - buffers for chaining event fragments from readout units into complete events
- Peer transport (PT): TCP, Myrinet, ...
- XDAQ executive (one per host computer)
14Event Manager
Peer transport (PT)TCP, Myrinet,
XDAQ executive(one per host computer)
- Event Manager (EVM)Application component
interacts with the trigger subsystem and
obtainsinformation about events for
identification purposes. - interface to trigger to provide credits and to
receive trigger information - interface to builder units for accepting event
readout requests - interface to readout units for providing tags
for event data
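The interplay of the three components on the last three slides can be summarized in a single-process sketch: the EVM tags triggered events and hands a tag to a BU, which pulls the matching fragment from every RU and chains them into a full event. In reality these are separate processes exchanging I2O messages over peer transports; all names here are illustrative.

```cpp
#include <cstdint>
#include <deque>
#include <iostream>
#include <map>
#include <vector>

using EventTag = uint32_t;
using Fragment = std::vector<uint8_t>;

struct ReadoutUnit {                     // buffers fragments keyed by event tag
    std::map<EventTag, Fragment> buffer;
    void readout(EventTag tag, Fragment f) { buffer[tag] = std::move(f); }
    Fragment serve(EventTag tag) {       // BU pulls the fragment on request
        Fragment f = std::move(buffer.at(tag));
        buffer.erase(tag);
        return f;
    }
};

struct EventManager {                    // tags triggers, hands tags to BUs
    std::deque<EventTag> pending;
    void trigger(EventTag tag) { pending.push_back(tag); }
    bool allocate(EventTag& tag) {       // BU signals a free build resource
        if (pending.empty()) return false;
        tag = pending.front(); pending.pop_front();
        return true;
    }
};

struct BuilderUnit {                     // assembles full events from all RUs
    std::vector<ReadoutUnit>& rus;
    EventManager& evm;
    BuilderUnit(std::vector<ReadoutUnit>& r, EventManager& e) : rus(r), evm(e) {}
    void buildOne() {
        EventTag tag;
        if (!evm.allocate(tag)) return;  // request an event to build
        Fragment event;                  // chain fragments into a full event
        for (auto& ru : rus) {
            Fragment f = ru.serve(tag);
            event.insert(event.end(), f.begin(), f.end());
        }
        std::cout << "built event " << tag << " (" << event.size() << " B)\n";
    }
};

int main() {
    std::vector<ReadoutUnit> rus(4);     // matches the 4-RU sample setup
    EventManager evm;
    BuilderUnit bu(rus, evm);
    for (EventTag tag = 0; tag < 3; ++tag) {
        evm.trigger(tag);
        for (auto& ru : rus) ru.readout(tag, Fragment(2048)); // 2 kB fragments
        bu.buildOne();
    }
}
```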
15 Sample System Configuration
- 4 readout units with custom readout
- 1 event manager with a custom trigger interface
- 1 builder unit with custom storage
Adding a builder unit to distribute the processing load is a matter of editing the configuration only (see the sketch below).
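Continuing the hypothetical configuration sketch from the Cluster Configuration slide, scaling out means adding one more host entry; no software changes anywhere. Names remain invented for illustration.

```cpp
// Added host entry for a second builder unit (names still illustrative).
const char* kSecondBuilderUnit = R"xml(
<Host url="http://node07.cern.ch:1972" address="10.0.0.7">
  <Application class="BuilderUnit" instance="1"/>
  <Transport class="PeerTransportTCP" port="40001"/>
</Host>
)xml";
```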
16Central DAQ Demonstrator
2003 Equipment (Tier-2 cluster at UCSD) EVB
10x10 At 16 kB 95 MB/s (75) 2001 Equipment EVB
32x32 At 16 kB 75 MB/s (60)
- standard MTU (1500 B payload)
- Linux2.4
See Wednesday plenary talk (F. Meijers) Studies
for the CMS Event Builder
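If the links are Gigabit Ethernet (an assumption; the slide does not name the technology), the bracketed figures read naturally as fractions of the 125 MB/s wire speed: 95/125 is roughly 75% and 75/125 is 60%.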
17 Case 1: CMS Tracker
Acronyms and implementations:
- BCN (Builder Control Network): Fast Ethernet (I2O)
- BDN (Builder Data Network): Fast Ethernet (I2O)
- BU (Builder Unit): Intel based PC (Linux)
- DCS (Detector Control System): custom (XDAQ based)
- DSN (DAQ Service Network): Ethernet (SOAP/HTTP)
- EVM (Event Manager): Intel based PC (Linux)
- FEC (Front-End Controller): Intel based PC (Linux)
- FED (Front-End Driver): custom PCI cards
- FFN (Filter Farm Network): Fast Ethernet (I2O)
- FRL (Front-End Readout Link): PCI bus
- FU (Filter Unit): Intel based PC (Linux)
- LTC (Local Trigger Controller): custom PCI card
- RCMS (Run Control/Monitor System): Java (xdaqWin, JAXM)
- RCN (Readout Control Network): Fast Ethernet (I2O)
- RU (Readout Unit): Intel based PC (Linux)
Properties:
- Event rate: 500 Hz (2000 events/spill)
- Event size: 20 kB
- Throughput (readout to storage): 11 MB/s (initial)
- Operation time: 5 to 7 days uninterrupted
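As a quick sanity check, 500 Hz x 20 kB = 10 MB/s, consistent with the quoted initial readout-to-storage throughput of 11 MB/s.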
18 Lessons Learned
- Customization and installation time of the system: 4 FTE-months
  - includes application development AND the framework learning phase
- Reduction of detector commissioning time: 2 hours (30 hours before)
- Demonstrated infrastructure support for flexibility
  - initial small setup at PSI, Zurich
  - scaled-up installation for the test setup at CERN
  - the only task was re-configuration (30 minutes, editing of an XML file)
- Demonstrated stability
  - uninterrupted run phase lasts on average 7 days (600 GB)
- Demonstrated efficiency
  - recent upgrade to Gigabit Ethernet
  - no modification in software
  - efficiency boost from 11 MB/s to > 70 MB/s acquisition throughput
- Positive user feedback
  - simple interface for controlling a complex, distributed system
  - novices could operate and re-configure the system themselves
19 Case 2: CMS Muon Chambers
Acronyms and implementations:
- BCN (Builder Control Network): Fast Ethernet (I2O)
- BDN (Builder Data Network): Fast Ethernet (I2O)
- BU (Builder Unit): Intel based PC (Linux)
- DSN (DAQ Service Network): Ethernet (SOAP/HTTP)
- EVM (Event Manager): Intel based PC (VxWorks)
- FED (Front-End Driver): custom VME cards
- FRL (Front-End Readout Link): VME bus
- LTC (Local Trigger Controller): custom PCI card
- RCMS (Run Control/Monitor System): Java (CMS prototype)
- RCN (Readout Control Network): Fast Ethernet (I2O)
- RU (Readout Unit): Intel based PC (Linux) and VME PowerPC (VxWorks)
Properties:
- Event rate: 10 kHz
- Event size: x00 B to x0 kB (hundreds of bytes to tens of kB)
- Throughput (readout to storage): 4 MB/s
- Operation time: continuous (60% uptime initially)
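Note that 4 MB/s at 10 kHz corresponds to an average event size of roughly 400 B, at the lower end of the quoted size range.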
20 Lessons Learned
- Customization and installation time: 6 FTE-months
  - similar to a custom development, but with
  - seamless integration of new system components (silicon beam telescope, combined RPC and muon chamber testing)
  - transition to new processing hardware without software modifications
- Use of standard Web protocols for control allows independent development of a complete run-control system
- Multiplatform challenge
  - demonstrated Linux and VxWorks operation with a single software base
- Memory management challenge (see the pool sketch below)
  - stable operation despite highly variable event sizes
  - zero-tolerance RTOS: no leakage, no fragmentation (inefficient use of non-virtual memory would result in unstable and inefficient operation)
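The no-leakage, no-fragmentation constraint is typically met with fixed-size block pools carved out of a single preallocated arena. The sketch below shows the idea under that assumption; it is not the actual XDAQ memory pool implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// All blocks live in one upfront allocation and are recycled through a free
// list: allocation is O(1), the heap is never touched at run time, and the
// arena cannot fragment, which suits a non-virtual-memory RTOS like VxWorks.
class BlockPool {
public:
    BlockPool(std::size_t blockSize, std::size_t blockCount)
        : blockSize_(blockSize), arena_(blockSize * blockCount) {
        for (std::size_t i = 0; i < blockCount; ++i)
            free_.push_back(arena_.data() + i * blockSize);
    }
    void* allocate() {                 // O(1), heap-free at run time
        if (free_.empty()) throw std::bad_alloc();
        void* block = free_.back();
        free_.pop_back();
        return block;
    }
    void release(void* block) {        // recycling blocks rules out leaks
        free_.push_back(static_cast<std::uint8_t*>(block));
    }
private:
    std::size_t blockSize_;
    std::vector<std::uint8_t> arena_;  // single upfront allocation
    std::vector<std::uint8_t*> free_;  // LIFO free list
};

int main() {
    BlockPool pool(2048, 1024);        // 1024 blocks of 2 kB each
    void* frame = pool.allocate();
    pool.release(frame);
}
```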
21Further Application Scenarios
- PRISMA and GASP
- are two experiments whose DAQ systems have been
put in place by INFN Legnaro (Italy) using early
versions of the XDAQ architecture - The positive experience of these systems lead to
the development of the XDAQ product line as we
presented it today - CMS Global Trigger
- entered XDAQ based developments in August 2002
for configuration, control and monitoring of FPGA
loaded VME cards - a 1 week tutorial was sufficient to give an
electrical engineering undergraduate student the
basis for autonomous development - CMS ECAL
- ECAL uses the XDAQ product line for developing
crate controllers for configuring, controlling
and monitoring the front-end devices. - This case led to improvements in the database
access and hardware access components, as well as
a joint preparation of user manuals.
22Summary
- Provide product-line to subdetector groups
- Main DAQ is just another product instance
- Software used in larger context is better
understood - Stability and fitting the requirements are key
- investment of time and cost is predictable
- Avoid repeating same mistakes in autonomously
working groups - Concentrate on groups competences (readout,
storage, )
23Conclusions
Creating a new DAQ system is a process of
assembly in a predetermined way rather than a
programming task. We strive towards coming to a
uniform DAQ product-line for all CMS data
acquisition application scenarios ranging from
single CPU setups to the final systems comprising
thousands of nodes.
http//cern.ch/xdaq