Title: CMS Software Architecture
1. CMS Software Architecture
Software framework, services and persistency in high-level trigger, reconstruction and analysis
Real-Time Requirements and Implications for the FU
- Vincenzo Innocente
- CERN/EP/CMC
2. CMS (analysis/reconstruction) Software
(Diagram: data flow of the CMS software. Detector Control supplies environmental data; the Event Filter Object Formatter, Quasi-online Reconstruction, Online Monitoring, Simulation (G3 and/or G4), Data Quality/Calibrations/Group Analysis and on-demand User Analysis all store objects into, and request parts of events from, a Persistent Object Store Manager backed by an Object Database Management System; reconstructed objects and calibrations are stored alongside the event data.)
3. Use Cases (single event processing) (current functionality in ORCA)
- Simulated Hits Formatting
- Digitization of Piled-up Events
- DAQ Online monitoring (today for Test-Beams)
- L1 Trigger Simulation
- Track Reconstruction
- Calorimeter Reconstruction
- Global Reconstruction
- Physics Analysis
- Histogramming, Event visualization
4. Use Cases (Job Level) (current functionality in ORCA)
- Select input event-collection (data card or env variable)
- Select events from collection (code, or at link time)
- Produce analysis-objects
- Choose whether to make newly created objects persistent
- Select output event-collection
- Select a physical-clustering strategy (location of output data)
- Select a detector and an algorithm configuration
- Decide if persistent objects are obsolete (and re-construct them)
- Perform shallow or deep copy of selected events
5. Requirements (from the CTP)
- Multiple Environments
- Various software modules must be able to run in a variety of environments, from level-3 triggering to individual analysis
- Migration between environments
- Physics modules should move easily from one environment to another (from individual analysis to level-3 triggering)
- Migration to new technologies
- Should not affect physics software modules
6. Requirements (from the CTP)
- Dispersed code development
- The software will be developed by organizationally and geographically dispersed groups of part-time, non-professional programmers
- Flexibility
- Not all software requirements will be fully known in advance
- Not only performance
- Also modularity, flexibility, maintainability, quality assurance and documentation.
7. Strategic Choices
- Modular Architecture (flexible and safe)
- Object-Oriented Framework
- Strongly-typed interface
- Uniform and coherent software solutions
- One main programming language
- One main persistent object manager
- One main operating system
- Adopt standards
- Unix, C++, ODMG, OpenGL...
- Use widely spread, well-supported products (with a healthy future)
- Linux, C++, Objectivity, Qt...
- Mitigate risks
- Proper planning with milestones
- Track technology evolution; investigate and prototype alternatives
- Verify and validate migration paths; have a fall-back solution ready
8. Components
- Reconstruction Algorithms
- Event Objects
- Physics Analysis modules
- Other services (detector objects, environmental data, parameters, etc.)
- Legacy non-OO data (GEANT3)
- The instances of these components must be properly orchestrated to produce the results specified by the user
9. Architecture structure
- An application framework, CARF (CMS Analysis Reconstruction Framework), customizable for each of the computing environments
- Physics software modules, with clearly defined interfaces that can be plugged into the framework
- A service and utility toolkit that can be used by any of the physics modules
- Nothing terribly new, but...
- We should be able to cope with LHC-collaboration complexity
10. CARF: CMS Analysis Reconstruction Framework
(Diagram: layered architecture. Physics modules for Reconstruction Algorithms, Data Monitoring, Event Filter and Physics Analysis plug into specific frameworks on top of a Generic Application Framework; Calibration Objects, Event Objects and Configuration Objects are handled through CMS adapters and extensions over a Utility Toolkit; the foundation layer comprises the ODBMS, the C++ standard library and extension toolkit, Geant4, CLHEP and a PAW replacement.)
11. Reconstruction Scenario
- Reproduce the detector status at the moment of the interaction
- front-end electronics signals (digis)
- calibrations
- alignments
- Perform local reconstruction as a continuation of the front-end data reduction, until objects detachable from the detectors are obtained
- Use these objects to perform global reconstruction and physics analysis of the event
- Store/retrieve the results of computing-intensive processes
12. Reconstruction Model
(Diagram: a Detector Element, configured with Geometry and Conditions, turns the Sim Hits or Raw Data of an Event into Digis and then Rec Hits, with an Algorithm attached to each reconstruction step.)
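To make the model concrete, here is a minimal sketch of a detector element that produces digis and rec-hits on demand and caches them; all class and method names are hypothetical illustrations, not the actual CARF interfaces:

```cpp
#include <optional>
#include <vector>

struct Digi    { int channel; double adc; };
struct RecHit  { double position; double energy; };
struct RawData { std::vector<unsigned char> fedPayload; };

class DetectorElement {
public:
  // Digis are decoded from raw data only on first request, then cached.
  const std::vector<Digi>& digis(const RawData& raw) {
    if (!digis_) digis_ = decode(raw);
    return *digis_;
  }
  // Rec-hits are built from the digis (where calibrations would apply).
  const std::vector<RecHit>& recHits(const RawData& raw) {
    if (!recHits_) recHits_ = reconstruct(digis(raw));
    return *recHits_;
  }
  void clearEvent() { digis_.reset(); recHits_.reset(); } // next event

private:
  std::vector<Digi> decode(const RawData& raw) {          // FED unpacking
    std::vector<Digi> out;
    for (std::size_t i = 0; i < raw.fedPayload.size(); ++i)
      out.push_back({int(i), double(raw.fedPayload[i])});
    return out;
  }
  std::vector<RecHit> reconstruct(const std::vector<Digi>& ds) { // local reco
    std::vector<RecHit> out;
    for (const auto& d : ds) out.push_back({double(d.channel), d.adc});
    return out;
  }
  std::optional<std::vector<Digi>>   digis_;
  std::optional<std::vector<RecHit>> recHits_;
};
```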
13. Action on Demand
Compare the results of two different track reconstruction algorithms.
(Diagram: an Analysis module asks the Event for tracks T1 and T2. Reco T1 builds T1 directly from the hits that the Detector Elements reconstruct on demand; Reco T2 is seeded by calorimeter clusters, so requesting T2 transparently triggers Reco CaloCl as well.)
14. Action on Demand in HLT
- Introducing and removing algorithms (even at run time) is straightforward and efficient
- A new filter algorithm will automatically trigger the creation and/or the loading of the objects it needs
- Removing (or not running) a filter will automatically inhibit the creation of the objects it depends upon (including geometry, conditions, algorithms, etc.)
- HLT can work on a best-effort basis:
- fully reconstruct and fully store events for express-line analysis
- partially reconstruct and store just the raw data for the rest
- read and store partial events for calibration purposes
- Offline jobs will identify what has already been reconstructed, what is obsolete and what is missing, and bring the analysis to completion with no user intervention (a minimal sketch of this dispatch pattern follows the list)
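To illustrate, a bare-bones sketch of on-demand dispatch, assuming a registry of producers keyed by label; the Event class and labels here are invented for the example and are not the CARF API:

```cpp
#include <any>
#include <functional>
#include <map>
#include <string>
#include <vector>

class Event {
public:
  using Producer = std::function<std::any(Event&)>;
  void declare(const std::string& label, Producer p) {
    producers_[label] = std::move(p);
  }
  // Build the product on first request, then serve it from the cache.
  template <typename T>
  const T& get(const std::string& label) {
    auto hit = cache_.find(label);
    if (hit == cache_.end())
      hit = cache_.emplace(label, producers_.at(label)(*this)).first;
    return std::any_cast<const T&>(hit->second);
  }
private:
  std::map<std::string, Producer> producers_;
  std::map<std::string, std::any> cache_;  // at most one build per event
};

struct Track { double pt; };
struct CaloCluster { double e; };

int main() {
  Event evt;
  evt.declare("caloClusters", [](Event&) {
    return std::vector<CaloCluster>{{42.0}};        // stand-in calo reco
  });
  // T2 tracks need calorimeter seeds: asking for T2 runs calo reco first.
  evt.declare("tracksT2", [](Event& e) {
    auto& seeds = e.get<std::vector<CaloCluster>>("caloClusters");
    return std::vector<Track>{{seeds.front().e}};
  });
  auto& t2 = evt.get<std::vector<Track>>("tracksT2");
  return t2.empty();  // a filter that never asks for T1 never triggers it
}
```

Removing the filter that asks for "tracksT2" removes the calorimeter work with it, which is exactly the behaviour the bullets above describe.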
15. Problems with traditional architectures
- A traditional framework schedules a priori the sequence of operations required to bring a given task to completion
- Major management problems are produced by changes in the dependencies among the various operations
- Example 1
- Tracks of type T1 are reconstructed using only tracker hits
- Tracks of type T2 require calorimetric clusters as seeds
- Fast simulation reconstructs tracks of type T3 directly from generator information
- Switching from T1 to T2, the framework should determine that calorimeter reconstruction must run first
- If T3 are used, most of the tracker software is not required
- Example 2
- The global initialization sequence must be changed because, for one detector, conditions change more often than foreseen
16. Event Builder/Filter (Real-World Object Model)
(Diagram: object model relating FED, Read Out Unit, BU, FU and Event, with 1 and 1..m multiplicities on the associations.)
FED, ROU, BU and FU all assemble event fragments from data coming from higher up the chain. Current wisdom dictates that the BU assembles full events, ships fragments to the FU, and keeps an event until the FU decides its fate. The whole design should accommodate technology evolution that could modify protocols, buffer management and network connections at any level.
17. FU input software (evolution of Test-Beam prototypes)
- BU Proxy
- Handle the connection with the real BU (or any other input device)
- Unpack received fragments
- Push data into Read Out Units
- Read Out Unit
- Receive raw data from the BU Proxy
- Push FED data into the corresponding Detector Element (an intermediate FED object may be required)
- Perform AS-IS raw-data persistent storage
- Detector Element
- Request data from the Read Out Unit
- Decode FED data
- Cache decoded and reconstructed data
- Supply data to reconstruction algorithms and to the persistent storage manager (a sketch of this chain follows the list)
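As an illustration of how the three pieces fit together, a compact sketch with invented class and method names (the real Test-Beam prototypes are not reproduced here):

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using FedId   = std::uint16_t;
using Payload = std::vector<std::uint8_t>;

class DetectorElement {
public:
  void setFedData(Payload p) { fed_ = std::move(p); decoded_ = false; }
  // Decoding runs only when an algorithm actually asks for the data.
  const Payload& decoded() {
    if (!decoded_) { /* FED-format decoding would go here */ decoded_ = true; }
    return fed_;
  }
private:
  Payload fed_;
  bool decoded_ = false;
};

class ReadOutUnit {
public:
  explicit ReadOutUnit(std::map<FedId, DetectorElement*> dets)
      : dets_(std::move(dets)) {}
  void push(FedId id, Payload p) {
    storeAsIs(id, p);                       // AS-IS raw-data persistency
    dets_.at(id)->setFedData(std::move(p)); // hand off to the detector
  }
private:
  void storeAsIs(FedId, const Payload&) { /* write the raw record */ }
  std::map<FedId, DetectorElement*> dets_;
};

class BUProxy {
public:
  explicit BUProxy(std::map<FedId, ReadOutUnit*> route)
      : route_(std::move(route)) {}
  // Called once per fragment received from the BU (or other input device).
  void onFragment(FedId id, Payload raw) {
    route_.at(id)->push(id, std::move(raw)); // unpack and dispatch
  }
private:
  std::map<FedId, ReadOutUnit*> route_;
};
```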
18. FU output
- The FU should be capable of persistently storing any raw and reconstructed data corresponding to accepted events.
- Which objects to store, and how to clusterize them, should be configurable.
- All this is common to all reconstruction processes.
- The current solution is that the FU formats the data in its final storage representation and writes it to a device that, at configuration time, has been chosen to be a local disk or a remote data server.
- If raw data has to be stored, the safest and most flexible solution is to save it in its original format at Read Out Unit granularity.
19. CMS Test-Beam Setup I/O Performance Test
April 1999
(published in the RD45 LHCC report)
20. Hardware
- Configuration
- System: Sun Enterprise 450 (4 x UltraSPARC-II, 296 MHz)
- Memory size: 512 Megabytes
- Page size: 8 KB
- Disk controllers: 2 x dual FW-SCSI-2 (Symbios 53C875)
- Disks: 20 x FUJITSU MAB3091S SUN9.0G Rev. 1705
- OS: Solaris 5.6
- File-system configuration
- 80 GB striped file-system of 19 slices, 128K "away"
21. Simple C++ I/O
Writing 2 GB files in records of 8 KB using ofstream::write (egcs 1.1):
5 processes: 90 MB/s
less than 50% CPU load
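For reference, a minimal sketch of this kind of streaming test (the file name and single-process loop are assumptions; the measurement above used five concurrent processes):

```cpp
#include <fstream>
#include <vector>

int main() {
  const std::size_t recordSize = 8 * 1024;            // 8 KB records
  const unsigned long long totalBytes = 2ULL << 30;   // 2 GB file
  std::vector<char> record(recordSize, 0);

  std::ofstream out("iotest.dat", std::ios::binary);
  for (unsigned long long written = 0; written < totalBytes;
       written += recordSize)
    out.write(record.data(), record.size());
  return out.good() ? 0 : 1;
}
```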
22. Objectivity Test
- H2-like raw-data structure
- random-size objects (average 2.4 KB of useful data)
- 4000 events with 45 raw-objects each
- 180K raw-objects in total, corresponding to 436 MB of useful data
- including event structure and overhead, a total of 469 MB written to disk per run
23. Objectivity optimization
- 32K page size
- INDEX mode 0
- 100-page cache size
- no associations (only VArrayT<ooRef()>)
- commit-and-hold every 1000 events (the batching pattern is sketched below)
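A rough sketch of the commit-and-hold batching idea; Transaction and its methods are stand-in names for illustration, not the real Objectivity/DB calls:

```cpp
// Amortize transaction overhead: flush every N events while keeping the
// session (and its locks/caches) open, instead of commit/begin cycles.
struct Transaction {
  void beginUpdate()   { /* open an update transaction */ }
  void commitAndHold() { /* flush to disk, keep session state */ }
};

void processRun(Transaction& tx, int nEvents) {
  const int batch = 1000;        // commit-and-hold every 1000 events
  tx.beginUpdate();
  for (int i = 0; i < nEvents; ++i) {
    // ... build and register the persistent raw objects for event i ...
    if ((i + 1) % batch == 0)
      tx.commitAndHold();        // checkpoint without closing the session
  }
  tx.commitAndHold();            // final flush at end of run
}
```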
24. Results (local database)
12 processes: 22 MB/s
100% CPU usage!
25. Distributed federation
- Remote lock server
- less than 10% degradation
- Remote system databases
- initialization time increases by a factor of 5 to 10 (from a few seconds to several tens)
- database population: less than 10% degradation
- Remote event database
- limited by present network speed (1 MB/s??)
26. Conclusions from I/O test (1)
- Using a state-of-the-art commercial disk server, we can write at a speed of 90 MB/s using a simple C++ program
- On the same system, writing a more realistic raw-data event structure with Objectivity/DB, we can reach 22 MB/s
- Running on two such systems, I reached a peak sustained aggregate speed of 40 MB/s
27. Conclusions from I/O test (2)
- The CMS online farm is foreseen to have a power of 4.5 TIPS distributed over 1000 CPUs
- Today, writing with Objectivity at 100 MB/s requires a power of about 6 GIPS, distributed over 20 CPUs
- We can safely estimate that Objectivity formatting of raw data will require less than 1% of the CMS online computing power
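The estimate follows directly from the two figures above: 6 GIPS out of 4.5 TIPS is 6·10^9 / 4.5·10^12 ≈ 0.13%, so even a generous safety margin keeps the cost below the quoted 1%.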
28. FU output: alternative solutions
- One alternative is to stream fully reconstructed events, in some intermediate proprietary format, to a buffer system and format them asynchronously.
- If event formatting is required, it will require a similar amount of memory and CPU
- It could be faster if large-buffer sequential streaming is used
- The fall-back solution is to stream raw data as a single complete event record, in native format, directly from the BU (or the BU Proxy) to the fastest and closest device, and perform any additional formatting offline.
- The final bottleneck will most probably be the network
29. Conclusions
- The current CMS software architecture was designed with HLT requirements in mind
- Configurable framework
- Action on demand
- Plug-in physics modules
- Several Filter Farm peculiarities have already been prototyped in Test-Beams
- The use of uniform and coherent software solutions will make online-offline software migration possible
- Risks can be mitigated by
- Using the framework to shield physics modules from the underlying technology (this will not penalize performance)
- Having fall-back solutions ready
30. A Uniform and Coherent Approach to Object Persistency
- Vincenzo Innocente
- CERN/EP/CMC
31. HEP Data
- Environmental data
- Detector and Accelerator status
- Calibrations, Alignments
- Event-Collection Data (luminosity, selection criteria, ...)
- Event Data, User Data
Navigation is essential for an effective physics analysis. Complexity requires coherent access mechanisms.
32. Uniform approach
- Coherent data access model
- Save effort
- Leverage experience
- Reuse design and code
- The main road to producing better and higher-quality software
33. CMS needs a serious DBMS
- An experiment lasting 20 years cannot rely just on ASCII files and file systems for its production bookkeeping, condition database, etc.
- Even today at LEP, the management of all real and simulated data sets (from raw data to n-tuples) is a major enterprise
- Multiple models used (DST, N-tuple, HEPDB, FATMAN, ASCII)
- A DBMS is the modern answer to such a problem and, given the choice of OO technology for the CMS software, an ODBMS (or a DBMS with an OO interface) is the natural solution for a coherent and scalable approach.
34. CMS Experience
- Designing and implementing persistent classes is not harder than doing it for native C++ classes (a sketch follows the list).
- Easy and transparent distinction between logical associations and physical clustering.
- Fully transparent I/O, with performance essentially limited by the disk speed (random access).
- File-size overhead (5% for realistic CMS object sizes) not larger than for other products such as ZEBRA, BOS, etc.
- Objectivity/DB (compared to other products we are used to) is robust, stable and well documented. It also provides many additional useful features.
- All our tests show that Objectivity/DB can satisfy CMS requirements in terms of performance, scalability and flexibility
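For a feel of what "not harder than native C++" means, a rough sketch in the spirit of an Objectivity/DB DDL class; the event and hit class names are invented, and the exact binding details may differ from the product documentation:

```cpp
// Persistence comes from inheriting the ooObj base class; relationships
// use ooRef references and variable-size VArrayT arrays, as in the
// "VArrayT<ooRef()>" configuration used for the I/O tests above.
#include <ooObj.h>   // Objectivity/DB persistent base (DDL context)

class RawHit : public ooObj {   // a persistent class, otherwise plain C++
public:
  int   channel;
  float adc;
};

class RawEvent : public ooObj {
public:
  int eventNumber;
  ooVArrayT<ooRef(RawHit)> hits;  // variable-size array of references
};
```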
35. CMS Experience
- There are additional configuration elements to care about: ddl files, schema-definition databases, database catalogs
- Organized software development: rapid prototyping is not impossible, but its integration into a product should be done with care
- Performance degradations often wait around the corner
- monitoring of running applications is essential; off-the-shelf solutions often exist (BaBar, Compass, CMS)
- Objectivity/DB is a bare product
- integration into a framework is our responsibility
- Objectivity is slow to apply OUR changes to their product
- Is this a real problem? Do we really want a product whose kernel is changed at each user request?
36. Alternatives: ODBMS
- Versant is a viable commercial alternative to Objectivity
- do we have time to build an effective partnership (e.g. MSS interface)?
- Espresso (by IT/DB) should be able to produce a fully fledged ODBMS in a couple of years, once the proof-of-concept prototype is ready
- IT restructuring and the Hoffmann review have delayed Espresso
- We hope to be able to resume Espresso tests soon
- Migrating CARF from Objectivity to another ODBMS:
- We expect that it would take about one year
- Such a transition will not affect the basic principles of the CMS software architecture and Data Model
- It will involve only the core CARF development team
- It will not disrupt production and physics analysis
37. Alternatives: ORDBMS
- ORDBMSs (relational DBs with an OO interface) are appearing on the market
- Up to now they looked targeted at those who already have a relational system and wish to make a transition to OO
- A new ORACLE product has the appearance of a fully fledged ODBMS
- IT/DB is in the process of evaluating this new product as an event store
- If these early prototypes look promising, CMS will join this evaluation next year.
- We hope to be able to assess the impact of ORDBMSs on the CMS Data Model and on the migration effort before the end of 2001
38. Fallback Solution: Hybrid Models
- (R)DBMS for Event Catalog, Calibration, etc.
- Object-stream files for event data
- Ad-hoc networked data server and MSS interface
- Less flexible
- Rigid split between DBMS and event data
- One-way navigation from DBMS to event data
- More complex
- Two different I/O systems
- More effort to learn and maintain
- This approach will be used by several experiments at BNL and Fermilab (RDBMS not directly accessible from user applications)
- CMS and IT/DB are following these experiences closely.
- We believe that this solution could seriously compromise our ability to carry out our physics program competitively
39. ODBMS Summary
- A DBMS is required to manage the large data sets of CMS (including user data)
- An ODBMS provides a coherent and scalable solution for managing data in an OO software environment
- Once an ODBMS is deployed to manage the experiment data, it will be very natural to use it to manage any kind of data related to detector studies and physics analysis
- Objectivity/DB is a robust and stable kernel, ideal as the base on which to build a custom storage framework
- Realistic alternatives are starting to appear on the market
- CMS actively follows these developments