(A Taste of) Data Acquisition, Triggers, and Controls

1
(A Taste of) Data Acquisition, Triggers,
and Controls
  • Gregory Dubois-Felsmann, Caltech, CHEP 2003

2
Routine disclaimer
  • Much interesting material presented
  • About 35 talks (13.5 hours, 100 MB as
    uploaded) from many experiments covering a rich
    variety of issues
  • Many thanks to the speakers!
  • Fitting this into 25 minutes requires procedures
    familiar to the DAQ community:
  • Feature extraction
  • and, unfortunately, also triggering with a
    decidedly imperfect data acquisition system
  • So I apologize in advance for the things I've
    missed or perhaps misunderstood!

3
Outline
  • Overview of talks presented
  • Some technological themes
  • Trigger architectural issues
  • The great challenge for the next years: scaling
  • Conclusions

4
Talks presented
  • Monday parallel

W. Badgett: CDF Run II Data Acquisition
R. Rechenmacher: Run II DZERO DAQ / Level 3 Trigger System
S. Luitz: The BaBar Event Building and Level-3 Trigger Farm Upgrade
R. Itoh: Upgrade of Belle DAQ System
A. Polini: The Architecture of the ZEUS Micro Vertex Detector DAQ and Second Level Global Track Trigger
J. Schambach: STAR TOF Readout Electronics and DAQ
R. Divià: Challenging the challenge: Handling data in the Gigabit/s range (ALICE)
J. Gutleber, L. Orsini: XDAQ - Real Scale Application Scenarios (CMS et al.)
G. Lehmann: The DataFlow of the ATLAS Trigger and Data Acquisition System
S. Stancu: The use of Ethernet in the DataFlow of the ATLAS Trigger & DAQ
S. Gadomski: Experience with multi-threaded C++ applications in the ATLAS DataFlow
5
Talks presented
  • Tuesday parallel

A. Ceseracciu: A Modular Object Oriented Data Acquisition System for the Gravitational Wave AURIGA experiment
R. Mahapatra: Cryogenic Dark Matter Search Remote Controlled DAQ
T. Steinbeck: A Software Data Transport Framework for Trigger Applications on Clusters (ALICE)
T. Higuchi: Development of PCI Bus Based DAQ Platform for Higher Luminosity Experiments (e.g., Super-Belle)
J. Mans: Data Acquisition Software for CMS HCAL Testbeams
B. Lee: An Impact Parameter Trigger for DØ
G. Comune: The Algorithm Steering and Trigger Decision mechanism of the ATLAS High Level Trigger
V. Boisvert: The Region of Interest Strategy for the ATLAS Second Level Trigger
S. Wheeler: Supervision of the ATLAS High Level Triggers
6
Talks presented
  • Thursday parallel session 5: DAQ and controls

J. Kowalkowski: Understanding and Coping with Hardware and Software Failures in a Very Large Trigger Farm (BTeV)
M. Gulmini: Run Control and Monitor System for the CMS Experiment
S. Kolos: Online Monitoring Software Framework in the ATLAS Experiment
G. Watts: DAQ Monitoring and Auto Recovery at DØ
K. Maeshima: Online Monitoring for the CDF Run II Experiment
M. Gonzalez Berges: The Joint COntrols Project Framework (CERN multi-expt.: LHC et al.)
S. Lüders: The Detector Safety System for LHC Experiments
J. Hamilton: A Generic Multi-node State Monitoring System (BaBar)
V. Gyurjyan: FIPA Agent Based Network Distributed Control System (JLAB)
M. Elsing: Configuration of the ATLAS Trigger System
In parallel with session 5a, unfortunately
7
Talks presented
  • Thursday parallel session 5a: first level
    triggers
  • Related plenary talks

G. Grastveit: FPGA Co-processor for the ALICE High Level Trigger
B. Scurlock: A 3-D Track-Finding Processor for the CMS Level-1 Muon Trigger
P. Chumney: Level-1 Regional Calorimeter System for CMS
F. Meijers: The CMS Event Builder
M. Grothe: Architecture of the ATLAS High Level Trigger Event Selection Software
8
Projects represented
  • Strong emphasis on LHC, continuing recent trend

(by number of talks)
9
Some technological themes
  • Triumph of C++ for HEP DAQ confirmed
  • Along with Java for GUIs
  • Triumph of commodity computing hardware (Intel
    IA-32) and operating system (Linux)
  • Large-farms-of-small-boxes model confirmed
  • Near-triumph of commodity networking hardware
  • Fast and Gigabit Ethernet, standard commercial switches

10
More technological themes
  • Continuation of long trend of reducing scope of
    application of custom hardware
  • Yet most DAQ software is still custom in present
    experiments
  • Serious efforts to find what is generic in DAQ
    programming
  • Not just to isolate patterns (knowledge
    applicable by others) but also actual programming
    toolkits
  • Continuation of trend of moving offline code
    and/or frameworks into high level triggers
  • New widespread use of XML for non-event
    information (configuration, monitoring)
  • Many of the major new challenges relate to
    scaling to huge farms
  • Performance
  • Operability, control, monitoring

11
Programming languages
  • C++ has to a large extent proven itself
  • Some current experiments have DAQ systems written
    from scratch almost entirely in C++, including
    real-time code running in an embedded RTOS
    environment
  • E.g., BaBar: DataFlow, feature extraction, Level
    3 trigger, and rest of online system written in
    serious OO C++ on VxWorks, Solaris, Linux
  • Achieves virtually zero deadtime at 5.5 kHz
    L1Accept rate on 1997-vintage 300 MHz Motorola
    SBCs and 1.4 GHz Linux P-IIIs [Luitz]
  • New projects are fairly uniformly continuing to
    adopt it for code in the event data flow path
  • Caveats remain, though usually worth the cost:
  • Executable size
  • Dependency management seems to remain a challenge
  • Non-trivial work in creating shareables
  • Ease with which naïve users can write
    non-performant code
  • Threading requires care (see below)
  • Even see use in hardware trigger FPGA coding (see
    below) [Scurlock, CMS]

12
BaBar online and DAQ system
13
Programming languages II
  • Java has emerged as the other major player
  • Especially for graphical applications
  • E.g., run control GUIs
  • Good points
  • (Some say) ease of programming vs. C++
  • Universal availability including rich GUI
    graphics library
  • Simple API for remote object programming (RMI)
  • Caveats
  • Performance (although results vary considerably)
  • JVM quality-of-implementation, platform
    (non-)independence
  • No other real competitors on the horizon except
    for niche applications
  • Saw appearances of Python, LISP, etc.

14
Computing hardware and operating systems
  • Farms of Intel IA-32 / Linux 1- or 2-CPU machines
    are the coin of the realm today; tomorrow?
  • Linux will continue to be!
  • Speakers had little to say about CPU chips except
    that they will buy the most cost-effective farms
    they can shortly before each major project goes
    into final commissioning
  • Intel Itanium line may get some traction by then
  • Buy-late is a big Moore's Law win, but see
    scaling concerns below
  • Linux success is particularly striking
  • In use in essentially every HEP role, HLT through
    laptops
  • Even approaching in the embedded world: PCI DAQ
    component development for Super-KEK-B et al.
    [Higuchi]
  • Still some lingering attachment to other Unix
    flavors for disk servers (though Linux-based IDE
    RAID storage is also becoming very common, in the
    offline world, too)
  • Linux is so successful in HEP that cross-platform
    portability may erode

15
Linux in the embedded DAQ card world
16
Multithreading issues
  • Language and library level
  • Need to stay aware of serialization from locking
    mechanisms used in outside libraries
  • Example: the C++ Standard Library containers'
    memory pool by default uses a single lock; found
    to produce a x2 penalty in ATLAS HLT tests
    [Gadomski]
  • O/S level
  • Linux is not a real-time operating system
  • Still no full implementation of POSIX threads
  • Implementation of pthread yield operation
    interacts poorly with time slicing in scheduler
    (can't reschedule immediately).
  • Found to produce a x4 penalty in ATLAS tests
    [Gadomski]; kernel patch available
  • See also under offline code in the online
    world below
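
The serialization effect of a single-lock memory pool can be illustrated with a toy model. This is a minimal sketch, not taken from the ATLAS code or from any real library: the classes `SharedPool` and `PerThreadPool`, the block size, and the thread counts are all hypothetical, and Python is used only for brevity. Rather than measuring time, it counts how often worker threads must pass through a common lock on the allocation path.

```python
import threading

BLOCK_SIZE = 64  # bytes per block; arbitrary value chosen for illustration


class SharedPool:
    """Hypothetical pool: one free list guarded by one global lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._free = []
        self.lock_acquisitions = 0  # counted on the allocation path only

    def allocate(self):
        with self._lock:  # every thread serializes here
            self.lock_acquisitions += 1
            return self._free.pop() if self._free else bytearray(BLOCK_SIZE)

    def release(self, block):
        with self._lock:
            self._free.append(block)


class PerThreadPool:
    """Hypothetical pool: one free list per thread, no lock on the fast path."""

    def __init__(self):
        self._local = threading.local()
        self._lock = threading.Lock()
        self.lock_acquisitions = 0

    def _free_list(self):
        free = getattr(self._local, "free", None)
        if free is None:  # first use by this thread: register once under the lock
            with self._lock:
                self.lock_acquisitions += 1
            free = self._local.free = []
        return free

    def allocate(self):
        free = self._free_list()
        return free.pop() if free else bytearray(BLOCK_SIZE)

    def release(self, block):
        self._free_list().append(block)


def count_lock_acquisitions(pool, n_threads=4, n_allocs=1000):
    """Run n_threads workers that each allocate and release n_allocs blocks."""
    def worker():
        for _ in range(n_allocs):
            pool.release(pool.allocate())

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool.lock_acquisitions
```

With the default arguments, `count_lock_acquisitions(SharedPool())` returns 4000 (one locked section per allocation), while `count_lock_acquisitions(PerThreadPool())` returns 4 (one registration per thread). The actual penalty in a real system depends on how contended the lock is, which this sketch does not model.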

17
Networking
  • Commodity networking hardware!
  • The various flavors of Ethernet (Fast, Gigabit, and
    beyond) have become the almost unchallenged
    fabric of higher-level triggering and event
    building (one major exception: CMS still
    considering Myrinet)
  • Standard protocols (TCP, UDP) also ascendant;
    some efforts to explore raw Ethernet
  • All groups seem to be making some of the same
    discoveries, notably: network switches are not
    simple, transparent devices!
  • Flow control and buffering behavior must be
    understood in detail
  • Vendors can be cagey about the details
    (proprietary internal arch.)
  • Need good tools to monitor traffic behavior
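
One reason switch buffering matters for event building is that many sources send fragments of the same event to one destination at the same moment. The following is a rough sketch under stated assumptions, not a model of any particular switch: it uses a simple fluid approximation in which all sources transmit at line rate simultaneously, the destination port drains at the same line rate, and there is no flow control. The function names and the example numbers (64 sources, 16 kB fragments, 256 kB of port buffer) are hypothetical.

```python
def peak_queue_bytes(n_sources, fragment_bytes):
    """Peak backlog at one output port during a simultaneous burst.

    While the burst lasts, data arrives n_sources times faster than the
    port can send it, so only one fragment's worth leaves the switch in
    that time and the remainder has to wait in the output buffer.
    """
    total_arriving = n_sources * fragment_bytes
    drained_during_burst = fragment_bytes
    return total_arriving - drained_during_burst


def dropped_bytes(n_sources, fragment_bytes, buffer_bytes):
    """Bytes lost if the backlog exceeds the buffer (no flow control assumed)."""
    return max(0, peak_queue_bytes(n_sources, fragment_bytes) - buffer_bytes)


# Hypothetical example: a 1 MiB event split evenly across 64 sources.
example_peak = peak_queue_bytes(64, 16 * 1024)
example_loss = dropped_bytes(64, 16 * 1024, 256 * 1024)
```

In this toy case the peak backlog is 1,032,192 bytes, so a 256 KiB port buffer would lose 770,048 bytes per event unless flow control or traffic shaping intervenes. Real switches share buffers between ports and implement flow control in vendor-specific ways, which is exactly why the behavior has to be measured rather than assumed.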

18
Networking adventures
19
Offline code in the online world
  • Use of offline code in high-level software
    triggers
  • application framework
  • or even offline reconstruction code
  • Several current experiments and most future ones
    are doing this
  • Problems
  • Dependencies
  • Performance: offline code has often not been
    exposed to the close scrutiny typical for
    online, and may have axes of flexibility at odds
    with high performance (CPU cycles and memory
    utilization)
  • Multithreading: offline code is almost never
    written to be thread-safe. This presents a problem
    when thread parallelism is needed in a high level
    trigger
  • Make a subset of the offline code, and its
    framework, thread-safe (ATLAS L2)?
  • Replace threads with processes and a shared-memory
    data model (BaBar, D0)?
  • Offline event loop model may not be directly
    usable (BaBar, D0, ATLAS)
  • Benefits
  • Greatly simplifies incorporation of trigger
    algorithms in simulations
  • Eases development and validation of trigger
    algorithms

20
Genericity patterns and products
  • We have always noticed: "The same ideas keep
    coming up and the same problems have to be solved
    over and over again."
  • We have always thought: "There must be something
    to be gained from applying that knowledge. Can
    generic problems be solved with generic tools?"
  • We have tried in various ways
  • Identifying patterns: learning how to think about
    these common problems, building up expert
    knowledge that can be applied to the next
    experiment, learning lessons
  • That's what CHEP is all about
  • But we also aspire to reuse: applying a software
    product in more than one place

21
Reusing concrete products
  • Sounds great; has a mixed history
  • In some places this has come to work well
  • CERNLIB, GEANT4, ROOT are ubiquitous in HEP
  • But there are real obstacles, chief among them
  • The difficulty of sharing code bases between
    experiments in different phases of development
    (example: divergence of BaBar and CDF versions of
    their originally shared application framework)
  • The devil is in the details: often the high
    level features of a system seem generic (the
    patterns) but the implementation picks up
    experiment-specific features
  • Sometimes this is because of concern with
    compatibility with historical code
  • Sometimes it arises when the high-level
    architecture turns out to need to be driven by
    some low-level optimization
  • Perhaps in principle the high-level design could
    be extended to cover both users' needs, but the
    press of deadlines favors a quick hack that
    doesn't require renegotiation

22
Reuse in the online environment
  • Reuse has been perhaps less successful on average
    in the online and DAQ worlds.
  • Often online code is prepared later in the
    construction of an experiment (since simu/reco
    code is usually already required at the proposal
    stage), thus under more time pressure.
  • Online code tends to require more low-level
    optimizations. Often these come with serious
    tradeoffs that can limit the flexibility of a
    design, even to its in-house users.
  • But perhaps the next round of experiments
    presents a rare opportunity to do better
  • The LHC experiments are on the same time scale
    and they still have a fair amount of time left.
  • The use of a common language, O/S, and networking
    environment helps.
  • There are some interesting projects under way!

23
Quest for generic online software
  • Data acquisition
  • XDAQ (arising from the CMS project)
    [Gutleber/Orsini]
  • So far mostly being used to provide a common
    platform for several subdetectors'
    commissioning DAQ systems and ease their
    integration into the main DAQ
  • Exploring collaboration with other expts
  • Performance seems good.
  • How CMS-free can it be kept over time, though?

24
Generic software for online
  • Control and monitoring frameworks
  • Lots of projects in this area; a couple of
    examples
  • (see Thursday program for more)
  • Inherently fault-tolerant architectures
    [Kowalkowski]
  • Motivated by BTeV, but is at a very generic
    level, with CS collaborators viewing it as
    a general research project
  • In early stages, but worth watching
  • Generic monitoring frameworks
  • Example: D0's XML-based distributed
    monitoring [Watts]

25
XML
  • A fairly new trend: XML is cropping up all over
    in online configuration and monitoring
    applications
  • Perhaps surprising? Not-very-compact, textual
    representation!
  • But we are willing to spend some CPU and network
    bandwidth here (since other things in the systems
    require so much more)
  • Benefits
  • Avoids the "private toy language" problem, when
    combined with scripting tools
  • No more hand-written "run card"-type parsers
  • Easy to parse in many languages, and thus pick an
    appropriate (and perhaps different) language for
    generating the data and for applying it.
  • Very easy to transmit over a network (byte
    stream)
  • Aids in a) using existing generic tools (editors,
    validators), b) allowing new tools we build to be
    more generic within HEP
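
As an illustration of the "no more hand-written parsers" point, the sketch below reads a small configuration document with Python's standard `xml.etree.ElementTree` module. The schema is entirely made up for this example (the element names `daq_config`, `node`, `trigger` and their attributes are assumptions, not taken from any experiment's configuration format).

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration document; the schema is invented for illustration.
CONFIG_XML = """
<daq_config run_type="cosmics">
  <node name="farm01" role="event_builder" port="9000"/>
  <node name="farm02" role="l3_trigger" port="9001"/>
  <trigger name="l3_prescale" value="10"/>
</daq_config>
"""


def load_config(text):
    """Parse the XML text into plain dictionaries using a generic parser."""
    root = ET.fromstring(text.strip())
    nodes = {
        node.get("name"): {"role": node.get("role"), "port": int(node.get("port"))}
        for node in root.findall("node")
    }
    triggers = {
        trig.get("name"): int(trig.get("value"))
        for trig in root.findall("trigger")
    }
    return {"run_type": root.get("run_type"), "nodes": nodes, "triggers": triggers}


config = load_config(CONFIG_XML)
```

The same text could be produced by one language and consumed by another, and checked with an off-the-shelf validator, without either side writing a custom parser.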

26
Triggering scope
27
Triggering
  • Far too much information presented to cover in
    detail
  • Remarkable things can now be done in commodity
    CPUs
  • E.g., ZEUS second level tracking trigger / global
    tracking trigger [Polini]
  • Silicon vertex tracking information becoming
    absorbed into tracking triggers ahead of Level 3
  • ZEUS
  • Fermilab (for B physics efficiency) [Lee, D0
    hardware]

28
ZEUS software tracking trigger
29
Hardware triggers
  • Still indispensable at the first level
  • Used in some places as adjuncts to second level
  • Good progress on testing and production for LHC
    experiments
  • [Scurlock] Generate VHDL from C++ code; eases
    production of highly accurate board-level
    simulation of trigger

B. Lee: An Impact Parameter Trigger for DØ
G. Grastveit: FPGA Co-processor for the ALICE High Level Trigger
B. Scurlock: A 3-D Track-Finding Processor for the CMS Level-1 Muon Trigger
P. Chumney: Level-1 Regional Calorimeter System for CMS
30
The CMS / ATLAS choice
F. Meijers: The CMS Event Builder
G. Lehmann: The DataFlow of the ATLAS Trigger and Data Acquisition System
V. Boisvert: The Region of Interest Strategy for the ATLAS Second Level Trigger
  • and many other talks on configuration and other
    details
  • CMS baseline
  • Build full events at output of Level 1 (100 kHz,
    1 MB events)
  • Risk: this is a lot of data to handle. Able to
    fall back to a partial-readout Level 2 model
  • ATLAS baseline
  • L2 trigger operates on ROIs: nominally 2% of
    event data at output of Level 1 (75 kHz, 1 MB
    events, 20 kB ROI data)
  • Full event build at L2 rate of 1 kHz, sent to
    Event Filter (EF) farm
  • Risk: not yet completely clear that small ROIs
    provide enough information. Able to shift boundary
    between L2, EF somewhat
  • Both experiments finding present or readily
    foreseeable technology adequate
  • at least at level of individual subsystems;
    full-scale end-to-end tests beginning
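
The data volumes implied by the two baselines follow directly from rate times size. The short calculation below is only a back-of-the-envelope sketch using the nominal figures quoted on this slide; it assumes decimal units (1 kB = 1000 bytes, 1 MB = 10^6 bytes) and ignores protocol overhead and event-size fluctuations. The function and variable names are chosen here for illustration.

```python
KHZ = 1_000        # events per second
KB = 1_000         # bytes (decimal units assumed)
MB = 1_000_000     # bytes (decimal units assumed)


def throughput_bytes_per_s(rate_hz, size_bytes):
    """Aggregate data rate for a given accept rate and data size per event."""
    return rate_hz * size_bytes


# CMS baseline: full event build at the Level 1 output rate.
cms_event_build = throughput_bytes_per_s(100 * KHZ, 1 * MB)

# ATLAS baseline: only ROI data moves at the Level 1 output rate...
atlas_roi_traffic = throughput_bytes_per_s(75 * KHZ, 20 * KB)
# ...and full events are built only at the Level 2 accept rate.
atlas_event_build = throughput_bytes_per_s(1 * KHZ, 1 * MB)

# Fraction of each event that Level 2 sees as ROI data.
roi_fraction = (20 * KB) / (1 * MB)
```

Under these assumptions the CMS event builder has to sustain about 100 GB/s, while ATLAS moves about 1.5 GB/s of ROI data plus about 1 GB/s of full events, with ROIs amounting to 2% of each event. That difference is the trade-off the slide describes: CMS accepts the bandwidth challenge, ATLAS accepts the risk that ROIs carry too little information.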

31
Scaling
  • Many issues remain to be confronted fully in
    building systems with many thousands of CPUs!
  • Fault tolerance
  • Overseeing huge constantly-changing collections
    of active entities (too many for direct human
    oversight)
  • Performance issues
  • Image activation
  • Calibration constant loading
  • Configuration
  • Global knowledge updates required to keep system
    coherent

File server and/or database contention?
32
An exotic tidbit
  • A familiar reassurance to nervous newcomers: "Go
    ahead, type whatever you want; the worst that can
    happen is that we might have to reboot the
    computer."
  • A cautionary tale from CDF
  • Observed unexpected losses of silicon detector
    readout channels
  • Proposed explanation: vibrational resonances due
    to Lorentz forces on digital power lines to the
    front end chips
  • Limits trigger rates
  • Probably related to high deadtime setting up
    steady patterns

[Figure: test stand results simulating overloading the DAQ system, within a magnetic field. Panels: good wire bond; broken bond after enduring vibrational resonance stress; close-up.]
Net result: physical damage can be caused by changing trigger configuration!!!
33
Regretfully omitted
  • Developments in front-end DAQ electronics
  • STAR TOF readout [Schambach]
  • Importance of development tools
  • Network performance monitoring [many]
  • Thread debugging [Gadomski]

34
Conclusions
  • Trigger/DAQ systems have kept up well with the
    demands of doing physics with the present
    generation of experiments
  • Many new technologies and ideas have made this
    possible
  • But we are entering an entirely new regime with
    experiments of the LHC scale and must take care
    that we are not overwhelmed by complexity
  • We should try hard to find ways to get realistic
    advance looks at systems integration and scaling
    before the last minute bulk hardware buys
  • Very interesting research and reduction-to-practice
    work lies ahead in the next few years; looking
    forward especially to CHEP 2003 + 2