Title: DAQ Software
1 DAQ Software
- Introduction to the System
- Goals for Installation and Commissioning
- Software Tasks and Manpower
Gordon Watts (UW, Seattle)
December 8, 1999, Directors' Review
2 Run II DAQ
- Readout channels
- Will be 800,000 in Run 2, with an average of about 250 kBytes/event
- Data rates
- 60-80 crates
- Initial design capacity
- 1000 Hz
- 250 MBytes/sec into the DAQ/L3 farm (see the quick check below)
- Integration and control
- With the online system
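These two design figures are consistent; treating the 250 kBytes quoted above as the mean event size (my reading of the angle brackets in the original), the event rate and the aggregate bandwidth are related by a one-line check:

```latex
1000~\mathrm{Hz} \times 250~\mathrm{kBytes/event} = 250~\mathrm{MBytes/s}
```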
3 Goals
- Continuous operation
- Version 0
- PCs plus Run 1 DAQ hardware simulate the functionality of the Run 2 system.
- Looks similar to the final system, both to Level 3 and to the outside user
- Integrate with hardware as it arrives
- Small perturbations
- Reliability
- Integration with Online (monitoring, errors, etc.)
- We don't get calls at 4am
- Careful testing as we go along
- Test stand at Brown
- Si test and other bootstrap operations here
- System isn't fragile
- If things aren't done in the exact order: deal with it, and give understandable error messages.
- All code kept in a code repository (VSS)
4 Front End Token Readout Loop
[Diagram: Front End Crates sit on Primary Fiber Channel Loops 1-8, each loop read out by a VRC (VRC 1-8); SB 1-4 sit on the Event Tag Loop together with the ETG; the ETG connects to the Trigger Framework, Segment Data Cables carry trigger information, and Ethernet links run to the Collector Router.]
5 L3 Software
[Diagram: the L3 software components: SB, ETG, and the SC nodes, with the L3 Supervisor and L3 Monitor connected to the Collector Router and the Online System.]
6 L3 Software
- During running, the DAQ hardware is stand-alone
- Running components do not require software intervention on an event-by-event basis, except for monitoring
- Software must deal with initialization and configuration only, except for the Farm Node
- DAQ components require very little software
- VRC and SB are simple, similar control programs with almost no parameter settings
- The ETG is similar, with more sophisticated software to handle routing table configuration (a routing sketch follows below)
- The Farm Node and Supervisor are the only components that require significant programming effort; the Monitor node to a lesser extent.
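A minimal sketch, not the ETG's actual routing code, of what "routing table configuration" could look like: a table keyed on trigger bits picks the destination node for each event, round-robin among the nodes enabled for that bit. All class, bit, and node names here are illustrative assumptions.

```cpp
#include <cstddef>
#include <iostream>
#include <map>
#include <vector>

class RoutingTable {
public:
    // Configuration step: which farm nodes may receive events firing this bit.
    void enable(int triggerBit, const std::vector<int>& nodes) {
        routes_[triggerBit] = nodes;
        next_[triggerBit] = 0;
    }

    // Event-by-event step: pick a destination node, or -1 if the bit is unknown.
    int route(int triggerBit) {
        std::map<int, std::vector<int> >::iterator it = routes_.find(triggerBit);
        if (it == routes_.end() || it->second.empty()) return -1;
        std::size_t& n = next_[triggerBit];
        int node = it->second[n];
        n = (n + 1) % it->second.size();
        return node;
    }

private:
    std::map<int, std::vector<int> > routes_;   // trigger bit -> enabled nodes
    std::map<int, std::size_t> next_;           // round-robin position per bit
};

int main() {
    RoutingTable table;
    std::vector<int> muonNodes;      // hypothetical muon partition
    muonNodes.push_back(7);
    muonNodes.push_back(8);
    table.enable(12, muonNodes);     // trigger bit 12 -> muon nodes

    for (int i = 0; i < 3; ++i)
        std::cout << "event with bit 12 -> node " << table.route(12) << "\n";
    return 0;
}
```

The same table, reloaded by the supervisor at configuration time, is what lets the farm be repartitioned without touching the event-by-event path.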
7 ETG Interface
[Diagram: the ETG Node runs the ETG Program, exchanging Triggers and Trigger Disables with the Trigger Framework; control goes over DCOM to the L3 Supervisor and L3 Monitor, with local Disk storage.]
Similar to the VRC (and SB); will reuse software.
8 Filter Process
- The physics filter executes in a separate process.
- Isolates the framework from crashes
- The physics analysis code changes much more frequently than the framework once the run has started
- Crash recovery saves the event, flags it, and ships it up to the online system for debugging.
- Raw event data is stored in shared memory (a sketch follows below)
[Diagram: the Framework and the Filter Process share event data through shared memory guarded by mutexes, a design based on Run 1 experience.]
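A minimal sketch of the shared-memory handoff on NT, not the real framework code: the framework side creates a named file mapping and a named mutex, copies the raw event in under the mutex, and the filter process (not shown) opens the same names to read it. The segment and mutex names and the buffer size are illustrative assumptions.

```cpp
#include <windows.h>
#include <cstring>
#include <cstdio>

const char* kShmName   = "L3_EventBuffer";   // hypothetical names
const char* kMutexName = "L3_EventMutex";
const DWORD kBufSize   = 256 * 1024;         // roughly one 250 kByte event

int main() {
    // Framework side: create the shared-memory region and its guard mutex.
    HANDLE hMap = CreateFileMappingA(INVALID_HANDLE_VALUE, NULL,
                                     PAGE_READWRITE, 0, kBufSize, kShmName);
    HANDLE hMtx = CreateMutexA(NULL, FALSE, kMutexName);
    if (!hMap || !hMtx) return 1;

    void* buf = MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS, 0, 0, kBufSize);
    if (!buf) return 1;

    // Publish the raw event under the mutex.  If the filter process crashes,
    // the event is still intact here and can be flagged and shipped to the
    // online system for debugging.
    WaitForSingleObject(hMtx, INFINITE);
    std::memset(buf, 0, kBufSize);           // stand-in for the raw event copy
    ReleaseMutex(hMtx);

    std::printf("event published in %s\n", kShmName);
    UnmapViewOfFile(buf);
    CloseHandle(hMap);
    CloseHandle(hMtx);
    return 0;
}
```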
9 Physics Filter Interface
- ScriptRunner
- Framework that runs physics tools and filters to make the actual physics decision.
- Cross-platform code (NT, Linux, IRIX, OSF?)
- Managed by the L3 Filters Group
[Diagram: the L3 Filter Process contains ScriptRunner and the L3 Framework Interface.]
10 L3 Supervisor
- Manages configuration of the DAQ/Trigger Farm
- About 60 nodes
- Command planning
- The online system will send Level 3 simple commands
- L3 must translate them into the specific commands to each node to achieve the online system's requests (see the sketch after this list)
- Supports
- Multiple runs
- Partitioning the L3 Farm
- Node crash and recovery
- Generic error recovery
- With minimal impact on running
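A minimal sketch of the command-planning step, assuming a hypothetical planCommands helper and made-up partition and node names; this is not the real Supervisor code, which talks to its clients over DCOM. It shows only the fan-out: one simple online-system request becomes a list of per-node commands for the nodes in the requested partition.

```cpp
#include <string>
#include <vector>
#include <map>
#include <iostream>

struct NodeCommand {
    std::string node;     // e.g. "VRC-1", "FarmNode-07" (illustrative)
    std::string command;  // text command eventually shipped to that node
};

// Expand one online-system request into per-node commands.
std::vector<NodeCommand> planCommands(
        const std::string& runCmd,
        const std::map<std::string, std::vector<std::string> >& partitions,
        const std::string& partition)
{
    std::vector<NodeCommand> plan;
    std::map<std::string, std::vector<std::string> >::const_iterator it =
        partitions.find(partition);
    if (it == partitions.end()) return plan;   // unknown partition: empty plan

    for (std::size_t i = 0; i < it->second.size(); ++i) {
        NodeCommand nc;
        nc.node    = it->second[i];
        nc.command = runCmd + " partition=" + partition;
        plan.push_back(nc);
    }
    return plan;
}

int main() {
    std::map<std::string, std::vector<std::string> > partitions;
    partitions["muon"].push_back("VRC-1");
    partitions["muon"].push_back("SB-1");
    partitions["muon"].push_back("FarmNode-07");

    std::vector<NodeCommand> plan =
        planCommands("begin_run 1042", partitions, "muon");
    for (std::size_t i = 0; i < plan.size(); ++i)
        std::cout << plan[i].node << " <- " << plan[i].command << "\n";
    return 0;
}
```

Keeping the plan as data before anything is sent is one way to support multiple simultaneous runs and generic error recovery with minimal impact on the rest of the farm.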
11 Error Logging and Monitoring
- Error logging
- The L3 Filters group will use the ZOOM ErrorLogger
- Adopted a consistent set of standards for reporting errors.
- Plug-in module to get the errors off the Level 3 nodes
- Sent to the monitor process for local relay to the online system
- Logfiles written in a standard format
- Trying to agree with the online group to make this standard across all online components
- Monitoring
- Non-critical information
- Event counters, buffer occupancy, etc.
- Variables are declared and mapped to shared memory
- A slow repeater process copies the data to the monitor process.
12 DAQ/Trigger Integration
- Between the Hardware and the Online System
- Interface is minimal
- Data Output
- Control
- Monitor Information
- Implications are not Minimal
[Diagram: detector readout crates and the L1/L2 TCC feed the NT Level 3 DAQ system (L3 Nodes, L3 Supervisor, L3 Monitor) over data cables; Ethernet connects Level 3 to the UNIX host online system, including the Collector/Router, COOR, Monitor, Data Logger, DAQ and Detector Consoles, RIP, and the FCC.]
13 Software Integration
- Integration outside of the Level 3 software
- Integration with offline (where we meet)
- Level 3 Filter
- Must run the same offline and online
- DOOM/DSPACK
- Control and Monitor communication
- Uses the ITC package
- The Online Group's standard communications package
Requires offline-like code releases built for the online platform
14 NT Releases
- The build is controlled by SoftRelTools
- 100s of source files
- A build system is required
- UNIX-centric (offline)
- Too much work to maintain two
- SRT2 NT integration is done
- SRT2 is the build system
- Set us back several months: no assigned person
- Li (NIU MS) is building NT releases now
- Just starting
- Starting with a small DAQ-only release
- DSPACK and friends, itc, thread_util, l3base
- Next step is to build a release of roughly 30 packages
- Everything we had in the nt00.12.00 release
15 NT Releases
- Progress is slow
- The build system is still in flux!
- What does it affect?
- ScriptRunner, filters, tools, ITC
- The 10% test right now
- Our ability to test the system now
- Dummy versions of the SR interface
- Regular NT trigger releases must occur by March 15, 2000
- Muon filtering in L3 is one of the commissioning milestones.
16 Scheduling
- Conflicting requirements
- Must be continuously available starting now
- Must upgrade and integrate final hardware as it arrives
- Software is impacted
- Must upgrade in tandem and without disturbing the running system
- Tactic
- Version 0 of the software
- Upgrade adiabatically
- The interface to internal components remains similar
- The interface to the online system does not change
- Test stand at Brown University for testing.
17 VRC Interface
[Diagram: the VRC Node runs the VRC Program, reading VBD data cables at 2 x 50 MB/s; FCI from the last SB and FCI to the 1st SB at 100 MB/s each; control via DCOM to the L3 Supervisor and L3 Monitor, with local Disk.]
18 VRC Interface (V0)
[Diagram: the Version 0 VRC Node runs the VRC Program, reading VBD data cables at 2 x 50 MB/s through two VME/MPM boards and forwarding events over a 100 Mb/s link to the SB/ETG; control via DCOM to the L3 Supervisor and L3 Monitor, with local Disk.]
19 November 1, 1999
- Read raw data from FEC into VRC
- Send raw data in offline format to online system
- Control via COOR
- Held up by NT releases
[Diagram: the FEC is read out by the Detector/VRC node (with the Auto Start Utility); COOR and the L3 Supervisor connect over ITC; the Supervisor controls the VRC over DCOM; data goes over ITC to the Collector Router at a rate labeled 50 in the original diagram.]
20 February 15, 2000
- Multicrate readout
- Internal communication done via ACE
- Already implemented
[Diagram: as for November 1, but with two Detector/VRC nodes reading out FECs, an SB/ETG stage, and an L3 Farm Node in front of the Collector Router; COOR and the L3 Supervisor connect over ITC, control uses DCOM, internal links use ACE; rates labeled 75, 50 and 25 in the original diagram.]
21 March 15, 2000
- Muon Filtering in Level 3
- ScriptRunner Interface must be up
- NT releases must be regular
[Diagram: same configuration as the February 15 milestone (FECs, Detector/VRC nodes, SB/ETG, L3 Farm Node, L3 Supervisor, COOR, Collector Router); rates labeled 50 and 20 in the original diagram.]
22 May 1, 2000
- Multistream Readout
- Ability to partition the L3 Farm
- Multiple Simultaneous Runs
- Route events by trigger bits
- ScriptRunner does output streams
[Diagram: same configuration as the previous milestones; rates labeled 45, 25 and 10 in the original diagram.]
23 Test Stands
- Detector subsystems have individual setups
- Allows them to test readout with the final configuration
- Allows us to test our software early
- High-speed running, stress tests for the DAQ software
- Subsystems have some unique requirements
- Necessary for error rate checking in the Si, for example
- Separate software development branches
- Attempt to keep as close as possible to the final L3 design to avoid support headaches.
24 Test Stands
- Three test stands currently in operation
- Brown Test Stand
- Tests hardware prototypes
- Primary software development
- Silicon Test Stand
- An L3 Node directly reads out a front end crate
- Helping us and the Si folks test readout, perform debugging, and make system improvements
- CFT Test Stand
- Instrumented and ready to take data (missing one tracking board (VRBC) to control readout)
25 10% Test
- The Si Test Stand will evolve into a full-blown readout
- 10% test: single barrel readout
- Requires a full L3 Node
- Tests out the silicon filter code
- ScriptRunner, trigger tools, etc.
- NT releases must be up to speed for this
- This is in progress as we speak
- The ScriptRunner components are held up by NT releases.
26 People
- Joint effort
- Brown University
- University of Washington, Seattle
- People
- Gennady Briskin, Brown
- Dave Cutts, Brown
- Sean Mattingly, Brown
- Gordon Watts, UW
- 1 post-doc from UW
- Students (>1)
27 Tasks
- VRC
- Simple; once done it will require few modifications. (1/4 FTE)
- SB
- Simple; once done it will require few modifications (very similar to the VRC). (1/4)
- ETG
- Complex initialization required; the hardware interface is not well understood yet, so it requires little work now. By the time the VRC work ramps down, this will ramp up. (1/2)
- Farm Node
- A large amount of work is left to do in communication with the supervisor and with ScriptRunner. Will require continuous work as the system gains in complexity. (3/4)
28 Tasks
- L3 Supervisor
- Complex communication with COOR; started, but will require continuous upgrades as the system grows in complexity. (1/2)
- Monitoring
- Initial work done by undergraduates. Has to be interfaced to the outside world. No one is working on it at the moment. (1/4)
- NT Releases
- Offloading to an NIU student. Requires continuous work and interfacing with many different software developers. (1)
- L3 Filter Integration
- Done by hand now; will have to be made automatic. Take advantage of offline tools. (1/2)
29 Conclusions
- NT releases have been the biggest delay
- Keeping up with the offline changes requires constant vigilance
- Offloading this task to a dedicated person.
- Impacts the 10% test and the March 15 milestone
- The group is the correct size to handle the task
- Continuous operation
- Integrating the new hardware with the software
- As long as this group isn't also responsible for releases.
- Current weak points
- Monitoring
- Integration with the online system (log files, error messages, etc.).
30 L3 Farm Node
[Diagram: the L3 Node Framework reads a VME crate through two MPMs (each 48 MB/s) over a DMA-capable VME-PCI bridge and a Node-VME I/O module into shared memory buffers; a Control, Monitoring and Error module, the L3 Filter Interface module, and a Collector Router module with dedicated 100 Mbits/s Ethernet to the online Collector/Router complete the framework.]
- A prototype of the framework is finished
- Runs in the Silicon Test Stand
- A second version will be finished by Jan 1
- Improved speed, interaction between processes, a new interface, and stability
31 Details of the Filter Node
[Diagram: event buffers in shared memory circulate through a chain of queues (Pool Queue, Validation Queue, Filter Queue, Output Events Queue, Output Pool Queue) linking the MPM Reader, Event Validation, the L3 Filter Process (through the L3 Filter Input and Process Interfaces), and the Collector/Router Network Interface; a Command/Monitor/Error shared memory region serves the L3 Monitor and L3 Error Interfaces. A sketch of the queue scheme follows below.]
- MPM Reader
- Gets a pointer to an event buffer
- Configures the MPMs for receiving a new event
- Waits until the complete event arrives in the MPM
- Loads the event data into the shared memory buffer
- Inserts the event pointer into the next queue
- Event Validation
- FEC presence validation
- Checksum validation
- Collector/Router Network Interface
- Determines where the event should be sent
- Sends the event to the collector/router node (data to the online host system)
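A minimal sketch of the queue-of-buffer-pointers scheme this slide describes, not the production farm-node framework: event buffers live in one pool, a reader stage fills a free buffer and passes its pointer down the chain (validate, filter, output), and the pointer returns to the pool once the event has been shipped. The names and the single-threaded loop are illustrative assumptions; the real node runs these stages as separate modules against shared memory.

```cpp
#include <cstddef>
#include <deque>
#include <vector>
#include <iostream>

struct EventBuffer {
    std::vector<unsigned char> data;   // raw event, roughly 250 kBytes in practice
    bool valid;
};

typedef std::deque<EventBuffer*> Queue;   // pointers only; event data is never copied

int main() {
    // Fixed pool of buffers, standing in for the shared-memory event buffers.
    std::vector<EventBuffer> pool(4);
    Queue poolQ, validationQ, filterQ, outputQ;
    for (std::size_t i = 0; i < pool.size(); ++i) poolQ.push_back(&pool[i]);

    for (int event = 0; event < 3; ++event) {
        // MPM Reader: take a free buffer, "read" an event into it.
        EventBuffer* buf = poolQ.front(); poolQ.pop_front();
        buf->data.assign(1024, static_cast<unsigned char>(event));
        validationQ.push_back(buf);

        // Event Validation: FEC presence / checksum stand-in.
        buf = validationQ.front(); validationQ.pop_front();
        buf->valid = !buf->data.empty();
        filterQ.push_back(buf);

        // Filter stage: the separate L3 filter process would decide here.
        buf = filterQ.front(); filterQ.pop_front();
        outputQ.push_back(buf);

        // Collector/Router interface: ship the event, return the buffer to the pool.
        buf = outputQ.front(); outputQ.pop_front();
        std::cout << "event " << event << (buf->valid ? " sent\n" : " dropped\n");
        poolQ.push_back(buf);
    }
    return 0;
}
```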
32 L3 Supervisor Interface
- Receives and interprets COOR commands and turns them into internal state objects
- The next step is communication to the clients
[Diagram: COOR in the Online System sends commands to the Supervisor's COOR Command Interface; a Resource Allocator works against the current configuration DB and the desired configuration database, handling configuration requests; a Command Generator/Sequencer issues commands and direct commands to the clients (VRC/ETG/SB/L3 Node).]
33 Auto Start System
- Designed to automatically start after a cold boot and bring a client to a known idle state (a sketch of the startup handshake follows below)
- Also manages software distribution
[Diagram: a central Auto Start Service, backed by the configuration and package databases, talks to an Auto Start Service on each client machine: change packages, get status, reboot, etc.; the client gets the package list, reports its running packages, and installs packages.]
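A minimal sketch of the cold-boot sequence, assuming a hypothetical getPackageList call and made-up package and node names; this is not the real Auto Start Service, and a networked version would put an RPC/DCOM layer between the two sides.

```cpp
#include <iostream>
#include <set>
#include <string>

// Server side: the packages this client is supposed to run,
// as recorded in the configuration/package databases.
std::set<std::string> getPackageList(const std::string& client) {
    std::set<std::string> pkgs;
    pkgs.insert("l3base");
    pkgs.insert("vrc_control");
    (void)client;   // a real service would look the client up
    return pkgs;
}

int main() {
    // Client side after a cold boot: nothing is running yet.
    std::set<std::string> running;

    // 1. Ask the server which packages this node should have.
    std::set<std::string> wanted = getPackageList("vrc-node-01");

    // 2. Install and start whatever is missing.
    for (std::set<std::string>::const_iterator it = wanted.begin();
         it != wanted.end(); ++it) {
        if (running.find(*it) == running.end()) {
            std::cout << "installing and starting " << *it << "\n";
            running.insert(*it);
        }
    }

    // 3. Report back and settle into the known idle state.
    std::cout << "node idle with " << running.size() << " packages running\n";
    return 0;
}
```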
34 Timeline
[Gantt chart, 2000-2001: DAQ available throughout; detector installation and hookup activities (ICD, FT, SMT, FPS, VLPCs, ECS, final CFT electronics, tracking front-end, CAL BLS, waveguide production, CC/ECN cosmics, forward muon MDT pixel planes, end toroids, EMC), roll-in and removal of the shield wall, leading to beam-ready.]
- 1st Collaboration Commissioning Milestone: Feb 15, 2000
- Run II begins Mar 1, 2001
- Cosmic ray commissioning
- Phase I: central muon, DAQ, RECO, trigger, tracker front-end
- Phase II: fiber tracker, preshowers, VLPCs, CFT, forward muon
- Phase III: full cosmic ray run (add TRIG, SMT, CAL)
35 Silicon Test Display
[Screenshot: the silicon test stand display, showing the master GUI, monitor counters, and the raw data viewer for CPU1 and CPU2.]
36 Monitoring
- Non-essential information
- Helpful for debugging
- Two sources of information
- Level 3 physics trigger info
- Accept rates, filter timing.
- The framework ships a binary block of data out to a concentrator (1-of-50), which combines it and re-presents it.
- Framework items
- Event counters, node state.
So others can read without impacting the system
37 Monitoring
- Framework items use a shared memory scheme (see the sketch below)
[Diagram: Framework Processes 1-3 write to shared memory (NT native now, soon ACE); a Slow Retransmitter Process forwards the data over TCP/IP (ACE) to the rest of the world. Try to reuse online software.]
- Framework process
- Saves the name, type, and data of each monitor item
- The data type is arbitrary
- Implemented with template classes
- Rest of world
- Requests particular items
- Update frequency
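A minimal sketch of the templated monitor-item idea, not the actual monitoring code: a framework process declares items by name with an arbitrary data type, and a retransmitter walks the table to ship them to requesters. Here an ordinary map stands in for the shared-memory segment, and the MonitorRegistry name and its declare method are illustrative assumptions.

```cpp
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <typeinfo>

struct MonitorRegistry {
    // item name -> (type name, value rendered as text)
    std::map<std::string, std::pair<std::string, std::string> > items;

    // Declare (or update) a monitor item of any streamable type.
    template <typename T>
    void declare(const std::string& name, const T& value) {
        std::ostringstream os;
        os << value;
        items[name] = std::make_pair(std::string(typeid(T).name()), os.str());
    }
};

int main() {
    MonitorRegistry registry;

    // Framework side: declare a few items; the data type is arbitrary.
    registry.declare("events_seen", 1234);
    registry.declare("node_state", std::string("RUNNING"));
    registry.declare("buffer_occupancy", 0.42);

    // Retransmitter side: walk the table and ship requested items onward.
    for (std::map<std::string, std::pair<std::string, std::string> >::iterator
             it = registry.items.begin(); it != registry.items.end(); ++it)
        std::cout << it->first << " [" << it->second.first << "] = "
                  << it->second.second << "\n";
    return 0;
}
```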