Title: Process and Data Flow Control in KLOE
1. Process and Data Flow Control in KLOE
- E. Pasqualucci (INFN - Roma)
- enrico.pasqualucci_at_roma1.infn.it
2. Outline
- System overview
- Process structure and local communication
- SNMP and remote communication
- Process control
- Data Flow Control system
- DFC monitor
3. DAQ system architecture
- 23000 FEE channels @ 2.5 kHz φ + background (10 kHz)
- Bandwidth: 50 Mbytes/s (5 Kbyte/event)
- Storage: 200 Tbyte/year
- Tested with peak rates of 10 kHz in multibunch mode
- Tested at maximum required throughput using non-zero-suppressed calorimeter data
[Diagram labels: trigger chain, DFC system, Level-2 crates (CPU, VIC, FDDI), VIC and CBUS links, FDDI switch, Run Control, Monitor System, CPU servers, storage system]
4. DAQ software organization
[Diagram of the DAQ software and the SlowCtl system; legend: Data, Map data, Messages, Traps]
5. Process structure
- Initialization
  - Message queue creation
  - Shared memory subscription
  - Shared memory space allocation for variables
- Main loop
  - Process event
  - Process command
  - Idle time
- Interrupt handler
  - Extract command from the message queue
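The structure above can be sketched as a single-threaded loop. This is a minimal illustration, not KLOE's code: `queue.Queue` stands in for the UNIX message queue, a dict stands in for the shared-memory variables, and the interrupt handler is called directly instead of being driven by a signal. All class, method, and variable names are hypothetical.

```python
import queue


class DaqProcess:
    """Toy process: initialization, main loop, interrupt handler."""

    def __init__(self, name):
        # Initialization: create the message queue and reserve variable space.
        self.name = name
        self.msg_q = queue.Queue()        # stand-in for a UNIX message queue
        self.variables = {"events": 0}    # stand-in for shared-memory variables
        self.pending = []                 # commands extracted by the handler
        self.last_command = None
        self.running = True

    def interrupt_handler(self):
        # Interrupt handler: only extracts commands from the message queue.
        while True:
            try:
                self.pending.append(self.msg_q.get_nowait())
            except queue.Empty:
                break

    def process_event(self, event):
        self.variables["events"] += 1

    def process_command(self, command):
        self.last_command = command
        if command == "stop":
            self.running = False

    def main_loop(self, events):
        # Main loop: process one event, then one pending command, else idle.
        events = iter(events)
        while self.running:
            event = next(events, None)
            if event is not None:
                self.process_event(event)
            if self.pending:
                self.process_command(self.pending.pop(0))
            elif event is None:
                break  # nothing left to do in this toy run (idle time)


proc = DaqProcess("collector")
proc.msg_q.put("stop")
proc.interrupt_handler()   # simulate the interrupt raised by a sender
proc.main_loop(["ev1", "ev2", "ev3"])
```

Keeping the handler limited to dequeuing is a design choice of this sketch: the command is executed later, at a well-defined point of the main loop.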
Shared memory mapping (all processes):
Id      | Contents
Header  | Process number, pointer to 1st process, pointer to 2nd process, pointer to 3rd process, ..
Proc. 1 | Process name, process id, message queue id, process status, last command, last command status, number of variables, variable 1, variable 2, ..
Proc. 2 | (same layout)
6. Local communication
- Sending a command
  - The sender
    - Locates the process
    - Gets its id and message queue
    - Puts the command into the queue
    - Sends an interrupt
  - The receiver
    - Writes the command and its status, then executes it
    - Writes the command status (acknowledgement)
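The command hand-off listed above can be sketched as follows. This is an illustration under stated assumptions: the process table is a plain Python list instead of shared memory, `queue.Queue` replaces the message queue, and the interrupt is modeled as a direct function call. The field names follow the slide; the function names, process names, and pids are hypothetical.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class ProcessEntry:
    """One per-process block of the shared table (fields follow the slide)."""
    name: str
    pid: int
    msg_q: queue.Queue = field(default_factory=queue.Queue)
    status: str = "idle"
    last_command: str = ""
    last_command_status: str = ""


def send_command(table, name, command, interrupt):
    # Sender: locate the process, get its queue, enqueue, then interrupt it.
    entry = next(e for e in table if e.name == name)
    entry.msg_q.put(command)
    interrupt(entry)                    # stand-in for signalling entry.pid
    return entry.last_command_status    # acknowledgement written by receiver


def receiver_interrupt(entry):
    # Receiver: record the command, execute it, then write its final status.
    command = entry.msg_q.get_nowait()
    entry.last_command = command
    entry.last_command_status = "running"
    if command == "start":
        entry.status = "running"
    entry.last_command_status = "done"


table = [ProcessEntry("collector", 101), ProcessEntry("recorder", 102)]
ack = send_command(table, "recorder", "start", receiver_interrupt)
```

Because the acknowledgement lives in the shared table, the sender can read it without a reply message.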
[Diagram: two copies of the shared-memory process table and a message queue (Q)]
7. Managing the DAQ network
- SNMP (Simple Network Management Protocol)
  - Widely used to manage network devices
  - Defined as a standard by the IETF (Internet Engineering Task Force)
  - Implemented on top of UDP, with reliability added above the transport layer
  - Used to retrieve and/or set information about
    - network configuration
    - traffic
    - faults
    - accounting
  - Managed objects are defined in a Management Information Base (MIB) defined by the IETF
  - Private extensions of the standard MIB are allowed
  - Public domain software allows the implementation of
    - dedicated agents
    - utilities for remote access
8. SNMP client-server policy
- MIB
  - Variables organized as a tree
- Primitives
  - get, get-next, set
- Each device runs a daemon able to
  - Understand MIB requests
  - Obtain required information
  - Execute required actions
- Trap mechanism
- KLOE uses SNMP to
  - Control DAQ devices and network
  - Implement message distribution
  - Implement process control
  - Implement Data Flow Control (DFC)
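The three primitives can be illustrated with a toy in-memory MIB. This is a sketch of the idea only, with no network layer and no real SNMP library: OIDs are integer tuples kept in lexicographic order, which is what makes get-next a tree walk. The private sub-tree prefix and its variables are hypothetical and are not KLOE's actual MIB.

```python
import bisect


class MiniMib:
    """Toy MIB: OIDs are integer tuples kept in lexicographic order."""

    def __init__(self, variables):
        self._vars = dict(variables)
        self._oids = sorted(self._vars)

    def get(self, oid):
        return self._vars[oid]

    def get_next(self, oid):
        # Return the first (oid, value) pair strictly after `oid`.
        i = bisect.bisect_right(self._oids, oid)
        if i == len(self._oids):
            return None                      # end of the MIB view
        nxt = self._oids[i]
        return nxt, self._vars[nxt]

    def set(self, oid, value):
        if oid not in self._vars:
            raise KeyError(oid)
        self._vars[oid] = value


def walk(mib, root):
    # Repeated get-next visits every variable under `root`.
    out, oid = [], root
    while (hit := mib.get_next(oid)) is not None and hit[0][:len(root)] == root:
        out.append(hit)
        oid = hit[0]
    return out


# Hypothetical private sub-tree; not KLOE's real OIDs.
KLOE = (1, 3, 6, 1, 4, 1, 9999)
mib = MiniMib({
    (1, 3, 6, 1, 2, 1, 1, 5, 0): "node01",  # sysName.0 in the standard MIB-2
    KLOE + (1, 0): "idle",                  # hypothetical process status
    KLOE + (2, 0): 0,                       # hypothetical event counter
})
mib.set(KLOE + (1, 0), "running")
subtree = walk(mib, KLOE)
```

A remote "get variable" or "send command" then reduces to a get or a set on an agreed OID, which is why a private sub-tree is enough to carry process control.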
9. The command server and the KLOE MIB sub-tree
10. Message system implementation
11. Remarks and performance
- Command server
- DAQ process
- receives commands and shares variables
- Command distributor
- Run and process control tools
- tcl/tk commands implemented
- get variable, send message
- Fortran interface for old-fashioned software
- Portable
- AIX, OSF1, HP-UX, Solaris, Linux, LynxOS supported
- Optimized library
- Parallel message distribution implemented
- Performance
- Local command 1.2 ms
- Remote variable reading 1.2 ms
- Remote command completion 4 ms
12. Production process control
[Diagram: control node and production node; processes pcd, OffCtl, cmdsrv, locpc, Proc_1, Proc_2; arrows labeled command, start, trap, signal, check]
13. DAQ system architecture
[Diagram labels: trigger chain, DFC system, Level-2 crates (CPU, VIC, FDDI), VIC and CBUS links, FDDI switch, Run Control, Monitor System, CPU servers, storage system]
14. The DFC System
- Changes the packet distribution sequence
- Avoids slow-down in data transmission and blocking timeouts
- Keeps latency under control
[Diagram labels: network and trigger statistics, performance statistics, statistics, commands, traps, flow table data]
15. Receiver protocol
- Receives event sub-packets through the GigaSwitch
- Puts packets into multiple circular buffers
- Implements the DFC and LatMon farm interface
- Dynamic thresholds
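One way to picture a receiver buffer that drives flow control is a bounded queue with two watermarks. The sketch below models a single buffer, not the multiple circular buffers of the real receiver, and assumes a simple hysteresis: a "full" trap when the high threshold is reached and an "empty" trap once the level falls back to the low one. The class name, threshold values, and trap callback are hypothetical.

```python
from collections import deque


class ReceiverBuffer:
    """Bounded buffer that reports 'full'/'empty' when crossing thresholds."""

    def __init__(self, capacity, high, low, send_trap):
        self.buf = deque()
        self.capacity, self.high, self.low = capacity, high, low
        self.send_trap = send_trap
        self.flagged_full = False

    def put(self, packet):
        if len(self.buf) >= self.capacity:
            raise OverflowError("buffer overrun")
        self.buf.append(packet)
        if not self.flagged_full and len(self.buf) >= self.high:
            self.flagged_full = True
            self.send_trap("full")

    def get(self):
        packet = self.buf.popleft()
        if self.flagged_full and len(self.buf) <= self.low:
            self.flagged_full = False
            self.send_trap("empty")
        return packet

    def set_thresholds(self, high, low):
        # Dynamic thresholds: may be changed while the buffer is in use.
        self.high, self.low = high, low


traps = []
rb = ReceiverBuffer(capacity=8, high=6, low=2, send_trap=traps.append)
for n in range(6):
    rb.put(n)      # the sixth packet reaches the high threshold
for _ in range(4):
    rb.get()       # the fourth read leaves 2 packets: low threshold reached
```

The gap between the two thresholds keeps a buffer that hovers around one level from emitting a trap on every packet.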
16. DFC Protocol
DFC data in VME shared memory
- Initialization
  - Builds the network map
  - Builds the DFC map (ordered list of RECV IP addresses)
  - Creates the first table with infinite trigger-number validity
- Main loop
  - Waits for a trap
  - On trap (full/empty)
    - Reads the last trigger number from the Trigger Supervisor
    - Creates the next table
    - Modifies the validity of the previous table
  - Sends auto-test traps
[Diagram labels: validity trigger]
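The table hand-over described in this protocol can be sketched as a list of flow tables, each valid up to a trigger number. This is an illustration under assumptions that are not in the slides: packets are assigned round-robin by trigger number, a receiver that reported "full" is simply left out of the next table, and `margin` is a hypothetical safety offset added to the last trigger number. The IP addresses are placeholders.

```python
import math


class FlowControl:
    """Toy DFC daemon state: a list of (validity, receivers) flow tables."""

    def __init__(self, receivers):
        self.all_receivers = list(receivers)   # DFC map: ordered RECV addresses
        self.busy = set()
        # First table: valid for every trigger number.
        self.tables = [(math.inf, list(receivers))]

    def on_trap(self, receiver, kind, last_trigger, margin):
        # kind is "full" or "empty"; margin is a hypothetical safety offset.
        if kind == "full":
            self.busy.add(receiver)
        else:
            self.busy.discard(receiver)
        switch_at = last_trigger + margin
        _, old = self.tables[-1]
        self.tables[-1] = (switch_at, old)     # previous table now expires
        active = [r for r in self.all_receivers if r not in self.busy]
        self.tables.append((math.inf, active)) # next table, open-ended

    def destination(self, trigger):
        # Use the first table whose validity covers this trigger number.
        for validity, receivers in self.tables:
            if trigger < validity:
                return receivers[trigger % len(receivers)]
        raise LookupError(trigger)


dfc = FlowControl(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
dfc.on_trap("10.0.0.2", "full", last_trigger=1000, margin=100)
```

Triggers below 1100 still follow the old table, so packets already in flight are not rerouted; from trigger 1100 on, the full receiver is skipped. The sketch does not handle the case where every receiver is full.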
17. DFC algorithm and performance
- Validity
  - $v = t_0 + \left(t_{tr} + (t_{dfc} + k\,\sigma_{dfc})\right)(\nu + k\,\sigma_{\nu})\,t$
  - $k = 5$
- autotest
- DFCd reaction time (trap)
  - 1.2 ms
- DFC reaction time
  - $t_{local}$: 1.2 ms
  - trigger interaction: 6-7 ms
  - $t_{dfc}$: $O(10^{-2})$ ms
  - total: 10 ms
- DFC-L2 interaction rate
  - 1 table / 50 ms (sustained)
- DFC dead time implemented
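A rough numerical reading of the validity rule may help. The sketch below assumes that the validity is the last trigger number plus the reaction time multiplied by the trigger rate, each padded by k standard deviations; this is an interpretation of the slide's formula, whose operators did not survive extraction, and the trailing symbol of that formula is not modeled. The input values are illustrative: 7 ms and 10^-2 ms are taken from the timings listed above, 10 kHz is the peak rate quoted earlier, and the spreads are set to zero because the slides do not give them.

```python
def validity(t0, t_tr, t_dfc, sigma_dfc, rate, sigma_rate, k=5):
    """Trigger number from which a new table applies (assumed reading)."""
    reaction_time = t_tr + (t_dfc + k * sigma_dfc)   # seconds
    worst_rate = rate + k * sigma_rate               # triggers per second
    return t0 + reaction_time * worst_rate


# Illustrative inputs only; the spreads are placeholders.
v = validity(t0=1000, t_tr=7e-3, t_dfc=1e-5, sigma_dfc=0.0,
             rate=10_000, sigma_rate=0.0)
```

With these numbers the new table would take effect roughly 70 triggers after the last one read from the Trigger Supervisor; non-zero spreads push that further out.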
18. The DFC status monitor
19. Packet latency
- Latency measurements
  - SNMP traps sent to LatMon
    - Collector trap when the packet is released for sender
    - Receiver trap when all the sub-packets have arrived
- Test for receiver buffers
20. Summary
- A fast and reliable message system has been implemented using standard UNIX mechanisms and the SNMP protocol
  - Very simple to use
    - process template and command definition
    - Fortran and tcl/tk interface
  - Allows full process control
- A Data Flow Control system has been developed using the message system and SNMP traps
  - It allows network traffic to be redirected, taking into account the dynamics of the whole system
  - Dynamic redefinition of thresholds
  - It successfully ran during KLOE data acquisition