1
ALICE DAQ Comprehensive Review 5
  • P. VANDE VYVRE CERN/PH, for the ALICE DAQ project
  • Birmingham, Budapest, CERN, Istanbul, Split, Zagreb
  • CERN - 7/8 March 2005
  • Istanbul University: ALICE membership being discussed now with the Funding Agency

2
Acronyms (1)
  • AliROOT: ALICE sw framework based on ROOT
  • AFFAIR: A Fine Fabric and Applications Information Recorder (performance monitoring sw)
  • ADC: ALICE Data Challenge (ALICE DAQ/HLT/MSS/Offline integrated test)
  • BW: Bandwidth
  • CASTOR: CERN Advanced STORage Manager (CERN-developed MSS)
  • CTP: Central Trigger Processor (system managing TRG L0, L1, L2)
  • DAQ: Data Acquisition System
  • DAS: Direct Attached Storage (storage accessible from one computer)
  • DATE: Data Acquisition and Test Environment (ALICE DAQ sw framework)
  • DDL: Detector Data Link (ALICE optical link)
  • DDL DIU: DDL Destination Interface Unit (optical link receiving side, DAQ side)
  • DDL SIU: DDL Source Interface Unit (optical link sender side, detector side)
  • EBDS: Event Building and Distribution System (event-building load-balancing system)
  • EDM: Event Destination Manager (sw allocating the GDC for event building)
  • EOR: End Of Run (phase of the DAQ control system)
  • GDC: Global Data Collector (CPU performing event building)
  • HLT: High Level Trigger (ALICE software trigger, level 3)
  • HW: Hardware

3
Acronyms (2)
  • I/O bus: Input/Output bus (computer bus used for input/output)
  • L0, L1, L2: Trigger levels 0, 1, 2 (fast TRG based on partial data, hw)
  • LDC: Local Data Concentrator (CPU performing DDL readout and sub-event building)
  • LTC: Local Trigger Crate (local trigger system interface to the central TRG and TTC; stand-alone TRG system)
  • LTU: Local Trigger Unit (board interfacing the central TRG to the LTC)
  • MSS: Mass Storage System (data management software)
  • NAS: Network Attached Storage (storage accessible from a network through a server)
  • NIC: Network Interface Card (computer interface to the network)
  • NTW: Network
  • OO: Object-Oriented (software paradigm: C++, Java)
  • PCI: Peripheral Component Interconnect (open standard of PC I/O bus)
  • ROOT: OO software framework for I/O and visualization
  • RORC: Read-Out Receiver Card (mother-board of the DDL DIU)
  • SAN: Storage Area Network (network dedicated to serverless storage)
  • SMI: State Manager Interface (run control based on distributed state machines)
  • SOR: Start Of Run (phase of the DAQ control system)
  • SW: Software
  • TRG: Trigger
  • TTC: Trigger, Timing and Control (optical broadcast system used by the TRG)

4
ALICE DAQ
  • Data transfer: DDL and D-RORC
  • DATE V5 and DAQ fabric
  • Integration with detectors
  • Data Challenges
  • Installation
  • Commissioning

5
DAQ architecture
[Diagram: DAQ architecture. The CTP distributes L0, L1a and L2 (Rare/All) through the LTUs and TTC to the detector front-end (FERO/FEP), with BUSY returned to the CTP. 262 + 123 DDLs feed 329 D-RORCs in 175 detector LDCs; 10 DDLs and 10 D-RORCs feed 10 HLT LDCs, with H-RORCs on the HLT farm side. Data flows as event fragment, sub-event, event and file: the LDCs send sub-events over the event-building network, load-balanced by the EDM, to 50 GDCs; 5 DSS nodes and 25 TDS complete the storage network.]
6
DDL Radiation Tolerance Test
  • The SIU card works in a radiation environment:
  • Total ionizing dose: 16 Gy over 10 years
  • Neutron fluence: 3.9 x 10^11 n/cm2 over 10 years
  • Charged hadron fluence: 8.9 x 10^11 /cm2 over 10 years
  • Irradiation:
  • Cyclotron of TSL, Uppsala, Sweden: protons, 50, 150, 180 MeV
  • Cyclotron of ATOMKI, Debrecen, Hungary: neutrons, 1 to 15 MeV
  • All components are radiation tolerant except the FPGA; focus on FPGA configuration loss
  • FPGAs under test (standalone and complete DDL board):
  • ALTERA APEX-E (EP20KE...): SRAM
  • XILINX Virtex II: SRAM
  • Actel ProASIC: FLASH
  • Tests: register and RAM tests
  • MEMORY test: read and compare
  • REGISTER test: long chain of shift registers

7
Test Setup 1: DDL card
[Diagram: the DDL card (ALTERA FPGA with configuration PROM, serial/parallel converter and optical transceiver) is connected by an optical cable to the RORC in a PC.]
  • Tests: MEMORY test, REGISTER test, transfer over the DDL
Test Setup 2: FPGA test board
[Diagram: a XILINX or ACTEL development board (LVTTL interface) is connected to a PC through an adapter card, RS232 and the parallel port.]
  • Tests: REGISTER test

8
Test Firmware and Software
  • Tests:
  • MEMORY test: FPGA internal memory cells filled with a bit pattern (2048 x 16 bit)
  • REGISTER test: long chain of shift registers (128 x 16 bit, 128 x 8 bit)

[Diagram: the sw reads back and compares the expected and read-out bit patterns; an isolated difference indicates a memory cell error, while a logic cell error indicates configuration loss.]
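The read-and-compare step can be pictured with a short C sketch; read_fpga_word() is a hypothetical access routine standing in for the actual DDL or test-board access:

#include <stdint.h>
#include <stdio.h>

#define MEM_WORDS 2048                 /* test pattern: 2048 x 16 bit */

extern uint16_t read_fpga_word(unsigned addr);  /* hypothetical access routine */

/* Compare the read-back memory content against the expected pattern
 * and count the differing words. */
unsigned memory_test(const uint16_t expected[MEM_WORDS])
{
    unsigned errors = 0;
    for (unsigned addr = 0; addr < MEM_WORDS; addr++) {
        uint16_t readback = read_fpga_word(addr);
        if (readback != expected[addr]) {
            printf("addr 0x%04x: expected 0x%04x, read 0x%04x\n",
                   addr, expected[addr], readback);
            errors++;
        }
    }
    return errors;
}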
9
RadTol Project Results
10
Rad-Tol DDL SIU design
  • Self-healing card:
  • FPGA (ALTERA or XILINX) can suffer configuration loss
  • Automated configuration error detection and recovery
  • Both ALTERA and XILINX FPGAs support this with special functions
  • On-board rad-tol (e.g. Flash-based) supervisory circuit controls the mechanism (sketched below)
  • Radiation-tolerant card:
  • All components are radiation tolerant, including the FPGA
  • ACTEL ProASIC adopted as baseline
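A minimal C sketch of the self-healing idea, assuming hypothetical fpga_config_error(), fpga_reload_config() and wait_ms() primitives; the real mechanism relies on vendor-specific readback/CRC features of the ALTERA and XILINX parts:

#include <stdbool.h>

extern bool fpga_config_error(void);   /* hypothetical: vendor CRC/readback check */
extern void fpga_reload_config(void);  /* hypothetical: restart FPGA configuration */
extern void wait_ms(unsigned ms);      /* hypothetical delay */

/* Endless supervision loop: detect configuration corruption and
 * trigger an automatic reconfiguration of the FPGA. */
void supervisor_loop(void)
{
    for (;;) {
        if (fpga_config_error())
            fpga_reload_config();
        wait_ms(100);                  /* polling period (illustrative) */
    }
}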

11
DDL SIU Design (ACTEL)
[Diagram: DDL SIU design around the ACTEL ProASIC: a crystal and PLLs generate TXCLK/RXCLK (and the half-rate TXCLK/2, RXCLK/2) for a 2 x 16-bit data path with control; a SERDES converts to the serial data path of the optical transceiver; power and JTAG complete the board.]
  • Hardware:
  • Schematic ready for the future ProASIC3
  • 2 prototype boards: no design error so far
  • Firmware:
  • All modules ported from the present firmware
  • Timing-critical modules reengineered
  • Complete firmware simulated
  • Simple DDL transactions already tested

12
DDL Software
  • All functions accessible as interactive commands or an API
  • Script-based interpreter for sequences of operations:
  • Sending commands to the FEE
  • Reading the FEE status:
  • printing the status
  • comparing the status
  • polling the status
  • Downloading data into the FEE from a file
  • Reading data from the FEE:
  • writing data into a file
  • comparing data with data in a file
  • Part of the start-of-run sequence
  • TPC configuration < 3.0 s: all pedestals of all ALTROs, 2 MB/RCU, several thousand configuration files

[Diagram: the FERO is configured and read out over the DDL through the D-RORC in the LDC, carrying control, configuration and data.]
Example script (pedestal download and verification):
define pedestal_addr 0x1FFF
define enable_pedestal 0x2C
reset SIU
write_command enable_pedestal
write_block pedestal_addr pedestal.hex x
read_and_check_block pedestal_addr pedestal.hex x
13
DATE V5
  • Run control compatible with the latest version of the Experiment Control System (see ECS talk)
  • Use of a database (MySQL) for configuration
  • HLT decisions distribution
  • New multi-stream data recorder
  • Use of a database (MySQL) for the info-logging system
  • DATE V5 ready; tested during the Data Challenge (Mar 05)
  • Event building tested successfully with InfiniBand
  • Code management system: CVS
  • Release packaging and distribution: Red Hat RPM

14
Use of DBMS for DATE Configuration
  • Database content:
  • DATE roles: actors of the DATE system (LDCs, GDCs)
  • Trigger: trigger masks
  • Detectors: front-end equipment of the LDCs
  • Event-building control: event-building rules
  • Banks: memory banks to operate DATE
  • Database implementation (see the sketch after this list)
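As an illustration of DB-based configuration, a minimal C sketch using the MySQL C API; the table and column names (roles, hostname, role) are hypothetical, not the actual DATE schema:

#include <stdio.h>
#include <stdlib.h>
#include <mysql/mysql.h>

int main(void)
{
    MYSQL *db = mysql_init(NULL);
    if (!mysql_real_connect(db, "date-db", "date", "secret",
                            "date_config", 0, NULL, 0)) {
        fprintf(stderr, "connect: %s\n", mysql_error(db));
        return EXIT_FAILURE;
    }
    /* "roles" would hold the actors of the DATE system (LDCs, GDCs). */
    if (mysql_query(db, "SELECT hostname, role FROM roles")) {
        fprintf(stderr, "query: %s\n", mysql_error(db));
        return EXIT_FAILURE;
    }
    MYSQL_RES *res = mysql_store_result(db);
    MYSQL_ROW row;
    while ((row = mysql_fetch_row(res)) != NULL)
        printf("%s acts as %s\n", row[0], row[1]);
    mysql_free_result(res);
    mysql_close(db);
    return EXIT_SUCCESS;
}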

15
Distribution of HLT decisions inside DAQ
[Diagram: the CTP sends the L2 trigger pattern through the LTUs and TTC. Each detector LDC starts from the original LDC pattern; the HLT farm produces a refined LDC pattern and an HLT output pattern, which the detector and HLT LDCs and finally the GDCs use for event building over the event-building network and storage network.]
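The pattern refinement can be pictured as bit-mask manipulation. A conceptual C sketch; the AND/OR combination and the mask type are assumptions for illustration, not the documented DATE algorithm:

#include <stdint.h>
#include <stdio.h>

typedef uint64_t ldc_mask_t;           /* one bit per LDC */

/* Keep only the LDCs the HLT accepted, plus the HLT output LDCs. */
ldc_mask_t refine_pattern(ldc_mask_t original, ldc_mask_t hlt_accept,
                          ldc_mask_t hlt_output)
{
    return (original & hlt_accept) | hlt_output;
}

int main(void)
{
    ldc_mask_t orig = 0x0F;            /* LDCs 0-3 in the original pattern */
    ldc_mask_t refined = refine_pattern(orig, 0x05, 0x10);
    printf("refined pattern: 0x%llx\n", (unsigned long long)refined);
    return 0;
}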
16
HLT decision handling in detector LDC
[Diagram: inside the detector LDC, the readout process moves event fragments from the detector through the DDL SIU, DDL DIU and D-RORC into the DATE banks as raw data; a decision agent receives the HLT decisions, and the recorder sends only the selected sub-events through the NIC to the event-building network.]
17
Event building and data recording in GDCs
[Diagram: sub-events (raw data, HLT payload, HLT decisions) arrive from the event-building network.]
  • Event builder: in, sub-events; out, an I/O vector, i.e. a set of pointer/size pairs (sketched below)
  • DATE recorder: RFIO format done; ROOT data format in progress; parallel streams; CASTOR file system; interfaced to the GRID
[Diagram: in the GDC, the event builder assembles complete accepted events in the DATE data banks behind the NIC; the ROOT recorder writes them to the storage network and registers them in the GRID catalog.]
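The I/O-vector output maps naturally onto POSIX gather-write. A minimal C sketch using struct iovec and writev(), with illustrative buffer contents (the actual DATE recorder interface is not shown in this talk):

#include <stdio.h>
#include <string.h>
#include <sys/uio.h>    /* struct iovec, writev */
#include <unistd.h>

int main(void)
{
    char header[] = "event header";
    char sub1[]   = "sub-event from LDC 1";
    char sub2[]   = "sub-event from LDC 2";

    /* The event is described by pointer/size pairs, never copied
     * into one contiguous buffer. */
    struct iovec iov[3] = {
        { header, strlen(header) },
        { sub1,   strlen(sub1)   },
        { sub2,   strlen(sub2)   },
    };

    /* One writev() flushes the whole event; a real recorder would
     * target a CASTOR/RFIO or ROOT file descriptor instead. */
    if (writev(STDOUT_FILENO, iov, 3) < 0)
        perror("writev");
    return 0;
}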
18
Event building network
  • ALICE baseline for event building: Gigabit Ethernet and the TCP/IP protocol
  • Hw-independent, de-facto network standard (IP)
  • Tests done at HP HPC NT (High Performance Computing New Technologies)
  • Test of another event-building network:
  • InfiniBand (4x)
  • IPoIB (IP over IB) stack of Voltaire Inc.
  • System running smoothly
  • Not a single line of code modified
  • Changing only /etc/hosts (alternate routing of packets), as illustrated below
  • Performance with 14 LDC + 13 GDC: 2 GB/s
  • ALICE DAQ ready for future networks

[Diagram: network protocol stacks compared: TCP and iSCSI/iSER (with a TCP offload engine, TOE) over an Ethernet NIC, an RNIC and an InfiniBand HBA.]
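An illustration of the /etc/hosts change, with hypothetical host names and addresses: remapping each node name from its Ethernet address to its IPoIB address reroutes the DATE traffic without any code change.

# hypothetical /etc/hosts entries on each node
# 192.168.1.11   ldc01      <- Gigabit Ethernet path (commented out)
10.10.1.11       ldc01      # same name now resolves to the IPoIB address
10.10.1.21       gdc01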
19
Infologger using DBMS
[Diagram: all DATE processes report to a central message database: runControl, the EDM and AFFAIR, plus the rcServer, readout, decision agent and recorder on each LDC, and the rcServer, event builder and ROOT recorder on each GDC.]
  • Performance: 12,000 - 20,000 msg/s
20
Data quality monitoring: MOOD
  • MOOD: Monitoring Of Online Data
  • DATE and ROOT environments
  • MOOD framework:
  • Interfaces to detector code
  • Applications:
  • Raw data integrity
  • Detector performance

21
Monitoring software: AFFAIR
  • Monitoring of system parameters: CPU usage, memory usage, etc.
  • Monitoring of the DATE system: individual bandwidth in/out, event number, etc.

22
AFFAIR V2
  • New tool to install and configure AFFAIR
  • Used daily in the DAQ reference system
  • All performance plots of this talk are produced
    with AFFAIR

23
Transient Data Storage: Storage Arrays
  • Fast evolution during 2002-2005
  • Prices dropped dramatically by using COTS disks
  • Hard disk reliability not yet adequate
  • dotHILL SANnet II 200 FC:
  • 12 Fibre Channel disk slots
  • 1 GB cache
  • 1 x 2 Gbit fibre host channel
  • Infortrend IFT-6330:
  • 12 IDE drive slots
  • 128 MB cache
  • 2 x 2 Gbit fibre host channels
  • Infortrend EonStor A16F-G1A2:
  • 16 SATA drive slots
  • 1 GB cache
  • 2 x 2 Gbit fibre host channels

24
Storage Arrays Performance
  • Aggregate throughput measured for sets of 5 disks configured as RAID 5
[Plots: 2 GB write throughput for the dotHILL 1/2 and IFT 1/2 arrays.]
25
DAQ Reference System: sw data generator
[Diagram: DAQ reference system with software data generators: DDG sw processes emulate the detector and HLT sources feeding the LDCs; sub-events flow through the event-building network (EDM, Rare/All load balancing) to the GDCs, a DSS node and the TDS on the storage network.]
26
DAQ Reference System: hw data generator
[Diagram: the same reference system driven by hardware data generators: the LTUs distribute L0, L1a and L2 over the TTC to DDG cards feeding the LDCs and the HLT, with BUSY returned; sub-events flow through the event-building network (EDM, Rare/All load balancing) to the GDCs, a DSS node and the TDS on the storage network.]
27
DAQ/HLT setup for TPC test beam
[Diagram: detector LDCs and a VME processor with CAEN VME boards read out the Si telescope and TOF; data flows over Fast Ethernet at 10 MB/s to a GDC with 3 x 250 GB disks and on to CASTOR (1.5 TB).]
28
Combined ITS Test Beam DAQ setup
Integration with TRG and ECS!
  • NIM-based trigger logic, LTU decision, Master-Slave logic
  • TTC-based distribution, event ID from the TTC
  • ECS and DATE V5: event building by event ID (sketched below)
[Diagram: the trigger logic drives three LTUs, each with a TTCvi/TTCex pair distributing to the detector readout; data passes through the DDL SIU, DDL and DDL DIU into the RORC of an LDC (PC/Linux), then over the event-building network to a GDC and the mass storage system in the computing center.]
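A conceptual C sketch of event building keyed on the TTC event ID: sub-events from independent detector LDCs belong to the same event only if they carry the same ID. The types and fixed LDC count are illustrative, not the DATE implementation:

#include <stdint.h>
#include <stdio.h>

#define NUM_LDC 3

struct subevent {
    uint32_t event_id;                 /* event ID taken from the TTC */
    uint32_t size;
    const void *payload;
};

/* Returns 1 if all sub-events carry the same TTC event ID. */
int match_event(const struct subevent sub[NUM_LDC])
{
    for (int i = 1; i < NUM_LDC; i++)
        if (sub[i].event_id != sub[0].event_id)
            return 0;                  /* mismatch: do not build this event */
    return 1;
}

int main(void)
{
    struct subevent sub[NUM_LDC] = {
        { 42, 0, NULL }, { 42, 0, NULL }, { 42, 0, NULL },
    };
    printf("event 42 %s\n", match_event(sub) ? "built" : "rejected");
    return 0;
}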
29
DAQ/Detector integration (Feb 04)
30
DAQ/Detector integration (Mar 05)
31
Detector readout time
  • Muon TRK (F. Lefevre, S. Rousseau): 400 µs (measurement at the Aug 04 test beam, scaled to 5% occupancy in the tracking chambers)
  • SPD (A. Kluge): 260 µs (between L2A and end of transfer to DAQ)
  • SDD (D. Nouais) (ITS combined test beam, Oct 04):
  • Single-buffer: dead-time 2 ms, rate 312 events/s
  • Multi-buffer: dead-time 0.5-1.6 ms, rate 520 events/s
  • TPC (L. Musa):
  • 925 Hz for 16 kB; limited by the ALTRO-to-RCU transfer; will improve with sparse readout
  • 320 Hz for 300 kB (central event); limited by DDL bw (320 Hz x 300 kB ≈ 96 MB/s)

32
Data Challenge VI
  • System performance:
  • Event-building bandwidth: 1.5 GB/s with ALICE traffic
  • Storage bandwidth (50% increase compared to the last ADC):
  • Tape: 450 MB/s sustained over a week
  • Disk: 700 MB/s peak needed
  • System setup:
  • New 10 Gb Ethernet router unstable; firmware upgrade by the company failed
  • System reduced to a switch of the previous generation: limited number of ports, limited bandwidth
  • Limited number of machines (56 nodes: 15 LDC + 41 GDC); only flat traffic tested so far

33
Data Challenge Event building bandwidth
MBytes/s.
- Discrepancy ALICE traffic vs equal traffic
solved - But too low due to lack of CPU
34
Event building performance
35
Data Challenge: Mass Storage bandwidth
[Plot: mass-storage bandwidth in MBytes/s.]
  • Delayed to 2005: CASTOR not ready
  • Slightly less than the goal
36
16-17 Feb 2005: very promising start
[Plot: bandwidth to disk, before the start of migration.]
37
15-22 Feb 2005
[Plot: global bandwidth with migration active, compared to the goal; large fluctuations between GDCs.]
38
01-08 Mar 2005: RFIO production period
[Plot: bandwidth compared to the goal.]
39
Current status of Data Challenge
  • Hardware setup:
  • New 10 Gigabit Ethernet router not usable
  • Stable network with one single N7 box, but reduced performance and a simplified architecture (no 10 Gbit Eth router)
  • CASTOR:
  • 9 months delay; ADC VI restarted Jan 05 with the new version
  • Lots of problems have been identified and fixed in the new CASTOR version
  • Major involvement of the CASTOR team, but online debugging during a close-to-production period. Is the process under control?
  • Future schedule: only 18 months to get CASTOR right!
  • 2005: 450 MB/s; 2006: 750 MB/s
  • From Sep 2006 onwards the whole ALICE DAQ team is busy with detector integration
  • ADC PoW:
  • DAQ only - event-building test: OK
  • DAQ + RFIO + CASTOR test: OK
  • Scalability of DATE V5.5 to be done (HLT decisions handling, DB-based application configuration, infologger)
  • DAQ + ROOT + CASTOR to be done
  • We need:
  • Stable, reliable hw setup dedicated 100% to the ADC
  • Adequate storage resources to reach the milestone of 450 MB/s with ROOT

40
DAQ Installation
All equipment placed in racks. Cables and racks in DCDB. Installation planning established.
41
DAQ Services Installation
  • Revised DAQ planning:
  • DAQ at Point 2 not used before the end of the year, hence complete installation of all services before the start of DAQ hw installation
  • Jan-May 05: all services
  • May-Jun: DAQ installation
  • Jul-Aug: DAQ commissioning
  • Sep: DAQ ready for the TPC test in SXL

42
DAQ Commissioning
  • Commissioning of hardware and software in the DAQ lab:
  • Reference system with a rack from the experimental area
  • Combined TRG/DAQ/HLT/DCS/ECS tests before installation
  • Detector integration in the institutes and at test beams:
  • Hardware: DDL and RORC
  • Software: DATE, MOOD
  • DAQ for detector test and commissioning in Nov 05 for the TPC