Title: ALICE and GRID
1 ALICE and GRID
- Existing DataGrid activities
  - Work Package 8 HEP Applications
    - Testbed validation
    - No Nordic participation
  - Work Package 4 Fabric Management
    - Fault tolerance
    - High Level Trigger (HLT) participation
      - University of Heidelberg
- Possible future Norwegian activities
  - ALICE HLT system
    - Comparable to TIER 1
    - Overlap with many Work Packages
2 High Level Trigger (HLT) for ALICE
- Bergen
- Budapest
- Frankfurt
- Heidelberg
- Oslo
3 (No transcript)
4 TPC event (only about 1% is shown)
5 HLT @ ALICE
- Detector readout rate (e.g. TPC) >> DAQ bandwidth > mass storage bandwidth
- Physics motivation for a high level trigger
- Need for rudimentary online event reconstruction for monitoring
6 Data volume and event rate
[Data-flow figure, TPC: detector (data volume 300 Mbyte/event, event rate 200 Hz) → 60 Gbyte/sec → front-end electronics → 15 Gbyte/sec → realtime data compression and pattern recognition on a PC farm (500 clustered SMPs, parallel processing) → < 2 Gbyte/sec → DAQ event building → < 1.2 Gbyte/sec → permanent storage system.]
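The reduction factors implied by this chain can be checked directly; a minimal C++ sketch using only the figures quoted above:

```cpp
#include <iostream>

// Cross-check of the data-flow figures above (all values from the slide).
int main() {
    const double event_size_gb = 0.3;    // TPC: 300 Mbyte/event
    const double event_rate_hz = 200.0;  // 200 Hz

    const double detector_rate = event_size_gb * event_rate_hz;  // Gbyte/sec
    const double daq_limit     = 2.0;    // Gbyte/sec at DAQ event building
    const double storage_limit = 1.2;    // Gbyte/sec to permanent storage

    std::cout << "detector output:              " << detector_rate << " Gbyte/sec\n"            // 60
              << "reduction needed for DAQ:     " << detector_rate / daq_limit << "x\n"          // 30x
              << "reduction needed for storage: " << detector_rate / storage_limit << "x\n";     // 50x
}
```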
7 High Level Trigger
- HLT applications
  - Open charm physics
  - Quarkonium spectroscopy
    - dielectrons
    - dimuons
  - Jets
8 Data rate reduction
- Volume reduction
  - regions-of-interest and partial readout
  - pile-up removal in pp
  - data compression
    - entropy coder (sketched below)
    - vector quantization
    - TPC-data modeling
- Rate reduction
  - (sub)-event reconstruction and (sub)-event rejection before event building
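To illustrate what the entropy-coder option can buy, the sketch below (illustrative C++, not ALICE code; the 10-bit ADC width and the toy pedestal-dominated distribution are assumptions) estimates the achievable lossless compression factor from the Shannon entropy of the sample distribution:

```cpp
#include <cmath>
#include <cstdint>
#include <iostream>
#include <map>
#include <vector>

// Average bits/sample an ideal entropy coder (e.g. Huffman) would need,
// given the observed distribution of ADC values.
double shannonEntropyBits(const std::vector<uint16_t>& samples) {
    std::map<uint16_t, std::size_t> counts;
    for (uint16_t s : samples) ++counts[s];
    double h = 0.0;
    for (const auto& entry : counts) {
        const double p = static_cast<double>(entry.second) / samples.size();
        h -= p * std::log2(p);
    }
    return h;
}

int main() {
    // Toy data: mostly pedestal noise with an occasional large signal sample,
    // mimicking a sparsely occupied detector.
    std::vector<uint16_t> adc;
    for (int i = 0; i < 10000; ++i)
        adc.push_back(i % 50 == 0 ? 700 : 5 + i % 4);

    const double h = shannonEntropyBits(adc);
    std::cout << h << " bits/sample; compression vs. 10-bit raw: "
              << 10.0 / h << "x\n";
}
```

Vector quantization and TPC-data modeling go further by exploiting correlations between neighbouring samples, which a plain entropy coder ignores.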
9 HLT system structure
[Block diagram:
- Trigger detectors (TRD trigger, dimuon trigger, PHOS trigger) → Level-1
- Pattern recognition: TPC fast cluster finder, fast tracker, Hough transform, cluster evaluator, Kalman fitter; dimuon arm tracking
- Level-3: extrapolate to ITS, to TOF, to TRD, ...
- (Sub)-event reconstruction]
10 Fast pattern recognition: cluster and track finding, online event reconstruction
11 HLT event flow
12 HLT subprojects
- Efficient data formatting, Huffman coding and vector quantization: TPC Readout Controller Unit
- Fast cluster finder and Hough transformation: FPGA implementation on PCI Receiver Card
- Fast pattern recognition: cluster and track finding, online event reconstruction
- Clustered PC-farm architecture and high-bandwidth, low-latency network
- Massively parallel computing: load distribution and performance monitoring
- Simulation of trigger efficiency and/or data compression factors for the open hadronic charm, dielectron, jet and dimuon program
13 HLT architecture overview
[Diagram: optical links from the front-end feed receiver boards (RcvBd) sitting on the PCI buses of the receiver/HLT processors; each node connects through its NIC to the HLT network, which also serves a distributed farm controller, a monitoring server and the HLT processors.]
- Not a specialized computer, but a generic large-scale (>500 node) multiprocessor cluster
- A few nodes have additional hardware (PCI RORC)
- Has to be operational in off-line mode also
- Use of commodity processors
- Use of commodity networks
- Reliability and fault tolerance are mandatory (a fault-detection sketch follows below)
- Use of a standard OS (Linux)
- Use of on-line disks as mass storage
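Since fault tolerance is mandatory on a cluster of this size, dead nodes have to be detected promptly so their work can be redistributed. A minimal sketch of one standard approach, heartbeat timeouts (hypothetical code, not the actual farm controller):

```cpp
#include <chrono>
#include <iostream>
#include <map>
#include <string>

using Clock = std::chrono::steady_clock;

// The controller records a timestamp per node on every heartbeat and
// declares a node dead once its last heartbeat is older than the timeout.
class FarmController {
    std::map<std::string, Clock::time_point> lastHeartbeat_;
    std::chrono::seconds timeout_{5};
public:
    void heartbeat(const std::string& node) { lastHeartbeat_[node] = Clock::now(); }
    bool alive(const std::string& node) const {
        const auto it = lastHeartbeat_.find(node);
        return it != lastHeartbeat_.end() && Clock::now() - it->second < timeout_;
    }
};

int main() {
    FarmController ctl;
    ctl.heartbeat("hlt-node-042");            // hypothetical node name
    std::cout << std::boolalpha
              << ctl.alive("hlt-node-042") << '\n'   // true: heartbeat just seen
              << ctl.alive("hlt-node-043") << '\n';  // false: never reported, reassign work
}
```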
14 TPC PCI-RORC
[Diagram: the PCI bus connects through a PCI bridge and glue-logic interface to the DIU-CMC optical-link receiver card, the FPGA coprocessor, 2 MB of internal SRAM and a 2 MB D32 memory.]
- FPGA coprocessor
- Optical link receiver
- Current prototype: commercial eval-board
15 FPGA coprocessor
- Implementation of the Hough transform
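For reference, the sketch below shows the standard line Hough transform in plain C++ (the production version is an FPGA implementation on the PCI-RORC; the hit coordinates, binning and radius scale here are illustrative): each hit votes for every (theta, rho) line passing through it, and peaks in the accumulator are track candidates.

```cpp
#include <algorithm>
#include <cmath>
#include <iostream>
#include <iterator>
#include <utility>
#include <vector>

int main() {
    const int nTheta = 180, nRho = 200;
    const double rhoMax = 250.0;             // cm; illustrative radial scale
    const double pi = 3.14159265358979;
    std::vector<int> acc(nTheta * nRho, 0);  // (theta, rho) accumulator

    // Toy hits lying exactly on the line y = 2x + 1.
    const std::vector<std::pair<double, double>> hits = {
        {10, 21}, {20, 41}, {30, 61}, {40, 81}, {50, 101}};

    for (const auto& hit : hits) {
        for (int t = 0; t < nTheta; ++t) {
            const double theta = t * pi / nTheta;
            const double rho = hit.first * std::cos(theta) + hit.second * std::sin(theta);
            const int r = static_cast<int>((rho + rhoMax) / (2 * rhoMax) * nRho);
            if (r >= 0 && r < nRho) ++acc[t * nRho + r];  // one vote per (theta, rho) bin
        }
    }

    // The most-voted bin is the best line (track) candidate.
    const auto best = std::distance(acc.begin(),
                                    std::max_element(acc.begin(), acc.end()));
    std::cout << "peak at theta bin " << best / nRho << ", rho bin " << best % nRho
              << " with " << acc[best] << " votes\n";
}
```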
16 HLT Networking (TPC only)
All data rates in kB/sec (readout not included here).
[Network diagram: 180 front-end links at 200 Hz, each carrying 92 000 kB/sec (plus spares), with 65 000 and 7 000 kB/sec branches; 17 000 000 kB/sec aggregate into the cluster finders, 2 340 000 kB/sec into track-segment finding and 252 000 kB/sec into track merging. Node counts per stage: cluster finder 180, track segments 108, track merger 72 (all multiples of the 36 TPC sectors), global L3 12 nodes.]
Assumes a 40 Hz coincidence trigger plus a 160 Hz TRD pretrigger with 4 sectors per trigger.
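Assuming the 92 000 kB/sec figure is the per-link input rate, the quoted aggregate is easy to cross-check:

```cpp
#include <iostream>

// Cross-check of the aggregate input rate (values from the slide; the
// per-link interpretation of the 92 000 kB/sec figure is an assumption).
int main() {
    const double perLinkKBs = 92000.0;
    const int nLinks = 180;
    std::cout << perLinkKBs * nLinks << " kB/sec aggregate\n";
    // 16 560 000 kB/sec, consistent with the quoted ~17 000 000
}
```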
17 HLT Interfaces
- HLT is an autonomous system with high reliability standards (part of the data path)
- HLT has a number of operating modes
  - on-line trigger
  - off-line processor farm
  - possibly a combination of both
- Very high input data rates (20 GB/sec)
- High internal networking requirements
- HLT front-end is the first processing layer
- Goal: same interface for data input, internal data exchange and data output

HLT internal, input and output interface: publish/subscribe (sketched below)
- When local, do not move data: exchange pointers only
- Separate processes, multiple subscribers for one publisher
- Network API and architecture independent
- Fault tolerant (can lose a node)
- Takes monitoring into account
- Standard within HLT and for input and output
- Demonstrated to work on both the shared-memory paradigm and sockets
- Very lightweight
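A minimal sketch of the publish/subscribe idea (hypothetical API, not the actual HLT framework): a local publisher hands each subscriber only a descriptor of the event, pointer plus size, instead of copying the data; the same interface could be backed by sockets for remote subscribers.

```cpp
#include <cstddef>
#include <functional>
#include <iostream>
#include <utility>
#include <vector>

// Descriptor of an event block: when publisher and subscriber are local,
// only this pointer/size pair is exchanged; the data itself is not moved.
struct EventDescriptor {
    const void* data;
    std::size_t size;
};

class Publisher {
    std::vector<std::function<void(const EventDescriptor&)>> subscribers_;
public:
    // Multiple subscribers may attach to one publisher.
    void subscribe(std::function<void(const EventDescriptor&)> cb) {
        subscribers_.push_back(std::move(cb));
    }
    void publish(const EventDescriptor& ev) {
        for (auto& cb : subscribers_) cb(ev);  // pointers only, no data copies
    }
};

int main() {
    Publisher pub;
    pub.subscribe([](const EventDescriptor& ev) {
        std::cout << "subscriber got " << ev.size << " bytes at " << ev.data << '\n';
    });
    static const char block[4096] = {};  // stand-in for a shared-memory event
    pub.publish({block, sizeof(block)});
}
```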
18 HLT - Cluster Slow Control
- Features
  - Battery backed: completely independent of the host
  - Power controller: remote powering of the host
  - Reset controller: remote physical RESET
  - PCI bus: perform PCI bus scans, identify devices
  - Floppy/flash emulator: create a remotely defined boot image
  - Keyboard driver: remote keyboard emulation
  - Mouse driver: remote mouse emulation
  - VGA: replaces the graphics card
  - Price: very low cost
- Functionality
  - complete remote control of the PC, like a terminal server but already at BIOS level
  - intercept port 80 messages (even remotely diagnose a dead computer)
  - interoperate with a remote server, providing status/error information
  - watchdog functionality (sketched below)
  - identify the host and receive a boot image for the host
  - RESET/power maintenance
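The watchdog function can be pictured as follows (hypothetical logic, not the card's firmware): the host must pet the watchdog periodically, and a missed deadline triggers the remote physical RESET described above.

```cpp
#include <chrono>
#include <iostream>

using Clock = std::chrono::steady_clock;

// Card-side view: if the host stops petting the watchdog before the
// deadline, the controller asserts the reset line on the hung host.
class Watchdog {
    Clock::time_point lastPet_ = Clock::now();
    std::chrono::seconds deadline_{10};
public:
    void pet() { lastPet_ = Clock::now(); }
    bool expired() const { return Clock::now() - lastPet_ > deadline_; }
};

int main() {
    Watchdog wd;
    wd.pet();  // the host driver would call this on a timer
    if (wd.expired())
        std::cout << "asserting RESET on hung host\n";  // remote physical reset
    else
        std::cout << "host alive\n";
}
```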
19 HLT and GRID
- HLT subprojects
  - Efficient data formatting, Huffman coding and vector quantization
  - Fast cluster finder and Hough transformation: FPGA implementation on PCI Receiver Card
  - Fast pattern recognition: online event reconstruction
  - Clustered PC-farm architecture and high-bandwidth, low-latency network
  - Massively parallel computing: load distribution and performance monitoring
  - Simulation of trigger efficiency and/or data compression factors for the open hadronic charm, dielectron, jet and dimuon program
- GRID work packages
  - WP4 Fabric Management
  - WP8 HEP Applications
  - WP4 Fabric Management, WP5 Mass Storage Management, WP7 Network Services
  - WP1 Grid Work Scheduling, WP2 Grid Data Management, WP3 Grid Monitoring Services
  - WP6 Testbed Demonstrators, WP8 HEP Applications
20 Clustered PC-farm architecture and high-bandwidth, low-latency network
21 Conclusion
- ALICE HLT system
  - Project with a large Norwegian contribution
  - Large overlap with GRID activities