Title: PowerPoint Presentation
1 Interactive High Energy Physics Application
Celso Martinez-Rivero, Jesus Marco, Rafael Marco, Oscar Ponce, David Rodriguez
IFCA-CSIC (SPAIN)
CROSSGRID WP1 MEETING, 18-III-2002, KRAKOW
2 Distributed Physics Analysis in HEP
- Final user applications for interactive physics analysis in a GRID-aware environment for the LHC experiments
- Access to a large O/R DBMS through a middleware server vs. access to a catalogue of distributed ROOT data files
- Distributed data-mining techniques (mainly Neural Networks, also self-organizing maps, ...)
- Integration of user-friendly interactive access via portals
- CSIC, UAB, FZK, INP, INS
3 HEP interactive analysis
- USER REQUIREMENTS
  - LHC experiments
  - Other experiments taking data? CDF, BaBar
  - Physics analysis → physics results (TDR, publications)
  - HLT trigger optimisation
- TASKS
  - 1.3.1 Interactive Distributed Data Access
  - 1.3.2 Data Mining Techniques
  - 1.3.3 Integration and Deployment
  - 1.3.4 Application to LHC Physics TDR
- DEVELOPERS
  - CSIC: D. Rodriguez (O/R DBMS), O. Ponce, C. Martinez, J. Marco
  - FZK: M. Kunze, people on the ROOT side
  - CMS: CSIC (Santander), C. Martinez
  - ALICE: FZK, M. Kunze
  - ATLAS: UAB, A. Pacheco, M. Bosman; CSIC (Valencia), E. Ros, J. Salt; INP, P. Malecki
  - LHCb: INP, M. Witek
4 The final product: GUI
Diagram of the GUI components: Monitoring; Graphic Output/(Input?); DATASET Dictionary (Classes); Basic Object; Derived Procedures; Alphanumeric Output; Analysis Scripts; Work Persistency.
5 (XML) Info Work flow
Workflow diagram components: UI; Replica/Data Manager; Web Service Finder?; Broker; Master CE; CACHED.
6 Storage Element as WebService?
- Current SE in EDG
  - GridFTP server
- WebService approach
  - Passive SE: GridFTP, or /grid, etc.
  - Active SE:
    - SQL QUERY (ResultSet in XML): SELECT ... FROM ...
      - (three-tier servlet running, like Spitfire) ready! (IBM IDS)
    - ROOT query (does this make sense? A PAW query does make sense, and is implemented...)
    - PROCESSING QUERY (Agent): Stored Procedure or XML description (SOAP-like?)
- SQL QUERY is OK for the NN in HEP (a sketch follows this list)
- PROCESSING QUERY (agent-like approach) likely needed for SOM
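
To make the Active SE idea concrete, here is a minimal client-side sketch. The table and variable names are invented for illustration, and send_sql_query() is a stub standing in for the call to the Spitfire-like three-tier service, whose interface is not specified here; the point is only the shape of an SQL query whose result set comes back as XML.

    /* active_se_sketch.c - illustrative only: the table, the columns and the
     * send_sql_query() helper are assumptions, not the CrossGrid interface. */
    #include <stdio.h>

    /* stub standing in for the call to the three-tier (Spitfire-like) service */
    static const char *send_sql_query(const char *sql)
    {
        /* a real client would ship the query to the SE web service and get
         * the result set back serialised as XML, e.g.
         * <resultset><row><pt_mu>42.3</pt_mu><njets>2</njets></row>...</resultset> */
        (void)sql;
        return "<resultset></resultset>";   /* dummy answer */
    }

    int main(void)
    {
        const char *sql =
            "SELECT pt_mu, eta_mu, missing_et, njets "   /* invented variables */
            "FROM simulated_events "                     /* invented table */
            "WHERE njets >= 2";

        printf("XML result set: %s\n", send_sql_query(sql));
        return 0;
    }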
7 Data Mining
- Basic case 1
  - Neural Network, BFGS
  - Objective: parallel training
  - Balance the data load (how? split AFTER the DB query)
  - MPI works OK for distributed computation!
  - How to do it in the ROOT environment (PROOF?)
  - First results...
- Basic case 2
  - SOM (unsupervised learning)
  - Objective: cluster analysis
  - Wait for the Meteo 1.4.b experience
8 Parallel NN
- Prototype
  - MLP package in PAW (database: n-tuples)
  - Used for the DELPHI Higgs search (REALISTIC)
  - BFGS method
- Current setup
  - Scan n-tuples to filter events and variables
  - Result set in XML, split according to the CEs
  - 1 master and n slaves, in a LOCAL cluster, using MPICH-G2
  - NN architecture 16-10-10-1
  - The master sends the initial weights; each node returns its gradient and errors to the master, who adds them and sends back the new weights... (a sketch follows this list)
  - Scaling is OK in the local cluster; now moving to a non-local environment with latency and QoS in mind!
  - Obvious solution: adapt the NN load in each node to the latency time
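
A minimal sketch of that master/slave exchange, assuming a plain gradient-descent update in place of the real BFGS step and a dummy local_gradient() stub in place of the actual NN code; only the MPI message pattern (broadcast weights, return gradient and error, sum, send new weights) follows the description above.

    /* parallel_nn_sketch.c - illustrative only; compile with mpicc and run
     * with mpirun. NUM_W and EPOCHS are placeholders for the real values. */
    #include <mpi.h>
    #include <stdio.h>

    #define NUM_W  4   /* stand-in for the number of 16-10-10-1 weights */
    #define EPOCHS 5   /* stand-in for the 1000 training epochs */

    /* stub: each node would compute the NN gradient on its local events */
    static void local_gradient(const double *w, double *grad, double *err)
    {
        for (int i = 0; i < NUM_W; i++)
            grad[i] = -0.01 * w[i];   /* dummy contribution */
        *err = 0.0;
    }

    int main(int argc, char **argv)
    {
        int rank, nproc;
        double w[NUM_W] = {0.1, 0.2, 0.3, 0.4};
        double grad[NUM_W], sum[NUM_W], err, tot_err;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nproc);

        for (int epoch = 0; epoch < EPOCHS; epoch++) {
            /* the master sends the current weights to every node */
            MPI_Bcast(w, NUM_W, MPI_DOUBLE, 0, MPI_COMM_WORLD);

            /* every node (the master included, for brevity) evaluates the
             * gradient and error on its share of the events */
            local_gradient(w, grad, &err);

            if (rank == 0) {
                /* the master collects and adds the slaves' gradients/errors */
                for (int i = 0; i < NUM_W; i++) sum[i] = grad[i];
                tot_err = err;
                for (int src = 1; src < nproc; src++) {
                    double g[NUM_W], e;
                    MPI_Recv(g, NUM_W, MPI_DOUBLE, src, 30, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                    MPI_Recv(&e, 1, MPI_DOUBLE, src, 31, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                    for (int i = 0; i < NUM_W; i++) sum[i] += g[i];
                    tot_err += e;
                }
                /* a plain gradient step stands in for the real BFGS update;
                 * the new weights go out in the next epoch's broadcast */
                for (int i = 0; i < NUM_W; i++) w[i] -= sum[i];
                printf("epoch %d  total error %f\n", epoch, tot_err);
            } else {
                /* the slaves return their partial gradient and error */
                MPI_Send(grad, NUM_W, MPI_DOUBLE, 0, 30, MPI_COMM_WORLD);
                MPI_Send(&err, 1, MPI_DOUBLE, 0, 31, MPI_COMM_WORLD);
            }
        }

        MPI_Finalize();
        return 0;
    }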
9 NN scaling
Latency curve (plot): 644577 events, 16 variables, 16-10-10-1 architecture, 1000 epochs for training.
10 Integration
- Final user applications should run on simulated data from the 4 LHC experiments.
- TBD
11 Example of Trigger Analysis in CMS
- The HLT trigger is almost equivalent to an offline physics analysis
- 3 trigger levels under study:
  - L1: hardware
  - L2: software, standalone
  - L3: software, full reconstruction
12 Example: single μ stream
13 Fine Tuning of HLT using CrossGrid
- Some basic topologies need no high-level analysis (isolated high-Pt muons)
- However, other topologies are much more difficult to separate from the background
- More sophisticated analysis is mandatory to keep the background rate low while keeping the efficiency high!
- CrossGrid will help here!
14 Trigger CPU Time
- L2: 780 ms average
  - Main process: the Trajectory Builder
  - Important queues!
- L3: 1689 ms average, with large fluctuations, mostly combinatorial
  - Reduce the muon cone?
15 Trigger CPU Time
- What to do with events that consume a lot of CPU?
- Use the Grid to avoid collapsing the standard trigger farms.
- All events requiring more processing time than a certain time limit are automatically sent to CrossGrid for triggering.
16 What do we need from Tools & Services
- GRID Application Programming Environment
  - Verification of MPI use: YES
  - Performance prediction: YES
  - Monitoring: YES (action if a node is down)
- Services & Tools
  - User-friendly portals: YES (we go for all-XML)
  - Roaming access: YES (but implicit in the portal)
  - Efficient distributed data access: YES (close contact needed)
  - Specific resource management: YES (general question: how to manage interactive parallel jobs)
17 Short Answer to WP2 questions
- Programming languages
  - C, C++, Java
  - HLA: no
  - CCA: could be, but not a priority
- Component structure
  - Matrix involved in the gradient for the NN
- Granularity
  - Important if the NN becomes too big, and in SOMs
- Performance problems
  - Main one: latency
- Monitoring
  - Check for dead nodes; let's think about what to do
  - Node processing time, latency
  - Storage element performance
18 Short Answer to WP2 questions
- MPI: yes
- Calls (initial list)
  Now:
    MPI_Init(&argc, &argv)
    MPI_Comm_size(MPI_COMM_WORLD, &nproc)
    MPI_Comm_rank(MPI_COMM_WORLD, &rank)
    MPI_Get_processor_name(processor_name, &namelen)
    MPI_Barrier(MPI_COMM_WORLD)
    MPI_Bcast(buffer_int, 1, MPI_INT, 0, MPI_COMM_WORLD)
    MPI_Finalize()
    MPI_Recv(buffer_dbl, num_var, MPI_DOUBLE, i, 30, MPI_COMM_WORLD, &status)
    MPI_Send(buffer_dbl, num_var, MPI_DOUBLE, 0, 30, MPI_COMM_WORLD)
  Soon/Next:
    MPI_Reduce(buffer_dbl_slave, buffer_dbl_master, num_datos, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD)
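
As a sketch of the "Soon/Next" item: a single MPI_Reduce with MPI_SUM can replace the per-slave MPI_Recv/MPI_Send loop when the master only needs the element-wise sum of the gradients. The buffer names follow the list above, while NUM_VAR and the dummy contents are placeholders.

    /* reduce_sketch.c - illustrative only: one collective call does the
     * gradient summation that the explicit Recv/Send loop does today. */
    #include <mpi.h>
    #include <stdio.h>

    #define NUM_VAR 4   /* stand-in for the real vector length */

    int main(int argc, char **argv)
    {
        int rank;
        double buffer_dbl_slave[NUM_VAR];   /* this node's partial gradient */
        double buffer_dbl_master[NUM_VAR];  /* the element-wise sum, on rank 0 */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        for (int i = 0; i < NUM_VAR; i++)
            buffer_dbl_slave[i] = 0.1 * rank;   /* dummy contribution */

        /* every rank contributes; only rank 0 receives the summed vector */
        MPI_Reduce(buffer_dbl_slave, buffer_dbl_master, NUM_VAR, MPI_DOUBLE,
                   MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("first summed element: %f\n", buffer_dbl_master[0]);

        MPI_Finalize();
        return 0;
    }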
19 What do we need from the Testbed
- Definition of interactive resources?
  - Reserved/prioritised?
  - Cached storage(?)
- Submission, or an interactive session?
- Check the latency and, if possible, the impact of QoS (see map)
- About 3-5 testbed sites, each with 50 processors, are needed to make a realistic test of the potential.
20 CrossGrid WP4 - International Testbed Organisation
Network (Geant) setup