Title: MONARC: results and open issues
1 MONARC: results and open issues
2 Layout of the talk
- Most material is from Irwin Gaines' talk at CHEP 2000
- The basic goals and structure of the project
- The Regional Centers
- Motivation
- Characteristics
- Functions
- Some results from the simulations
- The need for more realistic, implementation-oriented Models (Phase-3)
- Relations with GRID
- Status of the project: Phase-3 LOI presented in January; Phase-2 Final Report to be published next week; milestones and basic goals met
3 MONARC
- A joint project (LHC experiments and CERN/IT) to understand issues associated with distributed data access and analysis for the LHC
- Examine distributed data plans of current and near-future experiments
- Determine characteristics and requirements for LHC regional centers
- Understand details of the analysis process and data access needs for LHC data
- Measure critical parameters characterizing distributed architectures, especially database and network issues
- Create modeling and simulation tools
- Simulate a variety of models to understand constraints on architectures
4 MONARC
- Models Of Networked Analysis At Regional Centers
- Caltech, CERN, FNAL, Heidelberg, INFN, Helsinki, KEK, Lyon, Marseilles, Munich, Orsay, Oxford, RAL, Tufts, ...
- GOALS
  - Specify the main parameters characterizing the Models' performance: throughputs, latencies
  - Determine classes of Computing Models feasible for LHC (matched to network capacity and data handling resources)
  - Develop Baseline Models in the feasible category
  - Verify resource requirement baselines (computing, data handling, networks)
- COROLLARIES
  - Define the Analysis Process
  - Define Regional Center Architectures
  - Provide Guidelines for the final Models
[Diagram: example distributed topology. CERN (n×10^7 MIPS, m-Pbyte robot) linked at N × 622 Mbit/s to regional centers such as FNAL (4×10^7 MIPS, 110-Tbyte robot) and universities (n×10^6 MIPS, m-Tbyte robot), each serving desktops over 622 Mbit/s links; optional air freight for bulk data shipment.]
5 Working Groups
- Architecture WG
  - Baseline architecture for regional centres, technology tracking, survey of the computing models of current HENP experiments
- Analysis Model WG
  - Evaluation of the LHC data analysis model and use cases
- Simulation WG
  - Develop a simulation tool set for performance evaluation of the computing models
- Testbed WG
  - Evaluate the performance of ODBMS and networks in the distributed environment
6 General need for distributed data access and analysis
- Potential problems of a single centralized computing center include:
  - Scale of the LHC experiments: difficulty of accumulating and managing all resources at one location
  - Geographic spread of the LHC experiments: providing equivalent, location-independent access to data for physicists
  - Help desk, support and consulting in the same time zone
  - Cost of the LHC experiments: optimizing use of resources located worldwide
7 Motivations for Regional Centers
- A distributed computing architecture based on regional centers offers:
  - A way of utilizing the expertise and resources residing in computing centers all over the world
  - Local consulting and support
  - Maximizing the intellectual contribution of physicists all over the world without requiring their physical presence at CERN
  - Acknowledgement of possible limitations of network bandwidth
  - Letting people choose how they analyze data based on the availability or proximity of various resources such as CPU, data, or network bandwidth
8 Future Experiment Survey
- Analysis/results
  - In the previous survey we saw that many sites contributed to Monte Carlo generation; this is now the norm
  - New experiments are trying to use the Regional Center concept
    - BaBar has Regional Centers at IN2P3 and RAL, and a smaller one in Rome
    - STAR has a Regional Center at LBL/NERSC
    - CDF and D0 offsite institutions are paying more attention as the run gets closer
9 Future Experiment Survey
- Other observations/requirements
  - In the last survey we pointed out the following requirements for RCs:
    - 24x7 support
    - a software development team
    - a diverse body of users
    - good, clear documentation of all s/w and s/w tools
  - The following are requirements for the central site (i.e. CERN):
    - a central code repository that is easy to use and easily accessible from remote sites
    - sensitivity to remote sites in database handling, raw data handling and machine flavors
    - good, clear documentation of all s/w and s/w tools
  - The experiments in this survey achieving the most in distributed computing are following these guidelines
10
- Tier 0: CERN
- Tier 1: National Regional Center
- Tier 2: Regional Center
- Tier 3: Institute Workgroup Server
- Tier 4: Individual Desktop
- Total: 5 levels
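For later sizing or simulation exercises, this hierarchy can be written down in a few lines; below is a minimal sketch in Python (the tier names and the parent/child ordering come from the list above, while the Tier class and add helper are purely illustrative scaffolding, not part of the MONARC tool set).

```python
# Minimal sketch of the 5-level MONARC tier hierarchy (Tier 0 .. Tier 4).
# Only the names and parent/child relations come from the slide; the class
# itself is illustrative scaffolding, not part of the MONARC software.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Tier:
    level: int
    role: str
    parent: Optional["Tier"] = None
    children: List["Tier"] = field(default_factory=list)

    def add(self, child: "Tier") -> "Tier":
        child.parent = self
        self.children.append(child)
        return child

tier0 = Tier(0, "CERN")
tier1 = tier0.add(Tier(1, "National Regional Center"))
tier2 = tier1.add(Tier(2, "Regional Center"))
tier3 = tier2.add(Tier(3, "Institute Workgroup Server"))
tier4 = tier3.add(Tier(4, "Individual Desktop"))

assert len({t.level for t in (tier0, tier1, tier2, tier3, tier4)}) == 5  # 5 levels in total
```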
11 CMS Offline Farm at CERN circa 2006
[Diagram: processor farm (1400 boxes, 160 clusters, 40 sub-farms) connected by a 480 Gbps farm network; disks (5400 disks, 340 arrays) and tape drives (100 drives) on a storage network; LAN-SAN and LAN-WAN routers; individual link speeds quoted range from 0.8 Gbps (DAQ) up to 250 Gbps. Assumes all disk-tape traffic stays on the storage network; double these numbers if all disk-tape traffic goes through the LAN-SAN routers.]
lmr, for MONARC study, April 1999
12 (no transcript)
13 Processor cluster
- Sub-farm: 36 boxes, 144 CPUs, 5 m2
- Basic box: four 100 SI95 processors, standard network connection (2 Gbps); 15% of systems configured as I/O servers (disk server, disk-tape mover, Objy AMS, ...) with an additional connection to the storage network
- Cluster: 9 basic boxes with a network switch (<10 Gbps)
- Sub-farm: 4 clusters with a second-level network switch (<50 Gbps); one sub-farm fits in one rack
- Cluster and sub-farm sizing adjusted to fit conveniently the capabilities of network switch, racking and power distribution components
lmr, for MONARC study, April 1999
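As a sanity check, the per-box and per-cluster numbers above can be multiplied out and compared with the farm totals on the slide 11 diagram (1400 boxes, 160 clusters, 40 sub-farms). A minimal Python sketch, assuming a farm of 40 sub-farms and taking the 15% I/O-server fraction at face value:

```python
# Back-of-envelope check of the 1999 CMS farm sizing (inputs from the slides;
# the 40-sub-farm farm total is taken from the slide 11 diagram).
CPUS_PER_BOX = 4            # "basic box: four 100 SI95 processors"
SI95_PER_CPU = 100
BOXES_PER_CLUSTER = 9
CLUSTERS_PER_SUBFARM = 4
SUBFARMS = 40

boxes_per_subfarm = BOXES_PER_CLUSTER * CLUSTERS_PER_SUBFARM   # 36 boxes
cpus_per_subfarm = boxes_per_subfarm * CPUS_PER_BOX            # 144 CPUs
total_boxes = boxes_per_subfarm * SUBFARMS                     # 1440 (diagram: ~1400)
total_clusters = CLUSTERS_PER_SUBFARM * SUBFARMS               # 160
total_si95 = total_boxes * CPUS_PER_BOX * SI95_PER_CPU         # ~576 kSI95
io_servers = round(0.15 * total_boxes)                         # "15% of systems as I/O servers"

print(f"{total_boxes} boxes, {total_clusters} clusters, "
      f"{total_si95 / 1000:.0f} kSI95, ~{io_servers} I/O servers")
```

The resulting ~576 kSI95 is in line with the ~500 kSI95 quoted on slide 22 as roughly the CPU power of the CERN Tier 0.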
14 Regional Centers
- Regional Centers will:
  - Provide all technical services and data services required to do the analysis
  - Maintain all (or a large fraction of) the processed analysis data; possibly only large subsets based on physics channels; maintain a fixed fraction of fully reconstructed and raw data
  - Cache or mirror the calibration constants
  - Maintain excellent network connectivity to CERN and excellent connectivity to users in the region; data transfer over the network is preferred for all transactions, but transfer of very large datasets on removable data volumes is not ruled out
  - Share/develop common maintenance, validation and production software with CERN and the collaboration
  - Provide services to physicists in the region, contribute a fair share to post-reconstruction processing and data analysis, collaborate with other RCs and CERN on common projects, and provide services to members of other regions on a best-effort basis to further the science of the experiment
  - Provide support services, training, documentation and troubleshooting to RC and remote users in the region
15
[Diagram of a Regional Center: mass storage, disk servers and database servers with tape robotics; data import and data export services; network links from CERN, from Tier 2 and simulation centers, and to local institutes and desktops. Compute activities: Production Reconstruction (Raw/Sim → ESD; scheduled, predictable; experiment/physics groups), Production Analysis (ESD → AOD, AOD → DPD; scheduled; physics groups), Individual Analysis (AOD → DPD and plots; chaotic; physicists). Support services: physics software development, R&D systems and testbeds, info servers, code servers, web servers, telepresence servers, training, consulting, help desk.]
16
[Diagram as on slide 15, annotated with capacities and data rates.]
- Total storage:
  - Robotic mass storage: ~300 TB
  - Raw data: 50 TB, 5×10^7 events (5% of 1 year)
  - Raw (simulated) data: 100 TB, 10^8 events
  - ESD (reconstructed data): 100 TB, 10^9 events (50% of 2 years)
  - AOD (physics object) data: 20 TB, 2×10^9 events (100% of 2 years)
  - Tag data: 2 TB (all)
  - Calibration/conditions database: 10 TB (only the latest version of most data types kept here)
  - Central disk cache: 100 TB (per user demand)
  - CPU required for AMS database servers: ?×10^3 SI95 power
- Data input rate from CERN: raw data (5%) 50 TB/yr; ESD data (50%) 50 TB/yr; AOD data (all) 10 TB/yr; revised ESD 20 TB/yr
- Data input from Tier 2: revised ESD and AOD, 10 TB/yr
- Data input from simulation centers: raw data, 100 TB/yr
- Data output rate to CERN: AOD data 8 TB/yr; recalculated ESD 10 TB/yr; simulation ESD data 10 TB/yr
- Data output to Tier 2: revised ESD and AOD, 15 TB/yr
- Data output to local institutes: ESD, AOD, DPD data, 20 TB/yr
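The storage items can be summed to check the ~300 TB robotic mass-storage figure, and dividing each size by its event count gives the implied average event sizes. A minimal Python sketch; the per-event sizes are derived here rather than quoted on the slide, and the tag event count is assumed to equal the two-year total used for the AOD:

```python
# Robotic mass-storage budget of the example Regional Center (slide figures).
# Each entry is (TB, number of events); None means no per-event breakdown.
stores = {
    "raw (5% of 1 year)":        (50,  5e7),
    "raw simulated":             (100, 1e8),
    "ESD (50% of 2 years)":      (100, 1e9),
    "AOD (100% of 2 years)":     (20,  2e9),
    "tag (all)":                 (2,   2e9),   # event count assumed = AOD's 2-year total
    "calibration/conditions DB": (10,  None),
}

total_tb = sum(tb for tb, _ in stores.values())
print(f"robotic mass storage: ~{total_tb} TB (slide quotes ~300 TB)")

for name, (tb, n_events) in stores.items():
    if n_events:
        kb_per_event = tb * 1e12 / n_events / 1e3
        print(f"  {name}: {tb} TB -> ~{kb_per_event:.0f} kB/event")
```

The implied sizes (~1 MB raw, ~100 kB ESD, ~10 kB AOD, ~1 kB tag) also reproduce the input rates quoted above (e.g. 50% of a 10^9-event year of ESD at ~100 kB/event gives ~50 TB/yr).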
17
[Diagram as on slide 15, annotated with CPU requirements; support services (physics software development, R&D systems and testbeds, info/code servers, web/telepresence servers, training, consulting, help desk) as before.]
- Event selection jobs: 10 physics groups × 10^8 events (10% samples) × 3 times/yr, based on ESD and latest AOD data, 50 SI95/evt → 5000 SI95 power
- Physics object creation jobs: 10 physics groups × 10^7 events (1% samples) × 8 times/yr, based on the selected event sample ESD data, 200 SI95/event → 5000 SI95 power
- Derived physics data creation jobs: 10 physics groups × 10^7 events × 20 times/yr, based on selected AOD samples, generating canonical derived physics data, 50 SI95/evt → 3000 SI95 power
- Total: 110 nodes of 100 SI95 power
- Reconstruction jobs (farms of low-cost commodity computers, limited I/O rate, modest local disk cache): reprocessing of raw data, 10^8 events/year (10%); initial processing of simulated data, 10^8/year; 1000 SI95·sec/event → 10^4 SI95 capacity, 100 processing nodes of 100 SI95 power
- Individual derived physics data creation jobs: 200 physicists × 10^7 events × 20 times/yr, based on selected AOD and DPD samples, 20 SI95/evt → 30,000 SI95 power; total 300 nodes of 100 SI95 power
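The SI95 power figures above follow from groups × events × passes per year × SI95 per event, averaged over the year. A minimal Python sketch that reproduces them; the ~3×10^7 s averaging period is an assumption of this sketch rather than a number quoted on the slide:

```python
# Reproduce the Regional Center CPU-power estimates from this slide.
# Assumption (not on the slide): yearly workload averaged over ~3e7 seconds.
SECONDS_PER_YEAR = 3e7

def required_si95(groups, events, passes_per_year, si95_per_event):
    """Average SI95 power needed to finish the yearly workload in one year."""
    return groups * events * passes_per_year * si95_per_event / SECONDS_PER_YEAR

print(required_si95(10, 1e8, 3, 50))     # event selection          -> 5000 SI95
print(required_si95(10, 1e7, 8, 200))    # physics object creation  -> ~5300 (slide: 5000)
print(required_si95(10, 1e7, 20, 50))    # DPD creation             -> ~3300 (slide: 3000)
print(required_si95(200, 1e7, 20, 20))   # individual DPD analysis  -> ~27000 (slide: 30000)
```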
18 MONARC Analysis Process Example
19 Model and Simulation parameters
- A new set of parameters common to all simulating groups.
- More realistic values, but still to be discussed/agreed on the basis of information from the experiments.

Parameter values, in SI95·sec/event unless noted (previously used values in parentheses):
- Proc_Time_RAW: 1000 (350)
- Proc_Time_ESD: 25 (2.5)
- Proc_Time_AOD: 5 (0.5)
- Analyze_Time_TAG: 3
- Analyze_Time_AOD: 3
- Analyze_Time_ESD: 15 (3)
- Analyze_Time_RAW: 600 (350)
- Memory of jobs: 100 MB
- Proc_Time_Create_RAW: 5000 (35)
- Proc_Time_Create_ESD: 1000 (1)
- Proc_Time_Create_AOD: 25 (1)
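For convenience, the same parameter set expressed as a small Python dictionary, e.g. for feeding a toy model; this is just a transcription of the table above (the dictionary layout is not the MONARC tool's actual input format):

```python
# MONARC common simulation parameters, SI95*sec/event unless noted otherwise.
# Each entry is (new value, previously used value); None where no earlier
# value was quoted on the slide.
PARAMS = {
    "Proc_Time_RAW":        (1000, 350),
    "Proc_Time_ESD":        (25,   2.5),
    "Proc_Time_AOD":        (5,    0.5),
    "Analyze_Time_TAG":     (3,    None),
    "Analyze_Time_AOD":     (3,    None),
    "Analyze_Time_ESD":     (15,   3),
    "Analyze_Time_RAW":     (600,  350),
    "Proc_Time_Create_RAW": (5000, 35),
    "Proc_Time_Create_ESD": (1000, 1),
    "Proc_Time_Create_AOD": (25,   1),
    "Job_Memory_MB":        (100,  None),   # memory of jobs, in MB
}
```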
20 Base Model used
- Basic jobs:
  - Reconstruction of 10^7 events, RAW → ESD → AOD → TAG, at CERN. This is the production running while the data are coming from the DAQ (100 days of running, collecting a billion events per year).
  - Analysis by 5 working groups, each of 25 analyzers, on TAG only (no requests to higher-level data samples). Every analyzer submits 4 sequential jobs on 10^6 events. Each analyzer's start time is a flat random choice within a range of 3000 seconds. Each analyzer's data sample of 10^6 events is a random choice from the complete TAG database of 10^7 events.
  - Transfer (FTP) of the 10^7-event ESD, AOD and TAG from CERN to the RC.
- CERN activities: reconstruction, 5 WG analysis, FTP transfer
- RC activities: 5 (uncorrelated) WG analysis, receiving the FTP transfer
- Job paper estimates (reproduced in the sketch below):
  - Single analysis job: 1.67 CPU hours at CERN, about 6000 sec (same at RC)
  - Reconstruction at CERN of 1/500 of RAW to ESD: 3.89 CPU hours, about 14000 sec
  - Reconstruction at CERN of 1/500 of ESD to AOD: 0.03 CPU hours, about 100 sec
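The quoted job times can be reproduced with a short calculation. They come out as stated only if one uses the previously quoted per-event times from slide 19 (the parenthesized values) together with the 500 SI95 node power of slide 22; both choices are assumptions of this sketch:

```python
# Reproduce the "paper estimate" job times of the Base Model.
# Assumptions: 500 SI95 per processing node (slide 22) and the parenthesized,
# previously used per-event costs from slide 19.
NODE_SI95 = 500

def wall_seconds(n_events, si95_per_event, node_si95=NODE_SI95):
    return n_events * si95_per_event / node_si95

analysis   = wall_seconds(1e6, 3)           # TAG analysis of 1e6 events     -> 6000 s  (1.67 h)
raw_to_esd = wall_seconds(1e7 / 500, 350)   # 1/500 of RAW -> ESD production -> 14000 s (3.89 h)
esd_to_aod = wall_seconds(1e7 / 500, 2.5)   # 1/500 of ESD -> AOD production -> 100 s   (0.03 h)
print(analysis, raw_to_esd, esd_to_aod)
```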
21 Resources: LAN speeds?!
- In our Models the DB servers are uncorrelated, so one activity uses a single server. The bottlenecks are the read and write speeds to and from the server. In order to use the CPU power at a reasonable percentage we need a read speed of at least 300 MB/s and a write speed of 100 MB/s (a milestone already met today).
- We use 100 MB/s in the current simulations (10 Gbit/s switched LANs in 2005 may be possible).
- The processing-node link speed is negligible in our simulations.
- Of course the real implementation of the farms can be different, but the results of the simulation do not depend on the real implementation: they are based on usable resources.
See the following slides (and the sketch below).
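One way to see why the server read speed dominates is to compare the aggregate rate that concurrent analysis jobs could consume with the rate one server can deliver. A minimal Python sketch of that comparison; the per-event read size used in the example is a purely hypothetical placeholder, since the slides do not quote one:

```python
# Toy model of CPU utilization when many analysis jobs read from a single
# DB server. The event size is a hypothetical placeholder, not a MONARC value.
def cpu_utilization(n_jobs, node_si95, si95_per_event, event_mb, server_read_mb_s):
    events_per_s_per_job = node_si95 / si95_per_event          # rate if CPU-bound
    demanded_mb_s = n_jobs * events_per_s_per_job * event_mb   # aggregate read demand
    return min(1.0, server_read_mb_s / demanded_mb_s)

# Example: 125 analyzers (5 groups x 25), 500 SI95 nodes, 3 SI95*s/event TAG
# analysis, and a hypothetical 0.01 MB read per event.
for read_speed in (100, 300):
    u = cpu_utilization(125, 500, 3, 0.01, read_speed)
    print(f"server read speed {read_speed} MB/s -> CPU utilization ~{u:.0%}")
```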
22 More realistic values for CERN and RC
- Data link speeds at 100 MB/sec (all values), except:
  - Node_Link_Speed at 10 MB/sec
  - WAN link speeds at 40 MB/sec
- CERN: 1000 processing nodes, each of 500 SI95
  - 1000 processing nodes × 500 SI95 = 500 kSI95, about the CPU power of the CERN Tier 0; disk space according to the number of DBs
- RC: 200 processing nodes, each of 500 SI95
  - 100 kSI95 processing power, 20% of CERN; disk space according to the number of DBs
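The aggregate figures above are just nodes × SI95 per node; a minimal sketch of the two site configurations used in the simulations (the dictionary layout is illustrative, not the simulation tool's configuration format):

```python
# Site configurations used in the "more realistic" simulation runs.
SITES = {
    "CERN": {"nodes": 1000, "si95_per_node": 500,
             "data_link_MB_s": 100, "node_link_MB_s": 10, "wan_link_MB_s": 40},
    "RC":   {"nodes": 200,  "si95_per_node": 500,
             "data_link_MB_s": 100, "node_link_MB_s": 10, "wan_link_MB_s": 40},
}

for name, cfg in SITES.items():
    total_ksi95 = cfg["nodes"] * cfg["si95_per_node"] / 1000
    print(f"{name}: {total_ksi95:.0f} kSI95")   # CERN: 500 kSI95; RC: 100 kSI95 (20% of CERN)
```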
23 Overall Conclusions
- MONARC simulation tools are:
  - sophisticated enough to allow modeling of complex distributed analysis scenarios
  - simple enough to be used by non-experts
- Initial modeling runs are already showing interesting results
- Future work will help identify bottlenecks and understand constraints on architectures
24 MONARC Phase 3
- More realistic Computing Model development
- Confrontation of Models with realistic prototypes
- At every stage, assess use cases based on actual simulation, reconstruction and physics analyses
- Participate in the setup of the prototypes
- We will further validate and develop the MONARC simulation system using the results of these use cases (positive feedback)
- Continue to review key inputs to the Model:
  - CPU times at various phases
  - Data rate to storage
  - Tape storage speed and I/O
- Employ MONARC simulation and testbeds to study CM variations, and suggest strategy improvements
25 MONARC Phase 3
- Technology studies
  - Data Model
    - Data structures
    - Reclustering, restructuring, transport operations
    - Replication
    - Caching, migration (HMSM), etc.
  - Network
    - QoS mechanisms: identify which are important
  - Distributed system resource management and query estimators (queue management and load balancing)
- Development of MONARC simulation visualization tools for interactive Computing Model analysis
26 Relation to GRID
- The GRID project is great!
  - Development of the s/w tools needed for implementing realistic LHC Computing Models: farm management, WAN resource and data management, etc.
  - Help in getting funds for real-life testbed systems (RC prototypes)
- Complementarity between GRID and the MONARC hierarchical RC Model
  - A hierarchy of RCs is a safe option; if GRID brings big advances, less hierarchical models should also become possible
- Timings are well matched
  - MONARC Phase-3 is to last about 1 year, a bridge to the GRID project starting early in 2001
  - Afterwards, common work by the LHC experiments on developing the computing models will surely still be needed; in which project framework and for how long, we will see then...