Title: MONARC 2 distributed systems simulation
1 MONARC 2 - distributed systems simulation
2 The Goals of the Project
- To perform realistic simulation and modelling of large scale distributed computing systems, customised for specific large scale HEP applications.
- To provide a design framework to evaluate the performance of a range of possible computer systems, as measured by their ability to provide the physicists with the requested data in the required time, and to optimise the cost.
- To narrow down a region in this parameter space in which viable models can be chosen by any of the future LHC-era experiments.
- To offer a dynamic and flexible simulation environment.
3 LHC Computing: Different from Previous Experiment Generations
One of the four LHC detectors (CMS)
Online system: multi-level trigger to filter out background and reduce data volume
- 40 MHz (40 TB/sec): level 1 - special hardware
- 75 kHz (75 GB/sec): level 2 - embedded processors
- 5 kHz (5 GB/sec): level 3 - PCs
- 100 Hz (100-1000 MB/sec): data processing, offline analysis, selection
Raw recording rate: 0.1 - 1 GB/sec, 3 - 8 PetaBytes / year
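The bandwidths quoted in parentheses follow from multiplying each event rate by the event size. The sketch below redoes that arithmetic, assuming an event size of about 1 MByte (the figure given on the Regional Center Hierarchy slide) and decimal units. The function and variable names are illustrative only.

```python
# Illustrative arithmetic only: bandwidth = event rate x event size.
# Assumption: ~1 MByte per event, decimal units (1 TB = 10**12 bytes).
EVENT_SIZE_BYTES = 1_000_000

# Event rate (Hz) at the input of each stage, as listed on the slide.
STAGE_RATES_HZ = {
    "level 1 input": 40_000_000,
    "level 2 input": 75_000,
    "level 3 input": 5_000,
    "recording": 100,
}


def bandwidth_bytes_per_sec(rate_hz, event_size=EVENT_SIZE_BYTES):
    """Data volume per second produced by events of a fixed size."""
    return rate_hz * event_size


def human(n_bytes):
    """Format a byte rate with the largest fitting decimal unit."""
    for unit, scale in (("TB", 10**12), ("GB", 10**9), ("MB", 10**6)):
        if n_bytes >= scale:
            return f"{n_bytes / scale:g} {unit}/sec"
    return f"{n_bytes} B/sec"


for stage, rate in STAGE_RATES_HZ.items():
    print(f"{stage}: {human(bandwidth_bytes_per_sec(rate))}")
```

With 1 MByte events, the 100 Hz recording rate corresponds to the lower end of the quoted 100-1000 MB/sec range.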
4 Off-Line LHC Computing Data Analysis
- Geographical dispersion: of people and resources
- Complexity: the detector and the LHC environment
- Scale: 100 times more processing power, Petabytes per year of data
CMS: 1800 Physicists, 150 Institutes, 32 Countries
A VERY LARGE SCALE DISTRIBUTED SYSTEM, AND IT HAS TO PROVIDE (NEAR) REAL-TIME DATA ACCESS FOR ALL THE PARTICIPANTS
5 Regional Center Hierarchy (Worldwide Data Grid)
[Diagram: tiered data flow, top to bottom, with the link speeds between levels]
Experiment
  PBytes/sec
Online System
  100 - 1000 MBytes/sec
Offline Farm, CERN Computer: Tier 0 + 1, HPSS
  0.6 - 2.5 Gbits/sec
Tier 1: FNAL Center, Italy Center, UK Center, France Center
  2.4 Gbits/sec
Tier 2
  622 Mbits/sec
Tier 3: Institutes (Institute 0.25 TIPS)
  100 - 1000 Mbits/sec, physics data cache
Tier 4: Workstations
Bunch crossing every 25 nsecs. Event is 1 MByte in size.
Physicists work on analysis channels. Processing power: 200,000 of today's fastest PCs.
6 Simulation Models
- The simulation model
  - abstracts the components of the real system and their interactions
  - must be equivalent to the simulated system
- Simulation models
  - continuous time - the system is described by a set of differential equations
  - discrete time - the state changes only at certain time moments
- In MONARC: one of the discrete time models (Discrete Event Simulation, DES); the events represent important activities in the system and are managed with the aid of an internal clock
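A discrete event simulation of this kind can be sketched as a time-ordered event queue plus an internal clock that jumps from one event to the next. The code below is a minimal illustration and not MONARC code: the `Simulator` class, its methods, and the 5-unit processing time are all hypothetical.

```python
import heapq
import itertools


class Simulator:
    """Minimal discrete event simulation loop (illustrative, not MONARC code)."""

    def __init__(self):
        self.clock = 0.0                # internal simulation clock
        self._queue = []                # heap of (time, sequence, action)
        self._seq = itertools.count()   # tie-breaker for simultaneous events

    def schedule(self, delay, action):
        """Register `action` to fire `delay` time units after the current clock."""
        heapq.heappush(self._queue, (self.clock + delay, next(self._seq), action))

    def run(self):
        """Pop events in time order; the clock jumps from one event to the next."""
        while self._queue:
            time, _, action = heapq.heappop(self._queue)
            self.clock = time
            action(self)


log = []


def job_arrives(sim):
    log.append((sim.clock, "job arrives"))
    sim.schedule(5.0, job_finishes)     # hypothetical 5-unit processing time


def job_finishes(sim):
    log.append((sim.clock, "job finishes"))


sim = Simulator()
sim.schedule(2.0, job_arrives)
sim.run()
print(log)  # [(2.0, 'job arrives'), (7.0, 'job finishes')]
```

Nothing happens between events, so the state only changes at the moments stored in the queue, which is the discrete time property described above.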
7 A Global View for Modelling
MONITORING
REAL Systems
Testbeds
8 Regional Center Model
REGIONAL CENTER
LAN
FARM
9 The Simulation Engine
- Provides the multithreading mechanism for the simulation
- The entities with time dependent behavior are mapped onto active objects
- In the simulation engine: management of active objects and events
- Thread reusability (thread pool)
[Class diagram labels: Activity, Scheduler, AJob, Job, Event, Task, EventQueue, Farm, JobScheduler, Pool, WorkerThread, Engine, CPUUnit]
10 Multitasking Processing Model
Concurrently running tasks share resources (CPU, memory, I/O).
Interrupt driven scheme: for each new task, or when one task is finished, an interrupt is generated and all processing times are recomputed.
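The interrupt driven scheme can be illustrated with a single CPU shared by the running tasks. The sketch below assumes an equal share of the CPU per task (plain processor sharing), which the slide does not specify. The function name and the task tuples are hypothetical.

```python
def simulate_shared_cpu(tasks, cpu_power=1.0):
    """Return {name: finish_time} for tasks given as (name, arrival_time, work).

    Assumption: all running tasks share the CPU equally. Every arrival or
    completion acts as an interrupt at which the per-task speed is recomputed.
    """
    pending = sorted(tasks, key=lambda t: t[1])   # tasks that have not arrived yet
    remaining = {}                                # name -> work still to do
    finish = {}
    now = 0.0
    while pending or remaining:
        # Speed of each running task until the next interrupt.
        rate = cpu_power / len(remaining) if remaining else 0.0
        next_arrival = pending[0][1] if pending else float("inf")
        next_done = now + min(remaining.values()) / rate if remaining else float("inf")
        t_next = min(next_arrival, next_done)
        # Advance every running task up to the interrupt.
        for name in remaining:
            remaining[name] -= rate * (t_next - now)
        now = t_next
        # Interrupt: remove finished tasks, then admit newly arrived ones.
        for name in [n for n, w in remaining.items() if w <= 1e-9]:
            finish[name] = now
            del remaining[name]
        while pending and pending[0][1] <= now:
            name, _, work = pending.pop(0)
            remaining[name] = work
    return finish


result = simulate_shared_cpu([("A", 0.0, 10.0), ("B", 5.0, 10.0)])
print(result)  # A finishes at 15.0, B at 20.0
```

Task A alone would finish at time 10. The arrival of B at time 5 halves its speed and pushes its completion to 15, which is the recomputation the slide describes.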
11 Engine tests
Processing a TOTAL of 100 000 simple jobs on 1, 10, 100, 1000, 2 000, 4 000, 10 000 CPUs (number of CPUs = number of parallel threads)
More tests: http://monalisa.cacr.caltech.edu/MONARC/
12 Job Scheduling
- Dynamically loadable modules for each regional center
- Basic job scheduler assigns the jobs to CPUs from the local farm
- More complex schedulers allow job migration between regional centers
Dynamically loadable module
13 Centralized Scheduling
Site A
GLOBAL Job Scheduler
14 Distributed Scheduling - market model
COST
Request
DECISION
JobScheduler
Site A
15 Example: simple distributed scheduling
- Very simple scheduling algorithm, based on searching for the center with the minimum load
- We simulated the activity of 4 regional centers
- When all the centers are heavily loaded, the number of job transfers grows unnecessarily
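A minimum load policy of this kind can be sketched in a few lines. The code is an illustration under assumptions, not the scheduler used in the study: load is taken to be the number of queued jobs, and the center names A to D are hypothetical.

```python
def pick_center(loads):
    """Return the center with the minimum load (ties go to the first one listed)."""
    return min(loads, key=loads.get)


def submit_jobs(loads, submissions):
    """Send each job to the least loaded center and count the transfers.

    loads: dict mapping center name -> number of queued jobs (updated in place)
    submissions: iterable with the home center of each submitted job
    """
    transfers = 0
    for home in submissions:
        target = pick_center(loads)
        if target != home:
            transfers += 1          # the job leaves its home center
        loads[target] += 1
    return transfers


# Four heavily loaded centers with almost identical queues.
loads = {"A": 100, "B": 101, "C": 100, "D": 102}
moved = submit_jobs(loads, ["D", "D", "D", "D"])
print(moved, loads)
```

In this example all four jobs submitted at D are moved, although every queue already holds about 100 jobs and the move gains almost nothing. This matches the observation that transfers grow unnecessarily when all centers are heavily loaded.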
16 Network Model
[Diagram labels, simulated network components: Farm, WAN, LinkPort, LAN (each shown twice)]
Simulated local traffic
Simulated inter-regional traffic
17 LAN/WAN Simulation Model
[Diagram labels: Link, Node, LAN, ROUTER, Internet Connections]
Interrupt driven simulation: for each new message an interrupt is created, and for all the active transfers the speed and the estimated time to complete the transfer are recalculated.
Continuous flow between events! An efficient and realistic way to simulate concurrent transfers having different sizes / protocols.
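The recalculation step can be sketched for a single link. The example assumes that the active transfers split the link bandwidth equally, which is a simplification: the slide mentions different protocols but gives no sharing rule. The `Link` and `Transfer` classes are hypothetical, and completion handling is left out.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    name: str
    remaining_mbit: float     # data still to send, in Mbit
    last_update: float        # simulation time of the last recalculation
    speed_mbps: float = 0.0   # current speed, set at each interrupt


class Link:
    """One network link shared by concurrent transfers (illustrative only)."""

    def __init__(self, bandwidth_mbps):
        self.bandwidth_mbps = bandwidth_mbps
        self.active = []

    def _interrupt(self, now):
        """Account for the flow since the last interrupt, then re-split bandwidth."""
        for t in self.active:
            t.remaining_mbit -= t.speed_mbps * (now - t.last_update)
            t.last_update = now
        if self.active:
            share = self.bandwidth_mbps / len(self.active)   # assumption: equal share
            for t in self.active:
                t.speed_mbps = share

    def start(self, name, size_mbit, now):
        """A new message arrives: register it and raise an interrupt."""
        self.active.append(Transfer(name, size_mbit, now))
        self._interrupt(now)

    def eta(self):
        """Estimated completion time of each transfer at its current speed."""
        return {t.name: t.last_update + t.remaining_mbit / t.speed_mbps
                for t in self.active}


link = Link(bandwidth_mbps=100.0)
link.start("a", 1000.0, now=0.0)
print(link.eta())   # {'a': 10.0}
link.start("b", 500.0, now=4.0)
print(link.eta())   # {'a': 16.0, 'b': 14.0}
```

Between two interrupts each transfer progresses at a constant speed, so no per-packet events are needed. The estimates only change when a message starts or, in a fuller version, when one finishes.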
18 Network Model
The TCP/IP layers are closely followed
Application Layer
Transport Layer
Internet Layer
Network Access Layer
19 Data Model
[Diagram labels: Database Index, Client, Mapping, Database, LinkPort, Task, Database Entity, DContainer, Database Server, Mass Storage]
20 Data Model
- Generic Data Container
  - Size
  - Event Type
  - Event Range
  - Access Count
- INSTANCE
[Diagram labels: META DATA Catalog, Replication Catalog, Network FILE, FILE, Data Base, Custom Data Server, FTP Server Node, DB Server, NFS Server, Export / Import]
21 Data Model
[Diagram labels: META DATA Catalog, Replication Catalog, Data Processing JOB, Data Request, Data Container, Select from the options, JOB, List Of IO Transactions]
22 Activities: Arrival Patterns
A flexible mechanism to define the stochastic process of how users perform data processing tasks.
Dynamic loading of Activity tasks, which are threaded objects controlled by the simulation scheduling mechanism.
Physics Activities Injecting Jobs
Each Activity thread generates data processing jobs.
These dynamic objects are used to model the users' behavior.
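One possible arrival pattern is sketched below as an example of such a stochastic process. It assumes exponentially distributed gaps between job submissions (a Poisson process), which is a common modelling choice and not something the slide states. It also uses a plain generator in place of the threaded Activity objects. The function name and parameters are hypothetical.

```python
import random


def job_arrival_times(mean_interval, horizon, seed=0):
    """Yield job submission times in [0, horizon) with exponential gaps.

    Assumption: a Poisson arrival process with the given mean interval.
    The seed makes the run reproducible.
    """
    rng = random.Random(seed)
    t = rng.expovariate(1.0 / mean_interval)
    while t < horizon:
        yield t
        t += rng.expovariate(1.0 / mean_interval)


times = list(job_arrival_times(mean_interval=10.0, horizon=3600.0))
print(len(times), "jobs submitted in one simulated hour")
```

Replacing the distribution inside the generator is enough to model a different user behavior, for example bursts of analysis jobs at fixed hours.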
23 Output of the simulation
[Diagram labels: Node, Simulation Engine, DB, Router, Output Listener Filters, GRAPHICS, Log Files, EXCEL, User C]
Any component in the system can generate generic result objects. Any client can subscribe with a filter and will receive the results it is interested in.
The structure is VERY SIMILAR to that of MonALISA. We will soon integrate the output of the simulation framework into MonALISA.
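The subscribe-with-a-filter idea can be sketched as a small publish/subscribe hub. This is an illustration only: the `ResultBus` class, the dictionary layout of a result, and the names `farm-A` and `wan-1` are hypothetical and do not reflect the MONARC or MonALISA interfaces.

```python
class ResultBus:
    """Delivers each published result to the clients whose filter accepts it."""

    def __init__(self):
        self._subscribers = []    # list of (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        """Register a client: `callback` receives results where `predicate` is true."""
        self._subscribers.append((predicate, callback))

    def publish(self, result):
        """Called by any simulated component that produces a result."""
        for predicate, callback in self._subscribers:
            if predicate(result):
                callback(result)


bus = ResultBus()
cpu_results = []    # stands in for a graphics client interested in CPU load only
all_results = []    # stands in for a log file client that records everything

bus.subscribe(lambda r: r["param"] == "cpu_load", cpu_results.append)
bus.subscribe(lambda r: True, all_results.append)

bus.publish({"source": "farm-A", "param": "cpu_load", "time": 10.0, "value": 0.75})
bus.publish({"source": "wan-1", "param": "traffic", "time": 10.0, "value": 120.0})

print(len(cpu_results), len(all_results))  # 1 2
```

Because the components only call `publish`, new output clients can be added without changing the simulated components.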
24 Conclusions
- Modelling and understanding current systems, their performance and limitations, is essential for the design of large scale distributed processing systems. This will require continuous iterations between modelling and monitoring.
- Simulation and modelling tools must provide the functionality to help in designing complex systems and to evaluate different strategies and algorithms for the decision making units and the data flow management.
- For future development: efficient distributed scheduling algorithms, data replication, more complex examples.
- http://monalisa.cacr.caltech.edu/MONARC