Grid Computing in CMS: Transcript and Presenter's Notes

1
Grid Computing in CMS
  • José M. Hernández
  • CIEMAT, Madrid
  • HEP2005 International Europhysics Conference
    on High Energy Physics

2
Outline
  • Architecture of CMS Computing System
    • Data and Workload Management Systems
    • Baseline capabilities and functionalities
  • Key CMS Grid Systems
    • Data Transfer and Placement System (PhEDEx)
    • Monte Carlo production on the Grid
    • Data analysis on the Grid
  • Stress Tests of CMS Computing System
    • CMS computing challenges and LCG service challenges
    • Develop computing system iteratively

3
CMS Computing Model
  • Distributed model for computing in CMS
    • Copes with the computing requirements for storage, processing
      and analysis of the data provided by the LHC
    • Computing resources are geographically distributed,
      interconnected via high-throughput networks and operated by
      means of Grid software
  • CMS Computing TDR just released (June 2005)

4
Tiered Architecture
  • Tier-0
    • Accepts data from the DAQ
    • Prompt reconstruction
    • Archives data and distributes them to Tier-1s
  • Tier-1s
    • Real data archiving
    • Re-processing
    • Calibration
    • Skimming and other data-intensive analysis tasks
    • MC data archiving
  • Tier-2s
    • Data analysis
    • MC simulation
    • Import datasets from Tier-1s and export MC data (the tier
      roles and flows are modelled in the sketch after this list)
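
To make the division of labour concrete, here is a toy model of the
tier roles and managed data flows as plain Python data. The tier names
and tasks come from the slide; the code structure itself is purely
illustrative.

    # Toy model of the CMS tiered architecture described above. Tier
    # names and tasks follow the slide; the structure is illustrative.
    TIERS = {
        "Tier-0": ["accept data from DAQ", "prompt reconstruction",
                   "archive and distribute to Tier-1s"],
        "Tier-1": ["real data archiving", "re-processing", "calibration",
                   "skimming and data-intensive analysis", "MC archiving"],
        "Tier-2": ["data analysis", "MC simulation"],
    }

    # Managed flows (cf. the PhEDEx transfer topology on a later slide).
    FLOWS = {("Tier-0", "Tier-1"),   # raw/reconstructed data distribution
             ("Tier-1", "Tier-2"),   # dataset import for analysis
             ("Tier-2", "Tier-1")}   # export of produced MC data

    def allowed(src, dst):
        """True if the model permits a managed transfer src -> dst."""
        return (src, dst) in FLOWS

    assert allowed("Tier-2", "Tier-1")      # MC export to a Tier-1
    assert not allowed("Tier-0", "Tier-2")  # no direct Tier-0/Tier-2 flow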

5
Workload and Data Management Systems
  • Design philosophy
    • Use Grid services as much as possible, complemented by
      CMS-specific services
    • Baseline system with minimal functionality for first physics
  • Keep it simple!
    • Optimize for the common case
    • Optimize for read access (most data is write-once, read-many)
    • Optimize for organized bulk processing, but without limiting
      the single user
  • Decouple parts of the system
    • Minimize job dependencies
    • Site-local information stays site-local
  • Use explicit data placement (see the sketch after this list)
    • Data does not move around in response to job submission
    • All data is placed at a site through explicit CMS policy
  • Grid interoperability (LCG and OSG)
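
A minimal sketch of the explicit-placement principle, assuming nothing
about the real CMS tools: placement is a standalone policy decision
recorded ahead of time, and job submission merely reads it. All names
below are hypothetical.

    # Sketch of "explicit data placement": where data lives is decided
    # by CMS policy, never as a side effect of job submission. All
    # names here are hypothetical.
    placement = {}  # file block -> hosting sites, filled only by policy

    def place_block(block, sites):
        """Record a policy decision to host a block at the given sites."""
        placement.setdefault(block, []).extend(sites)

    def sites_for_job(block):
        """Job submission only reads the placement table, never writes it."""
        return placement.get(block, [])

    place_block("/TTbar/Spring05/RAW#0001", ["T1_FNAL", "T1_CNAF"])
    print(sites_for_job("/TTbar/Spring05/RAW#0001"))  # jobs go to the data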

6
WMS and DMS Services Overview
  • No global file replica catalogue
    • Track and replicate data with a granularity of file blocks
  • Data Bookkeeping System
    • What data exist?
  • Data Location Service
    • Where are data located?
  • Local file catalogue (the full discovery chain is sketched after
    this list)
  • Data access and storage
    • SRM and POSIX I/O
  • Data transfer and placement system
  • Rely on Grid workload management
    • Reliability, performance, monitoring
    • Hierarchical task queue in the future
  • Grid and CMS-specific job monitoring and bookkeeping
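
The three catalogue services above chain naturally: dataset to file
blocks (bookkeeping), block to sites (location), then logical to
physical file name at the chosen site. The sketch below shows that
chain with invented function names and stand-in data; it is not the
actual DBS/DLS interface.

    # Hypothetical sketch of the data discovery chain: dataset -> file
    # blocks (Data Bookkeeping System), block -> sites (Data Location
    # Service), then the site-local catalogue maps logical file names
    # to physical ones. Names and return values are stand-ins.
    def dbs_blocks(dataset):
        """Data Bookkeeping System: which blocks make up this dataset?"""
        return ["%s#block%d" % (dataset, i) for i in range(2)]

    def dls_sites(block):
        """Data Location Service: which sites host this block?"""
        return ["T1_FNAL", "T2_Madrid"]

    def local_pfn(site, lfn):
        """Site-local catalogue: logical -> physical file name.
        Site-local information stays site-local (previous slide)."""
        return "srm://%s/store%s" % (site, lfn)

    for block in dbs_blocks("/TTbar/Spring05/RECO"):
        for site in dls_sites(block):
            print(site, local_pfn(site, "/TTbar/Spring05/RECO/f0.root"))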

[Figure: the services above grouped into the DMS and the WMS]
7
Data Transfer and Placement System
  • PhEDEx (Physics Experiment Data Export)
    • Large-scale dataset replica management system
  • Managed data flow following a transfer topology
    (Tier-0 → Tier-1 → Tier-2)
    • Routed multi-hop transfers; routing agents determine the best
      route
    • Reliable point-to-point transfers built on top of unreliable
      Grid transfer tools
  • Set of quasi-independent, asynchronous software agents posting
    messages on a central blackboard (see the sketch after this list)
  • Nodes subscribe to data allocated from other nodes
    • Enables distribution management at the dataset level rather
      than at the file level
    • Implements the experiment's policy on data placement
    • Allows prioritization and scheduling
  • In production for more than a year
    • Managing transfers of several TB/day
    • 100 TB known to PhEDEx, 200 TB total replicated
    • Running at CERN, 7 Tier-1s and 10 Tier-2s
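
A toy illustration of the agent-and-blackboard pattern named above,
assuming the blackboard is simply a shared task table: independent
agents poll for transfer tasks, perform one reliable hop, and post the
updated state back. This sketches the pattern only; it is not PhEDEx
code.

    # Toy sketch of the blackboard pattern: quasi-independent agents
    # poll a central table for transfer tasks, perform one hop of a
    # routed multi-hop transfer, and post the new state back.
    blackboard = [{"file": "f1.root", "hop": 0, "state": "routed",
                   "route": ["T0_CERN", "T1_CNAF", "T2_Bari"]}]

    def transfer_agent():
        """Advance each routed transfer by one reliable hop."""
        for task in blackboard:
            if task["state"] == "routed":
                src = task["route"][task["hop"]]
                dst = task["route"][task["hop"] + 1]
                print("copying %s: %s -> %s" % (task["file"], src, dst))
                task["hop"] += 1
                if task["hop"] == len(task["route"]) - 1:
                    task["state"] = "done"  # reached its destination

    # In PhEDEx many such agents run asynchronously; here we just loop.
    while any(t["state"] != "done" for t in blackboard):
        transfer_agent()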

8
PhEDEx (figure slide)
9
MC Production on the Grid
  • US Open Science Grid (OSG) and LHC Computing Grid (LCG)
  • McRunjob tool for running CMS production jobs (preparation,
    submission, stage-in, execution, stage-out, cleanup; this step
    chain is sketched after the list)
    • Developed by FNAL with contributions from other CMS people
    • Highly configurable and flexible
    • Interfaced to all Grids and to local farm production
  • Different production steps (generation, simulation, digitization
    and reconstruction) currently run separately
  • Production on both Grids since 2003
    • Production on LCG recently extended to run all steps; now
      ramping up
    • Local farm production moving to the Grid
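
A generic sketch of that production-job life cycle, with entirely
hypothetical phase bodies; it only shows how the six phases chain and
how the four production steps each make a separate pass, as the slide
describes.

    # Generic sketch of the production-job life cycle listed above.
    # The phase bodies are placeholders, not McRunjob internals.
    def prepare(job):   print("write job configuration for", job["id"])
    def submit(job):    print("submit", job["id"], "to the resource broker")
    def stage_in(job):  print("copy input to the worker node for", job["id"])
    def execute(job):   print("run the", job["step"], "step for", job["id"])
    def stage_out(job): print("copy output to storage for", job["id"])
    def cleanup(job):   print("remove the scratch area for", job["id"])

    PHASES = [prepare, submit, stage_in, execute, stage_out, cleanup]

    # Generation, simulation, digitization and reconstruction currently
    # run as separate jobs, so each is its own pass through the chain.
    for step in ["generation", "simulation", "digitization", "reconstruction"]:
        job = {"id": "ttbar_%s_001" % step, "step": step}
        for phase in PHASES:
            phase(job)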

10
LCG Production Workflow
  • Quasi-real-time job monitoring (BOSS)
  • Experiment software normally pre-installed at sites

11
MC Production on the Grid
[Plots: production statistics on LCG and OSG]
  • Several thousand CPUs available on both Grids
  • A few million events produced per month
  • Around 70-90% job efficiency; the issue is reliability
    • System issues (hardware failures, NFS access, full disks, site
      misconfiguration)
    • Software installation problems
    • Grid services, stage-in and stage-out of files, LCG catalogue
      instability
  • Running with a site whitelist, at the cost of increased manpower
  • Instrument the job wrapper to cope with known instabilities
    (a retry sketch follows this list)
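
Instrumenting the wrapper largely means retrying the fragile
operations (stage-in, stage-out, catalogue calls) and recording
diagnostics. A minimal sketch, assuming a simple retry policy around a
grid copy command; the host and paths are placeholders.

    # Minimal sketch of an instrumented job wrapper: retry fragile
    # grid operations and log diagnostics for later triage. The retry
    # policy is an assumption; hosts and paths are placeholders.
    import subprocess
    import time

    def run_with_retries(cmd, attempts=3, pause=30.0):
        """Run cmd, retrying on failure; log every failed attempt."""
        for attempt in range(1, attempts + 1):
            result = subprocess.run(cmd, capture_output=True, text=True)
            if result.returncode == 0:
                return True
            print("attempt %d/%d failed (rc=%d): %s"
                  % (attempt, attempts, result.returncode,
                     result.stderr.strip()))
            time.sleep(pause)
        return False

    # Guard the stage-out, one of the failure modes listed above,
    # here shown with globus-url-copy, a grid copy tool of this era.
    if not run_with_retries(["globus-url-copy", "file:///tmp/out.root",
                             "gsiftp://se.example.org/store/out.root"]):
        raise SystemExit("stage-out failed after retries")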

12
Data Analysis on the Grid
  • Data samples for the CMS Physics TDR distributed across Tier-1
    sites (80 million events)
  • End-to-end analysis via the LCG Grid
    • Simple analysis scenario where data are pre-located and jobs
      are sent to the data
  • CMS Remote Analysis Builder (CRAB) tool for job preparation,
    submission, execution and basic monitoring (the flow is sketched
    after this list)
  • Several tens of users and hundreds of jobs per day
    • 100K jobs submitted
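
A sketch of that send-jobs-to-the-data flow, with invented function
names and stand-in numbers; it is not the CRAB interface, just the
scenario the slide describes: split the analysis of a pre-located
dataset into jobs and submit each one to a site that hosts the data.

    # Hypothetical sketch of the CRAB-style flow: data are pre-located,
    # so the tool splits the analysis into jobs and sends them to the
    # data. Function names and numbers are invented for illustration.
    def split_into_jobs(dataset, events_per_job):
        """Preparation: split the analysis of a dataset into grid jobs."""
        total_events = 1000000  # stand-in for a bookkeeping lookup
        return [{"dataset": dataset, "first": i * events_per_job,
                 "n": events_per_job}
                for i in range(total_events // events_per_job)]

    def submit(job):
        """Submission: jobs go to a site hosting the data, not vice versa."""
        site = "T1_CNAF"        # stand-in for a data location lookup
        print("events %d-%d -> %s"
              % (job["first"], job["first"] + job["n"] - 1, site))
        return "job-%d" % job["first"]

    handles = [submit(j) for j in split_into_jobs("/TTbar/Spring05/RECO",
                                                  250000)]
    print(len(handles), "jobs submitted")  # handles feed basic monitoring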

13
Tests of CMS Computing System
  • It is crucial to test prototypes of Grid resources and services
    at increasing scale and complexity so that they can become
    production services
    • Problems and missing components are identified and addressed
    • Iterative process in computing system development
  • Scheduled CMS computing challenges and LCG service challenges
  • CMS Data Challenge 2004
    • Tier-0 reconstruction at 25 Hz and distribution to Tier-1s for
      real-time analysis
    • Put in place the CMS data transfer and placement system
      (PhEDEx)
    • First large-scale test of the Grid WMS (real-time analysis)
    • Problems identified: small file sizes (transfer, mass
      storage), slow central replica and metadata LCG catalogue,
      lack of a reliable file transfer system in LCG
  • CMS Computing, Software and Analysis challenge (summer 2006)
    • Full test of the CMS computing system
  • LCG Service Challenges
    • SC3 (Sept-Dec 2005): test of all experiment services except
      analysis
    • SC4 (from April 2006): test of all computing services

14
Experience in CMS Grid Computing
  • Basic Grid infrastructure and services in place
    • The Grid works, but the issue is reliability
  • A lot still to be done
    • VO policy and priorities not yet fully implemented in the Grid
      WMS and DMS
    • Lack of dynamic behaviour (e.g. rescheduling) in the WMS and
      DMS
    • As a consequence, a custom data transfer and placement system
      was implemented in the CMS DMS, and a hierarchical task queue
      is planned for the CMS WMS
    • Primitive and high-latency job monitoring
    • Accounting only recently implemented
    • Reduce Grid overheads in the WMS and in monitoring
  • Putting effort into integration is crucial; working with sites

15
Summary
  • CMS has adopted a distributed computing model which makes use of
    Grid technologies
  • Production CMS services on the Grid are in place
    • Data management and workload management systems
    • Data transfer and placement system
    • Monte Carlo production
    • Data analysis
  • Steady increase in scale and complexity
  • Basic Grid infrastructure and services in place, but reliability
    and stability are the problems
  • A lot of work ahead for Grid software providers and the CMS
    computing team