1
Grid computing at CERN
  • Oxana Smirnova
  • Lund University/CERN
  • 2nd NGN meeting, Tallinn, January 20, 2005

2
CERN: the European Particle Physics Lab
Large Hadron Collider: the world's biggest
accelerator at CERN
http://www.cern.ch
3
Collisions at LHC
4
Experiments at LHC and computing challenges
  • Data-intensive tasks
  • Large datasets, large files
  • Lengthy processing times
  • Large memory consumption
  • High throughput is necessary
  • Very distributed user base
  • 50 countries, thousands of researchers
  • Distributed computing resources of modest size
  • Produced and processed data are hence
    distributed, too
  • Issues of coordination, synchronization and
    authorization are outstanding

The HEP community at CERN was the first to recognize
the necessity of Grid computing
5
Grid projects at and around CERN
  • MONARC project developed a multi-tiered model for
    distributed analysis of data
  • Particle Physics Data Grid (PPDG) and GriPhyN
    projects by US physicists started using Grid
    technologies
  • Used parts of the Globus Toolkit
  • Globus was picked up by the CERN-led EU DataGrid
    (EDG) project
  • EDG did not satisfy production-level requirements;
    many simpler solutions appeared (still
    Globus-based)
  • NorduGrid (Northern Europe and others)
  • Grid3 (USA)
  • EGEE's gLite (EU, prototype)
  • LHC Computing Grid (LCG) builds Grid for CERN,
    uses modified EDG and aims towards gLite

6
LHC experiments' usage of the Grid
  • The experiments recently presented their
    computing models
  • All rely on Grid computing in many aspects
  • Common points
  • Multi-tiered hierarchy
  • Tier0 (CERN) → Tier1 (regional) → Tier2 (local)
  • Raw and reconstructed data: 2-3 copies worldwide;
    analysis objects: a copy per Tier1, some at Tier2
    (dedicated); a toy sketch of this policy follows
    after this list
  • Grid(s) to be used to manage centralized
    production at Tier2s and processing at Tier1s,
    and eventually for analysis
  • Differences
  • 3 out of 4 use different non-LCG Grid-like
    solutions
  • ALICE: AliEn (assume it will transform into
    gLite)
  • ATLAS: Grid3, ARC
  • LHCb: Dirac
  • Only ALICE makes an explicit statement on the Grid
    middleware (needs AliEn)
  • Some see Grid as a necessity, others as a
    possible optimization
  • Some require a single Grid, others realize there
    will be many
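
The tiered data distribution described above can be illustrated with a
short sketch. The Python below is purely illustrative and not LCG
middleware code: the Site class, the place_replicas helper, the site
names and the n_raw_copies parameter are all invented for this example.
It only shows how raw and reconstructed data could end up in 2-3 copies
worldwide while analysis objects get one copy per Tier1.

  # Toy model (hypothetical, not LCG code) of the replication policy
  # sketched above: raw/reconstructed data in 2-3 copies worldwide,
  # analysis objects replicated to every Tier1.
  from dataclasses import dataclass, field

  @dataclass
  class Site:
      name: str
      tier: int                      # 0 = CERN, 1 = regional, 2 = local
      replicas: set = field(default_factory=set)

  def place_replicas(dataset, kind, sites, n_raw_copies=3):
      """Assign a dataset to sites according to its kind."""
      tier1s = [s for s in sites if s.tier == 1]
      if kind in ("raw", "reconstructed"):
          # master copy at Tier0 plus enough Tier1 copies for 2-3 worldwide
          targets = [s for s in sites if s.tier == 0] + tier1s[:n_raw_copies - 1]
      elif kind == "analysis":
          # one copy per Tier1; dedicated Tier2s could be appended here too
          targets = tier1s
      else:
          raise ValueError(f"unknown data kind: {kind}")
      for site in targets:
          site.replicas.add(dataset)
      return targets

  # Example use with made-up site names
  sites = [Site("CERN", 0), Site("Tier1-A", 1), Site("Tier1-B", 1), Site("Lund", 2)]
  place_replicas("run001.raw", "raw", sites)
  place_replicas("run001.analysis", "analysis", sites)
  for s in sites:
      print(s.name, sorted(s.replicas))

Running this prints which made-up site holds which replica; the real
placement is of course driven by the experiments' computing models and
the Grid data management services, not by such a helper.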

7
LCG: the central Grid project for CERN
  • Sometimes referred to as the "fifth LHC
    experiment"
  • Major activities (see http://cern.ch/lcg)
  • Fabric
  • Grid deployment and operations
  • Common applications
  • Distributed data analysis (ARDA)
  • Originally chose EDG as the basic middleware
  • Applies some modifications, uses only selected
    services
  • Took over EDG middleware support
  • Later, agreed to share some operational
    responsibilities and middleware with EGEE
  • CERN is actually the coordinator of EGEE (see
    http://cern.ch/egee)
  • EGEE's gLite middleware is expected to inherit
    many EDG solutions
  • Since late 2003, LCG has been in production mode,
    to be used by the LHC experiments
  • 80 sites, 7000 processors

8
LCG status today
  • LCG Comprehensive Review took place on November
    22-23, 2004
  • Materials publicly available at
    http://agenda.cern.ch/fullAgenda.php?ida=a043872
  • Excerpts from the final report (from slides by
    K. Bos)
  • Middleware: Progress was reported in the
    development and use of the middleware but the
    LHCC noted outstanding issues concerning the
    LCG-2 low job success rate, inadequacies of the
    workload management and data management systems,
    as well as delays in the release of the EGEE
    gLite services. Continued delays in gLite may
    hinder future progress in ARDA. LCG-2 has been
    used as a production batch system, but Grid-based
    analysis of the simulated data is only just
    starting. The interoperability of the various
    types of middleware being produced should be
    pursued together with common interface tools, and
    developers of the gLite middleware should remain
    available for the support phase.
  • Grid Deployment and Regional Centers: Good
    progress was reported on the installation of Grid
    software in remote sites. A large amount of data
    has been processed on the LCG-2 Grid as part of
    the Data Challenges and the LCG-2 Grid has been
    operated successfully for several months.
    However, the LHCC noted that the service provided
    by LCG-2 was much less than production quality
    and the experiments and LCG Project expended a
    large amount of effort to be in a position to use
    the service.

9
LCG status, continued
  • Excerpts from the final report, continued
  • Fabric and Network: The LHCC has no major
    concerns regarding the Fabric Area and Wide Area
    Networking. In view of the reported delays, the
    Committee will continue checking on the
    availability and performance of the CASTOR disk
    pool management system.
  • Applications Area: The LHCC noted the good
    progress in the Applications Area with all
    projects demonstrating significant steps in the
    development and production of their respective
    products and services. The major outstanding
    issues lie with the insufficient coordination
    between the Applications Area and ROOT and with
    the imminent reduction of manpower due to the
    transition from the development to the
    deployment, maintenance and support phases.
  • Management and Planning: The LHCC took note of
    the upcoming milestones for the LCG and noted
    that discussions are currently underway to secure
    the missing manpower to develop, deploy and
    support the Grid services. The lines of
    responsibility and authority in the overall
    organization structure need further
    clarification.

10
Plans for the gLite middleware in 2005
  • End of March
  • use the gLite middleware (beta) on the extended
    prototype (eventually the pre-production service)
    and provide feedback (technical issues and
    collect high-level comments and experience from
    the experiments)
  • Release Candidate 1
  • End of June
  • use the gLite middleware (version 1.0) on the
    extended prototype (eventually the pre-production
    service) and provide feedback (technical issues
    and collect high-level comments and experience
    from the experiments)
  • Release 1
  • End of September
  • use the gLite middleware (version 1.1) on the
    extended prototype (eventually the pre-production
    service) and provide feedback (technical issues
    and collect high-level comments and experience
    from the experiments)
  • Interim Integrated Build
  • End of December
  • use the gLite middleware (version 1.2 - release
    candidate 2) on the extended prototype
    (eventually the pre-production service) and
    provide feedback (technical issues and collect
    high-level comments and experience from the
    experiments)
  • Release Candidate 2

Slide by F. Hemmer
11
LCG planning in 2005
  • February/March: Fabric/Grid workshop on the
    computing models
  • First quarter 2005
  • Improve/work out relations between Tier0/1/2
  • Understand data access patterns, define
    experiments' shares at Tier1s
  • Prepare documentation for MoUs between LCG and
    Tier0/1/2 centers
  • Work on Technical Design Report
  • Other
  • March: detailed plan for the service challenges
  • March: phase 2 Applications Area plan
  • April: initial plan for Tier-0/1/2 networking
  • April: prepare final version of the LCG MoU
  • May: proposal for middleware evolution
  • End of June: Technical Design Report
  • Detailed plan for installation and commissioning
    of the LHC computing environment
  • September: final installation and commissioning
    plan
  • October: ready to sign the MoU

Based on slides by L. Robertson
12
Summary
  • CERN expects LCG to provide an adequate Grid-like
    computing infrastructure for future LHC data
    processing
  • The resources are available, and the owners will
    sign MoUs with CERN/LCG in 2006
  • The experiments tested the LCG system extensively
    throughout 2004
  • No satisfactory production-level service
  • No optimization yet; tremendous effort is needed
    to keep it running
  • Other Grid solutions offered fewer resources but
    better reliability with less effort (see e.g.
    talks at CHEP04)
  • Major problems
  • Operational and organizational issues
  • Inadequate middleware without official
    developers' support
  • EGEE is expected to help out
  • Manpower for operation, infrastructure and
    support centers
  • Improved middleware (gLite)
  • Still, it becomes clear that there will be no
    single Grid solution for CERN
  • EDG/LCG, AliEn, gLite, Dirac, Grid3/OSG,
    NorduGrid's ARC, INFN-Grid and counting: all are
    being used and have avid supporters
  • Some expect LCG to concentrate on fabric,
    operations and applications