1
Middleware Development and Deployment Status
Tony Doyle
2
Contents
  • What are the Challenges?
  • What is the scale?
  • How does the Grid work?
  • What is the status of (EGEE) middleware
    development?
  • What is the deployment status?
  • What is GridPP doing as part of the International
    effort?
  • What was GridPP1?
  • Is GridPP a Grid?
  • What is planned for GridPP2?
  • What lies ahead?
  • Summary
  • Why? What? How? When?

3
Science generates data and might require a Grid?
Earth Observation
Bioinformatics
Astronomy
Digital Curation
Healthcare
?
Collaborative Engineering
4
What are the challenges?
  • The Grid must
  • share data between thousands of scientists with
    multiple interests
  • link major (Tier-0 and Tier-1) and minor (Tier-1
    and Tier-2) computer centres
  • ensure all data are accessible anywhere, anytime
  • grow rapidly, yet remain reliable, for more than a
    decade
  • cope with the different management policies of
    different centres
  • ensure data security
  • be up and running routinely by 2007

5
What are the challenges?
1. Software process
2. Software efficiency
3. Deployment planning
4. Link centres
5. Share data
6. Manage data
7. Install software
8. Analyse data
9. Accounting
10. Policies
Data Management, Security and Sharing
6
Tier-1 Scale
Step 1: financial planning
Step 2: compare to (e.g. Tier-1) experiment requirements
Step 3: conclude that more than one centre is needed
Step 4: a Grid?
Ian Foster / Carl Kesselman: "A computational Grid
is a hardware and software infrastructure that
provides dependable, consistent, pervasive and
inexpensive access to high-end computational
capabilities."
Currently network performance doubles every year
(or so) for unit cost, as the sketch below illustrates.
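
A quick back-of-the-envelope sketch of that growth assumption (Python; the one-year doubling period is the figure quoted above, not a measurement):

    # If network performance per unit cost doubles roughly every year,
    # a fixed budget buys 2**(n / T) times the bandwidth after n years.
    def bandwidth_growth(years, doubling_period_years=1.0):
        """Relative bandwidth per unit cost after `years`."""
        return 2 ** (years / doubling_period_years)

    for n in (1, 3, 5, 10):
        print(f"after {n:2d} years: x{bandwidth_growth(n):,.0f}")
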
7
What is the Grid? The Hourglass Model
I. Experiment Layer e.g. Portals
II. Application Middleware e.g. Metadata
III. Grid Middleware e.g. Information Services
IV. Facilities and Fabrics e.g. Storage Services
8
How do I start? http://www.gridpp.ac.uk/start/
  • Getting started as a Grid user
  • Quick start guide for LCG2: GridPP's guide to
    starting as a user of the Large Hadron Collider
    Computing Grid.
  • Getting an e-Science certificate: in order to use
    the Grid you need a Grid certificate. This page
    introduces the UK e-Science Certification
    Authority, which issues certificates to users. You
    can get a certificate from here.
  • Using the LHC Computing Grid (LCG): CERN's guide
    on the steps you need to take in order to become
    a user of the LCG. This includes contact details
    for support.
  • LCG user scenario: this describes in a practical
    way the steps a user has to follow to send and
    run jobs on LCG and to retrieve and process the
    output successfully.
  • Currently being improved... (a typical first step
    is sketched below)
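
Once a certificate is installed, the usual first step is to create a short-lived proxy. A minimal sketch, assuming the standard VOMS client tools are installed on the user interface machine; "dteam" is only an example VO:

    import subprocess

    # Create a short-lived VOMS proxy from the user's Grid certificate.
    # "dteam" is an example VO name; use the VO you registered with.
    subprocess.run(["voms-proxy-init", "--voms", "dteam"], check=True)

    # Inspect the proxy: subject, attributes and remaining lifetime.
    subprocess.run(["voms-proxy-info", "--all"], check=True)
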

9
Job Submission (behind the scenes)
  • Replica Catalogue
  • Information Service
  • Resource Broker
  • Authorisation / Authentication
  • Job Submission Service
  • Logging & Book-keeping
  • Compute Element
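
A toy sketch of how the components above cooperate (all names and data structures are illustrative, not the real EDG/gLite APIs): the Resource Broker locates input data via the Replica Catalogue, picks a free Compute Element from the Information Service, and Logging & Book-keeping records each step.

    # Toy model of the job-submission flow; purely illustrative.
    information_service = {"ce01.gridpp.ac.uk": {"free_cpus": 12},
                           "ce02.gridpp.ac.uk": {"free_cpus": 0}}
    replica_catalogue = {"lfn:/grid/demo/events.root":
                         ["srm://se01.gridpp.ac.uk/events.root"]}
    logbook = []  # stands in for Logging & Book-keeping

    def broker(job):
        """Resource Broker: match a job to a suitable Compute Element."""
        logbook.append(("SUBMITTED", job["id"]))
        replicas = replica_catalogue[job["input_lfn"]]   # locate input data
        ce = next(name for name, info in information_service.items()
                  if info["free_cpus"] > 0)              # query the IS
        logbook.append(("MATCHED", job["id"], ce))
        return ce, replicas

    ce, replicas = broker({"id": "job-001",
                           "input_lfn": "lfn:/grid/demo/events.root"})
    print(ce, replicas, logbook)
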
10
Enabling Grids for E-sciencE
  • Deliver a 24/7 Grid service to European science
  • build a consistent, robust and secure Grid
    network that will attract additional computing
    resources.
  • continuously improve and maintain the middleware
    in order to deliver a reliable service to users.
  • attract new users from industry as well as
    science and ensure they receive the high standard
    of training and support they need.
  • 100 million euros over 4 years, funded by the EU
  • >400 software engineers and service support
  • 70 European partners

11
Prototype Middleware: Status & Plans (I)
  • Workload Management
  • AliEn TaskQueue
  • EDG WMS (plus new TaskQueue and Information
    Supermarket)
  • EDG L&B (Logging & Book-keeping)
  • Computing Element
  • Globus Gatekeeper + LCAS/LCMAPS
  • Dynamic accounts (from Globus)
  • Condor-C
  • Interfaces to LSF/PBS (blahp)
  • Pull components
  • AliEn CE
  • gLite CEmon (being configured)

Blue: deployed on the development testbed; red:
proposed
12
Prototype Middleware: Status & Plans (II)
  • Storage Element
  • Existing SRM implementations
  • dCache, Castor, ...
  • FNAL and LCG DPM
  • gLite-I/O (re-factored AliEn-I/O)
  • Catalogs
  • AliEn FileCatalog (global catalog)
  • gLite Replica Catalog (local catalog); a toy
    sketch of this split follows the list
  • Catalog update (messaging)
  • FiReMan Interface
  • RLS (globus)
  • Data Scheduling
  • File Transfer Service (Stork + GridFTP)
  • File Placement Service
  • Data Scheduler
  • Metadata Catalog
  • Simple interface defined (AliEn + BioMed)
  • Information Monitoring
  • R-GMA: web-service version with multi-VO
    support
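
A minimal sketch of the global-catalog / local-replica-catalog split noted above (hypothetical names and data, not the FiReMan or RLS interfaces): a logical file name resolves to a GUID in the global file catalog, and the GUID resolves to physical replicas in a local replica catalog.

    # Toy two-level catalog; purely illustrative.
    file_catalogue = {  # global: logical file name -> GUID
        "lfn:/grid/alice/run42/hits.root": "guid-9f3a",
    }
    replica_catalogue = {  # local: GUID -> physical replicas (SURLs)
        "guid-9f3a": ["srm://se01.gridpp.ac.uk/alice/hits.root",
                      "srm://se02.cern.ch/alice/hits.root"],
    }

    def resolve(lfn):
        """Resolve a logical name to its physical replicas."""
        return replica_catalogue[file_catalogue[lfn]]

    print(resolve("lfn:/grid/alice/run42/hits.root"))
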

13
Prototype Middleware: Status & Plans (III)
  • Security
  • VOMS as Attribute Authority and VO mgmt
  • myProxy as proxy store
  • GSI security and VOMS attributes as enforcement
  • fine-grained authorization (e.g. ACLs); a toy
    check is sketched after this list
  • Globus to provide a set-uid service on the CE
  • Accounting
  • EDG DGAS (not used yet)
  • User Interface
  • AliEn shell
  • CLIs and APIs
  • GAS (Grid Access Service)
  • Catalogs
  • Integrate remaining services
  • Package manager
  • Prototype based on AliEn backend
  • evolve to final architecture agreed with ARDA
    team
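
A toy version of the fine-grained authorisation idea above: a VOMS-style attribute (VO/group/role) is tested against a per-resource ACL. All names and values are illustrative.

    # Toy ACL check; purely illustrative.
    acl = {  # resource -> attributes allowed to use it
        "/storage/atlas": {"/atlas/Role=production", "/atlas/uk"},
    }

    def authorise(resource, user_attributes):
        """Grant access if any user attribute appears in the resource's ACL."""
        return bool(acl.get(resource, set()) & set(user_attributes))

    print(authorise("/storage/atlas", ["/atlas/uk"]))        # True
    print(authorise("/storage/atlas", ["/cms/Role=admin"]))  # False
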

14
CB (Collaboration Board)
PMB (Project Management Board)
Deployment Board: Tier-1/Tier-2, Testbeds, Rollout,
Service specification and provision
User Board: Requirements, Application Development,
User feedback
Middleware areas: Metadata, Storage, Workload,
Network, Security, Info. Mon.
15
Middleware Development
Network Monitoring
Configuration Management
Grid Data Management
Storage Interfaces
Information Services
Security
16
Application Development
ATLAS
LHCb
CMS
SAMGrid (FermiLab)
BaBar (SLAC)
QCDGrid
PhenoGrid
17
GridPP Deployment Status
GridPP deployment is part of LCG (currently the
largest Grid in the world). The future Grid in
the UK is dependent upon LCG releases.
  • Three Grids on a global scale in HEP (similar
    functionality):

    Grid          Sites     CPUs
    LCG (GridPP)  90 (15)   8,700 (1,500)
    Grid3 [USA]   29        2,800
    NorduGrid     30        3,200

18
LCG Overview
  • By 2007
  • 100,000 CPUs
  • more than 100 institutes worldwide
  • building on complex middleware being developed
    in advanced Grid technology projects, both in
    Europe (gLite) and in the USA (VDT)
  • prototype went live in September 2003 in 12
    countries
  • Extensively tested by the LHC experiments during
    this summer

19
Deployment Status (26/10/04)
  • Incremental releases: significant improvements
    in reliability, performance and scalability
  • within the limits of the current architecture
  • scalability is much better than expected a year
    ago
  • Many more nodes and processors than anticipated
  • installation problems of last year overcome
  • many small sites have contributed to MC
    productions
  • Full-scale testing as part of this year's data
    challenges
  • GridPP's "The Grid becomes a reality" story was
    widely reported, e.g. by the British Embassy
    (USA), the British Embassy (Russia) and
    technology news sites
20
Data Challenges
  • Ongoing..
  • Grid and non-Grid Production
  • Grid now significant
  • ALICE - 35 CPU Years
  • Phase 1 done
  • Phase 2 ongoing

  • CMS: 75 M events and 150 TB, the first of this
    year's Grid data challenges

Entering Grid Production Phase..
21
ATLAS Data Challenge
  • 7.7 M GEANT4 events and 22 TB
  • UK 20% of LCG
  • Ongoing..
  • Phase 3: Grid production
  • 150 CPU years so far
  • Largest total computing requirement
  • Small fraction of what ATLAS needs...

Entering Grid Production Phase..
22
LHCb Data Challenge
  • 424 CPU years (4,000 kSI2k months; these units
    are reconciled in the sketch below), 186 M events
  • UK's input significant (>1/4 of the total)
  • LCG(UK) resource:
  • Tier-1: 7.7%
  • Tier-2 sites:
  • London 3.9%
  • South 2.3%
  • North 1.4%
  • DIRAC:
  • Imperial 2.0%
  • L'pool 3.1%
  • Oxford 0.1%
  • ScotGrid 5.1%

Entering Grid Production Phase..
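
A back-of-the-envelope reconciliation of the two units quoted in the first bullet above, assuming both describe the same body of work (the per-CPU figure is derived, not stated in the source):

    # 424 CPU years vs 4,000 kSI2k months: implied average CPU power.
    cpu_years = 424
    ksi2k_months = 4_000
    cpu_months = cpu_years * 12
    print(f"{ksi2k_months / cpu_months:.2f} kSI2k per CPU")  # ~0.79
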
23
Paradigm Shift: Transition to Grid
424 CPU years; monthly share (DIRAC : LCG) and
fraction of DC04 produced:
  May 89:11, 11% of DC04
  Jun 80:20, 25% of DC04
  Jul 77:23, 22% of DC04
  Aug 27:73, 42% of DC04
24
More Applications
  • ZEUS uses LCG
  • needs the Grid to respond to increasing demand
    for MC production
  • 5 million Geant events on Grid since August 2004
  • QCDGrid
  • For UKQCD
  • Currently a 4-site data grid
  • Key technologies used:
  • Globus Toolkit 2.4
  • European DataGrid
  • eXist XML database
  • managing a few hundred gigabytes of data

25
Issues
First large-scale Grid production: problems are
being addressed at all levels. See "LCG-2 Middleware
Problems and Requirements for LHC Experiment Data
Challenges":
https://edms.cern.ch/file/495809/2.2/LCG2-Limitations_and_Requirements.pdf
26
Is GridPP a Grid?
According to Foster's three-point checklist, a Grid:
  1. coordinates resources that are not subject to
     centralized control.
     YES: this is why development and maintenance of
     LCG is important.
  2. uses standard, open, general-purpose protocols
     and interfaces.
     YES: VDT (Globus/Condor-G) and EDG/EGEE (gLite)
     meet this requirement.
  3. delivers nontrivial qualities of service.
     YES: demonstrated by the LHC experiments' data
     challenges over the summer of 2004.

http://www-fp.mcs.anl.gov/~foster/Articles/WhatIsTheGrid.pdf
http://agenda.cern.ch/fullAgenda.php?ida=a042133
27
What was GridPP1?
  • A team that built a working prototype grid of
    significant scale
  • >1,500 (7,300) CPUs
  • >500 (6,500) TB of storage
  • >1,000 (6,000) simultaneous jobs
  • A complex project: 82% of the 190 tasks for
    the first three years were completed

A Success: "The achievement of something desired,
planned, or attempted"
28
Aims for GridPP2? From Prototype to Production
Experiments and projects feeding in: BaBarGrid
(BaBar), SAMGrid (D0, CDF), EDG, EGEE, ARDA, GANGA,
and LCG (ATLAS, LHCb, CMS, ALICE).
Evolution of the centres:
  2001: CERN Computer Centre; RAL Computer Centre;
        separate experiments, resources, multiple
        accounts
  2004: CERN Prototype Tier-0 Centre; UK Prototype
        Tier-1/A Centre; 4 UK Prototype Tier-2
        Centres; prototype Grids
  2007: CERN Tier-0 Centre; UK Tier-1/A Centre;
        4 UK Tier-2 Centres; 19 UK Institutes;
        'one' production Grid
29
Planning GridPP2 ProjectMap
Structures agreed and in place (except LCG phase-2)
30
What lies ahead? Some mountain climbing...
  • Annual data storage: 12-14 PetaBytes per year
  • 100 Million SPECint2000, i.e. roughly 100,000
    PCs (3 GHz Pentium 4)
  • A CD stack holding 1 year of LHC data would be
    ~20 km tall; Concorde cruises at ~15 km
  • We are here (1 km): in production terms, we've
    made base camp
  • Importance of step-by-step planning: pre-plan
    your trip, carry an ice axe and crampons, and
    arrange for a guide
  • Quantitatively, we're ~9% of the way there in
    CPU (9,000 of 100,000) and disk (3 PB of the
    12-14 PB/year × 3 years needed), as recomputed
    below
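
The progress fractions recomputed (a small sketch; the disk target takes the midpoint of the 12-14 PB/year estimate over 3 years):

    # "How far up the mountain are we?" in CPU and disk terms.
    cpu_now, cpu_target = 9_000, 100_000
    disk_now_pb = 3
    disk_target_pb = 13 * 3           # ~12-14 PB/year for ~3 years
    print(f"CPU:  {cpu_now / cpu_target:.0%}")          # 9%
    print(f"Disk: {disk_now_pb / disk_target_pb:.0%}")  # ~8%
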
31
  • 1. Why? 2. What? 3. How? 4. When?
  • From the Particle Physics perspective the Grid is
  • 1. needed to utilise large-scale computing
    resources efficiently and securely
  • 2. a) a working prototype running today on large
    testbed(s)
  • b) about seamless discovery of computing
    resources
  • c) using evolving standards for interoperation
  • d) the basis for computing in the 21st Century
  • e) not (yet) as transparent or robust as
    end-users need
  • 3. see the GridPP getting-started pages
    (two-day EGEE training courses available)
  • 4. a) now, at prototype level, for simple(r)
    applications (e.g. experiment Monte Carlo
    production)
  • b) September 2007 for more complex applications
    (e.g. data analysis), ready for LHC