1
The EU DataGrid - Introduction
  • The European DataGrid Project Team
  • http://www.eu-datagrid.org/

Peter.Kunszt@cern.ch
2
Contents
  • The EDG Project scope
  • Achievements
  • EDG structure
  • Middleware Work Packages: Goals, Achievements,
    Issues
  • Testbed Release Plans

3
Glossary
  • RB - Resource Broker
  • VO - Virtual Organisation
  • CE - Computing Element
  • SE - Storage Element
  • GDMP - Grid Data Mirroring Package
  • LDAP - Lightweight Directory Access Protocol
  • LCFG - Local Configuration System
  • LRMS - Local Resource Management System (batch
    system, e.g. PBS, LSF)
  • WMS - Workload Management System
  • LFN - Logical File Name (e.g. MyMu.dat)
  • SFN - Site File Name (e.g.
    storageEl1.cern.ch/home/data/MyMu.dat)

4
The Grid vision
  • Flexible, secure, coordinated resource sharing
    among dynamic collections of individuals,
    institutions, and resources
  • From "The Anatomy of the Grid: Enabling Scalable
    Virtual Organizations"
  • Enable communities (virtual organizations) to
    share geographically distributed resources as
    they pursue common goals -- assuming the absence
    of
  • central location,
  • central control,
  • omniscience,
  • existing trust relationships.

5
Grids: Elements of the Problem
  • Resource sharing
  • Computers, storage, sensors, networks, ...
  • Sharing is always conditional: issues of trust,
    policy, negotiation, payment, ...
  • Coordinated problem solving
  • Beyond client-server: distributed data analysis,
    computation, collaboration, ...
  • Dynamic, multi-institutional virtual orgs
  • Community overlays on classic org structures
  • Large or small, static or dynamic

6
EDG overview: goals
  • DataGrid is a project funded by the European
    Union whose objective is to exploit and build the
    next-generation computing infrastructure,
    providing intensive computation and analysis of
    shared large-scale databases.
  • Enable data-intensive sciences by providing
    worldwide Grid test beds to large distributed
    scientific organisations (Virtual Organisations,
    VOs)
  • Start (kick-off): Jan 1, 2001
    End: Dec 31, 2003
  • Applications/end-user communities: HEP, Earth
    Observation, Biology
  • Specific project objectives:
  • Middleware for fabric and grid management
  • Large scale testbed
  • Production quality demonstrations
  • Collaborate and coordinate with other projects
    (Globus, Condor, CrossGrid, DataTAG, etc.)
  • Contribute to open standards and international
    bodies (GGF, Industry & Research forum)

7
DataGrid Main Partners
  • CERN - International (Switzerland/France)
  • CNRS - France
  • ESA/ESRIN - International (Italy)
  • INFN - Italy
  • NIKHEF - The Netherlands
  • PPARC - UK

8
Assistant Partners
  • Industrial Partners
  • Datamat (Italy)
  • IBM-UK (UK)
  • CS-SI (France)
  • Research and Academic Institutes
  • CESNET - Czech Republic
  • Commissariat à l'énergie atomique (CEA) - France
  • Computer and Automation Research Institute,
    Hungarian Academy of Sciences (MTA SZTAKI) -
    Hungary
  • Consiglio Nazionale delle Ricerche - Italy
  • Helsinki Institute of Physics - Finland
  • Institut de Fisica d'Altes Energies (IFAE) -
    Spain
  • Istituto Trentino di Cultura (IRST) - Italy
  • Konrad-Zuse-Zentrum für Informationstechnik
    Berlin - Germany
  • Royal Netherlands Meteorological Institute (KNMI)
    - The Netherlands
  • Ruprecht-Karls-Universität Heidelberg - Germany
  • Stichting Academisch Rekencentrum Amsterdam
    (SARA) - The Netherlands
  • Swedish Research Council - Sweden

9
Project Schedule
  • Project started on 1/Jan/2001
  • TestBed 0 (early 2001)
  • International test bed 0 infrastructure deployed
  • Globus 1 only - no EDG middleware
  • TestBed 1 (now)
  • First release of EU DataGrid software to defined
    users within the project
  • HEP experiments (WP 8), Earth Observation (WP 9),
    Biomedical applications (WP 10)
  • Successful Project Review by EU March 1st 2002
  • TestBed 2 (October 2002)
  • Builds on TestBed 1 to extend facilities of
    DataGrid
  • TestBed 3 (March 2003) & 4 (September 2003)
  • Project stops on 31/Dec/2003

10
EDG Highlights
  • The project is up and running!
  • All 21 partners are now contributing at the
    contractual level
  • a total of 60 man-years for the first year
  • All EU deliverables (40, >2000 pages) submitted
    in time for the review, according to the contract
    technical annex
  • First test bed delivered with real production
    demos
  • All deliverables (code and documents) available
    via www.edg.org
  • http://eu-datagrid.web.cern.ch/eu-datagrid/Deliverables/default.htm
  • requirements, surveys, architecture, design,
    procedures, testbed analysis etc.

11
DataGrid work packages
  • The EDG collaboration is structured in 12 Work
    Packages
  • WP1: Workload Management System
  • WP2: Data Management
  • WP3: Grid Monitoring / Grid Information Systems
  • WP4: Fabric Management
  • WP5: Storage Element
  • WP6: Testbed and Demonstrators (production-quality
    international infrastructure)
  • WP7: Network Monitoring
  • WP8: High Energy Physics Applications
  • WP9: Earth Observation
  • WP10: Biology
  • WP11: Dissemination
  • WP12: Management

12
Objectives for the first year of the project
  • WP1 workload
  • Job resource specification & scheduling
  • WP2 data management
  • Data access, migration & replication
  • WP3 grid monitoring services
  • Monitoring infrastructure, directories &
    presentation tools
  • WP4 fabric management
  • Framework for fabric configuration management &
    automatic software installation
  • WP5 mass storage management
  • Common interface for mass storage systems
  • WP7 network services
  • Network services and monitoring
  • Collect requirements for middleware
  • Take into account requirements from application
    groups
  • Survey current technology
  • For all middleware
  • Core Services testbed
  • Testbed 0: Globus only (no EDG middleware)
  • First Grid testbed release
  • Testbed 1: first release of EDG middleware

13
DataGrid Architecture
Layered architecture (diagram):
  • Local environment: Local Application, Local
    Database, Local Computing
  • Grid Application Layer: Data Management, Metadata
    Management, Object to File Mapping, Job Management
  • Collective Services: Information & Monitoring,
    Replica Manager, Grid Scheduler
  • Underlying Grid Services: Computing Element
    Services, Authorization, Authentication and
    Accounting, Replica Catalog, Storage Element
    Services, Logging & Book-keeping, SQL Database
    Services
  • Fabric Services: Node Installation & Management,
    Monitoring and Fault Tolerance, Fabric Storage
    Management, Configuration Management, Resource
    Management
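
The layer structure above can be captured, purely
for illustration, as a small lookup table; the layer
and service names come from the diagram, while the
table itself and the helper layer_of() are invented
for this sketch.

# Hypothetical sketch: the EDG layer stack as a lookup table.
EDG_LAYERS = {
    "Grid Application Layer": [
        "Data Management", "Metadata Management",
        "Object to File Mapping", "Job Management"],
    "Collective Services": [
        "Information & Monitoring", "Replica Manager", "Grid Scheduler"],
    "Underlying Grid Services": [
        "Computing Element Services", "Storage Element Services",
        "Replica Catalog", "Authorization, Authentication and Accounting",
        "Logging & Book-keeping", "SQL Database Services"],
    "Fabric Services": [
        "Node Installation & Management", "Monitoring and Fault Tolerance",
        "Fabric Storage Management", "Configuration Management",
        "Resource Management"],
}

def layer_of(service):
    """Return the architecture layer a given service belongs to."""
    for layer, services in EDG_LAYERS.items():
        if service in services:
            return layer
    raise KeyError(service)

print(layer_of("Replica Manager"))    # -> Collective Services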
14
EDG Interfaces
Computing Elements
Mass Storage Systems: HPSS, Castor
15
WP1 Work Load Management
  • Goals
  • Maximize use of resources by efficient scheduling
    of user jobs
  • Achievements
  • Analysis of workload management system
    requirements & survey of existing mature
    implementations (Globus, Condor) (D1.1)
  • Definition of architecture for scheduling &
    resource management (D1.2)
  • Development of "super scheduling" component
    using application data and computing element
    requirements
  • Issues
  • Integration with software from other WPs
  • Advanced job submission facilities

Current components: Job Description Language,
Resource Broker, Job Submission Service,
Information Index, User Interface, Logging &
Bookkeeping Service
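
To make the Resource Broker's role concrete, here is
a toy matchmaking sketch: pick a Computing Element
that meets a job's requirements, preferring one
close to the input data. All class, field and host
names (ComputingElement, runtime_env, ce01.cern.ch,
"atlas-sw", etc.) are invented for illustration; the
real broker works from Job Description Language
documents and the information index listed above.

from dataclasses import dataclass

@dataclass
class ComputingElement:
    name: str
    free_cpus: int
    runtime_env: set      # tags for installed application software
    close_se: str         # Storage Element close to this CE

@dataclass
class Job:
    required_cpus: int
    required_env: set
    input_data_se: str    # SE currently holding the job's input data

def select_ce(job, ces):
    """Toy broker: filter CEs by requirements, rank by data locality, then load."""
    candidates = [ce for ce in ces
                  if ce.free_cpus >= job.required_cpus
                  and job.required_env <= ce.runtime_env]
    if not candidates:
        raise RuntimeError("no matching Computing Element")
    return max(candidates,
               key=lambda ce: (ce.close_se == job.input_data_se, ce.free_cpus))

ces = [ComputingElement("ce01.cern.ch", 8, {"atlas-sw"}, "se01.cern.ch"),
       ComputingElement("ce02.nikhef.nl", 32, {"atlas-sw"}, "se02.nikhef.nl")]
job = Job(required_cpus=4, required_env={"atlas-sw"},
          input_data_se="se02.nikhef.nl")
print(select_ce(job, ces).name)    # -> ce02.nikhef.nl (data locality wins)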
16
WP2 Data Management
  • Goals
  • Coherently manage and share petabyte-scale
    information volumes in high-throughput
    production-quality grid environments
  • Achievements
  • Survey of existing tools and technologies for
    data access and mass storage systems (D2.1)
  • Definition of architecture for data management
    (D2.2)
  • Deployment of Grid Data Mirroring Package (GDMP)
    in testbed 1
  • Close collaboration with Globus, PPDG/GriPhyN &
    Condor
  • Working with GGF on standards
  • Issues
  • Security: clear mechanisms for handling
    authentication and authorization

Current components: GDMP, Replica Catalog, Replica
Manager, Spitfire
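
The Replica Catalog idea, one Logical File Name
mapped to the Site File Names of its physical
replicas (see the glossary), can be sketched in a
few lines. The class, its methods and the nikhef.nl
SFN are invented for illustration and are not the
GDMP, Replica Catalog or Spitfire APIs.

class ReplicaCatalog:
    """Toy catalog mapping Logical File Names (LFNs) to replica SFNs."""
    def __init__(self):
        self._replicas = {}                      # LFN -> set of SFNs

    def register(self, lfn, sfn):
        self._replicas.setdefault(lfn, set()).add(sfn)

    def replicas(self, lfn):
        return self._replicas.get(lfn, set())

    def best_replica(self, lfn, local_site):
        """Prefer a replica held at the local site, otherwise any replica."""
        sfns = self.replicas(lfn)
        if not sfns:
            raise KeyError(lfn)
        local = [s for s in sfns if s.startswith(local_site)]
        return (local or sorted(sfns))[0]

rc = ReplicaCatalog()
rc.register("MyMu.dat", "storageEl1.cern.ch/home/data/MyMu.dat")
rc.register("MyMu.dat", "se01.nikhef.nl/data/MyMu.dat")
print(rc.best_replica("MyMu.dat", local_site="se01.nikhef.nl"))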
17
WP3 Grid Monitoring Services
  • Goals
  • Provide information system for discovering
    resources and monitoring status
  • Achievements
  • Survey of current technologies (D3.1)
  • Coordination of schemas in testbed 1
  • Development of Ftree caching backend based on
    OpenLDAP (Lightweight Directory Access Protocol)
    to address shortcomings in MDS v1
  • Design of Relational Grid Monitoring Architecture
    (R-GMA) (D3.2) to be further developed with GGF
  • GRM and PROVE adapted to grid environments to
    support end-user application monitoring
  • Issues
  • MDS vs. R-GMA

Components: MDS/Ftree, R-GMA, GRM/PROVE
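
R-GMA applies the GGF Grid Monitoring Architecture
with a relational model: producers publish tuples
into named tables and consumers query them. The
sketch below is a minimal in-memory stand-in to show
the idea; the Registry class, table name and column
names are invented and are not the actual R-GMA API.

class Registry:
    """Toy relational-GMA store: producers publish rows, consumers query."""
    def __init__(self):
        self.tables = {}                  # table name -> list of row dicts

    def publish(self, table, row):
        self.tables.setdefault(table, []).append(row)

    def query(self, table, **where):
        """Return rows of `table` whose columns match all `where` values."""
        return [r for r in self.tables.get(table, [])
                if all(r.get(k) == v for k, v in where.items())]

reg = Registry()
# A Computing Element acting as a producer of status tuples:
reg.publish("CEStatus", {"ce": "ce01.cern.ch", "free_cpus": 12, "state": "up"})
reg.publish("CEStatus", {"ce": "ce02.nikhef.nl", "free_cpus": 0, "state": "down"})
# A consumer (e.g. a broker or monitoring display) querying the system:
print(reg.query("CEStatus", state="up"))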
18
WP4 Fabric Management
  • Goals
  • Manage clusters of thousands of nodes
  • Achievements
  • Survey of existing tools, techniques and
    protocols (D4.1)
  • Defined an agreed architecture for fabric
    management (D4.2)
  • Initial implementations deployed at several sites
    in testbed 1
  • Issues
  • How to ensure the node configurations are
    consistent and handle updates to the software
    suites

Components: LCFG, PBS & LSF info providers, image
installation, Config. Cache Mgr
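
The consistency issue raised above (keeping
thousands of node configurations in step and
handling software updates) reduces to comparing a
node's desired profile with its installed state and
deriving actions. The sketch below uses invented
package data and function names; it only illustrates
the declarative-profile idea behind tools such as
LCFG.

def plan_actions(desired, installed):
    """Both map package name -> version; return ('install'/'remove', pkg, ver)."""
    actions = []
    for pkg, ver in desired.items():
        if installed.get(pkg) != ver:
            actions.append(("install", pkg, ver))
    for pkg, ver in installed.items():
        if pkg not in desired:
            actions.append(("remove", pkg, ver))
    return actions

desired = {"globus": "2.0", "edg-wp1-ui": "1.1.4"}        # node profile
installed = {"globus": "1.1.4", "obsolete-tool": "0.9"}   # current node state
print(plan_actions(desired, installed))
# [('install', 'globus', '2.0'), ('install', 'edg-wp1-ui', '1.1.4'),
#  ('remove', 'obsolete-tool', '0.9')]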
19
WP5 Mass Storage Management
  • Goals
  • Provide common user and data export/import
    interfaces to existing local mass storage systems
  • Achievements
  • Review of Grid data systems, tape and disk
    storage systems and local file systems (D5.1)
  • Definition of Architecture and Design for
    DataGrid Storage Element (D5.2)
  • Collaboration with Globus on GridFTP/RFIO
  • Collaboration with PPDG on control API
  • First attempt at exchanging Hierarchical Storage
    Manager (HSM) tapes
  • Issues
  • Scope and requirements for storage element
  • Inter-working with other Grids
  • Components
  • Storage Element info. providers
  • RFIO
  • MSS staging
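
The WP5 goal of a common interface over different
mass storage systems can be pictured as a thin
abstraction layer in front of back ends such as
Castor or HPSS. Every class and method name below is
invented for illustration; this is not the Storage
Element control API under discussion with PPDG.

from abc import ABC, abstractmethod

class MassStorageBackend(ABC):
    @abstractmethod
    def stage_in(self, sfn):
        """Bring a file online (e.g. from tape) and return a local path."""

class CastorBackend(MassStorageBackend):
    def stage_in(self, sfn):
        return "/castor/stage/" + sfn.rsplit("/", 1)[-1]   # placeholder logic

class HPSSBackend(MassStorageBackend):
    def stage_in(self, sfn):
        return "/hpss/cache/" + sfn.rsplit("/", 1)[-1]     # placeholder logic

class StorageElement:
    """Uniform front end: users see one interface regardless of the MSS."""
    def __init__(self, backend):
        self.backend = backend

    def get(self, sfn):
        return self.backend.stage_in(sfn)

se = StorageElement(CastorBackend())
print(se.get("storageEl1.cern.ch/home/data/MyMu.dat"))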

20
WP7 Network Services
  • Goals
  • Review the network service requirements for
    DataGrid
  • Establish and manage the DataGrid network
    facilities
  • Monitor the traffic and performance of the
    network
  • Deal with the distributed security aspects
  • Achievements
  • Analysis of network requirements for testbed 1 &
    study of available network physical
    infrastructure (D7.1)
  • Use of European backbone GEANT since Dec. 2001
  • Initial network monitoring architecture defined
    (D7.2) and first tools deployed in testbed 1
  • Collaboration with Dante & DataTAG
  • Working with GGF (Grid High Performance Networks)
    & Globus (monitoring/MDS)
  • Issues
  • Resources for study of security issues
  • End-to-end performance for applications depends
    on a complex combination of components
  • Components
  • network monitoring tools
  • PingER
  • Udpmon
  • Iperf
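
The tools above measure quantities such as
round-trip time and throughput between testbed
sites. As a rough stand-in (not PingER, Udpmon or
Iperf), the snippet below times a TCP connection to
a remote service; the host name is hypothetical and
port 2119 is used only as an example service port.

import socket
import time

def connect_rtt(host, port=2119, timeout=5.0):
    """Return the TCP connect time to host:port in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0

# Example (hypothetical host; requires network access):
# print("%.1f ms" % connect_rtt("ce01.cern.ch"))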

21
WP6 TestBed Integration
  • Goals
  • Deploy testbeds for the end-to-end application
    experiments & demos
  • Integrate successive releases of the software
    components
  • Achievements
  • Integration of EDG sw release 1.0 and deployment
  • Working implementation of multiple Virtual
    Organisations (VOs) & basic security
    infrastructure
  • Definition of acceptable usage contracts and
    creation of Certification Authorities group
  • Issues
  • Procedures for software integration
  • Test plan for software release
  • Support for production-style usage of the testbed

  • Components
  • Globus packaging & EDG config
  • Build tools
  • End-user documents

22
Grid aspects covered by EDG testbed 1
  • VO servers: LDAP directory for mapping users
    (with certificates) to the correct VO
  • Storage Element: Grid-aware storage area,
    situated close to a CE
  • User Interface: submit & monitor jobs, retrieve
    output
  • Replica Manager: replicates data to one or more
    CEs
  • Job Submission Service: manages submission of
    jobs to the Resource Broker
  • Replica Catalog: keeps track of multiple data
    files replicated on different CEs
  • Information Index: provides info about grid
    resources via the GIIS/GRIS hierarchy
  • Information & Monitoring: provides info on
    resource utilization & performance
  • Resource Broker: uses the Information Index to
    discover & select resources based on job
    requirements
  • Grid Fabric Management: configures, installs &
    maintains grid software packages and environment
  • Logging and Bookkeeping: collects resource usage
    & job status
  • Network performance, security and monitoring:
    provides efficient network transport, security &
    bandwidth monitoring
  • Computing Element: gatekeeper to a grid computing
    resource
  • Testbed administration: certificate authority,
    user registration, usage policy, etc.
23
Tasks for the WP6 integration team
  • Testing and integration of the Globus package
  • Exact definition of RPM lists (components) for
    the various testbed machine profiles (CE service,
    RB, UI, SE service, NE, WN, ...) & checking of
    dependencies
  • Perform preliminary, centrally (CERN) managed
    tests on EDG middleware before giving the green
    light for deployment to the distributed EDG
    testbed sites
  • Provide and update end-user documentation for
    installers/site managers, developers and end
    users
  • Define EDG release policies, coordinate the
    integration team with the various Work Package
    managers and keep coordination high
  • Assign reported bugs to the corresponding
    developers/site managers (Bugzilla)
  • Complete support for the iTeam testing VO
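
The RPM-list task above amounts to keeping, per
machine profile (CE, RB, UI, SE, ...), the set of
packages it needs and checking that their
dependencies are satisfied within that set. The
sketch below uses made-up package names and
dependency data, not the actual EDG RPM lists.

DEPENDS = {                      # package -> packages it requires (made up)
    "edg-wp1-rb": {"globus", "condor-g"},
    "edg-wp1-ui": {"globus"},
    "edg-wp2-gdmp": {"globus"},
}

PROFILES = {                     # machine profile -> its RPM list (made up)
    "RB": {"globus", "condor-g", "edg-wp1-rb"},
    "UI": {"globus", "edg-wp1-ui"},
    "SE": {"edg-wp2-gdmp"},      # "globus" missing: should be reported
}

def missing_deps(profile):
    """Return (package, missing dependency) pairs for one machine profile."""
    installed = PROFILES[profile]
    return [(pkg, dep) for pkg in installed
            for dep in DEPENDS.get(pkg, set()) if dep not in installed]

for prof in PROFILES:
    print(prof, missing_deps(prof))   # only SE reports a missing dependency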

24
EDG overview: middleware release schedule
  • Planned intermediate release schedule
  • Release 1.1: January 2002
  • Release 1.2: March 2002
  • Release 1.3: May 2002
  • Release 1.4: July 2002
  • Similar schedule for 2003
  • Each release includes
  • feedback from use of previous release by
    application groups
  • planned improvements/extensions by middleware WPs
  • use of WP6 software infrastructure
  • feeds into architecture group

[Release timeline graphic: 1.1.3 (internal),
July-August]
25
Release Plan details
  • Current release: EDG 1.1.4
  • Deployed on testbed under RedHat 6.2
  • Finalising build of EDG 1.2 (now)
  • GDMP 3.0
  • GSI-enabled RFIO client and server
  • EDG 1.3 (internal)
  • Build using autobuild tools to ease future
    porting
  • Support for MPI on single site
  • EDG 1.4 (August)
  • Support RH 6.2 & 7.2
  • Basic support for interactive jobs
  • Integration of Condor DAGMan
  • Use MDS 2.2 with first GLUE schema
  • EDG 2.0 (Oct)
  • Still based on Globus 2.x (pre-OGSA)
  • Use updated GLUE schema
  • Job partitioning & check-pointing
  • Advanced reservation/co-allocation

See http://edms.cern.ch/document/333297 for
further details
26
EDG overview: testbed schedule
  • Planned intermediate testbed schedule
  • Testbed 0: March 2001
  • Testbed 1: November 2001 - January 2002
  • Testbed 2: September-October 2002
  • Testbed 3: March 2003
  • Testbed 4: September-October 2003
  • The number of EDG testbed sites is steadily
    increasing; currently 9 sites are visible to the
    CERN resource broker
  • Each site normally implements, at least:
  • A central install & config server (LCFG server)
  • WMS (WP1) dedicated machines: UI, CE (gatekeeper
    & worker node(s))
  • MDS Info Providers to the global EDG GIIS/GRIS
  • Network Monitoring

27
Development & Production testbeds
  • Development
  • An initial set of 5 sites will keep a small
    cluster of PCs for development purposes, to test
    new versions of the software, configurations,
    etc.
  • Production
  • More stable environment for use by application
    groups
  • more sites
  • more nodes per site (grow to meaningful size at
    major centres)
  • more users per VO
  • Usage already foreseen in Data Challenge
    schedules for LHC experiments
  • harmonize release schedules