The EGEE project: An overview

1
The EGEE project: An overview
Frédéric Hemmer, EGEE Middleware Manager
Paradyn/Condor Week, April 16, 2004
EGEE is a project funded by the European Union
under contract IST-2003-508833
2
Contents
  • EGEE - what is it and why is it needed?
  • Grid operations: providing a stable service
  • Grid middleware: current and future
  • Networking activity
  • Summary
  • The material of this talk is the work of many
    people in EGEE and LCG

Despite its name, EGEE is an international project
involving, in particular, Israel, Russia and the US
3
Background
  • Networking, commodity computing and distributed
    software tools had matured enough for Grid
    technology to start becoming available at the
    end of the 1990s
  • Many publicly funded projects (in the US and in
    the EU) have been launched since
  • Grid computing is a key activity of the EU
    programmes
  • Industrial and commercial Grids have been
    following (see a good sample on the
    www.cern.ch/gridcafe portal and also
    www.gridstart.org)
  • Major IT vendors involved in Grid activity

4
EGEE: Why?
  • Access to a production quality grid will change
    the way science and business are done
  • Current Grid R&D projects run to completion
    within the next few months or next year
  • The EGEE partners have already made major
    progress in aligning national and regional Grid
    R&D efforts, in preparation for EGEE
  • EGEE will preserve the current strong momentum of
    the European Grid community and the enthusiasm of
    the hundreds of young European researchers
    already involved in EU Grid projects (>150 in EU
    DataGrid alone)

5
EGEE Manifesto: Enabling Grids for E-science in
Europe
  • Goal
  • Create a Europe-wide, production-quality Grid
    infrastructure on top of present and future EU RN
    infrastructure
  • Build On
  • EU and EU member states' major investments in
    Grid technology
  • International connections (US and AP)
  • Several pioneering prototype results
  • Large Grid development teams in EU require major
    EU funding effort
  • Approach
  • Leverage current and planned national and
    regional Grid programmes
  • Work closely with relevant industrial Grid
    developers, NRENs and US-AP projects

 
[Figure: stacked layers labelled Applications, Grid
infrastructure and Geant network]
6
EGEE Partners
  • Leverage national resources in a more effective
    way for broader European benefit
  • 70 leading institutions in 27 countries,
    federated in regional Grids

7
EGEE Project Structure
24% Joint Research
  • JRA1 Middleware Engineering and Integration
  • JRA2 Quality Assurance
  • JRA3 Security
  • JRA4 Network Services Development
28% Networking
  • NA1 Management
  • NA2 Dissemination and Outreach
  • NA3 User Training and Education
  • NA4 Application Identification and Support
  • NA5 Policy and International Cooperation
48% Services
  • SA1 Grid Operations, Support and Management
  • SA2 Network Resource Provision
Emphasis in EGEE is on operating a
production grid and supporting the end-users
8
EGEE Applications
  • EGEE scope: all-inclusive for academic
    applications (open to the industrial and
    socio-economic world as well)
  • The major success criterion of EGEE: how many
    satisfied users from how many different domains?
  • 5000 users (3000 after year 2) from at least 5
    disciplines
  • Two pilot applications selected to guide the
    implementation and certify the performance and
    functionality of the evolving infrastructure:
    Physics and Bioinformatics

Application domains and timelines are for
illustration only
9
The pilot applications
  • High Energy Physics with the LHC Computing Grid
    (www.cern.ch/lcg) relies on a Grid infrastructure
    to store and analyse petabytes (10^15 bytes) of
    real and simulated data. LCG is a major source of
    resources, requirements and hard deadlines, with
    no conventional solution available
  • In biomedicine, several communities are facing
    equally daunting challenges to cope with the
    flood of bioinformatics and healthcare data. They
    need access to large, distributed,
    non-homogeneous data and have significant
    on-demand computing requirements
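
To give a sense of scale for the petabyte figure above, here is a minimal sketch that estimates how long it would take to move one petabyte over a single network link. The 1 Gbit/s link rate and the function name are illustrative assumptions, not figures or names from the talk.

```python
PETABYTE_BYTES = 10**15  # one petabyte, as defined on the slide


def transfer_days(num_bytes: int, link_bits_per_second: float) -> float:
    """Return the days needed to move num_bytes over a link of the given rate."""
    seconds = num_bytes * 8 / link_bits_per_second
    return seconds / 86_400  # 86,400 seconds per day


# Assumed link speed of 1 Gbit/s (illustrative only, not from the talk).
days = transfer_days(PETABYTE_BYTES, 1e9)
print(f"{days:.1f} days per petabyte at 1 Gbit/s")
```

Under this assumption a single petabyte occupies the link for roughly three months, which is one way to read the remark that no conventional solution is available.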

10
EGEE Implementation
  • From day 1 (1st April 2004)
  • Production grid service based on the LCG
    infrastructure running LCG-2 grid middleware
  • LCG-2 will be maintained until the new generation
    has proven itself (fallback solution)
  • VDT support for Condor/GT2 based code is needed
    at least through 1H05
  • In parallel, develop a next generation grid
    facility
  • Produce a new set of grid services according to
    evolving standards (Web Services)
  • Run a development service providing early access
    for evaluation purposes
  • Will replace LCG-2 on the production facility in
    2005

11
EGEE and LCG
  • EGEE builds on the work of LCG to establish a
    grid operations service
  • LCG: a worldwide collaboration of
  • The LHC experiments
  • The Regional Computing Centres
  • Physics institutes
  • Mission
  • Prepare and deploy the computing environment that
    will be used by the experiments to analyse the
    LHC data
  • Strategy
  • Integrate thousands of computers at dozens of
    participating institutes worldwide into a global
    computing resource
  • Rely on software being developed in advanced grid
    technology projects, both in Europe and in the
    USA

12
Grid operations
  • Create, operate, support and manage a production
    quality infrastructure
  • Offered services
  • Middleware deployment and installation
  • Software and documentation repository
  • Grid monitoring and problem tracking
  • Bug reporting and knowledge database
  • VO services
  • Grid management services

13
Operations Structure
  • Implement the objectives to provide
  • Access to resources
  • Operation of EGEE as a reliable service
  • Deploy new middleware and resources
  • Support resource providers and users
  • With a clear layered structure
  • Operations Management Centre (CERN)
  • Overall grid operations coordination
  • Core Infrastructure Centres
  • CERN, France, Italy, UK, Russia (from M12)
  • Operate core grid services
  • Regional Operations Centres
  • One in each federation, in some cases these are
    distributed centres
  • Provide front-line support to users and resource
    centres
  • Support new resource centres joining EGEE in the
    regions
  • Support deployment to the resource centres
  • Resource Centres
  • Many in each federation of varying sizes and
    levels of service

[Figure: number of instances per layer, in the order
listed above: 1, 5, 11, 50]
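
The layered structure above can be pictured as a simple tree. The sketch below is only an illustration: it assumes each layer reports to the one directly above it, which the slide does not state explicitly, and the class, function and site names (`Centre`, `escalation_path`, `example-site`, `example-federation`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Centre:
    """One node in the layered operations structure (hypothetical model)."""
    name: str
    layer: str
    parent: Optional["Centre"] = None


def escalation_path(centre: Centre) -> List[str]:
    """Walk from a centre up to the top layer, collecting each step."""
    path = []
    node: Optional[Centre] = centre
    while node is not None:
        path.append(f"{node.layer}: {node.name}")
        node = node.parent
    return path


# Assumed reporting chain: resource centre -> ROC -> CIC -> OMC.
omc = Centre("CERN", "Operations Management Centre")
cic = Centre("UK", "Core Infrastructure Centre", parent=omc)
roc = Centre("example-federation", "Regional Operations Centre", parent=cic)
rc = Centre("example-site", "Resource Centre", parent=roc)

for step in escalation_path(rc):
    print(step)
```

A problem raised at a resource centre would, under this assumed model, first reach its Regional Operations Centre for front-line support before moving further up.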
14
EGEE Computing Resources
  • Resource Centers foreseen in the project

April 2004: 10 sites
July 2005: 20 sites
15
Deployment Status
Core sites are already integrated. With the other
sites (currently running LCG-1), the expected
capacity will exceed the forecast for 2004:
around 4000 CPUs at about 30 sites
16
Deployment Issues
  • Need to expand on existing LCG service while
    maintaining stability
  • Add more sites/resources (some have no previous
    experience with grids)
  • Experience has shown that this can be effort
    consuming
  • Problematic sites have been causing problems for
    the whole system
  • Introduce applications and VOs from non-HEP
    domains (biomedical)
  • Need to clarify processes and information flow
  • Portability
  • Support for further platforms (currently just
    RedHat 7.3)
  • Middleware dependencies and packaging
  • Middleware Support
  • Deterministic Support Model has been formalized
  • Essential to have (so far excellent) VDT support
    for Condor/Globus
  • 24x7 operational support
  • Currently have GOC at RAL
    http://goc.grid-support.ac.uk/
  • Being replicated at Taipei (and maybe Canada?)
  • Prototype accounting system (based on R-GMA)
    ready for the release in April 2004 (testing,
    documentation and packaging done)

17
Expected Developments in 2004
  • General
  • LCG-2 will be the service run in 2004; the aim is
    to evolve incrementally
  • Goal is to run a stable service
  • Some functional improvements
  • Extend access to MSS tape systems, and managed
    disk pools
  • Distributed vs replicated replica catalogs
  • To avoid reliance on single service instances
  • Operational improvements
  • Monitoring systems: move towards proactive
    problem finding, ability to take sites
    on/offline, experiment monitoring
  • Continual effort to improve reliability and
    robustness
  • Develop accounting and reporting
  • Address integration issues
  • With large clusters, with storage systems
  • Ensure that large clusters can be accessed via
    grid
  • Issue of integrating with other applications and
    non-LHC experiments
  • New release foreseen for the end of April 2004
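
As an illustration of what "proactive problem finding" combined with the "ability to take sites on/offline" could look like, here is a minimal sketch. It is not the LCG monitoring system: the class name, the site names and the threshold of three consecutive failed checks are all assumptions made for the example.

```python
from collections import defaultdict

FAILURE_THRESHOLD = 3  # assumed: consecutive failed checks before going offline


class SiteMonitor:
    """Track per-site health checks and flag sites to take offline."""

    def __init__(self, threshold: int = FAILURE_THRESHOLD) -> None:
        self.threshold = threshold
        self.consecutive_failures = defaultdict(int)
        self.offline = set()

    def record(self, site: str, check_passed: bool) -> None:
        """Record one check result and update the site's online/offline state."""
        if check_passed:
            self.consecutive_failures[site] = 0
            self.offline.discard(site)  # a passing check brings the site back
        else:
            self.consecutive_failures[site] += 1
            if self.consecutive_failures[site] >= self.threshold:
                self.offline.add(site)

    def usable_sites(self, sites):
        """Return the sites that jobs may still be sent to."""
        return [s for s in sites if s not in self.offline]


monitor = SiteMonitor()
for result in [False, False, False]:
    monitor.record("site-b", result)  # hypothetical failing site
monitor.record("site-a", True)        # hypothetical healthy site
print(monitor.usable_sites(["site-a", "site-b"]))
```

Taking a failing site out of the pool automatically is one way to keep a problematic site from affecting the whole system, a concern raised under Deployment Issues.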

18
EGEE Middleware Activity
  • Hardening and re-engineering of existing
    middleware functionality, leveraging the
    experience of partners
  • Activity concentrated in a few major centers and
    organized in software clusters
  • Key services
  • Data Management (CERN)
  • Information Collection (UK)
  • Resource Brokering, Accounting (Italy-Czech
    Republic)
  • Quality Assurance (France)
  • Grid Security (Northern Europe)
  • Middleware Integration (CERN)
  • Middleware Testing (CERN)

19
Characteristics of the new middleware
  • Develop a lightweight stack of generic middleware
    useful to the LHC experiments and biomedical
    applications, based upon existing components
  • Biomedical applications have important security
    requirements (e.g. confidentiality) that need to
    be addressed.
  • Focus is on re-engineering and hardening
  • Early prototype and fast feedback turnaround
    envisaged
  • Use a service oriented approach
  • A note on OGSI/WSRF/WS/.
  • Still under discussion; nothing has settled yet
  • Need to take a step back
  • Focus on the service decomposition, semantics,
    interplay rather than the envelope
  • WS seems to provide a useful abstraction
  • Widely used in industry, Grid projects, Internet
    computing (Google, Amazon)
  • Need to follow standardization efforts to be able
    to adopt them once settled

20
Middleware approach
  • Formed a design team with members from
  • AliEn
  • VDT
  • EDG
  • Started intense technical discussion to
  • Break down the proposed architecture to real
    components
  • Identify critical components (and what existing
    software to use for the first instance of a
    prototype)
  • Define semantics and interfaces of these
    components
  • Focus on the key services discussed; exploit
    existing components
  • Initially an ad-hoc installation at CERN and
    Wisconsin
  • Aim to have first instance ready by end of April
  • Open only to a small user community
  • Expect frequent changes (also API changes) based
    on user feedback and integration of further
    services
  • Enter a rapid feedback cycle
  • Continue with the design of remaining services
  • Enrich/harden existing services based on early
    user-feedback

21
Initial Services
  • Data management
  • Storage Element
  • SRM-based; allow POSIX-like access
  • Workload management
  • Computing Element
  • Allow pull and push mode
  • More discussions needed
  • Information and monitoring
  • Security
  • Guiding principles
  • Lightweight services
  • Easily and quickly deployable
  • Interoperability
  • Allow for multiple implementations (medium/long
    term)
  • Being based on WS should help
  • Co-existence with deployed infrastructure
  • Run as an application
  • Security
  • Need to integrate components with quite different
    security models
  • Start with a minimalist approach based on VOMS
    and myProxy
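
The difference between the pull and push modes mentioned for the Computing Element can be sketched with a toy model. This is an illustration under stated assumptions, not the EGEE or Condor design: the class `ComputingElement`, the functions `push` and `pull`, and the "most free slots" broker policy are all hypothetical.

```python
from collections import deque


class ComputingElement:
    """Toy computing element with a fixed number of job slots (hypothetical)."""

    def __init__(self, name: str, slots: int) -> None:
        self.name = name
        self.free_slots = slots
        self.running = []

    def accept(self, job: str) -> bool:
        """Start the job if a slot is free; report whether it was accepted."""
        if self.free_slots == 0:
            return False
        self.free_slots -= 1
        self.running.append(job)
        return True


def push(job: str, elements) -> bool:
    """Push mode: a broker picks a computing element and sends the job to it."""
    target = max(elements, key=lambda ce: ce.free_slots)  # assumed policy
    return target.accept(job)


def pull(element: ComputingElement, queue: deque) -> None:
    """Pull mode: a computing element with free slots fetches jobs itself."""
    while element.free_slots > 0 and queue:
        element.accept(queue.popleft())


ce1 = ComputingElement("ce1", slots=1)
ce2 = ComputingElement("ce2", slots=2)

push("job-1", [ce1, ce2])  # the broker chooses ce2, which has more free slots
task_queue = deque(["job-2", "job-3", "job-4"])
pull(ce1, task_queue)      # ce1 takes one job from the queue, then it is full
print(ce1.running, ce2.running, list(task_queue))
```

In push mode the decision sits with the broker, which needs up-to-date information about every element; in pull mode each element asks for work only when it has capacity.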

22
Towards a prototype
Initial prototype components considered for
April 2004. To be extended/changed.
Draft: http://cern.ch/erwin/MW-WD.0.17.zip
23
Condor Team contributions to EGEE Middleware
  • Many years of experience in designing and running
    real world distributed systems
  • Essential for relatively new Grid Middleware
    technologies
  • Many of the problems we see today are related to
    robustness, deployment, scalability
  • Proven scheduling technologies
  • Condor/Condor-G
  • Leadership in the new Middleware Design Group
  • Monthly face-to-face meetings covering all
    essential parts of Middleware
  • Miron Livny's influence and contributions are
    essential
  • Support of Middleware Components for the existing
    LCG-2 code base (VDT)
  • Condor, Globus GT2 leveraging the NSF Middleware
    Initiative
  • Proactive, bilateral problem resolution and
    enhancements
  • US contribution to EGEE project
  • Essential, as many applications are/will be
    worldwide, not only European

24
EGEE Networking Activity
  • Dissemination and outreach
  • Led by TERENA
  • User training and induction
  • Led by the University of Edinburgh (NeSC)
  • The success of EGEE is measured by the impact it
    has on collaborative European science
  • The goal is to support communities of users
  • Therefore induction and training have a high
    priority from the outset
  • Application identification and support
  • Two pilot application centers (for high energy
    physics and biomedical grids)
  • One more generic component dealing with longer
    term recruitment and support of other communities
  • Policy and International cooperation
  • Establish Grid policy forum
  • Coordinate relations with other projects (EU and
    beyond)
  • Training courses (based on EDG tutorials) will be
    available from July 2004
  • Grid school near Naples, Italy, 18-30 July 2004
  • http://www.dma.unina.it/murli/GridSummerSchool2004/

25
Summary
  • EGEE is expected to deliver a production Grid
    infrastructure for scientific applications
  • The project started just weeks ago
  • We have a running grid service based on LCG-2
  • All EGEE activities are well advanced and ready
    to go
  • Biomedicine and physics are the pilot application
    domains that will lead the exploitation of the
    EGEE Grid infrastructure
  • US contribution essential through support of
    existing middleware and design of new generation
    middleware
  • The first project conference will be held in Cork
    (Ireland), 18-22 April
  • http://public.eu-egee.org/kickoff/index.html

26
Further Information
To learn more about EGEE: www.eu-egee.org