Title: Grids
1. Grids: Background and Practice in the EU DataGrid
- David Groep, NIKHEF, davidg_at_nikhef.nl
http://www.dutchgrid.nl/ http://www.eu-datagrid.org/ http://www.edg.org/
2. Talk Outline
- The vision
- What makes a Grid
- How was it created?
- Building a production Grid in Europe
- Will it become commonplace?
3. Grid: a vision
Federico.Carminati_at_cern.ch
4. Beyond distributed computing
- A grid integrates resources that
  - are not owned or administered by one single organisation
  - speak a common, open protocol that is generic
  - work as a coordinated, transparent system
- And
  - can be used by many people from multiple organisations
  - that work together in one Virtual Organisation
Checklist items based on Ian Foster, "What is the Grid?", July 2002
5. Virtual Organisations
- A VO is a temporary alliance of stakeholders
  - Users
  - Service providers
  - Information providers
A set of individuals or organisations, not under single hierarchical control, temporarily joining forces to solve a particular problem at hand, bringing to the collaboration a subset of their resources, sharing those at their discretion and each under their own conditions.
Viewgraph: Foster, Kesselman, Tuecke, the Globus Project
6. Enhanced collaboration
- owners of resources and data stay in control
- sharing conditions are explicit, and can vary for every resource or service
- each VO, and each user, has its own view of the Grid
- his own grid is transparent and gives easy access
- results can again be shared under specified conditions
- the Grid a user sees is flexible and resilient to failure
7. Common and open protocols
- Resources must talk standard protocols
- for interoperability of application toolkits
Layer diagram (top to bottom):
- Application Toolkits: DUROC, MPICH-G2, Condor-G, VLAM-G
- Grid Services: GRAM, GridFTP, Information, Replica
- Grid Security Infrastructure (GSI)
- Grid Fabric: Farms, Supers, Desktops, TCP/IP, Apparatus, DBs
8. Common protocol example
GridFTP server
- Data Access Protocol
- GridFTP protocol can be used to access different types of systems
- Single Sign-On
- Security enabled
- performance enhancements
- generic, usable for many applications
Diagram labels: tape robot with CXFS, disk-based storage, CASTOR, Job
9. Standard protocols
- New Grid protocols based on popular Web Services
- Open Grid Services Architecture
- service discovery
- many different bindings
- easily integrated in hosting environments (Java, WebSphere, .NET)
- is entirely generic
- adds transient services, stateful services
- Global Grid Forum (GGF) promotes the open standards process
10. Access in a coordinated way
- New qualities-of-service
- Transparent crossing of domain boundaries, satisfying constraints of
  - site autonomy
  - authenticity, integrity, confidentiality
- single sign-on to all services
- ways to address services collectively
- preferably via portals and visual programming
11. Example: GOME analysis
- Task: ozone is the component in the atmosphere that protects us from harmful UV radiation. Its concentration varies widely. What is happening?
- the EnviSat satellite is orbiting the earth and measuring light absorption in the atmosphere
- the absorption is related to the ozone concentration, but needs instrument corrections
- ground-based observations give absolute concentrations
- linking both datasets can give us the concentration everywhere
- terabytes of data come in at several ground stations, and various labs need the final products
- => Grid can provide a good solution to this problem
12. Example: Ozone Analysis on the Grid
Diagram labels: LIDAR database, NOPREGO, resource broker, validation, visualize, OPERA
13. A Working Grid: the EU DataGrid
- Objective
  - build the next generation computing infrastructure providing intensive computation and analysis of shared large-scale databases, from hundreds of TeraBytes to PetaBytes, across widely distributed scientific communities
- official start in 2001
- 21 partners
- in the Netherlands: NIKHEF, SARA, KNMI
- Pilot applications: earth observation, bio-medicine, high-energy physics
- aim for production and stability
14. Other applications in the EU DataGrid
- Physics @ CERN
- LHC particle accelerator
- operational in 2007
- 10 Petabyte per year
- 150 countries
- > 10000 Users
- lifetime 20 years
Trigger chain (rate into each stage):
- 40 MHz (40 TB/sec) -> level 1 - special hardware
- 75 kHz (75 GB/sec) -> level 2 - embedded
- 5 kHz (5 GB/sec) -> level 3 - PCs
- 100 Hz (100 MB/sec) -> data recording, offline analysis
http://www.cern.ch/
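The event rates and data rates quoted on this slide are mutually consistent if every event is about 1 MB (40 MHz x 1 MB = 40 TB/sec, down to 100 Hz x 1 MB = 100 MB/sec). The sketch below just redoes that arithmetic. The 1 MB event size is an assumption inferred from the slide's numbers, not something the slide states, and the function names are illustrative only.

```python
# Assumption (not stated on the slide): every event is about 1 MB,
# which is the size implied by pairing 40 MHz with 40 TB/sec.
EVENT_SIZE_BYTES = 1_000_000

# (stage fed at this rate, event rate in Hz), in the order listed on the slide.
STAGES = [
    ("level 1 - special hardware", 40e6),
    ("level 2 - embedded", 75e3),
    ("level 3 - PCs", 5e3),
    ("data recording", 100.0),
]


def data_rate(event_rate_hz, event_size=EVENT_SIZE_BYTES):
    """Bytes per second flowing at a given event rate."""
    return event_rate_hz * event_size


def rejection_factors(stages=STAGES):
    """Factor by which each stage reduces the event rate before the next one."""
    return [stages[i][1] / stages[i + 1][1] for i in range(len(stages) - 1)]


if __name__ == "__main__":
    for name, rate in STAGES:
        print(f"{name:28s} {rate:12.0f} Hz {data_rate(rate) / 1e9:12.1f} GB/sec")
    print("reduction per stage:", [round(f, 1) for f in rejection_factors()])
```

Under that assumption the three trigger levels together reduce the event rate by a factor of 400,000 between the detector and data recording.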
15. Bioinformatics in EU DataGrid
- For access to data
  - Large network bandwidth to access computing centers
  - Support of data bank replicas (easier and faster mirroring)
  - Distributed data banks
- For interpretation of data
  - Grid-enabled algorithms: BLAST on distributed data banks, distributed data mining
16. Realising the Grid Vision
- Grid was the logical next step at the end of the 1990s
- Harnessing desktop power became commonplace: 1988 Condor, later SETI@home, Entropia, Distributed.NET
- Peer-to-peer data access protocols emerged: 1999 Napster, later Gnutella, KaZaA, BitTorrent
- Network access became extremely fast: 1997 wide area bandwidth starts to double every 9 months!
- 1997 Globus starts developing basic middleware; 1996 middleware by Legion, 2000 Unicore
- Massive take-up of the Grid vision in 1999, led in Europe by the EU DataGrid; others include NASA-IPG, CrossGrid, GridLab, PPDG, Alliance, ...
17. Grid Security Infrastructure
- Crucial in Grid computing: it gives Single Sign-On
- GSI uses a Public Key Infrastructure with proxy-ing and delegation
- multiple VOs per user, groups and role support
Diagram labels: contracts, Service 2, Grid Service 1, connect to providers
VOMS overview: Luca dell'Agnello and Roberto Cecchini, INFN and EDG WP6
18. VO Membership Service features
- User can exploit membership of multiple VOs
- User can pick selected Roles for specific tasks
- Site authorization based on VO membership
  - but has all the means to act on per-user characteristics!
- Fine-grained authorization for database and replica access
- All connections are two-way authenticated
  - no spoofing
  - no data corruption
  - no spying
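The authorization idea on this slide (a site decides on VO membership and role, but can still act on an individual user) can be sketched in a few lines. This is a toy model and not the VOMS API: the `Credential` and `SitePolicy` classes, the VO names, the role names and the certificate subject are all hypothetical, and no real certificate handling is shown.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Credential:
    """Hypothetical stand-in for the attributes a VO membership service asserts."""
    subject: str                          # user identity, e.g. a certificate subject
    memberships: frozenset = frozenset()  # set of (vo, role) pairs


@dataclass
class SitePolicy:
    """Hypothetical site-local policy: the site stays in control of its resources."""
    allowed_vos: set = field(default_factory=set)
    required_role: dict = field(default_factory=dict)   # action -> role needed
    banned_subjects: set = field(default_factory=set)   # per-user override

    def authorize(self, cred, vo, action):
        # Per-user characteristics win over VO membership.
        if cred.subject in self.banned_subjects:
            return False
        # The site must have agreed to serve this VO at all.
        if vo not in self.allowed_vos:
            return False
        # The user must actually be a member of the VO they claim.
        roles = {role for (member_vo, role) in cred.memberships if member_vo == vo}
        if not roles:
            return False
        # Some actions additionally need a specific role.
        needed = self.required_role.get(action)
        return needed is None or needed in roles


if __name__ == "__main__":
    alice = Credential(
        "/O=example/CN=Alice",
        frozenset({("earthobs", "analyst"), ("biomed", "member")}),
    )
    policy = SitePolicy(allowed_vos={"earthobs"}, required_role={"write": "admin"})
    print(policy.authorize(alice, "earthobs", "read"))
    print(policy.authorize(alice, "earthobs", "write"))
```

A user who belongs to several VOs presents the same credential everywhere; each site applies only its own policy to it.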
19. What is needed to get the work done
- Fabric information
  - what are the resources (computers, disk, tape) available to my VO?
  - how do I access these resources (the contact information)?
- Physical meta-data
  - when was this dataset written?
  - where can I find copies of it close to me?
- Contextual meta-data or information
  - Which datasets contain feature X?
  - Which DNA sequence corresponds to this protein?
- Actual storage, processing power, network connectivity
20. Spitfire: Access to Databases
- based on common EDG Trust and Authorization Manager
- VO and Role mapping to database views
- Access via
  - Browser
  - Web Service
  - Commands
Screenshots: Gavin McCance, Glasgow University and EDG WP2
21. Grid information: R-GMA
- Relational Grid Monitoring Architecture
- a Global Grid Forum standard
- Implemented by a relational model
- used by grid brokers
Screenshots: R-GMA Browser, Steve Fisher et al., RAL and EDG WP3
22. Replica Location Service
- Search on file attributes (date, name, ...)
- Find replicas on (close) Storage Elements
Diagram labels: SE2 CERN, CE CERN, SE1 SARA, cache UvA DAS2, CE DAS-2
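The two functions named on this slide, searching on file attributes and finding a replica on a nearby Storage Element, can be illustrated with a small in-memory catalogue. This is a sketch only and does not reflect the actual EDG Replica Location Service interface: the class and method names, the file names and the per-site cost table are hypothetical. The SE and site names are the ones from the slide's diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    logical_name: str      # the name users refer to
    storage_element: str   # where one physical copy lives
    site: str


class ReplicaCatalogue:
    """Hypothetical in-memory catalogue: logical file name -> replicas and attributes."""

    def __init__(self):
        self._replicas = {}
        self._attributes = {}

    def register(self, replica, **attributes):
        self._replicas.setdefault(replica.logical_name, []).append(replica)
        self._attributes.setdefault(replica.logical_name, {}).update(attributes)

    def search(self, **wanted):
        """Logical names whose attributes match every requested key/value pair."""
        return sorted(
            name
            for name, attrs in self._attributes.items()
            if all(attrs.get(key) == value for key, value in wanted.items())
        )

    def closest(self, logical_name, site_cost):
        """Replica with the lowest access cost; unknown sites count as infinitely far."""
        replicas = self._replicas.get(logical_name, [])
        if not replicas:
            return None
        return min(replicas, key=lambda r: site_cost.get(r.site, float("inf")))


if __name__ == "__main__":
    catalogue = ReplicaCatalogue()
    catalogue.register(Replica("ozone-profile-001", "SE1", "SARA"), year=2002)
    catalogue.register(Replica("ozone-profile-001", "SE2", "CERN"))
    # Hypothetical cost table as seen from a compute element in Amsterdam.
    cost_from_here = {"SARA": 1, "CERN": 10}
    for name in catalogue.search(year=2002):
        print(name, "->", catalogue.closest(name, cost_from_here))
```

Keeping the logical name separate from the physical copies is what lets a job ask for a dataset without knowing in advance which Storage Element will serve it.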
23. Compute Brokering: reliable execution
- User can delegate all job actions to the Resource Broker and go away
- Reliable scheduling of jobs over the entire grid (as seen from the R-GMA information system)
- Users are roaming, and can retrieve their results anywhere, anytime
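The brokering step can be pictured as matchmaking: filter the compute elements that the information system reports as usable for the job, then rank them. The sketch below is one possible ranking, assumed for illustration (prefer a compute element at the site holding the input data, then the one with most free CPUs). It is not the EDG Resource Broker's actual algorithm, and all class, field and site-independent names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeElement:
    name: str
    site: str
    free_cpus: int
    supported_vos: frozenset


@dataclass(frozen=True)
class Job:
    vo: str
    cpus: int
    data_site: str  # site that holds the job's input data


def rank(ce, job):
    """Assumed ranking: data locality first, then number of free CPUs."""
    return (ce.site == job.data_site, ce.free_cpus)


def match(job, compute_elements):
    """Return the best usable compute element, or None if nothing fits."""
    candidates = [
        ce
        for ce in compute_elements
        if job.vo in ce.supported_vos and ce.free_cpus >= job.cpus
    ]
    return max(candidates, key=lambda ce: rank(ce, job), default=None)


if __name__ == "__main__":
    # Hypothetical snapshot of what an information system might publish.
    grid = [
        ComputeElement("ce-a", "CERN", 200, frozenset({"earthobs", "biomed"})),
        ComputeElement("ce-b", "NIKHEF", 40, frozenset({"earthobs"})),
    ]
    job = Job(vo="earthobs", cpus=8, data_site="NIKHEF")
    print(match(job, grid))
```

Because the user only describes the job, the broker is free to resubmit it elsewhere if the chosen compute element fails, which is where the reliability on this slide comes from.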
24. Current EU DataGrid Facilities
EDG and LCG sites
Map labels: Core site (legend), NIKHEF, RAL, Tokyo, Taipei, BNL, CERN, Lyon, CNAF
- 1000 CPUs, 100 Tbyte storage, several key databases
- 60 sites, 600 users in 7 VOs
25. Using the DataGrid for Real
Screenshots: Krista Joosten and David Groep, NIKHEF
26. Portals
Screenshots: ICES/KIS and WTCW VLAM-G; INFN-GRID and EDG Genius; NPACI Rocks
27. What more is there to see and do?
- The current Grids are only the beginning!
- portals will get more users on the Grid
- more functionality, better resilience, strong reliability
- joining the Grid will be as simple as joining a file-sharing network
- EGEE: a pan-European Grid Infrastructure being created today
- The EU DataGrid project web: www.edg.org
- DutchGrid Platform: www.dutchgrid.nl
- For other grid projects, see www.gridstart.org and www.enterthegrid.com