Title: Using Grid Computing
1 Using Grid Computing
David Groep, NIKHEF, 2002-07-15
2 The Grid, But Why?
- Physics at CERN
- LHC particle accelerator
- operational in 2007
- 5-10 petabytes per year
- 150 countries
- > 10000 users
- lifetime 20 years
- Trigger chain, from detector to storage
- 40 MHz (40 TB/sec) into level 1 - special hardware
- 75 kHz (75 GB/sec) into level 2 - embedded
- 5 kHz (5 GB/sec) into level 3 - PCs
- 100 Hz (100 MB/sec) to data recording and offline analysis
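The trigger figures above imply a roughly constant event size and a large rejection factor at each level. The following is a minimal sketch of that arithmetic in Python; the stage labels and function names are illustrative and not part of the original slides.

```python
# Rate (events per second) and bandwidth (bytes per second) entering each
# stage, as quoted on the slide. The labels are illustrative only.
STAGES = [
    ("level 1 input", 40e6, 40e12),
    ("level 2 input", 75e3, 75e9),
    ("level 3 input", 5e3, 5e9),
    ("data recording", 100.0, 100e6),
]


def event_size_bytes(rate_hz, bandwidth_bytes_per_s):
    """Average event size implied by a rate and a bandwidth."""
    return bandwidth_bytes_per_s / rate_hz


def rejection_factors(stages):
    """Ratio of input rate to output rate for each trigger level."""
    rates = [rate for _, rate, _ in stages]
    return [rates[i] / rates[i + 1] for i in range(len(rates) - 1)]


if __name__ == "__main__":
    for name, rate, bandwidth in STAGES:
        size_mb = event_size_bytes(rate, bandwidth) / 1e6
        print(f"{name}: {size_mb:.1f} MB per event")
    for level, factor in enumerate(rejection_factors(STAGES), start=1):
        print(f"level {level} keeps 1 event in {factor:.0f}")
```

Every stage works out to about 1 MB per event, so the reduction in bandwidth comes entirely from discarding events, not from shrinking them.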
3 CPU and Data Requirements
4 More Reasons Why
ENVISAT
- 3500 MEuro programme cost
- 10 instruments on board
- 200 Mbps data rate to ground
- 400 Tbytes data archived/year
- 100 standard products
- 10 dedicated facilities in Europe
- 700 approved science user projects
5 And More
Bio-informatics
- For access to data
- Large network bandwidth to access computing centers
- Support of data bank replicas (easier and faster mirroring)
- Distributed data banks
- For interpretation of data
- GRID-enabled algorithms: BLAST on distributed data banks, distributed data mining
6 Common Ground
- Large amounts of data
- Distributed, ad-hoc user community
- Problems are distributable
- Need for resources grows faster than market
- Network grows faster than the application needs
- Willingness to share resources
- if security and integrity are guaranteed
7 The One-Liner
- Resource sharing and coordinated problem solving in dynamic multi-institutional virtual organisations
8 What is Grid computing?
- Dependable, consistent and pervasive access
- Combining resources from various organizations
- Virtual Organizations: a user-based view of the Grid
- Technical challenges
- transparent decisions for the user
- uniformity in access methods
- secure and crack-resistant
- authentication, authorization, accounting (AAA) and quota
9 Grid Middleware
- Globus Project, started in 1997
- de facto standard
- Reference implementation of Grid Forum standards
- Large community effort
- Basis of several projects, including EU-DataGrid
- Toolkit 'bag-of-services' approach
- Successful test beds, with single sign-on, etc.
10 In The Beginning
- Distributed Computing
- synchronous processing
- High-Throughput Computing
- asynchronous processing
- On-Demand Computing
- dynamic resources
- Data-Intensive Computing
- databases
- Collaborative Computing
- science
Ian Foster and Carl Kesselman, editors, The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, 1999
11 Grid Architecture
Make all resources talk standard protocols. Promote interoperability of application toolkits, similar to the interoperability of networks through Internet standards.
- Application Toolkits: DUROC, MPICH-G2, Condor-G, VLAM-G
- Grid Services: GRAM, GridFTP, MDS, ReplicaSrv
- Grid Security Infrastructure (GSI)
- Grid Fabric: Condor, MPI, PBS, Internet, Linux, SUN
12 OGSA: new directions
- Looks superficially like web services
- Based on common standards
- WSDL
- SOAP
- UDDI
- Adds
- Transient services
- State of distributed activities
- Workflow, videoconf, distributed data analysis
- Management of service instances
- Grid Security Infrastructure
13 EU DataGrid
- HEP, EO, Bio
- Resource Broker
- Data replicas, databases, mass storage
- Fabric, network
14 Looking for Resources
- Resource Brokerage based on matchmaking (Condor)
- Information Services Mesh
- Meta-computing directory
- Replica Catalogues
- DataGrid: http://marianne.in2p3.fr/
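Matchmaking-based brokerage, as listed above, pairs a job's stated requirements with the attributes that resources advertise and then ranks the candidates. The sketch below illustrates the idea in Python. It is a simplification: the attribute names, sites and ranking rule are hypothetical, and real Condor matchmaking also lets the resource state requirements on the job.

```python
def satisfies(job, resource):
    """True if the resource meets every requirement the job states."""
    return (
        resource["os"] == job["os"]
        and resource["free_cpus"] >= job["cpus"]
        and resource["free_disk_gb"] >= job["disk_gb"]
    )


def rank(resource):
    """Hypothetical ranking rule: prefer the site with the most free CPUs."""
    return resource["free_cpus"]


def match(job, resources):
    """Return the best-ranked resource that satisfies the job, or None."""
    candidates = [r for r in resources if satisfies(job, r)]
    return max(candidates, key=rank) if candidates else None


# Hypothetical advertisements, as an information service might publish them.
RESOURCES = [
    {"name": "site-a", "os": "linux", "free_cpus": 20, "free_disk_gb": 100},
    {"name": "site-b", "os": "linux", "free_cpus": 80, "free_disk_gb": 10},
    {"name": "site-c", "os": "solaris", "free_cpus": 200, "free_disk_gb": 500},
]

JOB = {"os": "linux", "cpus": 10, "disk_gb": 50}

if __name__ == "__main__":
    chosen = match(JOB, RESOURCES)
    print(chosen["name"] if chosen else "no matching resource")
```

Here site-b is rejected for lack of disk and site-c for its operating system, so the broker settles on site-a even though it has the fewest free CPUs.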
15 Submitting a Job
16 Locating a Replica
- Grid Data Mirror Package
- Moves data across sites
- Replicates both files and individual objects
- Catalogue used by Broker
- Replica Location Service (Giggle)
- Read-only copies owned by the Replica Manager.
- http://cmsdoc.cern.ch/cms/grid
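A replica catalogue can be thought of as a mapping from a logical file name to the physical copies held at different sites. The following Python sketch shows that idea only; the class, the host names and the selection rule are hypothetical and do not reproduce the interface of the Grid Data Mirror Package or the Replica Location Service.

```python
from urllib.parse import urlparse


class ReplicaCatalogue:
    """Toy catalogue mapping logical file names to physical replica URLs."""

    def __init__(self):
        self._replicas = {}

    def register(self, logical_name, physical_url):
        """Record one more physical copy of a logical file."""
        self._replicas.setdefault(logical_name, []).append(physical_url)

    def lookup(self, logical_name):
        """Return all known physical copies (empty list if unknown)."""
        return list(self._replicas.get(logical_name, []))


def pick_replica(catalogue, logical_name, preferred_hosts):
    """Choose a replica on the first preferred host that holds one.

    Falls back to the first registered replica, or None if there is none.
    """
    replicas = catalogue.lookup(logical_name)
    for host in preferred_hosts:
        for url in replicas:
            if urlparse(url).hostname == host:
                return url
    return replicas[0] if replicas else None


if __name__ == "__main__":
    catalogue = ReplicaCatalogue()
    # Hypothetical storage hosts; example.org is reserved for documentation.
    catalogue.register("lfn:run42.dat", "gsiftp://se1.example.org/data/run42.dat")
    catalogue.register("lfn:run42.dat", "gsiftp://se2.example.org/data/run42.dat")
    print(pick_replica(catalogue, "lfn:run42.dat", ["se2.example.org"]))
```

A broker could use such a lookup to send a job to a site that already holds the data it needs.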
17 Sending Your Data
- Tape robots, disks, etc. share GridFTP interface
- Supports single sign-on and confidentiality
- Optimized for high-speed (> 1 Gbit/s) networks
- In the future: automatic optimizations, bandwidth reservations, directory-enabled networking
18 Grid-enabled Databases?
- SpitFire: uniform access to persistent storage on the Grid
- Multiple roles support
- Compatible with GSI (single sign-on) through CoG
- Uses standard stuff: JDBC, SOAP, XML
- Supports various back-end databases
- http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire/
19 DataGrid Test Bed 1
- DataGrid TB1
- 14 countries
- 21 major sites
- Growing rapidly
- Submitting Jobs
- Login only once, run everywhere
- Cross administrative boundaries in a secure and trusted way
- Mutual authorization
20 DutchGrid Platform
- DutchGrid
- Test bed coordination
- PKI security
- Participation by
- NIKHEF/FOM, VU, UvA, Utrecht, Nijmegen
- KNMI, SARA
- AMOLF
- DAS-2 (ASCI): TU Delft, Leiden, VU, UvA, Utrecht
- Telematics Institute
21 And now for some Technical Details
22 Resources
- Current startup resources to be (ab)used
- NIKHEF
- Several Globus test machines (try them now from your desk!)
- 50x2 CPUs D0 cluster
- 2x10x2 (40) CPUs LHCb at NIKHEF (WCW) and VU
- 10x2 CPUs Alice NIKHEF (WCW)
- ca. 4x2 CPUs Alice Utrecht
- ca. 10x2 CPUs D0 Nijmegen
- Lots of disk, dedicated 1.3 TByte cache server
- DAS-II: 200 dual-PIII systems, some disk (2 TByte)
- Spread over 5 locations (NIKHEF is one!)
- SARA: tape robot (> 200 TByte), some clusters
- More systems (NCF) to come this year
23 Start using the grid
- All the necessary client tools are on all Linux and Solaris systems
- You just need
- Credentials/tokens for the Grid (see next slides)
- Authorization to use resources (you get all NIKHEF resources by default)
- Information on which resources to use effectively
24 Your Grid Credentials
- You will use resources across several domains
- You may not care about security and authorization
- But the remote site admin will!
- All communications are authenticated using X.509 Public Key Certificates
- The technology used to secure credit card transactions on the web (https://)
- Uniquely binds name/affiliation to a digital token
25 Certification Authorities
- CAs act as trusted third parties
- Remote sites trust the CA for a proper binding
- They will not do authentication again, so only authorization is left.
- CAs are highly valuable: crack one to impersonate others on the Grid (and abuse resources)
- Registration Authorities do in-person ID checks
26 CAs in DataGrid
- 10 National CAs (one per EU country)
- Each one has a detailed policy and practice statement
- NIKHEF operates the CA for DutchGrid. See http://www.dutchgrid.nl/ca
- Get a certificate from the DutchGrid CA before you can start using the Grid
- It's valuable, protect it with a pass phrase
- One cert valid for all DataGrid sites
27 The Proxy
- A proxy certificate is a limited-lifetime delegation without a pass phrase to protect it
- Implements the single sign-on for Grid
- Valid for 12 hours (by default)
- Use it to
- Run your jobs
- Get access to your data
- Get it by running grid-proxy-init
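Because a proxy is only valid for a limited time, it is worth checking that a job can finish before the proxy expires. The sketch below does that arithmetic in Python using the 12-hour default quoted above. The function names are illustrative, and the code does not read a real proxy file.

```python
from datetime import datetime, timedelta

# Default proxy lifetime quoted on the slide.
DEFAULT_LIFETIME = timedelta(hours=12)


def proxy_expiry(created_at, lifetime=DEFAULT_LIFETIME):
    """Moment at which a proxy created at `created_at` stops being valid."""
    return created_at + lifetime


def job_fits(created_at, now, job_duration, lifetime=DEFAULT_LIFETIME):
    """True if a job started at `now` ends before the proxy expires."""
    return now + job_duration <= proxy_expiry(created_at, lifetime)


if __name__ == "__main__":
    created = datetime(2002, 7, 15, 9, 0, 0)
    now = datetime(2002, 7, 15, 18, 0, 0)
    print("proxy expires:", proxy_expiry(created))
    print("2-hour job fits:", job_fits(created, now, timedelta(hours=2)))
    print("4-hour job fits:", job_fits(created, now, timedelta(hours=4)))
```

If the job does not fit, the usual remedy is to create a fresh proxy before submitting.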
28 Now see for yourself
29 Getting a Certificate
- Initialize your environment for the Grid
- Use the Globus local guide from http://www.dutchgrid.nl/Support/
- Send the result to ca@nikhef.nl; you will be contacted by phone
- Put the certificate (sent by mail) in your $HOME/.globus/usercert.pem
- Or use the Web at http://certificate.nikhef.nl/userhelp.html
30 Using the Grid
- Request authorization: grid.support@nikhef.nl
- Look what is out there using grid-info-search or http://marianne.in2p3.fr/datagrid/giis/giis-browse.html
- Try some local hosts
- bilbo, kilogram, triangel
  kilogramdavidg1009 globus-job-run dommel.wins.uva.nl /usr/ucb/quota -v
  Disk quotas for random (uid 12xxx)
  Filesystem usage quota limit timeleft files quota limit timeleft
  /home/random 13067 1500000 2000000 0 0 0
  kilogramdavidg1010
- Start running your analysis/MC/other jobs
31 grid-proxy-init
  kilogramdavidg1003 grid-proxy-init
  Your identity: /O=dutchgrid/O=users/O=nikhef/CN=David Groep
  Enter GRID pass phrase for this identity: PassPhrase
  Creating proxy .................................... Done
  Your proxy is valid until Wed Sep 26 05:50:53 2001
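The identity that grid-proxy-init prints is an X.509 subject name. Assuming it follows the usual slash-separated form with key=value components (for example /O=dutchgrid/O=users/O=nikhef/CN=David Groep), it can be split into its parts as sketched below in Python. The helper names are illustrative, and the parser is deliberately naive: it does not handle values that themselves contain a slash.

```python
def parse_subject(subject):
    """Split a slash-separated subject name into (key, value) pairs."""
    parts = []
    for component in subject.strip().split("/"):
        if not component:
            continue  # skip the empty piece before the leading slash
        key, sep, value = component.partition("=")
        if not sep:
            raise ValueError(f"component without '=': {component!r}")
        parts.append((key, value))
    return parts


def common_name(subject):
    """Return the CN component of a subject name, or None if absent."""
    for key, value in parse_subject(subject):
        if key == "CN":
            return value
    return None


if __name__ == "__main__":
    subject = "/O=dutchgrid/O=users/O=nikhef/CN=David Groep"
    print(parse_subject(subject))
    print(common_name(subject))
```

Sites typically base their authorization decisions on this full subject name, which is why the binding made by the CA matters.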
32 GridFTP
- Universal high-performance file transfer
- Extends the FTP protocol with
- Single sign-on (GSI, GSSAPI, RFC2228)
- Parallel streams for speed-up
- Striped access (ftp from multiple sites to be faster)
- Clients: gsincftp, globus-url-copy.
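The speed-up from parallel streams comes from moving different parts of a file at the same time. The Python sketch below only illustrates how a file might be divided into byte ranges, one per stream. It is not the GridFTP wire protocol, and the function name is hypothetical.

```python
def split_ranges(size, streams):
    """Divide `size` bytes into at most `streams` contiguous (offset, length) ranges.

    Ranges differ in length by at most one byte. Streams that would
    receive nothing (when size < streams) are left out.
    """
    if size < 0 or streams < 1:
        raise ValueError("size must be >= 0 and streams must be >= 1")
    base, extra = divmod(size, streams)
    ranges = []
    offset = 0
    for index in range(streams):
        # The first `extra` streams each carry one additional byte.
        length = base + (1 if index < extra else 0)
        if length == 0:
            continue
        ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    for offset, length in split_ranges(10, 3):
        print(f"stream reads bytes {offset}..{offset + length - 1}")
```

Striped access applies the same idea across several servers: each range could be fetched from a different site that holds a copy of the file.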
33 What's Next?
- Some of the nice user features to come
- Finding data files by characteristics (give me all golden decays)
- Moving your job to where the data is
- Automatic partitioning of jobs
- Support for truly interactive work
- Better network utilisation (faster access to data)
- If you are in the DataGrid project, ask your WP leader for authorization in TB1