Title: NPACI Grid: Using Grid Software to Enhance NPACI
Slide 1: NPACI Grid: Using Grid Software to Enhance NPACI Applications and Systems at the University of Michigan
Tutorial 10. Abhijit Bose, Ken MacInnis, Brian Wickman. NPACI All Hands Meeting, March 18-20, 2003
Slide 2: Agenda
- Current Grid Resources at Michigan
- How to use Grid Services (user-centric view)
- NPACI Grid Application Showcase: High-Energy Physics (DZero/SAM-Grid)
- Setting up Grid Services
- Demonstration
Slide 3: Current NPACI Grid Resources at Michigan (March 2003)
[Architecture diagram: the SAM-Grid gateway/D0 gatekeeper and the MGRID gateway/Globus gatekeeper (NMI) sit on the UM network, connected by Gigabit Ethernet to Hypnos, a 256-CPU AMD 2000MP cluster (1 GB/CPU, 1 TB scratch, PBS/Maui scheduler) fronted by a Globus-PBS job manager. SAM station servers, SAM stagers, and the SAM FSS reach the calibration DB servers and a local naming service over the WAN. NFS-mounted storage includes 72 TB of ADSM storage (/adsm2) and a RAID server (/d0raid, 750 GB-1 TB). A Kx509 client handles user credentials.]
Slide 4: Grid Resources at Michigan (2003)
- Integrate the rest of the resources with the Grid:
  - 100-CPU AMD cluster (morpheus), currently being expanded
  - 177-CPU IBM SP2, 24-CPU IBM Nighthawk
  - other college- and department-level clusters as part of MGRID
- University-wide KCA (prototype already up and running; policies under development)
- University-wide Kx509 and NMI deployment
- Possible NPACI/CAC 64-bit cluster acquisition in 3Q 2003
Slide 5: Grid Resources at Michigan (2003)
- We are also looking at:
  - AFS clients on Grid-enabled clusters, for a unified view of user directories
  - scalability and feasibility of using the GPFS parallel file system for cluster I/O
  - DZero/NFSv4 integration, to provide sandboxing of local file systems and fine-grained access control
- Many other activities via the MGRID initiative (attend Wednesday's talk, Parallel Session 2: Grid Experiences)
Slide 6: Steps in Using a Grid-enabled Resource (http://www.npaci.edu/globus)
1. Get NPACI allocations and accounts on Grid-enabled systems
2. Get the appropriate certificates (more later)
3. Set the appropriate environment (can be in your .cshrc or other shell rc file); e.g., on chi.grid.umich.edu do:
   export GLOBUS_LOCATION=/usr/grid
   source /usr/grid/etc/globus-user-env.sh
4. Create an RSL resource request/submission script (more later)
5. Submit the RSL script to the appropriate Globus job manager via globusrun -o -r <resource> -f <rsl file>
Slide 7: Authentication and Certificates
The Globus GSI uses public-key cryptography and digital signatures (http://www.globus.org/security/overview.html). Primary motivations:
- the need for secure communication among the elements of a Grid
- support for security across organizational boundaries (distributed security entities rather than a centrally managed entity)
- support for single sign-on, especially to authenticate and authorize users to multiple resources
Users need some basic knowledge of Globus credentials, KX.509, Kerberos, etc. This is very important when requesting access to multiple resources spanning different administrative domains. This talk will not cover the details of Globus GSI.
Slide 8: Globus Certificates
- Central to GSI are certificates: users and Grid services are identified by their certificates (identification and authentication)
- There are four information fields in each certificate:
  - the subject: name of the user or the service it identifies
  - the public key of the user or the service
  - the identity of the Certificate Authority (CA) that signed the certificate, certifying the user or the service
  - a digital signature (e.g. using MD5) and the certificate of the above CA
- Since the CA certifies the link between the subject name and the public key, the CA must be trusted (e.g. a Grid administrator decides which CAs to trust)
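These fields can be inspected with the standard openssl tool. A minimal sketch: it generates a throwaway self-signed certificate (acting as its own CA) purely so the commands have something to print; a real Grid user certificate, signed by a trusted CA, would normally live under $HOME/.globus:

```shell
#!/bin/sh
# Create a throwaway self-signed certificate for illustration only.
# The subject string echoes the example on the next slide.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -subj "/O=Grid/OU=GlobusTest/CN=Ken MacInnis" \
    -keyout userkey.pem -out usercert.pem 2>/dev/null

# Print the subject and issuer (here identical, since it is self-signed);
# in a CA-signed Grid certificate the issuer names the signing CA.
openssl x509 -in usercert.pem -noout -subject -issuer

# The full text dump also shows the public key and the CA's signature.
openssl x509 -in usercert.pem -noout -text | grep 'Signature Algorithm' | head -1
```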
Slide 9: Example of a Certificate (using the CA at chi.grid.umich.edu), X.509 Format

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1 (0x1)
        Signature Algorithm: md5WithRSAEncryption
        Issuer: O=Grid, OU=GlobusTest, OU=simpleCA-chi.engin.umich.edu, CN=Globus Simple CA
        Validity
            Not Before: Feb 19 14:11:48 2003 GMT
            Not After : Feb 19 14:11:48 2004 GMT
        Subject: O=Grid, OU=GlobusTest, OU=simpleCA-chi.engin.umich.edu, OU=engin.umich.edu, CN=Ken MacInnis
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
            RSA Public Key: (1024 bit)
                Modulus (1024 bit):
                    00d1746c9d55ac1558eb26c2bc27fc6164f7bd0cc695a1f474570738f0a1
                    749575b6a3e467670b940d4302708e137b0513ca46835908fa6735151b33
                    43e1250c9ec7ea6697cf329c2317df4c3c361d171c7f31e8fcac6c057e58
                    79f2c45d08cb20380fa42d2c9d83e54c81d1a89cf5978b1deaf0bc474e8b
                    10e3ce468aeeb3ffe1
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            Netscape Cert Type:
                SSL Client, SSL Server, S/MIME, Object Signing
Slide 10: Example of a Certificate (using the CA at chi.grid.umich.edu), X.509 Format (continued)

    Signature Algorithm: md5WithRSAEncryption
        86f9bba023f0fde6fa999170a9e842d4b2162aa832f401096de60863217558f4
        7b9d94bed03d00e3e9a911a49362f373e6004fe562d78a25daf98ef2e83e829e
        14c5b8a825b70a7c5e970f292b67d45b1e4db65fd690f4e01cee402db381f4dc
        a1e83e97026b3b55af3579b18a967684a9090702708ae46c69a12316ac623dfc

-----BEGIN CERTIFICATE-----
MIICbTCCAdagAwIBAgIBATANBgkqhkiG9w0BAQQFADBmMQ0wCwYDVQQKEwRHcmlk
MRMwEQYDVQQLEwpHbG9idXNUZXN0MSUwIwYDVQQLExxzaW1wbGVDQS1jaGkuZW5n
aW4udW1pY2guZWR1MRkwFwYDVQQDExBHbG9idXMgU2ltcGxlIENBMB4XDTAzMDIx
OTE0MTE0OFoXDTA0MDIxOTE0MTE0OFowfDENMAsGA1UEChMER3JpZDETMBEGA1UE
CxMKR2xvYnVzVGVzdDElMCMGA1UECxMcc2ltcGxlQ0EtY2hpLmVuZ2luLnVtaWNo
LmVkdTEYMBYGA1UECxMPZW5naW4udW1pY2guZWR1MRUwEwYDVQQDEwxLZW4gTWFj
..
-----END CERTIFICATE-----

One can have multiple certificates for access to different resources (stored in the $HOME/.globus directory):

total 24
drwxrwxr-x 6 kmacinni kmacinni 4096 Feb 27 16:23 .
drwxrwxr-x 4 kmacinni kmacinni 4096 Mar  3 11:29 ..
drwxrwxr-x 2 kmacinni kmacinni 4096 Feb 19 14:18 chi-simpleCA
drwxrwxr-x 2 kmacinni kmacinni 4096 Feb 27 16:24 doe-energy
drwxrwxr-x 2 kmacinni kmacinni 4096 Feb 19 14:13 globus
drwxrwxr-x 2 kmacinni kmacinni 4096 Feb 19 14:19 ncsa
Slide 11: Kerberos Network Authentication System

[Diagram: Alice (the user) sends messages 1 and 3 to, and receives 2 and 4 from, the KDC/TGS; message 5 goes to Bob (the service).]

Login phase (once per session):
1. Alice -> KDC: "I am Alice"
2. KDC -> Alice: TGT = {Alice, TGS, K_{A,TGS}}K_TGS, together with {T, K_{A,TGS}}K_A

Accessing services (every time a new kerberized service is requested):
3. Alice -> TGS: Alice, Bob, TGT, {T}K_{A,TGS}
4. TGS -> Alice: TKT = {Alice, Bob, K_{A,B}}K_B, together with {T, K_{A,B}}K_{A,TGS}
5. Alice -> Bob: "I am Alice", TKT, {T}K_{A,B}

Legend:
- TGS: Ticket Granting Service (often the same entity as the KDC)
- K_A: shared key between Alice and the KDC (derived from Alice's password upon login)
- K_{A,TGS}: session key for Alice and the TGS
- K_TGS: shared key between the KDC and the TGS
- K_{A,B}: session key for Alice and Bob
- T: timestamp to prevent replay attacks (requires synchronized clocks)
Slide 12: KX.509 Certificates
- The story so far: Alice has a Kerberos ticket on the workstation she is logged into. But Globus uses X.509 certificates; how does Alice use Globus-enabled services?
- KX.509, developed at CITI, University of Michigan, is a Kerberized client program (resides on Alice's workstation):
  - it generates an X.509 certificate and a private key based on the existing Kerberos ticket
  - both are normally stored in the same Kerberos ticket cache (most often in the /tmp directory)
  - the temporary key is destroyed when the Kerberos ticket expires
- Therefore, by adopting KX.509, a Kerberos-based organization can deploy and use Globus-enabled services without changing its security infrastructure. Kerberos is the most widely deployed network authentication system currently in use.
Slide 13: Kerberos, KX.509, and Globus Proxy Certificate Creation

Steps (on chi.grid.umich.edu):
(1) Obtain and cache a Kerberos 5 Ticket Granting Ticket (TGT):
    [abose@chi abose]$ kinit abose
    Password for abose@GRID.UMICH.EDU:
    [abose@chi abose]$ ls -al /tmp | grep abose
    -rw------- 1 abose abose 483 Mar 18 00:34 krb5cc_108355_PIfNIZ
(2) Obtain an X.509 certificate from the KCA and store it in /tmp as well:
    [abose@chi abose]$ kx509
(3) Convert the X.509 certificate to a Globus proxy certificate and cache it:
    [abose@chi abose]$ kxlist -p
Slide 14: Kerberos, KX.509, and Globus Proxy Certificate Creation (continued)

(You can use either kxlist -p or grid-proxy-init to generate the Globus proxy certificate.)

Content of the certificate:
Service: kx509/certificate
issuer:  /C=US/ST=Michigan/L=Ann Arbor/O=University of Michigan/CN=MGrid Test KCA
subject: /C=US/ST=Michigan/L=Ann Arbor/O=University of Michigan/OU=MGrid Test KCA/CN=abose/0.9.2342.19200300.100.1.1=abose/Email=abose@GRID.UMICH.EDU
serial=34 hash=8ca5c718

Note for Grid administrators: the subject line from above and the username on the host need to be added to the grid-map file ($GLOBUS_LOCATION/etc/grid-mapfile, or any location specified in the Globus gatekeeper configuration file, $GLOBUS_LOCATION/etc/globus-gatekeeper.conf).
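The mapping itself is a one-line append to the grid-map file. A sketch using the subject string above (minus the UID component, for brevity): a real gatekeeper reads $GLOBUS_LOCATION/etc/grid-mapfile, while a local file is used here for illustration:

```shell
#!/bin/sh
# Map the certificate subject (DN) to a local account name; the gatekeeper
# consults this file to authorize incoming users and map them to accounts.
GRIDMAP=./grid-mapfile   # normally $GLOBUS_LOCATION/etc/grid-mapfile

echo '"/C=US/ST=Michigan/L=Ann Arbor/O=University of Michigan/OU=MGrid Test KCA/CN=abose" abose' >> "$GRIDMAP"

# Show the entry: the DN in double quotes, then the local username.
grep abose "$GRIDMAP"
```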
Slide 15: Globus Resource Specification Language (RSL) Basics
- Use RSL to specify the resources you need at the time of submission:
  globusrun -o -r chi/jobmanager-pbs -f req.rsl
  (-r: resource name, -f: RSL filename)
- Good to know some of the basic constructs:
  - & indicates a single resource request to the Globus Resource Allocation Manager (GRAM): a conjunction of (attribute, value) pairs
  - + indicates a request for multiple resources (co-allocation)
  - each clause of a multi-request introduces a new variable scope
Slide 16: Globus Resource Specification Language (RSL) Basics
- Variables defined in one clause of a multi-request are not visible to the other clauses
- RSL tokens: none of the following may appear in an unquoted literal:
  + (plus), & (ampersand), | (pipe), ( (left paren), ) (right paren), = (equal), > (right angle), ! (exclamation), " (double quote), ' (apostrophe), ^ (caret), # (pound), and $ (dollar)
- Common RSL attributes:
  arguments, count, directory, executable, jobType, environment, maxTime, maxWallTime, gramMyjob, maxCpuTime, stdin, stdout, stderr, queue, project, dryRun, maxMemory, minMemory, hostCount
Slide 17: RSL (continued)
Example (single resource for now):

globusrun -r chi/jobmanager-pbs '&
 (executable="/home/abose/test.exe")
 (host_count=2)
 (count=4)
 (arguments="-t 100 -f out.dat")
 (email_address="abose@umich.edu")
 (queue="cac")
 (pbs_stagein="morpheus:/home/abose/test.exe")
 (pbs_stageout="morpheus:/home/abose/out.dat")
 (pbs_stdout="/tmp/stdout")
 (pbs_stderr="/tmp/stderr")
 (maxwalltime=10)(jobtype="mpi")'

This gets test.exe from morpheus and runs it on hypnos, submitted by the Globus gatekeeper on chi using the PBS job manager.
Slide 18: RSL (continued)
Example: resulting PBS submission script on Hypnos:

#! /bin/sh
# PBS batch job script built by Globus job manager
#PBS -S /bin/sh
#PBS -M abose@umich.edu
#PBS -m n
#PBS -q cac
#PBS -W stagein=/home/abose/test.exe@morpheus.engin.umich.edu:/home/abose/test.exe
#PBS -W stageout=/home/abose/out.dat@morpheus.engin.umich.edu:/home/abose/out.dat
#PBS -l walltime=10:00
#PBS -o hypnos:/tmp/stdout
#PBS -e hypnos:/tmp/stderr
#PBS -l nodes=2
#PBS -v X509_USER_PROXY=/home/abose/.globus/.gass_cache/local/md5/1c/fd/d3/753b9028dfec2ddd6df84cd06c/md5/0a/4b/1d/599dac54863d650c2531cb92fc/data,GLOBUS_LOCATION=/usr/grid,GLOBUS_GRAM_JOB_CONTACT=https://chi.grid.umich.edu:58963/575/1047861360/,GLOBUS_GRAM_MYJOB_CONTACT=URLx-nexus://chi.grid.umich.edu:58964/,HOME=/home/abose,LOGNAME=abose,LD_LIBRARY_PATH=
# Change to directory requested by user
cd /home/abose
/usr/gmpi.pgi/bin/mpirun -np 4 /home/abose/test.exe -t 100 -f out.dat
Slide 19: An Application Grid Domain Using NPACI Resources: DZero/SAM-Grid Deployment at Michigan
Timelines:
- Planning meetings (CAC and Fermilab): Sep-Oct 2002
- Demonstration of SAM-Grid at SC2002: Nov 2002
- Deployment/site customization: Dec 2002 - Mar 2003
- Target NPACI allocation/production runs at Michigan: April 2003
(plus site visits; students spent part of their time at Fermilab)

Slide courtesy of Jianming Qian, Univ. of Michigan, and Lee Lueking, FNAL
Slide 20: The D0 Collaboration
- 500 physicists
- 72 institutions
- 18 countries
Slide 21: (No transcript)
Slide 22: Scale of Challenges

Computing: sufficient CPU cycles and storage, good network bandwidth. Software: efficient reconstruction program, realistic simulation, easy data access, ...

[Trigger diagram: 2 MHz -> Level 1 -> 5 kHz -> Level 2 -> 1 kHz -> Level 3 -> 50 Hz]

- A luminosity-dependent physics menu leads to an approximately constant Level-3 output rate
- With a combined duty factor of 50%, we are writing at 25 Hz DC, corresponding to about 800 million events a year
Slide 23: Data Path

[Diagram: raw data and Monte Carlo (MC) flow between the Fermilab farm, offsite farms, and the data handling system, serving both Fermilab and offsite analysis.]
Slide 24: Major Software
- Trigger algorithms (Level-2/Level-3) and simulation:
  Filtering through firmware programming and/or (partial) reconstruction, event building, and monitoring. Simulating Level-1 hardware and wrapping Level-2 and Level-3 code for offline simulation.
- Data management and access:
  SAM (Sequential Access to data via Meta-data) is used to catalog data files produced by the experiment and provides distributed data storage and access to production and analysis sites.
- Event reconstruction:
  Reconstructing all physics object candidates, producing Data Summary Tapes (DST) and Thumbnails (TMB) for further analyses.
- Physics and detector simulation:
  Off-the-shelf event generators to simulate physics processes, and home-grown Geant-based (slow) and parameterized (fast) programs to simulate detector responses.
Slide 25: Computing Architecture

[Diagram: central data storage serves dØmino with its Central Analysis Backend (CAB), the DØ detector systems, and remote analysis sites.]
Slide 26: Storage and Disk Needs
For two-year running:
- Storage: store all officially produced and user-derived datasets in the Fermilab robotic tape system: about 1.5 PB
- Disk: all data and some MC TMBs are disk-resident, with sufficient disk cache for user files: about 40 TB at analysis centers
Slide 27: Analysis CPU Needs
CPU needs are estimated based on the layered analysis approach:
- DST-based:
  Resource-intensive; limited to physics, object ID, and detector groups. Example: insufficient TMB information, improved algorithms, bug fixes, ...
- TMB-based:
  Medium resources required; expected to be done mostly by subgroups. Example: creating derived datasets, direct analyses on TMB, ...
- Derived datasets:
  Individuals, done daily on their desktops and/or laptops. Example: Root-tree-level analyses, selection optimization, ...

The CPU need is about 4 THz for a data sample of two-year running.
Slide 28: Analysis Computing
RAC (Regional Analysis Center)
- dØmino and its backend (CAB) at the Fermilab Computing Center:
  - provided and managed by the Fermilab Computing Division
  - dØmino is a cluster of SGI O2000 CPUs; it provides limited CPU power but large disk caches and high-performance I/O
  - CAB is a farm of 160 dual 1.8 GHz AMD CPU Linux nodes on the dØmino backend and should provide the majority of analysis computing at Fermilab
- CluEDØ at DØ:
  - over 200 Linux desktop PCs from collaborating institutions
  - managed by volunteers from the collaboration
- CAB and CluEDØ are expected to provide half of the estimated analysis CPU needs; the remainder is to be provided by regional analysis centers
Slide 29: Overview of SAM (Sequential Access to data via Meta-data)
[Diagram: shared globally: database server(s) with the central database, name server, global resource manager(s), and log server; shared locally: station 1..n servers and mass storage system(s). Arrows indicate control and data flow.]
Slide 30: Components of a SAM Station
[Diagram of a SAM station: producers/consumers and project managers interact with the station cache manager; file storage clients and the file storage server move files between cache disk, temp disk, file stager(s), and worker nodes, with MSS or other stations as sources and sinks. Control and data flow are shown separately.]
Slide 31: SAM Deployment
- The success of the SAM data handling system is the first step towards utilizing offsite computing resources
- SAM stations deployed at collaborating institutions provide easy data storage and access
(Only the most active sites are shown.)
Slide 32: (No transcript)
Slide 33: SAM-Grid

SAM-Grid is a Particle Physics Data Grid project. It integrates Job and Information Management (JIM) with the SAM data management system. A first version of SAM-Grid was successfully demonstrated at SuperComputing 2002 in Baltimore. SAM-Grid could be an important job management tool for our offsite analysis efforts.
Slide 34: JIM v1 Deployment Plan to Achieve the April 1 Milestone
Lee Lueking, Igor Terekhov, Gabriele Garzoglio (Fermilab)
Slide 35: Objectives of SAMGrid
- Bring standard grid technologies (Globus and Condor) to the Run II experiments
- Enable globally distributed computing for D0 and CDF
- JIM complements SAM by adding job management and monitoring to data handling
- Together, JIM + SAM = SAMGrid
Slide 36: Principal Functionality
- Enable all authorized users to use off-Fermi-site computing resources
- Provide a standard interface for all job submission and monitoring
- Jobs can be: 1. analysis, 2. reconstruction, 3. Monte Carlo, 4. generic ("vanilla")
- JIM v1 features:
  - submission to the SAM station of the user's choice
  - automatic selection of a SAM station based on the amount of input data cached at each station
  - web-based monitoring
Slide 37: Job Management

[Diagram: a job submitted through the user interface and submission client goes to the matchmaking service (broker), which consults an information collector fed by grid sensors at each execution site; the queuing system dispatches the job to execution sites 1..n, each with computing elements, storage elements, and a data handling system.]
Slide 38: Site Requirements
- Linux i386 hardware-architecture machines
- SAM station with a working "sam submit"
- For MC clusters, mc_runjob installed
- Submission and execution sites continuously run SAM and JIM servers, with auto-restart procedures provided
- Execution sites can configure their Grid users and batch queues to avoid self-inflicted DoS, e.g. all users mapped to one user "d0grid" with limited resources
- Firewalls must have specific ports open for incoming connections to SAM and JIM; client hosts may include FNAL and all submission sites
- Execution sites trust the grid credentials used by D0, including DOE Science Grid, FNAL KCA, and others by agreement. To use the FNAL KCA, a Kerberos client must be installed at the submission site.
Slide 39: JIM v1 Deployment
- A site can join SAM-Grid with any combination of services:
  - monitoring
  - execution
  - submission
- April 1, 2003: expect 5 initial sites for SAMGrid deployment, and 20 submission sites
- May 1, 2003: a few additional sites, depending on the success and use of the initial deployment
- Summer 2003: continue to add execution and submission sites; hope to grow to dozens of execution and hundreds of submission sites
- CAB is a powerful resource at FNAL, but...
  - Globus software is not well supported on IRIX (the CAB station server runs on d0mino)
  - the FNAL computer security team restricts Grid jobs to in situ executables, or KCA certificates for user-supplied executables
Slide 40: SAMGrid Dependencies
Slide 41: Expectations from D0
- By March 15: need 2 volunteers to help set up beta sites and conduct submission tests
- We expect runjob to be interfaced with the JIM v1 release to run MC jobs
- In the early stages, it may require half an FTE at each site to deploy and to help troubleshoot and fix problems
- Initial deployment expectations:
  - GridKa: analysis site
  - Imperial College and Lancaster: MC sites
  - U. Michigan (NPACI): reconstruction center
- Second round of deployments: Lyon (ccin2p3), Manchester, MSU, Princeton, UTA
- Others include NIKHEF and Prague; need to understand the EDG/LCG implications
Slide 42: How Do We Spell Success?
- Figures of merit:
  - Number of jobs successfully started in remote execution batch queues
    - If a job crashes, it is beyond our control
    - There may be issues related to data delivery that could be included as a special failure mode
  - How effectively CPU is used at remote sites
    - We may change the scheduling algorithm for job submission and/or tune the queue configuration at sites
    - Requires the cooperation of participating sites
  - Ease of use, and how much work gets done on the Grid