Title: Introduction to Globus with Condor-G
1 Introduction to Globus with Condor-G
Israel Academic Grid (IAG)
- Itzhak Ben Akiva (TAU)
- David Front (WI)
2 Agenda
- Grid security and certificates
- Globus
- Condor-G
- Condor-G submission examples
- References
3 Grid Security Infrastructure (GSI)
- GSI is a set of tools, libraries and protocols used in Globus to allow users and applications to securely access resources.
- Based on a public key infrastructure, with certificate authorities and X.509 certificates
- Layers, from top to bottom
- Proxies and delegation, for secure single sign-on
- SSL/TLS, for authentication and message protection
- PKI (CAs and certificates), for credentials
4 Public Key Infrastructure (PKI)
- PKI allows you to know that a given public key belongs to a given user
- PKI builds on asymmetric encryption
- Each entity has two keys: public and private
- Data encrypted with one key can only be decrypted with the other.
- The private key is known only to the entity
- The public key is given to the world, encapsulated in an X.509 certificate
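The key-pair property above can be illustrated with a toy RSA sketch. This is an illustration only: the tiny primes, the exponent, and the helper name apply_key are assumptions chosen for readability, and nothing here is how GSI or SSL is implemented.

```python
# Toy RSA with tiny primes: illustrative only, never use for real security.
p, q = 61, 53
n = p * q                  # shared modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent (given to the world)
d = pow(e, -1, phi)        # private exponent (known only to the entity); Python 3.8+


def apply_key(value, exponent):
    """Apply one half of the key pair to an integer smaller than n."""
    return pow(value, exponent, n)


message = 65
ciphertext = apply_key(message, e)      # encrypt with the public key
recovered = apply_key(ciphertext, d)    # only the private key undoes it
wrong = apply_key(ciphertext, e)        # applying the public key again does not

print(recovered == message, wrong == message)
```

The same relation holds in the other direction (apply the private key first, then the public key), which is what digital signatures rely on.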
5 Certificates
- Certificates link a public key to the identity of a subject
- The subject is a person, organization, or device
- The identity is associated with use of the private key
- The certificate is used by a relying party
- Certificate Authorities (CAs) are responsible for establishing identity
- The CA generates a key pair and digitally signs the public key, making it a certificate
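A minimal sketch of what "digitally signs the public key" means, assuming a toy RSA key pair for the CA and a plain dictionary standing in for a certificate. Real X.509 certificates use a binary encoding, much larger keys, and padding schemes; the subject name, the key values, and the function names below are all hypothetical.

```python
import hashlib

# Toy CA key pair (tiny primes, illustration only).
CA_N = 61 * 53
CA_E = 17                            # CA public exponent, known to relying parties
CA_D = pow(CA_E, -1, 60 * 52)        # CA private exponent, kept secret by the CA


def digest(subject, public_key):
    """Hash the subject name and public key, reduced to fit the toy modulus."""
    data = f"{subject}|{public_key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % CA_N


def issue_certificate(subject, public_key):
    """CA side: sign the digest with the CA private key."""
    signature = pow(digest(subject, public_key), CA_D, CA_N)
    return {"subject": subject, "public_key": public_key, "signature": signature}


def verify_certificate(cert):
    """Relying-party side: check the signature with the CA public key."""
    expected = digest(cert["subject"], cert["public_key"])
    return pow(cert["signature"], CA_E, CA_N) == expected


cert = issue_certificate("/O=Grid/CN=Example User", (2537, 13))
forged = dict(cert, signature=(cert["signature"] + 1) % CA_N)
print(verify_certificate(cert), verify_certificate(forged))
```

A relying party needs only the CA's public key to run the check, which is why each site must configure the CAs it trusts.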
6 Certificates: which CA to use?
- The following possibilities go from the least secure to the most secure
- Anyone can become a certificate authority
- http://www.onlamp.com/pub/a/onlamp/2003/02/06/linuxhacks.html
- Free certificate authorities
- http://www.thawte.com/
- http://www.verisign.com/ ...
- Globus CA: http://www-fp.globus.org/gt2.4/admin/guide-verify.html#cert
- The Globus CA imposes some security limitations, such as
- The domain of the host that gets a certificate should be the same as the requestor's email domain
- EDG CA
- http://igc.services.cnrs.fr/Datagrid-fr/english/index.html
- EDG does not honor Globus certificates.
- EDG grants certificates only to people who are known personally by an authorized third person.
- For each non-Globus CA, each site should configure the CA as a trusted one.
- Globus is trusted by default, and EDG has RPMs for adding it as a trusted CA.
- Recommendation: until Machba supplies certificates, use the Globus CA.
7 Globus with Condor-G. A resource broker to be added later
8 Globus
- Gurus
- Ian Foster
- Carl Kesselman
- Globus Project
- www.globus.org
- Globus is a bag of tools
- Grid SW projects use Globus
9 What is Globus?
- A research and development project that enables the application of Grid concepts to scientific and engineering computing.
- The Globus Toolkit allows you to build Grids and develop Grid applications
- Globus Project research targets technical challenges, and the Globus Toolkit supplies a set of services and software libraries to support Grids and Grid applications
- (GRAM) resource management
- (GSI) security
- (MDS) information infrastructure
- (GASS) data management
- (HBM) fault detection
- (Nexus and globus_io) portability, communication
10 The Globus Toolkit contents
- A bag of services: components for developing Grid applications and programming tools
- Each component has a C application programmer interface (API)
- Some components have Java classes and/or command line tools
- Prototypes of higher-level components (resource brokers, co-allocators) and services
- Others use the Globus Toolkit to develop
- Higher-level services
- Application frameworks
- Scientific/engineering applications
- Example
- Condor-G uses Globus for its high-throughput computing framework
11 Globus Toolkit: GRAM, GSI
- Globus Resource Allocation Manager (GRAM)
- Resource allocation
- Process creation
- Monitoring
- Management
- Maps requests expressed in a Resource Specification Language (RSL) into commands to local schedulers and computers.
- Grid Security Infrastructure (GSI)
- A single-sign-on, run-anywhere authentication service
- Local control over access rights
- Mapping from global to local user identities.
- Smartcard support increases credential security.
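The mapping from global to local user identities can be sketched as a lookup table. The sketch below assumes the grid-mapfile convention used by Globus installations, where each line pairs a quoted certificate subject name with a local account; the subject names and account names are hypothetical, and the parser handles only this simple two-field form.

```python
import shlex


def parse_gridmap(text):
    """Return a dict mapping certificate subject names to local user names."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        parts = shlex.split(line)         # honours the quotes around the subject
        if len(parts) != 2:
            continue                      # ignore lines this sketch does not understand
        subject, local_user = parts
        mapping[subject] = local_user
    return mapping


SAMPLE = '''
# hypothetical entries
"/O=Grid/OU=Example/CN=Alice Example" alice
"/O=Grid/OU=Example/CN=Bob Example" bob
'''

mapping = parse_gridmap(SAMPLE)
print(mapping["/O=Grid/OU=Example/CN=Alice Example"])
```

Because the table lives on the resource, each site keeps local control over which global identities may run as which local accounts.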
12 Globus Toolkit: MDS, GASS
- Monitoring and Discovery Service (MDS)
- Extensible Grid information service
- Combines data discovery mechanisms with the Lightweight Directory Access Protocol (LDAP).
- Uniform framework for providing and accessing system configuration and status information, such as
- Compute server configuration
- Network status
- Locations of replicated datasets
- Global Access to Secondary Storage (GASS)
- Implements a variety of automatic and programmer-managed data movement and data access strategies
- Enables remote programs to read and write local data.
13 Globus Toolkit: Nexus, globus_io, HBM, GPT
- Nexus and globus_io
- Communication services for heterogeneous environments
- Multimethod communication
- Multithreading
- Single-sided operations
- The Heartbeat Monitor (HBM)
- Allows system administrators or ordinary users to
- Detect failure of system components
- Detect failure of application processes
- Globus Packaging Tool (GPT)
- VDT's Pacman installs both Globus and Condor-G. Hence Pacman is more appropriate for Condor-G users.
14 Layered Grid Architecture (By Analogy to Internet Architecture)
15 The Globus Toolkit in One Slide
- Grid protocols (GSI, GRAM, ...) enable resource sharing within virtual organizations; the toolkit provides a reference implementation (the Globus Toolkit services)
- Protocols (and APIs) are central to the Globus Toolkit
16 Globus Toolkit: missing, weak, plans (1)
- (GRAM) resource management
- Condor-G adds reliable job submission
- The (EDG) resource broker chooses the Globus resource to submit a job to
- Globus plans to support end-to-end performance management and fault tolerance via network scheduling, advance reservations, and policy-based authorization.
- (GSI) security
- Using X.509 certificates has various limitations. For example
- If a user does not use a pass phrase, anyone who gets hold of her certificate and private key can use them.
- (MDS) information infrastructure
- LDAP is too weak for frequently changing information (EDG uses an RDBMS instead of LDAP)
17 Globus Toolkit: missing, weak, plans (2)
- (GASS) data management
- Globus's current replica management capabilities are limited
- Globus plans to provide high-performance access to large amounts of data (terabytes or petabytes).
- (HBM) fault detection
- (Nexus and globus_io) portability, communication
- Others
- Weak accounting
- The firewall problem: in order to submit a Globus job, some Internet ports must be opened. This is a security problem.
- Weak fabric tools
- Installation/configuration
- VDT adds configurable installation via Pacman
- EDG adds client-server installation and updating via LCFG
- Weak monitoring tools
- Restricted support for Windows.
18 Globus support for Windows
- A port of the Globus Toolkit to the Windows XP/2000 platform is under development/test.
- Ways to use Grid resources from Windows systems, or to turn Windows systems into Grid resources
- The Java CoG Kit (http://www.globus.org/cog/) provides access to Grid services via the Java programming language, which is available on Windows.
- A Java-based GRAM service is currently being developed.
- The Condor software from the Condor Project at the University of Wisconsin (http://www.cs.wisc.edu/condor/) provides job management services that allow you to submit jobs to a local service that then submits your jobs to remote resources for execution. Condor can use Grid resources to execute these jobs. Condor is available for Windows.
19 Globus toolkit (re)structure
[Figure: diagram labels: GRAM, MDS, GridFTP, MDS, ???, each with GSI; Compute Resource, Data Resource, Other Service or Application; Service naming, Soft state management, Reliable invocation, Notification]
Lots of good mechanisms, but (with the exception of GSI) not that easily incorporated into other systems
20 Service Oriented Architecture (SOA)
- New buzzwords
- Services (in addition to protocols and APIs)
- Open Grid Services Architecture (OGSA)
- Web services
- SOAP
- XML
[Figure: diagram labels: Service Registry, Service Requestor, Service Provider]
OGSA may become standard
21 The Grid Service = Interfaces/Behaviors + Service Data
[Figure: diagram labels: Service data element (x3), Implementation]
- Binding properties: reliable invocation, authentication
- Hosting environment/runtime (C, J2EE, .NET, ...)
22 Condor-G
- Guru
- Miron Livny
- Condor Project
- http://www.cs.wisc.edu/condor
- Condor-G manual
- http://www.cs.wisc.edu/condor/manual/v6.4/5_2Condor_G.html
- Condor is a scheduler, similar to PBS, LSF and others
- Condor-G is (the submission) part of Condor.
- It adds reliable job submission to Globus
23 Condor-G from Globus eyes
- Condor-G adds reliable job submission to Globus. It lets you
- Submit jobs into a queue
- Have a log detailing the life cycle of your jobs
- Manage your input and output files
- Along with everything else you expect from a job queuing system.
- Condor-G does more than the Globus Toolkit's globusrun command
- It allows you to submit many jobs at once
- And then to monitor those jobs with a convenient interface
- Receive notification when jobs complete or fail
- Maintain your Globus credentials, which may expire while a job runs
- Condor-G is a fault-tolerant system: if your machine crashes, you can still perform all of these functions when your machine returns to life.
24 Condor-G from Condor eyes
- Condor-G is a Globus-enabled version of the Condor scheduler. It uses Globus to handle inter-organizational problems like
- Security
- Resource management for supercomputers
- Executable staging
- Hence the same Condor tools that access local resources are now able to use the Globus protocols to access resources at multiple sites.
- Condor-G manages both a queue of jobs and the resources from one or more sites where those jobs can execute. It communicates with these resources and transfers files to and from these resources using Globus mechanisms, such as
- GSI
- The GRAM protocol for job submission
- A local GASS server for file transfers
- The mutual look
- Condor can be used to submit jobs to systems managed by Globus.
- Globus tools can be used to submit jobs to systems managed by Condor.
25 How Condor-G interacts with Globus protocols
Figure 5.1: Remote Execution by Condor-G on Globus managed resources
26 Submitting a job to Condor-G: example 1
- Run your compiled program on a different Globus resource
- Make sure your Condor server service is running on the Condor server.
- (Not explained here)
- Make sure you have your Grid credentials; create a proxy with grid-proxy-init
- To submit a job
- condor_submit <submit-description-file-name>
- The following sample runs a job on the Origin2000 at NCSA
- executable = test
- globusscheduler = modi4.ncsa.uiuc.edu/jobmanager
- universe = globus
- output = test.out
- log = test.log
- queue
- The executable for this example is transferred from the local machine to the remote machine.
- By default, Condor transfers the executable, as well as any files specified by the input command.
- This executable must be compiled for the correct intended platform.
27 Submitting a job to Condor-G: example 1 cont.
- The globusscheduler command depends on the scheduling software available on the remote resource. This required command will change based on the Grid resource intended for execution of the job.
- All Condor-G jobs are submitted to the globus universe. Hence universe = globus is always required in the submit description file.
- I/O: No input file is specified for this example job. Any output (file specified by the output command) or error (file specified by the error command) is transferred from the remote machine to the local machine as it is produced. This implies that these files may be incomplete if the executable does not finish running on the remote resource. The job log file is maintained on the local machine.
- To submit this job to Condor-G for execution on the remote machine, use condor_submit test.submit, where test.submit is the name of the submit description file.
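When many similar jobs are needed, the submit description file can be generated rather than typed. The sketch below is one possible way to do that, assuming the "name = value" line format followed by a queue line; the helper name render_submit_file is hypothetical, and the values are the ones from example 1.

```python
def render_submit_file(commands, queue_count=1):
    """Render submit description commands as 'name = value' lines plus a queue line."""
    lines = [f"{name} = {value}" for name, value in commands.items()]
    lines.append("queue" if queue_count == 1 else f"queue {queue_count}")
    return "\n".join(lines) + "\n"


example_1 = {
    "executable": "test",
    "globusscheduler": "modi4.ncsa.uiuc.edu/jobmanager",
    "universe": "globus",          # always required for Condor-G jobs
    "output": "test.out",
    "log": "test.log",
}

submit_text = render_submit_file(example_1)
print(submit_text)
```

Writing submit_text to test.submit and running condor_submit test.submit would then submit the job; the sketch itself does not call Condor.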
28 Submitting a job to Condor-G: example 1 cont.
- Example output from condor_q for this submission looks like
- condor_q
- -- Submitter: wireless48.cs.wisc.edu : <128.105.48.148:33012> : wireless48.cs.wi
- ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
- 7.0 epaulson 3/26 14:08 0+00:00:00 I 0 0.0 test
- 1 jobs; 1 idle, 0 running, 0 held
- After a short time, Globus accepts the job.
- Running condor_q again will now result in
- condor_q
- -- Submitter: wireless48.cs.wisc.edu : <128.105.48.148:33012> : wireless48.cs.wi
- ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
- 7.0 epaulson 3/26 14:08 0+00:01:15 R 0 0.0 test
- 1 jobs; 0 idle, 1 running, 0 held
- Then, very shortly after that, the queue will be empty again, because the job has finished
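A script that waits for jobs to finish can read the summary line of the condor_q output. The following is a minimal sketch, assuming the summary has the form "N jobs; N idle, N running, N held" as in the sample output; the function name is hypothetical and other Condor versions may print a different summary.

```python
import re

# The semicolon after "jobs" is optional so that slightly different renderings still match.
SUMMARY = re.compile(r"(\d+) jobs;? (\d+) idle, (\d+) running, (\d+) held")


def parse_queue_summary(condor_q_output):
    """Return job counts from condor_q text, or None if no summary line is found."""
    match = SUMMARY.search(condor_q_output)
    if match is None:
        return None
    total, idle, running, held = (int(group) for group in match.groups())
    return {"jobs": total, "idle": idle, "running": running, "held": held}


sample = """ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
7.0 epaulson 3/26 14:08 0+00:01:15 R 0 0.0 test
1 jobs; 0 idle, 1 running, 0 held
"""

counts = parse_queue_summary(sample)
print(counts)
```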
29 Submitting a job to Condor-G: example 2
- Run the (prestaged) Unix ls program on a different Globus resource
- executable = /bin/ls
- Transfer_Executable = false
- globusscheduler = vulture.cs.wisc.edu/jobmanager
- universe = globus
- output = ls-test.out
- log = ls-test.log
- queue
- The executable is pre-staged. Being on the remote machine, there is no need to transfer it before execution.
- The required globusscheduler and universe commands are present.
- The command Transfer_Executable = FALSE identifies the executable as being pre-staged.
- In this case, the executable command gives the path to the executable on the remote machine.
30 Submitting a job to Condor-G: example 3
- Submit a Perl script to be run as a Condor job.
- The Perl script both lists and sets environment variables for a job.
- Save the following Perl script with the name env-test.pl, to be used as a Condor job executable
- #!/usr/bin/env perl
- foreach $key (sort keys(%ENV)) {
- print "$key = $ENV{$key}\n";
- }
- exit 0;
- Run the Unix command chmod 755 env-test.pl to make the Perl script executable.
- Create the following submit description file
- executable = env-test.pl
- globusscheduler = biron.cs.wisc.edu/jobmanager
- universe = globus
- environment = foo=bar; zot=qux
- output = env-test.out
- log = env-test.log
- queue
31 Submitting a job to Condor-G: example 3 cont.
- When the job has completed, the output file env-test.out should contain something like this
- GLOBUS_GRAM_JOB_CONTACT = https://biron.cs.wisc.edu:36213/30905/1020633947/
- GLOBUS_GRAM_MYJOB_CONTACT = URLx-nexus://biron.cs.wisc.edu:36214
- GLOBUS_LOCATION = /usr/local/globus
- GLOBUS_REMOTE_IO_URL = /home/epaulson/.globus/.gass_cache/globus_gass_cache_1020633948
- HOME = /home/epaulson
- LANG = en_US
- LOGNAME = epaulson
- X509_USER_PROXY = /home/epaulson/.globus/.gass_cache/globus_gass_cache_1020633951
- foo = bar
- zot = qux
32 Submitting a job to Condor-G: example 3 cont.
- Of particular interest is the GLOBUS_REMOTE_IO_URL environment variable
- Condor-G automatically starts up a GASS remote I/O server on the submitting machine.
- Because of the potential for either side of the connection to fail, the URL for the server cannot be passed directly to the job.
- Instead, it is put into a file, and the GLOBUS_REMOTE_IO_URL environment variable points to this file.
- Remote jobs can read this file and use the URL it contains to access the remote GASS server running inside Condor-G.
- If the location of the GASS server changes (for example, if Condor-G restarts), Condor-G will contact the Globus gatekeeper and update this file on the machine where the job is running.
- It is therefore important that all accesses to the remote GASS server check this file for the latest location.
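The "check the file on every access" rule can be sketched as a helper that never caches the URL. This is a sketch under stated assumptions: the function name is hypothetical, and the demonstration uses a temporary stand-in file and made-up URLs instead of a real Condor-G job environment.

```python
import os
import tempfile


def current_gass_url(environ):
    """Re-read the GASS server URL from the file named by GLOBUS_REMOTE_IO_URL."""
    with open(environ["GLOBUS_REMOTE_IO_URL"]) as handle:
        return handle.read().strip()


# Demonstration with a stand-in file and hypothetical URLs.
with tempfile.TemporaryDirectory() as workdir:
    url_file = os.path.join(workdir, "gass_url")
    env = {"GLOBUS_REMOTE_IO_URL": url_file}

    with open(url_file, "w") as handle:
        handle.write("https://submit.example.org:20001\n")
    first = current_gass_url(env)

    # Simulate Condor-G restarting and rewriting the file with a new location.
    with open(url_file, "w") as handle:
        handle.write("https://submit.example.org:20002\n")
    second = current_gass_url(env)

print(first, second)
```

In a real job, os.environ would be passed instead of the stand-in dictionary, and the helper would be called immediately before each transfer.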
33 Submitting a job to Condor-G: last example
- A Perl script that uses the GASS server in Condor-G to copy input files to the execute machine. (The remote job counts the number of lines in a file.)
- #!/usr/bin/env perl
- use FileHandle;
- use Cwd;
- STDOUT->autoflush();
- $gassUrl = `cat $ENV{GLOBUS_REMOTE_IO_URL}`;
- chomp $gassUrl;
- $ENV{LD_LIBRARY_PATH} = $ENV{GLOBUS_LOCATION} . "/lib";
- $urlCopy = $ENV{GLOBUS_LOCATION} . "/bin/globus-url-copy";
- # globus-url-copy needs a full pathname
- $pwd = getcwd();
- print "$urlCopy $gassUrl/etc/hosts file://$pwd/temporary.hosts\n\n";
- `$urlCopy $gassUrl/etc/hosts file://$pwd/temporary.hosts`;
- open(file, "temporary.hosts");
- while(<file>) { print $_; }
- exit 0;
34 Submitting a job to Condor-G: last example cont.
- The submit file
- executable = gass-example.pl
- globusscheduler = biron.cs.wisc.edu/jobmanager
- universe = globus
- output = gass.out
- log = gass.log
- queue
- There are two optional submit description file commands of note: x509userproxy and globusrsl.
- 1) The x509userproxy command specifies the path to an X.509 proxy, as
- x509userproxy = /path/to/proxy
- If this optional command is not present in the submit description file, then Condor-G checks the value of the environment variable X509_USER_PROXY for the location of the proxy.
- If this environment variable is not present, then Condor-G looks for the proxy in the file /tmp/x509up_u0000, where the trailing zeros in this file name are replaced with the Unix user id.
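The three-step lookup order for the proxy can be written as a small function. This sketch only mirrors the order described on the slide; it is not Condor-G's code, and the function name and the sample user id are hypothetical.

```python
def resolve_proxy_path(submit_commands, environ, uid):
    """Return the proxy path using the lookup order described above."""
    explicit = submit_commands.get("x509userproxy")
    if explicit:
        return explicit                   # 1. command in the submit description file
    from_env = environ.get("X509_USER_PROXY")
    if from_env:
        return from_env                   # 2. environment variable
    return f"/tmp/x509up_u{uid}"          # 3. default file named after the Unix user id


print(resolve_proxy_path({"x509userproxy": "/path/to/proxy"}, {}, 1234))
print(resolve_proxy_path({}, {"X509_USER_PROXY": "/home/user/proxy"}, 1234))
print(resolve_proxy_path({}, {}, 1234))
```

On a Unix system the real values would come from os.environ and os.getuid().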
35 Submitting a job to Condor-G: last example cont.
- 2) The globusrsl command is used to add additional attribute settings to a job's RSL string, as
- globusrsl = (name=value)(name=value)
- An example of this command in a submit description file
- globusrsl = (project=Test_Project)
- This example's attribute name for the additional RSL is project, and the value assigned is Test_Project.
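The value of the globusrsl command is a run of parenthesized name=value pairs, so it can be assembled from a dictionary. A minimal sketch, assuming simple attribute values that need no quoting or escaping; the helper name and the second attribute in the usage note are hypothetical.

```python
def format_globusrsl(attributes):
    """Join attributes into the (name=value)(name=value) form used by globusrsl."""
    return "".join(f"({name}={value})" for name, value in attributes.items())


rsl = format_globusrsl({"project": "Test_Project"})
print(f"globusrsl = {rsl}")
```

Passing more than one attribute, for example {"project": "Test_Project", "queue": "short"}, would yield (project=Test_Project)(queue=short); whether a given attribute is honored depends on the remote job manager.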
36 Limitations of Condor-G
- No checkpoints.
- No matchmaking.
- File transfer is limited. There are no file transfer mechanisms for files other than the executable, stdin, stdout, and stderr.
- No job exit codes. Job exit codes are not available.
- Limited platform availability. Condor-G is only available on Linux, Solaris, Digital UNIX, and IRIX. HP-UX support will hopefully be available later.
37 References
- Globus Project: www.globus.org
- Overviews of Grid computing
- Anatomy of the Grid: http://www-fp.globus.org/research/papers.html#anatomy
- Physiology of the Grid: http://www-fp.globus.org/research/papers.html#OGSA
- Older, extensive: The Grid: Blueprint for a New Computing Infrastructure, I. Foster and C. Kesselman (Eds), Morgan Kaufmann, 1999.
- Globus FAQ: http://www-fp.globus.org/about/faq/general.html
- Globus installation: http://www-fp.globus.org/gt2/admin/guide-verify.html
- Condor-G manual: http://www.cs.wisc.edu/condor/manual/v6.4/5_2Condor_G.html
- A topical school on Grid computing will be held in Vico Equense, Italy during the last two weeks of July, 2003. For details, send an email to grid-chool_at_ggf.org.
- Global Grid Forum: www.gridforum.org