Title: FERMIGRID USE CASE
ReSS: Resource Selection Service for National
and Campus Grid Infrastructures
Parag Mhashilkar, Gabriele Garzoglio, Tanya Levshina, Steve Timm
{parag, garzoglio, tlevshin, timm}_at_fnal.gov
Computing Division, Fermilab, Batavia, IL 60510, USA
INTRODUCTION
The Open Science Grid (OSG) offers
access to hundreds of compute elements (CE) and
storage elements (SE) via standard Grid
interfaces. The Resource Selection Service (ReSS)
is a push-based workload management system that
is integrated with the OSG information systems
and resources. ReSS integrates standard Grid
tools such as Condor, as a brokering service, and
the gLite CEMon, for gathering and publishing
resource information, in Glue Schema format. ReSS
is used in OSG by Virtual Organizations (VO) such
as US CMS, Dark Energy Survey (DES), DZero, and the
Engagement VO. ReSS is also used as a Resource
Selection Service for Campus Grids, such as
FermiGrid. VOs use ReSS to automate the resource
selection in their workload management system to
run jobs over the grid. In the past year, the
system has been enhanced to enable publication
and selection of storage resources and of any
special software or software libraries (like MPI
libraries) installed at computing resources. In
this poster, we demonstrate the ReSS Service, its
architecture, its typical usage on the two scales
of a National Cyber Infrastructure Grid, such as
OSG, and of a campus grid, such as FermiGrid.
Additionally, we present workload management
system requirements from the coming era of LHC
data taking.
MAPPING GLUE SCHEMA TREE TO CONDOR CLASSADS
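[Figure: Glue Schema tree. A Site contains a Cluster and a Storage Element. The Cluster branch holds CE1, CE2, SubCluster1, and SubCluster2; the Storage Element branch holds StorageArea1 and StorageArea2. VO1, VO2, and VO3 appear as leaves under the CEs and the Storage Areas.]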
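Number of classads contributed by Computing Elements:
  $N_{\text{CE-classads}} = N_{\text{cluster}} \times N_{\text{SC}} \times N_{\text{CE}} \times N_{\text{VO}}$
Number of classads contributed by Storage Elements:
  $N_{\text{SE-classads}} = N_{\text{SE}} \times N_{\text{VO}} \times N_{\text{SA}}$
Total number of classads per Site:
  $N_{\text{Total-classads}} = N_{\text{CE-classads}} + N_{\text{SE-classads}}$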
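As a rough illustration of the counting formulas above, the following Python sketch computes the number of classads for one site. The function names and the example numbers are hypothetical, and the sketch assumes uniform counts (for example, every CE supports the same number of VOs), which is what the product form implies.

```python
def ce_classads(n_cluster: int, n_sc: int, n_ce: int, n_vo: int) -> int:
    """Classads contributed by Computing Elements: N_cluster x N_SC x N_CE x N_VO."""
    return n_cluster * n_sc * n_ce * n_vo


def se_classads(n_se: int, n_vo: int, n_sa: int) -> int:
    """Classads contributed by Storage Elements: N_SE x N_VO x N_SA."""
    return n_se * n_vo * n_sa


def total_classads(n_ce_classads: int, n_se_classads: int) -> int:
    """Total classads per site: sum of the CE and SE contributions."""
    return n_ce_classads + n_se_classads


# Hypothetical site: 1 cluster, 2 sub-clusters, 2 CEs with 2 VOs each,
# and 1 SE with 2 storage areas supporting 2 VOs each.
n_ce_ads = ce_classads(n_cluster=1, n_sc=2, n_ce=2, n_vo=2)
n_se_ads = se_classads(n_se=1, n_vo=2, n_sa=2)
n_total = total_classads(n_ce_ads, n_se_ads)
print(n_ce_ads, n_se_ads, n_total)
```

Because the count is a product, adding one sub-cluster or one supported VO multiplies the number of classads a site publishes rather than adding to it.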
Mapping Glue Schema V1.3 to Condor ClassAd
Classad Mapping Rules
GLUE SCHEMA V1.3 TO CLASSAD CONVERSION
- The mapping for Computing Elements is built by considering all the
possible combinations of inter-related Computing
Elements (CE), Clusters, Sub-clusters (SC), and
Virtual Organizations (VO). In other words, each
combination contains a single CE, Cluster, SC,
and VO.
- The mapping for Storage Elements is built by
considering all the possible combinations of
inter-related Storage Elements (SE), Storage
Areas (SA), and the VO supported by each SA. In
other words, each combination contains a single
SE, SA, and VO.
Attributes in each combination are
then mapped to a single old ClassAd.
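The mapping rules above can be sketched in Python as an enumeration of combinations. This is a minimal illustration, not the ReSS implementation: the `site` dictionary layout, the attribute keys, and the assignment of VOs to CEs and Storage Areas are all assumptions made for the example, and real classads carry Glue Schema attributes rather than these simplified names.

```python
from itertools import product

# Hypothetical site description; the VO assignments are assumed for illustration.
site = {
    "clusters": [
        {
            "name": "Cluster",
            "subclusters": [{"name": "SubCluster1"}, {"name": "SubCluster2"}],
            "ces": [
                {"name": "CE1", "vos": ["VO1", "VO2"]},
                {"name": "CE2", "vos": ["VO2", "VO3"]},
            ],
        }
    ],
    "storage_elements": [
        {
            "name": "StorageElement",
            "storage_areas": [
                {"name": "StorageArea1", "vos": ["VO1", "VO2"]},
                {"name": "StorageArea2", "vos": ["VO2", "VO3"]},
            ],
        }
    ],
}


def ce_combinations(site):
    """One flat record per (Cluster, Sub-cluster, CE, VO) combination."""
    ads = []
    for cluster in site["clusters"]:
        for sc, ce in product(cluster["subclusters"], cluster["ces"]):
            for vo in ce["vos"]:
                ads.append({
                    "Cluster": cluster["name"],
                    "SubCluster": sc["name"],
                    "CE": ce["name"],
                    "VO": vo,
                })
    return ads


def se_combinations(site):
    """One flat record per (SE, Storage Area, VO) combination."""
    ads = []
    for se in site["storage_elements"]:
        for sa in se["storage_areas"]:
            for vo in sa["vos"]:
                ads.append({"SE": se["name"], "SA": sa["name"], "VO": vo})
    return ads


ce_ads = ce_combinations(site)
se_ads = se_combinations(site)
print(len(ce_ads), len(se_ads))
```

Each dictionary stands in for one flat classad: the nested tree is unrolled so that a matchmaker can evaluate every record independently.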
ReSS ARCHITECTURAL DIAGRAM
[Figure: a JOB arrives at the ReSS Central Services, which answer the question "Which Gatekeeper?"]
STATUS AND FUTURE WORK
- Current Status
  - The ReSS system is deployed at two major Grid
    infrastructures, OSG and FermiGrid. The
    machines run the ReSS services at a low load
    (< 1).
  - FermiGrid: The ReSS publishing service (CEMon) is
    deployed on 7 campus clusters, advertising around
    3000 classads for a total of more than 17,250 job
    slots to OSG. The information gatherer runs on a
    virtual machine with 3 GB of RAM and 4 CPUs. The
    Condor matchmaker runs on a virtual machine with
    2 CPUs and 2 GB of RAM.
  - OSG: CEMon is deployed at about 75 sites,
    producing over 7000 classads. ReSS central
    services run on Xeon 3.2 GHz 4-CPU machines with
    4 GB of RAM.
- Future Work
  - Improve the quality of information published for
    SE as the Glue Schema evolves.
  - Support MPI use cases in addition to the existing
    ones.
  - Extend the support for Glue Schema V2 based on
    OSG's needs.
  - Improve security for resource registration.
  - Support ReSS deployment in High-Availability
    mode.
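[Figure: ReSS architecture, component labels. Central services: Information Gatherer and Condor Match Maker, exchanging Classads with a Condor Scheduler. Sites: Gatekeeper1 and Gatekeeper2, each in front of a CLUSTER and each running CEMon, GIP (supplying Info), and job-managers that run JOBS. Classads flow from CEMon at each gatekeeper to the Information Gatherer; the JOB is sent from the Condor Scheduler to Gatekeeper 2.]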
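The "Which Gatekeeper?" decision in the architecture diagram can be illustrated with a toy matchmaker. This is only a sketch under stated assumptions: it uses plain Python dictionaries and functions instead of the Condor ClassAd language, the attribute names (`Gatekeeper`, `FreeSlots`, `HasMPI`) and their values are hypothetical, and it checks only the job's requirements, whereas Condor matchmaking also evaluates the resource's requirements against the job.

```python
# Hypothetical resource records, standing in for classads gathered from CEMon.
resource_ads = [
    {"Gatekeeper": "Gatekeeper1", "VO": "VO1", "FreeSlots": 10, "HasMPI": False},
    {"Gatekeeper": "Gatekeeper2", "VO": "VO1", "FreeSlots": 40, "HasMPI": True},
    {"Gatekeeper": "Gatekeeper2", "VO": "VO2", "FreeSlots": 40, "HasMPI": True},
]

# Hypothetical job: needs MPI and at least one free slot, prefers more free slots.
job = {
    "VO": "VO1",
    "requirements": lambda ad: ad["HasMPI"] and ad["FreeSlots"] > 0,
    "rank": lambda ad: ad["FreeSlots"],
}


def match(job, ads):
    """Return the highest-ranked record that satisfies the job, or None."""
    candidates = [
        ad for ad in ads
        if ad["VO"] == job["VO"] and job["requirements"](ad)
    ]
    if not candidates:
        return None
    return max(candidates, key=job["rank"])


best = match(job, resource_ads)
print(best["Gatekeeper"] if best else "no match")
```

Separating the requirements (a yes/no filter) from the rank (a preference among the survivors) mirrors the general shape of classad matchmaking, which is why each site publishes one flat record per combination.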
CONCLUSION
The Resource Selection Service (ReSS)
provides cluster-level resource selection for the
Open Science Grid and the FermiGrid Campus Grid. The
system uses the Glue Schema model to describe
resources and the Condor classad format to
publish information. ReSS integrates the Condor
matchmaking service, for resource selection,
with gLite CEMon, for information gathering and
publishing. The system naturally interfaces with
the Condor-G scheduling system. ReSS is a
lightweight, scalable, and robust infrastructure
for resource selection in push-based job-handling
middleware.
ReSS DEPLOYMENT USE CASES FOR OSG AND FERMIGRID
FERMIGRID USE CASE
[Figure: FermiGrid use case, component labels. ReSS Central Services (OSG), with an Information Gatherer and a Condor Match Maker, receive Classads from FermiGrid and answer "Which Site?" for a JOB. Inside the FermiGrid Campus Grid, the JOB reaches the FermiGrid Gateway; FermiGrid ReSS (Condor Scheduler, Condor Match Maker, Information Gatherer) answers "Which Cluster?" using Classads from FERMIGRID CLUSTER 1 and FERMIGRID CLUSTER 2, and the JOB is sent to Cluster 1.]
REFERENCES
- The Resource Selection Home Page:
  https://twiki.grid.iu.edu/twiki/bin/view/ResourceSelection/WebHome
- Open Science Grid Home Page: http://opensciencegrid.org
- The GLUE schema home page: http://glueschema.forge.cnaf.infn.it
- G. Garzoglio, T. Levshina, P. Mhashilkar, S.
  Timm, "ReSS: a Resource Selection Service for the
  Open Science Grid", Proceedings of the
  International Symposium of Grid Computing
  (ISGC07), March 2007, Taipei, Taiwan