Title: Resource Brokering: the EuroGrid/GRIP approach
1 Resource Brokering: the EuroGrid/GRIP approach
- Donal Fellows, John Brooke, Jon MacLaren
- E-Science NorthWest at the University of Manchester, UK
- http://www.esnw.ac.uk
2 Grid Interoperability
- In European and Japanese Grid projects there are two major middleware systems deployed: Globus (US) and Unicore (Europe/Japan).
- Globus is mainly deployed in cluster-based Grids, and Unicore in projects with complex heterogeneous architectures (e.g. specialist HPC architectures).
- The FP5 project GRIP began looking at how resource requests could be passed from Unicore to Globus. The FP6 project takes this work forward into the world of service-based architectures (e.g. OGSA).
3 Starting point - GRIP
- EU-funded FP5 project, part of the Information Society Technologies Programme (IST 2001-32257)
- http://www.grid-interoperability.org/
4 A Dual Job-Space
- Thus we have a space of requests, defined as a vector space of the computational needs of users over a Grid. For many jobs most of the entries in the vector will be null.
- We have another space of services, which can produce cost vectors for costing the user jobs (provided they can accommodate them).
- This is an example of a dual vector space.
- A strictly defined dual space is probably too rigid, but it can provide a basis for simulations.
- The abstract job requirements will need to be agreed. It may be a task for a broker to translate a job specification into a user job for a given Grid node.
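The pairing of a job vector with a cost vector can be sketched as follows. This is a minimal illustration, assuming a job is a sparse vector of resource quantities and a provider publishes a per-unit token price for each dimension it supports. The dimension names, the prices and the function name `cost_in_tokens` are illustrative assumptions, not part of the GRIP design.

```python
from typing import Dict, Optional

JobVector = Dict[str, float]   # resource dimension -> quantity requested
CostVector = Dict[str, float]  # resource dimension -> tokens per unit

def cost_in_tokens(job: JobVector, prices: CostVector) -> Optional[float]:
    """Pair a sparse job vector with a provider's cost vector.

    Returns None when the provider cannot accommodate the job, i.e. the
    job needs a dimension that the provider does not price.
    """
    total = 0.0
    for dimension, quantity in job.items():
        if quantity == 0:
            continue  # null entries contribute nothing
        if dimension not in prices:
            return None
        total += quantity * prices[dimension]
    return total

# Hypothetical job and providers, for illustration only.
job = {"cpu_hours": 48.0, "disk_gb": 100.0}
site_a = {"cpu_hours": 2.0, "disk_gb": 0.1, "memory_gb": 0.5}
site_b = {"cpu_hours": 1.5}  # prices no disk, so cannot take this job

print(cost_in_tokens(job, site_a))
print(cost_in_tokens(job, site_b))
```

Keeping both vectors sparse reflects the observation that most entries of a job vector are null. A strict linear pairing like this is the "too rigid" case, useful mainly for simulation.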
5 Dual Space
[Diagram. Labels: Job vector, Cost vector, Cost, User Job, Scalar cost in tokens; markers 1 and 2.]
6 Computational resource
- Computational jobs ask questions about the internal structure of the provider of computational power in a way that an electrically powered device does not.
- For example, do we require specific compilers, libraries, disk resource, visualization servers?
- What if it goes wrong: do we get compensation? If we transfer data and methods of analysis over the Internet, is it secure?
- A resource broker for high performance computation is of a different order of complexity from a broker for an electricity supplier.
7 EuroGrid Meteo-Grid
8 Resource Requestor and Provider Spaces
- Resource Requestor space (RR): expressed in terms of what the user wants, e.g. Relocatable Weather Model, 10^6 points, 24 hours, full topography.
- Resource Provider space (RP): e.g. 128 processors, Origin 3000 architecture, 40 Gigabytes memory, 1000 Gigabytes disk space, 100 Mb/s connection.
- We may even forward requests from one resource provider to another: recasting an O3000 job in terms of an IA64 cluster gives a different resource set.
- Linkage and staging of the different stages of a workflow require environmental support: a hosting environment.
- We can have multiple offers in RP space for the same RR values.
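One way to picture the two spaces is as two record types, with each provider supplying its own rule for recasting a request into a resource set. The sketch below is illustrative only: the class and function names are assumptions, the Origin 3000 figures come from the example above, and the IA64 figures are invented placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass(frozen=True)
class ResourceRequest:
    """A point in RR space: what the user wants."""
    model: str
    grid_points: int
    forecast_hours: int
    full_topography: bool

@dataclass(frozen=True)
class ResourceOffer:
    """A point in RP space: what a provider would allocate."""
    architecture: str
    processors: int
    memory_gb: int
    disk_gb: int

Recaster = Callable[[ResourceRequest], ResourceOffer]

def recast_for_origin3000(request: ResourceRequest) -> ResourceOffer:
    # Figures taken from the Origin 3000 example in the text.
    return ResourceOffer("Origin 3000", 128, 40, 1000)

def recast_for_ia64_cluster(request: ResourceRequest) -> ResourceOffer:
    # Invented placeholder figures; a real provider would derive them
    # from its own performance model of the application.
    return ResourceOffer("IA64 cluster", 256, 64, 1000)

def collect_offers(request: ResourceRequest,
                   providers: List[Recaster]) -> List[ResourceOffer]:
    """Gather every RP-space offer for a single RR-space request."""
    return [recast(request) for recast in providers]

request = ResourceRequest("Relocatable Weather Model", 10**6, 24, True)
offers = collect_offers(request, [recast_for_origin3000, recast_for_ia64_cluster])
for offer in offers:
    print(offer)
```

The same request yields one offer per provider, which is the "multiple offers in RP space for the same RR values" case.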
9 Abstract Functions for a resource broker
- Resource discovery, for workflows as well as single jobs.
- Resource capability checking: do the offering sites have ALL the necessary capability and environmental support for instantiating the workflow?
- Inclusion of Quality of Service policies in the offers.
- Information necessary for negotiation between client and provider, and mechanisms for ensuring contract compliance.
- Document submitted to the GPA-RG group of GGF.
10 Brokers as Virtual Organizations
[Diagram. Labels: Users, Virtual Organisation Brokers, Organization Firewalls, System Brokers, Compute Resources.]
11 Federated Brokering
12 Brokering and OGSA Services
13 Possible OGSA Broker
- Interoperating OGSA services
14 Site Configuration
[Diagram. Components: Gateway, Broker, NJS (x2), LRC (x2), IDB (x2), TSI (x2). Annotations:
- Users contact NJSes or Broker (for site-wide brokering)
- Delegate (site-wide brokering only)
- Potential to share (partial?) IDBs between NJSes (CSAR config?)
- TSI supplies dynamic data to IDB]
15 UoM Broker architecture
[Diagram. Label: To outside world.]
16 Broker functions
- A simple Resource Check request asks "Can this job run here?". It checks static qualities such as software resources (e.g. Gaussian98) as well as capacity resources such as quotas (disk quotas, CPU, etc.).
- A Quality of Service request returns a range of turnaround times and costs as part of a Ticket. If the Ticket is presented with the job (within the Ticket's lifetime), the turnaround and cost estimates should be met.
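The two request types can be sketched as below. This is a hedged illustration, not the broker's actual interface: the class names, field names, the one-hour default lifetime and the estimate formulas are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet, Optional, Tuple

@dataclass(frozen=True)
class Job:
    software: FrozenSet[str]    # packages the job needs
    disk_gb: float
    cpu_hours: float

@dataclass(frozen=True)
class Site:
    software: FrozenSet[str]    # packages installed at the site
    disk_quota_gb: float
    cpu_quota_hours: float
    tokens_per_cpu_hour: float  # hypothetical pricing rule

@dataclass(frozen=True)
class Ticket:
    turnaround_hours: Tuple[float, float]  # (best, worst) estimate
    cost_tokens: Tuple[float, float]       # (lowest, highest) estimate
    expires: datetime

    def is_valid(self, now: datetime) -> bool:
        """The estimates are only promised within the ticket lifetime."""
        return now <= self.expires

def resource_check(job: Job, site: Site) -> bool:
    """Can this job run here? Static software plus capacity quotas."""
    return (job.software <= site.software
            and job.disk_gb <= site.disk_quota_gb
            and job.cpu_hours <= site.cpu_quota_hours)

def quality_of_service(job: Job, site: Site, now: datetime,
                       lifetime: timedelta = timedelta(hours=1)) -> Optional[Ticket]:
    """Return a Ticket with estimate ranges, or None if the job cannot run."""
    if not resource_check(job, site):
        return None
    base_cost = job.cpu_hours * site.tokens_per_cpu_hour
    # Placeholder estimates; a real broker would consult the local scheduler.
    return Ticket(turnaround_hours=(job.cpu_hours, 2 * job.cpu_hours),
                  cost_tokens=(base_cost, 1.5 * base_cost),
                  expires=now + lifetime)

site = Site(frozenset({"Gaussian98"}), 500.0, 1000.0, 2.0)
job = Job(frozenset({"Gaussian98"}), 50.0, 10.0)
now = datetime(2004, 1, 1, 12, 0)
ticket = quality_of_service(job, site, now)
print(ticket)
```

Separating the cheap yes/no check from the ticketed estimate lets a client filter candidate sites before asking any of them for a commitment.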
17 Grid Resource Description Problem
- Two independent Grid systems:
- Unicore (http://www.unicore.org/)
- Globus (http://www.globus.org/)
- Both need to describe systems that run compute jobs
- Very different description languages:
- Unicore's Resource model, part of the AJO Framework
- Globus's GLUE Schema (DataTAG, iVDGL) for GT2 and GT3
- For interoperability, we want to take a Unicore job and run it on Globus resources
- Therefore, we need to translate the job's resource requirements between the two systems
18 Methodology for translation service
- Address data transformation issues for translating attributes
- Find a technology that has these characteristics:
- can model the two ontologies
- has support for linking abstract concepts to code fragments
- easily allows someone to update mappings
- is appropriate for a video conferencing setting
- writes modelling information to a file format that can be used by other applications
- Use the data files created by the application to run the translator service.
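The last step, running a translator from a mapping file, might look like the sketch below. It assumes the modelling tool exports its mappings as JSON, with each rule naming a source attribute, a target attribute and a conversion fragment. The file layout, the attribute names (`unicore.memory_mb`, `glue.memory_gb` and so on) and the converter names are hypothetical placeholders, not the real Unicore or GLUE attribute names.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

# Code fragments that mapping rules can refer to by name (assumed set).
CONVERTERS: Dict[str, Callable[[Any], Any]] = {
    "identity": lambda value: value,
    "mb_to_gb": lambda value: value / 1024,
}

# Stand-in for the data file written by the modelling application.
MAPPING_FILE = """
[
  {"source": "unicore.memory_mb", "target": "glue.memory_gb", "convert": "mb_to_gb"},
  {"source": "unicore.processors", "target": "glue.cpu_count", "convert": "identity"}
]
"""

def translate(requirements: Dict[str, Any],
              mappings: List[Dict[str, str]]) -> Tuple[Dict[str, Any], List[str]]:
    """Translate attributes using the rules; report those with no mapping."""
    rules = {rule["source"]: rule for rule in mappings}
    translated: Dict[str, Any] = {}
    untranslated: List[str] = []
    for name, value in requirements.items():
        rule = rules.get(name)
        if rule is None:
            untranslated.append(name)  # information lost in translation
            continue
        translated[rule["target"]] = CONVERTERS[rule["convert"]](value)
    return translated, untranslated

request = {"unicore.memory_mb": 2048, "unicore.processors": 16,
           "unicore.priority": "high"}
result, lost = translate(request, json.loads(MAPPING_FILE))
print(result)
print(lost)
```

Because the rules live in a data file, someone can update a mapping without touching the translator code. Returning the untranslated attributes makes it visible which parts of a request do not survive the crossing.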
19 Unicore Modelling Resources
20 GLUE Modelling resources
21 GLUE Marking up transcripts
22 GLUE Provenance Information
23 Compatible Concepts
24 Translation Service Prototype
25 Conclusions
- Interoperability of grid resource requests is at the heart of the abstract idea of a computational resource that can cross Grid domain boundaries.
- We wish to provide application users with seamless access to resources; they should not need to know details of the machines on which their jobs run.
- High-level abstractions do not yet exist as standards, so we have to create ontologies that can translate between differing modelling abstractions for Grid resources.
- Our current translations lose much information in crossing between current middleware systems (e.g. Globus and Unicore).
26 Continuation of interoperability research
- Research Centre Jülich (Project manager)
- Consorzio Interuniversitario per il Calcolo Automatico dell'Italia Nord Orientale
- Fujitsu Laboratories of Europe
- University of Warsaw
- Intel GmbH
- University of Manchester
- T-Systems SfR
http://www.unigrids.org
27 GLUE Container Classes
- GLUE has container classes that include "Computing Element", "Cluster", "Subcluster" and "Host".
- Under the heading "Representing Information", the GLUE document indicates that hosts are composed into sub-clusters, sub-clusters are grouped into clusters, and computing elements then refer to one or more clusters.
- These container objects may hold any number of optional auxiliary classes that actually describe the Grid features.
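The containment hierarchy can be sketched with nested records. This is an illustration of the structure only: the Python class and field names are assumptions and do not reproduce the GLUE schema's own attribute names, and the example hosts and their auxiliary values are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Host:
    name: str
    # Optional auxiliary classes, keyed by class name (illustrative layout).
    auxiliary: Dict[str, dict] = field(default_factory=dict)

@dataclass
class SubCluster:
    name: str
    hosts: List[Host] = field(default_factory=list)

@dataclass
class Cluster:
    name: str
    subclusters: List[SubCluster] = field(default_factory=list)

@dataclass
class ComputingElement:
    name: str
    clusters: List[Cluster] = field(default_factory=list)

def hosts_of(element: ComputingElement) -> List[Host]:
    """Walk computing element -> clusters -> sub-clusters -> hosts."""
    return [host
            for cluster in element.clusters
            for subcluster in cluster.subclusters
            for host in subcluster.hosts]

# Two hosts described by different sets of auxiliary classes.
node1 = Host("node1", {"MainMemory": {"ram_mb": 4096},
                       "OperatingSystem": {"name": "Linux"}})
node2 = Host("node2", {"Processor": {"clock_mhz": 900},
                       "OperatingSystem": {"name": "Linux"}})

element = ComputingElement(
    "ce1", [Cluster("cluster1", [SubCluster("sub1", [node1, node2])])])

for host in hosts_of(element):
    print(host.name, sorted(host.auxiliary))
```

Since the auxiliary classes are optional, two valid hosts need not share the same descriptive fields, which is one source of difficulty when mapping them to another resource model.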
28 GLUE Auxiliary Classes
- The documentation provides few details about the nature of a "Host" other than that it is a "physical computing element". Much of the meaning of "Host" has to be derived from what it might contain. Consider the following two valid definitions:
A Host is a physical computing element characterized by Main Memory, a Benchmark, a Network Adapter and an Operating System.
A Host is a physical computing element characterized by an Architecture, a Processor and an Operating System.
29 Map concepts between ontologies
- Unicore and GLUE have different philosophies for describing resources :-(
- In Unicore, resources are described in terms of resource requests.
- In GLUE, resources are described in terms of the availability of resources.
30 Local Brokering Configurations
[Diagram comparing two configurations: "Site-Wide Brokering" and "Normal EUROGRID/GRIP Brokering". Components shown: Client, Gateway, Broker, NJS, IDB, TSI/Host, R-GMA, GT3, Host.]
31 RR and RP Spaces