Title: Grid Technology A Web Services Globus OGSA
1Grid Technology A: Web Services, Globus, OGSA Grid Architecture
CERN Geneva, April 1-3 2003
- Geoffrey Fox
- Community Grids Lab
- Indiana University
- gcf_at_indiana.edu
2With Thanks to
- Tony Hey, my co-speaker
- I adapted presentations from
- Marlon Pierce
- Dennis Gannon
- Globus
- Malcolm Atkinson
- David de Roure
3Fermilab Experiments 1975-1980
Regge Theory 1978
Hadron Jets in 1977 Compared to Feynman Field (Fox) Model
[Figures: plots from experiments E350 and E260; labels include -t and 200 GeV hp]
4Caltech Hypercube
JPL Mark II 1985 Chuck Seitz 1983
Hypercube as a cube
5History New York Times 1984
- One of today's fastest computers is the Cray 1,
which can do 20 million to 80 million operations
a second. But at $5 million, they are expensive
and few scientists have the resources to tie one
up for days or weeks to solve a problem.
- "Poor old Cray and Cyber (another super
computer) don't have much of a chance of getting
any significant increase in speed," Fox said.
"Our ultimate machines are expected to be at
least 1,000 times faster than the current fastest
computers." (80 gigaflops predicted. Earth
Simulator is 40,000 gflops)
- But not everyone in the field is as impressed
with Caltech's Cosmic Cube as its inventors are.
The machine is nothing more nor less than 64
standard, off-the-shelf microprocessors wired
together, not much different than the innards of
64 IBM personal computers working as a unit.
- The Caltech Hypercube was just a cluster of
PCs!
6History New York Times 1984
- "We are using the same technology used in PCs
(personal computers) and Pacmans," Seitz said.
The technology is an 8086 microprocessor capable
of doing 1/20th of a million operations a second
with 1/8th of a megabyte of primary storage.
Sixty-four of them together will do 3 million
operations a second with 8 megabytes of storage.
- Computer scientists have known how to make such a
computer for years but have thought it too
pedestrian to bother with.
- "It could have been done many years ago," said
Jack B. Dennis, a computer scientist at the
Massachusetts Institute of Technology who is
working on a more radical and ambitious approach
to parallel processing than Seitz and Fox.
- "There's nothing particularly difficult about
putting together 64 of these processors," he
said. "But many people don't see that sort of
machine as on the path to a profitable result."
- So clusters are a trivial architecture (1984)
- So architecture is unfortunately unchanged
after 20 years of research; the programming model
is also the same (message passing)
7What is a Grid I?
- Collaborative Environment (Ch2.2,18)
- Combining powerful resources, federated computing
and a security structure (Ch38.2)
- Coordinated resource sharing and problem solving
in dynamic multi-institutional virtual
organizations (Ch6)
- Data Grids as Managed Distributed Systems for
Global Virtual Organizations (Ch39)
- Distributed Computing or distributed systems
(Ch2.2,10)
- Enabling Scalable Virtual Organizations (Ch6)
- Enabling use of enterprise-wide systems, and
someday nationwide systems, that consist of
workstations, vector supercomputers, and parallel
supercomputers connected by local and wide area
networks. Users will be presented the illusion of
a single, very powerful computer, rather than a
collection of disparate machines. The system will
schedule application components on processors,
manage data transfer, and provide communication
and synchronization in such a manner as to
dramatically improve application performance.
Further, boundaries between computers will be
invisible, as will the location of data and the
failure of processors. (Ch10)
8What is a Grid II?
- Supporting e-Science representing increasing
global collaborations of people and of shared
resources that will be needed to solve the new
problems of Science and Engineering (Ch36)
- As infrastructure that will provide us with the
ability to dynamically link together resources as
an ensemble to support the execution of
large-scale, resource-intensive, and distributed
applications. (Ch1)
- Makes high-performance computers superfluous
(Ch6)
- Metasystems or metacomputing systems (Ch10,37)
- Middleware as the services needed to support a
common set of applications in a distributed
network environment (Ch6)
- Next Generation Internet (Ch6)
- Peer-to-peer Network (Ch10, 18)
- Realizing thirty year dream of science fiction
writers that have spun yarns featuring worldwide
networks of interconnected computers that behave
as a single entity. (Ch10)
9What is Grid Technology?
- Grids support distributed collaboratories or
virtual organizations integrating concepts from
- The Web
- Distributed Objects (CORBA, Java/Jini, COM)
- Globus, Legion, Condor, NetSolve, Ninf and other
High Performance Computing activities
- Peer-to-peer Networks
- With perhaps the Web being the most important for
Information Grids and Globus for Compute Grids
- Use Information Grids and not usual Data Grids, as
distributed file systems (holding lots of
data!) are handled in Compute Grids
10PPPH Paradigms Protocols Platforms and Hosting I
- We will start from the Web view and assert that
the basic paradigm is
- Meta-data rich Web Services communicating via
messages
- These have some basic support from some runtime
such as .NET, Jini (pure Java), Apache
Tomcat-Axis (Web Service toolkit), Enterprise
JavaBeans, WebSphere (IBM) or GT3 (Globus Toolkit
3)
- These are the distributed equivalent of operating
system functions as in UNIX Shell
- Called Hosting Environment or platform
11Some Basic Observations
- Grids manage and share asynchronous resources in
a rather centralized fashion
- Peer-to-peer networks are just like Grids with
different implementations of services like
registration and look-up
- Web Services interact with messages
- Everything (including applications like
PowerPoint) will be a WS? See later short
discussion
- Computers are fast and getting faster. One can
afford many strategies that used to be
unrealistic
- All messages can be publish/subscribe
- Software message routing
- XML will be used for most interesting data and
meta-data
- One will store/consider data and meta-data
separately but often use the same technology to
manage both of them.
- Need Synchronous and Asynchronous Resource
Sharing
- Integrate Grid and Collaboration technology
12Classic Grid Architecture
Resources
Content Access
Composition
Middle Tier: Brokers, Service Providers
Netsolve
Security
Collaboration
Computing
Middle Tier becomes Web Services
Clients
Users and Devices
13What is a Web Service I
- A web service is a computer program running on
either the local or remote machine with a set of
well defined interfaces (ports) specified in XML
(WSDL)
- In principle, the computer program can be in any
language (Fortran .. Java .. Perl .. Python) and
the interfaces can be implemented in any way
whatsoever
- Interfaces can be method calls, Java RMI
messages, CGI Web invocations, or totally compiled
away (inlining), but
- The simplest implementations involve XML messages
(SOAP) and programs written in net friendly
languages like Java and Python
- Web Services separate the meaning of a port
(message) interface from its implementation
- Enhances/Enables Re-usable component model of ANY
electronic resource
14Raw Data
RawResources
Raw Data
(Virtual) XML Data Interface
WS
WS
etc.
XML WS to WS Interfaces
(Virtual) XML Knowledge (User) Interface
Render to XML Display Format
(Virtual) XML Rendering Interface
Clients
15What is a Web Service II
- Web Services have the important implication that
ALL interfaces are XML message based. In contrast
- Most Windows programs have interfaces defined as
interrupts due to user inputs
- Most software has interfaces defined as methods
which might be implemented as a message but this
is often NOT explicit
16What is a Web Service III
- Everything electronic is a resource
- Computers, Programs, People
- Data (from sensors to this presentation to email
to databases)
- Everything electronic is a distributed object
- All resources have interfaces which are defined
in XML for both properties (data-structure) and
methods (service, function, subroutine)
(Resources are Services)
- We can assume that a data-structure property has
getproperty() and setproperty(value) methods to
act as interface
- All resources are linked by messages with
structure, which must be specifiable in XML
- All resources have a URI such as unique://a/b/c
17WSDL Abstractions
- WSDL abstracts a program as an entity that does
something given one or more inputs with its
results defined by streams on one or more
outputs.
- Functions are defined by method name and
parameters: methodname(parm1, parm2, ..., parmN)
- Where parameters are Input, Output or both
- In WSDL, we will have a Web Service which, like a
Java or CORBA program, can be thought of as a
(distributed) object with many methods
- Instead of a function call, the calling routine
sends an XML message to the Web Service
specifying methodname and values of the
parameters
- Note name of function is just another parameter
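The point that a function call becomes an XML message, with the method name travelling as just another field, can be sketched in a few lines of Python. This is a minimal illustration and not SOAP: the element names `call`, `method` and `param` are invented for this example, and all values are carried as strings.

```python
import xml.etree.ElementTree as ET

def call_to_message(methodname, **params):
    """Encode methodname(parm1, ..., parmN) as an XML message (illustrative element names)."""
    root = ET.Element("call")
    # The method name is carried in the message like any other parameter.
    ET.SubElement(root, "method").text = methodname
    for name, value in params.items():
        p = ET.SubElement(root, "param", name=name)
        p.text = str(value)
    return ET.tostring(root, encoding="unicode")

def message_to_call(xml_text):
    """Decode the message on the service side into (methodname, {param: value})."""
    root = ET.fromstring(xml_text)
    method = root.findtext("method")
    params = {p.get("name"): p.text for p in root.findall("param")}
    return method, params

msg = call_to_message("execLocalCommand", in0="ls")
print(msg)
print(message_to_call(msg))
```

The receiving side can dispatch on the decoded method name however it likes, which is what separates the interface from its implementation.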
18Details of WSDL Protocol Stack
- UDDI finds where programs are
- remote (distributed) programs are just Web
Services
- (not a great success)
- WSFL links programs together (under revision as
BPEL4WS)
- WSDL defines interface (methods, parameters, data
formats)
- SOAP defines structure of message including
serialization of information
- HTTP is negotiation/transport protocol
- TCP/IP is layers 3-4 of OSI
- Physical Network is layer 1 of OSI
19Education as a Web Service
- Can link to Science as a Web Service and
substitute educational modules
- Learning Object XML standards already exist
from IMS/ADL (http://www.adlnet.org); need to
update architecture
- Web Services for virtual university include
- Registration
- Performance (grading)
- Authoring of Curriculum
- Online laboratories for real and virtual
instruments
- Homework submission
- Quizzes of various types (multiple choice, random
parameters)
- Assessment data access and analysis
- Synchronous Delivery of Curricula
- Scheduling of courses and mentoring sessions
- Asynchronous access, data-mining and knowledge
discovery
- Learning Plan agents to guide students and
teachers
20What are System and Application Services?
- There are generic Grid system services: security,
collaboration, persistent storage, universal
access
- OGSA (Open Grid Service Architecture) is
implementing these as extended Web Services
- An Application Web Service is a capability used
either by another service or by a user
- It has input and output ports; data is from
sensors or other services
- Consider Satellite-based Sensor Operations as a
Web Service
- Satellite management (with a web front end)
- Each tracking station is a service
- Image Processing is a pipeline of filters which
can be grouped into different services
- Data storage is an important system service
- Big services built hierarchically from basic
services
- Portals are the user (web browser) interfaces to
Web services
21Application Web Services
- Note Service model integrates sensors, sensor
analysis, simulations and people
- An Application Web Service is a capability used
either by another service or by a user
- It has input and output ports; data is from
users, sensors or other services
- Big services built hierarchically from basic
services
22The Application Service Model
- As bandwidth of communication between services
increases one can support smaller services
- A service is a component and is a replacement
for a library in cases where performance allows
- Services (components) are a sustainable model of
software development: each service has
documented capability with standards compliant
interfaces
- XML defines interfaces at several levels
- WSDL at Service interface level and XSIL or
equivalent for scientific data format
- A service can be written as Perl, Python, Java
Servlet, Enterprise JavaBean, CORBA (C or
Fortran) Object
- Communication protocol can be RMI (Java), IIOP
(CORBA) or SOAP (HTTP, XML)
23Application with W3C DOM Structure as a Web
Service
Data
Resource Facing Ports
Application as a Web service Application Model
Remaining W3C DOM Semantic Events
MVCM Model
Control
User FacingPorts
View
CControl
Events as Messages
Rendering as Messages
Application Viewand SelectedControl
V View
24 7 Primitives in WSDL
- types, which provides data type definitions used
to describe the messages exchanged.
- message, which represents an abstract definition
of the data being transmitted. A message consists
of logical parts, each of which is associated
with a definition within some type system.
- operation, an abstract description of an action
supported by the service.
- portType, which is a set of abstract operations.
Each operation refers to an input message and
output messages.
- binding, which specifies concrete protocol and
data format specifications for the operations and
messages defined by a particular portType.
- port, which specifies an address for a binding,
thus defining a single communication endpoint.
- service, which is used to aggregate a set of
related ports
26
<?xml version="1.0" encoding="UTF-8"?>
<wsdl:definitions>
  <wsdl:message name="execLocalCommandResponse">
  <wsdl:message name="execLocalCommandRequest">
  <wsdl:portType name="SJwsImp">
    <wsdl:operation name="execLocalCommand" parameterOrder="in0">
      <wsdl:input message="impl:execLocalCommandRequest" name="execLocalCommandRequest"/>
      <wsdl:output message="impl:execLocalCommandResponse" name="execLocalCommandResponse"/>
    </wsdl:operation>
  </wsdl:portType>
  <wsdl:binding name="SubmitjobSoapBinding" type="impl:SJwsImp">
    <wsdlsoap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
    <wsdl:operation name="execLocalCommand">
      <wsdlsoap:operation soapAction=""/>
      <wsdl:input name="execLocalCommandRequest">
      <wsdl:output name="execLocalCommandResponse">
    </wsdl:operation>
  </wsdl:binding>
  <wsdl:service name="SJwsImpService">
    <wsdl:port binding="impl:SubmitjobSoapBinding" name="Submitjob">
  </wsdl:service>
</wsdl:definitions>
27Discussion of 7 WSDL Primitives
- types specify data-structures which are
equivalent to arguments of methods
- message specifies collections of types and is
equivalent to set of arguments in a method call.
Note that it is an abstract method in Java
terminology
- operation is a collection of input, output and
fault messages; there are 4 types of operation:
one-way (service just receives a message),
request-response (RPC), solicit-response,
notification (service pushes out a message)
- portType represents a single channel that can
support multiple operations. It is abstract as it
is specified as a set of operations. It is
equivalent to an interface or abstract class in
Java
- binding tells you transport and message format
for a portType (which can have multiple bindings
to reflect, say, performance-portability
trade-offs)
- port combines a binding and an endpoint network
address (URL) and is like a class instance
- service consists of multiple ports and is
equivalent to a program in Java
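One way to keep the seven primitives straight is to model them as plain Python data classes. The sketch below only illustrates how the pieces refer to each other; it is not a WSDL parser. The class and field names are this example's own, "types" are reduced to type-name strings, the part types (`xsd:string`) and the endpoint address are made-up placeholders, and the other names echo the execLocalCommand WSDL shown earlier.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Message:                      # abstract set of typed parts (the arguments)
    name: str
    parts: Dict[str, str] = field(default_factory=dict)   # part name -> type name

@dataclass
class Operation:                    # pattern follows from which messages exist, and their order
    name: str
    input: Optional[Message] = None
    output: Optional[Message] = None
    input_first: bool = True

    def pattern(self) -> str:
        if self.input is not None and self.output is not None:
            return "request-response" if self.input_first else "solicit-response"
        if self.input is not None:
            return "one-way"
        if self.output is not None:
            return "notification"
        raise ValueError("operation needs at least one message")

@dataclass
class PortType:                     # abstract interface: a named set of operations
    name: str
    operations: List[Operation]

@dataclass
class Binding:                      # concrete transport and message format for a portType
    name: str
    port_type: PortType
    transport: str
    style: str

@dataclass
class Port:                         # a binding plus a network address
    name: str
    binding: Binding
    address: str

@dataclass
class Service:                      # aggregates related ports
    name: str
    ports: List[Port]

req = Message("execLocalCommandRequest", {"in0": "xsd:string"})        # part type assumed
resp = Message("execLocalCommandResponse", {"return": "xsd:string"})   # part assumed
op = Operation("execLocalCommand", input=req, output=resp)
port_type = PortType("SJwsImp", [op])
binding = Binding("SubmitjobSoapBinding", port_type,
                  transport="http://schemas.xmlsoap.org/soap/http", style="rpc")
port = Port("Submitjob", binding, address="http://example.org/services/Submitjob")  # hypothetical
service = Service("SJwsImpService", [port])
print(op.pattern())
```

Note how the abstract half (message, operation, portType) never mentions a transport, while binding and port add the concrete details.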
28OGSA OGSI Hosting Environments
- Start with Web Services in a hosting environment
- Add OGSI to get a Grid service and a component
model
- Add OGSA to get Interoperable Grid, correcting
differences in base platform and adding key
functionalities
29Functional Level above OGSA
- Systems Management and Automation
- Workload / Performance Management
- Security
- Availability / Service Management
- Logical Resource Management
- Clustering Services
- Connectivity Management
- Physical Resource Management
- Perhaps Data Access belongs here
30Two-level Programming I
- The paradigm implicitly assumes a two-level
Programming Model
- We make a Service (same as a distributed object
or computer program running on a remote
computer) using conventional technologies
- C, Java or Fortran Monte Carlo module
- Data streaming from a sensor or Satellite
- Specialized (JDBC) database access
- Such nuggets accept and produce data from users,
files and databases
- The Grid is built by coordinating such nuggets,
assuming we have solved the problem of programming
the nugget
31Two-level Programming II
- The Grid is discussing the linkage and
distribution of the nuggets, with the
only addition being runtime interfaces to Grid as
opposed to UNIX data streams
- Familiar from use of UNIX Shell, PERL or Python
scripts to produce real applications from core
programs
- Such interpretative environments are the single
processor analog of Grid Programming
- Some projects like GrADS from Rice University are
looking at integration between nugget levels but
the dominant effort looks at each level separately
32Why we can dream of using HTTP and that slow stuff
- We have at least three tiers in computing
environment
- Client (user portal discussed Thursday)
- Middle Tier (Web Servers/brokers)
- Back end (databases, files, computers etc.)
- In Grid programming, we use HTTP (and used to use
CORBA and Java RMI) in middle tier ONLY to
manipulate a proxy for real job
- Proxy holds metadata
- Control communication in middle tier only uses
metadata
- Real (data transfer) high performance
communication in back end
33User Services
Grid Computing Environments
Core Grid
34OGSA OGSI Hosting Environments
- Start with Web Services in a hosting environment
- Add OGSI to get a Grid service and a component
model
- Add OGSA to get Interoperable Grid, correcting
differences in base platform and adding key
functionalities
35PPPH Paradigms Protocols Platforms and Hosting II
- Self-describing programs/interfaces are key to
scaling
- Minimize amount of work system has to do
- Hide as much as possible in services and
applications
- Protocols describe (in principle at least)
those rules that system obeys and uses to deliver
information between services (processes)
- Interfaces tell the service what to do to
interpret the results of communication
- HTTP is the dominant transport protocol of the
Web
- HTML is the interface telling browser how to
render
- But you can extend interface to allow PDF,
multimedia, PowerPoint using helper
applications which are (with more or less
convenience) automatically downloaded
if not already available
- Mime types essentially self-describe each
interface
36Analogy with Web II
- HTTP and HTML are the analogies on the client
side
- A Web Service generalizes a CGI Script on
server side
- CGI is essentially a Distributed Object
technology allowing server to access an arbitrary
program labeled by a URL plus an ugly syntax to
specify name and parameters of program to run
- Roughly WSDL (Web Service Description Language)
is a better way to specify program name and its
parameters
- Web uses other protocols: HTTPS for secure links
and RTP etc. for multimedia (UDP) streams
- These again are required to integrate system;
codecs like MPEG are interfaces interpreted by
client
- There are further protocols like H323 and SIP
which will be replaced (IMHO) by HTTP plus RTP etc.
We should minimize number of protocols to get
maintainable systems
37PPPH Paradigms Protocols Platforms and Hosting
III
- There is a set of system capabilities which cannot
be captured as standalone services and permeate
Grid
- Meta-data rich Message-linked Web Services is
permeating paradigm
- Component Model such as Enterprise JavaBean
(EJB) or OGSI describes the formal structure of
services; EJB, if used, lives inside OGSI in our
Grids
- Invocation Framework describes how you interact
with system
- Security in fine grain fashion to provide
selective authorization (Globus and EDG WP6)
- Policy context describes rules for this
particular Grid
- Transport mechanisms abstract concepts like ports
and Quality of Service
- Messaging abstracts destination and customization
of content
- Network (monitoring, performance) EDG WP7
- Fabric (resources) EDG WP4
38Architecture in Pictures I
Invocation Framework
39Architecture in Pictures II: OGSA Interoperable Grid
40Architecture in Pictures III: OGSA Federated Grid
Mediation Service converting between OGSA and
native services
Mediation Service
41Virtualization
- The Grid could and sometimes does virtualize
various concepts
- Location: URI (Universal Resource Identifier)
virtualizes URL
- Replica management (caching) virtualizes file
location, generalized by GriPhyn virtual data
concept
- Protocol: message transport and WSDL bindings
virtualize transport protocol as a QoS request
- P2P or Publish-subscribe messaging virtualizes
matching of source and destination services
- Semantic Grid virtualizes Knowledge as a
meta-data query
- Brokering virtualizes resource allocation
- Virtualization implies references can be indirect
42IFS Interfaces and Functionality and Semantics I
- The Grid platform tries to minimize detail in
protocols and maximize detail in interfaces to
enhance scaling
- However rich meta-data and semantics are critical
for correct and interesting operation
- Put as much semantic interpretation as you can
into specific services
- Lack of Semantic interoperation is in fact the main
weakness of today's Grids and Web services
- Everything becomes a service (see example of
education) whether system or application level
- There are some very important Global Services
- Discovery (look up) and Registration of service
metadata
- Workflow
- MetaSchedulers
43IFS Interfaces and Functionality and Semantics II
- There are many other generally important services
- OGSA-DAI: The Database Service
- Portal Service linked to by WSRP (Web services
for Remote Portals)
- Notification of events
- Job submission
- Provenance: interpret meta-data about history of
data
- File Interfaces
- Sensor service: satellites
- Visualization
- Basic brokering/scheduling
44Globus in a Nutshell from IPG
- GT2 (or Globus Toolkit 2) is original (non web
service based) version which is basis of EDG
(European Data Grid) work
- C programs and libraries
- See Chapter 5 of book with background in chapters
2-4 and 37
- http://www.ipg.nasa.gov/ipgusers/globus/
- http://www.globusworld.org/globusworld_web/jw2_program_tut.htm
45Globus GT2 from IPG
- The goal of the Globus GT2 is to provide
dependable, consistent, pervasive access to
high-end resources.
- This was the original Grid goal, generalized
recently to virtual organizations and data grids
- The Globus Project offers the most widely used
computing grid middleware. The Globus Project is
a joint effort of Argonne National Laboratory and
the Information Sciences Institute of the
University of Southern California, in
collaboration with numerous other organizations
including NCSA, NPACI, UCSD, and NASA. See
http://www.globus.org/ for history, goals,
release and usage notes, software distributions,
and research papers.
46Globus GT2 II
- Grid Fabric, Layer One: The fabric of the Grid
comprises the underlying systems, computers,
operating systems, networks, storage systems, and
routers: the building blocks.
- Grid Services, Layer Two: Grid services integrate
the components of the Grid fabric. Examples of
the services that are provided by Globus Toolkit
2:
- GRAM: The Globus Resource Allocation Manager,
GRAM, is a basic library service that provides
capabilities to do remote-submission job start
up. GRAM unites Grid machines, providing a common
user interface so that you can submit a job to
multiple machines on the Grid fabric. GRAM is a
general, ubiquitous service, with specific
application toolkit commands built on top of it
- MDS: The Monitoring and Discovery Service, also
known as GIS, the Grid Information Service,
provides information service. You query MDS to
discover the properties of the machines,
computers and networks that you want to use: how
many processors are available at this moment?
What bandwidth is provided? Is the storage on
tape or disk? Is the visualization device an
immersive desk or CAVE? Using an LDAP
(Lightweight Directory Access Protocol) server,
MDS provides middleware information in a common
interface to put a unifying picture on top of
disparate equipment.
- Contd
47Globus GT2 III
- GSI: gss-api library for adding authentication to
a program. GSI provides programs, such as
grid-proxy-init, to facilitate login to a variety
of sites, while each site has its own flavor of
security measures. That is, on the fabric layer,
the various machines you want to use might be
governed by disparate security policies; GSI
provides a means of simplifying multiple remote
logins. The standard installation is based on a
PKI security system; the Kerberos installation of
Globus is less standard. (Some installations with
DoE and DoD insist on Kerberos)
- GridFTP: A new (in Globus 2.0) protocol for file
transfer over a grid. This is a Global Grid Forum
standard
- GASS: Globus Access to Secondary Storage provides
command-line tools and C APIs for remotely
accessing data. GASS integrates GridFTP, HTTP,
and local file I/O to enable secure transfers
using any combination of these protocols.
48Globus GT2 IV
- Application Toolkits, Layer Three: Application
toolkits use Grid Services to provide
higher-level capabilities, often targeted to
specific classes of application.
- For example, the Globus development team has
created a set of Grid service tools and a
toolkit of programs for running remotely
distributed jobs. These include remote job
submission commands (globusrun,
globus-job-submit, globus-job-run), built on top
of the GRAM service, and MPICH-G2, a Grid-enabled
implementation of the Message Passing Interface
(MPI).
- A more modern interface is through CoG Kits
(Commodity Grid) to different languages (Perl,
Python, Java); see chapter 26 of Book
- The Java CoG kit provides a natural way to link
GT2 to a Web service framework
- Globus Toolkit 3 (GT3) effectively integrated CoG
Kit interface with core Globus by wrapping all
Globus Services as Web services
49Job Submission in Globus
- Very similar to UNIX Shell; build Portal Web
Interfaces to specific or general Shell commands.
Some example commands:
- globusrun: Runs a single executable on a remote
site with an RSL specification.
- globus-job-cancel: Cancels a job previously
started using globus-job-submit.
- globus-job-run: Allows you to run a job at one or
several remote resources. It translates the
program arguments to an RSL request and uses
globusrun to submit the job.
- globus-job-clean: Kills the job if it is still
running and cleans the information concerning the
job.
- globus-job-status: Displays the status of the job.
See also globus-get-output to check the standard
output or standard error of your job.
- These are all controlled by metadata specified by
the Globus Resource Specification Language (RSL)
which provides a common language to describe jobs
and the resources required to run them.
- http://www.globus.org/gram/gram_rsl_parameters.html
- The simplest RSL expression looks something like
the following: (executable=/bin/ls)
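Since RSL is just attribute/value metadata, a portal can generate it from a dictionary. The helper below is a hypothetical sketch and not part of the Globus Toolkit: it emits only the simple conjunction form `&(attribute=value)...`, and its quoting rule (double quotes around values containing whitespace) is a simplification; real RSL has further operators and quoting rules that are not handled here.

```python
def to_rsl(attrs):
    """Render a dict as an RSL conjunction, e.g. &(executable=/bin/ls)(count=2)."""
    relations = []
    for name, value in attrs.items():
        text = str(value)
        # Simplified quoting (assumption): quote values that contain whitespace.
        if any(ch.isspace() for ch in text):
            text = '"' + text + '"'
        relations.append(f"({name}={text})")
    return "&" + "".join(relations)

print(to_rsl({"executable": "/bin/ls"}))
print(to_rsl({"executable": "/bin/ls", "arguments": "-l", "count": 2}))
```

A portal form would then hand the resulting string to a submission command such as globusrun rather than asking users to write RSL by hand.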
50Virtual Data Toolkit VDT from GriPhyn
- http://www.lsc-group.phys.uwm.edu/vdt/
- Trillium (PPDG from DoE, GriPhyn and iVDgL from
NSF) is major US effort building Grid application
software with a strong particle physics emphasis
- VDT is their major software release and its heart
is Condor and GT2.
- There is some virtual data software as well but
not clear if this is of interest in production
use (interesting research area)
- Condor (Chapter 11 of Book) is powerful job
scheduler for clusters and cycle scavenging
- It has a well developed interface (ClassAds) for
defining requirements of jobs and matching to
compute capabilities
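The matchmaking idea behind ClassAds can be illustrated without Condor itself. In this sketch, jobs and machines are plain dictionaries and each side carries a `requirements` predicate over the other. It mimics the two-sided matching concept only; it does not use ClassAd syntax or any Condor API, and every attribute name is invented.

```python
def matches(job, machine):
    """Two-sided match: each ad's requirements must accept the other ad."""
    return job["requirements"](machine) and machine["requirements"](job)

machines = [
    {"name": "node1", "memory_mb": 512, "arch": "x86",
     "requirements": lambda job: job["owner"] != "guest"},
    {"name": "node2", "memory_mb": 2048, "arch": "x86",
     "requirements": lambda job: True},
]

job = {"owner": "alice", "image_mb": 1024,
       "requirements": lambda m: m["arch"] == "x86" and m["memory_mb"] >= 1024}

# Only machines that satisfy the job AND accept the job are candidates.
candidates = [m["name"] for m in machines if matches(job, m)]
print(candidates)
```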
51OGSA/OGSI Top Level View
Chapters 7 to 9 of Book
http://www.gridforum.org/Meetings/ggf7/docs/default.htm
http://www.globusworld.org/globusworld_web/jw2_program_tut.htm
- OGSA is the set of core Grid services
- Stuff you can't live without
- If you built a Grid you would need to invent
these things
52OGSI Open Grid Service Interface
- http://www.gridforum.org/ogsi-wg
- It is a component model for web services.
- It defines a set of behavior patterns that each
OGSI service must exhibit.
- Every Grid Service portType extends a common
base type.
- Defines an introspection model for the service
- You can query it (in a standard way) to discover
- What methods/messages a port understands
- What other port types the service provides
- If the service is stateful, what the current
state is
- A set of standard portTypes for
- Message subscription and notification
- Service collections
- Each service is identified by a URI called the
Grid Service Handle (GSH)
- GSHs are bound dynamically to Grid Service
References (GSRs, typically WSDL docs)
- A GSR may be transient. GSHs are fixed.
- Handle map services translate GSHs into GSRs.
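The split between a fixed handle and a transient reference can be sketched as a small lookup service. This illustrates the idea only: the `HandleMap` class, the `gsh://` handle string and the reference URLs are hypothetical and not taken from the OGSI specification.

```python
class HandleMap:
    """Maps fixed Grid Service Handles (GSH) to current Grid Service References (GSR)."""

    def __init__(self):
        self._refs = {}

    def bind(self, gsh, gsr):
        # Rebinding is allowed: the handle stays fixed while the reference may change.
        self._refs[gsh] = gsr

    def resolve(self, gsh):
        try:
            return self._refs[gsh]
        except KeyError:
            raise LookupError(f"no reference bound for handle {gsh}") from None

hm = HandleMap()
handle = "gsh://example.org/services/job/42"               # hypothetical handle
hm.bind(handle, "http://hostA.example.org/job42.wsdl")
first = hm.resolve(handle)
hm.bind(handle, "http://hostB.example.org/job42.wsdl")     # the service has moved
second = hm.resolve(handle)
print(first, second)
```

Clients that store only the handle keep working after the move, which is the point of the indirection.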
53OGSI and Stateful Services
- Sometimes you can send a message to a service,
get a result and that's the end
- This is a statefree service
- However most non-trivial services need state to
allow persistent asynchronous interactions
- OGSI is designed to support Stateful services
through two mechanisms
- Information Port where you can query for SDE
(Service Data Elements)
- Factories that allow one to view a Service as a
class (in an object-oriented language sense)
and create separate instances for each Service
invocation
- There are several interesting issues here
- Difference between Stateful interactions and
Stateful services
- System or Service managed instances
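A factory that creates a separate stateful instance per invocation, each with state that can be queried, might look like the following in outline. Names such as `CounterService`, `Factory.create` and `query_service_data` are invented for illustration and do not correspond to actual OGSI portType operations.

```python
import itertools

class CounterService:
    """A stateful service instance: its state outlives any single message."""

    def __init__(self, instance_id):
        self.instance_id = instance_id
        self._count = 0

    def increment(self):
        self._count += 1
        return self._count

    def query_service_data(self):
        # State can be inspected outside a particular interaction.
        return {"instance": self.instance_id, "count": self._count}

class Factory:
    """Explicit factory: clients ask it for new instances and it keeps track of them."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.instances = {}

    def create(self):
        inst = CounterService(next(self._ids))
        self.instances[inst.instance_id] = inst
        return inst

factory = Factory()
a, b = factory.create(), factory.create()
a.increment()
a.increment()
b.increment()
print(a.query_service_data(), b.query_service_data())
```

Because the factory holds the instance table, it is also the natural place for registration and lifetime management in this explicit style.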
54Factories and OGSI
- Stateful interactions are typified by amazon.com
where messages carry correlation information
allowing multiple messages to be linked together
- Amazon preserves state in this fashion which is
in fact preserved in its database permanently
- Stateful services have state that can be queried
outside a particular interaction
- Also note difference between implicit and
explicit factories
- Some claim that implicit factories scale as each
service manages its own instances and so do not
need to worry about registering instances and
lifetime management
- See WS-Addressing from largely IBM and
Microsoft
http://msdn.microsoft.com/webservices/default.aspx?pull=/library/en-us/dnglobspec/html/ws-addressing.asp
Explicit Factory
Implicit Factory
55Open Grid Service Architecture
- OGSA-WG chaired by
- Ian Foster, ANL and Univ. of Chicago
- Jeff Nick, IBM
- Dennis Gannon, IU
- Active Members from
- IBM, Fujitsu, NEC, SUN, Hitachi, Avaki
- Univ. of Mich, Chicago, Indiana (not much
academic involvement)
56OGSA Core Services I
- Registries, and namespace bindings
- Registry is a collection of services indexed by
service metadata.
- "Find me a service with property X."
- Directory is a map from a namespace to GSHs.
- A namespace is a human understandable version of
a Grid Handle
- Queues
- For building schedulers and resource brokers
- Jobs and other requests are in queues
- This is high-level messaging
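The difference between a registry (search by metadata) and a directory (lookup by name) can be shown in a few lines. This is a toy sketch with invented handles, names and metadata keys, not an OGSA interface.

```python
class Registry:
    """Services indexed by metadata; supports 'find me a service with property X'."""

    def __init__(self):
        self._entries = []          # list of (handle, metadata dict)

    def register(self, handle, **metadata):
        self._entries.append((handle, metadata))

    def find(self, **wanted):
        # A service matches when every requested property has the requested value.
        return [h for h, md in self._entries
                if all(md.get(k) == v for k, v in wanted.items())]

# Directory: human-understandable names mapped to handles.
directory = {}

registry = Registry()
registry.register("gsh://example.org/s/1", kind="storage", site="CERN")
registry.register("gsh://example.org/s/2", kind="compute", site="CERN")
directory["/cern/storage/main"] = "gsh://example.org/s/1"

print(registry.find(kind="storage"))
print(directory["/cern/storage/main"])
```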
57Security
- Base this on Web Services Security
- Authentication
- 2-way. Who are you and who am I?
- Authorization
- What am I authorized to use/see/modify
- Accounting/Billing
- (not really security see monitoring)
- Privacy
- Group Access
- Easily create a group to share access to a
virtual Grid.
- Very complex issues related to services and
message delivery.
58Common Resource Model
- Every resource on the grid that is manageable is
represented by a service instance
- CRM is the Schema hierarchy that defines each
resource (with its meta-data)
- Service for a resource presents its management
interface to authorized parties.
59Policy Management
- Policy management services
- Mechanism to publish policy and the services it
applies to.
- Policy life-cycle management
- Policy languages exist for routing, security,
resource use
60Grid Service Orchestration
- Creating new services by composing other services
- Two types of Orchestration
- Composition in space
- One service directly invokes another
- Composition in time
- Managing the workflow
- First do this.
- Then do this and that
- When that is done do this
- If something goes wrong do this
- And so on
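Composition in time can be sketched as a tiny workflow runner. The step functions below are local stand-ins for service invocations, and all names are hypothetical; a real workflow engine would add branching, parallel steps and persistence.

```python
def run_workflow(steps, on_error):
    """Run steps in order, feeding each result to the next; call on_error if one fails."""
    result = None
    log = []
    for step in steps:
        try:
            result = step(result)
            log.append(step.__name__)
        except Exception as exc:
            log.append(f"error in {step.__name__}")
            return on_error(exc), log
    return result, log

def fetch(_):              # "First do this"
    return [3, 1, 2]

def sort_data(data):       # "Then do this"
    return sorted(data)

def summarize(data):       # "When that is done do this"
    return {"min": data[0], "max": data[-1]}

def recover(exc):          # "If something goes wrong do this"
    return {"failed": str(exc)}

result, log = run_workflow([fetch, sort_data, summarize], recover)
print(result, log)
```

Composition in space, by contrast, would be one of these functions calling another service directly inside its own body.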
61Data Services
- Distributed Data Access
- Data Caching
- Data Replication Services
- Metadata Catalog Services
- Storage Services
62Metering Resource Consumption
- At what granularity do services report resource
consumption?
- How do they report it?
- How are services metered?
63Transactions
- Two threads/workflows must synchronize and agree they have done so before moving on.
- Usually involves modification to two or more persistent states
- WS-Transaction has been proposed.
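One common way to make two parties agree before either moves on is two-phase commit. The slide does not name a protocol, so the sketch below is a swapped-in textbook technique shown for illustration only; it is not the WS-Transaction specification, and the `Participant` class is hypothetical.

```python
class Participant:
    """Holds one persistent state; names here are illustrative."""

    def __init__(self, name, state, can_commit=True):
        self.name = name
        self.state = state
        self.can_commit = can_commit
        self._pending = None

    def prepare(self, new_state):
        # Phase 1: stage the change and vote yes or no.
        if self.can_commit:
            self._pending = new_state
        return self.can_commit

    def commit(self):
        # Phase 2a: make the staged change permanent.
        self.state = self._pending
        self._pending = None

    def rollback(self):
        # Phase 2b: discard the staged change.
        self._pending = None


def two_phase_commit(changes):
    """changes: list of (participant, new_state). Returns True if all committed."""
    votes = [p.prepare(new_state) for p, new_state in changes]
    if all(votes):
        for p, _ in changes:
            p.commit()
        return True
    for p, _ in changes:
        p.rollback()
    return False


a = Participant("a", state=10)
b = Participant("b", state=20)
ok = two_phase_commit([(a, 5), (b, 25)])

c = Participant("c", state=1)
d = Participant("d", state=2, can_commit=False)
failed = two_phase_commit([(c, 100), (d, 200)])
```

Either both persistent states change or neither does, which is the "agree they have done so" property the slide asks for.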
64Messaging, Events, Logging
- Messaging
- Delivery Model
- Queuing and Pub/Sub message delivery (not clear to me why these are different, as publish/subscribe is implemented as topic-labeled queues)
- Events
- Time stamped messages
- Standard XML schemas
- Standard Logging
- MQSeries (IBM), JMS (Java Message Service) and
NaradaBrokering (Indiana) provide this but most
naturally at level of platform/hosting
environment
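The remark that publish/subscribe is just topic-labeled queues can be made concrete with a short Python sketch. The `Broker` class below is a toy written for this illustration; it is not the API of MQSeries, JMS, or NaradaBrokering.

```python
import time
from collections import defaultdict, deque


def make_event(topic, body):
    """An event is a time stamped message."""
    return {"topic": topic, "body": body, "timestamp": time.time()}


class Broker:
    def __init__(self):
        # topic -> {subscriber name -> queue of undelivered events}
        self._queues = defaultdict(dict)

    def subscribe(self, topic, subscriber):
        self._queues[topic].setdefault(subscriber, deque())

    def publish(self, event):
        # Publishing appends the event to every queue labeled with its topic.
        for queue in self._queues[event["topic"]].values():
            queue.append(event)

    def receive(self, topic, subscriber):
        """Pop the oldest undelivered event, or None if the queue is empty."""
        queue = self._queues[topic][subscriber]
        return queue.popleft() if queue else None


broker = Broker()
broker.subscribe("jobs", "alice")
broker.subscribe("jobs", "bob")
broker.publish(make_event("jobs", "start"))
broker.publish(make_event("logs", "nobody is listening"))

first_for_alice = broker.receive("jobs", "alice")
second_for_alice = broker.receive("jobs", "alice")
first_for_bob = broker.receive("jobs", "bob")
```

With one subscriber per topic this behaves like a plain queue; with several it behaves like publish/subscribe, which is the point made in the parenthesis above.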
65Where should Messaging be?
- One can define messaging at the OGSA level above the hosting environment but that makes it difficult to virtualize messaging and support network performance
- Publish-subscribe or better queued messaging naturally supports optimized routing based on network performance
- One can naturally support collaborative Web services in same fashion, in a way that is MUCH easier than Groove Networks and other collaborative environments (WebEx, PlaceWare (Microsoft)) do, as long as every application is a Web service
- OGSA location of messages is fine for low volume logging or notification events
- Not good for events on video application where each frame is an update event
66Application as a Web service
(Diagram labels: From Collaboration As a WS, Events, Rendering, From Master, Participating Client)
67Collaboration Shared Display
- Sharing can be done at any point on object or
Web Service pipeline
(Diagram labels: Shared Display, Shared Web Service, Shared Export, Shared Event, Master, Event (Message) Service, Object Display)
Shared Display shares framebuffer with events corresponding to changed pixels in master client.
As long as pipeline uses messages, easy to make collaborative. Windows framebuffers and in fact most applications do NOT expose a message based update interface.
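The shared display idea, events corresponding to changed pixels in the master client, can be sketched as a frame diff. This is a minimal illustration with frames held as nested lists; the event fields `x`, `y`, and `value` are assumptions chosen for the example.

```python
def changed_pixel_events(old, new):
    """Compare two equal-sized frames (lists of rows); emit one event per changed pixel."""
    return [
        {"x": x, "y": y, "value": new[y][x]}
        for y in range(len(new))
        for x in range(len(new[y]))
        if old[y][x] != new[y][x]
    ]


def apply_events(frame, events):
    """Replay update events onto a participant's copy of the framebuffer."""
    for e in events:
        frame[e["y"]][e["x"]] = e["value"]
    return frame


master_before = [[0, 0, 0], [0, 0, 0]]
master_after = [[0, 9, 0], [0, 0, 7]]
participant = [row[:] for row in master_before]   # participant starts in sync

events = changed_pixel_events(master_before, master_after)
apply_events(participant, events)
```

Only the two changed pixels travel as events, yet the participant ends up with the same frame as the master.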
68Shared Input Port (Replicated WS) Collaboration
(Diagram labels: Collaboration as a WS, Set up Session with XGSP, Master, Event (Message) Service, Other Participants)
69Shared Output Port Collaboration
(Diagram labels: Collaboration as a WS, Set up Session with XGSP, Web Service Message Interceptor, Master, WS Display, WS Viewer, Event (Message) Service, Other Participants)
Text Chat, Whiteboard, Multiple masters
70NaradaBrokering
- Based on a network of cooperating broker nodes
- Cluster based architecture allows system to scale to arbitrary size
- Originally designed to provide uniform software multicast to support real-time collaboration linked to publish-subscribe for asynchronous systems.
- Now has four major core functions
- Message transport (based on performance measurement) in heterogeneous multi-link fashion
- General publish-subscribe including JMS, JXTA and support for RTP-based audio/video conferencing
- Filtering for heterogeneous clients
- Federation of multiple instances of Grid services
71Role of Event/Message Brokers
- We will use events and messages interchangeably
- An event is a time stamped message
- Our systems are built from clients, servers and event brokers
- These are logical functions; a given computer can have one or more of these functions
- In P2P networks, computers are typically multifunction; in Grids one tends to have separate function computers
- Event Brokers just provide message/event services; servers provide traditional distributed object services as Web services
- There are functionalities that only depend on event itself and perhaps the data format; they do not depend on details of application and can be shared among several applications
- NaradaBrokering is designed to provide these functionalities
- MPI provided such functionalities for all parallel computing
72Engineering Issues Addressed by Event / Messaging Service
- Application level Quality of Service, e.g. give audio highest priority
- Tunnel through firewalls and proxies
- Filter messages to slow (collaborative/real-time) clients
- Choose Hardware or Software multicast
- Scaling of software multicast
- Efficient calculation of destinations and routes.
- Integrate synchronous and asynchronous collaboration with same messaging, control, archiving for all functions
- Transparently replace single server JMS systems with a distributed solution.
- Provides reliable inter-peer group messaging for JXTA
- Open Source (high quality) messaging
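Two of the issues above, giving audio highest priority and filtering messages to slow clients, can be sketched with a priority queue. The message classes and their priority values are assumptions for illustration and do not describe how NaradaBrokering actually schedules traffic.

```python
import heapq
import itertools

# Lower number = higher priority; classes and values are made up for this sketch.
PRIORITY = {"audio": 0, "video": 1, "chat": 2, "archive": 3}


class PriorityOutbox:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()   # tie-breaker keeps FIFO order within a class

    def put(self, kind, payload):
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._counter), kind, payload))

    def get(self):
        _, _, kind, payload = heapq.heappop(self._heap)
        return kind, payload

    def __len__(self):
        return len(self._heap)


def filter_for_slow_client(messages, allowed_kinds):
    """Drop message kinds that a slow client cannot keep up with."""
    return [m for m in messages if m[0] in allowed_kinds]


outbox = PriorityOutbox()
outbox.put("video", "frame-1")
outbox.put("chat", "hello")
outbox.put("audio", "sample-1")
outbox.put("audio", "sample-2")

order = [outbox.get() for _ in range(len(outbox))]
slow_view = filter_for_slow_client(order, {"audio", "chat"})
```

Audio leaves first even though it was queued last, and the slow client simply never sees the video frames.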
73NaradaBrokering implements an Event Service
- Filter is mapping to PDA or slow communication channel (universal access); see our PDA adaptor
- Workflow implements message process
- Routing illustrated by JXTA and includes firewall
- Destination-Source matching illustrated by JMS using Publish-Subscribe mechanism
- These use Security model (being implemented) based on WS-Sec
74Narada Broker Network
(Diagram labels: (P2P) Community, Broker, Resource, Software multicast, For message/events service)
Hypercube topology for brokers? Tree for distance education with teacher at root
75NaradaBrokering Communication
- Applications interface to NaradaBrokering through UserChannels which NB constructs as a set of links between NB Broker waystations which may need to be dynamically instantiated
- UserChannels have publish/subscribe semantics with XML topics
- Links implement a single conventional data protocol.
- Interface to add new transport protocols within the Framework
- Administrative channel negotiates the best available communication protocol for each link
- Different links can have different underlying transport implementations
- Implementations in the current release include support for TCP, UDP, Multicast, SSL and RTP. HTTP, HTTPS support will be available in Feb 2003 release.
- Supports communication through proxies such as iPlanet, Netscape and Apache.
- Supports communication through firewalls such as Microsoft ISA, Checkpoint.
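Per-link protocol negotiation can be pictured as choosing the most preferred protocol that both endpoints support and that the path does not block. The preference order and the `negotiate` function below are assumptions for illustration, not NaradaBrokering's actual negotiation logic.

```python
# Preference order is an assumption for this sketch (fastest first, most tunnel-friendly last).
PREFERENCE = ["UDP", "TCP", "SSL", "HTTP"]


def negotiate(supported_a, supported_b, blocked=()):
    """Pick the most preferred protocol both endpoints support and the path allows."""
    common = (set(supported_a) & set(supported_b)) - set(blocked)
    for protocol in PREFERENCE:
        if protocol in common:
            return protocol
    return None


# Two brokers on the same LAN.
lan_link = negotiate(["UDP", "TCP", "SSL"], ["TCP", "UDP"])

# A firewall on the path blocks raw UDP and TCP, so the link falls back to HTTP.
firewalled_link = negotiate(["UDP", "TCP", "HTTP"], ["UDP", "TCP", "HTTP"],
                            blocked=["UDP", "TCP"])

# No protocol in common.
no_link = negotiate(["UDP"], ["TCP"])
```

Because each link is negotiated separately, different links along one UserChannel can end up with different transports, as the slide notes.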
76Performance/Routing in Message-based Architecture
(Diagram labels: B2, B3)
- In traveling from cities A to B (say 3 separate passengers), one chooses between and changes transport mechanism at waystations to optimize cost, time, comfort, scenic beauty
- Waystations are now NB brokers where one chooses transport protocol (individual or collective)
- Able to choose between car, type of car, plane, train etc.
- Able to dynamically create waystations to cope with problems and act as hubs for multicast messages
- Knows about traffic jams and can assign the HOV lane
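The travel analogy amounts to finding a cheapest path through broker waystations using measured link costs. The sketch below uses Dijkstra's algorithm as a stand-in; the slide does not say which routing algorithm NaradaBrokering uses, and the link costs are made-up numbers.

```python
import heapq


def cheapest_route(links, source, target):
    """Dijkstra's algorithm over links {(node_a, node_b): cost}, treated as bidirectional."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    best = {source: 0.0}
    previous = {}
    frontier = [(0.0, source)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue                      # stale queue entry
        for neighbor, link_cost in graph.get(node, []):
            new_cost = cost + link_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(frontier, (new_cost, neighbor))

    if target not in best:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1], best[target]


# Measured link costs in milliseconds (hypothetical values).
links = {("A", "B2"): 40, ("A", "B3"): 10, ("B3", "B2"): 15,
         ("B2", "B"): 20, ("B3", "B"): 60}
path, cost = cheapest_route(links, "A", "B")
```

If a "traffic jam" raises the cost of one link, rerunning the search with the new measurements picks a different set of waystations.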
77Note on Optimization
- Note in parallel computing, couldn't do much dynamic optimization as aiming at microsecond latency
- Natural to use hardware routing
- In Grid, time scales are different
- 100 millisecond quite normal network latency
- 30 millisecond typical packet time sensitivity (this is one audio or video frame) but even here can buffer 10-100 frames on client (conferencing to streaming)
- 1 millisecond is time for a Java server to think
- Jitter in latency (transit time) due to routing, processing (in NB) or packet loss recovery is important property
- Grid needs and can tolerate significant dynamic optimization
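The time scales quoted above can be turned into a quick back-of-envelope calculation. The three constants come from the slide; reading "10-100 frames" as 10 frames for conferencing and 100 for streaming is an interpretation, and the helper names are made up.

```python
FRAME_MS = 30              # one audio or video frame (from the slide)
NETWORK_LATENCY_MS = 100   # "quite normal" network latency (from the slide)
SERVER_THINK_MS = 1        # time for a Java server to think (from the slide)


def buffer_depth_ms(frames):
    """Playback delay introduced by buffering this many frames on the client."""
    return frames * FRAME_MS


def decisions_within(budget_ms):
    """How many 1 ms server decisions fit into a given time budget."""
    return budget_ms // SERVER_THINK_MS


conferencing_delay = buffer_depth_ms(10)        # 300 ms
streaming_delay = buffer_depth_ms(100)          # 3000 ms
decisions_per_frame = decisions_within(FRAME_MS)            # 30
decisions_per_latency = decisions_within(NETWORK_LATENCY_MS)  # 100
```

A broker that needs about a millisecond per routing decision costs a small fraction of one frame time and about 1% of typical network latency, which is why the Grid can afford dynamic optimization that microsecond-scale parallel computing could not.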
78Sender/receiver/broker - (Pentium-3, 1 GHz, 256
MB RAM). 100 Mbps LAN. JDK-1.3, Red Hat Linux 7.3
82Narada Performance Web Service
- Performance measurements are used by Links in
- Reconfiguring Connectivity between nodes
- Deciding underlying transport protocol
- Determining possible filtering
- Each node determines performance of links of which it is endpoint
- Individual node web services are aggregated as another Web Service
- Probably should replace by a more sophisticated measurement package
- Factors measured include
- Transit delays, bandwidth, Jitter, Receiving rates.
- Performance measurements are
- Spaced out at increasing intervals for healthy channels.
- Factors selectively measured for unhealthy channels.
- No repeated measurements of bandwidth for example.
- Injected into Narada network as XML events
Administrative Interface
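Spacing measurements out at increasing intervals for healthy channels resembles an exponential back-off schedule. The sketch below is one possible reading: the doubling factor, the ceiling, and the reset to the base interval when a channel turns unhealthy are all assumptions, not documented NaradaBrokering behavior.

```python
def next_interval(current, healthy, base=1.0, factor=2.0, ceiling=60.0):
    """Return seconds to wait before the next measurement on a link.

    Healthy links are probed less and less often; an unhealthy link
    drops back to the base interval. All numbers are illustrative.
    """
    if not healthy:
        return base
    return min(current * factor, ceiling)


def schedule(health_history, base=1.0):
    """Walk through a sequence of health observations and list the resulting intervals."""
    interval = base
    intervals = []
    for healthy in health_history:
        interval = next_interval(interval, healthy, base=base)
        intervals.append(interval)
    return intervals


intervals = schedule([True, True, True, False, True])
long_run = schedule([True] * 10)
```

The ceiling keeps even a long-healthy channel from going unmeasured indefinitely.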
83The Overall Architecture
- The Grid is defined by a collection of distributed Services
- For many users the primary interaction with the Grid will be through a portal
(Diagram labels: The User, Portal Server, MyProxy Server, User's Persistent Context, Event and logging Services, Application Factory Services, Messaging and group collaboration, Directory index Services, Metadata Directory Service(s))
84Application Portal in a Minute (box)
- Systems like Unicore, GPDK, Gridport (HotPage), Gateway, Legion provide Grid or GCE Shell interfaces to users (user portals)
- Run a job, find its status, manipulate files
- Basic UNIX Shell-like capabilities
- Application Portals (Problem Solving Environments) are often built on top of Shell Portals but this can be quite time consuming
- Application Portal = Shell Portal Web Service + Application (factory) Web service
85Application Web service
- Application Web Service is ONLY metadata
- Application is NOT touched
- Application Web service defined by two sets of schema
- First set defines the abstract state of the application
- What are my options for invoking myapp?
- Dub these to be abstract descriptors
- Second set defines a specific instance of the application
- I want to use myapp with input1.dat on solar.uits.indiana.edu.
- Dub these to be instance descriptors.
- Each descriptor group consists of
- Application descriptor schema
- Host (resource) descriptor schema
- Execution environment (queue or shell) descriptor schema
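The split between abstract and instance descriptors can be sketched with two small Python data classes. The field names and the `instantiate` helper are hypothetical; the slide describes XML schema, and this only mirrors the idea that the service holds metadata and never touches the application itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbstractApplication:
    """Abstract descriptor: what are my options for invoking the application?"""
    name: str
    allowed_hosts: tuple
    allowed_environments: tuple      # queue or shell
    input_parameters: tuple


@dataclass(frozen=True)
class ApplicationInstance:
    """Instance descriptor: one concrete choice of host, environment and inputs."""
    application: str
    host: str
    environment: str
    inputs: dict


def instantiate(abstract, host, environment, **inputs):
    """Validate a request against the abstract descriptor and build an instance."""
    if host not in abstract.allowed_hosts:
        raise ValueError(f"{abstract.name} is not available on {host}")
    if environment not in abstract.allowed_environments:
        raise ValueError(f"{environment} is not a valid execution environment")
    unknown = set(inputs) - set(abstract.input_parameters)
    if unknown:
        raise ValueError(f"unknown inputs: {sorted(unknown)}")
    return ApplicationInstance(abstract.name, host, environment, dict(inputs))


myapp = AbstractApplication(
    name="myapp",
    allowed_hosts=("solar.uits.indiana.edu",),
    allowed_environments=("shell", "queue"),
    input_parameters=("input_file",),
)

# "I want to use myapp with input1.dat on solar.uits.indiana.edu."
run = instantiate(myapp, "solar.uits.indiana.edu", "shell", input_file="input1.dat")
```

Nothing here launches the application; the instance descriptor is only a validated record that a separate execution service could act on.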
87Web Services as a Portlet
- Each Web Service naturally has a user interface specified as just another port
- Customizable for universal access
- This gives each Web Service a Portlet view specified (in XML as always) by WSRP (Web services for Remote Portals)
- So component model for resources automatically gives a component model for user interfaces
- When you build your application, you define portlet at same time
(Diagram labels: Application as a WS; General Application Ports: Interface with other Web Services; User Face of Web Service: WSRP Ports define WS as a Portlet)
Web Services have other ports (Grid Service) to be OGSI compliant
88Online Knowledge Center built from Portlets
A set of UI Components
- Web Services provide a component model for the middleware (see large common component architecture effort in Dept. of Energy)
- Should match each WSDL component with a corresponding user interface component
- Thus one must use a component model for the portal with again an XML specification (portalML) of portal component
89Jetspeed Architecture
(Diagram labels: HTML, Turbine Servlet, JSP template, ECS Root to HTML, Screen Manager, PSML, PortletController, PortletControl, ECS, Portlet)
Portlets
- HTML: Local files
- JSP or VM: Local templates
- WebPage: Remote HTML
- Portlets: User implemented using Portal API
- XML: RSS, OCS, or other
- Local or remote Data
90Portlets and Portal Stacks
- User interfaces to Portal services (Code Submission, Job Monitoring, File Management for Host X) are all managed as portlets.
- Users, administrators can customize their portal interfaces to just precisely the services they want.
Aggregation Portals (Jetspeed)
User facing Web Service Ports
Message Security, Information Services
Application Grid Web Services
Core Grid Services
91Jetspeed Computing Portal Choose Portlets
92Choose Portlet Layout
Choose 1-column Layout
Original 2-column Layout
93File management
Tabs indicate available portlet interfaces.
Lists user files on selected host,
noahsark. File operations include Upload,
download, Copy, rename, crossload
95Sample page with several portlets: proxy credential manager, submission, monitoring
96Administer Grid Portal
Provide information about application and host
parameters
Select application to edit