Grid Technology A Web Services Globus OGSA

1
Grid Technology A
Web Services
Globus OGSA Grid Architecture
CERN Geneva
April 1-3 2003
  • Geoffrey Fox
  • Community Grids Lab
  • Indiana University
  • gcf_at_indiana.edu

2
With Thanks to
  • Tony Hey my co-speaker and
  • I adapted presentations from
  • Marlon Pierce
  • Dennis Gannon
  • Globus
  • Malcolm Atkinson
  • David de Roure

3
Fermilab Experiments 1975-1980
Regge Theory 1978
Hadron Jets in 1977 Compared to Feynman Field
(Fox) Model
[Figure labels: E350, -t, E260, 200 GeV hp]
4
Caltech Hypercube
JPL Mark II 1985 Chuck Seitz 1983
Hypercube as a cube
5
History New York Times 1984
  • One of today's fastest computers is the Cray 1,
    which can do 20 million to 80 million operations
    a second. But at $5 million, they are expensive
    and few scientists have the resources to tie one
    up for days or weeks to solve a problem.
  • "Poor old Cray and Cyber (another super
    computer) don't have much of a chance of getting
    any significant increase in speed," Fox said.
    "Our ultimate machines are expected to be at
    least 1,000 times faster than the current fastest
    computers." (80 gigaflops predicted. Earth
    Simulator is 40,000 gflops)
  • But not everyone in the field is as impressed
    with Caltech's Cosmic Cube as its inventors are.
    The machine is nothing more nor less than 64
    standard, off-the-shelf microprocessors wired
    together, not much different than the innards of
    64 IBM personal computers working as a unit.
  • The Caltech Hypercube was just a cluster of
    PCs!

6
History New York Times 1984
  • "We are using the same technology used in PCs
    (personal computers) and Pacmans," Seitz said.
    The technology is an 8086 microprocessor capable
    of doing 1/20th of a million operations a second
    with 1/8th of a megabyte of primary storage.
    Sixty-four of them together will do 3 million
    operations a second with 8 megabytes of storage.
  • Computer scientists have known how to make such a
    computer for years but have thought it too
    pedestrian to bother with.
  • "It could have been done many years ago," said
    Jack B. Dennis, a computer scientist at the
    Massachusetts Institute of Technology who is
    working on a more radical and ambitious approach
    to parallel processing than Seitz and Fox.
  • "There's nothing particularly difficult about
    putting together 64 of these processors," he
    said. "But many people don't see that sort of
    machine as on the path to a profitable result."
  • So clusters are a trivial architecture (1984)
  • So architecture is unchanged unfortunately
    after 20 years research, programming model is
    also the same (message passing)

7
What is a Grid I?
  • Collaborative Environment (Ch2.2,18)
  • Combining powerful resources, federated computing
    and a security structure (Ch38.2)
  • Coordinated resource sharing and problem solving
    in dynamic multi-institutional virtual
    organizations (Ch6)
  • Data Grids as Managed Distributed Systems for
    Global Virtual Organizations (Ch39)
  • Distributed Computing or distributed systems
    (Ch2.2,10)
  • Enabling Scalable Virtual Organizations (Ch6)
  • Enabling use of enterprise-wide systems, and
    someday nationwide systems, that consist of
    workstations, vector supercomputers, and parallel
    supercomputers connected by local and wide area
    networks. Users will be presented the illusion of
    a single, very powerful computer, rather than a
    collection of disparate machines. The system will
    schedule application components on processors,
    manage data transfer, and provide communication
    and synchronization in such a manner as to
    dramatically improve application performance.
    Further, boundaries between computers will be
    invisible, as will the location of data and the
    failure of processors. (Ch10)

8
What is a Grid II?
  • Supporting e-Science representing increasing
    global collaborations of people and of shared
    resources that will be needed to solve the new
    problems of Science and Engineering (Ch36)
  • As infrastructure that will provide us with the
    ability to dynamically link together resources as
    an ensemble to support the execution of
    large-scale, resource-intensive, and distributed
    applications. (Ch1)
  • Makes high-performance computers superfluous
    (Ch6)
  • Metasystems or metacomputing systems (Ch10,37)
  • Middleware as the services needed to support a
    common set of applications in a distributed
    network environment (Ch6)
  • Next Generation Internet (Ch6)
  • Peer-to-peer Network (Ch10, 18)
  • Realizing thirty year dream of science fiction
    writers that have spun yarns featuring worldwide
    networks of interconnected computers that behave
    as a single entity. (Ch10)

9
What is Grid Technology?
  • Grids support distributed collaboratories or
    virtual organizations integrating concepts from
  • The Web
  • Distributed Objects (CORBA, Java/Jini, COM)
  • Globus, Legion, Condor, NetSolve, Ninf and other
    High Performance Computing activities
  • Peer-to-peer Networks
  • With perhaps the Web being the most important for
    Information Grids and Globus for Compute
    Grids
  • Use Information Grids and not usual Data Grids as
    distributed file systems (holding lots of
    data!) are handled in Compute Grids

10
PPPH Paradigms Protocols Platforms and Hosting I
  • We will start from the Web view and assert that
    basic paradigm is
  • Meta-data rich Web Services communicating via
    messages
  • These have some basic support from some runtime
    such as .NET, Jini (pure Java), Apache
    Tomcat+Axis (Web Service toolkit), Enterprise
    JavaBeans, WebSphere (IBM) or GT3 (Globus Toolkit
    3)
  • These are the distributed equivalent of operating
    system functions as in UNIX Shell
  • Called Hosting Environment or platform

11
Some Basic Observations
  • Grids manage and share asynchronous resources in
    a rather centralized fashion
  • Peer-to-peer networks are just like Grids with
    different implementations of services like
    registration and look-up
  • Web Services interact with messages
  • Everything (including applications like
    PowerPoint) will be a WS? See later short
    discussion
  • Computers are fast and getting faster. One can
    afford many strategies that used to be
    unrealistic
  • All messages can be publish/subscribe
  • Software message routing
  • XML will be used for most interesting data and
    meta-data
  • One will store/consider data and meta-data
    separately but often use same technology to
    manage both of them.
  • Need Synchronous and Asynchronous Resource
    Sharing
  • Integrate Grid and Collaboration technology

12
Classic Grid Architecture
Resources
Content Access
Composition
Middle Tier Brokers Service Providers
Netsolve
Security
Collaboration
Computing
Middle Tier becomes Web Services
Clients
Users and Devices
13
What is a Web Service I
  • A web service is a computer program running on
    either the local or remote machine with a set of
    well defined interfaces (ports) specified in XML
    (WSDL)
  • In principle, computer program can be in any
    language (Fortran .. Java .. Perl .. Python) and
    the interfaces can be implemented in any way
    whatsoever
  • Interfaces can be method calls, Java RMI
    Messages, CGI Web invocations, totally compiled
    away (inlining) but
  • The simplest implementations involve XML messages
    (SOAP) and programs written in net friendly
    languages like Java and Python
  • Web Services separate the meaning of a port
    (message) interface from its implementation
  • Enhances/Enables Re-usable component model of ANY
    electronic resource

14
Raw Data
Raw Resources
Raw Data
(Virtual) XML Data Interface
WS
WS
etc.
XML WS to WS Interfaces
(Virtual) XML Knowledge (User) Interface
Render to XML Display Format
(Virtual) XML Rendering Interface
Clients
15
What is a Web Service II
  • Web Services have important implication that ALL
    interfaces are XML messages based. In contrast
  • Most Windows programs have interfaces defined as
    interrupts due to user inputs
  • Most software have interfaces defined as methods
    which might be implemented as a message but this
    is often NOT explicit

16
What is a Web Service III
  • Everything electronic is a resource
  • Computers Programs People
  • Data (from sensors to this presentation to email
    to databases)
  • Everything electronic is a distributed object
  • All resources have interfaces which are defined
    in XML for both properties (data-structure) and
    methods (service, function, subroutine)
    (Resources are Services)
  • We can assume that a data-structure property has
    getproperty() and setproperty(value) methods to
    act as interface
  • All resources are linked by messages with
    structure, which must be specifiable in XML
  • All resources have a URI such as unique://a/b/c
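
The resource view on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Resource` class, the `getproperty`/`setproperty` message layout and the `reply` element are invented here and are not taken from any Web Service standard.

```python
import xml.etree.ElementTree as ET


class Resource:
    """A resource addressed by a URI whose properties are reached only via XML messages."""

    def __init__(self, uri, **properties):
        self.uri = uri
        self._properties = dict(properties)

    def handle(self, message_xml):
        """Interpret one XML message and return an XML reply (hypothetical format)."""
        request = ET.fromstring(message_xml)
        name = request.get("name", "")
        reply = ET.Element("reply", {"uri": self.uri, "name": name})
        if request.tag == "setproperty":
            self._properties[name] = request.text or ""
        elif request.tag != "getproperty":
            # Anything other than get/set is reported as a fault.
            reply.set("fault", "unknown message: " + request.tag)
            return ET.tostring(reply, encoding="unicode")
        reply.text = self._properties.get(name, "")
        return ET.tostring(reply, encoding="unicode")


sensor = Resource("unique://a/b/c", units="kelvin")
sensor.handle('<setproperty name="units">celsius</setproperty>')
print(sensor.handle('<getproperty name="units"/>'))
```

The caller never touches the property directly; it only sees the message interface, which is the point the slide makes.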

17
WSDL Abstractions
  • WSDL abstracts a program as an entity that does
    something given one or more inputs with its
    results defined by streams on one or more
    outputs.
  • Functions are defined by method name and
    parameters: methodname(parm1, parm2, ... parmN)
  • Where parameters are Input, Output or both
  • In WSDL, we will have a Web Service which, like a
    Java or CORBA program, can be thought of as a
    (distributed) object with many methods
  • Instead of a function call, the calling routine
    sends an XML message to the Web Service
    specifying methodname and values of the
    parameters
  • Note name of function is just another parameter

18
Details of WSDL Protocol Stack
  • UDDI finds where programs are
  • remote (distributed) programs are just Web
    Services
  • (not a great success)
  • WSFL links programs together (under revision as
    BPEL4WS)
  • WSDL defines interface (methods, parameters, data
    formats)
  • SOAP defines structure of message including
    serialization of information
  • HTTP is negotiation/transport protocol
  • TCP/IP is layers 3-4 of OSI
  • Physical Network is layer 1 of OSI

19
Education as a Web Service
  • Can link to Science as a Web Service and
    substitute educational modules
  • Learning Object XML standards already exist
    from IMS/ADL (http://www.adlnet.org); need to
    update architecture
  • Web Services for virtual university include
  • Registration
  • Performance (grading)
  • Authoring of Curriculum
  • Online laboratories for real and virtual
    instruments
  • Homework submission
  • Quizzes of various types (multiple choice, random
    parameters)
  • Assessment data access and analysis
  • Synchronous Delivery of Curricula
  • Scheduling of courses and mentoring sessions
  • Asynchronous access, data-mining and knowledge
    discovery
  • Learning Plan agents to guide students and
    teachers

20
What are System and Application Services?
  • There are generic Grid system services: security,
    collaboration, persistent storage, universal
    access
  • OGSA (Open Grid Service Architecture) is
    implementing these as extended Web Services
  • An Application Web Service is a capability used
    either by another service or by a user
  • It has input and output ports; data is from
    sensors or other services
  • Consider Satellite-based Sensor Operations as a
    Web Service
  • Satellite management (with a web front end)
  • Each tracking station is a service
  • Image Processing is a pipeline of filters which
    can be grouped into different services
  • Data storage is an important system service
  • Big services built hierarchically from basic
    services
  • Portals are the user (web browser) interfaces to
    Web services

21
Application Web Services
  • Note Service model integrates sensors, sensor
    analysis, simulations and people
  • An Application Web Service is a capability used
    either by another service or by a user
  • It has input and output ports; data is from
    users, sensors or other services
  • Big services built hierarchically from basic
    services

22
The Application Service Model
  • As bandwidth of communication (between) services
    increases one can support smaller services
  • A service is a component and is a replacement
    for a library in case where performance allows
  • Services (components) are a sustainable model of
    software development: each service has
    documented capability with standards compliant
    interfaces
  • XML defines interfaces at several levels
  • WSDL at Service interface level and XSIL or
    equivalent for scientific data format
  • A service can be written as Perl, Python, Java
    Servlet, Enterprise JavaBean, CORBA (C or
    Fortran) Object
  • Communication protocol can be RMI (Java), IIOP
    (CORBA) or SOAP (HTTP, XML)

23
Application with W3C DOM Structure as a Web Service
Data
Resource Facing Ports
Application as a Web service: Application Model
Remaining W3C DOM Semantic Events
MVC: M Model
Control
User Facing Ports
View
C Control
Events as Messages
Rendering as Messages
Application View and Selected Control
V View
24
7 Primitives in WSDL
  • types which provides data type definitions used
    to describe the messages exchanged.
  • message which represents an abstract definition
    of the data being transmitted. A message consists
    of logical parts, each of which is associated
    with a definition within some type system.
  • operation an abstract description of an action
    supported by the service.
  • portType which is a set of abstract operations.
    Each operation refers to an input message and
    output messages.
  • binding which specifies concrete protocol and
    data format specifications for the operations and
    messages defined by a particular portType.
  • port which specifies an address for a binding,
    thus defining a single communication endpoint.
  • service which is used to aggregate a set of
    related ports

26
<?xml version="1.0" encoding="UTF-8"?>
<wsdl:definitions>
  <wsdl:message name="execLocalCommandResponse">
  <wsdl:message name="execLocalCommandRequest">
  <wsdl:portType name="SJwsImp">
    <wsdl:operation name="execLocalCommand" parameterOrder="in0">
      <wsdl:input message="impl:execLocalCommandRequest"
                  name="execLocalCommandRequest"/>
      <wsdl:output message="impl:execLocalCommandResponse"
                   name="execLocalCommandResponse"/>
    </wsdl:operation>
  </wsdl:portType>
  <wsdl:binding name="SubmitjobSoapBinding" type="impl:SJwsImp">
    <wsdlsoap:binding style="rpc"
                      transport="http://schemas.xmlsoap.org/soap/http"/>
    <wsdl:operation name="execLocalCommand">
      <wsdlsoap:operation soapAction=""/>
      <wsdl:input name="execLocalCommandRequest">
      <wsdl:output name="execLocalCommandResponse">
    </wsdl:operation>
  </wsdl:binding>
  <wsdl:service name="SJwsImpService">
    <wsdl:port binding="impl:SubmitjobSoapBinding"
               name="Submitjob">
  </wsdl:service>
</wsdl:definitions>
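
A WSDL document of this shape can be inspected with the Python standard library. The sketch below uses a completed, well-formed variant of the listing: the `impl` namespace URI, the empty message bodies and the `wsdlsoap:address` location are assumptions added for illustration, and `summarize` is a hypothetical helper, not part of any toolkit.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the slide's listing (details marked above are assumed).
WSDL = """<?xml version="1.0" encoding="UTF-8"?>
<wsdl:definitions targetNamespace="urn:example:submitjob"
    xmlns:impl="urn:example:submitjob"
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:wsdlsoap="http://schemas.xmlsoap.org/wsdl/soap/">
  <wsdl:message name="execLocalCommandRequest"/>
  <wsdl:message name="execLocalCommandResponse"/>
  <wsdl:portType name="SJwsImp">
    <wsdl:operation name="execLocalCommand" parameterOrder="in0">
      <wsdl:input message="impl:execLocalCommandRequest"/>
      <wsdl:output message="impl:execLocalCommandResponse"/>
    </wsdl:operation>
  </wsdl:portType>
  <wsdl:binding name="SubmitjobSoapBinding" type="impl:SJwsImp">
    <wsdlsoap:binding style="rpc"
        transport="http://schemas.xmlsoap.org/soap/http"/>
  </wsdl:binding>
  <wsdl:service name="SJwsImpService">
    <wsdl:port binding="impl:SubmitjobSoapBinding" name="Submitjob">
      <wsdlsoap:address location="http://localhost:8080/Submitjob"/>
    </wsdl:port>
  </wsdl:service>
</wsdl:definitions>
"""

NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/",
      "wsdlsoap": "http://schemas.xmlsoap.org/wsdl/soap/"}


def local(qname):
    """Drop the namespace prefix from a qualified name such as impl:SJwsImp."""
    return qname.split(":")[-1]


def summarize(wsdl_text):
    """Return {service: {port: {address, binding, operations}}}."""
    root = ET.fromstring(wsdl_text.encode("utf-8"))
    operations = {pt.get("name"): [op.get("name")
                                   for op in pt.findall("wsdl:operation", NS)]
                  for pt in root.findall("wsdl:portType", NS)}
    port_types = {b.get("name"): local(b.get("type"))
                  for b in root.findall("wsdl:binding", NS)}
    summary = {}
    for service in root.findall("wsdl:service", NS):
        ports = summary.setdefault(service.get("name"), {})
        for port in service.findall("wsdl:port", NS):
            binding = local(port.get("binding"))
            address = port.find("wsdlsoap:address", NS)
            ports[port.get("name")] = {
                "address": address.get("location") if address is not None else None,
                "binding": binding,
                "operations": operations.get(port_types.get(binding), []),
            }
    return summary


print(summarize(WSDL))
```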
27
Discussion of 7 WSDL Primitives
  • types specify data-structures which are
    equivalent to arguments of methods
  • message specifies collections of types and is
    equivalent to set of arguments in a method call.
    Note that it is an abstract method in Java
    terminology
  • operation is a collection of input, output and
    fault messages; there are 4 types of operation:
    one-way (service just receives a message),
    request-response (RPC), solicit-response,
    notification (service pushes out a message)
  • portType represents a single channel that can
    support multiple operations. It is abstract as
    it is specified as a set of operations. It is
    equivalent to an interface or abstract class in
    Java
  • binding tells you transport and message format
    for a portType (which can have multiple bindings
    to reflect say performance-portability trade-offs)
  • port combines a binding and an endpoint network
    address (URL) and is like a class instance
  • service consists of multiple ports and is
    equivalent to a program in Java

28
OGSA OGSI Hosting Environments
  • Start with Web Services in a hosting environment
  • Add OGSI to get a Grid service and a component
    model
  • Add OGSA to get Interoperable Grid correcting
    differences in base platform and adding key
    functionalities

29
Functional Level above OGSA
  • Systems Management and Automation
  • Workload / Performance Management
  • Security
  • Availability / Service Management
  • Logical Resource Management
  • Clustering Services
  • Connectivity Management
  • Physical Resource Management
  • Perhaps Data Access belongs here

30
Two-level Programming I
  • The paradigm implicitly assumes a two-level
    Programming Model
  • We make a Service (same as a distributed object
    or computer program running on a remote
    computer) using conventional technologies
  • C, Java or Fortran Monte Carlo module
  • Data streaming from a sensor or Satellite
  • Specialized (JDBC) database access
  • Such nuggets accept and produce data from users,
    files and databases
  • The Grid is built by coordinating such nuggets
    assuming we have solved problem of programming
    the nugget

31
Two-level Programming II
  • The Grid is discussing the linkage and
    distribution of the nuggets, with the only
    addition being runtime interfaces to the Grid as
    opposed to UNIX data streams
  • Familiar from use of UNIX Shell, PERL or Python
    scripts to produce real applications from core
    programs
  • Such interpretative environments are the single
    processor analog of Grid Programming
  • Some projects like GrADS from Rice University are
    looking at integration between nugget levels but
    dominant effort looks at each level separately

32
Why we can dream of using HTTP and that slow stuff
  • We have at least three tiers in computing
    environment
  • Client (user portal discussed Thursday)
  • Middle Tier (Web Servers/brokers)
  • Back end (databases, files, computers etc.)
  • In Grid programming, we use HTTP (and used to use
    CORBA and Java RMI) in middle tier ONLY to
    manipulate a proxy for real job
  • Proxy holds metadata
  • Control communication in middle tier only uses
    metadata
  • Real (data transfer) high performance
    communication in back end

33
User Services
Grid Computing Environments
Core Grid
35
PPPH Paradigms Protocols Platforms and Hosting II
  • Self-describing programs/interfaces are key to
    scaling
  • Minimize amount of work system has to do
  • Hide as much as possible in services and
    applications
  • Protocols describe (in principle at least)
    those rules that system obeys and uses to deliver
    information between services (processes)
  • Interfaces tell the service what to do to
    interpret the results of communication
  • HTTP is the dominant transport protocol of the
    Web
  • HTML is the interface telling browser how to
    render
  • But you can extend interface to allow PDF,
    multimedia, PowerPoint using helper
    applications which (with more or less
    convenience) are automatically downloaded
    if not already available
  • Mime types essentially self-describe each
    interface

36
Analogy with Web II
  • HTTP and HTML are the analogies on the client
    side
  • A Web Service generalizes a CGI Script on
    server side
  • CGI is essentially a Distributed Object
    technology allowing server to access an arbitrary
    program labeled by a URL plus an ugly syntax to
    specify name and parameters of program to run
  • Roughly WSDL (Web Services Description Language)
    is a better way to specify program name and its
    parameters
  • Web uses other protocols: HTTPS for secure links
    and RTP etc. for multimedia (UDP) streams
  • These again are required to integrate system;
    codecs like MPEG are interfaces interpreted by
    client
  • There are further protocols like H323 and SIP
    which will be replaced (IMHO) by HTTP plus RTP
    etc. We should minimize number of protocols to
    get maintainable systems

37
PPPH Paradigms Protocols Platforms and Hosting
III
  • There are set of system capabilities which cannot
    be captured as standalone services and permeate
    Grid
  • Meta-data rich Message-linked Web Services is
    permeating paradigm
  • Component Model such as Enterprise JavaBean
    (EJB) or OGSI describes the formal structure of
    services; EJB, if used, lives inside OGSI in our
    Grids
  • Invocation Framework describes how you interact
    with system
  • Security in fine grain fashion to provide
    selective authorization (Globus and EDG WP6)
  • Policy context describes rules for this
    particular Grid
  • Transport mechanisms abstract concepts like ports
    and Quality of Service
  • Messaging abstracts destination and customization
    of content
  • Network (monitoring, performance) EDG WP7
  • Fabric (resources) EDG WP4

38
Architecture in Pictures I
Invocation Framework
39
Architecture in Pictures II: OGSA Interoperable
Grid
40
Architecture in Pictures III: OGSA Federated Grid
Mediation Service: converting between OGSA and
native services
Mediation Service
41
Virtualization
  • The Grid could and sometimes does virtualize
    various concepts
  • Location: URI (Universal Resource Identifier)
    virtualizes URL
  • Replica management (caching) virtualizes file
    location, generalized by GriPhyn virtual data
    concept
  • Protocol: message transport and WSDL bindings
    virtualize transport protocol as a QoS request
  • P2P or Publish-subscribe messaging virtualizes
    matching of source and destination services
  • Semantic Grid virtualizes Knowledge as a
    meta-data query
  • Brokering virtualizes resource allocation
  • Virtualization implies references can be indirect

42
IFS Interfaces and Functionality and Semantics I
  • The Grid platform tries to minimize detail in
    protocols and maximize detail in interfaces to
    enhance scaling
  • However rich meta-data and semantics are critical
    for correct and interesting operation
  • Put as much semantic interpretation as you can
    into specific services
  • Lack of Semantic interoperation is in fact main
    weakness of today's Grids and Web services
  • Everything becomes a service (See example of
    education) whether system or application level
  • There are some very important Global Services
  • Discovery (look up) and Registration of service
    metadata
  • Workflow
  • MetaSchedulers

43
IFS Interfaces and Functionality and Semantics II
  • There are many other generally important services
  • OGSA-DAI: The Database Service
  • Portal Service linked to by WSRP (Web services
    for Remote Portals)
  • Notification of events
  • Job submission
  • Provenance: interpret meta-data about history of
    data
  • File Interfaces
  • Sensor service satellites
  • Visualization
  • Basic brokering/scheduling

44
Globus in a Nutshell from IPG
  • GT2 (or Globus Toolkit 2) is original (non web
    service based) version which is basis of EDG
    (European Data Grid) work
  • C programs and libraries
  • See Chapter 5 of book with background in chapters
    2-4 and 37
  • http://www.ipg.nasa.gov/ipgusers/globus/
  • http://www.globusworld.org/globusworld_web/jw2_program_tut.htm

45
Globus GT2 from IPG
  • The goal of the Globus GT2 is to provide
    dependable, consistent, pervasive access to
    high-end resources.
  • This is the original Grid starting point,
    generalized recently to virtual organizations
    and data grids
  • The Globus Project offers the most widely used
    computing grid middleware. The Globus Project is
    a joint effort of Argonne National Laboratory
    and the Information Sciences Institute of the
    University of Southern California, in
    collaboration with numerous other organizations
    including NCSA, NPACI, UCSD, and NASA. See
    http://www.globus.org/ for history, goals,
    release and usage notes, software distributions,
    and research papers.

46
Globus GT2 II
  • Grid Fabric, Layer One: The fabric of the Grid
    comprises the underlying systems, computers,
    operating systems, networks, storage systems, and
    routers: the building blocks.
  • Grid Services, Layer Two: Grid services integrate
    the components of the Grid fabric. Examples of
    the services that are provided by Globus Toolkit
    2:
  • GRAM: The Globus Resource Allocation Manager,
    GRAM, is a basic library service that provides
    capabilities to do remote-submission job start
    up. GRAM unites Grid machines, providing a common
    user interface so that you can submit a job to
    multiple machines on the Grid fabric. GRAM is a
    general, ubiquitous service, with specific
    application toolkit commands built on top of it.
  • MDS: The Monitoring and Discovery Service, also
    known as GIS, the Grid Information Service,
    provides information service. You query MDS to
    discover the properties of the machines,
    computers and networks that you want to use: how
    many processors are available at this moment?
    What bandwidth is provided? Is the storage on
    tape or disk? Is the visualization device an
    immersive desk or CAVE? Using an LDAP
    (Lightweight Directory Access Protocol) server,
    MDS provides middleware information in a common
    interface to put a unifying picture on top of
    disparate equipment.
  • Contd

47
Globus GT2 III
  • GSI: gss-api library for adding authentication to
    a program. GSI provides programs, such as
    grid-proxy-init, to facilitate login to a variety
    of sites, while each site has its own flavor of
    security measures. That is, on the fabric layer,
    the various machines you want to use might be
    governed by disparate security policies; GSI
    provides a means of simplifying multiple remote
    logins. The standard installation is based on a
    PKI security system; the Kerberos installation of
    Globus is less standard. (Some installations with
    DoE and DoD insist on Kerberos)
  • GridFTP: A new (in Globus 2.0) protocol for file
    transfer over a grid. This is a Global Grid Forum
    standard
  • GASS: Global Access to Secondary Storage, provides
    command-line tools and C APIs for remotely
    accessing data. GASS integrates GridFTP, HTTP,
    and local file I/O to enable secure transfers
    using any combination of these protocols.

48
Globus GT2 IV
  • Application Toolkits, Layer Three: Application
    toolkits use Grid Services to provide
    higher-level capabilities, often targeted to
    specific classes of application.
  • For example, the Globus development team has
    created a set of Grid service tools and a
    toolkit of programs for running remotely
    distributed jobs. These include remote job
    submission commands (globusrun,
    globus-job-submit, globus-job-run), built on top
    of the GRAM service, and MPICH-G2, a Grid-enabled
    implementation of the Message Passing Interface
    (MPI).
  • A more modern interface is through CoG Kits
    (Commodity Grid) to different languages (Perl,
    Python, Java); see chapter 26 of Book
  • The Java CoG kit provides a natural way to link
    GT2 to a Web service framework
  • Globus Toolkit 3 (GT3) effectively integrated CoG
    Kit interface with core Globus by wrapping all
    Globus Services as Web services

49
Job Submission in Globus
  • Very similar to UNIX Shell; build Portal Web
    Interfaces to specific or general Shell commands.
    Some example commands:
  • globusrun: Runs a single executable on a remote
    site with an RSL specification.
  • globus-job-cancel: Cancels a job previously
    started using globus-job-submit.
  • globus-job-run: Allows you to run a job at one or
    several remote resources. It translates the
    program arguments to an RSL request and uses
    globusrun to submit the job.
  • globus-job-clean: Kills the job if it is still
    running and cleans the information concerning the
    job.
  • globus-job-status: Display the status of the job.
    See also globus-get-output to check the standard
    output or standard error of your job.
  • These are all controlled by metadata specified by
    the Globus Resource Specification Language (RSL)
    which provides a common language to describe jobs
    and the resources required to run them.
  • http://www.globus.org/gram/gram_rsl_parameters.html
  • The simplest RSL expression looks something like
    the following: (executable=/bin/ls)
50
Virtual Data Toolkit VDT from GriPhyn
  • http://www.lsc-group.phys.uwm.edu/vdt/
  • Trillium (PPDG from DoE, GriPhyn and iVDgL from
    NSF) is major US effort building Grid application
    software with a strong particle physics emphasis
  • VDT is their major software release and its heart
    is Condor and GT2.
  • There is some virtual data software as well but
    not clear if this is of interest in production
    use (interesting research area)
  • Condor (Chapter 11 of Book) is powerful job
    scheduler for clusters and cycle scavenging
  • It has a well developed interface (ClassAds) for
    defining requirements of jobs and matching to
    compute capabilities
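
The matchmaking idea can be sketched in Python. This is not ClassAd syntax and not Condor's algorithm: the dictionaries, attribute names and the `match` function are assumptions made for illustration, and real ClassAds also let machines state requirements on jobs.

```python
def match(job, machines):
    """Return names of machines that satisfy the job, best rank first."""
    candidates = [m for m in machines if job["requirements"](m)]
    candidates.sort(key=job["rank"], reverse=True)
    return [m["name"] for m in candidates]


# Hypothetical machine advertisements.
machines = [
    {"name": "node1", "arch": "INTEL", "memory_mb": 512, "mips": 900},
    {"name": "node2", "arch": "INTEL", "memory_mb": 128, "mips": 1500},
    {"name": "node3", "arch": "SUN4u", "memory_mb": 1024, "mips": 700},
]

# Hypothetical job advertisement: a requirement predicate and a preference.
job = {
    "requirements": lambda m: m["arch"] == "INTEL" and m["memory_mb"] >= 128,
    "rank": lambda m: m["mips"],
}

print(match(job, machines))
```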

51
OGSA/OGSI Top Level View
Chapters 7 to 9 of Book
http://www.gridforum.org/Meetings/ggf7/docs/default.htm
http://www.globusworld.org/globusworld_web/jw2_program_tut.htm
  • OGSA is the set of core Grid services
  • Stuff you can't live without
  • If you built a Grid you would need to invent
    these things

52
OGSI Open Grid Service Interface
  • http://www.gridforum.org/ogsi-wg
  • It is a component model for web services.
  • It defines a set of behavior patterns that each
    OGSI service must exhibit.
  • Every Grid Service portType extends a common
    base type.
  • Defines an introspection model for the service
  • You can query it (in a standard way) to discover
  • What methods/messages a port understands
  • What other port types does the service provide?
  • If the service is stateful what is the current
    state?
  • A set of standard portTypes for
  • Message subscription and notification
  • Service collections
  • Each service is identified by a URI called the
    Grid Service Handle
  • GSHs are bound dynamically to Grid Service
    References, GSRs (typically WSDL docs)
  • A GSR may be transient. GSHs are fixed.
  • Handle map services translate GSHs into GSRs.

53
OGSI and Stateful Services
  • Sometimes you can send a message to a service,
    get a result and that's the end
  • This is a statefree service
  • However most non-trivial services need state to
    allow persistent asynchronous interactions
  • OGSI is designed to support Stateful services
    through two mechanisms
  • Information Port where you can query for SDE
    (Service Data Elements)
  • Factories that allow one to view a Service as a
    class (in an object-oriented language sense)
    and create separate instances for each Service
    invocation
  • There are several interesting issues here
  • Difference between Stateful interactions and
    Stateful services
  • System or Service managed instances

54
Factories and OGSI
  • Stateful interactions are typified by amazon.com
    where messages carry correlation information
    allowing multiple messages to be linked together
  • Amazon keeps state in this fashion; the state is
    in fact stored permanently in its database
  • Stateful services have state that can be queried
    outside a particular interaction
  • Also note difference between implicit and
    explicit factories
  • Some claim that implicit factories scale as each
    service manages its own instances and so do not
    need to worry about registering instances and
    lifetime management
  • See WS-Addressing, largely from IBM and
    Microsoft:
    http://msdn.microsoft.com/webservices/default.aspx?pull=/library/en-us/dnglobspec/html/ws-addressing.asp

Explicit Factory
Implicit Factory
55
Open Grid Service Architecture
  • OGSA-WG chaired by
  • Ian Foster, ANL and Univ. of Chicago
  • Jeff Nick, IBM
  • Dennis Gannon, IU
  • Active Members from
  • IBM, Fujitsu, NEC, SUN, Hitachi, Avaki
  • Univ. of Mich, Chicago, Indiana (not much
    academic involvement)

56
OGSA Core Services I
  • Registries, and namespace bindings
  • Registry is a collection of services indexed by
    service metadata.
  • find me a service with property X.
  • Directory is a map from a namespace to GSHs.
  • A namespace is a human understandable version of
    a Grid Handle
  • Queues
  • For building schedulers and resource brokers
  • Jobs and other requests are in queues
  • This is high-level messaging
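
The registry and directory roles can be told apart with a small sketch. The classes, the gsh:// handle strings and the metadata keys below are made up for illustration.

```python
class Registry:
    """Collection of services indexed by metadata (property -> value)."""

    def __init__(self):
        self._entries = []  # list of (gsh, metadata dict)

    def register(self, gsh, **metadata):
        self._entries.append((gsh, metadata))

    def find(self, **wanted):
        # "Find me a service with property X": every wanted pair must match.
        return [
            gsh
            for gsh, meta in self._entries
            if all(meta.get(key) == value for key, value in wanted.items())
        ]


class Directory:
    """Map from human-readable names to GSHs."""

    def __init__(self):
        self._names = {}

    def bind(self, name, gsh):
        self._names[name] = gsh

    def lookup(self, name):
        return self._names[name]


registry = Registry()
registry.register("gsh://example/blast-1", kind="blast", site="iu")
registry.register("gsh://example/blast-2", kind="blast", site="anl")
registry.register("gsh://example/render-1", kind="render", site="iu")

directory = Directory()
directory.bind("/iu/bio/blast", "gsh://example/blast-1")

blast_at_iu = registry.find(kind="blast", site="iu")
all_blast = registry.find(kind="blast")
```

The registry answers queries by property and the directory answers lookups by name; both return handles.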

57
Security
  • Base this on Web Services Security
  • Authentication
  • 2-way. Who are you and who am I?
  • Authorization
  • What am I authorized to use/see/modify
  • Accounting/Billing
  • (not really security; see monitoring)
  • Privacy
  • Group Access
  • Easily create a group to share access to a
    virtual Grid.
  • Very complex issues related to services and
    message delivery.

58
Common Resource Model
  • Every resource on the grid that is manageable is
    represented by a service instance
  • CRM is the Schema hierarchy that defines each
    resource (with its meta-data)
  • Service for a resource presents its management
    interface to authorized parties.

59
Policy Management
  • Policy management services
  • Mechanism to publish policy and the services it
    applies to.
  • Policy life-cycle mgmt.
  • Policy languages exist for routing, security,
    resource use

60
Grid Service Orchestration
  • Creating new services by composing other services
  • Two types of Orchestration
  • Composition in space
  • One service directly invokes another
  • Composition in time
  • Managing the workflow
  • First do this.
  • Then do this and that
  • When that is done do this
  • If something goes wrong do this
  • And so on
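
Composition in time, as listed above, might look like the following sketch. The step names are hypothetical stand-ins for Grid service invocations, and a thread pool is just one simple way to run "this and that" side by side.

```python
from concurrent.futures import ThreadPoolExecutor

log = []


def step(name, fail=False):
    # Stand-in for invoking one Grid service.
    if fail:
        raise RuntimeError(f"{name} failed")
    log.append(name)
    return name


def run_workflow(fail_in_parallel_stage=False):
    try:
        step("stage-in")                                 # first do this
        with ThreadPoolExecutor(max_workers=2) as pool:  # then do this and that
            futures = [
                pool.submit(step, "simulate", fail_in_parallel_stage),
                pool.submit(step, "visualize"),
            ]
            for future in futures:
                future.result()                          # wait for both
        step("archive")                                  # when that is done do this
        return "completed"
    except RuntimeError:
        step("cleanup")                                  # if something goes wrong do this
        return "recovered"


ok = run_workflow()
ok_log = sorted(log)
log.clear()
bad = run_workflow(fail_in_parallel_stage=True)
```

A real workflow engine would also persist progress so that a long-running composition survives restarts; that is left out here.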

61
Data Services
  • Distributed Data Access
  • Data Caching
  • Data Replication Services
  • Metadata Catalog Services
  • Storage Services

62
Metering Resource Consumption
  • At what granularity do services report resource
    consumption?
  • How do they report it?
  • How are services metered?

63
Transactions
  • Two threads/workflows must synchronize and agree
    they have done so before moving on.
  • Usually involves modification to two or more
    persistent states
  • WS-Transaction has been proposed.
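
One classic way for parties to agree before moving on is two-phase commit. The sketch below illustrates that general technique only; it is not the WS-Transaction protocol, and the Participant class and update_all function are hypothetical.

```python
class Participant:
    """Holds one persistent state; changes become visible only on commit."""

    def __init__(self, value, can_commit=True):
        self.value = value
        self._pending = None
        self._can_commit = can_commit

    def prepare(self, new_value):
        # Phase 1: stage the change and vote.
        if self._can_commit:
            self._pending = new_value
        return self._can_commit

    def commit(self):
        # Phase 2, success: make the staged change permanent.
        self.value = self._pending
        self._pending = None

    def rollback(self):
        # Phase 2, failure: discard the staged change.
        self._pending = None


def update_all(changes):
    """changes: list of (participant, new_value). Returns True if committed."""
    votes = [participant.prepare(value) for participant, value in changes]
    if all(votes):
        for participant, _ in changes:
            participant.commit()
        return True
    for participant, _ in changes:
        participant.rollback()
    return False


source, target = Participant(100), Participant(0)
committed = update_all([(source, 70), (target, 30)])

flaky = Participant(0, can_commit=False)
aborted = update_all([(source, 40), (flaky, 30)])
```

Either every persistent state changes or none does, which is the agreement the slide asks for.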

64
Messaging, Events, Logging
  • Messaging
  • Delivery Model
  • Queuing and Pub/Sub message delivery (not clear
    to me why these are different, as
    publish/subscribe can be implemented as
    topic-labeled queues)
  • Events
  • Time stamped messages
  • Standard XML schemas
  • Standard Logging
  • MQSeries (IBM), JMS (Java Message Service) and
    NaradaBrokering (Indiana) provide this but most
    naturally at level of platform/hosting
    environment
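
The remark that publish/subscribe is just topic-labeled queues can be made concrete. This is a toy in-memory sketch; TopicBroker and its methods are invented here and are not the JMS, MQSeries or NaradaBrokering API. Events carry an explicit time stamp, as on the slide.

```python
from collections import deque


class TopicBroker:
    """Pub/sub built from one queue per (topic, subscriber) pair."""

    def __init__(self):
        self._queues = {}  # topic -> {subscriber: deque of events}

    def subscribe(self, topic, subscriber):
        self._queues.setdefault(topic, {}).setdefault(subscriber, deque())

    def publish(self, topic, body, timestamp):
        # An event is a time-stamped message, copied to every subscriber queue.
        event = {"topic": topic, "timestamp": timestamp, "body": body}
        for queue in self._queues.get(topic, {}).values():
            queue.append(event)

    def receive(self, topic, subscriber):
        queue = self._queues[topic][subscriber]
        return queue.popleft() if queue else None


broker = TopicBroker()
broker.subscribe("grid/jobs", "alice")
broker.subscribe("grid/jobs", "bob")
broker.publish("grid/jobs", "job 7 finished", timestamp=1049155200.0)
broker.publish("grid/other", "nobody is listening", timestamp=1049155201.0)

for_alice = broker.receive("grid/jobs", "alice")
for_bob = broker.receive("grid/jobs", "bob")
alice_again = broker.receive("grid/jobs", "alice")
```

With a single subscriber per topic this degenerates to plain point-to-point queuing.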

65
Where should Messaging be?
  • One can define messaging at the OGSA level above
    the hosting environment but that makes it
    difficult to virtualize messaging and support
    network performance
  • Publish-subscribe or better queued messaging
    naturally supports optimized routing based on
    network performance
  • One can naturally support collaborative Web
    services in the same fashion, in a way that is MUCH
    easier than what Groove Networks and other
    collaborative environments (WebEx,
    Placeware (Microsoft)) do, as long as every
    application is a Web service
  • OGSA location of messages is fine for low volume
    logging or notification events
  • Not good for events on video application where
    each frame is an update event

66
Application as a Web service
From Collaboration As a WS
Events
Rendering
From Master
Participating Client
67
Collaboration Shared Display
  • Sharing can be done at any point on object or
    Web Service pipeline

SharedDisplay
Shared Web Service
Shared Export
Shared Event
Master
Event(Message)Service
Shared Display shares framebuffer with events
corresponding to changed pixels in master
client.
Object Display
As long as pipeline uses messages, easy to make
collaborative. Windows framebuffers and in fact
most applications do NOT expose a message based
update interface
Object Display
68
Shared Input Port (Replicated WS) Collaboration
Collaboration as a WS
Set up Session with XGSP
Master
Event(Message)Service
Other Participants
69
Shared Output Port Collaboration
Collaboration as a WS
Set up Session with XGSP
Web Service Message Interceptor
Master
WS Display
WS Viewer
Text Chat Whiteboard Multiple masters
Event(Message)Service
Other Participants
WS Display
WS Viewer
70
NaradaBrokering
  • Based on a network of cooperating broker nodes
  • Cluster based architecture allows system to scale
    to arbitrary size
  • Originally designed to provide uniform software
    multicast to support real-time collaboration
    linked to publish-subscribe for asynchronous
    systems.
  • Now has four major core functions
  • Message transport (based on performance
    measurement) in heterogeneous multi-link fashion
  • General publish-subscribe including JMS and JXTA,
    and support for RTP-based audio/video
    conferencing
  • Filtering for heterogeneous clients
  • Federation of multiple instances of Grid services

71
Role of Event/Message Brokers
  • We will use events and messages interchangeably
  • An event is a time stamped message
  • Our systems are built from clients, servers and
    event brokers
  • These are logical functions; a given computer
    can have one or more of them
  • In P2P networks, computers are typically
    multifunction; in Grids one tends to have
    separate-function computers
  • Event Brokers just provide message/event
    services; servers provide traditional distributed
    object services as Web services
  • There are functionalities that only depend on
    the event itself and perhaps the data format; they do
    not depend on details of the application and can be
    shared among several applications
  • NaradaBrokering is designed to provide these
    functionalities
  • MPI provided such functionalities for all
    parallel computing

72
Engineering Issues Addressed by Event / Messaging
Service
  • Application level Quality of Service, e.g.
    give audio highest priority
  • Tunnel through firewalls and proxies
  • Filter messages to slow (collaborative/real-time)
    clients
  • Choose Hardware or Software multicast
  • Scaling of software multicast
  • Efficient calculation of destinations and
    routes.
  • Integrate synchronous and asynchronous
    collaboration with same messaging, control,
    archiving for all functions
  • Transparently replace single server JMS systems
    with a distributed solution.
  • Provides reliable inter-peer group messaging for
    JXTA
  • Open Source (high quality) messaging

73
NaradaBrokering implements an Event Service
  • Filter is mapping to PDA or slow communication
    channel (universal access); see our PDA adaptor
  • Workflow implements message process
  • Routing illustrated by JXTA and includes firewall
  • Destination-Source matching illustrated by JMS
    using Publish-Subscribe mechanism
  • These use Security model (being implemented)
    based on WS-Sec

74
Narada Broker Network
(Figure: Brokers providing the message/events service link several (P2P) Communities and a Resource, with software multicast)
Hypercube topology for brokers? Tree for distance
education with teacher at root
75
NaradaBrokering Communication
  • Applications interface to NaradaBrokering through
    UserChannels which NB constructs as a set of
    links between NB Broker waystations which may
    need to be dynamically instantiated
  • UserChannels have publish/subscribe semantics
    with XML topics
  • Links implement a single conventional data
    protocol.
  • Interface to add new transport protocols within
    the Framework
  • Administrative channel negotiates the best
    available communication protocol for each link
  • Different links can have different underlying
    transport implementations
  • Implementations in the current release include
    support for TCP, UDP, Multicast, SSL and RTP.
    HTTP and HTTPS support will be available in the
    Feb 2003 release.
  • Supports communication through proxies such as
    iPlanet, Netscape and Apache.
  • Supports communication through firewalls such as
    Microsoft ISA, Checkpoint.

76
Performance/Routing in Message-based Architecture
  • In traveling from cities A to B (say 3 separate
    passengers), one chooses between and changes
    transport mechanism at waystations to optimize
    cost, time, comfort, scenic beauty
  • Waystations are now NB brokers where one chooses
    transport protocol (individual or collective)
  • Able to choose between car, type of car, plane,
    train etc
  • Able to dynamically create waystations to cope
    with problems and act as hubs for multicast
    messages
  • Knows about traffic jams and can assign the HOV
    lane

77
Note on Optimization
  • Note in parallel computing, one couldn't do much
    dynamic optimization, as one is aiming at microsecond
    latency
  • Natural to use hardware routing
  • In Grid, time scales are different
  • 100 millisecond quite normal network latency
  • 30 millisecond typical packet time sensitivity
    (this is one audio or video frame) but even here
    can buffer 10-100 frames on client (conferencing
    to streaming)
  • 1 millisecond is time for a Java server to
    think
  • Jitter in latency (transit time) due to routing,
    processing (in NB) or packet loss recovery is
    important property
  • Grid needs and can tolerate significant dynamic
    optimization
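
The time scales above imply a simple budget: a client buffering N frames of 30 milliseconds each can absorb up to N x 30 ms of jitter. A small sketch of that arithmetic, using the slide's numbers as constants; the function names are hypothetical.

```python
FRAME_MS = 30            # one audio or video frame (from the slide)
NETWORK_LATENCY_MS = 100  # quite normal network latency (from the slide)


def buffer_slack_ms(frames_buffered):
    """Jitter a client can absorb before playback stalls."""
    return frames_buffered * FRAME_MS


def can_absorb(frames_buffered, jitter_ms):
    return jitter_ms <= buffer_slack_ms(frames_buffered)


conferencing_slack = buffer_slack_ms(10)   # near the conferencing end
streaming_slack = buffer_slack_ms(100)     # near the streaming end
```

Ten buffered frames already give three times the normal 100 ms network latency as slack, which is why the Grid has room for dynamic optimization.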

78
Sender/receiver/broker - (Pentium-3, 1 GHz, 256
MB RAM). 100 Mbps LAN. JDK-1.3, Red Hat Linux 7.3
79
(No Transcript)
80
(No Transcript)
81
(No Transcript)
82
Narada Performance Web Service
  • Performance measurements are used by Links in
  • Reconfiguring Connectivity between nodes
  • Deciding underlying transport protocol
  • Determining possible filtering
  • Each node determines performance of links of
    which it is endpoint
  • Individual node web services are aggregated as
    another Web Service

Probably should replace by a more sophisticated
measurement package
  • Factors measured include
  • Transit delays, bandwidth, Jitter, Receiving
    rates.
  • Performance measurements are
  • Spaced out at increasing intervals for healthy
    channels.
  • Factors selectively measured for unhealthy
    channels.
  • No repeated measurements of bandwidth for
    example.
  • Injected into Narada network as XML events
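
Spacing measurements at increasing intervals for healthy channels can be sketched as a backoff schedule. The doubling factor, the 64 second cap and the reset to the base interval on an unhealthy reading are all assumptions made for this illustration.

```python
def next_interval(current_s, healthy, base_s=1.0, factor=2.0, max_s=64.0):
    """Seconds to wait before the next measurement on one link."""
    if not healthy:
        return base_s  # assumed: unhealthy channels are watched closely again
    return min(current_s * factor, max_s)


def schedule(health_history, base_s=1.0):
    """Intervals produced by a sequence of healthy/unhealthy readings."""
    interval = base_s
    intervals = []
    for healthy in health_history:
        interval = next_interval(interval, healthy, base_s=base_s)
        intervals.append(interval)
    return intervals


mixed = schedule([True, True, True, False, True])
steady = schedule([True] * 10)
```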

Administrative Interface
83
The Overall Architecture
  • The Grid is defined by a collection of
    distributed Services
  • For many users the primary interaction with the
    Grid will be through a portal

Event and logging Services
The User
Application Factory Services
Messaging and group collaboration
Portal Server
Directory index Services
MyProxy Server
User's Persistent Context
Metadata Directory Service(s)
84
Application Portal in a Minute (box)
  • Systems like Unicore, GPDK, Gridport (HotPage),
    Gateway, Legion provide Grid or GCE Shell
    interfaces to users (user portals)
  • Run a job, find its status, manipulate files
  • Basic UNIX Shell-like capabilities
  • Application Portals (Problem Solving
    Environments) are often built on top of Shell
    Portals but this can be quite time consuming
  • Application Portal = Shell Portal Web Service +
    Application (factory) Web service

85
Application Web service
  • Application Web Service is ONLY metadata
  • Application is NOT touched
  • Application Web service defined by two sets of
    schema
  • First set defines the abstract state of the
    application
  • What are my options for invoking myapp?
  • Dub these to be abstract descriptors
  • Second set defines a specific instance of the
    application
  • I want to use myapp with input1.dat on
    solar.uits.indiana.edu.
  • Dub these to be instance descriptors.
  • Each descriptor group consists of
  • Application descriptor schema
  • Host (resource) descriptor schema
  • Execution environment (queue or shell) descriptor
    schema
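
The split between abstract and instance descriptors can be sketched with two small data classes. The real descriptors are XML schemas; the Python classes, their field names and the "shell" and "pbs" environment labels below are hypothetical. The application, input file and host names are the ones used on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceDescriptor:
    """One specific invocation: application + host + execution environment."""
    application: str
    input_file: str
    host: str
    environment: str


@dataclass(frozen=True)
class AbstractDescriptor:
    """The options for invoking an application; pure metadata."""
    application: str
    hosts: tuple
    environments: tuple

    def instantiate(self, input_file, host, environment):
        # Only choices offered by the abstract description are allowed.
        if host not in self.hosts:
            raise ValueError(f"{self.application} is not available on {host}")
        if environment not in self.environments:
            raise ValueError(f"{environment} is not supported")
        return InstanceDescriptor(self.application, input_file, host, environment)


myapp = AbstractDescriptor(
    application="myapp",
    hosts=("solar.uits.indiana.edu",),
    environments=("shell", "pbs"),
)
run = myapp.instantiate("input1.dat", "solar.uits.indiana.edu", "shell")
```

Note that the application code itself is never touched; only metadata about it is created and checked.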

86
(No Transcript)
87
Web Services as a Portlet
  • Each Web Service naturally has a user interface
    specified as just another port
  • Customizable for universal access
  • This gives each Web Service a Portlet view
    specified (in XML as always) by WSRP (Web
    services for Remote Portals)
  • So component model for resources automatically
    gives a component model for user interfaces
  • When you build your application, you define the
    portlet at the same time

Application as a WS
General Application Ports: Interface with other Web Services
User Face of Web Service: WSRP Ports define WS as
a Portlet
Web Services have other ports (Grid Service) to
be OGSI compliant
88
Online Knowledge Center built from Portlets
A set of UI Components
  • Web Services provide a component model for the
    middleware (see large common component
    architecture effort in Dept. of Energy)
  • Should match each WSDL component with a
    corresponding user interface component
  • Thus one must use a component model for the
    portal with again an XML specification (portalML)
    of portal component

89
Jetspeed Architecture
(Figure labels: HTML, Turbine Servlet, JSP template, ECS Root to HTML, Screen Manager, PSML, PortletController, PortletControl, ECS, Portlet)
Portlets
HTML: Local files
JSP or VM: Local templates
WebPage: Remote HTML
Portlets: User implemented using Portal API
XML: RSS, OCS, or other local or remote data
90
Portlets and Portal Stacks
  • User interfaces to Portal services (Code
    Submission, Job Monitoring, File Management for
    Host X) are all managed as portlets.
  • Users, administrators can customize their portal
    interfaces to just precisely the services they
    want.

Aggregation Portals (Jetspeed)
User facing Web Service Ports
Message Security, Information Services
Application Grid Web Services
Core Grid Services
91
Jetspeed Computing Portal Choose Portlets
92
Choose Portlet Layout
Choose 1-column Layout
Original 2-column Layout
93
File management
Tabs indicate available portlet interfaces.
Lists user files on selected host,
noahsark. File operations include Upload,
download, Copy, rename, crossload
94
(No Transcript)
95
Sample page with several portlets: proxy
credential manager, submission, monitoring
96
Administer Grid Portal
Provide information about application and host
parameters
Select application to edit