Title: Introduction to Grid Computing and the Globus Toolkit


1
Introduction to Grid Computing and the Globus
Toolkit
  • Jennifer M. Schopf
  • Argonne National Lab
  • National eScience Centre

3
Some Background Questions
  • How many people have heard of Web Services?
  • How many people have heard of Grids or Grid
    Computing?
  • How many people have heard of Globus?
  • How many of you could explain any of these to
    your date at a dinner party?

4
Overview
  • What is a Grid
  • What does the Globus Toolkit do?
  • Security
  • Monitoring
  • Resource Management
  • Data Management
  • Example Application: Grid3
  • Conclusions

5
What is a Grid?
  • Resource sharing
  • Computers, storage, sensors, networks, ...
  • Sharing is always conditional: issues of trust,
    policy, negotiation, payment, ...
  • Coordinated problem solving
  • Beyond client-server: distributed data analysis,
    computation, collaboration, ...
  • Dynamic, multi-institutional virtual orgs
  • Community overlays on classic org structures
  • Large or small, static or dynamic

6
Why call it a Grid?
  • Model this approach after a power grid
  • When I need to run a toaster, I don't care where
    the power comes from
  • Coal, wind, dynamo, etc.
  • I just plug my toaster into the wall, and I have
    power!
  • Vision for Computational Grids
  • I don't care where my compute cycles are coming
    from, I just need to run my application
  • No, we're not there yet

7
Not A New Idea
  • Late 70s: Networked operating systems
  • Late 80s: Distributed operating systems
  • Early 90s: Heterogeneous computing
  • Mid 90s: Metacomputing, or parallel distributed
    computing
  • Then the Grid (Foster and Kesselman, 1999)

8
Why is this hard/different?
  • Lack of central control
  • Cannot dictate what runs on a resource, how or
    when
  • Different policies at different sites
  • Heterogeneity is everywhere
  • Shared resources
  • Contention, variability
  • Communication
  • Different sites implies different sys admins,
    users, institutional goals, and often strong
    personalities

9
So why do it?
  • Computations that need to be done with a time
    limit
  • Data that can't fit on one site
  • Data owned by multiple sites
  • Applications that need to be run bigger, faster,
    more

10
What Kinds of Applications?
  • Computation intensive
  • Interactive simulation (climate modeling,
    financial markets)
  • Very large-scale simulation and analysis (galaxy
    formation, gravity waves, battlefield simulation,
    business models)
  • Engineering (parameter studies, linked component
    models)
  • Data intensive
  • Experimental data analysis (high-energy physics)
  • Image and sensor analysis (astronomy, climate
    study, ecology)
  • Distributed collaboration
  • Online instrumentation (microscopes, x-ray
    devices, etc.)
  • Remote visualization (climate studies, biology)
  • Engineering (large-scale structural testing,
    chemical engineering)
  • In all cases, the problems were big enough that
    they required people in several organizations to
    collaborate and share computing resources, data,
    and instruments.

11
How Do Grids Relate to Web Services? Both are
Service-Oriented Architectures!
  • Idea is simple (and old)
  • Define remote activities in terms of interface
    and behavior, not implementation
  • Devil is in the details
  • How to describe, discover, and access various types
    of service (semantically and practically)
  • Grids Web services
  • Broad adoption, flexible XML-based model
  • Standards including WSDL, SOAP, WS-Security
  • Interfaces still being defined to date
  • Performance challenges

12
Grid and Web Services Convergence
  • The definition of WSRF means that the Grid and
    Web services communities can move forward on a
    common base.

13
Summary of Introduction
  • Applications have grown beyond what a single
    resource can handle
  • Scientists are using Grids to address this need,
    but it's hard!
  • The Globus Toolkit can add a level of indirection
    and some standard tools to help
  • Grids are now using Web services and have
    broader acceptance because of this

14
Overview
  • What is a Grid
  • What does the Globus Toolkit do?
  • Security
  • Monitoring
  • Resource Management
  • Data Management
  • Example Application: Grid3
  • Conclusions

15
What Is the Globus Toolkit?
  • Collection of solutions to common problems when
    building collaborative distributed applications.
  • A set of basic Grid services
  • Job submission/management
  • File transfer (individual, queued)
  • Database access
  • Data management (replication, metadata)
  • Monitoring/Indexing system information
  • A Grid development environment for your own
    services
  • Building blocks for WSRF-compliant Web Services,
    including security infrastructure
  • Tools and Examples
  • The prerequisites for many Grid community tools!

16
Globus Is Standard Plumbing for the Grid
  • Not turnkey solutions, but building blocks and
    tools for application developers and system
    integrators.
  • Some components (e.g., file transfer) go farther
    than others (e.g., remote job submission) toward
    end-user relevance.
  • Since these solutions exist and others are
    already using them (and they're free), it's
    easier to reuse than to reinvent.
  • And compatibility with other Grid systems comes
    for free!

17
How it Really Happens
  • [Figure: a collaborative application built from
    compute servers, a simulation tool, a Web browser,
    a Web portal, a registration service, cameras, a
    telepresence monitor, a data viewer tool, a chat
    tool, a data catalog, database services, a
    credential repository, and a certificate authority]
  • Resources implement standard access and management
    interfaces
  • Collective services aggregate and/or virtualize
    resources
  • Users work with client applications
  • Application services organize VOs and enable access
    to other services
18
How it Really Happens (without the Globus Toolkit)
  • [Figure: compute servers (labeled A and B), a
    simulation tool, a Web browser, a Web portal, a
    registration service, cameras, a telepresence
    monitor, a data viewer tool, a chat tool, a data
    catalog, database services (labeled C, D, and E), a
    credential repository, and a certificate authority]
  • Resources implement standard access and management
    interfaces
  • Collective services aggregate and/or virtualize
    resources
  • Users work with client applications
  • Application services organize VOs and enable access
    to other services
19
How it Really Happens (with the Grid)
  • [Figure: compute servers with Globus GRAM, a
    simulation tool, a Web browser, Portal/CHEF, the
    Globus Index Service, cameras, a telepresence
    monitor, a data viewer tool, the CHEF Chat Teamlet,
    Globus MCS/RLS, database services with Globus
    OGSA-DAI, the MyProxy credential repository, and a
    certificate authority]
  • Resources implement standard access and management
    interfaces
  • Collective services aggregate and/or virtualize
    resources
  • Users work with client applications
  • Application services organize VOs and enable access
    to other services
20
Why Grid Security is Hard
  • Resources being used may be valuable and the
    problems being solved sensitive
  • Resources are often located in distinct
    administrative domains
  • Each resource has its own policies and procedures
  • Set of resources used by a single computation may
    be large, dynamic, and unpredictable
  • Not just client/server, requires delegation
  • It must be broadly available and applicable
  • Standard, well-tested, well-understood protocols
    integrated with wide variety of tools

21
Security Tools
  • Grid Security is based on public key
    infrastructure
  • Basic Grid Security Mechanisms
  • Certificate Generation Tools
  • Certificate Management Tools
  • Getting users registered to use a Grid
  • Getting Grid credentials to wherever they're
    needed in the system
  • Authorization/Access Control Tools
  • Storing and providing access to system-wide
    authorization information

22
Basic Grid Security Mechanisms
  • Globus Toolkit provides
  • Grid-wide identities implemented as PKI
    certificates
  • Transport-level and message-level authentication
  • Ability to delegate credentials to agents
  • Ability to map between Grid and local identities
  • Local security administration and enforcement
  • Single sign-on support implemented as proxies
  • A plug-in framework for authorization decisions
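
The mapping between Grid and local identities is conventionally kept in a grid-mapfile, where each line pairs a quoted certificate subject name with one or more local account names. A minimal sketch of reading such a file, assuming that simple line format; the subject names and accounts below are made up for illustration, and `parse_gridmap` is a hypothetical helper, not a Globus API:

```python
import shlex


def parse_gridmap(text):
    """Return {subject name: [local account, ...]} from grid-mapfile-style text."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the subject name
        if len(parts) != 2:
            raise ValueError(f"malformed entry: {raw!r}")
        subject, accounts = parts
        mapping[subject] = accounts.split(",")
    return mapping


# Hypothetical entries, for illustration only.
SAMPLE = '''
# subject name                        local account(s)
"/O=Grid/OU=Example/CN=Jane Doe" jdoe
"/O=Grid/OU=Example/CN=Sam Roe" sroe,physics
'''

gridmap = parse_gridmap(SAMPLE)
print(gridmap["/O=Grid/OU=Example/CN=Jane Doe"][0])
```

Keeping this table at each site is what lets local administrators retain control: the Grid identity is global, but the decision about which local account it maps to stays local.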

23
Basic Grid Security Mechanisms
  • Basic security mechanisms are provided as
    libraries/classes and APIs.
  • Integrated with other GT tools and services
  • Integrated with many Grid community tools and
    services (and applications and systems)
  • A few stand-alone tools are also included.

24
A Cautionary Note
  • Grid security mechanisms are tedious to set up.
  • If exposed to users, hand-holding is usually
    required.
  • These mechanisms can be hidden entirely from end
    users, but still used behind the scenes.
  • These mechanisms exist for good reasons.
  • Many useful things can be done without Grid
    security.
  • It is unlikely that an ambitious project could go
    into production operation without security like
    this.
  • Most successful projects end up using Grid
    security, but using it in ways that end users
    don't see much.

25
Monitoring and Discovery Challenges
  • Grid Information Service
  • Requirements and characteristics
  • Uniform, flexible access to information
  • Scalable, efficient access to dynamic data
  • Access to multiple information sources
  • Decentralized maintenance
  • Secure information provision

26
Monitoring and Discovery Service in GT4 (MDS4)
  • WS-RF compatible
  • Monitoring of basic service data
  • Primary use case is discovery of services
  • Starting to be used for up/down statistics

27
MDS4 Information Providers
  • Code that generates resource property information
  • Were called service data providers in GT3
  • XML-based, not LDAP
  • Basic cluster data
  • Interfaces to Ganglia, Hawkeye
  • GLUE schema
  • Some service data from GT4 services
  • Start, timeout, etc.
  • Soft-state registration
  • Push and pull data models
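
Soft-state registration means a registration is only valid for a limited time and disappears unless the provider renews it. A minimal sketch of that idea, assuming a single time-to-live for every provider; the class and provider names are hypothetical and this is not the MDS4 interface:

```python
class SoftStateRegistry:
    """Registrations expire unless the provider refreshes them in time."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._expires = {}  # provider name -> expiry timestamp

    def register(self, provider, now):
        # Registering again before expiry simply extends the lifetime.
        self._expires[provider] = now + self.ttl

    def live(self, now):
        # Drop providers whose registration has lapsed, then list the rest.
        self._expires = {p: t for p, t in self._expires.items() if t > now}
        return sorted(self._expires)


registry = SoftStateRegistry(ttl_seconds=60)
registry.register("cluster-a", now=0)
registry.register("cluster-b", now=0)
registry.register("cluster-a", now=50)  # refresh before expiry
print(registry.live(now=70))            # cluster-b was never refreshed
```

Timestamps are passed in explicitly so the behaviour is easy to follow; a real service would read a clock. The benefit of the approach is that a crashed provider needs no explicit cleanup.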

28
MDS4 Index Service
  • Index Service is both registry and cache
  • Subscribes to information providers
  • Data, datatype, data provider information
  • Caches last value of all data
  • In-memory storage is the default approach

29
MDS4 Trigger Service
  • Compound consumer-producer service
  • Subscribe to a set of resource properties
  • Set of tests on incoming data streams to evaluate
    trigger conditions
  • When a condition matches, email is sent to
    pre-defined address
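
The trigger pattern described above can be sketched as a table of tests applied to each incoming resource property update. This is only an illustration of the idea: the property names, thresholds, and addresses are invented, and a list stands in for actually sending email:

```python
sent = []  # stands in for an outgoing mail queue


def notify(address, message):
    sent.append((address, message))


# (resource property, test on its value, address to notify) - all hypothetical
TRIGGERS = [
    ("free_disk_gb", lambda v: v < 10, "admin@example.org"),
    ("service_up", lambda v: v is False, "oncall@example.org"),
]


def on_update(resource, prop, value):
    """Evaluate every trigger subscribed to this property."""
    for name, test, address in TRIGGERS:
        if name == prop and test(value):
            notify(address, f"{resource}: {prop}={value}")


on_update("node1", "free_disk_gb", 42)   # no condition matches
on_update("node2", "free_disk_gb", 3)    # low disk
on_update("node2", "service_up", False)  # service down
print(sent)
```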

30
The Resource Management Challenge
  • Enabling secure, controlled remote access to
    heterogeneous computational resources and
    management of remote computation
  • Authentication and authorization
  • Resource discovery and characterization
  • Reservation and allocation
  • Computation monitoring and control
  • Addressed by a set of protocols and services
  • GRAM protocol as a basic building block
  • Resource brokering and co-allocation services
  • GSI for security, MDS for discovery

31
GRAM - Basic Job Submission and Control Service
  • A uniform service interface for remote job
    submission and control
  • Includes file staging and I/O management
  • Includes reliability features
  • Supports basic Grid security mechanisms
  • Available in Pre-WS and WS
  • GRAM is not a scheduler.
  • No scheduling
  • No metascheduling/brokering
  • Often used as a front-end to schedulers, and
    often used to simplify metaschedulers/brokers
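
The point that GRAM is a uniform interface rather than a scheduler can be illustrated with a small sketch in which a front-end accepts jobs and hands every one of them to whatever local scheduler the site runs. All class and method names here are hypothetical; this is not the GRAM API:

```python
class LocalScheduler:
    """Interface a site-specific adapter would implement (hypothetical)."""

    def enqueue(self, executable, args):
        raise NotImplementedError


class FifoScheduler(LocalScheduler):
    """Toy stand-in for a site's batch system."""

    def __init__(self):
        self.queue = []

    def enqueue(self, executable, args):
        self.queue.append((executable, tuple(args)))
        return len(self.queue)  # local job id


class JobFrontEnd:
    """Uniform submit/status interface; all scheduling is left to the site."""

    def __init__(self, scheduler):
        self.scheduler = scheduler
        self.jobs = {}

    def submit(self, executable, args=()):
        local_id = self.scheduler.enqueue(executable, args)
        handle = f"job-{local_id}"
        self.jobs[handle] = "PENDING"
        return handle

    def status(self, handle):
        return self.jobs[handle]


site = FifoScheduler()
front_end = JobFrontEnd(site)
handle = front_end.submit("/bin/date")
print(handle, front_end.status(handle))
```

Because the front-end makes no placement decisions, a metascheduler or broker can sit above several such front-ends and treat them all the same way.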

32
Condor-G
  • The Condor project has produced a helper
    front-end to GRAM
  • Managing sets of subtasks
  • Reliable front-end to GRAM to manage
    computational resources
  • Note: this is not Condor, which promotes
    high-throughput computing and the use of idle
    resources

33
Data Management
  • [Figure: data flow among the Application, Metadata
    Catalog, Replica Catalog, Replica Selection, MDS,
    and NWS (performance information and predictions),
    with GridFTP control and data channels to Replica
    Locations 1-3 (disk caches, tape library, disk
    array)]
  • Labeled steps: (1) Attribute Specification,
    (2) Logical Collection and Logical File Name,
    (3) Log. Info, (4) Multiple Locations,
    (5) Selected Replica, (6) PhysInfo
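
The replica selection step in the figure can be sketched as choosing, among the known locations of a file, the one with the best predicted transfer performance. The sketch below assumes the performance information arrives as a predicted bandwidth per location; the function name, location names, and numbers are invented for illustration:

```python
def select_replica(file_size_mb, replicas):
    """Pick the replica with the smallest estimated transfer time.

    replicas: {location: predicted bandwidth in MB/s}
    """
    usable = {loc: bw for loc, bw in replicas.items() if bw > 0}
    if not usable:
        raise ValueError("no reachable replica")
    return min(usable, key=lambda loc: file_size_mb / usable[loc])


# Hypothetical predictions for three replica locations.
predicted = {"replica-1": 12.0, "replica-2": 45.0, "replica-3": 0.0}
print(select_replica(2000, predicted))
```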
34
GridFTP
  • A high-performance, secure, reliable data
    transfer protocol optimized for high-bandwidth
    wide-area networks
  • FTP with well-defined extensions
  • Uses basic Grid security (control and data
    channels)
  • Multiple data channels for parallel transfers
  • Partial file transfers
  • Third-party (direct server-to-server) transfers
  • Reusable data channels
  • Command pipelining
  • GGF recommendation GFD.20
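
Parallel data channels and partial file transfers both rest on the idea of moving independent byte ranges of one file. A minimal sketch of dividing a file into ranges, one per channel; this illustrates the idea only and is not the GridFTP wire protocol, and `split_ranges` is a hypothetical helper:

```python
def split_ranges(size, channels):
    """Split [0, size) into up to `channels` contiguous (offset, length) ranges."""
    if size < 0 or channels < 1:
        raise ValueError("size must be >= 0 and channels >= 1")
    base, extra = divmod(size, channels)
    ranges, offset = [], 0
    for i in range(channels):
        length = base + (1 if i < extra else 0)  # spread the remainder
        if length == 0:
            break  # fewer bytes than channels
        ranges.append((offset, length))
        offset += length
    return ranges


print(split_ranges(10, 4))
```

Each range can be sent on its own channel and written at its own offset, so a failed range can also be retried without resending the whole file.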

35
Striped GridFTP Service
  • A distributed GridFTP service that runs on a
    storage cluster
  • Every node of the cluster is used to transfer
    data into/out of the cluster
  • Head node coordinates transfers
  • Multiple NICs/internal busses lead to very high
    performance
  • Maximizes use of Gbit WANs

36
RFT - File Transfer Queuing
  • A WSRF service for queuing file transfer requests
  • Server-to-server transfers
  • Checkpointing for restarts
  • Database back-end for failovers
  • Allows clients to request transfers and then
    disappear
  • No need to manage the transfer
  • Status monitoring available if desired
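
The combination of a database back-end and checkpointing can be sketched with a small persistent queue: each transfer's progress is recorded, so a restarted service knows what to resume and from which offset. The schema, function names, and URLs below are assumptions made for illustration, not the RFT implementation:

```python
import sqlite3


def open_queue(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS transfers ("
        "id INTEGER PRIMARY KEY, src TEXT, dst TEXT, "
        "bytes_done INTEGER DEFAULT 0, state TEXT DEFAULT 'QUEUED')"
    )
    return db


def enqueue(db, src, dst):
    cur = db.execute("INSERT INTO transfers (src, dst) VALUES (?, ?)", (src, dst))
    db.commit()
    return cur.lastrowid


def checkpoint(db, transfer_id, bytes_done, finished=False):
    # Record progress so a restart can continue from bytes_done.
    state = "DONE" if finished else "ACTIVE"
    db.execute(
        "UPDATE transfers SET bytes_done = ?, state = ? WHERE id = ?",
        (bytes_done, state, transfer_id),
    )
    db.commit()


def resumable(db):
    """Transfers a restarted service should pick up, with restart offsets."""
    rows = db.execute(
        "SELECT id, bytes_done FROM transfers WHERE state != 'DONE' ORDER BY id"
    )
    return rows.fetchall()


db = open_queue()
a = enqueue(db, "gsiftp://site-a.example.org/f1", "gsiftp://site-b.example.org/f1")
b = enqueue(db, "gsiftp://site-a.example.org/f2", "gsiftp://site-b.example.org/f2")
checkpoint(db, a, 4096)                 # partially transferred
checkpoint(db, b, 8192, finished=True)  # complete
print(resumable(db))
```

Because the state lives in the database rather than in the client, the client can submit its request and disconnect, which is the behaviour the slide describes.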

37
OGSA-DAI
  • OGSA interface for accessing XML and relational
    data stores
  • Implements the GGF DAIS WG standard (in progress)

Figure courtesy of Malcolm Atkinson and Rob
Baxter, UK eScience Center
38
Where is Globus Today
  • Previous versions of the software are currently
    available and in use by hundreds of projects
  • www.globus.org
  • GT4 is what I've mostly talked about
  • WS-RF based, latest standards
  • Beta currently available, Final in April 2005
  • GT2 software (pre-ws, mixed standards) is also
    included in the GT4 release
  • Complete functionality
  • Not interoperable

40
How to Get Involved
  • Become a GT4 Friend!
  • Open group of people from various organizations
    working with GT4 pre-release code and documents
  • Reporting problems in code and documents
  • Contributing ideas, tests, documentation
  • Building GT4-enabled applications
  • Weekly telephone calls
  • Discussion list
  • To subscribe to the GT4 friends list, send an
    email to majordomo_at_globus.org containing the
    words "subscribe gt4-friends" in the message body

41
General Globus Help and Support
  • Globus-discuss list
  • discuss_at_globus.org
  • http://globus.org/about/contacts.html
  • Bugzilla
  • Bugzilla.globus.org
  • GT4 Information
  • gt4-friends_at_globus.org
  • Weekly telecons for early testers

42
Overview
  • What is a Grid
  • What does the Globus Toolkit do?
  • Security
  • Monitoring
  • Resource Management
  • Data Management
  • Example Application: Grid3
  • Conclusions

43
Grid2003: An Operational Grid
  • 28 sites (2100-2800 CPUs) and growing
  • 400-1300 concurrent jobs
  • 8 substantial applications plus CS experiments
  • Running since October 2003
  • [Figure: map of participating sites, including
    Korea]
  • http://www.ivdgl.org/grid2003

44
Grid2003 Project Goals
  • Ramp up U.S. Grid capabilities in anticipation of
    LHC experiment needs in 2005.
  • Build, deploy, and operate a working Grid.
  • Include all U.S. LHC institutions.
  • Run real scientific applications on the Grid.
  • Provide state-of-the-art monitoring services.
  • Cover non-technical issues (e.g., SLAs) as well
    as technical ones.
  • Unite the U.S. CS and Physics projects that are
    aimed at support for LHC.
  • Common infrastructure
  • Joint (collaborative) work

45
Example Grid2003 Workflows
Genome sequence analysis
Sloan digital sky survey
Physics data analysis
46
Grid2003 Components
  • Security
  • GT GSI, CAS, GSI-OpenSSH
  • Monitoring
  • GT MDS, MonALISA, Ganglia
  • Job Submission
  • GT GRAM, Condor-G, Chimera and Pegasus
  • Data Tools
  • GT GridFTP, GT RLS, GT MCS

47
Grid2003 Metrics
48
Grid2003 Summary
  • Working Grid for wide set of applications
  • Joint effort between application scientists,
    computer scientists
  • Globus software as a starting point, additions
    from other communities as needed

49
What Should You Take Away From This Talk?
  • Grids are a way to work between administrative
    domains
  • The Globus Toolkit offers a starting point to
    building these applications
  • Many applications both in science and business
    use these resources
  • Much work still to be done in this area: many
    open research questions!

50
Global Community
51
For More Information
  • Jennifer Schopf
  • jms_at_mcs.anl.gov
  • www.mcs.anl.gov/jms
  • Globus Alliance
  • www.globus.org
  • Global Grid Forum
  • www.ggf.org

2nd Edition www.mkp.com/grid2
52
  • More details

53
Globus Certificate Service
  • An online service that issues low-quality GSI
    certificates
  • Intended for people who want to experiment with
    Grid components that require certificates but do
    not have any other means of acquiring
    certificates.
  • These certificates are not to be used on
    production systems.
  • Not a true Certificate Authority (CA)
  • No revoking or reissuing certificates
  • No verification of identities
  • The service itself is not especially secure.

54
Simple CA
  • A convenient method of setting up a certificate
    authority (CA).
  • The Certificate Authority can then be used to
    issue certificates for users and services that
    work with GSI and WS-Security.
  • Simple CA is intended for operators of small Grid
    testing environments and users who are not part
    of a larger Grid.
  • Most production Grids will not accept
    certificates that are not signed by a well-known
    CA, so the certificates generated by Simple CA
    will usually not be sufficient to gain access to
    production services.

55
MyProxy
  • MyProxy is a remote service that stores user
    credentials.
  • Users can request proxies for local use on any
    system on the network.
  • Web Portals can request user proxies for use with
    back-end Grid services.
  • Grid administrators can pre-load credentials in
    the server for users to retrieve when needed.
  • Greatly simplifies certificate management!

56
CAS - Community Authorization Service
  • CAS allows resource providers to specify
    coarse-grained access control policies in terms
    of communities as a whole.
  • Fine-grained access control is delegated to the
    community.
  • Resource providers maintain ultimate authority
    over their resources (including per-user control
    and auditing) but are spared most day-to-day
    policy administration tasks.
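
The split of responsibility described above can be sketched as two policy tables that must both permit an action: a coarse one owned by the resource provider and a fine one owned by the community, with the provider keeping a per-user override. The community, user names, and table layout are invented for illustration and do not reflect how CAS stores or conveys policy:

```python
# Coarse-grained: what the resource provider grants the community as a whole.
RESOURCE_POLICY = {"climate-vo": {"read", "write"}}

# Fine-grained: what the community grants each of its members.
COMMUNITY_POLICY = {
    ("climate-vo", "alice"): {"read", "write"},
    ("climate-vo", "bob"): {"read"},
}

# Per-user control retained by the resource provider.
BANNED_BY_RESOURCE = {"mallory"}


def allowed(community, user, action):
    if user in BANNED_BY_RESOURCE:
        return False  # the provider keeps ultimate authority
    coarse = action in RESOURCE_POLICY.get(community, set())
    fine = action in COMMUNITY_POLICY.get((community, user), set())
    return coarse and fine


print(allowed("climate-vo", "alice", "write"))
print(allowed("climate-vo", "bob", "write"))
```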

57
VOMS
  • A community-level group membership system
  • Database of user roles
  • Administrative tools
  • Client interface
  • voms-proxy-init
  • Uses client interface to produce an attribute
    certificate (instead of proxy) that includes
    roles and capabilities signed by VOMS server
  • Works with non-VOMS services, but gives more info
    to VOMS-aware services
  • Allows VOs to centrally manage user roles

58
Chimera Virtual Data
  • Captures both logical and physical steps in a
    data analysis process.
  • Transformations (logical)
  • Derivations (physical)
  • Builds a catalog.
  • Results can be used to replay analysis.
  • Generation of DAG (via Pegasus)
  • Execution on Grid
  • Catalog allows introspection of analysis process.

[Figure: Sloan Survey data; galaxy cluster size distribution]
59
Pegasus Workflow Transformation
  • Converts Abstract Workflow (AW) into Concrete
    Workflow (CW).
  • Uses Metadata to convert user request to logical
    data sources
  • Obtains AW from Chimera
  • Uses replication data to locate physical files
  • Delivers CW to DAGMan
  • Executes using Condor
  • Publishes new replication and derivation data in
    RLS and Chimera (optional)
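
The abstract-to-concrete step can be sketched as two operations: ordering jobs by their data dependencies and replacing logical file names with physical locations where a replica is known. This is a simplified illustration, not the Pegasus algorithm; the job names, file names, and URLs are hypothetical, and `graphlib` requires Python 3.9 or later:

```python
from graphlib import TopologicalSorter


def concretize(abstract_jobs, replicas):
    """Order jobs by data dependency and resolve inputs that have replicas.

    abstract_jobs: {job: {"inputs": [logical names], "outputs": [logical names]}}
    replicas: {logical name: physical URL}
    """
    producer = {
        lfn: job for job, spec in abstract_jobs.items() for lfn in spec["outputs"]
    }
    graph = {}
    for job, spec in abstract_jobs.items():
        deps = set()
        for lfn in spec["inputs"]:
            if lfn in producer:
                deps.add(producer[lfn])  # produced by another job
            elif lfn not in replicas:
                raise KeyError(f"no replica or producer for {lfn}")
        graph[job] = deps
    order = TopologicalSorter(graph).static_order()
    # Intermediate files keep their logical names in this sketch.
    return [
        (job, [replicas.get(lfn, lfn) for lfn in abstract_jobs[job]["inputs"]])
        for job in order
    ]


abstract = {
    "analyze": {"inputs": ["clean.dat", "calib.dat"], "outputs": ["result.dat"]},
    "extract": {"inputs": ["raw.dat"], "outputs": ["clean.dat"]},
}
known = {
    "raw.dat": "gsiftp://store.example.org/raw.dat",
    "calib.dat": "gsiftp://store.example.org/calib.dat",
}
concrete = concretize(abstract, known)
for job, inputs in concrete:
    print(job, inputs)
```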

[Figure: Chimera Virtual Data Catalog, Metadata Catalog, DAGMan, Replica Location Service, and Condor, with compute servers and storage systems]
60
MCS - Metadata Catalog Service
  • A stand-alone metadata catalog service
  • WSRF service interface
  • Stores system-defined and user-defined attributes
    for logical files/objects
  • Supports manipulation and query
  • Integrated with OGSA-DAI
  • OGSA-DAI provides metadata storage
  • When run with OGSA-DAI, basic Grid authentication
    mechanisms are available

61
RLS - Replica Location Service
  • A distributed system for tracking replicated data
  • Consistent local state maintained in Local
    Replica Catalogs (LRCs)
  • Collective state with relaxed consistency
    maintained in Replica Location Indices (RLIs)
  • Performance features
  • Soft state maintenance of RLI state
  • Compression of state updates
  • Membership and partitioning information
    maintenance
  • Note
  • RLS (developed by Globus Alliance and the
    DataGrid Project) replaces earlier components in
    the Globus Toolkit 2.x.
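
The two-level structure of RLS can be sketched with a local catalog that holds exact logical-to-physical mappings and an index that only knows which catalogs have reported a logical name. The index is refreshed by periodic updates, so it can lag behind the catalogs, which is the relaxed consistency mentioned above. Class names, method names, and URLs are hypothetical and do not reflect the RLS API:

```python
class LocalReplicaCatalog:
    """Consistent local state: logical name -> physical locations."""

    def __init__(self, name):
        self.name = name
        self.mappings = {}

    def add(self, lfn, pfn):
        self.mappings.setdefault(lfn, set()).add(pfn)

    def lookup(self, lfn):
        return sorted(self.mappings.get(lfn, ()))


class ReplicaLocationIndex:
    """Collective state: logical name -> catalogs that reported it."""

    def __init__(self):
        self.index = {}

    def update_from(self, lrc):
        # Periodic summary from one catalog; stale between updates.
        for names in self.index.values():
            names.discard(lrc.name)
        for lfn in lrc.mappings:
            self.index.setdefault(lfn, set()).add(lrc.name)

    def which_catalogs(self, lfn):
        return sorted(self.index.get(lfn, ()))


lrc_a = LocalReplicaCatalog("lrc-a")
lrc_b = LocalReplicaCatalog("lrc-b")
lrc_a.add("run42.dat", "gsiftp://a.example.org/store/run42.dat")
lrc_b.add("run42.dat", "gsiftp://b.example.org/store/run42.dat")

rli = ReplicaLocationIndex()
rli.update_from(lrc_a)
rli.update_from(lrc_b)

lrc_b.add("run43.dat", "gsiftp://b.example.org/store/run43.dat")
print(rli.which_catalogs("run42.dat"))  # both catalogs
print(rli.which_catalogs("run43.dat"))  # unknown until lrc-b's next update
```

A client first asks the index which catalogs to consult and then queries those catalogs for the physical locations.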