Globus Toolkit - PowerPoint PPT Presentation


Title: Globus Toolkit


1
Globus Toolkit 4
Ian Foster Argonne National Laboratory University
of Chicago Univa Corporation
2
Credits
  • Globus Toolkit v4 is the work of many talented
    Globus Alliance members, at
  • Argonne Natl. Lab & U.Chicago
  • USC Information Sciences Institute
  • National Center for Supercomputing Applns
  • U. Edinburgh
  • Swedish PDC
  • Univa Corporation
  • Other contributors at other institutions
  • Supported by DOE, NSF, UK EPSRC, and other sources

3
On April 29, 2005 the Globus Alliance
released the finest version of the Globus Toolkit
to date!
Don't take our word for it! Read the UK eScience
Evaluation of GT4: www.nesc.ac.uk/technical_papers/UKeS-2005-03.pdf
(reachable from www.globus.org, under News)
4
Overview
  • Background and Globus approach
  • Globus Toolkit current capabilities
  • Future directions
  • Related tools

5
  • A new age has dawned in scientific and
    engineering research, pushed by continuing
    progress in computing, information, and
    communication technology, and pulled by the
    expanding complexity, scope and scale of today's
    challenges. The capacity of this technology has
    crossed thresholds that now make possible a
    comprehensive cyberinfrastructure on which to
    build new types of scientific and engineering
    knowledge environments and organizations, and to
    pursue research in new ways and with increased
    efficacy

National Science Foundation Blue Ribbon Advisory
Panel, 2003
6
What specific problem is the Globus
Toolkit designed to address?
7
  • Ultimately, the Globus Toolkit is designed to enable the
    creation and maintenance of Virtual Organizations

8
History
  • In the early 90s, I (Foster) and others (e.g.,
    Carl Kesselman, USC-ISI) enjoyed helping
    scientists apply distributed computing
  • Opportunities seemed ripe for the picking
  • Application of technology always uncovers new and
    interesting requirements
  • Science is cool
  • Big/innovative science is even cooler

9
History (continued)
  • While helping to build/integrate a diverse range
    of applications, the same problems kept showing
    up over and over again
  • Too many different security systems
  • Too many different scheduling/execution
    mechanisms
  • Too many different storage systems
  • Too many different monitoring/status/event systems

10
What Kinds of Applications?
  • Computation intensive
  • Interactive simulation (climate modeling)
  • Large-scale simulation and analysis (galaxy
    formation, gravity waves, event simulation)
  • Engineering (parameter studies, linked models)
  • Data intensive
  • Experimental data analysis (e.g., physics)
  • Image sensor analysis (astronomy, climate)
  • Distributed collaboration
  • Online instrumentation (microscopes, x-ray)
  • Remote visualization (climate studies, biology)
  • Engineering (large-scale structural testing)

11
Key Common Feature
  • The size and/or complexity of the problem
    requires that people in several organizations
    collaborate and share computing resources, data,
    instruments

12
An Example Problem
  • The Large Hadron Collider (LHC)
  • Largest machine ever built by humans!
  • Located at CERN, Geneva, Switzerland
  • Particle accelerator and collider with a
    circumference of 16.8 miles
  • Scheduled to go into production in 2007

13
An Example Problem (continued)
  • Will generate 10 Petabytes (10^7 Gigabytes) of
    information per year
  • This information must be processed and stored
    somewhere
  • It is beyond the scope of a single institution to
    manage this problem
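As a rough back-of-the-envelope check on that scale, the annual volume can be converted into a sustained average rate. This is only a sketch: it assumes decimal units (1 Petabyte = 10^15 bytes) and a 365-day year, neither of which is stated on the slides.

```python
# Back-of-the-envelope: what sustained rate does 10 PB/year imply?
# Assumptions (not from the slides): decimal units, 365-day year.
PETABYTE = 10**15                      # bytes
GIGABYTE = 10**9                       # bytes
SECONDS_PER_YEAR = 365 * 24 * 3600

annual_bytes = 10 * PETABYTE
annual_gigabytes = annual_bytes // GIGABYTE            # 10**7 GB
avg_rate_mb_per_s = annual_bytes / SECONDS_PER_YEAR / 10**6

print(f"{annual_gigabytes} GB/year, about {avg_rate_mb_per_s:.0f} MB/s sustained")
```

Under these assumptions the average is a few hundred megabytes per second, around the clock, before any reprocessing or replication.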

14
Virtual Organizations
  • Distributed resources and people
  • Linked by networks, crossing admin domains
  • Sharing resources, common goals
  • Dynamic

[Figure: resources (R) grouped into virtual organizations VO-A and VO-B]
15
Virtual Organizations
  • Distributed resources and people
  • Linked by networks, crossing admin domains
  • Sharing resources, common goals
  • Dynamic
  • Fault tolerant

[Figure: resources (R) grouped into virtual organizations VO-A and VO-B]
16
The Globus Approach
17
The Role of the Globus Toolkit
  • A collection of solutions to problems that come
    up frequently when building collaborative
    distributed applications
  • Heterogeneity
  • A focus, in particular, on overcoming
    heterogeneity for application developers
  • Standards
  • We capitalize on and encourage use of existing
    standards (IETF, W3C, OASIS, GGF)
  • GT also includes reference implementations of
    new/proposed standards in these organizations

18
Layers in the Grid
19
A Typical eScience Use of Globus: Network for
Earthquake Eng. Simulation
Links instruments, data, computers, people
20
Without the Globus Toolkit
[Figure: architecture diagram. Components shown: Compute Server A, Compute Server B, Simulation Tool, Web Browser, Web Portal, Registration Service, Camera (x2), Telepresence Monitor, Data Viewer Tool, Chat Tool, Data Catalog, Credential Repository, Certificate Authority, Database services C, D, and E]
Component counts: Application Developer 10, Off the Shelf 12, Globus Toolkit 0, Grid Community 0
  • Resources implement standard access and management
    interfaces
  • Collective services aggregate and/or virtualize
    resources
  • Users work with client applications
  • Application services organize VOs and enable access
    to other services
21
With the Globus Toolkit
[Figure: architecture diagram. Components shown: Compute Server with Globus GRAM (x2), Simulation Tool, Web Browser, CHEF, Globus Index Service, Camera (x2), Telepresence Monitor, Data Viewer Tool, CHEF Chat Teamlet, Globus MCS/RLS, MyProxy, Certificate Authority, Database service with Globus DAI (x3)]
Component counts: Application Developer 2, Off the Shelf 9, Globus Toolkit 4, Grid Community 4
  • Resources implement standard access and management
    interfaces
  • Collective services aggregate and/or virtualize
    resources
  • Users work with client applications
  • Application services organize VOs and enable access
    to other services
22
The Globus Toolkit: Standard Plumbing for the
Grid
  • Not turnkey solutions, but building blocks and
    tools for application developers and system
    integrators
  • Some components (e.g., file transfer) go farther
    than others (e.g., remote job submission) toward
    end-user relevance
  • Easier to reuse than to reinvent
  • Compatibility with other Grid systems comes for
    free
  • Today the majority of the GT public interfaces
    are usable by application developers and system
    integrators
  • Relatively few end-user interfaces
  • In general, not intended for direct use by end
    users (scientists, engineers, marketing
    specialists)

23
The Application-Infrastructure Gap
  • Dynamic and/or Distributed Applications

24
Bridging the Gap: Grid Infrastructure
  • Service-oriented applications
  • Wrap applications as services
  • Compose applications into workflows
  • Service-oriented Grid infrastructure
  • Provision physical resources to support
    application workloads
[Figure: elements labeled Users, Composition, Workflows, Invocation, and Appln Service (x2)]

25
Grid Infrastructure
  • Distributed management
  • Of physical resources
  • Of software services
  • Of communities and their policies
  • Unified treatment
  • Build on Web services framework
  • Use WS-RF, WS-Notification (or WS-Transfer/Man)
    to represent/access state
  • Common management abstractions and interfaces

26
Globus is Open Source Grid Infrastructure
  • Implement key Web services standards
  • State, notification, security,
  • Software for Grid infrastructure
  • Service-enable new and existing resources
  • E.g., GRAM on computer, GridFTP on storage
    system, custom application services
  • Uniform abstractions and mechanisms
  • Tools to build applications that exploit Grid
    infrastructure
  • Registries, security, data management,
  • Enabler of a rich tool and service ecosystem

27
An eBusiness Use of Globus: SAP Demonstration at
GlobusWorld
  • 3 Globus-enabled applns
  • CRM Internet Pricing Configurator (IPC)
  • CRM Workforce Management (WFM)
  • SCM Advanced Planner and Optimizer (APO)
  • Applications modified to
  • Adjust to varying demand and resources
  • Use Globus to discover and provision resources

SAP AG R/3 Internet Pricing Configurator (IPC)
28
Overview
  • Background and Globus approach
  • Globus Toolkit current capabilities
  • Future directions
  • Related tools

29
The Globus Toolkit is a Collection of Components
  • A set of loosely-coupled components, with
  • Services and clients
  • Libraries
  • Development tools
  • GT components are used to build Grid-based
    applications and services
  • GT can be viewed as a Grid SDK
  • GT components can be categorized across two
    different dimensions
  • By broad domain area
  • By protocol support

30
GT Domain Areas
  • Core runtime
  • Infrastructure for building new services
  • Security
  • Apply uniform policy across distinct systems
  • Execution management
  • Provision, deploy, manage services
  • Data management
  • Discover, transfer, access large data
  • Monitoring
  • Discover and monitor dynamic services

31
GT Protocols
  • Web service protocols
  • WSDL, SOAP
  • WS Addressing, WSRF, WSN
  • WS Security, SAML, XACML
  • WS-Interoperability profile
  • Non Web service protocols
  • Standards-based, such as GridFTP
  • Custom

32
Stateless vs. Stateful Services
[Figure: Client sends move (A to B) to FileTransferService]
  • Without state, how does the client
  • Determine what happened (success/failure)?
  • Find out how many files completed?
  • Receive updates when interesting events arise?
  • Terminate a request?
  • Few useful services are truly stateless, but WS
    interfaces alone do not provide built-in support
    for state

33
FileTransferService (without WSRF)
[Figure: Client calls move (A to B) on FileTransferService and receives a transferID; further operations shown: whatHappen (returns state), tellMeWhen, cancel]
  • Developer reinvents the wheel for each new service
  • Custom management and identification of state
    (transferID)
  • Custom operations to inspect state synchronously
    (whatHappen) and asynchronously (tellMeWhen)
  • Custom lifetime operation (cancel)

34
WSRF in a Nutshell
  • Service
  • State representation
  • Resource
  • Resource Property
  • State identification
  • Endpoint Reference
  • State Interfaces
  • GetRP, QueryRPs, GetMultipleRPs, SetRP
  • Lifetime Interfaces
  • SetTerminationTime
  • ImmediateDestruction
  • Notification Interfaces
  • Subscribe
  • Notify
  • ServiceGroups

[Figure: a Service with several EPRs and the operations GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, Destroy]
35
FileTransferService (w/ WSRF)
[Figure: Client calls createResource (A to B) on FileTransferService and receives an EPR; further operations shown: getRP, queryRPs, destroy]
  • Developer specifies custom method to
    createResource and leaves the rest to WSRF
    standards
  • State exposed as a Resource with Resource Properties
    and identified by Endpoint Reference (EPR)
  • State inspected by standard interfaces (GetRP,
    QueryRPs)
  • Lifetime management by standard interfaces
    (Destroy)
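The pattern on this slide can be sketched as a toy, in-process model. This is not the GT4 or WSRF API: the class, method names, and URL below are hypothetical, and a Python dict stands in for the Endpoint Reference. The point is only that the service defines one custom operation (create) while inspection and destruction go through generic, resource-keyed operations.

```python
import itertools


class FileTransferService:
    """Toy stand-in for a WSRF-style service; every name here is hypothetical."""

    ADDRESS = "https://example.org/wsrf/FileTransferService"  # placeholder URL

    def __init__(self):
        self._resources = {}             # resource key -> resource properties
        self._keys = itertools.count(1)

    def create_resource(self, source, destination):
        """The one custom operation: start a transfer and return its EPR."""
        key = f"transfer-{next(self._keys)}"
        self._resources[key] = {
            "source": source,
            "destination": destination,
            "filesCompleted": 0,
            "status": "Active",
        }
        # The EPR pairs the service address with the key of one resource.
        return {"address": self.ADDRESS, "key": key}

    def get_resource_property(self, epr, name):
        """Generic inspection of a single property (GetRP analogue)."""
        return self._resources[epr["key"]][name]

    def query_resource_properties(self, epr, predicate):
        """Generic query over all properties (QueryRPs analogue)."""
        props = self._resources[epr["key"]]
        return {n: v for n, v in props.items() if predicate(n, v)}

    def destroy(self, epr):
        """Generic lifetime operation (Destroy analogue)."""
        del self._resources[epr["key"]]


service = FileTransferService()
epr = service.create_resource("A", "B")
status = service.get_resource_property(epr, "status")
numeric = service.query_resource_properties(epr, lambda n, v: isinstance(v, int))
service.destroy(epr)
```

Compared with the transferID approach, the client no longer needs service-specific calls such as whatHappen or cancel; it only needs the EPR.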

36
Globus Toolkit version 2 (GT2)
[Figure: GT2 components by domain area (Data Mgmt, Security, Common Runtime, Execution Mgmt, Info Services); legend: Web Services Components, Non-WS Components]
  • Components shown: Pre-WS Authentication & Authorization, GridFTP, C Common Libraries, Grid Resource Alloc. & Mgmt (GRAM), Monitoring & Discovery (MDS)
37
Globus Toolkit version 3 (GT3)
[Figure: GT3 components by domain area (Data Mgmt, Security, Common Runtime, Execution Mgmt, Info Services); legend: Web Services Components, Non-WS Components]
  • Components shown: Data Access & Integration, Community Authorization, WS Authentication & Authorization, Reliable File Transfer, Grid Resource Alloc. & Mgmt (WS GRAM), MDS3, Java WS Core, Pre-WS Authentication & Authorization, GridFTP, C Common Libraries, Grid Resource Alloc. & Mgmt (GRAM), Monitoring & Discovery (MDS), Replica Location, eXtensible IO (XIO)
38
Globus Toolkit version 4 (GT4)
[Figure: GT4 components by domain area (Data Mgmt, Security, Common Runtime, Execution Mgmt, Info Services); legend: Core, Contrib/Preview, Deprecated, Web Services Components, Non-WS Components; source: www.globus.org]
  • Components shown: Grid Telecontrol Protocol, Community Scheduling Framework, Delegation, Data Replication, Python WS Core, WebMDS, Data Access & Integration, Community Authorization, Trigger, C WS Core, Workspace Management, Authentication & Authorization, Reliable File Transfer, Grid Resource Allocation & Management, Index, Java WS Core, Pre-WS Authentication & Authorization, GridFTP, Pre-WS Grid Resource Alloc. & Mgmt, Pre-WS Monitoring & Discovery, C Common Libraries, Replica Location, eXtensible IO (XIO), Credential Mgmt
39
Globus Toolkit: Open Source Grid Infrastructure
Globus Toolkit v4 (www.globus.org)
[Figure: GT4 components by domain area (Data Mgmt, Security, Common Runtime, Execution Mgmt, Info Services)]
  • Components shown: Data Replication, Replica Location, Grid Telecontrol Protocol, Credential Mgmt, Data Access & Integration, Community Scheduling Framework, Delegation, Python Runtime, WebMDS, Reliable File Transfer, Community Authorization, Trigger, C Runtime, Workspace Management, GridFTP, Authentication & Authorization, Grid Resource Allocation & Management, Index, Java Runtime
40
4.0 is not a typical .0 release, but the
culmination of months of testing
[Figure: release timeline. Versions shown: 3.0.0, 3.0.1, 3.0.2, 3.2.0, 3.2.1, 3.3.0, 3.9.0, 3.9.1, 3.9.2, 3.9.3, 3.9.4, 3.9.5, 4.0.0, 4.0.1. Legend: CVS trunk, stable release branch, development release, stable release]
41
GT4 Components
[Figure: client/server component diagram]
  • CLIENT: your C, Python, and Java clients
  • Interoperable WS-I-compliant SOAP messaging
  • X.509 credentials: common authentication
  • SERVER, Java services in Apache Axis plus GT libraries and handlers: RFT, GRAM, Delegation, Index, Trigger, Archiver, CAS, OGSA-DAI, GTCP, your Java services
  • SERVER, C services using GT libraries and handlers (C WS Core): your C service
  • SERVER, Python hosting and GT libraries (pyGlobus WS Core): your Python service
  • SERVER, other components shown: RLS, Pre-WS MDS, SimpleCA, MyProxy, GridFTP, Pre-WS GRAM
42
Our Goals for GT4
  • Usability, reliability, scalability,
  • Web service components have quality equal or
    superior to pre-WS components
  • Documentation at acceptable quality level
  • Consistency with latest standards (WS-*, WSRF,
    WS-N, etc.) and Apache platform
  • WS-I Basic Profile compliant
  • WS-I Basic Security Profile compliant
  • New components, platforms, languages
  • And links to larger Globus ecosystem

44
GT4 Web Services Runtime
  • Supports both GT (GRAM, RFT, Delegation, etc.)
    and user-developed services
  • Redesign to enhance scalability, modularity,
    performance, usability
  • Leverages existing WS standards
  • WS-I Basic Profile: WSDL, SOAP, etc.
  • WS-Security, WS-Addressing
  • Adds support for emerging WS standards
  • WS-Resource Framework, WS-Notification
  • Java, Python, C hosting environments
  • Java is standard Apache

45
GT4 WS Core in a Nutshell
  • Implementation of WSRF: Resources, EndpointReferences, ResourceProperties
  • Operation Providers: pre-built implementations of WSRF operations
  • Notification implementation: Topics, TopicSet, Embedded Notification Consumer service
  • Implementations of Resources (ReflectionResource, PersistentReflectionResource) and ResourceProperties (SimpleResourceProperty, ReflectionResourceProperty)
[Figure: a Service with several EPRs and the operations GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, Destroy]
46
GT4 WS Core in a Nutshell
  • ResourceHome: the home owns the Resource instances in the service
  • SingletonResourceHome: manages a single instance of a Resource
  • ServiceResourceHome: for services that support a single Resource instance
  • ResourceHomeImpl: manages multiple Resource instances; supports resources with in-memory state and resources with persistent (on disk) state
[Figure: a Service and its ResourceHome, with several EPRs and the operations GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, Destroy]
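The "home" idea can be illustrated with a small sketch: one object owns all resource instances for a service and looks them up by key, with an optional variant that also persists state to disk. This is not the GT4 Java API; the class names, the JSON file layout, and the assumption that keys are filename-safe are all hypothetical.

```python
import json
import os
import tempfile


class InMemoryHome:
    """Owns resource instances for one service, keyed by resource key."""

    def __init__(self):
        self._store = {}

    def create(self, key, state):
        self._store[key] = dict(state)

    def find(self, key):
        return self._store[key]

    def remove(self, key):
        del self._store[key]


class PersistentHome(InMemoryHome):
    """Same interface, but each resource is also written to disk as JSON."""

    def __init__(self, directory):
        super().__init__()
        self._dir = directory

    def _path(self, key):
        # Assumes keys are safe to use as file names.
        return os.path.join(self._dir, f"{key}.json")

    def create(self, key, state):
        super().create(key, state)
        with open(self._path(key), "w") as f:
            json.dump(state, f)

    def find(self, key):
        if key not in self._store:           # lazily reload after a restart
            with open(self._path(key)) as f:
                self._store[key] = json.load(f)
        return self._store[key]

    def remove(self, key):
        self._store.pop(key, None)
        os.remove(self._path(key))


workdir = tempfile.mkdtemp()
home = PersistentHome(workdir)
home.create("job-1", {"status": "Active"})

restarted = PersistentHome(workdir)          # simulates a container restart
reloaded = restarted.find("job-1")
```

Keeping the in-memory and persistent variants behind one interface means the service code does not change when a deployment decides that resources must survive a restart.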
47
GT4 WS Core in a Nutshell
  • Service Container: hosts multiple services in one container (one JVM process)
  • More details: based on the Axis service container, processes SOAP messages, ResourceContext extension
48
GT4 WS Core in a Nutshell
  • Secure Communication: Transport, Message, Conversation (Transport demonstrates best performance)
  • Configurable Security Policies: Policy Information Points (PIPs) and Policy Decision Points (PDPs), chained
  • Example authorization PDPs: GridMap, SAML implementations, XACML policies
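A chained PIP/PDP arrangement can be sketched in a few lines. This is an illustration of the idea only, not GT4's authorization framework: the function names, the request dictionary, the hard-coded tables, and the combining rule (any deny wins, and at least one permit is required) are all assumptions made for the example.

```python
PERMIT, DENY, NOT_APPLICABLE = "permit", "deny", "not_applicable"


def group_pip(request):
    """PIP: enrich the request with attributes (here, a hard-coded group table)."""
    groups = {"/O=Grid/CN=Alice": ["physics"]}
    request["groups"] = groups.get(request["subject"], [])


def gridmap_pdp(request):
    """PDP: permit only subjects listed in a gridmap-like table."""
    gridmap = {"/O=Grid/CN=Alice": "alice"}
    return PERMIT if request["subject"] in gridmap else DENY


def group_pdp(request):
    """PDP: job submission requires membership in the 'physics' group."""
    if request["operation"] != "submitJob":
        return NOT_APPLICABLE
    return PERMIT if "physics" in request["groups"] else DENY


def authorize(request, pips, pdps):
    """Run every PIP, then every PDP; assumed rule: deny overrides."""
    for pip in pips:
        pip(request)
    decisions = [pdp(request) for pdp in pdps]
    return DENY not in decisions and PERMIT in decisions


chain = ([group_pip], [gridmap_pdp, group_pdp])
allowed = authorize({"subject": "/O=Grid/CN=Alice", "operation": "submitJob"}, *chain)
denied = authorize({"subject": "/O=Grid/CN=Mallory", "operation": "submitJob"}, *chain)
```

Because each decision point is an independent function, a site can add or reorder policies through configuration without touching service code.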
49
GT4 WS Core in a Nutshell
  • WorkManager: thread pool, site-independent work manager
  • Apache Database Connection Pool library (JDBC DataSource implementation)
  • JNDI Directory: manages internal, shared objects (ResourceHomes, WorkManager, Configuration objects, ...)
[Figure: container with PIP, PDP, WorkManager, DB Conn Pool, and JNDI Directory]
50
GT4 WS Core in a Nutshell
  • Deploy the Service Container standalone or within Apache Tomcat
[Figure: container with PIP, PDP, WorkManager, DB Conn Pool, and JNDI Directory]
51
GT4 Web Services Runtime
52
Modeling State in Web Services
[Figure: elements labeled Service requestor (e.g., user application), Factory service, and Registry]
  • Authentication and Authorization are applied to all requests
  • Interactions standardized using WSDL and SOAP
53
WSRF & WS-Notification
  • Naming and bindings (basis for virtualization)
  • Every resource can be uniquely referenced, and
    has one or more associated services for
    interacting with it
  • Lifecycle (basis for fault resilient state mgmt)
  • Resources created by services following factory
    pattern
  • Resources destroyed immediately or scheduled
  • Information model (basis for monitoring,
    discovery)
  • Resource properties associated with resources
  • Operations for querying and setting this info
  • Asynchronous notification of changes to
    properties
  • Service groups (basis for registries, collective
    svcs)
  • Group membership rules and membership management
  • Base Fault type
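Two of the ideas above, scheduled destruction and asynchronous notification of property changes, can be sketched together. This is a hypothetical in-process model, not the WS-ResourceLifetime or WS-Notification wire protocol: callbacks stand in for notification consumers, and an explicit `sweep` call stands in for whatever timer a real container would use.

```python
class Resource:
    """A resource with properties, an optional termination time, and subscribers."""

    def __init__(self, properties, termination_time=None):
        self.properties = dict(properties)
        self.termination_time = termination_time   # None: no scheduled destruction
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def set_property(self, name, value):
        old = self.properties.get(name)
        self.properties[name] = value
        if old != value:                            # notify only on real changes
            for callback in self._subscribers:
                callback(name, old, value)


def sweep(resources, now):
    """Remove resources whose termination time has passed; return their keys."""
    expired = [key for key, res in resources.items()
               if res.termination_time is not None and res.termination_time <= now]
    for key in expired:
        del resources[key]
    return expired


events = []
resources = {
    "r1": Resource({"status": "Pending"}, termination_time=100.0),
    "r2": Resource({"status": "Pending"}),
}
resources["r1"].subscribe(lambda name, old, new: events.append((name, old, new)))
resources["r1"].set_property("status", "Active")
removed = sweep(resources, now=150.0)
```

Scheduled destruction is what makes the lifecycle fault resilient: if a client disappears without calling destroy, its resources are still reclaimed once their termination time passes.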

54
WSRF/WSNs Compared (HPDC 2005)
Feature | GT4-Java | GT4-C | pyGridWare | WSRFLite | WSRF.NET
Languages supported | Java | C | Python | Perl | C#/C++/VBasic, etc.
WS-Security password profile | Yes | No | In progress | In progress | Yes
WS-Security X.509 profile | Yes | In progress | Yes | In progress | Yes
WS-SecureConversation | Yes | No | Yes | No | Yes
TLS/SSL | Yes | Yes | Yes | Yes | Yes
Authorization | Multiple | Multiple | Callout | None
Persistence of WS-Resources | Yes | Not default | Yes | Yes | Yes
Memory Footprint | JVM 10M | 22 KB | 12 MB | 12 MB | Depends
Memory size per WS-Resource | Depends on resource state | 70B | Depends on resource state | 0 (file/DB) or 10B (process) | Depends on resource state
Unmodified hosting environment | Yes | No | Yes | Yes (Apache) | Yes
Compliance with WS-I Basic Profile | Yes | Yes | Yes | In progress | Yes
Compliance with WS-I Basic Security Profile | Yes | Yes | Yes | No | Yes
Logging | Log4J | Yes | Yes | Yes | WSE diagnostics
WS-ResourceLifetime | Yes | Yes | Yes | Yes | Yes
WS-ResourceProperties | Yes | Yes | Yes | Yes | Yes
WS-ServiceGroup | Yes | Yes | Yes | Yes | Yes
WS-BaseFaults | Yes | Yes | Yes | Yes | Yes
WS-BaseNotification | Yes | Consumer | Yes | No | Yes
WS-BrokeredNotification | Partial | No | No | No | Yes
WS-Topics | Partial | Partial | Partial | No | Partial
55
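GetRP Test
  • Distributed client and service on same LAN
  • (times in milliseconds)
[Figure: bar chart of GetRP times per implementation for three configurations: No Security, X509 Signing, HTTPS. Bar labels shown: 149.67, 25.57, 181.96, 17.1, 140.5, 55.6, 81.39, 10.05, 8.23, N/A, 2.34, 14.8, 11.46, 12.91, 2.85]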
56
GT4 WS Core Performance
(1) Message-level security (times in milliseconds)
Operation | GT4 Java | GT4 C | GT4 Python | WSRF.NET
GetRP | 181.96 | 14.77 | 140.50 | 81.39
SetRP | 182.04 | 14.99 | 142.21 | 82.48
CreateR | 188.46 | 14.98 | 132.26 | 96.22
DestroyR | 182.03 | 15.76 | 136.12 | 86.89
Notify | 219.51 | N/A | 244.93 | 101.57
(2) Transport-level security (times in milliseconds)
Operation | GT4 Java | GT4 C | GT4 Python | WSRF.NET
GetRP | 11.46 | 2.85 | 149.67 | 12.91
SetRP | 11.47 | 2.86 | 150.79 | 12.3
CreateR | 18.00 | 2.82 | 132.60 | 20.84
DestroyR | 14.92 | 2.71 | 149.21 | 16.05
Notify | 29.26 | 9.67 | 169.07 | 45.0
Source: WSRF/WSNs Compared, HPDC 2005.
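To make the gap between the two security modes easier to see, the GetRP rows can be turned into a ratio per implementation. The numbers are copied from the two tables above; the ratio itself is a derived illustration and does not appear on the slides.

```python
# GetRP times in milliseconds, copied from the two tables above.
message_level = {"GT4 Java": 181.96, "GT4 C": 14.77, "GT4 Python": 140.50, "WSRF.NET": 81.39}
transport_level = {"GT4 Java": 11.46, "GT4 C": 2.85, "GT4 Python": 149.67, "WSRF.NET": 12.91}

# How many times slower is message-level security than transport-level security?
slowdown = {name: round(message_level[name] / transport_level[name], 1)
            for name in message_level}

for name, factor in slowdown.items():
    print(f"{name}: message-level GetRP takes {factor}x the transport-level time")
```

For three of the four implementations message-level security costs several times more per call, which is consistent with the deck's remark that transport security demonstrates the best performance; the Python figures are the exception, with the two modes roughly equal.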
58
Globus Security
  • Control access to shared services
  • Address autonomous management, e.g., different
    policy in different work-groups
  • Support multi-user collaborations
  • Federate through mutually trusted services
  • Local policy authorities rule
  • Allow users and application communities to set up
    dynamic trust domains
  • Personal/VO collection of resources working
    together based on trust of user/VO

59
Virtual Organization (VO) Concept
  • VO for each application or workload
  • Carve out and configure resources for a
    particular use and set of users

60
GT4 Security
Users
61
GT4 Security
  • Public-key-based authentication
  • Extensible authorization framework based on Web
    services standards
  • SAML-based authorization callout
  • As specified in GGF OGSA-Authz WG
  • Integrated policy decision engine
  • XACML policy language, per-operation policies,
    pluggable
  • Credential management service
  • MyProxy (One time password support)
  • Community Authorization Service
  • Standalone delegation service

62
GT4's Use of Security Standards
[Figure: three security options, captioned "Supported, but slow", "Supported, but insecure", and "Fastest, so default"]
63
GT-XACML Integration
  • eXtensible Access Control Markup Language
  • OASIS standard, open source implementations
  • XACML: sophisticated policy language
  • Globus Toolkit ships with XACML runtime
  • Included in every client and server built on GT
  • Turned-on through configuration
  • that can be called transparently from runtime
    and/or explicitly from application
  • and we use the XACML-model for our Authz
    Processing Framework

64
GT Authorization Framework
65
Other Security Services Include
  • MyProxy
  • Simplified credential management
  • Web portal integration
  • Single-sign-on support
  • KCA & kx.509
  • Bridging into/out-of Kerberos domains
  • SimpleCA
  • Online credential generation
  • PERMIS
  • Authorization service callout

67
GT4 Data Management
  • Stage/move large data to/from nodes
  • GridFTP, Reliable File Transfer (RFT)
  • Alone, and integrated with GRAM
  • Locate data of interest
  • Replica Location Service (RLS)
  • Replicate data for performance/reliability
  • Data Replication Service (DRS)
  • Provide access to diverse data sources
  • File systems, parallel file systems, hierarchical
    storage: GridFTP
  • Databases: OGSA-DAI

68
GridFTP in GT4
[Figure: disk-to-disk performance on TeraGrid]
  • 100% Globus code
  • No licensing issues
  • Stable, extensible
  • IPv6 Support
  • XIO for different transports
  • Striping → multi-Gb/sec wide area transport
  • 27 Gbit/s on 30 Gbit/s link
  • Pluggable
  • Front-end: e.g., future WS control channel
  • Back-end: e.g., HPSS, cluster file systems
  • Transfer: e.g., UDP, NetBLT transport

69
Reliable File Transfer: Third Party Transfer
  • Fire-and-forget transfer
  • Web services interface
  • Many files and directories
  • Integrated failure recovery
  • Has transferred 900K files
[Figure: an RFT Client exchanges SOAP messages and optional notifications with the RFT Service, which controls two GridFTP Servers]
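The "fire-and-forget with integrated failure recovery" idea can be sketched as a loop that retries each file independently and reports what is left. This is a simplified illustration, not RFT's implementation: the function names, the retry limit, and the use of `IOError` as the failure signal are assumptions, and the `copy_file` callable stands in for a real GridFTP third-party transfer.

```python
def run_transfers(pending, copy_file, max_attempts=3):
    """Try every (source, destination) pair up to max_attempts times.

    Returns (done, failed) so the caller can see what still needs attention.
    """
    done, failed = [], []
    for src, dst in pending:
        for attempt in range(1, max_attempts + 1):
            try:
                copy_file(src, dst)
                done.append((src, dst))
                break
            except IOError:
                if attempt == max_attempts:
                    failed.append((src, dst))
    return done, failed


# A fake transfer function that fails once for one file, then succeeds.
attempts = {}

def flaky_copy(src, dst):
    attempts[src] = attempts.get(src, 0) + 1
    if src == "gsiftp://a/file2" and attempts[src] == 1:
        raise IOError("connection reset")


done, failed = run_transfers(
    [("gsiftp://a/file1", "gsiftp://b/file1"),
     ("gsiftp://a/file2", "gsiftp://b/file2")],
    flaky_copy,
)
```

A production service would also persist the pending list so that the request survives a restart of the service itself, which this sketch leaves out.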
70
Replica Location Service
  • Identify location of files via logical to
    physical name map
  • Distributed indexing of names, fault tolerant
    update protocols
  • GT4 version scalable and stable
  • Managing 40 million files across 10 sites

[Figure: distributed Index nodes]
Local DB | Update send (secs) | Bloom filter (secs) | Bloom filter (bits)
10K | <1 | 2 | 1 M
1 M | 2 | 24 | 10 M
5 M | 7 | 175 | 50 M
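The table refers to Bloom filters, which let a local catalog send an index a compact summary of the logical names it holds instead of the full list. The sketch below is a generic Bloom filter, not the RLS implementation: the hashing scheme (SHA-256 with a per-hash prefix), the filter size, the logical file names, and the URLs are all illustrative assumptions.

```python
import hashlib


class BloomFilter:
    """Compact set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# Local catalog: logical file name -> physical replica locations (made-up names).
catalog = {
    "lfn://example/frame-001": ["gsiftp://site-a.example.org/data/frame-001"],
    "lfn://example/frame-002": ["gsiftp://site-b.example.org/data/frame-002"],
}

summary = BloomFilter(num_bits=1000, num_hashes=3)
for logical_name in catalog:
    summary.add(logical_name)

known = "lfn://example/frame-001" in summary   # always True for added names
```

An index holding only such summaries can answer "which sites might have this file?" and then forward the query to those sites; a false positive costs one wasted lookup rather than a wrong answer.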
71
Reliable Wide Area Data Replication
LIGO Gravitational Wave Observatory
[Figure: map of replication sites, including Birmingham]
Replicating >1 Terabyte/day to 8 sites; >30
million replicas so far; MTBF = 1 month
www.globus.org/solutions
72
OGSA-DAI
  • Provide service-based access to structured data
    resources as part of Globus
  • Specify a selection of interfaces tailored to
    various styles of data access, starting with
    relational and XML

73
The OGSA-DAI Framework
[Figure: an Application uses the Client Toolkit to call the OGSA-DAI service; its Engine runs Activities (SQLQuery, GZip, GridFTP, XPath, readFile, XSLT) and reaches Data Resources through JDBC, XMLDB, and File access. Data resources shown: MySQL, DB2, SQL Server, Xindice, SWISS-PROT]
74
Extensibility Example
[Figure: OGSA-DAI service whose Engine runs SQLQuery activities, extended with a Multiple SQL GDS, over JDBC to MySQL]
75
OGSA-DAI: A Framework for Building Applications
  • Supports data access, insert and update
  • Relational: MySQL, Oracle, DB2, SQL Server,
    Postgres
  • XML: Xindice, eXist
  • Files: CSV, BinX, EMBL, OMIM, SWISSPROT,
  • Supports data delivery
  • SOAP over HTTP
  • FTP, GridFTP
  • E-mail
  • Inter-service
  • Supports data transformation
  • XSLT
  • ZIP, GZIP
  • Supports security
  • X.509 certificate based security

76
OGSA-DAI: Other Features
  • A framework for building data clients
  • Client toolkit library for application developers
  • A framework for developing functionality
  • Extend existing activities, or implement your own
  • Mix and match activities to provide functionality
    you need
  • Highly extensible
  • Customise our out-of-the-box product
  • Provide your own services, client-side support,
    and data-related functionality
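The "mix and match activities" idea can be sketched as a pipeline of small functions, each consuming the previous one's output. This is not the OGSA-DAI client toolkit: the function names are hypothetical, an in-memory SQLite database stands in for a relational resource behind JDBC, and CSV stands in for a transformation such as XSLT.

```python
import gzip
import sqlite3


def sql_query(connection, statement):
    """Data access activity: run a query and return the rows."""
    return connection.execute(statement).fetchall()


def to_csv(rows):
    """Transformation activity: render rows as CSV text."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


def gzip_bytes(text):
    """Transformation activity: compress the text for delivery."""
    return gzip.compress(text.encode("utf-8"))


def run_pipeline(data, activities):
    """Feed the output of each activity into the next one."""
    for activity in activities:
        data = activity(data)
    return data


# Stand-in data resource (made-up table and rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE proteins (id TEXT, length INTEGER)")
conn.executemany("INSERT INTO proteins VALUES (?, ?)", [("P1", 120), ("P2", 87)])

payload = run_pipeline(
    "SELECT id, length FROM proteins ORDER BY id",
    [lambda statement: sql_query(conn, statement), to_csv, gzip_bytes],
)
```

Adding a new capability then means writing one more function and inserting it into the list, which mirrors the extensibility argument on the slide.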

78
Execution Management (GRAM)
  • Common WS interface to schedulers
  • Unix, Condor, LSF, PBS, SGE,
  • More generally: interface for process execution
    management
  • Lay down execution environment
  • Stage data
  • Monitor and manage lifecycle
  • Kill it, clean up
  • A basis for application-driven provisioning
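The lifecycle steps above (lay down the environment, stage data, run, clean up, or kill) can be modeled as a small state machine. The state names and allowed transitions below are illustrative assumptions chosen to match the bullets, not a specification of GRAM's job states, and modeling a kill as a move to "Failed" is a simplification.

```python
# Allowed transitions for an illustrative job lifecycle (assumed, not normative).
ALLOWED = {
    "Unsubmitted": {"StageIn", "Failed"},
    "StageIn": {"Pending", "Failed"},
    "Pending": {"Active", "Failed"},
    "Active": {"StageOut", "Failed"},
    "StageOut": {"CleanUp", "Failed"},
    "CleanUp": {"Done", "Failed"},
    "Done": set(),
    "Failed": set(),
}


class Job:
    def __init__(self):
        self.state = "Unsubmitted"
        self.history = [self.state]

    def advance(self, new_state):
        """Move to new_state, rejecting transitions the lifecycle does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        self.history.append(new_state)

    def kill(self):
        """Kill a job that has not finished; no-op in a terminal state."""
        if ALLOWED[self.state]:
            self.advance("Failed")


job = Job()
for step in ("StageIn", "Pending", "Active", "StageOut", "CleanUp", "Done"):
    job.advance(step)

killed = Job()
killed.advance("StageIn")
killed.kill()
```

Exposing the current state as a property that clients can query or subscribe to is what lets one interface sit in front of very different local schedulers.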

79
GT4 WS GRAM
  • 2nd-generation WS implementation optimized for
    performance, flexibility, stability, scalability
  • Streamlined critical path
  • Use only what you need
  • Flexible credential management
  • Credential cache and delegation service
  • GridFTP and RFT used for data operations
  • Data staging and streaming output
  • Eliminates redundant GASS code

80
GT4 WS GRAM Architecture
[Figure: WS GRAM architecture. Elements shown: Client; service host(s) and compute element(s) with the GT4 Java Container (GRAM services, Delegation, RFT File Transfer), GRAM adapter, sudo, local job control, SEG (job events), local scheduler, user job, and GridFTP; remote storage element(s) with GridFTP. Labeled interactions: job functions, delegate, transfer request, FTP control, FTP data]
81
GT4 WS GRAM Architecture
[Figure: WS GRAM architecture diagram, as on the previous slide]
Delegated credential can be made available to
the application
82
GT4 WS GRAM Architecture
[Figure: WS GRAM architecture diagram, as on the previous slide]
Delegated credential can be used to authenticate
with RFT
83
GT4 WS GRAM Architecture
[Figure: WS GRAM architecture diagram, as on the previous slide]
Delegated credential can be used to authenticate
with GridFTP
84
WS GRAM Performance
  • Time to submit a basic GRAM job
  • Pre-WS GRAM: < 1 second
  • WS GRAM: 2 seconds
  • Concurrent jobs
  • Pre-WS GRAM: 300 jobs
  • WS GRAM: 32,000 jobs
  • Various studies are underway to test latest
    software

85
GT4 WS GRAM Performance
Rows: Sustained Job Load Per Client Thread (N); columns: Number of Client Threads (M)
N \ M | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128
1 | 7 | 15 | 29 | 57 | 80 | 69 | 69 | 70
2 | 15 | 29 | 58 | 79 | 74 | 70 | 70 | 64
4 | 29 | 58 | 78 | 77 | 68 | 69 | 52 | 69
8 | 59 | 77 | 77 | 72 | 65 | 27 |  | 69
16 | 77 | 77 | 75 | 64 | 27 |  |  | 50
32 | 76 | 75 | 68 | 64 | 67 |  |  |
64 | 75 | 73 | 70 | 66 | 65 |  |  |
128 | 80 | 72 | 64 | 63 | 71 |  |  |
All numbers are simple jobs/minute, no delegation
or staging
86
Workspace Service: The Hosted Activity
[Figure: elements labeled Client, Interface, Policy, Activity, Environment, and Resource provider]
  • Allocate/provision, configure, initiate activity,
    monitor activity, control activity
87
Activities Can Be Nested
[Figure: nested activities; elements labeled Client (x3), Policy, Environment, Resource provider, and Interface]
88
For Example
Provisioning, management, and monitoring at all
levels
89
Dynamic Service Deployment
Community A
Community Z
  • Requirements
  • Community control
  • Persistence
  • Resource guarantees
  • Non-interference
  • Community scheduling logic
  • Data distribution
  • Community management
  • Science services
  • ...

90
Virtual Machine Costs
Job in booted VM
GRAM job in paused VM
GRAM job
91
Virtual OSG Clusters
OSG
93
Monitoring and Discovery
  • Every service should be monitorable and
    discoverable using common mechanisms
  • WSRF/WSN provides those mechanisms
  • A common aggregator framework for collecting
    information from services, thus
  • MDS-Index: XPath queries, with caching
  • MDS-Trigger: perform action on condition
  • (MDS-Archiver: XPath on historical data)
  • Deep integration with Globus containers and
    services: every GT4 service is discoverable
  • GRAM, RFT, GridFTP, CAS,
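The Index and Trigger roles can be sketched with an aggregated XML document that is queried with XPath and checked against a condition. This is an illustration only: the XML layout, element names, host names, and trigger condition are invented for the example, and Python's `xml.etree.ElementTree` supports only a limited XPath subset.

```python
import xml.etree.ElementTree as ET

# A made-up aggregated document, standing in for what an index might hold.
INDEX_XML = """
<Index>
  <Entry service="GRAM" host="hostA"><FreeCPUs>12</FreeCPUs></Entry>
  <Entry service="GRAM" host="hostB"><FreeCPUs>0</FreeCPUs></Entry>
  <Entry service="RFT" host="hostA"><ActiveTransfers>3</ActiveTransfers></Entry>
</Index>
"""

index = ET.fromstring(INDEX_XML)

# Index role: answer an XPath query over the aggregated information.
gram_hosts = [entry.get("host")
              for entry in index.findall("./Entry[@service='GRAM']")]


def check_trigger(index, on_fire):
    """Trigger role: call on_fire for every GRAM host with no free CPUs."""
    for entry in index.findall("./Entry[@service='GRAM']"):
        if int(entry.findtext("FreeCPUs")) == 0:
            on_fire(entry.get("host"))


alerts = []
check_trigger(index, alerts.append)
```

Because every service publishes its state in the same form, the same query and trigger machinery works across GRAM, RFT, and any user-written service.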

94
GT4 Monitoring & Discovery
[Figure: clients (e.g., WebMDS) access an MDS-Index (WS-ServiceGroup) in a GT4 Container; further GT4 containers each run an MDS-Index with automated registration of services in the container (RFT, GRAM, User); GridFTP is attached through an adapter. Labels shown: Registration & WSRF/WSN Access, custom protocols for non-WSRF entities]
95
Index Server Performance
  • As the MDS4 Index grows, query rate and response
    time both slow, although sublinearly
  • Response time slows due to increasing data
    transfer size
  • Full Index is being returned
  • Response is re-built for every query
  • Real question: how much over simple WS-N
    performance?

96
Information Providers
  • GT4 information providers collect information
    from some system and make it accessible as WSRF
    resource properties
  • Growing number of information providers
  • Ganglia, CluMon, Nagios
  • SGE, LSF, OpenPBS, PBSPro, Torque
  • Many opportunities to build additional ones
  • E.g., network monitoring, storage systems,
    various sensors
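
The provider pattern described above (collect information from some system, then expose it as named properties) might look like the following sketch. The element names (`ResourceProperties`, `HostName`, `OperatingSystem`, `ProcessorCount`) and both function names are hypothetical; a real GT4 provider would publish WSRF resource properties through the toolkit's own interfaces, which are not shown here.

```python
import os
import platform
import xml.etree.ElementTree as ET


def collect_host_info():
    """Collect a few facts from the local system (the monitored 'system')."""
    return {
        "HostName": platform.node(),
        "OperatingSystem": platform.system(),
        "ProcessorCount": os.cpu_count() or 0,
    }


def as_resource_properties(info):
    """Render the collected facts as an XML document of named properties."""
    doc = ET.Element("ResourceProperties")
    for name, value in info.items():
        ET.SubElement(doc, name).text = str(value)
    return doc


doc = as_resource_properties(collect_host_info())
xml_text = ET.tostring(doc, encoding="unicode")
```

A provider for a batch scheduler or a network monitor would replace `collect_host_info` with a call into that system and leave the rendering step unchanged.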

97
GT4 Summary
  • CLIENT: your C, Python, and Java clients
  • Interoperable WS-I-compliant SOAP messaging
  • X.509 credentials: common authentication
  • SERVER hosting environments
  • C WS Core: C services using GT libraries and
    handlers
  • pyGlobus WS Core: Python hosting, GT libraries
  • Java services in Apache Axis plus GT libraries
    and handlers
  • Services shown: RFT, GRAM, Delegation, Index,
    Trigger, Archiver, CAS, OGSA-DAI, GTCP, RLS,
    Pre-WS MDS, SimpleCA, MyProxy, GridFTP, Pre-WS
    GRAM, plus your own C, Python, and Java services
98
GT4 Documentation is Much Improved!
99
Overview
  • Background and Globus approach
  • Globus Toolkit current capabilities
  • Future directions
  • Related tools

100
The Globus Commitment to Open Source
  • Globus was first established as an open source
    project in 1996
  • The Globus Toolkit is open source to:
  • allow for inspection
  • for consideration in standardization processes
  • encourage adoption
  • in pursuit of ubiquity and interoperability
  • encourage contributions
  • harness the expertise of the community
  • The Globus Toolkit is distributed under the
    (BSD-style) Apache License version 2

101
The Future: Structure
  • NSF Community Driven Improvement of Globus
    Software (CDIGS) project
  • 5 years of funding for GT enhancement
  • Regular Globus roadmaps outlining plans
  • GlobDev: http://dev.globus.org
  • Apache-like community development site
  • Community governance of components
  • Globus Toolkit and other related software
  • Open for business early 2006
  • Globus Alliance = GlobDev committers

102
GlobDev
  • The current set of Globus components will be
    organized into several Globus Projects
  • Projects release products
  • Each project will have its own group of
    Committers
  • committers are responsible for governance on
    matters relating to their products
  • The Globus Management Committee will
  • provide overall guidance and conflict resolution
  • approve the creation of new Globus Projects

103
The Future: Content
  • We now have a solid and extremely powerful Web
    services base
  • Next, we will build an expanded open source Grid
    infrastructure
  • Virtualization
  • New services for provisioning, data management,
    security, VO management
  • End-user tools for application development
  • Etc., etc.
  • And of course responding to user requests for
    other short-term needs

105
Short-Term Priorities: Security
  • Improve GSI error reporting and diagnostics
  • Secure password, one-time password, Kerberos
    support for initial log on
  • Trust roots, use of GridLogon
  • Identity/attribute assertions in GT auth.
    callouts (e.g., Shib, PERMIS, VOMS, SAML)
  • Extend CAS admin policy support
  • Security logging with management control for
    audit purposes

106
Short-Term Priorities: Data Management
  • Space and bandwidth management in GridFTP
  • Concurrency in globus-url-copy
  • Priorities in RFT
  • Data replication service
  • Enhance policy support in data services
  • Physical file name creation service
  • Scalable distributed metadata manager

107
Short-Term Priorities: Execution Management
  • Implement GGF JSDL once finalized
  • Advance reservation support
  • Policy-driven restart of persistent jobs
  • Improved information collection for jobs
  • Improved management of job collections
  • Credential refresh
  • Development of workspace service
  • Integration of virtual machines (Xen, VMware) and
    associated services
  • Windows port of WS GRAM

108
Short-Term Priorities: Information Services
  • Many more information sources, including gateways
    to other systems
  • Automated configuration of monitoring
  • Specialized monitoring displays
  • Performance optimization of registry
  • Archiver service
  • Helper tools to streamline integration of new
    information sources

109
Short-Term Priorities: WS Core
  • Streamlined container configuration
  • Remote management interface
  • Dynamic service deployment
  • Service isolation and multiple service instances
  • WS-Notification, subscription performance
  • Full functionality in C WS Core
  • Optimized WS-ServiceGroup support
  • WS-SecureConversation support

110
What to Expect from the Globus Alliance in the
Coming Months
  • Support for users of GT4
  • Working to make sure the toolkit meets user needs
  • Answering questions on the mailing lists
  • Further improving documentation
  • Normal evolution of performance, scalability and
    feature enhancements
  • Further development of tools and services in
    support of VOs
  • Expanding contributions to Globus

111
Overview
  • Background and Globus approach
  • Globus Toolkit current capabilities
  • Future directions
  • Related tools

112
The Globus Ecosystem
  • Globus components address core issues relating to
    resource access, monitoring, discovery, security,
    data movement, etc.
  • GT4 being the latest version
  • A larger Globus ecosystem of open source and
    proprietary software provides complementary
    components
  • A growing list of components
  • These components can be combined to produce
    solutions to Grid problems
  • We're building a list of such solutions

113
Many Tools Build on, or Can Contribute to,
GT4-Based Grids
  • Condor-G, DAGman
  • MPICH-G2
  • GRMS
  • Nimrod-G
  • Ninf-G
  • Open Grid Computing Env.
  • Commodity Grid Toolkit
  • GriPhyN Virtual Data System
  • Virtual Data Toolkit
  • GridXpert Synergy
  • Platform Globus Toolkit
  • VOMS
  • PERMIS
  • GT4IDE
  • Sun Grid Engine
  • PBS scheduler
  • LSF scheduler
  • GridBus
  • TeraGrid CTSS
  • NEES
  • IBM Grid Toolbox

114
Documenting the Grid Ecosystem
The Grid Ecosystem: Software Components for Grid
Systems and Applications
www.grids-center.org
115
Example Solutions
  • Portal-based User Reg. System (PURSE)
  • VO Management Registration Service
  • Service Monitoring Service
  • TeraGrid TGCP Tool
  • Lightweight Data Replicator
  • GriPhyN Virtual Data System

116
Condor-G
  • The Condor Project at U Wisconsin Madison develops
    software for high-throughput computing on
    collections of distributed compute resources
  • Condor-G is an interface to GRAM created by the
    Condor team that allows users to submit jobs to
    GRAM servers

117
GridShib
  • Allows the use of Shibboleth-transported
    attributes for authorization in GT4 deployments
  • And, more generally, SAML support
  • 2-year project started December 1, 2004
  • Participants
  • Von Welch, UIUC/NCSA (PI)
  • Kate Keahey, UChicago/Argonne (PI)
  • Frank Siebenlist, Argonne
  • Tom Barton, UChicago
  • Beta software released September 16, 2005

118
Handle System
  • The Handle System from CNRI
    (http://www.handle.net) is a general-purpose
    global name service enabling secure name
    resolution over the internet
  • The Handle System-GT Integration Project
    leverages the Handle System for identifier and
    resolution services through tight integration
    with GT4's Web services protocols

119
MPICH-G2
  • MPICH-G2, developed at Northern Illinois
    University and Argonne National Lab, is a
    grid-enabled implementation of the MPI v1.1
    standard
  • MPICH-G2 is implemented using the pre-WS GRAM
    component in GT4; integration with GT4 WS GRAM is
    expected in the near future

120
Nimrod/G
  • Nimrod is a specialized parametric modeling
    system from Monash University
  • Nimrod/G uses a simple declarative parametric
    modeling language to express parameter sweep
    experiments. Based on GT4 WS services, Nimrod/G
    enables the formulation, execution and monitoring
    of multiple individual parametric experiments
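
A parameter sweep of the kind described above expands a declared set of parameter values into one job per combination. The sketch below shows that expansion in plain Python; it does not use Nimrod/G's own declarative language, and the `expand_sweep` and `command_for` helpers, the `./simulate` command, and the parameter names are hypothetical.

```python
import itertools


def expand_sweep(parameters):
    """Yield one dict per point in the cross product of all parameter values."""
    names = list(parameters)
    for combo in itertools.product(*(parameters[name] for name in names)):
        yield dict(zip(names, combo))


def command_for(template, point):
    """Fill a command template with the values of one sweep point."""
    return template.format(**point)


# Two pressures times three angles gives six individual experiments.
sweep = {"pressure": [1.0, 2.0], "angle": [0, 15, 30]}
template = "./simulate --pressure {pressure} --angle {angle}"
jobs = [command_for(template, point) for point in expand_sweep(sweep)]
```

Each string in `jobs` stands for one experiment that a system like this would then schedule and monitor on Grid resources.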

121
Ninf-G4
  • Ninf-G4, from AIST, is a reference implementation
    of the GGF standard GridRPC API
  • Ninf-G4 provides higher-level programming APIs
    for the development and execution of parallel
    applications on the Grid
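
The RPC style of parallel programming mentioned above pairs a handle for a callable with asynchronous calls that are collected later. The sketch below imitates only that call pattern with a local thread pool. It is not the GridRPC API (which is a C API) and it does not contact any remote resource; `FunctionHandle` and `partial_sum` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


class FunctionHandle:
    """Hypothetical handle binding a callable that stands in for a remote routine."""

    def __init__(self, pool, func):
        self._pool = pool
        self._func = func

    def call(self, *args):
        """Blocking call: run the routine and return its result."""
        return self._func(*args)

    def call_async(self, *args):
        """Non-blocking call: return a future to wait on later."""
        return self._pool.submit(self._func, *args)


def partial_sum(lo, hi):
    """Stand-in for a compute routine: sum the integers in [lo, hi)."""
    return sum(range(lo, hi))


with ThreadPoolExecutor(max_workers=4) as pool:
    handle = FunctionHandle(pool, partial_sum)
    # Issue four independent calls, then wait for all of them.
    sessions = [handle.call_async(lo, lo + 250) for lo in range(0, 1000, 250)]
    total = sum(session.result() for session in sessions)
```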

122
PERMIS
  • PERMIS is an EU-funded Privilege Management
    service that implements Role-Based Access Control
  • Thanks to the work of the UK Grid Engineering
    Task Force, services running in a Java WS Core
    container can use PERMIS via GT4's SAML
    authorization callouts
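
Role-Based Access Control, as named above, decides a request by looking up the roles a user holds and the permissions each role grants. The sketch below shows that decision in its simplest form. It does not model PERMIS policies or the SAML callout; the role names, user names, and `is_permitted` function are hypothetical.

```python
# Hypothetical policy: which actions each role grants.
ROLE_PERMISSIONS = {
    "vo-admin": {"submit_job", "cancel_job", "manage_members"},
    "researcher": {"submit_job", "cancel_job"},
    "guest": set(),
}

# Hypothetical role assignments.
USER_ROLES = {
    "alice": {"vo-admin"},
    "bob": {"researcher"},
    "carol": {"guest"},
}


def is_permitted(user, action):
    """Permit the action if any role held by the user grants it."""
    roles = USER_ROLES.get(user, set())
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)


decisions = {
    ("alice", "manage_members"): is_permitted("alice", "manage_members"),
    ("bob", "manage_members"): is_permitted("bob", "manage_members"),
    ("mallory", "submit_job"): is_permitted("mallory", "submit_job"),
}
```

Unknown users hold no roles and are therefore denied, which keeps the default decision closed.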

123
SRB
  • SRB is a package from SDSC providing a uniform
    interface for connecting to network-based
    heterogeneous data resources
  • GT4's GridFTP includes an interface to SRB data
    sources, and vice versa

124
Sun Grid Engine
  • Sun Grid Engine is an open source distributed
    resource management system from Sun Microsystems
  • In a collaboration between the London e-Science
    Centre, Gridwise and MCNC, the Sun Grid Engine
    has been integrated with GT4

125
Tell Us About Your Grid Tools and Solutions
  • We list links to related projects on the Related
    Software page of the Globus Toolkit web site:
    www.globus.org/toolkit/tools/
  • Solutions are documented on the Globus web site:
    www.globus.org/solutions/
  • If we've got details wrong or you have a
    GT4-related tool to list on our website, please
    send mail to info@globus.org

126
Questions?