The Globus Toolkit: Miscellaneous Topics

Provided by: ianf172
1
The Globus Toolkit: Miscellaneous Topics
  • Kate Keahey
  • with help from Sam Lang

2
Overview
  • GRAM: monitoring and management
  • MDS: more monitoring
  • DataGrids: very high-level introduction
  • GSI: some highlights
  • Security Goodies: MyProxy and CAS

3
Monitoring Remarks
  • Flavors of monitoring
  • Job, resource, performance, etc.
  • Sensors
  • Generic (for example system sensors)
  • User defined (for example application-based)
  • Modes of interaction
  • Push (subscription)
  • Pull (inquiry)
  • Fusion
  • Job monitoring
  • Queuing system based
  • Anything else?

4
GRAM: Grid Resource Allocation and Management
5
Resource Management Review
  • The Globus Resource Allocation Manager (GRAM) API
    allows programs to be started on remote
    resources, despite local heterogeneity
  • Resource Specification Language (RSL) is used to
    communicate requirements

6
GRAM: how it works
[Diagram not preserved. Labels: a client with credential "/C=US/O=Globus//CN=Kate Keahey" (keahey) sends an HTTP 1.1 request (1) to the gatekeeper among the Globus services; the gatekeeper authenticates and delegates, then starts a Job Manager via fork/exec/su; the job contact is returned (4); the Job Manager parses the request (5), passes the job request (6) to the Local Resource Manager, which allocates and creates processes (7), and monitors and controls them. MDS client API calls to the MDS Grid Index Info Server (GIIS) locate resources; MDS client API calls to the MDS Grid Resource Info Server (GRIS) get resource info; the GRIS queries the current status of the resource.]
7
GRAM monitoring
[Diagram not preserved. Labels: the client talks to the Job Manager over HTTP 1.1 (1); the Job Manager fork/execs globus-script-<type>-submit, which forks qsub, and fork/execs globus-script-<type>-poll/rm.]
Both pull and push models; monitoring state only
8
GRAM client interfaces
  • GRAM client
  • GRAM myjob
  • When a set of processes in a single job start up,
    they may need to self-organize
  • GRAM jobmanager
  • Essentially allows you to introduce your own
    protocol for doing things remotely or locally

9
globus_gram_client
  • globus_gram_client_job_request()
  • Submit a job to a remote resource
  • RSL specifies the job to be run
  • globus_gram_client_job_status()
  • Check the status of the job
  • PENDING, ACTIVE, FAILED, DONE, SUSPENDED
  • Can also get job status through callbacks
  • globus_gram_client_callback_{allow,disallow,check}()
  • globus_gram_client_job_cancel()
  • Cancel/kill a pending or active job

10
Finding The Gatekeeper
  • globus_gram_client_job_request() requires a
    contact string to find the gatekeeper
  • hostname[:port][/service][:subject]
  • hostname: host of gatekeeper
  • required
  • port: port on which gatekeeper is listening
  • defaults to well-known port (gsigatekeeper,
    2119)
  • service: gatekeeper service to invoke
  • defaults to jobmanager
  • subject: security subject name of gatekeeper
  • defaults to standard host cert form
    /cn=hostname
  • Applies fuzzy match to deal with interface names,
    etc.
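The defaulting rules on this slide can be sketched in a few lines of Python. This is only an illustration, assuming the contact string has the form hostname[:port][/service][:subject]; the `Contact` type and `parse_contact` function are hypothetical names, not part of the Globus Toolkit, and the fuzzy subject matching is not modeled.

```python
import re
from typing import NamedTuple, Optional

DEFAULT_PORT = 2119             # well-known gsigatekeeper port, per the slide
DEFAULT_SERVICE = "jobmanager"  # default service, per the slide


class Contact(NamedTuple):
    host: str
    port: int
    service: str
    subject: Optional[str]      # None: fall back to the standard host cert form


# hostname is required; port, service and subject are each optional.
_CONTACT_RE = re.compile(
    r"^(?P<host>[^:/]+)"
    r"(?::(?P<port>\d+))?"
    r"(?:/(?P<service>[^:]+))?"
    r"(?::(?P<subject>.+))?$"
)


def parse_contact(text: str) -> Contact:
    """Split a contact string and fill in the documented defaults."""
    m = _CONTACT_RE.match(text.strip())
    if m is None:
        raise ValueError(f"not a valid contact string: {text!r}")
    return Contact(
        host=m["host"],
        port=int(m["port"]) if m["port"] else DEFAULT_PORT,
        service=m["service"] or DEFAULT_SERVICE,
        subject=m["subject"],
    )
```

For example, `parse_contact("grid.example.org")` fills in port 2119 and service jobmanager, while a string that carries a subject keeps the subject's own slashes intact because the subject is always the last field.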

11
globus_gram_client
  • globus_gram_client_callback_allow(),
    globus_gram_client_callback_disallow(),
    globus_gram_client_callback_check()
  • Create/destroy a client port to listen for
    asynchronous state change callbacks
  • Callback to local function on state change
  • globus_gram_client_job_callback_register(),
    globus_gram_client_job_callback_unregister()
  • Register with job manager to receive callbacks

12
Job Contact
  • globus_gram_client_job_request() returns a job
    contact
  • Opaque string
  • Other globus_gram_client_*() functions use the
    job contact to find the right job manager to
    which requests are made
  • Job contact string can be passed between
    processes, even on different machines

13
State Change Callbacks
  • GRAM managed job can be in the states
  • Pending, Active, Failed, Done, Suspended
  • GRAM client can register for asynchronous state
    change callbacks
  • Registration can be done during submission
  • globus_gram_client_job_request()
  • Registration can be done later by any process,
    using the job contact
  • globus_gram_client_job_callback_register()
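The difference between polling for status (pull) and registering for state change callbacks (push) can be illustrated with a toy model. This sketch does not call the GRAM client library; `ToyJobManager` and its methods are hypothetical stand-ins that only mirror the idea of a job contact, the five job states, and callback registration.

```python
from enum import Enum
from typing import Callable, List


class JobState(Enum):
    PENDING = "PENDING"
    ACTIVE = "ACTIVE"
    FAILED = "FAILED"
    DONE = "DONE"
    SUSPENDED = "SUSPENDED"


Callback = Callable[[str, JobState], None]


class ToyJobManager:
    """Stand-in for a job manager that tracks one job."""

    def __init__(self, job_contact: str) -> None:
        self.job_contact = job_contact
        self._state = JobState.PENDING
        self._callbacks: List[Callback] = []

    def status(self) -> JobState:
        """Pull model: the client asks for the current state."""
        return self._state

    def register(self, callback: Callback) -> None:
        """Push model: the client asks to be told about changes."""
        self._callbacks.append(callback)

    def unregister(self, callback: Callback) -> None:
        self._callbacks.remove(callback)

    def set_state(self, new_state: JobState) -> None:
        """Called when the job changes state; notifies only on a real change."""
        if new_state is self._state:
            return
        self._state = new_state
        for callback in list(self._callbacks):
            callback(self.job_contact, new_state)
```

Because a callback receives the job contact along with the new state, one listener can serve several jobs, which matches the point that any process holding the job contact can register later.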

14
Resource Specification Language
  • Much of the power of GRAM is in the RSL
  • Common language for specifying job requests
  • A conjunction of (attribute=value) pairs
  • GRAM understands a well defined set of attributes

15
RSL Attributes For GRAM
  • (executable=string)
  • Program to run
  • A file path (absolute or relative) or URL
  • (directory=string)
  • Directory in which to run (default is $HOME)
  • (arguments=arg1 arg2 arg3...)
  • List of string arguments to program
  • (environment=(E1 v1)(E2 v2))
  • List of environment variable name/value pairs
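As an illustration of how these attributes combine into one request, the following Python sketch assembles an RSL conjunction string. It assumes the usual written form of RSL, a leading "&" followed by (attribute=value) relations, and it assumes that values containing spaces or special characters are wrapped in double quotes with embedded quotes doubled. `build_rsl` and `quote` are hypothetical helpers, not Globus functions.

```python
import re
from typing import Mapping, Optional, Sequence

# Characters assumed to require quoting in an RSL value.
_NEEDS_QUOTES = re.compile(r"""[\s+&|()=<>!"'^#$]""")


def quote(value: object) -> str:
    """Quote a value only when it contains whitespace or special characters."""
    text = str(value)
    if text == "" or _NEEDS_QUOTES.search(text):
        return '"' + text.replace('"', '""') + '"'
    return text


def build_rsl(
    executable: str,
    arguments: Sequence[str] = (),
    directory: Optional[str] = None,
    environment: Optional[Mapping[str, str]] = None,
) -> str:
    """Build a conjunction of (attribute=value) relations."""
    parts = [f"(executable={quote(executable)})"]
    if directory is not None:
        parts.append(f"(directory={quote(directory)})")
    if arguments:
        parts.append("(arguments=" + " ".join(quote(a) for a in arguments) + ")")
    if environment:
        pairs = "".join(f"({quote(k)} {quote(v)})" for k, v in environment.items())
        parts.append(f"(environment={pairs})")
    return "&" + "".join(parts)
```

Calling `build_rsl("/bin/ls", ["-l", "/tmp"])` yields a two-relation request; adding `environment={"LANG": "C"}` appends one nested (name value) pair per variable.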

16
RSL Attributes For GRAM
  • (stdin=string)
  • Stdin for program
  • A file path (absolute or relative) or URL
  • (stdout=string)
  • Stdout for program
  • A file path (absolute or relative) or URL
  • (stderr=string)
  • Stderr for program
  • A file path (absolute or relative) or URL

17
RSL Attributes For GRAM
  • (count=integer)
  • Number of processes to run (default is 1)
  • (hostCount=integer)
  • On SMP multi-computers, number of nodes to
    distribute the count processes across
  • (project=string)
  • Project (account) against which to charge
  • (queue=string)
  • Queue into which to submit job

18
RSL Attributes For GRAM
  • (maxTime=integer)
  • Maximum wall clock or CPU runtime (scheduler's
    choice) in minutes
  • (maxWallTime=integer)
  • Maximum wall clock runtime in minutes
  • (maxCpuTime=integer)
  • Maximum CPU runtime in minutes

19
RSL Attributes For GRAM
  • (maxMemory=integer)
  • Maximum amount of memory for each process in
    megabytes
  • (minMemory=integer)
  • Minimum amount of memory for each process in
    megabytes

20
RSL Attributes For GRAM
  • (jobType=value)
  • Value is one of mpi, single, multiple, or
    condor
  • mpi: Run the program using mpirun -np <count>
  • single: Only run a single instance of the
    program, and let the program start the other
    count-1 processes.
  • multiple: Start <count> instances of the program
    using the appropriate scheduler mechanism
  • condor: Start <count> Condor processes running
    in standard universe
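How the job type and count interact can be sketched as a function that returns the command lines a scheduler would start. This is a simplified illustration, not how the job manager scripts are implemented; `launch_plan` is a hypothetical name, and the condor type is left out because it depends on Condor itself.

```python
from typing import List


def launch_plan(job_type: str, count: int, program: str) -> List[List[str]]:
    """Return one argv list per process the scheduler itself starts."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if job_type == "mpi":
        # One mpirun invocation that starts all count processes.
        return [["mpirun", "-np", str(count), program]]
    if job_type == "single":
        # One instance; the program starts the other count-1 processes itself.
        return [[program]]
    if job_type == "multiple":
        # count independent instances of the same program.
        return [[program] for _ in range(count)]
    raise ValueError(f"job type not handled in this sketch: {job_type!r}")
```

With a count of 4, the mpi and single types each produce one command while multiple produces four, which is the distinction the slide draws.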

21
RSL Attributes for GRAM
  • (gramMyjob=value)
  • Value is one of collective, independent
  • Defines how the globus_gram_myjob library will
    operate on the <count> processes
  • collective: Treat all <count> processes as part
    of a single job
  • independent: Treat each of the <count> processes
    as an independent uniprocessor job
  • (dryRun=true)
  • Do not actually run job

22
globus_rsl
  • Module for manipulating RSL expressions
  • Parse an RSL string into a data structure
  • Functions to manipulate the data structure
  • Unparse the data structure into a string
  • Can be used to assist in writing brokers or
    filters which refine an RSL specification
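The parse, manipulate, unparse cycle that a broker or filter performs can be sketched in Python. This does not use the globus_rsl module: it is a toy that handles only a flat conjunction of unquoted (attribute=value) relations and rejects anything else, and the function names and the "short" queue are hypothetical.

```python
import re
from typing import Dict

# One flat relation: (name=value) with no nested parentheses.
_RELATION = re.compile(r"\(\s*(\w+)\s*=\s*([^()]*?)\s*\)")


def parse_flat_rsl(text: str) -> Dict[str, str]:
    """Parse '&(a=b)(c=d)' into an ordered dict; reject anything more complex."""
    text = text.strip()
    if not text.startswith("&"):
        raise ValueError("expected a conjunction starting with '&'")
    body = text[1:]
    if _RELATION.sub("", body).strip():
        raise ValueError("only flat, unquoted relations are supported")
    return dict(_RELATION.findall(body))


def unparse_flat_rsl(attrs: Dict[str, str]) -> str:
    """Turn the dict back into an RSL string."""
    return "&" + "".join(f"({name}={value})" for name, value in attrs.items())


def add_default_queue(rsl: str, queue: str) -> str:
    """A minimal 'filter': add a queue only if the request names none."""
    attrs = parse_flat_rsl(rsl)
    attrs.setdefault("queue", queue)
    return unparse_flat_rsl(attrs)
```

For instance, `add_default_queue("&(executable=/bin/date)(count=2)", "short")` appends a queue relation, while a request that already names a queue passes through unchanged.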

23
Resources
  • Documentation
  • http://www.globus.org/gram
  • http://www.globus.org/gram/rsl
  • Come to the tutorial!

24
Monitoring with MDS-2
25
MDS-2 Features
  • Support for Virtual Organizations (VOs)
  • Community-specific
  • Dynamic in nature
  • Scalable
  • Many queries, entries, VOs, etc.
  • Independent
  • Resources, VOs don't affect each other
  • Graceful degradation of service
  • Tolerates failures
  • Extensible
  • Secure

26
Globus MDS-2
  • Service scales with Grid growth
  • Loose consistency model tolerates failures

27
MDS-2 Protocols
  • Grid Resource Inquiry Protocol (GRIP)
  • Describe information about a resource
  • LDAP-based data model for information, request
    and response formats
  • Grid Resource Registration Protocol (GRRP)
  • Soft-state protocol with periodic notification
    reduces the number of hanging links
  • Add/invite resource info to a directory
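The soft-state idea behind GRRP, where a registration survives only as long as it keeps being refreshed, can be sketched as follows. This is a generic illustration rather than the GRRP wire protocol; `SoftStateRegistry` and the resource names are hypothetical.

```python
import time
from typing import Callable, Dict, List


class SoftStateRegistry:
    """Directory whose entries expire unless they are re-registered in time."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._expires: Dict[str, float] = {}

    def register(self, resource: str, ttl: float) -> None:
        """Add or refresh an entry; it stays valid for ttl seconds from now."""
        self._expires[resource] = self._clock() + ttl

    def live(self) -> List[str]:
        """Drop expired entries and return the names that remain."""
        now = self._clock()
        self._expires = {r: t for r, t in self._expires.items() if t > now}
        return sorted(self._expires)
```

A resource that crashes simply stops sending notifications and disappears from the directory after its time-to-live, so no explicit unregister step is needed; this is how periodic notification reduces hanging links.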

28
MDS-2 Implementation
  • Grid Resource Information Service (GRIS)
  • Provides resource description
  • Authenticates and parses each request,
    communicates with information providers
  • Grid Index Information Service (GIIS)
  • Provides aggregate directory
  • Can represent a VO
  • Hierarchical groups of resources
  • Lightweight Directory Access Protocol (LDAP)
  • Standard with many client implementations
  • Used for GRIP (and GRRP currently)

29
MDS-2 Implementation (example)
[Diagram not preserved. Labels: a client issues a request to MDS-2, which invokes the GRAM Reporter provider (or uses cached information).]
  • GRAM Reporter: most of the generic job
    information you can get out of the scheduler, plus
    scheduler-specific options
  • Can be combined via GIIS
  • Reporter needs to be installed specially
  • Output is cached for a configurable period of
    time
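The "invoke provider (or use cached information)" behaviour can be sketched as a small time-to-live cache. This is a generic illustration, not MDS code; `CachedProvider` is a hypothetical name, and the provider is any zero-argument function.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class CachedProvider(Generic[T]):
    """Wrap a provider so its output is reused for ttl seconds."""

    def __init__(
        self,
        provider: Callable[[], T],
        ttl: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._provider = provider
        self._ttl = ttl
        self._clock = clock
        self._stamp: Optional[float] = None
        self._value: Optional[T] = None

    def query(self) -> T:
        """Return cached output if it is still fresh, else invoke the provider."""
        now = self._clock()
        if self._stamp is None or now - self._stamp >= self._ttl:
            self._value = self._provider()
            self._stamp = now
        return self._value  # type: ignore[return-value]
```

The cache period trades freshness for load: a long period shields a slow scheduler query from many clients, while a short one keeps job status closer to current.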

30
MDS-2 Implementation (cont'd)
[Diagram not preserved. Labels: a client issues a request to MDS-2, which invokes a provider through the Provider API (or uses cached information).]
  • Two variants of provider API
  • Scripts
  • Loadable modules
  • To extend MDS-2, write your own schema
    and provider
  • Information on how to write a provider
  • http://www.globus.org/gt2/mds2.1/creating_new_providers.pdf
  • Currently only a pull interaction model
  • MDS 2.2 (to ship in Q2 2002): ability to
    subscribe to state changes

31
Stock MDS-2.1 GRIS Providers
  • globus-version reports Globus software
  • grid-info-host reports host OS info
  • grid-info-host-interfaces reports host NICs
  • grid-info-host-load reports host CPU status
  • grid-info-host-filesystem reports host disk
    status
  • globus-gram-reporter reports Globus job status
  • In progress: information about storage and
    network performance

32
Security
  • Each piece of information can be associated with
    a credential spec
  • Anonymous (no restriction)
  • Provider trusts directory
  • Provider does not trust the directory
  • Provider partially trusts the directory

33
Visualizing MDS Data
  • Java LDAP browser scripts
  • http://www.globus.org/mds
  • Grid Searcher
  • Alliance-funded project to do simple searches
    over MDS
  • Server or client mode
  • http://anchor.nwu.edu/GridSearcher/
  • Hotpage
  • NPACI portal
  • https://hotpage.npaci.edu/

37
Data Grids
38
The Data Grid Problem
  • Enable a geographically distributed community
    of thousands to perform sophisticated,
    computationally intensive analyses on Petabytes
    of data
  • The term "Data Grid" is often used
  • Unfortunate, as it implies a distinct
    infrastructure, which it isn't, but easy to say

39
Where Data Grids are going
  • High-speed, reliable access to remote data
  • Automated discovery of best copy of data
  • Manage replication to improve performance
  • Co-schedule compute, storage, network
  • Transparency wrt delivered performance
  • Enforce access control on data
  • Allow representation of global resource
    allocation policies
  • Central question: how must Grid architecture be
    extended to support these functions?

40
A Model Architecture for Data Grids
[Diagram not preserved. Labels: the Application sends an Attribute Specification to the Metadata Catalog and obtains a Logical Collection and Logical File Name; the Replica Catalog returns Multiple Locations; Replica Selection uses Performance Information and Predictions from MDS and NWS to choose the Selected Replica; data moves over the GridFTP control channel and GridFTP data channel between a Disk Cache and Replica Locations 1, 2 and 3 (Tape Library, Disk Array, Disk Cache).]
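The replica selection step in this architecture can be sketched as choosing the copy with the lowest estimated transfer time. This is an illustration under simple assumptions: each replica carries a predicted bandwidth and startup latency (figures of the kind a service such as NWS might supply), and the `Replica` type, field names, and URLs are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Replica:
    url: str
    predicted_mbps: float   # predicted bandwidth in megabits per second
    latency_s: float = 0.0  # predicted startup latency in seconds


def estimated_seconds(replica: Replica, size_mb: float) -> float:
    """Startup latency plus size (megabytes, 8 bits per byte) over bandwidth."""
    return replica.latency_s + size_mb * 8.0 / replica.predicted_mbps


def select_replica(replicas: Iterable[Replica], size_mb: float) -> Replica:
    """Pick the replica with the smallest estimated transfer time."""
    usable = [r for r in replicas if r.predicted_mbps > 0]
    if not usable:
        raise LookupError("no replica with a usable bandwidth prediction")
    return min(usable, key=lambda r: estimated_seconds(r, size_mb))
```

The best copy depends on file size: a nearby low-latency disk cache can win for small files, while a faster but slower-to-start source wins for large ones.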
41
Globus Toolkit Components
  • Two major Data Grid components
  • 1. Data Transport and Access
  • Common protocol
  • Secure, efficient, flexible, extensible data
    movement
  • Family of tools supporting this protocol
  • 2. Replica Management Architecture
  • Simple scheme for managing
  • multiple copies of files
  • collections of files
  • APIs, white papers: http://www.globus.org

42
GridFTP
  • Common, extensible transfer protocol
  • Common protocol means all can interoperate
  • Interface to many storage systems
  • HPSS, DPSS, file systems
  • Plan for SRB integration
  • Fast
  • Striped, parallel transfer
  • SC01 Challenge Award: 2.8 Gbps
  • Despite the name, is NOT restricted to file transfer
  • Can do memory-to-memory transfers
  • No embeddable server as yet, but can forward to a
    program on same resource
  • Enhanced reliability

43
GridFTP (cont'd)
  • Suite of communication libraries and related
    tools that support
  • GSI, Kerberos security
  • Third-party transfers
  • Parameter set/negotiate
  • Partial file access
  • Reliability/restart
  • Large file support
  • Data channel reuse
  • All based on a standard, widely deployed protocol
  • Integrated instrumentation
  • Logging/audit trail
  • Parallel transfers
  • Striping (cf DPSS)
  • Policy-based access control
  • Server-side computation
  • Proxies (firewall, load balancing)
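Partial file access and parallel transfers both rest on dividing a file into byte ranges. The sketch below shows one way to split a file evenly across several streams; it is only the bookkeeping, with no GridFTP protocol involved, and `split_ranges` is a hypothetical helper.

```python
from typing import List, Tuple


def split_ranges(size: int, streams: int) -> List[Tuple[int, int]]:
    """Split size bytes into up to `streams` contiguous (offset, length) ranges.

    Lengths differ by at most one byte; empty ranges are omitted.
    """
    if size < 0 or streams < 1:
        raise ValueError("size must be >= 0 and streams >= 1")
    base, extra = divmod(size, streams)
    ranges: List[Tuple[int, int]] = []
    offset = 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue
        ranges.append((offset, length))
        offset += length
    return ranges
```

The same ranges help with restart: after a failure, only the ranges that did not complete need to be requested again.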

44
Grid Security Infrastructure and Other Security Goodies
45
Logging on to the Grid
  • To run programs, authenticate to Globus
  • grid-proxy-init
  • Enter PEM pass phrase
  • Creates a temporary, local, short-lived proxy
    credential for use by our computations
  • Options for grid-proxy-init
  • -hours <lifetime of credential>
  • -bits <length of key>
  • -help

46
grid-proxy-init Details
  • grid-proxy-init creates the local proxy file.
  • User enters pass phrase, which is used to decrypt
    private key.
  • Private key is used to sign a proxy certificate
    with its own, new public/private key pair.
  • User's private key is not exposed after proxy has
    been signed
  • Proxy placed in /tmp, readable only by user
  • NOTE: No network traffic!
  • grid-proxy-info displays proxy details
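Because the proxy file holds an unencrypted private key, its protection comes entirely from file permissions. The following sketch shows how a file can be created readable only by its owner and how that can be checked afterwards. It is a POSIX-only illustration, not what grid-proxy-init does internally; the function names and the file name are hypothetical.

```python
import os
import tempfile


def write_private_file(path: str, data: bytes) -> None:
    """Create a new file with owner-only permissions and write data to it."""
    # O_EXCL refuses to reuse an existing file that someone else may have planted.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)


def is_private(path: str) -> bool:
    """True if the file belongs to this user and grants nothing to group or others."""
    info = os.stat(path)
    return info.st_uid == os.getuid() and (info.st_mode & 0o077) == 0


# Usage: write a stand-in credential into a scratch directory and check it.
with tempfile.TemporaryDirectory() as scratch:
    demo_path = os.path.join(scratch, "proxy_demo")
    write_private_file(demo_path, b"not a real credential")
    print(is_private(demo_path))
```

A service that reads such a file can run the same check first and refuse to use a credential whose permissions are too open.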

47
Destroying Your Proxy (logout)
  • To destroy your local proxy that was created by
    grid-proxy-init
  • grid-proxy-destroy
  • This does NOT destroy any proxies that were
    delegated from this proxy.
  • You cannot revoke a remote proxy
  • Usually create proxies with short lifetimes

48
Important Files
  • /etc/grid-security
  • hostcert.pem: certificate used by the server in
    mutual authentication
  • hostkey.pem: private key corresponding to the
    server's certificate (read-only by root)
  • grid-mapfile: maps grid subject names to local
    user accounts (really part of gatekeeper, not GSI
    itself)
  • /etc/grid-security/certificates
  • CA certificates: certs that are trusted when
    validating certs, and thus needn't be verified
  • ca-signing-policy.conf: defines the subject names
    for which each trusted CA is allowed to sign
    certificates

49
Important Files
  • $HOME/.globus
  • usercert.pem: User's certificate (subject name,
    public key, CA signature)
  • userkey.pem: User's private key (encrypted using
    the user's pass phrase)
  • /tmp
  • Proxy file(s): Temporary file(s) containing
    unencrypted proxy private key and certificate
    (readable only by user's account)
  • Same approach Kerberos uses for protecting
    tickets

50
Secure Services
  • On most unix machines, inetd listens for incoming
    service connections and passes connections to
    daemons for processing.
  • On Grid servers, the gatekeeper securely performs
    the same function for many services
  • It handles mutual authentication using files in
    /etc/grid-security
  • It maps to local users via the gridmap file

51
Sample Gridmap File
  • Gridmap file maintained by Globus administrator
  • Entry maps Grid-id into local user name(s)

Distinguished name                                    Local username
"/C=US/O=Globus/O=NPACI/OU=SDSC/CN=Rich Gallup"       rpg
"/C=US/O=Globus/O=NPACI/OU=SDSC/CN=Richard Frost"     frost
"/C=US/O=Globus/O=USC/OU=ISI/CN=Carl Kesselman"       u14543
"/C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster"           itf
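Looking up a local account from a Grid identity can be sketched with a small parser. It assumes each entry is a double-quoted distinguished name followed by the local user name, that several local names are separated by commas, and that lines starting with "#" are comments; those details and the name `parse_gridmap` are assumptions for illustration.

```python
import shlex
from typing import Dict, List


def parse_gridmap(text: str) -> Dict[str, List[str]]:
    """Map each distinguished name to its list of local user names."""
    mapping: Dict[str, List[str]] = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # shlex handles the quoted DN, which contains spaces.
        fields = shlex.split(line)
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected a quoted DN and a user name")
        subject, users = fields
        mapping[subject] = users.split(",")
    return mapping
```

After parsing, `mapping.get(subject)` gives the accounts a caller may run as, and a missing entry means the authenticated user has no local account on this resource.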
52
Delegation
  • Delegation: remote creation of a (second-level)
    proxy credential
  • New key pair generated remotely on server
  • Proxy cert and public key sent to client
  • Client signs proxy cert and returns it
  • Server (usually) puts proxy in /tmp
  • Allows remote process to authenticate on behalf
    of the user
  • Remote process impersonates the user

53
Generic Security Service API
  • The GSS-API is the IETF standard for adding
    authentication, delegation, message integrity,
    and message confidentiality to applications.
  • For secure communication between two parties over
    a reliable channel (e.g. TCP)
  • GSS-API separates security from communication,
    which allows security to be easily added to
    existing communication code.
  • Effectively placing transformation filters on
    each end of the communication link
  • Globus Toolkit components all use GSS-API to
    incorporate security
  • Globus Toolkit GSS-API speaks GSI protocol

54
globus_gss_assist
  • The globus_gss_assist module is a Globus Toolkit
    specific wrapper around GSS-API which makes it
    easier to use
  • Hides some of the gross details of GSS-API
    (authenticating an existing connection)
  • Still maintains separation from communication
    method
  • Implements many useful functions (for example
    searching the gridmapfile)

55
Security
  • Authorization
  • Self
  • Identity
  • Callback
  • Message wrapping
  • Integrity (checksum)
  • Encryption
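Message wrapping for integrity can be illustrated with a keyed checksum placed in front of each message. This sketch uses HMAC-SHA256 from the Python standard library as a stand-in; it is not the GSS-API or the GSI protocol, it assumes both ends already share a key, and `wrap` and `unwrap` are hypothetical names.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def wrap(key: bytes, message: bytes) -> bytes:
    """Sender-side filter: prepend a keyed checksum to the message."""
    tag = hmac.new(key, message, hashlib.sha256).digest()
    return tag + message


def unwrap(key: bytes, token: bytes) -> bytes:
    """Receiver-side filter: verify the checksum and return the message."""
    if len(token) < TAG_LEN:
        raise ValueError("token too short")
    tag, message = token[:TAG_LEN], token[TAG_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    return message
```

The two functions act as filters on each end of the link: existing code keeps sending and receiving bytes, and only the bytes passed through `wrap` and `unwrap` change. This variant gives integrity only; confidentiality would need encryption as well.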

56
globus_io and security
  • For even easier security integration with socket
    code, use the globus_io module
  • Simple to add authentication and authorization to
    TCP socket code
  • But loses separation of security from
    communication method

57
MyProxy
  • Web browsers can use GSI for authentication but
    cannot delegate
  • Enter MyProxy
  • Secure server and client tools
  • Client delegates a proxy associated with tag and
    password
  • Browser retrieves proxy from the server and acts
    on user's behalf
  • Other uses
  • Managing credentials
  • Credential wallet

58
Community Authorization
  • Authorization: NxM problem
  • Restricted proxy (proxy with restricted use)
  • Community Authorization Service (CAS)
  • Community negotiates access to resources
  • Resource outsources fine-grain auth to CAS
  • Resource only needs to know about CAS user
    credential
  • CAS handles user registration, group membership
  • User who wants access to resource asks CAS for a
    capability credential
  • Restricted proxy of the CAS user credential,
    checked by resource

59
Community Authorization
[Diagram not preserved. Labels: CAS-maintained community policy database; User.]

60
Community Authorization
  • Policy language in CAS
  • Neutral to policy language
  • A field in restricted proxy specifies it
  • Simple policy language
  • A feature release end of January

61
In Summary
  • Come to the training