Globus Toolkit 4: Futures and Open Issues (Transcript and Presenter's Notes)
1
Globus Toolkit 4: Futures and Open Issues
  • Jennifer M. Schopf
  • UK National eScience Centre
  • Argonne National Lab

2
What is a Grid?
  • Resource sharing
  • Computers, storage, sensors, networks, ...
  • Sharing is always conditional: issues of trust, policy, negotiation, payment, ...
  • Coordinated problem solving
  • Beyond client-server: distributed data analysis, computation, collaboration, ...
  • Dynamic, multi-institutional virtual orgs
  • Community overlays on classic org structures
  • Large or small, static or dynamic

3
Not A New Idea
  • Late 70s: Networked operating systems
  • Late 80s: Distributed operating systems
  • Early 90s: Heterogeneous computing
  • Mid 90s: Metacomputing
  • Then "the Grid" (Foster and Kesselman, 1999)
  • Also called parallel distributed computing

4
Why is this hard/different?
  • Lack of central control
  • Where things run
  • When they run
  • Shared resources
  • Contention, variability
  • Communication
  • Different sites imply different sys admins, users, institutional goals, and often strong personalities

5
So why do it?
  • Computations that need to be done within a time limit
  • Data that can't fit on one site
  • Data owned by multiple sites
  • Applications that need to be run bigger, faster, more ...

6
What Is the Globus Toolkit?
  • The Globus Toolkit is a collection of solutions
    to problems that frequently come up when trying
    to build collaborative distributed applications
  • Heterogeneity
  • Focus on simplifying heterogeneity for application developers
  • Working towards more vertical solutions in
    future versions.
  • Standards
  • Capitalize on and encourage use of existing
    standards (IETF, W3C, OASIS, GGF).
  • Reference implementations of new/proposed
    standards in these organizations.

7
With Grid Computing Forget Homogeneity!
  • Trying to force homogeneity on users is futile.
    Everyone has their own preferences, sometimes
    even dogma.
  • The Internet provides the model

8
Evolution of the Grid
(Timeline figure: increased functionality and standardization over time. Custom solutions give way to the Globus Toolkit (de facto standards: GGF GridFTP, GSI, leveraging IETF; X.509, LDAP, FTP), then to Web services and the Open Grid Services Architecture (GGF OGSI, WSRF, leveraging OASIS, W3C, IETF; multiple implementations, including the Globus Toolkit), and finally to app-specific services.)
9
Globus is Service-Oriented Infrastructure
Technology
  • Software for service-oriented infrastructure
  • Service-enable new and existing resources
  • E.g., GRAM on a computer, GridFTP on a storage system, custom application services
  • Uniform abstractions and mechanisms
  • Tools to build applications that exploit service-oriented infrastructure
  • Registries, security, data management, ...
  • Open source and open standards
  • Each empowers the other
  • E.g., monitoring across different protocols is hard
  • Enabler of a rich tool and service ecosystem

10
Globus Toolkit V4.0
  • Major release planned April 29th, 2005
  • Second internal release candidate cut yesterday; still on track for this date
  • Fifteen months of design, development, and testing
  • 1.8M lines of code
  • Major contributions from five institutions
  • Hundreds of millions of service calls executed over weeks of continuous operation
  • Significant improvements over the GT3 code base in all dimensions

11
Our Goals for GT4
  • Usability, reliability, scalability, ...
  • Web service components have quality equal or
    superior to pre-WS components
  • Documentation at acceptable quality level
  • Consistency with latest standards (WS-*, WSRF, WS-N, etc.) and the Apache platform
  • WS-I Basic (Security) Profile compliant
  • New components, platforms, languages
  • And links to larger Globus ecosystem

12
(No Transcript)
13
GT4 Components and Performance
  • Globus Toolkit Components
  • Core
  • Security
  • Data Management
  • Resource Management
  • Monitoring
  • Performance in the broadest sense of the word.
  • How fast
  • How many
  • How stable
  • (How easy)
  • www-unix.globus.org/toolkit/docs/development/4.0-drafts/perf_overview.html

14
GT4 Web Services Core
  • Supports both Globus services (GRAM, RFT, Delegation, etc.) and user-developed services
  • Redesigned to enhance scalability, modularity, performance, usability
  • Leverages existing WS standards
  • WS-I Basic Profile: WSDL, SOAP, etc.
  • WS-Security, WS-Addressing
  • Adds support for emerging WS standards
  • WS-Resource Framework, WS-Notification
  • Java, Python, C hosting environments
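For a flavor of the WSRF core in practice, a minimal sketch of starting the GT4 container and reading a resource property from a deployed service. The service path and property QName below are hypothetical placeholders, not a real GT4 service:

    # Start the GT4 Java WS container without transport security (port 8080)
    $ globus-start-container -nosec
    # Read one WSRF resource property from a deployed service
    # (service name and property QName are illustrative)
    $ wsrf-get-property -s http://127.0.0.1:8080/wsrf/services/MyService \
        '{http://example.org/ns}MyProperty'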

15
GT4 Web Services Core
16
Open Source/Open Standards
  • WSRF developed in collaboration with IBM
  • Currently in OASIS process
  • Contributions to Apache for
  • WS-Security
  • WS-Addressing
  • Axis
  • Apollo (WSRF)
  • Hermes (WS-Notification)

17
Java Core Performance
  • We've been working hard to increase basic messaging performance
  • Factor-of-four improvement over GT3 so far
  • Reliability
  • Core can scale to a very large number of resources (>10,000)

18
Java Core Messaging Performance
19
GT4 Security Highlights
  • Standards-based support for message-level and transport-level security
  • Transport level is the default due to performance
  • Standards-based authorization (SAML) via Community Authorization Service (CAS) or callouts
  • Stand-alone delegation service
  • More authentication options
  • MyProxy, SimpleCA, ... (see the sketch below)
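As a concrete sketch of the authentication options, a typical session begins by creating a short-lived proxy credential, either locally or via MyProxy. The MyProxy hostname and username are illustrative:

    # Create a short-lived proxy certificate from your long-term credential
    $ grid-proxy-init
    # ...or retrieve one from a MyProxy server
    $ myproxy-logon -s myproxy.example.org -l jsmith
    # Inspect the resulting proxy (subject, remaining lifetime)
    $ grid-proxy-info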

20
GT4's Use of Security Standards
21
GT4 Security
(Diagram: GT4 security components and their users.)
22
Security Performance
  • We've measured performance for both WS and transport security mechanisms
  • See next slide for graph
  • Transport security is significantly faster than WS security
  • We made transport security (i.e., https) our default
  • We're working on making it even faster by using connection caching

23
(No Transcript)
24
GT4 Data Management
  • Stage large data to/from nodes
  • Replicate data for performance and reliability
  • Locate data of interest
  • Provide access to diverse data sources
  • File systems, parallel file systems, hierarchical
    storage (GridFTP)
  • Databases (OGSA-DAI)

25
GT4 Data Functions
  • Find your data: Replica Location Service
  • Managing 40M files in production settings
  • Move/access your data: GridFTP, RFT
  • High-performance striped data movement
  • Couple data and execution management
  • GRAM uses GridFTP and RFT for staging

26
GridFTP in GT4
  • 100% Globus code
  • No licensing issues
  • Stable, extensible
  • IPv6 support
  • XIO for different transports
  • Striping → multi-Gb/sec wide-area transport
  • Pluggable:
  • Front-end: e.g., future WS control channel
  • Back-end: e.g., HPSS, cluster file systems
  • Transfer: e.g., UDP, NetBLT transport
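For illustration, a client-driven third-party transfer between two GridFTP servers using the globus-url-copy client; hostnames and paths are hypothetical, and flags may vary slightly by version:

    # Verbose performance output (-vb), 4 parallel TCP streams (-p 4)
    $ globus-url-copy -vb -p 4 \
        gsiftp://source.example.org/data/big.file \
        gsiftp://dest.example.org/data/big.file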

27
GridFTP Performance
  • TeraGrid striping results
  • 30 Gb/s network, 32 IBM ia64 nodes
  • Ran a varying number of stripes
  • Ran both memory-to-memory and disk-to-disk

28
Memory-to-Memory Striping Performance
  • High linear scalability (slope near 1)
  • 27 Gb/s on a 30 Gb/s link (90% utilization) with 32 nodes

29
Disk-to-Disk Striping Performance
  • Limited by the storage system
  • Achieved 17.5 Gb/s

30
And in conversation
  • "We think we have hit the limit of python code. The GridFTP C libraries are delivering data so fast to the buffers that the python client code cannot keep up in doing the fseek, fwrite, and then re-register the data callback. We are going to have to code our 'transfer agents' entirely in C for S5." Not a bad problem to have.
  • Scott Koranda, Dept. of Physics, University of Minnesota

31
Reliable File Transfer: Third-Party Transfer
  • Fire-and-forget transfer
  • Web services interface
  • Many files and directories
  • Integrated failure recovery (see the sketch below)
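A sketch of driving RFT from the command line. GT4 ships a simple rft test client that reads a plain transfer file of source/destination GridFTP URLs and submits them to the RFT service in a container; the exact file format (a block of option lines precedes the URLs) and the client flags vary by version, so treat all names below as assumptions:

    # transfer.xfr: source URL, then destination URL, one per line
    # (version-specific option lines omitted)
    gsiftp://source.example.org/data/file1
    gsiftp://dest.example.org/data/file1

    # Submit the request to the RFT service (hypothetical host)
    $ rft -h rft-host.example.org -f transfer.xfr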

(Diagram: an RFT client sends SOAP messages to the RFT service and optionally receives notifications; the RFT service manages third-party transfers between two GridFTP servers.)
32
RFT Performance Stats
  • Current maximum request size is approx. 20,000 entries with a default 64 MB heap size
  • Infinite-transfer test, LAN:
  • 120,000 transfers (servers were killed by mistake)
  • Was a good test: found a corner case where postgres was not able to perform 3 update queries/sec and was using up CPU
  • Infinite-transfer test, WAN:
  • 67,000 transfers (killed for the same reason as above)
  • Sloan Digital Sky Survey DR3 archive move
  • 900K files, 6 TB
  • Killed the transfer several times for recoverability testing
  • No human intervention has been required to date

33
Replica Location Service
  • Identify location of files via a logical-to-physical name map
  • Distributed indexing of names, fault-tolerant update protocols
  • GT4 version is scalable and stable
  • Managing 40 million files across 10 sites (see the sketch below)
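A minimal sketch of RLS use with the globus-rls-cli client: register a logical-to-physical mapping, then query it back. Server and file names are illustrative:

    # Map a logical file name (LFN) to a physical location (PFN)
    $ globus-rls-cli create run42.dat \
        gsiftp://host.example.org/data/run42.dat rls://rls.example.org
    # Look up physical locations for a logical name
    $ globus-rls-cli query lrc lfn run42.dat rls://rls.example.org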

(Diagram: distributed RLS index nodes.)
34
LIGO Use of RLS
  • Some hands-on numbers
  • Produce 1 TB per day
  • 8 sites
  • >3 million entries in the RLS
  • >30 million files
  • "This replication of data using RLS and GridFTP is enabling more gravitational wave data analysts across the world to do more science more efficiently than ever before. Globus RLS and GridFTP are in the critical path for LIGO data analysis."

35
Data Replication Service (tech preview)
  • Pull missing files to local site

(Diagram: Site A and Site B each run a Data Replication Service, a Reliable File Transfer Service, GridFTP, a Local Replica Catalog, and a Replica Location Index. Given a list of required files, the local Data Replication Service consults the replica catalogs and indices, then uses RFT and GridFTP to pull the missing files from the remote site.)
36
OGSA-DAI
  • Flexible, composable middleware
  • Data access
  • Relational and XML databases, semi-structured files
  • Data integration
  • Multiple data delivery mechanisms, data translation
  • Extensible, efficient framework
  • Request documents contain multiple tasks
  • A task = execution of an activity
  • Group work to enable efficient operation
  • Extensible set of activities
  • >30 predefined, plus a framework for writing your own
  • Moves computation to data
  • Pipelined and streaming evaluation
  • Concurrent task evaluation

37
OGSA-DAI
  • Current release: Release 5, in GT4
  • Added installation wizards and indexed files
  • >1,100 registered users that we know about
  • Running on 3 message-passing infrastructures
  • Release 6: May 2005
  • Improved client-side API
  • Explicit control of sequential and parallel tasks
  • Dynamic reconfigurability
  • WS-DAI reference implementation

38
Execution Management (GRAM)
  • Common WS interface to schedulers
  • Unix, Condor, LSF, PBS, SGE, ...
  • More generally: an interface for process execution management
  • Lay down execution environment
  • Stage data
  • Monitor and manage lifecycle
  • Kill it, clean up
  • A basis for application-driven provisioning (see the sketch below)
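A sketch of WS GRAM submission with the GT4 globusrun-ws client; the factory host and scheduler type are illustrative:

    # Submit a simple job to the local fork scheduler and wait
    $ globusrun-ws -submit -c /bin/date
    # Batch-submit to a PBS-managed cluster, saving the job EPR
    $ globusrun-ws -submit -batch -Ft PBS \
        -F https://cluster.example.org:8443/wsrf/services/ManagedJobFactoryService \
        -o job.epr -c /bin/hostname
    # Check status, or kill and clean up, using the saved EPR
    $ globusrun-ws -status -j job.epr
    $ globusrun-ws -kill -j job.epr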

39
GT4 GRAM
  • 2nd-generation WS implementation
  • Optimized for performance, stability, scalability
  • Streamlined critical path
  • Use only what you need
  • Flexible credential management
  • Credential cache and delegation service
  • GridFTP and RFT used for data operations
  • Data staging and streaming output
  • Eliminates redundant GASS code
  • Single- and multi-job support

40
GT4 GRAM Structure: WSRF/WSN Poster Child
(Architecture diagram: a client delegates credentials to the Delegation service and submits jobs to the GRAM services running in the GT4 Java container on the service host(s). GRAM uses a sudo-based GRAM adapter for local job control, driving the local scheduler's job functions on the compute element. For staging, GRAM passes transfer requests to the RFT file transfer service, which controls GridFTP servers, over FTP control and data channels, on the compute element and remote storage element(s), where the user job runs and the data resides.)
41
Some of our Goals
  • GRAM should add little to no overhead compared to the underlying batch system
  • Submit as many jobs to GRAM as is possible to the underlying scheduler
  • Goal: 10,000 jobs to a batch scheduler
  • Goal: efficiently fill the process table for the fork scheduler
  • Submit/process jobs through GRAM as fast as is possible with the underlying scheduler
  • Goal: 1 per second
  • We are not there yet
  • A range of limiting factors is at play

42
Design Decisions
  • Efforts and features towards the goal
  • Allow job brokers the freedom to optimize
  • E.g., Condor-G is smarter than globusrun
  • Protocol steps made optional and shareable
  • Reduced cost for the GRAM service on a host
  • Single WSRF host environment
  • Better job status monitoring mechanisms
  • More scalable/reliable file handling
  • GridFTP and RFT instead of globus-url-copy
  • Removal of non-scalable GASS caching
  • GT4 tests performing better than GT3 did
  • But more work to do

43
GRAM 3.9.4 performance
  • Throughput
  • Test: simple job to the fork scheduler (/bin/date); no staging, streaming, or cleanup
  • 77 jobs/min sustained
  • 60 jobs/min with delegation
  • Long-running test
  • Ran 500,000 sequential jobs over 23 days
  • These included staging, delegation, and the fork job manager

44
GRAM Performance (2)
  • Concurrency
  • Job submits to the Condor scheduler (long-running sleep job); no staging, streaming, or cleanup; no delegation
  • Current limit is 32,000 jobs due to a Linux directory limit
  • Using multiple subdirectories will resolve this; look for it in 4.2

45
Monitoring and Discovery
  • Every service should be monitorable and discoverable using common mechanisms
  • WSRF/WSN provides those mechanisms
  • A common aggregator framework for collecting information from services, thus:
  • Index Service: registry supporting XPath queries, with caching
  • Trigger Service: performs an action on a condition
  • Deep integration with Globus containers and services: every GT4 service is discoverable
  • GRAM, RFT, GridFTP, CAS, ... (see the query sketch below)
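A sketch of querying a container's default Index Service with the GT4 wsrf-query client; the hostname is illustrative, and the XPath "/*" returns the full aggregated document:

    # Query everything registered in the default Index Service
    $ wsrf-query -s https://host.example.org:8443/wsrf/services/DefaultIndexService "/*"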

46
GT4 Monitoring and Discovery
(Diagram: clients such as WebMDS query an MDS-Index, a WS-ServiceGroup in the GT4 container, via WSRF/WSN access. Services in the container (GRAM, RFT, GridFTP, user services) register automatically; adapters using custom protocols bring in non-WSRF entities; and MDS-Index services in other containers register upward into a hierarchy.)
47
MDS4 Extensibility
  • Aggregator framework provides:
  • Registration management
  • Collection of information from Grid resources
  • Plug-in interface for data access, collection, query, ...
  • WebMDS framework provides for customized display
  • XSLT transformations

48
MDS4 in 3.9.5: Index Query Performance
  • Small queries (10-minute averages)
  • Message size: 7.5 KB
  • Requests processed: 11,262
  • Average round-trip time: 16 ms
  • Medium queries (10-minute averages)
  • Message size: 32 KB
  • Queries processed: 6,232
  • Average round-trip time: 29 ms

49
Long Running Test
  • Ran 14 days (killed by accident during other testing)
  • Over 94 million requests processed
  • 76 requests/sec average
  • 13 ms average query RTT
  • Has also had DiPerF tests run against it (next slide)

50
(No Transcript)
51
GT4 Documentation is Much Improved!
52
The Globus Ecosystem
  • Globus components address core issues relating to resource access, monitoring, discovery, security, data movement, etc.
  • GT4 is the latest version
  • A larger Globus ecosystem of open source and proprietary components provides complementary capabilities
  • A growing list of components
  • These components can be combined to produce solutions to Grid problems
  • We're building a list of such solutions

53
Many Tools Build on, or Can Contribute to,
GT4-Based Grids
  • Condor-G, DAGman
  • MPICH-G2
  • GRMS
  • Nimrod-G
  • Ninf-G
  • Open Grid Computing Env.
  • Commodity Grid Toolkit
  • GriPhyN Virtual Data System
  • Virtual Data Toolkit
  • GridXpert Synergy
  • Platform Globus Toolkit
  • VOMS
  • PERMIS
  • GT4IDE
  • Sun Grid Engine
  • PBS scheduler
  • LSF scheduler
  • GridBus
  • TeraGrid CTSS
  • NEES
  • IBM Grid Toolbox

54
2005 and Beyond
  • We have a solid Web services base
  • We now want to build, on that base, an open source service-oriented infrastructure
  • Virtualization
  • New services for provisioning, data management,
    security, VO management
  • End-user tools for application development
  • Etc., etc.

55
Globus and its User Community
  • How can we best support you?
  • We try to provide the best software we can
  • We use bugzilla and other community tools
  • We work to grow the set of contributors
  • How can you best support us?
  • Become a contributor of software, bug fixes,
    answers to questions, documentation
  • Provide us with success stories that can justify
    continued Globus development
  • Promote Globus within your communities

56
Working with GT4
  • Download and use the software, and provide
    feedback
  • Join the gt4friends@globus.org mail list
  • Review, critique, add to documentation
  • Globus Doc Project: http://gdp.globus.org
  • Tell us about your GT4-related tool, service, or
    application

57
So
  • GT4 is a significant step forward in the quality, functionality, and standards compliance of GT
  • Beta release available for immediate use; final April 29th
  • Downloads and docs at
  • www.globustoolkit.org

2nd Edition www.mkp.com/grid2
58
But
59
Things heard about Grids
  • "Isn't the Grid just a funding construct?" (SC '01)
  • "Grid computing has been more hype than reality." (Hewlett-Packard CEO Carly Fiorina, 10/03)
  • "Customers don't need the Globus Toolkit to do high-performance compute clusters." (Charles Fitzgerald, a Microsoft general manager, Information Week, 1/05)
  • "We tried to install Globus and found out that it was too hard to do. So we decided to just write our own."

60
Where are all the (happy) users?
  • In July '04 I spoke with 25 UK user groups, and on occasion it got ugly
  • www.nesc.ac.uk/technical_papers/UKeS-2004-08.pdf
  • Many users have been told to use the Grid to get funding, not because they actually want to
  • There are a few well-known successes (LHC, Cactus, and a couple of others), but this isn't widespread enough to be considered more than a one-off

61
We expected using Grids to be a lot of work
  • Parallel computing showed us that the "If you build it, they will come" scenario just won't work
  • Until debuggers, fast compilers, languages, libraries, etc. were available, users didn't want to use parallel machines
  • Many hundreds, even thousands, of hours went into re-writing codes for parallel machines

62
... but how much is acceptable?
  • There is the impression (right or wrong) that only heroic efforts will allow you to use a Grid
  • Some re-writing of code is required
  • Access to resources isn't easy even once code is changed

63
Where are we today?
  • What a user would like:
  • "Run my job, finish by lunch"
  • "Get a data set that has these attributes"
  • "Tell me when that simulation will finish"
  • Where we are today:
  • Specify exact machines, data files, explicit data transfers, etc.
  • Little (or no) dynamic information or prediction

64
Where are we today? (cont.)
  • General agreement: we have basic functionality
  • "Tell me what this set of resources looks like"
  • "Run this job on that resource"
  • "Transfer this file"
  • Globus (among others) does give these basic building blocks (mostly)
  • General agreement: general functionality isn't nearly enough

65
How do we move forward?
  • Users will only come when they have decent tools
  • Simple enough for easy use
  • Robust enough for stupid use
  • Still allowing workarounds for hard-core use
  • Users are hampered by software that doesn't do what they need it to
  • Globus is NOT an end-to-end solution

66
2) Why doesn't Grid software do what we need it to yet?
  • Globus doesn't provide end-to-end solutions
  • Globus Ecosystem Tutorial
  • Globus is building blocks; still missing vertical solutions
  • Mismatch between developers' vision of use and users' vision of use
  • Many tools are used "off label"

67
Off Label Use
  • A tool built to do A is used for B
  • This is good, since a user has something to use
  • This is bad, since the tool is being used in a way that wasn't envisaged
  • Architecture concerns
  • Scaling concerns
  • Etc.
  • But without use of the tool, there's no way to know how it will be used!

68
What is a usage scenario?
  • Information from the user about a specific use case
  • What's the right level of detail?
  • What's a general use case?
  • Note: much application-built software is one-off, but we need general tools that can adapt
  • Who does this?
  • Application scientists and computer scientists speak different languages (e.g., C. Pancake)

69
Will Grid software ever meet users' needs?
  • Without better communication between developers
    and users, the Grid cannot succeed
  • Grids are about people, not just technology

70
3. Need for Standards: Information as a Case Study
  • Open question: how should I store information about a Grid?
  • Globus Monitoring and Discovery Service (MDS)?
  • A tool that does streaming data, like R-GMA?
  • A cluster tool over many sites, like Ganglia?
  • A certification tool like Inca, from the TG project?
  • A Grid-wide database?
  • All of these are right for some of the data; none is right for all uses

71
Why are so many tools bad?
  • A large number of tools isn't bad
  • A large number of tools that have no way to interoperate is!
  • Grid3 has 8 different tools in use
  • LHC has an equal number (at least!)

72
Need for Standard Interfaces
  • Need for standard APIs and protocols to allow easier:
  • Access to data sources
  • Registration of data
  • Archiving tools
  • Standards for what information is available
  • Standards for what that information means
  • Standards for communication of errors
  • This is in part what inspired Globus's move to Web services!

73
Standards are a necessity, not a luxury
  • Without standards of all kinds (protocols, APIs, languages), all the information in the world won't do us any good
  • Open question about the right process for standardization
  • GGF, OASIS, IETF
  • Need for standards vs. standardizing too soon
  • Need for standards vs. the time lag for agreement on standards

74
4. How do we make Grids secure?
  • Without security we can't have a Grid
  • EVERYTHING needs to be secure:
  • Who can run on a machine
  • File transfers
  • What data someone has access to (program data, system data)
  • Who can access which services?

75
Security vs. Usability
  • Users want security but don't want to deal with it
  • If security is hard, it won't be used
  • Most security (including the Grid Security Infrastructure (GSI)) is based on public key infrastructure (PKI)
  • Users have files (public and private keys) that must be kept secure, must use reasonable passwords, etc.

76
What about
  • Multiple certificates?
  • Group access?
  • Dynamic policy changes?
  • Scalability?
  • Overheads?
  • Etc., etc., etc.

77
Security is an open question
  • Until security is made easier to use, it won't be used
  • Until security is made easier to manage at the group level, it won't be used
  • Without security, no one will really use the Grid

78
5. Socio-political Issues
  • Hardest problems are often not technical ones
  • Multiple administration domains means multiple
    policies
  • Multiple countries means multiple communication
    styles
  • Decisions are often made on a non-technical basis

79
Communication is hard
  • Too many people in the mix
  • Not everyone is informed of status updates
  • Often hallway conversation becomes what people
    believe
  • Too often assumptions are not verified
  • Many communication styles can lead to
    misunderstandings

80
What to do?
  • Ongoing efforts at better communication are needed to build a global community
  • When in doubt, ask someone directly!
  • And please: constructive criticism, reporting of errors, etc.; just saying "Globus sucks" simply isn't helpful :-)

81
6. Other open problems
  • What performance is acceptable?
  • What do we do about variance?
  • What about easier testbed setup?
  • Where are the benefits to encourage sharing on the Grid?
  • Where are the benefits for the sys admins? Users get a plus, PIs get a plus, but what about them?
  • How do we educate the funding agencies about the need for hardened software, documentation, and support?
  • What cost models are needed by the Grid?
  • Economic Grids are only the first step
  • And many more ...

82
Progress
  • Significant improvements in security
    infrastructure
  • Basic functionality is much closer
  • More funding and aid for support
  • Need for better-defined use cases and simpler
    deployment has been strengthened, as has the need
    for basic information and basic information
    services

83
Where are the performance metrics for success?
  • No more "Grid papers", just a footnote that states "This work was achieved using the Grid"
  • Supercomputer centers don't give a user the choice of using their machines or the Grid; that line doesn't exist (TG does this now!)
  • SuperComputing demos can be run at any time of the year

84
Conclusion
  • Many interesting problems are left both in
    terms of research and deployment issues
  • Much work is being done to help address these
    open issues
  • Next year's open issues will be different yet

85
References
  • This talk:
  • www.mcs.anl.gov/jms/Talks (not there yet)
  • Globus Alliance:
  • www.globus.org
  • Globus performance:
  • http://www-unix.globus.org/toolkit/docs/development/4.0-drafts/perf_overview.html
  • Journal paper version of the open questions (dated):
  • www.mcs.anl.gov/Pubs/jmspubs.html
  • Conversations with 25 UK user groups:
  • http://www.nesc.ac.uk/technical_papers/UKeS-2004-08.pdf

86
Contact Information
  • Jennifer M. Schopf
  • jms@mcs.anl.gov
  • www.mcs.anl.gov/jms
  • Support from DOE, NSF, Microsoft, NeSC, JISC

87
  • Slides on how Globus works

88
How Globus Works
  • Globus is a distributed open source community with many contributors and users
  • CVS, documentation, bugzilla, email lists
  • Modular structure allows many to contribute
  • Globus Alliance Board provides governance when needed
  • Meritocracy: individuals who demonstrate ongoing contributions and commitment
  • Primarily what to include, when to release
  • Globus Alliance is an informal partnership of organizations led by Board members

89
Evolution of the Globus Alliance
  • Argonne/U.Chicago (Childers, Foster) 1995
  • USC/ISI (Kesselman) 1995
  • Edinburgh (Atkinson, Parsons) 2003
  • Swedish PDC (Johnsson, Mulmo) 2003
  • NCSA (Welch) 2004
  • Univa (Czajkowski, Tuecke) 2004
  • Other contributors will surely be added

90
From eScience to eBusiness
  • Since 2001, growing interest in Globus for
    commercial use
  • Enterprises, IT vendors, ISVs asking Globus
    leaders to address commercial needs
  • But hard to do in a research laboratory
  • In response, we have created two new
    organizations
  • Globus Consortium
  • Univa

91
Globus Consortium (www.globusconsortium.com)
  • Nonprofit organization funded by companies to advance the Globus Toolkit for enterprise use
  • Initial sponsor members: HP, IBM, Intel, Sun
  • Initial contributors: Nortel, Univa
  • First two projects already identified
  • Member-driven software quality improvements
  • Contributions to job submission standards
  • Other projects to be defined, e.g.:
  • Develop new features key to enterprise use
  • Education and outreach

92
Univa
  • Provider of commercial support, services, and products around open source Globus
  • Commercial distribution of GT4 and beyond
  • Integration with enterprise systems
  • Committed to open source and open standards
  • Founded by Tuecke, Foster, Kesselman
  • Tuecke left Argonne to be CEO
  • Foster, Kesselman remain at Argonne, ISI
  • Experienced management team
  • Rich Miller, Vas Vasiliadis, Paul Davé, Bob Mandel

93
Globus and its User Community
  • How can we best support you?
  • We try to provide the best software we can
  • We use bugzilla and other community tools
  • We work to grow the set of contributors
  • How can you best support us?
  • Become a contributor of software, bug fixes,
    answers to questions, documentation
  • Provide us with success stories that can justify
    continued Globus development
  • Promote Globus within your communities

94
Working with GT4
  • Download and use the software, and provide
    feedback
  • Join the gt4friends@globus.org mail list
  • Review, critique, add to documentation
  • Globus Doc Project: http://gdp.globus.org
  • Tell us about your GT4-related tool, service, or
    application

95
So
  • GT4 is a significant step forward in the quality, functionality, and standards compliance of GT
  • Beta release available for immediate use; final April 29th
  • Downloads and docs at
  • www.globustoolkit.org

2nd Edition www.mkp.com/grid2
96
  • Slides on open issues

97
6. What about performance?
  • It's not enough to use the Grid; it has to perform. Otherwise, why bother?
  • First prototypes rarely consider performance
  • MDS1: centralized LDAP
  • MDS2: decentralized LDAP
  • MDS3: decentralized Grid service
  • MDS4: decentralized Web service
  • Often performance is simply not known

98
Performance of GIS Information Servers vs. Number
of Users
Zhang, Freschl, and Schopf, "A Performance Study of Monitoring and Information Services for Distributed Systems", submitted to HPDC 2003.
99
What we found
  • Performance can be a matter of deployment
  • Effect of background load
  • Effect of network bandwidth
  • Performance can be affected by underlying infrastructure
  • LDAP/Java strengths and weaknesses
  • Performance can be improved using standard CS techniques
  • Caching, multi-threading, etc.

100
Moral
  • Performance should be analyzed early and often
  • Prototypes should be recognized as such and
    thrown out
  • Without performance, no reason to use a Grid

101
7. What do we do about variance?
  • Resources on the Grid change with time
  • Bandwidth
  • CPU load
  • Disk space
  • Memory usage
  • Queue sizes

102
Variance: technical problems
  • How do you tell if something is slow versus
    broken?
  • How do you make a prediction?

103
Variance: socio-political
  • Users want the same application to take roughly the same amount of time every time they run it
  • Our experience: a longer running time that's more predictable is preferred to a high-variance, high-risk situation

104
Moral
  • Variance is here to stay; we need techniques to live with it and take advantage of it

105
8. How do we set up a Grid testbed?
  • Bill Johnston, LBNL, talks about this often, based on IPG experience
  • Get the sys admins involved
  • Have a standard set-up
  • Make this a priority at the start of a project
  • Accounting: open issue
  • Cross-site scheduling: open issue
  • Policies across sites: open issue

106
Example Installing Globus
  • GT2: how do you know it's installed OK?
  • Now have test scripts
  • http://www-unix.globus.org/toolkit/testing/
  • But it would be better to have something automatic
  • GT3: how many configuration files do I need to work with?
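For contrast, the GT4-era install flow reduces this to a standard configure/make sequence. A sketch; the installer tarball name and install prefix are illustrative, and the test scripts above can then be run against the install:

    $ tar xzf gt4.0.0-all-source-installer.tar.gz
    $ cd gt4.0.0-all-source-installer
    $ ./configure --prefix=/usr/local/globus-4.0.0
    $ make | tee build.log
    $ make install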

107
Moral
  • Users are building testbeds, but this is still hard
  • Need to have rules of thumb published for assistance with this

108
4. How do we understand information once we get
it?
  • Assume we have access to information about the Grid; can we use it?
  • Grid3 has 8 different tools in use that give conflicting answers
  • A monitoring system says the load on machine X is Y
  • A scheduler wants to evaluate this data
  • No common language for this to be communicated
  • Some effort now to come up with a common schema (GLUE schema, work with CIM in GGF), but this has only touched the surface; no agreement on moving forward