History - PowerPoint PPT Presentation

Provided by: leeli
1
History
  • For years, a few whacky computer scientists have
    been trying to help other scientists use
    distributed computing.
  • Interactive simulation (climate modeling)
  • Very large-scale simulation and analysis (galaxy
    formation, gravity waves, battlefield simulation)
  • Engineering (parameter studies, linked component
    models)
  • Experimental data analysis (high-energy physics)
  • Image and sensor analysis (astronomy, climate
    study, ecology)
  • Online instrumentation (microscopes, x-ray
    devices, etc.)
  • Remote visualization (climate studies, biology)
  • Engineering (large-scale structural testing,
    chemical engineering)
  • In these cases, the scientific problems are big
    enough that they require people in several
    organizations to collaborate and share computing
    resources, data, instruments.

2
What Types of Problems?
  • Your system administrators can't agree on a
    uniform authentication system, but you have to
    allow your users to authenticate once (using a
    single password) then use services on all
    systems, with per-user accounting.
  • You need to be able to offload work during peak
    times to systems at other companies, but the
    volume of work they'll accept changes from
    day to day.

3
What Types of Problems?
  • You and your colleagues have 6000 datasets from
    the past 50 years of studies that you want to
    start sharing, but no one is willing to submit
    the data to a centrally-managed storage system or
    database.
  • You need to run 24 experiments that each use six
    large-scale physical experimental facilities
    operating together in real time.
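
One way to picture the first problem on this slide is a catalog that records only descriptions and locations while every dataset stays with its owner. The following is a minimal sketch under that assumption; the class names, fields, and URLs are hypothetical and are not taken from any Grid toolkit.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CatalogEntry:
    """Metadata about one dataset; the data itself stays with its owner."""
    dataset_id: str
    owner: str
    location: str  # hypothetical address where the owner serves the data
    year: int


class FederatedCatalog:
    """Holds descriptions and locations only, never the datasets themselves."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Owners publish (or update) an entry without handing over any data.
        self._entries[entry.dataset_id] = entry

    def search(self, first_year: int, last_year: int) -> List[CatalogEntry]:
        # Return matching entries sorted by id so results are deterministic.
        matches = (
            e for e in self._entries.values()
            if first_year <= e.year <= last_year
        )
        return sorted(matches, key=lambda e: e.dataset_id)


catalog = FederatedCatalog()
catalog.register(CatalogEntry("ds-001", "lab-a", "https://lab-a.example/ds-001", 1975))
catalog.register(CatalogEntry("ds-002", "lab-b", "https://lab-b.example/ds-002", 1998))
catalog.register(CatalogEntry("ds-003", "lab-a", "https://lab-a.example/ds-003", 2004))

for entry in catalog.search(1990, 2010):
    print(entry.dataset_id, entry.owner, entry.location)
```

A user who finds an entry would still have to request the data from the owner's own system, which is where the owner's local access rules apply.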

4
Two Cardinal Rules of the Grid
  • You can't rely on homogeneity.
  • Impossible to achieve in the real world.
  • STRATEGY - Plan on dealing with diverse systems
    and use mechanisms to manage heterogeneity.
  • You can't rely on trust.
  • Severely limits participation.
  • STRATEGY - Provide a security model that can
    express complicated social networks.
  • STRATEGY - Use full disclosure when making
    requests (who is requesting, authorizing, and
    authenticating the request) and give service
    owners and service hosts tools to enforce local
    policies.
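
The "full disclosure" strategy can be sketched as a request that names every party involved, checked against policies supplied locally by the service owner and the service host. This is an illustrative sketch only: the `Request` fields, the policy functions, and names such as "ExampleCA" and "climate-vo" are assumptions made for the example, not part of any real security API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Request:
    """A request that discloses every party involved (hypothetical fields)."""
    requester: str      # who is asking
    authorizer: str     # who granted permission, e.g. a virtual organization
    authenticator: str  # who vouched for the identity, e.g. a certificate authority
    action: str


# A local policy is any function supplied by the service owner or host.
Policy = Callable[[Request], bool]


def is_allowed(request: Request, policies: List[Policy]) -> bool:
    """Allow the request only if every local policy accepts it."""
    return all(policy(request) for policy in policies)


def host_policy(request: Request) -> bool:
    # Hypothetical host rule: trust identities from one authenticator only.
    return request.authenticator == "ExampleCA"


def owner_policy(request: Request) -> bool:
    # Hypothetical owner rule: accept two authorizing organizations.
    return request.authorizer in {"climate-vo", "physics-vo"}


ok = Request("alice", "climate-vo", "ExampleCA", "read-dataset")
bad = Request("mallory", "unknown-vo", "ExampleCA", "read-dataset")

print(is_allowed(ok, [host_policy, owner_policy]))
print(is_allowed(bad, [host_policy, owner_policy]))
```

Because the request carries all three identities, the host and the owner can each apply their own rule without trusting one another's decisions.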

5
Challenging Applications
  • The applications that Grid technology is aimed at
    are not easy applications!
  • The reason these things haven't been done before
    is that people believed it was too hard to
    bother trying.
  • If you're trying to do these things, you'd better
    be prepared for it to be challenging.
  • Grid technologies are aimed at helping to
    overcome the challenges.
  • They solve some of the most common problems
  • They encourage standard solutions that make
    future interoperability easier
  • They were developed as parts of real projects
  • In many cases, they benefit from years of lessons
    from multiple applications
  • Ever-improving documentation, installation,
    configuration, training

6
Earth System Grid
  • Goal: Give climate scientists easier access to
    the distributed data and resources that they
    require to perform their research.
  • Developed new technologies for (1) creating and
    operating "filtering servers" capable of
    performing sophisticated analyses, and (2)
    delivering results to users.

7
Collaborative Engineering: NEES
U.Nevada Reno
www.neesgrid.org
8
Laser Interferometer Gravitational Wave
Observatory
  • Goal: Observe gravitational waves predicted by
    theory.
  • Three physical detectors in two locations. (Plus
    GEO detector in Germany.)
  • Ten data centers for data analysis.
  • Collaborators in 40 institutions on at least
    three continents.

9
Cancer Biology
10
NSF's TeraGrid
  • TeraGrid DEEP: Integrating NSF's most powerful
    computers (60 TF)
  • 2 PB Online Data Storage
  • National data visualization facilities
  • World's most powerful network (national
    footprint)
  • TeraGrid WIDE Science Gateways: Engaging
    Scientific Communities
  • 90 Community Data Collections
  • Growing set of community partnerships spanning
    the science community.
  • Leveraging NSF ITR, NIH, DOE and other science
    community projects.
  • Engaging peer Grid projects such as Open Science
    Grid in the U.S. as well as peer Grids in Europe
    and Asia-Pacific.
  • Base TeraGrid Cyberinfrastructure: Persistent,
    Reliable, National
  • Coordinated distributed computing and information
    environment
  • Coherent User Outreach, Training, and Support
  • Common, open infrastructure services

Sites shown: UC/ANL, PSC, PU, NCSA, IU, ORNL, UCSD, UT
  • A National Science Foundation Investment in
    Cyberinfrastructure
  • $100M 3-year construction (2001-2004)
  • $150M 5-year operation and enhancement (2005-2009)

Slide courtesy of Ray Bair, Argonne National
Laboratory
11
What End Users Need
Secure, reliable, on-demand access to
data, software, people, and other
resources (ideally all via a Web Browser!)
12
How it Really Happens
[Diagram: layered architecture. Components shown: Web
Browser, Web Portal, Simulation Tool, Data Viewer Tool,
Chat Tool, Telepresence Monitor, Registration Service,
Credential Repository, Data Catalog, Certificate
Authority, Compute Servers, Cameras, Database services.]
  • Users work with client applications
  • Application services organize VOs and enable access
    to other services
  • Collective services aggregate and/or virtualize
    resources
  • Resources implement standard access management
    interfaces
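
The relationship between the bottom two layers of this diagram can be sketched as follows: each resource implements one shared interface, and a collective service aggregates many resources behind a single query. This is a minimal sketch; the `Resource` interface, the `ResourceIndex` class, and the resource names are hypothetical and do not reflect the Globus Toolkit's actual interfaces.

```python
from typing import Dict, List, Protocol


class Resource(Protocol):
    """Standard interface that every resource implements (hypothetical)."""
    name: str

    def describe(self) -> Dict[str, str]: ...


class ComputeServer:
    def __init__(self, name: str, cpus: int) -> None:
        self.name = name
        self.cpus = cpus

    def describe(self) -> Dict[str, str]:
        return {"name": self.name, "kind": "compute", "cpus": str(self.cpus)}


class Camera:
    def __init__(self, name: str, site: str) -> None:
        self.name = name
        self.site = site

    def describe(self) -> Dict[str, str]:
        return {"name": self.name, "kind": "camera", "site": self.site}


class ResourceIndex:
    """Collective service: aggregates many resources behind one query."""

    def __init__(self, resources: List[Resource]) -> None:
        self._resources = list(resources)

    def find(self, kind: str) -> List[str]:
        # Relies only on the shared interface, not on concrete classes.
        return [r.name for r in self._resources if r.describe()["kind"] == kind]


index = ResourceIndex([
    ComputeServer("cluster-a", 128),
    Camera("lab-cam-1", "site-1"),
    ComputeServer("cluster-b", 64),
])

print(index.find("compute"))
print(index.find("camera"))
```

Client applications and application services would then talk to the index rather than to each resource directly, which is what lets diverse resources look uniform from above.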
13
How it Really Happens
  • Implementations are provided by a mix of:
  • Application-specific code
  • Off the shelf tools and services
  • Tools and services from the Globus Toolkit
  • Tools and services from the Grid community
    (compatible with GT)
  • Glued together by:
  • Application development
  • System integration

14
Globus Philosophy
  • Globus was first established as an open source
    project in 1996
  • The Globus Toolkit is open source to:
  • Allow for inspection (for consideration in
    standardization processes)
  • Encourage adoption (in pursuit of ubiquity and
    interoperability)
  • Encourage contributions (harness the expertise of
    the community)
  • The Globus Toolkit is distributed under the
    (BSD-style) Apache License version 2

15
dev.globus
  • Governance model based on Apache Jakarta
  • Consensus based decision making
  • Globus software is organized as several dozen
    Globus Projects
  • Each project has its own Committers responsible
    for their products
  • Cross-project coordination through shared
    interactions and committers meetings
  • A Globus Management Committee
  • Overall guidance and conflict resolution

16
http://dev.globus.org
  • Guidelines (Apache Jakarta)
  • Infrastructure (CVS, email, Bugzilla, Wiki)
  • Projects Include
17
Open Source ≠ Free time
  • Globus development is well-funded.
  • The open source model facilitates contributions.
  • NSF and DOE sponsor Globus development at several
    institutions via multiple grants, totaling
    >$5M/yr.
  • Non-U.S. science agencies also contribute to
    Globus development.
  • Corporations also sponsor developers.
  • NSF explicitly funds Globus improvements.
  • CDIGS: Community-Driven Improvements to Globus
    Software

18
Globus Technology Areas
  • Core runtime
  • Infrastructure for building new services
  • Security
  • Apply uniform policy across distinct systems
  • Execution management
  • Provision, deploy, manage services
  • Data management
  • Discover, transfer, access large data
  • Monitoring
  • Discover and monitor dynamic services

19
Globus Software: dev.globus.org
  • Globus Projects: OGSA-DAI, GT4, MPICH-G2, Data Rep,
    Replica Location, Java Runtime, MyProxy, Delegation,
    GridWay, GridFTP, MDS4, CAS, C Runtime, GSI-OpenSSH,
    Incubator Mgmt, Reliable File Transfer, GRAM,
    Python Runtime, C Sec, GT4 Docs
  • Incubator Projects: Cog WF, GAARDS, Virt WkSp,
    MEDICUS, OGRO, GDTE, UGP, GridShib, Dyn Acct,
    Gavia JSC, DDM, Metrics, LRMA, HOC-SA, PURSE,
    Introduce, WEEP, Gavia MS, SGGC, ServMark
  • Project categories: Security, Execution Mgmt,
    Info Services, Common Runtime, Data Mgmt, Other
20
What Is the Globus Toolkit?
  • The Globus Toolkit is a collection of solutions
    to problems that frequently come up when trying
    to build collaborative distributed applications.
  • Heterogeneity
  • To date (v1.0 - v4.0), the Toolkit has focused on
    simplifying heterogeneity for application
    developers.
  • We are increasingly including more vertical
    solutions that implement typical application
    patterns.
  • Security
  • The Grid Security Infrastructure (GSI) allows
    collaborators to share resources without blind
    trust.
  • Standards
  • Our goal has been to capitalize on and encourage
    use of existing standards (IETF, W3C, OASIS,
    GGF).
  • The Toolkit also includes reference
    implementations of new/proposed standards in
    these organizations.

21
What's In the Globus Toolkit?
  • A Grid development environment
  • Develop new OGSA-compliant Web Services
  • Develop applications using Java or C/C++ Grid
    APIs
  • Secure applications using basic security
    mechanisms
  • A set of basic Grid services
  • Job submission/management
  • File transfer (individual, queued)
  • Database access
  • Data management (replication, metadata)
  • Monitoring/Indexing system information
  • Tools and Examples
  • The prerequisites for many Grid community tools

22
Leveraging Existing and Proposed Standards
  • SSL/TLS v1 (from OpenSSL) (IETF)
  • LDAP v3 (from OpenLDAP) (IETF)
  • X.509 Proxy Certificates (IETF)
  • GridFTP v1.0 (GGF)
  • OGSI v1.0 (GGF)
  • WSRF (OASIS)
  • And others on the road to standardization
  • DAI, WS-Agreement, WSDL 2.0, WSDM, SAML, XACML

23
Areas of Competence
  • Connectivity Layer Solutions
  • Service Management (WS Core)
  • Monitoring/Discovery (WS Core)
  • Security (GSI and WS-Security)
  • Communication (XIO)
  • Resource Layer Solutions
  • Computing / Processing Power (GRAM)
  • Data Access/Movement (GridFTP, OGSA-DAI)
  • In development: Telecontrol (GTCP)
  • Collective Layer Solutions
  • Data Management (RLS, DRS, RFT, OGSA-DAI)
  • Monitoring/Discovery (Index, Trigger, Archiver
    services)
  • Security (CAS, MyProxy)

24
How To Use the Globus Toolkit
  • By itself, the Toolkit has surprisingly limited
    end user value.
  • There's very little user interface material
    there.
  • You can't just give it to end users (scientists,
    engineers, marketing specialists) and tell them
    to do something useful!
  • The Globus Toolkit is useful to application
    developers and system integrators.
  • You'll need to have a specific application or
    system in mind.
  • You'll need to have the right expertise.
  • You'll need to set up prerequisite
    hardware/software.
  • You'll need to have a plan.

25
An Ecosystem of Grid Software
  • There isn't a Grid software kit for everybody
    (yet).
  • Varying requirements
  • Experimentation and learning
  • Reluctance to invest in a static solution
  • There are many tools that work well together.
  • Results of successful projects
  • Reusable solutions
  • Implication: Integrate it yourself (for now).
  • Provides considerable flexibility
  • Requires expertise and effort
  • Reminder: These are ambitious applications!

26
Methodology
  • Building a Grid system or application is
    currently an exercise in software integration.
  • Define user requirements
  • Derive system requirements or features
  • Survey existing components
  • Identify useful components
  • Develop components to fit into the gaps
  • Integrate the system
  • Deploy and test the system
  • Maintain the system during its operation
  • This should be done iteratively, with many loops
    and eddies in the flow.

27
Globus User Community
  • Large and diverse
  • 10s of national Grids, 100s of applications,
    1000s of users (probably many more)
  • Every continent except Antarctica
  • Applications ranging across many sciences
  • Dozens (at least) of commercial deployments
  • Successful
  • Many production systems doing real work
  • Many applications producing real results
  • Smart, energetic, demanding
  • Constant stream of new use cases and tools

28
Global Community
29
Examples of Production Scientific Grids
  • APAC (Australia)
  • China Grid
  • China National Grid
  • D-Grid (Germany)
  • EGEE
  • NAREGI (Japan)
  • Open Science Grid
  • Taiwan Grid
  • TeraGrid
  • ThaiGrid
  • UK National Grid Service

30
The Importance of Community
  • All Grid technology is evolving rapidly.
  • Web services standards
  • Grid interfaces
  • Grid implementations
  • Grid hosting services (ASP, SSP, etc.)
  • Community is important!
  • Best practices (GGF, OASIS, etc.)
  • Open source (Linux, Axis, Globus, etc.)
  • Application of community standards is vital.
  • Increases leverage
  • Mitigates (a bit) effects of rapid evolution
  • Paves the way for future integration/partnership