The GRID Era Vanguard, Miami 23 September 2002 - PowerPoint PPT Presentation

Transcript and Presenter's Notes


1
The GRID Era: Vanguard, Miami, 23 September 2002
  • Gordon Bell, gbell@microsoft.com
  • Bay Area Research Center
  • Microsoft Corporation

2
(No Transcript)
3
Grid Technology
  • Background
  • Taxonomy
  • Grids: from SETI@home to arbitrary cluster
    platform
  • Grid-type examples and web services
  • Summary

4
Bright spots in the evolution from prototypes
to early suppliers
  • Early efforts
  • UC Berkeley NOW, U of WI Condor, NASA
    Beowulf -> Airframes
  • Argonne (Foster et al.): Grid, Globus Toolkit,
    Grid Forum
  • Entropia startup (Andrew Chien)
  • Andrew Grimshaw - Avaki
  • Making the Legion vision real. A reality check.
  • United Devices MetaProcessor Platform
  • UK e-Sciences research program. Apps-based
    funding. Web-services-based Grid and data
    orientation.
  • Nimrod at Monash University
  • Parameter scans and other low-hanging fruit
  • Encapsulate apps! Excel-- language/control
    mgmt.
  • Legacy apps. No time or resources to modify
    code, independent of age, author, or language,
    e.g. Java.
  • Grid Services: Gray et al., Skyservice and
    Terraservice
  • Goal: providing a web service must be as easy as
    publishing and using a web page, and will occur!!!

5
Grid Taxonomy c. 2002
  • Taxonomy: interesting vs. necessity
  • Cycle scavenging and object evaluation (e.g.
    SETI@home, QCD)
  • File distribution/sharing for IP theft, e.g.
    Napster
  • Databases and programs for a community (astronomy,
    bioinformatics, CERN, NCAR)
  • Workbenches: web workflow; chem, bio
  • Exchanges: many sites operating together
  • Single, large objectified pipeline, e.g. NASA.
  • Grid as a cluster platform! Transparent
    arbitrary access including load balancing
  • Homogeneous/heterogeneous computers
  • Fixed or variable network loading
  • Intranet, extranet, internet (many organizations)

Web SVCs
6
Grids: Ready for prime time.
  • Economics: thief, scavenger, power, efficiency, or
    resource, e.g. programs and database sharing?
  • Embarrassingly parallel apps, e.g. parameter
    scans: killer apps
  • Coupling large, separated apps
  • Entry points for web services
  • Research funding: that's where the money is.

7
Grid Computing: Concepts, Applications, and
Technologies
  • Ian Foster
  • Mathematics and Computer Science Division
  • Argonne National Laboratory
  • and
  • Department of Computer Science
  • The University of Chicago
  • www.mcs.anl.gov/foster/talks.htm.

Grid Computing in Canada Workshop, University of
Alberta, May 1, 2002
8
Globus Toolkit
  • A software toolkit addressing key technical
    problems in the development of Grid-enabled
    tools, services, and applications
  • Offer a modular set of orthogonal services
  • Enable incremental development of grid-enabled
    tools and applications
  • Implement standard Grid protocols and APIs
  • Available under liberal open source license
  • Large community of developers and users
  • Commercial support

9
Globus Toolkit Core Services (work in progress
since 1996)
  • Small, standards-based set of protocols, embedded
    in an open source toolkit, enabling web services
    and applications
  • Scheduling (Globus Resource Alloc. Manager)
  • Low-level scheduler API
  • Information (Directory Service) UDDI
  • Uniform access to structure/state information
  • Communications (Nexus)
  • Multimethod communication, QoS management
  • Security (Globus Security Infrastructure)
  • Single sign-on, key management
  • Health and status (Heartbeat monitor)
  • Remote file access (Global Access to Storage)

10
Living in an Exponential World (1): Computing and
Sensors
  • Moore's Law: transistor count doubles every 18
    months

Magnetohydrodynamics: star formation
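
As a rough worked example of the doubling rule on this slide, the sketch below computes the growth factor after a given number of years. The function name and the sample horizons are illustrative assumptions, not part of the talk.

```python
def growth_factor(years: float, doubling_months: float = 18.0) -> float:
    """Multiplicative growth after `years` if a quantity doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


# Sample horizons chosen for illustration only.
for years in (1.5, 3.0, 6.0, 15.0):
    print(f"{years:>4} years -> x{growth_factor(years):,.0f}")
```

Fifteen years at an 18-month doubling time is ten doublings, i.e. a factor of about a thousand.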
11
The 13.6 TF TeraGrid: Computing at 40 Gb/s
[Figure: TeraGrid site diagram. Four sites (NCSA, SDSC, Caltech, Argonne), each with site resources and external network links; archival storage labeled HPSS and UniTree. NCSA/PACI: 8 TF, 240 TB. SDSC: 4.1 TF, 225 TB.]
www.teragrid.org
12
Access Grid
  • High-end group work and collaboration technology
  • Grid services being used for discovery,
    configuration, authentication
  • O(50) systems deployed worldwide
  • Basis for SC2001 SC Global event in November
    2001
  • www.scglobal.org

www.accessgrid.org
13
(No Transcript)
14
Grids at NASA: Aviation Safety
Wing Models
  • Lift Capabilities
  • Drag Capabilities
  • Responsiveness

Stabilizer Models
Airframe Models
  • Deflection capabilities
  • Responsiveness

Crew Capabilities: accuracy, perception,
stamina, reaction times, SOPs
Engine Models
  • Thrust performance
  • Reverse Thrust performance
  • Responsiveness
  • Fuel Consumption

Landing Gear Models
  • Braking performance
  • Steering capabilities
  • Traction
  • Dampening capabilities
15
A Large Virtual Organization: CERN's Large Hadron
Collider
  • 1800 Physicists, 150 Institutes, 32 Countries
  • 100 PB of data by 2010; 50,000 CPUs?

16
Life Sciences: Telemicroscopy
DATA ACQUISITION
PROCESSING, ANALYSIS
ADVANCED VISUALIZATION
NETWORK
COMPUTATIONAL RESOURCES
IMAGING INSTRUMENTS
LARGE DATABASES
17
Nimrod/G and GriddLeS: Grid Programming with Ease
  • David Abramson
  • Monash University
  • DSTC

18
Building on Legacy Software
  • Nimrod
  • Support parametric computation without
    programming
  • High performance distributed computing
  • Clusters (1994-1997)
  • The Grid (1997- ) (Added QoS through
    Computational Economy)
  • Nimrod/O: Optimisation on the Grid
  • Active Sheets: Spreadsheet interface
  • GriddLeS
  • General Grid Applications using Legacy Software
  • Whole applications as components
  • Using no new primitives in application

19
Parametric Execution
  • Study the behaviour of some of the output
    variables against a range of different input
    scenarios.
  • Allows real time analysis for many applications
  • More realistic simulations
  • More rigorous science
  • More robust engineering
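
A minimal sketch of a parametric execution, assuming a hypothetical `model` function that stands in for a legacy simulation. The parameter names and values are invented for illustration; a tool such as Nimrod would dispatch each scenario to a remote node rather than calling a local function.

```python
from itertools import product


def model(wind_speed: float, emission_rate: float) -> float:
    """Hypothetical stand-in for a legacy simulation: returns one output variable."""
    return emission_rate / (1.0 + wind_speed)


# Assumed input scenarios: every combination of these values is evaluated.
parameters = {
    "wind_speed": [0.0, 1.0, 3.0],
    "emission_rate": [10.0, 20.0],
}


def sweep(fn, space):
    """Evaluate fn on the cross product of all parameter values."""
    names = list(space)
    results = []
    for values in product(*(space[name] for name in names)):
        scenario = dict(zip(names, values))
        results.append((scenario, fn(**scenario)))
    return results


results = sweep(model, parameters)
for scenario, output in results:
    print(scenario, "->", output)
```

Each scenario is independent of the others, which is why this kind of study parallelizes so easily.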

20
Some science is hitting a wall: FTP and GREP are
not adequate (Jim Gray)
  • You can FTP 1 MB in 1 sec.
  • You can FTP 1 GB / min.
  • 2 days and $1K
  • 3 years and $1M
  • You can GREP 1 GB in a minute
  • You can GREP 1 TB in 2 days
  • You can GREP 1 PB in 3 years.
  • 1 PB: 10,000 >> 1,000 disks
  • At some point you need indices to limit
    search, plus parallel data search and analysis
  • Goal: using databases, make it easy to
  • Publish: record structured data
  • Find data anywhere in the network
  • Get the subset you need!
  • Explore datasets interactively
  • Database becomes the file system!!!
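
To illustrate why indices limit the search, the sketch below contrasts a GREP-style full scan with an indexed range lookup over the same records. The catalogue, its field names, and its values are made up for the example.

```python
from bisect import bisect_left, bisect_right

# Hypothetical catalogue: (object_id, brightness) records in arbitrary order.
records = [(i, (i * 37) % 1000) for i in range(10_000)]


def scan(lo, hi):
    """GREP-style full scan: touches every record on every query."""
    return sorted(obj_id for obj_id, brightness in records if lo <= brightness <= hi)


# Build the index once: entries sorted by brightness.
index = sorted((brightness, obj_id) for obj_id, brightness in records)
keys = [brightness for brightness, _ in index]


def lookup(lo, hi):
    """Indexed range query: two binary searches, then read only the matching slice."""
    start = bisect_left(keys, lo)
    end = bisect_right(keys, hi)
    return sorted(obj_id for _, obj_id in index[start:end])


# Both approaches return the same subset; only the work done differs.
assert scan(100, 102) == lookup(100, 102)
print(len(lookup(100, 102)), "matching records")
```

The scan cost grows with the size of the dataset, while the indexed lookup grows with the logarithm of the dataset plus the size of the answer.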

21
SkyServer: delivering a web service to the
astronomy community. Prototype for other
sciences? Gray, Szalay, et al.
  • First paper on the SkyServer
  • http://research.microsoft.com/gray/Papers/MSR_TR_2001_77_Virtual_Observatory.pdf
  • http://research.microsoft.com/gray/Papers/MSR_TR_2001_77_Virtual_Observatory.doc
  • Later, more detailed paper for database community
  • http://research.microsoft.com/gray/Papers/MSR_TR_01_104_SkyServer_V1.pdf
  • http://research.microsoft.com/gray/Papers/MSR_TR_01_104_SkyServer_V1.doc

22
What can be learned from Sky Server?
  • It's about data, not about harvesting flops
  • 1-2 hr. query programs versus 1 wk. programs based
    on grep
  • 10 minute runs versus 3 day compute searches
  • Database viewpoint. 100x speed-ups
  • Avoid costly re-computation and searches
  • Use indices and PARALLEL I/O. Read / Write >> 1.
  • Parallelism is automatic, transparent, and just
    depends on the number of computers/disks.
  • Limited experience and talent to use databases.
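
A small sketch of the database viewpoint, using Python's built-in `sqlite3` module as a stand-in for a real science archive. The `galaxy` table, its columns, and the synthetic rows are assumptions made for illustration and do not reflect the actual SkyServer schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE galaxy (obj_id INTEGER PRIMARY KEY, ra_deg REAL, dec_deg REAL, magnitude REAL)"
)

# Synthetic rows, invented for the example.
rows = [
    (i, (i * 0.36) % 360.0, (i % 180) - 90.0, 14.0 + (i % 100) / 10.0)
    for i in range(1000)
]
conn.executemany("INSERT INTO galaxy VALUES (?, ?, ?, ?)", rows)

# The index lets the engine answer range queries without reading every row.
conn.execute("CREATE INDEX idx_galaxy_magnitude ON galaxy (magnitude)")

# "Get the subset you need": a declarative query instead of a grep over files.
bright = conn.execute(
    "SELECT obj_id, magnitude FROM galaxy WHERE magnitude < ? ORDER BY obj_id",
    (14.15,),
).fetchall()
print(len(bright), "bright objects; first:", bright[0])

# The query plan shows how the engine intends to execute the query.
for step in conn.execute(
    "EXPLAIN QUERY PLAN SELECT obj_id FROM galaxy WHERE magnitude < 14.15"
):
    print(step)
```

The query states what is wanted and leaves the access path to the engine, which is the property the slide credits for the speed-ups.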

23
Sloan Digital Sky Survey Analysis
Size distribution of galaxy clusters?
24
Network concerns
  • Very high cost
  • $(1 + 1) / GByte to send on the net; FedEx and
    160 GByte shipments are cheaper
  • DSL at home is $0.15 - $0.30
  • Disks cost less than $2/GByte to purchase
  • Low availability of fast links (last mile
    problem)
  • Labs and universities have DS3 links at most, and
    they are very expensive
  • Traffic: instant messaging, music stealing
  • Performance at desktop is poor
  • 1-10 Mbps: very poor communication links
  • Manage: trade in fast links for cheap links!!
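
A back-of-the-envelope sketch of why shipping disks can beat the network: the time to move a 160 GByte shipment over the link speeds mentioned above. It assumes decimal units, an ideal link with no protocol overhead, and the nominal DS3 rate of 44.736 Mbps.

```python
def transfer_days(gigabytes: float, megabits_per_second: float) -> float:
    """Days needed to move `gigabytes` over an ideal link of the given speed."""
    bits = gigabytes * 8e9  # decimal units: 1 GB = 8e9 bits
    seconds = bits / (megabits_per_second * 1e6)
    return seconds / 86_400


# 1 and 10 Mbps are the desktop rates on the slide; 44.736 Mbps is nominal DS3.
for label, mbps in (("1 Mbps", 1.0), ("10 Mbps", 10.0), ("DS3", 44.736)):
    print(f"160 GB over {label}: {transfer_days(160, mbps):.2f} days")
```

At 1 Mbps the transfer takes roughly two weeks, which is the regime where an overnight courier wins.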

25
For More Information
  • www.gridtoday.com
  • Grid concepts, projects
  • www.mcs.anl.gov/foster
  • The Globus Project
  • www.globus.org
  • Open Grid Services Arch.
  • www.globus.org/ogsa
  • Global Grid Forum
  • www.gridforum.org
  • GriPhyN project
  • www.griphyn.org
  • Avaki, Entropia, UK eSciences, Condor,
  • Grid books in press

Published July 1998
26
The End: Are GRIDs already a real, useful
computing structure? When will Grids be
ubiquitous?
27
Toward a Framework for Preparing and Executing
Adaptive Grid Programs: An Overview of the GrADS
Project. Sponsored by NSF NGS. Ken Kennedy, Center
for High Performance Software, Rice University.
http://www.cs.rice.edu/ken/Presentations/GrADSOverview.pdf
28
GrADS Vision
  • Build a National Problem-Solving System on the
    Grid
  • Transparent to the user, who sees a
    problem-solving system
  • Software Support for Application Development on
    Grids
  • Goal: design and build programming systems for
    the Grid that broaden the community of users who
    can develop and run applications in this complex
    environment
  • Challenges
  • Presenting a high-level application development
    interface
  • If programming is hard, the Grid will not
    reach its potential
  • Designing and constructing applications for
    adaptability
  • Late mapping of applications to Grid resources
  • Monitoring and control of performance
  • When should the application be interrupted and
    remapped?

29
Today: Globus
  • Developed by Ian Foster and Carl Kesselman
  • Grew from the I-Way (SC-95)
  • Basic Services for distributed computing
  • Resource discovery and information services
  • User authentication and access control
  • Job initiation
  • Communication services (Nexus and MPI)
  • Applications are programmed by hand
  • Many applications
  • User responsible for resource mapping and all
    communication
  • Existing users acknowledge how hard this is

30
GrADSoft Architecture
Program Preparation System
31
Configurable Object Program
  • Goal: provide the minimum needed to automate resource
    selection and program launch
  • Code
  • Today: MPI program
  • Tomorrow: more general representations
  • Mapper
  • Defines required resources and affinities to
    specialized resources
  • Given a set of resources, maps computation to
    those resources
  • Optimal performance, given all requirements met
  • Performance Model
  • Given a set of resources and mapping, estimates
    performance
  • Serves as objective function for Resource
    Negotiator/Scheduler

32
GrADSoft Architecture
Execution Environment
33
Grid nj. An arbitrary distributed, cluster
platform
  • A geographical and multi-organizational
    collection of diverse computers dynamically
    configured as cluster platforms responding to
    arbitrary, ill-defined jobs thrown at it.
  • Costs are not necessarily favorable e.g. disks
    are less expensive than cost to transfer data.
  • Latency and bandwidth are non-deterministic ->
    cluster with unknown, dynamic parameters
  • Once a large body of data exists for a job, it is
    inherently bound to (set into) fixed resources.
  • Large datasets and I/O-bound programs need to be
    with their data or be database accesses
  • But are there resources there to share?
  • Costs may vary, depending on organization

34
Cactus (Allen, Dramlitsch, Seidel, Shalf, Radke)
  • Modular, portable framework for parallel,
    multidimensional simulations
  • Construct codes by linking
  • Small core (flesh): mgmt services
  • Selected modules (thorns): numerical methods,
    grids and domain decomps, visualization and
    steering, etc.
  • Custom linking/configuration tools
  • Developed for astrophysics, but not
    astrophysics-specific

Thorns
Cactus flesh
www.cactuscode.org
35
Cactus Example: Terascale Computing
  • Solved EEs for gravitational waves (real code)
  • Tightly coupled, communications required through
    derivatives
  • Must communicate 30 MB/step between machines
  • Time step takes 1.6 sec
  • Used 10 ghost zones along direction of machines:
    communicate every 10 steps
  • Compression/decomp. on all data passed in this
    direction
  • Achieved 70-80% scaling, 200 GF (only 14% scaling
    without tricks)
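
The ghost-zone trick can be shown on a toy problem. The sketch below is not Cactus code: it uses a 1D explicit diffusion stencil, splits the domain across two "machines", and exchanges a 10-cell-wide ghost region only every 10 steps. The stencil, domain size, and initial data are assumptions for illustration.

```python
def step(u, alpha=0.25):
    """One explicit diffusion step; the two end cells are held fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def run_single(u, steps):
    """Reference: the whole domain on one machine."""
    for _ in range(steps):
        u = step(u)
    return u


def run_split(u, steps, ghost):
    """Two subdomains with `ghost` overlap cells, exchanged every `ghost` steps."""
    mid = len(u) // 2
    left = u[: mid + ghost]   # owned cells [0, mid) plus ghost cells
    right = u[mid - ghost:]   # ghost cells plus owned cells [mid, n)
    exchanges = 0
    for s in range(steps):
        if s > 0 and s % ghost == 0:
            # Refresh each side's ghost cells from the neighbour's owned cells.
            new_left_ghost = right[ghost: 2 * ghost]
            new_right_ghost = left[mid - ghost: mid]
            left[mid:] = new_left_ghost
            right[:ghost] = new_right_ghost
            exchanges += 1
        # Each step invalidates one more ghost cell, so `ghost` steps are safe.
        left, right = step(left), step(right)
    return left[:mid] + right[ghost:], exchanges


initial = [float(i % 7) for i in range(40)]
single = run_single(initial, 30)
split, exchanges = run_split(initial, 30, ghost=10)
print("max difference:", max(abs(a - b) for a, b in zip(single, split)))
print("exchanges:", exchanges)
```

Wider ghost regions trade redundant computation for fewer, larger messages, which pays off when latency between machines is high.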

36
Grid Projects in eScience
37
Nimrod/G and GriddLeS: Grid Programming with Ease
  • David Abramson
  • Monash University
  • DSTC

38
Distributed computing comes to the rescue.
For each scenario:
  • Generate input files
  • Copy them to remote node
  • Run SMOG model
  • Post-process output files
  • Copy results back to root
39
It's just too hard!
  • Doing by hand
  • Nightmare!!
  • Programming with (say) MPI
  • Overkill
  • No fault tolerance
  • Codes no longer work as stand-alone code.
  • Scientists don't want to know about underlying
    technologies

40
Building on Legacy Software
  • Nimrod
  • Support parametric computation without
    programming
  • High performance distributed computing
  • Clusters (1994-1997)
  • The Grid (1997- ) (Added QoS through
    Computational Economy)
  • Nimrod/O: Optimisation on the Grid
  • Active Sheets: Spreadsheet interface
  • GriddLeS
  • General Grid Applications using Legacy Software
  • Whole applications as components
  • Using no new primitives in application

41
Parametric Execution
  • Study the behaviour of some of the output
    variables against a range of different input
    scenarios.
  • Allows real time analysis for many applications
  • More realistic simulations
  • More rigorous science
  • More robust engineering

42
In Nimrod, an application doesn't know it has
been Grid enabled
Input Files Substitution
Output Files
Computational Nodes
Root Machine
43
How does a user develop an application using
Nimrod?
Description of Parameters PLAN FILE
44
GriddLeS
  • Significant body of useful applications that are
    not Grid Enabled
  • Lessons from Nimrod
  • Users will avoid rewriting applications if
    possible
  • Applications need to function in the Grid and
    standalone
  • Users are not experts in parallel/distributed
    computing
  • General Grid computations have much more general
    interconnections than possible with Nimrod.
  • Legacy Applications are Components!

45
GriddLeS
  • Specification of the interconnections between
    components
  • Interfaces for discovering resources and mapping
    the computations to them
  • Locate data files in the grid and connect the
    applications to them
  • Schedule computations on the underlying platforms
    and making sure the network bandwidth is
    available and
  • Monitor the progress of the grid computation and
    reassign work to other parts of the Grid as
    necessary.

46
Today: Condor
  • Support for matching application requirements to
    resources
  • User and resource provider write ClassAD
    specifications
  • System matches ClassADs for applications with
    ClassADs for resources
  • Selects the best match based on a
    user-specified priority
  • Can extend to Grid via Globus (Condor-G)
  • What is missing?
  • User must handle application mapping tasks
  • No dynamic resource selection
  • No checkpoint/migration (resource re-selection)
  • Performance matching is straightforward
  • Priorities coded into ClassADs
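
The matchmaking idea above can be sketched in a few lines. This is not Condor's ClassAd language: ads are modelled here as plain Python dicts whose `requirements` and `rank` entries are callables, and all attribute names and values are invented for the example.

```python
def match(job, machines):
    """Return the machine ad acceptable to both sides that the job ranks highest."""
    candidates = [
        m for m in machines
        if job["requirements"](m) and m["requirements"](job)
    ]
    return max(candidates, key=job["rank"], default=None)


# Hypothetical resource ads: each machine may also constrain which jobs it accepts.
machines = [
    {"name": "m1", "os": "linux", "memory_mb": 512, "mips": 400,
     "requirements": lambda job: job["owner"] != "guest"},
    {"name": "m2", "os": "linux", "memory_mb": 2048, "mips": 300,
     "requirements": lambda job: True},
    {"name": "m3", "os": "solaris", "memory_mb": 4096, "mips": 900,
     "requirements": lambda job: True},
]

# Hypothetical job ads: requirements filter machines, rank orders the survivors.
big_job = {
    "owner": "alice",
    "requirements": lambda m: m["os"] == "linux" and m["memory_mb"] >= 1024,
    "rank": lambda m: m["mips"],
}
small_job = {
    "owner": "bob",
    "requirements": lambda m: m["os"] == "linux" and m["memory_mb"] >= 256,
    "rank": lambda m: m["mips"],
}

print(match(big_job, machines)["name"])
print(match(small_job, machines)["name"])
```

The match is two-sided and static: nothing in this scheme re-evaluates the choice once the job is running, which is the gap the slide points out.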