Grid Computing - PowerPoint PPT Presentation

About This Presentation
Title:

Grid Computing

Slides: 56
Provided by: webpag5
Learn more at: http://webpage.pace.edu

Transcript and Presenter's Notes

Title: Grid Computing


1
Grid Computing
  • DCS861A Emerging Computing II
  • Spring 2005
  • DPS Team 2
  • 13/08/2014

2
What is Grid Computing?
  • a type of parallel and distributed system that
    enables the sharing, selection, and aggregation
    of geographically distributed "autonomous"
    resources dynamically at runtime depending on
    their availability, capability, performance,
    cost, and users' quality-of-service requirements.

Source: Grid Computing Info Centre
(www.gridcomputing.com)
5
Where Are These Resources?
  • Mainframes are idle about 35% of the time
  • UNIX servers are actually "serving" something
    less than 15% of the time
  • And most PCs do nothing for 95% of a typical day
  • Imagine an airline with 85% of its fleet on the
    ground, an automaker with 35% of its assembly
    plants idle, a hotel chain with 95% of its rooms
    unoccupied!

6
Computing Grid As Utility
  • A common metaphor in the literature:
  • "a computing grid is analogous to electric
    power network (grid) where power generators are
    distributed, but the users are able to access
    electric power without bothering about the source
    of energy and its location."
    - Grid Computing Info Centre

7
Grid as Utility Origins
  • Early on in 1969, Len Kleinrock, one of the
    original Arpanet designers, wrote:
  • "We will probably see the spread of computer
    utilities, which, like present electric and
    telephone utilities, will service individual
    homes and offices across the country."

8
On-demand, Dispersed Resources
  • Decouples production and consumption, enabling
      • On-demand access
      • Economies of scale
      • Consumer flexibility
      • New devices

[Figure: axes labeled "Quality, economies of scale"
and "Time"]
Source: Ian Foster, U. of Chicago
9
Grid Computing Scales
  • Cluster Grids
  • Enterprise Grids
  • Global Grids
10
But Computing Isn't Electricity
  • Usually users only consume electricity, they
    don't also produce it - software applications
    both consume and produce data
  • Computing is not a homogeneous thing, but is
    highly heterogeneous: data, sensors, services,
    software, computing hardware, ...

11
But Computing Isn't Electricity
  • This complicates things, but it means that the
    result can be greater than the sum of the parts
  • It also raises some fundamental questions:
      • Building applications that exploit the
        infrastructure?
      • Operating such a complex environment?
      • Managing heterogeneous resources not centrally
        owned?
      • Ensuring QoS across these distributed services?

12
Another Way of Looking at Grids
  • From a less technical viewpoint: "Grid computing
    has emerged as an important new field,
    distinguished from conventional distributed
    computing by its focus on large-scale resource
    sharing, innovative applications, and, in some
    cases, high-performance orientation ... we define
    the 'Grid problem' as flexible, secure,
    coordinated resource sharing among dynamic
    collections of individuals, institutions, and
    resources - what we refer to as virtual
    organizations."

The Anatomy of the Grid: Enabling Scalable Virtual
Organizations. Ian Foster, Carl Kesselman, Steven
Tuecke. Intl. Journal Supercomputer Applications,
2001
13
Virtual Organizations (VOs)
  • In VOs a grid infrastructure is more a means to
    an end
  • Enables integration and sharing of distributed
    resources
  • Removes geographical constraints on teams
  • Creates consistent qualities of service via
    fault-tolerance, dynamic workload balancing, etc.

14
Grid History: I-WAY, A Seminal Event
  • Experiment led by researchers at the University
    of Illinois at Chicago and Argonne National
    Laboratory
  • For a week in Nov 95, it linked 11 research
    networks to create one high-speed network
    infrastructure
  • Connected 17 sites across the US and Canada
  • Demonstrated 60 applications, from distributed
    computing to virtual reality collaboration
  • Attempted to construct a unified software
    infrastructure providing scheduling, single
    sign-on, and other grid-enabled services

15
Early Grids: Govt.-funded Science
  • GUSTO (1998): 80 global research sites
      • 3,000-host grid software testbed
  • NASA Information Power Grid (since 1999)
      • Production grid linking NASA laboratories
  • INFN Grid, EU DataGrid, iVDGL, (2001)
      • Grids for data-intensive science
  • TeraGrid, DOE Science Grid (2002)
      • Production grids linking supercomputer centers
  • U.S. GRIDS Center
      • Software packaging, deployment, support

16
Why are Grids Hot Now?
  • Hardware performance improving exponentially
      • Computer speed doubles every 18 months
      • Network speed doubles every 9 months
      • Difference: an order of magnitude every 5 years
  • 1986 to 2000
      • Computers x 500
      • Networks x 340,000
  • 2001 to 2010
      • Computers x 60
      • Networks x 4,000

[Figure: Moore's Law vs. storage improvements vs.
optical improvements.] Graph from Scientific American
(Jan-2001) by Cleo Vilett; source: Vinod Khosla,
Kleiner Perkins Caufield & Byers.
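
The multipliers on this slide follow from the quoted doubling times. A minimal Python sketch of that arithmetic, assuming smooth exponential growth; the function name `growth_factor` is illustrative and does not come from the slides:

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Return the multiplier after `years` if capacity doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


# 2001 to 2010 is 9 years; the doubling times are the ones quoted on the slide.
computers = growth_factor(9, 18)  # 2**6, close to the slide's "x 60"
networks = growth_factor(9, 9)    # 2**12, close to the slide's "x 4,000"

# Networks pull ahead of computers by a ratio that itself doubles every 18 months.
gap_5_years = growth_factor(5, 9) / growth_factor(5, 18)

print(f"computers x{computers:.0f}, networks x{networks:.0f}, gap x{gap_5_years:.1f}")
```

The five-year gap works out to roughly a factor of ten, which matches the "order of magnitude every 5 years" claim.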
17
Why are Grids Hot Now?
  • Grids begin to address some real world IT issues
  • Low overall utilization of enterprise resources
  • High cost of provisioning for peak demand
  • Lack of information integration
  • Physical distribution of teams is increasing
  • Inability to apply available resources to
    advanced computation and data-intensive
    applications when and where they are needed
  • However, the marketing hype is outrageous: every
    possible SW and HW product has been "gridified"

18
Early Commercial Adopters
  • Aerospace and Automotive (for collaborative
    design and modelling)
  • Architecture (engineering and construction)
  • Electronics (design and testing)
  • Energy (for oil and gas exploration)
  • Finance/insurance/real estate (securities and
    brokerage especially for stock/portfolio
    analysis and risk management)

19
Early Commercial Adopters
  • Life sciences (particularly in pharmaceuticals)
  • Manufacturing (inter/intra-team collaborative
    design, process management)
  • Media/entertainment (to generate digital
    animation)
  • Utilities (to improve efficiency while dealing
    with peaks and valleys in utilization)

20
Grid Market Projections
  • Leading adopters (Oct 2003)
      • Financial services 31%
      • Life sciences 26%
      • Manufacturing 18%

[Chart: Grid Services Market Opportunities 2005]
Sources: IDC, 2000, and Bear Stearns, Internet 3.0
(5/01). Analysis by SAI

21
Example Adopter: Novartis
  • PC-based grid of 3,700 desktop systems
  • R&D pharmaceutical applications
  • Potentially mainstream business computing
  • > 5 teraflop/s computing power
  • Estimated savings of $200M over 3 years
  • "We have projects we calculate would take 6
    years on a single supercomputer. Today, the
    run time is 12 hours."
    - Peter Sany, Novartis CIO

22
Grid Application Attributes
  • Computational complexity
  • Genome research
  • Financial product creation
  • Geophysical studies
  • Digital animation creation
  • Massive data requirements
  • Digital mammography diagnostics
  • Particle physics research
  • Astronomical observation analysis

23
Computational Complexity: Protein Analysis
  • Example: Determining the structure of a complex
    molecule, such as the cholera toxin shown here,
    is the kind of computationally intense operation
    that grids are intended to tackle. (Adapted from
    G. von Laszewski et al., Cluster Computing,
    volume 3(3), page 187, 2000)

24
Massive Data Requirements
  • Storage density doubling every 12 months
  • Dramatic growth in online data (1 petabyte = 1000
    terabytes = 1,000,000 gigabytes)
      • 2000: 0.5 petabyte
      • 2005: 10 petabytes
      • 2010: 100 petabytes
      • 2015: 1000 petabytes?
  • These are sometimes called data grids
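
The petabyte figures above can be turned into an implied doubling time. This is a small Python sketch, assuming steady exponential growth between the listed years; `doubling_time_months` is an illustrative name, not something from the slides:

```python
import math


def doubling_time_months(start: float, end: float, years: float) -> float:
    """Months per doubling implied by growing from `start` to `end` over `years`."""
    return 12.0 * years * math.log(2) / math.log(end / start)


# Petabyte figures as quoted on the slide.
early = doubling_time_months(0.5, 10, 5)   # 2000 to 2005
later = doubling_time_months(10, 100, 5)   # 2005 to 2010

print(f"2000-2005: doubling every {early:.1f} months")
print(f"2005-2010: doubling every {later:.1f} months")
```

Under that assumption the online-data figures double roughly every 14 months in the first period and every 18 months in the second, a little slower than the 12-month doubling quoted for storage density.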

25
Massive Data Requirements: Digital Mammography
  • Digital Radiology (hospital digital data)
  • Mammogram X-rays
  • MRI / CAT scans
  • Endoscopies
  • Very large data sources
  • 7 terabytes per hospital per year
  • Dominated by digital images

26
Massive Data Requirements: Digital Mammography
  • Why target mammography?
  • Increasing need for film recall and computer
    analysis
  • Large volumes (4,000 GB/year, 57% of total)
  • Storage and records standards exist
  • Great clinical value

27
Grid Management Challenges
  • Scale of data and compute resources is huge
  • QoS and performance criteria are severe
  • Platform must be scalable, able to evolve,
    fault-tolerant, robust, persistent and reliable
  • It should work seamlessly and transparently:
    the user might not know or care where their
    calculation is done, using how many machines, or
    where data is actually held

28
Grid Management Challenges
  • Resource configurations are transient, dynamic
    and volatile as services (databases, sensors,
    compute servers) are switched in and out
  • They are ad-hoc as service consortia have no
    central location or control and no existing trust
    relationships
  • They may be large, with hundreds of services
    orchestrated at any time
  • They may be long-lived, for example a protein
    folding simulation could take weeks

29
Technical Challenges
How does a grid infrastructure, in a dynamic,
multi-institutional, physically distributed
setting,
  • Locate suitable computers?
  • Authenticate and authorize user requests?
  • Allocate resources on those computers?
  • Select appropriate communication methods?
  • Configure the computations?
  • Initiate these computations on those computers?
  • Access data files and return output?
  • Respond appropriately to resource changes?
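
A toy Python sketch can make the first three questions concrete: authorize, locate, allocate. Everything in it is hypothetical (the `Resource` and `Job` classes, the cost fields, the cheapest-first policy); it is not how Globus or any other toolkit works, only an illustration of the kind of decision a grid broker has to make:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    name: str
    free_cpus: int
    cost_per_cpu_hour: float
    online: bool = True


@dataclass
class Job:
    user: str
    cpus: int
    max_cost_per_cpu_hour: float


def select_resource(job: Job, resources: list[Resource],
                    authorized_users: set[str]) -> Optional[Resource]:
    """Pick the cheapest online resource that can run the job, or None."""
    if job.user not in authorized_users:      # authorize the request
        return None
    candidates = [r for r in resources        # locate suitable computers
                  if r.online
                  and r.free_cpus >= job.cpus
                  and r.cost_per_cpu_hour <= job.max_cost_per_cpu_hour]
    if not candidates:
        return None
    best = min(candidates, key=lambda r: r.cost_per_cpu_hour)
    best.free_cpus -= job.cpus                # allocate resources on it
    return best


pool = [Resource("cluster-a", 64, 0.12),
        Resource("cluster-b", 16, 0.05),
        Resource("desktop-pool", 400, 0.01, online=False)]
job = Job(user="alice", cpus=32, max_cost_per_cpu_hour=0.20)
chosen = select_resource(job, pool, authorized_users={"alice"})
print(chosen.name if chosen else "no suitable resource")
```

A real grid must also answer the remaining questions (communication, data access, reacting when a resource disappears mid-run), which this sketch leaves out.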

30
Grid Software Sources
  • Academic and Scientific Researchers
      • U. of Chicago and USC (Globus Toolkit)
      • UC Berkeley (BOINC)
  • Public consortium-based organizations
      • Global Grid Forum (OGSA)
  • Commercial Vendors
      • IBM, Entropia, United Devices, etc.

31
Globus Toolkit (www.globus.org)
  • Early open-source grid infrastructure toolkit
  • Set of protocols, services, and software libraries
    that supports grids and grid applications
  • Includes software for
      • security
      • information infrastructure
      • resource management
      • data management
      • communication
      • fault detection
      • portability

32
Evolving Open Grid Standards
[Figure: timeline from 1990 to 2010 showing
increased functionality and standardization:
custom solutions; Internet standards; Globus
Toolkit (de facto standard, single implementation);
Open Grid Services Arch with Web services, etc.
(real standards, multiple implementations);
managed shared virtual systems (research)]
33
OGSA (www.gridforum.org)
  • Grid technologies, including the Globus Toolkit,
    are evolving toward the Open Grid Services
    Architecture (OGSA)
  • OGSA provides an extensible set of services that
    virtual organizations can aggregate in various
    ways
  • Built on concepts and technologies from both the
    Grid and Web services communities

34
OGSA
  • OGSA defines
      • Grid service semantics (like Web services)
      • Standard mechanisms for creating, naming, and
        discovering transient grid service instances
      • Location transparency and multiple protocol
        bindings for service instances
      • Support for integration with underlying native
        platform facilities

35
OGSA
  • OGSA also supports (via WSDL)
      • creating/composing complex distributed systems
      • lifetime management
      • change management
      • notification
      • reliable invocation
      • authentication and authorization

36
Grid Standards Summary
  • Grid Services and Web Services are merging
  • Web Services standards landscape is in flux
  • OGSA will need to evolve with it
  • Fuzzy security policy standards are a concern
  • W3C, OASIS, GGF are key standards orgs
  • Open source software important for adoption

37
Some Commercial Grid Software Vendors
  • IBM (www.ibm.com/grid)
  • Avaki (www.avaki.com)
  • GridIron Software (www.gridironsoftware.com)
  • United Devices (www.ud.com)
  • Platform Computing (www.platform.com)
  • DataSynapse (www.datasynapse.com)
  • Entropia (www.entropia.com)
  • Oracle 10g (www.oracle.com/technologies/grid)

38
Wait a second! What about
  • SETI@home (extra-terrestrial signal search)
  • GIMPS (Great Internet Mersenne Prime Search)
  • Folding@home (protein manipulation)
  • Distributed.net (brute force decryption)
  • and all those other Internet grid projects
    I've been reading about?

39
Public Resource Computing
  • These are all examples of what Dave Anderson of
    Berkeley calls "public resource computing"
  • Most of the world's computing power is no longer
    in supercomputer centers or institutional machine
    rooms
  • Instead, it is now distributed in the hundreds of
    millions of personal computers, game consoles,
    and TV set-top boxes
  • If all this computing power could be made
    available to researchers somehow ...

40
Hallmarks of Public Resource Computing
  • Public resource computing shares some traits with
    grid computing, but is qualitatively different
  • Open vs. closed society of resources
  • Asymmetric usage: more suppliers of resources
    than consumers, e.g., millions of PC screensavers
    vs. small team of researchers
  • Must be able to attract altruistic participants
  • Often some reward mechanisms will exist for
    resource suppliers

41
Public Resource Application Profile
  • High computing to data ratio is typical
  • Computational independence (parallelism) is crucial
  • Must be tolerant to errors and outages
  • Must be able to handle malicious users
  • Sporadic connectedness is the norm

42
Public Resource vs. Grid Computing
Source: David Anderson, BOINC project (UC
Berkeley)
43
Example: SETI@home
  • SETI: Search for Extraterrestrial Intelligence
  • Goal: detect intelligent life outside the Earth
  • Uses radio telescopes to listen for
    narrow-bandwidth radio signals (not known to
    occur naturally) from space
  • Initial version used hand-crafted server
    architecture and workstation clients

44
SETI Computational Model
  • Signal data is divided into fixed-size work units
    that are distributed, via the Internet, to a
    client program running on numerous computers
  • Client program computes a result (a set of
    candidate signals), returns it to the server, and
    gets another work unit
  • Each work unit is processed multiple times to
    detect and discard results from faulty processors
    and from malicious users
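
The redundancy idea in the last bullet can be sketched in a few lines of Python. The slides do not say how results are compared, so the quorum rule and the exact-equality comparison below are assumptions, and `validate_work_unit` is a hypothetical name:

```python
from collections import Counter
from collections.abc import Hashable
from typing import Optional


def validate_work_unit(results: list[Hashable], quorum: int) -> Optional[Hashable]:
    """Return the result reported by at least `quorum` clients, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None


# Three clients process the same work unit; one returns a bad answer
# (faulty hardware or a malicious user).
reports = [("signal", 1420.4), ("signal", 1420.4), ("signal", 9999.9)]
accepted = validate_work_unit(reports, quorum=2)
print(accepted)
```

Real numerical results from different platforms rarely match bit for bit, so a production validator would compare within a tolerance rather than by equality.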

45
SETI@home at Work
46
SETI@home Technical Specs
  • SETI@home client program is written in C++
  • Platform-independent framework with
    platform-specific implementations
  • graphics library
  • SETI-specific data analysis code
  • SETI-specific graphics code
  • Client ported to 175 different platforms using
    the GNU toolset
  • Client can run as a background process, as a GUI
    application, or as a screensaver

47
SETI@home Results to Date
                                 Totals (as of 03/31/2005)   Last 24 Hours
Users                            5,388,068                   784
Results received                 1,811,656,328               1,339,532
Total CPU time                   2,251,657.404 years         925.204 years
Floating Point Operations        6.649645e21                 5.224175e18 (60.46 TeraFLOPs/sec)
Average CPU time per work unit   10 hr 53 min 15.2 sec       6 hr 03 min 01.6 sec
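
The "Last 24 Hours" figures can be cross-checked with a little arithmetic. The Python sketch below recomputes the sustained rate from the operation count and, assuming a 365-day year (the table does not say which year length it uses), estimates how many CPUs were busy on average:

```python
SECONDS_PER_DAY = 24 * 60 * 60

flop_last_24h = 5.224175e18    # floating point operations in the last 24 hours
cpu_years_last_24h = 925.204   # CPU time accumulated over the same 24 hours

teraflops = flop_last_24h / SECONDS_PER_DAY / 1e12

# Assumption: 365 days per year; one CPU-day per day equals one busy CPU.
average_busy_cpus = cpu_years_last_24h * 365

print(f"{teraflops:.2f} TeraFLOPs/sec sustained")
print(f"about {average_busy_cpus:,.0f} CPUs busy on average")
```

The recomputed rate agrees with the 60.46 TeraFLOPs/sec shown in the table.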
48
Lessons from SETI@home
  • Public resource computing concept does work, but:
      • How do you make it easy for researchers to access
        the public's resources and good will?
      • How do you make it easy for the public to
        contribute their resources to multiple projects?
  • One answer: the BOINC public resource computing
    platform from UC Berkeley

49
BOINC Goals
  • For computing projects
      • easy/cheap to create and operate projects
      • support a wide range of applications
      • no central authority
  • For participants
      • easy to participate in multiple projects
      • resource allocation among projects
      • invisible use of disk, CPU, network

Source: David Anderson, BOINC project (UC
Berkeley)
50
BOINC Architecture
51
Some BOINC-based Projects
  • SETI@home (updated for BOINC support)
  • Predictor@home (protein-related disease)
  • Einstein@home (gravitational waves, LIGO)
  • CERN (particle physics)
  • UCB/Intel network performance study
  • climateprediction.net (future climate impact)

52
Example: climateprediction.net
  • The Earth is likely to warm over the coming
    century. The question is: by how much?
  • climateprediction.net is the world's largest
    climate modelling experiment to try and answer
    this question
  • 62,000 participants in 130 countries (8/04)

54
climateprediction.net Summary
  1. Each user downloads and runs a unique simulation
    model of the Earth's climate
  2. Models undergo an initial calibration
  3. Each model is tested by simulating 20th century
    climate
  4. Models which cannot reproduce present and past
    climate are discarded
  5. All remaining models are run to predict the 21st
    century climate
  6. These results create the probabilistic forecast
    for the 21st century climate
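
The six steps above describe a filter-then-forecast ensemble. The Python sketch below mimics that workflow with a deliberately trivial stand-in for a climate model. Every number in it (the parameter range, the 0.6 target, the 0.2 tolerance, the linear "simulation") is a made-up placeholder, not a value from climateprediction.net, and the calibration step is omitted:

```python
import random
import statistics


def run_experiment(n_models: int = 1000, seed: int = 0) -> list[float]:
    """Toy perturbed-parameter ensemble; returns forecasts from surviving models."""
    rng = random.Random(seed)
    observed_past_warming = 0.6  # hypothetical hindcast target
    tolerance = 0.2              # hypothetical acceptance window
    forecasts = []
    for _ in range(n_models):
        sensitivity = rng.uniform(0.5, 6.0)  # step 1: each user gets a unique model
        hindcast = 0.2 * sensitivity         # step 3: toy 20th century simulation
        if abs(hindcast - observed_past_warming) > tolerance:
            continue                         # step 4: discard models that miss the past
        forecasts.append(1.0 * sensitivity)  # step 5: toy 21st century projection
    return forecasts


survivors = run_experiment()
cuts = statistics.quantiles(survivors, n=20)  # 19 cut points: 5%, 10%, ..., 95%
low, median, high = cuts[0], cuts[9], cuts[18]
print(f"{len(survivors)} models kept; 5% to 95% range {low:.1f} to {high:.1f}, "
      f"median {median:.1f}")
```

The spread of the surviving forecasts, rather than any single run, is what plays the role of the probabilistic forecast in step 6.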

55
For More Information
  • Globus Alliance: www.globus.org
  • Globus Consortium: www.globusconsortium.com
  • Global Grid Forum: www.ggf.org
  • Open Science Grid: www.opensciencegrid.org
  • Grid Today newsletter: www.gridtoday.com
  • Grid Blog: www.gridblog.com
  • BOINC: boinc.berkeley.edu