Grid Computing

Provided by: nipissingu

1
Grid Computing
  • Mark P. Wachowiak, Ph.D.
  • February 2, 2007

2
Objectives
  • Grid computing
  • Software and middleware for the grid
  • Present and future grid applications

3
Grid Computing
  • Definition
  • Grid computing is distributed computing
    performed transparently across multiple
    administrative domains (P.V. Coveney).
  • Distributed high-performance computing.
  • Large geographically distributed networks of
    computers.
  • Provides a means for using distributed resources
    to solve large problems.
  • What the Web did for communication, grids
    endeavor to do for computation.

4
Grid Computing (2)
  • Very general computing applications
  • Database searches and queries.
  • Scientific applications.
  • Weather prediction.
  • Cryptography.
  • Business applications.
  • Transparency
  • Distributing computational resources among
    multiple and widely separated sources and users
    is a difficult algorithmic problem.

5
Characteristics of Grids
  • Grids coordinate resources that are not subject
    to centralized control.
  • Grids use standard, open, general-purpose
    protocols and interfaces.
  • Grids deliver high qualities of service.
  • http://devresource.hp.com/drc/technical_papers/grid_soa/04.png

6
Grid vs. Parallel Computing
Beowulf cluster
SHARCNet University of Western Ontario
compneuro.uwaterloo.ca/beowulf.html
7
Grid vs. Parallel Computing (2)
  • Grid computing is distinguished from parallel
    computing on one or more multiprocessors:
  • Parallel computing: locally clustered machines
    or large supercomputers.
  • Grid computing: computation across different
    administrative domains.

www.chemistry.msu.edu/Facilities/Supercomputer/
8
Two Tenets of Grid Computing
  • Virtualization
  • Individual resources (such as computers, disks,
    information sources, and applications) are pooled
    together and made available by abstractions.
  • Overcomes hard-coded connections between
    providers and consumers of resources.
  • Provisioning
  • When a request for a resource is made, a specific
    resource is identified to fulfill the request.
  • The system determines how to meet the need, and
    optimizes system performance.
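
The two tenets above can be illustrated with a minimal sketch. It is not taken from any grid toolkit: the class names, host names, and the best-fit selection policy are all assumptions made for illustration. The pool hides individual machines (virtualization), and `provision` picks a specific machine for each request (provisioning).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    name: str        # hypothetical host name
    free_cpus: int   # CPUs currently unallocated
    site: str        # administrative domain that owns the resource


class ResourcePool:
    """Virtualization: callers see one pool, not individual machines."""

    def __init__(self, resources):
        self.resources = list(resources)

    def provision(self, cpus_needed: int) -> Optional[Resource]:
        """Provisioning: identify a specific resource for a request.

        Best-fit policy (an assumption): choose the resource with the
        fewest free CPUs that can still satisfy the request.
        """
        candidates = [r for r in self.resources if r.free_cpus >= cpus_needed]
        if not candidates:
            return None
        chosen = min(candidates, key=lambda r: r.free_cpus)
        chosen.free_cpus -= cpus_needed
        return chosen


pool = ResourcePool([
    Resource("node-a", 8, "site-1"),
    Resource("node-b", 2, "site-2"),
    Resource("node-c", 4, "site-3"),
])
first = pool.provision(2)    # best fit is node-b
second = pool.provision(3)   # node-b is now full, so node-c is chosen
print(first.name, second.name)
```

The caller never names a machine, so providers can be added or removed without changing the code that requests resources.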

9
Characteristics of Grid Applications
  • Data are acquired by scientific instruments.
  • Data are stored in archives on separate, perhaps
    geographically separated, sites.
  • Data are managed by teams belonging to different
    organizations.
  • Large quantities of data (tera- or petabytes) are
    collected.
  • Software is used to analyze and summarize the raw
    data.

10
The Importance of Standardization
  • Without standardization, grid computing
    practitioners would need to acquire accounts at
    many different computer centers, managed by
    different organizations.
  • Different security and authentication protocols
    and accounting practices would have to be
    applied.
  • Very heterogeneous software environment.

11
Objectives
  • Grid computing
  • Software and middleware for the grid
  • Grid applications

12
Importance of Middleware
  • Middleware eases grid users' experience and
    provides them with levels of abstraction.
  • Middleware extends the Web's information and
    database management capabilities.
  • It allows remote deployment of computational
    resources.

13
Globus Toolkit
  • Most widely used middleware for grids.
  • Open source toolkit for building computing grids.
  • Provides a standard platform upon which other
    services build.
  • Provides directory services, security, and
    resource management.

www.globus.org
14
Objectives
  • Grid computing
  • Software and middleware for the grid
  • Grid applications

15
CPU Scavenging
  • Unused PC resources worldwide are harnessed.
    Also known as shared computing.
  • CPU-scavenging systems gain and lose machines at
    unpredictable times as users interact with their
    computers, or as network connections fail.
  • CPU-scavengers can migrate jobs to allow smooth
    operation.
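
A toy scheduler shows how job migration might look when machines come and go. This is only a sketch under simple assumptions: machine and job names are hypothetical, a migrated job restarts from scratch, and no real scavenging system is being reproduced.

```python
from collections import deque


class Scavenger:
    """Toy scheduler that reassigns jobs when a volunteer machine leaves."""

    def __init__(self):
        self.idle = deque()      # machines waiting for work
        self.running = {}        # machine -> job
        self.pending = deque()   # jobs waiting for a machine

    def _dispatch(self):
        # Pair waiting jobs with idle machines until one queue is empty.
        while self.idle and self.pending:
            self.running[self.idle.popleft()] = self.pending.popleft()

    def machine_joined(self, machine):
        self.idle.append(machine)
        self._dispatch()

    def machine_left(self, machine):
        # The owner reclaimed the PC, or the network connection failed.
        if machine in self.running:
            # Migrate: put the interrupted job at the front of the queue.
            self.pending.appendleft(self.running.pop(machine))
        elif machine in self.idle:
            self.idle.remove(machine)
        self._dispatch()

    def submit(self, job):
        self.pending.append(job)
        self._dispatch()


s = Scavenger()
s.machine_joined("pc-1")
s.machine_joined("pc-2")
s.submit("job-A")        # starts on pc-1
s.machine_left("pc-1")   # job-A migrates to pc-2
print(s.running)
```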

16
SETI@home
  • Search for Extraterrestrial Intelligence
  • Goal: to analyze vast amounts of data from the
    Arecibo radio telescope.
  • Initiated by the Space Sciences Laboratory at the
    University of California, Berkeley.
www.ras.ucalgary.ca/svlbiimages/arecibo.jpg,
www.artscouncil.org.uk/spaceart
17
SETI@home (2)
  • Uses a free screen saver, available to the
    public.
  • When activated, the screensaver program downloads
    time sequences of radio telescope data and
    searches them for radio sources.
  • SETI@home has more than 5 million participants.
  • Inspiration for other scientific applications in
    need of large computing resources.

18
SETI@home (3)
  • Main purpose: a program downloads and analyzes
    radio telescope data.
  • Data is recorded at the Arecibo Observatory in
    Puerto Rico.
  • The data is sent to Berkeley, where it is
    processed into units of 107 seconds of data.
  • These work units are sent from the SETI@home
    server over the Internet to participating
    computers around the world for analysis.
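
The splitting step can be sketched as follows. Only the 107-second unit length comes from the slide; the function name is hypothetical, and treating the units as simple non-overlapping time ranges is an assumption made to keep the example short.

```python
def split_into_work_units(total_seconds, unit_seconds=107):
    """Return (start, end) offsets in seconds for each work unit."""
    units = []
    start = 0
    while start < total_seconds:
        # The last unit may be shorter than unit_seconds.
        end = min(start + unit_seconds, total_seconds)
        units.append((start, end))
        start = end
    return units


units = split_into_work_units(300)
print(units)   # [(0, 107), (107, 214), (214, 300)]
```

Each tuple could then be handed to a different participating computer, since the units can be analyzed independently.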

19
SETI@home (4)
  • The analysis software can search for signals with
    about one-tenth the strength of those sought in
    previous surveys, because it makes use of a very
    computationally intensive algorithm.
  • Data is merged into a database using SETI@home
    computers in Berkeley. Various pattern-detection
    algorithms are applied to search for the most
    interesting signals.

20
SETI@home User Client
21
BOINC
  • Berkeley Open Infrastructure for Network
    Computing.
  • Funded by the National Science Foundation.
  • Used in the SETI project.
  • Client-server architecture:
  • Client: runs on the computer supplying resources
    for one or more BOINC projects. Performs the
    computations.
  • Server: system software, such as database
    services and the project's web site.

22
Remote Procedure Calls
  • Mechanism by which the server communicates with
    the client in BOINC.
  • Similar to a regular function call or method
    invocation, but one computer executes the
    function on another computer.

23
Remote Procedure Calls - Examples
  • Return the screensaver mode:
  • get_screensaver_mode(int status)
  • Get a list of results for jobs in progress:
  • get_results(RESULTS)
  • Get a list of file transfers in progress:
  • get_file_transfers(FILE_TRANSFERS)
  • Get the client's current state:
  • get_state(CC_STATE)
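
The general idea of a remote procedure call can be sketched in a few lines. This is not BOINC's implementation or wire format: the method names are borrowed from the slide, while the classes, the JSON encoding, and the return values are invented for illustration. The "network" is a plain function call so that the example runs in one process.

```python
import json


class Callee:
    """Hypothetical process that actually executes the procedures."""

    def __init__(self):
        self._procedures = {
            "get_screensaver_mode": lambda: 2,                # made-up value
            "get_results": lambda: ["wu_001", "wu_002"],      # made-up values
        }

    def handle(self, request_bytes):
        # Unmarshal the request, run the named procedure, marshal the reply.
        request = json.loads(request_bytes.decode("utf-8"))
        procedure = self._procedures.get(request["method"])
        if procedure is None:
            reply = {"error": "unknown method"}
        else:
            reply = {"result": procedure(*request["params"])}
        return json.dumps(reply).encode("utf-8")


class CallerStub:
    """Caller-side proxy: makes a remote call look like a local one."""

    def __init__(self, transport):
        self._transport = transport   # callable: bytes -> bytes

    def call(self, method, *params):
        request = json.dumps({"method": method, "params": list(params)})
        reply = json.loads(self._transport(request.encode("utf-8")))
        if "error" in reply:
            raise RuntimeError(reply["error"])
        return reply["result"]


remote = Callee()
stub = CallerStub(remote.handle)   # a real system would use a socket here
print(stub.call("get_results"))
```

Only bytes cross the boundary between the stub and the callee, which is what lets the two sides run on different computers once the transport is replaced by a network connection.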

24
Human Proteome Folding Project (HPFP)
  • Goal: to predict the structure of human
    proteins.
  • Devised at the Institute for Systems Biology,
    University of Washington.
  • Produces the likely structures for each of the
    proteins using a set of predefined rules.
  • Improved knowledge of human proteins is important
    in developing new therapies.
  • Officially completed on July 18, 2006.
  • Second stage now underway.

25
Human Proteome Folding Project
WCG desktop console: users monitor progress on the
protein-folding project.
Typical desktop screensaver setup for HPFP.
http://msnbcmedia.msn.com/j/msnbc/Components/Photos/041116/folding2.hmedium.jpg
http://ndg.gunzclan.org/Charlotte/graphics/2/images/IMG0153_JPG.jpg
26
Business Applications
  • Business application grid (BAG).
  • Major focus is using existing grid computing
    technologies to unite all of an organization's
    desktops, workstations, servers, printers,
    peripherals, etc., to perform useful work during
    idle time.
  • Usually focused on well-defined problems:
  • Calculating performance averages for a mutual
    fund.
  • Reducing processing time in wealth management
    systems.
  • Database applications.

27
Business Applications (2)
  • A large financial services company uses
    specialized grid software for new corporate
    banking applications.
  • Oracle Corporation offers a grid database system.

28
Business Grid Middleware
  • Provides an IT-level infrastructure to support
    business applications.
  • Middleware provides services for composing,
    submitting, and managing business applications.
  • Business functions (e.g. credit card
    authorization and shipping-and-handling services)
    are not provided.
  • Globus Toolkit 4 makes it easier to build an
    application that taps into existing distributed
    computing resources (e.g. servers, storage,
    databases).

29
Conclusions
  • Grid computing is an enabling technology that
    is rapidly gaining popularity in:
  • Science.
  • Medicine.
  • Engineering.
  • Business and financial applications.
  • Many software vendors offer grid computing
    toolkits and middleware.
  • In 2004, 20% of companies were seeking grid
    computing solutions (Evans Data Corp.).

30
Benefits of Grid Computing
  • Collaboration.
  • Increased productivity.
  • Efficient use of resources and storage.
  • Cost-effectiveness.
  • Heterogeneous environments.
  • Failure tolerance.
  • Transparency.

31
Challenges
  • Lack of control over resources, administration.
  • Security.
  • Middleware.
  • Network failures.
  • Cultural issues.

32
Thank you.
33
Open grid services architecture
  • OGSA: standard for grid-based applications.
  • Framework for meeting grid requirements.

Layers of the architecture, from top to bottom:
  • Application-specific grid services, with
    application-specific interfaces (e.g. astronomy,
    biomedical informatics, high-energy physics).
  • OGSA services (directory, management, security),
    with standard grid service interfaces.
  • OGSI (Open Grid Services Infrastructure) services:
    naming, service data (metadata), service creation
    and deletion, fault model, service groups.
    Interfaces include, e.g., GridService and Factory.
  • Web services.
34
Globus toolkit
  • Other non-GT3 services can run on top of the GT3
    architecture.
  • Replica management: keeps track of subsets of
    large data sets that are being worked on.
  • Job management: checking the status of jobs, and
    pausing or stopping them if necessary.
  • Index services: helping to locate grid resources
    to meet specific needs.
  • Reliable file transfer service (RFT): performs
    large file transfers from a client to a grid
    service.
  • Access to grid services is restricted so that only
    authorized clients can use them. This provides
    another layer of security on top of firewalls.
  • Low-level functions.
http://gdp.globus.org/gt3-tutorial/multiplehtml/ch01s04.html
35
Other grid tools
  • Resource management
  • Grid Resource Allocation and Management Protocol
    (GRAM)
  • Information Services
  • Monitoring and Discovery Service (MDS)
  • Security Services
  • Grid Security Infrastructure (GSI)
  • Data Movement and Management
  • Global Access to Secondary Storage (GASS) and
    GridFTP

36
World-Wide Telescope (2002)
  • Goal: deployment of data resources shared by
    astronomers.
  • Data:
  • Archives of observations over a particular period
    of time, part of the EM spectrum, and area of the
    sky.
  • Observations collected at different sites around
    the world.
  • Data on same celestial objects are combined over
    different periods of time and different parts of
    the EM spectrum.

37
World-Wide Telescope (2)
  • Data archives (~ terabyte) managed locally by the
    teams that collect the data.
  • As data is acquired, it is analyzed and stored as
    transformed data so that it can be used by remote
    astronomy sites.
  • Librarian role of scientists.
  • Metadata is required to describe:
  • Time the data was collected.
  • Part of the sky observed.
  • Instruments used.
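
A small sketch shows how such metadata could drive a search over archives. The archive names, field names, and values are invented for illustration; the slides do not specify a schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveEntry:
    archive: str       # hypothetical archive name
    year: int          # when the data was collected
    sky_region: str    # part of the sky observed
    instrument: str    # instrument used


CATALOGUE = [
    ArchiveEntry("archive-1", 1999, "north", "radio"),
    ArchiveEntry("archive-2", 2001, "north", "optical"),
    ArchiveEntry("archive-3", 2001, "south", "radio"),
]


def find_archives(catalogue, **criteria):
    """Return names of archives whose metadata matches every criterion."""
    return [
        entry.archive
        for entry in catalogue
        if all(getattr(entry, field) == value
               for field, value in criteria.items())
    ]


print(find_archives(CATALOGUE, sky_region="north"))
print(find_archives(CATALOGUE, year=2001, instrument="radio"))
```

A remote astronomer can locate the relevant archives from the metadata alone, without first transferring any of the underlying observations.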

38
WCG ongoing projects
  • FightAIDS@Home
  • Launched by WCG in 2005.
  • Each computer processes one potential drug
    molecule and tests how well it would dock with
    HIV protease, inhibiting viral reproduction.
  • Human Proteome Folding Phase 2
  • Released in 2006.
  • Extension of HPF1, focusing on human-secreted
    proteins.
  • Better protein models, but more computationally
    intensive.

39
World Community Grid (WCG)
  • Goal: to create the world's largest public
    computing grid for humanitarian concerns.
  • Administered and funded by IBM.
  • Platforms: Windows, Linux, and Mac OS X.
  • Uses the idle time of Internet-connected desktop
    computers.
  • The agent works as a screen saver (like
    SETI@home), only using a computer's resources
    when it would otherwise be idle, and returning
    resources to the users when requested.
  • Projects are approved by an advisory board:
    representatives of major research institutions,
    universities, UN, WHO.

40
WCG Smallpox research
  • Completed project.
  • WCG largely began due to the success of this
    project in shaving years off research time.
  • Analysis of therapeutic candidates to fight the
    smallpox virus.
  • About 35 million potential drug molecules were
    screened against several smallpox proteins,
    resulting in 44 new potential treatments.

41
WCG Ongoing projects (2)
  • Help Defeat Cancer (2006)
  • Processes large numbers of tissue samples using
    tissue microarrays.
  • Genome Comparison (2006)
  • Compares gene sequences of different organisms to
    find similarities.
  • Goal: determining the purpose of specific gene
    sequences in particular functions by comparing
    them with similar sequences with known functions
    in another organism.
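
The comparison idea can be illustrated with a deliberately simple similarity measure. The slides do not say which algorithm the project uses; the k-mer Jaccard score below is a stand-in chosen for brevity, and the gene names and sequences are made up.

```python
def kmers(sequence, k=3):
    """Set of all length-k substrings of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of the k-mer sets of two sequences."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka and not kb:
        return 1.0
    return len(ka & kb) / len(ka | kb)


def best_match(query, known, k=3):
    """Name of the known sequence most similar to the query."""
    return max(known, key=lambda name: similarity(query, known[name], k))


# Hypothetical sequences with "known functions" in another organism.
known = {"geneX": "ATGCGTAC", "geneY": "TTTTAAAA"}
query = "ATGCGTTC"

print(similarity(query, known["geneX"]))   # 0.5
print(best_match(query, known))            # geneX
```

Each pair of sequences can be scored independently, which is why this kind of workload divides naturally among many volunteer machines.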

42
Other grid projects
Description of the project (Reference)
1. Aircraft engine maintenance using fault histories and
   sensors for predictive diagnostics
   (www.cs.york.ac.uk/dame)
2. Telepresence for predicting the effects of
   earthquakes on buildings, using simulations and
   test sites (www.neesgrid.org)
3. Bio-medical informatics network providing
   researchers with access to experiments and
   visualizations of results (nbcr.sdsc.edu)
4. Analysis of data from the CMS high energy particle
   detector at CERN by physicists world-wide over 15
   years (www.uscms.org)
5. Testing the effects of candidate drug molecules for
   their effect on the activity of a protein, by
   performing parallel computations using idle desktop
   computers (Taufer et al. 2003; Chien 2004)
6. Use of the Sun Grid Engine to enhance aerial
   photographs by using spare capacity on a cluster
   of web servers (www.globexplorer.com)
7. The butterfly Grid supports multiplayer games for
   very large numbers of players on the internet
   over the Globus toolkit (www.butterfly.net)
8. The Access Grid supports the needs of small group
   collaboration, for example by providing shared
   workspaces (www.accessgrid.org)
43
Requirements of grid systems
  • Remote access to resources, specifically, to
    archived data.
  • Data processing at the site where the data is
    managed.
  • Remote requests (queries) result in a
    visualization or results from a small quantity of
    data.
  • The resource manager of a data archive creates
    instances of services when they are needed.
  • Similar to the distributed object model, where
    servant objects are created when needed.
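
On-demand creation of service instances can be sketched as below. The class and method names are hypothetical and no grid toolkit API is being reproduced; the point is only that an instance exists once a request has actually asked for it.

```python
class QueryService:
    """Hypothetical service instance bound to one data archive."""

    def __init__(self, archive_name):
        self.archive_name = archive_name

    def run(self, query):
        return f"{self.archive_name}: results for {query!r}"


class ResourceManager:
    """Creates service instances lazily, one per client (an assumption)."""

    def __init__(self, archive_name):
        self.archive_name = archive_name
        self.instances = {}   # client id -> QueryService

    def get_service(self, client_id):
        # Create the instance on first request, then reuse it.
        if client_id not in self.instances:
            self.instances[client_id] = QueryService(self.archive_name)
        return self.instances[client_id]


manager = ResourceManager("archive-1")
service = manager.get_service("astronomer-7")
print(service.run("region=north"))
print(len(manager.instances))   # 1
```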

44
Requirements of grid systems (2)
  • Metadata to describe characteristics of archived
    data.
  • Directory services based on the metadata.
  • Software for:
  • Query management.
  • Data transfer.
  • Resource reservation.