Grid Computing: Concepts, Applications, and Technologies

1
Grid Computing: Concepts, Applications, and Technologies
  • Dheeraj Bhardwaj
  • Department of Computer Science and Engineering
  • Indian Institute of Technology, Delhi

2
Outline
  • The technology landscape
  • Grid computing
  • The Globus Toolkit
  • Applications and technologies
  • Data-intensive distributed computing,
    collaborative remote access to facilities
  • Grid infrastructure
  • Open Grid Services Architecture
  • Global Grid Forum
  • Summary and conclusions

4
Living in an Exponential World: (1) Computing and Sensors
  • Moore's Law: transistor count doubles every 18
    months

[Figure: magnetohydrodynamics, star formation]
5
Living in an Exponential World: (2) Storage
  • Storage density doubles every 12 months
  • Dramatic growth in online data (1 petabyte = 1000
    terabytes = 1,000,000 gigabytes)
  • 2000: 0.5 petabyte
  • 2005: 10 petabytes
  • 2010: 100 petabytes
  • 2015: 1000 petabytes?
  • Transforming entire disciplines in the physical and,
    increasingly, biological sciences; humanities
    next?
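
The projections above can be checked against the stated 12-month doubling of storage density. The following is a minimal sketch that uses only the data points on this slide; the function name is illustrative, and steady exponential growth between data points is an assumption.

```python
import math

# Online-data estimates from the slide, in petabytes.
# The 2015 value is marked as a guess ("?") on the slide itself.
online_data_pb = {2000: 0.5, 2005: 10, 2010: 100, 2015: 1000}


def implied_doubling_months(start_value, end_value, years):
    """Doubling time (in months) implied by steady exponential growth
    from start_value to end_value over the given number of years."""
    return 12 * years * math.log(2) / math.log(end_value / start_value)


years = sorted(online_data_pb)
for first, second in zip(years, years[1:]):
    factor = online_data_pb[second] / online_data_pb[first]
    months = implied_doubling_months(
        online_data_pb[first], online_data_pb[second], second - first
    )
    print(f"{first}-{second}: x{factor:.0f}, doubling about every {months:.1f} months")
```

Under these assumptions the projected volume of online data doubles roughly every 14 months between 2000 and 2005 and roughly every 18 months afterward, which is somewhat slower than the 12-month doubling quoted for storage density.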

6
Data Intensive Physical Sciences
  • High energy and nuclear physics
  • Including new experiments at CERN
  • Gravity wave searches
  • LIGO, GEO, VIRGO
  • Time-dependent 3-D systems (simulation, data)
  • Earth Observation, climate modeling
  • Geophysics, earthquake modeling
  • Fluids, aerodynamic design
  • Pollutant dispersal scenarios
  • Astronomy: digital sky surveys

7
Ongoing Astronomical Mega-Surveys
  • Large number of new surveys
  • Multi-TB in size, 100M objects or larger
  • In databases
  • Individual archives planned and under way
  • Multi-wavelength view of the sky
  • > 13 wavelength coverage within 5 years
  • Impressive early discoveries
  • Finding exotic objects by unusual colors
  • L,T dwarfs, high redshift quasars
  • Finding objects by time variability
  • Gravitational micro-lensing

Surveys: MACHO, 2MASS, SDSS, DPOSS, GSC-II, COBE, MAP, NVSS, FIRST, GALEX, ROSAT, OGLE, ...
8
Coming Floods of Astronomy Data
  • The planned Large Synoptic Survey Telescope will
    produce over 10 petabytes per year by 2008!
  • All-sky survey every few days, so will have
    fine-grain time series for the first time

9
Data Intensive Biology and Medicine
  • Medical data
  • X-Ray, mammography data, etc. (many petabytes)
  • Digitizing patient records (ditto)
  • X-ray crystallography
  • Molecular genomics and related disciplines
  • Human Genome, other genome databases
  • Proteomics (protein structure, activities, ...)
  • Protein interactions, drug delivery
  • Virtual Population Laboratory (proposed)
  • Simulate likely spread of disease outbreaks
  • Brain scans (3-D, time dependent)

10
A Brain is a Lot of Data! (Mark Ellisman, UCSD)
And comparisons must be made among many
We need to get to one micron to know the location of
every cell. We're just now starting to get to
10 microns. Grids will help get us there and
further.
11
An Exponential World: (3) Networks (Or,
Coefficients Matter ...)
  • Network vs. computer performance
  • Computer speed doubles every 18 months
  • Network speed doubles every 9 months
  • Difference: an order of magnitude per 5 years
  • 1986 to 2000
  • Computers x 500
  • Networks x 340,000
  • 2001 to 2010
  • Computers x 60
  • Networks x 4000

Moore's Law vs. storage improvements vs. optical
improvements. Graph from Scientific American
(Jan. 2001) by Cleo Vilett; source: Vinod Khosla,
Kleiner Perkins Caufield & Byers.
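
The "order of magnitude per 5 years" claim follows directly from the two doubling periods. A minimal sketch of the arithmetic, assuming steady exponential growth (the function and constant names are illustrative):

```python
def growth_factor(months, doubling_months):
    """Growth over `months` for a quantity that doubles every `doubling_months`."""
    return 2 ** (months / doubling_months)


COMPUTER_DOUBLING_MONTHS = 18  # from the slide
NETWORK_DOUBLING_MONTHS = 9    # from the slide

# Gap that opens up between network and computer speed over five years.
five_years = 5 * 12
gap = (growth_factor(five_years, NETWORK_DOUBLING_MONTHS)
       / growth_factor(five_years, COMPUTER_DOUBLING_MONTHS))
print(f"network/computer gap after 5 years: x{gap:.1f}")

# The two periods quoted on the slide.
for label, years in [("1986 to 2000", 14), ("2001 to 2010", 9)]:
    months = years * 12
    computers = growth_factor(months, COMPUTER_DOUBLING_MONTHS)
    networks = growth_factor(months, NETWORK_DOUBLING_MONTHS)
    print(f"{label}: computers x{computers:,.0f}, networks x{networks:,.0f}")
```

For 2001 to 2010 the formula gives x64 and x4096, matching the rounded x60 and x4000 on the slide. For 1986 to 2000 it gives roughly x650 and x420,000, the same order of magnitude as the slide's x500 and x340,000, which presumably come from measured data rather than from this idealized formula.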
12
Outline
  • The technology landscape
  • Grid computing
  • The Globus Toolkit
  • Applications and technologies
  • Data-intensive distributed computing,
    collaborative remote access to facilities
  • Grid infrastructure
  • Open Grid Services Architecture
  • Global Grid Forum
  • Summary and conclusions

13
Evolution of the Scientific Process
  • Pre-electronic
  • Theorize and/or experiment, alone or in small
    teams; publish paper
  • Post-electronic
  • Construct and mine very large databases of
    observational or simulation data
  • Develop computer simulations and analyses
  • Exchange information quasi-instantaneously within
    large, distributed, multidisciplinary teams

14
Evolution of Business
  • Pre-Internet
  • Central corporate data processing facility
  • Business processes not compute-oriented
  • Post-Internet
  • Enterprise computing is highly distributed,
    heterogeneous, inter-enterprise (B2B)
  • Outsourcing becomes feasible => service providers
    of various sorts
  • Business processes increasingly computing- and
    data-rich

15
The Grid
  • Resource sharing and coordinated problem solving
    in dynamic, multi-institutional virtual
    organizations

16
A Comparison
  • SERIAL
  • Fetch/Store
  • Compute
  • PARALLEL
  • Fetch/Store
  • Compute/Communicate
  • Cooperative game
  • GRID
  • Fetch/Store
  • Discovery of Resources
  • Interaction with remote application
  • Authentication / Authorization
  • Security
  • Compute/Communicate
  • Etc

18
Distributed Computing vs. GRID
  • Grid is an evolution of distributed computing
  • Dynamic
  • Geographically independent
  • Built around standards
  • Internet backbone
  • Distributed computing is an older term
  • Typically built around proprietary software and
    network
  • Tightly couples systems/organization

19
Web vs. GRID
  • Web
  • Uniform naming access to documents
  • Grid - Uniform, high performance access to
    computational resources

[Figure labels: http://, Software Catalogs, Sensor nets, Colleges/R&D Labs]
20
Is the World Wide Web a Grid?
  • Seamless naming? Yes
  • Uniform security and Authentication? No
  • Information Service? Yes or No
  • Co-Scheduling? No
  • Accounting and Authorization? No
  • User Services? No
  • Event Services? No
  • Is the Browser a Global Shell? No

21
What does the World Wide Web bring to the Grid?
  • Uniform Naming
  • A seamless, scalable information service
  • A powerful new meta-data language: XML
  • XML will be the standard language for describing
    information in the grid
  • SOAP: Simple Object Access Protocol
  • Uses XML for encoding, HTTP for protocol
  • SOAP may become a standard RPC mechanism for Grid
    services
  • Portal Ideas
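
To illustrate SOAP as an RPC mechanism: a SOAP message is an XML envelope, typically sent as the body of an HTTP POST. The sketch below builds a SOAP 1.1 request with Python's standard library. The service namespace `urn:example:grid-service`, the `submitJob` operation, and its parameters are hypothetical and not taken from any real Grid service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:grid-service"                 # hypothetical service namespace


def build_request(operation, params):
    """Encode a remote procedure call as a SOAP 1.1 envelope (XML text)."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical call: ask a service to run a program on 4 processors.
request_xml = build_request("submitJob", {"executable": "/bin/hostname", "count": 4})
print(request_xml)
```

The sketch stops at encoding. Sending the message over HTTP and decoding the response are left out.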

22
The Ultimate Goal
  • In future I will not know or care where my
    application will be executed as I will acquire
    and pay to use these resources as I need them

23
Why Grids?
  • Large-scale science and engineering are done
    through the interaction of people, heterogeneous
    computing resources, information systems, and
    instruments, all of which are geographically and
    organizationally dispersed.
  • The overall motivation for Grids is to
    facilitate the routine interactions of these
    resources in order to support large-scale science
    and engineering.

24
An Example Virtual Organization: CERN's Large
Hadron Collider
  • 1800 Physicists, 150 Institutes, 32 Countries
  • 100 PB of data by 2010; 50,000 CPUs?

25
Grid Communities and Applications: Data Grids for
High Energy Physics
www.griphyn.org www.ppdg.net
www.eu-datagrid.org
26
Intelligent Infrastructure: Distributed Servers
and Services
27
The Grid Opportunity: eScience and eBusiness
  • Physicists worldwide pool resources for peta-op
    analyses of petabytes of data
  • Civil engineers collaborate to design, execute,
    and analyze shake table experiments
  • An insurance company mines data from partner
    hospitals for fraud detection
  • An application service provider offloads excess
    load to a compute cycle provider
  • An enterprise configures internal and external
    resources to support eBusiness workload

28
The Grid: A Brief History
  • Early 90s
  • Gigabit testbeds, metacomputing
  • Mid to late 90s
  • Early experiments (e.g., I-WAY), academic
    software projects (e.g., Globus, Legion),
    application experiments
  • 2002
  • Dozens of application communities and projects
  • Major infrastructure deployments
  • Significant technology base (esp. Globus
    Toolkit™)
  • Growing industrial interest
  • Global Grid Forum: 500 people, 20 countries

29
Challenging Technical Requirements
  • Dynamic formation and management of virtual
    organizations
  • Online negotiation of access to services: who,
    what, why, when, how
  • Establishment of applications and systems able to
    deliver multiple qualities of service
  • Autonomic management of infrastructure elements
  • Open Grid Services Architecture
  • http://www.globus.org/ogsa

30
Grid Concept (Take 1)
  • Analogy with the electrical power grid
  • On-demand access to ubiquitous distributed
    computing
  • Transparent access to multi-petabyte distributed
    databases
  • Easy to plug resources into
  • Complexity of the infrastructure is hidden
  • When the network is as fast as the computer's
    internal links, the machine disintegrates across
    the net into a set of special purpose appliances
    (George Gilder)

31
Grid Vision (Take 2)
  • e-Science and information utilities: science
    increasingly done through distributed global
    collaborations between people, enabled by the
    Internet
  • Using very large data collections, terascale
    computing resources, and high performance
    visualisation
  • Derived from instruments and facilities
    controlled and shared via the infrastructure
  • Scaling x1000 in processing power, data, bandwidth

32
Elements of the Problem
  • Resource sharing
  • Computers, storage, sensors, networks, ...
  • Heterogeneity of device, mechanism, policy
  • Sharing is conditional: negotiation, payment, ...
  • Coordinated problem solving
  • Integration of distributed resources
  • Compound quality of service requirements
  • Dynamic, multi-institutional virtual orgs
  • Dynamic overlays on classic org structures
  • Map to underlying control mechanisms

33
The Grid World: Current Status
  • Dozens of major Grid projects in scientific and
    technical computing/research and education
  • www.mcs.anl.gov/foster/grid-projects
  • Considerable consensus on key concepts and
    technologies
  • Open source Globus Toolkit a de facto standard
    for major protocols and services
  • Industrial interest emerging rapidly
  • IBM, Platform, Microsoft, Sun, Compaq, ...
  • Opportunity: convergence of eScience and
    eBusiness requirements and technologies

34
Outline
  • The technology landscape
  • Grid computing
  • The Globus Toolkit
  • Applications and technologies
  • Data-intensive distributed computing,
    collaborative remote access to facilities
  • Grid infrastructure
  • Open Grid Services Architecture
  • Global Grid Forum
  • Summary and conclusions

35
Grid Technologies: Resource Sharing Mechanisms
That ...
  • Address security and policy concerns of resource
    owners and users
  • Are flexible enough to deal with many resource
    types and sharing modalities
  • Scale to large number of resources, many
    participants, many program components
  • Operate efficiently when dealing with large
    amounts of data and computation

36
Aspects of the Problem
  • Need for interoperability when different groups
    want to share resources
  • Diverse components, policies, mechanisms
  • E.g., standard notions of identity, means of
    communication, resource descriptions
  • Need for shared infrastructure services to avoid
    repeated development, installation
  • E.g., one port/service/protocol for remote access
    to computing, not one per tool/application
  • E.g., Certificate Authorities: expensive to run
  • A common need for protocols and services

37
The Hourglass Model
  • Focus on architecture issues
  • Propose set of core services as basic
    infrastructure
  • Use to construct high-level, domain-specific
    solutions
  • Design principles
  • Keep participation cost low
  • Enable local control
  • Support for adaptation
  • IP hourglass model

[Figure: hourglass, top to bottom: Applications, Diverse global services, Core services, Local OS]
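
The hourglass idea can be sketched in code: a deliberately small core interface sits at the narrow waist, with many local implementations below it and many domain-specific services above it. The class and function names below are hypothetical and chosen only to illustrate the design principles on this slide; they are not part of any real toolkit.

```python
from abc import ABC, abstractmethod


class CoreJobService(ABC):
    """The narrow waist: one small interface every resource must offer.
    Keeping it small keeps the cost of participation low."""

    @abstractmethod
    def submit(self, command: str) -> str:
        """Run a command and return a description of what happened."""


class BatchQueueBackend(CoreJobService):
    """Hypothetical site that runs jobs through a local batch queue."""

    def submit(self, command: str) -> str:
        return f"queued on batch system: {command}"


class WorkstationBackend(CoreJobService):
    """Hypothetical site that simply starts a local process."""

    def submit(self, command: str) -> str:
        return f"started on workstation: {command}"


def parameter_sweep(service: CoreJobService, values):
    """A high-level, domain-specific service written only against the core
    interface, so it works unchanged on any backend."""
    return [service.submit(f"simulate --x={v}") for v in values]


for backend in (BatchQueueBackend(), WorkstationBackend()):
    print(parameter_sweep(backend, [1, 2]))
```

Each backend decides for itself how a job is actually run, which is one way to read "enable local control".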
38
Layered Grid Architecture (By Analogy to Internet
Architecture)
39
Globus Toolkit
  • A software toolkit addressing key technical
    problems in the development of Grid-enabled
    tools, services, and applications
  • Offer a modular set of orthogonal services
  • Enable incremental development of grid-enabled
    tools and applications
  • Implement standard Grid protocols and APIs
  • Available under liberal open source license
  • Large community of developers and users
  • Commercial support

40
General Approach
  • Define Grid protocols and APIs
  • Protocol-mediated access to remote resources
  • Integrate and extend existing standards
  • "On the Grid" = speak "Intergrid" protocols
  • Develop a reference implementation
  • Open source Globus Toolkit
  • Client and server SDKs, services, tools, etc.
  • Grid-enable wide variety of tools
  • Globus Toolkit, FTP, SSH, Condor, SRB, MPI, ...
  • Learn through deployment and applications

41
Key Protocols
  • The Globus Toolkit centers around four key
    protocols
  • Connectivity layer
  • Security: Grid Security Infrastructure (GSI)
  • Resource layer
  • Resource Management: Grid Resource Allocation
    Management (GRAM)
  • Information Services: Grid Resource Information
    Protocol (GRIP) and Index Information Protocol
    (GIIP)
  • Data Transfer: Grid File Transfer Protocol
    (GridFTP)
  • Also key collective layer protocols
  • Info Services, Replica Management, etc.
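
The layer-to-protocol mapping on this slide can be written down as a small lookup table. This is only a restatement of the slide as data; the dictionary layout and the helper name are illustrative choices.

```python
# Layer -> function -> protocol acronyms, exactly as listed on the slide.
KEY_PROTOCOLS = {
    "Connectivity": {
        "Security": ["GSI"],
    },
    "Resource": {
        "Resource Management": ["GRAM"],
        "Information Services": ["GRIP", "GIIP"],
        "Data Transfer": ["GridFTP"],
    },
}


def layer_of(protocol):
    """Return the (layer, function) pair a protocol belongs to, or None."""
    for layer, functions in KEY_PROTOCOLS.items():
        for function, protocols in functions.items():
            if protocol in protocols:
                return layer, function
    return None


print(layer_of("GridFTP"))
print(layer_of("GSI"))
```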