1
Introduction to Grid Computing
  • Concurrent and Distributed Programming course
  • Mark Silberstein, CS,Technion

2
Electric Power Grid analogy: a little bit of history
  • Beginning of the XX century
  • Electric power
  • Know how to generate and how to use.
  • Problem for wide adoption: generators
  • Solution: electric power grid, an INFRASTRUCTURE
    for power distribution and interface
    standardization
  • Integration of resources opens NEW opportunities
  • Beginning of the XXI century
  • Computational power
  • Know how to produce and how to use
  • Problem for high performance applications:
    high-end resources
  • Solution: computational grid, an INFRASTRUCTURE for
    pervasive and inexpensive access to high-end
    resources

3
Grid Computing Vision
  • Typical Grid usage scenario
  • Plug your PC into Computation Grid
  • Infinite power (CPU/Storage/etc)
  • Start application
  • You don't care where it is running
  • Get results
  • Output is waiting for you locally
  • Electric Power Grid usage scenario
  • Plug in your Teapot (many)
  • Infinite electric power capacity
  • Turn it on
  • You don't care WHO supplies the power
  • Drink your tea
  • Water is inside the teapot

4
What is Grid Computing?
  • A Computational Grid is a collection of
    distributed (geographically and across administrative
    domains), heterogeneous resources which can be
    used as an ensemble to execute large-scale
    applications
  • Metacomputer: virtualization of widely
    distributed resources

5
PACI Grid
6
Is it really such a NEW idea?
  • People connected computers together and used them
    long before Grid was introduced
  • BUT! Everything was done manually
  • "I need to run a simulation": Pre-Grid HOWTO Guide
  • Call admin at the remote site to open account
  • Stage your application and data to remote site
  • Meanwhile storage is full, need to ask to remove
    old stuff
  • Different protocols
  • Reserve (another call to admin) CPU
  • Run job and pray that nothing fails
  • If everything is fine, stage back the output
  • Call admin and pay
  • Do it for every site and with different protocols
  • Grid should provide AUTOMATION

7
Scientific Grid Computing
  • Collaboration - Virtual Organizations
  • I have CPU, you produce Data, she has Storage
  • I have X CPUs (Storage), you have Y CPUs
    (Storage). Use mine and I'll use yours
  • I have a supercomputer, but she has a Visualization
    Cave.
  • On-Demand computing
  • My experiment requires many CPUs/Disk/anything.
    Let me use your resources for 2 days.
  • Better resource utilization
  • My computers are never used at night. You may
    use them when they are idle
  • Sharing of Experimental Results
  • The CERN collider will produce petabytes of results.
    Researchers all over the world want to analyze them

8
Why Grid? Grid Applications
  • Distributed Supercomputing
  • Distributed Supercomputing applications couple
    multiple computational resources
    (supercomputers, clusters, workstations) over
    the Internet or an intranet
  • Examples include
  • SFExpress (large-scale modeling of battle
    entities with complex interactive behavior for
    distributed interactive simulation)
  • Climate Modeling (high resolution, long time
    scales, complex models)

9
Why Grid? Grid Applications
  • High-Throughput Applications
  • Grid used to schedule large numbers of
    independent or loosely coupled tasks with the
    goal of putting unused cycles to work
  • High-throughput applications include RSA
    key cracking, SETI@home (search for
    extra-terrestrial intelligence), MCell
    (Bioinformatics)

10
Why Grid? Grid Applications
  • Data-Intensive Applications
  • Focus is on synthesizing new information from
    large amounts of physically distributed data
    (terabytes/petabytes)
  • Examples include NILE (distributed system for
    high energy physics experiments using data from
    CLEO), SAR/SRB applications, digital library
    applications, CERN

11
Grid Computing Challenges
  • Grid is yet another computing platform: a META
    computer
  • Unusable without specialized software, just like
    any other conventional computer
  • What makes our computer usable?
  • Operating System, Drivers
  • Management Software
  • Applications

12
Layered View of Computer Architecture
[Figure: layers shown are Core Services (I/O, VM, Security, Scheduling, OS Internal Object Management) and the H/W Abstraction Layer.]
13
Zoom on Core Services
[Figure: Core Services (I/O, VM, Security, Scheduling, OS Internal Object Management) and the H/W Abstraction Layer, annotated with: Authentication, Authorization; Allocation policy; IPC, Communication, File System; Access to shared resources; Resources Access Protocols; Naming; Global Information.]
14
Grids vs. PC
  • Different administration domains: Security
  • Geographical distribution: Communication,
    Scheduler, Object Management
  • No global knowledge: Resource management, Naming
  • No centralized control: Resource management,
    Allocation policy
  • Heterogeneity: Resource access protocols,
    Resource management
  • Scale: and all this for millions of resources!!

15
Layered View of Grid Architecture
[Figure: Grid Core Services: High performance I/O, Synchronization, Metacomputing Directory, Access to remote storage, Reservation, Remote process management, Security, Accounting.]
16
What is Grid Computing?
  • Computational Grid is a collection of
    distributed (geographically/administrative
    domains), heterogeneous resources, implementing
    open Grid protocols to enable their use as part
    of metacomputer(s)

17
Agenda
  • Core services
  • Globus architecture
  • High Level services and tools
  • Condor-G

18
Globus Toolkit Components
[Figure: Globus Toolkit components: Grid Security Infrastructure, Grid Resource Allocation Manager, MetaData Service, Grid Access to Secondary Storage, GridFTP, Globus I/O. Label on the storage components: Access to remote storage.]
19
Globus Toolkit: Grid Core Services
  • Provides Core Grid Services
  • GSI: security infrastructure
  • GRAM, DUROC: generic interface for resource
    allocation
  • GASS, GridFTP: data transfer and secondary
    storage access
  • MDS (GRIS/GIIS): metadata service
  • Replica Management: data replication and
    management
  • Provides C/Java/(Python soon) API to use and
    extend the services
  • Provides command-line utilities
  • MPICH-G2: Grid-enabled MPI
  • Supports numerous architectures (no M yet)

20
Security Terminology
  • Authentication: establishing identity
  • Authorization: establishing rights
  • Accounting
  • Message protection
  • Message integrity
  • Message confidentiality
  • Digital signature
  • Public/private key
  • Certificate
  • Certificate Authority (CA)

21
Public Key Based Authentication
  • User sends certificate over the wire
  • Other end sends user a challenge string
  • User encodes the challenge string with private
    key
  • Possession of private key means you can
    authenticate as subject in certificate
  • Public key is used to decode the challenge.
  • If you can decode it, you know the subject
  • Treat your private key carefully!!
  • Private key is stored only in well-guarded
    places, and only in encrypted form
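
The challenge/response exchange above can be sketched with textbook RSA. This is only an illustration under loud assumptions: the key is a classroom-sized toy (modulus 3233), the function names are invented for this sketch, and real GSI deployments use X.509 certificates with proper padding and key sizes rather than raw modular exponentiation.

```python
import secrets

# Toy RSA key pair (illustrative numbers only, far too small for real use).
P, Q = 61, 53
N = P * Q      # public modulus, 3233
E = 17         # public exponent, as it would appear in the certificate
D = 2753       # private exponent, (E * D) % ((P - 1) * (Q - 1)) == 1


def make_challenge() -> int:
    """Verifier side: pick a random challenge in the range [2, N - 1]."""
    return secrets.randbelow(N - 2) + 2


def answer_challenge(challenge: int, private_exponent: int) -> int:
    """User side: encode the challenge with the private key."""
    return pow(challenge, private_exponent, N)


def verify_answer(challenge: int, answer: int) -> bool:
    """Verifier side: decode with the public key and compare to the challenge."""
    return pow(answer, E, N) == challenge


challenge = make_challenge()
answer = answer_challenge(challenge, D)
print(verify_answer(challenge, answer))
```

Only the holder of D can produce an answer that decodes back to the challenge, which is why possession of the private key is treated as proof of being the subject named in the certificate.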

22
Grid Security Requirements
  • Single sign-on
  • User should authenticate only once
  • Delegation of authority
  • Simultaneous access to large pool of resources
  • Site autonomy
  • Respect and not override local site security
  • Authentication and Authorization
  • One-to-one identification and user specific
    policy

23
Globus Security Infrastructure
  • Provides public key-based security system that
    layers on top of local site security
  • User identified to system using X.509 certificate
    (same as certificates used for Web) containing
    info about the duration of permissions, public
    key, signature of certificate authority
  • Each user has a Grid User ID, private key,
    certificate signed by a Certificate Authority
    (CA)
  • GSI allows for delegation of authority and single
    sign-on: certificate chaining with a proxy
    certificate
  • Proxy is another certificate, signed with the user's
    private key
  • Allows remote process to act on behalf of user,
    without password exposure
  • Site autonomy: Grid User ID must be mapped
    to a local user at the resource in order to log in

24
Mutual authentication
  • Each user and resource generates a certificate and gets
    it signed by a trusted CA, one time
  • Certificate contains the user's name and public key
  • Grid coordinating authority operates CA
  • User and resources each maintain list of trusted
    CA certificates
  • This enables mutual authentication (process by
    which a subject proves its identity to a
    requestor, typically through the use of a
    credential.)

25
Globus GSI
  • General scenario: user wants to execute on
    remote resources
  • How this happens securely:
  • User is authenticated by a CA one time only
  • To achieve a single logon effect, user creates a
    temporary user proxy credential
  • User proxy has limited lifetime which user
    specifies
  • User proxy credential sent to gatekeeper of each
    desired resource
  • Gatekeeper sends copy of its certificate to user
  • Mutual authentication: user checks the gatekeeper's
    certificate signature against trusted
    certificates; gatekeeper checks the user's signature
    against the CA's trusted certificates
  • Gatekeeper checks to see if user has permission
    to execute on that machine
  • If user has permission, then job is submitted to
    local job scheduler and job is started on remote
    machine
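
The gatekeeper's decision in this scenario can be sketched as a small check over a proxy chain. This is a simplified model, not Globus code: signatures are not verified cryptographically, and the `Cert` class, the CA name and the grid-to-local mapping table are all hypothetical names introduced for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cert:
    subject: str       # identity the certificate names
    issuer: str        # who signed it (assumed already verified)
    expires_at: float  # expiry time, seconds since the epoch


# Hypothetical site configuration.
TRUSTED_CAS = {"/O=Grid/CN=Example CA"}
GRID_MAP = {"/O=Grid/CN=Alice": "alice_local"}  # Grid User ID -> local account


def authorize(proxy: Cert, user: Cert, now: float) -> Optional[str]:
    """Return the local account to run under, or None if access is refused."""
    if proxy.expires_at <= now or user.expires_at <= now:
        return None                     # proxy lifetime is limited
    if proxy.issuer != user.subject:
        return None                     # proxy must be signed by the user
    if user.issuer not in TRUSTED_CAS:
        return None                     # user cert must chain to a trusted CA
    return GRID_MAP.get(user.subject)   # site autonomy: local mapping decides


alice = Cert("/O=Grid/CN=Alice", "/O=Grid/CN=Example CA", expires_at=1_000_000.0)
proxy = Cert("/O=Grid/CN=Alice/CN=proxy", alice.subject, expires_at=50_000.0)
print(authorize(proxy, alice, now=10_000.0))
```

The final lookup is where the local site keeps control: a valid certificate with no entry in the mapping table still gets no account.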

26
GSI in Action: Create Processes at A and B that
Communicate and Access Files at C
[Figure: a User; Site A (Kerberos) with a computer; Site B (Unix) with a computer; Site C (Kerberos) with a storage system.]
27
Globus Resource Allocation Manager
  • Resource Management services provide mechanism
    for remote job submission and management
  • 3 low level services
  • GRAM (Globus Resource Allocation Manager)
  • Provides remote job submission, monitoring and
    management
  • DUROC (Dynamically Updated Request Online
    Co-allocator)
  • Provides simultaneous job submission and barrier
  • Layers on top of GRAM
  • RSL (Resource Specification Language)
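
As a rough illustration of what a job request in RSL looks like, the sketch below renders attribute/value pairs as a conjunction of `(name=value)` relations. The helper `to_rsl` is invented for this example, and it assumes simple values only; quoting of values that contain spaces or special characters is deliberately left out.

```python
def to_rsl(attributes: dict) -> str:
    """Render attribute/value pairs as '&(name=value)(name=value)...'."""
    return "&" + "".join(f"({name}={value})" for name, value in attributes.items())


request = to_rsl({"executable": "/bin/hostname", "count": 2})
print(request)  # &(executable=/bin/hostname)(count=2)
```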

28
GRAM Requirements
  • Reliable invocation and cancellation
  • Only-once semantics
  • Monitoring and event notification
  • Process failure should propagate to the
    submission site
  • Deferred process invocation: state transitions
  • Reliable job manager
  • Job may keep running, but remote monitoring agent
    may fail
  • Heterogeneity of platforms
  • Generic interface to any local resource manager
  • Sandboxing
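
The monitoring and state-transition requirements can be pictured as a small state machine that notifies the submission site on every change. The state names and the `Job` class below are illustrative assumptions for this sketch and are not taken from the GRAM API.

```python
from typing import Callable, List, Tuple

# Allowed transitions (illustrative state names).
TRANSITIONS = {
    "PENDING": {"ACTIVE", "FAILED"},
    "ACTIVE": {"DONE", "FAILED"},
    "DONE": set(),
    "FAILED": set(),
}


class Job:
    def __init__(self, on_change: Callable[[str, str], None]) -> None:
        self.state = "PENDING"       # invocation is deferred until a slot is free
        self._on_change = on_change  # callback standing in for event notification

    def move_to(self, new_state: str) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        old, self.state = self.state, new_state
        self._on_change(old, new_state)  # failure propagates to the submitter too


events: List[Tuple[str, str]] = []
job = Job(lambda old, new: events.append((old, new)))
job.move_to("ACTIVE")
job.move_to("FAILED")
print(events)
```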

29
GRAM Components
[Figure: GRAM components on either side of the site boundary: Client, Gatekeeper, Job Manager, RSL Library, Local Resource Manager, Processes, Grid Security Infrastructure. Numbered arrows 1 to 6 carry the labels: Resource allocation request and process creation; Create; Parse; Request; Allocate and create processes; Monitor and control; Event notification and control requests; Opaque https contact string.]
30
Grid Information Infrastructure
  • Requirements
  • Resource discovery
  • All grid resources are registered
  • Resource selection
  • Should contain specific resource information
  • Challenges
  • Any information is always already old
  • Scalability
  • Fault-tolerance
  • Unknown information structure
  • Consistency
  • Access control

31
Globus Information Infrastructure
  • MDS (Metacomputing Directory Service)
  • MDS stores information in entries; each entry describes
    some type of object (organization, person, network,
    computer, etc.)
  • Object class associated with each entry describes
    a set of entry attributes
  • Every entry is tagged with creation time and TTL
  • LDAP (Lightweight Directory Access Protocol) used
    to store information about resources
  • LDAP: hierarchical, tree-structured information
    model defining form and character of information

32
MDS object
33
Information Infrastructure Components
  • Information providers: Grid Resource Information
    Service (GRIS)
  • Run close to information source
  • Generate data in required format and store it in
    the Local Information Directory
  • Queries
  • Speak GRid Information Protocol (GRIP)
  • Perform soft-registration into Information
    Registries
  • Speak GRid Registration Protocol (GRRP)
  • Information registries: Grid Index Information
    Service (GIIS), aggregates information for a Virtual
    Organization
  • Aggregate information about existing GRISes in VO
  • Provide hierarchical naming
  • May itself serve as GRIS for upper hierarchies
  • Forward all search requests to the low level
    GRISes
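
Soft registration can be sketched as a registry in which every entry lapses unless the provider renews it. The class and method names are assumptions made for this sketch; the real GRIS/GIIS exchange runs over the registration protocol named above, which is not modelled here.

```python
class SoftStateRegistry:
    """Index that forgets providers which stop re-registering."""

    def __init__(self) -> None:
        self._expiry = {}  # provider name -> time at which registration lapses

    def register(self, provider: str, now: float, ttl: float) -> None:
        """Create or renew a registration that is valid for ttl seconds."""
        self._expiry[provider] = now + ttl

    def live_providers(self, now: float) -> list:
        """Providers whose registration has not yet expired."""
        return sorted(p for p, t in self._expiry.items() if t > now)


giis = SoftStateRegistry()
giis.register("host1", now=0.0, ttl=600.0)
giis.register("host2", now=0.0, ttl=600.0)
giis.register("host1", now=500.0, ttl=600.0)  # host1 renews, host2 stays silent
print(giis.live_providers(now=700.0))  # ['host1']
```

No explicit de-registration is needed: a crashed provider simply disappears from the index once its TTL runs out, which is one way to cope with the fault-tolerance challenge listed earlier.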

34
How it all works
[Figure: Host1, Host2 and Host3 in VO B, a GIIS, and VO A. Arrow labels: "Periodically invokes scripts to obtain information" and "Periodically registers (Soft registration)". Example entry: CPU: PIII, FreeRAM: 4GB, Created: 20.2.2003 14.00, TTL: 10min.]
35
GASS/GridFTP
  • Grid Access to Secondary Storage
  • GASS Cache
  • Provides transparent access to remote files
  • open(ftp://..)
  • Lazy copy
  • Utilities to enforce consistency
  • FTP: open standard
  • Problem: low performance
  • GridFTP: FTP with high-performance enhancements
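
The "lazy copy" idea behind the GASS cache can be sketched as a cache that transfers a remote file only on first open. `LazyFileCache` and its `fetch` callback are hypothetical names for this illustration; the real GASS cache and its consistency utilities are not modelled.

```python
from typing import Callable, Dict


class LazyFileCache:
    """Fetch a remote file on first open and serve later opens locally."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                # stands in for the actual transfer
        self._local: Dict[str, bytes] = {}
        self.transfers = 0                 # number of remote transfers performed

    def open(self, url: str) -> bytes:
        if url not in self._local:         # lazy copy: only on first access
            self._local[url] = self._fetch(url)
            self.transfers += 1
        return self._local[url]


cache = LazyFileCache(lambda url: f"contents of {url}".encode())
cache.open("ftp://example.org/input.dat")
cache.open("ftp://example.org/input.dat")
print(cache.transfers)  # 1
```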

36
Globus Toolkit Components: a reminder of what we
learnt
[Figure: Globus Toolkit components: Grid Security Infrastructure, Grid Resource Allocation Manager, MetaData Service, Grid Access to Secondary Storage, GridFTP, Globus I/O. Label on the storage components: Access to remote storage.]
37
Grid resource management
  • Raw grid infrastructure is useless without
    a resource manager
  • Resource manager requirements
  • Resource discovery
  • Resource selection
  • Optimal job placement
  • Scheduling
  • .
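
Resource discovery and selection can be sketched as a filter followed by a ranking step. The `Resource` fields and the "most free CPUs wins" rule are assumptions chosen for the sketch; a real resource manager would query the information service and apply a much richer placement policy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    name: str
    free_cpus: int
    free_ram_gb: int


def select_resource(resources: List[Resource], cpus: int, ram_gb: int) -> Optional[Resource]:
    """Discard resources that cannot fit the job, then prefer the most free CPUs."""
    candidates = [r for r in resources
                  if r.free_cpus >= cpus and r.free_ram_gb >= ram_gb]
    return max(candidates, key=lambda r: r.free_cpus, default=None)


pool = [
    Resource("cluster-a", free_cpus=8, free_ram_gb=4),
    Resource("cluster-b", free_cpus=32, free_ram_gb=16),
    Resource("desktop-c", free_cpus=1, free_ram_gb=1),
]
best = select_resource(pool, cpus=4, ram_gb=4)
print(best.name if best else "no match")  # cluster-b
```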

38
Global view of job invocation
[Figure: an Application sends RSL and exchanges Queries and Info with the Information Service; simple ground RSL reaches three GRAM instances that front the local resource managers Condor, Linux and PBS. Other labels: Runtime monitoring; Data and executable staging.]
39
Condor-G: Condor gateway into grid
  • Manual job invocation using Globus services is
    difficult
  • Manual data staging
  • No job restart after failure
  • Security issues
  • No queuing
  • High load on invocation machine

40
Globus Universe
  • Run a job on a Grid resource
  • Features
  • Job management
  • Fault tolerance
  • Credential management
  • User specifies grid resources in submission file
  • Jobs are queued locally and then are executed on
    grid resource
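
Local queuing with restart after failure can be sketched as follows. The function `run_queue` and its `submit` callback are hypothetical; this is not how the Condor-G Schedd or GridManager are implemented, only an illustration of why a local queue makes resubmission possible.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def run_queue(jobs: List[str], submit: Callable[[str], bool],
              max_attempts: int = 3) -> List[str]:
    """Submit queued jobs one by one, re-queueing failures; return finished jobs."""
    queue: Deque[Tuple[str, int]] = deque((job, 1) for job in jobs)
    finished: List[str] = []
    while queue:
        job, attempt = queue.popleft()
        if submit(job):                        # remote execution succeeded
            finished.append(job)
        elif attempt < max_attempts:           # keep the job and retry later
            queue.append((job, attempt + 1))
    return finished


attempts = {}


def flaky_submit(job: str) -> bool:
    """Stand-in for a grid submission that fails once for job 'b'."""
    attempts[job] = attempts.get(job, 0) + 1
    return not (job == "b" and attempts[job] == 1)


print(run_queue(["a", "b", "c"], flaky_submit))  # ['a', 'c', 'b']
```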

41
How It Works
[Figure: on the Condor-G side, the Schedd and GridManager; on the Grid Resource side, GRAM and PBS.]
42
Condor-G problems
  • No resource selection
  • Job monitoring is restricted by GRAM
  • Cannot use checkpointing and remote system calls

43
GlideIn
  • Run the Condor daemons on Grid resources as user
    jobs
  • Create your own personal Condor pool from
    temporarily-acquired Grid resources
  • Brings the full power of Condor to the Grid

44
Condor-G
45
Condor-G
51
Summary
  • We talked about
  • Grid computing in general
  • Globus
  • Condor-G
  • We did not talk about
  • Grid brokers and schedulers
  • Data grid
  • OGSI/OGSA

52
References
  • www.globus.org
  • www.buyya.com
  • The Grid Book by Foster and Kesselman
  • New Grid Book by Berman et al
  • grail.sdsc.edu
  • www.cs.wisc.edu/condor